AI's Facial Expressions: A Step Forward, But Still Room for Growth with Imagen-v2
Facial expressions are a powerful storytelling tool: they convey emotion and add depth to characters. For AI-generated imagery, capturing these nuances is a crucial step toward immersive, engaging results. This blog post examines how well a generative AI model understands and renders dramatic facial expressions, looking at both its strengths and its weaknesses. The examples below pair each prompt with a description of the generated image and its evaluation scores, showing both the potential and the challenges of AI in this domain.
Created with: imagen-v2
Lost in the City Lights: A Moment of Melancholy
A woman with long blonde hair and freckles stands bathed in the soft glow of city lights, her expression hinting at a story waiting to be told. The dramatic lighting and her pensive gaze create a sense of mystery and intrigue, leaving you wondering what secrets she holds.
Prompt
facial-expressions Agreement: melancholy, contemplative ; A lone figure; eye-level; Single Person; a bustling city street at night; cinematic
Characteristic
Shot : A close-up portrait of a young woman with a moody expression, her hair illuminated by a warm light source, set against a blurry background of bokeh lights.
Aesthetic Score : 0.7
Mood : melancholy, mysterious, introspective
Quality
Entropy : 6.48
Noise : 60
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some noticeable noise in the hair and skin, slight blurriness in the face, and an unnatural lighting effect. The overall image has a somewhat artificial feel.
Superman: A Hero Forged in Steel and Light
A close-up portrait of Superman, bathed in dramatic lighting, captures his heroic presence against a futuristic cityscape. The image evokes a sense of action, adventure, and power, leaving you on the edge of your seat.
Prompt
facial-expressions Agreement: determined, resolute ; A superhero standing tall; eye-level; Hero; a cityscape with a burning building in the background; cinematic
Characteristic
Shot : A close-up shot of Superman standing in front of a cityscape. The sky is cloudy and the overall tone of the image is dark.
Aesthetic Score : 0.7
Mood : serious, dramatic, powerful
Quality
Entropy : 6.50
Noise : 53
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts and errors, such as Superman’s cape being slightly blurry in places. There is also some noise present, especially in the darker areas of the image.
A Family’s Silent Story: Intimacy and Tension in a Dimly Lit Kitchen
A family gathers around a table in a cluttered kitchen, bathed in soft, mysterious light. The scene evokes a sense of intimacy, nostalgia, and underlying tension, leaving the viewer to ponder the unspoken stories unfolding within this lived-in space.
Prompt
facial-expressions Agreement: peaceful, content ; A family gathered around a dinner table; eye-level; Normal People; a cozy kitchen with warm lighting; cinematic
Characteristic
Shot : A family sits at a table in a rustic kitchen with warm lighting and a cluttered, lived-in feel. They are having a meal together, possibly a Thanksgiving dinner, though some of the food items look a bit unusual.
Aesthetic Score : 0.6
Mood : warm, intimate, slightly unsettling
Quality
Entropy : 6.75
Noise : 113
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness throughout the image, particularly on the faces, and a faint digital noise is noticeable.
Lost in the Neon Glow: A Mysterious Gaze
A young man, shrouded in shadows and bathed in vibrant neon light, stares directly into the camera. His intense expression and the blurred background create a sense of mystery and intrigue. Is he lost in thought, or is there something more sinister at play?
Prompt
facial-expressions Agreement: excited, engaged ; A gamer intensely focused on a screen; eye-level; Gamer; a dimly lit room with neon lights reflecting on the screen; cinematic
Characteristic
Shot : A close-up portrait of a young man wearing glasses, with a blurred background of neon lights. He is staring intently at the camera, with a serious expression on his face.
Aesthetic Score : 0.7
Mood : intense, mysterious, edgy
Quality
Entropy : 6.49
Noise : 89
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts in the background, and the image is also quite dark.
Lost in the City’s Shadows
A woman stands alone in a dimly lit urban setting, her gaze lost in thought. The shallow depth of field isolates her from the bustling city, creating a sense of mystery and intrigue. The dark tones and her pensive expression evoke a mood of urban solitude.
Prompt
facial-expressions Agreement: reflective, introspective ; A woman walking down a quiet street; eye-level; Single Person; a row of old, brick buildings with faded paint; cinematic
Characteristic
Shot : A woman is walking down a street, looking up and to the right. She is wearing a green coat and a scarf. The background is a brick building.
Aesthetic Score : 0.7
Mood : intrigued, thoughtful, mysterious
Quality
Entropy : 6.76
Noise : 108
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight artifacts around the edges of the image, possibly due to compression.
Thunder God Unleashed: Epic Showdown in the Storm
A powerful figure, clad in the garb of a superhero, stands defiant against a raging storm. Lightning crackles in his hand, a symbol of his might and the drama unfolding. This image captures the essence of epic power and dramatic tension.
Prompt
facial-expressions Agreement: powerful, defiant ; A hero raising their fist in defiance; eye-level; Hero; a dark, stormy sky with lightning flashing in the background; cinematic
Characteristic
Shot : A man dressed as a superhero stands in front of a stormy sky. He is holding a glowing bolt of lightning in his hand and looks determined.
Aesthetic Score : 0.6
Mood : epic, dramatic, powerful
Quality
Entropy : 6.78
Noise : 56
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lightning effect appears slightly artificial, and the lighting on the character’s face is uneven. The image exhibits signs of digital manipulation, specifically in the character’s muscle definition and the lighting on the cape.
Unbridled Joy: Three Friends Share a Moment of Pure Laughter
Capture the essence of carefree happiness with this image of three friends laughing heartily outdoors. The wide open mouths and exaggerated laughter radiate genuine joy and amusement, set against a backdrop of lush greenery and a clear blue sky. This image evokes a sense of lightheartedness and the simple pleasures of friendship.
Prompt
facial-expressions Agreement: joyful, carefree ; A group of friends laughing together; eye-level; Normal People; a sunny park with trees and flowers; cinematic
Characteristic
Shot : Three people, a man, a woman and a young man, are laughing heartily in a sunny outdoor setting, possibly a garden.
Aesthetic Score : 0.6
Mood : joyful, carefree, happy
Quality
Entropy : 6.88
Noise : 111
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant errors; a few minor artifacts and some minor blurring in the background.
Confetti Celebration: Young Man’s Joyful Moment Captured
A young man, radiating excitement, celebrates amidst a flurry of confetti. His green shirt and headphones add to the vibrant atmosphere, capturing the essence of pure joy and celebration.
Prompt
facial-expressions Agreement: triumphant, ecstatic ; A gamer celebrating a victory; eye-level; Gamer; a brightly lit room with confetti and streamers; cinematic
Characteristic
Shot : A young man in a green shirt and headphones is celebrating a victory, surrounded by confetti.
Aesthetic Score : 0.7
Mood : excited, joyful, triumphant
Quality
Entropy : 6.66
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The confetti is slightly blurry and there are some minor artifacts around the edges of the image.
The Detective’s Gaze: A Mystery Unfolds
A close-up portrait of a man with a serious expression, possibly a detective, looking directly at the viewer. His intense gaze and the mysterious atmosphere create a sense of intrigue and anticipation. What secrets does he hold?
Prompt
facial-expressions Agreement: lonely, melancholic ; A man sitting alone on a bench; eye-level; Single Person; a deserted park with fallen leaves; cinematic
Characteristic
Shot : A man with a serious expression, looking downwards, dressed in a grey coat, standing in a park with a blurred background of trees and grass.
Aesthetic Score : 0.7
Mood : serious, contemplative, mysterious
Quality
Entropy : 6.67
Noise : 91
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors or artifacts.
Lost in the City Lights
A man with a thoughtful gaze, his mustache a shadow against the night, stands silhouetted against a vibrant, blurred cityscape. The darkness and his enigmatic expression create a sense of mystery, leaving the viewer to wonder what secrets lie within the city’s depths.
Prompt
facial-expressions Agreement: determined, hopeful ; A hero standing on a rooftop overlooking the city; eye-level; Hero; a panoramic view of a city skyline at night; cinematic
Characteristic
Shot : A man with a mustache is standing in front of a blurry cityscape at night. The city lights are visible in the background.
Aesthetic Score : 0.7
Mood : mysterious, contemplative, urban
Quality
Entropy : 6.38
Noise : 113
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background.
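The ten sets of figures listed above are easier to compare once they sit in one table. The following is a minimal sketch that copies the reported values into a list and averages each column. The field names, the helper functions, and the 0.5 cut-off used to split images by Likelihood of AI are assumptions made for illustration; they are not part of the evaluation pipeline behind this post.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, prompt_clip, likelihood_of_ai)
# Values are copied from the per-image listings above.
ROWS = [
    ("Lost in the City Lights: A Moment of Melancholy", 0.7, 6.48, 60, 0.19, 0.90),
    ("Superman: A Hero Forged in Steel and Light", 0.7, 6.50, 53, 0.22, 0.80),
    ("A Family’s Silent Story", 0.6, 6.75, 113, 0.23, 0.20),
    ("Lost in the Neon Glow: A Mysterious Gaze", 0.7, 6.49, 89, 0.22, 0.30),
    ("Lost in the City’s Shadows", 0.7, 6.76, 108, 0.23, 0.20),
    ("Thunder God Unleashed", 0.6, 6.78, 56, 0.20, 0.80),
    ("Unbridled Joy", 0.6, 6.88, 111, 0.28, 0.30),
    ("Confetti Celebration", 0.7, 6.66, 68, 0.28, 0.20),
    ("The Detective’s Gaze: A Mystery Unfolds", 0.7, 6.67, 91, 0.23, 0.20),
    ("Lost in the City Lights", 0.7, 6.38, 113, 0.24, 0.20),
]

# Hypothetical column names, in the same order as the tuple fields after the title.
FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "likelihood_of_ai")


def column_means(rows):
    """Return the mean of each numeric column, keyed by field name."""
    return {name: mean(row[i + 1] for row in rows) for i, name in enumerate(FIELDS)}


def split_by_ai_likelihood(rows, threshold=0.5):
    """Split rows into (flagged, unflagged) by Likelihood of AI.

    The 0.5 threshold is an assumption chosen for this sketch.
    """
    flagged = [r for r in rows if r[5] >= threshold]
    unflagged = [r for r in rows if r[5] < threshold]
    return flagged, unflagged


if __name__ == "__main__":
    for name, value in column_means(ROWS).items():
        print(f"mean {name}: {value:.3f}")
    flagged, unflagged = split_by_ai_likelihood(ROWS)
    print(f"flagged as likely AI: {len(flagged)}, not flagged: {len(unflagged)}")
```

On these ten rows the mean Prompt Clip Score works out to about 0.23 and the mean Noise value to 86.2. Three images sit at or above the assumed 0.5 cut-off for Likelihood of AI.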
Conclusion
The results show that the generative AI model did reasonably well at understanding the described scenes but fell short on camera position. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t accurately capture the camera positions described in the prompts.
- Shot Analysis: The model scored 0.62, which falls within the “good” range. This indicates that the model understood the scene described in each prompt and produced a shot that aligns with it to a decent degree.
- Aesthetic Analysis: The model scored 0.08, which lies inside the stated “very good” range of -0.2 to 0.1. Taken at face value, the generated images’ aesthetic stayed close to the aesthetic described in the prompts.
Overall, the model shows promise in scene composition and aesthetic match, but needs improvement in following the requested camera position.
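Each verdict in the breakdown comes down to comparing a reported score with the range quoted for it. The following is a minimal sketch of that comparison. The `ScoreCheck` class and `report` function are hypothetical names introduced here, and reusing the 0.5 to 0.75 “good” range for Shot Analysis is an assumption, since the post quotes that range only for Camera Position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreCheck:
    """One reported score and the range it is compared against."""
    name: str
    score: float
    label: str
    low: float
    high: float

    def in_range(self) -> bool:
        # Inclusive bounds; whether the original evaluation treats the
        # endpoints as inclusive is not stated, so this is an assumption.
        return self.low <= self.score <= self.high


CHECKS = [
    ScoreCheck("Camera Position", 0.35, "good", 0.5, 0.75),
    # Assumption: the same "good" range applies to Shot Analysis.
    ScoreCheck("Shot Analysis", 0.62, "good", 0.5, 0.75),
    ScoreCheck("Aesthetic Analysis", 0.08, "very good", -0.2, 0.1),
]


def report(checks):
    """Return one human-readable line per check."""
    lines = []
    for c in checks:
        verdict = "inside" if c.in_range() else "outside"
        lines.append(
            f'{c.name}: {c.score:.2f} is {verdict} the "{c.label}" range '
            f"[{c.low}, {c.high}]"
        )
    return lines


if __name__ == "__main__":
    for line in report(CHECKS):
        print(line)
```

With the figures quoted in the breakdown, only Camera Position falls outside its range.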
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://deepmind.google/technologies/imagen-2/