AI's Facial Expressions: A Step Forward, But Still Room for Improvement with Imagen-v2
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. In the realm of generative AI, the ability to create realistic and expressive faces is a crucial step towards generating truly immersive and engaging content. This blog post examines the performance of a generative AI model in capturing facial expressions, highlighting its strengths and weaknesses, and exploring the potential for future advancements.
Created with: imagen-v2
Lost in the Glow: A Moment of Suspense
A young woman, bathed in the vibrant, yet blurry, light of a bustling city, gazes upwards with a troubled expression. The dramatic lighting and her pensive mood create an atmosphere of mystery and suspense, leaving the viewer wondering what secrets lie ahead.
Prompt
facial-expressions Anxiety: Overwhelmed, isolated ; A lone figure; eye-level; Single Person; bustling city street at night; cinematic
Characteristic
Shot : A young woman with long brown hair looks up with a worried expression. She is standing in front of a blurred background of colorful lights, possibly a city.
Aesthetic Score : 0.6
Mood : concerned, urban, mysterious
Quality
Entropy : 6.29
Noise : 52
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts, particularly in the hair and background, likely from digital processing.
Superman: The Man of Steel, Ready for Action
In this close-up portrait, Superman stands against a blurred cityscape, bathed in dramatic lighting. His intense gaze suggests a looming threat, capturing the hero’s unwavering determination and the anticipation of an epic battle.
Prompt
facial-expressions Anxiety: Pressure, responsibility ; A superhero standing on a rooftop; high angle; Hero; cityscape with flashing lights; cinematic
Characteristic
Shot : Superman, looking determined, stands in front of a city skyline with a tall building in the background.
Aesthetic Score : 0.8
Mood : heroic, powerful, determined
Quality
Entropy : 6.62
Noise : 81
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some slight artifacts on Superman’s chest and shoulders, but they are not very noticeable.
Drowning in Paperwork: The Stress of a Busy Office Worker
A man is overwhelmed by a mountain of paperwork, his tired expression reflecting the pressure and stress of his busy work life. The image captures the feeling of being buried under a pile of tasks, highlighting the struggles of a modern office worker.
Prompt
facial-expressions Anxiety: Overwhelmed, stressed ; A person sitting at a desk, surrounded by paperwork; close-up; Normal Person; cluttered office; cinematic
Characteristic
Shot : A man is sitting at a desk with his head in his hands, surrounded by a large pile of papers. He looks stressed and overwhelmed.
Aesthetic Score : 0.3
Mood : stressed, overwhelmed, defeated
Quality
Entropy : 6.76
Noise : 101
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, particularly around the edges of the papers. The lighting is a bit flat, which makes the image look a bit washed out. There are some minor color variations and inconsistencies.
The Weight of Determination
A young man, bathed in dramatic light, stares intently into the distance. His headphones and focused expression convey a sense of unwavering determination, hinting at a moment of intense focus and anticipation.
Prompt
facial-expressions Anxiety: Focused, intense ; A gamer hunched over a computer screen; close-up; Gamer; dimly lit room with flashing lights; cinematic
Characteristic
Shot : A close-up portrait of a young man with headphones on, his face contorted in anger. The lighting is dramatic, with harsh shadows and bright highlights. The man is wearing a dark shirt, and the background is blurry.
Aesthetic Score : 0.6
Mood : intense, frustrated, dramatic
Quality
Entropy : 6.16
Noise : 81
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight noise and artifacts in the dark areas, particularly the headphones. The lighting is a bit harsh and uneven.
A Look of Worry in the Blur
A woman with long brown hair gazes directly at the viewer, her face etched with concern. The red jacket she wears stands out against the soft, blurred background, adding to the sense of unease and isolation. The image evokes a feeling of anxiety and worry, leaving the viewer wondering what troubles her.
Prompt
facial-expressions Anxiety: Anxious, uncomfortable ; A woman walking down a crowded street; eye-level; Single Person; blurred background of people; cinematic
Characteristic
Shot : Close-up portrait of a young woman with long hair, looking directly at the camera with a worried expression.
Aesthetic Score : 0.7
Mood : intense, worried, concerned
Quality
Entropy : 6.32
Noise : 52
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurring in the background, likely from depth-of-field.
The Cowboy’s Gaze: A Portrait of Intensity
A close-up portrait of a rugged man in a cowboy hat and leather jacket, his intense stare piercing through the shadows. The dark, moody lighting adds to the mysterious and dramatic effect, leaving you wondering what secrets lie behind his gaze.
Prompt
facial-expressions Anxiety: Fear, anticipation ; A hero facing a menacing villain; medium shot; Hero; dark and ominous setting; cinematic
Characteristic
Shot : A close-up portrait of a man in a cowboy hat and leather jacket, looking directly at the camera with a serious expression.
Aesthetic Score : 0.8
Mood : intense, mysterious, serious
Quality
Entropy : 6.58
Noise : 84
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts around the edges of the man’s face, likely from digital processing.
Lost in Thought, A Moment of Anxiety
A young woman, her blonde hair cascading down her shoulders, sits in a bustling public space, her gaze fixed on something unseen. A worried expression is etched on her features, and her red shirt and gray scarf stand in stark contrast to the blurry background. The scene evokes a sense of suspense and concern, leaving the viewer wondering what troubles her mind.
Prompt
facial-expressions Anxiety: Impatient, restless ; A person waiting in a long line; eye-level; Normal Person; crowded waiting room; cinematic
Characteristic
Shot : A young woman is sitting in a crowded public space, looking up with a worried expression.
Aesthetic Score : 0.7
Mood : pensive, anxious, concerned
Quality
Entropy : 6.61
Noise : 57
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor artifacts around the woman’s hair and skin, and the background is slightly blurry.
Neon Fingers: A Cyberpunk Mystery Unfolds
A close-up shot of hands typing on a keyboard, bathed in vibrant neon light, creates a sense of intrigue and mystery. The cyberpunk aesthetic is amplified by the dramatic lighting and focus on the hands, leaving the viewer wondering what secrets are being typed.
Prompt
facial-expressions Anxiety: Adrenaline, pressure ; A gamer’s hands frantically moving across a keyboard; close-up; Gamer; glowing computer screen; cinematic
Characteristic
Shot : A person’s hands typing on a keyboard, lit by pink and blue lights.
Aesthetic Score : 0.6
Mood : cyberpunk, futuristic, mysterious
Quality
Entropy : 6.18
Noise : 34
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : The lighting creates some noise and artifacts, especially around the hands.
Lost in Thought: A Man’s Silhouette Against a Stormy Sky
A solitary figure, cloaked in darkness, stands amidst a vast field, his back turned towards the camera. The cloudy sky above mirrors the somber mood, creating a sense of isolation and introspection. The dramatic lighting accentuates the man’s posture, hinting at a moment of deep contemplation.
Prompt
facial-expressions Anxiety: Loneliness, despair ; A man standing alone in a vast field; wide shot; Single Person; open sky with dark clouds; cinematic
Characteristic
Shot : A man in a dark green jacket stands in a field, looking away from the camera towards the horizon. The sky is overcast and there are dark clouds.
Aesthetic Score : 0.7
Mood : melancholy, pensive, thoughtful
Quality
Entropy : 6.76
Noise : 97
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise and slight blurring in the background.
Silhouetted Against the Sunset: A Lone Figure Contemplates the Vastness
A solitary figure stands on a rocky cliff, their silhouette stark against the fiery hues of a desert sunset. The scene evokes a sense of epic desolation and contemplative solitude, leaving the viewer to ponder the figure’s thoughts and the vastness of the landscape.
Prompt
facial-expressions Anxiety: Guilt, responsibility ; A lone explorer stands atop a crumbling mountain peak, gazing out over a vast, windswept desert. The sun sets in a fiery blaze, casting long shadows across the desolate landscape.; cinematic
Characteristic
Shot : A lone figure stands on a rocky cliff overlooking a vast desert landscape at sunset.
Aesthetic Score : 0.7
Mood : epic, desolate, hopeful
Quality
Entropy : 6.92
Noise : 116
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly grainy and the colors are a little washed out.
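As a rough cross-check, the per-image figures listed for the ten images above can be averaged in a few lines of Python. This is only a sketch: the numbers are copied from the entries in this post, the titles are shortened, and the names `images` and `mean_of` are illustrative rather than part of any evaluation tool used here.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied by hand from the ten entries in this post.
images = [
    ("Lost in the Glow", 0.6, 0.24, 0.80),
    ("Superman", 0.8, 0.19, 0.90),
    ("Drowning in Paperwork", 0.3, 0.26, 0.10),
    ("The Weight of Determination", 0.6, 0.29, 0.30),
    ("A Look of Worry in the Blur", 0.7, 0.20, 0.20),
    ("The Cowboy's Gaze", 0.8, 0.18, 0.80),
    ("Lost in Thought, A Moment of Anxiety", 0.7, 0.19, 0.30),
    ("Neon Fingers", 0.6, 0.20, 0.10),
    ("A Man's Silhouette Against a Stormy Sky", 0.7, 0.22, 0.20),
    ("Silhouetted Against the Sunset", 0.7, 0.22, 0.80),
]


def mean_of(column: int) -> float:
    """Average one numeric column of `images`, rounded to three decimals."""
    return round(mean(row[column] for row in images), 3)


mean_aesthetic = mean_of(1)
mean_clip = mean_of(2)
mean_ai_likelihood = mean_of(3)

# Entries with the lowest aesthetic score and the highest prompt CLIP score.
lowest_aesthetic = min(images, key=lambda row: row[1])[0]
highest_clip = max(images, key=lambda row: row[2])[0]

print(f"mean aesthetic score:   {mean_aesthetic}")
print(f"mean prompt CLIP score: {mean_clip}")
print(f"mean likelihood of AI:  {mean_ai_likelihood}")
print(f"lowest aesthetic score: {lowest_aesthetic}")
print(f"highest CLIP score:     {highest_clip}")
```

These averages are plain arithmetic over the values shown in this post. They are not the Camera Position, Shot Analysis, or Aesthetic Analysis scores, which the post reports without describing how they are derived.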
Conclusion
The results show that the generative AI model performed well in understanding the scene, but fell short on camera position and on the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.44, which is below the “good” range of 0.5 to 0.75. This suggests that the model did not reliably capture the camera position described in the prompt.
- Shot Analysis: The model scored 0.67, which falls within the “good” range. This indicates that the model was able to understand the scene and create a shot that was generally consistent with the prompt.
- Aesthetic Analysis: The model scored 0.12, which is just above the upper bound of the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated from the aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and shot composition, but missed the requested camera position in some cases and did not fully achieve the desired aesthetic.
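The comparison of each reported score with its quoted range can be reproduced with a short sketch. It assumes that Shot Analysis uses the same 0.5 to 0.75 “good” range quoted for Camera Position, which the breakdown implies but does not state. The function names `position_in_range` and `distance_outside` are hypothetical helpers for illustration.

```python
def position_in_range(score: float, low: float, high: float) -> str:
    """Return 'below', 'within', or 'above' for score relative to [low, high]."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


def distance_outside(score: float, low: float, high: float) -> float:
    """Distance from the score to the nearest range bound, 0.0 if inside."""
    if score < low:
        return round(low - score, 2)
    if score > high:
        return round(score - high, 2)
    return 0.0


# (score, range low, range high) as quoted in the breakdown.
# The Shot Analysis bounds are an assumption: the same "good" range
# as Camera Position.
reported = {
    "Camera Position": (0.44, 0.5, 0.75),
    "Shot Analysis": (0.67, 0.5, 0.75),
    "Aesthetic Analysis": (0.12, -0.2, 0.1),
}

summary = {
    name: (position_in_range(*values), distance_outside(*values))
    for name, values in reported.items()
}

for name, (position, gap) in summary.items():
    print(f"{name}: {position} the quoted range (distance outside: {gap})")
```

Under these assumptions, Camera Position falls 0.06 below its range, Shot Analysis sits inside it, and Aesthetic Analysis lands 0.02 above the upper bound of its range.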
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://deepmind.google/technologies/imagen-2/