AI's Facial Expressions: A Deep Dive into Generative Model Performance with Imagen-v2

Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Generative AI models are increasingly being used to create images with realistic facial expressions, but how well do they capture the nuances of human emotion? This blog post delves into the performance of a generative AI model in understanding and generating facial expressions across a range of scenes and aesthetics. We’ll explore the model’s strengths and weaknesses, analyzing its ability to capture camera position, shot composition, and overall aesthetic appeal.

Created with: imagen-v2

Lost in the Neon Maze: A Woman’s Worried Gaze in a City of Secrets

A woman stands alone in a bustling, neon-lit street, her worried expression hinting at a hidden story. The vibrant lights and the crowd’s anonymity create a sense of suspense and mystery, leaving you wondering what secrets lie beneath the surface.

Prompt

facial-expressions Confusion: Disoriented, overwhelmed ; A lone figure; eye-level; Single Person; a bustling city street with neon signs and crowds; cinematic

Characteristic

Shot : A woman stands in a bustling city street, with neon signs and crowds in the background. Her face is illuminated by the artificial lights.

Aesthetic Score : 0.7

Mood : mysterious, urban, melancholic

Quality

Entropy : 6.74

Noise : 92

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.80

Image errors : The skin texture appears artificial. The lighting seems too intense, and the color balance feels unnatural, likely due to heavy editing.
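A note on the Entropy figure: the post does not say how it is computed. Every value reported here falls between 6 and 7, which would be consistent with the Shannon entropy, in bits, of an 8-bit grayscale histogram (maximum 8). The sketch below assumes exactly that. `shannon_entropy` is a hypothetical helper name, not the tool used for this post, and it takes a flat list of pixel intensities so that no imaging library is needed.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of pixel intensities.

    Assumption: the post's Entropy metric is a histogram entropy like this one.
    """
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)  # histogram: intensity -> number of pixels
    entropy = 0.0
    for count in counts.values():
        p = count / total  # probability of this intensity level
        entropy -= p * math.log2(p)
    return entropy


# Toy inputs: a single-tone image, a two-tone image, and one using all 256 levels equally.
flat = [128] * 100
two_tone = [0, 255] * 50
full_range = list(range(256))
print(shannon_entropy(flat), shannon_entropy(two_tone), shannon_entropy(full_range))
```

Under this assumption, a single-tone image scores 0 and an image that uses all 256 levels equally scores 8, so a value in the 6 to 7 range would point to a fairly wide tonal spread.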

A Lone Warrior Contemplates the Vast Desert

An epic and adventurous scene unfolds as a solitary warrior stands atop a rocky outcrop, gazing out over a desolate desert landscape. A small oasis shimmers in the distance, offering a glimmer of hope amidst the vast emptiness. The warrior’s pose and the dramatic scale of the surroundings evoke a sense of solitude and the promise of thrilling adventures to come.

Prompt

facial-expressions Confusion: Doubt, uncertainty ; A lone adventurer, their worn leather armor patched with scavenged materials, stands atop a crumbling stone tower. The wind whips through the ruins of a forgotten city, carrying the scent of dust and decay. In the distance, a shimmering oasis shimmers in the harsh desert sun.; cinematic

Characteristic

Shot : A lone female warrior stands on a rocky cliff in a desert landscape. There is a green oasis in the distance.

Aesthetic Score : 0.7

Mood : epic, dramatic, mysterious

Quality

Entropy : 6.62

Noise : 108

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.80

Image errors : There are no major image errors, but the textures on the character’s clothing and the rocks look slightly artificial.
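The two prompts shown so far share a semicolon-separated layout: a category and emotion with descriptors, then the scene, then a style keyword. The first prompt splits the scene into four parts, while the second uses a single free-form description. The sketch below parses both shapes. The function name and the field names (`subject`, `camera`, `subject_type`, `setting`, `description`) are illustrative labels only; neither the model nor the post documents them.

```python
def parse_prompt(prompt: str) -> dict:
    """Split a prompt from this post into labeled fields (labels are assumptions)."""
    parts = [p.strip() for p in prompt.split(";") if p.strip()]
    head, rest = parts[0], parts[1:]

    # Head looks like "facial-expressions Confusion: Disoriented, overwhelmed".
    label, _, descriptors = head.partition(":")
    category, _, emotion = label.strip().partition(" ")

    result = {
        "category": category,
        "emotion": emotion,
        "descriptors": [d.strip() for d in descriptors.split(",") if d.strip()],
        "style": rest[-1] if rest else "",
    }

    middle = rest[:-1]  # everything between the head and the style keyword
    if len(middle) == 4:
        # Four-part scene: subject; camera position; subject type; setting.
        result.update(zip(("subject", "camera", "subject_type", "setting"), middle))
    else:
        # Free-form scene description, kept as a single string.
        result["description"] = "; ".join(middle)
    return result


first = parse_prompt(
    "facial-expressions Confusion: Disoriented, overwhelmed ; A lone figure; "
    "eye-level; Single Person; a bustling city street with neon signs and crowds; cinematic"
)
print(first["emotion"], first["camera"], first["style"])
```

Separating the camera field this way makes it easy to compare the requested camera position against the generated shot description for each image.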

A Look of Concern in the Face of Uncertainty

A woman in a suit, her face etched with worry, gazes upwards in an office setting. The blurred background adds to the sense of tension and anticipation, leaving the viewer wondering what she is looking at and what the future holds.

Prompt

facial-expressions Confusion: Lost, unmoored ; A woman in a business suit; eye-level; Normal People; a sterile office with fluorescent lights and cubicles; cinematic

Characteristic

Shot : A woman in a business suit is looking upwards, likely in an office setting.

Aesthetic Score : 0.7

Mood : serious, intense, apprehensive

Quality

Entropy : 6.91

Noise : 94

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : No noticeable artifacts or errors are present.

Caught in the Moment: A Face of Intense Focus and Surprise

A close-up shot captures a young man lost in his world, headphones on, his expression a blend of concentration and surprise. The intensity of the moment is palpable, leaving the viewer on the edge of their seat, wondering what unfolds next.

Prompt

facial-expressions Confusion: Frustration, bewilderment ; A gamer with headphones on; close-up; Gamer; a dimly lit room with a computer screen displaying a complex game interface; cinematic

Characteristic

Shot : Close-up portrait of a young man wearing headphones, looking slightly worried or surprised.

Aesthetic Score : 0.6

Mood : intense, focused, suspenseful

Quality

Entropy : 6.14

Noise : 93

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image appears slightly over-sharpened, leading to some halos around edges. Some minor noise is present, particularly in the darker areas.

The Shadow in the City

A man shrouded in mystery, his fedora casting a shadow over his intense gaze. The city lights blur behind him, adding to the air of intrigue and danger. This image evokes a sense of brooding mystery, leaving you wondering what secrets lie hidden in the shadows.

Prompt

facial-expressions Confusion: Suspicious, wary ; A man in a trench coat; eye-level; Single Person; a foggy alleyway with flickering streetlights; cinematic

Characteristic

Shot : A man in a fedora and trench coat stands in a dimly lit environment with an out-of-focus light source behind him.

Aesthetic Score : 0.8

Mood : mysterious, intense, film noir

Quality

Entropy : 6.81

Noise : 70

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.90

Image errors : The image appears slightly over-sharpened, resulting in some artifacts in the face and clothing textures. The lighting is somewhat artificial, and the overall image has a rather flat, staged feel.

The Knight’s Watch: A Shadowy Gaze in the Forest

A knight in full armor stands amidst a dark, foreboding forest, his gaze fixed directly on the viewer. The scene is steeped in mystery and tension, leaving you wondering what secrets lie hidden in the shadows.

Prompt

facial-expressions Confusion: Disillusioned, lost ; A knight in shining armor; eye-level; Hero; a dark forest with twisted trees and ominous shadows; cinematic

Characteristic

Shot : A knight in armor, likely in a forest, with dramatic, moody lighting.

Aesthetic Score : 0.7

Mood : dramatic, mysterious, serious

Quality

Entropy : 6.47

Noise : 110

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.50

Image errors : The image has some noise and artifacting, especially in the shadows.

Family Tension: A Messy Kitchen Reflects a Troubled Home

A snapshot of a family gathered around a cluttered table, their body language speaking volumes about tension and discomfort. The messy kitchen setting amplifies the sense of chaos and stress, hinting at a heated moment within the family.

Prompt

facial-expressions Confusion: Awkward, uncomfortable ; A family at a dinner table; eye-level; Normal People; a brightly lit kitchen with mismatched plates and silverware; cinematic

Characteristic

Shot : A family sits around a kitchen table in a cluttered kitchen. There are dishes and other things on the table, including a glass pitcher. The people in the image look like they are having a tense conversation.

Aesthetic Score : 0.5

Mood : tense, uneasy, uncomfortable

Quality

Entropy : 6.78

Noise : 110

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image is slightly blurry and there is some graininess.

Adrenaline Rush: Gamer’s Shock at the Edge of Victory

A young woman’s face is etched with surprise and focus as she navigates a thrilling video game. The explosion on the TV screen and the blurred background create a sense of intense action and suspense, capturing the raw emotion of a close call in the digital world.

Prompt

facial-expressions Confusion: Overwhelmed, disoriented ; A gamer holding a controller; close-up; Gamer; a brightly lit room with a TV screen displaying a chaotic game scene; cinematic

Characteristic

Shot : A young woman, possibly in her 20s, with blonde hair, is playing a video game, looking at the screen in the background. She is holding a game controller in her hands.

Aesthetic Score : 0.6

Mood : intense, focused, suspenseful

Quality

Entropy : 6.58

Noise : 60

Prompt Clip Score : 0.30

AI Evaluation

Likelihood of AI : 0.60

Image errors : There are some minor artifacts in the woman’s hair, particularly around her forehead. The lighting in the scene also appears somewhat uneven.

Lost in the City: A Moment of Anxiety

A woman stands amidst the bustling city, her worried gaze fixed on something unseen. The blurred background and low lighting heighten the sense of suspense, leaving the viewer wondering what has caused her distress.

Prompt

facial-expressions Confusion: Lost, alienated ; A woman walking down a crowded street; eye-level; Single Person; a bustling city street with people rushing past; cinematic

Characteristic

Shot : A woman with short brown hair is standing in a city street, looking up with a worried expression. The background is blurred, suggesting a bustling crowd and a sense of urgency.

Aesthetic Score : 0.7

Mood : suspenseful, anxious, worried

Quality

Entropy : 6.86

Noise : 80

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.20

Image errors : No noticeable artifacts or errors are present.

Superman Stands Tall, Hopeful Against the Night

A dramatic image captures Superman, bathed in moonlight, gazing upwards with a determined expression. The city lights below and the vastness of the night sky create a sense of heroic grandeur and hopeful anticipation.

Prompt

facial-expressions Confusion: Doubt, questioning ; A superhero standing on a rooftop; eye-level; Hero; a cityscape with twinkling lights and a full moon; cinematic

Characteristic

Shot : A man dressed as Superman stands against a cityscape and a large full moon, looking upward with a pensive expression.

Aesthetic Score : 0.7

Mood : heroic, contemplative, dramatic

Quality

Entropy : 6.56

Noise : 88

Prompt Clip Score : 0.23

AI Evaluation

Likelihood of AI : 0.80

Image errors : The subject’s skin appears slightly plastic and unreal. There are some subtle artifacts in the background, especially around the cityscape.

Conclusion

The results show that the generative AI model performed well in understanding the scene and matching the intended aesthetic, but struggled with camera position. Here’s a breakdown:

Camera Position: The model scored 0.33, which is below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
Shot Analysis: The model scored 0.54, which is considered good. This indicates that the model was able to understand the scene and create a shot that was somewhat aligned with the prompt.
Aesthetic Analysis: The model scored 0.11, which is considered very good. This means that the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.

Overall, the model demonstrated a good understanding of the scene and its aesthetic, but struggled with accurately capturing the intended camera position.
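For a quick cross-check, the per-image numbers listed above can be averaged directly. The sketch below hardcodes the ten reported values in the order the images appear. The tuple layout and the 0.5 cutoff for the AI-likelihood count are assumptions made for illustration. These averages are separate from the camera, shot, and aesthetic scores in the breakdown, whose calculation the post does not describe.

```python
from statistics import mean

# (aesthetic, entropy, noise, clip_score, ai_likelihood) per image, in post order.
IMAGES = [
    (0.7, 6.74, 92, 0.29, 0.80),   # Lost in the Neon Maze
    (0.7, 6.62, 108, 0.26, 0.80),  # A Lone Warrior
    (0.7, 6.91, 94, 0.26, 0.20),   # A Look of Concern
    (0.6, 6.14, 93, 0.29, 0.20),   # Caught in the Moment
    (0.8, 6.81, 70, 0.26, 0.90),   # The Shadow in the City
    (0.7, 6.47, 110, 0.29, 0.50),  # The Knight's Watch
    (0.5, 6.78, 110, 0.28, 0.10),  # Family Tension
    (0.6, 6.58, 60, 0.30, 0.60),   # Adrenaline Rush
    (0.7, 6.86, 80, 0.25, 0.20),   # Lost in the City
    (0.7, 6.56, 88, 0.23, 0.80),   # Superman Stands Tall
]

# Transpose the rows into one tuple per metric.
aesthetic, entropy, noise, clip_score, ai_likelihood = zip(*IMAGES)

summary = {
    "aesthetic": round(mean(aesthetic), 3),
    "entropy": round(mean(entropy), 3),
    "noise": round(mean(noise), 3),
    "clip_score": round(mean(clip_score), 3),
    "ai_likelihood": round(mean(ai_likelihood), 3),
}

# Count images whose AI likelihood exceeds the (assumed) 0.5 cutoff.
likely_ai = sum(1 for p in ai_likelihood if p > 0.5)

print(summary)
print(f"{likely_ai} of {len(IMAGES)} images have an AI likelihood above 0.5")
```

With the values above, the mean aesthetic score is 0.67, the mean prompt CLIP score is about 0.27, and five of the ten images have an AI likelihood above 0.5.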
