AI's Facial Expressions: A Step Forward, But Still Room for Growth with Imagen-v2

Facial expressions are a powerful storytelling tool: they convey emotion and add depth to characters. For generative AI, capturing these nuances remains a challenge. This post examines an experiment in which an AI model generated images from structured scene descriptions, each built around an excitement-related facial expression. The model shows promise in understanding scene composition, but it struggles with the requested camera position and with the desired aesthetic, particularly in facial expressions. Below, each image is shown with its prompt and metrics, followed by a summary of the model’s strengths and weaknesses and where it needs to improve.

Created with: imagen-v2

Lost in the City Lights: A Moment of Wonder

A young woman gazes up at the dazzling cityscape, her expression a mix of surprise and awe. The blurred lights create a sense of mystery and intrigue, leaving the viewer to wonder what has captured her attention.

Prompt

facial-expressions Excitement: Thrilled, anticipation ; A lone figure; eye-level; Single Person; bustling city street at night; cinematic

Characteristic

Shot : A woman with long brown hair is standing in a city at night, looking up in awe. The scene is lit with warm and cool colors.

Aesthetic Score : 0.6

Mood : dreamy, magical, hopeful

Quality

Entropy : 6.47

Noise : 51

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.80

Image errors : The woman’s face appears slightly distorted and the lighting is somewhat unnatural, especially around her hair.
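
Every prompt in this post follows the same semicolon-separated pattern, so it can be split into named parts. The sketch below is one way to do that; the field names (`subject`, `camera`, `character`, `setting`, `style`) are our own labels for illustration and are not documented by the model or the experiment.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are assumptions inferred from the prompts in this post.
    category: str       # e.g. "facial-expressions"
    emotion: str        # e.g. "Excitement"
    descriptors: list   # e.g. ["Thrilled", "anticipation"]
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(raw: str) -> ScenePrompt:
    """Split a prompt of the form
    '<category> <emotion>: <descriptors> ; subject; camera; character; setting; style'.
    """
    parts = [p.strip() for p in raw.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, camera, character, setting, style = parts
    category, rest = head.split(" ", 1)
    emotion, descriptor_text = rest.split(":", 1)
    descriptors = [d.strip() for d in descriptor_text.split(",")]
    return ScenePrompt(category, emotion.strip(), descriptors,
                       subject, camera, character, setting, style)


example = parse_prompt(
    "facial-expressions Excitement: Thrilled, anticipation ; A lone figure; "
    "eye-level; Single Person; bustling city street at night; cinematic"
)
print(example.camera, "|", example.setting)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, the camera angle) with what the Shot description reports.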

Superman Soars Above the City at Sunset

A powerful image captures Superman in flight over a cityscape, bathed in the dramatic hues of sunset. His intense expression and the dramatic lighting create a sense of heroism and power.

Prompt

facial-expressions Excitement: Triumphant, exhilarating ; A superhero in mid-air; low-angle; Hero; cityscape with a dramatic sunset; cinematic

Characteristic

Shot : Superman flying over a city skyline at sunset.

Aesthetic Score : 0.7

Mood : heroic, dramatic, powerful

Quality

Entropy : 6.61

Noise : 60

Prompt Clip Score : 0.22

AI Evaluation

Likelihood of AI : 0.90

Image errors : The image has some minor artifacts, such as the slight blurring around Superman’s head and the overly-smooth skin textures. The city skyline also appears a little blurry and lacking in detail.
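
The post does not say how the Entropy figure is computed. One common definition, assumed here, is the Shannon entropy of the grayscale intensity histogram, which tops out at 8 bits for 8-bit images; the reported values of roughly 6.2 to 6.8 would fit that scale. The function below is a minimal sketch of that definition and may differ from the tool actually used.

```python
import math
from collections import Counter


def shannon_entropy(pixels) -> float:
    """Shannon entropy in bits of a flat sequence of grayscale values (0-255).

    Assumption: this mirrors the 'Entropy' metric only in spirit; the
    original tool's exact method is not documented in the post.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over every intensity that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [0] * 256            # a single intensity: no information
spread_image = list(range(256))   # every intensity exactly once: maximum

print(shannon_entropy(flat_image))
print(shannon_entropy(spread_image))
```

Under this definition, higher values mean the image uses a wider and more even spread of tones; it says nothing by itself about whether the facial expression is convincing.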

Sun-Kissed Laughter: Friends Embrace the Joy of Movement

Four young adults revel in the carefree spirit of a sunny day, their laughter echoing through a vibrant green field. The image captures the energy of their run, with blurred feet and outstretched arms painting a picture of pure joy and youthful exuberance.

Prompt

facial-expressions Excitement: Joyful, carefree ; A group of friends laughing and running; eye-level; Normal People; a sunny park with a vibrant green lawn; cinematic

Characteristic

Shot : Four young adults are running towards the camera in a grassy field. There is a sunset in the background. The image is composed from a low angle perspective.

Aesthetic Score : 0.6

Mood : joyful, carefree, playful

Quality

Entropy : 6.74

Noise : 107

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has some graininess and the lighting is a bit uneven.

The Intensity of Focus

A close-up shot captures a man engrossed in his work, his face illuminated by the glow of the computer screen. The blurred background and dramatic lighting create a sense of suspense and isolation, drawing the viewer into the moment of intense concentration.

Prompt

facial-expressions Excitement: Intense, focused ; A gamer’s hands furiously tapping on a keyboard; close-up; Gamer; a dimly lit room with glowing screens; cinematic

Characteristic

Shot : A young man, wearing a headset, is intensely focused on a computer keyboard, lit with dramatic lighting.

Aesthetic Score : 0.6

Mood : intense, focused, competitive

Quality

Entropy : 6.16

Noise : 80

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.80

Image errors : The edges of the keyboard and the hands appear slightly blurred and lacking in detail.

Hope Amidst the Storm

A woman stands silhouetted against a dramatic sunset, her gaze fixed on the swirling clouds above. The golden light illuminates her face, reflecting a sense of awe and wonder. This captivating image captures a moment of hope and resilience, even in the face of adversity.

Prompt

facial-expressions Excitement: Awe-inspiring, liberating ; A woman standing on a cliff overlooking a vast ocean; eye-level; Single Person; dramatic clouds and a setting sun; cinematic

Characteristic

Shot : A woman with blonde hair is staring up at the sky. The sky is a mix of cloudy grey and orange, and there is a body of water behind the woman.

Aesthetic Score : 0.7

Mood : dramatic, surprised, awe

Quality

Entropy : 6.68

Noise : 70

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.80

Image errors : The hair seems too perfect, and there are some blur and sharpness inconsistencies.

Superman’s Last Stand: A Hero in Peril

A close-up shot captures Superman, his costume torn and battered, flying towards the viewer with a determined expression. The background is engulfed in smoke and fire, hinting at a fierce battle. The lighting, smoke, and Superman’s intense gaze create a sense of urgency and impending danger, leaving the viewer wondering if he will prevail.

Prompt

facial-expressions Excitement: Brave, adrenaline-fueled ; A hero charging into battle; low-angle; Hero; a chaotic battlefield with explosions and smoke; cinematic

Characteristic

Shot : A close-up shot of Superman running through a battle scene with smoke and fire in the background. He looks determined and focused. The costume is slightly gritty and weathered, which serves as a visual storytelling device.

Aesthetic Score : 0.7

Mood : intense, dramatic, heroic

Quality

Entropy : 6.76

Noise : 56

Prompt Clip Score : 0.21

AI Evaluation

Likelihood of AI : 0.80

Image errors : The image appears to have some minor artifacts in the background, and the subject’s skin texture is slightly unnatural.

Birthday Bliss on the Rooftop

Four friends celebrate a birthday with laughter, balloons, and confetti on a rooftop, capturing the essence of carefree joy and celebration.

Prompt

facial-expressions Excitement: Happy, celebratory ; A group of friends celebrating a graduation; eye-level; Normal People; a brightly decorated rooftop with balloons and streamers; cinematic

Characteristic

Shot : Four friends are celebrating on a rooftop, throwing confetti and holding balloons. They are all smiling and laughing. The background is a cityscape.

Aesthetic Score : 0.7

Mood : joyful, carefree, celebratory

Quality

Entropy : 6.77

Noise : 103

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors; slight compression artifacts are noticeable on the clothes.

Intense Gaze, Dramatic Lighting: A Portrait of Mystery

This close-up portrait captures a man’s face bathed in cool and warm lighting, creating a dramatic effect. His intense gaze and the play of light and shadow evoke a sense of tension and mystery, leaving the viewer captivated.

Prompt

facial-expressions Excitement: Engrossed, focused ; A gamer’s face illuminated by the screen; close-up; Gamer; a dark room with neon lights reflecting on the screen; cinematic

Characteristic

Shot : Close up portrait of a man with a serious expression, lit with blue and red light.

Aesthetic Score : 0.6

Mood : intense, dramatic, mysterious

Quality

Entropy : 6.17

Noise : 52

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.80

Image errors : Some artifacts around the eyes and the jawline.

The Thrill of the Ride: A Close-Up Look at Pure Excitement

This intense close-up captures the raw emotion of a roller coaster ride. The man’s wide-eyed scream and the blurred background create a sense of immediacy, pulling you right into the heart of the action.

Prompt

facial-expressions Excitement: Thrilling, exhilarating ; A man riding a rollercoaster; POV shot; Single Person; a fast-paced ride with twists and turns; cinematic

Characteristic

Shot : A close-up shot of two people riding a roller coaster. The person in the foreground is looking directly at the camera with a surprised expression. The person in the background is partially visible and is also looking at the camera.

Aesthetic Score : 0.4

Mood : excitement, surprise, thrill

Quality

Entropy : 6.72

Noise : 86

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.10

Image errors : There is some blur in the background of the image, which may be due to the movement of the roller coaster. There is also some noise in the image, which may be due to the low light conditions.

Iron Man Faces the Storm

A brooding Iron Man stands atop a skyscraper, the cityscape spread out before him. The sky is a canvas of swirling clouds, mirroring the intensity of the moment. His expression is grim, hinting at the danger that lies ahead.

Prompt

facial-expressions Excitement: Victorious, powerful ; A hero standing triumphantly on a rooftop; high-angle; Hero; a cityscape with a dramatic storm in the background; cinematic

Characteristic

Shot : A man in an Iron Man suit stands on a rooftop with a city behind him. He appears to be shouting or screaming. The sky looks stormy.

Aesthetic Score : 0.6

Mood : dramatic, heroic, intense

Quality

Entropy : 6.65

Noise : 48

Prompt Clip Score : 0.23

AI Evaluation

Likelihood of AI : 0.80

Image errors : The image has a somewhat painted look, with visible brushstrokes, especially on the suit. The details of the suit are not very sharp.
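
To see the ten results side by side, the per-image numbers can be collected and averaged. The sketch below transcribes the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the entries above; the 0.5 cutoff for calling an image "flagged as likely AI" is our assumption, not something the evaluation defines.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# transcribed from the entries in this post.
results = [
    ("Lost in the City Lights", 0.6, 0.24, 0.80),
    ("Superman Soars Above the City at Sunset", 0.7, 0.22, 0.90),
    ("Sun-Kissed Laughter", 0.6, 0.25, 0.10),
    ("The Intensity of Focus", 0.6, 0.28, 0.80),
    ("Hope Amidst the Storm", 0.7, 0.28, 0.80),
    ("Superman's Last Stand", 0.7, 0.21, 0.80),
    ("Birthday Bliss on the Rooftop", 0.7, 0.26, 0.20),
    ("Intense Gaze, Dramatic Lighting", 0.6, 0.24, 0.80),
    ("The Thrill of the Ride", 0.4, 0.31, 0.10),
    ("Iron Man Faces the Storm", 0.6, 0.23, 0.80),
]

AI_CUTOFF = 0.5  # assumed cutoff, chosen only for this illustration

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
flagged = [r[0] for r in results if r[3] >= AI_CUTOFF]

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"flagged as likely AI: {len(flagged)} of {len(results)}")
```

On these numbers the mean aesthetic score is about 0.62 and the mean prompt CLIP score about 0.25, and seven of the ten images sit at or above the assumed cutoff. The three that fall below it are the two group shots and the roller coaster image.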

Conclusion

The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and the aesthetic aspect. Here’s a breakdown:

  • Camera Position: The model scored 0.22, which is below the “good” threshold of 0.5. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
  • Shot Analysis: The model scored 0.52, which falls within the “good” range. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
  • Aesthetic Analysis: The model scored 0.26, which falls well outside the “very good” range of -0.2 to 0.1. This suggests that the generated images didn’t match the aesthetic style requested in the prompts.
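
The three checks in the breakdown can be written out explicitly. This is a minimal sketch under two assumptions: that “good” means a score at or above 0.5, and that the “very good” aesthetic range of -0.2 to 0.1 is inclusive at both ends. The helper names are ours.

```python
def meets_good(score: float, threshold: float = 0.5) -> bool:
    """Assumed rule: a score counts as 'good' at or above the threshold."""
    return score >= threshold


def in_range(score: float, low: float, high: float) -> bool:
    """Assumed rule: the 'very good' range includes both endpoints."""
    return low <= score <= high


# Scores and thresholds as quoted in the breakdown above.
camera_ok = meets_good(0.22)
shot_ok = meets_good(0.52)
aesthetic_ok = in_range(0.26, -0.2, 0.1)

for name, ok in [("Camera Position", camera_ok),
                 ("Shot Analysis", shot_ok),
                 ("Aesthetic Analysis", aesthetic_ok)]:
    print(f"{name}: {'pass' if ok else 'fail'}")
```

Read this way, only Shot Analysis passes, which matches the summary: the scene is understood, while camera position and aesthetic both miss their targets.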

Overall, the model shows promise in understanding the scene and shot composition, but needs improvement in capturing the desired aesthetic.
