AI's Facial Expressions: A Step Forward, But Still Room for Growth with Imagen-v2
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. In the realm of AI-generated imagery, the ability to accurately depict these expressions is crucial for creating compelling and engaging visuals. This blog post examines the results of an AI model’s attempt to generate images with specific facial expressions, exploring its strengths and weaknesses in capturing the nuances of human emotion.
Created with: imagen-v2
Fear in Her Eyes: A Moment of Vulnerability Captured
This haunting image captures a woman’s raw fear, her eyes wide with terror. The blurred background and dark shadows create an unsettling atmosphere, amplifying the intensity of her expression. The photograph’s aesthetic score of 0.6 speaks to its powerful impact, leaving a lasting impression of vulnerability and unease.
Prompt
facial-expressions Fear: Unease, paranoia ; A lone figure; eye-level; Single Person; a dark, deserted alleyway; cinematic
Characteristic
Shot : A close-up shot of a woman’s face, looking scared and distressed. She’s in a dimly lit environment, likely a hallway, with a hint of dust or grime on her face and clothes. The image is framed from a low angle, making her appear large and imposing.
Aesthetic Score : 0.6
Mood : tense, suspenseful, dramatic
Quality
Entropy : 6.22
Noise : 98
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some noise and slight blurriness in the image. The lighting is uneven, creating dark shadows in the background.
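The post does not say how the Entropy and Prompt Clip Score figures are computed. As a rough guide only, the sketch below assumes two common definitions: Shannon entropy of the grayscale histogram (0 to 8 bits for 8-bit images, which would be consistent with the 5.80 to 6.75 values reported in this post) and cosine similarity between a text embedding and an image embedding. The function names and the toy vectors are hypothetical, and no real CLIP model is loaded here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of grayscale values (0-255)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    # Sum p * log2(1 / p) over every gray level that occurs, with p = n / total.
    return sum((n / total) * math.log2(total / n) for n in Counter(pixels).values())


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Toy data (assumption): a single gray level versus every level equally often.
flat = [128] * 1024
uniform = list(range(256)) * 4
print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0

# Toy embeddings (assumption): identical direction versus orthogonal.
print(cosine_similarity([3.0, 4.0], [3.0, 4.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

If the Prompt Clip Score is indeed a raw cosine similarity, values around 0.2 to 0.3 are not necessarily poor, since similarities between text and image embeddings tend to sit well below 1 even for good matches. The Noise figure is left out because the post gives no hint about its scale or definition.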
The Weight of the World on His Shoulders
A close-up portrait captures the intense worry etched on the face of a superhero, the blurred cityscape behind him hinting at the vast responsibility he carries. The mood is heavy with drama and suspense, leaving the viewer wondering what challenges lie ahead.
Prompt
facial-expressions Fear: Dread, anticipation ; A superhero standing alone on a rooftop; eye-level; Hero; a cityscape shrouded in fog; cinematic
Characteristic
Shot : A close-up shot of a man’s face, possibly Superman, with a blurry cityscape in the background. The man is looking upwards with a shocked expression, and he is wearing a red cape.
Aesthetic Score : 0.6
Mood : dramatic, intense, suspenseful
Quality
Entropy : 6.75
Noise : 51
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have some blurriness, and the subject’s face appears slightly distorted. The background is very blurry and lacks detail.
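The prompts in this post all follow the same semicolon-separated pattern, which makes them easy to split apart when comparing what was requested (for example the camera angle) against what the Characteristic block reports. The sketch below is one way to do that. The field names (`subject`, `camera`, `character`, `setting`, `style`) are labels chosen here for illustration; the post itself does not name them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    category: str        # e.g. "facial-expressions"
    emotion: str         # e.g. "Fear"
    nuances: List[str]   # e.g. ["Dread", "anticipation"]
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into labeled fields."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    head, subject, camera, character, setting, style = fields
    # The first field looks like "facial-expressions Fear: Dread, anticipation".
    label, _, nuance_text = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    nuances = [n.strip() for n in nuance_text.split(",") if n.strip()]
    return PromptParts(category, emotion.strip(), nuances,
                       subject, camera, character, setting, style)


parts = parse_prompt(
    "facial-expressions Fear: Dread, anticipation ; "
    "A superhero standing alone on a rooftop; eye-level; Hero; "
    "a cityscape shrouded in fog; cinematic"
)
print(parts.camera)   # eye-level
print(parts.nuances)  # ['Dread', 'anticipation']
```

Only the semicolon is treated as a field separator, so commas inside a field (as in "a dark, deserted alleyway") stay intact.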
Fear in the Shadows
A woman’s terrified gaze pierces through the darkness, her fear palpable in the dimly lit space. The out-of-focus lights in the background only add to the suspense, creating a scene of dramatic tension.
Prompt
facial-expressions Fear: Vulnerability, isolation ; A woman walking down a dimly lit street; eye-level; Normal Person; a deserted street with flickering streetlights; cinematic
Characteristic
Shot : A woman with long hair is standing in a dimly lit area, with bokeh lights in the background. She is looking directly at the camera with a worried expression on her face.
Aesthetic Score : 0.7
Mood : suspenseful, mysterious, worried
Quality
Entropy : 6.10
Noise : 90
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight noise artifacts and compression artifacts in the image, particularly in the hair and the background.
Lost in the Red: A Moment of Intense Focus
A young man, bathed in red light, stares directly at the camera with an intensity that speaks volumes. The stylized image and blurred background create a sense of drama and tension, capturing a moment of deep concentration.
Prompt
facial-expressions Fear: Disquiet, unease ; A gamer hunched over their computer; close-up; Gamer; a flickering monitor displaying a disturbing image; cinematic
Characteristic
Shot : A close-up portrait of a young man wearing headphones, with red lighting on his face and a blue background. The image has a stylized and dramatic feel, with the subject looking intensely into the camera.
Aesthetic Score : 0.6
Mood : intense, dramatic, mysterious
Quality
Entropy : 5.80
Noise : 68
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts around the subject’s hair and headphones. There is also a slight blur around the edges of the image.
Terror Behind the Wall: A Moment of Fear Captured
A chilling image of a blonde woman peeking from behind a wall, her face contorted in terror. The dramatic lighting and composition heighten the sense of suspense and unease, leaving the viewer questioning what lies ahead.
Prompt
facial-expressions Fear: Terror, helplessness ; hiding ; low-angle; Single Person; a dark room with shadows creeping in; cinematic
Characteristic
Shot : A woman is looking over her shoulder, her face is filled with fear, she seems to be hiding. She is close to a wall or door.
Aesthetic Score : 0.4
Mood : fear, suspense, anxiety
Quality
Entropy : 6.27
Noise : 115
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and the lighting is uneven. Some digital noise and compression artifacts are present.
Caught in the Crossfire: A Man’s Moment of Terror
A close-up shot captures a man’s face contorted in fear, his long hair and brown cloak billowing in the wind. The blurry background of fire and smoke hints at a chaotic and dangerous situation, leaving the viewer to wonder what threat he faces.
Prompt
facial-expressions Fear: Desperation, courage ; A hero facing a monstrous creature; eye-level; Hero; a crumbling battlefield with smoke and debris; cinematic
Characteristic
Shot : A close-up of a man’s face, with a background of fire and smoke. He is looking at the camera with a look of fear and terror.
Aesthetic Score : 0.7
Mood : dramatic, intense, suspenseful
Quality
Entropy : 6.57
Noise : 64
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the details on the man’s face and clothing appear to be blurry, which may be an artifact of the image processing.
Fear Grips the Crowd as Lightning Strikes
A chilling moment captured as a group of people, two women and two men, look up in fear as lightning illuminates the sky. The dramatic lighting and their expressions create a palpable sense of suspense and tension.
Prompt
facial-expressions Fear: Anxiety, uncertainty ; A group of people huddled together in a darkened room; eye-level; Normal People; a storm raging outside with thunder and lightning; cinematic
Characteristic
Shot : A group of four people are looking up in fear as a lightning bolt strikes behind them. The image is composed in a way that emphasizes the fear on their faces.
Aesthetic Score : 0.5
Mood : fear, suspense, dread
Quality
Entropy : 6.39
Noise : 88
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have some noise and grain, particularly in the darker areas. The lightning bolt is also a bit pixelated.
Fear in the Shadows: A Man’s Face Frozen in Terror
A close-up shot captures a man’s face, his hands covering his eyes in a gesture of shock or fear. The dramatic, blue-green lighting casts an eerie glow, heightening the sense of intensity and suspense.
Prompt
facial-expressions Fear: Shock, adrenaline ; A gamer’s hands shaking as they play a horror game; close-up; Gamer; a screen displaying a jump scare; cinematic
Characteristic
Shot : A man’s face close-up, with both hands covering his eyes. The image is lit with green and red light, giving it a surreal and dramatic feel.
Aesthetic Score : 0.3
Mood : intense, eerie, shocking
Quality
Entropy : 6.39
Noise : 91
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight artifacts in the background, particularly around the man’s hair. The lighting is also a bit harsh and artificial.
Facing the Unknown: A Moment of Fear and Suspense
A close-up shot captures the raw emotion of a person standing on the edge of a vast, unforgiving landscape. Wind whips their hair, mirroring the turmoil within as they confront an uncertain future. The dramatic close-up intensifies the feeling of suspense, leaving the viewer questioning what lies ahead.
Prompt
facial-expressions Fear: Loneliness, despair ; A lone figure standing at the edge of a cliff; eye-level; Single Person; a vast, empty landscape with a stormy sky; cinematic
Characteristic
Shot : A close-up shot of a person’s face, with a stormy sky and ocean in the background. The person appears to be looking up in fear or alarm.
Aesthetic Score : 0.6
Mood : intense, fearful, dramatic
Quality
Entropy : 6.66
Noise : 111
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the background. There is some noise and graininess in the image.
Warrior Amidst the Flames
A lone warrior stands defiant against a backdrop of raging fire and smoke, his intense gaze reflecting the somber mood of a war-torn city. The burning buildings and the man’s determined stance create a powerful and dramatic scene, capturing the urgency and danger of the moment.
Prompt
facial-expressions Fear: Loss, determination ; A hero standing amidst a burning city; eye-level; Hero; a chaotic scene with smoke and flames; cinematic
Characteristic
Shot : A man in a dark suit stands in front of a burning city, looking determined with a hint of sadness. The fire in the background illuminates the subject’s face.
Aesthetic Score : 0.7
Mood : dark, dramatic, tense
Quality
Entropy : 6.65
Noise : 75
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : Slight blurring artifacts around the subject’s head, potentially due to image processing or AI generation.
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and the aesthetic aspect. Here’s a breakdown:
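- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt. For example, the first prompt asked for an eye-level shot, but the resulting image was described as framed from a low angle.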
- Shot Analysis: The model scored 0.615, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.17, which is considered okay. This means that the aesthetic of the generated images was somewhat different from the expected aesthetic described in the prompts.
Overall, the model shows promise in understanding the scene and shot composition, but needs improvement in accurately capturing the intended camera position and aesthetic.
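For readers who want a quick summary of the per-image numbers, the sketch below averages the metric lines listed for the ten images in this post. It does not reproduce the three scores in the breakdown (0.3, 0.615, 0.17), which come from a separate evaluation whose method the post does not describe. The 0.5 cut-off used to split images by Likelihood of AI is an arbitrary assumption made for illustration.

```python
from statistics import mean

# One row per image, in post order, copied from the metric lines:
# (aesthetic score, entropy, noise, prompt clip score, likelihood of AI)
ROWS = [
    (0.6, 6.22, 98, 0.20, 0.30),
    (0.6, 6.75, 51, 0.23, 0.90),
    (0.7, 6.10, 90, 0.27, 0.20),
    (0.6, 5.80, 68, 0.23, 0.80),
    (0.4, 6.27, 115, 0.23, 0.20),
    (0.7, 6.57, 64, 0.26, 0.80),
    (0.5, 6.39, 88, 0.28, 0.10),
    (0.3, 6.39, 91, 0.27, 0.20),
    (0.6, 6.66, 111, 0.27, 0.20),
    (0.7, 6.65, 75, 0.27, 0.80),
]
NAMES = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(rows):
    """Return the mean of each metric column, keyed by metric name."""
    return {name: mean(col) for name, col in zip(NAMES, zip(*rows))}


def mean_noise_by_ai_flag(rows, threshold=0.5):
    """Mean noise for images at or above the AI-likelihood threshold, and below it."""
    flagged = [r[2] for r in rows if r[4] >= threshold]
    unflagged = [r[2] for r in rows if r[4] < threshold]
    return mean(flagged), mean(unflagged)


summary = summarize(ROWS)
print(f"{summary['aesthetic']:.2f}")      # 0.57
print(f"{summary['entropy']:.2f}")        # 6.38
print(f"{summary['noise']:.1f}")          # 85.1
print(f"{summary['clip']:.3f}")           # 0.251
print(f"{summary['ai_likelihood']:.2f}")  # 0.45

flagged, unflagged = mean_noise_by_ai_flag(ROWS)
print(f"{flagged:.1f} {unflagged:.1f}")   # 64.5 98.8
```

In this small sample, the four images rated most likely to be AI-generated also have the lowest noise values. With only ten images, that is an observation about this set, not a general finding.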