AI's Mixed Bag: Capturing Scenes, Missing the Emotion with Imagen-v2
Realistic, expressive faces are central to compelling AI-generated imagery. This case study examines how an AI model handled ten prompts, each asking for a variation of disappointment in a different setting. For every image we list the prompt, a description of the resulting shot, and quality and detection metrics, then assess where the model captured the intended emotion and where it missed. The results offer a snapshot of how well current image generation conveys expression, and of the gap that remains between technical capability and the nuances of human emotion.
Created with: imagen-v2
Lost in the Shadows: A Mysterious Gaze in the Urban Night
A young man with piercing dark eyes stares directly into the camera, his face shrouded in shadows. The dimly lit urban setting, with blurred lights in the background, adds to the mysterious and introspective mood of the scene. The dramatic use of lighting and shadows creates a sense of intrigue, leaving the viewer wondering what secrets lie behind his gaze.
Prompt
facial-expressions Disappointment: Melancholy, isolation ; A lone figure; eye-level; Single Person; a bustling city street at night, with neon signs and blurred lights; cinematic
Characteristic
Shot : A young man with a tired expression stands in a city setting at night, with blurred-out lights in the background.
Aesthetic Score : 0.7
Mood : melancholy, introspective, dramatic
Quality
Entropy : 6.34
Noise : 62
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The lighting is a bit harsh, and there are some minor artifacts in the background. The man’s skin looks a little unnatural.
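The prompts in this study all follow the same semicolon-separated layout, which can be split into named parts for later comparison with the model's output. The following is a minimal sketch: the field names (`category`, `emotion`, `nuances`, `subject`, `camera`, `persona`, `setting`, `style`) and the `parse_prompt` helper are assumptions made for illustration, not names used by the model or by the original tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedPrompt:
    """Hypothetical field names for the six semicolon-separated prompt parts."""
    category: str        # e.g. "facial-expressions"
    emotion: str         # e.g. "Disappointment"
    nuances: List[str]   # e.g. ["Melancholy", "isolation"]
    subject: str         # who is shown
    camera: str          # e.g. "eye-level"
    persona: str         # e.g. "Single Person", "Hero", "Gamer"
    setting: str         # scene description
    style: str           # e.g. "cinematic"


def parse_prompt(prompt: str) -> ParsedPrompt:
    # Split on ";" and trim whitespace around each part.
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    # The first part looks like "<category> <emotion>: <nuance>, <nuance>".
    head, nuance_text = parts[0].split(":", 1)
    category, emotion = head.split(None, 1)
    nuances = [n.strip() for n in nuance_text.split(",")]
    return ParsedPrompt(category, emotion.strip(), nuances, *parts[1:])
```

Separating the requested emotion from the scene makes it easier to check each part on its own, for example comparing `nuances` against the reported Mood line.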
Superman: Ready for the Fight
A close-up portrait of Superman, his face etched with determination, against a backdrop of a fading cityscape. The image captures a moment of quiet resolve before the inevitable clash, with the blurred background emphasizing the hero’s unwavering focus.
Prompt
facial-expressions Disappointment: Defeated, disillusioned ; A superhero standing on a rooftop; eye-level; Hero; a cityscape bathed in the orange glow of a setting sun, with the hero’s cape billowing in the wind; cinematic
Characteristic
Shot : A close-up portrait of Superman in a dramatic pose, with a cityscape in the background. The lighting is dramatic, with a strong backlight that creates a halo effect around the subject.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.76
Noise : 78
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The texture on the suit appears overly exaggerated and artificial, and the background cityscape appears blurry and slightly unrealistic.
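The article does not say how the Entropy figure is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally frequent). The sketch below illustrates that assumption on a plain list of pixel values; `shannon_entropy` is a hypothetical helper, not the tool used for this study.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    pixels = list(pixels)
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the observed intensity levels.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this reading, values around 6.3 to 6.8 would indicate images that use a wide spread of tones, while a lower value such as 5.63 would point to a narrower tonal range.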
A Moment of Solitude and Sorrow
A woman sits alone at a cluttered kitchen table, her posture and the dim lighting conveying a sense of sadness and loneliness. The scene evokes a feeling of contemplation and isolation, leaving the viewer to ponder her thoughts and emotions.
Prompt
facial-expressions Disappointment: Hopelessness, resignation ; A woman sitting at a kitchen table; eye-level; Normal Person; a cluttered kitchen with dirty dishes and a half-eaten meal; cinematic
Characteristic
Shot : A woman sits at a table with dirty dishes in a kitchen. Her expression is sad.
Aesthetic Score : 0.6
Mood : sad, pensive, lonely
Quality
Entropy : 6.75
Noise : 105
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the background, particularly around the windows.
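The Prompt Clip Score is not defined in the article. A plausible interpretation, assumed here, is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result. The sketch below shows only the similarity step on plain lists of floats; producing the embeddings requires a CLIP model and is left out, and `cosine_similarity` is an illustrative helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

If the score is computed this way, higher values mean the image sits closer to the prompt in the embedding space, so the scores are best compared against each other rather than read as percentages.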
Intense Gaze, Dramatic Lighting: A Portrait of Suspense
A close-up portrait captures a man’s serious expression, his gaze piercing through the blurred background of colorful lights. The dramatic lighting and intense mood create a sense of suspense and tension, leaving the viewer captivated by the unspoken story.
Prompt
facial-expressions Disappointment: Frustration, anger ; A gamer sitting in front of a computer screen; eye-level; Gamer; a dimly lit room with flashing lights and the glow of the monitor reflecting in their eyes; cinematic
Characteristic
Shot : Close-up portrait of a man’s face with a focused, intense stare. The background is out of focus and features blurred neon lights.
Aesthetic Score : 0.6
Mood : intense, focused, dark
Quality
Entropy : 6.58
Noise : 57
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be digitally edited, and there are some minor artifacts in the skin and hair.
Intense Gaze in the Shadows
A man in a dark coat stands alone in a dimly lit alleyway, his piercing gaze locked on the viewer. The blurred background and dramatic lighting create a sense of mystery and intrigue, leaving you wondering what secrets lie hidden in the shadows.
Prompt
facial-expressions Disappointment: Loneliness, despair ; A man walking down a deserted street; eye-level; Single Person; a street lined with closed shops and flickering streetlights; cinematic
Characteristic
Shot : A man in a dark coat and white shirt stands in a dimly lit urban setting, possibly an alleyway. The man looks apprehensive, his expression suggesting worry or fear.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, urban
Quality
Entropy : 6.55
Noise : 63
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image exhibits some noise and graininess, especially in the background. The lighting appears somewhat uneven, causing some areas to appear darker than others.
Superman Stands Victorious in a Post-Apocalyptic Battlefield
A powerful image captures the aftermath of a brutal battle, with Superman standing tall over a fallen enemy. The gritty, smoke-filled scene evokes a sense of chaos and destruction, while the dramatic lighting and Superman’s stoic pose create a powerful and memorable moment.
Prompt
facial-expressions Disappointment: Disappointment, regret ; A hero standing over a fallen villain; eye-level; Hero; a battlefield littered with debris and smoke, with the villain’s defeated form at the hero’s feet; cinematic
Characteristic
Shot : Superman stands over a defeated foe in a post-apocalyptic setting, amidst smoke and debris.
Aesthetic Score : 0.7
Mood : dark, dramatic, heroic
Quality
Entropy : 6.71
Noise : 80
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has slight blurring and artifacts, particularly in the smoke and debris.
A Tense Silence at the Table
A still from a film, capturing a moment of palpable tension. Three figures sit at a cluttered table, their expressions revealing a mix of unease and unspoken emotions. The woman in the center stares directly at the viewer, her gaze piercing and unsettling. The young man on the left looks down, while the one on the right gazes off into the distance, adding to the sense of mystery and drama.
Prompt
facial-expressions Disappointment: Tension, estrangement ; A family gathered around a dinner table; eye-level; Normal People; a table set with a simple meal, but with an uncomfortable silence hanging in the air; cinematic
Characteristic
Shot : A family sits at a table after a meal, with a sense of unease and tension. The table is set with simple dishes and utensils.
Aesthetic Score : 0.6
Mood : tense, somber, unsettling
Quality
Entropy : 6.62
Noise : 110
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts visible in the image, especially around the edges of the characters and the table.
Caught in the Spotlight: A Moment of Shock
A woman, headphones on, stares wide-eyed at a screen in a dimly lit room. The intensity of the moment is palpable, leaving you wondering what has just unfolded. Is it a revelation, a threat, or something else entirely? The suspense is thick in the air.
Prompt
facial-expressions Disappointment: Defeat, frustration ; A gamer staring at a game over screen; eye-level; Gamer; a darkened room with the glow of the monitor reflecting in their eyes, showing a game over message; cinematic
Characteristic
Shot : A woman wearing headphones is looking at a screen. She appears to be in a state of anxiety.
Aesthetic Score : 0.5
Mood : intense, anxious, focused
Quality
Entropy : 5.63
Noise : 100
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slightly grainy texture and some artifacts, especially noticeable in the hair.
A Moment of Worry in the Rain
A woman gazes out a window, her face etched with concern. The blurred background hints at a downpour, mirroring the storm brewing within her. The lighting and her expression create a palpable sense of drama and tension, leaving the viewer to wonder what troubles her.
Prompt
facial-expressions Disappointment: Sadness, longing ; A woman standing at a window; eye-level; Single Person; a rainy day with the city streets blurred in the background; cinematic
Characteristic
Shot : A woman looks out of a window at rain with a concerned expression, possibly reflecting on her thoughts.
Aesthetic Score : 0.6
Mood : melancholy, somber, introspective
Quality
Entropy : 6.76
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly over-sharpened, creating an unnatural texture on the skin and hair. The background is overly blurred, losing details and creating a somewhat generic and artificial aesthetic.
A Warrior’s Gaze: Intensity and Drama in a Close-Up Portrait
This close-up portrait captures the intensity and drama of a warrior’s moment. The dirt on his face, the hazy golden sky, and his pensive gaze towards the left create a sense of tension and a story waiting to be told. The blurred background suggests a battlefield or a dramatic setting, adding to the overall mood of the image.
Prompt
facial-expressions Disappointment: Isolation, disillusionment ; A hero standing on a mountaintop; eye-level; Hero; a vast landscape stretching out before them, but with a sense of emptiness in the air; cinematic
Characteristic
Shot : A close-up portrait of a man with a determined expression, wearing a golden helmet and a red cloak, with a blurry background of a field and sky.
Aesthetic Score : 0.7
Mood : dramatic, intense, focused
Quality
Entropy : 6.59
Noise : 52
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image shows some subtle blur and smoothing in the skin texture, especially visible in the area around the eyes.
Conclusion
The analysis of the ten generated images reveals mixed results:
- Camera Position: The model performed moderately in capturing the intended camera position, scoring 0.35. Every prompt asked for an eye-level shot, so the model appears to follow camera directions only partially.
- Shot Analysis: The model performed well in understanding the scene described in the prompt, scoring 0.61, the highest of the three scores. This indicates the model is generally good at translating the prompt’s description into a visually coherent scene.
- Aesthetic Analysis: The model performed poorly in achieving the desired aesthetic, scoring -0.08, the only negative score. This suggests the generated images deviate noticeably from the expected aesthetic style.
Overall, the model shows promise in scene composition and, to a lesser extent, camera angles, but it struggles to achieve the desired aesthetic. The emotional target was also often missed: prompts asking for disappointment in heroes and gamers tended to yield moods described as heroic, intense, or focused rather than defeated or frustrated.
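The per-image figures listed above can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI across the ten images, using the values as reported in each entry. These averages are a separate summary and are not the three conclusion scores, whose calculation the article does not describe. The `summarize` helper and the 0.5 cut-off for "likely AI" are assumptions made for illustration.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries in this article.
IMAGES = [
    ("Lost in the Shadows", 0.7, 0.24, 0.20),
    ("Superman: Ready for the Fight", 0.7, 0.22, 0.80),
    ("A Moment of Solitude and Sorrow", 0.6, 0.27, 0.20),
    ("Intense Gaze, Dramatic Lighting", 0.6, 0.22, 0.80),
    ("Intense Gaze in the Shadows", 0.7, 0.23, 0.10),
    ("Superman Stands Victorious", 0.7, 0.24, 0.80),
    ("A Tense Silence at the Table", 0.6, 0.25, 0.10),
    ("Caught in the Spotlight", 0.5, 0.26, 0.10),
    ("A Moment of Worry in the Rain", 0.6, 0.27, 0.80),
    ("A Warrior's Gaze", 0.7, 0.18, 0.80),
]


def summarize(images, ai_threshold=0.5):
    """Average the reported metrics; ai_threshold is an assumed cut-off."""
    return {
        "mean_aesthetic": mean([row[1] for row in images]),
        "mean_clip": mean([row[2] for row in images]),
        "mean_ai_likelihood": mean([row[3] for row in images]),
        "flagged_as_ai": sum(1 for row in images if row[3] >= ai_threshold),
        "lowest_clip_title": min(images, key=lambda row: row[2])[0],
    }


if __name__ == "__main__":
    for key, value in summarize(IMAGES).items():
        print(f"{key}: {value}")
```

On this data, half of the images score 0.80 for Likelihood of AI, and the warrior portrait has the lowest Prompt Clip Score at 0.18.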