AI's Struggle with Facial Expressions: A Mixed Bag of Results with Imagen-v3-fast
The ability to accurately portray human emotion is a crucial aspect of visual storytelling. In this experiment, we tasked an AI model with generating images featuring specific facial expressions, aiming to understand how well it captures the nuances of human emotion. The results, while promising in some areas, show that the model struggles to follow the requested camera position, and they highlight how hard it remains for AI to translate complex human emotions into visual form. This blog post analyzes the generated images, explores the model’s strengths and weaknesses, and discusses the implications for the future of AI-generated art.
Created with: imagen-v3-fast
Shadowed Figure in the Rain
A solitary figure, cloaked in darkness, stands amidst the rain-slicked streets of a city. The dim glow of streetlights casts an eerie light on his face, hinting at a mystery waiting to unfold. This image evokes a sense of suspense and foreboding, leaving the viewer to wonder about the secrets hidden within the shadows.
Prompt
facial-expressions Anger: Despair and rage ; A lone figure, standing in the middle of a deserted street; eye-level; Single Person; Rain pouring down, streetlights casting long shadows; cinematic
Characteristic
Shot : A man in a black hooded coat stands in the middle of a dark and rainy city street, his face is illuminated by streetlights.
Aesthetic Score : 0.6
Mood : dark, mysterious, menacing
Quality
Entropy : 6.52
Noise : 83
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The rain effect is a bit too uniform and unrealistic. There are some aliasing artifacts on the edges of the man’s clothing and the streetlights. The lighting is a bit flat and could use more contrast.
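Every prompt in this post follows the same semicolon-separated pattern, which makes it easy to compare what was requested (for example the camera position) with what the model produced. The snippet below is a minimal sketch of how such a prompt could be split into its parts. The field names (category, emotion, subject, camera, persona, setting, style) are our own labels for illustration, not part of imagen-v3-fast or any official prompt format.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    """Hypothetical field names for the six-part prompts used in this post."""
    category: str   # e.g. "facial-expressions Anger"
    emotion: str    # e.g. "Despair and rage"
    subject: str
    camera: str     # e.g. "eye-level" or "close-up"
    persona: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a prompt on ';' and separate the leading 'category: emotion' part."""
    head, *rest = [part.strip() for part in prompt.split(";")]
    if len(rest) != 5:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(rest) + 1}")
    category, _, emotion = head.partition(":")
    return PromptFields(category.strip(), emotion.strip(), *rest)


example = parse_prompt(
    "facial-expressions Anger: Despair and rage ; A lone figure, standing in "
    "the middle of a deserted street; eye-level; Single Person; Rain pouring "
    "down, streetlights casting long shadows; cinematic"
)
print(example.camera, "|", example.emotion)
```

Splitting only on semicolons matters here because the subject and setting descriptions contain commas of their own.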
Hero Stands Tall Amidst the Ruins
A powerful superhero, clad in red and blue, commands attention in a scene of urban devastation. Their pose and expression convey a sense of intense determination and heroic resolve, promising a dramatic and thrilling story.
Prompt
facial-expressions Anger: Fury and determination ; A superhero, fists clenched, facing down a horde of villains; eye-level; Hero; A crumbling cityscape, smoke and debris filling the air; cinematic
Characteristic
Shot : A superhero in a red and blue costume stands in a ruined city with other people in the background
Aesthetic Score : 0.6
Mood : intense, dramatic, heroic
Quality
Entropy : 6.81
Noise : 73
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The background appears somewhat blurry and lacks detail. The lighting is inconsistent. The superhero’s face is slightly off in terms of proportions. There are minor inconsistencies in the overall composition.
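A note on the Entropy value listed for each image: the post does not state how it is computed. Values between roughly 6.2 and 6.8 would be consistent with the Shannon entropy, in bits, of an 8-bit grayscale histogram (which tops out at 8), but that is an assumption on our part. Under that assumption, a minimal sketch looks like this, with the function name chosen purely for illustration:

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy in bits of a flat sequence of pixel intensities.

    Assumption: this mirrors the 'Entropy' metric in the post, which the
    post itself does not define.
    """
    if not pixels:
        raise ValueError("need at least one pixel value")
    total = len(pixels)
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the observed intensity levels
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat_image = [128] * 16        # a single gray level carries no information
two_tone = [0, 255] * 8        # two equally likely levels
full_range = list(range(256))  # every 8-bit level exactly once

print(shannon_entropy(flat_image), shannon_entropy(two_tone), shannon_entropy(full_range))
```

A uniform image scores 0 bits and an image that uses all 256 levels equally scores 8 bits, so higher values indicate a more varied distribution of tones rather than better quality.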
Anger Erupts in the Boardroom
A tense atmosphere hangs heavy in the air as a man, his face contorted in anger, leans over a table littered with papers. The dark and moody lighting amplifies the dramatic tension, hinting at a heated confrontation.
Prompt
facial-expressions Anger: Frustration and rage ; A man, slamming his fist on a table, surrounded by scattered papers; eye-level; Normal Person; A cluttered office, with a window showing a stormy sky; cinematic
Characteristic
Shot : A man in a business shirt and tie, with an angry expression, leans over a table in an office with papers scattered around him. The scene is lit with a dark and moody light.
Aesthetic Score : 0.3
Mood : tense, angry, dramatic
Quality
Entropy : 6.78
Noise : 58
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry in some areas, particularly the man’s face.
The Price of Defeat: Gamer’s Frustration Caught on Camera
A dimly lit room, empty energy drink cans scattered around, and a man slumped on the floor, yelling in frustration. This image captures the raw emotion of a gamer facing defeat, highlighting the intense pressure and emotional toll of competitive gaming.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, throwing his headset on the floor, surrounded by empty energy drink cans; eye-level; Gamer; A dimly lit room, with a computer screen displaying a game in progress; cinematic
Characteristic
Shot : A man is sitting on the floor, frustrated and yelling. There are several empty cans of energy drink around him. He is in a dark room with a gaming monitor in the background, suggesting a gaming session.
Aesthetic Score : 0.4
Mood : frustration, anger, defeat
Quality
Entropy : 6.21
Noise : 51
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts, slight overexposure on the monitor.
Screaming in the Dark: A Moment of Terror Captured
A woman’s face contorted in a scream, bathed in harsh light against a blurred, dark background. This image captures a moment of intense fear and anxiety, leaving a lasting impression of the raw power of human emotion.
Prompt
facial-expressions Anger: Despair and rage ; A woman, screaming into the void, her face contorted in anger; close-up; Single Person; A dark, empty room, with only a single flickering light; cinematic
Characteristic
Shot : A woman is screaming with her mouth wide open. Her face is contorted in a grimace of distress. She has long blonde hair framing her face. The background is out of focus and dark. The lighting on the woman is bright.
Aesthetic Score : 0.2
Mood : intense, frightening, anxious
Quality
Entropy : 6.60
Noise : 60
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to be slightly out of focus and the lighting is uneven. The woman’s face appears to be overly saturated.
Silhouetted Against the Apocalypse
A lone figure in a dark cloak stands on a rooftop, silhouetted against a city consumed by flames. The setting sun casts an eerie glow, painting a scene of desolation and impending doom.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a rooftop, overlooking a city in flames; eye-level; Hero; A fiery inferno engulfing the city, with smoke billowing into the sky; cinematic
Characteristic
Shot : A lone figure in a dark cloak stands on a rooftop overlooking a city engulfed in flames and smoke. The sun sets in the distance, casting a fiery glow over the scene.
Aesthetic Score : 0.7
Mood : dramatic, apocalyptic, desolate
Quality
Entropy : 6.73
Noise : 66
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor technical errors. The edges of the buildings in the city are slightly blurry and the smoke appears somewhat pixelated.
A Tense Encounter in the Dimly Lit Restaurant
A couple’s heated conversation unfolds in a dimly lit restaurant, their worried and serious expressions captured in close-up. The blurred background hints at a bustling atmosphere, adding to the sense of intimacy and drama.
Prompt
facial-expressions Anger: Frustration and rage ; A couple, arguing in a crowded restaurant, their voices raised in anger; eye-level; Normal People; A bustling restaurant, with other diners looking on; cinematic
Characteristic
Shot : A couple is having a tense conversation in a dimly lit restaurant, the woman is looking at the man with a worried expression, the man is looking at the woman with a serious expression. The background is out of focus and suggests a busy restaurant scene.
Aesthetic Score : 0.6
Mood : tense, dramatic, intimate
Quality
Entropy : 6.73
Noise : 60
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors in the image
The Frustration of Focus: A Man Grapples with a Digital Challenge
A close-up shot captures the raw emotion of a man engrossed in his work. His furrowed brows and clenched jaw speak volumes about the intensity of his focus, hinting at a struggle he’s facing. The dramatic lighting and close-up framing amplify the tension, drawing the viewer into the heart of his frustration.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, smashing his keyboard in a fit of rage; close-up; Gamer; A dimly lit room, with a computer screen displaying a game over screen; cinematic
Characteristic
Shot : A man wearing headphones is sitting in front of a computer and is looking intensely at the screen. He appears to be frustrated, and he’s gripping the keyboard.
Aesthetic Score : 0.4
Mood : intense, frustrated, focused
Quality
Entropy : 6.28
Noise : 41
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and has some noise in the darker areas. The lighting seems artificial and doesn’t feel very natural.
Lost in the Rain: A Man’s Melancholy
A solitary figure, cloaked in darkness, stands amidst a downpour, his somber expression reflecting a profound sense of sadness. The rain, a symbol of his inner turmoil, amplifies the mood of loneliness and despair.
Prompt
facial-expressions Anger: Despair and rage ; A man, standing in the rain, his face obscured by the downpour; eye-level; Single Person; A dark, deserted street, with only the sound of rain and thunder; cinematic
Characteristic
Shot : A man in a dark coat, standing in the rain, looking down with a serious expression
Aesthetic Score : 0.6
Mood : sad, moody, dramatic
Quality
Entropy : 6.26
Noise : 102
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.70
Image errors : There are some minor artifacts in the background, particularly around the rain drops, that detract from the overall quality of the image. The rain effect seems slightly artificial.
One Warrior Stands Amidst the Ruins
A lone warrior, clad in armor, stands defiant amidst a battlefield littered with fallen soldiers. The retreating army of silhouettes in the background and the stormy sky create a grim and dramatic scene, highlighting the warrior’s resilience in the face of devastation.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a battlefield, surrounded by fallen enemies; eye-level; Hero; A battlefield littered with bodies, with smoke and dust filling the air; cinematic
Characteristic
Shot : A lone warrior, clad in armor, stands amidst a battlefield littered with fallen soldiers. The background features a retreating army of silhouettes under a stormy sky.
Aesthetic Score : 0.7
Mood : grim, dramatic, melancholic
Quality
Entropy : 6.72
Noise : 77
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have some slight artifacts and blurriness around the edges of the figures, particularly in the background.
Conclusion
The analysis of the generated images reveals mixed results:
Camera Position: The model’s performance in capturing the intended camera position is weak. With a score of 0.15, it falls significantly below the “good” range of 0.5 to 0.75. This suggests the AI struggled to translate the prompts’ camera instructions into the final images.
Shot Analysis: The model demonstrates a moderate understanding of the scene described in the prompt. A score of 0.5 falls within the “good” range, indicating a reasonable ability to translate the prompt’s scene description into the image.
Aesthetic Analysis: The images’ aesthetic is close to the expected aesthetic. The score of 0.26 sits just above the stated “very good” range of -0.2 to 0.1, suggesting the AI came close to the desired visual style.
Overall, the model shows strengths in understanding the desired aesthetic and the scene description, but struggles with accurately representing the intended camera position.
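For readers who want to put the ten images side by side, the per-image numbers reported above can be tallied with a few lines of Python. This is only a sketch: the values are copied from this post, the record layout and field names are our own, and the conclusion scores (0.15, 0.5 and 0.26) come from a separate evaluation that these per-image metrics do not reproduce.

```python
from collections import Counter
from statistics import mean
from typing import NamedTuple


class ImageRecord(NamedTuple):
    title: str
    camera: str           # camera position requested in the prompt
    aesthetic: float
    entropy: float
    noise: int
    clip: float           # "Prompt Clip Score"
    ai_likelihood: float  # "Likelihood of AI"


# Values copied from the image sections above.
IMAGES = [
    ImageRecord("Shadowed Figure in the Rain", "eye-level", 0.6, 6.52, 83, 0.31, 0.80),
    ImageRecord("Hero Stands Tall Amidst the Ruins", "eye-level", 0.6, 6.81, 73, 0.29, 0.90),
    ImageRecord("Anger Erupts in the Boardroom", "eye-level", 0.3, 6.78, 58, 0.37, 0.10),
    ImageRecord("The Price of Defeat", "eye-level", 0.4, 6.21, 51, 0.37, 0.20),
    ImageRecord("Screaming in the Dark", "close-up", 0.2, 6.60, 60, 0.33, 0.10),
    ImageRecord("Silhouetted Against the Apocalypse", "eye-level", 0.7, 6.73, 66, 0.33, 0.90),
    ImageRecord("A Tense Encounter in the Dimly Lit Restaurant", "eye-level", 0.6, 6.73, 60, 0.24, 0.10),
    ImageRecord("The Frustration of Focus", "close-up", 0.4, 6.28, 41, 0.37, 0.10),
    ImageRecord("Lost in the Rain", "eye-level", 0.6, 6.26, 102, 0.32, 0.70),
    ImageRecord("One Warrior Stands Amidst the Ruins", "eye-level", 0.7, 6.72, 77, 0.29, 0.80),
]

# Average each numeric metric across the ten images.
summary = {
    field: mean(getattr(img, field) for img in IMAGES)
    for field in ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")
}

# How often each camera position was requested.
camera_counts = Counter(img.camera for img in IMAGES)

for field, value in summary.items():
    print(f"mean {field}: {value:.3f}")
print(dict(camera_counts))
```

Keeping the records in one list makes it straightforward to add further cuts, for example grouping by requested camera position or by the reported likelihood of AI.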
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://deepmind.google/technologies/imagen-3/