AI Captures the Emotion, But Misses the Angle: A Look at Facial Expressions in AI-Generated Images with Stable Diffusion
Facial expressions are a powerful tool for conveying emotion in visual storytelling. They can add depth and complexity to a scene, making it more engaging and relatable for the viewer. In the realm of AI-generated imagery, the ability to accurately portray facial expressions is crucial for creating realistic and compelling visuals. This study explores the capabilities of a generative AI model in capturing the nuances of facial expressions, analyzing its performance in various scenarios. We delve into the model’s strengths and weaknesses, highlighting its ability to understand and depict emotion while examining its limitations in accurately replicating camera angles. Through this analysis, we gain valuable insights into the potential and challenges of using AI to generate images that effectively communicate human emotion.
Created with: stability-ai-core
A Solitary Figure Contemplates the Fury of the Storm
A lone figure stands defiant against the backdrop of a raging sea, the dramatic contrast highlighting the power of nature and the resilience of the human spirit. Dark clouds and crashing waves create a moody atmosphere, emphasizing the raw beauty of this powerful scene.
Prompt
facial-expressions Disagreement: Melancholy, isolated, conflicted ; A lone figure standing on a clifftop, looking out at a stormy sea; eye-level; Single Person; Dramatic, stormy sky with crashing waves; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a stormy sea, with dramatic, dark clouds above.
Aesthetic Score : 0.8
Mood : dramatic, moody, powerful
Quality
Entropy : 6.67
Noise : 76
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blur on the waves and a little bit of noise in the darker areas.
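The prompts in this article all appear to follow the same semicolon-separated layout. As a rough sketch, a prompt could be split into its parts as shown below. The field names (category, emotions, scene, camera, subject, background, style) are assumptions introduced here for illustration; the article does not name them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Field names are hypothetical labels, not taken from the article.
    category: str
    emotions: List[str]
    scene: str
    camera: str
    subject: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a six-field, semicolon-separated prompt into labeled parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, scene, camera, subject, background, style = fields
    # The first field looks like "<category>: <emotion>, <emotion>, ...".
    category, _, emotion_text = head.partition(":")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return PromptParts(category.strip(), emotions, scene, camera,
                       subject, background, style)


example = (
    "facial-expressions Disagreement: Melancholy, isolated, conflicted ; "
    "A lone figure standing on a clifftop, looking out at a stormy sea; "
    "eye-level; Single Person; "
    "Dramatic, stormy sky with crashing waves; cinematic"
)
parts = parse_prompt(example)
print(parts.camera, parts.emotions)
```

Separating out the camera field this way makes it easier to compare the requested camera position with what the generated image actually shows.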
Superman Faces the Flames, Protecting the City
A dramatic scene unfolds as Superman stands amidst a fiery inferno, his gaze fixed on the viewer. The intensity of the flames and the crowd of onlookers in the background create a sense of urgency and heroism, highlighting Superman’s unwavering commitment to protecting the innocent.
Prompt
facial-expressions Disagreement: Urgent, conflicted, determined ; A superhero, cape billowing in the wind, standing in front of a burning building, looking at a group of people fleeing; eye-level; Hero; City skyline with smoke and flames; cinematic
Characteristic
Shot : Superman stands in a city street, flames and smoke billow behind him. There are many people in the background, the cityscape is visible.
Aesthetic Score : 0.6
Mood : heroic, dramatic, intense
Quality
Entropy : 6.85
Noise : 78
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.60
Image errors : The fire is unrealistic, and the smoke is somewhat artificial.
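The article does not say how the Entropy value in each Quality block is computed. One common choice, consistent with reported values below 8, is the Shannon entropy in bits of the image's 8-bit grayscale histogram. The sketch below works under that assumption and takes a flat list of gray values rather than a real image file; the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy in bits of a flat sequence of gray values.

    Assumption: this is only one plausible definition of the article's
    "Entropy" metric, not a confirmed one.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over all gray levels that occur at least once.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [128] * 1024             # a single gray level: no information
uniform_image = list(range(256)) * 4  # every 8-bit level equally often

print(shannon_entropy(flat_image))     # 0.0
print(shannon_entropy(uniform_image))  # 8.0
```

Under this definition, a value such as 6.67 would indicate an image that uses a wide range of gray levels without spreading them perfectly evenly.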
A Moment of Truth: Two Men Locked in a Tense Conversation
Two men sit at a table, their expressions revealing a story of intensity and intrigue. One man gazes intently at the other, who looks away with a thoughtful frown. The background blurs, focusing attention on the unspoken tension between them. What secrets are being shared? What truths are being revealed?
Prompt
facial-expressions Disagreement: Angry, tense, frustrated ; A couple arguing in a crowded restaurant, their faces close together; close-up; Normal People; Busy restaurant interior with other diners; cinematic
Characteristic
Shot : Two men are sitting at a table in a restaurant, talking, with other people in the background.
Aesthetic Score : 0.6
Mood : serious, intense, focused
Quality
Entropy : 6.71
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant image errors.
Lost in the Blue Light: A Moment of Intense Focus
A man, bathed in blue light, sits at his desk, headphones on, eyes glued to the screen. His focused expression and the dramatic lighting create a sense of intensity and mystery, hinting at a world of work or perhaps a secret project.
Prompt
facial-expressions Disagreement: Frustrated, intense, focused ; A gamer, hunched over a computer screen, furiously clicking a mouse; close-up; Gamer; Dark room with glowing computer screen and peripherals; cinematic
Characteristic
Shot : A man wearing headphones is sitting in a dark room, typing on a keyboard. The only light source is the blue glow of the monitor.
Aesthetic Score : 0.6
Mood : focused, concentrated, serious
Quality
Entropy : 5.58
Noise : 62
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Lost in Thought: A Moment of Melancholy in a Cafe
A young woman sits alone at a cafe table, her gaze lost in the distance. The soft lighting and blurred background create a sense of solitude and introspection, capturing a moment of pensive contemplation.
Prompt
facial-expressions Disagreement: Disappointed, lonely, withdrawn ; A woman sitting alone in a coffee shop, staring at a phone with a blank expression; eye-level; Single Person; Cozy coffee shop interior with other patrons; cinematic
Characteristic
Shot : A young woman is sitting alone at a table in a cafe, looking contemplative. The cafe is busy with other patrons.
Aesthetic Score : 0.7
Mood : melancholy, pensive, introspective
Quality
Entropy : 6.62
Noise : 71
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and there are some minor artifacts in the background.
Lost in the Shadows: A Man’s Mysterious Journey
A lone figure, shrouded in darkness, stands amidst the graffiti-laden walls of a narrow alley. The dim lighting casts long shadows, creating an atmosphere of intrigue and danger. His intense expression hints at a story waiting to unfold.
Prompt
facial-expressions Disagreement: Confident, determined, defiant ; A hero, standing in a dark alleyway, looking at a villain with a determined expression; eye-level; Hero; Dark, gritty alleyway with shadows and graffiti; cinematic
Characteristic
Shot : A man with a serious expression stands in a dark alleyway with graffiti on the walls.
Aesthetic Score : 0.5
Mood : intense, urban, mysterious
Quality
Entropy : 6.33
Noise : 75
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, particularly around the edges of the man’s hair and the graffiti on the wall.
Tempers Flare on Park Bench: Heated Argument Caught on Camera
A tense scene unfolds on a park bench as a group of individuals engage in a heated argument. The camera captures their angry expressions and the dramatic tension of the moment, suggesting a conflict that has reached a boiling point.
Prompt
facial-expressions Disagreement: Angry, frustrated, heated ; A group of friends arguing in a park, their voices raised; medium shot; Normal People; Sunny park with trees and benches; cinematic
Characteristic
Shot : A group of people are sitting on a bench in a park and arguing. The background is a blurry green park with lots of trees.
Aesthetic Score : 0.3
Mood : angry, tense, argumentative
Quality
Entropy : 6.89
Noise : 79
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image seems to have a slight color mismatch between its different sections: the colors of the trees and the background don’t quite match up.
The Fury of Defeat: Capturing the Raw Emotion of Frustration
This image captures the intense frustration of a man at his desk, likely fueled by a heated gaming session or a stressful work project. The dramatic lighting and composition highlight his anger, creating a powerful and relatable moment.
Prompt
facial-expressions Disagreement: Frustrated, angry, defeated ; A gamer, slamming his fist on a desk, yelling at the computer screen; close-up; Gamer; Brightly lit gaming room with multiple monitors; cinematic
Characteristic
Shot : A man sitting at a desk in front of two computer monitors, yelling in frustration while using the keyboard.
Aesthetic Score : 0.4
Mood : frustration, anger, tension
Quality
Entropy : 6.78
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight blur in the background and some noise in the shadows, especially in the upper left corner.
Lost in the City: A Solitary Figure Walks Away
A lone figure navigates the bustling cobblestone streets, their back turned to the camera, creating an air of mystery and quiet contemplation. The urban scene buzzes with life, yet the individual feels isolated, evoking a sense of calm loneliness.
Prompt
facial-expressions Disagreement: Sad, lonely, rejected ; A man walking away from a group of people, his head down; long shot; Single Person; Busy city street with people walking by; cinematic
Characteristic
Shot : A man walks down a cobblestone street in a city, with other people walking in the same direction and some people walking in the opposite direction. It’s a grey day with the sky overcast.
Aesthetic Score : 0.7
Mood : calm, contemplative, urban
Quality
Entropy : 6.79
Noise : 75
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors. There are some minor artifacts in the background that could have been caused by compression.
Silhouette of Mystery: A Man on the Rooftop
A solitary figure in a black jacket stands on a rooftop, gazing out at the city skyline as dusk descends. The silhouette against the twinkling lights creates a sense of mystery and isolation, hinting at a story waiting to be told.
Prompt
facial-expressions Disagreement: Thoughtful, conflicted, determined ; A hero, standing on a rooftop, looking at a city skyline with a conflicted expression; eye-level; Hero; City skyline at night with twinkling lights; cinematic
Characteristic
Shot : A man is standing on a rooftop overlooking a city at dusk.
Aesthetic Score : 0.7
Mood : thoughtful, urban, mysterious
Quality
Entropy : 6.71
Noise : 67
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the background.
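The per-image values listed above can be summarized with simple averages. The sketch below uses the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values in the order the images appear. These are plain means of the listed numbers, not the Camera Position, Shot Analysis, or Aesthetic Analysis scores reported in the Conclusion, whose computation the article does not describe.

```python
from statistics import mean

# Values copied from the ten image entries, in order of appearance.
aesthetic = [0.8, 0.6, 0.6, 0.6, 0.7, 0.5, 0.3, 0.4, 0.7, 0.7]
clip_score = [0.30, 0.29, 0.31, 0.24, 0.29, 0.25, 0.30, 0.26, 0.28, 0.27]
ai_likelihood = [0.10, 0.60, 0.10, 0.10, 0.10, 0.20, 0.10, 0.10, 0.20, 0.20]

summary = {
    "mean_aesthetic": round(mean(aesthetic), 3),
    "mean_clip_score": round(mean(clip_score), 3),
    "mean_ai_likelihood": round(mean(ai_likelihood), 3),
    # 1-based position of the image with the lowest aesthetic score.
    "lowest_aesthetic_image": aesthetic.index(min(aesthetic)) + 1,
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The lowest aesthetic score belongs to the seventh image, the park bench argument, which is also the only medium shot in the set.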
Conclusion
The results show that the generative AI model performed well in understanding the scene and in aesthetic quality, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.53, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.1, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated image is very good.