AI's Facial Expressions: A Mixed Bag of Success with Stable Diffusion
Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Generative AI models are increasingly being used to create images with specific facial expressions, but how well do they capture the nuances of human emotion? This blog post delves into the performance of a generative AI model in creating images with dramatic facial expressions across diverse scenes. We’ll explore the model’s ability to understand camera position, scene details, and aesthetic style, highlighting its strengths and areas for improvement. For example, the model excels at capturing the desired aesthetic style, but struggles with accurately representing the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
Created with: stability-ai-core
Rainy Day Reflections
A woman finds solace in a cup of coffee as she gazes out at a rainy street, her expression hinting at a melancholic contemplation. The moody atmosphere captures a sense of loneliness and introspection.
Prompt
facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; Single Persons; dimly lit coffee shop with rain outside; cinematic
Characteristic
Shot : A young woman sits by a window in a cafe, looking out at the rain falling on the street outside.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, lonely
Quality
Entropy : 6.47
Noise : 73
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight chromatic aberration, but it is not very noticeable.
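Every prompt in this post follows the same semicolon-separated layout, so it can be split into parts that line up with what the conclusion evaluates (camera position, scene, aesthetic style). The sketch below is only an illustration: the field names (topic, emotion_group, subject, camera, category, scene, style) are assumptions read off the prompts, not names the post or Stability AI define.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # All field names are assumed labels, chosen for readability.
    topic: str
    emotion_group: str
    emotions: list[str]
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split one of the post's prompts into labeled fields."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    head, subject, camera, category, scene, style = fields
    # The first field looks like "facial-expressions Worry: melancholy, lonely".
    label, _, emotion_text = head.partition(":")
    topic, _, emotion_group = label.strip().partition(" ")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return PromptParts(topic, emotion_group.strip(), emotions,
                       subject, camera, category, scene, style)


example = parse_prompt(
    "facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; "
    "Single Persons; dimly lit coffee shop with rain outside; cinematic"
)
print(example.camera)    # eye-level
print(example.emotions)  # ['melancholy', 'lonely']
```

Splitting the prompt this way makes it easy to compare the requested camera position or scene against the "Shot" description reported for each image.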
Superman Stands Tall, A City at His Feet
A powerful image of a Superman figure, bathed in dramatic lighting, stares intensely at the viewer. The blurred cityscape and dark sky create a sense of isolation and heroism, emphasizing his strength and determination.
Prompt
facial-expressions Worry: intense, burdened ; Man in a superhero costume; medium shot; Heroes; cityscape at night with flashing sirens; cinematic
Characteristic
Shot : Superman standing on a bridge, city skyline in the background.
Aesthetic Score : 0.6
Mood : serious, heroic, powerful
Quality
Entropy : 6.45
Noise : 75
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.40
Image errors : Some areas appear too sharp, like the skin on the subject, and some parts of the suit.
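The post does not say how the "Entropy" value in each Quality block is computed. One common definition is the Shannon entropy of the grayscale histogram, which tops out at 8 bits for 8-bit images; the reported values (5.90 to 6.87) would be consistent with that scale. The sketch below assumes that definition and uses a hypothetical helper, shannon_entropy, on a flat list of gray values rather than a real image file.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of gray values (assumed metric)."""
    total = len(pixels)
    if total == 0:
        raise ValueError("no pixels given")
    counts = Counter(pixels)
    # Sum of -p * log2(p) over every gray level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 1024               # one gray level only: no information
uniform = list(range(256)) * 4    # all 256 levels equally frequent
print(shannon_entropy(uniform))   # 8.0
```

Under this assumed definition, a higher value means the gray levels are spread more evenly, which usually goes along with more texture and detail in the image.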
Lost in the Crowd: A Moment of Vulnerability on the Subway
A young woman stands alone on a crowded subway, her serious expression and the blurred background highlighting her isolation and internal struggle. The scene evokes a sense of melancholy, suspense, and introspection, leaving the viewer to wonder about her story.
Prompt
facial-expressions Worry: anxious, overwhelmed ; Young woman in a crowded subway; eye-level; Normal People; blurred faces of commuters; cinematic
Characteristic
Shot : A woman with a backpack stands in a crowded subway car, looking out of frame with a thoughtful expression.
Aesthetic Score : 0.7
Mood : pensive, mysterious, urban
Quality
Entropy : 6.54
Noise : 72
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor image artifacts are visible in the background.
Lost in the Code: A Man’s Focused Concentration in a Dark Room
A man, shrouded in darkness, sits before his computer, headphones on, eyes fixed on the screen. The dim lighting adds an air of mystery and intrigue, highlighting his intense focus and the serious nature of his work. This image captures the essence of a tech-savvy individual immersed in their digital world.
Prompt
facial-expressions Worry: intense, focused ; Gamer with headphones on; close-up; Gamer; dimly lit room with glowing computer screen; cinematic
Characteristic
Shot : A man wearing headphones is sitting in a dimly lit room, looking intensely at a computer screen.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 5.90
Noise : 59
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and grain, particularly in the shadows. The lighting is a bit uneven, with some areas appearing overexposed.
Autumn Melancholy: A Man Lost in Thought
A solitary figure sits on a park bench, surrounded by fallen leaves, his contemplative expression hinting at a sense of loneliness and introspection. The autumnal setting amplifies the melancholic mood, creating a poignant image of quiet contemplation.
Prompt
facial-expressions Worry: sad, reflective ; Man sitting alone on a park bench; long shot; Single Persons; empty park with falling leaves; cinematic
Characteristic
Shot : A man is sitting on a wooden bench in a park. The leaves are changing color and are all over the ground.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, autumnal
Quality
Entropy : 6.82
Noise : 78
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Solitude Amidst the Flames
A lone woman in a blue jumpsuit stands on a rooftop, her calm presence a stark contrast to the burning building and smoke-filled cityscape behind her. The scene evokes a sense of drama, tension, and impending doom, leaving the viewer to ponder the woman’s story and the fate of the city.
Prompt
facial-expressions Worry: determined, resolute ; Heroine standing on a rooftop; medium shot; Heroes; cityscape with smoke and fire in the distance; cinematic
Characteristic
Shot : A woman in a jumpsuit stands on a rooftop, looking out at a city with a burning building in the background. There’s a lot of smoke filling the sky.
Aesthetic Score : 0.6
Mood : dramatic, tense, somber
Quality
Entropy : 6.69
Noise : 65
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is slightly low, with some noise and pixelation. The color grading is a bit overdone.
Silence Speaks Volumes: A Couple’s Tense Kitchen Standoff
A couple’s simmering tension boils over in their kitchen. The woman’s upset expression and the man’s serious gaze, coupled with the untouched plates of food, paint a picture of a relationship on the brink. The composition emphasizes the distance between them, leaving the viewer to wonder what unspoken words hang heavy in the air.
Prompt
facial-expressions Worry: tense, frustrated ; Couple arguing in a kitchen; eye-level; Normal People; cluttered kitchen with dirty dishes; cinematic
Characteristic
Shot : A couple is having a tense conversation in a kitchen.
Aesthetic Score : 0.6
Mood : tense, uncertain, uncomfortable
Quality
Entropy : 6.87
Noise : 70
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.00
Image errors : No visible artifacts or errors in the image.
Lost in the Code: A Moment of Intense Focus
A young man, headphones on, is completely absorbed in his work. The close-up shot and dim lighting create a sense of intimacy, highlighting the intensity of his concentration as he types away on his keyboard.
Prompt
facial-expressions Worry: intense, focused ; Gamer’s hands on a keyboard; close-up; Gamer; flashing lights and sounds from the game; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer and typing on a keyboard. The lighting is dim and the colors are muted.
Aesthetic Score : 0.7
Mood : intense, focused, determined
Quality
Entropy : 6.08
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
Lost in the Shadows: A Woman’s Solitary Walk Through a Mysterious City
A woman in a black coat walks alone down a cobblestone street bathed in the dim glow of streetlights. The urban landscape, shrouded in darkness, creates a sense of mystery and loneliness, leaving the viewer to wonder about her destination and the secrets she carries.
Prompt
facial-expressions Worry: lonely, vulnerable ; Woman walking alone at night; long shot; Single Persons; deserted street with streetlights; cinematic
Characteristic
Shot : A woman in a black coat walks down a cobblestone street at night, illuminated by streetlights.
Aesthetic Score : 0.7
Mood : mysterious, lonely, urban
Quality
Entropy : 6.60
Noise : 72
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major artifacts or errors are visible.
Soldier’s Focus Amidst Chaos
A lone soldier, surrounded by the devastation of war, studies a map amidst billowing smoke and rubble. The scene captures the seriousness and tension of the battlefield, highlighting the urgency of the situation.
Prompt
facial-expressions Worry: serious, strategic ; Hero looking at a map; medium shot; Heroes; war-torn battlefield with smoke and debris; cinematic
Characteristic
Shot : A soldier in a war-torn city, looking at a map while surrounded by rubble and smoke.
Aesthetic Score : 0.7
Mood : intense, serious, dramatic
Quality
Entropy : 6.84
Noise : 79
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lighting is a bit unnatural, and the smoke seems a bit too clean. The soldier’s expression is a bit too intense.
Conclusion
The results show that the generative AI model performed well on aesthetic style but struggled to understand the scene and camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.47, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.11, which is considered very good. This means that the generated image closely matched the expected aesthetic style, despite the issues with camera position and scene understanding.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
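The post does not explain how the three conclusion scores were derived, so the sketch below does not try to reproduce them. It only summarizes the per-image values listed in the gallery (Aesthetic Score, Prompt Clip Score, Likelihood of AI); the shortened titles and the summarize helper are illustrative assumptions.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the per-image blocks in this post.
IMAGES = [
    ("Rainy Day Reflections", 0.7, 0.25, 0.20),
    ("Superman Stands Tall", 0.6, 0.25, 0.40),
    ("Lost in the Crowd", 0.7, 0.30, 0.20),
    ("Lost in the Code (dark room)", 0.6, 0.26, 0.10),
    ("Autumn Melancholy", 0.7, 0.33, 0.20),
    ("Solitude Amidst the Flames", 0.6, 0.32, 0.20),
    ("Silence Speaks Volumes", 0.6, 0.28, 0.00),
    ("Lost in the Code (intense focus)", 0.7, 0.27, 0.10),
    ("Lost in the Shadows", 0.7, 0.27, 0.10),
    ("Soldier's Focus Amidst Chaos", 0.7, 0.34, 0.80),
]


def summarize(images):
    """Return mean scores plus the images with the highest CLIP and AI-likelihood values."""
    return {
        "mean_aesthetic": round(mean(row[1] for row in images), 3),
        "mean_clip": round(mean(row[2] for row in images), 3),
        "mean_ai_likelihood": round(mean(row[3] for row in images), 3),
        "highest_clip": max(images, key=lambda row: row[2])[0],
        "highest_ai_likelihood": max(images, key=lambda row: row[3])[0],
    }


summary = summarize(IMAGES)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these ten images, the soldier scene has both the highest Prompt Clip Score (0.34) and the highest Likelihood of AI (0.80), while the aesthetic scores all sit at 0.6 or 0.7.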