AI's Facial Expressions: A Mixed Bag of Success with Stability-ai-ultra
Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Dramatic facial expressions, in particular, can heighten the impact of a scene and draw the viewer in. This blog post delves into the capabilities of a generative AI model in capturing these nuanced expressions across a range of scenes, from a dimly lit coffee shop to a war-torn battlefield. We’ll explore how the model performs in terms of understanding scene context, camera position, and aesthetic style, highlighting its strengths and areas for improvement.
Created with: stability-ai-ultra
Lost in Thought, Finding Comfort in the Rain
A young woman, her face etched with contemplation, sits in a warm cafe as rain pours outside. The soft lighting and steaming cup of coffee offer a sense of refuge from the storm, while her pensive gaze suggests a moment of deep introspection.
Prompt
facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; Single Persons; dimly lit coffee shop with rain outside; cinematic
Characteristic
Shot : A young woman sits at a table in a cafe, looking out at the rain. She is wearing a black jacket and a grey scarf. There is a cup of coffee in front of her.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, cozy
Quality
Entropy : 6.62
Noise : 90
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and there is some noise present, particularly in the background.
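Every prompt in this post uses the same semicolon-separated layout, so it can be split into labelled parts for later comparison with the "Characteristic" output. The sketch below is illustrative only: the field names (theme, emotions, subject, camera, category, setting, style) are assumptions inferred from the prompts shown here, not a documented stability-ai-ultra format.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    # Field names are hypothetical labels inferred from the prompts in this post.
    theme: str            # e.g. "facial-expressions Worry"
    emotions: list[str]   # e.g. ["melancholy", "lonely"]
    subject: str
    camera: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a semicolon-delimited prompt into six labelled fields."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    # The first field looks like "<theme>: <emotion>, <emotion>".
    theme, _, emotions = parts[0].partition(":")
    return PromptFields(
        theme=theme.strip(),
        emotions=[e.strip() for e in emotions.split(",") if e.strip()],
        subject=parts[1],
        camera=parts[2],
        category=parts[3],
        setting=parts[4],
        style=parts[5],
    )


example = parse_prompt(
    "facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; "
    "Single Persons; dimly lit coffee shop with rain outside; cinematic"
)
print(example.camera)    # the requested camera position
print(example.emotions)  # the requested emotions
```

Having the requested camera position as its own field makes it easy to set it beside the "Shot" description and see where the two disagree.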
The Man of Steel Under the City Lights
A dramatic silhouette of Superman, bathed in the neon glow of the city, his face etched with determination. The iconic logo on his chest shines brightly, a symbol of hope in the darkness.
Prompt
facial-expressions Worry: intense, burdened ; Man in a superhero costume; medium shot; Heroes; cityscape at night with flashing sirens; cinematic
Characteristic
Shot : A close-up of a man dressed as Superman, standing in a city street at night, with blurry lights in the background.
Aesthetic Score : 0.6
Mood : serious, heroic, dramatic
Quality
Entropy : 6.85
Noise : 85
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears to have some artifacts, most noticeable in the S logo. There is also a slight blurring effect that could be a result of digital manipulation.
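The "Entropy" value listed under Quality is not defined anywhere in this post. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally frequent). Under that assumption, values between roughly 6.3 and 7.0 would indicate fairly rich tonal variation. A minimal sketch on plain lists of pixel intensities:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a list of pixel intensity values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 64        # one grey level only: no tonal variation
ramp = list(range(256))  # every 8-bit level exactly once: maximal variation

print(shannon_entropy(flat))  # 0.0
print(shannon_entropy(ramp))  # 8.0
```

Whether the figures in this post were produced this way is not confirmed. The sketch only shows what such a number would measure.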
Lost in Thought: A Moment of Melancholy on the Subway
A young woman sits alone on a crowded subway train, her gaze fixed on the passing scenery. Her neutral expression and downcast eyes hint at a quiet sadness, while the muted colors and blurred background create a sense of unease and introspection. The image captures a fleeting moment of melancholy, leaving the viewer to ponder the woman’s thoughts and emotions.
Prompt
facial-expressions Worry: anxious, overwhelmed ; Young woman in a crowded subway; eye-level; Normal People; blurred faces of commuters; cinematic
Characteristic
Shot : A woman on a crowded subway looking off to the side.
Aesthetic Score : 0.6
Mood : pensive, anxious, contemplative
Quality
Entropy : 6.80
Noise : 83
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no major artifacts or errors in the image.
Lost in the Code: A Moment of Intense Focus
A young man, bathed in blue light, sits hunched over his computer, headphones on, eyes glued to the screen. The intensity of his focus is palpable, hinting at a task demanding his full attention and a sense of urgency in the air.
Prompt
facial-expressions Worry: intense, focused ; Gamer with headphones on; close-up; Gamer; dimly lit room with glowing computer screen; cinematic
Characteristic
Shot : A young man is wearing headphones and looking intensely at a computer screen. The room is dimly lit with blue and purple hues.
Aesthetic Score : 0.6
Mood : focused, intense, concentrated
Quality
Entropy : 6.32
Noise : 71
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, especially in the shadows and highlights. The image also appears to be slightly overexposed.
Autumn Tranquility: A Moment of Peace Amidst the Changing Colors
A solitary figure finds solace on a park bench, surrounded by fallen autumn leaves. The warm sunlight filtering through the trees creates a serene and melancholic atmosphere, highlighting the beauty of the season’s transition.
Prompt
facial-expressions Worry: sad, reflective ; Man sitting alone on a park bench; long shot; Single Persons; empty park with falling leaves; cinematic
Characteristic
Shot : A lone man sits on a bench in a park with autumn leaves on the ground and in the air. The trees in the background are bare, but the leaves are a beautiful golden color. The sun is shining and the sky is blue.
Aesthetic Score : 0.8
Mood : melancholy, peaceful, serene
Quality
Entropy : 6.95
Noise : 82
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, but they are not noticeable.
Silhouetted Against the Flames: A Soldier’s Stoic Vigil
A woman in military gear stands on a rooftop, her gaze fixed on the burning building behind her. The scene is both dramatic and somber, capturing the intensity of the moment and the weight of the situation.
Prompt
facial-expressions Worry: determined, resolute ; Heroine standing on a rooftop; medium shot; Heroes; cityscape with smoke and fire in the distance; cinematic
Characteristic
Shot : A young woman in black tactical gear stands in front of a burning building. Smoke is rising in the background, giving a sense of chaos and danger.
Aesthetic Score : 0.7
Mood : intense, dramatic, serious
Quality
Entropy : 6.78
Noise : 86
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image seems to have been slightly over-processed, which makes the colors look unrealistic.
What’s Behind the Door? Kitchen Horror Unfolds
A group of people huddle in a vibrant, saturated kitchen, their faces contorted in terror. What unseen horror has them frozen in fear? The dramatic use of color and exaggerated expressions heighten the suspense and shock of the scene.
Prompt
facial-expressions Worry: tense, frustrated ; Couple arguing in a kitchen; eye-level; Normal People; cluttered kitchen with dirty dishes; cinematic
Characteristic
Shot : A cartoon illustration of a group of friends in a kitchen, reacting to something shocking or disturbing, possibly a burnt meal or a broken appliance. The scene is depicted in a comic book style, with exaggerated expressions and poses.
Aesthetic Score : 0.6
Mood : alarmed, surprised, anxious
Quality
Entropy : 6.60
Noise : 65
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the lines in the illustration appear slightly jagged and uneven, especially in the faces, suggesting a possible lack of refining or smoothing during the digital art process. The rendering of the microwave and its reflection is somewhat unrealistic.
In the Glow of Victory: A Gamer’s Intense Focus
A young man is engrossed in a game, his face illuminated by the vibrant blue and red hues of his computer screen. The dramatic interplay of light and shadow captures the intensity of his focus and the competitive spirit of the moment.
Prompt
facial-expressions Worry: intense, focused ; Gamer’s hands on a keyboard; close-up; Gamer; flashing lights and sounds from the game; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer in a dimly lit room, focused on playing a video game. The background is blurred and the room is lit with vibrant blue and red neon lights.
Aesthetic Score : 0.6
Mood : intense, focused, futuristic
Quality
Entropy : 6.62
Noise : 70
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors.
Lost in the Fog: A Woman’s Solitary Journey
A haunting image of a lone woman walking through a foggy city street at night, bathed in the ethereal glow of streetlights. The silhouette against the blue-tinged mist evokes a sense of mystery, loneliness, and somber reflection.
Prompt
facial-expressions Worry: lonely, vulnerable ; Woman walking alone at night; long shot; Single Persons; deserted street with streetlights; cinematic
Characteristic
Shot : A lone woman walks down a street at night. The street is wet and the air is foggy, creating a moody and atmospheric scene. There are street lights and cars in the distance, adding to the sense of urban solitude.
Aesthetic Score : 0.7
Mood : melancholy, atmospheric, solitude
Quality
Entropy : 6.78
Noise : 78
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, and the colors are a bit washed out.
Amidst the Ruins, a Moment of Focus
A lone soldier, amidst a landscape ravaged by war, studies a map. The grim reality of the battlefield is starkly contrasted by the soldier’s calm determination, creating a powerful image of resilience in the face of devastation.
Prompt
facial-expressions Worry: serious, strategic ; Hero looking at a map; medium shot; Heroes; war-torn battlefield with smoke and debris; cinematic
Characteristic
Shot : A soldier is sitting in the middle of a war-torn landscape, looking at a map. The scene is full of smoke and fire.
Aesthetic Score : 0.6
Mood : dramatic, intense, apocalyptic
Quality
Entropy : 6.83
Noise : 92
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The smoke and fire appear slightly unrealistic and the image has a slightly blurry feel.
Conclusion
The results show that the generative AI model understood the described scenes reasonably well and matched the requested aesthetic style, but struggled to reproduce the requested camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.53, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.09, which is considered very good. Unlike the two scores above, a low value here apparently indicates a close match, meaning the generated images closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic analysis suggests that the model is capable of producing images that align with the desired style.
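The per-image numbers reported above can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI across the ten images and lists the images an evaluator would flag as likely AI-generated. It does not reproduce the three conclusion scores, whose calculation is not described in this post. The short labels and the 0.5 cut-off are assumptions added for illustration.

```python
from statistics import mean

# Values copied from the ten image sections of this post.
# Columns: label (hypothetical shorthand), aesthetic, clip_score, ai_likelihood
ROWS = [
    ("cafe",        0.7, 0.26, 0.10),
    ("superman",    0.6, 0.27, 0.70),
    ("subway",      0.6, 0.32, 0.10),
    ("gamer-code",  0.6, 0.33, 0.20),
    ("park bench",  0.8, 0.32, 0.20),
    ("rooftop",     0.7, 0.31, 0.70),
    ("kitchen",     0.6, 0.30, 0.80),
    ("gamer-glow",  0.6, 0.24, 0.10),
    ("fog",         0.7, 0.30, 0.30),
    ("battlefield", 0.6, 0.31, 0.80),
]

AI_THRESHOLD = 0.5  # assumed cut-off for "likely AI-generated"

mean_aesthetic = mean(r[1] for r in ROWS)
mean_clip = mean(r[2] for r in ROWS)
mean_ai = mean(r[3] for r in ROWS)
flagged = [r[0] for r in ROWS if r[3] >= AI_THRESHOLD]

print(f"mean aesthetic score:  {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print("flagged as likely AI:", ", ".join(flagged))
```

For the ten images in this post, the averages come out to 0.65 for the aesthetic score, 0.296 for the prompt CLIP score, and 0.40 for the likelihood of AI. The flagged images are the Superman, rooftop, kitchen, and battlefield scenes.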