AI's Artistic Struggle: Capturing Emotion in Visuals with Stability-ai-ultra
The ability to convey emotion through visual storytelling is a hallmark of human creativity. While AI has made significant strides in generating realistic images, capturing the subtle nuances of facial expressions and emotional depth remains a challenge. This blog post examines the results of a generative AI model tasked with creating images based on specific scene descriptions, highlighting the model’s strengths and weaknesses in capturing emotional expression.
Created with: stability-ai-ultra
Lost in the Desert’s Embrace
A solitary figure traverses a desolate dirt road, dwarfed by the vastness of the desert landscape. The cloudy sky mirrors the melancholic mood, emphasizing the feeling of isolation and contemplation.
Prompt
facial-expressions Determination: Solitude and resilience ; A lone figure; eye-level; Single Person; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure walks down a desolate dirt road in a vast, barren landscape.
Aesthetic Score : 0.6
Mood : lonely, melancholic, contemplative
Quality
Entropy : 6.28
Noise : 92
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly underexposed, resulting in a somewhat muted color palette.
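The prompts in this post appear to follow one semicolon-separated template: an emotion tag, a subject, a camera angle, a subject category, a background, and a style. A minimal sketch of splitting such a prompt into named fields is shown below. The field names and the `parse_prompt` helper are assumptions made for illustration; they are not part of any Stability AI API.

```python
from typing import Dict

# Hypothetical field names, inferred from the prompts shown in this post.
FIELDS = ["emotion", "subject", "camera_angle", "category", "background", "style"]


def parse_prompt(prompt: str) -> Dict[str, str]:
    """Split a semicolon-separated prompt into named fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


prompt = (
    "facial-expressions Determination: Solitude and resilience ; A lone figure; "
    "eye-level; Single Person; A vast, desolate landscape; cinematic"
)
fields = parse_prompt(prompt)
print(fields["camera_angle"])  # eye-level
print(fields["background"])    # A vast, desolate landscape
```

Splitting on semicolons rather than commas keeps a background such as "A vast, desolate landscape" intact as a single field.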
Superman: Hope Amidst the Ashes
A dramatic image of Superman standing tall against a backdrop of a burning cityscape. The hero’s presence offers a glimmer of hope in the face of apocalyptic destruction.
Prompt
facial-expressions Determination: Courage and unwavering resolve ; A hero standing tall; low-angle; Hero; A burning city in the background; cinematic
Characteristic
Shot : A muscular superhero, Superman, stands in front of a burning city, his cape billowing in the wind. The fire illuminates the scene, creating a dramatic contrast.
Aesthetic Score : 0.7
Mood : heroic, dramatic, intense
Quality
Entropy : 6.85
Noise : 74
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.80
Image errors : The flames are a little bit unrealistic and the cityscape is not very detailed. The fire particles in the background are quite pixelated.
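The post does not say how the Entropy value is computed. Values between 6 and 7 would be consistent with the Shannon entropy of an 8-bit grayscale histogram, which has a maximum of 8 bits, so the sketch below assumes that definition. The `shannon_entropy` helper is hypothetical and works on a plain list of intensity values rather than on an image file.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of pixel intensity values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum of p * log2(1 / p) over every intensity that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = [128] * 256          # a single intensity: nothing varies
uniform = list(range(256))  # every 8-bit intensity exactly once

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this assumption, a higher value means the intensities are spread more evenly across the available range, which usually goes with more texture or detail in the image.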
The Grind: A Day in the Life of an Industrial Worker
A man in a hardhat pushes a cart through a bustling industrial corridor, capturing the everyday reality of blue-collar work. The scene evokes a sense of industrial grit and the mundane rhythm of a working day.
Prompt
facial-expressions Determination: Grit and perseverance ; A worker pushing a heavy cart; eye-level; Normal People; A bustling factory floor; cinematic
Characteristic
Shot : A man in a green hardhat and brown jacket is pushing a yellow cart through a large industrial space. There are other people walking in the background.
Aesthetic Score : 0.6
Mood : industrial, working class, lonely
Quality
Entropy : 6.87
Noise : 88
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, such as noise in the shadows. The color saturation is also a bit low, which gives the image a slightly washed-out look.
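The Prompt Clip Score is not defined in the post either. A common way to score prompt and image agreement is the cosine similarity between a CLIP text embedding and a CLIP image embedding, and the sketch below assumes that reading. The vectors are toy stand-ins, not real CLIP embeddings, and `cosine_similarity` is a hypothetical helper.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional stand-ins; real CLIP embeddings have hundreds of dimensions.
text_embedding = [1.0, 0.0, 0.0, 0.0]
image_embedding = [0.6, 0.8, 0.0, 0.0]

print(round(cosine_similarity(text_embedding, image_embedding), 2))  # 0.6
```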
Lost in the Code: A Young Man’s Intense Focus Under Neon Lights
A young man, bathed in blue and red light, sits hunched over his computer, headphones on, his face a mask of intense concentration. The dimly lit room amplifies the dramatic mood, highlighting the seriousness of his task.
Prompt
facial-expressions Determination: Concentration and drive ; A gamer intensely focused on a screen; close-up; Gamer; A dimly lit room with glowing monitors; cinematic
Characteristic
Shot : A young man wearing a headset is seated in front of a computer screen, focused on the game, with colorful lights reflecting on his face.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 6.45
Noise : 72
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Lost in the Rain: A Moment of Contemplation
A woman gazes out of a window, her bright eyes reflecting the melancholy of a stormy city skyline. The overcast sky and falling rain create a sense of pensive introspection, highlighting the dramatic contrast between her inner light and the somber exterior.
Prompt
facial-expressions Determination: Inner strength and hope ; A woman staring out a window; eye-level; Single Person; A stormy sky; cinematic
Characteristic
Shot : A young woman looks out of a window at a city scene with a cloudy sky. The focus is on the woman’s face.
Aesthetic Score : 0.6
Mood : pensive, melancholic, contemplative
Quality
Entropy : 6.78
Noise : 89
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and grain, which could be attributed to a low-light setting or post-processing. The image also appears slightly underexposed.
A Lone Knight Stands Against the Flames of War
A solitary knight, sword raised high, stands amidst the smoldering ruins of a battlefield. The fiery sky above adds a dramatic and epic touch to this heroic scene.
Prompt
facial-expressions Determination: Victory and unwavering resolve ; A hero raising a sword; low-angle; Hero; A battlefield with fallen enemies; cinematic
Characteristic
Shot : A lone knight stands in a battlefield after a fierce battle, raising his sword in defiance. The ground is littered with burning debris and fallen comrades. The sky is filled with smoke and embers.
Aesthetic Score : 0.7
Mood : dramatic, intense, heroic
Quality
Entropy : 6.84
Noise : 83
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : The fire and smoke effects in the background appear a bit too artificial and repetitive.
Fear in the Flames: Children Huddle in Desperation
A haunting image captures the raw fear of a group of children huddled together in front of a raging fire. Their worried expressions and the fire’s menacing glow create a sense of urgency and vulnerability, leaving viewers with a profound sense of unease.
Prompt
facial-expressions Determination: Resilience and unity ; A family huddled together; eye-level; Normal People; A burning house in the background; cinematic
Characteristic
Shot : Five children, possibly siblings, huddle together against a backdrop of fire and destruction. Their expressions are fearful and sorrowful.
Aesthetic Score : 0.6
Mood : sadness, fear, despair
Quality
Entropy : 6.92
Noise : 101
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some slight inconsistencies in the rendering of the children’s faces and hair, especially around the eyes and mouths.
Neon Dreams: Capturing the Intensity of Gaming
A young gamer, bathed in vibrant pink and blue neon lights, is locked in a moment of intense focus. The close-up shot and dramatic lighting amplify the excitement and energy of the scene, showcasing the raw passion of gaming.
Prompt
facial-expressions Determination: Excitement and focus ; A gamer’s hands furiously typing on a keyboard; close-up; Gamer; A brightly lit gaming room; cinematic
Characteristic
Shot : A man in a dark room wearing headphones, lit by purple and pink light, is intensely looking at the computer screen while playing a game.
Aesthetic Score : 0.6
Mood : intense, dramatic, focused
Quality
Entropy : 6.81
Noise : 75
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight noise in the image and a bit of over-sharpening.
Lost in the Mist: A Solitary Figure Walks into the Unknown
A lone figure traverses a path shrouded in mist, the soft, diffused light casting an eerie glow. The scene evokes a sense of mystery, isolation, and anticipation, leaving the viewer wondering what lies ahead.
Prompt
facial-expressions Determination: Hope and perseverance ; A lone figure walking towards a distant light; eye-level; Single Person; A dark, foreboding forest; cinematic
Characteristic
Shot : A lone figure walks down a path in a misty forest, with the light of the sun filtering through the trees.
Aesthetic Score : 0.7
Mood : mysterious, eerie, tranquil
Quality
Entropy : 6.56
Noise : 92
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight noise in the darker areas, but no major errors.
Silhouetted Against the Sunset, a Moment of Contemplation
A lone figure stands on a rooftop, bathed in the golden hues of a setting sun. The city skyline stretches out below, a canvas of urban life. The man’s silhouette against the vibrant sky evokes a sense of calm reflection and quiet hope.
Prompt
facial-expressions Determination: Confidence and unwavering resolve ; A hero standing on a rooftop; high-angle; Hero; A city skyline bathed in sunlight; cinematic
Characteristic
Shot : A man standing on a rooftop overlooking a cityscape at sunset.
Aesthetic Score : 0.7
Mood : dramatic, contemplative, hopeful
Quality
Entropy : 6.29
Noise : 83
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight lens flare in the lower right corner of the image. The image also appears to be slightly overexposed, resulting in a loss of detail in the highlights.
Conclusion
The analysis shows that the generative AI model performed well on shot composition, but struggled with camera position and with the aesthetic aspect.
Here’s a breakdown:
- Camera Position: The model scored 0.3, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t fully capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.52, which falls within the “good” range. This indicates that the model was able to understand and translate the scene description in the prompt into a visually coherent shot.
- Aesthetic Analysis: The model scored 0.17, which falls above the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated from the aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and shot composition, but struggled to match the intended camera position and to achieve the desired aesthetic.
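The range checks in this breakdown, and a few averages over the ten images, can be reproduced with a short sketch. The per-image numbers are copied from the sections above and the titles are shortened. The `in_band` helper, the 0.5 cutoff for Likelihood of AI, and the use of the same “good” range for both camera position and shot analysis are assumptions made for illustration.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the post.
IMAGES = [
    ("Lost in the Desert's Embrace", 0.6, 0.22, 0.20),
    ("Superman: Hope Amidst the Ashes", 0.7, 0.21, 0.80),
    ("The Grind", 0.6, 0.25, 0.10),
    ("Lost in the Code", 0.6, 0.26, 0.20),
    ("Lost in the Rain", 0.6, 0.24, 0.10),
    ("A Lone Knight Stands Against the Flames of War", 0.7, 0.23, 0.90),
    ("Fear in the Flames", 0.6, 0.26, 0.90),
    ("Neon Dreams", 0.6, 0.29, 0.30),
    ("Lost in the Mist", 0.7, 0.23, 0.20),
    ("Silhouetted Against the Sunset", 0.7, 0.27, 0.20),
]


def in_band(score: float, low: float, high: float) -> bool:
    """Return True when score lies inside the inclusive range [low, high]."""
    return low <= score <= high


camera_ok = in_band(0.3, 0.5, 0.75)      # camera position vs. the "good" range
shot_ok = in_band(0.52, 0.5, 0.75)       # shot analysis vs. the "good" range
aesthetic_ok = in_band(0.17, -0.2, 0.1)  # aesthetic vs. the "very good" range
print(camera_ok, shot_ok, aesthetic_ok)  # False True False

mean_aesthetic = round(mean(row[1] for row in IMAGES), 2)
mean_clip = round(mean(row[2] for row in IMAGES), 3)
print(mean_aesthetic, mean_clip)  # 0.64 0.246

# Images an AI detector would flag, using an assumed cutoff of 0.5.
flagged = [title for title, _, _, ai in IMAGES if ai >= 0.5]
print(len(flagged))  # 3
```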