AI's Mixed Bag: Capturing Emotion in Images with Stability-ai-ultra
The ability to convey emotion through facial expressions is a hallmark of human communication. It’s a complex interplay of muscle movements, subtle nuances, and context that AI models are still learning to master. This blog post explores a case study where an AI model was tasked with generating images featuring specific facial expressions. We’ll examine the results, highlighting the model’s successes and challenges in capturing the nuances of human emotion. This analysis will shed light on the ongoing development of AI in the realm of artistic expression, particularly in the area of portraying the human experience.
Created with: stability-ai-ultra
Lost in the Neon Glow: A Silhouette of Mystery
A solitary figure walks through a city bathed in vibrant neon light, their silhouette shrouded in mystery. The urban landscape hums with energy, while a sense of loneliness hangs in the air. This evocative scene captures the essence of urban life, where shadows and light intertwine to create a captivating narrative.
Prompt
facial-expressions Disappointment: Melancholy, isolation ; A lone figure; eye-level; Single Person; a bustling city street at night, with neon signs and blurred lights; cinematic
Characteristic
Shot : A person walking away from the camera down a street lined with neon signs at night.
Aesthetic Score : 0.6
Mood : mysterious, urban, lonely
Quality
Entropy : 5.68
Noise : 63
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
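The post does not say how the Entropy figure is computed. One common reading, assumed here, is the Shannon entropy (in bits) of the image's grayscale intensity histogram, which tops out at 8 for 8-bit images and would fit the 5 to 7 range reported for these images. The function name and the pixel-list input below are illustrative, not the evaluation pipeline actually used:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of intensity values.

    Assumption: `pixels` is a flat sequence of grayscale values (0-255).
    This is a guess at what the "Entropy" metric measures, not the
    post's documented method.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity value that actually occurs.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


flat_image = [128] * 16           # a single flat tone carries no information
two_tone = [0, 0, 255, 255]       # two equally likely tones: one bit
full_range = list(range(256))     # every 8-bit value once: the 8-bit maximum
```

Under this reading, a higher value means tonal variety is spread more evenly across the image, so a dark, low-contrast frame would score lower than a busy, evenly lit one.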
Superman Stands Tall Against the Setting Sun
A dramatic silhouette against the fading light, Superman surveys the city he protects. The blurred cityscape emphasizes the hero’s powerful presence and hopeful gaze towards the future.
Prompt
facial-expressions Disappointment: Defeated, disillusioned ; A superhero standing on a rooftop; eye-level; Hero; a cityscape bathed in the orange glow of a setting sun, with the hero’s cape billowing in the wind; cinematic
Characteristic
Shot : A man dressed as Superman is standing on a rooftop overlooking a cityscape at sunset.
Aesthetic Score : 0.7
Mood : epic, heroic, dramatic
Quality
Entropy : 6.58
Noise : 80
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image has some minor artifacts, such as a slight blur around the edges of the subject.
A Moment of Solitude
A woman sits alone at a kitchen table; her pensive gaze and the empty space around her evoke a sense of loneliness and somber reflection. The composition of the image emphasizes her isolation, creating a poignant and evocative scene.
Prompt
facial-expressions Disappointment: Hopelessness, resignation ; A woman sitting at a kitchen table; eye-level; Normal Person; a cluttered kitchen with dirty dishes and a half-eaten meal; cinematic
Characteristic
Shot : A woman is sitting at a kitchen table, looking thoughtful. There is food on the table and dishes in the background.
Aesthetic Score : 0.6
Mood : pensive, somber, quiet
Quality
Entropy : 6.69
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors, but the image looks somewhat flat.
Lost in the Digital Realm: A Gamer’s Intense Focus
A young man, bathed in the vibrant glow of his computer screen, is completely absorbed in the digital world. The dramatic lighting and his intense concentration create a sense of futuristic immersion, capturing the essence of a gamer lost in the heat of the moment.
Prompt
facial-expressions Disappointment: Frustration, anger ; A gamer sitting in front of a computer screen; eye-level; Gamer; a dimly lit room with flashing lights and the glow of the monitor reflecting in their eyes; cinematic
Characteristic
Shot : A young man is sitting in front of a computer screen, wearing a headset and looking focused on the screen. The room is lit in warm and cool tones, creating a dramatic atmosphere. The image is cropped tightly, focusing on the man’s face and hands.
Aesthetic Score : 0.7
Mood : intense, focused, determined
Quality
Entropy : 6.67
Noise : 82
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some slight noise in the shadows, particularly around the edges of the computer screens. There are no noticeable artifacts or other significant errors.
A Solitary Stroll Through Dusk’s Embrace
A lone figure walks down a quiet, narrow street in a European city at dusk. The sky is a vibrant blue, casting a mysterious glow on the closed shops lining the street. The scene evokes a sense of calm and tranquility, with a hint of urban mystery.
Prompt
facial-expressions Disappointment: Loneliness, despair ; A man walking down a deserted street; eye-level; Single Person; a street lined with closed shops and flickering streetlights; cinematic
Characteristic
Shot : A solitary figure walks down a deserted street in a city, lined with closed shops. The street is wet from recent rain and the sky is a hazy blue.
Aesthetic Score : 0.7
Mood : gloomy, lonely, melancholic
Quality
Entropy : 6.89
Noise : 102
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, particularly in the sky, which has a slightly artificial look.
Hope Amidst the Ashes: A Lone Figure in a Destroyed City
A powerful image captures the aftermath of devastation, with a lone figure in a red cloak standing amidst burning buildings and rubble. The scene evokes a sense of loss and chaos, yet the figure’s determined gaze suggests a glimmer of hope in the face of destruction.
Prompt
facial-expressions Disappointment: Disappointment, regret ; A hero standing over a fallen villain; eye-level; Hero; a battlefield littered with debris and smoke, with the villain’s defeated form at the hero’s feet; cinematic
Characteristic
Shot : A lone figure stands in a city street ravaged by fire. The buildings are mostly destroyed, and debris litters the ground.
Aesthetic Score : 0.7
Mood : desolate, ominous, dramatic
Quality
Entropy : 6.84
Noise : 98
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be slightly blurry, and there are some artifacts in the background.
Silent Supper: A Family Dinner Gone Wrong
A family gathers around a seemingly ordinary dinner table, but the atmosphere is thick with tension and discomfort. The food may be normal, but the mood is anything but, highlighting the awkwardness and unspoken struggles beneath the surface.
Prompt
facial-expressions Disappointment: Tension, estrangement ; A family gathered around a dinner table; eye-level; Normal People; a table set with a simple meal, but with an uncomfortable silence hanging in the air; cinematic
Characteristic
Shot : A family dinner with a tense atmosphere. The characters are all sitting at a table with food, but they are not looking at each other or smiling.
Aesthetic Score : 0.3
Mood : awkward, tense, uncomfortable
Quality
Entropy : 5.77
Noise : 46
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image quality is slightly blurry. Some elements, like the writing, are pixelated. The perspective of the table is off.
The Weight of Defeat
A solitary figure sits in the dim light, headphones on, facing a stark red ‘Game Over!’ on the screen. The scene captures the crushing weight of loss and the isolating feeling of defeat.
Prompt
facial-expressions Disappointment: Defeat, frustration ; A gamer staring at a game over screen; eye-level; Gamer; a darkened room with the glow of the monitor reflecting in their eyes, showing a game over message; cinematic
Characteristic
Shot : A gamer in a dimly lit room, wearing headphones, sitting in front of a computer screen displaying “Game Over!” in red neon letters.
Aesthetic Score : 0.6
Mood : melancholy, defeated, gaming
Quality
Entropy : 5.98
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors, but the lighting is a bit too dark.
Lost in the Rain: A Moment of Melancholy
A woman gazes out a window at a rainy city night, her expression filled with longing. The blurry lights and raindrops on the glass create a sense of mystery and isolation, capturing a moment of introspective sadness.
Prompt
facial-expressions Disappointment: Sadness, longing ; A woman standing at a window; eye-level; Single Person; a rainy day with the city streets blurred in the background; cinematic
Characteristic
Shot : A woman is looking out of a window while it rains outside. The window is covered in raindrops, and the city lights can be seen in the background.
Aesthetic Score : 0.7
Mood : melancholy, pensive, contemplative
Quality
Entropy : 6.35
Noise : 72
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
A Moment of Solitude on the Mountaintop
A lone figure stands on a majestic peak, dwarfed by the vastness of the landscape below. The serene blue sky and fluffy clouds create a sense of peace and contemplation, while the adventurous spirit of the figure is palpable. This image evokes a feeling of awe and perspective, reminding us of our place in the grand scheme of things.
Prompt
facial-expressions Disappointment: Isolation, disillusionment ; A hero standing on a mountaintop; eye-level; Hero; a vast landscape stretching out before them, but with a sense of emptiness in the air; cinematic
Characteristic
Shot : A lone hiker stands on a mountaintop, looking out at a vast valley landscape below. The sky is a bright blue with fluffy clouds, suggesting a pleasant day.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, adventurous
Quality
Entropy : 6.47
Noise : 73
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors.
Conclusion
The analysis of the generated images reveals mixed results:
- Camera Position: The model captures the requested camera position fairly well, with a score of 0.1. The camera position in the generated images is somewhat similar to what the prompts requested.
- Shot Analysis: The model understands and recreates the scenes described in the prompts reasonably well, with a score of 0.5. The generated images capture the essence of each scene, though some details may differ.
- Aesthetic Analysis: The aesthetic of the generated images is slightly off from what was expected, with a score of -0.06. The overall visual style is not quite what was envisioned.
Overall, the model shows some strengths in understanding the scene and camera position but struggles to fully capture the desired aesthetic.
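The per-image figures listed in this post can also be summarized directly, which gives a rough cross-check on the conclusion. The sketch below is only an illustration: the numbers are copied from the image entries above, the titles are shortened, and the `summarize` helper and its field names are hypothetical, not part of any tool used to produce the post.

```python
from statistics import mean

# (short title, Aesthetic Score, Prompt Clip Score, Likelihood of AI),
# copied from the ten image entries in this post.
IMAGES = [
    ("Lost in the Neon Glow", 0.6, 0.23, 0.20),
    ("Superman Stands Tall", 0.7, 0.31, 0.50),
    ("A Moment of Solitude", 0.6, 0.28, 0.10),
    ("Lost in the Digital Realm", 0.7, 0.19, 0.20),
    ("A Solitary Stroll", 0.7, 0.25, 0.20),
    ("Hope Amidst the Ashes", 0.7, 0.26, 0.90),
    ("Silent Supper", 0.3, 0.33, 0.80),
    ("The Weight of Defeat", 0.6, 0.28, 0.20),
    ("Lost in the Rain", 0.7, 0.25, 0.20),
    ("Solitude on the Mountaintop", 0.7, 0.24, 0.10),
]


def summarize(images):
    """Return simple averages and the extremes of the prompt score."""
    return {
        "mean_aesthetic": mean(row[1] for row in images),
        "mean_prompt_clip": mean(row[2] for row in images),
        "mean_ai_likelihood": mean(row[3] for row in images),
        "lowest_prompt_clip": min(images, key=lambda row: row[2])[0],
        "highest_prompt_clip": max(images, key=lambda row: row[2])[0],
    }


summary = summarize(IMAGES)
print(f"mean aesthetic score:   {summary['mean_aesthetic']:.2f}")
print(f"mean prompt clip score: {summary['mean_prompt_clip']:.3f}")
print(f"mean likelihood of AI:  {summary['mean_ai_likelihood']:.2f}")
print(f"lowest prompt match:    {summary['lowest_prompt_clip']}")
print(f"highest prompt match:   {summary['highest_prompt_clip']}")
```

In this data the image with the highest Prompt Clip Score (Silent Supper, 0.33) also has the lowest Aesthetic Score (0.3), while the lowest prompt match belongs to the gamer image, whose detected mood (intense, focused, determined) differs from the frustration and anger the prompt requested.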
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://stability.ai