AI Captures Emotion, But Struggles with Perspective with Stability-ai-ultra
Generating realistic and expressive images is a rapidly evolving area of artificial intelligence. One problem of particular interest is the creation of facial expressions that convey a wide range of emotions. This blog post explores the results of a study that tested an AI model’s ability to generate images with dramatic facial expressions, focusing on how well it captured camera position, scene content, and aesthetic style. We’ll look at the model’s strengths and weaknesses and at what they suggest about the potential and limitations of AI in generating emotionally charged imagery.
Dramatic facial expressions are often used in film, photography, and other forms of visual storytelling to enhance the emotional impact of a scene. Understanding how AI models generate these expressions gives a clearer picture of the role technology plays in shaping our visual experiences.
Created with: stability-ai-ultra
Lost in the Shadows: A Solitary Figure Walks Through the Rain
A melancholic scene unfolds as a lone figure traverses a rain-soaked street at night, bathed in the ethereal glow of streetlights. The interplay of light and shadow creates an atmosphere of mystery and isolation, drawing the viewer into the figure’s solitary journey.
Prompt
facial-expressions Anger: Despair and rage ; A lone figure, standing in the middle of a deserted street; eye-level; Single Person; Rain pouring down, streetlights casting long shadows; cinematic
Characteristic
Shot : A solitary figure walks down a wet, empty street at night, illuminated by streetlights. The rain falls heavily, creating a moody atmosphere.
Aesthetic Score : 0.7
Mood : melancholy, atmospheric, urban
Quality
Entropy : 6.91
Noise : 99
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.60
Image errors : The rain effect appears somewhat artificial and repetitive. The lighting is a bit over-saturated, creating a slightly unrealistic look. Some of the building textures are blurry and lack detail.
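Every prompt in this post follows the same semicolon-separated layout, so it can be split into parts for later comparison with the generated image. The sketch below is only an illustration: the field names (`topic`, `emotion_family`, `emotion`, `subject`, `camera`, `character`, `setting`, `style`) are labels assumed here, not terms defined by the study.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    # All field names are hypothetical labels chosen for this sketch.
    topic: str
    emotion_family: str
    emotion: str
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a prompt of the form used in this post into its parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, camera, character, setting, style = parts
    # The first part looks like "facial-expressions Anger: Despair and rage".
    label, _, emotion = head.partition(":")
    words = label.split(maxsplit=1)
    topic = words[0] if words else ""
    family = words[1].strip() if len(words) > 1 else ""
    return PromptFields(topic, family, emotion.strip(), subject,
                        camera, character, setting, style)


example = (
    "facial-expressions Anger: Despair and rage ; A lone figure, standing in "
    "the middle of a deserted street; eye-level; Single Person; Rain pouring "
    "down, streetlights casting long shadows; cinematic"
)
fields = parse_prompt(example)
print(fields.camera, "|", fields.style)
```

Separating out the camera field makes it easier to check each image against the requested camera position, which is the aspect the conclusion reports as weakest.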
The Last Stand: A Warrior’s Determination in the Face of Chaos
A close-up shot captures the intensity of a futuristic warrior, his fist clenched in defiance as he stands amidst a blurry backdrop of a raging battle. Smoke and fire engulf the scene, creating a sense of urgency and chaos. This powerful image evokes a mood of determination and resilience in the face of overwhelming odds.
Prompt
facial-expressions Anger: Fury and determination ; A superhero, fists clenched, facing down a horde of villains; eye-level; Hero; A crumbling cityscape, smoke and debris filling the air; cinematic
Characteristic
Shot : A man in a futuristic suit, with his fist clenched, is in a tense and chaotic environment. It looks like a battle scene with explosions and debris in the background.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.74
Noise : 79
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts and blurring, especially around the edges.
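The post does not say how the Entropy figure is computed. One common definition, assumed here, is the Shannon entropy of the grayscale intensity histogram, measured in bits. For 8-bit images it ranges from 0 to 8, which is at least consistent with the values reported in this post. A minimal sketch using only the standard library:

```python
import math
from collections import Counter


def shannon_entropy(pixels) -> float:
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this is one common definition of image entropy; the
    study's actual method is not stated in the post.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A flat image carries no information; a uniform spread over all 256
# intensity levels reaches the 8-bit maximum.
flat = [128] * 1000
uniform = list(range(256)) * 4
print(shannon_entropy(flat), shannon_entropy(uniform))
```

If Pillow is available, the grayscale bytes of a real image could be passed in directly, for example `shannon_entropy(Image.open(path).convert("L").tobytes())`, where `path` is a placeholder for an image file.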
Fury in the Flames: Man’s Desperate Cry Amidst Urban Chaos
A man, consumed by rage, sits amidst a sea of paperwork, his fists clenched as he screams at the camera. The backdrop is a city engulfed in flames, mirroring the intensity of his anger. This dramatic scene captures a moment of raw emotion and desperation, leaving the viewer questioning the source of his fury and the fate of the burning city.
Prompt
facial-expressions Anger: Frustration and rage ; A man, slamming his fist on a table, surrounded by scattered papers; eye-level; Normal Person; A cluttered office, with a window showing a stormy sky; cinematic
Characteristic
Shot : A man is sitting at a desk with his fists clenched, surrounded by papers. There are flames behind him. He looks frustrated.
Aesthetic Score : 0.4
Mood : stressful, angry, tense
Quality
Entropy : 6.75
Noise : 79
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.70
Image errors : The flames look artificial and the man’s expression is a bit exaggerated. The lighting is a bit harsh.
Gamer’s Rage: The Moment He Lost It All
A gamer, caught in the throes of intense frustration, screams at his screen. Empty soda cans litter the scene, a testament to the heat of the moment. The dramatic lighting and the gamer’s raw emotion capture the chaotic energy of a gaming session gone wrong.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, throwing his headset on the floor, surrounded by empty energy drink cans; eye-level; Gamer; A dimly lit room, with a computer screen displaying a game in progress; cinematic
Characteristic
Shot : A man wearing headphones is screaming at his computer in a dimly lit room. The room is filled with energy drink cans and has a gaming setup.
Aesthetic Score : 0.4
Mood : intense, chaotic, frustrated
Quality
Entropy : 6.85
Noise : 81
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some slight artifacts visible in the image, especially on the edges of the man’s hair and around the can of energy drink. There is a noticeable halo around the man’s head.
Screaming in the Dark: A Portrait of Raw Emotion
This dramatic close-up captures a woman’s intense scream, bathed in shadows that heighten the sense of urgency and anger. The image is a powerful testament to the raw emotions that can erupt within us.
Prompt
facial-expressions Anger: Despair and rage ; A woman, screaming into the void, her face contorted in anger; close-up; Single Person; A dark, empty room, with only a single flickering light; cinematic
Characteristic
Shot : A close-up of a woman’s face. She is screaming with her mouth wide open. The lighting is dramatic with deep shadows, and the background is blurred.
Aesthetic Score : 0.6
Mood : intense, dramatic, angry
Quality
Entropy : 5.53
Noise : 65
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, particularly around the woman’s hair. The skin tone is slightly unrealistic, and the overall lighting is a bit too dramatic.
Silhouetted Against the Flames: A Man Contemplates the Apocalypse
A solitary figure stands on a rooftop, silhouetted against a cityscape consumed by fire. Smoke billows, creating a dramatic backdrop of chaos and destruction. The man’s back is turned, leaving his thoughts and emotions shrouded in mystery. This image evokes a sense of impending doom and the weight of a world in flames.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a rooftop, overlooking a city in flames; eye-level; Hero; A fiery inferno engulfing the city, with smoke billowing into the sky; cinematic
Characteristic
Shot : A lone figure stands on a rooftop, facing away from the viewer, against a backdrop of a city engulfed in flames. The fire is intense, with tall flames and thick plumes of smoke filling the sky.
Aesthetic Score : 0.7
Mood : dramatic, somber, apocalyptic
Quality
Entropy : 6.89
Noise : 83
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, and there are some minor artifacts in the smoke.
Emotions Run High in This Heated Restaurant Dispute
A series of four panels captures the raw intensity of an argument unfolding in a restaurant setting. Exaggerated expressions, dramatic body language, and close-up framing heighten the tension, leaving viewers on the edge of their seats.
Prompt
facial-expressions Anger: Frustration and rage ; A couple, arguing in a crowded restaurant, their voices raised in anger; eye-level; Normal People; A bustling restaurant, with other diners looking on; cinematic
Characteristic
Shot : A group of people are arguing at a restaurant. They are angry and yelling at each other.
Aesthetic Score : 0.6
Mood : angry, tense, dramatic
Quality
Entropy : 6.64
Noise : 66
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is a bit too clean and polished. The use of strong colours and outlines makes the image look cartoonish. The speech bubbles are very generic. The lighting seems flat; adding more shadows and highlights would improve the overall look.
Rage Against the Machine: Man Explodes in Keyboard Fury
A man’s frustration boils over, resulting in a fiery outburst directed at his computer. Sparks fly, an explosion erupts, and his face contorts in a primal scream of anger. This image captures the raw intensity of digital rage, leaving viewers questioning the limits of human tolerance.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, smashing his keyboard in a fit of rage; close-up; Gamer; A dimly lit room, with a computer screen displaying a game over screen; cinematic
Characteristic
Shot : A man in a red hoodie is screaming at a keyboard, which is engulfed in flames. The background is blurry and lit with blue and red lights.
Aesthetic Score : 0.6
Mood : intense, dramatic, angry
Quality
Entropy : 6.68
Noise : 79
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.70
Image errors : The fire looks slightly artificial and the background is blurry.
Lost in the Rain: A City’s Shadows Embrace a Mysterious Figure
A solitary figure navigates a rain-soaked city street, the glistening pavement reflecting the dim glow of streetlights. The shadows cast by the downpour shroud the man’s face, adding an air of mystery and suspense to the scene. This brooding image evokes a sense of darkness and intrigue, leaving the viewer to wonder about the man’s story and the secrets he carries.
Prompt
facial-expressions Anger: Despair and rage ; A man, standing in the rain, his face obscured by the downpour; eye-level; Single Person; A dark, deserted street, with only the sound of rain and thunder; cinematic
Characteristic
Shot : A man stands in a dark, rainy street at night, illuminated by the soft glow of streetlights.
Aesthetic Score : 0.7
Mood : mysterious, moody, melancholic
Quality
Entropy : 6.64
Noise : 91
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight over-saturation and minor noise in the background.
Warrior’s Resolve Amidst the Inferno
A lone warrior stands defiant against a backdrop of raging flames and smoke, his determined gaze reflecting the intensity of the battle. This epic scene captures the warrior’s courage and resilience in the face of overwhelming odds.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a battlefield, surrounded by fallen enemies; eye-level; Hero; A battlefield littered with bodies, with smoke and dust filling the air; cinematic
Characteristic
Shot : A warrior in a red tunic and black armor stands in the midst of a chaotic battle scene. He is surrounded by fire, smoke, and fallen soldiers. He looks like he is in the midst of battle and is ready to fight.
Aesthetic Score : 0.7
Mood : intense, gritty, dramatic
Quality
Entropy : 6.85
Noise : 94
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image has some noise and artifacts. The lighting is uneven.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the generated images often did not reflect the camera position described in the prompt.
- Shot Analysis: The model scored 0.6, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create images that reflect it well.
- Aesthetic Analysis: The model scored 0.21, which is considered very good. This means that the generated images closely matched the expected aesthetic style.
Overall, the model seems to be better at understanding the scene and achieving the desired aesthetic than accurately capturing the camera position.
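The three scores in the breakdown are reported without a formula, so they cannot be reproduced here. The per-image figures listed earlier can still be summarized directly. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten entries in this post and averages them; the tuple layout and variable names are assumptions made for illustration.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image entries in this post.
RESULTS = [
    ("Lost in the Shadows", 0.7, 0.31, 0.60),
    ("The Last Stand", 0.7, 0.29, 0.90),
    ("Fury in the Flames", 0.4, 0.30, 0.70),
    ("Gamer’s Rage", 0.4, 0.36, 0.30),
    ("Screaming in the Dark", 0.6, 0.31, 0.90),
    ("Silhouetted Against the Flames", 0.7, 0.27, 0.80),
    ("Emotions Run High", 0.6, 0.33, 0.80),
    ("Rage Against the Machine", 0.6, 0.33, 0.70),
    ("Lost in the Rain", 0.7, 0.28, 0.20),
    ("Warrior’s Resolve Amidst the Inferno", 0.7, 0.23, 0.60),
]

mean_aesthetic = mean(row[1] for row in RESULTS)
mean_clip = mean(row[2] for row in RESULTS)
mean_ai_likelihood = mean(row[3] for row in RESULTS)

# Entries whose image matched the prompt text best and worst by CLIP score.
best_match = max(RESULTS, key=lambda row: row[2])
worst_match = min(RESULTS, key=lambda row: row[2])

print(f"mean aesthetic score:  {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.2f}")
print(f"mean likelihood of AI: {mean_ai_likelihood:.2f}")
print(f"best prompt match:  {best_match[0]}")
print(f"worst prompt match: {worst_match[0]}")
```

These averages describe only the ten images shown and are separate from the Camera Position, Shot Analysis, and Aesthetic Analysis scores above.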
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://stability.ai