AI's Struggle with Facial Expressions: A Look at the Gap Between Scene and Aesthetic with Stable Diffusion
Dramatic facial expressions are a powerful tool in storytelling, conveying a wide range of emotions and adding depth to characters. From the subtle twitch of a lip to a full-blown scream, these expressions can draw the viewer in and create a visceral connection. However, replicating these expressions in AI-generated images remains a challenge. While AI models can understand the context of a scene and even capture the basic features of a face, they often struggle to convey the nuanced emotions that make facial expressions so compelling. This is particularly evident in scenarios where the prompt calls for a specific emotional state, such as sadness, anger, or fear. The generated images may accurately depict the scene and camera position, but the facial expressions themselves often lack the depth and realism that would make them truly impactful.
Created with: stability-ai-core
Lost in the Rain: A Gloomy Cityscape
A solitary figure, shrouded in a black coat, navigates a deserted city street under a relentless downpour. The wide-angle lens captures the vastness of the empty space, amplifying the sense of loneliness and somber mood.
Prompt
facial-expressions Shame: Desolate, lonely, regretful ; A lone figure, hunched over, walking down a deserted street; eye-level; Single Person; Rain-slicked pavement and flickering streetlights; cinematic
Characteristic
Shot : A man walks down a wet street in the city, carrying an umbrella. The street is lined with buildings and there are streetlights casting a warm glow on the wet pavement.
Aesthetic Score : 0.7
Mood : dark, lonely, melancholic
Quality
Entropy : 6.83
Noise : 80
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, making the overall scene look a little washed out.
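The prompts in this series appear to follow one semicolon-separated pattern: a topic and emotion with descriptive words, then the subject, the camera position, the character type, the setting, and the style. The Python sketch below splits a prompt into those parts so each can be compared against the generated image. The field names (`topic`, `subject`, `camera`, and so on) are illustrative labels chosen here, not terms from the original pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Field names are hypothetical labels for the six prompt segments.
    topic: str
    emotion: str
    emotion_words: List[str]
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into labelled fields."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    # The first segment looks like "facial-expressions Shame: word, word, word".
    header, words = parts[0].split(":", 1)
    topic, emotion = header.split(maxsplit=1)
    return PromptParts(
        topic=topic,
        emotion=emotion,
        emotion_words=[w.strip() for w in words.split(",")],
        subject=parts[1],
        camera=parts[2],
        character=parts[3],
        setting=parts[4],
        style=parts[5],
    )


prompt = (
    "facial-expressions Shame: Desolate, lonely, regretful ; "
    "A lone figure, hunched over, walking down a deserted street; "
    "eye-level; Single Person; "
    "Rain-slicked pavement and flickering streetlights; cinematic"
)
parts = parse_prompt(prompt)
print(parts.emotion, parts.emotion_words)
print(parts.camera, "|", parts.style)
```

Separating the camera field from the emotion words makes it easier to see which part of a prompt the image followed and which part it ignored.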
Heroic Silhouette: A Superhero Stands Watch at Sunset
A dramatic image of a masked superhero silhouetted against a fiery sunset, capturing the hero’s power and the anticipation of an impending conflict.
Prompt
facial-expressions Shame: Melancholy, disillusioned, burdened ; A superhero, their mask removed, revealing a face etched with pain; eye-level; Hero; A cityscape bathed in the glow of a setting sun; cinematic
Characteristic
Shot : A superhero in a costume stands on a rooftop overlooking a cityscape at sunset.
Aesthetic Score : 0.6
Mood : dramatic, powerful, heroic
Quality
Entropy : 6.84
Noise : 71
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur, and the cityscape appears a bit soft.
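The post does not say how the Entropy value is computed. Values around 6.8 would be consistent with the Shannon entropy of an 8-bit intensity histogram, which has a maximum of 8 bits, but that reading is an assumption. Under that assumption, a minimal sketch of the measure on a flat list of pixel intensities could look like this:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors a histogram-based image entropy; the metric
    actually used for the values in this post is not documented.
    """
    counts = Counter(pixels)
    total = len(pixels)
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy += p * math.log2(1 / p)
    return entropy


flat = [0] * 64             # a single intensity carries no information
uniform = list(range(256))  # every 8-bit intensity equally often

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

On this scale, a higher value means the intensities are spread more evenly across the available range, while a lower value points to large areas of similar tone.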
Lost in the Crowd: A Moment of Solitude in a Bustling Diner
A woman sits alone at a diner, her hands on her head, a plate of untouched food before her. The bustling background blurs, emphasizing her isolation and the weight of her melancholy. The image captures a poignant moment of loneliness and contemplation.
Prompt
facial-expressions Shame: Embarrassed, defeated, self-loathing ; A woman, her face buried in her hands, sitting alone at a crowded diner table; eye-level; Normal Person; The bustling activity of the diner, a stark contrast to her isolation; cinematic
Characteristic
Shot : A woman sits alone in a diner, seemingly distressed, with her hands on her head, and two plates of food in front of her. The background shows other people in the diner, but they are out of focus.
Aesthetic Score : 0.6
Mood : melancholy, loneliness, pensive
Quality
Entropy : 6.84
Noise : 63
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.00
Image errors : No noticeable errors
Lost in the Game: A Moment of Contemplation
A young man sits at his desk, lost in thought as a video game plays on his screen. The low light and his contemplative pose create a sense of introspection and tension, hinting at a deeper story unfolding within the digital world.
Prompt
facial-expressions Shame: Empty, defeated, lost in a digital world ; A gamer, staring blankly at a screen, his controller lying idle; eye-level; Gamer; A dimly lit room filled with gaming paraphernalia, a sense of disconnection; cinematic
Characteristic
Shot : A man is sitting in front of a computer, looking thoughtful. The computer monitor is showing a scene from a video game or movie. The man is holding a gaming controller. There is a keyboard, mouse, and other gaming equipment on the desk.
Aesthetic Score : 0.6
Mood : thoughtful, introspective, focused
Quality
Entropy : 6.08
Noise : 60
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : None. The image appears well-lit and sharp, with no obvious artifacts.
What Will He Do? The Tension Is Palpable.
A man stands on the precipice, his worried expression mirroring the intensity of the crowd behind him. This dramatic scene, bathed in warm artificial light, is ripe with suspense. What will unfold next?
Prompt
facial-expressions Shame: Anxious, self-conscious, out of place ; A man, standing in a crowded room, his eyes darting nervously around; eye-level; Single Person; A party scene, filled with laughter and conversation, but he feels isolated; cinematic
Characteristic
Shot : A group of men, mostly in their 20s and 30s, are standing shoulder-to-shoulder, looking at the camera, with one man in the foreground, standing out from the crowd.
Aesthetic Score : 0.7
Mood : intense, serious, suspenseful
Quality
Entropy : 6.86
Noise : 78
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Silhouetted Against the City: A Moment of Melancholy
A solitary figure stands on a rooftop, gazing out over the sprawling cityscape. The image evokes a sense of contemplation and isolation, with the man’s silhouette starkly contrasting against the distant lights of the city.
Prompt
facial-expressions Shame: Disheartened, disillusioned, questioning his purpose ; A hero, standing on a rooftop, looking down at the city below; not too close; Hero; A panoramic view of the city, but he feels small and insignificant; cinematic
Characteristic
Shot : A man is standing on a rooftop, looking out at the city skyline. The city is in the background, and the man is in the foreground.
Aesthetic Score : 0.7
Mood : thoughtful, urban, contemplative
Quality
Entropy : 6.72
Noise : 67
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.00
Image errors : There are no visible artifacts or errors in the image.
A Moment of Quiet Reflection
A woman sits alone at a kitchen table, her gaze fixed on a plate of food. Her expression is one of deep thought, perhaps tinged with sadness. The scene evokes a sense of quiet contemplation and introspection.
Prompt
facial-expressions Shame: Depressed, unmotivated, lost in her thoughts ; A woman, sitting at her kitchen table, staring at a plate of untouched food; eye-level; Normal Person; A cluttered kitchen, a reflection of her inner turmoil; cinematic
Characteristic
Shot : A woman is sitting at a kitchen table, looking down at a plate of food with a sad expression. The scene is set in a domestic kitchen with a window, a countertop, and a microwave in the background.
Aesthetic Score : 0.4
Mood : sad, pensive, contemplative
Quality
Entropy : 6.84
Noise : 73
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major image errors are visible. The image is well-exposed and the colors are accurate.
Lost in the Code: A Hacker’s Focus Under Neon Lights
A young man, shrouded in the glow of blue and green lights, sits intently at his computer. His focused expression and the futuristic aesthetic of the scene create a sense of mystery and intrigue, drawing you into the world of a hacker lost in the depths of code.
Prompt
facial-expressions Shame: Despair, addiction, a sense of being lost ; A gamer, hunched over his keyboard, his fingers flying across the keys, but his eyes are filled with sadness; eye-level; Gamer; A brightly lit gaming room, but he feels trapped in a digital world; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, wearing headphones and a dark green hoodie, typing on a keyboard. The scene is lit with blue and green lighting, creating a moody, almost cyberpunk aesthetic.
Aesthetic Score : 0.6
Mood : cyberpunk, focused, serious
Quality
Entropy : 6.14
Noise : 63
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness around the edges of the image, likely due to post-processing.
Lost in the City: A Man’s Mysterious Journey
A solitary figure walks through a bustling city, his gaze fixed on the camera with an intensity that speaks volumes. The shallow depth of field isolates him from the surrounding crowd, creating a sense of mystery and suspense. This urban scene evokes a mood of intensity, leaving the viewer wondering about the man’s story and his destination.
Prompt
facial-expressions Shame: Rejected, isolated, a sense of being unwanted ; A man, walking away from a group of people, his head down, his shoulders slumped; eye-level; Single Person; A bustling street, but he feels alone and invisible; cinematic
Characteristic
Shot : A man is walking down a crowded street. He is looking straight ahead with a serious expression. The street is lined with buildings on either side and there are people walking in both directions.
Aesthetic Score : 0.6
Mood : suspenseful, dark, mysterious
Quality
Entropy : 6.77
Noise : 72
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some blurriness in the background, some minor noise in the image.
Warrior’s Gaze: A Portrait of Grit and Conflict
A lone warrior stands defiant against a backdrop of ruin, his intense gaze piercing the darkness. This gritty image captures the raw power and uncertainty of a world on the brink, leaving you questioning what battles lie ahead.
Prompt
facial-expressions Shame: Guilt, regret, a sense of responsibility ; A hero, standing in the ruins of a battle, his armor dented and his face covered in grime; not too close; Hero; A scene of destruction, a reminder of the cost of his actions; cinematic
Characteristic
Shot : A man in armor stands amidst the ruins of a building. His face shows the scars of battle. The background is blurred and out of focus, focusing the attention on the man.
Aesthetic Score : 0.7
Mood : dramatic, intense, gritty
Quality
Entropy : 6.86
Noise : 75
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and graininess, particularly in the background. There are also some minor color inconsistencies.
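To compare the ten images at a glance, the per-image values listed above can be averaged. The sketch below copies the Aesthetic Score and Prompt Clip Score from each block. These averages are separate from the scores in the conclusion that follows, whose calculation the post does not describe.

```python
from statistics import fmean

# (title, aesthetic score, prompt CLIP score), copied from the ten image blocks
images = [
    ("Lost in the Rain", 0.7, 0.27),
    ("Heroic Silhouette", 0.6, 0.26),
    ("Lost in the Crowd", 0.6, 0.31),
    ("Lost in the Game", 0.6, 0.27),
    ("What Will He Do?", 0.7, 0.23),
    ("Silhouetted Against the City", 0.7, 0.25),
    ("A Moment of Quiet Reflection", 0.4, 0.27),
    ("Lost in the Code", 0.6, 0.30),
    ("Lost in the City", 0.6, 0.24),
    ("Warrior's Gaze", 0.7, 0.30),
]

mean_aesthetic = fmean(a for _, a, _ in images)
mean_clip = fmean(c for _, _, c in images)
weakest_match = min(images, key=lambda row: row[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.2f}")
print(f"lowest prompt CLIP score: {weakest_match[0]} ({weakest_match[2]:.2f})")
```

The lowest prompt match belongs to the party scene, where the prompt asked for a man glancing around nervously and the image shows a group of men looking at the camera.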
Conclusion
The generative AI model understood the described scenes well, responded only moderately to the camera positions in the prompts, and struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.3, indicating a moderate ability to react to camera positions in the prompt. This is below the “good” range of 0.5 to 0.75, suggesting room for improvement in accurately capturing the intended camera angles.
- Shot Analysis: The model scored 0.575, falling within the “good” range. This means it was able to understand the scene described in the prompt and translate it into a visually coherent image.
- Aesthetic Analysis: The model scored 0.12, which is slightly above the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated somewhat from the expected aesthetic, potentially lacking the desired visual style or composition.
Overall, the model shows promise in understanding scene descriptions, but needs further development to follow camera positions reliably and to consistently achieve the desired aesthetic.
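The comparison of each score against its target range can be written out as a small check. The sketch below uses the three scores and ranges quoted in the breakdown. Applying the 0.5 to 0.75 “good” range to Shot Analysis is an assumption, since the text only names that range explicitly for Camera Position.

```python
def position_in_range(score: float, low: float, high: float) -> str:
    """Report whether a score is below, within, or above an inclusive range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (metric, score, range low, range high) as quoted in the breakdown
checks = [
    ("Camera Position", 0.3, 0.5, 0.75),
    ("Shot Analysis", 0.575, 0.5, 0.75),     # range assumed, see note above
    ("Aesthetic Analysis", 0.12, -0.2, 0.1),
]

results = {
    name: position_in_range(score, low, high)
    for name, score, low, high in checks
}

for name, verdict in results.items():
    print(f"{name}: {verdict} target range")
```

Only Shot Analysis lands inside its range. Camera Position falls below its range, and Aesthetic Analysis sits just past the upper bound of its range.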