AI's Facial Expressions: A Mixed Bag of Emotions with Stable Diffusion
Facial expressions are a powerful tool for conveying emotions and adding depth to visual narratives. In the realm of generative AI, the ability to create realistic and expressive faces is a crucial benchmark. This blog post delves into the performance of a generative AI model in capturing facial expressions across diverse scenes, analyzing its strengths and weaknesses in understanding camera position, scene details, and aesthetic style. We’ll explore how the model excels in capturing the aesthetic style but struggles with accurately implementing camera positions and scene descriptions. Through this analysis, we gain insights into the current capabilities and limitations of AI in generating expressive imagery.
Created with: stability-ai-core
Lost in the Neon Glow: A City Night’s Mystery
A solitary figure navigates a rain-slicked city street, bathed in the vibrant hues of neon signs. The image evokes a sense of nostalgia, urban grit, and a touch of melancholy, leaving the viewer to ponder the figure’s destination and the secrets hidden within the shadows.
Prompt
facial-expressions Contempt: Alienation, isolation, detachment ; A lone figure, back turned to the camera; eye-level; Single Person; A bustling city street at night, neon signs reflecting in puddles; cinematic
Characteristic
Shot : A lone figure walks down a wet city street at night. The street is lined with neon signs and the reflection of the lights in the puddles creates a vibrant and colorful scene.
Aesthetic Score : 0.8
Mood : dark, mysterious, urban
Quality
Entropy : 6.20
Noise : 83
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible image errors
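The prompts in this post appear to follow a fixed, semicolon-separated layout: a topic and emotion label with keywords, the subject, the camera position, a character category, the scene, and the style. The Python sketch below splits a prompt into those parts under that assumption. The field names and the `parse_prompt` helper are hypothetical labels chosen for illustration; the post does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels inferred from the prompt layout.
    topic: str
    emotion: str
    keywords: list
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six assumed sections."""
    sections = [s.strip() for s in prompt.split(";")]
    if len(sections) != 6:
        raise ValueError(f"expected 6 sections, got {len(sections)}")
    head, subject, camera, category, scene, style = sections
    # The first section looks like "<topic> <Emotion>: kw1, kw2, kw3".
    label, _, keyword_text = head.partition(":")
    topic, _, emotion = label.strip().partition(" ")
    keywords = [k.strip() for k in keyword_text.split(",") if k.strip()]
    return PromptParts(topic, emotion, keywords, subject, camera, category, scene, style)


parts = parse_prompt(
    "facial-expressions Contempt: Alienation, isolation, detachment ; "
    "A lone figure, back turned to the camera; eye-level; Single Person; "
    "A bustling city street at night, neon signs reflecting in puddles; cinematic"
)
print(parts.camera)    # eye-level
print(parts.keywords)  # ['Alienation', 'isolation', 'detachment']
```

Separating the camera position and scene from the rest of the prompt makes it easier to check each one against the generated image on its own.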
Superman’s Silhouette: A Heroic Sunset
A dramatic image captures Superman standing tall on a rooftop, his silhouette against the vibrant sunset. The scene evokes a sense of hope and heroism, leaving a lasting impression.
Prompt
facial-expressions Contempt: Disillusionment, weariness, cynicism ; A superhero, standing on a rooftop, looking down at the city; eye-level; Hero; A cityscape bathed in the golden light of sunset; cinematic
Characteristic
Shot : A man dressed as Superman stands on a rooftop overlooking a cityscape at sunset.
Aesthetic Score : 0.7
Mood : heroic, dramatic, hopeful
Quality
Entropy : 6.83
Noise : 68
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the background, likely from compression.
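Each Quality block reports an Entropy value, but the post does not say how it is computed. One common definition is the Shannon entropy of the grayscale intensity histogram, measured in bits. For 8-bit images this tops out at 8, which would fit the values between 6 and 7 reported here. The sketch below assumes that definition; `shannon_entropy` is a hypothetical helper that takes a flat list of 0-255 intensity values rather than an image file.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a sequence of 8-bit intensity values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # H = sum over levels of p * log2(1 / p), where p is the level's frequency.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = [128] * 1000              # a single gray level: no variation at all
uniform = list(range(256)) * 4   # every gray level equally likely
print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this reading, a higher value means the image uses a wider and more even spread of tones; it says nothing about whether the expression matches the prompt.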
The Walk of Power: Tension Builds in the Hallway
A man in a suit strides confidently down a brightly lit hallway, his destination shrouded in mystery. The conference room, filled with expectant faces, adds to the palpable tension. Is he about to deliver a game-changing announcement, or face a critical confrontation? The scene is ripe with anticipation, leaving the viewer on the edge of their seat.
Prompt
facial-expressions Contempt: Apathy, boredom, resignation ; A man in a suit, walking through a crowded office; eye-level; Normal People; A sterile, corporate office environment, fluorescent lights casting harsh shadows; cinematic
Characteristic
Shot : A man in a suit walks past a conference room of people in suits.
Aesthetic Score : 0.7
Mood : serious, powerful, corporate
Quality
Entropy : 6.72
Noise : 63
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
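The Prompt Clip Score presumably measures how well the image agrees with its prompt. Assuming it is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result (the post does not state this), the underlying arithmetic looks like the sketch below. The vectors are made-up toy values, not real embeddings, and `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text embedding and an image embedding (made-up numbers).
text_vec = [1.0, 0.0, 0.0]
image_vec = [3.0, 4.0, 0.0]
print(cosine_similarity(text_vec, image_vec))  # 0.6
```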
The Hacker in the Shadows
A young man, shrouded in darkness, sits hunched over his keyboard, his focused expression hinting at a secret mission. The dimly lit room and multiple computer monitors create an atmosphere of mystery and intrigue, leaving you wondering what he’s working on.
Prompt
facial-expressions Contempt: Obsessive, detached, nihilistic ; A gamer, hunched over a computer screen, eyes glued to the monitor; eye-level; Gamer; A dimly lit room, cluttered with gaming paraphernalia; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, typing on a keyboard. The room is dimly lit, and the man’s face is illuminated by the light from the monitor.
Aesthetic Score : 0.7
Mood : serious, focused, techy
Quality
Entropy : 6.04
Noise : 65
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors in this image. The lighting is good and the subject is well composed.
Melancholy in the Rain
A woman, lost in thought, watches the rain fall from a cafe window. Her black leather jacket and the somber mood create a sense of isolation and contemplation. The rain adds a dramatic touch, highlighting the quiet beauty of the moment.
Prompt
facial-expressions Contempt: Melancholy, loneliness, disillusionment ; A woman, sitting alone in a cafe, staring out the window; eye-level; Single Person; A rainy day, the cafe filled with the sound of rain and chatter; cinematic
Characteristic
Shot : A woman sitting at a cafe table, looking out the window at the rain. She is wearing a black leather jacket and has a cup of coffee in front of her.
Aesthetic Score : 0.7
Mood : melancholic, contemplative, moody
Quality
Entropy : 6.57
Noise : 76
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to be slightly overexposed. The background is a little bit blurry.
Superman’s Shadow: A Hero in the Darkness
A solitary figure, cloaked in the shadows of a narrow alleyway, stands as a beacon of power. The dim lighting and cobblestone ground create an atmosphere of mystery and intrigue, highlighting the imposing presence of the Man of Steel.
Prompt
facial-expressions Contempt: Superiority, arrogance, disdain ; A hero, standing over a defeated villain, looking down with disdain; not too close; Hero; A dark, gritty alleyway, lit by flickering streetlights; cinematic
Characteristic
Shot : A man dressed as Superman stands in a narrow, dimly lit alleyway.
Aesthetic Score : 0.6
Mood : mysterious, dark, heroic
Quality
Entropy : 6.65
Noise : 81
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Awaiting Their Fate: Tension Rises in the Line
A group of young adults stand in a tight line, their expressions serious and expectant. The atmosphere is thick with anticipation, hinting at a moment of uncertainty and potential tension.
Prompt
facial-expressions Contempt: Indifference, apathy, boredom ; A group of people, standing in a queue, looking bored and apathetic; eye-level; Normal People; A sterile, modern shopping mall, filled with the sounds of chatter and music; cinematic
Characteristic
Shot : A group of people are standing in a line, looking forward with serious expressions. The setting is likely an indoor public space, possibly a hallway or a waiting area. The focus is on the faces of the people, and the composition is tight, with little space between them. The lighting is fairly well-balanced, with some subtle shadows around the edges of the people’s faces.
Aesthetic Score : 0.6
Mood : serious, tense, anticipation
Quality
Entropy : 6.71
Noise : 74
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears slightly overexposed, with some blown-out highlights in the background. There are also some minor artifacts around the edges of some of the people’s faces, possibly due to compression.
On the Edge of Victory: A Gamer’s Intense Focus
A young man, headphones on and controller in hand, sits in a dimly lit room, his eyes fixed on something off-screen. The atmosphere is charged with suspense, hinting at a thrilling moment in the game. The disarray of the room and the dynamic scene on the computer screen add to the intensity of the moment.
Prompt
facial-expressions Contempt: Desensitization, aggression, detachment ; A gamer, playing a violent video game, his face contorted in a grimace; not too close; Gamer; A dimly lit room, filled with the sounds of explosions and gunfire; cinematic
Characteristic
Shot : A man in a dark room wearing headphones is playing video games. The background is a blurry image of a fire and a TV screen.
Aesthetic Score : 0.6
Mood : serious, focused, intense
Quality
Entropy : 6.65
Noise : 69
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors or artifacts. Slight color banding on the monitor.
Lost in Thought: A Man Walks Through the Mist
A solitary figure walks down a cobblestone path, enveloped in a misty atmosphere. The muted light and surrounding trees create a sense of melancholy and introspection, highlighting the man’s isolation and contemplative mood.
Prompt
facial-expressions Contempt: Despair, loneliness, isolation ; A man, walking through a deserted park, his face etched with sadness; eye-level; Single Person; A park at dusk, the trees casting long shadows; cinematic
Characteristic
Shot : A man walks down a cobblestone path lined with trees in a foggy or misty environment. The photo is in black and white.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, somber
Quality
Entropy : 6.31
Noise : 88
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant image errors.
A Soldier’s Worried Gaze Amidst the Chaos of War
A powerful image captures the intensity of war, with a soldier in the foreground, his face etched with worry, as a fiery battlefield unfolds behind him. The shallow depth of field and dramatic lighting create a sense of tension and urgency, highlighting the soldier’s vulnerability and the chaos surrounding him.
Prompt
facial-expressions Contempt: Disillusionment, cynicism, weariness ; A hero, standing on a battlefield, surrounded by the carnage of war; not too close; Hero; A battlefield, littered with the bodies of fallen soldiers; cinematic
Characteristic
Shot : A soldier in full combat gear stands in the foreground of a war-torn battlefield, with smoke and fire in the background.
Aesthetic Score : 0.7
Mood : tense, gritty, dramatic
Quality
Entropy : 6.85
Noise : 79
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some slight blurriness and noise, possibly from the use of a high ISO setting.
Conclusion
The analysis shows that the generative AI model matched the requested aesthetic style well, but was weaker at following the camera position and the scene description. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.5, which is considered average. This indicates that the model was able to understand the scene in the prompt to a reasonable degree, but not exceptionally well.
- Aesthetic Analysis: The model scored 0.07, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model seems to be better at understanding the aesthetic style than the camera position and scene. This suggests that the model might need further training to improve its ability to accurately interpret and implement camera positions and scene descriptions.
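The per-image figures reported for the ten images can be summarized with a few lines of Python. The values below are transcribed from the entries in this post; the tuple layout and variable names are illustrative only, and these averages are separate from the camera, shot, and aesthetic analysis scores quoted in the breakdown.

```python
from statistics import mean

# (aesthetic score, prompt CLIP score, likelihood of AI) per image, in post order.
scores = [
    (0.8, 0.26, 0.20),  # Lost in the Neon Glow
    (0.7, 0.27, 0.20),  # Superman's Silhouette
    (0.7, 0.24, 0.10),  # The Walk of Power
    (0.7, 0.25, 0.20),  # The Hacker in the Shadows
    (0.7, 0.28, 0.10),  # Melancholy in the Rain
    (0.6, 0.24, 0.10),  # Superman's Shadow
    (0.6, 0.26, 0.20),  # Awaiting Their Fate
    (0.6, 0.31, 0.20),  # On the Edge of Victory
    (0.7, 0.19, 0.10),  # Lost in Thought
    (0.7, 0.25, 0.20),  # A Soldier's Worried Gaze
]

mean_aesthetic = mean(s[0] for s in scores)
mean_clip = mean(s[1] for s in scores)
mean_ai_likelihood = mean(s[2] for s in scores)

print(f"mean aesthetic score: {mean_aesthetic:.3f}")        # 0.680
print(f"mean prompt CLIP score: {mean_clip:.3f}")           # 0.255
print(f"mean likelihood of AI: {mean_ai_likelihood:.3f}")   # 0.160
```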