AI's Struggle with Facial Expressions: A Mixed Bag of Results with Stable Diffusion
Facial expressions are a powerful tool for conveying emotions and intentions. In AI-generated imagery, capturing these nuances is a significant challenge. This blog post examines the results of an AI model tasked with generating images featuring specific facial expressions. We look at the model’s performance, analyze its strengths and weaknesses, and discuss the challenges of generating realistic and expressive imagery.
Dramatic-style facial expressions are often used in film, television, and theater to emphasize emotions and create a heightened sense of realism. They are typically exaggerated and stylized, which makes them more impactful and memorable. For example, a character’s anger might be conveyed through a furrowed brow, clenched jaw, and narrowed eyes, while a character’s sadness might be expressed through a drooping mouth, tearful eyes, and a slumped posture. Understanding these nuances helps us appreciate the challenges AI models face in generating realistic and expressive imagery.
Created with: stability-ai-core
Lost in the Neon Glow: A Woman’s Solitary Journey Through the City
A young woman, shrouded in darkness, stands alone in a vibrant cityscape bathed in neon light. The bustling city life fades into a blur, emphasizing her isolation and creating a sense of mystery and intrigue.
Prompt
facial-expressions Disappointment: Melancholy, isolation ; A lone figure; eye-level; Single Person; a bustling city street at night, with neon signs and blurred lights; cinematic
Characteristic
Shot : A young woman in a black hooded jacket is standing in a bustling city street at night, with neon signs reflecting in the background.
Aesthetic Score : 0.7
Mood : mysterious, urban, moody
Quality
Entropy : 6.27
Noise : 60
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious artifacts or errors.
Superman: A Silhouette of Power at Sunset
A dramatic image captures Superman standing tall on a rooftop, his cape billowing in the wind as the sun sets over the city. The silhouette against the vibrant sky evokes a sense of heroism and power.
Prompt
facial-expressions Disappointment: Defeated, disillusioned ; A superhero standing on a rooftop; eye-level; Hero; a cityscape bathed in the orange glow of a setting sun, with the hero’s cape billowing in the wind; cinematic
Characteristic
Shot : Superman standing on a rooftop with a cityscape behind him at sunset.
Aesthetic Score : 0.7
Mood : heroic, powerful, hopeful
Quality
Entropy : 6.78
Noise : 71
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : Some minor artifacts are present in the sky and the city, likely due to digital manipulation.
A Moment of Melancholy at the Kitchen Table
A woman sits alone at a kitchen table, surrounded by remnants of a meal. Her pensive expression and slumped posture convey a sense of loneliness and contemplation. The scene evokes a feeling of melancholy, leaving the viewer to wonder about the weight of her thoughts.
Prompt
facial-expressions Disappointment: Hopelessness, resignation ; A woman sitting at a kitchen table; eye-level; Normal Person; a cluttered kitchen with dirty dishes and a half-eaten meal; cinematic
Characteristic
Shot : A woman sits at a table in a kitchen, looking down, with her hands on her face. There are various dishes of food on the table, including a plate of noodles and a bowl of tomatoes.
Aesthetic Score : 0.5
Mood : melancholy, contemplative, somber
Quality
Entropy : 6.83
Noise : 70
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Lost in the Code: A Moment of Intense Focus
A man sits before his computer, headphones on, bathed in the blue glow of his monitor. His expression is thoughtful, revealing a deep concentration as he navigates the complexities of his work. The scene captures the essence of focused intensity, a moment of quiet contemplation in the midst of a demanding task.
Prompt
facial-expressions Disappointment: Frustration, anger ; A gamer sitting in front of a computer screen; eye-level; Gamer; a dimly lit room with flashing lights and the glow of the monitor reflecting in their eyes; cinematic
Characteristic
Shot : A young man sits at a desk in a dimly lit room, wearing headphones and looking at a computer screen.
Aesthetic Score : 0.6
Mood : focused, serious, contemplative
Quality
Entropy : 6.26
Noise : 59
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant image errors. There are some minor imperfections in the lighting, and the subject is not in perfect focus.
Lost in the Shadows: A Solitary Figure in a Black and White City
A man walks alone down a deserted cobblestone street, the soft, atmospheric lighting casting long shadows and creating a sense of mystery and solitude. The vintage architecture and black and white aesthetic evoke a melancholic mood, highlighting the man’s isolation in the bustling city.
Prompt
facial-expressions Disappointment: Loneliness, despair ; A man walking down a deserted street; eye-level; Single Person; a street lined with closed shops and flickering streetlights; cinematic
Characteristic
Shot : A man walking down a narrow, cobblestone street lined with shops. It is a gloomy day, and the street is almost deserted. The man is wearing a long coat and is walking with his head down. The mood is melancholic, but also peaceful.
Aesthetic Score : 0.7
Mood : melancholic, peaceful, introspective
Quality
Entropy : 6.59
Noise : 89
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors or artifacts.
Solemn Vigil: A Superhero’s Grief in a World of Ashes
A lone superhero, cloaked in black, stands over a fallen comrade amidst the smoldering ruins of a city. The image captures the weight of loss and the solemn reflection of a hero facing a world consumed by destruction.
Prompt
facial-expressions Disappointment: Disappointment, regret ; A hero standing over a fallen villain; eye-level; Hero; a battlefield littered with debris and smoke, with the villain’s defeated form at the hero’s feet; cinematic
Characteristic
Shot : A lone superhero stands over a fallen comrade in a post-apocalyptic city ravaged by fire and destruction. The scene is filled with smoke and debris, creating a sense of desolation.
Aesthetic Score : 0.7
Mood : dramatic, somber, heroic
Quality
Entropy : 6.88
Noise : 78
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.60
Image errors : The smoke and fire effects appear slightly artificial and lack realism. The background also seems somewhat blurry and lacking in detail, suggesting potential AI manipulation.
A Family Dinner, Heavy with Unspoken Words
A warm, inviting dining room becomes a stage for unspoken tension. The family gathered around the table, bathed in soft light, exudes an air of quiet contemplation. Their expressions, captured in a moment of stillness, hint at a simmering undercurrent of emotions, leaving the viewer to ponder the unspoken words hanging in the air.
Prompt
facial-expressions Disappointment: Tension, estrangement ; A family gathered around a dinner table; eye-level; Normal People; a table set with a simple meal, but with an uncomfortable silence hanging in the air; cinematic
Characteristic
Shot : A family is having dinner together in a well-lit dining room.
Aesthetic Score : 0.7
Mood : tense, intimate, serious
Quality
Entropy : 6.77
Noise : 75
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
In the Zone: Gamer’s Intensity Under the Glow of the Screen
A young man, bathed in the blue light of his computer monitor, grips his controller with unwavering focus. The darkness surrounding him amplifies the intensity of his gaze, creating a palpable sense of suspense and immersion in the digital world.
Prompt
facial-expressions Disappointment: Defeat, frustration ; A gamer staring at a game over screen; eye-level; Gamer; a darkened room with the glow of the monitor reflecting in their eyes, showing a game over message; cinematic
Characteristic
Shot : A man is sitting in front of a computer screen, likely playing a video game. The scene is dimly lit and focused on the man’s face and the game controller.
Aesthetic Score : 0.6
Mood : intense, focused, determined
Quality
Entropy : 5.67
Noise : 54
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Lost in the Rain: A Moment of Melancholy
A woman, shrouded in a leather jacket, gazes out a window at a rain-soaked city street. Her expression, tinged with sadness and longing, evokes a sense of isolation and wistful contemplation. The rain, mirroring her inner turmoil, amplifies the mood of melancholy and introspective reflection.
Prompt
facial-expressions Disappointment: Sadness, longing ; A woman standing at a window; eye-level; Single Person; a rainy day with the city streets blurred in the background; cinematic
Characteristic
Shot : A woman standing by a window looking out at a rainy cityscape.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.72
Noise : 71
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and grain, and the windowpane reflection is somewhat distracting.
A Solitary Figure Contemplates the Vastness
A lone figure, shrouded in a long coat, stands on a mountain peak, gazing out at a breathtaking panorama. The golden light of dawn or dusk bathes the scene in an ethereal glow, while dramatic clouds fill the sky. The vastness of the valley and the snow-capped peaks in the distance create a sense of awe and isolation, leaving the viewer to ponder the mysteries of the world.
Prompt
facial-expressions Disappointment: Isolation, disillusionment ; A hero standing on a mountaintop; eye-level; Hero; a vast landscape stretching out before them, but with a sense of emptiness in the air; cinematic
Characteristic
Shot : A lone figure in a long coat stands on a rocky mountain peak, gazing at a sprawling valley and distant mountains. The sky is a mix of clouds and golden sunlight, suggesting a sunset or dawn.
Aesthetic Score : 0.8
Mood : epic, contemplative, hopeful
Quality
Entropy : 6.78
Noise : 68
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some areas of the image, particularly the sky and the distant mountains, appear slightly blurry or soft. This may be due to the use of a depth of field effect, or it may be a result of post-processing.
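The per-image figures listed above are easier to compare once they are gathered in one place. The sketch below is only an illustration: the `ImageMetrics` type and the `summarize` helper are hypothetical names, the 0.5 cutoff for "likely AI" is an assumption rather than something the evaluation defines, and the numbers are copied from the listings in this post (titles are shortened).

```python
from statistics import mean
from typing import NamedTuple


class ImageMetrics(NamedTuple):
    """One row per image; values copied from the listings in this post."""
    title: str
    aesthetic: float
    entropy: float
    noise: int
    prompt_clip: float
    ai_likelihood: float


IMAGES = [
    ImageMetrics("Lost in the Neon Glow", 0.7, 6.27, 60, 0.25, 0.10),
    ImageMetrics("Superman", 0.7, 6.78, 71, 0.31, 0.70),
    ImageMetrics("Kitchen Table", 0.5, 6.83, 70, 0.27, 0.10),
    ImageMetrics("Lost in the Code", 0.6, 6.26, 59, 0.19, 0.30),
    ImageMetrics("Lost in the Shadows", 0.7, 6.59, 89, 0.24, 0.10),
    ImageMetrics("Solemn Vigil", 0.7, 6.88, 78, 0.27, 0.60),
    ImageMetrics("Family Dinner", 0.7, 6.77, 75, 0.25, 0.10),
    ImageMetrics("In the Zone", 0.6, 5.67, 54, 0.26, 0.20),
    ImageMetrics("Lost in the Rain", 0.7, 6.72, 71, 0.22, 0.10),
    ImageMetrics("A Solitary Figure", 0.8, 6.78, 68, 0.25, 0.80),
]


def summarize(images, ai_threshold=0.5):
    """Aggregate the per-image metrics; ai_threshold is an assumed cutoff."""
    return {
        "mean_aesthetic": round(mean(m.aesthetic for m in images), 3),
        "mean_prompt_clip": round(mean(m.prompt_clip for m in images), 3),
        "lowest_prompt_clip": min(images, key=lambda m: m.prompt_clip).title,
        "likely_ai": [m.title for m in images if m.ai_likelihood >= ai_threshold],
    }


if __name__ == "__main__":
    for key, value in summarize(IMAGES).items():
        print(f"{key}: {value}")
```

On these ten images the mean Prompt Clip Score comes out at 0.251, and the three images at or above the assumed 0.5 cutoff are all Hero prompts.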
Conclusion
The analysis of the generated images shows mixed results:
- Camera Position: The model performed reasonably well at understanding and implementing the camera position specified in the prompt. The reported score is 0.05, against a “good” range of 0.5 to 0.75.
- Shot Analysis: The model also performed reasonably well at understanding the scene described in the prompt. The score of 0.51 falls within the “good” range (0.5 to 0.75).
- Aesthetic Analysis: The model struggled to achieve the desired aesthetic. The score of -0.11 falls within the range labeled “very good” (-0.2 to 0.1), which the analysis reads as a significant difference between the expected and actual aesthetic.
Overall, the model captured the camera position and scene elements fairly well, but it fell short of the desired aesthetic.
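The scores quoted in the conclusion can be checked against their stated ranges with a few lines of code. This is a minimal sketch, assuming the ranges are inclusive at both ends; the `check_ranges` helper is a hypothetical name, and the scores, labels, and bounds are copied from the bullets above.

```python
# (metric, reported score, range label, lower bound, upper bound)
REPORTED = [
    ("Camera Position", 0.05, "good", 0.5, 0.75),
    ("Shot Analysis", 0.51, "good", 0.5, 0.75),
    ("Aesthetic Analysis", -0.11, "very good", -0.2, 0.1),
]


def check_ranges(rows):
    """Return {metric: True/False} for whether each score lies in its stated range."""
    return {
        metric: low <= score <= high  # inclusive bounds are an assumption
        for metric, score, _label, low, high in rows
    }


if __name__ == "__main__":
    for metric, ok in check_ranges(REPORTED).items():
        print(f"{metric}: {'inside' if ok else 'outside'} stated range")
```

With the values as published, the Shot Analysis and Aesthetic Analysis scores lie inside their ranges, while the Camera Position score does not.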