AI's Facial Expressions: A Mixed Bag with Stable Diffusion
Generating realistic, expressive faces is crucial to creating compelling AI-generated imagery. This blog post examines how well a generative AI model captures facial expressions across a range of scenes. We’ll explore how the model handles different camera positions, shot compositions, and aesthetic styles, highlighting its strengths and weaknesses in replicating the nuances of human emotion.
Created with: stability-ai-core
Neon Shadows: A City Drenched in Mystery
A solitary figure navigates a rain-slicked urban landscape, bathed in the vibrant glow of neon signs. The interplay of light and shadow creates an atmosphere of intrigue and unspoken stories, leaving you wondering what secrets lie hidden in the darkness.
Prompt
facial-expressions Realization: Melancholy, introspective ; A lone figure; eye-level; Single Person; a bustling city street at night, with neon signs and rain reflecting on the wet pavement; cinematic
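Every prompt in this post follows the same semicolon-separated layout: the requested emotions, the subject, the camera position, a subject category, the scene, and the style. As a minimal sketch, the prompt above can be split into those parts as follows. The class name and field names are our own labels for illustration; the post itself does not name these slots.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    """One prompt split into parts. Field names are illustrative assumptions."""
    emotions: list[str]
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    # Split on ';' and trim the whitespace around each part.
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(parts)}")
    head, subject, camera, category, scene, style = parts
    # The first part looks like 'facial-expressions Realization: A, B';
    # keep only the comma-separated emotions after the colon.
    _, _, emotion_text = head.partition(":")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return ScenePrompt(emotions, subject, camera, category, scene, style)


example = parse_prompt(
    "facial-expressions Realization: Melancholy, introspective ; A lone figure; "
    "eye-level; Single Person; a bustling city street at night, with neon signs "
    "and rain reflecting on the wet pavement; cinematic"
)
print(example.emotions)  # ['Melancholy', 'introspective']
print(example.camera)    # eye-level
```

Splitting the prompt this way makes it easy to compare the requested camera position with the shot description reported for each image.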
Characteristic
Shot : A lone figure walks through a rainy, neon-lit city street at night. The street is wet and reflective, and the figure is silhouetted against the bright lights. The scene is moody and atmospheric.
Aesthetic Score : 0.8
Mood : gloomy, urban, mysterious
Quality
Entropy : 6.40
Noise : 90
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors are present.
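The post does not say how the Entropy value under Quality is computed. One common reading, and purely an assumption here, is the Shannon entropy of the image's 8-bit grayscale intensities, which ranges from 0 bits (a flat image) to 8 bits (all 256 levels equally frequent). Values around 6 to 7, like those reported for these images, would fit that scale. A minimal sketch under that assumption:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this is one plausible definition of the 'Entropy' metric;
    the post does not confirm the formula it uses.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = sum over intensity levels of p * log2(1 / p), with p = n / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(shannon_entropy([128] * 100))   # flat image: 0.0
print(shannon_entropy(range(256)))    # every level once: 8.0
```

To apply this to a real image, the pixel intensities would first have to be read as a flat sequence of grayscale values with an imaging library of your choice.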
Superman’s Epic Sunset Stand
A dramatic silhouette against the setting sun, Superman stands atop a towering building, overlooking a sprawling cityscape. The pose, the light, and the view all combine to create a powerful and heroic image.
Prompt
facial-expressions Realization: Triumphant, awe-inspiring ; A superhero, standing atop a skyscraper; wide shot; Hero; a sprawling cityscape bathed in the golden light of sunset; cinematic
Characteristic
Shot : Superman stands on a rooftop overlooking a cityscape at sunset. His cape billows in the wind.
Aesthetic Score : 0.7
Mood : heroic, epic, dramatic
Quality
Entropy : 6.83
Noise : 81
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be digitally generated, and some areas, particularly the cityscape, lack detail and realism. The sky also appears slightly blurry and lacking in detail.
Lost in Thought: A Moment of Quiet Contemplation
A woman sits at a kitchen table, her hand resting on her chin, lost in thought. The soft lighting and her pensive expression create a sense of quiet introspection. The clean and tidy kitchen provides a backdrop of calm and order, highlighting the woman’s internal focus.
Prompt
facial-expressions Realization: Disillusioned, resigned ; A young woman, sitting at a kitchen table; close-up; Normal People; a cluttered kitchen, with dishes piled in the sink and a half-eaten meal on the table; cinematic
Characteristic
Shot : A woman sits alone in a kitchen, her hand resting on her chin, looking contemplative. There’s a plate of food in front of her, but she doesn’t appear to be eating.
Aesthetic Score : 0.6
Mood : melancholy, pensive, lonely
Quality
Entropy : 6.79
Noise : 64
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable image errors.
Late Night Focus: A Gamer’s Oasis
A young man, headphones on, is completely immersed in his computer screen. The dim lighting and pizza on the table suggest a late-night gaming session, creating a cozy and focused atmosphere. The scene captures the intensity and immersion of gaming, highlighting the relaxed yet focused mood.
Prompt
facial-expressions Realization: Intense, focused ; A gamer, hunched over a computer screen; close-up; Gamer; a dimly lit room, with flashing lights from the monitor and empty pizza boxes scattered around; cinematic
Characteristic
Shot : A young man is sitting at a desk in a dimly lit room, wearing headphones, looking intently at a computer screen, with a slice of pizza on the table in front of him. There is a lamp on the table and a second computer screen in the background.
Aesthetic Score : 0.6
Mood : focused, concentrated, late night
Quality
Entropy : 6.27
Noise : 64
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors in the image.
Lost in the Crowd: A Man’s Anxiety Amidst the Blur
A solitary figure, clad in gray, stands out amidst a sea of faces. The man’s concerned expression and the blurred background create a palpable sense of tension and isolation. The train station in the distance adds to the feeling of being lost and alone, hinting at a story of suspense and uncertainty.
Prompt
facial-expressions Realization: Lost, alienated ; A man, walking through a crowded train station; eye-level; Single Person; a sea of faces, all rushing in different directions; cinematic
Characteristic
Shot : A man stands in a crowded train station, with a train in the background. The image is shot from a low angle, focusing on the man’s face. The background is blurred, creating a sense of depth and isolation.
Aesthetic Score : 0.7
Mood : tense, anxious, lonely
Quality
Entropy : 6.83
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors are observed.
Heroic Stand Amidst the Flames
A lone superhero stands defiant against a backdrop of fiery destruction, their powerful stance and the dramatic contrast of light and shadow creating a sense of epic tension. Other heroes stand in the background, ready to face the chaos alongside them.
Prompt
facial-expressions Realization: Determined, resolute ; A superhero, standing in the middle of a battle; wide shot; Hero; a chaotic scene of destruction and explosions, with enemies closing in; cinematic
Characteristic
Shot : A superhero, possibly Superman, stands in front of a group of other superheroes with a city burning behind them. There is a lot of smoke and debris.
Aesthetic Score : 0.6
Mood : intense, dramatic, hopeful
Quality
Entropy : 6.84
Noise : 85
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, particularly in the background. The smoke and fire are not rendered very realistically, giving them a slightly cartoonish look.
Family Gathering: A Moment of Joy and Togetherness
This heartwarming image captures a family enjoying a meal together, radiating warmth and happiness. The abundance of food, the cozy atmosphere, and the genuine smiles on their faces create a sense of comfort and belonging. It’s a beautiful reminder of the importance of family and shared moments.
Prompt
facial-expressions Realization: Nostalgic, heartwarming ; A family, gathered around a dinner table; medium shot; Normal People; a warm and inviting kitchen, with the aroma of home-cooked food filling the air; cinematic
Characteristic
Shot : A family is gathered around a dining table in a kitchen, enjoying a meal together.
Aesthetic Score : 0.7
Mood : warm, cozy, happy
Quality
Entropy : 6.87
Noise : 80
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image. The image is well-exposed and sharp.
The Intensity of Focus: A Gamer’s Determination
A man, lost in the digital world, his face illuminated by the glow of the screen. Headphones on, microphone ready, he’s locked in a battle of skill and strategy. The dim lighting and his focused expression create a palpable sense of intensity and suspense, capturing the essence of a gamer’s dedication.
Prompt
facial-expressions Realization: Defeated, frustrated ; A gamer, staring at a blank screen; close-up; Gamer; a dimly lit room, with the only light coming from the monitor, which is now displaying a game over message; cinematic
Characteristic
Shot : A young man wearing headphones sits in front of a computer screen, his expression serious and focused. The background is blurry, suggesting he is concentrating on something.
Aesthetic Score : 0.6
Mood : focused, serious, contemplative
Quality
Entropy : 6.32
Noise : 60
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some slight noise is visible in the image.
Golden Hour Serenity: A Woman Contemplates the Vast Ocean
A breathtaking sunset paints the sky in hues of gold as a woman stands on a cliff, gazing out at the endless expanse of the ocean. The scene evokes a sense of tranquility and awe, capturing the beauty of nature and the quiet contemplation of the human spirit.
Prompt
facial-expressions Realization: Reflective, contemplative ; A woman, standing on a cliff overlooking the ocean; eye-level; Single Person; a vast expanse of blue water stretching out to the horizon, with the sun setting in the distance; cinematic
Characteristic
Shot : A woman in a blue dress is standing on a cliff overlooking the ocean at sunset. The sun is setting in the distance, casting a golden glow on the water.
Aesthetic Score : 0.75
Mood : serene, peaceful, contemplative
Quality
Entropy : 6.77
Noise : 75
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight color cast and the horizon is not perfectly level. The woman’s face is a bit blurry, but this is probably intentional given the distance.
Superman Surveys the Devastation
A solitary figure of hope amidst the ashes. Superman stands amidst the ruins of a city, his gaze fixed on a burning building in the distance. The setting sun casts a somber glow on the scene, highlighting the hero’s determination to rebuild from the destruction.
Prompt
facial-expressions Realization: Hopeful, determined ; A superhero, standing in the ruins of a city; wide shot; Hero; a desolate landscape, with smoke rising from the rubble and the sun breaking through the clouds; cinematic
Characteristic
Shot : A superhero in a Superman costume stands in the middle of a destroyed city, looking at a burning building in the background. The sky is dark and cloudy.
Aesthetic Score : 0.6
Mood : dramatic, powerful, somber
Quality
Entropy : 6.86
Noise : 81
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.60
Image errors : Some artifacts in the sky.
Conclusion
The results show that the generative AI model handled shot composition and aesthetics well but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating that it responds poorly to the camera position given in the prompt. The camera position in the generated images deviates significantly from what was requested. For example, the train station prompt asked for an eye-level shot, but the resulting image was described as shot from a low angle.
- Shot Analysis: The model scored 0.47, which is good. This means the shot composition of the generated images is fairly close to what was described in the prompts.
- Aesthetic Analysis: The model scored 0.14, which is very good. This means the aesthetic of the generated images is very close to what was expected.
Overall, the model seems to be better at understanding the scene and its aesthetic than it is at accurately capturing the camera position.
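The three scores above are aggregates, and the post does not show how they were calculated, so the sketch below does not try to reproduce them. It only groups the per-image numbers listed in the cards above by the camera position requested in each prompt and averages them. The function name and dictionary keys are our own choices for illustration.

```python
from collections import defaultdict
from statistics import mean

# (requested camera position, Aesthetic Score, Prompt Clip Score, Likelihood of AI)
# copied from the ten image cards in this post, in order of appearance.
IMAGES = [
    ("eye-level", 0.80, 0.27, 0.20),    # Neon Shadows
    ("wide shot", 0.70, 0.29, 0.80),    # Superman's Epic Sunset Stand
    ("close-up", 0.60, 0.33, 0.20),     # Lost in Thought
    ("close-up", 0.60, 0.29, 0.20),     # Late Night Focus
    ("eye-level", 0.70, 0.31, 0.20),    # Lost in the Crowd
    ("wide shot", 0.60, 0.26, 0.80),    # Heroic Stand Amidst the Flames
    ("medium shot", 0.70, 0.25, 0.10),  # Family Gathering
    ("close-up", 0.60, 0.25, 0.10),     # The Intensity of Focus
    ("eye-level", 0.75, 0.29, 0.20),    # Golden Hour Serenity
    ("wide shot", 0.60, 0.29, 0.60),    # Superman Surveys the Devastation
]


def summarize_by_camera(images):
    """Average each metric per requested camera position, rounded to 2 places."""
    groups = defaultdict(list)
    for camera, aesthetic, clip, ai in images:
        groups[camera].append((aesthetic, clip, ai))
    summary = {}
    for camera, rows in groups.items():
        # zip(*rows) turns the rows into one column per metric.
        aesthetic, clip, ai = (round(mean(col), 2) for col in zip(*rows))
        summary[camera] = {"aesthetic": aesthetic, "clip": clip, "ai": ai}
    return summary


for camera, stats in summarize_by_camera(IMAGES).items():
    print(camera, stats)
```

On these ten images, the mean Prompt Clip Score barely changes with camera position (0.25 to 0.29), while the wide shots, all of which are superhero prompts, have by far the highest mean Likelihood of AI (0.73, versus 0.20 or less for the other groups). With only one to three images per group, this is an observation about this sample, not a general finding.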
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://stability.ai