AI's Facial Expressions: A Mixed Bag of Success with Stable Diffusion
Facial expressions are a powerful tool for conveying emotion and telling stories. For AI-generated imagery, capturing them accurately is crucial to creating compelling, relatable visuals. This post looks at how a generative AI model handles ten prompts built around one expression family, sadness, and where it succeeds or falls short in scene context and aesthetic style. We’ll look at how the model interprets each prompt, how well it follows the requested camera position and shot description, and how visually appealing the results are. The aim is to shed light on what AI can and cannot yet do when asked for nuanced facial expressions.
Created with: stability-ai-core
Autumn Solitude: A Man Contemplates the Season’s Change
A poignant image captures the essence of autumn, with a solitary man lost in thought on a park bench amidst a carpet of fallen leaves. The composition evokes a sense of melancholy and quiet contemplation, highlighting the beauty and bittersweet nature of the season.
Prompt
facial-expressions Sadness: Melancholy, loneliness ; A lone figure; eye-level; Single Person; Empty park bench with fallen leaves; cinematic
Characteristic
Shot : A young man is sitting on a park bench in the fall, surrounded by yellow leaves. He looks pensive and sad.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, autumnal
Quality
Entropy : 6.79
Noise : 73
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the background is a bit blurry. There is some noise in the image.
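The post does not say how the Entropy figure is computed. The values reported here, all between 6 and 7, would fit a Shannon entropy measured in bits over an 8-bit grayscale histogram, whose maximum is 8. A minimal Python sketch under that assumption (the function name `shannon_entropy` and the flat list of pixel values are illustrative, not taken from the tool that produced these numbers):

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a flat sequence of gray levels.

    Assumption: each pixel is an integer gray level (0-255). The metric used
    in the post may be defined differently.
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1 / p) over every gray level that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A single flat color carries no information; an even spread over all
# 256 gray levels reaches the 8-bit maximum.
flat_image = [128] * 64
evenly_spread = list(range(256))
print(shannon_entropy(flat_image))
print(shannon_entropy(evenly_spread))
```

Read this way, a value near 6.8 would indicate an image that uses a wide range of tones, while a lower value would point to larger areas of similar brightness.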
The Dark Knight Rises: A Moment of Solitude in the Storm
A brooding Batman stands amidst a rain-soaked cityscape, his face etched with determination. The blurred background and moody lighting create a sense of dramatic intensity, hinting at the weight of his mission and the melancholic solitude he faces.
Prompt
facial-expressions Sadness: Despair, disillusionment ; A superhero in their costume; eye-level; Hero; City skyline at night, rain falling; cinematic
Characteristic
Shot : A close-up shot of a man in a Batman costume, standing in the rain with a city skyline in the background.
Aesthetic Score : 0.75
Mood : dark, gritty, intense
Quality
Entropy : 6.77
Noise : 80
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, such as a few stray pixels and a slight blurriness in the background.
The Weight of Solitude
A woman sits alone at a kitchen table, her head in her hands, a cup of coffee untouched before her. The image evokes a sense of melancholy and isolation, highlighting the quiet struggles that can weigh heavily on the heart.
Prompt
facial-expressions Sadness: Hopelessness, grief ; A woman sitting at a kitchen table; eye-level; Normal People; Empty coffee cup, unwashed dishes; cinematic
Characteristic
Shot : A woman sitting at a kitchen table with her hands on her face, looking distressed. There is a cup of coffee and a coffee pot on the table.
Aesthetic Score : 0.6
Mood : melancholy, pensive, contemplative
Quality
Entropy : 6.81
Noise : 66
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
The Pizza Blues: A Moment of Boredom
A young man sits alone at a table, surrounded by the remnants of a pizza feast. His bored expression and the stacked boxes in the background hint at a sense of loneliness and weariness. The image captures a casual, tired mood, leaving the viewer to wonder what’s on his mind.
Prompt
facial-expressions Sadness: Isolation, withdrawal ; A gamer hunched over their computer; close-up; Gamer; Empty pizza boxes, energy drink cans; cinematic
Characteristic
Shot : A young man sits at a table with two slices of pizza, a can of soda, and a stack of cardboard boxes in the background. He looks tired and bored.
Aesthetic Score : 0.6
Mood : tired, bored, mundane
Quality
Entropy : 6.52
Noise : 71
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors.
Lost in the Shadows: A Boy’s Eerie Journey
A young boy stands alone in a dilapidated hallway, his small figure dwarfed by the vast emptiness. The dim lighting and cracked walls create an atmosphere of mystery and isolation, leaving the viewer to wonder what secrets lie within.
Prompt
facial-expressions Sadness: Loneliness, abandonment ; A child standing in a doorway; eye-level; Single Person; Empty hallway, dim lighting; cinematic
Characteristic
Shot : A young boy in a dark green coat stands in a dimly lit hallway, looking slightly apprehensive. The hallway has white walls and a tiled floor. There are two doorways visible in the background, both open and leading into dark rooms.
Aesthetic Score : 0.7
Mood : dark, mysterious, eerie
Quality
Entropy : 6.41
Noise : 54
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly underexposed and the shadows are a bit harsh. The white balance is also slightly off, making the scene appear more cold and blue.
Solitary Figure in a Warzone
A lone soldier kneels amidst the devastation of war, smoke and explosions filling the background. The image captures the dramatic intensity and somber mood of a battlefield, highlighting the soldier’s isolated presence in a chaotic and urgent situation.
Prompt
facial-expressions Sadness: Loss, regret ; A soldier kneeling on a battlefield; eye-level; Hero; Explosions in the distance, smoke filling the air; cinematic
Characteristic
Shot : A soldier in full gear is kneeling in a destroyed battlefield, surrounded by debris and smoke from explosions in the background.
Aesthetic Score : 0.6
Mood : dramatic, somber, war-torn
Quality
Entropy : 6.80
Noise : 82
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor artifacts are present, especially around the edges of the soldier’s helmet and the smoke. The lighting seems a bit flat.
What’s Got Them Hooked? Two Guys, Popcorn, and a Mystery.
A casual scene unfolds with two men engrossed in something off-screen, popcorn in hand. Their bored yet contemplative expressions leave viewers wondering what captivating spectacle has captured their attention. The mundane setting and familiar snack add a touch of normalcy, heightening the anticipation and curiosity surrounding the unseen event.
Prompt
facial-expressions Sadness: Silence, unspoken tension ; A couple sitting on a couch; eye-level; Normal People; Empty popcorn bowl, remote control on the floor; cinematic
Characteristic
Shot : Two men are sitting on a couch, watching TV and eating popcorn. The popcorn bowl in the foreground is tipped over, with popcorn scattered on the floor.
Aesthetic Score : 0.6
Mood : relaxed, bored, disappointed
Quality
Entropy : 6.71
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur, but it is not a major issue.
The Focus Is On
A young man, headphones on, eyes glued to the keyboard, is locked in a moment of intense concentration. The gamer logo on the monitor behind him hints at the high stakes of the game he’s playing. This close-up shot captures the suspense and anticipation of the moment, leaving you wondering what’s about to happen next.
Prompt
facial-expressions Sadness: Frustration, defeat ; A gamer’s hands on a keyboard; close-up; Gamer; Screen displaying a game over message; cinematic
Characteristic
Shot : A man in a dark room, wearing headphones and a black shirt, is looking at a keyboard and appears to be playing a video game. The reflection of a computer screen shows the word “GAMER”.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 6.01
Noise : 62
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is some slight noise in the image, and the colors are a bit muted.
Lost in the City: A Moment of Melancholy
A woman walks through a bustling European city, her gaze fixed on the camera, conveying a sense of introspection and isolation. The blurred background emphasizes her solitude, creating a poignant mood of melancholy.
Prompt
facial-expressions Sadness: Alienation, loneliness ; A woman walking down a crowded street; eye-level; Single Person; People passing by, oblivious to her; cinematic
Characteristic
Shot : A woman walks down a crowded city street, looking forlorn and lost in thought. The background is out of focus, creating a sense of isolation.
Aesthetic Score : 0.7
Mood : melancholy, introspective, somber
Quality
Entropy : 6.77
Noise : 77
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Silhouette of Solitude: A Man Contemplates the City at Dusk
A solitary figure stands on a rooftop, their silhouette stark against the sprawling cityscape bathed in the warm hues of dusk. The scene evokes a sense of melancholy and contemplation, capturing the quiet introspection of urban life.
Prompt
facial-expressions Sadness: Reflection, introspection ; A hero standing on a rooftop; eye-level; Hero; City lights twinkling in the distance; cinematic
Characteristic
Shot : A man in a denim jacket stands on a rooftop overlooking a city skyline at dusk. The city lights are blurred in the background, and the man’s expression is pensive.
Aesthetic Score : 0.7
Mood : melancholic, contemplative, urban
Quality
Entropy : 6.76
Noise : 63
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight noise and graininess in the image, particularly noticeable in the background
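All ten prompts above share the same semicolon-separated layout: an expression label with two nuances, a subject, a camera position, a character category, a scene detail, and a style keyword. The sketch below splits a prompt into those parts so the requested camera position or subject can be compared with the resulting Shot description. The field names (`subject`, `camera`, and so on) are labels chosen here for illustration; the post does not name them.

```python
def parse_prompt(prompt):
    """Split one of the post's prompts into named fields.

    Assumption: the prompt has exactly six semicolon-separated parts, as
    every prompt in this post does. The field names are hypothetical.
    """
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, subject, camera, category, scene, style = fields

    # The first field looks like "facial-expressions Sadness: Melancholy, loneliness".
    label, _, nuances = head.partition(":")
    topic, _, emotion = label.strip().partition(" ")

    return {
        "topic": topic,
        "emotion": emotion.strip(),
        "nuances": [n.strip() for n in nuances.split(",") if n.strip()],
        "subject": subject,
        "camera": camera,
        "category": category,
        "scene": scene,
        "style": style,
    }


parsed = parse_prompt(
    "facial-expressions Sadness: Isolation, withdrawal ; "
    "A gamer hunched over their computer; close-up; Gamer; "
    "Empty pizza boxes, energy drink cans; cinematic"
)
print(parsed["camera"], "|", parsed["subject"])
```

With the fields separated, a mismatch such as the Pizza Blues image is easy to spot: the prompt asked for a close-up of a gamer at a computer, while the Shot description reports a man at a table with pizza and no computer.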
Conclusion
The analysis shows that the generative AI model did well on aesthetic style but struggled to follow the requested camera position and to reproduce the scene described in the prompt.
Here’s a breakdown:
- Camera Position: The model scored 0.2, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.44, which is considered below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.13, which is considered very good. This means that the generated images closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
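The per-image numbers listed earlier can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the ten records and picks out the image with the weakest prompt match. It does not reproduce the three scores in the breakdown above, since the post does not explain how those were derived; the tuple layout and the `summarize` helper are assumptions made for illustration.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image records in this post.
RESULTS = [
    ("Autumn Solitude", 0.60, 0.28, 0.10),
    ("The Dark Knight Rises", 0.75, 0.28, 0.20),
    ("The Weight of Solitude", 0.60, 0.24, 0.20),
    ("The Pizza Blues", 0.60, 0.26, 0.10),
    ("Lost in the Shadows", 0.70, 0.19, 0.20),
    ("Solitary Figure in a Warzone", 0.60, 0.28, 0.30),
    ("What's Got Them Hooked?", 0.60, 0.26, 0.20),
    ("The Focus Is On", 0.60, 0.24, 0.30),
    ("Lost in the City", 0.70, 0.24, 0.20),
    ("Silhouette of Solitude", 0.70, 0.21, 0.10),
]


def summarize(results):
    """Average each metric and name the image with the lowest CLIP score."""
    return {
        "mean_aesthetic": mean(r[1] for r in results),
        "mean_clip": mean(r[2] for r in results),
        "mean_ai_likelihood": mean(r[3] for r in results),
        "weakest_prompt_match": min(results, key=lambda r: r[2])[0],
    }


summary = summarize(RESULTS)
for name, value in summary.items():
    print(name, value)
```

For these ten images the averages come out at roughly 0.65 for aesthetic score, 0.25 for prompt CLIP score, and 0.19 for likelihood of AI, with "Lost in the Shadows" showing the weakest prompt match at 0.19.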