AI's Facial Expressions: A Mixed Bag of Success with Stable Diffusion
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions with a single glance. Generative AI models are increasingly used to create realistic and expressive faces, but how well do they capture the nuances of human emotion? This blog post examines how well one generative AI model renders facial expressions across a range of scenes and characters, highlighting its strengths and weaknesses. We’ll look at the model’s ability to follow scene context, aesthetics, and camera positions, and at what the results say about the potential and ongoing challenges of AI-generated faces.
Created with: stability-ai-core
The Weight of Unsolved Pieces: A Moment of Frustration and Pensiveness
A woman sits at a table, her head in her hands, a half-finished jigsaw puzzle mirroring the unfinished state of her emotions. The scene captures a moment of sadness, frustration, and pensive reflection, leaving the viewer to ponder the weight of her unspoken thoughts.
Prompt
facial-expressions Boredom: Apathy and resignation. ; A single person; eye-level; Single Persons; A cluttered apartment with unwashed dishes and a half-finished puzzle on the table.; cinematic
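Every prompt in this post follows the same semicolon-separated layout, so it can be split into named parts for comparison against the evaluation fields further down. The sketch below is a minimal illustration: the field names (`topic`, `emotion`, `nuance`, `subject`, `camera`, `category`, `scene`, `style`) and the `parse_prompt` helper are assumptions introduced here, inferred from the order of the segments, and are not part of any Stability AI API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels for the six prompt segments.
    topic: str
    emotion: str
    nuance: str
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its assumed parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 segments, got {len(fields)}")
    head, subject, camera, category, scene, style = fields
    # The first segment looks like "<topic> <Emotion>: <nuance>."
    topic, _, rest = head.partition(" ")
    emotion, _, nuance = rest.partition(":")
    return PromptParts(
        topic=topic,
        emotion=emotion.strip(),
        nuance=nuance.strip().rstrip("."),
        subject=subject,
        camera=camera,
        category=category,
        scene=scene,
        style=style,
    )


if __name__ == "__main__":
    example = (
        "facial-expressions Boredom: Apathy and resignation. ; A single person; "
        "eye-level; Single Persons; A cluttered apartment with unwashed dishes "
        "and a half-finished puzzle on the table.; cinematic"
    )
    print(parse_prompt(example))
```

Splitting the prompt this way makes it easy to check, for example, whether the requested `camera` value ("eye-level" or "close-up") matches the shot description produced for the image.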
Characteristic
Shot : A woman sits at a table with her head in her hands, looking distressed. There is a half-finished jigsaw puzzle on the table, and a cup and spoon in the background.
Aesthetic Score : 0.6
Mood : sad, frustrated, melancholic
Quality
Entropy : 6.74
Noise : 69
Prompt Clip Score : 0.16
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, and there is some noise in the shadows. The composition is a bit cluttered, with too much empty space in the background.
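The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally likely). The values reported in this post, between 5.89 and 6.91, would fit that range. The following is a minimal sketch of that definition using only the standard library; `shannon_entropy` is a hypothetical helper, and the input is assumed to be a flat sequence of pixel intensities from 0 to 255.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy (in bits) of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum(
        (count / total) * math.log2(total / count)
        for count in counts.values()
    )


if __name__ == "__main__":
    flat = [128] * 1000            # single tone, no information
    spread = list(range(256)) * 4  # every level equally likely
    print(shannon_entropy(flat))
    print(shannon_entropy(spread))
```

Under this assumption, a lower score such as the 5.89 reported for the dimly lit gamer image would indicate a narrower spread of tones than the other images.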
The City’s Shadow: A Superhero Prepares
A gritty, close-up shot captures a superhero, his face etched with determination, standing in a shadowy urban environment. The blurred background hints at a city rooftop or alley, setting the stage for an intense and anticipated confrontation.
Prompt
facial-expressions Boredom: Disillusionment and weariness. ; A superhero; eye-level; Heroes; A deserted cityscape with crumbling buildings and graffiti.; cinematic
Characteristic
Shot : A man in a superhero costume stands in a city with a gritty urban background. The man is looking intensely at the camera, with a serious expression.
Aesthetic Score : 0.7
Mood : intense, dramatic, gritty
Quality
Entropy : 6.89
Noise : 75
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor artifacts, such as a slight blur on the edges of the man’s face, and some banding in the sky.
Lost in Thought: A Moment of Reflection on the City Bus
A young woman, bathed in the soft glow of the bus’s interior lights, sits lost in thought, her gaze fixed on her phone. The everyday scene of a public bus becomes a canvas for introspection, as the lighting and composition draw the viewer’s attention to her pensive mood. The urban backdrop adds a layer of anonymity, highlighting the universality of quiet moments of reflection.
Prompt
facial-expressions Boredom: Annoyance and detachment. ; A young woman; eye-level; Normal People; A crowded bus with people staring at their phones.; cinematic
Characteristic
Shot : A woman sitting on a bus, talking on her phone, while other passengers are around her.
Aesthetic Score : 0.6
Mood : pensive, everyday, focused
Quality
Entropy : 6.78
Noise : 74
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable image errors or artifacts in the image.
Lost in Thought: A Man’s Serious Focus in Dim Light
A man sits before his computer, his gaze averted, lost in contemplation. The dim lighting casts an air of mystery, hinting at the weight of his thoughts. His serious expression speaks of focus and determination, leaving the viewer to wonder what secrets lie within his mind.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A gamer; close-up; Gamer; A dimly lit room with a computer screen displaying a paused game.; cinematic
Characteristic
Shot : A man sits in front of a computer screen, looking away from the camera, lit by warm light from behind.
Aesthetic Score : 0.6
Mood : serious, contemplative, focused
Quality
Entropy : 5.89
Noise : 53
Prompt Clip Score : 0.13
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image and the color is slightly off.
Autumn’s Embrace: A Moment of Contemplation
An elderly man finds solace amidst the vibrant hues of autumn, his solitary figure a testament to the quiet beauty of the season. The blurred background and muted colors evoke a sense of melancholy and contemplation, inviting viewers to reflect on the passage of time and the serenity of nature.
Prompt
facial-expressions Boredom: Melancholy and loneliness. ; An elderly man; eye-level; Single Persons; A park bench with fallen leaves and a deserted playground.; cinematic
Characteristic
Shot : An elderly man sits on a park bench in an autumnal setting. The background is blurred, highlighting the subject. Fallen leaves surround the bench and a few trees are visible.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, autumnal
Quality
Entropy : 6.91
Noise : 74
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, the image is sharp and well-exposed.
The Weight of Decisions: A Tense Meeting in the Shadows
A dimly lit office scene reveals a meeting brimming with tension. Three men in suits, their faces etched with seriousness, engage in a conversation shrouded in mystery. The tight framing and focused gazes amplify the suspense, leaving the viewer to ponder the weight of the decisions being made.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A detective; eye-level; Heroes; A dimly lit office with stacks of unsolved cases and a flickering neon sign.; cinematic
Characteristic
Shot : A dimly lit office scene, featuring a detective sitting at a desk with several computer monitors in the background. The detective is looking directly at the camera, while several other men appear in the background of the scene, including a man on a video call in the top right corner.
Aesthetic Score : 0.6
Mood : intense, mysterious, suspenseful
Quality
Entropy : 6.45
Noise : 63
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Silent Discontent: A Couple’s Tense Dinner
A photograph captures the unspoken tension between a couple seated at a restaurant table. Their expressions, a mix of melancholy and awkwardness, hint at a brewing conflict, leaving the viewer to wonder what secrets lie beneath the surface.
Prompt
facial-expressions Boredom: Awkward silence and boredom. ; A young couple; eye-level; Normal People; A restaurant table with empty plates and a half-finished bottle of wine.; cinematic
Characteristic
Shot : A couple is sitting at a table in a restaurant, looking sad and distant. There’s a wine bottle, a glass of red wine, a plate with food, and a glass of white wine on the table.
Aesthetic Score : 0.6
Mood : melancholy, somber, strained
Quality
Entropy : 6.62
Noise : 72
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Immersed in the Race: A Gamer’s Focused Intensity
A man, headphones on, sits transfixed before his computer screen, a racing game unfolding in a blur of motion. The lighting highlights his focused expression, capturing the intensity of his gaming experience.
Prompt
facial-expressions Boredom: Monotony and boredom. ; A gamer; close-up; Gamer; A brightly lit room with a computer screen displaying a repetitive, simple game.; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer monitor, possibly gaming. The background is blurred and the focus is on the man’s face.
Aesthetic Score : 0.5
Mood : focused, intense, serious
Quality
Entropy : 6.64
Noise : 66
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, and the colors are a bit flat. The background is also a bit distracting.
Lost in the Pages: A Moment of Quiet Contemplation on the Subway
A young woman finds solace in a book amidst the bustling subway commute. Her thoughtful expression and the intimate framing capture a moment of quiet introspection, inviting viewers to share in her pensive mood.
Prompt
facial-expressions Boredom: Isolation and boredom. ; A woman; eye-level; Single Persons; A crowded train with people reading, sleeping, and staring blankly.; cinematic
Characteristic
Shot : A woman is sitting on a train reading a book. The woman is focused on the book and there are other passengers in the background. The lighting is a bit dim, and the focus is on the woman.
Aesthetic Score : 0.6
Mood : pensive, contemplative, focused
Quality
Entropy : 6.75
Noise : 72
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, which could be due to low lighting or post-processing. There is a slight blur around the edges of the image, which could be due to camera movement or focus.
A Soldier’s Solitude in the Desolate Wasteland
A lone soldier, clad in desert camouflage, stands amidst a barren landscape. The ruined building in the background and the distant mountains paint a picture of desolation. The soldier’s pose and the intense atmosphere evoke a sense of isolation and tension.
Prompt
facial-expressions Boredom: Despair and boredom. ; A soldier; eye-level; Heroes; A desolate desert landscape with a lone watchtower in the distance.; cinematic
Characteristic
Shot : A soldier in desert camouflage stands in a desert landscape, facing the camera with a serious expression. A crumbling stone structure is visible in the background, and distant mountains.
Aesthetic Score : 0.6
Mood : tense, serious, solitary
Quality
Entropy : 6.82
Noise : 66
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable errors.
Conclusion
The results show that the generative AI model performed well in understanding the scene and its aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.52, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.01, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and its aesthetic, but needs improvement in accurately capturing the intended camera position.
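The per-image figures listed earlier in the post can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values exactly as reported for the ten images. The `ImageResult` type and `summarize` helper are illustrative names introduced here, the titles are shortened, and these averages are a separate view of the data: the post does not state how the three scores in the breakdown above were derived.

```python
from statistics import mean
from typing import NamedTuple


class ImageResult(NamedTuple):
    title: str
    aesthetic: float
    clip: float
    ai_likelihood: float


# Values copied from the per-image sections of this post.
RESULTS = [
    ImageResult("The Weight of Unsolved Pieces", 0.6, 0.16, 0.20),
    ImageResult("The City's Shadow", 0.7, 0.17, 0.30),
    ImageResult("Lost in Thought (city bus)", 0.6, 0.22, 0.10),
    ImageResult("Lost in Thought (dim light)", 0.6, 0.13, 0.20),
    ImageResult("Autumn's Embrace", 0.6, 0.25, 0.20),
    ImageResult("The Weight of Decisions", 0.6, 0.23, 0.20),
    ImageResult("Silent Discontent", 0.6, 0.26, 0.20),
    ImageResult("Immersed in the Race", 0.5, 0.17, 0.20),
    ImageResult("Lost in the Pages", 0.6, 0.21, 0.20),
    ImageResult("A Soldier's Solitude", 0.6, 0.20, 0.20),
]


def summarize(results: list[ImageResult]) -> dict:
    """Average each metric and find the best and worst prompt matches."""
    return {
        "mean_aesthetic": round(mean(r.aesthetic for r in results), 2),
        "mean_clip": round(mean(r.clip for r in results), 2),
        "mean_ai_likelihood": round(mean(r.ai_likelihood for r in results), 2),
        "best_clip": max(results, key=lambda r: r.clip).title,
        "worst_clip": min(results, key=lambda r: r.clip).title,
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

Across the ten images the averages come to 0.6 for Aesthetic Score, 0.20 for Prompt Clip Score, and 0.20 for Likelihood of AI. The couple's dinner scene has the highest Prompt Clip Score (0.26) and the dimly lit gamer the lowest (0.13).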