AI's Facial Expressions: A Mixed Bag of Success with Stable Diffusion
Generating images from text prompts is a rapidly evolving area of AI, and one of its harder tests is capturing the nuances of human emotion through facial expressions. This post examines how a generative model handles prompts that call for expressive faces. It looks at examples where the model does well, such as the intensity of a gamer's focus or the joy of a family on a roller coaster, and at where it falls short, most notably in matching the camera angle the prompt asks for. Together, the results give a picture of what current image generation can and cannot yet do when asked to convey emotion.
Created with: stability-ai-core
Carnival Lights and a Smile: Capturing Festive Joy
A young woman radiates happiness as she stands amidst a vibrant blur of carnival lights. The scene evokes a sense of nostalgia and celebration, with the dramatic effect of the lights and her smile creating a captivating atmosphere of joy and mystery.
Prompt
facial-expressions Amusement: Playful, carefree ; A lone woman; eye-level; Single Person; a bustling carnival with bright lights and colorful tents; cinematic
Characteristic
Shot : A woman is standing in front of a carnival or fairground at night. The lights are on and there are people in the background, creating a lively atmosphere.
Aesthetic Score : 0.7
Mood : happy, joyful, carefree
Quality
Entropy : 6.50
Noise : 68
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors or artifacts.
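The prompts in this post all follow the same semicolon-separated layout: a category with emotion descriptors, a subject, a camera angle, a subject type, a scene, and a style. As a minimal sketch of that structure (the field names and the `parse_prompt` helper are illustrative labels chosen here, not part of any Stability AI API), a prompt can be split into its parts like this:

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Fields of one prompt; the field names are labels chosen for this sketch."""
    category: str       # e.g. "facial-expressions Amusement"
    descriptors: list   # e.g. ["Playful", "carefree"]
    subject: str
    camera: str
    subject_type: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    # Split on semicolons and trim the whitespace around each field.
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    head, subject, camera, subject_type, scene, style = fields
    # The first field has the form "<category>: <descriptor>, <descriptor>".
    category, _, descriptor_text = head.partition(":")
    descriptors = [d.strip() for d in descriptor_text.split(",") if d.strip()]
    return PromptParts(category.strip(), descriptors, subject, camera,
                       subject_type, scene, style)


carnival = parse_prompt(
    "facial-expressions Amusement: Playful, carefree ; A lone woman; eye-level; "
    "Single Person; a bustling carnival with bright lights and colorful tents; cinematic"
)
print(carnival.camera, carnival.descriptors)
```

Separating out the camera field makes it easy to compare the requested angle (here `eye-level`) against the shot that was actually generated.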
Superman Takes a Spin at the Amusement Park!
This playful scene captures Superman enjoying a day at the amusement park, his smile and excited expression contagious. The Ferris wheel and roller coaster in the background add to the energetic and adventurous mood, while his direct gaze invites you to join in the fun.
Prompt
facial-expressions Amusement: Exuberant, triumphant ; A superhero in a vibrant costume; eye-level; Hero; a crowded amusement park with roller coasters and Ferris wheels in the background; cinematic
Characteristic
Shot : A man dressed as Superman is running through a carnival or amusement park, with a Ferris wheel and roller coaster in the background.
Aesthetic Score : 0.6
Mood : fun, playful, energetic
Quality
Entropy : 6.81
Noise : 71
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
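The post does not say how the Entropy value under Quality is computed. One common definition that would produce values in this range is the Shannon entropy of the image's 8-bit grayscale histogram, which runs from 0 for a flat image to 8 bits when all 256 levels are equally frequent. The sketch below assumes that definition and takes a plain list of pixel intensities; the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of grayscale intensity values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # H = sum(p * log2(1 / p)) over the intensity levels that actually occur,
    # where p = n / total is the relative frequency of a level.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [128] * 1000          # a single gray level everywhere
uniform_image = list(range(256))   # every 8-bit level exactly once
print(shannon_entropy(flat_image), shannon_entropy(uniform_image))
```

Under this reading, the reported values between 6.39 and 6.90 would indicate images that use a broad spread of intensity levels without being uniformly distributed.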
Summer Fun and Laughter: Friends Enjoy a Picnic by the Carousel
Capture the joy of friendship and summer with this heartwarming scene. Three friends share a picnic in a vibrant park, with a whimsical carousel adding to the cheerful atmosphere. Their bright smiles radiate positivity and invite you to join in the fun.
Prompt
facial-expressions Amusement: Relaxed, happy ; A group of friends; eye-level; Normal People; a picnic blanket under a shady tree in a park, with a carousel in the distance; cinematic
Characteristic
Shot : Three young women are sitting on a picnic blanket in a park, laughing and enjoying their time together. Behind them is a carousel and a lot of people.
Aesthetic Score : 0.7
Mood : joyful, carefree, friendship
Quality
Entropy : 6.86
Noise : 85
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a small amount of blur on the subjects, which is likely due to a shallow depth of field. Otherwise the image is clean.
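Prompt Clip Score is likewise not defined in the post. CLIP-based scores are typically the cosine similarity between a text embedding of the prompt and an image embedding of the result, so higher values mean closer agreement. Assuming that reading, and using made-up three-dimensional vectors in place of real CLIP embeddings, the core computation looks like this:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings; real CLIP vectors have hundreds of dimensions.
text_embedding = [0.2, 0.1, 0.7]
image_embedding = [0.3, 0.0, 0.6]
print(round(cosine_similarity(text_embedding, image_embedding), 3))
```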
Lost in the Game: A Moment of Focused Intensity
A young man, bathed in the soft glow of his monitor, is completely absorbed in his video game. His headphones isolate him from the world, his expression focused and determined. The dim lighting adds a touch of mystery, hinting at the intensity of the virtual world he’s inhabiting.
Prompt
facial-expressions Amusement: Focused, excited ; A gamer; close-up; Gamer; a dimly lit room with a computer screen displaying a vibrant video game, a controller in their hand; cinematic
Characteristic
Shot : A young man is sitting at a desk, wearing headphones, and looking at the camera while holding a video game controller. The scene is lit with warm, soft lighting, creating a cozy and inviting atmosphere.
Aesthetic Score : 0.6
Mood : relaxed, focused, happy
Quality
Entropy : 6.39
Noise : 69
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors are present in the image.
A Dreamy Glance at Childhood Nostalgia
A young girl, her long hair adorned with a headband, stands beside a carousel horse, her gaze fixed on the camera with a wistful longing. The soft lighting and her expression create a sense of mystery and a touch of melancholy, evoking a feeling of cherished childhood memories.
Prompt
facial-expressions Amusement: Magical, innocent ; A young girl; eye-level; Single Person; a carousel with brightly painted horses, her eyes wide with wonder; cinematic
Characteristic
Shot : A young girl with long brown hair and a crown is standing next to a white carousel horse with golden accents. The scene is set in a carousel park, with the blurred background suggesting a festive and whimsical atmosphere.
Aesthetic Score : 0.8
Mood : dreamy, whimsical, nostalgic
Quality
Entropy : 6.84
Noise : 81
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness in the background, particularly in the carousel lights, suggesting some motion during the capture.
Childhood Joy: Laughter and Fun on the Playground
A heartwarming scene of children laughing and playing on a swing set, captured in a moment of pure joy. The bright colors and natural light enhance the carefree mood, creating a beautiful and evocative image of childhood happiness.
Prompt
facial-expressions Amusement: Joyful, carefree ; A group of children; eye-level; Normal People; a playground with swings, slides, and a sandbox, their laughter echoing in the air; cinematic
Characteristic
Shot : A group of children are laughing on a swing set at a playground.
Aesthetic Score : 0.8
Mood : joyful, playful, carefree
Quality
Entropy : 6.85
Noise : 75
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, especially in the background, likely due to a fast shutter speed. There is also some chromatic aberration around the edges of the swing set, which is not distracting but could be improved.
Silhouettes of Solitude: A Man Walks into the Storm
A solitary figure strides along a wooden pier, his long coat billowing in the wind. The choppy sea and stormy sky create a stark backdrop, emphasizing the man’s isolation and contemplative mood. The minimalist composition and black and white aesthetic enhance the sense of melancholy and loneliness.
Prompt
facial-expressions Amusement: Melancholy, contemplative ; A lone man; eye-level; Single Person; a deserted boardwalk at night, the sound of crashing waves in the background; cinematic
Characteristic
Shot : A lone figure walks down a wooden pier on a stormy, overcast day, with waves crashing in the background. The scene is shot in black and white, adding to the moody atmosphere.
Aesthetic Score : 0.7
Mood : melancholy, solitude, mysterious
Quality
Entropy : 6.63
Noise : 81
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly grainy and there are some artifacts from the processing of the image.
Superman Faces City-Wide Crisis
A dramatic scene unfolds as Superman confronts a devastating explosion, his serious expression reflecting the intensity of the situation. The city’s plight and the superhero’s heroic stance create a powerful image of urgency and danger.
Prompt
facial-expressions Amusement: Thrilling, heroic ; A superhero in action; dynamic shot; Hero; a cityscape with towering buildings, a dramatic explosion in the background; cinematic
Characteristic
Shot : A man dressed as Superman is standing in front of a large explosion in a city.
Aesthetic Score : 0.7
Mood : epic, heroic, dramatic
Quality
Entropy : 6.90
Noise : 80
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The background seems very AI-generated, and there is a slight blur on the subject.
Rollercoaster of Joy: Faces Lit Up by the Thrill of the Ride
A vibrant scene of pure exhilaration as a group of friends experience the rush of a rollercoaster ride. The camera captures their joyful expressions and the blur of motion, perfectly encapsulating the thrill of the moment.
Prompt
facial-expressions Amusement: Exhilarating, bonding ; A family; eye-level; Normal People; a crowded amusement park, their faces lit up with joy as they ride a roller coaster; cinematic
Characteristic
Shot : A group of people are riding a rollercoaster at an amusement park. They are all smiling and laughing, and they look like they are having a lot of fun.
Aesthetic Score : 0.7
Mood : joyful, excited, carefree
Quality
Entropy : 6.84
Noise : 79
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : None. The image is well-composed and has no obvious flaws.
The Intensity of the Game: Three Gamers Locked in a Battle
A dimly lit gaming room, three young men in headsets, their faces etched with focus and determination. The atmosphere is electric with competition, the dark lighting amplifying the intensity of their gaming session. This image captures the raw energy and thrill of competitive gaming.
Prompt
facial-expressions Amusement: Triumphant, exhilarating ; A gamer; close-up; Gamer; a dimly lit room, their hands moving rapidly on a keyboard, a triumphant shout escaping their lips; cinematic
Characteristic
Shot : Three young men, wearing headsets, are seen in a dimly lit room with computer screens in the background, one of them is typing on a keyboard. All of them are shouting.
Aesthetic Score : 0.5
Mood : intense, focused, competitive
Quality
Entropy : 6.49
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has some artifacts and errors, particularly in the lighting and in the subjects’ skin. The image also appears to be slightly oversharpened.
Conclusion
The results show that the model performed well at understanding the scene and matching the intended aesthetic, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. The generated shots did not reliably capture the camera position requested in the prompt.
- Shot Analysis: The model scored 0.535, which is considered good. The model generally understood the scene described in the prompt and produced a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.09, which is considered very good. The aesthetic of the generated images closely matched the aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and its aesthetic, but struggled with accurately capturing the intended camera position.
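For a rough view across all ten images, the per-image values listed above can be averaged. This sketch only tabulates numbers already shown in the post; the 0.5 cut-off for flagging an image is an arbitrary assumption, and these averages are not the Camera Position, Shot Analysis, or Aesthetic Analysis scores in the breakdown, whose method the post does not describe.

```python
from statistics import mean

# Per-image values copied from the entries above, in order of appearance.
# Columns: title, aesthetic score, prompt CLIP score, likelihood of AI.
RESULTS = [
    ("Carnival Lights and a Smile", 0.7, 0.25, 0.20),
    ("Superman Takes a Spin at the Amusement Park", 0.6, 0.32, 0.20),
    ("Summer Fun and Laughter", 0.7, 0.23, 0.20),
    ("Lost in the Game", 0.6, 0.21, 0.10),
    ("A Dreamy Glance at Childhood Nostalgia", 0.8, 0.32, 0.20),
    ("Childhood Joy", 0.8, 0.25, 0.10),
    ("Silhouettes of Solitude", 0.7, 0.24, 0.30),
    ("Superman Faces City-Wide Crisis", 0.7, 0.24, 0.80),
    ("Rollercoaster of Joy", 0.7, 0.31, 0.10),
    ("The Intensity of the Game", 0.5, 0.28, 0.70),
]


def column_mean(rows, index):
    """Average of one numeric column across all rows."""
    return mean(row[index] for row in rows)


def flagged_as_ai(rows, threshold=0.5):
    """Titles whose likelihood-of-AI value exceeds the (assumed) threshold."""
    return [title for title, _, _, likelihood in rows if likelihood > threshold]


print("mean aesthetic score:", round(column_mean(RESULTS, 1), 3))
print("mean prompt CLIP score:", round(column_mean(RESULTS, 2), 3))
print("mean likelihood of AI:", round(column_mean(RESULTS, 3), 3))
print("flagged:", flagged_as_ai(RESULTS))
```

With these inputs the aesthetic scores average about 0.68, the prompt CLIP scores about 0.265, and the likelihood-of-AI values about 0.29. Only the two images scored at 0.80 and 0.70 exceed the cut-off.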