AI's Facial Expressions: A Mixed Bag with Stable Diffusion
Facial expressions are a powerful tool in storytelling and communication. They convey emotions, intentions, and even subtle nuances of character. In the realm of AI image generation, the ability to accurately depict facial expressions is crucial for creating realistic and engaging visuals. This blog post examines the results of an AI model’s attempt to generate images with specific facial expressions, highlighting both its successes and struggles.
Created with: stability-ai-core
What’s Got Them Stunned? The Cafe Mystery Unfolds.
A group of friends or colleagues gather in a cafe, their faces etched with surprise and anticipation. What could have caught their attention? The image, with its tight composition and shallow depth of field, builds a palpable sense of suspense, leaving viewers eager to discover the source of their shock.
Prompt
facial-expressions Embarrassment: Awkward and self-conscious ; A single woman; eye-level; Single Persons; A crowded cafe with loud chatter and laughter; cinematic
Characteristic
Shot : A group of friends in a cafe are looking at something off-camera, all with surprised or shocked expressions. They are all seated at a table and some are holding coffee mugs.
Aesthetic Score : 0.6
Mood : surprised, shocked, casual
Quality
Entropy : 6.49
Noise : 76
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible image errors.
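The post does not say how the Entropy value under Quality is computed. One common definition is the Shannon entropy of the image's grayscale histogram, which tops out at 8 bits for an 8-bit image and would be consistent with the values reported here (roughly 6 to 7). The sketch below assumes that definition; the function name `shannon_entropy` and the plain list of pixel intensities are illustrative choices, not part of the tool used for this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: `pixels` is a flat list of grayscale values (0-255).
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every intensity that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A flat, single-tone image carries no information; an image that uses
# all 256 gray levels equally often reaches the 8-bit maximum.
flat_image = [128] * 1000
uniform_image = list(range(256)) * 4

print(shannon_entropy(flat_image))
print(shannon_entropy(uniform_image))
```

Under this reading, a higher entropy means the image spreads its pixels over more tonal levels, which tends to go along with busier, more detailed scenes.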
Superman Strides Through the City
A powerful image captures Superman striding through a bustling city, his gaze locked on the viewer. The blurred background emphasizes his heroic presence and creates a sense of dramatic isolation.
Prompt
facial-expressions Embarrassment: Humiliated and exposed ; A superhero in a full costume; eye-level; Heroes; A bustling city street with people staring; cinematic
Characteristic
Shot : A man dressed as Superman stands in the middle of a city street, surrounded by pedestrians. The scene is busy and chaotic, but the focus is on the Superman figure.
Aesthetic Score : 0.6
Mood : heroic, dramatic, edgy
Quality
Entropy : 6.72
Noise : 76
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, such as the blurring of the background and the lack of detail in the faces of the pedestrians.
A Moment of Mystery at the Gala
A man in a tuxedo stands out from the crowd at a formal event, his serious expression hinting at a hidden story. The elegant setting and focused composition create a sense of intrigue, leaving the viewer wondering what secrets lie beneath the surface.
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A man in a business suit; eye-level; Normal People; A formal dinner party with elegant guests; cinematic
Characteristic
Shot : A man in a tuxedo sits at a table surrounded by other people in tuxedos. The scene is a formal event, likely a wedding reception.
Aesthetic Score : 0.7
Mood : serious, formal, elegant
Quality
Entropy : 6.25
Noise : 66
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible image errors.
Focused on the Game, Pizza in Hand
A young man, bathed in the soft glow of his computer screen, enjoys a pizza break while immersed in his gaming session. The dim lighting creates a sense of intimacy and focus, highlighting his casual yet determined demeanor.
Prompt
facial-expressions Embarrassment: Cringing and defeated ; A gamer in a gaming chair; eye-level; Gamer; A dimly lit room with flashing screens and empty pizza boxes; cinematic
Characteristic
Shot : A young man is sitting in a gaming chair in front of a computer setup, eating pizza. There are three monitors in the background, one showing another man playing a video game.
Aesthetic Score : 0.6
Mood : relaxed, casual, focused
Quality
Entropy : 6.10
Noise : 66
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Radiant Bride Beams with Joy at Wedding Reception
A bride, radiating happiness, sits at a beautifully decorated table surrounded by loved ones at her wedding reception. The soft lighting and her infectious smile create a warm and inviting atmosphere, capturing the joy and elegance of this special occasion.
Prompt
facial-expressions Embarrassment: Lonely and out of place ; A woman in a wedding dress; eye-level; Single Persons; A crowded wedding reception with happy couples; cinematic
Characteristic
Shot : A bride is smiling during a wedding reception. The photo is taken from a slightly elevated angle, showing the bride from the chest up. The bride is wearing a white wedding dress and a tiara. She is surrounded by guests, who are mostly blurred out.
Aesthetic Score : 0.7
Mood : joyful, elegant, romantic
Quality
Entropy : 6.63
Noise : 75
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.10
Image errors : The background is slightly blurry and out of focus in places, the lighting is not perfectly even, and the cropping could be improved.
Superman’s Army: A Sea of Hope and Excitement
A vibrant crowd of Superman enthusiasts, their faces alight with energy and hope, create a powerful image of unity and shared purpose. The blurred background adds to the sense of movement and chaos, capturing the electrifying atmosphere of this gathering.
Prompt
facial-expressions Embarrassment: Embarrassed and self-conscious ; A superhero in a cape; eye-level; Heroes; A cheering crowd at a victory parade; cinematic
Characteristic
Shot : A crowd of people dressed as Superman, all shouting and cheering. There are numerous Chinese flags visible in the crowd.
Aesthetic Score : 0.6
Mood : energetic, chaotic, humorous
Quality
Entropy : 6.90
Noise : 82
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the people in the crowd are poorly rendered, and their features are less realistic than those of the other figures. This is a common characteristic of AI-generated images.
A Glass of Wine and a Heart Full of Loneliness
A woman sits alone in a restaurant, her face etched with sadness. A glass of wine sits untouched before her, mirroring the emptiness she feels. In the background, a man stands, seemingly oblivious to her pain, adding to the sense of isolation and heartbreak.
Prompt
facial-expressions Embarrassment: Uncomfortable and out of place ; A woman in a casual outfit; eye-level; Normal People; A fancy restaurant with white tablecloths and expensive wine; cinematic
Characteristic
Shot : A woman sitting at a restaurant table with a glass of red wine, looking pensive. A man is in the background, out of focus.
Aesthetic Score : 0.7
Mood : pensive, lonely, melancholic
Quality
Entropy : 6.39
Noise : 68
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Lost in Thought: A Moment of Contemplation in a Sea of Faces
A young man, shrouded in a blue hoodie, sits amidst a bustling crowd, his face a canvas of serious contemplation. The blurred background and muted lighting create a sense of isolation, drawing the viewer into his introspective world.
Prompt
facial-expressions Embarrassment: Humiliated and defeated ; A gamer in a hoodie; eye-level; Gamer; A crowded esports tournament with loud cheers and flashing lights; cinematic
Characteristic
Shot : A young man wearing a blue hoodie sits in a crowd of people, possibly at an event or conference. The focus is on the man, and he appears to be deep in thought.
Aesthetic Score : 0.6
Mood : serious, focused, contemplative
Quality
Entropy : 6.61
Noise : 70
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise is present in the image, especially in the darker areas. The background is slightly out of focus, but the blur is not entirely consistent.
A Night of Secrets: Two Men in Tuxedos Gather in Candlelit Luxury
Two men in formal attire sit at a lavishly set table, bathed in the soft glow of candlelight. Their serious expressions and the dim lighting create an air of mystery and anticipation, hinting at a night filled with secrets and intrigue.
Prompt
facial-expressions Embarrassment: Awkward and uncomfortable ; A man in a tuxedo; eye-level; Single Persons; A romantic dinner for two with candles and flowers; cinematic
Characteristic
Shot : Two men in formal attire, possibly at a dinner party, seated at a table with lit candles in the foreground.
Aesthetic Score : 0.8
Mood : elegant, formal, intimate
Quality
Entropy : 6.32
Noise : 64
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are minor image artifacts, such as slight blurriness and noise, which are not very noticeable.
Superheroes Take Over the Streets (and Maybe the News Conference)
A playful and ironic collage captures the unexpected appearance of superheroes in everyday settings, creating a humorous clash between the fantastical and the mundane.
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A superhero in a mask; eye-level; Heroes; A news conference with reporters asking difficult questions; cinematic
Characteristic
Shot : A group of people, mostly men, wearing masks and superhero costumes, standing in a public space. The background is blurred and appears to be a city street.
Aesthetic Score : 0.3
Mood : playful, humorous, ironic
Quality
Entropy : 6.95
Noise : 78
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the colors are a bit flat.
Conclusion
The results of the image analysis show that the generative AI model performed well in some areas but struggled in others.
Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating a poor ability to follow the camera position specified in the prompt. The generated images likely deviated significantly from the intended camera perspective.
- Shot Analysis: The model scored 0.63, indicating a good understanding of the scene described in the prompt. The generated images likely captured the overall scene composition and elements as intended.
- Aesthetic Analysis: The model scored 0.1, which the analysis rates as a very good match to the expected aesthetic. The generated images likely achieved the desired visual style and mood.
Overall, the model demonstrated a strong ability to understand the scene and achieve the desired aesthetic, but struggled to accurately interpret the camera position.
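For readers who want to compare the ten images side by side, the per-image numbers reported above can be aggregated with a few lines of Python. This is a minimal sketch: the values are copied from the sections of this post, the short labels are shorthand for the section titles, and the summary statistics are a convenience for the reader, not the method used to produce the scores in the breakdown.

```python
from statistics import mean

# (label, Aesthetic Score, Prompt Clip Score, Likelihood of AI),
# copied from the ten image sections of this post.
results = [
    ("Cafe", 0.6, 0.23, 0.20),
    ("Superman in the city", 0.6, 0.24, 0.20),
    ("Gala", 0.7, 0.25, 0.20),
    ("Gamer with pizza", 0.6, 0.28, 0.10),
    ("Bride", 0.7, 0.18, 0.10),
    ("Superman crowd", 0.6, 0.24, 0.80),
    ("Woman with wine", 0.7, 0.27, 0.10),
    ("Gamer in hoodie", 0.6, 0.29, 0.20),
    ("Two men in tuxedos", 0.8, 0.22, 0.10),
    ("Masked superheroes", 0.3, 0.29, 0.10),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)

# Image whose output matched its prompt least, by Prompt Clip Score.
weakest_match = min(results, key=lambda r: r[2])
# Image the evaluation flagged as most likely to be AI-generated.
most_flagged = max(results, key=lambda r: r[3])

print(f"Mean Aesthetic Score:   {mean_aesthetic:.2f}")
print(f"Mean Prompt Clip Score: {mean_clip:.3f}")
print(f"Lowest Prompt Clip Score: {weakest_match[0]} ({weakest_match[2]})")
print(f"Highest Likelihood of AI: {most_flagged[0]} ({most_flagged[3]})")
```

Keeping the numbers in one list makes it easy to spot outliers, such as the bride image with the lowest Prompt Clip Score (0.18) or the Superman crowd with the highest Likelihood of AI (0.80).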
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://stability.ai