AI Captures the Nuances of Facial Expressions but Struggles with Aesthetics in Stable Diffusion
Facial expressions are a powerful tool for conveying emotions and intentions. In the realm of AI image generation, capturing these nuances accurately is crucial for creating realistic and engaging visuals. This blog post examines the performance of a generative AI model in generating images with specific facial expressions, analyzing its strengths and weaknesses. We’ll explore how the model handles different scenes, camera positions, and aesthetic styles, highlighting the challenges and opportunities in achieving a truly expressive AI.
Created with: stability-ai-core
Contagious Laughter: The Joy of a Live Event
Capture the pure joy and excitement of a live event as the audience erupts in laughter, their smiles and energy radiating through the camera lens. This image is a testament to the power of shared experiences and the infectious nature of happiness.
Prompt
facial-expressions Jealousy: Lonely and envious ; A single woman; eye-level; Single Persons; A crowded party with couples dancing and laughing; cinematic
Characteristic
Shot : A group of people are laughing and looking at the camera. The image is composed of two rows of people. It was likely taken during a theatrical performance or a comedy show.
Aesthetic Score : 0.8
Mood : joyful, playful, exciting
Quality
Entropy : 6.51
Noise : 78
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors were detected.
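The Entropy value reported for each image is not defined in this post. One common reading, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a completely flat image) to 8 bits (all 256 gray levels equally frequent). Under that assumption, values around 6.5 would point to fairly rich tonal variety. The sketch below is illustrative only: the function name and the toy pixel lists are hypothetical, and it is not the evaluation pipeline used for these images.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of gray levels."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the gray levels that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy data, not taken from the images in this post.
flat = [128] * 1024              # one gray level only
uniform = list(range(256)) * 4   # every gray level equally often

print(abs(shannon_entropy(flat)))   # lower bound: 0 bits
print(shannon_entropy(uniform))     # upper bound: 8 bits
```

With a real image, the pixel list would come from a grayscale conversion of the file; that step is left out here to keep the example free of third-party dependencies.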
Superman Takes Flight Over the City
A powerful image captures the iconic superhero standing tall on a rooftop, gazing out at the sprawling cityscape. The dramatic lighting and composition create a sense of heroism that leaves viewers in awe of the Man of Steel.
Prompt
facial-expressions Jealousy: Bitter and isolated ; A superhero standing alone on a rooftop; eye-level; Heroes; A city skyline with a couple holding hands in the distance; cinematic
Characteristic
Shot : A superhero, dressed as Superman, stands on a rooftop overlooking a city skyline at sunset. A group of people are in the background, also looking at the view.
Aesthetic Score : 0.6
Mood : heroic, dramatic, powerful
Quality
Entropy : 6.79
Noise : 80
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as some blurring around the edges of the superhero’s costume.
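The Prompt Clip Score is also not defined in this post. CLIP-based scores are commonly computed as the cosine similarity between an embedding of the prompt text and an embedding of the image, and the sketch below assumes that reading. The vectors are made-up placeholders rather than real CLIP outputs, and the function name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


# Placeholder embeddings (assumed values, for illustration only).
text_embedding = [0.2, 0.1, 0.7, 0.0]
image_embedding = [0.1, 0.6, 0.3, 0.4]

score = cosine_similarity(text_embedding, image_embedding)
print(round(score, 2))
```

If the score is computed this way, it is bounded between -1 and 1, and a higher value means the image sits closer to the prompt in the embedding space.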
Laughter and Camaraderie at the Cafe
A group of friends share a moment of genuine joy and laughter at a cafe table. The image captures the warmth and happiness of their connection, creating a cheerful and friendly atmosphere.
Prompt
facial-expressions Jealousy: Heartbroken and resentful ; A man watching his ex-girlfriend laughing with another man; eye-level; Normal People; A bustling cafe with people chatting and enjoying coffee; cinematic
Characteristic
Shot : A group of friends are sitting at a table in a cafe, enjoying their coffee and laughing.
Aesthetic Score : 0.7
Mood : happy, friendly, relaxed
Quality
Entropy : 6.49
Noise : 78
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major errors. A few minor artifacts in the background.
Lost in the Code: A Man’s Focus Illuminated by Screens
A solitary figure, headphones on, sits bathed in the glow of multiple computer monitors. The dimly lit scene evokes a sense of intense focus and dedication, highlighting the power and allure of the digital world.
Prompt
facial-expressions Jealousy: Obsessive and competitive ; A gamer staring intently at his computer screen; eye-level; Gamer; A dimly lit room with posters of video game characters on the walls; cinematic
Characteristic
Shot : A young man is sitting in front of a computer with a headset on. He is surrounded by multiple screens displaying images of other men with headsets on.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 6.37
Noise : 65
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
A Moment of Connection in the Park
A tender scene unfolds in a park, where a young boy enjoys a moment with his parents. The blurred background adds a touch of melancholy, highlighting the intimacy and connection between the family members.
Prompt
facial-expressions Jealousy: Yearning and wistful ; A woman looking at a couple holding hands in the park; eye-level; Single Persons; A sunny park with children playing and couples strolling; cinematic
Characteristic
Shot : A family in a park. The father looks sad, and the mother is comforting a young boy. The background is blurry, and other people are walking around the park.
Aesthetic Score : 0.6
Mood : sad, melancholic, concerned
Quality
Entropy : 6.83
Noise : 76
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible image errors.
United in Song: Soccer Players Sing National Anthem Before Game
A powerful image captures the intensity and patriotism of soccer players from different teams united in singing their national anthem before a game. The repetition of the image with players in various uniforms creates a sense of symmetry and uniformity, highlighting the shared spirit of competition and national pride.
Prompt
facial-expressions Jealousy: Disgruntled and envious ; A hero watching another hero receive accolades; eye-level; Heroes; A crowded stadium with cheering fans and flashing lights; cinematic
Characteristic
Shot : A group of men in soccer jerseys are singing the national anthem. It is a split-screen view of three different rows of players, all with their mouths open, as if they are singing in unison.
Aesthetic Score : 0.5
Mood : passionate, united, patriotic
Quality
Entropy : 6.58
Noise : 81
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image suffers from significant blurring and artifacts. The repeated pattern of the split-screen with very similar frames makes the image monotonous and repetitive.
Joyful Celebration: Laughter and Smiles Fill the Air
A vibrant party scene captures the essence of joy and celebration. The image is filled with laughter and smiles, suggesting a birthday or wedding. The dramatic focus on the emotion adds a sense of intensity to the lively atmosphere.
Prompt
facial-expressions Jealousy: Angry and betrayed ; A man watching his wife dancing with another man at a party; eye-level; Normal People; A brightly lit party with people dancing and laughing; cinematic
Characteristic
Shot : A group of people are gathered in a social setting. The three subjects in the foreground are all smiling and laughing.
Aesthetic Score : 0.6
Mood : joyful, celebratory, happy
Quality
Entropy : 6.59
Noise : 77
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, resulting in blown-out highlights in the background. The focus is sharp on the subjects but appears slightly soft in the background.
Lost in the Code: A Man’s Intense Focus in a Dark Room
A solitary figure, shrouded in darkness, sits before a multitude of glowing screens. Headphones isolate him from the world, his expression a mask of intense concentration. The atmosphere is heavy with a sense of mystery and anticipation, hinting at a world hidden within the digital realm.
Prompt
facial-expressions Jealousy: Frustrated and envious ; A gamer watching a livestream of another player achieving a high score; eye-level; Gamer; A dimly lit room with a computer screen displaying the livestream; cinematic
Characteristic
Shot : A young man with headphones sits in front of a computer, focused on the screen. There are multiple screens displaying various content. The image has a dark and moody ambiance, creating a sense of mystery and tension.
Aesthetic Score : 0.6
Mood : intense, focused, mysterious
Quality
Entropy : 6.14
Noise : 62
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor artifacts and noise are visible, especially in the darker areas of the image.
A Rainy Serenade: Love in the City
In the heart of the city, a couple shares a romantic moment under an umbrella, their closeness heightened by the melancholic beauty of the rain. City lights twinkle in the background, and the dramatic framing makes the pair feel both intimate and isolated.
Prompt
facial-expressions Jealousy: Melancholy and longing ; looking at a couple kissing in the rain; eye-level; Single Persons; A rainy street with puddles reflecting the city lights; cinematic
Characteristic
Shot : A couple standing under an umbrella in the rain, looking at each other
Aesthetic Score : 0.8
Mood : romantic, moody, intimate
Quality
Entropy : 6.57
Noise : 82
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts and noise in the image, particularly in the background.
Trapped in the Inferno: One Man’s Struggle Amidst Urban Chaos
A haunting collage captures the intensity of a man’s struggle within a burning cityscape. The repetition of his face, contorted in various expressions of fear and determination, amplifies the sense of urgency and chaos engulfing him.
Prompt
facial-expressions Jealousy: Frustrated and envious ; A hero watching another hero save the day; eye-level; Heroes; A chaotic scene with explosions and people running for safety; cinematic
Characteristic
Shot : A collage of images featuring a man in a black jacket looking worried and distressed in an apocalyptic scenario with burning cars and smoke in the background. The man is shown in different shots, suggesting a scene from an action movie or a TV series.
Aesthetic Score : 0.6
Mood : intense, dramatic, suspenseful
Quality
Entropy : 6.76
Noise : 82
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some of the images in the collage show slight blurring, particularly around the edges, and a small amount of noise.
Conclusion
The analysis shows that the generative AI model performed well in understanding the scene and camera position, but struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.5, which is considered good. The camera position in the generated images was fairly close to what the prompts requested.
- Shot Analysis: The model scored 0.615, also considered good. The model largely captured the scene elements and composition described in the prompts.
- Aesthetic Analysis: The model scored 0.11, which is slightly below average. The aesthetic style of the generated images deviated somewhat from the style the prompts requested.
Overall, the model demonstrated a good understanding of the scene and camera position, but could benefit from improvements in capturing the desired aesthetic.
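As a rough cross-check, the per-image values listed in this post can be averaged directly. The sketch below copies the Aesthetic Score and Prompt Clip Score from each of the ten entries; the shortened titles and variable names are ours. These means are separate from the three scores in the breakdown above, whose calculation the post does not describe.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score), copied from the entries.
results = [
    ("Contagious Laughter", 0.8, 0.23),
    ("Superman Takes Flight", 0.6, 0.25),
    ("Laughter at the Cafe", 0.7, 0.26),
    ("Lost in the Code (screens)", 0.6, 0.24),
    ("Connection in the Park", 0.6, 0.28),
    ("United in Song", 0.5, 0.20),
    ("Joyful Celebration", 0.6, 0.26),
    ("Lost in the Code (dark room)", 0.6, 0.25),
    ("A Rainy Serenade", 0.8, 0.28),
    ("Trapped in the Inferno", 0.6, 0.31),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
best_match = max(results, key=lambda r: r[2])
worst_match = min(results, key=lambda r: r[2])

print(round(mean_aesthetic, 3))  # 0.64
print(round(mean_clip, 3))       # 0.256
print(best_match[0])             # Trapped in the Inferno
print(worst_match[0])            # United in Song
```

The prompt-to-image scores sit in a narrow band from 0.20 to 0.31, which fits the pattern visible in the entries: prompts asking for jealousy often produced joyful or neutral scenes.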