AI Captures the Scene, But Misses the Angle: A Look at Facial Expressions in AI-Generated Images with Flux-schnell
Facial expressions are a powerful tool for conveying emotions and telling stories. In the realm of AI-generated images, capturing these expressions accurately is crucial for creating realistic and engaging visuals. This blog post explores the results of a study that investigated the ability of a generative AI model to understand and depict facial expressions in different scenes and camera positions. The study revealed that while the model excelled at capturing the essence of the scene and its aesthetic, it struggled with accurately capturing the intended camera angle. This highlights the ongoing challenges in developing AI models that can fully understand and replicate the nuances of human expression.
Created with: flux-schnell
A Look That Speaks Volumes
Lost in the shadows of a dimly lit bar, a woman with captivating brown eyes holds your gaze. Her confident expression and the mysterious atmosphere create an alluring enigma.
Prompt
facial-expressions Jealousy: Lonely and envious ; A single woman; eye-level; Single Persons; A crowded party with couples dancing and laughing; cinematic
Characteristic
Shot : A woman with long brown hair is looking directly at the camera. She is wearing a grey top and is standing in a dimly lit bar or club setting. Other people are out of focus in the background.
Aesthetic Score : 0.8
Mood : mysterious, alluring, intimate
Quality
Entropy : 6.74
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and graininess, particularly in the shadows.
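The prompts in this post appear to follow a fixed, semicolon-separated pattern: an emotion label with a nuance, a subject, a camera position, a subject group, a scene, and a style. A minimal Python sketch of splitting such a prompt into named fields follows. The field names and the `parse_prompt` helper are assumptions made for illustration; they are not part of the study's tooling.

```python
from dataclasses import dataclass

# Leading tag that every prompt in this post starts with.
PREFIX = "facial-expressions "


@dataclass
class PromptParts:
    # Field names are illustrative assumptions, not taken from the study.
    emotion: str
    nuance: str
    subject: str
    camera: str
    group: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into named fields."""
    text = prompt.strip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    parts = [p.strip() for p in text.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    # The first field looks like "Jealousy: Lonely and envious".
    emotion, _, nuance = parts[0].partition(":")
    return PromptParts(emotion.strip(), nuance.strip(), *parts[1:])


example = (
    "facial-expressions Jealousy: Lonely and envious ; A single woman; "
    "eye-level; Single Persons; A crowded party with couples dancing "
    "and laughing; cinematic"
)
parts = parse_prompt(example)
print(parts.camera)
```

Separating the camera field this way makes it easier to compare the requested camera position with what the generated image actually shows.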
Silhouettes and Secrets: A Cityscape of Mystery
A man in a dark suit stands on a rooftop, his silhouette stark against the city lights. A woman stands in the distance, adding a layer of intrigue to the scene. The mood is dramatic, mysterious, and contemplative, leaving the viewer to ponder the story unfolding in the shadows.
Prompt
facial-expressions Jealousy: Bitter and isolated ; A superhero standing alone on a rooftop; eye-level; Heroes; A city skyline with a couple holding hands in the distance; cinematic
Characteristic
Shot : A man in a superhero costume stands on a rooftop overlooking a city skyline at sunset, a woman in the background.
Aesthetic Score : 0.7
Mood : dramatic, cinematic, pensive
Quality
Entropy : 6.53
Noise : 71
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor noise and blur in the background, likely due to compression or low-light conditions.
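The Entropy values reported for each image fall between roughly 6 and 7. That range is consistent with the Shannon entropy, in bits, of an 8-bit grayscale histogram, whose maximum is 8. The post does not state how the value was computed, so the following is only a sketch under that assumption; `shannon_entropy` is a hypothetical helper and the pixel lists are toy data.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    total = len(pixels)
    if total == 0:
        raise ValueError("no pixels given")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 256          # one intensity only: no information, 0 bits
uniform = list(range(256))  # every 8-bit level exactly once: 8 bits

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a value near 6.7 would indicate an image that uses a broad range of intensities, while a lower value such as 5.99 would point to a narrower tonal range, for example a scene dominated by dark areas.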
Lost in Thought: A Moment of Contemplation in a Busy Cafe
A solitary man sits at a table in a bustling cafe, his gaze fixed on something unseen. The composition emphasizes his isolation and introspection, drawing the viewer into his thoughtful moment. The woman in the background, enjoying her coffee, adds a touch of everyday life to the scene, highlighting the contrast between the man’s internal world and the external environment.
Prompt
facial-expressions Jealousy: Heartbroken and resentful ; A man watching his ex-girlfriend laughing with another man; eye-level; Normal People; A bustling cafe with people chatting and enjoying coffee; cinematic
Characteristic
Shot : A man sitting at a table in a cafe, looking away from the camera. There are other people in the background, but they are out of focus. The man is holding a cup of coffee and has a thoughtful expression on his face.
Aesthetic Score : 0.7
Mood : melancholy, pensive, introspective
Quality
Entropy : 6.62
Noise : 76
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some artifacts in the image, particularly in the background. The image is also slightly overexposed.
Lost in the Game: A Moment of Intense Focus
A young man, headphones on, is completely absorbed in a video game. The dimly lit room amplifies the intensity of his focus, drawing the viewer into the immersive world unfolding on the screen.
Prompt
facial-expressions Jealousy: Obsessive and competitive ; A gamer staring intently at his computer screen; eye-level; Gamer; A dimly lit room with posters of video game characters on the walls; cinematic
Characteristic
Shot : A young man wearing headphones is looking intently at a computer screen. The screen is displaying a video game, and the man appears to be engrossed in the action.
Aesthetic Score : 0.6
Mood : focused, intense, immersive
Quality
Entropy : 6.07
Noise : 63
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has slight graininess and some noise in the darker areas, particularly in the background. This could be due to low light conditions or post-processing.
A Moment of Anticipation in the Park: A Romantic Encounter
In a serene park setting, a woman stands before a curious man, their eyes locked in a hopeful gaze. The scene is filled with romantic tension and dramatic anticipation, leaving viewers eager to discover the man’s next move.
Prompt
facial-expressions Jealousy: Yearning and wistful ; A woman looking at a couple holding hands in the park; eye-level; Single Persons; A sunny park with children playing and couples strolling; cinematic
Characteristic
Shot : A woman is standing in a park, looking off to the side. A man is standing behind her, looking at her. There are other people in the background, and the day is sunny.
Aesthetic Score : 0.6
Mood : romantic, hopeful, curious
Quality
Entropy : 6.72
Noise : 79
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears slightly soft, particularly the subject’s face. This may be due to the camera settings or post-processing.
Lost in the Crowd: A Moment of Contemplation
A man, shrouded in shadow, stands amidst a sea of faces, his gaze fixed on something beyond the frame. The play of light and shadow creates an air of mystery, leaving us to wonder what captivates his attention in this bustling crowd.
Prompt
facial-expressions Jealousy: Disgruntled and envious ; A hero watching another hero receive accolades; eye-level; Heroes; A crowded stadium with cheering fans and flashing lights; cinematic
Characteristic
Shot : A man in a black jacket is looking at something off-camera in a large indoor arena filled with people. The lights are on, and there is a sense of anticipation.
Aesthetic Score : 0.7
Mood : reflective, anticipation, hopeful
Quality
Entropy : 6.68
Noise : 68
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a slight blur in the background and some noise in the dark areas.
Lost in the Glow: A Moment of Ambiguity
A man stands in the soft, warm light of a dimly lit room, his face partially obscured by the shadows. The intimate atmosphere, punctuated by twinkling string lights, creates a sense of relaxed mystery. His expression is ambiguous, leaving the viewer to wonder about the story unfolding around him.
Prompt
facial-expressions Jealousy: Angry and betrayed ; A man watching his wife dancing with another man at a party; eye-level; Normal People; A brightly lit party with people dancing and laughing; cinematic
Characteristic
Shot : A man in a white shirt stands in a dimly lit room, surrounded by other people, some of whom are out of focus. The lights in the room are warm and create a cozy atmosphere.
Aesthetic Score : 0.4
Mood : intimate, casual, relaxed
Quality
Entropy : 6.74
Noise : 52
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some noise and grain are present in the image.
Lost in the Digital World: A Moment of Focused Intensity
A man, headphones on, stares intently at his computer screen. The lighting and composition emphasize his isolation and concentration, drawing the viewer into his world of digital immersion. The scene evokes a sense of focus, seriousness, and contemplation, leaving us to wonder what unfolds on the screen before him.
Prompt
facial-expressions Jealousy: Frustrated and envious ; A gamer watching a livestream of another player achieving a high score; eye-level; Gamer; A dimly lit room with a computer screen displaying the livestream; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer monitor, looking at the screen. He is wearing a dark shirt and the lighting is dim, with the monitor’s light reflecting in his glasses.
Aesthetic Score : 0.6
Mood : focused, serious, intense
Quality
Entropy : 5.99
Noise : 45
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and the colors are a bit washed out.
Silhouetted Romance in the City Lights
A couple’s passionate kiss under the glow of streetlights, creating a romantic and intimate scene against the backdrop of a bustling urban landscape.
Prompt
facial-expressions Jealousy: Melancholy and longing ; looking at a couple kissing in the rain; eye-level; Single Persons; A rainy street with puddles reflecting the city lights; cinematic
Characteristic
Shot : A couple is kissing in a city street at night. The street is lit by streetlights and the couple is silhouetted against the light.
Aesthetic Score : 0.7
Mood : romantic, intimate, urban
Quality
Entropy : 6.57
Noise : 92
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain.
A Soldier’s Gaze: Intensity and Drama in a Single Shot
This image captures a moment of intense focus and seriousness. The dramatic lighting and sharp focus on the soldier’s face create a powerful visual narrative, leaving the viewer to ponder the weight of his expression and the story behind it.
Prompt
facial-expressions Jealousy: Frustrated and envious ; A hero watching another hero save the day; eye-level; Heroes; A chaotic scene with explosions and people running for safety; cinematic
Characteristic
Shot : A man in a tactical vest, standing in a blurred background of what appears to be a war-torn cityscape.
Aesthetic Score : 0.8
Mood : intense, determined, serious
Quality
Entropy : 6.06
Noise : 59
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors
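The ten image cards above can be summarized by averaging each reported metric. The sketch below hard-codes the values copied from this post and computes per-metric means with the Python standard library. The `CARDS` layout and the key names are illustrative assumptions, and these averages are not claimed to be how the scores in the conclusion were derived.

```python
from statistics import mean

# One tuple per image, in the order the images appear in this post:
# (aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
CARDS = [
    (0.8, 6.74, 81, 0.28, 0.10),
    (0.7, 6.53, 71, 0.25, 0.20),
    (0.7, 6.62, 76, 0.27, 0.20),
    (0.6, 6.07, 63, 0.20, 0.20),
    (0.6, 6.72, 79, 0.28, 0.30),
    (0.7, 6.68, 68, 0.22, 0.10),
    (0.4, 6.74, 52, 0.28, 0.20),
    (0.6, 5.99, 45, 0.25, 0.20),
    (0.7, 6.57, 92, 0.28, 0.20),
    (0.8, 6.06, 59, 0.25, 0.30),
]
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(cards):
    """Return the mean of each metric, rounded to three decimals."""
    columns = zip(*cards)  # transpose rows into per-metric columns
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


summary = summarize(CARDS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the values listed in this post, the mean aesthetic score comes out at 0.66 and the mean Prompt Clip Score at about 0.256.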
Conclusion
The results show that the generative AI model performed well in understanding the scene and its aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating that it did not respond well to the camera position specified in the prompt (eye-level in every prompt shown here). This suggests the generated images may not have accurately captured the intended camera angle or perspective.
- Shot Analysis: The model scored 0.66, which is considered good. This means the model was able to understand the scene described in the prompt and translate it into a visually coherent image.
- Aesthetic Analysis: The model scored 0.13, which is considered very good. This indicates that the generated image closely matched the expected aesthetic style, despite the camera position issues.
Overall, the model demonstrates a good understanding of the scene and its aesthetic, but struggles with accurately capturing the intended camera position.
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://fal.ai/models/fal-ai/flux/schnell/api