AI's Facial Expressions: A Mixed Bag of Success with Flux-dev
Facial expressions are a powerful tool for conveying emotion and intention, and capturing their nuances remains a significant challenge for AI-generated imagery. This analysis looks at how flux-dev handles prompts that ask for dramatic facial expressions. The model translates scene descriptions into coherent shots, but it is less reliable on camera position and struggles to achieve the desired aesthetic. In several images the subject is silhouetted, turned away, or out of focus, so the requested expression is barely visible. The examples below cover both the successful and the less successful attempts.
Created with: flux-dev
Silhouetted Hero at Dusk
A lone figure, cloaked in red, stands against a breathtaking cityscape at dusk. Their back is turned, leaving their identity a mystery. The dramatic lighting and silhouette create a sense of intrigue and heroism, hinting at a story waiting to unfold.
Prompt
facial-expressions Attentiveness: Determined, vigilant ; A superhero standing on a rooftop, looking out over the city; eye-level; Hero; cityscape with twinkling lights; cinematic
Characteristic
Shot : A man in a superhero costume stands on a rooftop, looking out over a cityscape at dusk. He is silhouetted against the city lights.
Aesthetic Score : 0.6
Mood : dramatic, hopeful, powerful
Quality
Entropy : 6.82
Noise : 86
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, and the subject’s face is not in focus.
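The prompts in this post all follow the same semicolon-separated pattern: an expression label, a scene, a camera position, a subject type, a background, and a style. As a minimal sketch of how such a prompt could be split into its parts, the helper below parses one prompt string. The field names (`category`, `attribute`, `emotions`, and so on) are assumptions made for illustration; the post does not name the slots itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # All field names are hypothetical labels for the six prompt slots.
    category: str        # e.g. "facial-expressions"
    attribute: str       # e.g. "Attentiveness"
    emotions: List[str]  # e.g. ["Determined", "vigilant"]
    scene: str
    camera: str
    subject: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six slots."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, scene, camera, subject, background, style = fields
    # The first slot looks like "<category> <attribute>: <emotion>, <emotion>".
    label, _, emotion_text = head.partition(":")
    category, _, attribute = label.strip().partition(" ")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return PromptParts(category, attribute.strip(), emotions,
                       scene, camera, subject, background, style)


parts = parse_prompt(
    "facial-expressions Attentiveness: Determined, vigilant ; "
    "A superhero standing on a rooftop, looking out over the city; "
    "eye-level; Hero; cityscape with twinkling lights; cinematic"
)
print(parts.emotions, parts.camera, parts.style)
```

Splitting the prompt this way makes it easier to compare what was requested (for example the camera position or the emotion words) with what the shot description reports.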
Warrior Amidst the Flames
A woman clad in red armor and cape stands defiant against a backdrop of fiery chaos. The intensity of the battle is palpable, highlighting her courage and resilience in the face of danger.
Prompt
facial-expressions Attentiveness: Brave, fearless ; A hero standing in the middle of a battle, eyes locked on the enemy; eye-level; Hero; chaotic battlefield with explosions and smoke; cinematic
Characteristic
Shot : A woman warrior in armor stands in front of a fiery background. She looks determined and ready for battle.
Aesthetic Score : 0.7
Mood : epic, dramatic, powerful
Quality
Entropy : 6.48
Noise : 64
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors
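The post does not say how the Entropy value reported for each image is computed. One common definition is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a single grey level) to 8 bits (all 256 levels equally likely); the values of roughly 6.3 to 6.9 reported here would fit that scale. The sketch below implements that definition on a plain list of pixel values. It is an assumption about the metric, not the tool actually used for this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the observed grey levels.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 1000             # one grey level only
uniform = list(range(256)) * 4  # every 8-bit level equally often
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this definition, a higher value means the tonal range is used more evenly, which says something about detail and contrast but nothing about whether the face in the image is readable.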
Lost in Thought: A Silhouette of Loneliness
A man in a suit sits alone on a bench, his figure silhouetted against a backdrop of misty forest. The scene evokes a sense of melancholy and contemplation, leaving the viewer to wonder about his thoughts and the secrets hidden within the fog.
Prompt
facial-expressions Attentiveness: Melancholy, yet observant ; A lone figure sitting on a park bench; eye-level; Single Person; bustling city park in the background; cinematic
Characteristic
Shot : A man sits on a bench in a park. The trees are bare and the ground is covered in leaves, creating a sense of autumnal melancholy.
Aesthetic Score : 0.6
Mood : melancholic, contemplative, lonely
Quality
Entropy : 6.73
Noise : 75
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background. The bench has a slightly unrealistic texture.
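The Prompt Clip Score is likewise not defined in the post. A CLIP-style score is usually the cosine similarity between the embedding of the prompt text and the embedding of the image. The sketch below shows only that similarity step on toy vectors; the vectors are made-up stand-ins, and no real CLIP model is loaded.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, for illustration only.
text_embedding = [3.0, 4.0]
image_embedding = [4.0, 3.0]
print(cosine_similarity(text_embedding, image_embedding))
```

If the reported scores are cosine similarities of this kind, the small spread between 0.25 and 0.35 across the ten images is more informative than the absolute values.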
Lost in the City’s Embrace
A solitary figure, shrouded in shadow, walks through a bustling city. The blurred background and contemplative expression evoke a sense of isolation and mystery, capturing the mood of urban life.
Prompt
facial-expressions Attentiveness: Lost in thought, introspective ; A man walking down a crowded street, seemingly oblivious to the chaos around him; eye-level; Single Person; bustling city street with people and traffic; cinematic
Characteristic
Shot : A man is walking down a city street, with a blurred background and a slightly melancholy mood.
Aesthetic Score : 0.6
Mood : melancholy, introspective, urban
Quality
Entropy : 6.61
Noise : 56
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight graininess, some blurriness in the background, which might be intentional.
Lost in the Digital World: A Moment of Intense Focus
A young person, bathed in blue and purple light, sits at a desk, headphones on, eyes glued to the computer screen. The dimly lit room and their focused expression create a sense of dramatic intensity, highlighting their deep concentration in the digital realm.
Prompt
facial-expressions Attentiveness: Thrilled, competitive ; A gamer intensely focused on a screen, fingers flying across the keyboard; close-up; Gamer; dimly lit room with glowing monitor; cinematic
Characteristic
Shot : A young person is sitting in front of a computer, wearing headphones, and looking at the screen. The room is dimly lit with blue and red light. The person is typing on a keyboard.
Aesthetic Score : 0.7
Mood : focused, intense, tech-savvy
Quality
Entropy : 6.46
Noise : 61
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, and there is some noise in the shadows. There are some minor artifacts around the edges of the image.
Silhouetted Against the Setting Sun: A Moment of Contemplation
A lone figure stands on a cliff, bathed in the golden light of the setting sun. The vast mountain range behind them creates a sense of isolation and wonder, evoking a mood of serenity and hope. This breathtaking scene captures the beauty of nature and the human spirit’s yearning for connection.
Prompt
facial-expressions Attentiveness: Reflective, contemplative ; A hero standing on a cliff, looking out at the vast landscape; eye-level; Hero; dramatic mountain range with clouds and sunlight; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a vast mountain range. The sun is setting, casting a warm glow over the landscape.
Aesthetic Score : 0.8
Mood : serene, contemplative, inspiring
Quality
Entropy : 6.66
Noise : 70
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors in the image.
Lost in Memories: A Girl’s Pensive Gaze
A young girl, bathed in a soft light, sits before a blurred adult, her eyes fixed on a framed photograph. The selective focus draws the viewer into her world of melancholy and nostalgia, as she contemplates the past with a wistful expression.
Prompt
facial-expressions Attentiveness: Curious, engaged ; A young girl listening intently to her grandmother tell a story; eye-level; Normal Person; cozy living room with warm lighting; cinematic
Characteristic
Shot : A young girl in a yellow shirt is looking at something off-camera, with the back of an adult’s head in the foreground.
Aesthetic Score : 0.7
Mood : pensive, thoughtful, heartwarming
Quality
Entropy : 6.47
Noise : 64
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some slight noise and grain in the image, particularly in the shadows.
Lost in Thought: A Moment of Quiet Contemplation
A young woman finds solace in the dim light of a cafe, her pensive gaze directed out of frame. The soft lighting creates a dramatic effect, highlighting her thoughtful expression and isolating her in a moment of quiet introspection.
Prompt
facial-expressions Attentiveness: Observant, introspective ; A woman sitting alone in a cafe, observing the people around her; eye-level; Single Person; bustling cafe with tables and chairs; cinematic
Characteristic
Shot : A young woman sits alone at a table in a dimly lit restaurant, looking thoughtful.
Aesthetic Score : 0.7
Mood : pensive, contemplative, melancholic
Quality
Entropy : 6.48
Noise : 71
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a slight amount of noise in the image, particularly in the shadows.
Lost in the Pages: A Moment of Tranquility on the Train
A young woman finds solace in a good book, her focused expression and the soft lighting creating a sense of peace and tranquility. The simple yet elegant composition captures a moment of quiet contemplation on a bustling train journey.
Prompt
facial-expressions Attentiveness: Focused, absorbed ; A woman reading a book on a train; eye-level; Normal Person; blurred passengers and train windows; cinematic
Characteristic
Shot : A young woman is sitting on a train and reading a book.
Aesthetic Score : 0.7
Mood : pensive, contemplative, introspective
Quality
Entropy : 6.35
Noise : 61
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurred around the edges.
Headphones On, Excitement High: Gamer Reacts to the Big Moment
This image captures the raw energy of a gamer fully immersed in the action. The vibrant pink and blue lighting, combined with the man’s animated expression, creates a sense of intense excitement and anticipation. Is he celebrating a victory or bracing for a challenge? The moment is electric.
Prompt
facial-expressions Attentiveness: Joyful, triumphant ; A gamer celebrating a victory, eyes wide with excitement; close-up; Gamer; brightly lit room with cheering friends; cinematic
Characteristic
Shot : A young man with headphones on, his mouth wide open in a surprised expression. The background is blurry, likely a gaming setup, with colorful lights.
Aesthetic Score : 0.6
Mood : excited, surprised, energetic
Quality
Entropy : 6.85
Noise : 69
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors.
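To compare the ten images at a glance, the per-image figures listed above can be averaged. The short script below is a sketch that copies those numbers into a list and computes the mean of each metric; the column names are labels chosen here for illustration.

```python
from statistics import mean

# Values copied from the ten entries above, in order of appearance.
# Columns: title, aesthetic score, entropy, noise, prompt CLIP score, AI likelihood.
ENTRIES = [
    ("Silhouetted Hero at Dusk", 0.6, 6.82, 86, 0.29, 0.20),
    ("Warrior Amidst the Flames", 0.7, 6.48, 64, 0.25, 0.30),
    ("Lost in Thought: A Silhouette of Loneliness", 0.6, 6.73, 75, 0.30, 0.20),
    ("Lost in the City's Embrace", 0.6, 6.61, 56, 0.27, 0.30),
    ("Lost in the Digital World", 0.7, 6.46, 61, 0.27, 0.30),
    ("Silhouetted Against the Setting Sun", 0.8, 6.66, 70, 0.28, 0.10),
    ("Lost in Memories", 0.7, 6.47, 64, 0.33, 0.20),
    ("Lost in Thought: A Moment of Quiet Contemplation", 0.7, 6.48, 71, 0.28, 0.10),
    ("Lost in the Pages", 0.7, 6.35, 61, 0.35, 0.10),
    ("Headphones On, Excitement High", 0.6, 6.85, 69, 0.30, 0.10),
]
COLUMNS = ["aesthetic", "entropy", "noise", "clip", "ai_likelihood"]


def column(name):
    """Return all values of one metric, in entry order."""
    index = COLUMNS.index(name) + 1  # +1 skips the title
    return [entry[index] for entry in ENTRIES]


summary = {name: mean(column(name)) for name in COLUMNS}
best_clip = max(ENTRIES, key=lambda entry: entry[4])
best_aesthetic = max(ENTRIES, key=lambda entry: entry[1])

for name, value in summary.items():
    print(f"{name:>14}: {value:.3f}")
print("closest prompt match:", best_clip[0])
print("highest aesthetic score:", best_aesthetic[0])
```

On these figures the mean aesthetic score is 0.67 and the mean prompt CLIP score is about 0.29. The train-reading image matches its prompt most closely (0.35), and the cliff-at-sunset image has the highest aesthetic score (0.8).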
Conclusion
The analysis shows that the model handled shot composition well, fell short on camera position, and struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t perfectly capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.61, which falls within the “good” range. This indicates that the model was able to understand and translate the scene description in the prompt into a visually coherent shot.
- Aesthetic Analysis: The model scored 0.14, which lies above the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated from the aesthetic described in the prompts.
Overall, the model demonstrated a good understanding of the scene and shot composition, but struggled to achieve the desired aesthetic.
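The three comparisons in the breakdown come down to checking whether a score falls below, within, or above a reference range. The sketch below spells that check out. The range used for Shot Analysis is an assumption: the post states a “good” range of 0.5 to 0.75 only for Camera Position, and the same bounds are reused here for illustration.

```python
def position_in_range(score, low, high):
    """Return 'below', 'within', or 'above' for score relative to [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


RESULTS = [
    # (metric, score, range low, range high)
    ("Camera Position", 0.40, 0.50, 0.75),
    ("Shot Analysis", 0.61, 0.50, 0.75),  # assumed: same "good" range as above
    ("Aesthetic Analysis", 0.14, -0.20, 0.10),
]

for metric, score, low, high in RESULTS:
    verdict = position_in_range(score, low, high)
    print(f"{metric}: {score} is {verdict} the range {low} to {high}")
```

Read this way, only Shot Analysis lands inside its reference range, while Camera Position falls below its range and Aesthetic Analysis sits above its range.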