AI's Facial Expressions: A Mixed Bag of Success with Flux-schnell
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. Generative AI models are increasingly being used to create images with realistic facial expressions, but how well do they capture the nuances of human emotion? This blog post delves into the performance of a generative AI model in generating images with facial expressions, analyzing its strengths and weaknesses across various scenes. We’ll explore how the model performs in terms of camera position, shot analysis, and aesthetic style, providing insights into the current state of AI’s ability to capture the complexities of human expression.
Created with: flux-schnell
Lost in Thought on a City Street
A young man with glasses walks through a bustling city, his gaze fixed on something unseen. The shallow depth of field blurs the background, creating a sense of movement and isolation. His contemplative expression hints at a story waiting to be told.
Prompt
facial-expressions Interest: Intrigued, observant ; A lone figure; eye-level; Single Person; bustling city street; cinematic
Characteristic
Shot : A young man in glasses is walking down a city street, possibly in New York. The image is taken from his perspective, with the street and buildings behind him blurring in the background.
Aesthetic Score : 0.6
Mood : casual, urban, observant
Quality
Entropy : 6.90
Noise : 97
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor color artifacts and noise in the background, which are more noticeable in the darker areas.
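Every prompt in this post follows the same semicolon-separated layout, so it can be split into named parts for later comparison with the generated image. The sketch below is one way to do that in Python. The field names (`topic`, `category`, `camera`, and so on) are our own labels for illustration, not names used by flux-schnell or by the evaluation tooling.

```python
from dataclasses import dataclass


@dataclass
class ParsedPrompt:
    # Field names are hypothetical labels chosen for this sketch.
    topic: str
    category: str
    expressions: list
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ParsedPrompt:
    """Split a prompt of the form used in this post into its six parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(parts)}")
    head, subject, camera, character_type, setting, style = parts

    # The first part looks like "facial-expressions Interest: Intrigued, observant".
    label, _, expression_text = head.partition(":")
    words = label.split()
    expressions = [e.strip() for e in expression_text.split(",") if e.strip()]

    return ParsedPrompt(
        topic=words[0],
        category=" ".join(words[1:]),
        expressions=expressions,
        subject=subject,
        camera=camera,
        character_type=character_type,
        setting=setting,
        style=style,
    )


example = parse_prompt(
    "facial-expressions Interest: Intrigued, observant ; A lone figure; "
    "eye-level; Single Person; bustling city street; cinematic"
)
print(example.camera, example.expressions)
```

Splitting on semicolons first keeps comma-containing parts such as "warm, inviting cafe interior" intact.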
Superman Faces the Flames: A Hero’s Moment Captured
A dramatic image of Superman standing against a backdrop of a burning building and city skyline. The shallow depth of field emphasizes the hero’s presence and the urgency of the situation, creating a powerful and intense mood.
Prompt
facial-expressions Interest: Focused, determined ; A superhero in a dramatic pose; medium shot; Hero; cityscape with a burning building in the background; cinematic
Characteristic
Shot : A man dressed as Superman stands in front of a building with a large explosion in the background.
Aesthetic Score : 0.7
Mood : dramatic, intense, heroic
Quality
Entropy : 6.90
Noise : 74
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
Lost in the Pages: A Moment of Tranquility at the Cafe
A young woman with long brown hair finds solace in a good book at a cozy cafe. The composition draws the viewer’s attention to her focused expression, creating a sense of intimacy and calm. This image captures the essence of quiet contemplation and the joy of escaping into a story.
Prompt
facial-expressions Interest: Engrossed, absorbed ; A woman reading a book in a coffee shop; eye-level; Normal People; warm, inviting cafe interior; cinematic
Characteristic
Shot : A young woman is sitting in a cafe, reading a book. She is dressed casually in a grey sweater. The cafe is brightly lit and has large windows.
Aesthetic Score : 0.7
Mood : calm, relaxed, contemplative
Quality
Entropy : 6.76
Noise : 80
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry in places, especially in the background.
The Moment of Triumph: Gamer’s Face Lights Up with Excitement
A close-up shot captures the intense focus and surprise of a young man engrossed in a video game. The darkness surrounding him amplifies the excitement of the moment, leaving viewers eager to know what thrilling event unfolded on the screen.
Prompt
facial-expressions Interest: Excited, concentrated ; A gamer intensely focused on a screen; close-up; Gamer; dimly lit room with glowing monitor; cinematic
Characteristic
Shot : A young man wearing headphones is looking intently at a computer screen. Two other young men are sitting beside him, also looking at the screen. They are all in a dimly lit room, suggesting they might be gaming.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 6.23
Noise : 57
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise, particularly in the darker areas of the image. The colors are slightly muted.
Silhouetted Against the Storm
A man gazes out of a vehicle window at a brooding, cloudy sky. His silhouette against the darkness evokes a sense of reflection, pensiveness, and isolation. The scene is both somber and mysterious, leaving the viewer to ponder the man’s thoughts and the weight of the storm outside.
Prompt
facial-expressions Interest: Contemplative, thoughtful ; A man gazing out a window at a stormy sky; eye-level; Single Person; dark, moody interior; cinematic
Characteristic
Shot : A man’s silhouette is seen through a car window, looking out at a stormy sky.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, pensive
Quality
Entropy : 5.92
Noise : 43
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
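This silhouette has the lowest Entropy value of the ten images (5.92). The post does not say how Entropy is computed. A common choice, and the assumption behind the sketch below, is the Shannon entropy in bits of the 8-bit grayscale histogram, which ranges from 0 (one flat tone) to 8 (all 256 levels equally used). Under that assumption, a dark image dominated by a few tones would be expected to score lower than a busy, evenly lit scene.

```python
import math


def entropy_bits(histogram):
    """Shannon entropy (in bits) of a histogram given as a sequence of counts.

    Assumption for this sketch: the histogram has one bin per gray level,
    for example 256 bins for an 8-bit grayscale image.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram is empty")
    # Empty bins contribute nothing; log2(total / count) equals -log2(p).
    return sum(
        (count / total) * math.log2(total / count)
        for count in histogram
        if count > 0
    )


# A single flat tone versus every gray level used equally often.
flat = [0] * 256
flat[128] = 1024
uniform = [4] * 256

print(entropy_bits(flat), entropy_bits(uniform))
```

For a real image, the counts could come from Pillow's `Image.open(path).convert("L").histogram()`, which returns 256 bin counts for a grayscale image.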
Silhouette of Mystery: A Man Contemplates the City Lights
A solitary figure stands on a rooftop, his dark blue shirt blending into the night. The city lights below shimmer like a distant galaxy, creating a backdrop of urban mystery. The man’s silhouette against this glittering panorama evokes a sense of contemplation and intrigue, leaving the viewer to wonder about his thoughts and the secrets he holds.
Prompt
facial-expressions Interest: Confident, determined ; A hero standing on a rooftop overlooking a city; wide shot; Hero; panoramic cityscape with dramatic lighting; cinematic
Characteristic
Shot : A man standing on a rooftop overlooking a city at night. The cityscape is blurred in the background, with the man’s figure in focus.
Aesthetic Score : 0.6
Mood : mysterious, confident, urban
Quality
Entropy : 6.82
Noise : 65
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has slight blurriness, especially in the background. The lighting is also uneven.
Intimate Gathering: Friends Sharing a Meal and Laughter
A group of friends gather around a table, enjoying a meal and drinks in a warm and inviting setting. Soft lighting and cozy ambiance create a sense of intimacy and comfort, drawing you into the scene.
Prompt
facial-expressions Interest: Happy, engaged ; A group of friends laughing together at a dinner table; eye-level; Normal People; cozy, homey dining room; cinematic
Characteristic
Shot : A group of friends are having dinner together. They are sitting at a table with wine glasses and plates of food. There are candles on the table, and the room is dimly lit. The scene is warm and inviting.
Aesthetic Score : 0.7
Mood : warm, casual, friendly
Quality
Entropy : 6.80
Noise : 84
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors
The Thrill of the Game: Intensity and Focus in the Digital Arena
A young man, eyes glued to the screen, is immersed in the world of video games. The blue-purple glow and close-up framing capture the intensity and focus of the moment, while the presence of another gamer in the background hints at the competitive spirit fueling the action.
Prompt
facial-expressions Interest: Thrilled, focused ; A gamer’s hands rapidly moving across a keyboard and mouse; close-up; Gamer; brightly lit gaming setup with flashing lights; cinematic
Characteristic
Shot : A close-up of a young man playing a video game. He is wearing a headset and is focused on the game. The image is lit by the glow of the computer screen.
Aesthetic Score : 0.6
Mood : focused, intense, competitive
Quality
Entropy : 6.77
Noise : 67
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurriness and slightly unnatural lighting in the background.
Lost in the Art: A Moment of Wonder in the Gallery
A young woman, captivated by the ornate details of the gallery ceiling, stands with a slight smile and a look of curiosity. The scene evokes a sense of wonder and contemplation, inviting viewers to share in her moment of artistic appreciation.
Prompt
facial-expressions Interest: Appreciative, curious ; A woman looking at a painting in a museum; eye-level; Single Person; grand museum hall with intricate artwork; cinematic
Characteristic
Shot : A young woman in a red shirt stands in an art gallery, looking up at the ceiling. The image is a close-up, focusing on the woman’s face and the paintings in the background.
Aesthetic Score : 0.7
Mood : thoughtful, curious, artistic
Quality
Entropy : 6.88
Noise : 100
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : none
Clashing Titans: Two Men Face Off in a Blaze of Fury
A tense standoff unfolds amidst a backdrop of smoke and fire. The dramatic lighting and stark contrast highlight the intensity of the confrontation between two men, leaving the outcome uncertain.
Prompt
facial-expressions Interest: Intense, focused ; A hero facing off against a villain; medium shot; Hero; dramatic, action-packed scene with explosions and smoke; cinematic
Characteristic
Shot : Two men are facing each other in a dramatic confrontation. The background is blurred and suggestive of a conflict.
Aesthetic Score : 0.6
Mood : intense, dramatic, confrontational
Quality
Entropy : 6.76
Noise : 81
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Conclusion
The analysis shows that the generative AI model performed well on shot analysis and aesthetic analysis, but struggled with camera position.
Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.56, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.15, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position.
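For readers who want to compare the ten images side by side, the per-image figures reported above can be averaged with a few lines of Python. This is only a tally of the numbers already listed in this post. The column names are our own shorthand, and these averages are separate from the camera position, shot, and aesthetic analysis scores in the breakdown, whose computation the post does not describe.

```python
from statistics import mean

# Shorthand column names chosen for this sketch.
COLUMNS = ("aesthetic", "entropy", "noise", "clip", "likelihood_ai")

# One row per image, in the order the images appear in this post.
REPORTS = [
    (0.6, 6.90, 97, 0.20, 0.10),   # Lost in Thought on a City Street
    (0.7, 6.90, 74, 0.26, 0.20),   # Superman Faces the Flames
    (0.7, 6.76, 80, 0.27, 0.10),   # Lost in the Pages
    (0.6, 6.23, 57, 0.24, 0.20),   # The Moment of Triumph
    (0.6, 5.92, 43, 0.25, 0.20),   # Silhouetted Against the Storm
    (0.6, 6.82, 65, 0.23, 0.20),   # Silhouette of Mystery
    (0.7, 6.80, 84, 0.29, 0.10),   # Intimate Gathering
    (0.6, 6.77, 67, 0.26, 0.10),   # The Thrill of the Game
    (0.7, 6.88, 100, 0.29, 0.10),  # Lost in the Art
    (0.6, 6.76, 81, 0.25, 0.20),   # Clashing Titans
]


def column_means(rows):
    """Return the mean of each column, keyed by its shorthand name."""
    return {name: mean(values) for name, values in zip(COLUMNS, zip(*rows))}


summary = column_means(REPORTS)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

On these figures the mean Aesthetic Score is 0.64 and the mean Prompt Clip Score is about 0.25.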