AI's Facial Expressions: A Step Forward, But Still Room for Growth with Flux-schnell
Facial expressions are a powerful tool for conveying emotions and intentions in storytelling. In the realm of AI, generating realistic and expressive faces is a challenging task. This blog post explores the capabilities of generative AI models in capturing the nuances of facial expressions, analyzing their performance in understanding scene context, camera angles, and aesthetic styles. We’ll delve into the results of a recent experiment, highlighting the model’s strengths and areas for improvement. By understanding the current state of AI-generated facial expressions, we can gain insights into the potential of this technology for creating immersive and engaging experiences in various fields, from filmmaking to virtual reality.
Created with: flux-schnell
Scream in the Shadows: A Moment of Terror in the Alley
A figure cloaked in darkness, their face contorted in a primal scream, stands trapped within the confines of a shadowy alleyway. The image evokes a sense of intense fear and claustrophobia, leaving the viewer questioning the source of their terror.
Prompt
facial-expressions Fear: Unease, paranoia ; A lone figure; eye-level; Single Person; a dark, deserted alleyway; cinematic
Characteristic
Shot : A man stands in a dark alleyway, his face contorted in a scream. The lighting is very harsh, and the man’s expression is exaggerated and intense.
Aesthetic Score : 0.2
Mood : intense, scary, sinister
Quality
Entropy : 5.77
Noise : 46
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has a number of artifacts, including the lighting which makes the image look slightly distorted and unrealistic. The man’s expression is also very exaggerated, to the point of being cartoonish.
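The prompts in this post appear to follow a fixed, semicolon-separated pattern: a category and emotion with its nuances, then a subject, a camera angle, a character type, a setting, and a style. A minimal sketch of splitting a prompt into those parts is shown below. The field names (`category`, `emotion`, `nuances`, and so on) and the `parse_prompt` helper are assumptions made for illustration; they are not labels or tooling from the original experiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Hypothetical field names, inferred from the prompts shown in this post.
    category: str        # e.g. "facial-expressions"
    emotion: str         # e.g. "Fear"
    nuances: List[str]   # e.g. ["Unease", "paranoia"]
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six apparent segments."""
    segments = [s.strip() for s in prompt.split(";")]
    if len(segments) != 6:
        raise ValueError(f"expected 6 segments, got {len(segments)}")
    head, subject, camera, character, setting, style = segments
    # The first segment looks like "<category> <emotion>: <nuance>, <nuance>".
    label, _, nuance_text = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    nuances = [n.strip() for n in nuance_text.split(",") if n.strip()]
    return PromptParts(category, emotion.strip(), nuances,
                       subject, camera, character, setting, style)


if __name__ == "__main__":
    parts = parse_prompt(
        "facial-expressions Fear: Unease, paranoia ; A lone figure; eye-level; "
        "Single Person; a dark, deserted alleyway; cinematic"
    )
    print(parts.camera, "|", parts.setting)
```

Splitting on semicolons rather than commas keeps settings such as "a dark, deserted alleyway" intact, which makes it easier to compare the requested camera angle or setting against the shot description of each image.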
Lost in the Mist: A Figure in the Shadows
A man shrouded in mystery, clad in a dark leather jacket, stands amidst a swirling urban fog. His gaze is fixed on something unseen, hinting at a hidden story. A smaller figure in the distance, also cloaked in darkness, adds to the sense of intrigue and suspense.
Prompt
facial-expressions Fear: Dread, anticipation ; A superhero standing alone on a rooftop; eye-level; Hero; a cityscape shrouded in fog; cinematic
Characteristic
Shot : A man in a dark leather jacket stands in the foreground with a figure in the distance, set against a misty urban backdrop
Aesthetic Score : 0.6
Mood : mysterious, dark, brooding
Quality
Entropy : 6.71
Noise : 66
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears slightly grainy, likely due to the fog and low light conditions. There are no significant artifacts or errors
Lost in the Shadows: A Woman’s Solitary Night
A young woman stands alone on a dark street, bathed in the eerie glow of a single streetlamp. The shadows play across her face, hinting at a story of mystery and loneliness. This evocative image captures a brooding mood, leaving the viewer to wonder about her secrets and her destination.
Prompt
facial-expressions Fear: Vulnerability, isolation ; A woman walking down a dimly lit street; eye-level; Normal Person; a deserted street with flickering streetlights; cinematic
Characteristic
Shot : A young woman stands in a dark street, her face illuminated by a nearby streetlight. Her expression is serious, and she is looking directly at the camera. The background is out of focus.
Aesthetic Score : 0.7
Mood : dark, mysterious, brooding
Quality
Entropy : 5.45
Noise : 52
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, but the image is slightly grainy.
Lost in the Digital Realm: A Man’s Intense Focus on a 3D Creature
A dimly lit room, a man engrossed in his work, and a captivating 3D creature on the screen. This image captures the focused intensity of a creative mind immersed in a digital world, leaving the viewer to wonder what mysteries lie within the virtual realm.
Prompt
facial-expressions Fear: Disquiet, unease ; A gamer hunched over their computer; close-up; Gamer; a flickering monitor displaying a disturbing image; cinematic
Characteristic
Shot : A man wearing headphones is sitting in front of a computer, looking at the screen. There is a blurry image of a monster on the screen behind him.
Aesthetic Score : 0.6
Mood : focused, intense, mysterious
Quality
Entropy : 6.19
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and the colors are a bit muted.
Screaming in the Shadows: A Moment of Terror
A man’s face contorted in a primal scream, his hands desperately pushing against an unseen threat. The darkness and shadows amplify the intensity of the moment, leaving the viewer with a palpable sense of fear and urgency.
Prompt
facial-expressions Fear: Terror, helplessness ; hiding ; low-angle; Single Person; a dark room with shadows creeping in; cinematic
Characteristic
Shot : A man with a dark expression and a scream on his face, looking directly at the camera. His hands are held up in front of his face, obscuring part of his visage. The background is dark and blurry.
Aesthetic Score : 0.4
Mood : dark, tense, aggressive
Quality
Entropy : 5.72
Noise : 55
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry. There is a lot of noise in the shadows and the contrast is high, making the image appear less natural.
The Weight of War in His Eyes
A close-up portrait captures the intensity and grit of a bearded man, his serious expression reflecting the chaos of a war-torn city behind him. The intimacy of the shot draws you into his world, leaving you to ponder the weight of his experience.
Prompt
facial-expressions Fear: Desperation, courage ; A hero facing a monstrous creature; eye-level; Hero; a crumbling battlefield with smoke and debris; cinematic
Characteristic
Shot : A close-up of a man’s face, with a blurry background of a destroyed city and soldiers in the distance. The man is looking at the viewer with an intense expression.
Aesthetic Score : 0.7
Mood : intense, dramatic, gritty
Quality
Entropy : 6.77
Noise : 83
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some slight artifacts in the image, particularly around the edges of the man’s face and beard.
Fear in the Flash: Five Friends Huddle in the Face of a Storm
A chilling scene unfolds as five young people gather in a dimly lit room, their faces etched with fear as a lightning strike illuminates the night sky outside. The atmosphere is thick with suspense, leaving viewers on the edge of their seats wondering what danger lurks beyond the window.
Prompt
facial-expressions Fear: Anxiety, uncertainty ; A group of people huddled together in a darkened room; eye-level; Normal People; a storm raging outside with thunder and lightning; cinematic
Characteristic
Shot : A group of five young people huddle together in a dark room, looking out at a stormy night with a dramatic lightning strike in the background.
Aesthetic Score : 0.6
Mood : suspenseful, eerie, dark
Quality
Entropy : 5.66
Noise : 64
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors, but the image is slightly grainy and the lighting could be more dramatic.
Lost in the Shadows: A Moment of Intrigue
A solitary figure sits shrouded in darkness, their gaze fixed on a television screen displaying two enigmatic women. The play of light and shadow creates an atmosphere of mystery and contemplation, leaving the viewer to wonder what secrets lie hidden within the room.
Prompt
facial-expressions Fear: Shock, adrenaline ; A gamer’s hands shaking as they play a horror game; close-up; Gamer; a screen displaying a jump scare; cinematic
Characteristic
Shot : A person is watching a television screen. The screen is showing a scene of two women. The person’s hand is in the foreground. The scene is dark and shadowy, with only the television screen brightly lit.
Aesthetic Score : 0.3
Mood : dark, brooding, introspective
Quality
Entropy : 6.19
Noise : 45
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
A Solitary Figure Contemplates the Storm
A lone figure stands on a cliff edge, silhouetted against a misty landscape. The sky is heavy with the promise of a storm, mirroring the figure’s own sense of isolation and contemplation. The vastness of the scene emphasizes the figure’s smallness and vulnerability, creating a powerful sense of drama and tension.
Prompt
facial-expressions Fear: Loneliness, despair ; A lone figure standing at the edge of a cliff; eye-level; Single Person; a vast, empty landscape with a stormy sky; cinematic
Characteristic
Shot : A solitary figure stands on the edge of a cliff, gazing out at a vast, misty landscape. The sky is overcast with grey clouds.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, lonely
Quality
Entropy : 6.76
Noise : 60
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
Silhouetted Against the Flames: A Lone Figure in a City of Ashes
A solitary figure, cloaked in shadow, stands amidst a cityscape consumed by fire and smoke. The image evokes a sense of grim isolation and impending doom, leaving viewers to ponder the fate of this lone survivor in a world consumed by chaos.
Prompt
facial-expressions Fear: Loss, determination ; A hero standing amidst a burning city; eye-level; Hero; a chaotic scene with smoke and flames; cinematic
Characteristic
Shot : A man in a hooded jacket stands alone in front of a scene of destruction, fire and smoke in the background. The man seems to be contemplating the scene, his back to the viewer.
Aesthetic Score : 0.6
Mood : dark, suspenseful, melancholic
Quality
Entropy : 6.62
Noise : 74
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image is somewhat grainy and the colors are muted. The fire in the background looks somewhat artificial, and the smoke is too dense and lacking in detail.
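The post does not say how the Entropy value of each image is computed. One common definition is the Shannon entropy of the grayscale pixel histogram, measured in bits, which for 8-bit images ranges from 0 (a flat, single-tone image) to 8 (every intensity equally likely). The reported values between roughly 5 and 7 would fit that scale, but this is an assumption. The sketch below uses only the standard library and takes a plain sequence of 0 to 255 intensities; `shannon_entropy` is a hypothetical helper, not the tool used in the experiment.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of grayscale intensities."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = sum over intensities of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    flat = [128] * 64            # one tone only
    two_tone = [0, 255] * 32     # two tones, equally frequent
    print(shannon_entropy(flat), shannon_entropy(two_tone))
```

Under this definition, dark scenes dominated by a few shadow tones tend to score lower, while scenes with fog, smoke, or debris spread intensities more evenly and score higher.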
Conclusion
The results suggest that the generative AI model understood the described scenes reasonably well and was rated highly on aesthetic style, but struggled to reproduce the requested camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.58, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.23, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated images is rated very good.
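For reference, the per-image numbers listed in this post can be averaged directly. The sketch below does this for the Aesthetic Score, Prompt Clip Score, and Likelihood of AI of the ten images. The `summarize` helper is hypothetical, and these simple means are not the same quantities as the Camera Position, Shot Analysis, and Aesthetic Analysis scores in the conclusion, whose derivation the post does not describe.

```python
from statistics import mean
from typing import Dict, List, Tuple

# (aesthetic score, prompt CLIP score, likelihood of AI), copied from the
# ten image entries in this post, in order of appearance.
RESULTS: List[Tuple[float, float, float]] = [
    (0.2, 0.22, 0.70),
    (0.6, 0.24, 0.30),
    (0.7, 0.30, 0.20),
    (0.6, 0.27, 0.20),
    (0.4, 0.26, 0.10),
    (0.7, 0.26, 0.30),
    (0.6, 0.29, 0.20),
    (0.3, 0.22, 0.10),
    (0.6, 0.27, 0.20),
    (0.6, 0.29, 0.60),
]


def summarize(results: List[Tuple[float, float, float]]) -> Dict[str, float]:
    """Return the mean of each metric across all images."""
    aesthetic, clip, ai_likelihood = zip(*results)
    return {
        "aesthetic": mean(aesthetic),
        "prompt_clip": mean(clip),
        "ai_likelihood": mean(ai_likelihood),
    }


if __name__ == "__main__":
    for name, value in summarize(RESULTS).items():
        print(f"{name}: {value:.2f}")
```

The means come out at about 0.53 for the aesthetic score, 0.26 for the prompt CLIP score, and 0.29 for the likelihood of AI, which gives a rough baseline for comparing individual images against the set.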
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://fal.ai/models/fal-ai/flux/schnell/api