AI's Struggle with Dramatic Facial Expressions: Flux-schnell
Dramatic facial expressions are a powerful tool in storytelling, conveying a wide range of emotions and adding depth to characters. From the intense rage of a superhero facing down a villain to the quiet despair of a lone figure in the rain, these expressions can evoke strong reactions in viewers. However, capturing the nuance and complexity of human emotion is a challenging task for generative AI models. This blog post explores the results of an experiment in which flux-schnell was asked to generate ten images depicting anger in its dramatic forms, from despair to fury. Each image is shown with its prompt, an automated description, and quality scores, and the conclusion summarizes where the model met the prompts and where it fell short.
Created with: flux-schnell
Lost in the Shadows: A Lonely Figure Walks a Deserted City Street
A solitary figure traverses a rain-slicked, deserted city street at night. The dim glow of streetlamps casts long shadows, adding to the sense of mystery and intrigue. The image evokes feelings of loneliness, gloom, and a palpable sense of the unknown.
Prompt
facial-expressions Anger: Despair and rage ; A lone figure, standing in the middle of a deserted street; eye-level; Single Person; Rain pouring down, streetlights casting long shadows; cinematic
Characteristic
Shot : A lone figure walks down a dark, wet street in the rain. The streetlights cast an eerie glow on the buildings and the figure is silhouetted against the dark background.
Aesthetic Score : 0.7
Mood : mysterious, lonely, melancholic
Quality
Entropy : 5.65
Noise : 86
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : No noticeable errors in the image.
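All prompts in this post follow the same semicolon-separated layout as the one above. The sketch below splits such a prompt into named parts so that, for example, the requested camera position can be compared with the generated shot. The field names (`category`, `emotion`, `subject`, `camera`, `character`, `setting`, `style`) are assumptions inferred from the prompts themselves; the post does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels inferred from the prompt layout.
    category: str   # e.g. "facial-expressions Anger"
    emotion: str    # e.g. "Despair and rage"
    subject: str    # who is shown and what they are doing
    camera: str     # e.g. "eye-level" or "close-up"
    character: str  # e.g. "Single Person", "Hero", "Gamer"
    setting: str    # environment description
    style: str      # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptParts:
    """Split 'category: emotion ; subject; camera; character; setting; style'."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    # The first field holds both the category and the emotion, separated by a colon.
    category, _, emotion = fields[0].partition(":")
    return PromptParts(category.strip(), emotion.strip(), *fields[1:])


example = parse_prompt(
    "facial-expressions Anger: Despair and rage ; A lone figure, standing in "
    "the middle of a deserted street; eye-level; Single Person; Rain pouring "
    "down, streetlights casting long shadows; cinematic"
)
print(example.emotion, "|", example.camera, "|", example.style)
```

Splitting on semicolons rather than commas matters here, because the subject and setting fields contain commas of their own.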
Heroic Charge in the Urban Battlefield
A superhero, clad in red and black, charges forward with fists clenched, amidst a gritty urban backdrop. The low angle and intense expression on the hero’s face create a sense of immediacy and danger, suggesting a fierce battle is underway. The dynamic composition draws the viewer’s eye towards the hero’s action, capturing the intensity and drama of the moment.
Prompt
facial-expressions Anger: Fury and determination ; A superhero, fists clenched, facing down a horde of villains; eye-level; Hero; A crumbling cityscape, smoke and debris filling the air; cinematic
Characteristic
Shot : A superhero in red and black suit is charging towards the camera with a determined look on his face, fists clenched, amidst a blurry background of other figures in the scene.
Aesthetic Score : 0.6
Mood : intense, dramatic, powerful
Quality
Entropy : 6.61
Noise : 80
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image has some minor blurriness and noise, especially in the background.
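Each image in this post is reported with an Entropy value. The post does not say how it is computed; a common choice is the Shannon entropy of the grayscale intensity histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally frequent). The sketch below implements that measure as an assumption about what the number means, not as the pipeline actually used here.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy in bits of a flat sequence of 8-bit intensity values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat_tone = [128] * 64            # one intensity only
two_tones = [0] * 32 + [255] * 32  # two intensities, equally frequent
all_levels = list(range(256))      # every intensity exactly once

for name, data in [("flat", flat_tone), ("two tones", two_tones), ("all levels", all_levels)]:
    print(f"{name}: {shannon_entropy(data):.2f} bits")
```

Under this reading, the values between 5.65 and 6.87 reported in this post would correspond to images that use a broad spread of tones without being uniformly distributed.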
On the Verge of Explosion: A Man’s Frustration Caught on Camera
A tense scene unfolds as a man sits at a cluttered desk, his clenched fist and intense gaze suggesting simmering anger. The messy surroundings and the man’s direct stare create a palpable sense of frustration and impending outburst.
Prompt
facial-expressions Anger: Frustration and rage ; A man, slamming his fist on a table, surrounded by scattered papers; eye-level; Normal Person; A cluttered office, with a window showing a stormy sky; cinematic
Characteristic
Shot : A man in a white shirt is sitting at a desk in a messy office, looking angry and raising his fist. Papers are scattered around him. The background is out of focus. The image appears to be taken with a wide angle lens.
Aesthetic Score : 0.3
Mood : angry, tense, frustrated
Quality
Entropy : 6.84
Noise : 86
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, which may be due to motion blur or a wide aperture setting. There is also some noise in the image, particularly in the shadows.
The Gamer’s Fury: A Moment of Intense Focus
A young man, lost in the world of gaming, sits on the floor, headset on, surrounded by empty soda cans. His expression is intense, almost angry, as the dramatic lighting highlights his face and the headset, blurring the background into a hazy backdrop.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, throwing his headset on the floor, surrounded by empty energy drink cans; eye-level; Gamer; A dimly lit room, with a computer screen displaying a game in progress; cinematic
Characteristic
Shot : A young man is sitting on the floor with his head down, looking intensely at something. He is wearing a grey t-shirt and headphones. There are several empty cans of beer in front of him, as well as a TV screen in the background. The image is dark and moody, with a sense of tension and focus.
Aesthetic Score : 0.6
Mood : intense, focused, frustrated
Quality
Entropy : 6.57
Noise : 78
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some of the objects in the scene appear blurry, particularly the TV screens, cans, and the headphones.
Screaming in the Dark
A close-up shot captures a woman’s face contorted in a scream, illuminated only by a single, stark lightbulb. The image evokes a sense of intense fear and urgency, leaving the viewer breathless and questioning the source of her terror.
Prompt
facial-expressions Anger: Despair and rage ; A woman, screaming into the void, her face contorted in anger; close-up; Single Person; A dark, empty room, with only a single flickering light; cinematic
Characteristic
Shot : A close-up shot of a woman’s face, with her mouth wide open in a scream, against a dark background. A single light bulb illuminates the scene.
Aesthetic Score : 0.2
Mood : intense, fear, dramatic
Quality
Entropy : 6.24
Noise : 39
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and the lighting is not very flattering. The woman’s face is also not very well-composed.
Unfazed by the Inferno: Man Stands Amidst Burning City
A solitary figure in a red shirt stands stoic amidst a city consumed by flames. The apocalyptic scene, with smoke billowing into the sky, creates a stark contrast to the man’s calm demeanor, leaving viewers to ponder his resilience in the face of utter destruction.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a rooftop, overlooking a city in flames; eye-level; Hero; A fiery inferno engulfing the city, with smoke billowing into the sky; cinematic
Characteristic
Shot : A man in a red shirt is standing in front of a burning city. He looks serious and intense. The city is engulfed in flames and smoke.
Aesthetic Score : 0.6
Mood : intense, dramatic, apocalyptic
Quality
Entropy : 6.87
Noise : 63
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, but it is not overly distracting. The flames are well-defined and realistic.
Silent Tension: A Couple’s Heated Exchange in a Dimly Lit Restaurant
A couple sits at a table in a dimly lit restaurant, their conversation charged with unspoken emotions. The man’s anger is palpable, while the woman listens intently, her expression a mixture of concern and apprehension. The dramatic lighting and their intense expressions heighten the tension, leaving the viewer wondering what secrets lie beneath the surface.
Prompt
facial-expressions Anger: Frustration and rage ; A couple, arguing in a crowded restaurant, their voices raised in anger; eye-level; Normal People; A bustling restaurant, with other diners looking on; cinematic
Characteristic
Shot : A man and a woman are sitting at a table in a restaurant. The woman is looking at the man, who is looking away from her.
Aesthetic Score : 0.6
Mood : tense, dramatic, curious
Quality
Entropy : 6.85
Noise : 90
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the background. The lighting is also a bit uneven, which detracts from the overall aesthetic.
Headphones Amplify the Rage: Man Unleashes Fury in Dark Room
A man, consumed by anger, screams into the night, his headphones amplifying his frustration. The dark room and intense facial expression create a dramatic scene, capturing the raw emotion of the moment.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, smashing his keyboard in a fit of rage; close-up; Gamer; A dimly lit room, with a computer screen displaying a game over screen; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer. He is screaming and appears to be angry or frustrated. He is likely playing a game and has lost. The image is slightly blurred.
Aesthetic Score : 0.4
Mood : intense, frustrated, angry
Quality
Entropy : 6.29
Noise : 76
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have been blurred slightly.
Intense Gaze in the Shadows
A man with a dark expression stares directly at the camera, his intensity palpable. The blurred background and low lighting create a sense of mystery and suspense, hinting at a dramatic and possibly dangerous situation.
Prompt
facial-expressions Anger: Despair and rage ; A man, standing in the rain, his face obscured by the downpour; eye-level; Single Person; A dark, deserted street, with only the sound of rain and thunder; cinematic
Characteristic
Shot : A close-up portrait of a man in the rain, his face contorted in anger, with a blurry background of city lights and buildings.
Aesthetic Score : 0.4
Mood : intense, dark, agitated
Quality
Entropy : 6.46
Noise : 61
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some slight artifacts in the image, particularly around the edges of the subject’s hair and the background.
Warrior’s Fury: A Moment of Chaos in Battle
A fierce warrior, clad in armor, stands amidst a blurred and chaotic battlefield. The image captures the intensity and drama of the moment, with a sense of tension and anticipation hanging in the air.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a battlefield, surrounded by fallen enemies; eye-level; Hero; A battlefield littered with bodies, with smoke and dust filling the air; cinematic
Characteristic
Shot : A warrior in armor, with a fierce expression, stands in a dimly lit setting, likely a battlefield. He grips a weapon, possibly a sword. Other figures, obscured by smoke, can be seen in the background.
Aesthetic Score : 0.7
Mood : intense, dramatic, gritty
Quality
Entropy : 6.71
Noise : 83
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some noise and pixelation are noticeable. The image also appears slightly blurry.
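The per-image numbers above are easier to compare side by side. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten sections and computes simple averages. These averages are derived here for illustration only; they are not the scores quoted in the conclusion, whose calculation the post does not describe. Titles are shortened.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image sections in this post.
results = [
    ("Lost in the Shadows", 0.7, 0.29, 0.30),
    ("Heroic Charge", 0.6, 0.31, 0.50),
    ("On the Verge of Explosion", 0.3, 0.32, 0.10),
    ("The Gamer's Fury", 0.6, 0.37, 0.10),
    ("Screaming in the Dark", 0.2, 0.30, 0.10),
    ("Unfazed by the Inferno", 0.6, 0.30, 0.20),
    ("Silent Tension", 0.6, 0.27, 0.20),
    ("Headphones Amplify the Rage", 0.4, 0.32, 0.10),
    ("Intense Gaze in the Shadows", 0.4, 0.33, 0.30),
    ("Warrior's Fury", 0.7, 0.25, 0.30),
]

mean_aesthetic = mean(row[1] for row in results)
mean_clip = mean(row[2] for row in results)
mean_ai = mean(row[3] for row in results)

# Image with the lowest aesthetic score and image with the highest CLIP score.
lowest = min(results, key=lambda row: row[1])
best_clip = max(results, key=lambda row: row[2])

print(f"mean aesthetic score:   {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print(f"lowest aesthetic score: {lowest[0]} ({lowest[1]})")
print(f"highest CLIP score:     {best_clip[0]} ({best_clip[2]})")
```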
Conclusion
The analysis shows that the model understood the described scenes reasonably well but struggled with camera position and aesthetics. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This indicates that the model did not accurately capture the camera position described in the prompt.
- Shot Analysis: The model scored 0.55, which falls within the “good” range. This means the model was able to understand the scene described in the prompt and create an image that reflects it reasonably well.
- Aesthetic Analysis: The model scored 0.29, which lies outside the “very good” range of -0.2 to 0.1. This suggests that the generated images did not match the aesthetic style described in the prompt.
Overall, the model shows promise in understanding the scene, but needs improvement in following the requested camera position and in capturing the desired aesthetic.
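Comparing each score with its quoted range is simple arithmetic, sketched below. The helper name `position` is hypothetical, and the sketch assumes that Shot Analysis uses the same “good” range of 0.5 to 0.75 that is quoted for Camera Position, since the post gives no separate range for it.

```python
def position(score: float, low: float, high: float) -> str:
    """Report whether a score is below, within, or above the closed range [low, high]."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (score, range low, range high) as quoted in the conclusion.
checks = {
    "Camera Position": (0.25, 0.5, 0.75),
    "Shot Analysis": (0.55, 0.5, 0.75),  # range assumed, see lead-in
    "Aesthetic Analysis": (0.29, -0.2, 0.1),
}

for name, (score, low, high) in checks.items():
    print(f"{name}: {score} is {position(score, low, high)} the range {low} to {high}")
```

Only Shot Analysis lands inside its range. Camera Position falls short of its range, and the Aesthetic Analysis score sits numerically above the upper bound of 0.1.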