AI Captures the Emotion, But Misses the Angle: A Look at Facial Expressions in Generative Art with Flux-schnell
Dramatic facial expressions are a powerful tool in storytelling: they convey a wide range of emotions and add depth to characters. From the furrowed brow of a hero facing a challenge to the triumphant smile of a villain achieving their goal, these expressions can instantly engage viewers and draw them into the narrative. For AI-generated imagery, capturing these expressions is crucial to creating compelling and realistic scenes. This blog post explores the results of a study of how well flux-schnell generates images with dramatic facial expressions. It uses ten prompts built around contempt and analyzes the model's strengths and weaknesses in capturing both the emotional nuances and the technical aspects of each scene, such as the requested camera position.
Created with: flux-schnell
Lost in the Neon Glow
A solitary figure navigates the vibrant, yet isolating, streets of a bustling city at night. The stark contrast between the bright neon signs and the lone figure evokes a sense of loneliness and urban melancholy.
Prompt
facial-expressions Contempt: Alienation, isolation, detachment ; A lone figure, back turned to the camera; eye-level; Single Person; A bustling city street at night, neon signs reflecting in puddles; cinematic
Characteristic
Shot : A man walking down a street at night, with blurred buildings and lights in the background.
Aesthetic Score : 0.4
Mood : melancholy, urban, lonely
Quality
Entropy : 6.56
Noise : 79
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and blur, which could be improved with editing.
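The prompt for this image, like every prompt in this post, is a semicolon-separated string with six segments. The sketch below splits such a prompt into named parts. The field names (`topic`, `emotion`, `subject`, `camera`, `character_type`, `setting`, `style`) are assumptions inferred from the prompts shown here, not names the study itself uses.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # All field names are illustrative labels, inferred from the prompt layout.
    topic: str
    emotion: str
    keywords: list[str]
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a six-segment, semicolon-separated prompt into named parts."""
    segments = [segment.strip() for segment in prompt.split(";")]
    if len(segments) != 6:
        raise ValueError(f"expected 6 segments, got {len(segments)}")
    header, subject, camera, character_type, setting, style = segments

    # The first segment looks like "facial-expressions Contempt: a, b, c".
    label, _, keyword_text = header.partition(":")
    topic, _, emotion = label.strip().partition(" ")
    keywords = [k.strip() for k in keyword_text.split(",") if k.strip()]

    return PromptParts(topic, emotion.strip(), keywords, subject,
                       camera, character_type, setting, style)


EXAMPLE_PROMPT = (
    "facial-expressions Contempt: Alienation, isolation, detachment ; "
    "A lone figure, back turned to the camera; eye-level; Single Person; "
    "A bustling city street at night, neon signs reflecting in puddles; cinematic"
)

parts = parse_prompt(EXAMPLE_PROMPT)
print(parts.emotion, "|", parts.camera, "|", parts.style)
```

Separating the camera segment ("eye-level" or "not too close") from the rest makes it easier to check each image against the specific instruction it was given.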
Superman’s Silhouette: A Symbol of Hope at Sunset
A powerful image captures Superman standing tall on a rooftop, his silhouette against the fiery sunset. The scene evokes a sense of heroism, drama, and hope, making it a visually striking and emotionally resonant moment.
Prompt
facial-expressions Contempt: Disillusionment, weariness, cynicism ; A superhero, standing on a rooftop, looking down at the city; eye-level; Hero; A cityscape bathed in the golden light of sunset; cinematic
Characteristic
Shot : A man dressed as Superman is standing on a rooftop with a city skyline in the background. The sun is setting and the sky is a mix of orange and yellow.
Aesthetic Score : 0.7
Mood : dramatic, powerful, heroic
Quality
Entropy : 6.75
Noise : 87
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no noticeable artifacts or errors in the image.
Intrigue in the Shadows: A Man in a Suit Stands Alone
A man in a suit stands in a dimly lit hallway, his expression serious and intense. The use of light and shadow creates a sense of mystery and intrigue, hinting at a dramatic scene unfolding. The image resembles a still from a movie or TV show set and captures a mood of professionalism and suspense.
Prompt
facial-expressions Contempt: Apathy, boredom, resignation ; A man in a suit, walking through a crowded office; eye-level; Normal People; A sterile, corporate office environment, fluorescent lights casting harsh shadows; cinematic
Characteristic
Shot : A man in a dark suit and tie stands in a hallway, looking directly at the camera. There are other people in the background, but they are out of focus.
Aesthetic Score : 0.7
Mood : serious, professional, intense
Quality
Entropy : 6.69
Noise : 70
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry in the background.
Lost in the Code: A Moment of Focused Intensity
A young man sits hunched over his keyboard, bathed in the soft glow of his monitor. The dimly lit room and his intense focus create an atmosphere of mystery and intrigue, hinting at a world of code and hidden secrets.
Prompt
facial-expressions Contempt: Obsessive, detached, nihilistic ; A gamer, hunched over a computer screen, eyes glued to the monitor; eye-level; Gamer; A dimly lit room, cluttered with gaming paraphernalia; cinematic
Characteristic
Shot : A young man is sitting in a dimly lit room, hunched over a computer keyboard, in front of multiple monitors. He appears to be focused on his work. There is a sense of concentration and perhaps even a hint of weariness in his posture.
Aesthetic Score : 0.6
Mood : serious, focused, contemplative
Quality
Entropy : 5.79
Noise : 55
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to be slightly overexposed. There is some noise in the dark areas of the image. Some minor chromatic aberration can be seen around the edges of the monitors.
Lost in Thought: A Moment of Melancholy at the Cafe
A woman sits alone at a cafe, her gaze distant and contemplative. The soft lighting and carefully composed scene evoke a sense of mystery and introspection, hinting at a story waiting to be told. Her quiet solitude speaks volumes about her inner world, leaving the viewer to ponder her thoughts and emotions.
Prompt
facial-expressions Contempt: Melancholy, loneliness, disillusionment ; A woman, sitting alone in a cafe, staring out the window; eye-level; Single Person; A rainy day, the cafe filled with the sound of rain and chatter; cinematic
Characteristic
Shot : A woman is sitting in a cafe, looking out of the window. She is wearing a grey sweater.
Aesthetic Score : 0.7
Mood : melancholic, contemplative, introspective
Quality
Entropy : 6.79
Noise : 87
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly grainy and has some noise, particularly in the background. There is a slight chromatic aberration visible around the edges of the frame.
Shadowy Figure in a Dark Alley: A Suspenseful Encounter
A man shrouded in mystery walks down a narrow, dimly lit alleyway. The figure’s silhouette is stark against the darkness, and a person lies motionless on the ground ahead. The scene evokes a sense of suspense and intrigue, leaving the viewer to wonder what transpired and what will happen next.
Prompt
facial-expressions Contempt: Superiority, arrogance, disdain ; A hero, standing over a defeated villain, looking down with disdain; not too close; Hero; A dark, gritty alleyway, lit by flickering streetlights; cinematic
Characteristic
Shot : A man in a dark coat and tie stands over a person lying on the ground in a dimly lit alleyway. The scene is dark and mysterious.
Aesthetic Score : 0.6
Mood : dark, suspenseful, foreboding
Quality
Entropy : 6.17
Noise : 69
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly grainy, especially in the shadows, but this doesn’t detract significantly from it.
Lost in the Crowd: A Moment of Reflection in the Urban Jungle
A young man navigates the bustling energy of a shopping mall, his face a study in quiet contemplation amidst the blur of activity. The image captures the casual, everyday moments of urban life, with a touch of dramatic focus on the individual’s experience.
Prompt
facial-expressions Contempt: Indifference, apathy, boredom ; A group of people, standing in a queue, looking bored and apathetic; eye-level; Normal People; A sterile, modern shopping mall, filled with the sounds of chatter and music; cinematic
Characteristic
Shot : A group of people walking through a crowded hallway, possibly a train station or airport. There is a sense of movement and transition.
Aesthetic Score : 0.6
Mood : casual, contemplative, expectant
Quality
Entropy : 6.86
Noise : 93
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some minor noise and compression artifacts, particularly in the background.
On the Edge: A Moment of High Tension
A man, his face etched with seriousness, grips a gun against a dimly lit, almost abstract background. The image pulsates with intensity, hinting at a moment of imminent danger and a story waiting to unfold.
Prompt
facial-expressions Contempt: Desensitization, aggression, detachment ; A gamer, playing a violent video game, his face contorted in a grimace; not too close; Gamer; A dimly lit room, filled with the sounds of explosions and gunfire; cinematic
Characteristic
Shot : A man with a determined expression is holding a gun, possibly in a video game. The background is dark and blurry, suggesting a nighttime or indoor setting.
Aesthetic Score : 0.4
Mood : intense, focused, edgy
Quality
Entropy : 6.44
Noise : 68
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, particularly in the shadows.
Intense Gaze in the Shadows
A man in a black jacket stares directly at the camera, his expression intense and enigmatic. The blurred forest background and dim lighting create a moody atmosphere, adding to the sense of drama and mystery.
Prompt
facial-expressions Contempt: Despair, loneliness, isolation ; A man, walking through a deserted park, his face etched with sadness; eye-level; Single Person; A park at dusk, the trees casting long shadows; cinematic
Characteristic
Shot : A man is standing in a park at dusk, looking at the camera with a serious expression. The background is slightly blurred and the lighting is soft and warm.
Aesthetic Score : 0.7
Mood : serious, melancholic, thoughtful
Quality
Entropy : 6.59
Noise : 83
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
A Soldier’s Burden: A Moment of Loss and Sacrifice
A lone soldier, clad in military garb, stands amidst a field of fallen figures, his rifle held tight. The dramatic composition, with a cloudy sky as a backdrop, evokes a sense of somber intensity and potential loss, highlighting the weight of the soldier’s burden.
Prompt
facial-expressions Contempt: Disillusionment, cynicism, weariness ; A hero, standing on a battlefield, surrounded by the carnage of war; not too close; Hero; A battlefield, littered with the bodies of fallen soldiers; cinematic
Characteristic
Shot : A lone man standing in a field of fallen soldiers, holding a rifle. The background is blurry and out of focus, suggesting a chaotic scene.
Aesthetic Score : 0.6
Mood : intense, grim, desolate
Quality
Entropy : 6.62
Noise : 98
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image quality is slightly grainy, and there are some minor artifacts around the edges of the man’s figure. There is a visible seam where the man’s face is joined to the body.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.24, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.57, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.16, which is considered very good. This means that the generated images closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated images is very good.
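As a rough cross-check, the per-image values listed in the entries above can be summarized directly. The sketch below hard-codes the Aesthetic Score, Prompt Clip Score, and Likelihood of AI for the ten images, in order of appearance, and reports the minimum, mean, and maximum of each. The variable and function names are illustrative. The post does not say how the scores in the breakdown were computed, so treating a plain mean as a meaningful summary is an assumption.

```python
from statistics import mean

# Per-image values copied from the entries in this post, in order of appearance.
aesthetic_scores = [0.4, 0.7, 0.7, 0.6, 0.7, 0.6, 0.6, 0.4, 0.7, 0.6]
prompt_clip_scores = [0.24, 0.27, 0.23, 0.17, 0.26, 0.28, 0.24, 0.30, 0.23, 0.22]
ai_likelihoods = [0.10, 0.30, 0.10, 0.10, 0.10, 0.30, 0.20, 0.20, 0.10, 0.70]


def summarize(values):
    """Return the minimum, mean (rounded to 3 decimals), and maximum."""
    return {
        "min": min(values),
        "mean": round(mean(values), 3),
        "max": max(values),
    }


for name, values in [
    ("Aesthetic Score", aesthetic_scores),
    ("Prompt Clip Score", prompt_clip_scores),
    ("Likelihood of AI", ai_likelihoods),
]:
    print(name, summarize(values))
```

On these values the mean Aesthetic Score is 0.6, the mean Prompt Clip Score is about 0.244, and the mean Likelihood of AI is 0.22. The battlefield image (0.70) is the clear outlier on the last measure.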
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://fal.ai/models/fal-ai/flux/schnell/api