AI's Facial Expressions: A Step Forward, But Still Room for Growth with Flux-dev
Facial expressions are a powerful tool in storytelling, conveying a wide range of emotions and adding depth to characters. In the realm of AI-generated imagery, capturing these nuances accurately is a significant challenge. This blog post examines the results of an AI model tasked with generating images based on detailed scene descriptions, focusing on the model’s ability to portray facial expressions in a dramatic and impactful manner. We’ll explore examples where the model excels and where it falls short, highlighting the ongoing journey towards creating AI-generated imagery that truly captures the essence of human emotion.
Created with: flux-dev
Intense Gaze in the Blue-Green Gloom
A close-up shot, shrouded in a bluish-green light, captures a man’s serious expression as he stares directly at the camera. The image, slightly out of focus, evokes a sense of tension and mystery, leaving the viewer questioning the story behind the intense gaze.
Prompt
facial-expressions Anxiety: Fear, anticipation ; A hero facing a menacing villain; medium shot; Hero; dark and ominous setting; cinematic
Characteristic
Shot : A close-up shot of a man’s face, illuminated by a single bluish light source that casts dramatic shadows. The subject appears intense and focused, possibly angry or suspicious.
Aesthetic Score : 0.6
Mood : intense, suspicious, dramatic
Quality
Entropy : 5.47
Noise : 39
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors in the image.
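Every prompt in this post follows the same semicolon-separated pattern, so the requested camera framing and emotions can be pulled out and compared with what the image actually shows. The sketch below is one way to do that in Python. The field names (category, emotion_group, camera, and so on) are my own labels for the six positions, not terms used by flux-dev or by the evaluation.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are illustrative labels (assumptions), not part of any flux-dev API.
    category: str
    emotion_group: str
    emotions: list[str]
    scene: str
    camera: str
    subject: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split '<category> <group>: <emotions> ; <scene>; <camera>; <subject>; <setting>; <style>'."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(parts)}")
    head, scene, camera, subject, setting, style = parts
    # The first field carries the category, the emotion group, and the emotion list.
    label, _, emotion_text = head.partition(":")
    category, _, emotion_group = label.strip().partition(" ")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return ScenePrompt(category, emotion_group, emotions,
                       scene, camera, subject, setting, style)


first = parse_prompt(
    "facial-expressions Anxiety: Fear, anticipation ; A hero facing a menacing villain; "
    "medium shot; Hero; dark and ominous setting; cinematic"
)
print(first.camera)    # requested camera framing
print(first.emotions)  # requested emotions
```

For the image above, the parsed camera field is a medium shot, while the Characteristic entry describes a close-up, which is the kind of mismatch the camera position score is meant to capture.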
Red Glow of Focus: A Man’s Intensity in the Digital Dark
A solitary figure hunches over a keyboard, bathed in the red glow of his screen. The dimly lit room amplifies the intensity of his focus, creating a mood of quiet determination. The dramatic lighting highlights the man’s dedication, capturing a moment of intense concentration in the digital age.
Prompt
facial-expressions Anxiety: Adrenaline, pressure ; A gamer’s hands frantically moving across a keyboard; close-up; Gamer; glowing computer screen; cinematic
Characteristic
Shot : A man is typing on a keyboard in a dimly lit room. The keyboard is illuminated red, the only source of light. There is a computer screen in the background, but the screen is not visible in detail.
Aesthetic Score : 0.6
Mood : focused, intense, dark
Quality
Entropy : 6.48
Noise : 62
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors are present in the image. However, the image could benefit from a bit more detail in the background and a more balanced exposure to better highlight the man’s face.
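The post does not say how the Entropy value is calculated. A common choice, and the one assumed here, is the Shannon entropy of the grayscale histogram measured in bits, which ranges from 0 for a flat image to 8 for an 8-bit image that uses every gray level equally. The reported values fit that scale, but treat the following as a sketch of the general idea rather than the evaluation's actual code.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("no pixels given")
    counts = Counter(pixels)
    # Sum -p * log2(p) over the gray levels that actually occur.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 1000             # a single gray level: no variation at all
uniform = list(range(256)) * 4  # every gray level used equally often

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a dark, low-contrast frame would score lower than a scene with a wide spread of tones.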
Lost in Thought: A Man Contemplates the Overcast Sky
A solitary figure stands in a dry, brown field, his gaze fixed on the distant horizon. The overcast sky casts a somber mood, mirroring the man’s pensive expression. The scene evokes a sense of melancholy and mystery, leaving the viewer to ponder the thoughts swirling in his mind.
Prompt
facial-expressions Anxiety: Loneliness, despair ; A man standing alone in a vast field; wide shot; Single Person; open sky with dark clouds; cinematic
Characteristic
Shot : A man is standing in a field looking up at the sky. The sky is cloudy and the mood is somber, perhaps reflecting the man’s thoughts.
Aesthetic Score : 0.6
Mood : somber, reflective, contemplative
Quality
Entropy : 6.73
Noise : 33
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors are evident.
The Man of Steel Stands Watch
A shadowy figure, cloaked in the iconic red and blue, stands atop a towering building, gazing out over a sprawling cityscape bathed in the hues of dusk. The image evokes a sense of drama and mystery, leaving the hero’s identity and intentions shrouded in intrigue.
Prompt
facial-expressions Anxiety: Pressure, responsibility ; A superhero standing on a rooftop; high angle; Hero; cityscape with flashing lights; cinematic
Characteristic
Shot : A man dressed as Superman looks down from a rooftop, with a cityscape in the background.
Aesthetic Score : 0.7
Mood : dramatic, heroic, contemplative
Quality
Entropy : 6.79
Noise : 81
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the background, and the subject’s face appears slightly blurred.
A Tense Standoff in the Fluorescent Hallway
A man, arms crossed and eyes locked on the viewer, stands in a crowded hallway bathed in harsh fluorescent light. The atmosphere is thick with tension, and the man’s focused gaze draws you into the heart of the suspense.
Prompt
facial-expressions Anxiety: Impatient, restless ; A person waiting in a long line; eye-level; Normal Person; crowded waiting room; cinematic
Characteristic
Shot : A man with a beard standing in a crowded hallway, looking directly at the camera. He is wearing a dark green jacket and has his arms crossed. Behind him, there are a few other people, but they are mostly blurred and out of focus.
Aesthetic Score : 0.5
Mood : serious, tense, contemplative
Quality
Entropy : 6.76
Noise : 61
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors
Mystery in the City Lights
A hooded figure stands alone in a bustling city street, their silhouette shrouded in darkness. The blurred city lights create a sense of mystery and intrigue, leaving you wondering who they are and what secrets they hold.
Prompt
facial-expressions Anxiety: Overwhelmed, isolated ; A lone figure; eye-level; Single Person; bustling city street at night; cinematic
Characteristic
Shot : A solitary figure in a hooded jacket stands in the middle of a city street at night. The background is out of focus and filled with colorful lights.
Aesthetic Score : 0.3
Mood : dark, mysterious, lonely
Quality
Entropy : 6.38
Noise : 46
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
Lost in the Code: A Silhouette of Focus
A young man, bathed in the cool glow of blue and purple lights, sits engrossed in his work. Headphones on, eyes fixed on the screen, he’s a picture of intense concentration. The dramatic play of light and shadow adds a layer of mystery, hinting at the depth of his focus and the secrets hidden within the code.
Prompt
facial-expressions Anxiety: Focused, intense ; A gamer hunched over a computer screen; close-up; Gamer; dimly lit room with flashing lights; cinematic
Characteristic
Shot : A young man wearing headphones is focused on his computer screen, likely playing a game, in a dimly lit room with a blue and purple ambiance. The scene is likely set in a home office or a gaming room.
Aesthetic Score : 0.6
Mood : focused, intense, mysterious
Quality
Entropy : 6.07
Noise : 50
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight graininess, likely due to low light conditions.
Lost in the City: A Moment of Introspection
A young woman navigates the bustling city streets, her face a canvas of mystery and contemplation. The blurred background emphasizes her isolation, drawing the viewer into her inner world. This image captures the essence of urban life, where moments of quiet reflection can be found amidst the chaos.
Prompt
facial-expressions Anxiety: Anxious, uncomfortable ; A woman walking down a crowded street; eye-level; Single Person; blurred background of people; cinematic
Characteristic
Shot : A woman is walking down a city street, with other people walking around her. The background is blurry, making the subject stand out.
Aesthetic Score : 0.7
Mood : mysterious, urban, contemplative
Quality
Entropy : 6.47
Noise : 56
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is some noise in the image, particularly in the background. The subject’s hair appears slightly blurry.
The Weight of the World: A Man’s Stressful Day
A man in a suit sits at his desk, head in hands, overwhelmed by the pressures of his day. The lighting and his posture create a palpable sense of tension and stress, hinting at a moment of deep frustration or thoughtful contemplation.
Prompt
facial-expressions Anxiety: Overwhelmed, stressed ; A person sitting at a desk, surrounded by paperwork; close-up; Normal Person; cluttered office; cinematic
Characteristic
Shot : A man is sitting at a desk, looking stressed. His hand is on his forehead, and he is looking down at some papers. The image is shot from a slightly low angle, giving the viewer a sense of intimacy with the subject.
Aesthetic Score : 0.6
Mood : stressed, pensive, worried
Quality
Entropy : 6.80
Noise : 65
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a slight blurriness to the image, particularly in the background, but it is not a significant issue.
Silhouette of Hope Amidst the Smoke
A solitary figure stands on a rooftop, their black coat blending with the shadows as they gaze out at a city shrouded in smoke. The scene evokes a sense of mystery and anticipation, hinting at a story of loss, resilience, and the unknown.
Prompt
facial-expressions Anxiety: Guilt, responsibility ; A hero looking out over a devastated city; high angle; Hero; destroyed buildings and smoke; cinematic
Characteristic
Shot : A lone figure in a dark coat stands on a rooftop, looking out at a cityscape obscured by smoke and fog. The buildings in the background are mostly out of focus, giving the scene a sense of isolation and mystery.
Aesthetic Score : 0.5
Mood : dark, mysterious, brooding
Quality
Entropy : 6.61
Noise : 56
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise and compression artifacts present in the image.
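To see the ten images side by side, the per-image numbers listed above can be averaged. The values in the sketch below are copied from the cards in this post; the tuple layout and the function name are my own choices for illustration.

```python
from statistics import mean

FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "ai_likelihood")

# One row per image, in the order the images appear in the post.
IMAGES = [
    (0.6, 5.47, 39, 0.20, 0.10),
    (0.6, 6.48, 62, 0.18, 0.10),
    (0.6, 6.73, 33, 0.21, 0.10),
    (0.7, 6.79, 81, 0.25, 0.20),
    (0.5, 6.76, 61, 0.17, 0.20),
    (0.3, 6.38, 46, 0.23, 0.20),
    (0.6, 6.07, 50, 0.22, 0.20),
    (0.7, 6.47, 56, 0.25, 0.30),
    (0.6, 6.80, 65, 0.26, 0.10),
    (0.5, 6.61, 56, 0.26, 0.20),
]


def summarize(rows):
    """Return the mean of each metric column, rounded to three decimals."""
    columns = zip(*rows)
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


summary = summarize(IMAGES)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The averages come out to roughly 0.57 for the aesthetic score, 6.46 for entropy, 54.9 for noise, 0.22 for the prompt CLIP score, and 0.17 for the likelihood of AI.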
Conclusion
The analysis shows that the model understood the described scenes reasonably well but struggled with camera position and aesthetics. Here’s a breakdown:
- Camera Position: The model scored 0.36, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t fully capture the camera position described in the prompt. The first image is an example: the prompt asked for a medium shot, and the result is a close-up.
- Shot Analysis: The model scored 0.53, which falls within the “good” range. This suggests that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.22, which lies outside the “very good” range of -0.2 to 0.1. This indicates that the aesthetic of the generated images deviated from the aesthetic described in the prompt.
Overall: While the model demonstrated a good understanding of the scene, it struggled to match the requested camera position and the desired aesthetic. This suggests that the model might need further training to better translate camera directions and aesthetic preferences into visual outputs.
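The range comparisons in the breakdown can be checked with a few lines of Python. The scores and the two quoted ranges come from the conclusion above. The post does not state the “good” range for Shot Analysis, so the sketch assumes it is the same 0.5 to 0.75 range quoted for Camera Position.

```python
# (score, low, high) per analysis. The Shot Analysis bounds are an assumption:
# the post only gives the "good" range explicitly for Camera Position.
ANALYSES = {
    "camera_position": (0.36, 0.5, 0.75),
    "shot_analysis": (0.53, 0.5, 0.75),
    "aesthetic_analysis": (0.22, -0.2, 0.1),
}


def within_range(score, low, high):
    """True when the score lies inside the target range, bounds included."""
    return low <= score <= high


results = {name: within_range(*values) for name, values in ANALYSES.items()}
for name, ok in results.items():
    print(f"{name}: {'inside' if ok else 'outside'} target range")
```

Only the shot analysis lands inside its target range. The camera position score falls below its lower bound, and the aesthetic score sits above its upper bound.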