AI's Struggle with Facial Expressions: A Look at the Gaps in Generative Models with Flux-pro
Facial expressions are a powerful tool for conveying emotions and intentions in human communication. They play a crucial role in storytelling, character development, and overall visual impact. However, generating realistic facial expressions in AI-generated images remains a significant challenge. This blog post explores how well a current generative AI model captures the nuances of human expressions, using ten images generated with flux-pro from prompts that all ask for a variation of frustration. For each image we look at camera position, scene understanding, and aesthetic appeal, and at how these factors contribute to the overall success of the result. Understanding where the model falls short gives useful insight into the future of AI-generated imagery and the potential for more expressive and engaging visuals.
Created with: flux-pro
Lost in Thought: A Moment of Quiet Reflection
A young woman, bathed in soft light, sits in a cluttered room, her hands resting on her cheeks as she gazes upwards with a thoughtful expression. The scene evokes a sense of quiet contemplation and curiosity, capturing a moment of introspection.
Prompt
facial-expressions Frustration: Overwhelmed and defeated ; A single person; eye-level; Single Persons; A cluttered apartment with overflowing laundry baskets and takeout containers.; cinematic
Characteristic
Shot : A young woman with long brown hair is sitting in a room with shelves in the background, she is looking up and has a thoughtful expression on her face. There are various items on the shelves, including baskets, towels, and a vase.
Aesthetic Score : 0.6
Mood : thoughtful, pensive, introspective
Quality
Entropy : 6.54
Noise : 79
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, particularly in the background. There is some noise in the shadows, but it is not overly distracting.
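Every prompt in this post follows the same semicolon-separated layout, so it can be split into its parts before comparing it with what the model produced. The following is a minimal sketch of that split. The field names (`category`, `emotion`, `nuance`, `subject`, `camera`, `group`, `scene`, `style`) are assumptions inferred from the prompts shown here, not names used by flux-pro or by the tool that built the prompts.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    # All field names are hypothetical labels chosen for this sketch.
    category: str  # e.g. "facial-expressions"
    emotion: str   # e.g. "Frustration"
    nuance: str    # e.g. "Overwhelmed and defeated"
    subject: str   # e.g. "A single person"
    camera: str    # e.g. "eye-level" or "close-up"
    group: str     # e.g. "Single Persons", "Heroes"
    scene: str     # free-text scene description
    style: str     # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptSpec:
    """Split a prompt of the form used in this post into its parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(parts)}")
    head, subject, camera, group, scene, style = parts
    # The head looks like "<category> <emotion>: <nuance>".
    label, _, nuance = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    return PromptSpec(
        category=category,
        emotion=emotion.strip(),
        nuance=nuance.strip(),
        subject=subject,
        camera=camera,
        group=group,
        scene=scene.rstrip("."),
        style=style,
    )


spec = parse_prompt(
    "facial-expressions Frustration: Overwhelmed and defeated ; "
    "A single person; eye-level; Single Persons; "
    "A cluttered apartment with overflowing laundry baskets "
    "and takeout containers.; cinematic"
)
print(spec.emotion, "|", spec.nuance, "|", spec.camera)
```

With the prompt split this way, the requested emotion and camera position can be compared directly against the Shot and Mood descriptions reported for each image.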
Superman’s Determined Gaze: A City Awaits
A man, clad in the iconic Superman suit, stares intensely at the camera, his expression serious and determined. The city behind him is blurred, creating a sense of anticipation and drama. This image captures the essence of a hero ready to face any challenge.
Prompt
facial-expressions Frustration: Powerless and angry ; A superhero; close-up; Heroes; A dark alley with flickering streetlights, the hero’s cape billowing in the wind.; cinematic
Characteristic
Shot : A man dressed as Superman is looking intensely at the viewer, likely in the middle of a fight or heroic act, with a blurred city background.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.68
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors, slight compression artifacts but not distracting
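The post does not say how the Entropy value is computed. One plausible reading, and it is only an assumption, is the Shannon entropy in bits of the image's 8-bit grayscale histogram, which has a maximum of 8 and would fit the reported range of roughly 6.4 to 6.9. A minimal sketch of that calculation over a flat list of pixel intensities:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of pixel intensities.

    Assumption: this mirrors the post's "Entropy" metric; the actual
    formula used by the evaluation tool is not documented in the post.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


# A flat image carries no information; a uniform spread over all
# 256 grey levels reaches the 8-bit maximum.
flat = [128] * 1000
uniform = list(range(256)) * 4
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this reading, higher values mean the intensities are spread more evenly across the available grey levels, which tends to go along with busier, more detailed scenes.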
Passionate Plea in the Heart of the City
A man in a suit, his face alight with conviction, delivers a powerful message amidst the blur of a bustling train or subway. The close-up framing draws you into the intensity of the moment, leaving you captivated by his urgency and passion.
Prompt
facial-expressions Frustration: Impatient and stressed ; A businessman; eye-level; Normal People; A crowded train with people pushing and shoving, the businessman trapped in the middle.; cinematic
Characteristic
Shot : A man in a suit is speaking passionately to a group of people in a crowded environment, likely a train or subway car.
Aesthetic Score : 0.6
Mood : intense, serious, focused
Quality
Entropy : 6.87
Noise : 78
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts and noise in the image, particularly in the background. The focus on the man’s face is slightly soft.
Lost in the Glow: A Man’s Focus Under Red and Blue Light
A solitary figure, bathed in the contrasting hues of red and blue, sits intently before a computer screen. The scene evokes a sense of focused concentration, mystery, and introspection, heightened by the dramatic interplay of light and shadow.
Prompt
facial-expressions Frustration: Focused but frustrated ; A gamer; close-up; Gamer; A dimly lit room with a computer screen displaying a frustratingly difficult level, the gamer’s hands shaking on the keyboard.; cinematic
Characteristic
Shot : A man is sitting in front of a computer monitor, with red and blue lighting illuminating the scene. The man is in a thoughtful pose, looking at the screen with a slightly furrowed brow.
Aesthetic Score : 0.7
Mood : intrigued, contemplative, focused
Quality
Entropy : 6.38
Noise : 63
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise and blurriness, particularly in the darker areas. This is likely due to a low aperture setting or poor lighting conditions.
A Moment of Melancholy in the Park
A woman sits alone on a bench, her head in her hands, lost in contemplation. The soft lighting and quiet park setting amplify her sense of loneliness and sadness.
Prompt
facial-expressions Frustration: Lonely and isolated ; A young woman; eye-level; Single Persons; A deserted park bench, the woman staring blankly at the ground, her phone lying forgotten beside her.; cinematic
Characteristic
Shot : A young woman sits on a bench with her head in her hands, looking distressed.
Aesthetic Score : 0.6
Mood : sad, lonely, pensive
Quality
Entropy : 6.82
Noise : 74
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant artifacts or errors.
Firefighter’s Silhouette: A Moment of Courage in the Flames
A dramatic image captures a firefighter in full gear, silhouetted against the blaze as they open a door in a burning building. The intense lighting and heroic pose convey a sense of urgency and danger, highlighting the bravery of those who fight fires.
Prompt
facial-expressions Frustration: Urgent and desperate ; A firefighter; close-up; Heroes; A burning building with smoke billowing out, the firefighter struggling to open a door.; cinematic
Characteristic
Shot : A firefighter in full gear is opening a door with flames visible in the background.
Aesthetic Score : 0.7
Mood : intense, focused, determined
Quality
Entropy : 6.90
Noise : 86
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight blur in the background, possibly due to motion.
Lost in Thought: A Moment of Contemplation in the Library
A young man sits at a table in a library, deeply engrossed in writing. The soft lighting and intimate composition highlight his pensive mood, capturing a moment of focused contemplation.
Prompt
facial-expressions Frustration: Overwhelmed and anxious ; A student; eye-level; Normal People; A crowded library with students hunched over books, the student staring at a blank page, their pen hovering over the paper.; cinematic
Characteristic
Shot : A young man is sitting at a desk in a library, deep in thought, while writing in a notebook with a pen. He is leaning on his hand and looks focused on his writing.
Aesthetic Score : 0.7
Mood : focused, contemplative, serious
Quality
Entropy : 6.56
Noise : 77
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor noise and artifacts, especially in the background. The sharpness could also be slightly better.
The Gamer’s Focus: A Moment of Intense Competition
A young man, lost in the world of his game, sits in a dimly lit room, his focus unwavering. The intensity of his concentration and the dramatic lighting create a sense of suspense, capturing the thrill of competitive gaming.
Prompt
facial-expressions Frustration: Focused and intense ; A gamer; close-up; Gamer; A brightly lit gaming tournament stage, the gamer staring at the screen, their controller gripped tightly in their hands.; cinematic
Characteristic
Shot : A young man wearing headphones and a black hoodie is sitting in a gaming chair and playing a video game. He is focused on the game and his expression is intense. There is another person in the background, but they are out of focus.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 6.71
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears slightly overexposed and the lighting is uneven. There are some minor image artifacts, including a slight blurring of the image.
Drowning in Debt: A Woman’s Struggle with Financial Burden
A poignant image captures the raw emotion of financial distress. A young woman sits at a table, surrounded by bills, her face etched with sadness and overwhelm. The scene speaks volumes about the weight of financial burdens and the struggle to stay afloat.
Prompt
facial-expressions Frustration: Exhausted and defeated ; A single mother; eye-level; Single Persons; A messy kitchen with dishes piled high in the sink, the single mother staring at a pile of bills, her shoulders slumped.; cinematic
Characteristic
Shot : A woman is sitting at a table in a kitchen, looking stressed. She is covered in papers, possibly bills or mail, and has her hands on her face.
Aesthetic Score : 0.5
Mood : sad, stressed, overwhelmed
Quality
Entropy : 6.69
Noise : 79
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and the colors are a little muted.
A Doctor’s Somber Gaze: Uncertainty in the Hospital Room
A doctor’s serious expression and the dim lighting create a sense of tension and uncertainty in this hospital room. The patient’s face is obscured, leaving their fate unknown. The bedside monitor and defibrillator in the background add to the sense of urgency and potential danger.
Prompt
facial-expressions Frustration: Concerned and helpless ; A doctor; close-up; Heroes; A hospital room with a patient hooked up to machines, the doctor looking at a medical chart with a furrowed brow.; cinematic
Characteristic
Shot : A doctor or nurse is looking at a patient in a hospital bed. The patient is lying in the bed, and the doctor is standing beside the bed looking down at them.
Aesthetic Score : 0.4
Mood : serious, concerned, somber
Quality
Entropy : 6.83
Noise : 60
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, such as a slight blurriness around the edges of the doctor’s face. The overall composition of the image is slightly static, and there is not much movement or action in the scene.
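Before drawing conclusions, it helps to put the ten results side by side. The sketch below copies the requested emotion, the reported Mood, the Aesthetic Score, and the Prompt Clip Score from the entries above, then computes the mean scores and counts how often the reported mood shares at least one word with the requested emotion. The exact-word comparison is a deliberately crude stand-in chosen for this sketch and is not the method behind the scores in the conclusion.

```python
import re
from statistics import mean

# (requested emotion, reported mood, aesthetic score, prompt CLIP score),
# copied from the ten entries in this post.
RESULTS = [
    ("Overwhelmed and defeated", "thoughtful, pensive, introspective", 0.6, 0.25),
    ("Powerless and angry", "intense, dramatic, heroic", 0.7, 0.26),
    ("Impatient and stressed", "intense, serious, focused", 0.6, 0.26),
    ("Focused but frustrated", "intrigued, contemplative, focused", 0.7, 0.20),
    ("Lonely and isolated", "sad, lonely, pensive", 0.6, 0.25),
    ("Urgent and desperate", "intense, focused, determined", 0.7, 0.28),
    ("Overwhelmed and anxious", "focused, contemplative, serious", 0.7, 0.25),
    ("Focused and intense", "intense, focused, competitive", 0.6, 0.28),
    ("Exhausted and defeated", "sad, stressed, overwhelmed", 0.5, 0.24),
    ("Concerned and helpless", "serious, concerned, somber", 0.4, 0.23),
]

STOP_WORDS = {"and", "but"}


def words(text):
    """Lower-case word set, ignoring connectives."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def shares_word(requested, mood):
    """True if the reported mood repeats any word of the requested emotion."""
    return bool(words(requested) & words(mood))


mean_aesthetic = mean(row[2] for row in RESULTS)
mean_clip = mean(row[3] for row in RESULTS)
matches = sum(shares_word(row[0], row[1]) for row in RESULTS)

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.2f}")
print(f"mood shares a word with the prompt: {matches} of {len(RESULTS)}")
```

On the values listed in this post, the mean aesthetic score is 0.61, the mean prompt CLIP score is 0.25, and the reported mood repeats a word from the requested emotion in 4 of 10 images. A word match misses near-synonyms such as "exhausted" and "overwhelmed", so the count is a rough lower bound rather than a measure of expression accuracy.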
Conclusion
The generative AI model performed well in terms of understanding the scene and matching the expected aesthetic, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.585, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.2, which is considered very good. Unlike the two scores above, a lower value appears to be better here: the rating indicates that the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall, the model shows promise in understanding the scene and achieving the desired aesthetic, but needs improvement in accurately capturing the intended camera position.
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://fal.ai/models/fal-ai/flux-pro/api