AI's Struggle with Emotional Depth: Can Machines Capture Human Expressions? with Flux-dev
The ability to convey emotion through facial expressions is a fundamental aspect of human communication. It’s a complex interplay of muscle movements, subtle nuances, and personal experiences that contribute to the richness of our emotional landscape. But can artificial intelligence replicate this intricate human ability? This blog post delves into an experiment that explores the limitations and potential of AI in capturing the nuances of facial expressions, using a series of prompts that challenge the model to generate images with specific emotional undertones.
Created with: flux-dev
Lost in Thought, Above the City
A man in a suit stands on a rooftop, his gaze fixed on the blurred cityscape below. The cloudy sky and hazy atmosphere create a sense of reflection and introspection, leaving the viewer to ponder his thoughts and the weight of the urban landscape.
Prompt
facial-expressions Shame: Disheartened, disillusioned, questioning his purpose ; A hero, standing on a rooftop, looking down at the city below; not too close; Hero; A panoramic view of the city, but he feels small and insignificant; cinematic
Characteristic
Shot : A man in a suit standing with his back to the camera looking out over a city skyline.
Aesthetic Score : 0.6
Mood : melancholic, contemplative, urban
Quality
Entropy : 6.34
Noise : 52
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors, though the image is quite soft and lacks sharpness.
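The post does not say how the Entropy value under Quality is calculated. One common reading, offered here only as an assumption, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 for a flat image to 8 bits for a perfectly even spread of grey levels. The reported values between 5.99 and 6.89 would fit that scale. A minimal sketch, where `shannon_entropy` is a hypothetical helper and not the tool used for this post:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of integer grey levels.

    Assumption: this is one plausible definition of the "Entropy" metric;
    the post itself does not document the formula.
    """
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    # Sum p * log2(1 / p) over every grey level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [128] * 16           # a single grey level everywhere
even_image = list(range(256))     # every 8-bit level exactly once

print(shannon_entropy(flat_image))  # 0.0
print(shannon_entropy(even_image))  # 8.0
```

Under this reading, a higher value means the tones are spread over more grey levels; it says nothing about whether the intended emotion was captured.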
Lost in the Crowd: A Moment of Solitude in the Urban Jungle
A solitary figure navigates the bustling city streets, swallowed by the anonymity of the crowd. The image captures a poignant sense of melancholy and isolation, emphasizing the man’s loneliness through the use of depth of field.
Prompt
facial-expressions Shame: Rejected, isolated, a sense of being unwanted ; A man, walking away from a group of people, his head down, his shoulders slumped; eye-level; Single Person; A bustling street, but he feels alone and invisible; cinematic
Characteristic
Shot : A man in a black coat is walking down a city street. The man is the subject of the photo and is in focus, while the background is blurred and out of focus.
Aesthetic Score : 0.5
Mood : melancholy, somber, lonely
Quality
Entropy : 6.43
Noise : 46
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some slight blurriness and the lighting is a bit flat. The man’s hair is also somewhat blurred.
Lost in the Neon Glow: A Moment of Focus and Mystery
A young man, bathed in the cool blue and pink hues of neon light, sits engrossed in his work. The dim room adds an air of mystery, while his focused expression hints at a story waiting to unfold. This image captures the essence of a focused, edgy, and intriguing moment.
Prompt
facial-expressions Shame: Despair, addiction, a sense of being lost ; A gamer, hunched over his keyboard, his fingers flying across the keys, but his eyes are filled with sadness; eye-level; Gamer; A brightly lit gaming room, but he feels trapped in a digital world; cinematic
Characteristic
Shot : A young man is sitting in a dimly lit room, focused on typing on a keyboard. He is surrounded by colorful lighting, creating a moody atmosphere.
Aesthetic Score : 0.6
Mood : focused, intense, techy
Quality
Entropy : 6.42
Noise : 67
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors
The Dark Knight Meets the Setting Sun
A close-up portrait of a man shrouded in the shadows of a Batman mask, silhouetted against a vibrant sunset. The image evokes a sense of mystery and brooding intensity, highlighting the dramatic contrast between the dark figure and the fiery sky.
Prompt
facial-expressions Shame: Melancholy, disillusioned, burdened ; A superhero, their mask removed, revealing a face etched with pain; eye-level; Hero; A cityscape bathed in the glow of a setting sun; cinematic
Characteristic
Shot : A close-up portrait of a man dressed as Batman, with the sun setting in the background.
Aesthetic Score : 0.7
Mood : dark, brooding, intense
Quality
Entropy : 6.40
Noise : 51
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible errors in the image.
Lost in the Fog: A Solitary Figure Navigates a Mysterious Urban Landscape
A lone figure walks through a dense fog-filled street at night, creating an eerie and isolated atmosphere. The dramatic use of shadows and lighting emphasizes the figure’s solitude against the backdrop of tall, dark buildings. This image evokes a sense of mystery and loneliness, capturing the essence of urban life at its most enigmatic.
Prompt
facial-expressions Shame: Desolate, lonely, regretful ; A lone figure, hunched over, walking down a deserted street; eye-level; Single Person; Rain-slicked pavement and flickering streetlights; cinematic
Characteristic
Shot : A lone figure walks down a foggy city street at night, with two other people in the distance. The street is wet and the light from the street lamps creates a moody atmosphere.
Aesthetic Score : 0.6
Mood : mysterious, lonely, somber
Quality
Entropy : 6.67
Noise : 78
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and grain, particularly in the shadows.
A Lone Warrior in a Desolate World
A man clad in armor stands amidst the ruins of a forgotten civilization. His stern expression and the gritty, post-apocalyptic landscape create a sense of tension and anticipation. The lighting casts long shadows, adding to the dramatic effect.
Prompt
facial-expressions Shame: Guilt, regret, a sense of responsibility ; A hero, standing in the ruins of a battle, his armor dented and his face covered in grime; not too close; Hero; A scene of destruction, a reminder of the cost of his actions; cinematic
Characteristic
Shot : A man in armor stands in a foggy, desolate landscape, facing the camera with a serious expression. Other figures in armor are visible in the background, but blurred.
Aesthetic Score : 0.7
Mood : serious, gritty, dramatic
Quality
Entropy : 6.80
Noise : 81
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears slightly blurry, particularly in the background. There are some minor artifacts in the armor.
Lost in the Glow: A Boy’s Intense Focus in a Neon-Lit Gaming Session
A young boy is completely absorbed in his video game, the blue and red lights casting dramatic shadows as he navigates the digital world. The scene captures the intensity and playful energy of gaming, highlighting the boy’s focused immersion.
Prompt
facial-expressions Shame: Empty, defeated, lost in a digital world ; A gamer, staring blankly at a screen, his controller lying idle; eye-level; Gamer; A dimly lit room filled with gaming paraphernalia, a sense of disconnection; cinematic
Characteristic
Shot : A young boy is playing a video game in a dimly lit room, the only light source is from the computer screen and a lamp in the background.
Aesthetic Score : 0.6
Mood : focused, pensive, concentrated
Quality
Entropy : 6.43
Noise : 62
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, especially around the edges of the screen.
Lost in the Shadows: A Man’s Solitary Moment
A man in a suit stands alone in a dimly lit room, bathed in warm light. The scene evokes a sense of mystery and intimacy, leaving the viewer to ponder his thoughts and the secrets he may hold.
Prompt
facial-expressions Shame: Anxious, self-conscious, out of place ; A man, standing in a crowded room, his eyes darting nervously around; eye-level; Single Person; A party scene, filled with laughter and conversation, but he feels isolated; cinematic
Characteristic
Shot : A man in a suit stands in the middle of a party, looking away from the camera, while other people are socializing around him. There are warm lights in the background.
Aesthetic Score : 0.6
Mood : mysterious, intimate, relaxed
Quality
Entropy : 5.99
Noise : 44
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight noise is visible in the background, and the edges of the image appear slightly blurry.
Lost in Thought: A Moment of Solitude and Sorrow
A woman sits alone in a dimly lit restaurant, her hands covering her face, lost in contemplation. The shallow depth of field isolates her in a sea of blurred background, emphasizing her sense of loneliness and sadness.
Prompt
facial-expressions Shame: Embarrassed, defeated, self-loathing ; A woman, her face buried in her hands, sitting alone at a crowded diner table; eye-level; Normal Person; The bustling activity of the diner, a stark contrast to her isolation; cinematic
Characteristic
Shot : A woman is sitting in a restaurant, with her head in her hands, looking distressed.
Aesthetic Score : 0.6
Mood : sad, lonely, introspective
Quality
Entropy : 6.87
Noise : 55
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears slightly blurry. There is a slight noise on the image.
A Moment of Quiet Reflection
A woman sits alone in a kitchen, her hands covering her face, lost in thought. The plate of untouched food suggests a heavy heart and a moment of quiet contemplation. The image evokes a sense of sadness and isolation, capturing the raw emotion of a solitary moment.
Prompt
facial-expressions Shame: Depressed, unmotivated, lost in her thoughts ; A woman, sitting at her kitchen table, staring at a plate of untouched food; eye-level; Normal Person; A cluttered kitchen, a reflection of her inner turmoil; cinematic
Characteristic
Shot : A woman with long brown hair is sitting at a kitchen table with her head in her hands, looking distressed. There is a plate of food in front of her.
Aesthetic Score : 0.6
Mood : sad, lonely, melancholic
Quality
Entropy : 6.89
Noise : 73
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The lighting is somewhat uneven, resulting in a slightly dark and shadowy image. There is some minor noise and grain in the image.
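The per-image numbers listed above are easier to compare once they are averaged. The sketch below copies the ten sets of values from the listings in order of appearance and computes the mean of each metric. The field names and the `summarise` helper are illustrative choices, not part of the original evaluation pipeline.

```python
from statistics import mean

# Values copied from the listings above, one tuple per image, in order:
# (aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
ROWS = [
    (0.6, 6.34, 52, 0.24, 0.20),
    (0.5, 6.43, 46, 0.27, 0.30),
    (0.6, 6.42, 67, 0.24, 0.20),
    (0.7, 6.40, 51, 0.27, 0.20),
    (0.6, 6.67, 78, 0.25, 0.10),
    (0.7, 6.80, 81, 0.27, 0.20),
    (0.6, 6.43, 62, 0.24, 0.10),
    (0.6, 5.99, 44, 0.22, 0.20),
    (0.6, 6.87, 55, 0.26, 0.10),
    (0.6, 6.89, 73, 0.28, 0.10),
]

# Hypothetical short names for the five metrics, used only in this sketch.
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarise(rows):
    """Return the mean of each metric column, rounded to three decimals."""
    columns = zip(*rows)  # transpose rows into one tuple per metric
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


summary = summarise(ROWS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Across the ten images this gives a mean Aesthetic Score of 0.61, a mean Prompt Clip Score of 0.254, and a mean Likelihood of AI of 0.17.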
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.29, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t quite capture the intended camera positions as described in the prompt.
- Shot Analysis: The model scored 0.63, falling within the “good” range. This suggests that the model was able to understand the scene and create a shot that was generally consistent with the prompt.
- Aesthetic Analysis: The model scored 0.19, which falls outside the “very good” range of -0.2 to 0.1, above its upper bound. This indicates that the generated images’ aesthetic deviated from the expected aesthetic described in the prompt.
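Whether a score sits below, within, or above a quoted range is easy to check mechanically. The sketch below does so for the two metrics whose ranges are stated explicitly in the breakdown; Shot Analysis is left out because its numeric range is not given. The `locate` function is a hypothetical helper, not part of the original scoring tool.

```python
def locate(score, low, high):
    """Say where a score sits relative to the closed range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (score, range low, range high), taken from the breakdown above.
CHECKS = {
    "Camera Position": (0.29, 0.5, 0.75),     # "good" range
    "Aesthetic Analysis": (0.19, -0.2, 0.1),  # "very good" range
}

report = {name: locate(*values) for name, values in CHECKS.items()}
for name, verdict in report.items():
    print(f"{name}: {verdict} the quoted range")
```

The camera position score comes out below its range, and the aesthetic score comes out above its range.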
Overall, the model demonstrated a decent understanding of the scene and shot composition, but needs improvement in matching the requested camera position and the desired aesthetic.