AI's Struggle with Facial Expressions: A Look at the Gaps in Generative Art with DALL-E 3
Facial expressions are a powerful tool in storytelling, conveying a wide range of emotions and adding depth to characters. However, generating realistic and emotionally impactful facial expressions remains a challenge for generative AI models. This article explores the limitations of AI in capturing the nuances of human expressions, using a case study of DALL-E 3’s output across ten scenes prompted for sadness. We’ll examine the model’s strengths and weaknesses in camera position, shot composition, and aesthetics, along with its struggles to capture the emotional depth of facial expressions. By understanding these limitations, we can better appreciate the potential and challenges of AI in creating visually compelling and emotionally resonant art.
Created with: dall-e-3
Autumn Melancholy: A Man Lost in Thought
A solitary figure sits on a park bench, surrounded by fallen leaves. The muted colors and the man’s contemplative posture evoke a sense of loneliness and introspection, capturing the essence of autumn’s melancholic mood.
Prompt
facial-expressions Sadness: Melancholy, loneliness ; A lone figure; eye-level; Single Person; Empty park bench with fallen leaves; cinematic
Characteristic
Shot : A man is sitting alone on a bench in a park with fallen leaves around him. The scene is hazy and the colors are muted.
Aesthetic Score : 0.7
Mood : melancholy, introspective, lonely
Quality
Entropy : 6.88
Noise : 99
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors or artifacts
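The article does not say how the Entropy figure on each card is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which runs from 0 bits (a single flat tone) to 8 bits (all 256 tones equally frequent). The function name and the toy pixel lists below are illustrative only, a minimal sketch rather than the author's pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every intensity that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical inputs: a flat gray image and one using all 256 tones equally.
flat_image = [128] * 1024
full_range_image = list(range(256)) * 4

flat_entropy = shannon_entropy(flat_image)
full_entropy = shannon_entropy(full_range_image)
```

Under that assumption, the values on these cards (roughly 6 to 7) would sit toward the upper end of the 0 to 8 scale, meaning the tones are spread fairly widely across the histogram.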
A Lonely Vigil: Superhero Contemplates the City in the Rain
A futuristic superhero stands alone on a bridge, bathed in the glow of city lights and the downpour of a stormy night. Their posture and the rain create a sense of isolation and contemplation, hinting at a dramatic and emotional story unfolding.
Prompt
facial-expressions Sadness: Despair, disillusionment ; A superhero in their costume; eye-level; Hero; City skyline at night, rain falling; cinematic
Characteristic
Shot : A lone superhero in a futuristic cityscape, standing on a wet, dark street with the rain falling heavily.
Aesthetic Score : 0.7
Mood : mysterious, futuristic, melancholic
Quality
Entropy : 6.57
Noise : 125
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lighting on the superhero’s face appears slightly artificial, and the rain effect is somewhat repetitive and unrealistic
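The Prompt Clip Score is not defined in the article either. A plausible interpretation, assumed here, is the cosine similarity between a CLIP embedding of the prompt and a CLIP embedding of the generated image, where higher means the image matches the prompt more closely. The sketch below shows only the similarity step, using made-up three-dimensional vectors in place of real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for a prompt embedding and an image embedding;
# real CLIP embeddings have hundreds of dimensions.
prompt_embedding = [1.0, 0.0, 0.0]
image_embedding = [0.6, 0.8, 0.0]

clip_like_score = cosine_similarity(prompt_embedding, image_embedding)
```

If the cards use this definition, the score measures agreement with the whole prompt string, not with the requested facial expression alone.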
Lost in Thought: A Moment of Melancholy
A woman sits alone at a dimly lit kitchen table, her head bowed over a cup of coffee. The fisheye lens draws the viewer into her intimate world, capturing a moment of quiet sadness and introspection. The somber mood and the woman’s isolated posture evoke a sense of contemplation and emotional weight.
Prompt
facial-expressions Sadness: Hopelessness, grief ; A woman sitting at a kitchen table; eye-level; Normal People; Empty coffee cup, unwashed dishes; cinematic
Characteristic
Shot : A woman sits alone in a kitchen, with her head in her hands, at a wooden table with a cup of coffee. The kitchen is simple and worn, with light green cabinets and a window overlooking a backyard. The scene is lit by a soft, warm light coming from the window.
Aesthetic Score : 0.6
Mood : melancholy, loneliness, introspection
Quality
Entropy : 6.90
Noise : 103
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight artifacts in the image, specifically a blurry edge and a graininess
The Gamer’s Focus: A Moment of Intense Competition
A young man, bathed in the warm glow of a lamp, sits hunched over his computer, headphones on, eyes glued to the screen. The pizza box and soda cans hint at the hours spent in this state of intense focus, fueled by the thrill of the game. The low lighting and his determined expression create a palpable sense of suspense, capturing the essence of competitive gaming.
Prompt
facial-expressions Sadness: Isolation, withdrawal ; A gamer hunched over their computer; close-up; Gamer; Empty pizza boxes, energy drink cans; cinematic
Characteristic
Shot : A young man sits at a desk in a dimly lit room, focused on his computer screen. There is a pizza box, empty soda cans, and a lamp on the desk. The image is shot from a low angle, creating a sense of intimacy and immersion in the scene.
Aesthetic Score : 0.6
Mood : intense, focused, dark
Quality
Entropy : 6.28
Noise : 91
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some slight blurring around the edges, which might be a result of post-processing.
Lost in the Shadows: A Boy’s Solitary Journey
A young boy stands alone in a dimly lit school hallway, bathed in the ethereal glow of a doorway. The contrasting light and shadow create a sense of isolation and mystery, highlighting the boy’s small figure in the vast space. The mood is somber and contemplative, hinting at a story of loneliness and introspection.
Prompt
facial-expressions Sadness: Loneliness, abandonment ; A child standing in a doorway; eye-level; Single Person; Empty hallway, dim lighting; cinematic
Characteristic
Shot : A young boy stands alone in a dimly lit school hallway, facing a doorway at the end of the hallway. Lockers line the walls on either side of him.
Aesthetic Score : 0.6
Mood : lonely, somber, introspective
Quality
Entropy : 6.72
Noise : 94
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears slightly blurred, especially in the background, and some of the details, such as the boy’s clothing, seem slightly over-sharpened, leading to a somewhat artificial look.
Solitude Amidst the Chaos
A lone soldier kneels in a war-torn landscape, smoke and explosions painting a backdrop of destruction. The composition captures the intensity of the battlefield and the soldier’s somber reflection, highlighting the stark reality of war.
Prompt
facial-expressions Sadness: Loss, regret ; A soldier kneeling on a battlefield; eye-level; Hero; Explosions in the distance, smoke filling the air; cinematic
Characteristic
Shot : A soldier in full gear, kneeling in a desert environment, with smoke and explosions in the background. He is looking down, with his hands clasped together as if in prayer.
Aesthetic Score : 0.6
Mood : dramatic, intense, solemn
Quality
Entropy : 6.82
Noise : 100
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The lighting is somewhat uneven, with the soldier being overexposed in some areas. The background also appears somewhat blurry, which may be intentional, but it could be sharper.
Silence Speaks Volumes: A Couple’s Uneasy Evening
A man and a woman sit in a dimly lit room, their expressions shrouded in melancholy. The bowl of popcorn in the foreground adds a touch of normalcy to the tense atmosphere, but the unspoken tension between them speaks volumes. What secrets are they keeping, and what will become of their relationship?
Prompt
facial-expressions Sadness: Silence, unspoken tension ; A couple sitting on a couch; eye-level; Normal People; Empty popcorn bowl, remote control on the floor; cinematic
Characteristic
Shot : A man and a woman are sitting on the floor in front of a couch, looking down. There is a bowl of popcorn in front of them and two remote controls on the floor.
Aesthetic Score : 0.5
Mood : sad, melancholic, pensive
Quality
Entropy : 6.72
Noise : 87
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, and the subjects’ faces are a bit pixelated. The lighting is uneven, making the image look somewhat flat.
Tears on the Keyboard: A Moment of Raw Emotion
A close-up shot captures a woman’s face as she cries while typing, highlighting her vulnerability and the intensity of her emotions. The dramatic lighting and focus on her face create a sense of intimacy and draw the viewer into her moment of sadness.
Prompt
facial-expressions Sadness: Frustration, defeat ; A gamer’s hands on a keyboard; close-up; Gamer; Screen displaying a game over message; cinematic
Characteristic
Shot : A close-up of a woman’s face with tears running down her cheeks as she types on a backlit keyboard. She looks upset and frustrated.
Aesthetic Score : 0.7
Mood : sadness, frustration, despair
Quality
Entropy : 6.51
Noise : 98
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : The tears look slightly artificial and the lighting is overly dramatic. A noticeable texture effect gives the image a plastic look.
Lost in the City: A Woman’s Intense Gaze Amidst the Blur
A solitary figure stands amidst the bustling city, her serious expression and piercing gaze drawing the viewer in. The blurred background creates a sense of isolation and suspense, leaving us wondering about her story and the secrets she holds.
Prompt
facial-expressions Sadness: Alienation, loneliness ; A woman walking down a crowded street; eye-level; Single Person; People passing by, oblivious to her; cinematic
Characteristic
Shot : A woman stands out in a crowd of people walking down a busy street in a city. The people in the background are blurred, drawing attention to the woman in the foreground.
Aesthetic Score : 0.7
Mood : intense, focused, urban
Quality
Entropy : 6.98
Noise : 103
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible artifacts or errors
A Shadowy Figure in the City’s Embrace
A woman walks through the neon-lit streets, her face obscured by the darkness. The camera, held by an unseen hand, captures her every move, creating a sense of mystery and suspense. The urban landscape becomes a backdrop for a story waiting to unfold.
Prompt
facial-expressions Sadness: Reflection, introspection ; A hero standing on a rooftop; eye-level; Hero; City lights twinkling in the distance; cinematic
Characteristic
Shot : A woman is being photographed through a camera lens. The background is a cityscape at night.
Aesthetic Score : 0.7
Mood : mysterious, moody, dramatic
Quality
Entropy : 5.96
Noise : 74
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts in the form of color banding in the dark areas. There are also some areas of the image that are slightly blurry.
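Taken together, the ten cards can be summarized with a few lines of code. The figures below are copied from the cards; the shortened titles, the variable names, and the 0.5 cut-off used to flag an image as likely AI-generated are assumptions made for this sketch, not part of the original evaluation.

```python
from statistics import mean

# (short title, aesthetic score, prompt clip score, likelihood of AI)
cards = [
    ("Autumn Melancholy", 0.7, 0.29, 0.20),
    ("A Lonely Vigil", 0.7, 0.23, 0.80),
    ("Lost in Thought", 0.6, 0.26, 0.20),
    ("The Gamer's Focus", 0.6, 0.28, 0.30),
    ("Lost in the Shadows", 0.6, 0.18, 0.70),
    ("Solitude Amidst the Chaos", 0.6, 0.28, 0.20),
    ("Silence Speaks Volumes", 0.5, 0.27, 0.30),
    ("Tears on the Keyboard", 0.7, 0.30, 0.90),
    ("Lost in the City", 0.7, 0.19, 0.30),
    ("A Shadowy Figure", 0.7, 0.21, 0.20),
]

mean_aesthetic = mean(card[1] for card in cards)
mean_clip = mean(card[2] for card in cards)

# Assumed threshold: treat a likelihood of 0.5 or more as "flagged".
flagged = [card[0] for card in cards if card[3] >= 0.5]

# The card whose image agrees least with its prompt, by clip score.
weakest_match = min(cards, key=lambda card: card[2])[0]

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt clip score: {mean_clip:.3f}")
print(f"flagged as likely AI: {flagged}")
print(f"lowest prompt clip score: {weakest_match}")
```

With these inputs the mean aesthetic score is 0.64 and the mean prompt clip score is about 0.249. Three of the ten images are flagged at the assumed threshold, and "Lost in the Shadows" has the lowest prompt clip score.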
Conclusion
The analysis shows that the generative AI model was strongest on shot analysis and fell short on both camera position and aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.25
- Interpretation: This score indicates that the model’s ability to understand and implement camera positions in the generated image is below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
Shot Analysis:
- Score: 0.51
- Interpretation: This score indicates that the model’s ability to understand and create the desired shot composition is average. At 0.51 it only just reaches the 0.5 to 0.75 range that would be considered good; a score above 0.75 would be very good.
Aesthetic Analysis:
- Score: 0.17
- Interpretation: This score indicates that the model’s ability to match the expected aesthetic of the image is below average. Unlike the two scores above, the target here is a band around zero: a score between -0.2 and 0.1 would be considered very good, and 0.17 falls outside it. This suggests that the generated images may not have the desired visual style or feel.
Overall:
The model demonstrates a reasonable grasp of shot composition, but it is weaker on camera position and struggles to achieve the desired aesthetic. This suggests that the model may need further training to improve its ability to capture the intended visual style.
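The thresholds quoted in the breakdown can be written as a small lookup, which makes it easier to see where each score lands. This is a sketch under stated assumptions: the function names are hypothetical, the article does not say whether the band edges are inclusive, and it names no label for camera or shot scores under 0.5 or for aesthetic scores outside the band, so the fallback labels are placeholders.

```python
def rate_position_or_shot(score):
    """Label a camera-position or shot-analysis score with the quoted bands."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:  # assumed inclusive lower edge
        return "good"
    return "below the good range"  # placeholder label, not from the article


def rate_aesthetic(score):
    """Label an aesthetic score; the quoted target is a band around zero."""
    if -0.2 <= score <= 0.1:  # assumed inclusive edges
        return "very good"
    return "outside the very good band"  # placeholder label


ratings = {
    "camera position": rate_position_or_shot(0.25),
    "shot analysis": rate_position_or_shot(0.51),
    "aesthetic analysis": rate_aesthetic(0.17),
}

for metric, label in ratings.items():
    print(f"{metric}: {label}")
```

Read this way, only shot analysis reaches a named band, and it does so by a margin of 0.01.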
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://openai.com/index/dall-e-3/