AI Captures Camera Angles but Struggles with Facial Expressions in DALL-E 3
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. In the realm of AI-generated imagery, capturing these nuances accurately is crucial. This blog post examines the performance of a generative AI model in creating images with specific facial expressions, exploring its strengths and weaknesses in capturing the essence of human emotion.
Created with: dall-e-3
Lost in the City’s Embrace
A solitary figure sits on a bench, dwarfed by the sprawling cityscape at dusk. The image captures a poignant sense of loneliness and contemplation amidst the urban bustle.
Prompt
facial-expressions Attentiveness: Melancholy, yet observant ; A lone figure sitting on a park bench; eye-level; Single Person; bustling city park in the background; cinematic
Characteristic
Shot : A lone figure sits on a bench in a bustling city square, surrounded by a crowd of people. The scene is lit by street lamps and the sun is setting in the distance.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, urban
Quality
Entropy : 6.79
Noise : 105
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image has a few artifacts, such as the blurry background and the slightly out-of-focus people.
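The post does not say how the Entropy value under Quality is computed. One common reading, consistent with values between 6.5 and 6.9, is the Shannon entropy of an 8-bit grayscale histogram, which tops out at 8 bits. The sketch below works under that assumption only; the function name `shannon_entropy` is hypothetical, and loading real image pixels (which needs an imaging library) is left out.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of 8-bit gray levels.

    Assumption: this mirrors one common definition of image entropy;
    the post does not confirm that its Entropy metric is computed this way.
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every gray level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# Two synthetic 64x64 "images", flattened to lists of gray levels.
flat_image = [128] * (64 * 64)           # a single gray level everywhere
uniform_image = list(range(256)) * 16    # all 256 levels equally frequent

flat_entropy = shannon_entropy(flat_image)
uniform_entropy = shannon_entropy(uniform_image)
print(f"flat: {flat_entropy:.2f} bits, uniform: {uniform_entropy:.2f} bits")
```

Under this definition a single-tone image carries no information and an image that uses all 256 levels equally reaches the 8-bit maximum, so a reading near 6.8 would indicate a wide but not perfectly even spread of tones.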
She Stands Tall: A Superhero’s Nighttime Vigil
A powerful image of a superheroine, bathed in dramatic light and shadow, stands against a vibrant cityscape. Her determined gaze and flowing cape convey strength and resolve, promising a thrilling story to unfold.
Prompt
facial-expressions Attentiveness: Determined, vigilant ; A superhero standing on a rooftop, looking out over the city; eye-level; Hero; cityscape with twinkling lights; cinematic
Characteristic
Shot : A woman dressed as a superhero stands in front of a city skyline at night. She is adjusting her cape and looking determined.
Aesthetic Score : 0.7
Mood : powerful, confident, heroic
Quality
Entropy : 6.89
Noise : 101
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have some slight blurring and artifacts, particularly in the background. The woman’s hair and clothing also appear slightly unnatural.
Lost in the Pages: A Moment of Tranquility Amidst the Chaos
A woman finds solace in a book on a bustling train. The low-angle perspective and selective focus draw you into her world, highlighting her contemplative mood and the intimacy of her reading experience.
Prompt
facial-expressions Attentiveness: Focused, absorbed ; A woman reading a book on a train; eye-level; Normal Person; blurred passengers and train windows; cinematic
Characteristic
Shot : A woman reading a book on a train, with other passengers blurred in the background.
Aesthetic Score : 0.6
Mood : tranquil, contemplative, solitary
Quality
Entropy : 6.59
Noise : 82
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a bit of a hazy, blurry quality, particularly in the background, which is likely a stylistic choice. The book’s pages are slightly overexposed, leading to a loss of detail in the text. This blurriness, although artistic, could be considered an error by some.
The Glow of Focus: A Young Man Immersed in His Work
A young man sits hunched over his computer, his face illuminated by the screen’s glow. The low-key lighting and his intense expression suggest a moment of deep focus and excitement, as if he’s on the verge of a breakthrough. The blurry background hints at a personal space, perhaps his home, where he’s fully immersed in his work.
Prompt
facial-expressions Attentiveness: Thrilled, competitive ; A gamer intensely focused on a screen, fingers flying across the keyboard; close-up; Gamer; dimly lit room with glowing monitor; cinematic
Characteristic
Shot : A man sits at a computer, typing on a keyboard and looking at a screen that is outside the frame. His mouth is open in a surprised expression.
Aesthetic Score : 0.6
Mood : excited, energetic, surprise
Quality
Entropy : 6.52
Noise : 75
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some noise, mainly in the background. The lighting is harsh and creates strong shadows. The focus is on the keyboard, so the face is not sharp. The image appears heavily digitally edited, and some facial details, such as the beard, look unnatural, especially around the mouth.
Lost in the City’s Symphony
A solitary figure stands amidst the bustling urban landscape, his gaze fixed on the ground, lost in thought. The blurry crowd behind him underscores his isolation, creating a poignant scene of melancholy and contemplation.
Prompt
facial-expressions Attentiveness: Lost in thought, introspective ; A man walking down a crowded street, seemingly oblivious to the chaos around him; eye-level; Single Person; bustling city street with people and traffic; cinematic
Characteristic
Shot : A man walks down a busy city street crowded with people and traffic.
Aesthetic Score : 0.6
Mood : melancholy, city life, urban
Quality
Entropy : 6.70
Noise : 87
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The subject’s face and the background are slightly blurry. The colors are a bit muted, and the contrast is low. There are also some minor artifacts in the background.
Through the Sniper’s Scope: A Soldier’s Moment of Truth
A dramatic scene unfolds through the lens of a sniper scope. A lone soldier kneels in the foreground, rifle in hand, facing a line of comrades charging towards a massive explosion. The intensity of the moment is palpable, capturing the chaos and determination of war.
Prompt
facial-expressions Attentiveness: Brave, fearless ; A hero standing in the middle of a battle, eyes locked on the enemy; eye-level; Hero; chaotic battlefield with explosions and smoke; cinematic
Characteristic
Shot : A soldier is kneeling in a field, aiming his rifle at the viewer, with a blurry background of a group of soldiers charging and a large explosion in the distance. The scene is framed by a sniper scope.
Aesthetic Score : 0.6
Mood : tense, dramatic, action-packed
Quality
Entropy : 6.84
Noise : 95
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.70
Image errors : The lighting and shadows are somewhat unnatural and the background is overly blurry.
An Intimate Storytelling Moment Shared Between Generations
In a cozy, dimly lit room, a young girl is captivated by an older woman’s story, as they sit together on the floor. The warm light from lamps creates an intimate atmosphere, while the focused expressions of both characters evoke a sense of anticipation and draw viewers into the heartwarming scene.
Prompt
facial-expressions Attentiveness: Curious, engaged ; A young girl listening intently to her grandmother tell a story; eye-level; Normal Person; cozy living room with warm lighting; cinematic
Characteristic
Shot : A young girl is lying on the floor with an open book in front of her, looking up at an older woman who is sitting on the floor and gesticulating with her hands.
Aesthetic Score : 0.7
Mood : warm, intimate, attentive
Quality
Entropy : 6.69
Noise : 89
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, especially around the edges. There are also some minor artifacts in the background.
The Joy of Gaming: Friends United in Virtual Worlds
A group of friends gather around, their faces lit with excitement as they engage in a thrilling video game session. The energy is palpable, capturing the pure joy and camaraderie of shared gaming experiences.
Prompt
facial-expressions Attentiveness: Joyful, triumphant ; A gamer celebrating a victory, eyes wide with excitement; close-up; Gamer; brightly lit room with cheering friends; cinematic
Characteristic
Shot : A group of friends are playing video games. The main subject is a man in the foreground, holding a game controller with a joyful expression on his face. His friends are cheering behind him, suggesting a victorious moment in the game. There are celebratory decorations like streamers and confetti visible in the background.
Aesthetic Score : 0.6
Mood : joyful, exciting, celebratory
Quality
Entropy : 6.70
Noise : 88
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.60
Image errors : The man’s skin and hair have a slightly unnatural texture, potentially caused by AI editing or over-sharpening.
Lost in Thought: A Woman’s Solitary Moment
A woman sits alone in a bustling cafe, her gaze fixed directly on the viewer. The background blurs, isolating her in a world of her own. Her expression is a mix of mystery and pensiveness, hinting at a story waiting to be told.
Prompt
facial-expressions Attentiveness: Observant, introspective ; A woman sitting alone in a cafe, observing the people around her; eye-level; Single Person; bustling cafe with tables and chairs; cinematic
Characteristic
Shot : A woman sitting in a cafe, looking directly at the camera, with other customers blurred in the background.
Aesthetic Score : 0.6
Mood : mysterious, pensive, introspective
Quality
Entropy : 6.74
Noise : 80
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly grainy and there are some minor artifacts.
Awe-Inspiring Sunrise Over Majestic Mountains
A solitary figure stands in a cave, gazing out at a breathtaking vista. The misty valley below and the rising sun behind the clouds create a scene of serene beauty and wonder. The man’s contemplative pose and the dramatic lighting evoke a sense of awe at the majesty of the natural world.
Prompt
facial-expressions Attentiveness: Reflective, contemplative ; A hero standing on a cliff, looking out at the vast landscape; eye-level; Hero; dramatic mountain range with clouds and sunlight; cinematic
Characteristic
Shot : A man stands in a cave opening looking out at a mountainous landscape during sunset.
Aesthetic Score : 0.7
Mood : mysterious, contemplative, hopeful
Quality
Entropy : 6.90
Noise : 96
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the textures, especially the mountains, look slightly artificial and unnatural.
Conclusion
The analysis shows that the generative AI model performed well on camera position and shot analysis but poorly on aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.45, which is considered good. The generated images generally matched the camera position given in the prompt.
- Shot Analysis: The model scored 0.595, also considered good. The generated shots generally reflected the scene described in the prompt.
- Aesthetic Analysis: The model scored 0.15, which is poor. The aesthetic of the generated images deviated significantly from what the prompt called for.
Overall, the model handles camera position and shot composition well but needs improvement in matching the desired aesthetic.
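For readers who want to compare the ten images side by side, the per-image figures listed under each entry can be averaged with a few lines of Python. This is only a sketch: the short labels and the `summarize` helper are hypothetical, and the values are transcribed from the entries above. These averages are separate from the Camera Position, Shot Analysis, and Aesthetic Analysis scores, whose computation the post does not describe.

```python
from statistics import mean

# Values transcribed from the ten entries in this post, in order of appearance.
# Columns: label (hypothetical), aesthetic score, entropy, noise,
#          prompt CLIP score, likelihood of AI.
images = [
    ("park bench",   0.7, 6.79, 105, 0.27, 0.60),
    ("superheroine", 0.7, 6.89, 101, 0.24, 0.80),
    ("train reader", 0.6, 6.59,  82, 0.31, 0.20),
    ("solo gamer",   0.6, 6.52,  75, 0.28, 0.80),
    ("city walker",  0.6, 6.70,  87, 0.26, 0.20),
    ("soldier",      0.6, 6.84,  95, 0.23, 0.70),
    ("storytelling", 0.7, 6.69,  89, 0.31, 0.10),
    ("gaming group", 0.6, 6.70,  88, 0.30, 0.60),
    ("cafe",         0.6, 6.74,  80, 0.25, 0.10),
    ("mountains",    0.7, 6.90,  96, 0.26, 0.80),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

def summarize(rows):
    """Return the mean of each numeric column, keyed by field name."""
    columns = zip(*(row[1:] for row in rows))
    return {name: mean(col) for name, col in zip(FIELDS, columns)}

summary = summarize(images)
for name, value in summary.items():
    print(f"{name:>14}: {value:.3f}")

# Count the images whose reported AI likelihood is 0.60 or higher.
flagged = [label for label, *_, ai in images if ai >= 0.60]
print(f"{len(flagged)} of {len(images)} images have AI likelihood >= 0.60")
```

The averages hide a wide spread in Likelihood of AI, which ranges from 0.10 to 0.80 across the set, so the per-image values remain the more informative view.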