AI's Facial Expressions: A Mixed Bag of Success with Leonardo-ai
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions with a single glance. In the realm of AI-generated imagery, the ability to accurately depict these expressions is crucial for creating compelling and engaging visuals. This blog post examines the performance of a generative AI model in capturing facial expressions, analyzing its strengths and weaknesses in understanding scene context, camera position, and aesthetic style.
Created with: leonardo-ai
The Weight of Unfinished Dreams
A solitary figure sits amidst the scattered pieces of a half-completed jigsaw puzzle, his posture heavy with melancholy. The unfinished task mirrors a sense of stagnation and the weight of unfulfilled aspirations, leaving a poignant impression of isolation and defeat.
Prompt
facial-expressions Boredom: Apathy and resignation. ; A single person; eye-level; Single Persons; A cluttered apartment with unwashed dishes and a half-finished puzzle on the table.; cinematic
Characteristic
Shot : A man sits at a table with a half-finished jigsaw puzzle spread out in front of him. He looks dejected and has his head in his hands.
Aesthetic Score : 0.6
Mood : sad, contemplative, frustrated
Quality
Entropy : 6.80
Noise : 93
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but the image could benefit from better lighting and possibly some cropping
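Every prompt in this post follows the same semicolon-separated pattern. The sketch below splits one prompt into its parts. It assumes the field order is emotion, subject, camera position, category, scene and style; the field names and the `parse_prompt` helper are illustrative labels, not part of any Leonardo-ai API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative labels (an assumption), not a Leonardo-ai schema.
    emotion: str
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six assumed fields."""
    fields = [field.strip() for field in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return PromptParts(*fields)


# The prompt of the first image, copied from the record above.
EXAMPLE_PROMPT = (
    "facial-expressions Boredom: Apathy and resignation. ; A single person; "
    "eye-level; Single Persons; A cluttered apartment with unwashed dishes "
    "and a half-finished puzzle on the table.; cinematic"
)

parts = parse_prompt(EXAMPLE_PROMPT)
print(parts.camera, "|", parts.style)
```

Splitting the prompt this way makes it easier to compare each requested element (for example the camera position) against the "Shot" description reported for the generated image.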
On the Edge: A Shadow Prepares to Strike
A lone figure, shrouded in darkness, stands on a rooftop overlooking a city veiled in mist. The air crackles with tension, hinting at a dangerous plan unfolding. This image captures the raw intensity and mystery of a moment poised on the brink of action.
Prompt
facial-expressions Boredom: Disillusionment and weariness. ; A superhero; eye-level; Heroes; A deserted cityscape with crumbling buildings and graffiti.; cinematic
Characteristic
Shot : A man wearing a dark suit and a mask, standing on a rooftop overlooking a city. The sky is overcast, and there is a sense of foreboding in the air.
Aesthetic Score : 0.7
Mood : dark, gritty, intense
Quality
Entropy : 6.88
Noise : 93
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts present in the image, particularly in the background and around the man’s face.
Worried Glance: A Moment of Mystery on Public Transit
A woman sits on a bus or train, her face illuminated by the screen of her phone. Her worried expression and the shadowy lighting create a sense of intrigue, leaving us to wonder what troubles her.
Prompt
facial-expressions Boredom: Annoyance and detachment. ; A young woman; eye-level; Normal People; A crowded bus with people staring at their phones.; cinematic
Characteristic
Shot : A young woman sitting on public transit, looking worried, possibly scared. She is holding a smartphone.
Aesthetic Score : 0.6
Mood : anxious, worried, concerned
Quality
Entropy : 6.72
Noise : 96
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable errors found.
In the Shadows of Focus: A Man’s Intense Concentration
A solitary figure, bathed in the glow of a computer screen, sits in a dimly lit room. Headphones on, his face etched with focus, he’s engrossed in a task that demands his full attention. The atmosphere is charged with suspense, leaving the viewer to wonder what secrets lie within the digital realm.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A gamer; close-up; Gamer; A dimly lit room with a computer screen displaying a paused game.; cinematic
Characteristic
Shot : A man is sitting in a dark room, looking intently at a computer screen. He is wearing a headset and has a serious expression on his face.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 5.88
Noise : 87
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor noise and grain, especially in the darker areas. The lighting is a bit uneven, with some areas being overexposed and others underexposed.
Autumn Reflections: A Moment of Contemplation
An elderly man sits on a park bench, surrounded by fallen leaves, lost in thought. The blurred background and soft lighting create a melancholic mood, highlighting the man’s weathered face and the passage of time.
Prompt
facial-expressions Boredom: Melancholy and loneliness. ; An elderly man; eye-level; Single Persons; A park bench with fallen leaves and a deserted playground.; cinematic
Characteristic
Shot : An elderly man sits on a green bench in a park, surrounded by fallen autumn leaves. The background features trees with blurred foliage, suggesting a peaceful autumnal setting.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, serene
Quality
Entropy : 6.95
Noise : 93
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major artifacts or errors are noticeable. The image exhibits subtle noise, primarily in the background, but this is not excessive.
A Man in the Shadows: Secrets and Suspense in a Dimly Lit Room
A man in a suit sits at a cluttered desk, bathed in the soft glow of a lamp. His concerned expression and the dimly lit room create an atmosphere of suspense and mystery. What secrets are hidden in the shadows?
Prompt
facial-expressions Boredom: Frustration and boredom. ; A detective; eye-level; Heroes; A dimly lit office with stacks of unsolved cases and a flickering neon sign.; cinematic
Characteristic
Shot : A man in a suit sits at a desk in a dimly lit office. The office is cluttered with papers and documents, and there are posters on the wall with some Cyrillic text.
Aesthetic Score : 0.7
Mood : suspenseful, mysterious, retro
Quality
Entropy : 6.18
Noise : 90
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors, though there are some minor blemishes on the posters.
A Silent Tension: A Couple’s Uneasy Dinner
In a dimly lit restaurant, a couple sits at a table, and their silence speaks volumes. The woman’s gaze is fixed on the man, who looks away, creating an atmosphere of unspoken tension and anticipation. The scene is both introspective and unsettling, hinting at a deeper story waiting to unfold.
Prompt
facial-expressions Boredom: Awkward silence and boredom. ; A young couple; eye-level; Normal People; A restaurant table with empty plates and a half-finished bottle of wine.; cinematic
Characteristic
Shot : A couple sitting at a table in a dimly lit restaurant. The woman is looking at the man, while the man looks away. There is a plate and glasses of wine on the table.
Aesthetic Score : 0.6
Mood : tense, somber, conflicted
Quality
Entropy : 6.52
Noise : 92
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors, but the lighting could be more balanced.
Lost in the Code: A Moment of Intense Focus
A solitary figure, bathed in the soft glow of the screen, is completely engrossed in their work. The dimly lit room and the man’s focused gaze create a sense of intensity and dedication, capturing the essence of deep concentration.
Prompt
facial-expressions Boredom: Monotony and boredom. ; A gamer; close-up; Gamer; A brightly lit room with a computer screen displaying a repetitive, simple game.; cinematic
Characteristic
Shot : A man wearing headphones is sitting in front of a computer. He is focused on the screen and typing something on the keyboard.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.27
Noise : 87
Prompt Clip Score : 0.13
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are minor artifacts and noise in the image, particularly in the shadows and the background.
Lost in Thought: A Woman’s Melancholy Journey
A woman with long brown hair gazes out the train window, her serious expression hinting at a world of unspoken emotions. The moody lighting and her introspective demeanor create an atmosphere of mystery and suspense, leaving the viewer to ponder her thoughts and the destination of her journey.
Prompt
facial-expressions Boredom: Isolation and boredom. ; A woman; eye-level; Single Persons; A crowded train with people reading, sleeping, and staring blankly.; cinematic
Characteristic
Shot : A young woman with long brown hair is sitting on a train looking out the window. She is wearing a gray coat.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.59
Noise : 94
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.30
Image errors : There seems to be a slight smoothing out of the woman’s face, potentially overusing a sharpening filter, and the texture of the woman’s coat has been slightly altered.
A Lone Sentinel in the Desert’s Embrace
A solitary soldier stands guard before a weathered watchtower, the vast desert stretching out before him. The image evokes a sense of loneliness and contemplation, with the soldier’s presence a stark contrast to the desolate landscape.
Prompt
facial-expressions Boredom: Despair and boredom. ; A soldier; eye-level; Heroes; A desolate desert landscape with a lone watchtower in the distance.; cinematic
Characteristic
Shot : A lone soldier stands in the doorway of a stone tower in a desert landscape. Mountains can be seen in the distance.
Aesthetic Score : 0.5
Mood : lonely, desolate, melancholic
Quality
Entropy : 6.78
Noise : 97
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant errors. The image appears to be well-exposed and sharp.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.23, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.595, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.01, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated images is very good.
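The per-image records can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score and Likelihood of AI values listed for the ten images and picks out the image with the lowest Prompt Clip Score. It does not reproduce the three scores in the breakdown above, since the post does not describe how those were computed. The `RESULTS` layout and the `column_mean` helper are illustrative.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image records in this post.
RESULTS = [
    ("The Weight of Unfinished Dreams", 0.6, 0.18, 0.20),
    ("On the Edge: A Shadow Prepares to Strike", 0.7, 0.17, 0.30),
    ("Worried Glance: A Moment of Mystery on Public Transit", 0.6, 0.25, 0.10),
    ("In the Shadows of Focus: A Man’s Intense Concentration", 0.6, 0.17, 0.10),
    ("Autumn Reflections: A Moment of Contemplation", 0.7, 0.24, 0.10),
    ("A Man in the Shadows: Secrets and Suspense in a Dimly Lit Room", 0.7, 0.21, 0.10),
    ("A Silent Tension: A Couple’s Uneasy Dinner", 0.6, 0.21, 0.20),
    ("Lost in the Code: A Moment of Intense Focus", 0.6, 0.13, 0.10),
    ("Lost in Thought: A Woman’s Melancholy Journey", 0.7, 0.22, 0.30),
    ("A Lone Sentinel in the Desert’s Embrace", 0.5, 0.21, 0.30),
]


def column_mean(rows, index):
    """Average one numeric column, rounded to three decimals."""
    return round(mean(row[index] for row in rows), 3)


summary = {
    "aesthetic": column_mean(RESULTS, 1),
    "clip": column_mean(RESULTS, 2),
    "ai_likelihood": column_mean(RESULTS, 3),
}

# Image whose output matched its prompt least, by Prompt Clip Score.
weakest = min(RESULTS, key=lambda row: row[2])

print(summary)
print("Lowest Prompt Clip Score:", weakest[0], weakest[2])
```

For the images in this post, the lowest Prompt Clip Score belongs to "Lost in the Code", where the prompt asked for monotony and boredom but the reported mood was focused and intense.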
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://leonardo.ai