AI Struggles to Capture the Nuance of Human Emotion in Images with Leonardo-ai
In artificial intelligence, generating realistic and emotionally evocative images is a coveted goal. This study examines the challenges AI faces in capturing the nuances of human facial expressions, focusing on dramatic expressions. These are often used in film, theater, and photography to convey intense emotions and heightened states of being, and are characterized by exaggerated movements, heightened intensity, and a focus on the emotional core of a scene. The study explores how a generative AI model performed when asked to create images from scene descriptions paired with dramatic facial expressions, highlighting the model’s strengths and weaknesses in capturing the desired aesthetic and emotional depth.
Created with: leonardo-ai
Lost in the Rain: A Figure Walks into the Night
A solitary figure, shrouded in darkness under an umbrella, walks down a deserted, rain-soaked street. The dim glow of street lamps casts long shadows, adding to the sense of mystery and melancholy. This image evokes a feeling of isolation and intrigue, leaving the viewer wondering about the figure’s story.
Prompt
facial-expressions Shame: Desolate, lonely, regretful ; A lone figure, hunched over, walking down a deserted street; eye-level; Single Person; Rain-slicked pavement and flickering streetlights; cinematic
Characteristic
Shot : A man in a black coat walks down a rainy street in the night. He is holding an umbrella over his head. The street is wet and dark, with streetlights in the distance.
Aesthetic Score : 0.7
Mood : dark, mysterious, lonely
Quality
Entropy : 6.35
Noise : 107
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as a slight blur around the edges. Some pixelation and compression artifacts are also visible.
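The article does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which runs from 0 bits (a single flat tone) to 8 bits (all 256 gray levels equally likely). Under that assumption, values between 6 and 7 would indicate fairly rich tonal variation. A minimal sketch, where `shannon_entropy` is a hypothetical helper working on a flat list of pixel intensities:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors the "Entropy" metric reported here; the
    source does not document the actual formula.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Two extreme cases: one flat tone, and an even spread over 256 levels.
flat = [128] * 1024
uniform = list(range(256)) * 4
print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

For a real image, the pixel list would come from a grayscale conversion of the file, which is left out here to keep the sketch free of external libraries.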
Heroic Silhouette: A City Awaits
A superhero, clad in vibrant red, gold, and blue, stands poised on a rooftop, the setting sun casting a dramatic glow on the cityscape. The silhouette of a towering skyscraper looms in the background, hinting at the epic battles that may unfold. This image captures the essence of hope and heroism, as the hero prepares to face whatever challenges lie ahead.
Prompt
facial-expressions Shame: Melancholy, disillusioned, burdened ; A superhero, their mask removed, revealing a face etched with pain; eye-level; Hero; A cityscape bathed in the glow of a setting sun; cinematic
Characteristic
Shot : A superhero, wearing a red, gold, and blue costume and a mask, stands on a rooftop overlooking a city skyline at sunset. The city appears to be New York City, with the Empire State Building visible in the background.
Aesthetic Score : 0.7
Mood : heroic, contemplative, determined
Quality
Entropy : 6.80
Noise : 96
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have been edited to improve the lighting, resulting in some unnatural color saturation. There is also some noise visible in the background.
Lost in Thought: A Moment of Melancholy at the Diner
A young woman sits alone at a diner booth, her hands clasped in front of her, lost in thought. Her melancholic expression and posture create a sense of emotional weight and isolation, making the scene feel intimate and personal. The glass of milk on the table adds to the feeling of quiet contemplation.
Prompt
facial-expressions Shame: Embarrassed, defeated, self-loathing ; A woman, her face buried in her hands, sitting alone at a crowded diner table; eye-level; Normal Person; The bustling activity of the diner, a stark contrast to her isolation; cinematic
Characteristic
Shot : A woman sits alone at a diner booth, with a glass of milk in front of her. She looks troubled.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.82
Noise : 99
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.00
Image errors : There are no noticeable errors in the image.
Lost in the Code: A Gamer’s Focused Intensity
A young man, headphones on, sits in a dimly lit room, his eyes glued to the computer screen. The cluttered space, filled with electronics, hints at a world of gaming or creative pursuits. His focused expression and the low lighting create a palpable sense of intensity and immersion, capturing the essence of a dedicated gamer lost in their digital world.
Prompt
facial-expressions Shame: Empty, defeated, lost in a digital world ; A gamer, staring blankly at a screen, his controller lying idle; eye-level; Gamer; A dimly lit room filled with gaming paraphernalia, a sense of disconnection; cinematic
Characteristic
Shot : A young man wearing a headset is sitting at a desk in a dimly lit room and is looking intently at a computer screen. He has his hand on a keyboard and looks to be engrossed in something on the screen. There is another monitor in the background, as well as some shelves with books on them, and a picture on the wall.
Aesthetic Score : 0.6
Mood : focused, concentrated, serious
Quality
Entropy : 6.15
Noise : 90
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, particularly in the shadows. The colors are slightly washed out.
A Look of Worry in the Shadows
A man’s anxious gaze and a woman’s direct stare create a palpable sense of suspense and mystery in this dimly lit bar scene. What secrets are hidden in the shadows?
Prompt
facial-expressions Shame: Anxious, self-conscious, out of place ; A man, standing in a crowded room, his eyes darting nervously around; eye-level; Single Person; A party scene, filled with laughter and conversation, but he feels isolated; cinematic
Characteristic
Shot : A man with a beard and a sweater looks nervously over his shoulder in a dimly lit room with other people in the background.
Aesthetic Score : 0.7
Mood : suspenseful, anxious, tense
Quality
Entropy : 6.54
Noise : 96
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image seems slightly underexposed, leading to darker shadows than desired. The image appears to be slightly noisy, which might be due to low-light conditions.
Lost in the City Lights: A Moment of Contemplation on a Rooftop
A solitary figure, clad in leather, stands on a rooftop overlooking the sprawling cityscape of New York. The iconic Empire State Building looms in the distance, adding a touch of grandeur to the scene. The mood is melancholic, yet contemplative, as the man seems lost in thought, dwarfed by the vastness of the urban landscape.
Prompt
facial-expressions Shame: Disheartened, disillusioned, questioning his purpose ; A hero, standing on a rooftop, looking down at the city below; not too close; Hero; A panoramic view of the city, but he feels small and insignificant; cinematic
Characteristic
Shot : A man in a black leather jacket stands on a rooftop overlooking the city skyline, with the Empire State Building prominently visible in the background. The cityscape is bathed in a soft, hazy light, suggesting either dawn or dusk.
Aesthetic Score : 0.7
Mood : melancholic, contemplative, urban
Quality
Entropy : 6.95
Noise : 103
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly soft, especially in the background, which suggests some over-sharpening. The subject’s hair appears slightly unnatural and the skin tones are a bit too smooth, which could be a sign of post-processing.
A Moment of Quiet Melancholy
A woman sits alone at a kitchen table, her gaze lost in the window. The plate of untouched food speaks volumes about her unspoken emotions. The soft lighting and her pensive expression evoke a sense of sadness and loneliness, leaving the viewer to ponder her thoughts.
Prompt
facial-expressions Shame: Depressed, unmotivated, lost in her thoughts ; A woman, sitting at her kitchen table, staring at a plate of untouched food; eye-level; Normal Person; A cluttered kitchen, a reflection of her inner turmoil; cinematic
Characteristic
Shot : A woman sits at a kitchen table, seemingly lost in thought, with a plate of food in front of her. The scene is bathed in soft, warm light, highlighting the woman’s melancholic expression.
Aesthetic Score : 0.6
Mood : melancholic, contemplative, somber
Quality
Entropy : 6.69
Noise : 97
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Lost in the Code: A Moment of Intense Focus
A young man sits in a dimly lit room, his face illuminated by the glow of his computer screen. He’s completely absorbed in his work, his serious expression reflecting the intensity of his focus. The low lighting creates a sense of intimacy and drama, highlighting the man’s dedication to his task.
Prompt
facial-expressions Shame: Despair, addiction, a sense of being lost ; A gamer, hunched over his keyboard, his fingers flying across the keys, but his eyes are filled with sadness; eye-level; Gamer; A brightly lit gaming room, but he feels trapped in a digital world; cinematic
Characteristic
Shot : A young man is sitting in front of his computer, typing on a keyboard. The room is dark, lit only by the glow of the computer screen. The subject is in focus while the background is blurred. The lighting is somewhat dramatic and the subject’s expression is serious.
Aesthetic Score : 0.6
Mood : serious, focused, dark
Quality
Entropy : 6.35
Noise : 96
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : None. Image looks fine.
Lost in the City’s Shadows
A solitary figure walks through a narrow, cobblestone street, his expression shrouded in mystery. The shallow depth of field blurs the bustling city around him, creating a sense of isolation and introspection. This brooding image evokes a melancholic mood, leaving the viewer to ponder the man’s thoughts and the secrets he carries.
Prompt
facial-expressions Shame: Rejected, isolated, a sense of being unwanted ; A man, walking away from a group of people, his head down, his shoulders slumped; eye-level; Single Person; A bustling street, but he feels alone and invisible; cinematic
Characteristic
Shot : A man walking down a street in a city. The street is narrow and the buildings are old.
Aesthetic Score : 0.6
Mood : dark, mysterious, lonely
Quality
Entropy : 6.93
Noise : 100
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and has some noise. The colors are also slightly desaturated.
Masked Warrior in a World of Ruin
In this close-up portrait, a man shrouded in metal, his face obscured by a mask, stands against the backdrop of a crumbling building. The dramatic lighting and his intense gaze create a sense of mystery and tension, leaving the viewer questioning his story and the world he inhabits.
Prompt
facial-expressions Shame: Guilt, regret, a sense of responsibility ; A hero, standing in the ruins of a battle, his armor dented and his face covered in grime; not too close; Hero; A scene of destruction, a reminder of the cost of his actions; cinematic
Characteristic
Shot : A close-up portrait of a man wearing a metal mask and armor, set against a background of a war-torn landscape. The focus is on his face, especially his eyes. The scene has a gritty, realistic tone.
Aesthetic Score : 0.7
Mood : intense, dramatic, determined
Quality
Entropy : 6.81
Noise : 98
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has minor sharpening artifacts, especially around the edges of the mask.
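To compare the ten images at a glance, the per-image figures can be aggregated. The sketch below copies the reported values in order and computes the mean, minimum, and maximum of each metric. The `summarize` helper and the `metrics` layout are illustrative choices, not part of the original evaluation pipeline.

```python
from statistics import mean

# Per-image metrics copied from the ten result blocks above, in order.
metrics = {
    "Aesthetic Score": [0.7, 0.7, 0.7, 0.6, 0.7, 0.7, 0.6, 0.6, 0.6, 0.7],
    "Entropy": [6.35, 6.80, 6.82, 6.15, 6.54, 6.95, 6.69, 6.35, 6.93, 6.81],
    "Noise": [107, 96, 99, 90, 96, 103, 97, 96, 100, 98],
    "Prompt Clip Score": [0.28, 0.29, 0.28, 0.24, 0.27, 0.24, 0.30, 0.27, 0.23, 0.31],
    "Likelihood of AI": [0.20, 0.10, 0.00, 0.20, 0.10, 0.30, 0.10, 0.10, 0.20, 0.10],
}


def summarize(values):
    """Return (mean, min, max) for one metric, with the mean rounded to 3 places."""
    return round(mean(values), 3), min(values), max(values)


summary = {name: summarize(values) for name, values in metrics.items()}

for name, (avg, low, high) in summary.items():
    print(f"{name}: mean={avg}, min={low}, max={high}")
```

On these figures the mean Aesthetic Score is 0.66 and the mean Prompt Clip Score is about 0.27, with individual clip scores ranging from 0.23 to 0.31.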
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.32, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t quite capture the intended camera position as described in the prompt.
- Shot Analysis: The model scored 0.63, falling within the “good” range. This suggests that the model was able to understand the scene and create a shot that was generally aligned with the prompt.
- Aesthetic Analysis: The model scored 0.12, which falls just outside the “very good” range of -0.2 to 0.1. This indicates that the aesthetic of the generated images did not fully match the aesthetic described in the prompts.
Overall, the model demonstrated a decent understanding of the scene and shot composition, but needs improvement in matching the requested camera position and capturing the desired aesthetic.
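The range comparisons in the breakdown can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: the source gives the “good” range of 0.5 to 0.75 only for Camera Position, and the same range is assumed here for Shot Analysis. The `gap_outside` helper is hypothetical.

```python
def gap_outside(score, low, high):
    """Distance by which score falls outside [low, high]; 0.0 if inside."""
    if score < low:
        return round(low - score, 2)
    if score > high:
        return round(score - high, 2)
    return 0.0


# (label, score, range name, low, high) as reported in the conclusion.
# Assumption: Shot Analysis uses the same "good" range as Camera Position;
# the source states that range only for Camera Position.
results = [
    ("Camera Position", 0.32, "good", 0.5, 0.75),
    ("Shot Analysis", 0.63, "good", 0.5, 0.75),
    ("Aesthetic Analysis", 0.12, "very good", -0.2, 0.1),
]

gaps = {label: gap_outside(score, low, high) for label, score, _, low, high in results}

for label, score, range_name, low, high in results:
    gap = gaps[label]
    status = "inside" if gap == 0.0 else f"outside by {gap}"
    print(f"{label}: {score} is {status} the '{range_name}' range [{low}, {high}]")
```

Read this way, Camera Position misses its range by 0.18, while the Aesthetic Analysis score misses its range by only 0.02.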