AI Captures Emotion, But Struggles with Perspective with Leonardo-ai
Conveying emotion through facial expression is a hallmark of human creativity, and generative AI is now attempting the same. This study explores how well a generative AI model creates images that evoke a specific emotion (here, shades of sadness) from a requested camera position. We analyze the model’s performance on facial expressions, camera angles, and overall aesthetic, and show both its strengths and its limitations through examples of generated images.
Created with: leonardo-ai
A Moment of Solitude in Autumn’s Embrace
A solitary park bench, bathed in the soft glow of streetlights, sits amidst a carpet of fallen leaves. The shallow depth of field isolates the bench, creating a sense of melancholic peace and inviting contemplation. Blurry figures in the distance hint at a world moving on, while the bench remains a quiet sanctuary for reflection.
Prompt
facial-expressions Sadness: Melancholy, loneliness ; A lone figure; eye-level; Single Person; Empty park bench with fallen leaves; cinematic
Characteristic
Shot : A lonely bench in a park, covered in autumn leaves. There are streetlights in the background and some people walking in the distance.
Aesthetic Score : 0.7
Mood : melancholic, tranquil, autumnal
Quality
Entropy : 6.89
Noise : 101
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, particularly around the edges of the leaves and the bench.
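The prompts in this post appear to follow one semicolon-delimited template: an emotion header, a subject, a camera position, a character type, a setting, and a style. The sketch below splits a prompt into those parts. The field names (`subject`, `camera`, `character_type`, and so on) are assumptions chosen for illustration; the post does not name them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PromptFields:
    # Field names are hypothetical labels, not terms used by Leonardo-ai.
    category: str        # e.g. "facial-expressions"
    emotion: str         # e.g. "Sadness"
    nuances: list[str]   # e.g. ["Melancholy", "loneliness"]
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split one prompt from this post into its six semicolon-separated parts."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 parts, got {len(parts)}")
    head, subject, camera, character_type, setting, style = parts

    # The header looks like "facial-expressions Sadness: Melancholy, loneliness".
    label, _, nuance_text = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    nuances = [n.strip() for n in nuance_text.split(",") if n.strip()]

    return PromptFields(category, emotion.strip(), nuances,
                        subject, camera, character_type, setting, style)


if __name__ == "__main__":
    fields = parse_prompt(
        "facial-expressions Sadness: Melancholy, loneliness ; A lone figure; "
        "eye-level; Single Person; Empty park bench with fallen leaves; cinematic"
    )
    print(fields.camera, "|", fields.setting)
```

Separating out the camera field makes it easy to compare what was requested (eye-level or close-up) with what the Shot description reports.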
Batman, Bathed in Rain, Watches Over Gotham
A brooding figure silhouetted against the rain-soaked cityscape, Batman stands guard on a rooftop. The low-key lighting and falling rain create a sense of mystery and suspense, hinting at the darkness that lurks within Gotham.
Prompt
facial-expressions Sadness: Despair, disillusionment ; A superhero in their costume; eye-level; Hero; City skyline at night, rain falling; cinematic
Characteristic
Shot : A man dressed as Batman, standing in the rain in front of a city skyline at night.
Aesthetic Score : 0.7
Mood : dark, mysterious, brooding
Quality
Entropy : 6.32
Noise : 96
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The rain effect looks somewhat artificial and the lighting is flat. There is some noise in the image, but it is not very noticeable.
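The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's grayscale histogram, measured in bits. Under that assumption an 8-bit image tops out at 8 bits, so values between 6.17 and 6.89 would indicate fairly varied tonal content. A minimal sketch using only the standard library:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of an iterable of gray levels (e.g. 0..255)."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = -sum(p * log2(p)) over the gray levels that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000            # a single gray level: no variation
    spread = list(range(256)) * 4  # every level equally often: maximum variation
    print(shannon_entropy(flat), shannon_entropy(spread))
```

To apply it to a real image, the gray levels would first have to be read with an image library; that step is left out because the post does not name its tooling.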
A Moment of Melancholy
A woman sits alone at a kitchen table, her head resting on her hand, lost in thought. The soft lighting and her pensive expression evoke a sense of sadness and loneliness, creating a poignant image of contemplation.
Prompt
facial-expressions Sadness: Hopelessness, grief ; A woman sitting at a kitchen table; eye-level; Normal People; Empty coffee cup, unwashed dishes; cinematic
Characteristic
Shot : A woman is sitting at a kitchen table with a cup of coffee in front of her. She is looking off to the side, with a sad expression on her face. The background is a blurred kitchen.
Aesthetic Score : 0.6
Mood : melancholy, thoughtful, lonely
Quality
Entropy : 6.79
Noise : 96
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly out of focus, especially the background. The lighting is uneven, making the woman’s face appear brighter than the rest of the scene.
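The Prompt Clip Score is not defined in the post either. Assuming it is a CLIP-style score, it would be the cosine similarity between an embedding of the prompt text and an embedding of the generated image. The sketch below shows only that similarity step on made-up vectors; the embeddings themselves would come from a CLIP model, which is not shown, and the example vectors are purely hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length vector has no direction")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical stand-ins for a text embedding and an image embedding.
    text_vec = [3.0, 4.0]
    image_vec = [4.0, 3.0]
    print(round(cosine_similarity(text_vec, image_vec), 2))
```

Under this reading, a higher score means the image sits closer to the prompt in the model's embedding space, which is why the scores in this post are best compared with each other rather than against a fixed threshold.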
The Weight of a Single Can
A young man sits alone, bathed in dim light, a pizza box and a can of beer his only companions. His neutral expression speaks volumes, hinting at a quiet struggle and a sense of unease that hangs heavy in the air.
Prompt
facial-expressions Sadness: Isolation, withdrawal ; A gamer hunched over their computer; close-up; Gamer; Empty pizza boxes, energy drink cans; cinematic
Characteristic
Shot : A young man is sitting at a table, looking directly at the camera. He has a serious expression on his face. There are various boxes and items on the table, including a pizza box, a can of beer, and some other items.
Aesthetic Score : 0.4
Mood : serious, contemplative, somber
Quality
Entropy : 6.25
Noise : 82
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant image errors, but the lighting could be better.
A Shadowy Figure in the Hallway
A young girl stands alone in a dimly lit hallway, her silhouette stark against the darkness. A single light fixture casts a spotlight on her, while the doorway behind her promises a brighter, more open space. The scene is filled with mystery and suspense, leaving the viewer wondering what secrets lie ahead.
Prompt
facial-expressions Sadness: Loneliness, abandonment ; A child standing in a doorway; eye-level; Single Person; Empty hallway, dim lighting; cinematic
Characteristic
Shot : A lone child stands silhouetted in a dark hallway, looking towards a doorway that leads to a brighter room. The hallway walls are painted a dark blue and the floor is shiny and reflective.
Aesthetic Score : 0.6
Mood : mysterious, eerie, foreboding
Quality
Entropy : 6.17
Noise : 92
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts in the shadows and some graininess. There is some noise and a lack of detail in the shadows, especially in the foreground.
Soldier Faces the Inferno
A lone soldier, clad in combat gear, kneels before a raging fire, the smoke billowing high into the sky. The image captures the intensity and somber mood of a battlefield, highlighting the danger and urgency of the situation.
Prompt
facial-expressions Sadness: Loss, regret ; A soldier kneeling on a battlefield; eye-level; Hero; Explosions in the distance, smoke filling the air; cinematic
Characteristic
Shot : A soldier in full combat gear kneeling in a field, with a large fire and smoke plume in the background.
Aesthetic Score : 0.6
Mood : intense, dramatic, melancholic
Quality
Entropy : 6.79
Noise : 99
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and the smoke plume in the background appears slightly blurry.
Is This Movie Really That Boring?
A couple sits on the floor, engrossed in a movie. The woman’s annoyed expression and hand gesture suggest she’s not enjoying the film, while the man seems more indifferent. The scene captures a moment of tension and boredom, leaving the viewer wondering what’s causing the disconnect.
Prompt
facial-expressions Sadness: Silence, unspoken tension ; A couple sitting on a couch; eye-level; Normal People; Empty popcorn bowl, remote control on the floor; cinematic
Characteristic
Shot : A couple sitting on the floor in front of a couch, watching something on TV. A bowl of popcorn is in the foreground.
Aesthetic Score : 0.6
Mood : tense, bored, casual
Quality
Entropy : 6.88
Noise : 99
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors detected. The lighting could be improved, and the scene is a bit bland.
The Blue Glow of Focus: A Close-Up on Typing
A close-up shot captures the focused intensity of typing, highlighting the blue backlight of the keyboard and emphasizing the technical nature of the task. The image evokes a sense of digital immersion and concentration.
Prompt
facial-expressions Sadness: Frustration, defeat ; A gamer’s hands on a keyboard; close-up; Gamer; Screen displaying a game over message; cinematic
Characteristic
Shot : A person’s hands are typing on a keyboard with blue backlighting. The keyboard is on a black surface, and another keyboard can be seen in the background.
Aesthetic Score : 0.5
Mood : focused, techy, digital
Quality
Entropy : 6.52
Noise : 97
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no major errors in the image, but the lighting is slightly uneven and the colors are somewhat washed out.
A City Street, A Worried Glance, and a Sense of Urgency
A woman walks through a bustling city street, her worried expression and the blurred background creating a palpable sense of tension and suspense. The scene evokes a feeling of anxiety, leaving the viewer wondering what she is running from and what awaits her around the corner.
Prompt
facial-expressions Sadness: Alienation, loneliness ; A woman walking down a crowded street; eye-level; Single Person; People passing by, oblivious to her; cinematic
Characteristic
Shot : A woman walking on a city street, looking concerned and worried.
Aesthetic Score : 0.7
Mood : suspenseful, anxious, urban
Quality
Entropy : 6.88
Noise : 98
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight compression artifacts in the background. The color palette is slightly muted.
Lost in the City Lights: A Silhouette of Solitude
A man, cloaked in leather, stands on a rooftop, his silhouette a stark contrast against the vibrant cityscape. The night lights paint a canvas of urban beauty, but a sense of melancholy hangs in the air, reflecting the man’s contemplative mood and the isolation he feels amidst the bustling city.
Prompt
facial-expressions Sadness: Reflection, introspection ; A hero standing on a rooftop; eye-level; Hero; City lights twinkling in the distance; cinematic
Characteristic
Shot : A man in a leather jacket stands on a rooftop overlooking a cityscape at night. The city lights are blurred in the distance, creating a dreamy and romantic atmosphere.
Aesthetic Score : 0.7
Mood : romantic, contemplative, melancholic
Quality
Entropy : 6.73
Noise : 98
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors; however, the image has a slightly grainy texture.
Conclusion
The analysis shows that the generative AI model performed well at understanding the scene described in the prompt, but struggled to follow the requested camera position. Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating that it responds poorly to camera positions in the prompt. The generated images often do not reflect the intended camera angle or perspective.
- Shot Analysis: The model scored 0.51, which is good. The model was able to understand the scene described in the prompt and create images that reflect it fairly well.
- Aesthetic Analysis: The model scored 0.18, which the evaluation rates as very good: the aesthetic of the generated images closely matches the aesthetic described in the prompt. The scale for this score is not stated; since 0.18 is rated above the 0.2 camera score, it is presumably a measure where lower is better.
Overall, the model is better at understanding the scene and achieving the desired aesthetic than at accurately representing the camera position.
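The per-image metrics listed above can also be summarized directly. The sketch below copies the numbers from the ten entries in this post, averages each metric, and ranks the images by Prompt Clip Score. The short image labels and the helper names are made up for illustration, and the summary is independent of the three scores in the breakdown, whose computation the post does not describe.

```python
from statistics import mean

# (label, requested camera, aesthetic, entropy, noise, clip, ai_likelihood)
# Numbers are copied from the entries above; labels are shortened by hand.
IMAGES = [
    ("Park bench",    "eye-level", 0.7, 6.89, 101, 0.26, 0.20),
    ("Batman",        "eye-level", 0.7, 6.32,  96, 0.26, 0.20),
    ("Kitchen table", "eye-level", 0.6, 6.79,  96, 0.28, 0.20),
    ("Gamer",         "close-up",  0.4, 6.25,  82, 0.24, 0.10),
    ("Hallway",       "eye-level", 0.6, 6.17,  92, 0.18, 0.20),
    ("Soldier",       "eye-level", 0.6, 6.79,  99, 0.30, 0.10),
    ("Couple",        "eye-level", 0.6, 6.88,  99, 0.28, 0.10),
    ("Keyboard",      "close-up",  0.5, 6.52,  97, 0.17, 0.10),
    ("City street",   "eye-level", 0.7, 6.88,  98, 0.21, 0.20),
    ("Rooftop",       "eye-level", 0.7, 6.73,  98, 0.24, 0.20),
]

METRICS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def column_means(rows):
    """Average each numeric metric across all images."""
    return {name: mean(row[i + 2] for row in rows)
            for i, name in enumerate(METRICS)}


def lowest_clip(rows, n=3):
    """Labels of the n images whose Prompt Clip Score is lowest."""
    return [row[0] for row in sorted(rows, key=lambda r: r[5])[:n]]


if __name__ == "__main__":
    for name, value in column_means(IMAGES).items():
        print(f"{name:>14}: {value:.3f}")
    print("lowest prompt match:", lowest_clip(IMAGES))
```

On these ten entries the mean Aesthetic Score is 0.61 and the mean Prompt Clip Score is about 0.24. The three lowest Prompt Clip Scores belong to the keyboard (0.17), hallway (0.18), and city street (0.21) images.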