AI Captures the Scene, But Struggles with Camera Angles with Leonardo-ai
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. In the realm of AI-generated imagery, capturing these expressions accurately is crucial for creating compelling and realistic scenes. This blog post examines the performance of a generative AI model in understanding and generating facial expressions across various scenarios. We’ll explore how the model handles different camera positions, scene compositions, and aesthetic styles, highlighting its strengths and areas for improvement.
Created with: leonardo-ai
Caught by Surprise: A Moment of Shock in a Lively Cafe
A woman sits at a cafe table, her expression a mixture of surprise and shock. The slightly blurred background and dramatic lighting add to the sense of intrigue, capturing a candid moment of unexpected excitement.
Prompt
facial-expressions Embarrassment: Awkward and self-conscious ; A single woman; eye-level; Single Persons; A crowded cafe with loud chatter and laughter; cinematic
Characteristic
Shot : A woman sitting at a cafe table, looking surprised and laughing.
Aesthetic Score : 0.6
Mood : surprised, happy, casual
Quality
Entropy : 6.84
Noise : 95
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, particularly around the edges of the woman’s hair. The lighting is also slightly uneven, leading to some areas being overexposed.
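Every prompt in this post follows the same semicolon-separated layout, so it can be split into parts before comparing it with the generated shot. The following is a minimal sketch of that split. The field names (expression, subject, camera_position, category, scene, style) are hypothetical labels chosen for illustration and are not taken from the tool that produced these reports.

```python
from typing import NamedTuple


class PromptFields(NamedTuple):
    # Field names are hypothetical labels chosen for this sketch.
    expression: str
    subject: str
    camera_position: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a semicolon-separated prompt into its six parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return PromptFields(*parts)


# The prompt of the first image in this post.
prompt = (
    "facial-expressions Embarrassment: Awkward and self-conscious ; "
    "A single woman; eye-level; Single Persons; "
    "A crowded cafe with loud chatter and laughter; cinematic"
)
fields = parse_prompt(prompt)
print(fields.camera_position)  # eye-level
print(fields.style)            # cinematic
```

Separating the camera position and style from the scene description makes it easier to see which part of a prompt the generated image followed and which part it ignored.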
Batman’s Solitary Stroll: A City in Motion, One Man in Focus
A lone figure in the Batsuit navigates a bustling city street, his stern expression and the blurred background creating a sense of isolation and intrigue. The image captures a dramatic and intense mood, leaving viewers to wonder what secrets lie ahead.
Prompt
facial-expressions Embarrassment: Humiliated and exposed ; A superhero in a full costume; eye-level; Heroes; A bustling city street with people staring; cinematic
Characteristic
Shot : A man dressed as Batman is walking down a city street. The street is crowded with people, but the focus is on the man in the costume.
Aesthetic Score : 0.6
Mood : serious, dramatic, mysterious
Quality
Entropy : 6.86
Noise : 100
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
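The post does not say how the Entropy value in each Quality block is computed. One common definition is the Shannon entropy of the grayscale histogram, which ranges from 0 to 8 bits for 8-bit images, and the reported values (6.22 to 6.90) would fit that range. The sketch below assumes that definition and uses toy pixel lists rather than real image data.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values."""
    counts = Counter(pixels)
    total = len(pixels)
    # p * log2(1 / p) summed over every gray level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs, not real image data.
flat = [128] * 64            # a single gray level: no information
uniform = list(range(256))   # every gray level exactly once: maximum spread

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this assumption, a higher value means the image uses a wider and more even spread of brightness levels; it says nothing about whether the facial expression matches the prompt.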
A Deal in the Shadows: Two Men Meet in a Luxurious Dining Room
Two men in black tie attire sit at a formal dinner table, their serious expressions hinting at a tense and potentially dangerous meeting. The opulent setting, with its dark wood paneling and ornate decorations, adds to the sense of mystery and anticipation.
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A man in a business suit; eye-level; Normal People; A formal dinner party with elegant guests; cinematic
Characteristic
Shot : Two men in tuxedos are seated at a table in a dimly lit room. The table is set with fine china and silverware, and there is a bottle of wine and glasses on the table.
Aesthetic Score : 0.7
Mood : formal, suspenseful, elegant
Quality
Entropy : 6.47
Noise : 97
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some artifacts and errors in the image, including slight noise in the shadows and some minor blurring in the background.
Lost in the Pixelated World: A Gamer’s Contemplative Moment
A young man, bathed in the soft glow of his computer monitors, sits lost in thought. The dimly lit room, a haven for gaming, reflects a mood of quiet contemplation, tinged with a hint of boredom. This image captures the introspective side of a gamer, lost in the digital world.
Prompt
facial-expressions Embarrassment: Cringing and defeated ; A gamer in a gaming chair; eye-level; Gamer; A dimly lit room with flashing screens and empty pizza boxes; cinematic
Characteristic
Shot : A young man sits in a gaming chair, looking to the side, in front of two computer screens. There’s a desk with a keyboard and a mouse. The room is dimly lit with blue and green light coming from the screens.
Aesthetic Score : 0.6
Mood : focused, contemplative, cyberpunk
Quality
Entropy : 6.22
Noise : 89
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a few minor image compression artifacts, especially around the edges of the screens.
Radiant Bride Basking in the Joy of Her Special Day
A beautiful bride, adorned in a flowing white gown and veil, radiates happiness as she shares a special moment with loved ones. The soft lighting and her radiant smile create a warm and romantic atmosphere, capturing the essence of this elegant celebration.
Prompt
facial-expressions Embarrassment: Lonely and out of place ; A woman in a wedding dress; eye-level; Single Persons; A crowded wedding reception with happy couples; cinematic
Characteristic
Shot : A bride in a white wedding dress with a veil and a bouquet of flowers is standing in front of a group of people at a wedding ceremony.
Aesthetic Score : 0.8
Mood : happy, joyous, romantic
Quality
Entropy : 6.90
Noise : 100
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors
Superman’s Worried Gaze: A Hero Under Pressure
A close-up shot captures Superman’s intense expression as he stands amidst a crowd, his worried gaze directed straight at the camera. The dramatic lighting and blurry background heighten the sense of urgency and heroism in this powerful image.
Prompt
facial-expressions Embarrassment: Embarrassed and self-conscious ; A superhero in a cape; eye-level; Heroes; A cheering crowd at a victory parade; cinematic
Characteristic
Shot : A man dressed as Superman is looking directly at the camera in a crowd of people. The background is blurry.
Aesthetic Score : 0.7
Mood : serious, dramatic, heroic
Quality
Entropy : 6.90
Noise : 101
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor image artifacts are present.
Lost in Thought: A Moment of Solitude in a Busy Restaurant
A woman sits alone at a table, her gaze distant and thoughtful. The bustling restaurant fades into the background, highlighting her introspective mood. The scene evokes a sense of loneliness and contemplation, leaving the viewer to wonder about her thoughts and emotions.
Prompt
facial-expressions Embarrassment: Uncomfortable and out of place ; A woman in a casual outfit; eye-level; Normal People; A fancy restaurant with white tablecloths and expensive wine; cinematic
Characteristic
Shot : A woman sits alone at a table in a restaurant, looking up with a concerned expression. Her glass of wine is full, the plate is almost empty. The setting is upscale, with a warm, dark wood interior and large windows.
Aesthetic Score : 0.7
Mood : pensive, contemplative, slightly anxious
Quality
Entropy : 6.75
Noise : 97
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors. The lighting is even and well-balanced.
The Gamer’s Focus: A Moment of Intensity
A young man sits in his gaming chair, eyes locked on the screen, surrounded by the energy of his fellow gamers. The dramatic lighting casts a mysterious glow, highlighting his intense focus and competitive spirit.
Prompt
facial-expressions Embarrassment: Humiliated and defeated ; A gamer in a hoodie; eye-level; Gamer; A crowded esports tournament with loud cheers and flashing lights; cinematic
Characteristic
Shot : A young man in a grey hoodie sits at a desk in a dimly lit room, looking intently at something off-screen. The room is filled with other people, but they are blurry in the background, suggesting a scene of concentration and focus.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.53
Noise : 97
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight noise and graininess, which is likely due to the low light conditions in which it was taken.
A Gentleman’s Mystery: Unveiling the Secrets of a Formal Evening
A man in a tuxedo, bathed in soft candlelight, sits at a beautifully set table. His serious gaze and the elegant ambiance create an air of mystery and intrigue. What secrets lie behind this formal gathering?
Prompt
facial-expressions Embarrassment: Awkward and uncomfortable ; A man in a tuxedo; eye-level; Single Persons; A romantic dinner for two with candles and flowers; cinematic
Characteristic
Shot : A man in a tuxedo sits at a table set for a formal dinner, with candles and flowers in the background.
Aesthetic Score : 0.8
Mood : elegant, sophisticated, romantic
Quality
Entropy : 6.50
Noise : 95
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors or artifacts.
Masked Figure Prepares to Speak: Mystery and Intrigue Surround the Announcement
A man in a suit and a white mask sits at a table, microphone in hand, surrounded by others in suits. The atmosphere is tense, the mood mysterious. What will he say? What secrets will be revealed?
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A superhero in a mask; eye-level; Heroes; A news conference with reporters asking difficult questions; cinematic
Characteristic
Shot : A man in a suit and a white mask is sitting at a table with a microphone in front of him. There are other people behind him, also in suits. It seems to be a press conference, the man is addressing the press.
Aesthetic Score : 0.6
Mood : serious, dramatic, intense
Quality
Entropy : 6.60
Noise : 96
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors in the image.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.62, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.06, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated image is very good.
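The per-image values reported above can also be summarized directly. The snippet below is a small sketch that copies the ten values of each metric from the image reports, in order, and prints the minimum, mean, and maximum. The dictionary layout and variable names are illustrative only.

```python
from statistics import mean

# Values copied from the ten image reports in this post, in order.
reports = {
    "Aesthetic Score": [0.6, 0.6, 0.7, 0.6, 0.8, 0.7, 0.7, 0.6, 0.8, 0.6],
    "Entropy": [6.84, 6.86, 6.47, 6.22, 6.90, 6.90, 6.75, 6.53, 6.50, 6.60],
    "Noise": [95, 100, 97, 89, 100, 101, 97, 97, 95, 96],
    "Prompt Clip Score": [0.22, 0.22, 0.22, 0.23, 0.18, 0.22, 0.27, 0.31, 0.24, 0.25],
    "Likelihood of AI": [0.10, 0.20, 0.10, 0.20, 0.10, 0.20, 0.10, 0.10, 0.20, 0.10],
}

# (min, mean rounded to 3 decimals, max) for each metric.
summary = {
    name: (min(values), round(mean(values), 3), max(values))
    for name, values in reports.items()
}

for name, (low, avg, high) in summary.items():
    print(f"{name}: min={low}, mean={avg}, max={high}")
```

These per-image figures are separate from the three analysis scores in the breakdown above, whose computation the post does not describe.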