AI's Facial Expressions: A Step Forward, But Camera Angles Need Work with Leonardo-ai
The ability to generate images with specific facial expressions is a powerful tool for artists, filmmakers, and anyone else who wants to create compelling visuals. This post tests how well Leonardo AI handles that task. The model shows promise in understanding the scene and achieving the desired aesthetic, but it struggles to reproduce the camera angle requested in the prompt. Below, we look at ten generated images, analyze the model’s strengths and weaknesses, and discuss where it could improve. We also consider how its handling of dramatic facial expressions could be used in applications ranging from realistic character portraits to dynamic scenes for movies and video games.
Created with: leonardo-ai
Silhouetted in the Desert Sunset
A solitary figure stands amidst the vastness of the desert, bathed in the warm glow of a setting sun. The silhouette against the fiery sky evokes a sense of peace, contemplation, and a hint of mystery. The scene invites you to ponder the journey of this lone traveler and the stories held within the desert’s embrace.
Prompt
facial-expressions Curiosity: Melancholy, contemplative ; A lone figure, silhouetted against a setting sun; eye-level; Single Person; vast, empty desert landscape; cinematic
Characteristic
Shot : A lone figure in a long robe stands on a sand dune overlooking a vast desert landscape at sunset. The sky is filled with clouds, with the sun peeking through the clouds in the distance. The figure’s silhouette is striking against the warm colors of the sky.
Aesthetic Score : 0.7
Mood : serene, contemplative, solitary
Quality
Entropy : 6.51
Noise : 91
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
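The prompt above, like every prompt in this post, follows the same semicolon-separated layout. As a rough sketch, it can be split into named fields so that the requested camera position and expressions are easy to compare against the generated image. The field names (`tag`, `expressions`, `subject`, `camera`, `character_type`, `setting`, `style`) and the `parse_prompt` helper are assumptions made for illustration; the post does not name the fields or describe any tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptFields:
    # Field names are hypothetical; the post does not label the prompt parts.
    tag: str
    expressions: List[str]
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a prompt of the form used in this post into its six parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, camera, character_type, setting, style = parts
    # The first part looks like "<tag>: <expression>, <expression>".
    tag, _, raw_expressions = head.partition(":")
    expressions = [e.strip() for e in raw_expressions.split(",") if e.strip()]
    return PromptFields(tag.strip(), expressions, subject, camera,
                        character_type, setting, style)


# The first prompt from this post, copied verbatim.
PROMPT = (
    "facial-expressions Curiosity: Melancholy, contemplative ; "
    "A lone figure, silhouetted against a setting sun; eye-level; "
    "Single Person; vast, empty desert landscape; cinematic"
)

if __name__ == "__main__":
    fields = parse_prompt(PROMPT)
    print(fields.camera, fields.expressions)
```

Keeping the camera position as its own field makes it straightforward to check, image by image, whether the requested angle was actually honored.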
Silhouetted in the Future: A Man on the Edge
A lone figure in a futuristic suit stands on a rooftop, bathed in the glow of a sprawling cityscape. The night sky, a canvas of blue and orange hues, adds to the dramatic atmosphere. The silhouette of the man against the city lights creates a sense of mystery and intrigue, while the glowing blue light on his suit hints at a world beyond our own.
Prompt
facial-expressions Curiosity: Determined, hopeful ; A superhero, standing atop a skyscraper, looking out at the city; eye-level; Hero; bustling cityscape with neon lights; cinematic
Characteristic
Shot : A lone figure in a futuristic suit stands on a rooftop overlooking a sprawling cityscape at twilight, with the sky transitioning from dark blue to orange.
Aesthetic Score : 0.6
Mood : dark, mysterious, futuristic
Quality
Entropy : 6.59
Noise : 88
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.70
Image errors : The cityscape appears slightly blurry, especially in the background, which might be due to a shallow depth of field or post-processing.
Lost in Thought: A Moment of Serene Contemplation
A young woman finds solace in the quiet beauty of a park, her gaze fixed on an unseen horizon. The soft blur of the background and vibrant flowers in the foreground create a sense of peaceful introspection, leaving the viewer to wonder what thoughts occupy her mind.
Prompt
facial-expressions Curiosity: Peaceful, observant ; A young woman, sitting on a park bench, watching children play; eye-level; Normal People; vibrant park with blooming flowers; cinematic
Characteristic
Shot : A young woman sits on a bench in a park, looking up and to the left, with a contemplative expression.
Aesthetic Score : 0.7
Mood : pensive, wistful, thoughtful
Quality
Entropy : 6.84
Noise : 93
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No obvious errors in the image.
Lost in the Code: A Young Man’s Intense Focus Under Neon Lights
A young man, bathed in the dramatic glow of blue and red lighting, is completely absorbed in his work. Headphones on, fingers flying across the keyboard, he embodies the intensity and focus of a coder lost in the digital world.
Prompt
facial-expressions Curiosity: Intense, focused ; A gamer, hunched over a computer screen, eyes glued to the monitor; close-up; Gamer; dimly lit room with flashing lights from the screen; cinematic
Characteristic
Shot : A young man wearing headphones is hunched over a computer, typing on a keyboard in a dimly lit room.
Aesthetic Score : 0.6
Mood : focused, intense, tech
Quality
Entropy : 5.61
Noise : 83
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight noise and color banding in some areas.
Lost in Thought Amidst the Bustle
A man with a thoughtful expression navigates a vibrant market, his graying beard and backpack hinting at a journey both physical and internal. The colorful lanterns overhead and the bustling crowd create a sense of urban energy, while his serious demeanor suggests a deeper story waiting to unfold.
Prompt
facial-expressions Curiosity: Intrigued, observant ; A man, walking through a crowded marketplace, his eyes darting around; eye-level; Single Person; bustling marketplace with colorful stalls and vendors; cinematic
Characteristic
Shot : A man in a busy marketplace. The man is in the foreground and the busy street is in the background.
Aesthetic Score : 0.7
Mood : intense, thoughtful, crowded
Quality
Entropy : 6.90
Noise : 96
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor noise and blurriness.
A Lone Warrior in a World of Ashes
A man, clad in tactical gear, stands amidst the ruins of a post-apocalyptic wasteland. Smoke and fire billow in the background, creating a sense of danger and chaos. His focused expression reflects the intensity of the situation, leaving the viewer to wonder what battles he has fought and what challenges lie ahead.
Prompt
facial-expressions Curiosity: Brave, resolute ; A hero, standing in the middle of a chaotic battle, looking determined; eye-level; Hero; smoke-filled battlefield with explosions and debris; cinematic
Characteristic
Shot : A lone man stands amidst a destroyed cityscape. There is smoke and fire in the background, suggesting a recent disaster.
Aesthetic Score : 0.7
Mood : dramatic, tense, somber
Quality
Entropy : 6.81
Noise : 93
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some of the smoke and fire elements look a bit artificial, with an unnatural texture.
Laughter and Warmth: Friends Share a Joyful Moment
A group of friends gather around a table, their laughter filling the air. The warm lighting and intimate setting create a sense of comfort and connection, capturing the essence of shared joy and friendship.
Prompt
facial-expressions Curiosity: Joyful, connected ; A group of friends, gathered around a table, sharing stories and laughter; eye-level; Normal People; cozy living room with warm lighting; cinematic
Characteristic
Shot : Four friends are sitting around a table in a dimly lit room. They are laughing and talking. There are three glasses of beer on the table.
Aesthetic Score : 0.7
Mood : happy, friendly, relaxed
Quality
Entropy : 6.76
Noise : 91
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The lighting in the image is a bit uneven. The shadows are a bit too dark, and the highlights are a bit too bright.
The Focus of Competition: A Gamer’s Intensity in the Spotlight
Two young men battle it out in a dimly lit gaming room, the focused gamer in the foreground highlighted by a shallow depth of field. The scene captures the intense, competitive spirit of gaming, with a mood of focused determination.
Prompt
facial-expressions Curiosity: Excited, engaged ; A gamer, holding a controller, eyes wide with excitement; close-up; Gamer; brightly lit gaming room with colorful lights; cinematic
Characteristic
Shot : Two young men are playing video games in a dimly lit room. The man in the foreground is holding a game controller and his face is lit by blue light. The man in the background is out of focus and wearing headphones.
Aesthetic Score : 0.6
Mood : intense, focused, playful
Quality
Entropy : 6.13
Noise : 88
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some noise in the image, particularly noticeable in the background.
Solitude on the Stormy Edge
A woman stands alone on a cliff, her silhouette stark against the dramatic backdrop of a stormy sea. The scene evokes a sense of melancholy and contemplation, capturing the raw power of nature and the fragility of human existence.
Prompt
facial-expressions Curiosity: Contemplative, introspective ; A woman, standing at the edge of a cliff, gazing out at the vast ocean; eye-level; Single Person; dramatic cliffside with crashing waves; cinematic
Characteristic
Shot : A woman stands on a cliff overlooking a stormy ocean, with wind blowing her hair.
Aesthetic Score : 0.7
Mood : melancholic, dramatic, isolated
Quality
Entropy : 6.74
Noise : 96
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors are present.
Soldier Faces the Inferno: A Moment of War’s Brutality
A lone soldier, clad in combat gear, stands defiant against a backdrop of fiery destruction. The burning building and billowing smoke create a scene of intense chaos, while the soldier’s determined gaze reflects the weight of war and the melancholic reality of the situation.
Prompt
facial-expressions Curiosity: Brave, selfless ; A hero, standing in front of a burning building, ready to save people; eye-level; Hero; chaotic scene with smoke and flames; cinematic
Characteristic
Shot : A soldier in full gear stands in front of a burning building with a serious expression on his face. The background is a mix of smoke and flames.
Aesthetic Score : 0.6
Mood : tense, dramatic, somber
Quality
Entropy : 6.62
Noise : 89
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors detected.
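Each image above reports an Entropy value under Quality, but the post does not say how it is computed. One common definition, assumed here purely for illustration, is the Shannon entropy of the grayscale intensity histogram, measured in bits. Under that definition an 8-bit image scores between 0 (a single flat tone) and 8 (all 256 intensity levels equally frequent). The `shannon_entropy` function below is a hypothetical sketch, not the tool used to produce the numbers in this post.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy (in bits) of a sequence of pixel intensities.

    Assumption: this is one standard definition of image entropy; the
    post does not confirm that its Entropy metric is computed this way.
    """
    values = list(pixels)
    if not values:
        raise ValueError("pixels must not be empty")
    total = len(values)
    counts = Counter(values)
    # Sum of -p * log2(p) over every intensity level that occurs.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000          # a single flat tone
    spread = list(range(256))    # every 8-bit level exactly once
    print(shannon_entropy(flat), shannon_entropy(spread))
```

If the metric does follow this definition, lower values would correspond to images dominated by a narrow range of tones, such as dark, dimly lit scenes.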
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the desired aesthetic, but struggled to represent the requested camera position. Here’s a breakdown:
- Camera Position: The model scored 0.1, indicating a very low ability to accurately represent the camera position described in the prompt. This suggests the model is not very good at understanding and implementing camera angles.
- Shot Analysis: The model scored 0.5, indicating a good ability to understand the scene described in the prompt. This means the model was able to create an image that generally matched the scene described, but there might be some discrepancies in details.
- Aesthetic Analysis: The model scored 0.1, which this evaluation reads as a very good ability to match the expected aesthetic. The generated images closely matched the desired look despite the issues with camera position, and the per-image Aesthetic Scores listed above fall between 0.6 and 0.7.
Overall, the model shows promise in understanding the scene and achieving the desired aesthetic, but needs improvement in accurately representing camera positions.
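For readers who want a quick summary of the per-image numbers, the sketch below averages the Aesthetic Score and Prompt Clip Score values listed for the ten images and counts how often each camera position was requested in the prompts. The values are copied from the entries above; the `summarize` helper and the tuple layout are assumptions made for illustration. These averages are not the Camera Position, Shot Analysis, or Aesthetic Analysis scores from the breakdown, which come from a separate evaluation that the post does not describe.

```python
from collections import Counter
from statistics import mean

# One tuple per image, in the order shown in this post:
# (camera position requested in the prompt, Aesthetic Score, Prompt Clip Score)
IMAGES = [
    ("eye-level", 0.7, 0.23),
    ("eye-level", 0.6, 0.21),
    ("eye-level", 0.7, 0.22),
    ("close-up", 0.6, 0.20),
    ("eye-level", 0.7, 0.24),
    ("eye-level", 0.7, 0.22),
    ("eye-level", 0.7, 0.24),
    ("close-up", 0.6, 0.29),
    ("eye-level", 0.7, 0.24),
    ("eye-level", 0.6, 0.25),
]


def summarize(images):
    """Return simple aggregate statistics for the listed images."""
    return {
        "count": len(images),
        "camera_counts": dict(Counter(camera for camera, _, _ in images)),
        "mean_aesthetic": mean([aesthetic for _, aesthetic, _ in images]),
        "mean_clip": mean([clip for _, _, clip in images]),
    }


if __name__ == "__main__":
    for key, value in summarize(IMAGES).items():
        print(f"{key}: {value}")
```

Only two of the ten prompts request a close-up, so a tally like this also shows how narrow the set of camera positions tested in this post is.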