AI's Artistic Journey: Capturing Poses, But Missing the Shot with Leonardo-ai
In the realm of artificial intelligence, the ability to generate images based on text prompts is a rapidly evolving field. This blog post delves into the results of an experiment where an AI model was tasked with creating images based on specific scene descriptions, including details about camera position, shot type, and aesthetic style. While the model demonstrated a strong grasp of aesthetic style, it fell short in accurately capturing camera positions and scene composition. This analysis explores the model’s performance, highlighting its strengths and weaknesses, and discusses potential improvements for future prompts.
Created with: leonardo-ai
A Lone Knight Stands Tall Amidst the Chaos
This epic scene captures the heart of a lone knight, clad in shining armor, amidst a dusty battlefield. The blurred background and the knight’s forward momentum create a sense of urgency and excitement, highlighting the intensity of the battle. The mood is heroic and dramatic, leaving you wanting to know more about this valiant warrior’s story.
Prompt
poses dancing: triumphant, powerful ; A lone warrior; wide shot; heroism; a battlefield littered with fallen enemies; cinematic
Characteristic
Shot : A lone knight in shining armor strides forward towards the viewer with a drawn sword, a line of other knights in the background appear to be retreating, the scene takes place on a dusty plain under a hazy sky
Aesthetic Score : 0.75
Mood : epic, dramatic, heroic
Quality
Entropy : 6.92
Noise : 100
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable artifacts or errors. The only thing that seems off is that the blurred background appears slightly too sharp, which may make the scene look less realistic.
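This post does not say how the Entropy value under Quality is computed. One common definition that is consistent with values between 0 and 8 is the Shannon entropy of an 8-bit grayscale histogram. The sketch below assumes that definition; the function name and the sample pixel lists are hypothetical and are not taken from the images shown here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities.

    Assumption: this is one common way to define image entropy; the tool
    that produced the scores in this post may use a different formula.
    """
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical inputs for illustration only.
flat = [128] * 256          # a single gray level carries no information
uniform = list(range(256))  # every 8-bit level appears equally often

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this definition a flat image sits at the bottom of the scale and an image that uses all 256 levels equally sits at the top, so scores near 7 would indicate a broad spread of tones.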
Adventure Awaits: Friends Explore Ancient Ruins in the Jungle
Three friends, full of laughter and joy, stand before an ancient stone building nestled deep within a lush jungle. Their playful expressions and the mysterious ruins create a sense of adventure and discovery, promising a lighthearted and exciting journey.
Prompt
poses dancing: excited, adventurous ; A group of explorers; medium shot; adventure; a dense jungle with ancient ruins in the background; cinematic
Characteristic
Shot : Three people are having fun in front of an ancient temple, they are wearing casual clothes and look like they are on an adventure.
Aesthetic Score : 0.7
Mood : happy, adventurous, playful
Quality
Entropy : 6.90
Noise : 112
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts and noise in the image, particularly in the shadows.
Immersed in the Game: A Gamer’s Sanctuary
This image captures the essence of a dedicated gamer, lost in the digital world. The vibrant lighting, focused posture, and immersive setup create a sense of energy and excitement, showcasing the passion and dedication of a true gaming enthusiast.
Prompt
poses dancing: intense, focused ; A gamer; close-up; gaming; a brightly lit gaming setup with a screen displaying a virtual world; cinematic
Characteristic
Shot : A young man is sitting at his computer in a dimly lit room with neon lighting playing video games.
Aesthetic Score : 0.6
Mood : focused, gamer, futuristic
Quality
Entropy : 5.76
Noise : 84
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight artifacts around the edges of the screen and the room. Slight blurriness of the subject’s arm.
Love Blooms in the Bustling Market
A couple strolls hand-in-hand through a vibrant market, their love story unfolding under the warm sunlight. The woman’s colorful sari and the man’s striped shirt add a touch of vibrancy to the scene, creating a romantic and happy atmosphere.
Prompt
poses dancing: joyful, romantic ; A couple; medium shot; tourism; a bustling marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A couple is walking down a busy street in a market. They are both smiling and seem to be in love.
Aesthetic Score : 0.7
Mood : romantic, playful, happy
Quality
Entropy : 6.79
Noise : 102
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors
Silhouettes of Hope in the Desert Sunset
A solitary figure in a white dress walks towards the horizon, bathed in the warm glow of the setting sun. The vast desert landscape creates a sense of peace and tranquility, while the woman’s retreating form adds a touch of mystery and intrigue. This serene scene evokes feelings of hope and a sense of quiet contemplation.
Prompt
poses dancing: reflective, contemplative ; A traveler; long shot; travel; a vast desert landscape with a setting sun; cinematic
Characteristic
Shot : A woman in a white dress walks towards the sunset in a desert landscape.
Aesthetic Score : 0.7
Mood : serene, peaceful, hopeful
Quality
Entropy : 6.56
Noise : 98
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight chromatic aberration, particularly around the edges.
Silhouettes of Friendship Against the City Lights
Three friends stand on a rooftop, their figures outlined against the twinkling cityscape. The night air is filled with a romantic, dreamy nostalgia, as they gaze out at the view. The use of silhouettes creates a sense of mystery and intrigue, capturing the essence of their shared moment.
Prompt
poses dancing: happy, carefree ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : Three friends are standing on a rooftop, facing a cityscape at night. The city lights are visible in the distance, with a few tall buildings dominating the skyline.
Aesthetic Score : 0.7
Mood : romantic, urban, friendly
Quality
Entropy : 6.85
Noise : 93
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable artifacts or errors
Shadow Dance: A Woman’s Grace in the Gloom
A solitary figure in a black dress moves with captivating power in a dark alleyway. The stark contrast of light and shadow, illuminated by a single streetlamp, creates a dramatic and mysterious atmosphere. This captivating scene evokes a sense of both vulnerability and strength, leaving a lasting impression.
Prompt
poses dancing: determined, defiant ; A lone dancer; close-up; heroism; a dark alleyway with flickering streetlights; cinematic
Characteristic
Shot : A woman in a black dress dances in a dark alleyway, illuminated by a single street lamp.
Aesthetic Score : 0.8
Mood : dramatic, moody, mysterious
Quality
Entropy : 6.49
Noise : 102
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is well-composed and has no visible errors.
Leap of Joy Against a Snowy Mountain
Four friends capture a moment of pure joy and adventure as they jump in unison against a breathtaking backdrop of snow-capped mountains and a clear blue sky. The vibrant colors of their clothing create a striking contrast against the natural landscape, adding to the image’s playful and energetic mood.
Prompt
poses dancing: exhilarated, free ; A group of adventurers; wide shot; adventure; a breathtaking mountain range with a clear blue sky; cinematic
Characteristic
Shot : Four people are jumping in the air with their arms raised in front of a snow-capped mountain range. The ground is covered in brown grass and bushes. The sky is blue and clear.
Aesthetic Score : 0.8
Mood : joyful, adventurous, celebratory
Quality
Entropy : 6.80
Noise : 106
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors
Lost in the Game: A Moment of Intense Focus
A young man, headphones on, is completely absorbed in a video game. The dimly lit room and blurry background heighten the sense of drama and suspense as he navigates the virtual world with laser focus.
Prompt
poses dancing: focused, strategic ; A gamer; close-up; gaming; a dimly lit room with a computer screen displaying a competitive game; cinematic
Characteristic
Shot : A young man is sitting in a dimly lit room, wearing a headset and focusing intently on a computer screen. He is typing on a keyboard with his right hand. The room is filled with other people, but they are blurred in the background. There are faint lights in the background, suggesting it might be a gaming center or a LAN party.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 5.93
Noise : 87
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible image errors.
A Romantic Stroll Towards the Unknown
In this captivating scene, a couple is seen walking hand-in-hand on a pristine white sandy beach, their eyes fixed on the mesmerizing turquoise ocean. The sky above is a canvas of blue, dotted with fluffy white clouds. The mood is romantic, happy, and carefree, as the couple embarks on a journey towards the unknown horizon. The use of backlighting adds a touch of mystery and intrigue, making this a perfect depiction of freedom and adventure.
Prompt
poses dancing: relaxed, joyful ; A family; medium shot; travel; a picturesque beach with turquoise water and white sand; cinematic
Characteristic
Shot : A couple walking hand-in-hand away from the camera on a white sand beach with turquoise water and blue sky with clouds
Aesthetic Score : 0.7
Mood : romantic, happy, carefree
Quality
Entropy : 6.39
Noise : 96
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors, image looks crisp and well-exposed.
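The per-image numbers above can be summarized with a few lines of Python. This is a minimal sketch: the short labels are made up for readability, the shot types are the ones requested in each prompt, and the scores are copied from the ten results in this post. It does not reproduce the scores quoted in the conclusion, whose calculation is not described here.

```python
from collections import defaultdict
from statistics import mean

# (label, requested shot type, aesthetic score, prompt CLIP score)
# Labels are hypothetical shorthand; the numbers come from the results above.
results = [
    ("knight", "wide shot", 0.75, 0.22),
    ("jungle ruins", "medium shot", 0.70, 0.32),
    ("gamer sanctuary", "close-up", 0.60, 0.23),
    ("market", "medium shot", 0.70, 0.23),
    ("desert", "long shot", 0.70, 0.25),
    ("rooftop", "medium shot", 0.70, 0.30),
    ("alleyway", "close-up", 0.80, 0.31),
    ("mountain", "wide shot", 0.80, 0.27),
    ("gamer focus", "close-up", 0.60, 0.25),
    ("beach", "medium shot", 0.70, 0.24),
]

mean_aesthetic = mean(r[2] for r in results)
mean_clip = mean(r[3] for r in results)

# Group the prompt CLIP scores by the shot type each prompt asked for.
clip_by_shot = defaultdict(list)
for _, shot, _, clip in results:
    clip_by_shot[shot].append(clip)
mean_clip_by_shot = {shot: mean(vals) for shot, vals in clip_by_shot.items()}

print(f"mean aesthetic score: {mean_aesthetic:.3f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
for shot, value in sorted(mean_clip_by_shot.items()):
    print(f"{shot}: {value:.3f}")
```

With only ten images, and a single long shot, the per-shot averages are a rough orientation rather than a statistically meaningful comparison.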
Conclusion
The results show that the generative AI model performed well on aesthetic style, but struggled with camera position and scene composition. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.5, which is considered average. This indicates that the model was able to understand the scene in the prompt reasonably well, but there’s room for improvement.
- Aesthetic Analysis: The model scored 0.08, which is considered very good. This means the generated image closely matched the expected aesthetic style described in the prompt.
Overall, the model seems to be better at understanding the aesthetic style than the camera position and scene composition. It might be helpful to provide more specific instructions regarding camera angles and shot types in future prompts to improve the model’s performance in these areas.
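One way to act on that suggestion is to treat each prompt as structured data, so the camera instruction can be made more explicit before generation. The sketch below assumes the six-field, semicolon-separated layout seen in the prompts in this post. The field names, the `ScenePrompt` class, and the extra camera wording are hypothetical, and whether such wording actually improves Leonardo-ai's output was not tested here.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are assumed labels for the six ';'-separated parts.
    poses: list[str]
    subject: str
    shot_type: str
    theme: str
    setting: str
    style: str


def parse_prompt(text: str) -> ScenePrompt:
    """Split a prompt of the form used in this post into named fields."""
    parts = [p.strip() for p in text.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    pose_field, subject, shot_type, theme, setting, style = parts
    # The first field looks like "poses dancing: triumphant, powerful".
    _, _, pose_list = pose_field.partition(":")
    poses = [p.strip() for p in pose_list.split(",") if p.strip()]
    return ScenePrompt(poses, subject, shot_type, theme, setting, style)


def with_camera_details(prompt: ScenePrompt, camera: str) -> str:
    """Rebuild the prompt with an explicit camera note after the shot type."""
    return "; ".join([
        f"poses dancing: {', '.join(prompt.poses)}",
        prompt.subject,
        f"{prompt.shot_type}, {camera}",
        prompt.theme,
        prompt.setting,
        prompt.style,
    ])


original = (
    "poses dancing: triumphant, powerful ; A lone warrior; wide shot; "
    "heroism; a battlefield littered with fallen enemies; cinematic"
)
parsed = parse_prompt(original)
# The camera wording below is an illustrative guess, not a tested recipe.
print(with_camera_details(parsed, "low-angle camera at ground level"))
```

Keeping the shot type in its own field also makes it straightforward to compare what was requested with what an evaluator later reports for the generated image.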