AI Captures the Scene, But Misses the Mood with Leonardo-ai
In AI image generation, capturing a scene takes more than depicting objects and characters; the image also has to convey the mood, atmosphere, and aesthetic the prompt asks for. This post tests how well an AI model generates images from specific prompts, focusing on poses and aesthetics. For each image, we look at how the model handled camera position, scene composition, and the requested aesthetic, to show what current AI image generation does well and where it still needs development.
Created with: leonardo-ai
Through the Smoke and Fire: Soldiers Advance in a Chaotic Battlefield
A dramatic scene unfolds as soldiers push forward through a battlefield engulfed in smoke and fire. The intensity of the moment is palpable, with the soldiers’ determined expressions contrasting sharply with the chaos around them.
Prompt
poses standing-in-a-row: determined, courageous, hopeful ; A group of soldiers; wide shot; heroism; a battlefield with smoke and explosions in the background; cinematic
Characteristic
Shot : A group of soldiers are running through a battlefield with fire and smoke in the background. The soldiers are wearing military uniforms and carrying weapons.
Aesthetic Score : 0.7
Mood : intense, dramatic, chaotic
Quality
Entropy : 6.76
Noise : 98
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.50
Image errors : None. The image is well-composed and well-lit.
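The post does not say how the Entropy value in the Quality block is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which cannot exceed 8 bits; the reported values (5.81 to 6.86) are consistent with that bound but do not confirm it. The helper name `shannon_entropy` is hypothetical, and the input is assumed to be a flat sequence of pixel intensities from 0 to 255.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors a histogram-based entropy metric; the post
    does not state which definition its tooling actually uses.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must not be empty")
    # Sum p * log2(1 / p) over every intensity that occurs at least once.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A flat image carries no information; a uniform spread over all 256 levels
# reaches the 8-bit maximum.
flat = shannon_entropy([128] * 1000)
uniform = shannon_entropy(range(256))
```

Under this reading, a lower value such as 5.81 would indicate an image dominated by a narrow band of tones, for example a dark stage, while values near 6.8 indicate a broader tonal spread.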
Uncharted Adventures Await in the Golden Jungle
A group of intrepid explorers stand poised on a stone path, ready to delve into the mysteries of a lush jungle. Ruins peek through the foliage, hinting at ancient secrets, while the golden light of the setting sun casts a hopeful glow on their journey.
Prompt
poses standing-in-a-row: excited, curious, adventurous ; A team of explorers; medium shot; adventure; a lush jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : A group of four explorers, wearing backpacks and hats, stand on a stone bridge, looking towards the camera, with a lush green jungle and ancient ruins in the background. The sun is shining brightly and the sky is clear.
Aesthetic Score : 0.7
Mood : adventurous, tropical, mysterious
Quality
Entropy : 6.85
Noise : 111
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant artifacts or errors in the image.
Eyes on the Prize: Esports Athlete in the Zone
A young woman, clad in a blue and black jersey and headset, is locked in on the game. The intense lighting and focused composition capture the competitive spirit of esports, highlighting the player’s unwavering concentration.
Prompt
poses standing-in-a-row: focused, competitive, passionate ; A group of gamers; close-up shot; gaming; a brightly lit esports arena with cheering fans; cinematic
Characteristic
Shot : A young woman wearing a headset is focused on a computer keyboard, with two other figures in the background, likely in a gaming setting.
Aesthetic Score : 0.7
Mood : intense, focused, competitive
Quality
Entropy : 6.49
Noise : 96
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears slightly overexposed, especially in the background, leading to some blown-out highlights. The overall color saturation might be slightly excessive, creating a slightly artificial look.
Friends Conquer the Peak, Finding Joy in the Majestic View
Three friends stand atop a snow-capped mountain, their backpacks a testament to their adventure. The vastness of the landscape inspires awe and wonder, reflecting their happy and relaxed mood. This breathtaking scene captures the essence of friendship and the thrill of exploration.
Prompt
poses standing-in-a-row: happy, relaxed, joyful ; A family of tourists; long shot; tourism; a breathtaking view of a mountain range with a clear blue sky; cinematic
Characteristic
Shot : Three friends stand on a mountain ridge, looking out at a vast mountain range in the distance. They are wearing casual clothing and backpacks, suggesting they are on a hiking trip. The sky is blue and clear, and the mountains are covered in green trees and snow.
Aesthetic Score : 0.7
Mood : joyful, adventurous, serene
Quality
Entropy : 6.77
Noise : 100
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors or artifacts
A Journey Through the Desert: Hope and Mystery in Every Step
A group of adventurers traverse a sun-drenched desert path, their silhouettes framed against the majestic backdrop of palm trees and distant mountains. The play of light and shadow adds a sense of depth and mystery, while the balanced composition guides the eye towards the horizon, hinting at the promise of a hopeful future.
Prompt
poses standing-in-a-row: free-spirited, adventurous, optimistic ; A group of backpackers; medium shot; travel; a dusty road leading to a distant village with palm trees; cinematic
Characteristic
Shot : A group of people are hiking on a dirt road in a desert landscape. There are palm trees and mountains in the background. The sun is shining brightly.
Aesthetic Score : 0.7
Mood : peaceful, adventurous, hopeful
Quality
Entropy : 6.78
Noise : 104
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Nine Lives of Passion: A Woman’s Theatrical Transformation
A captivating performance unfolds in a dark theater, where a woman’s passionate singing is captured in nine distinct poses and costumes. The repetitive structure creates a mesmerizing and intense visual experience, highlighting the dramatic nature of her performance.
Prompt
poses standing-in-a-row: harmonious, powerful, emotional ; A choir singing in harmony; close-up shot; groups; a dimly lit stage with spotlights; cinematic
Characteristic
Shot : A woman is singing on a stage with lights and a microphone
Aesthetic Score : 0.4
Mood : dramatic, passionate, intense
Quality
Entropy : 5.81
Noise : 98
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.00
Image errors : The image has noticeable noise in the darker areas, pixelation in some areas, and a blurry background.
Vibrant Dance Performance Captures Festive Spirit
Three women in dazzling costumes bring joy and energy to the stage, their movements illuminated by spotlights and enhanced by colorful circular props. The scene radiates a festive mood, capturing the excitement and vibrancy of the performance.
Prompt
poses standing-in-a-row: energetic, synchronized, joyful ; A line of dancers; wide shot; groups; a brightly lit stage with colorful costumes; cinematic
Characteristic
Shot : Three women in colorful traditional costumes are dancing on a stage with colorful lights in the background.
Aesthetic Score : 0.7
Mood : joyful, festive, vibrant
Quality
Entropy : 6.63
Noise : 102
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant errors in the image.
Golden Hour Friendships on the Beach
Five friends bask in the warm glow of a sunset on the beach, capturing a moment of pure joy and carefree summer vibes. The dramatic light creates a warm and inviting atmosphere, making this a picture-perfect memory.
Prompt
poses standing-in-a-row: relaxed, happy, nostalgic ; A group of friends; medium shot; groups; a sunset over a beach with waves crashing in the background; cinematic
Characteristic
Shot : Five friends are standing on a beach at sunset. The water is calm and the sky is a beautiful orange and pink.
Aesthetic Score : 0.7
Mood : happy, carefree, fun
Quality
Entropy : 6.86
Noise : 96
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors, but the image is slightly underexposed.
Scientists Unveil Mysterious Glowing Object in Lab
Two scientists in white lab coats work intently in a lab, their focus drawn to a glowing cylindrical object within a glass container. The scene, filled with data screens and a serious mood, hints at a groundbreaking discovery and raises questions about the nature of the mysterious object.
Prompt
poses standing-in-a-row: focused, determined, innovative ; A team of scientists; close-up shot; groups; a laboratory with complex machinery and glowing screens; cinematic
Characteristic
Shot : Two scientists working in a futuristic laboratory, focused on an experiment involving a glowing device. The background features numerous large screens displaying technical data.
Aesthetic Score : 0.6
Mood : serious, technological, focused
Quality
Entropy : 6.79
Noise : 102
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.40
Image errors : No significant image errors
Protesters Demand ‘HOTM POTER’ in Somber March
A group of masked individuals marched through an urban street, their faces set in determination as they held signs and a banner proclaiming ‘HOTM POTER’ in bold letters. The overcast sky and muted light added to the somber mood of the protest, highlighting the urgency of their message.
Prompt
poses standing-in-a-row: determined, passionate, hopeful ; A group of protesters; long shot; groups; a city street with banners and signs; cinematic
Characteristic
Shot : A group of people are holding a banner that says “HOTM POTER” in a city street. They are all wearing masks. There are several other signs in the background.
Aesthetic Score : 0.3
Mood : intense, serious, political
Quality
Entropy : 6.60
Noise : 104
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.05
Image errors : There are some minor artifacts in the image, such as blurring around the edges of the people and signs. However, these artifacts are not very noticeable.
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.51, which sits at the low end of the “good” range (0.5 to 0.75). This means the model generally captured the camera positions described in the prompts, though only by a narrow margin.
- Shot Analysis: The model scored 0.54, also within the “good” range. This indicates the model understood the scene described in the prompt and created an image that reflects that understanding.
- Aesthetic Analysis: The model scored 0.11, which falls just outside the “very good” range (-0.2 to 0.1). This suggests that the aesthetic of the generated images deviated from the aesthetic described in the prompts.
Overall, the model demonstrates a good understanding of camera positions and scene composition, but needs improvement in capturing the desired aesthetic.
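As a rough cross-check, the per-image numbers reported above can be summarized and the three conclusion scores tested against the ranges quoted in the breakdown. This is a minimal sketch: the post does not explain how 0.51, 0.54, and 0.11 were derived from the per-image metrics, so the code only averages the listed values and checks range membership. The helper `in_range` and the choice of inclusive bounds are assumptions.

```python
from statistics import mean

# Per-image values transcribed from the sections above, in post order.
aesthetic_scores = [0.7, 0.7, 0.7, 0.7, 0.7, 0.4, 0.7, 0.7, 0.6, 0.3]
clip_scores = [0.30, 0.33, 0.28, 0.28, 0.28, 0.28, 0.25, 0.31, 0.24, 0.23]

# Ranges quoted in the conclusion.
GOOD = (0.5, 0.75)
AESTHETIC_VERY_GOOD = (-0.2, 0.1)


def in_range(score, low, high):
    """Return True if score lies within [low, high] (inclusive bounds assumed)."""
    return low <= score <= high


summary = {
    "mean_aesthetic": round(mean(aesthetic_scores), 3),
    "mean_clip": round(mean(clip_scores), 3),
    "camera_position_good": in_range(0.51, *GOOD),
    "shot_analysis_good": in_range(0.54, *GOOD),
    "aesthetic_very_good": in_range(0.11, *AESTHETIC_VERY_GOOD),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The two protest and choir images, with aesthetic scores of 0.3 and 0.4, are the ones that pull the average aesthetic score below the 0.7 that most images received.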