AI's Artistic Journey: Capturing Poses, But Missing the Shot with Stable-diffusion
Generating realistic and evocative images is a rapidly evolving area of artificial intelligence. One intriguing line of exploration is creating images from specific poses and scene descriptions. This blog post examines the results of an AI model given that task, highlighting its strengths and weaknesses in rendering dramatic low-angle poses.
Created with: stability-ai-core
Silhouetted Against the Sunset: A Hiker’s Moment of Triumph
A lone hiker stands on a mountain peak, their silhouette a stark contrast against the fiery sunset sky. The vastness of the mountain range and the dramatic clouds create a sense of awe and inspiration, capturing the essence of solitude and achievement.
Prompt
poses low-angle: inspiring, triumphant ; A lone figure standing atop a mountain peak, silhouetted against the rising sun; wide shot; heroism; majestic mountain range with clouds swirling below; cinematic
Characteristic
Shot : A lone hiker stands on the peak of a mountain at sunset. The hiker is silhouetted against the bright orange and pink sky. The mountain range stretches out behind the hiker, creating a dramatic and beautiful scene.
Aesthetic Score : 0.8
Mood : inspirational, dramatic, peaceful
Quality
Entropy : 6.65
Noise : 70
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible image errors.
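The post does not say how the Entropy value in the Quality block is measured. One common definition is the Shannon entropy of an image's 8-bit grayscale histogram, which runs from 0 bits (a flat image) to 8 bits (every gray level equally likely); the values reported in this post, between 6.22 and 6.83, would fit that scale. The sketch below works under that assumption only. The function name and the toy pixel lists are hypothetical and are not taken from the tool that produced these numbers.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of an iterable of pixel values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one pixel")
    # Sum p * log2(1/p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs (hypothetical): a flat gray patch and one pixel of every gray level.
flat_patch = [128] * 16
all_levels = list(range(256))

print(shannon_entropy(flat_patch))  # 0.0
print(shannon_entropy(all_levels))  # 8.0
```

For a real file, the pixel values could be supplied by Pillow, for example by iterating over `Image.open(path).convert("L").tobytes()`, assuming Pillow is installed.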
Lost in the Jungle: A Mysterious Journey Awaits
Four explorers, clad in jungle gear, navigate a dense, hazy forest towards ancient stone ruins. The low angle and soft light create a sense of mystery and suspense, hinting at an adventure waiting to unfold.
Prompt
poses low-angle: mysterious, adventurous ; A group of explorers navigating a dense jungle, their faces illuminated by the light of their headlamps; medium shot; adventure; lush green foliage and ancient ruins in the background; cinematic
Characteristic
Shot : Four men in explorer attire are walking through a jungle path. There is a ruined structure in the background, obscured by dense foliage.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, eerie
Quality
Entropy : 6.66
Noise : 93
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight blurring and softness in the background and leaves might be the result of the soft lighting and shallow depth of field.
Lost in the Neon Glow: A Gamer’s Intense Focus
A dimly lit room, bathed in the vibrant hues of neon lights, becomes a stage for a man’s intense gaming session. The low-key lighting adds a layer of mystery and drama, highlighting his focused expression as he navigates the digital world.
Prompt
poses low-angle: intense, focused ; A gamer’s hands intensely manipulating a controller, their face illuminated by the glow of the monitor; close-up; gaming; a vibrant, futuristic cityscape projected on the screen; cinematic
Characteristic
Shot : A young man is sitting in front of a computer screen, focused on playing a game with a controller in his hands. The screen behind him displays a vibrant city skyline with a neon glow. The room is lit with blue and orange hues, creating a moody atmosphere.
Aesthetic Score : 0.7
Mood : intense, focused, futuristic
Quality
Entropy : 6.22
Noise : 61
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has slight noise in the background, especially in the darker areas. There are also some minor artifacts around the edges of the subject, particularly in the hair.
Majestic Bronze Statue Commands City Square
A bronze statue of a man in historical attire stands proudly in a bustling city square, its grandeur enhanced by the surrounding buildings and the vibrant life unfolding in the background. The statue’s central position and the clear blue sky with fluffy clouds create a sense of historical significance and majestic presence.
Prompt
poses low-angle: awe-inspiring, historical ; A towering statue of a historical figure, viewed from the perspective of a tourist looking up in awe; wide shot; tourism; a bustling city square with other tourists and vendors; cinematic
Characteristic
Shot : A bronze statue of a man in a historical outfit, standing on a pedestal in a city square. The statue is surrounded by people and buildings. The sky is blue with white clouds.
Aesthetic Score : 0.6
Mood : historical, monumental, urban
Quality
Entropy : 6.83
Noise : 74
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Solitude in the Desert: A Moment of Tranquility
A lone figure finds peace amidst the vastness of the desert, bathed in the golden light of the sun. The scene evokes a sense of serenity and isolation, with long shadows stretching across the dunes.
Prompt
poses low-angle: solitude, contemplative ; A lone traveler gazing out at a vast desert landscape, their back to the camera; medium shot; travel; endless sand dunes stretching out to the horizon; cinematic
Characteristic
Shot : A lone figure sits on a sand dune in a vast desert, with the sun shining brightly in the sky and distant mountains on the horizon.
Aesthetic Score : 0.7
Mood : serene, contemplative, isolated
Quality
Entropy : 6.68
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No obvious image errors.
Confetti Rain and Cheering Faces: The Joy of a Live Event
Capture the vibrant energy of a concert or sporting event with this image. A sea of smiling faces, confetti raining down, and the palpable excitement of the crowd create a truly celebratory atmosphere.
Prompt
poses low-angle: joyful, celebratory ; A group of friends celebrating a victory, their arms raised in the air, viewed from the perspective of someone standing below; wide shot; groups; a brightly lit party scene with confetti and balloons; cinematic
Characteristic
Shot : A crowd of people are celebrating and cheering with confetti falling around them. The image is taken from a low angle, looking up at the crowd.
Aesthetic Score : 0.7
Mood : joyful, celebratory, energetic
Quality
Entropy : 6.56
Noise : 79
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No errors; the image is slightly noisy and not very sharp.
Firefighter Stands Tall Amidst Blazing Inferno
A lone firefighter, silhouetted against the fiery backdrop of a city blaze, embodies courage and determination. The intense flames and billowing smoke create a dramatic scene, highlighting the hero’s unwavering focus amidst the chaos.
Prompt
poses low-angle: intense, heroic ; A lone firefighter battling a raging inferno, their silhouette framed against the flames; medium shot; heroism; a burning building with smoke billowing into the sky; cinematic
Characteristic
Shot : A firefighter in full gear stands amidst the debris of a burning building, with a fire hose in hand. The background shows a building engulfed in flames, with smoke billowing out.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.60
Noise : 70
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image does not exhibit any notable artifacts or errors.
Precarious Heights: Rock Climbers Defy Gravity on a Sheer Cliff Face
Witness the breathtaking audacity of four rock climbers as they ascend a towering cliff, secured only by ropes and harnesses. The scene, set against a backdrop of rugged mountains and a winding river, evokes a sense of adventure, daring, and isolation. The perspective from the cliff edge creates a dizzying sense of vertigo, highlighting the climbers’ bravery in the face of danger.
Prompt
poses low-angle: thrilling, adventurous ; A group of adventurers rappelling down a sheer cliff face, their ropes dangling below; medium shot; adventure; a breathtaking view of a mountain range and a valley below; cinematic
Characteristic
Shot : Four climbers ascending a steep cliff face, the landscape below is a vast valley with a winding river.
Aesthetic Score : 0.8
Mood : adventurous, daring, scenic
Quality
Entropy : 6.65
Noise : 89
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Lost in the Game: A Moment of Intense Focus
A dimly lit room, the glow of screens, and a player’s hands flying across the keyboard. This image captures the raw intensity of gaming, with a futuristic edge and a sense of suspense that pulls you right into the action.
Prompt
poses low-angle: immersive, fantastical ; A gamer’s hands deftly navigating a virtual world, their fingers flying across the keyboard; close-up; gaming; a vibrant, fantasy world displayed on the monitor; cinematic
Characteristic
Shot : A person is playing video games on a computer with multiple monitors, the scene is lit with blue and orange neon lights.
Aesthetic Score : 0.6
Mood : intense, focused, futuristic
Quality
Entropy : 6.60
Noise : 58
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : No noticeable errors
Golden Hour at the Temple: Tourists Embrace History and Adventure
A group of tourists stand before an ancient temple entrance, bathed in the warm glow of the setting sun. The golden light streaming through the doorway creates a sense of mystery and grandeur, capturing the tranquil and adventurous spirit of the moment.
Prompt
poses low-angle: awe-inspiring, historical ; A group of tourists standing in awe before a magnificent ancient temple, their faces illuminated by the setting sun; wide shot; tourism; a sprawling temple complex with intricate carvings and statues; cinematic
Characteristic
Shot : A group of people are standing in front of an ancient temple in Southeast Asia. The sun is setting in the background, casting a warm glow on the scene.
Aesthetic Score : 0.7
Mood : peaceful, majestic, adventurous
Quality
Entropy : 6.67
Noise : 88
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Conclusion
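The results show that the generative AI model came closest to the prompts on aesthetics and was less accurate on camera position and shot composition. Going by the rating attached to each score, a lower value indicates a closer match to the prompt. Here’s a breakdown: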
- Camera Position: The model scored 0.48, which is considered okay. The camera positions in the generated images differed somewhat from what the prompts requested.
- Shot Analysis: The model scored 0.56, which is also considered okay. The shot compositions in the generated images differed somewhat from what the prompts called for.
- Aesthetic Analysis: The model scored 0.29, which is considered pretty good. The aesthetics of the generated images were fairly close to what the prompts called for.
Overall, the model seems to be better at understanding and implementing aesthetic elements than it is at accurately capturing camera positions and shot compositions.
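The per-image metrics listed in each section can also be averaged to give a rough picture of the whole set. The sketch below transcribes the ten sets of values from this post and computes simple means, plus the images with the highest and lowest Prompt Clip Score. The short titles and variable names are labels chosen for illustration, and the averages are plain arithmetic on the published numbers; they are not the method used to produce the three scores in the conclusion.

```python
from statistics import mean

# Values transcribed from the ten image sections of this post.
# Columns: short title (illustrative label), aesthetic score, entropy,
# noise, prompt CLIP score, likelihood of AI.
results = [
    ("Hiker at sunset",       0.8, 6.65, 70, 0.25, 0.10),
    ("Jungle explorers",      0.6, 6.66, 93, 0.32, 0.20),
    ("Gamer, neon glow",      0.7, 6.22, 61, 0.29, 0.10),
    ("Bronze statue",         0.6, 6.83, 74, 0.28, 0.10),
    ("Desert solitude",       0.7, 6.68, 64, 0.27, 0.20),
    ("Confetti crowd",        0.7, 6.56, 79, 0.27, 0.10),
    ("Firefighter",           0.7, 6.60, 70, 0.27, 0.20),
    ("Rock climbers",         0.8, 6.65, 89, 0.30, 0.10),
    ("Gamer, keyboard",       0.6, 6.60, 58, 0.24, 0.30),
    ("Temple at golden hour", 0.7, 6.67, 88, 0.34, 0.10),
]

columns = ["aesthetic", "entropy", "noise", "clip", "ai_likelihood"]

# Mean of each metric column, rounded to three decimals.
summary = {
    name: round(mean(row[i + 1] for row in results), 3)
    for i, name in enumerate(columns)
}

# Images with the highest and lowest prompt CLIP score (column index 4).
best = max(results, key=lambda row: row[4])
worst = min(results, key=lambda row: row[4])

for name, value in summary.items():
    print(f"mean {name}: {value}")
print(f"highest prompt CLIP score: {best[0]} ({best[4]})")
print(f"lowest prompt CLIP score: {worst[0]} ({worst[4]})")
```

On these numbers the mean aesthetic score is 0.69 and the mean Prompt Clip Score is 0.283, with the temple image scoring highest on prompt match (0.34) and the keyboard gaming image lowest (0.24).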