AI Struggles to Capture the Essence of Dramatic Poses with Stable Diffusion
Dramatic poses are a powerful tool in visual storytelling, used to convey emotion, heroism, and a sense of grandeur. They often involve dynamic angles, striking silhouettes, and a strong connection to the surrounding environment. Generating images with dramatic poses, however, is a significant challenge for AI models, because the result depends on a complex interplay of camera position, shot composition, and aesthetic style. In this blog post, we explore a case study of ten generated images that highlights the limitations of AI in capturing the essence of dramatic poses.
Created with: stability-ai-core
Solitude on the Summit: A Hiker’s Moment of Awe
A lone hiker stands silhouetted against a breathtaking vista of misty mountains and valleys. The dramatic clouds and vast landscape evoke a sense of serenity and awe, capturing the essence of contemplative solitude.
Prompt
poses thoughtful-pose: determined, contemplative ; Lone figure standing on a mountain peak; wide shot; heroism; dramatic sky with clouds; cinematic
Characteristic
Shot : A lone hiker stands on the peak of a mountain, gazing at a vast valley shrouded in mist and clouds. The sky is a canvas of dramatic, swirling clouds, with rays of sunlight breaking through.
Aesthetic Score : 0.8
Mood : serene, contemplative, majestic
Quality
Entropy : 6.73
Noise : 64
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
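The post does not say how the Entropy value under Quality is computed. The values reported here (5.96 to 6.81) are consistent with the Shannon entropy, in bits, of an 8-bit grayscale intensity histogram, which tops out at 8. Here is a minimal sketch under that assumption; the function name `shannon_entropy` and the two synthetic test images are illustrative and not taken from the original pipeline.

```python
import numpy as np

def shannon_entropy(gray: np.ndarray) -> float:
    """Shannon entropy, in bits, of an 8-bit grayscale image's intensity histogram."""
    # Count how many pixels take each of the 256 possible intensity values.
    counts = np.bincount(gray.ravel(), minlength=256)
    # Keep only intensities that occur, then convert counts to probabilities.
    probs = counts[counts > 0] / gray.size
    # H = sum(p * log2(1 / p)); log2(1 / p) avoids returning -0.0 for flat images.
    return float(np.sum(probs * np.log2(1.0 / probs)))

# Synthetic inputs (assumptions for illustration only):
flat = np.full((64, 64), 128, dtype=np.uint8)            # one intensity everywhere
ramp = np.tile(np.arange(256, dtype=np.uint8), (64, 1))  # every intensity equally often

print(shannon_entropy(flat))  # lowest possible value
print(shannon_entropy(ramp))  # highest possible value for 8-bit data
```

Under this reading, a score near 6.7 means the image uses a broad spread of tones, while a lower score, such as the 5.96 of the night-time pier scene, points to a histogram dominated by fewer intensity levels.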
Lost in the Jungle: A Man’s Quest for Ancient Secrets
A lone explorer navigates the dense jungle, his gaze fixed on a weathered map. An ancient structure looms in the background, hinting at forgotten mysteries. The air is thick with anticipation, as he embarks on a journey fueled by determination and a thirst for discovery.
Prompt
poses thoughtful-pose: curious, adventurous ; Explorer looking at a map, surrounded by ancient ruins; medium shot; adventure; jungle foliage; cinematic
Characteristic
Shot : A man in a jungle setting, wearing a backpack, looking at a map. He is sitting in front of an ancient stone structure. The scene is lush and green.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, thoughtful
Quality
Entropy : 6.81
Noise : 87
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
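The Prompt Clip Score is also reported without a definition. CLIP-based scores are commonly the cosine similarity between a text embedding of the prompt and an image embedding of the result, and raw values in the 0.25 to 0.30 range are typical for that measure. The sketch below shows only the similarity step, assuming that interpretation; the short vectors are made-up stand-ins for real CLIP embeddings, and no CLIP model is loaded.

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional "embeddings" (real CLIP vectors have hundreds of dimensions).
prompt_embedding = [0.9, 0.1, 0.3, 0.2]
image_embedding = [0.2, 0.8, 0.1, 0.4]

score = cosine_similarity(prompt_embedding, image_embedding)
print(f"prompt/image similarity: {score:.2f}")
```

If the post's scores are computed this way, the narrow spread across the gallery (0.25 to 0.29) suggests the metric barely separates images that follow the prompt from those that do not.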
Neon Glow, Focused Flow: A Gamer’s World
Immerse yourself in the vibrant world of a dedicated gamer, bathed in neon light. This scene captures the intensity and focus of a player lost in the digital realm, creating a futuristic and captivating atmosphere.
Prompt
poses thoughtful-pose: intense, focused ; Gamer intensely focused on a screen, hands on a controller; close-up; gaming; neon lights and gaming peripherals; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer, typing on a keyboard. He is illuminated by blue and red neon lights, giving the scene a futuristic and slightly edgy feel.
Aesthetic Score : 0.7
Mood : intense, focused, futuristic
Quality
Entropy : 6.35
Noise : 60
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, likely due to camera shake. The neon lights are overexposed, causing some areas to be blown out.
Lost in the City Lights: A Moment of Contemplation
A young woman finds solitude on a rooftop, her gaze lost in the sprawling cityscape. The blurred background and her thoughtful pose evoke a sense of loneliness and introspection, capturing the essence of urban life.
Prompt
poses thoughtful-pose: awe-struck, contemplative ; Tourist gazing at a breathtaking cityscape; medium shot; tourism; bustling city streets; cinematic
Characteristic
Shot : A young woman is sitting on a ledge, looking out over a cityscape, likely New York City. The skyline is visible in the background, with the Freedom Tower prominently featured. The woman is wearing a brown jacket and jeans, and her hair is long and flowing.
Aesthetic Score : 0.7
Mood : pensive, melancholic, contemplative
Quality
Entropy : 6.81
Noise : 64
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image. The quality is good.
Sunset Silhouette: Two Figures Find Tranquility on a Cliffside
A breathtaking sunset paints the sky in golden hues as two figures sit perched on a rocky cliff, their silhouettes stark against the vibrant backdrop. The vast ocean stretches out before them, evoking a sense of peace and tranquility. This serene scene captures the intimacy and isolation of the moment, leaving a lasting impression of beauty and serenity.
Prompt
poses thoughtful-pose: relaxed, introspective ; Backpackers sitting on a cliff overlooking a vast ocean; wide shot; travel; sunset sky; cinematic
Characteristic
Shot : Two people are sitting on a cliff overlooking a large body of water at sunset.
Aesthetic Score : 0.8
Mood : tranquil, romantic, peaceful
Quality
Entropy : 6.66
Noise : 76
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has slight overexposure, leading to some loss of detail in the sky and the foreground. There’s a slight amount of noise in the shadows.
Campfire Tales Under a Starry Sky
A group of friends gather around a crackling campfire, sharing stories and laughter under a breathtaking night sky. The warm glow of the fire creates a cozy atmosphere, while the vastness of the stars evokes a sense of adventure and nostalgia.
Prompt
poses thoughtful-pose: intimate, nostalgic ; Group of friends huddled around a campfire, sharing stories; medium shot; groups; starry night sky; cinematic
Characteristic
Shot : A group of five young adults are sitting around a campfire under a starry night sky. The scene is set in a mountainous landscape.
Aesthetic Score : 0.7
Mood : serene, introspective, cozy
Quality
Entropy : 6.38
Noise : 71
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights. The starry sky appears artificial, possibly due to over-processing or AI enhancement.
Silhouetted Solitude: A Moment of Contemplation in the City
A lone figure stands on a wooden bridge, their silhouette stark against the glittering cityscape. The mood is melancholic, contemplative, and urban, with the reflection of the city lights in the water below adding to the sense of mystery and isolation.
Prompt
poses thoughtful-pose: reflective, hopeful ; A lone figure standing on a bridge, looking out at the city lights; medium shot; heroism; cityscape at night; cinematic
Characteristic
Shot : A man in a black coat is standing on a wooden pier at night, looking out at a city skyline. The city is lit up with lights, and the water is reflecting the light from the city.
Aesthetic Score : 0.7
Mood : lonely, contemplative, urban
Quality
Entropy : 5.96
Noise : 73
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors
Uncharted Territory: A Tense Encounter in the Lush Wilderness
Three explorers, clad in rugged gear, stand and sit amidst moss-covered rocks in a dense, verdant forest. Their expressions and postures convey a palpable sense of anticipation and a hint of danger lurking in the unknown. The scene evokes a mysterious and adventurous mood, leaving viewers on the edge of their seats.
Prompt
poses thoughtful-pose: determined, cautious ; A group of adventurers navigating a dense forest; wide shot; adventure; lush green foliage; cinematic
Characteristic
Shot : Three men in green military-style clothing stand in a lush green forest with tall trees and ferns. They are looking off to the side, seemingly on a mission.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, serious
Quality
Entropy : 6.81
Noise : 93
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
Focused and Determined: Gamer Ready for Action
A young man, radiating confidence, sits in his gaming chair, headphones on, ready to conquer the digital world. The dramatic lighting highlights his face, emphasizing his focus and determination. Multiple computer monitors in the background hint at the immersive world he’s about to enter.
Prompt
poses thoughtful-pose: triumphant, excited ; A gamer celebrating a victory, fist raised in the air; close-up; gaming; vibrant gaming setup; cinematic
Characteristic
Shot : A young man is sitting in a gaming chair, adjusting his headphones in a dark room with computer monitors in the background.
Aesthetic Score : 0.6
Mood : focused, serious, determined
Quality
Entropy : 6.20
Noise : 66
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but there are a few minor artifacts in the background, particularly on the computer screens.
Silhouettes of Hope: A Family Welcomes the Sunset
A tranquil scene unfolds as three figures stand on a wet beach, their backs to the camera, gazing at the breathtaking sunset. The golden light paints the sky in vibrant hues, casting long shadows and creating a sense of peace and hope. The gentle waves lapping at the shore add to the serene atmosphere, making this a moment to cherish.
Prompt
poses thoughtful-pose: peaceful, hopeful ; A family standing on a beach, watching the sunrise; wide shot; tourism; golden sunrise over the ocean; cinematic
Characteristic
Shot : A family of three, two men and a young boy, are standing on a beach at sunset. The sun is setting behind them, casting a warm glow on the sky and the water. The men are holding hands, and the boy is holding the hand of one of the men. They are looking out at the ocean, and their silhouettes are visible against the sky.
Aesthetic Score : 0.7
Mood : peaceful, heartwarming, family
Quality
Entropy : 6.78
Noise : 70
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image seems to have some slight noise, which is typical for high ISO images. The lighting isn’t perfectly even, causing some slight contrast variation.
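Before turning to the conclusion, it helps to see the ten sets of per-image numbers side by side. The sketch below simply averages the values listed in the gallery above; the shortened titles, field names, and the `column_means` helper are illustrative choices, and nothing here reproduces the scoring used in the conclusion.

```python
from statistics import mean

# Per-image values copied from the gallery above:
# (title, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
IMAGES = [
    ("Solitude on the Summit", 0.8, 6.73, 64, 0.25, 0.20),
    ("Lost in the Jungle", 0.7, 6.81, 87, 0.29, 0.10),
    ("Neon Glow, Focused Flow", 0.7, 6.35, 60, 0.26, 0.10),
    ("Lost in the City Lights", 0.7, 6.81, 64, 0.28, 0.20),
    ("Sunset Silhouette", 0.8, 6.66, 76, 0.26, 0.10),
    ("Campfire Tales Under a Starry Sky", 0.7, 6.38, 71, 0.29, 0.60),
    ("Silhouetted Solitude", 0.7, 5.96, 73, 0.27, 0.10),
    ("Uncharted Territory", 0.6, 6.81, 93, 0.27, 0.10),
    ("Focused and Determined", 0.6, 6.20, 66, 0.27, 0.20),
    ("Silhouettes of Hope", 0.7, 6.78, 70, 0.27, 0.10),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

def column_means(rows):
    """Average each numeric column; column 0 holds the title and is skipped."""
    return {name: mean(row[i + 1] for row in rows) for i, name in enumerate(FIELDS)}

means = column_means(IMAGES)
most_ai_like = max(IMAGES, key=lambda row: row[5])[0]

for name, value in means.items():
    print(f"{name}: {value:.3f}")
print("highest likelihood of AI:", most_ai_like)
```

The averages come out at roughly 0.70 for aesthetic score, 6.55 for entropy, 72.4 for noise, 0.27 for prompt CLIP score, and 0.18 for likelihood of AI, with the campfire scene as the only image the evaluator rated above 0.5.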
Conclusion
The results show that the generative AI model fell short of the “good” range on camera position and shot analysis, and was rated weakest on aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t quite capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.485, also below the “good” range. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create the expected shot composition.
- Aesthetic Analysis: The model scored 0.02, against a “very good” range quoted as -0.2 to 0.1. The evaluation read this as the generated images’ aesthetic deviating considerably from the aesthetic described in the prompt, although 0.02 falls numerically inside the quoted interval, so this reading should be treated with caution.
Overall, the model struggled to accurately interpret the prompt’s instructions regarding camera position, shot composition, and aesthetic style.
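The range comparisons in the breakdown can be checked mechanically. This is a minimal sketch that uses only the scores and ranges quoted in this post. It assumes the “good” range of 0.5 to 0.75 applies to shot analysis as well as camera position, and the `position` helper is a hypothetical name rather than part of the evaluation tooling.

```python
def position(score: float, low: float, high: float) -> str:
    """Say whether a score sits below, inside, or above the closed range [low, high]."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "inside"

# (score, range low, range high) as quoted in the conclusion.
QUOTED = {
    "camera position": (0.35, 0.5, 0.75),
    "shot analysis": (0.485, 0.5, 0.75),   # range assumed equal to camera position
    "aesthetic analysis": (0.02, -0.2, 0.1),
}

verdicts = {name: position(*values) for name, values in QUOTED.items()}

for name, verdict in verdicts.items():
    score, low, high = QUOTED[name]
    print(f"{name}: {score} is {verdict} [{low}, {high}]")
```

Camera position and shot analysis both land below their range, matching the breakdown. The aesthetic score of 0.02 lands inside the quoted interval of -0.2 to 0.1, so either that range or the way it is meant to be read deserves a second look.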