AI Captures the Scene, But Struggles with the Angle in Stable Diffusion
Dramatic poses are a powerful tool in visual storytelling, conveying emotions and adding depth to a scene. They are often used in film, photography, and art to create a sense of drama, excitement, or suspense. For example, a lone warrior standing tall against a backdrop of a raging battle conveys heroism and defiance. An adventurer standing on a cliff edge, gazing at a majestic mountain range, evokes a sense of wonder and adventure. AI image generation is rapidly evolving, and its ability to understand and create dramatic poses is a key area of development. This blog post explores the challenges and potential of AI in capturing these dynamic poses.
Created with: stability-ai-core
Knight of Fire: A Dramatic Portrait of Power
A knight in black armor stands amidst a fiery landscape, his swords held high in dramatic poses. The intense lighting and powerful stances create a sense of epic action and unwavering strength.
Prompt
poses action-pose: determined, heroic ; Lone warrior; wide shot; Heroism; Epic battle scene with smoke and fire; cinematic
Characteristic
Shot : A knight in full armor is standing on a rocky surface with fire around him. The image is a collage of nine separate images of the knight in different poses.
Aesthetic Score : 0.7
Mood : epic, dramatic, powerful
Quality
Entropy : 6.89
Noise : 77
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be a collage, and the different images don’t quite line up. The fire looks artificial, likely created with CGI.
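The prompts in this post all follow the same semicolon-separated layout: a mood prefix, a subject, a shot type, a theme, a setting, and a style. As a minimal sketch, such a prompt could be split into named parts as shown here. The field names (`moods`, `subject`, `shot`, `theme`, `setting`, `style`) and the helper `parse_pose_prompt` are hypothetical labels chosen for illustration, not something the generation tool defines.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are illustrative assumptions, not part of any tool's API.
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses action-pose: ...; ...' prompt into its six parts."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(parts)}")
    head, subject, shot, theme, setting, style = parts
    # Everything after the first ':' in the head is the comma-separated mood list.
    _, _, mood_text = head.partition(":")
    moods = [m.strip() for m in mood_text.split(",") if m.strip()]
    return PosePrompt(moods, subject, shot, theme, setting, style)


knight = parse_pose_prompt(
    "poses action-pose: determined, heroic ; Lone warrior; wide shot; "
    "Heroism; Epic battle scene with smoke and fire; cinematic"
)
print(knight.shot, knight.moods)
```

Pulling out the shot type this way makes it easy to compare the requested framing (here, a wide shot) against the Shot description the evaluation produces.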
A Moment of Triumph on the Mountaintop
A lone hiker stands triumphantly on a mountain peak, arms outstretched, embracing the breathtaking view of a valley, lake, and snow-capped mountains. The cloudy sky, with glimpses of sunlight, adds to the inspirational and adventurous mood of this peaceful scene. The hiker’s silhouette against the vast landscape evokes a sense of awe and wonder, reminding us of the power and beauty of nature.
Prompt
poses action-pose: adventurous, awe-inspired ; Adventurer standing on a cliff edge; medium shot; Adventure; Majestic mountain range with clouds; cinematic
Characteristic
Shot : A hiker stands on a mountain peak with his arms outstretched, overlooking a valley with a lake and snow-capped mountains in the distance.
Aesthetic Score : 0.8
Mood : inspirational, adventurous, majestic
Quality
Entropy : 6.84
Noise : 74
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : Minor chromatic aberration around the edges and slight blurriness in the distance.
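The post does not say how the Entropy figure is computed. Values between 6 and 7 would be consistent with the Shannon entropy of an 8-bit intensity histogram, which ranges from 0 to 8 bits, but that reading is an assumption. Under that assumption, a sketch of the calculation might look like this (`histogram_entropy` is a hypothetical helper name):

```python
import math


def histogram_entropy(counts):
    """Shannon entropy, in bits, of a histogram given as a list of bin counts.

    Assumption: this mirrors the post's 'Entropy' metric only if that metric
    is histogram-based Shannon entropy, which the post does not confirm.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one sample")
    # Empty bins contribute nothing; p * log2(1/p) for every occupied bin.
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


flat = [10] * 256             # every grey level equally common
single = [100] + [0] * 255    # one grey level only
print(histogram_entropy(flat), histogram_entropy(single))
```

With Pillow, `Image.open(path).convert("L").histogram()` returns a list of 256 counts that could be passed to this function. A higher value means the intensities are spread more evenly across the range.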
Lost in the Neon Glow: A Gamer’s Intense Focus
A man, bathed in the vibrant hues of neon lights, is completely absorbed in his video game. The dimly lit room amplifies the intensity of his focus, creating a captivating scene of pure gaming immersion.
Prompt
poses action-pose: focused, intense ; Gamer holding a controller; close-up; Gaming; Neon-lit gaming room with multiple screens; cinematic
Characteristic
Shot : A young man is sitting in a gaming chair, wearing headphones, and holding a video game controller. The room is lit with neon pink and blue lights, creating a vibrant and futuristic atmosphere. There are two computer monitors behind him, one of which is showing a video game interface.
Aesthetic Score : 0.6
Mood : intense, focused, futuristic
Quality
Entropy : 6.55
Noise : 66
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background. The neon lights are also quite harsh and detract from the overall aesthetic of the image.
Capturing City Joy: A Selfie in the Heart of the Action
A young woman radiates happiness as she snaps a selfie in a vibrant city square. The bright colors of the surrounding buildings and her infectious smile create a sense of carefree adventure. This image captures the joy and excitement of exploring a new place.
Prompt
poses action-pose: happy, excited ; Tourist taking a selfie in front of a famous landmark; medium shot; Tourism; Busy city square with people and street performers; cinematic
Characteristic
Shot : A young woman is taking a selfie in a city square. She is wearing a brown hat and a blue denim jacket. The square is lined with old buildings and there are people walking around.
Aesthetic Score : 0.6
Mood : happy, carefree, urban
Quality
Entropy : 6.72
Noise : 71
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Two Women, One Motorcycle, Endless Adventure
Capture the spirit of freedom as two women cruise through rolling vineyards on a winding road. The sun-drenched landscape and the sense of open space evoke a feeling of joy and adventure. This image is a testament to the beauty of exploration and the thrill of the open road.
Prompt
poses action-pose: free, adventurous ; Couple riding a motorcycle on a winding road; wide shot; Travel; Scenic countryside with rolling hills and vineyards; cinematic
Characteristic
Shot : Two women on a motorcycle driving on a winding road through a vineyard landscape.
Aesthetic Score : 0.7
Mood : adventure, freedom, joy
Quality
Entropy : 6.84
Noise : 83
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, which makes it difficult to see the women’s faces clearly. The motorcycle’s details are also slightly blurred.
Rooftop Cheers: Friends Celebrate Under the City Lights
A group of friends raise their glasses in a toast on a rooftop patio, bathed in warm light and enjoying the vibrant city skyline. The scene captures the joy and camaraderie of a celebratory night out.
Prompt
poses action-pose: joyful, celebratory ; Group of friends celebrating with drinks; medium shot; Groups; Rooftop bar with city lights in the background; cinematic
Characteristic
Shot : Four friends toasting each other with drinks on a rooftop overlooking a city at night.
Aesthetic Score : 0.7
Mood : joyful, celebratory, fun
Quality
Entropy : 6.54
Noise : 75
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts in the image, particularly around the edges of the subjects’ hair. These are likely due to compression or post-processing.
Silhouetted Hero, City at Dusk
A superhero stands tall on a rooftop, their silhouette a powerful presence against the vibrant cityscape at dusk. The scene evokes a sense of heroism, futuristic wonder, and dramatic tension.
Prompt
poses action-pose: powerful, confident ; Superhero landing on a rooftop; wide shot; Heroism; City skyline with skyscrapers and neon lights; cinematic
Characteristic
Shot : A superhero stands on a rooftop overlooking a city skyline at dusk.
Aesthetic Score : 0.7
Mood : heroic, futuristic, dramatic
Quality
Entropy : 6.82
Noise : 66
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly blurred, and the edges of the subject are somewhat pixelated.
Lost in the Lush: A Man’s Solitary Journey Through the Jungle
A solitary figure walks through a vibrant jungle, bathed in soft, diffused light. The scene evokes a sense of serenity and adventure, with the composition highlighting the man’s isolation and the mystery that surrounds him.
Prompt
poses action-pose: determined, adventurous ; Explorer navigating a jungle path; medium shot; Adventure; Lush green jungle with vines and sunlight filtering through the canopy; cinematic
Characteristic
Shot : A young man in a green shirt and khaki pants is walking through a dense jungle. The man is looking off to the side and the image is taken at a low angle.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, tropical
Quality
Entropy : 6.72
Noise : 93
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
In the Zone: Gamer’s Focus at the Tournament
A young man, clad in a black hoodie and headphones, sits intently in his chair, his gaze fixed on something off-camera. The blurred background of the gaming tournament crowd adds to the sense of intensity and focus, capturing the serious concentration of a competitor in the heat of the moment.
Prompt
poses action-pose: intense, focused ; Gamer competing in an esports tournament; close-up; Gaming; Stadium filled with cheering fans and bright lights; cinematic
Characteristic
Shot : A man wearing headphones sits in a chair in a crowded arena. He is wearing a black shirt with a white logo and has his arms crossed. The background is blurry and there are people in the stands and on the floor.
Aesthetic Score : 0.7
Mood : focused, serious, competitive
Quality
Entropy : 6.09
Noise : 62
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry in some areas. There is some noise in the background.
Sunset Smiles: A Family’s Joyful Moment on the Beach
Capture the warmth and happiness of a family vacation with this stunning sunset scene. The golden light bathes the beach, highlighting the family’s smiles as they stand against the backdrop of the ocean and distant mountains. This image evokes a sense of peace and joy, perfect for capturing the essence of a happy family moment.
Prompt
poses action-pose: happy, relaxed ; Family posing for a photo in front of a sunset; medium shot; Travel; Beach with golden sand and turquoise water; cinematic
Characteristic
Shot : A family of four is standing on a beach at sunset. The two adults are standing in the back, with the two children in front. They are all smiling and looking at the camera. The beach is sandy and the ocean is in the background.
Aesthetic Score : 0.7
Mood : happy, joyful, family
Quality
Entropy : 6.56
Noise : 73
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, especially in the sky and the water. The color tones are also a bit too saturated, especially in the sky. There is mild noise in the image, noticeable in the background.
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.54, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.01, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs to improve at capturing the intended camera position. The aesthetic quality of the generated images is rated very good.
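For readers who want to cross-check the per-image numbers, the sketch below gathers the Aesthetic Score and Prompt Clip Score reported for the ten images above and averages them, overall and by requested shot type. The short titles and the `summarize` helper are illustrative. These averages are not the same quantities as the Camera Position, Shot Analysis, and Aesthetic Analysis scores in the breakdown, whose calculation the post does not describe.

```python
from collections import defaultdict
from statistics import mean

# (short title, requested shot, aesthetic score, prompt CLIP score),
# copied from the per-image listings in this post.
RESULTS = [
    ("Knight of Fire", "wide shot", 0.7, 0.32),
    ("Mountaintop", "medium shot", 0.8, 0.28),
    ("Neon gamer", "close-up", 0.6, 0.29),
    ("City selfie", "medium shot", 0.6, 0.28),
    ("Motorcycle", "wide shot", 0.7, 0.31),
    ("Rooftop cheers", "medium shot", 0.7, 0.23),
    ("Silhouetted hero", "wide shot", 0.7, 0.26),
    ("Jungle explorer", "medium shot", 0.7, 0.28),
    ("Tournament gamer", "close-up", 0.7, 0.27),
    ("Beach family", "medium shot", 0.7, 0.32),
]


def summarize(results):
    """Return overall means and the mean CLIP score per requested shot type."""
    clip_by_shot = defaultdict(list)
    for _title, shot, _aesthetic, clip in results:
        clip_by_shot[shot].append(clip)
    return {
        "mean_aesthetic": mean(r[2] for r in results),
        "mean_clip": mean(r[3] for r in results),
        "mean_clip_by_shot": {s: mean(v) for s, v in clip_by_shot.items()},
    }


summary = summarize(RESULTS)
print(round(summary["mean_aesthetic"], 3), round(summary["mean_clip"], 3))
```

On these figures the mean Aesthetic Score is about 0.69 and the mean Prompt Clip Score about 0.284. The per-shot means are close to one another, so the prompt-image similarity alone does not reveal which framing the model handled worst.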