AI Captures the Scene, But Misses the Mood with Stability-ai-ultra
In the realm of AI image generation, capturing the essence of a scene goes beyond simply replicating the elements described in a prompt. It involves understanding the nuances of composition, lighting, and overall mood. This blog post examines the performance of a generative AI model in creating images based on specific poses and scenes, highlighting its strengths and weaknesses in capturing the desired aesthetic.
Created with: stability-ai-ultra
Running from the Storm: A Figure’s Lonely Flight
A hooded figure races down a dirt road, the ominous sky mirroring their own sense of urgency. Rain falls, and the approaching storm adds to the dramatic tension, leaving the figure’s destination and purpose shrouded in mystery.
Prompt
poses running: determined, hopeful ; A lone figure in a tattered cloak; wide shot; Heroism; a desolate wasteland with a storm brewing in the distance; cinematic
Characteristic
Shot : A lone figure in a dark cloak runs away from a large, ominous storm cloud. The figure is running on a dirt road in a barren, desolate landscape.
Aesthetic Score : 0.7
Mood : dark, mysterious, dramatic
Quality
Entropy : 6.87
Noise : 92
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has a few minor artifacts, particularly in the storm clouds. The figure’s cloak appears to be slightly blurry.
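The post does not say how the Entropy value under Quality is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 (a single flat tone) to 8 bits (all 256 levels equally used). Under that assumption, values near 6.9 would indicate a wide tonal spread. A minimal sketch, where `shannon_entropy` is a hypothetical helper and not the tool used for this post:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Every one of the 256 grey levels used equally often: the 8-bit maximum.
uniform = list(range(256)) * 4
# A single grey level everywhere: no tonal variation at all.
flat = [128] * 1024

print(shannon_entropy(uniform))
print(shannon_entropy(flat))
```

To apply this to a real image, the pixel list would come from a grayscale conversion of the file; that loading step is left out so the sketch stays dependency-free.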
Into the Jungle’s Embrace: A Runner’s Quest for Mystery
A lone figure races through a verdant jungle path, their destination an ancient stone building shrouded in mystery. The image captures the thrill of adventure, the promise of discovery, and the allure of the unknown, leaving viewers eager to follow the runner’s footsteps.
Prompt
poses running: excited, curious ; A young adventurer with a backpack; medium shot; Adventure; a lush jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A man running away from the camera through a jungle path with overgrown ruins in the background
Aesthetic Score : 0.7
Mood : adventure, escape, mystery
Quality
Entropy : 6.89
Noise : 83
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has a slightly artificial look, like it was created with a digital filter. Some of the colors are not quite natural.
Immersed in the Neon Glow: A Gamer’s Focus Under Futuristic Lights
A vibrant scene of a gamer engrossed in their game, bathed in a captivating blue and pink lighting scheme. The intense focus and futuristic aesthetic create a palpable sense of energy and excitement.
Prompt
poses running: intense, focused ; A gamer’s hands on a keyboard and mouse; close-up; Gaming; a brightly lit gaming room with a monitor displaying a virtual world; cinematic
Characteristic
Shot : A person is playing a video game on a computer. They are wearing a headset and have their hands on a keyboard and mouse. The computer screen is displaying a video game, and the room is lit with blue and red lights.
Aesthetic Score : 0.6
Mood : intense, focused, gamer
Quality
Entropy : 6.90
Noise : 69
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no obvious artifacts or errors in the image.
Sun-Kissed Joy: Young Friends Run Through a Vibrant Street
Capture the carefree spirit of youth as a group of friends race through a bustling, colorful street bathed in sunlight. The dynamic composition and bright hues evoke a sense of happiness and energy, making this image a celebration of life’s simple pleasures.
Prompt
poses running: energetic, joyful ; A group of tourists running through a bustling marketplace; long shot; Tourism; a vibrant marketplace with colorful stalls and vendors; cinematic
Characteristic
Shot : A group of young adults are running through a bustling market street, filled with colorful awnings and fresh produce.
Aesthetic Score : 0.7
Mood : joyful, vibrant, energetic
Quality
Entropy : 6.93
Noise : 85
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : None.
Love on the Horizon: A Beachside Romance
In this captivating scene, a couple is seen running hand-in-hand on a pristine white sandy beach, their laughter echoing in the salty breeze. The turquoise ocean stretches out before them, a symbol of their boundless love. The sunny day, with its dramatic contrast of sand, sea, and sky, perfectly encapsulates their happy, romantic, and carefree mood. Their joy is palpable, making this a truly beautiful moment to behold.
Prompt
poses running: romantic, carefree ; A couple running hand-in-hand along a beach; medium shot; Travel; a beautiful beach with turquoise water and white sand; cinematic
Characteristic
Shot : A couple is running on a beautiful, white-sand beach with turquoise water.
Aesthetic Score : 0.7
Mood : joyful, romantic, carefree
Quality
Entropy : 6.05
Noise : 70
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : The sky has some light color cast. The image might be slightly overexposed. No major artifacts, but some blur is noticeable on the woman’s hair.
Sun-Kissed Joy: Friends Embrace the Day
A vibrant group of young friends radiates pure joy as they sprint through a sun-drenched park. The camera captures their carefree energy, highlighting the warmth and light of a perfect summer day.
Prompt
poses running: happy, playful ; A group of friends running through a park; wide shot; Groups; a sunny park with green grass and trees; cinematic
Characteristic
Shot : A group of six young people are running through a grassy park with a bright green background and trees on either side. The sun is shining, and there is a lens flare from the sun in the top left corner.
Aesthetic Score : 0.7
Mood : joyful, carefree, energetic
Quality
Entropy : 6.78
Noise : 79
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a little blurry and there is some noise visible in the background.
Superhero in Motion: Blurred Lights and Dynamic Pose Capture the Thrill of the Chase
This image captures the essence of a superhero in action, with a dynamic pose and blurred city lights creating a sense of speed and urgency. The red and blue suit and futuristic setting add to the heroic and dynamic mood of the scene.
Prompt
poses running: powerful, confident ; A superhero in a bright costume; close-up; Heroism; a city skyline with skyscrapers and flashing lights; cinematic
Characteristic
Shot : A superhero, with a red and black suit, is running through a city at night. The city is blurred in the background, and the superhero is in focus.
Aesthetic Score : 0.7
Mood : dynamic, futuristic, powerful
Quality
Entropy : 6.88
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has a slight AI-generated look, particularly in the city background and the superhero’s suit. The blurring of the city is also somewhat unnatural and lacking in detail.
Conquering the Peak: A Hiker’s Journey Through Snowy Majesty
A lone hiker braves the snowy mountain path, their small figure dwarfed by the majestic peak. This inspiring scene captures the serenity and adventure of a challenging journey, leaving you with a sense of awe and wonder.
Prompt
poses running: determined, adventurous ; A lone explorer running through a snow-covered mountain pass; long shot; Adventure; a majestic mountain range with snow-capped peaks; cinematic
Characteristic
Shot : A lone hiker runs uphill through a snowy valley towards a towering mountain peak in the distance. The sky is blue and partly cloudy.
Aesthetic Score : 0.8
Mood : tranquil, adventurous, majestic
Quality
Entropy : 6.47
Noise : 89
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No apparent artifacts or errors.
Neon Nights: A Cyberpunk Chase
A young woman races through a vibrant cyberpunk city, the neon glow reflecting off her blurred form. The scene captures the energy and urgency of a futuristic world, leaving you breathless.
Prompt
poses running: immersive, exciting ; A gamer’s avatar running through a virtual world; close-up; Gaming; a vibrant and detailed virtual world with fantastical creatures; cinematic
Characteristic
Shot : A young woman is running through a neon-lit city street at night.
Aesthetic Score : 0.7
Mood : futuristic, cyberpunk, edgy
Quality
Entropy : 6.86
Noise : 87
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some minor artifacts in the lighting and shadows, particularly on the woman’s hair and the background.
Family Fun on a Winding Road
A joyous family of four races down a picturesque country road, bathed in sunshine and surrounded by lush greenery. Their laughter and energy fill the air, capturing the essence of a perfect outdoor adventure.
Prompt
poses running: happy, carefree ; A family running along a scenic road; medium shot; Travel; a winding road with rolling hills and a picturesque countryside; cinematic
Characteristic
Shot : A family of four is running down a paved road in a rural setting. The sun is shining and the sky is blue. The family is happy and healthy, and the scene is full of energy and life.
Aesthetic Score : 0.7
Mood : happy, active, energetic
Quality
Entropy : 6.68
Noise : 85
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
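The per-image numbers above can be averaged with a short script. The values are copied from the ten entries in this post; the shortened labels, the field names, and the `summarize` helper are illustrative choices, not part of the original evaluation pipeline.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, prompt_clip, likelihood_of_ai)
# Values copied from the post; labels are shortened section titles.
IMAGES = [
    ("Running from the Storm", 0.7, 6.87, 92, 0.30, 0.70),
    ("Into the Jungle's Embrace", 0.7, 6.89, 83, 0.34, 0.70),
    ("Immersed in the Neon Glow", 0.6, 6.90, 69, 0.26, 0.10),
    ("Sun-Kissed Joy (street)", 0.7, 6.93, 85, 0.36, 0.20),
    ("Love on the Horizon", 0.7, 6.05, 70, 0.33, 0.30),
    ("Sun-Kissed Joy (park)", 0.7, 6.78, 79, 0.32, 0.20),
    ("Superhero in Motion", 0.7, 6.88, 76, 0.31, 0.80),
    ("Conquering the Peak", 0.8, 6.47, 89, 0.28, 0.10),
    ("Neon Nights", 0.7, 6.86, 87, 0.29, 0.80),
    ("Family Fun on a Winding Road", 0.7, 6.68, 85, 0.34, 0.20),
]

FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "likelihood_of_ai")


def summarize(images):
    """Return the mean of each metric column across all images."""
    columns = zip(*(row[1:] for row in images))
    return {name: mean(col) for name, col in zip(FIELDS, columns)}


summary = summarize(IMAGES)
best = max(IMAGES, key=lambda row: row[4])   # highest Prompt Clip Score
worst = min(IMAGES, key=lambda row: row[4])  # lowest Prompt Clip Score

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print(f"closest prompt match: {best[0]} ({best[4]:.2f})")
print(f"weakest prompt match: {worst[0]} ({worst[4]:.2f})")
```

Across the set, the Aesthetic Score averages 0.70 and the Prompt Clip Score averages about 0.31, with the marketplace image matching its prompt most closely (0.36) and the gamer image least closely (0.26).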
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.5, at the lower edge of the “good” range (0.5 to 0.75). The model generally captured the camera position described in each prompt.
- Shot Analysis: The model scored 0.49, which the evaluation also rates as “good”. This indicates the model understood the scene described in each prompt and generated images that reflect that understanding.
- Aesthetic Analysis: The model scored 0.08, which the evaluation rates as falling short of the “very good” range (quoted as -0.2 to 0.1). This suggests that the mood and look of the generated images deviated from what the prompts described.
Overall, the model demonstrates a good understanding of camera position and shot composition, but needs improvement in capturing the desired aesthetic.
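As a quick sanity check, the three scores can be compared against the ranges quoted in the breakdown. This is only a sketch: the post gives no separate bounds for Shot Analysis, so reusing the 0.5 to 0.75 “good” range for it is an assumption, and `in_range` is a hypothetical helper.

```python
def in_range(score, low, high):
    """True when score lies within the inclusive interval [low, high]."""
    return low <= score <= high


# (metric, score, label, low, high) as quoted in the conclusion.
# The Shot Analysis bounds are assumed to match Camera Position.
CHECKS = [
    ("Camera Position", 0.50, "good", 0.5, 0.75),
    ("Shot Analysis", 0.49, "good", 0.5, 0.75),
    ("Aesthetic Analysis", 0.08, "very good", -0.2, 0.1),
]

results = {
    metric: in_range(score, low, high)
    for metric, score, _label, low, high in CHECKS
}

for metric, score, label, low, high in CHECKS:
    verdict = "inside" if results[metric] else "outside"
    print(f"{metric}: {score} is {verdict} the quoted '{label}' range [{low}, {high}]")
```

Taken literally, 0.49 falls just outside the quoted “good” bounds and 0.08 falls inside the quoted “very good” bounds, so the thresholds the evaluation actually applied may differ from the ones quoted here.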