AI Captures the Scene, But Misses the Mood with Stability-ai-ultra
Generative AI models, trained on large datasets of images and text, can produce striking visuals from user prompts. Capturing the intended aesthetic, however, remains a significant challenge. This post presents the results of a test of stability-ai-ultra on ten prompts and highlights where the model did well and where it fell short.
Created with: stability-ai-ultra
Hand in Hand, Against the Cosmic Canvas
Two astronauts, silhouetted against the breathtaking backdrop of Earth and a distant celestial body, share a moment of awe and solitude on the lunar surface. Their intertwined hands symbolize the human spirit’s unwavering desire to explore the vastness of space.
Prompt
poses holding-hands: Hopeful, determined, camaraderie ; Two astronauts; wide shot; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : Two astronauts standing on a desolate lunar surface, holding hands and looking at the Earth in the distance. The Earth is a prominent feature in the background, and there are other planets or stars in the sky.
Aesthetic Score : 0.6
Mood : awe, wonder, hopeful
Quality
Entropy : 6.64
Noise : 94
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The astronauts’ helmets appear to be slightly out of proportion, and the background stars have a slightly artificial look. There is a noticeable seam along the right edge of the image, suggesting it might be a composite image.
Hopeful Hikers: A Journey Through the Tropical Forest
Five friends, hand in hand, embark on a tranquil adventure through a lush tropical forest. The sun’s warm rays filter through the dense canopy, creating a sense of hope and adventure as they navigate the leafy path.
Prompt
poses holding-hands: Excited, adventurous, trusting ; A group of explorers; medium shot; adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : Five people are walking in a jungle path, holding hands, with the sun shining through the trees in the background.
Aesthetic Score : 0.7
Mood : serene, adventurous, hopeful
Quality
Entropy : 6.56
Noise : 103
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as the leaves being slightly pixelated.
Neon Glow, Intense Focus: Two Gamers Battle It Out
Two young men are locked in a fierce video game competition, bathed in the vibrant glow of purple and pink neon lights. The dimly lit room amplifies the intensity of their focus, creating a scene of palpable excitement and competitive spirit.
Prompt
poses holding-hands: Focused, competitive, collaborative ; Two gamers; close-up; gaming; a brightly lit gaming setup with glowing screens and controllers; cinematic
Characteristic
Shot : Two young men are playing video games in a dimly lit room, the focus is on the person in the foreground who is holding a controller.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 6.91
Noise : 69
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts and blurriness in the image, particularly around the edges of the screens and in the background.
Sunset Romance on the Hilltop
A couple, hand in hand, shares a tender moment as the sun dips below the horizon, painting the cityscape in warm hues. Their love story unfolds against the backdrop of a breathtaking view, capturing the essence of romance and happiness.
Prompt
poses holding-hands: Romantic, happy, adventurous ; A couple; medium shot; tourism; a picturesque cityscape with iconic landmarks in the background; cinematic
Characteristic
Shot : A couple is walking hand-in-hand on a hill overlooking a cityscape during sunset.
Aesthetic Score : 0.7
Mood : romantic, warm, nostalgic
Quality
Entropy : 6.94
Noise : 79
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
A Family’s Journey Through Majestic Mountains
A serene and adventurous scene unfolds as a family of four walks along a winding road through a breathtaking valley. The towering snow-capped mountain in the distance evokes a sense of awe and wonder, promising an unforgettable journey.
Prompt
poses holding-hands: Joyful, connected, adventurous ; A family; long shot; travel; a scenic mountain range with a winding road leading to the peak; cinematic
Characteristic
Shot : A family of four, two parents and two children, is hiking on a winding road in a mountainous landscape. The road leads towards a majestic, snow-capped mountain peak in the distance. The scene is characterized by bright sunlight and a clear blue sky. The mountain landscape is in autumn colors and the road is paved and surrounded by lush green vegetation.
Aesthetic Score : 0.8
Mood : peaceful, adventurous, inspiring
Quality
Entropy : 6.82
Noise : 95
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Joyful Celebration: Three Friends Share the Excitement
Three young women radiate happiness as they stand hand-in-hand amidst a vibrant festival crowd. The scene is bursting with color and energy, capturing the contagious joy of a shared experience.
Prompt
poses holding-hands: Happy, celebratory, connected ; A group of friends; medium shot; groups; a vibrant festival with colorful decorations and music; cinematic
Characteristic
Shot : Three young women are joyfully raising their hands in the air at a music festival, surrounded by a colorful crowd and festive decorations.
Aesthetic Score : 0.7
Mood : joyful, celebratory, carefree
Quality
Entropy : 6.81
Noise : 83
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
A Hiker’s Moment of Majesty
A lone hiker stands on a rocky mountain path, dwarfed by the grandeur of snow-capped peaks and swirling clouds. This serene yet adventurous scene evokes a sense of isolation and the humbling power of nature.
Prompt
poses holding-hands: Determined, courageous, triumphant ; A lone hiker; close-up; heroism; a breathtaking mountain vista with clouds swirling below; cinematic
Characteristic
Shot : A lone hiker stands on a rocky mountain path, looking out at a breathtaking vista of misty mountains and clouds.
Aesthetic Score : 0.8
Mood : serene, awe-inspiring, contemplative
Quality
Entropy : 6.74
Noise : 88
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors
Childhood Joy: A Moment of Pure Happiness on the Playground
Two young children, a boy and a girl, share a carefree moment on a sunny playground. Their hand-in-hand walk and bright smiles capture the essence of childhood joy and innocence. The natural and candid feel of the image evokes a sense of warmth and happiness.
Prompt
poses holding-hands: Playful, innocent, carefree ; Two children; close-up; adventure; a playground with swings, slides, and a sandbox; cinematic
Characteristic
Shot : Two young children, a boy and a girl, are walking hand-in-hand on a playground. The girl is wearing a white dress and a brown hat, while the boy is wearing a white shirt with colorful patterns and blue denim shorts. The playground is in the background, with a slide and other equipment visible.
Aesthetic Score : 0.7
Mood : happy, innocent, playful
Quality
Entropy : 6.60
Noise : 71
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable errors.
Blue Light Mystery: A Circle of Hope in the Shadows
Seven figures stand united, their silhouettes illuminated by piercing blue spotlights. The dramatic lighting and their intertwined hands evoke a sense of mystery and hope, leaving the audience captivated by the unspoken story unfolding on stage.
Prompt
poses holding-hands: Passionate, connected, expressive ; A group of musicians; medium shot; groups; a dimly lit stage with spotlights shining on them; cinematic
Characteristic
Shot : A group of eight people are standing on a stage in a silhouette. They are holding hands and the lights are shining on them.
Aesthetic Score : 0.6
Mood : dramatic, mysterious, hopeful
Quality
Entropy : 6.51
Noise : 76
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some artifacts in the image, including some noise and a slight blur to the subjects.
Sunset Romance in the Desert
A couple strolls hand-in-hand across a sun-drenched desert landscape, their silhouettes bathed in the warm glow of the setting sun. This romantic and adventurous scene evokes a sense of hope and possibility.
Prompt
poses holding-hands: Romantic, adventurous, hopeful ; A couple; long shot; travel; a vast desert landscape with a setting sun in the distance; cinematic
Characteristic
Shot : A couple walking hand-in-hand in a desert landscape during a sunset. The setting sun casts a warm, golden light over the sand dunes.
Aesthetic Score : 0.7
Mood : romantic, adventurous, peaceful
Quality
Entropy : 6.71
Noise : 79
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, and the sand dunes lack detail in some areas. There is a slight noise in the image. The image could also benefit from sharper focus.
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower bound of the “good” range (0.5 to 0.75). This suggests the model generally captured the camera position described in the prompt.
- Shot Analysis: The model scored 0.61, also within the “good” range. This indicates the model understood the scene described in the prompt and created an image that reflects that understanding.
- Aesthetic Analysis: The model scored 0.08, far below the “good” range (0.5 to 0.75) reached by the other two metrics; the score falls inside the -0.2 to 0.1 band. This suggests that the aesthetic of the generated images deviated from the one described in the prompts.
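The comparison of scores against ranges in this breakdown can be checked with a few lines of Python. This is only a minimal sketch: the names `GOOD_RANGE`, `QUOTED_AESTHETIC_RANGE`, `in_range`, and `summarize` are hypothetical, and treating both endpoints as inclusive is an assumption, since the post does not say how boundary values are handled.

```python
# Ranges as quoted in the breakdown; inclusive endpoints are an assumption.
GOOD_RANGE = (0.5, 0.75)
QUOTED_AESTHETIC_RANGE = (-0.2, 0.1)

# Scores as reported in the breakdown.
SCORES = {
    "camera_position": 0.5,
    "shot_analysis": 0.61,
    "aesthetic_analysis": 0.08,
}


def in_range(score: float, bounds: tuple[float, float]) -> bool:
    """Return True when score lies within bounds, endpoints included."""
    low, high = bounds
    return low <= score <= high


def summarize(scores: dict[str, float]) -> dict[str, dict[str, bool]]:
    """For each metric, report which of the two quoted ranges contains it."""
    return {
        name: {
            "in_good_range": in_range(value, GOOD_RANGE),
            "in_quoted_aesthetic_range": in_range(value, QUOTED_AESTHETIC_RANGE),
        }
        for name, value in scores.items()
    }


if __name__ == "__main__":
    for name, flags in summarize(SCORES).items():
        print(name, flags)
```

Under these assumptions, camera position and shot analysis both land in the 0.5 to 0.75 range, while the aesthetic score does not.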
Overall, the model demonstrates a good understanding of camera position and shot composition, but needs improvement in capturing the desired aesthetic.
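The per-image figures listed in the results can also be averaged for a quick overview. The sketch below is not part of the original test: the tuple layout, the field names in `FIELDS`, and the helper `column_means` are hypothetical, and the values are transcribed from the ten result blocks in the order they appear.

```python
from statistics import mean

# Field order for each row below (names are illustrative, not from the test tool).
FIELDS = (
    "aesthetic_score",
    "entropy",
    "noise",
    "prompt_clip_score",
    "likelihood_of_ai",
)

# One row per generated image, in the order the results appear in the post.
RESULTS = [
    (0.6, 6.64, 94, 0.32, 0.80),   # astronauts
    (0.7, 6.56, 103, 0.29, 0.20),  # jungle explorers
    (0.6, 6.91, 69, 0.26, 0.10),   # gamers
    (0.7, 6.94, 79, 0.30, 0.20),   # couple, cityscape
    (0.8, 6.82, 95, 0.31, 0.10),   # family, mountains
    (0.7, 6.81, 83, 0.25, 0.10),   # festival friends
    (0.8, 6.74, 88, 0.25, 0.10),   # lone hiker
    (0.7, 6.60, 71, 0.29, 0.10),   # children, playground
    (0.6, 6.51, 76, 0.27, 0.10),   # musicians on stage
    (0.7, 6.71, 79, 0.30, 0.20),   # couple, desert
]


def column_means(rows):
    """Return the arithmetic mean of each column, keyed by field name."""
    columns = zip(*rows)  # transpose rows into per-field columns
    return {field: mean(column) for field, column in zip(FIELDS, columns)}


if __name__ == "__main__":
    for field, value in column_means(RESULTS).items():
        print(f"{field}: {value:.3f}")
```

With the values above, the mean aesthetic score comes to about 0.69 and the mean prompt CLIP score to about 0.28.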