AI's Artistic Journey: Capturing Scenes, Missing the Vibe with Stability-ai-ultra
AI image generation is evolving rapidly, and current models can create striking visuals from text prompts. Capturing the essence of a desired aesthetic style, however, remains a challenge. This post tests how well stability-ai-ultra renders images from prompts that specify a scene, a shot type, and an aesthetic. We look at the model’s performance on camera position, shot type, and overall aesthetic style, and note where it excels and where it needs improvement.
For example, imagine a scene described as ‘a lone knight, silhouetted against a setting sun; wide shot; Heroism; A vast, desolate battlefield littered with fallen soldiers.’ While the AI might accurately depict a knight and a battlefield, it might struggle to convey the dramatic, heroic tone intended by the ‘Heroism’ aesthetic. This highlights the need for further development in AI’s understanding of artistic nuances and emotional impact.
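The prompts in this post follow a semicolon-separated layout: aesthetic and moods, subject, shot type, theme, background, and a trailing style keyword. The sketch below shows one way such a prompt could be assembled in Python. The `SceneSpec` fields and the `build_prompt` helper are hypothetical names inferred from that layout, not the tooling actually used for these tests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SceneSpec:
    # Field names are assumptions inferred from the prompt layout in this post.
    aesthetic: str
    moods: List[str]
    subject: str
    shot: str
    theme: str
    background: str
    style: str = "cinematic"


def build_prompt(spec: SceneSpec) -> str:
    """Join the fields into the semicolon-separated layout seen in the prompts."""
    head = f"{spec.aesthetic}: {', '.join(spec.moods)}"
    parts = [head, spec.subject, spec.shot, spec.theme, spec.background, spec.style]
    return "; ".join(parts)


knight = SceneSpec(
    aesthetic="Baroque",
    moods=["Epic", "melancholic"],
    subject="A lone knight, silhouetted against a setting sun",
    shot="wide shot",
    theme="Heroism",
    background="A vast, desolate battlefield littered with fallen soldiers.",
)
print(build_prompt(knight))
```

Apart from whitespace around the first separator, the printed string matches the prompt used for the first image. Keeping the fields separate makes it easy to vary one element, such as the shot type, while holding the rest constant.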
Created with: stability-ai-ultra
A Lone Knight’s Melancholy Victory
A solitary knight stands amidst the carnage of a recent battle, bathed in the golden glow of the setting sun. The scene evokes a powerful sense of epic struggle and melancholic reflection, with the knight’s silhouette against the vibrant sky creating a dramatic and moving image.
Prompt
Baroque: Epic, melancholic ; A lone knight, silhouetted against a setting sun; wide shot; Heroism; A vast, desolate battlefield littered with fallen soldiers.; cinematic
Characteristic
Shot : A lone knight stands in the middle of a battlefield, silhouetted against a dramatic sunset. The knight is armed with a sword and shield, and is looking out over a field of fallen soldiers.
Aesthetic Score : 0.7
Mood : epic, dramatic, melancholic
Quality
Entropy : 6.53
Noise : 87
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some slight artifacts are noticeable in the background, particularly in the clouds. The knight’s armor is slightly blurry.
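The post does not say how the Entropy value is calculated. One plausible reading, given that the values sit between 6 and 7, is the Shannon entropy (in bits) of the image’s 8-bit grayscale histogram, which has a maximum of 8. The sketch below works under that assumption and uses toy pixel lists instead of a real image; `shannon_entropy` is an illustrative helper, not the function behind the reported numbers.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of integer pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # p * log2(1/p) summed over every intensity level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs standing in for flattened 8-bit grayscale images.
flat = [128] * 1024             # a single gray level, no variation
uniform = list(range(256)) * 4  # every gray level equally frequent

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a higher entropy means the intensities are spread more evenly across the available levels, which tends to go with busier, more detailed images.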
A Ship Battles the Storm, Lit by Lightning’s Fury
A majestic sailing ship cuts through a tempestuous sea, its lanterns casting an eerie glow against the backdrop of a lightning strike. The scene evokes a sense of drama, mystery, and adventure, as the ship braves the elements.
Prompt
Baroque: Dramatic, thrilling ; A pirate ship, sails billowing in the wind, crashing through stormy waves; dynamic, close-up; Adventure; A raging sea with lightning illuminating the sky.; cinematic
Characteristic
Shot : A ship sailing through stormy seas at night, illuminated by lightning
Aesthetic Score : 0.8
Mood : dramatic, mysterious, adventurous
Quality
Entropy : 6.75
Noise : 94
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.90
Image errors : Slight blurriness and some unnatural looking textures in the water, especially near the ship
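The Prompt Clip Score is not defined in this post. A common definition of a CLIP score is the cosine similarity between a CLIP embedding of the prompt text and a CLIP embedding of the generated image, and values around 0.3 are plausible under that definition. The sketch below shows only the similarity step, with made-up three-dimensional vectors standing in for real embeddings; it does not load or call a CLIP model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder vectors; real CLIP embeddings have hundreds of dimensions.
text_embedding = [0.2, 0.7, 0.1]
image_embedding = [0.6, 0.1, 0.5]

print(round(cosine_similarity(text_embedding, image_embedding), 2))
```

Because the measure depends only on direction, scaling either vector leaves the score unchanged.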
Lost in the Neon Dreams: A Cyberpunk Cityscape
A lone figure, controller in hand, gazes out at a sprawling futuristic city bathed in vibrant neon hues. The scene evokes a sense of wonder and escapism, transporting you to a world where dreams and technology collide.
Prompt
Baroque: Intense, focused ; A player’s hand, gripping a controller, illuminated by the glow of a screen; close-up; Gaming; A chaotic, pixelated cityscape on the screen.; cinematic
Characteristic
Shot : A person is holding a video game controller in front of a screen depicting a futuristic city skyline at night, with neon lights and a starry sky
Aesthetic Score : 0.6
Mood : futuristic, cyberpunk, vibrant
Quality
Entropy : 6.80
Noise : 96
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The city skyline appears somewhat artificial and repetitive, lacking individual character and realism. The lighting is slightly unnatural, with neon colors being overly saturated.
Genoa’s Vibrant Market Under a Grand Facade
A bustling street market unfolds before a majestic, ornate building in Genoa, Italy. The scene is alive with energy and culture, the grand architecture adding a sense of scale and grandeur to the vibrant atmosphere.
Prompt
Baroque: Opulent, vibrant ; A grand, ornate palace, bathed in golden sunlight; wide shot; Tourism; A bustling marketplace with vibrant colors and exotic goods.; cinematic
Characteristic
Shot : A bustling street market in front of a grand, ornate building, with people walking and shopping.
Aesthetic Score : 0.8
Mood : vibrant, lively, colorful
Quality
Entropy : 6.96
Noise : 105
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts and slight blurriness in the background, likely due to digital processing.
A Hiker’s Journey: Finding Solitude in the Mountains
A lone hiker traverses a winding path, dwarfed by the majestic mountain range in the background. The scene evokes a sense of serenity, adventure, and inspiration, highlighting the dramatic contrast between the individual and the vastness of nature.
Prompt
Baroque: Awe-inspiring, contemplative ; A lone traveler, gazing out at a breathtaking mountain range; medium shot; Travel; A vast, snow-capped mountain range with a winding road leading into the distance.; cinematic
Characteristic
Shot : A lone hiker walking on a winding road that leads towards a majestic snow-capped mountain range. The mountain peaks pierce the blue sky with fluffy clouds
Aesthetic Score : 0.8
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.88
Noise : 105
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image seems to be slightly oversaturated and the colors are too vivid, leading to an artificial look. Some textures in the mountains and the road lack detail.
Firelight and Fellowship: A Cozy Gathering in the Dark
A group of friends huddle around a crackling fireplace, their faces illuminated by the warm glow. The dimly lit room adds an air of mystery and intimacy, creating a scene of cozy nostalgia.
Prompt
Baroque: Warm, intimate ; A family gathered around a fireplace, sharing stories and laughter; medium shot; Family; A cozy, candlelit room with portraits of ancestors on the walls.; cinematic
Characteristic
Shot : A group of people are gathered around a fireplace in a dimly lit room. The room is decorated with portraits and candles. The fire is casting a warm glow on the people’s faces.
Aesthetic Score : 0.7
Mood : cozy, intimate, nostalgic
Quality
Entropy : 6.48
Noise : 95
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
A Knight’s Tale: Epic Silhouette Against a Setting Sun
A lone knight in full armor stands defiant, silhouetted against a fiery sunset. Behind him, an army of knights on horseback charge into the distance, leaving a trail of dust and smoke in their wake. The scene evokes a sense of epic adventure and mystery, with a hint of danger lurking in the shadows.
Prompt
Baroque: Brave, determined ; A knight, charging into battle, his armor gleaming in the sunlight; dynamic, close-up; Heroism; A chaotic battlefield with smoke and dust swirling in the air.; cinematic
Characteristic
Shot : A lone knight in full armor stands in the foreground with a sword drawn, facing a group of knights riding horses in the background. The scene is set against a backdrop of a cloudy sky with the sun shining through the clouds, creating a golden light. There is dust and smoke in the air, suggesting a battlefield.
Aesthetic Score : 0.8
Mood : dramatic, epic, heroic
Quality
Entropy : 6.86
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Unveiling the Mystical Treasure
A captivating scene unfolds, revealing an ornate treasure chest overflowing with gold coins. The flickering candlelight and ethereal fog create a magical and mysterious atmosphere, inviting you to embark on an adventurous journey.
Prompt
Baroque: Intriguing, mysterious ; A treasure chest, overflowing with gold and jewels, illuminated by a single candle; close-up; Adventure; A dark, mysterious cave with cobwebs and shadows.; cinematic
Characteristic
Shot : An open treasure chest overflowing with golden jewels, sitting in a dimly lit cave with a smoky atmosphere.
Aesthetic Score : 0.7
Mood : mysterious, magical, adventurous
Quality
Entropy : 6.70
Noise : 86
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor blurring around the edges of the chest and jewels.
A Lone Figure Contemplates the Majesty of a Dreamlike World
A solitary figure stands on a rocky precipice, dwarfed by the breathtaking beauty of cascading waterfalls and floating islands. The vibrant sky and ethereal clouds create a dreamlike atmosphere, emphasizing the character’s smallness against the grandeur of this fantastical landscape.
Prompt
Baroque: Triumphant, surreal ; A player’s avatar, standing triumphantly on a virtual mountain peak; wide shot; Gaming; A fantastical, digital landscape with glowing waterfalls and floating islands.; cinematic
Characteristic
Shot : A lone figure stands on a rocky cliff overlooking a fantastical landscape of towering waterfalls cascading from floating islands. The vibrant blue and pink hues of the water and clouds create a dreamy and otherworldly atmosphere.
Aesthetic Score : 0.7
Mood : dreamy, surreal, magical
Quality
Entropy : 6.82
Noise : 87
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is well-rendered with minimal visible artifacts. However, the edges of the floating islands have a slightly pixelated appearance, and the water in the waterfalls appears somewhat blurry.
Vibrant City Life: A European Street Scene
Capture the energy of a bustling European city with this image. A grand church dominates the end of the street, while shops and restaurants line the sides. The converging lines of the street create a sense of movement and depth, making this a visually captivating scene.
Prompt
Baroque: Energetic, lively ; A bustling city square, filled with people from all walks of life; wide shot; Tourism; A grand, Baroque cathedral towering over the city.; cinematic
Characteristic
Shot : A bustling street in a European city, with a grand church at the end of the street.
Aesthetic Score : 0.7
Mood : vibrant, lively, historic
Quality
Entropy : 6.96
Noise : 98
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors
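To compare the ten images at a glance, the per-image figures above can be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from each entry; the short labels and the 0.7 cutoff used to flag images are illustrative choices made here, not part of the original evaluation.

```python
from statistics import mean

# (short label, aesthetic score, prompt CLIP score, likelihood of AI)
# Values are copied from the ten entries above; labels are abbreviations.
results = [
    ("Lone knight", 0.7, 0.32, 0.90),
    ("Ship in storm", 0.8, 0.33, 0.90),
    ("Cyberpunk cityscape", 0.6, 0.33, 0.80),
    ("Genoa market", 0.8, 0.29, 0.20),
    ("Hiker", 0.8, 0.28, 0.70),
    ("Fireside gathering", 0.7, 0.29, 0.10),
    ("Knight's tale", 0.8, 0.28, 0.20),
    ("Treasure chest", 0.7, 0.26, 0.30),
    ("Dreamlike world", 0.7, 0.32, 0.80),
    ("European street", 0.7, 0.29, 0.10),
]

aesthetic_mean = mean(r[1] for r in results)
clip_mean = mean(r[2] for r in results)
ai_mean = mean(r[3] for r in results)

# Illustrative cutoff: images the evaluator rated as likely AI-generated.
flagged = [r[0] for r in results if r[3] >= 0.7]

print(f"mean aesthetic score:  {aesthetic_mean:.2f}")
print(f"mean prompt CLIP score: {clip_mean:.2f}")
print(f"mean likelihood of AI: {ai_mean:.2f}")
print("flagged as likely AI:", ", ".join(flagged))
```

On these figures the means come out at roughly 0.73 for aesthetic score, 0.30 for CLIP score, and 0.50 for likelihood of AI, with five of the ten images at or above the 0.7 cutoff.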
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.53, at the low end of the “good” range (0.5 to 0.75). This indicates that the generated images generally matched the camera positions described in the prompts.
- Shot Analysis: The model scored 0.58, also within the “good” range. This suggests that the model understood the scene descriptions in the prompts and generated images that reflected them.
- Aesthetic Analysis: The model scored 0.15, which falls outside the “very good” range (-0.2 to 0.1). This indicates that the generated images did not match the expected aesthetic style as closely as desired.
Overall, the model demonstrates a good understanding of camera positions and scene descriptions, but needs improvement in generating images that align with the desired aesthetic style.
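The range checks quoted in this conclusion can be reproduced with a few lines of Python. The scores and range bounds below are taken from the breakdown; the `in_range` helper and the dictionary layout are illustrative, and the sketch assumes the bounds are inclusive, which the post does not specify.

```python
def in_range(score, low, high):
    """Return True when score lies within [low, high], bounds included."""
    return low <= score <= high


# metric: (score, (low, high), range label), as quoted in the breakdown
checks = {
    "Camera Position": (0.53, (0.5, 0.75), "good"),
    "Shot Analysis": (0.58, (0.5, 0.75), "good"),
    "Aesthetic Analysis": (0.15, (-0.2, 0.1), "very good"),
}

outcome = {}
for metric, (score, (low, high), label) in checks.items():
    outcome[metric] = in_range(score, low, high)
    verdict = "inside" if outcome[metric] else "outside"
    print(f"{metric}: {score} is {verdict} the '{label}' range ({low} to {high})")
```

The first two metrics land inside their target range, while the aesthetic score does not, which is the gap the conclusion describes.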