AI's Artistic Journey: Capturing Emotion, Missing the Mark on Aesthetics with Stability-ai-ultra
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals based on text prompts. However, achieving artistic nuance remains a challenge. This blog post examines the results of an experiment using a generative AI model to create images based on specific scene descriptions, highlighting its strengths and weaknesses in capturing camera positions, shot analysis, and aesthetic elements. We explore the concept of ‘dramatic style poses’ and how they are used in various forms of media, from photography to film and video games. We’ll also discuss the potential for future advancements in AI art and its ability to truly capture the essence of human creativity.
Created with: stability-ai-ultra
Silhouetted Against the Setting Sun, a Figure of Mystery
A lone figure, cloaked in shadow, stands against a vibrant orange sunset, their silhouette a stark contrast against the fiery sky. The majestic mountain range in the background adds to the sense of isolation and drama, leaving the viewer to ponder the story behind this enigmatic scene.
Prompt
poses close-up: epic, determined ; A lone figure, silhouetted against a blazing sunset; close-up; heroism; a vast, desolate landscape; cinematic
Characteristic
Shot : A man in a cloak stands silhouetted against a sunset over a mountain range.
Aesthetic Score : 0.7
Mood : melancholy, dramatic, contemplative
Quality
Entropy : 6.57
Noise : 79
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly over-exposed, and the colors are a bit washed out.
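A note on the Entropy figure reported for each image: this post does not spell out how it is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which runs from 0 bits (a completely flat image) to 8 bits (all 256 gray levels equally likely). The sketch below illustrates that assumed definition; `shannon_entropy` is a hypothetical helper name, and the pixel lists are toy data, not the generated images.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a list of integer pixel values.

    Assumption: this mirrors a histogram-based entropy metric; the tool
    used for this post may compute its Entropy figure differently.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1 / p) over every gray level that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy inputs: a single flat gray value, and every 8-bit level equally often.
flat_image = [128] * 1000
uniform_image = list(range(256)) * 4

flat_entropy = shannon_entropy(flat_image)
uniform_entropy = shannon_entropy(uniform_image)
```

Under this reading, values around 6.5 to 6.9 would mean the images use a fairly wide spread of tones without covering every level evenly.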
Unveiling the Past: A Hand Points to a Map’s Secrets
A close-up shot of an aged map, its surface worn with time, lies open on a table. A hand, reaching from the shadows, points to a specific location, hinting at a hidden story. The scene is bathed in a mysterious light, drawing the viewer into a world of history and intrigue. Antique objects and a globe in the background add to the sense of exploration and discovery.
Prompt
poses close-up: intrigued, adventurous ; A weathered map, its edges frayed, with a finger tracing a route; close-up; adventure; a dimly lit room filled with antique maps and globes; cinematic
Characteristic
Shot : A close-up shot of a hand pointing at a map, with a globe and other objects in the background, evoking a sense of exploration and discovery.
Aesthetic Score : 0.7
Mood : vintage, mysterious, thoughtful
Quality
Entropy : 6.60
Noise : 82
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness in some areas, particularly in the background, likely due to low lighting conditions.
Lost in the Digital Realm: A Gamer’s Focus Under Neon Lights
A solitary figure, bathed in the ethereal glow of pink and blue, is completely engrossed in their virtual world. The dimly lit room adds an air of mystery, highlighting the intensity of their focus as they navigate the digital landscape.
Prompt
poses close-up: intense, focused ; A gamer’s hands, fingers flying across a keyboard, eyes glued to the screen; close-up; gaming; a dimly lit room with neon lights reflecting on the screen; cinematic
Characteristic
Shot : A gamer wearing headphones is playing a video game in a dimly lit room with neon lights. There are two computer monitors and a keyboard in front of him.
Aesthetic Score : 0.7
Mood : intense, focused, neon
Quality
Entropy : 6.54
Noise : 67
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no visible artifacts or errors in the image.
Capturing the Majesty: A Hand Holds the View of Majestic Mountains
A tranquil scene unfolds as a hand extends towards the camera, framing a breathtaking mountain vista. The blurry background hints at a vast valley and distant peaks, creating a sense of adventure and scale. The image evokes a feeling of serenity and wonder, inviting you to explore the beauty of the natural world.
Prompt
poses close-up: awe-inspiring, wonder ; A hand holding a camera, capturing a breathtaking vista; close-up; tourism; a panoramic view of a mountain range with clouds swirling below; cinematic
Characteristic
Shot : A person is holding a camera in front of a mountain range with a valley in the background. The mountains are covered in clouds and the sky is blue.
Aesthetic Score : 0.6
Mood : tranquil, majestic, adventurous
Quality
Entropy : 6.92
Noise : 70
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant image errors.
A Journey Begins: Vintage Passport Beckons Adventure
A worn leather backpack holds a secret: a vintage passport, its pages whispering tales of past travels. The close-up focus on the passport evokes a sense of nostalgia and intrigue, hinting at a journey waiting to be taken.
Prompt
poses close-up: nostalgic, adventurous ; A passport, open to a page with a stamp from a foreign country; close-up; travel; a cluttered backpack overflowing with travel essentials; cinematic
Characteristic
Shot : A close-up of an old passport inside a backpack
Aesthetic Score : 0.6
Mood : nostalgic, travel, adventure
Quality
Entropy : 6.71
Noise : 84
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Warmth and Togetherness by the Fire
A captivating image of two hands intertwined in front of a crackling campfire, radiating warmth and intimacy. The blurred figures in the background add a sense of depth and create a cozy atmosphere.
Prompt
poses close-up: warm, connected ; A group of hands, clasped together in a circle, symbolizing unity; close-up; groups; a campfire burning brightly in the background; cinematic
Characteristic
Shot : Two people are holding hands in front of a bonfire; the flames are out of focus in the background.
Aesthetic Score : 0.7
Mood : warm, cozy, togetherness
Quality
Entropy : 6.79
Noise : 81
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible image artifacts.
The Weight of War: A Soldier’s Close-Up
A young soldier, his face etched with seriousness and marked by blood, stares directly at the camera in this intense close-up. The image captures the somber mood and dramatic tension of war, leaving a lasting impression.
Prompt
poses close-up: tragic, poignant ; A single tear rolling down a hero’s cheek, reflecting the weight of their sacrifice; close-up; heroism; a battlefield littered with fallen comrades; cinematic
Characteristic
Shot : A close-up portrait of a young soldier with a serious expression, likely in a wartime setting. He has visible injuries on his face, suggesting he’s been in battle. There’s a blurred background with a crowd of soldiers, creating a sense of scale and context.
Aesthetic Score : 0.7
Mood : intense, serious, war-torn
Quality
Entropy : 6.93
Noise : 95
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have been digitally enhanced with some smoothing and sharpening, which reduces the natural texture of the skin. This is particularly noticeable on the soldier’s face and uniform.
Lost in the Emerald Labyrinth
A sunbeam pierces the dense canopy of a lush jungle, illuminating a compass lying forgotten on the forest floor. The scene evokes a sense of mystery and adventure, inviting you to explore the secrets hidden within the emerald depths.
Prompt
poses close-up: uncertain, suspenseful ; A compass needle spinning wildly, pointing in all directions; close-up; adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : A lush green jungle with a compass lying on the ground, with a sunbeam breaking through the canopy
Aesthetic Score : 0.7
Mood : mysterious, tranquil, adventurous
Quality
Entropy : 6.65
Noise : 79
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 1.00
Image errors : No noticeable errors; the image appears to be generated using high-quality rendering.
Lost in the Game: A Close-Up on Intensity
A player’s hand grips the controller, their focus unwavering as colorful lights blur in the background. This image captures the intense, playful immersion of gaming.
Prompt
poses close-up: exhilarated, competitive ; A joystick, gripped tightly in a gamer’s hand, as they navigate a virtual world; close-up; gaming; a brightly lit arcade with flashing lights and sounds; cinematic
Characteristic
Shot : A person is holding a game controller with the background blurred and displaying vibrant neon lights. The image is close-up, focusing primarily on the controller.
Aesthetic Score : 0.6
Mood : intense, focused, vibrant
Quality
Entropy : 6.79
Noise : 68
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slight blurriness, especially in the background, which might be due to motion blur or a shallow depth of field. There are some chromatic aberrations around the edges of the image. The blur and the aberrations could be considered technical errors.
A Suitcase Full of Hope and Mystery
A close-up of a blue suitcase with a handwritten note attached, hinting at a journey filled with both hope and uncertainty. The blurred background of an airport adds a sense of movement and transition, leaving the viewer wondering what awaits the traveler.
Prompt
poses close-up: hopeful, anticipatory ; A luggage tag, with a handwritten note attached, signifying a journey to a new destination; close-up; travel; a bustling airport terminal with people rushing around; cinematic
Characteristic
Shot : A suitcase with a note attached to it, with a blurry background of people at an airport.
Aesthetic Score : 0.3
Mood : hopeful, travel
Quality
Entropy : 6.85
Noise : 68
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
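To compare the ten images side by side, the per-image numbers can be gathered into one small table and summarized. This is only a sketch: the values are copied from the metric blocks in this post, while the short labels and the tuple layout are made up here for illustration.

```python
from statistics import mean

# Values copied from the metric blocks in this post, in order of appearance.
# Each tuple: (label, aesthetic score, prompt CLIP score, likelihood of AI).
# The labels are shorthand invented for this sketch.
results = [
    ("sunset silhouette", 0.7, 0.24, 0.20),
    ("map and hand", 0.7, 0.29, 0.20),
    ("gamer under neon", 0.7, 0.29, 0.30),
    ("camera and mountains", 0.6, 0.32, 0.10),
    ("vintage passport", 0.6, 0.29, 0.10),
    ("hands by the fire", 0.7, 0.32, 0.20),
    ("soldier close-up", 0.7, 0.25, 0.10),
    ("jungle compass", 0.7, 0.25, 1.00),
    ("arcade controller", 0.6, 0.26, 0.30),
    ("suitcase", 0.3, 0.29, 0.20),
]

mean_aesthetic = round(mean(r[1] for r in results), 2)
mean_clip = round(mean(r[2] for r in results), 2)
lowest_aesthetic = min(results, key=lambda r: r[1])
most_ai_like = max(results, key=lambda r: r[3])

print(f"mean aesthetic score: {mean_aesthetic}")
print(f"mean prompt CLIP score: {mean_clip}")
print(f"lowest aesthetic score: {lowest_aesthetic[0]} ({lowest_aesthetic[1]})")
print(f"highest likelihood of AI: {most_ai_like[0]} ({most_ai_like[3]})")
```

With these values, the mean aesthetic score is 0.63 and the mean prompt CLIP score is 0.28. The suitcase image is the aesthetic outlier at 0.3, and the jungle compass is the only image with a likelihood of AI of 1.00.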
Conclusion
The results show that the generative AI model performed reasonably well on shot analysis but struggled with camera position and aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.3
- Interpretation: This score is below average, indicating the model didn’t accurately capture the intended camera positions described in the prompt.
- Desired Range: 0.5 - 0.75 (good), > 0.75 (very good)
Shot Analysis:
- Score: 0.54
- Interpretation: This score falls within the “good” range, suggesting the model understood the scene described in the prompt reasonably well.
- Desired Range: 0.5 - 0.75 (good), > 0.75 (very good)
Aesthetic Analysis:
- Score: 0.13
- Interpretation: This score falls just above the desired range, indicating a discrepancy between the expected aesthetic and the actual aesthetic of the generated image.
- Desired Range: -0.2 to 0.1 (very good)
Overall:
The model demonstrates a decent ability to understand and translate shot descriptions into visuals. However, it needs improvement in accurately capturing the intended camera positions and achieving the desired aesthetic.
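The score bands quoted in the breakdown can be written down as a small check. This is a minimal sketch using only the ranges and scores stated in this post; the function names are hypothetical, and the label "below desired range" for anything under 0.5 is an assumption, since the post only calls 0.3 "below average".

```python
def rate_similarity(score: float) -> str:
    """Band for camera position and shot analysis scores."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    # Assumed label: the post gives no named band under 0.5.
    return "below desired range"


def rate_aesthetic(score: float, low: float = -0.2, high: float = 0.1) -> str:
    """Band for the aesthetic analysis score, which has a two-sided range."""
    if low <= score <= high:
        return "very good"
    return "outside desired range"


# Scores as reported in the breakdown.
scores = {
    "camera position": 0.3,
    "shot analysis": 0.54,
    "aesthetic analysis": 0.13,
}

ratings = {
    "camera position": rate_similarity(scores["camera position"]),
    "shot analysis": rate_similarity(scores["shot analysis"]),
    "aesthetic analysis": rate_aesthetic(scores["aesthetic analysis"]),
}

for name, rating in ratings.items():
    print(f"{name}: {scores[name]} -> {rating}")
```

Run against the reported numbers, only shot analysis lands in a desired band ("good"). Camera position falls under 0.5, and the aesthetic score of 0.13 sits above the 0.1 upper bound.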