AI's Artistic Journey: Capturing Scenes, Missing the Mood with Stability-ai-ultra
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals based on text prompts. However, achieving the desired aesthetic remains a challenge. This blog post examines the results of an AI model tasked with generating images based on specific scenes and poses, highlighting its strengths and weaknesses. We’ll explore the concept of ‘dramatic style poses’ and how they are used in various contexts, from photography to film and even video games.
Created with: stability-ai-ultra
A Moment of Tranquility Amidst Majestic Peaks
A lone hiker finds peace on a mountain trail, overlooking a breathtaking valley adorned with wildflowers and snow-capped peaks. The sun’s golden rays cast long shadows, creating a sense of awe and wonder in this inspiring landscape.
Prompt
poses interactive-pose: Determined, hopeful, adventurous ; A lone adventurer; wide shot; Adventure; Majestic mountain range with a winding path leading to a hidden valley; cinematic
Characteristic
Shot : A lone hiker stands on a trail overlooking a vast mountain valley with a river winding through the green fields below. The sun is shining brightly, illuminating the snowy peaks in the distance.
Aesthetic Score : 0.8
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.85
Noise : 98
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some minor artifacts are present in the mountain textures, particularly in the snow-covered areas. The flowers in the foreground appear somewhat overly saturated and idealized.
Friends Gather for a Vibrant Gaming Session
A group of friends are captured in the midst of an energetic gaming session, illuminated by colorful lights that create a playful and casual atmosphere. The intensity of the game is palpable, highlighted by the vibrant lighting and focused expressions of the players.
Prompt
poses interactive-pose: Excited, focused, competitive ; A group of friends; medium shot; Gaming; A dimly lit room with a large screen displaying a video game, surrounded by controllers and snacks; cinematic
Characteristic
Shot : A group of friends are playing video games in a dimly lit room. The room is decorated with neon lights and there are snacks on the table. The players are focused on the game, which is shown on a large screen.
Aesthetic Score : 0.6
Mood : fun, focused, casual
Quality
Entropy : 6.70
Noise : 74
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, such as noise and banding. The colors are also a bit oversaturated.
Sunset Hero: A Silhouette of Power
A costumed superhero stands tall against a breathtaking sunset cityscape, his muscular physique and determined expression radiating power and heroism. The dramatic lighting enhances the scene, emphasizing the superhero’s presence and the weight of his mission.
Prompt
poses interactive-pose: Confident, powerful, heroic ; A superhero; close-up; Heroism; A cityscape with towering buildings and a dramatic sunset in the background; cinematic
Characteristic
Shot : A superhero in a black and purple costume stands in front of a city skyline at sunset.
Aesthetic Score : 0.7
Mood : heroic, powerful, mysterious
Quality
Entropy : 6.54
Noise : 70
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, such as blurring around the edges of the superhero’s costume. There is also some aliasing in the background.
Joyful Moments in the Marketplace
A heartwarming scene unfolds in a bustling marketplace, where three generations of women - two adults and a young girl - share smiles and laughter. The vibrant colors and lively atmosphere create a sense of happiness and connection, capturing the essence of family and community.
Prompt
poses interactive-pose: Happy, joyful, curious ; A family; medium shot; Tourism; A bustling marketplace with colorful stalls and vibrant street performers; cinematic
Characteristic
Shot : Three women, possibly a family, are smiling and looking at the camera in an outdoor market setting.
Aesthetic Score : 0.8
Mood : joyful, friendly, vibrant
Quality
Entropy : 6.91
Noise : 83
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the background that suggest potential image editing or enhancement.
A Solitary Journey Through Tranquil Hills
A lone figure traverses a winding road amidst rolling green hills, the sky a clear blue. The scene evokes a sense of tranquility, serenity, and hope, with the solitary figure emphasizing the vastness of the landscape and the journey ahead.
Prompt
poses interactive-pose: Free, adventurous, contemplative ; A traveler; close-up; Travel; A scenic landscape with rolling hills, a clear blue sky, and a winding road leading to the horizon; cinematic
Characteristic
Shot : A lone figure walks down a winding road in a picturesque valley, surrounded by lush green hills and a clear blue sky. The road stretches out towards the horizon, offering a sense of openness and possibility.
Aesthetic Score : 0.8
Mood : serene, hopeful, adventurous
Quality
Entropy : 6.59
Noise : 76
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors.
Silhouettes of Energy: Dancers Blaze Under Colorful Spotlights
A captivating performance unfolds on a dark stage, where vibrant spotlights illuminate a group of dancers. Their silhouettes dance against the light, creating a dramatic and energetic spectacle. The mood is electric, fueled by the powerful lighting and the dancers’ dynamic movements.
Prompt
poses interactive-pose: Energetic, expressive, joyful ; A group of dancers; wide shot; Groups; A brightly lit stage with a vibrant backdrop, showcasing a performance; cinematic
Characteristic
Shot : A group of dancers are performing on stage in a colorful lighting environment.
Aesthetic Score : 0.7
Mood : energetic, vibrant, expressive
Quality
Entropy : 6.65
Noise : 81
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major issues, but the image appears slightly overexposed, with some loss of detail in the highlights, and the dance floor looks a bit artificial and glossy.
Mystical Journey Through the Fog
A lone figure ventures into a sun-dappled forest, where ethereal fog and shafts of light create a sense of wonder and mystery. This captivating scene evokes a feeling of adventure and serenity, inviting you to explore the unknown.
Prompt
poses interactive-pose: Calm, peaceful, introspective ; A lone hiker; medium shot; Adventure; A dense forest with towering trees and dappled sunlight filtering through the leaves; cinematic
Characteristic
Shot : A solitary hiker walks along a path through a lush, sun-dappled forest. The path is lined with ferns and tall trees. Light streams through the canopy, casting long shadows on the ground.
Aesthetic Score : 0.75
Mood : peaceful, serene, mysterious
Quality
Entropy : 6.69
Noise : 118
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some minor artifacts in the shadows, particularly around the trees.
Friends Gather for a Night of Laughter and Games
A cozy living room filled with warmth and laughter as friends engage in a lively board game. The anticipation and excitement are palpable, creating a sense of intimacy and joy. This scene captures the essence of friendship and shared moments of fun.
Prompt
poses interactive-pose: Fun, playful, competitive ; A group of friends; close-up; Gaming; A dimly lit room with a table covered in board games and snacks; cinematic
Characteristic
Shot : A group of friends are playing a board game together in a cozy living room. The image is lit by warm, inviting light, and the characters are all smiling and laughing. The scene is playful and carefree.
Aesthetic Score : 0.7
Mood : joyful, friendly, relaxed
Quality
Entropy : 6.91
Noise : 82
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be AI-generated, with some slight inconsistencies in the lighting and textures.
Silhouettes of Love at Sunset
A couple wades into the ocean at sunset, their silhouettes painted against the vibrant sky. The scene evokes a romantic, serene, and peaceful mood, with the dramatic effect of their forms against the colorful backdrop.
Prompt
poses interactive-pose: Romantic, intimate, peaceful ; A couple; close-up; Tourism; A romantic sunset over a beach with the ocean waves crashing in the background; cinematic
Characteristic
Shot : A couple in silhouette, wading in the ocean at sunset.
Aesthetic Score : 0.8
Mood : romantic, serene, peaceful
Quality
Entropy : 6.73
Noise : 76
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.40
Image errors : The sunset colors are slightly oversaturated, and the water has a slightly unnatural texture.
Silhouettes of Excitement: Band Ignites the Stage with Energetic Performance
A vibrant band electrifies the crowd with their performance, bathed in stage lights and fog. The silhouettes of the band members against the bright lights create a powerful sense of energy and excitement, capturing the essence of the lively atmosphere.
Prompt
poses interactive-pose: Energetic, passionate, inspiring ; A group of musicians; wide shot; Groups; A concert stage with a large crowd cheering in the background; cinematic
Characteristic
Shot : A rock band is performing on stage in front of a large crowd, lit by stage lights. The crowd is excited and cheering.
Aesthetic Score : 0.7
Mood : energetic, live, optimistic
Quality
Entropy : 6.70
Noise : 82
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are minor artifacts and errors in the image, such as some slight pixelation in some areas.
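The Entropy values reported for the images above range from 6.54 to 6.91, but this post does not say how they were computed. One common reading, assumed here, is the Shannon entropy of an image's 8-bit grayscale histogram, which runs from 0 bits for a flat image to 8 bits for a perfectly uniform spread of intensities. The function name and the toy pixel lists below are illustrative only and are not taken from the evaluation pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit intensity values.

    Assumption: this mirrors one common definition of image entropy;
    the post does not confirm that its Entropy metric works this way.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1 / p) over every intensity level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy inputs (hypothetical, not real image data).
flat_image = [128] * 1024              # every pixel has the same value
uniform_image = list(range(256)) * 4   # every intensity equally frequent

print(shannon_entropy(flat_image))     # lowest possible entropy
print(shannon_entropy(uniform_image))  # highest possible entropy for 8-bit data
```

Under this assumption, values near 6.5 to 6.9 would indicate images that use most of the available tonal range without being uniformly distributed.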
Conclusion
The results show that the generative AI model handled scene composition well and camera positions moderately well, but struggled with achieving the desired aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.45, indicating a moderate ability to follow the camera position instructions in the prompt. This is rated okay, since it falls just below the 0.5 to 0.75 range considered good.
- Shot Analysis: The model scored 0.535, indicating a good ability to understand the scene composition described in the prompt. This falls within the 0.5 to 0.75 range considered good.
- Aesthetic Analysis: The model scored 0.04. This falls within the -0.2 to 0.1 range that the scoring scale labels very good, yet the evaluation reads it as a significant difference between the expected and actual aesthetic of the generated images, suggesting that the model struggled to capture the desired aesthetic style.
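The band checks in the breakdown above can be reproduced with a few lines of Python. This is a minimal sketch: the only bands used are the two the post states (0.5 to 0.75 for good on camera position and shot analysis, and -0.2 to 0.1 for very good on aesthetic analysis). The helper name in_band, the constant names, and the choice to treat band edges as inclusive are assumptions made for illustration.

```python
def in_band(score, low, high):
    """Return True when score lies inside [low, high].

    Assumption: band edges are inclusive; the post does not say.
    """
    return low <= score <= high


# Bands quoted in the post. No other bands are stated, so none are modeled.
GOOD_BAND = (0.5, 0.75)
VERY_GOOD_AESTHETIC_BAND = (-0.2, 0.1)

# Scores reported in the conclusion, paired with the band they are judged against.
results = {
    "camera_position": (0.45, GOOD_BAND),
    "shot_analysis": (0.535, GOOD_BAND),
    "aesthetic_analysis": (0.04, VERY_GOOD_AESTHETIC_BAND),
}

band_check = {
    name: in_band(score, *band) for name, (score, band) in results.items()
}
print(band_check)
# {'camera_position': False, 'shot_analysis': True, 'aesthetic_analysis': True}
```

The camera position score is the only one that lands outside its stated band, which matches its okay rating.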
Overall, the model shows promise in understanding camera positions and scene composition, but needs improvement in achieving the desired aesthetic.
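For readers who want a single view of the ten images, the per-image figures listed in this post can be averaged directly. The sketch below uses only the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values shown above. The short image labels, the variable names, and the 0.5 cutoff for counting an image as flagged are assumptions for illustration, not part of the original evaluation.

```python
from statistics import fmean

# (short label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the per-image sections of this post in order.
images = [
    ("mountain hiker", 0.80, 0.23, 0.90),
    ("video gaming session", 0.60, 0.27, 0.10),
    ("sunset superhero", 0.70, 0.25, 0.90),
    ("marketplace family", 0.80, 0.25, 0.20),
    ("traveler in hills", 0.80, 0.23, 0.20),
    ("stage dancers", 0.70, 0.26, 0.20),
    ("forest hiker", 0.75, 0.26, 0.80),
    ("board game night", 0.70, 0.25, 0.80),
    ("couple at sunset", 0.80, 0.29, 0.40),
    ("band on stage", 0.70, 0.21, 0.10),
]

mean_aesthetic = round(fmean(row[1] for row in images), 3)
mean_clip = round(fmean(row[2] for row in images), 3)

# Hypothetical cutoff: count an image as flagged when its likelihood is >= 0.5.
AI_FLAG_THRESHOLD = 0.5
flagged = [row[0] for row in images if row[3] >= AI_FLAG_THRESHOLD]

print(mean_aesthetic)  # 0.735
print(mean_clip)       # 0.25
print(len(flagged))    # 4
```

With that cutoff, four of the ten images are flagged as likely AI-generated, even though all ten were produced by the same model.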