AI's Artistic Journey: Capturing Poses, Missing the Mood with Stability-ai-ultra
Text-to-image generation is advancing rapidly, but balancing technical accuracy with artistic expression remains a challenge. This post reviews an experiment in which an AI model generated images from prompts that specify a pose and a scene, and it shows both the model's strengths and its limitations in capturing the desired aesthetic.
Created with: stability-ai-ultra
Conquering the Clouds: A Moment of Solitude and Inspiration
A lone hiker stands triumphant on a mountain peak, bathed in the golden light of the setting sun. Below, a sea of clouds stretches out, offering a breathtaking view of nature’s grandeur. This image captures the essence of adventure, serenity, and the indomitable spirit of exploration.
Prompt
poses ankle-cross: Determined, confident, facing the unknown ; A lone adventurer, standing atop a windswept mountain peak; wide shot; Adventure; Dramatic sky with swirling clouds; cinematic
Characteristic
Shot : A lone figure stands on the peak of a mountain, gazing out at a sea of clouds below. The sky is filled with dramatic clouds and the setting sun casts a warm glow on the scene.
Aesthetic Score : 0.8
Mood : serene, awe-inspiring, adventurous
Quality
Entropy : 6.86
Noise : 84
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors
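The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of an image's 8-bit grayscale intensity histogram, which runs from 0 bits for a perfectly flat image to 8 bits when all 256 levels are equally likely. The values reported in this post (5.46 to 6.92) would fit that scale. A minimal Python sketch under that assumption, where `shannon_entropy` is a hypothetical helper name and the pixel lists are toy inputs rather than real images:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of intensity values.

    Assumption: this mirrors a histogram-based image entropy; the post does
    not document the actual formula behind its "Entropy" field.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy inputs: a single flat gray level, and every 8-bit level exactly once.
flat_image = [128] * 64
uniform_image = list(range(256))

print(shannon_entropy(flat_image))
print(shannon_entropy(uniform_image))
```

Under this reading, a lower value such as the 5.46 reported for the silhouette image would be consistent with large, near-uniform dark regions, although that interpretation is a guess rather than something the post confirms.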
Silhouetted Strength: A Hero’s Sunset
A muscular figure stands tall against a vibrant sunset, their silhouette casting a powerful presence against the city skyline. The dramatic lighting and heroic pose evoke a sense of hope and strength, leaving a lasting impression.
Prompt
poses ankle-cross: Powerful, heroic, standing tall ; A superhero, silhouetted against a blazing sunset; medium shot; Heroism; City skyline with towering buildings; cinematic
Characteristic
Shot : Silhouette of a muscular man standing in front of a city skyline at sunset.
Aesthetic Score : 0.6
Mood : powerful, dramatic, hopeful
Quality
Entropy : 5.46
Noise : 61
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.70
Image errors : The cityscape is slightly blurry and the man’s silhouette is not as defined as it could be.
Lost in the Neon Glow: A Young Man’s Immersive VR Journey
A young man, bathed in vibrant blue and pink neon light, is completely absorbed in a virtual reality experience. The intense focus on the subject, blurred background, and contrasting lighting create a sense of mystery and intrigue, transporting viewers into a futuristic world of immersive possibilities.
Prompt
poses ankle-cross: Immersed, concentrated, in the zone ; A gamer, intensely focused on a virtual reality headset; close-up; Gaming; Futuristic, neon-lit gaming room; cinematic
Characteristic
Shot : A young person is wearing a VR headset and holding a controller. The background is out of focus and features vibrant blue and pink lighting.
Aesthetic Score : 0.7
Mood : futuristic, intense, playful
Quality
Entropy : 6.88
Noise : 69
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Finding Peace in the Vastness
A solitary figure sits on a stone wall, gazing out at a breathtaking valley. The sun bathes the scene in golden light, creating a sense of tranquility and contemplation. The man’s isolation against the backdrop of towering mountains evokes a feeling of peace and solitude.
Prompt
poses ankle-cross: Awe-struck, contemplative, taking in the beauty ; A tourist, gazing out at a breathtaking vista; medium shot; Tourism; Ancient ruins with a panoramic view; cinematic
Characteristic
Shot : A lone man sits on a stone wall overlooking a valley with ancient ruins in the distance. The sky is blue and there are clouds in the sky.
Aesthetic Score : 0.75
Mood : serene, contemplative, tranquil
Quality
Entropy : 6.92
Noise : 84
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Minor sharpening artifacts visible on some of the edges of the rocks and foliage.
Lost in the Vastness: A Hiker’s Moment of Contemplation
A lone hiker stands on a sand dune, dwarfed by the endless expanse of the desert. The clear blue sky and serene atmosphere evoke a sense of isolation and contemplation, capturing the adventurous spirit of exploring the unknown.
Prompt
poses ankle-cross: Free-spirited, adventurous, embracing the unknown ; A backpacker, standing at the edge of a vast desert; wide shot; Travel; Endless sand dunes stretching into the horizon; cinematic
Characteristic
Shot : A lone hiker stands on a sand dune, looking out at the vast expanse of the desert. The sky is a brilliant blue, and the sand is a warm orange.
Aesthetic Score : 0.7
Mood : serene, contemplative, adventurous
Quality
Entropy : 6.76
Noise : 62
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry and has a bit of noise, especially in the sand dunes.
Friendship Under Festive Lights
Four friends radiate joy as they stroll down a vibrant street, bathed in the warm glow of colorful lights. The scene captures the essence of a festive gathering, brimming with happiness and camaraderie.
Prompt
poses ankle-cross: Joyful, carefree, enjoying each other’s company ; A group of friends, laughing and celebrating; medium shot; Groups; Vibrant, bustling street scene with colorful lights; cinematic
Characteristic
Shot : A group of four young friends walk along a cobblestone street in the evening, laughing and enjoying each other’s company. The street is lined with festive lights creating a warm and inviting atmosphere.
Aesthetic Score : 0.7
Mood : joyful, festive, cheerful
Quality
Entropy : 6.82
Noise : 85
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Minor noise in the image, particularly in the background areas, might be due to low light conditions during capture.
A Knight’s Journey Begins: A Majestic Castle Beckons
A lone knight stands at the threshold of adventure, gazing upon a magnificent castle perched atop a rocky hill. The scene evokes a sense of epic grandeur, mystery, and tranquility, hinting at the challenges and wonders that await within the castle’s walls.
Prompt
poses ankle-cross: Stoic, vigilant, protecting the realm ; A lone warrior, standing guard at a castle gate; medium shot; Heroism; Majestic castle with a moat and drawbridge; cinematic
Characteristic
Shot : A knight stands in front of an open gate, looking towards a large castle on a hill across a lake.
Aesthetic Score : 0.7
Mood : medieval, epic, dramatic
Quality
Entropy : 6.76
Noise : 100
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : The water in the lake appears somewhat flat and artificial. Some of the textures, particularly the stonework, lack detail and appear somewhat smooth.
Campfire Camaraderie: A Night Under the Stars
A group of friends gather around a crackling campfire in the heart of the forest, sharing stories and laughter under the starry sky. The warm glow of the fire illuminates their faces, creating a cozy and intimate atmosphere. This scene captures the essence of adventure and friendship, reminding us of the simple joys of life.
Prompt
poses ankle-cross: Intrigued, curious, sharing stories ; A group of explorers, huddled around a campfire; close-up; Adventure; Dense forest with flickering flames; cinematic
Characteristic
Shot : A group of four people are sitting around a campfire in a forest. The people are dressed in casual clothing, and the fire is glowing brightly. The scene is lit by the fire and the surrounding forest is dark.
Aesthetic Score : 0.7
Mood : mysterious, cozy, adventurous
Quality
Entropy : 6.85
Noise : 92
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible image errors
Headphones On, Lights Up: Celebrating the Moment
A man, bathed in vibrant LED light, throws his arms in the air, radiating pure joy. The silhouette against the bright TV screen creates a dynamic and celebratory mood, capturing the energy of the moment.
Prompt
poses ankle-cross: Excited, victorious, celebrating success ; A gamer, triumphantly raising their hands after winning a game; close-up; Gaming; Brightly lit gaming console with flashing lights; cinematic
Characteristic
Shot : A young man sitting on a couch in a dimly lit room with colorful LED lights and a TV screen behind him. He is wearing headphones and has his arms raised in the air.
Aesthetic Score : 0.6
Mood : energetic, vibrant, playful
Quality
Entropy : 6.82
Noise : 76
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and the colors are a bit oversaturated.
City Lights, City Dreams: A Rooftop Romance
A couple embraces on a rooftop, their silhouettes framed against the twinkling cityscape. The scene evokes a sense of intimacy, isolation, and the magic of a shared moment under the stars.
Prompt
poses ankle-cross: Intimate, romantic, enjoying the view together ; A couple, standing on a balcony overlooking a bustling city; medium shot; Travel; Romantic cityscape with twinkling lights; cinematic
Characteristic
Shot : A couple is standing on a rooftop overlooking a cityscape at night. They are embracing and looking at each other, with the city lights blurred in the background.
Aesthetic Score : 0.7
Mood : romantic, intimate, dreamy
Quality
Entropy : 6.86
Noise : 80
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and the colors are a bit muted.
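To compare the ten images side by side, the per-image numbers listed above can be collected and averaged. The following Python sketch copies the values from the result blocks in this post; the field names (`aesthetic`, `clip`, `ai_likelihood`, and so on) and the shortened titles are labels chosen for illustration, not names used by the evaluation tool.

```python
from statistics import mean

# Values copied from the ten result blocks in this post.
FIELDS = ("title", "aesthetic", "entropy", "noise", "clip", "ai_likelihood")
ROWS = [
    ("Conquering the Clouds", 0.80, 6.86, 84, 0.25, 0.20),
    ("Silhouetted Strength", 0.60, 5.46, 61, 0.29, 0.70),
    ("Lost in the Neon Glow", 0.70, 6.88, 69, 0.23, 0.10),
    ("Finding Peace in the Vastness", 0.75, 6.92, 84, 0.26, 0.20),
    ("Lost in the Vastness", 0.70, 6.76, 62, 0.28, 0.30),
    ("Friendship Under Festive Lights", 0.70, 6.82, 85, 0.26, 0.20),
    ("A Knight's Journey Begins", 0.70, 6.76, 100, 0.28, 0.90),
    ("Campfire Camaraderie", 0.70, 6.85, 92, 0.31, 0.10),
    ("Headphones On, Lights Up", 0.60, 6.82, 76, 0.26, 0.20),
    ("City Lights, City Dreams", 0.70, 6.86, 80, 0.29, 0.20),
]
results = [dict(zip(FIELDS, row)) for row in ROWS]


def summarize(items):
    """Average every numeric field across the given result dicts."""
    return {key: mean(r[key] for r in items) for key in FIELDS[1:]}


summary = summarize(results)
most_ai_like = max(results, key=lambda r: r["ai_likelihood"])
best_prompt_match = max(results, key=lambda r: r["clip"])

for key, value in summary.items():
    print(f"mean {key}: {value:.3f}")
print("highest AI likelihood:", most_ai_like["title"])
print("highest Prompt Clip Score:", best_prompt_match["title"])
```

Averages like these describe only this set of ten images; with so few samples they should be read as a rough summary, not as a benchmark of the model.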
Conclusion
The results show that the generative AI model came close to the “good” range for camera position and shot analysis, but landed outside the target range for aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model’s ability to accurately interpret and reproduce camera positions in the prompt is decent, but could be improved.
- Shot Analysis: The model scored 0.48, also slightly below the “good” range. This indicates that the model’s understanding of the scene and its ability to create shots that match the prompt are fairly good, but not exceptional.
- Aesthetic Analysis: The model scored 0.12, which is just above the upper bound of the “very good” range of -0.2 to 0.1. This means that the generated images’ aesthetic deviated from the aesthetic expected from the prompt, which could indicate that the model struggled to capture the desired visual style or mood.
Overall, the model shows promise in its ability to understand and implement camera positions and shot types, but needs improvement in its ability to generate images that match the desired aesthetic.
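The comparison of each score against its reference range can be made explicit with a few lines of Python. This is a sketch that uses only the three scores and two ranges quoted in this conclusion; the helper name `position` and the dictionary layout are illustrative, and how the scores themselves were produced is not covered here.

```python
def position(score, low, high):
    """Return where a score sits relative to [low, high] and by how much."""
    if score < low:
        return "below", low - score
    if score > high:
        return "above", score - high
    return "within", 0.0


# (score, range low, range high) as quoted in the conclusion.
SCORES = {
    "Camera Position": (0.45, 0.5, 0.75),
    "Shot Analysis": (0.48, 0.5, 0.75),
    "Aesthetic Analysis": (0.12, -0.2, 0.1),
}

for name, (score, low, high) in SCORES.items():
    where, gap = position(score, low, high)
    print(f"{name}: {score} is {where} [{low}, {high}] by {gap:.2f}")
```

Seen this way, all three scores fall outside their reference ranges by a small margin, between 0.02 and 0.05.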