AI's Artistic Journey: Capturing Poses, But Missing the Vibe with Stable Diffusion
In the realm of artificial intelligence, the ability to generate images based on textual prompts is rapidly advancing. However, achieving a perfect balance between technical accuracy and artistic expression remains a challenge. This blog post examines the results of an AI model tasked with generating images based on specific poses and scenes, highlighting its strengths and weaknesses in capturing the intended aesthetic.
Created with: stability-ai-core
A Solitary Figure Faces the Storm
A lone figure, shrouded in darkness, walks a desolate path towards a menacing storm cloud. The dramatic lighting, composition, and the figure’s solitary journey create a sense of mystery and foreboding.
Prompt
poses running: determined, hopeful ; A lone figure in a tattered cloak; wide shot; Heroism; a desolate wasteland with a storm brewing in the distance; cinematic
Characteristic
Shot : A lone figure in a long dark cloak walks down a dirt road towards a looming storm cloud in the distance. The road is lined with debris and there are puddles of water reflecting the dark sky.
Aesthetic Score : 0.7
Mood : dark, ominous, isolated
Quality
Entropy : 6.74
Noise : 73
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slightly blurry background and a few artifacts, such as some unnatural-looking debris.
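The post does not say how the Entropy figure is computed. A common choice, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (one flat gray level) to 8 bits (all 256 levels equally likely). The sketch below uses only the standard library and made-up pixel lists, not the images from this post.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of gray levels (0-255)."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1 / p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical pixel data for illustration only.
flat = [128] * 16             # a single gray level
two_tone = [0, 0, 255, 255]   # two equally frequent levels
uniform = list(range(256))    # every level exactly once

for name, data in [("flat", flat), ("two_tone", two_tone), ("uniform", uniform)]:
    print(name, shannon_entropy(data))
```

Under that assumption, values around 6.7 like the ones reported here would indicate images that use a wide spread of tones.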
Uncharted Territory: A Man’s Journey into the Jungle’s Heart
A lone figure races towards an ancient stone structure, shrouded in the vibrant green of a dense jungle. The air crackles with anticipation as he ventures deeper into the unknown, fueled by a thirst for adventure and discovery.
Prompt
poses running: excited, curious ; A young adventurer with a backpack; medium shot; Adventure; a lush jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A man with a backpack running towards the camera on a path through a jungle, an ancient stone building is visible in the background.
Aesthetic Score : 0.7
Mood : adventurous, mysterious, exciting
Quality
Entropy : 6.77
Noise : 89
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors in the image.
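Every prompt in this post follows the same semicolon-separated template. As a rough sketch, it can be split into named parts; the field names below (pose, moods, subject, shot, category, setting, style) are my own labels for illustration, not terminology from the generator.

```python
from dataclasses import dataclass

@dataclass
class ScenePrompt:
    pose: str       # e.g. "running"
    moods: list     # e.g. ["excited", "curious"]
    subject: str
    shot: str       # camera framing, e.g. "medium shot"
    category: str
    setting: str
    style: str

def parse_prompt(text):
    """Split a 'poses X: moods ; subject; shot; category; setting; style' prompt."""
    parts = [p.strip() for p in text.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, shot, category, setting, style = parts
    pose_part, _, mood_part = head.partition(":")
    pose = pose_part.strip()
    if pose.startswith("poses "):
        pose = pose[len("poses "):]
    moods = [m.strip() for m in mood_part.split(",") if m.strip()]
    return ScenePrompt(pose, moods, subject, shot, category, setting, style)

example = (
    "poses running: excited, curious ; A young adventurer with a backpack; "
    "medium shot; Adventure; a lush jungle with ancient ruins in the background; "
    "cinematic"
)
parsed = parse_prompt(example)
print(parsed.pose, parsed.moods, parsed.shot)
```

Splitting the prompt this way makes it easy to compare the requested shot type and moods against the Shot and Mood lines reported for each image.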
The Intensity of Focus: A Gamer’s World
A dimly lit room, multiple screens, and a gamer completely immersed in the game. The lighting and his focused gaze create a palpable sense of intensity and immersion, capturing the essence of a dedicated player.
Prompt
poses running: intense, focused ; A gamer’s hands on a keyboard and mouse; close-up; Gaming; a brightly lit gaming room with a monitor displaying a virtual world; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer and playing a video game. His focus is on the game, and he has a determined look on his face.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 6.12
Noise : 59
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors
Joyful Chaos: A Festive Street Scene
Capture the energy of a bustling street festival with this vibrant photo. Colorful flags flutter overhead as a crowd of people race down a narrow street lined with shops, creating a sense of depth and movement that draws you into the scene. The mood is undeniably joyful and energetic, perfect for capturing the spirit of celebration.
Prompt
poses running: energetic, joyful ; A group of tourists running through a bustling marketplace; long shot; Tourism; a vibrant marketplace with colorful stalls and vendors; cinematic
Characteristic
Shot : A group of people, mostly men, are running down a narrow street lined with shops and decorated with colorful flags. The street is made of cobblestones and the buildings are old and worn.
Aesthetic Score : 0.6
Mood : fun, celebratory, vibrant
Quality
Entropy : 6.89
Noise : 88
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image has some slight artifacts, particularly around the edges of the figures, which suggest it may be digitally manipulated. The colors are slightly oversaturated, which makes the image look artificial.
Carefree Couple Embraces the Beauty of a White Sandy Beach
In this joyful and romantic scene, a couple is seen walking along a pristine white sandy beach, with the sparkling blue water in the background. The woman’s long flowing green dress and the man’s white shirt and dark pants add a touch of elegance to the carefree moment. The sun is shining brightly, and their smiles reflect the happiness and freedom they feel in this beautiful natural setting.
Prompt
poses running: romantic, carefree ; A couple running hand-in-hand along a beach; medium shot; Travel; a beautiful beach with turquoise water and white sand; cinematic
Characteristic
Shot : A couple is running along a beach, holding hands, with the ocean in the background. The woman is wearing a long, flowing teal dress and the man is wearing a white shirt and blue pants. The couple is smiling and appears to be happy.
Aesthetic Score : 0.75
Mood : happy, romantic, carefree
Quality
Entropy : 6.72
Noise : 54
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors were detected, but the color balance could be slightly adjusted to create more harmony and vibrancy.
Friends Running Through Sunshine
A group of friends enjoy a carefree day in the park, their laughter echoing as they run towards the camera. The warm sunlight bathes the scene in a happy glow, capturing the essence of youthful energy and friendship.
Prompt
poses running: happy, playful ; A group of friends running through a park; wide shot; Groups; a sunny park with green grass and trees; cinematic
Characteristic
Shot : A group of friends are running through a park, laughing and having fun.
Aesthetic Score : 0.7
Mood : joyful, energetic, carefree
Quality
Entropy : 6.74
Noise : 87
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears slightly grainy and there is some chromatic aberration visible.
Superman Soars Through the City, Hopeful and Powerful
A dramatic shot captures Superman in mid-stride, running across a rooftop with the city’s skyscrapers as a backdrop. The use of light and shadow, along with his dynamic pose, creates a sense of heroism and hope.
Prompt
poses running: powerful, confident ; A superhero in a bright costume; close-up; Heroism; a city skyline with skyscrapers and flashing lights; cinematic
Characteristic
Shot : Superman running towards the camera on a city rooftop, likely New York
Aesthetic Score : 0.7
Mood : powerful, heroic, hopeful
Quality
Entropy : 6.83
Noise : 71
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.50
Image errors : Slight blurriness in the background, some artifacts and distortion on Superman’s costume
Conquering the Summit: One Man, One Mountain, One Breathtaking View
A lone figure races up a snow-covered mountain path, driven by adventure and a desire to reach the peak. The vastness of the snow-capped mountains and the shimmering blue lake below create a sense of awe and inspire a feeling of determination. This is a scene of pure, unadulterated adventure.
Prompt
poses running: determined, adventurous ; A lone explorer running through a snow-covered mountain pass; long shot; Adventure; a majestic mountain range with snow-capped peaks; cinematic
Characteristic
Shot : A man is running up a snow-covered mountain path, with a breathtaking view of snow-capped mountains and a lake in the distance. The sky is clear and blue, and the sun is shining.
Aesthetic Score : 0.8
Mood : serene, adventurous, determined
Quality
Entropy : 6.77
Noise : 78
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Escape to a Futuristic City of Dragons and Excitement!
Experience the thrill of a vibrant, futuristic city where dragons soar through the sky and glowing orbs illuminate the night. A group of adventurers races through the streets, their urgency palpable in this dynamic scene. Get ready for an adventure that’s both exhilarating and visually stunning.
Prompt
poses running: immersive, exciting ; A gamer’s avatar running through a virtual world; close-up; Gaming; a vibrant and detailed virtual world with fantastical creatures; cinematic
Characteristic
Shot : A group of young people are running down a brightly lit street in a futuristic city. There are dragons flying overhead and glowing neon signs on the buildings.
Aesthetic Score : 0.7
Mood : energetic, whimsical, futuristic
Quality
Entropy : 6.84
Noise : 85
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some textures are blurry and lack detail, particularly in the background. The lighting is inconsistent: some areas look oddly flat while others are too bright.
Running Towards Happiness: A Family’s Joyful Escape
Capture the essence of carefree joy with this image of a family running through a picturesque rural landscape. The vibrant green fields and rolling hills create a backdrop of natural beauty, while the smiles on their faces radiate pure happiness and a sense of freedom. This image evokes a feeling of active adventure and the simple pleasures of life.
Prompt
poses running: happy, carefree ; A family running along a scenic road; medium shot; Travel; a winding road with rolling hills and a picturesque countryside; cinematic
Characteristic
Shot : Three people are running on a paved road in a countryside setting. There are lush green hills and a clear blue sky in the background. The road is lined with grass and a wooden fence.
Aesthetic Score : 0.7
Mood : happy, active, family
Quality
Entropy : 6.68
Noise : 76
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
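To get a quick overview of the ten images, the per-image numbers listed above can be averaged. This is plain arithmetic on the reported values; the short titles are my own abbreviations, and these averages are not the same thing as the scores in the conclusion, whose calculation the post does not describe.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI)
results = [
    ("Storm",    0.70, 0.29, 0.20),
    ("Jungle",   0.70, 0.32, 0.20),
    ("Gamer",    0.60, 0.27, 0.10),
    ("Festival", 0.60, 0.33, 0.50),
    ("Beach",    0.75, 0.30, 0.20),
    ("Friends",  0.70, 0.33, 0.20),
    ("Superman", 0.70, 0.31, 0.50),
    ("Summit",   0.80, 0.29, 0.10),
    ("Dragons",  0.70, 0.29, 0.90),
    ("Family",   0.70, 0.35, 0.20),
]

avg_aesthetic = round(mean(r[1] for r in results), 3)
avg_clip = round(mean(r[2] for r in results), 3)
avg_ai = round(mean(r[3] for r in results), 3)
most_ai_like = max(results, key=lambda r: r[3])[0]

print("mean aesthetic score:", avg_aesthetic)
print("mean prompt CLIP score:", avg_clip)
print("mean likelihood of AI:", avg_ai)
print("flagged as most AI-like:", most_ai_like)
```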
Conclusion
The results show that the generative AI model scored in the “good” range for camera position and shot analysis, though only just, and scored far lower on aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.51
- Interpretation: This score falls within the “good” range (0.5 to 0.75), indicating that the model generally understood and implemented the camera positions described in the prompt.
Shot Analysis:
- Score: 0.52
- Interpretation: Similar to camera position, this score also falls within the “good” range, suggesting the model effectively captured the intended scene and shot composition.
Aesthetic Analysis:
- Score: 0.07
- Interpretation: At 0.07, this score sits inside the -0.2 to 0.1 band and far below the “good” range (0.5 to 0.75) that the other two measures reached, indicating a mismatch between the expected aesthetic and the actual aesthetic of the generated images. This suggests the model may have struggled to accurately translate the desired visual style into the final output.
Overall:
While the model demonstrated good understanding of camera position and shot composition, it fell short in capturing the intended aesthetic. This suggests that the model might need further training or refinement to better understand and implement specific visual styles.
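As a small sanity check, the three summary scores can be compared against the “good” range quoted above. This is a minimal sketch that assumes the range bounds are inclusive; the post does not say either way, and the helper name is hypothetical.

```python
GOOD_RANGE = (0.5, 0.75)  # "good" range as quoted in the conclusion

scores = {
    "Camera Position": 0.51,
    "Shot Analysis": 0.52,
    "Aesthetic Analysis": 0.07,
}

def in_range(score, bounds):
    """Return True if score lies within bounds (inclusive, by assumption)."""
    low, high = bounds
    return low <= score <= high

verdicts = {name: in_range(score, GOOD_RANGE) for name, score in scores.items()}
for name, is_good in verdicts.items():
    print(f"{name}: {scores[name]} -> {'good' if is_good else 'below good range'}")
```

Both camera-related scores clear the lower bound by only 0.01 to 0.02, so the gap between them and the aesthetic score is the more robust finding.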