AI's Artistic Struggle: Capturing the Essence of Poses with Scenario
Generating images from textual descriptions is a rapidly evolving area of artificial intelligence. This blog post describes an experiment in which an AI model was asked to create images from specific poses and scene descriptions. The model translated the scene descriptions reasonably well and came close on camera positions, but it struggled to achieve the desired aesthetic. The experiment shows how hard it remains to translate a human artistic vision into a generated image, and where current AI models are strong and where they fall short.
Created with: scenario
A Moment of Solitude in the Mountains
A lone woman stands on a rocky outcrop, dwarfed by the vast, mountainous landscape. The soft golden light and fluffy clouds create a serene atmosphere, suggesting a moment of contemplation and adventure.
Prompt
poses ankle-cross: Determined, confident, facing the unknown ; A lone adventurer, standing atop a windswept mountain peak; wide shot; Adventure; Dramatic sky with swirling clouds; cinematic
Characteristic
Shot : A woman standing on a rocky outcrop, looking out at a vast valley with mountains in the distance. The sky is cloudy and the light is soft.
Aesthetic Score : 0.7
Mood : serene, adventurous, contemplative
Quality
Entropy : 6.79
Noise : 88
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
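The prompt above follows a pattern that recurs in every example in this post: a pose prefix, a colon, and then six parts separated by semicolons. As a minimal sketch, the prompt can be split into named parts. The field names (`mood`, `subject`, `shot`, `theme`, `setting`, `style`) and the helper `parse_prompt` are hypothetical labels chosen for illustration; the post does not name them, and this is not how Scenario itself reads prompts.

```python
from typing import NamedTuple


class PromptParts(NamedTuple):
    # Field names are assumptions made for this sketch, not terms from the post.
    pose: str
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'poses <pose>: a ; b; c; d; e; f' prompt into its parts."""
    head, _, rest = prompt.partition(":")
    # Drop the leading word "poses" to keep only the pose name.
    pose = head.strip().removeprefix("poses").strip()
    fields = [part.strip() for part in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 parts after the pose, got {len(fields)}")
    return PromptParts(pose, *fields)


example = parse_prompt(
    "poses ankle-cross: Determined, confident, facing the unknown ; "
    "A lone adventurer, standing atop a windswept mountain peak; wide shot; "
    "Adventure; Dramatic sky with swirling clouds; cinematic"
)
print(example.shot)   # the requested camera framing
print(example.style)  # the requested visual style
```

Splitting the prompt this way makes it easier to compare the requested framing (here a wide shot) with the "Shot" description reported for each image.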
Superheroine Silhouetted Against a Sunset Sky
A powerful image captures a superheroine, possibly Superwoman, standing in a heroic pose on a rooftop overlooking a city skyline at sunset. The silhouette against the vibrant sky creates a dramatic and hopeful mood, emphasizing the character’s strength and the promise of a brighter future.
Prompt
poses ankle-cross: Powerful, heroic, standing tall ; A superhero, silhouetted against a blazing sunset; medium shot; Heroism; City skyline with towering buildings; cinematic
Characteristic
Shot : A female superhero in a red cape and blue suit stands on a rooftop overlooking a cityscape at sunset.
Aesthetic Score : 0.8
Mood : powerful, heroic, inspiring
Quality
Entropy : 6.74
Noise : 91
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The city skyline seems slightly blurry and lacks detail, indicating possible over-sharpening or AI generation.
Lost in the Neon Glow: A Futuristic VR Experience
A young woman, immersed in a vibrant virtual world, sits bathed in the glow of neon lights. The scene captures the playful and futuristic essence of VR gaming, with a dramatic touch thanks to the striking lighting.
Prompt
poses ankle-cross: Immersed, concentrated, in the zone ; A gamer, intensely focused on a virtual reality headset; close-up; Gaming; Futuristic, neon-lit gaming room; cinematic
Characteristic
Shot : A young woman wearing a VR headset and headphones, sitting on a floor with neon lights. There is a TV screen and a computer monitor in the background, suggesting a gaming environment.
Aesthetic Score : 0.7
Mood : futuristic, energetic, cool
Quality
Entropy : 6.82
Noise : 74
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible errors in the image.
Tranquility Amidst Ancient Ruins
A woman finds solace in the golden hour, perched on a stone ledge overlooking a sprawling ancient city. The setting sun casts a warm glow, creating a sense of peace and nostalgia as she contemplates the passage of time.
Prompt
poses ankle-cross: Awe-struck, contemplative, taking in the beauty ; A tourist, gazing out at a breathtaking vista; medium shot; Tourism; Ancient ruins with a panoramic view; cinematic
Characteristic
Shot : A woman sits on a stone ledge overlooking a large, ancient fortress. The scene is set in a warm, dry climate, with dry grasses and scrub vegetation. The woman appears to be contemplating the scene before her.
Aesthetic Score : 0.7
Mood : peaceful, contemplative, nostalgic
Quality
Entropy : 6.68
Noise : 100
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors
A Solitary Figure in the Vastness of the Desert
A woman in a white dress traverses a sand dune, her small figure dwarfed by the endless expanse of the desert. The tranquil blue sky and distant dunes create a sense of isolation and wonder, evoking a mood of serenity and adventure.
Prompt
poses ankle-cross: Free-spirited, adventurous, embracing the unknown ; A backpacker, standing at the edge of a vast desert; wide shot; Travel; Endless sand dunes stretching into the horizon; cinematic
Characteristic
Shot : A woman in a white dress walks away from the camera on a sand dune in a desert
Aesthetic Score : 0.7
Mood : serene, peaceful, adventurous
Quality
Entropy : 6.26
Noise : 73
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Joyful Stroll Through a Silent Night
A woman’s infectious laughter echoes through a quiet cobblestone street, her happiness a radiant beacon against the muted backdrop. This scene captures the essence of pure joy, a moment of lightheartedness in the stillness of the night.
Prompt
poses ankle-cross: Joyful, carefree, enjoying each other’s company ; A group of friends, laughing and celebrating; medium shot; Groups; Vibrant, bustling street scene with colorful lights; cinematic
Characteristic
Shot : A woman in a brown coat and jeans is laughing as she walks through a cobblestone street decorated with Christmas lights at night. Two other people are walking behind her.
Aesthetic Score : 0.7
Mood : happy, festive, warm
Quality
Entropy : 6.76
Noise : 102
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness on the background and some artifacts in the shadows on the left side.
A Knight’s Hope at Sunset
A lone woman in medieval armor stands at the entrance of a grand castle, bathed in the golden light of sunset. Her gaze is fixed on the distant landscape, hinting at a journey ahead. The scene evokes a sense of mystery, adventure, and hope, with the dramatic play of light and shadow adding to the intrigue.
Prompt
poses ankle-cross: Stoic, vigilant, protecting the realm ; A lone warrior, standing guard at a castle gate; medium shot; Heroism; Majestic castle with a moat and drawbridge; cinematic
Characteristic
Shot : A woman in blue armor stands facing a large stone castle gate, with a moat separating the viewer from the castle.
Aesthetic Score : 0.75
Mood : fantasy, majestic, introspective
Quality
Entropy : 6.63
Noise : 105
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, such as the blurry background and the slightly unnatural-looking water.
Warmth in the Wilderness: Explorers Gather Around a Campfire
A cozy scene unfolds in a snowy forest as four figures huddle around a crackling campfire. The firelight casts a warm glow on their faces, creating a stark contrast to the cold, mysterious surroundings. Their poses and expressions suggest a sense of camaraderie and shared adventure, inviting viewers to imagine their stories and the challenges they have faced.
Prompt
poses ankle-cross: Intrigued, curious, sharing stories ; A group of explorers, huddled around a campfire; close-up; Adventure; Dense forest with flickering flames; cinematic
Characteristic
Shot : Four people sitting around a campfire in a forest, likely at night. The light of the fire illuminates their faces.
Aesthetic Score : 0.7
Mood : cozy, nostalgic, adventurous
Quality
Entropy : 6.68
Noise : 113
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight blurring around the edges, and the lighting is a bit flat.
Joyful Gaming Session in a Cozy Living Room
A young woman finds pure joy in a video game, her laughter echoing through a living room decorated with string lights and a pink wall. The scene captures the playful and casual atmosphere of a relaxed gaming session.
Prompt
poses ankle-cross: Excited, victorious, celebrating success ; A gamer, triumphantly raising their hands after winning a game; close-up; Gaming; Brightly lit gaming console with flashing lights; cinematic
Characteristic
Shot : A young woman is sitting on the floor in a room with a pink wall, string lights, a TV screen and speakers behind her. She is holding a video game controller and laughing.
Aesthetic Score : 0.6
Mood : joyful, playful, relaxed
Quality
Entropy : 6.79
Noise : 78
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no notable errors in the image.
Sunset Embrace: A Romantic Moment on the Rooftop
Experience the dreamy intimacy of a couple’s embrace on a rooftop overlooking a city at sunset. The distant tower stands as a silent witness to their love, while the city lights create a soft, romantic glow. This scene, with an aesthetic score of 0.7, captures the essence of a romantic and intimate mood.
Prompt
poses ankle-cross: Intimate, romantic, enjoying the view together ; A couple, standing on a balcony overlooking a bustling city; medium shot; Travel; Romantic cityscape with twinkling lights; cinematic
Characteristic
Shot : A couple is embracing on a balcony overlooking a cityscape at sunset. The cityscape lights create a warm glow in the background.
Aesthetic Score : 0.7
Mood : romantic, intimate, dreamy
Quality
Entropy : 6.80
Noise : 80
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights.
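The per-image figures listed above can be summarised with a few lines of Python. This is a minimal sketch that copies the aesthetic score, prompt CLIP score, and likelihood-of-AI value for each of the ten images and averages them. The 0.5 cut-off used to flag an image as "likely AI" is an assumption made here for illustration; the post does not state a threshold.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the post.
results = [
    ("A Moment of Solitude in the Mountains", 0.7, 0.27, 0.20),
    ("Superheroine Silhouetted Against a Sunset Sky", 0.8, 0.25, 0.80),
    ("Lost in the Neon Glow", 0.7, 0.31, 0.20),
    ("Tranquility Amidst Ancient Ruins", 0.7, 0.27, 0.20),
    ("A Solitary Figure in the Vastness of the Desert", 0.7, 0.28, 0.20),
    ("Joyful Stroll Through a Silent Night", 0.7, 0.27, 0.20),
    ("A Knight's Hope at Sunset", 0.75, 0.27, 0.90),
    ("Warmth in the Wilderness", 0.7, 0.32, 0.20),
    ("Joyful Gaming Session in a Cozy Living Room", 0.6, 0.27, 0.10),
    ("Sunset Embrace", 0.7, 0.32, 0.20),
]

AI_THRESHOLD = 0.5  # assumed cut-off, not taken from the post

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
flagged = [title for title, _, _, ai in results if ai >= AI_THRESHOLD]

print(f"mean aesthetic score: {mean_aesthetic:.3f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"flagged as likely AI: {flagged}")
```

Under that assumed threshold, only the superheroine and knight images are flagged; the aesthetic scores cluster tightly around 0.7 regardless of the prompt.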
Conclusion
The results show that the generative AI model translated scene descriptions reasonably well and came close to the “good” range on camera positions, but struggled to achieve the desired aesthetic. Here’s a breakdown:
Camera Position:
- Score: 0.45
- Interpretation: This score falls below the “good” range (0.5-0.75). It suggests that the model didn’t perfectly capture the intended camera positions described in the prompt.
Shot Analysis:
- Score: 0.6
- Interpretation: This score falls within the “good” range (0.5-0.75). It indicates that the model was able to understand and translate the scene description from the prompt into the generated image reasonably well.
Aesthetic Analysis:
- Score: 0.09
- Interpretation: This score lies inside the -0.2 to 0.1 range, close to its upper bound, and far below the “good” range (0.5-0.75) applied to the two scores above. It suggests that the generated image’s aesthetic deviated considerably from the expected aesthetic described in the prompt.
Overall:
While the model translated scene descriptions reasonably well and nearly reached the “good” range on camera positions, it struggled to achieve the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic preferences into its image generation process.
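The range comparisons in the breakdown can be checked mechanically. The following is a minimal sketch that tests each reported score against the range quoted next to it. It assumes the bounds are inclusive, which the post does not specify, and the names `in_range`, `GOOD_RANGE`, and `AESTHETIC_RANGE` are hypothetical.

```python
def in_range(score: float, low: float, high: float) -> bool:
    """Return True when score lies between low and high (bounds assumed inclusive)."""
    return low <= score <= high


GOOD_RANGE = (0.5, 0.75)       # range quoted for camera position and shot analysis
AESTHETIC_RANGE = (-0.2, 0.1)  # range quoted for aesthetic analysis

# Scores and ranges exactly as reported in the breakdown.
reported = {
    "camera_position": (0.45, GOOD_RANGE),
    "shot_analysis": (0.6, GOOD_RANGE),
    "aesthetic_analysis": (0.09, AESTHETIC_RANGE),
}

checks = {
    name: in_range(score, *bounds)
    for name, (score, bounds) in reported.items()
}

for name, inside in checks.items():
    score, (low, high) = reported[name]
    verdict = "inside" if inside else "outside"
    print(f"{name}: {score} is {verdict} [{low}, {high}]")
```

The check confirms that 0.45 sits just under the 0.5-0.75 range, that 0.6 sits inside it, and that 0.09 sits inside -0.2 to 0.1.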
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://www.scenario.com