AI's Artistic Journey: Capturing Poses, But Missing the Essence with Scenario
Generating images from text prompts is one of the more visible areas of current AI work. This experiment looks at pose generation: an AI model was asked to create images that capture specific poses within various scenes. The model shows a reasonable grasp of camera angles and shot composition, but it falls short of the desired aesthetic, which highlights the ongoing challenges in AI’s artistic development. This post presents the results of the experiment, analyzes the model’s strengths and weaknesses, and considers the implications for the future of AI-generated art.
Created with: scenario
Red Dress Against the Storm
A woman in a vibrant red dress stands defiantly on a cliff edge, the stormy sea crashing below. The dramatic contrast creates a powerful and melancholic scene.
Prompt
poses rule-of-thirds: Epic, determined, hopeful ; A lone hero standing on a cliff overlooking a vast, stormy sea; Wide shot; Heroism; Dramatic sky with crashing waves; cinematic
Characteristic
Shot : A woman in a red dress stands on a cliff overlooking a stormy sea. Dark clouds and a dramatic sky create a sense of foreboding.
Aesthetic Score : 0.8
Mood : dramatic, mysterious, melancholic
Quality
Entropy : 6.68
Noise : 92
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors or artifacts in the image.
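The prompt above follows a fixed, semicolon-separated layout, and the same layout recurs in every prompt in this post. As a minimal sketch, the snippet below splits such a prompt into its parts so that the requested shot type and mood can be compared with what the image actually shows. The field names (`moods`, `scene`, `shot`, `category`, `details`, `style`) and the `parse_prompt` helper are labels chosen here for illustration; they are not Scenario terminology or part of any Scenario API.

```python
def parse_prompt(prompt: str) -> dict:
    """Split a 'prefix: moods ; scene; shot; category; details; style' prompt."""
    # Everything before the first colon is the prefix ("poses rule-of-thirds").
    prefix, _, body = prompt.partition(":")
    parts = [part.strip() for part in body.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    moods, scene, shot, category, details, style = parts
    return {
        "prefix": prefix.strip(),
        "moods": [m.strip() for m in moods.split(",")],  # hypothetical field name
        "scene": scene,
        "shot": shot,
        "category": category,
        "details": details,
        "style": style,
    }


# The first prompt from this post, copied verbatim.
prompt = (
    "poses rule-of-thirds: Epic, determined, hopeful ; A lone hero standing on a cliff "
    "overlooking a vast, stormy sea; Wide shot; Heroism; "
    "Dramatic sky with crashing waves; cinematic"
)
parsed = parse_prompt(prompt)
print(parsed["shot"])   # Wide shot
print(parsed["moods"])  # ['Epic', 'determined', 'hopeful']
```

Here the requested moods (epic, determined, hopeful) can be set beside the reported mood (dramatic, mysterious, melancholic), which makes the gap between request and result easy to spot.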
Warmth in the Winter Wilderness
Four friends gather around a crackling campfire in a snowy forest, seeking warmth and adventure. The cozy glow of the fire contrasts with the cold, white landscape, creating a sense of comfort and escape.
Prompt
poses rule-of-thirds: Intriguing, mysterious, suspenseful ; A group of adventurers huddled around a campfire in a dense forest; Medium shot; Adventure; Shadows and flickering flames; cinematic
Characteristic
Shot : A group of four people are gathered around a campfire in a snowy forest, with a teepee-like tent behind them. The scene is illuminated by the fire, which casts warm light on the figures. The forest is dark and shadowy, creating a sense of mystery and seclusion.
Aesthetic Score : 0.7
Mood : cozy, adventurous, mystical
Quality
Entropy : 6.69
Noise : 94
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the lines in the image are slightly jagged, which is typical of digital illustrations.
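Each image above is reported with an Entropy value (6.68 and 6.69 so far). The post does not say how this figure is computed. One common definition that would produce values in this range is the Shannon entropy of the 8-bit grayscale histogram, which runs from 0 for a flat image to 8 bits for an image that uses all 256 gray levels equally often. The sketch below assumes that definition; `shannon_entropy` is a hypothetical helper, and the pixel lists are toy data, not the images from this experiment.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale pixel values.

    H = sum over gray levels of p * log2(1 / p), where p is the share of
    pixels at that level. This is an assumed definition, not one confirmed
    by the post.
    """
    counts = Counter(pixels)
    total = len(pixels)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A flat image with a single gray level carries no information.
flat = [128] * 1024
# An image that uses all 256 levels equally often reaches the 8-bit maximum.
uniform = list(range(256)) * 4

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this assumption, values between roughly 6.2 and 6.8, as reported throughout this post, would describe images with a wide but not fully even spread of tones.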
Ready to Play: A Confident Gamer’s Intense Focus
This close-up shot captures a young woman with blonde hair, her eyes locked on the camera, holding a white game controller. Her confident expression and playful energy create a sense of anticipation, hinting at the excitement of the game ahead.
Prompt
poses rule-of-thirds: Focused, intense, exhilarating ; A gamer’s hands intensely gripping a controller, the screen displaying a thrilling moment in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : A young woman with blonde hair is holding a white video game controller in front of her. There are two computer monitors in the background, one with a blurry image and the other with a bright red light.
Aesthetic Score : 0.7
Mood : confident, playful, focused
Quality
Entropy : 6.77
Noise : 75
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, and there are some imperfections in the lighting and color balance.
Tranquility Amidst the Peaks: A Hiker Finds Peace in a Mountain Lake
A solitary hiker stands on a rock in the heart of a serene mountain lake, surrounded by vibrant greenery and majestic snow-capped peaks. The scene evokes a sense of tranquility and peace, with the hiker’s small figure highlighting the vastness of nature.
Prompt
poses rule-of-thirds: Tranquil, awe-inspiring, peaceful ; A majestic mountain range reflected in a still lake, with a lone hiker standing on a rocky outcrop; Wide shot; Tourism; Clear blue sky and vibrant green foliage; cinematic
Characteristic
Shot : A lone hiker stands on a rock in the middle of a calm, clear mountain lake, gazing at the majestic mountains in the distance. The mountains and surrounding foliage are perfectly mirrored in the water.
Aesthetic Score : 0.8
Mood : serene, tranquil, peaceful
Quality
Entropy : 6.66
Noise : 105
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts.
Golden Fields, A Moment of Longing
A woman gazes out the train window, her expression lost in thought as the sun bathes a field of wheat in golden light. The scene evokes a sense of nostalgia, dreaminess, and romantic longing.
Prompt
poses rule-of-thirds: Nostalgic, romantic, adventurous ; A vintage train speeding through a picturesque countryside, with a lone traveler gazing out the window; Medium shot; Travel; Rolling hills and vibrant fields; cinematic
Characteristic
Shot : A young woman looking out the window of a train, possibly a vintage train. The view outside is of a field of grain and hills.
Aesthetic Score : 0.8
Mood : dreamy, wistful, romantic
Quality
Entropy : 6.83
Noise : 83
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors, but the woman’s skin appears slightly airbrushed and the lighting is a bit too perfect.
Laughter and Light: Friends Enjoying a Vibrant Street Market
Three young women share laughter and delicious food at a bustling street market. The scene is bursting with color and warmth, capturing the joy and camaraderie of their friendship.
Prompt
poses rule-of-thirds: Joyful, lively, celebratory ; A group of friends laughing and enjoying a meal together at a bustling outdoor market; Medium shot; Groups; Colorful stalls and vibrant street life; cinematic
Characteristic
Shot : Three young women laughing while eating at an outdoor market, with a background of a bustling market scene.
Aesthetic Score : 0.8
Mood : joyful, carefree, vibrant
Quality
Entropy : 6.79
Noise : 97
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors; some minor imperfections in the background.
Silhouetted Serenity: A Woman Bathed in Sunset Hues
A woman in a flowing dress stands in the shallows, her silhouette a stark contrast against the vibrant orange and pink sunset sky. The water reflects the fiery colors, creating a scene of serene beauty and contemplative peace.
Prompt
poses rule-of-thirds: Melancholy, reflective, hopeful ; A lone figure standing on a deserted beach, watching the sun setting over the horizon; Wide shot; Heroism; Golden light illuminating the sky and water; cinematic
Characteristic
Shot : A solitary woman standing on a beach at sunset, with her reflection in the water
Aesthetic Score : 0.8
Mood : serene, contemplative, peaceful
Quality
Entropy : 6.21
Noise : 90
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors
Into the Unknown: A Journey Through the Jungle
Two figures venture into a dense jungle, sunlight filtering through the canopy as they walk along a stone pathway. The scene evokes a sense of mystery, adventure, and hope, with the light and their journey into the unknown creating a dramatic and exciting atmosphere.
Prompt
poses rule-of-thirds: Intriguing, suspenseful, adventurous ; A group of explorers navigating a treacherous jungle path, with dense foliage surrounding them; Medium shot; Adventure; Lush greenery and dappled sunlight; cinematic
Characteristic
Shot : Two figures, possibly hikers, walking up a stone path in a dense jungle with sunlight streaming through the foliage.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, serene
Quality
Entropy : 6.55
Noise : 116
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The foliage and lighting have some unrealistic elements, possibly due to digital manipulation. The figures are somewhat flat and lack details.
A Moment of Quiet Contemplation
A close-up portrait captures the soft beauty of a young woman’s face, her eyes reflecting a gentle and contemplative mood. The soft lighting enhances the intimacy and vulnerability of the moment.
Prompt
poses rule-of-thirds: Focused, intense, determined ; A close-up of a gamer’s face, eyes glued to the screen, as they navigate a challenging level in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : Close-up portrait of a young woman’s face, with focus on her eyes and lips.
Aesthetic Score : 0.7
Mood : soft, dreamy, gentle
Quality
Entropy : 6.81
Noise : 84
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be digitally generated and has some minor artifacts, particularly around the edges of the face.
Silhouette of Romance: A Woman’s Nighttime Cityscape
A captivating image of a woman in a black dress, standing on a rooftop and gazing out at the city lights. The silhouette against the urban backdrop creates a romantic and mysterious mood, with a touch of dramatic flair.
Prompt
poses rule-of-thirds: Energetic, exciting, awe-inspiring ; A panoramic view of a bustling city skyline, with a lone tourist standing on a rooftop overlooking the scene; Wide shot; Tourism; Vibrant lights and towering buildings; cinematic
Characteristic
Shot : A woman standing on a rooftop overlooking a city at dusk. The city is lit up with lights, and the sky is a mix of orange and purple.
Aesthetic Score : 0.7
Mood : romantic, adventurous, urban
Quality
Entropy : 6.61
Noise : 110
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.70
Image errors : There are some minor artifacts in the image, particularly around the edges of the buildings. The lighting is a bit too uniform, and the colors are a bit desaturated.
Conclusion
The analysis shows that the generative AI model performed reasonably well at understanding the scene and camera position, but struggled with the aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.2, which is considered okay. The camera position in the generated images differed somewhat from what the prompts requested. Ideally, a score between 0.5 and 0.75 would indicate a good understanding of the camera position.
- Shot Analysis: The model scored 0.6, which is considered good. The shot composition of the generated images was fairly close to what the prompts requested.
- Aesthetic Analysis: The model scored -0.01, which is considered very good. This means the aesthetic of the generated images was very close to the expected aesthetic.
Overall, the model seems to be better at understanding the scene and camera position than it is at achieving the desired aesthetic.
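For a quick cross-check of the per-image figures listed earlier in this post, the sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI across all ten images. The numbers are copied from the post. The tuple layout, the variable names, and the 0.7 cut-off are choices made here for illustration, and these averages are not necessarily how the three scores in the breakdown above were derived.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the post
results = [
    ("Red Dress Against the Storm", 0.8, 0.25, 0.10),
    ("Warmth in the Winter Wilderness", 0.7, 0.26, 0.80),
    ("Ready to Play", 0.7, 0.23, 0.30),
    ("Tranquility Amidst the Peaks", 0.8, 0.28, 0.10),
    ("Golden Fields, A Moment of Longing", 0.8, 0.27, 0.30),
    ("Laughter and Light", 0.8, 0.25, 0.20),
    ("Silhouetted Serenity", 0.8, 0.24, 0.20),
    ("Into the Unknown", 0.7, 0.22, 0.80),
    ("A Moment of Quiet Contemplation", 0.7, 0.20, 0.90),
    ("Silhouette of Romance", 0.7, 0.24, 0.70),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai_likelihood = mean(r[3] for r in results)

# Image whose content matched its prompt least, according to the CLIP score.
weakest_match = min(results, key=lambda r: r[2])[0]
# Images the evaluator flagged as likely AI-generated (0.7 is an arbitrary cut-off).
flagged = [r[0] for r in results if r[3] >= 0.7]

print(round(mean_aesthetic, 2))      # 0.75
print(round(mean_clip, 3))           # 0.244
print(round(mean_ai_likelihood, 2))  # 0.44
print(weakest_match)                 # A Moment of Quiet Contemplation
print(len(flagged))                  # 4
```

The weakest prompt match is the close-up portrait, where the prompt asked for a gamer’s face in front of a screen and the image shows a soft portrait with no gaming context.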