AI Struggles to Capture the Essence of Poses with Midjourney
Understanding and generating poses is central to creating compelling visual content. This analysis examines how a generative AI model interprets pose-related prompts across ten images. The model handles camera positions and shot types better than the artistic nuances of poses, although it scores slightly below average even on those technical aspects. The result highlights the ongoing gap between technical understanding and creative expression in AI. Each image below is listed with its prompt and measured characteristics, followed by the overall scores.
Created with: Midjourney
Solitude on the Summit: A Silhouette Against the Sky
A lone figure stands atop a majestic mountain, their silhouette stark against a dramatic, cloudy sky. The scene evokes a sense of serenity and contemplation, with the vast valley below shrouded in forest. The play of light and shadow creates a powerful dramatic effect, highlighting the figure’s isolation and the grandeur of the natural world.
Prompt
thoughtful-pose thoughtful-pose: determined, contemplative ; Lone figure standing on a mountain peak; wide shot; heroism; dramatic sky with clouds; cinematic
Characteristic
Shot : A lone figure stands on the peak of a mountain, overlooking a vast expanse of rolling hills and valleys. The sky is filled with dramatic clouds, creating a sense of awe and wonder.
Aesthetic Score : 0.8
Mood : serene, majestic, contemplative
Quality
Entropy : 6.85
Noise : 104
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, resulting in a washed-out look in the sky.
Lost in the Jungle: A Woman’s Quest for Discovery
A young explorer, shrouded in mystery, stands amidst the ruins of an ancient temple, her face partially hidden by shadow as she studies a map. The lush jungle setting and dramatic lighting create a sense of adventure and intrigue, hinting at a captivating journey to come.
Prompt
thoughtful-pose thoughtful-pose: curious, adventurous ; Explorer looking at a map, surrounded by ancient ruins; medium shot; adventure; jungle foliage; cinematic
Characteristic
Shot : A young woman in explorer attire stands in a jungle clearing, studying a map. The background is a crumbling stone structure partially overgrown with foliage.
Aesthetic Score : 0.8
Mood : adventure, mystery, exploration
Quality
Entropy : 6.78
Noise : 107
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.60
Image errors : Some minor artifacts are visible in the foliage and around the woman’s hair, suggesting potential AI generation.
Lost in the Game: A Moment of Intense Focus
A young woman, bathed in vibrant blue and red lighting, is completely engrossed in her video game. Her headphones block out the world, and her expression reveals a determined focus. The dramatic lighting adds to the intensity of the scene, capturing the thrill of the game.
Prompt
thoughtful-pose thoughtful-pose: intense, focused ; Gamer intensely focused on a screen, hands on a controller; close-up; gaming; neon lights and gaming peripherals; cinematic
Characteristic
Shot : A young woman wearing headphones, illuminated by neon lights, playing a video game.
Aesthetic Score : 0.7
Mood : intense, focused, futuristic
Quality
Entropy : 6.46
Noise : 100
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has slight digital artifacting around the edges of the subject’s hair.
Lost in the City’s Symphony
A solitary figure perches on a ledge, gazing out at the vibrant chaos of Hong Kong. The city’s neon lights and bustling traffic contrast starkly with the quiet contemplation of the lone individual, evoking a sense of isolation amidst the urban symphony.
Prompt
thoughtful-pose thoughtful-pose: awe-struck, contemplative ; Tourist gazing at a breathtaking cityscape; medium shot; tourism; bustling city streets; cinematic
Characteristic
Shot : A lone figure sits on a ledge overlooking a busy street in a densely populated city, with tall buildings lining the street and streetlights and traffic in the distance. The image is taken from behind the figure, who wears a white shirt and a black backpack and looks straight ahead towards the city.
Aesthetic Score : 0.7
Mood : solitude, urban, contemplative
Quality
Entropy : 6.83
Noise : 94
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some of the image elements are slightly blurred, and there is a slight amount of chromatic aberration.
Sunset Serenity: A Man Finds Tranquility on a Cliffside
A solitary figure sits perched on a cliff, gazing out at a breathtaking sunset over the vast ocean. The golden light paints the clouds, while the crashing waves and deep blue water create a scene of serene beauty. The dramatic contrast between the man’s small size and the grandeur of nature evokes a sense of contemplation and the insignificance of human existence in the face of the natural world.
Prompt
thoughtful-pose thoughtful-pose: relaxed, introspective ; Backpackers sitting on a cliff overlooking a vast ocean; wide shot; travel; sunset sky; cinematic
Characteristic
Shot : A man sitting on a cliff overlooking the ocean at sunset. The sun is setting behind the clouds, casting a golden glow over the scene.
Aesthetic Score : 0.8
Mood : tranquil, serene, contemplative
Quality
Entropy : 6.36
Noise : 112
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Campfire Nights: Tranquility Under the Stars
A group of friends gather around a crackling campfire, their laughter echoing under a vast, star-filled sky. The scene evokes a sense of warmth, camaraderie, and nostalgia, capturing the essence of a peaceful night spent in nature.
Prompt
thoughtful-pose thoughtful-pose: intimate, nostalgic ; Group of friends huddled around a campfire, sharing stories; medium shot; groups; starry night sky; cinematic
Characteristic
Shot : A group of friends are sitting around a campfire under a starry night sky. The Milky Way is visible in the sky. The scene is warm and inviting, with the fire casting a soft glow on the faces of the friends.
Aesthetic Score : 0.8
Mood : cozy, warm, magical
Quality
Entropy : 5.55
Noise : 108
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no obvious artifacts or errors in the image.
Silhouetted Solitude: A Moment of Contemplation in the City
A lone figure sits on a railing, their silhouette stark against the blurred lights of the city skyline. The scene evokes a sense of melancholy and contemplation, capturing the quiet solitude of urban life.
Prompt
thoughtful-pose thoughtful-pose: reflective, hopeful ; A lone figure standing on a bridge, looking out at the city lights; medium shot; heroism; cityscape at night; cinematic
Characteristic
Shot : A lone figure sits on a railing overlooking a city skyline at night. The city is brightly lit and the figure is silhouetted against the lights.
Aesthetic Score : 0.7
Mood : melancholy, lonely, contemplative
Quality
Entropy : 5.79
Noise : 82
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is a slight amount of noise in the image, especially in the darker areas.
Lost in the Verdant Labyrinth
A group of figures disappear into the dense foliage of a lush green forest, creating a sense of mystery and adventure. The tranquil atmosphere is punctuated by the feeling of something unknown lurking just beyond the trees.
Prompt
thoughtful-pose thoughtful-pose: determined, cautious ; A group of adventurers navigating a dense forest; wide shot; adventure; lush green foliage; cinematic
Characteristic
Shot : A group of people is hiking through a lush green jungle. They walk in single file, with the person at the back closest to the viewer. The jungle is very dense, and the light is dim.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, tranquil
Quality
Entropy : 6.48
Noise : 123
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Victory Dance in Neon Lights: Gamer Celebrates in Style
A young man, bathed in vibrant red and blue lighting, throws his arms up in victory from his gaming chair. Headphones on, he radiates pure joy and excitement, capturing the thrill of a hard-earned win.
Prompt
thoughtful-pose thoughtful-pose: triumphant, excited ; A gamer celebrating a victory, fist raised in the air; close-up; gaming; vibrant gaming setup; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in a gaming chair, with his arm raised in the air. He is looking up and smiling, seemingly very excited. The lighting is pink and blue, with the background out of focus.
Aesthetic Score : 0.7
Mood : joyful, energetic, excited
Quality
Entropy : 6.20
Noise : 94
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Silhouettes of Hope: A Family’s Tranquil Sunset
A father and his two children stand silhouetted against a breathtaking sunset on a beach, their gazes fixed on the vast ocean. The scene evokes a sense of peace, serenity, and hope, capturing the beauty of a family united against the backdrop of nature’s splendor.
Prompt
thoughtful-pose thoughtful-pose: peaceful, hopeful ; A family standing on a beach, watching the sunrise; wide shot; tourism; golden sunrise over the ocean; cinematic
Characteristic
Shot : A family of three is standing on the beach watching the sunset over the ocean. The sky is filled with clouds, and the sun is just beginning to set.
Aesthetic Score : 0.7
Mood : tranquil, peaceful, hopeful
Quality
Entropy : 6.61
Noise : 118
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors detected.
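The per-image figures listed above can be summarized in a few lines of Python. This is only a sketch: the values are copied from the ten entries in order of appearance, while the variable names, the `summarize` helper, and the choice of a simple mean are illustrative assumptions, not the method used to produce the scores reported in the conclusion.

```python
from statistics import mean

# Values transcribed from the ten image entries above, in order of appearance.
aesthetic_scores = [0.8, 0.8, 0.7, 0.7, 0.8, 0.8, 0.7, 0.6, 0.7, 0.7]
clip_scores = [0.24, 0.22, 0.32, 0.25, 0.29, 0.29, 0.25, 0.25, 0.26, 0.27]
ai_likelihoods = [0.20, 0.60, 0.20, 0.10, 0.10, 0.10, 0.30, 0.20, 0.10, 0.10]


def summarize(values):
    """Return (minimum, mean, maximum); the mean is rounded to 3 decimals.

    Hypothetical helper for illustration only.
    """
    return min(values), round(mean(values), 3), max(values)


for name, values in [
    ("Aesthetic Score", aesthetic_scores),
    ("Prompt Clip Score", clip_scores),
    ("Likelihood of AI", ai_likelihoods),
]:
    low, avg, high = summarize(values)
    print(f"{name}: min={low}, mean={avg}, max={high}")
```

On these values the mean Aesthetic Score is 0.73, the mean Prompt Clip Score is 0.264, and the mean Likelihood of AI is 0.2.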
Conclusion
The results show that the generative AI model performed better on camera position and shot analysis than on aesthetic analysis, although both of those scores fell just short of the 0.5 threshold for a good result.
Here’s a breakdown:
- Camera Position Analysis: The score of 0.45 indicates that the model’s ability to react to camera positions in the prompt is slightly below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Shot Analysis: The score of 0.44 suggests that the model’s understanding of the scene in the prompt is slightly below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Aesthetic Analysis: The score of 1.1102230246251566e-17 (which is essentially zero) indicates that the model significantly deviated from the expected aesthetic of the image. A score between -0.2 and 0.1 would be considered very good.
Overall, the model seems to be better at understanding the technical aspects of the prompt (camera position and shot) than the artistic aspects (aesthetic).
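The score bands quoted for camera position and shot analysis can be written as a small lookup. This is a minimal sketch: the function name `rate_score` is hypothetical, and the single "below average" label for everything under 0.5 is an assumption, since the text only describes 0.45 and 0.44 as slightly below average. The aesthetic score is left out because it is judged on a different scale.

```python
def rate_score(score: float) -> str:
    """Map a 0-to-1 analysis score to the bands described in the conclusion.

    Above 0.75 is "very good" and 0.5 to 0.75 is "good", as stated in the text.
    The label for scores under 0.5 is an assumption made for this sketch.
    """
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    return "below average"


# Scores reported in the breakdown above.
scores = {"camera position": 0.45, "shot": 0.44}

for name, score in scores.items():
    print(f"{name}: {score} -> {rate_score(score)}")
```

Both reported scores land in the lowest band, 0.05 and 0.06 short of the 0.5 threshold.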