AI Struggles to Capture the Essence of Poses with Midjourney
Dramatic poses are a powerful tool in visual storytelling, conveying emotions, actions, and relationships. They are often used in photography, film, and art to create impactful and memorable images. However, generating images with specific poses presents a unique challenge for generative AI models. These models struggle to understand and implement the nuances of human poses, often resulting in images that deviate from the intended aesthetic and composition.
Created with: midjourney
Silhouetted Solitude at Sunset
A lone figure stands on a cliff, their silhouette stark against the fiery sunset over a majestic mountain range. The scene evokes a sense of tranquility and contemplation, capturing the beauty of nature and the quiet power of solitude.
Prompt
profile Profile: Epic, hopeful, determined ; A lone figure, silhouetted against a setting sun; wide shot; Heroism; A vast, mountainous landscape; cinematic
Characteristic
Shot : A lone figure stands on a rocky outcropping, silhouetted against a vibrant sunset over a mountain range.
Aesthetic Score : 0.7
Mood : tranquil, awe, solitude
Quality
Entropy : 5.60
Noise : 57
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be slightly blurry, especially around the edges of the mountains.
A Hiker’s Solitude: Finding Awe in the Mist
A lone hiker stands on a cliff edge, dwarfed by the vastness of a misty valley and a cascading waterfall. The scene evokes a sense of serenity, adventure, and awe, capturing the beauty and isolation of nature.
Prompt
profile Profile: Adventurous, free-spirited, awe-inspired ; A backpacker standing on a cliff edge, looking out at a breathtaking view; medium shot; Adventure; A sprawling valley with cascading waterfalls; cinematic
Characteristic
Shot : A lone hiker stands on the edge of a cliff overlooking a deep, misty valley. A waterfall cascades down a rocky cliff face in the distance.
Aesthetic Score : 0.8
Mood : serene, adventurous, awe-inspiring
Quality
Entropy : 6.67
Noise : 116
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but some subtle noise is visible in the background.
Neon Glow, Focused Hands: The Intensity of Gaming
A gamer’s hands grip a controller, bathed in vibrant blue and red light. The blurred background fades away, emphasizing the intense focus and immersion of the gaming experience.
Prompt
profile Profile: Focused, intense, passionate ; A gamer’s hands, illuminated by the glow of a monitor, holding a controller; close-up; Gaming; A dimly lit room with gaming posters on the walls; cinematic
Characteristic
Shot : A person is playing a video game with a controller in their hands, with a computer screen in the background showing a game or an image.
Aesthetic Score : 0.6
Mood : intense, focused, immersive
Quality
Entropy : 6.22
Noise : 93
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and the colors are a bit oversaturated.
Finding Solace in the Shadow of the Cathedral
A woman, lost in contemplation, stands before a majestic cathedral, its grandeur emphasized by her solitary presence and the bustling city life around her. The scene evokes a sense of calm and awe, capturing the quiet beauty of a moment of reflection.
Prompt
profile Profile: Curious, excited, appreciative ; A tourist gazing up at a majestic cathedral; medium shot; Tourism; A bustling city square with cobblestone streets; cinematic
Characteristic
Shot : A person with a backpack is standing in front of Cologne Cathedral, looking up at the building. There are other people in the square, and the sky is cloudy.
Aesthetic Score : 0.8
Mood : awe, wonder, travel
Quality
Entropy : 6.59
Noise : 108
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some slight noise in the image, especially in the darker areas. The person in the foreground is slightly out of focus.
Golden Hour Melancholy
A young woman gazes out the train window as the sun sets, casting a warm glow on the passing countryside. The scene evokes a sense of peaceful contemplation, with the play of light and shadow adding a touch of mystery.
Prompt
profile Profile: Reflective, contemplative, nostalgic ; A traveler sitting on a train, looking out the window at passing scenery; medium shot; Travel; A scenic train journey through rolling hills and fields; cinematic
Characteristic
Shot : A woman sits in a train, looking out of the window, with a view of rolling hills and farmland outside.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 5.69
Noise : 110
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, especially in the shadows.
Friendship Glows Under Colorful Lights
Four young women radiate joy and laughter at a vibrant party, their smiles and close proximity capturing the essence of genuine friendship and celebration.
Prompt
profile Profile: Joyful, celebratory, connected ; A group of friends laughing and celebrating together; wide shot; Groups; A lively party with colorful decorations and music; cinematic
Characteristic
Shot : Four young women are laughing together at a party, with colorful string lights hanging in the background.
Aesthetic Score : 0.7
Mood : happy, celebratory, joyful
Quality
Entropy : 6.79
Noise : 100
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major artifacts or errors are visible.
Superman: Ready to Soar
A powerful image of Superman standing tall in a city, his determined expression and pose conveying a sense of heroism and strength. The blurred skyscrapers in the background add to the dramatic effect, highlighting his dominance over the urban landscape.
Prompt
profile Profile: Powerful, confident, inspiring ; A superhero standing tall, cape billowing in the wind; medium shot; Heroism; A cityscape with towering skyscrapers; cinematic
Characteristic
Shot : Superman standing in a city, looking up at the sky, with the city background out of focus.
Aesthetic Score : 0.6
Mood : heroic, hopeful, determined
Quality
Entropy : 6.57
Noise : 91
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has some slight artifacts in the background, like a slight shimmering or blur, that might suggest AI generation. There is also a slight, almost glitchy, effect around the edges of the figure. The muscles are quite overblown and not realistic.
Lost in the Jungle: A Mayan Pyramid Beckons
A sense of mystery and adventure hangs in the air as three hikers navigate a dense jungle towards an ancient Mayan pyramid, its crumbling stone shrouded in vibrant vegetation. The dappled sunlight filtering through the canopy adds to the intrigue, hinting at the secrets hidden within the ruins. This captivating scene evokes a sense of wonder and awe at the power of nature and the mysteries of the past.
Prompt
profile Profile: Intrigued, adventurous, determined ; A group of explorers navigating a dense jungle; wide shot; Adventure; Lush greenery, ancient ruins, and dappled sunlight; cinematic
Characteristic
Shot : Three people are walking towards a Mayan temple overgrown with foliage.
Aesthetic Score : 0.8
Mood : mysterious, adventurous, tranquil
Quality
Entropy : 6.46
Noise : 130
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Lost in the Code: A Moment of Intense Focus
A young man, bathed in the ethereal glow of pink and blue light, is completely absorbed in his work. His headphones isolate him from the world, leaving only the glow of the screen and the intensity of his focus. The dim lighting adds a layer of mystery, leaving us to wonder what secrets lie within the code.
Prompt
profile Profile: Focused, competitive, determined ; A gamer’s face, lit by the screen, showing intense concentration; close-up; Gaming; A dimly lit room with a gaming setup and neon lights; cinematic
Characteristic
Shot : A young man, wearing headphones, is illuminated by pink and purple light as he sits in front of a computer screen. The screen is reflecting a red glow, suggesting the man is playing a game or watching a video.
Aesthetic Score : 0.7
Mood : focused, intense, mysterious
Quality
Entropy : 5.65
Noise : 96
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and graininess, particularly in the darker areas, indicating it might be a low-light shot or has been edited with a grainy filter.
Sunset Romance on the Beach
A couple strolls hand-in-hand along a sandy beach as the sun dips below the horizon, painting the sky with vibrant hues of orange, yellow, and blue. The warm, golden light creates a romantic and peaceful atmosphere, capturing the essence of a serene and loving moment.
Prompt
profile Profile: Romantic, peaceful, serene ; A couple holding hands, walking along a beach at sunset; medium shot; Tourism; A golden beach with turquoise waters and a vibrant sky; cinematic
Characteristic
Shot : A couple walks hand in hand on a sandy beach at sunset, the ocean waves breaking in the background, with a cloudy sky above.
Aesthetic Score : 0.8
Mood : romantic, tranquil, serene
Quality
Entropy : 6.66
Noise : 114
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable image errors.
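The per-image numbers above are easier to compare when summarized. The following is a minimal sketch, not part of the original evaluation pipeline: it copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten entries in order of appearance and reports their mean, minimum, and maximum. The variable names and the `summarize` helper are illustrative assumptions.

```python
from statistics import mean

# Values copied from the ten image entries above, in order of appearance.
aesthetic = [0.7, 0.8, 0.6, 0.8, 0.7, 0.7, 0.6, 0.8, 0.7, 0.8]
clip = [0.23, 0.25, 0.27, 0.29, 0.31, 0.31, 0.25, 0.32, 0.29, 0.29]
ai_likelihood = [0.90, 0.20, 0.10, 0.10, 0.20, 0.20, 0.70, 0.20, 0.20, 0.20]


def summarize(name, values):
    """Return a one-line summary (mean, min, max) for a list of scores.

    Hypothetical helper for illustration only.
    """
    return (
        f"{name}: mean={mean(values):.2f}, "
        f"min={min(values):.2f}, max={max(values):.2f}"
    )


if __name__ == "__main__":
    print(summarize("Aesthetic Score", aesthetic))
    print(summarize("Prompt Clip Score", clip))
    print(summarize("Likelihood of AI", ai_likelihood))
```

Under these figures the aesthetic scores stay within a narrow band (0.6 to 0.8), while the Likelihood of AI values are pulled up mainly by the silhouette and Superman images.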
Conclusion
The results of the analysis show that the generative AI model scored below average on both camera position and shot analysis, and that the aesthetic analysis score was essentially zero.
Here’s a breakdown:
- Camera Position Analysis: The score of 0.4 indicates that the model’s ability to understand and implement camera positions in the generated image is below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Shot Analysis: The score of 0.46 indicates that the model’s ability to understand and implement the scene described in the prompt is below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Aesthetic Analysis: The score of 1.1102230246251566e-17 is essentially zero. The analysis reads this as the generated images' aesthetic deviating significantly from the expected aesthetic. By the stated benchmark, a score between -0.2 and 0.1 would be considered very good, showing a close match between the expected and actual aesthetics, and a value of zero lies inside that range.
Overall: The model needs improvement in understanding and implementing camera positions, shot descriptions, and achieving the desired aesthetic.
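The rating bands used for the camera position and shot scores can be written down as a small function. This is a sketch under stated assumptions: the name `rate_score` is hypothetical, and the treatment of the exact boundary values (0.5 counted as good, 0.75 counted as good rather than very good) is an assumption, since the text only gives the ranges. The aesthetic score uses a different scale and is not covered here.

```python
def rate_score(score: float) -> str:
    """Map a camera-position or shot score to the rating bands in the text.

    Bands from the breakdown above: below 0.5 is below average,
    0.5 to 0.75 is good, above 0.75 is very good.
    Boundary handling is an assumption made for this sketch.
    """
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    return "below average"


if __name__ == "__main__":
    # Scores reported in the breakdown above.
    for name, score in [("Camera Position Analysis", 0.4), ("Shot Analysis", 0.46)]:
        print(f"{name}: {score} -> {rate_score(score)}")
```

Both reported scores, 0.4 and 0.46, fall in the lowest band.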