AI's Artistic Struggle: Capturing the Essence of Poses with Midjourney
In the realm of artificial intelligence, generating realistic and aesthetically pleasing images is a coveted skill. One area where AI models are often tested is their ability to understand and recreate poses within specific scenes. This blog post describes an experiment in which a generative AI model was asked to create images from prompts combining various poses and scenes. While the model demonstrated a good understanding of shot composition, it struggled to capture the intended camera position and aesthetic. This highlights the ongoing challenges in developing AI models that can capture the nuances of human creativity and artistic expression.
Dramatic-style poses are often used in photography, film, and visual arts to convey emotion, action, and storytelling. They are characterized by exaggerated movements, dynamic angles, and a sense of energy. Examples can be found in action movies, superhero comics, and fashion photography.
Created with: Midjourney
Victory’s Silhouette: A Warrior’s Triumphant Stand
A lone warrior, silhouetted against the sky, stands victorious after a fierce battle. Two swords held high, they symbolize dominance amidst the hazy battlefield and fallen comrades, creating a powerful image of epic triumph.
Prompt
dancing victorious dance, celebrating victory: triumphant, powerful ; A lone warrior; wide shot; heroism; a battlefield littered with fallen enemies; cinematic
Characteristic
Shot : A lone warrior with swords raised high, standing in the middle of a battlefield. Many fallen warriors are on the ground, and a cloud of dust billows behind the victor.
Aesthetic Score : 0.7
Mood : epic, triumphant, dramatic
Quality
Entropy : 5.76
Noise : 82
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors
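The prompts in this post appear to share one template: a pose description and mood separated by a colon, followed by semicolon-separated fields for subject, shot type, theme, scene, and style. The sketch below splits a prompt along those lines. The field names and the `PosePrompt` class are hypothetical labels chosen for illustration, not terminology from Midjourney or from the original experiment.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are illustrative assumptions, not part of the source.
    pose: str     # movement description before the colon
    mood: str     # mood words after the colon
    subject: str
    shot: str     # camera framing, e.g. "wide shot"
    theme: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    """Split a prompt of the assumed form 'pose: mood ; subject; shot; theme; scene; style'."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    pose, _, mood = parts[0].partition(":")
    return PosePrompt(pose.strip(), mood.strip(), *parts[1:])


example = parse_prompt(
    "dancing victorious dance, celebrating victory: triumphant, powerful ; "
    "A lone warrior; wide shot; heroism; "
    "a battlefield littered with fallen enemies; cinematic"
)
```

Separating the shot field this way makes it easy to compare the requested framing ("wide shot" here) with the shot description reported for each generated image.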
Lost in the Fog: A Jungle Adventure Unfolds
A group of excited adventurers race through a misty jungle, their expressions hinting at a thrilling journey ahead. The fog adds an air of mystery, leaving viewers eager to discover what lies beyond the veil.
Prompt
dancing celebratory dance, improvisational moves: excited, adventurous ; A group of explorers; medium shot; adventure; a dense jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A group of people are running through a dense jungle, with fog and green foliage surrounding them. They appear to be celebrating or escaping something.
Aesthetic Score : 0.6
Mood : joyful, adventurous, suspenseful
Quality
Entropy : 6.31
Noise : 100
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors in the image.
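The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely); the reported values of roughly 5.8 to 6.9 would fit that scale. A minimal sketch under that assumption, using plain lists of pixel intensities rather than a real image file:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities."""
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat_entropy = shannon_entropy([128] * 64)            # one grey level only
uniform_entropy = shannon_entropy(list(range(256)))   # every 8-bit level once
```

Under this definition, a higher value means the tones are spread more evenly across the histogram, which says something about tonal variety but nothing about whether the pose or framing matches the prompt.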
Silhouette of Determination: Woman Powers Through Workout
Capture the intensity and focus of a woman pushing her limits during a workout, highlighted by the dramatic silhouette she creates against a bright screen. This image evokes a sense of energy and determination, perfect for showcasing fitness and motivation.
Prompt
dancing intense, rhythmic movements mimicking in-game actions: intense, focused ; A gamer; close-up; gaming; a brightly lit gaming setup with a screen displaying a virtual world; cinematic
Characteristic
Shot : A woman in workout clothes is silhouetted in front of a television screen playing a video game. The room is dimly lit, with red and blue lights highlighting the woman and screen.
Aesthetic Score : 0.6
Mood : intense, energetic, focused
Quality
Entropy : 6.05
Noise : 86
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurring of the woman and the TV screen. The background is somewhat out of focus, but this may be intentional.
Love Blooms Amidst Festive Lights
A couple finds romance in the heart of a bustling market, their embrace illuminated by a vibrant array of hanging lanterns. The scene captures the joy and festivity of the moment, creating a heartwarming and romantic atmosphere.
Prompt
dancing gentle, playful dance, mirroring the energy of the market: joyful, romantic ; A couple; medium shot; tourism; a bustling marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A couple in love is hugging in a busy marketplace. They are looking at each other and smiling. The background is full of colorful lanterns and other decorations.
Aesthetic Score : 0.7
Mood : romantic, festive, joyful
Quality
Entropy : 6.91
Noise : 100
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
Silhouette of Hope in the Desert Sunset
A woman in a flowing dress stands alone in a vast desert, her silhouette stark against the fiery hues of the setting sun. The scene evokes a sense of peace, serenity, and hope, with the dramatic effect of isolation adding depth to the image.
Prompt
dancing slow, graceful movements, reflecting on the journey: reflective, contemplative ; A traveler; long shot; travel; a vast desert landscape with a setting sun; cinematic
Characteristic
Shot : A woman in a long dress stands in a desert with her arms outstretched, silhouetted against the setting sun.
Aesthetic Score : 0.7
Mood : serene, peaceful, hopeful
Quality
Entropy : 6.31
Noise : 94
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight noise in the shadows, mostly on the sand dunes in the background
Rooftop Revelry: City Lights and Joyful Celebrations
Capture the vibrant energy of a rooftop party with a group of young women celebrating under the city lights. The dynamic poses and colorful atmosphere create a sense of excitement and joy, perfect for capturing the spirit of a special occasion.
Prompt
dancing energetic, synchronized dance, celebrating friendship: happy, carefree ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : A group of friends are laughing and having fun on a rooftop at night with the city skyline in the background.
Aesthetic Score : 0.7
Mood : joyful, celebratory, energetic
Quality
Entropy : 6.41
Noise : 95
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, particularly around the edges of the friends and the city skyline.
Silhouette of Mystery: A Woman in the Shadows
A captivating image of a woman standing in a dark alley, bathed in a single beam of light. The stark contrast creates a sense of mystery and allure, leaving the viewer wondering about her story.
Prompt
dancing powerful, expressive movements, conveying strength and resilience: determined, defiant ; A lone dancer; close-up; heroism; a dark alleyway with flickering streetlights; cinematic
Characteristic
Shot : A woman’s silhouette stands in a dimly lit alleyway, reaching up towards a bright circle of light, possibly a moon or streetlamp.
Aesthetic Score : 0.7
Mood : mysterious, dramatic, alluring
Quality
Entropy : 5.97
Noise : 112
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight blurring around the edges of the image, particularly near the walls, which may be due to post-processing or a technical limitation.
Conquering the Summit: Hikers Celebrate a Golden Sunset
Five adventurers stand triumphant on a mountain peak, their silhouettes etched against a breathtaking panorama of sun-drenched peaks. The scene captures the joy of achievement and the awe-inspiring beauty of nature.
Prompt
dancing spontaneous, joyful dance, celebrating the beauty of nature: exhilarated, free ; A group of adventurers; wide shot; adventure; a breathtaking mountain range with a clear blue sky; cinematic
Characteristic
Shot : A group of four hikers standing on a mountaintop with their arms raised, overlooking a valley and mountain ranges in the distance. The sky is clear and the sun is shining.
Aesthetic Score : 0.7
Mood : joyful, triumphant, adventurous
Quality
Entropy : 6.24
Noise : 112
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Lost in the Neon Glow: A Gamer’s Intense Focus
A silhouette against a vibrant screen, this gamer is fully immersed in their digital world. Neon lights illuminate the scene, highlighting the intensity and focus of the moment. The dramatic effect captures the essence of a passionate player lost in the game.
Prompt
dancing precise, calculated movements, reflecting the game’s strategy: focused, strategic ; A gamer; close-up; gaming; a dimly lit room with a computer screen displaying a competitive game; cinematic
Characteristic
Shot : A person is sitting at a desk, likely in a dimly lit room, playing a video game. They are wearing a headset and focused on the screen. The background is mostly dark with some neon lights.
Aesthetic Score : 0.6
Mood : intense, focused, dark
Quality
Entropy : 5.93
Noise : 79
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Chasing Summer Joy: A Family Runs Towards the Ocean
A heartwarming scene of a woman and two children running towards the sparkling blue ocean on a pristine white sand beach. The image captures the essence of summer joy, freedom, and carefree happiness.
Prompt
dancing gentle, playful dance, enjoying the beauty of the beach: relaxed, joyful ; A family; medium shot; travel; a picturesque beach with turquoise water and white sand; cinematic
Characteristic
Shot : Three people are running towards the ocean on a beautiful beach with white sand and crystal clear water.
Aesthetic Score : 0.8
Mood : happy, carefree, playful
Quality
Entropy : 5.97
Noise : 112
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors detected.
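To compare the ten images side by side, the per-image figures listed above can be gathered into one table and averaged. The sketch below only restates numbers already given in this post; the tuple layout and variable names are illustrative choices, and the averages are a derived summary rather than part of the original evaluation.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, prompt CLIP score, likelihood of AI)
images = [
    ("Victory's Silhouette", 0.7, 5.76, 82, 0.25, 0.10),
    ("Lost in the Fog", 0.6, 6.31, 100, 0.27, 0.20),
    ("Silhouette of Determination", 0.6, 6.05, 86, 0.27, 0.20),
    ("Love Blooms Amidst Festive Lights", 0.7, 6.91, 100, 0.24, 0.10),
    ("Silhouette of Hope", 0.7, 6.31, 94, 0.27, 0.30),
    ("Rooftop Revelry", 0.7, 6.41, 95, 0.30, 0.20),
    ("Silhouette of Mystery", 0.7, 5.97, 112, 0.26, 0.20),
    ("Conquering the Summit", 0.7, 6.24, 112, 0.24, 0.20),
    ("Lost in the Neon Glow", 0.6, 5.93, 79, 0.18, 0.20),
    ("Chasing Summer Joy", 0.8, 5.97, 112, 0.23, 0.10),
]

mean_aesthetic = mean(row[1] for row in images)
mean_clip = mean(row[4] for row in images)
mean_ai_likelihood = mean(row[5] for row in images)

# Image whose generated result matched its prompt least, by prompt CLIP score.
weakest_match = min(images, key=lambda row: row[4])[0]
# Image with the highest aesthetic score.
best_aesthetic = max(images, key=lambda row: row[1])[0]
```

On these figures the aesthetic scores sit in a narrow band from 0.6 to 0.8, while the prompt CLIP score varies more in relative terms, from 0.18 for the second gamer image to 0.30 for the rooftop scene.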
Conclusion
The results show that the generative AI model performed well in understanding shot composition, but struggled with the camera position and the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.48, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model’s ability to accurately interpret and recreate the camera position specified in the prompt is somewhat lacking.
- Shot Analysis: The model scored 0.61, falling within the “good” range. This indicates that the model was able to understand and implement the shot composition described in the prompt with a decent level of accuracy.
- Aesthetic Analysis: The model scored 0.09, which is significantly lower than the “very good” range of -0.2 to 0.1. This suggests that the generated image’s aesthetic deviated considerably from the expected aesthetic based on the prompt.
Overall, the model demonstrates a good understanding of shot composition but needs improvement in accurately capturing the intended camera position and aesthetic.
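The comparison against the “good” range for the camera position and shot analysis scores can be written as a simple bounds check. This is a minimal sketch: the 0.5 to 0.75 range and the two scores come from the breakdown above, while treating both bounds as inclusive is an assumption, and the function and variable names are hypothetical. The aesthetic score is left out because it is measured against a different range.

```python
GOOD_RANGE = (0.5, 0.75)  # the "good" band quoted in the conclusion

scores = {"camera position": 0.48, "shot analysis": 0.61}


def in_good_range(score, bounds=GOOD_RANGE):
    """Return True if score lies within the band (bounds assumed inclusive)."""
    low, high = bounds
    return low <= score <= high


verdicts = {name: in_good_range(value) for name, value in scores.items()}
```

With these inputs, shot analysis falls inside the band and camera position falls just below it, by 0.02.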
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://midjourney.com