AI's Artistic Journey: Capturing Scenes, But Missing the Mark on Poses with Midjourney
Generative AI models can now produce images, text, and even music from textual prompts. One intriguing question is how well these models capture the essence of a scene, including the poses of the subjects within it. This blog post examines how Midjourney handles ten prompts built around a face-to-face pose, analyzing its strengths and weaknesses in translating textual descriptions into visual representations.
Created with: Midjourney
Lost in the Misty Majesty
A solitary hiker stands on a mountain peak, dwarfed by the vast expanse of fog-covered mountains. The scene evokes a sense of tranquility and solitude, highlighting the immensity of nature.
Prompt
face-to-face Face-to-face with the vastness of nature: Determined, awe-inspiring ; A lone adventurer, standing on a mountain peak; wide shot; Adventure; Majestic mountain range with clouds swirling around; cinematic
Characteristic
Shot : A lone hiker stands on a mountain peak, overlooking a valley shrouded in mist and clouds. The sky is a mix of gray and blue, with patches of sunlight breaking through the clouds.
Aesthetic Score : 0.8
Mood : serene, contemplative, majestic
Quality
Entropy : 6.51
Noise : 84
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors
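The post does not say how the Entropy value is computed. One common reading is the Shannon entropy of an 8-bit grayscale histogram, which tops out at 8 bits; the values reported here (5.34 to 6.74) fit that range. The sketch below works under that assumption only. The function name and the toy pixel lists are illustrative and are not taken from the tool that produced these numbers.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit intensity values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Toy inputs (hypothetical, not real image data).
flat = [128] * 64                 # a single gray level: no variation
two_tone = [0] * 32 + [255] * 32  # two equally likely levels
full_range = list(range(256))     # every level exactly once

print(shannon_entropy(flat))        # 0.0
print(shannon_entropy(two_tone))    # 1.0
print(shannon_entropy(full_range))  # 8.0
```

Under this reading, a higher value means the image spreads its pixels across more intensity levels, which would match the lower scores for the silhouette-heavy forest and campfire scenes.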
Silhouettes in the Forest: A Moment of Mystery and Light
A line of six figures stands silhouetted against a backdrop of trees, bathed in the ethereal glow of sunlight filtering through the foliage. The scene evokes a sense of mystery, eeriness, and tranquility, with the dramatic play of light and shadow creating a captivating visual.
Prompt
face-to-face Face-to-face, whispering secrets: Suspenseful, mysterious ; A group of friends, huddled together in a dark forest; medium shot; Adventure; Tall trees casting long shadows, sunlight filtering through the leaves; cinematic
Characteristic
Shot : Silhouettes of a group of people standing in a forest with light filtering through the trees.
Aesthetic Score : 0.6
Mood : mysterious, serene, peaceful
Quality
Entropy : 5.34
Noise : 99
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight graininess and noise visible.
Dragon’s Breath, Woman’s Gaze: A Moment of Intrigue
A captivating image of a dragon’s head and a woman’s face in close proximity, set against a fiery backdrop. The close-up shot and the contrasting expressions create a powerful sense of tension and mystery.
Prompt
face-to-face Face-to-face, locked in a battle of wills: Brave, intense ; A seasoned warrior, facing down a fearsome dragon; close-up; Heroism; Fiery dragon with glowing eyes, smoke billowing around; cinematic
Characteristic
Shot : A close-up shot of a woman facing a fiery dragon. The dragon’s head is in the foreground, with its teeth bared and a glowing orange eye visible. The woman is in the background, her expression unreadable.
Aesthetic Score : 0.7
Mood : intense, dramatic, mystical
Quality
Entropy : 6.18
Noise : 106
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : The dragon’s scales and the flames appear to be slightly blurry, the woman’s face lacks some detail.
City Lights in the Eyes: A Mysterious Close-Up
A captivating close-up shot reveals a person’s face, their eyes reflecting a vibrant city skyline. The image evokes a sense of mystery, futurism, and intensity, leaving the viewer intrigued by the story behind the gaze.
Prompt
face-to-face Face-to-face with the digital world: Focused, determined ; A young gamer, staring intently at a computer screen; close-up; Gaming; Vibrant, futuristic cityscape reflected in the screen; cinematic
Characteristic
Shot : Close-up of a person’s face looking at a screen, against a blurry background of city lights
Aesthetic Score : 0.6
Mood : mysterious, dark, futuristic
Quality
Entropy : 6.31
Noise : 81
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : No major errors, but some minor artifacts in the lighting and shadows
Parisian Romance: A Silhouette of Love Against the Eiffel Tower
A timeless and romantic image captures a couple in a passionate embrace, silhouetted against the iconic Eiffel Tower. The Parisian cityscape provides a breathtaking backdrop, enhancing the emotional impact of this classic scene.
Prompt
face-to-face Face-to-face, sharing a moment: Romantic, nostalgic ; A couple, gazing at each other in front of the Eiffel Tower; medium shot; Tourism; Romantic Parisian cityscape with the Eiffel Tower in the background; cinematic
Characteristic
Shot : A couple is silhouetted in front of the Eiffel Tower in Paris. The man is wearing a suit and the woman is wearing a dress. The photo is taken in black and white.
Aesthetic Score : 0.7
Mood : romantic, classic, timeless
Quality
Entropy : 6.56
Noise : 79
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable errors
A Vibrant Tapestry of Colors and Life: Exploring a Bustling Southeast Asian Market
Immerse yourself in the vibrant chaos of a bustling Southeast Asian market. This narrow street teems with life, showcasing a colorful array of fresh produce, exotic goods, and lively vendors. The perspective draws you into the heart of the action, highlighting the depth and energy of this vibrant scene. A woman in a blue skirt adds a focal point, guiding your eye through the bustling marketplace.
Prompt
face-to-face Face-to-face with the local culture: Curious, vibrant ; A traveler, standing on a bustling street market; medium shot; Travel; Colorful stalls overflowing with exotic goods, people bustling around; cinematic
Characteristic
Shot : A bustling market in a narrow street, with vendors selling fresh produce. The scene is vibrant with colors and textures.
Aesthetic Score : 0.7
Mood : vibrant, lively, bustling
Quality
Entropy : 6.47
Noise : 102
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors.
Shadows and Flames: A Night of Mystery in the Woods
A group of four figures huddle around a crackling campfire, their faces obscured by the dancing shadows. The forest whispers secrets, and the night air crackles with anticipation. This scene evokes a sense of mystery, warmth, and adventure, leaving you wondering what secrets lie hidden in the darkness.
Prompt
face-to-face Face-to-face, sharing stories: Intimate, suspenseful ; A group of explorers, huddled around a campfire; medium shot; Adventure; Dark forest with flickering flames illuminating their faces; cinematic
Characteristic
Shot : A group of four people are gathered around a campfire in a forest, with the firelight illuminating their faces and casting shadows around them.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, calm
Quality
Entropy : 5.90
Noise : 67
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts in the foreground, particularly in the grass and leaves.
Reaching for the Sky: A Moment of Hope in the City
A young woman stands amidst towering buildings, bathed in the golden light of dawn or dusk. Her gaze is fixed upwards, suggesting ambition and a yearning for something greater. The scene evokes a sense of inspiration and hope, reminding us that even in the face of overwhelming scale, we can still strive for our dreams.
Prompt
face-to-face Face-to-face with the urban jungle: Awe-inspiring, hopeful ; A young girl, looking up at a towering skyscraper; wide shot; Tourism; Modern cityscape with towering skyscrapers and bustling streets; cinematic
Characteristic
Shot : A lone figure stands in the middle of a narrow city street, looking up at the towering skyscrapers. The buildings are mostly glass and steel, reflecting the sunlight.
Aesthetic Score : 0.7
Mood : awe, urban, claustrophobic
Quality
Entropy : 6.67
Noise : 110
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slightly washed-out look, and the colors are not very saturated. The buildings in the background seem to be blurry and slightly distorted, as if they were not properly focused. There is also some noise in the image, particularly in the shadows.
Neon Nights: Friends, Laughter, and Video Game Glory
Capture the vibrant energy of a night spent with friends, fueled by video games and neon lights. This image radiates fun and excitement, showcasing the joy of shared experiences.
Prompt
face-to-face Face-to-face, sharing the excitement: Joyful, celebratory ; A group of friends, celebrating a victory in a video game; close-up; Gaming; Brightly lit gaming room with controllers and headsets; cinematic
Characteristic
Shot : Three young adults, possibly friends, are playing video games and laughing together. The image is shot from a low angle, focusing on the person in the foreground wearing headphones.
Aesthetic Score : 0.7
Mood : excited, joyful, playful
Quality
Entropy : 6.74
Noise : 84
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, resulting in some loss of detail in the highlights.
Silhouetted Solitude: A Moment of Tranquility at Sunset
A lone figure stands on a serene beach, their silhouette stark against the fiery hues of the setting sun. The scene evokes a sense of tranquility and contemplation, with the dramatic effect highlighting a moment of introspection and perhaps even loneliness.
Prompt
face-to-face Face-to-face with the vastness of the sea: Melancholy, contemplative ; A lone traveler, standing on a deserted beach; wide shot; Travel; Vast ocean stretching out to the horizon, golden sunset; cinematic
Characteristic
Shot : A lone figure stands on a beach at sunset, facing the ocean, with a cloudy sky above.
Aesthetic Score : 0.7
Mood : serene, contemplative, melancholic
Quality
Entropy : 6.60
Noise : 121
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight graininess and slight color banding in the sky.
Conclusion
The results show that the generative AI model performed well in understanding the scene and its aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.56, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. Unlike the two scores above, a lower value appears to be better for this metric. This means that the generated images closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and its aesthetic, but needs improvement in accurately capturing the intended camera position.
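For readers who want a single view of the per-image numbers, here is a minimal sketch that averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values listed above and picks out the weakest prompt match. The values are copied from this post and the titles are shortened; the `summarize` helper and the dictionary keys are illustrative names. These averages are separate from the Camera Position, Shot Analysis, and Aesthetic Analysis scores in the breakdown, whose derivation the post does not describe.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the post.
images = [
    ("Lost in the Misty Majesty", 0.8, 0.25, 0.20),
    ("Silhouettes in the Forest", 0.6, 0.27, 0.20),
    ("Dragon's Breath, Woman's Gaze", 0.7, 0.28, 0.90),
    ("City Lights in the Eyes", 0.6, 0.24, 0.80),
    ("Parisian Romance", 0.7, 0.26, 0.10),
    ("A Vibrant Tapestry of Colors and Life", 0.7, 0.18, 0.20),
    ("Shadows and Flames", 0.7, 0.28, 0.80),
    ("Reaching for the Sky", 0.7, 0.26, 0.30),
    ("Neon Nights", 0.7, 0.29, 0.20),
    ("Silhouetted Solitude", 0.7, 0.21, 0.20),
]

def summarize(rows):
    """Average each metric over all images, rounded to three decimals."""
    return {
        "aesthetic": round(mean(r[1] for r in rows), 3),
        "clip": round(mean(r[2] for r in rows), 3),
        "ai_likelihood": round(mean(r[3] for r in rows), 3),
    }

summary = summarize(images)
weakest = min(images, key=lambda r: r[2])  # image with the lowest prompt CLIP score

print(summary)     # {'aesthetic': 0.69, 'clip': 0.252, 'ai_likelihood': 0.39}
print(weakest[0])  # A Vibrant Tapestry of Colors and Life
```

The market scene has the lowest Prompt Clip Score (0.18), and its Shot description mentions vendors and produce but no traveler, even though the prompt asked for one.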