AI's Artistic Journey: Capturing Poses, But Missing the Mood with Midjourney
Generating images from text prompts is a fascinating and rapidly evolving area of artificial intelligence. This blog post examines how well a generative AI model, Midjourney, captures poses and aesthetics from specific scene descriptions. The model handles scene descriptions well and comes close on camera positions, but it falls short of the desired aesthetic style. We analyze its strengths and weaknesses and discuss the potential for future improvements.
Dramatic poses are often used in visual storytelling to convey emotion, action, and character. They are common in movies, video games, and comic books: a superhero standing triumphantly over a defeated villain, or a lone warrior facing down a horde of enemies.
Created with: Midjourney
A Knight’s Solitude at Sunset
A lone knight stands silhouetted against a fiery sunset, his sword held high. The dramatic lighting and barren landscape evoke a sense of epic solitude and melancholic beauty.
Prompt
fighting fighting, heroic stance: epic, determined ; A lone warrior; wide shot; heroism; a desolate battlefield with the setting sun in the background; cinematic
Characteristic
Shot : A lone knight stands on a rocky landscape, silhouetted against a fiery sunset. The sun is setting behind the knight, casting a warm glow over the scene.
Aesthetic Score : 0.7
Mood : epic, somber, lonely
Quality
Entropy : 6.28
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has a slight, almost cartoonish, quality to it. The texture of the rocks and the knight’s armor is somewhat unrealistic.
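The post does not say how the Entropy value is calculated. One plausible reading, given that the values sit between 0 and 8, is the Shannon entropy (in bits) of an 8-bit grayscale histogram. The sketch below works under that assumption; the function name `shannon_entropy` is hypothetical and the pixel lists are toy data, not the images shown here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of 8-bit grayscale values.

    Assumption: this mirrors the 'Entropy' metric in the post, which
    the post itself does not define.
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1 / p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs: a flat image has no variation, while an even spread over
# all 256 gray levels reaches the 8-bit maximum.
flat = [128] * 1024
uniform = list(range(256)) * 4

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, values around 6.3 would indicate a fairly wide tonal spread without using every gray level equally.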
Lost Temple Beckons in the Mist
A group of explorers stand on the edge of an ancient, overgrown temple, shrouded in mystery and intrigue. The interplay of light, shadow, and mist creates a sense of depth and adventure, beckoning you to uncover the secrets hidden within.
Prompt
fighting fighting, collaborative: intense, adventurous ; A group of adventurers; medium shot; adventure; a dense jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : A group of soldiers stand before the entrance to an ancient, overgrown temple in the jungle. The air is thick with mist, and the temple seems shrouded in mystery.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 6.56
Noise : 121
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, especially around the edges of the temple.
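The Prompt Clip Score is also not defined in the post. A common convention is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result, and the values reported here (roughly 0.25 to 0.32) are consistent with that. The sketch below shows only the similarity step, assuming the embeddings already exist; the vectors are made-up stand-ins, not real CLIP output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional vectors standing in for real embeddings,
# which typically have hundreds of dimensions.
text_embedding = [0.2, 0.1, 0.7, 0.4]
image_embedding = [0.1, 0.6, 0.3, 0.2]

score = cosine_similarity(text_embedding, image_embedding)
print(round(score, 2))
```

A higher score means the image sits closer to the prompt in the embedding space, which is why it is a useful check on prompt adherence but says little about style.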
Neon Blade: A Cyberpunk Showdown in the Shadows
A lone figure, bathed in the harsh glow of neon signs, stands poised in a cyberpunk alleyway, wielding a weapon that crackles with energy. The scene is dark, futuristic, and charged with anticipation, hinting at a thrilling confrontation to come.
Prompt
fighting fighting, powerful: dynamic, futuristic ; A player character; close-up; gaming; a neon-lit cityscape with holographic projections; cinematic
Characteristic
Shot : A cyberpunk-style alleyway with a female figure in the foreground, holding a futuristic weapon, and a digital holographic projection in the background. The scene is lit by neon lights and fog, creating a moody atmosphere.
Aesthetic Score : 0.8
Mood : futuristic, dystopian, cyberpunk
Quality
Entropy : 6.59
Noise : 101
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have some slight artifacts and blurriness in certain areas. The holographic projection could benefit from more detailed rendering.
Chaos in the Market: Two Men Clash in a South American Street Brawl
A tense moment captured in a bustling South American market, as two men engage in a heated brawl. The dynamic composition, with both men mid-movement, conveys the intensity and chaos of the scene, creating a sense of raw energy and excitement.
Prompt
fighting fighting, clumsy: chaotic, humorous ; Two tourists; medium shot; tourism; a bustling marketplace with colorful stalls and vibrant crowds; cinematic
Characteristic
Shot : Two men are engaged in a playful struggle in a crowded outdoor market. The market is filled with stalls selling produce, clothing, and other goods. The scene is bustling with activity, with people moving about and interacting with each other.
Aesthetic Score : 0.3
Mood : tense, playful, bustling
Quality
Entropy : 6.41
Noise : 109
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight blur, especially around the edges of the men’s figures. The lighting is also a bit uneven, with some areas being overexposed and others being underexposed.
Lost in the Expanse: A Solitary Figure Contemplates the Vast Desert
A single figure traverses a desolate, sun-baked landscape, their journey a testament to the overwhelming vastness of the desert. The hazy white sky and the figure’s shrinking form evoke a sense of loneliness and contemplation, leaving the viewer to ponder the weight of solitude in such an unforgiving environment.
Prompt
fighting fighting, defensive: isolated, desperate ; A lone traveler; long shot; travel; a vast desert landscape with a lone sand dune in the foreground; cinematic
Characteristic
Shot : A single figure walks across a vast, sandy desert with a large sand dune in the background. The sky is a pale, hazy white.
Aesthetic Score : 0.6
Mood : lonely, serene, contemplative
Quality
Entropy : 5.82
Noise : 87
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some visible brushstrokes and digital artifacts, particularly in the sky and the sand.
Silhouettes Against the City Lights: A Rooftop Dance Under the Stars
Silhouetted figures dance on a rooftop against a backdrop of twinkling city lights, capturing the vibrant energy of a party under the stars. The scene evokes a playful and mysterious mood that reflects the urban spirit.
Prompt
fighting fighting, playful: energetic, playful ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : A group of young adults are dancing on a rooftop in the city at night, with a skyline view in the background. The city lights are illuminating the scene.
Aesthetic Score : 0.7
Mood : energetic, urban, night
Quality
Entropy : 6.30
Noise : 101
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise and artifacts in the image, particularly in the shadows. The image is also a bit grainy.
A Solitary Figure Walks Away from the Ashes
A lone figure walks away from a burning village, silhouetted against a sky choked with smoke. The setting sun casts a somber glow on the scene, highlighting the stark contrast between the destruction and the individual’s resilience. This image evokes a sense of loss, isolation, and the enduring spirit of humanity in the face of devastation.
Prompt
fighting fighting, fierce: tragic, determined ; A lone warrior; close-up; heroism; a burning village with smoke billowing in the air; cinematic
Characteristic
Shot : A lone figure walks away from a burning village, the smoke and flames obscuring the details but creating a dramatic silhouette.
Aesthetic Score : 0.7
Mood : dark, ominous, despair
Quality
Entropy : 6.27
Noise : 101
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, and there are some artifacts in the smoke.
Shadows Dance in the Cave’s Depths
Four figures, cloaked in darkness, ascend a treacherous path illuminated only by flickering torches. Their silhouettes against the faint light at the cave entrance create a sense of mystery and suspense, hinting at a perilous adventure ahead.
Prompt
fighting fighting, cautious: suspenseful, adventurous ; A group of explorers; wide shot; adventure; a dark cave with flickering torches and mysterious shadows; cinematic
Characteristic
Shot : A group of three people walk through a dark cave with torches. The figures are silhouetted against the light, and the cave is lit by the glow of their torches. The image is evocative of exploration and danger.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, dark
Quality
Entropy : 6.36
Noise : 105
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : No visible image errors, although the textures on the rock faces appear slightly artificial.
Lost in the Digital Dreamscape: A Glimpse into the Future of VR
A man, immersed in a virtual reality headset, stands before a screen pulsating with abstract, futuristic imagery. Bathed in a mesmerizing purple and blue neon glow, the scene evokes a sense of wonder and mystery, hinting at the boundless possibilities of VR technology.
Prompt
fighting fighting, focused: immersive, intense ; A gamer; close-up; gaming; a virtual reality headset with a pixelated world projected in the background; cinematic
Characteristic
Shot : A man wearing a VR headset is looking up against a glowing blue digital background.
Aesthetic Score : 0.6
Mood : futuristic, techy, immersive
Quality
Entropy : 6.38
Noise : 99
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Lost in the Blur: The Anonymous Rush of a Train Station
A blurry image captures the chaotic energy of a bustling train station, where faces are lost in the crowd and the only focus is the relentless forward motion. The intentional blur creates a sense of anonymity and urgency, highlighting the transient nature of the space and the lives passing through it.
Prompt
fighting fighting, agile: fast-paced, chaotic ; Two travelers; medium shot; travel; a crowded train station with people rushing in all directions; cinematic
Characteristic
Shot : A blurred image of people walking in a train station. The image is very blurry and it is difficult to make out any details.
Aesthetic Score : 0.4
Mood : chaotic, busy, anonymous
Quality
Entropy : 6.83
Noise : 91
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is very blurry.
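To get a feel for the ten images as a set, the per-image numbers listed above can be averaged. This is a small sketch using only values copied from the entries in this post; the short labels and the 0.5 cut-off for "likely AI" are assumptions made for illustration. These averages are separate from the scores in the conclusion, whose calculation the post does not describe.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries above; the labels are shorthand.
images = [
    ("knight", 0.7, 0.27, 0.90),
    ("temple", 0.7, 0.27, 0.90),
    ("neon", 0.8, 0.32, 0.90),
    ("market", 0.3, 0.30, 0.10),
    ("desert", 0.6, 0.26, 0.90),
    ("rooftop", 0.7, 0.27, 0.20),
    ("ashes", 0.7, 0.26, 0.80),
    ("cave", 0.7, 0.28, 0.90),
    ("vr", 0.6, 0.25, 0.20),
    ("station", 0.4, 0.25, 0.20),
]

mean_aesthetic = round(mean(row[1] for row in images), 2)
mean_clip = round(mean(row[2] for row in images), 3)

# Assumed threshold: treat a likelihood of 0.5 or more as "likely AI".
flagged = [label for label, _, _, p_ai in images if p_ai >= 0.5]

print(mean_aesthetic)  # 0.62
print(mean_clip)       # 0.273
print(len(flagged))    # 6
```

The four images rated as unlikely to be AI-generated (market, rooftop, VR, and station) include the two lowest aesthetic scores, 0.3 and 0.4.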
Conclusion
The generative AI model performed well on shot analysis, came in slightly below the “good” range on camera position, and struggled with aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.43, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model is not perfectly capturing the intended camera positions described in the prompts.
- Shot Analysis: The model scored 0.61, falling within the “good” range. This indicates that the model is generally able to understand the scene descriptions in the prompts and translate them into appropriate shots.
- Aesthetic Analysis: The model scored 0.14, which is above the “very good” range of -0.2 to 0.1. This suggests that the generated images do not closely match the aesthetic style described in the prompts.
Overall, the model shows promise in understanding camera positions and scene descriptions, but needs improvement in generating images that align with the desired aesthetic.
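The comparison of each score against its target range can be written out as a quick check. This is a minimal sketch: the helper `position_in_range` is hypothetical, and it assumes the shot analysis score is judged against the same 0.5 to 0.75 “good” range quoted for camera position.

```python
def position_in_range(score, low, high):
    """Report whether a score is below, within, or above [low, high]."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# Scores and ranges as quoted in the breakdown above.
results = {
    "camera position": position_in_range(0.43, 0.5, 0.75),
    "shot analysis": position_in_range(0.61, 0.5, 0.75),
    "aesthetic analysis": position_in_range(0.14, -0.2, 0.1),
}

for metric, verdict in results.items():
    print(f"{metric}: {verdict}")
# camera position: below
# shot analysis: within
# aesthetic analysis: above
```

Only shot analysis lands inside its range, which matches the breakdown: camera position misses by 0.07 and the aesthetic score overshoots by 0.04.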