AI Struggles to Capture the Essence of Dramatic Poses with Imagen-v3
Dramatic poses are a powerful tool in visual storytelling, conveying emotion, action, and character through the way a figure is positioned. They are often used in photography, film, and art to create a sense of drama, tension, or heroism. Generating images with such poses is challenging for AI, however, because it requires understanding not only the physical positioning of the figure but also the underlying emotional and narrative context. This blog post explores the limitations of AI in capturing the essence of dramatic poses through a case study of 10 prompts.
Created with: imagen-v3
A Knight’s Lament Amidst the Flames
A lone knight, clad in heavy armor, kneels on a rocky outcrop, his gaze fixed on a burning medieval town in the distance. The sky is a canvas of dark clouds and flashing lightning, mirroring the turmoil within the knight’s heart. This dramatic scene evokes a sense of melancholic heroism, as the knight contemplates the devastation and the uncertain future.
Prompt
poses dutch-angle: determined, heroic, hopeful ; A lone knight, standing tall on a hilltop overlooking a besieged city; wide shot; heroism; a dramatic, stormy sky with flashes of lightning; cinematic
Characteristic
Shot : A lone knight in full armor kneels on a rocky outcrop, looking out at a burning medieval town in the distance. The sky is dark and stormy, with lightning striking in the background.
Aesthetic Score : 0.7
Mood : dramatic, melancholic, epic
Quality
Entropy : 6.65
Noise : 80
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be computer-generated, with some unnatural details such as the lighting and the knight’s armor.
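This post does not define how the Entropy value in the Quality block is computed. One common definition is the Shannon entropy of an image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally frequent). The sketch below assumes that definition; the function name and the plain-list input are illustrative, not taken from the tool used here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this mirrors one common definition of image entropy;
    the post does not state which definition its tool uses.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)  # histogram: gray level -> number of pixels
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000           # a single gray level everywhere
    uniform = list(range(256))    # every gray level exactly once
    print(shannon_entropy(flat))
    print(shannon_entropy(uniform))
```

Under this definition, the values between roughly 6 and 7 reported for the images in this post would indicate a fairly broad spread of tones, short of the 8-bit maximum.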
Silhouettes of Hope: Adventure Beckons at Sunset
Five figures stand silhouetted against a vibrant sunset, their forms shrouded in mystery as they face the unknown. Tall rock formations rise behind them, hinting at a jungle setting ripe with adventure and possibility. This evocative scene captures a sense of hope and intrigue, leaving viewers to wonder what lies ahead for these intrepid explorers.
Prompt
poses dutch-angle: adventurous, mysterious, awe-inspiring ; A group of explorers, silhouetted against the setting sun, standing at the edge of a vast, unexplored jungle; medium shot; adventure; lush green foliage and towering trees; cinematic
Characteristic
Shot : A group of five people are silhouetted against a sunset in a jungle setting, with tall rock formations behind them.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, hopeful
Quality
Entropy : 6.69
Noise : 94
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some blurring and pixelation in the background, likely from AI generation.
The Gamer’s Focus: A Moment of Intense Concentration
A young man, fully immersed in his game, sits in a gaming chair, headset on, fingers flying across the keyboard. His expression is serious, his focus unwavering. This image captures the competitive spirit and intense concentration that defines the world of gaming.
Prompt
poses dutch-angle: intense, focused, competitive ; A gamer, intensely focused on a screen, fingers flying across a keyboard; close-up; gaming; a brightly lit room with gaming peripherals and posters; cinematic
Characteristic
Shot : A young man wearing a gaming headset and a black t-shirt with a gaming logo is sitting in a gaming chair in front of a computer. He is typing on a keyboard.
Aesthetic Score : 0.6
Mood : focused, serious, competitive
Quality
Entropy : 6.25
Noise : 73
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts around the edges of the subject and the keyboard.
Parisian Romance: A Dreamy Stroll with the Eiffel Tower
Capture the essence of Parisian romance with this stunning image. A couple walks hand-in-hand down a charming street, the iconic Eiffel Tower standing tall in the background. The scene evokes a sense of nostalgia and dreamy love, with the tower adding a touch of grandeur and romance.
Prompt
poses dutch-angle: romantic, nostalgic, joyful ; A couple, hand-in-hand, gazing out at the Eiffel Tower from a Parisian cafe; medium shot; tourism; bustling Parisian streets with charming cafes and shops; cinematic
Characteristic
Shot : A couple is walking hand in hand down a Parisian street with the Eiffel Tower in the background.
Aesthetic Score : 0.7
Mood : romantic, dreamy, nostalgic
Quality
Entropy : 6.61
Noise : 100
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some artifacts and blurriness, especially in the background.
A Hiker’s Journey Towards Majestic Peaks
A lone hiker, their orange backpack a splash of color against the serene landscape, traverses a stone path towards snow-capped mountains under a clear blue sky. The vastness of the mountains evokes a sense of awe and insignificance, inspiring a feeling of adventure and serenity.
Prompt
poses dutch-angle: free-spirited, adventurous, inspiring ; A backpacker, walking along a winding mountain path, with breathtaking views of snow-capped peaks; medium shot; travel; a rugged mountain landscape with clear blue skies; cinematic
Characteristic
Shot : A lone hiker with an orange backpack walks on a stone path leading towards snow-capped mountains under a clear blue sky.
Aesthetic Score : 0.8
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.72
Noise : 105
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious artifacts or errors in the image.
Cheers to Friendship: A Toast in the Dimly Lit Bar
Capture the joy and intimacy of a group of friends celebrating with wine glasses in a dimly lit bar setting. The shallow depth of field focuses on the glasses and hands, creating a sense of closeness and festive spirit.
Prompt
poses dutch-angle: joyful, celebratory, connected ; A group of friends, laughing and celebrating, raising their glasses in a toast; medium shot; groups; a lively bar or restaurant with warm lighting and festive decorations; cinematic
Characteristic
Shot : Group of friends toasting with wine glasses in a dimly lit bar setting.
Aesthetic Score : 0.7
Mood : joyful, celebratory, festive
Quality
Entropy : 6.57
Noise : 92
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.30
Image errors : No major errors, but the image appears to be slightly overexposed, resulting in a washed-out look in some areas.
A Moment of Awe: Astronauts Gaze Upon Earth’s Fragility
Two astronauts, clad in spacesuits, stand in a spaceship, their eyes fixed on the breathtaking view of Earth and the Milky Way. The image evokes a sense of wonder and isolation, highlighting the vastness of space and the delicate nature of life on our home planet.
Prompt
poses dutch-angle: awe-inspiring, contemplative, hopeful ; A lone astronaut, gazing out at the Earth from a space station window; close-up; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : Two astronauts in spacesuits are standing in a spaceship, gazing out the window at Earth and the Milky Way.
Aesthetic Score : 0.7
Mood : serious, awe, anticipation
Quality
Entropy : 6.25
Noise : 93
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image is slightly blurry and the astronauts’ faces are a bit pixelated.
Precarious Descent: Climbers Brave the Vertical Drop
A breathtaking view unfolds as three climbers rappel down a sheer cliff face. The dramatic perspective emphasizes their daring adventure, while the serene valley and cascading waterfall below offer a glimpse of the beauty that awaits. This image captures the thrill and tranquility of a challenging climb.
Prompt
poses dutch-angle: exciting, daring, adventurous ; A group of adventurers, rappelling down a steep cliff face, with a breathtaking view of a valley below; wide shot; adventure; a dramatic mountain landscape with waterfalls and lush vegetation; cinematic
Characteristic
Shot : Three climbers rappelling down a steep cliff face, with a valley and waterfall visible in the background.
Aesthetic Score : 0.8
Mood : adventurous, daring, serene
Quality
Entropy : 6.95
Noise : 111
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors
Victory is Sweet: Gamer Celebrates Triumph with a Smile
A young man, radiating joy and excitement, holds aloft a trophy after a hard-fought victory. His triumphant fist pump and beaming smile capture the essence of his achievement, set against the backdrop of his gaming setup. The image exudes a celebratory mood, showcasing the thrill of victory in the world of gaming.
Prompt
poses dutch-angle: triumphant, celebratory, exciting ; A gamer, celebrating a victory, holding up a trophy; close-up; gaming; a brightly lit stage with cheering crowds and flashing lights; cinematic
Characteristic
Shot : A young man, wearing a black and green t-shirt, is holding up a trophy in his left hand and raising his right fist in the air, with a bright smile on his face. He is sitting in a gaming chair in front of a gaming setup. The background is blurred, but there is a screen with red and orange highlights in the back.
Aesthetic Score : 0.7
Mood : joyful, victorious, celebratory
Quality
Entropy : 6.11
Noise : 87
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious artifacts or errors. Slight compression noise might be visible upon closer inspection.
Silhouette of Solitude: A Moment of Contemplation at Sunset
A lone figure sits perched on a cliff, their silhouette stark against the fiery hues of the setting sun. The scene evokes a sense of serenity, introspection, and perhaps a touch of melancholy, as the individual contemplates the vastness of the ocean and the passing of another day.
Prompt
poses dutch-angle: Melancholy, contemplative, introspective ; A lone figure silhouetted against the fiery sunset, gazing out at the vast, shimmering ocean.; cinematic
Characteristic
Shot : A silhouette of a person sitting on a cliff overlooking the ocean at sunset.
Aesthetic Score : 0.7
Mood : serene, contemplative, melancholic
Quality
Entropy : 6.06
Noise : 71
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors
Conclusion
The image analysis shows that the generative AI model did only moderately well at matching the requested camera position and shot composition, and did poorly at matching the requested aesthetic.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. The camera position in the generated images differed somewhat from what the prompts asked for.
- Shot Analysis: The model scored 0.52, which is also considered okay. The shot composition in the generated images differed somewhat from what the prompts asked for.
- Aesthetic Analysis: The model scored 0.07, which is considered poor. The aesthetic of the generated images differed significantly from what the prompts led us to expect.
Overall, the model struggles most with capturing the desired aesthetic. These scores rest on a small sample (10 analyses each for camera position, shot, and aesthetic), so more data is needed before drawing firmer conclusions.
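As a cross-check on such a small sample, the per-image numbers printed in the cards above can be averaged directly. The sketch below uses values transcribed by hand from this post. The short card names and the 0.5 cut-off for "likely AI" are assumptions made for illustration, and these per-image metrics are not the same quantities as the three scores in the breakdown.

```python
from statistics import mean

# Per-image values transcribed from the ten cards in this post, in order.
# Tuple layout: (aesthetic score, prompt CLIP score, likelihood of AI).
# The card names are shorthand labels, not identifiers from any tool.
CARDS = {
    "knight": (0.7, 0.29, 0.90),
    "explorers": (0.7, 0.34, 0.80),
    "gamer_focus": (0.6, 0.31, 0.20),
    "paris": (0.7, 0.33, 0.80),
    "hiker": (0.8, 0.31, 0.10),
    "toast": (0.7, 0.31, 0.30),
    "astronauts": (0.7, 0.32, 0.70),
    "climbers": (0.8, 0.32, 0.10),
    "gamer_victory": (0.7, 0.32, 0.10),
    "sunset_solitude": (0.7, 0.30, 0.20),
}


def column_means(cards):
    """Return the mean of each metric across all cards, rounded to 3 places."""
    columns = zip(*cards.values())
    return tuple(round(mean(col), 3) for col in columns)


def flagged_as_ai(cards, threshold=0.5):
    """Names of cards whose AI likelihood exceeds the (assumed) threshold."""
    return [name for name, (_, _, ai) in cards.items() if ai > threshold]


if __name__ == "__main__":
    aesthetic, clip, ai = column_means(CARDS)
    print(f"mean aesthetic score:   {aesthetic}")
    print(f"mean prompt CLIP score: {clip}")
    print(f"mean likelihood of AI:  {ai}")
    print(f"flagged as likely AI:   {flagged_as_ai(CARDS)}")
```

With the values as transcribed, this gives a mean aesthetic score of 0.71, a mean Prompt Clip Score of 0.315, and a mean AI likelihood of 0.42, with four of the ten images (the knight, the explorers, the Paris couple, and the astronauts) above the 0.5 cut-off.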
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-3/