AI's Artistic Struggle: Capturing the Essence of Dramatic Poses with Imagen-v3
Dramatic poses are a powerful tool in visual storytelling, conveying emotions and narratives through the positioning of the human form. From iconic movie stills to captivating photographs, these poses have the ability to evoke a sense of heroism, adventure, or even solitude. However, replicating these poses with AI presents unique challenges, as the model must not only understand the physical positioning but also capture the intended aesthetic and emotional impact. This blog post explores the results of an experiment using AI to generate images with dramatic poses, highlighting both its strengths and weaknesses.
Created with: imagen-v3
Silhouetted Warrior: A Stormy Tale of Power and Isolation
A dramatic silhouette of a warrior stands defiant against a stormy sea, lightning illuminating the epic scene. This image evokes a sense of power and isolation, capturing a moment of intense drama.
Prompt
poses silhouette: epic, determined ; Lone figure standing on a clifftop, overlooking a vast, stormy sea; wide shot; heroism; dramatic sky with lightning; cinematic
Characteristic
Shot : Silhouette of a warrior standing on a cliff overlooking a stormy sea, lightning strikes in the distance.
Aesthetic Score : 0.6
Mood : dark, dramatic, epic
Quality
Entropy : 5.96
Noise : 54
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.70
Image errors : The lightning strikes appear somewhat artificial and lack detail. The water surface also lacks realism.
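The Entropy and Prompt Clip Score values listed for each image are not defined in this post. The following is a minimal sketch of how such numbers are commonly computed, assuming Entropy is the Shannon entropy (in bits) of the 8-bit grayscale pixel histogram and Prompt Clip Score is the cosine similarity between a prompt embedding and an image embedding. The function names are hypothetical, and in practice a CLIP model would supply the embedding vectors.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a collection of grayscale pixel values.

    Assumption: this mirrors the post's "Entropy" metric. For 8-bit
    images the result lies between 0 (flat image) and 8 (uniform histogram).
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must not be empty")
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length.

    Assumption: this mirrors the post's "Prompt Clip Score", with `a` the
    text embedding of the prompt and `b` the embedding of the image.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)
```

Under these assumptions, an image that is half pure black and half pure white has an entropy of exactly 1 bit, so values between 5 and 7, like those reported in this post, would indicate fairly rich tonal variation.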
Silhouettes of Hope: A Sunset Walk on the Edge
Seven figures, silhouetted against a fiery sunset, walk along a ridge overlooking a vast desert landscape. The dramatic lighting and use of silhouette create a sense of mystery and intrigue, hinting at an adventurous journey filled with hope.
Prompt
poses silhouette: hopeful, adventurous ; A group of adventurers silhouetted against the setting sun, walking towards a distant mountain range; medium shot; adventure; desert landscape; cinematic
Characteristic
Shot : A group of 7 people are silhouetted against a sunset, walking along a ridge with a desert landscape below.
Aesthetic Score : 0.6
Mood : dramatic, adventurous, hopeful
Quality
Entropy : 5.01
Noise : 54
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant errors, but the image is a bit too dark in some areas.
In the Glow of Victory: A Gamer’s Focused Intensity
A dimly lit room, bathed in the red and blue hues of a computer screen, reveals a pair of hands gripping a video game controller. The dramatic interplay of light and shadow underscores the gamer’s intense focus, capturing the raw emotion of the moment.
Prompt
poses silhouette: intense, focused ; A gamer’s hands silhouetted against a glowing computer screen, holding a controller; close-up; gaming; neon lights and digital interfaces; cinematic
Characteristic
Shot : A person’s hands are holding a video game controller in a dimly lit room. The hands are lit by red and blue light coming from a computer screen in the background.
Aesthetic Score : 0.6
Mood : intense, focused, dark
Quality
Entropy : 5.87
Noise : 67
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Eiffel Tower Romance: A Silhouette of Love
A dreamy and nostalgic scene of a couple holding hands, silhouetted against the illuminated Eiffel Tower at night. The dramatic effect creates a sense of intimacy and romance, capturing the magic of Paris.
Prompt
poses silhouette: romantic, nostalgic ; A couple holding hands, silhouetted against the iconic Eiffel Tower; medium shot; tourism; Parisian cityscape at night; cinematic
Characteristic
Shot : Silhouettes of a couple holding hands in front of the Eiffel Tower at night.
Aesthetic Score : 0.7
Mood : romantic, dreamy, nostalgic
Quality
Entropy : 5.54
Noise : 70
Prompt Clip Score : 0.38
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly underexposed, leaving dark areas and hiding the couple’s faces. It is also slightly blurry.
Silhouettes of Hope in the Desert Sunset
A lone figure walks a dusty path towards the setting sun, their silhouette a beacon of hope against the vast, desolate landscape. The swirling dust adds to the sense of adventure and solitude, creating a powerful image of resilience and the pursuit of something beyond the horizon.
Prompt
poses silhouette: lonely, contemplative ; A lone traveler walking down a dusty road, silhouetted against the rising sun; long shot; travel; vast, open desert landscape; cinematic
Characteristic
Shot : A lone figure walks down a dusty road in a desert landscape. The sun is setting in the distance, casting long shadows.
Aesthetic Score : 0.7
Mood : solitude, adventure, hope
Quality
Entropy : 6.63
Noise : 84
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors
Cheers to Friendship: A Warm and Intimate Celebration
Capture the joy of a night out with friends as they raise their glasses in a toast. The low angle and warm lighting create a sense of intimacy and celebration, while the silhouettes of the friends add a touch of mystery. This image is perfect for evoking feelings of warmth, camaraderie, and good times.
Prompt
poses silhouette: joyful, celebratory ; A group of friends raising their glasses in a toast, silhouetted against a brightly lit bar; medium shot; groups; vibrant nightlife scene; cinematic
Characteristic
Shot : A group of friends celebrating at a bar, toasting with glasses of wine. The image is shot from a low angle, looking up at the group. The lighting is warm and inviting, creating a sense of intimacy. The scene is lit in a way that emphasizes the silhouettes of the people.
Aesthetic Score : 0.5
Mood : celebratory, warm, intimate
Quality
Entropy : 5.63
Noise : 77
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No errors
Heroic Leap into the Sunset
A powerful superhero in a red cape takes a dramatic leap from a skyscraper, silhouetted against the fiery hues of a setting sun. The image captures the hero’s strength and determination, leaving a lasting impression of their heroic spirit.
Prompt
poses silhouette: powerful, heroic ; A superhero leaping from a tall building, silhouetted against the city skyline; wide shot; heroism; cityscape with skyscrapers; cinematic
Characteristic
Shot : A superhero in a red cape is leaping from a skyscraper in a cityscape, the sun is setting in the background.
Aesthetic Score : 0.6
Mood : dramatic, powerful, heroic
Quality
Entropy : 6.85
Noise : 84
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts in the cityscape, and the superhero’s costume looks a bit blurry.
Into the Unknown: A Mysterious Cave Beckons
A group of adventurers stand poised at the entrance of a dark and foreboding cave, a cascading waterfall adding to the sense of mystery and suspense. Will they dare to venture inside? This image captures the thrill of exploration and the unknown that lies ahead.
Prompt
poses silhouette: suspenseful, adventurous ; A group of explorers silhouetted against the entrance to a dark, mysterious cave; medium shot; adventure; dense jungle foliage; cinematic
Characteristic
Shot : A group of adventurers are standing in front of a cave entrance. The cave is dark and mysterious, and there is a waterfall in the background.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 5.98
Noise : 94
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and compression artifacts noticeable in the shadows.
The Hacker’s Hands: A Symphony of Code in Neon Light
A close-up shot captures the intensity of a hacker’s focus as they type furiously on a keyboard, bathed in a mesmerizing red and blue glow. The scene evokes a sense of mystery and intrigue, leaving you wondering what secrets are being unlocked in the digital realm.
Prompt
poses silhouette: intense, focused ; A gamer’s hands silhouetted against a glowing computer screen, typing furiously; close-up; gaming; futuristic, neon-lit gaming room; cinematic
Characteristic
Shot : Close-up of hands typing on a keyboard, the person is sitting in front of a computer screen. The scene is lit with red and blue lights creating a dark and mysterious atmosphere.
Aesthetic Score : 0.5
Mood : mysterious, techy, intense
Quality
Entropy : 5.84
Noise : 56
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, particularly the hands and the keyboard.
Silhouetted Solitude at Sunset
A lone figure stands on a sandy beach, their silhouette stark against the fiery orange sunset over the ocean. The scene evokes a sense of melancholy peace and contemplation, with the dramatic lighting adding an air of mystery and isolation.
Prompt
poses silhouette: Solitude, contemplation, longing ; A lone figure stands on a deserted beach, silhouetted against the fiery sunset, the vast ocean stretching out behind them.; cinematic
Characteristic
Shot : A lone figure silhouetted against an orange sunset over the ocean. The person is standing on a sandy beach, facing the water.
Aesthetic Score : 0.7
Mood : melancholy, peaceful, contemplative
Quality
Entropy : 6.11
Noise : 62
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
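The per-image numbers above can be summarized in a few lines. This is a minimal sketch that copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten entries in this post. The `summarize` helper and the 0.5 cutoff for flagging an image as likely AI-generated are assumptions made for illustration, not part of the original evaluation.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image entries in this post.
RESULTS = [
    ("Silhouetted Warrior: A Stormy Tale of Power and Isolation", 0.6, 0.35, 0.70),
    ("Silhouettes of Hope: A Sunset Walk on the Edge", 0.6, 0.27, 0.30),
    ("In the Glow of Victory: A Gamer's Focused Intensity", 0.6, 0.32, 0.20),
    ("Eiffel Tower Romance: A Silhouette of Love", 0.7, 0.38, 0.20),
    ("Silhouettes of Hope in the Desert Sunset", 0.7, 0.33, 0.10),
    ("Cheers to Friendship: A Warm and Intimate Celebration", 0.5, 0.33, 0.10),
    ("Heroic Leap into the Sunset", 0.6, 0.30, 0.90),
    ("Into the Unknown: A Mysterious Cave Beckons", 0.6, 0.31, 0.20),
    ("The Hacker's Hands: A Symphony of Code in Neon Light", 0.5, 0.29, 0.30),
    ("Silhouetted Solitude at Sunset", 0.7, 0.31, 0.20),
]


def summarize(results, ai_threshold=0.5):
    """Average the three scores and list images at or above the AI threshold.

    The threshold is a hypothetical cutoff chosen for this sketch.
    """
    return {
        "mean_aesthetic": mean(r[1] for r in results),
        "mean_clip": mean(r[2] for r in results),
        "mean_ai_likelihood": mean(r[3] for r in results),
        "flagged_as_ai": [r[0] for r in results if r[3] >= ai_threshold],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(key, value)
```

With the values listed in this post, the averages come out to roughly 0.61 for Aesthetic Score, 0.32 for Prompt Clip Score, and 0.32 for Likelihood of AI, and only the warrior and superhero images reach the assumed 0.5 cutoff.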
Conclusion
The results show that the generative AI model fell short of the “good” range for both camera position and shot analysis, and landed outside the “very good” range for aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.35
- Interpretation: This score falls below the “good” range of 0.5 to 0.75. It suggests that the model didn’t quite capture the intended camera positions as described in the prompt.
Shot Analysis:
- Score: 0.45
- Interpretation: This score also falls below the “good” range. It indicates that the model had some difficulty understanding the scene and creating the desired shot composition.
Aesthetic Analysis:
- Score: 0.14
- Interpretation: This score lies above the “very good” range of -0.2 to 0.1, exceeding its upper bound by 0.04. It suggests that the generated image’s aesthetic deviated from the expected aesthetic described in the prompt.
Overall:
While the model showed some success in understanding camera positions and shot composition, it struggled to create an image with the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic concepts into visual outputs.
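The comparisons in the breakdown above reduce to checking each score against its reference range. This is a minimal sketch using only the scores and ranges quoted in this post. The `RangeCheck` type and helper names are hypothetical, and the post does not say how the scores themselves were computed.

```python
from typing import NamedTuple


class RangeCheck(NamedTuple):
    name: str
    score: float
    low: float   # lower bound of the reference range
    high: float  # upper bound of the reference range


def position(check: RangeCheck) -> str:
    """Report whether the score is below, within, or above its range."""
    if check.score < check.low:
        return "below"
    if check.score > check.high:
        return "above"
    return "within"


def distance_outside(check: RangeCheck) -> float:
    """Distance from the score to the nearest range bound (0.0 if within)."""
    if check.score < check.low:
        return check.low - check.score
    if check.score > check.high:
        return check.score - check.high
    return 0.0


# Scores and ranges as quoted in the Conclusion.
CHECKS = [
    RangeCheck("Camera Position", 0.35, 0.5, 0.75),
    RangeCheck("Shot Analysis", 0.45, 0.5, 0.75),
    RangeCheck("Aesthetic Analysis", 0.14, -0.2, 0.1),
]

if __name__ == "__main__":
    for c in CHECKS:
        print(f"{c.name}: {position(c)} range by {distance_outside(c):.2f}")
```

Read this way, camera position misses its range by 0.15, shot analysis by 0.05, and aesthetic analysis by 0.04, so the aesthetic score is the closest of the three to its reference range in absolute terms.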