AI's Artistic Struggle: Capturing the Perfect Pose with Imagen-v3
Dramatic poses are a powerful tool in visual storytelling, conveying emotion, action, and character through body language. They are often used in film, photography, and even video games to create impactful and memorable scenes. This blog post explores the challenges of generating dramatic poses using AI, examining how well a generative model can understand and translate scene descriptions into visually compelling images.
Created with: imagen-v3
One Man Stands Against the Tide
A lone warrior, his sword stained with blood, faces down a smoke-filled army in a dramatic scene of impending battle. The image captures the intensity and epic scale of the moment, leaving the viewer on the edge of their seat.
Prompt
poses action-pose: determined, heroic ; Lone warrior; wide shot; Heroism; Epic battle scene with smoke and fire; cinematic
Characteristic
Shot : A lone warrior stands in a battlefield, his sword dripping with blood, as a smoke-filled army approaches behind him.
Aesthetic Score : 0.7
Mood : dramatic, intense, epic
Quality
Entropy : 6.74
Noise : 100
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some of the soldiers in the background appear blurry and unrealistic, and the smoke effect seems artificial. There are also some minor color artifacts.
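The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 (a flat image) to 8 bits (every gray level equally likely). The sketch below shows that calculation on a plain list of pixel values; the function name `shannon_entropy` and the sample data are illustrative and not taken from the evaluation pipeline used for this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of 8-bit gray values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every gray level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    flat = [128] * 1024              # a single gray level: no variation
    uniform = list(range(256)) * 4   # every gray level equally often
    print(shannon_entropy(flat))
    print(shannon_entropy(uniform))
```

Under this assumption, values between 6 and 7 like the ones reported in this post would point to images with a broad but not fully uniform spread of tones.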
Lost in the Majesty: A Hiker’s Moment of Solitude
A lone figure stands on a cliff edge, dwarfed by the vastness of a misty mountain range. The scene evokes a sense of awe and adventure, capturing the dramatic beauty of nature and the human spirit’s desire to explore.
Prompt
poses action-pose: adventurous, awe-inspired ; Adventurer standing on a cliff edge; medium shot; Adventure; Majestic mountain range with clouds; cinematic
Characteristic
Shot : A lone hiker stands on a cliff edge overlooking a vast mountain range shrouded in mist and clouds.
Aesthetic Score : 0.7
Mood : dramatic, adventurous, serene
Quality
Entropy : 6.75
Noise : 85
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable artifacts or errors.
Lost in the Neon Glow: A Gamer’s Immersive Experience
This image captures the intensity of a gamer fully immersed in their virtual world. The vibrant blue and red neon lights illuminate the scene, while the blurred monitor screen hints at the captivating action unfolding within. The back of the gamer’s head and the controller in their hands tell a story of focus and dedication, transporting the viewer into the heart of the gaming experience.
Prompt
poses action-pose: focused, intense ; Gamer holding a controller; close-up; Gaming; Neon-lit gaming room with multiple screens; cinematic
Characteristic
Shot : A person is playing a video game with a controller. The image is taken from behind, showing the back of their head and the controller in their hands; the monitor screen is blurred. The scene is illuminated by blue and red neon lights.
Aesthetic Score : 0.6
Mood : intense, focused, futuristic
Quality
Entropy : 6.36
Noise : 71
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.70
Image errors : No noticeable errors in the image
City Lights and Smiles: Capturing the Joy of the Moment
A young woman radiates happiness as she takes a selfie in front of a stunning, illuminated building. The vibrant atmosphere and her infectious smile create a sense of adventure and excitement, making this a truly joyful moment captured in time.
Prompt
poses action-pose: happy, excited ; Tourist taking a selfie in front of a famous landmark; medium shot; Tourism; Busy city square with people and street performers; cinematic
Characteristic
Shot : A young woman is taking a selfie in front of a large, ornate building. The building is lit up at night, and there are other people in the background. The woman is smiling and looks happy.
Aesthetic Score : 0.6
Mood : joyful, excited, adventurous
Quality
Entropy : 6.40
Noise : 86
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise, especially in the darker areas. The woman’s face is slightly overexposed.
Love on the Open Road: A Couple’s Joyride Through Vineyards
A couple cruises through picturesque vineyards on a motorcycle, capturing the essence of romance and adventure. Their smiles and the stunning scenery radiate joy and freedom, creating a captivating scene that scores high on aesthetic appeal.
Prompt
poses action-pose: free, adventurous ; Couple riding a motorcycle on a winding road; wide shot; Travel; Scenic countryside with rolling hills and vineyards; cinematic
Characteristic
Shot : A couple riding a motorcycle on a winding road through vineyards.
Aesthetic Score : 0.7
Mood : romantic, adventurous, carefree
Quality
Entropy : 6.84
Noise : 106
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, leading to a washed-out look in the background.
Friends Celebrate with City Lights as Their Backdrop
A group of four friends raise their glasses in a toast, their smiles radiating joy and excitement. The vibrant city lights create a dazzling backdrop for their celebration, capturing the energy and happiness of the moment.
Prompt
poses action-pose: joyful, celebratory ; Group of friends celebrating with drinks; medium shot; Groups; Rooftop bar with city lights in the background; cinematic
Characteristic
Shot : A group of four friends toasting each other with drinks at a rooftop bar at night. The city lights are visible in the background.
Aesthetic Score : 0.7
Mood : happy, celebratory, joyful
Quality
Entropy : 6.17
Noise : 93
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors in the image.
Superhero Stands Tall Against the Night
A powerful superhero, possibly Superman, strikes a dynamic pose on a rooftop, gazing directly at the viewer. The city skyline behind him is bathed in the glow of the night, adding to the intense and heroic mood of the scene. The dramatic lighting and dynamic pose create a sense of power and energy.
Prompt
poses action-pose: powerful, confident ; Superhero landing on a rooftop; wide shot; Heroism; City skyline with skyscrapers and neon lights; cinematic
Characteristic
Shot : A superhero, possibly Superman, is posed in a dynamic stance on a rooftop, looking directly at the camera. The background is a city skyline at night.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.63
Noise : 87
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lighting on the superhero’s face seems a bit harsh, and the cityscape is somewhat artificial looking.
Lost in the Mist: An Explorer’s Journey into the Unknown
A lone explorer ventures through a dense jungle, sunlight filtering through the canopy and creating a misty atmosphere. The scene evokes a sense of mystery, adventure, and eeriness, drawing the viewer’s eye towards the unknown.
Prompt
poses action-pose: determined, adventurous ; Explorer navigating a jungle path; medium shot; Adventure; Lush green jungle with vines and sunlight filtering through the canopy; cinematic
Characteristic
Shot : A lone explorer walks through a dense jungle, sunlight filtering through the canopy and creating a misty atmosphere.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, eerie
Quality
Entropy : 6.57
Noise : 111
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.60
Image errors : No noticeable artifacts or errors
Lost in the Code: A Moment of Intense Focus
A young man sits hunched over his computer, bathed in the soft glow of the screen. The low light and close-up shot capture his intense concentration, highlighting the seriousness of his task.
Prompt
poses action-pose: intense, focused ; Gamer competing in an esports tournament; close-up; Gaming; Stadium filled with cheering fans and bright lights; cinematic
Characteristic
Shot : A young man is sitting at a desk in a dimly lit room, concentrating on a computer screen.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 5.99
Noise : 72
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and artifacts, and the background is blurry.
Silhouette of Solitude: A Man Contemplates the Sunset
A solitary figure stands on a beach, their silhouette stark against the fiery hues of a setting sun. The scene evokes a sense of melancholy and contemplation as the figure gazes out at the vast ocean, and the contrast between silhouette and sunset makes for a powerful, evocative image.
Prompt
poses action-pose: Melancholy, contemplative ; A lone figure silhouetted against a fiery sunset, standing on a windswept beach, the vast ocean stretching out before them.; cinematic
Characteristic
Shot : A silhouette of a man standing on a beach, facing the ocean with a fiery sunset in the background.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, serene
Quality
Entropy : 6.62
Noise : 84
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors
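Nine of the ten prompts above share the same semicolon-separated layout, while the final sunset prompt is free-form. The sketch below splits a prompt into named parts so the requested shot type can be compared with the resulting image. The field names (`emotions`, `subject`, `shot`, `category`, `setting`, `style`) are labels assumed for illustration; the post does not name the slots of its prompt template.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PosePrompt:
    emotions: List[str]
    subject: str
    shot: Optional[str] = None
    category: Optional[str] = None
    setting: Optional[str] = None
    style: Optional[str] = None


def parse_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses action-pose: ...' prompt into its assumed fields."""
    parts = [p.strip() for p in prompt.split(";") if p.strip()]
    head, rest = parts[0], parts[1:]
    # The head looks like "poses action-pose: determined, heroic".
    _, _, emotion_text = head.partition(":")
    emotions = [e.strip().lower() for e in emotion_text.split(",") if e.strip()]
    if len(rest) == 5:
        # Full template: subject; shot; category; setting; style.
        subject, shot, category, setting, style = rest
        return PosePrompt(emotions, subject, shot, category, setting, style)
    # Free-form prompt: keep the description as the subject and,
    # if present, treat the last field as the style.
    subject = rest[0] if rest else ""
    style = rest[-1] if len(rest) > 1 else None
    return PosePrompt(emotions, subject, style=style)


if __name__ == "__main__":
    warrior = parse_prompt(
        "poses action-pose: determined, heroic ; Lone warrior; wide shot; "
        "Heroism; Epic battle scene with smoke and fire; cinematic"
    )
    print(warrior.shot, "|", warrior.setting)
```

Parsed this way, a mismatch such as the esports prompt (a stadium setting) producing a dimly lit desk scene can be checked field by field rather than by eye.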
Conclusion
The results show that the generative AI model matched the intended aesthetic well but struggled to interpret camera position and shot composition. Here’s a breakdown:
- Camera Position: The model scored 0.36, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.435, which is also below average. This indicates that the model didn’t fully understand the desired shot composition from the prompt.
- Aesthetic Analysis: The model scored 0.02, which is considered very good. This means the generated image closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the camera position and shot composition. This suggests that the model might need further training to improve its ability to interpret and translate these aspects from the prompt into the generated image.
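The three conclusion scores cannot be reproduced from the numbers shown per image, but the per-image metrics can be averaged for context. The sketch below does that for the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the sections above (titles shortened). The 0.5 cut-off for calling an image "likely AI" is a hypothetical threshold chosen for illustration, not one used in the post.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the per-image sections of this post.
RESULTS = [
    ("One Man Stands Against the Tide", 0.7, 0.29, 0.80),
    ("Lost in the Majesty", 0.7, 0.31, 0.20),
    ("Lost in the Neon Glow", 0.6, 0.29, 0.70),
    ("City Lights and Smiles", 0.6, 0.31, 0.10),
    ("Love on the Open Road", 0.7, 0.33, 0.10),
    ("Friends Celebrate with City Lights", 0.7, 0.27, 0.10),
    ("Superhero Stands Tall Against the Night", 0.7, 0.31, 0.80),
    ("Lost in the Mist", 0.7, 0.30, 0.60),
    ("Lost in the Code", 0.6, 0.29, 0.10),
    ("Silhouette of Solitude", 0.7, 0.31, 0.20),
]

AI_THRESHOLD = 0.5  # hypothetical cut-off, not from the post


def summarize(rows):
    """Average each metric across all images."""
    return {
        "aesthetic": mean(r[1] for r in rows),
        "clip": mean(r[2] for r in rows),
        "ai_likelihood": mean(r[3] for r in rows),
    }


def flagged_as_ai(rows, threshold=AI_THRESHOLD):
    """Titles whose likelihood-of-AI value meets the threshold."""
    return [r[0] for r in rows if r[3] >= threshold]


if __name__ == "__main__":
    for name, value in summarize(RESULTS).items():
        print(f"{name}: {value:.3f}")
    print(flagged_as_ai(RESULTS))
```

On these ten images the averages come out to roughly 0.67 for aesthetics, 0.30 for prompt CLIP score, and 0.37 for likelihood of AI, with four images (the warrior, the gamer, the superhero, and the explorer) at or above the assumed threshold.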
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-3/