AI Captures the Essence of Poses but Struggles with Aesthetics: Imagen-v3-fast
Dramatic poses are a powerful tool in visual storytelling, conveying emotions and narratives through the positioning of the subject. From the heroic stance of a lone figure against a backdrop of destruction to the intimate connection of a couple walking hand-in-hand on a beach, poses can evoke a wide range of feelings and experiences. This blog post explores the capabilities of a generative AI model in capturing the essence of these poses, analyzing its strengths and weaknesses in translating textual descriptions into visual representations.
Created with: imagen-v3-fast
Silhouettes of Hope in a Sun-Kissed Ruin
Two figures stand on a stone path, their forms stark against the golden sunset. Behind them, a ruined city whispers tales of loss and resilience. This evocative scene blends mystery, hope, and a touch of melancholy, leaving the viewer to ponder the stories etched in the fading light.
Prompt
poses looking-back: Melancholy, yet hopeful ; Lone figure in a tattered cloak; wide shot; Heroism; Ruins of a fallen city bathed in the golden light of a setting sun; cinematic
Characteristic
Shot : Two figures stand on a stone path in front of a ruined city at sunset. The figures are silhouetted against the golden sky, and the city behind them is hazy and indistinct.
Aesthetic Score : 0.7
Mood : mysterious, hopeful, melancholic
Quality
Entropy : 6.92
Noise : 65
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, such as the blurry edges of the figures and the somewhat pixelated background.
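The post does not say how the Entropy figure is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 gray levels equally frequent). The values reported in this post (roughly 6.2 to 7.0) would fit that scale, but that is an inference, not something the source confirms. A minimal sketch under that assumption, using only the standard library and a hypothetical helper name `shannon_entropy`:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of gray levels (assumed 0-255)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("need at least one pixel")
    total = len(pixels)
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the gray levels that actually occur
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy inputs standing in for flattened grayscale images
flat_tone = [128] * 1024             # one gray level only
all_levels = list(range(256)) * 4    # every gray level equally often

low = shannon_entropy(flat_tone)
high = shannon_entropy(all_levels)
```

On this scale a higher value means the tones are spread more evenly across the histogram; it says nothing by itself about whether the image looks good.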
Unveiling the Secrets of the Jungle Temple
Three figures, a man and two women, embark on a journey through a lush, overgrown jungle, their destination: a majestic, ancient temple. The air is thick with mystery and anticipation, inviting you to imagine the adventures that lie ahead.
Prompt
poses looking-back: Excited, adventurous ; A group of explorers; medium shot; Adventure; Lush jungle with ancient temples in the distance; cinematic
Characteristic
Shot : Three figures, a man and two women, are walking through a lush, overgrown jungle towards a large, ancient temple.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, contemplative
Quality
Entropy : 6.54
Noise : 95
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some of the textures, especially on the temple, appear slightly blurry and lacking in detail. The foliage also appears somewhat generic and lacks the visual interest of real foliage.
Immersed in the Game: Blue and Orange Lights Illuminate a Gamer’s Focus
A gamer sits at their desk, bathed in the cool glow of blue and orange lighting, their intense focus on the game before them. The dramatic lighting creates a sense of energy and immersion, capturing the thrill of the moment.
Prompt
poses looking-back: Intense, focused ; A gamer’s hands on a keyboard; close-up; Gaming; Neon lights reflecting on the screen, displaying a virtual world; cinematic
Characteristic
Shot : A gamer is sitting at a desk in front of a computer, playing a game. The room is dimly lit, with blue and orange lights highlighting the scene.
Aesthetic Score : 0.6
Mood : intense, focused, cool
Quality
Entropy : 6.20
Noise : 36
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blurriness, especially in the background. The lighting could be better balanced.
A Moment of Solitude on the Mountaintop
A lone hiker stands on a breathtaking mountain ridge, dwarfed by the majestic peaks and lush valleys below. The scene evokes a sense of serenity, contemplation, and adventure, capturing the awe-inspiring beauty of nature.
Prompt
poses looking-back: Awe-inspiring, peaceful ; A lone traveler standing on a mountain peak; long shot; Tourism; Breathtaking panoramic view of a snow-capped mountain range; cinematic
Characteristic
Shot : A lone hiker stands on a mountain ridge, looking out at a breathtaking panoramic view of snow-capped peaks and lush green valleys.
Aesthetic Score : 0.8
Mood : serene, contemplative, adventurous
Quality
Entropy : 6.89
Noise : 73
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Silhouetted Against the Setting Sun: A Moment of Hope in the Desert
A lone woman, hat shielding her face, stands beside a train track in a desolate desert landscape. The setting sun casts long shadows, creating a dramatic scene of solitude and adventure. The mood is one of tranquility and hope, as she gazes towards the horizon, perhaps contemplating the journey ahead.
Prompt
poses looking-back: Nostalgic, adventurous ; A vintage train speeding through a desert landscape; medium shot; Travel; Sun setting over the horizon, casting long shadows; cinematic
Characteristic
Shot : A woman in a hat stands next to a train in a desert landscape, looking at the setting sun.
Aesthetic Score : 0.7
Mood : tranquil, hopeful, adventurous
Quality
Entropy : 6.96
Noise : 72
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Lost in the Art: A Moment of Wonder in the Urban Jungle
Five figures stand dwarfed by a sprawling graffiti mural in a gritty alleyway. Their casual attire and curious gazes suggest a shared fascination with the vibrant artwork, creating a sense of awe and wonder amidst the urban landscape.
Prompt
poses looking-back: Joyful, carefree ; A group of friends laughing and talking; medium shot; Groups; A bustling city street with vibrant street art; cinematic
Characteristic
Shot : A group of five people are standing in an alleyway, looking at a mural on the wall. The wall is made of metal and has a lot of graffiti on it. The people are dressed in casual clothes and they are all looking in different directions.
Aesthetic Score : 0.6
Mood : casual, urban, curious
Quality
Entropy : 6.79
Noise : 101
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain.
A Moment of Awe: Astronaut Gazes at Earth from the Cosmic Frontier
A lone astronaut, silhouetted against the vast expanse of space, stands on a rocky surface and contemplates the distant Earth. The scene evokes a sense of awe, wonder, and solitude, highlighting the profound isolation and beauty of space exploration.
Prompt
poses looking-back: Awe-inspiring, contemplative ; A lone astronaut floating in space; long shot; Heroism; Earth hanging in the distance, a blue marble against the black void; cinematic
Characteristic
Shot : An astronaut in a white spacesuit stands on a rocky surface, gazing at a distant Earth with a glow of sunlight on the horizon.
Aesthetic Score : 0.7
Mood : awe, wonder, solitude
Quality
Entropy : 6.35
Noise : 64
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.90
Image errors : Slight blurriness in the background and some artifacts around the astronaut.
Adrenaline Rush: Whitewater Rafting Adventure
Experience the thrill of whitewater rafting as a group navigates through turbulent rapids. The camera captures the excitement from behind, showcasing the fast-paced action and the adventurous spirit of the participants.
Prompt
poses looking-back: Thrilling, exhilarating ; A group of adventurers on a raft; medium shot; Adventure; Rapids churning whitewater, a sense of danger and excitement; cinematic
Characteristic
Shot : A group of people in a raft moves through whitewater rapids on a river. They are wearing helmets and life jackets. The camera points forward from behind the group, looking out at the river.
Aesthetic Score : 0.6
Mood : adventurous, exciting, active
Quality
Entropy : 6.56
Noise : 73
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors; slight overexposure.
Golden Dawn, Solitary Figure
A lone figure stands on a mountain peak, bathed in the golden light of dawn, gazing out over a misty valley. The dramatic use of light and shadow creates a sense of depth and solitude, evoking a mood of serenity, contemplation, and hope.
Prompt
poses looking-back: Triumphant, accomplished ; A gamer’s avatar standing on a virtual mountain peak; close-up; Gaming; A vast, fantastical landscape stretching out before them; cinematic
Characteristic
Shot : A lone figure stands on a mountain peak, looking out over a vast, misty valley, bathed in the golden light of dawn.
Aesthetic Score : 0.7
Mood : serene, contemplative, hopeful
Quality
Entropy : 6.64
Noise : 63
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some slight pixelation and blurring, particularly in the background mountains.
Silhouettes of Love Against a Fiery Sunset
A romantic couple stands hand-in-hand on a beach, their silhouettes framed by a breathtaking sunset. The scene evokes a sense of peace, serenity, and the enduring power of love.
Prompt
poses looking-back: Romantic, peaceful ; A couple walking hand-in-hand on a beach; long shot; Tourism; Sunset painting the sky in vibrant hues of orange and pink; cinematic
Characteristic
Shot : A couple standing on a beach, holding hands, with a sunset in the background.
Aesthetic Score : 0.7
Mood : romantic, serene, peaceful
Quality
Entropy : 6.83
Noise : 62
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurring of the image.
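All ten prompts above share one layout: a pose tag before the colon, followed by six semicolon-separated fields. As a small sketch, a prompt can be split into named parts so that what was requested (for example, the shot type) is easy to compare with the Shot description reported for each image. The field names and the `PosePrompt` and `parse_prompt` helpers are hypothetical labels chosen for illustration; the post itself does not name the fields.

```python
from typing import NamedTuple


class PosePrompt(NamedTuple):
    # Field names are illustrative assumptions, not taken from the post
    pose: str
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    """Split 'pose tag: mood ; subject; shot; theme; setting; style' into fields."""
    head, _, rest = prompt.partition(":")
    fields = [part.strip() for part in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields after the pose tag, got {len(fields)}")
    return PosePrompt(head.strip(), *fields)


example = parse_prompt(
    "poses looking-back: Romantic, peaceful ; A couple walking hand-in-hand "
    "on a beach; long shot; Tourism; Sunset painting the sky in vibrant hues "
    "of orange and pink; cinematic"
)
print(example.shot)   # long shot
print(example.mood)   # Romantic, peaceful
```

Splitting on semicolons rather than commas matters here, because several moods ("Romantic, peaceful") contain commas of their own.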
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.5, the lower bound of the “good” range (0.5 to 0.75). This indicates that the images generally matched the camera position described in the prompts.
- Shot Analysis: The model scored 0.53, also within the “good” range. This suggests that the model understood the scenes described in the prompts and produced images that reflected that understanding.
- Aesthetic Analysis: The model scored 0.1, which is considered “very good” (between -0.2 and 0.1). This means that the generated images’ aesthetic closely matched the expected aesthetic described in the prompts.
Overall, the model demonstrated a good understanding of the prompts’ instructions, particularly for camera position and shot composition. It would still benefit from further development to capture the desired aesthetic more accurately.
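For a rough overview of the ten images, the per-image figures listed earlier in this post can be averaged directly. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values in post order. It does not reproduce the three conclusion scores (0.5, 0.53, 0.1), because the post does not say how those are derived. The 0.5 cutoff used to count images flagged as likely AI is an assumption for illustration.

```python
from statistics import mean

# (aesthetic score, prompt CLIP score, likelihood of AI), in post order
RESULTS = [
    (0.7, 0.32, 0.90),  # Silhouettes of Hope in a Sun-Kissed Ruin
    (0.7, 0.31, 0.90),  # Jungle Temple
    (0.6, 0.32, 0.20),  # Gamer
    (0.8, 0.28, 0.10),  # Mountaintop
    (0.7, 0.35, 0.20),  # Desert train
    (0.6, 0.28, 0.20),  # Graffiti alleyway
    (0.7, 0.33, 0.90),  # Astronaut
    (0.6, 0.31, 0.20),  # Whitewater rafting
    (0.7, 0.29, 0.80),  # Golden Dawn
    (0.7, 0.34, 0.20),  # Beach sunset
]

AI_THRESHOLD = 0.5  # assumed cutoff, not defined in the post

mean_aesthetic = mean(r[0] for r in RESULTS)
mean_clip = mean(r[1] for r in RESULTS)
flagged_as_ai = sum(1 for r in RESULTS if r[2] >= AI_THRESHOLD)

print(f"mean aesthetic score: {mean_aesthetic:.2f}")    # 0.68
print(f"mean prompt CLIP score: {mean_clip:.3f}")       # 0.313
print(f"flagged as likely AI: {flagged_as_ai} of {len(RESULTS)}")  # 4 of 10
```

The aesthetic scores cluster tightly between 0.6 and 0.8, while the AI-likelihood values split into two clear groups, so the averages hide more variation in the second metric than in the first.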
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-3/