AI's Artistic Struggle: Capturing the Essence of Poses with Imagen-v2
In the realm of artificial intelligence, the ability to generate images based on textual prompts has become increasingly sophisticated. However, capturing the nuances of human expression, particularly in dramatic poses, remains a challenge. This blog post delves into the results of an AI model tasked with generating images based on prompts describing poses and scenes, highlighting its strengths and weaknesses in capturing the essence of dramatic poses.
Created with: imagen-v2
A Lone Warrior Faces the Storm
A solitary figure, clad in dark armor and wielding a sword, races across a desolate wasteland under a stormy sky. The dramatic lighting and the warrior’s determined expression create a sense of urgency and danger, drawing you into this epic scene.
Prompt
poses running: determined, hopeful ; A lone figure in a tattered cloak; wide shot; Heroism; a desolate wasteland with a storm brewing in the distance; cinematic
Characteristic
Shot : A lone warrior runs through a desolate, barren wasteland under a stormy sky
Aesthetic Score : 0.7
Mood : epic, dramatic, bleak
Quality
Entropy : 6.59
Noise : 85
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts in the background and the sky. The lighting is a bit flat and the color palette is a bit muted.
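The post does not say how the Entropy value in each Quality block is computed. One common reading, and it is only an assumption here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally frequent). A minimal sketch under that assumption, where `shannon_entropy` is a hypothetical helper name:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of an iterable of pixel intensities.

    Assumption: this mirrors the "Entropy" metric only if that metric is
    a histogram entropy; the post does not confirm the exact method.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must not be empty")
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Every 8-bit level used exactly once: the maximum possible entropy.
uniform = shannon_entropy(range(256))
# Only two tones, equally frequent: exactly one bit of information.
two_tone = shannon_entropy([0, 255] * 50)
```

Under this reading, values between roughly 5.8 and 6.9 would indicate images that use most of the tonal range without being uniformly noisy.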
Lost in the Jungle: A Race Against Time
A young man sprints through a dense jungle, ancient ruins looming in the background. The air is thick with mystery and tension, as he races against an unknown threat. This adventurous scene captures the thrill of exploration and the urgency of a perilous mission.
Prompt
poses running: excited, curious ; adventurer with a backpack; medium shot; Adventure; a lush jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A man is running through a lush jungle, with ancient stone structures in the background. The scene is filled with dense vegetation and warm lighting.
Aesthetic Score : 0.6
Mood : adventurous, mysterious, suspenseful
Quality
Entropy : 6.82
Noise : 84
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some blurring and softness, particularly in the background and the vegetation. The lighting is a bit too evenly distributed and lacks natural shadows.
Lost in the Neon Glow: A Gamer’s Focus Under the Digital Spotlight
This image captures the intensity of a gamer immersed in a futuristic world. The close-up shot of their hands on the keyboard and mouse, bathed in the glow of the screen and a neon sign, creates a sense of both focus and mystery. The dimly lit room adds to the dramatic effect, drawing the viewer into the player’s world.
Prompt
poses running: intense, focused ; A gamer’s hands on a keyboard and mouse; close-up; Gaming; a brightly lit gaming room with a monitor displaying a virtual world; cinematic
Characteristic
Shot : A close-up shot of a person’s hand typing on a keyboard with a blurry background of a computer screen and a room.
Aesthetic Score : 0.6
Mood : intense, focused, gamer
Quality
Entropy : 5.83
Noise : 76
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise and blurriness in the background, particularly in the computer screen.
Chaos and Excitement: A Rush Through a Vibrant Market
Capture the energy of a bustling market as people weave through colorful stalls and awnings, creating a sense of urgency and adventure. The scene evokes a vibrant atmosphere, possibly in a developing country, where life moves at a rapid pace.
Prompt
poses running: energetic, joyful ; A group of tourists running through a bustling marketplace; long shot; Tourism; a vibrant marketplace with colorful stalls and vendors; cinematic
Characteristic
Shot : A group of people are running through a crowded marketplace with colorful awnings and stalls. The setting appears to be an exotic, possibly Asian location, based on the architecture and attire of the individuals.
Aesthetic Score : 0.5
Mood : intense, chaotic, fast-paced
Quality
Entropy : 6.65
Noise : 96
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image exhibits some artifacts and blur, particularly in the background, suggesting it might have been processed or upscaled from a lower resolution source.
Running Towards Happiness: A Couple’s Joyful Escape on a Pristine Beach
Capture the essence of carefree love as a couple races towards the turquoise waters of a white sand beach. The clear blue sky and playful energy create a scene brimming with joy and romance. This image evokes a sense of freedom and happiness, perfect for capturing the spirit of a blissful getaway.
Prompt
poses running: romantic, carefree ; A couple running hand-in-hand along a beach; medium shot; Travel; a beautiful beach with turquoise water and white sand; cinematic
Characteristic
Shot : A couple is running along a pristine white sand beach towards the turquoise ocean water. The sky is a bright blue with a few white clouds.
Aesthetic Score : 0.75
Mood : romantic, playful, carefree
Quality
Entropy : 6.13
Noise : 87
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the colors are a bit too saturated. The detail in the sand is a little fuzzy.
Sun-Kissed Laughter and Freedom: Four Friends Embrace the Joy of Running
Capture the essence of carefree joy with this image of four young women running through a sun-drenched field. Their laughter and energetic movements radiate a sense of freedom and happiness, creating a vibrant and uplifting scene.
Prompt
poses running: happy, playful ; A group of friends running through a park; wide shot; Groups; a sunny park with green grass and trees; cinematic
Characteristic
Shot : Four women are running through a grassy field towards the camera, with trees and a bright sky in the background.
Aesthetic Score : 0.6
Mood : happy, carefree, youthful
Quality
Entropy : 6.67
Noise : 106
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, and the colors are a bit muted.
Superhero in Motion: A Cityscape of Power and Intensity
This dramatic image captures a superhero running towards the camera, their focused expression and blurry background creating a sense of intense motion and heroic power. The cityscape backdrop adds to the scene’s grandeur, highlighting the superhero’s presence and the scale of their mission.
Prompt
poses running: powerful, confident ; A superhero in a bright costume; close-up; Heroism; a city skyline with skyscrapers and flashing lights; cinematic
Characteristic
Shot : A superhero in a red and yellow suit is running towards the camera, with a city skyline in the background.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.60
Noise : 70
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The colors are a bit oversaturated, and the image has some minor artifacts around the edges of the subject.
Lost in the Majesty: A Man Contemplates the Snowy Peaks
A lone hiker, clad in brown, traverses a breathtaking snowy landscape. The sun bathes the scene in golden light, highlighting the vastness of the mountains and the smallness of the man. His contemplative gaze suggests a moment of deep reflection amidst the adventurous journey.
Prompt
poses running: determined, adventurous ; A lone explorer running through a snow-covered mountain pass; long shot; Adventure; a majestic mountain range with snow-capped peaks; cinematic
Characteristic
Shot : A man in a brown jumpsuit is walking on a snowy mountainside. The background is a stunning view of snow-capped mountains.
Aesthetic Score : 0.6
Mood : adventurous, determined, cold
Quality
Entropy : 6.73
Noise : 94
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is somewhat over-saturated and the colors are a bit artificial. The snow looks a bit too smooth and the mountains lack detail, which might be caused by a lack of focus or over-processing.
Futuristic Warrior Races Through the Desert
A lone warrior, clad in advanced armor, sprints across a desolate desert landscape. Mountains rise in the distance, framing the epic scene. The image captures the raw power and speed of the warrior’s movement, creating a sense of thrilling action and futuristic adventure.
Prompt
poses running: immersive, exciting ; A gamer’s avatar running through a virtual world; close-up; Gaming; a vibrant and detailed virtual world with fantastical creatures; cinematic
Characteristic
Shot : A futuristic warrior runs across a barren desert landscape. The sun is setting in the background, casting a golden light on the scene.
Aesthetic Score : 0.7
Mood : epic, mysterious, futuristic
Quality
Entropy : 6.76
Noise : 80
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, such as blurriness and aliasing. The composition could be more dynamic, and a stronger focus on the subject would also help.
Running Towards the Horizon: A Moment of Determination
A young woman, fueled by energy and purpose, strides along a winding road. The vibrant blue sky and distant clouds create a backdrop of hope and possibility, mirroring her active and determined spirit. The curve of the road and her running pose capture the essence of forward momentum, inviting viewers to share in her journey.
Prompt
poses running: happy, carefree ; running along a scenic road; medium shot; Travel; a winding road with rolling hills and a picturesque countryside; cinematic
Characteristic
Shot : A young woman is running on a road in the countryside. There are fields on either side of the road, and a blue sky above. The road is paved, and the woman is wearing athletic clothing.
Aesthetic Score : 0.7
Mood : energetic, determined, healthy
Quality
Entropy : 6.89
Noise : 113
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor artifacts, particularly in the sky and in the fields. The colors are also slightly oversaturated, and the overall image appears slightly hazy.
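The ten entries above can be summarized with a few lines of Python. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from each entry; the short labels and the `summarize` helper are illustrative names, not part of the original evaluation pipeline.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries above. Labels are shorthand for the titles.
RESULTS = [
    ("lone warrior", 0.70, 0.31, 0.80),
    ("jungle runner", 0.60, 0.30, 0.80),
    ("gamer close-up", 0.60, 0.25, 0.20),
    ("market rush", 0.50, 0.31, 0.70),
    ("beach couple", 0.75, 0.34, 0.10),
    ("four friends", 0.60, 0.30, 0.20),
    ("superhero", 0.70, 0.28, 0.80),
    ("snowy peaks", 0.60, 0.30, 0.20),
    ("desert warrior", 0.70, 0.28, 0.90),
    ("country road", 0.70, 0.28, 0.30),
]


def summarize(rows):
    """Return the mean of each metric, rounded to three decimals."""
    return {
        "aesthetic": round(mean(r[1] for r in rows), 3),
        "clip": round(mean(r[2] for r in rows), 3),
        "ai_likelihood": round(mean(r[3] for r in rows), 3),
    }


summary = summarize(RESULTS)
# Entry whose image matched its prompt least well by CLIP score.
lowest_clip = min(RESULTS, key=lambda r: r[2])
# Entry rated most likely to be AI-generated.
most_ai_like = max(RESULTS, key=lambda r: r[3])

print(summary)
print("lowest CLIP score:", lowest_clip[0])
print("highest AI likelihood:", most_ai_like[0])
```

On these numbers the gamer close-up has the lowest Prompt Clip Score, which may reflect a prompt that asks for a running pose while describing hands on a keyboard.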
Conclusion
The results show that the generative AI model came close to, but did not reach, the “good” range for camera position and shot analysis, and that it fell just outside the “very good” range for aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.48, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model is somewhat capable of understanding and implementing camera positions from the prompt, but there’s room for improvement.
- Shot Analysis: The model scored 0.45, also slightly below the “good” range of 0.5 to 0.75. This indicates that the model is moderately successful at understanding the scene described in the prompt and translating it into a visual shot.
- Aesthetic Analysis: The model scored 0.11, which is just above the upper bound of the “very good” range of -0.2 to 0.1. This means that the aesthetic of the generated images deviated from the aesthetic expected from the prompts by slightly more than that range allows.
Overall: The model demonstrates a decent understanding of camera positions and shot composition, but needs improvement in capturing the desired aesthetic.
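The range comparisons in the breakdown above can be checked mechanically. This is a minimal sketch that uses only the three scores and the two ranges quoted in the conclusion; `gap_to_range` is a hypothetical helper, and the post does not describe how the scores themselves were produced.

```python
# Ranges and scores as quoted in the conclusion above.
GOOD_RANGE = (0.5, 0.75)          # camera position and shot analysis
VERY_GOOD_AESTHETIC = (-0.2, 0.1)  # aesthetic analysis

SCORES = {
    "camera_position": (0.48, GOOD_RANGE),
    "shot_analysis": (0.45, GOOD_RANGE),
    "aesthetic_analysis": (0.11, VERY_GOOD_AESTHETIC),
}


def gap_to_range(score, low, high):
    """Distance from score to the nearest bound; 0.0 if inside the range."""
    if score < low:
        return round(low - score, 2)
    if score > high:
        return round(score - high, 2)
    return 0.0


gaps = {
    name: gap_to_range(score, *bounds)
    for name, (score, bounds) in SCORES.items()
}

for name, gap in gaps.items():
    status = "inside" if gap == 0.0 else f"outside by {gap}"
    print(f"{name}: {status}")
```

All three scores miss their target range by 0.05 or less, so the differences between the three categories are small.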
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-2/