AI's Artistic Struggle: Capturing the Essence of Dramatic Poses with Imagen-v3
Dramatic poses are a powerful tool in visual storytelling, conveying emotions, actions, and narratives through the positioning of the human body. From heroic stances to moments of vulnerability, these poses can evoke a wide range of feelings in the viewer. In this blog post, we explore the challenges of using AI to generate images that capture the essence of dramatic poses, analyzing the results of an experiment in which an AI model was asked to create images from specific pose descriptions and scene settings. We examine the model’s strengths and weaknesses: how well it handles camera position and shot analysis, and where it struggles to capture the desired aesthetic nuances. Join us as we explore the fascinating world of AI-generated art and its ongoing journey to understand and replicate human creativity.
Created with: imagen-v3
Silhouetted Against Despair: A Warrior’s Sunset
A lone warrior stands in a desolate desert, silhouetted against a fiery sunset. The dramatic lighting and composition evoke a sense of isolation, despair, and the weight of battle.
Prompt
poses close-up: epic, determined ; A lone figure, silhouetted against a blazing sunset; close-up; heroism; a vast, desolate landscape; cinematic
Characteristic
Shot : A lone warrior stands in a desolate desert at sunset. The sky is ablaze with orange and red hues. The warrior is silhouetted against the sunset, his armor and face partially visible.
Aesthetic Score : 0.7
Mood : dramatic, intense, melancholic
Quality
Entropy : 5.83
Noise : 66
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors or artifacts are visible.
Unveiling the Secrets of a Forgotten Journey
A weathered map, held in the glow of candlelight, whispers tales of adventure and mystery. The blurred globes in the background hint at distant lands and uncharted territories, inviting you to embark on a journey of discovery.
Prompt
poses close-up: intrigued, adventurous ; A weathered map, its edges frayed, with a finger tracing a route; close-up; adventure; a dimly lit room filled with antique maps and globes; cinematic
Characteristic
Shot : A person’s hands are shown holding an old, weathered map, with a lit candle and globes out of focus in the background.
Aesthetic Score : 0.7
Mood : mysterious, vintage, adventurous
Quality
Entropy : 6.20
Noise : 70
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise, particularly in the darker areas.
Lost in the Code: A Hand Typing in the Dark
A close-up shot captures a hand furiously typing on a backlit keyboard, bathed in a mysterious pink glow. The dark surroundings and red lighting create a sense of suspense and focus, hinting at a story unfolding in the shadows.
Prompt
poses close-up: intense, focused ; A gamer’s hands, fingers flying across a keyboard, eyes glued to the screen; close-up; gaming; a dimly lit room with neon lights reflecting on the screen; cinematic
Characteristic
Shot : A person’s hand is typing on a keyboard. The lighting is dark with a pink glow, and the keyboard is backlit in red. Only part of the hand is visible, along with the keyboard and a mouse.
Aesthetic Score : 0.4
Mood : dark, focused, mysterious
Quality
Entropy : 6.10
Noise : 67
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some graininess and noise, especially in the shadows. There are also some slight artifacts around the edges of the keyboard.
Capturing the Majesty: A Sunset Symphony in the Mountains
A photographer stands in awe, capturing the breathtaking beauty of a sunset over a majestic mountain range. The tranquil scene evokes a sense of serenity and inspiration, leaving you wanting to experience the moment yourself.
Prompt
poses close-up: awe-inspiring, wonder ; A hand holding a camera, capturing a breathtaking vista; close-up; tourism; a panoramic view of a mountain range with clouds swirling below; cinematic
Characteristic
Shot : A person is holding a camera and taking a photo of a mountain range with a sunset in the background.
Aesthetic Score : 0.7
Mood : tranquil, serene, inspiring
Quality
Entropy : 6.87
Noise : 82
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed in some areas, particularly in the sky. There is also some slight noise present in the image.
Passport Ready: Adventure Awaits
A close-up shot of a US passport, nestled amongst travel essentials like sunglasses and a backpack, captures the anticipation and excitement of an upcoming adventure. The mood is one of travel and exploration, with the passport taking center stage as the key to unlocking new experiences.
Prompt
poses close-up: nostalgic, adventurous ; A passport, open to a page with a stamp from a foreign country; close-up; travel; a cluttered backpack overflowing with travel essentials; cinematic
Characteristic
Shot : A person is holding a US passport in front of a green backpack, sunglasses, and a black electronic device on a wooden table.
Aesthetic Score : 0.6
Mood : travel, anticipation, adventure
Quality
Entropy : 6.38
Noise : 106
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but the color saturation and contrast could be slightly optimized for better visual appeal. The background is slightly blurry, which is not ideal for a high-resolution image.
The Warmth of Unity: A Circle of Hope
A campfire glows, casting a warm light on a group of people whose hands are stacked in a circle, symbolizing togetherness and support. The scene evokes feelings of hope and intimacy, a testament to the power of human connection.
Prompt
poses close-up: warm, connected ; A group of hands, clasped together in a circle, symbolizing unity; close-up; groups; a campfire burning brightly in the background; cinematic
Characteristic
Shot : A group of people are standing around a campfire, with their hands stacked on top of each other in a circle. The scene is lit by the fire, and the hands are the main focus of the image.
Aesthetic Score : 0.7
Mood : warm, togetherness, hopeful
Quality
Entropy : 6.17
Noise : 79
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
A Tear Tells a Thousand Stories: Soldier’s Distress in the Midst of Chaos
A close-up shot captures the raw emotion of a soldier, a single tear rolling down his cheek against a blurry backdrop of chaos. The image evokes feelings of sadness, vulnerability, and despair, highlighting the profound impact of war on the human spirit.
Prompt
poses close-up: tragic, poignant ; A single tear rolling down a hero’s cheek, reflecting the weight of their sacrifice; close-up; heroism; a battlefield littered with fallen comrades; cinematic
Characteristic
Shot : Close-up shot of a man’s face, likely a soldier, with a tear rolling down his cheek, highlighting emotional distress. The background is blurry, suggesting a chaotic environment.
Aesthetic Score : 0.7
Mood : sadness, vulnerability, despair
Quality
Entropy : 6.61
Noise : 97
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : Minor artifacts in the background, particularly around the man’s shoulder. The lighting seems slightly uneven.
Lost in the Fog, Found by the Compass
A weathered compass rests on a moss-covered tree stump, its needle pointing north amidst a dense, foggy jungle. The scene evokes a sense of mystery and adventure, hinting at a journey through the unknown. The compass, a symbol of hope and direction, promises a path forward in this enigmatic landscape.
Prompt
poses close-up: uncertain, suspenseful ; A compass needle spinning wildly, pointing in all directions; close-up; adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : A compass lies on a tree stump in a dense, foggy jungle. The compass needle points north, suggesting a sense of direction and exploration.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, hopeful
Quality
Entropy : 6.55
Noise : 72
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The compass needle appears somewhat blurry and lacks fine detail. The image may have been slightly over-sharpened, resulting in a slight halo effect around the compass.
The Thrill of the Arcade: A Close-Up on the Joystick
A nostalgic and intense close-up shot captures the moment of pure joy as a hand grips an arcade joystick. The blurred background of flashing neon lights and arcade machines adds to the sense of excitement and immediacy, transporting you back to the golden age of gaming.
Prompt
poses close-up: exhilarated, competitive ; A joystick, gripped tightly in a gamer’s hand, as they navigate a virtual world; close-up; gaming; a brightly lit arcade with flashing lights and sounds; cinematic
Characteristic
Shot : Close-up of a hand holding an arcade joystick with a blurred background of arcade machines and neon lights.
Aesthetic Score : 0.6
Mood : intense, nostalgic, playful
Quality
Entropy : 6.62
Noise : 67
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor blur, particularly in the background. The colors also seem slightly oversaturated.
Traumi: A Journey Begins
A close-up shot captures a hand clutching a luggage tag inscribed with the word ‘Traumi,’ hinting at a journey of emotional healing. The blurry background of an airport or train station adds to the sense of uncertainty and the weight of the journey ahead.
Prompt
poses close-up: hopeful, anticipatory ; A luggage tag, with a handwritten note attached, signifying a journey to a new destination; close-up; travel; a bustling airport terminal with people rushing around; cinematic
Characteristic
Shot : Close-up of a hand holding a luggage tag attached to a black suitcase, with a blurry background of people at an airport or train station. The word “Traumi” is written on the tag in cursive.
Aesthetic Score : 0.4
Mood : melancholy, travel, uncertainty
Quality
Entropy : 6.38
Noise : 86
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts.
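The per-image metrics listed above are easier to compare when summarized. The following is a minimal sketch, not part of the original experiment: it copies the reported aesthetic scores, prompt CLIP scores, and AI-likelihood values into a list and computes a few simple statistics. The shortened titles, the `RESULTS` name, and the `summarize` helper are illustrative assumptions.

```python
from statistics import mean

# Per-image values copied from the entries above:
# (title, aesthetic score, prompt CLIP score, likelihood of AI)
RESULTS = [
    ("Silhouetted Against Despair", 0.7, 0.31, 0.20),
    ("Unveiling the Secrets of a Forgotten Journey", 0.7, 0.34, 0.10),
    ("Lost in the Code", 0.4, 0.32, 0.20),
    ("Capturing the Majesty", 0.7, 0.34, 0.20),
    ("Passport Ready", 0.6, 0.32, 0.20),
    ("The Warmth of Unity", 0.7, 0.36, 0.10),
    ("A Tear Tells a Thousand Stories", 0.7, 0.32, 0.10),
    ("Lost in the Fog, Found by the Compass", 0.6, 0.33, 0.80),
    ("The Thrill of the Arcade", 0.6, 0.33, 0.10),
    ("Traumi: A Journey Begins", 0.4, 0.34, 0.10),
]


def summarize(results):
    """Return simple summary statistics for the reported metrics."""
    aesthetic = [row[1] for row in results]
    clip = [row[2] for row in results]
    lowest = min(aesthetic)
    return {
        "mean_aesthetic": round(mean(aesthetic), 2),
        "mean_clip": round(mean(clip), 3),
        # Titles of every image that shares the lowest aesthetic score
        "lowest_aesthetic": [row[0] for row in results if row[1] == lowest],
        # Title of the image rated most likely to be AI-generated
        "most_ai_like": max(results, key=lambda row: row[3])[0],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

With the values above, the mean aesthetic score is 0.61 and the mean prompt CLIP score is 0.331. The two lowest aesthetic scores (0.4) belong to the typing-hand and luggage-tag images, and the compass image stands out with an AI likelihood of 0.80.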
Conclusion
The results show that the generative AI model performed well on shot analysis, fell short of the “good” range on camera position, and struggled with aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.4
- Interpretation: This score falls below the “good” range of 0.5 to 0.75. It suggests that the model didn’t perfectly capture the intended camera positions described in the prompt.
Shot Analysis:
- Score: 0.54
- Interpretation: This score falls within the “good” range of 0.5 to 0.75. It indicates that the model was able to understand and translate the scene description from the prompt into the generated image fairly well.
Aesthetic Analysis:
- Score: 0.14
- Interpretation: This score lies above the “very good” range of -0.2 to 0.1, exceeding its upper bound by 0.04. It suggests that the generated images’ aesthetic deviated from the aesthetic described in the prompts.
Overall:
The model demonstrates a good understanding of shot composition and scene description, but struggles to match the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic preferences into visual outputs.
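The comparison of each score against its reference range can be written out explicitly. This is a small sketch under stated assumptions: the scores and ranges are the ones quoted in the breakdown, while the `Metric` class and its methods are hypothetical names introduced for illustration. The post does not say how the scores themselves were computed, so the sketch only checks range membership.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    score: float
    low: float    # lower bound of the reference range
    high: float   # upper bound of the reference range
    label: str    # name the post gives to that range

    def in_range(self) -> bool:
        return self.low <= self.score <= self.high

    def gap(self) -> float:
        """Distance from the score to the nearest bound (0.0 if inside)."""
        if self.score < self.low:
            return round(self.low - self.score, 2)
        if self.score > self.high:
            return round(self.score - self.high, 2)
        return 0.0


# Scores and ranges as quoted in the breakdown
METRICS = [
    Metric("Camera Position", 0.40, 0.5, 0.75, "good"),
    Metric("Shot Analysis", 0.54, 0.5, 0.75, "good"),
    Metric("Aesthetic Analysis", 0.14, -0.2, 0.1, "very good"),
]

if __name__ == "__main__":
    for m in METRICS:
        status = "inside" if m.in_range() else f"outside by {m.gap()}"
        print(f'{m.name}: {m.score} is {status} the "{m.label}" range')
```

Read this way, only shot analysis lands inside its range. Camera position misses the lower bound by 0.1, and aesthetic analysis overshoots the upper bound by 0.04.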
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-3/