AI's Artistic Journey: Capturing the Essence of Poses, But Missing the Aesthetic with Imagen-v3
In the realm of artificial intelligence, the ability to generate images based on textual prompts has become increasingly sophisticated. However, achieving artistic accuracy remains a challenge. This blog post examines the results of an AI model tasked with generating images based on specific poses and scenes, revealing both its strengths and weaknesses in capturing the desired aesthetic. We explore the concept of ‘dramatic style poses’ and how they are used in various contexts, providing examples to illustrate the nuances of this artistic approach.
Created with: imagen-v3
Silhouetted in Solitude: A Moment of Loss in a Ruined City
A lone figure stands amidst the ruins, their silhouette stark against the fiery sunset. The image evokes a sense of melancholy and loss, capturing a moment of contemplation in a world consumed by destruction.
Prompt
poses looking-back: Melancholy, yet hopeful ; Lone figure in a tattered cloak; wide shot; Heroism; Ruins of a fallen city bathed in the golden light of a setting sun; cinematic
Characteristic
Shot : A lone figure stands in a ruined city, silhouetted against a sunset sky. The image evokes a sense of desolation and loss, with the figure seemingly lost in contemplation.
Aesthetic Score : 0.6
Mood : melancholy, dramatic, somber
Quality
Entropy : 6.89
Noise : 74
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some minor issues with the rendering of the ruins, which appear somewhat blurry and lacking in detail. The lighting also seems slightly unnatural, with the sunset being a bit too uniform across the scene.
Lost Temple Beckons Explorers into the Jungle’s Secrets
A sense of mystery and adventure hangs heavy in the air as three explorers stand before an ancient stone temple, its entrance shrouded in the dense jungle foliage. The mood is foreboding, hinting at the secrets that lie hidden within the temple’s forgotten walls.
Prompt
poses looking-back: Excited, adventurous ; A group of explorers; medium shot; Adventure; Lush jungle with ancient temples in the distance; cinematic
Characteristic
Shot : Three people are standing in front of an ancient temple in a jungle. The temple is made of stone and has a large entrance. The people are dressed in explorer clothing and are looking at the temple.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, foreboding
Quality
Entropy : 6.80
Noise : 95
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable errors in the image, although the lighting is a bit flat and could be improved by adding more shadows and highlights.
Immersed in the Game: A Gamer’s Focused Intensity
A young man sits at his desk, bathed in blue light, completely engrossed in a video game. The image captures the focused intensity of his concentration, highlighting the immersive power of gaming.
Prompt
poses looking-back: Intense, focused ; A gamer’s hands on a keyboard; close-up; Gaming; Neon lights reflecting on the screen, displaying a virtual world; cinematic
Characteristic
Shot : A young man is playing a video game on his computer. He is sitting at his desk in front of a monitor. The room is lit by a blue light.
Aesthetic Score : 0.6
Mood : focused, intense, concentrated
Quality
Entropy : 6.22
Noise : 71
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors.
Solitude and Majesty: A Hiker Finds Peace Amidst the Mountains
A lone hiker stands on a breathtaking mountain ridge, dwarfed by the towering snow-capped peaks. The scene evokes a sense of serenity and inspiration, highlighting the vastness and beauty of the natural world.
Prompt
poses looking-back: Awe-inspiring, peaceful ; A lone traveler standing on a mountain peak; long shot; Tourism; Breathtaking panoramic view of a snow-capped mountain range; cinematic
Characteristic
Shot : A lone hiker stands on a mountain ridge, overlooking a vast expanse of snow-capped mountains. The sky is a clear blue, and the air is crisp and clean. The scene is breathtaking, and the hiker is dwarfed by the size and majesty of the mountains.
Aesthetic Score : 0.8
Mood : serene, inspiring, vast
Quality
Entropy : 6.80
Noise : 96
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
Silhouettes of Solitude: A Woman and a Train at Sunset
A lone woman in a brown coat stands beside a vintage train, gazing out over a dusty landscape. The setting sun casts a warm glow, highlighting her silhouetted figure against the vast expanse. The scene evokes a sense of melancholy, nostalgia, and contemplation, as the woman’s isolation and introspection are emphasized by the train’s receding line.
Prompt
poses looking-back: Nostalgic, adventurous ; A vintage train speeding through a desert landscape; medium shot; Travel; Sun setting over the horizon, casting long shadows; cinematic
Characteristic
Shot : A lone woman in a brown coat stands looking out over a dusty landscape beside a vintage train. The sun is setting, casting a warm glow on the scene.
Aesthetic Score : 0.6
Mood : melancholy, nostalgic, contemplative
Quality
Entropy : 6.64
Noise : 91
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Youthful Joy in a Colorful Alleyway
Four friends radiate carefree happiness as they stroll through a narrow alleyway adorned with a vibrant mural. The natural light bathes the scene in warmth, capturing the essence of youthful exuberance.
Prompt
poses looking-back: Joyful, carefree ; A group of friends laughing and talking; medium shot; Groups; A bustling city street with vibrant street art; cinematic
Characteristic
Shot : Four young people are walking in a narrow alleyway with a large mural on the side of the building. The alleyway is lit with natural light and the mural is a colorful piece of artwork.
Aesthetic Score : 0.7
Mood : happy, youthful, carefree
Quality
Entropy : 6.76
Noise : 85
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights. The background is also a bit blurry, but this could be an intentional stylistic choice.
A Moment of Wonder: Astronaut Gazes at Earth from the Vastness of Space
This awe-inspiring image captures an astronaut floating in the void, their silhouette stark against the backdrop of space. The Earth hangs in the distance, a blue marble against the black canvas, while another planet peeks from the cosmic depths. The scene evokes a sense of isolation, mystery, and the sheer vastness of the universe.
Prompt
poses looking-back: Awe-inspiring, contemplative ; A lone astronaut floating in space; long shot; Heroism; Earth hanging in the distance, a blue marble against the black void; cinematic
Characteristic
Shot : An astronaut in a spacesuit floats in space, gazing at the Earth, with another planet in the background. The image conveys a sense of wonder, isolation, and the vastness of space.
Aesthetic Score : 0.7
Mood : awe, mystery, vastness
Quality
Entropy : 4.64
Noise : 79
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, especially the astronaut’s face and the Earth’s surface. The image could also be improved by having more contrast in the shadows, making the space background feel deeper and more vast.
Adrenaline Rush: Rafting Through Rapids
Four friends brave the wild rapids, facing the thrill of the fast-moving water. The photographer captures the excitement and danger of their adventure from the back of the raft.
Prompt
poses looking-back: Thrilling, exhilarating ; A group of adventurers on a raft; medium shot; Adventure; Rapids churning whitewater, a sense of danger and excitement; cinematic
Characteristic
Shot : A group of four people is rafting down a river with rapids. The photographer is in the raft, facing backwards.
Aesthetic Score : 0.7
Mood : adventurous, exciting, daring
Quality
Entropy : 6.00
Noise : 80
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.00
Image errors : There are no noticeable errors in the image.
Finding Tranquility Amidst the Peaks
A solitary figure stands on a mountaintop, headphones on, lost in thought as they gaze upon a breathtaking vista of forest, distant mountains, and a sky adorned with shooting stars. The image evokes a sense of peace and introspective calm, inviting viewers to contemplate the vastness of nature and the serenity of solitude.
Prompt
poses looking-back: Triumphant, accomplished ; A gamer’s avatar standing on a virtual mountain peak; close-up; Gaming; A vast, fantastical landscape stretching out before them; cinematic
Characteristic
Shot : A person wearing headphones is standing on a mountain overlooking a forest, with a distant mountain range and a bright sky with a couple of shooting stars visible in the distance.
Aesthetic Score : 0.7
Mood : tranquil, introspective, peaceful
Quality
Entropy : 6.81
Noise : 78
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : There is a slight blurriness to the scene, and the trees in the forest appear somewhat repetitive, which could be an AI artifact.
Sunset Romance on the Beach
A couple strolls hand-in-hand along a sandy shore as the sun dips below the horizon, casting a warm glow on their silhouettes. The peaceful atmosphere and breathtaking sunset create a truly romantic and serene moment.
Prompt
poses looking-back: Romantic, peaceful ; A couple walking hand-in-hand on a beach; long shot; Tourism; Sunset painting the sky in vibrant hues of orange and pink; cinematic
Characteristic
Shot : A couple is walking hand-in-hand on a beach at sunset. Their backs are to the camera, and they are looking out at the ocean.
Aesthetic Score : 0.7
Mood : romantic, peaceful, serene
Quality
Entropy : 6.80
Noise : 99
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible errors in the image.
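Taken together, the per-image figures above can be summarized in a few lines of Python. This is only an illustrative sketch: the values are copied from the ten image reports in this post, while the variable names and the 0.5 cutoff for counting an image as "likely AI" are assumptions made for the example, not part of the original evaluation.

```python
from statistics import fmean

# (aesthetic score, prompt CLIP score, likelihood of AI), copied from the
# ten image reports above, in the order they appear in the post.
REPORTS = [
    (0.6, 0.33, 0.80),  # ruined city
    (0.6, 0.33, 0.20),  # jungle temple
    (0.6, 0.30, 0.20),  # gamer at desk
    (0.8, 0.30, 0.20),  # hiker on ridge
    (0.6, 0.34, 0.10),  # woman and train
    (0.7, 0.29, 0.10),  # friends in alleyway
    (0.7, 0.31, 0.80),  # astronaut
    (0.7, 0.33, 0.00),  # rafting
    (0.7, 0.28, 0.80),  # mountaintop with headphones
    (0.7, 0.33, 0.20),  # couple on beach
]

# Assumed cutoff for this example only; the post does not define one.
LIKELY_AI_CUTOFF = 0.5

mean_aesthetic = fmean(r[0] for r in REPORTS)
mean_clip = fmean(r[1] for r in REPORTS)
likely_ai_count = sum(1 for r in REPORTS if r[2] >= LIKELY_AI_CUTOFF)

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"images at or above the cutoff: {likely_ai_count} of {len(REPORTS)}")
```

Averaging hides the spread, so it is worth reading these numbers alongside the individual reports rather than in place of them.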
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.51, which falls just inside the “good” range (0.5 to 0.75). This suggests that the model generally captured the intended camera positions described in the prompts, though only by a narrow margin.
- Shot Analysis: The model scored 0.59, also within the “good” range. This suggests that the model understood the scene described in the prompt and was able to create an image that reflected the intended shot composition.
- Aesthetic Analysis: The model scored 0.11, which falls just outside the “very good” range (-0.2 to 0.1). This indicates that the aesthetic of the generated images deviated from the aesthetic described in the prompts.
Overall, the model demonstrates a good understanding of camera positions and shot composition, but needs improvement in capturing the desired aesthetic.
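The range checks behind this breakdown can be written out as a short Python sketch. The scores and the two ranges are the ones quoted above. The `Band` class, the dictionary layout, and the choice to treat range endpoints as inclusive are assumptions made for illustration, since the post does not say how the boundaries are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """A named score range. Endpoints are treated as inclusive (an assumption)."""
    label: str
    low: float
    high: float

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high


# Scores and target ranges as quoted in the conclusion.
RESULTS = {
    "camera_position": (0.51, Band("good", 0.5, 0.75)),
    "shot_analysis": (0.59, Band("good", 0.5, 0.75)),
    "aesthetic_analysis": (0.11, Band("very good", -0.2, 0.1)),
}

# Map each metric to whether its score lies inside its target range.
in_range = {name: band.contains(score) for name, (score, band) in RESULTS.items()}

for name, (score, band) in RESULTS.items():
    status = "inside" if in_range[name] else "outside"
    print(f"{name}: {score:.2f} is {status} the '{band.label}' range "
          f"[{band.low}, {band.high}]")
```

Both the camera-position and aesthetic scores sit within about 0.01 of a range boundary, so a small change in either score would flip its classification.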