AI's Artistic Journey: Capturing Poses, But Missing the Essence with Imagen-v2
In the realm of artificial intelligence, generating realistic and aesthetically pleasing images is a constant pursuit. One key aspect is the accurate portrayal of poses, which can significantly affect an image’s overall impact and emotional resonance. This blog post examines the results of an experiment using a generative AI model to create images based on specific scenes and poses, highlighting the model’s strengths and weaknesses in capturing the desired aesthetic.
Created with: imagen-v2
Steel-Eyed Resolve: World War II Soldiers Stand Ready
A powerful image captures the unwavering determination of four World War II soldiers, their faces illuminated with a sense of purpose. The dramatic lighting and composition heighten the tension and anticipation of the moment, leaving a lasting impression of wartime grit.
Prompt
poses standing-in-a-row: determined, courageous, hopeful ; A group of soldiers; wide shot; heroism; a battlefield with smoke and explosions in the background; cinematic
Characteristic
Shot : Four soldiers in World War II-era uniforms stand in a line, facing the camera. All of them wear helmets and have grim expressions. The background is a blurred, cloudy sky, suggesting an outdoor setting.
Aesthetic Score : 0.7
Mood : serious, dramatic, somber
Quality
Entropy : 6.56
Noise : 115
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.60
Image errors : There are some minor artifacts in the image, particularly in the soldiers’ uniforms. The image is also slightly blurry, which may be a stylistic choice or a limit of its resolution rather than an error.
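Every prompt in this experiment follows the same semicolon-separated layout: a pose with its emotions, then subject, shot type, theme, setting, and style. As a minimal sketch, the structure can be parsed as shown here. The field names (`pose`, `emotions`, `subject`, `shot`, `theme`, `setting`, `style`) are labels invented for illustration, not terms from the tooling behind this post.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Hypothetical field names; the post itself does not label the segments.
    pose: str
    emotions: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(text: str) -> ScenePrompt:
    """Split a 'poses <pose>: <emotions> ; subject; shot; theme; setting; style' prompt."""
    parts = [part.strip() for part in text.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated segments, got {len(parts)}")
    pose_part, subject, shot, theme, setting, style = parts

    # The first segment looks like "poses standing-in-a-row: determined, courageous, hopeful".
    pose_label, _, emotion_text = pose_part.partition(":")
    pose = pose_label.strip().removeprefix("poses").strip()
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]

    return ScenePrompt(pose, emotions, subject, shot, theme, setting, style)


if __name__ == "__main__":
    prompt = (
        "poses standing-in-a-row: determined, courageous, hopeful ; "
        "A group of soldiers; wide shot; heroism; "
        "a battlefield with smoke and explosions in the background; cinematic"
    )
    print(parse_prompt(prompt))
```

Splitting the prompt this way makes it easier to compare what was requested (for example a "wide shot" with "smoke and explosions") against what the shot description reports.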
Uncharted Jungle: A Mystery Awaits
Four adventurers, their faces set with determination, stand poised at the edge of a lush jungle. A weathered stone structure looms in the background, hinting at secrets waiting to be uncovered. The air crackles with anticipation, promising an adventure unlike any other.
Prompt
poses standing-in-a-row: excited, curious, adventurous ; A team of explorers; medium shot; adventure; a lush jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : A group of four adventurers in a jungle setting, posing in front of an ancient temple or structure in the background. The lighting is warm and hazy, giving the image a somewhat cinematic and mysterious feel.
Aesthetic Score : 0.6
Mood : adventure, suspense, exploration
Quality
Entropy : 6.76
Noise : 117
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.70
Image errors : Some artifacts are noticeable, particularly in the foliage, suggesting possible over-sharpening or digital manipulation. The edges of the figures appear slightly blurred, indicating possible poor focus or digital editing.
Esports Athletes Ready to Battle
A group of young men, clad in esports jerseys and headphones, stand in a row, their focused expressions hinting at the intense competition ahead. The blurred background suggests a large stage, setting the scene for a thrilling gaming tournament.
Prompt
poses standing-in-a-row: focused, competitive, passionate ; A group of gamers; close-up shot; gaming; a brightly lit esports arena with cheering fans; cinematic
Characteristic
Shot : A group of young men in esports jerseys and headphones stand in a line, looking forward with serious expressions. The background is blurry and lit with neon lights, suggesting a gaming event.
Aesthetic Score : 0.7
Mood : intense, focused, competitive
Quality
Entropy : 6.53
Noise : 99
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors in the image. The quality of the image is good.
Family Adventure in the Majestic Mountains
A serene and happy family enjoys a day out in a vast field, with towering mountains providing a dramatic backdrop. The bright blue sky and casual attire suggest a carefree and adventurous spirit.
Prompt
poses standing-in-a-row: happy, relaxed, joyful ; A family of tourists; long shot; tourism; a breathtaking view of a mountain range with a clear blue sky; cinematic
Characteristic
Shot : Four people, likely a family, stand in a line in front of a mountain range. They are all wearing casual clothing. The sky is a clear blue, and the mountain peaks are partially covered in snow.
Aesthetic Score : 0.6
Mood : tranquil, peaceful, adventurous
Quality
Entropy : 6.84
Noise : 110
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is well-exposed, but there is some noise present in the shadow areas. The colors are a bit muted, and the overall image appears to be slightly over-sharpened.
Four Friends Embark on a Dusty Road to Adventure
A group of young adventurers, their faces set with determination, stand on a dirt road in a tropical setting. Palm trees sway in the background, and the cloudy sky hints at the unknown that lies ahead. This image captures the essence of adventure, suspense, and hope, leaving viewers eager to discover what awaits these intrepid travelers.
Prompt
poses standing-in-a-row: free-spirited, adventurous, optimistic ; A group of backpackers; medium shot; travel; a dusty road leading to a distant village with palm trees; cinematic
Characteristic
Shot : Four people, likely a group of friends or a family, are standing on a dirt road in a tropical environment. They are all wearing casual clothing and carrying backpacks. The tropical vegetation suggests a setting such as Southeast Asia.
Aesthetic Score : 0.6
Mood : adventurous, determined, hopeful
Quality
Entropy : 6.83
Noise : 108
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, especially in the sky. The contrast is also a little low.
Passionate Performance: Singers Captivate with Dramatic Intensity
A group of singers, clad in black, deliver a powerful performance. Their expressive faces and well-lit stage capture the raw emotion and intensity of their music, creating a dramatic and captivating scene.
Prompt
poses standing-in-a-row: harmonious, powerful, emotional ; A choir singing in harmony; close-up shot; groups; a dimly lit stage with spotlights; cinematic
Characteristic
Shot : A group of people, mostly women, are singing in a dark room or on a stage.
Aesthetic Score : 0.7
Mood : intense, dramatic, focused
Quality
Entropy : 5.95
Noise : 120
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and there is some noise in the shadows. The resolution is also a bit low, which makes it difficult to see the details of the singers’ faces.
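The choir image has the lowest Entropy value in the set (5.95). The post does not define this metric, so the following is only an assumption: one common reading is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a completely flat image) to 8 bits (all 256 levels equally common). Under that reading, a dark stage with large near-black regions would tend to score lower. A minimal sketch on toy pixel lists, not on the actual images:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    flat = [0] * 100            # a single intensity everywhere
    two_tone = [0, 255] * 50    # half black, half white
    uniform = list(range(256))  # every 8-bit level exactly once
    for name, data in [("flat", flat), ("two_tone", two_tone), ("uniform", uniform)]:
        print(name, shannon_entropy(data))
```

The flat list gives 0 bits, the two-tone list 1 bit, and the uniform list the 8-bit maximum, which brackets the 5.95 to 6.84 range reported in this post.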
Anticipation on Stage: Young Women in Yellow Dresses Command Attention
A group of young women in vibrant yellow dresses stand poised on a stage, their gazes fixed upwards. The dramatic lighting and their focused postures create a palpable sense of anticipation, hinting at a performance about to unfold. The mood is serious, elegant, and captivating.
Prompt
poses standing-in-a-row: energetic, synchronized, joyful ; A line of dancers; wide shot; groups; a brightly lit stage with colorful costumes; cinematic
Characteristic
Shot : A group of young women in yellow dresses stand on a stage, looking up and singing. They are lined up in a row, and the stage is dark except for the light from the spotlight on them.
Aesthetic Score : 0.6
Mood : focused, determined, hopeful
Quality
Entropy : 6.15
Noise : 113
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors; some minor color banding is visible on the dresses.
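This image has the lowest Prompt Clip Score of the ten (0.21), which fits the visible mismatch: the prompt asked for energetic, joyful dancers, while the shot description reads as focused singers. The post does not say how the score is computed. A common convention, assumed here, is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result. The sketch below shows only that similarity step, and the vectors are made-up toy values, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical 3-dimensional stand-ins for a text and an image embedding.
    prompt_vec = [0.9, 0.1, 0.4]
    close_image_vec = [0.8, 0.2, 0.5]
    off_topic_image_vec = [0.1, 0.9, 0.2]
    print("close match:", cosine_similarity(prompt_vec, close_image_vec))
    print("off topic:  ", cosine_similarity(prompt_vec, off_topic_image_vec))
```

Scores of this kind are best read relative to each other: in this post the range is 0.21 to 0.32, so a few hundredths can separate the best and worst prompt matches.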
Golden Hour Serenity: Friends Embrace the Sunset’s Majesty
A group of friends stand on a golden beach, their silhouettes framed against a breathtaking sunset. The sky bursts with vibrant hues of orange, pink, and purple, while the crashing waves add a dramatic touch to the serene scene. This moment captures the tranquility and beauty of a perfect evening.
Prompt
poses standing-in-a-row: relaxed, happy, nostalgic ; A group of friends; medium shot; groups; a sunset over a beach with waves crashing in the background; cinematic
Characteristic
Shot : A group of friends standing on a beach at sunset, facing the ocean.
Aesthetic Score : 0.7
Mood : peaceful, nostalgic, hopeful
Quality
Entropy : 6.63
Noise : 110
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and compression artifacts are present, particularly in the sky and the water.
Unveiling the Secrets: A Glimpse into a Mysterious Lab
Three figures in lab coats stand shrouded in a dimly lit laboratory, their serious expressions hinting at a hidden purpose. The greenish-blue color palette and dramatic lighting create an atmosphere of suspense and intrigue, leaving viewers wondering what secrets lie within.
Prompt
poses standing-in-a-row: focused, determined, innovative ; A team of scientists; close-up shot; groups; a laboratory with complex machinery and glowing screens; cinematic
Characteristic
Shot : Three scientists, two men and one woman, are standing in a laboratory. They are all wearing white lab coats. The woman is looking down and the two men are looking directly at the camera.
Aesthetic Score : 0.6
Mood : serious, tense, professional
Quality
Entropy : 6.74
Noise : 102
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and the colors are a bit desaturated. The focus could be sharper.
Determined Faces, Muted Colors: A Crowd Gathers in Protest
A group of young women, their faces etched with seriousness, stand amidst a crowd, holding signs and conveying a sense of urgency. The muted colors of the image amplify the somber mood, highlighting the gravity of the moment.
Prompt
poses standing-in-a-row: determined, passionate, hopeful ; A group of protesters; long shot; groups; a city street with banners and signs; cinematic
Characteristic
Shot : A group of people, mostly women, are standing in a street protest, holding signs and looking determined.
Aesthetic Score : 0.7
Mood : serious, tense, hopeful
Quality
Entropy : 6.70
Noise : 107
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.00
Image errors : No notable image artifacts or errors.
Conclusion
The results show that the generative AI model performed well in understanding the scene and camera position, but struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.5, which falls within the “good” range. This means the camera position in the generated images was fairly close to what the prompts requested.
- Shot Analysis: The model scored 0.58, also within the “good” range. This indicates the model was able to understand the scenes described in the prompts and translate them into visually similar images.
- Aesthetic Analysis: The model scored 0.1, which is considered “very good”. This means the aesthetic of the generated images was very close to the expected aesthetic.
Overall, the model demonstrates a good understanding of scene and camera position, but needs improvement in capturing the desired aesthetic.
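The post does not explain how the three scores in the breakdown were derived, so the sketch below does not try to reproduce them. It only averages the per-image metrics listed in the sections above, as a quick cross-check of the gallery as a whole. The short image labels and the field names are shorthand introduced here.

```python
from statistics import mean

# Field order for each row; names are shorthand for the metrics in the post.
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

# Values copied from the ten image sections above.
RESULTS = {
    "WWII soldiers": (0.7, 6.56, 115, 0.26, 0.60),
    "Jungle explorers": (0.6, 6.76, 117, 0.32, 0.70),
    "Esports athletes": (0.7, 6.53, 99, 0.29, 0.20),
    "Family in mountains": (0.6, 6.84, 110, 0.26, 0.10),
    "Backpackers": (0.6, 6.83, 108, 0.31, 0.20),
    "Choir": (0.7, 5.95, 120, 0.27, 0.10),
    "Dancers": (0.6, 6.15, 113, 0.21, 0.20),
    "Friends at sunset": (0.7, 6.63, 110, 0.30, 0.20),
    "Scientists": (0.6, 6.74, 102, 0.30, 0.20),
    "Protesters": (0.7, 6.70, 107, 0.24, 0.00),
}


def column(name):
    """Return {image label: value} for one metric."""
    i = FIELDS.index(name)
    return {title: row[i] for title, row in RESULTS.items()}


def summarize():
    """Return the mean of every metric across all images."""
    return {name: mean(column(name).values()) for name in FIELDS}


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"mean {name}: {value:.3f}")
    clip = column("clip")
    print("highest prompt CLIP score:", max(clip, key=clip.get))
    print("lowest prompt CLIP score:", min(clip, key=clip.get))
```

On these numbers the mean Aesthetic Score is 0.65 and the mean Prompt Clip Score is about 0.28, with the jungle explorers matching their prompt best and the dancers worst.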
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-2/