AI's Artistic Struggle: Capturing the Essence of Poses with DALL-E 3
In the realm of artificial intelligence, generating realistic and aesthetically pleasing images is a coveted goal. One approach is to train AI models on vast datasets of images paired with descriptions, so that the model learns the relationship between visual elements and text. Capturing the nuances of human expression, particularly in poses, remains challenging. This blog post explores an experiment in which DALL-E 3 generated ten images from descriptions of poses and scenes, and highlights its strengths and weaknesses in capturing camera position, shot composition, and aesthetic style.
Created with: dall-e-3
Solitude and Wonder on the Mountaintop
A lone hiker stands in the golden light, taking in the breathtaking view of a winding river and shadowed mountains. This tranquil scene evokes a sense of adventure and contemplation, inviting you to imagine the stories unfolding in this vast landscape.
Prompt
poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic
Characteristic
Shot : A lone hiker stands on a mountain ridge, looking out at a valley with a river winding through it. The sun is setting behind them, casting a warm glow over the scene.
Aesthetic Score : 0.7
Mood : serene, peaceful, contemplative
Quality
Entropy : 6.61
Noise : 102
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is a bit blurry, especially in the background. The hiker’s figure is also a little bit too dark.
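Every prompt in this post follows the same semicolon-separated layout: a pose with mood words, then subject, shot type, category, background, and style. The sketch below splits a prompt into those fields. The field names and the `PosePrompt` class are illustrative assumptions, not part of any DALL-E 3 API; only the layout itself is taken from the prompts shown here.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    """Hypothetical container for the fields of one prompt in this post."""
    pose: str
    moods: list[str]
    subject: str
    shot: str
    category: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses <pose>: <moods> ; <subject>; <shot>; ...' prompt."""
    head, *rest = [part.strip() for part in prompt.split(";")]
    if len(rest) != 5:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(rest) + 1}")
    pose_part, mood_part = head.split(":", 1)
    pose = pose_part.removeprefix("poses").strip()
    moods = [m.strip().lower() for m in mood_part.split(",")]
    subject, shot, category, background, style = rest
    return PosePrompt(pose, moods, subject, shot, category, background, style)


example = parse_prompt(
    "poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; "
    "wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic"
)
print(example.pose, example.shot, example.moods)
```

Having the shot type as its own field makes it easy to compare the requested framing ("wide shot") with the Shot description produced for the generated image.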
The Eye of the Storm: A Soldier’s Determination Amidst Chaos
A powerful image captures the intensity of a soldier in military uniform, his determined expression unwavering against a backdrop of a blurry explosion and smoke. The dramatic effect of the blurred background emphasizes the soldier’s focus and the chaos surrounding him, creating a sense of urgency and intensity.
Prompt
poses standing-tall: Brave, defiant, resolute ; Soldier standing on a battlefield; medium shot; Heroism; Smoke and debris from a recent explosion; cinematic
Characteristic
Shot : A man in military gear, possibly a soldier, is looking directly at the camera with a serious expression. He’s wearing a camouflage vest and a black and white scarf around his neck. There is a blurred background of what appears to be a battlefield or war zone, with smoke and explosions in the distance.
Aesthetic Score : 0.6
Mood : serious, determined, intense
Quality
Entropy : 6.67
Noise : 111
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image appears to have been edited with some filters or effects, which may have caused some artifacts or unnatural color saturation. There is a slight blur on the subject’s face, which might be a result of post-processing.
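The post does not say how the Entropy values were computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which can never exceed 8 bits and would be consistent with values between 6.49 and 6.93. A minimal sketch on a flat list of pixel intensities:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of pixel intensities.

    Assumption: this is the kind of measure behind the 'Entropy' field;
    the original post does not state its formula.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


flat = [128] * 1000             # one gray level: no information
two_tone = [0, 255] * 500       # two equally likely levels
uniform = list(range(256)) * 4  # every 8-bit level equally likely

print(shannon_entropy(flat), shannon_entropy(two_tone), shannon_entropy(uniform))
```

Under this reading, a score near 6.7 would describe an image that uses a broad but not uniform spread of tones.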
Neon Nights: Friends Celebrate a Video Game Victory
Capture the thrill of victory with this vibrant image! A group of friends bask in the glow of neon lights, their excitement palpable as they celebrate a hard-won video game win. The dynamic lighting and energetic atmosphere perfectly encapsulate the joy and camaraderie of shared gaming experiences.
Prompt
poses standing-tall: Joyful, triumphant, celebratory ; Group of friends celebrating a victory in a video game; close-up; Gaming; Neon lights and glowing screens of a gaming setup; cinematic
Characteristic
Shot : A group of friends are playing a video game. They are all excited and cheering. There are neon lights in the background. There is a TV monitor on the right side of the image.
Aesthetic Score : 0.6
Mood : excitement, energy, joy
Quality
Entropy : 6.69
Noise : 106
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some noticeable artifacts, such as the blurring of the faces and the overexposed areas.
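The Prompt Clip Score is not defined in the post. A plausible assumption is that it is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result. The sketch below shows only the similarity step, using made-up three-dimensional vectors in place of real embeddings; no CLIP model is loaded.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (hypothetical values, not model output).
text_vec = [0.2, 0.9, 0.1]
image_vec = [0.3, 0.8, 0.4]

print(round(cosine_similarity(text_vec, image_vec), 3))
```

If the score works this way, a higher value means the image embedding points in a direction closer to the prompt embedding, which is why it can serve as a rough prompt-adherence signal.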
On the Edge of Hope: A Dreamy Landscape of Contemplation
Three figures stand on a cliff, their silhouettes stark against the hazy blue sky. The vast valley below, with its winding river, stretches out before them, offering a sense of scale and perspective. The bright sun, shining in the distance, casts a dreamy glow on the scene, hinting at a hopeful future. This image evokes a sense of contemplation and vulnerability, as the figures stand on the edge, both physically and emotionally.
Prompt
poses standing-tall: Awe-struck, contemplative, peaceful ; Tourist standing on a cliff overlooking a breathtaking view; long shot; Tourism; Scenic landscape with rolling hills and a sparkling ocean; cinematic
Characteristic
Shot : Three people stand on a cliff overlooking a vast valley with a river winding through it. They are gazing out at the landscape, suggesting a sense of wonder or contemplation. The background is hazy and ethereal, with the sun shining brightly in the sky.
Aesthetic Score : 0.6
Mood : mysterious, contemplative, hopeful
Quality
Entropy : 6.81
Noise : 101
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.70
Image errors : The background landscape appears a bit artificial and unrealistic, lacking details. The lighting on the figures is inconsistent, particularly on the woman on the left. The figures seem a bit posed.
Silhouettes of Love Against a Sunset Sky
A couple stands on a ship’s deck, their forms outlined against a breathtaking sunset. Mountains rise in the distance, adding to the scene’s romantic and melancholic atmosphere. The silhouette creates a sense of mystery and longing, capturing the essence of a hopeful love story.
Prompt
poses standing-tall: Romantic, adventurous, hopeful ; Couple standing on a ship’s deck; medium shot; Travel; Sunset over the ocean with a silhouette of a distant island; cinematic
Characteristic
Shot : A couple stands silhouetted on a ship’s deck, gazing out at the ocean as the sun sets behind them. There are mountains in the distance and the sky is filled with colorful clouds.
Aesthetic Score : 0.75
Mood : romantic, nostalgic, melancholic
Quality
Entropy : 6.93
Noise : 102
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : There are some minor artifacts in the image, particularly around the edges of the subjects. The sky is also a bit too saturated and lacks depth.
Exhilarating Dance Performance Captures the Stage
A group of dancers ignite the opera house with their vibrant performance, showcasing dramatic movements and captivating lighting that create a lively and celebratory atmosphere.
Prompt
poses standing-tall: Energetic, passionate, expressive ; Group of dancers performing on a stage; wide shot; Groups; Bright stage lights and a cheering audience; cinematic
Characteristic
Shot : A group of dancers performing on stage in a theater. The stage is illuminated by bright spotlights and the audience is cheering in the background.
Aesthetic Score : 0.6
Mood : energetic, exciting, theatrical
Quality
Entropy : 6.75
Noise : 94
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image is slightly blurry. The dancers’ poses are a bit stiff and unnatural.
Lost in the Cosmic Expanse: An Astronaut’s Solitary Journey
A lone astronaut stands on a desolate, moon-like surface, dwarfed by the vastness of space. A celestial body hangs in the sky, casting an ethereal glow on the rocky landscape. The scene evokes a sense of solitude, mystery, and the awe-inspiring beauty of the universe.
Prompt
poses standing-tall: Awe-inspiring, futuristic, surreal ; Astronaut standing on the surface of the moon; long shot; Adventure; Cratered lunar landscape with Earth in the distance; cinematic
Characteristic
Shot : An astronaut stands on a desolate, rocky, lunar surface. A large, luminous planet is visible in the distance, with a galaxy of stars in the background. The scene is bathed in the soft light of a distant sun.
Aesthetic Score : 0.8
Mood : solitude, wonder, mystery
Quality
Entropy : 6.66
Noise : 113
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The planet appears slightly out of focus, and the background stars seem a bit too uniform in size and shape.
Heroic Firefighter Faces Down Flames
A firefighter in full gear stands bravely before a burning building, their determined gaze locked on the camera. The dramatic lighting and composition capture the intensity and urgency of the scene, highlighting the heroics of those who risk their lives to save others.
Prompt
poses standing-tall: Brave, determined, selfless ; Firefighter standing in front of a burning building; medium shot; Heroism; Flames and smoke billowing from the building; cinematic
Characteristic
Shot : A firefighter in full gear stands in front of a burning building, looking directly at the camera with a serious expression.
Aesthetic Score : 0.7
Mood : intense, heroic, dramatic
Quality
Entropy : 6.52
Noise : 91
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Champion’s Triumph: Young Man Celebrates Victory Amidst Cheers and Confetti
A young man, radiating triumph, holds a trophy aloft as a cheering crowd erupts around him. The stage is bathed in spotlights, and confetti falls from the ceiling, creating a vibrant and celebratory atmosphere. The image captures the raw emotion of victory, with the shallow depth of field emphasizing the champion’s moment of glory.
Prompt
poses standing-tall: Triumphant, proud, accomplished ; Gamer holding a trophy after winning a tournament; close-up; Gaming; Crowd cheering and flashing cameras; cinematic
Characteristic
Shot : A young man in a grey hoodie, wearing a gold medal and headphones, is holding a trophy aloft while looking up with a smile, surrounded by cheering fans in a dimly lit room with confetti falling.
Aesthetic Score : 0.7
Mood : triumphant, celebratory, joyful
Quality
Entropy : 6.84
Noise : 96
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image has some artifacts, particularly in the shadows and highlights. The confetti looks slightly artificial. Some of the faces in the crowd are blurry.
Summit Success: A Family’s Joyful Mountaintop Moment
Capture the spirit of adventure and family bonding as a group of five hikers bask in the breathtaking panorama from a mountain peak. The sun-drenched landscape and their beaming smiles radiate happiness and a sense of accomplishment.
Prompt
poses standing-tall: Joyful, united, adventurous ; Family standing on a mountain peak; wide shot; Travel; Panoramic view of snow-capped mountains and a clear blue sky; cinematic
Characteristic
Shot : A group of five hikers stand on a rocky mountain peak, holding hands. They are looking at the camera, smiling. The background is a beautiful mountain range with snow-capped peaks and a clear blue sky.
Aesthetic Score : 0.7
Mood : joyful, adventurous, triumphant
Quality
Entropy : 6.49
Noise : 111
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image has some minor artifacts, such as the slight blurriness of the mountain peaks in the background and the unnaturalness of the shadows.
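To see the ten results side by side, the per-image numbers above can be collected and averaged. The values below are copied from the entries in this post; the short titles and the `Result` structure are labels introduced here for illustration.

```python
from statistics import mean
from typing import NamedTuple


class Result(NamedTuple):
    title: str          # shortened label, not the original heading
    shot: str           # shot type requested in the prompt
    aesthetic: float
    entropy: float
    noise: int
    clip: float
    ai_likelihood: float


RESULTS = [
    Result("Mountaintop hiker", "wide shot", 0.70, 6.61, 102, 0.24, 0.80),
    Result("Soldier", "medium shot", 0.60, 6.67, 111, 0.22, 0.50),
    Result("Gaming friends", "close-up", 0.60, 6.69, 106, 0.32, 0.80),
    Result("Cliff figures", "long shot", 0.60, 6.81, 101, 0.25, 0.70),
    Result("Couple on deck", "medium shot", 0.75, 6.93, 102, 0.31, 0.70),
    Result("Dancers", "wide shot", 0.60, 6.75, 94, 0.21, 0.70),
    Result("Astronaut", "long shot", 0.80, 6.66, 113, 0.29, 0.80),
    Result("Firefighter", "medium shot", 0.70, 6.52, 91, 0.33, 0.20),
    Result("Gamer with trophy", "close-up", 0.70, 6.84, 96, 0.30, 0.50),
    Result("Family on summit", "wide shot", 0.70, 6.49, 111, 0.31, 0.60),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

# Mean of each numeric column across the ten images.
summary = {f: round(mean(getattr(r, f) for r in RESULTS), 3) for f in FIELDS}

best_clip = max(RESULTS, key=lambda r: r.clip)
least_ai_like = min(RESULTS, key=lambda r: r.ai_likelihood)

print(summary)
print(best_clip.title, least_ai_like.title)
```

Across the set, the mean Aesthetic Score is 0.675, the mean Prompt Clip Score is 0.278, and the mean Likelihood of AI is 0.63. The firefighter image has both the highest Prompt Clip Score (0.33) and the lowest Likelihood of AI (0.20).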
Conclusion
The results show that the generative AI model came close to the “good” range for camera position and shot analysis, but struggled with the finer points of aesthetic style. Here’s a breakdown:
Camera Position:
- Score: 0.45
- Interpretation: This score falls just below the “good” range of 0.5 to 0.75. It suggests that the model only partly captured the camera position described in the prompts.
Shot Analysis:
- Score: 0.45
- Interpretation: Like camera position, this score falls just below the “good” range. It indicates that the model did not fully translate the described scenes into the generated images. For example, the cliff prompt asked for a single tourist overlooking a sparkling ocean, while the image shows three figures above a river valley.
Aesthetic Analysis:
- Score: 0.09
- Interpretation: This score lies inside the “very good” range of -0.2 to 0.1, close to its upper bound. By that measure alone the aesthetic match is acceptable, but the per-image notes tell a less flattering story: blur, oversaturated skies, stiff poses, and artificial-looking backgrounds recur across the set.
Overall:
The model showed partial success in understanding camera position and shot composition, and it struggled most visibly with the finer points of the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic descriptions into visual representations.
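As a quick sanity check on the breakdown above, each reported score can be compared with the range quoted for it. The scores, ranges, and labels come from the text; the helper and dictionary names are illustrative.

```python
def within(score, low, high):
    """True when score lies inside the closed interval [low, high]."""
    return low <= score <= high


# (score, range low, range high, range label) as quoted in the breakdown.
REPORTED = {
    "camera position": (0.45, 0.5, 0.75, "good"),
    "shot analysis": (0.45, 0.5, 0.75, "good"),
    "aesthetic analysis": (0.09, -0.2, 0.1, "very good"),
}

checks = {name: within(score, low, high)
          for name, (score, low, high, _label) in REPORTED.items()}

for name, (score, low, high, label) in REPORTED.items():
    position = "inside" if checks[name] else "outside"
    print(f"{name}: {score} is {position} the '{label}' range {low} to {high}")
```

Under this check, 0.45 sits below the lower bound of 0.5 for both camera position and shot analysis, while 0.09 sits inside the range of -0.2 to 0.1.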
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://openai.com/index/dall-e-3/