AI's Artistic Journey: Capturing Poses, But Missing the Essence with DALL-E 3
Generative image models, trained on large datasets of images and text, can turn a short user prompt into a realistic and compelling picture. They tend to handle technical instructions such as camera position and shot composition well, but they often struggle to capture the aesthetic a prompt asks for. This blog post looks at ten images generated from structured prompts, reports quality and evaluation metrics for each one, and discusses what the results suggest about the strengths and weaknesses of AI-generated art.
Created with: dall-e-3
Victory’s Melancholy: A Warrior’s Triumph Amidst the Fallen
A lone warrior, silhouetted against the setting sun, stands triumphant in a field of crosses. The golden light casts an epic glow, highlighting the warrior’s victory and the somber weight of the fallen. This scene evokes a powerful sense of both triumph and melancholic reflection.
Prompt
poses dancing: triumphant, powerful ; A lone warrior; wide shot; heroism; a battlefield littered with fallen enemies; cinematic
Characteristic
Shot : A lone warrior stands victorious in a field of fallen warriors, all of whom are lying in the shape of a cross. The warrior holds a sword above their head, and the sun is setting behind them.
Aesthetic Score : 0.7
Mood : dramatic, triumphant, melancholic
Quality
Entropy : 6.72
Noise : 109
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have been created by AI, and the fallen warriors are repeated and slightly misaligned. There is some blur in the distance, which could be improved.
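The prompt above, like every other prompt in this post, is a semicolon-separated list with six parts. The sketch below splits such a prompt into named fields so that the requested shot (for example "wide shot") can be compared with the Shot description reported for the image. The field names (`moods`, `subject`, `shot`, `theme`, `setting`, `style`) are labels chosen for illustration; the post itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative assumptions, not taken from the post.
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'poses dancing: a, b ; subject; shot; theme; setting; style' prompt."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    head, subject, shot, theme, setting, style = fields
    # The first field looks like 'poses dancing: triumphant, powerful'.
    _, _, mood_text = head.partition(":")
    moods = [m.strip() for m in mood_text.split(",") if m.strip()]
    return PromptParts(moods, subject, shot, theme, setting, style)


parts = parse_prompt(
    "poses dancing: triumphant, powerful ; A lone warrior; wide shot; heroism; "
    "a battlefield littered with fallen enemies; cinematic"
)
print(parts.shot, "|", parts.moods)
```

Separating the shot and mood fields this way makes it easier to check each one against the Characteristic block of the matching image.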
Explorers Celebrate Triumph in the Heart of the Jungle
A group of adventurers, clad in explorer garb, revel in their success amidst the lush greenery of a jungle setting. The ancient stone structure behind them serves as a backdrop to their joyous celebration, captured with dynamic poses and a shallow depth of field that evokes a sense of energy and excitement.
Prompt
poses dancing: excited, adventurous ; A group of explorers; medium shot; adventure; a dense jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A group of people, mostly in khaki clothing, are celebrating in a jungle setting with ancient ruins in the background. There are vines and plants, and the light is soft and diffused.
Aesthetic Score : 0.6
Mood : joyful, adventurous, celebratory
Quality
Entropy : 6.97
Noise : 120
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some noise in the shadows and some artifacts on the characters’ faces. The edges of the image are slightly blurry.
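The post does not say how the Entropy figure is computed. One common choice, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely); the values of roughly 6 to 7 reported for these images would be consistent with that scale. A minimal sketch under that assumption, taking a flat list of gray levels rather than an image file:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of gray levels (0 to 255)."""
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = [128] * 64            # one gray level everywhere
uniform = list(range(256))   # every gray level exactly once
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this definition a higher value means the tones are spread more evenly across the range. It says nothing by itself about whether the image looks good.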
Neon Glow, Intense Focus: A Gamer’s Sanctuary
A young woman is immersed in a video game, bathed in the vibrant glow of neon lights. The dimly lit room, adorned with gaming paraphernalia, amplifies the intensity and futuristic feel of the scene. The dramatic lighting and color palette create a sense of excitement and anticipation, capturing the essence of a gamer’s world.
Prompt
poses dancing: intense, focused ; A gamer; close-up; gaming; a brightly lit gaming setup with a screen displaying a virtual world; cinematic
Characteristic
Shot : A young woman is playing a video game on a computer in a dimly lit room. The room is decorated with neon lights, and there are other gaming consoles in the background.
Aesthetic Score : 0.7
Mood : intense, focused, futuristic
Quality
Entropy : 6.78
Noise : 93
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise and digital artifacts are visible in the image, particularly in the shadows.
A Dance Amidst the Vibrant Market: A Romantic Adventure
Experience the warmth and excitement of a bustling marketplace as a couple shares a romantic dance. The vibrant colors and hazy atmosphere create an adventurous mood, while the soft, warm light highlights their connection amid the movement around them.
Prompt
poses dancing: joyful, romantic ; A couple; medium shot; tourism; a bustling marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A couple dancing in a bustling marketplace. The scene is filled with vibrant colors, spices, and people. A hand holding a smartphone is capturing the moment, adding a meta layer to the image.
Aesthetic Score : 0.7
Mood : romantic, vibrant, energetic
Quality
Entropy : 6.91
Noise : 104
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.70
Image errors : The lighting feels artificial and the colors are a bit oversaturated, creating a slightly unnatural look. Some artifacts are visible in the background, especially around the edges of the phone.
Silhouetted Serenity: Yoga in the Desert Sunset
A lone figure finds peace and solitude amidst the vast desert landscape, their silhouette a striking contrast against the fiery sunset. This serene scene evokes a sense of contemplation and inner harmony.
Prompt
poses dancing: reflective, contemplative ; A traveler; long shot; travel; a vast desert landscape with a setting sun; cinematic
Characteristic
Shot : A lone figure stands in a yoga pose in a desert landscape with a vivid sunset behind them. The mountains are in the background.
Aesthetic Score : 0.7
Mood : peaceful, serene, introspective
Quality
Entropy : 6.66
Noise : 95
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some artifacts are visible in the sky and the sand.
Rooftop Revelry: City Lights and Dancing Dreams
Capture the energy of a joyous night out with this vibrant image. Young adults celebrate on a rooftop, bathed in the glow of city lights and a star-studded sky. The scene radiates with fun and carefree energy, perfect for capturing the spirit of a memorable night.
Prompt
poses dancing: happy, carefree ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : A group of friends are celebrating on a rooftop with the New York City skyline in the background. They are laughing and dancing with their arms raised in the air.
Aesthetic Score : 0.7
Mood : joyful, celebratory, carefree
Quality
Entropy : 6.61
Noise : 113
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.70
Image errors : The background seems to be blurry and distorted, possibly due to AI image generation.
Shadow Play: A Woman Leaps into the Unknown
A mysterious figure, silhouetted against a bright light source, leaps through a shadowy alleyway. The dramatic lighting and her dynamic pose create a sense of action and suspense, leaving the viewer wondering what lies ahead.
Prompt
poses dancing: determined, defiant ; A lone dancer; close-up; heroism; a dark alleyway with flickering streetlights; cinematic
Characteristic
Shot : A woman in a white shirt and dark pants jumps in mid-air in a narrow street lit by lanterns at night.
Aesthetic Score : 0.7
Mood : dramatic, mysterious, action
Quality
Entropy : 6.11
Noise : 91
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some slight blurring and softness around the edges.
Adventure Awaits in the Misty Mountains
A band of adventurers, clad in fantastical garb, stride confidently through a misty mountain landscape. Dragons soar overhead, adding a touch of magic and danger to the scene. This whimsical image captures the thrill of exploration and the promise of untold adventures.
Prompt
poses dancing: exhilarated, free ; A group of adventurers; wide shot; adventure; a breathtaking mountain range with a clear blue sky; cinematic
Characteristic
Shot : A group of adventurers, including a woman holding a scroll, are running across a mountaintop with a dramatic and epic sky background. There are flying dragons in the distance.
Aesthetic Score : 0.7
Mood : fantasy, adventure, excitement
Quality
Entropy : 6.81
Noise : 106
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some artifacts, particularly in the sky and on the characters’ clothes.
In the Zone: A Gamer’s Intense Focus Under Dramatic Lighting
A man is completely engrossed in a game on his computer screen, the dramatic lighting highlighting his intense focus. The dimly lit room and the presence of another person in the background add to the sense of tension and drama.
Prompt
poses dancing: focused, strategic ; A gamer; close-up; gaming; a dimly lit room with a computer screen displaying a competitive game; cinematic
Characteristic
Shot : A man in a tactical vest is playing a video game on a computer. The game on screen shows a soldier holding a gun.
Aesthetic Score : 0.7
Mood : intense, focused, suspenseful
Quality
Entropy : 6.28
Noise : 80
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some artifacts around the edges of the screen and the man’s hair. The lighting on the man’s face is a bit unnatural.
Sunset Bliss: A Family’s Joyful Run on the Beach
Capture the warmth and happiness of a family enjoying a golden hour run on a pristine beach. The sun sets behind them, casting a romantic glow on their silhouettes as they embrace the beauty of the moment. Palm trees and mountains frame the scene, creating a picturesque backdrop for this joyful memory.
Prompt
poses dancing: relaxed, joyful ; A family; medium shot; travel; a picturesque beach with turquoise water and white sand; cinematic
Characteristic
Shot : A family of four is walking along a beach at sunset. The beach is sandy and there is water in the background. The sky is a bright orange and yellow color.
Aesthetic Score : 0.8
Mood : happy, playful, carefree
Quality
Entropy : 6.03
Noise : 93
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, but the sky could have been slightly more dramatic to enhance the overall aesthetic.
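The per-image numbers listed above can be pulled together in a few lines. The sketch below copies the ten sets of values in the order the images appear and averages them. The 0.5 cut-off used to count an image as "likely AI" is an assumption made for illustration, not a threshold stated in the post.

```python
from statistics import mean
from typing import NamedTuple


class ImageMetrics(NamedTuple):
    aesthetic: float
    entropy: float
    noise: int
    clip: float
    ai_likelihood: float


# Values copied from the ten images above, in order of appearance.
IMAGES = [
    ImageMetrics(0.7, 6.72, 109, 0.24, 0.90),
    ImageMetrics(0.6, 6.97, 120, 0.28, 0.20),
    ImageMetrics(0.7, 6.78, 93, 0.24, 0.20),
    ImageMetrics(0.7, 6.91, 104, 0.24, 0.70),
    ImageMetrics(0.7, 6.66, 95, 0.28, 0.80),
    ImageMetrics(0.7, 6.61, 113, 0.33, 0.70),
    ImageMetrics(0.7, 6.11, 91, 0.27, 0.80),
    ImageMetrics(0.7, 6.81, 106, 0.24, 0.90),
    ImageMetrics(0.7, 6.28, 80, 0.22, 0.90),
    ImageMetrics(0.8, 6.03, 93, 0.26, 0.20),
]


def summarize(images, ai_threshold=0.5):
    """Average each metric; ai_threshold is an assumed cut-off, not from the post."""
    return {
        "aesthetic_mean": mean(m.aesthetic for m in images),
        "entropy_mean": mean(m.entropy for m in images),
        "noise_mean": mean(m.noise for m in images),
        "clip_mean": mean(m.clip for m in images),
        "likely_ai_count": sum(m.ai_likelihood >= ai_threshold for m in images),
    }


summary = summarize(IMAGES)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

With these inputs the mean Aesthetic Score is about 0.70 and the mean Prompt Clip Score about 0.26. Seven of the ten images have a Likelihood of AI of 0.70 or higher, and the other three are at 0.20.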
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower edge of the “good” range (0.5 to 0.75). This indicates that the model generally captured the camera positions described in the prompts.
- Shot Analysis: The model scored 0.595, also within the “good” range. This suggests that the model understood the scene described in the prompt and was able to create an image that reflected the intended shot composition.
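- Aesthetic Analysis: The model scored 0.07, far lower than its scores on the other two measures. This measure is reported against a different scale, with a “very good” range of -0.2 to 0.1, so the three numbers are not directly comparable. Taken at face value, the low score suggests that the generated images’ aesthetic deviated from the aesthetic described in the prompts.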
Overall, the model demonstrates a good understanding of camera positions and shot composition, but needs improvement in capturing the desired aesthetic.
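As a quick sanity check, each reported score can be compared with the range quoted next to it. The sketch below uses only the scores and ranges that appear in the breakdown above. Treating the bounds as inclusive is an assumption, since the post does not say whether endpoints count.

```python
def in_range(score, low, high):
    """True if score lies within [low, high]; inclusive bounds are an assumption."""
    return low <= score <= high


# (score, (low, high)) pairs exactly as quoted in the breakdown.
REPORTED = {
    "Camera Position": (0.5, (0.5, 0.75)),
    "Shot Analysis": (0.595, (0.5, 0.75)),
    "Aesthetic Analysis": (0.07, (-0.2, 0.1)),
}

results = {
    name: in_range(score, low, high)
    for name, (score, (low, high)) in REPORTED.items()
}
for name, inside in results.items():
    print(f"{name}: inside quoted range = {inside}")
```

Camera Position sits exactly on the lower bound of its range, so its label depends on the inclusive-bound assumption.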
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://openai.com/index/dall-e-3/