AI's Artistic Eye: Capturing Emotion, Missing the Angle with DALL-E 3
Generative models have become powerful tools for turning text prompts into realistic and imaginative images, often capturing the essence of a scene and its emotional nuances. However, while such a model can handle the aesthetics and emotional content of a prompt well, it often struggles to reproduce the intended camera position. This gap highlights an ongoing challenge, and room for improvement, in AI image generation. This post explores the strengths and weaknesses of generative AI in capturing facial expressions and camera angles, using a recent experiment with dall-e-3 as a case study.
Created with: dall-e-3
Lost in Thought: A Moment of Contemplation in the City
A woman sits alone on a park bench; her thoughtful expression and the blurred cityscape behind her evoke a sense of melancholy and contemplation. The scene captures the quiet solitude that can be found even amidst the bustling urban environment.
Prompt
facial-expressions Thoughtfulness: Melancholy, contemplative ; A lone figure sitting on a park bench; eye-level; Single Person; a bustling city park in the background; cinematic
Characteristic
Shot : A young woman sits on a bench in a city park, looking down with a thoughtful expression. The background is blurred, suggesting a busy city scene.
Aesthetic Score : 0.7
Mood : melancholy, introspective, contemplative
Quality
Entropy : 6.30
Noise : 82
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, particularly in the background, which may be due to compression or editing.
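Every prompt in this post follows the same semicolon-separated layout, so the requested camera position can be pulled out and compared against the resulting shot. The sketch below splits a prompt into its six parts. The field names (expression, subject, camera, character, background, style) are assumptions inferred from the prompts shown here, not a documented schema, and `parse_prompt` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are assumptions inferred from the prompts in this post.
    expression: str
    subject: str
    camera: str
    character: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return PromptParts(*fields)


if __name__ == "__main__":
    example = (
        "facial-expressions Thoughtfulness: Melancholy, contemplative ; "
        "A lone figure sitting on a park bench; eye-level; Single Person; "
        "a bustling city park in the background; cinematic"
    )
    parts = parse_prompt(example)
    print(parts.camera)  # the camera position the prompt asked for
```

Keeping the camera field separate makes it easy to check, image by image, whether the shot description matches what was requested.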
A Hero Stands Watch Over the City
A lone superhero, silhouetted against the vibrant cityscape, embodies hope and responsibility under a starry sky. The dramatic composition highlights their power and the weight of their mission.
Prompt
facial-expressions Thoughtfulness: Reflective, introspective ; A superhero standing on a rooftop, looking out at the city; eye-level; Hero; a sprawling cityscape with twinkling lights; cinematic
Characteristic
Shot : A superhero stands on a rooftop, looking out over a city at night. The city lights are twinkling in the distance.
Aesthetic Score : 0.6
Mood : epic, hopeful, futuristic
Quality
Entropy : 6.55
Noise : 129
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly blurry. The cityscape is also a bit repetitive. The stars in the sky are not realistic.
Tranquility in Motion: A Moment of Peace on a Moving Train
A young woman finds solace in a book as the world rushes by outside her train window. The blurred landscape evokes a sense of passing time, while her stillness and focused gaze create a peaceful contrast. This image captures the beauty of quiet contemplation amidst the bustle of life.
Prompt
facial-expressions Thoughtfulness: Peaceful, absorbed ; A woman reading a book on a train; eye-level; Normal Person; a blurry view of passing scenery outside the window; cinematic
Characteristic
Shot : A young woman is sitting by the window of a train, reading a book. She is looking out the window, watching the scenery go by.
Aesthetic Score : 0.7
Mood : peaceful, calm, contemplative
Quality
Entropy : 6.55
Noise : 86
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some slight artifacts in the hair and clothing of the woman. The motion blur is a bit too harsh.
The Intensity of the Game: A Gamer’s Focus Under the Lights
A young man, headphones on, eyes glued to the screen, embodies the competitive spirit of gaming. The low-angle shot captures the intensity of the moment, highlighting the gamer’s focus amidst the dimly lit atmosphere of a tournament or LAN party.
Prompt
facial-expressions Thoughtfulness: Intense, focused ; A gamer sitting in a dimly lit room, staring intently at a computer screen; eye-level; Gamer; a cluttered desk with gaming peripherals; cinematic
Characteristic
Shot : A young man wearing headphones is intensely focused on playing a video game on a computer. He is sitting in a dimly lit room, with another person in the background. The screen of the computer shows a first-person shooter game.
Aesthetic Score : 0.7
Mood : intense, focused, competitive
Quality
Entropy : 6.69
Noise : 78
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slight over-sharpening effect, particularly on the man’s face. The background is a bit blurry and lacks detail.
Capturing the Moment: A Solitary Figure on the Beach
A man stands on the shore, his smartphone capturing the image of another figure walking along the beach. The shallow depth of field draws the viewer’s eye to the photographer, creating a sense of mystery and contemplation. The blurred background adds to the feeling of solitude and distance, leaving the viewer to wonder about the story unfolding in this serene scene.
Prompt
facial-expressions Thoughtfulness: Solitary, introspective ; A man walking alone on a deserted beach; eye-level; Single Person; the vast ocean stretching out before him; cinematic
Characteristic
Shot : A man is taking a photo of another man walking on a beach with his smartphone. The photo is in focus, while the person taking the photo is out of focus.
Aesthetic Score : 0.7
Mood : mysterious, contemplative, candid
Quality
Entropy : 6.66
Noise : 94
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has slight artifacting around the edges of the phone screen, particularly in the top right corner.
Firefighter’s Stoic Gaze Amidst the Ashes
A firefighter stands amidst the smoky ruins, their expression a testament to the intensity and somberness of the aftermath. The scene captures the dramatic weight of the event, leaving viewers with a sense of uncertainty and the lingering impact of the fire.
Prompt
facial-expressions Thoughtfulness: Somber, reflective ; A firefighter standing amidst the ruins of a fire; eye-level; Hero; smoke and debris filling the air; cinematic
Characteristic
Shot : A firefighter stands in a destroyed city street, smoke and debris in the background, looking directly at the viewer with a stoic expression.
Aesthetic Score : 0.6
Mood : dramatic, somber, gritty
Quality
Entropy : 6.83
Noise : 95
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The background is slightly blurry and lacks detail, and the overall image has a slightly artificial feel, likely due to digital enhancement. The light on the man’s face is a bit too perfect and lacks any shadowing or natural light texture.
Capturing the Warmth of Family Dinner
A close-up shot captures the intimacy of a family gathering around a table, radiating warmth and coziness. The camera angle draws you into the scene, making you feel like a part of their shared moment.
Prompt
facial-expressions Thoughtfulness: Intimate, connected ; A family gathered around a dinner table; eye-level; Normal People; a warm, inviting kitchen setting; cinematic
Characteristic
Shot : A group of people are gathered around a table for a meal, with a camera positioned in the center of the table. The scene is warm and inviting, with a focus on the intimacy of the gathering.
Aesthetic Score : 0.7
Mood : warm, intimate, familial
Quality
Entropy : 6.84
Noise : 94
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, which is causing a loss of detail in the highlights.
Lost in the Game: A Moment of Intense Focus
A young woman, eyes locked on the screen, navigates a vibrant, futuristic world with a controller in hand. The blurred background of explosions and colorful lights adds to the intensity of her focus, capturing the thrill of immersive gaming.
Prompt
facial-expressions Thoughtfulness: Excited, immersed ; A gamer holding a controller, eyes glued to the screen; close-up; Gamer; a vibrant, colorful gaming world displayed on the monitor; cinematic
Characteristic
Shot : A young woman is playing a video game. Her intense focus is emphasized by the blurred background of a dynamic action game scene and the glow of the controller.
Aesthetic Score : 0.7
Mood : intense, focused, futuristic
Quality
Entropy : 6.71
Noise : 93
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some minor artifacts in the background, particularly noticeable in the blur.
Stolen Moments: A Glimpse into Quiet Contemplation
A woman finds solace on a park bench, lost in thought, while another captures the moment in a notebook. The camera, positioned in the foreground, invites us to peek into their private world, creating a sense of intimacy and quiet melancholy.
Prompt
facial-expressions Thoughtfulness: Peaceful, creative ; A woman sitting on a park bench, sketching in a notebook; eye-level; Single Person; a serene park setting with blooming flowers; cinematic
Characteristic
Shot : A woman is writing in a notebook, looking out of a window. The view outside is a lush garden with a woman sitting on a bench.
Aesthetic Score : 0.7
Mood : reflective, peaceful, serene
Quality
Entropy : 6.72
Noise : 94
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some blurriness and noise are present in the image, especially in the background. The composition is a little cluttered, with too many elements vying for attention.
Superhero Stands Tall Against the Storm
A powerful image of a superhero silhouetted against a dramatic cityscape, with a stormy sky and birds soaring overhead. The scene evokes a sense of hope and anticipation, suggesting a climactic battle or a heroic act to come.
Prompt
facial-expressions Thoughtfulness: Determined, resolute ; A superhero looking up at the sky, a determined expression on their face; eye-level; Hero; a dramatic sky with dark clouds gathering; cinematic
Characteristic
Shot : A superhero stands on a rooftop looking out at a city at night, with dark clouds and a flock of birds in the sky.
Aesthetic Score : 0.7
Mood : epic, dramatic, hopeful
Quality
Entropy : 6.68
Noise : 102
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image contains some artifacts, particularly in the clouds and the city, and it looks somewhat blurry. The bird shapes look unrealistic.
Conclusion
The results show that the generative AI model performed well in terms of understanding the scene and aesthetics, but struggled with camera positioning. Here’s a breakdown:
- Aesthetic Analysis: The model achieved a score of 0.1, which is considered very good. This means the generated images closely matched the aesthetic described in the prompts.
- Shot Analysis: The model scored 0.4, indicating a good understanding of the scenes described in the prompts. This suggests the model was able to translate each scene into the generated image fairly accurately.
- Camera Position Analysis: The model scored 0.1, which is considered poor. This suggests the model struggled to represent the camera position described in the prompts.
Overall, the model demonstrates a strong ability to understand the scene and create aesthetically pleasing images. However, it needs improvement in accurately capturing the intended camera position.
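For readers who want to look at the per-image numbers together, here is a minimal sketch that averages the values listed under each image above. These are the per-image Aesthetic Score, Prompt Clip Score, and Likelihood of AI fields, not the three analysis scores in the breakdown, whose scale is not defined in this post. The tuple layout and the `summarize` helper are assumptions made for illustration.

```python
from statistics import mean

# (title, requested camera, aesthetic score, prompt CLIP score, likelihood of AI)
# Values are copied from the per-image listings in this post.
RECORDS = [
    ("Lost in Thought", "eye-level", 0.7, 0.25, 0.10),
    ("A Hero Stands Watch Over the City", "eye-level", 0.6, 0.26, 0.80),
    ("Tranquility in Motion", "eye-level", 0.7, 0.26, 0.90),
    ("The Intensity of the Game", "eye-level", 0.7, 0.25, 0.30),
    ("Capturing the Moment", "eye-level", 0.7, 0.23, 0.80),
    ("Firefighter's Stoic Gaze Amidst the Ashes", "eye-level", 0.6, 0.26, 0.30),
    ("Capturing the Warmth of Family Dinner", "eye-level", 0.7, 0.29, 0.10),
    ("Lost in the Game", "close-up", 0.7, 0.27, 0.80),
    ("Stolen Moments", "eye-level", 0.7, 0.27, 0.20),
    ("Superhero Stands Tall Against the Storm", "eye-level", 0.7, 0.25, 0.90),
]


def summarize(records):
    """Average the per-image metrics and count eye-level prompts."""
    return {
        "images": len(records),
        "mean_aesthetic": round(mean(r[2] for r in records), 3),
        "mean_clip": round(mean(r[3] for r in records), 3),
        "mean_ai_likelihood": round(mean(r[4] for r in records), 3),
        "eye_level_prompts": sum(1 for r in records if r[1] == "eye-level"),
    }


if __name__ == "__main__":
    for key, value in summarize(RECORDS).items():
        print(f"{key}: {value}")
```

Nine of the ten prompts request an eye-level shot, which is why camera position is a useful thing to check across the whole set.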
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://openai.com/index/dall-e-3/