AI Captures the Essence, But Misses the Shot with Dall-e-3
The ‘dramatic’ aesthetic style is characterized by its use of strong contrasts, bold colors, and dynamic compositions to create a sense of intensity and emotion. It’s often used in film, photography, and visual art to evoke feelings of awe, excitement, or suspense. This style is particularly effective in capturing the essence of adventure, heroism, and other powerful themes. In this experiment, we tested an AI model’s ability to generate images that embody this dramatic aesthetic style, while also adhering to specific scene descriptions and camera angles. The results revealed some interesting insights into the model’s strengths and weaknesses.
Created with: dall-e-3
Epic Mountain Hike: A Lone Wanderer Embraces the Vastness
A solitary hiker, backpack in tow, traverses a breathtaking landscape of towering mountains. The play of light and shadow across the rugged peaks creates a sense of awe and adventure, while the serene blue sky and fluffy clouds add a touch of tranquility to this epic scene.
Prompt
Cinema Verité: Awe-inspiring, determined ; A lone hiker; wide shot; Adventure; Majestic mountain range with snow-capped peaks; cinematic
Characteristic
Shot : A lone hiker walks through a mountain valley, with snow-capped peaks in the background.
Aesthetic Score : 0.7
Mood : epic, dramatic, adventurous
Quality
Entropy : 6.67
Noise : 112
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : Some artifacts are visible in the mountains, particularly in the snow and the sky.
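The prompts in this post all follow the same semicolon-separated pattern: a style label, a list of moods, then subject, shot type, theme, setting, and a closing keyword. As a rough sketch, they could be split into fields like this. The field names (`style`, `moods`, `subject`, `shot`, `theme`, `setting`, `finish`) are assumptions chosen for illustration and are not taken from the original tooling.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are hypothetical labels for the six prompt segments.
    style: str
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    finish: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a 'Style: moods ; subject; shot; theme; setting; finish' prompt."""
    style, _, rest = prompt.partition(":")
    parts = [p.strip() for p in rest.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments after the style, got {len(parts)}")
    moods, subject, shot, theme, setting, finish = parts
    return ScenePrompt(
        style=style.strip(),
        moods=[m.strip() for m in moods.split(",")],
        subject=subject,
        shot=shot,
        theme=theme,
        setting=setting,
        finish=finish,
    )


example = parse_prompt(
    "Cinema Verité: Awe-inspiring, determined ; A lone hiker; wide shot; "
    "Adventure; Majestic mountain range with snow-capped peaks; cinematic"
)
print(example.shot)   # the requested camera framing
print(example.moods)  # the requested moods as a list
```

Pulling out the `shot` field makes it easy to compare the requested framing with the shot that was actually generated.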
Firefighters Battle Blaze, Braving Danger to Save Lives
A dramatic scene unfolds as firefighters confront a raging inferno, smoke and flames billowing from the windows of a burning house. The image captures the intensity and somber mood of the situation, highlighting the bravery and dedication of these first responders.
Prompt
Cinema Verité: Urgent, heroic, chaotic ; A firefighter battling a blaze; close-up; Heroism; Smoke and flames engulfing a building; cinematic
Characteristic
Shot : Firefighters in action, battling a blaze engulfing a house. One firefighter is kneeling in prayer, while others are putting out the flames.
Aesthetic Score : 0.8
Mood : dramatic, intense, somber
Quality
Entropy : 6.84
Noise : 107
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to be digitally enhanced, potentially with some slight blurring or sharpening. There are no major technical errors.
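The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (all 256 gray levels equally likely). The values reported in this post fall inside that range. A minimal sketch working on a plain list of pixel values:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of integer gray levels."""
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # Sum -p * log2(p) over every gray level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


flat_image = [128] * 64            # a single gray level everywhere
spread_image = list(range(256))    # every 8-bit level exactly once

print(shannon_entropy(flat_image))
print(shannon_entropy(spread_image))
```

Under this assumed definition, a lower value indicates an image dominated by a few tones, such as a dark room lit by a single screen.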
The Intensity of the Game
A young man is fully immersed in a first-person shooter video game, his focus and energy palpable in the dimly lit room. The red and blue lights cast dramatic shadows, highlighting the intensity of his experience.
Prompt
Cinema Verité: Intense, focused, exhilarating ; A gamer’s hands furiously manipulating a controller; close-up; Gaming; Blurred background of a computer screen displaying a fast-paced game; cinematic
Characteristic
Shot : A man is playing a video game. He is intensely focused on the screen, and his expression shows excitement and passion.
Aesthetic Score : 0.6
Mood : intense, passionate, competitive
Quality
Entropy : 6.62
Noise : 102
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no major image errors. The colors are slightly oversaturated and the contrast is a bit high, but this is a stylistic choice.
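The Prompt Clip Score is presumably a CLIP-style similarity between the prompt text and the generated image, although the post does not describe the exact pipeline. Assuming it is the cosine similarity of two embedding vectors, the calculation itself looks like the sketch below. The vectors are made-up toy values, not real CLIP embeddings, and producing real embeddings would require a CLIP model that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text embedding and an image embedding (hypothetical values).
text_embedding = [0.2, 0.7, 0.1, 0.4]
image_embedding = [0.6, 0.1, 0.5, 0.3]

print(round(cosine_similarity(text_embedding, image_embedding), 2))
```

Under that assumption, a higher score means the image embedding sits closer to the prompt embedding.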
Family Fun at the Museum: Capturing Joyful Moments
A family of five, beaming with happiness, takes a selfie in front of a majestic building. Their laughter and bright smiles radiate joy, creating a heartwarming scene that captures the essence of a happy family outing.
Prompt
Cinema Verité: Joyful, celebratory, memorable ; A family laughing and taking photos in front of a famous landmark; medium shot; Tourism; Vibrant cityscape with iconic architecture; cinematic
Characteristic
Shot : A family of five is standing in front of a large building, taking a selfie. The family is smiling and appears to be happy. The building is in the background and is somewhat out of focus. The photo is taken in a bright and sunny environment.
Aesthetic Score : 0.7
Mood : joyful, happy, candid
Quality
Entropy : 6.73
Noise : 104
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable errors
Chasing Dreams at Sunset
A lone woman, silhouetted against a vibrant sunset, gazes out at a cityscape dotted with hot air balloons. Her backpack and hat suggest a spirit of adventure, while the dreamy atmosphere evokes a sense of hope and possibility.
Prompt
Cinema Verité: Tranquil, contemplative, awe-inspiring ; A backpacker gazing out at a breathtaking sunset over a foreign city; long shot; Travel; Silhouettes of buildings against a fiery sky; cinematic
Characteristic
Shot : A woman with a backpack, wearing a hat, is looking at the sunset over a city with hot air balloons in the sky. The woman is standing on a hill, and the city is in the distance. There is a large cathedral in the middle of the city.
Aesthetic Score : 0.7
Mood : peaceful, hopeful, adventurous
Quality
Entropy : 6.56
Noise : 93
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has some minor artifacts, such as the blurry background and the unrealistic lighting of the city. However, these errors are not very noticeable and do not detract from the overall quality of the image.
A Child’s Reach for Wonder
A whimsical scene of a child’s hand reaching towards a butterfly in a field of wildflowers. The butterfly in flight creates a sense of anticipation and wonder, capturing the innocence and hopefulness of childhood.
Prompt
Cinema Verité: Innocent, curious, heartwarming ; A young child’s hand reaching out to touch a butterfly; close-up; Family; Lush green meadow with wildflowers; cinematic
Characteristic
Shot : A child’s hand reaching out to a butterfly in a field of wildflowers
Aesthetic Score : 0.8
Mood : serene, magical, innocent
Quality
Entropy : 6.85
Noise : 97
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have been generated by AI and there is slight blurriness around the edges of objects.
The Crowd Erupts in a Sea of Excitement
A stadium explodes with joy as fans celebrate, their faces lit by the anticipation of victory. The energy is palpable, a testament to the power of shared passion and the thrill of the moment.
Prompt
Cinema Verité: Energetic, passionate, communal ; A group of friends cheering on their favorite team at a sporting event; wide shot; Heroism; Stadium filled with excited fans; cinematic
Characteristic
Shot : A large crowd of people are cheering in a stadium, most likely at a sporting event. The scene is lively and exciting.
Aesthetic Score : 0.6
Mood : excited, energetic, celebratory
Quality
Entropy : 6.89
Noise : 114
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears to be slightly blurry, especially in the background, and there is a faint pixelation effect present.
Lost in the Labyrinth of the Bazaar
A man and a woman explore a vibrant Middle Eastern market, their senses captivated by the sights, sounds, and smells. The man captures the moment with his vintage camera, while the woman’s curious gaze reveals a thirst for adventure. The scene is bathed in a mysterious light, hinting at secrets waiting to be uncovered.
Prompt
Cinema Verité: Adventurous, curious, vibrant ; A couple exploring a bustling market in a foreign country; medium shot; Travel; Colorful stalls overflowing with exotic goods; cinematic
Characteristic
Shot : A man and a woman are walking through a bustling market in the Middle East. The man is wearing a traditional Arab headdress, and the woman is dressed in a colorful outfit. The market is full of vendors selling various goods, and there are many people walking around.
Aesthetic Score : 0.7
Mood : exotic, adventurous, intriguing
Quality
Entropy : 6.64
Noise : 104
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.60
Image errors : There are no major errors in the image, but the composition could be improved by placing the subjects in a more balanced position in the frame.
Lost in the Code: A Hacker’s Focus
A young woman, shrouded in shadow, is consumed by her work. The glow of the computer screen illuminates her face, revealing an intense focus. The dimly lit room adds to the sense of mystery and suspense, leaving the viewer wondering what secrets she’s uncovering.
Prompt
Cinema Verité: Focused, intense, absorbed ; A gamer’s face lit by the glow of a computer screen, eyes glued to the action; close-up; Gaming; Dark room with only the screen illuminating the face; cinematic
Characteristic
Shot : Close-up shot of a young woman’s face, illuminated by a screen, looking intensely at something out of frame, possibly a computer screen.
Aesthetic Score : 0.7
Mood : intense, focused, mysterious
Quality
Entropy : 5.72
Noise : 76
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some noticeable artifacts around the subject’s hair and on the screen, indicating possible processing or compression issues.
Campfire Magic: A Family’s Night Under the Stars
A heartwarming scene of a family gathered around a crackling campfire, bathed in the warm glow of the flames and the twinkling light of a starry sky. This image captures the joy, coziness, and adventurous spirit of a night spent under the open sky.
Prompt
Cinema Verité: Warm, intimate, nostalgic ; A family sharing a meal together around a campfire; medium shot; Family; Campsite under a starry night sky; cinematic
Characteristic
Shot : A group of people are gathered around a campfire at night under a starry sky. They are eating and laughing together, creating a warm and inviting atmosphere.
Aesthetic Score : 0.7
Mood : joyful, cozy, family
Quality
Entropy : 6.90
Noise : 99
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : The star field appears to be slightly blurry and artificial, likely due to digital manipulation.
Conclusion
The results indicate that the generative AI model performed well in terms of aesthetic analysis, but struggled with camera position and shot analysis.
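Here’s a breakdown (judging by the ratings attached to each score, lower values appear to indicate a closer match to the prompt):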
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.465, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create the expected shot composition.
- Aesthetic Analysis: The model scored 0.06, which is considered very good. This means that the generated image closely matched the expected aesthetic style described in the prompt.
Overall, the model seems to be better at understanding and capturing the desired aesthetic style than it is at accurately interpreting camera positions and shot descriptions.
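For a quick overview of the per-image metrics listed above, the values can be averaged across the ten images. This is a small sketch using only the numbers reported in this post; the short image labels and the tuple layout are assumptions made for illustration.

```python
from statistics import mean

# (label, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
# Values are copied from the ten image sections; labels are shortened titles.
images = [
    ("mountain hike", 0.7, 6.67, 112, 0.27, 0.80),
    ("firefighters",  0.8, 6.84, 107, 0.28, 0.30),
    ("gamer",         0.6, 6.62, 102, 0.36, 0.20),
    ("museum family", 0.7, 6.73, 104, 0.33, 0.10),
    ("sunset",        0.7, 6.56,  93, 0.28, 0.70),
    ("butterfly",     0.8, 6.85,  97, 0.36, 0.90),
    ("stadium",       0.6, 6.89, 114, 0.31, 0.70),
    ("bazaar",        0.7, 6.64, 104, 0.30, 0.60),
    ("hacker",        0.7, 5.72,  76, 0.32, 0.80),
    ("campfire",      0.7, 6.90,  99, 0.33, 0.30),
]

metric_names = ["aesthetic", "entropy", "noise", "clip", "ai_likelihood"]

# Average each metric column across all images.
averages = {
    name: mean(row[i + 1] for row in images)
    for i, name in enumerate(metric_names)
}

# Find the image with the lowest entropy value.
lowest_entropy = min(images, key=lambda row: row[2])[0]

for name, value in averages.items():
    print(f"{name}: {value:.3f}")
print("lowest entropy:", lowest_entropy)
```

The averages summarize only the listed per-image values. They are separate from the camera position, shot analysis, and aesthetic analysis scores in the breakdown, whose calculation the post does not describe.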
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://openai.com/index/dall-e-3/