AI Struggles to Capture Dramatic Style in Images with Imagen-v3
The dramatic style, often used in film and photography, aims to evoke strong emotions and create a sense of heightened tension or excitement. This style is characterized by specific camera positions, shot types, and aesthetic choices. In this experiment, we tested a generative AI model’s ability to understand and implement these elements to create images in a dramatic style. The results, while promising in some areas, highlight the challenges AI faces in capturing the nuances of artistic expression.
Created with: imagen-v3
One Against Many: A Warrior’s Silhouette Against the Setting Sun
A lone warrior stands defiant against a vast army, bathed in the golden light of a dramatic sunset. The scene evokes a sense of epic struggle and ultimate victory, with the warrior’s silhouette highlighting his courage and determination.
Prompt
dramatic-styles Color Grading: Epic, hopeful ; A lone warrior; wide shot; Heroism; a desolate battlefield with a setting sun; cinematic
Characteristic
Shot : A lone warrior stands in front of a large army, with a sunset in the background.
Aesthetic Score : 0.7
Mood : epic, dramatic, victorious
Quality
Entropy : 6.69
Noise : 65
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The army in the background appears slightly blurry and lacking in detail.
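Most prompts in this experiment appear to share one semicolon-separated layout: a style tag with a color grading, then subject, shot type, theme, setting, and a closing "cinematic" keyword. The sketch below shows how such a prompt could be assembled. The class and field names are hypothetical, inferred from the prompt text rather than taken from the tooling actually used.

```python
from dataclasses import dataclass


@dataclass
class DramaticPrompt:
    """Prompt fields inferred from this article; all names are hypothetical."""
    color_grading: str  # e.g. "Epic, hopeful"
    subject: str        # e.g. "A lone warrior"
    shot: str           # e.g. "wide shot"
    theme: str          # e.g. "Heroism"
    setting: str        # e.g. "a desolate battlefield with a setting sun"

    def render(self) -> str:
        # The color grading is followed by " ; ", the other fields by "; ".
        head = f"dramatic-styles Color Grading: {self.color_grading} ;"
        tail = "; ".join(
            [self.subject, self.shot, self.theme, self.setting, "cinematic"]
        )
        return f"{head} {tail}"


warrior = DramaticPrompt(
    color_grading="Epic, hopeful",
    subject="A lone warrior",
    shot="wide shot",
    theme="Heroism",
    setting="a desolate battlefield with a setting sun",
)
print(warrior.render())
```

Keeping the fields separate makes it easy to vary one element, such as the shot type, while holding the rest of the prompt constant.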
Lost in the Fog: Explorers Face the Unknown
A group of explorers ventures deep into a dense, foggy jungle, silhouetted by the light behind them. The mysterious atmosphere and dramatic lighting create a sense of suspense and adventure, leaving viewers wondering what dangers lie ahead.
Prompt
dramatic-styles Color Grading: Mysterious, adventurous ; A group of explorers; medium shot; Adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : A group of explorers are walking through a dense, foggy jungle. The light is coming from behind them, creating a dramatic silhouette effect.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 6.71
Noise : 107
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has a slightly blurry background, and the leaves look a bit artificial.
Lost in the Neon Maze: A Cyberpunk Portrait
A close-up portrait of a young man with piercing blue eyes, captured in a futuristic setting. The soft, colorful glow and blurry city backdrop create a mysterious and intriguing atmosphere, transporting you to a world of cyberpunk wonder.
Prompt
dramatic-styles Color Grading: Energetic, futuristic ; A player’s avatar; close-up; Gaming; a futuristic cityscape with neon lights; cinematic
Characteristic
Shot : A close-up portrait of a young man with blue eyes. He is wearing a futuristic-looking outfit. The background is a blurry city scene with neon lights. The image is lit with a soft, colorful glow.
Aesthetic Score : 0.7
Mood : mysterious, futuristic, cyberpunk
Quality
Entropy : 6.58
Noise : 73
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image seems a bit too smooth. It looks like it might have been edited to remove detail, which gives it an unnatural look.
Awe-Inspiring Mountaintop Panorama
Experience the serenity of a vast mountain range stretching beneath a bright blue sky. Fluffy clouds drift across the horizon, creating a sense of tranquility and wonder. This breathtaking view evokes a feeling of expansiveness and peace.
Prompt
dramatic-styles Color Grading: Tranquil, inspiring ; A panoramic view of a mountain range; wide shot; Tourism; a clear blue sky with fluffy clouds; cinematic
Characteristic
Shot : A mountaintop view with a vast expanse of forested mountains under a bright blue sky with fluffy clouds.
Aesthetic Score : 0.8
Mood : serene, tranquil, expansive
Quality
Entropy : 6.82
Noise : 94
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Silhouettes of Love at Sunset
A romantic stroll along a beach at sunset, with a pier in the distance. The couple’s silhouettes against the golden light create a sense of intimacy and mystery, capturing the essence of a serene and nostalgic moment.
Prompt
dramatic-styles Color Grading: Romantic, nostalgic ; A couple walking hand-in-hand on a beach; medium shot; Travel; a sunset over the ocean; cinematic
Characteristic
Shot : A couple walks hand in hand along a beach at sunset, with a pier in the background.
Aesthetic Score : 0.7
Mood : romantic, serene, nostalgic
Quality
Entropy : 6.54
Noise : 84
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, which is especially noticeable in the sky.
Campfire Serenity: A Moment of Tranquility in the Dark Forest
A group of people gather around a crackling campfire, their faces illuminated by the warm glow. The surrounding forest is dark and mysterious, while the night sky above is filled with twinkling stars. This scene evokes a sense of peace, intimacy, and warmth, capturing a moment of shared connection under the vastness of the night.
Prompt
dramatic-styles Color Grading: Warm, intimate ; A family gathered around a campfire; medium shot; Family; a starry night sky; cinematic
Characteristic
Shot : A group of people are gathered around a campfire in the middle of a dark forest. The fire is the main source of light and it illuminates the faces of the people around it. The night sky is dark and filled with stars. The scene is one of peace and serenity.
Aesthetic Score : 0.6
Mood : tranquil, intimate, warm
Quality
Entropy : 4.54
Noise : 75
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts in the image, such as the slight blurring of the people in the background. The contrast is also slightly off, making it difficult to see the faces of the people around the campfire. The image has a slight yellow tint.
A White Spaceship Blazes Through an Orange Nebula
Experience the thrill of futuristic space travel as a sleek white spaceship cuts through the cosmos, leaving a trail of light against a vibrant orange nebula. The scene evokes a sense of adventure and dynamism, capturing the excitement of exploring the unknown.
Prompt
dramatic-styles Color Grading: Awe-inspiring, futuristic ; A spaceship soaring through space; wide shot; Adventure; a nebula with vibrant colors; cinematic
Characteristic
Shot : A white spaceship is flying through space, with an orange nebula and stars in the background.
Aesthetic Score : 0.7
Mood : futuristic, sci-fi, adventurous
Quality
Entropy : 6.40
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The stars and nebula in the background are slightly blurry. The spaceship’s texture is a bit rough, with some pixelation.
Silhouetted Against the Storm
A solitary figure stands defiant on a mountain peak, bathed in the eerie glow of lightning strikes. The dramatic contrast between the dark silhouette and the illuminated clouds evokes a sense of power, isolation, and impending danger.
Prompt
dramatic-styles Color Grading: Powerful, dramatic ; A lone figure standing on a mountain peak; wide shot; Heroism; a dramatic storm with lightning; cinematic
Characteristic
Shot : A solitary figure stands on a mountain peak, silhouetted against a stormy sky. Lightning strikes illuminate the clouds above.
Aesthetic Score : 0.7
Mood : dramatic, powerful, ominous
Quality
Entropy : 6.68
Noise : 68
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lightning bolts appear a bit artificial and the mountain texture is too uniform.
A City in Twilight: Mystery and Melancholy in the Empty Streets
This image captures the essence of a European city at dusk. The empty streets, bathed in the golden light of the setting sun, create a sense of mystery and intrigue. The motion blur adds a dynamic element, suggesting a hidden story unfolding within the shadows of the tall buildings. The overall mood is melancholic, yet captivating, leaving the viewer to ponder the secrets held within this urban landscape.
Prompt
dramatic-styles Color Grading: Energetic, lively ; A bustling city street; wide shot; Tourism; a vibrant cityscape with diverse architecture; cinematic
Characteristic
Shot : A street in a European city with tall buildings on both sides. The street is nearly empty, with only a few people walking in the distance. The buildings are illuminated by the setting sun.
Aesthetic Score : 0.7
Mood : mysterious, urban, melancholic
Quality
Entropy : 6.37
Noise : 104
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has minor errors, including slight noise and some chromatic aberration.
Lost in the Smoke: A Man’s Solitary Reflection
A shadowy figure shrouded in smoke sits at a bar, a glass of whiskey his only companion. The scene evokes a sense of mystery and contemplation, leaving the viewer to wonder about the man’s thoughts and the secrets he holds.
Prompt
dramatic-styles Color Grading: Lonely, contemplative, and enigmatic. ; A lone figure, shrouded in smoke, sits at a dimly lit bar, nursing a drink. The camera focuses on the swirling smoke, obscuring their face.; cinematic
Characteristic
Shot : A man in a dark suit sits at a bar, his back to the viewer. He is obscured by smoke, and there is a glass of whiskey in front of him.
Aesthetic Score : 0.6
Mood : mysterious, contemplative, lonely
Quality
Entropy : 5.46
Noise : 60
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly overexposed, and the smoke looks a bit artificial. The man’s face is also somewhat blurry.
Conclusion
The results show that the generative AI model handled shot types moderately well but struggled with camera positions and, above all, with the desired aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the model did not accurately translate the camera positions described in the prompts into the generated images.
- Shot Analysis: The model scored 0.55, which is considered average. This indicates that the model understood the shot types described in the prompts, but not perfectly.
- Aesthetic Analysis: The model scored 0.07, which is considered poor. This means that the aesthetic of the generated images deviated significantly from the one described in the prompts.
Overall, the model needs improvement in capturing the desired camera positions and aesthetic. It performed better on shot types, but still has room for improvement there.
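The per-image metrics listed for each entry can also be summarized directly. The sketch below averages them across the ten images and finds the image with the lowest Prompt Clip Score. The values are copied from the entries above; the tuple layout, shortened titles, and function names are illustrative. These averages are separate from the three scores in the breakdown, whose calculation is not described here.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, clip_score, ai_likelihood)
# Values copied from the image entries in this article.
RESULTS = [
    ("One Against Many", 0.7, 6.69, 65, 0.32, 0.80),
    ("Lost in the Fog", 0.6, 6.71, 107, 0.35, 0.70),
    ("Lost in the Neon Maze", 0.7, 6.58, 73, 0.31, 0.90),
    ("Awe-Inspiring Mountaintop Panorama", 0.8, 6.82, 94, 0.30, 0.10),
    ("Silhouettes of Love at Sunset", 0.7, 6.54, 84, 0.33, 0.20),
    ("Campfire Serenity", 0.6, 4.54, 75, 0.32, 0.10),
    ("A White Spaceship", 0.7, 6.40, 76, 0.31, 0.90),
    ("Silhouetted Against the Storm", 0.7, 6.68, 68, 0.35, 0.80),
    ("A City in Twilight", 0.7, 6.37, 104, 0.28, 0.20),
    ("Lost in the Smoke", 0.6, 5.46, 60, 0.37, 0.30),
]

FIELDS = ["aesthetic", "entropy", "noise", "clip_score", "ai_likelihood"]


def summarize(results):
    """Return the mean of each metric across all images."""
    return {
        name: mean(row[i + 1] for row in results)
        for i, name in enumerate(FIELDS)
    }


def lowest_clip(results):
    """Return the title of the image whose prompt match is weakest."""
    return min(results, key=lambda row: row[4])[0]


summary = summarize(RESULTS)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("lowest clip score:", lowest_clip(RESULTS))
```

The lowest Prompt Clip Score (0.28) belongs to the city image, whose prompt asked for a bustling street while the generated scene shows only a few people in the distance.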
Sources:
- https://www.swiff.org/article/crafting-the-tone-and-style-of-a-film
- https://digital-photography-school.com/backlighting-in-photography/
- https://www.studiobinder.com/blog/what-is-chiaroscuro-definition-examples/
- https://infocusfilmschool.com/4-wildly-different-movie-styles-youll-explore-filmmaking-college/
- https://cinepunked.com/2022/09/23/a-quick-guide-to-visual-style/
- https://cinematography.com/index.php?/forums/topic/184-desaturation-techniques/
- https://www.reddit.com/r/Filmmakers/comments/1452afb/colour_grading_an_underrated_factor_in_the/
- https://digital-photography-school.com/rule-of-thirds/
- https://deepmind.google/technologies/imagen-3/