AI's Artistic Struggle: Capturing the 'Dramatic' Aesthetic with Imagen-v2
10-minute read, 1923 words
The ‘dramatic’ aesthetic, characterized by strong contrasts, heightened emotions, and a sense of grandeur, is a powerful tool in visual storytelling. But can AI truly capture it? Experiments with Imagen-v2 show promising results in the model’s understanding of the described scene, weaker results on camera position, and a significant gap in capturing the desired aesthetic. This article explores the challenges and opportunities in AI’s journey to master the art of the dramatic.
Created with: imagen-v2
Silhouettes of Solitude: A Dramatic Sunset on the Mountain Peak
Two figures stand in stark silhouette against a vibrant orange sunset, their smallness emphasized by the vast expanse of white clouds below. The scene evokes a sense of serene isolation and awe-inspiring beauty.
Prompt
Naturalistic: Epic, triumphant ; A lone figure, silhouetted against the setting sun, standing atop a mountain peak; wide shot; Heroism; Majestic mountain range with clouds swirling around the peak; cinematic
Characteristic
Shot : Two figures stand on a mountain peak, looking out at a sea of clouds below and a colorful sunset sky above.
Aesthetic Score : 0.8
Mood : serene, majestic, inspiring
Quality
Entropy : 6.46
Noise : 100
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise and compression artifacts present, particularly in the darker areas.
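The entries do not state how the Entropy figure is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (every gray level equally likely). Under that assumption, values between 6 and 7, like those reported in this article, would indicate a fairly rich tonal distribution. The function name and the sample pixel lists below are illustrative only and are not taken from the article's images.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of gray levels (0-255)."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total              # relative frequency of this gray level
        entropy -= p * math.log2(p)    # contribution of this level, in bits
    return entropy


# Hypothetical pixel data, used only to show the two extremes of the scale.
flat_image = [128] * 1024              # one gray level only
full_range = list(range(256)) * 4      # every gray level equally often

print(shannon_entropy(flat_image))
print(shannon_entropy(full_range))
```

The two calls mark the ends of the scale: the flat image has zero entropy and the evenly spread one reaches the 8-bit maximum.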
A Lifetime of Stories in Every Line
A close-up portrait of an older man, his weathered face framed by lush greenery, captures a lifetime of adventure and mystery. His intense gaze and the dramatic lighting evoke a sense of ruggedness and experience.
Prompt
Naturalistic: Intriguing, adventurous ; A weathered explorer, their face etched with determination, peering through dense jungle foliage; close-up; Adventure; Lush, vibrant rainforest with sunlight filtering through the canopy; cinematic
Characteristic
Shot : A close-up portrait of a man wearing a safari hat and a khaki shirt, standing in a jungle setting.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, rugged
Quality
Entropy : 6.82
Noise : 73
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has some minor artifacts, particularly in the background foliage, which appears slightly blurry and unnatural. The lighting on the man’s face is slightly uneven, creating some harsh shadows.
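Each entry reports a Prompt Clip Score without defining it. A CLIP-style score is usually the cosine similarity between an embedding of the image and an embedding of the prompt text. The sketch below assumes that definition and uses short made-up vectors in place of real CLIP embeddings, which would come from a model that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings; real CLIP embeddings are much longer.
image_embedding = [0.2, 0.7, 0.1, 0.4]
prompt_embedding = [0.6, 0.1, 0.5, 0.2]

print(round(cosine_similarity(image_embedding, prompt_embedding), 2))
```

If the article's score is indeed a raw cosine similarity, values near 0.3 are not necessarily poor: raw CLIP similarities for well-matched image and text pairs commonly sit well below 1.0.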
Lost in the Game: A Moment of Focused Intensity
A young man, bathed in warm light, sits engrossed in a video game. His headphones isolate him from the world, his gaze fixed on the vibrant screen. The low light and his determined expression create a sense of suspense and focus, capturing the thrill of the game.
Prompt
Naturalistic: Focused, intense ; A gamer’s hands, illuminated by the glow of a monitor, rapidly manipulating a controller; close-up; Gaming; A dimly lit room with gaming posters and peripherals scattered around; cinematic
Characteristic
Shot : A young man in a dimly lit room is sitting in a chair, playing a video game. He is wearing headphones, and he is focused on the game.
Aesthetic Score : 0.6
Mood : intense, focused, suspenseful
Quality
Entropy : 6.20
Noise : 85
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight artifacts are present in the image.
A Vibrant Tapestry of Old and New: Exploring a Bustling Marketplace
Step into a world of vibrant colors and bustling energy in this historic marketplace. The interplay of light and shadow adds depth and intrigue, while the architectural details whisper tales of a rich past. Discover the charm of this lively scene, where vendors hawk their wares and locals mingle amidst the colorful umbrellas.
Prompt
Naturalistic: Energetic, vibrant ; A bustling marketplace in a foreign city, filled with vibrant colors and exotic goods; wide shot; Tourism; A bustling street with traditional architecture and locals going about their day; cinematic
Characteristic
Shot : A bustling marketplace in a foreign city, with colorful stalls, vibrant goods, and a diverse crowd.
Aesthetic Score : 0.6
Mood : vibrant, chaotic, exotic
Quality
Entropy : 6.68
Noise : 103
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have been processed with a filter or style transfer algorithm, which has introduced some artifacts and distortions, particularly in the edges and textures. The overall resolution is low, which makes the details less clear.
Solitude in the Setting Sun
A lone figure stands on a sand dune, dwarfed by the vastness of the desert landscape. The setting sun paints the sky in vibrant hues, casting long shadows across the undulating dunes. A sense of tranquility and solitude permeates the scene, evoking a feeling of awe and wonder.
Prompt
Naturalistic: Solitude, contemplative ; A lone traveler, gazing out at a vast, open desert landscape; medium shot; Travel; A desolate desert with sand dunes stretching as far as the eye can see; cinematic
Characteristic
Shot : A lone figure stands on a sand dune, looking out at a vast, rolling desert landscape under a blue and cloudy sky. The sun is setting, casting long shadows.
Aesthetic Score : 0.8
Mood : serene, contemplative, lonely
Quality
Entropy : 6.52
Noise : 79
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : No apparent artifacts or errors.
Tranquility Under the Milky Way
A group of friends gathers around a crackling campfire, bathed in the ethereal glow of the Milky Way. The majestic mountains provide a breathtaking backdrop, evoking a sense of awe and wonder. This serene scene captures the essence of tranquility and nostalgia, inviting you to escape into the beauty of the night.
Prompt
Naturalistic: Warm, nostalgic ; gathered around a campfire, sharing stories and laughter; medium shot; A cozy campsite under a starry night sky with a crackling fire in the foreground; cinematic
Characteristic
Shot : Four people sitting around a campfire at night in a mountainous area, with the Milky Way visible in the sky.
Aesthetic Score : 0.7
Mood : cozy, contemplative, serene
Quality
Entropy : 6.13
Noise : 116
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image appears to have some digital noise, particularly in the sky. There is also some blurring around the edges of the figures.
Lost in the Majesty: A Hiker Finds Perspective on a Mountain Trail
A solitary hiker traverses a narrow path, dwarfed by the towering peaks and expansive views. The tranquil scene evokes a sense of adventure and the humbling vastness of nature.
Prompt
Naturalistic: Challenging, determined ; A lone hiker, navigating a treacherous mountain path; medium shot; Heroism; A rugged mountain trail with steep cliffs and breathtaking views; cinematic
Characteristic
Shot : A lone hiker traverses a narrow trail along the side of a rocky mountain ridge, with a valley below. The sky is overcast with clouds.
Aesthetic Score : 0.7
Mood : solitude, adventurous, rugged
Quality
Entropy : 6.59
Noise : 105
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
Lost in the Digital Realm: VR Experience Captures the Imagination
Three figures stand shrouded in shadow, their faces obscured by VR headsets. The dramatic lighting and industrial setting create a sense of mystery and intrigue, hinting at a futuristic world of immersive experiences.
Prompt
Naturalistic: Excited, immersive ; A group of friends, their faces lit by the screen of a VR headset, immersed in a virtual world; close-up; Gaming; A dimly lit room with VR headsets and controllers scattered around; cinematic
Characteristic
Shot : Three people wearing VR headsets are in a dimly lit room, possibly a studio or workshop.
Aesthetic Score : 0.7
Mood : futuristic, mysterious, immersive
Quality
Entropy : 6.12
Noise : 97
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some slight artifacts around the edges of the VR headsets and some noise in the shadows.
Cityscape Sunset: A Symphony of Light and Shadow
An aerial view captures the urban sprawl bathed in the golden hues of a setting sun. The towering skyscrapers cast long shadows across the vast cityscape, creating a dramatic contrast that evokes a sense of calm and grandeur.
Prompt
Naturalistic: Energetic, cosmopolitan ; A panoramic view of a bustling city skyline, captured from a rooftop; wide shot; Tourism; A vibrant city with towering skyscrapers and bustling streets below; cinematic
Characteristic
Shot : Aerial view of a city skyline, likely New York City, with numerous skyscrapers in the background, a sunset sky, and a body of water in the distance.
Aesthetic Score : 0.7
Mood : urban, dramatic, nostalgic
Quality
Entropy : 6.96
Noise : 101
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image shows some signs of digital manipulation and post-processing.
Tranquil Journey Through Rolling Hills
A winding road snakes through lush green hills under a bright blue sky, creating a sense of peace and vastness. A single car journeys along the road, emphasizing the perspective and journey ahead. This serene scene evokes a feeling of tranquility and escape.
Prompt
Naturalistic: Peaceful, nostalgic ; A family driving down a scenic highway, with rolling hills and fields passing by; medium shot; Travel; A winding highway with lush green fields and distant mountains in the background; cinematic
Characteristic
Shot : A winding road through a green valley. The road curves through rolling hills and fields. In the distance, a mountain range can be seen.
Aesthetic Score : 0.7
Mood : peaceful, tranquil, scenic
Quality
Entropy : 6.61
Noise : 95
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a little blurry. There are some artifacts in the grass near the bottom of the image.
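The per-image figures reported in the ten entries above can be summarised with a few lines of Python. The values are transcribed from the entries in order of appearance; the dictionary layout and key names are just one way to organise them.

```python
from statistics import mean

# Values copied from the ten entries, in order of appearance.
metrics = {
    "aesthetic_score": [0.8, 0.6, 0.6, 0.6, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7],
    "prompt_clip_score": [0.32, 0.30, 0.33, 0.30, 0.34, 0.35, 0.31, 0.31, 0.28, 0.31],
    "likelihood_of_ai": [0.20, 0.70, 0.20, 0.80, 0.10, 0.60, 0.20, 0.30, 0.20, 0.20],
    "entropy": [6.46, 6.82, 6.20, 6.68, 6.52, 6.13, 6.59, 6.12, 6.96, 6.61],
    "noise": [100, 73, 85, 103, 79, 116, 105, 97, 101, 95],
}

# (minimum, mean rounded to 3 decimals, maximum) for each metric.
summary = {
    name: (min(values), round(mean(values), 3), max(values))
    for name, values in metrics.items()
}

for name, (low, avg, high) in summary.items():
    print(f"{name}: min={low}, mean={avg}, max={high}")
```

On these numbers the Aesthetic Score averages 0.69 and the Prompt Clip Score about 0.315. The Likelihood of AI varies most, from 0.10 for the desert image to 0.80 for the marketplace.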
Conclusion
The results show that the model understood the described scenes reasonably well, fell short on camera position, and struggled most with the aesthetic. Here’s a breakdown:
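- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the images often did not match the camera position requested in the prompt. For example, the gaming prompt asked for a close-up of a gamer’s hands, while the generated shot shows a young man sitting in a chair.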
- Shot Analysis: The model scored 0.55, which falls within the “good” range. This indicates that the model was able to understand the scene described in the prompt and create an image that reflects it reasonably well.
- Aesthetic Analysis: The model scored 0.06 against a “very good” range quoted as -0.2 to 0.1. As printed, the score falls inside that range, so the range or the score is likely misreported. The evaluation nonetheless treats the generated images’ aesthetic as deviating significantly from the one described in the prompt.
Overall, the model shows promise in understanding the scene, but needs improvement in matching the requested camera position and, above all, in capturing the desired aesthetic.
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://deepmind.google/technologies/imagen-2/