AI Struggles to Capture the 'Dramatic' Aesthetic with Imagen-v3
The ‘dramatic’ aesthetic is a powerful tool in visual storytelling. It evokes strong emotions, creates a sense of tension, and draws the viewer into the scene. But can AI truly understand and capture this aesthetic? Recent experiments with Imagen-v3 suggest that while the model follows scene descriptions reasonably well, it is less reliable with camera position and struggles to capture the nuances of a desired aesthetic. This blog post explores the challenges and potential solutions for improving AI’s ability to understand and generate images with specific aesthetics, focusing on the ‘dramatic’ style. We’ll examine examples of how this aesthetic is used in film, photography, and art, and discuss how AI can be trained to better understand and replicate it.
Created with: imagen-v3
A Lone Figure in the Ruins of Hope
A solitary figure stands amidst a desolate, post-apocalyptic landscape, bathed in the warm glow of a setting sun. The scene evokes a sense of melancholy and solitude, yet the light hints at a glimmer of hope amidst the ruins.
Prompt
style-aesthetic Postmodern: Epic, melancholic ; A lone figure, silhouetted against a blazing sunset; wide shot; Heroism; A vast, desolate landscape with a crumbling cityscape in the distance; cinematic
Characteristic
Shot : A lone figure stands in a desolate, post-apocalyptic landscape. The sun is setting, casting a warm orange glow over the cracked earth and the distant city skyline.
Aesthetic Score : 0.7
Mood : melancholy, hope, solitude
Quality
Entropy : 6.88
Noise : 75
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : No noticeable artifacts or errors.
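The prompts in this post appear to follow a semicolon-separated pattern: a style-aesthetic tag with mood words, then a subject, a shot type, a theme, a setting, and a closing "cinematic" keyword. A minimal sketch of assembling such a prompt is shown here. The function name `build_prompt` and its field names are assumptions made for illustration; they are not part of Imagen or of the tooling used for this post.

```python
def build_prompt(aesthetic, moods, subject, shot, theme, setting, finish="cinematic"):
    """Join prompt fields in the order seen in the example prompts of this post.

    The field names are hypothetical labels, not an official prompt schema.
    """
    # The example prompts keep a space before the first semicolon.
    style = f"style-aesthetic {aesthetic}: {', '.join(moods)} "
    return "; ".join([style, subject, shot, theme, setting, finish])


prompt = build_prompt(
    aesthetic="Postmodern",
    moods=["Epic", "melancholic"],
    subject="A lone figure, silhouetted against a blazing sunset",
    shot="wide shot",
    theme="Heroism",
    setting="A vast, desolate landscape with a crumbling cityscape in the distance",
)
print(prompt)
```

Keeping the fields separate makes it easy to vary one of them, for example the shot type, while holding the rest of the prompt fixed.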
Hand From the Digital Realm Reaches Out
A mysterious hand emerges from a computer screen, blurring the lines between reality and the digital world. This cyberpunk-inspired scene evokes a sense of intrigue and wonder, leaving viewers questioning what lies beyond the screen.
Prompt
style-aesthetic Postmodern: Surreal, playful ; A hand reaching out from a pixelated, digital world, grasping at a real-world object; close-up; Gaming; A cluttered desk with a gaming console and controllers; cinematic
Characteristic
Shot : A hand reaching out of a computer screen, seemingly from a digital world, on a desk with gaming accessories
Aesthetic Score : 0.7
Mood : futuristic, mysterious, cyberpunk
Quality
Entropy : 6.14
Noise : 62
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.90
Image errors : The hand and the screen show some blurring and pixelation, the lighting and shadows are inconsistent, and the scene has a slight plastic look.
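The Entropy values listed under Quality are not defined in this post. Since they all sit below 8, one plausible reading is the Shannon entropy, in bits, of an 8-bit grayscale histogram. That reading is an assumption. Under it, the measure could be sketched as follows, using toy pixel lists rather than the actual images:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum of p * log2(1 / p) over every intensity level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy data, not taken from the images in this post.
flat = [128] * 256          # a single gray level: no variation at all
uniform = list(range(256))  # every 8-bit level used equally often

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this reading, values between 6 and 7 would indicate images that use a wide spread of tonal levels, without saying anything about whether the result looks dramatic.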
Sun-Kissed Mystery in the City Square
A young man, his face obscured by white sunglasses reflecting the bustling city square, stands before a grand church. The sun bathes the scene in a warm glow, adding to the air of intrigue and mystery. This urban landscape whispers of secrets waiting to be unveiled.
Prompt
style-aesthetic Postmodern: Alienated, detached, cynical ; A lone figure, sunglasses reflecting the blinding glare of the sun, stands amidst a throng of tourists, their faces obscured by the same oversized shades. The iconic landmark looms behind, dwarfed by the human sea.; cinematic
Characteristic
Shot : A young man wearing white sunglasses stands in a crowded square in front of a large, imposing church. The sun is shining, and there are many people walking around.
Aesthetic Score : 0.6
Mood : mysterious, cool, urban
Quality
Entropy : 6.44
Noise : 89
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
Lost in Time: A Vintage Travel Scene
Step back in time with this nostalgic scene, featuring vintage suitcases, a fedora hat, and maps. A colorful poster, partially covering a suitcase, draws the eye, while dramatic lighting adds depth and mystery, evoking the spirit of classic adventure films.
Prompt
style-aesthetic Postmodern: Nostalgic, melancholic ; A vintage travel poster, faded and torn, with a romanticized image of a foreign land; close-up; Travel; A dusty, cluttered attic filled with old suitcases and maps; cinematic
Characteristic
Shot : A vintage travel scene with suitcases, a hat, and maps. The main focal point is a colorful vintage poster that is partly covering a suitcase.
Aesthetic Score : 0.7
Mood : nostalgic, vintage, adventurous
Quality
Entropy : 6.61
Noise : 95
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight blurring and a few minor artifacts, but these are not very noticeable and do not detract from the overall aesthetic of the image.
Immersed in the Future: VR Gaming Takes Center Stage
A vibrant scene captures the thrill of VR gaming, as young people engage with a virtual world. Neon lights and futuristic decor create an atmosphere of wonder and immersion, highlighting the excitement and playful nature of this cutting-edge technology.
Prompt
style-aesthetic Postmodern: Energetic, futuristic ; A group of friends, their faces obscured by digital avatars, playing a virtual reality game; medium shot; Gaming; A brightly lit, futuristic arcade with neon lights and holographic displays; cinematic
Characteristic
Shot : A group of young people are wearing VR headsets and interacting with a virtual world. The setting appears to be a gaming arcade or entertainment center, with bright neon lights and futuristic decor.
Aesthetic Score : 0.6
Mood : futuristic, immersive, playful
Quality
Entropy : 6.64
Noise : 75
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and some of the colors appear oversaturated. The lighting is uneven, creating some dark areas in the scene.
Lost in the Crowd: A Moment of Solitude in the Airport
A lone traveler navigates the bustling airport terminal, his suitcase trailing behind him. The scene, captured from behind, evokes a sense of calm amidst the chaos, suggesting themes of travel, journey, and new beginnings.
Prompt
style-aesthetic Postmodern: Lonely, alienated ; A lone traveler, their back to the camera, walking through a crowded airport terminal; long shot; Travel; A chaotic airport terminal with people rushing and luggage carts; cinematic
Characteristic
Shot : A lone traveler walks through a crowded airport terminal, pulling a suitcase behind him. The scene is captured from behind the traveler, looking towards the front of the terminal. The terminal is brightly lit, and the atmosphere is bustling and slightly chaotic.
Aesthetic Score : 0.6
Mood : solitary, calm, urban
Quality
Entropy : 6.65
Noise : 99
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors are evident in the image. The image quality is good, and the lighting is balanced. There are some minor imperfections in the background elements, but they are barely noticeable.
A Moment of Quiet Contemplation: A Family Portrait in Soft Light
This intimate family portrait captures a moment of quiet contemplation, bathed in soft light and muted colors. The modern interior setting, with a large window in the background, adds a sense of depth and mystery to the scene. The family members, dressed in casual clothing and holding small objects, exude a sense of connection and shared experience.
Prompt
style-aesthetic Postmodern: Reflective, nostalgic ; A family portrait, with each member holding a different, iconic object from their travels; medium shot; Family; A minimalist, modern living room with a large window overlooking a cityscape; cinematic
Characteristic
Shot : A family portrait set in a modern interior with a large window in the background. The family members are all dressed in casual clothing and are holding small objects. The overall mood of the image is one of quiet contemplation.
Aesthetic Score : 0.7
Mood : serious, contemplative, intimate
Quality
Entropy : 6.72
Noise : 95
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable errors in the image.
Lost in the Woods, Guided by a Flickering Screen
A lone hand clutches a smartphone, its screen illuminating a map app in the heart of a shadowy forest. A red pin marks the destination, but the path ahead remains shrouded in mystery. Is this a journey of adventure or a descent into the unknown?
Prompt
style-aesthetic Postmodern: Intriguing, suspenseful ; A hand holding a smartphone, displaying a map with a pin dropped on a remote, unknown location; close-up; Adventure; A dark, mysterious forest with dense foliage and shadows; cinematic
Characteristic
Shot : A hand holding a smartphone with a map app open, in a dark forest setting. The map shows a red pin on a location, suggesting the person is looking for directions.
Aesthetic Score : 0.4
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 6.05
Noise : 67
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be generated with an AI model, with some artifacts visible in the foliage and the hand. The map app is not fully rendered, and the text on the map is blurry.
One Hero Stands Against the Ashes
A lone superhero, silhouetted against a fiery cityscape, embodies hope and resilience in the face of utter devastation. This powerful image captures the essence of heroism in a world consumed by darkness.
Prompt
style-aesthetic Postmodern: Desolate, hopeful ; A superhero, their costume ripped and tattered, standing on a rooftop overlooking a city in chaos; wide shot; Heroism; A dystopian cityscape with crumbling buildings and smoke in the air; cinematic
Characteristic
Shot : A lone superhero stands on a rooftop overlooking a post-apocalyptic cityscape engulfed in fire and smoke.
Aesthetic Score : 0.6
Mood : dark, ominous, heroic
Quality
Entropy : 6.74
Noise : 84
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some minor artifacts and errors in the image, particularly in the smoke and flames, which appear a bit blurry and unrealistic.
The Last Sentinel: A Robot Stands Guard in a Dystopian City
A solitary robot, clad in military garb, stands defiantly in the heart of a blurred, futuristic cityscape. Its outstretched arms and dramatic pose evoke a sense of isolation and tension, hinting at a world where humanity and technology have collided in a dystopian future.
Prompt
style-aesthetic Postmodern: Surreal, humorous ; A vintage video game character, rendered in a hyper-realistic style, standing in a real-world environment; medium shot; Gaming; A bustling city street with people and traffic; cinematic
Characteristic
Shot : A robot in a military uniform stands in the middle of a city street. The background is blurry and stylized, and the image resembles a 3D rendering.
Aesthetic Score : 0.5
Mood : futuristic, dystopian, robotic
Quality
Entropy : 6.69
Noise : 98
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 1.00
Image errors : The image looks very artificial. The robot looks flat and unrealistic, and the background is too blurry and does not seem to match the style of the robot.
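The per-image figures reported in the cards above can be summarized in a few lines. The values below are copied from those cards. The 0.8 cutoff for counting an image as "flagged as likely AI" is an arbitrary assumption chosen for illustration, not a threshold used by the evaluation itself.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the cards.
images = [
    ("A Lone Figure in the Ruins of Hope", 0.7, 0.30, 0.90),
    ("Hand From the Digital Realm Reaches Out", 0.7, 0.34, 0.90),
    ("Sun-Kissed Mystery in the City Square", 0.6, 0.32, 0.20),
    ("Lost in Time: A Vintage Travel Scene", 0.7, 0.32, 0.20),
    ("Immersed in the Future: VR Gaming Takes Center Stage", 0.6, 0.33, 0.20),
    ("Lost in the Crowd: A Moment of Solitude in the Airport", 0.6, 0.29, 0.20),
    ("A Moment of Quiet Contemplation", 0.7, 0.28, 0.20),
    ("Lost in the Woods, Guided by a Flickering Screen", 0.4, 0.36, 0.80),
    ("One Hero Stands Against the Ashes", 0.6, 0.34, 0.80),
    ("The Last Sentinel", 0.5, 0.32, 1.00),
]

FLAG_CUTOFF = 0.8  # assumed cutoff, for illustration only

mean_aesthetic = mean(row[1] for row in images)
mean_clip = mean(row[2] for row in images)
flagged = [row[0] for row in images if row[3] >= FLAG_CUTOFF]

print(round(mean_aesthetic, 2))  # 0.61
print(round(mean_clip, 2))       # 0.32
print(len(flagged))              # 5
```

Across the ten images, the prompt CLIP scores stay within a narrow band (0.28 to 0.36), while the likelihood-of-AI values split evenly between low (0.20) and high (0.80 and above).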
Conclusion
The results show that the generative AI model understood the scene reasonably well, was weaker at following the camera position, and struggled with the aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating that it is not very good at reacting to camera positions in prompts. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Shot Analysis: The model scored 0.48, just under the 0.5 threshold for a good score, so it understands the scene in a prompt fairly well. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Aesthetic Analysis: The model scored 0.14, which means it is not very good at matching the expected aesthetic. A score between -0.2 and 0.1 would be considered very good.
Overall, the model is better at understanding the scene than the camera position, but it needs improvement in capturing the desired aesthetic.
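The score bands quoted in the breakdown can be written down as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the post does not say whether the band edges are inclusive, and it does not name a band for scores under 0.5, so the "below good" label is a placeholder.

```python
def rate_score(score):
    """Rate a camera-position or shot-analysis score using the bands quoted in the post."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:  # inclusive lower edge is an assumption
        return "good"
    return "below good"  # placeholder label; the post names no band under 0.5


def aesthetic_is_very_good(score):
    """True when an aesthetic score falls in the -0.2 to 0.1 range quoted in the post."""
    return -0.2 <= score <= 0.1  # inclusive edges are an assumption


print(rate_score(0.2))               # below good
print(rate_score(0.48))              # below good
print(aesthetic_is_very_good(0.14))  # False
```

Read against these bands, the shot-analysis score of 0.48 sits just under the threshold for good, and the camera-position score of 0.2 is well under it.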
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://deepmind.google/technologies/imagen-3/