AI Struggles to Capture the 'Dramatic' Aesthetic with Imagen-v2
The ‘dramatic’ aesthetic is a powerful tool in visual storytelling. It combines strong lighting, bold composition, and evocative color to create a sense of intensity, emotion, and impact, and it is widely used in film, photography, and painting to hold the viewer’s attention. But can AI truly understand and replicate this aesthetic? In this blog post, we explore an experiment that tested how well a generative AI model, Imagen-v2, can create images with a ‘dramatic’ aesthetic. We analyze the model’s performance across ten prompts and discuss the challenges of teaching AI to understand and replicate artistic styles.
Created with: imagen-v2
A Lone Figure Silhouetted Against a Fiery Sunset
A solitary figure stands on a rocky outcrop, gazing out at a breathtaking sunset over a vast desert landscape. The dramatic use of light and shadow, combined with the vastness of the scene, creates a sense of mystery and grandeur. This image evokes feelings of loneliness, epic adventure, and the allure of the unknown.
Prompt
Expressionist: Epic, determined ; A lone figure, silhouetted against a blazing sunset; wide shot; Heroism; A vast, desolate landscape with towering mountains in the distance; cinematic
Characteristic
Shot : A lone figure, presumably a cowboy, stands on a rocky cliff overlooking a vast desert landscape at sunset. The sky is a vibrant orange and yellow, with wispy clouds. The figure is wearing a wide-brimmed hat and a long coat, and is holding a stick or staff. The desert landscape is rough and rocky, with a mesa in the background and snow-capped peaks in the distance.
Aesthetic Score : 0.7
Mood : melancholy, solitude, dramatic
Quality
Entropy : 6.84
Noise : 116
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, particularly in the clouds and the figure’s clothing, as well as in the background mountains. The colors are also quite saturated and may appear artificial to some viewers.
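The post does not say how the Entropy value in the Quality block is computed. A common choice, and the one assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 to 8 bits. Under that assumption, values between 6 and 7 would indicate a wide spread of tones. A minimal sketch (the function name and the toy pixel data are hypothetical and not taken from the experiment):

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every gray level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy data only: a single flat tone versus every 8-bit level used once.
flat = [128] * 64
uniform = list(range(256))

print(shannon_entropy(flat))     # lowest possible value
print(shannon_entropy(uniform))  # highest possible value for 8-bit data
```

A real image would first be converted to grayscale and flattened into such a list before being passed in.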
Lost in the Mist: A Cobblestone Path Beckons
A misty forest path, shrouded in an ethereal fog, invites you to explore its secrets. The cobblestone road winds its way into the distance, promising adventure and mystery. The atmosphere is both eerie and captivating, leaving you wondering what lies beyond the veil of mist.
Prompt
Expressionist: Mysterious, suspenseful ; A winding, cobblestone path disappearing into a dense, swirling fog; low-angle shot; Adventure; A dark, foreboding forest with gnarled trees and flickering shadows; cinematic
Characteristic
Shot : A cobblestone path disappearing into a misty forest. The trees are tall and bare with gnarled branches, giving the scene an eerie and haunting atmosphere.
Aesthetic Score : 0.7
Mood : eerie, mysterious, dark
Quality
Entropy : 6.66
Noise : 106
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurring towards the edges, possibly due to lens distortion or post-processing.
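The Prompt Clip Score is not defined in the post. CLIP-based scores are usually the cosine similarity between an embedding of the prompt text and an embedding of the image, so that is what we assume here. The sketch below shows only the similarity step. The vectors are toy placeholders standing in for real CLIP embeddings, not outputs of any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; real CLIP vectors have hundreds of dimensions.
prompt_embedding = [0.2, 0.7, 0.1, 0.4]
image_embedding = [0.6, 0.1, 0.5, 0.3]

print(round(cosine_similarity(prompt_embedding, image_embedding), 2))
```

Because the score measures direction rather than magnitude, identical directions give 1 and unrelated (orthogonal) ones give 0.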
Fierce Gaze, City Lights: A Portrait of Strength
A close-up portrait captures a woman’s intense expression, her gaze piercing through the blurred cityscape behind her. The warm and cool tones of the background create a dramatic atmosphere, highlighting the mystery and power radiating from her.
Prompt
Expressionist: Intense, futuristic ; A pixelated character, illuminated by the glow of a computer screen; close-up; Gaming; A chaotic, neon-lit cityscape with flashing lights and distorted reflections; cinematic
Characteristic
Shot : A woman with a determined expression, with an intense red and blue glow around her.
Aesthetic Score : 0.7
Mood : dark, mysterious, powerful
Quality
Entropy : 6.35
Noise : 87
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some digital noise and artifacts are visible in the image.
A City’s Soul: Mystery and Majesty in a Crowded Cathedral
A grand cathedral, bathed in the play of light and shadow, stands as a beacon of mystery and majesty. A throng of people gathers before its imposing facade, their presence adding to the sense of awe and intrigue. The towering buildings of the city rise in the background, creating a dramatic backdrop for this captivating scene.
Prompt
Expressionist: Awe-inspiring, spiritual ; A towering, ancient cathedral bathed in the golden light of dawn; high-angle shot; Tourism; A bustling, crowded marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A large gothic cathedral stands in the middle of a bustling marketplace. People are gathered in the square, some buying and selling goods, while others are just enjoying the lively atmosphere.
Aesthetic Score : 0.7
Mood : medieval, mystical, grand
Quality
Entropy : 6.72
Noise : 80
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image looks slightly blurry or pixelated, which could be due to compression or to AI generation. The details in the crowd are very repetitive: the figures are mostly uniform in pose and have no individual features. The clouds look too perfect and artificial.
Dreamlike Journey Through Abstract Mountains
A train glides through a surreal landscape, where swirling, abstract mountains create a sense of depth and scale. The dreamy, melancholic mood is enhanced by the use of color and perspective, leaving a lasting impression of wonder and introspection.
Prompt
Expressionist: Surreal, disorienting ; A train speeding through a surreal, dreamlike landscape; long shot; Travel; A distorted, abstract landscape with swirling colors and shifting shapes; cinematic
Characteristic
Shot : A train traveling through a surreal landscape that mixes mountains with a flowing liquid.
Aesthetic Score : 0.6
Mood : dreamy, surreal, mysterious
Quality
Entropy : 6.97
Noise : 121
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has a somewhat blurry and pixelated look, which can be attributed to the painterly style. Some brushstroke patterns repeat and create a sense of artificiality. The train lacks realistic detail: its colors and lines look flat and unrealistic, and there are no shadows.
A Candlelit Gathering: Shadows and Secrets in a Rustic Kitchen
A group of four figures huddle around a solitary candle, their faces shrouded in shadow. The dimly lit, rustic kitchen evokes a sense of history and melancholy, hinting at a shared secret or a moment of profound intimacy. The flickering candle, a beacon of hope in the darkness, adds a touch of mystery and intrigue to this evocative scene.
Prompt
Expressionist: Intimate, melancholic ; A family huddled together in a dimly lit room, their faces illuminated by flickering candlelight; close-up; Family; A cramped, cluttered room with faded wallpaper and worn furniture; cinematic
Characteristic
Shot : A dimly lit room with four figures sitting around a table. There is a single candle providing light, illuminating their faces and hands. The room is sparsely furnished with a few items visible in the background. The overall atmosphere is quiet and intimate.
Aesthetic Score : 0.7
Mood : dark, intimate, suspenseful
Quality
Entropy : 6.14
Noise : 109
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image contains some noise and grain, possibly introduced during post-processing. The shadows appear somewhat unnatural and exaggerated, creating a slightly unrealistic feel.
A Solitary Figure Against the Fury of the Sea
A dramatic scene unfolds as a lone figure stands on a rocky cliff, dwarfed by the raging storm. The crashing waves and brooding sky create a powerful sense of nature’s dominance, leaving the figure feeling insignificant yet strangely connected to the raw energy of the sea.
Prompt
Expressionist: Dramatic, contemplative ; A lone figure standing on a precipice, gazing out at a stormy sea; medium shot; Heroism; A dramatic, stormy seascape with crashing waves and swirling clouds; cinematic
Characteristic
Shot : A lone figure stands on a rocky cliff overlooking a stormy sea. Dramatic, stormy clouds fill the sky.
Aesthetic Score : 0.8
Mood : dramatic, solitary, powerful
Quality
Entropy : 6.49
Noise : 97
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable image errors.
A Flickering Candle in the Dark
A mysterious, stone-lined hallway bathed in the eerie glow of a single candle. The wet floor reflects the flickering light, creating a dramatic contrast and highlighting the texture of the stone. The scene evokes a sense of contemplation and unease.
Prompt
Expressionist: Confusing, suspenseful ; A labyrinthine maze of twisting corridors and flickering lights; low-angle shot; Adventure; A dark, claustrophobic dungeon with dripping water and eerie shadows; cinematic
Characteristic
Shot : A dimly lit, stone-walled passageway with a staircase leading upwards. The only light source is a candle flickering at the bottom of the steps, casting warm light on the wet stone floor.
Aesthetic Score : 0.7
Mood : mysterious, eerie, foreboding
Quality
Entropy : 6.00
Noise : 103
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have some slight artifacts in the stonework, particularly around the edges. The lighting could be more consistent, with some areas appearing overly bright or dark.
Escaping Reality: A Journey into Psychedelic Wonder
A person immersed in a vibrant virtual world, their VR headset displaying a colorful fantasy landscape. The scene evokes a sense of surreal wonder and escape, with the interplay of light and shadow adding depth and intrigue.
Prompt
Expressionist: Immersive, futuristic ; A virtual reality headset, displaying a vibrant, pixelated world; close-up; Gaming; A distorted, abstract landscape with swirling colors and shifting shapes; cinematic
Characteristic
Shot : A person wearing a VR headset, looking into a virtual world of an alien landscape, with a colorful and abstract background.
Aesthetic Score : 0.7
Mood : mysterious, futuristic, surreal
Quality
Entropy : 6.32
Noise : 94
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 1.00
Image errors : The image has some minor artifacts, particularly in the background. The colors are also a bit oversaturated, which can make the image look artificial. The blurriness of the subject’s face makes it look less realistic.
Dreamy Cityscape: A Vibrant Canvas of Chaos
This stylized cityscape bursts with color and energy, capturing a chaotic crowd against a backdrop of towering buildings. The painting’s strong sense of depth and movement draws you into the heart of the scene, leaving you with a feeling of both wonder and unease.
Prompt
Expressionist: Chaotic, overwhelming ; A bustling, crowded street scene, with people rushing past in a blur; long shot; Tourism; A distorted, abstract cityscape with exaggerated buildings and swirling colors; cinematic
Characteristic
Shot : A busy street scene with tall buildings on either side and a vivid yellow and blue sky. The street is crowded with people. The artwork is painted in a vibrant and expressive style, with bold colors and thick brushstrokes.
Aesthetic Score : 0.6
Mood : busy, vibrant, dramatic
Quality
Entropy : 6.75
Noise : 120
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are slight artifacts in the sky, and the brushstrokes are not always smooth. The people in the foreground are very blurry.
Conclusion
The results show that the generative AI model did reasonably well at reproducing the described scenes, was only average at matching the requested camera position, and struggled with the aesthetic style. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered average. This means the camera position in the generated images was somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.61, which is considered good. This indicates the model was able to understand the scenes described in the prompts and create shots that were relatively close to what was expected.
- Aesthetic Analysis: The model scored 0.04, which is considered poor. This suggests the aesthetic style of the generated images was significantly different from the desired style.
Overall, the model seems to be better at understanding the scene and camera position than it is at capturing the desired aesthetic.
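For readers who want to compare the ten images side by side, the per-image values reported above can be averaged with a few lines of Python. This is only a convenience sketch: the field names and the 0.5 cut-off for "likely AI" are our own assumptions, and these averages are separate from the three scores in the breakdown, whose calculation the experiment does not describe.

```python
from statistics import mean

# Per-image values copied from the post, in order of appearance:
# (aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
IMAGES = [
    (0.7, 6.84, 116, 0.34, 0.90),  # sunset silhouette
    (0.7, 6.66, 106, 0.33, 0.20),  # misty cobblestone path
    (0.7, 6.35, 87, 0.32, 0.90),   # city-lights portrait
    (0.7, 6.72, 80, 0.30, 0.90),   # cathedral and marketplace
    (0.6, 6.97, 121, 0.33, 0.80),  # train in abstract mountains
    (0.7, 6.14, 109, 0.27, 0.70),  # candlelit gathering
    (0.8, 6.49, 97, 0.33, 0.20),   # figure above a stormy sea
    (0.7, 6.00, 103, 0.28, 0.80),  # candle in a stone hallway
    (0.7, 6.32, 94, 0.33, 1.00),   # VR headset
    (0.6, 6.75, 120, 0.33, 0.20),  # painted cityscape
]

# Hypothetical field names, chosen for this sketch.
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(rows):
    """Return the mean of each column, rounded to three decimals."""
    columns = zip(*rows)
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


summary = summarize(IMAGES)
# The 0.5 threshold is an arbitrary cut-off, not part of the experiment.
flagged = sum(1 for row in IMAGES if row[4] >= 0.5)

print(summary)
print(f"{flagged} of {len(IMAGES)} images have a likelihood of AI of 0.5 or higher")
```

The per-image aesthetic scores cluster tightly between 0.6 and 0.8, so most of the variation between images shows up in the likelihood-of-AI and noise columns instead.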
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://deepmind.google/technologies/imagen-2/