AI's Artistic Struggle: Capturing the Dramatic Aesthetic with Imagen-v3
The dramatic aesthetic, characterized by high contrast, strong lighting, and evocative compositions, is a powerful tool in visual storytelling. It’s often used to create a sense of tension, mystery, or heroism. But can AI truly capture this complex aesthetic? In this blog post, we explore the results of an experiment using a generative AI model to create images with a dramatic aesthetic. We analyze the model’s performance in terms of camera position, shot composition, and overall aesthetic, highlighting both its strengths and weaknesses.
Created with: imagen-v3
Silhouetted Against the Setting Sun: A Soldier’s Lonely Vigil
A lone soldier stands amidst the wreckage of a battlefield, his rifle held tight. The setting sun casts a dramatic silhouette, highlighting the somber mood and the devastating consequences of war.
Prompt
style-aesthetic Gritty realism: Melancholy, determined ; A lone soldier, silhouetted against the setting sun; wide shot; Heroism; a war-torn battlefield littered with debris and the wreckage of tanks; cinematic
Characteristic
Shot : A lone soldier standing in a battlefield, holding a rifle. There are destroyed tanks and debris around him. The sun is setting in the background.
Aesthetic Score : 0.7
Mood : dramatic, somber, war
Quality
Entropy : 6.62
Noise : 81
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors
Lost in the Jungle: A Man’s Worried Gaze
A close-up shot captures the intense worry etched on a man’s face, his eyes locked on the camera. The dimly lit jungle setting amplifies the suspense, leaving viewers questioning his fate.
Prompt
style-aesthetic Gritty realism: Intrigued, apprehensive ; A weathered explorer, their face etched with lines of hardship, peering through a dense jungle canopy; close-up; Adventure; overgrown ruins of an ancient temple; cinematic
Characteristic
Shot : A close-up shot of a man’s face, likely in a jungle setting. He is looking directly at the camera with a worried expression.
Aesthetic Score : 0.7
Mood : intense, suspenseful, worried
Quality
Entropy : 6.35
Noise : 79
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors, the image looks clean and sharp
Focused on the Game, Pizza and Drink in the Background
A dimly lit scene captures a person engrossed in their video game, holding a black controller. The casual setting with a pizza box and a can of drink in the background suggests a relaxed and focused mood.
Prompt
style-aesthetic Gritty realism: Focused, intense ; A gamer’s hands, gripping a worn controller, illuminated by the flickering glow of a monitor; close-up; Gaming; a dimly lit room filled with empty pizza boxes and energy drink cans; cinematic
Characteristic
Shot : A person is playing video games, holding a black controller, with a pizza box and a can of drink in the background.
Aesthetic Score : 0.5
Mood : focused, casual, relaxed
Quality
Entropy : 6.08
Noise : 73
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
Lost in the Desert’s Embrace
A solitary figure stands at the edge of a vast, sun-drenched desert, their back to the camera, lost in contemplation. The stark landscape and the figure’s smallness evoke a sense of loneliness and isolation, while the play of light and shadow adds a touch of mystery.
Prompt
style-aesthetic Gritty realism: Lonely, contemplative ; A weary traveler, their backpack slung over their shoulder, gazing out at a desolate, dusty landscape; medium shot; Tourism; a crumbling roadside diner with faded neon signs; cinematic
Characteristic
Shot : A man stands with his back to the camera, looking out into the desert. He is wearing a backpack and a jacket. There is a building behind him, and the desert stretches out before him.
Aesthetic Score : 0.7
Mood : lonely, contemplative, desolate
Quality
Entropy : 6.51
Noise : 72
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the shadows are a bit too harsh.
Two Strangers, One Train, A Shadow of Doubt
A dimly lit train car, two men facing each other in silence. The windows are blurred, the atmosphere heavy with unspoken tension. This stark and minimalist image evokes a sense of melancholy, suspense, and isolation, leaving the viewer questioning what secrets lie beneath the surface.
Prompt
style-aesthetic Gritty realism: Uneasy, introspective ; A single, flickering overhead bulb casts harsh shadows on the faces of two strangers huddled in a cramped train compartment, the rhythmic clatter of the train echoing through the dark, rain-soaked countryside.; cinematic
Characteristic
Shot : Two men are seated facing each other in a dimly lit train car. The windows are blurry and the atmosphere is somber.
Aesthetic Score : 0.6
Mood : melancholy, suspenseful, isolated
Quality
Entropy : 5.25
Noise : 77
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors. There is a minor amount of noise in the darker areas.
Lost in the City’s Glow
A solitary figure stands in the shadows of a deserted city street, their gaze drawn to a towering, illuminated building in the distance. The building’s warm glow casts a mysterious aura, leaving the viewer to wonder about the secrets it holds.
Prompt
style-aesthetic Gritty realism: Awe, trepidation, isolation ; A lone figure, dwarfed by the towering skyscraper, gazes upwards with a mixture of awe and trepidation. The camera slowly pans up, revealing the building’s imposing facade against the backdrop of a bustling cityscape.; cinematic
Characteristic
Shot : A lone figure stands in a dark, empty city street, looking up at a tall, illuminated building in the distance. The building is a towering structure, possibly a skyscraper, and the light emanating from it is both bright and mysterious.
Aesthetic Score : 0.7
Mood : dark, mysterious, solitary
Quality
Entropy : 6.17
Noise : 87
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has slight artifacts and errors, particularly in the edges of the buildings and the figure’s clothing.
Silhouetted Hero: Firefighter Battles Blaze in Dramatic Scene
A firefighter in full gear stands bravely against a burning building, backlit by the intense flames. The dramatic lighting highlights the silhouette of the hero, emphasizing the danger and urgency of the situation.
Prompt
style-aesthetic Gritty realism: Brave, determined ; A firefighter, their face obscured by smoke, battling a raging inferno; close-up; Heroism; a burning building with flames licking at the sky; cinematic
Characteristic
Shot : A firefighter in full gear stands in front of a burning building. The scene is backlit, with the flames of the fire creating a dramatic glow behind the firefighter. The firefighter is in silhouette, but the details of their gear and helmet are visible.
Aesthetic Score : 0.7
Mood : dramatic, heroic, intense
Quality
Entropy : 6.12
Noise : 67
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Conquering the Fog: Climbers Brave the Mountain’s Perilous Path
A dramatic scene unfolds as climbers ascend a snow-covered mountain path, shrouded in an ominous fog. The wide-angle perspective and deep shadows emphasize the scale and danger of their journey, capturing the essence of mountaineering’s challenges and the allure of the unknown.
Prompt
style-aesthetic Gritty realism: Exhausted, determined ; A group of adventurers, their faces grimy and exhausted, navigating a treacherous mountain pass; wide shot; Adventure; a snow-covered mountain range with jagged peaks; cinematic
Characteristic
Shot : A group of climbers ascend a snow-covered mountain path in a dramatic, foggy mountain landscape.
Aesthetic Score : 0.7
Mood : dramatic, ominous, adventurous
Quality
Entropy : 6.78
Noise : 92
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.80
Image errors : There is a slight blurriness to the image, particularly in the background. The climbers are silhouetted, lacking detail in their faces and gear, and the overall lighting creates a sense of flatness.
The Focus is On
A young man, headphones on, is locked in a battle of concentration. His narrowed eyes and furrowed brow tell a story of intense focus, while the blurred background emphasizes his singular purpose. The air crackles with competitive energy, leaving you wondering what challenge he’s facing.
Prompt
style-aesthetic Gritty realism: Focused, competitive ; A gamer, their eyes glued to the screen, their fingers flying across the keyboard; close-up; Gaming; a dimly lit room filled with computer monitors and gaming peripherals; cinematic
Characteristic
Shot : A young man wearing headphones is focused on typing on a keyboard. He is in a room with another person in the background. There is a computer monitor in the foreground, partially out of focus.
Aesthetic Score : 0.6
Mood : serious, focused, competitive
Quality
Entropy : 6.60
Noise : 84
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight chromatic aberration present around edges, background detail a bit blurry, slightly overexposed
Lost in the Neon Fog: A Solitary Figure Walks Through a Futuristic City
A lone figure traverses a neon-drenched street in a futuristic city shrouded in fog. The wet pavement reflects the vibrant lights, creating a visually striking scene that evokes a sense of isolation and mystery. The figure’s journey into the foggy distance emphasizes the city’s enigmatic nature and the character’s solitary existence.
Prompt
style-aesthetic Gritty realism: Lonely, introspective ; A lone traveler, their suitcase in hand, walking down a deserted street; medium shot; Tourism; a city skyline at night, with neon lights reflecting off the wet pavement; cinematic
Characteristic
Shot : A lone figure walks down a neon-lit street in a futuristic city at night. The city is shrouded in fog and the wet pavement reflects the bright lights, creating a visually striking scene.
Aesthetic Score : 0.7
Mood : futuristic, moody, lonely
Quality
Entropy : 6.58
Noise : 95
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.90
Image errors : The rendering of the neon signs and the fog could be more realistic. There are slight inconsistencies in the reflections on the wet pavement.
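The per-image numbers above can be summarized in a few lines. The following is a minimal Python sketch, not the tooling used for this post. The metric values are copied from the entries above; the short labels, the `ImageResult` type, the helper names, and the 0.5 cutoff for flagging an image as likely AI-generated are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class ImageResult:
    label: str            # hypothetical short label, not the post's title
    aesthetic: float      # "Aesthetic Score"
    entropy: float        # "Entropy"
    noise: int            # "Noise"
    clip: float           # "Prompt Clip Score"
    ai_likelihood: float  # "Likelihood of AI"


# Values transcribed from the ten entries in this post, in order.
RESULTS = [
    ImageResult("soldier", 0.7, 6.62, 81, 0.34, 0.20),
    ImageResult("jungle", 0.7, 6.35, 79, 0.31, 0.20),
    ImageResult("gamer hands", 0.5, 6.08, 73, 0.37, 0.20),
    ImageResult("desert", 0.7, 6.51, 72, 0.33, 0.10),
    ImageResult("train", 0.6, 5.25, 77, 0.35, 0.10),
    ImageResult("city glow", 0.7, 6.17, 87, 0.29, 0.90),
    ImageResult("firefighter", 0.7, 6.12, 67, 0.30, 0.20),
    ImageResult("climbers", 0.7, 6.78, 92, 0.35, 0.80),
    ImageResult("gamer keyboard", 0.6, 6.60, 84, 0.30, 0.10),
    ImageResult("neon fog", 0.7, 6.58, 95, 0.35, 0.90),
]

METRICS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(results):
    """Return the arithmetic mean of each metric across all images."""
    return {m: fmean(getattr(r, m) for r in results) for m in METRICS}


def likely_ai(results, cutoff=0.5):
    """Return labels of images whose AI likelihood is at or above the cutoff.

    The 0.5 cutoff is an assumption; the post does not define one.
    """
    return [r.label for r in results if r.ai_likelihood >= cutoff]


if __name__ == "__main__":
    for metric, mean in summarize(RESULTS).items():
        print(f"{metric}: {mean:.3f}")
    print("likely AI:", likely_ai(RESULTS))
```

Running the script prints the mean of each metric and the labels of the images whose AI likelihood is at or above the assumed cutoff.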
Conclusion
The results indicate that the generative AI model handled shot composition reasonably well, fell short on camera position, and did not reliably deliver the aesthetic the prompts asked for. Here’s a breakdown:
- Camera Position: The model scored 0.45, which falls below the “good” range of 0.5 to 0.75. This suggests that the model’s ability to translate the camera position requested in the prompt into the generated image is somewhat lacking.
- Shot Analysis: The model scored 0.56, which is within the “good” range. This indicates that the model was generally successful in understanding the scene described in the prompt and creating a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.07 against a “very good” range of -0.2 to 0.1. Taken at face value, 0.07 sits inside that range, so the score alone does not show a shortfall; it is the per-image results that point to a deviation from the requested aesthetic. For example, the first gaming prompt asked for a “Focused, intense” mood, but the detected mood was “focused, casual, relaxed”.
Overall, the model shows promise in shot composition, but needs improvement in interpreting camera positions and in achieving the desired aesthetic.
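The range comparisons in the breakdown reduce to simple interval checks. The sketch below is a hedged illustration in Python, not the evaluation code behind this post. The scores and ranges are the ones quoted in the breakdown; the `ScoreCheck` type and the `within` and `report` helpers are hypothetical names, treating each range as inclusive is an assumption, and reusing the 0.5 to 0.75 “good” range for Shot Analysis is also an assumption, since the post does not restate that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreCheck:
    name: str
    score: float
    low: float         # lower bound of the quoted range (assumed inclusive)
    high: float        # upper bound of the quoted range (assumed inclusive)
    range_label: str   # label the post gives to the range


def within(check):
    """Return True if the score lies inside the quoted range."""
    return check.low <= check.score <= check.high


# Scores and ranges as quoted in the conclusion.
CHECKS = [
    ScoreCheck("Camera Position", 0.45, 0.5, 0.75, "good"),
    # Assumption: Shot Analysis uses the same "good" range as Camera Position.
    ScoreCheck("Shot Analysis", 0.56, 0.5, 0.75, "good"),
    ScoreCheck("Aesthetic Analysis", 0.07, -0.2, 0.1, "very good"),
]


def report(checks):
    """Build one human-readable line per check."""
    lines = []
    for c in checks:
        verdict = "inside" if within(c) else "outside"
        lines.append(
            f"{c.name}: {c.score} is {verdict} the "
            f"'{c.range_label}' range [{c.low}, {c.high}]"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report(CHECKS)))
```

By plain numeric comparison, 0.45 falls outside 0.5 to 0.75, 0.56 falls inside it, and 0.07 falls inside -0.2 to 0.1.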