AI's Artistic Struggle: Capturing the 'Dramatic' Aesthetic with Stable Diffusion
The ‘dramatic’ aesthetic, characterized by high contrast, strong lighting, and a sense of tension, is a powerful tool in visual storytelling. It’s often used to evoke emotions like suspense, awe, and heroism. But can AI truly capture this aesthetic? In this article, we explore the challenges and successes of AI in generating images with a ‘dramatic’ style. We’ll analyze the results of a test, examining the AI’s ability to understand scene composition, camera position, and the nuances of visual style. Through this analysis, we’ll gain insights into the current capabilities of AI in artistic expression and its potential for future development.
Created with: stability-ai-core
Silhouetted Soldier Contemplates the Aftermath
A lone soldier stands amidst the ruins of a battlefield, their silhouette stark against the setting sun. Two tanks in the background add to the sense of desolation and loss, while the soldier’s posture evokes a mood of somber reflection.
Prompt
Gritty realism: Melancholy, determined ; A lone soldier, silhouetted against the setting sun; wide shot; Heroism; a war-torn battlefield littered with debris and the wreckage of tanks; cinematic
Characteristic
Shot : A lone soldier stands in a war-torn landscape, surrounded by debris and destroyed vehicles. The sun is setting behind him, casting a warm glow over the scene.
Aesthetic Score : 0.7
Mood : melancholy, somber, contemplative
Quality
Entropy : 6.59
Noise : 87
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor image artifacts present, like slight blur and over sharpening.
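The prompts in this test appear to follow a fixed, semicolon-separated layout: a style label, a list of moods, a subject, a shot type, a theme, a setting, and a closing keyword. The Python sketch below splits a prompt into those fields so they can be compared with the ‘Characteristic’ values reported for each image. The field names and the `parse_prompt` helper are assumptions made for illustration; they are not part of the original test setup.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Field names are hypothetical labels for the slots seen in the prompts.
    style: str
    moods: List[str]
    subject: str
    shot: str
    theme: str
    setting: str
    finish: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'style: moods ; subject; shot; theme; setting; finish' prompt."""
    style, _, rest = prompt.partition(":")
    fields = [field.strip() for field in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields after the style, got {len(fields)}")
    moods, subject, shot, theme, setting, finish = fields
    return PromptParts(
        style=style.strip(),
        moods=[m.strip().lower() for m in moods.split(",")],
        subject=subject,
        shot=shot,
        theme=theme,
        setting=setting,
        finish=finish,
    )


SOLDIER_PROMPT = (
    "Gritty realism: Melancholy, determined ; A lone soldier, silhouetted "
    "against the setting sun; wide shot; Heroism; a war-torn battlefield "
    "littered with debris and the wreckage of tanks; cinematic"
)

parts = parse_prompt(SOLDIER_PROMPT)
print(parts.shot, "|", parts.moods)
```

With the fields separated, the requested shot type (‘wide shot’) and moods can be checked against the reported ‘Shot’ and ‘Mood’ lines one slot at a time.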
The Jungle’s Embrace: A Portrait of Grit and Reflection
A close-up portrait captures the weathered face of a man adorned with a leaf crown, his intense gaze reflecting a life lived in the heart of the jungle. Strong lighting and intimate framing draw you into his world, revealing a story of adventure, resilience, and quiet contemplation.
Prompt
Gritty realism: Intrigued, apprehensive ; A weathered explorer, their face etched with lines of hardship, peering through a dense jungle canopy; close-up; Adventure; overgrown ruins of an ancient temple; cinematic
Characteristic
Shot : A man with a weathered face and a leaf crown stands in a lush jungle setting. The lighting is dramatic and moody.
Aesthetic Score : 0.7
Mood : intense, mysterious, adventurous
Quality
Entropy : 6.81
Noise : 108
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the foliage, but they are not overly distracting.
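The article does not say how the ‘Entropy’ figure is computed. Values below 8, such as the 6.81 reported here, would be consistent with the Shannon entropy (in bits) of an 8-bit grayscale histogram, so the Python sketch below shows that calculation under this assumption. The `shannon_entropy` helper and the toy pixel lists are illustrative only.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors a histogram-based image entropy; the metric
    actually used for the article's 'Entropy' value is not documented.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum of p * log2(1/p) over every intensity that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 64             # one intensity only: no variation
two_tone = [0, 255] * 32      # two equally likely intensities
full_range = list(range(256)) # every 8-bit intensity exactly once

print(shannon_entropy(flat), shannon_entropy(two_tone), shannon_entropy(full_range))
```

Under this reading, a higher value means the pixel intensities are spread more evenly across the available range, and 8 is the ceiling for 8-bit data.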
The Focus of the Game: A Gamer’s Hands in Action
A close-up shot captures the intensity of a gamer’s focus as they grip their PS4 controller. The dim lighting and muted colors create a sense of intimacy and immersion, while the pizza and soda on the table hint at the casual enjoyment of the experience.
Prompt
Gritty realism: Focused, intense ; A gamer’s hands, gripping a worn controller, illuminated by the flickering glow of a monitor; close-up; Gaming; a dimly lit room filled with empty pizza boxes and energy drink cans; cinematic
Characteristic
Shot : A close-up of a gaming setup with a PS4 controller in the foreground, pizza slices, and a computer monitor in the background.
Aesthetic Score : 0.6
Mood : casual, gaming, cozy
Quality
Entropy : 6.28
Noise : 70
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some slight artifacts around the edges of the controller and the monitor.
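The ‘Prompt Clip Score’ is not defined in the article. A common way to score prompt-image agreement is the cosine similarity between a CLIP text embedding and a CLIP image embedding, and the Python sketch below shows that calculation on toy vectors. Treat this as an assumption about the metric: the vectors are placeholders, and no real embedding model is called.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for a text embedding and an image embedding.
text_embedding = [3.0, 4.0]
image_embedding = [4.0, 3.0]

score = cosine_similarity(text_embedding, image_embedding)
print(round(score, 2))
```

If the metric works this way, scores are comparable across images only when the same embedding model produced both vectors, which would explain why the values in this test sit in a narrow band.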
A Lone Figure in the Desert: Hope Amidst Isolation
A solitary man stands on a dusty road, his gaze fixed on a distant, closed diner. The vast, empty landscape evokes a sense of loneliness and desolation, yet the diner, a beacon in the distance, suggests a glimmer of hope and the promise of a destination.
Prompt
Gritty realism: Lonely, contemplative ; A weary traveler, their backpack slung over their shoulder, gazing out at a desolate, dusty landscape; medium shot; Tourism; a crumbling roadside diner with faded neon signs; cinematic
Characteristic
Shot : A lone man, with a backpack, stands on a dusty road in the desert, looking at a diner with a faded sign.
Aesthetic Score : 0.7
Mood : lonely, desolate, contemplative
Quality
Entropy : 6.65
Noise : 90
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
A Shadow of Doubt: A Somber Journey on a Dusty Train
A group of passengers, shrouded in the dim light of an aging train, gaze out the grimy windows. Their expressions speak of apprehension and uncertainty, hinting at a journey fraught with mystery and unspoken anxieties. The somber mood and dramatic lighting create a sense of unease, leaving the viewer to ponder the secrets hidden within this fleeting moment.
Prompt
Gritty realism: Intimate, hopeful ; A family huddled together in a cramped train compartment, their faces illuminated by the flickering light of a single overhead bulb; medium shot; Travel; a train rattling through a dark, rain-soaked countryside; cinematic
Characteristic
Shot : A group of people are seated on a train, looking out the window. The lighting is dim, and the mood is tense. The train is old and worn, and the people are dressed in simple clothing.
Aesthetic Score : 0.7
Mood : tense, suspenseful, dramatic
Quality
Entropy : 5.89
Noise : 93
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight graininess to it, but this is not a significant issue. The overall quality is good.
A City Boy’s Wonder
A young boy, lost in thought, gazes up at the towering cityscape. The bustling streets and bright sunshine create a vibrant backdrop for his pensive mood, hinting at a hopeful future amidst the urban landscape.
Prompt
Gritty realism: Awe, curiosity ; A young boy, his eyes wide with wonder, staring up at a towering skyscraper; low angle shot; Family; a bustling city street filled with people and traffic; cinematic
Characteristic
Shot : A young boy stands in the middle of a city street, looking up at the tall buildings. There are other people walking around, but the boy is the focus of the image.
Aesthetic Score : 0.7
Mood : thoughtful, contemplative, urban
Quality
Entropy : 6.86
Noise : 95
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors
Heroic Firefighter Stands Tall Amidst Blazing Inferno
A firefighter in full gear maintains a calm composure in the face of a raging fire, showcasing the bravery and dedication of those who risk their lives to protect others. The dramatic contrast between the firefighter’s stillness and the chaotic flames creates a powerful image of heroism.
Prompt
Gritty realism: Brave, determined ; A firefighter, their face obscured by smoke, battling a raging inferno; close-up; Heroism; a burning building with flames licking at the sky; cinematic
Characteristic
Shot : A firefighter in full gear, with a gas mask, stands in front of a burning building.
Aesthetic Score : 0.7
Mood : dramatic, intense, serious
Quality
Entropy : 6.75
Noise : 96
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly around the edges of the fire.
Hike Through Majestic Mountains: A Breathtaking View
Experience the awe-inspiring beauty of a snow-capped mountain range as hikers traverse a scenic pass. The dramatic scale of the mountains, emphasized by the figures in the foreground, creates a sense of adventure and serenity.
Prompt
Gritty realism: Exhausted, determined ; A group of adventurers, their faces grimy and exhausted, navigating a treacherous mountain pass; wide shot; Adventure; a snow-covered mountain range with jagged peaks; cinematic
Characteristic
Shot : A group of five hikers, all wearing orange jackets and carrying backpacks, ascend a snow-covered mountain pass. The scene is framed by towering snow-capped peaks on either side, and the sky is bright with a few clouds.
Aesthetic Score : 0.8
Mood : epic, adventurous, serene
Quality
Entropy : 6.85
Noise : 99
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.80
Image errors : None
The Hacker’s Focus
A young man, bathed in the blue glow of multiple monitors, sits hunched over his keyboard, his expression intense as he navigates a digital world. The dimly lit room amplifies the sense of focus and tension, hinting at a high-stakes mission in the heart of a futuristic landscape.
Prompt
Gritty realism: Focused, competitive ; A gamer, their eyes glued to the screen, their fingers flying across the keyboard; close-up; Gaming; a dimly lit room filled with computer monitors and gaming peripherals; cinematic
Characteristic
Shot : A young man sits at his computer, wearing headphones and playing a video game. The scene is set in a dimly lit room, with several computer monitors in the background.
Aesthetic Score : 0.7
Mood : intense, focused, concentrated
Quality
Entropy : 6.16
Noise : 84
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor noise and grain, but it is otherwise well-exposed.
Lost in the Neon Glow: A Melancholy Stroll Through the City
Two figures navigate a rain-slicked urban landscape, their silhouettes stark against the vibrant neon signs. The atmosphere is heavy with mystery and a touch of melancholy, captured in the interplay of light and shadow.
Prompt
Gritty realism: Lonely, introspective ; A lone traveler, their suitcase in hand, walking down a deserted street; medium shot; Tourism; a city skyline at night, with neon lights reflecting off the wet pavement; cinematic
Characteristic
Shot : Two figures walk in opposite directions down a wet city street at night. The street is lined with tall buildings and neon signs.
Aesthetic Score : 0.8
Mood : dark, urban, lonely
Quality
Entropy : 6.18
Noise : 97
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as the blurry reflections in the wet street.
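To compare the ten images at a glance, the per-image figures listed above can be collected and summarized. The Python sketch below does this with the values exactly as reported; the short labels, the tuple layout, and the `summarize` helper are choices made for this illustration.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, prompt_clip, likelihood_of_ai)
# Values copied from the per-image listings; labels are shortened titles.
RESULTS = [
    ("Soldier",     0.7, 6.59,  87, 0.34, 0.30),
    ("Jungle",      0.7, 6.81, 108, 0.32, 0.20),
    ("Gamer hands", 0.6, 6.28,  70, 0.34, 0.10),
    ("Desert",      0.7, 6.65,  90, 0.34, 0.20),
    ("Train",       0.7, 5.89,  93, 0.36, 0.20),
    ("City boy",    0.7, 6.86,  95, 0.35, 0.20),
    ("Firefighter", 0.7, 6.75,  96, 0.31, 0.20),
    ("Mountains",   0.8, 6.85,  99, 0.34, 0.80),
    ("Hacker",      0.7, 6.16,  84, 0.31, 0.20),
    ("Neon city",   0.8, 6.18,  97, 0.33, 0.20),
]

COLUMNS = {
    "aesthetic": 1,
    "entropy": 2,
    "noise": 3,
    "prompt_clip": 4,
    "likelihood_of_ai": 5,
}


def summarize(rows, column):
    """Return the min, mean, and max of one metric column."""
    values = [row[column] for row in rows]
    return {"min": min(values), "mean": mean(values), "max": max(values)}


summary = {name: summarize(RESULTS, idx) for name, idx in COLUMNS.items()}
for name, stats in summary.items():
    print(f"{name:>16}: min={stats['min']:.2f} "
          f"mean={stats['mean']:.3f} max={stats['max']:.2f}")
```

On these figures the mean Aesthetic Score is 0.71 and the mean Prompt Clip Score is about 0.33, with little spread in either. The mountain image is the only one with a Likelihood of AI above 0.30 (0.80).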
Conclusion
The results suggest that the generative AI model captured the ‘dramatic’ aesthetic and the described scenes more reliably than the requested camera positions. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. The camera positions in the generated images differed somewhat from those requested in the prompts.
- Shot Analysis: The model scored 0.49, which is considered good. The shot composition of the generated images was fairly close to what the prompts described.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. The aesthetic of the generated images was very close to the expected aesthetic.
Overall, going by these ratings, the model was strongest at matching the desired aesthetic, solid on shot composition, and weakest at following the requested camera position.