AI Struggles to Capture the 'Dramatic' Aesthetic with Leonardo-ai
The ‘dramatic’ aesthetic is a powerful tool in visual storytelling, evoking strong emotions and immersing viewers in a world of heightened tension and beauty. It often relies on stark contrasts, strong directional lighting, and compositions that emphasize the subject’s isolation or vulnerability. This style is common in film, photography, and video games, where it is used to create impactful and memorable scenes. Teaching AI to replicate it is difficult, however, because the model must understand visual language well enough to evoke a specific emotion rather than merely depict a subject.
Created with: leonardo-ai
Silhouetted Against the Setting Sun: A Soldier’s Lonely Vigil
A lone soldier stands beside a tank, both silhouetted against the fiery sunset. The battlefield is littered with debris, a reminder of the war’s devastation. The scene evokes a sense of gloom and isolation, capturing the somber reality of conflict.
Prompt
Gritty realism: Melancholy, determined ; A lone soldier, silhouetted against the setting sun; wide shot; Heroism; a war-torn battlefield littered with debris and the wreckage of tanks; cinematic
Characteristic
Shot : A tank and a soldier silhouetted against a setting sun, with debris scattered across the foreground.
Aesthetic Score : 0.6
Mood : dramatic, somber, war
Quality
Entropy : 6.65
Noise : 93
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image shows slight noise and blurring in some areas.
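The post does not say how the Entropy figure is computed. Values between roughly 5.8 and 6.9 would be consistent with the Shannon entropy of an 8-bit intensity histogram, which tops out at 8 bits, but that reading is an assumption. Under that assumption, a minimal sketch of the calculation looks like this (the function name and the sample pixel lists are hypothetical, not taken from the images here):

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a list of pixel intensity values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical inputs for illustration only.
flat = [128] * 1024             # one gray level: no variation at all
spread = list(range(256)) * 4   # all 256 levels equally frequent

print(shannon_entropy(flat))
print(shannon_entropy(spread))
```

On this reading, a higher value means the tones are spread more evenly across the available range, while a lower value points to an image dominated by a few tones, as in a very dark scene.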
Lost in the Jungle: A Man’s Contemplative Gaze
A weathered man stands amidst the dense foliage of a jungle, his serious expression hinting at a hidden story. The atmosphere is thick with suspense and mystery, leaving the viewer to wonder what secrets lie within the shadows.
Prompt
Gritty realism: Intrigued, apprehensive ; A weathered explorer, their face etched with lines of hardship, peering through a dense jungle canopy; close-up; Adventure; overgrown ruins of an ancient temple; cinematic
Characteristic
Shot : A man with a rugged look stands in the middle of a lush green jungle. The foliage obscures some of his body and the background, creating a sense of depth.
Aesthetic Score : 0.6
Mood : intense, adventurous, mysterious
Quality
Entropy : 6.70
Noise : 113
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Lost in the Game: A Moment of Intense Focus
A man is completely engrossed in a video game, his face illuminated by the screen’s glow in a dimly lit room. The dramatic lighting creates a sense of mystery and intensity, highlighting his focused and serious demeanor.
Prompt
Gritty realism: Focused, intense ; A gamer’s hands, gripping a worn controller, illuminated by the flickering glow of a monitor; close-up; Gaming; a dimly lit room filled with empty pizza boxes and energy drink cans; cinematic
Characteristic
Shot : A man sits in a dimly lit room, playing a video game on a computer. A glass of liquid stands on the desk in front of him, alongside a keyboard, a mouse, a controller, and a computer monitor.
Aesthetic Score : 0.6
Mood : dark, focused, intense
Quality
Entropy : 6.29
Noise : 82
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts in the form of noise and grain. There is also some blur in the background.
Lost in Time: A Vintage Car and a Desolate Diner in the Desert
A vintage car sits parked in front of a run-down roadside diner, its rusty neon sign casting a dim glow. The scene evokes a sense of nostalgia and desolation, with long shadows and faded colors adding to the melancholy feeling. This Americana landscape captures the loneliness and isolation of the open road.
Prompt
Gritty realism: Lonely, contemplative ; A weary traveler, their backpack slung over their shoulder, gazing out at a desolate, dusty landscape; medium shot; Tourism; a crumbling roadside diner with faded neon signs; cinematic
Characteristic
Shot : A retro diner with a faded neon sign stands abandoned on a desolate desert road, with a vintage car parked in front of it.
Aesthetic Score : 0.7
Mood : melancholic, nostalgic, deserted
Quality
Entropy : 6.80
Noise : 96
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible errors or artifacts in the image.
A Rainy Day of Contemplation
Three women, their faces etched with melancholy, gaze out the rain-streaked window of a train car. The dim lighting and somber mood create a sense of isolation and unease, drawing the viewer into their world of contemplation.
Prompt
Gritty realism: Intimate, hopeful ; A family huddled together in a cramped train compartment, their faces illuminated by the flickering light of a single overhead bulb; medium shot; Travel; a train rattling through a dark, rain-soaked countryside; cinematic
Characteristic
Shot : Three women are sitting in a train car, looking out the window, which is blurred with raindrops. The interior of the train is dark and moody.
Aesthetic Score : 0.6
Mood : gloomy, suspenseful, mysterious
Quality
Entropy : 5.76
Noise : 85
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially the women’s faces. The color balance is a little off, with some colors appearing desaturated.
A Family’s Journey Towards Hope in the City
A heartwarming scene of a family of five strolling down a bustling city street, their eyes fixed on a towering building in the distance. The image evokes a sense of hope and possibility, capturing the essence of urban life and the strength of family bonds.
Prompt
Gritty realism: Awe, curiosity ; eyes wide with wonder, staring up at a towering skyscraper; low angle shot; Family; a bustling city street filled with people and traffic; cinematic
Characteristic
Shot : A family walking down a city street, with tall buildings on either side and a skyscraper in the background.
Aesthetic Score : 0.6
Mood : urban, family, everyday
Quality
Entropy : 6.89
Noise : 115
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry in some areas.
Firefighter’s Courage Amidst the Flames
A powerful image captures the bravery of a firefighter standing against a backdrop of a burning house, smoke billowing in the air. The scene evokes a sense of intensity, drama, and somber reflection on the destructive power of fire.
Prompt
Gritty realism: Brave, determined ; A firefighter, their face obscured by smoke, battling a raging inferno; close-up; Heroism; a burning building with flames licking at the sky; cinematic
Characteristic
Shot : A firefighter in full gear is standing in front of a burning building. The building is engulfed in flames and smoke.
Aesthetic Score : 0.6
Mood : intense, dramatic, somber
Quality
Entropy : 6.71
Noise : 88
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have some slight compression artifacts, particularly in the smoke.
Conquering the Summit: Hikers Brave the Snowy Pass
A breathtaking scene of four hikers ascending a snowy mountain pass, with a dramatic peak looming in the background. The strong perspective and use of light and shadow create a sense of depth and scale, highlighting the epic journey and the immensity of the mountain.
Prompt
Gritty realism: Exhausted, determined ; A group of adventurers, their faces grimy and exhausted, navigating a treacherous mountain pass; wide shot; Adventure; a snow-covered mountain range with jagged peaks; cinematic
Characteristic
Shot : A group of four hikers is ascending a snowy mountain path, with a majestic snow-capped peak in the background.
Aesthetic Score : 0.8
Mood : serene, adventurous, determined
Quality
Entropy : 6.85
Noise : 98
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors detected. The image appears well-processed and balanced.
Lost in the Game: A Young Man’s Intense Focus in a Futuristic World
A dimly lit room, a young man with headphones, and a computer screen glowing with the intensity of a digital battle. This image captures the focused energy of a gamer lost in the world of virtual reality, creating a sense of futuristic intensity.
Prompt
Gritty realism: Focused, competitive ; A gamer, their eyes glued to the screen, their fingers flying across the keyboard; close-up; Gaming; a dimly lit room filled with computer monitors and gaming peripherals; cinematic
Characteristic
Shot : A young man is sitting in a dark room, wearing a headset and looking intently at a computer screen. He is likely playing a video game, as the screen displays a game interface.
Aesthetic Score : 0.6
Mood : intense, focused, determined
Quality
Entropy : 5.88
Noise : 80
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and grain, which may be due to the low lighting conditions.
Lost in the Neon Rain
A solitary figure navigates a rain-soaked city street, bathed in the glow of neon signs. The atmosphere is nostalgic and brooding, with a sense of mystery and intrigue. The figure, carrying a book and a backpack, walks away from the camera, adding to the feeling of isolation and urban solitude.
Prompt
Gritty realism: Lonely, introspective ; A lone traveler, their suitcase in hand, walking down a deserted street; medium shot; Tourism; a city skyline at night, with neon lights reflecting off the wet pavement; cinematic
Characteristic
Shot : A lone figure walks down a wet city street at night, with a tall building in the distance and neon signs reflecting on the wet pavement.
Aesthetic Score : 0.7
Mood : mysterious, urban, lonely
Quality
Entropy : 6.43
Noise : 104
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts in the shadows, and the figure’s silhouette is slightly blurry.
Conclusion
The results show that the generative AI model understood the scene reasonably well, but fell short on camera position and on the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t fully capture the intended camera position in the prompt.
- Shot Analysis: The model scored 0.5, which falls within the “good” range. This indicates that the model was able to understand the scene in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.12, which is just above the upper bound of the “very good” range of -0.2 to 0.1. This suggests that the generated images’ aesthetic deviated from the aesthetic described in the prompt.
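The comparison of each score with its target range can be sketched in a few lines. The scores and ranges are the ones quoted in the breakdown. The post gives no numeric range for Shot Analysis, so the sketch assumes the same 0.5 to 0.75 “good” range as for Camera Position. The function names are hypothetical.

```python
def classify(score, low, high):
    """Say whether a score is below, within, or above an inclusive range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


def margin(score, low, high):
    """Distance from the score to the nearest range bound (0.0 if inside)."""
    if score < low:
        return round(low - score, 2)
    if score > high:
        return round(score - high, 2)
    return 0.0


# (score, range low, range high) as quoted in the breakdown.
results = {
    "Camera Position": (0.35, 0.5, 0.75),
    "Shot Analysis": (0.5, 0.5, 0.75),       # range assumed, see lead-in
    "Aesthetic Analysis": (0.12, -0.2, 0.1),
}

for name, (score, low, high) in results.items():
    verdict = classify(score, low, high)
    print(f"{name}: {score} is {verdict} [{low}, {high}], "
          f"margin {margin(score, low, high)}")
```

Reporting the margin alongside the verdict shows how far each score sits from its range, not just whether it is outside it.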
Overall, the model shows promise in understanding the scene, but needs improvement in matching the requested camera position and in generating images that match the desired aesthetic.
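The per-image figures reported above can also be summarized directly. The sketch below copies the ten sets of values from the image entries and averages them. The short labels are abbreviations of the section titles, and the tuple layout is an illustrative choice rather than anything the evaluation tool prescribes.

```python
from statistics import mean

# (label, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
images = [
    ("Soldier", 0.6, 6.65, 93, 0.33, 0.20),
    ("Jungle", 0.6, 6.70, 113, 0.32, 0.20),
    ("Gamer hands", 0.6, 6.29, 82, 0.36, 0.10),
    ("Diner", 0.7, 6.80, 96, 0.30, 0.20),
    ("Train", 0.6, 5.76, 85, 0.37, 0.20),
    ("Family", 0.6, 6.89, 115, 0.30, 0.30),
    ("Firefighter", 0.6, 6.71, 88, 0.29, 0.10),
    ("Hikers", 0.8, 6.85, 98, 0.32, 0.10),
    ("Gamer headset", 0.6, 5.88, 80, 0.32, 0.10),
    ("Neon rain", 0.7, 6.43, 104, 0.35, 0.30),
]

columns = ["aesthetic", "entropy", "noise", "clip", "ai_likelihood"]

# Average each metric column across the ten images.
averages = {
    name: mean(row[i + 1] for row in images)
    for i, name in enumerate(columns)
}

for name, value in averages.items():
    print(f"mean {name}: {value:.3f}")

# Image with the highest aesthetic score.
best = max(images, key=lambda row: row[1])
print("highest aesthetic score:", best[0], best[1])
```

For these ten images the mean Aesthetic Score is 0.64 and the mean Prompt Clip Score is about 0.33. The hikers image has the highest Aesthetic Score, at 0.8.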
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://leonardo.ai