AI's Artistic Struggle: Capturing the 'Dramatic' Aesthetic with Stability-ai-ultra
The ‘dramatic’ aesthetic, characterized by strong contrasts, heightened emotions, and a sense of grandeur, is a powerful tool in visual storytelling. It’s often used to evoke feelings of awe, suspense, or even tragedy. But can AI truly capture this complex aesthetic? In this blog post, we explore the challenges and successes of AI in generating images with a dramatic feel, analyzing its strengths and weaknesses through a recent experiment.
Created with: stability-ai-ultra
Silhouetted Against the Setting Sun: A Moment of Solitude in the Desert
A lone figure walks towards the horizon, their silhouette stark against the fiery sunset. The vast, rocky desert landscape evokes a sense of solitude and contemplation, highlighting the dramatic effect of the setting sun.
Prompt
Dogme 95: Epic, hopeful ; A lone figure, silhouetted against a setting sun; long shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A solitary figure walks towards the setting sun in a rocky, desert landscape.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, vast
Quality
Entropy : 6.12
Noise : 94
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.70
Image errors : The rocks in the foreground have a slightly artificial texture and appear overly smooth. The sun appears overly bright and may have been digitally enhanced.
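The Entropy value listed for each image is not defined in this post. One plausible reading, assumed here, is Shannon entropy in bits over the image's 8-bit grayscale histogram, which ranges from 0 (a flat image) to 8 (all 256 levels equally common). The sketch below illustrates that assumption on plain lists of pixel values; the function name is hypothetical and this is not necessarily how the scores above were produced.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat_image = [128] * 64          # one gray level only
varied_image = list(range(256))  # every 8-bit level exactly once

print(shannon_entropy(flat_image))    # lowest possible value
print(shannon_entropy(varied_image))  # highest possible value for 8-bit data
```

Under this reading, values between 6 and 7, like the ones reported here, would indicate images with a wide spread of tones.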
On the Edge of Greatness: A Climber’s Daring Ascent
A climber’s hand reaches out from a narrow cliff edge, defying gravity and embracing the vastness of the natural world. This breathtaking image captures the intensity, adventure, and daring spirit of a true mountaineer, showcasing the awe-inspiring beauty and inherent danger of the climb.
Prompt
Dogme 95: Suspenseful, thrilling ; A hand reaching out to grasp a rope ladder dangling from a cliff face; close-up; Adventure; A rocky, treacherous mountainside; cinematic
Characteristic
Shot : A climber hangs off the edge of a cliff, with ropes, looking down a vast valley and mountains in the distance.
Aesthetic Score : 0.7
Mood : dramatic, adventurous, daring
Quality
Entropy : 6.85
Noise : 106
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no apparent artifacts or errors in the image.
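The Prompt Clip Score is also not defined in this post. CLIP-style scores are commonly the cosine similarity between an embedding of the prompt and an embedding of the image, and the sketch below assumes that reading. The vectors are toy values standing in for real embeddings, and the function name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text embedding and an image embedding.
prompt_embedding = [0.2, 0.7, 0.1]
image_embedding = [0.3, 0.5, 0.4]

print(round(cosine_similarity(prompt_embedding, image_embedding), 3))
```

If the scores here are computed this way, higher values mean the image sits closer to the prompt in the embedding space.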
Lost in the Flames: Gamer Immersed in Action-Packed World
A man, headphones on and cap pulled low, is completely engrossed in a video game. The screen blazes with fiery action, casting an intense glow on his focused face. The low lighting and dramatic scene create a palpable sense of immersion and excitement.
Prompt
Dogme 95: Intense, focused ; A player’s hands frantically manipulating a joystick, their face illuminated by the screen; medium shot; Gaming; A dimly lit room with a computer monitor glowing brightly; cinematic
Characteristic
Shot : A young man is sitting at a desk playing a video game. He is wearing a headset and a baseball cap. The game on the screen is displaying fire and a dark, almost ominous scene. The image is taken from a low angle, and the light is warm, orangey, and artificial.
Aesthetic Score : 0.6
Mood : intense, focused, competitive
Quality
Entropy : 6.03
Noise : 73
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No obvious artifacts or errors are noticeable in the image.
A Vibrant Street Market in a European City
Experience the bustling energy of a European street market, filled with colorful produce, lively vendors, and a vibrant crowd. The narrow street creates a sense of depth and perspective, enhancing the visual appeal of this lively scene.
Prompt
Dogme 95: Energetic, lively ; A bustling marketplace, filled with vibrant colors and exotic goods; wide shot; Tourism; A crowded street in a foreign city; cinematic
Characteristic
Shot : A bustling outdoor market in a narrow street. The street is lined with stalls selling fresh produce, and the air is filled with the sounds of vendors hawking their wares and shoppers bargaining.
Aesthetic Score : 0.7
Mood : vibrant, lively, bustling
Quality
Entropy : 6.78
Noise : 102
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight over-exposure in some areas, especially towards the back of the image, which causes some details to be lost.
Blur of Speed: Capturing the Energy of a High-Speed Train
A dynamic photograph showcasing a high-speed train racing through a picturesque rural landscape. The blur of motion and vibrant colors create a sense of energy and excitement, capturing the power and speed of the train.
Prompt
Dogme 95: Nostalgic, contemplative ; A train speeding through a countryside landscape, blurring the scenery; long shot; Travel; Rolling hills and fields passing by; cinematic
Characteristic
Shot : A high-speed train is speeding through a scenic rural landscape with rolling hills, farmland, and a cloudy sky.
Aesthetic Score : 0.75
Mood : tranquil, fast-paced, adventurous
Quality
Entropy : 6.95
Noise : 99
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The blur of the train is slightly excessive, and some parts of the image appear to be oversharpened. There are some minor compression artifacts.
Warm Gatherings and Delicious Feasts: A Moment of Shared Joy
A group of friends share a heartwarming meal around a large wooden table, bathed in warm light. The scene exudes a sense of intimacy and togetherness, with the focus on the abundant food and the joy of shared company.
Prompt
Dogme 95: Warm, intimate ; A family gathered around a dinner table, sharing a meal and laughter; medium shot; Family; A cozy, well-worn kitchen; cinematic
Characteristic
Shot : A group of six people are having dinner together in a rustic kitchen. They are seated around a large wooden table, and there is food and drink in front of them. The kitchen is warm and inviting, with wooden cabinets and a large window. The light is coming in from the window, creating a warm glow on the scene.
Aesthetic Score : 0.7
Mood : cozy, warm, togetherness
Quality
Entropy : 6.63
Noise : 90
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
A Tearful Glance: A Boy’s Vulnerability in the Warm Light
A close-up shot captures the raw emotion of a young boy, tears streaming down his face as he looks up at an unseen figure. The soft, warm lighting creates an intimate and vulnerable atmosphere, highlighting the boy’s sadness and the emotional weight of the moment.
Prompt
Dogme 95: Sad, poignant ; A single tear rolling down a child’s cheek as they watch their parents argue; close-up; Family; A dimly lit living room; cinematic
Characteristic
Shot : A young boy with short brown hair is crying while looking directly at the camera. A person in a blue shirt stands behind the boy; only a shoulder and arm are visible. The image has a cartoon style and looks like a still from an animated movie.
Aesthetic Score : 0.8
Mood : sad, emotional, melancholic
Quality
Entropy : 6.50
Noise : 69
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image is well-rendered and free of obvious artifacts or errors.
Campfire Laughter: A Night of Friendship and Warmth
Five young women gather around a crackling campfire, their laughter echoing through the forest. The warm glow of the flames illuminates their faces, creating a cozy and inviting atmosphere. The soft lighting and the silhouettes of the trees in the background add a touch of mystery and wonder to this scene of pure joy and camaraderie.
Prompt
Dogme 95: Joyful, communal ; A group of friends huddled together around a campfire, sharing stories and laughter; medium shot; Adventure; A dark forest with flickering flames; cinematic
Characteristic
Shot : Five young women are gathered around a campfire in a forest setting. The flames are bright and warm, and the women are dressed in warm clothing, suggesting a cozy and inviting atmosphere.
Aesthetic Score : 0.7
Mood : cozy, friendly, adventurous
Quality
Entropy : 6.54
Noise : 92
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to have some minor noise and artifacts, particularly in the shadows. The fire flames seem slightly artificial, with some areas appearing overly bright and lacking natural variations.
Solitude and the Sea: A Moment of Contemplation
A lone figure finds peace on a rugged cliff overlooking a dramatic ocean scene. The vastness of the sea and the smallness of the figure create a sense of awe and wonder, capturing a moment of serene contemplation.
Prompt
Dogme 95: Awe-inspiring, contemplative ; A lone traveler gazing out at a vast ocean, their face filled with wonder; long shot; Travel; A dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A lone figure sits on a rocky cliff overlooking a vast, turbulent ocean with breaking waves. The scene is bathed in soft, blue light from a clear sky.
Aesthetic Score : 0.75
Mood : serene, contemplative, dramatic
Quality
Entropy : 6.73
Noise : 95
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors. The image seems well-processed and natural.
A Hand Holds the Past: A Nostalgic Glimpse into a Bygone Era
A weathered hand delicately cradles an old photograph, revealing a woman in a coat and hat standing in a field. The image evokes a sense of nostalgia and mystery, with the cluttered background hinting at a life lived and memories cherished. The intimate gesture of the hand draws the viewer into the moment, inviting them to ponder the story behind the photograph.
Prompt
Dogme 95: Melancholy, nostalgic ; A hand holding a worn photograph, the image blurred and faded; close-up; Family; A cluttered attic filled with old memories; cinematic
Characteristic
Shot : A hand is holding up an old photograph in a room filled with old photographs and clutter.
Aesthetic Score : 0.7
Mood : nostalgic, melancholic, historical
Quality
Entropy : 6.80
Noise : 83
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no visible artifacts or errors in the image.
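The per-image numbers above are easier to compare once they are averaged. The following sketch copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI from the ten entries and summarizes them. The data structure and function name are illustrative and not part of the original experiment.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries in this post.
results = [
    ("Silhouetted Against the Setting Sun", 0.70, 0.29, 0.70),
    ("On the Edge of Greatness", 0.70, 0.32, 0.20),
    ("Lost in the Flames", 0.60, 0.30, 0.20),
    ("A Vibrant Street Market in a European City", 0.70, 0.31, 0.10),
    ("Blur of Speed", 0.75, 0.31, 0.20),
    ("Warm Gatherings and Delicious Feasts", 0.70, 0.30, 0.20),
    ("A Tearful Glance", 0.80, 0.29, 0.90),
    ("Campfire Laughter", 0.70, 0.32, 0.30),
    ("Solitude and the Sea", 0.75, 0.32, 0.10),
    ("A Hand Holds the Past", 0.70, 0.33, 0.30),
]


def summarize(rows):
    """Return the mean of each metric across all rows, rounded to 3 places."""
    return {
        "aesthetic": round(mean(r[1] for r in rows), 3),
        "clip": round(mean(r[2] for r in rows), 3),
        "ai_likelihood": round(mean(r[3] for r in rows), 3),
    }


summary = summarize(results)
most_ai_looking = max(results, key=lambda r: r[3])

print(summary)
print("Most AI-looking image:", most_ai_looking[0])
```

Averaged this way, the aesthetic scores cluster around 0.7 and the prompt CLIP scores around 0.31, while the likelihood-of-AI ratings vary much more from image to image.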
Conclusion
The results indicate that the generative AI model understood the described scenes well and stayed close to the target aesthetic score, but was less reliable at reproducing the requested camera position. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is just below the “good” range of 0.5 to 0.75. This suggests that the model did not consistently capture the camera position described in the prompt.
- Shot Analysis: The model scored 0.56, which falls within the “good” range of 0.5 to 0.75. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.08, which lies inside the “very good” range of -0.2 to 0.1. By this measure the generated images stayed close to the aesthetic described in the prompts, although the per-image moods do not always match: the first prompt asked for “Epic, hopeful”, and the result was rated “melancholy, contemplative, vast”.
Overall, the model shows promise in understanding the scene and approximating the target aesthetic, but needs improvement in matching the requested camera position and the prompted mood.
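The range comparisons in the breakdown can be checked mechanically. Here is a minimal sketch, assuming the ranges quoted in this post are inclusive at both ends; the dictionary and function names are illustrative.

```python
# Ranges as quoted in the breakdown; treating both ends as inclusive
# is an assumption.
RANGES = {
    "good": (0.5, 0.75),
    "very good": (-0.2, 0.1),
}

# Metric name -> (reported score, range it is compared against).
SCORES = {
    "Camera Position": (0.45, "good"),
    "Shot Analysis": (0.56, "good"),
    "Aesthetic Analysis": (0.08, "very good"),
}


def in_range(score, bounds):
    """Return True when score lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= score <= high


for name, (score, label) in SCORES.items():
    verdict = "inside" if in_range(score, RANGES[label]) else "outside"
    print(f"{name}: {score} is {verdict} the '{label}' range {RANGES[label]}")
```

A table of named ranges keeps the thresholds in one place, so the same check can be reused if the ranges are revised.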