AI's Artistic Struggle: Capturing the 'Dramatic' Aesthetic with Stable Diffusion
The ‘dramatic’ aesthetic, characterized by strong contrasts, heightened emotions, and a sense of grandeur, is a powerful tool in visual storytelling. It’s often used to evoke feelings of awe, suspense, or even tragedy. But can AI truly capture this complex aesthetic? In this blog post, we explore where AI succeeds and where it falls short in generating images with a dramatic feel. The experiment covers ten images, each generated from a structured prompt and then scored for aesthetics, image quality, and fidelity to the prompt.
Created with: stability-ai-core
Silhouetted Against the Sunset: A Moment of Contemplation
A lone figure stands silhouetted against a vibrant sunset, their dog by their side. The vast landscape stretches out before them, evoking a sense of melancholy and introspection. The dramatic silhouette against the fiery sky creates a sense of mystery and intrigue, leaving the viewer to ponder the figure’s thoughts and emotions.
Prompt
Dogme 95: Epic, hopeful ; A lone figure, silhouetted against a setting sun; long shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure stands in silhouette, facing a setting sun over a vast landscape. There is a small animal beside them.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, mysterious
Quality
Entropy : 6.02
Noise : 70
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and has some noise, particularly in the sky. There is a slight halo effect around the sun, and the edges of the silhouette are not sharp.
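The prompts in this post share one layout: a style prefix ending in a colon, followed by six semicolon-separated parts. The sketch below splits a prompt into those parts so they can be compared with the evaluation fields. The field names (`style`, `mood`, `subject`, `shot`, `genre`, `setting`, `look`) are labels chosen for illustration; the post itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative assumptions, not terms from the post.
    style: str
    mood: str
    subject: str
    shot: str
    genre: str
    setting: str
    look: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split 'style: mood ; subject; shot; genre; setting; look' into fields."""
    style, _, rest = prompt.partition(":")
    fields = [part.strip() for part in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields after the style prefix, got {len(fields)}")
    return PromptParts(style.strip(), *fields)


parts = parse_prompt(
    "Dogme 95: Epic, hopeful ; A lone figure, silhouetted against a setting sun; "
    "long shot; Heroism; A vast, desolate landscape; cinematic"
)
print(parts.mood, "|", parts.shot)
```

Separating the requested mood and shot type this way makes it easier to set them next to the Mood and Shot values reported for each image.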
On the Edge: A Climber’s Focus in the Face of Danger
A close-up shot captures the intensity of a climber’s focus as they grip a rope on a sheer cliff face. The blurry forest valley below emphasizes the height and precariousness of their position, creating a sense of adventure and danger.
Prompt
Dogme 95: Suspenseful, thrilling ; A hand reaching out to grasp a rope ladder dangling from a cliff face; close-up; Adventure; A rocky, treacherous mountainside; cinematic
Characteristic
Shot : A close-up of a rock climber’s hand gripping a rope as they ascend a steep cliff face. The climber is facing the cliff, and the camera is positioned behind them, capturing their hand and a portion of the rocky wall. The background is a blurry expanse of green forest and mountains in the distance, giving a sense of depth and perspective.
Aesthetic Score : 0.7
Mood : intense, suspenseful, adventurous
Quality
Entropy : 6.90
Noise : 111
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
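The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's grayscale intensity histogram, which for 8-bit pixels ranges from 0 (a flat, single-tone image) to 8 bits (all 256 levels equally frequent). The reported values between 5 and 7 would fit that scale, but this is a sketch of the general idea, not the tool's actual method.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of intensity values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat = [128] * 1000              # one tone only: no information
two_tone = [0, 255] * 500        # two equally likely tones: 1 bit
full_range = list(range(256))    # every 8-bit level once: 8 bits

print(shannon_entropy(flat), shannon_entropy(two_tone), shannon_entropy(full_range))
```

Under this reading, a higher value means the tones are spread more evenly across the histogram, which tends to go with busier, more detailed scenes.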
Lost in the Blue Light: A Man’s Intense Focus on the Digital World
A man is deeply engrossed in his work, his face illuminated by the blue glow of a computer screen. The scene evokes a sense of intense focus and a techy atmosphere, with the blue light creating a dramatic and moody effect.
Prompt
Dogme 95: Intense, focused ; A player’s hands frantically manipulating a joystick, their face illuminated by the screen; medium shot; Gaming; A dimly lit room with a computer monitor glowing brightly; cinematic
Characteristic
Shot : A man is sitting at a desk, looking at a computer screen, working on a control panel. The room is dark and lit by the blue light of the screen.
Aesthetic Score : 0.7
Mood : serious, focused, intense
Quality
Entropy : 5.01
Noise : 71
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
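The Prompt Clip Score is likewise not defined in the post. A CLIP-based score is usually the cosine similarity between the embedding of the prompt text and the embedding of the image, so a higher value means the image is closer to what the prompt describes. The sketch below shows only that similarity step on toy vectors; the vectors are made up for illustration and are not real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text embedding and an image embedding.
text_vec = [1.0, 0.0]
image_vec = [1.0, 1.0]
print(round(cosine_similarity(text_vec, image_vec), 4))
```

In a real pipeline the two vectors would come from a CLIP text encoder and image encoder. Seen this way, the 0.28 reported for this image, the lowest of the ten, is consistent with the scene showing a control panel rather than the joystick the prompt asked for.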
Immerse Yourself in the Vibrant Heart of a Historic Market
Experience the lively atmosphere of a bustling outdoor market, where colorful fruit and vegetable stalls line the streets. The perspective draws you into the scene, showcasing the depth and scale of this traditional marketplace.
Prompt
Dogme 95: Energetic, lively ; A bustling marketplace, filled with vibrant colors and exotic goods; wide shot; Tourism; A crowded street in a foreign city; cinematic
Characteristic
Shot : A bustling outdoor market in a European city. Shoppers browse produce and other goods under awnings. The narrow street is lined with historic buildings. A church spire rises in the distance.
Aesthetic Score : 0.7
Mood : busy, lively, nostalgic
Quality
Entropy : 6.80
Noise : 109
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights.
Nostalgic Journey Through a Verdant Valley
A serene and peaceful scene unfolds as a blue and yellow train chugs through a lush valley, its smoke trailing behind like a wisp of memory. Rolling hills and green fields stretch out on either side, creating a sense of vastness and perspective. The train, a focal point of movement and grandeur, evokes a feeling of nostalgia and tranquility.
Prompt
Dogme 95: Nostalgic, contemplative ; A train speeding through a countryside landscape, blurring the scenery; long shot; Travel; Rolling hills and fields passing by; cinematic
Characteristic
Shot : A train, possibly a steam train, is traveling through a rolling countryside landscape. The train is visible in the foreground, and the landscape is visible in the background. The train is moving towards the right side of the image. The sky is clear and mostly blue, with a few white clouds visible.
Aesthetic Score : 0.8
Mood : tranquil, peaceful, nostalgic
Quality
Entropy : 6.15
Noise : 99
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.00
Image errors : No visible errors
Dinner with Friends: A Night of Laughter and Joy
This heartwarming scene captures the essence of a perfect dinner party. Four friends gather around a beautifully set table, their smiles radiating warmth and happiness. The warm lighting and intimate atmosphere create a sense of togetherness, making this a truly special moment.
Prompt
Dogme 95: Warm, intimate ; A family gathered around a dinner table, sharing a meal and laughter; medium shot; Family; A cozy, well-worn kitchen; cinematic
Characteristic
Shot : Four people are having dinner in a cozy dining room. They are seated around a table, eating and talking. The room is lit by a warm overhead light, giving the scene a pleasant and intimate atmosphere.
Aesthetic Score : 0.7
Mood : warm, cozy, intimate
Quality
Entropy : 6.60
Noise : 90
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise, particularly in the shadows. There are also some minor artifacts in the background.
A Moment of Shared Curiosity
Three figures, a man and two children, sit together, their gazes fixed on a point beyond the frame. The quiet intensity of their focus creates a palpable sense of anticipation, leaving the viewer to wonder what captivating scene lies before them.
Prompt
Dogme 95: Sad, poignant ; A single tear rolling down a child’s cheek as they watch their parents argue; close-up; Family; A dimly lit living room; cinematic
Characteristic
Shot : A close-up shot of three people, a man and two children, looking off-screen. The man’s face is mostly obscured by the shadow of his hat.
Aesthetic Score : 0.7
Mood : serious, pensive, dramatic
Quality
Entropy : 6.50
Noise : 88
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly out of focus. The lighting is uneven, creating harsh shadows.
Campfire Laughter: Friends Gather Under the Stars
A cozy scene of friendship and warmth unfolds around a crackling campfire. The firelight illuminates the faces of four friends, their laughter echoing through the forest. The dramatic shadows cast by the flames create a sense of intimacy and shared joy.
Prompt
Dogme 95: Joyful, communal ; A group of friends huddled together around a campfire, sharing stories and laughter; medium shot; Adventure; A dark forest with flickering flames; cinematic
Characteristic
Shot : A group of four friends are sitting around a campfire in a forest. They are all smiling and laughing, enjoying each other’s company.
Aesthetic Score : 0.7
Mood : happy, friendly, cozy
Quality
Entropy : 6.13
Noise : 98
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, and there is some noise in the background.
Solitude on the Edge: A Figure Contemplates the Vastness of the Sea
A lone figure stands silhouetted against the dramatic backdrop of a rugged coastline and crashing waves, evoking a sense of serenity and contemplation. The smallness of the figure against the vastness of the ocean creates a powerful sense of awe and solitude.
Prompt
Dogme 95: Awe-inspiring, contemplative ; A lone traveler gazing out at a vast ocean, their face filled with wonder; long shot; Travel; A dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A solitary figure stands on a cliff overlooking a vast expanse of ocean and rugged coastline. The clouds are dramatic and the light is soft.
Aesthetic Score : 0.8
Mood : serene, contemplative, dramatic
Quality
Entropy : 6.72
Noise : 90
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
A Glimpse into the Past: A Faded Photograph Evokes Nostalgia
A weathered hand cradles a cherished photograph, its faded colors and worn edges whispering tales of time gone by. The image, a family standing before a building, evokes a sense of nostalgia and invites viewers to imagine the stories behind the faces captured in time.
Prompt
Dogme 95: Melancholy, nostalgic ; A hand holding a worn photograph, the image blurred and faded; close-up; Family; A cluttered attic filled with old memories; cinematic
Characteristic
Shot : A person is holding a photograph of a family standing in front of a building. The photo is grainy and faded.
Aesthetic Score : 0.6
Mood : nostalgic, melancholic, intimate
Quality
Entropy : 6.71
Noise : 88
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.00
Image errors : The image is grainy and slightly blurred, indicating that it is an old photograph.
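With all ten results listed, the per-image numbers can be summarized. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the listings in this post. The short labels are shorthand for the image titles, and these averages are a separate summary, not the Camera Position, Shot Analysis, or Aesthetic Analysis scores.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten results in this post. Labels are shorthand.
RESULTS = [
    ("sunset", 0.6, 0.33, 0.20),
    ("climber", 0.7, 0.32, 0.20),
    ("blue light", 0.7, 0.28, 0.10),
    ("market", 0.7, 0.31, 0.10),
    ("train", 0.8, 0.32, 0.00),
    ("dinner", 0.7, 0.33, 0.20),
    ("curiosity", 0.7, 0.31, 0.10),
    ("campfire", 0.7, 0.33, 0.20),
    ("sea", 0.8, 0.31, 0.20),
    ("photograph", 0.6, 0.33, 0.00),
]


def column_mean(rows, index):
    """Average one numeric column, rounded to three decimals."""
    return round(mean(row[index] for row in rows), 3)


summary = {
    "aesthetic": column_mean(RESULTS, 1),
    "clip": column_mean(RESULTS, 2),
    "likelihood_of_ai": column_mean(RESULTS, 3),
}
print(summary)
```

The averages come out at about 0.70 for the aesthetic score, 0.317 for the prompt CLIP score, and 0.13 for the likelihood of AI, with little spread between images.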
Conclusion
The results are mixed: the generative AI model understood the scenes well, came in slightly below the “good” range on camera position, and matched the requested mood less reliably. Here’s a breakdown:
- Camera Position: The model scored 0.45, just below the “good” range of 0.5 to 0.75. This suggests that the model did not consistently capture the camera position described in the prompt.
- Shot Analysis: The model scored 0.56, which falls within the “good” range. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.08. This value lies inside the stated “very good” range of -0.2 to 0.1, near its upper bound, so the score alone does not show a large deviation. The mismatch is more visible in individual results: the “Epic, hopeful” prompt produced an image rated melancholy and contemplative, and the “Sad, poignant” prompt produced an image rated serious and pensive whose shot description mentions neither the tear nor the arguing parents.
Overall, the model shows promise in understanding the scene and comes close on camera position, but needs improvement in achieving the desired aesthetic.
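The range comparisons in the breakdown can be checked mechanically. The sketch below classifies each score against its range, treating the bounds as inclusive. The post gives no numeric “good” range for Shot Analysis, so reusing 0.5 to 0.75 there is an assumption. Under this reading, 0.45 falls below its range, while 0.56 and 0.08 both fall within theirs.

```python
def classify(score, low, high):
    """Report whether a score is below, within, or above an inclusive range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (score, range low, range high) as stated in the conclusion.
CHECKS = {
    "Camera Position": (0.45, 0.5, 0.75),
    # Assumption: Shot Analysis uses the same "good" range as Camera Position.
    "Shot Analysis": (0.56, 0.5, 0.75),
    "Aesthetic Analysis": (0.08, -0.2, 0.1),
}

verdicts = {name: classify(*values) for name, values in CHECKS.items()}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```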
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://stability.ai