AI Struggles to Capture the 'Style-Aesthetic' - A Case Study with Stability-ai-ultra
The ‘style-aesthetic’ is a powerful tool in visual storytelling, allowing creators to evoke specific emotions and atmospheres. It encompasses elements like color palettes, lighting, composition, and even the choice of objects within a scene. This experiment explored how well a generative AI model could understand and implement this ‘style-aesthetic’ in its image generation. While the model showed some promise in matching the requested camera position and shot composition, it struggled to capture the desired aesthetic, highlighting the ongoing challenges in teaching AI to understand and implement complex artistic concepts.
Created with: stability-ai-ultra
A Solitary Journey Through the Golden Desert
A lone figure traverses a breathtaking desert landscape, bathed in the warm glow of the setting sun. Towering sandstone cliffs frame the scene, emphasizing the vastness and solitude of the journey. This image evokes a sense of adventure and the allure of the unknown.
Prompt
Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; medium shot; Adventure; a vast, sun-drenched desert landscape with ancient ruins in the distance; cinematic
Characteristic
Shot : A lone traveler, possibly a ranger or explorer, walks through a vast, dry, and desolate desert landscape. The sand dunes and canyons are the only landmarks in the distance, while the sky is a beautiful blue with white puffy clouds.
Aesthetic Score : 0.7
Mood : epic, lonely, adventurous
Quality
Entropy : 6.86
Noise : 92
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some minor artifacts in the clouds and the sand, indicating that it could be a digital painting. The shadows and light seem a bit artificial and lack natural depth.
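The article does not say how the Entropy figure in each Quality block is computed. A common choice is the Shannon entropy of the pixel-intensity histogram, measured in bits, and the sketch below assumes that definition. The function name `shannon_entropy` and the sample pixel lists are illustrative only and do not come from the original pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: the article's "Entropy" is a histogram entropy like this
    one; the source does not confirm the exact definition.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1 / p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A single flat gray value carries no information.
flat = [128] * 1024
# An even spread over all 256 levels reaches the 8-bit ceiling.
uniform = list(range(256)) * 4

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this assumption an 8-bit grayscale image can score at most 8, so the values reported in this post (6.35 to 6.89) would indicate tonally varied images that still fall short of a perfectly even histogram.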
Candlelit Board Game: A Moment of Cozy Play
Three children huddle around a board game, their hands illuminated by the warm glow of candlelight. The intimate framing and low light create a sense of cozy playfulness, capturing a moment of shared joy and connection.
Prompt
Vintage: Nostalgic, intimate, playful ; A group of children playing a board game; close-up; Gaming; a dimly lit room with a worn wooden table and flickering candlelight; cinematic
Characteristic
Shot : Three boys are playing a board game by candlelight. The lighting is warm and intimate. There is a cozy atmosphere.
Aesthetic Score : 0.7
Mood : cozy, intimate, playful
Quality
Entropy : 6.49
Noise : 84
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors
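The prompts in this post all follow the same semicolon-separated pattern: a style label with three moods, a subject, a shot type, a theme, a setting, and a closing keyword. The sketch below splits one prompt into those parts. The field names (`style`, `moods`, `subject`, `shot`, `theme`, `setting`, `finish`) are my own labels for illustration; the article does not name them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Field names are hypothetical labels, not taken from the article.
    style: str
    moods: List[str]
    subject: str
    shot: str
    theme: str
    setting: str
    finish: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form 'Style: m1, m2, m3 ; subject; shot; theme; setting; finish'."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    # The first field holds the style label and the comma-separated moods.
    style, _, mood_text = fields[0].partition(":")
    moods = [mood.strip().lower() for mood in mood_text.split(",")]
    return PromptParts(style.strip(), moods, *fields[1:])


PROMPT = (
    "Vintage: Nostalgic, intimate, playful ; A group of children playing a board game; "
    "close-up; Gaming; a dimly lit room with a worn wooden table and flickering candlelight; "
    "cinematic"
)

parts = parse_prompt(PROMPT)
print(parts.shot, parts.moods)
```

Separating the fields this way makes it easier to compare each requested element, such as the shot type or the moods, against the Shot and Mood lines reported for the generated image.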
A Moment of Farewell: Capturing Nostalgia on the Platform
A woman in a vibrant yellow dress stands on a train platform, her gaze fixed on a departing steam train. The smoke billows, creating a sense of movement and adding a touch of drama to the scene. The woman’s pose and the angle of the shot evoke a feeling of romance and nostalgia, capturing a fleeting moment of farewell.
Prompt
Vintage: Romantic, adventurous, hopeful ; A young woman in a vintage dress standing on a train platform; long shot; Travel; a bustling train station with steam locomotives and vintage luggage; cinematic
Characteristic
Shot : A young woman in a yellow dress stands on a train platform, looking back at a steam locomotive with smoke billowing out of its chimney. The platform is wet and the atmosphere is nostalgic.
Aesthetic Score : 0.8
Mood : nostalgic, romantic, whimsical
Quality
Entropy : 6.73
Noise : 79
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, with some loss of detail in the highlights. The lighting is a bit flat.
Heroic Rescue Amidst the Flames
A firefighter, silhouetted against the intense blaze, bravely carries a child to safety. The dramatic contrast of light and shadow highlights the courage and selflessness of those who risk their lives to save others.
Prompt
Vintage: Dramatic, heroic, suspenseful ; A firefighter carrying a child through a burning building; close-up; Heroism; a smoky, chaotic scene with flames and debris; cinematic
Characteristic
Shot : A firefighter is carrying a child through a burning building. The background is filled with smoke and flames.
Aesthetic Score : 0.7
Mood : dramatic, heroic, intense
Quality
Entropy : 6.61
Noise : 86
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry in places. There is some noise in the darker areas of the image. The shadows are a bit harsh and unnatural.
Campfire Serenity Under the Milky Way
A tranquil scene of four friends gathered around a crackling campfire in a moonlit forest. The lake reflects the starry sky, creating a sense of peace and nostalgia. The dramatic contrast between the warm firelight and the dark surroundings, with the Milky Way shimmering above, adds a touch of magic to this serene moment.
Prompt
Vintage: Warm, nostalgic, peaceful ; A family gathered around a campfire; wide shot; Family; a serene forest setting with stars twinkling in the night sky; cinematic
Characteristic
Shot : Four people are sitting around a campfire in a forest at night. There is a lake in the background and a view of the Milky Way in the sky.
Aesthetic Score : 0.8
Mood : peaceful, magical, serene
Quality
Entropy : 6.35
Noise : 106
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some of the trees and other background elements are slightly blurry and pixelated.
Classic Car Adventure Through Majestic Mountains
A red classic car cruises along a winding mountain road, with snow-capped peaks in the background. The scene evokes a sense of serene adventure and nostalgia, capturing the timeless beauty of the journey.
Prompt
Vintage: Romantic, adventurous, nostalgic ; A vintage car driving down a winding mountain road; long shot; Tourism; a scenic mountain landscape with lush forests and snow-capped peaks; cinematic
Characteristic
Shot : A red classic car drives along a winding road through a mountain pass with lush green forests and snow-capped mountains in the distance.
Aesthetic Score : 0.8
Mood : serene, adventurous, nostalgic
Quality
Entropy : 6.89
Noise : 103
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable artifacts or errors.
Soaring Through Time: A Vintage Biplane Takes Flight
A classic biplane cuts through a sea of clouds, bathed in golden sunlight. This serene and nostalgic scene evokes a sense of adventure and freedom, capturing the timeless allure of flight.
Prompt
Vintage: Exhilarating, adventurous, free ; A pilot in a vintage biplane soaring through the clouds; close-up; Adventure; a breathtaking view of a vast, blue sky with fluffy white clouds; cinematic
Characteristic
Shot : A vintage biplane flying high above a sea of clouds, with a single pilot at the controls.
Aesthetic Score : 0.7
Mood : nostalgic, adventurous, serene
Quality
Entropy : 6.78
Noise : 83
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.70
Image errors : The clouds appear somewhat artificial and lack natural texture. Some minor blurring around the plane’s edges may be due to post-processing.
Soldiers March Through a World of Ashes
A haunting image captures the somber reality of war, as soldiers navigate a war-torn street shrouded in smoke and debris. The dramatic lighting and composition evoke a sense of foreboding and despair, reflecting the melancholic mood of the scene.
Prompt
Vintage: Grim, heroic, determined ; A group of soldiers marching through a war-torn city; medium shot; Heroism; a desolate cityscape with rubble and smoke; cinematic
Characteristic
Shot : A group of soldiers walks down a war-torn street with rubble and debris scattered around them. In the background, a large explosion sends a plume of smoke and fire into the sky.
Aesthetic Score : 0.75
Mood : dramatic, war-torn, somber
Quality
Entropy : 6.83
Noise : 102
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.75
Image errors : There is slight artifacting in the smoke and fire, some blurring around the soldiers’ faces, and some blockiness and slight texture repetition in the rubble.
A Moment of Elegant Intimacy: A Couple’s Dance in the Bokeh Lights
In the heart of a grand ballroom, a couple shares a romantic dance, their eyes locked in a loving gaze. The man, dashing in a tuxedo, and the woman, resplendent in a long gown, are surrounded by the soft glow of chandeliers and the blurred lights of other guests. The scene, with its shallow depth of field and soft lighting, exudes elegance and intimacy, capturing the beauty of their connection.
Prompt
Vintage: Romantic, elegant, nostalgic ; A couple dancing in a vintage ballroom; close-up; Tourism; a grand ballroom with chandeliers and elegant guests; cinematic
Characteristic
Shot : A couple is dancing at a formal ball or wedding reception. They are the main focus of the image, and the background is filled with warm lights and blurred figures of other guests.
Aesthetic Score : 0.7
Mood : romantic, elegant, intimate
Quality
Entropy : 6.50
Noise : 85
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, resulting in some blown-out highlights in the chandeliers. Some of the guests in the background are also a bit blurry.
Lost in the Shadows of Discovery
A young boy, illuminated by candlelight, pores over an ancient map in a dimly lit room. His focused expression hints at a mystery unfolding, leaving the viewer to wonder what secrets lie within the parchment.
Prompt
Vintage: Curious, adventurous, hopeful ; A young boy gazing at a vintage map; close-up; Adventure; a dimly lit room with a worn wooden table and a flickering candlelight; cinematic
Characteristic
Shot : A young boy is looking at a map lit by a candle. The scene is set in a dimly lit room with a warm, cozy atmosphere.
Aesthetic Score : 0.7
Mood : mysterious, contemplative, adventurous
Quality
Entropy : 6.80
Noise : 88
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Conclusion
The results show that the generative AI model did a passable job with camera position and shot analysis but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. This means the camera position in the generated images was somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.425, which is also considered okay. This indicates that the shot composition of the generated images was somewhat different from what the prompts called for.
- Aesthetic Analysis: The model scored 0.05, which is considered pretty bad. This means the aesthetic of the generated images was significantly different from what the prompts called for.
Overall, the model struggles most with understanding and implementing the desired aesthetic. It does a decent job with camera position and shot analysis, but there is room for improvement in all three areas.
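The three scores in this breakdown are not reported per image, so they cannot be recomputed from the reports above. The per-image metrics that are listed can be summarized, though. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values transcribed from the ten reports, in order of appearance. The variable names are illustrative, and a plain mean is my own choice of summary, not the article's method.

```python
from statistics import mean

# Values transcribed from the ten image reports, in order of appearance.
metrics = {
    "aesthetic_score": [0.7, 0.7, 0.8, 0.7, 0.8, 0.8, 0.7, 0.75, 0.7, 0.7],
    "prompt_clip_score": [0.32, 0.36, 0.37, 0.31, 0.36, 0.33, 0.35, 0.34, 0.35, 0.36],
    "likelihood_of_ai": [0.80, 0.20, 0.20, 0.20, 0.20, 0.10, 0.70, 0.75, 0.20, 0.10],
}

# Plain arithmetic mean per metric, rounded for display.
summary = {name: round(mean(values), 3) for name, values in metrics.items()}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The per-image Aesthetic Score averages about 0.735, far above the 0.05 reported for Aesthetic Analysis. That gap suggests the two measure different things: the former appears to rate each image on its own, while the latter rates how closely the image matches the aesthetic requested in the prompt.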