AI Struggles to Capture the 'Style-Aesthetic' - A Case Study with Stable Diffusion
The ‘style-aesthetic’ is a powerful tool in visual storytelling, allowing creators to evoke specific emotions and atmospheres. It encompasses elements like color palettes, lighting, composition, and even the choice of objects within a scene. This experiment explored how well a generative AI model could understand and implement this ‘style-aesthetic’ in its image generation. While the model showed some promise in matching the requested camera position and shot composition, it struggled to capture the desired aesthetic, highlighting the ongoing challenges in teaching AI to understand and implement complex artistic concepts.
Created with: stability-ai-core
Lost in the Sands: A Lone Explorer Contemplates the Vastness of the Desert
A solitary figure stands on a sandy dune, gazing out at a sprawling desert landscape. The sun blazes in the bright blue sky, casting long shadows across the dunes. In the distance, a ruined building hints at a forgotten past. This evocative scene captures the essence of solitude, adventure, and exploration, leaving the viewer to ponder the mysteries of the desert.
Prompt
Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; medium shot; Adventure; a vast, sun-drenched desert landscape with ancient ruins in the distance; cinematic
Characteristic
Shot : A lone figure stands on a sand dune, looking out over a vast desert landscape. In the distance are the ruins of an ancient city. The sun is shining brightly, and the sky is a clear blue.
Aesthetic Score : 0.8
Mood : mysterious, desolate, adventurous
Quality
Entropy : 6.69
Noise : 84
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No artifacts, slight blur on the subject.
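All ten prompts in this experiment follow the same layout: a style label, a comma-separated list of moods, then the subject, shot type, theme, setting, and a closing look keyword, separated by semicolons. The sketch below splits a prompt into those parts so the requested shot type or moods can be compared with what the evaluation reports. The field names (`style`, `moods`, `subject`, `shot`, `theme`, `setting`, `look`) are labels assumed here for illustration; the article does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are assumptions made for this sketch, not the author's terms.
    style: str
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    look: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'Style: moods ; subject; shot; theme; setting; look' prompt."""
    style, rest = prompt.split(":", 1)
    fields = [field.strip() for field in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    moods, subject, shot, theme, setting, look = fields
    return PromptParts(
        style=style.strip(),
        moods=[mood.strip() for mood in moods.split(",")],
        subject=subject,
        shot=shot,
        theme=theme,
        setting=setting,
        look=look,
    )


parts = parse_prompt(
    "Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; "
    "medium shot; Adventure; a vast, sun-drenched desert landscape with "
    "ancient ruins in the distance; cinematic"
)
print(parts.shot)   # medium shot
print(parts.moods)  # ['Epic', 'adventurous', 'hopeful']
```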
Cozy Childhood Memories: A Board Game Under Warm Lamplight
Three children are engrossed in a board game, bathed in the warm glow of a lamp. The dimly lit room, filled with bookshelves and framed pictures, evokes a sense of cozy nostalgia and childhood innocence. The soft lighting and focused composition highlight the children’s playful yet concentrated expressions, creating a sense of intimacy and shared joy.
Prompt
Vintage: Nostalgic, intimate, playful ; A group of children playing a board game; close-up; Gaming; a dimly lit room with a worn wooden table and flickering candlelight; cinematic
Characteristic
Shot : Three children are playing a board game in a dimly lit room. The room is decorated with old-fashioned furniture and pictures. The children are focused on the game and look intrigued.
Aesthetic Score : 0.7
Mood : nostalgic, cozy, serious
Quality
Entropy : 6.30
Noise : 83
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
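The article does not say how the Entropy value under Quality is computed. One plausible reading, and only an assumption here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally common). The reported values of roughly 6.0 to 6.8 would fit that scale. A minimal sketch using only the standard library, with the pixel values passed in as a plain sequence of integers:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of grayscale pixel values.

    Assumed metric: the article does not confirm this is how its
    'Entropy' figure was produced.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A single-tone image carries no information; a perfectly even spread
# over all 256 gray levels reaches the 8-bit maximum.
flat = [128] * 1000
even = list(range(256)) * 4
print(shannon_entropy(flat))
print(shannon_entropy(even))
```

Under this reading, a higher value means the tones are spread across more of the available range, which says something about tonal variety but nothing about whether the image matches the requested style.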
A Moment of Departure: Nostalgia and Mystery on the Platform
A woman in a vintage dress stands on a train platform, her gaze fixed on the departing steam locomotive. The swirling smoke and fog create a sense of nostalgia and mystery, hinting at a journey both physical and emotional. Her posture and the briefcase she holds suggest a life in motion, leaving the viewer to wonder about her destination and the stories she carries within.
Prompt
Vintage: Romantic, adventurous, hopeful ; A young woman in a vintage dress standing on a train platform; long shot; Travel; a bustling train station with steam locomotives and vintage luggage; cinematic
Characteristic
Shot : A woman in a vintage dress stands on a train platform, waiting for a steam train to depart. The train is in the background, with smoke billowing from its chimney. The platform is empty except for the woman and a few distant figures.
Aesthetic Score : 0.8
Mood : nostalgic, romantic, melancholic
Quality
Entropy : 6.61
Noise : 96
Prompt Clip Score : 0.39
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight artifacts around the edges of the train, particularly around the wheels. There are also some minor color banding issues in the sky. The edges of the image are not sharp.
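The Prompt Clip Score is presumably a CLIP-based similarity between the prompt text and the generated image; the article does not describe the exact computation. A common way to get such a number is the cosine similarity between the text embedding and the image embedding. The sketch below shows only that final step, using short made-up vectors in place of real CLIP embeddings, so the vectors and the function name are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a prompt embedding and an image embedding.
text_embedding = [1.0, 0.0]
image_embedding = [1.0, 1.0]
print(round(cosine_similarity(text_embedding, image_embedding), 4))  # 0.7071
```

If the score is computed this way, it measures how closely the image content matches the prompt wording overall, which is a different question from whether the image carries the requested ‘Vintage’ look.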
Hero in the Ashes: Firefighter Rescues Child from Burning Ruins
A firefighter, a beacon of hope amidst the devastation, carries a child to safety through the smoldering remains of a fire. The dramatic scene captures the intensity of the moment, contrasting the firefighter’s calm with the raging inferno and the smoke-filled air.
Prompt
Vintage: Dramatic, heroic, suspenseful ; A firefighter carrying a child through a burning building; close-up; Heroism; a smoky, chaotic scene with flames and debris; cinematic
Characteristic
Shot : A firefighter in full gear is carrying a child through a burning building. The scene is chaotic and dangerous, with fire and smoke filling the air.
Aesthetic Score : 0.7
Mood : intense, heroic, somber
Quality
Entropy : 6.80
Noise : 105
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image is slightly blurry, the flames and smoke are not very realistic, and the composition is a bit cluttered.
Campfire Tales Under the Milky Way
A group of friends gather around a crackling campfire, sharing stories and laughter under a breathtaking night sky. The tent in the background and the visible Milky Way evoke a sense of adventure and nostalgia, creating a cozy and inviting atmosphere.
Prompt
Vintage: Warm, nostalgic, peaceful ; A family gathered around a campfire; wide shot; Family; a serene forest setting with stars twinkling in the night sky; cinematic
Characteristic
Shot : A group of friends are sitting around a campfire in a forest at night. They are looking at the fire and talking. The night sky is full of stars and a shooting star is visible in the top center of the image.
Aesthetic Score : 0.75
Mood : cozy, warm, adventurous
Quality
Entropy : 6.03
Noise : 102
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight artifacts and blurriness, particularly in the background and on the trees.
Vintage Journey Through Majestic Peaks
A classic car winds its way through a mountain pass, bathed in the golden light of the setting sun. The vastness of the snow-capped peaks creates a sense of awe and adventure, while the vintage car evokes a feeling of nostalgia and serenity.
Prompt
Vintage: Romantic, adventurous, nostalgic ; A vintage car driving down a winding mountain road; long shot; Tourism; a scenic mountain landscape with lush forests and snow-capped peaks; cinematic
Characteristic
Shot : A vintage car drives along a winding mountain road that leads through a valley with snow-capped peaks in the background. The sky is a bright blue with fluffy white clouds.
Aesthetic Score : 0.8
Mood : serene, adventurous, nostalgic
Quality
Entropy : 6.78
Noise : 105
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Soaring Through Nostalgia: A Vintage Biplane’s Journey Above the Clouds
Experience the thrill of adventure and the serenity of the sky as a vintage biplane glides through a sea of fluffy white clouds. This nostalgic scene evokes a sense of freedom and exploration, capturing the beauty of a bygone era.
Prompt
Vintage: Exhilarating, adventurous, free ; A pilot in a vintage biplane soaring through the clouds; close-up; Adventure; a breathtaking view of a vast, blue sky with fluffy white clouds; cinematic
Characteristic
Shot : A vintage biplane flies above a sea of clouds. The sky is clear blue with fluffy white clouds.
Aesthetic Score : 0.8
Mood : nostalgic, adventurous, serene
Quality
Entropy : 6.72
Noise : 94
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Grim Reality: Soldiers Navigate a War-Torn City
A haunting image captures the somber reality of war, as soldiers traverse a street ravaged by conflict. The scene is filled with rubble, smoke, and a palpable sense of loss and destruction.
Prompt
Vintage: Grim, heroic, determined ; A group of soldiers marching through a war-torn city; medium shot; Heroism; a desolate cityscape with rubble and smoke; cinematic
Characteristic
Shot : A group of soldiers marching through a war-torn city street. The street is littered with rubble and debris, and smoke fills the air. The soldiers are all wearing uniforms and carrying weapons. The lighting is dim and overcast, creating a somber mood.
Aesthetic Score : 0.7
Mood : somber, bleak, dramatic
Quality
Entropy : 6.84
Noise : 109
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant artifacts or errors.
A Moment of Romance in a Grand Ballroom
A couple dances in the spotlight, their love story unfolding amidst the elegance of a grand ballroom. The soft lighting and blurred background create an intimate atmosphere, capturing the essence of a romantic and nostalgic moment.
Prompt
Vintage: Romantic, elegant, nostalgic ; A couple dancing in a vintage ballroom; close-up; Tourism; a grand ballroom with chandeliers and elegant guests; cinematic
Characteristic
Shot : A couple is dancing in a grand ballroom. The room is filled with other people dancing and socializing.
Aesthetic Score : 0.7
Mood : romantic, elegant, festive
Quality
Entropy : 6.73
Noise : 97
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a slight blurriness to the image, particularly in the background. The lighting is a bit uneven, with some areas appearing slightly overexposed.
A Boy’s Journey: A Candlelit Map and Dreams of Adventure
A young boy, bathed in the warm glow of candlelight, traces a route across an antique world map. The scene evokes a sense of mystery, adventure, and nostalgia, with the low-key lighting and focused expression adding to the dramatic effect.
Prompt
Vintage: Curious, adventurous, hopeful ; A young boy gazing at a vintage map; close-up; Adventure; a dimly lit room with a worn wooden table and a flickering candlelight; cinematic
Characteristic
Shot : A young boy is sitting at a table, intently studying a map with a pencil in hand. A world map hangs on the wall behind him, and a lit candle illuminates the scene.
Aesthetic Score : 0.7
Mood : intrigued, adventurous, focused
Quality
Entropy : 6.50
Noise : 90
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
Conclusion
The results show that the generative AI model did an acceptable job on camera position and shot analysis, but struggled badly with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. The camera position in the generated images was somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.425, which is also considered okay. The shot composition of the generated images was somewhat different from what the prompts called for.
- Aesthetic Analysis: The model scored 0.05, which is considered pretty bad. The aesthetic of the generated images was significantly different from what the prompts called for.
Overall, the model's main weakness is understanding and implementing the desired aesthetic. Camera position and shot composition come out passable, but none of the three scores is strong, so there is room for improvement in all areas.
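The per-image numbers listed for the ten images can be summarized in a few lines of code. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from each entry and averages them; the short labels are shorthand invented here for the image titles. The camera position, shot, and aesthetic analysis scores in the conclusion come from a separate comparison that these per-image values do not reproduce.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries in this article. Labels are shorthand.
RESULTS = [
    ("desert explorer", 0.80, 0.32, 0.10),
    ("board game",      0.70, 0.34, 0.10),
    ("train platform",  0.80, 0.39, 0.10),
    ("firefighter",     0.70, 0.32, 0.50),
    ("campfire",        0.75, 0.36, 0.20),
    ("mountain car",    0.80, 0.35, 0.10),
    ("biplane",         0.80, 0.35, 0.10),
    ("soldiers",        0.70, 0.33, 0.30),
    ("ballroom",        0.70, 0.36, 0.10),
    ("boy with map",    0.70, 0.37, 0.10),
]

mean_aesthetic = mean(row[1] for row in RESULTS)
mean_clip = mean(row[2] for row in RESULTS)
mean_ai_likelihood = mean(row[3] for row in RESULTS)
best_clip_label = max(RESULTS, key=lambda row: row[2])[0]

print(f"mean aesthetic score:    {mean_aesthetic:.3f}")      # 0.745
print(f"mean prompt CLIP score:  {mean_clip:.3f}")           # 0.349
print(f"mean likelihood of AI:   {mean_ai_likelihood:.3f}")  # 0.170
print(f"best prompt match:       {best_clip_label}")         # train platform
```

The averages show how narrow the spread is: every image lands between 0.7 and 0.8 on Aesthetic Score and between 0.32 and 0.39 on Prompt Clip Score, so these per-image metrics do little to separate the images from one another.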