AI's Artistic Journey: Capturing Scenes, Missing the Vibe with Stable Diffusion
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals based on text prompts. However, capturing the essence of a desired aesthetic style remains a challenge. This blog post explores the results of testing an AI model’s ability to generate images based on specific scenes and aesthetics. We’ll delve into the model’s performance in capturing camera positions, shot types, and overall aesthetic style, highlighting areas where it excels and where it needs improvement.
For example, imagine a scene described as ‘a lone knight, silhouetted against a setting sun; wide shot; Heroism; A vast, desolate battlefield littered with fallen soldiers.’ While the AI might accurately depict a knight and a battlefield, it might struggle to convey the dramatic, heroic tone intended by the ‘Heroism’ aesthetic. This highlights the need for further development in AI’s understanding of artistic nuances and emotional impact.
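To make the structure of such a scene description concrete, here is a minimal sketch that splits it into its semicolon-separated parts. The `SceneSpec` class and its field names (`subject`, `shot_type`, `aesthetic`, `setting`) are assumptions made for illustration; the post does not name these fields. The full prompts shown with each image also carry a leading style clause and a trailing "cinematic" modifier, which this sketch does not handle.

```python
from dataclasses import dataclass


@dataclass
class SceneSpec:
    """Hypothetical container; the field names are assumptions, not from the post."""
    subject: str
    shot_type: str
    aesthetic: str
    setting: str


def parse_scene(description: str) -> SceneSpec:
    """Split a 'subject; shot; aesthetic; setting' description on semicolons."""
    parts = [p.strip() for p in description.split(";") if p.strip()]
    if len(parts) != 4:
        raise ValueError(f"expected 4 semicolon-separated fields, got {len(parts)}")
    return SceneSpec(*parts)


example = parse_scene(
    "a lone knight, silhouetted against a setting sun; wide shot; Heroism; "
    "A vast, desolate battlefield littered with fallen soldiers."
)
print(example.shot_type)  # wide shot
print(example.aesthetic)  # Heroism
```

Keeping the shot type and the aesthetic as separate fields mirrors how the results are judged later: one score for camera position, one for the scene, one for the aesthetic.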
Created with: stability-ai-core
Silhouetted Against Glory: A Knight’s Epic Return
A lone knight, silhouetted against a breathtaking sunset, rides towards a distant city. The scene evokes a sense of epic grandeur and melancholic beauty, hinting at a victorious return or a final stand. The dramatic contrast of light and shadow creates a powerful visual impact, leaving the viewer pondering the knight’s destiny.
Prompt
Baroque: Epic, melancholic ; A lone knight, silhouetted against a setting sun; wide shot; Heroism; A vast, desolate battlefield littered with fallen soldiers.; cinematic
Characteristic
Shot : A lone knight in full armor, riding a horse, silhouetted against a dramatic sunset, with a medieval city in the background and a line of soldiers in the distance.
Aesthetic Score : 0.7
Mood : epic, somber, dramatic
Quality
Entropy : 6.63
Noise : 82
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.40
Image errors : The image has some minor artifacts and noise in the background. Some details in the city and the soldiers are blurred.
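The post does not say how the Entropy value is computed. One plausible reading, and it is only an assumption, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (every gray level equally common). The reported values between 6.14 and 6.88 would fit that scale. A minimal sketch in pure Python, using made-up pixel lists rather than a real image:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over the gray levels that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = [128] * 1000             # a single gray level: no information
uniform = list(range(256)) * 4  # every gray level equally often

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this reading, a higher value means a richer spread of tones, not necessarily a better image.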
Majestic Ship Battles Stormy Seas
A grand sailing ship, adorned with intricate details, faces a raging storm with lightning illuminating the sky. A smaller vessel appears on the horizon, adding to the sense of adventure and peril. This dramatic scene captures the raw power of nature and the resilience of the human spirit.
Prompt
Baroque: Dramatic, thrilling ; A pirate ship, sails billowing in the wind, crashing through stormy waves; dynamic, close-up; Adventure; A raging sea with lightning illuminating the sky.; cinematic
Characteristic
Shot : A large, ornate sailing ship is sailing through a stormy sea with dramatic clouds and lightning in the background.
Aesthetic Score : 0.8
Mood : dramatic, adventurous, ominous
Quality
Entropy : 6.85
Noise : 89
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The water surface is somewhat blurry and the waves are repetitive.
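The Prompt Clip Score is not defined in the post either. A common convention, assumed here, is the cosine similarity between a CLIP embedding of the prompt and a CLIP embedding of the image. The sketch below shows only the similarity step; the two vectors are toy stand-ins, since producing real embeddings requires a CLIP model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text embedding and an image embedding (not real CLIP output).
text_embedding = [1.0, 0.0, 0.0]
image_embedding = [3.0, 4.0, 0.0]

print(round(cosine_similarity(text_embedding, image_embedding), 2))  # 0.6
```

If the score is computed this way, values around 0.3 are typical for reasonably matched text and image pairs, so the narrow 0.29 to 0.33 spread in this post says little about which images follow their prompts best.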
In the Zone: A Gamer’s Focus
A close-up shot captures the intensity of a gamer’s focus, their hand gripping a controller in front of two glowing cityscapes. The image evokes a sense of darkness, technology, and unwavering concentration.
Prompt
Baroque: Intense, focused ; A player’s hand, gripping a controller, illuminated by the glow of a screen; close-up; Gaming; A chaotic, pixelated cityscape on the screen.; cinematic
Characteristic
Shot : A hand holding a video game controller in front of two computer monitors displaying a city at night.
Aesthetic Score : 0.6
Mood : focused, tech, gamer
Quality
Entropy : 6.28
Noise : 78
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, especially in the darker areas. There is also some slight chromatic aberration.
Golden Hour Elegance: A Grand Building Bathed in Sunlight
Capture the warmth and grandeur of a stunning architectural masterpiece as the sun casts a golden glow on its ornate details. The bustling courtyard below adds a touch of life to this elegant scene.
Prompt
Baroque: Opulent, vibrant ; A grand, ornate palace, bathed in golden sunlight; wide shot; Tourism; A bustling marketplace with vibrant colors and exotic goods.; cinematic
Characteristic
Shot : A large, ornate building with a golden facade and a blue dome. The building is surrounded by a plaza with people walking around.
Aesthetic Score : 0.8
Mood : grand, opulent, historical
Quality
Entropy : 6.83
Noise : 104
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some minor lens distortion and noise is visible in the image.
A Solitary Journey Through a Snowy Valley
A lone hiker traverses a winding road leading into a breathtaking snowy valley, dwarfed by the majestic mountain range in the background. The scene evokes a sense of serenity, peace, and adventure, with the small figure emphasizing the feeling of isolation and exploration.
Prompt
Baroque: Awe-inspiring, contemplative ; A lone traveler, gazing out at a breathtaking mountain range; medium shot; Travel; A vast, snow-capped mountain range with a winding road leading into the distance.; cinematic
Characteristic
Shot : A lone figure walks down a snow-covered road, with a vast mountain range in the background. The sky is clear and blue.
Aesthetic Score : 0.8
Mood : serene, contemplative, adventurous
Quality
Entropy : 6.71
Noise : 88
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts
Secrets Whispered in the Shadows
A group of four gather around a dimly lit table, their expressions shrouded in mystery. Antique furniture and flickering firelight create an intimate and cozy atmosphere, hinting at secrets shared in the hushed darkness.
Prompt
Baroque: Warm, intimate ; A family gathered around a fireplace, sharing stories and laughter; medium shot; Family; A cozy, candlelit room with portraits of ancestors on the walls.; cinematic
Characteristic
Shot : A group of four people are gathered around a table in a dimly lit room, likely a drawing room, with a roaring fireplace behind them. The room has a cozy and intimate atmosphere with warm lighting, enhanced by the flickering candles and the fire.
Aesthetic Score : 0.7
Mood : cozy, mysterious, intimate
Quality
Entropy : 6.14
Noise : 86
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.00
Image errors : No significant errors.
Knight’s Charge: A Moment of Epic Valor
A valiant knight, clad in shining armor, bursts through a cloud of dust on a battlefield, his determined expression reflecting the urgency of the moment. A fallen enemy lies in the foreground, while a medieval city looms in the background, adding to the epic scale of the scene. This dramatic image captures the bravery and strength of the knight, leaving a lasting impression of heroism.
Prompt
Baroque: Brave, determined ; A knight, charging into battle, his armor gleaming in the sunlight; dynamic, close-up; Heroism; A chaotic battlefield with smoke and dust swirling in the air.; cinematic
Characteristic
Shot : A knight in full armor is charging into battle, with a sword in each hand. He is surrounded by other knights, and there is dust and smoke in the air.
Aesthetic Score : 0.8
Mood : epic, dramatic, heroic
Quality
Entropy : 6.86
Noise : 102
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight noise is visible in the dust and smoke, but it does not detract from the image. The lighting and sharpness seem well done.
Hidden Treasure Gleams in Candlelight
A mysterious cave, illuminated by flickering candlelight, reveals a treasure chest overflowing with jewels. The scene evokes a sense of adventure and wonder, with the dramatic play of light and shadow highlighting the riches within.
Prompt
Baroque: Intriguing, mysterious ; A treasure chest, overflowing with gold and jewels, illuminated by a single candle; close-up; Adventure; A dark, mysterious cave with cobwebs and shadows.; cinematic
Characteristic
Shot : A treasure chest overflowing with jewels is nestled within a dark cave illuminated by candlelight.
Aesthetic Score : 0.7
Mood : mysterious, romantic, alluring
Quality
Entropy : 6.33
Noise : 90
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image has a few minor artifacts, particularly around the edges of the chest and the jewels. The lighting seems slightly unnatural in some areas.
Dreamlike Waterfall in a Fantastical Landscape
A magical scene unfolds with a grand waterfall cascading from a cave into a river, surrounded by towering mountains and a small city perched on a cliff. The contrast of light and dark, the scale of the waterfall, and the overall sense of wonder create a truly dramatic and peaceful atmosphere.
Prompt
Baroque: Triumphant, surreal ; A player’s avatar, standing triumphantly on a virtual mountain peak; wide shot; Gaming; A fantastical, digital landscape with glowing waterfalls and floating islands.; cinematic
Characteristic
Shot : A fantasy landscape with a large mountain in the center with a waterfall flowing down its side. There are other mountains in the background and a city built on the side of the mountain.
Aesthetic Score : 0.8
Mood : magical, ethereal, adventurous
Quality
Entropy : 6.77
Noise : 99
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some slight blurriness and noise, but this is minimal.
A Majestic Church Dominates the Square
This wide-angle shot captures the grandeur of a large, ornate church, its beige stone facade and towering spires dominating a bustling public square. The peaceful atmosphere and historical significance of the scene are palpable, inviting viewers to imagine the stories held within its walls.
Prompt
Baroque: Energetic, lively ; A bustling city square, filled with people from all walks of life; wide shot; Tourism; A grand, Baroque cathedral towering over the city.; cinematic
Characteristic
Shot : A wide shot of a grand cathedral in a European city, with people walking in the square in front of it.
Aesthetic Score : 0.8
Mood : tranquil, majestic, historical
Quality
Entropy : 6.88
Noise : 89
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
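For a quick overview of the ten images, the per-image numbers can be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the entries in this post; the variable names are illustrative.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), copied from the entries
results = [
    ("Silhouetted Against Glory", 0.7, 0.32, 0.40),
    ("Majestic Ship Battles Stormy Seas", 0.8, 0.32, 0.90),
    ("In the Zone", 0.6, 0.31, 0.20),
    ("Golden Hour Elegance", 0.8, 0.31, 0.10),
    ("A Solitary Journey Through a Snowy Valley", 0.8, 0.29, 0.10),
    ("Secrets Whispered in the Shadows", 0.7, 0.33, 0.00),
    ("Knight's Charge", 0.8, 0.30, 0.20),
    ("Hidden Treasure Gleams in Candlelight", 0.7, 0.30, 0.60),
    ("Dreamlike Waterfall in a Fantastical Landscape", 0.8, 0.33, 0.90),
    ("A Majestic Church Dominates the Square", 0.8, 0.31, 0.10),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai = mean(r[3] for r in results)
lowest_clip = min(results, key=lambda r: r[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")   # 0.75
print(f"mean prompt CLIP score: {mean_clip:.3f}")      # 0.312
print(f"mean likelihood of AI: {mean_ai:.2f}")         # 0.35
print(f"lowest CLIP score: {lowest_clip[0]}")
```

The snowy valley image has the lowest Prompt Clip Score (0.29), which lines up with its prompt asking for a medium shot of a traveler gazing at the mountains while the image shows a small, distant figure.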
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.53, which sits near the lower end of the “good” range (0.5 to 0.75). This indicates that the model generally captured the intended camera positions described in the prompts, though not with a wide margin.
- Shot Analysis: The model scored 0.58, also within the “good” range. This suggests that the generated images largely reflected the scene descriptions in the prompts.
- Aesthetic Analysis: The model scored 0.15, which falls outside the “very good” range (-0.2 to 0.1). This indicates that the generated images did not match the expected aesthetic style as closely as desired.
Overall, the model demonstrates a good understanding of camera positions and scene descriptions, but needs improvement in generating images that align with the desired aesthetic style.
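The range checks behind this breakdown can be written out directly. The scores and ranges below are the ones quoted in the conclusion; treating the range bounds as inclusive is an assumption, and the dictionary keys are illustrative names.

```python
def in_range(score, low, high):
    """True if score lies within [low, high]; inclusive bounds are an assumption."""
    return low <= score <= high


# metric: (score, label of the range it is compared against, low, high)
checks = {
    "camera_position": (0.53, "good", 0.5, 0.75),
    "shot_analysis": (0.58, "good", 0.5, 0.75),
    "aesthetic_analysis": (0.15, "very good", -0.2, 0.1),
}

for metric, (score, label, low, high) in checks.items():
    verdict = "inside" if in_range(score, low, high) else "outside"
    print(f"{metric}: {score} is {verdict} the '{label}' range ({low} to {high})")
```

Only the aesthetic score lands outside its target range, which is the gap the conclusion describes.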
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://stability.ai