AI's Artistic Eye: A Look at Generative Models and Scene Composition with Stable Diffusion
The world of generative AI is rapidly evolving, with models capable of creating stunning and realistic images from text prompts. One key aspect of this technology is its ability to understand and translate scene composition, including camera positions, shot analysis, and aesthetic style. This blog post explores the performance of a generative AI model in capturing these elements, analyzing its strengths and areas for improvement. We’ll delve into the concept of ‘dramatic style’ and explore how it’s used in various creative contexts, providing examples of its application in film, photography, and digital art.
Created with: stability-ai-core
Silhouettes of Solitude: A Desolate Sunset
Two figures stand on a rocky outcrop, their silhouettes stark against the vast, golden expanse of a desert sunset. The scene evokes a sense of epic desolation and contemplative solitude, with distant mountains adding to the grandeur of the landscape.
Prompt
Stylized: Epic and melancholic ; A lone warrior; wide shot; Heroism; A desolate battlefield with a setting sun; cinematic
Characteristic
Shot : Two figures in medieval armor stand on a rocky outcrop, overlooking a vast desert landscape with a winding river in the distance. The sun is setting in the background, casting a warm, golden glow over the scene.
Aesthetic Score : 0.7
Mood : epic, mysterious, contemplative
Quality
Entropy : 6.74
Noise : 84
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be digitally generated, and there is some slight blurring in the distance, as well as a lack of detail in some areas of the scene.
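The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the grayscale intensity histogram, which cannot exceed 8 bits for 8-bit pixels; under that reading a value such as 6.74 would point to a fairly wide tonal spread. A minimal sketch of that calculation, with a hypothetical function name and toy pixel lists standing in for a real image:

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# Toy data (hypothetical, not taken from the images in this post).
flat = [128] * 16          # a single intensity: no uncertainty at all
spread = list(range(256))  # every 8-bit level exactly once: maximal entropy

print(shannon_entropy(flat))
print(shannon_entropy(spread))
```

The uniform toy image gives the lowest possible value and the evenly spread one gives the highest, so a real photo-like image would be expected to land somewhere in between.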
Lost Treasure Found! Cave Unveils Glittering Riches
A mysterious cave opening reveals a treasure chest overflowing with gold coins, sparking a sense of wonder and adventure. The scene evokes the thrill of discovery and the promise of untold wealth.
Prompt
Stylized: Excitement and wonder ; A treasure chest overflowing with gold; close-up; Adventure; A dark and mysterious cave; cinematic
Characteristic
Shot : A treasure chest overflowing with gold coins lies open in a dark and mysterious cave, illuminated by a soft light from the entrance.
Aesthetic Score : 0.8
Mood : mysterious, adventurous, exciting
Quality
Entropy : 6.50
Noise : 94
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.60
Image errors : No visible errors or artifacts
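The prompts in this post all follow the same semicolon-separated pattern after the "Stylized:" label: mood, subject, shot type, theme, setting, and a style keyword. The sketch below splits a prompt into those parts so the requested shot type can be compared with what the image shows. The field names and the `parse_prompt` helper are assumptions made for illustration, not part of the original tooling.

```python
from typing import NamedTuple

class ScenePrompt(NamedTuple):
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    style: str

def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a 'Stylized: a ; b; c; d; e; f' prompt into named fields."""
    # Drop the leading "Stylized:" label, then split the body on ";".
    _, _, body = prompt.partition(":")
    fields = [part.strip() for part in body.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return ScenePrompt(*fields)

prompt = (
    "Stylized: Excitement and wonder ; A treasure chest overflowing with gold; "
    "close-up; Adventure; A dark and mysterious cave; cinematic"
)
parsed = parse_prompt(prompt)
print(parsed.shot)     # close-up
print(parsed.setting)  # A dark and mysterious cave
```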
Silhouetted Against the Future: A Hero Prepares
A lone, armored figure stands on a rooftop overlooking a futuristic cityscape bathed in the hues of sunset. The dramatic lighting and their poised stance with a futuristic weapon create a palpable sense of tension and anticipation. This scene evokes a mood of heroic determination in the face of an unknown threat.
Prompt
Stylized: Triumphant and futuristic ; A player’s avatar, a powerful warrior, standing triumphantly; medium shot; Gaming; A vibrant and futuristic cityscape; cinematic
Characteristic
Shot : A futuristic soldier stands in a futuristic city with a sword in his hand. There is a sunset in the background.
Aesthetic Score : 0.8
Mood : epic, futuristic, powerful
Quality
Entropy : 6.94
Noise : 86
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image is slightly blurry, especially the background
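The Prompt Clip Score is presumably a CLIP-style similarity between an embedding of the prompt text and an embedding of the generated image; the post does not give the formula. Assuming it is a plain cosine similarity, the calculation looks like the sketch below. The vectors are toy values, since real embeddings would come from a CLIP model that is not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional stand-ins for a text and an image embedding.
text_vec = [3.0, 0.0, 0.0]
image_vec = [3.0, 4.0, 0.0]

score = cosine_similarity(text_vec, image_vec)
print(round(score, 2))
```

Under this assumption, a score is higher when the two embeddings point in more similar directions, which is why it is used as a rough measure of how well the image matches the prompt.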
Sunset Serenity: A City Skyline’s Golden Hour
An aerial view captures the calm beauty of a city at sunset. The warm light bathes the towering skyscrapers and the narrow street below, creating a sense of peaceful urban tranquility.
Prompt
Stylized: Energetic and lively ; A panoramic view of a bustling city; long shot; Tourism; A vibrant and colorful cityscape; cinematic
Characteristic
Shot : A cityscape at sunset, with a street running diagonally through the center of the image.
Aesthetic Score : 0.8
Mood : tranquil, urban, nostalgic
Quality
Entropy : 6.66
Noise : 105
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some minor artifacts in the image, such as the pixelated edges of the buildings and the slightly blurry street.
A Moment of Solitude in the Vastness of the Desert
A lone figure finds peace and contemplation amidst the expansive desert landscape at sunset. The dramatic contrast between the figure’s small size and the vastness of the surroundings evokes a sense of isolation and insignificance, creating a powerful and serene image.
Prompt
Stylized: Serene and contemplative ; A lone traveler gazing at a breathtaking sunset; medium shot; Travel; A vast desert landscape; cinematic
Characteristic
Shot : A lone figure sitting on a sand dune in a vast desert, watching the sunset
Aesthetic Score : 0.8
Mood : serene, peaceful, contemplative
Quality
Entropy : 6.75
Noise : 83
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Sun-Kissed Smiles and Family Love: A Moment of Pure Joy
Capture the essence of family happiness with this vibrant image. A family of four strolls through a sun-drenched park, their smiles radiating warmth and togetherness. The scene evokes a sense of carefree joy and the enduring power of family bonds.
Prompt
Stylized: Joyful and heartwarming ; A family laughing and playing in a park; medium shot; Family; A sunny and idyllic park setting; cinematic
Characteristic
Shot : A happy family of four is walking in a park on a sunny day. The father is holding the hands of his wife and daughter, while the son is holding his father’s hand. The family is laughing and enjoying their time together. There are green trees in the background, and the sun is shining brightly.
Aesthetic Score : 0.75
Mood : joyful, carefree, happy
Quality
Entropy : 6.71
Noise : 109
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is generally well-exposed and sharp, but there are some minor artifacts around the edges of the image.
A Solitary Figure Contemplates the Stormy Sea
A lone figure stands on a cliff, silhouetted against a dramatic sky filled with dark clouds and rays of sunlight. The vast, stormy sea below evokes a sense of awe and melancholic isolation. The juxtaposition of the figure against the powerful forces of nature creates a striking scene.
Prompt
Stylized: Dramatic and powerful ; A lone figure standing on a cliff overlooking a vast ocean; long shot; Heroism; A stormy sea with dramatic clouds; cinematic
Characteristic
Shot : A lone figure standing on a cliff overlooking a vast, stormy sea. The sky is filled with dark, ominous clouds, while the sea is choppy and white-capped. The cliff face is rugged and textured, adding to the overall dramatic effect.
Aesthetic Score : 0.8
Mood : dramatic, moody, melancholic
Quality
Entropy : 6.60
Noise : 91
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors or artifacts.
Step Back in Time: A Vintage Desk Beckons Adventure
Discover a world of mystery and intrigue in this vintage setting. An antique desk, adorned with globes, compasses, and clocks, sits beneath a large world map, hinting at tales of exploration and forgotten secrets. The dark lighting adds a touch of drama, inviting you to delve into the past.
Prompt
Stylized: Intriguing and mysterious ; A map with pins marking locations of hidden treasures; close-up; Adventure; A dimly lit room with antique furniture; cinematic
Characteristic
Shot : A vintage-style study or library with a large world map on the wall, a wooden desk with drawers, various antique-looking items, and lamps on either side of the desk.
Aesthetic Score : 0.7
Mood : nostalgic, mysterious, adventurous
Quality
Entropy : 6.54
Noise : 92
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : No major issues, but the lighting creates some minor shadows and highlights that might not be completely natural.
The Archer’s Focus
A lone archer stands poised in a misty forest, his arrow drawn, his gaze unwavering. The scene is heavy with anticipation, hinting at a looming conflict. The archer’s determination and the dramatic setting create a sense of suspense, leaving the viewer wondering what lies ahead.
Prompt
Stylized: Intense and focused ; A player’s character, a skilled archer, aiming at a target; close-up; Gaming; A dark and mysterious forest; cinematic
Characteristic
Shot : A male archer stands in a forest, drawing his bow and arrow, aiming toward the right side of the frame.
Aesthetic Score : 0.8
Mood : serious, focused, adventurous
Quality
Entropy : 6.88
Noise : 83
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : No noticeable artifacts or errors. The image quality is high.
String Lights, Smiles, and Good Times: A Night Out with Friends
Capture the joy of a warm summer evening with friends, gathered around a table at an outdoor restaurant, bathed in the soft glow of string lights. The scene exudes happiness, conviviality, and relaxation, making it a perfect snapshot of friendship and good times.
Prompt
Stylized: Social and celebratory ; A group of friends enjoying a meal at a restaurant with a view; medium shot; Tourism; A bustling city street with vibrant lights; cinematic
Characteristic
Shot : Four friends are enjoying a meal together at an outdoor restaurant. The setting is a city street, lit by string lights, creating a warm and inviting ambiance.
Aesthetic Score : 0.8
Mood : cozy, festive, joyful
Quality
Entropy : 6.62
Noise : 94
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
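The per-image figures reported above can be averaged to get a quick overview of the whole set. The sketch below transcribes the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten reports in this post; the variable names are illustrative, and the averages are a convenience summary rather than a metric the post itself reports.

```python
from statistics import mean

# (aesthetic score, prompt CLIP score, likelihood of AI), transcribed from
# the ten image reports in this post, in order of appearance.
reports = [
    (0.70, 0.29, 0.90),
    (0.80, 0.29, 0.60),
    (0.80, 0.33, 0.90),
    (0.80, 0.27, 0.90),
    (0.80, 0.33, 0.10),
    (0.75, 0.25, 0.20),
    (0.80, 0.30, 0.20),
    (0.70, 0.33, 0.30),
    (0.80, 0.32, 0.90),
    (0.80, 0.32, 0.10),
]

# Transpose the rows into one tuple per metric.
aesthetic, clip, ai_likelihood = zip(*reports)

summary = {
    "mean_aesthetic": mean(aesthetic),
    "mean_clip": mean(clip),
    "mean_ai_likelihood": mean(ai_likelihood),
}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The means come to about 0.775 for the aesthetic score, 0.303 for the prompt CLIP score, and 0.51 for the likelihood of AI. The last figure hides a split: four images score 0.90 and four score 0.20 or lower.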
Conclusion
The results show that the generative AI model performed moderately well at interpreting camera positions and scene composition from the prompts.
Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t always accurately capture the intended camera positions described in the prompts.
- Shot Analysis: The model scored 0.55, which falls within the “good” range. This indicates that the model generally understood the scene descriptions in the prompts and produced images that reflected those descriptions.
- Aesthetic Analysis: The model scored 0.01, which is within the “very good” range of -0.2 to 0.1. This means that the generated images closely matched the expected aesthetic style.
Overall, the model shows a decent ability to translate prompts into images, but its handling of camera positions needs improvement.
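The comparison of each score against its target band can be written out as a small check. This is a sketch under stated assumptions: the bands for Camera Position and Aesthetic Analysis are the ones quoted in the breakdown, while the band for Shot Analysis is not given explicitly, so reusing 0.5 to 0.75 for it is a guess. The `describe` helper is hypothetical.

```python
def describe(score, low, high):
    """Say where a score falls relative to an inclusive [low, high] band."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"

# (score, band low, band high, band label) as stated in the conclusion.
# The Shot Analysis band is not given explicitly in the post; reusing
# 0.5 to 0.75 here is an assumption.
checks = {
    "Camera Position": (0.35, 0.5, 0.75, "good"),
    "Shot Analysis": (0.55, 0.5, 0.75, "good"),
    "Aesthetic Analysis": (0.01, -0.2, 0.1, "very good"),
}

for name, (score, low, high, label) in checks.items():
    position = describe(score, low, high)
    print(f"{name}: {score} is {position} the '{label}' band")
```

Run as written, only Camera Position falls outside its band, which matches the conclusion that camera handling is the weakest of the three areas.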
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://stability.ai