AI's Artistic Eye: Capturing the Moment, But Missing the Shot with Stable Diffusion
In the realm of artificial intelligence, image generation has emerged as a captivating field. AI models are now capable of creating stunning visuals based on textual prompts, offering a glimpse into the future of creative expression. However, as with any nascent technology, there are limitations to be explored. This blog post delves into the results of an experiment that tested the capabilities of an AI model in generating images based on specific scene descriptions. The experiment revealed both strengths and weaknesses in the model’s performance, particularly in its ability to capture the intended camera positions, shot composition, and aesthetic style.
Created with: stability-ai-core
Amidst the Inferno: Soldiers Brave the Battlefield’s Fury
A poignant image captures the raw intensity of war, with two soldiers standing amidst a landscape ravaged by explosions. Smoke billows, flames dance, and the distant figures of other soldiers highlight the scale of the conflict. The scene evokes a sense of drama, somber reflection, and the overwhelming power of destruction.
Prompt
poses embrace: triumphant, camaraderie ; Two soldiers; wide shot; heroism; battlefield with smoke and explosions in the background; cinematic
Characteristic
Shot : Two soldiers are standing in front of a large explosion in a war-torn city. There are flames and smoke in the background, and the soldiers are looking at each other.
Aesthetic Score : 0.6
Mood : intense, dramatic, chaotic
Quality
Entropy : 6.77
Noise : 80
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.70
Image errors : There are some minor artifacts in the smoke and flames, but overall the image is clean.
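The post does not say how the Entropy value in each Quality block is computed. A common choice, and only an assumption here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely). The function name `shannon_entropy` and the toy inputs below are hypothetical and are not taken from the images in this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of 8-bit grayscale values.

    Assumption: this mirrors a typical histogram-based image entropy;
    the metric actually used in the post is not documented.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)  # histogram: gray level -> number of pixels
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy inputs, not data from the generated images.
flat_image = [128] * 16          # a single gray level
uniform_image = list(range(256))  # every gray level exactly once

flat_entropy = shannon_entropy(flat_image)
uniform_entropy = shannon_entropy(uniform_image)
```

Under this reading, values around 6.2 to 6.8 like the ones reported in this post would indicate images with a fairly broad spread of tones.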
Love Amidst Ancient Mysteries: A Jungle Adventure
In this captivating scene, a couple shares a romantic moment in front of an ancient stone temple, lost in the heart of a lush jungle. The man, with a large backpack, has his arm around the woman, who carries a smaller pack. The temple, overgrown with vines and greenery, adds a sense of mystery and adventure to their journey.
Prompt
poses embrace: trust, respect ; A lone explorer and a local guide; medium shot; adventure; lush jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : A couple is embracing in front of a large stone building, surrounded by lush greenery and a jungle setting. The building is seemingly dilapidated and overgrown with vegetation.
Aesthetic Score : 0.7
Mood : romantic, adventurous, mysterious
Quality
Entropy : 6.82
Noise : 87
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight chromatic aberration in some areas, especially around the edges.
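The Prompt Clip Score is not defined in the post either. A plausible reading, stated here as an assumption, is the cosine similarity between a CLIP embedding of the prompt and a CLIP embedding of the generated image. The sketch below shows only the similarity step on hypothetical, hand-written vectors; a real pipeline would obtain the embeddings from a CLIP model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional stand-ins for CLIP text and image embeddings.
text_embedding = [1.0, 0.0, 0.0, 0.0]
image_embedding = [1.0, 1.0, 1.0, 1.0]

score = cosine_similarity(text_embedding, image_embedding)
```

If the score is indeed a raw cosine similarity, the values of 0.25 to 0.32 reported here would be compared against each other rather than against a maximum of 1.0.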
Gaming Buddies Share a Laugh in the Heat of the Action
Two young men, clad in blue Adidas shirts and headsets, erupt in laughter as they share a moment of victory in front of a computer screen. The dimly lit room, filled with gaming equipment, adds to the energetic and playful mood of the scene. The focused lighting highlights their joyful expressions, capturing the excitement of the gaming experience.
Prompt
poses embrace: excitement, joy ; Two gamers celebrating a victory; close-up; gaming; brightly lit gaming room with monitors and controllers; cinematic
Characteristic
Shot : Two young men, likely twins, are sitting at a desk in a dimly lit room, playing video games. They are both smiling and appear to be enjoying themselves.
Aesthetic Score : 0.6
Mood : joyful, competitive, playful
Quality
Entropy : 6.40
Noise : 68
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors, but the lighting could be more balanced.
Silhouettes of Love Against the Setting Sun
A romantic embrace on a rooftop, bathed in the golden glow of sunset. The couple’s silhouette against the city skyline creates a dreamy and dramatic scene, capturing the essence of urban romance.
Prompt
poses embrace: romantic, awe ; A couple gazing at a breathtaking sunset; long shot; tourism; panoramic view of a city skyline; cinematic
Characteristic
Shot : A couple is standing on a rooftop overlooking the city of New York at sunset.
Aesthetic Score : 0.7
Mood : romantic, dreamy, love
Quality
Entropy : 6.83
Noise : 78
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight chromatic aberration in the edges and the sky gradient is a bit artificial.
A Family’s Moment of Awe on the Mountaintop
A family of four stands on a majestic mountain peak, their backs to the camera, gazing out at the sprawling mountain range. The vastness of the landscape evokes a sense of tranquility and adventure, reminding us of the humbling power of nature.
Prompt
poses embrace: unity, accomplishment ; A family standing on a mountain peak; medium shot; travel; majestic mountain range with clouds in the background; cinematic
Characteristic
Shot : A family of three stands on a mountaintop, looking out over a valley of rolling hills and snow-capped mountains.
Aesthetic Score : 0.8
Mood : tranquil, serene, adventurous
Quality
Entropy : 6.78
Noise : 70
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : None, but the composition could be improved.
Cheers to Friendship: A Toast to Intimacy and Joy
Capture the warmth and camaraderie of a shared moment as four friends raise their glasses in a toast. The inviting lighting and intimate composition create a sense of happiness and celebration, perfect for evoking feelings of friendship and connection.
Prompt
poses embrace: celebratory, friendship ; A group of friends raising their glasses in a toast; close-up; groups; lively bar or restaurant setting; cinematic
Characteristic
Shot : Four friends are celebrating and clinking glasses at a pub or restaurant.
Aesthetic Score : 0.7
Mood : happy, joyful, friendly
Quality
Entropy : 6.42
Noise : 74
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors
A Moment of Shared Joy: Two Women Find Comfort and Love in a Park
This heartwarming image captures the tender bond between two older women as they share a moment of quiet joy by a park fountain. The woman on the left leans on the woman on the right’s shoulder, creating a sense of warmth and intimacy. The scene evokes a feeling of nostalgia and happiness, reminding us of the enduring power of friendship and love.
Prompt
poses embrace: love, gratitude ; A young woman and her grandmother; medium shot; heroism; a peaceful park with a fountain in the background; cinematic
Characteristic
Shot : Two elderly women sitting on a stone bench in a park, with a fountain in the background.
Aesthetic Score : 0.7
Mood : joyful, heartwarming, tender
Quality
Entropy : 6.82
Noise : 75
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors in the image.
Awe-Inspiring View: Astronauts Embrace the Vastness of Space
Two astronauts, silhouetted against the radiant glow of Earth, float amidst the celestial tapestry. The sun bathes the scene in golden light, highlighting the majesty of space and the profound beauty of our planet.
Prompt
poses embrace: wonder, awe ; Two astronauts floating in space; long shot; adventure; Earth in the distance; cinematic
Characteristic
Shot : Two astronauts in space suits are floating in space with a planet in the background and a sun in the distance.
Aesthetic Score : 0.7
Mood : awe, wonder, adventurous
Quality
Entropy : 6.59
Noise : 71
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some artifacts and errors in the image, such as the sun being too bright and the planet being too flat. The astronauts’ faces appear a little off as well.
Band Takes Center Stage in a Symphony of Light and Sound
A vibrant concert scene unfolds with a band bathed in colorful spotlights, their energy electrifying the crowd. The dramatic interplay of light and shadow creates a captivating visual experience.
Prompt
poses embrace: passion, energy ; A group of musicians performing on stage; wide shot; gaming; a concert venue with flashing lights; cinematic
Characteristic
Shot : A live concert performance with multiple musicians on stage, captured from different angles. The lighting is dramatic and colorful, creating a vibrant and energetic atmosphere.
Aesthetic Score : 0.5
Mood : energetic, lively, exciting
Quality
Entropy : 6.24
Noise : 76
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image quality is generally good, but there is a slight amount of noise and grain in some areas.
Sunset Embrace: A Moment of Love and Tranquility
A couple finds solace and connection in each other’s arms as the sun dips below the horizon, casting a warm glow over the crashing waves. This romantic scene evokes feelings of peace and serenity, capturing the essence of a perfect sunset.
Prompt
poses embrace: love, hope ; A couple standing on a beach at sunrise; close-up; travel; ocean waves crashing on the shore; cinematic
Characteristic
Shot : A couple is embracing on the beach at sunset. The man is holding the woman from behind, and they are looking at each other lovingly.
Aesthetic Score : 0.7
Mood : romantic, tender, peaceful
Quality
Entropy : 6.81
Noise : 65
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a bit blurry in places, particularly around the edges. This could be due to camera shake or a slow shutter speed.
Conclusion
The results show that the generative AI model performed well in aesthetic analysis, but struggled with camera position and was only average in shot analysis. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.53, which is considered average. This indicates that the model was able to understand the scene in the prompt to a reasonable degree, but not exceptionally well.
- Aesthetic Analysis: The model scored 0.1, which is considered very good. This means that the generated image closely matched the expected aesthetic style described in the prompt.
Overall, the model seems to be better at understanding the aesthetic style of the prompt than it is at accurately capturing the camera positions and shot composition.
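For readers who want a single figure per metric, the per-image values listed in the ten result cards can be averaged directly. The sketch below copies those values in the order the images appear. The variable names are illustrative, and these averages are separate from the three scores in the Conclusion, whose computation the post does not describe.

```python
from statistics import mean

# Values copied from the ten result cards in this post, in order of appearance.
aesthetic_scores = [0.6, 0.7, 0.6, 0.7, 0.8, 0.7, 0.7, 0.7, 0.5, 0.7]
prompt_clip_scores = [0.32, 0.31, 0.31, 0.32, 0.30, 0.28, 0.29, 0.25, 0.25, 0.26]
ai_likelihoods = [0.70, 0.20, 0.10, 0.20, 0.10, 0.10, 0.10, 0.80, 0.10, 0.20]

# Average each metric across the ten images, rounded for display.
summary = {
    "aesthetic": round(mean(aesthetic_scores), 3),
    "prompt_clip": round(mean(prompt_clip_scores), 3),
    "ai_likelihood": round(mean(ai_likelihoods), 3),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The averages come out to roughly 0.67 for Aesthetic Score, 0.29 for Prompt Clip Score, and 0.26 for Likelihood of AI.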