AI's Artistic Eye: Capturing Emotion, Missing the Shot with Stable Diffusion
The ability to generate images based on text prompts has become increasingly sophisticated, with AI models capable of creating stunning visuals. However, these models still face challenges in accurately interpreting and translating complex instructions, particularly when it comes to camera angles and shot composition. This blog post examines the results of a generative AI model tasked with creating images based on specific camera positions and shot compositions, highlighting the model’s strengths and weaknesses in capturing the desired aesthetic style and accurately interpreting the instructions.
Created with: stability-ai-core
Lost in the City Lights: A Moment of Unease
A young man stands alone on a city street at night, his gaze fixed on something unseen. The blur of the background lights and his enigmatic expression create a sense of mystery and unease. The image captures a fleeting moment of isolation and introspection, leaving the viewer to wonder what secrets lie within the shadows.
Prompt
facial-expressions Anxiety: Overwhelmed, isolated ; A lone figure; eye-level; Single Person; bustling city street at night; cinematic
Characteristic
Shot : A man standing on a city street at night, looking away from the camera, with lights and buildings in the background.
Aesthetic Score : 0.6
Mood : gloomy, mysterious, urban
Quality
Entropy : 6.15
Noise : 63
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, causing some of the lights in the background to be blown out.
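The prompts in this post all follow the same semicolon-separated pattern, with the camera position always in the third slot. As a minimal sketch, the prompt can be split into named fields so the requested camera position is easy to compare against the resulting shot. The field names (`emotion`, `subject`, `camera`, `character`, `setting`, `style`) and the `parse_prompt` helper are assumptions made for illustration; the post itself does not name the fields.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """One prompt split into its six parts (field names are hypothetical)."""
    emotion: str    # e.g. "facial-expressions Anxiety: Overwhelmed, isolated"
    subject: str    # e.g. "A lone figure"
    camera: str     # e.g. "eye-level", "high angle", "close-up"
    character: str  # e.g. "Single Person", "Hero", "Gamer"
    setting: str    # e.g. "bustling city street at night"
    style: str      # e.g. "cinematic"


def parse_prompt(prompt: str) -> ShotPrompt:
    """Split a semicolon-separated prompt into its six fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return ShotPrompt(*parts)


prompt = (
    "facial-expressions Anxiety: Overwhelmed, isolated ; A lone figure; "
    "eye-level; Single Person; bustling city street at night; cinematic"
)
parsed = parse_prompt(prompt)
print(parsed.camera)   # the camera position requested in the prompt
print(parsed.setting)  # the setting requested in the prompt
```

Keeping the camera position as its own field makes it straightforward to check each generated shot against what was asked for.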
Superman Stands Tall, Hopeful Against the Dusk
A powerful image captures Superman, silhouetted against a breathtaking sunset, overlooking a sprawling city. His determined pose and the vibrant cityscape evoke a sense of hope and heroism, leaving viewers inspired by his unwavering commitment to justice.
Prompt
facial-expressions Anxiety: Pressure, responsibility ; A superhero standing on a rooftop; high angle; Hero; cityscape with flashing lights; cinematic
Characteristic
Shot : Superman standing on a rooftop overlooking a city at sunset, his cape billowing in the wind
Aesthetic Score : 0.7
Mood : heroic, dramatic, powerful
Quality
Entropy : 6.83
Noise : 76
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise in the background, especially visible on the buildings.
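The post does not say how the Entropy values under Quality are computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally likely). Values such as 6.15 or 6.83 would fit that range. The sketch below works on a plain sequence of pixel intensities; the `shannon_entropy` helper is illustrative and not taken from the tool used for this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = -sum(p * log2(p)) over the observed intensity levels
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


flat = [128] * 1000          # a single tone: no information
two_tone = [0, 255] * 500    # two equally likely tones
full_range = list(range(256)) * 4  # every 8-bit level equally likely

print(shannon_entropy(flat))
print(shannon_entropy(two_tone))
print(shannon_entropy(full_range))
```

Under this assumption, a higher entropy means the image uses a wider, more even spread of tones, which is a measure of tonal variety rather than of quality.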
Drowning in Paperwork: The Weight of Stress
A man sits defeated at his desk, buried under towering stacks of paper. The image captures the overwhelming feeling of stress and exhaustion that can come with a heavy workload.
Prompt
facial-expressions Anxiety: Overwhelmed, stressed ; A person sitting at a desk, surrounded by paperwork; close-up; Normal Person; cluttered office; cinematic
Characteristic
Shot : A man is sitting at a desk with stacks of paperwork on either side of him. He has his head in his hands and looks distressed.
Aesthetic Score : 0.3
Mood : stressful, overwhelmed, tired
Quality
Entropy : 6.77
Noise : 74
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight blurring and noise, especially in the background.
The Hacker’s Focus
A young man, shrouded in shadow, sits hunched over his keyboard, his intense gaze fixed on the screen. The dim lighting and multiple monitors create an atmosphere of suspense, hinting at a high-stakes operation in progress.
Prompt
facial-expressions Anxiety: Focused, intense ; A gamer hunched over a computer screen; close-up; Gamer; dimly lit room with flashing lights; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer, likely playing a game. The room is dimly lit with red and blue lights. The man appears focused and intense.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 5.97
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors
Lost in Thought: A Woman’s Journey Through the City
A solitary figure navigates the bustling urban landscape, her face a canvas of contemplation. The blurred background adds a sense of mystery, inviting viewers to delve into her inner world. This image evokes a mood of calm introspection, leaving us to wonder about her thoughts and destination.
Prompt
facial-expressions Anxiety: Anxious, uncomfortable ; A woman walking down a crowded street; eye-level; Single Person; blurred background of people; cinematic
Characteristic
Shot : A woman walks in a city, with a blurry background of people and buildings.
Aesthetic Score : 0.7
Mood : melancholic, pensive, urban
Quality
Entropy : 6.77
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor blurriness in the background.
The Shadow of the Flame: A Warrior’s Unwavering Gaze
A close-up portrait captures the intensity of a warrior, his serious expression and dark attire hinting at a battle fought and a future uncertain. The low lighting and distant flicker of fire create a dramatic backdrop, emphasizing the brooding mood and the weight of his resolve.
Prompt
facial-expressions Anxiety: Fear, anticipation ; A hero facing a menacing villain; medium shot; Hero; dark and ominous setting; cinematic
Characteristic
Shot : A man with a determined expression, wearing black armor and a cape, stands against a dark, smoky background, possibly a battlefield.
Aesthetic Score : 0.8
Mood : serious, intense, brooding
Quality
Entropy : 6.13
Noise : 63
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the background, that could be attributed to compression or post-processing.
The Men in the Hallway: A Silent Threat
A group of men stand in a dimly lit hallway, their faces obscured by shadows. Their serious expressions and the repetition of their forms create a palpable sense of unease and suspense. What are they waiting for? What secrets lie behind their stoic gaze?
Prompt
facial-expressions Anxiety: Impatient, restless ; A person waiting in a long line; eye-level; Normal Person; crowded waiting room; cinematic
Characteristic
Shot : A group of young men are lined up in a hallway, all looking at the camera with serious expressions.
Aesthetic Score : 0.4
Mood : serious, intense, suspenseful
Quality
Entropy : 6.86
Noise : 80
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : No noticeable errors.
In the Zone: Gamer’s Intensity Under Neon Lights
A young man, eyes locked on the keyboard, is fully immersed in the digital world. The dimly lit room, bathed in neon hues, amplifies the intensity and focus of his gaming session, creating a futuristic and thrilling atmosphere.
Prompt
facial-expressions Anxiety: Adrenaline, pressure ; A gamer’s hands frantically moving across a keyboard; close-up; Gamer; glowing computer screen; cinematic
Characteristic
Shot : A man wearing headphones is playing a game on his computer, his hands are on the keyboard, the scene is lit with blue and orange lights.
Aesthetic Score : 0.7
Mood : intense, focused, serious
Quality
Entropy : 5.99
Noise : 58
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness on the background, some noise in the image.
Solitude Under a Stormy Sky
A lone figure stands amidst a field of vibrant green crops, silhouetted against a brooding, dark sky. The contrast between the lush landscape and the ominous clouds creates a sense of melancholy and atmospheric tension, highlighting the man’s isolation and introspection.
Prompt
facial-expressions Anxiety: Loneliness, despair ; A man standing alone in a vast field; wide shot; Single Person; open sky with dark clouds; cinematic
Characteristic
Shot : A lone man stands in a field of green grass with dark, ominous clouds overhead. The field is divided into rows, adding to the image’s structure.
Aesthetic Score : 0.7
Mood : melancholy, suspenseful, contemplative
Quality
Entropy : 6.36
Noise : 74
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
A Solitary Figure Amidst the Ruins
A poignant image captures the desolation of a war-torn or disaster-stricken city. A lone figure stands on a rooftop, their silhouette stark against the backdrop of crumbling buildings and smoke-filled skies. The scene evokes a sense of profound isolation and melancholic reflection on the aftermath of destruction.
Prompt
facial-expressions Anxiety: Guilt, responsibility ; A hero looking out over a devastated city; high angle; Hero; destroyed buildings and smoke; cinematic
Characteristic
Shot : A man in a black hoodie stands on a rooftop, looking out at a city in ruins. There are thick plumes of black smoke billowing in the sky, and debris is scattered everywhere. The image has a dark and somber tone.
Aesthetic Score : 0.4
Mood : gloomy, somber, apocalyptic
Quality
Entropy : 6.68
Noise : 77
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and the edges of the smoke plumes are a bit jagged.
Conclusion
The results show that the generative AI model performed well in terms of aesthetic analysis, but struggled with camera position and shot analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the model didn’t accurately capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.455, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create the expected shot composition.
- Aesthetic Analysis: The model scored 0.16, which is considered very good. Lower values appear to indicate a closer match on this scale, given that 0.35 and 0.455 are rated below average. The generated images closely matched the expected aesthetic style, despite the issues with camera position and shot analysis.
Overall, the model seems to be better at capturing the desired aesthetic style than accurately interpreting the camera position and shot composition instructions.
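For readers who want to compare the ten images side by side, the per-image values reported above can be averaged with a few lines of Python. This is a minimal sketch: the numbers are transcribed from the entries in this post, while the shortened titles, the field names and the `summarize` helper are made up for illustration. It does not reproduce the three conclusion scores (0.35, 0.455, 0.16), since the post does not explain how those were derived.

```python
from statistics import mean

# Values transcribed from the entries above, in order:
# (aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
RESULTS = {
    "Lost in the City Lights": (0.6, 6.15, 63, 0.20, 0.20),
    "Superman Stands Tall": (0.7, 6.83, 76, 0.26, 0.20),
    "Drowning in Paperwork": (0.3, 6.77, 74, 0.26, 0.20),
    "The Hacker's Focus": (0.6, 5.97, 58, 0.27, 0.20),
    "Lost in Thought": (0.7, 6.77, 70, 0.26, 0.20),
    "The Shadow of the Flame": (0.8, 6.13, 63, 0.19, 0.20),
    "The Men in the Hallway": (0.4, 6.86, 80, 0.21, 0.30),
    "In the Zone": (0.7, 5.99, 58, 0.22, 0.20),
    "Solitude Under a Stormy Sky": (0.7, 6.36, 74, 0.25, 0.20),
    "A Solitary Figure Amidst the Ruins": (0.4, 6.68, 77, 0.26, 0.10),
}
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(results):
    """Return the mean of each metric across all images."""
    columns = zip(*results.values())  # one tuple per metric
    return {name: mean(column) for name, column in zip(FIELDS, columns)}


summary = summarize(RESULTS)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The prompt CLIP scores sit in a narrow band (0.19 to 0.27) while the aesthetic scores range from 0.3 to 0.8, so the images differ far more in how they look than in how closely they track the prompt text.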