AI's Camera Vision: A Tale of Two Shots with Imagen-v2
- 10 minutes read - 1937 wordsTable of Contents
In the realm of visual storytelling, camera position plays a crucial role in shaping the narrative and conveying emotions. Dramatic camera positions, such as close-ups, wide shots, and low angles, can enhance the impact of a scene and draw the viewer into the story. However, replicating these techniques in AI-generated imagery presents unique challenges. This blog post explores the results of an experiment that tested an AI model’s ability to understand and implement camera positions, revealing both its strengths and limitations.
Created with: imagen-v2
Heroic Silhouette: Firefighter Saves Child from Blazing Inferno
A dramatic image captures the silhouette of a firefighter carrying a child to safety amidst a raging fire. The out-of-focus flames highlight the bravery and urgency of the moment, creating a powerful and emotional scene.
Prompt
Eye Level: Heroic, suspenseful, hopeful ; A lone firefighter, silhouetted against the flames of a burning building, bravely carrying to safety. the firefighter’s expression is resolute and determined.; cinematic
Characteristic
Shot : A silhouette of a firefighter carrying a person, likely a child, against a backdrop of intense flames.
Aesthetic Score : 0.3
Mood : dramatic, intense, urgent
Quality
Entropy : 6.21
Noise : 102
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is somewhat blurry and lacks detail, particularly in the background.
Lost in the Golden Hour
A young woman with long blonde hair stands silhouetted against a hazy sunset, her gaze fixed on the distant horizon. The soft lighting and muted colors create a dreamy and nostalgic atmosphere, evoking a sense of longing and contemplation.
Prompt
Eye Level: Awe-inspiring, adventurous, liberating ; A young woman, close side-shot. The sun is setting, landscape in the background.; cinematic
Characteristic
Shot : A woman with long blonde hair, wearing a yellow jacket, is looking off into the distance at a sunset.
Aesthetic Score : 0.7
Mood : dreamy, hopeful, melancholic
Quality
Entropy : 6.70
Noise : 53
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have been digitally altered, with some unnatural smoothing and blurring around the edges.
Painted Desert: A Man’s Journey of Mystery and Determination
A lone figure, his face adorned with dark blue paint, strides across a desolate desert landscape. The stark contrast between his painted visage and the barren surroundings creates a powerful visual impact, hinting at a story of adventure and intrigue. His determined stride suggests a journey filled with challenges and purpose.
Prompt
Eye Level: Intriguing, adventurous, determined ; A backpacker, walking along a dusty road in a foreign country, their face etched with a mixture of exhaustion and exhilaration, looking into the camera. The sun beats down on them, but they continue on, driven by a thirst for adventure.; cinematic
Characteristic
Shot : A man with a backpack walks on a desert road.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, rugged
Quality
Entropy : 6.61
Noise : 74
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts in the image, such as the slight blurriness around the edges of the subject. The color palette is a bit too saturated and could be softened.
Silhouettes of Love Against a Fiery Sunset
A couple stands hand-in-hand, their silhouettes outlined against a breathtaking orange sunset. The scene evokes a sense of romance, nostalgia, and bittersweet longing, capturing the beauty and fleeting nature of love.
Prompt
Eye Level: Romantic, passionate, hopeful ; A couple, silhouetted against the setting sun, holding hands and gazing into each other’s eyes. The sky is ablaze with vibrant colors, reflecting the passion and intensity of their love.; cinematic
Characteristic
Shot : A young couple is silhouetted against an orange sunset, holding hands. The man is on the left, the woman is on the right. Their faces are lit from the side, and their eyes are closed. The backdrop is a hazy orange sky. There seems to be water in the background.
Aesthetic Score : 0.7
Mood : romantic, nostalgic, hopeful
Quality
Entropy : 6.47
Noise : 94
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, particularly in the sky and on the woman’s dress.
The campfire symbolizes togetherness.
The image has a soft, romantic feel with the use of light and shadow. The couple’s proximity and focused gaze creates a sense of intimacy. The fire adds warmth and a sense of comfort.
Prompt
Eye Level: passionate, nostalgic ; gathered around a campfire in the wilderness, their faces illuminated by the flickering flames. They share stories and laughter, creating memories that will last a lifetime.; cinematic
Characteristic
Shot : A man and a woman are sitting by a campfire, their foreheads touching. They are both wearing jackets. There are trees in the background and a blurry lake.
Aesthetic Score : 0.7
Mood : romantic, intimate, warm
Quality
Entropy : 6.47
Noise : 77
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be a bit blurry, especially the background. The colors are a bit oversaturated.
Tranquility in the Tall Grass
A serene scene unfolds as four figures stroll through a field of swaying grass, capturing a moment of peaceful contemplation amidst nature’s embrace.
Prompt
Eye Level: Hopeful, adventurous, contemplative ; Eye-level wide shot of a group of friends laughing together; Travel; Lush green rice terraces stretching into the distance under a clear blue sky.; cinematic
Characteristic
Shot : Four people are standing in a field of tall grass, with a mountain in the background, the sun is setting.
Aesthetic Score : 0.5
Mood : serene, peaceful, contemplative
Quality
Entropy : 6.47
Noise : 87
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the grass. Some of the blades of grass are blurry and appear to be distorted.
Laughter and Vibrance: Friends Enjoying a Market Feast
Three friends share a moment of pure joy, their laughter echoing through the bustling market. The vibrant colors of the food and the lively atmosphere create a scene bursting with energy and happiness.
Prompt
Eye Level: Joyful, lively, celebratory ; Eye-level shot; A group of friends, laughing and sharing a meal at a bustling street market. The vibrant colors and sounds of the market create a sense of energy and excitement.; cinematic
Characteristic
Shot : Three friends are laughing and sharing food in a busy market, the background is a bit blurred and the colors are vibrant
Aesthetic Score : 0.7
Mood : joyful, vibrant, friendly
Quality
Entropy : 6.69
Noise : 101
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Lost in the City’s Embrace
A young woman, her long brown hair cascading down her back, gazes out at the sprawling cityscape from a balcony. Holding a steaming coffee cup, she appears lost in thought, her pensive expression hinting at a melancholic reflection. The stark contrast between her soft features and the harsh urban landscape adds a layer of dramatic depth to the scene.
Prompt
Eye Level: Melancholy, introspective, contemplative ; A young woman, standing on a balcony overlooking a bustling city. She holds a cup of coffee in her hand, her eyes filled with a mixture of sadness and longing.; cinematic
Characteristic
Shot : A young woman is standing on a rooftop overlooking a cityscape, holding a cup of coffee.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, urban
Quality
Entropy : 6.90
Noise : 92
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.80
Image errors : The background city skyline is out of focus and appears slightly blurry.
Sun-Kissed Friends, A Moment Captured
A group of friends enjoys a carefree stroll across a sun-drenched field, their laughter echoing in the open air. The low angle captures the energy of the moment, but the cropped legs and lack of faces leave a sense of mystery. A playful snapshot of friendship, bathed in golden light.
Prompt
Eye Level: lively, carefree, celebratory ; A group, playing in a park. The sun shines brightly, casting long shadows on the grass.; cinematic
Characteristic
Shot : A group of young people are playing in a field, the image is shot from a low angle, with the camera looking up at them. The sun is shining brightly, and the grass is green. It looks like a fun and carefree moment
Aesthetic Score : 0.6
Mood : happy, carefree, playful
Quality
Entropy : 6.41
Noise : 101
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a lens flare in the image, it doesn’t detract from the overall look, it is a stylistic choice
Silhouette of Solitude: A Man Contemplates the Setting Sun
A poignant image captures the essence of melancholy as a man, shrouded in a hoodie, stands silhouetted against a vibrant sunset. The vastness of the sky and the solitary figure evoke a sense of contemplation and peaceful introspection.
Prompt
Eye Level: Hopeful, inspiring, contemplative ; A lone man, close side-shot, embracing the new day, silhouetted against the rising sun; cinematic
Characteristic
Shot : A man silhouetted against a setting sun. He is wearing a hooded jacket and looking off to the side.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, peaceful
Quality
Entropy : 6.31
Noise : 94
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and some slight blurring, especially on the sun
Conclusion
The results show that the generative AI model performed well in understanding the camera positions and the scene in the prompt, but struggled with the aesthetic expectations. Here’s a breakdown:
- Camera Position: The model scored a 0.2, which is considered below average. This indicates that the generated image’s camera position deviated significantly from the prompt’s instructions.
- Shot Analysis: The model scored a 0.58, which is considered good. This means the generated image’s shot composition was fairly close to what was expected based on the prompt.
- Aesthetic Analysis: The model scored a 0.19, which is considered very good. This suggests that the generated image’s aesthetic was quite close to the expected aesthetic, despite the camera position issues.
Overall, the model seems to be better at understanding the scene and its aesthetic than it is at accurately implementing the camera position.
Sources:
- https://www.studiobinder.com/blog/types-of-camera-shot-angles-in-film/
- https://www.learnaboutfilm.com/film-language/picture/camera-position/
- https://boords.com/blog/16-types-of-camera-shots-and-angles-with-gifs
- https://shorthand.com/the-craft/8-tips-for-great-visual-storytelling/
- https://deepmind.google/technologies/imagen-2/