AI's Eye: Tracking Shots - A Success, But Aesthetics Need Work with Ideogram-v2
The world of filmmaking is built on the power of camera positions and shot composition. These elements, often referred to as ‘camera language,’ are crucial in conveying emotions, building tension, and guiding the viewer’s attention. A recent experiment with a generative AI model explored its ability to translate these cinematic concepts into visual reality. The model was tasked with creating images based on specific camera positions and shot descriptions, such as ‘tracking shot’ or ‘heroic silhouette.’ The results were intriguing, revealing both strengths and weaknesses in the AI’s understanding of cinematic language.
Created with: ideogram-v2
Silhouette of Hope: A Solitary Figure Walks Towards the Setting Sun
A tranquil scene of a lone figure walking across a rocky landscape as the sun sets, casting a dramatic silhouette. The image evokes feelings of hope, contemplation, and a touch of mystery.
Prompt
camera-positions Tracking shot: Epic, hopeful ; A lone figure, silhouetted against the setting sun; tracking shot; Heroism; A vast, desolate landscape.; cinematic
Characteristic
Shot : A solitary figure walks towards the setting sun on a rocky landscape.
Aesthetic Score : 0.7
Mood : tranquil, hopeful, contemplative
Quality
Entropy : 6.23
Noise : 55
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : None.
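The post does not say how the Entropy value under Quality is computed. One common reading is the Shannon entropy of the grayscale intensity histogram, which tops out at 8 bits for 8-bit images and would be consistent with the values between 6 and 7 reported here. The sketch below illustrates that reading only; the function name and the flat pixel list input are assumptions, not the tool actually used for this gallery.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of intensity values.

    Assumption: `pixels` is a flat list of integers (e.g. 0-255 grayscale).
    This is an illustrative metric, not the one confirmed by the post.
    """
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity value that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A flat image carries no information; a uniform spread over all
# 256 levels reaches the 8-bit maximum.
flat = shannon_entropy([128] * 1000)
uniform = shannon_entropy(list(range(256)))
```

Under this reading, a higher value means the image uses a wider, more even spread of tones, which fits the busier scenes in the gallery scoring higher than the sparse silhouette.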
Lost in the Jungle: A Mysterious Journey Awaits
A group of intrepid explorers venture through a dense, muddy jungle path, their destination shrouded in mystery. A crumbling structure looms in the distance, hinting at a forgotten past and the secrets it holds. The air is thick with suspense, as the adventurers press onward, their fate uncertain.
Prompt
camera-positions Tracking shot: Intriguing, adventurous ; A group of explorers navigating a dense jungle; tracking shot; Adventure; Lush greenery, ancient ruins in the distance.; cinematic
Characteristic
Shot : A group of people walk through a muddy path in a lush jungle, with a distant ruined structure in the background.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 6.32
Noise : 116
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor image artifacts and compression artifacts, particularly around the edges of the image.
Lost in the Pixels: A Futuristic Gaming Escape
This vibrant image captures the essence of immersive gaming, with a player holding a controller against a backdrop of futuristic cityscapes, floating islands, and abstract shapes. The dynamic composition and bold colors create a sense of depth and intensity, transporting you into a world of digital wonder.
Prompt
camera-positions Tracking shot: Intense, focused ; A gamer’s hands furiously manipulating a controller; tracking shot; Gaming; elevated virtual world; cinematic
Characteristic
Shot : A person is holding a video game controller in front of a futuristic collage of cityscapes, floating islands, and abstract shapes. The image is meant to evoke a sense of being immersed in a video game.
Aesthetic Score : 0.6
Mood : futuristic, immersive, intense
Quality
Entropy : 6.51
Noise : 100
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are slight artifacts and blurriness around the edges of the image, which is likely due to the collage nature of the image. There are also some visible seams where the different images have been merged together.
Lost in the Labyrinth: A Sensory Overload in a Bustling Marketplace
Experience the vibrant chaos of a crowded marketplace, where narrow streets teem with shoppers and vendors. The air is thick with the aroma of spices, the colors of textiles are a feast for the eyes, and the energy is palpable. This scene captures the claustrophobic beauty of a bustling market, where the senses are overwhelmed by the sights, sounds, and smells of a vibrant culture.
Prompt
camera-positions Tracking shot: Energetic, lively ; A bustling marketplace in a foreign city; tracking shot; Tourism; Vibrant colors, exotic goods, diverse crowds.; cinematic
Characteristic
Shot : A bustling marketplace with people shopping for various goods, including textiles, spices, and souvenirs. The scene takes place in a narrow street with shops on both sides.
Aesthetic Score : 0.6
Mood : busy, vibrant, crowded
Quality
Entropy : 6.75
Noise : 108
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a few minor artifacts and errors, including some noise in the shadows and some slight blurring in the background. These errors are mostly cosmetic and do not detract significantly from the image.
Road Trip Bliss: A Vintage Adventure Filled with Smiles
Capture the joy of a family road trip in a classic car. The camera’s focus on the front of the car creates a sense of movement and adventure, highlighting the happy and joyful mood of the scene.
Prompt
camera-positions Tracking shot: Nostalgic, heartwarming ; A family driving down a scenic highway; tracking shot; Travel; Rolling hills, open road, sunlight streaming through the car window.; cinematic
Characteristic
Shot : A family is on a road trip, driving in a vintage car. They are all smiling and looking happy.
Aesthetic Score : 0.6
Mood : happy, joyful, adventurous
Quality
Entropy : 6.91
Noise : 61
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur, particularly on the faces. There is some lens flare present.
Lost in Thought: A Boy’s Journey Through Time
A young boy with piercing blue eyes gazes out the window of a train, his expression a mix of pensive longing and wistful melancholy. The image captures a fleeting moment of introspection, as he seems to be lost in thought, perhaps reminiscing about a past experience. The rural countryside whizzing by outside serves as a backdrop to his internal journey, leaving the viewer to wonder about the stories hidden within his gaze.
Prompt
camera-positions Tracking shot: Innocent, hopeful ; A young boy gazing out of a train window; tracking shot; Family; Passing landscapes, a sense of anticipation and wonder.; cinematic
Characteristic
Shot : A young boy with blue eyes looks out the window of a train, likely traveling through rural countryside. He appears to be lost in thought, perhaps reminiscing about a past experience.
Aesthetic Score : 0.7
Mood : pensive, wistful, melancholic
Quality
Entropy : 6.82
Noise : 71
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Firefighter’s Calm Amidst the Flames
A firefighter in full gear, with a determined expression, runs towards the camera, fire extinguisher in hand. The blaze behind him rages, creating a stark contrast with the firefighter’s composure. This image captures the intensity and urgency of the situation, highlighting the bravery of those who face danger head-on.
Prompt
camera-positions Tracking shot: Urgent, dramatic ; A firefighter rushing into a burning building; tracking shot; Heroism; Smoke and flames engulfing the structure.; cinematic
Characteristic
Shot : A firefighter in full gear is running towards the camera with a fire extinguisher in his hand. There is a fire in the background.
Aesthetic Score : 0.6
Mood : intense, urgent, dramatic
Quality
Entropy : 6.84
Noise : 96
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is a slight blur on the firefighter’s face. The flames in the background appear slightly unnatural, possibly due to over-editing or AI manipulation.
Adventure Awaits: Hiking with Friends in the Majestic Mountains
Capture the spirit of adventure with this stunning image of friends hiking through breathtaking mountain scenery. The expansive view and bright sunshine evoke a sense of freedom and upliftment, making it the perfect picture for anyone who loves the outdoors.
Prompt
camera-positions Tracking shot: Inspiring, adventurous ; A group of friends hiking through a breathtaking mountain range; tracking shot; Adventure; Majestic peaks, clear blue sky.; cinematic
Characteristic
Shot : A group of friends is hiking in the mountains, with a dramatic view of the peaks in the background. The hikers are walking along a trail, and the sun is shining brightly.
Aesthetic Score : 0.7
Mood : adventurous, uplifting, serene
Quality
Entropy : 6.68
Noise : 104
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Immerse Yourself in the Future of Gaming
A futuristic VR headset, bathed in glowing light, takes center stage above a blurred gaming setup. Prepare to be transported to a world of immersive experiences.
Prompt
camera-positions Tracking shot: Intriguing, futuristic ; A virtual reality headset being put on; tracking shot; Gaming; futuristic.; cinematic
Characteristic
Shot : A virtual reality headset floating above a gaming setup. The headset has glowing lights and is in focus. The background is a blurry image of a gaming desk with a computer monitor, controller, and a chair.
Aesthetic Score : 0.7
Mood : futuristic, techy, immersive
Quality
Entropy : 6.78
Noise : 62
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The lighting on the headset is somewhat unnatural and the background appears slightly blurry and pixelated.
Warmth and Laughter Fill the Booth
A family of four enjoys a cozy meal in a dimly lit restaurant. The warm lighting and intimate atmosphere create a sense of happiness and connection, while the dark background adds a touch of drama to the scene.
Prompt
camera-positions Tracking shot: Intimate, heartwarming ; A family enjoying a meal restaurant; tracking shot; Family; Warm lighting, open world.; cinematic
Characteristic
Shot : A family of four is sitting at a booth in a restaurant, eating and interacting with each other. The lighting is warm and inviting, and the atmosphere is casual and relaxed.
Aesthetic Score : 0.5
Mood : happy, casual, intimate
Quality
Entropy : 6.58
Noise : 84
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors, but the focus could be sharper on the foreground elements.
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower bound of the “good” range (0.5-0.75). This indicates that the model was generally able to capture the camera positions described in the prompts.
- Shot Analysis: The model scored 0.55, also within the “good” range. This suggests that the model understood the scenes described in the prompts and was able to create images that reflected the intended shot composition.
- Aesthetic Analysis: The model scored 0.14, which falls outside the “very good” range (-0.2 to 0.1). This indicates that the generated images did not match the expected aesthetic as closely as they matched the camera position and shot descriptions.
Overall, the model demonstrates a good understanding of camera positions and shot composition, but needs improvement in generating images that meet the desired aesthetic.
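The per-image values listed above can also be summarized directly. The following is a minimal sketch that uses only the numbers reported in this post. The variable names, the `summarize` helper, and the 0.5 cutoff for flagging an image as likely AI-generated are illustrative assumptions. These averages are not the same quantities as the analysis scores in the breakdown.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image listings in this post.
images = [
    ("Silhouette of Hope", 0.7, 0.28, 0.20),
    ("Lost in the Jungle", 0.6, 0.32, 0.10),
    ("Lost in the Pixels", 0.6, 0.27, 0.80),
    ("Lost in the Labyrinth", 0.6, 0.29, 0.10),
    ("Road Trip Bliss", 0.6, 0.32, 0.20),
    ("Lost in Thought", 0.7, 0.35, 0.10),
    ("Firefighter's Calm Amidst the Flames", 0.6, 0.31, 0.30),
    ("Adventure Awaits", 0.7, 0.30, 0.20),
    ("Immerse Yourself in the Future of Gaming", 0.7, 0.25, 0.90),
    ("Warmth and Laughter Fill the Booth", 0.5, 0.30, 0.20),
]


def summarize(rows, ai_cutoff=0.5):
    """Average each metric and list images at or above the assumed AI cutoff."""
    return {
        "aesthetic_mean": round(mean([r[1] for r in rows]), 3),
        "clip_mean": round(mean([r[2] for r in rows]), 3),
        "ai_likelihood_mean": round(mean([r[3] for r in rows]), 3),
        "flagged_as_ai": [r[0] for r in rows if r[3] >= ai_cutoff],
    }


summary = summarize(images)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the assumed cutoff, only the two gaming images are flagged, and they are also the two with the lowest Prompt Clip Scores in the set.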