AI's Cinematic Vision: A Step Towards Realistic Storytelling with Flux-dev
- 9 minutes read - 1780 wordsTable of Contents
The ability to create compelling visuals is a crucial aspect of storytelling. Dramatic camera positions, like tracking shots, can enhance the narrative by guiding the viewer’s attention and creating a sense of immersion. This experiment aimed to explore how well a generative AI model could understand and implement these techniques. While the model showed promise in capturing the technical aspects of camera positions and shot types, it struggled to achieve the desired aesthetic, highlighting the ongoing challenges in AI’s creative capabilities.
Created with: flux-dev
Vibrant Street Market: A Feast for the Senses
Capture the energy of a bustling European street market, where colorful awnings, fresh produce, and lively crowds create a vibrant scene. The leading lines of the awnings draw your eye towards the heart of the action, inviting you to experience the sights, sounds, and smells of this lively marketplace.
Prompt
camera-positions Tracking shot: Energetic, lively ; A bustling marketplace in a foreign city; tracking shot; Tourism; Vibrant colors, exotic goods, diverse crowds.; cinematic
Characteristic
Shot : A bustling street market with people walking past stalls selling fresh produce and other goods.
Aesthetic Score : 0.6
Mood : busy, lively, urban
Quality
Entropy : 6.77
Noise : 97
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, which results in some blown-out highlights in the sky and on the awnings. There is also some noise visible in the shadows.
Lost in the Digital Realm: A Man’s Journey into Virtual Reality
A solitary figure, immersed in a virtual world, his face obscured by a VR headset. The dimly lit environment and blurred background create a sense of mystery and intrigue, highlighting the isolation and immersive nature of the experience. This image captures the allure and potential of virtual reality, leaving viewers to wonder what lies beyond the digital veil.
Prompt
camera-positions Tracking shot: Intriguing, futuristic ; A virtual reality headset being put on; tracking shot; Gaming; futuristic.; cinematic
Characteristic
Shot : A man wearing a VR headset stands in front of a blurry blue and white background, looking directly at the camera.
Aesthetic Score : 0.6
Mood : futuristic, technological, curious
Quality
Entropy : 6.75
Noise : 61
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no visible artifacts or errors in the image.
Immersed in the Game: A Moment of Focused Gaming
This image captures the intensity of a VR gaming session, with a person fully engrossed in the virtual world, their focused expression and the controller in their hand revealing their complete immersion in the game.
Prompt
camera-positions Tracking shot: Intense, focused ; A gamer’s hands furiously manipulating a controller; tracking shot; Gaming; elevated virtual world; cinematic
Characteristic
Shot : A person wearing a VR headset is holding a video game controller in their hands, they are sitting in front of a computer screen and is playing a game, the room is dimly lit and the person is focused on their game
Aesthetic Score : 0.6
Mood : focused, intense, playful
Quality
Entropy : 6.65
Noise : 54
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as slight blurriness and a few pixels with noise. The edges of the image are also slightly pixelated.
Tiny Hikers, Mighty Mountains: A Journey of Hope and Scale
Three hikers, their backs to the camera, traverse a mountain trail, creating a sense of peaceful adventure and hopeful grandeur. The vastness of the mountains emphasizes the smallness of the figures, highlighting the power and beauty of nature.
Prompt
camera-positions Tracking shot: Inspiring, adventurous ; A group of friends hiking through a breathtaking mountain range; tracking shot; Adventure; Majestic peaks, clear blue sky.; cinematic
Characteristic
Shot : Three hikers are walking up a mountain trail in a beautiful mountainous region. They are all wearing backpacks and carrying trekking poles. The view is stunning, with a clear blue sky and snow-capped peaks in the distance.
Aesthetic Score : 0.75
Mood : adventurous, inspiring, serene
Quality
Entropy : 6.68
Noise : 84
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Silhouette of Courage: Firefighter Braves Blazing Inferno
A dramatic image captures the silhouette of a firefighter walking towards the camera down a narrow alley, backlit by a raging fire. The scene evokes a sense of danger, heroism, and the unwavering dedication of those who fight to protect us.
Prompt
camera-positions Tracking shot: Urgent, dramatic ; A firefighter rushing into a burning building; tracking shot; Heroism; Smoke and flames engulfing the structure.; cinematic
Characteristic
Shot : A lone firefighter walks through a narrow street towards a large, fiery blaze in the distance.
Aesthetic Score : 0.6
Mood : dramatic, intense, heroic
Quality
Entropy : 6.59
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, the image is slightly overexposed.
A Serene Hike Towards Mystery
Two hikers traverse a verdant forest path, the soft light filtering through the canopy and revealing a distant stone temple. The scene evokes a sense of tranquility and adventure, inviting you to explore the unknown.
Prompt
camera-positions Tracking shot: Intriguing, adventurous ; A group of explorers navigating a dense jungle; tracking shot; Adventure; Lush greenery, ancient ruins in the distance.; cinematic
Characteristic
Shot : Two people are walking on a path through a lush green forest toward a mysterious ancient temple structure in the distance.
Aesthetic Score : 0.7
Mood : mysterious, tranquil, adventurous
Quality
Entropy : 6.87
Noise : 121
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no major artifacts or errors in the image, however the exposure is slightly dark, which could be improved.
Sunset Serenity: A Road Leading to Hope
A peaceful scene of an empty road stretching towards a breathtaking sunset. The low angle of the sun creates a sense of vastness and openness, evoking feelings of calm, serenity, and hope.
Prompt
camera-positions Tracking shot: Nostalgic, heartwarming ; A family driving down a scenic highway; tracking shot; Travel; Rolling hills, open road, sunlight streaming through the car window.; cinematic
Characteristic
Shot : A car drives down a long, winding road in the desert. The sun is setting in the distance, casting a warm glow over the landscape.
Aesthetic Score : 0.6
Mood : serene, hopeful, vast
Quality
Entropy : 6.75
Noise : 56
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed and the details of the landscape are not as sharp as they could be. There is some lens flare around the sun.
An Intimate Evening: A Glimpse into a Cozy Restaurant Scene
Experience the warmth and romance of an intimate dinner for two in a cozy restaurant setting. The soft lighting and inviting atmosphere create the perfect backdrop for a memorable evening, as the couple enjoys their meal and shares a glass of wine. The connection between them is palpable, making this scene a beautiful representation of love and togetherness.
Prompt
camera-positions Tracking shot: Intimate, heartwarming ; A family enjoying a meal restaurant; tracking shot; Family; Warm lighting, open world.; cinematic
Characteristic
Shot : Two people are sitting at a table in a restaurant, having dinner and wine. It’s a warm, inviting scene, likely in the evening.
Aesthetic Score : 0.7
Mood : romantic, intimate, cozy
Quality
Entropy : 6.58
Noise : 69
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, but the image appears to be slightly overexposed, causing a slight loss of detail in the highlights.
Silhouettes of Solitude: A Journey into the Setting Sun
A lone figure walks towards the horizon, their silhouette stark against the fiery sunset. The vast landscape, perhaps a desert or prairie, stretches out before them, evoking feelings of serenity, melancholy, and hope. This powerful image captures the essence of isolation, contemplation, and the passage of time.
Prompt
camera-positions Tracking shot: Epic, hopeful ; A lone figure, silhouetted against the setting sun; tracking shot; Heroism; A vast, desolate landscape.; cinematic
Characteristic
Shot : A lone figure walks away from the viewer towards a setting sun.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, hopeful
Quality
Entropy : 6.47
Noise : 27
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Lost in the Moment: A Boy’s Contemplative Journey
A young boy sits by a train window, bathed in warm sunlight, his gaze fixed on the passing scenery. The light and shadow play creates a sense of depth and mystery, capturing a moment of pensive contemplation and nostalgia.
Prompt
camera-positions Tracking shot: Innocent, hopeful ; A young boy gazing out of a train window; tracking shot; Family; Passing landscapes, a sense of anticipation and wonder.; cinematic
Characteristic
Shot : A young boy is sitting by the window of a train, looking out at the passing scenery.
Aesthetic Score : 0.7
Mood : pensive, contemplative, nostalgic
Quality
Entropy : 6.57
Noise : 54
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurriness around the edges of the image.
Conclusion
The results show that the generative AI model performed well in understanding and implementing camera positions and shot types, but struggled with achieving the desired aesthetic. Here’s a breakdown:
- Camera Position: The model scored a 0.5, which falls within the “good” range. This indicates that the model was able to capture the intended camera positions reasonably well, but there’s room for improvement to reach the “very good” level.
- Shot Analysis: The model scored a 0.57, also within the “good” range. This suggests that the model understood the scene and shot types described in the prompt, but could benefit from further refinement to achieve a more accurate representation.
- Aesthetic Analysis: The model scored a 0.08, which is significantly lower than the ideal range of -0.2 to 0.1. This indicates that the generated image’s aesthetic deviated considerably from the expected aesthetic described in the prompt.
Overall, the model demonstrates a good understanding of camera positions and shot types, but needs improvement in capturing the desired aesthetic.
Sources:
- https://www.studiobinder.com/blog/types-of-camera-shot-angles-in-film/
- https://www.learnaboutfilm.com/film-language/picture/camera-position/
- https://boords.com/blog/16-types-of-camera-shots-and-angles-with-gifs
- https://shorthand.com/the-craft/8-tips-for-great-visual-storytelling/
- https://fal.ai/models/fal-ai/flux/dev/api