AI's Camera Skills: A Work in Progress with Flux-pro
The ability to understand and implement camera positions is crucial for creating compelling visual narratives. Humans master this skill through years of experience and observation. But what about AI? Can it learn to see the world through a camera lens? In this experiment, we tested an AI model’s ability to interpret camera positions and shot composition from textual prompts, using ten prompts that each request a two-shot. The results offer a glimpse into the potential of AI in visual storytelling, but also highlight the challenges that remain.
Created with: flux-pro
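Every prompt in this experiment follows the same semicolon-separated pattern: a camera-positions tag with the shot type, a list of moods, the subject, the shot type again, a theme, a setting, and a style keyword. The sketch below shows one way such a prompt could be assembled. The function and parameter names are hypothetical; the article does not show the tooling that produced the prompts.

```python
def build_prompt(shot, moods, subject, theme, setting, style="cinematic"):
    """Assemble a prompt in the semicolon-separated pattern used in this article.

    Hypothetical helper: the names are illustrative, not the author's tooling.
    """
    mood_list = ", ".join(moods)
    # The shot type appears twice: in the leading tag and as its own field.
    return (
        f"camera-positions {shot}: {mood_list} ; "
        f"{subject}; {shot}; {theme}; {setting}; {style}"
    )


prompt = build_prompt(
    shot="Two-shot",
    moods=["Epic", "hopeful", "determined"],
    subject="A lone hero, silhouetted against the setting sun",
    theme="Heroism",
    setting="A vast, desolate landscape",
)
print(prompt)
```

Keeping the shot type as a single parameter makes it straightforward to rerun the same subjects with a different camera position and compare the outputs.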
Silhouetted Against the Sunset: A Moment of Contemplation
A solitary figure stands on a hilltop, their silhouette stark against the vibrant hues of a setting sun. The scene evokes a sense of melancholy and contemplation, with the dramatic effect of the silhouette adding an air of mystery and intrigue.
Prompt
camera-positions Two-shot: Epic, hopeful, determined ; A lone hero, silhouetted against the setting sun; Two-shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure stands silhouetted against a vibrant orange sunset, overlooking a vast, hazy landscape.
Aesthetic Score : 0.7
Mood : serene, contemplative, hopeful
Quality
Entropy : 6.66
Noise : 57
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts.
Awe-Inspiring Waterfall: Nature’s Majestic Spectacle
Two figures stand dwarfed by a cascading waterfall, surrounded by vibrant greenery. The scene evokes a sense of serenity and wonder, highlighting the dramatic scale of nature’s power.
Prompt
camera-positions Two-shot: Wonder, excitement, awe ; Two adventurers, gazing in awe at a towering waterfall; Two-shot; Adventure; Lush, tropical rainforest; cinematic
Characteristic
Shot : A couple standing in front of a large waterfall in a lush, tropical forest.
Aesthetic Score : 0.8
Mood : peaceful, serene, awe-inspiring
Quality
Entropy : 6.84
Noise : 98
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
In the Zone: Gamer’s Intensity Under Dim Lights
A young man, headphones on, is fully immersed in a video game, his focused expression and the dimly lit room creating a sense of intense competition. A shadowy figure in the background adds a layer of intrigue to the scene.
Prompt
camera-positions Two-shot: Intense, focused, competitive ; Two gamers, intensely focused on a screen, controllers in hand; Two-shot; Gaming; A dimly lit room with neon lights; cinematic
Characteristic
Shot : A young man in a headset is sitting in front of a computer playing video games, with another person blurred in the background.
Aesthetic Score : 0.6
Mood : focused, intense, techy
Quality
Entropy : 6.70
Noise : 67
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor artifacts can be seen on the edges of the subject’s head and the computer screen. The lighting is a little harsh, causing some unnatural shadows and highlights.
Love in the City of Lights: A Selfie Under the Parisian Arch
A couple captures their Parisian adventure with a joyful selfie under a grand archway. The blurred background adds a touch of romance, highlighting their happiness and the city’s charm. This photo embodies the spirit of adventure and love, making it a perfect memory of their trip.
Prompt
camera-positions Two-shot: Happy, carefree, celebratory ; Two tourists, smiling and taking a selfie in front of a famous landmark; Two-shot; Tourism; A bustling city square; cinematic
Characteristic
Shot : A couple is taking a selfie in front of an archway. They are both smiling and seem happy. The background is blurry and out of focus, but there are other people walking around in the distance. The archway is a beautiful architectural feature, and it is in good condition.
Aesthetic Score : 0.7
Mood : joyful, romantic, happy
Quality
Entropy : 6.91
Noise : 72
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurring around the edges, likely caused by image compression or editing.
Sunshine and Laughter: Two Friends Embrace the Joy of the Day
Two women, radiating happiness, stroll down a vibrant street, their laughter echoing through the sunny air. The scene captures the pure joy of friendship and the simple pleasures of a beautiful day.
Prompt
camera-positions Two-shot: Joyful, adventurous, curious ; Two friends, sharing a laugh as they explore a foreign city; Two-shot; Travel; A vibrant, colorful street market; cinematic
Characteristic
Shot : Two women are walking down a street, laughing and talking to each other. The setting is a city, and the street is lined with shops and buildings. The sun is shining brightly.
Aesthetic Score : 0.8
Mood : happy, carefree, friendly
Quality
Entropy : 6.84
Noise : 77
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
Cheers to Friendship: A Toast in the Warm Glow of a Pub
Friends share a beer in a dimly lit pub, capturing the joy and camaraderie of the moment. The warm lighting and focus on the glasses create a sense of intimacy and connection that suits the casual, social mood.
Prompt
camera-positions Two-shot: Warm, celebratory, intimate ; A group of friends, raising their glasses in a toast; Two-shot; Groups; A cozy, dimly lit pub; cinematic
Characteristic
Shot : A group of friends toasting with beer in a dimly lit pub setting. The focus is on the hand holding the beer.
Aesthetic Score : 0.6
Mood : casual, friendly, festive
Quality
Entropy : 6.53
Noise : 66
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is good but there are some minor noise and blur issues.
Awe and Loneliness: Astronauts Gaze at Earth from the Vastness of Space
Two astronauts, bathed in the cool light of a futuristic spaceship, stand before a window offering a breathtaking view of Earth. The scene evokes a sense of wonder and isolation, capturing the profound experience of witnessing our planet from afar.
Prompt
camera-positions Two-shot: Serious, focused, determined ; Two astronauts, working together in a space station; Two-shot; Heroism; The vast emptiness of space; cinematic
Characteristic
Shot : Two astronauts in spacesuits are floating in a spaceship looking out at a large planet in the distance.
Aesthetic Score : 0.7
Mood : futuristic, mysterious, awe-inspiring
Quality
Entropy : 6.69
Noise : 82
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, particularly in the background.
Lost in the Mist: A Serene Adventure Awaits
Two figures, a man and a woman, embark on a mysterious journey through a lush, mist-shrouded forest. The stone path winds its way deeper into the dense foliage, promising adventure and intrigue. The dim light and ethereal atmosphere create a sense of serenity and wonder.
Prompt
camera-positions Two-shot: Suspenseful, adventurous, determined ; Two explorers, navigating a treacherous jungle path; Two-shot; Adventure; Dense, overgrown jungle; cinematic
Characteristic
Shot : Two hikers, a man and a woman, walk along a jungle path, surrounded by lush greenery. The path is mostly obscured by vegetation and the image focuses on the hikers as they walk towards the camera.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, peaceful
Quality
Entropy : 6.72
Noise : 120
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable errors in the image
Victory High Five Under Neon Lights
Two friends celebrate a win in a dimly lit gaming room, their high five illuminated by vibrant neon lights. The scene captures the excitement and camaraderie of shared victory.
Prompt
camera-positions Two-shot: Excited, triumphant, celebratory ; Two gamers, celebrating a victory with a high-five; Two-shot; Gaming; A brightly lit gaming room with colorful lights; cinematic
Characteristic
Shot : Two young men in a dimly lit room celebrating a victory, possibly a gaming win. They are high-fiving each other. The room has gaming peripherals and colorful lights.
Aesthetic Score : 0.6
Mood : joyful, energetic, competitive
Quality
Entropy : 6.35
Noise : 66
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is slightly blurry, especially in the background, which may be due to low light conditions. The image has a slight chromatic aberration in some areas, particularly around the edges of objects.
Silhouettes of Love Against a Sunset Sky
A romantic and serene scene unfolds as a couple stands silhouetted against a breathtaking sunset over a tranquil beach. The dramatic effect of their forms against the warm glow creates a sense of mystery and intimacy, leaving a hopeful and enduring impression.
Prompt
camera-positions Two-shot: Peaceful, romantic, contemplative ; Two travelers, gazing out at a breathtaking sunset over the ocean; Two-shot; Travel; A serene beach with golden sand; cinematic
Characteristic
Shot : A couple sitting on a beach at sunset, facing the ocean. The woman is wearing a black top and jeans, and the man is wearing a suit jacket. The sun is setting in the background, casting a warm glow over the scene.
Aesthetic Score : 0.7
Mood : romantic, serene, contemplative
Quality
Entropy : 6.69
Noise : 70
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly overexposed, resulting in some loss of detail in the highlights.
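The per-image values listed above can be summarized with a few lines of Python. This is a minimal sketch that only averages the published numbers and picks out the extremes. Plain averaging is an assumption made here for illustration, and the article does not say whether the scores in the conclusion were derived this way.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten listings above.
results = [
    ("Silhouetted Against the Sunset", 0.7, 0.30, 0.10),
    ("Awe-Inspiring Waterfall", 0.8, 0.31, 0.10),
    ("In the Zone", 0.6, 0.24, 0.20),
    ("Love in the City of Lights", 0.7, 0.29, 0.20),
    ("Sunshine and Laughter", 0.8, 0.25, 0.10),
    ("Cheers to Friendship", 0.6, 0.24, 0.20),
    ("Awe and Loneliness", 0.7, 0.27, 0.80),
    ("Lost in the Mist", 0.6, 0.23, 0.20),
    ("Victory High Five", 0.6, 0.30, 0.20),
    ("Silhouettes of Love", 0.7, 0.29, 0.30),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai = mean(r[3] for r in results)

# Image whose content matched its prompt least, by CLIP score.
weakest_match = min(results, key=lambda r: r[2])
# Image rated most likely to be AI-generated.
most_ai_like = max(results, key=lambda r: r[3])

print(f"mean aesthetic score:  {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print(f"weakest prompt match:   {weakest_match[0]}")
print(f"most AI-like image:     {most_ai_like[0]}")
```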
Conclusion
The results show mixed performance from the generative AI model in understanding and implementing camera positions and shot composition.
Here’s a breakdown:
- Camera Position Analysis: The score of 0.25 indicates that the model’s ability to respond to the camera position named in the prompt is below average. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Shot Analysis: The score of 0.485 suggests that the model’s understanding of the scene described in the prompt, and its ability to create a corresponding shot, is slightly below average. The same bands apply: 0.5 to 0.75 is good, and above 0.75 is very good.
- Aesthetic Analysis: The score of 0.04 indicates that the generated image’s aesthetic is close to the expected aesthetic described in the prompt. A score between -0.2 and 0.1 is considered very good.
Overall, the model seems to struggle with accurately interpreting camera positions and shot composition, but it does a decent job of capturing the desired aesthetic.
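The score bands used in the breakdown can be written down as two small functions. This is a sketch under stated assumptions: the function names are hypothetical, the label for anything under 0.5 is taken from how the text describes 0.25 and 0.485, and the text gives no label for aesthetic scores outside the range of -0.2 to 0.1.

```python
def rate_match(score):
    """Band for the camera-position and shot scores, per the thresholds in the text."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    # Assumption: the text calls 0.25 and 0.485 "below average",
    # so everything under 0.5 is given that label here.
    return "below average"


def rate_aesthetic(score):
    """Band for the aesthetic score, which the text rates by closeness to the prompt."""
    if -0.2 <= score <= 0.1:
        return "very good"
    # The text defines no label outside this range.
    return "unrated"


print("camera position:", rate_match(0.25))
print("shot:", rate_match(0.485))
print("aesthetic:", rate_aesthetic(0.04))
```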
Sources:
- https://www.studiobinder.com/blog/types-of-camera-shot-angles-in-film/
- https://www.learnaboutfilm.com/film-language/picture/camera-position/
- https://boords.com/blog/16-types-of-camera-shots-and-angles-with-gifs
- https://shorthand.com/the-craft/8-tips-for-great-visual-storytelling/
- https://fal.ai/models/fal-ai/flux-pro/api