AI Captures the Scene, But Struggles with the Shot: An Experiment with Bfl-flux-pro
Generative AI models can now create striking images from text prompts, but matching every element of a prompt remains a challenge. This blog post examines the results of a recent experiment in which a generative AI model was given detailed scene descriptions specifying camera position, shot type, and aesthetic elements. The model demonstrated a strong understanding of scene content and aesthetic, but it struggled to reproduce the intended camera position. This highlights how hard it still is to translate the visual language of a text description into a realistic image.
Created with: flux-pro
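Every prompt in this post follows the same semicolon-separated layout: a composition prefix, a list of moods, the scene, the shot type, a theme, a lighting or background detail, and a style keyword. The sketch below splits a prompt into those parts. The field names (`composition`, `moods`, `theme`, and so on) are labels chosen here for illustration, not names used by the experiment or by flux-pro.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScenePrompt:
    # Field names are illustrative assumptions, not part of the experiment.
    composition: str
    moods: List[str]
    scene: str
    shot_type: str
    theme: str
    details: str
    style: str


def parse_prompt(text: str) -> ScenePrompt:
    """Split a '<composition>: moods ; scene; shot; theme; details; style' prompt."""
    composition, sep, rest = text.partition(":")
    if not sep:
        raise ValueError("prompt has no ':' after the composition prefix")
    parts = [part.strip() for part in rest.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    moods_raw, scene, shot_type, theme, details, style = parts
    moods = [mood.strip() for mood in moods_raw.split(",")]
    return ScenePrompt(composition.strip(), moods, scene, shot_type,
                       theme, details, style)


if __name__ == "__main__":
    prompt = ("poses rule-of-thirds: Epic, determined, hopeful ; "
              "A lone hero standing on a cliff overlooking a vast, stormy sea; "
              "Wide shot; Heroism; Dramatic sky with crashing waves; cinematic")
    parsed = parse_prompt(prompt)
    print(parsed.shot_type, parsed.moods)
```

Pulling the shot type out as its own field makes it easy to compare what was requested (wide, medium, or close-up) with what each generated image shows.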
A Solitary Figure Contemplates the Fury of the Storm
A lone figure stands defiant against the elements, silhouetted against a dramatic lightning strike. The crashing waves and stormy sky create a powerful and ominous scene, capturing the raw beauty of nature’s wrath.
Prompt
poses rule-of-thirds: Epic, determined, hopeful ; A lone hero standing on a cliff overlooking a vast, stormy sea; Wide shot; Heroism; Dramatic sky with crashing waves; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a stormy sea with crashing waves and lightning in the background
Aesthetic Score : 0.7
Mood : dramatic, mysterious, foreboding
Quality
Entropy : 6.57
Noise : 79
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The waves look a bit too artificial and the figure’s details are blurry.
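The post does not say how the Entropy figure in each Quality block is computed. One common definition, assumed here, is the Shannon entropy in bits of the image's 8-bit grayscale histogram. That reading is at least consistent with the reported values, which all fall between 6 and 7 out of a possible 8. A minimal sketch under that assumption, taking a flat sequence of pixel intensities as input:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy (bits) of the intensity distribution.

    Assumption: this mirrors the post's 'Entropy' metric; the source
    does not confirm the exact formula or preprocessing.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # A flat image carries no information; a uniform spread over
    # all 256 levels reaches the 8-bit maximum.
    print(shannon_entropy([128] * 100))
    print(shannon_entropy(range(256)))
```

Under this definition, higher values mean the intensities are spread more evenly across the tonal range, which tends to go with richer detail and contrast.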
Campfire Tales: Mystery and Warmth Under the Stars
Four friends huddle around a crackling campfire, their faces illuminated by the dancing flames. The atmosphere is both cozy and mysterious, hinting at an adventure unfolding in the darkness of the woods.
Prompt
poses rule-of-thirds: Intriguing, mysterious, suspenseful ; A group of adventurers huddled around a campfire in a dense forest; Medium shot; Adventure; Shadows and flickering flames; cinematic
Characteristic
Shot : A group of four friends are gathered around a campfire in a forest. It is nighttime and the fire is casting a warm glow on their faces. The friends are all dressed in casual clothes and look like they are enjoying each other’s company.
Aesthetic Score : 0.6
Mood : cozy, warm, adventurous
Quality
Entropy : 6.49
Noise : 81
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors. The image is slightly underexposed and the faces of the friends are not in focus.
Immersed in the Action: A Gamer’s Focused Intensity
A player is deeply engrossed in a video game, their controller held firmly in hand. The TV screen behind them displays a fiery battle scene, adding to the excitement and anticipation of the moment. The blurred background and sharp focus on the controller create a dynamic visual that captures the intensity of the gaming experience.
Prompt
poses rule-of-thirds: Focused, intense, exhilarating ; A gamer’s hands intensely gripping a controller, the screen displaying a thrilling moment in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : A person is holding a video game controller in front of a TV playing a video game with a fiery scene and a character in the background
Aesthetic Score : 0.6
Mood : intense, focused, exciting
Quality
Entropy : 6.85
Noise : 55
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, particularly in the background. The lighting is also somewhat uneven, with the controller being overexposed compared to the TV screen.
Solitude and Majesty: A Hiker Finds Peace Amidst Mountain Peaks
A lone hiker stands on a rocky outcropping, taking in the breathtaking view of a serene mountain lake. The majestic peaks surrounding the lake and the clear blue sky create a tranquil and awe-inspiring scene. The dramatic effect of the lone figure against the vast landscape evokes a sense of solitude and wonder.
Prompt
poses rule-of-thirds: Tranquil, awe-inspiring, peaceful ; A majestic mountain range reflected in a still lake, with a lone hiker standing on a rocky outcrop; Wide shot; Tourism; Clear blue sky and vibrant green foliage; cinematic
Characteristic
Shot : A lone hiker stands on a rocky outcropping overlooking a pristine mountain lake, surrounded by towering peaks and a clear blue sky.
Aesthetic Score : 0.8
Mood : serene, peaceful, tranquil
Quality
Entropy : 6.81
Noise : 90
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, causing the sky to appear washed out.
Lost in the Sunset: A Moment of Nostalgia on the Rails
A man, lost in thought, gazes out the window of a moving train as the sun sets over a verdant valley. The image evokes a sense of nostalgia and melancholic contemplation, capturing the fleeting beauty of a moment in time.
Prompt
poses rule-of-thirds: Nostalgic, romantic, adventurous ; A vintage train speeding through a picturesque countryside, with a lone traveler gazing out the window; Medium shot; Travel; Rolling hills and vibrant fields; cinematic
Characteristic
Shot : A man in a vintage outfit looks out of a window of a train travelling through a lush green landscape. The train is likely going through mountainous areas.
Aesthetic Score : 0.7
Mood : nostalgic, contemplative, adventurous
Quality
Entropy : 6.74
Noise : 75
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor chromatic aberration and noise are visible at the edges of the image, particularly in the blurred background.
Sun-Kissed Friends: A Lunchtime Gathering Filled with Joy
Capture the warmth and camaraderie of a sunny afternoon as friends gather for a relaxed lunch outdoors. The vibrant greenery and bright sunlight create a cheerful atmosphere, perfectly reflecting the joyful mood of the moment.
Prompt
poses rule-of-thirds: Joyful, lively, celebratory ; A group of friends laughing and enjoying a meal together at a bustling outdoor market; Medium shot; Groups; Colorful stalls and vibrant street life; cinematic
Characteristic
Shot : A group of friends are gathered around a table outdoors, enjoying a meal and drinks. The sun is shining brightly and the atmosphere is relaxed and happy.
Aesthetic Score : 0.7
Mood : happy, friendly, relaxed
Quality
Entropy : 6.73
Noise : 79
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly overexposed and some of the colors are slightly muted. There is also some noise in the shadows.
Silhouette of Serenity: A Woman Contemplates the Sunset
A solitary figure stands on a tranquil beach, bathed in the golden hues of a setting sun. The woman’s silhouette against the vibrant sky evokes a sense of calm and contemplation, creating a dramatic and mysterious scene.
Prompt
poses rule-of-thirds: Melancholy, reflective, hopeful ; A lone figure standing on a deserted beach, watching the sun setting over the horizon; Wide shot; Heroism; Golden light illuminating the sky and water; cinematic
Characteristic
Shot : A woman standing on a beach at sunset, looking out at the ocean.
Aesthetic Score : 0.7
Mood : melancholy, serene, peaceful
Quality
Entropy : 6.60
Noise : 63
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Sunlight Dappled Mystery: Three Men Journey Through the Forest
A serene and adventurous mood fills the air as three men navigate a dense forest, sunlight filtering through the leaves creating a sense of mystery and wonder. The scene evokes a feeling of exploration and the unknown, promising a captivating journey.
Prompt
poses rule-of-thirds: Intriguing, suspenseful, adventurous ; A group of explorers navigating a treacherous jungle path, with dense foliage surrounding them; Medium shot; Adventure; Lush greenery and dappled sunlight; cinematic
Characteristic
Shot : Three men are walking through a lush green forest with a path leading to the light coming from the trees in the distance.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, hopeful
Quality
Entropy : 6.36
Noise : 115
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors; only some minor noise in the shadow areas.
Lost in the Game: A Moment of Intense Focus
A young man, bathed in cool blue and purple light, sits engrossed in a video game. His intense focus and the dramatic lighting create a sense of suspense and anticipation, drawing the viewer into his world.
Prompt
poses rule-of-thirds: Focused, intense, determined ; A close-up of a gamer’s face, eyes glued to the screen, as they navigate a challenging level in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : A young man wearing headphones, illuminated by blue and purple light, is holding a game controller in his hands, while looking intently at the screen.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.84
Noise : 66
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears slightly overexposed, with some loss of detail in the highlights, but no other significant errors.
Silhouetted Against the City Lights: A Moment of Hope and Adventure
A lone figure, arms outstretched, stands on a rooftop overlooking a vibrant cityscape at dusk. The silhouette against the glowing skyline evokes a sense of grandeur and scale, while the serene mood suggests a moment of hope and adventure.
Prompt
poses rule-of-thirds: Energetic, exciting, awe-inspiring ; A panoramic view of a bustling city skyline, with a lone tourist standing on a rooftop overlooking the scene; Wide shot; Tourism; Vibrant lights and towering buildings; cinematic
Characteristic
Shot : A man standing on a rooftop with his arms outstretched, looking out at the city skyline at dusk. The city lights are twinkling below, and a tall skyscraper is in the background. The sky is a mix of blue and orange, and the mood is peaceful and serene. There is a clock tower in the right corner.
Aesthetic Score : 0.7
Mood : serene, peaceful, hopeful
Quality
Entropy : 6.88
Noise : 78
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight artifacts in the image, particularly in the sky and the cityscape. The lighting is a little uneven, with some areas being overexposed and others underexposed.
Conclusion
The results show that the generative AI model performed well in understanding the scene and its aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.505, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. This means that the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and its aesthetic, but struggled with accurately capturing the intended camera position.
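The per-image figures reported above can also be averaged across the ten images. The sketch below copies the values from each image's Characteristic, Quality, and AI Evaluation blocks and computes the mean of each metric, plus the mean Prompt Clip Score per requested shot type. The short labels and function names are chosen here for illustration.

```python
from collections import defaultdict
from statistics import mean

# (label, requested shot, aesthetic, entropy, noise, clip score, AI likelihood)
# Values are copied from the per-image blocks in this post.
RESULTS = [
    ("storm",    "Wide shot",   0.7, 6.57,  79, 0.25, 0.80),
    ("campfire", "Medium shot", 0.6, 6.49,  81, 0.30, 0.20),
    ("gamer",    "Close-up",    0.6, 6.85,  55, 0.25, 0.30),
    ("hiker",    "Wide shot",   0.8, 6.81,  90, 0.26, 0.20),
    ("train",    "Medium shot", 0.7, 6.74,  75, 0.26, 0.20),
    ("lunch",    "Medium shot", 0.7, 6.73,  79, 0.19, 0.30),
    ("beach",    "Wide shot",   0.7, 6.60,  63, 0.23, 0.20),
    ("forest",   "Medium shot", 0.6, 6.36, 115, 0.24, 0.20),
    ("face",     "Close-up",    0.6, 6.84,  66, 0.27, 0.30),
    ("rooftop",  "Wide shot",   0.7, 6.88,  78, 0.25, 0.20),
]

COLUMNS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(results):
    """Mean of each numeric column across all images."""
    return {name: mean(row[i + 2] for row in results)
            for i, name in enumerate(COLUMNS)}


def clip_by_shot(results):
    """Mean Prompt Clip Score grouped by the shot type the prompt asked for."""
    groups = defaultdict(list)
    for row in results:
        groups[row[1]].append(row[5])
    return {shot: mean(scores) for shot, scores in groups.items()}


if __name__ == "__main__":
    for name, value in summarize(RESULTS).items():
        print(f"{name:14s} {value:.3f}")
    for shot, value in clip_by_shot(RESULTS).items():
        print(f"{shot:14s} {value:.4f}")
```

With only ten images, and just two close-ups, these averages are a rough summary and not a statistically meaningful comparison between shot types.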