AI's Artistic Eye: A Look at Generative Models and Scene Composition with Flux-pro
Generative AI is evolving rapidly, and current models can create striking, realistic images from text prompts. A key part of that ability is understanding and translating scene composition: camera position, shot type, and aesthetic style. This post examines how well one generative model captures these elements, noting its strengths and the areas where it could improve. We’ll also look at the concept of ‘dramatic style’ and how it is used in film, photography, and digital art.
Created with: flux-pro
A Solitary Journey Towards the Crimson Sun
A lone figure, cloaked in mystery, walks towards a colossal red sun, its fiery glow illuminating a sky awash in swirling clouds. This digital painting evokes a sense of epic drama and melancholic contemplation, highlighting the figure’s isolation against the vastness of the universe.
Prompt
Stylized: Epic and melancholic ; A lone warrior; wide shot; Heroism; A desolate battlefield with a setting sun; cinematic
Characteristic
Shot : A solitary figure in a long coat stands in a field with a large red sun setting in the distance.
Aesthetic Score : 0.6
Mood : melancholy, dramatic, hopeful
Quality
Entropy : 6.01
Noise : 58
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are no visible errors in the image.
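Every prompt in this post follows the same semicolon-separated layout: a mood after the "Stylized:" prefix, then subject, shot type, theme, setting, and a closing style keyword. As a minimal sketch of how such a prompt could be split into its parts for later comparison against the "Characteristic" output, the snippet below uses a hypothetical `ScenePrompt` class and `parse_prompt` function. The field names are assumptions made for illustration and are not part of flux-pro or of the evaluation tooling behind this post.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are illustrative assumptions, not an official schema.
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a 'Stylized: ...' prompt into its six semicolon-separated fields."""
    prefix = "Stylized:"
    text = prompt.strip()
    if text.startswith(prefix):
        text = text[len(prefix):]
    parts = [part.strip() for part in text.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return ScenePrompt(*parts)


example = parse_prompt(
    "Stylized: Epic and melancholic ; A lone warrior; wide shot; Heroism; "
    "A desolate battlefield with a setting sun; cinematic"
)
print(example.shot)     # wide shot
print(example.setting)  # A desolate battlefield with a setting sun
```

Having the requested shot type as its own field makes it easier to check it against the generated image, which is the comparison the camera-position score in the conclusion is concerned with.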
Unveiling the Forest’s Hidden Treasure
A dimly lit forest conceals a treasure chest overflowing with gold coins, its contents illuminated by a mysterious light. The scene evokes a sense of adventure, magic, and the promise of untold riches.
Prompt
Stylized: Excitement and wonder ; A treasure chest overflowing with gold; close-up; Adventure; A dark and mysterious cave; cinematic
Characteristic
Shot : A treasure chest overflowing with gold coins is nestled amidst rocks and a blurred forest background, suggesting a hidden or forgotten trove.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, rich
Quality
Entropy : 6.89
Noise : 66
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.50
Image errors : Slight blurriness in the background and some texture artifacts on the chest.
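The post does not say how the "Entropy" values are computed. The reported numbers all fall below 8, which would be consistent with the Shannon entropy of an 8-bit grayscale histogram, so the sketch below assumes that definition. The function name `shannon_entropy` and the flat pixel-list input are illustrative choices, not the tool actually used here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of gray values.

    Assumption: entropy is taken over the histogram of pixel intensities.
    For 8-bit images the result lies between 0 and 8.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat_image = [128] * 64            # a single gray level: no variation
spread_image = list(range(256))    # every gray level exactly once

print(shannon_entropy(flat_image))    # 0.0
print(shannon_entropy(spread_image))  # 8.0
```

Under this reading, a higher value means the image uses a wider and more even spread of tones, while a low value points to large uniform areas, as in a silhouette against a plain sky.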
A Fantasy Hero Stands Guard, City Below, Orb Above
A powerful fantasy character, staff in hand, surveys a sprawling cityscape from a rooftop. A large, mysterious orb hangs in the background, adding to the epic and dramatic atmosphere. The character’s pose and the scene evoke a sense of power and mystery.
Prompt
Stylized: Triumphant and futuristic ; A player’s avatar, a powerful warrior, standing triumphantly; medium shot; Gaming; A vibrant and futuristic cityscape; cinematic
Characteristic
Shot : A fantasy character standing on a platform in front of a city skyline. The character is holding a staff and has a cape flowing behind them. There is a moon in the sky and the background is filled with a variety of buildings and lights.
Aesthetic Score : 0.6
Mood : fantasy, dramatic, heroic
Quality
Entropy : 6.85
Noise : 83
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : There is some aliasing and banding in the image, particularly in the sky and the character’s cape. The lighting is somewhat flat, which makes the image look a bit lifeless.
Dreamy Sunset Paints a Vibrant Cityscape
A whimsical scene unfolds as the sun sets over a bustling city, casting a warm glow on the cityscape. A majestic bull statue stands proudly atop a building, silhouetted against the fiery sky. The dramatic sunset creates a breathtaking spectacle, capturing the beauty and energy of urban life.
Prompt
Stylized: Energetic and lively ; A panoramic view of a bustling city; long shot; Tourism; A vibrant and colorful cityscape; cinematic
Characteristic
Shot : A vibrant cityscape at sunset with a large yellow sun dominating the sky. The buildings are diverse, some modern and some traditional. A street winds through the city, filled with cars and people. The scene is full of life and color.
Aesthetic Score : 0.7
Mood : dreamy, vibrant, optimistic
Quality
Entropy : 6.89
Noise : 99
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts around the edges of the buildings and in the sky. There is also some slight blurring in the distance, which could be improved.
Silhouetted Against the Setting Sun: A Moment of Solitude in the Desert
A lone figure stands in the vast expanse of a desert, their silhouette stark against the fiery glow of the setting sun. The scene evokes a sense of loneliness, calm, and a glimmer of hope, as the figure contemplates the vastness of the world around them.
Prompt
Stylized: Serene and contemplative ; A lone traveler gazing at a breathtaking sunset; medium shot; Travel; A vast desert landscape; cinematic
Characteristic
Shot : A single figure silhouetted against a large, bright sun in a desert landscape.
Aesthetic Score : 0.7
Mood : solitude, contemplative, hopeful
Quality
Entropy : 5.26
Noise : 32
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be digitally generated and has a slightly ‘flat’ appearance. The sand dunes lack realistic texture and there is a slight blurring effect around the sun.
Sun-Kissed Smiles and Blurred Happiness
A family of three strolls through a sun-drenched park, their laughter echoing through the air. The warm lighting and their joyful expressions create a sense of pure happiness, while the blurred background adds depth and perspective to this heartwarming scene.
Prompt
Stylized: Joyful and heartwarming ; A family laughing and playing in a park; medium shot; Family; A sunny and idyllic park setting; cinematic
Characteristic
Shot : A family of three, a father, mother and daughter, are walking and laughing in a park. The setting is a sunny afternoon in a park with trees and green grass.
Aesthetic Score : 0.7
Mood : happy, joyful, carefree
Quality
Entropy : 6.40
Noise : 77
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors in the image.
Solitude on the Stormy Edge
A lone figure stands defiant against the elements, silhouetted against a dramatic sky and crashing waves. This image evokes a sense of solitude, contemplation, and the raw power of nature.
Prompt
Stylized: Dramatic and powerful ; A lone figure standing on a cliff overlooking a vast ocean; long shot; Heroism; A stormy sea with dramatic clouds; cinematic
Characteristic
Shot : A lone figure stands on a cliff edge overlooking a vast, stormy ocean. The sky is filled with dramatic clouds and the setting sun casts a warm glow on the scene.
Aesthetic Score : 0.7
Mood : dramatic, melancholic, contemplative
Quality
Entropy : 6.62
Noise : 81
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to have a slight amount of noise and the colors are a bit muted. The figure could be more detailed.
Uncharted Territory: A Journey Begins
A close-up view of a map, its pins hinting at an adventure waiting to unfold. The blurred background adds a sense of mystery, leaving the viewer to ponder the destination and the journey ahead.
Prompt
Stylized: Intriguing and mysterious ; A map with pins marking locations of hidden treasures; close-up; Adventure; A dimly lit room with antique furniture; cinematic
Characteristic
Shot : A close-up of a map with push pins marking locations. The map is on a wooden table, with a blurry background of a window and a candle in the distance.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, nostalgic
Quality
Entropy : 6.88
Noise : 69
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise and grain are visible in the image, particularly in the darker areas.
The Huntress’s Focus
A woman stands poised in a sun-dappled forest, her arrow drawn back, her gaze unwavering. The low angle captures her intensity, creating a dramatic and suspenseful scene.
Prompt
Stylized: Intense and focused ; A player’s character, a skilled archer, aiming at a target; close-up; Gaming; A dark and mysterious forest; cinematic
Characteristic
Shot : A woman in a green jacket is aiming a bow and arrow in a forest.
Aesthetic Score : 0.7
Mood : intense, focused, mysterious
Quality
Entropy : 6.85
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts present in the image, particularly around the edges of the subject.
Cozy Dinner with Friends Under City Lights
A group of friends enjoy a warm and inviting dinner at a restaurant, bathed in soft lighting that creates a sense of intimacy. The city lights in the background add a touch of urban flair to the cozy atmosphere.
Prompt
Stylized: Social and celebratory ; A group of friends enjoying a meal at a restaurant with a view; medium shot; Tourism; A bustling city street with vibrant lights; cinematic
Characteristic
Shot : A group of friends enjoying a meal at an outdoor restaurant at night. The restaurant is decorated with string lights and the atmosphere is warm and inviting.
Aesthetic Score : 0.6
Mood : cozy, convivial, festive
Quality
Entropy : 6.60
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some noise in the dark areas, slight blur on the faces due to movement.
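To compare the ten images side by side, the per-image numbers reported above can be averaged. The snippet below is a small sketch that copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from each section. The short labels and the tuple layout are illustrative choices, and the averages are plain arithmetic means rather than the scores quoted in the conclusion.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten sections above; labels are shortened for readability.
results = [
    ("crimson sun",     0.6, 0.22, 0.80),
    ("treasure chest",  0.7, 0.21, 0.50),
    ("fantasy hero",    0.6, 0.31, 0.80),
    ("city sunset",     0.7, 0.26, 0.90),
    ("desert traveler", 0.7, 0.26, 0.80),
    ("family in park",  0.7, 0.22, 0.20),
    ("stormy cliff",    0.7, 0.26, 0.30),
    ("map with pins",   0.6, 0.27, 0.20),
    ("huntress",        0.7, 0.28, 0.20),
    ("dinner",          0.6, 0.28, 0.10),
]

avg_aesthetic = mean(r[1] for r in results)
avg_clip = mean(r[2] for r in results)
avg_ai = mean(r[3] for r in results)

# Images with the strongest and weakest prompt-to-image match.
best_clip = max(results, key=lambda r: r[2])
worst_clip = min(results, key=lambda r: r[2])

print(round(avg_aesthetic, 2), round(avg_clip, 3), round(avg_ai, 2))
print(best_clip[0], worst_clip[0])  # fantasy hero treasure chest
```

Seen this way, the aesthetic scores are tightly grouped between 0.6 and 0.7, while the likelihood-of-AI values vary far more, from 0.10 to 0.90.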
Conclusion
The results show that the model handled scene descriptions reasonably well but was less reliable at reproducing the camera positions requested in the prompts.
Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t always accurately capture the intended camera positions described in the prompts.
- Shot Analysis: The model scored 0.55, which falls within the “good” range. This indicates that the model generally understood the scene descriptions in the prompts and produced images that reflected those descriptions.
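- Aesthetic Analysis: The model scored 0.01, which is within the “very good” range of -0.2 to 0.1. Unlike the two scores above, this range sits around zero, so a value close to 0 is the favorable outcome here. This means that the generated images closely matched the expected aesthetic style.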
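The breakdown above amounts to checking each score against a named range. A minimal sketch of that check follows, using only the ranges quoted in the list. The `Band` type and `in_band` helper are hypothetical names, and applying the same “good” range to Shot Analysis as to Camera Position is an assumption, since the post states the bounds only once.

```python
from typing import NamedTuple


class Band(NamedTuple):
    label: str
    low: float
    high: float


def in_band(score: float, band: Band) -> bool:
    """Return True when the score lies inside the band, bounds included."""
    return band.low <= score <= band.high


GOOD = Band("good", 0.5, 0.75)                     # quoted for Camera Position
AESTHETIC_VERY_GOOD = Band("very good", -0.2, 0.1)  # quoted for Aesthetic Analysis

# (metric, score, band it is checked against)
checks = [
    ("Camera Position", 0.35, GOOD),
    ("Shot Analysis", 0.55, GOOD),  # assumption: same "good" range applies
    ("Aesthetic Analysis", 0.01, AESTHETIC_VERY_GOOD),
]

for metric, score, band in checks:
    verdict = "within" if in_band(score, band) else "outside"
    print(f"{metric}: {score} is {verdict} the '{band.label}' range")
```

Only Camera Position falls outside its range, which matches the reading given in the list.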
Overall, the model demonstrates a decent ability to understand and translate prompts into images, but it could benefit from improvements in its ability to accurately capture camera positions.
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux-pro/api