AI's Artistic Eye: A Mixed Bag of Results with Flux-dev
The world of filmmaking is filled with distinct visual styles, each conveying its own mood and emotion. From the gritty realism of a war film to the whimsical charm of a fantasy adventure, these aesthetics are carefully crafted to immerse the audience in the story. But can artificial intelligence replicate these artistic choices? We put a generative AI model, flux-dev, to the test, asking it to create images from prompts that each specify a cinematic aesthetic (Dogme 95 throughout), a mood, a subject, a shot type, and a setting. While the model demonstrated a strong understanding of camera positioning and shot composition, it struggled to capture the desired emotional and stylistic nuances. This blog post presents the results of our experiment image by image, followed by a summary of where the model met the prompts and where it fell short.
Created with: flux-dev
Lost in the Shadows: A Moment of Focused Contemplation
A solitary figure sits in a dimly lit room, their face partially obscured by shadow as they navigate the digital world. The low light and dramatic use of shadow create a sense of mystery and intrigue, hinting at a moment of deep thought and contemplation.
Prompt
style-aesthetic Dogme 95: Intense, focused ; A player’s hands frantically manipulating a joystick, their face illuminated by the screen; medium shot; Gaming; A dimly lit room with a computer monitor glowing brightly; cinematic
Characteristic
Shot : A person is sitting in a dimly lit room, using a computer and a mouse, with a monitor in the background.
Aesthetic Score : 0.5
Mood : gloomy, introspective, techy
Quality
Entropy : 5.96
Noise : 49
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, particularly in the shadows.
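The post does not define how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the grayscale intensity histogram in bits, which ranges from 0 to 8 for 8-bit pixels and would be consistent with the values around 6 reported for these images. A minimal sketch under that assumption, using a plain list of pixel values rather than a real image file:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of grayscale pixel values.

    Assumption: this is only one plausible definition of the "Entropy"
    metric; the post does not say which formula it uses.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # H = sum over intensity levels of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A flat image carries no information; a uniform spread over all
# 256 levels reaches the 8-bit maximum.
flat_image = [128] * 16
uniform_image = list(range(256))
print(shannon_entropy(flat_image))
print(shannon_entropy(uniform_image))
```

Under this definition, a higher value means the intensities are spread more evenly across the tonal range, so a dark, shadow-heavy frame would be expected to score lower than a busy, colorful one.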
Silhouetted Serenity: A Moment of Hope at Sunset
A solitary figure stands against the backdrop of a golden sunset, their silhouette casting a sense of mystery and intrigue. The serene scene evokes feelings of contemplation and hope, as the setting sun paints the sky with hues of peace and beauty.
Prompt
style-aesthetic Dogme 95: Epic, hopeful ; A lone figure, silhouetted against a setting sun; long shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure silhouetted against a vivid orange sunset, standing on a grassy hilltop.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, serene
Quality
Entropy : 6.24
Noise : 30
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor compression artifacts, particularly in the sky, which could be improved with a higher-resolution source.
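Every prompt in this post follows the same semicolon-separated shape: a style-aesthetic with a mood, then a subject, a shot type, a theme, a setting, and a trailing "cinematic" tag. The sketch below shows how such a prompt could be assembled from structured fields. The field names (`style`, `mood`, `subject`, `shot`, `theme`, `setting`) are labels chosen for this illustration and do not come from the post or from flux-dev.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """Hypothetical container for the parts of one prompt."""
    style: str
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str


def build_prompt(spec: ShotSpec) -> str:
    """Join the fields in the order and punctuation used by the prompts here."""
    return (
        f"style-aesthetic {spec.style}: {spec.mood} ; {spec.subject}; "
        f"{spec.shot}; {spec.theme}; {spec.setting}; cinematic"
    )


sunset = ShotSpec(
    style="Dogme 95",
    mood="Epic, hopeful",
    subject="A lone figure, silhouetted against a setting sun",
    shot="long shot",
    theme="Heroism",
    setting="A vast, desolate landscape",
)
print(build_prompt(sunset))
```

Keeping the fields separate makes it easy to vary one element, such as the mood, while holding the shot type and setting fixed.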
A Dash of Red in a Vibrant Market
A woman in a striking red dress walks through a bustling market, her presence adding a touch of mystery to the vibrant scene. The colorful awnings and bustling vendors create a lively and exotic atmosphere.
Prompt
style-aesthetic Dogme 95: Energetic, lively ; A bustling marketplace, filled with vibrant colors and exotic goods; wide shot; Tourism; A crowded street in a foreign city; cinematic
Characteristic
Shot : A woman in a red dress is walking down a narrow street lined with stalls and awnings. The street is bustling with people and the scene is filled with a sense of life and energy.
Aesthetic Score : 0.6
Mood : vibrant, lively, bustling
Quality
Entropy : 6.48
Noise : 102
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor artifacts are present in the image, but they do not detract significantly from the overall quality.
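The Prompt Clip Score reported for each image is not defined in the post either. CLIP-based scores are commonly the cosine similarity between an image embedding and a text embedding, and the sketch below assumes that reading. The vectors are toy values standing in for real embeddings, which would come from a CLIP model that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Toy embeddings (assumed values, for illustration only).
image_embedding = [3.0, 4.0]
prompt_embedding = [4.0, 3.0]
print(cosine_similarity(image_embedding, prompt_embedding))
```

If the score is computed this way, it measures how closely the image content matches the prompt text as a whole, not whether the requested aesthetic was achieved.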
Tranquil Journey Through Hazy Landscapes
A train glides through a picturesque rural scene on a slightly hazy day, evoking a sense of tranquility and nostalgia. The motion blur captures the dynamism of the journey, adding a touch of melancholic beauty to the moment.
Prompt
style-aesthetic Dogme 95: Nostalgic, contemplative ; A train speeding through a countryside landscape, blurring the scenery; long shot; Travel; Rolling hills and fields passing by; cinematic
Characteristic
Shot : A train is moving through a rural landscape. The camera is positioned inside the train, looking out of a window. The motion blur effect created by the movement of the train makes the scene feel dynamic, but it also obscures the details of the landscape.
Aesthetic Score : 0.4
Mood : tranquil, nostalgic, somber
Quality
Entropy : 6.52
Noise : 42
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The motion blur is a bit overdone. There are some minor artifacts on the train windows.
One Hand, One Rope, One Mountain to Conquer
A close-up shot captures the raw intensity of a climber’s grip, clinging to a sheer cliff face. The blurred mountain range in the background adds a sense of scale and isolation, highlighting the dramatic challenge ahead. This image evokes a mood of suspense and adventure, leaving you wondering what lies beyond the next hold.
Prompt
style-aesthetic Dogme 95: Suspenseful, thrilling ; A hand reaching out to grasp a rope ladder dangling from a cliff face; close-up; Adventure; A rocky, treacherous mountainside; cinematic
Characteristic
Shot : Close-up of a hand holding a rope, with a mountain range in the background. The hand is reaching out towards the blurry mountains, emphasizing a sense of isolation and solitude.
Aesthetic Score : 0.5
Mood : solitary, adventurous, contemplative
Quality
Entropy : 6.54
Noise : 71
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image quality is somewhat soft, particularly in the background. This could be due to the camera settings or post-processing.
A Moment of Wonder: Capturing a Child’s Curiosity
This intimate portrait captures a young child’s startled expression, bathed in soft, warm light. The close-up framing and blurred background create a sense of vulnerability and draw the viewer into the child’s world of wonder and contemplation.
Prompt
style-aesthetic Dogme 95: Sad, poignant ; A single tear rolling down a child’s cheek as they watch their parents argue; close-up; Family; A dimly lit living room; cinematic
Characteristic
Shot : A close-up shot of a young child, likely a girl, looking to the side with a slightly open mouth. The background is blurry, suggesting a person out of focus.
Aesthetic Score : 0.6
Mood : intrigued, curious, thoughtful
Quality
Entropy : 6.34
Noise : 39
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image suffers from some slight blurriness, primarily around the edges and the background. This might be due to the camera settings or the way the image was processed.
Family Dinner: A Moment of Warmth and Connection
This heartwarming scene captures a family enjoying a cozy dinner together. The warm lighting and intimate setting create a sense of closeness and happiness, highlighting the importance of shared meals and family bonds.
Prompt
style-aesthetic Dogme 95: Warm, intimate ; A family gathered around a dinner table, sharing a meal and laughter; medium shot; Family; A cozy, well-worn kitchen; cinematic
Characteristic
Shot : A family of four is gathered around a dining table. The adults are seated on chairs, while the children are seated on the floor. They are all eating and talking to each other. There is a large bowl of food in the center of the table.
Aesthetic Score : 0.6
Mood : warm, cozy, casual
Quality
Entropy : 6.42
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Timeless Memories: A Vintage Snapshot of Youth
This faded photograph captures two young girls, their smiles hinting at a shared secret. The image evokes a sense of nostalgia, sweetness, and a touch of mystery, transporting us back to a bygone era.
Prompt
style-aesthetic Dogme 95: Melancholy, nostalgic ; A hand holding a worn photograph, the image blurred and faded; close-up; Family; A cluttered attic filled with old memories; cinematic
Characteristic
Shot : A vintage photograph of two young girls in old-fashioned clothing. The girls are standing in front of a blurry background that seems to be outdoors.
Aesthetic Score : 0.6
Mood : nostalgia, innocence, sweet
Quality
Entropy : 6.39
Noise : 54
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.00
Image errors : There is some slight graininess in the image.
Lost in the Storm’s Embrace
A solitary figure stands on a windswept cliff, dwarfed by the churning fury of the sea. The dramatic scene evokes a sense of solitude and melancholy, highlighting the fragility of life against the vastness of nature.
Prompt
style-aesthetic Dogme 95: Awe-inspiring, contemplative ; A lone traveler gazing out at a vast ocean, their face filled with wonder; long shot; Travel; A dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a vast, misty ocean. The ocean waves crash against the rocky shore. The figure is wearing a dark jacket and a backpack. The sky is a muted gray.
Aesthetic Score : 0.6
Mood : tranquil, contemplative, solitary
Quality
Entropy : 6.66
Noise : 76
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurriness throughout the image, especially in the foreground. The figure’s details are not sharply defined.
Campfire Tales: Friends Gather Under the Stars
A group of young adults share stories and laughter around a crackling campfire, bathed in the warm glow of the flames. The cozy atmosphere and the mystery of the surrounding forest create a perfect setting for a night of connection and adventure.
Prompt
style-aesthetic Dogme 95: Joyful, communal ; A group of friends huddled together around a campfire, sharing stories and laughter; medium shot; Adventure; A dark forest with flickering flames; cinematic
Characteristic
Shot : A group of four friends are sitting around a campfire in a dark forest. The fire is casting a warm glow on their faces and the surrounding trees. There is a lot of fog in the air.
Aesthetic Score : 0.6
Mood : cozy, relaxed, mysterious
Quality
Entropy : 6.32
Noise : 76
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a bit grainy, likely due to low light conditions and camera settings.
Conclusion
The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower bound of the “good” range (0.5 to 0.75). This indicates that the model captured the camera position described in the prompt reasonably well.
- Shot Analysis: The model scored 0.59, also within the “good” range. This suggests that the model understood the scene described in the prompt and was able to create an image that reflected that understanding.
- Aesthetic Analysis: The model scored 0.13, which falls outside the “very good” range (-0.2 to 0.1). This indicates that the generated images matched the expected aesthetic less closely than they matched the requested camera position and shot.
Overall, the model demonstrates a good understanding of camera position and shot composition, but needs improvement in capturing the desired aesthetic.
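For a quick view across all ten images, the per-image values listed above can be averaged. The sketch below uses only the numbers reported in this post. These averages are separate from the camera position, shot, and aesthetic analysis scores in the breakdown, whose calculation is not described here.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, noise, likelihood of AI)
RESULTS = [
    ("Lost in the Shadows", 0.5, 0.19, 49, 0.20),
    ("Silhouetted Serenity", 0.7, 0.29, 30, 0.10),
    ("A Dash of Red", 0.6, 0.26, 102, 0.20),
    ("Tranquil Journey", 0.4, 0.28, 42, 0.20),
    ("One Hand, One Rope", 0.5, 0.28, 71, 0.30),
    ("A Moment of Wonder", 0.6, 0.22, 39, 0.10),
    ("Family Dinner", 0.6, 0.28, 81, 0.10),
    ("Timeless Memories", 0.6, 0.26, 54, 0.00),
    ("Lost in the Storm's Embrace", 0.6, 0.28, 76, 0.10),
    ("Campfire Tales", 0.6, 0.29, 76, 0.20),
]

summary = {
    "aesthetic": mean(r[1] for r in RESULTS),
    "clip": mean(r[2] for r in RESULTS),
    "noise": mean(r[3] for r in RESULTS),
    "ai_likelihood": mean(r[4] for r in RESULTS),
}

# The image whose content matched its prompt text least closely.
weakest_match = min(RESULTS, key=lambda r: r[2])

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("lowest prompt CLIP score:", weakest_match[0])
```

The lowest Prompt Clip Score belongs to the first image, where the prompt asked for hands on a joystick and the generated shot shows a person using a computer and a mouse.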
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux/dev/api