AI Struggles to Capture the 'Style-Aesthetic' - A Case Study with Flux-pro
The ‘style-aesthetic’ is a powerful tool in visual storytelling: it lets creators evoke specific emotions and atmospheres through color palettes, lighting, composition, and even the choice of objects within a scene. This experiment explored how well a generative AI model could understand and reproduce a requested ‘style-aesthetic’ in the images it generates. The model was only moderately faithful on camera position and shot composition, and it largely failed to capture the requested aesthetic, which highlights how hard it still is to teach AI to understand and implement complex artistic concepts.
Created with: flux-pro
A Solitary Journey Through the Desert
A lone figure, shrouded in mystery, traverses a vast desert landscape towards a distant building. The play of light and color evokes a sense of depth and intrigue, while the juxtaposition of the solitary traveler and the distant structure creates a powerful sense of scale and adventure.
Prompt
Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; medium shot; Adventure; a vast, sun-drenched desert landscape with ancient ruins in the distance; cinematic
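Every prompt in this post follows the same semicolon-separated layout, so it can be split into parts and compared field by field with what the evaluation reports. Here is a minimal sketch of that split. The field names (`style`, `moods`, `subject`, `shot`, `theme`, `setting`, `finish`) are my own labels, not terms used by flux-pro or by the evaluation tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    # Field names are illustrative labels, not part of any official prompt schema.
    style: str        # e.g. "Vintage"
    moods: List[str]  # e.g. ["Epic", "adventurous", "hopeful"]
    subject: str
    shot: str         # e.g. "medium shot"
    theme: str
    setting: str
    finish: str       # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'Style: moods ; subject; shot; theme; setting; finish' prompt."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    # The first field carries the style name and the comma-separated moods.
    style, _, moods = fields[0].partition(":")
    return PromptParts(
        style=style.strip(),
        moods=[m.strip() for m in moods.split(",") if m.strip()],
        subject=fields[1],
        shot=fields[2],
        theme=fields[3],
        setting=fields[4],
        finish=fields[5],
    )


EXAMPLE = parse_prompt(
    "Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; "
    "medium shot; Adventure; a vast, sun-drenched desert landscape with "
    "ancient ruins in the distance; cinematic"
)
```

With the prompt in this form, the requested `shot` and `moods` can be lined up against the Shot and Mood lines reported for each image.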
Characteristic
Shot : A man wearing a large backpack and hat walks across a desert landscape. There is a building in the background.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, hopeful
Quality
Entropy : 6.69
Noise : 83
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed. There are no other apparent artifacts or errors.
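The post does not say how the Entropy figure under Quality is computed. One common definition is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 to 8 bits and would be consistent with the values reported here (6.44 to 6.83). The sketch below assumes that definition. `shannon_entropy` is a hypothetical helper, not the function used to produce these numbers.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of gray levels (e.g. 0..255)."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = -sum(p * log2(p)) over the gray levels that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A flat image carries no information; an image that uses all 256 gray levels
# equally often reaches the 8-bit maximum.
flat = shannon_entropy([128] * 1000)
uniform = shannon_entropy(range(256))
```

Under this reading, the scores in this post, all between 6 and 7, would indicate images that use a broad range of tones without being pure noise.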
Cozy Candlelight and Board Game Fun
Four children gather around a wooden table, bathed in the warm glow of a window and candles, as they engage in a lively board game. The intimate setting and close-up framing capture the playful energy and cozy atmosphere of their shared moment.
Prompt
Vintage: Nostalgic, intimate, playful ; A group of children playing a board game; close-up; Gaming; a dimly lit room with a worn wooden table and flickering candlelight; cinematic
Characteristic
Shot : A group of children are playing a board game around a wooden table in a dimly lit room.
Aesthetic Score : 0.6
Mood : cozy, warm, focused
Quality
Entropy : 6.51
Noise : 73
Prompt Clip Score : 0.39
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors in the image.
Lost in Thought at the Station
A young woman in a grey dress stands alone on a train platform, her gaze lost in the distance. The shallow depth of field draws attention to her figure, creating a sense of isolation and contemplation. The romantic and wistful mood is palpable, hinting at a story waiting to be told.
Prompt
Vintage: Romantic, adventurous, hopeful ; A young woman in a vintage dress standing on a train platform; long shot; Travel; a bustling train station with steam locomotives and vintage luggage; cinematic
Characteristic
Shot : A young woman in a grey dress stands in front of a train station platform. A train is pulling out in the background.
Aesthetic Score : 0.7
Mood : dreamy, nostalgic, romantic
Quality
Entropy : 6.83
Noise : 72
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors, good resolution and clarity.
Heroic Rescue: Firefighter Braves Flames to Save Child
A dramatic image captures the moment a firefighter carries a child through a burning building, the flames and smoke billowing behind them. The scene highlights the bravery and urgency of the rescue, showcasing the intensity of the situation.
Prompt
Vintage: Dramatic, heroic, suspenseful ; A firefighter carrying a child through a burning building; close-up; Heroism; a smoky, chaotic scene with flames and debris; cinematic
Characteristic
Shot : A firefighter carrying a child out of a burning building, with the flames visible in the background.
Aesthetic Score : 0.7
Mood : dramatic, heroic, hopeful
Quality
Entropy : 6.53
Noise : 76
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors in the image.
Campfire Tales Under a Starry Sky
A cozy gathering around a crackling campfire, bathed in the warm glow of the flames and the twinkling light of a star-filled sky. This scene evokes a sense of nostalgia, peace, and connection, perfect for a night of shared stories and laughter.
Prompt
Vintage: Warm, nostalgic, peaceful ; A family gathered around a campfire; wide shot; Family; a serene forest setting with stars twinkling in the night sky; cinematic
Characteristic
Shot : A group of people are gathered around a campfire under a starry night sky. The scene is set in a forest clearing with trees surrounding the group.
Aesthetic Score : 0.7
Mood : cozy, peaceful, nostalgic
Quality
Entropy : 6.44
Noise : 84
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image. The colors are well-balanced and the image is sharp.
A Vintage Journey Through Majestic Mountains
A classic car stands poised on a winding mountain road, its journey just beginning. The towering peaks in the distance and lush greenery create a serene and nostalgic atmosphere, promising an adventurous drive through breathtaking scenery.
Prompt
Vintage: Romantic, adventurous, nostalgic ; A vintage car driving down a winding mountain road; long shot; Tourism; a scenic mountain landscape with lush forests and snow-capped peaks; cinematic
Characteristic
Shot : A vintage car is parked on a winding road with a mountain backdrop.
Aesthetic Score : 0.7
Mood : nostalgic, serene, adventurous
Quality
Entropy : 6.76
Noise : 102
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, resulting in a washed-out appearance.
Sunset Flight: A Pilot’s Tranquil Adventure Above the Clouds
A small biplane cuts through a sea of clouds as the sun sets, casting a warm glow on the pilot’s silhouette. The perspective and framing evoke a sense of isolation and adventure, while the soft light creates a peaceful atmosphere.
Prompt
Vintage: Exhilarating, adventurous, free ; A pilot in a vintage biplane soaring through the clouds; close-up; Adventure; a breathtaking view of a vast, blue sky with fluffy white clouds; cinematic
Characteristic
Shot : A person is flying in a small plane above the clouds. The clouds are lit by the setting sun, creating a warm glow.
Aesthetic Score : 0.7
Mood : serene, adventurous, nostalgic
Quality
Entropy : 6.73
Noise : 77
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some artifacts in the image, particularly around the edges of the clouds.
Fog of War: Soldiers March Through a Grim Cityscape
A haunting image of soldiers, armed and resolute, navigating a fog-shrouded city. The scene evokes a sense of tension and uncertainty, hinting at the harsh realities of war.
Prompt
Vintage: Grim, heroic, determined ; A group of soldiers marching through a war-torn city; medium shot; Heroism; a desolate cityscape with rubble and smoke; cinematic
Characteristic
Shot : A group of soldiers in military uniforms walking through a city street during a wartime setting. The image is taken from a slightly elevated perspective, looking down upon the soldiers as they march through the fog-filled streets.
Aesthetic Score : 0.6
Mood : wartime, grim, determined
Quality
Entropy : 6.77
Noise : 101
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors are visible.
A Romantic Waltz in a Grand Ballroom
A couple dances under warm, romantic lighting in a grand ballroom, surrounded by other guests. The scene exudes elegance and grace, capturing the intimacy and connection between the dancers.
Prompt
Vintage: Romantic, elegant, nostalgic ; A couple dancing in a vintage ballroom; close-up; Tourism; a grand ballroom with chandeliers and elegant guests; cinematic
Characteristic
Shot : A couple is dancing in a grand ballroom, with a blurred background of other guests.
Aesthetic Score : 0.7
Mood : romantic, elegant, nostalgic
Quality
Entropy : 6.69
Noise : 76
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight blurriness and a few artifacts around the edges of the frame.
A World of Wonder: A Boy’s Curious Gaze
A young boy, lost in thought, studies a world map, his expression reflecting a sense of curiosity and nostalgia. The image evokes a feeling of exploration and the boundless possibilities that lie ahead.
Prompt
Vintage: Curious, adventurous, hopeful ; A young boy gazing at a vintage map; close-up; Adventure; a dimly lit room with a worn wooden table and a flickering candlelight; cinematic
Characteristic
Shot : A young boy, possibly 8-10 years old, is looking at a map spread out in front of him. The map covers a large portion of the image, while the boy is seated in a partially obscured background. The lighting is soft and warm, creating a cozy and inviting atmosphere.
Aesthetic Score : 0.7
Mood : curious, nostalgic, contemplative
Quality
Entropy : 6.70
Noise : 72
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors are visible. The lighting and focus are excellent, and the color balance is natural and pleasing.
Conclusion
The results show that the generative AI model was moderately faithful to the prompts in camera position and shot composition, but largely missed the requested aesthetic.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. The camera position in the generated images differed somewhat from what the prompts requested.
- Shot Analysis: The model scored 0.425, which is also considered okay. The shot composition of the generated images differed somewhat from what the prompts called for.
- Aesthetic Analysis: The model scored 0.05, which is considered pretty bad. The aesthetic of the generated images differed significantly from what the prompts called for.
Overall, the model’s main weakness is understanding and implementing the desired aesthetic. Camera position and shot composition are passable, but there’s room for improvement in all areas.
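To put the three summary scores in context, the per-image metrics listed above can be aggregated. The sketch below copies the values from the ten images in this post and reports the minimum, mean, and maximum of each metric. It does not reproduce the Camera Position, Shot Analysis, or Aesthetic Analysis scores, whose calculation is not described here. The dictionary keys are my own labels.

```python
from statistics import mean

# Per-image values copied from the metric blocks above, in order of appearance.
METRICS = {
    "aesthetic_score": [0.7, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.7, 0.7],
    "prompt_clip_score": [0.25, 0.39, 0.31, 0.29, 0.31, 0.28, 0.29, 0.26, 0.27, 0.31],
    "likelihood_of_ai": [0.20, 0.10, 0.20, 0.20, 0.20, 0.20, 0.10, 0.20, 0.10, 0.20],
    "entropy": [6.69, 6.51, 6.83, 6.53, 6.44, 6.76, 6.73, 6.77, 6.69, 6.70],
    "noise": [83, 73, 72, 76, 84, 102, 77, 101, 76, 72],
}


def summarize(values):
    """Return the min, mean (rounded to 3 decimals) and max of a list of numbers."""
    return {"min": min(values), "mean": round(mean(values), 3), "max": max(values)}


SUMMARY = {name: summarize(values) for name, values in METRICS.items()}

if __name__ == "__main__":
    for name, stats in SUMMARY.items():
        print(f"{name:18} min={stats['min']:<6} mean={stats['mean']:<6} max={stats['max']}")
```

On these numbers the mean Aesthetic Score is 0.68 and the mean Prompt Clip Score is 0.296. The per-image Aesthetic Score sits far above the 0.05 Aesthetic Analysis score, which suggests the two measure different things: the former looks like a general rating of the image, the latter a match against the requested style. That reading is an inference from the numbers, not something the evaluation states.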
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux-pro/api