AI's Artistic Struggle: Capturing the 'style-aesthetic' with Flux-dev
The ‘style-aesthetic’ is a captivating visual language that transcends traditional artistic boundaries. It often features distorted perspectives, vibrant colors, and abstract shapes, creating a sense of surrealism and emotional depth, and it is commonly found in works exploring fantasy, dreamscapes, and the human condition. Capturing it is a significant challenge for AI image generation models: they handle scene descriptions and, to a degree, camera positions, but they often fail to translate this nuanced visual language into the images they generate. This post looks at that gap through ten images generated with flux-dev from Expressionist prompts, each shown with its prompt, automated scores, and noted image errors.
Created with: flux-dev
A Gloomy Cityscape: Where Shadows Dance and Secrets Linger
This urban scene, shrouded in a hazy gray light, evokes a sense of mystery and gloom. Tall buildings cast long shadows, while the bustling crowd below adds a touch of drama. The overcast sky and the fleeting birds create a sense of unease, hinting at a story waiting to unfold.
Prompt
style-aesthetic Expressionist: Chaotic, overwhelming ; A bustling, crowded street scene, with people rushing past in a blur; long shot; Tourism; A distorted, abstract cityscape with exaggerated buildings and swirling colors; cinematic
Characteristic
Shot : A bustling city street with tall buildings and a crowd of people. There are many birds flying in the air, creating a sense of movement and life.
Aesthetic Score : 0.6
Mood : urban, atmospheric, busy
Quality
Entropy : 6.82
Noise : 116
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slight amount of blur, which is likely due to the motion of the birds and the crowd. The colors are slightly faded and lack vibrancy. The overall image feels a bit flat and lacking in detail.
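All ten prompts in this post follow the same semicolon-separated layout, so they are easy to take apart when comparing what was asked for with what was generated. The Python sketch below splits one prompt into its parts. The field names (`style`, `moods`, `subject`, `shot`, `genre`, `background`, `modifier`) and the `parse_prompt` helper are hypothetical labels chosen for illustration; the post itself does not name the slots.

```python
from dataclasses import dataclass
from typing import List

# Literal token that opens every prompt in this post.
PREFIX = "style-aesthetic"


@dataclass
class PromptParts:
    # All field names are assumptions made for this sketch.
    style: str
    moods: List[str]
    subject: str
    shot: str
    genre: str
    background: str
    modifier: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into named parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    header, subject, shot, genre, background, modifier = fields

    # The header looks like "style-aesthetic Expressionist: Chaotic, overwhelming".
    style_part, _, mood_part = header.partition(":")
    style_part = style_part.strip()
    if style_part.startswith(PREFIX):
        style_part = style_part[len(PREFIX):].strip()
    moods = [m.strip() for m in mood_part.split(",") if m.strip()]

    return PromptParts(style_part, moods, subject, shot, genre, background, modifier)


EXAMPLE = (
    "style-aesthetic Expressionist: Chaotic, overwhelming ; A bustling, crowded "
    "street scene, with people rushing past in a blur; long shot; Tourism; "
    "A distorted, abstract cityscape with exaggerated buildings and swirling "
    "colors; cinematic"
)

parts = parse_prompt(EXAMPLE)
print(parts.style, parts.moods, parts.shot)
```

Separating the requested shot and background this way makes it easier to check each one against the Shot description reported for the generated image.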
Lost in a Dreamy Future
A person escapes into a world of wonder and mystery, immersed in a virtual reality experience with a vibrant pink and purple sky. The image evokes a sense of hope and a futuristic, dreamy mood.
Prompt
style-aesthetic Expressionist: Immersive, futuristic ; A virtual reality headset, displaying a vibrant, pixelated world; close-up; Gaming; A distorted, abstract landscape with swirling colors and shifting shapes; cinematic
Characteristic
Shot : A person wearing a futuristic-looking VR headset is standing in front of a soft, pink and purple background that resembles a cloudy sky.
Aesthetic Score : 0.7
Mood : futuristic, dreamy, surreal
Quality
Entropy : 6.94
Noise : 66
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.70
Image errors : There are minor artifacts and blurring in the background, particularly in the clouds. The lighting seems a bit uneven, with some areas appearing brighter than others.
Mystical Sunset Over a Bustling City
A grand mosque stands tall against the backdrop of a vibrant city street, bathed in the warm glow of a setting sun. The dramatic lighting and bustling atmosphere create a sense of awe and wonder, capturing the mystical beauty of the moment.
Prompt
style-aesthetic Expressionist: Awe-inspiring, spiritual ; A towering, ancient cathedral bathed in the golden light of dawn; high-angle shot; Tourism; A bustling, crowded marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A bustling street scene in a Middle Eastern city. The street is lined with buildings on both sides, some of which are quite old. There is a large, ornate mosque in the distance. The sun is setting, casting a warm glow over the scene.
Aesthetic Score : 0.7
Mood : mysterious, vibrant, warm
Quality
Entropy : 6.81
Noise : 94
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts and compression artifacts, but these are not significant enough to detract from the overall quality of the image.
Nostalgia on Rails: A Dreamy Sunset Journey
A red train winds through a mountain pass, bathed in the golden glow of a dramatic, swirling sunset. The scene evokes a sense of nostalgia and hope, with the train in sharp focus and the background gently blurred, creating a dreamy atmosphere.
Prompt
style-aesthetic Expressionist: Surreal, disorienting ; A train speeding through a surreal, dreamlike landscape; long shot; Travel; A distorted, abstract landscape with swirling colors and shifting shapes; cinematic
Characteristic
Shot : A red train travels through a mountain landscape under a dramatic sunset sky. The clouds are swirling and creating a sense of movement.
Aesthetic Score : 0.7
Mood : dramatic, nostalgic, hopeful
Quality
Entropy : 6.90
Noise : 117
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has a slightly blurry effect in the background and the clouds look somewhat artificial. The train also has some pixelated edges.
A Solitary Figure Contemplates the Stormy Sea
A lone figure stands on a cliff, silhouetted against a dramatic sky. The crashing waves below and the overcast clouds create a sense of isolation and melancholy. This image evokes a feeling of vulnerability and introspection, capturing the raw power of nature and the fragility of human existence.
Prompt
style-aesthetic Expressionist: Dramatic, contemplative ; A lone figure standing on a precipice, gazing out at a stormy sea; medium shot; Heroism; A dramatic, stormy seascape with crashing waves and swirling clouds; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a turbulent sea, with a dramatic sky and crashing waves.
Aesthetic Score : 0.7
Mood : dramatic, moody, solitude
Quality
Entropy : 6.45
Noise : 80
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears slightly blurry and has a muted color palette.
Candlelit Gathering: A Moment of Intimacy and Reflection
In a dimly lit room adorned with floral wallpaper, three young siblings share a quiet moment around a flickering candle. The warm glow illuminates their faces, creating an atmosphere of coziness and thoughtfulness. The low light and the soft candlelight add a sense of intimacy and vulnerability, suggesting a reflective moment of togetherness.
Prompt
style-aesthetic Expressionist: Intimate, melancholic ; A family huddled together in a dimly lit room, their faces illuminated by flickering candlelight; close-up; Family; A cramped, cluttered room with faded wallpaper and worn furniture; cinematic
Characteristic
Shot : Three young people sitting in a dimly lit room, looking at a lit candle
Aesthetic Score : 0.7
Mood : intimate, cozy, warm
Quality
Entropy : 6.24
Noise : 62
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors
Lost in the Mist: A Path Beckons into the Unknown
A winding forest path disappears into a swirling mist, creating an atmosphere of mystery and intrigue. The ethereal fog adds a touch of eeriness, inviting you to explore the unknown depths of the woods.
Prompt
style-aesthetic Expressionist: Mysterious, suspenseful ; A winding, cobblestone path disappearing into a dense, swirling fog; low-angle shot; Adventure; A dark, foreboding forest with gnarled trees and flickering shadows; cinematic
Characteristic
Shot : A misty, mysterious path through a forest, with a cobblestone road in the foreground leading to a vanishing point
Aesthetic Score : 0.6
Mood : eerie, atmospheric, mystical
Quality
Entropy : 6.81
Noise : 96
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.70
Image errors : The cobblestone road and trees appear a bit blurry and lack detail, possibly due to over-sharpening or digital manipulation.
Silhouetted Against Hope: A Lone Figure at Sunset
A solitary figure stands on a rocky peak, their silhouette stark against the fiery hues of a setting sun. The dramatic interplay of light and shadow emphasizes the vastness of the mountainous landscape, creating a sense of mystery and hopeful anticipation.
Prompt
style-aesthetic Expressionist: Epic, determined ; A lone figure, silhouetted against a blazing sunset; wide shot; Heroism; A vast, desolate landscape with towering mountains in the distance; cinematic
Characteristic
Shot : A lone figure stands silhouetted against a fiery sunset in a mountainous landscape.
Aesthetic Score : 0.7
Mood : dramatic, solitary, hopeful
Quality
Entropy : 6.48
Noise : 41
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be digitally generated with some artifacts present, particularly in the sky and the mountains. The lighting is a bit too perfect, and the edges of the figure and the mountains are a bit blurry.
A Glimmer of Hope in a Decaying Hallway
A single light source pierces the darkness of a narrow, waterlogged hallway, hinting at a mystery waiting to be unveiled. The rough walls and eerie atmosphere create a sense of unease, leaving you wondering what lies beyond the light.
Prompt
style-aesthetic Expressionist: Confusing, suspenseful ; A labyrinthine maze of twisting corridors and flickering lights; low-angle shot; Adventure; A dark, claustrophobic dungeon with dripping water and eerie shadows; cinematic
Characteristic
Shot : A dimly lit hallway with an old-fashioned lantern hanging from the ceiling. The floor is wet and uneven, suggesting a sense of neglect or disrepair. The walls are covered in rust and other imperfections, adding to the overall sense of abandonment.
Aesthetic Score : 0.6
Mood : eerie, mysterious, abandoned
Quality
Entropy : 6.47
Noise : 95
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Lost in the Neon Glow: A Moment of Focus in the Urban Night
A young person, headphones on, is completely absorbed in their computer screen. The vibrant neon lights of the city blur into the background, creating a sense of mystery and intrigue. This image captures the focused energy of urban life in a futuristic setting.
Prompt
style-aesthetic Expressionist: Intense, futuristic ; A pixelated character, illuminated by the glow of a computer screen; close-up; Gaming; A chaotic, neon-lit cityscape with flashing lights and distorted reflections; cinematic
Characteristic
Shot : A person is sitting in front of a computer, the screen is displaying a pink and white abstract image, the person is wearing headphones and is looking at the screen. The background is a blurred city scene with neon lights.
Aesthetic Score : 0.7
Mood : focused, digital, urban
Quality
Entropy : 6.67
Noise : 74
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some noise visible on the screen, color depth appears lacking.
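The post does not say how the Entropy values are computed. Values between 6 and 7 would be consistent with the Shannon entropy of an 8-bit grayscale histogram, which cannot exceed 8 bits, so the sketch below assumes that definition. The `shannon_entropy` helper is hypothetical and takes a flat list of pixel intensities rather than an image file.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy, in bits, of the distribution of pixel intensities.

    Assumption: this is the kind of measure behind the 'Entropy' field;
    the post does not confirm the exact method.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the intensities that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A flat image carries no information; a uniform spread over all
# 256 gray levels reaches the 8-bit maximum.
flat = [128] * 64
uniform = list(range(256))
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this reading, a higher value means the tones are spread more evenly across the range, which says something about tonal variety but nothing about whether the requested style was achieved.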
Conclusion
The results show that the generative AI model did best at understanding the scene, was less reliable on camera position, and struggled with the aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is considered okay. This means the generated images’ camera positions were somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.63, which is considered good. This indicates the model largely captured the scenes described in the prompts.
- Aesthetic Analysis: The model scored 0.07, by far the lowest of the three scores. The evaluation also labels this okay, but a score this low indicates that the generated images’ aesthetic differed substantially from the expected aesthetic.
Overall, the model is noticeably better at understanding the scene and camera position than it is at capturing the desired aesthetic.
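To put the per-image figures next to these three scores, the numbers reported for the ten images can be averaged. The sketch below is only a convenience summary: the values are copied from the image sections of this post, while the short titles, the `summarize` helper, and the 0.5 cut-off for "likely AI" are assumptions made for illustration.

```python
from statistics import mean

# (short title, aesthetic, entropy, noise, prompt CLIP score, likelihood of AI)
IMAGES = [
    ("Gloomy Cityscape",   0.6, 6.82, 116, 0.24, 0.30),
    ("Dreamy Future",      0.7, 6.94,  66, 0.23, 0.70),
    ("Mystical Sunset",    0.7, 6.81,  94, 0.26, 0.20),
    ("Nostalgia on Rails", 0.7, 6.90, 117, 0.26, 0.90),
    ("Stormy Sea",         0.7, 6.45,  80, 0.24, 0.30),
    ("Candlelit Gathering", 0.7, 6.24, 62, 0.23, 0.20),
    ("Lost in the Mist",   0.6, 6.81,  96, 0.26, 0.70),
    ("Silhouetted",        0.7, 6.48,  41, 0.25, 0.80),
    ("Decaying Hallway",   0.6, 6.47,  95, 0.27, 0.10),
    ("Neon Glow",          0.7, 6.67,  74, 0.24, 0.20),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(images):
    """Return the mean of each numeric column, keyed by field name."""
    return {
        name: mean([row[i] for row in images])
        for i, name in enumerate(FIELDS, start=1)
    }


summary = summarize(IMAGES)
# Assumed threshold: treat anything above 0.5 as "likely AI".
flagged = [row[0] for row in IMAGES if row[5] > 0.5]

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("flagged as likely AI:", flagged)
```

On the figures reported above, this gives a mean aesthetic score of 0.67 and a mean prompt CLIP score of about 0.25, and four of the ten images exceed the assumed 0.5 threshold. The aesthetic score rates how pleasing each image is, not how Expressionist it is, so a respectable average here does not contradict the low Aesthetic Analysis score.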
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux/dev/api