AI's Artistic Eye: Capturing the 'style-aesthetic' with Precision Using Flux-schnell
The ‘style-aesthetic’ is a powerful tool for visual storytelling, allowing artists and filmmakers to evoke specific emotions and atmospheres through carefully chosen visual elements. This article explores the ability of a generative AI model to understand and replicate this complex concept. We analyze its performance in interpreting prompts that specify various ‘style-aesthetic’ categories, such as ‘Heroism’, ‘Adventure’, and ‘Gaming’, and assess its ability to capture the desired camera positions, shot types, and overall aesthetic.
Created with: flux-schnell
Silhouetted in Solitude: A Cowboy’s Melancholy Sunset
A lone figure, silhouetted against a fiery sunset, evokes a sense of melancholy and solitude in this dramatic desert scene. The cowboy’s hat and the vast, empty landscape amplify the feeling of isolation, leaving the viewer to ponder the figure’s thoughts and the story behind their presence.
Prompt
style-aesthetic Avant-garde: Epic, melancholic ; A lone figure, silhouetted against a blazing sunset; long shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure in a cowboy hat stands in silhouette against a setting sun.
Aesthetic Score : 0.7
Mood : melancholy, lonely, contemplative
Quality
Entropy : 6.48
Noise : 37
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image is slightly grainy and the silhouette is not very well-defined.
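The prompts in this article all follow the same semicolon-separated pattern: a style label with moods, then subject, shot type, category, setting, and a closing "cinematic" tag. As a minimal sketch, assuming that pattern holds for every prompt, the fields can be split apart as below. The class name `StylePrompt`, its field names, and the `parse_prompt` helper are hypothetical and chosen only for illustration; they are not part of flux-schnell or its API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StylePrompt:
    """Fields of one prompt; names are illustrative, not from the article."""
    style: str
    moods: List[str]
    subject: str
    shot: str
    category: str
    setting: str
    finish: str


def parse_prompt(prompt: str) -> StylePrompt:
    """Split a 'style-aesthetic' prompt into its six semicolon-separated parts."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, shot, category, setting, finish = parts
    # The first part looks like "style-aesthetic Avant-garde: Epic, melancholic".
    label, moods = head.split(":", 1)
    style = label.replace("style-aesthetic", "", 1).strip()
    mood_list = [mood.strip() for mood in moods.split(",")]
    return StylePrompt(style, mood_list, subject, shot, category, setting, finish)


example = parse_prompt(
    "style-aesthetic Avant-garde: Epic, melancholic ; A lone figure, "
    "silhouetted against a blazing sunset; long shot; Heroism; "
    "A vast, desolate landscape; cinematic"
)
print(example.shot, "|", example.category, "|", example.moods)
```

Splitting the prompt this way makes it easier to compare the requested shot type and category against the Shot and Mood lines reported for each image.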
Reaching for the Unknown: A Hand Grasps at a Swirling Vortex
A hand stretches out towards a mesmerizing vortex of orange and yellow, hinting at a mysterious and intense journey ahead. The image evokes a sense of hope and suspense, leaving the viewer captivated by the unknown that lies beyond.
Prompt
style-aesthetic Avant-garde: Surreal, mysterious ; A hand reaching out from a swirling vortex of light; close-up; Adventure; A kaleidoscope of colors and abstract shapes; cinematic
Characteristic
Shot : A hand reaching towards a swirling orange vortex. The vortex appears to be some sort of cosmic portal.
Aesthetic Score : 0.7
Mood : mysterious, dramatic, ethereal
Quality
Entropy : 6.36
Noise : 68
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : No visible artifacts or errors
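The article does not state how the Entropy value under Quality is computed. One common definition is the Shannon entropy of the image's grayscale intensity histogram, measured in bits, which ranges from 0 (a flat, single-tone image) to 8 (all 256 levels equally likely) for 8-bit images. The sketch below assumes that definition; the function name `shannon_entropy` is hypothetical, and it takes a flat sequence of pixel values rather than an image file.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy in bits of a flat sequence of pixel intensities.

    Assumption: this is one common way to define image entropy; the
    article's own Entropy metric may be computed differently.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 64               # a single gray tone: no variation
two_tone = [0] * 32 + [255] * 32  # half black, half white
print(shannon_entropy(flat), shannon_entropy(two_tone))
```

Under this definition, higher values indicate a wider spread of tones, which is consistent with the busy collage and street scenes scoring higher than the candlelit room and the balloon on a white background.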
Silhouetted Against the Future: A Lone Figure and a Glowing Portal
A solitary figure stands on a cliff, their silhouette stark against the neon-drenched cityscape. Towering structures pierce the sky, while a massive, glowing portal hangs overhead, promising both wonder and the unknown. This image evokes a sense of futuristic isolation, tinged with hope and a hint of mystery.
Prompt
style-aesthetic Avant-garde: Nostalgic, futuristic ; A pixelated character, rendered in a retro 8-bit style, standing on a precipice overlooking a digital cityscape; medium shot; Gaming; A neon-lit, futuristic cityscape; cinematic
Characteristic
Shot : A lone figure stands on a rock overlooking a futuristic cityscape. The city is bathed in neon lights and has a cyberpunk aesthetic.
Aesthetic Score : 0.8
Mood : futuristic, lonely, dark
Quality
Entropy : 6.45
Noise : 88
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts, particularly around the edges of the buildings and the figure. There are some slight color banding issues in the sky.
A Suitcase, a Fog, and a Journey Unfolding
A vintage suitcase rests on a train platform, shrouded in a thick fog. The tracks stretch into the distance, hinting at a journey yet to be taken. The scene evokes a sense of melancholy and nostalgia, leaving the viewer to ponder the destination and the stories held within the suitcase.
Prompt
style-aesthetic Avant-garde: Lonely, evocative ; A single, weathered suitcase, abandoned on a deserted train platform; close-up; Tourism; A misty, atmospheric train station; cinematic
Characteristic
Shot : A vintage suitcase sits on a platform at a train station, with a blurred background suggesting a foggy, overcast day.
Aesthetic Score : 0.7
Mood : melancholy, nostalgic, solitude
Quality
Entropy : 6.48
Noise : 58
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to be slightly overexposed, with a slight loss of detail in the highlights. The colors are muted, giving the image a slightly faded, vintage look.
The Rhythm of the City
A low-angle shot captures the confident stride of a pedestrian on a cobblestone street, emphasizing the movement and energy of urban life. The focus on their feet and the pavement creates a sense of groundedness and everyday simplicity.
Prompt
style-aesthetic Avant-garde: Disorienting, dreamlike ; A pair of feet walking on a cracked, abstract pavement; low-angle shot; Travel; A distorted, surreal cityscape; cinematic
Characteristic
Shot : A person is walking away from the camera on a city street. The focus is on the person’s feet. The street is paved with cobblestones.
Aesthetic Score : 0.4
Mood : urban, casual, mundane
Quality
Entropy : 6.77
Noise : 96
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some blurriness around the edges of the image, potentially caused by motion blur or a lack of focus. The colors are a bit washed out, especially in the background.
Secrets in the Shadows: A Gathering of Intrigue
Four figures huddle in a dimly lit room, their faces illuminated by flickering candlelight. The atmosphere is thick with mystery and suspense, as they share a silent, intense moment. The warm glow of the candles creates an intimate setting, while the shadows cast upon their faces hint at secrets waiting to be revealed.
Prompt
style-aesthetic Avant-garde: Intimate, mysterious ; A family gathered around a flickering candle, their faces obscured by shadows; close-up; Family; A dimly lit, antique room; cinematic
Characteristic
Shot : Three people are gathered around a table in a dimly lit room, their faces illuminated by the soft glow of candles.
Aesthetic Score : 0.6
Mood : mysterious, intimate, suspenseful
Quality
Entropy : 4.50
Noise : 42
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, but nothing too distracting.
Red Balloon, White Space: A Minimalist Moment
A single red balloon floats against a stark white backdrop, creating a simple yet powerful image. The balloon’s weightless movement and the minimalist setting evoke a sense of playfulness and isolation, inviting contemplation of the simple beauty in everyday objects.
Prompt
style-aesthetic Avant-garde: Hopeful, symbolic ; A single, red balloon floating against a stark, white background; close-up; Heroism; A minimalist, abstract setting; cinematic
Characteristic
Shot : A single red balloon is floating against a plain white background; the balloon’s string dangles down.
Aesthetic Score : 0.6
Mood : simple, minimalist, playful
Quality
Entropy : 5.22
Noise : 9
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Lost in the Pixels: A Nostalgic Gaming Moment
A dimly lit room, a vintage television, and a retro controller: this image captures the essence of nostalgic gaming. The low light and the focus on the controller draw you into the player’s immersive experience, transporting you back to a simpler time of pixelated adventures.
Prompt
style-aesthetic Avant-garde: Nostalgic, introspective ; A hand holding a vintage game controller, the screen reflecting a distorted, pixelated world; close-up; Gaming; A dimly lit, retro-themed room; cinematic
Characteristic
Shot : A person is playing video games on an old-fashioned TV and controller.
Aesthetic Score : 0.6
Mood : retro, nostalgic, focused
Quality
Entropy : 5.97
Noise : 44
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some graininess and blur, especially in the background.
A Solitary Figure Contemplates Nature’s Majesty
A lone figure stands on a mountain peak, dwarfed by a swirling vortex of clouds. The scene evokes a sense of mystery and awe, highlighting the power and beauty of nature.
Prompt
style-aesthetic Avant-garde: Sublime, awe-inspiring ; A lone figure standing on a mountain peak, their silhouette framed by a swirling vortex of clouds; long shot; Adventure; A dramatic, mountainous landscape; cinematic
Characteristic
Shot : A lone figure stands on a mountain peak overlooking a swirling cloud formation, with a bright light emerging from the center of the cloud.
Aesthetic Score : 0.7
Mood : mysterious, dramatic, contemplative
Quality
Entropy : 6.50
Noise : 64
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be digitally manipulated, with some unrealistic cloud formations.
A Chaotic Collage of Disjointed Images
This jarring and confusing image is a collage of various locations, creating an overwhelming and chaotic aesthetic. The random assortment of cities, monuments, and landscapes leaves the viewer feeling disoriented and unsure of where to focus.
Prompt
style-aesthetic Avant-garde: Energetic, disorienting ; A series of fragmented, overlapping images, depicting different aspects of travel and tourism; montage; Tourism; A chaotic, abstract collage; cinematic
Characteristic
Shot : A collage of various images, seemingly from different locations and times. The collage features urban scenes, architecture, landscapes, and people. It has a somewhat chaotic and disjointed feel.
Aesthetic Score : 0.3
Mood : busy, eclectic, disjointed
Quality
Entropy : 6.82
Noise : 112
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to be a collage made from different sources with varying image quality and resolution. Some sections are blurry, and the overall composition seems rushed.
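The per-image figures reported above can be pulled together for a quick overview. The sketch below simply copies the Aesthetic Score, Entropy, Noise, Prompt Clip Score, and Likelihood of AI values from the ten sections into a list and averages them; the variable names and shortened titles are illustrative, and no values beyond those reported are introduced.

```python
from statistics import mean

# (short title, aesthetic, entropy, noise, clip score, likelihood of AI),
# copied from the ten image sections of this article.
RESULTS = [
    ("Cowboy sunset",     0.7, 6.48,  37, 0.27, 0.60),
    ("Hand and vortex",   0.7, 6.36,  68, 0.22, 0.80),
    ("Glowing portal",    0.8, 6.45,  88, 0.29, 0.80),
    ("Suitcase in fog",   0.7, 6.48,  58, 0.32, 0.30),
    ("Rhythm of city",    0.4, 6.77,  96, 0.23, 0.20),
    ("Candlelit room",    0.6, 4.50,  42, 0.26, 0.20),
    ("Red balloon",       0.6, 5.22,   9, 0.29, 0.10),
    ("Retro gaming",      0.6, 5.97,  44, 0.28, 0.10),
    ("Mountain peak",     0.7, 6.50,  64, 0.25, 0.80),
    ("Chaotic collage",   0.3, 6.82, 112, 0.26, 0.30),
]


def column(index: int) -> list:
    """Return one metric column from RESULTS."""
    return [row[index] for row in RESULTS]


summary = {
    "mean_aesthetic": mean(column(1)),
    "mean_entropy": mean(column(2)),
    "mean_noise": mean(column(3)),
    "mean_clip": mean(column(4)),
    "mean_ai_likelihood": mean(column(5)),
}
lowest_aesthetic = min(RESULTS, key=lambda row: row[1])
highest_clip = max(RESULTS, key=lambda row: row[4])

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("lowest aesthetic:", lowest_aesthetic[0])
print("highest clip score:", highest_clip[0])
```

The two images with the lowest Aesthetic Scores (the street scene and the collage) are also the two with the highest Noise and Entropy values, which is worth keeping in mind when reading the conclusion.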
Conclusion
The results show that the generative AI model was weak on camera position, adequate on shot analysis, and very good on aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t quite capture the intended camera positions from the prompt as well as it could have.
- Shot Analysis: The model scored 0.5, which falls within the “good” range. This indicates that the model was able to understand the scene in the prompt reasonably well, but could still improve in this area.
- Aesthetic Analysis: The model scored 0.22, which the evaluation rates as “very good” (the range given for that rating is -0.2 to 0.1). This means that the generated images closely matched the aesthetic style described in the prompts.
Overall, the model seems to be better at capturing the desired aesthetic than it is at accurately representing camera positions and shot types.
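The range checks for camera position and shot analysis in the breakdown can be written out as a small sketch. It uses only the “good” range of 0.5 to 0.75 and the two scores stated above; the helper name `in_range` is hypothetical, and treating both ends of the range as inclusive is an assumption, since the article does not say how the boundaries are handled.

```python
GOOD_RANGE = (0.5, 0.75)  # "good" range as stated in the breakdown


def in_range(score: float, low: float, high: float) -> bool:
    """True if score lies within [low, high]; inclusive bounds are an assumption."""
    return low <= score <= high


scores = {"Camera Position": 0.25, "Shot Analysis": 0.5}
for name, score in scores.items():
    verdict = "good" if in_range(score, *GOOD_RANGE) else "below good"
    print(f"{name}: {score} -> {verdict}")
```

With inclusive bounds, the shot analysis score of 0.5 sits exactly on the lower edge of the “good” range, while the camera position score of 0.25 falls short of it.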
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux/schnell/api