Gothic & Dramatic: Exploring the 'style-aesthetic' AI Challenge with Flux-dev
- 10 minutes read - 1937 words
‘Style-aesthetic’ covers the overall mood, tone, and visual style of an image, and it is a crucial aspect of image generation. It becomes especially challenging with complex aesthetics such as gothic and dramatic themes, which rely on specific visual elements, lighting techniques, and compositions to evoke the desired atmosphere. In this blog post, we look at how well an AI model captures these aesthetics by analyzing images that a generative model produced from prompts with gothic and dramatic themes.
Created with: flux-dev
Shattered Reality: A Skeletal Hand Reaches Out From the Screen
A chilling image of a skeletal hand emerging from a shattered computer screen sets a dark and suspenseful mood. The dimly lit room and the keyboard in the foreground add to the eerie atmosphere, leaving viewers with a sense of tension and foreboding.
Prompt
style-aesthetic Gothic: Eerie and unsettling ; A skeletal hand reaching out from a cracked screen; close-up; Gaming; A dimly lit room filled with gaming consoles and flickering monitors; cinematic
Characteristic
Shot : A computer monitor with a cracked screen displays a skeletal hand reaching out from the screen. The scene is lit with blue and green light, creating a dark and mysterious atmosphere.
Aesthetic Score : 0.6
Mood : creepy, mysterious, digital
Quality
Entropy : 6.32
Noise : 68
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some artifacts in the image, such as the slight blurriness of the skeletal hand and the cracks in the screen. The blue and green lighting appear to be artificial and somewhat overblown.
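The prompts in this post all follow the same semicolon-separated pattern: a style and mood, then a subject, a shot type, a theme, a setting, and a final modifier. A minimal sketch of splitting such a prompt into labeled parts might look like this. The field names (`style`, `mood`, `subject`, and so on) are hypothetical labels chosen for illustration; they are not terms defined by flux-dev or by this post.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels, not terms used by flux-dev.
    style: str
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    modifier: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-delimited prompt into labeled parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    # The first field looks like "style-aesthetic Gothic: Eerie and unsettling".
    head, _, mood = fields[0].partition(":")
    style = head.replace("style-aesthetic", "", 1).strip()
    return PromptParts(style, mood.strip(), *fields[1:])


example = (
    "style-aesthetic Gothic: Eerie and unsettling ; "
    "A skeletal hand reaching out from a cracked screen; close-up; Gaming; "
    "A dimly lit room filled with gaming consoles and flickering monitors; "
    "cinematic"
)
parts = parse_prompt(example)
print(parts.style, "|", parts.mood, "|", parts.shot)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, the shot type) with what the Characteristic block reports for the generated image.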
A Candlelit Treasure: Mystery and Opulence in a Single Frame
A treasure chest overflowing with gold coins sits on a dark surface, bathed in the warm glow of a single candle. The low light and shadows create a sense of mystery and intrigue, while the gold coins add a touch of opulence. This image evokes a feeling of antique luxury, transporting you to a world of hidden riches and forgotten stories.
Prompt
style-aesthetic Gothic: Excitement and danger ; A treasure chest overflowing with gold and jewels, illuminated by a single candle; close-up; Adventure; A dark, damp dungeon with cobwebs and chains; cinematic
Characteristic
Shot : A treasure chest filled with gold coins, illuminated by a candle, in a dimly lit room.
Aesthetic Score : 0.8
Mood : mysterious, opulent, magical
Quality
Entropy : 6.77
Noise : 81
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors or artifacts.
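The post does not say how the Entropy value is computed. One common definition, assumed here purely for illustration, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat gray) to 8 bits (all 256 gray levels equally frequent). A small sketch under that assumption:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit gray levels."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the gray levels that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy inputs (not taken from the images in this post):
flat = [128] * 1024             # one gray level only
uniform = list(range(256)) * 4  # every gray level equally often

entropy_flat = shannon_entropy(flat)
entropy_uniform = shannon_entropy(uniform)
```

If this is the definition in use, the values of roughly 6.3 to 6.9 reported here would sit toward the upper end of the 0 to 8 scale, which would indicate a fairly wide spread of tones in each image.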
Solitude at the Edge of the World
A lone figure contemplates the vastness of a misty valley and sprawling city below, bathed in the soft light of a cloudy sky. The scene evokes a sense of serenity, contemplation, and the profound solitude of being at the edge of everything.
Prompt
style-aesthetic Gothic: Awe-inspiring and melancholic ; A lone traveler standing at the edge of a cliff, gazing at a sprawling, gothic-inspired city; wide shot; Travel; A mountainous landscape with dramatic clouds and a stormy sky; cinematic
Characteristic
Shot : A lone figure stands on the edge of a cliff overlooking a vast, misty valley, with a city in the distance.
Aesthetic Score : 0.7
Mood : solitude, contemplation, vastness
Quality
Entropy : 6.67
Noise : 73
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor compression artifacts, particularly noticeable in the sky and the distant city.
Lost in the Mist: A Steam Locomotive’s Journey into the Unknown
A nostalgic and mysterious scene unfolds as a steam locomotive emerges from a cloud of smoke, its bright lights cutting through the dense fog. The train’s journey into the unknown evokes a sense of anticipation and wonder, leaving the viewer captivated by the atmospheric beauty of the moment.
Prompt
style-aesthetic Gothic: Dramatic and suspenseful ; A vintage train hurtling through a dark, stormy landscape; long shot; Travel; A desolate, gothic-inspired train station; cinematic
Characteristic
Shot : A vintage steam locomotive emerging from a thick cloud of smoke, moving towards the viewer on a railroad track in a misty, dark, and desolate setting.
Aesthetic Score : 0.8
Mood : mysterious, atmospheric, nostalgic
Quality
Entropy : 6.85
Noise : 68
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
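The Prompt Clip Score is not defined in this post. A CLIP-based score is usually the cosine similarity between a text embedding of the prompt and an image embedding of the result, so the sketch below shows only that similarity step. The vectors are toy values standing in for real embeddings, and whether the scores here are raw or rescaled cosine values is an assumption left open.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a prompt embedding and an image embedding.
text_embedding = [3.0, 4.0]
image_embedding = [4.0, 3.0]
score = cosine_similarity(text_embedding, image_embedding)
```

Identical directions give 1, unrelated (orthogonal) directions give 0, so a higher score means the image embedding lies closer to the prompt embedding.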
Silhouetted Knight Against a Blood Red Sky
A lone knight stands on a cliff, his silhouette stark against the backdrop of a dark castle and a blood-red sky. The scene evokes a sense of mystery and drama, with the contrast of light and shadow creating a powerful visual effect.
Prompt
style-aesthetic Gothic: Epic and melancholic ; A lone knight, silhouetted against a blood-red sunset; wide shot; Heroism; A crumbling castle on a windswept cliff; cinematic
Characteristic
Shot : A lone figure in a knight’s armor stands on a cliff overlooking a mysterious gothic castle. A blood red moon hangs large in the sky above.
Aesthetic Score : 0.7
Mood : dark, ominous, foreboding
Quality
Entropy : 6.32
Noise : 37
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has a slight blur around the edges, and the shadows are not very realistic. The knight’s helmet is a bit too pointy.
Death’s Shadow Looms in Foggy Graveyard
A skeletal figure, cloaked in black, stands ominously in a misty graveyard, creating a sense of foreboding and dread. The scene evokes a dark and mysterious mood, hinting at the presence of death.
Prompt
style-aesthetic Gothic: Grim and triumphant ; A player’s avatar, a skeletal warrior, standing amidst a graveyard of fallen enemies; medium shot; Gaming; A dark and eerie virtual world with gothic architecture; cinematic
Characteristic
Shot : A skeletal figure in a dark cloak stands in a misty graveyard. The figure is framed by a row of tall stone structures, which seem to be the walls of a graveyard or crypt.
Aesthetic Score : 0.7
Mood : eerie, ominous, dark
Quality
Entropy : 6.86
Noise : 89
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have some noise in the background and on the figure’s armor. The lighting could be improved to reduce the graininess in the image.
Lost in the Fog: A Solitary Figure Walks Through a Mysterious City
A lone figure disappears into the swirling fog of a cobblestone street, shadowed by ancient buildings and a towering church. The scene evokes a sense of mystery and isolation, leaving the viewer to wonder about the figure’s destination and the secrets hidden within the misty city.
Prompt
style-aesthetic Gothic: Awe-inspiring and melancholic ; A lone figure standing on a cobblestone street, gazing at a towering cathedral; medium shot; Tourism; A misty, rain-soaked European city; cinematic
Characteristic
Shot : A man walks down a cobblestone street in a foggy, European city. The architecture is old and gothic, with a large church in the distance. The atmosphere is mysterious and eerie.
Aesthetic Score : 0.7
Mood : mysterious, eerie, atmospheric
Quality
Entropy : 6.86
Noise : 88
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the distance. The colors are also a bit washed out, likely due to the fog.
Unveiling Secrets in Candlelight
A close-up of an antique map, bathed in the warm glow of a candle, evokes a sense of mystery and nostalgia. The map’s intricate details are brought to life against the soft darkness, hinting at forgotten journeys and hidden treasures.
Prompt
style-aesthetic Gothic: Intriguing and mysterious ; A weathered map, illuminated by flickering candlelight; close-up; Adventure; A dusty, cobweb-filled library; cinematic
Characteristic
Shot : A lit candle illuminates an old map spread out on a wooden table, with other items and books out of focus in the background.
Aesthetic Score : 0.7
Mood : mysterious, warm, nostalgic
Quality
Entropy : 6.84
Noise : 72
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness in the background and around the edges of the map.
Silhouettes of Mystery: A Family’s Grand Entrance
A mystical and dramatic scene unfolds as a family of four is silhouetted against a large window in a grand hall. The fireplace on the left side of the image adds to the elegant ambiance, while the use of silhouettes creates a sense of mystery and intrigue.
Prompt
style-aesthetic Gothic: Warm and intimate ; A family huddled around a fireplace, shadows dancing on the walls; medium shot; Family; A grand, gothic-style mansion with stained glass windows; cinematic
Characteristic
Shot : A group of people standing in a grand, empty room with a fireplace and a large window. The room is lit by the soft glow of the fireplace and the setting sun shining through the window. The room is decorated with ornate details.
Aesthetic Score : 0.6
Mood : mysterious, calm, nostalgic
Quality
Entropy : 6.30
Noise : 91
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some slight artifacts around the edges of the image.
Raven’s Watch: A Gothic City in Mist
A solitary black raven perches on a stone railing, its dark silhouette stark against the misty cityscape. A towering gothic spire pierces the clouds in the distance, adding to the eerie and mysterious atmosphere. The dramatic use of contrast and depth of field draws the eye to the raven, creating a sense of foreboding and intrigue.
Prompt
style-aesthetic Gothic: Mysterious and ominous ; A lone raven perched on a gargoyle, overlooking a bustling city; close-up; Heroism; A gothic cathedral with intricate carvings and stained glass; cinematic
Characteristic
Shot : A black raven perched on a stone railing, looking out over a city skyline with a tall, gothic church in the background, all bathed in a misty, atmospheric light.
Aesthetic Score : 0.7
Mood : dark, mysterious, brooding
Quality
Entropy : 6.85
Noise : 81
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight blurriness, especially in the background. Some noise present.
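To get an overview of the ten images, the per-image values listed above can be collected and averaged. The sketch below copies the numbers from this post (titles shortened); the 0.5 cut-off used to list images with a high Likelihood of AI is an assumption made for illustration and is not a threshold defined by the evaluation.

```python
from statistics import mean

# One row per image, copied from the per-image blocks above:
# (title, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
RESULTS = [
    ("Shattered Reality", 0.6, 6.32, 68, 0.26, 0.30),
    ("A Candlelit Treasure", 0.8, 6.77, 81, 0.25, 0.20),
    ("Solitude at the Edge of the World", 0.7, 6.67, 73, 0.26, 0.10),
    ("Lost in the Mist", 0.8, 6.85, 68, 0.29, 0.10),
    ("Silhouetted Knight Against a Blood Red Sky", 0.7, 6.32, 37, 0.30, 0.80),
    ("Death's Shadow Looms in Foggy Graveyard", 0.7, 6.86, 89, 0.30, 0.90),
    ("Lost in the Fog", 0.7, 6.86, 88, 0.32, 0.20),
    ("Unveiling Secrets in Candlelight", 0.7, 6.84, 72, 0.30, 0.20),
    ("Silhouettes of Mystery", 0.6, 6.30, 91, 0.34, 0.10),
    ("Raven's Watch", 0.7, 6.85, 81, 0.32, 0.30),
]

COLUMNS = ["aesthetic", "entropy", "noise", "clip", "ai_likelihood"]


def column_means(rows):
    """Average each numeric column across all images."""
    return {name: mean(row[i + 1] for row in rows)
            for i, name in enumerate(COLUMNS)}


def flagged_as_ai(rows, threshold=0.5):
    """Titles whose Likelihood of AI meets the (assumed) threshold."""
    return [row[0] for row in rows if row[5] >= threshold]


means = column_means(RESULTS)
for name, value in means.items():
    print(f"{name}: {value:.3f}")
print(flagged_as_ai(RESULTS))
```

Across the ten images, the averages come out to about 0.70 for Aesthetic Score, 6.66 for Entropy, 74.8 for Noise, 0.29 for Prompt Clip Score, and 0.32 for Likelihood of AI. Only the knight and graveyard images exceed the assumed 0.5 cut-off.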
Conclusion
The results show that the generative AI model performed reasonably well on shot analysis, fell slightly short on camera position, and struggled with aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.45
- Interpretation: This score falls below the “good” range of 0.5 to 0.75. It suggests that the model’s ability to accurately interpret and reproduce camera positions from the prompt is somewhat lacking.
Shot Analysis:
- Score: 0.57
- Interpretation: This score falls within the “good” range of 0.5 to 0.75. It indicates that the model is generally capable of understanding the scene described in the prompt and creating images that reflect the intended shot type.
Aesthetic Analysis:
- Score: 0.3
- Interpretation: This score lies well above the “very good” range of -0.2 to 0.1, which suggests that the model struggled to match the expected aesthetic. The generated images likely deviated significantly from the desired aesthetic style.
Overall:
While the model demonstrates some proficiency in reproducing shot types and comes close to the “good” range for camera positions, it needs improvement in capturing the intended aesthetic style.
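The interpretation above amounts to checking each score against a reference range. A minimal sketch of that check, using only the scores and ranges quoted in this conclusion (the function name and the "below"/"within"/"above" labels are made up for illustration):

```python
def position_in_range(score, low, high):
    """Report where a score sits relative to an inclusive [low, high] range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (metric, score, name of the reference range, low, high), as quoted above.
CHECKS = [
    ("Camera Position", 0.45, "good", 0.5, 0.75),
    ("Shot Analysis", 0.57, "good", 0.5, 0.75),
    ("Aesthetic Analysis", 0.3, "very good", -0.2, 0.1),
]

verdicts = {
    metric: position_in_range(score, low, high)
    for metric, score, _label, low, high in CHECKS
}
for metric, score, label, low, high in CHECKS:
    print(f"{metric}: {score} is {verdicts[metric]} the "
          f"'{label}' range of {low} to {high}")
```

Note that the aesthetic score is compared against a different range than the other two metrics, so the three numbers are not directly comparable to one another.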
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux/dev/api