AI Struggles to Capture the 'Dramatic' Aesthetic with Flux-dev
The ‘dramatic’ aesthetic is a powerful tool in visual storytelling, evoking strong emotions and creating a sense of depth and intrigue. It typically relies on stark contrasts, expressive lighting, and a focus on individual figures or moments of heightened tension. Generative AI models have made significant strides in understanding and generating images, but capturing this aesthetic remains a challenge. This post looks at ten images generated with flux-dev from prompts that combine a style instruction with a scene, shot type, and setting, and highlights where the model succeeded and where it fell short.
Created with: flux-dev
Intrigued by the Blue Light: Young People Unravel a Mystery
Three young people huddle around a computer screen in a dimly lit room, their faces illuminated by the blue glow. The low angle shot creates a sense of mystery and intrigue, suggesting they are deep in thought and focused on something important. Their expressions convey a mix of curiosity and determination, hinting at a captivating story unfolding before them.
Prompt
style-aesthetic French New Wave: intense, focused ; A group of friends huddled around a computer screen, their faces illuminated by the glow; medium shot; gaming; a dimly lit, cluttered room; cinematic
Characteristic
Shot : Three young people are sitting at a desk in a dimly lit room, using a computer. The light from the computer screen illuminates their faces and the keyboard.
Aesthetic Score : 0.6
Mood : focused, serious, intense
Quality
Entropy : 6.28
Noise : 68
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight graininess and some noise, likely due to the low lighting conditions. The color balance appears slightly off, and the colors are a little too saturated.
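The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's grayscale intensity histogram, which ranges from 0 to 8 bits for 8-bit images; the reported value of 6.28 would fit that scale. A minimal sketch under that assumption, where `shannon_entropy` is a hypothetical helper and not the tool used in the experiment:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of pixel intensities."""
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # Sum p * log2(1 / p) over the intensity values that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A flat image (a single intensity) carries no information.
flat = [128] * 64
# An image that uses all 256 levels equally often reaches the 8-bit maximum.
uniform = list(range(256)) * 4

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a higher value means the intensities are spread more evenly across the available levels, which is why dark, low-detail frames would tend to score lower.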
A Hand Reaches Out, Curiosity Unfolds
A thoughtfully placed hand extends towards a slightly blurred world map, hinting at a journey of exploration and contemplation. The out-of-focus detail adds a touch of mystery, inviting the viewer to delve deeper into the story behind this moment.
Prompt
style-aesthetic French New Wave: intriguing, suspenseful ; A close-up of a weathered map, with a finger tracing a route; medium shot; adventure; a cluttered, dimly lit room; cinematic
Characteristic
Shot : A hand is touching a map; the focus is on the hand, and the map is out of focus.
Aesthetic Score : 0.6
Mood : mysterious, thoughtful, nostalgic
Quality
Entropy : 6.72
Noise : 57
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight blur in the map, making it difficult to see the details.
Mysterious City Stroll: A Man in a Stylish Coat Navigates the Urban Landscape
A moody and atmospheric scene unfolds as a man in a stylish coat walks through a city street. The lighting and composition create a sense of drama and intrigue, hinting at a mysterious story waiting to be told. This image captures the edgy and urban spirit of the city, leaving you wanting to know more.
Prompt
style-aesthetic French New Wave: urgent, dramatic ; A young man, his face etched with determination, running through a crowded marketplace; medium shot; heroism; a chaotic, bustling market; cinematic
Characteristic
Shot : A man is walking through a busy, foggy city street. The image is shot from a low angle, making the man appear larger than life.
Aesthetic Score : 0.7
Mood : mysterious, urban, dramatic
Quality
Entropy : 6.64
Noise : 61
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor noise in the background, slight artifacts on the man’s jacket
Golden Hour Journey: A Train Ride Filled with Hope and Mystery
A solitary figure gazes out the window of a train traversing a golden field bathed in the warm glow of sunset. The scene evokes a sense of serenity, hope, and adventure, with the dramatic lighting adding an air of mystery to the journey.
Prompt
style-aesthetic French New Wave: reflective, contemplative ; A train speeding through a countryside landscape, with a lone figure looking out the window; long shot; travel; a vibrant, sun-drenched field; cinematic
Characteristic
Shot : A train traveling through a field at sunset, with the sun shining brightly in the background. The passenger in the train is looking out the window and enjoying the view.
Aesthetic Score : 0.7
Mood : tranquil, serene, hopeful
Quality
Entropy : 6.84
Noise : 89
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors.
Warmth and Intimacy at the Dinner Table
A family gathers around a beautifully set table, bathed in warm, inviting light. The scene exudes coziness and intimacy, as they enjoy a meal together. The composition and dramatic lighting draw the viewer into the heartwarming interaction between family members.
Prompt
style-aesthetic French New Wave: intimate, heartwarming ; A family gathered around a table, sharing a meal, with laughter and conversation; medium shot; family; a warm, inviting kitchen; cinematic
Characteristic
Shot : A family of four is sitting around a table, eating dinner. The lighting is warm and inviting, and the scene is relaxed and casual.
Aesthetic Score : 0.7
Mood : warm, cozy, familial
Quality
Entropy : 6.62
Noise : 70
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the colors are a bit washed out.
A Moment of Longing in the City of Lights
A woman stands silhouetted against the Parisian skyline, her gaze fixed on the distant Eiffel Tower. The city lights shimmer behind her, reflecting a sense of melancholy and nostalgia. The dramatic composition evokes a feeling of longing and mystery, leaving the viewer to ponder her thoughts and the story behind her solitary moment.
Prompt
style-aesthetic French New Wave: romantic, nostalgic ; A young woman, her face filled with wonder, gazing at the Eiffel Tower; medium shot; tourism; a bustling Parisian street; cinematic
Characteristic
Shot : A woman in a brown coat stands with her back to the camera, looking at the Eiffel Tower; the background is blurry.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, serene
Quality
Entropy : 6.58
Noise : 74
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is a slight blurriness to the image, particularly in the background, but it could be intentional to create a sense of depth.
Lost in Time: A Vintage Compass Beckons Adventure
A close-up of a weathered vintage compass, its gold needle pointing north, evokes a sense of mystery and adventure. The shallow depth of field draws you into the intricate details, inviting you to explore the unknown.
Prompt
style-aesthetic French New Wave: mysterious, suspenseful ; A close-up of a compass needle spinning, pointing towards an unknown destination; close-up; adventure; a dimly lit, mysterious room; cinematic
Characteristic
Shot : A close-up shot of an old compass with a golden needle pointing north. The compass is sitting on a dark wooden surface.
Aesthetic Score : 0.8
Mood : vintage, mysterious, adventurous
Quality
Entropy : 6.67
Noise : 56
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Neon Touch: A Moment of Mystery in the Urban Night
A lone arm, adorned with a watch, reaches out to touch a vibrant red neon light. The background blurs into a kaleidoscope of colors, creating a mysterious and futuristic atmosphere. The interplay of light and shadow adds a dramatic touch, leaving the viewer wondering what lies beyond the glow.
Prompt
style-aesthetic French New Wave: intense, energetic ; A hand holding a joystick, fingers moving rapidly; close-up; gaming; a neon-lit arcade with flashing screens; cinematic
Characteristic
Shot : A person’s hand and wrist, wearing a smartwatch, are seen touching a glowing red light, in a dimly lit space filled with other blurry lights.
Aesthetic Score : 0.6
Mood : mysterious, futuristic, urban
Quality
Entropy : 6.59
Noise : 46
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and has some chromatic aberration.
Silhouette of Solitude: A Lone Figure Walks into the Setting Sun
A melancholic scene unfolds as a lone figure, shrouded in a wide-brimmed hat, walks into the fiery embrace of the setting sun. The dramatic silhouette against the desert landscape evokes a sense of mystery and isolation, leaving the viewer pondering the figure’s journey and the weight of their solitude.
Prompt
style-aesthetic French New Wave: epic, melancholic ; A lone figure, silhouetted against a setting sun; long shot; heroism; a vast, empty desert landscape; cinematic
Characteristic
Shot : A lone figure walks towards the setting sun in a desert landscape
Aesthetic Score : 0.7
Mood : melancholy, lonely, contemplative
Quality
Entropy : 6.22
Noise : 24
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and graininess in the image, especially in the sky
Silhouettes of Love: A Romantic Sunset Stroll
A couple walks hand-in-hand into the golden glow of the setting sun, their silhouettes creating a sense of mystery and intimacy. This romantic scene evokes feelings of hope and a promise of something beautiful to come.
Prompt
style-aesthetic French New Wave: romantic, nostalgic ; A couple walking hand-in-hand along a cobblestone street, their silhouettes framed by the setting sun; long shot; tourism; a romantic, picturesque town; cinematic
Characteristic
Shot : A silhouette of a couple walking down a street towards the setting sun.
Aesthetic Score : 0.7
Mood : romantic, nostalgic, hopeful
Quality
Entropy : 6.21
Noise : 74
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
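The per-image figures above are easier to compare side by side. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten records into a list and computes simple averages. The tuple layout, shortened titles, and variable names are illustrative choices, not part of the original evaluation pipeline.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image records in this post.
records = [
    ("Blue Light", 0.6, 0.18, 0.10),
    ("Hand and Map", 0.6, 0.19, 0.10),
    ("City Stroll", 0.7, 0.21, 0.30),
    ("Train Ride", 0.7, 0.25, 0.30),
    ("Dinner Table", 0.7, 0.22, 0.10),
    ("City of Lights", 0.7, 0.26, 0.10),
    ("Vintage Compass", 0.8, 0.24, 0.20),
    ("Neon Touch", 0.6, 0.25, 0.20),
    ("Silhouette of Solitude", 0.7, 0.28, 0.20),
    ("Silhouettes of Love", 0.7, 0.29, 0.10),
]

mean_aesthetic = mean(r[1] for r in records)
mean_clip = mean(r[2] for r in records)
mean_ai_likelihood = mean(r[3] for r in records)

# Images with the weakest and strongest prompt alignment by CLIP score.
lowest_clip = min(records, key=lambda r: r[2])
highest_clip = max(records, key=lambda r: r[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI: {mean_ai_likelihood:.2f}")
print(f"lowest CLIP score: {lowest_clip[0]} ({lowest_clip[2]})")
print(f"highest CLIP score: {highest_clip[0]} ({highest_clip[2]})")
```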
Conclusion
The results show that the model understood the scenes reasonably well but struggled with camera position and, above all, with the aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests that the model often missed the camera positions described in the prompts.
- Shot Analysis: The model scored 0.63, which falls within the “good” range. This indicates that the model generally understood the scene and produced shots consistent with the prompts.
- Aesthetic Analysis: The model scored 0.07, which the evaluation reports as far from the “very good” range, given as -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated significantly from the aesthetic described in the prompts.
Overall, the model shows promise in understanding the scene, but needs improvement in matching the requested camera positions and in capturing the desired aesthetic.
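The comparison against the “good” range can be written as a simple range check. This is a minimal sketch: `in_range` and `GOOD_RANGE` are hypothetical names, and the bounds are assumed to be inclusive, which the post does not state. The aesthetic score is left out because the post does not explain how that scale is read.

```python
def in_range(score, low, high):
    """Return True when score lies within the inclusive range [low, high]."""
    return low <= score <= high


# "Good" range quoted in the conclusion; inclusive bounds are an assumption.
GOOD_RANGE = (0.5, 0.75)

# Scores reported in the conclusion.
scores = {"camera_position": 0.35, "shot_analysis": 0.63}

for name, score in scores.items():
    verdict = "within" if in_range(score, *GOOD_RANGE) else "outside"
    print(f"{name}: {score} is {verdict} the good range {GOOD_RANGE}")
```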
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://fal.ai/models/fal-ai/flux/dev/api