AI Struggles to Capture the 'Style-Aesthetic' - A Case Study with Flux-pro

Exploring the Limits of AI in Capturing 'Style-Aesthetic' with Flux-pro

The ‘style-aesthetic’ is a powerful tool in visual storytelling: it lets creators evoke specific emotions and atmospheres through color palettes, lighting, composition, and even the choice of objects within a scene. This experiment explored how well a generative AI model, flux-pro, could understand and reproduce a requested ‘style-aesthetic’ in the images it generates. The model was moderately successful at matching the requested camera position and shot, but it largely failed to capture the desired aesthetic, which highlights how hard it remains to teach AI complex artistic concepts.

Created with: flux-pro

A Solitary Journey Through the Desert

A lone figure, shrouded in mystery, traverses a vast desert landscape towards a distant building. The play of light and color evokes a sense of depth and intrigue, while the juxtaposition of the solitary traveler and the distant structure creates a powerful sense of scale and adventure.

Prompt

Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; medium shot; Adventure; a vast, sun-drenched desert landscape with ancient ruins in the distance; cinematic
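Every prompt in this experiment follows the same semicolon-separated layout: a style label with three mood words, then the subject, the shot type, a theme, the setting, and a closing look keyword. As a minimal sketch, assuming that layout holds, a prompt can be split into named fields so that each part can be checked against the generated image separately. The field names and the `parse_prompt` helper are illustrative assumptions; they do not come from flux-pro or from any tool used in this experiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSpec:
    # Field names are hypothetical labels for the six prompt segments.
    style: str        # e.g. "Vintage"
    moods: List[str]  # e.g. ["epic", "adventurous", "hopeful"]
    subject: str      # e.g. "A lone, weathered explorer"
    shot: str         # e.g. "medium shot"
    theme: str        # e.g. "Adventure"
    setting: str      # the scene description
    look: str         # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptSpec:
    """Split a 'Style: moods ; subject; shot; theme; setting; look' prompt."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments, got {len(parts)}")
    head, subject, shot, theme, setting, look = parts
    # The first segment holds the style label and the comma-separated moods.
    style, mood_text = head.split(":", 1)
    moods = [m.strip().lower() for m in mood_text.split(",")]
    return PromptSpec(style.strip(), moods, subject, shot, theme, setting, look)


if __name__ == "__main__":
    spec = parse_prompt(
        "Vintage: Epic, adventurous, hopeful ; A lone, weathered explorer; "
        "medium shot; Adventure; a vast, sun-drenched desert landscape with "
        "ancient ruins in the distance; cinematic"
    )
    print(spec.shot, spec.moods)
```

Splitting on semicolons rather than commas matters here, because the subject and setting segments contain commas of their own.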

Characteristic

Shot : A man wearing a large backpack and hat walks across a desert landscape. There is a building in the background.

Aesthetic Score : 0.7

Mood : mysterious, adventurous, hopeful

Quality

Entropy : 6.69

Noise : 83

Prompt Clip Score : 0.25
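The article does not state how the Quality readings are computed. One common interpretation, assumed here, is that Entropy is the Shannon entropy of the image's 8-bit grayscale histogram (so values fall between 0 and 8 bits) and that Prompt Clip Score is the cosine similarity between CLIP embeddings of the prompt and of the image. The sketch below shows both calculations on plain Python lists. The function names are hypothetical, and producing real embeddings would require a CLIP model, which is outside this sketch. Noise is left out because its definition is not given.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(p) over the observed intensity levels, then negate.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # An image using all 256 gray levels equally often reaches the 8-bit maximum.
    print(shannon_entropy(list(range(256))))
    # Toy vectors standing in for prompt and image embeddings (assumption).
    print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```

Under this interpretation, an Entropy reading near 6.7 would indicate an image that uses most of the available tonal range, and a Prompt Clip Score of 0.25 would indicate only a loose match between prompt and image embeddings.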

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly overexposed. There are no other apparent artifacts or errors.

Cozy Candlelight and Board Game Fun

Four children gather around a wooden table, bathed in the warm glow of a window and candles, as they engage in a lively board game. The intimate setting and close-up framing capture the playful energy and cozy atmosphere of their shared moment.

Prompt

Vintage: Nostalgic, intimate, playful ; A group of children playing a board game; close-up; Gaming; a dimly lit room with a worn wooden table and flickering candlelight; cinematic

Characteristic

Shot : A group of children are playing a board game around a wooden table in a dimly lit room.

Aesthetic Score : 0.6

Mood : cozy, warm, focused

Quality

Entropy : 6.51

Noise : 73

Prompt Clip Score : 0.39

AI Evaluation

Likelihood of AI : 0.10

Image errors : No noticeable errors in the image.

Lost in Thought at the Station

A young woman in a grey dress stands alone on a train platform, her gaze lost in the distance. The shallow depth of field draws attention to her figure, creating a sense of isolation and contemplation. The romantic and wistful mood is palpable, hinting at a story waiting to be told.

Prompt

Vintage: Romantic, adventurous, hopeful ; A young woman in a vintage dress standing on a train platform; long shot; Travel; a bustling train station with steam locomotives and vintage luggage; cinematic

Characteristic

Shot : A young woman in a grey dress stands in front of a train station platform. A train is pulling out in the background.

Aesthetic Score : 0.7

Mood : dreamy, nostalgic, romantic

Quality

Entropy : 6.83

Noise : 72

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible errors, good resolution and clarity.

Heroic Rescue: Firefighter Braves Flames to Save Child

A dramatic image captures the moment a firefighter carries a child through a burning building, the flames and smoke billowing behind them. The scene highlights the bravery and urgency of the rescue, showcasing the intensity of the situation.

Prompt

Vintage: Dramatic, heroic, suspenseful ; A firefighter carrying a child through a burning building; close-up; Heroism; a smoky, chaotic scene with flames and debris; cinematic

Characteristic

Shot : A firefighter carrying a child out of a burning building, with the flames visible in the background.

Aesthetic Score : 0.7

Mood : dramatic, heroic, hopeful

Quality

Entropy : 6.53

Noise : 76

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible errors in the image.

Campfire Tales Under a Starry Sky

A cozy gathering around a crackling campfire, bathed in the warm glow of the flames and the twinkling light of a star-filled sky. This scene evokes a sense of nostalgia, peace, and connection, perfect for a night of shared stories and laughter.

Prompt

Vintage: Warm, nostalgic, peaceful ; A family gathered around a campfire; wide shot; Family; a serene forest setting with stars twinkling in the night sky; cinematic

Characteristic

Shot : A group of people are gathered around a campfire under a starry night sky. The scene is set in a forest clearing with trees surrounding the group.

Aesthetic Score : 0.7

Mood : cozy, peaceful, nostalgic

Quality

Entropy : 6.44

Noise : 84

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are no noticeable artifacts or errors in the image. The colors are well-balanced and the image is sharp.

A Vintage Journey Through Majestic Mountains

A classic car stands poised on a winding mountain road, its journey just beginning. The towering peaks in the distance and lush greenery create a serene and nostalgic atmosphere, promising an adventurous drive through breathtaking scenery.

Prompt

Vintage: Romantic, adventurous, nostalgic ; A vintage car driving down a winding mountain road; long shot; Tourism; a scenic mountain landscape with lush forests and snow-capped peaks; cinematic

Characteristic

Shot : A vintage car is parked on a winding road with a mountain backdrop.

Aesthetic Score : 0.7

Mood : nostalgic, serene, adventurous

Quality

Entropy : 6.76

Noise : 102

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly overexposed, resulting in a washed-out appearance.

Sunset Flight: A Pilot’s Tranquil Adventure Above the Clouds

A small biplane cuts through a sea of clouds as the sun sets, casting a warm glow on the pilot’s silhouette. The perspective and framing evoke a sense of isolation and adventure, while the soft light creates a peaceful atmosphere.

Prompt

Vintage: Exhilarating, adventurous, free ; A pilot in a vintage biplane soaring through the clouds; close-up; Adventure; a breathtaking view of a vast, blue sky with fluffy white clouds; cinematic

Characteristic

Shot : A person is flying in a small plane above the clouds. The clouds are lit by the setting sun, creating a warm glow.

Aesthetic Score : 0.7

Mood : serene, adventurous, nostalgic

Quality

Entropy : 6.73

Noise : 77

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.10

Image errors : There are some artifacts in the image, particularly around the edges of the clouds.

Fog of War: Soldiers March Through a Grim Cityscape

A haunting image of soldiers, armed and resolute, navigating a fog-shrouded city. The scene evokes a sense of tension and uncertainty, hinting at the harsh realities of war.

Prompt

Vintage: Grim, heroic, determined ; A group of soldiers marching through a war-torn city; medium shot; Heroism; a desolate cityscape with rubble and smoke; cinematic

Characteristic

Shot : A group of soldiers in military uniforms walking through a city street during a wartime setting. The image is taken from a slightly elevated perspective, looking down upon the soldiers as they march through the fog-filled streets.

Aesthetic Score : 0.6

Mood : wartime, grim, determined

Quality

Entropy : 6.77

Noise : 101

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors are visible.

A Romantic Waltz in a Grand Ballroom

A couple dances under warm, romantic lighting in a grand ballroom, surrounded by other guests. The scene exudes elegance and grace, capturing the intimacy and connection between the dancers.

Prompt

Vintage: Romantic, elegant, nostalgic ; A couple dancing in a vintage ballroom; close-up; Tourism; a grand ballroom with chandeliers and elegant guests; cinematic

Characteristic

Shot : A couple is dancing in a grand ballroom, with a blurred background of other guests.

Aesthetic Score : 0.7

Mood : romantic, elegant, nostalgic

Quality

Entropy : 6.69

Noise : 76

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has some slight blurriness and a few artifacts around the edges of the frame.

A World of Wonder: A Boy’s Curious Gaze

A young boy, lost in thought, studies a world map, his expression reflecting a sense of curiosity and nostalgia. The image evokes a feeling of exploration and the boundless possibilities that lie ahead.

Prompt

Vintage: Curious, adventurous, hopeful ; A young boy gazing at a vintage map; close-up; Adventure; a dimly lit room with a worn wooden table and a flickering candlelight; cinematic

Characteristic

Shot : A young boy, possibly 8-10 years old, is looking at a map spread out in front of him. The map covers a large portion of the image, while the boy is seated in a partially obscured background. The lighting is soft and warm, creating a cozy and inviting atmosphere.

Aesthetic Score : 0.7

Mood : curious, nostalgic, contemplative

Quality

Entropy : 6.70

Noise : 72

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors are visible. The lighting and focus are excellent, and the color balance is natural and pleasing.
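The per-image readings reported for the ten images can be summarized with a few lines of arithmetic. The sketch below copies the reported values, in the order the images appear, and computes the minimum, mean, and maximum of each metric. The `summarize` helper is illustrative and not part of any tool used in the experiment.

```python
from statistics import mean
from typing import Dict, List

# Values copied from the ten image records, in article order.
READINGS: Dict[str, List[float]] = {
    "Aesthetic Score": [0.7, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.7, 0.7],
    "Entropy": [6.69, 6.51, 6.83, 6.53, 6.44, 6.76, 6.73, 6.77, 6.69, 6.70],
    "Noise": [83, 73, 72, 76, 84, 102, 77, 101, 76, 72],
    "Prompt Clip Score": [0.25, 0.39, 0.31, 0.29, 0.31, 0.28, 0.29, 0.26, 0.27, 0.31],
    "Likelihood of AI": [0.20, 0.10, 0.20, 0.20, 0.20, 0.20, 0.10, 0.20, 0.10, 0.20],
}


def summarize(readings: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Return min, mean, and max for every metric."""
    return {
        name: {"min": min(values), "mean": mean(values), "max": max(values)}
        for name, values in readings.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(READINGS).items():
        print(
            f"{name:18s} min={stats['min']:.2f} "
            f"mean={stats['mean']:.2f} max={stats['max']:.2f}"
        )
```

On these ten images the Aesthetic Score averages 0.68 and the Prompt Clip Score about 0.30, with the board game scene (0.39) the only image above 0.31.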

Conclusion

The results show that the generative AI model did only moderately well at matching the requested camera position and shot, and performed poorly on aesthetic analysis.

Here’s a breakdown:

  • Camera Position: The model scored 0.4, which is considered okay. This means the generated image’s camera position was somewhat different from what was requested in the prompt.
  • Shot Analysis: The model scored 0.425, which is also considered okay. This indicates that the generated image’s shot composition was somewhat different from what was expected based on the prompt.
  • Aesthetic Analysis: The model scored 0.05, which is considered pretty bad. This means the generated image’s aesthetic was significantly different from what was expected based on the prompt.

Overall, the model's main weakness is understanding and implementing the desired aesthetic. Its handling of camera position and shot is only okay, so there is room for improvement in all three areas.
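One simple way to see part of the gap between request and result is to compare the mood words in each prompt with the Mood reported for the generated image. The sketch below uses Jaccard overlap on exact, case-insensitive word matches. This is an illustrative measure only: the article does not describe how the Aesthetic Analysis score was derived, and `mood_overlap` is a hypothetical helper, not that method.

```python
from typing import Iterable


def mood_overlap(requested: Iterable[str], detected: Iterable[str]) -> float:
    """Jaccard overlap between two collections of mood words (case-insensitive)."""
    req = {m.strip().lower() for m in requested}
    det = {m.strip().lower() for m in detected}
    if not req and not det:
        return 1.0  # two empty sets are treated as identical
    return len(req & det) / len(req | det)


if __name__ == "__main__":
    # Ballroom image: all three requested moods were detected.
    print(mood_overlap(["Romantic", "elegant", "nostalgic"],
                       ["romantic", "elegant", "nostalgic"]))  # 1.0
    # Desert image: "epic" was requested, "mysterious" was detected instead.
    print(mood_overlap(["Epic", "adventurous", "hopeful"],
                       ["mysterious", "adventurous", "hopeful"]))  # 0.5
    # Board game image: no requested mood word appears in the result.
    print(mood_overlap(["Nostalgic", "intimate", "playful"],
                       ["cozy", "warm", "focused"]))  # 0.0
```

Exact matching is deliberately strict, so near-synonyms such as "intimate" and "cozy" count as misses. A softer comparison would need a similarity model for words, which this sketch does not attempt.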