AI Captures the Pose, But Misses the Mood with Flux-dev

AI Image Generation: A Step Forward, But Still Room for Growth with Flux-dev

AI image generation keeps evolving, with new models promising to change how we create visual content. One such model, flux-dev, was recently put to the test on a series of prompts describing different scenes and poses. It handled camera positions and shot types well but struggled to capture the requested aesthetic, a key challenge in AI image generation. This post walks through the results of the experiment, covering the model’s strengths and weaknesses and where it could improve.

Created with: flux-dev

Silhouetted Solitude: A Moment of Contemplation on the Mountaintop

A lone figure stands silhouetted against the misty sky, their dark jacket and jeans blending with the clouds. The low angle shot emphasizes the vastness of the landscape, creating a sense of isolation and contemplation. This moody image evokes feelings of solitude and introspection.

Prompt

poses hands-in-pockets: determined, confident ; A lone adventurer, standing on a mountain peak; wide shot; heroism; dramatic sky with clouds; cinematic

Characteristic

Shot : A lone figure stands on a mountain overlooking a misty landscape.

Aesthetic Score : 0.6

Mood : solitude, contemplative, atmospheric

Quality

Entropy : 6.20

Noise : 55

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are some minor artifacts in the sky, but they are not very noticeable.
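
The post does not say how the Entropy value in these cards is calculated. One common reading, assumed here, is the Shannon entropy (in bits) of an image’s 8-bit grayscale histogram, which tops out at 8.0 and would be consistent with the values of roughly 6 to 7 reported throughout. A minimal sketch under that assumption, where `shannon_entropy` is a hypothetical helper and not the tool used for this post:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities.

    Assumption: this mirrors a histogram-based image entropy; the post
    does not state how its Entropy figure was actually produced.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A flat image carries no information; a perfectly uniform 8-bit
# histogram reaches the 8-bit maximum.
flat = [128] * 1000
uniform = list(range(256)) * 4
print(shannon_entropy(flat), shannon_entropy(uniform))
```

In practice the pixel list would come from an image library after converting the picture to grayscale. Busier, more detailed images push the value toward 8, while flat or heavily silhouetted ones pull it down.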

A Boy’s Journey Begins: A Serene Forest Adventure

A young boy, backpack in tow, stands amidst the tranquil forest, his gaze fixed on the distant horizon. The scene evokes a sense of serene contemplation and adventurous spirit, leaving the viewer to wonder what mysteries lie ahead. The solitary figure and the unclear destination create a captivating sense of anticipation and mystery.

Prompt

poses hands-in-pockets: curious, excited ; A young explorer, gazing at a vast jungle; medium shot; adventure; lush green foliage and ancient ruins; cinematic

Characteristic

Shot : A young boy with a backpack standing in a forest, looking away from the camera.

Aesthetic Score : 0.5

Mood : melancholy, contemplative, hopeful

Quality

Entropy : 6.87

Noise : 81

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible errors or artifacts in the image

Immersed in the Game: A Gamer’s Focus Under Neon Lights

A young gamer, bathed in vibrant purple and blue lighting, sits intently in their gaming chair, headphones on, eyes locked on the screen. The low lighting and focused expression create a palpable sense of intensity and immersion in the digital world.

Prompt

poses hands-in-pockets: focused, intense ; A gamer, sitting at a desk with a controller in hand; close-up; gaming; neon lights and computer screens; cinematic

Characteristic

Shot : A young person is sitting in a gaming chair in a dimly lit room with colorful lighting. They’re wearing headphones and holding a gaming controller, likely playing a video game.

Aesthetic Score : 0.6

Mood : focused, intense, gaming

Quality

Entropy : 6.31

Noise : 56

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has a slight noise and artifacting in the shadows and dark areas, especially in the background. There is a slight blurriness on the edges of the subject, particularly on the person’s face and headphones.

City Lights, Open Skies: A Moment of Hope

A young woman finds joy in the simple act of walking, her gaze drawn upwards to the bright sky. The contrast between the bustling city and the open expanse above creates a sense of hope and carefree optimism.

Prompt

poses hands-in-pockets: amazed, happy ; A tourist, admiring a famous landmark; medium shot; tourism; bustling city streets and iconic architecture; cinematic

Characteristic

Shot : A woman is walking on a city street, looking up at the sky, bathed in warm sunlight.

Aesthetic Score : 0.7

Mood : happy, hopeful, carefree

Quality

Entropy : 6.74

Noise : 65

Prompt Clip Score : 0.20

AI Evaluation

Likelihood of AI : 0.30

Image errors : The image is slightly overexposed, resulting in some blown-out highlights in the sky.

Finding Tranquility Amidst the Wildflowers

A lone figure, backpack in tow, traverses a path through a field of vibrant yellow wildflowers. The majestic mountain range in the background and the vast blue sky create a sense of calm and solitude, inviting contemplation and adventure.

Prompt

poses hands-in-pockets: free, adventurous ; A backpacker, walking along a scenic road; medium shot; travel; rolling hills and vibrant wildflowers; cinematic

Characteristic

Shot : A person with a backpack is walking on a path in the mountains, with yellow flowers on either side.

Aesthetic Score : 0.6

Mood : tranquil, contemplative, serene

Quality

Entropy : 6.54

Noise : 66

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors detected

Golden Hour Friendships on the Beach

A group of friends gathers on the beach at sunset, full of joy and carefree spirit. The warm glow of the setting sun casts a beautiful light over the scene, evoking a sense of happiness and relaxation.

Prompt

poses hands-in-pockets: relaxed, joyful ; A group of friends, standing on a beach at sunset; wide shot; groups; golden sand and crashing waves; cinematic

Characteristic

Shot : A group of six friends standing on a beach at sunset, looking at the horizon.

Aesthetic Score : 0.6

Mood : happy, relaxed, friendly

Quality

Entropy : 6.50

Noise : 63

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly underexposed, resulting in a loss of detail in the shadows. The silhouettes are well-defined, but the overall image lacks sharpness.

Silhouetted Hero: Firefighter Braves the Blaze

A firefighter, clad in full gear, stands in stark silhouette against a backdrop of raging flames. The intense fire creates a dramatic and somber atmosphere, highlighting the danger and heroism of the scene. The firefighter’s stoic form evokes a sense of bravery in the face of adversity.

Prompt

poses hands-in-pockets: brave, determined ; A firefighter, standing in front of a burning building; medium shot; heroism; smoke and flames; cinematic

Characteristic

Shot : A firefighter in full gear stands in front of a fire, silhouetted against the flames.

Aesthetic Score : 0.6

Mood : dramatic, intense, heroic

Quality

Entropy : 6.69

Noise : 62

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.10

Image errors : There is some slight noise and grain in the image, but nothing too distracting.

Shadows and Secrets: A Journey into the Unknown

Three figures, shrouded in mystery, navigate a dark, cave-like environment illuminated by an ethereal light source. The interplay of light and shadow creates a sense of suspense and adventure, leaving the viewer to wonder what lies ahead.

Prompt

poses hands-in-pockets: cautious, curious ; A group of explorers, navigating a dark cave; medium shot; adventure; stalactites and stalagmites; cinematic

Characteristic

Shot : Three figures silhouetted against a bright, ethereal glow, walking through a narrow canyon with rock walls on both sides. The figures are in the middle of the frame and the light source is behind them, creating a dramatic backlight effect.

Aesthetic Score : 0.6

Mood : mysterious, ominous, adventurous

Quality

Entropy : 6.05

Noise : 81

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : No significant image errors observed.

Lost in the Music: Dancing Under a Sea of Lights

A vibrant scene captures the joy and energy of a concert or rave. The person, lost in the music, dances with raised arms under a dazzling display of colorful lights, surrounded by a lively crowd. The image evokes a sense of carefree excitement and the pure thrill of being part of the moment.

Prompt

poses hands-in-pockets: excited, triumphant ; A gamer, celebrating a victory with friends; close-up; gaming; celebratory confetti and flashing lights; cinematic

Characteristic

Shot : A silhouette of a person with headphones raised in the air, surrounded by other people dancing at a party, with a soft purple and pink lighting and confetti

Aesthetic Score : 0.6

Mood : energetic, happy, festive

Quality

Entropy : 6.33

Noise : 51

Prompt Clip Score : 0.20

AI Evaluation

Likelihood of AI : 0.30

Image errors : There are some slight artifacts in the image, but they are not very noticeable.

A Family’s Journey Through Time

A mother, father, and their young daughter walk hand-in-hand towards a mysterious archway, bathed in the golden light of the setting sun. The scene evokes a sense of peace, hope, and the promise of adventure as they embark on a journey together.

Prompt

poses hands-in-pockets: happy, united ; A family, standing in front of a famous monument; wide shot; tourism; historical landmark and sunny sky; cinematic

Characteristic

Shot : A family of three, a couple and their young daughter, stand in front of a large, imposing archway. The scene appears to be taking place in a park or outdoor setting.

Aesthetic Score : 0.7

Mood : peaceful, happy, hopeful

Quality

Entropy : 6.54

Noise : 65

Prompt Clip Score : 0.30

AI Evaluation

Likelihood of AI : 0.10

Image errors : No obvious artifacts or errors

Conclusion

The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.

Here’s a breakdown:

  • Camera Position: The model scored 0.5, which is considered good. The camera position in the generated images closely matched the prompts’ instructions.
  • Shot Analysis: The model scored 0.56, also considered good. The shot composition in the generated images was fairly well aligned with the prompts’ descriptions.
  • Aesthetic Analysis: The model scored 0.16, a weak result. The aesthetic of the generated images deviated significantly from what the prompts called for.

Overall, the model seems to be capable of understanding and implementing camera positions and shot types, but it needs improvement in generating images that match the desired aesthetic.
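
For a quick summary of the per-image numbers listed above, the values can be averaged with a few lines of Python. This is only an illustrative sketch: the `results` table is transcribed from this post, the field names are made up for the example, and the camera position, shot and aesthetic analysis scores in the breakdown come from a separate evaluation that is not reproduced here.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, clip, ai_likelihood), transcribed from this post.
results = [
    ("Silhouetted Solitude", 0.6, 6.20, 55, 0.26, 0.20),
    ("A Boy's Journey Begins", 0.5, 6.87, 81, 0.29, 0.20),
    ("Immersed in the Game", 0.6, 6.31, 56, 0.27, 0.10),
    ("City Lights, Open Skies", 0.7, 6.74, 65, 0.20, 0.30),
    ("Finding Tranquility Amidst the Wildflowers", 0.6, 6.54, 66, 0.26, 0.20),
    ("Golden Hour Friendships on the Beach", 0.6, 6.50, 63, 0.28, 0.20),
    ("Silhouetted Hero", 0.6, 6.69, 62, 0.25, 0.10),
    ("Shadows and Secrets", 0.6, 6.05, 81, 0.27, 0.10),
    ("Lost in the Music", 0.6, 6.33, 51, 0.20, 0.30),
    ("A Family's Journey Through Time", 0.7, 6.54, 65, 0.30, 0.10),
]

# Hypothetical field names, in the same order as the tuple columns after the title.
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def column(name):
    """Return every image's value for one metric."""
    idx = FIELDS.index(name) + 1  # +1 skips the title column
    return [row[idx] for row in results]


averages = {name: round(mean(column(name)), 3) for name in FIELDS}

# Images whose prompt CLIP score is the lowest in the set.
lowest_clip = min(column("clip"))
weakest_matches = [row[0] for row in results if row[4] == lowest_clip]

print(averages)
print(weakest_matches)
```

Across the ten images the averages come out to about 0.61 for aesthetic score, 6.477 for entropy, 64.5 for noise, 0.258 for prompt CLIP score and 0.18 for likelihood of AI. The two lowest prompt CLIP scores (0.20) belong to "City Lights, Open Skies" and "Lost in the Music", the two images whose descriptions stray furthest from their prompts.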
