AI Captures the Pose, But Misses the Mood with Flux-dev

Edited on: October 1, 2024 - Published: August 12, 2024 - 9 minutes read - 1743 words

In the realm of AI image generation, capturing the essence of a scene goes beyond simply placing objects and characters in the right positions. Dramatic poses, for example, are often used to convey emotion, action, or a specific mood. This blog post explores the results of testing an AI model’s ability to generate images based on specific poses and scenes, focusing on the model’s success in capturing the intended aesthetic.

Created with: flux-dev

Clash of Titans: Silhouettes Battle at Sunset

Two figures locked in a fierce sword fight, their silhouettes stark against the fiery sunset. The dramatic backlighting and epic composition evoke a sense of heroism and grandeur. A third figure, partially obscured in the distance, adds a layer of mystery to this captivating scene.

Prompt

poses fighting: epic, determined ; A lone warrior; wide shot; heroism; a desolate battlefield with the setting sun in the background; cinematic

Characteristic

Shot : Two silhouetted figures, one with a sword and the other with a shield, stand facing each other in a field with a sunset behind them. Another figure is visible in the background.

Aesthetic Score : 0.7

Mood : epic, dramatic, nostalgic

Quality

Entropy : 6.43

Noise : 52

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.40

Image errors : The image appears to be slightly overexposed, which may be an intentional effect.
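The post does not say how the Entropy figure is computed. One common definition is the Shannon entropy of the grayscale histogram, which for 8-bit images ranges from 0 (a single flat tone) to 8 bits (all 256 levels equally likely). Under that reading, a value such as 6.43 would indicate a fairly wide tonal spread. The following is a minimal sketch under that assumption; `shannon_entropy` is a hypothetical helper name, and plain lists stand in for real pixel data.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of gray levels.

    Assumption: this is the histogram-based definition; the post does not
    confirm which entropy measure its tooling actually uses.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # H = sum over gray levels of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs (not taken from the generated images):
flat_gray = [128] * 64            # one tone only, lowest possible entropy
all_levels = list(range(256))     # every 8-bit level exactly once

print(shannon_entropy(flat_gray))
print(shannon_entropy(all_levels))
```

With a real image, the pixel sequence would come from an image library's grayscale conversion rather than from hand-built lists.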

Warriors on the Brink of Mystery

A group of warriors, silhouetted against a misty jungle, stand poised before a looming, ancient structure. The scene evokes a sense of mystery and adventure, hinting at a dramatic confrontation or a perilous quest.

Prompt

poses fighting: intense, adventurous ; A group of adventurers; medium shot; adventure; a dense jungle with ancient ruins in the distance; cinematic

Characteristic

Shot : A group of warriors with swords are standing in a forest with a misty background. The scene is set in a fantasy world.

Aesthetic Score : 0.6

Mood : mysterious, epic, dramatic

Quality

Entropy : 6.65

Noise : 104

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.70

Image errors : There are some minor artifacts in the image, particularly in the areas of high contrast. The edges of the warriors and the swords are somewhat jagged.

Neon City Enigma: A Woman’s Determined Walk Through a Cyberpunk World

A young woman with a ponytail strides through a vibrant cyberpunk city, bathed in neon light. Her focused expression and mysterious pose hint at a hidden purpose. A blurry figure trails behind, adding to the intrigue of this futuristic urban scene.

Prompt

poses fighting: dynamic, futuristic ; A player character; close-up; gaming; a neon-lit cityscape with holographic projections; cinematic

Characteristic

Shot : A woman in a black jacket and jeans stands in a futuristic city setting, illuminated by neon lights. Another person, blurry and out of focus, stands behind her in a similar pose.

Aesthetic Score : 0.6

Mood : urban, futuristic, mysterious

Quality

Entropy : 6.82

Noise : 71

Prompt Clip Score : 0.30

AI Evaluation

Likelihood of AI : 0.90

Image errors : The image contains some slight artifacts and compression, particularly noticeable in the blurred background and the woman’s hair.
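The Prompt Clip Score of 0.30 above is the highest in this set, yet the post does not explain how the score is derived. A common approach is to take the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result. The sketch below shows only that similarity step, assuming the embeddings have already been produced by a CLIP model; the two short vectors are hypothetical stand-ins, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length vector has no direction")
    return dot / (norm_a * norm_b)


# Hypothetical 2-dimensional stand-ins; real CLIP embeddings have hundreds
# of dimensions and come from the model's text and image encoders.
prompt_embedding = [1.0, 0.0]
image_embedding = [3.0, 4.0]

print(cosine_similarity(prompt_embedding, image_embedding))
```

Because the measure depends only on direction, scaling either vector leaves the score unchanged, which is why embeddings are usually compared this way rather than by raw distance.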

Friendship in the City Lights

Two young men, clad in casual wear and backpacks, stroll through a bustling Asian city, their laughter echoing through the crowded streets. The contrasting light and shadow, along with the slightly blurred background, add a touch of drama to this heartwarming scene of friendship.

Prompt

poses fighting: chaotic, humorous ; Two tourists; medium shot; tourism; a bustling marketplace with colorful stalls and vibrant crowds; cinematic

Characteristic

Shot : Two men are walking down a crowded city street, greeting each other with a handshake.

Aesthetic Score : 0.7

Mood : friendly, urban, casual

Quality

Entropy : 6.67

Noise : 85

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly overexposed, and there is some noise in the background.

A Solitary Figure Against the Vastness of the Desert

A lone traveler walks across a sand dune, their silhouette stark against the clear blue sky. The scene evokes a sense of serenity, adventure, and hope, with the vastness of the desert emphasizing the individual’s journey.

Prompt

poses fighting: isolated, desperate ; A lone traveler; long shot; travel; a vast desert landscape with a lone sand dune in the foreground; cinematic

Characteristic

Shot : A lone figure in a brown coat walks across a vast expanse of sand dunes in the desert. The figure is walking away from the viewer, looking over their shoulder, with their arm outstretched.

Aesthetic Score : 0.6

Mood : adventurous, solitary, vast

Quality

Entropy : 5.92

Noise : 36

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.30

Image errors : The image has a slight amount of noise and grain, especially in the shadows. It also has a few artifacts in the sky that may be the result of editing.

Silhouettes of Conflict: A City at Dusk

Two figures stand in silhouette against the backdrop of a city at dusk, their tense embrace hinting at a dramatic confrontation. The interplay of light and shadow creates a sense of mystery and intrigue, leaving the viewer to ponder the story unfolding before them.

Prompt

poses fighting: energetic, playful ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic

Characteristic

Shot : Two men in silhouette are standing on a rooftop overlooking a city skyline at night. The men appear to be in a tense or confrontational pose, with their arms raised as if they are about to engage in a physical altercation.

Aesthetic Score : 0.4

Mood : tense, dramatic, urban

Quality

Entropy : 6.55

Noise : 56

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are no major errors in the image. However, the quality of the image is somewhat low, which may be due to the low light conditions in which it was captured.

Silhouetted Warrior in a Fiery Landscape

A lone warrior, silhouetted against a backdrop of raging fire and smoke, stands ready with sword in hand. The epic scene evokes a sense of power, drama, and fierce determination.

Prompt

poses fighting: tragic, determined ; A lone warrior; close-up; heroism; a burning village with smoke billowing in the air; cinematic

Characteristic

Shot : A lone warrior stands silhouetted against a fiery backdrop, a sword in hand. The background suggests a battle or a burning city.

Aesthetic Score : 0.7

Mood : epic, dramatic, powerful

Quality

Entropy : 6.46

Noise : 65

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : There is some noise and graininess in the image. The edges of the silhouetted figures are slightly blurry.

Shadows in the Cave: A Journey into the Unknown

Four figures, cloaked in shadow, navigate a mysterious cave, their silhouettes illuminated by an unseen light source. A sword held aloft hints at adventure and danger ahead. This epic scene evokes a sense of mystery and intrigue, promising a thrilling journey into the unknown.

Prompt

poses fighting: suspenseful, adventurous ; A group of explorers; wide shot; adventure; a dark cave with flickering torches and mysterious shadows; cinematic

Characteristic

Shot : A group of four figures, three standing and one holding a sword, are silhouetted against a bright opening in a cave. The light catches the sword and illuminates the figures dramatically.

Aesthetic Score : 0.6

Mood : mysterious, adventurous, dramatic

Quality

Entropy : 6.16

Noise : 82

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.70

Image errors : The image has some slight blurriness, particularly around the edges of the figures.

VR Worlds Collide: A Futuristic Dance of Light and Shadow

Two figures, silhouetted against a vibrant blue and pink backdrop, engage in a playful interaction within the realm of virtual reality. The contrasting light creates a dramatic effect, highlighting the futuristic and exciting nature of their experience.

Prompt

poses fighting: immersive, intense ; A gamer; close-up; gaming; a virtual reality headset with a pixelated world projected in the background; cinematic

Characteristic

Shot : Two people wearing VR headsets are interacting with each other in a dimly lit room. The background is a blurry wall with a screen displaying a blurry scene.

Aesthetic Score : 0.7

Mood : futuristic, playful, mysterious

Quality

Entropy : 6.29

Noise : 54

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has some noise and grain, particularly in the shadows.

A Handshake of Secrets: Mystery in the Train Station

Two men meet in a bustling train station, their handshake shrouded in an air of professionalism and intrigue. The lighting and composition create a sense of mystery, leaving the viewer wondering what secrets lie beneath the surface.

Prompt

poses fighting: fast-paced, chaotic ; Two travelers; medium shot; travel; a crowded train station with people rushing in all directions; cinematic

Characteristic

Shot : Two men in suits are shaking hands in a dimly lit hallway or subway station.

Aesthetic Score : 0.6

Mood : serious, professional, formal

Quality

Entropy : 6.37

Noise : 57

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are some minor artifacts around the edges, especially on the right side.

Conclusion

The results show that the generative AI model performed well in terms of camera position and shot analysis, but struggled with aesthetic analysis.

Here’s a breakdown:

Camera Position: The model scored 0.5, which is considered good. This means the camera positions in the generated images matched the prompts' instructions reasonably well.
Shot Analysis: The model scored 0.64, also considered good. This indicates that the shot composition of the generated images was fairly well aligned with the prompts' descriptions.
Aesthetic Analysis: The model scored 0.12, which is poor. This suggests that the aesthetic of the generated images deviated significantly from the aesthetic the prompts asked for.
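The post does not state how the aesthetic analysis score is calculated. As one illustration of how a prompt-versus-result mood comparison could work, the sketch below computes the Jaccard overlap between the mood words requested in each prompt and the Mood terms reported for each image above. This is an assumed metric, not the one behind the 0.12 figure, and the helper names `mood_terms` and `jaccard` are hypothetical.

```python
def mood_terms(text):
    """Split a comma-separated mood string into a set of lowercase terms."""
    return {term.strip().lower() for term in text.split(",") if term.strip()}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# (mood words requested in the prompt, Mood reported for the image),
# copied from the ten sections of this post in order.
MOOD_PAIRS = [
    ("epic, determined", "epic, dramatic, nostalgic"),
    ("intense, adventurous", "mysterious, epic, dramatic"),
    ("dynamic, futuristic", "urban, futuristic, mysterious"),
    ("chaotic, humorous", "friendly, urban, casual"),
    ("isolated, desperate", "adventurous, solitary, vast"),
    ("energetic, playful", "tense, dramatic, urban"),
    ("tragic, determined", "epic, dramatic, powerful"),
    ("suspenseful, adventurous", "mysterious, adventurous, dramatic"),
    ("immersive, intense", "futuristic, playful, mysterious"),
    ("fast-paced, chaotic", "serious, professional, formal"),
]

overlaps = [jaccard(mood_terms(p), mood_terms(m)) for p, m in MOOD_PAIRS]
mean_overlap = sum(overlaps) / len(overlaps)
matched = sum(1 for score in overlaps if score > 0)

print(f"images sharing at least one mood word: {matched} of {len(overlaps)}")
print(f"mean overlap: {mean_overlap:.3f}")
```

Exact word matching is deliberately strict: it treats near-synonyms such as "isolated" and "solitary" as misses, so a comparison based on embeddings or a synonym list would likely score some images higher.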

Overall, the model seems to be capable of understanding and implementing camera positions and shot types, but it needs improvement in capturing the desired aesthetic.
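For readers who want to compare the ten images side by side, the per-image values listed in the sections above can be averaged directly. The sketch below does only that; it is a convenience summary with a hypothetical `summarize` helper and does not reproduce the three scores in the breakdown, whose calculation is not described in the post.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the per-image sections of this post.
RESULTS = [
    ("Clash of Titans: Silhouettes Battle at Sunset", 0.7, 0.26, 0.40),
    ("Warriors on the Brink of Mystery", 0.6, 0.28, 0.70),
    ("Neon City Enigma", 0.6, 0.30, 0.90),
    ("Friendship in the City Lights", 0.7, 0.24, 0.20),
    ("A Solitary Figure Against the Vastness of the Desert", 0.6, 0.28, 0.30),
    ("Silhouettes of Conflict: A City at Dusk", 0.4, 0.27, 0.20),
    ("Silhouetted Warrior in a Fiery Landscape", 0.7, 0.26, 0.20),
    ("Shadows in the Cave: A Journey into the Unknown", 0.6, 0.28, 0.70),
    ("VR Worlds Collide", 0.7, 0.28, 0.10),
    ("A Handshake of Secrets: Mystery in the Train Station", 0.6, 0.25, 0.20),
]


def summarize(results):
    """Average each metric and name the image with the lowest aesthetic score."""
    return {
        "mean_aesthetic": round(mean(r[1] for r in results), 2),
        "mean_clip": round(mean(r[2] for r in results), 2),
        "mean_ai_likelihood": round(mean(r[3] for r in results), 2),
        "lowest_aesthetic": min(results, key=lambda r: r[1])[0],
    }


summary = summarize(RESULTS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The only image scoring below 0.6 on aesthetics is "Silhouettes of Conflict: A City at Dusk", whose prompt asked for an energetic, playful group of friends but produced a tense, confrontational pair.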
