AI's Artistic Journey: Capturing Poses, But Missing the Essence with Flux-dev
Text-to-image models have become increasingly sophisticated. However, while these models excel at capturing the technical aspects of a scene, such as camera angles and object placement, they often struggle to convey the desired aesthetic style. This is particularly evident in the portrayal of poses, where the model may accurately depict the physical position of a figure but fail to capture the intended emotion, mood, or artistic intent. This blog post explores the challenges and opportunities in generating images with a specific aesthetic style, focusing on the example of poses and scene generation.
Created with: flux-dev
Contemplating the Vastness: A Solitary Figure on a Mountain Peak
A lone figure stands on a rocky mountain peak, gazing out at a breathtaking panorama of towering peaks. The clear blue sky above amplifies the sense of awe and solitude, creating a serene and contemplative mood. This image captures the spirit of adventure and the beauty of nature’s grandeur.
Prompt
poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic
Characteristic
Shot : A solitary figure stands on a rocky mountain peak, looking out over a vast expanse of mountains under a clear sky.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, adventurous
Quality
Entropy : 6.22
Noise : 86
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
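The prompt for this image, like every other prompt in this post, follows the same semicolon-separated layout: a pose label, a list of moods, a subject, a shot type, a theme, a background, and a style keyword. The sketch below splits a prompt into those parts. The field names and the `PosePrompt` and `parse_pose_prompt` helpers are assumptions inferred from the prompts shown here; they are not part of flux-dev or any documented prompt format.

```python
from dataclasses import dataclass


# Field names are assumptions inferred from the prompts in this post.
@dataclass
class PosePrompt:
    pose: str
    moods: list[str]
    subject: str
    shot: str
    theme: str
    background: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses <pose>: <moods> ; <subject>; ...' prompt into fields."""
    head, rest = prompt.split(":", 1)
    pose = head.removeprefix("poses").strip()
    parts = [p.strip() for p in rest.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    moods_raw, subject, shot, theme, background, style = parts
    # Only the mood field is split on commas; the background may contain commas.
    moods = [m.strip() for m in moods_raw.split(",")]
    return PosePrompt(pose, moods, subject, shot, theme, background, style)


example = (
    "poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; "
    "wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic"
)
parsed = parse_pose_prompt(example)
print(parsed.shot, "|", parsed.moods)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, the shot type) with what the evaluation reports for each image.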
Silhouettes of War: A Soldier’s Lonely Stand
A lone soldier stands in silhouette against a backdrop of smoke and fire, a stark reminder of the grim realities of war. The image evokes a sense of isolation, tension, and the dramatic weight of conflict.
Prompt
poses standing-tall: Brave, defiant, resolute ; Soldier standing on a battlefield; medium shot; Heroism; Smoke and debris from a recent explosion; cinematic
Characteristic
Shot : A lone soldier stands in the foreground, silhouetted against a backdrop of smoke and fire. Two other soldiers are visible in the background.
Aesthetic Score : 0.5
Mood : dramatic, somber, suspenseful
Quality
Entropy : 6.66
Noise : 70
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blurriness, particularly in the background. The silhouettes are slightly soft, lacking crisp definition.
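The post does not say how the Entropy figure in each Quality block is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale intensities, which ranges from 0 bits (a flat image) to 8 bits (all 256 levels equally likely). The values reported in this post (6.14 to 6.94) would fit that scale. The helper name `shannon_entropy` and the toy pixel lists are illustrative only.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of intensity values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat = [128] * 1024             # a single gray level: no information
uniform = list(range(256)) * 4  # every 8-bit level equally frequent

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

Under this assumption, higher entropy means a more varied distribution of tones; it says nothing by itself about whether the image matches the prompt.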
Neon Nights: Friends Celebrate Victory in a Burst of Color
Four friends, bathed in vibrant pink and blue neon lights, raise their arms in triumph, celebrating a hard-earned victory. The energy is palpable, captured in their joyful expressions and celebratory poses. The dramatic use of neon lighting adds a layer of excitement and energy to the scene, making it a perfect snapshot of shared success.
Prompt
poses standing-tall: Joyful, triumphant, celebratory ; Group of friends celebrating a victory in a video game; close-up; Gaming; Neon lights and glowing screens of a gaming setup; cinematic
Characteristic
Shot : Four friends are celebrating a victory in front of a computer screen. The room is lit with red and blue lights.
Aesthetic Score : 0.5
Mood : happy, exciting, competitive
Quality
Entropy : 6.14
Noise : 60
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
A Moment of Solitude Amidst Nature’s Grandeur
A lone figure stands on a cliff, dwarfed by the vast expanse of a bay or fjord. Bathed in warm sunlight, the scene evokes a sense of tranquility and contemplation, highlighting the overwhelming power of nature.
Prompt
poses standing-tall: Awe-struck, contemplative, peaceful ; Tourist standing on a cliff overlooking a breathtaking view; long shot; Tourism; Scenic landscape with rolling hills and a sparkling ocean; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a vast, blue ocean bay, with sunlit mountains in the distance.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, serene
Quality
Entropy : 6.94
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable artifacts or errors
Silhouettes of Love at Sunset
A romantic and nostalgic scene of a couple embracing on a boat deck, silhouetted against a breathtaking sunset. The golden light reflects on the water, creating a serene and dramatic atmosphere.
Prompt
poses standing-tall: Romantic, adventurous, hopeful ; Couple standing on a ship’s deck; medium shot; Travel; Sunset over the ocean with a silhouette of a distant island; cinematic
Characteristic
Shot : A couple silhouetted against a sunset on a boat deck. The couple is looking at each other with affection.
Aesthetic Score : 0.7
Mood : romantic, warm, serene
Quality
Entropy : 6.92
Noise : 65
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight artifacts in the sky, particularly around the sun. There is also some minor color banding in the water.
Silhouettes in Red: A Dance of Mystery and Intensity
A captivating scene unfolds in a dimly lit room bathed in red light. The dancers, their forms silhouetted against the glow, move with a dramatic intensity that evokes a sense of mystery and intrigue. The use of red lighting and silhouettes creates a powerful visual effect, highlighting the dancers’ every movement and drawing the viewer into the heart of the performance.
Prompt
poses standing-tall: Energetic, passionate, expressive ; Group of dancers performing on a stage; wide shot; Groups; Bright stage lights and a cheering audience; cinematic
Characteristic
Shot : A group of people are dancing in a dimly lit room with red lights. The dancers are silhouetted against the lights, creating a dramatic effect.
Aesthetic Score : 0.6
Mood : dramatic, energetic, playful
Quality
Entropy : 6.70
Noise : 67
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as slight blurring around the edges of the dancers.
A Lone Figure in the Vastness of Space
An astronaut, silhouetted against a starry sky, stands on a desolate lunar landscape, gazing at the luminous moon. The scene evokes a sense of loneliness and contemplation, highlighting the vastness of space and the astronaut’s isolation.
Prompt
poses standing-tall: Awe-inspiring, futuristic, surreal ; Astronaut standing on the surface of the moon; long shot; Adventure; Cratered lunar landscape with Earth in the distance; cinematic
Characteristic
Shot : An astronaut in a white spacesuit stands on a desolate, moon-like landscape. The background features a dark sky with stars and a large, brightly lit moon.
Aesthetic Score : 0.7
Mood : lonely, futuristic, awe
Quality
Entropy : 6.34
Noise : 70
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : No visible artifacts or errors in the image.
Firefighter Silhouette Against Blazing Inferno
A dramatic image captures the silhouette of a firefighter standing bravely against a raging fire in an urban setting. The intense flames create a powerful visual contrast, highlighting the seriousness and danger of the situation.
Prompt
poses standing-tall: Brave, determined, selfless ; Firefighter standing in front of a burning building; medium shot; Heroism; Flames and smoke billowing from the building; cinematic
Characteristic
Shot : A silhouette of a firefighter standing in front of a burning building, with flames visible in the background.
Aesthetic Score : 0.7
Mood : dramatic, intense, heroic
Quality
Entropy : 6.81
Noise : 66
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some noise and artifacts due to the low-light conditions, and the silhouette of the firefighter is slightly blurry. There is also some noise in the flame background.
Triumphant Moment Captured: A Single Figure Basking in the Spotlight
A joyous celebration is captured in this image, with a person in a blue shirt holding a trophy aloft, bathed in dramatic lighting. The focus is on the individual’s triumphant moment, while the surrounding crowd fades into a blur, emphasizing the significance of their achievement.
Prompt
poses standing-tall: Triumphant, proud, accomplished ; Gamer holding a trophy after winning a tournament; close-up; Gaming; Crowd cheering and flashing cameras; cinematic
Characteristic
Shot : A man is silhouetted in a concert venue, holding a trophy above his head. He is facing away from the camera, and there is a crowd behind him. The scene is lit by colorful spotlights.
Aesthetic Score : 0.6
Mood : triumphant, celebratory, energetic
Quality
Entropy : 6.63
Noise : 58
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise and blur in the background.
A Family’s Moment of Joy and Wonder on a Majestic Mountaintop
Capture the spirit of adventure and hope as a family of three stands triumphantly on a snow-capped mountain peak, their outstretched arms embracing the breathtaking panorama of a vast mountain range. The clear blue sky and the majestic scenery evoke a sense of awe and freedom, making this a truly inspiring moment.
Prompt
poses standing-tall: Joyful, united, adventurous ; Family standing on a mountain peak; wide shot; Travel; Panoramic view of snow-capped mountains and a clear blue sky; cinematic
Characteristic
Shot : A family of three, two adults and a child, are standing on a mountain top with a majestic view of a mountain range in the background. The adults are looking towards the horizon, while the child is looking at the camera. They are wearing backpacks and casual clothing.
Aesthetic Score : 0.75
Mood : joyful, adventurous, hopeful
Quality
Entropy : 6.70
Noise : 69
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
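The per-image figures listed above can be summarized in a few lines. The sketch below uses the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the ten images in this post; the shortened titles and variable names are illustrative.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI)
results = [
    ("Mountain Peak",          0.70, 0.26, 0.20),
    ("Silhouettes of War",     0.50, 0.27, 0.20),
    ("Neon Nights",            0.50, 0.29, 0.10),
    ("Solitude Amidst Nature", 0.70, 0.27, 0.10),
    ("Silhouettes of Love",    0.70, 0.33, 0.20),
    ("Silhouettes in Red",     0.60, 0.23, 0.20),
    ("Vastness of Space",      0.70, 0.29, 0.80),
    ("Firefighter Silhouette", 0.70, 0.29, 0.30),
    ("Triumphant Moment",      0.60, 0.28, 0.20),
    ("Family on Mountaintop",  0.75, 0.30, 0.20),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai = mean(r[3] for r in results)

# Images whose prompt CLIP score is lowest and highest.
lowest_clip = min(results, key=lambda r: r[2])
highest_clip = max(results, key=lambda r: r[2])

print(f"mean aesthetic score:   {mean_aesthetic:.3f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print(f"lowest CLIP score:  {lowest_clip[0]} ({lowest_clip[2]})")
print(f"highest CLIP score: {highest_clip[0]} ({highest_clip[2]})")
```

The lowest prompt CLIP score belongs to the dancers image, whose prompt asked for bright stage lights and a cheering audience while the result shows a dimly lit room.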
Conclusion
The generative AI model performed reasonably well on shot analysis and came close to the “good” range on camera position, but fell short on aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model is not perfectly capturing the intended camera positions described in the prompts.
- Shot Analysis: The model scored 0.51, which falls within the “good” range. This indicates that the model is generally able to understand the scene descriptions in the prompts and create images that reflect those descriptions.
- Aesthetic Analysis: The model scored 0.12, which is 0.02 above the upper bound of the “very good” range of -0.2 to 0.1. This suggests that the generated images are not matching the aesthetic style described in the prompts as closely as that range requires.
Overall, the model shows promise in understanding scene descriptions and camera positions, but needs improvement in generating images that align with the desired aesthetic.
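The comparisons in the breakdown above can be checked mechanically. The sketch below places each reported score relative to its target range. The scores and the ranges for Camera Position and Aesthetic Analysis come from the text; the range used for Shot Analysis (0.5 to 0.75) is an assumption, since the post only says that 0.51 falls within the “good” range. The `position` helper is illustrative.

```python
def position(score, low, high):
    """Return where a score sits relative to the inclusive range [low, high]."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# metric: (score, range low, range high)
checks = {
    "Camera Position":    (0.45, 0.5, 0.75),
    "Shot Analysis":      (0.51, 0.5, 0.75),  # range assumed, not stated in the post
    "Aesthetic Analysis": (0.12, -0.2, 0.1),
}

for name, (score, low, high) in checks.items():
    where = position(score, low, high)
    # Distance to the nearest bound; zero when the score is inside the range.
    margin = max(low - score, score - high, 0.0)
    print(f"{name}: {score} is {where} [{low}, {high}] (margin {margin:.2f})")
```

Both out-of-range scores miss their target range by a small margin: 0.05 for Camera Position and 0.02 for Aesthetic Analysis.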
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/dev/api