AI's Artistic Struggle: Capturing the Essence of Poses with Bfl-flux-pro
In the realm of artificial intelligence, the ability to generate realistic and aesthetically pleasing images is a coveted skill. One area where AI models are often tested is their ability to capture the essence of poses. This involves not only understanding the physical positioning of the subject but also conveying the emotions, intentions, and overall aesthetic that the pose is meant to evoke. This blog post presents the results of a recent experiment in which a generative AI model was tasked with creating images based on specific poses and scenes. While the model showed a reasonable grasp of shot composition and came close to the target on camera position, it struggled to achieve the desired aesthetic. We will look at the scores behind this discrepancy and at what they suggest for improving the model’s artistic capabilities.
Created with: flux-pro
Silhouetted Against the Sunset: A Hiker’s Moment of Solitude
A lone hiker stands on a rocky outcrop, their silhouette stark against the vibrant hues of a setting sun. The mountains in the distance are bathed in warm light, creating a serene and adventurous scene. This image captures a moment of contemplation and the vastness of nature.
Prompt
poses leaning-back: epic, contemplative ; A lone adventurer, silhouetted against a setting sun; wide shot; adventure; vast, rugged mountain range; cinematic
Characteristic
Shot : A lone hiker stands on a rocky mountain peak, looking out at a dramatic sunset over a vast, mountainous landscape.
Aesthetic Score : 0.75
Mood : serene, inspirational, adventurous
Quality
Entropy : 6.84
Noise : 61
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major errors, some slight noise and grain may be present
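Every prompt in this post follows the same semicolon-separated layout: a pose with its moods, then the subject, shot type, category, setting, and style. The sketch below splits a prompt into those parts so they can be compared with the reported shot and mood. The class, function, and field names are illustrative assumptions, not part of flux-pro or its API.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are illustrative labels chosen for this sketch.
    pose: str
    moods: list[str]
    subject: str
    shot: str
    category: str
    setting: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split 'poses <pose>: <moods> ; <subject>; <shot>; <category>; <setting>; <style>'."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(parts)}")
    head, subject, shot, category, setting, style = parts
    # The first field looks like 'poses leaning-back: epic, contemplative'.
    pose_label, _, mood_text = head.partition(":")
    pose = pose_label.removeprefix("poses").strip()
    moods = [m.strip() for m in mood_text.split(",") if m.strip()]
    return PosePrompt(pose, moods, subject, shot, category, setting, style)


example = parse_pose_prompt(
    "poses leaning-back: epic, contemplative ; A lone adventurer, silhouetted "
    "against a setting sun; wide shot; adventure; vast, rugged mountain range; cinematic"
)
print(example.shot)   # wide shot
print(example.moods)  # ['epic', 'contemplative']
```

Splitting on semicolons rather than commas matters here, because the subject and setting fields contain commas of their own.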
Gotham’s Guardian: A Dramatic Silhouette Against the Cityscape
A powerful image captures the essence of a superhero, likely Batman, standing tall against the backdrop of a sprawling city skyline. The flowing blue cape and dramatic pose evoke a sense of epic power and heroism, leaving a lasting impression of the character’s presence and strength.
Prompt
poses leaning-back: triumphant, powerful ; A superhero, cape billowing in the wind, looking down at a city skyline; medium shot; heroism; bustling cityscape; cinematic
Characteristic
Shot : A superhero, presumably Batman, stands in a city skyline with his cape flowing behind him. He appears to be looking out at the cityscape.
Aesthetic Score : 0.7
Mood : powerful, mysterious, heroic
Quality
Entropy : 6.90
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has a few minor artifacts, such as slight blurring around the edges of the cape and some aliasing in the background.
Sunset Serenity on the Beach
A trio of friends bask in the golden glow of a breathtaking sunset, the warm hues painting the sky with dramatic beauty. Palm trees sway gently in the breeze, adding to the relaxed and tranquil atmosphere of this idyllic scene.
Prompt
poses leaning-back: joyful, carefree ; A group of friends, laughing and relaxing on a beach, watching the sunset; wide shot; tourism; tropical beach with palm trees; cinematic
Characteristic
Shot : Three people sitting on a pink blanket on a beach at sunset, with a palm tree in the foreground.
Aesthetic Score : 0.7
Mood : romantic, serene, relaxed
Quality
Entropy : 6.70
Noise : 91
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Lost in the Game: A Boy’s Intense Focus
A young boy, headphones on, is completely absorbed in his video game. The dramatic lighting and his serious expression capture the intensity of his focus, creating a captivating image of youthful determination.
Prompt
poses leaning-back: intense, focused ; A gamer, eyes glued to a screen, leaning back in a gaming chair, surrounded by controllers and snacks; medium shot; gaming; dimly lit room with neon lights; cinematic
Characteristic
Shot : A young boy is sitting in a chair wearing headphones and playing a video game. The lighting is blue and pink and there is a cup of fries in the foreground.
Aesthetic Score : 0.6
Mood : focused, determined, playful
Quality
Entropy : 6.70
Noise : 58
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background.
Lost in Thought, Bathed in Golden Light
A young woman, her face illuminated by the warm glow of the setting sun, gazes out the window of a moving train. The landscape blurs past, mirroring the contemplative mood she embodies. Her pensive expression suggests a journey of introspection, leaving the viewer to wonder about the thoughts swirling in her mind.
Prompt
poses leaning-back: reflective, nostalgic ; A traveler, gazing out of a train window, watching the scenery pass by; medium shot; travel; rolling hills and fields; cinematic
Characteristic
Shot : A young woman sitting by a window on a train, looking out at the countryside.
Aesthetic Score : 0.7
Mood : reflective, wistful, peaceful
Quality
Entropy : 6.81
Noise : 77
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise, but this is likely due to the natural grain of the film or the digital sensor.
Rockstar’s Fiery Performance Ignites the Crowd
A silhouette of a rockstar, bathed in bright spotlights and surrounded by flames, belts out a song to an energized crowd. The high contrast and dramatic lighting create a powerful and vibrant atmosphere, capturing the intensity of the concert experience.
Prompt
poses leaning-back: energetic, passionate ; A group of musicians, performing on stage, bathed in spotlights; wide shot; groups; concert stage with cheering audience; cinematic
Characteristic
Shot : A silhouette of a singer performing on stage in front of a crowd, with flames behind him. There are other musicians visible in the background.
Aesthetic Score : 0.6
Mood : energetic, passionate, dramatic
Quality
Entropy : 6.51
Noise : 64
Prompt Clip Score : 0.16
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight blurriness in the background, mostly in the flames and some details are not clearly defined. The image is slightly grainy, possibly due to compression.
Contemplation by the Sea
A solitary woman finds peace on a cliff overlooking the vast ocean, with seagulls soaring overhead. The scene evokes a sense of tranquility and contemplation, captured in the woman’s pose and the serene blue hues of the water and sky.
Prompt
poses leaning-back: solitary, contemplative ; A lone figure, sitting on a cliff edge, looking out at a vast ocean; medium shot; adventure; dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A young woman sits on a cliff overlooking the ocean. The sea is choppy and the sky is bright blue. There are some seagulls in the sky.
Aesthetic Score : 0.7
Mood : pensive, contemplative, peaceful
Quality
Entropy : 6.87
Noise : 68
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors
Awe-Inspiring View: Astronauts Embrace the Cosmic Expanse
This image captures the breathtaking beauty of space exploration, with astronauts floating amidst the vastness of the cosmos. A vibrant blue planet and a distant sun create a scene of epic wonder, evoking a sense of hope and the boundless possibilities of the future.
Prompt
poses leaning-back: awe-inspiring, majestic ; A group of astronauts, floating weightlessly in space, looking out at Earth; wide shot; heroism; Earth from space with stars in the background; cinematic
Characteristic
Shot : Astronauts floating in space, near Earth
Aesthetic Score : 0.6
Mood : futuristic, mysterious, adventurous
Quality
Entropy : 6.70
Noise : 97
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some slight artifacting around the astronauts, particularly their helmets and suits.
Family Campfire: A Moment of Warmth and Connection
A heartwarming scene of a family gathered around a campfire in the forest. The warm glow of the setting sun and the flickering flames create a cozy and intimate atmosphere, highlighting the love and connection shared by this family.
Prompt
poses leaning-back: warm, intimate ; A family, gathered around a campfire, sharing stories and laughter; medium shot; groups; forest clearing with a crackling fire; cinematic
Characteristic
Shot : A family of four, including two children, sits by a campfire in a forest. The father is holding the older child on his lap and the mother is holding the younger child.
Aesthetic Score : 0.7
Mood : warm, cozy, happy
Quality
Entropy : 6.78
Noise : 76
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
A Lone Pilot Embraces the Sunset’s Promise
A solitary figure in a futuristic cockpit gazes out at a breathtaking sunset over a majestic mountain range. The warm hues of twilight paint a scene of adventure and anticipation, capturing a moment of hopeful solitude against the vastness of the world.
Prompt
poses leaning-back: exhilarating, adventurous ; A pilot, looking out of the cockpit window, flying over a breathtaking landscape; medium shot; travel; mountains and valleys covered in clouds; cinematic
Characteristic
Shot : A man wearing a helmet is sitting in a cockpit-like vehicle, looking out at a mountain range with a sunset in the background.
Aesthetic Score : 0.7
Mood : dramatic, adventurous, hopeful
Quality
Entropy : 6.53
Noise : 62
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background.
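To compare the ten images at a glance, the per-image numbers above can be aggregated. The sketch below uses only the values reported in the cards; the short labels stand in for the card titles, and the 0.5 cutoff for treating an image as flagged by the AI evaluation is an assumption, since the post does not define one.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI), copied from the cards above.
# The short labels are shorthand for the card titles.
RESULTS = [
    ("Hiker", 0.75, 0.26, 0.10),
    ("Superhero", 0.70, 0.28, 0.80),
    ("Beach", 0.70, 0.30, 0.10),
    ("Gamer", 0.60, 0.26, 0.20),
    ("Train", 0.70, 0.28, 0.20),
    ("Rockstar", 0.60, 0.16, 0.20),
    ("Cliff", 0.70, 0.24, 0.10),
    ("Astronauts", 0.60, 0.26, 0.80),
    ("Campfire", 0.70, 0.27, 0.20),
    ("Pilot", 0.70, 0.22, 0.20),
]

# Assumed cutoff for "flagged as AI-generated"; the post does not define one.
AI_THRESHOLD = 0.5

mean_aesthetic = mean(r[1] for r in RESULTS)
mean_clip = mean(r[2] for r in RESULTS)
weakest_match = min(RESULTS, key=lambda r: r[2])[0]
flagged = [r[0] for r in RESULTS if r[3] >= AI_THRESHOLD]

print(f"mean aesthetic score: {mean_aesthetic:.3f}")   # 0.675
print(f"mean prompt CLIP score: {mean_clip:.3f}")      # 0.253
print(f"lowest prompt CLIP score: {weakest_match}")    # Rockstar
print(f"flagged as likely AI: {flagged}")              # ['Superhero', 'Astronauts']
```

The rockstar image has the lowest prompt CLIP score (0.16), which may reflect that the prompt asked for a group of musicians while the result centers on a single singer’s silhouette.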
Conclusion
The results show that the generative AI model performed reasonably well on shot analysis and came close to the “good” range on camera position, but struggled with aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.45
- Interpretation: This score falls below the “good” range of 0.5 to 0.75. It suggests that the model didn’t perfectly capture the intended camera positions described in the prompt.
Shot Analysis:
- Score: 0.54
- Interpretation: This score falls within the “good” range of 0.5 to 0.75. It indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it to a decent degree.
Aesthetic Analysis:
- Score: 0.11
- Interpretation: This score sits just above the “very good” range of -0.2 to 0.1, exceeding its upper bound by 0.01. It suggests that the generated images’ aesthetic deviated from the expected aesthetic described in the prompt.
Overall:
The model demonstrates a good understanding of shot composition and comes close to the “good” range on camera position, but struggles to achieve the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic preferences into visual outputs.
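Each interpretation in the breakdown amounts to comparing a score with a target range. Here is a minimal sketch of that comparison, using only the scores and ranges quoted in this conclusion. The function names and the inclusive treatment of the range bounds are assumptions made for illustration.

```python
def position_in_range(score: float, low: float, high: float) -> str:
    """Report whether a score is below, within, or above an inclusive range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


def distance_outside(score: float, low: float, high: float) -> float:
    """Distance from the nearest bound, rounded to 2 decimals; 0.0 if inside the range."""
    if score < low:
        return round(low - score, 2)
    if score > high:
        return round(score - high, 2)
    return 0.0


# (score, range low, range high) as quoted in the breakdown above.
SUMMARY = {
    "Camera Position": (0.45, 0.5, 0.75),
    "Shot Analysis": (0.54, 0.5, 0.75),
    "Aesthetic Analysis": (0.11, -0.2, 0.1),
}

for name, (score, low, high) in SUMMARY.items():
    where = position_in_range(score, low, high)
    gap = distance_outside(score, low, high)
    print(f"{name}: {score} is {where} [{low}, {high}] (outside by {gap})")
```

Run as written, this reports camera position as below its range by 0.05, shot analysis as within its range, and aesthetic analysis as above its range by 0.01.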
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://api.bfl.ml/docs#/util/get_result_v1_get_result_get