AI's Artistic Journey: Capturing Poses, But Missing the Essence with Flux-dev
Text-to-image models have become increasingly good at following prompts, but following instructions and achieving a desired look are different things. This post presents the results of an experiment in which flux-dev was asked to generate images from prompts that specify a pose and a scene description. The model handled technical aspects such as camera position and scene composition reasonably well, while capturing the intended aesthetic remained a challenge.
Created with: flux-dev
Silhouetted Against Hope: A Hiker’s Sunset Moment
A lone hiker stands on a mountain ridge, their silhouette stark against the vibrant hues of a sunset. The scene evokes a sense of serenity, contemplation, and hope, highlighting the vastness of nature and the smallness of the individual within it.
Prompt
poses leaning-back: epic, contemplative ; A lone adventurer, silhouetted against a setting sun; wide shot; adventure; vast, rugged mountain range; cinematic
Characteristic
Shot : A lone hiker stands silhouetted against a vibrant orange sunset, overlooking a vast mountain range.
Aesthetic Score : 0.7
Mood : serene, contemplative, adventurous
Quality
Entropy : 5.78
Noise : 32
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors in the image.
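The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a completely flat image) to 8 bits (all 256 gray levels equally frequent); the 5.78 reported above would fit that range. A minimal sketch under that assumption, where `shannon_entropy` is a hypothetical helper and not part of the original pipeline:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a flat sequence of gray values.

    Assumption: the blog's "Entropy" metric is histogram entropy of this kind.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    flat_image = [128] * 64            # one gray level only
    full_range = list(range(256))      # every gray level exactly once
    print(shannon_entropy(flat_image))
    print(shannon_entropy(full_range))
```

Under this definition a higher value means the gray levels are spread more evenly, which says something about tonal variety but nothing about whether the image matches the prompt.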
Silhouetted Hero, Hopeful Future
A lone figure, cloaked in heroism, stands atop a rock, gazing towards a distant city skyline. The setting sun casts a dramatic silhouette, highlighting the man’s power and the promise of a brighter tomorrow.
Prompt
poses leaning-back: triumphant, powerful ; A superhero, cape billowing in the wind, looking down at a city skyline; medium shot; heroism; bustling cityscape; cinematic
Characteristic
Shot : A man in a superhero costume stands on a rock with a city skyline in the background. The sun is setting in the distance.
Aesthetic Score : 0.7
Mood : dramatic, hopeful, powerful
Quality
Entropy : 6.50
Noise : 50
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background. Some of the colors are slightly oversaturated, particularly the red cape.
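The prompts in this post share one semicolon-delimited layout: a pose with mood words, then a subject, a shot type, a theme, a setting, and a style. The sketch below splits a prompt into those parts. The field names (`pose`, `moods`, `subject`, `shot`, `theme`, `setting`, `style`) are labels assumed for illustration; the post itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are illustrative assumptions, not taken from the post.
    pose: str
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses <pose>: <moods> ; subject; shot; theme; setting; style' prompt."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments, got {len(parts)}")
    head, subject, shot, theme, setting, style = parts
    # The first segment looks like "poses leaning-back: epic, contemplative".
    pose_part, _, mood_part = head.partition(":")
    pose = pose_part.strip().removeprefix("poses").strip()
    moods = [m.strip() for m in mood_part.split(",") if m.strip()]
    return PosePrompt(pose, moods, subject, shot, theme, setting, style)


if __name__ == "__main__":
    example = (
        "poses leaning-back: triumphant, powerful ; A superhero, cape billowing "
        "in the wind, looking down at a city skyline; medium shot; heroism; "
        "bustling cityscape; cinematic"
    )
    print(parse_prompt(example))
```

Splitting the prompt this way makes it easier to compare each requested element, such as the shot type, against what the evaluation reports for the generated image.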
Silhouettes of Love at Sunset
A romantic scene unfolds on a sandy beach as three figures are silhouetted against a breathtaking sunset. The serene and relaxed mood is enhanced by the dramatic effect of the golden light, creating a captivating image.
Prompt
poses leaning-back: joyful, carefree ; A group of friends, laughing and relaxing on a beach, watching the sunset; wide shot; tourism; tropical beach with palm trees; cinematic
Characteristic
Shot : Three friends are sitting on the beach at sunset. They are all facing the sunset and appear to be enjoying the view.
Aesthetic Score : 0.6
Mood : relaxed, happy, romantic
Quality
Entropy : 6.40
Noise : 57
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable errors in the image.
Lost in the Game: A Gamer’s Focus Under Neon Lights
A young gamer, bathed in blue and pink lighting, is completely immersed in their game. The intensity of their focus and the dramatic lighting create a captivating scene of pure gaming passion.
Prompt
poses leaning-back: intense, focused ; A gamer, eyes glued to a screen, leaning back in a gaming chair, surrounded by controllers and snacks; medium shot; gaming; dimly lit room with neon lights; cinematic
Characteristic
Shot : A young person is sitting in a gaming chair in front of a computer. They are wearing headphones and looking at the screen, seemingly engrossed in a game.
Aesthetic Score : 0.6
Mood : focused, concentrated, intense
Quality
Entropy : 5.92
Noise : 56
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors. The image is slightly blurred, but not excessively.
A Moment of Quiet Reflection
A woman finds solace in the passing scenery, bathed in soft light, as the train carries her through a muted landscape. The contrast between the bright window and the dark interior evokes a sense of isolation and contemplation.
Prompt
poses leaning-back: reflective, nostalgic ; A traveler, gazing out of a train window, watching the scenery pass by; medium shot; travel; rolling hills and fields; cinematic
Characteristic
Shot : A young woman sits by the window of a train looking out at a blurry landscape.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, wistful
Quality
Entropy : 6.06
Noise : 55
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, particularly in the background.
Silhouettes of Sound: A Black and White Concert Experience
This black and white image captures the raw energy of a live concert, with the band silhouetted against the stage lights and a sea of faces in the crowd. The dramatic backlighting and use of silhouettes create a sense of mystery and intensity, highlighting the raw power of the music.
Prompt
poses leaning-back: energetic, passionate ; A group of musicians, performing on stage, bathed in spotlights; wide shot; groups; concert stage with cheering audience; cinematic
Characteristic
Shot : A black and white photo of a concert with a band performing on stage and a crowd of people in the audience
Aesthetic Score : 0.6
Mood : energetic, dramatic, intimate
Quality
Entropy : 6.42
Noise : 53
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is some noise in the image, particularly in the shadows.
Contemplating the Storm: A Solitary Figure on the Cliff’s Edge
A lone figure finds solace amidst the dramatic backdrop of a stormy ocean. The vastness of the seascape and the choppy waves create a sense of isolation and contemplation, capturing a moment of quiet reflection against the forces of nature.
Prompt
poses leaning-back: solitary, contemplative ; A lone figure, sitting on a cliff edge, looking out at a vast ocean; medium shot; adventure; dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A lone figure sits on a cliff overlooking a choppy sea, with a misty sky and distant cliffs in the background.
Aesthetic Score : 0.7
Mood : solitude, contemplative, dramatic
Quality
Entropy : 6.39
Noise : 78
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some slight noise in the image, particularly in the sky.
A Cosmic Odyssey: Five Astronauts Embark on a Journey of Hope
This breathtaking image captures the essence of space exploration, with five astronauts in their spacesuits floating against the backdrop of a vibrant Earth and a star-studded sky. The dynamic composition suggests a sense of forward motion, hinting at the adventure that lies ahead. The mood is both epic and hopeful, inspiring awe and wonder at the vastness of the universe.
Prompt
poses leaning-back: awe-inspiring, majestic ; A group of astronauts, floating weightlessly in space, looking out at Earth; wide shot; heroism; Earth from space with stars in the background; cinematic
Characteristic
Shot : Five astronauts in spacesuits are floating in space, facing the camera. The background is a dark blue space with a few stars, and a bright planet Earth visible below. The astronauts are illuminated by a bright light source, possibly a nearby star or planet.
Aesthetic Score : 0.6
Mood : mysterious, hopeful, adventurous
Quality
Entropy : 6.62
Noise : 92
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts, such as the slight blurriness of the astronauts’ faces and the lack of detail in the background. The colors are a bit too saturated and the overall image is a bit too dark.
Campfire Companionship: A Cozy Gathering in the Woods
A group of four friends huddle around a crackling campfire, bathed in the warm glow of the flames. The scene exudes a sense of cozy intimacy and friendly camaraderie, making it a perfect picture of a relaxing evening in the wilderness.
Prompt
poses leaning-back: warm, intimate ; A family, gathered around a campfire, sharing stories and laughter; medium shot; groups; forest clearing with a crackling fire; cinematic
Characteristic
Shot : A group of four young adults are gathered around a campfire in a forest, enjoying each other’s company.
Aesthetic Score : 0.6
Mood : warm, cozy, friendship
Quality
Entropy : 6.46
Noise : 83
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and has a few minor artifacts, particularly in the dark areas.
Awe-Inspiring Views from the Clouds
A serene and contemplative moment captured from a small aircraft window, showcasing breathtaking mountain peaks and fluffy clouds. The contrast between the dark interior and the bright, expansive view outside evokes a sense of wonder and adventure.
Prompt
poses leaning-back: exhilarating, adventurous ; A pilot, looking out of the cockpit window, flying over a breathtaking landscape; medium shot; travel; mountains and valleys covered in clouds; cinematic
Characteristic
Shot : A person is looking out the window of an airplane at a mountain range with snow capped peaks. The person is wearing a headset, suggesting that they are a pilot.
Aesthetic Score : 0.7
Mood : serene, contemplative, adventurous
Quality
Entropy : 6.23
Noise : 61
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and the colors are a bit washed out. The detail in the mountains is lost due to the exposure.
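To see the ten results side by side, the per-image numbers reported above can be aggregated with a few lines of code. The sketch below copies the values from this post; the short labels and the tuple layout are chosen here for illustration only.

```python
from statistics import mean

# (label, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
# Values are copied from the per-image sections of this post.
RESULTS = [
    ("Hiker", 0.7, 5.78, 32, 0.27, 0.10),
    ("Superhero", 0.7, 6.50, 50, 0.25, 0.20),
    ("Beach", 0.6, 6.40, 57, 0.29, 0.20),
    ("Gamer", 0.6, 5.92, 56, 0.28, 0.20),
    ("Train", 0.6, 6.06, 55, 0.27, 0.20),
    ("Concert", 0.6, 6.42, 53, 0.23, 0.30),
    ("Cliff", 0.7, 6.39, 78, 0.24, 0.20),
    ("Astronauts", 0.6, 6.62, 92, 0.26, 0.80),
    ("Campfire", 0.6, 6.46, 83, 0.27, 0.20),
    ("Pilot", 0.7, 6.23, 61, 0.27, 0.10),
]

mean_aesthetic = mean(row[1] for row in RESULTS)
mean_clip = mean(row[4] for row in RESULTS)
# The image the evaluator considered most likely to be AI-generated.
most_ai_like = max(RESULTS, key=lambda row: row[5])[0]

if __name__ == "__main__":
    print(f"mean aesthetic score: {mean_aesthetic:.2f}")
    print(f"mean prompt CLIP score: {mean_clip:.3f}")
    print(f"highest likelihood of AI: {most_ai_like}")
```

On these figures the mean aesthetic score is 0.64 and the mean prompt CLIP score is about 0.26, with the astronaut image standing out at an AI likelihood of 0.80.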
Conclusion
The generative AI model handled scene composition well and camera position moderately well, but struggled to achieve the desired aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.35, indicating a moderate ability to follow the camera position instructions in the prompt. This rates as okay: it falls below the 0.5 to 0.75 range considered good, and well below the 0.75 threshold for very good.
- Shot Analysis: The model scored 0.53, indicating a good ability to understand the scene composition described in the prompt. A score between 0.5 and 0.75 is considered good, and above 0.75 very good.
- Aesthetic Analysis: The model scored 0.16, indicating a noticeable difference between the expected aesthetic and the actual aesthetic of the generated images. This rates as okay, because it falls just outside the -0.2 to 0.1 range considered very good.
Overall, the model shows promise in understanding the technical aspects of the prompt, but needs improvement in capturing the desired aesthetic.
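The rating bands quoted in the breakdown can be written down as two small functions. This is a sketch of the thresholds as stated in this post, not the evaluation code itself: the function names are hypothetical, and treating every score outside the named bands as "okay" is an assumption, since the post only gives that label for the specific scores 0.35 and 0.16.

```python
def rate_match(score: float) -> str:
    """Rate a camera position or shot analysis score.

    Bands from the post: 0.5 to 0.75 is good, above 0.75 is very good.
    Assumption: anything below 0.5 is labeled "okay".
    """
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    return "okay"


def rate_aesthetic(score: float) -> str:
    """Rate an aesthetic analysis score.

    Band from the post: -0.2 to 0.1 is very good.
    Assumption: anything outside that band is labeled "okay".
    """
    return "very good" if -0.2 <= score <= 0.1 else "okay"


if __name__ == "__main__":
    print("camera position:", rate_match(0.35))
    print("shot analysis:", rate_match(0.53))
    print("aesthetic analysis:", rate_aesthetic(0.16))
```

Note that the aesthetic band is centered near zero, so for that metric a smaller absolute value is better, whereas for the other two a higher score is better.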
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/dev/api