AI's Artistic Journey: Capturing Poses, But Missing the Scene with Flux-schnell
AI-generated art is evolving rapidly, with models that create striking visuals from text prompts. Faithful translation of a prompt into an image, however, is still a work in progress. This blog post looks at how well an AI model captures poses within scenes, highlighting its strengths and weaknesses. We’ll go through the results of a recent experiment with flux-schnell, analyzing how well the model understood scene descriptions and translated them into images.
Dramatic poses, often used in photography and film to convey emotion and action, are a key element in storytelling. Think of the iconic silhouette of a lone figure against a sunset, or the powerful stance of a superhero overlooking a city. These poses are instantly recognizable and evoke strong emotions. But can AI capture these nuances effectively?
Created with: flux-schnell
Silhouetted Hope: A Lone Figure Contemplates the Sunset
A solitary figure stands on a mountain peak, their silhouette stark against the vibrant orange sunset. The scene evokes a sense of serenity and contemplation, hinting at a hopeful outlook towards the future. The dramatic contrast between the bright sky and the dark figure creates a powerful image of isolation and introspection.
Prompt
poses leaning-back: epic, contemplative ; A lone adventurer, silhouetted against a setting sun; wide shot; adventure; vast, rugged mountain range; cinematic
Characteristic
Shot : A silhouette of a man with a backpack standing on a mountain top at sunset
Aesthetic Score : 0.7
Mood : serene, contemplative, hopeful
Quality
Entropy : 6.34
Noise : 47
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
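Every prompt in this post appears to follow the same semicolon-separated template: a pose with mood words, then the subject, the shot type, a theme, the setting, and a style. The sketch below splits a prompt into those parts. The field names (`pose`, `moods`, `subject`, `shot`, `theme`, `setting`, `style`) and the `parse_prompt` helper are assumptions made for illustration, not something the experiment itself defines.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScenePrompt:
    # Field names are hypothetical labels for the six ';'-separated parts.
    pose: str
    moods: List[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a prompt of the form used in this post into named fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    head, subject, shot, theme, setting, style = parts

    # The first field looks like "poses <pose>: <mood>, <mood>".
    label, _, mood_text = head.partition(":")
    label = label.strip()
    if label.startswith("poses"):
        label = label[len("poses"):].strip()
    moods = [mood.strip() for mood in mood_text.split(",") if mood.strip()]

    return ScenePrompt(label, moods, subject, shot, theme, setting, style)


if __name__ == "__main__":
    example = (
        "poses leaning-back: epic, contemplative ; A lone adventurer, "
        "silhouetted against a setting sun; wide shot; adventure; "
        "vast, rugged mountain range; cinematic"
    )
    print(parse_prompt(example))
```

Separating the `shot` field this way makes it easy to compare the requested framing (wide shot, medium shot) with what the generated image actually shows.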
Heroic Stance, City Below: A Moment of Hope
A superhero, cloaked in a billowing cape, stands on a rooftop overlooking a sprawling city. The scene evokes a sense of drama and hope, suggesting a hero ready to face any challenge. The dramatic lighting and the hero’s confident stance create a powerful image of courage and determination.
Prompt
poses leaning-back: triumphant, powerful ; A superhero, cape billowing in the wind, looking down at a city skyline; medium shot; heroism; bustling cityscape; cinematic
Characteristic
Shot : A man in a superhero cape stands on a rooftop overlooking a city skyline. He’s looking toward the horizon, the city is behind him. The city’s buildings are out of focus.
Aesthetic Score : 0.7
Mood : determined, powerful, hopeful
Quality
Entropy : 6.81
Noise : 74
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors in the image
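The post does not say how the Entropy figure under Quality is computed. One common definition is the Shannon entropy of the image's grey-level histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 grey levels equally likely). The reported values between 5.74 and 6.86 would be consistent with that definition, but treating it as the metric used here is an assumption. A minimal sketch, with `shannon_entropy` as a hypothetical helper that takes a flat list of 8-bit grey levels:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of grey levels."""
    total = len(pixels)
    if total == 0:
        raise ValueError("no pixels given")
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the grey levels that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000             # one tone only
    spread = list(range(256)) * 4   # every grey level equally often
    print(shannon_entropy(flat), shannon_entropy(spread))
```

Under this definition, a higher value means the tones are spread more evenly across the histogram; it says nothing on its own about whether the image matches the prompt.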
Sunset Serenity on the Beach
A group of friends bask in the golden glow of a breathtaking sunset, their laughter echoing through the air as they soak up the relaxed and carefree atmosphere. The palm tree silhouette against the vibrant sky adds a touch of tropical paradise to this idyllic scene.
Prompt
poses leaning-back: joyful, carefree ; A group of friends, laughing and relaxing on a beach, watching the sunset; wide shot; tourism; tropical beach with palm trees; cinematic
Characteristic
Shot : Five friends sitting on a beach, looking at the ocean.
Aesthetic Score : 0.6
Mood : relaxed, friendly, summery
Quality
Entropy : 6.68
Noise : 63
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant artifacts or errors in the image.
Lost in Thought: A Gamer’s Moment of Contemplation
A young man, bathed in soft pink and blue light, sits lost in thought in his gaming chair. The scene evokes a sense of tranquility and introspection, highlighting the quiet moments of reflection that can occur even amidst the excitement of gaming.
Prompt
poses leaning-back: intense, focused ; A gamer, eyes glued to a screen, leaning back in a gaming chair, surrounded by controllers and snacks; medium shot; gaming; dimly lit room with neon lights; cinematic
Characteristic
Shot : A young man wearing headphones sits in a gaming chair, looking up and to the right. The room is dimly lit with colorful lighting, suggesting a gaming setup.
Aesthetic Score : 0.6
Mood : relaxed, contemplative, focused
Quality
Entropy : 5.74
Noise : 51
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Golden Hour Reflections: A Moment of Tranquility on the Train
A man, silhouetted against a window, gazes out at the blurring countryside as the sun sets. The warm, golden light evokes a sense of tranquility and contemplation, capturing a moment of quiet reflection amidst the journey.
Prompt
poses leaning-back: reflective, nostalgic ; A traveler, gazing out of a train window, watching the scenery pass by; medium shot; travel; rolling hills and fields; cinematic
Characteristic
Shot : A man is sitting in a train looking out the window at the passing landscape. It is sunset and the sun is shining on the countryside.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, peaceful
Quality
Entropy : 6.76
Noise : 53
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a bit blurry, especially in the background.
Energetic Band Rocks the Stage with Dramatic Lighting
Capture the vibrant energy of a live performance with this image. A band takes the stage, their music electrifying the crowd. The stage lights create a dramatic effect, highlighting the band and the audience in a lively and casual atmosphere.
Prompt
poses leaning-back: energetic, passionate ; A group of musicians, performing on stage, bathed in spotlights; wide shot; groups; concert stage with cheering audience; cinematic
Characteristic
Shot : A band performing on stage in front of a crowd
Aesthetic Score : 0.6
Mood : energetic, hopeful, lively
Quality
Entropy : 6.77
Noise : 75
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some noise in the image, particularly in the darker areas.
A Moment of Solitude on the Edge of the World
A lone figure contemplates the vastness of the ocean, finding peace amidst the swirling clouds and crashing waves. The scene evokes a sense of serenity and the humbling realization of our place in the grand scheme of things.
Prompt
poses leaning-back: solitary, contemplative ; A lone figure, sitting on a cliff edge, looking out at a vast ocean; medium shot; adventure; dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A solitary figure sits on a cliff overlooking a vast, stormy ocean. The horizon is obscured by heavy clouds and the waves are crashing against the rocky shore. The image is captured from a high vantage point, looking down at the man and the sea.
Aesthetic Score : 0.8
Mood : serene, contemplative, dramatic
Quality
Entropy : 6.71
Noise : 87
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors.
Lost in the Cosmic Dance: A Journey of Hope and Wonder
Three astronauts, adrift in the vast expanse of space, gaze towards a distant planet. The composition evokes a sense of mystery and adventure, with the astronauts seemingly floating towards the viewer, creating a captivating sense of depth and perspective. This image captures the hopeful spirit of exploration, reminding us of the boundless possibilities that lie beyond our world.
Prompt
poses leaning-back: awe-inspiring, majestic ; A group of astronauts, floating weightlessly in space, looking out at Earth; wide shot; heroism; Earth from space with stars in the background; cinematic
Characteristic
Shot : Three astronauts are floating in space, against a backdrop of a blue planet and a starry night sky.
Aesthetic Score : 0.7
Mood : futuristic, adventurous, hopeful
Quality
Entropy : 6.42
Noise : 95
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.50
Image errors : No notable artifacts or errors observed.
Campfire Gathering: Warmth, Friendship, and Forest Tranquility
A cozy scene unfolds around a crackling campfire in the heart of a forest. Three adults and a child, their faces illuminated by the dancing flames, share a moment of warmth and connection. The forest setting adds a sense of peace and tranquility, creating a truly inviting atmosphere.
Prompt
poses leaning-back: warm, intimate ; A family, gathered around a campfire, sharing stories and laughter; medium shot; groups; forest clearing with a crackling fire; cinematic
Characteristic
Shot : A group of four people are gathered around a campfire in a wooded area. The fire is in the foreground, and the people are sitting on the ground behind it. The people are all smiling and appear to be enjoying themselves.
Aesthetic Score : 0.7
Mood : cozy, heartwarming, playful
Quality
Entropy : 6.86
Noise : 89
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No obvious errors.
Finding Serenity Above the Clouds
A man, lost in the moment, gazes out the airplane window at a breathtaking vista of clouds and mountains. His calm demeanor and the vastness of the landscape evoke a sense of tranquility and awe.
Prompt
poses leaning-back: exhilarating, adventurous ; A pilot, looking out of the cockpit window, flying over a breathtaking landscape; medium shot; travel; mountains and valleys covered in clouds; cinematic
Characteristic
Shot : A man in a white shirt and headphones is sitting in a small aircraft, looking out the window at a breathtaking view of fluffy clouds and distant mountains.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, hopeful
Quality
Entropy : 5.89
Noise : 80
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed and has some noise in the shadows. There are also some slight artifacts in the window.
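To compare the ten images side by side, the per-image numbers above can be collected into one table and averaged. The sketch below copies the values from the result blocks (titles are abbreviated); the `summarize` and `extremes` helpers and the column names are assumptions made for illustration.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, prompt CLIP score, likelihood of AI),
# copied from the ten result blocks in this post; titles are shortened.
RESULTS = [
    ("Silhouetted Hope", 0.7, 6.34, 47, 0.27, 0.10),
    ("Heroic Stance", 0.7, 6.81, 74, 0.24, 0.10),
    ("Sunset Serenity", 0.6, 6.68, 63, 0.31, 0.20),
    ("Lost in Thought", 0.6, 5.74, 51, 0.30, 0.20),
    ("Golden Hour Reflections", 0.6, 6.76, 53, 0.26, 0.20),
    ("Energetic Band", 0.6, 6.77, 75, 0.23, 0.10),
    ("A Moment of Solitude", 0.8, 6.71, 87, 0.25, 0.10),
    ("Lost in the Cosmic Dance", 0.7, 6.42, 95, 0.25, 0.50),
    ("Campfire Gathering", 0.7, 6.86, 89, 0.28, 0.20),
    ("Finding Serenity Above the Clouds", 0.7, 5.89, 80, 0.25, 0.20),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(results):
    """Return the mean of each numeric column, keyed by field name."""
    columns = zip(*(row[1:] for row in results))
    return {name: mean(column) for name, column in zip(FIELDS, columns)}


def extremes(results, column):
    """Return the (lowest, highest) titles for a column index of the rows."""
    ordered = sorted(results, key=lambda row: row[column])
    return ordered[0][0], ordered[-1][0]


if __name__ == "__main__":
    for name, value in summarize(RESULTS).items():
        print(f"{name:>14}: {value:.3f}")
    print("prompt CLIP score, lowest/highest:", extremes(RESULTS, 4))
```

On these numbers the concert image has the lowest Prompt Clip Score (0.23) and the beach image the highest (0.31).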
Conclusion
The results show that the generative AI model stayed close to the aesthetic the prompts asked for, but struggled with the scene and the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t fully capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.49, which is just below the “good” range. This suggests that the model only partly understood the scene described in the prompt, so the image does not fully reflect it.
- Aesthetic Analysis: The model scored 0.14, which sits just above the “very good” range of -0.2 to 0.1. The generated image’s aesthetic therefore stayed close to the aesthetic described in the prompt, even though the score is not strictly inside that range.
Overall, the model seems to be better at understanding the aesthetic aspects of the prompt than the scene and camera position. It might need further training to improve its ability to accurately interpret and translate camera positions and scene descriptions into images.
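The comparison in the breakdown can be checked with a few lines of arithmetic. The sketch below tests each reported score against the range quoted for it. It assumes that Shot Analysis uses the same “good” range of 0.5 to 0.75 as Camera Position, which the post implies but does not state; the `Check` type and the `position` helper are hypothetical names.

```python
from typing import NamedTuple


class Check(NamedTuple):
    name: str
    score: float
    low: float
    high: float
    label: str


# Scores and ranges as quoted in the breakdown. The Shot Analysis range
# is an assumption: the post only says it is "also below the good range".
CHECKS = [
    Check("Camera Position", 0.30, 0.5, 0.75, "good"),
    Check("Shot Analysis", 0.49, 0.5, 0.75, "good"),
    Check("Aesthetic Analysis", 0.14, -0.2, 0.1, "very good"),
]


def position(check: Check) -> str:
    """Say whether the score is below, within, or above its range."""
    if check.score < check.low:
        return "below"
    if check.score > check.high:
        return "above"
    return "within"


if __name__ == "__main__":
    for check in CHECKS:
        print(
            f"{check.name}: {check.score} is {position(check)} the "
            f"'{check.label}' range [{check.low}, {check.high}]"
        )
```

Taken at face value, the camera position and shot scores fall below their range, while the aesthetic score of 0.14 lands slightly above the upper bound of 0.1.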
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/schnell/api