AI's Artistic Journey: Capturing Poses, But Missing the Angle with Flux-schnell
Dramatic poses are a powerful tool in visual storytelling, conveying emotions and actions through the way a figure is positioned. From the heroic stance of a knight to the contemplative gaze of a traveler, poses can add depth and meaning to any image. This blog post explores the use of dramatic poses in AI-generated art, examining the strengths and limitations of current models in capturing these dynamic elements.
Created with: flux-schnell
A Knight’s Solitary Vigil: Mystery and Majesty Await
A lone knight stands guard on a windswept cliff, his silhouette stark against a backdrop of mist-shrouded mountains and a distant, majestic castle. The scene is both epic and mysterious, evoking a sense of wonder and anticipation for the unknown.
Prompt
poses three-quarter-pose: determined, resolute, heroic ; A lone knight, standing tall on a windswept hilltop; three-quarter pose; Heroism; a vast, stormy landscape with a distant castle in the background; cinematic
Characteristic
Shot : A lone knight, standing on a rocky mountaintop, gazes out towards a distant, imposing castle in the mist.
Aesthetic Score : 0.7
Mood : epic, lonely, dramatic
Quality
Entropy : 6.85
Noise : 67
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be AI generated, with a slight lack of detail in the knight’s armor and the background landscape.
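Every prompt in this post follows the same semicolon-separated pattern: a pose tag with mood words, a subject, the pose again, a theme, a background, and a style keyword. The sketch below rebuilds the knight prompt from those parts. The function name `build_prompt` and its parameter names are hypothetical, chosen only for illustration; the post does not say how its prompts were actually assembled.

```python
def build_prompt(moods, subject, theme, background,
                 pose="three-quarter pose", style="cinematic"):
    """Assemble a prompt in the layout used by the prompts in this post.

    All names here are illustrative assumptions, not the author's tooling.
    """
    # The leading tag uses a hyphenated form of the pose name.
    pose_tag = pose.replace(" ", "-")
    mood_text = ", ".join(moods)
    return (
        f"poses {pose_tag}: {mood_text} ; "
        f"{subject}; {pose}; {theme}; {background}; {style}"
    )


knight_prompt = build_prompt(
    moods=["determined", "resolute", "heroic"],
    subject="A lone knight, standing tall on a windswept hilltop",
    theme="Heroism",
    background="a vast, stormy landscape with a distant castle in the background",
)
print(knight_prompt)
```

Keeping the pose as a single parameter makes it easy to swap in another pose while holding the subject and background fixed, which is one way to check whether the model responds to the pose wording at all.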
Silhouetted Against the Sunset: A Journey Begins
A lone figure, armed and equipped with a map, stands on a hill overlooking a sprawling jungle. The dramatic sunset casts a mysterious silhouette, hinting at an epic adventure to come.
Prompt
poses three-quarter-pose: adventurous, curious, hopeful ; An intrepid explorer, silhouetted against the setting sun, holding a map; three-quarter pose; Adventure; a dense jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : A silhouette of a man in a cowboy hat, holding a map and a stick, stands against a sunset backdrop, likely in a jungle or mountainous region.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, hopeful
Quality
Entropy : 5.78
Noise : 64
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no noticeable image errors.
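The Entropy values listed for each image are not defined in the post. One common reading, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 (a single flat tone) to 8 bits (all 256 levels equally frequent). Under that assumption, a silhouette-heavy image with fewer distinct tones would score lower than a detailed landscape. The helper name `shannon_entropy` is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of pixel intensity values.

    Assumption: the post's "Entropy" metric is of this kind; the post
    itself does not state how the value was computed.
    """
    counts = Counter(pixels)
    total = len(pixels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


# Toy inputs standing in for flattened grayscale images.
flat_image = [128] * 1000          # one tone only
two_tone_image = [0, 255] * 500    # two tones, equally frequent
full_range_image = list(range(256)) * 4  # every level equally frequent

for label, pixels in [("flat", flat_image),
                      ("two-tone", two_tone_image),
                      ("full range", full_range_image)]:
    print(label, shannon_entropy(pixels))
```

To apply this to a real image, pass in a flat list of 8-bit grayscale values obtained from whatever image library is at hand.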
Lost in the Code: A Moment of Intense Focus
A young man, bathed in the soft glow of his computer screen, is completely absorbed in his work. The dimly lit room adds an air of mystery, while the presence of another figure in the background hints at a larger story unfolding. This image captures the essence of focused intensity and the allure of the digital world.
Prompt
poses three-quarter-pose: focused, intense, exhilarated ; A gamer, eyes glued to the screen, fingers flying across the keyboard; three-quarter pose; Gaming; a brightly lit gaming room with neon lights and a futuristic cityscape projected on the wall; cinematic
Characteristic
Shot : A young man is playing a video game in a dark room, with red and blue lights and a large screen behind him.
Aesthetic Score : 0.6
Mood : focused, intense, competitive
Quality
Entropy : 6.67
Noise : 73
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors, but the image has been heavily processed with added contrast and saturation, causing a slightly unnatural look.
Capturing the Eiffel Tower: A Moment of Joy in Paris
A man, radiating happiness and adventure, stands before the iconic Eiffel Tower, camera in hand, capturing the moment. The overcast sky adds a touch of drama, while the bustling city life provides a vibrant backdrop. His blue jacket and backpack suggest a journey filled with exploration and discovery.
Prompt
poses three-quarter-pose: amazed, joyful, curious ; A tourist, gazing in awe at the Eiffel Tower, camera in hand; three-quarter pose; Tourism; a bustling Parisian street with cafes and shops lining the sidewalk; cinematic
Characteristic
Shot : A man is taking a picture of the Eiffel Tower in Paris, France. He is standing on a street and there are other people and cars in the background.
Aesthetic Score : 0.7
Mood : happy, adventurous, travel
Quality
Entropy : 6.27
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight overexposure, which is noticeable in the sky. The man’s right arm is also slightly out of focus.
Conquering the Peak, Embracing the View
A lone hiker stands triumphant on a mountain summit, arms outstretched, taking in the breathtaking panorama of snow-capped peaks. The vastness of the landscape evokes a sense of peace, inspiration, and adventure.
Prompt
poses three-quarter-pose: free, exhilarated, adventurous ; A backpacker, standing on a mountain peak, arms outstretched, enjoying the view; three-quarter pose; Travel; a breathtaking panorama of snow-capped mountains and valleys; cinematic
Characteristic
Shot : A lone hiker stands on a mountain peak, arms outstretched, looking out over a vast, snowy mountain range with a clear blue sky above.
Aesthetic Score : 0.7
Mood : inspiring, triumphant, hopeful
Quality
Entropy : 6.53
Noise : 72
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight amount of noise, particularly in the sky and mountain shadows.
Campfire Glow Under a Starry Sky
A cozy scene of four friends gathered around a campfire, bathed in the warm light against the backdrop of a star-filled night. The peaceful atmosphere and the dramatic contrast between fire and darkness create a sense of intimacy and wonder.
Prompt
poses three-quarter-pose: happy, relaxed, connected ; A group of friends, laughing and sharing stories around a campfire; three-quarter pose; Groups; a serene forest clearing with stars twinkling in the night sky; cinematic
Characteristic
Shot : Four people are sitting around a campfire in the woods at night. The sky is full of stars. The fire is burning brightly. There is a lot of smoke in the air. The trees are dark and silhouetted against the night sky.
Aesthetic Score : 0.7
Mood : cozy, peaceful, adventurous
Quality
Entropy : 6.29
Noise : 88
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, particularly the faces of the people. There is some noise in the image, particularly in the darker areas.
Clash of Titans: Superman and Batman Face Off in Gritty Urban Showdown
A dramatic and intense image captures the tension between Superman and Batman as they stand in a gritty cityscape. The characters’ poses and expressions create a powerful sense of conflict, leaving viewers on the edge of their seats.
Prompt
poses three-quarter-pose: powerful, victorious, confident ; A superhero, standing triumphantly over a defeated villain; three-quarter pose; Heroism; a cityscape with smoke and debris in the background; cinematic
Characteristic
Shot : A man dressed as Superman is standing over another man dressed as Batman in an urban setting
Aesthetic Score : 0.6
Mood : dark, dramatic, intense
Quality
Entropy : 6.85
Noise : 73
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is a bit blurry. The edges of the Superman costume are somewhat pixelated.
Hike to the Top: Adventuring on a Mountain Ridge
Experience the breathtaking beauty of a vast valley and snow-capped peaks as five hikers navigate a narrow mountain ridge. The dramatic scale of the mountains inspires awe and wonder, while the bright blue sky and puffy clouds create a serene and adventurous atmosphere.
Prompt
poses three-quarter-pose: determined, focused, adventurous ; A group of adventurers, navigating a treacherous mountain path; three-quarter pose; Adventure; a rugged mountain range with snow-covered peaks and a deep valley below; cinematic
Characteristic
Shot : Hikers on a mountain ridge overlooking a valley, with snowy peaks in the distance.
Aesthetic Score : 0.7
Mood : epic, adventurous, inspiring
Quality
Entropy : 6.80
Noise : 115
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight compression artifacts, particularly visible in the sky and the distant mountains.
The Intensity of the Game: Young Men Locked in a Digital Battle
A dimly lit room, a table crowded with young men, and the glow of a screen reflecting in their focused faces. This image captures the raw intensity of competitive gaming, where every move matters and the thrill of victory is palpable.
Prompt
poses three-quarter-pose: focused, competitive, excited ; A group of gamers, huddled around a table, strategizing their next move; three-quarter pose; Gaming; a dimly lit room with flickering computer screens and a stack of pizza boxes; cinematic
Characteristic
Shot : A group of young men are gathered around a table, working on computers in a dimly lit room.
Aesthetic Score : 0.6
Mood : focused, concentrated, serious
Quality
Entropy : 6.34
Noise : 58
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, particularly around the edges of the screens and the lighting. There are some halos around the men and the table.
Friendship Under a Grand Archway
Three friends share a joyful moment in front of a majestic European archway, surrounded by vibrant buildings and a clear blue sky. The warm colors and sense of depth create a cheerful and inviting atmosphere.
Prompt
poses three-quarter-pose: happy, joyful, memorable ; A family, standing in front of a famous landmark, smiling for a photo; three-quarter pose; Tourism; a vibrant city square with colorful buildings and street performers; cinematic
Characteristic
Shot : A group of three friends is standing in front of beautiful European architecture. The building has a light color palette and a classical design. The day is sunny and the sky is clear.
Aesthetic Score : 0.7
Mood : happy, friendly, joyful
Quality
Entropy : 6.86
Noise : 91
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise in the background and some unnatural texture on the building.
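To compare the ten images at a glance, the per-image values reported above can be gathered and averaged. The numbers in the sketch below are copied from the listings in this post; the short labels and field names are illustrative assumptions.

```python
from collections import namedtuple
from statistics import mean

ImageMetrics = namedtuple(
    "ImageMetrics", "label aesthetic entropy noise clip ai_likelihood"
)

# Values copied from the per-image listings in this post.
IMAGES = [
    ImageMetrics("knight",       0.7, 6.85,  67, 0.30, 0.80),
    ImageMetrics("explorer",     0.6, 5.78,  64, 0.29, 0.30),
    ImageMetrics("gamer",        0.6, 6.67,  73, 0.26, 0.10),
    ImageMetrics("tourist",      0.7, 6.27,  81, 0.28, 0.20),
    ImageMetrics("hiker",        0.7, 6.53,  72, 0.26, 0.10),
    ImageMetrics("campfire",     0.7, 6.29,  88, 0.29, 0.20),
    ImageMetrics("superhero",    0.6, 6.85,  73, 0.25, 0.30),
    ImageMetrics("ridge hikers", 0.7, 6.80, 115, 0.23, 0.20),
    ImageMetrics("gaming group", 0.6, 6.34,  58, 0.23, 0.20),
    ImageMetrics("archway",      0.7, 6.86,  91, 0.26, 0.20),
]

# Average each numeric field across the ten images.
averages = {
    field: mean(getattr(img, field) for img in IMAGES)
    for field in ImageMetrics._fields[1:]
}
noisiest = max(IMAGES, key=lambda img: img.noise)
most_ai_like = max(IMAGES, key=lambda img: img.ai_likelihood)

for field, value in averages.items():
    print(f"mean {field}: {value:.3f}")
print("highest noise:", noisiest.label)
print("highest AI likelihood:", most_ai_like.label)
```

The mean aesthetic score comes out at about 0.66 and the mean Prompt Clip Score at about 0.265. These averages are separate from the scores quoted in the Conclusion, whose computation the post does not describe.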
Conclusion
The results show that the generative AI model handled shot content and aesthetics noticeably better than camera position, which is where it struggled most.
Here’s a breakdown:
- Camera Position Analysis: The score of 0.2 indicates that the model did a poor job of understanding and implementing the camera position specified in the prompt. A score between 0.5 and 0.75 would be considered good, and above 0.75 would be very good.
- Shot Analysis: The score of 0.49 falls just short of the 0.5 threshold, so the model came close to, but did not quite reach, a good rendering of the scene described in the prompt. A score between 0.5 and 0.75 would be considered good, and above 0.75 would be very good.
- Aesthetic Analysis: The score of 0.37 indicates that the generated image’s aesthetic was reasonably close to the expected aesthetic, although it lies outside the range of -0.2 to 0.1 that would be considered very good.
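The bands quoted in the breakdown can be written as a small helper, which makes it easy to see where each score lands. This is a minimal sketch: the function names are hypothetical, and because the post names no band for scores under 0.5, the "below good" label is an assumption.

```python
def rate_score(score: float) -> str:
    """Map a camera-position or shot score to the bands quoted in the post."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    # The post names no band below 0.5; this label is an assumption.
    return "below good"


def aesthetic_is_very_good(score: float) -> bool:
    """The aesthetic score uses its own range: -0.2 to 0.1 is very good."""
    return -0.2 <= score <= 0.1


for name, score in [("camera position", 0.2), ("shot", 0.49)]:
    print(f"{name}: {score} -> {rate_score(score)}")
print(f"aesthetic: 0.37 -> very good? {aesthetic_is_very_good(0.37)}")
```

Both the camera position score and the shot score fall under the 0.5 threshold, but the shot score misses it by 0.01 while the camera position score misses it by 0.3, which matches the post's overall reading.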
Overall, the model seems to be better at understanding the scene and creating an aesthetically pleasing image than it is at accurately implementing camera positions.
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/schnell/api