AI's Artistic Struggle: Capturing the Essence of Poses with Flux-schnell
Generating images from textual descriptions is a rapidly evolving area of artificial intelligence. This blog post presents the results of an experiment in which an AI model was asked to create images from specific scene descriptions, with a focus on the poses of the subjects in those scenes. The experiment reveals both strengths and weaknesses: the model is comparatively good at shot analysis and comes close on camera positioning, but it struggles to capture the desired aesthetic. These results illustrate the ongoing challenges in AI’s artistic capabilities and the room for future improvement.
Created with: flux-schnell
Silhouetted Solitude at Sunset
A lone figure, silhouetted against the fiery hues of a setting sun, evokes a sense of calm contemplation. The dramatic lighting emphasizes their smallness against the vastness of the landscape, creating a powerful image of solitude and introspection.
Prompt
poses profile: Epic, hopeful, determined ; A lone figure, silhouetted against a setting sun; wide shot; Heroism; A vast, mountainous landscape; cinematic
Characteristic
Shot : A lone figure stands silhouetted against a vibrant sunset, gazing towards the horizon, holding a hiking stick, with a backpack on their back. There’s a sense of solitude and contemplation.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, serene
Quality
Entropy : 6.40
Noise : 38
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors in the image.
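The prompts in this post all follow the same semicolon-separated layout, starting with "poses profile:". The sketch below splits such a prompt into its parts. The field names (`moods`, `subject`, `shot`, `category`, `setting`, `style`) are hypothetical labels chosen for illustration; the post itself does not name the fields.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are assumptions made for this sketch, not terms from the post.
    moods: list[str]
    subject: str
    shot: str
    category: str
    setting: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses profile: ...' prompt into its six semicolon-separated parts."""
    prefix = "poses profile:"
    if not prompt.startswith(prefix):
        raise ValueError("prompt does not start with 'poses profile:'")
    parts = [p.strip() for p in prompt[len(prefix):].split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    # The first part is a comma-separated list of mood words.
    moods = [m.strip() for m in parts[0].split(",")]
    return PosePrompt(moods, *parts[1:])


example = (
    "poses profile: Epic, hopeful, determined ; A lone figure, silhouetted "
    "against a setting sun; wide shot; Heroism; A vast, mountainous landscape; cinematic"
)
parsed = parse_pose_prompt(example)
print(parsed.shot, "|", parsed.moods)
```

Splitting the prompt this way makes it easier to compare the requested shot type and moods with the Shot and Mood values reported for each image.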
Lost in the Majesty: A Hiker’s Moment of Awe
A solitary figure stands on a precipice, dwarfed by the breathtaking panorama of mountains and a cascading waterfall. The scene evokes a sense of tranquility, adventure, and the humbling power of nature.
Prompt
poses profile: Adventurous, free-spirited, awe-inspired ; A backpacker standing on a cliff edge, looking out at a breathtaking view; medium shot; Adventure; A sprawling valley with cascading waterfalls; cinematic
Characteristic
Shot : A lone hiker stands on a cliff edge overlooking a vast valley with a waterfall in the distance. The sun is shining and the sky is blue.
Aesthetic Score : 0.8
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.75
Noise : 79
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors
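The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy (in bits) of the image's 8-bit grayscale histogram, which ranges from 0 for a flat image to 8 when all 256 levels are equally frequent. Under that assumption, a minimal sketch looks like this:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of grayscale values (0-255)."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two toy "images" given as flat lists of pixel values.
flat = [128] * 1000         # a single gray level
uniform = list(range(256))  # every gray level exactly once

print(shannon_entropy(flat), shannon_entropy(uniform))
```

If the reported values follow this definition, scores between roughly 5.8 and 6.9 would indicate images that use a broad spread of tones without reaching the 8-bit maximum.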
Lost in the Game: A Gamer’s Intense Focus Under Neon Lights
A young man is immersed in a virtual world, his face illuminated by the blue and purple glow of his gaming setup. The intensity of his focus is palpable, creating a sense of suspense and anticipation. The image captures the thrill of the game, leaving the viewer wondering what challenges lie ahead.
Prompt
poses profile: Focused, intense, passionate ; A gamer’s hands, illuminated by the glow of a monitor, holding a controller; close-up; Gaming; A dimly lit room with gaming posters on the walls; cinematic
Characteristic
Shot : A person is playing a video game in a dimly lit room with a headset on. The image is cropped in a way that only shows the person’s hands holding a controller, the lower part of their face and their shoulder.
Aesthetic Score : 0.5
Mood : intense, focused, determined
Quality
Entropy : 6.50
Noise : 59
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background. There are also some digital artifacts present around the edges of the person’s head and the controller. The image appears a bit grainy.
Lost in the City’s Embrace: A Moment of Wonder in a European City
A young man wearing blue and glasses stands amidst the bustling streets of a European city, his gaze drawn upwards to the intricate details of an ancient building. The scene evokes a sense of pensive contemplation, capturing the beauty and history of urban life.
Prompt
poses profile: Curious, excited, appreciative ; A tourist gazing up at a majestic cathedral; medium shot; Tourism; A bustling city square with cobblestone streets; cinematic
Characteristic
Shot : A man is looking up at a building in a European city, with other people walking by in the background.
Aesthetic Score : 0.7
Mood : reflective, hopeful, urban
Quality
Entropy : 6.91
Noise : 97
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight noise and compression artifacts are visible, particularly in the shadows.
Lost in Thought: A Moment of Melancholy on the Train
A young woman gazes out the window of a moving train, her expression hinting at a contemplative mood. The soft lighting lends the scene a sense of solitude and mystery, capturing a moment of introspective reflection.
Prompt
poses profile: Reflective, contemplative, nostalgic ; A traveler sitting on a train, looking out the window at passing scenery; medium shot; Travel; A scenic train journey through rolling hills and fields; cinematic
Characteristic
Shot : A woman sits by the window of a train, looking out at a blurred landscape. The window frame and the seat in front of her are visible in the foreground.
Aesthetic Score : 0.6
Mood : melancholy, pensive, contemplative
Quality
Entropy : 5.96
Noise : 60
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors.
Friends Gather for a Joyful Celebration
A group of friends share laughter and good times in a warmly lit, festive setting. The scene exudes a sense of joy, camaraderie, and celebration, captured in a beautifully composed image.
Prompt
poses profile: Joyful, celebratory, connected ; A group of friends laughing and celebrating together; wide shot; Groups; A lively party with colorful decorations and music; cinematic
Characteristic
Shot : A group of people, mostly women, are standing in a room decorated with hanging paper lanterns. There is a sense of celebration and joy in the air.
Aesthetic Score : 0.6
Mood : happy, celebratory, warm
Quality
Entropy : 6.76
Noise : 85
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has slight noise and some artifacts in the background due to compression, but it’s not significantly noticeable.
Hope Rises Above the City
A superhero silhouetted against a breathtaking sunset, overlooking a sprawling cityscape. This epic scene captures the dramatic hope and resilience of a hero facing an uncertain future.
Prompt
poses profile: Powerful, confident, inspiring ; A superhero standing tall, cape billowing in the wind; medium shot; Heroism; A cityscape with towering skyscrapers; cinematic
Characteristic
Shot : A superhero standing on a rooftop in a city, looking out over the cityscape. The sun is setting in the background, casting a warm glow on the scene.
Aesthetic Score : 0.6
Mood : determined, hopeful, powerful
Quality
Entropy : 6.59
Noise : 61
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some noise and a slight blur. The lighting is a bit uneven.
Unveiling the Secrets of the Jungle Temple
Three adventurers embark on a thrilling expedition through a vibrant jungle, their curiosity piqued by the enigmatic temple looming in the distance. The lush greenery and mysterious atmosphere create a sense of wonder and anticipation, inviting you to join their journey.
Prompt
poses profile: Intrigued, adventurous, determined ; A group of explorers navigating a dense jungle; wide shot; Adventure; Lush greenery, ancient ruins, and dappled sunlight; cinematic
Characteristic
Shot : Three people are walking through a jungle-like environment towards a large, ancient temple-like building in the background.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, tropical
Quality
Entropy : 6.74
Noise : 119
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image suffers from overexposure in some areas, particularly in the background. This is probably due to sunlight. The building seems a bit blurry and poorly defined.
Lost in the Code: A Young Man’s Intense Focus Under Neon Lights
A young man, bathed in vibrant red and blue light, is completely absorbed in his work. His headphones block out the world, and his focused expression reveals a deep concentration. The interplay of light and shadow adds a dramatic touch, highlighting the intensity of his moment.
Prompt
poses profile: Focused, competitive, determined ; A gamer’s face, lit by the screen, showing intense concentration; close-up; Gaming; A dimly lit room with a gaming setup and neon lights; cinematic
Characteristic
Shot : A young man wearing headphones is looking intently at a computer screen, lit by blue and red light. The background is blurry but shows a second monitor and a red light.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 5.76
Noise : 49
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, particularly the background and the subject’s face. The lighting creates some harsh shadows on the face.
Silhouettes and Sunset Romance on the Beach
A woman strolls along a sandy shore as the sun dips below the horizon, casting a warm glow and creating a serene and romantic atmosphere. The scene evokes feelings of peace and tranquility, with the dramatic sunset colors adding a touch of magic.
Prompt
poses profile: Romantic, peaceful, serene ; A couple holding hands, walking along a beach at sunset; medium shot; Tourism; A golden beach with turquoise waters and a vibrant sky; cinematic
Characteristic
Shot : A woman walking on a beach at sunset.
Aesthetic Score : 0.7
Mood : serene, tranquil, romantic
Quality
Entropy : 6.52
Noise : 86
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no obvious artifacts or errors in the image.
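To compare the ten images at a glance, the per-image numbers listed above can be aggregated. The sketch below copies the Aesthetic Score and Prompt Clip Score of each entry into a small table and computes their means. The variable names are illustrative, and titles are shortened to the part before the colon.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score), copied from the entries above.
results = [
    ("Silhouetted Solitude at Sunset", 0.7, 0.25),
    ("Lost in the Majesty", 0.8, 0.27),
    ("Lost in the Game", 0.5, 0.24),
    ("Lost in the City's Embrace", 0.7, 0.29),
    ("Lost in Thought", 0.6, 0.29),
    ("Friends Gather for a Joyful Celebration", 0.6, 0.26),
    ("Hope Rises Above the City", 0.6, 0.23),
    ("Unveiling the Secrets of the Jungle Temple", 0.6, 0.26),
    ("Lost in the Code", 0.6, 0.27),
    ("Silhouettes and Sunset Romance on the Beach", 0.7, 0.26),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
best = max(results, key=lambda r: r[1])   # highest aesthetic score
worst = min(results, key=lambda r: r[1])  # lowest aesthetic score

print(f"mean aesthetic: {mean_aesthetic:.2f}, mean CLIP: {mean_clip:.3f}")
print(f"best: {best[0]}, worst: {worst[0]}")
```

On these numbers the mean aesthetic score is about 0.64 and the mean prompt CLIP score about 0.26, with the hiker image scoring highest and the close-up of the gamer's hands lowest.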
Conclusion
The results show that the generative AI model understood the described scenes well and came close on camera position, but struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t perfectly capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.565, which falls within the “good” range. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.09, which lies at the upper edge of the “very good” range of -0.2 to 0.1 rather than above it. Even so, the generated images’ aesthetic often diverged from the aesthetic described in the prompt, as when the “Epic, hopeful, determined” prompt yielded a tranquil, contemplative mood.
Overall, the model shows promise in understanding scene composition and camera positioning, but needs improvement in generating images that match the desired aesthetic.
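The comparisons in the breakdown can be checked mechanically. The sketch below tests each reported score against the range quoted for it. The helper name `position_in_range` is hypothetical, and it is an assumption that the “good” range of 0.5 to 0.75 quoted for camera position also applies to shot analysis, since the post does not restate it.

```python
def position_in_range(score, low, high):
    """Report whether a score is below, within, or above an inclusive range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


checks = {
    # Scores and ranges as quoted in the breakdown above.
    "camera position (good: 0.5 to 0.75)": position_in_range(0.45, 0.5, 0.75),
    # Assumes the same "good" range applies to shot analysis.
    "shot analysis (good: 0.5 to 0.75)": position_in_range(0.565, 0.5, 0.75),
    "aesthetic (very good: -0.2 to 0.1)": position_in_range(0.09, -0.2, 0.1),
}

for name, verdict in checks.items():
    print(f"{name}: {verdict}")
```

Keeping the thresholds next to the scores like this makes it easy to see which results sit just inside or just outside a range.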
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/schnell/api