AI's Artistic Journey: Capturing Poses, But Missing the Mood with Flux-schnell
Text-to-image generation is advancing rapidly. This experiment tests a generative AI model, flux-schnell, on one specific task: rendering dramatic fighting poses across ten different scenes. The results show an interplay between technical proficiency and artistic interpretation, and they highlight both the model’s strengths and its areas for improvement.
Dramatic poses convey emotion, action, or a specific character trait. They are used in film, photography, and video games to enhance the narrative and engage the audience. This experiment explores how well an AI model can translate such poses from a text prompt into a visually compelling image.
Created with: flux-schnell
Warriors at Sunset: A Silhouette of Power
A dramatic and epic scene unfolds as a group of warriors stand silhouetted against a fiery sunset. The warm glow of the setting sun casts a powerful aura, highlighting their strength and determination. This visually striking composition evokes a sense of grandeur and heroism.
Prompt
poses fighting: epic, determined ; A lone warrior; wide shot; heroism; a desolate battlefield with the setting sun in the background; cinematic
Characteristic
Shot : A silhouette of a warrior with a raised sword, surrounded by other warriors, against a backdrop of a fiery sunset.
Aesthetic Score : 0.6
Mood : epic, dramatic, heroic
Quality
Entropy : 6.02
Noise : 74
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some slight blurring and a lack of sharpness in the silhouettes.
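All prompts in this experiment share the same six-part layout: a mood list after "poses fighting:", then subject, shot type, theme, setting, and a closing style word. The following is a minimal sketch of how such a prompt could be assembled. The class and field names are hypothetical labels chosen for illustration; the article does not say how the prompts were actually generated.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Hypothetical field names for the six segments seen in the prompts.
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str = "cinematic"

    def render(self) -> str:
        # Mirror the spacing used in the article's prompts: " ; " after the
        # mood list, "; " between the remaining segments.
        head = "poses fighting: " + ", ".join(self.moods)
        tail = "; ".join(
            [self.subject, self.shot, self.theme, self.setting, self.style]
        )
        return f"{head} ; {tail}"


warriors = PosePrompt(
    moods=["epic", "determined"],
    subject="A lone warrior",
    shot="wide shot",
    theme="heroism",
    setting="a desolate battlefield with the setting sun in the background",
)
print(warriors.render())
```

Rendering this instance reproduces the prompt used for the sunset image. Keeping the segments separate makes it easy to vary one of them, for example the shot type, while holding the rest constant.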
Clash in the Jungle: Swords Drawn, Fate Uncertain
Four figures stand locked in a tense confrontation amidst the lush greenery of a jungle. Their weapons, held high, suggest an imminent clash. A towering mountain and a mysterious Mayan temple loom in the background, adding to the sense of adventure and danger.
Prompt
poses fighting: intense, adventurous ; A group of adventurers; medium shot; adventure; a dense jungle with ancient ruins in the distance; cinematic
Characteristic
Shot : Four people are standing in a jungle clearing, holding weapons, with a large structure in the background.
Aesthetic Score : 0.6
Mood : action, suspense, adventurous
Quality
Entropy : 6.81
Noise : 118
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor artifacts are present in the image, especially around the edges of the characters.
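The article does not define how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, measured in bits, which ranges from 0 (a single flat tone) to 8 (all 256 levels equally likely). The reported values between 5.74 and 6.85 would be consistent with that scale, but this is an assumption rather than the author's stated method. A minimal sketch using toy pixel data:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of grayscale values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs for illustration only, not data from the article.
flat = [128] * 64            # one gray level everywhere
uniform = list(range(256))   # every 8-bit level used exactly once

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a silhouette-heavy image with large uniform dark areas would tend to score lower than a busy, evenly lit scene.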
Cyberpunk Warrior in Neon City
A cyberpunk-style figure, clad in a helmet and goggles, stands amidst a neon-lit cityscape, wielding a glowing rod. The scene evokes a sense of action and tension, with the figure’s pose and the vibrant lighting creating a dramatic and futuristic atmosphere.
Prompt
poses fighting: dynamic, futuristic ; A player character; close-up; gaming; a neon-lit cityscape with holographic projections; cinematic
Characteristic
Shot : A man in a futuristic helmet and goggles walks through a neon-lit city at night. He is holding a glowing baton.
Aesthetic Score : 0.7
Mood : futuristic, cyberpunk, edgy
Quality
Entropy : 6.85
Noise : 77
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some blur and noise, particularly in the background.
A Moment of Connection Amidst the Bustle
Two men share a warm handshake in a vibrant street market, their connection a beacon of warmth amidst the bustling crowd. The scene captures a moment of casual camaraderie, bathed in soft, dimmed light, suggesting a shared understanding and friendship.
Prompt
poses fighting: chaotic, humorous ; Two tourists; medium shot; tourism; a bustling marketplace with colorful stalls and vibrant crowds; cinematic
Characteristic
Shot : Two men are standing in a crowded market, seemingly in conversation. There is a lot of visual clutter, and the colors are muted.
Aesthetic Score : 0.6
Mood : casual, friendly, contemplative
Quality
Entropy : 6.84
Noise : 98
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blurriness, particularly in the background.
A Lone Figure in the Vastness of the Desert
A solitary traveler, clad in a long coat and carrying a stick, traverses a desolate desert landscape. The muted colors and minimalist composition evoke a sense of mystery and adventure, leaving the viewer to ponder the figure’s journey and destination.
Prompt
poses fighting: isolated, desperate ; A lone traveler; long shot; travel; a vast desert landscape with a lone sand dune in the foreground; cinematic
Characteristic
Shot : A lone figure walks across a vast desert landscape, carrying a backpack and holding a stick in front of them.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, desolate
Quality
Entropy : 6.17
Noise : 63
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious image errors.
City Lights, Young Hearts: A Rooftop Gathering Under the Stars
Four friends share laughter and good times on a rooftop terrace, the vibrant city skyline providing a breathtaking backdrop. This scene captures the energy and joy of youth, with a focus on the intimate connections between the individuals.
Prompt
poses fighting: energetic, playful ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : A group of four young people are standing on a rooftop overlooking a cityscape at night. They are all casually dressed and seem to be enjoying themselves.
Aesthetic Score : 0.6
Mood : fun, playful, energetic
Quality
Entropy : 6.84
Noise : 90
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, resulting in some loss of detail in the highlights. There is also some noise present in the image, which is most noticeable in the darker areas.
A Lone Warrior Faces the Flames of War
A single warrior, silhouetted against a fiery inferno, stands poised for battle. The scene is charged with dramatic tension, hinting at an epic clash about to unfold. The warrior’s raised sword and the chaotic backdrop of countless other warriors create a sense of scale and impending conflict.
Prompt
poses fighting: tragic, determined ; A lone warrior; close-up; heroism; a burning village with smoke billowing in the air; cinematic
Characteristic
Shot : A silhouette of a warrior holding a sword, fighting in a war-torn setting. The background depicts a burning city with smoke and fire.
Aesthetic Score : 0.6
Mood : intense, epic, dramatic
Quality
Entropy : 6.02
Noise : 61
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight graininess in the image. Some of the silhouetted figures lack sharpness and detail.
Shadows in the Cave: A Gathering of Mystery
A group of figures, silhouetted against the dim light of a cave, hold their sticks with an air of intensity. The scene is shrouded in mystery, suspense, and a palpable sense of danger. The composition is dynamic, creating a feeling of movement and tension, leaving the viewer to wonder what secrets lie within the shadows.
Prompt
poses fighting: suspenseful, adventurous ; A group of explorers; wide shot; adventure; a dark cave with flickering torches and mysterious shadows; cinematic
Characteristic
Shot : A group of five men, silhouetted against a dark cave opening, are holding sticks or swords, creating an atmosphere of mystery and tension. The light source is coming from outside the cave, highlighting the figures in a dramatic fashion.
Aesthetic Score : 0.7
Mood : intense, mysterious, adventurous
Quality
Entropy : 5.74
Noise : 78
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors in the image. The silhouetted figures are well-defined and the lighting is consistent throughout.
VR Duel: A Moment of Intense Action
Two individuals engage in a virtual boxing match, their expressions hidden behind VR headsets. The dimly lit room and blurred background create a futuristic atmosphere, while the focus on their hands and the boxing glove suggests a thrilling and competitive experience.
Prompt
poses fighting: immersive, intense ; A gamer; close-up; gaming; a virtual reality headset with a pixelated world projected in the background; cinematic
Characteristic
Shot : Two people wearing VR headsets and boxing gloves, likely participating in a virtual reality boxing game. The background is a blurry, abstract pattern with a lot of colorful light and shapes.
Aesthetic Score : 0.6
Mood : intense, futuristic, competitive
Quality
Entropy : 6.62
Noise : 69
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has minor artifacts and noise in the background.
Intriguing Encounter in the Subway
Two individuals engage in a captivating conversation amidst the bustling backdrop of a subway station. The image captures a sense of mystery and intrigue, drawing the viewer into their unspoken exchange.
Prompt
poses fighting: fast-paced, chaotic ; Two travelers; medium shot; travel; a crowded train station with people rushing in all directions; cinematic
Characteristic
Shot : A man and a woman are talking to each other in a crowded subway station.
Aesthetic Score : 0.6
Mood : intense, curious, urban
Quality
Entropy : 6.72
Noise : 89
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and there is a bit of noise. The lighting is uneven, with the man’s face being very bright and the woman’s face being very dark.
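To compare the ten images side by side, the per-image values listed above can be collected and averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI figures from each section (titles shortened); the variable names are illustrative and not part of the original pipeline.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI)
results = [
    ("Warriors at Sunset", 0.6, 0.24, 0.80),
    ("Clash in the Jungle", 0.6, 0.28, 0.20),
    ("Cyberpunk Warrior in Neon City", 0.7, 0.24, 0.80),
    ("A Moment of Connection", 0.6, 0.28, 0.20),
    ("A Lone Figure in the Desert", 0.7, 0.29, 0.10),
    ("City Lights, Young Hearts", 0.6, 0.33, 0.10),
    ("A Lone Warrior Faces the Flames", 0.6, 0.26, 0.30),
    ("Shadows in the Cave", 0.7, 0.33, 0.20),
    ("VR Duel", 0.6, 0.28, 0.10),
    ("Intriguing Encounter in the Subway", 0.6, 0.29, 0.10),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai = mean(r[3] for r in results)

# Titles sharing the lowest and highest prompt CLIP score.
lowest = min(r[2] for r in results)
highest = max(r[2] for r in results)
weakest = [r[0] for r in results if r[2] == lowest]
strongest = [r[0] for r in results if r[2] == highest]

print(f"mean aesthetic score:   {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print("lowest CLIP score:", weakest)
print("highest CLIP score:", strongest)
```

The two images with the lowest prompt CLIP score (0.24) are the sunset warriors and the cyberpunk figure; the rooftop and cave scenes share the highest (0.33).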
Conclusion
The results show that flux-schnell handled camera position and shot content reasonably well but fell short on the aesthetic, or mood, that the prompts asked for. Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower edge of the “good” range (0.5-0.75). The generated images generally followed the camera position named in the prompt, though only just within that range.
- Shot Analysis: The model scored 0.59, also within the “good” range. This suggests that the images largely reflect the scene described in the prompt.
- Aesthetic Analysis: The model scored 0.12, which is just outside the “very good” range (-0.2 to 0.1). This indicates that the aesthetic of the generated images deviated from the one described in the prompts.
Overall, the model demonstrates a good grasp of camera position and shot composition, but needs improvement in capturing the requested mood.
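The range checks behind the breakdown above can be written out as a short worked example. Only the two ranges quoted in the conclusion are used; treating the bounds as inclusive is an assumption, and the helper name is hypothetical.

```python
def in_range(score, low, high):
    """True when low <= score <= high (inclusive bounds are an assumption)."""
    return low <= score <= high


GOOD = (0.5, 0.75)                 # quoted for camera position and shot analysis
VERY_GOOD_AESTHETIC = (-0.2, 0.1)  # quoted for aesthetic analysis

scores = {
    "camera position": (0.5, GOOD),
    "shot analysis": (0.59, GOOD),
    "aesthetic analysis": (0.12, VERY_GOOD_AESTHETIC),
}

for name, (score, (low, high)) in scores.items():
    verdict = "inside" if in_range(score, low, high) else "outside"
    # Distance to the nearest bound shows how close each call is.
    margin = min(abs(score - low), abs(score - high))
    print(f"{name}: {score} is {verdict} [{low}, {high}], "
          f"{margin:.2f} from the nearest bound")
```

Both the camera position result (exactly on the lower bound) and the aesthetic result (0.02 above the upper bound) are borderline, so small changes in the scores would shift either verdict.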