AI's Artistic Struggle: Capturing the Essence of Poses with Flux-schnell
Generating images from text prompts has improved quickly, but capturing a pose, along with the subtle cues that convey emotion and narrative, remains difficult. This post reports an experiment in which a generative AI model, flux-schnell, was asked to create ten images from prompts that each specify a pose, a subject, a shot type, and a scene. The model handled camera position and shot composition reasonably well, but it fell short of the requested aesthetic style. That gap between technical proficiency and artistic expression is the focus here: each section below shows a prompt alongside the evaluation of the image it produced, and the conclusion summarizes where the model matched the prompts and where it did not.
Created with: flux-schnell
Lost in the Clouds: Hikers Find Serenity on a Mountain Ridge
Two hikers stand on a rocky mountain ridge, dwarfed by the vast expanse of white clouds below. The scene evokes a sense of serenity, adventure, and contemplation, highlighting the grandeur of nature and the smallness of humanity in its face.
Prompt
poses looking-at-each-other: determined, awe-inspired ; A lone adventurer, standing on a mountain peak; wide shot; adventure; a vast, breathtaking landscape with clouds swirling below; cinematic
Characteristic
Shot : Two hikers standing on a mountain ridge overlooking a sea of clouds.
Aesthetic Score : 0.6
Mood : adventurous, hopeful, serene
Quality
Entropy : 6.70
Noise : 100
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness on the subject’s faces and some texture distortion on the clouds.
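The prompts in this post all follow the same semicolon-separated layout, so they can be split into fields for comparison with the evaluation. The sketch below is one way to do that; the field names (subject, shot, theme, background, style) and the PosePrompt type are labels assumed here for illustration, not names used by the original experiment.

```python
from dataclasses import dataclass


@dataclass
class PosePrompt:
    # Field names are assumptions made for this sketch.
    pose: str
    emotions: list
    subject: str
    shot: str
    theme: str
    background: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split 'poses <pose>: <emotions> ; <subject>; <shot>; <theme>; <background>; <style>'."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    head, subject, shot, theme, background, style = parts

    # The first field looks like "poses looking-at-each-other: determined, awe-inspired".
    pose_part, _, emotion_part = head.partition(":")
    pose = pose_part.strip()
    if pose.startswith("poses"):
        pose = pose[len("poses"):].strip()
    emotions = [e.strip() for e in emotion_part.split(",") if e.strip()]

    return PosePrompt(pose, emotions, subject, shot, theme, background, style)


example = parse_pose_prompt(
    "poses looking-at-each-other: determined, awe-inspired ; "
    "A lone adventurer, standing on a mountain peak; wide shot; adventure; "
    "a vast, breathtaking landscape with clouds swirling below; cinematic"
)
print(example.pose, "|", example.subject, "|", example.shot)
```

Parsed this way, the first prompt pairs a "looking-at-each-other" pose with "A lone adventurer" as the subject, which may help explain why the pose is hard to satisfy literally.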
Clash of Steel: Medieval Warriors in a Smoky Battlefield
Two armored men face off in a dramatic, smoke-filled battlefield. The intense lighting and atmospheric smoke create a visually striking scene, capturing the raw power and intensity of medieval warfare.
Prompt
poses looking-at-each-other: tense, hopeful ; Two soldiers, one injured, the other holding a shield; medium shot; heroism; a battlefield with smoke and fire in the background; cinematic
Characteristic
Shot : Two men in medieval armor are facing each other, with a large shield between them. There is a sense of tension and anticipation in the air. The background is a battlefield with smoke and fire.
Aesthetic Score : 0.7
Mood : dramatic, intense, warlike
Quality
Entropy : 6.75
Noise : 94
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, and some noise is visible in the background.
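The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally frequent); the reported values between 5 and 7 would fit that range. A minimal sketch under that assumption, using plain lists of pixel intensities instead of a real image:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of integer pixel intensities."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must be non-empty")
    counts = Counter(pixels)
    # Sum of -p * log2(p) over every intensity level that occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


flat = [128] * 1024              # one gray level everywhere: no variation
uniform = list(range(256)) * 4   # every 8-bit level equally often: maximum variation

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this reading, a lower value such as the 5.11 reported for the campfire image would indicate fewer distinct tones, which is plausible for a dark night scene.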
Caught in the Moment: An Intimate Gaming Experience
In the soft glow of pink and purple lights, two young adults are engrossed in their own world, sharing a close moment as they game or listen to music together. The dimly lit room, filled with gaming equipment, sets the stage for an intimate and playful atmosphere.
Prompt
poses looking-at-each-other: intense, focused ; Two gamers, heads bent over a screen; close-up; gaming; a dimly lit room with neon lights reflecting on their faces; cinematic
Characteristic
Shot : Two young adults, a man and a woman, are sitting at a computer desk with neon lights in the background. They are both wearing headsets and looking down.
Aesthetic Score : 0.6
Mood : intense, focused, mysterious
Quality
Entropy : 6.13
Noise : 66
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, especially in the darker areas.
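The Prompt Clip Score is not defined in the post either. A common reading, assumed here, is the cosine similarity between the CLIP embedding of the prompt text and the CLIP embedding of the generated image. The sketch below shows only the similarity step; the two short vectors are hypothetical stand-ins, since real CLIP embeddings have hundreds of dimensions and come from a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional stand-ins for a text and an image embedding.
text_embedding = [0.2, 0.1, 0.7, 0.4]
image_embedding = [0.6, 0.3, 0.1, 0.5]

score = cosine_similarity(text_embedding, image_embedding)
print(round(score, 2))
```

If the scores in this post are computed this way, a higher value means the image sits closer to the prompt in CLIP's embedding space; the gaming image above has the highest value of the ten at 0.34.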
Urban Oasis: A Moment of Tranquility Amidst the City
A group of people stroll through a majestic archway, bathed in the soft light of a cloudy sky. The scene evokes a sense of casual urban life, with the distant building adding a touch of perspective and the dramatic clouds enhancing the mood.
Prompt
poses looking-at-each-other: excited, curious ; A group of tourists, standing in front of a famous landmark; medium shot; tourism; a bustling city street with people and vehicles passing by; cinematic
Characteristic
Shot : A group of people walking in front of a large archway with a building in the background. The photo was taken on a cloudy day.
Aesthetic Score : 0.6
Mood : urban, casual, daytime
Quality
Entropy : 6.89
Noise : 96
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
A Moment in Time: Love and Landscapes on a Train
A couple, lost in their own world, gazes out the window of a moving train, their intimacy framed by the passing scenery. The romantic, wistful mood is palpable, capturing the fleeting beauty of the moment and the enduring power of love.
Prompt
poses looking-at-each-other: reflective, nostalgic ; Two friends, sitting on a train, looking out the window; medium shot; travel; a scenic landscape with rolling hills and fields; cinematic
Characteristic
Shot : A couple sits side-by-side on a train, looking out the window. The window shows a blurry countryside landscape.
Aesthetic Score : 0.6
Mood : romantic, contemplative, nostalgic
Quality
Entropy : 5.69
Noise : 51
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slight blur, especially in the landscape.
Campfire Cozy: Friends Gather Around the Flames
A group of friends share laughter and stories around a crackling campfire, creating a warm and nostalgic atmosphere. The fire’s glow illuminates their faces, highlighting the intimacy and connection they share.
Prompt
poses looking-at-each-other: warm, intimate ; A group of friends, huddled together around a campfire; close-up; groups; a dark forest with stars twinkling in the sky; cinematic
Characteristic
Shot : A group of friends are gathered around a campfire in a forest at night. The fire is burning brightly, and the flames are casting a warm glow on the faces of the friends. The forest is dark and mysterious, and the friends seem to be enjoying each other’s company.
Aesthetic Score : 0.6
Mood : cozy, warm, friendly
Quality
Entropy : 5.11
Noise : 70
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, and the contrast is too high, which makes the forest too dark. The friends also appear slightly too far from the campfire.
Silhouetted Against the Sunset, a Moment of Contemplation
A solitary figure stands on a beach, bathed in the golden hues of a setting sun. The scene evokes a sense of tranquility and introspection, as the man’s silhouette against the vibrant sky suggests a moment of deep thought and reflection.
Prompt
poses looking-at-each-other: melancholy, contemplative ; A lone figure, standing on a deserted beach; wide shot; adventure; a vast ocean with crashing waves and a setting sun; cinematic
Characteristic
Shot : A man silhouetted against the setting sun on a beach, gazing out at the ocean.
Aesthetic Score : 0.7
Mood : serene, contemplative, peaceful
Quality
Entropy : 6.70
Noise : 69
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : No notable artifacts or errors.
Tiny Explorers in a Vast Universe: A Playful Journey Through Space
Two astronauts, dwarfed by the immense blue planet, float through the cosmos with a sense of playful adventure and hopeful anticipation. The image captures the dramatic scale of space and the boundless possibilities that lie ahead.
Prompt
poses looking-at-each-other: awe-inspired, hopeful ; Two astronauts, floating in space; medium shot; heroism; a view of Earth from space with stars and galaxies in the background; cinematic
Characteristic
Shot : Two astronauts in space suits are floating in space, against the backdrop of a planet and stars.
Aesthetic Score : 0.7
Mood : dreamy, hopeful, adventurous
Quality
Entropy : 6.51
Noise : 105
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, particularly around the astronauts’ faces. The colors are also a bit muted.
Finding Serenity in the Forest
A group of friends enjoys a leisurely hike through a lush green forest, bathed in soft, diffused light. The scene evokes a sense of relaxation, adventure, and connection with nature.
Prompt
poses looking-at-each-other: curious, adventurous ; A group of explorers, standing in a jungle clearing; medium shot; adventure; lush greenery with sunlight filtering through the leaves; cinematic
Characteristic
Shot : A group of five people are standing in a forest, likely on a hiking trip. The group includes three men and two women, and they are all dressed casually in hiking clothes. The background is a lush green forest, with tall trees and dense foliage.
Aesthetic Score : 0.5
Mood : relaxed, adventurous, friendly
Quality
Entropy : 6.76
Noise : 125
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors in this image. Image quality and focus are good.
Sunset Serenade: A Moment of Love and Laughter on the Bridge
In the heart of the city, a couple shares a tender moment on a bridge at sunset. The woman, in a flowing dress, and the man, in a casual jacket and jeans, are captured in a loving gaze, their smiles reflecting the warmth of the setting sun. The city lights twinkle in the background, mirrored in the water below, adding a touch of urban romance to their intimate scene.
Prompt
poses looking-at-each-other: romantic, intimate ; Two lovers, standing on a bridge overlooking a city; medium shot; tourism; a cityscape with twinkling lights and a river flowing below; cinematic
Characteristic
Shot : A couple is standing on a bridge overlooking a city at dusk. They are looking at each other and smiling. The city lights are twinkling in the distance.
Aesthetic Score : 0.7
Mood : romantic, intimate, dreamy
Quality
Entropy : 6.74
Noise : 82
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors, just a slight noise in the background.
Conclusion
The results show that the generative AI model handled camera position and shot composition better than aesthetic style. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered average. The camera positions in the generated images were somewhat similar to those requested in the prompts.
- Shot Analysis: The model scored 0.55, which is considered good. The shot composition of the generated images was fairly close to what the prompts asked for.
- Aesthetic Analysis: The model scored 0.04, which is considered poor. The aesthetic style of the generated images deviated significantly from what the prompts asked for.
Overall, the model seems to be better at following the technical aspects of a prompt (camera position and shot composition) than the artistic aspects (aesthetic style).
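For a quick view across all ten images, the per-image values listed above can be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI from each record; the short labels are shorthand introduced here. These averages are separate from the three scores in the breakdown above, whose computation the post does not describe.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, likelihood of AI), copied from the records above.
results = [
    ("Mountain ridge", 0.6, 0.26, 0.20),
    ("Battlefield", 0.7, 0.28, 0.10),
    ("Gaming", 0.6, 0.34, 0.20),
    ("Archway", 0.6, 0.27, 0.10),
    ("Train", 0.6, 0.32, 0.30),
    ("Campfire", 0.6, 0.28, 0.20),
    ("Beach", 0.7, 0.27, 0.30),
    ("Astronauts", 0.7, 0.29, 0.80),
    ("Forest", 0.5, 0.30, 0.10),
    ("Bridge", 0.7, 0.30, 0.20),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
mean_ai_likelihood = mean(r[3] for r in results)

# Images at the extremes of prompt/image agreement.
best_match = max(results, key=lambda r: r[2])
worst_match = min(results, key=lambda r: r[2])

print(f"mean aesthetic score:  {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai_likelihood:.2f}")
print(f"closest to prompt:  {best_match[0]} ({best_match[2]})")
print(f"furthest from prompt: {worst_match[0]} ({worst_match[2]})")
```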