AI's Artistic Journey: Capturing Poses, But Missing the Essence with Flux-schnell
Text-to-image models have become increasingly sophisticated, but following a prompt's technical directions is not the same as delivering the look it asks for. This post presents an experiment in which flux-schnell generated ten images from prompts describing a pose, a mood, a subject, a shot type, and a setting. Each image is shown with its prompt, an automated description, and quality metrics, and the conclusion compares how closely the model followed the requested camera position, shot composition, and aesthetic.
Created with: flux-schnell
Silhouetted Against the Sunset: A Moment of Triumph and Hope
A man in a red jacket stands triumphantly on a mountain peak, arms raised towards the setting sun. The golden light paints the landscape in warm hues, creating a breathtaking scene of joy, hope, and inspiration. The silhouette of the man against the sunset adds a dramatic touch, emphasizing the grandeur of the moment.
Prompt
poses over-the-shoulder: epic, hopeful ; A lone adventurer, silhouetted against a setting sun; wide shot; Adventure; a vast, rugged mountain range; cinematic
Characteristic
Shot : A lone hiker stands on a mountaintop, arms outstretched, silhouetted against a bright sunrise.
Aesthetic Score : 0.7
Mood : peaceful, hopeful, triumphant
Quality
Entropy : 6.26
Noise : 49
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable errors or artifacts
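The prompt above, like the other nine in this post, follows one semicolon-separated pattern: a pose with mood words, a subject, a shot type, a category, a setting, and a style. As a minimal sketch, the Python below splits a prompt into those parts. The `PromptParts` class, its field names, and `parse_prompt` are hypothetical names chosen for illustration; the post does not say how the prompts were actually built.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the six fields seen in each prompt."""
    pose: str
    moods: list[str]
    subject: str
    shot: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    # Fields are separated by ';' and may carry stray spaces.
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, subject, shot, category, setting, style = fields

    # The first field looks like "poses over-the-shoulder: epic, hopeful".
    pose_label, _, mood_text = head.partition(":")
    pose = pose_label.removeprefix("poses").strip()
    moods = [mood.strip() for mood in mood_text.split(",") if mood.strip()]

    return PromptParts(pose, moods, subject, shot, category, setting, style)


example = (
    "poses over-the-shoulder: epic, hopeful ; A lone adventurer, "
    "silhouetted against a setting sun; wide shot; Adventure; "
    "a vast, rugged mountain range; cinematic"
)
parts = parse_prompt(example)
print(parts.pose, parts.moods, parts.shot)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, an over-the-shoulder pose in a wide shot) with the Shot description reported for each image.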
Firefighter Braces Against Blazing Inferno
A firefighter in full gear stands defiantly in front of a burning building, the flames casting an intense glow on their silhouette. The scene captures the raw danger and urgency of the situation, highlighting the bravery of those who face the fire.
Prompt
poses over-the-shoulder: intense, dramatic ; A firefighter, helmet gleaming, facing a raging inferno; medium shot; Heroism; a burning building with smoke billowing; cinematic
Characteristic
Shot : A firefighter in full gear, standing in front of a burning building. The image focuses on the firefighter, and the burning building is out of focus in the background.
Aesthetic Score : 0.6
Mood : intense, serious, dramatic
Quality
Entropy : 6.69
Noise : 62
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors.
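The post does not say how the Entropy figure under Quality is computed. One common choice, assumed here, is the Shannon entropy of an image's 8-bit grayscale histogram, which ranges from 0 to 8 bits; the values reported in this post (6.26 to 6.91) fit that range. The sketch below works on a plain list of pixel values, and the function name `shannon_entropy` is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this is how an image entropy score could be derived;
    the post does not document its actual method.
    """
    counts = Counter(pixels)
    total = len(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy += p * math.log2(1 / p)
    return entropy


# A flat gray image carries no information; an image that uses every
# gray level equally often reaches the 8-bit maximum.
flat = [128] * 64
uniform = list(range(256))
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this assumption, a higher entropy means the image spreads its pixels over more brightness levels, which says something about tonal variety but nothing about how pleasing the image looks.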
Lost in the Code: A Moment of Intense Focus
A young person, headphones on and eyes glued to the screen, navigates the digital world with unwavering concentration. The dimly lit room and close-up shot create an intimate and intense atmosphere, highlighting the subject’s deep immersion in their task.
Prompt
poses over-the-shoulder: focused, intense ; A gamer, eyes glued to the screen, fingers flying across the keyboard; close-up; Gaming; a brightly lit gaming setup with flashing lights; cinematic
Characteristic
Shot : A young person, likely a teenager or young adult, is sitting in front of a computer, wearing headphones and glasses, focused intently on their screen. The room is dimly lit with warm, soft lighting, creating a cozy and intimate atmosphere. There is a monitor on the right side of the frame, blurry and out of focus, suggesting the presence of other people or objects in the room.
Aesthetic Score : 0.7
Mood : focused, intense, determined
Quality
Entropy : 6.66
Noise : 67
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight artifacts in the image, particularly in the highlights and shadows, suggesting possible compression or processing errors. The lighting on the subject’s face is slightly uneven, with some areas appearing brighter than others.
Capturing the Eiffel Tower: A Moment of Wanderlust in Paris
A traveler stands before the iconic Eiffel Tower, camera in hand, capturing the grandeur of the Parisian skyline. The scene evokes a sense of adventure and wanderlust, with the tower’s imposing presence adding a dramatic touch to the moment.
Prompt
poses over-the-shoulder: joyful, awe-inspired ; A tourist, camera in hand, gazing at the Eiffel Tower; medium shot; Tourism; a bustling Parisian street with the Eiffel Tower in the background; cinematic
Characteristic
Shot : A young man taking a photo of the Eiffel Tower in Paris, France. He is wearing a black jacket and a brown backpack. The Eiffel Tower is in the background, and there are other people in the scene.
Aesthetic Score : 0.7
Mood : romantic, nostalgic, adventurous
Quality
Entropy : 6.80
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is slightly blurry. There is some slight noise in the darker areas of the image.
Silhouettes of Hope: A Solitary Figure Welcomes the Sunset
A lone traveler stands on a tranquil beach, their silhouette framed against the fiery hues of the setting sun. The vast ocean stretches before them, mirroring the contemplative mood of the moment. This image evokes a sense of peace, hope, and the quiet beauty of introspection.
Prompt
poses over-the-shoulder: peaceful, contemplative ; A backpacker, gazing out at a breathtaking sunset over the ocean; wide shot; Travel; a serene beach with palm trees and turquoise water; cinematic
Characteristic
Shot : A person standing on a beach, looking at the ocean sunset. There are palm trees in the background.
Aesthetic Score : 0.6
Mood : tranquil, contemplative, peaceful
Quality
Entropy : 6.52
Noise : 52
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has a slight blurriness, especially in the background.
Campfire Tales Under a Starry Sky
A group of friends gather around a crackling campfire, their faces illuminated by the warm glow. The starry sky above adds a touch of magic to the cozy scene, creating a perfect setting for laughter, stories, and shared moments of friendship.
Prompt
poses over-the-shoulder: warm, nostalgic ; A group of friends, laughing and sharing stories, around a campfire; medium shot; Groups; a campsite under a starry night sky; cinematic
Characteristic
Shot : A group of four friends are gathered around a campfire under a starry night sky. The Milky Way is visible in the background.
Aesthetic Score : 0.7
Mood : cozy, relaxing, warm
Quality
Entropy : 6.33
Noise : 89
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some noise is visible in the image, particularly in the darker areas. The image is slightly overexposed, which is causing some of the details in the background to be lost.
Unveiling the Secrets: A Scientist’s Focused Pursuit
A woman in a lab coat leans intently over a microscope, her expression focused and curious. The close-up shot captures the essence of scientific exploration, hinting at a moment of discovery in a laboratory setting.
Prompt
poses over-the-shoulder: focused, determined ; A scientist, peering through a microscope, engrossed in her research; close-up; Heroism; a laboratory filled with scientific equipment; cinematic
Characteristic
Shot : A woman wearing safety glasses is looking through the eyepiece of a microscope. The background is out of focus and appears to be a lab setting.
Aesthetic Score : 0.7
Mood : focused, curious, scientific
Quality
Entropy : 6.87
Noise : 78
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : None
Soaring Above the Clouds: A Pilot’s Perspective
Experience the serenity and adventure of flight as a pilot navigates through a field of clouds. This captivating image captures the feeling of freedom and exploration, offering a unique glimpse into the world from above.
Prompt
poses over-the-shoulder: exhilarating, adventurous ; A pilot, gripping the controls, soaring through the clouds; wide shot; Adventure; a cockpit with a view of the vast, blue sky; cinematic
Characteristic
Shot : A pilot in a small plane flying over a field of clouds.
Aesthetic Score : 0.7
Mood : exciting, adventurous, serene
Quality
Entropy : 6.39
Noise : 52
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
The Art of Plating: A Chef’s Focused Precision
A glimpse into the heart of a professional kitchen, where a chef meticulously plates a dish under low-key lighting. The focus on their skilled hand evokes a sense of intimacy and the high standards of gourmet cuisine.
Prompt
poses over-the-shoulder: passionate, artistic ; A chef, meticulously plating a dish, surrounded by the aromas of fresh ingredients; close-up; Tourism; a bustling kitchen in a gourmet restaurant; cinematic
Characteristic
Shot : A chef in a white uniform is plating a meal in a restaurant kitchen
Aesthetic Score : 0.6
Mood : professional, focused, culinary
Quality
Entropy : 6.91
Noise : 82
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, and the lighting is a bit uneven.
Silhouetted Hikers Conquer the Summit, Embracing the Expansive View
Four adventurers stand triumphant on a mountain peak, their silhouettes stark against a breathtaking panorama of distant peaks and a clear blue sky. The scene evokes a sense of serenity, adventure, and inspiration, capturing the essence of conquering challenges and embracing the beauty of nature.
Prompt
poses over-the-shoulder: triumphant, inspiring ; A group of hikers, silhouetted against a mountain peak, reaching the summit; wide shot; Groups; a majestic mountain range with a breathtaking view; cinematic
Characteristic
Shot : Silhouettes of four hikers standing on a mountaintop with a majestic mountain range in the background.
Aesthetic Score : 0.8
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.70
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
Conclusion
The results show that the model followed shot composition best and camera position moderately well, but struggled with the requested aesthetic. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. This means the camera position in the generated images was somewhat different from what was requested in the prompts.
- Shot Analysis: The model scored 0.53, which is considered good. This indicates the shot composition of the generated images was fairly close to what was described in the prompts.
- Aesthetic Analysis: The model scored 0.05, the lowest of the three scores, although it is still labeled okay. This suggests the aesthetic of the generated images often differed from what the prompts asked for.
Overall, the model seems to be better at understanding the scene and shot composition than it is at capturing the desired aesthetic.
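For context alongside these three scores, the per-image numbers reported in the sections above can be averaged. The sketch below does that with values copied from this post; the short image labels, the tuple layout, and the names `RESULTS` and `column_mean` are illustrative. These per-image metrics are separate from the camera position, shot, and aesthetic analysis scores in the breakdown.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, clip_score, ai_likelihood)
# Labels are shorthand for the ten image sections in this post.
RESULTS = [
    ("Sunset summit", 0.7, 6.26, 49, 0.24, 0.10),
    ("Firefighter", 0.6, 6.69, 62, 0.28, 0.10),
    ("Gamer", 0.7, 6.66, 67, 0.26, 0.20),
    ("Eiffel Tower", 0.7, 6.80, 72, 0.28, 0.20),
    ("Beach sunset", 0.6, 6.52, 52, 0.31, 0.30),
    ("Campfire", 0.7, 6.33, 89, 0.31, 0.10),
    ("Scientist", 0.7, 6.87, 78, 0.24, 0.20),
    ("Pilot", 0.7, 6.39, 52, 0.22, 0.10),
    ("Chef", 0.6, 6.91, 82, 0.24, 0.20),
    ("Summit hikers", 0.8, 6.70, 58, 0.27, 0.10),
]


def column_mean(index):
    """Average one metric column across all ten images."""
    return mean(row[index] for row in RESULTS)


averages = {
    "aesthetic": column_mean(1),
    "entropy": column_mean(2),
    "noise": column_mean(3),
    "clip_score": column_mean(4),
    "ai_likelihood": column_mean(5),
}

# The image with the highest per-image Aesthetic Score.
best = max(RESULTS, key=lambda row: row[1])

for name, value in averages.items():
    print(f"{name}: {value:.3f}")
print("highest aesthetic score:", best[0])
```

Across the ten images, the mean Aesthetic Score is 0.68 and the mean Prompt Clip Score is about 0.265, with the summit hikers image scoring highest on aesthetics at 0.8.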
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/schnell/api