AI's Artistic Struggle: Capturing the Scene, Not the Feeling with Flux-schnell
In the realm of artificial intelligence, the ability to generate images from text prompts is a rapidly evolving field. While impressive strides have been made, there are still challenges in accurately translating the nuances of human language into visual representations. This blog post delves into an experiment that highlights these challenges, showcasing an AI model’s strengths and weaknesses in capturing the essence of a scene.
Created with: flux-schnell
A Knight’s Solitude Amidst the Storm
A lone knight stands defiant on a hilltop, silhouetted against a city bathed in the glow of a lightning strike. The scene evokes a sense of drama, epic grandeur, and melancholic isolation.
Prompt
poses dutch-angle: determined, heroic, hopeful ; A lone knight, standing tall on a hilltop overlooking a besieged city; wide shot; heroism; a dramatic, stormy sky with flashes of lightning; cinematic
Characteristic
Shot : A lone knight in full armor stands on a hilltop overlooking a city, with lightning striking in the background. The sky is dark and stormy, and the city below is shrouded in mist.
Aesthetic Score : 0.7
Mood : dramatic, epic, lonely
Quality
Entropy : 6.74
Noise : 85
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has some noise and artifacting. It is also slightly blurry in places, and the colors look a little washed out.
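The post does not say how the Entropy value in each Quality block is computed. One common definition is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (every gray level equally frequent). The sketch below assumes that definition; the function name `shannon_entropy` is illustrative and is not taken from the evaluation tooling used here.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors a histogram-based entropy metric; the post
    does not document the formula actually used for its Entropy values.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000             # one gray level only
    uniform = list(range(256)) * 4  # all 256 levels equally often
    print(shannon_entropy(flat))
    print(shannon_entropy(uniform))
```

Under this assumption, values between roughly 6 and 7 bits, like those reported in this post, would indicate images that use a broad but not completely even range of tones.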
Silhouettes of Mystery at Sunset
A serene jungle scene unfolds as four figures stand silhouetted against a breathtaking sunset. The sun, partially obscured by trees and clouds, casts a warm glow, creating a sense of calm and mystery. The silhouettes invite contemplation, leaving the viewer to imagine their stories and the secrets they hold.
Prompt
poses dutch-angle: adventurous, mysterious, awe-inspiring ; A group of explorers, silhouetted against the setting sun, standing at the edge of a vast, unexplored jungle; medium shot; adventure; lush green foliage and towering trees; cinematic
Characteristic
Shot : A group of four people are silhouetted against a vibrant sunset over a forest. The people are standing in a line, facing away from the viewer.
Aesthetic Score : 0.6
Mood : serene, contemplative, hopeful
Quality
Entropy : 5.95
Noise : 90
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, which washes out some of the detail in the sky. The shadows in the foreground are also quite dark, making it difficult to see the detail of the forest.
Lost in the Code: A Moment of Intense Focus
A young man, bathed in the soft glow of his computer screen, is completely absorbed in his work. The dimly lit room adds an air of mystery, drawing you into his world of concentration and contemplation.
Prompt
poses dutch-angle: intense, focused, competitive ; A gamer, intensely focused on a screen, fingers flying across a keyboard; close-up; gaming; a brightly lit room with gaming peripherals and posters; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, wearing headphones and gaming. The room is dark, lit only by the screen and a few lamps.
Aesthetic Score : 0.7
Mood : focused, intense, serious
Quality
Entropy : 6.75
Noise : 83
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
Parisian Romance Under the Eiffel Tower
A couple enjoys a romantic moment in Paris, with the iconic Eiffel Tower as a breathtaking backdrop. The cafe table and chairs in the foreground add a touch of Parisian charm, creating a scene that is both beautiful and heartwarming.
Prompt
poses dutch-angle: romantic, nostalgic, joyful ; A couple, hand-in-hand, gazing out at the Eiffel Tower from a Parisian cafe; medium shot; tourism; bustling Parisian streets with charming cafes and shops; cinematic
Characteristic
Shot : A couple walking in Paris with the Eiffel Tower in the background, framed by a cafe terrace.
Aesthetic Score : 0.6
Mood : romantic, charming, Parisian
Quality
Entropy : 6.66
Noise : 97
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has slight noise and some artifacts in the background.
A Solitary Figure Against the Majestic Peaks
A lone hiker traverses a mountain ridge, dwarfed by the vast, snow-capped peaks. The scene evokes a sense of serenity, adventure, and contemplation, highlighting the hiker’s smallness against the immense landscape.
Prompt
poses dutch-angle: free-spirited, adventurous, inspiring ; A backpacker, walking along a winding mountain path, with breathtaking views of snow-capped peaks; medium shot; travel; a rugged mountain landscape with clear blue skies; cinematic
Characteristic
Shot : A lone hiker stands on a rocky ridge, looking out at a vast mountain range. The sky is clear and blue, and the sun is shining. The scene is peaceful and serene, with a sense of adventure and exploration.
Aesthetic Score : 0.7
Mood : peaceful, serene, adventurous
Quality
Entropy : 6.68
Noise : 76
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors detected in the image, although the subject appears a bit too saturated. The color balance could be tweaked to enhance the natural colors.
Friends Toast to Good Times in Warm, Intimate Setting
A group of friends gather at a bar, their smiles and laughter radiating warmth and joy. The close-up framing and warm lighting create a sense of intimacy and connection, capturing the essence of their celebratory mood.
Prompt
poses dutch-angle: joyful, celebratory, connected ; A group of friends, laughing and celebrating, raising their glasses in a toast; medium shot; groups; a lively bar or restaurant with warm lighting and festive decorations; cinematic
Characteristic
Shot : A group of friends are toasting each other with wine glasses in a dimly lit bar or restaurant. The lighting is warm and inviting, with a festive atmosphere.
Aesthetic Score : 0.7
Mood : joyful, warm, celebratory
Quality
Entropy : 6.62
Noise : 89
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
A Moment of Awe: Astronaut Gazes at Earth from Space
A lone astronaut, clad in a white spacesuit, stares out of a spaceship window at the breathtaking sight of Earth. The blue oceans and white clouds swirl beneath them, highlighting the vastness of space and the fragility of our planet. This image evokes feelings of awe, wonder, and solitude, capturing the profound experience of witnessing Earth from afar.
Prompt
poses dutch-angle: awe-inspiring, contemplative, hopeful ; A lone astronaut, gazing out at the Earth from a space station window; close-up; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : An astronaut in a spacesuit looks out of a spaceship window at the Earth.
Aesthetic Score : 0.7
Mood : awe, wonder, isolation
Quality
Entropy : 6.40
Noise : 73
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The Earth looks a bit blurry, and the astronaut’s helmet lacks detail.
Precarious Perch: Climber Defies Gravity on a Majestic Cliffside
A lone climber hangs precariously from a sheer cliff face, dwarfed by the vast valley below. A distant waterfall adds to the dramatic scene, capturing the awe-inspiring beauty and inherent danger of this adventurous pursuit.
Prompt
poses dutch-angle: exciting, daring, adventurous ; A group of adventurers, rappelling down a steep cliff face, with a breathtaking view of a valley below; wide shot; adventure; a dramatic mountain landscape with waterfalls and lush vegetation; cinematic
Characteristic
Shot : A climber hangs off the edge of a cliff with a majestic waterfall in the background. The view is a wide shot of a valley with lush green vegetation.
Aesthetic Score : 0.8
Mood : dramatic, adventurous, awe-inspiring
Quality
Entropy : 6.66
Noise : 114
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has minor compression artifacts and some noise in the darker areas, particularly the shadows on the cliff face.
Champion’s Triumph: A Moment of Pure Joy
A man, radiating excitement, raises a trophy high above his head, his headphones reflecting the celebratory mood. The crowd behind him roars with applause, capturing the essence of victory and the thrill of achievement.
Prompt
poses dutch-angle: triumphant, celebratory, exciting ; A gamer, celebrating a victory, holding up a trophy; close-up; gaming; a brightly lit stage with cheering crowds and flashing lights; cinematic
Characteristic
Shot : A young man, wearing headphones and a backpack, is raising his arms in triumph while holding a trophy. He is surrounded by a large crowd of people who are cheering and clapping.
Aesthetic Score : 0.7
Mood : joyful, celebratory, energetic
Quality
Entropy : 6.75
Noise : 82
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors.
Sunset Serenity: A Family’s Moment of Joy
A heartwarming scene of a family of four basking in the golden glow of a sunset on a tranquil beach. The warm hues of the sky create a peaceful and nostalgic atmosphere, capturing a moment of shared happiness.
Prompt
poses dutch-angle: peaceful, heartwarming, nostalgic ; A family, standing on a beach, watching the sunset over the ocean; medium shot; travel; a serene beach with golden sand and turquoise waters; cinematic
Characteristic
Shot : A family of four silhouetted against a sunset on a beach.
Aesthetic Score : 0.6
Mood : tranquil, happy, warm
Quality
Entropy : 6.80
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major image errors, some mild noise in the sky.
Conclusion
The results show that the generative AI model matched the intended aesthetic well, but struggled with scene understanding and camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.45, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.08, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
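The three scores in the breakdown are not itemized per image, but the per-image figures reported above can be aggregated for a rough cross-check. The sketch below copies the ten sets of values from this post and averages them. The short labels (`knight`, `hiker`, and so on) and the `ImageResult` type are shorthand introduced here for illustration, not names from the original evaluation.

```python
from statistics import mean
from typing import NamedTuple


class ImageResult(NamedTuple):
    label: str            # shorthand label, not from the original tooling
    aesthetic: float      # Aesthetic Score
    entropy: float        # Entropy
    noise: int            # Noise
    clip: float           # Prompt Clip Score
    ai_likelihood: float  # Likelihood of AI


# Values copied from the ten image reports in this post.
RESULTS = [
    ImageResult("knight", 0.7, 6.74, 85, 0.28, 0.70),
    ImageResult("explorers", 0.6, 5.95, 90, 0.29, 0.10),
    ImageResult("gamer", 0.7, 6.75, 83, 0.27, 0.20),
    ImageResult("paris", 0.6, 6.66, 97, 0.29, 0.10),
    ImageResult("hiker", 0.7, 6.68, 76, 0.23, 0.20),
    ImageResult("friends", 0.7, 6.62, 89, 0.25, 0.20),
    ImageResult("astronaut", 0.7, 6.40, 73, 0.29, 0.30),
    ImageResult("climber", 0.8, 6.66, 114, 0.25, 0.20),
    ImageResult("champion", 0.7, 6.75, 82, 0.28, 0.30),
    ImageResult("family", 0.6, 6.80, 70, 0.26, 0.20),
]

METRICS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

# Mean of each metric across all ten images, rounded for display.
summary = {m: round(mean(getattr(r, m) for r in RESULTS), 3) for m in METRICS}

# Images at the extremes of two of the metrics.
lowest_clip = min(RESULTS, key=lambda r: r.clip)
highest_ai = max(RESULTS, key=lambda r: r.ai_likelihood)

if __name__ == "__main__":
    for metric, value in summary.items():
        print(f"{metric:>14}: {value}")
    print("lowest prompt clip score:", lowest_clip.label, lowest_clip.clip)
    print("highest AI likelihood:", highest_ai.label, highest_ai.ai_likelihood)
```

Across the ten images, the mean Prompt Clip Score comes to about 0.27 and the mean Aesthetic Score to 0.68. The mountain hiker image has the lowest Prompt Clip Score (0.23), and the knight image has the highest Likelihood of AI (0.70).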
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/schnell/api