AI's Artistic Struggle: Capturing the Essence of a Scene with Flux-dev
Generating images from textual descriptions is a rapidly evolving field of artificial intelligence. This experiment assessed how well a generative AI model (flux-dev) captures the essence of a scene along three dimensions: camera position, shot analysis, and aesthetic execution. The results reveal both strengths and limitations of the model and highlight the ongoing challenges in achieving truly artistic AI outputs. This blog post presents the generated images with their measurements, analyzes the model’s performance, and explores the implications for the future of AI-generated art.
Created with: flux-dev
Two Astronauts Share a Moment of Hope Amidst the Cosmic Vastness
A poignant image captures two astronauts in white space suits, standing on a desolate alien landscape. Their hands are joined, their gazes fixed on a breathtaking expanse of starry sky. The scene evokes a sense of melancholy and hope, highlighting the profound isolation and wonder of space exploration.
Prompt
poses holding-hands: Hopeful, determined, camaraderie ; Two astronauts; wide shot; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : Two astronauts in spacesuits are standing on a desolate alien landscape, gazing at a distant planet.
Aesthetic Score : 0.7
Mood : dreamy, nostalgic, hopeful
Quality
Entropy : 6.52
Noise : 78
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are no noticeable artifacts or errors in the image.
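The prompt above, like every prompt in this post, follows the same semicolon-separated layout: a pose with a list of moods, then the subject, the shot type, a theme, the setting, and a style keyword. The sketch below splits such a prompt into its parts. The field names (`pose`, `moods`, `subject`, `shot`, `theme`, `setting`, `style`) and the `parse_prompt` helper are assumptions made for illustration; the post does not name the fields itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScenePrompt:
    """Hypothetical container for the six prompt segments (names are assumed)."""
    pose: str
    moods: List[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a prompt of the form 'poses <pose>: <moods> ; subject; shot; theme; setting; style'."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated segments, got {len(parts)}")
    head, subject, shot, theme, setting, style = parts

    # The first segment looks like "poses holding-hands: Hopeful, determined, camaraderie".
    pose_part, _, mood_part = head.partition(":")
    pose = pose_part.strip()
    if pose.startswith("poses"):
        pose = pose[len("poses"):].strip()
    moods = [mood.strip() for mood in mood_part.split(",") if mood.strip()]

    return ScenePrompt(pose, moods, subject, shot, theme, setting, style)


example = parse_prompt(
    "poses holding-hands: Hopeful, determined, camaraderie ; Two astronauts; "
    "wide shot; heroism; the vastness of space with stars and planets in the "
    "background; cinematic"
)
print(example.shot, "|", example.moods)
```

Splitting on semicolons rather than commas matters here, because several settings in this post contain commas of their own.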
Friends on a Tranquil Forest Adventure
Four friends, silhouetted against the sunlight filtering through the trees, embark on a peaceful hike in the forest. Their backpacks and relaxed postures suggest a fun and adventurous journey through nature’s beauty.
Prompt
poses holding-hands: Excited, adventurous, trusting ; A group of explorers; medium shot; adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : Four people hiking in a forest, holding hands, bathed in warm sunlight
Aesthetic Score : 0.6
Mood : serene, adventurous, friendly
Quality
Entropy : 6.53
Noise : 107
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurring and pixelation in the shadows, especially in the background. This could be due to the low light conditions or the camera’s settings.
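The post does not say how the Entropy value in the Quality block is computed. One common definition is the Shannon entropy of the grayscale intensity histogram, which ranges from 0 to 8 bits for 8-bit images. The reported values (between roughly 5.6 and 6.8) would fit that scale, but that is not confirmation. The sketch below implements this assumed definition in plain Python; `shannon_entropy` is a hypothetical helper name, and the input is simply a flat list of pixel intensities.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the intensity distribution of `pixels`.

    Assumption: this mirrors a histogram-based image entropy; the metric
    actually used in the post may be defined differently.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    entropy = 0.0
    for count in counts.values():
        probability = count / total
        entropy -= probability * math.log2(probability)
    return entropy


# A flat image carries no information; a uniform spread over all 256
# intensity levels reaches the 8-bit maximum.
flat_image = [128] * 1024
uniform_image = list(range(256)) * 4
print(shannon_entropy(flat_image), shannon_entropy(uniform_image))
```

Under this definition, a higher value means the intensities are spread more evenly across the available levels, which tends to go with richer texture and tonal variety.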
A Mysterious Moment of Connection
In this intimate and romantic scene, two hands are clasped together in the foreground, symbolizing a deep connection. The blurred computer monitor and visible keyboard in the background hint at a mysterious narrative, further emphasized by the low-light setting. The dramatic focus on the hands creates a sense of isolation and intense emotion.
Prompt
poses holding-hands: Focused, competitive, collaborative ; Two gamers; close-up; gaming; a brightly lit gaming setup with glowing screens and controllers; cinematic
Characteristic
Shot : Two hands clasped together in a dimly lit room, with a computer monitor in the background. The image is in focus on the hands, with a blurred background. The lighting is primarily red and blue.
Aesthetic Score : 0.6
Mood : intimate, tender, hopeful
Quality
Entropy : 6.55
Noise : 49
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise, particularly around the hands. This is likely due to the low lighting conditions in which the image was taken.
Parisian Romance at Sunset
A couple silhouetted against the iconic Eiffel Tower and the golden hues of a Parisian sunset. Their embrace evokes a sense of intimacy and hope, capturing the magic of love in the City of Lights.
Prompt
poses holding-hands: Romantic, happy, adventurous ; A couple; medium shot; tourism; a picturesque cityscape with iconic landmarks in the background; cinematic
Characteristic
Shot : A couple is standing on a hill overlooking a city, with the Eiffel Tower in the background. The sun is setting, and the sky is a beautiful orange and pink.
Aesthetic Score : 0.6
Mood : romantic, nostalgic, hopeful
Quality
Entropy : 6.73
Noise : 55
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors
A Family’s Serene Adventure Amidst Majestic Mountains
A family hikes along lush mountain trails in this breathtaking scene. The wide shot emphasizes the vastness of the landscape, creating a sense of wonder and adventure. The serene blue sky and vibrant greenery evoke a peaceful and tranquil mood.
Prompt
poses holding-hands: Joyful, connected, adventurous ; A family; long shot; travel; a scenic mountain range with a winding road leading to the peak; cinematic
Characteristic
Shot : A family of three, a man, a woman and a girl, are walking along a mountain trail. The sun is setting in the background, casting a warm glow on the scene.
Aesthetic Score : 0.7
Mood : peaceful, serene, heartwarming
Quality
Entropy : 6.79
Noise : 63
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors
Joyful Celebration Under the Sun
A group of young adults revel in the festive atmosphere, captured in a moment of pure joy. The man’s raised arms and the bright sunlight radiate happiness and togetherness, creating a vibrant and carefree scene.
Prompt
poses holding-hands: Happy, celebratory, connected ; A group of friends; medium shot; groups; a vibrant festival with colorful decorations and music; cinematic
Characteristic
Shot : A group of people, mostly out of focus, are gathered outdoors with a bright sun in the background. The image is focused on a shirtless man in the foreground who is reaching his arms up, seemingly in a celebratory gesture. A woman with long hair is standing to his left with a crop top and shorts. Other people in the background, including a woman in a pink dress, are blurred.
Aesthetic Score : 0.5
Mood : happy, carefree, celebratory
Quality
Entropy : 6.15
Noise : 63
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are slight artifacts in the blurred background. The lighting is somewhat harsh and creates some overexposure.
Love Amidst the Peaks: A Couple’s Silhouette Against a Majestic Landscape
A serene and adventurous scene unfolds as a couple hikes atop a mountain, their silhouetted figures dwarfed by the vast expanse of clouds and distant peaks. The dramatic perspective evokes a sense of romance and the couple’s connection to the awe-inspiring beauty of nature.
Prompt
poses holding-hands: Determined, courageous, triumphant ; A lone hiker; close-up; heroism; a breathtaking mountain vista with clouds swirling below; cinematic
Characteristic
Shot : A couple is hiking up a mountain, with a scenic view of a mountain range in the background. The couple is silhouetted against the bright sky.
Aesthetic Score : 0.7
Mood : serene, adventurous, romantic
Quality
Entropy : 5.75
Noise : 61
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
Innocence and Joy: A Moment of Friendship Captured
Two young girls, hand in hand, radiate pure joy on a vibrant playground. The shallow depth of field draws attention to their playful interaction, highlighting the innocence and strength of their bond. A perfect snapshot of childhood friendship.
Prompt
poses holding-hands: Playful, innocent, carefree ; Two children; close-up; adventure; a playground with swings, slides, and a sandbox; cinematic
Characteristic
Shot : Two young girls are holding hands in a playground. The setting appears to be a sunny day, with green trees and a brightly colored playground structure in the background.
Aesthetic Score : 0.7
Mood : happy, playful, carefree
Quality
Entropy : 6.46
Noise : 80
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor blurriness is visible in the background. The lighting is not perfectly balanced, and some areas are slightly overexposed.
Silhouettes of Hope: A Moment of Intimacy Under the Spotlight
Two figures, hand in hand, stand silhouetted against a brightly lit stage. The dramatic use of spotlights and shadows creates a sense of mystery and hope, hinting at a powerful connection between the two individuals. The scene is both intimate and evocative, leaving the viewer to imagine the story unfolding behind the stage.
Prompt
poses holding-hands: Passionate, connected, expressive ; A group of musicians; medium shot; groups; a dimly lit stage with spotlights shining on them; cinematic
Characteristic
Shot : Two people silhouetted against a stage with spotlights, with other people in the background, possibly a concert or performance.
Aesthetic Score : 0.6
Mood : dramatic, hopeful, celebratory
Quality
Entropy : 5.59
Noise : 38
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight graininess and some noise in the darker areas. The edges of the image are slightly blurry, but it’s likely due to the intentional backlighting.
Silhouettes of Love Against the Desert Sunset
A couple stands hand-in-hand, their silhouettes stark against the fiery hues of a desert sunset. The scene evokes a sense of romance, serenity, and hope, capturing the beauty of love amidst the vastness of nature.
Prompt
poses holding-hands: Romantic, adventurous, hopeful ; A couple; long shot; travel; a vast desert landscape with a setting sun in the distance; cinematic
Characteristic
Shot : A couple is silhouetted against a sunset in a desert landscape. The scene evokes feelings of love and romance.
Aesthetic Score : 0.7
Mood : romantic, peaceful, hopeful
Quality
Entropy : 5.89
Noise : 51
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image shows a slight amount of noise and graininess.
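To compare the ten images at a glance, the per-image figures listed above can be collected and averaged. The sketch below does this with the standard library only. The short labels and the `column_mean` helper are hypothetical names introduced for illustration; the numbers are copied from the records in this post.

```python
from statistics import mean

# (short label, aesthetic score, prompt CLIP score, likelihood of AI)
# Labels are shorthand for the image titles; values come from the records above.
RESULTS = [
    ("astronauts", 0.7, 0.30, 0.80),
    ("forest hike", 0.6, 0.28, 0.10),
    ("gamers", 0.6, 0.29, 0.20),
    ("Paris couple", 0.6, 0.29, 0.20),
    ("mountain family", 0.7, 0.26, 0.10),
    ("festival", 0.5, 0.24, 0.30),
    ("mountain couple", 0.7, 0.29, 0.20),
    ("playground", 0.7, 0.28, 0.20),
    ("stage", 0.6, 0.25, 0.20),
    ("desert couple", 0.7, 0.31, 0.10),
]


def column_mean(rows, index):
    """Average one numeric column of the result tuples."""
    return mean(row[index] for row in rows)


mean_aesthetic = column_mean(RESULTS, 1)
mean_clip = column_mean(RESULTS, 2)
mean_ai_likelihood = column_mean(RESULTS, 3)

# Images that stand out on a single metric.
lowest_clip = min(RESULTS, key=lambda row: row[2])
highest_ai = max(RESULTS, key=lambda row: row[3])

print(f"mean aesthetic score:  {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai_likelihood:.2f}")
print(f"lowest CLIP score: {lowest_clip[0]}, highest AI likelihood: {highest_ai[0]}")
```

With the figures above, the means come out at about 0.64 for the aesthetic score, 0.279 for the prompt CLIP score, and 0.24 for the likelihood of AI. The festival image has both the lowest CLIP score and the lowest aesthetic score, and the astronaut image is the only one rated as likely AI-generated.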
Conclusion
The results show that the generative AI model understood the scenes reasonably well, but fell short on camera position and on the aesthetic target. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is below the “good” range of 0.5 to 0.75. This suggests that the model often did not reproduce the camera position requested in the prompt. For example, the “close-up” requested for the lone hiker came out as a distant silhouette of a couple.
- Shot Analysis: The model scored 0.61, which falls within the “good” range. This indicates that the model was able to understand the scene and create a shot that was generally consistent with the prompt.
- Aesthetic Analysis: The model scored 0.13, which is above the upper bound of the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated from the aesthetic described in the prompts.
Overall, the model demonstrated a good understanding of scene content, but struggled with the requested camera position and the desired aesthetic.
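The comparison of each score with its target range can be written down in a few lines. The sketch below uses only the two ranges that are stated explicitly in this post (0.5 to 0.75 for camera position, -0.2 to 0.1 for aesthetic analysis). The range for shot analysis is not given, so it is left out. `place_in_range` is a hypothetical helper, and treating the bounds as inclusive is an assumption.

```python
def place_in_range(score: float, low: float, high: float) -> str:
    """Say whether `score` lies below, within, or above [low, high] (bounds inclusive)."""
    if low > high:
        raise ValueError("low must not exceed high")
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (score, lower bound, upper bound) as reported in the conclusion.
REPORTED = {
    "camera position": (0.4, 0.5, 0.75),
    "aesthetic analysis": (0.13, -0.2, 0.1),
}

for name, (score, low, high) in REPORTED.items():
    verdict = place_in_range(score, low, high)
    print(f"{name}: {score} is {verdict} the range {low} to {high}")
```

Both reported scores fall outside their ranges, one on the low side and one on the high side, which is why the two criteria are discussed separately above.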
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://fal.ai/models/fal-ai/flux/dev/api