AI's Artistic Eye: Capturing Emotion, Missing the Shot with Flux-pro
Generative models can turn a text description into a finished image, which makes them attractive tools for art and design. Like any emerging technology, they also have limits worth probing. This post examines how well one such model, flux-pro, captures facial expressions and interprets complex scene descriptions, and highlights its strengths and weaknesses.
Created with: flux-pro
Solitude Amidst the Storm
A lone figure stands defiant against the raw power of a stormy sea, their silhouette a stark contrast against the brooding sky. The scene evokes a sense of dramatic solitude and melancholic anticipation, leaving the viewer to ponder the figure’s thoughts and the unknown that lies ahead.
Prompt
facial-expressions Hope: Determined, resilient, facing adversity ; A lone figure standing on a clifftop overlooking a vast, stormy sea; eye-level; Single Person; Dramatic, stormy sky with crashing waves; cinematic
Characteristic
Shot : A lone figure stands on a cliff overlooking a stormy sea. The waves are crashing against the rocks, and the sky is dark and ominous.
Aesthetic Score : 0.75
Mood : dramatic, melancholic, suspenseful
Quality
Entropy : 6.51
Noise : 92
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors.
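Every prompt in this post follows the same semicolon-separated layout, which makes it easy to split into parts when comparing what was asked for against what was generated. The sketch below is a minimal illustration, assuming six fields per prompt. The field names (category, emotion, descriptors, scene, camera, subject, background, style) are labels chosen here for readability and are not part of flux-pro or any documented prompt format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedPrompt:
    # All field names are hypothetical labels, not an official schema.
    category: str            # e.g. "facial-expressions"
    emotion: str             # e.g. "Hope"
    descriptors: List[str]   # e.g. ["Determined", "resilient", ...]
    scene: str
    camera: str
    subject: str
    background: str
    style: str


def parse_prompt(prompt: str) -> ParsedPrompt:
    """Split a prompt of the form used in this post into its six fields."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(parts)}")
    head, scene, camera, subject, background, style = parts

    # The first field looks like "facial-expressions Hope: a, b, c".
    label, _, descriptor_text = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    descriptors = [d.strip() for d in descriptor_text.split(",") if d.strip()]

    return ParsedPrompt(category, emotion.strip(), descriptors,
                        scene, camera, subject, background, style)


example = parse_prompt(
    "facial-expressions Hope: Determined, resilient, facing adversity ; "
    "A lone figure standing on a clifftop overlooking a vast, stormy sea; "
    "eye-level; Single Person; "
    "Dramatic, stormy sky with crashing waves; cinematic"
)
print(example.camera, "|", example.subject, "|", example.descriptors)
```

Splitting the prompt this way makes it possible to check each field, for example the requested camera position, against the Shot description separately.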
Heroic Silhouette: Firefighter Carries Child to Safety
A dramatic image captures a firefighter in full gear, silhouetted against the flames of a burning building, as they carry a child to safety. The scene evokes a sense of heroism, somberness, and the powerful impact of the event.
Prompt
facial-expressions Hope: Brave, selfless, courageous ; A firefighter carrying a child through a burning building; eye-level; Hero; Smoke and flames engulfing the background; cinematic
Characteristic
Shot : A firefighter in full gear is walking through a smoke-filled alleyway, carrying a small child in their arms. The scene is dramatic and slightly unsettling.
Aesthetic Score : 0.6
Mood : dramatic, intense, heroic
Quality
Entropy : 6.77
Noise : 79
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some noise and compression artifacts, especially in the shadows and smoke.
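The Quality blocks report an Entropy value for each image, but the post does not say how it is computed. One common definition, assumed here purely for illustration, is the Shannon entropy (in bits) of the image's 8-bit grayscale histogram, which ranges from 0 for a flat image to 8 when all 256 levels are equally likely. The function name and the flat list-of-pixels input are choices made for this sketch.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this is one plausible reading of the "Entropy" metric,
    not necessarily the computation used to produce the numbers in this post.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    probs = [n / total for n in counts.values()]
    return sum(-p * math.log2(p) for p in probs)


flat_image = [128] * 64            # a single gray level
varied_image = list(range(256))    # every gray level exactly once
print(shannon_entropy(flat_image), shannon_entropy(varied_image))
```

Under this reading, higher values would indicate a wider spread of tones across the image.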
A Seed of Hope in the Desert
A young woman plants a small tree in a sun-drenched desert, a symbol of life and growth amidst the arid landscape. The image evokes a sense of serenity, hope, and optimism, capturing the beauty of resilience in the face of adversity.
Prompt
facial-expressions Hope: Optimistic, hopeful, believing in a better future ; A young woman planting a tree in a barren wasteland; eye-level; Normal Person; Dusty, desolate landscape with a single, hopeful green sprout; cinematic
Characteristic
Shot : A young woman plants a small sapling in a sandy desert environment. Sunlight bathes the scene, giving it a warm tone.
Aesthetic Score : 0.7
Mood : hopeful, serene, environmental
Quality
Entropy : 6.76
Noise : 59
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight overexposure in the sky and some noise in the shadows. The subject’s hand appears slightly blurred, indicating minor motion blur.
Friends Get Competitive in Dimly Lit Gaming Session
A group of friends gather in a dimly lit room, their excitement palpable as they engage in a heated video game session. One friend, wearing headphones, waves playfully at the camera, capturing the energy and fun of the moment.
Prompt
facial-expressions Hope: Excited, triumphant, feeling a sense of accomplishment ; A gamer celebrating a victory with their team, their faces illuminated by the glow of the monitor; eye-level; Gamer; A dimly lit room with gaming peripherals and posters on the walls; cinematic
Characteristic
Shot : A group of friends are gathered around a computer, playing video games. The room is dimly lit and decorated with posters.
Aesthetic Score : 0.6
Mood : energetic, fun, playful
Quality
Entropy : 6.63
Noise : 70
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some noise and blur in the background, potentially from low light conditions.
A Single Flame in the Darkness
A person holds a flickering candle, its light illuminating their cupped hands against a backdrop of shadow. The scene evokes a sense of hope, solemnity, and introspection, with the dim lighting adding an air of mystery and intimacy.
Prompt
facial-expressions Hope: Hopeful, comforting, a beacon of light in the darkness ; A single candle burning brightly in a dark room; eye-level; Single Person; Shadows and darkness surrounding the candle; cinematic
Characteristic
Shot : A person is holding a lit candle in their cupped hands. The image is dark, with only the candle flame and hands illuminated.
Aesthetic Score : 0.7
Mood : hopeful, somber, contemplative
Quality
Entropy : 5.98
Noise : 58
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight graininess and some noise.
New Life, Overflowing Joy: A Moment of Pure Love
A tender image captures the overwhelming joy of new parenthood. A woman beams with love as she cradles her newborn, swaddled in a blue blanket. The soft lighting and gentle smile create a heartwarming atmosphere, highlighting the deep bond between mother and child.
Prompt
facial-expressions Hope: Joyful, hopeful, a symbol of new beginnings ; A doctor holding a newborn baby in their arms; eye-level; Hero; A sterile hospital room with medical equipment in the background; cinematic
Characteristic
Shot : A woman is holding a newborn baby in a hospital room. The woman is wearing blue scrubs and the baby is wearing a blue and white hat. There is another person in the background wearing blue scrubs.
Aesthetic Score : 0.7
Mood : tender, heartwarming, joyful
Quality
Entropy : 6.71
Noise : 58
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : None.
Warmth and Connection: A Shared Meal Brings Joy
This heartwarming scene captures the essence of togetherness. Four friends gather around a table, bathed in warm light, their smiles radiating joy and connection. The cozy dining room setting, with natural light streaming in, adds to the sense of intimacy and comfort. This image celebrates the simple pleasures of shared meals and the bonds that unite us.
Prompt
facial-expressions Hope: Warm, comforting, a sense of belonging ; A group of friends sharing a meal together in a cozy kitchen; eye-level; Normal People; Warm, inviting kitchen with sunlight streaming through the window; cinematic
Characteristic
Shot : A group of four people are gathered around a table, sharing a meal, with a window in the background. The warm light suggests a cozy setting, perhaps a family gathering.
Aesthetic Score : 0.7
Mood : joyful, intimate, warm
Quality
Entropy : 6.66
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no major image errors.
Immersed in the Game: A Moment of Focused Intensity
A young gamer, bathed in warm light, is completely engrossed in their game. Their focused expression and the dramatic lighting capture the intensity and immersion of the gaming experience.
Prompt
facial-expressions Hope: Determined, focused, persevering ; A gamer overcoming a difficult challenge in a video game, their face showing determination and focus; eye-level; Gamer; A brightly lit room with a large monitor displaying the game; cinematic
Characteristic
Shot : A young person is playing a video game on their computer. They are wearing headphones and have a serious expression on their face, focused on the game. The scene is lit with blue and red hues, creating a dramatic atmosphere.
Aesthetic Score : 0.7
Mood : focused, intense, determined
Quality
Entropy : 6.69
Noise : 66
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant image errors. The image is sharp and well-composed. There are some minor artifacts in the background but these are not distracting.
Soaring High with Joy
A young woman embraces the sunshine and blue skies, leaping into the air with a carefree spirit. Her outstretched arms and bright smile capture the essence of pure happiness and freedom.
Prompt
facial-expressions Hope: Free, hopeful, a symbol of liberation ; Soaring through blue sky; eye-level; Single Person; Vast, open sky with fluffy white clouds; cinematic
Characteristic
Shot : A woman is jumping in the air with her arms outstretched, against a bright blue sky with white clouds. The sun is shining brightly in the background.
Aesthetic Score : 0.7
Mood : joyful, carefree, uplifting
Quality
Entropy : 5.99
Noise : 60
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts in the image, such as some banding in the sky and a slight blur around the woman’s hair.
Silhouettes of Hope: A Sunset Symphony of Unity
A vibrant sunset paints the sky as a group of individuals, hand in hand, stand in silhouette. Their unity and shared purpose are palpable, radiating a sense of hope and positivity.
Prompt
facial-expressions Hope: United, hopeful, facing the future together ; A group of people standing together, arms linked, facing a bright sunrise; eye-level; Heroes; A vast, open field with a golden sunrise in the background; cinematic
Characteristic
Shot : A group of people silhouetted against a sunset, holding hands in a circle.
Aesthetic Score : 0.6
Mood : hopeful, peaceful, united
Quality
Entropy : 6.54
Noise : 55
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to be slightly overexposed, which washes out some of the detail in the sky.
Conclusion
The results show that the generative AI model did better at matching the intended aesthetic than at interpreting the scene and camera position. Here’s a breakdown:
- Camera Position: The model scored 0.1, indicating it did not perform well in capturing the intended camera position. This suggests the model may not be very sensitive to camera position instructions.
- Shot Analysis: The model scored 0.46, which is below average. This means the model had some difficulty understanding the scene described in the prompt and translating it into the generated image.
- Aesthetic Analysis: The model scored 0.08, which is very good. This indicates that the generated image closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model may need further training to improve its ability to interpret and translate complex scene descriptions and camera positions.
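For a quick view across all ten images, the per-image numbers reported in the sections above can be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values in order of appearance. The variable names are chosen for this example, and these averages are separate from the three scores in the breakdown above, whose computation the post does not describe.

```python
from statistics import mean

# Values copied from the ten image sections, in order of appearance.
per_image = {
    "aesthetic_score": [0.75, 0.6, 0.7, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6],
    "prompt_clip_score": [0.28, 0.30, 0.30, 0.28, 0.24, 0.30, 0.28, 0.26, 0.22, 0.27],
    "likelihood_of_ai": [0.10, 0.30, 0.10, 0.10, 0.20, 0.20, 0.10, 0.20, 0.30, 0.30],
}

# Average each metric and round to three decimals for display.
summary = {name: round(mean(values), 3) for name, values in per_image.items()}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The averages come out to roughly 0.675 for aesthetic score, 0.273 for prompt CLIP score, and 0.19 for likelihood of AI.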