AI's Facial Expressions: A Mixed Bag with Flux-pro
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. In AI-generated content, capturing these nuances is crucial for creating engaging and realistic narratives. This experiment assessed how well an AI model generates images with specific facial expressions and scene settings. The model showed some success in shot analysis, but it fell short on camera position and on capturing the desired aesthetic. This blog post explores the results in detail, highlights the model’s strengths and weaknesses, and discusses the potential for improvement in future iterations.
Created with: flux-pro
Lost in Thought, Piece by Piece
A man sits at a table, his brow furrowed in concentration as he works on a puzzle. The soft, natural light and intimate setting create a sense of quiet contemplation, capturing a moment of focused introspection.
Prompt
facial-expressions Boredom: Apathy and resignation. ; A single person; eye-level; Single Persons; A cluttered apartment with unwashed dishes and a half-finished puzzle on the table.; cinematic
Characteristic
Shot : A man sits at a table, focused on a jigsaw puzzle, looking down with a serious expression. The scene is lit by natural light from a window. The interior is minimalist and calming.
Aesthetic Score : 0.6
Mood : introspective, calm, focused
Quality
Entropy : 6.81
Noise : 73
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise is visible in the shadows and there is a slight blur in the background.
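The prompts in this post all follow the same semicolon-separated layout. As a rough sketch of how such a prompt could be split into its parts, the snippet below parses the first prompt. The field names (`expression`, `subject`, `camera`, `category`, `setting`, `style`) are assumptions made for illustration; the post does not label the segments.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative guesses, not labels taken from the post.
    expression: str
    subject: str
    camera: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six segments."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments, got {len(parts)}")
    return PromptParts(*parts)


example = (
    "facial-expressions Boredom: Apathy and resignation. ; A single person; "
    "eye-level; Single Persons; A cluttered apartment with unwashed dishes "
    "and a half-finished puzzle on the table.; cinematic"
)
parsed = parse_prompt(example)
print(parsed.camera)
print(parsed.setting)
```

Splitting the prompt this way makes it easier to compare what was requested (for example the camera segment) with what the shot description reports for each image.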
A Hero’s Burden: The Weight of a Ruined World
A powerful superhero, cloaked in blue and red, sits amidst the ruins of a shattered city. Their pensive gaze reflects the weight of their responsibility and the uncertainty of the future. This dramatic scene evokes a sense of melancholy and power, leaving the viewer questioning the hero’s fate and the city’s hope for redemption.
Prompt
facial-expressions Boredom: Disillusionment and weariness. ; A superhero; eye-level; Heroes; A deserted cityscape with crumbling buildings and graffiti.; cinematic
Characteristic
Shot : A superhero in a dark costume sits in a deserted city street, seemingly lost in thought, with a backdrop of faded graffiti and abandoned buildings.
Aesthetic Score : 0.6
Mood : melancholic, mysterious, brooding
Quality
Entropy : 6.67
Noise : 95
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors in the image.
Lost in Thought: A Moment of Quiet Reflection
A young woman with long brown hair sits on a bus, her gaze lost in the passing scenery. The soft, muted lighting creates an intimate and introspective atmosphere, highlighting her pensive mood. The woman’s focus on the world outside, rather than the viewer, adds a layer of mystery and invites contemplation.
Prompt
facial-expressions Boredom: Annoyance and detachment. ; A young woman; eye-level; Normal People; A crowded bus with people staring at their phones.; cinematic
Characteristic
Shot : A young woman sits on a bus looking out the window. Other passengers are also visible, but the woman is the focus of the image.
Aesthetic Score : 0.7
Mood : pensive, wistful, contemplative
Quality
Entropy : 6.75
Noise : 75
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors.
Lost in the Shadows: A Man’s Contemplative Focus
A dimly lit room, a solitary figure hunched over a computer screen. The low lighting casts long shadows, adding an air of mystery to the scene. The man’s focused expression suggests deep contemplation, leaving the viewer to wonder what secrets lie within the digital world.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A gamer; close-up; Gamer; A dimly lit room with a computer screen displaying a paused game.; cinematic
Characteristic
Shot : A young man is sitting in a dimly lit room, looking intently at a computer screen. He is wearing a dark hoodie and has a serious expression on his face.
Aesthetic Score : 0.6
Mood : focused, serious, contemplative
Quality
Entropy : 6.51
Noise : 66
Prompt Clip Score : 0.12
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some noise and grain are present in the image.
Autumn Reflections: A Moment of Contemplation
An elderly man sits on a park bench, lost in thought as the autumn leaves paint the background in muted hues. His contemplative gaze and the melancholic atmosphere evoke a sense of quiet solitude and nostalgia.
Prompt
facial-expressions Boredom: Melancholy and loneliness. ; An elderly man; eye-level; Single Persons; A park bench with fallen leaves and a deserted playground.; cinematic
Characteristic
Shot : An older man sits on a park bench with autumn leaves on the ground.
Aesthetic Score : 0.6
Mood : pensive, lonely, contemplative
Quality
Entropy : 6.86
Noise : 84
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some slight noise in the image, especially in the background.
Shadows and Secrets: A Man in the Neon Light
A figure shrouded in mystery. A man in a brown jacket and tie stands in a dimly lit room, bathed in the eerie glow of a neon sign. Bookshelves line the walls, hinting at a world of knowledge and secrets. His gaze is fixed on the viewer, a sense of tension and intrigue hanging in the air. What story does this enigmatic scene hold?
Prompt
facial-expressions Boredom: Frustration and boredom. ; A detective; eye-level; Heroes; A dimly lit office with stacks of unsolved cases and a flickering neon sign.; cinematic
Characteristic
Shot : A man in a brown jacket stands in an office with a neon sign reading ‘PEASON’ behind him.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, moody
Quality
Entropy : 6.29
Noise : 66
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy and the edges are slightly blurry.
Intimate Moments in the Candlelight
A couple shares a quiet conversation in a dimly lit restaurant, their expressions hinting at a mix of intimacy, thoughtfulness, and perhaps a touch of melancholy. The soft glow of the candles adds a layer of mystery and depth to the scene.
Prompt
facial-expressions Boredom: Awkward silence and boredom. ; A young couple; eye-level; Normal People; A restaurant table with empty plates and a half-finished bottle of wine.; cinematic
Characteristic
Shot : A couple sits at a table in a dimly lit restaurant. The woman is looking down while the man is looking at her.
Aesthetic Score : 0.6
Mood : romantic, intimate, pensive
Quality
Entropy : 6.54
Noise : 72
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, although there is slight blurriness in some areas due to the low lighting.
Lost in the Digital Realm: A Gamer’s Focus
A young man, headphones on, is completely immersed in the digital world. The vibrant green and blue interface on his screen, combined with the contrasting red and blue lighting, creates a sense of mystery and intensity. His focused expression and the blurred background further emphasize his concentration, leaving the viewer wondering what challenges he faces within this digital realm.
Prompt
facial-expressions Boredom: Monotony and boredom. ; A gamer; close-up; Gamer; A brightly lit room with a computer screen displaying a repetitive, simple game.; cinematic
Characteristic
Shot : A young man is sitting in front of a computer screen, illuminated by blue and red lights, looking at the screen. The scene appears to be in a dimly lit room.
Aesthetic Score : 0.6
Mood : serious, concentrated, tech
Quality
Entropy : 6.75
Noise : 63
Prompt Clip Score : 0.10
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly grainy, likely due to compression or low lighting.
A Moment of Tranquility on the Train
Three young travelers find solace in the rhythm of the train journey. Bathed in natural light, the woman in the center, lost in her book, embodies a sense of calm contemplation. The image captures a peaceful moment of introspection, inviting viewers to share in the quiet beauty of the scene.
Prompt
facial-expressions Boredom: Isolation and boredom. ; A woman; eye-level; Single Persons; A crowded train with people reading, sleeping, and staring blankly.; cinematic
Characteristic
Shot : Three young people sit on a train. One of them is reading a book.
Aesthetic Score : 0.7
Mood : calm, pensive, nostalgic
Quality
Entropy : 6.72
Noise : 84
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts in the form of pixelation and noise, particularly in the shadows. The lighting is uneven, creating some highlights and shadows that are too harsh.
The Weight of Duty: A Soldier’s Contemplative Gaze
A close-up shot captures the serious expression of a soldier in a helmet, his gaze directed off-camera. The dramatic lighting highlights his weathered face, conveying a sense of contemplation and the weight of his duty.
Prompt
facial-expressions Boredom: Despair and boredom. ; A soldier; eye-level; Heroes; A desolate desert landscape with a lone watchtower in the distance.; cinematic
Characteristic
Shot : A close-up portrait of a man wearing a military helmet, likely in a desert or arid environment. The background is out of focus, suggesting a strong emphasis on the subject.
Aesthetic Score : 0.6
Mood : serious, contemplative, weathered
Quality
Entropy : 6.74
Noise : 73
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There’s a slight blur in the background that could be due to motion blur or a technical error. The image appears to be slightly overexposed, leading to a slight washed-out effect.
Conclusion
The results show that the generative AI model performed reasonably well on shot analysis, but fell short on camera position and aesthetic analysis. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t quite capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.6, which falls within the “good” range. This indicates that the model was able to understand the scene in the prompt reasonably well, but could be better.
- Aesthetic Analysis: The model scored 0.02, which is far from the “very good” range of -0.2 to 0.1. This means the generated image’s aesthetic deviated significantly from the expected aesthetic described in the prompt.
Overall, the model needs improvement in understanding and implementing the desired aesthetic and camera positions.
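To put the per-image numbers side by side, here is a minimal sketch that averages the Aesthetic Score and Prompt Clip Score values listed for the ten images above. The short labels and the tuple layout are assumptions made for this example. These averages are not the camera position, shot analysis, or aesthetic analysis scores from the breakdown, since the post does not describe how those were computed.

```python
from statistics import mean

# (label, aesthetic score, prompt CLIP score, requested camera) per image,
# copied from the result listings in this post. Labels are shorthand.
RESULTS = [
    ("puzzle", 0.6, 0.20, "eye-level"),
    ("superhero", 0.6, 0.18, "eye-level"),
    ("bus", 0.7, 0.23, "eye-level"),
    ("gamer, dim room", 0.6, 0.12, "close-up"),
    ("park bench", 0.6, 0.23, "eye-level"),
    ("detective", 0.7, 0.22, "eye-level"),
    ("restaurant", 0.6, 0.20, "eye-level"),
    ("gamer, bright room", 0.6, 0.10, "close-up"),
    ("train", 0.7, 0.24, "eye-level"),
    ("soldier", 0.6, 0.25, "eye-level"),
]

mean_aesthetic = mean(r[1] for r in RESULTS)
mean_clip = mean(r[2] for r in RESULTS)

# Group the prompt CLIP scores by the camera position requested in the prompt.
by_camera = {}
for _, _, clip, camera in RESULTS:
    by_camera.setdefault(camera, []).append(clip)
clip_by_camera = {cam: mean(vals) for cam, vals in by_camera.items()}

# Image whose prompt CLIP score is lowest.
lowest = min(RESULTS, key=lambda r: r[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
for cam, value in clip_by_camera.items():
    print(f"mean prompt CLIP score ({cam}): {value:.3f}")
print(f"lowest prompt CLIP score: {lowest[0]} ({lowest[2]:.2f})")
```

With only ten images, and just two of them requesting a close-up, any difference between the groups is an observation about this sample and not a general finding about the model.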