Flux-dev Captures the Essence of Scenes, But Struggles with Aesthetics
Generating realistic, expressive faces is a crucial part of creating compelling and engaging visual content. This blog post examines how well a generative AI model captures the nuances of human emotion through facial expressions. We’ll explore the model’s strengths and weaknesses, analyzing how well it follows scene descriptions, camera positions, and aesthetic styles. Through this analysis, we’ll gain insights into the current state of AI’s ability to translate human emotions into visual art.
Created with: flux-dev
Lost in the City Lights
A solitary figure walks through a bustling cityscape, the blur of the background highlighting her isolation and introspective mood. The woman’s dark jacket and the strap across her chest add to the sense of mystery and melancholy.
Prompt
facial-expressions Guilt: Lonely, isolated, rejected ; A woman walking away from a group of friends; long shot; Single Person; A bustling city street, people rushing by; cinematic
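The prompts in this post appear to share a semicolon-separated layout: an emotion label, a subject, a camera shot, a character type, a setting, and a style. Here is a minimal sketch of splitting a prompt into those parts, assuming that layout holds. The field names and the `parse_prompt` helper are my own labels for illustration, not part of the generation pipeline.

```python
from dataclasses import dataclass

# Hypothetical field names; the post does not name the prompt segments.
FIELD_NAMES = ("emotion", "subject", "shot", "character", "setting", "style")


@dataclass
class ScenePrompt:
    emotion: str
    subject: str
    shot: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a prompt on ';' and map the pieces onto the assumed fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} segments, got {len(parts)}"
        )
    return ScenePrompt(*parts)


example = (
    "facial-expressions Guilt: Lonely, isolated, rejected ; "
    "A woman walking away from a group of friends; long shot; "
    "Single Person; A bustling city street, people rushing by; cinematic"
)
parsed = parse_prompt(example)
print(parsed.shot)     # long shot
print(parsed.setting)  # A bustling city street, people rushing by
```

Separating the requested shot from the rest of the prompt makes it easier to compare the camera position that was asked for against the one that was generated.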
Characteristic
Shot : A woman is walking down a city street, her back to the camera. The street is blurred and out of focus, and there are lights and signs in the background.
Aesthetic Score : 0.5
Mood : lonely, urban, introspective
Quality
Entropy : 6.58
Noise : 63
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
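The post does not say how the Entropy value is computed. One common reading is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 (a single flat tone) to 8 bits (all 256 levels equally likely). Under that assumption, values around 6.5 would indicate a fairly wide spread of tones. The sketch below works on a plain list of pixel intensities and is only an illustration of that interpretation.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit intensity values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum of p * log2(1 / p) over every intensity level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 256          # one tone only: no information
uniform = list(range(256))  # every level exactly once: maximum spread

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```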
A Lone Figure Stands Against the Ashes of a Lost World
A solitary figure, cloaked in red, stands defiant against a backdrop of smoke and fire. The scene evokes a sense of loss and power, with a lone man lying in the foreground, highlighting the stark reality of a post-apocalyptic wasteland. The dramatic use of light and shadow and the contrast between the lone figure and the distant group create a powerful and somber mood.
Prompt
facial-expressions Guilt: Torn, conflicted, remorseful ; A hero, standing over a fallen villain; medium shot; Hero; A battlefield, smoke and debris everywhere; cinematic
Characteristic
Shot : A lone figure, cloaked in red, stands before a dying fire and a fallen warrior, a backdrop of smoke and distant figures.
Aesthetic Score : 0.7
Mood : melancholy, somber, dramatic
Quality
Entropy : 6.48
Noise : 74
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : No noticeable errors or artifacts.
Lost in the Shadows: A Man’s Mysterious Journey
A solitary figure, cloaked in darkness, walks a path illuminated only by flickering streetlights. The low-key lighting and blurred background create a sense of intrigue and mystery, leaving the viewer to wonder about the man’s destination and the secrets he carries.
Prompt
facial-expressions Guilt: Desolate, regretful ; A lone figure; eye-level; Single Person; Empty street at night, rain falling; cinematic
Characteristic
Shot : A man in a dark coat walks alone down a deserted street at night, illuminated by streetlights. The background is blurry and the overall feel is dark and mysterious.
Aesthetic Score : 0.5
Mood : dark, mysterious, lonely
Quality
Entropy : 6.52
Noise : 58
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.30
Image errors : Slight noise in the background and some slight blurriness around the edges of the subject. There are no significant image errors.
Lost in the Shadows: A Man’s Solitary Moment at a Dimly Lit Party
A man in a suit stands alone in the heart of a dimly lit party, his presence a stark contrast to the blurry figures surrounding him. The low light and shadows cast an air of mystery and intrigue, hinting at a story waiting to be told.
Prompt
facial-expressions Guilt: Alienated, invisible ; A man standing in a crowded room, looking lost; wide shot; Single Person; A party, people laughing and dancing, oblivious to him; cinematic
Characteristic
Shot : A man in a suit stands in a dimly lit room with many people around him.
Aesthetic Score : 0.6
Mood : mysterious, enigmatic, dark
Quality
Entropy : 5.81
Noise : 37
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some noise and grain, especially in the darker areas. Some blurring can be seen around the edges.
The Man of Steel, Contemplating the City’s Fate
A powerful image of a superhero in a Superman suit, standing against a city skyline. The dramatic lighting and pensive expression evoke a sense of seriousness and heroic weight.
Prompt
facial-expressions Guilt: Heavy, burdened, conflicted ; A superhero, cape billowing in the wind; medium shot; Hero; City skyline, destroyed buildings in the background; cinematic
Characteristic
Shot : A man dressed as Superman stands against a backdrop of a city.
Aesthetic Score : 0.7
Mood : serious, dramatic, superheroic
Quality
Entropy : 6.43
Noise : 63
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and compression artifacts in the background.
Silhouetted in Solitude: A Moment of Contemplation
A lone figure stands against a moonlit cityscape, their silhouette a stark contrast against the blur of lights and shapes. The scene evokes a sense of melancholy and introspection, capturing a moment of quiet contemplation in the face of urban vastness.
Prompt
facial-expressions Guilt: Reflective, contemplative, seeking redemption ; A hero, standing on a rooftop, looking out at the city; wide shot; Hero; A cityscape bathed in moonlight, a sense of peace; cinematic
Characteristic
Shot : A lone man stands silhouetted against a cityscape with a bright moon in the sky. The city lights are visible in the distance, creating a sense of urban loneliness.
Aesthetic Score : 0.6
Mood : melancholy, solitude, introspective
Quality
Entropy : 6.61
Noise : 38
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight haze or graininess, possibly due to post-processing. The silhouette is also slightly pixelated, particularly in the hair.
Lost in the Digital Labyrinth
A young man, shrouded in shadows, sits intently before a glowing computer screen. The dim light and reflective surface create an atmosphere of mystery and focus, hinting at a world hidden within the digital realm.
Prompt
facial-expressions Guilt: Isolated, self-loathing ; A gamer, hunched over a computer screen; close-up; Gamer; Neon lights reflecting in their eyes, empty pizza boxes scattered around; cinematic
Characteristic
Shot : A man is sitting in front of a computer screen in a dimly lit room. The screen is showing code.
Aesthetic Score : 0.5
Mood : focused, tech, dark
Quality
Entropy : 6.56
Noise : 68
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors or artifacts are visible.
Secrets Shared in the Shadows: A Dinner of Intrigue
Four figures gather around a dimly lit table, their faces illuminated by flickering candlelight. The air is thick with unspoken words and a sense of shared history. Is this a celebration or a reckoning? The mood is intimate, somber, and contemplative, leaving the viewer to ponder the secrets hidden beneath the surface.
Prompt
facial-expressions Guilt: Awkward, strained, unspoken ; A family gathered around a table, but the atmosphere is tense; medium shot; Normal People; A dimly lit dining room, empty chairs at the table; cinematic
Characteristic
Shot : A group of people are sitting around a table, likely at a dinner party, in a dimly lit room.
Aesthetic Score : 0.5
Mood : intimate, somber, pensive
Quality
Entropy : 6.24
Noise : 54
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight color cast and some noise in the darker areas.
Lost in the Game: A Moment of Solitary Focus
A young man, bathed in the soft glow of a large screen, is completely absorbed in his video game. The dim lighting and his pensive expression create a sense of solitude and intense concentration, highlighting the immersive power of gaming.
Prompt
facial-expressions Guilt: Disillusioned, defeated, empty ; A gamer, staring at a blank screen, controller in hand; close-up; Gamer; A dimly lit room, empty energy drink cans scattered around; cinematic
Characteristic
Shot : A young man sits in a dimly lit living room, looking at a screen and holding a controller in his hands. A few cans and a TV are visible in the background, and it is dark outside the window.
Aesthetic Score : 0.4
Mood : lonely, contemplative, quiet
Quality
Entropy : 6.19
Noise : 40
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slightly grainy texture and there is some noise in the shadows.
Lost in Memories: A Moment of Melancholy in the Kitchen
A young woman stands in a kitchen, her gaze fixed on a photograph. The scene evokes a sense of melancholy and nostalgia, as her posture and expression suggest a deep contemplation of the past. The cabinets and counters in the background provide a sense of domesticity, highlighting the personal nature of her reflection.
Prompt
facial-expressions Guilt: Nostalgic, melancholic ; A woman holding a photo of a loved one; close-up; Normal Person; A cluttered kitchen, dishes piled in the sink; cinematic
Characteristic
Shot : A woman in a kitchen is looking at a photograph of another woman. The scene is lit by natural light, and the kitchen is slightly messy.
Aesthetic Score : 0.6
Mood : melancholy, introspective, somber
Quality
Entropy : 6.65
Noise : 64
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor blurriness, especially in the background.
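To get a feel for the per-image numbers as a group, the sketch below collects the Aesthetic Score and Prompt Clip Score of the ten images above and computes simple averages. This is only an illustrative summary. The post does not say how its overall figures are derived, so these averages should not be read as the method behind them.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score), copied from the ten images above
results = [
    ("Lost in the City Lights", 0.5, 0.20),
    ("A Lone Figure Stands Against the Ashes of a Lost World", 0.7, 0.26),
    ("Lost in the Shadows: A Man’s Mysterious Journey", 0.5, 0.22),
    ("Lost in the Shadows: A Man’s Solitary Moment at a Dimly Lit Party", 0.6, 0.23),
    ("The Man of Steel, Contemplating the City’s Fate", 0.7, 0.25),
    ("Silhouetted in Solitude: A Moment of Contemplation", 0.6, 0.25),
    ("Lost in the Digital Labyrinth", 0.5, 0.18),
    ("Secrets Shared in the Shadows: A Dinner of Intrigue", 0.5, 0.28),
    ("Lost in the Game: A Moment of Solitary Focus", 0.4, 0.30),
    ("Lost in Memories: A Moment of Melancholy in the Kitchen", 0.6, 0.29),
]

mean_aesthetic = mean(score for _, score, _ in results)
mean_clip = mean(clip for _, _, clip in results)

# Images whose prompt CLIP score is highest and lowest
best = max(results, key=lambda row: row[2])
worst = min(results, key=lambda row: row[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")  # 0.56
print(f"mean prompt CLIP score: {mean_clip:.3f}")     # 0.246
print(f"highest CLIP score: {best[0]} ({best[2]:.2f})")
print(f"lowest CLIP score: {worst[0]} ({worst[2]:.2f})")
```

In this set, the image with the highest Prompt Clip Score ("Lost in the Game", 0.30) also has the lowest Aesthetic Score (0.4), a small illustration that following the prompt and looking good are measured separately.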
Conclusion
The analysis shows that the generative AI model understood the scenes well, but was weaker on camera position and struggled most with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is below the “good” range of 0.5 to 0.75. This suggests the generated images did not reliably match the camera position requested in the prompt.
- Shot Analysis: The model scored 0.57, which falls within the “good” range. This indicates the model understood the scene described in the prompt and produced images that reflect it.
- Aesthetic Analysis: The model scored 0.16, which is above the “very good” range of -0.2 to 0.1. This suggests the aesthetic of the generated images deviated from the one requested in the prompt.
Overall, the model demonstrated a good understanding of scene content, but fell short on camera position and struggled to achieve the desired aesthetic.
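The breakdown above compares each score against a target range. A minimal sketch of that comparison follows, using the scores and ranges quoted in the conclusion. The `position_in_range` helper is hypothetical, and the range used for Shot Analysis is an assumption: the post only says that 0.57 is "good", so the sketch reuses the 0.5 to 0.75 range given for Camera Position.

```python
def position_in_range(score, low, high):
    """Report whether score is below, within, or above [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# metric name -> (score, range low, range high), as quoted in the conclusion
checks = {
    "Camera Position": (0.35, 0.5, 0.75),
    # Assumption: same "good" range as Camera Position; the post does not state it.
    "Shot Analysis": (0.57, 0.5, 0.75),
    "Aesthetic Analysis": (0.16, -0.2, 0.1),
}

for name, (score, low, high) in checks.items():
    verdict = position_in_range(score, low, high)
    print(f"{name}: {score} is {verdict} the range {low} to {high}")
```

Reporting "below" or "above" rather than a bare pass or fail keeps the direction of the miss visible, which matters here because the camera score undershoots its range while the aesthetic score overshoots.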