AI Struggles with Camera Angles, but Nails Disgust in Facial Expressions with Flux-schnell
Generating realistic, expressive images remains a hard problem for generative AI. This experiment tested how well a generative model, flux-schnell, depicts disgust through facial expressions. The model was given ten scenario prompts, each featuring a character experiencing disgust and each requesting an eye-level camera. While the model excelled at capturing the emotion itself, it struggled to implement the intended camera position, which shows how much development is still needed before AI can understand and replicate complex visual elements. This blog post walks through the results image by image and summarizes the model's strengths and weaknesses in capturing disgust and camera placement.
Created with: flux-schnell
Mystery in the Shadows: A Hooded Figure Haunts a Dark Alley
A low-angle shot captures a hooded figure standing in a dimly lit alleyway, their face obscured by the shadows. Overflowing garbage bins line the walls, adding to the sense of darkness and mystery. The mood is heavy with loneliness and intrigue, leaving viewers wondering who this figure is and what secrets they hold.
Prompt
facial-expressions Disgust: Despair and alienation ; A lone figure, hunched over in a dimly lit alleyway; eye-level; Single Person; overflowing trash bins and graffiti-covered walls; cinematic
Characteristic
Shot : A hooded figure is standing in a narrow alleyway lined with trash bins. The alleyway is dimly lit and there is a sense of mystery and intrigue.
Aesthetic Score : 0.6
Mood : dark, mysterious, lonely
Quality
Entropy : 6.51
Noise : 99
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
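The prompts in this post all follow the same semicolon-separated layout: a category and emotion, a subject, a camera position, a character type, a setting, and a style. The sketch below splits a prompt into those parts so the requested camera position can be compared with what the image shows. The field names (category, emotion, subject, camera, character, setting, style) are assumptions chosen for illustration; the post does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative assumptions, not taken from the post.
    category: str   # e.g. "facial-expressions Disgust"
    emotion: str    # e.g. "Despair and alienation"
    subject: str
    camera: str     # e.g. "eye-level"
    character: str  # e.g. "Single Person"
    setting: str
    style: str      # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six fields."""
    fields = [field.strip() for field in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    # The first field has the form "<category>: <emotion>".
    category, _, emotion = fields[0].partition(":")
    return PromptParts(category.strip(), emotion.strip(), *fields[1:])


example = parse_prompt(
    "facial-expressions Disgust: Despair and alienation ; "
    "A lone figure, hunched over in a dimly lit alleyway; eye-level; "
    "Single Person; overflowing trash bins and graffiti-covered walls; cinematic"
)
print(example.camera)
```

For the prompt above, the camera field is "eye-level", while the description of the generated image calls it a low-angle shot.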
Blood and Fury: A Close-Up of a Masked Man’s Rage
A chilling close-up reveals a masked figure, his face smeared with blood, radiating an intense and dramatic fury. The red cape adds to the sense of danger and mystery, leaving the viewer questioning the story behind this powerful image.
Prompt
facial-expressions Disgust: Horror and disgust ; A superhero, their face contorted in revulsion, as they witness a horrific crime; eye-level; Hero; a chaotic crime scene with blood and debris; cinematic
Characteristic
Shot : A close-up of a man’s face, the man is wearing a mask and a red cape, he appears to be in pain.
Aesthetic Score : 0.6
Mood : intense, dramatic, gritty
Quality
Entropy : 6.55
Noise : 83
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to be AI generated, with some unnatural textures and detail.
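The post does not say how the Entropy value under Quality is computed. Values between 6 and 7 would be consistent with the Shannon entropy, in bits, of an 8-bit grayscale histogram, which has a maximum of 8. Under that assumption, a minimal sketch looks like this; the function name and the flat list of pixel values as input are illustrative choices.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a sequence of pixel intensities.

    Assumes `pixels` is a flat iterable of integer gray levels (0-255).
    This is one common definition of image entropy; it is not confirmed
    to be the metric used for the values in this post.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum -p * log2(p) over every gray level that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat_image = [128] * 100          # a single gray level carries no information
two_tone = [0] * 50 + [255] * 50  # two equally likely levels carry one bit
print(shannon_entropy(flat_image), shannon_entropy(two_tone))
```

For a real image, the input would be the flattened 8-bit grayscale pixel values, loaded with an imaging library of your choice. Higher values mean the intensities are spread more evenly across the available gray levels.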
A Moment of Quiet Reflection in the Kitchen
A woman gazes thoughtfully at the camera, a plate of unappetizing food before her. The mundane setting of a kitchen, with a washing machine in the background, adds to the sense of melancholy and routine. The image captures a fleeting moment of quiet contemplation, devoid of any dramatic flair.
Prompt
facial-expressions Disgust: Disappointment and disgust ; A young woman, her face pale and wrinkled, as she stares at a plate of spoiled food; eye-level; Normal Person; a cluttered kitchen with dirty dishes and a overflowing trash can; cinematic
Characteristic
Shot : A woman is looking at the camera with a plate of food in front of her, a washing machine is in the background
Aesthetic Score : 0.4
Mood : unhappy, tired, pensive
Quality
Entropy : 6.73
Noise : 82
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and graininess, particularly in the shadows.
Virtual Reality’s Terrifying Turn: A Close-Up on Fear
A young person’s face, frozen in a mask of terror, reveals the unsettling power of virtual reality. The close-up shot captures their wide-eyed shock, leaving viewers to wonder what horrors lurk within the digital realm.
Prompt
facial-expressions Disgust: Unease and disgust ; A gamer, their eyes wide with disgust, as they witness a grotesque scene in a virtual reality game; eye-level; Gamer; a brightly lit gaming room with multiple monitors and controllers; cinematic
Characteristic
Shot : A young person wearing a VR headset and headphones looks surprised and scared. The person is in a dimly lit room with a gaming setup around them.
Aesthetic Score : 0.3
Mood : surprise, fear, intense
Quality
Entropy : 6.58
Noise : 68
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise and grain in the image, especially in the background. The colors are slightly muted and the image has a slightly blurry appearance.
Anger in the Alley
A man, his face etched with frustration, stands amidst overflowing trash cans in a dimly lit alley. The gritty urban setting amplifies the intensity of the moment, creating a powerful and dramatic image.
Prompt
facial-expressions Disgust: Repulsion and disgust ; A man, his face contorted in disgust, as he walks past a pile of rotting garbage; eye-level; Single Person; a dirty and neglected street with overflowing trash cans; cinematic
Characteristic
Shot : A man in a dark shirt is walking through an alley with overflowing garbage bins on both sides. The scene looks dirty and neglected.
Aesthetic Score : 0.3
Mood : dark, gritty, gloomy
Quality
Entropy : 6.84
Noise : 110
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a bit blurry and the color is a bit flat.
Masked Encounter: Tension Rises in the City Shadows
Two figures, cloaked in darkness, confront each other in a dimly lit urban setting. The masked man’s expression is unreadable, while his opponent seems wary. The close-up shot amplifies the intensity of the moment, leaving the viewer to wonder what secrets lie behind the masks and what fate awaits these men.
Prompt
facial-expressions Disgust: Anger and disgust ; A superhero, their face etched with disgust, as they confront a villain who has committed a heinous act; eye-level; Hero; a dark and smoky cityscape with a towering villainous figure; cinematic
Characteristic
Shot : Two men facing each other. One is wearing a mask and the other has a serious expression. There is an urban background.
Aesthetic Score : 0.6
Mood : intense, mysterious, dramatic
Quality
Entropy : 6.65
Noise : 79
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight noise and artifacting, especially in the darker areas.
Unexpected Guest: A Rat Crashes Family Kitchen Fun
A quirky and casual scene unfolds in a kitchen, with a family enjoying their day. But the unexpected appearance of a rat in the foreground adds a touch of absurdity and light-hearted surprise to the moment.
Prompt
facial-expressions Disgust: Horror and disgust ; A family, their faces twisted in disgust, as they discover a dead rat in their kitchen; eye-level; Normal People; a cluttered and messy kitchen with dirty dishes and a overflowing trash can; cinematic
Characteristic
Shot : Three people, two adults and a child, are in a kitchen. The adults are standing and the child is sitting on a counter. There is a dead armadillo on a table in the foreground.
Aesthetic Score : 0.5
Mood : uncomfortable, unsettling, strange
Quality
Entropy : 6.87
Noise : 96
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry and the colors are a bit washed out.
What Did He See? Man’s Shocked Reaction in Dimly Lit Room Sparks Suspense
A young man, headphones on, stares at a computer screen with a look of pure shock. The dimly lit room adds to the suspense, leaving viewers wondering what could have caused such a reaction. Is it a terrifying discovery, a life-changing revelation, or something else entirely? The mystery unfolds in this intense and captivating scene.
Prompt
facial-expressions Disgust: Fear and disgust ; A gamer, their face pale and sweaty, as they witness a disturbing scene in a horror game; eye-level; Gamer; a dimly lit gaming room with a flickering monitor and a dark, ominous atmosphere; cinematic
Characteristic
Shot : A young man wearing headphones is looking at a computer screen, his face is filled with a mixture of shock and surprise.
Aesthetic Score : 0.6
Mood : intense, suspenseful, worried
Quality
Entropy : 6.01
Noise : 55
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry.
Disgust and Dread: A Woman Faces Her Worst Fear
A chilling image captures the moment a woman confronts her deepest revulsion. With a look of pure disgust, she stares directly at the viewer, a cockroach lurking in the foreground. The unsettling scene evokes a sense of unease and creepiness, leaving a lasting impression of discomfort.
Prompt
facial-expressions Disgust: Revulsion and disgust ; A woman, her face contorted in disgust, as she discovers a cockroach in her food; eye-level; Single Person; a brightly lit restaurant with a table full of food and a cockroach crawling on a plate; cinematic
Characteristic
Shot : A woman is looking at the camera with a surprised expression. There is a cockroach on a plate of food in the foreground.
Aesthetic Score : 0.2
Mood : disgusted, surprised, uneasy
Quality
Entropy : 6.78
Noise : 79
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and the lighting is uneven.
Intense Gaze in the Mirror’s Reflection
A close-up shot captures a man’s face in a mirror, his furrowed brow and intense expression hinting at a hidden story. The blurred background of a wooden-furnished room adds to the mysterious and brooding atmosphere.
Prompt
facial-expressions Disgust: Disdain and disgust ; A hero, their face hardened with disgust, as they confront a corrupt politician; eye-level; Hero; a grand, opulent office with a powerful politician sitting behind a large desk; cinematic
Characteristic
Shot : A man’s face is reflected in the mirror. The man is looking directly at the viewer. The background is blurred and out of focus.
Aesthetic Score : 0.6
Mood : serious, intense, thoughtful
Quality
Entropy : 6.69
Noise : 76
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and has some noise. The background is also out of focus and the colors are not very vibrant.
Conclusion
The results of the analysis show that the generative AI model performed well at understanding the scene and shot composition and at matching the expected aesthetic, but struggled with camera position.
Here’s a breakdown:
- Camera Position: The model scored 0.2, indicating a significant difference between the camera position requested in the prompt and the one in the generated image. Every prompt asked for an eye-level shot, yet the first image, for example, is described as a low-angle shot. This suggests the model does not reliably follow camera position instructions.
- Shot Analysis: The model scored 0.66, which is considered good. This means the model was able to understand the scene and shot composition in the prompt reasonably well.
- Aesthetic Analysis: The model scored 0.28, which is considered very good. This indicates that the generated images closely matched the expected aesthetic style.
Overall: While the model performed well in understanding the scene and achieving the desired aesthetic, it struggled with accurately implementing the intended camera position.
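To put the per-image numbers side by side, the sketch below averages the metrics listed for the ten images above. The values are transcribed from this post in order; the dictionary keys and the summarize helper are illustrative names. The post does not state how its three summary scores were derived, so these means are not claimed to reproduce them.

```python
from statistics import mean

# One row per image, in the order the images appear in this post.
results = [
    {"aesthetic": 0.6, "entropy": 6.51, "noise": 99,  "clip": 0.25, "ai_likelihood": 0.10},
    {"aesthetic": 0.6, "entropy": 6.55, "noise": 83,  "clip": 0.25, "ai_likelihood": 0.90},
    {"aesthetic": 0.4, "entropy": 6.73, "noise": 82,  "clip": 0.31, "ai_likelihood": 0.10},
    {"aesthetic": 0.3, "entropy": 6.58, "noise": 68,  "clip": 0.30, "ai_likelihood": 0.20},
    {"aesthetic": 0.3, "entropy": 6.84, "noise": 110, "clip": 0.32, "ai_likelihood": 0.20},
    {"aesthetic": 0.6, "entropy": 6.65, "noise": 79,  "clip": 0.26, "ai_likelihood": 0.10},
    {"aesthetic": 0.5, "entropy": 6.87, "noise": 96,  "clip": 0.30, "ai_likelihood": 0.20},
    {"aesthetic": 0.6, "entropy": 6.01, "noise": 55,  "clip": 0.27, "ai_likelihood": 0.20},
    {"aesthetic": 0.2, "entropy": 6.78, "noise": 79,  "clip": 0.29, "ai_likelihood": 0.10},
    {"aesthetic": 0.6, "entropy": 6.69, "noise": 76,  "clip": 0.23, "ai_likelihood": 0.10},
]


def summarize(rows):
    """Return the mean of every metric across all rows, rounded to 3 places."""
    return {key: round(mean(row[key] for row in rows), 3) for key in rows[0]}


summary = summarize(results)
for metric, value in summary.items():
    print(f"{metric}: {value}")
```

On the values listed here, the mean Aesthetic Score is about 0.47 and the mean Prompt Clip Score is about 0.28.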
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://fal.ai/models/fal-ai/flux/schnell/api