AI Struggles to Capture Disgust: A Look at Facial Expressions in Generated Images with Imagen-v3-fast
Facial expressions are a powerful tool for conveying emotion in art and storytelling. A dramatic expression such as disgust can add depth and realism to a scene, but generating images with accurate, nuanced facial expressions remains a challenge for AI. This blog post looks at that challenge by analyzing a set of images generated from prompts that describe scenes of disgust. We examine the AI’s performance on camera position, scene understanding, and aesthetics. The results show strengths in scene understanding and aesthetics but a weakness in representing the intended camera position. The analysis highlights how far AI still has to go in capturing the complexities of human emotion and expression.
Created with: imagen-v3-fast
Lost in the Shadows: A Man’s Solitude in a Gloomy Alley
A hooded figure sits alone amidst discarded refuse in a dimly lit alleyway. Graffiti-covered walls and the pervasive darkness create a sense of isolation and vulnerability, leaving the viewer to ponder the man’s story and the secrets hidden within the shadows.
Prompt
facial-expressions Disgust: Despair and alienation ; A lone figure, hunched over in a dimly lit alleyway; eye-level; Single Person; overflowing trash bins and graffiti-covered walls; cinematic
Characteristic
Shot : A man in a hooded jacket sits on the ground in a dark alleyway, surrounded by trash cans and garbage bags. The alleyway is dimly lit, and the walls are covered in graffiti.
Aesthetic Score : 0.6
Mood : dark, gloomy, mysterious
Quality
Entropy : 6.50
Noise : 89
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
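The post does not say how the Entropy figure is computed. One common definition, assumed here, is the Shannon entropy of an image’s grayscale histogram, which for 8-bit images ranges from 0 to 8 bits; the values of roughly 6 to 7 listed in this post would be consistent with that reading. A minimal sketch of that calculation follows. The function name and the sample pixel data are hypothetical and are not taken from the post’s images.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(gray_pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    counts = Counter(gray_pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one pixel")
    # H = -sum(p * log2(p)) over the intensity levels that actually occur
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical pixel data (assumption, for illustration only).
flat = [128] * 1000              # a single intensity level
spread = list(range(256)) * 4    # every 8-bit level equally often

print(shannon_entropy(flat), shannon_entropy(spread))
```

Under this definition a perfectly flat image scores 0 bits and an image that uses all 256 levels equally scores 8 bits, so a higher value indicates a more varied tonal distribution rather than a better image.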
Art Attack: Man’s Surprise at Splattered Masterpiece
A man in a hat stands frozen in an art gallery, his expression a mixture of shock and amusement. He’s gazing at an abstract painting, but the real eye-catcher is the large splatter of paint that has landed directly in front of it. The scene is a humorous collision of chaos and elegance, leaving viewers wondering: was it an accident, a statement, or a masterpiece in the making?
Prompt
facial-expressions Disgust: Horror and disgust ; A seasoned detective, his face etched with disgust, stares at a vandalized masterpiece in a museum; eye-level; Detective; a chaotic scene with paint splattered across the canvas and broken glass on the floor.; cinematic
Characteristic
Shot : A man in a hat stands in an art gallery looking at an abstract painting, in front of which a large amount of paint has been splattered.
Aesthetic Score : 0.6
Mood : suspenseful, humorous, chaotic
Quality
Entropy : 6.74
Noise : 80
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts in the background, particularly around the edges of the painting.
Disgust on a Plate: What’s Making This Woman React?
A woman stands in a kitchen, her face contorted in disgust as she stares at a plate of food. The image attempts to capture a dramatic moment, but the lighting and composition leave much to be desired. What could be on that plate to elicit such a strong reaction? The mystery remains, leaving viewers to ponder the source of her apprehension.
Prompt
facial-expressions Disgust: Disappointment and disgust ; A young woman, her face pale and wrinkled, as she stares at a plate of spoiled food; eye-level; Normal Person; a cluttered kitchen with dirty dishes and a overflowing trash can; cinematic
Characteristic
Shot : A woman is holding a plate of food and looking at it with a disgusted expression. She is standing in a kitchen.
Aesthetic Score : 0.2
Mood : disgust, apprehension, confusion
Quality
Entropy : 6.97
Noise : 60
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, particularly in the background.
Caught in the Moment: Gamer’s Shocked Reaction
A young man, headphones on and eyes wide with surprise, stares directly at the camera. Seated in his gaming chair, the intense lighting and his shocked expression create a powerful sense of urgency, capturing a moment of intense gameplay.
Prompt
facial-expressions Disgust: Unease and disgust ; A gamer, their eyes wide with disgust, as they witness a grotesque scene in a virtual reality game; eye-level; Gamer; a brightly lit gaming room with multiple monitors and controllers; cinematic
Characteristic
Shot : A young man with headphones on is looking at the camera with a shocked expression. He is sitting in a gaming chair in front of a computer.
Aesthetic Score : 0.3
Mood : shocked, surprised, intense
Quality
Entropy : 6.06
Noise : 36
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some artifacts in the image, especially around the edges of the man’s face. The lighting is also a bit harsh.
Terror in the Alley: Man’s Fearful Gaze in a Cluttered Street
A chilling image captures a man’s terror as he stares down a narrow, garbage-strewn alleyway. The scene evokes a sense of foreboding and suspense, leaving the viewer questioning what lurks in the shadows.
Prompt
facial-expressions Disgust: Repulsion and disgust ; A man, his face contorted in disgust, as he walks past a pile of rotting garbage; eye-level; Single Person; a dirty and neglected street with overflowing trash cans; cinematic
Characteristic
Shot : A man is standing in a narrow street, his face contorted in terror, as he stares down the street. There are garbage cans in the background.
Aesthetic Score : 0.2
Mood : fear, anxiety, urban
Quality
Entropy : 6.90
Noise : 90
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and lacks sharpness.
Superman Faces His Shadow in a Gritty Alleyway
A tense standoff unfolds in a dark, urban alley. Superman, illuminated by a harsh spotlight, confronts a shadowy figure lurking behind him. The contrasting lighting and Superman’s intense expression create a palpable sense of suspense and danger.
Prompt
facial-expressions Disgust: Anger and disgust ; A superhero, their face etched with disgust, as they confront a villain who has committed a heinous act; eye-level; Hero; a dark and smoky cityscape with a towering villainous figure; cinematic
Characteristic
Shot : A dark, gritty city alleyway, Superman stands facing the camera, with a shadowed, menacing figure behind him.
Aesthetic Score : 0.7
Mood : intense, dark, suspenseful
Quality
Entropy : 6.68
Noise : 56
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has minor artifacts, particularly noticeable in the shadows and the darker areas of the image. The edges of the characters are slightly jagged.
Horror in the Kitchen: Family’s Dismay at Unwelcome Guest
A chilling scene unfolds in a cluttered kitchen, where a family confronts a dead rat on the floor. The dim lighting and tight composition amplify the horror and disgust, leaving viewers on the edge of their seats.
Prompt
facial-expressions Disgust: Horror and disgust ; A family, their faces twisted in disgust, as they discover a dead rat in their kitchen; eye-level; Normal People; a cluttered and messy kitchen with dirty dishes and a overflowing trash can; cinematic
Characteristic
Shot : A family is standing in a messy kitchen, reacting in horror to a dead rat on the floor. The kitchen is cluttered and dirty, with trash and dishes scattered around.
Aesthetic Score : 0.6
Mood : horror, suspense, disgust
Quality
Entropy : 6.75
Noise : 88
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable artifacts or errors in the image.
Caught in the Spotlight: A Moment of Startled Intensity
A young man, bathed in dramatic lighting, stares at a screen with a look of surprise. The contrast between light and shadow amplifies the intensity of the moment, leaving the viewer wondering what has just unfolded.
Prompt
facial-expressions Disgust: Fear and disgust ; A gamer, their face pale and sweaty, as they witness a disturbing scene in a horror game; eye-level; Gamer; a dimly lit gaming room with a flickering monitor and a dark, ominous atmosphere; cinematic
Characteristic
Shot : A young man, wearing headphones, is looking at a screen with a startled expression. The lighting is dramatic, with a lot of contrast between the light and dark areas.
Aesthetic Score : 0.7
Mood : intense, dramatic, surprised
Quality
Entropy : 6.41
Noise : 53
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, especially around the edges of the subject’s face and hair, which could be due to over-sharpening or over-processing.
Disgusting Discovery: Woman Finds Cockroach in Restaurant Meal
A woman dining at a restaurant experiences a horrifying encounter when she discovers a cockroach on her plate. The image captures her visible disgust and the unsettling tension of the situation.
Prompt
facial-expressions Disgust: Revulsion and disgust ; A woman, her face contorted in disgust, as she discovers a cockroach in her food; eye-level; Single Person; a brightly lit restaurant with a table full of food and a cockroach crawling on a plate; cinematic
Characteristic
Shot : A woman is looking down at a plate with a cockroach on it in a restaurant setting, she is visibly disgusted.
Aesthetic Score : 0.2
Mood : disgust, fear, apprehension
Quality
Entropy : 6.79
Noise : 69
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image quality is slightly grainy. The cockroach appears to be slightly out of focus.
Power and Authority: A Portrait of Leadership
A man in a suit, radiating power and authority, sits at a large desk in a stately office. The opulent setting, with dark wood paneling, a grand window, and a fireplace, reinforces his position of influence. The lighting and composition create a sense of importance and consequence, hinting at the weight of decisions made within these walls.
Prompt
facial-expressions Disgust: Disdain and disgust ; A hero, their face hardened with disgust, as they confront a corrupt politician; eye-level; Hero; a grand, opulent office with a powerful politician sitting behind a large desk; cinematic
Characteristic
Shot : A man in a suit sits at a large desk in a stately office, likely a politician or CEO. The room is opulent with dark wood paneling, a large window, a fireplace, and a portrait on the wall.
Aesthetic Score : 0.7
Mood : formal, serious, authoritative
Quality
Entropy : 6.21
Noise : 54
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness is noticeable in the portrait, but it’s not too distracting.
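To compare the ten images at a glance, the per-image figures listed above can be averaged. The sketch below transcribes the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values in the order the images appear; the variable names and the 0.3 cut-off for a "low" aesthetic score are assumptions made for illustration.

```python
from statistics import mean

# Values transcribed from the ten image entries above, in order.
aesthetic = [0.6, 0.6, 0.2, 0.3, 0.2, 0.7, 0.6, 0.7, 0.2, 0.7]
clip = [0.33, 0.31, 0.32, 0.31, 0.31, 0.28, 0.31, 0.32, 0.33, 0.31]
ai_likelihood = [0.20, 0.20, 0.10, 0.10, 0.10, 0.90, 0.20, 0.90, 0.10, 0.20]

summary = {
    "aesthetic": round(mean(aesthetic), 3),
    "clip": round(mean(clip), 3),
    "ai_likelihood": round(mean(ai_likelihood), 3),
}

# Hypothetical threshold: count images whose aesthetic score is 0.3 or lower.
low_aesthetic = sum(1 for score in aesthetic if score <= 0.3)

print(summary, low_aesthetic)
```

With these inputs the mean Aesthetic Score is 0.48, the mean Prompt Clip Score is about 0.313, and four of the ten images score 0.3 or lower on aesthetics. These are plain averages of the per-image figures; they are not the same quantities as the scores quoted in the conclusion, whose computation the post does not describe.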
Conclusion
The analysis of the generated images reveals mixed results:
Camera Position: The model’s performance in capturing the intended camera position is fairly weak. With a score of 0.2, it falls significantly below the “good” range of 0.5 to 0.75. This suggests the AI struggled to accurately translate the prompt’s camera positioning into the final image.
Shot Analysis: The model demonstrates a moderate understanding of the scene described in the prompt. The score of 0.56 falls within the “good” range, indicating a reasonable ability to translate the prompt’s scene into the image.
Aesthetic Analysis: The post rates the generated images’ aesthetic as very close to the expected one. The quoted figures do not line up, however: a score of 0.25 lies above the stated “very good” range of -0.2 to 0.1, so either the score or the range appears to be misreported.
Overall, the model shows strengths in understanding the desired aesthetic and the scene, but struggles with accurately representing the intended camera position.
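The range comparisons in the conclusion can be checked mechanically. The sketch below tests each quoted score against the range quoted for it. It assumes that the “good” range of 0.5 to 0.75, which the post states only for camera position, also applies to the shot analysis; the helper name `in_range` is hypothetical.

```python
def in_range(score: float, low: float, high: float) -> bool:
    """True when score lies inside the closed interval [low, high]."""
    return low <= score <= high


# Scores and ranges as quoted in the conclusion.
checks = {
    "camera_position": in_range(0.2, 0.5, 0.75),   # "good" range
    "shot_analysis": in_range(0.56, 0.5, 0.75),    # assumed to share the "good" range
    "aesthetic": in_range(0.25, -0.2, 0.1),        # "very good" range
}

print(checks)
```

Only the shot analysis score falls inside its range. The camera position score falls below the “good” range, as the conclusion says, and the aesthetic score as quoted falls outside the “very good” range.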