AI's Facial Expressions: A Mixed Bag of Success with Imagen-v3-fast
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. In the realm of generative AI, the ability to create images with specific facial expressions is a crucial aspect of achieving realistic and engaging visuals. This blog post examines how well imagen-v3-fast captures facial expressions, camera angles, and aesthetics, analyzing its strengths and weaknesses through ten test prompts that all target the expression of surprise.
Created with: imagen-v3-fast
Lost in the Neon Shadows: A Man’s Fearful Journey
A solitary figure walks through a rain-slicked, neon-lit street, his face etched with fear. The darkness and shadows create an eerie atmosphere, hinting at a mystery unfolding in the city’s underbelly.
Prompt
facial-expressions Surprise: Eerie, suspenseful ; A lone figure walking down a deserted street; eye-level; Single Person; neon signs reflecting in puddles; cinematic
Characteristic
Shot : A man walks down a dark and wet street, the street is lined with buildings and neon signs, he looks frightened and scared.
Aesthetic Score : 0.6
Mood : dark, eerie, mysterious
Quality
Entropy : 6.64
Noise : 94
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has a few minor artifacts, such as the lack of detail in the buildings and the blurry edges of the man.
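The test prompts in this post share a visible layout: a category and expression before the colon, then semicolon-separated fields. The sketch below splits a prompt into those parts. The field names (`mood`, `scene`, `camera`, `subject`, `environment`, `style`) are assumptions inferred from the prompts themselves, not labels taken from the generation pipeline, and the parser only handles the two shapes that appear in this post (six fields or three).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedPrompt:
    # Field names are illustrative guesses based on the prompts in this post.
    category: str
    expression: str
    mood: str
    scene: str
    camera: Optional[str]
    subject: Optional[str]
    environment: Optional[str]
    style: str


def parse_prompt(prompt: str) -> ParsedPrompt:
    """Split a test prompt into its header and semicolon-separated fields."""
    head, _, rest = prompt.partition(":")
    category, _, expression = head.strip().partition(" ")
    parts = [p.strip() for p in rest.split(";") if p.strip()]
    if len(parts) == 6:
        mood, scene, camera, subject, environment, style = parts
    elif len(parts) == 3:
        # Short form: no camera, subject, or environment fields.
        mood, scene, style = parts
        camera = subject = environment = None
    else:
        raise ValueError(f"unexpected number of fields: {len(parts)}")
    return ParsedPrompt(category, expression, mood, scene,
                        camera, subject, environment, style)


FULL = parse_prompt(
    "facial-expressions Surprise: Eerie, suspenseful ; A lone figure walking "
    "down a deserted street; eye-level; Single Person; neon signs reflecting "
    "in puddles; cinematic"
)
SHORT = parse_prompt(
    "facial-expressions Surprise: Solitude, foreboding ; A lone figure hunches "
    "over a flickering candlelit table, lost in a book, oblivious to the "
    "growing shadows outside the window.; cinematic"
)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, the `camera` field) against the shot description reported for each image.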
Superman Gazes Upon the City, A Shadow of Concern in His Eyes
A dramatic shot captures Superman, his cape billowing in the wind, looking up at the city skyline with a mixture of awe and concern. The lighting casts long shadows, adding to the sense of urgency and anticipation. What threat looms over Metropolis?
Prompt
facial-expressions Surprise: Triumphant, awe-inspiring ; A superhero standing on a rooftop, looking out over the city; eye-level; Hero; cityscape at night, with flashing lights and sirens in the distance; cinematic
Characteristic
Shot : A man dressed as Superman, looking up in awe at the city skyline.
Aesthetic Score : 0.7
Mood : dramatic, heroic, concerned
Quality
Entropy : 6.40
Noise : 51
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears to have been digitally enhanced or edited. Some of the details, particularly in the background, are somewhat blurry and lack sharpness.
A Candlelit Mystery: Man’s Shocked Reaction Unveils a Tense Secret
A single candle illuminates a man’s shocked face as he reads a book, casting long shadows in a dimly lit room. The scene is both eerie and captivating, hinting at a hidden truth and a story waiting to unfold.
Prompt
facial-expressions Surprise: Solitude, foreboding ; A lone figure hunches over a flickering candlelit table, lost in a book, oblivious to the growing shadows outside the window.; cinematic
Characteristic
Shot : A man is seated at a table, lit by a single candle, looking shocked while reading a book. He appears to be alone. The scene is lit by the candle, but there is also light coming through a window behind him.
Aesthetic Score : 0.6
Mood : spooky, mysterious, tense
Quality
Entropy : 6.58
Noise : 45
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors. The image is well-composed, with no noticeable artifacts or technical issues.
Caught in the Moment: Gamer’s Shock and Awe
A young gamer, bathed in blue light, stares intently at his computer screen, his wide eyes and open mouth revealing a moment of intense surprise. The image captures the raw emotion of a gamer fully immersed in the digital world.
Prompt
facial-expressions Surprise: Intense, focused ; A gamer sitting in a dimly lit room, eyes glued to the screen; close-up; Gamer; glowing monitor, keyboard, and mouse; cinematic
Characteristic
Shot : A young man is staring intently at a computer screen, his eyes wide with surprise. He is wearing headphones and is sitting in a gaming chair. The lighting is blue and creates a moody atmosphere.
Aesthetic Score : 0.5
Mood : intense, focused, surprised
Quality
Entropy : 6.36
Noise : 41
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is well-lit and composed. However, the subject’s face is slightly blurry, possibly due to motion or focus issues.
Lost in the Crowd: A Moment of Anxiety
A woman stands alone in a bustling train station, her startled expression captured in sharp focus against a blurred background. The shallow depth of field creates a sense of isolation and suspense, leaving the viewer wondering what has caused her distress.
Prompt
facial-expressions Surprise: Panic, frantic ; A woman standing in a crowded train station, suddenly realizing she’s lost her purse; eye-level; Single Person; bustling crowd, hurried footsteps; cinematic
Characteristic
Shot : A woman stands in a crowded train station, looking startled. The background is blurred, focusing attention on the woman.
Aesthetic Score : 0.7
Mood : anxiety, tension, suspense
Quality
Entropy : 6.61
Noise : 48
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Trapped in a Blaze, He Holds a Golden Secret
A man, consumed by fear, clutches a golden artifact as flames engulf him. The intense contrast between the fire and his terror creates a dramatic scene, leaving the viewer questioning the artifact’s significance and the man’s fate.
Prompt
facial-expressions Surprise: Desperate, determined ; A lone figure silhouetted against the inferno, emerging from the collapsing building, clutching a precious artifact.; cinematic
Characteristic
Shot : A man with a beard, wearing a brown shirt and a brown strap across his chest, looks terrified. He is holding a golden artifact in front of him, as he is surrounded by flames.
Aesthetic Score : 0.7
Mood : intense, dramatic, fear
Quality
Entropy : 6.65
Noise : 68
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some minor artifacts in the flames and the man’s hair.
What’s Got Them Staring?
A group of friends enjoys a casual picnic in the park, but their surprised expressions and shared gaze off-camera leave you wondering what has caught their attention. Is it something exciting, funny, or perhaps even a little scary? The image’s playful intrigue invites you to imagine the scene unfolding.
Prompt
facial-expressions Surprise: Peaceful, ominous ; A group of friends enjoying a picnic in a park, unaware of the strange object falling from the sky; eye-level; Normal People; sunny day, green grass, blue sky; cinematic
Characteristic
Shot : A group of friends having a picnic in a park, looking off-camera in surprise.
Aesthetic Score : 0.6
Mood : surprised, friendly, casual
Quality
Entropy : 6.88
Noise : 99
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, particularly in the background.
The Shock on His Face Says It All
A young man sits in a dimly lit room, bathed in blue light, his face etched with shock as he stares at a computer keyboard. The intensity of the moment is palpable: is he witnessing something extraordinary, or is he facing a terrifying truth? The suspense leaves you eager to uncover the story behind his stunned expression.
Prompt
facial-expressions Surprise: Disbelief, frustration ; A gamer’s hands frantically moving across the keyboard, as a sudden glitch appears on the screen; close-up; Gamer; distorted screen, flashing lights; cinematic
Characteristic
Shot : A young man is looking at a computer keyboard with a shocked expression on his face. He is sitting in a dark room with blue lighting.
Aesthetic Score : 0.6
Mood : intense, surprised, focused
Quality
Entropy : 6.36
Noise : 40
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, particularly in the shadows. The lighting is also a bit uneven, with some areas being overexposed.
The Shadow in the Woods
A lone man stumbles through a dense forest, oblivious to the menacing creature lurking behind him. The scene, shrouded in eerie shadows, creates a palpable sense of suspense, leaving you wondering what fate awaits the unsuspecting traveler.
Prompt
facial-expressions Surprise: Mystical, awe-inspiring ; A man walking through a forest, suddenly finding himself face-to-face with a mythical creature; eye-level; Single Person; dense foliage, dappled sunlight; cinematic
Characteristic
Shot : A man walks down a path in a dense forest, unaware of a large, menacing creature standing behind him. The creature is humanoid but covered in fur and has horns. The scene is framed in a way that makes the viewer feel like they are the one being watched by the creature.
Aesthetic Score : 0.7
Mood : suspenseful, eerie, mysterious
Quality
Entropy : 6.48
Noise : 82
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The creature’s fur looks somewhat artificial and the lighting is a bit too harsh.
Terror in the Trenches: One Man’s Fear Amidst the Chaos
A close-up shot captures the raw terror on a soldier’s face as he witnesses the horrors of war. The flames of battle rage in the background, while blurred figures of comrades fight for survival. This image evokes a sense of fear, anxiety, and the intense urgency of the moment.
Prompt
facial-expressions Surprise: Melancholy, reflective ; A hero standing on a battlefield, surrounded by fallen enemies, realizing the true cost of victory; eye-level; Hero; smoke and debris, wounded soldiers; cinematic
Characteristic
Shot : A close-up shot of a man with a beard, looking terrified. He is in a battlefield with fire in the background and blurred soldiers around him.
Aesthetic Score : 0.8
Mood : fear, anxiety, intense
Quality
Entropy : 6.78
Noise : 68
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some slight artifacts and a little bit of blur in the background, particularly around the fire.
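To see the ten images side by side, the per-image numbers reported above can be averaged. The sketch below copies the Aesthetic Score, Entropy, Noise, Prompt Clip Score, and Likelihood of AI values from each image in order. The 0.5 cutoff used to count images that read as AI-generated is an assumption made for illustration, not a threshold defined by the evaluation.

```python
from statistics import mean

# (aesthetic, entropy, noise, clip, ai_likelihood), one row per image,
# in the order the images appear in this post.
RESULTS = [
    (0.6, 6.64, 94, 0.31, 0.90),  # Neon street
    (0.7, 6.40, 51, 0.29, 0.70),  # Superman
    (0.6, 6.58, 45, 0.29, 0.10),  # Candlelit reader
    (0.5, 6.36, 41, 0.31, 0.20),  # Gamer at the screen
    (0.7, 6.61, 48, 0.32, 0.10),  # Train station
    (0.7, 6.65, 68, 0.29, 0.90),  # Blaze and artifact
    (0.6, 6.88, 99, 0.30, 0.20),  # Picnic
    (0.6, 6.36, 40, 0.31, 0.20),  # Gamer at the keyboard
    (0.7, 6.48, 82, 0.32, 0.90),  # Forest creature
    (0.8, 6.78, 68, 0.32, 0.90),  # Battlefield
]
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")

# Assumed cutoff for "reads as AI-generated"; chosen only for this sketch.
AI_CUTOFF = 0.5


def summarize(results):
    """Return the mean of each metric column, rounded to three decimals."""
    columns = zip(*results)
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


SUMMARY = summarize(RESULTS)
FLAGGED = sum(1 for row in RESULTS if row[4] >= AI_CUTOFF)

if __name__ == "__main__":
    for name, value in SUMMARY.items():
        print(f"{name}: {value}")
    print(f"images at or above the AI cutoff: {FLAGGED} of {len(RESULTS)}")
```

On these rows the mean Aesthetic Score is 0.65 and the mean Prompt Clip Score is 0.306, with five of the ten images at or above the assumed cutoff.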
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with the camera position and the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This suggests the model didn’t accurately capture the intended camera position in the prompt.
- Shot Analysis: The model scored 0.62, falling within the “good” range. This indicates the model was able to understand the scene described in the prompt reasonably well.
- Aesthetic Analysis: The model scored 0.12, which is slightly above the “very good” range of -0.2 to 0.1. This suggests the generated image’s aesthetic deviated somewhat from the expected aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and shot composition, but struggled to accurately capture the intended camera position and aesthetic.
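The three scores in the breakdown above can be checked against their stated ranges with a few lines of code. This is a minimal sketch: the `Band` type and `position` helper are hypothetical names, and it assumes the "good" range of 0.5 to 0.75 quoted for Camera Position also applies to Shot Analysis, which the post implies but does not state outright.

```python
from typing import NamedTuple


class Band(NamedTuple):
    label: str
    low: float
    high: float


def position(score: float, band: Band) -> str:
    """Report whether a score sits below, within, or above a band."""
    if score < band.low:
        return "below"
    if score > band.high:
        return "above"
    return "within"


GOOD = Band("good", 0.5, 0.75)
VERY_GOOD = Band("very good", -0.2, 0.1)

# Scores and ranges as quoted in the conclusion.
SCORES = {
    "camera_position": (0.25, GOOD),
    "shot_analysis": (0.62, GOOD),  # assumes the same "good" range
    "aesthetic_analysis": (0.12, VERY_GOOD),
}

VERDICTS = {
    name: position(score, band) for name, (score, band) in SCORES.items()
}

if __name__ == "__main__":
    for name, (score, band) in SCORES.items():
        print(f"{name}: {score} is {VERDICTS[name]} the "
              f"'{band.label}' range [{band.low}, {band.high}]")
```

The verdicts match the text: Camera Position falls below its range, Shot Analysis falls within it, and Aesthetic Analysis lands just above the "very good" range.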