AI's Facial Expressions: A Mixed Bag of Success with Stable Diffusion
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. Generative AI models are increasingly being used to create images with realistic facial expressions. This blog post delves into the performance of these models, examining their ability to capture the nuances of facial expressions across diverse scenes. We’ll explore how well they understand camera angles, shot composition, and aesthetic appeal, highlighting both their strengths and areas for improvement. For example, we’ll analyze how well AI can depict the stoic expression of a lone figure in a desolate landscape, the determined look of a hero facing a burning city, or the intense focus of a gamer engrossed in a game. By understanding the capabilities and limitations of AI in generating facial expressions, we can better appreciate its potential and guide its future development.
Created with: stability-ai-core
A Lone Figure in a Desolate Landscape
A rugged man, clad in a dark jacket and carrying a backpack, stands amidst a barren landscape. The cloudy sky and distant hills suggest a remote and possibly harsh environment. His intense gaze and the stark backdrop create a sense of mystery and intrigue, highlighting the gritty and melancholic mood of the scene.
Prompt
facial-expressions Determination: Solitude and resilience ; A lone figure; eye-level; Single Person; A vast, desolate landscape; cinematic
Characteristic
Shot : A man in a dark jacket stands in a barren, desolate landscape, with a distant mountain in the background.
Aesthetic Score : 0.75
Mood : dark, moody, melancholic
Quality
Entropy : 6.81
Noise : 74
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
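Every prompt in this post follows the same semicolon-separated pattern, so it can be split into named parts for later comparison against the generated image. The sketch below is only an illustration: the field names (topic, emotion, theme, subject, camera_angle, character_type, setting, style) are assumptions inferred from the prompts themselves, not labels published by the tool.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical; they are inferred from the prompt layout.
    topic: str
    emotion: str
    theme: str
    subject: str
    camera_angle: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a ';'-separated prompt into its assumed parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    # The first field looks like "<topic> <emotion>: <theme>".
    head, _, theme = fields[0].partition(":")
    topic, _, emotion = head.strip().partition(" ")
    return PromptParts(topic, emotion.strip(), theme.strip(), *fields[1:])


parts = parse_prompt(
    "facial-expressions Determination: Solitude and resilience ; A lone figure; "
    "eye-level; Single Person; A vast, desolate landscape; cinematic"
)
print(parts.camera_angle, "|", parts.setting)
```

Splitting the prompt this way makes it easy to check, for example, whether the requested camera angle shows up in the shot description.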
Hero Stands Tall Amidst City in Flames
A lone superhero, clad in vibrant colors, stands defiant on a rooftop overlooking a city engulfed in flames. The intense mood and dramatic backdrop suggest a battle for survival, with the hero’s determined expression hinting at his unwavering resolve.
Prompt
facial-expressions Determination: Courage and unwavering resolve ; A hero standing tall; low-angle; Hero; A burning city in the background; cinematic
Characteristic
Shot : A superhero in a futuristic costume stands on a rooftop, looking out over a city engulfed in flames and smoke.
Aesthetic Score : 0.7
Mood : dramatic, heroic, apocalyptic
Quality
Entropy : 6.86
Noise : 75
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The smoke and flames in the background look somewhat artificial.
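The post doesn't say how the Entropy figure in each Quality block is computed. The reported values all fall between 0 and 8, which is consistent with the Shannon entropy of an 8-bit grayscale histogram, so here is a minimal sketch under that assumption. The function name and the toy pixel lists are made up for illustration.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors the "Entropy" metric in the post; the actual
    tool may compute it differently.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy inputs: a single flat gray level, and an even spread over all 256 levels.
flat = [128] * 1024
uniform = list(range(256)) * 4
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this reading, a higher value means the image uses a wider and more even range of tones, and a perfectly flat image sits at the bottom of the scale.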
The Grind: A Day in the Life of Factory Workers
A glimpse into the repetitive and monotonous world of factory work, captured in this image of four men in blue uniforms pushing carts through a cluttered industrial setting. Their stoic expressions and postures speak volumes about the routine nature of their labor.
Prompt
facial-expressions Determination: Grit and perseverance ; A worker pushing a heavy cart; eye-level; Normal People; A bustling factory floor; cinematic
Characteristic
Shot : Four men in blue overalls and hardhats are pushing carts filled with boxes and metal parts in a factory. The factory is large and industrial, with concrete floors, metal beams, and a lot of clutter.
Aesthetic Score : 0.5
Mood : industrial, hardworking, mundane
Quality
Entropy : 6.86
Noise : 86
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable image artifacts or errors.
Lost in the Code: A Hacker’s Focus
A young man, bathed in the soft glow of multiple monitors, is completely absorbed in his work. The low light and close-up shot create an atmosphere of intensity and mystery, leaving you wondering what secrets he’s uncovering.
Prompt
facial-expressions Determination: Concentration and drive ; A gamer intensely focused on a screen; close-up; Gamer; A dimly lit room with glowing monitors; cinematic
Characteristic
Shot : A young man in a dark room, wearing a headset, is focused on typing on a keyboard in front of three computer monitors.
Aesthetic Score : 0.7
Mood : focused, intense, serious
Quality
Entropy : 5.99
Noise : 64
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and the colors are somewhat desaturated, giving it a dull appearance.
A Stormy Outlook: A Woman’s Pensive Gaze
A woman stands by a window, her gaze fixed on a turbulent sky. The blurry suburban landscape beyond suggests a sense of isolation and uncertainty, mirroring the melancholy mood captured in this image. The stormy sky and her pensive expression create a palpable sense of foreboding, leaving the viewer to ponder the weight of her thoughts.
Prompt
facial-expressions Determination: Inner strength and hope ; A woman staring out a window; eye-level; Single Person; A stormy sky; cinematic
Characteristic
Shot : A woman looks out a window at a rainy, overcast sky and a suburban neighborhood beyond.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.79
Noise : 77
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight artifacts in the rain streaks and on the woman’s skin. These are barely noticeable, but could be addressed with more careful editing.
Into the Fire: A Warrior Leads the Charge
A dramatic scene of a warrior in armor leading a charge through a fiery battlefield. The intense lighting and chaotic composition capture the urgency and epic scale of the battle, highlighting the warrior’s leadership and courage.
Prompt
facial-expressions Determination: Victory and unwavering resolve ; A hero raising a sword; low-angle; Hero; A battlefield with fallen enemies; cinematic
Characteristic
Shot : A medieval knight in full armor is leading his troops into battle. He is screaming with fury, his sword raised high, with a red cape blowing in the wind. The scene is chaotic and dramatic, with smoke and fire in the background.
Aesthetic Score : 0.7
Mood : epic, dramatic, fierce
Quality
Entropy : 6.62
Noise : 74
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Family Watches in Despair as Their Home Burns
A family stands amidst the ruins of their burning home, their faces etched with grief and despair. The flames reflect in their eyes, highlighting the devastating loss they have suffered. The scene is both dramatic and somber, capturing the chaotic aftermath of the fire and the family’s struggle to cope with the destruction.
Prompt
facial-expressions Determination: Resilience and unity ; A family huddled together; eye-level; Normal People; A burning house in the background; cinematic
Characteristic
Shot : A family stands in front of their burning house, with a sense of loss and shock in their faces. The fire engulfs the house in flames, and smoke billows into the sky.
Aesthetic Score : 0.6
Mood : somber, tragic, desperate
Quality
Entropy : 6.77
Noise : 82
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
The Intensity of Focus
A young man, immersed in his task, sits before a computer screen in a dimly lit room. His focused expression and the intensity of the scene convey the seriousness of the moment. The image captures the essence of dedication and unwavering concentration.
Prompt
facial-expressions Determination: Excitement and focus ; A gamer’s hands furiously typing on a keyboard; close-up; Gamer; A brightly lit gaming room; cinematic
Characteristic
Shot : A young man is sitting at a desk wearing a headset and typing on a keyboard. He appears to be playing a video game. The room is dimly lit with various screens visible in the background.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.34
Noise : 61
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts and noise in the background, and some parts of the image are slightly out of focus.
Lost in the Mist: A Solitary Figure Walks Through a Mysterious Forest
A lone figure traverses a path shrouded in mist, the dense forest creating an eerie and contemplative atmosphere. The fog and the solitary figure evoke a sense of isolation and mystery, enhancing the dramatic effect of the scene.
Prompt
facial-expressions Determination: Hope and perseverance ; A lone figure walking towards a distant light; eye-level; Single Person; A dark, foreboding forest; cinematic
Characteristic
Shot : A misty forest path with two figures walking away from the viewer towards the fog, with tall trees lining the path.
Aesthetic Score : 0.7
Mood : mysterious, eerie, atmospheric
Quality
Entropy : 6.73
Noise : 89
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts
Silhouetted Against the City: A Moment of Introspection
A solitary figure stands on a rooftop, bathed in the dramatic play of light and shadow. The urban skyline stretches out behind him, creating a backdrop of both grandeur and isolation. His pose suggests a moment of deep contemplation, capturing the essence of urban life’s complexities.
Prompt
facial-expressions Determination: Confidence and unwavering resolve ; A hero standing on a rooftop; high-angle; Hero; A city skyline bathed in sunlight; cinematic
Characteristic
Shot : A man standing on a rooftop overlooking a city skyline at sunset.
Aesthetic Score : 0.7
Mood : serious, contemplative, urban
Quality
Entropy : 6.51
Noise : 64
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Conclusion
The generative AI model performed well at understanding the scene and matching the intended aesthetic, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.565, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.14, which is considered very good. This means that the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall, the model shows promise in understanding the scene and achieving the desired aesthetic, but needs improvement in accurately capturing the intended camera position.
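The per-image numbers listed throughout this post can also be averaged directly. The sketch below copies the camera angle from each prompt together with the Aesthetic Score, Prompt Clip Score, and Likelihood of AI from each section, in order. The helper names are made up for illustration, and these averages are separate from the Camera Position, Shot Analysis, and Aesthetic Analysis scores above, whose calculation the post doesn't describe.

```python
from statistics import mean

# (camera angle from the prompt, aesthetic score, prompt CLIP score, likelihood of AI)
# One row per image section, in the order they appear in the post.
results = [
    ("eye-level", 0.75, 0.20, 0.10),
    ("low-angle", 0.70, 0.24, 0.80),
    ("eye-level", 0.50, 0.30, 0.10),
    ("close-up", 0.70, 0.23, 0.10),
    ("eye-level", 0.70, 0.24, 0.20),
    ("low-angle", 0.70, 0.21, 0.10),
    ("eye-level", 0.60, 0.31, 0.10),
    ("close-up", 0.60, 0.28, 0.10),
    ("eye-level", 0.70, 0.21, 0.10),
    ("high-angle", 0.70, 0.23, 0.20),
]


def summarize(rows):
    """Average each metric across all images."""
    return {
        "aesthetic": mean(r[1] for r in rows),
        "clip": mean(r[2] for r in rows),
        "ai_likelihood": mean(r[3] for r in rows),
    }


def mean_aesthetic_by_angle(rows):
    """Average aesthetic score for each requested camera angle."""
    groups = {}
    for angle, aesthetic, _clip, _ai in rows:
        groups.setdefault(angle, []).append(aesthetic)
    return {angle: mean(scores) for angle, scores in groups.items()}


summary = summarize(results)
by_angle = mean_aesthetic_by_angle(results)
print(summary)
print(by_angle)
```

With only ten images, and a single high-angle example, the per-angle averages are a rough indication at best.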
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://stability.ai