AI's Facial Expressions: A Mixed Bag of Success with Stable-diffusion

AI's Facial Expressions: A Deep Dive into Generative Model Performance with Stable-diffusion

Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Generative AI models are increasingly used to create images with specific facial expressions, but how well do they capture the nuances of human emotion? This post examines how one generative model handles dramatic facial expressions across ten diverse scenes, looking at its grasp of camera position, scene details, and aesthetic style. In short, the model captures the desired aesthetic style well but struggles to represent the scene and camera position accurately, which suggests it may need further training to translate prompts into accurate visual representations.

Created with: stability-ai-core

Rainy Day Reflections

A woman finds solace in a cup of coffee as she gazes out at a rainy street, her expression hinting at a melancholic contemplation. The moody atmosphere captures a sense of loneliness and introspection.

Prompt

facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; Single Persons; dimly lit coffee shop with rain outside; cinematic
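
Every prompt in this post follows the same semicolon-separated layout. As a rough sketch of how such a prompt could be split into its parts, the snippet below assumes six fields in a fixed order; the field names (category, emotion, descriptors, subject, camera, group, scene, style) are labels chosen here for illustration and are not defined by the model or the original tooling.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    # All field names are hypothetical labels for the positions in the prompt.
    category: str
    emotion: str
    descriptors: list
    subject: str
    camera: str
    group: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a prompt of the form used in this post into named fields."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(parts)}")
    head, subject, camera, group, scene, style = parts

    # The first field looks like "facial-expressions Worry: melancholy, lonely".
    label, _, descriptor_text = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    descriptors = [d.strip() for d in descriptor_text.split(",") if d.strip()]

    return PromptFields(category, emotion.strip(), descriptors,
                        subject, camera, group, scene, style)


fields = parse_prompt(
    "facial-expressions Worry: melancholy, lonely ; Single woman; eye-level; "
    "Single Persons; dimly lit coffee shop with rain outside; cinematic"
)
print(fields.camera, "|", fields.scene, "|", fields.style)
```

Separating the camera, scene, and style fields this way makes it easier to compare each one against the generated image, which is what the conclusion's three scores attempt to summarize.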

Characteristic

Shot : A young woman sits by a window in a cafe, looking out at the rain falling on the street outside.

Aesthetic Score : 0.7

Mood : melancholy, contemplative, lonely

Quality

Entropy : 6.47

Noise : 73

Prompt Clip Score : 0.25
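
The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (all 256 levels equally frequent); values between roughly 5.9 and 6.9 would then indicate fairly detailed images. A minimal sketch under that assumption, taking a flat sequence of pixel values as input:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of pixel intensity values.

    Assumption: this mirrors the post's "Entropy" metric; the original
    tooling may use a different definition.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat_image = [128] * 1000            # a single gray level
varied_image = list(range(256)) * 4  # every 8-bit level equally often
print(shannon_entropy(flat_image), shannon_entropy(varied_image))
```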

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image has a slight chromatic aberration, but it is not very noticeable.

Superman Stands Tall, A City at His Feet

A powerful image of a Superman figure, bathed in dramatic lighting, stares intensely at the viewer. The blurred cityscape and dark sky create a sense of isolation and heroism, emphasizing his strength and determination.

Prompt

facial-expressions Worry: intense, burdened ; Man in a superhero costume; medium shot; Heroes; cityscape at night with flashing sirens; cinematic

Characteristic

Shot : Superman standing on a bridge, city skyline in the background.

Aesthetic Score : 0.6

Mood : serious, heroic, powerful

Quality

Entropy : 6.45

Noise : 75

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.40

Image errors : Some areas appear too sharp, like the skin on the subject, and some parts of the suit.

Lost in the Crowd: A Moment of Vulnerability on the Subway

A young woman stands alone on a crowded subway, her serious expression and the blurred background highlighting her isolation and internal struggle. The scene evokes a sense of melancholy, suspense, and introspection, leaving the viewer to wonder about her story.

Prompt

facial-expressions Worry: anxious, overwhelmed ; Young woman in a crowded subway; eye-level; Normal People; blurred faces of commuters; cinematic

Characteristic

Shot : A woman with a backpack stands in a crowded subway car, looking out of frame with a thoughtful expression.

Aesthetic Score : 0.7

Mood : pensive, mysterious, urban

Quality

Entropy : 6.54

Noise : 72

Prompt Clip Score : 0.30

AI Evaluation

Likelihood of AI : 0.20

Image errors : Some minor image artifacts are visible in the background.

Lost in the Code: A Man’s Focused Concentration in a Dark Room

A man, shrouded in darkness, sits before his computer, headphones on, eyes fixed on the screen. The dim lighting adds an air of mystery and intrigue, highlighting his intense focus and the serious nature of his work. This image captures the essence of a tech-savvy individual immersed in their digital world.

Prompt

facial-expressions Worry: intense, focused ; Gamer with headphones on; close-up; Gamer; dimly lit room with glowing computer screen; cinematic

Characteristic

Shot : A man wearing headphones is sitting in a dimly lit room, looking intensely at a computer screen.

Aesthetic Score : 0.6

Mood : focused, intense, serious

Quality

Entropy : 5.90

Noise : 59

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has some noise and grain, particularly in the shadows. The lighting is a bit uneven, with some areas appearing overexposed.

Autumn Melancholy: A Man Lost in Thought

A solitary figure sits on a park bench, surrounded by fallen leaves, his contemplative expression hinting at a sense of loneliness and introspection. The autumnal setting amplifies the melancholic mood, creating a poignant image of quiet contemplation.

Prompt

facial-expressions Worry: sad, reflective ; Man sitting alone on a park bench; long shot; Single Persons; empty park with falling leaves; cinematic

Characteristic

Shot : A man is sitting on a wooden bench in a park. The leaves are changing color and are all over the ground.

Aesthetic Score : 0.7

Mood : melancholy, contemplative, autumnal

Quality

Entropy : 6.82

Noise : 78

Prompt Clip Score : 0.33

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible errors

Solitude Amidst the Flames

A lone woman in a blue jumpsuit stands on a rooftop, her calm presence a stark contrast to the burning building and smoke-filled cityscape behind her. The scene evokes a sense of drama, tension, and impending doom, leaving the viewer to ponder the woman’s story and the fate of the city.

Prompt

facial-expressions Worry: determined, resolute ; Heroine standing on a rooftop; medium shot; Heroes; cityscape with smoke and fire in the distance; cinematic

Characteristic

Shot : A woman in a jumpsuit stands on a rooftop, looking out at a city with a burning building in the background. There’s a lot of smoke filling the sky.

Aesthetic Score : 0.6

Mood : dramatic, tense, somber

Quality

Entropy : 6.69

Noise : 65

Prompt Clip Score : 0.32

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image quality is slightly low, with some noise and pixelation. The color grading is a bit overdone.

Silence Speaks Volumes: A Couple’s Tense Kitchen Standoff

A couple’s simmering tension boils over in their kitchen. The woman’s upset expression and the man’s serious gaze, coupled with the untouched plates of food, paint a picture of a relationship on the brink. The composition emphasizes the distance between them, leaving the viewer to wonder what unspoken words hang heavy in the air.

Prompt

facial-expressions Worry: tense, frustrated ; Couple arguing in a kitchen; eye-level; Normal People; cluttered kitchen with dirty dishes; cinematic

Characteristic

Shot : A couple is having a tense conversation in a kitchen.

Aesthetic Score : 0.6

Mood : tense, uncertain, uncomfortable

Quality

Entropy : 6.87

Noise : 70

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.00

Image errors : No visible artifacts or errors in the image.

Lost in the Code: A Moment of Intense Focus

A young man, headphones on, is completely absorbed in his work. The close-up shot and dim lighting create a sense of intimacy, highlighting the intensity of his concentration as he types away on his keyboard.

Prompt

facial-expressions Worry: intense, focused ; Gamer’s hands on a keyboard; close-up; Gamer; flashing lights and sounds from the game; cinematic

Characteristic

Shot : A young man wearing headphones is sitting in front of a computer and typing on a keyboard. The lighting is dim and the colors are muted.

Aesthetic Score : 0.7

Mood : intense, focused, determined

Quality

Entropy : 6.08

Noise : 64

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : There are no noticeable artifacts or errors in the image.

Lost in the Shadows: A Woman’s Solitary Walk Through a Mysterious City

A woman, shrouded in a black coat, walks alone down a cobblestone street bathed in the dim glow of streetlights. The urban landscape, shrouded in darkness, creates a sense of mystery and loneliness, leaving the viewer to wonder about her destination and the secrets she carries.

Prompt

facial-expressions Worry: lonely, vulnerable ; Woman walking alone at night; long shot; Single Persons; deserted street with streetlights; cinematic

Characteristic

Shot : A woman in a black coat walks down a cobblestone street at night, illuminated by streetlights.

Aesthetic Score : 0.7

Mood : mysterious, lonely, urban

Quality

Entropy : 6.60

Noise : 72

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : No major artifacts or errors are visible.

Soldier’s Focus Amidst Chaos

A lone soldier, surrounded by the devastation of war, studies a map amidst billowing smoke and rubble. The scene captures the seriousness and tension of the battlefield, highlighting the urgency of the situation.

Prompt

facial-expressions Worry: serious, strategic ; Hero looking at a map; medium shot; Heroes; war-torn battlefield with smoke and debris; cinematic

Characteristic

Shot : A soldier in a war-torn city, looking at a map while surrounded by rubble and smoke.

Aesthetic Score : 0.7

Mood : intense, serious, dramatic

Quality

Entropy : 6.84

Noise : 79

Prompt Clip Score : 0.34

AI Evaluation

Likelihood of AI : 0.80

Image errors : The lighting is a bit unnatural, and the smoke seems a bit too clean. The soldier’s expression is a bit too intense.

Conclusion

The results show that the generative AI model captured the intended aesthetic well but struggled to understand the scene and camera position. Here’s a breakdown:

  • Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
  • Shot Analysis: The model scored 0.47, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
  • Aesthetic Analysis: The model scored 0.11, which is considered very good. This means that the generated image closely matched the expected aesthetic style, despite the issues with camera position and scene understanding.

Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
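
For readers who want a single view of the per-image numbers, the sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values reported for the ten images above. It only summarizes those reported values; it does not reproduce the three scores in the breakdown, since the post does not describe how they were calculated. The function name `summarize` and the tuple layout are choices made for this example.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI), as reported above.
RESULTS = [
    ("Rainy Day Reflections", 0.7, 0.25, 0.20),
    ("Superman Stands Tall", 0.6, 0.25, 0.40),
    ("Lost in the Crowd", 0.7, 0.30, 0.20),
    ("Lost in the Code (Dark Room)", 0.6, 0.26, 0.10),
    ("Autumn Melancholy", 0.7, 0.33, 0.20),
    ("Solitude Amidst the Flames", 0.6, 0.32, 0.20),
    ("Silence Speaks Volumes", 0.6, 0.28, 0.00),
    ("Lost in the Code (Intense Focus)", 0.7, 0.27, 0.10),
    ("Lost in the Shadows", 0.7, 0.27, 0.10),
    ("Soldier's Focus Amidst Chaos", 0.7, 0.34, 0.80),
]


def summarize(results):
    """Return mean scores plus the image rated most likely to be AI-generated."""
    most_ai_like = max(results, key=lambda row: row[3])
    return {
        "mean_aesthetic": mean(row[1] for row in results),
        "mean_clip": mean(row[2] for row in results),
        "mean_ai_likelihood": mean(row[3] for row in results),
        "most_ai_like": most_ai_like[0],
    }


for key, value in summarize(RESULTS).items():
    print(key, value)
```

The aesthetic scores sit in a narrow band (0.6 to 0.7), while the Likelihood of AI values vary much more, with the soldier image standing out at 0.80.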
