AI's Facial Expressions: A Step Towards Realism, But Still Room for Growth with Stable Diffusion

AI's Facial Expressions: A Mixed Bag of Success and Struggle with Stable Diffusion

Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. In the realm of AI-generated imagery, capturing these expressions accurately is crucial for creating compelling and engaging visuals. This analysis explores the performance of a generative AI model in generating images with specific facial expressions, highlighting its strengths and weaknesses in understanding and translating complex prompts.

Created with: stability-ai-core

Lost in the Neon Maze

A solitary figure navigates the bustling city streets at night, bathed in the vibrant glow of neon signs. The shallow depth of field isolates the man, creating a sense of mystery and intrigue. His expression hints at a hidden story, leaving you wondering what secrets lie within the urban labyrinth.

Prompt

facial-expressions Confusion: Disoriented, overwhelmed ; A lone figure; eye-level; Single Person; a bustling city street with neon signs and crowds; cinematic

Characteristic

Shot : A man is standing in the middle of a street in a city. There are many neon signs and a lot of light pollution.

Aesthetic Score : 0.7

Mood : mysterious, urban, lonely

Quality

Entropy : 6.46

Noise : 67

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image has some noise, particularly in the shadows. The neon signs are also a bit overexposed, which makes them lose some detail.
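The article does not say how the Entropy figure is computed. One common convention, assumed here purely for illustration, is the Shannon entropy of an image's 8-bit grayscale values, measured in bits. The sketch below shows that calculation on toy pixel lists; the function name `shannon_entropy` and the sample data are hypothetical and not taken from the evaluation pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of gray levels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum -p * log2(p) over every gray level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Toy inputs (hypothetical, not taken from the article's images).
flat = [128] * 64                  # a single gray level
two_tone = [0] * 32 + [255] * 32   # two levels, equally frequent
ramp = list(range(256))            # all 256 levels, equally frequent

for name, data in [("flat", flat), ("two_tone", two_tone), ("ramp", ramp)]:
    print(f"{name}: {shannon_entropy(data):.2f} bits")
```

Under this assumption the scale runs from 0 bits (one flat tone) to 8 bits (every gray level equally likely), so the values reported in this gallery, between 6.11 and 6.84, would sit toward the tonally varied end.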

Superman Rises from the Ashes

A gritty and realistic depiction of Superman standing amidst a destroyed city, his determined gaze promising hope in the face of devastation. The contrast between his powerful physique and the ruined cityscape creates a powerful sense of drama and intensity.

Prompt

facial-expressions Confusion: Doubt, uncertainty ; A superhero in a tattered costume; eye-level; Hero; a destroyed cityscape with smoke and debris; cinematic

Characteristic

Shot : A superhero, possibly Superman, is walking through a destroyed city, looking determined and with a slight grimace on his face.

Aesthetic Score : 0.7

Mood : dramatic, gritty, somber

Quality

Entropy : 6.84

Noise : 84

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.60

Image errors : No significant image errors. The lighting could be slightly improved.
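The Prompt Clip Score is not defined in the article either. A CLIP-style score is commonly the cosine similarity between an embedding of the prompt text and an embedding of the generated image, and the sketch below assumes that convention. The vectors are made-up toy values standing in for real embeddings, and `cosine_similarity` is a hypothetical helper, not the evaluator's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a prompt embedding and an image embedding (assumed values).
prompt_embedding = [0.2, 0.7, 0.1, 0.4]
image_embedding = [0.6, 0.1, 0.5, 0.3]

print(round(cosine_similarity(prompt_embedding, image_embedding), 3))
```

If the score is indeed a raw cosine similarity, a higher value means the image embedding lies closer to the prompt embedding; the metric by itself does not say which part of the prompt (subject, setting, or facial expression) was matched.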

Corporate Tensions Rise: What’s the Big Secret?

A sense of unease hangs in the air as a group of corporate professionals gather, their gazes fixed on something unseen. The mood is serious, the atmosphere tense, and the question remains: what is the source of this palpable tension?

Prompt

facial-expressions Confusion: Lost, unmoored ; A woman in a business suit; eye-level; Normal People; a sterile office with fluorescent lights and cubicles; cinematic

Characteristic

Shot : The image portrays a series of scenes within an office setting, featuring individuals in professional attire, likely involved in a high-stakes situation. The visual style leans towards a dramatic and suspenseful aesthetic.

Aesthetic Score : 0.7

Mood : tense, professional, serious

Quality

Entropy : 6.77

Noise : 67

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors were detected in the image. The lighting and color balance are consistent across the scenes, and the image appears to be of high quality.

In the Zone: A Gamer’s Focus Under Low Light

A young man, headphones on, sits before a wall of computer monitors, his expression intense and focused. The low lighting adds a dramatic edge, highlighting his concentration as he navigates the digital world.

Prompt

facial-expressions Confusion: Frustration, bewilderment ; A gamer with headphones on; close-up; Gamer; a dimly lit room with a computer screen displaying a complex game interface; cinematic

Characteristic

Shot : A young man wearing headphones is sitting in front of a computer screen in a dimly lit room. He appears to be focused on something on the screen.

Aesthetic Score : 0.6

Mood : focused, serious, concentrated

Quality

Entropy : 6.11

Noise : 66

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are no visible artifacts or errors in the image.

Lost in the Fog: A Man’s Shadowy Journey

A solitary figure, cloaked in a trench coat, stands amidst the swirling fog of a narrow alleyway. The dim glow of streetlamps casts long, eerie shadows, adding to the atmosphere of mystery and suspense. This brooding scene evokes a sense of intrigue, leaving the viewer to wonder about the man’s secrets and the path he is destined to take.

Prompt

facial-expressions Confusion: Suspicious, wary ; A man in a trench coat; eye-level; Single Person; a foggy alleyway with flickering streetlights; cinematic

Characteristic

Shot : A man in a trench coat stands in a foggy, cobblestone alleyway. The light from streetlamps casts long shadows.

Aesthetic Score : 0.7

Mood : mysterious, moody, atmospheric

Quality

Entropy : 6.63

Noise : 71

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image appears to have some noise and artifacting, especially in the shadows.

The Antlered Knight: A Mystery in the Woods

A shadowy figure in medieval armor, adorned with antlers, stands amidst a blurred forest. The image evokes a sense of mystery and suspense, leaving the viewer to ponder the knight’s purpose and the secrets hidden within the woods.

Prompt

facial-expressions Confusion: Disillusioned, lost ; A knight in shining armor; eye-level; Hero; a dark forest with twisted trees and ominous shadows; cinematic

Characteristic

Shot : A collage of nine images of a man in a knight’s armor and helmet, set in a dark forest with tall trees

Aesthetic Score : 0.6

Mood : dark, mysterious, fantasy

Quality

Entropy : 6.66

Noise : 84

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.70

Image errors : The images appear to have been digitally manipulated, with some imperfections in the edges and seams, and some slight blurring.

Tension at the Table: A Moment of Uncomfortable Truth

A group of people gather around a kitchen table, their faces etched with tension. The warm lighting and remnants of a meal create a stark contrast to the palpable unease in the air. The composition, with characters tightly clustered, amplifies the feeling of claustrophobia and suspense, leaving the viewer wondering what secrets lie beneath the surface.

Prompt

facial-expressions Confusion: Awkward, uncomfortable ; A family at a dinner table; eye-level; Normal People; a brightly lit kitchen with mismatched plates and silverware; cinematic

Characteristic

Shot : A group of people are sitting around a table eating dinner. There is a tense atmosphere in the room, and it appears as if they are having a difficult conversation. The lighting is warm and inviting, but the composition is not very dynamic.

Aesthetic Score : 0.6

Mood : tense, serious, somber

Quality

Entropy : 6.79

Noise : 77

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image is slightly overexposed in some areas, and the colors are a little bit muted.

Gamer’s Shock: Caught in the Heat of the Game

A young man’s face is etched with surprise and intensity as he plays video games, surrounded by multiple screens. The scene captures the thrill and focus of a gamer fully immersed in the digital world.

Prompt

facial-expressions Confusion: Overwhelmed, disoriented ; A gamer holding a controller; close-up; Gamer; a brightly lit room with a TV screen displaying a chaotic game scene; cinematic

Characteristic

Shot : A young man wearing a headset and holding a gaming controller, sitting in a dimly lit room with multiple screens behind him.

Aesthetic Score : 0.6

Mood : intense, focused, surprised

Quality

Entropy : 6.53

Noise : 62

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible artifacts or errors in the image.

Lost in the City: A Woman’s Solitary Stroll

A woman walks through a bustling city street, her face shrouded in mystery as the shallow depth of field isolates her from the surrounding crowd. The urban landscape and pensive mood create a sense of intrigue, leaving you wondering about her story.

Prompt

facial-expressions Confusion: Lost, alienated ; A woman walking down a crowded street; eye-level; Single Person; a bustling city street with people rushing past; cinematic

Characteristic

Shot : A woman in a trench coat is walking down a busy city street. The city background is blurred and out of focus.

Aesthetic Score : 0.7

Mood : mysterious, urban, cool

Quality

Entropy : 6.79

Noise : 77

Prompt Clip Score : 0.26

AI Evaluation

Likelihood of AI : 0.20

Image errors : No significant errors

City Lights, City Hope: A Superhero Stands Watch

A lone figure, silhouetted against the moonlit cityscape, embodies hope and heroism. This dramatic image captures the essence of a superhero’s unwavering commitment to protecting the city below.

Prompt

facial-expressions Confusion: Doubt, questioning ; A superhero standing on a rooftop; eye-level; Hero; a cityscape with twinkling lights and a full moon; cinematic

Characteristic

Shot : A superhero, possibly Superman, stands on a rooftop overlooking a city at night, the full moon illuminating the scene.

Aesthetic Score : 0.6

Mood : dramatic, heroic, powerful

Quality

Entropy : 6.70

Noise : 70

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.70

Image errors : The image exhibits some noticeable artifacts, particularly in the background cityscape, suggesting some level of digital manipulation.

Conclusion

The results suggest that the generative AI model handled the aesthetic aspect better than it handled scene understanding and camera position. Here’s a breakdown:

  • Camera Position: The model scored 0.25, which is below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
  • Shot Analysis: The model scored 0.45, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
  • Aesthetic Analysis: The model scored 0.1, which is considered very good. This means that the generated image closely matched the expected aesthetic style, despite the issues with camera position and scene understanding.

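Overall, the model seems to be better at capturing the desired aesthetic than at understanding the scene and camera position. The gallery points the same way for expressions: several prompts asked for confusion or doubt, yet the recorded shots and moods read as determined (the first Superman image) or focused (the first gamer image). This suggests that the model might need further training to improve its ability to interpret and translate complex prompts into visually accurate images.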
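The per-image figures listed in the gallery can also be tabulated directly. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values for the ten images, tags each with the subject category from its prompt, and averages them. The `Result` record and the `mean_by_category` helper are illustrative names, not part of the original evaluation, and these averages are separate from the camera, shot, and aesthetic scores quoted in the breakdown.

```python
from collections import defaultdict, namedtuple
from statistics import mean

Result = namedtuple("Result", "title category aesthetic clip_score ai_likelihood")

# Values copied from the image cards in this article.
RESULTS = [
    Result("Lost in the Neon Maze", "Single Person", 0.7, 0.26, 0.20),
    Result("Superman Rises from the Ashes", "Hero", 0.7, 0.29, 0.60),
    Result("Corporate Tensions Rise", "Normal People", 0.7, 0.29, 0.20),
    Result("In the Zone", "Gamer", 0.6, 0.25, 0.20),
    Result("Lost in the Fog", "Single Person", 0.7, 0.31, 0.10),
    Result("The Antlered Knight", "Hero", 0.6, 0.27, 0.70),
    Result("Tension at the Table", "Normal People", 0.6, 0.25, 0.10),
    Result("Gamer's Shock", "Gamer", 0.6, 0.28, 0.20),
    Result("Lost in the City", "Single Person", 0.7, 0.26, 0.20),
    Result("City Lights, City Hope", "Hero", 0.6, 0.29, 0.70),
]


def mean_by_category(results, field):
    """Average one numeric field per prompt category."""
    groups = defaultdict(list)
    for result in results:
        groups[result.category].append(getattr(result, field))
    return {category: mean(values) for category, values in groups.items()}


FIELDS = ("aesthetic", "clip_score", "ai_likelihood")
overall = {field: mean(getattr(r, field) for r in RESULTS) for field in FIELDS}
ai_by_category = mean_by_category(RESULTS, "ai_likelihood")

for field, value in overall.items():
    print(f"mean {field}: {value:.3f}")
for category, value in sorted(ai_by_category.items()):
    print(f"mean ai_likelihood for {category}: {value:.2f}")
```

On these ten images the "Hero" prompts average a Likelihood of AI of about 0.67, while every other category averages 0.20 or lower. With only two or three images per category, this is an observation about the sample rather than a general finding.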
