AI's Facial Expressions: A Step Towards Realism, But Still Room for Growth with Stability-ai-ultra
Facial expressions are a powerful tool in storytelling, conveying emotions and adding depth to characters. Generative AI models are increasingly being used to create realistic images, and one area of focus is the ability to generate convincing facial expressions. This blog post examines the results of a recent experiment, where an AI model was tasked with generating images based on specific prompts, including details about the scene, camera position, and desired facial expressions. While the model shows promise in understanding the scene context, it struggles with accurately capturing camera positions and achieving the desired aesthetic. We’ll delve into the model’s strengths and weaknesses, analyzing its performance in various scenarios.
Created with: stability-ai-ultra
The Man in the Setting Sun: A Moment of Mystery in the City
A solitary figure, shrouded in the golden light of a setting sun, stands on a bustling city street. His intense gaze, directed straight at the camera, evokes a sense of mystery and contemplation. The dramatic lighting and urban backdrop create a captivating scene, leaving the viewer wondering about the man’s story and the secrets he holds.
Prompt
facial-expressions Interest: Intrigued, observant ; A lone figure; eye-level; Single Person; bustling city street; cinematic
Characteristic
Shot : A young man is standing in a city street, looking directly at the camera. The sun is setting behind him, casting a warm glow on the scene. The background is a busy city street, with many people walking around.
Aesthetic Score : 0.7
Mood : mysterious, pensive, urban
Quality
Entropy : 6.80
Noise : 73
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, especially in the shadows. The subject’s hair seems too perfect and might be digitally enhanced.
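Every prompt in this post follows the same semicolon-separated layout: a category and expression list, then the subject, camera position, character type, setting, and style. The sketch below splits a prompt into those parts. The field names (`subject`, `camera`, `character_type`, and so on) are assumptions made for illustration; the post itself does not name the fields.

```python
from dataclasses import dataclass


@dataclass
class ParsedPrompt:
    category: str            # e.g. "facial-expressions"
    expressions: list[str]   # e.g. ["Intrigued", "observant"]
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ParsedPrompt:
    """Split a semicolon-delimited prompt into named fields (names are assumed)."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    head, subject, camera, character_type, setting, style = parts
    # The first field looks like "facial-expressions Interest: Intrigued, observant".
    label, _, expr_text = head.partition(":")
    category = label.split()[0]
    expressions = [e.strip() for e in expr_text.split(",") if e.strip()]
    return ParsedPrompt(
        category, expressions, subject, camera, character_type, setting, style
    )


example = parse_prompt(
    "facial-expressions Interest: Intrigued, observant ; A lone figure; "
    "eye-level; Single Person; bustling city street; cinematic"
)
print(example.camera, example.expressions)
```

Separating out the camera field makes it easy to compare the requested camera position with what the generated image actually shows.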
Batman’s Fiery Gaze: A Hero in the Shadows
A captivating image of Batman, his intense stare piercing through the camera lens. The fiery backdrop and blurred city streets create a dramatic and heroic atmosphere, hinting at the danger lurking in the shadows.
Prompt
facial-expressions Interest: Focused, determined ; A superhero in a dramatic pose; medium shot; Hero; cityscape with a burning building in the background; cinematic
Characteristic
Shot : A man dressed as Batman stands in front of an intense, fiery explosion in a cityscape.
Aesthetic Score : 0.7
Mood : dramatic, heroic, action
Quality
Entropy : 6.92
Noise : 95
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurriness in the background, possibly due to movement or depth of field.
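The post does not say how the Entropy value is computed. One common reading, and only an assumption here, is the Shannon entropy of the image's grayscale histogram, which for 8-bit images ranges from 0 bits (a flat image) to 8 bits (every gray level equally frequent). The reported values between 5.71 and 6.92 would fit that range. A minimal sketch over a plain list of pixel values:

```python
import math
from collections import Counter


def shannon_entropy(pixels: list[int]) -> float:
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    total = len(pixels)
    # Sum of p * log2(1/p) over every gray level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [128] * 1000            # a single gray level everywhere
varied_image = list(range(256)) * 4  # all 256 levels, equally frequent

print(shannon_entropy(flat_image))
print(shannon_entropy(varied_image))
```

Under this reading, a darker and less detailed image such as the rain-streaked window scene would be expected to score lower than a busy street scene.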
Lost in the Pages: A Moment of Tranquility in a Cozy Cafe
A young woman finds solace in a well-worn book, the warm glow of the cafe enveloping her in a peaceful embrace. The soft lighting and cozy decor create an intimate atmosphere, inviting viewers to share in her quiet contemplation.
Prompt
facial-expressions Interest: Engrossed, absorbed ; A woman reading a book in a coffee shop; eye-level; Normal People; warm, inviting cafe interior; cinematic
Characteristic
Shot : A woman is sitting in a cafe, reading a book, with a cup of coffee on the table.
Aesthetic Score : 0.8
Mood : cozy, relaxed, thoughtful
Quality
Entropy : 6.87
Noise : 79
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed, resulting in a washed-out appearance. There is some noise in the shadows, which is particularly noticeable on the woman’s sweater.
Caught in the Glow: A Moment of Intense Focus
A young man, bathed in vibrant red and blue light, stares intently at his computer screen. His expression, a mix of surprise and excitement, speaks volumes about the intensity of the moment. The dramatic lighting and his focused gaze create a palpable sense of tension and anticipation.
Prompt
facial-expressions Interest: Excited, concentrated ; A gamer intensely focused on a screen; close-up; Gamer; dimly lit room with glowing monitor; cinematic
Characteristic
Shot : A young man is looking intensely at a computer screen, illuminated by red and blue lights. His facial expression suggests excitement or surprise.
Aesthetic Score : 0.7
Mood : intense, focused, dramatic
Quality
Entropy : 6.45
Noise : 74
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is some minor noise in the image, particularly in the darker areas. The lighting is slightly uneven and there appears to be a slight overexposure in some sections.
Lost in Thought: A Man’s Melancholy Gaze Through a Rain-Streaked Window
A solitary figure, shrouded in a black hoodie, sits by a window, his face obscured by shadows. The cloudy sky and rain-streaked glass reflect his somber mood, creating a poignant scene of contemplation and introspection.
Prompt
facial-expressions Interest: Contemplative, thoughtful ; A man gazing out a window at a stormy sky; eye-level; Single Person; dark, moody interior; cinematic
Characteristic
Shot : A man sits by a window, looking out at a cloudy, overcast sky. He is dressed in black and appears to be lost in thought.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, introspective
Quality
Entropy : 5.71
Noise : 66
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is a bit too dark and the man’s face is slightly out of focus. The windowpane has some minor artifacts that distract from the overall image.
Heroic Silhouette Against the Setting Sun
A lone superhero, silhouetted against the fiery hues of a dusk sky, stands on a rooftop overlooking the sprawling cityscape. The dramatic composition evokes a sense of contemplation and heroism, leaving the viewer to ponder the weight of the hero’s responsibility.
Prompt
facial-expressions Interest: Confident, determined ; A hero standing on a rooftop overlooking a city; wide shot; Hero; panoramic cityscape with dramatic lighting; cinematic
Characteristic
Shot : A man in a superhero costume stands on a rooftop looking out at the city lights at dusk.
Aesthetic Score : 0.6
Mood : dramatic, brooding, powerful
Quality
Entropy : 6.65
Noise : 73
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some artifacts around the edges of the man’s costume. The city lights are also a bit blurry.
Laughter and Light: Friends Share a Joyful Dinner
A group of friends gather for a warm and intimate dinner, their laughter filling the air. The soft lighting creates a cozy atmosphere, highlighting the joy and connection they share.
Prompt
facial-expressions Interest: Happy, engaged ; A group of friends laughing together at a dinner table; eye-level; Normal People; cozy, homey dining room; cinematic
Characteristic
Shot : A group of friends are enjoying dinner at a table lit by candles and lamps. The woman is laughing and looking at the man across the table.
Aesthetic Score : 0.7
Mood : happy, intimate, warm
Quality
Entropy : 6.87
Noise : 87
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Immersed in the Game: Hands Fly Across the Keyboard
A close-up shot captures the intensity of a gamer’s focus as their hands dance across the keyboard and mouse. The vibrant purple and blue lighting creates a captivating atmosphere, hinting at the excitement and challenge of the game.
Prompt
facial-expressions Interest: Thrilled, focused ; A gamer’s hands rapidly moving across a keyboard and mouse; close-up; Gamer; brightly lit gaming setup with flashing lights; cinematic
Characteristic
Shot : A person’s hand using a computer mouse and keyboard. The background is blurred and has a pink-blue color scheme.
Aesthetic Score : 0.6
Mood : intense, focused, tech
Quality
Entropy : 6.73
Noise : 70
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : No significant errors; colors are a bit saturated.
Lost in the Canvas: A Moment of Contemplation in an Art Gallery
A young woman stands captivated by a Renaissance masterpiece, her thoughtful gaze drawn to the vibrant colors and intricate details. The natural light streaming through the gallery window illuminates the scene, creating a sense of depth and atmosphere. The contrast between her contemplative expression and the lively painting evokes a sense of wonder and curiosity.
Prompt
facial-expressions Interest: Appreciative, curious ; A woman looking at a painting in a museum; eye-level; Single Person; grand museum hall with intricate artwork; cinematic
Characteristic
Shot : A woman is standing in an art gallery, looking up at a painting. The painting is of a group of women in a landscape setting. The woman in the foreground is wearing a black coat and has long brown hair. The background is blurred and out of focus.
Aesthetic Score : 0.7
Mood : elegant, contemplative, curious
Quality
Entropy : 6.84
Noise : 83
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors or artifacts in the image.
Intense Gaze, Fiery Backdrop: A Portrait of Danger
A man in a leather jacket stares directly into the camera, his expression intense and unwavering. The fiery background, whether a raging inferno or a dramatic sunset, adds a sense of danger and tension to the scene. This image evokes a mood of intensity, drama, and potential threat.
Prompt
facial-expressions Interest: Intense, focused ; A hero facing off against a villain; medium shot; Hero; dramatic, action-packed scene with explosions and smoke; cinematic
Characteristic
Shot : A man in a black leather jacket is standing in front of a fire. The fire is in the background and out of focus. The man is looking at the camera with a serious expression on his face.
Aesthetic Score : 0.7
Mood : intense, dramatic, mysterious
Quality
Entropy : 6.20
Noise : 74
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, but some minor noise
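The per-image numbers above can be summarized with a few lines of Python. The lists below are copied from the ten images in the order they appear. The variable and function names are illustrative, and these averages are not the same metrics as the scores reported in the conclusion.

```python
from statistics import mean

# Values copied from the ten image sections, in order of appearance.
aesthetic_scores = [0.7, 0.7, 0.8, 0.7, 0.6, 0.6, 0.7, 0.6, 0.7, 0.7]
clip_scores = [0.21, 0.26, 0.29, 0.25, 0.27, 0.24, 0.28, 0.26, 0.29, 0.22]
ai_likelihoods = [0.20, 0.10, 0.20, 0.30, 0.20, 0.30, 0.10, 0.30, 0.10, 0.20]


def summarize(name: str, values: list[float]) -> str:
    """Return the mean, minimum, and maximum of one metric as a single line."""
    return (
        f"{name}: mean={mean(values):.3f} "
        f"min={min(values):.2f} max={max(values):.2f}"
    )


for name, values in [
    ("Aesthetic Score", aesthetic_scores),
    ("Prompt Clip Score", clip_scores),
    ("Likelihood of AI", ai_likelihoods),
]:
    print(summarize(name, values))
```

Across the ten images, the Aesthetic Score averages 0.68, the Prompt Clip Score about 0.26, and the Likelihood of AI 0.20.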
Conclusion
The results show that the generative AI model understood the scene described in the prompt moderately well, but struggled with camera position and with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.25, indicating it’s not very good at reacting to camera positions in the prompt. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Shot Analysis: The model scored 0.46, which is okay but not great. It suggests the model is able to understand the scene in the prompt to some extent, but not perfectly. A score between 0.5 and 0.75 would be considered good, and above 0.75 very good.
- Aesthetic Analysis: The model scored 0.16, which falls outside the range considered very good. A score between -0.2 and 0.1 would be considered very good, indicating the generated image closely matches the expected aesthetic.
Overall, the model needs improvement in its ability to accurately interpret camera positions and achieve the desired aesthetic.
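The score bands quoted in the breakdown can be written down as a small sketch. The function names and the "below good" label are assumptions, as is the handling of scores that fall exactly on a boundary; the post only states the 0.5 to 0.75 and above 0.75 bands for camera position and shot analysis, and the -0.2 to 0.1 band for aesthetics.

```python
def rate_alignment(score: float) -> str:
    """Rate a camera-position or shot-analysis score using the stated bands."""
    if score > 0.75:
        return "very good"
    if score >= 0.5:
        return "good"
    return "below good"  # label assumed; the post names no band under 0.5


def rate_aesthetic(score: float) -> str:
    """Rate an aesthetic score; -0.2 to 0.1 is the band called very good."""
    if -0.2 <= score <= 0.1:
        return "very good"
    return "outside the very good band"


# Scores reported in the breakdown above.
print("Camera Position:", rate_alignment(0.25))
print("Shot Analysis:", rate_alignment(0.46))
print("Aesthetic Analysis:", rate_aesthetic(0.16))
```

All three reported scores fall outside the bands the post describes as good or very good, with shot analysis (0.46) closest to the 0.5 threshold.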