AI Captures the Nuance of Facial Expressions, But Struggles with Camera Angles (Stable Diffusion)
Dramatic facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. Generative AI models are increasingly adept at capturing these nuances, creating images that evoke a wide range of feelings. This blog post explores the capabilities of one such model, analyzing its performance in generating images with realistic facial expressions. We’ll delve into the model’s strengths and weaknesses, highlighting its ability to capture the essence of a scene while exploring areas for improvement in camera positioning. Join us as we uncover the potential of AI in creating compelling visual narratives.
Created with: stability-ai-core
Lost in the Neon Maze: A Solitary Figure Navigates the City’s Night
A lone man walks through a vibrant, bustling city at night, the neon lights reflecting off the wet pavement. The image evokes a sense of isolation and mystery, capturing the urban landscape’s allure and the individual’s solitary journey.
Prompt
facial-expressions Skepticism: Melancholy, disillusioned ; A lone figure, back turned, walking away from a brightly lit city skyline; eye-level; Single Person; Urban, neon signs, bustling crowds; cinematic
Characteristic
Shot : A lone figure walks down a bustling street in a city, surrounded by neon signs and crowds of people. The street is lit by streetlights and the reflections of the signs.
Aesthetic Score : 0.7
Mood : urban, lonely, mysterious
Quality
Entropy : 6.14
Noise : 65
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some minor artifacts are visible in the neon signs, particularly around the edges. There are also a few instances of over-sharpening, which can be seen in the details of the figures and the buildings.
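The post does not say how the Entropy value is computed. One common definition, assumed here purely for illustration, is the Shannon entropy of an image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (every gray level equally likely). The function name `shannon_entropy` is hypothetical and not taken from the evaluation pipeline behind this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return the Shannon entropy, in bits, of a sequence of 8-bit gray values.

    Assumption: this mirrors a common histogram-based definition; the tool
    that produced the Entropy numbers in this post may use something else.
    """
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total            # probability of this gray level
        entropy -= p * math.log2(p)  # contribution of this level in bits
    return entropy


if __name__ == "__main__":
    flat = [128] * 1000              # single tone: no information
    spread = list(range(256)) * 4    # every level equally likely
    print(shannon_entropy(flat), shannon_entropy(spread))
```

Under this assumption, the values reported in this post (roughly 5 to 7) would sit in the upper part of the 0 to 8 range, which fits detailed, textured scenes.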
Superman Stands Against the Flames
A dramatic image of Superman standing on a burning rooftop, the flames licking at his feet, as a city burns in the background. The scene evokes a sense of heroism and urgency, with the juxtaposition of Superman’s power against the apocalyptic backdrop creating a powerful visual.
Prompt
facial-expressions Skepticism: Doubtful, conflicted ; A superhero, cape billowing, standing on a rooftop, looking down at a city in chaos; eye-level; Hero; Smoke, fire, destruction; cinematic
Characteristic
Shot : Superman standing on a rooftop with fire and smoke in the background, facing the camera
Aesthetic Score : 0.6
Mood : dramatic, heroic, powerful
Quality
Entropy : 6.84
Noise : 76
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.70
Image errors : The fire looks a bit artificial. There is some blurring around the edges of the image. The city skyline is not very detailed.
Troubled Times: Woman Reads News with a Frown
A woman sits alone in a bustling cafe, her brow furrowed as she reads a newspaper. The headline, though obscured, hints at troubling news, adding to her air of concern and isolation. The busy background only emphasizes her solitude, leaving the viewer to wonder what anxieties she faces.
Prompt
facial-expressions Skepticism: Cynical, disbelieving ; A woman, dressed in everyday clothes, holding a newspaper with a sensational headline; eye-level; Normal People; Coffee shop, people going about their day; cinematic
Characteristic
Shot : A woman is sitting in a cafe, reading a newspaper, with a cup of coffee on the table. The cafe is bustling with people, and the atmosphere is busy and noisy.
Aesthetic Score : 0.7
Mood : casual, relaxed, contemplative
Quality
Entropy : 6.75
Noise : 72
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts in the image, such as noise in the background and some blurring of the newspaper.
Lost in Thought, Coke in Hand
A young man sits in a dimly lit room, his gaze fixed on something unseen. A can of Coca-Cola rests in his hand, a pizza sits untouched before him. The low light and his pensive expression create an atmosphere of mystery and contemplation.
Prompt
facial-expressions Skepticism: Suspicious, wary ; A gamer, hunched over a computer screen, surrounded by empty pizza boxes and energy drink cans; close-up; Gamer; Dark room, flashing lights, gaming peripherals; cinematic
Characteristic
Shot : A young man is sitting at a desk, looking intently at a computer screen. There is a pizza and a can of Coca-Cola in front of him. The lighting is dim, creating a moody atmosphere.
Aesthetic Score : 0.6
Mood : dark, focused, contemplative
Quality
Entropy : 5.04
Noise : 59
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight noise in the shadows, the pizza looks slightly blurry.
Lost in Thought: A Man’s Contemplative Moment at a Dimly Lit Bar
A solitary figure, shrouded in shadow, sits at a bar, lost in contemplation. The dim lighting and his thoughtful expression create an air of mystery and intrigue, leaving the viewer wondering what secrets lie within his mind.
Prompt
facial-expressions Skepticism: Doubtful, introspective ; A man, sitting alone in a dimly lit bar, staring into his drink; eye-level; Single Person; Empty bar, flickering neon lights, rain outside; cinematic
Characteristic
Shot : A man sits alone at a bar, looking pensively at his drink. The bar is dimly lit, and there are other people in the background, but they are out of focus.
Aesthetic Score : 0.6
Mood : melancholy, lonely, introspective
Quality
Entropy : 5.88
Noise : 65
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight amount of noise, but it’s not particularly noticeable. There are a few artifacts around the edges of the image.
One Man, One Gun, A City on Edge
A lone figure in a futuristic suit, armed and poised, stands before a vast crowd. The scene is charged with tension, hinting at a dramatic confrontation in a world of advanced technology and uncertain futures.
Prompt
facial-expressions Skepticism: Uncertain, hesitant ; A hero, standing in front of a crowd, holding a weapon, but looking conflicted; eye-level; Hero; cheering crowd, bright lights, stage; cinematic
Characteristic
Shot : A man in a futuristic military uniform, holding a gun, stands in front of a large crowd.
Aesthetic Score : 0.7
Mood : intense, dramatic, suspenseful
Quality
Entropy : 6.62
Noise : 78
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major errors, but the background is a bit blurry and lacks detail.
Warmth and Intimacy: Friends Gather for a Casual Dinner
A group of friends share laughter and conversation over a meal in a dimly lit dining room. The soft lighting and their close interactions create a sense of warmth and intimacy, capturing the essence of a casual, friendly gathering.
Prompt
facial-expressions Skepticism: Disbelieving, amused ; A group of friends, gathered around a table, listening to a story with skeptical expressions; eye-level; Normal People; Cozy living room, warm lighting, snacks; cinematic
Characteristic
Shot : A group of friends are gathered around a table, eating and talking. The lighting is warm and inviting, and the atmosphere is casual and relaxed.
Aesthetic Score : 0.6
Mood : cozy, casual, relaxed
Quality
Entropy : 6.73
Noise : 73
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors, but the image is a bit blurry and the colors are not very vibrant.
Lost in the Digital World: A Moment of Contemplation
A man sits in a dimly lit room, his face illuminated by the glow of a computer screen. A video game controller rests before him, but his gaze is fixed on the digital world, lost in thought. The scene evokes a sense of mystery and intrigue, leaving the viewer to wonder what secrets lie within the screen.
Prompt
facial-expressions Skepticism: Frustrated, doubtful ; A gamer, staring intently at a screen, but with a look of frustration; close-up; Gamer; Brightly lit room, gaming setup, controller in hand; cinematic
Characteristic
Shot : A man sits in a dimly lit room, looking thoughtfully at a computer monitor. His arm is resting on a table with a gaming controller nearby.
Aesthetic Score : 0.6
Mood : thoughtful, contemplative, serious
Quality
Entropy : 6.52
Noise : 64
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Lost in the City: A Woman’s Mysterious Gaze
A woman in a blue coat walks through a bustling city street, her face shrouded in mystery. The blurred background emphasizes her solitary presence, leaving the viewer to wonder about her secrets and her destination. The image evokes a sense of urban intrigue and pensive contemplation.
Prompt
facial-expressions Skepticism: Paranoid, distrustful ; A woman, walking through a crowded street, looking around with suspicion; eye-level; Single Person; Busy city street, people rushing by, street vendors; cinematic
Characteristic
Shot : A woman in a blue coat is walking in a city street. The background is blurred with people and shops.
Aesthetic Score : 0.7
Mood : mysterious, urban, pensive
Quality
Entropy : 6.75
Noise : 76
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some minor blurriness in the background due to motion.
Silhouetted Against the City’s Melancholy Glow
A solitary figure stands on a rooftop, silhouetted against a city bathed in the soft pink and grey hues of dusk. The scene evokes a sense of quiet contemplation and loneliness as the figure gazes out at the distant lights, lost in thought.
Prompt
facial-expressions Skepticism: Isolated, disillusioned ; A hero, standing on a rooftop, looking out at a city skyline, but with a sense of loneliness; eye-level; Hero; City lights, distant sounds of the city; cinematic
Characteristic
Shot : A man stands on a rooftop overlooking a city skyline at dusk. The city lights are twinkling in the distance, and the sky is a soft orange and blue.
Aesthetic Score : 0.7
Mood : calm, reflective, hopeful
Quality
Entropy : 6.69
Noise : 74
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The city lights are slightly blurry and the overall image has a slightly grainy texture. This could be due to the low-light conditions or post-processing.
Conclusion
The results show that the generative AI model understood the described scenes well and matched the expected aesthetic, but struggled to reproduce the intended camera position. Here’s a breakdown:
- Camera Position: The model scored 0.15, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.55, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.08, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated image is very good.
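As a cross-check on the gallery, the per-image numbers listed above can be averaged directly. The sketch below transcribes the Aesthetic Score, Prompt Clip Score, and Likelihood of AI for the ten images and computes simple means. The names `RESULTS` and `summarize` are illustrative, and these plain averages are not the same thing as the Camera Position, Shot Analysis, and Aesthetic Analysis scores in the breakdown, whose calculation the post does not describe.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# transcribed from the gallery entries in this post.
RESULTS = [
    ("Lost in the Neon Maze", 0.7, 0.25, 0.10),
    ("Superman Stands Against the Flames", 0.6, 0.27, 0.70),
    ("Troubled Times", 0.7, 0.30, 0.10),
    ("Lost in Thought, Coke in Hand", 0.6, 0.29, 0.10),
    ("Lost in Thought (Dimly Lit Bar)", 0.6, 0.25, 0.20),
    ("One Man, One Gun, A City on Edge", 0.7, 0.22, 0.10),
    ("Warmth and Intimacy", 0.6, 0.27, 0.20),
    ("Lost in the Digital World", 0.6, 0.29, 0.10),
    ("Lost in the City", 0.7, 0.23, 0.10),
    ("Silhouetted Against the City's Melancholy Glow", 0.7, 0.25, 0.20),
]


def summarize(results):
    """Return the mean of each metric and the image most likely flagged as AI."""
    most_ai = max(results, key=lambda row: row[3])
    return {
        "mean_aesthetic": mean(row[1] for row in results),
        "mean_clip": mean(row[2] for row in results),
        "mean_ai_likelihood": mean(row[3] for row in results),
        "most_ai_like": most_ai[0],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

The means come out near 0.65 for aesthetic score, 0.26 for prompt CLIP score, and 0.19 for likelihood of AI. The Superman image is the clear outlier on that last metric.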