AI's Facial Expressions: A Mixed Bag of Success with Stability-ai-ultra
Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Dramatic facial expressions, in particular, can heighten the impact of a scene and draw the viewer in. Generative AI models are increasingly being used to create images with specific facial expressions, but how well do they capture the nuances of human emotion and the aesthetic style of a scene? This blog post explores the capabilities and limitations of AI in generating images with dramatic facial expressions, using a series of prompts that test the model’s ability to understand scene context, camera position, and aesthetic style.
Created with: stability-ai-ultra
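Every prompt in this post follows the same semicolon-separated layout. The sketch below splits one prompt into its parts so the tested dimensions (camera position, scene, aesthetic style) can be inspected separately. The field names (`category`, `emotion`, `nuance`, `subject`, `camera`, `group`, `scene`, `style`) are hypothetical labels inferred from the prompts shown here, not names defined by the model or by any Stability AI API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # All field names are assumptions made for this illustration.
    category: str  # e.g. "facial-expressions"
    emotion: str   # e.g. "Embarrassment"
    nuance: str    # e.g. "Awkward and self-conscious"
    subject: str   # e.g. "A single woman"
    camera: str    # e.g. "eye-level"
    group: str     # e.g. "Single Persons"
    scene: str     # e.g. "A crowded cafe with loud chatter and laughter"
    style: str     # e.g. "cinematic"


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into labeled parts."""
    fields = [field.strip() for field in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    head, subject, camera, group, scene, style = fields
    # The first field looks like "<category> <emotion>: <nuance>".
    label, _, nuance = head.partition(":")
    category, _, emotion = label.strip().partition(" ")
    return PromptParts(category, emotion.strip(), nuance.strip(),
                       subject, camera, group, scene, style)


if __name__ == "__main__":
    parts = parse_prompt(
        "facial-expressions Embarrassment: Awkward and self-conscious ; "
        "A single woman; eye-level; Single Persons; "
        "A crowded cafe with loud chatter and laughter; cinematic"
    )
    print(parts.emotion, "|", parts.camera, "|", parts.style)
```

Splitting the prompt this way makes it easier to compare what was requested (for example, the emotion and the camera position) against the "Shot" and "Mood" descriptions reported for each image.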
Caught in the Moment: A Look of Surprise Amidst the Blur
A young woman with dark hair sits at a table, her face etched with surprise. The surrounding figures are blurred, creating a sense of mystery and leaving the viewer to wonder what has just transpired. The scene is both dramatic and intense, capturing a fleeting moment in a story waiting to be told.
Prompt
facial-expressions Embarrassment: Awkward and self-conscious ; A single woman; eye-level; Single Persons; A crowded cafe with loud chatter and laughter; cinematic
Characteristic
Shot : A woman in a red shirt is sitting at a table in a cafe, looking surprised. There are other people in the background, blurred and out of focus.
Aesthetic Score : 0.8
Mood : surprised, dramatic, intimate
Quality
Entropy : 6.67
Noise : 80
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has a slightly cartoonish style and the colors are vibrant. There is some aliasing around edges and details. The blurring of the background is not very realistic.
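The post does not say how the Entropy value in the Quality block is computed. One common definition, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally likely). The values reported in this post fall within that range. The function name `shannon_entropy` and the flat list of pixel values are choices made for this sketch only.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a sequence of pixel intensities.

    Assumption: this is one plausible definition of the "Entropy" metric;
    the post does not confirm which definition its tooling uses.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    flat = [128] * 1000                 # a single gray level
    uniform = list(range(256)) * 4      # every 8-bit level equally often
    print(shannon_entropy(flat))        # no variation at all
    print(shannon_entropy(uniform))     # maximum for 8-bit data
```

Under this definition, higher values mean the tonal range is used more evenly, which says nothing on its own about whether the expression in the image matches the prompt.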
The Masked Hero Stands Alone, Ready to Strike
A superhero, cloaked in a dark blue and red suit, emerges from the blur of a crowd. Their focused expression and the dramatic lighting create a sense of mystery and anticipation, leaving the audience wondering what lies ahead.
Prompt
facial-expressions Embarrassment: Humiliated and exposed ; A superhero in a full costume; eye-level; Heroes; A bustling city street with people staring; cinematic
Characteristic
Shot : A man dressed as a superhero stands in a crowded city street.
Aesthetic Score : 0.6
Mood : mysterious, powerful, urban
Quality
Entropy : 6.90
Noise : 90
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some blurriness in the background, and the edges of the superhero’s costume are slightly pixelated.
From Shocked to Shocked: One Man’s Hilarious Journey Through a Formal Event
This series of four photos captures a man’s escalating shock and surprise at a formal event. The comedic juxtaposition of the scenes, from the mundane to the unexpected, creates a hilarious and relatable experience. Get ready to laugh along with this man’s awkward, yet undeniably entertaining, journey.
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A man in a business suit; eye-level; Normal People; A formal dinner party with elegant guests; cinematic
Characteristic
Shot : Four photos of a man in a tuxedo reacting in a funny way, presumably at a dinner party. The pictures are presented in a grid format and there is some text added below each image, describing the situation, but the text is poorly formatted.
Aesthetic Score : 0.2
Mood : funny, awkward, embarrassing
Quality
Entropy : 6.66
Noise : 72
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.00
Image errors : The images are low-resolution and the text is poorly formatted.
Caught in the Heat of the Game: A Moment of Shock Under Neon Lights
A young gamer, bathed in vibrant blue and pink neon, stares directly at the camera with a look of pure shock. The intensity of the moment is palpable, captured in the gamer’s expression and the dramatic lighting. This image perfectly encapsulates the thrill and suspense of competitive gaming.
Prompt
facial-expressions Embarrassment: Cringing and defeated ; A gamer in a gaming chair; eye-level; Gamer; A dimly lit room with flashing screens and empty pizza boxes; cinematic
Characteristic
Shot : A young man is sitting in a gaming chair, wearing headphones and looking at the camera with a surprised expression, holding a controller in his hands.
Aesthetic Score : 0.6
Mood : intense, surprised, energetic
Quality
Entropy : 6.59
Noise : 66
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor artifacts and noise are visible in the background of the image.
A Moment of Love and Light: The Bride’s Radiant Glow
In the midst of a joyous gathering, a bride in a stunning white dress and veil captivates the scene. The close-up perspective highlights her radiant face and elegant attire, while the soft lighting adds a romantic and intimate touch to the atmosphere. The bride’s focused gaze exudes happiness and anticipation, making this a truly unforgettable moment.
Prompt
facial-expressions Embarrassment: Lonely and out of place ; A woman in a wedding dress; eye-level; Single Persons; A crowded wedding reception with happy couples; cinematic
Characteristic
Shot : A bride in a white wedding dress and veil is standing in a church or wedding ceremony venue. She is looking off to the side, perhaps at her groom or at the officiant. The scene is blurry in the background, suggesting that the photo was taken during the ceremony.
Aesthetic Score : 0.8
Mood : romantic, elegant, hopeful
Quality
Entropy : 6.81
Noise : 82
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed and there is a slight vignette effect around the edges.
The Evolution of Superman: A Collage of Iconic Moments
This powerful collage captures the essence of Superman across generations, showcasing different actors embodying the iconic hero. From the youthful exuberance of a young Superman to the seasoned gravitas of a veteran, the image evokes a sense of time and the enduring legacy of the Man of Steel.
Prompt
facial-expressions Embarrassment: Embarrassed and self-conscious ; A superhero in a cape; eye-level; Heroes; A cheering crowd at a victory parade; cinematic
Characteristic
Shot : Four images of a person dressed as Superman at different ages: two are in a crowd, one is against a blue background, and one is in a dramatic scene with fire and blurred people.
Aesthetic Score : 0.6
Mood : serious, hopeful, heroic
Quality
Entropy : 6.88
Noise : 97
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears to be a composite of different photos. The edges of the images are not always well-aligned. Some of the images have a slightly artificial look.
A Moment of Mystery in the Dimly Lit Restaurant
A woman in a white blouse and black skirt stands alone in a dimly lit restaurant, her gaze fixed on something unseen. The blurred figures in the background and her enigmatic expression create a sense of intrigue and tension, leaving the viewer wondering what secrets lie within this alluring scene.
Prompt
facial-expressions Embarrassment: Uncomfortable and out of place ; A woman in a casual outfit; eye-level; Normal People; A fancy restaurant with white tablecloths and expensive wine; cinematic
Characteristic
Shot : A woman in a white blouse and black skirt stands in a dimly lit restaurant, looking to her right. The background is blurry and out of focus, with tables and people in the background.
Aesthetic Score : 0.6
Mood : mysterious, intriguing, elegant
Quality
Entropy : 6.77
Noise : 88
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some noise in the background and slight blurriness.
Lost in the Music: A Moment of Pure Intensity
A young man, bathed in vibrant stage lights, stands amidst a pulsating crowd. His headphones amplify the energy of the music, his expression a mix of exhilaration and focus. This image captures the raw intensity and excitement of a live music experience.
Prompt
facial-expressions Embarrassment: Humiliated and defeated ; A gamer in a hoodie; eye-level; Gamer; A crowded esports tournament with loud cheers and flashing lights; cinematic
Characteristic
Shot : A young man wearing headphones stands in front of a crowd at a concert or club. The scene is lit with colorful spotlights, creating a vibrant and energetic atmosphere.
Aesthetic Score : 0.6
Mood : intense, energetic, anticipation
Quality
Entropy : 6.41
Noise : 77
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is slight blurriness in the background, which is likely a result of low light conditions or a slow shutter speed.
An Intimate Candlelit Dinner for Two
Experience the warmth and elegance of a romantic dinner for two, illuminated by the soft glow of candlelight and string lights. The couple’s focused expressions and the intimate setting create a sense of drama and intimacy.
Prompt
facial-expressions Embarrassment: Awkward and uncomfortable ; A man in a tuxedo; eye-level; Single Persons; A romantic dinner for two with candles and flowers; cinematic
Characteristic
Shot : A couple is sitting at a table in a dimly lit restaurant. There are candles on the table and a romantic atmosphere.
Aesthetic Score : 0.7
Mood : romantic, intimate, elegant
Quality
Entropy : 6.89
Noise : 83
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has slight artifacts around the edges of the subjects, likely due to noise reduction.
Batman Faces the Press Amidst Growing Tension
A brooding Batman stands before a throng of reporters, his expression unreadable. The close-up shot emphasizes the intensity of the moment, hinting at a dramatic and suspenseful situation unfolding.
Prompt
facial-expressions Embarrassment: Mortified and ashamed ; A superhero in a mask; eye-level; Heroes; A news conference with reporters asking difficult questions; cinematic
Characteristic
Shot : A man dressed as Batman is being interviewed by a group of journalists. The scene is likely a public event or a press conference. There are other people in the background who are out of focus.
Aesthetic Score : 0.7
Mood : intense, dramatic, serious
Quality
Entropy : 6.66
Noise : 92
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.615, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.16, which is considered okay. This means that the generated image’s aesthetic was somewhat different from the expected aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and shot composition, but struggled to accurately capture the intended camera position and aesthetic.
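The per-image numbers listed above can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI across the ten images and finds the image whose Prompt Clip Score is lowest. The post does not state how the three conclusion scores were derived, so these averages are not claimed to reproduce them; the short image labels and the `summarize` helper are hypothetical.

```python
from statistics import mean

# (label, aesthetic score, prompt clip score, likelihood of AI),
# copied from the per-image blocks in this post.
IMAGES = [
    ("cafe surprise",      0.8, 0.25, 0.90),
    ("masked hero",        0.6, 0.25, 0.30),
    ("formal event grid",  0.2, 0.31, 0.00),
    ("neon gamer",         0.6, 0.27, 0.30),
    ("bride",              0.8, 0.19, 0.10),
    ("superman collage",   0.6, 0.25, 0.70),
    ("dim restaurant",     0.6, 0.24, 0.10),
    ("stage lights",       0.6, 0.26, 0.20),
    ("candlelit dinner",   0.7, 0.21, 0.20),
    ("batman press",       0.7, 0.26, 0.20),
]


def summarize(rows):
    """Return rounded averages and the label with the lowest clip score."""
    return {
        "aesthetic": round(mean(r[1] for r in rows), 3),
        "clip": round(mean(r[2] for r in rows), 3),
        "ai_likelihood": round(mean(r[3] for r in rows), 3),
        "lowest_clip": min(rows, key=lambda r: r[2])[0],
    }


if __name__ == "__main__":
    for name, value in summarize(IMAGES).items():
        print(f"{name}: {value}")
```

Across these ten images the averages come out to 0.62 for Aesthetic Score, 0.249 for Prompt Clip Score, and 0.3 for Likelihood of AI, with the bride image showing the weakest prompt match.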