AI's Facial Expressions: A Mixed Bag with Stable Diffusion
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. In the realm of generative AI, capturing these expressions accurately is crucial for creating compelling and realistic images. This blog post delves into the performance of a generative AI model in generating images with specific facial expressions, analyzing its strengths and weaknesses across various aspects, including camera position, shot analysis, and aesthetic appeal. We’ll explore how the model excels in certain areas while struggling in others, providing insights into the current state of AI’s ability to capture the nuances of human emotion.
Created with: stability-ai-core
Autumnal Contemplation
A man, shrouded in a dark jacket, sits on a park bench, his gaze fixed on the cityscape beyond. The vibrant fall foliage provides a backdrop of muted colors, mirroring the pensive mood of the scene. The man’s quiet contemplation evokes a sense of melancholy and introspection, capturing the essence of the season.
Prompt
facial-expressions Thoughtfulness: Melancholy, contemplative ; A lone figure sitting on a park bench; eye-level; Single Person; a bustling city park in the background; cinematic
Characteristic
Shot : A man in a dark coat sits on a park bench, looking off to the side. There are trees with yellow leaves in the background and a building with a dome in the far distance.
Aesthetic Score : 0.6
Mood : pensive, contemplative, autumnal
Quality
Entropy : 6.75
Noise : 66
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image quality is slightly soft, particularly in the background.
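The post does not say how the Entropy value is computed. One common definition, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally frequent); the values reported in this post would fit that range. A minimal sketch under that assumption, with the function name `shannon_entropy` chosen for illustration:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this is a generic histogram entropy, not necessarily the
    formula used to produce the numbers in this post.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum p * log2(1/p) over every gray level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs rather than real images: a flat image and a uniform ramp.
flat = [128] * 1024            # one gray level everywhere
ramp = list(range(256)) * 4    # every gray level equally often

print(shannon_entropy(flat))   # lower bound of the scale
print(shannon_entropy(ramp))   # upper bound for 8-bit data
```

With a real image, the pixel values would come from a grayscale conversion in an imaging library; that step is left out to keep the sketch dependency-free.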
Superman Stands Tall Against the Setting Sun
A modern, darker-toned Superman stands on a rooftop, silhouetted against the fiery sunset. The city lights twinkle below, reflecting the hope and heroism he embodies.
Prompt
facial-expressions Thoughtfulness: Reflective, introspective ; A superhero standing on a rooftop, looking out at the city; eye-level; Hero; a sprawling cityscape with twinkling lights; cinematic
Characteristic
Shot : Superman stands on a rooftop overlooking a city at sunset, with the cityscape blurred in the background and a dramatic sky above.
Aesthetic Score : 0.7
Mood : heroic, dramatic, hopeful
Quality
Entropy : 6.88
Noise : 78
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.40
Image errors : The cityscape is slightly blurry and the lighting is a bit uneven. Some artifacts appear on the suit.
Finding Peace in the Mountain View
A woman finds solace in the tranquil beauty of a mountain landscape as she reads on a train journey. Her contemplative expression and the scenic view evoke a sense of calm and quiet reflection.
Prompt
facial-expressions Thoughtfulness: Peaceful, absorbed ; A woman reading a book on a train; eye-level; Normal Person; a blurry view of passing scenery outside the window; cinematic
Characteristic
Shot : A woman is sitting by the window of a train, reading a book. The view outside the window is of a green valley and mountains.
Aesthetic Score : 0.7
Mood : calm, contemplative, thoughtful
Quality
Entropy : 6.65
Noise : 63
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No major errors, slight noise in some areas, especially in the background.
Lost in the Code: A Hacker’s Focus Under Dim Lights
A young man, shrouded in shadows, sits hunched over his computer, his intense focus illuminated by the screen’s glow. The dimly lit room adds an air of mystery, hinting at the secrets he’s uncovering or the code he’s crafting.
Prompt
facial-expressions Thoughtfulness: Intense, focused ; A gamer sitting in a dimly lit room, staring intently at a computer screen; eye-level; Gamer; a cluttered desk with gaming peripherals; cinematic
Characteristic
Shot : A young man wearing headphones sits at a computer in a dimly lit room, focused on the screen. The room is filled with tech equipment and the atmosphere is one of concentration.
Aesthetic Score : 0.6
Mood : serious, focused, techy
Quality
Entropy : 5.97
Noise : 63
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some graininess in the image, particularly in the shadows. The lighting on the subject’s face could be more evenly distributed.
Lost in the Vastness: A Man’s Solitary Walk on the Beach
A melancholic scene unfolds as a man in a brown coat walks along a sandy beach, the vast ocean stretching out behind him. His posture and the immensity of the water evoke a sense of isolation and introspection, capturing a moment of quiet contemplation and loneliness.
Prompt
facial-expressions Thoughtfulness: Solitary, introspective ; A man walking alone on a deserted beach; eye-level; Single Person; the vast ocean stretching out before him; cinematic
Characteristic
Shot : A man is walking on a sandy beach towards the ocean, in the background a large cliff with green vegetation can be seen. The sky is cloudy, but the light is soft.
Aesthetic Score : 0.7
Mood : pensive, contemplative, serene
Quality
Entropy : 6.64
Noise : 63
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No apparent errors.
Firefighter Stands Tall Amidst Blazing Inferno
A dramatic image captures a firefighter in full gear, facing a burning building with smoke billowing in the air. The scene evokes a sense of danger and heroism, highlighting the bravery of those who risk their lives to protect others.
Prompt
facial-expressions Thoughtfulness: Somber, reflective ; A firefighter standing amidst the ruins of a fire; eye-level; Hero; smoke and debris filling the air; cinematic
Characteristic
Shot : A firefighter in full gear stands amidst the ruins of a building fire. Flames and smoke billow in the background.
Aesthetic Score : 0.7
Mood : serious, dramatic, somber
Quality
Entropy : 6.74
Noise : 78
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some slight blurring in the background and on the firefighter’s face.
Intimate Gathering: Friends Share a Meal and Laughter
A group of four friends enjoy a casual and warm dinner together in a dimly lit dining room. The warm lighting and their smiles create a sense of intimacy and togetherness, capturing the essence of friendship and shared moments.
Prompt
facial-expressions Thoughtfulness: Intimate, connected ; A family gathered around a dinner table; eye-level; Normal People; a warm, inviting kitchen setting; cinematic
Characteristic
Shot : A group of friends are gathered around a dinner table, enjoying a meal and conversation. The warm lighting creates a cozy and inviting atmosphere. The food is laid out in front of them, and there are glasses of wine on the table.
Aesthetic Score : 0.7
Mood : warm, inviting, cheerful
Quality
Entropy : 6.50
Noise : 74
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable image errors.
The Joy of Gaming: Capturing the Excitement of a Gamer
This image captures the pure joy of gaming, with a young man fully immersed in his game. His excited expression and the dynamic lighting create a sense of energy and excitement, showcasing the thrill of the gaming experience.
Prompt
facial-expressions Thoughtfulness: Excited, immersed ; A gamer holding a controller, eyes glued to the screen; close-up; Gamer; a vibrant, colorful gaming world displayed on the monitor; cinematic
Characteristic
Shot : A young man is playing video games with a controller, while wearing headphones. He is in a dimly lit room, and his face is lit up by the screen.
Aesthetic Score : 0.6
Mood : excited, focused, energetic
Quality
Entropy : 6.68
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and a slight blur. The background is a bit too dark, which makes the image look less polished.
Finding Peace in the Park
A woman finds solace and inspiration amidst the tranquil beauty of a park, capturing her thoughts in a notebook. The scene evokes a sense of calm and contemplation, inviting viewers to embrace the peaceful moment.
Prompt
facial-expressions Thoughtfulness: Peaceful, creative ; A woman sitting on a park bench, sketching in a notebook; eye-level; Single Person; a serene park setting with blooming flowers; cinematic
Characteristic
Shot : A woman is sitting on a park bench under a tree, writing in a notebook.
Aesthetic Score : 0.7
Mood : calm, contemplative, focused
Quality
Entropy : 6.91
Noise : 74
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors.
Superman Prepares for the Inevitable
A determined Superman gazes upwards, his expression resolute against a backdrop of dramatic, swirling clouds. The scene evokes a sense of impending action and heroic resolve.
Prompt
facial-expressions Thoughtfulness: Determined, resolute ; A superhero looking up at the sky, a determined expression on their face; eye-level; Hero; a dramatic sky with dark clouds gathering; cinematic
Characteristic
Shot : A close-up portrait of a man in a Superman costume, with a stormy sky in the background.
Aesthetic Score : 0.7
Mood : serious, dramatic, heroic
Quality
Entropy : 6.82
Noise : 68
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some slight pixelation and slight noise in the image.
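Every prompt in this post follows the same semicolon-separated layout: expression, scene, camera position, subject type, background, and style. The sketch below splits a prompt into those parts so that a single field, such as the camera position, can be compared with the shot description. The field names and the `parse_prompt` helper are illustrative and are not part of any tool named in the post.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    # Field names are hypothetical labels for the six prompt segments.
    expression: str
    scene: str
    camera: str
    subject: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a semicolon-separated prompt into its six segments."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    head, scene, camera, subject, background, style = parts
    # Keep only the text after the first colon of the leading segment,
    # e.g. "facial-expressions Thoughtfulness: X, Y" -> "X, Y".
    expression = head.split(":", 1)[1].strip() if ":" in head else head
    return PromptFields(expression, scene, camera, subject, background, style)


# Prompt copied from the last image entry above.
prompt = (
    "facial-expressions Thoughtfulness: Determined, resolute ; "
    "A superhero looking up at the sky, a determined expression on their face; "
    "eye-level; Hero; a dramatic sky with dark clouds gathering; cinematic"
)
fields = parse_prompt(prompt)
print(fields.camera)
print(fields.expression)
```

For this last image the parsed camera field is eye-level, while the shot description calls the result a close-up portrait.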
Conclusion
The results show that the generative AI model performed well in shot analysis and aesthetic analysis, but struggled with camera position.
Here’s a breakdown:
- Camera Position: The model scored 0.1, which is considered very bad. This means there’s a significant difference between the camera position specified in the prompt and the camera position in the generated image.
- Shot Analysis: The model scored 0.48, which is considered good. This indicates that the model was able to understand the scene in the prompt and create a shot that aligns with it, but there’s room for improvement.
- Aesthetic Analysis: The model scored 0.09, which is considered very good. This means the generated image’s aesthetic closely matches the expected aesthetic.
Overall, the model seems to be better at understanding the scene and creating a shot that aligns with the prompt, but it struggles with accurately capturing the intended camera position. The model’s ability to create aesthetically pleasing images is a positive aspect.
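The post does not show how the three scores in the breakdown were derived. As a complementary view, the per-image values listed for the ten images can be averaged directly. The sketch below does only that; the numbers are copied from the entries above, and the column labels are shorthand chosen here.

```python
from statistics import mean

# Column labels are shorthand for the metrics listed under each image.
names = ("aesthetic", "entropy", "noise", "prompt_clip", "ai_likelihood")

# One row per image, in the order the images appear in this post.
rows = [
    (0.6, 6.75, 66, 0.29, 0.10),  # Autumnal Contemplation
    (0.7, 6.88, 78, 0.27, 0.40),  # Superman Stands Tall
    (0.7, 6.65, 63, 0.31, 0.10),  # Mountain View
    (0.6, 5.97, 63, 0.27, 0.10),  # Lost in the Code
    (0.7, 6.64, 63, 0.26, 0.20),  # Solitary Walk on the Beach
    (0.7, 6.74, 78, 0.31, 0.10),  # Firefighter
    (0.7, 6.50, 74, 0.31, 0.20),  # Intimate Gathering
    (0.6, 6.68, 64, 0.27, 0.10),  # The Joy of Gaming
    (0.7, 6.91, 74, 0.33, 0.10),  # Finding Peace in the Park
    (0.7, 6.82, 68, 0.24, 0.30),  # Superman Prepares
]

# Transpose rows into columns and average each column.
summary = {name: mean(col) for name, col in zip(names, zip(*rows))}

for name, value in summary.items():
    print(f"{name:>14}: {value:.3f}")
```

These averages describe only the ten images shown here and are not a substitute for the camera position, shot, and aesthetic scores quoted in the breakdown.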