AI's Artistic Struggle: Capturing Emotion in Images with Stable Diffusion
In the realm of artificial intelligence, generating images from textual descriptions is a rapidly evolving field. While AI models have made significant strides in capturing visual elements like objects and landscapes, accurately portraying facial expressions and conveying the intended emotional tone remains a challenge. This blog post looks at a set of AI-generated images and at how well the model captures dramatic facial expressions. For each image, we compare the prompt with the result and with a handful of automated scores, highlighting the model's strengths and weaknesses in capturing the nuances of human emotion.
Created with: stability-ai-core
Lost in Thought on a City Street
A young woman, shrouded in a gray coat, stands on a cobbled city street, her gaze lost in the distance. The shallow depth of field isolates her, creating a sense of pensive melancholy and atmospheric longing.
Prompt
facial-expressions Daydreaming: Melancholy, lost in thought ; A lone figure; eye-level; Single Person; bustling city street; cinematic
Characteristic
Shot : A young woman in a grey coat standing on a street in a city. The background is blurred and out of focus.
Aesthetic Score : 0.75
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.78
Noise : 74
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight noise and graininess, which is a common artifact for photos taken in low light conditions.
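Every prompt in this post follows the same semicolon-separated pattern, so it can be split into parts for comparison with the generated image. The sketch below is only an illustration: the field names (theme, emotions, subject, camera, category, setting, style) are my own labels for the six positions, not names used by the generator.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Hypothetical labels for the six semicolon-separated positions.
    theme: str
    emotions: list[str]
    subject: str
    camera: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into labeled parts."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, subject, camera, category, setting, style = fields
    # The first field looks like "<theme>: <emotion>, <emotion>".
    theme, _, emotion_text = head.partition(":")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return PromptParts(theme.strip(), emotions, subject, camera,
                       category, setting, style)


parts = parse_prompt(
    "facial-expressions Daydreaming: Melancholy, lost in thought ; "
    "A lone figure; eye-level; Single Person; bustling city street; cinematic"
)
print(parts.emotions, parts.camera)
```

Splitting the prompt this way makes it easier to check each requested element (emotion, camera position, setting) against the Shot and Mood descriptions listed for the image.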
Superman Takes Flight, Contemplating the City Below
A lone figure, clad in the iconic red and blue, stands on a rooftop, bathed in the glow of the city lights. The pose, the dramatic lighting, and the vast cityscape all contribute to a sense of heroism and contemplation, capturing the essence of Superman’s iconic presence.
Prompt
facial-expressions Daydreaming: Confident, determined ; A superhero standing on a rooftop; high angle; Hero; cityscape at night; cinematic
Characteristic
Shot : A man dressed as Superman stands on a rooftop overlooking a city at night.
Aesthetic Score : 0.6
Mood : heroic, dramatic, powerful
Quality
Entropy : 6.62
Noise : 74
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, and the subject’s costume appears to be a bit too smooth and plastic-like. There are also some minor artifacts around the edges of the image.
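The post does not say how the Entropy value under Quality is computed. One common reading, and it is only an assumption here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat, single-tone image) to 8 bits (all 256 levels equally likely). The reported values of roughly 6.2 to 6.9 would fit that range. A minimal sketch under that assumption, using a plain list of pixel intensities:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of intensity values.

    Assumption: this mirrors a histogram-based image entropy; the
    metric actually used for the scores in this post is not documented.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A single-tone image carries no information; a uniform spread over
# all 256 gray levels reaches the 8-bit maximum.
flat = [128] * 1000
uniform = list(range(256))
print(shannon_entropy(flat), shannon_entropy(uniform))
```

Under this reading, a higher value means a wider spread of tones and textures, not necessarily a better image.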
Lost in Thought: A Moment of Quiet Contemplation
A young woman finds solace in a cozy cafe, her thoughtful gaze fixed on the world outside. The soft lighting and her introspective expression evoke a sense of calm and wistful contemplation.
Prompt
facial-expressions Daydreaming: Peaceful, content ; A woman sipping coffee in a cafe; eye-level; Normal People; warm, inviting cafe interior; cinematic
Characteristic
Shot : A woman is sitting at a table in a cafe, looking out the window. She is holding a cup of coffee in her hand. There is a second cup of coffee on the table in front of her. A glass of water is in the background. The cafe is decorated with warm colors and has a warm, inviting atmosphere.
Aesthetic Score : 0.7
Mood : calm, thoughtful, relaxed
Quality
Entropy : 6.71
Noise : 68
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights. The white balance is also a bit off, causing the colors to appear slightly too warm.
The Mystery in His Eyes: A Man Lost in Thought
A man, headphones on, sits before a computer screen, his gaze fixed on something unseen. The lighting casts shadows, adding to the air of suspense and intrigue. What is he contemplating? What secret does he hold?
Prompt
facial-expressions Daydreaming: Engrossed, excited ; A gamer intensely focused on a screen; close-up; Gamer; dimly lit room with gaming peripherals; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, wearing headphones, and looking off to the side. The room is dimly lit and there is a monitor in the background. The man is wearing a grey t-shirt and black headphones. He has a serious expression on his face.
Aesthetic Score : 0.6
Mood : serious, focused, intense
Quality
Entropy : 6.30
Noise : 66
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, slight over-sharpening is visible.
Lost in Thought: A Boy’s Yearning Gaze
A young boy, lost in contemplation, gazes out a weathered window. The blurred view beyond hints at a longing for something distant, creating a mood of quiet wistfulness and pensive reflection.
Prompt
facial-expressions Daydreaming: Curious, imaginative ; A child staring out a window; eye-level; Single Person; lush green garden; cinematic
Characteristic
Shot : A young boy is looking out of a window. He appears thoughtful. There is a small potted plant on the windowsill. The window is slightly open, and the boy is leaning against the window frame, looking out into a garden.
Aesthetic Score : 0.7
Mood : pensive, contemplative, quiet
Quality
Entropy : 6.64
Noise : 65
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor imperfections in the image, such as a few dust specks and a slight chromatic aberration on the window frame. There is also a slight graininess in the image.
A Knight’s Journey Through Sun-Dappled Woods
A lone knight in shining armor rides a white steed through a sun-dappled forest path, evoking a sense of mystery and adventure. The dramatic lighting highlights the knight’s silhouette against the foliage, creating a majestic and timeless scene.
Prompt
facial-expressions Daydreaming: Brave, adventurous ; A knight in shining armor riding through a forest; wide shot; Hero; mystical forest with dappled sunlight; cinematic
Characteristic
Shot : A knight in shining armor rides a white horse through a forest path.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, epic
Quality
Entropy : 6.88
Noise : 85
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image seems to have some minor artifacts and blurriness around the edges of the knight’s armor.
Friends, Food, and Laughter: A Perfect Picnic Day
This heartwarming image captures the essence of friendship and joy. Four friends gather in a park, sharing a picnic, laughter, and precious moments together. The scene radiates happiness and a sense of connection, making it a truly beautiful and inspiring sight.
Prompt
facial-expressions Daydreaming: Joyful, carefree ; A group of friends laughing together at a picnic; eye-level; Normal People; sunny park with picnic blanket; cinematic
Characteristic
Shot : A group of friends having a picnic in a park on a sunny day.
Aesthetic Score : 0.7
Mood : happy, joyful, carefree
Quality
Entropy : 6.86
Noise : 85
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors. There is a slight color cast in the image.
The Blue Light of Focus
A young man, illuminated by the blue glow of his computer screen, sits intently at his keyboard. His focused expression and the dramatic lighting create a sense of determination and intensity.
Prompt
facial-expressions Daydreaming: Thrilled, competitive ; A gamer’s hands rapidly moving across a keyboard; close-up; Gamer; brightly lit gaming setup with glowing screen; cinematic
Characteristic
Shot : A young man wearing glasses is sitting at a computer desk, looking intently at a monitor. His hand is on the keyboard.
Aesthetic Score : 0.7
Mood : focused, serious, concentrated
Quality
Entropy : 6.24
Noise : 59
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors, but there’s slight blurriness in the background monitors, likely due to the focus being on the man.
Lost in Thought on a Lonely Shore
A young woman stands on a deserted beach, her gaze fixed on the distant horizon. The soft lighting and muted colors evoke a sense of melancholy and longing, as she contemplates the vastness of the ocean and the solitude of her surroundings.
Prompt
facial-expressions Daydreaming: Reflective, introspective ; A woman walking alone on a beach; eye-level; Single Person; vast, empty beach with crashing waves; cinematic
Characteristic
Shot : A young woman stands on a beach, looking out at the ocean. The sky is cloudy and the waves are rolling in.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.69
Noise : 66
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Soaring High: A Superhero’s Triumphant Flight
This image captures the essence of heroism, with a superhero soaring through the sky, their dynamic pose conveying power and hope. The cityscape in the background hints at the scale of their mission, while the overall mood is one of triumph and inspiration.
Prompt
facial-expressions Daydreaming: Empowered, triumphant ; A superhero soaring through the sky; high angle; Hero; dramatic cloudscape with city skyline in the distance; cinematic
Characteristic
Shot : A superhero flying over a cityscape, with the iconic Superman costume.
Aesthetic Score : 0.7
Mood : heroic, powerful, action
Quality
Entropy : 6.79
Noise : 75
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lighting and shadowing on the superhero’s costume seem slightly unnatural, and the image has a slight plastic look. The background city looks generated rather than photographed.
Conclusion
The analysis suggests that the generative AI model captured the intended aesthetic better than it captured the scene and camera position described in the prompts. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.485, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.11, which is considered very good. This means that the generated image closely matched the expected aesthetic style, despite the issues with camera position and scene understanding.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate complex visual descriptions into images.
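For reference, the per-image values listed in this post can be averaged with a few lines of Python. This is a sketch for readers who want to check the numbers: the tuple layout and the 0.5 cut-off for "flagged as likely AI" are my own choices, and these averages are not the three scores in the breakdown above, whose calculation is not described.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image entries in this post, in order.
IMAGES = [
    ("Lost in Thought on a City Street", 0.75, 0.23, 0.20),
    ("Superman Takes Flight, Contemplating the City Below", 0.60, 0.22, 0.80),
    ("Lost in Thought: A Moment of Quiet Contemplation", 0.70, 0.28, 0.10),
    ("The Mystery in His Eyes: A Man Lost in Thought", 0.60, 0.24, 0.20),
    ("Lost in Thought: A Boy's Yearning Gaze", 0.70, 0.28, 0.20),
    ("A Knight's Journey Through Sun-Dappled Woods", 0.70, 0.28, 0.20),
    ("Friends, Food, and Laughter: A Perfect Picnic Day", 0.70, 0.28, 0.20),
    ("The Blue Light of Focus", 0.70, 0.25, 0.20),
    ("Lost in Thought on a Lonely Shore", 0.70, 0.25, 0.10),
    ("Soaring High: A Superhero's Triumphant Flight", 0.70, 0.24, 0.80),
]


def summarize(rows, ai_threshold=0.5):
    """Average the reported scores and list images at or above the threshold.

    ai_threshold is an illustrative cut-off, not one used by the post.
    """
    return {
        "mean_aesthetic": mean(r[1] for r in rows),
        "mean_clip": mean(r[2] for r in rows),
        "mean_ai_likelihood": mean(r[3] for r in rows),
        "flagged": [r[0] for r in rows if r[3] >= ai_threshold],
    }


summary = summarize(IMAGES)
for key, value in summary.items():
    print(key, value)
```

With the values above, the averages come out at about 0.685 for the aesthetic score, 0.255 for the prompt CLIP score, and 0.30 for the likelihood of AI. The only two images at or above the 0.5 cut-off are the two superhero scenes, both reported at 0.80.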