AI's Facial Expressions: A Mixed Bag of Success with Titan-g1
Facial expressions are a powerful tool for conveying emotions and intentions in visual storytelling. Generative AI models are increasingly used to create images with realistic facial expressions, but how well do they perform? This blog post looks at AI-generated facial expressions, examining the model’s strengths and weaknesses in capturing the subtleties of human emotion. We’ll explore examples of successful and less successful attempts and analyze the factors that contribute to the model’s performance. By understanding the capabilities and limitations of AI in this domain, we can better appreciate its potential and guide its future development.
Created with: titan-g1
Lost in the Pieces: A Man’s Struggle with a Puzzle and His Thoughts
A solitary figure sits at a table, his brow furrowed in contemplation. Scattered puzzle pieces surround him, mirroring the fragmented state of his mind. The unfinished meal suggests a pause in his life, a moment of introspection where he grapples with the complexities of his thoughts.
Prompt
facial-expressions Boredom: Apathy and resignation. ; A single person; eye-level; Single Persons; A cluttered apartment with unwashed dishes and a half-finished puzzle on the table.; cinematic
Characteristic
Shot : A man is sitting at a table looking down in thought. There are puzzle pieces scattered around him.
Aesthetic Score : 0.4
Mood : pensive, frustrated, thoughtful
Quality
Entropy : 6.79
Noise : 102
Prompt Clip Score : 0.16
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight blur in the image, which is most evident in the background.
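The post does not say how the Entropy value in each Quality block is computed. One common definition, assumed here, is the Shannon entropy of the image’s 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally frequent). The reported values of roughly 6.5 to 7 would fit that range. The sketch below is illustrative only: the function name `shannon_entropy` and the flat list of pixel values are assumptions, not part of the original evaluation pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this mirrors the 'Entropy' metric in the post, which
    does not document its own formula.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must not be empty")
    # Sum p * log2(1 / p) over every gray level that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A flat image carries no information; a uniform histogram carries the maximum.
print(shannon_entropy([128] * 1000))   # 0.0
print(shannon_entropy(range(256)))     # 8.0
```

Under this assumption, a higher value means a busier tonal distribution, not a better or more accurate image, so it says little on its own about how well the expression matches the prompt.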
Youth in the Ruins: A Portrait of Loss
A young man, his face etched with emotion, stands before a bombed-out building, the stark contrast highlighting the tragedy of war. The out-of-focus background emphasizes his isolation and the weight of his experience.
Prompt
facial-expressions Boredom: Disillusionment and weariness. ; A superhero; eye-level; Heroes; A deserted cityscape with crumbling buildings and graffiti.; cinematic
Characteristic
Shot : A young man in a dark jacket is standing in front of a building that has been heavily damaged. The man is looking off to the side and appears to be lost in thought. The scene is one of destruction and despair, but there is also a sense of hope in the man’s determined gaze.
Aesthetic Score : 0.6
Mood : despair, hope, contemplative
Quality
Entropy : 6.96
Noise : 98
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.10
Image errors : No notable artifacts or errors in the image.
Intrigued by the Screen: A Moment of Curiosity on the Bus
A young woman sits on a bus, her gaze fixed on her phone. Her expression is one of intrigue, suggesting she’s reading something captivating or unexpected. The image captures a fleeting moment of curiosity and focus, leaving the viewer wondering what has caught her attention.
Prompt
facial-expressions Boredom: Annoyance and detachment. ; A young woman; eye-level; Normal People; A crowded bus with people staring at their phones.; cinematic
Characteristic
Shot : A young woman is looking at her phone while sitting on a bus.
Aesthetic Score : 0.6
Mood : focused, serious, contemplative
Quality
Entropy : 6.63
Noise : 102
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the background.
Caught in the Heat of the Moment: Gamer’s Intense Focus Under Blue Light
A close-up shot captures the raw emotion of a gamer in the midst of an intense gaming session. Their surprised expression and clenched hands speak volumes about the thrill of the game, while the blue light emanating from the monitor adds to the feeling of energy and concentration.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A gamer; close-up; Gamer; A dimly lit room with a computer screen displaying a paused game.; cinematic
Characteristic
Shot : A young person, likely a teenager, is playing a video game. They are wearing headphones and their mouth is open in surprise or excitement. The image is likely taken in their bedroom, as there is a computer monitor and a tower PC visible in the background. The lighting is soft and warm, creating a cozy and inviting atmosphere.
Aesthetic Score : 0.7
Mood : excited, focused, engaged
Quality
Entropy : 6.68
Noise : 102
Prompt Clip Score : 0.16
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, and there is some noise in the background.
A Life Lived, A Playground Remembered
An elderly man contemplates life on a park bench, his gaze drawn to the vibrant colors of a nearby playground. The scene evokes a sense of melancholy, nostalgia, and the passage of time, as the man reflects on his own journey and the youthful energy that once filled his days.
Prompt
facial-expressions Boredom: Melancholy and loneliness. ; An elderly man; eye-level; Single Persons; A park bench with fallen leaves and a deserted playground.; cinematic
Characteristic
Shot : An elderly man is sitting on the ground in front of a playground, looking down, with a thoughtful expression on his face.
Aesthetic Score : 0.5
Mood : melancholy, contemplative, somber
Quality
Entropy : 6.92
Noise : 101
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and some minor noise is visible.
A High-Stakes Game in the Shadows
A man in a suit sits alone at a poker table, bathed in the dim light of a room marked ‘CASE’. The atmosphere is thick with suspense, hinting at a high-stakes game with secrets waiting to be revealed.
Prompt
facial-expressions Boredom: Frustration and boredom. ; A lone gambler, hunched over a worn table in a smoky casino, surrounded by stacks of chips and a flickering neon sign advertising the night’s big game.; cinematic
Characteristic
Shot : A man in a suit sits at a poker table, with poker chips in front of him, in a dimly lit room with the word ‘CASE’ spelled out in lights behind him. There is smoke in the background.
Aesthetic Score : 0.6
Mood : dramatic, intense, suspenseful
Quality
Entropy : 6.71
Noise : 101
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.20
Image errors : The lighting is a little flat and there is some noise in the image.
Silent Tension: A Couple’s Dinner Takes a Dark Turn
A dimly lit restaurant scene captures a couple’s strained interaction. The woman’s upset expression and the man’s downcast gaze create a palpable sense of tension and uncertainty, leaving the viewer wondering what secrets lie beneath the surface.
Prompt
facial-expressions Boredom: Awkward silence and boredom. ; A young couple; eye-level; Normal People; A restaurant table with empty plates and a half-finished bottle of wine.; cinematic
Characteristic
Shot : A couple is sitting at a restaurant table, the woman is looking to the left and the man is looking down. There is a bottle of wine between them.
Aesthetic Score : 0.5
Mood : tense, awkward, uncomfortable
Quality
Entropy : 6.63
Noise : 100
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no obvious artifacts or errors in the image.
The Moment He Knew He Won
A young gamer, headphones on and eyes wide with surprise, stares intently at his computer screen. The vibrant lights of his gaming PC illuminate the room, reflecting the intensity of the moment. Is this the victory he’s been waiting for?
Prompt
facial-expressions Boredom: Monotony and boredom. ; A gamer; close-up; Gamer; A brightly lit room with a computer screen displaying a repetitive, simple game.; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, wearing headphones and looking surprised. He is likely playing a video game. There is a gaming PC in the background with RGB lighting.
Aesthetic Score : 0.7
Mood : intense, focused, surprised
Quality
Entropy : 6.88
Noise : 102
Prompt Clip Score : 0.16
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and artifacting, particularly in the shadows. The RGB lighting is slightly overexposed.
Lost in Thought: A Moment of Contemplation on the Train
A young man sits by the window, bathed in soft light, his thoughtful expression hinting at a pensive mood. The scene evokes a sense of melancholy and introspection, capturing a fleeting moment of reflection during a train journey.
Prompt
facial-expressions Boredom: Isolation and boredom. ; A woman; eye-level; Single Persons; A crowded train with people reading, sleeping, and staring blankly.; cinematic
Characteristic
Shot : A man is sitting by the window of a train, looking out.
Aesthetic Score : 0.6
Mood : pensive, contemplative, serene
Quality
Entropy : 6.86
Noise : 98
Prompt Clip Score : 0.13
AI Evaluation
Likelihood of AI : 0.30
Image errors : No visible errors.
A Tense Standoff in the Desert
A young woman in military uniform stands vigilant in a desolate desert landscape, her gaze fixed on a looming watchtower. The scene is heavy with tension and suspense, hinting at an impending threat.
Prompt
facial-expressions Boredom: Despair and boredom. ; A soldier; eye-level; Heroes; A desolate desert landscape with a lone watchtower in the distance.; cinematic
Characteristic
Shot : A young woman in military fatigues stands in a desert landscape, gazing at a small, distant tower.
Aesthetic Score : 0.6
Mood : tense, solitary, mysterious
Quality
Entropy : 6.54
Noise : 94
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some minor noise and grain, which may be due to the digital processing or the shooting conditions. Some slight blurring is present, particularly in the distance.
Conclusion
The analysis shows that the generative AI model performed well in understanding the scene and matching the expected aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.15, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.57, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.09, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrated a good understanding of the scene and shot composition, but struggled with accurately capturing the intended camera position. The aesthetic quality of the generated image was very close to the expected style.
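The per-image numbers listed above can also be summarized directly. The following sketch copies the ten sets of values from this post and averages them. The tuple layout, the field names, and the helper functions `summarize` and `extremes` are illustrative choices, and the post does not state how the three scores in the breakdown were derived, so these averages are a separate view and not a reconstruction of them.

```python
from statistics import mean

# Values copied from the ten entries in this post, in order:
# (title, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
IMAGES = [
    ("Lost in the Pieces", 0.4, 6.79, 102, 0.16, 0.20),
    ("Youth in the Ruins", 0.6, 6.96, 98, 0.18, 0.10),
    ("Intrigued by the Screen", 0.6, 6.63, 102, 0.21, 0.20),
    ("Caught in the Heat of the Moment", 0.7, 6.68, 102, 0.16, 0.20),
    ("A Life Lived, A Playground Remembered", 0.5, 6.92, 101, 0.24, 0.10),
    ("A High-Stakes Game in the Shadows", 0.6, 6.71, 101, 0.19, 0.20),
    ("Silent Tension", 0.5, 6.63, 100, 0.22, 0.20),
    ("The Moment He Knew He Won", 0.7, 6.88, 102, 0.16, 0.10),
    ("Lost in Thought", 0.6, 6.86, 98, 0.13, 0.30),
    ("A Tense Standoff in the Desert", 0.6, 6.54, 94, 0.27, 0.20),
]

# Hypothetical field names for the five numeric columns.
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(images):
    """Return the mean of each metric, rounded to three decimals."""
    columns = list(zip(*images))[1:]  # drop the title column
    return {name: round(mean(col), 3) for name, col in zip(FIELDS, columns)}


def extremes(images, index):
    """Return the titles with the lowest and highest value in one column."""
    ranked = sorted(images, key=lambda row: row[index])
    return ranked[0][0], ranked[-1][0]


summary = summarize(IMAGES)
lowest_clip, highest_clip = extremes(IMAGES, 4)
print(summary)
print("lowest prompt CLIP score:", lowest_clip)
print("highest prompt CLIP score:", highest_clip)
```

On these figures the mean aesthetic score is 0.58 and the mean Prompt Clip Score is 0.192. The lowest Prompt Clip Score (0.13) belongs to the train image, where the prompt asked for a woman and the shot description reports a man, while the highest (0.27) belongs to the desert scene.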