AI's Facial Expressions: A Step Towards Realism, But Still a Work in Progress with Titan-g1
The ability to generate realistic facial expressions is a crucial aspect of creating believable and engaging AI-generated content. This blog post delves into the results of an experiment where an AI model was tasked with generating images based on detailed scene descriptions, focusing on the model’s performance in capturing facial expressions. The results reveal a mixed bag of success and struggle, highlighting the ongoing challenges in achieving realistic facial expressions in AI-generated imagery. We’ll explore the model’s strengths and weaknesses, analyzing its understanding of camera position, shot composition, and aesthetic style, and discuss the implications for the future of AI-generated content.
Created with: titan-g1
Lost in the Neon Shadows
A solitary figure shrouded in darkness, a woman stands in a narrow alleyway bathed in the glow of neon signs. The play of light and shadow creates an atmosphere of mystery and intrigue, hinting at a story waiting to unfold.
Prompt
facial-expressions Surprise: Eerie, suspenseful ; A lone figure walking down a deserted street; eye-level; Single Person; neon signs reflecting in puddles; cinematic
Characteristic
Shot : A young woman in a dark jacket stands in an alleyway with neon signs, looking at the camera. The setting appears to be urban, possibly in Japan.
Aesthetic Score : 0.6
Mood : mysterious, urban, contemplative
Quality
Entropy : 6.81
Noise : 109
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no significant errors in the image. The lighting creates a slight glare on the subject’s face, but this is a minor issue.
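The post does not say how the Entropy value under Quality is computed. One plausible reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (every gray level equally likely). Under that assumption, values around 6.5 to 6.9 would indicate a fairly wide tonal spread. The function name and sample data below are illustrative, not taken from the evaluation pipeline.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every gray level that actually occurs.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical pixel data: a flat gray image and an evenly spread histogram.
flat_image = [128] * 1024
even_spread = list(range(256)) * 4

print(f"flat image:  {shannon_entropy(flat_image):.2f} bits")
print(f"even spread: {shannon_entropy(even_spread):.2f} bits")
```

In practice the pixel list would come from a decoded, grayscale-converted image rather than from synthetic data.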
Solitude Under a Starry Sky
A lone figure stands on a snow-covered mountain, gazing out at a city’s twinkling lights. The Milky Way stretches across the night sky, creating a serene and majestic scene. The contrast between the figure’s silhouette and the city lights emphasizes the vastness of the landscape.
Prompt
facial-expressions Surprise: Triumphant, awe-inspiring ; A lone figure stands atop a towering mountain, gazing out at the vast, star-studded sky. The wind whips through their hair, carrying the scent of pine and snow. In the distance, the twinkling lights of a distant city shimmer like scattered jewels.; cinematic
Characteristic
Shot : A lone figure stands at the edge of a snow-covered mountain, looking out at a valley lit up by city lights in the distance. A starry sky with the Milky Way visible above adds a sense of grandeur to the scene.
Aesthetic Score : 0.8
Mood : serene, contemplative, awe
Quality
Entropy : 6.79
Noise : 124
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight artifacts in the sky and on the snow, but they are not overly distracting. There is a bit of noise, likely from ISO.
Laughter and Light: Friends Sharing a Joyful Moment
Three young adults bask in the warm glow of a sunny day, their laughter echoing through the air as they share a relaxed and casual moment together. The natural light and their joyful expressions create a truly inviting and positive atmosphere.
Prompt
facial-expressions Surprise: unsettling ; A group of friends, their laughter echoing through the bustling marketplace, enjoys a break at a charming cafe, oblivious to the approaching downpour.; cinematic
Characteristic
Shot : Three young people are sitting together at a table, laughing and enjoying each other’s company. There is a coffee cup and a bottle of liquid on the table, suggesting they are enjoying a drink.
Aesthetic Score : 0.7
Mood : happy, joyful, friendly
Quality
Entropy : 6.92
Noise : 99
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Woman on the Edge of Discovery: A Glimpse into Intense Focus and Excitement
A young woman, engrossed in her computer screen, wears an expression of surprise and excitement. The scene, bathed in the blue glow of the monitor and a soft light from a nearby window, exudes an intense and focused mood. The dramatic effect is heightened by her expression and the suspenseful ambiance, hinting at an imminent revelation.
Prompt
facial-expressions Surprise: Intense, focused ; A gamer sitting in a dimly lit room, eyes glued to the screen; close-up; Gamer; glowing monitor, keyboard, and mouse; cinematic
Characteristic
Shot : A young woman is wearing headphones and is focused on a computer screen. She appears to be playing a video game. The setting appears to be her home office or bedroom, with a blue curtain in the background.
Aesthetic Score : 0.6
Mood : intense, focused, playful
Quality
Entropy : 6.54
Noise : 103
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry in the background. The lighting is not evenly distributed, but this could be artistic.
A Moment of Intrigue on the Subway Platform
A woman in a casual outfit, caught in a moment of surprise, gazes intently at something off-camera. Her curious expression and the blurred figure in the background create a sense of intrigue, leaving us wondering what has captured her attention.
Prompt
facial-expressions Surprise: Panic, frantic ; A woman standing in a crowded train station, suddenly realizing she’s lost her purse; eye-level; Single Person; bustling crowd, hurried footsteps; cinematic
Characteristic
Shot : A young woman in a beige cardigan is looking up and to the left, slightly surprised. She is standing in a subway station and another person is walking behind her.
Aesthetic Score : 0.6
Mood : surprised, curious, casual
Quality
Entropy : 6.94
Noise : 98
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
A Glowing Orb in the Ruins: A Quest for the Unknown
A man, clad in a dark jacket and brown shirt, stands amidst mossy stone ruins, holding a luminous orb. His gaze is fixed upwards, hinting at a hidden mystery. The glowing orb casts an ethereal light, creating an atmosphere of wonder and adventure. This image evokes a sense of hope and anticipation, suggesting a journey of discovery and exploration.
Prompt
facial-expressions Surprise: heroic ; An archaeologist emerges from a forgotten, moss-covered ruin, holding a glowing artifact aloft.; cinematic
Characteristic
Shot : A man in a forest, holding a glowing orb in his hand, surrounded by moss-covered rocks and stone structures
Aesthetic Score : 0.6
Mood : mysterious, adventurous, magical
Quality
Entropy : 6.93
Noise : 108
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image seems slightly overexposed, especially around the orb. The edges are also slightly blurry.
Laughter, Sunshine, and a Drone: Friends Capture the Perfect Summer Day
Three friends bask in the joy of a sunny afternoon, sharing laughter and a picnic in the park. A playful drone adds a touch of whimsy to the scene, capturing the carefree spirit of their friendship.
Prompt
facial-expressions Surprise: Peaceful, ominous ; A group of friends enjoying a picnic in a park, unaware of the strange object falling from the sky; eye-level; Normal People; sunny day, green grass, blue sky; cinematic
Characteristic
Shot : Three young women are sitting on a picnic blanket in a park. They are laughing and looking up at something in the air. The scene is very bright and sunny.
Aesthetic Score : 0.7
Mood : joyful, carefree, happy
Quality
Entropy : 6.91
Noise : 96
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable errors in the image.
Headphones On, Excitement On: Gamer Reacts to Victory
A young woman, headphones firmly in place, sits at her computer desk, her face alight with excitement. She’s clearly engrossed in a game, reacting with a mix of surprise and joy to something on the screen. The image captures the energy and thrill of gaming, though the composition could benefit from a bit more clarity.
Prompt
facial-expressions Surprise: Disbelief, frustration ; A gamer’s hands frantically moving across the keyboard, as a sudden glitch appears on the screen; close-up; Gamer; distorted screen, flashing lights; cinematic
Characteristic
Shot : A woman is sitting at a computer, wearing headphones, and reacting excitedly.
Aesthetic Score : 0.6
Mood : excited, playful, joyful
Quality
Entropy : 6.73
Noise : 104
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The background is somewhat blurry and indistinct. The lighting appears uneven, with some areas being overexposed and others being too dark. There are some noticeable artifacts around the woman’s hair, particularly in the area where it is being pulled back by her headphones.
Lost in Thought: A Man’s Pensive Gaze in a Mysterious Forest
A solitary figure stands amidst a blurred forest, his gaze directed upwards and to the left, hinting at a moment of deep contemplation. The out-of-focus background adds to the sense of mystery and isolation, leaving the viewer to ponder the man’s thoughts and the secrets held within the woods.
Prompt
facial-expressions Surprise: Mystical, awe-inspiring ; A man walking through a forest, suddenly finding himself face-to-face with a mythical creature; eye-level; Single Person; dense foliage, dappled sunlight; cinematic
Characteristic
Shot : A man in a forest, looking up, a blurry figure of a furry animal is in the background.
Aesthetic Score : 0.7
Mood : dramatic, pensive, hopeful
Quality
Entropy : 6.92
Noise : 100
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The blur on the animal in the background seems intentional, but could be sharper, making the image more defined.
A Moment of Awe on the Mountain Path
A woman in a vibrant red jacket stands captivated by the winding mountain path ahead, her gaze fixed on the distant horizon. The dramatic landscape and her intense expression evoke a sense of adventure and awe, inviting you to imagine the journey she’s about to embark on.
Prompt
facial-expressions Surprise: Melancholy, reflective ; A lone climber stands atop a treacherous mountain peak, the wind whipping around them. Below, the path is littered with fallen rocks and debris, a testament to the perilous journey they’ve endured. The climber, battered but triumphant, gazes out at the breathtaking vista, realizing the true cost of reaching the summit.; cinematic
Characteristic
Shot : A woman in a red jacket is hiking on a mountain trail, looking ahead at the path. Another hiker is further down the path. The scene is rugged and mountainous with rocks and a clear sky.
Aesthetic Score : 0.7
Mood : adventurous, determined, contemplative
Quality
Entropy : 6.90
Noise : 106
Prompt Clip Score : 0.17
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors.
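The per-image scores above can be summarized with a few lines of Python. The values are copied from the ten results; the tuple layout and variable names are only an illustrative choice.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score) for the ten images above
results = [
    ("Lost in the Neon Shadows", 0.6, 0.26),
    ("Solitude Under a Starry Sky", 0.8, 0.25),
    ("Laughter and Light", 0.7, 0.23),
    ("Woman on the Edge of Discovery", 0.6, 0.28),
    ("A Moment of Intrigue on the Subway Platform", 0.6, 0.24),
    ("A Glowing Orb in the Ruins", 0.6, 0.30),
    ("Laughter, Sunshine, and a Drone", 0.7, 0.27),
    ("Headphones On, Excitement On", 0.6, 0.29),
    ("Lost in Thought", 0.7, 0.25),
    ("A Moment of Awe on the Mountain Path", 0.7, 0.17),
]

mean_aesthetic = mean(aesthetic for _, aesthetic, _ in results)
mean_clip = mean(clip for _, _, clip in results)
weakest = min(results, key=lambda r: r[2])    # lowest prompt CLIP score
strongest = max(results, key=lambda r: r[2])  # highest prompt CLIP score

print(f"mean aesthetic score:   {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"weakest prompt match:   {weakest[0]} ({weakest[2]:.2f})")
print(f"strongest prompt match: {strongest[0]} ({strongest[2]:.2f})")
```

The weakest prompt match is the mountain path image, whose prompt asked for a melancholy, reflective mood while the detected mood was adventurous and determined.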
Conclusion
The results show that the generative AI model handled shot composition well but struggled with camera position and aesthetic expectations. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t fully capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.65, which falls within the “good” range. This indicates that the model was able to understand and translate the scene description in the prompt into a visually coherent shot.
- Aesthetic Analysis: The model scored 0.14, which is just above the upper bound of the “very good” range of -0.2 to 0.1. This suggests that the generated images didn’t quite match the aesthetic style described in the prompts.
Overall, the model demonstrated a good understanding of shot composition but needs improvement in capturing the intended camera position and aesthetic style.
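The range comparisons in the breakdown can be reproduced with a small helper. This is a minimal sketch: it assumes the quoted ranges are inclusive at both ends, and the function name is hypothetical. The post does not give numeric bounds for the shot analysis range, so that score is left out.

```python
def position_in_range(score, low, high):
    """Report where a score sits relative to an inclusive [low, high] range."""
    if score < low:
        return "below"
    if score > high:
        return "above"
    return "within"


# (score, range low, range high) as quoted in the conclusion
checks = {
    "Camera Position": (0.25, 0.5, 0.75),      # "good" range
    "Aesthetic Analysis": (0.14, -0.2, 0.1),   # "very good" range
}

for name, (score, low, high) in checks.items():
    verdict = position_in_range(score, low, high)
    print(f"{name}: {score} is {verdict} the range {low} to {high}")
```

Both scores fall outside their target ranges, on opposite sides: camera position sits below its range and the aesthetic score sits slightly above.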