AI's Facial Expressions: A Step Forward, But Still Room for Growth with Titan-g1
Facial expressions are a powerful tool for conveying emotions and adding depth to storytelling. In the realm of AI-generated imagery, capturing these nuances accurately is crucial for creating truly compelling and engaging visuals. This blog post examines the performance of a generative AI model in understanding and generating facial expressions, highlighting its strengths and weaknesses, and exploring the potential for future advancements in this area.
Created with: titan-g1
Lost in the City’s Embrace: A Solitary Figure Walks Through the Night
A lone figure navigates a wet city street, bathed in the melancholic glow of streetlights. Shadows dance around them, creating an atmosphere of isolation and mystery. This image captures the essence of urban solitude, leaving viewers to ponder the figure’s story and the secrets hidden within the city’s depths.
Prompt
facial-expressions Anger: Despair and rage ; A lone figure, standing in the middle of a deserted street; eye-level; Single Person; Rain pouring down, streetlights casting long shadows; cinematic
Characteristic
Shot : A solitary figure walks down a wet street at night. The street is lined with buildings, and streetlights illuminate the scene.
Aesthetic Score : 0.6
Mood : melancholy, urban, lonely
Quality
Entropy : 6.83
Noise : 108
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some noise and grain in the image.
Lost in Wonder: A Young Woman Explores a Majestic Waterfall
An adventurous spirit fills the air as a young woman stands amidst a breathtaking canyon, a cascading waterfall serving as a dramatic backdrop. Her open mouth and raised arm suggest a moment of awe and excitement, capturing the essence of exploration and the beauty of nature.
Prompt
facial-expressions Anger: rage and determination ; A lone explorer, map in hand, stands at the precipice of a hidden waterfall, its cascading waters illuminating a network of ancient tunnels carved into the mountainside. The air hums with an unseen energy, and the walls are covered in intricate carvings depicting fantastical creatures.; cinematic
Characteristic
Shot : A woman is standing in a canyon, looking at a waterfall in the distance. She is holding a map and seems to be excited or surprised.
Aesthetic Score : 0.6
Mood : adventurous, excited, scenic
Quality
Entropy : 6.93
Noise : 112
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant artifacts or errors in the image. The lighting is good, and there is no noticeable blurriness or noise.
The Frustration is Palpable
A man, face contorted in anger, sits at a desk littered with crumpled papers, his fist raised in the air. The image captures the raw emotion of frustration and stress, leaving the viewer to wonder what drove him to this point.
Prompt
facial-expressions Anger: Frustration and rage ; A man, slamming his fist on a table, surrounded by scattered papers; eye-level; Normal Person; A cluttered office, with a window showing a stormy sky; cinematic
Characteristic
Shot : A man in a white shirt is sitting at a desk, yelling and shaking his fist in the air. There are crumpled up papers all over the desk, suggesting frustration and anger. The background is a blurry office setting with a window.
Aesthetic Score : 0.3
Mood : frustrated, angry, overwhelmed
Quality
Entropy : 6.84
Noise : 103
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, and some of the details are lost in the shadows. There is also a slight blur to the image, which may be due to motion or a shallow depth of field.
The Thrill of the Game: Gamer’s Intense Focus Captured in a Single Shot
A young man, headphones on and eyes glued to the screen, sits in his gaming chair, his body language radiating intensity and excitement. The scene captures the raw emotion of gaming, with a can of soda adding a touch of everyday life to the moment.
Prompt
facial-expressions Anger: Frustration and rage ; A gamer, throwing his headset on the floor, surrounded by empty energy drink cans; eye-level; Gamer; A dimly lit room, with a computer screen displaying a game in progress; cinematic
Characteristic
Shot : A young man is sitting in front of a computer and wearing headphones, he appears to be playing a video game and is reacting to something on the screen with a scream and both hands raised. The scene is set in a dimly lit room, with a red gaming chair, and a computer monitor in the background. There is a can of soda on the desk beside the keyboard
Aesthetic Score : 0.4
Mood : intense, competitive, frustration
Quality
Entropy : 6.89
Noise : 105
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, especially in the background. There is some noise in the image, particularly in the darker areas. The colors are slightly muted and lack vibrancy.
A Scream of Raw Emotion
A close-up portrait captures the intensity of a woman’s scream, her face contorted in a moment of raw, unfiltered distress. The image is stark and dramatic, leaving a lasting impression of the power of human emotion.
Prompt
facial-expressions Anger: Despair and rage ; A woman, screaming into the void, her face contorted in anger; close-up; Single Person; A dark, empty room, with only a single flickering light; cinematic
Characteristic
Shot : A woman is screaming with her mouth wide open, her eyes are wide and her face is contorted in pain. Her hair is dark and she is wearing a dark shirt. The background is out of focus and it is hard to see what is going on. She looks like she is in a state of distress.
Aesthetic Score : 0.2
Mood : distress, agony, anger
Quality
Entropy : 6.69
Noise : 100
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some graininess, but nothing too disturbing.
A Solitary Figure Contemplates the Fury of the Storm
A lone figure stands on a mountain ridge, silhouetted against a breathtaking lightning storm. The dramatic scene evokes a sense of power, isolation, and awe, capturing the raw beauty of nature’s fury.
Prompt
facial-expressions Anger: Anger and determination ; A lone figure stands atop a towering mountain, gazing out at a vast, swirling storm. Lightning cracks across the sky, illuminating the jagged peaks and the churning clouds below.; cinematic
Characteristic
Shot : A lone figure stands on a rocky mountain ridge, silhouetted against a dramatic sky filled with lightning.
Aesthetic Score : 0.7
Mood : dramatic, awe-inspiring, powerful
Quality
Entropy : 6.76
Noise : 107
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.50
Image errors : No significant image errors are present.
Cafe Showdown: Heated Argument Erupts in Public
Two men engage in a heated argument at a cafe, one man’s shouting and gesturing creating a tense atmosphere. The other man’s shocked expression adds to the dramatic scene.
Prompt
facial-expressions Anger: Frustration ; Two friends, locked in a heated debate over the latest board game release, their voices rising in excitement; eye-level; Normal People; A bustling coffee shop, with other patrons looking on.; cinematic
Characteristic
Shot : Two men are arguing in a cafe, one looks surprised and the other is yelling.
Aesthetic Score : 0.4
Mood : intense, tense, argumentative
Quality
Entropy : 6.92
Noise : 99
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant artifacts or errors
The Croissant Conundrum: A Chef’s Disgust
A woman in a chef’s uniform stands in a kitchen, her face contorted in disgust as she stares at a plate of croissants. The ordinary setting contrasts sharply with her intense emotion, leaving viewers to wonder what culinary catastrophe has unfolded.
Prompt
facial-expressions Anger: Frustration ; A frustrated chef in the kitchen. Close-up on the chef’s face. The room is dimly lit, with a plate of cooling pastries on the table and a timer ticking down on the wall.; cinematic
Characteristic
Shot : A woman in a kitchen, looking distressed, with a plate of pastries in the foreground. A clock is visible in the background.
Aesthetic Score : 0.3
Mood : tense, frustrated, disappointed
Quality
Entropy : 6.83
Noise : 98
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image seems slightly out of focus, particularly in the background. The lighting is not very even, creating shadows around the woman’s face.
Lost in the Storm: A Man’s Cry for Help
A solitary figure stands amidst a downpour, his mouth open in a silent scream. The blurred background hints at a chaotic urban landscape, amplifying the intensity of the moment. This image captures raw emotion and a sense of desperation, leaving the viewer to ponder the man’s story.
Prompt
facial-expressions Anger: Despair and rage ; A man, standing in the rain, his face obscured by the downpour; eye-level; Single Person; A dark, deserted street, with only the sound of rain and thunder; cinematic
Characteristic
Shot : A man in a black jacket stands in the rain, with a city street in the background. He is looking up and shouting.
Aesthetic Score : 0.4
Mood : intense, dramatic, emotional
Quality
Entropy : 6.95
Noise : 108
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts and noise in the image, particularly in the background.
Screaming in the Ashes: A Soldier’s Anguish Amidst the Apocalypse
A powerful image captures the raw emotion of a soldier amidst a fiery, chaotic battlefield. The smoke-filled landscape and debris create a sense of urgency and danger, highlighting the man’s anguish and the devastating consequences of war.
Prompt
facial-expressions Anger: Anger and determination ; A hero, standing on a battlefield, surrounded by fallen enemies; eye-level; Hero; A battlefield littered with bodies, with smoke and dust filling the air; cinematic
Characteristic
Shot : A man in a military uniform is yelling in a battlefield with fire and explosions in the background. The scene is slightly blurry, giving it a sense of motion and intensity.
Aesthetic Score : 0.7
Mood : intense, dramatic, action-packed
Quality
Entropy : 6.83
Noise : 98
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight blurriness to the image, likely from camera shake or motion blur during the shot. The image might also have some noise in the dark areas.
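The per-image scores listed above can be summarized with a few lines of Python. This is a minimal sketch rather than part of the original evaluation: the values are copied from the ten entries in this post, the shortened titles and variable names are illustrative, and the averages it computes are not the figures quoted in the conclusion.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied in order from the ten entries above. Titles are shortened labels.
entries = [
    ("Lost in the City's Embrace", 0.6, 0.30, 0.10),
    ("Lost in Wonder", 0.6, 0.29, 0.20),
    ("The Frustration is Palpable", 0.3, 0.27, 0.20),
    ("The Thrill of the Game", 0.4, 0.30, 0.20),
    ("A Scream of Raw Emotion", 0.2, 0.28, 0.20),
    ("Fury of the Storm", 0.7, 0.30, 0.50),
    ("Cafe Showdown", 0.4, 0.31, 0.20),
    ("The Croissant Conundrum", 0.3, 0.34, 0.10),
    ("Lost in the Storm", 0.4, 0.27, 0.30),
    ("Screaming in the Ashes", 0.7, 0.26, 0.20),
]

# Average each metric across the ten images.
mean_aesthetic = mean(e[1] for e in entries)
mean_clip = mean(e[2] for e in entries)
mean_ai_likelihood = mean(e[3] for e in entries)

# Image whose caption matched its prompt most closely, by prompt CLIP score.
best_clip = max(entries, key=lambda e: e[2])
# Image with the lowest aesthetic score.
worst_aesthetic = min(entries, key=lambda e: e[1])

print(f"Mean aesthetic score:    {mean_aesthetic:.2f}")
print(f"Mean prompt CLIP score:  {mean_clip:.3f}")
print(f"Mean likelihood of AI:   {mean_ai_likelihood:.2f}")
print(f"Highest CLIP score:      {best_clip[0]} ({best_clip[2]:.2f})")
print(f"Lowest aesthetic score:  {worst_aesthetic[0]} ({worst_aesthetic[1]:.1f})")
```

One pattern is visible in the table itself: the two highest aesthetic scores (0.7) belong to wide shots in which the face is small or silhouetted, while the close-up scream has the lowest (0.2).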
Conclusion
The results show that the generative AI model performed well in understanding the scene, but struggled with camera position and with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t accurately capture the intended camera position in the prompt.
- Shot Analysis: The model scored 0.58, which falls within the “good” range. This suggests that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.32, which lies well outside the “very good” range of -0.2 to 0.1. This indicates that the aesthetic of the generated images deviated from the aesthetic described in the prompts.
Overall, the model shows promise in understanding the scene and composing a matching shot, but needs improvement in following the requested camera position and in capturing the desired aesthetic.
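The comparison of each score against its target range can be written out explicitly. The sketch below is illustrative only: the scores and ranges are the ones quoted in this conclusion, the `Metric` type and `in_range` helper are hypothetical names, and the “good” range for Shot Analysis is assumed to be the same 0.5 to 0.75 given for Camera Position, since the post does not state it separately.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    name: str
    score: float
    low: float    # lower bound of the target range
    high: float   # upper bound of the target range
    label: str    # name the post gives to the target range


metrics = [
    Metric("Camera Position", 0.25, 0.5, 0.75, "good"),
    # ASSUMPTION: the post does not give a separate range for Shot Analysis,
    # so the "good" range quoted for Camera Position is reused here.
    Metric("Shot Analysis", 0.58, 0.5, 0.75, "good"),
    Metric("Aesthetic Analysis", 0.32, -0.2, 0.1, "very good"),
]


def in_range(metric: Metric) -> bool:
    """Return True when the score lies inside its target range (inclusive)."""
    return metric.low <= metric.score <= metric.high


results = {m.name: in_range(m) for m in metrics}

for m in metrics:
    verdict = "inside" if results[m.name] else "outside"
    print(f"{m.name}: {m.score:.2f} is {verdict} the "
          f"'{m.label}' range [{m.low}, {m.high}]")
```

Under these assumptions only Shot Analysis lands inside its range: Camera Position falls below its range and the aesthetic score lies above its range.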