AI Captures Emotions but Struggles with Camera Angles: Titan-g1
This post examines how well a generative AI model captures facial expressions across a range of scenes. The model shows a strong ability to understand and depict emotions, but it is less accurate at reproducing the camera angle requested in the prompt. We look at where it succeeds and where it falls short, and at what these mixed results imply.
Created with: titan-g1
Laughter and Lights: A Moment of Pure Joy
A person’s infectious laughter fills the air, accompanied by the twinkling lights of a carousel and Ferris wheel. The scene evokes a sense of carefree happiness and wonder, captured in a moment of pure joy.
Prompt
facial-expressions Amusement: Playful, carefree ; A lone woman; eye-level; Single Person; a bustling carnival with bright lights and colorful tents; cinematic
Characteristic
Shot : A young person laughing and looking up at a Ferris wheel and a carousel in the background.
Aesthetic Score : 0.7
Mood : joyful, carefree, whimsical
Quality
Entropy : 6.87
Noise : 97
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight softness to it, perhaps due to a wide aperture, which gives a slight blurring to the background and the subject’s face. There are no obvious technical errors, like banding or artifacts.
Joyful Moments at the Ferris Wheel
A man stands beaming before a towering Ferris wheel, radiating happiness and carefree joy. The scene captures the essence of pure delight, with the Ferris wheel serving as a backdrop to his infectious smile.
Prompt
facial-expressions Amusement: Exuberant, triumphant ; A superhero in a vibrant costume; eye-level; Hero; a crowded amusement park with roller coasters and Ferris wheels in the background; cinematic
Characteristic
Shot : A man is laughing with his arm raised in the air; a blurred Ferris wheel is in the background.
Aesthetic Score : 0.6
Mood : joyful, excited, happy
Quality
Entropy : 6.75
Noise : 99
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is overly saturated and the colors are a bit unnatural. The blur on the Ferris wheel seems exaggerated. The overall quality of the image appears to be low, likely from compression.
Laughter and Whimsy at the Carousel
Three friends share a moment of pure joy on a sunny day, their laughter echoing through the park as they sit on a red and white checkered blanket in front of a softly blurred carousel. The scene captures the carefree spirit of friendship and the magic of a summer day.
Prompt
facial-expressions Amusement: Relaxed, happy ; A group of friends; eye-level; Normal People; a picnic blanket under a shady tree in a park, with a carousel in the distance; cinematic
Characteristic
Shot : Three friends are laughing together in a park, with a carousel in the background.
Aesthetic Score : 0.7
Mood : joyful, carefree, friendly
Quality
Entropy : 6.89
Noise : 101
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Joyful Gamer: Lit Up by the Thrill of Victory
A young woman, headphones on and controller in hand, beams with excitement as she plays a video game. The vibrant keyboard and focused lighting capture the energy and joy of her gaming experience.
Prompt
facial-expressions Amusement: Focused, excited ; A gamer; close-up; Gamer; a dimly lit room with a computer screen displaying a vibrant video game, a controller in their hand; cinematic
Characteristic
Shot : A young woman is playing video games and smiling happily.
Aesthetic Score : 0.7
Mood : joyful, energetic, focused
Quality
Entropy : 6.86
Noise : 103
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Carousel Dreams: A Moment of Joy and Wonder
A young girl, radiating happiness, stands before a vibrant carousel, her gaze fixed on a majestic white horse. The scene is awash in bright lights and colors, evoking a sense of playful nostalgia. This captivating image captures the pure joy and wonder of childhood.
Prompt
facial-expressions Amusement: Magical, innocent ; A young girl; eye-level; Single Person; a carousel with brightly painted horses, her eyes wide with wonder; cinematic
Characteristic
Shot : A young girl is standing next to a carousel horse, looking up with a smile on her face. The carousel is in the background, and the image is shot in a bright and colorful setting.
Aesthetic Score : 0.7
Mood : joyful, whimsical, nostalgic
Quality
Entropy : 6.97
Noise : 105
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant errors or artifacts in the image.
Laughter on the Playground: A Moment of Pure Joy
Two young boys share a moment of pure joy on a playground. The boy in the foreground laughs heartily, his infectious laughter radiating innocence and playfulness. The other boy watches him with a smile, capturing the essence of childhood friendship and carefree fun.
Prompt
facial-expressions Amusement: Joyful, carefree ; A group of children; eye-level; Normal People; a playground with swings, slides, and a sandbox, their laughter echoing in the air; cinematic
Characteristic
Shot : Two young boys, one laughing, the other looking on, possibly on a playground.
Aesthetic Score : 0.8
Mood : joyful, playful, innocent
Quality
Entropy : 6.88
Noise : 99
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : Slight blurriness, slight noise, especially in the background.
Silhouetted Against the Dusk, a Moment of Contemplation
A man, shrouded in the shadows of his jacket, stands on a pier, his gaze fixed on the turbulent ocean at dusk. The rule of thirds composition emphasizes his solitary figure, creating a mood of melancholy and introspection. The fading light paints a poignant backdrop to his quiet contemplation.
Prompt
facial-expressions Amusement: Melancholy, contemplative ; A lone man; eye-level; Single Person; a deserted boardwalk at night, the sound of crashing waves in the background; cinematic
Characteristic
Shot : A young man in a dark blue jacket stands on a pier, gazing out at the sea. The sun is setting in the distance, casting a warm glow over the scene.
Aesthetic Score : 0.6
Mood : melancholy, contemplative, calm
Quality
Entropy : 6.43
Noise : 93
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some noise in the image, particularly in the shadows. The colours are a bit muted and the overall contrast could be slightly higher.
Chasing Dreams in the City
A man in a suit, radiating excitement and determination, sprints through a bustling city, his energy palpable in every stride. The shallow depth of field and blurred background capture the feeling of speed and momentum, leaving you wanting to join him on his journey.
Prompt
facial-expressions Amusement: Thrilling, heroic ; A superhero in action; dynamic shot; Hero; a cityscape with towering buildings, a dramatic explosion in the background; cinematic
Characteristic
Shot : A man in a suit is running and celebrating in an urban environment, with skyscrapers in the background.
Aesthetic Score : 0.7
Mood : joyful, energetic, successful
Quality
Entropy : 6.69
Noise : 96
Prompt Clip Score : 0.18
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
The Thrill of the Ride: Joy and Excitement on a Roller Coaster
A young girl, a young man, and a woman share a moment of pure joy and excitement as they speed down a roller coaster hill. The blurry background and tilted angle capture the exhilarating motion, while their joyful expressions add to the vibrant and exciting mood of the scene.
Prompt
facial-expressions Amusement: Exhilarating, bonding ; A family; eye-level; Normal People; a crowded amusement park, their faces lit up with joy as they ride a roller coaster; cinematic
Characteristic
Shot : A family of three is riding a roller coaster, and the image is cropped from the chest up. They are all smiling and laughing excitedly. There is bright light filtering from behind. The image is taken in a park setting, and the background is out of focus.
Aesthetic Score : 0.7
Mood : joyful, exciting, happy
Quality
Entropy : 6.90
Noise : 103
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Headphones On, Excitement High: Gamer’s Joy Captured in a Single Shot
This image captures the pure joy of gaming. A young woman, headphones on, fist raised in the air, radiates excitement as she plays her favorite video game. The vibrant energy and dynamic expression create a sense of engagement and immersion in the virtual world.
Prompt
facial-expressions Amusement: Triumphant, exhilarating ; A gamer; close-up; Gamer; a dimly lit room, their hands moving rapidly on a keyboard, a triumphant shout escaping their lips; cinematic
Characteristic
Shot : A young woman with headphones on is sitting in front of a computer keyboard. She is excitedly celebrating, probably after winning a game or getting a high score.
Aesthetic Score : 0.7
Mood : joyful, excited, triumphant
Quality
Entropy : 6.94
Noise : 101
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur around the woman’s hand. It is not too distracting but is noticeable, especially on the left side.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.625, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.1, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
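Each prompt in this post follows the same semicolon-separated pattern, with the requested camera angle in the third position. As a rough sketch of how that field could be pulled out for a camera-position check, the Python below splits a prompt into six parts and tallies the angles requested across the ten images. The field names (`emotion`, `subject`, `camera`, `category`, `scene`, `style`) and the `parse_prompt` helper are labels assumed here for illustration; the post does not name them, and it does not describe how its own evaluation parses prompts.

```python
from collections import Counter
from typing import NamedTuple


class PromptFields(NamedTuple):
    # Hypothetical field names; the post only shows the raw prompt strings.
    emotion: str
    subject: str
    camera: str
    category: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptFields:
    """Split a prompt on ';' into its six positional fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return PromptFields(*parts)


# First prompt from the post, copied verbatim.
first = parse_prompt(
    "facial-expressions Amusement: Playful, carefree ; A lone woman; "
    "eye-level; Single Person; a bustling carnival with bright lights "
    "and colorful tents; cinematic"
)

# Camera field of each of the ten prompts, in the order they appear.
requested_cameras = [
    "eye-level", "eye-level", "eye-level", "close-up", "eye-level",
    "eye-level", "eye-level", "dynamic shot", "eye-level", "close-up",
]
camera_counts = Counter(requested_cameras)

print(first.camera)
print(camera_counts.most_common())
```

Seven of the ten prompts ask for an eye-level shot, two for a close-up, and one for a dynamic shot.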
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated images is very good.
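For a quick view across the whole gallery, the per-image numbers can be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten entries in this post and computes their means. The variable names and the choice of a simple mean are assumptions made for illustration; the post does not say how its summary scores in the conclusion were derived, and these averages are not meant to reproduce them.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries in this post, in order.
rows = [
    ("Laughter and Lights", 0.7, 0.22, 0.20),
    ("Joyful Moments at the Ferris Wheel", 0.6, 0.27, 0.20),
    ("Laughter and Whimsy at the Carousel", 0.7, 0.22, 0.10),
    ("Joyful Gamer", 0.7, 0.20, 0.20),
    ("Carousel Dreams", 0.7, 0.24, 0.10),
    ("Laughter on the Playground", 0.8, 0.24, 0.10),
    ("Silhouetted Against the Dusk", 0.6, 0.20, 0.10),
    ("Chasing Dreams in the City", 0.7, 0.18, 0.20),
    ("The Thrill of the Ride", 0.7, 0.28, 0.20),
    ("Headphones On, Excitement High", 0.7, 0.25, 0.20),
]

mean_aesthetic = round(mean(r[1] for r in rows), 2)
mean_clip = round(mean(r[2] for r in rows), 2)
mean_ai_likelihood = round(mean(r[3] for r in rows), 2)

# Entry whose image matched its prompt least well by CLIP score.
lowest_clip = min(rows, key=lambda r: r[2])

print(f"mean aesthetic score:   {mean_aesthetic}")
print(f"mean prompt CLIP score: {mean_clip}")
print(f"mean likelihood of AI:  {mean_ai_likelihood}")
print(f"lowest CLIP score:      {lowest_clip[0]} ({lowest_clip[2]})")
```

The lowest prompt CLIP score (0.18) belongs to "Chasing Dreams in the City", the only entry whose prompt asked for a dynamic shot.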