AI Captures the Essence of Emotion, But Struggles with Camera Angles with Imagen-v2
In the realm of artificial intelligence, the ability to generate realistic and emotionally evocative images is a coveted goal. This blog post examines the performance of a generative AI model in capturing facial expressions within various scenes. The model demonstrates a remarkable ability to convey the desired aesthetic and emotional tone, but struggles with accurately replicating the intended camera position. This highlights the ongoing challenges and triumphs in the development of AI image generation technology. We will explore the model’s strengths and weaknesses, providing insights into the nuances of AI-generated imagery and its potential for future applications.
Created with: imagen-v2
Lost in the Neon Glow: A City’s Mystery Unfolds
A young woman, her hair slick with rain, stands bathed in the vibrant hues of urban neon. Her enigmatic expression and the city’s shadowy backdrop weave a tale of mystery and melancholic allure.
Prompt
facial-expressions Realization: Melancholy, introspective ; A lone figure; eye-level; Single Person; a bustling city street at night, with neon signs and rain reflecting on the wet pavement; cinematic
Characteristic
Shot : A young woman with messy hair is looking towards the right of the frame. She is wearing a dark leather jacket and has a worried expression. The background is blurry, with out-of-focus neon lights and silhouettes of people.
Aesthetic Score : 0.7
Mood : dark, mysterious, concerned
Quality
Entropy : 6.36
Noise : 78
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry and has some artifacts, particularly around the edges of the woman’s hair and in the background. The lighting and colors are dramatic and impactful, but could be more refined.
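Every prompt in this post follows the same semicolon-separated pattern, and the camera position is always the third segment. The sketch below splits a prompt into its parts so the requested camera position can be compared with the shot description. The field names (topic, emotion, subject, camera, category, setting, style) are labels assumed here for illustration; the post itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are assumptions made for this sketch, not terms from the post.
    topic: str
    emotion: list[str]
    subject: str
    camera: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into six segments."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments, got {len(parts)}")
    head, subject, camera, category, setting, style = parts
    # The first segment looks like "<topic> Realization: <emotion>, <emotion>".
    topic, _, emotion = head.partition(" Realization:")
    return PromptParts(
        topic=topic.strip(),
        emotion=[e.strip() for e in emotion.split(",") if e.strip()],
        subject=subject,
        camera=camera,
        category=category,
        setting=setting,
        style=style,
    )


example = (
    "facial-expressions Realization: Melancholy, introspective ; A lone figure; "
    "eye-level; Single Person; a bustling city street at night, with neon signs "
    "and rain reflecting on the wet pavement; cinematic"
)
parsed = parse_prompt(example)
print(parsed.camera)   # the requested camera position
print(parsed.emotion)  # the requested emotional tone
```

For the prompt above, the camera segment is "eye-level" and the emotion segment is "Melancholy, introspective".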
Superhero Stands Tall Against the Cityscape
A powerful and dramatic image of a superhero in full costume, silhouetted against a towering cityscape. The lighting and pose create a sense of importance and gravitas, hinting at the hero’s strength and determination.
Prompt
facial-expressions Realization: Triumphant, awe-inspiring ; A superhero, standing atop a skyscraper; wide shot; Hero; a sprawling cityscape bathed in the golden light of sunset; cinematic
Characteristic
Shot : A man in a superhero costume stands in front of a city skyline. It looks like he is on a rooftop.
Aesthetic Score : 0.8
Mood : serious, powerful, dramatic
Quality
Entropy : 6.82
Noise : 82
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some noise and compression artifacts.
Lost in Thought: A Moment of Contemplation
A young woman with long blonde hair sits in a kitchen, her gaze fixed on something unseen. The soft lighting and her thoughtful expression create an air of mystery and intrigue, leaving the viewer to wonder what thoughts are swirling in her mind.
Prompt
facial-expressions Realization: Disillusioned, resigned ; A young woman, sitting at a kitchen table; close-up; Normal People; a cluttered kitchen, with dishes piled in the sink and a half-eaten meal on the table; cinematic
Characteristic
Shot : A young woman with long blonde hair is sitting in a kitchen, looking thoughtful. The lighting is soft and the composition is simple.
Aesthetic Score : 0.7
Mood : pensive, melancholic, introspective
Quality
Entropy : 6.72
Noise : 90
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight amount of noise in the background, which is most noticeable in the blurred areas.
Lost in the Game: A Moment of Intense Focus
A young gamer, headphones on, is completely absorbed in their virtual world. The dramatic lighting and composition capture the intensity and energy of the moment, highlighting the player’s focused expression.
Prompt
facial-expressions Realization: Intense, focused ; A gamer, hunched over a computer screen; close-up; Gamer; a dimly lit room, with flashing lights from the monitor and empty pizza boxes scattered around; cinematic
Characteristic
Shot : A young person wearing headphones is intensely focused on a computer screen while gaming. The scene is lit by a warm light from the screen and a cool blue light from the room.
Aesthetic Score : 0.6
Mood : intense, focused, suspenseful
Quality
Entropy : 6.39
Noise : 49
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.70
Image errors : Some minor artifacts can be seen in the background, especially around the screen and the character’s hair.
Lost in the Crowd: A Man’s Mysterious Journey Begins
A solitary figure stands out amidst the bustling chaos of a train station. The man’s intense expression and the dramatic use of lighting and blur create a sense of mystery and suspense, hinting at a story waiting to unfold.
Prompt
facial-expressions Realization: Lost, alienated ; A man, walking through a crowded train station; eye-level; Single Person; a sea of faces, all rushing in different directions; cinematic
Characteristic
Shot : A man with a serious expression is standing in a crowded train station or similar public space. The focus is on his face, and he is looking directly at the viewer.
Aesthetic Score : 0.7
Mood : mysterious, intense, serious
Quality
Entropy : 6.76
Noise : 82
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some artifacts around the edges of the image, and the background is somewhat blurry. The details of the surrounding people are also not particularly sharp.
Hero Stands Tall Amidst the Ashes
A muscular, Superman-like figure stands defiant against a backdrop of fire and smoke, evoking a sense of epic heroism and the intensity of a post-apocalyptic or battle-torn world. The dramatic contrast between the hero’s determined stance and the fiery landscape amplifies the sense of danger and courage.
Prompt
facial-expressions Realization: Determined, resolute ; A superhero, standing in the middle of a battle; wide shot; Hero; a chaotic scene of destruction and explosions, with enemies closing in; cinematic
Characteristic
Shot : A superhero, presumably Superman, is standing in front of a fiery explosion. The image has a dark and gritty feel, with a lot of smoke and debris in the background.
Aesthetic Score : 0.6
Mood : dark, intense, powerful
Quality
Entropy : 6.69
Noise : 68
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image contains some artifacts, particularly in the smoke and debris. The superhero’s suit also looks a little too detailed for a superhero suit. The flames in the background also lack realism and seem like an overlay.
Friends Enjoying a Sunny Picnic
A group of four friends share laughter and conversation around a picnic table in a grassy field. The warm colors and soft lighting create a sense of intimacy and togetherness, capturing the carefree spirit of their friendship.
Prompt
facial-expressions Realization: Nostalgic, heartwarming ; A group of friends, gathered around a picnic table; medium shot; Normal People; a sunny park, with the scent of freshly cut grass and blooming flowers in the air.; cinematic
Characteristic
Shot : A group of four young people are sitting at a picnic table in a park. The table is covered in food and drinks, and the people are talking and laughing. The background is a blurry green field and some trees.
Aesthetic Score : 0.6
Mood : relaxed, friendly, casual
Quality
Entropy : 6.50
Noise : 84
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight overexposure and some noise in the background.
Intense Gaze, Mysterious Aura: A Man Bathed in Light
A captivating portrait of a man with messy blonde hair, his face illuminated by dramatic backlighting. The intense gaze and sharp features create a sense of mystery and intrigue, leaving the viewer wondering about his story.
Prompt
facial-expressions Realization: Defeated, frustrated ; A gamer, staring at a blank screen; close-up; Gamer; a dimly lit room, with the only light coming from the monitor, which is now displaying a game over message; cinematic
Characteristic
Shot : A close-up portrait of a man, looking up, in a dark room with a bright light source behind him.
Aesthetic Score : 0.7
Mood : intense, mysterious, contemplative
Quality
Entropy : 6.33
Noise : 113
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, and there is some noise visible.
Lost in Thought: A Moment of Serenity at Sunset
A young woman with blonde hair gazes thoughtfully towards the horizon, her expression a blend of pensiveness and serenity. The soft light of sunset bathes the scene in a warm glow, creating a sense of mystery and intrigue. The blurred background of water and sky suggests a peaceful beach setting, where the woman finds solace in the beauty of the moment.
Prompt
facial-expressions Realization: Reflective, contemplative ; A woman, standing on a cliff overlooking the ocean; eye-level; Single Person; a vast expanse of blue water stretching out to the horizon, with the sun setting in the distance; cinematic
Characteristic
Shot : A portrait of a young woman with blonde hair, looking off to the side. The background is blurred and out of focus, and the lighting is soft and warm.
Aesthetic Score : 0.8
Mood : pensive, dreamy, romantic
Quality
Entropy : 6.91
Noise : 54
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some noticeable artifacts in the image, especially in the woman’s hair and the background. The woman’s hair appears too smooth and artificial.
A Hero Rises from the Ashes
A muscular, Superman-like figure stands defiant in a post-apocalyptic wasteland, bathed in dramatic light and shadow. The scene evokes a sense of epic struggle and impending action, leaving the viewer wondering what challenges lie ahead for this lone hero.
Prompt
facial-expressions Realization: Hopeful, determined ; A superhero, standing in the ruins of a city; wide shot; Hero; a desolate landscape, with smoke rising from the rubble and the sun breaking through the clouds; cinematic
Characteristic
Shot : A muscular Superman-like figure stands in a post-apocalyptic cityscape, with smoke billowing in the background. The figure is illuminated by a bright light, suggesting a sense of hope or power.
Aesthetic Score : 0.7
Mood : heroic, dramatic, hopeful
Quality
Entropy : 6.48
Noise : 52
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : The figure’s musculature appears slightly exaggerated and the textures in the background are somewhat blurry and artificial.
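To see the ten images side by side, the per-image numbers listed above can be collected and averaged. This is only a convenience sketch over the values reported in this post. The 0.5 cut-off for counting an image as "likely AI" is an assumption made here, not a threshold the evaluation states.

```python
from statistics import mean

# (requested camera position, aesthetic score, prompt CLIP score, likelihood of AI)
# Values are copied from the ten image entries in this post, in order.
records = [
    ("eye-level",   0.7, 0.28, 0.80),
    ("wide shot",   0.8, 0.20, 0.10),
    ("close-up",    0.7, 0.28, 0.20),
    ("close-up",    0.6, 0.26, 0.70),
    ("eye-level",   0.7, 0.29, 0.80),
    ("wide shot",   0.6, 0.25, 0.80),
    ("medium shot", 0.6, 0.22, 0.20),
    ("close-up",    0.7, 0.27, 0.20),
    ("eye-level",   0.8, 0.29, 0.80),
    ("wide shot",   0.7, 0.23, 0.90),
]

mean_aesthetic = mean(r[1] for r in records)
mean_clip = mean(r[2] for r in records)
mean_ai = mean(r[3] for r in records)

# Assumed cut-off: treat a likelihood of 0.5 or more as "likely AI".
AI_THRESHOLD = 0.5
likely_ai = sum(1 for r in records if r[3] >= AI_THRESHOLD)

print(f"mean aesthetic score:   {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.3f}")
print(f"mean likelihood of AI:  {mean_ai:.2f}")
print(f"images at or above {AI_THRESHOLD}: {likely_ai} of {len(records)}")
```

On these values the mean aesthetic score is 0.69, the mean prompt CLIP score is about 0.26, and six of the ten images have a likelihood of AI of 0.5 or more.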
Conclusion
The results show that the generative AI model matched the intended aesthetic very well and understood the scene reasonably well, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.39, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t fully capture the intended camera position in the prompt.
- Shot Analysis: The model scored 0.57, which falls within the “good” range. This indicates that the model was able to understand the scene described in the prompt reasonably well.
- Aesthetic Analysis: The model scored 0.10, which is within the “very good” range of -0.2 to 0.1. This means the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall, the model demonstrated a good understanding of the scene and a very good ability to match the desired aesthetic. However, it struggled to accurately capture the intended camera position.
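The range checks in the breakdown can be written out explicitly. The sketch below uses only the two ranges quoted above and makes two assumptions: range bounds are inclusive, and Shot Analysis uses the same "good" range as Camera Position. The `Band` type and `in_band` helper are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Band(NamedTuple):
    label: str
    low: float
    high: float


def in_band(score: float, band: Band) -> bool:
    """Return True if the score lies inside the band (bounds assumed inclusive)."""
    return band.low <= score <= band.high


GOOD = Band("good", 0.5, 0.75)                     # range quoted for camera position
AESTHETIC_VERY_GOOD = Band("very good", -0.2, 0.1)  # range quoted for aesthetic analysis

results = {
    "Camera Position": (0.39, GOOD),
    "Shot Analysis": (0.57, GOOD),  # assumes the same "good" range applies
    "Aesthetic Analysis": (0.10, AESTHETIC_VERY_GOOD),
}

for name, (score, band) in results.items():
    verdict = "within" if in_band(score, band) else "outside"
    print(f"{name}: {score:.2f} is {verdict} the '{band.label}' range "
          f"[{band.low}, {band.high}]")
```

Under these assumptions, only the camera position score falls outside its range, which matches the summary above. The aesthetic score of 0.10 sits exactly on the upper bound, so it counts as "very good" only if that bound is inclusive.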