AI's Artistic Eye: Capturing Emotion, Missing the Angle with Imagen-v3-fast
Text-to-image generation has become increasingly sophisticated. This study examines how well a generative AI model, imagen-v3-fast, captures facial expressions and follows camera positioning instructions. The results reveal a clear split: the model excels at conveying the emotional nuances of a scene and producing visually appealing images, but it struggles to interpret camera position instructions accurately. For example, the model might depict a character’s sadness convincingly through facial expression yet place the camera at an unexpected angle, disrupting the intended perspective. This discrepancy highlights the ongoing challenge of bridging the gap between human intention and AI execution in creative tasks, and the need for further development in AI’s ability to understand and translate complex visual instructions.
Created with: imagen-v3-fast
Lost in Thought: A Moment of Solitude in the City
A man sits alone on a park bench, his gaze distant and his expression pensive. The blurred background suggests an urban setting, adding to the feeling of isolation and introspection. The image captures a moment of quiet contemplation, leaving the viewer to wonder about the man’s thoughts and emotions.
Prompt
facial-expressions Thoughtfulness: Melancholy, contemplative ; A lone figure sitting on a park bench; eye-level; Single Person; a bustling city park in the background; cinematic
Characteristic
Shot : A man sitting on a park bench in a pensive mood. The background is blurry and out of focus, suggesting an urban setting.
Aesthetic Score : 0.6
Mood : melancholy, pensive, introspective
Quality
Entropy : 6.84
Noise : 55
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors. The image appears to be of high quality.
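The post does not define the Entropy value reported for each image. One common reading, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 to 8 bits and would be consistent with values such as 6.84. The sketch below illustrates that assumption only; the function name `grayscale_entropy` is hypothetical, and this is not necessarily how the figures in this post were produced.

```python
import math
from collections import Counter
from typing import Iterable


def grayscale_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a flat sequence of grayscale intensities.

    Assumption: each value is an integer intensity in the range 0..255.
    """
    # Count how often each intensity level occurs.
    counts = Counter(pixels)
    total = sum(counts.values())
    # H = sum over levels of p * log2(1 / p), with p = n / total.
    return float(sum((n / total) * math.log2(total / n) for n in counts.values()))


# Synthetic inputs, not the images from this post.
flat = [0] * 4096               # a single intensity: no variation at all
ramp = list(range(256)) * 16    # every level equally often: maximum variation
print(grayscale_entropy(flat), grayscale_entropy(ramp))
```

Under this reading, a completely flat image scores 0 and an image that uses all 256 levels equally scores 8, so the gallery values between roughly 6.3 and 7.0 would indicate fairly rich tonal variation.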
Superman Stands Tall, Hopeful Against the City Lights
A powerful image of Superman, bathed in the glow of the city skyline, captures his determined spirit and hopeful gaze. The dramatic use of light and shadow enhances his presence, making him a symbol of strength and resilience.
Prompt
facial-expressions Thoughtfulness: Reflective, introspective ; A superhero standing on a rooftop, looking out at the city; eye-level; Hero; a sprawling cityscape with twinkling lights; cinematic
Characteristic
Shot : A man dressed as Superman, looking out at a city skyline at night.
Aesthetic Score : 0.75
Mood : determined, hopeful, powerful
Quality
Entropy : 6.32
Noise : 60
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight blurriness in the background, especially in the city skyline.
Lost in the Pages, Found in the Moment
A woman finds solace in a book as the world rushes by outside her train window. The soft, natural light creates a peaceful atmosphere, capturing a moment of calm contemplation.
Prompt
facial-expressions Thoughtfulness: Peaceful, absorbed ; A woman reading a book on a train; eye-level; Normal Person; a blurry view of passing scenery outside the window; cinematic
Characteristic
Shot : A woman sits by a window on a train, reading a book. The window shows a blurry view of passing scenery.
Aesthetic Score : 0.6
Mood : calm, contemplative, peaceful
Quality
Entropy : 6.81
Noise : 73
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and there is some graininess in the shadows.
The Focus of a Gamer
A young man, headphones on, is completely immersed in a game, his face reflecting intense concentration. The image captures the serious and competitive nature of gaming, highlighting the player’s dedication and focus.
Prompt
facial-expressions Thoughtfulness: Intense, focused ; A gamer sitting in a dimly lit room, staring intently at a computer screen; eye-level; Gamer; a cluttered desk with gaming peripherals; cinematic
Characteristic
Shot : A young man wearing headphones is seated in front of a computer, focused on a game.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 6.36
Noise : 42
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors in the image
Lost in the Waves: A Man’s Solitary Contemplation
A bearded man with long hair stands on a windswept beach, his gaze fixed on the turbulent sea. The cloudy sky and choppy waves mirror the melancholy mood, highlighting a sense of loneliness and introspection in his posture.
Prompt
facial-expressions Thoughtfulness: Solitary, introspective ; A man walking alone on a deserted beach; eye-level; Single Person; the vast ocean stretching out before him; cinematic
Characteristic
Shot : A man with long hair and a beard is standing on a beach, looking out at the sea. The sky is cloudy, and the sea is choppy.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, introspective
Quality
Entropy : 6.72
Noise : 50
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.00
Image errors : No errors
Firefighter Faces the Aftermath, a Solemn Reminder of Loss
A firefighter stands resolute in front of a charred building, their expression reflecting the gravity of the situation. The scene evokes a sense of somber determination, highlighting the sacrifices made in the face of tragedy.
Prompt
facial-expressions Thoughtfulness: Somber, reflective ; A firefighter standing amidst the ruins of a fire; eye-level; Hero; smoke and debris filling the air; cinematic
Characteristic
Shot : A firefighter standing in front of a burned-out building, looking directly at the camera.
Aesthetic Score : 0.7
Mood : serious, determined, somber
Quality
Entropy : 6.76
Noise : 68
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, and the background is a bit blurry.
Secrets Whispered in the Shadows
Four young adults gather around a dimly lit table, their faces illuminated by flickering candlelight. The atmosphere is thick with mystery and suspense, hinting at secrets waiting to be revealed.
Prompt
facial-expressions Thoughtfulness: Intimate, conspiratorial ; A group of friends huddle around a dimly lit table in a cozy cafe, their faces illuminated by flickering candlelight.; cinematic
Characteristic
Shot : A group of four young adults are sitting around a table with lit candles in a dimly lit room. The lighting creates a moody and mysterious atmosphere.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, intimate
Quality
Entropy : 6.53
Noise : 61
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
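The prompts in this post appear to follow a semicolon-separated layout (expression; subject; camera position; character type; background; style), although the prompt for this image has only three fields and no camera position. A minimal sketch, assuming that layout, of pulling the camera instruction out of a prompt so it can be compared with the generated image. The function name `camera_position` and the `CAMERA_TERMS` set are hypothetical; the set lists only the two camera terms that occur in this post.

```python
from typing import Optional

# Camera terms seen in this post's prompts; extend for other prompt sets.
CAMERA_TERMS = {"eye-level", "close-up"}


def camera_position(prompt: str) -> Optional[str]:
    """Return the camera-position field of a prompt, or None if it has none."""
    # Fields are separated by semicolons; strip the padding around each one.
    fields = [field.strip() for field in prompt.split(";")]
    for field in fields:
        if field.lower() in CAMERA_TERMS:
            return field.lower()
    return None


with_camera = (
    "facial-expressions Thoughtfulness: Excited, immersed ; "
    "A gamer holding a controller, eyes glued to the screen; close-up; Gamer; "
    "a vibrant, colorful gaming world displayed on the monitor; cinematic"
)
without_camera = (
    "facial-expressions Thoughtfulness: Intimate, conspiratorial ; "
    "A group of friends huddle around a dimly lit table in a cozy cafe, "
    "their faces illuminated by flickering candlelight.; cinematic"
)
print(camera_position(with_camera))     # close-up
print(camera_position(without_camera))  # None
```

Returning `None` for a prompt without a camera field keeps such images out of any camera-position comparison instead of counting them as mismatches.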
The Moment He Knew He Was Winning
A young man, eyes wide with surprise, is completely immersed in his video game. Headphones on, controller in hand, he’s locked in a battle of wits and reflexes. The intensity of the moment is palpable, captured in this close-up shot.
Prompt
facial-expressions Thoughtfulness: Excited, immersed ; A gamer holding a controller, eyes glued to the screen; close-up; Gamer; a vibrant, colorful gaming world displayed on the monitor; cinematic
Characteristic
Shot : A young man is playing video games, wearing headphones, looking surprised, with a gamepad in his hands
Aesthetic Score : 0.6
Mood : intense, focused, surprised
Quality
Entropy : 6.78
Noise : 53
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Lost in Thought: A Moment of Tranquility in the Park
A young woman finds peace amidst the vibrant blooms of a park, her pen dancing across the pages of her notebook. The intimate composition draws you into her world of quiet contemplation, capturing the essence of calm and thoughtful reflection.
Prompt
facial-expressions Thoughtfulness: Peaceful, creative ; A woman sitting on a park bench, sketching in a notebook; eye-level; Single Person; a serene park setting with blooming flowers; cinematic
Characteristic
Shot : A young woman is sitting on a park bench, writing in a notebook. The bench is surrounded by pink flowers and green trees.
Aesthetic Score : 0.7
Mood : calm, thoughtful, peaceful
Quality
Entropy : 6.97
Noise : 87
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality appears to be good with no visible artifacts or errors.
Hope Takes Flight: A Superhero’s Determined Gaze
A masked superhero stands against a dramatic sky, their gaze fixed upwards. The lighting and composition create a powerful and inspiring image, capturing the hero’s unwavering hope and determination.
Prompt
facial-expressions Thoughtfulness: Determined, resolute ; A superhero looking up at the sky, a determined expression on their face; eye-level; Hero; a dramatic sky with dark clouds gathering; cinematic
Characteristic
Shot : A superhero with a mask looking upwards at the sky. The sky is a dramatic backdrop of clouds.
Aesthetic Score : 0.6
Mood : hopeful, determined, heroic
Quality
Entropy : 6.88
Noise : 65
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry and the colours are a little too saturated.
Conclusion
The results show that the model performed well at understanding the scene and its aesthetics but struggled with camera positioning. Here’s a breakdown:
- Camera Position: The model scored 0.1, indicating a significant difference between the camera position requested in the prompt and the camera position in the generated image. The model follows camera position instructions poorly.
- Shot Analysis: The model scored 0.45, which is considered good. The model understood the scene described in the prompt and created an image that reflects it fairly well.
- Aesthetic Analysis: The model scored 0.095, which is considered very good. The generated images closely match the aesthetic style described in the prompt.
Overall: The model demonstrates a strong ability to understand the scene and create aesthetically pleasing images, but needs improvement in accurately interpreting camera position instructions.
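The per-image figures listed in the gallery can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the ten images above. It does not reproduce the Camera Position, Shot Analysis, or Aesthetic Analysis scores in the breakdown, whose calculation the post does not describe; the variable names are illustrative.

```python
from statistics import mean

# (aesthetic score, prompt CLIP score, likelihood of AI), in gallery order.
results = [
    (0.60, 0.31, 0.10),  # man on a park bench
    (0.75, 0.29, 0.20),  # Superman on a rooftop
    (0.60, 0.36, 0.10),  # woman reading on a train
    (0.60, 0.31, 0.10),  # gamer at a desk
    (0.70, 0.28, 0.00),  # man on a beach
    (0.70, 0.34, 0.10),  # firefighter
    (0.70, 0.28, 0.20),  # friends around a candlelit table
    (0.60, 0.33, 0.10),  # gamer close-up
    (0.70, 0.34, 0.20),  # woman writing in a park
    (0.60, 0.29, 0.30),  # masked superhero
]

# Transpose the rows into one tuple per metric, then average each metric.
aesthetic, clip, ai_likelihood = zip(*results)
summary = {
    "aesthetic": round(mean(aesthetic), 3),
    "clip": round(mean(clip), 3),
    "ai_likelihood": round(mean(ai_likelihood), 3),
}
print(summary)
```

For the ten images in this post the averages come to 0.655 for the aesthetic score, 0.313 for the prompt CLIP score, and 0.14 for the likelihood of AI.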
Sources:
- https://dramaresource.com/storytelling/
- https://seedsoftellers.eu/resources/the-body-language-for-young-tellers/
- https://digitalcollections.sit.edu/cgi/viewcontent.cgi?article=1288&context=sandanona&filename=1&type=additional
- https://citeseerx.ist.psu.edu/document?doi=7f842882e9bb1fa2c0e96939bc8d2c37e34e17c0&repid=rep1&type=pdf
- https://www.twinkl.co.uk/search?q=drama+facial+expression
- https://deepmind.google/technologies/imagen-3/