AI Captures Pride's Spirit, But Struggles with Camera Angles (Imagen-v2)
The world of AI is constantly evolving, and its ability to generate realistic and evocative images is rapidly advancing. This blog post examines a recent experiment where an AI model was tasked with creating images of Pride celebrations, focusing on the model’s ability to capture the essence of these events through various camera angles and aesthetic choices. We’ll explore the model’s strengths and weaknesses, highlighting its success in capturing the vibrant spirit of Pride while also revealing its limitations in accurately portraying camera perspectives. This analysis provides valuable insights into the current state of AI image generation and its potential for future applications in visual storytelling.
Created with: imagen-v2
Afro-Powered Joy: A Celebration in Confetti
A close-up portrait captures the radiant smile of a person with an afro, adorned with a rainbow sash, amidst a jubilant crowd. Confetti dances in the air, amplifying the joyous atmosphere and celebrating a moment of empowerment.
Prompt
facial-expressions Pride: Joyful, confident, celebratory ; A single person; eye-level; Single Persons; A bustling Pride parade with rainbow flags and confetti; cinematic
Characteristic
Shot : A person with an afro is smiling at the camera with confetti falling down around them. A rainbow flag is visible behind them.
Aesthetic Score : 0.8
Mood : joyful, celebratory, hopeful
Quality
Entropy : 6.73
Noise : 71
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : The confetti and other details in the background appear to be slightly blurry.
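The prompt above, like every prompt in this post, appears to follow one semicolon-separated pattern: a theme with a list of moods, then subject, camera angle, category, setting, and style. The Python sketch below splits a prompt into those parts. The field names and the PromptParts and parse_prompt helpers are assumptions made for illustration; the post does not document how the prompts were built.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    """Hypothetical field names; the post itself does not label the parts."""
    theme: str
    moods: List[str]
    subject: str
    camera_angle: str
    category: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six assumed fields."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, subject, camera_angle, category, setting, style = fields
    # The first field looks like "<theme>: <mood>, <mood>, <mood>".
    theme, _, mood_text = head.partition(":")
    moods = [m.strip().lower() for m in mood_text.split(",") if m.strip()]
    return PromptParts(theme.strip(), moods, subject, camera_angle,
                       category, setting, style)


if __name__ == "__main__":
    parts = parse_prompt(
        "facial-expressions Pride: Joyful, confident, celebratory ; "
        "A single person; eye-level; Single Persons; "
        "A bustling Pride parade with rainbow flags and confetti; cinematic"
    )
    print(parts.camera_angle, parts.moods)
```

Every prompt in this post requests the same camera angle, eye-level, so that is the value the parser returns in the camera_angle field each time.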
Rainbow Warrior: A Portrait of Hope and Pride
This powerful image captures a woman’s unwavering spirit, her face painted with the colors of the rainbow, gazing upwards with a mix of pride and hope. The vibrant flag behind her, dominating the frame, symbolizes a collective celebration of diversity and acceptance.
Prompt
facial-expressions Pride: Empowered, defiant, hopeful ; A person holding a rainbow flag high; eye-level; Single Persons; A crowd of people at a Pride rally; cinematic
Characteristic
Shot : A woman with long blonde hair is holding a rainbow pride flag, looking up at the sky with a determined expression. There is a suggestion of a hand holding the flag with her, but the focus is on the woman’s face.
Aesthetic Score : 0.7
Mood : hopeful, strong, proud
Quality
Entropy : 6.75
Noise : 73
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors. The image appears to be well-exposed and sharp.
Superman Stands for Hope, Even in the Face of Adversity
A close-up portrait of Superman, bathed in dramatic lighting, captures his unwavering resolve. The rainbow flag in the background symbolizes inclusivity and hope, adding a powerful layer to this intense and heroic image.
Prompt
facial-expressions Pride: Powerful, inspiring, hopeful ; A superhero in a rainbow costume; eye-level; Heroes; A cityscape with a Pride flag flying in the background; cinematic
Characteristic
Shot : A close-up portrait of Superman with a rainbow flag draped behind him. The background appears to be a city skyline with a blurred, distant view.
Aesthetic Score : 0.6
Mood : intense, determined, hopeful
Quality
Entropy : 6.71
Noise : 78
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : Some artifacts are noticeable in the image, particularly around the edges of the subject’s hair, the flag, and the background. These may be due to over-processing, resulting in a slightly unnatural look.
Rainbow Lights and Smiles: Capturing the Energy of a Nightclub
A man with a beaming smile stands out against a backdrop of vibrant rainbow laser lights, capturing the energetic and exciting atmosphere of a bustling nightclub.
Prompt
facial-expressions Pride: Joyful, carefree, celebratory ; A group of people dancing in a club; eye-level; Normal People; A brightly lit dance floor with rainbow lights; cinematic
Characteristic
Shot : A group of people are in a nightclub or concert venue. They are looking up at a rainbow light show. The man in the foreground is singing along.
Aesthetic Score : 0.7
Mood : energetic, celebratory, euphoric
Quality
Entropy : 6.62
Noise : 95
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some blurring and artifacts around the edges of the people in the background.
Love and Pride on Display in a Suburban Street
Two men stroll hand-in-hand down a suburban street, a rainbow flag waving proudly in the background. The scene radiates happiness, hope, and romance, capturing a moment of love and acceptance.
Prompt
facial-expressions Pride: Loving, peaceful, accepting ; A couple holding hands and walking down the street; eye-level; Normal People; A quiet, residential street with rainbow flags on display; cinematic
Characteristic
Shot : Two men are walking down a street, holding hands. There’s a rainbow flag in the background and they are both wearing sunglasses.
Aesthetic Score : 0.6
Mood : optimistic, hopeful, romantic
Quality
Entropy : 6.58
Noise : 73
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, especially in the sky. There are no major artifacts or errors.
Gamer Focus: A Close-Up on Intensity
A young man with blue hair, headphones on, and a controller in hand, is locked in a world of digital competition. The vibrant background and intense expression capture the focused energy of a gamer in the zone.
Prompt
facial-expressions Pride: Fun, playful, inclusive ; A gamer playing a video game with rainbow-themed characters; eye-level; Gamer; A brightly lit gaming room with posters of LGBTQ+ characters; cinematic
Characteristic
Shot : A young man with blue hair and headphones is playing a video game. He has a surprised look on his face. The background is a blurry wall with a rainbow flag.
Aesthetic Score : 0.7
Mood : intense, focused, surprised
Quality
Entropy : 6.42
Noise : 94
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly overexposed.
A Determined Gaze, A Powerful Message
A woman stands with unwavering focus, her cardboard sign emblazoned with a rainbow circle, demanding attention. The blurred background and the presence of another woman in the distance add to the sense of urgency and the importance of her message.
Prompt
facial-expressions Pride: Determined, hopeful, powerful ; A person holding a sign with a message of acceptance; eye-level; Single Persons; A crowd of people at a Pride protest; cinematic
Characteristic
Shot : A woman holding a sign with a rainbow circle in front of her face, standing in a crowd of people.
Aesthetic Score : 0.6
Mood : powerful, determined, hopeful
Quality
Entropy : 6.85
Noise : 86
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and the colors are a bit muted.
Pride Celebration: Joy and Unity in the Air
A vibrant scene of celebration captures the spirit of Pride, with a joyous group of people, mostly women, radiating happiness and unity. The woman in the center, looking directly at the camera, invites you to share in the festive atmosphere.
Prompt
facial-expressions Pride: Joyful, celebratory, inclusive ; A group of friends celebrating at a Pride party; eye-level; Normal People; A brightly decorated room with rainbow decorations; cinematic
Characteristic
Shot : A group of people are celebrating in a festive atmosphere, possibly at a Pride parade. The background is decorated with rainbow flags.
Aesthetic Score : 0.6
Mood : joyful, celebratory, vibrant
Quality
Entropy : 6.52
Noise : 46
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor artifacts and noise, particularly in the background. The colors appear slightly oversaturated.
Hopeful Gaze: A Young Man Finds Pride in the Rainbow
A young man with curly hair, wearing a brown jacket and a silver chain, stands amidst a crowd, his gaze fixed on a vibrant rainbow flag. The shallow depth of field emphasizes his hopeful expression, capturing a moment of celebration and pride.
Prompt
facial-expressions Pride: Awe, inspiration, hope ; A person looking out at a Pride parade with a sense of wonder; eye-level; Single Persons; A vibrant parade with colorful floats and music; cinematic
Characteristic
Shot : A young man with curly hair and a gold earring is looking up at a rainbow flag while standing in a crowd. The flag is in the background and slightly out of focus.
Aesthetic Score : 0.7
Mood : hopeful, optimistic, proud
Quality
Entropy : 6.70
Noise : 75
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
Purple Haze: A Portrait of Intensity
A close-up portrait captures the piercing gaze of a young person with vibrant purple hair, their eyes accentuated by dramatic lighting. The headphones and colorful jacket add an edgy, mysterious vibe, leaving you wondering what story lies behind this intense stare.
Prompt
facial-expressions Pride: Creative, playful, inclusive ; A gamer creating a rainbow-themed character in a video game; eye-level; Gamer; A computer screen with a character creation menu; cinematic
Characteristic
Shot : Close-up portrait of a young person with purple hair and headphones, looking intensely at the camera.
Aesthetic Score : 0.6
Mood : intense, futuristic, rebellious
Quality
Entropy : 6.53
Noise : 52
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The hair appears slightly artificial and the skin texture is somewhat overly smooth.
Conclusion
The results show that the generative AI model performed well at understanding the scene and matching the intended aesthetic, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.665, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. This means that the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall, the model demonstrates a good understanding of the scene and its aesthetic, but needs improvement in accurately capturing the intended camera position.
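To put the per-image numbers side by side, here is a minimal Python sketch that averages the aesthetic score, prompt CLIP score, and AI-likelihood values listed above and flags images whose AI likelihood is 0.5 or higher. The summarize helper and the 0.5 threshold are assumptions for illustration, and these averages are not the camera position, shot, or aesthetic analysis scores quoted in the breakdown, whose calculation the post does not describe.

```python
from statistics import mean

# Per-image values copied from the sections above, in order of appearance:
# (short title, aesthetic score, prompt CLIP score, likelihood of AI)
RESULTS = [
    ("Afro-Powered Joy", 0.8, 0.21, 0.30),
    ("Rainbow Warrior", 0.7, 0.22, 0.10),
    ("Superman Stands for Hope", 0.6, 0.27, 0.90),
    ("Rainbow Lights and Smiles", 0.7, 0.27, 0.10),
    ("Love and Pride on Display", 0.6, 0.25, 0.20),
    ("Gamer Focus", 0.7, 0.29, 0.80),
    ("A Determined Gaze", 0.6, 0.23, 0.10),
    ("Pride Celebration", 0.6, 0.27, 0.30),
    ("Hopeful Gaze", 0.7, 0.20, 0.20),
    ("Purple Haze", 0.6, 0.27, 0.80),
]


def summarize(results, ai_threshold=0.5):
    """Average each metric and list images at or above the AI threshold.

    ai_threshold is an illustrative cut-off, not a value from the post.
    """
    return {
        "mean_aesthetic": round(mean(r[1] for r in results), 3),
        "mean_clip": round(mean(r[2] for r in results), 3),
        "mean_ai_likelihood": round(mean(r[3] for r in results), 3),
        "flagged": [r[0] for r in results if r[3] >= ai_threshold],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

With the values above, the three images flagged at the assumed 0.5 threshold are the Superman portrait and the two gamer portraits.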