AI Captures the Essence, But Misses the Shot with Imagen-v2
The ‘dramatic’ aesthetic style is characterized by strong contrasts, striking lighting, and evocative imagery. It’s often used in film, photography, and visual art to create tension, suspense, or emotional impact, and it is particularly effective at conveying powerful emotions and a sense of depth and intrigue. For example, a lone figure silhouetted against a setting sun, framed in a long shot that emphasizes the vastness of the landscape, is a classic instance of the dramatic aesthetic. Such images often suggest heroism, isolation, or the weight of the world on the character’s shoulders.
Created with: imagen-v2
Silhouette of Solitude: A Sunset Mystery
A lone figure stands silhouetted against a vibrant sunset, casting a sense of tranquility and mystery over the scene. The dramatic effect of the silhouette against the fiery sky evokes feelings of contemplation and isolation, leaving the viewer to ponder the story behind this moment.
Prompt
Dogme 95: Epic, hopeful ; A lone figure, silhouetted against a setting sun; long shot; Heroism; A vast, desolate landscape; cinematic
Characteristic
Shot : A lone figure stands in the middle of a vast, white salt flat, silhouetted against a bright sunset. The figure is small and distant, conveying a sense of solitude and vastness.
Aesthetic Score : 0.6
Mood : tranquil, contemplative, vast
Quality
Entropy : 6.19
Noise : 44
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some slight noise and artifacts, particularly in the sky and on the salt flat.
A Hand Reaches for Hope: The Cliffside Challenge
A close-up shot captures the intensity of a hand reaching for a rope ladder against a rugged cliffside. The scene evokes a sense of urgency and suspense, leaving the viewer wondering what lies ahead for the climber.
Prompt
Dogme 95: Suspenseful, thrilling ; A hand reaching out to grasp a rope ladder dangling from a cliff face; close-up; Adventure; A rocky, treacherous mountainside; cinematic
Characteristic
Shot : A close-up shot of a hand reaching out towards a rope ladder attached to a rocky cliff face. The background is a blurry view of a cloudy sky and distant mountains.
Aesthetic Score : 0.6
Mood : tense, adventurous, hopeful
Quality
Entropy : 6.50
Noise : 99
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor artifacts and noise in the image, particularly in the background. The lighting is uneven, with some areas being brighter than others.
Rage Quit: The Red Glow of Frustration
A gamer’s frustration boils over in a dimly lit room, illuminated only by the red glow of the monitor. Their exaggerated expression and the intense atmosphere capture the raw emotion of a gaming rage quit.
Prompt
Dogme 95: Intense, focused ; A player’s hands frantically manipulating a joystick, their face illuminated by the screen; medium shot; Gaming; A dimly lit room with a computer monitor glowing brightly; cinematic
Characteristic
Shot : A young person playing a video game is intensely focused and yelling in frustration at the screen. The scene is dimly lit and the person is wearing headphones. They are sitting in front of a computer and holding a controller in their hands.
Aesthetic Score : 0.6
Mood : intense, focused, frustrated
Quality
Entropy : 5.74
Noise : 107
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some graininess to the image. There are also some artifacts around the edges of the screen, which might be due to over-sharpening or compression.
Lost in the City’s Heart: A Watercolor Dream
Step into a bustling, mysterious world where narrow streets teem with life and vendors hawk their wares. This painterly scene, rendered in a watercolor-like style, evokes a sense of intrigue and wonder, inviting you to explore its hidden corners.
Prompt
Dogme 95: Energetic, lively ; A bustling marketplace, filled with vibrant colors and exotic goods; wide shot; Tourism; A crowded street in a foreign city; cinematic
Characteristic
Shot : A bustling marketplace in a narrow, crowded alleyway with a hint of magical realism.
Aesthetic Score : 0.6
Mood : mysterious, vibrant, mystical
Quality
Entropy : 6.69
Noise : 96
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image is slightly blurry and the colors are somewhat muted.
Tranquility in Motion: A Blurred Train Against Rolling Hills
A solitary train streaks across a field of verdant hills, its blur a testament to speed against the backdrop of a calm, cloudy sky. The scene evokes a sense of tranquility and loneliness, with the train’s movement adding a touch of dynamism.
Prompt
Dogme 95: Nostalgic, contemplative ; A train speeding through a countryside landscape, blurring the scenery; long shot; Travel; Rolling hills and fields passing by; cinematic
Characteristic
Shot : A train traveling across a field with green grass and a grey sky.
Aesthetic Score : 0.6
Mood : peaceful, calm, serene
Quality
Entropy : 6.58
Noise : 89
Prompt Clip Score : 0.36
AI Evaluation
Likelihood of AI : 0.10
Image errors : Minor compression artifacts and noise in the sky. Some blur artifacts in the train.
A Family Meal, Shadows of Emotion
In a rustic kitchen bathed in warm light, a family gathers around a table laden with food. The woman serves a dish to the young man, but the old man’s somber expression casts a shadow over the scene. The interplay of light and shadow, along with the contrasting emotions, creates a poignant and intimate moment.
Prompt
Dogme 95: Warm, intimate ; A family gathered around a dinner table, sharing a meal and laughter; medium shot; Family; A cozy, well-worn kitchen; cinematic
Characteristic
Shot : Three people are sitting at a table, and the woman in the middle is serving food. The scene is set in a rustic kitchen with a window in the background.
Aesthetic Score : 0.6
Mood : warm, cozy, intimate
Quality
Entropy : 6.61
Noise : 110
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
Tears of Sorrow: A Close-Up Portrait of Vulnerability
This intimate image captures the raw emotion of a woman’s tears, drawing the viewer into her moment of sadness and vulnerability. The close-up framing emphasizes the detail of her eye and the tears streaming down her cheek, creating a powerful and evocative visual.
Prompt
Dogme 95: Sad, poignant ; A single tear rolling down; close-up; A dimly lit living room; cinematic
Characteristic
Shot : Close-up shot of a woman’s face, with tears running down her cheek. The focus is on the tear running down her cheek and her sad expression.
Aesthetic Score : 0.7
Mood : sad, melancholic, emotional
Quality
Entropy : 6.19
Noise : 101
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to have some minor compression artifacts around the edges, but they are not very noticeable. The overall image quality is good.
Campfire Camaraderie: Friends Gather Under the Stars
A group of six friends share laughter and warmth around a crackling campfire, bathed in the glow of the flames against the backdrop of a dark forest. The scene evokes a sense of joy, intimacy, and friendship.
Prompt
Dogme 95: Joyful, communal ; A group of friends huddled together around a campfire, sharing stories and laughter; medium shot; Adventure; A dark forest with flickering flames; cinematic
Characteristic
Shot : A group of friends are gathered around a campfire in a forest. They are laughing and talking, enjoying each other’s company. The fire is bright and warm, and the forest is dark and mysterious.
Aesthetic Score : 0.7
Mood : happy, cozy, adventurous
Quality
Entropy : 6.16
Noise : 104
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, particularly in the background, making the forest appear washed out. Some blurring is present around the edges of the subjects, particularly the individual on the left, suggesting minor technical issues.
Solitude Amidst the Storm
A solitary figure finds peace on a windswept cliff, gazing out at a turbulent sea under a stormy sky. The scene evokes a sense of melancholy and introspection, highlighting the contrast between the figure’s calm and the dramatic forces of nature.
Prompt
Dogme 95: Awe-inspiring, contemplative ; A lone traveler gazing out at a vast ocean, their face filled with wonder; long shot; Travel; A dramatic coastline with crashing waves; cinematic
Characteristic
Shot : A lone figure sits on a grassy cliff overlooking a stormy sea. The waves are crashing against the rocks below, and the sky is overcast.
Aesthetic Score : 0.75
Mood : melancholy, contemplative, dramatic
Quality
Entropy : 6.32
Noise : 69
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors are present. The image appears to have been well-edited, with some subtle color grading.
A Glimpse into the Past: Faded Memories of Childhood
A weathered hand cradles a vintage photograph, capturing a moment frozen in time. Three children stand in a sun-drenched field, their smiles tinged with the bittersweet nostalgia of youth. The faded image whispers of days gone by, leaving a lingering sense of sentimentality and the passage of time.
Prompt
Dogme 95: Melancholy, nostalgic ; A hand holding a worn photograph, the image blurred and faded; close-up; Family; A cluttered attic filled with old memories; cinematic
Characteristic
Shot : A hand holds an old black and white photograph showing three children, two girls and a boy, standing outdoors. The photo is faded and has a vintage aesthetic.
Aesthetic Score : 0.6
Mood : nostalgic, sentimental, wistful
Quality
Entropy : 5.94
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur around the edges, indicating it might be a scan of an old photograph, and the photo itself has faded areas which could be caused by age or exposure.
Conclusion
The results show that the generative AI model captured the aesthetic style well, but struggled with camera position and performed only moderately on shot analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera positions described in the prompt.
- Shot Analysis: The model scored 0.525, which is considered average. This indicates that the model was able to understand the scene in the prompt to a reasonable degree, but not exceptionally well.
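- Aesthetic Analysis: The model scored 0.14, which is considered very good. Unlike the two scores above, a lower value appears to be better here, although the scale is not stated. This means that the generated images closely matched the expected aesthetic style described in the prompts.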
Overall, the model seems to be better at understanding the aesthetic style of the prompt than it is at accurately capturing the camera position and shot composition.
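The per-image metrics listed for each picture can also be summarized across the whole set. The following is a minimal Python sketch that averages them; the values are transcribed from the entries above, while the short labels, the tuple layout, and the function names are illustrative assumptions and not part of the original evaluation. It does not reproduce the three conclusion scores, whose calculation is not described in this post.

```python
from statistics import mean

# Values transcribed from the image entries above, in order of appearance.
# Labels and tuple layout are illustrative assumptions, not the post's own format.
# (label, aesthetic score, entropy, noise, prompt CLIP score, likelihood of AI)
IMAGES = [
    ("sunset silhouette", 0.60, 6.19, 44, 0.30, 0.30),
    ("cliffside hand", 0.60, 6.50, 99, 0.34, 0.10),
    ("rage quit", 0.60, 5.74, 107, 0.30, 0.20),
    ("city marketplace", 0.60, 6.69, 96, 0.31, 0.90),
    ("blurred train", 0.60, 6.58, 89, 0.36, 0.10),
    ("family meal", 0.60, 6.61, 110, 0.31, 0.10),
    ("tears close-up", 0.70, 6.19, 101, 0.28, 0.30),
    ("campfire", 0.70, 6.16, 104, 0.34, 0.10),
    ("stormy cliff", 0.75, 6.32, 69, 0.32, 0.10),
    ("faded photograph", 0.60, 5.94, 76, 0.31, 0.20),
]

FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def summarize(images):
    """Return the arithmetic mean of each metric across all images."""
    # Drop the label, then transpose rows into one column per metric.
    columns = zip(*(row[1:] for row in images))
    return {name: mean(values) for name, values in zip(FIELDS, columns)}


def most_ai_like(images):
    """Return the row with the highest 'Likelihood of AI' value."""
    return max(images, key=lambda row: row[5])


if __name__ == "__main__":
    for name, value in summarize(IMAGES).items():
        print(f"mean {name}: {value:.3f}")
    label, *_, ai = most_ai_like(IMAGES)
    print(f"highest AI likelihood: {label} ({ai:.2f})")
```

Across the ten images, the averages come out to roughly 0.64 for the aesthetic score, 0.32 for the prompt CLIP score, and 0.24 for the likelihood of AI. The watercolor marketplace image is the clear outlier on that last measure at 0.90.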
Sources:
- https://heartofnoir.com/knowing-noir/aesthetic-of-noir/
- https://www.yellowbrick.co/blog/film/maximizing-the-visual-impact-unveiling-the-art-of-film-aesthetics
- https://www.questjournals.org/jrhss/papers/vol10-issue8/1008255260.pdf
- https://www.jstor.org/stable/3331672
- https://www.cinepoetics.fu-berlin.de/activities/workshops/2020-12-ws/index.html
- https://resource.download.wjec.co.uk/vtc/2016-17/16-17_1-22/eng/Part%201%20What%20is%20Aesthetics.pdf
- https://deepmind.google/technologies/imagen-2/