AI's Artistic Eye: Capturing the Moment, But Missing the Mood with Imagen-v2
AI image generation is evolving rapidly, and models can now create striking visuals from text prompts. Getting the generated image to match the prompt’s vision is still a challenge. This blog post examines an experiment that tested an AI model’s ability to capture specific poses, camera angles, and aesthetics across ten prompts. The model matched the requested shot composition fairly well and the camera position only moderately, and it struggled most to replicate the desired mood and aesthetic. The sections below list the prompt and the scores for each image, and the conclusion summarizes what they suggest about the current state of AI image generation.
Created with: imagen-v2
A Knight’s Vigil: Stormy Skies and a City’s Fate
A lone knight stands defiant on a windswept hilltop, his silhouette stark against the backdrop of a raging storm. The medieval city below seems to shrink in the face of the tempest, creating a sense of impending doom and epic drama.
Prompt
poses dutch-angle: determined, heroic, hopeful ; A lone knight, standing tall on a hilltop overlooking a besieged city; wide shot; heroism; a dramatic, stormy sky with flashes of lightning; cinematic
Characteristic
Shot : A knight in shining armor stands on a hill overlooking a medieval city. The sky is filled with dramatic lightning strikes.
Aesthetic Score : 0.6
Mood : epic, dramatic, brooding
Quality
Entropy : 6.71
Noise : 91
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The lightning strikes look a bit artificial and the city’s details are blurry and lack sharpness. There are some visible seams in the composite image, particularly around the knight’s feet.
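The post does not say how the Entropy value under Quality is computed. The values reported for the ten images (5.83 to 6.80) are consistent with the Shannon entropy of an 8-bit grayscale histogram, which tops out at 8 bits, so the sketch below assumes that definition. The function name and the synthetic arrays are illustrative and not part of the original experiment.

```python
import numpy as np


def shannon_entropy(gray: np.ndarray) -> float:
    """Shannon entropy, in bits, of an 8-bit grayscale image's histogram.

    Assumption: this is one common definition of image entropy; the post
    does not state which definition its Entropy metric uses.
    """
    if gray.dtype != np.uint8:
        raise TypeError("expected an 8-bit (uint8) grayscale array")
    counts = np.bincount(gray.ravel(), minlength=256)
    # Drop empty bins: a gray level that never occurs contributes nothing.
    probs = counts[counts > 0] / gray.size
    return float((probs * np.log2(1.0 / probs)).sum())


# Synthetic stand-ins for real images (hypothetical data).
flat = np.full((64, 64), 128, dtype=np.uint8)                      # one gray level
ramp = np.arange(256, dtype=np.uint8).repeat(16).reshape(64, 64)  # all levels equally often

print(shannon_entropy(flat))  # 0.0: a single gray level carries no information
print(shannon_entropy(ramp))  # 8.0: the maximum for 256 equally likely levels
```

To score a real image under this assumption, you would first load it and convert it to 8-bit grayscale with an imaging library of your choice.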
Golden Hour Adventure: A Tranquil Path Through Lush Greens
As the sun dips below the horizon, casting a warm glow on the scene, three figures walk along a dirt path through vibrant greenery. The path leads towards a distant valley, promising adventure and breathtaking views. This tranquil moment captures the essence of serenity and exploration.
Prompt
poses dutch-angle: adventurous, mysterious, awe-inspiring ; A group of explorers, silhouetted against the setting sun, standing at the edge of a vast, unexplored jungle; medium shot; adventure; lush green foliage and towering trees; cinematic
Characteristic
Shot : Three people hiking through a lush green forest, looking out at a valley below. The sun is shining in the distance, casting a warm glow over the scene.
Aesthetic Score : 0.5
Mood : tranquil, adventurous, hopeful
Quality
Entropy : 6.64
Noise : 122
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some noise in the image, particularly in the shadows.
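Every prompt in this post follows the same semicolon-separated template: a technique with a list of moods, then a subject, a shot type, a theme, a setting, and a style. The sketch below splits a prompt into those six parts. The field names are labels chosen here for illustration; the experiment itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels for the six template positions.
    technique: str
    moods: list[str]
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form 'technique: moods ; subject; shot; theme; setting; style'."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    head, subject, shot, theme, setting, style = fields
    # The first field carries the technique before the colon and the moods after it.
    technique, _, mood_text = head.partition(":")
    moods = [mood.strip() for mood in mood_text.split(",") if mood.strip()]
    return PromptParts(technique.strip(), moods, subject, shot, theme, setting, style)


knight = parse_prompt(
    "poses dutch-angle: determined, heroic, hopeful ; A lone knight, standing tall "
    "on a hilltop overlooking a besieged city; wide shot; heroism; a dramatic, "
    "stormy sky with flashes of lightning; cinematic"
)
print(knight.shot, "|", knight.moods)
```

Splitting the prompt this way makes it easier to compare the requested shot type and moods against the Shot and Mood values reported for each image.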
Focused Gamer Immersed in a Fantasy World
A young man sits at his desk, headphones on, eyes fixed on the screen. The low angle shot and dramatic lighting emphasize his intense focus as he navigates a virtual world. A fantasy painting on the wall behind him adds a touch of whimsy to the scene, hinting at the immersive nature of his gaming experience.
Prompt
poses dutch-angle: intense, focused, competitive ; A gamer, intensely focused on a screen, fingers flying across a keyboard; close-up; gaming; a brightly lit room with gaming peripherals and posters; cinematic
Characteristic
Shot : A young man is playing video games in a dimly lit room. He’s wearing headphones, and his hand is on the keyboard. There is a painting on the wall behind him.
Aesthetic Score : 0.6
Mood : focused, intense, gamer
Quality
Entropy : 5.83
Noise : 80
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, mostly around the edges of the subject. The subject’s face appears to have some noise on it.
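The post also does not describe how the Prompt Clip Score is computed. A common definition of a CLIP-style score is the cosine similarity between a text embedding of the prompt and an image embedding of the result, and the sketch below assumes that definition. The vectors are toy values standing in for real embeddings, which would come from a CLIP model that is not shown here.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 means orthogonal."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical 4-dimensional embeddings; real CLIP embeddings have hundreds of dimensions.
text_embedding = [0.2, 0.1, 0.9, 0.3]
image_embedding = [0.1, 0.8, 0.3, 0.4]

score = cosine_similarity(text_embedding, image_embedding)
print(round(score, 2))
```

Because cosine similarity ignores vector length, only the direction of the two embeddings matters. If the metric is defined this way, the narrow range reported in this post (0.23 to 0.29) means the ten images differ only slightly in how closely they match their prompts.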
Parisian Romance: A Moment Captured
A couple shares a quiet moment in a Parisian cafe, their gaze drawn to the iconic Eiffel Tower. The photo captures the intimacy and nostalgia of the scene, with the tower adding a touch of grandeur and scale.
Prompt
poses dutch-angle: romantic, nostalgic, joyful ; A couple, hand-in-hand, gazing out at the Eiffel Tower from a Parisian cafe; medium shot; tourism; bustling Parisian streets with charming cafes and shops; cinematic
Characteristic
Shot : A couple sits at a table in a cafe, looking out at the Eiffel Tower.
Aesthetic Score : 0.6
Mood : romantic, dreamy, nostalgic
Quality
Entropy : 6.45
Noise : 89
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some slight blurring and noise in the image, especially around the edges.
Contemplating the Peaks: A Selfie with a View
An adventurous soul captures the breathtaking panorama of snow-capped mountains on a winding trail. The perspective emphasizes the vastness of the landscape, creating a sense of awe and wonder.
Prompt
poses dutch-angle: free-spirited, adventurous, inspiring ; A backpacker, walking along a winding mountain path, with breathtaking views of snow-capped peaks; medium shot; travel; a rugged mountain landscape with clear blue skies; cinematic
Characteristic
Shot : A man is standing on a mountain path, looking at the camera, with a mountain range in the background.
Aesthetic Score : 0.6
Mood : adventurous, inspiring, rugged
Quality
Entropy : 6.75
Noise : 98
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, such as a few pixels that are out of place and some banding in the sky. The image also has a slight overexposure.
Intimate Toast: A Celebration Captured in Low Light
A group of friends raise their glasses in a dimly lit room, their hands and the clinking glasses the focal point. The low light creates an intimate atmosphere, capturing the festive and celebratory mood of the moment.
Prompt
poses dutch-angle: joyful, celebratory, connected ; A group of friends, laughing and celebrating, raising their glasses in a toast; medium shot; groups; a lively bar or restaurant with warm lighting and festive decorations; cinematic
Characteristic
Shot : A group of friends is toasting with wine glasses in a dimly lit bar or restaurant. The composition is dynamic and the warm lighting creates a cozy atmosphere.
Aesthetic Score : 0.6
Mood : celebratory, warm, friendly
Quality
Entropy : 6.61
Noise : 120
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor noise and blur, especially in the background. There are some minor artifacts around the edges of the image.
A Moment of Solitude in the Vastness of Space
An astronaut, clad in a spacesuit, gazes out a window at the distant Earth and moon. The image evokes a sense of wonder, isolation, and contemplation, capturing the profound beauty and loneliness of space exploration.
Prompt
poses dutch-angle: awe-inspiring, contemplative, hopeful ; A lone astronaut, gazing out at the Earth from a space station window; close-up; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : An astronaut looks out of a spacecraft window at a planet in the distance.
Aesthetic Score : 0.7
Mood : lonely, contemplative, awe
Quality
Entropy : 6.06
Noise : 91
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The astronaut’s face is blurred and lacks detail. The planet is also lacking in detail.
Conquering the Waterfall: A Daring Descent into the Mountains
Two adventurers rappel down a sheer cliff face, braving the cascading waterfall and the breathtaking grandeur of the surrounding mountain range. This awe-inspiring scene captures the thrill and danger of their daring expedition.
Prompt
poses dutch-angle: exciting, daring, adventurous ; A group of adventurers, rappelling down a steep cliff face, with a breathtaking view of a valley below; wide shot; adventure; a dramatic mountain landscape with waterfalls and lush vegetation; cinematic
Characteristic
Shot : Two people rappelling down a waterfall in a mountainous region
Aesthetic Score : 0.7
Mood : adventurous, dramatic, daring
Quality
Entropy : 6.80
Noise : 116
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some artifacts and noise, particularly in the background. The composition is slightly awkward and the subject could be more centered. There is also some color banding in the sky.
Triumphant Moment Captured: Young Man’s Joyful Victory
A young man, radiating excitement and triumph, holds aloft a trophy against a backdrop of swirling blue smoke and rugged red-brown rocks. The dramatic lighting and his shocked expression capture the intensity of his victory.
Prompt
poses dutch-angle: triumphant, celebratory, exciting ; A gamer, celebrating a victory, holding up a trophy; close-up; gaming; a brightly lit stage with cheering crowds and flashing lights; cinematic
Characteristic
Shot : A man holding a trophy, smiling and looking at the camera, on a stage with a lot of glowing red objects in the background. There is smoke and blue light in the background and there is a large screen with a scene of what looks like a frozen landscape.
Aesthetic Score : 0.6
Mood : excited, triumphant, energetic
Quality
Entropy : 6.47
Noise : 107
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some noise and grain in the image. The image seems compressed and slightly blurry.
Sunset Romance on the Beach
A couple embraces on a sandy shore, their silhouettes bathed in the warm glow of a setting sun. The scene evokes a sense of romantic intimacy and peaceful serenity.
Prompt
poses dutch-angle: peaceful, heartwarming, nostalgic ; A family, standing on a beach, watching the sunset over the ocean; medium shot; travel; a serene beach with golden sand and turquoise waters; cinematic
Characteristic
Shot : A couple standing on a beach at sunset, looking out at the ocean. The man has his arm around the woman, and they are both smiling.
Aesthetic Score : 0.7
Mood : romantic, serene, peaceful
Quality
Entropy : 6.56
Noise : 96
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have a slight color cast, which may be due to the filter used in post-processing. There is also a bit of noise in the image, which is likely due to the low light conditions in which it was taken.
Conclusion
The results show that the generative AI model handled shot composition best, matched the requested camera position only moderately, and struggled most with aesthetics.
Here’s a breakdown:
- Camera Position: The model scored 0.45, which is considered okay. The camera position in the generated images was somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.59, which is considered good. The shot composition in the generated images was fairly close to what the prompts described.
- Aesthetic Analysis: The model scored 0.13, the lowest of the three scores, although it is still considered okay. The aesthetic of the generated images was somewhat different from the aesthetic the prompts implied.
Overall, the model seems to be better at understanding and implementing shot composition than camera position or aesthetics.
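The three scores in the conclusion come from an evaluation the post does not describe, so they cannot be recomputed here. The per-image numbers listed in each section can be summarized directly, though. The sketch below copies those values into a small table and averages them. The short labels are shorthand for the section titles, and the variable names are illustrative.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, clip, ai_likelihood), copied from the
# per-image sections of this post. Labels are shorthand for the section titles.
IMAGES = [
    ("knight",     0.6, 6.71,  91, 0.28, 0.80),
    ("explorers",  0.5, 6.64, 122, 0.28, 0.20),
    ("gamer",      0.6, 5.83,  80, 0.27, 0.20),
    ("paris",      0.6, 6.45,  89, 0.29, 0.20),
    ("backpacker", 0.6, 6.75,  98, 0.26, 0.10),
    ("toast",      0.6, 6.61, 120, 0.23, 0.10),
    ("astronaut",  0.7, 6.06,  91, 0.28, 0.80),
    ("waterfall",  0.7, 6.80, 116, 0.25, 0.20),
    ("trophy",     0.6, 6.47, 107, 0.25, 0.10),
    ("beach",      0.7, 6.56,  96, 0.24, 0.20),
]
FIELDS = ("aesthetic", "entropy", "noise", "clip", "ai_likelihood")


def column(name: str) -> list:
    """Return one metric across all images, in listing order."""
    idx = FIELDS.index(name) + 1  # +1 skips the label in position 0
    return [row[idx] for row in IMAGES]


summary = {name: mean(column(name)) for name in FIELDS}
best_clip = max(IMAGES, key=lambda row: row[4])[0]
worst_clip = min(IMAGES, key=lambda row: row[4])[0]

for name, value in summary.items():
    print(f"{name:>14}: {value:.3f}")
print(f"highest CLIP score: {best_clip}, lowest: {worst_clip}")
```

Across the ten images, the averages come out to roughly 0.62 for the aesthetic score and 0.26 for the prompt CLIP score. The Parisian cafe image has the highest CLIP score (0.29) and the toast image the lowest (0.23).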
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-2/