AI Captures the Pose, But Misses the Feeling with Imagen-v2

AI's Struggle with Aesthetic in Image Generation with Imagen-v2


The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals based on text prompts. However, achieving the desired aesthetic remains a challenge. This blog post delves into an experiment that explores the capabilities of a generative AI model in capturing specific poses and scenes, highlighting its strengths and weaknesses in achieving the desired visual style.

Created with: imagen-v2

Silhouetted Against the Sunset, a Figure Contemplates the Vastness

A lone figure, cloaked in darkness, stands with their back to the viewer, gazing out at a majestic mountain range bathed in the warm glow of the setting sun. The silhouette against the fiery sky evokes a sense of isolation and contemplation, leaving the viewer to ponder the mysteries of the scene.

Prompt

poses profile: Epic, hopeful, determined ; A lone figure, silhouetted against a setting sun; wide shot; Heroism; A vast, mountainous landscape; cinematic
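Every prompt in this post follows the same semicolon-separated pattern. As a rough sketch, the pattern can be split into named parts. The field names below (moods, subject, shot, category, setting, style) are assumptions chosen for illustration; the post itself does not label the fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PosePrompt:
    # Field names are hypothetical labels; the post does not name them.
    moods: List[str]
    subject: str
    shot: str
    category: str
    setting: str
    style: str


def parse_pose_prompt(prompt: str) -> PosePrompt:
    """Split a 'poses profile' prompt into its six semicolon-separated parts."""
    prefix = "poses profile:"
    if not prompt.startswith(prefix):
        raise ValueError("expected the prompt to start with 'poses profile:'")
    fields = [part.strip() for part in prompt[len(prefix):].split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    moods_raw, subject, shot, category, setting, style = fields
    moods = [mood.strip() for mood in moods_raw.split(",")]
    return PosePrompt(moods, subject, shot, category, setting, style)


example = parse_pose_prompt(
    "poses profile: Epic, hopeful, determined ; A lone figure, silhouetted "
    "against a setting sun; wide shot; Heroism; A vast, mountainous landscape; "
    "cinematic"
)
print(example.moods, "|", example.shot, "|", example.style)
```

Splitting the prompt this way makes it easier to compare the requested shot and moods against the Shot and Mood values reported for each image.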

Characteristic

Shot : A man in a rugged cloak stands with his back to the camera, looking out at a vast landscape of mountains and a sunset. His face is partially shadowed, creating a sense of mystery.

Aesthetic Score : 0.8

Mood : dramatic, introspective, solitary

Quality

Entropy : 6.80

Noise : 83

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.20

Image errors : No obvious image errors
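The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely). Under that assumption, a value such as 6.80 would point to a fairly wide tonal spread. The sketch below works on a plain list of pixel values, so it needs no image library.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of gray-level values.

    Assumption: this mirrors the 'Entropy' metric in the post, which
    does not document its own formula.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # H = sum over levels of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


flat_image = [128] * 64            # one gray level only
full_range = list(range(256))      # every 8-bit level exactly once

print(shannon_entropy(flat_image))   # lowest possible value
print(shannon_entropy(full_range))   # highest possible value for 8-bit data
```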

A Hiker’s Perspective: Where Nature’s Majesty Meets Serenity

A lone hiker stands on a cliff, dwarfed by the breathtaking beauty of a valley with two cascading waterfalls. The overcast sky adds a dramatic touch, while the scene evokes a sense of adventure and tranquility.

Prompt

poses profile: Adventurous, free-spirited, awe-inspired ; A backpacker standing on a cliff edge, looking out at a breathtaking view; medium shot; Adventure; A sprawling valley with cascading waterfalls; cinematic

Characteristic

Shot : A man is standing on a cliff overlooking a valley with two waterfalls. The sky is overcast, but the light is still bright. The man is wearing a backpack and is looking out at the view.

Aesthetic Score : 0.8

Mood : epic, adventurous, contemplative

Quality

Entropy : 6.71

Noise : 105

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.20

Image errors : No visible errors

The Intensity of Play: A Close-Up on Focused Hands

A close-up shot captures the intensity of a gamer’s focus as their hands grip a video game controller. The blurry background of a computer monitor and lamp adds to the sense of immersion, highlighting the player’s dedication to the game.

Prompt

poses profile: Focused, intense, passionate ; A gamer’s hands, illuminated by the glow of a monitor, holding a controller; close-up; Gaming; A dimly lit room with gaming posters on the walls; cinematic

Characteristic

Shot : Close-up shot of a person’s hands holding a video game controller. The background is blurry and out of focus, but it appears to be a gaming setup with a monitor and other electronics.

Aesthetic Score : 0.6

Mood : intense, focused, immersive

Quality

Entropy : 6.19

Noise : 90

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.20

Image errors : Slight color banding and noise, especially in the shadows.

Contemplation in the City: A Moment of Awe

A woman stands amidst the bustling city, her gaze drawn upwards to the imposing cathedral. The muted colors and cloudy sky create a peaceful atmosphere, inviting viewers to share in her introspective moment of wonder.

Prompt

poses profile: Curious, excited, appreciative ; A tourist gazing up at a majestic cathedral; medium shot; Tourism; A bustling city square with cobblestone streets; cinematic

Characteristic

Shot : A woman in a tan coat stands in front of a large cathedral with a crowd of people in the background. The camera is focused on the woman’s face, and the cathedral is blurred in the background.

Aesthetic Score : 0.6

Mood : pensive, contemplative, peaceful

Quality

Entropy : 6.75

Noise : 88

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly overexposed, and the colors are a bit washed out. Some digital noise is visible in the background.

Lost in the Landscape: A Moment of Contemplation

A man gazes out the train window, his expression hinting at a melancholic introspection. The passing countryside becomes a backdrop for his inner thoughts, creating a poignant scene of longing and contemplation.

Prompt

poses profile: Reflective, contemplative, nostalgic ; A traveler sitting on a train, looking out the window at passing scenery; medium shot; Travel; A scenic train journey through rolling hills and fields; cinematic

Characteristic

Shot : A man sits by a train window, looking out at the passing countryside.

Aesthetic Score : 0.6

Mood : melancholy, contemplative, introspective

Quality

Entropy : 6.66

Noise : 103

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image is slightly blurry, and the colors are muted. The subject appears slightly overexposed.

Laughter and Light: Capturing the Joy of a Festive Gathering

Two women share a moment of genuine laughter at a vibrant party, their joy illuminated by festive decorations. The shallow depth of field focuses on their expressions, creating a sense of intimacy and connection amidst the celebratory atmosphere.

Prompt

poses profile: Joyful, celebratory, connected ; A group of friends laughing and celebrating together; wide shot; Groups; A lively party with colorful decorations and music; cinematic

Characteristic

Shot : Two young women are laughing and enjoying themselves at a party, surrounded by friends and colorful decorations. It’s a celebratory atmosphere with a lot of energy.

Aesthetic Score : 0.7

Mood : joyful, celebratory, carefree

Quality

Entropy : 6.62

Noise : 109

Prompt Clip Score : 0.21

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image contains some noise, particularly in the background. The highlights are somewhat overexposed, and chromatic aberration is visible at the edges.

Superman Takes Flight Over Metropolis

A powerful image captures the iconic superhero standing tall on a rooftop, his cape billowing in the wind as he surveys the cityscape below. The scene evokes a sense of heroism, power, and hope, emphasizing Superman’s dominance and unwavering commitment to protecting the city.

Prompt

poses profile: Powerful, confident, inspiring ; A superhero standing tall, cape billowing in the wind; medium shot; Heroism; A cityscape with towering skyscrapers; cinematic

Characteristic

Shot : A man dressed as Superman standing on a rooftop overlooking a city skyline.

Aesthetic Score : 0.6

Mood : heroic, powerful, determined

Quality

Entropy : 6.68

Noise : 72

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.70

Image errors : The city skyline appears to be somewhat blurry and lacks detail. The red cape also has some unnatural folds.

Lost in the Jungle’s Embrace: A Mysterious Expedition

A group of explorers ventures deep into a misty jungle, their path shrouded in mystery. The lush foliage and dramatic lighting create a sense of depth and intrigue, highlighting the central explorer as they navigate the unknown. The scene evokes a mood of adventure and mystery, with a touch of the eerie.

Prompt

poses profile: Intrigued, adventurous, determined ; A group of explorers navigating a dense jungle; wide shot; Adventure; Lush greenery, ancient ruins, and dappled sunlight; cinematic

Characteristic

Shot : A group of adventurers, mostly men, in period clothing (early 1900s), are walking through a lush jungle. The composition is a little awkward with the main character in the foreground but the light and colors are appealing, giving it an interesting mood.

Aesthetic Score : 0.6

Mood : mysterious, adventurous, jungle

Quality

Entropy : 6.71

Noise : 105

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.80

Image errors : The image is slightly blurry and has some artifacts, particularly in the background, where the textures look a little soft.

Intense Gaze, Dramatic Lighting: A Portrait of Urban Cool

A young man, bathed in vibrant red and blue light, stares directly into the camera with an intensity that demands attention. His black jacket and headphones add to the edgy, urban aesthetic, creating a mood that is both dramatic and captivating.

Prompt

poses profile: Focused, competitive, determined ; A gamer’s face, lit by the screen, showing intense concentration; close-up; Gaming; A dimly lit room with a gaming setup and neon lights; cinematic

Characteristic

Shot : A close-up portrait of a young man with curly hair, wearing a black jacket and headphones, against a backdrop of neon lights.

Aesthetic Score : 0.7

Mood : intense, serious, mysterious

Quality

Entropy : 6.09

Noise : 96

Prompt Clip Score : 0.24

AI Evaluation

Likelihood of AI : 0.80

Image errors : The image has some minor artifacts around the edges of the subject’s hair and skin, and the lighting is slightly uneven.

Sunset Serenade: A Dreamy Beach Stroll

Experience the warmth of a romantic sunset as a couple shares a dreamy walk on the beach, their silhouettes painting a picture of love against the vibrant sky. The man gazes towards the camera, while the woman’s subtle downward glance adds an air of mystery to this visually striking scene.

Prompt

poses profile: Romantic, peaceful, serene ; A couple holding hands, walking along a beach at sunset; medium shot; Tourism; A golden beach with turquoise waters and a vibrant sky; cinematic

Characteristic

Shot : A couple is walking on the beach hand in hand, with a beautiful sunset behind them.

Aesthetic Score : 0.7

Mood : romantic, peaceful, warm

Quality

Entropy : 6.72

Noise : 91

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.30

Image errors : No visible artifacts or errors.

Conclusion

The results suggest that the generative AI model followed the requested camera positions and shot composition far more closely than it matched the desired aesthetic. Here’s a breakdown:

  • Camera Position: The model scored 4.5 out of 10. This is a middling result in absolute terms, but it is far above the aesthetic score and suggests the model can often produce the camera angles and perspectives specified in the prompt.
  • Shot Analysis: The model scored 4.6 out of 10, a similar result for shot composition. This suggests the model can often produce the framing and composition specified in the prompt.
  • Aesthetic Analysis: The model scored 0.05 out of 10, indicating a significant gap between the expected aesthetic and the actual aesthetic of the generated images. This suggests the model struggled to capture the desired visual style or mood.

Overall, the model interprets and executes camera positions and shot composition much more reliably than it achieves the desired aesthetic, which is where it most needs improvement.
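For readers who want to compare the ten images side by side, here is a small sketch that averages the per-image metrics listed above. The numbers are copied by hand from the entries in this post. How these per-image values relate to the three scores out of 10 in the breakdown is not stated, so this summary is only a descriptive aid and does not reconstruct that scoring.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied by hand from the ten entries in this post.
RESULTS = [
    ("Silhouetted Against the Sunset", 0.8, 0.24, 0.20),
    ("A Hiker's Perspective", 0.8, 0.28, 0.20),
    ("The Intensity of Play", 0.6, 0.24, 0.20),
    ("Contemplation in the City", 0.6, 0.29, 0.20),
    ("Lost in the Landscape", 0.6, 0.31, 0.20),
    ("Laughter and Light", 0.7, 0.21, 0.10),
    ("Superman Takes Flight Over Metropolis", 0.6, 0.24, 0.70),
    ("Lost in the Jungle's Embrace", 0.6, 0.28, 0.80),
    ("Intense Gaze, Dramatic Lighting", 0.7, 0.24, 0.80),
    ("Sunset Serenade", 0.7, 0.29, 0.30),
]


def summarize(results):
    """Return mean metrics plus the best and worst prompt matches by CLIP score."""
    return {
        "mean_aesthetic": mean(r[1] for r in results),
        "mean_clip": mean(r[2] for r in results),
        "mean_ai_likelihood": mean(r[3] for r in results),
        "highest_clip": max(results, key=lambda r: r[2])[0],
        "lowest_clip": min(results, key=lambda r: r[2])[0],
    }


summary = summarize(RESULTS)
print(f"mean aesthetic score:   {summary['mean_aesthetic']:.2f}")
print(f"mean prompt CLIP score: {summary['mean_clip']:.3f}")
print(f"mean likelihood of AI:  {summary['mean_ai_likelihood']:.2f}")
print(f"highest CLIP score:     {summary['highest_clip']}")
print(f"lowest CLIP score:      {summary['lowest_clip']}")
```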
