AI's Artistic Struggle: Capturing the Essence of a Scene with Imagen-v3
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals from text prompts. However, achieving the perfect balance between technical accuracy and artistic expression remains a challenge. This blog post examines the results of an experiment that tested an AI model’s ability to generate images based on detailed scene descriptions, focusing on the model’s performance in capturing the desired aesthetic.
Created with: imagen-v3
Two Astronauts Embrace the Milky Way
A breathtaking scene of two astronauts, hand in hand, standing on a barren landscape against the backdrop of the Milky Way. The image evokes a sense of hope and inspiration, highlighting the vastness of space and the enduring power of human connection.
Prompt
poses holding-hands: Hopeful, determined, camaraderie ; Two astronauts; wide shot; heroism; the vastness of space with stars and planets in the background; cinematic
Characteristic
Shot : Two astronauts in space suits stand on a barren landscape, holding hands, facing away from the viewer, with a milky way backdrop.
Aesthetic Score : 0.8
Mood : dreamy, hopeful, inspiring
Quality
Entropy : 6.48
Noise : 99
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.90
Image errors : No major errors are apparent; however, there is a faint banding pattern in the sky.
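The post does not say how the Entropy figure in each Quality block is computed. One common reading, assumed here and not confirmed by the source, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (one flat tone) to 8 bits (all 256 levels equally used). The sketch below shows that calculation on a plain list of pixel intensities; the function name `shannon_entropy` and the toy inputs are illustrative only.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of integer intensities.

    Assumption: this mirrors a histogram-based image entropy; the post does
    not state which definition its Entropy metric actually uses.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    total = len(pixels)
    # Sum of p * log2(1 / p) over every intensity level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs (hypothetical, not taken from the generated images).
flat_tone = [128] * 64          # a single gray level
two_tones = [0, 0, 255, 255]    # two levels, equally frequent
all_levels = list(range(256))   # every 8-bit level exactly once

for name, data in [("flat", flat_tone), ("two tones", two_tones), ("all levels", all_levels)]:
    print(f"{name:10s} entropy = {shannon_entropy(data):.2f} bits")
```

Under this assumed definition, higher values would indicate a wider, more even spread of tones rather than better image quality as such.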
Lost in the Jungle’s Embrace: A Romantic Adventure
A couple strolls hand-in-hand through a vibrant jungle, their silhouettes bathed in the ethereal glow of filtered sunlight. The air is thick with mystery and romance, promising an unforgettable journey into the heart of the wild.
Prompt
poses holding-hands: Excited, adventurous, trusting ; A group of explorers; medium shot; adventure; a dense jungle with sunlight filtering through the canopy; cinematic
Characteristic
Shot : A couple is walking through a lush, green jungle. They are holding hands and looking forward.
Aesthetic Score : 0.6
Mood : romantic, adventurous, mysterious
Quality
Entropy : 6.54
Noise : 91
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but the lighting is a bit flat and the colors are a bit muted.
The Handshake That Seals the Deal: Two Gamers Lock In Their Competitive Spirit
A close-up shot captures the intensity of a handshake between two gamers, their determined expressions and the gaming controller in the background hinting at a significant moment or agreement. The lighting adds to the dramatic effect, emphasizing the seriousness of the occasion.
Prompt
poses holding-hands: Focused, competitive, collaborative ; Two gamers; close-up; gaming; a brightly lit gaming setup with glowing screens and controllers; cinematic
Characteristic
Shot : Two people shaking hands in front of a computer monitor with a gaming controller in the background.
Aesthetic Score : 0.6
Mood : serious, determined, competitive
Quality
Entropy : 6.32
Noise : 83
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight blurriness on the hands, suggesting a slightly out-of-focus shot. No major artifacts or errors.
Love Blooms Under the City Lights
A young couple strolls hand-in-hand across a bridge, their laughter echoing against the backdrop of a vibrant city skyline. The scene radiates with romantic energy, capturing the joy and optimism of new love.
Prompt
poses holding-hands: Romantic, happy, adventurous ; A couple; medium shot; tourism; a picturesque cityscape with iconic landmarks in the background; cinematic
Characteristic
Shot : A young couple is holding hands and walking along a bridge with a city skyline in the background.
Aesthetic Score : 0.6
Mood : romantic, playful, happy
Quality
Entropy : 6.84
Noise : 84
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The background is a bit blurry and there’s some noise in the image.
Friends Strike a Pose Against a Majestic Mountain Backdrop
Five friends, united in laughter and adventure, stand in a playful line on a mountain road, their joy mirrored in the breathtaking scenery. The dramatic mountains and cloudy sky create a stunning backdrop for their lighthearted moment, capturing the essence of friendship and the thrill of exploration.
Prompt
poses holding-hands: Joyful, connected, adventurous ; group; long shot; travel; a scenic mountain range with a winding road leading to the peak; cinematic
Characteristic
Shot : Five friends are standing in a line on a mountain road, holding hands, facing the camera. They are all balanced on one leg, appearing to be in a playful mood. The background is a majestic mountain range with a cloudy sky, creating a scenic setting. The road winds through the mountains.
Aesthetic Score : 0.7
Mood : playful, adventurous, happy
Quality
Entropy : 6.68
Noise : 103
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed in certain areas, particularly the sky. The shadows are slightly hard, creating a less natural look.
A Night of Joy and Connection
Three friends share a heartwarming moment amidst the festive lights and decorations of a bustling party. The image captures the warmth and joy of their connection, creating a sense of celebration and togetherness.
Prompt
poses holding-hands: Happy, celebratory, connected ; A group of friends; medium shot; groups; a vibrant festival with colorful decorations and music; cinematic
Characteristic
Shot : Three people, two men and a woman, are standing in a festive setting with colorful decorations and lights hanging above. The woman is in the center, and the two men are holding her hands on either side. The background is blurred, suggesting a bustling party atmosphere. The image is shot at night, with warm lighting creating a cozy and inviting ambiance.
Aesthetic Score : 0.7
Mood : joyful, festive, heartwarming
Quality
Entropy : 6.89
Noise : 104
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant image errors.
Sunrise Romance on the Mountaintop
A breathtaking sunrise paints the sky in golden hues as a couple stands hand-in-hand on a mountain peak, their silhouettes a testament to their love and shared adventure. The vastness of the clouds and mountains below amplifies the sense of wonder and hope in this romantic scene.
Prompt
poses holding-hands: Determined, courageous, triumphant ; A lone hiker; close-up; heroism; a breathtaking mountain vista with clouds swirling below; cinematic
Characteristic
Shot : Two people are holding hands, standing on a mountaintop overlooking a valley with clouds below. The scene is a stunning sunrise with a warm, golden light filtering through the clouds.
Aesthetic Score : 0.7
Mood : romantic, hopeful, adventurous
Quality
Entropy : 5.75
Noise : 83
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has slight noise in the background, likely due to compression. Some of the details in the mountains could be slightly blurry.
Golden Hour Intimacy: A Love Story Unfolds
In this heartwarming image, a couple’s hands intertwine in the foreground, symbolizing their unbreakable bond. The backdrop, a breathtaking mountain landscape bathed in the warm glow of the setting sun, adds a touch of romance and hope to the scene. The dramatic effect of the blurred background draws the viewer’s focus to the couple’s connection, creating an intimate and hopeful mood.
Prompt
poses holding-hands: Playful, celebratory, carefree ; close-up; adventure; cinematic
Characteristic
Shot : A couple’s hands are holding each other in the foreground of the image. The background is a blurred landscape, most likely a mountain scene, with a warm golden light from the setting sun.
Aesthetic Score : 0.7
Mood : romantic, intimate, hopeful
Quality
Entropy : 6.09
Noise : 66
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image. The image is well-composed and the colors are balanced.
United in Hope: A Moment of Shared Belief Under the Spotlight
A powerful image captures a group of individuals standing together on a stage, their hands clasped in unity as they gaze upwards. The dramatic lighting casts long shadows, adding to the sense of hope and solemnity that permeates the scene.
Prompt
poses holding-hands: Passionate, connected, expressive ; A group of musicians; medium shot; groups; a dimly lit stage with spotlights shining on them; cinematic
Characteristic
Shot : A group of people standing on a stage, holding hands and looking up. The stage is lit with spotlights. There is a sense of unity and hope in the image.
Aesthetic Score : 0.6
Mood : hopeful, dramatic, solemn
Quality
Entropy : 6.73
Noise : 92
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
Sunset Romance in the Desert
A couple, hand in hand, gazes towards a breathtaking desert sunset. The warm light paints the scene with intimacy and hope, capturing a moment of serene love.
Prompt
poses holding-hands: Romantic, adventurous, hopeful ; A couple; long shot; travel; a vast desert landscape with a setting sun in the distance; cinematic
Characteristic
Shot : A couple is holding hands, looking towards the sunset in a desert landscape.
Aesthetic Score : 0.7
Mood : romantic, hopeful, serene
Quality
Entropy : 6.46
Noise : 89
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is good, but the bottom right corner of the image is blurrier than it should be.
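To compare the ten images at a glance, the per-image figures can be collected and summarized. The following is a minimal sketch: the numbers are copied from the result blocks above in post order, while the dictionary keys and the `summarize` helper are names chosen for illustration. These simple averages are not the scores quoted in the Conclusion, whose calculation the post does not describe.

```python
from statistics import mean

# Values copied from the ten result blocks above, in post order.
metrics = {
    "aesthetic_score": [0.8, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7, 0.7, 0.6, 0.7],
    "entropy": [6.48, 6.54, 6.32, 6.84, 6.68, 6.89, 5.75, 6.09, 6.73, 6.46],
    "noise": [99, 91, 83, 84, 103, 104, 83, 66, 92, 89],
    "prompt_clip_score": [0.35, 0.32, 0.33, 0.30, 0.32, 0.29, 0.28, 0.29, 0.30, 0.32],
    "likelihood_of_ai": [0.90, 0.20, 0.20, 0.20, 0.10, 0.10, 0.10, 0.20, 0.20, 0.20],
}


def summarize(values):
    """Return (min, mean, max) for one metric, with the mean rounded to 3 decimals."""
    return min(values), round(mean(values), 3), max(values)


summary = {name: summarize(values) for name, values in metrics.items()}

for name, (low, avg, high) in summary.items():
    print(f"{name:18s} min={low:<6} mean={avg:<6} max={high}")
```

Two things stand out from the raw figures: the Prompt Clip Score stays in a narrow band (0.28 to 0.35), and only the astronaut image has a high Likelihood of AI (0.90), with every other image at 0.20 or below.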
Conclusion
The results show that the generative AI model understood the scene well and handled camera position reasonably, but struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.48, which is slightly below the “good” range of 0.5 to 0.75. This suggests that the model’s ability to interpret and recreate camera positions in the image is decent, but could be improved.
- Shot Analysis: The model scored 0.665, falling within the “good” range. This indicates that the model effectively understood the scene described in the prompt and translated it into a visually coherent image.
- Aesthetic Analysis: The model scored 0.11, which is just above the “very good” range of -0.2 to 0.1. This suggests that the aesthetic of the generated images deviated from the aesthetic expected from the prompt.
Overall, the model demonstrates a good understanding of scene composition and a near-good grasp of camera positioning, but needs improvement in generating images that align with the desired aesthetic.
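The comparisons in the breakdown above can be checked mechanically. This is a small sketch using only the scores and ranges quoted in the Conclusion; the `Band` class, its inclusive bounds, and the assumption that Shot Analysis uses the same 0.5 to 0.75 "good" range as Camera Position are illustrative choices, not details given by the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """A named reference range; treating both bounds as inclusive is an assumption."""
    label: str
    low: float
    high: float

    def position(self, score: float) -> str:
        """Say whether the score is below, inside, or above the range."""
        if score < self.low:
            return "below"
        if score > self.high:
            return "above"
        return "inside"

    def gap(self, score: float) -> float:
        """Distance from the score to the nearest bound, 0.0 when inside."""
        if score < self.low:
            return round(self.low - score, 3)
        if score > self.high:
            return round(score - self.high, 3)
        return 0.0


GOOD = Band("good", 0.5, 0.75)
VERY_GOOD = Band("very good", -0.2, 0.1)

# Scores as quoted in the Conclusion.
reported = {
    "Camera Position": (0.48, GOOD),
    "Shot Analysis": (0.665, GOOD),
    "Aesthetic Analysis": (0.11, VERY_GOOD),
}

for name, (score, band) in reported.items():
    print(f"{name}: {score} is {band.position(score)} the '{band.label}' range "
          f"[{band.low}, {band.high}] (gap {band.gap(score)})")
```

Both out-of-range scores miss their reference range by a small margin: 0.02 for Camera Position and 0.01 for Aesthetic Analysis.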
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://deepmind.google/technologies/imagen-3/