AI's Artistic Eye: Capturing Scenes, But Missing the Angle with Midjourney
AI image generation can now produce striking visuals from text prompts, but reproducing the finer points of human vision, especially camera position and perspective, remains a challenge. This blog post examines how well a generative AI model translates scene descriptions into images, focusing on three aspects: camera position, shot analysis, and aesthetic style. The model grasps the essence of a scene impressively well, yet it often fails to reproduce the intended camera angle. The ten examples below, each listed with its prompt and evaluation metrics, illustrate both the strengths and the limitations, and give a picture of where AI image generation currently stands.
Created with: Midjourney
Lost in the Concrete Jungle
A solitary figure stands amidst the towering skyscrapers of a rain-soaked city, their silhouette swallowed by the imposing architecture. The depth of field emphasizes the feeling of isolation, creating a somber and melancholic mood.
Prompt
Daydreaming Dreamy, distant: Melancholy, lost in thought ; A lone figure; eye-level; Single Person; bustling city street; cinematic
Characteristic
Shot : A lone figure stands in the middle of a wet city street with tram tracks, facing away from the camera, surrounded by tall buildings on either side.
Aesthetic Score : 0.7
Mood : solitude, urban, melancholic
Quality
Entropy : 6.49
Noise : 92
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors
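Every prompt in this post follows the same semicolon-separated layout, so it can be split into named parts when comparing what was requested (for example the camera angle) with what was generated. The sketch below is only an illustration: the field names (`expression`, `emotion`, `subject`, `camera`, `character`, `setting`, `style`) are assumptions inferred from the prompts shown here, not labels taken from the original pipeline.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # All field names are hypothetical labels inferred from the prompt layout.
    expression: str  # text before the colon in the first field
    emotion: str     # text after the colon in the first field
    subject: str
    camera: str
    character: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a semicolon-separated prompt into its six fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    head, subject, camera, character, setting, style = parts
    # The first field has the form "<expression>: <emotion>".
    expression, _, emotion = head.partition(":")
    return ScenePrompt(expression.strip(), emotion.strip(),
                       subject, camera, character, setting, style)


example = parse_prompt(
    "Daydreaming Dreamy, distant: Melancholy, lost in thought ; A lone figure; "
    "eye-level; Single Person; bustling city street; cinematic"
)
print(example.camera, "|", example.setting)
```

Having the camera field isolated makes it easy to check each generated image against the angle that was actually asked for.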
A Hero Stands Alone, Bathed in the City’s Neon Glow
A lone figure, cloaked in crimson, commands the rooftop edge, gazing out over a sprawling cityscape awash in the vibrant hues of night. The image whispers of power, mystery, and a hero poised for action.
Prompt
Daydreaming Focused, determined: Confident, determined ; A superhero standing on a rooftop; high angle; Hero; cityscape at night; cinematic
Characteristic
Shot : A lone figure in a red cape stands on the rooftop of a tall building, overlooking a city skyline at night.
Aesthetic Score : 0.7
Mood : dramatic, heroic, mysterious
Quality
Entropy : 6.30
Noise : 71
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some slight blurring and artifacts, particularly in the city skyline.
A Moment of Warmth and Romance in a Cozy Cafe
A young woman, dressed in a beautiful white floral dress, is captured in a moment of pure joy and warmth. She sits in a cozy cafe, her smile radiating as she looks towards the camera. The soft lighting and blurred background create an intimate atmosphere, while the cup of coffee in front of her adds to the cozy scene. This image exudes romance and warmth, inviting viewers into this private moment.
Prompt
Daydreaming Smiling, relaxed: Peaceful, content ; A woman sipping coffee in a cafe; eye-level; Normal People; warm, inviting cafe interior; cinematic
Characteristic
Shot : A young woman sits at a table in a cafe, looking at the camera with a warm smile. She is holding a cup of coffee in her hand, and her hair is styled in a casual way. The background is a bit blurry and out of focus, but it is still clear that she is in a comfortable and relaxed setting.
Aesthetic Score : 0.8
Mood : cozy, relaxed, warm
Quality
Entropy : 6.04
Noise : 53
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors.
Lost in Thought: A Moment of Contemplation in the Digital Age
A young man sits alone in a dimly lit room, his face illuminated by the glow of his computer screen. His gaze is directed away from the screen, lost in thought. The low-light setting and his pensive expression create a sense of mystery and intrigue, hinting at a moment of deep contemplation.
Prompt
Daydreaming Concentrated, excited: Engrossed, excited ; A gamer intensely focused on a screen; close-up; Gamer; dimly lit room with gaming peripherals; cinematic
Characteristic
Shot : A young man is sitting in a dimly lit room, looking away from a computer screen, likely engrossed in a video game. The room is decorated with string lights, adding a cozy touch to the scene.
Aesthetic Score : 0.7
Mood : pensive, introspective, focused
Quality
Entropy : 6.04
Noise : 70
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to have some noise and slight compression artifacts, but they are minor and don’t detract significantly from the overall quality.
A Child’s Curious Gaze Through Time
A young child with blonde hair gazes out of a dusty, old window, their expression filled with wonder and anticipation. Bathed in soft, warm light, the scene evokes a nostalgic and hopeful mood, capturing the magic of childhood curiosity.
Prompt
Daydreaming Wonder, curiosity: Curious, imaginative ; A child staring out a window; eye-level; Single Person; lush green garden; cinematic
Characteristic
Shot : A young child with blonde hair is looking out of a window. The window is old and has a dark wood frame. The child is wearing a dark green shirt. The scene is dimly lit with a lot of green foliage visible through the window.
Aesthetic Score : 0.7
Mood : melancholy, contemplative, mysterious
Quality
Entropy : 6.39
Noise : 100
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight artifacts and noise, particularly in the darker areas. This is likely due to the low lighting conditions and post-processing. The window pane appears to have imperfections that give the image a dreamy or surreal quality.
A Knight’s Journey Through the Misty Forest
A lone knight in shining armor rides a white horse through a misty forest, bathed in ethereal sunlight. The scene evokes a sense of mystery, adventure, and epic grandeur, with the knight’s isolation highlighting their unwavering purpose.
Prompt
Daydreaming Determined, fearless: Brave, adventurous ; A knight in shining armor riding through a forest; wide shot; Hero; mystical forest with dappled sunlight; cinematic
Characteristic
Shot : A lone knight in shining armor rides a white horse through a misty forest as sunlight streams through the trees.
Aesthetic Score : 0.8
Mood : mysterious, epic, magical
Quality
Entropy : 6.56
Noise : 126
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.70
Image errors : There is some minor blur around the edges of the image, likely due to post-processing.
Sunny Day Picnics and Laughter: Friends Enjoying the Simple Joys
Capture the essence of carefree friendship with this image. Three friends bask in the sunshine, sharing laughter and good times during a picnic in a park. The joyful mood is palpable, making you want to join in on the fun.
Prompt
Daydreaming Laughing, happy: Joyful, carefree ; A group of friends laughing together at a picnic; eye-level; Normal People; sunny park with picnic blanket; cinematic
Characteristic
Shot : Three friends are having a picnic in a park, sitting on a blanket and laughing.
Aesthetic Score : 0.8
Mood : happy, carefree, friendship
Quality
Entropy : 6.92
Noise : 82
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors.
The Glow of Focus: A Hands-On Look at Digital Intensity
A dimly lit room, a vibrant keyboard, and a pair of hands working with intense focus. This image captures the essence of digital immersion, where the glow of the screen illuminates a world of possibilities.
Prompt
Daydreaming Focused, intense: Thrilled, competitive ; A gamer’s hands rapidly moving across a keyboard; close-up; Gamer; brightly lit gaming setup with glowing screen; cinematic
Characteristic
Shot : A person’s hands are typing on a backlit keyboard in a dark room, with a monitor and a desk in the background.
Aesthetic Score : 0.7
Mood : cyberpunk, futuristic, gamer
Quality
Entropy : 6.18
Noise : 95
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : Some minor noise is present in the image. The background is too dark. The lighting is not consistent, making it look unnatural. The focus is not sharp on the subject.
Silhouettes of Solitude: A Melancholy Walk on the Beach
A lone figure walks towards the horizon, their silhouette stark against the cloudy sky. The vastness of the beach and the calm water evoke a sense of melancholy and contemplation, capturing the essence of solitude.
Prompt
Daydreaming Thoughtful, serene: Reflective, introspective ; A woman walking alone on a beach; eye-level; Single Person; vast, empty beach with crashing waves; cinematic
Characteristic
Shot : A solitary figure walks along a sandy beach, with the ocean in the background. The sky is overcast, and the overall mood is tranquil and serene.
Aesthetic Score : 0.8
Mood : tranquil, serene, melancholic
Quality
Entropy : 6.65
Noise : 105
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no noticeable artifacts or errors in the image. The color grading creates a slightly hazy effect, which adds to the overall atmosphere of the image.
Soaring Above It All: A Superhero’s Epic Journey
Witness the awe-inspiring power and freedom of a superhero as they soar through the clouds, leaving the city far below. This epic scene evokes a sense of hope and inspiration, capturing the essence of a true hero’s journey.
Prompt
Daydreaming Confident, determined: Empowered, triumphant ; A superhero soaring through the sky; high angle; Hero; dramatic cloudscape with city skyline in the distance; cinematic
Characteristic
Shot : A superhero figure in a red cape flying over a city, seen from above, with a background of clouds and a city skyline in the distance.
Aesthetic Score : 0.7
Mood : epic, inspiring, hopeful
Quality
Entropy : 6.26
Noise : 93
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : No visible errors.
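The Quality blocks above report an Entropy value for each image without defining it. One plausible reading, and it is only an assumption, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (all 256 levels equally likely); the reported values between 6.04 and 6.92 would fit that range. A minimal sketch of that calculation, using a hypothetical helper named `shannon_entropy`:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy in bits of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # H = sum over levels of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Synthetic data only; these are not pixels from the images in this post.
flat = [128] * 64              # one gray level everywhere
two_tone = [0] * 32 + [255] * 32
all_levels = list(range(256))  # every 8-bit level exactly once

print(shannon_entropy(flat), shannon_entropy(two_tone), shannon_entropy(all_levels))

# With Pillow installed, a real image could be fed in roughly like this:
#   from PIL import Image
#   shannon_entropy(Image.open("image.png").convert("L").tobytes())
```

Under this reading, a higher value means tonal levels are spread more evenly across the image; it says nothing by itself about whether the camera angle matches the prompt.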
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.56, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.08, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and shot composition, but needs improvement in accurately capturing the intended camera position. The aesthetic quality of the generated images is very good.
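The per-image figures listed in this post can also be summarized directly. The sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values copied from the ten entries above, and groups the Prompt Clip Score by the camera term requested in each prompt. These averages are a separate, illustrative calculation: they are not the Camera Position, Shot Analysis, or Aesthetic Analysis scores quoted in the breakdown, whose method the post does not describe. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (title, requested camera, aesthetic score, prompt CLIP score, likelihood of AI)
# Values are copied from the ten entries in this post.
images = [
    ("Lost in the Concrete Jungle", "eye-level", 0.7, 0.21, 0.10),
    ("A Hero Stands Alone", "high angle", 0.7, 0.26, 0.80),
    ("A Moment of Warmth and Romance", "eye-level", 0.8, 0.28, 0.20),
    ("Lost in Thought", "close-up", 0.7, 0.27, 0.20),
    ("A Child's Curious Gaze", "eye-level", 0.7, 0.29, 0.20),
    ("A Knight's Journey", "wide shot", 0.8, 0.30, 0.70),
    ("Sunny Day Picnics and Laughter", "eye-level", 0.8, 0.25, 0.20),
    ("The Glow of Focus", "close-up", 0.7, 0.26, 0.20),
    ("Silhouettes of Solitude", "eye-level", 0.8, 0.21, 0.30),
    ("Soaring Above It All", "high angle", 0.7, 0.26, 0.90),
]


def summarize(rows):
    """Mean of each per-image metric across all rows."""
    return {
        "aesthetic": mean(r[2] for r in rows),
        "clip": mean(r[3] for r in rows),
        "likelihood_ai": mean(r[4] for r in rows),
    }


def clip_by_camera(rows):
    """Mean Prompt Clip Score grouped by the requested camera term."""
    groups = defaultdict(list)
    for _, camera, _, clip, _ in rows:
        groups[camera].append(clip)
    return {camera: mean(values) for camera, values in groups.items()}


summary = summarize(images)
by_camera = clip_by_camera(images)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
for camera, value in sorted(by_camera.items()):
    print(f"{camera}: {value:.3f}")
```

With only ten images, and just one or two per camera term outside eye-level, the grouped means are too thin to support conclusions; they mainly show how such a comparison could be set up.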