AI's Artistic Journey: Capturing the Essence of Scenes, But Missing the Mark on Camera Angles with Midjourney
In the realm of artificial intelligence, the ability to generate images from text descriptions is a rapidly evolving field. This technology holds immense potential for creative applications, from generating visual content for websites and social media to assisting artists in their creative process. However, as with any emerging technology, there are challenges to overcome. This blog post examines the performance of a generative AI model in capturing the essence of scenes described in text, focusing on its ability to understand camera positions and aesthetics. We will explore the model’s strengths and weaknesses, highlighting its successes and areas for improvement.
Created with: Midjourney
Solitude Amidst the Storm
A lone figure stands defiant against the raw power of nature, silhouetted against a stormy sea. The dramatic contrast between the small human form and the vast, turbulent ocean evokes a sense of melancholic beauty and impending danger.
Prompt
rule-of-thirds Rule of thirds: Epic, determined, hopeful ; A lone hero standing on a cliff overlooking a vast, stormy sea; Wide shot; Heroism; Dramatic sky with crashing waves; cinematic
Characteristic
Shot : A solitary figure stands on a cliff edge overlooking a stormy sea. The sky is dark and overcast, and the waves are crashing against the rocks below.
Aesthetic Score : 0.75
Mood : dramatic, melancholic, powerful
Quality
Entropy : 6.74
Noise : 110
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no obvious artifacts or errors in the image.
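The post does not say how the Entropy value in each Quality block is computed. One common definition is the Shannon entropy of the image's grayscale intensity histogram, which for an 8-bit image ranges from 0 bits (one flat tone) to 8 bits (all 256 levels equally frequent). The reported values of roughly 6 to 7 would fit that range, but this is an assumption, and the function name and toy inputs below are illustrative only. A minimal sketch:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities.

    Assumption: this mirrors a histogram-based entropy metric; the post
    does not document the exact formula behind its Entropy figure.
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy inputs, not taken from the post's images.
flat = [128] * 16             # a single grey level carries no information
uniform = list(range(256))    # every 8-bit level appears exactly once

print(shannon_entropy(flat))
print(shannon_entropy(uniform))
```

Under this definition, a higher value means the tones are spread more evenly across the histogram. It says nothing on its own about whether the image matches the prompt.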
Campfire Mystery in the Fog
Four men huddle around a flickering campfire in a dark, shadowy forest. The fire’s warmth contrasts with the chilling atmosphere, creating a sense of mystery and suspense. The swirling smoke and fog add to the enigmatic mood, leaving you wondering what secrets lie hidden in the darkness.
Prompt
rule-of-thirds Rule of thirds: Intriguing, mysterious, suspenseful ; A group of adventurers huddled around a campfire in a dense forest; Medium shot; Adventure; Shadows and flickering flames; cinematic
Characteristic
Shot : A group of five people are gathered around a campfire in a dark forest. The scene is lit by the fire and the moonlight.
Aesthetic Score : 0.7
Mood : mysterious, atmospheric, foreboding
Quality
Entropy : 5.91
Noise : 81
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image has some minor artifacts, particularly around the edges of the figures.
Lost in the Game: A Moment of Intense Focus
This image captures the raw intensity of a gamer fully immersed in their virtual world. The blurred screen, the gripped controller, and the cool lighting all contribute to a sense of suspense and excitement, drawing the viewer into the player’s experience.
Prompt
rule-of-thirds Rule of thirds: Focused, intense, exhilarating ; A gamer’s hands intensely gripping a controller, the screen displaying a thrilling moment in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : A person is playing a video game, the screen shows a futuristic combat scene, the person is holding a video game controller.
Aesthetic Score : 0.6
Mood : intense, focused, serious
Quality
Entropy : 6.17
Noise : 61
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly grainy and the lighting is a bit uneven.
Tranquility Reflected: A Hiker’s Moment of Serenity
A lone hiker finds peace on a secluded island, surrounded by majestic mountains mirrored in the calm lake. The scene evokes a sense of serenity and inspiration, with the symmetrical reflection adding a touch of dramatic beauty.
Prompt
rule-of-thirds Rule of thirds: Tranquil, awe-inspiring, peaceful ; A majestic mountain range reflected in a still lake, with a lone hiker standing on a rocky outcrop; Wide shot; Tourism; Clear blue sky and vibrant green foliage; cinematic
Characteristic
Shot : A lone hiker stands on a rock in the middle of a calm lake, with majestic mountains reflecting in the water, creating a mirrored effect. The sky is a bright blue, and the overall scene is serene and peaceful.
Aesthetic Score : 0.9
Mood : tranquil, serene, majestic
Quality
Entropy : 6.69
Noise : 105
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors in the image.
Nostalgic Journey Through Rolling Countryside
A vintage train, billowing smoke, races through a picturesque landscape of lush green hills and fields. The sun bathes the scene in a warm glow, evoking a sense of peace and nostalgia. The dynamic composition captures the train’s speed and creates a captivating sense of movement.
Prompt
rule-of-thirds Rule of thirds: Nostalgic, romantic, adventurous ; A vintage train speeding through a picturesque countryside, with a lone traveler gazing out the window; Medium shot; Travel; Rolling hills and vibrant fields; cinematic
Characteristic
Shot : A steam train is traveling through a countryside landscape, with rolling hills and green fields in the distance. Smoke is billowing from the train’s engine.
Aesthetic Score : 0.7
Mood : nostalgic, peaceful, serene
Quality
Entropy : 6.66
Noise : 113
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image contains some minor image artifacts. The image also has a slight motion blur, likely caused by the movement of the train.
Laughter, Lights, and Good Times: Friends Share a Joyful Meal
Capture the warmth and connection of a night out with friends. This scene features a group laughing and enjoying a meal together, bathed in warm light against a bustling, festive backdrop. The genuine smiles and lively atmosphere radiate happiness and create a sense of shared joy.
Prompt
rule-of-thirds Rule of thirds: Joyful, lively, celebratory ; A group of friends laughing and enjoying a meal together at a bustling outdoor market; Medium shot; Groups; Colorful stalls and vibrant street life; cinematic
Characteristic
Shot : A group of friends are having a meal together at a bustling night market. The scene is warm and inviting, with lots of activity and energy.
Aesthetic Score : 0.7
Mood : joyful, vibrant, authentic
Quality
Entropy : 6.86
Noise : 98
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is generally well-exposed with no significant errors. Some highlights in the background are slightly overblown, but not distracting.
Silhouetted Solitude at Sunset
A lone figure stands on a beach, bathed in the golden light of the setting sun. Dramatic clouds paint the sky, reflecting in the tranquil water. The scene evokes a sense of serene contemplation and melancholic beauty, with the silhouette of the figure highlighting their isolation and introspective mood.
Prompt
rule-of-thirds Rule of thirds: Melancholy, reflective, hopeful ; A lone figure standing on a deserted beach, watching the sun setting over the horizon; Wide shot; Heroism; Golden light illuminating the sky and water; cinematic
Characteristic
Shot : A lone figure stands on a beach at sunset, the golden light reflecting off the water and illuminating the clouds.
Aesthetic Score : 0.8
Mood : serene, contemplative, hopeful
Quality
Entropy : 6.65
Noise : 115
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors.
Into the Green Unknown: Soldiers Venture Through a Sun-Dappled Jungle
A sense of mystery and adventure hangs in the air as a group of soldiers navigate a dense jungle. Sunlight filters through the canopy, casting long shadows and highlighting the unknown dangers that lie ahead. The image, framed from behind the soldiers, captures the suspense of their journey into the heart of the wilderness.
Prompt
rule-of-thirds Rule of thirds: Intriguing, suspenseful, adventurous ; A group of explorers navigating a treacherous jungle path, with dense foliage surrounding them; Medium shot; Adventure; Lush greenery and dappled sunlight; cinematic
Characteristic
Shot : A group of soldiers walking through a dense jungle. They are silhouetted against the light coming from the back of the path, creating a sense of mystery and adventure.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, suspenseful
Quality
Entropy : 5.96
Noise : 116
Prompt Clip Score : 0.20
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry, and there are some pixelated areas in the foliage. The soldiers’ faces are not clearly defined.
Lost in Thought: A Moment of Introspection
A close-up shot captures the depth of a person’s gaze as they look off into the distance, conveying a sense of pensive thoughtfulness and quiet contemplation. The intimacy of the close-up draws the viewer into their world, leaving them to wonder what thoughts are swirling within.
Prompt
rule-of-thirds Rule of thirds: Focused, intense, determined ; A close-up of a gamer’s face, eyes glued to the screen, as they navigate a challenging level in a video game; Close-up; Gaming; Blurred background of the game’s visuals; cinematic
Characteristic
Shot : A close-up of a person’s eye looking off to the side, with the reflection of light in the eye
Aesthetic Score : 0.7
Mood : reflective, contemplative, pensive
Quality
Entropy : 6.55
Noise : 98
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Silhouette of Solitude: A Cityscape of Dreams and Loneliness
A solitary figure stands on a rooftop, their silhouette stark against the dazzling backdrop of a city awash in light. The scene evokes a sense of contemplation and isolation, highlighting the vastness and grandeur of urban life.
Prompt
rule-of-thirds Rule of thirds: Energetic, exciting, awe-inspiring ; A panoramic view of a bustling city skyline, with a lone tourist standing on a rooftop overlooking the scene; Wide shot; Tourism; Vibrant lights and towering buildings; cinematic
Characteristic
Shot : A lone figure stands silhouetted on a rooftop overlooking a sprawling city at night. Thousands of lights illuminate the cityscape, creating a dazzling display of urban life. The viewer’s perspective is high up, providing a sweeping vista of the cityscape.
Aesthetic Score : 0.8
Mood : serene, contemplative, awe
Quality
Entropy : 6.60
Noise : 100
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors are visible. There may be some slight compression artifacts, but they are barely noticeable.
Conclusion
The results show that the generative AI model performed well in understanding the scene and its aesthetic, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.51, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.02, which is considered very good. Unlike the two scores above, a low value appears to be the favorable outcome here: it means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrated a good understanding of the scene and its aesthetic, but struggled with accurately capturing the intended camera position.
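The post does not explain how the three scores in the breakdown are derived, so they cannot be reproduced here. As a separate, simpler check, the per-image Aesthetic Score and Prompt Clip Score listed above can be averaged by the shot type requested in each prompt. The sketch below copies those values from the post; the `RESULTS` layout and the `summarize_by_shot` helper are illustrative names, not part of the original evaluation.

```python
from collections import defaultdict
from statistics import mean

# (title, requested shot type, aesthetic score, prompt CLIP score),
# copied from the per-image blocks in this post.
RESULTS = [
    ("Solitude Amidst the Storm", "Wide shot", 0.75, 0.25),
    ("Campfire Mystery in the Fog", "Medium shot", 0.70, 0.23),
    ("Lost in the Game", "Close-up", 0.60, 0.24),
    ("Tranquility Reflected", "Wide shot", 0.90, 0.27),
    ("Nostalgic Journey Through Rolling Countryside", "Medium shot", 0.70, 0.25),
    ("Laughter, Lights, and Good Times", "Medium shot", 0.70, 0.22),
    ("Silhouetted Solitude at Sunset", "Wide shot", 0.80, 0.23),
    ("Into the Green Unknown", "Medium shot", 0.70, 0.20),
    ("Lost in Thought", "Close-up", 0.70, 0.21),
    ("Silhouette of Solitude", "Wide shot", 0.80, 0.23),
]


def summarize_by_shot(results):
    """Return {shot type: (mean aesthetic score, mean CLIP score)}."""
    groups = defaultdict(list)
    for _title, shot, aesthetic, clip in results:
        groups[shot].append((aesthetic, clip))
    return {
        shot: (mean(a for a, _ in rows), mean(c for _, c in rows))
        for shot, rows in groups.items()
    }


summary = summarize_by_shot(RESULTS)
for shot, (aesthetic, clip) in summary.items():
    print(f"{shot}: aesthetic={aesthetic:.2f}, clip={clip:.3f}")
```

With these ten images, the wide shots average the highest aesthetic score (about 0.81) and the close-ups the lowest (0.65). With only two close-ups in the set, this is a rough indication at best.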