AI's Artistic Struggle: Capturing the Scene vs. the Style with Midjourney
In the realm of artificial intelligence, the ability to generate images based on textual prompts has become increasingly sophisticated. However, the challenge of accurately capturing both the scene and the desired aesthetic remains a hurdle. This blog post examines the results of a generative AI model tasked with creating images based on specific scenes and camera positions, highlighting the model’s strengths and weaknesses in capturing the essence of visual storytelling.
Created with: Midjourney
A Solitary Figure Conquers the Majestic Peaks
A lone figure stands triumphant atop a snow-capped mountain, silhouetted against a breathtaking panorama of snowy peaks and clouds. The scene evokes a sense of awe-inspiring isolation and accomplishment, highlighting the grandeur of nature and the resilience of the human spirit.
Prompt
crossed-arms crossed-arms: determined, confident ; A lone explorer, standing atop a windswept mountain peak; wide shot; Adventure; a vast, breathtaking panorama of snow-capped peaks and swirling clouds; cinematic
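Every prompt in this post follows the same semicolon-separated layout: a repeated pose tag, a list of moods, then subject, shot type, genre, setting, and style. The sketch below splits a prompt into those parts. The field names and the `parse_prompt` helper are assumptions made for illustration; they are not part of Midjourney or of the pipeline used here.

```python
def parse_prompt(prompt: str) -> dict:
    """Split a prompt shaped like
    'pose pose: mood, mood ; subject; shot; genre; setting; style'.

    The field names are hypothetical labels chosen for this sketch.
    """
    # Everything before the first colon is the (repeated) pose tag.
    head, _, rest = prompt.partition(":")
    parts = [part.strip() for part in rest.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields after the pose tag, got {len(parts)}")
    moods, subject, shot, genre, setting, style = parts
    return {
        "pose": head.split()[0],
        "moods": [mood.strip() for mood in moods.split(",")],
        "subject": subject,
        "shot": shot,
        "genre": genre,
        "setting": setting,
        "style": style,
    }


PROMPT = (
    "crossed-arms crossed-arms: determined, confident ; A lone explorer, "
    "standing atop a windswept mountain peak; wide shot; Adventure; a vast, "
    "breathtaking panorama of snow-capped peaks and swirling clouds; cinematic"
)

fields = parse_prompt(PROMPT)
print(fields["shot"])   # the camera position requested in the prompt
print(fields["moods"])
```

Pulling out the `shot` field makes it easier to compare the requested camera position with what the generated image actually shows.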
Characteristic
Shot : A lone figure stands on the peak of a snow-capped mountain, overlooking a vast expanse of clouds and snow-covered peaks.
Aesthetic Score : 0.8
Mood : epic, serene, contemplative
Quality
Entropy : 6.58
Noise : 92
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be AI-generated, with some minor artifacts and inconsistencies in the clouds and snow.
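The post does not say how the Entropy value is calculated. Values between roughly 6.4 and 6.9 would be consistent with the Shannon entropy of an 8-bit grayscale histogram, which tops out at 8 bits, so the sketch below assumes that definition. The `shannon_entropy` helper is hypothetical and takes a flat list of pixel intensities rather than an image file.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities.

    Assumption: this mirrors a histogram-based entropy metric; the post
    itself does not document how its Entropy figure is produced.
    """
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 16             # a single gray level carries no information
two_tone = [0, 0, 255, 255]   # two equally likely levels
full_range = list(range(256)) # every 8-bit level exactly once

print(shannon_entropy(flat))
print(shannon_entropy(two_tone))
print(shannon_entropy(full_range))
```

Under this definition, a higher value means the intensities are spread more evenly across the available range, which loosely tracks how much tonal detail an image contains.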
Silhouetted Hero: A Powerful Sunset Scene
A superhero stands tall against a fiery sunset, their silhouette creating a dramatic and hopeful scene. The epic mood and powerful presence are emphasized by the dramatic lighting, making this a visually stunning image.
Prompt
crossed-arms crossed-arms: powerful, stoic ; A superhero, silhouetted against a blazing sunset; medium shot; Heroism; a cityscape with towering skyscrapers and a fiery sky; cinematic
Characteristic
Shot : A silhouetted superhero stands on a rooftop overlooking a city skyline at sunset, with a red cape billowing behind him.
Aesthetic Score : 0.7
Mood : dramatic, powerful, hopeful
Quality
Entropy : 6.42
Noise : 83
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.80
Image errors : There are some minor artifacts in the sky, particularly around the edges of the clouds.
Intense Focus Under Neon Lights
Two young men huddle over a computer screen, their faces illuminated by a dramatic blend of blue and red lighting. The scene exudes an atmosphere of intense concentration and seriousness, heightened by the striking visual contrast.
Prompt
crossed-arms crossed-arms: focused, intense ; A group of gamers, huddled around a glowing computer screen; close-up; Gaming; a dimly lit room with neon lights and gaming peripherals; cinematic
Characteristic
Shot : Three young men are playing video games in a dimly lit room. The room is lit with blue and red lights, and the focus is on the man in the center. He is looking intensely at the computer screen, while the other two men look on.
Aesthetic Score : 0.7
Mood : intense, focused, competitive
Quality
Entropy : 6.55
Noise : 81
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts in the image.
Lost in Paris: A Moment of Melancholy at the Eiffel Tower
A young woman, wrapped in warmth and contemplation, sits in a Parisian cafe, her gaze drawn to the distant Eiffel Tower. The soft focus and muted colors evoke a sense of loneliness and longing, capturing a poignant moment of reflection.
Prompt
crossed-arms crossed-arms: awe-struck, contemplative ; A young woman, gazing out at the Eiffel Tower; medium shot; Tourism; a bustling Parisian street with charming cafes and cobblestone streets; cinematic
Characteristic
Shot : A young woman is sitting at a cafe table, looking out the window at the Eiffel Tower in the distance.
Aesthetic Score : 0.8
Mood : melancholy, romantic, contemplative
Quality
Entropy : 6.79
Noise : 96
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible artifacts or errors in the image.
Tranquility on the Shores of Paradise
A solitary figure finds peace on a pristine white sand beach, the turquoise ocean stretching endlessly beneath a vibrant blue sky. Palm trees sway gently, and a dramatic cloud casts a shadow, highlighting the vastness of the scene and the woman’s smallness in comparison. This image evokes a sense of serenity and tranquility, inviting you to escape into its peaceful embrace.
Prompt
crossed-arms crossed-arms: free-spirited, adventurous ; A backpacker, standing on a deserted beach; long shot; Travel; a pristine beach with turquoise waters and palm trees swaying in the breeze; cinematic
Characteristic
Shot : A lone woman walks on a white sandy beach with a palm tree in the background. The sea is calm and turquoise, with a large white cloud in the sky.
Aesthetic Score : 0.7
Mood : serene, tranquil, wanderlust
Quality
Entropy : 6.42
Noise : 92
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image appears to have been digitally enhanced, possibly to increase saturation and contrast. Some areas, particularly in the sky and water, have a slightly artificial look.
Awe-Inspiring View: Astronauts Witness a Cosmic Spectacle
Four astronauts stand in awe on a space station platform, gazing out at a majestic spaceship silhouetted against a swirling nebula. The dramatic lighting creates a sense of wonder and anticipation, capturing the essence of futuristic space exploration.
Prompt
crossed-arms crossed-arms: determined, united ; A team of astronauts, standing in the shadow of a colossal spaceship; medium shot; Heroism; a futuristic spaceport with gleaming metal and swirling nebulae; cinematic
Characteristic
Shot : Four astronauts stand on a platform looking at a futuristic spaceship in a star-filled space. The spaceship is engulfed in orange fire.
Aesthetic Score : 0.7
Mood : futuristic, adventurous, awe-inspiring
Quality
Entropy : 6.49
Noise : 113
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.90
Image errors : The spaceship appears to have some slightly blurry and undefined details, particularly on its surface. The flames in the background are a bit pixelated, possibly due to the image being compressed or upscaled.
The Joy of Virtual Reality: A Shared Experience
A man immersed in a virtual world, his face alight with excitement, captures the thrill of shared VR experiences. The dimly lit environment and the presence of others in the background create a sense of community and immersion.
Prompt
crossed-arms crossed-arms: excited, triumphant ; A group of friends, celebrating a victory in a virtual reality game; close-up; Gaming; a brightly lit arcade with flashing lights and immersive VR headsets; cinematic
Characteristic
Shot : A man wearing a VR headset is reacting enthusiastically to something happening in the virtual reality. He is standing in a dimly lit room with other people, and the room appears to be an arcade or game center. There is a lot of color in the photo, likely from the lights of the games or arcade machines.
Aesthetic Score : 0.7
Mood : joyful, exciting, playful
Quality
Entropy : 6.79
Noise : 80
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, just some minor artifacts like color fringing around the edge of the man’s headset and some blur in the background.
Solitude in the City
A lone figure contemplates the urban landscape from a bridge, creating a powerful sense of solitude amidst the bustling cityscape. The dramatic perspective highlights the scale of the city and the individual’s place within it.
Prompt
crossed-arms crossed-arms: reflective, introspective ; A lone traveler, standing on a bridge overlooking a bustling city; medium shot; Travel; a vibrant cityscape with towering buildings and a river flowing below; cinematic
Characteristic
Shot : A lone man stands on a bridge overlooking a river, with a city skyline in the background.
Aesthetic Score : 0.7
Mood : solitude, urban, contemplative
Quality
Entropy : 6.42
Noise : 102
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no significant errors in the image. The exposure is well-balanced, and the colors are vibrant.
Silhouettes of Adventure: Hikers Embrace the Sunset’s Majesty
A breathtaking scene unfolds as a group of hikers stand on a mountaintop, silhouetted against the warm glow of a setting sun. The vast, rolling landscape stretches before them, inspiring a sense of serenity, adventure, and contemplation. The dramatic effect of the silhouettes against the light emphasizes the awe-inspiring beauty of nature and the sense of wonder the hikers must be feeling.
Prompt
crossed-arms crossed-arms: accomplished, exhilarated ; A group of hikers, standing at the summit of a mountain; wide shot; Adventure; a panoramic view of rolling hills and lush forests; cinematic
Characteristic
Shot : A group of six hikers stand on a mountaintop overlooking a beautiful valley. The sun is setting, casting a warm glow over the scene. The hikers are all silhouetted against the sky, giving the image a sense of mystery and grandeur.
Aesthetic Score : 0.7
Mood : serene, adventurous, inspiring
Quality
Entropy : 6.66
Noise : 101
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable artifacts or errors in the image.
Taj Mahal Selfie: Capturing Joy and Beauty
A group of friends strike a pose in front of the iconic Taj Mahal, their smiles and excitement radiating against the stunning backdrop. The vibrant colors of the monument and the clear blue sky create a picture-perfect moment, capturing the essence of happiness and celebration.
Prompt
crossed-arms crossed-arms: happy, excited ; A group of tourists, posing for a photo in front of a famous landmark; medium shot; Tourism; a historic landmark with intricate architecture and vibrant colors; cinematic
Characteristic
Shot : A group of young people are smiling and posing for a photo in front of the Taj Mahal.
Aesthetic Score : 0.6
Mood : happy, celebratory, adventurous
Quality
Entropy : 6.86
Noise : 94
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant image errors.
Conclusion
The results show that the generative AI model matched the expected aesthetic well, but struggled to reproduce the scene and camera position described in the prompts. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.47, which is considered below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. This means that the generated images closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
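For a quick cross-check, the per-image numbers reported above can be averaged. The sketch below copies the ten sets of values from this post (with shortened titles) and computes the mean of each metric. Note that these averages are a separate summary: the post does not explain how the Camera Position, Shot Analysis, and Aesthetic Analysis scores were derived, and this code does not attempt to reproduce them.

```python
from statistics import mean

# (title, aesthetic, entropy, noise, prompt_clip, ai_likelihood),
# copied from the per-image sections of this post; titles are shortened.
IMAGES = [
    ("A Solitary Figure", 0.8, 6.58, 92, 0.26, 0.80),
    ("Silhouetted Hero", 0.7, 6.42, 83, 0.31, 0.80),
    ("Intense Focus Under Neon Lights", 0.7, 6.55, 81, 0.28, 0.10),
    ("Lost in Paris", 0.8, 6.79, 96, 0.32, 0.10),
    ("Tranquility on the Shores of Paradise", 0.7, 6.42, 92, 0.26, 0.30),
    ("Awe-Inspiring View", 0.7, 6.49, 113, 0.25, 0.90),
    ("The Joy of Virtual Reality", 0.7, 6.79, 80, 0.27, 0.20),
    ("Solitude in the City", 0.7, 6.42, 102, 0.27, 0.10),
    ("Silhouettes of Adventure", 0.7, 6.66, 101, 0.27, 0.20),
    ("Taj Mahal Selfie", 0.6, 6.86, 94, 0.27, 0.10),
]
FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "ai_likelihood")


def column_means(rows):
    """Return the mean of each metric column, keyed by field name."""
    return {
        name: mean(row[i + 1] for row in rows)
        for i, name in enumerate(FIELDS)
    }


summary = column_means(IMAGES)
for name, value in summary.items():
    print(f"{name:>14}: {value:.3f}")

# Images the evaluator rated as more likely AI-generated than not.
flagged = [title for title, *_, ai in IMAGES if ai >= 0.5]
print(flagged)
```

Across the ten images, the aesthetic scores average about 0.71 and the Prompt Clip Scores about 0.28, while only three images received an AI likelihood of 0.5 or higher.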