AI's Artistic Eye: Mastering Aesthetics, Struggling with Shots with Midjourney
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals based on text prompts. One key aspect of this technology is its ability to capture and implement specific aesthetic styles. This analysis delves into the performance of a generative AI model in understanding and applying the ‘dramatic’ aesthetic style. We explore how the model handles various aspects of image creation, including camera position, shot composition, and overall aesthetic, providing insights into its strengths and areas for improvement.
Created with: Midjourney
Silhouetted Against the Setting Sun: A Moment of Hope and Loneliness
A solitary figure walks into the fiery sunset, their silhouette a stark contrast against the vast field of grass. The dramatic lighting and epic composition evoke a sense of both hope and loneliness, leaving the viewer to ponder the figure’s journey.
Prompt
Stylized: Epic and melancholic ; A lone warrior; wide shot; Heroism; A desolate battlefield with a setting sun; cinematic
Characteristic
Shot : A lone figure walks away from the viewer in the foreground of a field towards a setting sun in the distance. The sky is a vibrant orange and yellow, with clouds.
Aesthetic Score : 0.7
Mood : lonely, serene, epic
Quality
Entropy : 6.66
Noise : 92
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have some slight blur, particularly in the grass and clouds.
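The prompt in the entry above, like every prompt in this post, follows the same semicolon-separated pattern: a mood prefixed with "Stylized:", then a subject, a shot type, a theme, a setting, and a style keyword. The following is a minimal Python sketch of how such a prompt could be split so that the requested shot type can be compared with the generated image. The field names (`mood`, `subject`, `shot`, `theme`, `setting`, `style`) and the `parse_prompt` helper are illustrative assumptions, not part of Midjourney or of the evaluation pipeline used here.

```python
from typing import NamedTuple


class PromptParts(NamedTuple):
    """Hypothetical field names for the six segments of a prompt."""
    mood: str
    subject: str
    shot: str
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a 'Stylized: ...; ...; ...' prompt into its six segments."""
    parts = [segment.strip() for segment in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 segments, got {len(parts)}")
    mood = parts[0]
    prefix = "Stylized:"
    if mood.startswith(prefix):
        # Drop the leading label so only the mood words remain.
        mood = mood[len(prefix):].strip()
    return PromptParts(mood, *parts[1:])


example = parse_prompt(
    "Stylized: Epic and melancholic ; A lone warrior; wide shot; Heroism; "
    "A desolate battlefield with a setting sun; cinematic"
)
print(example.shot)
```

Isolating the `shot` field this way makes it easy to check, entry by entry, whether the requested framing ("wide shot", "close-up", "medium shot", "long shot") matches the shot description reported for the image.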
A Treasure Trove in the Dark
Discover the allure of a hidden treasure chest overflowing with gold coins, bathed in an ethereal glow within a mysterious cave. The scene evokes a sense of wonder, magic, and adventure, inviting you to explore the secrets it holds.
Prompt
Stylized: Excitement and wonder ; A treasure chest overflowing with gold; close-up; Adventure; A dark and mysterious cave; cinematic
Characteristic
Shot : A treasure chest overflowing with gold coins is nestled inside a dark cave, the light from the chest illuminating the surrounding rocks.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, opulent
Quality
Entropy : 6.65
Noise : 116
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.90
Image errors : Slight blurring around the edges of the image, possible artifacts from the rendering process.
Cyberpunk Shadow: A Figure of Mystery in the Neon City
A lone figure, clad in futuristic attire and wielding a weapon, stands amidst the neon-drenched streets of a cyberpunk metropolis. The dramatic lighting and the character’s enigmatic pose evoke a sense of danger and intrigue, promising a story of mystery and action.
Prompt
Stylized: Triumphant and futuristic ; A player’s avatar, a powerful warrior, standing triumphantly; medium shot; Gaming; A vibrant and futuristic cityscape; cinematic
Characteristic
Shot : A futuristic cyberpunk scene with a woman in a red and black outfit standing in the middle of a city street. The buildings are tall and sleek, with neon lights and holographic displays.
Aesthetic Score : 0.7
Mood : futuristic, cyberpunk, gritty
Quality
Entropy : 6.63
Noise : 87
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.95
Image errors : There are some minor artifacts in the image, such as aliasing around the edges of objects. The background is also a bit blurry, which may be intentional but could be improved.
A Futuristic Cityscape Bursting with Life
This vibrant aerial view captures a futuristic city teeming with activity. Towering skyscrapers, rendered in a mix of organic and geometric forms, create a sense of awe and wonder. The use of light and color emphasizes the city’s vastness and complexity, leaving you feeling both captivated and inspired.
Prompt
Stylized: Energetic and lively ; A panoramic view of a bustling city; long shot; Tourism; A vibrant and colorful cityscape; cinematic
Characteristic
Shot : A futuristic city viewed from above, with tall skyscrapers and a bridge in the background.
Aesthetic Score : 0.7
Mood : futuristic, urban, dynamic
Quality
Entropy : 6.46
Noise : 126
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.90
Image errors : There are some artifacts in the image, particularly around the edges of the buildings.
Solitude and Sunset: A Moment of Contemplation in the Desert
A lone figure finds peace amidst the vastness of the desert, silhouetted against a breathtaking sunset. The scene evokes a sense of serenity and contemplation, inviting viewers to reflect on the beauty and transience of life.
Prompt
Stylized: Serene and contemplative ; A lone traveler gazing at a breathtaking sunset; medium shot; Travel; A vast desert landscape; cinematic
Characteristic
Shot : A lone figure sits on a rock overlooking a vast desert landscape at sunset. The sky is ablaze with warm colors, and the sun is setting behind a distant mountain range.
Aesthetic Score : 0.8
Mood : tranquil, contemplative, peaceful
Quality
Entropy : 6.75
Noise : 74
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.50
Image errors : No visible artifacts or errors in the image.
Family Fun in the Sun: A Moment of Pure Joy
Capture the essence of family happiness with this heartwarming image. A family of three, radiating joy, runs through a sun-drenched meadow, their laughter echoing through the trees. The vibrant colors and dynamic composition evoke a sense of playful energy and pure delight.
Prompt
Stylized: Joyful and heartwarming ; A family laughing and playing in a park; medium shot; Family; A sunny and idyllic park setting; cinematic
Characteristic
Shot : A family, consisting of a father, mother and son, are joyfully running through a sunny forest. The son is in the middle, the father is on the right, and the mother is on the left. The father is wearing a red shirt, the mother is wearing a yellow dress, and the son is wearing a blue shirt and jeans. The background is a bokeh of green and yellow colors, which creates a sense of depth and vibrancy.
Aesthetic Score : 0.8
Mood : joyful, loving, playful
Quality
Entropy : 6.60
Noise : 87
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image is slightly blurry, and the characters’ faces are not very detailed.
A Solitary Figure Contemplates the Stormy Sea
A lone figure stands on a cliff, silhouetted against a dramatic sky, as a stormy sea rages below. The scene evokes a sense of isolation and grandeur, capturing the raw power of nature and the fragility of human existence.
Prompt
Stylized: Dramatic and powerful ; A lone figure standing on a cliff overlooking a vast ocean; long shot; Heroism; A stormy sea with dramatic clouds; cinematic
Characteristic
Shot : A lone figure stands at the edge of a rocky cliff overlooking a stormy sea. The sky is overcast with dark clouds, and the waves are crashing against the rocks.
Aesthetic Score : 0.7
Mood : dramatic, powerful, lonely
Quality
Entropy : 6.82
Noise : 101
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have been created using AI. The textures on the rocks appear unnatural and the lighting is slightly off.
Uncharted Territory: A Map Beckons
A weathered map, pinned with secrets, lies on a wooden table bathed in the soft glow of a dimly lit room. Smoke curls around the scene, hinting at a journey filled with mystery and adventure. The close-up perspective draws you into the intrigue, leaving you eager to discover what lies ahead.
Prompt
Stylized: Intriguing and mysterious ; A map with pins marking locations of hidden treasures; close-up; Adventure; A dimly lit room with antique furniture; cinematic
Characteristic
Shot : A dimly lit tavern or pub with a map and pins spread out on a wooden table. There’s smoke or steam in the air, creating a mysterious atmosphere.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, nostalgic
Quality
Entropy : 6.59
Noise : 76
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blurriness, especially in the background.
The Archer’s Focus
A lone archer, bathed in golden light, stands poised in a lush forest, his arrow drawn. The blurred background and his intense expression create a palpable sense of anticipation and danger, hinting at an imminent and thrilling event.
Prompt
Stylized: Intense and focused ; A player’s character, a skilled archer, aiming at a target; close-up; Gaming; A dark and mysterious forest; cinematic
Characteristic
Shot : A man in a forest, drawing a bow and arrow. The scene is lit with warm light, creating a sense of magic and danger.
Aesthetic Score : 0.6
Mood : mysterious, magical, tense
Quality
Entropy : 6.54
Noise : 81
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.90
Image errors : The man’s hand holding the bow is oddly shaped and the lighting on his face is a bit unnatural.
City Lights, Cozy Vibes: Friends Enjoy a Night Out
Three friends gather for a casual meal in a bustling city restaurant. The warm glow of the restaurant contrasts with the vibrant cityscape seen through a large window, creating a perfect blend of urban energy and cozy comfort.
Prompt
Stylized: Social and celebratory ; A group of friends enjoying a meal at a restaurant with a view; medium shot; Tourism; A bustling city street with vibrant lights; cinematic
Characteristic
Shot : Three people are sitting at a table in a restaurant, looking out the window at the city lights.
Aesthetic Score : 0.6
Mood : cozy, urban, night
Quality
Entropy : 6.53
Noise : 85
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slight blur, likely due to motion or low light conditions.
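The per-image numbers above can be tallied with a few lines of Python. This is only a descriptive sketch: the values are copied from the ten entries, while the `ENTRIES` layout and the `summarize` helper are illustrative assumptions. It does not reproduce the scores reported in the Conclusion, whose calculation is not described in this post.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten entries above.
ENTRIES = [
    ("Silhouetted Against the Setting Sun", 0.7, 0.32, 0.80),
    ("A Treasure Trove in the Dark", 0.7, 0.30, 0.90),
    ("Cyberpunk Shadow", 0.7, 0.33, 0.95),
    ("A Futuristic Cityscape Bursting with Life", 0.7, 0.26, 0.90),
    ("Solitude and Sunset", 0.8, 0.32, 0.50),
    ("Family Fun in the Sun", 0.8, 0.28, 0.90),
    ("A Solitary Figure Contemplates the Stormy Sea", 0.7, 0.29, 0.80),
    ("Uncharted Territory", 0.7, 0.31, 0.20),
    ("The Archer's Focus", 0.6, 0.34, 0.90),
    ("City Lights, Cozy Vibes", 0.6, 0.31, 0.20),
]


def summarize(entries):
    """Return simple averages plus the best and worst prompt matches."""
    return {
        "mean_aesthetic": mean([e[1] for e in entries]),
        "mean_clip": mean([e[2] for e in entries]),
        "mean_ai_likelihood": mean([e[3] for e in entries]),
        # Highest and lowest prompt CLIP score, by title.
        "best_prompt_match": max(entries, key=lambda e: e[2])[0],
        "worst_prompt_match": min(entries, key=lambda e: e[2])[0],
    }


summary = summarize(ENTRIES)
for name, value in summary.items():
    print(name, value)
```

On these values the mean aesthetic score is 0.70, the mean prompt CLIP score is about 0.306, and the mean likelihood of AI is 0.705. The closest prompt match is "The Archer's Focus" (0.34) and the weakest is the long-shot cityscape (0.26).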
Conclusion
The results show that the generative AI model performed very well on aesthetic analysis, acceptably on shot analysis, and below the “good” range on camera position.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t always accurately capture the intended camera positions described in the prompts.
- Shot Analysis: The model scored 0.545, which is within the “good” range. This indicates that the model generally understood the scene descriptions in the prompts and produced images with appropriate shot compositions.
- Aesthetic Analysis: The model scored 0.04, which is within the “very good” range of -0.2 to 0.1. This means the generated images closely matched the expected aesthetic style described in the prompts.
Overall, the model demonstrates a strong ability to understand and implement aesthetic styles. Its main weakness is interpreting camera positions. Shot composition is adequate, but at 0.545 it sits near the bottom of the “good” range, so it also has room to improve.
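The range checks described in the breakdown can be expressed as a short Python sketch. The scores and range bounds are taken from the text above. Treating the bounds as inclusive, and applying the 0.5 to 0.75 “good” range to shot analysis as well as camera position, are assumptions, and the `in_range` helper is illustrative only.

```python
# Ranges quoted in the breakdown; inclusive bounds are an assumption.
GOOD_RANGE = (0.5, 0.75)                 # camera position (assumed for shot analysis too)
VERY_GOOD_AESTHETIC_RANGE = (-0.2, 0.1)  # aesthetic analysis

# Scores reported in the breakdown.
SCORES = {
    "camera_position": 0.4,
    "shot_analysis": 0.545,
    "aesthetic_analysis": 0.04,
}


def in_range(score, bounds):
    """Return True when score lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= score <= high


results = {
    "camera_position": in_range(SCORES["camera_position"], GOOD_RANGE),
    "shot_analysis": in_range(SCORES["shot_analysis"], GOOD_RANGE),
    "aesthetic_analysis": in_range(
        SCORES["aesthetic_analysis"], VERY_GOOD_AESTHETIC_RANGE
    ),
}
print(results)
```

Under these assumptions, only the camera position score falls outside its target range, which matches the breakdown above.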