AI's Artistic Eye: Capturing the Moment, But Missing the Mood with Freepik
AI image generation is evolving rapidly, and current models can create striking visuals from simple text prompts. However, while these models handle the technical aspects of a scene well, they often struggle to convey the intended mood or aesthetic. This article explores those strengths and weaknesses through a recent experiment that tested a model’s ability to create images from prompts specifying poses and scenes. The results suggest that the model handles shot composition well and camera position moderately well, but falls short in capturing the intended aesthetic, which highlights the ongoing challenge of achieving true artistic expression through AI.
Created with: Freepik
Silhouetted Triumph: A Lone Figure Conquers the Sunset
A solitary figure stands atop a mountain peak, their silhouette stark against a fiery sunset. Dramatic clouds partially obscure the setting sun, casting its rays through the sky and illuminating the surrounding mountain range. This inspirational scene evokes a sense of triumph, solitude, and hope.
Prompt
poses low-angle: inspiring, triumphant ; A lone figure standing atop a mountain peak, silhouetted against the rising sun; wide shot; heroism; majestic mountain range with clouds swirling below; cinematic
Characteristic
Shot : A lone figure stands on a mountain peak, silhouetted against the setting sun. The sky is filled with dramatic clouds, and the sun’s rays are casting a warm glow over the landscape.
Aesthetic Score : 0.8
Mood : epic, inspirational, dramatic
Quality
Entropy : 6.76
Noise : 43
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image appears to be slightly over-processed. The colors are a bit too saturated and the sky appears somewhat unnatural.
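The article does not say how the Entropy and Prompt Clip Score values are computed. A common reading, assumed here, is that Entropy is the Shannon entropy of the image’s grayscale histogram in bits (so 8 is the ceiling for 8-bit pixels), and that Prompt Clip Score is the cosine similarity between CLIP embeddings of the prompt and the image. The sketch below shows both formulas in plain Python; the function names and toy inputs are illustrative and are not the evaluation tool’s actual code.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over every intensity that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy inputs (hypothetical): a flat grey image and one that uses
# all 256 intensity levels equally often.
flat_image = [128] * 1024
uniform_image = list(range(256)) * 4

print(shannon_entropy(flat_image))     # no tonal variety at all
print(shannon_entropy(uniform_image))  # maximum variety for 8-bit pixels

# Toy embeddings (hypothetical): identical and orthogonal directions.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Under this reading, an entropy near 6.8 would indicate a fairly wide tonal spread, and a prompt score near 0.3 would be a moderate text-image match rather than a near-perfect one.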
Uncharted Territory: A Glimpse into the Unknown
Three explorers, clad in rugged gear, stand amidst a vibrant jungle, their gazes fixed on something unseen. The lush greenery, towering trees, and ancient stone structure create a sense of mystery and adventure, leaving the viewer eager to discover what lies beyond the frame.
Prompt
poses low-angle: mysterious, adventurous ; A group of explorers navigating a dense jungle, their faces illuminated by the light of their headlamps; medium shot; adventure; lush green foliage and ancient ruins in the background; cinematic
Characteristic
Shot : Three young explorers in a jungle setting, looking ahead with curiosity. There is dense foliage, a stone path, and a misty sky in the background.
Aesthetic Score : 0.7
Mood : adventurous, mysterious, hopeful
Quality
Entropy : 6.76
Noise : 73
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable image errors.
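Every prompt in this article follows the same semicolon-separated layout: camera angle and moods, subject, shot size, theme, background, and style. The sketch below splits a prompt into those parts, which makes it easier to compare what was requested with what the Characteristic block reports. The field names and the `parse_prompt` helper are hypothetical labels chosen for illustration; Freepik does not define them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PosePrompt:
    camera: str        # e.g. "poses low-angle"
    moods: List[str]   # e.g. ["mysterious", "adventurous"]
    subject: str
    shot: str          # e.g. "medium shot"
    theme: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    """Split a prompt of the form used in this article into named fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 semicolon-separated parts, got {len(parts)}")
    head, subject, shot, theme, background, style = parts
    # The first part looks like "poses low-angle: mood, mood".
    camera, _, mood_text = head.partition(":")
    moods = [m.strip() for m in mood_text.split(",") if m.strip()]
    return PosePrompt(camera.strip(), moods, subject, shot, theme, background, style)


example = parse_prompt(
    "poses low-angle: mysterious, adventurous ; A group of explorers navigating "
    "a dense jungle, their faces illuminated by the light of their headlamps; "
    "medium shot; adventure; lush green foliage and ancient ruins in the "
    "background; cinematic"
)
print(example.camera, example.moods, example.shot)
```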
The Intensity of the Game
A young man is completely engrossed in a video game, his focus unwavering as he prepares for a crucial moment. The dimly lit room and his intense gaze create a palpable sense of tension and excitement, drawing the viewer into the heart of the action.
Prompt
poses low-angle: intense, focused ; A gamer’s hands intensely manipulating a controller, their face illuminated by the glow of the monitor; close-up; gaming; a vibrant, futuristic cityscape projected on the screen; cinematic
Characteristic
Shot : A young man is playing a video game on a computer in a dimly lit room. He is holding a controller in his right hand and is looking at the screen with a focused expression. The screen is showing a cityscape with a river in the background.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.71
Noise : 50
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, especially in the background.
Contemplating the Past: A Grand Statue in a City Square
A majestic statue of a man in a toga, standing tall on a pedestal in a city square, evokes a sense of grandeur and historical significance. The statue’s outstretched arm and gaze towards the sky create a powerful and contemplative mood, inviting viewers to reflect on the past and its enduring legacy.
Prompt
poses low-angle: awe-inspiring, historical ; A towering statue of a historical figure, viewed from the perspective of a tourist looking up in awe; wide shot; tourism; a bustling city square with other tourists and vendors; cinematic
Characteristic
Shot : A statue of a man in a classical pose, standing on a pedestal in front of a building with a blue sky in the background
Aesthetic Score : 0.6
Mood : serious, historical, grand
Quality
Entropy : 6.73
Noise : 42
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors
Lost in the Golden Hour
A solitary figure stands on a sand dune, silhouetted against the setting sun, contemplating the vastness of the desert landscape. The scene evokes a sense of solitude and insignificance, highlighting the power and beauty of nature.
Prompt
poses low-angle: solitude, contemplative ; A lone traveler gazing out at a vast desert landscape, their back to the camera; medium shot; travel; endless sand dunes stretching out to the horizon; cinematic
Characteristic
Shot : A lone figure stands on a sand dune in a vast desert landscape, looking out towards the distant horizon.
Aesthetic Score : 0.7
Mood : solitude, vastness, contemplative
Quality
Entropy : 6.32
Noise : 53
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : None
Confetti Rain and Smiles: Capturing the Joy of a Concert
This image captures the vibrant energy of a concert, with a crowd of smiling faces bathed in confetti. The atmosphere is electric, reflecting the joy and excitement of the event.
Prompt
poses low-angle: joyful, celebratory ; A group of friends celebrating a victory, their arms raised in the air, viewed from the perspective of someone standing below; wide shot; groups; a brightly lit party scene with confetti and balloons; cinematic
Characteristic
Shot : A group of people cheer at a concert as confetti falls from the ceiling.
Aesthetic Score : 0.7
Mood : happy, excited, joyous
Quality
Entropy : 6.82
Noise : 78
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : There is some noise in the image, particularly in the background.
Firefighter Braves Blazing Inferno
A lone firefighter stands defiant against a raging inferno, smoke and flames billowing around them. The dramatic contrast of light and shadow, and the hero’s unwavering stance, capture the intensity and danger of the scene.
Prompt
poses low-angle: intense, heroic ; A lone firefighter battling a raging inferno, their silhouette framed against the flames; medium shot; heroism; a burning building with smoke billowing into the sky; cinematic
Characteristic
Shot : A firefighter in full gear stands in front of a burning building, looking back at the flames.
Aesthetic Score : 0.7
Mood : intense, dramatic, heroic
Quality
Entropy : 6.87
Noise : 54
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly grainy in some areas.
Tiny Figures Against a Mighty Mountain: Climbers Conquer a Steep Descent
Witness the breathtaking scale of nature as a group of climbers rappel down a sheer cliff face. The vast valley and winding river below offer a stunning backdrop to their daring adventure, leaving viewers in awe of their courage and the beauty of the natural world.
Prompt
poses low-angle: thrilling, adventurous ; A group of adventurers rappelling down a sheer cliff face, their ropes dangling below; medium shot; adventure; a breathtaking view of a mountain range and a valley below; cinematic
Characteristic
Shot : A group of climbers are rappelling down a steep rock face with a stunning view of a valley below.
Aesthetic Score : 0.7
Mood : adventure, daring, determination
Quality
Entropy : 6.65
Noise : 97
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major artifacts or errors, though the image may be slightly overexposed.
In the Zone: A Gamer’s Hands Dance Across the Keyboard
A close-up shot captures the intensity of a gamer’s focus as their hands fly across the keyboard. The dimly lit room and abstract video game scene in the background create a moody atmosphere, highlighting the player’s deep immersion in the digital world.
Prompt
poses low-angle: immersive, fantastical ; A gamer’s hands deftly navigating a virtual world, their fingers flying across the keyboard; close-up; gaming; a vibrant, fantasy world displayed on the monitor; cinematic
Characteristic
Shot : A person’s hands are typing on a keyboard in front of a computer monitor. The monitor is displaying a colorful scene from a video game.
Aesthetic Score : 0.6
Mood : intense, focused, techy
Quality
Entropy : 6.75
Noise : 48
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors apart from a slight blur on the hands.
Sun-Kissed Temple: A Moment of Tranquility
A group of tourists stand in awe before a majestic ancient temple, bathed in the golden glow of the setting sun. The play of light and shadow creates a sense of mystery and peace, inviting you to step into this timeless scene.
Prompt
poses low-angle: awe-inspiring, historical ; A group of tourists standing in awe before a magnificent ancient temple, their faces illuminated by the setting sun; wide shot; tourism; a sprawling temple complex with intricate carvings and statues; cinematic
Characteristic
Shot : A group of people are walking up the steps of a grand temple. The temple is very detailed and has a beautiful symmetrical design. The setting sun casts a warm glow over the scene.
Aesthetic Score : 0.7
Mood : peaceful, adventurous, historical
Quality
Entropy : 6.76
Noise : 76
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors. The image is well-exposed and focused.
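The per-image figures listed above are easier to compare side by side. The following sketch copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten listings into a list and averages them. The values come from this article; the shortened titles and variable names are illustrative.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image listings in this article.
RESULTS = [
    ("Silhouetted Triumph", 0.8, 0.27, 0.70),
    ("Uncharted Territory", 0.7, 0.29, 0.20),
    ("The Intensity of the Game", 0.6, 0.29, 0.20),
    ("Contemplating the Past", 0.6, 0.27, 0.10),
    ("Lost in the Golden Hour", 0.7, 0.27, 0.10),
    ("Confetti Rain and Smiles", 0.7, 0.27, 0.30),
    ("Firefighter Braves Blazing Inferno", 0.7, 0.28, 0.20),
    ("Tiny Figures Against a Mighty Mountain", 0.7, 0.30, 0.20),
    ("In the Zone", 0.6, 0.26, 0.20),
    ("Sun-Kissed Temple", 0.7, 0.33, 0.20),
]

avg_aesthetic = round(mean(r[1] for r in RESULTS), 3)
avg_clip = round(mean(r[2] for r in RESULTS), 3)
avg_ai = round(mean(r[3] for r in RESULTS), 3)

best_prompt_match = max(RESULTS, key=lambda r: r[2])[0]
most_ai_looking = max(RESULTS, key=lambda r: r[3])[0]

print(f"average aesthetic score:   {avg_aesthetic}")
print(f"average prompt CLIP score: {avg_clip}")
print(f"average likelihood of AI:  {avg_ai}")
print(f"closest prompt match:      {best_prompt_match}")
print(f"most AI-looking image:     {most_ai_looking}")
```

With these inputs the averages come out at 0.68 for aesthetic score, 0.283 for prompt CLIP score, and 0.24 for likelihood of AI. The temple image has the highest prompt CLIP score (0.33), and the mountain sunset is the image most likely to be flagged as AI-generated (0.70).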
Conclusion
The results show that the generative AI model performed well in shot analysis, but was only okay in camera position and aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered okay. This means the camera position in the generated images was somewhat different from what the prompts requested.
- Shot Analysis: The model scored 0.605, which is considered good. This indicates the shot composition of the generated images was fairly close to what the prompts described.
- Aesthetic Analysis: The model scored 0.31, which is considered okay. This suggests the aesthetic of the generated images was somewhat different from what the prompts called for.
Overall, the model seems to be better at understanding and implementing shot composition than camera position or aesthetics.
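The article labels the three scores as okay or good without giving the cut-offs. As an illustration only, the sketch below assumes hypothetical thresholds (0.3 for okay, 0.6 for good) that happen to reproduce those labels, then ranks the three categories. The thresholds, the "poor" label, and the `label` function are assumptions and are not part of the evaluation tool.

```python
# Scores reported in the conclusion of this article.
SCORES = {
    "camera position": 0.4,
    "shot analysis": 0.605,
    "aesthetic analysis": 0.31,
}


def label(score, okay_from=0.3, good_from=0.6):
    """Map a 0..1 score to a label using assumed (hypothetical) cut-offs."""
    if score >= good_from:
        return "good"
    if score >= okay_from:
        return "okay"
    return "poor"


# Rank categories from strongest to weakest.
ranked = sorted(SCORES.items(), key=lambda item: item[1], reverse=True)

for name, score in ranked:
    print(f"{name}: {score} ({label(score)})")
```

Ranked this way, shot analysis comes first, followed by camera position and then aesthetic analysis, which matches the summary above.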