AI's Artistic Struggle: Capturing the Scene vs. the Style with Freepik
In the realm of artificial intelligence, the ability to generate images based on text prompts has become increasingly sophisticated. However, achieving a perfect balance between capturing the scene, camera position, and aesthetic style remains a challenge. This blog post examines the results of an experiment using a generative AI model to create images based on specific prompts, highlighting the model’s strengths and weaknesses in capturing these crucial elements.
Created with: Freepik
A Moment of Solitude on the Mountaintop
A lone hiker stands on a rocky ridge, dwarfed by the vastness of a misty valley and snow-capped peaks. The image evokes a sense of serenity, adventure, and contemplation, highlighting the beauty and power of nature.
Prompt
poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic
Characteristic
Shot : A lone hiker stands on a rocky ridge, gazing out at a vast expanse of mountains in the distance. The mountains are shrouded in mist, giving the scene an ethereal and somewhat mysterious quality. The sky is clear and blue, and the sun is shining brightly.
Aesthetic Score : 0.8
Mood : tranquil, contemplative, vast
Quality
Entropy : 6.31
Noise : 51
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible errors
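Every prompt in this post follows the same semicolon-separated pattern: a pose label with mood words, then subject, shot type, category, background, and style. The sketch below splits a prompt into those parts so the requested shot type can be compared with what the image actually shows. The field names and the `parse_prompt` helper are assumptions made for illustration; the post does not describe any tooling.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are illustrative labels, not terms defined by the post.
    pose: str
    moods: list[str]
    subject: str
    shot: str
    category: str
    background: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split 'poses <pose>: <moods> ; <subject>; <shot>; <category>; <background>; <style>'."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    head, subject, shot, category, background, style = fields
    # The first field holds the pose label and the comma-separated mood words.
    pose_label, _, mood_text = head.partition(":")
    pose = pose_label.removeprefix("poses").strip()
    moods = [mood.strip() for mood in mood_text.split(",") if mood.strip()]
    return PromptParts(pose, moods, subject, shot, category, background, style)


hiker = parse_prompt(
    "poses standing-tall: Determined, hopeful, awe-inspiring ; Lone adventurer; "
    "wide shot; Adventure; Majestic mountain range with a vast, clear sky; cinematic"
)
print(hiker.shot, "|", hiker.moods)
```

Only the first field is split on commas, so a background such as "a vast, clear sky" stays in one piece.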
Soldier’s Silhouette Against the Flames of War
A lone soldier stands amidst the ruins of a war-torn city, the fiery backdrop highlighting the intensity and danger of the situation. The dramatic pose and smoky atmosphere create a sense of urgency and impending conflict.
Prompt
poses standing-tall: Brave, defiant, resolute ; Soldier standing on a battlefield; medium shot; Heroism; Smoke and debris from a recent explosion; cinematic
Characteristic
Shot : A soldier stands in the midst of a war-torn cityscape, smoke and flames billowing behind him, creating a stark contrast with the soldier’s stoic composure.
Aesthetic Score : 0.7
Mood : dramatic, intense, somber
Quality
Entropy : 6.82
Noise : 61
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : There is some slight blurring in the smoke and flames, but it is not overly distracting.
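The post does not say how the Entropy value in each Quality block is computed. One common reading, assumed here, is the Shannon entropy of the image's grayscale histogram in bits. Under that assumption an 8-bit image can score at most 8, so values between 6 and 7 would indicate a fairly rich spread of tones. A minimal sketch using synthetic pixel lists rather than the actual images:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy in bits of a sequence of pixel intensities."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Synthetic stand-ins (assumed data, not taken from the generated images).
flat_grey = [128] * 4096                 # a single grey level
two_tone = [0, 255] * 2048               # two levels, equally frequent
full_range = list(range(256)) * 16       # all 256 levels, equally frequent

for name, pixels in [("flat", flat_grey), ("two-tone", two_tone), ("full", full_range)]:
    print(name, shannon_entropy(pixels))
```

The three synthetic cases mark the bottom, a low point, and the top of the scale for 8-bit data.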
Neon Nights: Friends Celebrate in a Glow of Joy
A group of friends light up the night with their smiles and laughter, captured in a dimly lit club bathed in vibrant neon light. The scene radiates happiness and a celebratory spirit, perfectly encapsulating the energy of a night out with friends.
Prompt
poses standing-tall: Joyful, triumphant, celebratory ; Group of friends celebrating a victory in a video game; close-up; Gaming; Neon lights and glowing screens of a gaming setup; cinematic
Characteristic
Shot : A group of friends are having fun at a club. They are all smiling and laughing.
Aesthetic Score : 0.6
Mood : joyful, energetic, celebratory
Quality
Entropy : 6.72
Noise : 63
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be overexposed, with some areas of the image being blown out. The background is somewhat distracting, and the composition could be more balanced.
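The Prompt Clip Score is also reported without a definition. A plausible reading, assumed here, is the cosine similarity between a CLIP text embedding of the prompt and a CLIP image embedding of the result. The sketch below shows only the similarity step, with short made-up vectors standing in for real embeddings; it does not call any CLIP model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings"; real CLIP embeddings have hundreds of dimensions.
text_vec = [1.0, 0.0, 0.0]
image_vec = [3.0, 4.0, 0.0]
score = cosine_similarity(text_vec, image_vec)
print(round(score, 2))
```

If the assumption holds, a higher score means the image embedding points in a direction closer to the prompt embedding.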
Sunset Serenity: Awe-Inspiring View from the Clifftop
Two figures stand silhouetted against a breathtaking sunset, overlooking a vast ocean and rolling green hills. The scene evokes a sense of tranquility and wonder, capturing the beauty of nature’s grand spectacle.
Prompt
poses standing-tall: Awe-struck, contemplative, peaceful ; Tourist standing on a cliff overlooking a breathtaking view; long shot; Tourism; Scenic landscape with rolling hills and a sparkling ocean; cinematic
Characteristic
Shot : Two figures stand on a rocky cliff overlooking a vast ocean with rolling green hills behind them.
Aesthetic Score : 0.8
Mood : tranquil, contemplative, adventurous
Quality
Entropy : 6.74
Noise : 88
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors.
Silhouettes of Love Against a Sunset Sea
A couple stands hand-in-hand on a ship’s deck, their silhouettes framed against a breathtaking sunset. The scene evokes a sense of romance, peace, and serenity, with the couple’s mystery adding a touch of intrigue.
Prompt
poses standing-tall: Romantic, adventurous, hopeful ; Couple standing on a ship’s deck; medium shot; Travel; Sunset over the ocean with a silhouette of a distant island; cinematic
Characteristic
Shot : A couple silhouetted against a sunset, standing on a ship deck, looking out at the ocean.
Aesthetic Score : 0.7
Mood : romantic, tranquil, serene
Quality
Entropy : 6.82
Noise : 55
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, but the image could be sharper.
Golden Spotlight: A Dancer Takes Center Stage
A young woman in a dazzling gold dance outfit commands attention as she performs under dramatic spotlights. The smoke-filled stage and dynamic composition create a sense of energy and confidence, highlighting the dancer’s captivating performance.
Prompt
poses standing-tall: Energetic, passionate, expressive ; Group of dancers performing on a stage; wide shot; Groups; Bright stage lights and a cheering audience; cinematic
Characteristic
Shot : A solo dancer in a gold costume performs on stage in front of a group of dancers in the background, all with arms outstretched and blending into a hazy, smoke-filled backdrop lit by spotlights.
Aesthetic Score : 0.7
Mood : dramatic, confident, powerful
Quality
Entropy : 6.80
Noise : 60
Prompt Clip Score : 0.19
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry in areas, particularly the dancers in the background. This could be due to motion blur or poor focus.
A Solitary Figure on the Moon: A Moment of Ethereal Isolation
An astronaut stands alone on the lunar surface, bathed in the soft glow of a full moon. The vast, desolate landscape and the astronaut’s small stature create a powerful sense of solitude and the insignificance of humanity against the backdrop of the cosmos. The scene evokes a sense of wonder and mystery, highlighting the ethereal beauty of space.
Prompt
poses standing-tall: Awe-inspiring, futuristic, surreal ; Astronaut standing on the surface of the moon; long shot; Adventure; Cratered lunar landscape with Earth in the distance; cinematic
Characteristic
Shot : A lone astronaut stands on the surface of the moon, silhouetted against a bright full moon in the dark sky. The astronaut is wearing a white spacesuit with an American flag patch, and is standing on a dusty, rocky lunar landscape.
Aesthetic Score : 0.7
Mood : solitude, awe, wonder
Quality
Entropy : 6.94
Noise : 82
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image is slightly blurry and the texture of the lunar surface is somewhat unrealistic.
Heroic Firefighter Battles Blaze in Dramatic Scene
A firefighter stands bravely against a raging inferno, smoke billowing and flames engulfing the building. The image captures the dramatic tension and heroism of their job, with striking use of light and shadow enhancing the mood.
Prompt
poses standing-tall: Brave, determined, selfless ; Firefighter standing in front of a burning building; medium shot; Heroism; Flames and smoke billowing from the building; cinematic
Characteristic
Shot : A firefighter in full gear stands in front of a burning building. The building is engulfed in flames, and smoke is billowing into the air.
Aesthetic Score : 0.6
Mood : dramatic, tense, heroic
Quality
Entropy : 6.84
Noise : 59
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable errors or artifacts in the image.
Gamer’s Triumph: A Moment of Pure Joy and Accomplishment
This image captures the pure joy and excitement of a gamer’s victory. The young man’s face radiates triumph as he holds his trophy aloft, surrounded by a cheering crowd. The scene is a testament to the power of hard work and dedication, and the image evokes a sense of shared celebration and accomplishment.
Prompt
poses standing-tall: Triumphant, proud, accomplished ; Gamer holding a trophy after winning a tournament; close-up; Gaming; Crowd cheering and flashing cameras; cinematic
Characteristic
Shot : A young man is holding a gold trophy in front of a cheering crowd. He is smiling and looking directly at the camera.
Aesthetic Score : 0.7
Mood : joyful, celebratory, triumphant
Quality
Entropy : 6.79
Noise : 57
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors or artifacts.
Family Adventure on a Snowy Mountaintop
A heartwarming scene of a family of four standing on a snow-covered mountain peak, gazing at the majestic mountain range in the distance. The image evokes feelings of joy, peace, and adventure, with the grandeur of the mountains contrasting beautifully with the warmth of the family.
Prompt
poses standing-tall: Joyful, united, adventurous ; Family standing on a mountain peak; wide shot; Travel; Panoramic view of snow-capped mountains and a clear blue sky; cinematic
Characteristic
Shot : A family of four standing in a snowy mountain landscape, holding hands and looking at each other.
Aesthetic Score : 0.7
Mood : happy, joyful, peaceful
Quality
Entropy : 6.71
Noise : 51
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors, only very slight noise in the shadow areas that is practically unnoticeable.
Conclusion
The results show that the generative AI model matched the requested aesthetic well but struggled with the scene and the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the generated images often did not use the camera position requested in the prompt.
- Shot Analysis: The model scored 0.46, which is also below average. This indicates that the images did not fully reflect the scene described in the prompt.
- Aesthetic Analysis: The model scored 0.05, which is considered very good. This means that the generated images closely matched the expected aesthetic style, despite the issues with camera position and scene understanding. Because the lowest of the three scores carries the best rating, lower values appear to indicate a closer match.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
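For a quick cross-check, the per-image figures reported above can be averaged directly. The sketch below uses only numbers that appear in this post; the short titles and the tuple layout are conveniences introduced here. It does not reproduce the camera position, shot, or aesthetic analysis scores in the breakdown, since the post does not explain how those were derived.

```python
from statistics import mean

# (short title, requested shot, aesthetic score, prompt CLIP score, likelihood of AI)
results = [
    ("Mountaintop hiker", "wide shot", 0.8, 0.25, 0.20),
    ("Soldier", "medium shot", 0.7, 0.30, 0.80),
    ("Neon friends", "close-up", 0.6, 0.30, 0.20),
    ("Clifftop sunset", "long shot", 0.8, 0.27, 0.20),
    ("Couple on ship", "medium shot", 0.7, 0.34, 0.20),
    ("Dancer", "wide shot", 0.7, 0.19, 0.10),
    ("Astronaut", "long shot", 0.7, 0.30, 0.80),
    ("Firefighter", "medium shot", 0.6, 0.30, 0.20),
    ("Gamer trophy", "close-up", 0.7, 0.31, 0.10),
    ("Family on peak", "wide shot", 0.7, 0.34, 0.20),
]

mean_aesthetic = mean(row[2] for row in results)
mean_clip = mean(row[3] for row in results)
mean_ai_likelihood = mean(row[4] for row in results)
weakest_match = min(results, key=lambda row: row[3])

print(f"mean aesthetic score:   {mean_aesthetic:.2f}")
print(f"mean prompt CLIP score: {mean_clip:.2f}")
print(f"mean likelihood of AI:  {mean_ai_likelihood:.2f}")
print(f"lowest prompt CLIP score: {weakest_match[0]} ({weakest_match[3]:.2f})")
```

The lowest prompt CLIP score belongs to the dancer image, where the prompt asked for a group of dancers in a wide shot and the result centred on a solo performer.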
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://www.freepik.com