AI Captures the Scene, But Struggles with the Shot with Freepik
A generated image has to get two things right: the elements of the scene and the perspective from which the scene is viewed. This blog post examines how well a generative AI model turns detailed scene descriptions into images, focusing on whether it captures the intended camera position and aesthetic style. The model shows a strong understanding of scene content, but it is less reliable at reproducing the requested camera angle.
Created with: Freepik
One Against Many: A Warrior’s Stand
A lone warrior, clad in armor, stands defiant against a vast army in a dusty field. The warm, hazy light casts long shadows, creating a dramatic scene of impending battle. The warrior’s resolute pose and the epic scale of the scene evoke a sense of courage and determination.
Prompt
poses dancing: triumphant, powerful ; A lone warrior; wide shot; heroism; a battlefield littered with fallen enemies; cinematic
Characteristic
Shot : A warrior in full armor leads a charge into battle, his sword raised high in the air. A line of soldiers follows him, marching towards an unseen enemy.
Aesthetic Score : 0.7
Mood : epic, dramatic, intense
Quality
Entropy : 6.73
Noise : 65
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as slight blurring in the background and some unnatural edges on the warrior’s armor. The lighting is a little flat, but overall the image is well-composed and has a strong sense of drama.
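The post does not say how the Entropy value in the Quality block is computed. One common reading is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels used equally). Under that reading, values around 6.7 would indicate fairly rich tonal variety. The sketch below assumes that definition and takes a plain list of pixel intensities; `shannon_entropy` is an illustrative name, not a function from the tool used for this post.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a sequence of 8-bit grayscale values.

    Assumption: the post's "Entropy" metric is a histogram entropy like
    this one; the source does not confirm the exact definition.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    total = len(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# A flat image carries no tonal information.
flat = [128] * 1000
# An image that uses all 256 levels equally reaches the 8-bit maximum.
uniform = list(range(256)) * 4

print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(uniform))  # 8.0
```

For a real image, the pixel list would come from a grayscale conversion of the file; that loading step is left out here to keep the sketch dependency-free.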
Adventure Awaits: Young Explorers Charge Through the Jungle
A group of five friends, clad in safari gear, race through a dusty jungle path towards an ancient stone structure. Their excited expressions and the lush, mysterious surroundings promise an adventure filled with thrills and discovery.
Prompt
poses dancing: excited, adventurous ; A group of explorers; medium shot; adventure; a dense jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A group of four young adults, three women and one man, are running through a jungle path, with a stone temple in the background. The scene evokes a sense of adventure and exploration, with the lush green vegetation and the ancient ruins creating a visually intriguing setting.
Aesthetic Score : 0.6
Mood : adventurous, playful, hopeful
Quality
Entropy : 6.80
Noise : 86
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
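Every prompt in this post follows the same semicolon-separated template: a pose and mood, a subject, a shot type, a theme, a setting, and a style tag. Splitting a prompt into those fields makes it easy to see which camera position was requested before comparing it with the generated image. The sketch below is one way to do that; the field names and the `parse_prompt` helper are labels chosen for illustration and are not part of Freepik's interface.

```python
from typing import NamedTuple


class ScenePrompt(NamedTuple):
    pose: str      # e.g. "poses dancing: excited, adventurous"
    subject: str
    shot: str      # requested camera position
    theme: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a prompt of the form used in this post into six fields."""
    parts = [part.strip() for part in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    return ScenePrompt(*parts)


prompt = (
    "poses dancing: excited, adventurous ; A group of explorers; medium shot; "
    "adventure; a dense jungle with ancient ruins in the background; cinematic"
)
parsed = parse_prompt(prompt)
print(parsed.shot)     # medium shot
print(parsed.subject)  # A group of explorers
```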
Lost in the Code: A Moment of Intense Focus
A young man, headphones on, sits in a dimly lit room, his eyes glued to a computer screen. The cluttered desk and bright lights create an atmosphere of intense concentration, hinting at a project demanding his full attention. This image captures the essence of a tech-savvy individual immersed in their work, fueled by passion and dedication.
Prompt
poses dancing: intense, focused ; A gamer; close-up; gaming; a brightly lit gaming setup with a screen displaying a virtual world; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer, working on a music production. He is focused on his work and looks very serious.
Aesthetic Score : 0.7
Mood : focused, serious, creative
Quality
Entropy : 6.69
Noise : 49
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible image errors.
Whimsical Romance: A Dance in the Heart of the City
A couple dances in the midst of a bustling street, their vibrant attire standing out against the urban backdrop. The woman, in a red dress, and the man, in a blue shirt, create a dynamic energy that is both romantic and joyful. The scene is further enlivened by the movement of the crowd and the city lights, making for a whimsical and exciting atmosphere.
Prompt
poses dancing: joyful, romantic ; A couple; medium shot; tourism; a bustling marketplace with vibrant colors and exotic goods; cinematic
Characteristic
Shot : A couple is dancing in the middle of a cobblestone street in a European town. The street is lined with buildings and there are lights and vendors in the background.
Aesthetic Score : 0.7
Mood : romantic, playful, happy
Quality
Entropy : 6.83
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is slight blurring on the subjects and slight camera shake.
Sunset Serenity: A Woman Walks Through the Desert
A woman in a flowing dress strolls through a breathtaking desert landscape as the sun sets, casting a warm glow over the scene. The peaceful and dreamy mood is enhanced by the dramatic effect of the sun highlighting the woman and the sand dunes. This image evokes a sense of tranquility and wonder.
Prompt
poses dancing: reflective, contemplative ; A traveler; long shot; travel; a vast desert landscape with a setting sun; cinematic
Characteristic
Shot : A woman in a brown dress walks through a desert at sunset.
Aesthetic Score : 0.75
Mood : tranquil, serene, adventurous
Quality
Entropy : 6.54
Noise : 54
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears to be slightly overexposed, and there is some noise in the shadows.
City Lights, Bright Smiles: Friendship Shines on a Rooftop
Four young women, their laughter echoing through the night, stroll across a rooftop overlooking a dazzling cityscape. The urban backdrop amplifies their joy and camaraderie, creating a moment of pure, carefree bliss.
Prompt
poses dancing: happy, carefree ; A group of friends; medium shot; groups; a rooftop overlooking a city skyline at night; cinematic
Characteristic
Shot : Four young women are walking on a rooftop in the city at night. They are laughing and having a good time, and the city skyline is visible in the background.
Aesthetic Score : 0.7
Mood : happy, carefree, youthful
Quality
Entropy : 6.58
Noise : 60
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors.
Lost in the Shadows: A Woman’s Alluring Mystery
A woman in a black dress, bathed in the warm glow of streetlights, stands alone in a dark alleyway. The mysterious lighting and composition create an air of intrigue, drawing you into her world.
Prompt
poses dancing: determined, defiant ; A lone dancer; close-up; heroism; a dark alleyway with flickering streetlights; cinematic
Characteristic
Shot : A woman in a black dress is standing in a dimly lit alleyway, with streetlights casting a warm glow on the scene. The alley is paved with cobblestones.
Aesthetic Score : 0.8
Mood : mysterious, alluring, urban
Quality
Entropy : 6.70
Noise : 50
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The lighting in the image is slightly uneven, causing some parts of the subject to be darker than others. There are also some minor artifacts in the background.
Joyful Escape: Three Friends Embrace the Mountain’s Majesty
Capture the spirit of adventure as three young women run and laugh through a breathtaking mountain landscape. Their carefree joy is amplified by the grandeur of their surroundings, creating a scene that radiates happiness and freedom.
Prompt
poses dancing: exhilarated, free ; A group of adventurers; wide shot; adventure; a breathtaking mountain range with a clear blue sky; cinematic
Characteristic
Shot : Three young women are running and laughing in a mountain landscape. The sky is blue and there are snow-capped mountains in the background.
Aesthetic Score : 0.7
Mood : joyful, carefree, adventurous
Quality
Entropy : 6.74
Noise : 63
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
Lost in the Game: A Gamer’s Focused Intensity
A young man is completely absorbed in his game, his face illuminated by the screen’s glow. The dimly lit room and his intense gaze create a sense of drama and focus, capturing the immersive experience of gaming.
Prompt
poses dancing: focused, strategic ; A gamer; close-up; gaming; a dimly lit room with a computer screen displaying a competitive game; cinematic
Characteristic
Shot : A young man wearing headphones is playing a video game on his computer. He is sitting at a desk with a keyboard and mouse in front of him. The room is dimly lit, and the screen of the computer is reflecting the light from the room.
Aesthetic Score : 0.7
Mood : focused, concentrated, determined
Quality
Entropy : 6.65
Noise : 47
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight amount of noise and grain. This is likely due to the low light conditions in which the image was taken. There is also some slight blurring in the background.
Love in Paradise: A Dance on the Shore
A romantic scene unfolds as a couple dances on a pristine white beach, their happiness accentuated by the vibrant blue sky. A third person walks in the background, adding to the carefree and joyous atmosphere. The couple’s love story is beautifully emphasized by their position and the contrasting colors, creating a dramatic effect.
Prompt
poses dancing: relaxed, joyful ; A family; medium shot; travel; a picturesque beach with turquoise water and white sand; cinematic
Characteristic
Shot : A couple is dancing on a beach, while another woman walks in the background.
Aesthetic Score : 0.7
Mood : romantic, happy, playful
Quality
Entropy : 6.56
Noise : 51
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant image errors detected.
Conclusion
The results show that the generative AI model performed well in understanding the scene and matching the aesthetic style, but struggled with the camera position. Here’s a breakdown:
- Camera Position: The model scored 0.45, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.61, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.07, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model demonstrates a good understanding of the scene and its aesthetic, but needs improvement in accurately capturing the intended camera position.
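For a quick cross-check of the per-image numbers reported above, the values can be averaged across the ten images. The sketch below copies the Aesthetic Score, Entropy, Noise, and Prompt Clip Score from each section of this post. It does not reproduce the Camera Position, Shot Analysis, or Aesthetic Analysis scores from the breakdown, because the post does not say how those are derived.

```python
from statistics import mean

# Per-image values copied from the sections above, in order of appearance:
# (aesthetic score, entropy, noise, prompt CLIP score)
results = [
    (0.70, 6.73, 65, 0.23),  # One Against Many
    (0.60, 6.80, 86, 0.33),  # Adventure Awaits
    (0.70, 6.69, 49, 0.24),  # Lost in the Code
    (0.70, 6.83, 72, 0.28),  # Whimsical Romance
    (0.75, 6.54, 54, 0.28),  # Sunset Serenity
    (0.70, 6.58, 60, 0.32),  # City Lights, Bright Smiles
    (0.80, 6.70, 50, 0.27),  # Lost in the Shadows
    (0.70, 6.74, 63, 0.27),  # Joyful Escape
    (0.70, 6.65, 47, 0.22),  # Lost in the Game
    (0.70, 6.56, 51, 0.28),  # Love in Paradise
]

labels = ("aesthetic", "entropy", "noise", "clip")
# zip(*results) turns the rows into one column per metric.
averages = {
    label: round(mean(column), 3)
    for label, column in zip(labels, zip(*results))
}

for label, value in averages.items():
    print(f"{label}: {value}")
```

On these values the averages come out at roughly 0.705 for the Aesthetic Score, 6.682 for Entropy, 59.7 for Noise, and 0.272 for the Prompt Clip Score.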
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://www.freepik.com