AI Captures Poses but Struggles with the Vibe: An Experiment with Freepik
In the realm of artificial intelligence, the ability to generate images based on textual prompts is a fascinating area of exploration. This blog post delves into an experiment where a generative AI model was tasked with creating images based on specific poses and scenes. While the model demonstrated a good understanding of camera positioning and shot composition, it struggled to capture the desired aesthetic. This raises questions about the model’s ability to truly understand and translate artistic intent. We’ll explore the model’s strengths and weaknesses, analyzing its performance in capturing the ‘wow’ factor, and discuss the implications for the future of AI-generated art.
Created with: Freepik
Silhouetted Serenity: A Man Contemplates the Sunset
A solitary figure sits atop a mountain, bathed in the golden hues of a setting sun. The scene evokes a sense of peace and contemplation, as the man gazes out at the sprawling valley below. The dramatic silhouette against the vibrant sky adds a touch of mystery, highlighting the individual’s connection to the vastness of nature.
Prompt
poses profile: Epic, hopeful, determined ; A lone figure, silhouetted against a setting sun; wide shot; Heroism; A vast, mountainous landscape; cinematic
Characteristic
Shot : A lone man sits on a rocky outcrop, looking out over a valley at a distant sunset
Aesthetic Score : 0.7
Mood : serene, contemplative, dramatic
Quality
Entropy : 6.58
Noise : 32
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors in the image.
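The post does not define how the Entropy values in these cards are computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, measured in bits. Under that assumption the maximum is 8, so values around 6.4 to 6.8 would indicate images with a fairly wide spread of tones. The sketch below uses only the standard library; `shannon_entropy` is a hypothetical helper name, not the tool used for this post.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this mirrors a histogram-based entropy metric; the post
    does not state which definition produced its numbers.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # p * log2(1/p), summed over every gray level that actually occurs
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A flat gray image carries no tonal information; an even ramp carries the most.
flat = [128] * 1000
ramp = list(range(256)) * 4
print(shannon_entropy(flat))  # 0.0
print(shannon_entropy(ramp))  # 8.0
```

To apply this to a real image, the pixel list would come from a grayscale conversion of the file, for example via an imaging library of your choice.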
Majestic Valley Views: A Hiker’s Paradise
Experience the breathtaking beauty of a vast valley, with a winding river and a cascading waterfall. This serene and adventurous scene evokes a sense of awe and wonder, capturing the majesty of nature.
Prompt
poses profile: Adventurous, free-spirited, awe-inspired ; A backpacker standing on a cliff edge, looking out at a breathtaking view; medium shot; Adventure; A sprawling valley with cascading waterfalls; cinematic
Characteristic
Shot : A lone hiker stands on a cliff overlooking a vast, verdant canyon with a river snaking through it. A waterfall cascades down the cliff face on the right side of the image.
Aesthetic Score : 0.8
Mood : serene, majestic, awe-inspiring
Quality
Entropy : 6.72
Noise : 71
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors observed.
Lost in the Game: A Gamer’s Intense Focus
A young man, bathed in the glow of his computer screen, is completely absorbed in his game. The dark room and his focused expression create a sense of intensity and drama, capturing the essence of a dedicated gamer.
Prompt
poses profile: Focused, intense, passionate ; A gamer’s hands, illuminated by the glow of a monitor, holding a controller; close-up; Gaming; A dimly lit room with gaming posters on the walls; cinematic
Characteristic
Shot : A young man in a dark room is playing video games, sitting at a desk with a gaming keyboard in front of him. He is wearing a headset and has a focused expression. The image is lit by a desk lamp and the glowing screen of the computer, casting a warm, inviting light on the scene.
Aesthetic Score : 0.7
Mood : intense, focused, immersive
Quality
Entropy : 6.40
Noise : 47
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No significant errors
Contemplation Before Grandeur
A young woman stands in quiet contemplation, her profile silhouetted against the imposing backdrop of a grand cathedral. The scene evokes a sense of peace and awe, highlighting the contrast between human fragility and architectural might.
Prompt
poses profile: Curious, excited, appreciative ; A tourist gazing up at a majestic cathedral; medium shot; Tourism; A bustling city square with cobblestone streets; cinematic
Characteristic
Shot : A young woman stands in profile, looking up at a large, beautiful cathedral. The scene is set in a city square, with many people in the background.
Aesthetic Score : 0.7
Mood : calm, contemplative, peaceful
Quality
Entropy : 6.82
Noise : 70
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable errors or artifacts
Lost in Thought: A Moment of Contemplation on a Train Journey
A young man gazes out the window of a moving train, his expression pensive as he observes the passing green landscape. The contrast between the confined interior and the expansive scenery evokes a sense of introspection and the fleeting nature of time.
Prompt
poses profile: Reflective, contemplative, nostalgic ; A traveler sitting on a train, looking out the window at passing scenery; medium shot; Travel; A scenic train journey through rolling hills and fields; cinematic
Characteristic
Shot : A young man sits by the window of a train looking out at a rural landscape of rolling hills and fields.
Aesthetic Score : 0.7
Mood : pensive, contemplative, nostalgic
Quality
Entropy : 6.78
Noise : 53
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors in the image.
Friendship, Laughter, and Warmth: A Celebration Captured
This heartwarming scene captures the essence of friendship and celebration. A group of friends gather around a table, their laughter and joy filling the air. String lights and candles create a warm and festive ambiance, while the soft lighting adds a touch of intimacy. The image evokes feelings of happiness, connection, and the special moments shared with loved ones.
Prompt
poses profile: Joyful, celebratory, connected ; A group of friends laughing and celebrating together; wide shot; Groups; A lively party with colorful decorations and music; cinematic
Characteristic
Shot : A group of friends are gathered around a table, laughing and enjoying each other’s company. The scene is lit by string lights and candles, creating a warm and inviting atmosphere.
Aesthetic Score : 0.8
Mood : joyful, celebratory, friendly
Quality
Entropy : 6.75
Noise : 65
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is well-lit and there are no noticeable errors.
Superhero Stands Tall, Cape Flowing in the Wind
A powerful image of a superhero silhouetted against a city skyline, their cape billowing dramatically in the wind. The scene evokes a sense of heroism, epic scale, and dramatic tension.
Prompt
poses profile: Powerful, confident, inspiring ; A superhero standing tall, cape billowing in the wind; medium shot; Heroism; A cityscape with towering skyscrapers; cinematic
Characteristic
Shot : A superhero in a red cape stands on a rooftop overlooking a city skyline, with the sun setting in the background.
Aesthetic Score : 0.6
Mood : heroic, dramatic, hopeful
Quality
Entropy : 6.75
Noise : 52
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.70
Image errors : The cape’s flow and texture are somewhat unrealistic, and the lighting is somewhat flat. The city skyline appears somewhat generic.
Lost Temple Beckons in Sun-Dappled Jungle
A group of explorers stand on a dirt path, their gaze drawn to a mysterious stone temple shrouded in vines and bathed in sunlight. The scene evokes a sense of adventure, awe, and the promise of discovery.
Prompt
poses profile: Intrigued, adventurous, determined ; A group of explorers navigating a dense jungle; wide shot; Adventure; Lush greenery, ancient ruins, and dappled sunlight; cinematic
Characteristic
Shot : A group of people are standing on a path in a jungle, looking at a ruined temple in the distance. The sun is shining through the trees, creating a dramatic effect.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, awe-inspiring
Quality
Entropy : 6.73
Noise : 93
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image looks like it was generated by AI. The trees are somewhat repetitive, the sunlight is too perfect, and there is a lack of detail in the shadows. The people have a plastic quality.
Neon Focus: A Young Man Immersed in the Digital World
A young man, bathed in the cool glow of neon lights, sits intently at his computer desk. Headphones on, he’s fully absorbed in the digital world, creating a scene of focused intensity and futuristic aesthetic.
Prompt
poses profile: Focused, competitive, determined ; A gamer’s face, lit by the screen, showing intense concentration; close-up; Gaming; A dimly lit room with a gaming setup and neon lights; cinematic
Characteristic
Shot : A young man is sitting at his computer desk, wearing headphones and a hoodie. The room is dimly lit with blue and purple lighting.
Aesthetic Score : 0.7
Mood : focused, intense, concentrated
Quality
Entropy : 6.51
Noise : 53
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight noise and grain, particularly in the shadows. The lighting is a bit harsh, especially on the man’s face. The colors are a bit oversaturated.
Sunset Romance on the Beach
A heartwarming scene of three friends walking hand-in-hand along a sandy beach as the sun sets in the distance. The warm glow of the sunset creates a romantic and serene atmosphere, capturing a moment of joy and companionship.
Prompt
poses profile: Romantic, peaceful, serene ; A couple holding hands, walking along a beach at sunset; medium shot; Tourism; A golden beach with turquoise waters and a vibrant sky; cinematic
Characteristic
Shot : Three people walking on a beach at sunset. The man is in the middle, and the women are on either side. The sun is setting behind them, and the water is calm and still.
Aesthetic Score : 0.7
Mood : romantic, peaceful, happy
Quality
Entropy : 6.62
Noise : 50
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
Conclusion
The results show that the generative AI model handled camera position and shot composition reasonably well, but struggled with the aesthetic of the images. Here’s a breakdown:
- Camera Position: The model scored 0.41, slightly below the “good” range of 0.5 to 0.75. The model captured the camera position described in the prompt only partially, so there’s room for improvement.
- Shot Analysis: The model scored 0.52, which falls within the “good” range. This indicates that the model was able to understand the shot composition in the prompt and translate it into the generated image.
- Aesthetic Analysis: The model scored 0.02, which the analysis rates as well short of “very good”. The range quoted for “very good”, -0.2 to 0.1, would actually contain 0.02, so the quoted bounds appear to be misstated. The reading taken here is that the aesthetic of the generated images deviated significantly from what the prompts called for.
Overall, the model shows a partial grasp of camera position and a good grasp of shot composition, but needs improvement in capturing the desired aesthetic.
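For a quick cross-check of the per-image numbers, the card values above can be averaged directly. This is only a sketch: the rows are copied from the cards in this post, the 0.5 cutoff for flagging an image as AI-looking is an assumption made for illustration, and these averages are not the Camera Position, Shot, or Aesthetic Analysis scores quoted in the conclusion, whose computation the post does not describe.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image cards in this post.
RESULTS = [
    ("Silhouetted Serenity", 0.7, 0.24, 0.20),
    ("Majestic Valley Views", 0.8, 0.28, 0.20),
    ("Lost in the Game", 0.7, 0.25, 0.20),
    ("Contemplation Before Grandeur", 0.7, 0.33, 0.10),
    ("Lost in Thought", 0.7, 0.30, 0.20),
    ("Friendship, Laughter, and Warmth", 0.8, 0.25, 0.10),
    ("Superhero Stands Tall", 0.6, 0.27, 0.70),
    ("Lost Temple Beckons", 0.7, 0.30, 0.80),
    ("Neon Focus", 0.7, 0.26, 0.20),
    ("Sunset Romance on the Beach", 0.7, 0.28, 0.10),
]

AI_THRESHOLD = 0.5  # assumed cutoff, not taken from the post


def summarize(rows):
    """Average each metric and list the images at or above the AI cutoff."""
    return {
        "aesthetic": round(mean(r[1] for r in rows), 3),
        "clip": round(mean(r[2] for r in rows), 3),
        "ai_likelihood": round(mean(r[3] for r in rows), 3),
        "flagged": [r[0] for r in rows if r[3] >= AI_THRESHOLD],
    }


summary = summarize(RESULTS)
print(summary)
```

With these rows, the mean aesthetic score is 0.71, the mean prompt CLIP score is 0.276, and the mean likelihood of AI is 0.28. The only two images at or above the assumed cutoff are the superhero and jungle temple scenes.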
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://www.freepik.com