AI's Artistic Struggle: Capturing the Essence of Poses with Freepik

Dramatic poses are a powerful tool in visual storytelling, conveying emotions, actions, and relationships. They are often used in photography, film, and art to create a sense of dynamism and engagement. However, capturing the essence of a dramatic pose in AI-generated imagery presents unique challenges. This blog post explores the results of an experiment in which a generative AI model was asked to create images from prompts that specify a pose and a scene. While the model showed some success in matching the requested camera position and shot description, it struggled to achieve the desired aesthetic. We delve into the reasons behind these limitations and discuss the potential for future improvements in AI-generated imagery.

Created with: Freepik

A Knight’s Brooding Vigil: Epic Stormy Landscape

A lone knight in full armor stands on a rocky outcrop, gazing towards a distant castle nestled in a valley. A dramatic stormy sky hangs overhead, casting a brooding mood over the scene. The knight’s cape billows in the wind, adding to the sense of grandeur and movement. This epic and dramatic scene evokes a sense of anticipation, mystery, and perhaps even a hint of danger.

Prompt

poses three-quarter-pose: determined, resolute, heroic ; A lone knight, standing tall on a windswept hilltop; three-quarter pose; Heroism; a vast, stormy landscape with a distant castle in the background; cinematic

Characteristic

Shot : A knight in shining armor stands on a rocky outcrop, looking out over a distant castle and a green valley. The sky is overcast, giving the scene a somber mood.

Aesthetic Score : 0.8

Mood : dramatic, epic, melancholic

Quality

Entropy : 6.84

Noise : 56

Prompt Clip Score : 0.29

AI Evaluation

Likelihood of AI : 0.90

Image errors : The edges of the knight’s armor and the castle walls appear slightly pixelated. The overall image has a slightly artificial feel.
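
Every prompt in this post follows the same field order: a pose tag with mood words, the subject, the pose again, a theme, the setting, and a style keyword. As a minimal sketch of how such a prompt could be assembled, the helper below reproduces that layout. The function name and its parameters are hypothetical and are not taken from any Freepik tool.

```python
def build_pose_prompt(pose, moods, subject, theme, setting, style="cinematic"):
    """Assemble a prompt in the field order used by the prompts in this post.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # "three-quarter pose" becomes the tag "three-quarter-pose"
    pose_tag = pose.replace(" ", "-")
    # The leading field keeps the trailing space seen before the first ";"
    header = f"poses {pose_tag}: {', '.join(moods)} "
    return "; ".join([header, subject, pose, theme, setting, style])


knight_prompt = build_pose_prompt(
    pose="three-quarter pose",
    moods=["determined", "resolute", "heroic"],
    subject="A lone knight, standing tall on a windswept hilltop",
    theme="Heroism",
    setting="a vast, stormy landscape with a distant castle in the background",
)
print(knight_prompt)
```

Keeping the fields separate makes it easy to vary one element, such as the pose or the setting, while holding the rest of the prompt constant.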

Silhouetted Against the Dawn: An Explorer’s Journey Begins

A lone explorer stands in the heart of a lush jungle, their silhouette stark against the hazy sunrise. A map in hand, they contemplate their next move, fueling a sense of adventure, mystery, and hope. The dramatic lighting creates an air of intrigue, drawing you into their world.

Prompt

poses three-quarter-pose: adventurous, curious, hopeful ; An intrepid explorer, silhouetted against the setting sun, holding a map; three-quarter pose; Adventure; a dense jungle with ancient ruins in the distance; cinematic

Characteristic

Shot : A lone adventurer, silhouetted against a hazy jungle backdrop, consults a map at the edge of a stone formation. The soft light of dawn filters through the dense foliage.

Aesthetic Score : 0.6

Mood : mysterious, adventurous, hopeful

Quality

Entropy : 6.78

Noise : 56

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.30

Image errors : The image appears slightly overexposed, particularly in the background. The edges of the map seem to blend in with the background, making it less clear as a visual element. The overall resolution is good, but some details, especially in the background, are lost due to the haze.
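
The post does not say how the Entropy figure for each image is computed. One common reading is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 to 8 bits, and the reported values between roughly 6 and 7 would be consistent with that. The sketch below assumes this interpretation and works on a plain list of pixel intensities. It is not the evaluation tool's own code.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: "Entropy" means histogram entropy of grayscale values.
    """
    counts = Counter(pixels)
    total = len(pixels)
    # Sum of -p * log2(p) over every intensity that actually occurs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat_image = [128] * 1000          # a single gray level carries no information
varied_image = list(range(256))    # every 8-bit level exactly once
print(shannon_entropy(flat_image), shannon_entropy(varied_image))
```

Under this reading, a higher value means the intensities are spread more evenly across the available range, which usually goes with more texture and detail.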

Neon Dreams: A Gamer’s Focus in a Futuristic Cityscape

A young man, bathed in neon light, sits intently in his gaming chair, headphones on, eyes fixed on the screen. The futuristic cityscape glimpsed through the window adds a layer of depth and intrigue to this image, capturing the essence of a gamer’s focus in a world of digital possibilities.

Prompt

poses three-quarter-pose: focused, intense, exhilarated ; A gamer, eyes glued to the screen, fingers flying across the keyboard; three-quarter pose; Gaming; a brightly lit gaming room with neon lights and a futuristic cityscape projected on the wall; cinematic

Characteristic

Shot : A young man is playing video games in a dimly lit room with a cityscape view behind him. He is wearing a headset and is focused on the game. The room is decorated with gaming gear and neon lights.

Aesthetic Score : 0.7

Mood : cyberpunk, futuristic, focused

Quality

Entropy : 6.78

Noise : 65

Prompt Clip Score : 0.32

AI Evaluation

Likelihood of AI : 0.90

Image errors : The background cityscape seems somewhat artificial and flat. There are some texture inconsistencies in the room. The fingers of the person’s left hand (right in the image) look slightly distorted.
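
The Prompt Clip Score is presumably a CLIP-style similarity between the prompt text and the generated image, although the post does not specify the model or any scaling. Such scores are typically the cosine similarity of a text embedding and an image embedding. The sketch below shows only that final step on embeddings that have already been computed. The short vectors are made-up placeholders, not real CLIP outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for a text and an image embedding
text_embedding = [0.2, 0.7, 0.1]
image_embedding = [0.1, 0.9, 0.3]
print(round(cosine_similarity(text_embedding, image_embedding), 2))
```

If the score is computed this way, all ten images in this post fall in a narrow band from 0.25 to 0.32, so differences between individual images should be read with some caution.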

Lost in the Parisian Dream

A young woman, bathed in the golden light of Paris, stands before the iconic Eiffel Tower, her gaze lost in contemplation. The bustling city life fades into the background, leaving only the grandeur of the moment and the woman’s quiet awe.

Prompt

poses three-quarter-pose: amazed, joyful, curious ; A tourist, gazing in awe at the Eiffel Tower, camera in hand; three-quarter pose; Tourism; a bustling Parisian street with cafes and shops lining the sidewalk; cinematic

Characteristic

Shot : A young woman in a beige trench coat is standing in front of the Eiffel Tower, looking up at it. There are other people in the background, walking around.

Aesthetic Score : 0.7

Mood : romantic, wistful, hopeful

Quality

Entropy : 6.77

Noise : 56

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image is slightly blurry, and there are some minor artifacts in the background.

Finding Freedom on the Mountaintop

A woman embraces the breathtaking view from a rocky mountain peak, her arms outstretched as she takes in the vast snow-capped range. The scene evokes a sense of joy, adventure, and serenity, capturing the essence of freedom and wonder.

Prompt

poses three-quarter-pose: free, exhilarated, adventurous ; A backpacker, standing on a mountain peak, arms outstretched, enjoying the view; three-quarter pose; Travel; a breathtaking panorama of snow-capped mountains and valleys; cinematic

Characteristic

Shot : A woman in hiking gear stands on a rock with her arms outstretched towards a snowy mountain range. She is smiling and enjoying the view.

Aesthetic Score : 0.7

Mood : joyful, adventurous, serene

Quality

Entropy : 6.57

Noise : 55

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.10

Image errors : No noticeable artifacts or errors.

Campfire Glow: Friends Gather Under the Stars

A cozy scene of friendship and warmth, captured under a starlit sky. The fire’s glow illuminates the faces of friends gathered around, creating a sense of intimacy and comfort against the cool night air.

Prompt

poses three-quarter-pose: happy, relaxed, connected ; A group of friends, laughing and sharing stories around a campfire; three-quarter pose; Groups; a serene forest clearing with stars twinkling in the night sky; cinematic

Characteristic

Shot : A group of friends are gathered around a campfire in a forest at night. There are stars in the sky, and the light from the fire is casting a warm glow on their faces. The scene is intimate and cozy, and the friends look happy and relaxed.

Aesthetic Score : 0.75

Mood : cozy, warm, intimate

Quality

Entropy : 6.18

Noise : 58

Prompt Clip Score : 0.30

AI Evaluation

Likelihood of AI : 0.10

Image errors : Some of the details in the image, such as the trees and the stars, are a bit blurry.

Superman Amidst the Ashes: A City in Ruin

A dramatic scene unfolds as Superman stands tall on a rubble-strewn rooftop, overlooking a cityscape consumed by fire and smoke. The ominous sky and distant skyline hint at a devastating disaster, while Superman’s heroic presence adds a layer of suspense and anticipation.

Prompt

poses three-quarter-pose: powerful, victorious, confident ; A superhero, standing triumphantly over a defeated villain; three-quarter pose; Heroism; a cityscape with smoke and debris in the background; cinematic

Characteristic

Shot : Superman standing on rubble in a destroyed city, with smoke and flames in the background.

Aesthetic Score : 0.7

Mood : epic, dramatic, heroic

Quality

Entropy : 6.84

Noise : 62

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.80

Image errors : The smoke and flames in the background look slightly unrealistic and are not fully integrated into the scene. The texture of Superman’s costume appears artificial.

Conquering the Peaks: Hikers Embrace the Majestic Mountain Landscape

Three adventurers navigate a challenging mountain trail, their determination fueled by the awe-inspiring beauty of snow-capped peaks. This serene and inspiring scene captures the spirit of exploration and the vastness of nature.

Prompt

poses three-quarter-pose: determined, focused, adventurous ; A group of adventurers, navigating a treacherous mountain path; three-quarter pose; Adventure; a rugged mountain range with snow-covered peaks and a deep valley below; cinematic

Characteristic

Shot : Three hikers walking along a mountain trail, with a stunning view of snow-capped peaks in the background.

Aesthetic Score : 0.7

Mood : adventure, nature, determination

Quality

Entropy : 6.75

Noise : 78

Prompt Clip Score : 0.25

AI Evaluation

Likelihood of AI : 0.10

Image errors : No visible artifacts or errors.

Intrigued Gazes: A Moment of Shared Mystery

Four young men gather around a table, their faces illuminated by the soft glow of the room. A pizza sits untouched in the center, as their eyes fix on something unseen, creating a palpable sense of curiosity and anticipation. The casual, relaxed atmosphere is punctuated by a subtle dramatic effect, leaving the viewer wondering what has captured their attention.

Prompt

poses three-quarter-pose: focused, competitive, excited ; A group of gamers, huddled around a table, strategizing their next move; three-quarter pose; Gaming; a dimly lit room with flickering computer screens and a stack of pizza boxes; cinematic

Characteristic

Shot : A group of young men are gathered around a table, eating pizza and looking intently at something off-camera. The scene is set in a dimly lit room with a warm color palette, giving it a cozy and intimate feel.

Aesthetic Score : 0.6

Mood : casual, relaxed, suspenseful

Quality

Entropy : 6.77

Noise : 64

Prompt Clip Score : 0.28

AI Evaluation

Likelihood of AI : 0.10

Image errors : No noticeable errors in the image.

Family Fun in the European City Square

A heartwarming scene of a happy family of four enjoying a day out in a charming European city square. Their smiles and laughter radiate joy and the warmth of family bonds, making this image a perfect reminder of the simple pleasures of travel and togetherness.

Prompt

poses three-quarter-pose: happy, joyful, memorable ; A family, standing in front of a famous landmark, smiling for a photo; three-quarter pose; Tourism; a vibrant city square with colorful buildings and street performers; cinematic

Characteristic

Shot : A family of five is posing for a photo in a European city square. They are in front of colorful buildings and there is a crowd of people in the background.

Aesthetic Score : 0.7

Mood : happy, cheerful, family

Quality

Entropy : 6.79

Noise : 69

Prompt Clip Score : 0.27

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image is slightly overexposed, and there is some noise in the background.
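
The per-image figures reported above can be collected and summarized in a few lines. The values below are transcribed from this post. The short titles, the summary itself, and the 0.5 cutoff for counting an image as "likely AI" are assumptions added here for illustration and are not part of the original evaluation.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI)
IMAGES = [
    ("Knight", 0.80, 0.29, 0.90),
    ("Explorer", 0.60, 0.28, 0.30),
    ("Gamer", 0.70, 0.32, 0.90),
    ("Paris", 0.70, 0.27, 0.10),
    ("Mountaintop", 0.70, 0.28, 0.10),
    ("Campfire", 0.75, 0.30, 0.10),
    ("Superman", 0.70, 0.28, 0.80),
    ("Hikers", 0.70, 0.25, 0.10),
    ("Gamers group", 0.60, 0.28, 0.10),
    ("Family", 0.70, 0.27, 0.10),
]

AI_THRESHOLD = 0.5  # assumed cutoff, not stated in the post


def summarize(images, ai_threshold=AI_THRESHOLD):
    """Return mean scores and the titles at or above the AI-likelihood cutoff."""
    return {
        "mean_aesthetic": mean([img[1] for img in images]),
        "mean_clip": mean([img[2] for img in images]),
        "flagged_as_ai": [img[0] for img in images if img[3] >= ai_threshold],
    }


summary = summarize(IMAGES)
print(summary)
```

With these inputs, three of the ten images (the knight, the solo gamer, and Superman) sit at or above the assumed cutoff, and the mean aesthetic score comes out at about 0.70.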

Conclusion

The results show that the generative AI model performed passably on camera position and shot analysis but below average on aesthetic analysis. Here’s a breakdown:

  • Camera Position: The model scored 0.3, which is below the “good” range of 0.5 to 0.75. This suggests that the model didn’t always accurately capture the intended camera positions described in the prompts.
  • Shot Analysis: The model scored 0.46, also below the “good” range. This indicates that the model struggled to understand and translate the scene descriptions in the prompts into the generated images.
  • Aesthetic Analysis: The model scored 0.29, which falls well outside the “very good” range of -0.2 to 0.1 (it lies above the upper bound). This means that the generated images didn’t match the expected aesthetic style as closely as they could have.

Overall, the model needs improvement in all three areas to produce images that more accurately reflect the prompts.
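
The comparison of each score against its target range can be sketched as a simple interval check. The scores and ranges come from the breakdown above. Reusing the 0.5 to 0.75 range for shot analysis is an assumption, since the post only says that score is also below the “good” range, and the names used here are illustrative.

```python
def in_range(score, low, high):
    """True when score lies inside the closed interval [low, high]."""
    return low <= score <= high


# metric name -> (reported score, (range low, range high))
RESULTS = {
    "camera_position": (0.30, (0.5, 0.75)),
    "shot_analysis": (0.46, (0.5, 0.75)),   # range assumed to match camera position
    "aesthetic_analysis": (0.29, (-0.2, 0.1)),
}

verdicts = {
    name: in_range(score, low, high)
    for name, (score, (low, high)) in RESULTS.items()
}

for name, ok in verdicts.items():
    print(f"{name}: {'within' if ok else 'outside'} target range")
```

All three metrics land outside their target ranges, which matches the overall assessment above.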
