AI's Artistic Struggle: Capturing the Essence of Poses with Titan-g1
Generating images from textual descriptions is a rapidly evolving area of artificial intelligence. This experiment assessed how well a generative AI model captures the essence of poses and scenes. The model was given a series of prompts, each describing a pose, a scene, and an intended aesthetic. While the model demonstrated a decent understanding of camera positions and shots, it struggled to achieve the desired aesthetic, which highlights the challenge of translating artistic concepts into visual outputs. This blog post walks through the results of the experiment, analyzes the model’s strengths and weaknesses, and considers the implications for the future of AI-generated art.
Created with: titan-g1
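Every prompt in this post follows the same layout: a pose label before the colon, then six semicolon-separated descriptors. As a rough illustration, the sketch below splits a prompt into named parts. The field names (mood, subject, shot, theme, scene, style) are assumptions made for readability and are not names used by the model or by the original experiment.

```python
from typing import NamedTuple


class PosePrompt(NamedTuple):
    """One prompt split into parts; the field names are illustrative assumptions."""
    pose: str
    mood: str
    subject: str
    shot: str
    theme: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PosePrompt:
    # The text before the first colon names the pose, e.g. "poses walking-away".
    head, _, rest = prompt.partition(":")
    pose = head.replace("poses", "", 1).strip()
    # The remainder is a semicolon-separated list of six descriptors.
    fields = [part.strip() for part in rest.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields after the pose, got {len(fields)}")
    return PosePrompt(pose, *fields)


if __name__ == "__main__":
    example = (
        "poses walking-away: Melancholy, yet hopeful ; figure in a tattered cloak; "
        "wide shot; Heroism; Ruins of a fallen city bathed in the golden light of "
        "a setting sun; cinematic"
    )
    print(parse_prompt(example))
```

Splitting the prompts this way makes it easier to compare what was asked for (for example the shot type) with what the "Shot" description of each image reports.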
A Shadow in the Sunset
A solitary figure, cloaked in black, walks away from a crumbling building as the sun sets, casting long shadows. The scene evokes a sense of melancholy and mystery, leaving the viewer to wonder about the figure’s past and future.
Prompt
poses walking-away: Melancholy, yet hopeful ; figure in a tattered cloak; wide shot; Heroism; Ruins of a fallen city bathed in the golden light of a setting sun; cinematic
Characteristic
Shot : A woman in a long black cloak walks away from the camera towards the remains of a building. The sun is setting in the background, casting a warm glow on the scene.
Aesthetic Score : 0.6
Mood : mysterious, melancholic, dramatic
Quality
Entropy : 6.70
Noise : 98
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some slight blurriness in the image, particularly in the background. The image is also a bit too saturated.
Lost in the Jungle: A Journey of Discovery
A young woman, backpack in tow and hat shielding her eyes, ventures deep into the lush jungle. Her raised arm speaks of freedom and wonder, inviting you to join her on this adventurous exploration. The mystery of her destination and the anticipation of what lies ahead create a captivating sense of wanderlust.
Prompt
poses walking-away: Excited, adventurous ; A young adventurer with a backpack; medium shot; Adventure; Lush jungle with a hidden path leading into the unknown; cinematic
Characteristic
Shot : A woman wearing a backpack and a hat walks through a lush green jungle path
Aesthetic Score : 0.6
Mood : adventurous, wanderlust, nature
Quality
Entropy : 6.85
Noise : 116
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is slightly blurry, and there are some visible noise artifacts.
Lost in the City Lights
A solitary figure gazes out at a vibrant, yet distant cityscape, lost in thought amidst the blurred glow of urban life. The scene evokes a sense of pensive contemplation and the isolating beauty of a futuristic world.
Prompt
poses walking-away: Focused, determined ; A gamer with a headset; close-up; Gaming; Neon-lit cityscape reflected in a computer screen; cinematic
Characteristic
Shot : A man is wearing headphones and looking out of a window at a city skyline at night
Aesthetic Score : 0.6
Mood : focused, contemplative, urban
Quality
Entropy : 6.94
Noise : 102
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no significant artifacts or errors in the image. Some minor blurring around the edges of the man’s face and the city skyline might be present due to the use of depth of field.
Love Blooms in a Vibrant European City
A couple strolls hand-in-hand down a charming cobblestone street, their smiles reflecting the joy and romance of their surroundings. The brightly colored buildings create a picturesque backdrop, adding a touch of old-world charm to this vibrant scene.
Prompt
poses walking-away: Romantic, carefree ; A couple holding hands; medium shot; Tourism; Picturesque European street with cobblestone paths and colorful buildings; cinematic
Characteristic
Shot : A couple is walking down a cobblestone street in a European city. The street is lined with old buildings, and the sky is clear and sunny.
Aesthetic Score : 0.7
Mood : romantic, whimsical, happy
Quality
Entropy : 6.79
Noise : 99
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly overexposed and the colors are a little washed out.
Lost in the Vastness: A Woman’s Solitary Journey Begins
A lone figure walks towards the camera, luggage in tow, on a vast airport runway. The plane she’s leaving on looms in the distance, a symbol of both escape and the unknown. The perspective emphasizes her smallness, creating a poignant sense of loneliness and contemplation as she embarks on her journey.
Prompt
poses walking-away: Nostalgic, bittersweet ; A lone traveler with a suitcase; long shot; Travel; Airport runway with a departing airplane in the distance; cinematic
Characteristic
Shot : A woman is walking on a runway with a suitcase, a plane has just taken off in the background.
Aesthetic Score : 0.6
Mood : minimalistic, lonely, hopeful
Quality
Entropy : 6.24
Noise : 99
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable errors.
Sun-Kissed Smiles and Laughter on the Beach
Capture the essence of carefree friendship with this heartwarming image. Four friends share laughter and joy as they stroll along a pristine beach, bathed in the golden glow of natural light. The vibrant blue sky and sparkling water create a picturesque backdrop for this moment of pure happiness.
Prompt
poses walking-away: Joyful, carefree ; A group of friends laughing; wide shot; Groups; Beach at sunset with the ocean waves crashing in the background; cinematic
Characteristic
Shot : A group of friends walking on a beach, laughing and enjoying each other’s company. The background shows ocean waves crashing on the shore, creating a sense of tranquility and excitement.
Aesthetic Score : 0.7
Mood : happy, carefree, friendly
Quality
Entropy : 6.76
Noise : 98
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, particularly in the background. There is also some noise present.
Lost in the Woods: A Medieval Mystery
A lone woman, cloaked in medieval garb, disappears into the depths of a shadowy forest. Her solitary journey evokes a sense of mystery and adventure, leaving viewers to wonder what secrets lie ahead.
Prompt
poses walking-away: Determined, resolute ; A warrior; medium shot; Heroism; Dark forest with a path leading into the shadows; cinematic
Characteristic
Shot : A lone female figure walks away from the camera on a path through a forest. The light is dim and the overall atmosphere is moody.
Aesthetic Score : 0.7
Mood : mysterious, dark, melancholic
Quality
Entropy : 6.71
Noise : 107
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some minor artifacts, particularly in the shadows and the leaves. The overall sharpness is also slightly lacking.
Adventure Awaits: Three Friends Discover a Hidden Path
A trio of young explorers stand before an ancient stone archway, their faces lit by the warm glow of sunlight filtering through the ruins. Their expressions are filled with excitement and curiosity as they study a map, hinting at the thrilling journey that lies ahead. The dramatic lighting creates a sense of mystery and wonder, emphasizing the importance of their adventure.
Prompt
poses walking-away: Curious, excited ; A group of explorers with maps; wide shot; Adventure; Ancient ruins with a mysterious entrance; cinematic
Characteristic
Shot : Three young people are walking through an old stone archway. They are looking at a map and seem to be enjoying their time together. The image is taken from a low angle, giving the viewer a sense of looking up at the people.
Aesthetic Score : 0.6
Mood : happy, adventurous, curious
Quality
Entropy : 6.88
Noise : 105
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry and noisy. There are some artifacts around the edges of the image.
Lost in the Virtual World: A Moment of Pure Gaming Joy
A gamer, eyes lit by the screen, is fully immersed in a fantastical virtual landscape. The excitement and focus on their face, combined with the dramatic lighting, capture the thrill of a truly immersive gaming experience.
Prompt
poses walking-away: Immersed, excited ; A gamer with a controller; close-up; Gaming; Virtual reality headset with a fantastical world displayed; cinematic
Characteristic
Shot : A person is wearing a VR headset and holding a controller, they are looking at a blurred background of a fantasy scene.
Aesthetic Score : 0.7
Mood : excited, playful, futuristic
Quality
Entropy : 6.85
Noise : 100
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant errors.
Silhouettes of Hope: A Family Embarks on a New Journey
A family of three, bathed in golden light, walks away from the camera on a train platform, their silhouette a testament to their hopeful journey. The train in the background and the suitcase they pull behind them hint at a new adventure, leaving a sense of tranquility and anticipation in their wake.
Prompt
poses walking-away: Emotional, bittersweet ; A family with luggage; long shot; Travel; Train station platform with a departing train in the distance; cinematic
Characteristic
Shot : A family of three, a man, a woman, and a young girl, walking away from the camera on a train platform. The train is in the background.
Aesthetic Score : 0.6
Mood : tranquil, family, travel
Quality
Entropy : 6.89
Noise : 104
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a few minor artifacts, such as some noise in the shadows and some slight blurring around the edges.
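For a quick overview of the ten images, the per-image numbers listed above can be averaged. The sketch below does only that arithmetic; the values are copied from the entries in order of appearance, and the dictionary keys are illustrative names. These averages are not the scores reported in the conclusion, since the post does not say how those were derived.

```python
from statistics import mean

# Per-image values copied from the ten entries above, in order of appearance.
# The key names are illustrative and not part of the original evaluation tool.
METRICS = {
    "aesthetic_score": [0.6, 0.6, 0.6, 0.7, 0.6, 0.7, 0.7, 0.6, 0.7, 0.6],
    "entropy": [6.70, 6.85, 6.94, 6.79, 6.24, 6.76, 6.71, 6.88, 6.85, 6.89],
    "noise": [98, 116, 102, 99, 99, 98, 107, 105, 100, 104],
    "prompt_clip_score": [0.29, 0.25, 0.30, 0.25, 0.25, 0.25, 0.27, 0.24, 0.24, 0.25],
    "likelihood_of_ai": [0.20, 0.20, 0.20, 0.20, 0.20, 0.10, 0.20, 0.10, 0.10, 0.10],
}


def summarize(metrics):
    """Return the mean, minimum, and maximum of each metric."""
    return {
        name: {"mean": round(mean(values), 3), "min": min(values), "max": max(values)}
        for name, values in metrics.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(METRICS).items():
        print(f"{name}: mean={stats['mean']} min={stats['min']} max={stats['max']}")
```

The spread is narrow on every metric: aesthetic scores sit at 0.6 or 0.7, and the prompt CLIP scores stay between 0.24 and 0.30.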
Conclusion
The results show that the generative AI model performed reasonably on camera position and shot analysis, but struggled with aesthetic analysis. Here’s a breakdown:
Camera Position:
- Score: 0.45
- Interpretation: This score falls below the “good” range of 0.5 to 0.75. It suggests that the model didn’t perfectly capture the intended camera positions described in the prompt.
Shot Analysis:
- Score: 0.5
- Interpretation: This score falls right at the lower end of the “good” range. It indicates that the model was able to understand the scene in the prompt reasonably well, but there might be some discrepancies between the intended shot and the generated image.
Aesthetic Analysis:
- Score: 0.11
- Interpretation: This score falls just above the “very good” range of -0.2 to 0.1, exceeding its upper bound by 0.01. It suggests that the generated images’ aesthetic deviated from the expected aesthetic described in the prompts.
Overall:
While the model demonstrated a decent understanding of camera positions and shots, it struggled to achieve the desired aesthetic. This suggests that the model might need further training to better understand and translate aesthetic concepts into visual outputs.
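The comparisons in the breakdown above come down to checking each score against a range. The sketch below shows that check and how far each score sits from the range it is compared to. The scores and range bounds are taken from this post; treating the bounds as inclusive is an assumption, and the function names are illustrative.

```python
def in_range(score, low, high):
    """True if the score lies within [low, high]; inclusive bounds are an assumption."""
    return low <= score <= high


def distance_outside(score, low, high):
    """How far the score lies outside [low, high]; 0.0 if it is inside."""
    if score < low:
        return low - score
    if score > high:
        return score - high
    return 0.0


# (metric, score, range label, low, high) as quoted in the breakdown above.
REPORTED = [
    ("Camera Position", 0.45, "good", 0.5, 0.75),
    ("Shot Analysis", 0.5, "good", 0.5, 0.75),
    ("Aesthetic Analysis", 0.11, "very good", -0.2, 0.1),
]

if __name__ == "__main__":
    for metric, score, label, low, high in REPORTED:
        if in_range(score, low, high):
            print(f"{metric}: {score} is inside the '{label}' range [{low}, {high}]")
        else:
            gap = round(distance_outside(score, low, high), 2)
            print(f"{metric}: {score} is {gap} outside the '{label}' range [{low}, {high}]")
```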
Sources:
- https://www.writerswrite.co.za/cheat-sheets-for-writing-body-language/
- https://mads3df.wordpress.com/2013/09/04/storytelling-poses/
- https://www.pinterest.com/pegasister890/character-poses/
- https://www.youtube.com/watch?v=udky6ANxWws
- https://maven.com/articles/storytelling-techniques
- https://docs.aws.amazon.com/bedrock/latest/userguide/titan-image-models.html