AI's Artistic Journey: Capturing Poses, But Missing the Shot with Titan-g1
Generating images from text prompts is a rapidly evolving area of artificial intelligence. This post looks at how titan-g1 handles ten prompts built around a running pose, each specifying a mood, subject, shot type, theme, and setting. The model matches the requested aesthetic style well but struggles with camera angles and scene composition. The per-image results below show where it succeeds and where it falls short, which points to what future prompts should spell out.
Created with: titan-g1
A Solitary Figure Against a Stormy Sky
A lone woman, cloaked in mystery, traverses a desolate landscape under a dramatic, stormy sky. The contrast between her small figure and the vastness of the setting creates a sense of melancholic foreboding.
Prompt
poses running: determined, hopeful ; A lone figure in a tattered cloak; wide shot; Heroism; a desolate wasteland with a storm brewing in the distance; cinematic
Characteristic
Shot : A lone figure in a long, flowing cloak walks across a desolate landscape. The sky is overcast with a sense of impending doom.
Aesthetic Score : 0.7
Mood : dramatic, lonely, mysterious
Quality
Entropy : 6.68
Noise : 96
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image appears slightly over-processed, with a slight halo effect around the edges of the figure.
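Every prompt in this post follows the same semicolon-separated layout, so it can be split into named parts before comparing it with the generated shot. The sketch below is one way to do that; the field names (`pose`, `emotions`, `subject`, `shot`, `theme`, `scene`, `style`) are assumptions introduced here for illustration and do not come from titan-g1 or its API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    # Field names are hypothetical labels for the six prompt segments.
    pose: str            # e.g. "running"
    emotions: list[str]  # e.g. ["determined", "hopeful"]
    subject: str
    shot: str            # e.g. "wide shot"
    theme: str
    scene: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a semicolon-separated prompt into its six segments."""
    fields = [part.strip() for part in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    head, subject, shot, theme, scene, style = fields
    # The first segment looks like "poses running: determined, hopeful".
    pose_label, _, emotion_text = head.partition(":")
    pose = pose_label.replace("poses", "", 1).strip()
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return PromptParts(pose, emotions, subject, shot, theme, scene, style)


parts = parse_prompt(
    "poses running: determined, hopeful ; A lone figure in a tattered cloak; "
    "wide shot; Heroism; a desolate wasteland with a storm brewing in the "
    "distance; cinematic"
)
print(parts.pose)      # running
print(parts.shot)      # wide shot
print(parts.emotions)  # ['determined', 'hopeful']
```

With the prompt split this way, the requested pose and shot type can be set side by side with the "Shot" description of each image, which here reports a figure that walks rather than runs.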
Adventure Awaits: Exploring Ancient Ruins with a Smile
A young woman, backpack in tow, strolls through the verdant ruins of an ancient temple, her laughter echoing through the air. The vibrant scene captures the thrill of exploration and the joy of discovery, promising an unforgettable adventure.
Prompt
poses running: excited, curious ; A young adventurer with a backpack; medium shot; Adventure; a lush jungle with ancient ruins in the background; cinematic
Characteristic
Shot : A young woman with a backpack is hiking through a jungle and smiles enthusiastically at the camera. In the background is a crumbling stone structure resembling a temple or ancient ruin.
Aesthetic Score : 0.7
Mood : joyful, adventurous, carefree
Quality
Entropy : 6.81
Noise : 109
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is some slight noise and a bit of blurriness in the image. This is likely due to the digital nature of the image or a slight imperfection in the camera lens.
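The post does not say how the Entropy value in each Quality block is computed. One common definition, assumed here purely for illustration, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a flat image) to 8 bits (all 256 gray levels equally likely). The reported values of roughly 6.6 to 7.0 would be consistent with that scale, but this is a guess rather than the author's method.

```python
from collections import Counter
from math import log2


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    # Sum -p * log2(p) over every gray level that actually occurs.
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Two equally likely gray levels carry exactly one bit per pixel.
checkerboard = [0, 255] * 32
print(shannon_entropy(checkerboard))  # 1.0
```

For a real image, the pixel sequence would come from a grayscale conversion of the file; that loading step is left out to keep the sketch free of third-party dependencies.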
Immersed in the Game: A Gamer’s Focus Under Neon Lights
A captivating image captures the intensity of a gaming session. The player’s hands fly across the keyboard, their focus unwavering as they navigate a dusty road in the game. The blue and purple lighting creates a dramatic atmosphere, highlighting the action and emphasizing the gamer’s concentration.
Prompt
poses running: intense, focused ; A gamer’s hands on a keyboard and mouse; close-up; Gaming; a brightly lit gaming room with a monitor displaying a virtual world; cinematic
Characteristic
Shot : A person is playing a video game on a computer. The screen is showing a racing game with a car driving on a dirt track. The person is using a keyboard and mouse to play the game.
Aesthetic Score : 0.6
Mood : focused, intense, digital
Quality
Entropy : 6.85
Noise : 97
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and grain, particularly in the background and on the monitor screen. The edges of the image are slightly blurry.
Joyful Stroll Through a Vibrant Market
A young woman radiates happiness as she walks through a bustling street market, bathed in the warm glow of colorful lanterns. The image captures the lively atmosphere with a well-composed shot, showcasing the subject in sharp focus against a beautifully blurred background.
Prompt
poses running: energetic, joyful ; A group of tourists running through a bustling marketplace; long shot; Tourism; a vibrant marketplace with colorful stalls and vendors; cinematic
Characteristic
Shot : A young woman in a straw hat is walking through a crowded market with brightly colored lanterns hanging above. She is smiling and laughing, and the image captures a sense of joy and adventure.
Aesthetic Score : 0.8
Mood : joyful, vibrant, adventurous
Quality
Entropy : 6.95
Noise : 99
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts, particularly around the edges of the lanterns. There is a slight blur in the background, which could be intentional but may indicate slightly shaky hands or a moving camera.
A Moment of Joy on the Beach
Capture the feeling of pure happiness as a woman in a vibrant dress walks hand-in-hand with a loved one along a sun-drenched beach. The first-person perspective draws you into the scene, making you feel like you’re right there with them, sharing in their joy.
Prompt
poses running: romantic, carefree ; running hand-in-hand along a beach; medium shot; Travel; a beautiful beach with turquoise water and white sand; cinematic
Characteristic
Shot : A woman in a sundress walks on a white sandy beach. The camera follows behind her hand, and she is looking back at the camera and smiling.
Aesthetic Score : 0.8
Mood : happy, carefree, summery
Quality
Entropy : 6.71
Noise : 92
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and slight blur, particularly in the background. The subject’s face is not perfectly focused, and part of the photographer’s hand is visible.
Sun-Kissed Laughter: Friends Embrace the Joy of a Sunny Day
Four friends run through a vibrant park, their laughter echoing through the lush greenery. The bright sunlight and carefree energy capture the essence of youthful exuberance and create a heartwarming scene that evokes feelings of happiness and nostalgia.
Prompt
poses running: happy, playful ; A group of friends running through a park; wide shot; Groups; a sunny park with green grass and trees; cinematic
Characteristic
Shot : Four friends are running in a park, laughing and having fun.
Aesthetic Score : 0.7
Mood : joyful, carefree, energetic
Quality
Entropy : 6.83
Noise : 113
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors
Superhero in Motion: A City’s Guardian Takes Flight
A powerful and confident superhero, clad in blue and red, races through the city streets. The shallow depth of field draws your eye to their determined expression and flowing cape, capturing the thrill of action and the hero’s unwavering resolve.
Prompt
poses running: powerful, confident ; A superhero in a bright costume; close-up; Heroism; a city skyline with skyscrapers and flashing lights; cinematic
Characteristic
Shot : A superhero, dressed in a blue and red costume with a lightning bolt symbol, runs through a city. The background is blurry, suggesting motion and speed.
Aesthetic Score : 0.7
Mood : dynamic, heroic, powerful
Quality
Entropy : 6.68
Noise : 100
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.60
Image errors : The image contains some digital artifacts around the edges, likely caused by noise reduction or image manipulation. The blur effect is slightly unnatural and overly applied.
Solitude and Serenity on a Snowy Trail
A lone runner traverses a snowy road, dwarfed by the majestic mountain range in the background. The scene evokes a sense of peace, adventure, and the vastness of nature.
Prompt
poses running: determined, adventurous ; A lone explorer running through a snow-covered mountain pass; long shot; Adventure; a majestic mountain range with snow-capped peaks; cinematic
Characteristic
Shot : A lone runner in red is running on a snowy road in the mountains.
Aesthetic Score : 0.6
Mood : serene, adventurous, contemplative
Quality
Entropy : 6.68
Noise : 102
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight blur in the background, especially on the snowy peaks, possibly due to over-sharpening or post-processing.
Blurred Cityscapes, Sharp Focus on Adventure
A person races through a futuristic cityscape, the background a blur of neon and chrome, while the foreground captures the intensity of their VR experience. This image evokes a sense of speed, energy, and the thrill of exploration.
Prompt
poses running: immersive, exciting ; A gamer’s avatar running through a virtual world; close-up; Gaming; a vibrant and detailed virtual world with fantastical creatures; cinematic
Characteristic
Shot : A person wearing a VR headset is running through a futuristic urban environment. The background is blurry and there is a dark figure in the foreground.
Aesthetic Score : 0.5
Mood : futuristic, action, mysterious
Quality
Entropy : 6.85
Noise : 103
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.80
Image errors : The VR headset looks a bit too smooth and plastic-like. The lighting on the figure is a bit uneven.
Family Adventure: Joyful Run Through Rolling Hills
Capture the spirit of adventure with this heartwarming scene of a family running on a winding road through lush green hills. The bright sunshine and joyful expressions create a sense of pure happiness and exploration.
Prompt
poses running: happy, carefree ; A family running along a scenic road; medium shot; Travel; a winding road with rolling hills and a picturesque countryside; cinematic
Characteristic
Shot : A family of four is walking on a winding road in a rural setting. The road is lined with green hills and trees. The family is dressed in casual clothing and appears to be enjoying their time together.
Aesthetic Score : 0.7
Mood : happy, carefree, active
Quality
Entropy : 6.61
Noise : 106
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors
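The per-image numbers above are easier to compare once they are collected in one place. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values from the ten result blocks and summarises them. The 0.5 cut-off for "likely AI" is a hypothetical threshold chosen for this example, not something the evaluation defines.

```python
from statistics import mean

# (image title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten result blocks above.
RESULTS = [
    ("A Solitary Figure Against a Stormy Sky", 0.7, 0.28, 0.20),
    ("Adventure Awaits", 0.7, 0.29, 0.20),
    ("Immersed in the Game", 0.6, 0.25, 0.20),
    ("Joyful Stroll Through a Vibrant Market", 0.8, 0.29, 0.10),
    ("A Moment of Joy on the Beach", 0.8, 0.27, 0.20),
    ("Sun-Kissed Laughter", 0.7, 0.30, 0.10),
    ("Superhero in Motion", 0.7, 0.28, 0.60),
    ("Solitude and Serenity on a Snowy Trail", 0.6, 0.26, 0.20),
    ("Blurred Cityscapes, Sharp Focus on Adventure", 0.5, 0.28, 0.80),
    ("Family Adventure", 0.7, 0.35, 0.10),
]

AI_THRESHOLD = 0.5  # hypothetical cut-off, not taken from the post

mean_aesthetic = mean(r[1] for r in RESULTS)
mean_clip = mean(r[2] for r in RESULTS)
flagged = [title for title, _, _, ai in RESULTS if ai >= AI_THRESHOLD]
best_clip = max(RESULTS, key=lambda r: r[2])[0]

print(f"mean aesthetic score: {mean_aesthetic:.2f}")  # 0.68
print(f"mean prompt CLIP score: {mean_clip:.3f}")     # 0.285
print("flagged as likely AI:", flagged)
print("highest prompt CLIP score:", best_clip)        # Family Adventure
```

Under that threshold only the superhero and VR images are flagged, and both are close-up prompts with stylised subjects.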
Conclusion
The results show that the generative AI model matched the requested aesthetic style well, but was weaker at reproducing the camera position and only average at understanding the scene. Here’s a breakdown:
- Camera Position: The model scored 0.4, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.55, which is considered average. This indicates that the model was able to understand the scene in the prompt reasonably well, but there’s room for improvement.
- Aesthetic Analysis: The model scored 0.1, which is considered very good. This means that the generated images closely matched the aesthetic style described in the prompts. Unlike the two scores above, a lower value appears to be better here.
Overall, the model seems to be better at understanding the aesthetic style than the camera position and scene composition. It might be helpful to provide more specific instructions regarding camera angles and shot types in future prompts to improve the model’s performance in these areas.
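One way to act on that suggestion is to expand the shot-type segment of each prompt into a more explicit camera description before sending it to the model. The sketch below assumes the semicolon-separated prompt layout used throughout this post. The `SHOT_HINTS` wording is hypothetical and has not been tested against titan-g1, so treat it as a starting point for experiments rather than a known fix.

```python
# Hypothetical expansions for the four shot types used in this post.
SHOT_HINTS = {
    "close-up": "close-up, camera near the subject, subject fills the frame",
    "medium shot": "medium shot, subject framed from the waist up",
    "wide shot": "wide shot, full body visible, surroundings dominate the frame",
    "long shot": "long shot, camera far away, subject small in the frame",
}


def expand_shot(prompt: str) -> str:
    """Replace a bare shot-type segment with a more explicit description."""
    fields = [field.strip() for field in prompt.split(";")]
    # Segments that are not a known shot type pass through unchanged.
    return "; ".join(SHOT_HINTS.get(field.lower(), field) for field in fields)


print(expand_shot(
    "poses running: happy, playful ; A group of friends running through a park; "
    "wide shot; Groups; a sunny park with green grass and trees; cinematic"
))
```

Re-running the same ten prompts with and without the expansion, and comparing the Camera Position score, would show whether the extra wording helps.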