AI's Artistic Struggle: Capturing the Scene vs. the Feeling with Titan-g1
In the realm of artificial intelligence, image generation has made significant strides. However, capturing the essence of a scene, beyond just the visual elements, remains a challenge. This blog post examines an experiment where an AI model was tasked with generating images based on detailed scene descriptions, highlighting its strengths and weaknesses in capturing the intended aesthetic, camera position, and scene details. We explore the concept of ‘dramatic style poses’ and how they are used in various contexts, providing examples to illustrate the importance of understanding the nuances of visual storytelling.
Created with: titan-g1
A Moment of Solitude on the Mountaintop
A lone woman stands on a windswept peak, her gaze fixed on the misty expanse below. The vastness of the landscape evokes a sense of awe and wonder, while her smallness in comparison inspires contemplation and a yearning for adventure.
Prompt
poses looking-at-each-other: determined, awe-inspired ; A lone adventurer, standing on a mountain peak; wide shot; adventure; a vast, breathtaking landscape with clouds swirling below; cinematic
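Every prompt in this post follows the same semicolon-separated layout: a pose tag with moods, the subject, the shot type, a theme, the background, and a style keyword. A minimal Python sketch of splitting such a prompt into named fields might look like the following. The field names and the `parse_prompt` helper are assumptions made for illustration and are not part of any Titan API.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    pose: str          # e.g. "looking-at-each-other"
    moods: list[str]   # e.g. ["determined", "awe-inspired"]
    subject: str
    shot: str          # e.g. "wide shot"
    theme: str
    background: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a prompt of the form used in this post into its six parts."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated parts, got {len(parts)}")
    head, subject, shot, theme, background, style = parts
    # The first part looks like "poses <pose>: mood1, mood2".
    pose_part, _, mood_part = head.partition(":")
    pose = pose_part.removeprefix("poses").strip()
    moods = [m.strip() for m in mood_part.split(",") if m.strip()]
    return ScenePrompt(pose, moods, subject, shot, theme, background, style)
```

Having the requested shot type and moods as separate fields makes it easier to compare them against the Characteristic values reported for each image.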
Characteristic
Shot : A woman stands on a mountaintop, looking out at a sea of clouds. The scene is peaceful and serene, with the clouds creating a soft and ethereal backdrop.
Aesthetic Score : 0.7
Mood : serene, peaceful, contemplative
Quality
Entropy : 6.66
Noise : 96
Prompt Clip Score : 0.22
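The post does not say how the Entropy figure is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely). Under that reading, values around 6.6 to 6.9 would indicate fairly rich tonal variation. A minimal sketch, with `shannon_entropy` as an illustrative helper name:

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Shannon entropy, in bits, of a sequence of grayscale pixel values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    # Sum of p * log2(1 / p) over every gray level that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

In practice the pixel values would come from a decoded grayscale image; the function only needs an iterable of integers.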
AI Evaluation
Likelihood of AI : 0.20
Image errors : Slight noise and blur, especially in the cloud areas.
A Duel in the Smoke
A man and woman, clad in green jumpsuits, face off in a smoky, fiery landscape. The man’s shield and the woman’s intense gaze create a palpable tension, hinting at a dramatic and suspenseful confrontation.
Prompt
poses looking-at-each-other: tense, hopeful ; Two soldiers, one injured, the other holding a shield; medium shot; heroism; a battlefield with smoke and fire in the background; cinematic
Characteristic
Shot : A man in military garb with a shield stands in front of a woman in a dark green hooded outfit against a background of smoke and flames. There is a sense of urgency and tension in the scene.
Aesthetic Score : 0.6
Mood : serious, mysterious, tense
Quality
Entropy : 6.94
Noise : 101
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly overexposed, resulting in a loss of detail in the highlights. The smoke in the background appears slightly artificial.
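One way to quantify the gap between the feeling requested and the feeling delivered is to compare the moods in the prompt with the moods reported in the evaluation. For this image the prompt asked for "tense, hopeful" and the evaluation reports "serious, mysterious, tense". The sketch below uses the Jaccard index on the exact mood words. This metric and the `mood_overlap` helper are assumptions for illustration and are not how the scores in this post were produced. Exact matching also treats near-synonyms such as "serene" and "peaceful" as different words.

```python
def mood_overlap(requested: str, detected: str) -> float:
    """Jaccard index between two comma-separated mood lists (0.0 to 1.0)."""
    want = {m.strip().lower() for m in requested.split(",") if m.strip()}
    got = {m.strip().lower() for m in detected.split(",") if m.strip()}
    if not want and not got:
        return 1.0
    return len(want & got) / len(want | got)


# "A Duel in the Smoke": one shared mood out of four distinct ones.
print(mood_overlap("tense, hopeful", "serious, mysterious, tense"))  # 0.25
```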
The Intensity of the Game
Two young men are locked in a fierce gaming battle, their focus unwavering as they strive for victory. The dramatic lighting and composition highlight the intensity of the moment, drawing you into the heart of the competition.
Prompt
poses looking-at-each-other: intense, focused ; Two gamers, heads bent over a screen; close-up; gaming; a dimly lit room with neon lights reflecting on their faces; cinematic
Characteristic
Shot : Two young men are looking intently at a computer screen; one of them is wearing headphones. It’s a dimly lit room with neon lighting.
Aesthetic Score : 0.6
Mood : focused, intense, competitive
Quality
Entropy : 6.69
Noise : 99
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : No major errors, but slight blurriness around the edges of the image.
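The Prompt Clip Score is presumably the cosine similarity between CLIP embeddings of the prompt and of the generated image, though the post does not spell this out. Raw CLIP similarities for matching text and image pairs typically fall well below 1.0, which would be consistent with the 0.22 to 0.35 values reported here. Under that assumption, the calculation reduces to the sketch below, where the two input vectors stand in for the text and image embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)
```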
City Lights and Smiles: A Day of Joyful Exploration
Capture the vibrant energy of youth as four friends stroll down a city street, their smiles reflecting the carefree spirit of the moment. The scene bursts with color and life, promising an adventure filled with laughter and excitement.
Prompt
poses looking-at-each-other: excited, curious ; A group of tourists, standing in front of a famous landmark; medium shot; tourism; a bustling city street with people and vehicles passing by; cinematic
Characteristic
Shot : A group of four friends, two women and two men, are walking down a street in a European city. They are laughing and enjoying each other’s company.
Aesthetic Score : 0.6
Mood : joyful, carefree, friendly
Quality
Entropy : 6.90
Noise : 99
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some slight artifacts, particularly in the sky.
Lost in the Landscape: A Moment of Tranquility
Two women, their faces turned towards the passing countryside, find solace in the rolling hills and the quiet contemplation of a train journey. The window, a barrier between them and the vastness beyond, emphasizes the feeling of being alone with one’s thoughts, creating a wistful and tranquil mood.
Prompt
poses looking-at-each-other: reflective, nostalgic ; Two friends, sitting on a train, looking out the window; medium shot; travel; a scenic landscape with rolling hills and fields; cinematic
Characteristic
Shot : Two women are sitting in a train looking out the window at the countryside scenery.
Aesthetic Score : 0.7
Mood : tranquil, peaceful, contemplative
Quality
Entropy : 6.68
Noise : 104
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Campfire Tales Under a Starry Sky
Four friends gather around a crackling campfire, sharing stories and laughter under a breathtaking night sky. The warm glow of the fire and the twinkling stars create a cozy and nostalgic atmosphere, perfect for reminiscing and making memories.
Prompt
poses looking-at-each-other: warm, intimate ; A group of friends, huddled together around a campfire; close-up; groups; a dark forest with stars twinkling in the sky; cinematic
Characteristic
Shot : Four friends are sitting around a campfire under a starry sky.
Aesthetic Score : 0.6
Mood : relaxed, warm, friendly
Quality
Entropy : 6.56
Noise : 108
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some artifacts in the background, and the image is a bit blurry in some areas.
Solitude by the Sea: A Moment of Tranquility
A woman stands on a serene beach, her gaze fixed on the endless horizon. The crashing waves and soft blue sky create a calming atmosphere, highlighting the contrast between her small figure and the vastness of the ocean. This image evokes a sense of peace and contemplation.
Prompt
poses looking-at-each-other: melancholy, contemplative ; A lone figure, standing on a deserted beach; wide shot; adventure; a vast ocean with crashing waves and a setting sun; cinematic
Characteristic
Shot : A lone figure stands on a sandy beach, gazing out at the ocean. The water is a light blue, with whitecaps forming as the waves break on the shore. The sky is a soft pink and blue, with a hint of orange in the distance.
Aesthetic Score : 0.7
Mood : peaceful, tranquil, contemplative
Quality
Entropy : 6.57
Noise : 97
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is a slight pixelation visible in the sky, particularly around the horizon line.
Two Astronauts, Hand in Hand, Against the Majesty of Earth
A breathtaking image captures the awe-inspiring moment of two astronauts, floating in the vast expanse of space, their hands clasped together. The Earth’s atmosphere, with its swirling clouds, serves as a majestic backdrop, highlighting the incredible adventure these explorers are undertaking. The scene evokes a sense of hope, adventure, and mystery, leaving viewers in wonder at the vastness of the universe.
Prompt
poses looking-at-each-other: awe-inspired, hopeful ; astronauts, floating in space; medium shot; heroism; a view of Earth from space with stars and galaxies in the background; cinematic
Characteristic
Shot : Two astronauts in space suits are floating in space and shaking hands. There is a blue and white Earth in the background.
Aesthetic Score : 0.8
Mood : dreamy, hopeful, futuristic
Quality
Entropy : 6.58
Noise : 105
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no visible artifacts or errors in the image.
A Moment of Shared Discovery in the Lush Green Forest
Three friends embark on an adventurous hike through a dense, overgrown forest. The woman’s smile and the men’s curious gazes hint at a shared secret or a captivating discovery waiting to be unveiled. The lush greenery and the sense of mystery create an intriguing atmosphere, inviting viewers to imagine the story unfolding within the forest.
Prompt
poses looking-at-each-other: curious, adventurous ; A group of explorers, standing in a jungle clearing; medium shot; adventure; lush greenery with sunlight filtering through the leaves; cinematic
Characteristic
Shot : Three people are walking through a lush green jungle. They are all wearing backpacks and casual clothing. The woman is looking at something off-camera and smiling, the man in the middle is looking up, and the man on the right is facing the camera.
Aesthetic Score : 0.6
Mood : adventurous, curious, casual
Quality
Entropy : 6.91
Noise : 114
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are some minor image artifacts around the edges of the image, particularly around the people’s hair.
Golden Hour Romance: A Love Story Unfolds on the Bridge at Dusk
Experience the warmth and intimacy of a loving couple standing on a bridge at dusk, lost in each other’s eyes. The golden glow of the setting sun adds a touch of drama and romance to this heartwarming scene.
Prompt
poses looking-at-each-other: romantic, intimate ; Two lovers, standing on a bridge overlooking a city; medium shot; tourism; a cityscape with twinkling lights and a river flowing below; cinematic
Characteristic
Shot : A couple is embracing on a bridge overlooking a city at dusk.
Aesthetic Score : 0.7
Mood : romantic, intimate, cozy
Quality
Entropy : 6.85
Noise : 100
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : No significant artifacts or errors.
Conclusion
The results show that the generative AI model matched the intended aesthetic well but struggled with the scene content and the camera position. Here’s a breakdown (the scores appear to measure deviation from the prompt, so lower is better):
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.48, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored 0.02, which is considered very good. This means that the generated image closely matched the expected aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than understanding the scene and camera position. This suggests that the model might need further training to improve its ability to interpret and translate prompts into accurate visual representations.
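For reference, the per-image figures listed in this post can be averaged with a few lines of Python. The numbers are copied from the ten image reports above, with titles shortened. How they relate to the three summary scores in the breakdown is not stated, so this is only a descriptive summary.

```python
from statistics import mean

# (title, aesthetic score, prompt CLIP score), as reported for each image
results = [
    ("A Moment of Solitude on the Mountaintop", 0.7, 0.22),
    ("A Duel in the Smoke", 0.6, 0.28),
    ("The Intensity of the Game", 0.6, 0.29),
    ("City Lights and Smiles", 0.6, 0.25),
    ("Lost in the Landscape", 0.7, 0.35),
    ("Campfire Tales Under a Starry Sky", 0.6, 0.30),
    ("Solitude by the Sea", 0.7, 0.23),
    ("Two Astronauts, Hand in Hand", 0.8, 0.26),
    ("A Moment of Shared Discovery", 0.6, 0.30),
    ("Golden Hour Romance", 0.7, 0.30),
]

mean_aesthetic = mean(r[1] for r in results)
mean_clip = mean(r[2] for r in results)
best_clip = max(results, key=lambda r: r[2])

print(f"mean aesthetic score: {mean_aesthetic:.2f}")    # 0.66
print(f"mean prompt CLIP score: {mean_clip:.3f}")       # 0.278
print(f"best prompt match: {best_clip[0]} ({best_clip[2]:.2f})")
```

The highest prompt match belongs to the train scene ("Lost in the Landscape", 0.35).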