AI Captures the Scene, But Misses the Mood with Titan-g1
The world of AI image generation is rapidly evolving, with models capable of creating stunning visuals from text prompts. However, achieving the perfect balance between technical accuracy and artistic expression remains a challenge. This blog post examines the results of an experiment that tested the capabilities of a generative AI model in capturing specific camera angles, shot composition, and aesthetic styles. While the model demonstrated a strong understanding of camera positioning and shot composition, it struggled to capture the desired aesthetic, highlighting the ongoing need for advancements in this area. We will explore the specific scores for camera position, shot analysis, and aesthetic analysis, providing insights into the model’s performance and potential areas for improvement.
Created with: titan-g1
A Solitary Figure Amidst Majestic Peaks and Serene Clouds
A lone hiker stands on a mountain summit, gazing out at a breathtaking panorama of clouds below. The rugged peaks and the peaceful expanse create a dramatic contrast, highlighting the hiker’s sense of isolation and the vastness of nature’s beauty.
Prompt
poses face-to-face: Determined, awe-inspiring ; A lone adventurer, standing on a mountain peak; wide shot; Adventure; Majestic mountain range with clouds swirling around; cinematic
Characteristic
Shot : A lone hiker stands on a mountain peak, looking out over a vast sea of clouds.
Aesthetic Score : 0.7
Mood : serene, contemplative, adventurous
Quality
Entropy : 6.79
Noise : 99
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image quality is slightly soft, especially in the background, potentially due to compression.
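The post does not say how the Entropy value listed for each image is computed. One common reading, assumed here, is the Shannon entropy of the image's 8-bit grayscale histogram, measured in bits, where 0 means a completely flat image and 8 is the maximum for 256 intensity levels. Under that assumption, a value such as 6.79 would indicate a fairly rich spread of tones. The function name and the toy pixel lists below are illustrative only.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of intensity values.

    Assumption: this mirrors a histogram-based entropy metric; the
    evaluation tool behind the scores in this post may define it differently.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Toy inputs standing in for flattened grayscale images.
flat_image = [128] * 1024            # a single tone everywhere
full_range = list(range(256)) * 4    # every 8-bit level equally often

print(shannon_entropy(flat_image))
print(shannon_entropy(full_range))
```

With a real image, the pixel list would come from whatever image library is in use, after converting the picture to 8-bit grayscale.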
A Moment of Mystery in the Woods
Four young adults stand in a shadowy forest, their expressions pensive and their gazes hinting at a shared secret. The composition, with the tallest figure slightly out of focus, creates a sense of depth and intrigue, while the subtle dramatic effect adds to the mysterious mood.
Prompt
poses face-to-face: Suspenseful, mysterious ; A group of friends, huddled together in a dark forest; medium shot; Adventure; Tall trees casting long shadows, sunlight filtering through the leaves; cinematic
Characteristic
Shot : Four young adults stand in a forest. The light is dim, likely because of a cloudy day. The trees are tall and slender, and those in the background are out of focus. There is a mysterious vibe to this scene.
Aesthetic Score : 0.6
Mood : mysterious, introspective, somber
Quality
Entropy : 6.70
Noise : 106
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible errors in the image.
Dragon’s Fury: A Knight Faces His Fate
A dramatic scene unfolds as a valiant knight in armor confronts a fearsome dragon spewing fire. The smoke and fog create an atmosphere of impending doom, highlighting the intensity and epic scale of the battle.
Prompt
poses face-to-face: intense ; warrior, facing down a fearsome dragon; close-up; Heroism; dragon with glowing eyes, smoke billowing around; cinematic
Characteristic
Shot : A man in armor stands facing a dragon. The dragon has its mouth open, revealing sharp teeth. There is a burst of fire or smoke behind the man.
Aesthetic Score : 0.6
Mood : intense, dramatic, adventurous
Quality
Entropy : 6.94
Noise : 101
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The dragon’s scales are somewhat blurry. The lighting is uneven.
Lost in the Future: A Moment of Melancholy
A young man, headphones on, gazes out a window at a sprawling futuristic cityscape. His expression is one of quiet contemplation, hinting at a longing for something beyond the gleaming towers and neon lights.
Prompt
poses face-to-face: Focused, determined ; A young gamer, staring intently at a computer screen; close-up; Gaming; Vibrant, futuristic cityscape reflected in the screen; cinematic
Characteristic
Shot : A young man wearing headphones looks out the window at a cityscape, possibly a futuristic city.
Aesthetic Score : 0.6
Mood : pensive, introspective, thoughtful
Quality
Entropy : 6.84
Noise : 101
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some blurriness, especially in the background. The edges of the image are slightly pixelated.
Parisian Romance: A Couple’s Love Story Under the Eiffel Tower
A heartwarming scene of a couple, dressed for a romantic occasion, sharing a smile and a moment of affection in front of the iconic Eiffel Tower. The setting is both picturesque and intimate, capturing the essence of love and happiness.
Prompt
poses face-to-face: Romantic, nostalgic ; A couple, gazing at each other in front of the Eiffel Tower; medium shot; Tourism; Romantic Parisian cityscape with the Eiffel Tower in the background; cinematic
Characteristic
Shot : A couple stands in front of the Eiffel Tower. The woman is smiling and looking at the man, and the man is looking at the woman.
Aesthetic Score : 0.7
Mood : romantic, happy, playful
Quality
Entropy : 6.87
Noise : 105
Prompt Clip Score : 0.26
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some minor artifacts and noise, particularly in the background. There is some minor chromatic aberration in the image.
A Moment of Curiosity in the Market
A young man strolls through a bustling market, his gaze drawn to something off-screen. The vibrant produce and slightly blurred background create a sense of casual observation and anticipation, inviting the viewer to wonder what has caught his eye.
Prompt
poses face-to-face: Curious, vibrant ; A traveler, standing on a bustling street market; medium shot; Travel; Colorful stalls overflowing with exotic goods, people bustling around; cinematic
Characteristic
Shot : A young man is walking through a market, looking at the produce. He is wearing a blue jean jacket and a backpack. There are various colorful fruits and vegetables in the background.
Aesthetic Score : 0.6
Mood : casual, adventurous, curious
Quality
Entropy : 6.88
Noise : 102
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
Campfire Tales: A Moment of Shared Warmth and Adventure
A cozy scene unfolds around a crackling campfire in the heart of the forest. Three friends gather, their faces illuminated by the dancing flames, sharing stories and laughter under the starlit sky. The low angle captures the intimacy of the moment, while the darkness beyond hints at the wild beauty and potential dangers of their adventure.
Prompt
poses face-to-face: Intimate, suspenseful ; A group of explorers, huddled around a campfire; medium shot; Adventure; Dark forest with flickering flames illuminating their faces; cinematic
Characteristic
Shot : Three people are gathered around a campfire in a forest. The woman on the left is smiling, and the two men are looking up towards something. The background is a dark forest with trees and bushes.
Aesthetic Score : 0.6
Mood : cozy, friendly, adventurous
Quality
Entropy : 6.59
Noise : 105
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly blurry, especially the woman’s face.
Gazing Upwards: A Moment of Urban Wonder
A woman stands on a city street, her gaze drawn upwards to a towering building. The minimalist composition and contemplative mood evoke a sense of awe and wonder at the scale of the urban landscape.
Prompt
poses face-to-face: Awe-inspiring, hopeful ; A young girl, looking up at a towering skyscraper; wide shot; Tourism; Modern cityscape with towering skyscrapers and bustling streets; cinematic
Characteristic
Shot : A woman in a black dress stands in front of a tall skyscraper, looking up at the building. There is an older building in the foreground to the left.
Aesthetic Score : 0.7
Mood : dreamy, hopeful, ambitious
Quality
Entropy : 6.79
Noise : 104
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No errors detected.
Victory Dance: Three Friends Celebrate in a Moment of Pure Joy
Three young men, radiating excitement, capture the thrill of victory in this dynamic image. Their shared joy is palpable, amplified by the intimate setting and dramatic lighting. The bluish-purple tones and expressive faces create a captivating scene that speaks volumes about the power of camaraderie and shared triumph.
Prompt
poses face-to-face: Joyful, celebratory ; A group of friends, celebrating a victory in a video game; close-up; Gaming; Brightly lit gaming room with controllers and headsets; cinematic
Characteristic
Shot : Three friends are celebrating a win while playing a video game. They are wearing headphones and smiling.
Aesthetic Score : 0.6
Mood : happy, excited, joyful
Quality
Entropy : 6.89
Noise : 104
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.20
Image errors : No notable artifacts or errors.
Solitude by the Sea: A Moment of Tranquility
A man finds peace and contemplation as he stands alone on a sandy beach, watching the gentle waves crash against the shore under a soft, pink-hued sky. The vastness of the ocean and his solitary figure create a sense of serenity and tranquility.
Prompt
poses face-to-face: Melancholy, contemplative ; A lone traveler, standing on a deserted beach; wide shot; Travel; Vast ocean stretching out to the horizon, golden sunset; cinematic
Characteristic
Shot : A man stands on a sandy beach, looking out at the ocean. The sky is a pale blue and the waves are rolling in.
Aesthetic Score : 0.7
Mood : tranquil, contemplative, serene
Quality
Entropy : 6.44
Noise : 92
Prompt Clip Score : 0.24
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has a slight amount of noise and graininess, particularly in the sky. There is also a slight vignette effect.
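To see the ten results side by side, the per-image figures listed above can be averaged. The sketch below copies the numbers from the listings. The short labels and the tuple layout are shorthand introduced here, not part of the original evaluation output.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, prompt_clip, ai_likelihood)
# Values are copied from the ten image listings in this post.
RESULTS = [
    ("Mountain summit",  0.7, 6.79,  99, 0.24, 0.20),
    ("Forest group",     0.6, 6.70, 106, 0.29, 0.10),
    ("Knight vs dragon", 0.6, 6.94, 101, 0.28, 0.20),
    ("Futuristic city",  0.6, 6.84, 101, 0.33, 0.30),
    ("Eiffel Tower",     0.7, 6.87, 105, 0.26, 0.10),
    ("Street market",    0.6, 6.88, 102, 0.25, 0.10),
    ("Campfire",         0.6, 6.59, 105, 0.29, 0.10),
    ("Skyscraper",       0.7, 6.79, 104, 0.27, 0.10),
    ("Gaming victory",   0.6, 6.89, 104, 0.32, 0.20),
    ("Beach",            0.7, 6.44,  92, 0.24, 0.10),
]

FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "ai_likelihood")

# Average each metric column across all ten images.
summary = {
    field: mean(row[i + 1] for row in RESULTS)
    for i, field in enumerate(FIELDS)
}

# Image whose prompt/image CLIP similarity is highest.
best_clip = max(RESULTS, key=lambda row: row[4])

for field, value in summary.items():
    print(f"{field:>14}: {value:.3f}")
print("highest Prompt Clip Score:", best_clip[0], best_clip[4])
```

The averages work out to an Aesthetic Score of 0.64, Entropy of about 6.77, Noise of 101.9, a Prompt Clip Score of about 0.28, and a Likelihood of AI of 0.15. The per-image aesthetic scores only take the values 0.6 and 0.7.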
Conclusion
The results show that the generative AI model performed well in understanding the camera position and shot composition, but struggled with the aesthetic aspect. Here’s a breakdown:
- Camera Position: The model scored 0.5, which sits at the lower edge of the “good” range (0.5 to 0.75). This means the camera position in the generated images was fairly close to what the prompts requested.
- Shot Analysis: The model scored 0.61, also within the “good” range. This indicates the model generally captured the intended shot type and composition.
- Aesthetic Analysis: The model scored 0.07, which the evaluation treats as falling short of “very good”, suggesting the generated images’ aesthetic deviated from the expected style. Numerically, however, 0.07 lies inside the quoted “very good” range (-0.2 to 0.1), so the quoted bounds do not by themselves explain the low rating.
Overall, the model demonstrates a good understanding of camera positioning and shot composition, but needs improvement in capturing the desired aesthetic.
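As a quick arithmetic check of the breakdown above, the sketch below tests each reported score against the range quoted for it. Treating the bounds as inclusive is an assumption, and `Metric` is a hypothetical helper written for this post, not part of the evaluation tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    score: float
    label: str    # name of the range quoted in the conclusion
    low: float
    high: float

    def in_range(self) -> bool:
        # Assumption: both ends of the quoted range are inclusive.
        return self.low <= self.score <= self.high


# Scores and ranges exactly as quoted in the conclusion.
METRICS = [
    Metric("Camera Position", 0.50, "good", 0.5, 0.75),
    Metric("Shot Analysis", 0.61, "good", 0.5, 0.75),
    Metric("Aesthetic Analysis", 0.07, "very good", -0.2, 0.1),
]

for m in METRICS:
    verdict = "inside" if m.in_range() else "outside"
    print(f"{m.name}: {m.score} is {verdict} the '{m.label}' range "
          f"({m.low} to {m.high})")
```

Run as written, all three scores fall inside their quoted ranges, including the aesthetic score of 0.07 against (-0.2 to 0.1). The low aesthetic rating therefore has to rest on something other than these bounds, for example a different scale or direction for that metric, which the post does not specify.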