AI's Struggle with Symmetry: A Dramatic Style Analysis with Imagen-v2
The ‘dramatic-styles’ aesthetic often relies on symmetrical compositions to create a sense of grandeur, balance, and visual impact. The style appears across media, from epic films and video games to architectural design and photography. In this blog post, we explore how well AI generates images in this dramatic style, focusing on its ability to capture symmetrical compositions. We analyze the results of a test using ten scene prompts and discuss the AI’s strengths and weaknesses in understanding camera position, shot analysis, and aesthetic style.
Created with: imagen-v2
A Gateway to Serenity: A Lone Figure Finds Peace on a Mountain Peak
This breathtaking scene captures a solitary figure standing atop a majestic mountain, bathed in the golden light of a parting cloud. The symmetrical composition creates a sense of awe and wonder, while the misty mountains in the distance add a touch of mystery and solitude. The mood is serene and peaceful, inspiring contemplation and a sense of connection with nature’s grandeur.
Prompt
Symmetry and Patterns: Epic, inspiring ; A lone hero standing on a mountain peak; wide shot; Heroism; Dramatic cloudscape with a symmetrical mountain range in the background; cinematic
Characteristic
Shot : A lone figure stands on a mountain peak with a breathtaking view of distant mountains and a dramatic sky filled with clouds.
Aesthetic Score : 0.8
Mood : serene, majestic, contemplative
Quality
Entropy : 6.74
Noise : 85
Prompt Clip Score : 0.35
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors are present in the image.
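The post does not say how the Entropy figure is computed. One common reading, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 bits (a single flat tone) to 8 bits (all 256 levels equally likely). Under that assumption, values around 6.4 to 6.9 like those reported in this post would indicate fairly varied tonal content. A minimal sketch, where `shannon_entropy` is a hypothetical helper and not the tool used for the post:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel intensities.

    Assumption: the post's "Entropy" metric is histogram entropy; the
    source does not confirm this.
    """
    total = len(pixels)
    counts = Counter(pixels)  # histogram: intensity -> number of pixels
    # Sum -p * log2(p) over every intensity level that actually occurs.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# A single flat tone carries no information; a uniform spread over all
# 256 levels reaches the 8-bit maximum.
flat_image = [128] * 64
uniform_image = list(range(256))

print(shannon_entropy(flat_image))
print(shannon_entropy(uniform_image))
```

For a real image, the pixel list could come from any library that exposes grayscale intensities as integers in the range 0 to 255.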
Hope Beckons in the Shadows: A Journey Through Time
Three figures venture into the depths of an ancient structure, their path illuminated by a sliver of light at the end of a narrow stone corridor. The scene evokes a sense of mystery, adventure, and the allure of the unknown, leaving viewers eager to discover what lies beyond the light.
Prompt
Symmetry and Patterns: Mysterious, suspenseful ; A group of adventurers navigating a symmetrical maze of ancient ruins; medium shot; Adventure; Intricate patterns on the walls and floor of the maze; cinematic
Characteristic
Shot : Three people walking in a long, stone corridor leading to an arched doorway at the end, with a hazy, ethereal light coming from the door.
Aesthetic Score : 0.7
Mood : mysterious, adventurous, foreboding
Quality
Entropy : 6.59
Noise : 108
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are some minor artifacts in the image, especially in the stonework. The figures are also somewhat blurry.
Neon Dreams: A Cyberpunk Portrait
A close-up portrait of a woman with vibrant purple hair, clad in a futuristic outfit, stands out against a blurred neon-lit background. The strong contrast between her hair and the lights creates a dramatic, cyberpunk aesthetic.
Prompt
Symmetry and Patterns: Futuristic, vibrant ; A player’s avatar standing in the center of a symmetrical, neon-lit cityscape; close-up; Gaming; Geometric patterns on buildings and holographic displays; cinematic
Characteristic
Shot : A woman with purple hair stands in front of a neon-lit cityscape. Her face is partially obscured by shadow, and she looks out toward the viewer, giving off an air of mystery and power.
Aesthetic Score : 0.8
Mood : futuristic, mysterious, powerful
Quality
Entropy : 6.64
Noise : 60
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.90
Image errors : The image appears to have been heavily processed, with some noticeable blurriness in the background and in the woman’s hair.
A Buzzing City Market from Above
A vibrant street market unfolds beneath towering buildings, showcasing a bustling urban scene filled with colorful stalls and lively crowds. The perspective from above emphasizes the depth and scale of the city, capturing the energy and movement of this vibrant marketplace.
Prompt
Symmetry and Patterns: Lively, bustling ; A panoramic view of a symmetrical city skyline with a vibrant, patterned market in the foreground; wide shot; Tourism; Colorful stalls and bustling crowds in the market; cinematic
Characteristic
Shot : A street market with vendors selling goods under colorful canopies, flanked by tall buildings on either side.
Aesthetic Score : 0.6
Mood : busy, urban, chaotic
Quality
Entropy : 6.65
Noise : 112
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has noticeable artifacts and blurring, especially in the buildings.
Silhouette of Solitude: A Tranquil Sunset Bridge
A solitary figure stands on a bridge, their silhouette stark against the vibrant hues of a setting sun reflected in the calm water below. The scene evokes a sense of tranquility and contemplation, capturing the beauty of a moment of quiet reflection.
Prompt
Symmetry and Patterns: Tranquil, reflective ; A lone traveler standing on a bridge overlooking a symmetrical river with a patterned sunset in the background; medium shot; Travel; Reflections of the sunset and bridge in the water; cinematic
Characteristic
Shot : A silhouette of a person standing on a bridge over a body of water at sunset. The sky is filled with clouds that are reflecting in the water.
Aesthetic Score : 0.8
Mood : serene, calm, peaceful
Quality
Entropy : 6.52
Noise : 78
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : Some minor artifacts are present in the image, such as banding in the sky and noise in the water.
A Meal Filled with Tension
A group of people gather around a table laden with food, but the atmosphere is anything but celebratory. Their expressions are strained, their gazes fixed on something beyond the frame, creating a palpable sense of unease and anticipation. The vibrant tablecloth and abundance of dishes only serve to heighten the dramatic tension in this scene.
Prompt
Symmetry and Patterns: Warm, joyful ; gathered around a symmetrical dining table, sharing a meal; medium shot; group; Patterned tablecloth and dishes on the table; cinematic
Characteristic
Shot : A group of people are sitting at a dining table, having a meal. The table is covered in a colorful tablecloth and the room is decorated in a vintage style.
Aesthetic Score : 0.6
Mood : tense, unsettling, mysterious
Quality
Entropy : 6.70
Noise : 114
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.70
Image errors : The image has a few artifacts, particularly in the faces of the people. The color palette is a bit oversaturated.
Silhouetted Against the Setting Sun: A Lone Figure in the Desert
A solitary figure stands on a sand dune, their silhouette stark against the fiery sunset. The vastness of the desert landscape evokes feelings of loneliness, epic grandeur, and contemplation. The dramatic play of light and shadow adds a sense of mystery and intrigue to this evocative scene.
Prompt
Symmetry and Patterns: Hopeful, determined ; A hero’s silhouette against a symmetrical sunrise over a vast, patterned desert; long shot; Heroism; Sand dunes creating a repeating pattern in the background; cinematic
Characteristic
Shot : A lone figure stands on top of a sand dune in a vast desert, silhouetted against the setting sun.
Aesthetic Score : 0.7
Mood : dramatic, epic, solitary
Quality
Entropy : 6.53
Noise : 50
Prompt Clip Score : 0.37
AI Evaluation
Likelihood of AI : 0.50
Image errors : The image appears slightly blurry and the sand texture could be more realistic.
Awe-Inspiring Cave: Figures Gather Under a Majestic Rock Formation
Two groups of figures stand in a cavernous space, bathed in the warm glow of a faint light source. A massive rock formation hangs from the ceiling, creating a sense of scale and mystery. The scene evokes a feeling of adventure and wonder, inviting viewers to explore the depths of this enigmatic cave.
Prompt
Symmetry and Patterns: Awe-inspiring, mysterious ; A group of adventurers exploring a symmetrical, patterned cave system; medium shot; Adventure; Stalactites and stalagmites forming intricate patterns on the cave walls; cinematic
Characteristic
Shot : A group of four people are exploring a cave. The cave is dark and mysterious, with large rock formations and a narrow passageway. The people are wearing helmets and headlamps, and they are carrying flashlights. The scene is lit by the beams of their flashlights, which create a sense of depth and mystery.
Aesthetic Score : 0.6
Mood : mysterious, adventurous, eerie
Quality
Entropy : 6.85
Noise : 105
Prompt Clip Score : 0.34
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image has some minor artifacts in the shadows and the rock formations. There is also some slight blurring in the edges of the image. This blurring might be intentional but is quite strong.
Blue Gaze: A Portrait of Intensity
A close-up portrait captures a piercing gaze, enhanced by a striking blue paint design on the face. The intensity of the expression and the dramatic lighting create a mysterious and captivating image.
Prompt
Symmetry and Patterns: Immersive, surreal ; A player’s avatar navigating a symmetrical, patterned virtual world; close-up; Gaming; Geometric shapes and textures creating a visually stunning environment; cinematic
Characteristic
Shot : Close-up portrait of a man with blue face paint.
Aesthetic Score : 0.7
Mood : intense, futuristic, mysterious
Quality
Entropy : 6.43
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.90
Image errors : The skin texture and the blue paint look slightly artificial.
A Moment of Joy and Importance
Four individuals, dressed in smart casual attire, stand smiling in front of a grand building, likely a government building or museum. The scene, bathed in soft lighting and warm colors, exudes a sense of happiness, friendliness, and welcoming grandeur.
Prompt
Symmetry and Patterns: Joyful, celebratory ; A family standing in front of a symmetrical, patterned building, smiling for a photo; medium shot; Family; Architectural details and patterns on the building; cinematic
Characteristic
Shot : A group of four people stand in front of a classical building. The group is composed of three women and one man, and they all appear to be smiling.
Aesthetic Score : 0.6
Mood : happy, confident, elegant
Quality
Entropy : 6.64
Noise : 101
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image has some slight artifacts, particularly around the edges of the building, and some unnatural color gradients in the sky. This is likely caused by image compression or processing, although there’s a chance that some of the building details might be AI-generated.
Conclusion
The results show that the generative AI model understood the described scenes reasonably well, but struggled with camera position and aesthetic style. Here’s a breakdown:
- Camera Position: The model scored 0.35, which is considered below average. This suggests that the model didn’t accurately translate the intended camera position from the prompt into the generated image.
- Shot Analysis: The model scored 0.55, which is considered good. This indicates that the model was able to understand the scene described in the prompt and create a shot that aligns with it.
- Aesthetic Analysis: The model scored 0.31, which is considered below average. This means that the generated image didn’t match the expected aesthetic style as closely as it could have.
Overall, the model shows promise in understanding the scene and shot composition, but needs improvement in accurately capturing the intended camera position and aesthetic style.
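The per-image figures reported in this post can be averaged with a short script. The sketch below copies the ten sets of values from the sections above. The short labels and the `summarize` helper are illustrative names, and these averages are a separate summary, not the method behind the three scores in the breakdown.

```python
from statistics import mean

# (label, aesthetic, entropy, noise, prompt_clip, ai_likelihood)
# Values are copied from the post; labels are shorthand for the image titles.
RESULTS = [
    ("Mountain peak", 0.8, 6.74, 85, 0.35, 0.20),
    ("Stone corridor", 0.7, 6.59, 108, 0.34, 0.30),
    ("Cyberpunk portrait", 0.8, 6.64, 60, 0.31, 0.90),
    ("City market", 0.6, 6.65, 112, 0.31, 0.70),
    ("Sunset bridge", 0.8, 6.52, 78, 0.33, 0.10),
    ("Tense meal", 0.6, 6.70, 114, 0.32, 0.70),
    ("Desert silhouette", 0.7, 6.53, 50, 0.37, 0.50),
    ("Cave", 0.6, 6.85, 105, 0.34, 0.80),
    ("Blue face paint", 0.7, 6.43, 58, 0.27, 0.90),
    ("Family portrait", 0.6, 6.64, 101, 0.33, 0.10),
]

COLUMNS = ("aesthetic", "entropy", "noise", "prompt_clip", "ai_likelihood")


def summarize(results):
    """Return the mean of each metric column, rounded to three decimals."""
    return {
        name: round(mean(row[i + 1] for row in results), 3)
        for i, name in enumerate(COLUMNS)
    }


summary = summarize(RESULTS)
for name, value in summary.items():
    print(f"{name}: {value}")

# Image whose output matched its prompt least, by Prompt Clip Score.
weakest = min(RESULTS, key=lambda row: row[4])
print("lowest prompt match:", weakest[0], weakest[4])
```

On these figures the mean Aesthetic Score is about 0.69 and the mean Prompt Clip Score about 0.33. The close-up "Blue Gaze" image has the lowest prompt match at 0.27.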