AI's Artistic Struggle: Capturing the 'Style-Aesthetic' with Imagen-v2

Can AI Master the 'Style-Aesthetic'? A Case Study with Imagen-v2

The ‘style-aesthetic’ is a visual style characterized by dramatic lighting, bold colors, and a focus on capturing a specific mood or emotion. It’s often used in film, photography, and graphic design to create a powerful and memorable visual experience. However, replicating this style using AI presents unique challenges. This blog post explores these challenges through a case study, analyzing the results of a generative AI model tasked with creating images based on specific prompts designed to evoke the ‘style-aesthetic’.

Created with: imagen-v2

A Solitary Figure Contemplates the Vastness

A lone figure stands on a rocky cliff, silhouetted against a misty valley. The dramatic play of light and shadow emphasizes the smallness of the figure against the vastness of the landscape, evoking a sense of solitude and contemplation.

Prompt

Minimalist: Epic, triumphant ; Lone figure standing on a mountain peak; wide shot; Heroism; Dramatic sky with clouds; cinematic

Characteristic

Shot : A lone figure stands on the peak of a rocky mountain, looking out over a misty valley and a cloudy sky.

Aesthetic Score : 0.7

Mood : solitude, contemplative, dramatic

Quality

Entropy : 6.70

Noise : 109

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.20

Image errors : None

A Compass Points to Mystery

A vintage compass rests on worn leather, its needle pointing towards an unknown destination. The shallow depth of field draws you into the moment, hinting at a story waiting to be unraveled.

Prompt

Minimalist: Intriguing, mysterious ; A single, weathered compass; close-up; Adventure; Dusty, worn leather bag; cinematic

Characteristic

Shot : A close-up shot of a compass lying on a brown leather surface.

Aesthetic Score : 0.7

Mood : vintage, rustic, adventurous

Quality

Entropy : 6.49

Noise : 98

Prompt Clip Score : 0.31

AI Evaluation

Likelihood of AI : 0.10

Image errors : No visible errors in the image.

Blurred Action: The Thrill of the Game

A silhouette of a gamer, controller in hand, is locked in intense focus as the blurry screen of their video game explodes with action. The scene captures the excitement and immersion of gaming, leaving the details of the game itself to the imagination.

Prompt

Minimalist: Focused, intense ; A pair of hands holding a joystick; close-up; Gaming; Blurred background of a vibrant video game screen; cinematic

Characteristic

Shot : A person is playing a video game with a controller. The television screen behind them shows the game.

Aesthetic Score : 0.6

Mood : intense, focused, concentrated

Quality

Entropy : 6.73

Noise : 76

Prompt Clip Score : 0.32

AI Evaluation

Likelihood of AI : 0.20

Image errors : The image has a slight amount of noise, likely from the camera’s sensor.

A Suitcase, a Cobblestone Street, and a Feeling of Loss

A solitary suitcase sits abandoned on a deserted cobblestone street in a quaint European town. The overcast sky and worn buildings create a somber atmosphere, hinting at a traveler’s departure and the lingering sense of loneliness it leaves behind.

Prompt

Minimalist: Nostalgic, hopeful ; A lone suitcase on a cobblestone street; medium shot; Tourism; A quaint, European town in the background; cinematic

Characteristic

Shot : A red suitcase sits in the middle of a cobblestone street, with old European buildings lining the sides. The sky is overcast and gray.

Aesthetic Score : 0.6

Mood : melancholy, lonely, introspective

Quality

Entropy : 6.71

Noise : 69

Prompt Clip Score : 0.36

AI Evaluation

Likelihood of AI : 0.30

Image errors : No noticeable artifacts or errors.

Footprints in the Sand: A Moment of Tranquility

A solitary figure walks along a sandy beach, their footsteps disappearing into the waves. The camera’s perspective, focused on their legs and feet, creates a sense of mystery and invites contemplation. The scene evokes a feeling of calm and serenity, capturing the essence of a peaceful moment by the sea.

Prompt

Minimalist: Serene, liberating ; A pair of feet walking on a sandy beach; low-angle shot; Travel; Vast ocean and horizon in the background; cinematic

Characteristic

Shot : A person is walking on a sandy beach, with the ocean in the background. The image is taken from a low angle, looking up at the person’s legs and feet.

Aesthetic Score : 0.6

Mood : calm, contemplative, serene

Quality

Entropy : 6.50

Noise : 78

Prompt Clip Score : 0.34

AI Evaluation

Likelihood of AI : 0.10

Image errors : The image has some minor artifacts, such as noise and graininess, especially in the shadows. The colors are also slightly muted and the overall image is somewhat blurry.

A Tender Touch: Hope Blooms in the Park

A close-up captures the gentle connection between a child’s hand and an adult’s, their fingers intertwined against a soft, blurred backdrop of nature. The image evokes a sense of love, hope, and the enduring bond between generations.

Prompt

Minimalist: Warm, loving ; A hand holding a child’s hand; close-up; Family; A blurred background of a park or playground; cinematic

Characteristic

Shot : A close-up shot of an adult hand holding a child’s hand. The background is out of focus and shows a green field.

Aesthetic Score : 0.7

Mood : tender, loving, hopeful

Quality

Entropy : 6.72

Noise : 87

Prompt Clip Score : 0.32

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are no visible artifacts or errors in the image.

A Single Red Rose, A Story Untold

A vibrant red rose rests delicately upon a worn brown leather glove, a poignant contrast against a muted grey backdrop. The image evokes a sense of romance, intimacy, and a touch of melancholy, leaving the viewer to ponder the story behind this solitary bloom.

Prompt

Minimalist: Romantic, symbolic ; A single, red rose; close-up; Heroism; A weathered, worn leather glove; cinematic

Characteristic

Shot : A single red rose resting on a brown leather glove. The background is a simple gray.

Aesthetic Score : 0.7

Mood : romantic, elegant, nostalgic

Quality

Entropy : 6.65

Noise : 100

Prompt Clip Score : 0.34

AI Evaluation

Likelihood of AI : 0.10

Image errors : The background is slightly blurred in a way that does not appear to come from the subject. Its soft focus is not in keeping with the sharp focus of the rose.

Unveiling the Secrets of a Vintage Journal

A close-up shot of an old leather-bound journal, tied with a leather strap, reveals a red pin marking a location on a map visible in the background. The image evokes a sense of mystery and intrigue, hinting at the journal’s secrets and the adventures it holds.

Prompt

Minimalist: Intriguing, adventurous ; A map with a single pin marking a destination; close-up; Adventure; A worn, leather-bound journal; cinematic

Characteristic

Shot : A close-up of a leather-bound journal with a red pin in the cover, lying on top of a faded map.

Aesthetic Score : 0.6

Mood : nostalgic, mysterious, adventurous

Quality

Entropy : 6.57

Noise : 102

Prompt Clip Score : 0.33

AI Evaluation

Likelihood of AI : 0.20

Image errors : There are no visible image errors.

Urban Echoes: Headphones Reflecting the City’s Pulse

A minimalist composition featuring headphones against a dark backdrop. The ear cups reflect a vibrant city skyline, creating a sense of depth and mystery. This image evokes a futuristic, urban mood, capturing the essence of modern life.

Prompt

Minimalist: Immersive, futuristic ; A pair of headphones with a cityscape reflected in the earcups; close-up; Gaming; A dimly lit room with a computer screen in the background; cinematic

Characteristic

Shot : A pair of headphones with a cityscape reflected in the earcups. The headphones are set against a dark background.

Aesthetic Score : 0.7

Mood : mysterious, futuristic, urban

Quality

Entropy : 5.39

Noise : 73

Prompt Clip Score : 0.32

AI Evaluation

Likelihood of AI : 0.20

Image errors : The reflection in the earcups is a bit blurry, and there is a slight halo effect around the headphones.

Capturing the Golden Hour: A Vintage Lens on a Dreamy Landscape

A classic camera, perched on a tripod, points towards a vibrant sunset scene. The composition evokes a sense of nostalgia and anticipation, inviting you to step into the frame and experience the beauty of the moment.

Prompt

Minimalist: Nostalgic, adventurous ; A vintage camera with a viewfinder showing a breathtaking landscape; close-up; Tourism; A vibrant, colorful landscape in the background; cinematic

Characteristic

Shot : An old camera on a tripod, with a view of a colorful landscape through the lens.

Aesthetic Score : 0.7

Mood : nostalgic, dreamy, vintage

Quality

Entropy : 6.71

Noise : 109

Prompt Clip Score : 0.35

AI Evaluation

Likelihood of AI : 0.40

Image errors : The image has a slight color cast and the text on the camera is blurry.
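
To compare the ten images side by side, the per-image numbers reported above can be collected and summarized. The sketch below is only an illustration: the values are transcribed from the result blocks in this post, while the field names and the `summarize` helper are hypothetical and not part of any evaluation tool used here.

```python
from statistics import mean

# Per-image metrics transcribed from the ten result blocks in this post.
# Column order: title, aesthetic, entropy, noise, prompt_clip, likelihood_ai
RESULTS = [
    ("A Solitary Figure Contemplates the Vastness", 0.7, 6.70, 109, 0.31, 0.20),
    ("A Compass Points to Mystery", 0.7, 6.49, 98, 0.31, 0.10),
    ("Blurred Action: The Thrill of the Game", 0.6, 6.73, 76, 0.32, 0.20),
    ("A Suitcase, a Cobblestone Street, and a Feeling of Loss", 0.6, 6.71, 69, 0.36, 0.30),
    ("Footprints in the Sand: A Moment of Tranquility", 0.6, 6.50, 78, 0.34, 0.10),
    ("A Tender Touch: Hope Blooms in the Park", 0.7, 6.72, 87, 0.32, 0.20),
    ("A Single Red Rose, A Story Untold", 0.7, 6.65, 100, 0.34, 0.10),
    ("Unveiling the Secrets of a Vintage Journal", 0.6, 6.57, 102, 0.33, 0.20),
    ("Urban Echoes: Headphones Reflecting the City's Pulse", 0.7, 5.39, 73, 0.32, 0.20),
    ("Capturing the Golden Hour", 0.7, 6.71, 109, 0.35, 0.40),
]

# Hypothetical field names, in the same order as the numeric columns above.
FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "likelihood_ai")


def summarize(rows):
    """Return {field: (min, mean, max)} across all rows."""
    summary = {}
    for i, field in enumerate(FIELDS, start=1):  # column 0 is the title
        values = [row[i] for row in rows]
        summary[field] = (min(values), mean(values), max(values))
    return summary


if __name__ == "__main__":
    for field, (low, avg, high) in summarize(RESULTS).items():
        print(f"{field:>14}: min={low:.2f} mean={avg:.2f} max={high:.2f}")
```

On these numbers the aesthetic score averages 0.66 and the prompt CLIP score averages 0.33. The headphones image is the only entropy outlier, at 5.39.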

Conclusion

The results are mixed. Measured against the ranges quoted below, the aesthetic score falls inside its target range, while the scores for camera position and shot composition fall short of theirs. Here’s a breakdown:

  • Camera Position: The model scored 0.35, which falls below the “good” range of 0.5 to 0.75. This suggests that the model did not fully capture the camera positions described in the prompts.
  • Shot Analysis: The model scored 0.47, also below the “good” range. This indicates that the model did not fully reproduce the scene and its elements as described in the prompts, so the resulting shots were not entirely accurate.
  • Aesthetic Analysis: The model scored 0.06, which lies inside the “very good” range of -0.2 to 0.1. Read as the gap between the expected and the actual aesthetic, this points to only a small difference on this metric.

Overall, by these numbers the model comes close on the aesthetic measure but still needs improvement in reproducing the requested camera positions and shot composition, so the ‘style-aesthetic’ is not yet fully captured.
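
For readers who want to verify the comparisons, here is a minimal sketch that checks each score quoted in the breakdown against the range quoted for it. The `Check` type and the `position` helper are hypothetical names introduced for illustration. Two assumptions are built in: range bounds are treated as inclusive, and Shot Analysis is compared against the same “good” range as Camera Position, since the text gives no separate range for it.

```python
from typing import NamedTuple


class Check(NamedTuple):
    metric: str
    score: float
    band_label: str
    low: float
    high: float


# Scores and ranges as quoted in the conclusion.
# Assumption: Shot Analysis reuses the "good" range of 0.5 to 0.75.
CHECKS = [
    Check("Camera Position", 0.35, "good", 0.5, 0.75),
    Check("Shot Analysis", 0.47, "good", 0.5, 0.75),
    Check("Aesthetic Analysis", 0.06, "very good", -0.2, 0.1),
]


def position(check: Check) -> str:
    """Classify a score as 'below', 'within', or 'above' its range.

    Bounds are treated as inclusive, which is an assumption.
    """
    if check.score < check.low:
        return "below"
    if check.score > check.high:
        return "above"
    return "within"


if __name__ == "__main__":
    for c in CHECKS:
        print(f"{c.metric}: {c.score} is {position(c)} "
              f"the '{c.band_label}' range [{c.low}, {c.high}]")
```

Keeping the ranges next to the scores in one table makes it easy to rerun the check if either the thresholds or the measurements change.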
