AI's Artistic Eye: Capturing Emotion, Not Camera Angles with Midjourney
In the realm of artificial intelligence, the ability to generate images that evoke emotion is a fascinating frontier. This blog post examines the performance of a generative AI model in capturing facial expressions, a key element in conveying human emotion. While the model demonstrates a strong grasp of aesthetics, it falls short in accurately replicating camera angles and shot types. We delve into the reasons behind this discrepancy and discuss the potential for future improvements.
Dramatic facial expressions are a powerful tool in storytelling, used to convey a wide range of emotions, from joy and sorrow to anger and fear. They are often employed in film, theater, and visual art to enhance the narrative and create a deeper connection with the audience.
For example, in a scene depicting a character’s grief, a close-up shot focusing on their tear-streaked face and furrowed brow can evoke a powerful sense of sadness. Similarly, a wide shot capturing a character’s triumphant smile as they raise their arms in victory can convey a sense of exhilaration.
The ability to generate images with realistic and expressive facial expressions is crucial for AI models to create compelling and emotionally resonant content. This blog post explores the challenges and successes of a generative AI model in this area, highlighting its strengths and weaknesses.
Created with: Midjourney
Hong Kong’s Neon Dreams: A Night of Bustling Chaos
A vibrant and chaotic scene unfolds on a bustling Hong Kong street, illuminated by a dazzling array of neon signs. The depth of field creates a sense of mystery, focusing on the foreground and blurring the background, drawing the viewer into the heart of the action.
Prompt
Confusion Confusion, bewilderment: Disoriented, overwhelmed ; A lone figure; eye-level; Single Person; a bustling city street with neon signs and crowds; cinematic
Characteristic
Shot : A bustling night street in a city with many neon signs and people walking
Aesthetic Score : 0.7
Mood : vibrant, urban, exciting
Quality
Entropy : 6.29
Noise : 94
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some slight artifacts in the image, particularly in the areas with bright light.
Batman: A Beacon of Hope Amidst the Ruins
A solitary Batman stands defiant against a backdrop of urban devastation. Smoke billows, rubble litters the ground, and towering buildings cast long shadows. The image captures the hero’s unwavering resolve in the face of overwhelming destruction, highlighting the stark contrast between his imposing presence and the desolation of his surroundings.
Prompt
Confusion Confusion, questioning: Doubt, uncertainty ; A superhero in a tattered costume; eye-level; Hero; a destroyed cityscape with smoke and debris; cinematic
Characteristic
Shot : A lone Batman stands in the ruins of a city, shrouded in smoke and dust. Buildings are crumbling around him, creating a sense of devastation and chaos.
Aesthetic Score : 0.7
Mood : dark, heroic, dramatic
Quality
Entropy : 5.86
Noise : 98
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image contains some minor artifacts and noise, particularly in the smoke and dust. The edges of the buildings are slightly blurred, which might be due to over-sharpening.
Contemplating the Ceiling: A Moment of Uncertainty in the Corporate World
A woman in a business suit stands in a sterile office, her gaze fixed on the fluorescent lights above. The scene evokes a sense of contemplation and anticipation, hinting at the pressures and uncertainties of corporate life.
Prompt
Confusion Confusion, anxiety: Lost, unmoored ; A woman in a business suit; eye-level; Normal People; a sterile office with fluorescent lights and cubicles; cinematic
Characteristic
Shot : A woman in a business suit standing in an office, looking up at the ceiling.
Aesthetic Score : 0.5
Mood : thoughtful, pensive, contemplative
Quality
Entropy : 6.83
Noise : 93
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is a bit blurry, particularly in the background. This is likely due to the low-light conditions in which it was taken.
The Hacker’s Focus
A young man, bathed in the blue and green glow of his computer screen, stares intently at the code before him. The low-key lighting and close-up framing amplify the intensity of his focus, hinting at a high-stakes mission.
Prompt
Confusion Confusion, frustration: Frustration, bewilderment ; A gamer with headphones on; close-up; Gamer; a dimly lit room with a computer screen displaying a complex game interface; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer screen. The screen is displaying a blue and green interface. The man has a focused expression on his face. The scene is dimly lit, creating a moody atmosphere.
Aesthetic Score : 0.6
Mood : focused, intense, serious
Quality
Entropy : 6.42
Noise : 103
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors.
Lost in the Fog: A Man’s Solitary Journey
A lone figure walks through a dark and foggy alleyway, illuminated only by the faint glow of streetlights. The atmosphere is thick with mystery and suspense, leaving the viewer wondering what awaits him around the next corner. The dramatic interplay of light and shadow creates a sense of intrigue, while the man’s determined stride suggests a glimmer of hope amidst the darkness.
Prompt
Confusion Confusion, suspicion: Suspicious, wary ; A man in a trench coat; eye-level; Single Person; a foggy alleyway with flickering streetlights; cinematic
Characteristic
Shot : A lone figure walks down a dark, narrow alleyway, lit by a single streetlamp at the end of the alley.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, lonely
Quality
Entropy : 6.85
Noise : 103
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible errors in the image.
A Knight’s Solemn Reflection in the Dark Forest
A lone knight, shrouded in shadow, stands amidst the dense foliage of a dark forest. His head bowed in contemplation, he evokes a sense of solitude and mystery. The blurred background emphasizes the knight’s isolation, drawing the viewer into his introspective moment.
Prompt
Confusion Confusion, despair: Disillusioned, lost ; A knight in shining armor; eye-level; Hero; a dark forest with twisted trees and ominous shadows; cinematic
Characteristic
Shot : A knight in full armor standing in a dark forest, with his head bowed down in a contemplative pose, in front of a tree covered in vines.
Aesthetic Score : 0.7
Mood : mysterious, dark, somber
Quality
Entropy : 6.03
Noise : 101
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, but this could be intentional for a more dramatic effect.
Family Dinner: A Moment of Warmth and Connection
A heartwarming scene of a family enjoying a meal together. The soft lighting, warm colors, and happy expressions create a sense of intimacy and contentment. This image captures the essence of family love and togetherness.
Prompt
Confusion Confusion, unease: Awkward, uncomfortable ; A family at a dinner table; eye-level; Normal People; a brightly lit kitchen with mismatched plates and silverware; cinematic
Characteristic
Shot : A family of four sits at a dinner table, eating and talking, in a dimly lit dining room with blue walls and a wooden table.
Aesthetic Score : 0.7
Mood : warm, intimate, cozy
Quality
Entropy : 6.42
Noise : 115
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to be slightly overexposed, with some areas being washed out. The colors are also a bit muted.
Lost in the Game: A Silhouette of Focus and Intensity
A young man, bathed in the blue glow of a late-night screen, is completely absorbed in his video game. The silhouette he casts against the brightly lit TV creates a sense of drama and isolation, highlighting the intense focus that comes with being lost in the digital world.
Prompt
Confusion Confusion, panic: Overwhelmed, disoriented ; A gamer holding a controller; close-up; Gamer; a brightly lit room with a TV screen displaying a chaotic game scene; cinematic
Characteristic
Shot : A young man is playing a video game in his dimly lit room. The TV screen gives off blue light. The player is sitting on the floor with a controller in his hands, his silhouette visible against the screen.
Aesthetic Score : 0.6
Mood : immersive, concentrated, blue
Quality
Entropy : 6.24
Noise : 98
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly noisy, especially in the darker areas. The color balance is a bit blue.
Lost in the City’s Blur
A solitary woman walks through a bustling urban landscape, the background a swirling blur of motion. The dramatic effect emphasizes her isolation and creates a sense of mystery, leaving the viewer to wonder about her journey and her thoughts.
Prompt
Confusion Confusion, detachment: Lost, alienated ; A woman walking down a crowded street; eye-level; Single Person; a bustling city street with people rushing past; cinematic
Characteristic
Shot : A woman is walking through a busy city street. Everything is blurred except for the woman and her face.
Aesthetic Score : 0.6
Mood : lonely, melancholic, urban
Quality
Entropy : 6.39
Noise : 94
Prompt Clip Score : 0.23
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some minor noise and artifacting in the image.
A Lone Figure, A City of Dreams
A solitary figure, cloaked in red, stands on a rooftop, silhouetted against the twinkling cityscape and a full moon. The scene evokes a sense of dramatic hope and inspiration, highlighting the individual’s courage and ambition against the vastness of the urban landscape.
Prompt
Confusion Confusion, introspection: Doubt, questioning ; A superhero standing on a rooftop; eye-level; Hero; a cityscape with twinkling lights and a full moon; cinematic
Characteristic
Shot : A lone figure in a red cape stands on a rooftop overlooking a city at night, silhouetted against a bright moon.
Aesthetic Score : 0.7
Mood : dramatic, hopeful, powerful
Quality
Entropy : 6.12
Noise : 108
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has a slightly grainy texture, likely due to compression.
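As a quick cross-check of the ten entries above, the following minimal sketch averages the reported per-image figures. The values are transcribed from this post; the `RESULTS` list, the shortened titles, and the `summarize` helper are hypothetical names introduced only for illustration. These averages are not necessarily the scores quoted in the conclusion, since the post does not say how those were derived.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# transcribed from the ten entries above in order of appearance.
# Titles are shortened here for readability.
RESULTS = [
    ("Hong Kong's Neon Dreams", 0.7, 0.21, 0.20),
    ("Batman", 0.7, 0.25, 0.20),
    ("Contemplating the Ceiling", 0.5, 0.27, 0.10),
    ("The Hacker's Focus", 0.6, 0.25, 0.10),
    ("Lost in the Fog", 0.7, 0.30, 0.10),
    ("A Knight's Solemn Reflection", 0.7, 0.27, 0.30),
    ("Family Dinner", 0.7, 0.28, 0.10),
    ("Lost in the Game", 0.6, 0.29, 0.10),
    ("Lost in the City's Blur", 0.6, 0.23, 0.10),
    ("A Lone Figure, A City of Dreams", 0.7, 0.27, 0.20),
]


def summarize(results):
    """Return rounded means plus the titles with the highest and lowest CLIP score."""
    best = max(results, key=lambda r: r[2])
    worst = min(results, key=lambda r: r[2])
    return {
        "mean_aesthetic": round(mean(r[1] for r in results), 3),
        "mean_clip": round(mean(r[2] for r in results), 3),
        "mean_ai_likelihood": round(mean(r[3] for r in results), 3),
        "highest_clip": best[0],
        "lowest_clip": worst[0],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

Under this transcription the mean aesthetic score is 0.65, the mean prompt CLIP score is about 0.26, and the mean likelihood of AI is 0.15. The foggy alleyway image has the highest prompt CLIP score (0.30) and the Hong Kong street image the lowest (0.21).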
Conclusion
The analysis shows that the generative AI model fell short on camera position and shot analysis but performed very well on aesthetic analysis.
Here’s a breakdown:
- Camera Position: The model scored 0.2, well below the “good” range of 0.5 to 0.75. This suggests that the model did not capture the camera positions requested in the prompts.
- Shot Analysis: The model scored 0.45, just below the same “good” range. This indicates that the model did not fully reproduce the scene and shot composition described in the prompts.
- Aesthetic Analysis: The model scored 0.13, which is rated “very good” even though it sits marginally above the stated “very good” range of -0.2 to 0.1. This suggests that the generated images closely matched the desired aesthetic style.
Overall, the model seems to be better at capturing the desired aesthetic than accurately interpreting camera positions and shot descriptions.
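The comparison of each score with its rating band can be sketched as a small range check. This is an illustration under stated assumptions, not the evaluation pipeline behind the post: the post does not say how the three scores are computed or whether the band limits are inclusive, so inclusive limits are assumed, and `Band`, `SCORES`, and `classify` are hypothetical names.

```python
from typing import NamedTuple


class Band(NamedTuple):
    label: str
    low: float
    high: float


# Rating bands as stated in the breakdown above (inclusive limits are an assumption).
GOOD = Band("good", 0.5, 0.75)
VERY_GOOD = Band("very good", -0.2, 0.1)

# Score and the band it is compared against, as reported in the breakdown above.
SCORES = {
    "camera_position": (0.2, GOOD),
    "shot_analysis": (0.45, GOOD),
    "aesthetic_analysis": (0.13, VERY_GOOD),
}


def classify(scores):
    """Map each metric to 'below', 'within', or 'above' relative to its band."""
    positions = {}
    for name, (score, band) in scores.items():
        if score < band.low:
            positions[name] = "below"
        elif score > band.high:
            positions[name] = "above"
        else:
            positions[name] = "within"
    return positions


if __name__ == "__main__":
    for name, position in classify(SCORES).items():
        score, band = SCORES[name]
        print(f"{name}: {score} is {position} the '{band.label}' band "
              f"[{band.low}, {band.high}]")
```

Under these assumptions, the camera position and shot analysis scores come out below the lower limit of 0.5, and the aesthetic score of 0.13 comes out marginally above the upper limit of 0.1.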