AI's Artistic Eye: Capturing Emotion, Not Camera Angles with Freepik
Generating realistic, emotionally evocative images is a rapidly evolving area of artificial intelligence. This blog post examines how a generative AI model renders scenes from detailed descriptions, focusing on its ability to capture facial expressions and convey emotion. The model demonstrates a strong understanding of shot composition and aesthetic style, but it falls short in capturing the intended camera position. This highlights the ongoing challenge of bridging the gap between human artistic intuition and AI’s computational capabilities. We look at the model’s strengths and weaknesses, show how its artistic choices evoke emotion, and explore the potential for future advancements in this area of AI research.
Created with: Freepik
A Mysterious Gaze in the Shadows
A young woman, shrouded in a black coat, stands in a narrow alleyway, her piercing gaze locked directly on the viewer. The dramatic lighting and confined space create a sense of mystery and tension, hinting at a story waiting to unfold.
Prompt
facial-expressions Fear: Unease, paranoia ; A lone figure; eye-level; Single Person; a dark, deserted alleyway; cinematic
Characteristic
Shot : A young woman in a black coat standing in a narrow, dimly lit alleyway.
Aesthetic Score : 0.7
Mood : mysterious, lonely, moody
Quality
Entropy : 6.69
Noise : 47
Prompt Clip Score : 0.21
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry and has a bit of grain.
Silhouetted Hero, Fog-Shrouded City: A Moment of Epic Mystery
A superhero stands tall against the backdrop of a foggy cityscape at dusk, their silhouette a beacon of hope amidst the shadows. The dramatic lighting and balanced composition create a sense of anticipation and mystery, hinting at the epic events to come.
Prompt
facial-expressions Fear: Dread, anticipation ; A superhero standing alone on a rooftop; eye-level; Hero; a cityscape shrouded in fog; cinematic
Characteristic
Shot : A superhero standing on a rooftop, looking out over a foggy city at night.
Aesthetic Score : 0.7
Mood : heroic, mysterious, brooding
Quality
Entropy : 6.68
Noise : 42
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.70
Image errors : The superhero’s face looks a little unnatural, as if it was generated by AI. The fog is too uniform and lacks depth.
Lost in the Shadows: A Moment of Contemplation
A young woman, shrouded in darkness, stands bathed in the ethereal glow of streetlights. Her contemplative expression and the mysterious ambiance evoke a sense of melancholy and intrigue. The dramatic lighting draws the viewer’s eye to her face, leaving them to ponder her thoughts and the secrets she holds.
Prompt
facial-expressions Fear: Vulnerability, isolation ; A woman walking down a dimly lit street; eye-level; Normal Person; a deserted street with flickering streetlights; cinematic
Characteristic
Shot : A young woman walking down a deserted street at night, lit by streetlamps.
Aesthetic Score : 0.7
Mood : melancholy, mysterious, contemplative
Quality
Entropy : 6.67
Noise : 49
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is slightly blurry, particularly in the background.
Lost in the Code: A Moment of Intense Focus
A young man, shrouded in shadow, sits hunched over his computer, his gaze fixed on the screen. The low lighting adds an air of mystery, hinting at the intensity of his focus. Is he working on a groundbreaking project, or lost in the thrill of a game? The scene captures the raw energy of dedication and the allure of the digital world.
Prompt
facial-expressions Fear: Disquiet, unease ; A gamer hunched over their computer; close-up; Gamer; a flickering monitor displaying a disturbing image; cinematic
Characteristic
Shot : A young man is sitting in front of a computer, looking intently at the screen. The room is dimly lit, and the only light source is coming from the computer screen.
Aesthetic Score : 0.7
Mood : focused, intense, serious
Quality
Entropy : 6.48
Noise : 50
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some slight artifacts, particularly near the edges of the computer screen.
Fear in the Shadows: A Portrait of Unease
A single light source illuminates a woman’s face, revealing a look of intense fear. The low-key lighting and her worried expression create a palpable sense of suspense and unease, drawing the viewer into a moment of heightened tension.
Prompt
facial-expressions Fear: Terror, helplessness ; hiding ; low-angle; Single Person; a dark room with shadows creeping in; cinematic
Characteristic
Shot : A close-up portrait of a woman’s face, with her eyes wide open in a state of fear or surprise. She is illuminated by a soft light source from the left, casting shadows across her face.
Aesthetic Score : 0.6
Mood : fearful, intense, dramatic
Quality
Entropy : 5.79
Noise : 30
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors.
Caught in the Crossfire: A Soldier’s Moment of Terror
A young warrior, clad in battle-scarred armor, stares directly into the camera, his face etched with shock and fear. The blurry background hints at the chaos of a raging battle, drawing the viewer into the heart of the conflict. This intense and dramatic image captures the raw emotion of war, leaving a lasting impression.
Prompt
facial-expressions Fear: Desperation, courage ; A hero facing a monstrous creature; eye-level; Hero; a crumbling battlefield with smoke and debris; cinematic
Characteristic
Shot : A close-up portrait of a young man, likely a soldier, with dirt and grime on his face and armor. The background is out of focus, but it appears to be a battle scene with smoke and fire in the distance.
Aesthetic Score : 0.7
Mood : dramatic, intense, somber
Quality
Entropy : 6.77
Noise : 65
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has slight blurring around the edges and some noise in the background.
Fear Grips the Crowd as Storm Breaks
A group of people, mostly women, stand in fear, their faces illuminated by flashes of lightning against a dark, stormy sky. The scene evokes a sense of suspense and anticipation, leaving viewers on the edge of their seats.
Prompt
facial-expressions Fear: Anxiety, uncertainty ; A group of people huddled together in a darkened room; eye-level; Normal People; a storm raging outside with thunder and lightning; cinematic
Characteristic
Shot : A group of people are looking up with fear, likely in a situation of danger. The setting is dark with rain and lightning.
Aesthetic Score : 0.7
Mood : tense, ominous, suspenseful
Quality
Entropy : 6.23
Noise : 46
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The lighting is a bit uneven, with some parts of the image appearing darker than others, and the lightning effects look slightly artificial.
In the Zone: Gamer’s Intensity Captures the Screen
A young woman, eyes locked on the screen, embodies the thrill of competitive gaming. Dramatic lighting and her focused expression create a palpable sense of intensity, capturing the moment just before a crucial play.
Prompt
facial-expressions Fear: Shock, adrenaline ; A gamer’s hands shaking as they play a horror game; close-up; Gamer; a screen displaying a jump scare; cinematic
Characteristic
Shot : A young woman playing video games, looking at the camera with a shocked expression, wearing a headset, holding a controller, in a dimly lit room.
Aesthetic Score : 0.6
Mood : intense, focused, excited
Quality
Entropy : 6.72
Noise : 48
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image has some noise and graininess, particularly in the darker areas.
Solitude Amidst the Storm
A lone figure stands on a windswept cliff, gazing out at a tumultuous sea. The dramatic lighting and vastness of the scene evoke a sense of isolation and awe, while the distant figure on another clifftop hints at a connection beyond the storm.
Prompt
facial-expressions Fear: Loneliness, despair ; A lone figure standing at the edge of a cliff; eye-level; Single Person; a vast, empty landscape with a stormy sky; cinematic
Characteristic
Shot : A lone figure stands on a clifftop overlooking a dramatic, stormy sea, with a distant figure standing on another cliff in the background.
Aesthetic Score : 0.8
Mood : melancholy, dramatic, contemplative
Quality
Entropy : 6.60
Noise : 61
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable artifacts or errors in the image.
Man of Fire: A City’s Last Stand
Amidst the fiery ruins of a city, a lone figure stands defiant. The flames engulfing his chest mirror the inferno raging around him, creating a powerful and dramatic image of resilience in the face of apocalypse. His serious expression speaks volumes of the danger and urgency he faces, leaving the viewer to wonder what fate awaits him.
Prompt
facial-expressions Fear: Loss, determination ; A hero standing amidst a burning city; eye-level; Hero; a chaotic scene with smoke and flames; cinematic
Characteristic
Shot : A man is standing in the middle of a city street, with fire burning in the background and a flame-shaped pattern on his chest. The scene is dramatic and conveys a sense of danger and chaos.
Aesthetic Score : 0.7
Mood : intense, dramatic, apocalyptic
Quality
Entropy : 6.80
Noise : 52
Prompt Clip Score : 0.33
AI Evaluation
Likelihood of AI : 0.70
Image errors : The flames on the man’s chest look a bit unnatural and there are some minor artifacts in the background.
Conclusion
The results show that the generative AI model performed well on shot composition and aesthetic style, but struggled to capture the intended camera position.
Here’s a breakdown:
- Camera Position: The model scored 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.59, which is considered good. This indicates that the model was able to understand and translate the scene description from the prompt into a visually coherent shot.
- Aesthetic Analysis: The model scored 0.11, which is considered very good. This means that the generated images closely matched the expected aesthetic style, despite the issues with camera position.
Overall, the model demonstrates a good understanding of shot composition but needs improvement in accurately capturing the intended camera position. The model’s ability to achieve the desired aesthetic is a positive sign.
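For readers who want to compare the ten images side by side, the per-image values listed above can be averaged with a few lines of Python. This is only a sketch: the `ImageMetrics` type, its field names, and the 0.5 cut-off for "Likelihood of AI" are assumptions made for illustration, and it does not reproduce the Camera Position, Shot Analysis, or Aesthetic Analysis scores, whose calculation is not described in this post.

```python
from statistics import mean
from typing import NamedTuple


class ImageMetrics(NamedTuple):
    """One row per image; field names are hypothetical labels for this sketch."""
    title: str
    aesthetic: float
    entropy: float
    noise: int
    prompt_clip: float
    likelihood_of_ai: float


# Values copied from the ten image sections above, in order of appearance.
IMAGES = [
    ImageMetrics("A Mysterious Gaze in the Shadows", 0.7, 6.69, 47, 0.21, 0.30),
    ImageMetrics("Silhouetted Hero, Fog-Shrouded City: A Moment of Epic Mystery", 0.7, 6.68, 42, 0.29, 0.70),
    ImageMetrics("Lost in the Shadows: A Moment of Contemplation", 0.7, 6.67, 49, 0.31, 0.20),
    ImageMetrics("Lost in the Code: A Moment of Intense Focus", 0.7, 6.48, 50, 0.28, 0.30),
    ImageMetrics("Fear in the Shadows: A Portrait of Unease", 0.6, 5.79, 30, 0.22, 0.20),
    ImageMetrics("Caught in the Crossfire: A Soldier’s Moment of Terror", 0.7, 6.77, 65, 0.29, 0.20),
    ImageMetrics("Fear Grips the Crowd as Storm Breaks", 0.7, 6.23, 46, 0.28, 0.20),
    ImageMetrics("In the Zone: Gamer’s Intensity Captures the Screen", 0.6, 6.72, 48, 0.29, 0.20),
    ImageMetrics("Solitude Amidst the Storm", 0.8, 6.60, 61, 0.29, 0.10),
    ImageMetrics("Man of Fire: A City’s Last Stand", 0.7, 6.80, 52, 0.33, 0.70),
]

NUMERIC_FIELDS = ("aesthetic", "entropy", "noise", "prompt_clip", "likelihood_of_ai")


def summarize(images):
    """Return the mean of each numeric field, rounded to three decimals."""
    return {
        field: round(mean(getattr(img, field) for img in images), 3)
        for field in NUMERIC_FIELDS
    }


def flagged_as_ai(images, threshold=0.5):
    """Titles whose 'Likelihood of AI' meets the (assumed) threshold."""
    return [img.title for img in images if img.likelihood_of_ai >= threshold]


if __name__ == "__main__":
    for field, value in summarize(IMAGES).items():
        print(f"{field}: {value}")
    print("Flagged as likely AI:", flagged_as_ai(IMAGES))
```

With the values above, the mean Aesthetic Score comes out at 0.69 and the mean Prompt Clip Score at 0.279. Only the two images rated 0.70 for Likelihood of AI, the rooftop superhero and the burning city, pass the assumed 0.5 threshold.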