AI's Facial Expressions: A Step Forward, But Still Room for Growth with Freepik
Facial expressions are a powerful tool in storytelling, conveying emotions and intentions without words. In the realm of AI-generated imagery, capturing these nuances is a crucial step towards creating truly immersive and engaging experiences. This blog post examines the capabilities of a new AI model in generating images with specific facial expressions, exploring its strengths and weaknesses in different scenarios.
Created with: Freepik
Lost in the Neon Glow: A Moment of Solitude in the City
A young woman stands amidst the vibrant chaos of a bustling city street, bathed in the glow of colorful neon signs. The camera focuses on her face, capturing a sense of quiet contemplation that contrasts sharply with the surrounding energy. Her expression is a study in mystery, hinting at a story waiting to be told. This image evokes a mood of urban melancholy, leaving the viewer to wonder about her thoughts and the secrets she holds.
Prompt
facial-expressions Confusion: Disoriented, overwhelmed ; A lone figure; eye-level; Single Person; a bustling city street with neon signs and crowds; cinematic
Characteristic
Shot : A young woman with short brown hair stands in a brightly lit city street at night. There are many people walking in the background, and the neon signs of the city are reflected in the wet pavement.
Aesthetic Score : 0.8
Mood : melancholy, mysterious, urban
Quality
Entropy : 6.80
Noise : 58
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor artifacts in the background, particularly around the neon signs. There is also some slight noise in the shadows. The woman’s eyes are a little blurry.
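The prompts in this post appear to follow one semicolon-separated pattern: an expression with descriptors, a subject, a camera position, a character type, a setting, and a style. The following is a minimal sketch of how such a prompt could be split into parts. The field names (`expression`, `camera`, `character_type`, and so on) are assumptions made for illustration and are not taken from Freepik or from the tooling behind this post.

```python
from typing import NamedTuple


class PromptParts(NamedTuple):
    # All field names are hypothetical labels, not an official schema.
    expression: str
    descriptors: list[str]
    subject: str
    camera: str
    character_type: str
    setting: str
    style: str


def parse_prompt(prompt: str) -> PromptParts:
    """Split a prompt of the form used in this post into labeled parts."""
    fields = [field.strip() for field in prompt.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 semicolon-separated fields, got {len(fields)}")
    head, subject, camera, character_type, setting, style = fields

    # The first field looks like "facial-expressions Confusion: Doubt, uncertainty".
    head = head.removeprefix("facial-expressions").strip()
    expression, _, descriptor_text = head.partition(":")
    descriptors = [d.strip() for d in descriptor_text.split(",") if d.strip()]

    return PromptParts(
        expression.strip(), descriptors, subject, camera, character_type, setting, style
    )


example = (
    "facial-expressions Confusion: Disoriented, overwhelmed ; A lone figure; "
    "eye-level; Single Person; a bustling city street with neon signs and crowds; "
    "cinematic"
)
parts = parse_prompt(example)
print(parts.expression, parts.descriptors, parts.camera)
```

Splitting the prompt this way makes it easier to compare the requested expression and camera position against the caption and mood reported for each image.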
Superman: A Hero Rises from the Ashes
In this powerful image, Superman stands defiant amidst the ruins of a destroyed city, his costume torn and weathered. The scene evokes a sense of hope and resilience, showcasing the hero’s strength in the face of overwhelming adversity.
Prompt
facial-expressions Confusion: Doubt, uncertainty ; A superhero in a tattered costume; eye-level; Hero; a destroyed cityscape with smoke and debris; cinematic
Characteristic
Shot : A superhero in a Superman costume stands in a destroyed city, with rubble and smoke in the background. The superhero looks determined and focused, conveying a sense of hope and resilience.
Aesthetic Score : 0.7
Mood : dramatic, hopeful, resilient
Quality
Entropy : 6.88
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to have some slight artifacts, particularly in the smoke and rubble. Some areas, such as the superhero’s muscles, may have been digitally enhanced.
The Weight of Expectations: A Woman’s Intense Gaze in a Sterile Office
A woman in a sharp business suit sits at her desk in a sleek, modern office, her gaze fixed directly on the viewer. The sterile environment and her intense expression create a palpable sense of tension and unease, hinting at the weight of professional pressure and the unspoken demands of her world.
Prompt
facial-expressions Confusion: Lost, unmoored ; A woman in a business suit; eye-level; Normal People; a sterile office with fluorescent lights and cubicles; cinematic
Characteristic
Shot : A woman is sitting at a desk in an office, looking directly at the camera. The room is well-lit, and the woman is wearing a suit. The overall feel is professional and focused.
Aesthetic Score : 0.6
Mood : serious, focused, professional
Quality
Entropy : 6.83
Noise : 43
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no noticeable errors in the image.
Lost in the Code: A Moment of Intense Focus
A young man, headphones on, sits immersed in his work at a multi-monitor desk. His focused gaze and the blurred background create a sense of intensity, capturing the essence of deep concentration and technological immersion.
Prompt
facial-expressions Confusion: Frustration, bewilderment ; A gamer with headphones on; close-up; Gamer; a dimly lit room with a computer screen displaying a complex game interface; cinematic
Characteristic
Shot : A young man in headphones is sitting in front of a computer, looking intently at one of the screens. He is surrounded by multiple computer monitors.
Aesthetic Score : 0.6
Mood : focused, intense, technological
Quality
Entropy : 6.49
Noise : 42
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There is some noise in the image, particularly in the darker areas, which might be due to the lighting. The image also has a slight yellow cast. There are some minor sharpening artifacts around the edges of the image.
Lost in the Fog: A Man’s Shadow in the Night
A solitary figure, shrouded in a trench coat, stands in a dimly lit alleyway, bathed in the eerie glow of streetlights. The dense fog adds to the mystery, creating a sense of suspense and intrigue. This cinematic scene begs the question: who is this man, and what secrets does he hold?
Prompt
facial-expressions Confusion: Suspicious, wary ; A man in a trench coat; eye-level; Single Person; a foggy alleyway with flickering streetlights; cinematic
Characteristic
Shot : A man in a trench coat stands in a foggy street at night, lit by streetlights. There are two other people in the background, walking further down the street.
Aesthetic Score : 0.7
Mood : mysterious, suspenseful, moody
Quality
Entropy : 6.87
Noise : 48
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are no noticeable errors in the image. The quality is good, with no visible compression or pixelation.
A Knight’s Contemplation: Mystery and Melancholy in the Forest
A lone knight, clad in shining armor, stands amidst the verdant embrace of a forest. His pensive gaze, directed towards an unseen horizon, evokes a sense of dramatic intrigue. The play of light and shadow adds to the mystery, leaving the viewer to ponder the knight’s thoughts and the path he may choose.
Prompt
facial-expressions Confusion: Disillusioned, lost ; A knight in shining armor; eye-level; Hero; a dark forest with twisted trees and ominous shadows; cinematic
Characteristic
Shot : A knight stands in a forest, looking off to the side, with a thoughtful expression.
Aesthetic Score : 0.7
Mood : mysterious, pensive, dramatic
Quality
Entropy : 6.83
Noise : 64
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, especially in the background. Some of the leaves and branches are out of focus, and the light is not evenly distributed across the image.
A Family’s Silent Tension
A family sits together at a dinner table, their gazes fixed on something unseen. The atmosphere is thick with unspoken tension, leaving the viewer to wonder what has captured their attention and what secrets lie beneath the surface.
Prompt
facial-expressions Confusion: Awkward, uncomfortable ; A family at a dinner table; eye-level; Normal People; a brightly lit kitchen with mismatched plates and silverware; cinematic
Characteristic
Shot : A family dinner setting with five people at the table, lit by candles and ambient light, in an understated warm color scheme.
Aesthetic Score : 0.6
Mood : calm, intimate, contemplative
Quality
Entropy : 6.90
Noise : 63
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : No visible artifacts or errors.
Lost in the Game: A Moment of Intense Focus
A young man, bathed in the soft glow of a TV screen, sits on his bed, controller in hand. His focused expression and the dimly lit room create a sense of intimacy and immersion in the virtual world. This image captures the intensity and contemplation that come with being fully engrossed in a video game.
Prompt
facial-expressions Confusion: Overwhelmed, disoriented ; A gamer holding a controller; close-up; Gamer; a brightly lit room with a TV screen displaying a chaotic game scene; cinematic
Characteristic
Shot : A young man is sitting on a bed, holding a video game controller, staring intently at the camera. There are multiple TV screens in the background. It is likely he is playing a video game.
Aesthetic Score : 0.6
Mood : focused, serious, intense
Quality
Entropy : 6.81
Noise : 45
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have some noise and compression artifacts. The background is a bit blurry.
Lost in the City’s Embrace
A young woman navigates the bustling city streets at dusk, her gaze fixed on the viewer, conveying a sense of melancholy and introspection. The blurred background isolates her, adding to the mystery surrounding her thoughts.
Prompt
facial-expressions Confusion: Lost, alienated ; A woman walking down a crowded street; eye-level; Single Person; a bustling city street with people rushing past; cinematic
Characteristic
Shot : A young woman walks through a bustling city street, looking directly at the camera with a serious expression. The background is blurred, focusing on the woman and her emotions.
Aesthetic Score : 0.7
Mood : serious, urban, melancholic
Quality
Entropy : 6.81
Noise : 52
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No visible artifacts or errors in the image.
Superman, Guardian of the Night
A dramatic shot of Superman standing tall on a rooftop, bathed in moonlight, overlooking a sprawling cityscape. The pose, lighting, and setting evoke a sense of power, heroism, and hope.
Prompt
facial-expressions Confusion: Doubt, questioning ; A superhero standing on a rooftop; eye-level; Hero; a cityscape with twinkling lights and a full moon; cinematic
Characteristic
Shot : Superman standing on a rooftop overlooking a city at night, with a full moon in the background.
Aesthetic Score : 0.7
Mood : heroic, dramatic, contemplative
Quality
Entropy : 6.76
Noise : 45
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.80
Image errors : The city skyline looks a bit artificial. The lighting on Superman’s suit is a bit too bright and even.
Conclusion
The results show that the generative AI model matched the intended aesthetic well, but struggled with camera position and scene understanding. Here’s a breakdown:
- Camera Position: The model scored 0.3, which is below the “good” range of 0.5 to 0.75. This indicates that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored 0.45, also below the “good” range. This suggests that the model had some difficulty understanding the scene and translating it into the generated image.
- Aesthetic Analysis: The model scored 0.09, which is within the “very good” range of -0.2 to 0.1. This means the generated image’s aesthetic closely matched the expected aesthetic described in the prompt.
Overall: While the model excelled in capturing the desired aesthetic, it struggled with accurately representing the camera position and scene details. This suggests that the model might need further training to better understand and translate these aspects from prompts into generated images.
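The range checks in the breakdown above can be sketched in a few lines of Python. The scores and ranges are the ones quoted in this post. The `Band` type, the metric keys, and the choice of inclusive bounds are assumptions made for illustration, as is reusing the 0.5 to 0.75 "good" range for Shot Analysis, for which the post does not restate the range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """A named score range; bounds are treated as inclusive (an assumption)."""
    label: str
    low: float
    high: float

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high


# Ranges as quoted in the conclusion.
GOOD = Band("good", 0.5, 0.75)
VERY_GOOD_AESTHETIC = Band("very good", -0.2, 0.1)

# Scores as quoted in the conclusion; the dictionary keys are hypothetical names.
RESULTS = {
    "camera_position": (0.30, GOOD),
    "shot_analysis": (0.45, GOOD),  # assumed to share the "good" range
    "aesthetic_analysis": (0.09, VERY_GOOD_AESTHETIC),
}


def summarize(results: dict[str, tuple[float, Band]]) -> dict[str, bool]:
    """Return, for each metric, whether its score falls inside its target range."""
    return {name: band.contains(score) for name, (score, band) in results.items()}


summary = summarize(RESULTS)
for name, in_range in summary.items():
    score, band = RESULTS[name]
    status = "within" if in_range else "outside"
    print(f"{name}: {score:.2f} is {status} the '{band.label}' range "
          f"[{band.low}, {band.high}]")
```

Only the aesthetic score falls inside its target range, which matches the overall reading above.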