AI's Artistic Struggle: Capturing Emotion in Images with Freepik
In the realm of artificial intelligence, the ability to generate images that evoke specific emotions is a coveted skill. This blog post delves into the challenges of using AI to create images that not only capture the intended scene and camera position but also convey the desired emotional impact. We’ll examine a case study where a generative AI model was tasked with creating images based on various prompts, focusing on the model’s performance in capturing facial expressions and the overall aesthetic style. Through this analysis, we’ll explore the strengths and weaknesses of AI in understanding and replicating human emotions in visual art.
Created with: freepik
Lost in the City Lights
A young woman stands alone on a bustling city street, her gaze fixed on something unseen. The blurred lights and her pensive expression create a sense of mystery and melancholy, capturing the essence of urban solitude.
Prompt
facial-expressions Excitement: Thrilled, anticipation ; A lone figure; eye-level; Single Person; bustling city street at night; cinematic
Characteristic
Shot : A young woman stands in a city street at night, her face illuminated by streetlights and the glow of nearby buildings.
Aesthetic Score : 0.7
Mood : melancholy, pensive, introspective
Quality
Entropy : 6.65
Noise : 49
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.30
Image errors : No notable artifacts or errors.
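The post does not say how the Entropy value is computed. One common reading, assumed here, is the Shannon entropy of the 8-bit grayscale histogram, which ranges from 0 to 8 bits; the values reported in this post (roughly 6.5 to 6.9) would fit that scale. A minimal sketch under that assumption, where `shannon_entropy` is a hypothetical helper name and the pixel lists are toy data rather than real images:

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of 8-bit grayscale values.

    Assumption: this mirrors the 'Entropy' metric in the post; the actual
    tool and formula behind that number are not documented there.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    # Sum p * log2(1/p) over every intensity level that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy inputs: a flat gray image and one that uses all 256 levels equally.
flat = [128] * 1024
uniform = list(range(256)) * 4

print(shannon_entropy(flat))     # 0.0 (no variation at all)
print(shannon_entropy(uniform))  # 8.0 (maximum for 8-bit data)
```

Under this reading, a score near 6.7 would indicate an image that uses a broad range of tones without being close to pure noise.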
Superman Soars into the Sunset, Hope Takes Flight
A dramatic image captures Superman in mid-flight, his cape billowing behind him as he soars above a cityscape bathed in the golden hues of sunset. The scene evokes a sense of heroism, hope, and urgency, leaving viewers eager to witness the next chapter in the Man of Steel’s story.
Prompt
facial-expressions Excitement: Triumphant, exhilarating ; A superhero in mid-air; low-angle; Hero; cityscape with a dramatic sunset; cinematic
Characteristic
Shot : Superman flying over a city at sunset
Aesthetic Score : 0.7
Mood : epic, powerful, heroic
Quality
Entropy : 6.84
Noise : 49
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The background is a bit blurry and unrealistic. The cape is a little stiff and lacks realistic movement.
Sun-Kissed Laughter: Capturing the Joy of Youth
A vibrant scene of youthful energy unfolds in this photograph. A group of friends, bathed in warm sunlight, run through a lush park, their laughter echoing through the air. The sharp focus on the girl in the foreground draws the viewer’s eye, while the blurred background adds a sense of movement and depth, capturing the carefree spirit of the moment.
Prompt
facial-expressions Excitement: Joyful, carefree ; A group of friends laughing and running; eye-level; Normal People; a sunny park with a vibrant green lawn; cinematic
Characteristic
Shot : A group of young people are running in a park, laughing and enjoying the sunshine. The woman in the center of the frame is the main subject. She has her hair flowing in the wind and is smiling brightly. The other people in the background are blurred and out of focus, creating a sense of movement and energy.
Aesthetic Score : 0.7
Mood : joyful, energetic, carefree
Quality
Entropy : 6.76
Noise : 65
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image appears to have a slight over-exposure in certain areas, leading to a bit of blown-out highlights in the background. The focus is slightly soft on the woman’s hair, making it less distinct.
Lost in the Code: A Moment of Intense Focus
A young man, bathed in the soft glow of his computer screen, is completely absorbed in his work. The low lighting and close-up shot capture the intensity and seriousness of his focus, highlighting the power of dedication and the allure of the digital world.
Prompt
facial-expressions Excitement: Intense, focused ; A gamer’s hands furiously tapping on a keyboard; close-up; Gamer; a dimly lit room with glowing screens; cinematic
Characteristic
Shot : A young man is wearing headphones and typing on a keyboard. The scene is set in a dimly lit room with a computer monitor in the background.
Aesthetic Score : 0.7
Mood : focused, intense, determined
Quality
Entropy : 6.70
Noise : 54
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : No noticeable artifacts or errors in the image.
Silhouettes of Serenity: Two Women in the Sunset’s Embrace
A breathtaking sunset paints the sky with dramatic hues as two women stand on a cliff, their figures silhouetted against the golden light. The vast ocean stretches before them, mirroring the tranquility and melancholic beauty of the moment. This scene evokes a sense of peace and awe, leaving you to ponder the mysteries of the world.
Prompt
facial-expressions Excitement: Awe-inspiring, liberating ; A woman standing on a cliff overlooking a vast ocean; eye-level; Single Person; dramatic clouds and a setting sun; cinematic
Characteristic
Shot : Two women standing on a cliff overlooking a vast ocean with a dramatic sunset in the background.
Aesthetic Score : 0.8
Mood : serene, contemplative, melancholic
Quality
Entropy : 6.72
Noise : 66
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image has some minor noise, but it is not a significant error. The composition is slightly off-balance, as the figures are clustered in the center.
Amidst the Chaos, a Soldier’s Desperate Run
A lone soldier races through a smoke-filled battlefield, explosions echoing around him. The scene captures the raw intensity and danger of war, with fallen comrades scattered in the background. The image evokes a sense of urgency and desperation, highlighting the chaotic reality of combat.
Prompt
facial-expressions Excitement: Brave, adrenaline-fueled ; A hero charging into battle; low-angle; Hero; a chaotic battlefield with explosions and smoke; cinematic
Characteristic
Shot : A soldier running through a battlefield with explosions behind him.
Aesthetic Score : 0.8
Mood : intense, dramatic, chaotic
Quality
Entropy : 6.86
Noise : 64
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.70
Image errors : The explosions seem a bit too perfect and artificial, and the blur in the background might be overdone.
Birthday Bliss: Friends Celebrate with Laughter and Joy
Capture the infectious joy of a birthday celebration as friends gather, laughing and looking up at something out of frame. Balloons and festive decorations create a vibrant atmosphere, radiating positive energy and carefree fun.
Prompt
facial-expressions Excitement: Happy, celebratory ; A family celebrating a birthday; eye-level; Normal People; a brightly decorated living room with balloons and streamers; cinematic
Characteristic
Shot : A group of friends are celebrating a birthday. They are all smiling and laughing. There are balloons and party decorations in the background.
Aesthetic Score : 0.7
Mood : joyful, festive, celebratory
Quality
Entropy : 6.84
Noise : 63
Prompt Clip Score : 0.31
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is well-lit and there are no noticeable artifacts or errors.
The Focus of the Game
A young man, headphones on, sits transfixed before his computer screen, his face a mask of intense concentration. The lighting and camera angle draw you into his world, highlighting the determination in his eyes as he navigates the digital battlefield.
Prompt
facial-expressions Excitement: Engrossed, focused ; A gamer’s face illuminated by the screen; close-up; Gamer; a dark room with neon lights reflecting on the screen; cinematic
Characteristic
Shot : A young man wearing headphones is sitting in front of a computer screen, likely playing a video game. The room is dimly lit with colorful lights and the focus is on the man’s face and the headphones.
Aesthetic Score : 0.7
Mood : focused, intense, serious
Quality
Entropy : 6.56
Noise : 50
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : The image is slightly blurry, especially in the background, and the lighting is uneven. The subject’s hair appears slightly unnatural.
The Scream of Excitement: A Rollercoaster Ride From the Inside
Experience the thrill of a rollercoaster ride through the eyes of a screaming passenger. This image captures the intense moment of the drop, with the track blurring behind the man’s face, creating a sense of exhilarating chaos.
Prompt
facial-expressions Excitement: Thrilling, exhilarating ; A man riding a rollercoaster; POV shot; Single Person; a fast-paced ride with twists and turns; cinematic
Characteristic
Shot : A man is riding a roller coaster. The camera is looking up at him from the front, and the motion blur effect is quite intense.
Aesthetic Score : 0.6
Mood : intense, exciting, thrilling
Quality
Entropy : 6.82
Noise : 58
Prompt Clip Score : 0.29
AI Evaluation
Likelihood of AI : 0.10
Image errors : No noticeable artifacts or errors. There is motion blur, but it’s likely intentional.
Stormy Silhouette: A Man on the Edge
A muscular figure stands defiant against a backdrop of brewing storm clouds, his silhouette a stark contrast against the city lights below. The scene evokes a sense of power, drama, and brooding anticipation, leaving the viewer wondering what lies ahead.
Prompt
facial-expressions Excitement: Victorious, powerful ; A hero standing triumphantly on a rooftop; high-angle; Hero; a cityscape with a dramatic storm in the background; cinematic
Characteristic
Shot : A muscular man stands on a rooftop overlooking a cityscape with a stormy sky above him.
Aesthetic Score : 0.6
Mood : dramatic, powerful, intense
Quality
Entropy : 6.76
Noise : 64
Prompt Clip Score : 0.25
AI Evaluation
Likelihood of AI : 0.80
Image errors : The image appears to be slightly blurred, and there are some artifacts around the edges of the man’s body.
Conclusion
The results show that the generative AI model matched the intended aesthetic style well but struggled with scene understanding and camera position. Here’s a breakdown:
- Camera Position: The model scored a 0.25, which is considered below average. This suggests that the model didn’t accurately capture the intended camera position described in the prompt.
- Shot Analysis: The model scored a 0.38, which is also below average. This indicates that the model didn’t fully understand the scene described in the prompt and didn’t create an image that accurately reflects it.
- Aesthetic Analysis: The model scored a 0.1, which is considered very good. This means that the generated image closely matched the expected aesthetic style, despite the issues with camera position and scene understanding.
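Overall, the model struggles to follow the prompt’s instructions on camera position and scene composition, even though it produces images in the desired aesthetic style. The emotional target is also missed in places: the first prompt asked for a thrilled, anticipatory expression, but the resulting image reads as melancholy and pensive, and the cliff prompt asked for a single awe-struck woman but produced two figures in a serene, melancholic scene.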
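The post does not say how the three scores in the breakdown were derived, so they cannot be reproduced here. As a rough cross-check, the per-image numbers listed above can at least be averaged. The sketch below copies the Aesthetic Score, Prompt Clip Score, and Likelihood of AI from each listing; the shortened titles, the `summarize` helper, and the 0.5 cut-off for "likely AI" are illustrative assumptions, not part of the original evaluation.

```python
from statistics import mean

# (short title, aesthetic score, prompt CLIP score, likelihood of AI),
# copied from the ten image listings in this post.
results = [
    ("City Lights",        0.7, 0.25, 0.30),
    ("Superman",           0.7, 0.25, 0.80),
    ("Sun-Kissed",         0.7, 0.28, 0.10),
    ("Lost in the Code",   0.7, 0.27, 0.20),
    ("Silhouettes",        0.8, 0.29, 0.30),
    ("Soldier",            0.8, 0.28, 0.70),
    ("Birthday",           0.7, 0.31, 0.10),
    ("Focus of the Game",  0.7, 0.27, 0.30),
    ("Rollercoaster",      0.6, 0.29, 0.10),
    ("Stormy Silhouette",  0.6, 0.25, 0.80),
]


def summarize(rows, ai_threshold=0.5):
    """Average each metric and list images at or above the AI threshold.

    ai_threshold is an arbitrary cut-off chosen for illustration.
    """
    return {
        "mean_aesthetic": round(mean([r[1] for r in rows]), 3),
        "mean_clip": round(mean([r[2] for r in rows]), 3),
        "mean_ai_likelihood": round(mean([r[3] for r in rows]), 3),
        "likely_ai": [r[0] for r in rows if r[3] >= ai_threshold],
    }


summary = summarize(results)
for key, value in summary.items():
    print(f"{key}: {value}")
```

On this data the average aesthetic score is 0.70 and the average prompt CLIP score is about 0.27, and the three images flagged as likely AI are the two superhero scenes and the battlefield scene.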