AI Captures Emotion but Struggles with Style: Testing stability-ai-ultra
The ability to generate images from text prompts is a rapidly evolving field in AI. This technology holds immense potential for creative applications, but it still faces challenges in capturing the nuances of human expression and artistic style. This blog post examines the performance of a generative AI model in creating images based on detailed scene descriptions, focusing on its ability to convey emotion and aesthetic style through facial expressions.
Created with: stability-ai-ultra
A Solitary Figure Contemplates the Fury of the Storm
A lone figure stands on a windswept cliff, silhouetted against a tempestuous sea. The crashing waves and dramatic sky evoke a sense of power and isolation, leaving the viewer to ponder the figure’s thoughts and emotions.
Prompt
facial-expressions Hope: Determined, resilient, facing adversity ; A lone figure standing on a clifftop overlooking a vast, stormy sea; eye-level; Single Person; Dramatic, stormy sky with crashing waves; cinematic
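The prompts in this post all follow the same semicolon-separated layout. As a rough sketch, the prompt above could be split into named fields as shown here. The field names (category, theme, emotions, scene, camera, subject, background, style) are labels assumed for illustration; the post itself does not name them.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    # Field names are illustrative labels, not terms defined by the post.
    category: str
    theme: str
    emotions: list[str]
    scene: str
    camera: str
    subject: str
    background: str
    style: str


def parse_prompt(prompt: str) -> ScenePrompt:
    """Split a ';'-separated prompt into its six parts."""
    parts = [p.strip() for p in prompt.split(";")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(parts)}")
    head, scene, camera, subject, background, style = parts
    # The first part looks like "facial-expressions Hope: emotion, emotion, ..."
    label, _, emotion_text = head.partition(":")
    category, _, theme = label.strip().partition(" ")
    emotions = [e.strip() for e in emotion_text.split(",") if e.strip()]
    return ScenePrompt(category, theme.strip(), emotions,
                       scene, camera, subject, background, style)


example = parse_prompt(
    "facial-expressions Hope: Determined, resilient, facing adversity ; "
    "A lone figure standing on a clifftop overlooking a vast, stormy sea; "
    "eye-level; Single Person; "
    "Dramatic, stormy sky with crashing waves; cinematic"
)
print(example.emotions, example.camera, example.style)
```

Splitting the prompt this way makes it easier to compare each requested field, such as the camera position or subject, with what the generated image actually shows.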
Characteristic
Shot : A lone figure stands on a cliff overlooking a stormy sea. The waves are crashing against the rocks, and the sky is dark and brooding.
Aesthetic Score : 0.8
Mood : dramatic, ominous, powerful
Quality
Entropy : 6.87
Noise : 96
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.10
Image errors : There are no visible image errors.
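The post does not say how the Entropy value is computed. One common definition is the Shannon entropy of the 8-bit grayscale intensity histogram, which ranges from 0 to 8 bits. The sketch below assumes that definition and takes a flat sequence of pixel intensities; it is not necessarily the method used to produce the figures reported here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy, in bits, of a sequence of pixel intensities.

    Assumption: this mirrors a common histogram-based definition,
    not a confirmed description of the metric used in this post.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels given")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat_image = [128] * 64              # one intensity only
two_tone_image = [0] * 32 + [255] * 32  # two intensities, equally frequent
full_range_image = list(range(256))  # all 256 levels, equally frequent

print(shannon_entropy(flat_image))
print(shannon_entropy(two_tone_image))
print(shannon_entropy(full_range_image))
```

Under this definition, an image dominated by large uniform regions would tend to score lower than a busy, detailed scene.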
Heroic Firefighter Braves Blazing Inferno to Rescue Child
A firefighter’s unwavering determination shines through the chaos of a raging fire as he carries a child to safety. The stark contrast between his calm demeanor and the fiery backdrop creates a powerful and dramatic scene.
Prompt
facial-expressions Hope: Brave, selfless, courageous ; A firefighter carrying a child through a burning building; eye-level; Hero; Smoke and flames engulfing the background; cinematic
Characteristic
Shot : A firefighter in full gear carrying a small child through a raging fire.
Aesthetic Score : 0.7
Mood : dramatic, heroic, hopeful
Quality
Entropy : 6.76
Noise : 80
Prompt Clip Score : 0.32
AI Evaluation
Likelihood of AI : 0.60
Image errors : The fire appears somewhat artificial, with some unnatural shapes and textures. The smoke is also slightly blurred and lacks depth.
A Seed of Hope in the Desert
A young woman plants a sapling in a barren desert landscape, a symbol of hope and renewal amidst the dry, cracked earth. The contrast between the vibrant green and the desolate surroundings creates a powerful visual, emphasizing the importance of environmental stewardship.
Prompt
facial-expressions Hope: Optimistic, hopeful, believing in a better future ; A young woman planting a tree in a barren wasteland; eye-level; Normal Person; Dusty, desolate landscape with a single, hopeful green sprout; cinematic
Characteristic
Shot : A woman is planting a small tree in a dry, cracked desert landscape. The sun is shining and the sky is blue.
Aesthetic Score : 0.7
Mood : hopeful, environmental
Quality
Entropy : 6.82
Noise : 75
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : The image is well-exposed and there are no obvious errors. However, the woman’s hand in the foreground appears a little blurry.
Immersed in the Game: A Young Gamer’s Focused Intensity
A young man, captivated by the digital world, sits in a dimly lit room bathed in blue and red light. His expression is one of pure excitement and focus as he navigates the virtual landscape. The dramatic lighting and his dynamic pose capture the intensity of his gaming experience.
Prompt
facial-expressions Hope: Excited, triumphant, feeling a sense of accomplishment ; A gamer celebrating a victory with their team, their faces illuminated by the glow of the monitor; eye-level; Gamer; A dimly lit room with gaming peripherals and posters on the walls; cinematic
Characteristic
Shot : A young man wearing headphones is playing a video game on his computer. The room is lit with blue and red lights. He is laughing and looks excited.
Aesthetic Score : 0.6
Mood : excited, energetic, playful
Quality
Entropy : 6.91
Noise : 72
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, the colors are a bit oversaturated, and the focus is slightly off. The lighting is a bit harsh on the face.
A Single Flame, A Moment of Tranquility
A solitary candle flame dances in the center of the image, casting a warm glow on the surrounding white wax. The dark background creates a sense of intimacy and peace, highlighting the flickering light and inviting you to pause and appreciate the moment.
Prompt
facial-expressions Hope: Hopeful, comforting, a beacon of light in the darkness ; A single candle burning brightly in a dark room; eye-level; Single Person; Shadows and darkness surrounding the candle; cinematic
Characteristic
Shot : A single candle burns, its flame captured in close-up. The wax around the wick is melting.
Aesthetic Score : 0.7
Mood : calm, peaceful, contemplative
Quality
Entropy : 5.94
Noise : 83
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.10
Image errors : No obvious errors or artifacts are visible in the image.
A Moment of Joy: New Life Begins in a Hospital Room
This heartwarming image captures the tender moment of a doctor or nurse holding a newborn baby in a hospital room. The soft lighting and genuine smiles create a sense of happiness and new beginnings, reminding us of the beauty and hope that life brings.
Prompt
facial-expressions Hope: Joyful, hopeful, a symbol of new beginnings ; A doctor holding a newborn baby in their arms; eye-level; Hero; A sterile hospital room with medical equipment in the background; cinematic
Characteristic
Shot : A young woman, presumably a nurse or doctor, is holding a newborn baby in a hospital room. The baby is swaddled in a colorful blanket. There is a medical professional in the background.
Aesthetic Score : 0.7
Mood : joyful, tender, hopeful
Quality
Entropy : 6.81
Noise : 72
Prompt Clip Score : 0.30
AI Evaluation
Likelihood of AI : 0.10
Image errors : The image is slightly overexposed, especially in the woman’s face.
Golden Hour Gathering: Friends Share Laughter and Warmth
A group of friends enjoy a casual dinner by a window as the sun sets, casting a warm golden glow on their happy faces. The scene exudes a sense of intimacy and joy, captured in the laughter and conversation that fill the air.
Prompt
facial-expressions Hope: Warm, comforting, a sense of belonging ; A group of friends sharing a meal together in a cozy kitchen; eye-level; Normal People; Warm, inviting kitchen with sunlight streaming through the window; cinematic
Characteristic
Shot : A group of friends are gathered around a table, enjoying a meal together. The setting is a cozy and casual dining room, with warm lighting and a window behind the table.
Aesthetic Score : 0.7
Mood : happy, warm, friendly
Quality
Entropy : 6.80
Noise : 75
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.20
Image errors : There are some minor artifacts in the image, such as slight blurriness around the edges of the subjects.
Lost in the Game: A Gamer’s Intense Focus Under Neon Lights
A young man, his face illuminated by the vibrant hues of his monitor, is completely absorbed in his video game. The dramatic lighting and his focused posture capture the intensity and immersion of the gaming experience.
Prompt
facial-expressions Hope: Determined, focused, persevering ; A gamer overcoming a difficult challenge in a video game, their face showing determination and focus; eye-level; Gamer; A brightly lit room with a large monitor displaying the game; cinematic
Characteristic
Shot : A young man is playing a video game. He is wearing a headset and is looking intently at the screen. The room is lit with blue and purple lights.
Aesthetic Score : 0.6
Mood : focused, intense, futuristic
Quality
Entropy : 6.84
Noise : 72
Prompt Clip Score : 0.27
AI Evaluation
Likelihood of AI : 0.30
Image errors : Some minor noise is present in the image, but it is not very noticeable.
Soaring High: A Bird’s Eye View of Hope
A lone bird cuts through a vibrant blue sky, its silhouette a symbol of freedom and wonder. The scene evokes a sense of serenity and optimism, reminding us of the beauty and possibilities that lie ahead.
Prompt
facial-expressions Hope: Free, hopeful, a symbol of liberation ; Soaring through blue sky; eye-level; Single Person; Vast, open sky with fluffy white clouds; cinematic
Characteristic
Shot : A single bird is flying against a backdrop of fluffy white clouds in a clear blue sky.
Aesthetic Score : 0.7
Mood : peaceful, serene, hopeful
Quality
Entropy : 5.12
Noise : 61
Prompt Clip Score : 0.22
AI Evaluation
Likelihood of AI : 0.30
Image errors : There are no visible artifacts or errors in the image.
Silhouettes of Friendship Against a Hopeful Sunset
A group of friends stand together in a field, their silhouettes outlined against a vibrant orange sunset. The image evokes a sense of joy, hope, and togetherness, capturing the beauty of friendship in a dramatic and heartwarming way.
Prompt
facial-expressions Hope: United, hopeful, facing the future together ; A group of people standing together, arms linked, facing a bright sunrise; eye-level; Heroes; A vast, open field with a golden sunrise in the background; cinematic
Characteristic
Shot : A group of friends stand in a field, arms around each other, facing a sunset.
Aesthetic Score : 0.7
Mood : happy, hopeful, togetherness
Quality
Entropy : 6.52
Noise : 72
Prompt Clip Score : 0.28
AI Evaluation
Likelihood of AI : 0.20
Image errors : There is slight overexposure in the sky and some visible lens flare. The grass and the horizon are slightly blurred and not quite in focus.
Conclusion
The results show that the generative AI model handled scene content and camera position well but struggled to match the requested aesthetic. Here's a breakdown:
- Camera Position: The model scored 0.5, which this evaluation rates as good. The camera position in the generated images was fairly close to what the prompts requested.
- Shot Analysis: The model scored 4.1 out of 10, which this evaluation also rates as good. The model was able to understand the scenes described in the prompts and produce shots that reflected them.
- Aesthetic Analysis: The model scored 0.07, which is a poor result. The aesthetic of the generated images was not close to the aesthetic the prompts asked for.
Overall, the model appears capable of understanding the scene and camera position, but it needs improvement in generating images that meet the desired aesthetic.
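To put the per-image figures side by side, the short sketch below averages the Aesthetic Score, Prompt Clip Score, and Likelihood of AI values reported for the ten images and lists the images with the lowest Prompt Clip Score. The numbers are copied from the sections above; the short titles and the `ImageResult` type are labels assumed for illustration.

```python
from statistics import mean
from typing import NamedTuple


class ImageResult(NamedTuple):
    title: str            # shortened label, not the post's full heading
    aesthetic: float      # "Aesthetic Score"
    clip: float           # "Prompt Clip Score"
    ai_likelihood: float  # "Likelihood of AI"


RESULTS = [
    ImageResult("Solitary figure, storm", 0.8, 0.28, 0.10),
    ImageResult("Firefighter rescue", 0.7, 0.32, 0.60),
    ImageResult("Sapling in desert", 0.7, 0.28, 0.20),
    ImageResult("Excited gamer", 0.6, 0.27, 0.10),
    ImageResult("Candle flame", 0.7, 0.22, 0.10),
    ImageResult("Newborn in hospital", 0.7, 0.30, 0.10),
    ImageResult("Friends at dinner", 0.7, 0.27, 0.20),
    ImageResult("Focused gamer", 0.6, 0.27, 0.30),
    ImageResult("Bird in sky", 0.7, 0.22, 0.30),
    ImageResult("Friends at sunset", 0.7, 0.28, 0.20),
]

mean_aesthetic = round(mean(r.aesthetic for r in RESULTS), 3)
mean_clip = round(mean(r.clip for r in RESULTS), 3)
mean_ai = round(mean(r.ai_likelihood for r in RESULTS), 3)

# Images whose prompt/image similarity is the lowest in the set
lowest_clip = min(r.clip for r in RESULTS)
weakest = [r.title for r in RESULTS if r.clip == lowest_clip]

print(mean_aesthetic, mean_clip, mean_ai)
print(lowest_clip, weakest)
```

On these figures the averages come out to 0.69 (aesthetic), 0.271 (prompt CLIP), and 0.22 (likelihood of AI). The two lowest Prompt Clip Scores, both 0.22, belong to the candle and bird images, whose prompts asked for a "Single Person" but whose shots contain no person. These per-image values are separate from the aggregate scores quoted in the conclusion, and the post does not say how the two relate.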