This blog post examines the images a generative AI model produced when given specific scenes and camera positions. The model shows a good understanding of the scene and camera position, but it falls short of the intended aesthetic, particularly in facial expressions. We analyze its strengths and weaknesses and discuss potential improvements for future iterations.