People often praise Grand Theft Auto V for its graphical prowess even though the game was released back in 2013. Others can’t stop talking about San Andreas’ resemblance to its real-life inspirations – the city of Los Angeles and the wider Southern California area. But, while beautiful, the game looks inauthentic compared to what Intel Labs achieved in terms of realism. In a project named “Enhancing Photorealism Enhancement,” researchers Stephan R. Richter, Hassan Abu AlHaija, and Vladlen Koltun trained a neural network to make the game’s rendered frames look photorealistic. Here’s a preview of how Intel made GTA V look incredibly realistic using machine learning:
Need more details? Check their research paper in PDF. We’ll summarize it for those who don’t want to dig through it. The fundamentals are similar to the neural networks behind AI (Artificial Intelligence) upscaling. They take a rendered image, pass it through an Image Enhancement Network, and get an enhanced image. Only this time, instead of adding missing pixels to blow up the game resolution, the neural network replaces the existing ones. The replacement look comes from real-world dashcam footage in the Cityscapes dataset, captured by automotive-grade cameras and mostly recorded on streets in German cities.
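The key difference from upscaling can be shown in a few lines. Below is a minimal, hypothetical sketch: the “network” is just a stand-in function (not Intel’s actual model), but it illustrates the point that the frame goes in and comes out at the same resolution, with pixel values modified rather than new pixels added.

```python
import numpy as np

def enhancement_network(frame: np.ndarray) -> np.ndarray:
    """Toy stand-in for an image enhancement network.

    Predicts a small residual and adds it to the input frame.
    This is illustrative only, not the architecture from the paper.
    """
    residual = 0.1 * (frame.mean(axis=-1, keepdims=True) - frame)
    return np.clip(frame + residual, 0.0, 1.0)

# A rendered game frame: height x width x RGB, values in [0, 1].
rendered = np.random.rand(256, 256, 3)
enhanced = enhancement_network(rendered)

# Unlike upscaling, the resolution is unchanged; only pixel values change.
assert enhanced.shape == rendered.shape
```

The same idea holds for the real model: it is an image-to-image translation, not a super-resolution pass.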
And, to improve details and remove camera instability, the researchers fed the network geometric information the game engine already produces itself, known as “G-buffers”. These encode things like car glossiness, scene lighting levels, and surface normals. They also capture distances between in-game objects and between those objects and the camera, which helps determine texture detail and quality.
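In practice, G-buffers are just extra per-pixel maps stacked alongside the rendered frame before it enters the network. Here’s a hedged sketch of that idea; the channel names and shapes are illustrative assumptions, not the exact layout Intel used.

```python
import numpy as np

H, W = 128, 128

# The rendered frame plus example G-buffer channels a deferred-rendering
# engine typically produces. Values here are random placeholders.
rendered   = np.random.rand(H, W, 3)   # RGB frame
normals    = np.random.rand(H, W, 3)   # per-pixel surface normals
depth      = np.random.rand(H, W, 1)   # per-pixel distance from the camera
glossiness = np.random.rand(H, W, 1)   # material shininess (e.g. car paint)

# Stack everything into one multi-channel input for the network.
g_buffer_input = np.concatenate(
    [rendered, normals, depth, glossiness], axis=-1
)
assert g_buffer_input.shape == (H, W, 8)
```

Because the engine computes these maps anyway while rendering, the network gets them essentially for free, which is part of why the researchers believe the approach could run inside future game engines.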
Also, did you notice the same thing we did? The video toward the end looks heaps better than what it starts with, right? Well, you have the Mapillary Vistas dataset to thank – without it, the image was stable but very washed-out. This dataset trained the neural network on high-quality images from a variety of cameras, angles, heights, and lighting levels. It’s also why, as the researchers explained on GitHub, other state-of-the-art AI methods, CUT (Contrastive Unpaired image-to-image Translation) and TSIT, wouldn’t work. In their demonstration, both introduced artifacts, temporal instability, and smudgy details.
Now for the main question. What are the chances of installing a GTA mod with photorealistic features? Pretty slim, no doubt. But the potential of this technology is immense. Moreover, the team at Intel believes the same approach that made GTA V look incredibly realistic could be built into future game engines. Shall we add that to the list of GTA 6 rumors, everybody?