The PC version of the Doom 2016 reboot finally has the Vulkan API update we've been waiting for. Everyone's a winner in terms of higher performance but for AMD owners in particular, there are some game-changing improvements. Our initial tests suggest anything from a 30 to 40 per cent increase in gaming performance for Radeon users but these are rough, initial numbers. It could actually be higher.
So what is Vulkan exactly? Well, think of it as the successor to OpenGL and the cross-platform counterpart to DirectX 12, with many of the same advantages - principally, far better utilisation of multi-core CPUs, along with the implementation of GPU asynchronous compute. The latter element in particular sees big improvements for Radeon hardware, and it's used extensively in Doom. id Software's lead rendering programmer Tiago Sousa recently revealed efficiency improvements of 3-5ms per frame on the console versions of the game - a seriously big deal when you have a 16ms per-frame render budget.
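To put that saving in context, here is a rough sketch of the frame-budget arithmetic. It assumes a 60fps target, which is where a budget of roughly 16ms per frame comes from; the function names are ours, purely for illustration.

```python
# Frame-budget arithmetic. The 60fps target is an assumption used to
# arrive at the roughly 16ms budget quoted in the text.

def frame_budget_ms(target_fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / target_fps


def share_of_budget(saving_ms: float, budget_ms: float) -> float:
    """Fraction of the frame budget that a given saving represents."""
    return saving_ms / budget_ms


budget = frame_budget_ms(60)  # about 16.7 ms
for saving in (3.0, 5.0):
    share = share_of_budget(saving, budget)
    print(f"{saving:.0f} ms saved is {share:.0%} of a {budget:.1f} ms budget")
```

On those assumptions, a 3-5ms gain frees up somewhere between a fifth and a third of the time available for each frame.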
In a tech interview with Digital Foundry (due to be published in full this weekend), the id team talk about the advantages of Vulkan and the potential of async compute in particular.
"Yes, async compute will be extensively used on the PC Vulkan version running on AMD hardware," lead programmer Billy Khan tells us. "Vulkan allows us to finally code much more to the 'metal'. The thick driver layer is eliminated with Vulkan, which will give significant performance improvements that were not achievable on OpenGL or DX."
Senior engine programmer Jean Geffroy goes into depth on the profound advantages that async compute brings to the table.
"When looking at GPU performance, something that becomes quite obvious right away is that some rendering passes barely use compute units. Shadow map rendering, as an example, is typically bottlenecked by fixed pipeline processing (eg rasterisation) and memory bandwidth rather than raw compute performance. This means that when rendering your shadow maps, if nothing is running in parallel, you're effectively wasting a lot of GPU processing power.
"Even geometry passes with more intensive shading computations will potentially not be able to consistently max out the compute units for numerous reasons related to the internal graphics pipeline. Whenever this occurs, async compute shaders can leverage those unused compute units for other tasks. This is the approach we took with Doom. Our post-processing and tone-mapping, for instance, run in parallel with a significant part of the graphics work. This is a good example of a situation where just scheduling your work differently across the graphics and compute queues can result in multi-ms gains.
"This is just one example, but generally speaking, async compute is a great tool to get the most out of the GPU. Whenever it is possible to overlap some memory-intensive work with some compute-intensive tasks, there's opportunity for performance gains. We use async compute just the same way on both consoles. There are some hardware differences when it comes to the number of available queues, but with the way we're scheduling our compute tasks, this actually wasn't all that important."
So how does this pan out in terms of the actual Vulkan code that id Software has delivered to PC users? Well, we use FCAT for performance testing - a system that marks up every frame output by the GPU with a coloured border. It's the best way of tracking what you actually see, as opposed to relying on internal metrics.
There's just one problem here - there is no support for FCAT right now in Doom itself or via Vulkan in general, while the game's OSD cumulative GPU render time average didn't seem to work for us on AMD hardware. To get some numbers together, we used a very simple approach - to visit three very different scenes and to measure the performance differential across a range of GPUs.
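For clarity, the calculation behind this kind of comparison looks something like the sketch below: average the frame-rate across the scenes for each API, then express the difference as a percentage. The scene readings are placeholders chosen for illustration, not our measurements, and the function names are hypothetical.

```python
from statistics import mean


def average_fps(scene_fps):
    """Mean frame-rate across the tested scenes."""
    return mean(scene_fps)


def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100.0


# Placeholder readings for three scenes; not figures from our tests.
opengl_scenes = [60.0, 80.0, 100.0]
vulkan_scenes = [84.0, 104.0, 124.0]

opengl_avg = average_fps(opengl_scenes)
vulkan_avg = average_fps(vulkan_scenes)
print(f"OpenGL: {opengl_avg:.1f} fps, Vulkan: {vulkan_avg:.1f} fps, "
      f"change: {percent_change(opengl_avg, vulkan_avg):+.1f}%")
```

Averaging over just three scenes hides a lot of variation, which is why the figures can only be taken as a rough guide.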
It can only be considered as a very basic way to judge the potential differential, but the results as they stand are stark. We'll begin with a 1440p/ultra/8x TSSAA comparison between four highly capable GPUs - GTX 1080, GTX 1070, GTX 980 Ti and R9 Fury X. We've averaged the scores across the three scenes here, and the results are clear: the Radeon hardware drastically underperforms under OpenGL against its nearest competitors - GTX 1070 and GTX 980 Ti - but actually moves ahead of both of them when Vulkan is engaged.
| Average FPS | GTX 1080 | GTX 1070 | GTX 980 Ti | R9 Fury X |
We also wanted to see how AMD's new Polaris technology checks out with Vulkan, so we repeated exactly the same test with the RX 480 - the same PC, the same settings, the same performance points. Now, in an ideal world, we would have compared it directly with the upcoming GTX 1060, but as that remains under embargo, we've done the next best thing and factored in GTX 970 and GTX 980, the two cards that Nvidia's next offering directly replaces.
The results once again highlight AMD's clear disadvantage in the quality of its OpenGL driver. GTX 970 is seven per cent faster than RX 480, while GTX 980 streaks ahead with a 24 per cent advantage. However, once again, the situation changes remarkably with Vulkan. The RX 480 leapfrogs the GTX 970 and moves to within the margin of error of the GTX 980.
And we should stress again that we've only tested here on a small selection of relatively light scenes. What's clear is that AMD's CPU utilisation has dropped significantly, so there may be even bigger gains in more action-packed scenes. Benchmarking Doom is very challenging - even if the GPU average frame-time metric on the OSD worked properly for us with AMD, the fact is that the highly dynamic nature of the game makes the repeatable gameplay necessary for accurate benching almost impossible to pull off.
| Average FPS | RX 480 | GTX 970 | GTX 980 |
Hopefully we will see a Vulkan FCAT injector soon, or else a command line mode added by the developer itself - and bearing in mind this game's roots, it would be great to get old-school timedemo support integrated too. However, in the here and now, the results are clear. Everyone is a winner with Vulkan - regardless of hardware. And it is worth pointing out that our tests were carried out with an overclocked Core i7 6700K running at 4.6GHz. Whether you're running with Nvidia or AMD GPUs, the CPU optimisations should produce big improvements for those with less capable processors.
However, in terms of raw GPU performance improvement, our numbers show that Vulkan is a big deal for AMD. The turnaround with the R9 Fury X in particular is remarkable - while GTX 1080's sheer brute force in terms of GPU power keeps it comfortably at the top of the pile, the Fury X pulling ahead of both GTX 1070 and 980 Ti is a seriously impressive result for a software-only upgrade.
id Software itself is pretty clear about the advantages of Vulkan and async compute. We asked the team whether they see a time when async compute will be a major factor in all engines across platforms.
"The time is now, really. Doom is already a clear example where async compute, when used properly, can make drastic enhancements to the performance and look of a game," reckons Billy Khan. "Going forward, compute and async compute will be even more extensively used for idTech6. It is almost certain that more developers will take advantage of compute and async compute as they discover how to effectively use it in their games."