Following up from last Saturday's in-depth technical discussion with WipEout HD developer Studio Liverpool, I decided to contact a number of other console coders with experience working on 1080p video games. The objective was straightforward: to discuss the more in-depth technical challenges associated with coding for the so-called 'Full HD' resolution.
Sacred 2 developer Tobias Berghoff worked directly on the 1080p renderer in the PlayStation 3 version of the game, and had a whole range of intriguing insights into the process. The project he worked on is interesting in that, unlike WipEout HD and the majority of 1080p-enabled PS3 titles, it is a multiplatform game: Sacred 2 is available on PC, PlayStation 3 and Xbox 360, with both console versions supporting a maximum resolution of 1920x1080. Previously, we had put together this analysis of the game's performance in all the supported modes, and it's interesting to note that while the game runs internally with profiles for both 720p and 1080p, the lower HD resolution is only used, scaled down, for standard-def gameplay.
If some of the questions look familiar, it's because a number of them were initially sent out to several developers simultaneously, the original idea being that the release of WipEout HD: Fury would be accompanied by a more general overview of the state of play with regard to 1080p console gaming. However, the wealth of material I got back was simply too voluminous and too interesting to edit down, and follow-up questions provided even more quality material. So, as per the Studio Liverpool interview, what we have here is the complete, unabridged interview: 100 per cent technical discussion – just the way we like it at Digital Foundry.
Digital Foundry: Bearing in mind how many people are still using SDTVs, what was the reasoning behind going for full 1080p? Is there not a sense that the game is somewhat over-engineered?
Tobias Berghoff: It was a very gradual process, to be honest. When the work on the Xbox 360 version began in late '06, the performance goal was 720p with 2xMSAA. The PC version was already quite far along, graphically, and it was very performance-hungry. Furthermore, we had zero experience on the platform, so a little conservatism was a good idea.
Development on the PS3 version started in mid '07 and we considered it more of an experiment. There were all these horror stories about PS3 development floating around the industry at the time, so we were not exactly confident that we could get it to work. We did not announce it for a full year, so that we could still cancel it easily if the platform proved too challenging.
As we had anticipated, performance on both platforms turned out to be quite problematic. We inherited a forward renderer from the PC version, which produced frame times of 100-200ms on the 360 and about 100ms more on the PS3. After some futile attempts at optimising it, the Xbox graphics team came up with a deferred renderer, which was the first major performance breakthrough. We were not quite there, but the 360 version was definitely able to render at 720p.
We probably would have stopped here if it wasn't for the PS3 version. Even with the deferred renderer, our frame times would still go up into the 100ms+ range if enough light sources were on screen. The problem was the pixel shader used in the deferred pass, which does the entire lighting computation in one go, allowing us to do gamma-correct lighting with up to 12 point lights (eight of which can have shadow-maps) plus the sun and its shadow-map. A Sony engineer once remarked that it "produces really pretty pictures but is probably the heaviest pixel shader I have ever seen". The problem lay with the need to dynamically determine which light sources could be skipped for any given pixel. Xenos handled that just fine; RSX is essentially a GeForce 7 and thus not a fan of branching.
As the sole guy responsible for the rendering on the PS3, this gave me quite some headaches. The solution was to use the SPUs to determine which light sources affect which pixels, and then to cut the deferred pass into blocks of 64 pixels, so that all blocks touched by the same lights can be drawn at once (*). Together with pixel shaders optimised for the actual number of light sources, this put the PS3 way ahead of the Xbox; far enough that 1080p became a possibility.
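To give a rough idea of what this kind of per-block classification involves, here is a minimal CPU-side sketch. The names, the 8x8 block shape and the screen-space circle test are all our own assumptions for illustration; the real implementation runs on the SPUs and works from the depth buffer, not from simple 2D bounds.

```c
#include <stdint.h>

#define MAX_LIGHTS 12          /* matches the 12 point lights mentioned above */
#define BLOCK_DIM  8           /* 8x8 = 64-pixel blocks */

/* Hypothetical projected light bounds in screen space. */
typedef struct { float x, y, radius; } ScreenLight;

/* Returns a bitmask of the lights whose screen-space bounds touch the
   given block. Blocks sharing the same mask can then be drawn together
   with a pixel shader specialised for exactly that set of lights. */
uint32_t classify_block(int block_x, int block_y,
                        const ScreenLight *lights, int num_lights)
{
    float min_x = (float)(block_x * BLOCK_DIM);
    float min_y = (float)(block_y * BLOCK_DIM);
    float max_x = min_x + BLOCK_DIM;
    float max_y = min_y + BLOCK_DIM;

    uint32_t mask = 0;
    for (int i = 0; i < num_lights && i < MAX_LIGHTS; i++) {
        /* clamp the light centre to the block, then do a distance test */
        float cx = lights[i].x < min_x ? min_x : (lights[i].x > max_x ? max_x : lights[i].x);
        float cy = lights[i].y < min_y ? min_y : (lights[i].y > max_y ? max_y : lights[i].y);
        float dx = lights[i].x - cx, dy = lights[i].y - cy;
        if (dx * dx + dy * dy <= lights[i].radius * lights[i].radius)
            mask |= 1u << i;
    }
    return mask;
}
```

The payoff of the mask is exactly what Berghoff describes: instead of one monster shader that branches per pixel, each batch of blocks runs a shader compiled for its actual light count.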
I think this was basically the point where we went "Ohhhh! Shiny!" and tried to make it work. We had done some test rendering in 1080p before and it was pretty well established that with all of our alpha-tested grass and foliage, the improvement in image quality was going to be immense. It is really a 'night and day' kind of thing.
For some time it even looked as if the PS3 was going to be 1080p and the Xbox was not, until Stephan Hodes – the primary Xbox graphics programmer – wrote a slightly less insane version of the system described above for the Xbox, bringing us pretty much to parity (**).
So, is it over-engineered? Possibly. We did not meet the final performance goals and on the PS3 – which is the only version I really know well enough to talk about – this is almost completely a CPU issue. So the rendering is certainly faster than the CPU-side code on that platform, which is a bit of a waste. I should point out, however, that at least on the PS3, getting 720p with 2xMSAA to work at 30FPS would not have been a lot simpler than going all the way to 1080p. So moving development time from the renderer to the game-code would not have helped.
Digital Foundry: Can you outline in layman's terms what the core challenges are between rendering at 720p and 1080p?
Tobias Berghoff: 2.25 times the number of pixels. Really, that's all. On the PS3, your render-targets get bigger, which cuts into your VRAM budget, and potentially increases the pressure on the texture streaming. For us, this really complicated the SPU processing I talked about earlier. We need to go through the entire depth buffer, so it has to be copied from VRAM to system RAM. If you render in 720p, you need a 3.5MB buffer for that. In 1080p, that's 8MB, which is a lot of extra memory.
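The arithmetic behind those figures is straightforward; the only assumption in this sketch is a 4-byte-per-pixel depth format, which matches the sizes Berghoff quotes.

```c
#include <stddef.h>

/* Bytes needed for a depth buffer at the given resolution,
   assuming 4 bytes per pixel (e.g. 24-bit depth plus 8-bit stencil). */
size_t depth_buffer_bytes(int width, int height)
{
    return (size_t)width * (size_t)height * 4;
}
```

At 1280x720 this gives 3,686,400 bytes (the ~3.5MB quoted above) and at 1920x1080 it gives 8,294,400 bytes (~8MB); the pixel-count ratio between the two resolutions is 2,073,600 / 921,600 = 2.25.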
So I ended up cutting the buffer in half, having RSX copy the left side into system RAM, processing it with the SPU and then repeating that with the right side. What you really do not want is for your GPU to be idle, so while the SPUs were busy, RSX needed to perform work as well. This required something akin to an interrupt system, which allows the SPUs to tell RSX to copy down the second half of the depth buffer, all without involvement from the PPU-side render code and without knowing what RSX is actually working on at the time. So you may end up doing some pretty interesting things to save a couple of MBs of RAM.
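As a very loose illustration of the memory saving, here is a sketch that touches a full depth buffer through a half-sized staging copy. Everything here is invented for illustration: on the real hardware the copy is an RSX transfer into system RAM that overlaps with SPU processing of the previous half, whereas this runs serially on one CPU.

```c
#include <stdlib.h>
#include <string.h>

/* Finds the minimum depth value in a 'VRAM' buffer while only ever
   holding half of it in 'system RAM' at a time - the same trick that
   halves the 8MB staging cost described above. */
float min_depth_in_halves(const float *vram_depth, size_t total_pixels)
{
    size_t half = total_pixels / 2;
    float *staging = malloc(half * sizeof *staging);  /* half-sized buffer */
    float best = 1.0f;                                /* depth is in [0,1] */

    for (int i = 0; i < 2 && staging != NULL; i++) {
        /* stand-in for the RSX copy from VRAM to system RAM */
        memcpy(staging, vram_depth + (size_t)i * half, half * sizeof *staging);
        /* stand-in for the SPU pass over this half */
        for (size_t p = 0; p < half; p++)
            if (staging[p] < best)
                best = staging[p];
    }
    free(staging);
    return best;
}
```

The interesting part in the real system is not the loop but the synchronisation: the SPUs signal RSX to start the second copy without the PPU render code getting involved, so the GPU never sits idle waiting.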
On the 360, the situation is a bit different, as your render-targets are stored in eDRAM (the 10MB of additional ultra-fast RAM connected to the GPU), so bigger targets mean more tiles and more resolves (copying from eDRAM to system RAM). If you really need the full targets as textures somewhere, you run into the same memory issues, of course.
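The tile maths on 360 can be sketched in a few lines. The per-pixel byte counts are assumptions (a 4-byte colour target plus a 4-byte depth/stencil target, doubled again by 2xMSAA); the 10MB eDRAM figure is from the text.

```c
#define EDRAM_BYTES (10u * 1024u * 1024u)  /* the 360's 10MB of eDRAM */

/* Number of tiles needed to fit the render targets for a frame,
   given the combined bytes per pixel across all bound targets. */
unsigned edram_tiles(unsigned width, unsigned height,
                     unsigned bytes_per_pixel_total)
{
    unsigned long long needed =
        (unsigned long long)width * height * bytes_per_pixel_total;
    return (unsigned)((needed + EDRAM_BYTES - 1) / EDRAM_BYTES);  /* ceil */
}
```

Under these assumptions, 720p with no MSAA fits in a single tile, while either 2xMSAA at 720p or plain 1080p pushes the frame into two tiles, and every extra tile means another resolve pass.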
The major issue is pixel processing, however. The higher the resolution, the more important it is to have fast pixel shaders and the more memory bandwidth is consumed by the ROPs (render output units). But if you compare a 1080p30 game with one running at 720p60, the differences will be in the game code, not in the renderer.
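That last comparison is easy to verify with the raw numbers: the two modes push a similar number of pixels per second, which is why the gap shows up on the CPU side rather than in the renderer.

```c
/* Total pixels shaded per second for a given resolution and frame rate. */
unsigned long long pixels_per_second(unsigned w, unsigned h, unsigned fps)
{
    return (unsigned long long)w * h * fps;
}
```

1080p30 works out to 62,208,000 pixels per second against 55,296,000 for 720p60 - only about 12.5 per cent apart - while the 60FPS game's simulation code has half the time per frame to run.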
* This is inspired by work done by the fine folks at SCEE's PhyreEngine Team.
** It turns out Naughty Dog has comparable "less insane" tech in Uncharted.