During this year’s GDC, Emil Persson – Head of Research at Avalanche Studios – talked about low-level shader optimization for both next-gen consoles and DX11. Most of this will sound like mumbo jumbo to many of you, but Emil’s presentation has been posted online and will benefit some of you (provided you are game developers).
According to Emil, interpolators have a negative impact on performance in DX11. Moreover, very branchy code tends to increase the number of registers a shader requires, resulting in decreased performance.
Emil also gave some numbers about the ROPs of the PS4 and Xbox One. While talking about the low number of ROPs (i.e. how many pixels the GPU can output per clock), Emil revealed that a Radeon HD7970 needs a 128-bit output format before it becomes bandwidth-bound. Next-gen consoles, though, seem limited compared to a GPU of that class: on the PS4, a 64-bit output format is enough to hit bandwidth limits.
“As hardware has gotten increasingly more powerful over the years, some parts of it has lagged behind. The number of ROPs (i.e. how many pixels we can output per clock) remains very low. While this reflects typical use cases where the shader is reasonably long, it may limit the performance of short shaders. Unless the output format is wide, we are not even theoretically capable of using the full bandwidth available. For the HD7970 we need a 128bit format to become bandwidth bound. For the PS4 64bit would suffice.”
On Xbox One, rendering to ESRAM with a 64-bit output format sits just about at the crossover point between being ROP-bound and bandwidth-bound. Developers will also be able to render via the slower DDR3 memory, but even with a typical 32-bit render target they will still be ROP-bound rather than bandwidth-bound.
“On the XB1, if we are rendering to ESRAM, 64bit just about hits the crossover point between ROP and bandwidth-bound. But even if we render to the relatively slow DDR3 memory, we will still be ROP-bound if the render-target is a typical 32bit texture.”
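As a back-of-the-envelope check on these crossover claims, the sketch below estimates how many bytes each output pixel must carry before memory bandwidth, rather than ROP throughput, becomes the bottleneck. The ROP counts, clocks, and bandwidth figures are approximate publicly quoted specs (not numbers from Emil’s slides), and the XB1 ESRAM bandwidth in particular is an assumption.

```python
# Rough sketch: a GPU is ROP-bound until bytes-per-pixel * fill rate
# exceeds memory bandwidth. Specs below are approximate public figures,
# not taken from the presentation.

def crossover_bytes_per_pixel(rops, clock_ghz, bandwidth_gbs):
    """Bytes per output pixel at which bandwidth, not ROP
    throughput, becomes the limit."""
    pixels_per_sec = rops * clock_ghz        # Gpixels/s
    return bandwidth_gbs / pixels_per_sec    # bytes per pixel

# (ROPs, core clock in GHz, bandwidth in GB/s) -- approximate specs
gpus = {
    "HD7970 (GDDR5)": (32, 0.925, 264.0),
    "PS4 (GDDR5)":    (32, 0.800, 176.0),
    "XB1 (ESRAM)":    (16, 0.853, 109.0),  # ESRAM figure is an assumption
    "XB1 (DDR3)":     (16, 0.853, 68.0),
}

for name, (rops, clk, bw) in gpus.items():
    bpp = crossover_bytes_per_pixel(rops, clk, bw)
    print(f"{name}: bandwidth-bound above ~{bpp:.1f} bytes/pixel")
```

With these numbers the HD7970 lands near 9 bytes/pixel, so a 64-bit (8-byte) target is still ROP-bound and only a 128-bit format saturates bandwidth; the PS4 lands below 8 bytes, so 64-bit suffices; XB1 ESRAM comes out right around 8 bytes (the 64-bit crossover the quote describes), and XB1 DDR3 near 5 bytes, so a 32-bit (4-byte) target remains ROP-bound.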
Those interested can read Emil’s GDC presentation here.