Starfield overperforms on AMD’s GPUs due to an architectural edge over NVIDIA’s GPUs


When Starfield became available, we were among the first to report AMD’s superiority over NVIDIA’s hardware. As we specifically said in our PC Performance Analysis, Starfield simply favored AMD’s architecture, which is why the AMD Radeon RX 7900XTX was faster than the NVIDIA GeForce RTX 4090. And… well… we were right.

ChipsAndCheese has taken a deep dive into Starfield’s performance to examine why the game performs the way it does. In order to get some solid results, ChipsAndCheese analyzed a scene using NVIDIA’s Nsight Graphics and AMD’s Radeon GPU Profiler.

Now, while I won’t go into a lot of detail (you should read their article though), ChipsAndCheese has basically confirmed everything that we’ve reported. At lower resolutions, the AMD Radeon RX 7900XTX is faster than the NVIDIA GeForce RTX 4090. However, at 4K, NVIDIA’s high-end GPU manages to close the gap.

This is mainly due to an architectural edge that AMD has – in this particular game – over NVIDIA’s GPUs. It has nothing to do with drivers, the power that the NVIDIA GPUs draw, or anything like that. NVIDIA’s GPUs perform exactly the way they should, and there is nothing wrong with them.

To make things simpler, as Reddit’s mikereysalo explained, AMD chose fewer cores with bigger register files for its GPUs, which means the GPU can track more threads per core. In Starfield, this translates into higher occupancy, meaning more work is kept in flight on each core.

On the other hand, NVIDIA chose more cores with smaller register files to take advantage of parallelism. However, this results in lower occupancy, which in this particular case can mean lower performance.
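For those who want to see why register file size matters, here is a rough sketch of the occupancy math. Keep in mind that the register file sizes, the wave limit and the per-thread register counts below are illustrative assumptions and not figures taken from the ChipsAndCheese article. The function name is also made up for this example, and real GPUs allocate registers in coarser chunks than this simple model does.

```python
def waves_in_flight(regfile_bytes, regs_per_thread,
                    threads_per_wave=32, max_waves=16, bytes_per_reg=4):
    """Estimate how many waves one SIMD can keep resident.

    Simplified model: every thread of a wave needs `regs_per_thread`
    32-bit registers, and the hardware scheduler can track at most
    `max_waves` waves regardless of how much register space is left.
    All default values are assumptions chosen for illustration.
    """
    bytes_per_wave = regs_per_thread * threads_per_wave * bytes_per_reg
    return min(max_waves, regfile_bytes // bytes_per_wave)


# Hypothetical register file sizes per SIMD (assumed, not measured).
BIG_REGFILE = 192 * 1024    # "fewer cores, bigger register files"
SMALL_REGFILE = 64 * 1024   # "more cores, smaller register files"

# A register-hungry shader (assumed 128 registers per thread).
heavy_big = waves_in_flight(BIG_REGFILE, 128)
heavy_small = waves_in_flight(SMALL_REGFILE, 128)

# A lightweight shader (assumed 32 registers per thread).
light_big = waves_in_flight(BIG_REGFILE, 32)
light_small = waves_in_flight(SMALL_REGFILE, 32)

print("heavy shader:", heavy_big, "vs", heavy_small, "waves in flight")
print("light shader:", light_big, "vs", light_small, "waves in flight")
```

In this simplified model, the lightweight shader hits the wave limit on both designs, so the register file size makes no difference. With the register-hungry shader, though, the bigger register file keeps three times as many waves in flight, and that gives the GPU more work to switch to while it waits on memory.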

AMD also benefits from higher L2 bandwidth, which comes with its smaller L2 cache. NVIDIA, on the other hand, has a bigger L2 cache but with lower bandwidth.

As ChipsAndCheese concluded:

“Starfield is a complex workload that sees different demands throughout a frame. In the two longest duration shaders we looked at, AMD was able to leverage its larger vector register file to keep more work in flight per SIMD. That in turn gave it a better chance of hiding cache and execution latency.

However, quantity has a quality all of its own, and it’s hard to argue with 128 SMs sitting on a gigantic 608 mm² die. AMD may be better at feeding its execution units, but Nvidia doesn’t do a bad job. 128 moderately well fed SMs still end up ahead of 48 very well fed WGPs, letting Nvidia keep the 4K performance crown. AMD’s 7900 XTX uses just 522 mm² of die area across all its chiplets. To no one’s surprise, it can’t match the throughput of Nvidia’s monster even if we consider wave64 or wave32 dual issue.”

In short, there is very little NVIDIA can do in order to improve performance in this game. The only thing that the green team could do was to enable Resizable BAR (reBAR), which brought a 5% performance boost. Beyond that, you should not expect any additional major performance improvements.

And now you know why Starfield performs better on AMD’s hardware!