Asynchronous Compute has been in the spotlight for a while. As we’ve seen, more and more developers are taking advantage of it in the DX12 implementations of their game engines. And from the looks of it, this feature will stay relevant for a very long time.
A few days ago, Futuremark released its new DX12 benchmark, which takes advantage of Async Compute. Futuremark’s 3DMark Time Spy benchmark has an option to enable or disable Async Compute and, as you may have guessed, AMD’s GPUs see a significant boost when this option is enabled.
In a lengthy post, Futuremark detailed its Async Compute implementation. Without going into a lot of technical detail, the team claimed that its implementation is the same on all hardware.
“The implementation is the same regardless of the underlying hardware. In the benchmark at large, there are no vendor specific optimizations in order to ensure that all hardware performs the same amount of work. This makes benchmark results from all vendors comparable across multiple generations of hardware.” claimed Futuremark and continued:
“Whether work placed in the COMPUTE queue is executed in parallel or in serial is ultimately the decision of the underlying driver. In DirectX 12, by placing items into a different queue the application is simply stating that it allows execution to take place in parallel – it is not a requirement, nor is there a method for making such a demand. This is similar to traditional multi-threaded programming for the CPU – by creating threads we allow and are prepared for execution to happen simultaneously. It is up to the OS to decide how it distributes the work.”
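The CPU threading analogy in that quote can be sketched in a few lines of Python. This is only an illustration of the idea, not Futuremark's code: the function names `graphics_work`, `compute_work` and `submit` are hypothetical, and the `workers` parameter stands in for the decision that the driver (or, on the CPU, the OS) makes. The application submits two independent tasks; whether they overlap is not up to the application, and the results are the same either way.

```python
from concurrent.futures import ThreadPoolExecutor


def graphics_work(n):
    """Hypothetical stand-in for work placed on the graphics queue."""
    return sum(i * i for i in range(n))


def compute_work(n):
    """Hypothetical stand-in for work placed on the compute queue."""
    return sum(i for i in range(n))


def submit(workers):
    # `workers` plays the role of the driver/OS decision (an assumption of
    # this sketch): 1 forces the two tasks to run one after the other,
    # 2 merely allows them to overlap. The caller cannot demand overlap.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        g = pool.submit(graphics_work, 10_000)
        c = pool.submit(compute_work, 10_000)
        return g.result(), c.result()


serial = submit(1)    # "driver" chose serial execution
parallel = submit(2)  # "driver" allowed parallel execution
print(serial == parallel)  # prints True: same work, same results
```

The submitting code is identical in both cases, which is the point Futuremark makes about using a single implementation across vendors: only the execution schedule, and therefore the timing, can differ.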
Futuremark concluded that its benchmarks can be considered accurate, relevant, and impartial as they are ‘built using a process that’s been government vetted for fairness and neutrality.’
“For benchmarks to be relevant and useful tools, they must be fair, impartial, and unbiased. This is why 3DMark Time Spy, and all other Futuremark benchmarks, are developed with industry-leading hardware and software partners through our Benchmark Development Program using a process that’s been government vetted for fairness and neutrality. This process ensures that our benchmarks are accurate, relevant, and impartial.”

John is the founder and Editor in Chief at DSOGaming. He is a PC gaming fan and highly supports the modding and indie communities. Before creating DSOGaming, John worked on numerous gaming websites. While he is a die-hard PC gamer, his gaming roots can be found on consoles. John loved – and still does – the 16-bit consoles, and considers SNES to be one of the best consoles. Still, the PC platform won him over consoles. That was mainly due to 3DFX and its iconic dedicated 3D accelerator graphics card, Voodoo 2. John has also written a higher degree thesis on “The Evolution of PC graphics cards.”
In a huge thread at overclock.net, AMD’s junta is crying about why AMD is not winning by a margin.
I’d be curious to see some furyx/1080 benches.
I can’t believe people like you exist.
Look in the mirror shill!
1060 is beating 480 with Async Compute ON – and that’s without any overclocking. RIP AMD
AMD get a bigger performance boost from Async Compute even in Time Spy; NVIDIA have higher raw clock speeds. Remember this isn’t a specially optimised benchmark, unlike DX12 games where AMD are more optimised.
RX480 gets a nearly 10.8% gain with Async, 1060 gets about 5.4%. So you see, people who claim it’s biased simply have no clue. Hitman gains are similar. People can’t get their heads around the fact that Pascal gets a performance gain with Async Compute after saying it doesn’t have Async Compute. That’s all it is: hardware-illiterate people again trying to find conspiracies.
We do get that NV gets a little boost from Async. What I keep telling you is that it’s not as effective as AMD’s. Hence, they need to either rework it OR create something (driver or hardware side) that will close the gap in terms of Async efficiency. AMD has a clear advantage.
You don’t need to tell me something I already know and have said many times.
You keep repeating the same thing over and over again. Expecting someone who hasn’t understood your point yet to get it now isn’t worth your time. All of this has been understood by 99% of the dsog community imo.
Pascal’s implementation of async compute is via switching: very fast switching, but still switching. If we consider async compute to be the simultaneous running of compute code, then Pascal is not capable of that. That is probably the reason why people claim Pascal does not have proper AS support: its implementation is archaic.
Hitman makes “proper use” of Async Compute yet gets about the same boost in performance as Time Spy.
Hitman uses not only AS but other tech as well, yet AMD still has a huge advantage there, and why that is remains unclear to me. The fact is that all AMD tech released via GPUOpen has an MIT licence, so Nvidia can review it and even ask for changes if they wish. Still weird that Nvidia performs so badly there.
They perform badly with async compute because they don’t have it at the hardware level. Pre-emption is close, but it’s still not an asynchronous compute module.
Hitman has ALWAYS performed better on AMD hardware, all the way back past Blood Money.
Be thankful FM aren’t in charge of formula 1 else we would see drivers pushing their cars around the track.
Hahahahahahahaha
This 1000x
Ashes of the Singularity uses a single codepath as well, and I saw no one complaining about that or invalidating its use as a DX12 benchmark.
You know what AMD fa g g ots are like…always winding the way of their pathetic company.
Go back to WCCFtech
I’ve actually upvoted that, lol. So right…
Except Pascal CAN do Async Compute on the hardware using their own solution called Dynamic Load Balancing. It is well explained in an nvidia paper and anandtech’s in-depth GTX1080/1070 review.
But I can agree with you if we are speaking about maxwell and other previous nvidia architectures.
I do not see why AMD would have any concerns if the benchmark was developed to be neutral. Even their cards gain more than Pascal with Async Compute enabled, which shows they arguably have the better hardware implementation.
I’ve checked the article. Pre-emption != Asynchronous Compute. Quote “while not exactly async compute, can be used to accomplish similar goals.”
As I understand it:
True Asynchronous Compute – developer has full control over the GPU code and its tasks.
Nvidia Dynamic Load Balancing – developer does not have full control over the GPU, it decides what to do by itself.
If you ask me, the latter somewhat contradicts the nature of DX12 and Vulkan. And again, the developer would need to create a whole new code path for each vendor, rather than doing one path with Async Compute with variations in parameters.
Wait, where did you get that quote from?
We are talking about page 9 of the anandtech article posted on July 20 right? I don’t see that quote anywhere.
And in no point did I mention pre-emption. Dynamic load balancing is not pre-emption.
As far as I understand how Async is used under DX12: in both AMD’s and nvidia’s case, the developers send command lists in different queues to the GPU, and it’s up to the drivers in each case to decide whether to run the queues in parallel (AKA Async Compute) or in serial mode.
In the case of the Time Spy benchmark, AMD’s cards and nvidia’s Pascal cards run the queues in parallel while nvidia’s older cards run them in serial mode because nvidia disabled Async at the driver level for these particular cards.
Futuremark explains this fully in their statement and includes GPUView trace images to back it up.
Thing is, a lot of false claims are being made on how Pascal does not support Async Compute at the hardware level. But it does, as explained in-depth by the Anandtech article. It is a different implementation than AMD but that doesn’t make it any less “true”.
And yes AMD’s cards do gain more from Async enabled, which shows they arguably have the better hardware implementation, but again that doesn’t make nvidia’s approach a “false async” as some keep calling it.
http://wccftech.com/nvidia-geforce-gtx-1080-dx12-benchmarks/ (quote from here)
http://www.overclock-and-game.com/news/pc-gaming/50-analyzing-futuremark-time-spy-fiasco
“Although principally this is not exactly the same as Asynchronous Shading or Computing. Because Pascal still can’t execute async code concurrently without pre-emption.” Page 10 of anandtech article explains the pre-emption.
Page 9 of the anandtech article shows results from the Time Spy Benchmark as an example of the Async Compute they are “expecting to see in future games”. That’s not how games work. Real game developers will surely try to optimize their games for both vendors individually, unlike what Futuremark did here.
Took me a while to read but quite interesting. You, sir, have taught me something tonight.
Gracias 😉