AMD and Ubisoft revealed that both Wolfenstein II: The New Colossus and Far Cry 5 will support Rapid Packed Math, a new feature that is – at least for now – exclusive to AMD's latest RX Vega graphics cards. According to the developers, Rapid Packed Math allows a graphics card to execute two math instructions for the price of one.
This basically means that both of these games should, theoretically, run way faster with FP16 compute than with the standard FP32 compute that pretty much all games currently use.
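To make the "two instructions for the price of one" idea a bit more concrete, here is a minimal sketch using CUDA's half2 intrinsics, which expose the same packing concept in software. This is purely an illustration of packed FP16 math on hardware that supports it, not AMD's actual implementation.

// Minimal packed-FP16 sketch: one register holds two 16-bit values and a
// single instruction operates on both lanes at once.
// Illustrative only; compile with e.g. nvcc -arch=sm_60 packed_demo.cu
#include <cuda_fp16.h>
#include <cstdio>

__global__ void packed_fma_demo()
{
    __half2 a = __floats2half2_rn(1.5f, 2.5f);  // pack two values into one half2
    __half2 b = __floats2half2_rn(2.0f, 2.0f);
    __half2 c = __floats2half2_rn(0.5f, 0.5f);

    // One fused multiply-add, applied to BOTH 16-bit lanes: out = a * b + c
    __half2 out = __hfma2(a, b, c);

    printf("lane0 = %f, lane1 = %f\n",
           __low2float(out), __high2float(out)); // expect 3.5 and 5.5
}

int main()
{
    packed_fma_demo<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}

The doubling is purely a throughput trick: the same instruction slot now carries two smaller results, which is where the theoretical speed-up comes from.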
As noted, this feature is supported exclusively by the AMD RX Vega graphics cards that will be released later this month. NVIDIA has not announced any plans to support it. Do note that this is a hardware feature, meaning that owners of current NVIDIA GPUs won't be able to benefit from it.
It will be interesting to see whether AMD's RX Vega GPUs will be able to close the gap with the GTX 1080 and GTX 1080 Ti via Rapid Packed Math. It will also be interesting to see whether other future games will support it, as right now only these two upcoming games – as well as a benchmark from Futuremark – will support Rapid Packed Math.
Not only that, but AMD's RX Vega GPUs offer the most complete support for Microsoft's DX12 API. Again, we don't know whether developers will take advantage of all these DX12 features, or whether these features are enough to boost the performance of AMD's GPUs so they can compete with NVIDIA's.

John is the founder and Editor in Chief at DSOGaming. He is a PC gaming fan and highly supports the modding and indie communities. Before creating DSOGaming, John worked on numerous gaming websites. While he is a die-hard PC gamer, his gaming roots can be found on consoles. John loved – and still does – the 16-bit consoles, and considers the SNES to be one of the best consoles. Still, the PC platform eventually won him over. That was mainly due to 3DFX and its iconic dedicated 3D accelerator graphics card, Voodoo 2. John has also written a higher degree thesis on "The Evolution of PC graphics cards."



I don’t know what any of this means.
It means AMD have a weak card and want you to buy theirs on future promises that may or may not ever happen. Same ole story.
Hahaha best comment mate. This is true.
“Ole story” is this a story about a Spanish matador?
Matador pokes bull with stick, bull proceeds to maul and trample matador. The end. Ole!
That made me laugh.
yeah an old max payne clone called El matador.
Describing Vega 10, a GPU with almost 13 TFLOPs, as weak is just plain stupidity. It can be described by many adjectives, but weak is not one of them. It is the most advanced, most versatile graphics architecture ever built; that does not mean it will be the best for gaming, but weak? What have you been smoking?
Some parts of the game will be rendered in lower quality but two times faster (on Vega).
When MS created the first version of the HLSL shader language in 2001, they forced game developers to use only 32-bit operations, so for the last 16 years all graphics cards have always used 32-bit compute. But game developers said they wanted faster operations at lower quality for less important things such as hair, distant shadows, etc. So MS created a new HLSL with direct support for 16-bit compute (Shader Model 6). This was introduced last March in Windows 10 1703 'Creators Update'. You must use WDDM 2.2 drivers for that feature.
geforce fx had utter garbage perf even in games with PS 1.1/1.3 support.
Yeah, I remember the GeForce FX fiasco. NVIDIA tweaked its drivers so some Half-Life 2 scenes were rendered at lower quality, right? They did so in order to close the gap with ATI cards, because the red team was beating them badly at the time.
Of course, GeForce users didn't know that until Gabe Newell told the media about it. Something about how NVIDIA was changing shaders at the driver level, leading to image quality degradation. Then, there was outrage of course.
NVIDIA still changes shaders. They are probably doing some level of this, but with reduced quality loss to the point it's not noticeable.
ATI R300 and its iterations terrorized three generations of NVIDIA GPUs (NV1x/2x/3x/4x). It was one of the best GPU architectures ATI ever built. But yeah, fanboys ruin everything. NVIDIA still uses those tricks; in fact, every driver package contains collections of shaders from supported games, recompiled for their current GPUs. The main power of NVIDIA is money and corrupted sellouts. With their money they can manipulate everything, for example the Futuremark Time Spy async implementation: instead of two separate hardware codepaths for each vendor they implemented just one, which won't show NVIDIA hardware as inferior to AMD.
Nothing to do with simply having more efficient architecture or anything then?
The very fact amd needs so many teraflops, such high clocks and power draw speaks volumes
Wow, it’s a shame virtually no-one could (and can) utilize this feature for such a long time.
Google "GPU 16-bit precision vs 32-bit precision". Checking the images will give you a quick look at the quality differences, and links to articles will also be available.
Somehow I couldn’t find any images concerning the difference of image quality.
I just tried GPU 16 bit vs 32 bit precision. Worked for me m8.
From my understanding (admittedly based on NeoGAF), FP16 is a lower-precision calculation used by the GPU, which in theory would lead to a lower-quality image being produced, but if it is used in areas of the game where higher-quality image construction isn't required (i.e. is less obvious to the eye), they can theoretically gain performance with minimal visual quality loss.
How much it will be used and how much quality loss will be evident remains to be seen. It does sound like a lot of extra optimisation work though, so I'm sceptical about its uptake, but let's see. If it can have minimal visual quality loss and perform better, then I'm all for it.
Sure, techniques like this and checkerboarding aren't as good as actually running at full quality, but I'd take it over having to drop resolution or lose performance in games.
PC gaming is all about options and if this is one, then great
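For anyone wondering what "lower precision" actually costs, here is a tiny sketch (CUDA again, purely illustrative) that pushes a few 32-bit values through 16-bit storage and prints what survives the round trip:

// FP16 keeps roughly three significant decimal digits, which is why it is
// reserved for things where small errors aren't visible (hair, distant shadows).
#include <cuda_fp16.h>
#include <cstdio>

__global__ void precision_demo()
{
    float values[3] = { 0.1234567f, 100.501f, 4096.3f };
    for (int i = 0; i < 3; ++i) {
        float roundtrip = __half2float(__float2half(values[i]));  // FP32 -> FP16 -> FP32
        printf("FP32 %11.6f -> FP16 %11.6f (error %.6f)\n",
               values[i], roundtrip, values[i] - roundtrip);
    }
}

int main()
{
    precision_demo<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}

Large values lose whole fractions (4096.3 becomes 4096), small ones only lose tiny amounts, which is exactly why developers want to pick and choose where FP16 gets applied.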
that’s the problem. more attention are needed on the optimization side or else there might be no saving. also the only way for this to get massive adoption is by having all hardware vendor to support the said feature.
I agree, even DX12 and Vulkan are struggling to really gather any momentum.
DX12 has a faster adoption rate than any previous version of DX, and the same can be said of Vulkan vs. OpenGL. It is very hard to compare them, as there are almost no OpenGL games atm and mostly we would have to compare mobile games. With that being said, the low-level APIs have a very fast adoption rate; that is a fact.
Performance-wise, though, we do not see any dramatic improvements, and there are reasons for that, but it doesn't change the fact that we need those APIs for new technologies like VR, 4K, etc. We simply need to squeeze out as much as we can at those resolutions, and low-level APIs are the only way the industry has atm.
Name the number of DX12 games and the number of Vulkan games and you would be lucky to need two hands. Even fewer were built from the ground up on the API.
Exactly 20 released games since 2H15, so it is double what you expected with your lucky hands.
A whole 20, eh?? Whooooo. And again, how many were built from the ground up and weren't just tacked on??
Seems it might be similar to the NVIDIA multi-resolution shading in Shadow Warrior 2. I like that PC games are getting options like these, dynamic resolution/graphics settings, and temporal rendering/AA.
Similar in concept, i.e. using lower-quality details in peripheral vision.
I think in execution it will be more subtle and require much more optimisation.
Well, if it’s a feature at hardware level it shouldn’t take that much extra optimization work from the devs, hopefully.
Using FP16 doesn’t have to lead to lower visual quality, it actually can lead to much higher visual quality at same cost as AMD presented with their TressFX demo. Just saying…
Behave, you can’t halve the calculations and have superior quality.
Outside of scientific applications or some large space sims you will never need 32-bit precision. 16-bit operations can lead to higher visual quality since Rapid Packed Math can issue two 16-bit operations simultaneously. You are essentially doubling the number of operations.
That is the whole point of having different precisions: you can make computations faster when higher precision would not bring any improvement to the result. There are a number of applications, and that's why those different precisions exist.
By your silly logic, superior quality could only be achieved with FP64, because it is the highest precision, which no game currently supports anyway.
It means that they are still manipulating the crowd with empty promises. They don't have good cards, so they are making empty promises about future features, FineWine and whatever other nonsense. Better DX12 support than NVIDIA, better Vulkan support, Rapid Packed Math and whatnot, but in reality it matters in maybe 2 or 3 sponsored games per year, and by the time those low-level APIs are properly and widely used, all the current top GPUs will be 5 years old and going for $100 on eBay. All you will actually get is a few years with a card that needs an extremely controlled environment to be competitive, not to mention 2x the power draw, heat and noise.
The price of a GTX 1060 is too damn high compared to previous x60 models. Just putting that out there. You don't get better cards from NVIDIA for no reason.
Right now it is because of the cryptocurrency craziness, but normally it is trading blows with the GTX 980 for half of what the 980 cost, and it has 1.5x the memory. What more can you wish for from just one generation jump?
For the starting model to have more RAM or a lower price.
If you read the text at the end of the video, they basically state that with FP16 (Rapid Packed Math) the Sara demo ran at 50fps. With FP32 (what is normally used) the demo ran at 44fps. So a 6fps difference (or a 13% increase). Obviously this is a singular case. And I'm sure NVIDIA will be able to optimize drivers to cut into that 13% gain, just like they did with async compute. Plus I am betting that the 1080 is on average already 10% faster than Vega… so the gain will likely just level the performance gap, not run away with it. But we will all see soon enough!
They would have to unlock FP16 in their current graphics cards (not sure they even can do that atm), so they will not be able to use it at all and will have to use FP32 instead. There is no way to optimize around that; you can optimize many other things, but certainly not something your HW doesn't support in the first place.
Amd’s fp16 works only with supported windows machines, which is Windows10 creators update or higher whereas Nvidia can do custom driver magic that works on any OS that is running their card. But requires some game specific driver optimisation by NVIDIA’s team. Give them time.
AMD and NVIDIA can optimize a number of things, but they cannot work around a HW feature when AMD can simply compute 2 instructions in one cycle and NVIDIA cannot. It is the very same story as async compute on Kepler.
In other words, you would have to produce magic to overcome that, and if you could do that, you wouldn't have to make new HW at all, as everything could be added later by this magic. That alone is a very interesting idea in IT.
I am not knowledgeable on the technical side. What I understand after googling it and going through AnandTech's GP104 review is this: NVIDIA enabled FP16 in hardware in the Maxwell-based GTX Titan X and didn't design the same into any further gaming-focused GPUs.
To quote AT “As for why NVIDIA would want to make FP16 performance so slow on Pascal GeForce(1080) parts, I strongly suspect that the Maxwell 2 based GTX Titan X sold too well with compute users over the past 12 months, and that this is NVIDIA’s reaction to that event…
…limiting the FP16 instruction rate on GeForce products is an easy way to ensure that these products don’t compete with the higher margin Tesla business…
…However I have to admit that I am surprised that NVIDIA limited it in hardware on GP104 in this fashion, similar to how they limit FP64 performance, rather than using FP16x2 cores throughout the GPU and using software cap. The difference is that had NVIDIA implemented a complete fast FP16 path in GP104 and merely turned it off for GeForce, then they could have used GP104 for high performance (and high margin) FP16 Tesla cards. However by building GP104 from the get-go with a single FP16x2 unit per SM, they have closed the door on that option”
What does all this mean? The 1080 & 1070 do not have native FP16 support & there is nothing NV can do to turn it on in those cards. But what about the 1060, 1080 Ti, etc.?
It’s basically like doing two little operations instead of one large one within the same amount of time. It’s faster, although lower quality. Considering most objects and effects are generally farther away and less noticeable, it’s a great way to speed up rendering.
AMD should call it what you just said instead of Rancid Pecan Myths or whatever it’s called. Your explanation is better.
That’s a suprise – those developers will use features available to very few gamers. Rapid Fast Math (FP16) is part of Shader Model 6 introduced few months ago in Windows 10 1703 (Creators Update). This is not available in Windows 7, 8.1 or older editions of Windows 10 from 2015/2016. Strange. AMD must pay a lot money to developers
Next suprise… Wolfenstein 2 will use both standards: DX12 and Vulkan. It will be interesting benchmark between those API. First AAA game using both DX12 and Vulkan.
https://uploads.disquscdn.com/images/141293ebb37f1138f6cab0cf3c77e38635a79e39493ed0f785e34ef9f4c99264.jpg
Is that actually confirmed? Cause to me it just looked like they were advertising all the features of the card, as well as the game.
No… only this picture. We need to wait and see.
Lol what, the game will use id Tech 6, which is built on Vulkan, and the previous versions were all OpenGL.
Why the hell would they rewrite the entire engine for DX12 when Vulkan already accomplishes the same thing?
They don’t need to change the entire engine, just the renderer. For example, The Evil Within is based on id Tech 5, but uses DX11 instead of OpenGL.
IdSoftware hates DX and will never use it!
All id Tech engines were built on OpenGL until id Tech 6 (DOOM), when they moved to Vulkan, which is again the new OpenGL! So, in short, no, Wolfenstein will not use DX12… only Vulkan!
FP16 was not supported in Shader Model 5. It was added less than 6 months ago in the latest version of Windows 10, 1703 (SM6). Why add hardware to a GPU if there wasn't any "gaming API" that allowed you to use it?
BTW, the last NVIDIA consumer GPU with support for both FP16 and FP32 was the GeForce FX. But at that time game developers didn't like this solution.
Not always. Even mobile games are getting more complex. Even in Tegra, that FP16 was built for deep learning stuff and not so much for games.
As always, he’s ultimately here to shill for his beloved Microsoft. Again.
Lol, now AMD brags about having the most complete DX12 support, but back when they were asked why Fiji did not support FL 12_1, they insisted that the feature was not important.
actually it has full dx 12 implementation.. i think
Never used amd never will
Strange. When NVIDIA had FP16 in the GeForce FX, they were criticized for it due to the lower quality produced by FP16. Now AMD has figured out how to implement it and it suddenly became a "headline feature" for AMD. Next time we will revert back to 16-bit color dithering and it will be touted by AMD as fast rendering. Go figure.
We are talking about computation, not rendering; in the GeForce FX's case there wasn't any standardized GPU compute. You are comparing HW from different decades, different APIs, different OSes and different applications.
NVidia makes powerful GPUs, AMD makes powerful marketing gimmicks
Math is singular
Video entitled "What is Rapid Packed Math?" – after you watch it you have learned nothing and instead have just been told AMD is awesome a bunch of times. If they wanted to make it sound like another fluff feature that no game ends up using, like tessellation, they accomplished that.
Also the benchmark (13% improvement) is meaningless if we don’t get to see a comparison of how the visual quality holds up. You might as well run the game at half res and boast about your 100% improvement due to new “rapid upscale” technology. It doesn’t mean anything.
Guys, TO BE CLEAR: Rapid Packed Math is the same as NVIDIA’s Variable Precision Feature on Pascal.
This is simply the ability to run mathematical calculations at half the precision (FP16) of standard precision (FP32 or Floating Point 32 bit). When you cut the precision of a calculation in half, you get two times the Floating Point performance (FLOPS).
Remember when the Nvidia TITAN came out and it advertised Double Precision as a feature? Well that simply means the GPU performs calculations at 2x the standard precision (FP64 or Floating Point 64bit) which cuts performance in half but gives a huge boost to calculation precision in workloads where such precise calculations are necessary.
This is a really good feature for games since you don’t need super precise calculations BUT you do need performance.
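As a rough illustration of that precision ladder, here is a short CUDA sketch (illustrative only) that stores the same constant at FP64, FP32 and FP16 and prints how many digits survive at each level:

// More bits per value = more correct digits; fewer bits = more values the
// hardware can push through per cycle. That trade-off is the whole pitch.
#include <cuda_fp16.h>
#include <cstdio>

__global__ void precision_ladder()
{
    double as_fp64 = 3.14159265358979323846;               // ~15-16 digits survive
    float  as_fp32 = (float)as_fp64;                       // ~7 digits survive
    float  as_fp16 = __half2float(__float2half(as_fp32));  // ~3 digits survive

    printf("FP64: %.15f\n", as_fp64);
    printf("FP32: %.15f\n", (double)as_fp32);
    printf("FP16: %.15f\n", (double)as_fp16);
}

int main()
{
    precision_ladder<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}

Games mostly live at FP32, dip into FP16 where the eye won't notice, and almost never need FP64, which is why a gaming-focused feature targets the 16-bit end.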
I was just looking this up. Pascal definitely supports it whereas Maxwell doesn’t.
I couldn’t see if this was true of consumer Pascal or not though, but this seems to be hardware based ability within pascal architecture too.
Consumer gaming Pascal does not support that! End of story.
“It will also be interesting to see whether other future games will support it”
I think we all know the likely answer to that.
https://cdn.meme.am/cache/instances/folder341/500x/32768341/serious-cat-am-i-supposed-to-say-wow-or-something.jpg
Probably not a surprise for most of you, but Wolf 2 will be using Vulkan. Tiago Sousa confirmed:
https://twitter.com/idSoftwareTiago/status/874287031249244160
Doom runs at 60-80 fps, 1440p maxed, on my 970 G1 Gaming with Vulkan and no overclock, so Wolfenstein will also run about the same, but even better, because now I have a Ryzen 1700 and 16 GB DDR4 3200 MHz instead of the i5 2500K and 12 GB DDR3 1600 MHz I had at Doom's release. And a 1080 Ti will run it at 4K max settings at 60+ fps anyway.