AMD’s Smart Access Memory (SAM) is a feature that lets AMD CPUs access all of the VRAM on a Radeon RX 6000-series GPU. It was initially meant to work only with Ryzen 5000-series (Zen 3) CPUs paired with an RX 6000-series GPU on a compatible 500-series (B550/X570) motherboard.
Since then, some motherboard vendors have started offering support on AMD 400-series chipsets as well.
In an interview with PC World, AMD said that its Radeon group is working with Intel to get this feature running on RX 6000-series GPUs paired with Intel’s latest compatible CPUs and motherboards. There is now new information on SAM, which is AMD’s name for the Resizable BAR feature of the PCIe specification.
According to a recent report from CapFrameX on Twitter, AMD Ryzen 3000-series CPUs based on the Zen 2 microarchitecture (codenamed Matisse), as well as older AMD processors based on Zen+ and Zen, will not be able to support the Smart Access Memory (SAM) feature AMD recently introduced.
However, Intel processors going back to the 4th Gen Haswell architecture of 2013-2014 should be able to support SAM. Technically speaking, every Intel processor since Haswell can support Resizable BAR.
It is just a matter of time before motherboard vendors release compatible UEFI firmware updates for the respective chipsets, from the Intel 8-series onward. Intel processors have supported the required instruction since the Haswell days.
Sorry, I didn't know that but this is technically impossible. Maybe they could emulate it, but it would be extremely slow.
Zen 2 does not support full-rate _pdep_u32. https://t.co/OwIv3lZ2WO
— CapFrameX (@CapFrameX) November 26, 2020
Intel’s 4th Gen Haswell architecture introduced this support with its 20-lane PCI-Express Gen 3.0 root complex. SAM is essentially AMD’s branding of the Resizable Base Address Register (Resizable BAR) feature defined by the PCI-SIG.
The feature lets a processor see a GPU’s entire VRAM as a single addressable block, rather than through a 256 MB window. The BAR defines how much of a discrete GPU’s memory can be mapped into the CPU’s address space, and modern PCs are typically limited to 256 MB of mapped memory.
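To put the 256 MB limit in perspective, here is a rough back-of-the-envelope sketch in Python. The 16 GiB VRAM figure and the `bar_coverage` helper are assumptions chosen for illustration; they are not taken from AMD or from the PCIe specification.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def bar_coverage(vram_bytes, bar_bytes=256 * MIB):
    """Return (fraction of VRAM visible through the BAR, windows needed to cover all VRAM).

    Hypothetical helper for illustration only.
    """
    visible = min(bar_bytes, vram_bytes)
    fraction = visible / vram_bytes
    windows = -(-vram_bytes // bar_bytes)  # ceiling division
    return fraction, windows


# Assumed example: a card with 16 GiB of VRAM behind a 256 MiB BAR.
fraction, windows = bar_coverage(16 * GIB)
print(f"{fraction:.2%} of VRAM visible at once, {windows} windows to cover the card")
```

With these assumed numbers the CPU sees 1/64 of the card's memory at any one time, and a driver would have to step through 64 separate windows to touch all of it. A BAR resized to the full 16 GiB would need just one.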
However, the report claims that Ryzen 5000-series processors introduce full-rate _pdep_u32/64, which it says Resizable BAR needs in order to work, and that Zen 2 and older Ryzen chips lack it. Note that PDEP is a CPU instruction from the BMI2 extension (_pdep_u32 is its compiler intrinsic), not a feature of the PCIe root complex or physical layer. If the claim holds, this would be a hardware limitation.
According to AnandTech, PDEP/PEXT (_pdep_u32) is almost 250 times faster on Zen 3 than on Zen 2. AMD processors before Zen 3 that implement PDEP and PEXT do so in microcode, with a latency of 18 cycles rather than a single cycle.
For what it’s worth, Zen 3 has the same 32-bit MMIO register width as Zen 2. Intel has supported PDEP/PEXT in hardware since 4th Gen Haswell, while AMD offers hardware-level support only from the Zen 3 lineup onward.
So support for Zen 2 and the older Zen+/Zen chips seems unlikely, and it remains to be seen whether SAM can work on these processors through emulation or some other method.
Even if SAM does work on Zen 2 through emulation, performance is likely to be very slow.
PDEP (parallel bit deposit) and PEXT (parallel bit extract) are generalized bit-level expand and compress instructions. Each takes two inputs: a source and a selector mask.
Although these instructions resemble bit-level gather-scatter SIMD operations, PDEP and PEXT, like the rest of the BMI instruction sets, operate on general-purpose registers. They are available in 32-bit and 64-bit versions.
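To make the source-and-selector idea concrete, here is a minimal software sketch of what the two instructions compute, written in Python. The function names `pext` and `pdep` and the `width` parameter are illustrative choices. The loop shows the semantics only and says nothing about how any CPU implements them in hardware or microcode.

```python
def pext(source, mask, width=32):
    """Parallel bit extract: gather the bits of `source` selected by `mask`
    and pack them into the low bits of the result."""
    result = 0
    out_pos = 0
    for bit in range(width):
        if (mask >> bit) & 1:
            result |= ((source >> bit) & 1) << out_pos
            out_pos += 1
    return result


def pdep(source, mask, width=32):
    """Parallel bit deposit: scatter the low bits of `source` to the
    positions where `mask` has a 1 bit."""
    result = 0
    in_pos = 0
    for bit in range(width):
        if (mask >> bit) & 1:
            result |= ((source >> in_pos) & 1) << bit
            in_pos += 1
    return result


# Extract the high nibble of a byte, then deposit it back where it came from.
packed = pext(0b10110110, 0b11110000)
restored = pdep(packed, 0b11110000)
print(bin(packed), bin(restored))
```

A bit-by-bit loop like this is roughly what a software fallback has to do, which is why a slow or emulated PDEP/PEXT costs so much more than a dedicated hardware implementation.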
According to this theory, PDEP/PEXT works in conjunction with Resizable BAR: Resizable BAR maps the GPU’s VRAM into CPU-addressable space, and PDEP/PEXT is then used to move data to and from it quickly.
Without fast PDEP/PEXT, the argument goes, a resizable BAR alone does little for GPU performance.
NVIDIA plans to add support for the same capability on some of its GPUs, since Resizable BAR is a PCI-SIG feature. In a recent statement to Gamers Nexus, NVIDIA confirmed that it is working on its own feature similar to what AMD has enabled on its RDNA 2 GPU lineup.
NVIDIA has previously confirmed that Resizable BAR is part of the PCI-Express specification and that its existing hardware fully supports it, so it is just a matter of time before the feature arrives on NVIDIA GPUs as well.
NVIDIA has stated that the feature will be enabled through future GPU driver updates and will be compatible with both AMD and Intel processors. It will not require a PCIe Gen 4 platform, as PCIe Gen 3 systems will be supported as well.
Today a discrete GPU typically exposes only a small portion of its frame buffer over the PCI bus. For compatibility with 32-bit operating systems, discrete GPUs typically claim a 256 MB I/O region for their frame buffers, and this is how firmware usually configures them.
A GPU that supports Resizable BAR must be able to keep the display up and showing a static image while the BAR is reprogrammed.
This matters for graphics hardware because PCI BARs are usually limited to 256 MB, while modern cards easily carry 4 GB or more of VRAM. As a result, only a fraction of that VRAM is CPU-accessible, which forces a whole set of workarounds in the driver stack.
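On Linux, one way to see how large a card's mapped regions actually are is the per-device `resource` file in sysfs, where each line lists a region's start address, end address, and flags in hexadecimal. The sketch below parses that format in Python. The `parse_pci_resources` helper and the sample text are made up for illustration, and the device address in the comment is hypothetical.

```python
def parse_pci_resources(text):
    """Return the size in bytes of each populated region listed in the
    contents of a sysfs PCI `resource` file (hypothetical helper)."""
    sizes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused region slot
        sizes.append(end - start + 1)
    return sizes


# Made-up sample contents, not taken from a real card. On a real system the
# text would come from a path such as
# /sys/bus/pci/devices/0000:03:00.0/resource (hypothetical device address).
SAMPLE = """0x00000000e0000000 0x00000000efffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000f0000000 0x00000000f01fffff 0x000000000014220c
"""

print([size // 2**20 for size in parse_pci_resources(SAMPLE)])
```

In this made-up sample the first region is 256 MiB, the typical limit described above. On a system with Resizable BAR active, the corresponding region would be expected to span the card's full VRAM instead.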
AMD says that with SAM the CPU can access all of the GPU’s memory, removing this bottleneck. In conventional Windows-based PCs, processors can only access a fraction of graphics memory (VRAM) at once, which limits system performance.
With Smart Access Memory, the data channel is expanded to use the full GPU memory, taking advantage of PCI Express bandwidth to improve performance.
AMD’s pitch is that optimizing data transfer between the CPU and the GPU boosts gaming performance when the two operate in tandem.
Update: The recent rumors that older AMD CPUs will not get SAM support because they lack full-rate PDEP instructions are false. AnandTech’s Dr. Ian Cutress got a response from AMD saying that PDEP does not dictate whether SAM can be enabled on older Ryzen CPUs.
ASUS recently enabled AMD Smart Access Memory support for 1st Gen Ryzen CPUs on the B450 chipset.
A Reddit user has managed to get Smart Access Memory working on a 1st Gen Ryzen CPU and has been posting results on the forum. User Merich98 shared that the feature works on an ASUS B450-Plus motherboard running the latest December BIOS, version 2409.
Merich98 enabled AMD Smart Access Memory with an AMD Ryzen 7 1700 and tested performance in Doom Eternal and Rise of the Tomb Raider.
Rise of the Tomb Raider showed little to no effect, while Doom Eternal on Ultra at 1080p yielded some interesting results, although most were within the margin of error. One thing the test did highlight is that minimum FPS takes a big hit with Smart Access Memory enabled, though this could be down to a lack of API support at this stage.