AMD Zen Microarchitecture IPC Gains Detailed

During Hot Chips, AMD presented the Zen microarchitecture and outlined how it reached a 40 percent gain in IPC (instructions per clock, a measure of per-core performance) over the current Excavator microarchitecture. AMD attributes these gains to three major changes: a better core engine, a better cache system, and lower power. Unlike Bulldozer, where the two cores in a “module” shared certain number-crunching components, Zen uses a self-sufficient core design.
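
To put the IPC figure in context, per-core throughput can be modeled roughly as IPC multiplied by clock speed. The sketch below applies that simplified model to the 40 percent figure; the function name and the clock speeds are hypothetical and were not part of AMD's presentation. At equal clocks the speedup is simply 1.4x.

```python
def relative_throughput(ipc_gain: float, old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Per-core throughput of a new core relative to an old one.

    Simplified model: throughput = IPC * clock, so the ratio is
    (1 + ipc_gain) * (new_clock / old_clock).
    """
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


# Same clock on both cores: the speedup equals the IPC gain alone.
same_clock = relative_throughput(0.40, 3.0, 3.0)

# Hypothetical case: the new core runs at a 10 percent lower clock.
lower_clock = relative_throughput(0.40, 3.0, 2.7)

print(f"same clock: {same_clock:.2f}x, lower clock: {lower_clock:.2f}x")
```

The second case shows why an IPC gain alone does not settle final performance: shipping clock speeds matter just as much.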

Beyond the core, the next building block of the Zen architecture is the CPU Complex (CCX), in which four cores share an 8 MB L3 cache. As in current Intel architectures, the cores share nothing beyond the L3 cache, which makes them truly independent. Zen also scales up key ancillary structures: micro-op dispatch; the instruction schedulers; the retire, load, and store queues; and a larger quad-issue FPU.

The cache system has also been improved. The L3 cache is shared between full-fledged cores, and each core has its own dedicated L2 cache. The L1 cache has been changed from write-through to write-back, and the SRAM that makes up the L2 and L3 caches is faster.
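
The difference between write-through and write-back can be illustrated with a toy model. The sketch below is not AMD's design: it assumes a tiny direct-mapped, write-allocate cache (the function name, 4 lines, and 64-byte line size are all illustrative) and simply counts how many writes reach the next cache level under each policy.

```python
def next_level_writes(stores, policy, num_lines=4, line_size=64):
    """Count writes sent to the next cache level for a list of store addresses.

    Toy direct-mapped, write-allocate cache. `policy` is either
    "write-through" or "write-back". All sizes are illustrative.
    """
    if policy not in ("write-through", "write-back"):
        raise ValueError(f"unknown policy: {policy}")

    tags = [None] * num_lines    # line address currently held in each slot
    dirty = [False] * num_lines  # modified-but-not-yet-written-back marker
    writes = 0

    for addr in stores:
        line = addr // line_size
        slot = line % num_lines
        if tags[slot] != line:
            # Miss: a dirty victim must be written back before it is replaced.
            if dirty[slot]:
                writes += 1
            tags[slot] = line
            dirty[slot] = False
        if policy == "write-through":
            writes += 1          # every store is forwarded immediately
        else:
            dirty[slot] = True   # defer the write until eviction or flush

    # Flush whatever is still dirty at the end of the run.
    writes += sum(dirty)
    return writes


# 100 stores that all land in the same 64-byte line.
stores = [0, 8, 16, 24] * 25
print(next_level_writes(stores, "write-through"))  # 100
print(next_level_writes(stores, "write-back"))     # 1
```

Repeated stores to the same line are where write-back saves the most traffic to the next level, and with it some power.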

The L3 cache has 5 times the bandwidth of the L3 cache found on current AMD architectures, and the L1 and L2 caches have 2 times the bandwidth. Loads from cache to the FPU are faster as well. Each core has 64 KB of L1 instruction cache (L1I), 32 KB of L1 data cache (L1D), and 512 KB of dedicated L2 cache; the four cores in a CCX share 8 MB of L3 cache.
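
As a quick check on these figures, the sketch below adds up the cache capacity of one CCX. The constant and function names are illustrative, and the total is a plain sum that ignores any duplication of data between cache levels.

```python
KB = 1024
MB = 1024 * KB

# Per-core and per-CCX capacities as stated in the article.
L1I_PER_CORE = 64 * KB
L1D_PER_CORE = 32 * KB
L2_PER_CORE = 512 * KB
L3_PER_CCX = 8 * MB
CORES_PER_CCX = 4


def ccx_cache_bytes(cores: int = CORES_PER_CCX) -> int:
    """Total cache capacity of one CCX: private caches plus the shared L3."""
    private = cores * (L1I_PER_CORE + L1D_PER_CORE + L2_PER_CORE)
    return private + L3_PER_CCX


print(ccx_cache_bytes() // KB, "KB of cache per CCX")            # 10624
print(L3_PER_CCX // CORES_PER_CCX // MB, "MB of L3 per core")    # 2
```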

Zen introduces simultaneous multithreading (SMT), much like Intel’s Hyper-Threading technology. As in Intel’s implementation, each core is presented to software as two threads, and the two threads compete for the resources of the core.

Lower power is attributed not only to the silicon-level gains from the move to the 14 nm FinFET process. The design team focused on low power draw from the very start with Zen. The write-back L1 cache and the op cache draw less power, and the various components on Zen processors feature aggressive clock gating, although there is no power gating.

AMD has also expanded ISA support with AVX, AVX2, BMI1, BMI2, AES, RDRAND, SMEP, SHA1/SHA256, ADX, CLFLUSHOPT, XSAVEC/XSAVES/XRSTORS, and SMAP. There are also two AMD-exclusive additions: CLZERO and PTE Coalescing.
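
On Linux, one way to see which of these extensions a given processor reports is to read the `flags` line of `/proc/cpuinfo`. The sketch below assumes the flag spellings the Linux kernel normally uses (for example `sha_ni` for the SHA extensions); the function names and the `WANTED` list are illustrative. PTE Coalescing and XRSTORS are left out because they are not assumed to have flags of their own.

```python
# Flag names as they usually appear in /proc/cpuinfo on Linux (assumed spellings).
WANTED = [
    "avx", "avx2", "bmi1", "bmi2", "aes", "rdrand", "smep", "sha_ni",
    "adx", "clflushopt", "xsavec", "xsaves", "smap", "clzero",
]


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text: str, wanted=WANTED) -> list:
    """List the wanted flags that the cpuinfo text does not report."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    print("not reported:", missing_features(text))
```

An empty list means every extension in `WANTED` is reported by the first logical CPU.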
