What is “ZEN2”?
AMD’s Zen2 (“Matisse”) is the “true” 2nd generation ZEN core, built on a 7nm process shrink, whereas the previous ZEN+ (“Pinnacle Ridge”) core was just an optimisation of the original ZEN (“Summit Ridge”) core. While socket compatible, it introduces many design improvements over both previous cores. An APU version (with integrated “Navi” graphics) is scheduled to be launched later.
New chipsets (AMD 500 series) will also be introduced and are required for some new features (PCIe 4.0). With a BIOS/firmware update, older boards may support the new CPUs, allowing existing systems to be upgraded with more cores and thus more performance. [Note: older boards will not be enabled for PCIe 4.0 after all]
The list of changes vs. previous ZEN/ZEN+ is extensive, so performance is likely to differ considerably too:
- Built around “chiplets” of up to 2 CCX (“core complexes”) each of 4C/8T and 8MB L3 cache (7nm)
- Central I/O hub with memory controller(s) and PCIe 4.0 bridges connected through IF (“Infinity Fabric”) (12nm)
- Up to 2 chiplets on desktop platform thus up to 2x2x4C (16C/32T 3950X) (same amount as old ThreadRipper 1950X/2950X)
- 2x larger L3 cache per CCX thus up to 2x2x16MB (64MB) L3 cache (3900X+)
- 20 PCIe 4.0 lanes (2x higher transfer rate over PCIe 3.0)
- 2x DDR4 memory controllers up to 3200Mt/s official (4266Mt/s max)
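The topology and PCIe figures in the list above are simple products, so they can be checked with a few lines of Python. This is only an illustrative sketch: the `Zen2Package` class and the `pcie_lane_gb_per_s` helper are hypothetical names invented here, and the PCIe numbers assume the standard 8 GT/s (PCIe 3.0) and 16 GT/s (PCIe 4.0) signalling rates with 128b/130b encoding, ignoring protocol overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zen2Package:
    """Hypothetical helper describing a Zen2 desktop package."""
    chiplets: int               # number of 7nm core chiplets
    ccx_per_chiplet: int = 2    # core complexes per chiplet
    cores_per_ccx: int = 4      # active cores per CCX (3 on the 3900X)
    l3_per_ccx_mb: int = 16     # L3 cache per CCX in MB
    threads_per_core: int = 2   # SMT

    @property
    def cores(self) -> int:
        return self.chiplets * self.ccx_per_chiplet * self.cores_per_ccx

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l3_total_mb(self) -> int:
        return self.chiplets * self.ccx_per_chiplet * self.l3_per_ccx_mb


def pcie_lane_gb_per_s(gigatransfers_per_s: float) -> float:
    """One-direction throughput of a single lane with 128b/130b encoding."""
    return gigatransfers_per_s * (128 / 130) / 8


ryzen_3950x = Zen2Package(chiplets=2)
ryzen_3900x = Zen2Package(chiplets=2, cores_per_ccx=3)
ryzen_3700x = Zen2Package(chiplets=1)

for name, cpu in [("3950X", ryzen_3950x),
                  ("3900X", ryzen_3900x),
                  ("3700X", ryzen_3700x)]:
    print(f"{name}: {cpu.cores}C/{cpu.threads}T, {cpu.l3_total_mb}MB L3")

gen3 = pcie_lane_gb_per_s(8.0)    # PCIe 3.0 signalling rate
gen4 = pcie_lane_gb_per_s(16.0)   # PCIe 4.0 signalling rate
print(f"PCIe 3.0: {gen3:.2f} GB/s per lane, PCIe 4.0: {gen4:.2f} GB/s per lane")
```

Note that the 3900X keeps the full 64MB of L3 despite having one core disabled per CCX, which is why it has more cache per core than the 3950X.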
What’s new in the Zen2 core?
Micro-architecturally there are more changes that should improve performance:
- 256-bit (single-op) SIMD units 2x FMA (fixing a major deficiency in ZEN/ZEN+ cores)
- TLB (2nd level) increased (should help out-of-page access latencies that are somewhat high on ZEN/ZEN+)
- Memory latencies are claimed to be reduced through higher-speed memory (note all requests go through IF to the Central I/O hub with the memory controllers)
- Load/Store 32bytes/cycle (2x ZEN/ZEN+) to keep up with the 256-bit SIMD units (L1D bandwidth should be 2x)
- L3 cache is 2x ZEN/ZEN+ (so 16MB) but higher latency (cache is exclusive)
- Infinity Fabric is 512-bit (2x ZEN/ZEN+) and can run 1x or 1/2x vs. DRAM clock (when higher than 3733Mt/s)
- AMD processors have thankfully not been affected by most of the vulnerabilities; the two exceptions (BTI/“Spectre”, SSB/“Spectre v4”) have now been addressed in hardware.
- HWM-P (hardware performance state management) transition latencies reduced (ACPI/CPPCv2)
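To see why the wider SIMD units matter, it helps to work out the theoretical peak floating-point throughput per core. The sketch below is a rough estimate, not a measurement: the function names are hypothetical, it assumes two FMA units per core for every architecture compared (the list above states this for ZEN2 and the specification table for SKL-X; for ZEN/ZEN+ it is an assumption), and it counts one FMA as two operations. Real code rarely reaches these ceilings, and actual clocks under heavy SIMD load may differ from the base clock used here.

```python
def peak_flops_per_cycle(simd_bits: int, fma_units: int = 2,
                         element_bits: int = 32) -> int:
    """Theoretical peak floating-point operations per cycle for one core."""
    lanes = simd_bits // element_bits   # elements processed per SIMD op
    return lanes * fma_units * 2        # one FMA = multiply + add


def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak GFLOPS for the whole CPU at a fixed clock."""
    return cores * ghz * flops_per_cycle


zen_plus_fp32 = peak_flops_per_cycle(128)   # 128-bit units (ZEN/ZEN+)
zen2_fp32 = peak_flops_per_cycle(256)       # 256-bit units (ZEN2)
skl_x_fp32 = peak_flops_per_cycle(512)      # 512-bit units (AVX512)

print(f"FP32 per core per cycle: ZEN+ {zen_plus_fp32}, "
      f"ZEN2 {zen2_fp32}, SKL-X {skl_x_fp32}")

# 3900X at its 3.8GHz base clock (12 cores), a theoretical ceiling only
ryzen_3900x_peak = peak_gflops(cores=12, ghz=3.8, flops_per_cycle=zen2_fp32)
print(f"3900X theoretical FP32 peak at base clock: {ryzen_3900x_peak:.1f} GFLOPS")
```

Under these assumptions ZEN2 doubles the per-core, per-cycle SIMD ceiling of ZEN/ZEN+, while a 512-bit AVX512 core doubles it again.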
In this article we test CPU core performance; please see our other articles on:
- AMD Ryzen 7 5800X (Zen3) Review & Benchmarks – CPU 8-core/16-thread Performance
- AMD Ryzen 9 3900X (Zen2) Review & Benchmarks – Cache and Memory Performance
- AMD Ryzen 7 3700X (Zen2) Review & Benchmarks – CPU 8-core/16-thread Performance
- AMD Ryzen Threadripper 3970X, 3960X Review & Benchmarks – CPU Performance
Hardware Specifications
We are comparing the top-of-the-range Ryzen2 (3900X, 3700X) with previous generation Ryzen+ (2700X) and competing architectures with a view to upgrading to a mid-range high performance design.
CPU Specifications | AMD Ryzen 9 3900X (Matisse) | AMD Ryzen 7 3700X (Matisse) | AMD Ryzen 7 2700X (Pinnacle Ridge) | Intel i9 9900K (Coffeelake-R) | Intel i9 7900X (Skylake-X) | Comments | |
Cores (CU) / Threads (SP) | 12C / 24T | 8C / 16T | 8C / 16T | 8C / 16T | 10C / 20T | Matching core-count with CFL (3800X) but 3900X has 50% more cores – more than SKL-X. | |
Topology | 2 chiplets, each 2 CCX, each 3 cores (1 disabled) (12C) | 1 chiplet, 2 CCX, each 4 cores (8C) | 2 CCX, each 4 cores (8C) | Monolithic die | Monolithic die | AMD uses discrete dies/chiplets unlike Intel | |
Speed (Min / Max / Turbo) | 3.8 / 4.6GHz | 3.6 / 4.4GHz | 3.7 / 4.2GHz | 3.6 / 5GHz | 3.3 / 4.3GHz | Base clock and turbo are competitive with 3800X having higher base while 3900X higher turbo. | |
Power (TDP / Turbo) | 105 / 135W | 65 / 90W | 105 / 135W | 95 / 135W | 140 / 308W | TDP remains the same but 3900X may exceed that having more cores. | |
L1D / L1I Caches | 12x 32kB 8-way / 12x 32kB 8-way | 8x 32kB 8-way / 8x 32kB 8-way | 8x 32kB 8-way / 8x 64kB 4-way | 8x 32kB 8-way / 8x 32kB 8-way | 10x 32kB 8-way / 10x 32kB 8-way | ZEN2 matches L1I with CFL/SKL-X (1/2x ZEN+ but 8-way), L1D is unchanged (also matches Intel) | |
L2 Caches | 12x 512kB (6MB) 8-way | 8x 512kB (4MB) 8-way | 8x 512kB (4MB) 8-way | 8x 256kB (2MB) 16-way | 10x 1MB (10MB) 16-way | No changes to L2, still 2x CFL. Only SKL-X has its massive 1MB L2 per core which 3900X almost matches! | |
L3 Caches | 2x2x 16MB (64MB) 16-way | 2x 16MB (32MB) 16-way | 2x 8MB (16MB) 16-way | 16MB 16-way | 13.75MB 11-way | L3 is 2x ZEN/ZEN+ and thus 2x CFL (3800X) with 3900X having a massive 64MB unheard of on the desktop platform! SKL-X can’t match it either. | |
Mitigations for Vulnerabilities | BTI/“Spectre”, SSB/“Spectre v4” hardware | BTI/“Spectre”, SSB/“Spectre v4” hardware | BTI/“Spectre”, SSB/“Spectre v4” software/firmware | RDCL/“Meltdown”, L1TF hardware, BTI/“Spectre”, MDS/“Zombieload”, software/firmware | RDCL/“Meltdown”, L1TF, BTI/“Spectre”, MDS/“Zombieload”, all software/firmware | Ryzen2 addresses the remaining 2 vulnerabilities while Intel was forced to add MDS to its long list… | |
Microcode | MU-8F7100-11 | MU-8F7100-11 | MU-8F0802-04 | MU-069E0C-9E | MU-065504-49 | The latest microcodes included in the respective BIOS/Windows have been loaded. | |
SIMD Units | 256-bit AVX/FMA3/AVX2 | 256-bit AVX/FMA3/AVX2 | 128-bit AVX/FMA3/AVX2 | 256-bit AVX/FMA3/AVX2 | 512-bit AVX512 | ZEN2 finally matches Intel/CFL but SKL-X’s secret weapon is AVX512 with even consumer CPUs able to do 2x 512-bit FMA ops. | |
Native Performance
We are testing native arithmetic, SIMD and cryptography performance using the highest performing instruction sets (AVX2, FMA3, AVX, etc.). Ryzen2 supports all modern instruction sets including AVX2 and FMA3, plus extras such as SHA hardware acceleration (SHA HWA), but not AVX-512.
Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.
Environment: Windows 10 x64, latest AMD and Intel drivers. 2MB “large pages” were enabled and in use. Turbo / Boost was enabled on all configurations. All mitigations for vulnerabilities (Meltdown, Spectre, L1TF, MDS, etc.) were enabled as per Windows default where applicable.
Ryzen2 (unlike Ryzen1/+) has no trouble with SIMD code thanks to its widened (256-bit) SIMD units and soundly beats the opposition (CFL-R 9900K flagship), sometimes by more than the core count increase alone would suggest (+50% i.e. 12 cores vs. 8). Sometimes it even beats the AVX512 opposition (SKL-X 7900X) by virtue of having more cores (12 cores vs. 10).
The only “problematic” algorithms are the memory-bound ones, where the cores/threads (with SMT we have 24!) are starved of data and, due to contention, performance is lower than on devices with fewer cores. While larger caches help (thus the massive 4x 16MB L3 caches), higher-clocked memory should be used to match the additional core requirements.
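The starvation effect can be illustrated with simple arithmetic: the same two DDR4 channels must feed more cores. The sketch below is a back-of-the-envelope estimate rather than a benchmark. The `ddr4_peak_gb_per_s` helper is a hypothetical name, it assumes a 64-bit (8-byte) bus per channel, and it uses theoretical peak bandwidth, which real workloads do not reach. The quad-channel figure uses the same DDR4-3200 speed purely for comparison, which is an assumption and not the tested configuration.

```python
def ddr4_peak_gb_per_s(mt_per_s: int, channels: int = 2,
                       bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal)."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


dual_channel = ddr4_peak_gb_per_s(3200)               # desktop platform
quad_channel = ddr4_peak_gb_per_s(3200, channels=4)   # HEDT-style platform

# Share of the dual-channel peak available to each core when all are busy
per_core = {cores: dual_channel / cores for cores in (8, 12, 16)}

print(f"Dual-channel DDR4-3200 peak: {dual_channel:.1f} GB/s")
print(f"Quad-channel DDR4-3200 peak: {quad_channel:.1f} GB/s")
for cores, share in per_core.items():
    print(f"{cores} cores: {share:.2f} GB/s per core")
```

With 12 cores each core gets a third less peak bandwidth than with 8, and with 16 cores only half, which is why faster memory matters more as core counts rise.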
SiSoftware Official Ranker Scores
Final Thoughts / Conclusions
Executive Summary: Ryzen2 is phenomenal and a huge upgrade over Ryzen1/+ that (most) AM4 users can enjoy and Intel has no answer to. 10/10.
Just as the original Ryzen forced Intel to increase (double really) core counts to match (from 4 to 6 then 8), Ryzen2 will force Intel to come up with even more (and better) cores in order to compete. The 3900X with its 12 cores soundly beats CFL-R 9900K (8 cores) in just about all benchmarks and in some tests goes toe-to-toe with the AVX512-enabled HEDT SKL-X (10 cores), except in memory-bound algorithms where SKL-X’s 4 DDR4 memory channels, with 2x the bandwidth, count. For that you need ThreadRipper!
Ryzen1/+ was already competitive with Intel on integer and floating-point (non-SIMD) workloads but would fare badly on SIMD (AVX/FMA3/AVX2) workloads due to its 128-bit units; Ryzen2 “fixes” this issue, with its 256-bit units matching Intel. Only SKL-X with its 512-bit units (AVX512) is faster and Intel will have to finally include AVX512 for consumer CPUs in order to compete (IceLake?).
For compute-bound workloads, the forthcoming 3950X with its 16 cores/32 threads brings performance to the consumer/desktop segment that was unheard of just a few years ago, when 4 cores/8 threads (e.g. 7700K) were all you could hope for unless you paid a lot more for HEDT, where 8/10-core CPUs were far more expensive. Naturally we shall see how the reduced memory bandwidth per core affects its performance, with very fast DDR4 memory (4300Mt/s+) likely required for best performance.
Let’s also remember that Ryzen2 adds hardware mitigations for its remaining 2 vulnerabilities, while Intel has been forced to add MDS/“Zombieload” to the list even for its very latest CFL-R, which now loses its trump card (the hardware RDCL/“Meltdown” fix). Not to forget the recommendation to disable SMT/Hyperthreading, which would mean a sizeable performance drop.
What is astonishing is that TDP has remained similar and, with a BIOS/firmware upgrade, owners of older 300-series boards can now upgrade to these CPUs, likely without even changing the cooler! Naturally for PCIe 4.0 a 500-series board is required, and 400-series boards do support more features in Ryzen2/+, but let’s remember that on Intel you can only go back/forward 1 generation even though there is pretty much no core difference from Skylake (Gen 6) to Coffeelake-R (Gen 9)!
From top-end (3950X), high-end (3800X) to low-end/APU (3200G) Ryzen2 is such a compelling choice it is hard to recommend anything else… at least at this time…