What is “Ryzen2” ZEN2?
AMD’s Zen2 (“Matisse”) is the “true” 2nd-generation ZEN core, built on a 7nm process shrink, whereas the previous ZEN+ (“Pinnacle Ridge”) core was just an optimisation of the original ZEN (“Summit Ridge”) core. While socket-compatible, Zen2 introduces many design improvements over both previous cores. An APU version (with integrated “Navi” graphics) is scheduled to be launched later.
While new chipsets (500 series) will also be introduced, and are required for some new features (PCIe 4.0), with a BIOS/firmware update even 300/400-series boards should support the new CPUs, allowing existing systems to be upgraded with more cores and thus more performance.
The list of changes vs. the previous ZEN/ZEN+ is extensive, thus the performance delta is likely to vary widely too:
- Built around “chiplets” of up to 2 CCX (“core complexes”), each of 4C/8T and L3 cache (7nm)
- Central I/O hub with memory controller(s) and PCIe 4.0 bridges connected through IF (“Infinity Fabric”) (12nm)
- Up to 2 chiplets on desktop platform thus up to 16C/32T (3950X) (same amount as old ThreadRipper 1950X/2950X)
- 2x larger L3 cache per CCX thus up to 64MB L3 cache (3950X)
- 24 PCIe 4.0 lanes (2x higher transfer rate over PCIe 3.0)
- 2x DDR4 memory controllers supporting up to 4266MT/s
- 256-bit (single-op) SIMD units with 2x FMACs (fixing a major deficiency in ZEN/ZEN+ cores)
- TLB (2nd level) increased (should help out-of-page access latencies that are somewhat high on ZEN/ZEN+)
- Memory latencies are claimed to be reduced through higher-speed memory (note that all requests go through IF to the central I/O hub with the memory controllers)
- Load/Store 32bytes/cycle (2x ZEN/ZEN+) to keep up with the 256-bit SIMD units (L1D bandwidth should be 2x)
- L3 cache is 2x ZEN/ZEN+ but higher latency (cache is exclusive)
- Infinity Fabric is 512-bit (2x ZEN/ZEN+) and can run at 1x or 1/2x the DRAM clock (the latter when memory runs faster than 3733MT/s)
- AMD processors have thankfully not been affected by most of the vulnerabilities, bar two (BTI/“Spectre”, SSB/“Spectre v4”), which have now been addressed in hardware.
- HWM-P (hardware performance state management) transition latencies reduced (ACPI/CPPCv2)
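As a rough sanity check on the figures in the list above, the headline numbers can be derived with simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, the 64-bit DDR4 channel width and the 128b/130b PCIe encoding are standard values not stated in this article, and treating 3733MT/s as a hard switch point for the fabric ratio is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zen2Desktop:
    """Hypothetical topology model built from the figures in the list above."""
    chiplets: int
    ccx_per_chiplet: int = 2    # up to 2 CCX per chiplet
    cores_per_ccx: int = 4      # 4 cores per CCX
    threads_per_core: int = 2   # SMT: 2 threads per core
    l3_mb_per_ccx: int = 16     # 2x the 8MB per CCX of ZEN/ZEN+

    @property
    def cores(self) -> int:
        return self.chiplets * self.ccx_per_chiplet * self.cores_per_ccx

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l3_mb(self) -> int:
        return self.chiplets * self.ccx_per_chiplet * self.l3_mb_per_ccx


def ddr4_peak_gbs(mt_per_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (assumes 64-bit channels)."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def pcie_lane_gbs(gt_per_s: float) -> float:
    """Theoretical per-lane, per-direction bandwidth in GB/s.

    Assumes 128b/130b encoding, as used by PCIe 3.0 (8 GT/s) and 4.0 (16 GT/s).
    """
    return gt_per_s * (128 / 130) / 8


def fabric_ratio(mt_per_s: float, threshold: float = 3733) -> float:
    """Fabric-to-DRAM clock ratio; the hard threshold is an assumption."""
    return 1.0 if mt_per_s <= threshold else 0.5


if __name__ == "__main__":
    r9 = Zen2Desktop(chiplets=2)
    print(f"{r9.cores}C/{r9.threads}T, {r9.l3_mb}MB L3")
    print(f"DDR4-4266 dual channel: {ddr4_peak_gbs(4266):.1f} GB/s peak")
    print(f"PCIe 4.0 vs 3.0 per lane: {pcie_lane_gbs(16) / pcie_lane_gbs(8):.1f}x")
    print(f"Fabric ratio at 4266MT/s: {fabric_ratio(4266)}")
```

With two chiplets the model yields the 16C/32T and 64MB L3 of the 3950X; with one chiplet it yields 8C/16T and 32MB L3, as on the 3800X. These are theoretical peaks, not measured results.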
In this article we test CPU core performance; please see our other articles on:
We are comparing the top-of-the-range Ryzen2 (3950X, 3800X) with previous generation (2700X) and competing architectures with a view to upgrading to a mid-range high performance design.
|CPU Specifications||AMD Ryzen 9 3950X Matisse||AMD Ryzen 7 3800X Matisse||AMD Ryzen 7 2700X Pinnacle Ridge||Intel i9 9900K CoffeeLake-R||Intel i9 7900X Skylake-X||Comments|
|Cores (CU) / Threads (SP)||16C / 32T||8C / 16T||8C / 16T||8C / 16T||10C / 20T||Matching core-count with CFL (3800X) but 3950X doubles cores/threads – effectively ThreadRipper at lower cost.|
|Speed (Min / Max / Turbo)||3.5 / 4.7GHz||3.9 / 4.5GHz||3.7 / 4.2GHz||3.6 / 5GHz||3.3 / 4.3GHz||Base clock and turbo are competitive with 3800X having higher base while 3950X higher turbo.|
|Power (TDP / Turbo)||105 / 145W||105 / 125W||105 / 135W||95 / 135W||140 / 308W||TDP remains the same, but the 3950X may smash through it as it has 2x the cores; turbo power is naturally higher.|
|L1D / L1I Caches||16x 32kB 8-way / 16x 32kB 8-way||8x 32kB 8-way / 8x 32kB 8-way||8x 32kB 8-way / 8x 64kB 4-way||8x 32kB 8-way / 8x 32kB 8-way||10x 32kB 8-way / 10x 32kB 8-way||ZEN2 matches L1I with CFL/SKL-X (1/2x ZEN+ but 8-way), L1D is unchanged (also matches Intel)|
|L2 Caches||16x 512kB (8MB) 8-way||8x 512kB (4MB) 8-way||8x 512kB (4MB) 8-way||8x 256kB (2MB) 4-way||10x 1MB (10MB) 16-way||No changes to L2, still 2x CFL. Only SKL-X has its massive 1MB L2 per core, whose total the 3950X almost matches!|
|L3 Caches||4x 16MB (64MB) 16-way||2x 16MB (32MB) 16-way||2x 8MB (16MB) 16-way||16MB 16-way||13.75MB 11-way||L3 is 2x ZEN/ZEN+ and thus 2x CFL (3800X) with 3950X having a massive 64MB unheard of on the desktop platform! SKL-X can’t match it either.|
|SIMD Units||256-bit AVX/FMA3/AVX2||256-bit AVX/FMA3/AVX2||128-bit AVX/FMA3/AVX2||256-bit AVX/FMA3/AVX2||512-bit AVX512||ZEN2 finally matches Intel/CFL but SKL-X’s secret weapon is AVX512 with even consumer CPUs able to do 2x 512-bit FMA ops.|
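To put the SIMD row in perspective, theoretical peak floating-point throughput can be estimated from the SIMD width, the number of FMA units per core, the core count and the clock. The sketch below is a back-of-the-envelope model, not a measurement: the function names are hypothetical, it assumes two FMA units per core for every design (including the 128-bit ZEN+ units), and it ignores the clock reductions that can apply under heavy SIMD load.

```python
def flops_per_cycle(simd_bits: int, fma_units: int = 2, elem_bits: int = 32) -> int:
    """Peak FLOPs per cycle per core.

    Each FMA counts as 2 floating-point operations (multiply + add).
    The default of 2 FMA units per core is an assumption.
    """
    lanes = simd_bits // elem_bits
    return lanes * fma_units * 2


def peak_gflops(cores: int, ghz: float, simd_bits: int,
                fma_units: int = 2, elem_bits: int = 32) -> float:
    """Theoretical peak GFLOPS for the whole CPU at a fixed clock."""
    return cores * ghz * flops_per_cycle(simd_bits, fma_units, elem_bits)


if __name__ == "__main__":
    # (name, cores, base clock in GHz, SIMD width in bits), taken from the table
    cpus = [
        ("Ryzen 9 3950X", 16, 3.5, 256),
        ("Ryzen 7 3800X", 8, 3.9, 256),
        ("Ryzen 7 2700X", 8, 3.7, 128),
        ("Core i9 9900K", 8, 3.6, 256),
    ]
    for name, cores, ghz, bits in cpus:
        print(f"{name}: {peak_gflops(cores, ghz, bits):.0f} GFLOPS (FP32, base clock)")
```

Under these assumptions a 256-bit core peaks at twice the per-cycle throughput of a 128-bit one, which is the deficiency the wider ZEN2 units address; real workloads will land well below these ceilings.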
We are testing native arithmetic, SIMD and cryptography performance using the highest performing instruction sets (AVX2, FMA3, AVX, etc.). Ryzen2 supports all modern instruction sets including AVX2 and FMA3, plus extras such as SHA hardware acceleration (SHA HWA), but not AVX-512.
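Picking the “highest performing instruction set” usually comes down to a simple preference order over the features a CPU reports. The sketch below shows the idea only; it is not how the benchmark itself dispatches. The function name is hypothetical, and the feature names follow the flag spelling used by Linux in `/proc/cpuinfo`, which is an assumption made purely for illustration.

```python
# Widest vector extension first; names follow Linux /proc/cpuinfo flags.
SIMD_PREFERENCE = ("avx512f", "avx2", "avx", "sse2")


def pick_simd_path(flags: set) -> str:
    """Return the widest SIMD extension present in `flags`, else "scalar"."""
    for name in SIMD_PREFERENCE:
        if name in flags:
            return name
    return "scalar"


if __name__ == "__main__":
    # Illustrative feature sets, not read from real hardware.
    zen2_like = {"sse2", "avx", "avx2", "fma", "sha_ni"}
    sklx_like = zen2_like - {"sha_ni"} | {"avx512f"}
    print(pick_simd_path(zen2_like))
    print(pick_simd_path(sklx_like))
```

A CPU without AVX-512, such as Ryzen2, falls through to the AVX2 path, while one that reports it takes the 512-bit path.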
Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.
Environment: Windows 10 x64, latest AMD and Intel drivers. 2MB “large pages” were enabled and in use. Turbo / Boost was enabled on all configurations.
Check back on July 7 for the benchmark results ;(
SiSoftware Official Ranker Scores
Final Thoughts / Conclusions
Check back on July 7 for the conclusion ;(