AMD Ryzen 5 7600X (Zen4 Raphael) Review & Benchmarks – Value AVX512 Performance

What is “Zen4” (Ryzen 7000)?

AMD’s Zen4 (“Raphael”) is the 4th generation ZEN core – aka the new 7000-series of CPUs from AMD – that brings brand new features like AVX512 ISA (instruction set) support, DDR5 and PCIe5. These do require a brand new platform (AM5), some six years after the current AM4 platform was launched, before even the 1st generation Ryzen. With any luck, it will remain in use for the next 4 or even more CPU generations, unlike the 2-generation support on the competitor’s (Intel) platform.

Zen4 contains only big/P(erformance) cores and it is not a hybrid design. It remains to be seen if AMD will launch such hybrid (big/LITTLE) products that, in our opinion, are too problematic on desktop platforms for the benefits they bring. Even on mobile platforms, where efficiency is a top priority, workloads do not easily lend themselves to a hybrid design despite the huge work done on the Windows scheduler for Windows 11. In this regard, a non-hybrid design like Zen4 is very much preferred.

AVX512 is a huge boost for compute performance as we’ve seen on Intel since SKL-X (Skylake-X). There is a reason it exists, along with all its extensions (IFMA, VNNI, VAES, etc.): it is not unexpected that even basic usage can bring up to 100% (2x) performance improvement, and even more with specific instructions. While CPUs originally had to reduce clocks due to the extra power drawn – this has pretty much been mitigated in modern designs. Even Centaur (before Intel bought them) had AVX512-enabled (LITTLE) cores.

While here AMD has implemented it as 2x 256-bit ops (similar to previous AVX2/FMA3 in Zen1/1+/2 implemented as 2x 128-bit) – we still benefit from 2x more registers + 2x wider registers (4x overall), arguably better instruction specification, optimised extensions (IFMA, VNNI, VAES, etc.) that overall can still build up to a big improvement over old AVX2/FMA3.
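To make the "4x overall" figure and the 2x 256-bit execution concrete, here is a toy Python sketch. It assumes the usual x86-64 register counts (16 vector registers for AVX2, 32 for AVX512) and models a 512-bit FP32 add as two 8-lane halves. The function names are ours for illustration only, and this says nothing about how the hardware actually schedules the two halves.

```python
# Assumption: x86-64 exposes 16 x 256-bit registers with AVX2
# and 32 x 512-bit registers with AVX512.
AVX2_REGS, AVX2_BITS = 16, 256
AVX512_REGS, AVX512_BITS = 32, 512


def register_file_bits(count, width_bits):
    """Total architectural vector register capacity in bits."""
    return count * width_bits


# 2x more registers * 2x wider registers = 4x the register capacity.
capacity_ratio = (register_file_bits(AVX512_REGS, AVX512_BITS)
                  / register_file_bits(AVX2_REGS, AVX2_BITS))


def add_512_as_two_256(a, b):
    """Add two 16-lane FP32 vectors as two 8-lane (256-bit) halves.

    Hypothetical helper: a software model of '2x 256-bit ops',
    not a description of the real micro-architecture.
    """
    assert len(a) == len(b) == 16
    half = 8
    low = [x + y for x, y in zip(a[:half], b[:half])]
    high = [x + y for x, y in zip(a[half:], b[half:])]
    return low + high


print(f"register capacity ratio: {capacity_ratio:.0f}x")
print(add_512_as_two_256(list(range(16)), [1] * 16))
```

The result of the split add is identical to a single 16-lane add; only the number of passes through the execution unit differs, which is why the software-visible benefits (more and wider registers, new instructions) remain.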

  • 5nm process (TSMC) for CCX (vs. 7nm on Zen3) for better efficiency and clocks
  • 6nm process (TSMC) for I/O hub (vs. 12nm for Zen3) for better memory speeds
  • Claimed 13% IPC increase vs. Zen3 + clock uplift => ~29% total uplift vs. Zen3
  • AVX512 instruction support, with potential 100%+ improvement in optimised workloads
    • Executed as 2x 256-bit (not true 512-bit like Intel) but still many benefits over AVX2/FMA3
    • Specific AVX512 extensions (IFMA, VNNI, VAES, etc.) can bring well over 100% improvement
  • DDR5 support up to 5200Mt/s (official) for much higher memory bandwidth vs. DDR4 Zen3
    • Unofficial support for at least 6400Mt/s with XMP3/EXPO profiles
    • AMD says 6000Mt/s is the “sweet-spot” for performance/value
  • 1MB L2 per core (2x vs. 512kB on Zen3)
  • L3 is the same at 32MB – 7600X-3D model will get 96MB V-Cache
  • PCIe5 support, up to 24 lanes (2x bandwidth vs. PCIe4)
  • Still up to 2 chiplets (at launch) of max. 8C big/P cores each (1 chiplet with 6C/12T on 7600X)
  • Much higher both base and turbo speeds in most variants, e.g. 7600X
    • Higher base 4.7GHz (vs. 3.7GHz on 5600X +27% clock uplift)
    • Higher turbo 5.3GHz (vs. 4.6GHz on 5600X +15% clock uplift)
  • TDP at 105W (7600X) vs. 65W (5600X) thus 62% higher! (ouch!)
    • Turbo (PPT aka PL2) around 142W thus 61% higher! (ouch!)
    • Note that other models (e.g. 7700X) have kept the same TDP/Turbo
  • Built-in Radeon Graphics (RDNA2) core
    • 2CU / 128SP 400MHz-2.2GHz cores for very basic graphics
AMD Zen4 (7700X, 7600X) Chiplet + I/O Hub
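The clock and power uplifts quoted in the feature list are plain ratios of the 7600X figures over the 5600X figures. A minimal Python sketch of that arithmetic, using only the numbers given above (the helper name `uplift_percent` is ours):

```python
def uplift_percent(new, old):
    """Relative change of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# (7600X value, 5600X value) as quoted in the feature list.
specs = {
    "base clock (GHz)": (4.7, 3.7),
    "turbo clock (GHz)": (5.3, 4.6),
    "TDP (W)": (105, 65),
    "PPT (W)": (142, 88),
}

for name, (zen4, zen3) in specs.items():
    print(f"{name}: {uplift_percent(zen4, zen3):+.0f}%")
```

This reproduces the +27% / +15% clock uplifts and the +62% / +61% power increases.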

Review

In this article we test CPU core performance.

Hardware Specifications

We are comparing the mid-range Ryzen 5 7000-series (Zen4 6-core) with the previous generation Ryzen 5 5000-series (Zen3 6-core) and competing architectures with a view to upgrading to a mid-range, high performance design.

| CPU Specifications | AMD Ryzen 5 7600X 6C/12T (Zen4, Raphael) | AMD Ryzen 5 5600X 6C/12T (Zen3, Vermeer) | Intel Core i5 12600K 6C+4c/16T (ADL, AlderLake) | Intel Core i5 11600K 6C/12T (RKL, RocketLake) | Comments |
| --- | --- | --- | --- | --- | --- |
| Cores (CU) / Threads (SP) | 6C / 12T | 6C / 12T | 6C+4c / 16T | 6C / 12T | Core counts remain the same. |
| Topology | 1 chiplet, 1 CCX, 6 core + I/O hub | 1 chiplet, 1 CCX, 6 core + I/O hub | Monolithic die | Monolithic die | Same topology |
| Speed (Min / Max / Turbo) (GHz) | 4.7 [+27%] – 5.3 [+15%] | 3.7 – 4.6 | 3.7 – 4.9 | 3.9 – 4.9 | Base is 27% higher, turbo 15% |
| Power (TDP / Turbo) (W) | 105 [+62%] – 142 [+61%] | 65 – 88 | 125 – 155 | 125 – 155 | TDP is 62% higher, PPT 61%! |
| L1D / L1I Caches (kB) | 6x 32kB 8-way / 6x 32kB 8-way | 6x 32kB 8-way / 6x 32kB 8-way | 6x 64kB + 4x 32kB / 6x 32kB + 4x 48kB | 6x 64kB + 6x 32kB | No changes to L1 |
| L2 Caches (MB) | 6x 1MB (6MB) 8-way inclusive [+2x] | 6x 512kB (3MB) 8-way inclusive | 6x 1.25MB + 2MB [10MB] | 6x 512kB [3MB] | L2 is 2x larger |
| L3 Caches (MB) | 32MB 16-way exclusive | 32MB 16-way exclusive | 20MB 16-way | 12MB 12-way | L3 is the same |
| Mitigations for Vulnerabilities | BTI/“Spectre”, SSB/“Spectre v4” hardware | BTI/“Spectre”, SSB/“Spectre v4” hardware | BTI/“Spectre”, SSB/“Spectre v4” hardware | BTI/“Spectre”, SSB/“Spectre v4” software/firmware | No new fixes required… yet! |
| Microcode (MU) | A60F12-03 | A20F10-09 | 090672-15 | 06A701-50 | The latest microcodes have been loaded. |
| SIMD Units | 2x 256-bit (512-bit total) AVX512+ | 256-bit AVX/FMA3/AVX2 | 256-bit AVX/FMA3/AVX2 | 512-bit [1 Unit] AVX512+ | 2x wider SIMD |
| Price/RRP (USD) | $299 [=] | $299 | $289 | $262 | Price is the same |

Disclaimer

This is an independent review (critical appraisal) that has not been endorsed nor sponsored by any entity (e.g. AMD, etc.). All trademarks acknowledged and used for identification only under fair use.

The review contains only public information that was neither provided under NDA nor embargoed. At publication time, the products have not been directly tested by SiSoftware but submitted to the public Benchmark Ranker; thus the accuracy of the benchmark scores cannot be verified, however, they appear consistent and pass current validation checks.

And please, don’t forget small ISVs like ourselves in these very challenging times. Please buy a copy of Sandra if you find our software useful. Your custom means everything to us!

Native Performance

We are testing native arithmetic, SIMD and cryptography performance using the highest performing instruction sets. Zen4 supports all modern instruction sets including AVX2/FMA3 and crypto SHA HWA but also AVX-512 and extensions (IFMA, VNNI, VAES, etc.)
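For readers who want to check which of these extensions their own CPU reports, here is a small Python sketch. It is an assumption-laden illustration, not how Sandra detects features: it relies on the Linux `/proc/cpuinfo` flag names (`avx512f`, `avx512ifma`, `avx512_vnni`, `vaes`), so it does not apply to the Windows environment used in this review, and the helper names are ours.

```python
# Assumption: Linux kernel flag names as shown in /proc/cpuinfo.
AVX512_FLAGS = {
    "AVX512F": "avx512f",
    "IFMA": "avx512ifma",
    "VNNI": "avx512_vnni",
    "VAES": "vaes",
}


def detect_avx512(flags_line):
    """Map each extension name to True/False given a space-separated flag string."""
    present = set(flags_line.split())
    return {name: flag in present for name, flag in AVX512_FLAGS.items()}


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' line of /proc/cpuinfo, or '' if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""


# Made-up sample flag string, for illustration only.
sample = "fpu sse2 avx avx2 fma avx512f avx512ifma vaes"
print(detect_avx512(sample))
```

On a Linux machine, `detect_avx512(read_linux_cpu_flags())` would report the flags of the local CPU instead of the sample string.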

Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.

Environment: Windows 11 x64 (21H2), latest AMD and Intel drivers. 2MB “large pages” were enabled and in use. Turbo / Boost was enabled on all configurations. All mitigations for vulnerabilities (Meltdown, Spectre, L1TF, MDS, etc.) were enabled as per Windows default where applicable.

| Native Benchmarks | AMD Ryzen 5 7600X 6C/12T (Zen4, Raphael) | AMD Ryzen 5 5600X 6C/12T (Zen3, Vermeer) | Intel Core i5 12600K 6C+4c/16T (ADL, AlderLake) | Intel Core i5 11600K 6C/12T (RKL, RocketLake) | Comments |
| --- | --- | --- | --- | --- | --- |
| CPU Arithmetic Benchmark Native Dhrystone Integer (GIPS) | 462 [+33%] | 347 | 479 | 378 | Zen4 is 33% faster than Zen3! |
| CPU Arithmetic Benchmark Native Dhrystone Long (GIPS) | 484 [+36%] | 356 | 494 | 383 | With a 64-bit integer workload, Zen4 is 36% faster |
| CPU Arithmetic Benchmark Native FP32 (Float) Whetstone (GFLOPS) | 263 [+17%] | 224 | 334 | 191 | Floating-point performance is 17% faster |
| CPU Arithmetic Benchmark Native FP64 (Double) Whetstone (GFLOPS) | 225 [+22%] | 185 | 220 | 161 | With FP64 we’re 22% faster |

Zen4 starts off with decent numbers: 34% faster in legacy integer and 20% faster in legacy floating-point – and that is before using any AVX512. The much improved turbo frequencies and ALU/FPU improvements do make a difference here. All code should benefit from these uplifts.

Still, this is not enough to beat Intel’s ADL in integer and it is just about competitive in floating-point.

| Native Benchmarks | AMD Ryzen 5 7600X (Zen4) | AMD Ryzen 5 5600X (Zen3) | Intel Core i5 12600K (ADL) | Intel Core i5 11600K (RKL) | Comments |
| --- | --- | --- | --- | --- | --- |
| BenchCpuMM Native Integer (Int32) Multi-Media (Mpix/s) | 1,846* [+28%] | 1,441 | 1,548 | 1,102* | Zen4 with AVX512 is 28% faster than Zen3! |
| BenchCpuMM Native Long (Int64) Multi-Media (Mpix/s) | 618* [+30%] | 475 | 528 | 378* | With a 64-bit integer workload still 30% faster. |
| BenchCpuMM Native Quad-Int (Int128) Multi-Media (Mpix/s) | 169* [+86%] | 90.69 | 100 | 155* | Using IFMA of AVX512, Zen4 is 86% faster than Zen3! |
| BenchCpuMM Native Float/FP32 Multi-Media (Mpix/s) | 1,711* [+31%] | 1,311 | 1,633 | 1,173* | In this floating-point test, Zen4 is 31% faster! |
| BenchCpuMM Native Double/FP64 Multi-Media (Mpix/s) | 944* [+40%] | 676 | 840 | 648* | Switching to FP64 code, Zen4 is 40% faster |
| BenchCpuMM Native Quad-Float/FP128 Multi-Media (Mpix/s) | 38.58* [+36%] | 28.28 | 39.03 | 29.56* | Using FP64 to mantissa extend FP128, Zen4 is 36% faster |

Even in heavy compute SIMD vectorised algorithms we see similar results: Zen4 with AVX512 performs very well – overall 42% faster than Zen3. When using AVX512 extensions (IFMA), it is 86% faster. Still, we’re not seeing the numbers its higher-end brothers with higher TDP/PPT are achieving.

Against Intel’s ADL with more cores (though LITTLE/E), Zen4 wins all but one test (Quad-Float/FP128, a near tie at 38.58 vs. 39.03), though not by a large margin – where Zen3 lost all tests. If Intel had AVX512 enabled in ADL, things would have been different.

Note*: using AVX512 instead of AVX2/FMA.

BenchCrypt Crypto AES-256 (GB/s) 24.59*** [+18%] 20.8 24 22.15*** Zen4 is 18% faster than Zen3
BenchCrypt Crypto AES-128 (GB/s) 24.6*** [+18%] 20.8 24.03 What we saw with AES-256 just repeats with AES-128.
BenchCrypt Crypto SHA2-256 (GB/s) 21.89* [+17%] 18.67** 19.56** 24.19* With SHA, Zen4 is 17% faster
BenchCrypt Crypto SHA1 (GB/s) 19.56** 19.26** The less compute-intensive SHA1 does not change things due to acceleration.
While streaming tests (crypto/hashing) are memory bound, Zen4 is 18% faster than Zen3 in AES thanks to DDR5 memory. It would likely be faster still if faster memory was being used.

AVX512 does help with hashing performance (17% faster than Zen3) – but since all processors have SHA hardware acceleration the improvement is more modest than it would have been otherwise. We’ve seen higher-end Zen4 models with more cores perform much better.

Note***: using VAES 256-bit (AVX2) or 512-bit (AVX512)

Note**: using SHA HWA not SIMD (e.g. AVX512, AVX2, AVX, etc.)

Note*: using AVX512 not AVX2.

BenchFinance Black-Scholes float/FP32 (MOPT/s) 296 The standard financial algorithm.
BenchFinance Black-Scholes double/FP64 (MOPT/s) 298 [+27%] 235 311 227 Switching to FP64 code, Zen4 is 27% faster
BenchFinance Binomial float/FP32 (kOPT/s) 59.98 Binomial uses thread shared data thus stresses the cache & memory system;
BenchFinance Binomial double/FP64 (kOPT/s) 88.63 [+29%] 68.92 91.15 57.45 With FP64 code Zen4 is 29% faster.
BenchFinance Monte-Carlo float/FP32 (kOPT/s) 286 Monte-Carlo also uses thread shared data but read-only thus reducing modify pressure on the caches
BenchFinance Monte-Carlo double/FP64 (kOPT/s) 125 [+32%] 94.9 131 79.83 Here Zen4 is 32% faster
Ryzen always did well on non-SIMD floating-point algorithms and as we’ve seen in the legacy benchmarks (Dhrystone/Whetstone) Zen4 does not disappoint – it is 30% faster than Zen3.

But – this is not enough to beat Intel’s ADL – for the 1st time we see AMD losing all the benchmarks here. It is clear that adding AVX512 is still very important to Zen4.

BenchScience SGEMM (GFLOPS) float/FP32 437 573 In this tough vectorised algorithm that is widely used (e.g. AI/ML).
BenchScience DGEMM (GFLOPS) double/FP64 289* [+46%] 198 233 163* With FP64 Zen4 is 46% faster
BenchScience SFFT (GFLOPS) float/FP32 33.18 29.87 FFT is also heavily vectorised but stresses the memory sub-system more.
BenchScience DFFT (GFLOPS) double/FP64 15.56* [+72%] 9.05 19.3 9.86* With FP64 code, Zen4 is memory latency bound
BenchScience SNBODY (GFLOPS) float/FP32 456 530 N-Body simulation is vectorised but fewer memory accesses.
BenchScience DNBODY (GFLOPS) double/FP64 242* [+48%] 164 149 137* With FP64 Zen4 is 48% faster.
As we’ve seen in the SIMD benchmarks, Zen4 is on average 55% faster than Zen3 here, which will see it through most heavy compute algorithms.

Here, faster DDR5 memory will make a big difference. AMD themselves said that DDR5-6000 memory is the “sweet-spot” and with such speeds Zen4 will perform much better.
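As a rough guide to what faster memory means for these memory-bound tests, theoretical peak bandwidth can be estimated from the transfer rate. The Python sketch below assumes a dual-channel configuration with a 64-bit (8-byte) data bus per channel, and uses DDR4-3200 as the Zen3 baseline; neither the channel count nor the DDR4 speed is stated in this review, so treat both as our assumptions. Real, sustained bandwidth will be lower than these peaks.

```python
BYTES_PER_TRANSFER = 8   # assumption: 64-bit data bus per channel
CHANNELS = 2             # assumption: dual-channel desktop configuration


def peak_bandwidth_gbs(mega_transfers_per_s, channels=CHANNELS):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000.0


configs = {
    "DDR4-3200 (assumed Zen3 baseline)": 3200,
    "DDR5-5200 (official Zen4)": 5200,
    "DDR5-6000 (AMD 'sweet-spot')": 6000,
}

for name, rate in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

Under these assumptions DDR5-6000 offers about 15% more peak bandwidth than DDR5-5200, which is the headroom the paragraph above refers to.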

Note*: using AVX512 not AVX2/FMA3.

| Native Benchmarks | AMD Ryzen 5 7600X (Zen4) | AMD Ryzen 5 5600X (Zen3) | Intel Core i5 12600K (ADL) | Intel Core i5 11600K (RKL) | Comments |
| --- | --- | --- | --- | --- | --- |
| CPU Image Processing Blur (3×3) Filter (MPix/s) | 5,244* [+2x] | 2,585 | 3,718 | 4,040* | In this vectorised integer test Zen4 is 2x faster! |
| CPU Image Processing Sharpen (5×5) Filter (MPix/s) | 2,052* [+2.1x] | 975 | 1,417 | 1,749* | Same algorithm but more shared data: 2.1x faster |
| CPU Image Processing Motion-Blur (7×7) Filter (MPix/s) | 1,062* [+2.1x] | 499 | 695 | 894* | Again same algorithm but even more data shared – 2.1x faster! |
| CPU Image Processing Edge Detection (2*5×5) Sobel Filter (MPix/s) | 1,634* [+2x] | 821 | 1,207 | 1,431* | Different algorithm, Zen4 is 2x faster. |
| CPU Image Processing Noise Removal (5×5) Median Filter (MPix/s) | 214* [+2.5x] | 85.79 | 95.84 | 210* | Still vectorised code, Zen4 is 2.5x faster. |
| CPU Image Processing Oil Painting Quantise Filter (MPix/s) | 33.18* [+19%] | 27.81 | 51 | 59.51* | This test has always been tough; Zen4 is 19% faster. |
| CPU Image Processing Diffusion Randomise (XorShift) Filter (MPix/s) | 3,818* [+44%] | 2,647 | 4,152 | 4,548* | With integer workload, Zen4 is 44% faster. |
| CPU Image Processing Marbling Perlin Noise 2D Filter (MPix/s) | 532* [+43%] | 373 | 731 | 735* | In this final test we see Zen4 43% faster. |

AVX512 really loves this benchmark – and here Zen4 is on average 85% faster across the 8 tests, with one test just 19% faster and one test 2.5x faster. These are pretty good improvements that ensure Zen4 also beats Intel’s ADL competition in most tests – but not all.

With heavy compute vectorised AVX512 code, Zen4 is clearly better than Zen3 and AVX512 support absolutely helps. Do note that with AVX512 even Intel’s RKL (11600K) wins a few tests!

Note*: using AVX512 not AVX2/FMA3.
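The averages quoted for this benchmark can be recomputed from the image-processing scores reported in this review. A short Python sketch, assuming a plain arithmetic mean of the per-test Zen4/Zen3 ratios (our assumption about how the summary figure is derived; the variable names are ours):

```python
# (7600X, 5600X, 12600K) scores in MPix/s, as reported in this review.
scores = {
    "Blur": (5244, 2585, 3718),
    "Sharpen": (2052, 975, 1417),
    "Motion-Blur": (1062, 499, 695),
    "Edge Detection": (1634, 821, 1207),
    "Noise Removal": (214, 85.79, 95.84),
    "Oil Painting": (33.18, 27.81, 51),
    "Diffusion": (3818, 2647, 4152),
    "Marbling": (532, 373, 731),
}

# Per-test speed-up of Zen4 over Zen3, then the arithmetic mean.
ratios = [zen4 / zen3 for zen4, zen3, _ in scores.values()]
mean_uplift = (sum(ratios) / len(ratios) - 1.0) * 100.0

# Number of tests where Zen4 outscores Intel's ADL.
wins_vs_adl = sum(1 for zen4, _, adl in scores.values() if zen4 > adl)

print(f"mean uplift vs. Zen3: {mean_uplift:.0f}%")
print(f"wins vs. ADL: {wins_vs_adl} of {len(scores)}")
```

A geometric mean would weigh the outliers (19% and 2.5x) less heavily and give a somewhat lower figure; the arithmetic mean is used here because it matches the quoted 85%.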

Inter-Thread/Core Latency Heatmap (ns) - AMD 7600X

The inter-thread/core/module latencies “heat-map” shows how the latencies vary when transferring data off-thread (same L1D), off-core (same L3) or off-module/CCX (through memory). As the 7600X has a single CCX/module there are just 2 types of latencies.

Still, judicious thread-pair scheduling is needed to keep latencies low (and conversely bandwidth high when large data is transferred).

CPU Multi-Core Benchmark Total Inter-Thread Bandwidth – Best Pairing (GB/s) 95.21* [+32%] 72.07 85.21 70.25* Zen4 has 32% more bandwidth than Zen3.
While L1D and L3 stay the same, AVX512 and the double-size L2 (1MB vs. 512kB on Zen3) give Zen4 32% more bandwidth than Zen3, which is a decent improvement.

The 3D-VCache versions with huge L3 caches are likely to further improve bandwidth by about 10% over this. It remains to be seen if we will see even bigger VCaches.

Note*: using AVX512 512-bit wide transfers.

CPU Multi-Core Benchmark Average Inter-Thread Latency (ns) 17.1 [-17%] 20.6 39.2 35.1 Overall latencies are 17% lower
CPU Multi-Core Benchmark Inter-Thread Latency (Same Core) Latency (ns) 8.7 [-15%] 10.2 10.4 15 Inter-thread (same core) latency is 15% lower on Zen4
CPU Multi-Core Benchmark Inter-Core Latency (big Core, same Module) Latency (ns) 17.9 [-17%] 21.6 34.3 37.3 We see 17% reduced latencies
CPU Multi-Core Benchmark Inter-Core (Little Core, same Module) Latency (ns) 56.8 n/a
CPU Multi-Core Benchmark Inter-Module/CCX Latency (ns) Not applicable with a single CCX/module.
Running at higher clocks, the inter-thread and inter-core latencies are ~17% lower on Zen4 vs. Zen3. As there is a single CCX/module, we don’t see any increased inter-CCX latencies as with multiple CCX models.
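The average latency figure can be related to the two latency types with a simple model. The Python sketch below assumes every thread pair is weighted equally: on a single-CCX 6C/12T part each thread has 1 sibling on the same core and 10 partners on other cores. This weighting is our assumption, not a documented description of the benchmark, and the function name is ours.

```python
def average_pair_latency(cores, threads_per_core, same_core_ns, cross_core_ns):
    """Mean latency from one thread to every other thread on a single CCX."""
    total_threads = cores * threads_per_core
    siblings = threads_per_core - 1           # partners sharing the same core
    others = total_threads - threads_per_core  # partners on other cores
    weighted = siblings * same_core_ns + others * cross_core_ns
    return weighted / (siblings + others)


# Same-core and inter-core latencies (ns) as reported in this review.
zen4 = average_pair_latency(6, 2, 8.7, 17.9)
zen3 = average_pair_latency(6, 2, 10.2, 21.6)

print(f"7600X modelled average: {zen4:.1f} ns")
print(f"5600X modelled average: {zen3:.1f} ns")
```

With the reported same-core and inter-core latencies, this model lands on the 17.1ns and 20.6ns averages listed above, which suggests the average is dominated by the inter-core (same L3) transfers.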
Aggregate Score (Points) 13,380* [+48%] 9,030 12,160 9,640* Across all benchmarks, Zen4 is 48% faster!
Across all the benchmarks – Zen4 ends up a good 48% faster than Zen3 (7600X vs. 5600X), which is a pretty good improvement from one generation to the next.

This is not really unexpected: AVX512 support (even when executed in 256-bit chunks) brings a good performance improvement, maybe not as high as on Intel (with native 512-bit) but still far improved over old AVX2/FMA3 256-bit SIMD.

Note*: using AVX512 instead of AVX2/FMA3.

Price/RRP (USD) $299 [=] $299 $289 $262 Price remains the same
Price Efficiency (Perf. vs. Cost) (Points/USD) 44.75 [+48%] 30.20 42.08 36.79 Overall 48% more performance for the price
As AMD has kept the launch price the same – Zen4 (7600X) ends up 48% more price efficient than Zen3 (5600X), thus just beating Intel’s ADL. Note that Intel was already better value than Zen3 – something we’ve not seen with the higher-end Zen3 models. It would also not take a lot for Intel’s future CPUs to compete if performance improves a bit.
Power/TDP (W) 105 [+62%] – 142W [+61%] 65 – 88W 125 – 155W 125 – 155W TDP is 62% higher and Turbo 61% higher!
Power Efficiency (Perf. vs. Power) (Points/W) 94.23 [-8%] 102.61 78.45 62.19 Zen4 is 8% less efficient than Zen3
Unlike the other Zen4 processors, the 7600X ends up less efficient than Zen3 but still better than Intel’s ADL. The greatly increased TDP/Turbo power is the main issue here.

By disabling/not-using AVX512 it is possible to reduce turbo power and thus make Zen4 even more power efficient similar to Intel’s previous AVX512-enabled CPUs.
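Both efficiency metrics are simple divisions of the aggregate score, and can be checked in a few lines of Python. The sketch assumes, based on the reported figures, that power efficiency is computed against Turbo (PPT/PL2) power rather than TDP; the data layout and names are ours.

```python
# (aggregate points, price USD, turbo power W) as reported in this review.
cpus = {
    "7600X": (13380, 299, 142),
    "5600X": (9030, 299, 88),
    "12600K": (12160, 289, 155),
    "11600K": (9640, 262, 155),
}

# Points per USD and points per Watt, rounded as in the tables.
price_eff = {name: round(pts / usd, 2) for name, (pts, usd, _) in cpus.items()}
power_eff = {name: round(pts / watts, 2) for name, (pts, _, watts) in cpus.items()}

for name in cpus:
    print(f"{name}: {price_eff[name]:.2f} points/USD, {power_eff[name]:.2f} points/W")

# Relative change of the 7600X over the 5600X, in percent.
price_delta = (price_eff["7600X"] / price_eff["5600X"] - 1.0) * 100.0
power_delta = (power_eff["7600X"] / power_eff["5600X"] - 1.0) * 100.0
print(f"price efficiency: {price_delta:+.0f}%, power efficiency: {power_delta:+.0f}%")
```

Dividing by TDP instead of Turbo power would shift every entry upwards and change the absolute values, so the choice of power figure matters when comparing these numbers with other reviews.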

SiSoftware Official Ranker Scores

Final Thoughts / Conclusions

Summary: A great CPU update (AMD Ryzen 7600X): 8/10

Ever since Ryzen (Zen1) AMD has been hitting winners – with Zen2 (series 3000) and Zen3 (series 5000) bringing decent performance improvements – while still using the same AM4 platform (with BIOS updates). While some features (e.g. PCIe4, USB 3.2, etc.) may not be supported by old mainboards, you could still have gone from a 1st gen Ryzen to a series 5000 16C/32T monster on the same platform; thus you’d be going to the very top of desktop performance, beating anything the competition (Intel) had released on their latest platform.

AMD finally had to refresh the platform in order to support new technologies – DDR5 primarily, but also PCIe5 and USB 4.0 – and they could have easily just stuck with that. But, no, AMD has instead brought a pretty revolutionary Zen4 – bringing AVX512 512-bit SIMD support just when Intel has dropped it in their latest hybrid designs (ADL, RPL).

Like the higher end models (7950X, 7900X) – our 7600X has good clock increases (vs. 5600X) but also greatly increased TDP/PPT (61-62% higher vs. 5600X), which makes it less power efficient (8% less) but still 48% faster overall.

Also, unlike the versions with more cores – our 7600X has the same number of big/P cores (6C/12T) as Intel’s ADL (12600K with 6C + 4c) and fewer cores overall. It thus cannot always beat ADL, though it generally does – unlike the previous Zen3 (5600X) that would generally lose against it. With Intel’s upcoming RPL doubling (2x) the little cores, performance may be more evenly matched.

A new AM5 mainboard is required – but hopefully it will last you many more updates than the competition – possibly Zen7 (!) with 64C/128T (!) if things progress in the same manner we’ve seen until now. DDR5 memory has come down somewhat by now and brings much needed memory bandwidth improvements and USB 4.0 is very much needed for (very) high speed external devices. Not to mention PCIe5 support for future NVMe and GP-GPU components.

Also keep an eye out for the 3D-VCache version with much larger L3 cache (96MB vs. 32MB) if your data workloads are large.

“Good things come to those who wait”, it is said; in this case AMD has definitely delivered!

As we keep repeating (!) – unlike the higher end models – Zen4 (7600X) performance is not overwhelming against Intel’s ADL (AlderLake) – thus it is possible that RPL (RaptorLake) will be a lot more competitive at this level. We will need to wait and see. As consumers, we do need them to be competitive – otherwise we will see greatly increased prices even from the “underdog” as we have seen in the past.
