AMD Ryzen 7 7700X (Zen4 Raphael) Review & Benchmarks – AVX512 Mainstream Performance

What is “Zen4” (Ryzen 7000)?

AMD’s Zen4 (“Raphael”) is the 4rd generation ZEN core – aka the new 7000-series of CPUs from AMD – that brings brand new features like AVX512 ISA (instruction set support), DDR5 and PCIe5. These do require a brand new platform (AM5) almost a decade since the current AM4 platform was launched before even the 1st generation Ryzen. With any luck, it will remain for the next 4 or even more CPU generations, unlike the 2 generation support on competitor (Intel) platform.

Zen4 contains only big/P(erformance) cores and it is not a hybrid design. It remains to be seen if AMD will launch such hybrid (big/LITTLE) products that, in our opinion, are too problematic on desktop platforms for the benefits they bring. Even on mobile platforms where efficiency is a top priority – workloads do not easily lend to a hybrid design despite huge work done on the Windows scheduler for Windows 11. In this regard, a non-hybrid design like Zen4 is very much preferred.

AVX512 is a huge boost for compute performance as we’ve seen on Intel since SKL-X (Skylake-X). There is a reason it exists + all the extensions (IFMA, VNNI, VAES, etc.) and it is not unexpected that even basic usage can bring up to 100% (2x) performance improvement and even higher with specific instructions. While originally CPUs would reduce clocks due to the power generated – this has pretty much been mitigated in modern designs. Even Centaur (before Intel bought them) had AVX512-enabled (LITTLE) cores.

While here AMD has implemented it as 2x 256-bit ops (similar to previous AVX2/FMA3 in Zen1/1+/2 implemented as 2x 128-bit) – we still benefit from 2x more registers + 2x wider registers (4x overall), arguably better instruction specification, optimised extensions (IFMA, VNNI, VAES, etc.) that overall can still build up to a big improvement over old AVX2/FMA3.

  • 5nm process (TSMC) for CCX (vs. 7nm on Zen3) for better efficiency and clocks
  • 6nm process (TSMC) for I/O hub (vs. 12nm for Zen3) for better memory speeds
    • claimed 13% IPC increase vs. Zen3 + clock increase uplift => ~29% total uplift vs. Zen 3
  • AVX512 instruction support, with potential 100%+ improvement in optimised workloads
    • Executed as 2x 256-bit (not true 512-bit like Intel) but still many benefits over AVX2/FMA3
    • Specific AVX512 extensions (IFMA, VNNI, VAES, etc.) can bring well over 100% improvement
  • DDR5 support up to 5200Mt/s (official) for much higher memory bandwidth vs. DDR4
    • Unofficial support for at least 6400Mt/s with XMP3/EXPO profiles
    • AMD says 6000Mt/s is the “sweet-spot” for performance/value
  • 1MB L2 per core (2x vs. 512kB on Zen3)
  • L3 is the same at 32MB – 7700X-3D model will get 96MB V-Cache
  • PCIe5 support, up to 24 lanes (2x bandwidth vs. PCIe4)
  • Still up to 2 chiplets (at launch) thus max. 8C big/P cores (8C/16T on 7700X)
  • Much higher both base and turbo speeds in most variants, e.g. 7700X
    • Higher base 4.5GHz (vs. 3.8GHz on 5800X +18% clock uplift)
    • Higher turbo 5.7GHz (vs. 4.7GHz on 5800X +21% clock uplift)
  • TDP is still at 105W – unlike the higher core variants (7700X and 5800X) – great!
    • Turbo (PPT aka PL2) around 142W also the same (7700X and 5800X) – great!
    • Note that other models (7950X, 7900X, even 7600X have increased TDP/Turbo!)
  • Built-in Radeon Graphics (RDNA2) core
    • 2CU / 128SP 400-2.2GHz cores for very basic graphics
AMD 7700X Single Chiplet + I/O Hub

AMD 7700X Single Chiplet + I/O Hub

Review

In this article we test CPU core performance; please see our other articles on:

Hardware Specifications

We are comparing the top-range Ryzen 7 5000-series (Zen3 8-core) with previous generation Ryzen 7 3000-series (Zen2 8-core) and competing architectures with a view to upgrading to a top-range, high performance design.

CPU Specifications AMD Ryzen 7 7700X 8C/16T (Zen4, Raphael) AMD Ryzen 7 5800X 8C/16T (Zen3, Vermeer) Intel Core i7 12700K 8C+4c/20T (ADL, AlderLake)
Intel Core i7 11700K 8C/16T (RKL, RocketLake)
Comments
Cores (CU) / Threads (SP) 8C /16T 8C / 16T 8C+4c / 20T 8C/16T Core counts remain the same.
Topology 1 chiplet, 1 CCX, 8 core + I/O hub 1 chiplet, 1 CCX, 8 core + I/O hub Monolithic die Monolithic die Same topology
Speed (Min / Max / Turbo) (GHz)
4.5 [+18%] / 5.7GHz [+21%] 3.8 / 4.7GHz 3.6 + 2.7 / 5GHz + 3.8GHz 3.6 / 5GHz Base is 18% higher, turbo 21%
Power (TDP / Turbo) (W)
105 / 142W (PPT) [=]
105 / 142W (PPT) 125 / 190W (PL2) 125 / 190W (PL2) TDP/PPT is the same – great!
L1D / L1I Caches (kB)
8x 32kB 8-way / 8x 32kB 8-way 8x 32kB 8-way / 8x 32kB 8-way 8x 64kB + 4x 32kB / 8x 32kB + 4x 48kB 8x 64kB + 8x 32kB No changes to L1
L2 Caches (MB)
8x 1MB (8MB) 8-way inclusive [+2x] 8x 512kB (4MB) 8-way inclusive 8x 1.25MB + 2MB [12MB] 8x 512MB [4MB] L2 is 2x larger
L3 Caches (MB)
32MB 16-way exclusive
32MB 16-way exclusive 25MB 16-way 16MB 16-way L3 is the same
Mitigations for Vulnerabilities BTI/”Spectre”, SSB/”Spectre v4″ hardware BTI/”Spectre”, SSB/”Spectre v4″ hardware BTI/”Spectre”, SSB/”Spectre v4″ hardware BTI/”Spectre”, SSB/”Spectre v4″ software/firmware No new fixes required… yet!
Microcode (MU)
A60F12-03 A20F10-09 090672-15 06A701-50 The latest microcodes have been loaded.
SIMD Units 2x 256-bit (512-bit total) AVX512+ 256-bit AVX/FMA3/AVX2 256-bit AVX/FMA3/AVX2 512-bit [1 Unit] AVX512+ 2x wider SIMD
Price/RRP (USD)
$399 [-11%] $449 $409 $409 Price is even 11% lower

Disclaimer

This is an independent review (critical appraisal) that has not been endorsed nor sponsored by any entity (e.g. AMD, etc.). All trademarks acknowledged and used for identification only under fair use.

The review contains only public information and not provided under NDA nor embargoed. At publication time, the products have not been directly tested by SiSoftware but submitted to the public Benchmark Ranker; thus the accuracy of the benchmark scores cannot be verified, however, they appear consistent and pass current validation checks.

And please, don’t forget small ISVs like ourselves in these very challenging times. Please buy a copy of Sandra if you find our software useful. Your custom means everything to us!

Native Performance

We are testing native arithmetic, SIMD and cryptography performance using the highest performing instruction sets. Zen4 supports all modern instruction sets including AVX2/FMA3 and crypto SHA HWA but also AVX-512 and extensions (IFMA, VNNI, VAES, etc.)

Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.

Environment: Windows 11 x64 (21H2), latest AMD and Intel drivers. 2MB “large pages” were enabled and in use. Turbo / Boost was enabled on all configurations. All mitigations for vulnerabilities (Meltdown, Spectre, L1TF, MDS, etc.) were enabled as per Windows default where applicable.

Native Benchmarks AMD Ryzen 7 7700X 8C/16T (Zen4, Raphael) AMD Ryzen 7 5800X 8C/16T (Zen3, Vermeer) Intel Core i7 12700K 8C+4c/20T (ADL, AlderLake) Intel Core i7 11700K 8C/16T (RKL, RocketLake) Comments
CPU Arithmetic Benchmark Native Dhrystone Integer (GIPS) 612 [+33%] 459 596 327 Zen4 is 33% faster than Zen3!
CPU Arithmetic Benchmark Native Dhrystone Long (GIPS) 637 [+36%] 467 611 325 With a 64-bit integer workload, we’re a bit faster
CPU Arithmetic Benchmark Native FP32 (Float) Whetstone (GFLOPS) 350 [+19%] 293 401 239 Floating-point performance is 19% faster
CPU Arithmetic Benchmark Native FP64 (Double) Whetstone (GFLOPS) 295 [+23] 239 315 205 With FP64 we’re 23% faster
Zen4 starts off with decent numbers: 34% faster in legacy integer and 20% faster in legacy floating-point – and that is before using any AVX512. The much improved turbo frequencies and ALU/FPU improvements do make an improvement here. All code should benefit from these uplifts.

While this is enough to beat Intel’s ADL in integer (unlike Zen3), it is not enough to beat it in floating-point. What a reversal.

BenchCpuMM Native Integer (Int32) Multi-Media (Mpix/s) 2,648* [+32%] 2,009 1,820 1,351* Zen4 with AVX512 is 32% faster than Zen3!
BenchCpuMM Native Long (Int64) Multi-Media (Mpix/s) 816* [+17%] 695 646 456* With a 64-bit integer workload still 17% faster.
BenchCpuMM Native Quad-Int (Int128) Multi-Media (Mpix/s) 227* [+69%] 134 126 156* Using IFMA of AVX512, Zen4 is 69% faster than Zen3!
BenchCpuMM Native Float/FP32 Multi-Media (Mpix/s) 2,341* [+27%] 1,847 2,008 1,226* In this floating-point test, Zen4 is 27% faster!
BenchCpuMM Native Double/FP64 Multi-Media (Mpix/s) 1,271* [+34%] 949 1,034 709* Switching to FP64 code, Zen 4 is 34% faster
BenchCpuMM Native Quad-Float/FP128 Multi-Media (Mpix/s) 50.49* [+31%] 38.43 47.45 30.79* Using FP64 to mantissa extend FP128, Zen 4 is 31% faster
Even in heavy compute SIMD vectorised algorithms we see the similar results, Zen4 with AVX512 performs very – overall 35% faster than Zen3. When using AVX512 extensions (IFMA), it is 69% percent faster. Still, we’re not seeing the numbers its higher-end brothers with higher TDP/PPT are achieving.

Against Intel’s ADL with more cores (though LITTLE/E),  Zen4 does win all tests but not by a large margin – where Zen3 lost all tests. If Intel had AVX512-enabled in ADL things would have been different.

Note*: using AVX512 instead of AVX2/FMA.

BenchCrypt Crypto AES-256 (GB/s) 26.4***9 [+27%] 20.78 26.74 14.95*** Zen4 is 27% faster than Zen3
BenchCrypt Crypto AES-128 (GB/s) 23.04 What we saw with AES-256 just repeats with AES-128.
BenchCrypt Crypto SHA2-256 (GB/s) 27.92* [+6%] 26.38** 24.62** 26.28* With SHA, Zen4 is 6% faster
BenchCrypt Crypto SHA1 (GB/s) 28.56** The less compute-intensive SHA1 does not change things due to acceleration.
While streaming tests (crypto/hashing) are memory bound, Zen4 is 27% faster than Zen3 due to DDR5 memory. It would likely be even faster if even faster memory was being used.

AVX512 does help with hashing performance (6% faster than Zen3) – but since all processors have SHA hardware acceleration the improvement is more modest than it would have been otherwise. We’ve seen higher-end Zen4 models with more cores perform much better.

Note***: using VAES 256-bit (AVX2) or 512-bit (AVX512)

Note**: using SHA HWA not SIMD (e.g. AVX512, AVX2, AVX, etc.)

Note*: using AVX512 not AVX2.

BenchFinance Black-Scholes float/FP32 (MOPT/s) 371 The standard financial algorithm.
BenchFinance Black-Scholes double/FP64 (MOPT/s) 391 [+18%] 331 450 207 Switching to FP64 code, Zen4 is 18% faster
BenchFinance Binomial float/FP32 (kOPT/s) 162 Binomial uses thread shared data thus stresses the cache & memory system;
BenchFinance Binomial double/FP64 (kOPT/s) 117 [+32%] 88.59 134 54.16 With FP64 code Zen4 is 32% faster.
BenchFinance Monte-Carlo float/FP32 (kOPT/s) 292 Monte-Carlo also uses thread shared data but read-only thus reducing modify pressure on the caches
BenchFinance Monte-Carlo double/FP64 (kOPT/s) 165 [+23%] 134 179 73.01 Here Zen4 is 23% faster
Ryzen always did well on non-SIMD floating-point algorithms and as we’ve seen in the legacy benchmarks (Dhrystone/Whetstone) Zen4 does not disappoint – it is 18-32% faster than Zen3.

But – this is not enough to beat Intel’s ADL – for the 1st time we see AMD losing all the benchmarks here. It is clear that adding AVX512 is still very important to Zen4.

BenchScience SGEMM (GFLOPS) float/FP32 534 In this tough vectorised algorithm that is widely used (e.g. AI/ML).
BenchScience DGEMM (GFLOPS) double/FP64 340* [+63%] 208 337 162* With FP64 Zen4 is 63% faster
BenchScience SFFT (GFLOPS) float/FP32 33.13 FFT is also heavily vectorised but stresses the memory sub-system more.
BenchScience DFFT (GFLOPS) double/FP64 14.92* [+17%] 12.73 23.08 13.47* With FP64 code, Zen4 is memory latency bound
BenchScience SNBODY (GFLOPS) float/FP32 574 N-Body simulation is vectorised but fewer memory accesses.
BenchScience DNBODY (GFLOPS) double/FP64 318* [+38%] 231 201 116* With FP64 Zen4 is 55% faster.
As we’ve seen in SIMD benchmarks, Zen4 is 17-63% faster than Zen3 on most algorithms that will see it though most heavy compute algorithms.

Here, faster DDR5 memory will make a big difference. AMD themselves said that DDR5-6000 memory is the “sweet-spot” and with such speeds Zen4 will perform much better.

Note*: using AVX512 not AVX2/FMA3.

CPU Image Processing Blur (3×3) Filter (MPix/s) 6,237* [+73%] 3,606 5,002 2,702* In this vectorised integer Zen4 is 73% faster!
CPU Image Processing Sharpen (5×5) Filter (MPix/s) 2,673* [+95%] 1,372 1,883 1,685* Same algorithm but more shared data 95% faster
CPU Image Processing Motion-Blur (7×7) Filter (MPix/s) 1,385* [+96%] 705 928 776* Again same algorithm but even more data shared – 96% faster!
CPU Image Processing Edge Detection (2*5×5) Sobel Filter (MPix/s) 2,109* [+82%] 1,160 1,632 1,302* Different algorithm Zen4 is 82% faster.
CPU Image Processing Noise Removal (5×5) Median Filter (MPix/s) 284* [+2.3x] 123 135 226* Still vectorised code Zen4 i 2.3x faster.
CPU Image Processing Oil Painting Quantise Filter (MPix/s) 43.37* [+14%] 38 68.32 66.32* This test has always been tough Zen4 is 14% faster.
CPU Image Processing Diffusion Randomise (XorShift) Filter (MPix/s) 4,841* [+51%] 3,200 6,282 2,906* With integer workload, Zen4 is 51% faster.
CPU Image Processing Marbling Perlin Noise 2D Filter (MPix/s) 692* [+36%] 507 996 814* In this final test we see Zen4 36% faster.
AVX512 really loves this benchmark – and here Zen4 is on average 70% faster across the 8 tests, with one test just 14% faster and one test 2.3x faster. These are pretty good improvements that ensure Zen4 also beats the Intel’s ADL competition in most – but not all – tests.

With heavy compute vectorised AVX512 code, Zen4 is clearly better than Zen3 and AVX512 support helps absolutely.

Note*: using AVX512 not AVX2/FMA3.

Inter-Thread/Core Latency Heatmap (ns) - AMD 7700X

Inter-Thread/Core Latency Heatmap (ns) – AMD 7700X

The inter-thread/core/module latencies “heat-map” shows how the latencies vary when transferring data off-thread (same L1D), off-core (same L3) or off-module/CCX (through memory). As 7700X has a single CCX/module there are just 2 types of latencies.

Still, judicious thread-pair scheduling is needed to keep latencies low (and conversely bandwidth high when large data is transferred.

CPU Multi-Core Benchmark Total Inter-Thread Bandwidth – Best Pairing (GB/s) 110* [+22%] 90.16 94.63 57.42* Zen4 has22% more bandwidth than Zen3.
While L1D and L3 stay the same, AVX512 and the double size L2 (1MB vs. 512kB on Zen3) allow Zen4 22% more bandwidth than Zen3 that is a decent improvement.

The 3D-VCache versions with huge L3 caches are likely to further improve bandwidth by about 10% over this. It remains to be seen if we will see even bigger VCaches.

Note:* using AVX512 512-bit wide transfers.

CPU Multi-Core Benchmark Average Inter-Thread Latency (ns) 16.8 [-8%] 20.6 40 30.5 Overall latencies are 8% lower
CPU Multi-Core Benchmark Inter-Thread Latency (Same Core) Latency (ns) 7.9 [-9%] 9.8 12.5 14.3 Inter-module is 9% faster on Zen4
CPU Multi-Core Benchmark Inter-Core Latency (big Core, same Module) Latency (ns) 17.4 [-9%] 21.4 38 31.7 We see 19% reduced latencies
CPU Multi-Core Benchmark Inter-Core (Little Core, same Module) Latency (ns) 45.8 n/a
CPU Multi-Core Benchmark Inter-Module/CCX Latency (ns) We see increased inter-CCX latency.
Running at higher clocks, the inter-thread and inter-core latencies are ~9% less on Zen4 vs. Zen3. As there is a single CCX/module, we don’t see any increase inter-CCX latencies as with multiple CCX models.
Aggregate Score (Points) 16,390* [+39%] 11,800 15,320 9,270* Across all benchmarks, Zen4 is 39% faster!
Across all the benchmarks – Zen4 ends up an good 39% faster than Zen3 (7700X vs. 5800X) which is a pretty good improvement from a generation to another.

It is not really unexpected, with AVX512 support included (even when executed in 256-bit chunks) brings good performance improvement, maybe not as high as on Intel (with native 512-bit) but still far improved over old AVX2/FMA3 256-bit SIMD.

Note*: using AVX512 instead of AVX2/FMA3.

Price/RRP (USD) $399 [-11%] $449 $409 $409 Even the price is 11% lower!
Price Efficiency (Perf. vs. Cost) (Points/USD) 41.08 [+56%] 26.28 37.45 22.67 Overall 56% more performance for the price
As AMD has even reduced the launch price – Zen4 (7700X) ends up 56% more price efficient than Zen3 (5800X) and thus beating Intel’s ADL. Note that Intel was already better value than Zen3 – something we’ve not seen with Zen3 higher-end models. It would also not take a lot for Intel’s future CPUs to compete if performance improves a bit.
Power/TDP (W) 105 – 142 [=] 105 – 142 125 – 190 125 – 190 TDP and Turbo are the same.
Power Efficiency (Perf. vs. Power) (Points/W) 115.42 [+39%] 83.10 80.63 48.79 Zen4 is still 39% more efficient
Unlike its higher-end brothers, Zen4 maintains the same TDP/Turbo power and thus is the same 39% more power efficient than Zen3.

By disabling/not-using AVX512 it is possible to reduce turbo power and thus make Zen4 even more power efficient similar to Intel’s previous AVX512-enabled CPUs.

SiSoftware Official Ranker Scores

Final Thoughts / Conclusions

Summary: A great CPU update (AMD Ryzen 7700X): 9/10

Ever since Ryzen (Zen1) AMD has been hitting winners – with Zen2 (series 2000) and Zen3 (series 5000) bringing decent performance improvements – while still using the same AM4 platform (with BIOS updates). While some features (e.g. PCIe4, USB 3.2, etc.) may not be supported by old mainboards, you could still have gone from a 1st gen Ryzen to series 5000 16C/32T monster on the same platform; thus you’d be going to the very top of desktop performance beating anything the competition (Intel) had released on their latest platform.

AMD had to finally refresh the platform in order to bring new technologies support – DDR5 primarily, but also PCIe5, USB 4.0 – and they could have easily just stuck with that. But, no, AMD has instead brought a pretty revolutionary Zen4 – bringing AVX512 512-bit SIMD support just when Intel has dropped them in their latest hybrid designs (ADL, RPL).

Unlike the higher end models (7950X, 7900X) – our 7700X has more modest clock increases (vs. 5800X) and also maintains the same TDP/PPT (vs. 5800X) which does not seem to allow it the same large performance uplift we’ve seen in other Zen4s (7950X, 7900X). Still, we see 39% overall performance uplift which is pretty significant.

Also, unlike more core versions – our 7700X has the same number of big/P cores (8C/16T) vs. Intel’s ADL (12700K with 8C + 4c) and less overall cores. It thus cannot always beat ADL though it generally does consistently – unlike the previous Zen3 (5800X) that would generally lose against it. With future Intel’s RPL adding twice (2x) more little cores, performance may be more matched.

The launch (RRP) price is also a bit less, thus making Zen4 even better value than Zen3. It is now matching Intel’s ADL, while Zen3 was quite a bit more expensive.

Unlike the more core versions – TDP and Turbo (PTT) are the same, thus power efficiency is greatly improved. In practice, it is likely that Zen4 will consume more and require better cooling, but at least it is not a significant change vs. the old Zen3 version.

A new AM5 mainboard is required – but hopefully it will last you many more updates than the competition – possibly Zen7 (!) with 64C/128T (!) if things progress in the same manner we’ve seen until now. DDR5 memory has come down somewhat by now and brings much needed memory bandwidth improvements and USB 4.0 is very much needed for (very) high speed external devices. Not to mention PCIe5 support for future NVMe and GP-GPU components.

Also keep an eye for the 3D-VCache version with much larger L3 cache (96MB vs. 32MB) if your data workloads are large.

Good things come to those who wait” it is said; in this case AMD has definitely delivered!

As we keep repeating (!) – unlike the higher end models – Zen4 (7700X) performance is not overwhelming against Intel’s ADL (AlderLake) – thus it is possible that RPL (RaptorLake) will be a lot more competitive at this level. We will need to wait and see. As consumers, we do need them to be competitive – otherwise we will see greatly increased prices even from the “underdog” as we have seen in the past.

Summary: A great CPU update (AMD Ryzen 7700X): 9/10

Please see the other reviews on the other Zen variants:

Disclaimer

This is an independent review (critical appraisal) that has not been endorsed nor sponsored by any entity (e.g. AMD, etc.). All trademarks acknowledged and used for identification only under fair use.

The review contains only public information and not provided under NDA nor embargoed. At publication time, the products have not been directly tested by SiSoftware but submitted to the public Benchmark Ranker; thus the accuracy of the benchmark scores cannot be verified, however, they appear consistent and pass current validation checks.

Tagged , , , , , . Bookmark the permalink.

Comments are closed.