AMD Ryzen 2 Mobile 2500U Review & Benchmarks – CPU Performance

What is “Ryzen2” ZEN+ Mobile?

It is the long-awaited Ryzen2 mobile APU (“Raven Ridge”): the mobile version of the desktop Ryzen 2 with integrated Vega graphics (the latest GPU architecture from AMD). While on desktop we had the original Ryzen1/ThreadRipper, there was no APU or mobile version (at least none released), leaving only much older designs that were never competitive against Intel’s ULV and H APUs.

After the very successful launch of the original “Ryzen1”, AMD has been hard at work optimising and improving the design in order to hit the 15-35W TDP range for mobile devices. It has also added the brand-new Vega graphics cores to the APU, which have been incredibly performant in the desktop space. Note that mobile versions have a single CCX (core complex) and thus do not require operating system kernel patches for best thread scheduling/power optimisation.

Here’s what AMD says it has done for Ryzen2:

  • Process technology optimisations (12nm vs 14nm) – lower power but higher frequencies
  • Improvements for cache & memory speed & latencies (we shall test that ourselves!)
  • Multi-core optimised boost (aka Turbo) algorithm – XFR2 – higher speeds

Why review it now?

With Ryzen3 due for release later this year (2019), along with a corresponding Ryzen3 mobile APU, it is a good time to re-test the platform, especially in light of the many BIOS/firmware updates, the many video/GPU driver updates and, not least, the many operating system (Windows) mitigations for vulnerabilities such as “Spectre”. These have greatly affected performance: sometimes for the better (firmware, drivers, optimisations), sometimes for the worse (mitigations).

In this article we test CPU core performance; other aspects of the platform are covered in our other articles.

Hardware Specifications

We are comparing the top-of-the-range Ryzen2 mobile (2500U) with competing architectures (Intel gen 6, 7, 8) with a view to upgrading to a mid-range but high performance design.

 

| CPU Specifications | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Cores (CU) / Threads (SP) | 4C / 8T | 2C / 4T | 2C / 4T | 4C / 8T | Ryzen has double the cores of ULV Skylake/Kabylake and only recently has Intel caught up by also doubling cores. |
| Speed (Min / Max / Turbo) | 1.6-2.0-3.6GHz (16x-20x-36x) | 0.4-2.6-3.1GHz (4x-26x-31x) | 0.4-2.7-3.5GHz (4x-27x-35x) | 0.4-1.6-3.4GHz (4x-16x-34x) | Ryzen2 has higher base and turbo than CFL-U and higher turbo than all Intel competition. |
| Power (TDP) | 25-35W | 15-25W | 15-25W | 25-35W | Both Ryzen2 and CFL-U have higher TDP at 25W and turbo up to 35W depending on configuration, while older devices were mostly 15W with turbo 20-25W. |
| L1D / L1I Caches | 4x 32kB 8-way / 4x 64kB 4-way | 2x 32kB 8-way / 2x 32kB 8-way | 2x 32kB 8-way / 2x 32kB 8-way | 4x 32kB 8-way / 4x 32kB 8-way | Ryzen2 icache is 2x that of Intel, with matching dcache. |
| L2 Caches | 4x 512kB 8-way | 2x 256kB 16-way | 2x 256kB 16-way | 4x 256kB 16-way | Ryzen2 L2 cache is 2x bigger than CFL-U and thus 4x larger in total than older SKL/KBL-U. |
| L3 Caches | 4MB 16-way | 4MB 16-way | 4MB 16-way | 6MB 16-way | Here CFL-U brings a 50% bigger L3 cache (6 vs 4MB), which may help some workloads. |
| Microcode (Firmware) | MU8F1100-0B | MU064E03-C6 | MU068E09-8E | MU068E09-96 | On Intel you can see just how many updates the platforms have had: we’re now at CX versions, but even Ryzen2 has had a few. |
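
The L2 cache comparison in the table can be checked with a little arithmetic. The sketch below uses only the per-core L2 sizes and core counts listed above; the dictionary layout and function names are illustrative and not taken from any benchmark tool.

```python
# Core count and per-core L2 size (kB), copied from the specification table.
# The labels are just dictionary keys chosen for this example.
L2_CONFIG = {
    "Ryzen2 2500U": (4, 512),
    "i7 6500U (SKL-U)": (2, 256),
    "i7 7500U (KBL-U)": (2, 256),
    "i5 8250U (CFL-U)": (4, 256),
}


def total_l2_kb(cpu: str) -> int:
    """Total L2 capacity in kB: cores multiplied by per-core L2."""
    cores, per_core_kb = L2_CONFIG[cpu]
    return cores * per_core_kb


def l2_ratio(cpu: str, other: str) -> float:
    """How many times larger the total L2 of `cpu` is compared with `other`."""
    return total_l2_kb(cpu) / total_l2_kb(other)


if __name__ == "__main__":
    for name in L2_CONFIG:
        print(f"{name}: {total_l2_kb(name)} kB total L2")
    print("vs CFL-U:", l2_ratio("Ryzen2 2500U", "i5 8250U (CFL-U)"))
    print("vs SKL-U:", l2_ratio("Ryzen2 2500U", "i7 6500U (SKL-U)"))
```

With equal core counts the 2x advantage over CFL-U comes purely from the larger per-core L2; against the 2-core parts the doubled core count doubles it again.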

Native Performance

We are testing native arithmetic, SIMD and cryptography performance using the highest performing instruction sets (AVX2, AVX, etc.). Ryzen supports all modern instruction sets including AVX2 and FMA3, and goes further with SHA hardware acceleration (SHA/HWA), which on the Intel side is supported only by Atom. It has, however, dropped AMD’s own extensions such as FMA4 and XOP, likely due to low usage.

Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.

Environment: Windows 10 x64, latest AMD and Intel drivers. 2MB “large pages” were enabled and in use. Turbo / Boost was enabled on all configurations.
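
The bracketed percentages in the results of this review appear to express the Ryzen2 score relative to the i5 8250U (CFL-U) score. A minimal sketch of that calculation, assuming simple rounding to the nearest whole percent (the function name is hypothetical):

```python
def relative_delta_percent(score: float, reference: float) -> int:
    """Percentage by which `score` is above (+) or below (-) `reference`.

    Assumes higher scores are better and rounds to a whole percent.
    """
    return round((score / reference - 1.0) * 100.0)


if __name__ == "__main__":
    # Scores taken from the native arithmetic results: (Ryzen2, CFL-U).
    samples = {
        "Dhrystone Integer (GIPS)": (103, 109),
        "Dhrystone Long (GIPS)": (102, 106),
        "Whetstone FP32 (GFLOPS)": (79, 67),
        "Whetstone FP64 (GFLOPS)": (67, 55),
    }
    for name, (ryzen, cfl) in samples.items():
        print(f"{name}: {relative_delta_percent(ryzen, cfl):+d}%")
```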

| Native Benchmarks: CPU Arithmetic | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Native Dhrystone Integer (GIPS) | 103 [-6%] | 52 | 73 | 109 | Right off, Ryzen2 does not beat CFL-U but is very close, soundly beating the older Intel designs. |
| Native Dhrystone Long (GIPS) | 102 [-4%] | 51 | 74 | 106 | With a 64-bit integer workload the difference drops to 4%. |
| Native FP32 (Float) Whetstone (GFLOPS) | 79 [+18%] | 39 | 45 | 67 | Somewhat surprisingly, Ryzen2 is almost 20% faster than CFL-U here. |
| Native FP64 (Double) Whetstone (GFLOPS) | 67 [+22%] | 33 | 37 | 55 | With FP64 nothing much changes, with Ryzen2 over 20% faster. |

You can see why Intel needed to double the cores for ULV: otherwise even top-of-the-line i7 SKL/KBL-U are pounded into dust by Ryzen2. CFL-U does trade blows with it and manages to pull ahead in Dhrystone but Ryzen2 is 20% faster in floating-point. Whatever you choose you can thank AMD for forcing Intel’s hand.

| BenchCpuMM: Native Multi-Media | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Native Integer (Int32) Multi-Media (Mpix/s) | 239 [-32%] | 183 | 193 | 350 | In this vectorised AVX2 integer test Ryzen2 starts 30% slower than CFL-U but does beat the older designs. |
| Native Long (Int64) Multi-Media (Mpix/s) | 53.4 [-58%] | 68.2 | 75 | 127 | With a 64-bit AVX2 integer vectorised workload, Ryzen2 is even slower. |
| Native Quad-Int (Int128) Multi-Media (Mpix/s) | 2.41 [+12%] | 1.15 | 1.12 | 2.15 | This is a tough test using Long integers to emulate Int128 without SIMD; here Ryzen2 has its first win, by 12% over CFL-U. |
| Native Float/FP32 Multi-Media (Mpix/s) | 222 [-20%] | 149 | 159 | 277 | In this floating-point AVX/FMA vectorised test, Ryzen2 is still slower but only by 20%. |
| Native Double/FP64 Multi-Media (Mpix/s) | 126 [-22%] | 88.3 | 94.8 | 163 | Switching to FP64 SIMD code, nothing much changes: still about 20% slower. |
| Native Quad-Float/FP128 Multi-Media (Mpix/s) | 6.23 [-16%] | 3.79 | 4.04 | 7.4 | In this heavy algorithm using FP64 to mantissa-extend FP128 with AVX2, Ryzen2 is less than 20% slower. |

Just as on desktop, we did not expect AMD’s Ryzen2 mobile to beat 4-core CFL-U (with Intel’s wide SIMD units) and it doesn’t: but it remains very competitive and is just 20% slower. In any case, it soundly beats all the older, formerly top-of-the-line i7 SKL/KBL-U, thus making them all obsolete at a stroke.
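
The Quad-Float/FP128 test above extends precision by carrying one value in more than one FP64 number. The benchmark’s own code is not shown in this review, so the following is only a sketch of one common technique for this kind of mantissa extension (an error-free “two-sum”, as used in double-double arithmetic); the function names are hypothetical and the scalar code ignores the AVX2 vectorisation the benchmark uses.

```python
from typing import Tuple

DoubleDouble = Tuple[float, float]  # (high part, low part), value = hi + lo


def two_sum(a: float, b: float) -> Tuple[float, float]:
    """Return (s, err) where s = fl(a + b) and err is the rounding error."""
    s = a + b
    b_virtual = s - a          # the part of b that actually made it into s
    a_virtual = s - b_virtual  # the part of a that actually made it into s
    err = (a - a_virtual) + (b - b_virtual)
    return s, err


def dd_add(x: DoubleDouble, y: DoubleDouble) -> DoubleDouble:
    """Add two double-double values, keeping the bits a plain FP64 add loses."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)  # renormalise so the low part stays small


if __name__ == "__main__":
    # A plain FP64 addition silently drops the tiny term...
    print(1.0 + 1e-20 == 1.0)
    # ...while the double-double sum keeps it in the low part.
    print(dd_add((1.0, 0.0), (1e-20, 0.0)))
```

Each extended addition costs several dependent FP64 operations, which is consistent with the much lower Mpix/s figures of this test compared with the plain FP64 one.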

| BenchCrypt: Cryptography | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Crypto AES-256 (GB/s) | 10.9 [+1%] | 6.29 | 7.28 | 10.8 | With AES/HWA support all CPUs are memory bandwidth bound; here Ryzen2 ties with CFL-U and soundly beats older versions. |
| Crypto AES-128 (GB/s) | 10.9 [+1%] | 8.84 | 9.07 | 10.8 | What we saw with AES-256 just repeats with AES-128; Ryzen2 is only marginally faster but the improvement is there. |
| Crypto SHA2-256 (GB/s) | 6.78 [+60%] | 2 | 2.55 | 4.24 | With SHA/HWA Ryzen2 similarly powers through hashing tests leaving Intel in the dust; SHA is still memory bound but Ryzen2 is 60% faster than CFL-U. |
| Crypto SHA1 (GB/s) | 7.13 [+2%] | 3.88 | 4.07 | 7.02 | Ryzen also accelerates the soon-to-be-defunct SHA1 but CFL-U with AVX2 has caught up. |
| Crypto SHA2-512 (GB/s) | 1.48 [-44%] | 1.47 | 1.54 | 2.66 | SHA2-512 is not accelerated by SHA/HWA, thus Ryzen2 falls behind here. |

Ryzen2 mobile (like its desktop brother) gets a boost from SHA/HWA but otherwise ties with CFL-U, which is helped by its SIMD units. As before, older 2-core i7 SKL/KBL-U are left with no hope and cannot even saturate the memory bandwidth.

| BenchFinance: Financial Analysis | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Black-Scholes float/FP32 (MOPT/s) | 93.3 [-4%] | 44.7 | 49.3 | 97 | In this non-vectorised test we see Ryzen2 matches CFL-U. |
| Black-Scholes double/FP64 (MOPT/s) | 77.8 [-8%] | 39 | 43.3 | 84.7 | Switching to FP64 code, nothing much changes, Ryzen2 is 8% slower. |
| Binomial float/FP32 (kOPT/s) | 35.5 [+61%] | 10.4 | 12.3 | 22 | Binomial uses thread shared data and thus stresses the cache & memory system; here the architecture improvements do show, Ryzen2 is 60% faster than CFL-U. |
| Binomial double/FP64 (kOPT/s) | 19.5 [-7%] | 10.1 | 11.4 | 21 | With FP64 code Ryzen2 drops back from its previous win. |
| Monte-Carlo float/FP32 (kOPT/s) | 20.1 [+1%] | 9.24 | 9.87 | 19.8 | Monte-Carlo also uses thread shared data but read-only, thus reducing modify pressure on the caches; Ryzen2 cannot match its previous gain. |
| Monte-Carlo double/FP64 (kOPT/s) | 15.3 [-3%] | 7.38 | 7.88 | 15.8 | Switching to FP64 nothing much changes, Ryzen2 matches CFL-U. |

Unlike desktop where Ryzen2 is unstoppable, here we see a more mixed result, with CFL-U able to trade blows with it except in one test where Ryzen2 is 60% faster. Otherwise CFL-U does manage to be just a bit faster in the other tests but nothing significant.
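
For readers unfamiliar with the workload, the Black-Scholes test above counts options priced per second (MOPT/s). The sketch below prices a single European call with the standard closed-form Black-Scholes formula; it is not the benchmark’s code, and the parameter names and sample values are chosen purely for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       volatility: float, years: float) -> float:
    """Price of a European call option (no dividends)."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * years) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


if __name__ == "__main__":
    # Illustrative inputs: at-the-money option, 5% rate, 20% volatility, 1 year.
    print(round(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0), 4))
```

Each option is priced independently of the others, which fits the table’s observation that this test behaves very differently from Binomial, where threads share data.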

| BenchScience: Scientific Analysis | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| SGEMM (GFLOPS) float/FP32 | 107 [+16%] | 92 | 76 | 85 | In this tough vectorised AVX2/FMA algorithm Ryzen2 manages to be almost 20% faster than CFL-U. |
| DGEMM (GFLOPS) double/FP64 | 47.2 [-6%] | 44.2 | 31.7 | 50.5 | With FP64 vectorised code, Ryzen2 drops down to 6% slower. |
| SFFT (GFLOPS) float/FP32 | 3.75 [-53%] | 7.17 | 7.21 | 8 | FFT is also heavily vectorised (x4 AVX2/FMA) but stresses the memory sub-system more; Ryzen2 does not like it much. |
| DFFT (GFLOPS) double/FP64 | 4 [-7%] | 3.23 | 3.95 | 4.3 | With FP64 code, Ryzen2 does better and is just 7% slower. |
| SNBODY (GFLOPS) float/FP32 | 112 [-27%] | 96.6 | 104.9 | 154 | N-Body simulation is vectorised but has many memory accesses and is not a Ryzen2 favourite. |
| DNBODY (GFLOPS) double/FP64 | 45.3 [-30%] | 29.6 | 30.64 | 64.8 | With FP64 code nothing much changes. |

With highly vectorised SIMD code Ryzen2 remains competitive but finds some algorithms tougher than others. Just as with desktop Ryzen1/2 it may require SIMD code changes for best performance due to its 128-bit units; Ryzen3 with 256-bit units should fix that.
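
The GEMM figures above are given in GFLOPS. The review does not state how operations are counted, so the sketch below assumes the conventional 2·m·n·k count (one multiply and one add per inner-loop step) and times a naive, unvectorised matrix multiply; all names are hypothetical and pure Python will land many orders of magnitude below the table’s numbers.

```python
import time
from typing import List

Matrix = List[List[float]]


def gemm_flops(m: int, n: int, k: int) -> int:
    """Floating-point operations for an (m x k) by (k x n) multiply."""
    return 2 * m * n * k


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive triple-loop matrix multiply (no SIMD, no blocking)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def measure_gflops(size: int = 64) -> float:
    """Time one square multiply and convert the operation count to GFLOPS."""
    a = [[1.0] * size for _ in range(size)]
    b = [[2.0] * size for _ in range(size)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return gemm_flops(size, size, size) / elapsed / 1e9


if __name__ == "__main__":
    print(f"{measure_gflops():.4f} GFLOPS (naive Python)")
```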

| CPU Image Processing | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Blur (3×3) Filter (MPix/s) | 532 [-39%] | 418 | 474 | 872 | In this vectorised integer AVX2 workload Ryzen2 is quite a bit slower than CFL-U. |
| Sharpen (5×5) Filter (MPix/s) | 146 [-58%] | 168 | 191 | 350 | Same algorithm but more shared data makes Ryzen2 even slower, less than 1/2 the speed of CFL-U. |
| Motion-Blur (7×7) Filter (MPix/s) | 123 [-32%] | 87.6 | 98 | 181 | Again same algorithm but even more data shared reduces the delta to 1/3. |
| Edge Detection (2*5×5) Sobel Filter (MPix/s) | 185 [-37%] | 136 | 164 | 295 | Different algorithm but still an AVX2 vectorised workload; Ryzen2 is still ~35% slower. |
| Noise Removal (5×5) Median Filter (MPix/s) | 26.5 [-1%] | 13.3 | 14.4 | 26.7 | Still AVX2 vectorised code but here Ryzen2 ties with CFL-U. |
| Oil Painting Quantise Filter (MPix/s) | 9.38 [-38%] | 7.21 | 7.63 | 15.09 | Again we see Ryzen2 fall behind CFL-U. |
| Diffusion Randomise (XorShift) Filter (MPix/s) | 660 [-53%] | 730 | 764 | 1394 | With an integer AVX2 workload, Ryzen2 falls behind even SKL/KBL-U. |
| Marbling Perlin Noise 2D Filter (MPix/s) | 94.1 [-55%] | 99.6 | 105 | 209 | In this final test, again with an integer AVX2 workload, Ryzen2 is less than 1/2 the speed of CFL-U. |
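
The Diffusion filter above is listed as using XorShift for its randomisation. The review does not say which variant, so the sketch below assumes Marsaglia’s classic 32-bit xorshift with the (13, 17, 5) shift triple; the function names are hypothetical. It shows why the workload is pure integer shift/XOR work of the kind that vectorises well with AVX2.

```python
from typing import Iterator

MASK32 = 0xFFFFFFFF  # keep the state within 32 bits, as a uint32 would


def xorshift32(state: int) -> int:
    """Advance a 32-bit xorshift state by one step (state must be non-zero)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state & MASK32


def xorshift32_stream(seed: int, count: int) -> Iterator[int]:
    """Yield `count` successive pseudo-random 32-bit values from `seed`."""
    if seed == 0:
        raise ValueError("xorshift needs a non-zero seed")
    state = seed & MASK32
    for _ in range(count):
        state = xorshift32(state)
        yield state


if __name__ == "__main__":
    print(list(xorshift32_stream(1, 3)))
```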

With all the modern instruction sets supported (AVX2, FMA, AES and SHA/HWA) Ryzen2 does extremely well in all workloads – and makes all older i7 SKL/KBL-U designs obsolete and unable to compete. As we said – Intel pretty much had to double the number of cores in CFL-U to stay competitive – and it does – but it is all thanks to AMD.

Even then Ryzen2 does beat CFL-U in non-SIMD tests; the latter is helped tremendously by its wide (256-bit) SIMD units and benefits greatly in AVX2/FMA workloads. But Ryzen3, with double-width SIMD units, should be much faster in those workloads and thus well ahead of Intel’s designs.

Software VM (.Net/Java) Performance

We are testing arithmetic and vectorised performance of software virtual machines (SVM), i.e. Java and .Net. With operating systems – like Windows 10 – favouring SVM applications over “legacy” native, the performance of .Net CLR (and Java JVM) has become far more important.

Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.

Environment: Windows 10 x64, latest drivers. .Net 4.7.x (RyuJit), Java 1.9.x. Turbo / Boost was enabled on all configurations.

| VM Benchmarks: BenchDotNetAA (.Net Arithmetic) | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| .Net Dhrystone Integer (GIPS) | 22.7 [+39%] | 9.58 | 12.1 | 16.36 | .Net CLR integer starts great: Ryzen2 is 40% faster than CFL-U. |
| .Net Dhrystone Long (GIPS) | 22 [+34%] | 9.24 | 12.1 | 16.4 | 64-bit integer workloads also favour Ryzen2, still 35% faster. |
| .Net Whetstone float/FP32 (GFLOPS) | 40.5 [+9%] | 18.7 | 22.5 | 37.1 | Floating-point CLR performance is also good but just about 10% faster than CFL-U. |
| .Net Whetstone double/FP64 (GFLOPS) | 49.6 [+6%] | 23.7 | 28.8 | 46.8 | FP64 performance is also great (CLR seems to promote FP32 to FP64 anyway) with Ryzen2 faster by 6%. |

.Net CLR performance was always incredible on Ryzen1 and 2 (desktop/workstation) and here is no exception: all Intel designs are left in the dust, with even CFL-U soundly beaten by anything between 10-40%.

| BenchDotNetMM (.Net Multi-Media) | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| .Net Integer Vectorised/Multi-Media (MPix/s) | 43.23 [+20%] | 21.32 | 25 | 35 | Just as we saw with Dhrystone, this integer workload sees a big 20% improvement for Ryzen2. |
| .Net Long Vectorised/Multi-Media (MPix/s) | 44.71 [+21%] | 21.27 | 26 | 37 | With a 64-bit integer workload we see a similar story: 21% better. |
| .Net Float/FP32 Vectorised/Multi-Media (MPix/s) | 137 [+46%] | 78.17 | 94 | 56 | Here we make use of RyuJit’s support for SIMD vectors, thus running AVX2/FMA code; Ryzen2 does even better here, 50% faster than CFL-U. |
| .Net Double/FP64 Vectorised/Multi-Media (MPix/s) | 75.2 [+45%] | 43.59 | 52 | 35 | Switching to FP64 SIMD vector code, still running AVX2/FMA, we see a similar gain. |

As before Ryzen2 dominates .Net CLR performance: even when using RyuJit’s SIMD instructions we see big gains of 20-45% over CFL-U.

| Java Arithmetic | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Java Dhrystone Integer (GIPS) | 222 [+13%] | 119 | 150 | 196 | We start JVM integer performance with a 13% lead over CFL-U. |
| Java Dhrystone Long (GIPS) | 208 [+12%] | 101 | 131 | 185 | Nothing much changes with a 64-bit integer workload: Ryzen2 is still faster. |
| Java Whetstone float/FP32 (GFLOPS) | 50.9 [+9%] | 23.13 | 27.8 | 46.6 | With a floating-point workload Ryzen2’s performance improvement drops a bit. |
| Java Whetstone double/FP64 (GFLOPS) | 54 [+13%] | 23.74 | 28.7 | 47.7 | With an FP64 workload Ryzen2 gets back to 13% faster. |

The Java JVM performance delta is not as high as .Net’s but still decent, at just over 10% over CFL-U, similar to what we’ve seen on desktop.

| Java Multi-Media | AMD Ryzen2 2500U (Raven Ridge) | Intel i7 6500U (Skylake ULV) | Intel i7 7500U (Kabylake ULV) | Intel i5 8250U (Coffeelake ULV) | Comments |
|---|---|---|---|---|---|
| Java Integer Vectorised/Multi-Media (MPix/s) | 48.74 [+15%] | 20.5 | 24 | 42.5 | Oracle’s JVM does not yet support native vector to SIMD translation like .Net’s CLR but Ryzen2 is still 15% faster. |
| Java Long Vectorised/Multi-Media (MPix/s) | 46.75 [+4%] | 20.3 | 24.8 | 44.8 | With a 64-bit vectorised workload Ryzen2’s lead drops to 4%. |
| Java Float/FP32 Vectorised/Multi-Media (MPix/s) | 38.2 [+9%] | 14.59 | 17.6 | 35 | Switching to floating-point we return to a somewhat expected 9% improvement. |
| Java Double/FP64 Vectorised/Multi-Media (MPix/s) | 35.7 [+2%] | 14.59 | 17.4 | 35 | With an FP64 workload Ryzen2’s lead somewhat inexplicably drops to 2%. |

Java’s lack of vectorised primitives, which would let the JVM use SIMD instruction sets, allows Ryzen2 to do well and overtake CFL-U by between 2-15%.

Ryzen2 on desktop dominated the .Net and Java benchmarks – and Ryzen2 mobile does not disappoint – it is consistently faster than CFL-U which does not bode well for Intel. If you mainly run .Net and Java apps on your laptop then Ryzen2 is the one to get.

Final Thoughts / Conclusions

Ryzen2 was a worthy update on the desktop and Ryzen2 mobile does not disappoint; it instantly obsoleted all older Intel designs (SKL/KBL-U) with only the very latest 4-core ULV (CFL/WHL-U) being able to match it. You can see from the results how AMD forced Intel’s hand to double cores in order to stay competitive.

Even then Ryzen2 manages to beat CFL-U in non-SIMD workloads and remains competitive in SIMD AVX2/FMA workloads (only 20% or so slower), while soundly beating SKL/KBL-U, which have just 2 cores despite their wide SIMD units. With the soon-to-be-released Ryzen3 bringing wide SIMD units (256-bit, as on CFL/WHL-U), Intel will need AVX512 to stay competitive; however, AVX512 has its own issues which may be problematic in the mobile/ULV space.

Both Ryzen2 mobile and CFL/WHL-U have increased TDP (~25W) in order to manage the increased number of cores (instead of 15W with older 2-core designs) and turbo short-term power as much as 35W. This means while larger 14/15″ designs with good cooling are able to extract top performance – smaller 12/13″ designs are forced to use lower cTDP of 15W (20-25W turbo) thus with lower multi-threaded performance.

Also consider that Ryzen2 is not affected by most “Spectre” vulnerabilities, nor by “Meltdown”, and thus does not need KVA Shadow (kernel virtual address shadowing, the Windows “Meltdown” mitigation) that greatly impacts I/O workloads. Only the very latest Whiskey-Lake ULV (WHL-U, gen 8-refresh) has hardware “Meltdown” fixes, thus there is little point buying CFL-U (gen 8 original) and even less point buying older SKL/KBL-U.

In light of the above – Ryzen2 mobile is a compelling choice especially as it comes at a (much) lower price-point: its competition is really only the very latest WHL-U i5/i7 which do not come cheap – with most vendors still selling CFL-U and even KBL-U inventory. The only issue is the small choice of laptops available with it – hopefully the vendors (Dell, HP, etc.) will continue to release more versions especially with Ryzen 3 mobile.

In a word: Highly Recommended!
