SiSoftware Sandra 20/21-R25 – Future ISA support (iFMA, SHA512, and more)

As in the famous Monty Python (parrot) sketch – we’re not dead, just resting 😉 Do not worry, we’re making a return.

We are pleased to release the R25 (version 31.133) update for Sandra 20/21, with brand-new support for future instruction sets (ISA) and future hardware.

We have also published two articles with our thoughts on the two major announcements from Intel:

Benchmarks, Hardware Support updates and fixes

CPU Multi-Media Benchmarks

AVX-iFMA(52): New 256-bit code path, based on the existing 512-bit AVX512-iFMA(52) path, for future architectures (e.g. “Meteor Lake” MTL, “Arrow Lake” ARL). We saw a +66% improvement, as detailed in our AVX512-iFMA(52) Improvement for IceLake and TigerLake article.
AVX512-FP16: New code path for Xeon processors that support AVX512-FP16. We expect a +90% improvement over FP32 if the precision loss is acceptable (e.g. zoomed-out fractals).
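
As a rough illustration of what the iFMA(52) instructions compute, the sketch below models a single 64-bit lane in plain Python. It assumes the documented behaviour of the 52-bit multiply-add pair (low and high halves of a 104-bit product added to a 64-bit accumulator). The function names are made up for this example, and it is not the benchmark's actual kernel, which is not shown here.

```python
MASK52 = (1 << 52) - 1   # iFMA operates on the low 52 bits of each 64-bit lane
MASK64 = (1 << 64) - 1   # accumulators are full 64-bit lanes


def madd52lo(acc, a, b):
    """One lane: add the LOW 52 bits of the 104-bit product a*b to acc."""
    product = (a & MASK52) * (b & MASK52)
    return (acc + (product & MASK52)) & MASK64


def madd52hi(acc, a, b):
    """One lane: add the HIGH 52 bits of the 104-bit product a*b to acc."""
    product = (a & MASK52) * (b & MASK52)
    return (acc + (product >> 52)) & MASK64


def madd52lo_lanes(acc, a, b):
    """Apply the low-half operation lane by lane.

    A 256-bit register holds 4 such lanes, a 512-bit register holds 8.
    """
    return [madd52lo(x, y, z) for x, y, z in zip(acc, a, b)]


if __name__ == "__main__":
    a = (1 << 51) + 3
    b = (1 << 51) + 5
    lo = madd52lo(0, a, b)
    hi = madd52hi(0, a, b)
    # The two halves together reconstruct the full product.
    print(((hi << 52) + lo) == a * b)
```

The 256-bit path processes half as many lanes per instruction as the 512-bit one, so the per-lane arithmetic is identical and only the register width changes.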

Note: Future FP16 code-paths will also be added to the other CPU benchmarks, however some parts may remain FP32 as the loss of precision would make the results useless. We have explored this in our GP-GPU article dealing with FP16 support: FP16 GP-GPU Image Processing Performance & Quality.
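
To make the precision trade-off concrete, the following sketch rounds a row of fractal x-coordinates to half precision and counts how many remain distinct. It uses Python's `struct` half-float format as a stand-in for hardware FP16. The helper names, view centres and spans are illustrative assumptions, not values taken from Sandra.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def distinct_columns(center, span, width=64):
    """Count pixel columns whose x-coordinate is still unique after FP16 rounding."""
    left = center - span / 2
    step = span / width
    return len({to_fp16(left + i * step) for i in range(width)})


if __name__ == "__main__":
    # Zoomed out: the whole set is in view and coordinates are far apart.
    print("zoomed out:", distinct_columns(center=-0.5, span=3.0))
    # Zoomed in: neighbouring pixels collapse onto the same FP16 value.
    print("zoomed in: ", distinct_columns(center=-0.75, span=0.001))
```

When the view is zoomed in, many neighbouring columns map to the same coordinate, which is the kind of loss that would make a result useless.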

CPU Cryptography Benchmarks

SHA2-512 HWA: Hardware-accelerated hashing SHA512 code-path – based on current SHA2-256 HWA. We expect ~3x (three times) better performance based on the SHA2 non-accelerated/HWA results.
Future SM3-256 (China) HWA: Hardware-accelerated hashing SM3 code-path (China’s version of SHA hashing functions). We expect similar performance improvement to SHA HWA.
Future SM4-128/256 (China) HWA: Hardware-accelerated block crypto SM4 code-path (China’s version of AES block crypto functions). We expect similar performance improvement to AES HWA.
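
For readers who want a quick feel for hashing throughput on their own machine, here is a minimal sketch using Python's standard `hashlib`. Whether hardware acceleration is used depends entirely on the underlying crypto library, and SM3 is only present in some builds, so the numbers are not comparable to Sandra's results. The function name and buffer sizes are arbitrary choices for this example.

```python
import hashlib
import time


def throughput_mb_s(name, size=1 << 20, rounds=16):
    """Hash `rounds` buffers of `size` zero bytes and return MiB per second."""
    data = bytes(size)
    h = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(data)
    h.digest()
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return (size * rounds) / (1 << 20) / elapsed


if __name__ == "__main__":
    names = ["sha256", "sha512"]
    # SM3 is exposed only when the linked crypto library provides it.
    if "sm3" in hashlib.algorithms_available:
        names.append("sm3")
    for name in names:
        print(f"{name}: {throughput_mb_s(name):.0f} MiB/s")
```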

Note: When a “China” locale is detected, we will default to “SM4 + SM3” benchmarks rather than “AES + SHA” for both the CPU & GP-GPU Cryptography benchmarks, as these algorithms are more likely to be used there.

Note 2: ARM already includes SHA2-512, SM4, SM3 HWA (hardware acceleration extensions) in their high-end cores.

Note 3: While AES & SHA are not going anywhere, there has been a shift to other crypto-algorithms (especially those that can encrypt/decrypt and hash/authenticate like “ChaCha20Poly1305” as famously used in WireGuard VPN) that are fast enough even without hardware acceleration!
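
One reason ChaCha20 performs well without dedicated instructions is that its core step uses only 32-bit additions, XORs and rotations. The sketch below implements the quarter round as specified in RFC 8439 and checks it against the test vector from that RFC. It is an illustration of the building block only, not a complete or hardened cipher.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """ChaCha quarter round: add, xor, rotate on four 32-bit words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    # Test vector from RFC 8439, section 2.1.1.
    result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    expected = (0xEA2A92F4, 0xCB1CF8CE, 0x4581472E, 0x5881C4BB)
    print(result == expected)
```

All of these operations map directly onto ordinary integer or SIMD instructions, so no special hardware extension is required.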

All CPU Benchmarks – AVX10 Support

AVX10.2+ 256-bit future code paths (FP32, FP64 and FP16) for Hybrid architectures (e.g. “Meteor Lake” MTL, “Arrow Lake” ARL). Note that both Core (P) and Atom (E) cores will run the same 256-bit binary rather than code of different widths.
AVX10.1+ 512-bit & AVX512 shared code path (FP32, FP64 and FP16) for Xeon architectures
Possible AVX10.2+ 128-bit future code path for Atom/Other discrete architectures if required

Note: In future Hybrid architectures, Core (P) cores are likely to support 128/256-bit widths only. We don’t know (and we could not tell you) whether disabling the Atom (E) cores will get the Core (P) cores to report 512-bit widths.
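
The code-path choices listed above can be summarised as a small selection function. This is only a sketch: the inputs (an AVX10 version number, the widest vector length the CPU reports, and an AVX512 flag) and the returned labels are hypothetical, and the real detection logic in Sandra is not described in this post.

```python
def pick_code_path(avx10_version, max_vector_bits, has_avx512=False):
    """Pick the widest applicable path from the list above.

    avx10_version: 0 if AVX10 is absent, otherwise the minor version (1, 2, ...).
    max_vector_bits: widest vector length reported (128, 256 or 512).
    has_avx512: True if legacy AVX512 is reported.
    All parameter names and labels are assumptions made for this sketch.
    """
    # Xeon-class parts: AVX10.1+ at 512 bits shares one path with AVX512.
    if has_avx512 or (avx10_version >= 1 and max_vector_bits >= 512):
        return "avx10.1-512/avx512"
    # Hybrid parts: P and E cores run the same 256-bit binary.
    if avx10_version >= 2 and max_vector_bits >= 256:
        return "avx10.2-256"
    # Possible narrow path for Atom-class or other discrete parts.
    if avx10_version >= 2 and max_vector_bits >= 128:
        return "avx10.2-128"
    # Anything else falls back to the existing (pre-AVX10) paths.
    return "legacy"


if __name__ == "__main__":
    print(pick_code_path(avx10_version=2, max_vector_bits=256))
    print(pick_code_path(avx10_version=1, max_vector_bits=512))
    print(pick_code_path(avx10_version=0, max_vector_bits=256))
```

Checking the widest path first means a CPU that reports several widths always lands on the fastest one it supports.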

Hardware Support

Intel 14th gen Hybrid “Raptor Lake Refresh” RPL-S support
Intel future gen Hybrid “Meteor Lake” (MTL-S/M/P/N), “Arrow Lake” (ARL-S/U), “Lunar Lake” (LNL-M) detection
Intel future gen Xeon “Granite Rapids” SP/D detection

Sandra 20/21 Press Release

Please don’t forget to submit benchmark results to the Official SiSoftware Ranker! Many thanks for your continued support.

And please, don’t forget small ISVs (independent software vendors) like ourselves in these very challenging times. Please buy a copy of Sandra if you find our software useful. Your custom means everything to us!

Reviews using Sandra 20/21:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Commercial (Pro/Biz/Eng/Ent)

Download Sandra Lite (Free/Shareware)