SiSoftware Sandra 20/20/8 (2020 R8t) Update – JCC, bigLITTLE, Hypervisors + Future Hardware

Note: The original R8 release has been updated to R8t with future hardware support.

We are pleased to release R8t (version 30.61) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

JCC Erratum Mitigation

Recent Intel processors (SKL “Skylake” and later but not ICL “IceLake”) have been found to be impacted by the JCC Erratum that had to be patched through microcode. Naturally this can cause performance degradation depending on benchmark (approx 3% but up to 10%) but can be mitigated through assembler/compiler updates that prevent this issue from happening.

We have updated the tools with which Sandra is built to mitigate JCC, and we have tested the performance implications on both Intel and AMD hardware in the linked articles.
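To make the erratum concrete: as publicly documented by Intel, the affected jump instructions are those that cross or end on a 32-byte boundary, and updated assemblers/compilers pad the code so that this does not happen. A minimal sketch of that boundary check follows; the function names and example offsets are illustrative assumptions, not Sandra's build tooling.

```python
BOUNDARY = 32  # the erratum concerns 32-byte aligned boundaries


def touches_boundary(address: int, length: int, boundary: int = BOUNDARY) -> bool:
    """Return True if an instruction crosses or ends on a boundary.

    address: offset of the first byte of the jump instruction
    length:  instruction length in bytes
    """
    first_block = address // boundary
    # Block index of the byte immediately after the instruction. If it differs
    # from the first block, the instruction either crosses the boundary or its
    # last byte is the final byte of the block (it "ends on" the boundary).
    next_block = (address + length) // boundary
    return first_block != next_block


def padding_needed(address: int, length: int, boundary: int = BOUNDARY) -> int:
    """Bytes of padding to insert so the instruction starts in the next block."""
    if not touches_boundary(address, length, boundary):
        return 0
    return boundary - (address % boundary)
```

For example, a 2-byte jump at offset 30 ends on the boundary and would need 2 bytes of padding, while the same jump at offset 28 needs none.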

bigLITTLE Hybrid Architecture (aka “heterogeneous multi-processing”)

While the bigLITTLE architecture (combining low and high-performance asymmetric cores in the same processor) has been used in many ARM processors, Intel is now introducing it to x86 as “Foveros”. Thus we have Atom (low performance but very low power) and Core (high performance but relatively high power) cores in the same processor – scheduled to run or be “parked” depending on compute demands.

As with any new technology, it will naturally require operating system (scheduler) support and may go through various iterations. Do note that as we’ve discussed in our 2015 (!) article – ARM big.LITTLE: The trouble with heterogeneous multi-processing: when 4 are better than 8 (or when 8 is not always the “lucky” number) – software (including benchmarks) using all cores (big & LITTLE) may have trouble correctly assigning workloads and thus not use such processors optimally.

As Sandra uses its own scheduler to assign (benchmarking) threads to logical cores, we have updated it to allow users to benchmark not only “All Threads (MT)” and “Only Cores (MC)” but also “Only big Cores (bMC)” and “Only LITTLE Cores (LMC)“. This way you can compare and contrast the performance of the various cores without BIOS/firmware changes.
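The four modes can be pictured as filters over the list of logical cores. The sketch below is only an illustration under stated assumptions: the `LogicalCore` record, the `select_cores` function and the example topology are hypothetical, and the "big"/"LITTLE" label is assumed to come from some earlier topology detection step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalCore:
    cpu_id: int         # logical processor index
    physical_core: int  # index of the physical core it belongs to
    kind: str           # "big" or "LITTLE" (assumed to be detected elsewhere)


def select_cores(topology, mode):
    """Pick the logical cores to benchmark for a given mode."""
    if mode == "MT":
        return list(topology)
    # For the "cores only" modes keep the first logical thread of each physical core.
    seen = set()
    one_per_core = []
    for core in topology:
        if core.physical_core not in seen:
            seen.add(core.physical_core)
            one_per_core.append(core)
    if mode == "MC":
        return one_per_core
    if mode == "bMC":
        return [c for c in one_per_core if c.kind == "big"]
    if mode == "LMC":
        return [c for c in one_per_core if c.kind == "LITTLE"]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical hybrid part: 1 big core with 2 threads plus 4 LITTLE cores.
TOPOLOGY = [
    LogicalCore(0, 0, "big"), LogicalCore(1, 0, "big"),
    LogicalCore(2, 1, "LITTLE"), LogicalCore(3, 2, "LITTLE"),
    LogicalCore(4, 3, "LITTLE"), LogicalCore(5, 4, "LITTLE"),
]
```

With this example topology, “MT” selects all six logical cores, “MC” five, “bMC” one and “LMC” four.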

The (benchmark) workload scheduler also had to be updated to allow per-thread workload – with threads scheduled on LITTLE cores assigned less work and threads on big cores assigned more work depending on their relative performance. The changes to Sandra’s workload scheduler allow each core to be fully utilised – at least when benchmarking.
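One simple way to picture a per-thread workload is a proportional split: each thread receives a share of the work items in proportion to the relative performance of the core it runs on. This is a sketch of the idea only, not Sandra's scheduler; the function name and the performance ratios are assumptions.

```python
def split_workload(total_items, relative_perf):
    """Split total_items across threads in proportion to relative performance."""
    if not relative_perf or any(p <= 0 for p in relative_perf):
        raise ValueError("relative_perf must contain positive values")
    total_perf = sum(relative_perf)
    exact = [total_items * p / total_perf for p in relative_perf]
    shares = [int(x) for x in exact]
    # Hand out the items lost to rounding, largest fractional part first,
    # so that the shares always add up to total_items.
    leftover = total_items - sum(shares)
    order = sorted(range(len(exact)), key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical example: one big core assumed 3x as fast as each of two LITTLE cores.
print(split_workload(1000, [3, 1, 1]))
```

If the ratios are roughly right, all threads finish at about the same time instead of the big core idling while the LITTLE cores catch up.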

Note: This advanced information is subject to change pending hardware and software releases and updates.

Future Hardware Support

Update R8t adds support for “Tiger Lake” (TGL) as well as updated support for “Ice Lake” (ICL) and future processors.

AMD Power/Performance Determinism

Some of AMD’s server processors allow “determinism” to be changed to either “performance” (consistent speed across nodes/sockets) or “power” (consistent power across nodes/sockets). While normally workloads require predictability and thus “consistent performance” – this can be at the expense of speed (not taking advantage of power/thermal headroom for higher speed) and even power (too much power consumed by some sockets/nodes).

As “power deterministic” mode allows each processor to run at its maximum performance, there can be reasonable deviations across processors – but the extra performance would go unused if each thread were assigned the same workload. In effect, it is similar to the “hybrid” issue above: some cores are able to sustain a different workload than other cores, and the workload needs to vary accordingly. Again, the changes to Sandra’s workload scheduler allow each core to be fully utilised – at least when benchmarking.
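A small worked example shows why an equal workload wastes the faster processor. The numbers are hypothetical (two sockets, the second assumed to sustain 10% higher speed) and the helper names are illustrative only.

```python
def makespan(work_per_thread, speed_per_thread):
    """Time until the slowest thread finishes (work units / units per second)."""
    return max(w / s for w, s in zip(work_per_thread, speed_per_thread))


def equal_split(total_work, speeds):
    """Every thread gets the same amount of work, regardless of speed."""
    return [total_work / len(speeds)] * len(speeds)


def proportional_split(total_work, speeds):
    """Every thread gets work in proportion to its sustained speed."""
    total_speed = sum(speeds)
    return [total_work * s / total_speed for s in speeds]


# Hypothetical speeds in work units per second.
SPEEDS = [100.0, 110.0]
TOTAL_WORK = 2100.0

equal_time = makespan(equal_split(TOTAL_WORK, SPEEDS), SPEEDS)
proportional_time = makespan(proportional_split(TOTAL_WORK, SPEEDS), SPEEDS)
```

With the equal split the run takes as long as the slower socket needs for its half (10.5 s here), while the proportional split lets both sockets finish together (about 10 s).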

Note: In most systems the deviation between nodes/sockets is relatively small if headroom (thermal/power) is small.

Hypervisors

More and more installations are now running in virtualised mode under a (Type 1) Hypervisor: using Hyper-V, Docker, programming tools for various systems (Android, etc.) or even enabling “Memory Integrity” all mean the system will be silently modified to run transparently under a hypervisor (Hyper-V on Windows).

As a result, Sandra will now detect and report hypervisor details when uploading benchmarks to the SiSoftware Official Live Ranker. Even when running transparently (“host mode”), benchmark scores can deviate, especially when I/O operations (disk, network but even memory) are involved; some mitigations for vulnerabilities apply to both the hypervisor and the host/guest operating system, with a “double impact” on performance.
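On x86, a widely used convention for hypervisor detection is the CPUID instruction: bit 31 of ECX in leaf 1 signals that a hypervisor is present, and leaf 0x40000000 returns a 12-byte vendor signature in EBX, ECX and EDX. Python cannot execute CPUID itself, so the sketch below only decodes register values assumed to have been read elsewhere; the function names are illustrative and this is not a description of how Sandra performs detection.

```python
import struct

HYPERVISOR_PRESENT_BIT = 1 << 31  # CPUID leaf 1, ECX bit 31

# Well-known vendor signatures returned by CPUID leaf 0x40000000.
KNOWN_SIGNATURES = {
    "Microsoft Hv": "Hyper-V",
    "VMwareVMware": "VMware",
    "XenVMMXenVMM": "Xen",
    "KVMKVMKVM": "KVM",  # padded with NUL bytes in the raw registers
}


def hypervisor_present(leaf1_ecx: int) -> bool:
    """Check the 'hypervisor present' bit from CPUID leaf 1."""
    return bool(leaf1_ecx & HYPERVISOR_PRESENT_BIT)


def decode_vendor(ebx: int, ecx: int, edx: int) -> str:
    """Turn the three 32-bit registers into the 12-character signature."""
    raw = struct.pack("<III", ebx, ecx, edx)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


def describe(leaf1_ecx: int, ebx: int, ecx: int, edx: int) -> str:
    """Summarise the hypervisor state from previously read CPUID registers."""
    if not hypervisor_present(leaf1_ecx):
        return "none detected"
    vendor = decode_vendor(ebx, ecx, edx)
    return KNOWN_SIGNATURES.get(vendor, f"unknown ({vendor})")
```

Reporting the decoded vendor alongside a score makes it possible to tell apart results taken on bare metal from those taken under a transparent hypervisor.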

Note: We will publish an article detailing the deviation seen with different hypervisors (Hyper-V, VMware, Xen, etc.).

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/7 (2020 R7) Released – updates and fixes

We are pleased to release R7 (version 30.49) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Updates & Optimisations
    • CPU Benchmarks: AMD Ryzen 4000 series (APU) preliminary support.
    • GPGPU (CUDA/OpenCL) Benchmarks: nVidia Ampere preliminary support.
    • Database: Optimise performance when accessing/updating benchmark results.
    • Branding (Benchmarks/Ranker): Updates manufacturer list.
  • Support & Fixes
    • Internet Benchmarks: Fix website access due to obsolete agent string.
    • Disk Benchmarks: Fix crash on fragmented media (HDD/SSD).
    • Database: Fix update/insert issues with specific benchmark results.

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/6 (2020 R6) Released – 2 brand-new benchmarks!

We are pleased to release R6 (version 30.45) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

Internet DNS Benchmark: Benchmark the performance of the DNS service. Measure the latency of both cached and un-cached DNS queries to local and remote DNS servers.
Internet Overall Score Benchmark: A combined performance index of all Internet benchmarks (Connection (Bandwidth/Latency), Peerage (Bandwidth/Latency) and DNS (cached/un-cached Query Latency)). Rate the overall performance of your Internet connection.
  • Benchmarks:
    • New: Internet DNS Benchmark: measure cached & un-cached DNS query latency for local and public DNS servers.
    • New: Internet Overall Score: using the existing Internet benchmarks (Connection, Peerage and brand-new DNS), compute an overall score denoting the Internet connection quality.
    • Internet Connection, Internet Peerage Benchmarks: updated list of top (300) websites to test against; additional multi-threading optimisations
  • Hardware Support:
    • Additional future hardware support and optimisations.
    • Additional CPU features support
    • Various stability and reliability improvements
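As a rough illustration of the cached versus un-cached DNS measurement described above, the sketch below times a first lookup (likely, but not guaranteed, to miss the cache) against repeated lookups of the same name. It uses the system resolver through `socket.getaddrinfo`; the function names and the `resolver` parameter are assumptions made for this example and say nothing about how Sandra measures DNS latency.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return the time taken by one name lookup, in milliseconds."""
    start = time.perf_counter()
    resolver(hostname, None)
    return (time.perf_counter() - start) * 1000.0


def dns_latency(hostname, repeats=5, resolver=socket.getaddrinfo):
    """Time a first lookup, then the average of repeated (likely cached) lookups."""
    first = time_lookup(hostname, resolver)
    rest = [time_lookup(hostname, resolver) for _ in range(repeats)]
    return {"first_ms": first, "repeat_avg_ms": sum(rest) / len(rest)}
```

Calling `dns_latency("example.com")` on a live connection returns both figures; the first lookup is only a true un-cached query if the name is not already held by the local resolver or an upstream cache.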

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/5 (2020 R5) Released

We are pleased to release R5 (version 30.41) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Benchmarks:
    • Internet Connection, Internet Peerage Benchmarks: updated list of top websites to test against; additional multi-threading optimisations
  • Hardware Support:
    • Additional IceLake (ICL Gen10 Core), Future* (RKL, TGL Gen11 Core) AVX512, VAES, SHA-HWA support (see CPU, GP-GPU, Cache & Memory, AVX512 improvement reviews)
    • Additional CPU features support
    • Various stability and reliability improvements

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/4a (2020 R4a) Released

Note: The original R4 release text has been updated below. The (*) denotes new changes.

We are pleased to release R4a (version 30.39) update for 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Benchmarks:
    • Crypto AES Benchmarks*: Optimised AVX512/AVX2-VAES code to outperform AES-HWA where possible.
    • Crypto SHA Benchmarks*: Select AVX512 multi-buffer instead of SHA-HWA where supported.
    • Network (LAN), Wireless (WLAN/WWAN) Benchmarks: multi-threaded transfer tests and increased packet size to better utilise 10GbE (and higher) links. [Note: multi-threaded CPU required]
    • Internet Connection, Internet Peerage Benchmarks: multi-threaded transfer tests and increased packet size to better utilise Gigabit+ (and higher) connections.
  • Hardware Support:
    • Updated IceLake (ICL Gen10 Core), Future* (RKL, TGL Gen11 Core) AVX512, VAES, SHA-HWA support (see CPU, GP-GPU, Cache & Memory, AVX512 improvement reviews)
    • Updated CometLake (Gen10 Core) support (see CPU, GP-GPU, Cache & Memory reviews)
    • Updated CPU features support*
    • Updated NVMe support
    • Enhanced Biometrics information (fingerprint, face, voice, audio, etc. sensors)
    • Updated WiFi support (WiFi 6/802.11ax, WPA3)
    • Various stability and reliability improvements

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/3 (2020 R3) Released

We are pleased to release R3 (version 30.31) update for 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • Additional PCIe extended capabilities support
  • CPU Cryptography Benchmarks:
    • Block size changed to ~1500 bytes similar to Ethernet packet
    • Various stability and reliability improvements
  • GPGPU Cryptography Benchmarks:
    • Block size changed to ~1500 bytes similar to Ethernet packet
    • Various stability and reliability improvements
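To illustrate what a ~1500 byte block size means in practice, the sketch below hashes a buffer in packet-sized blocks rather than as one large stream, so the per-block setup cost is included in the measurement, as it would be when processing network traffic. SHA-256 from the standard library is used purely as a stand-in; the function names and buffer size are assumptions, not the benchmark's actual code.

```python
import hashlib
import time

BLOCK_SIZE = 1500  # bytes, roughly the size of an Ethernet packet


def hash_in_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Hash each block independently, as if each were a separate packet."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def throughput_mb_s(total_bytes: int = 15_000_000, block_size: int = BLOCK_SIZE) -> float:
    """Measure hashing throughput in MB/s for a zero-filled buffer."""
    data = bytes(total_bytes)
    start = time.perf_counter()
    hash_in_blocks(data, block_size)
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e6
```

Comparing `throughput_mb_s(block_size=1500)` with a much larger block size gives a feel for how much small blocks cost on a given machine.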

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/2 (2020 R2) Released

We are pleased to release R2 (version 30.27) update for 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • PCIe extended capabilities support
  • Software Support:
    • ReFS format Disk benchmark stability issues
  • CPU Benchmarks:
    • Tools (Visual C++ compiler 2019) Update
  • GPGPU Benchmarks:
    • CUDA: Updated SDK 10.2/10.1
    • OpenCL: Updated SDK support

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/1a (2020 R1a) Released

Update November 25th: Released patch (version 30.24) to add further hardware and software support.

Update October 24th: Released patch (version 30.21) to correct Windows 7 / Server 2008/R2 run-time issues.

We are pleased to release R1 (version 30.24) update for 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • AMD Ryzen2 (series 3000 Matisse), Stoney Ridge updated support
    • Intel Cascade Lake (CSL), Comet Lake (CML), Cannon Lake (CNL), Ice Lake (ICL) updated support
  • CPU Benchmarks:
    • Tools (Visual C++ compiler 2019) Update
  • GPGPU Benchmarks:
    • CUDA: Updated SDK 10.2/10.1
    • OpenCL: Updated SDK support

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra Titanium (2018) SP4/a/c Update: Retpoline and hardware support

Note: Updated 2019/June with information regarding MDS as well as change of recent CFL-R microcode vulnerability reporting.

We are pleased to release SP4/a/c (version 28.69) update for Sandra Titanium (2018) with the following updates:

Sandra Titanium (2018) Press Release

  • Reporting of Operating System (Windows) speculation control settings for the recently discovered vulnerabilities:
    • Kernel Retpoline mitigation status (for BTI) in recent Windows 10 / Server 2019 updates
    • Kernel Address Table Import Optimisation (“KATI”) status (as above)
    • L1TF: L1 Terminal Fault mitigation status
    • MDS: Microarchitectural Data Sampling/”ZombieLoad” mitigation status
  • Hardware Support:
    • AMD Ryzen2 (Matisse), Stoney Ridge support
    • Intel CometLake (CML), CannonLake (CNL), IceLake (ICL) support (based on public information)
  • CPU Benchmarks:
    • Image Processing: SIMD code improvement (SSE2/SSE4/AVX/AVX2-FMA/AVX512)
    • Multi-Media: Fixed lock-up on NUMA systems (e.g. AMD ThreadRipper), thanks to Rob @ TechGage.
  • Memory/Cache Benchmarks
    • Return memory controller firmware version to Ranker
  • GPGPU Benchmarks:
    • CUDA SDK 10.1
    • OpenCL: Processing (Fractals/Mandelbrot) variable vector width based on reported FP16/32/64 optimal SIMD width.
  • Ranker, Price & Information Engines
    • HTTPS (encryption) support for all engines as well as the main website

What is Retpoline?

It is a mitigation against ‘Spectre‘ variant 2 (BTI – Branch Target Injection) that affects just about all CPUs (not just Intel but AMD, ARM, etc.). While ‘Spectre’ does not have the same overall performance impact as ‘Meltdown‘ (RDCL – Rogue Data Cache Load), it can have a sizeable impact on some processors and workloads. At this time no CPUs contain hardware mitigation for Spectre without performance impact.

Retpoline (Return Trampoline) is a faster way to mitigate against it without restricting branch speculation in kernel mode (using IBRS/IBPB) and has recently been added to Linux and now Windows version 1809 builds with KB4482887. Note that it still needs to be enabled in registry via the Mitigation Features Override flags as by default it is not enabled.

What CPUs can Retpoline be used on?

Unfortunately Retpoline is only safe to use on some CPUs: AMD CPUs (though it does not engage on Ryzen, see below) and Intel Broadwell or older (v5 and earlier) – thus not Skylake or later (v6 and later).

Windows speculation control settings reporting:

Intel Haswell (Core v4), Broadwell (v5) – Retpoline enabled, KATI enabled
Kernel Retpoline Speculation Control – Enabled

Kernel Address Table Import Optimisation – Enabled

(Note RDCL mitigations KVA, L1TF are also enabled as required)

Intel Skylake (Core v6), Kabylake (v7), Skylake/Kabylake-X (v6x) – no Retpoline, KATI can be enabled
Kernel Retpoline Speculation Control – no

Kernel Address Table Import Optimisation – no/yes (can be enabled)

(Note RDCL mitigations KVA, L1TF are enabled as required)

Intel Coffeelake-R (Core v8r), Whiskeylake/AmberLake (Core v8r), CometLake* – no Retpoline, KATI enabled
Kernel Retpoline Speculation Control – no

Kernel Address Table Import Optimisation – Enabled

Note 2019/June: Latest microcode (AEh) with MDS vulnerability support causes Windows to report KVA/L1TF mitigations as required despite the CPU claiming not to be vulnerable to RDCL.

Intel Atom Braswell (Atom v5), GeminiLake/ApolloLake (Atom v6) – no Retpoline but KATI enabled
Kernel Retpoline Speculation Control – no

Kernel Address Table Import Optimisation – Enabled

(Note RDCL mitigations KVA, L1TF are enabled as required)

AMD Ryzen (Threadripper) 1, 2 – no Retpoline, no KATI
Kernel Retpoline Speculation Control – no (should be usable?)

Kernel Address Table Import Optimisation – no (should be usable)

(Note CPU does not require RDCL mitigation thus no KVA, L1TF required)

From our somewhat limited testing above it seems that:

  • Intel Haswell/Broadwell (Core v4/v5) and perhaps earlier (Ivy Bridge/Sandy Bridge Core v3/v2) users are in luck, Retpoline is enabled and should improve performance; unfortunately RDCL (“Meltdown” mitigation) remains.
  • Intel Coffeelake-R (Core v8r refresh), Whiskeylake ULV (v8r) users do benefit a bit more for their investment – while Retpoline is not enabled, KATI is enabled and should help. Not requiring KVA is the biggest gain of CFL-R. 2019/June: latest microcode (AEh) causes Windows to require KVA/L1TF thus negating any benefit CFL-R had over original CFL/KBL/SKL.
  • Intel Skylake (Core v6), Kabylake (v7) and Coffeelake (v8) are not able to benefit from Retpoline but KATI can work on some systems (driver dependent). However, on our Skylake ULV, Skylake-X test systems KATI could not be enabled. We are investigating further.
  • Intel Atom (v4/v5+) users should be able to use Retpoline but it seems it cannot be enabled currently. KATI is enabled.
  • AMD Ryzen (Threadripper) 1/2 users should also be able to use Retpoline but it seems it cannot be enabled currently. While RDCL is not required, mitigations for Spectre v2 are required and should be enabled. We are investigating further.
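For quick reference, the status reported by Windows on the test systems above can be collected into a small lookup table. The values are taken directly from the listings in this article (with `None` marking “can be enabled on some systems”); the dictionary layout and the `summarise` helper are simply one convenient way to hold them, and results on other systems may differ.

```python
# Reported status per CPU family: True = enabled, False = not enabled,
# None = depends on the system (can be enabled on some, not on others).
REPORTED_STATUS = {
    "Intel Haswell/Broadwell (Core v4/v5)": {"retpoline": True, "kati": True},
    "Intel Skylake/Kabylake (Core v6/v7)": {"retpoline": False, "kati": None},
    "Intel Coffeelake-R/Whiskeylake/AmberLake (Core v8r)": {"retpoline": False, "kati": True},
    "Intel Atom Braswell/GeminiLake/ApolloLake": {"retpoline": False, "kati": True},
    "AMD Ryzen/Threadripper 1, 2": {"retpoline": False, "kati": False},
}

LABELS = {True: "enabled", False: "not enabled", None: "system dependent"}


def summarise(family: str) -> str:
    """Return a one-line summary of the reported status for a CPU family."""
    status = REPORTED_STATUS[family]
    return (f"{family}: Retpoline {LABELS[status['retpoline']]}, "
            f"KATI {LABELS[status['kati']]}")


for name in REPORTED_STATUS:
    print(summarise(name))
```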

Reviews using Sandra 2018 SP4:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra Titanium (2018) SP3a/b Update: Pushing the Limits

Note: This article originally announced SP3a (28.45); it has since been updated to SP3b (28.49).

We are pleased to announce SP3b (version 28.49) for Sandra Titanium (2018) with updated hardware and software support:

Sandra Titanium (2018) Press Release

Sandra has always pushed the limits of hardware, optimising the workload based on the capabilities of the device (compute performance, memory/storage size, etc.), ensuring that both low-end and high-end devices are used to the best of their capability.

This new version pushes the workload even higher, with better scaling across all GPGPU benchmarks, so that they work on both low-end devices (e.g. integrated graphics, emulation on CPU) and high-end professional GPGPU accelerators with very fast, very large memory.

GPGPU Benchmarks

  • All Benchmarks: increased workload size on all benchmarks up to maximum device capacity.
  • FP16/half optimisations for AMD Vega and Radeon VII (vectorisation)*
  • FP16/half optimisations for nVidia Volta/Turing (half2) [CUDA 10]
  • Workgroup and workload optimisations for AMD Vega and Radeon VII*
  • Updated DirectX and OpenGL compute to match CUDA and OpenCL
  • Resolved “out of memory” issues on low-end hardware**
  • Resolved long running time of CPU test-paths (that the GPGPU test paths are checked against) by enabling SIMD implementation (FFT/GEMM)**

* Note: At this time we have not personally tested Radeon VII to confirm improvements.

** Note: Applies to SP3b update.

Hardware Support

  • Intel Core v8r Mobile WhiskeyLake (WHL), AmberLake (AML) support (based on public information)

Reviews using Sandra 2018 SP3a/b

Ranker Hardware Results

Note: FP64 rate on Radeon VII is 1/4 not 1/8 as stated in some places. FP16 rate is 2x with vectorisation, 1x with scalar.

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite