SiSoftware Sandra 20/20/8 (2020 R8t) Update – JCC, bigLITTLE, Hypervisors + Future Hardware

Note: The original R8 release has been updated to R8t with future hardware support.

We are pleased to release R8t (version 30.61) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

JCC Erratum Mitigation

Recent Intel processors (SKL “Skylake” and later but not ICL “IceLake”) have been found to be impacted by the JCC Erratum that had to be patched through microcode. Naturally this can cause performance degradation depending on benchmark (approx 3% but up to 10%) but can be mitigated through assembler/compiler updates that prevent this issue from happening.

We have updated the tools with which Sandra is built to mitigate JCC and we have tested the performance implications on both Intel and AMD hardware in the linked articles.
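To illustrate what such tool updates look for: Intel's public guidance on the erratum describes the affected case as a jump instruction that crosses, or ends on, a 32-byte boundary, and updated assemblers pad the code so this does not happen. The sketch below only shows that check; the function names and the address/length inputs are hypothetical and this is not the tooling used to build Sandra.

```python
BOUNDARY = 32  # bytes; the window named in Intel's public JCC guidance

def jump_hits_boundary(address: int, length: int) -> bool:
    """True if a jump of `length` bytes at `address` crosses,
    or ends exactly on, a 32-byte boundary."""
    if not 0 < length < BOUNDARY:
        raise ValueError("length must be between 1 and 31 bytes")
    # address + length is the first byte after the instruction. It falls
    # in a later 32-byte block both when the jump straddles a boundary
    # and when its last byte is the final byte of a block.
    return address // BOUNDARY != (address + length) // BOUNDARY

def padding_needed(address: int, length: int) -> int:
    """Bytes of padding that move an affected jump to the start of the
    next 32-byte block; 0 if the jump is not affected."""
    if not jump_hits_boundary(address, length):
        return 0
    return BOUNDARY - (address % BOUNDARY)
```

For example, a 2-byte jump at offset 30 ends on the boundary and needs 2 bytes of padding, while the same jump at offset 28 is left alone.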

bigLITTLE Hybrid Architecture (aka “heterogeneous multi-processing”)

While bigLITTLE architecture (combining asymmetric low- and high-performance cores in the same processor) has been used in many ARM processors, Intel is now introducing it to x86 as “Foveros”. Thus we have Atom (low performance but very low power) and Core (high performance but relatively high power) cores in the same processor – scheduled to run or be “parked” depending on compute demands.

As with any new technology, it will naturally require operating system (scheduler) support and may go through various iterations. Do note that as we’ve discussed in our 2015 (!) article – ARM big.LITTLE: The trouble with heterogeneous multi-processing: when 4 are better than 8 (or when 8 is not always the “lucky” number) – software (including benchmarks) using all cores (big & LITTLE) may have trouble correctly assigning workloads and thus not use such processors optimally.

As Sandra uses its own scheduler to assign (benchmarking) threads to logical cores, we have updated it to allow users to benchmark not only “All Threads (MT)” and “Only Cores (MC)” but also “Only big Cores (bMC)” and “Only LITTLE Cores (LMC)”. This way you can compare and contrast the performance of the various cores without BIOS/firmware changes.
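The four modes can be pictured as filters over the list of logical processors. The sketch below is only an illustration: the `LogicalCpu` record and the `select` function are hypothetical, and it assumes that “Only Cores (MC)” means one thread per physical core.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogicalCpu:
    index: int  # logical processor number
    core: int   # physical core it belongs to
    kind: str   # "big" or "LITTLE"

def select(cpus, mode):
    """Return the logical processors a benchmark would use in `mode`."""
    if mode == "MT":  # all threads
        return list(cpus)
    # Keep only the first logical processor seen on each physical core.
    seen, per_core = set(), []
    for cpu in cpus:
        if cpu.core not in seen:
            seen.add(cpu.core)
            per_core.append(cpu)
    if mode == "MC":
        return per_core
    if mode == "bMC":
        return [c for c in per_core if c.kind == "big"]
    if mode == "LMC":
        return [c for c in per_core if c.kind == "LITTLE"]
    raise ValueError(f"unknown mode: {mode}")
```

With a made-up layout of one 2-thread big core and four single-thread LITTLE cores, the modes would select 6, 5, 1 and 4 logical processors respectively.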

The (benchmark) workload scheduler also had to be updated to allow per-thread workload – with threads scheduled on LITTLE cores assigned less work and threads on big cores assigned more work depending on their relative performance. The changes to Sandra’s workload scheduler allow each core to be fully utilised – at least when benchmarking.
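As a rough picture of per-thread workloads, the sketch below splits a fixed number of work items between threads in proportion to a relative performance figure for the core each thread runs on. The function name, the performance figures and the largest-remainder rounding are assumptions made for illustration; they do not describe Sandra's actual scheduler.

```python
def split_work(total_items, relative_perf):
    """Split `total_items` between threads in proportion to the
    relative performance of each thread's core."""
    if total_items < 0 or not relative_perf or min(relative_perf) <= 0:
        raise ValueError("need a non-negative total and positive weights")
    total_perf = sum(relative_perf)
    exact = [total_items * p / total_perf for p in relative_perf]
    shares = [int(x) for x in exact]  # round every share down first
    leftover = total_items - sum(shares)
    # Hand the remaining items to the threads with the largest remainders.
    order = sorted(range(len(exact)),
                   key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares
```

With one big core rated 3× a LITTLE core and four LITTLE cores, `split_work(1000, [3, 1, 1, 1, 1])` gives the big-core thread 428 items and each LITTLE-core thread 143, so all threads should finish at roughly the same time.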

Note: This advanced information is subject to change pending hardware and software releases and updates.

Future Hardware Support

Update R8t adds support for “Tiger Lake” (TGL) as well as updated support for “Ice Lake” (ICL) and future processors.

AMD Power/Performance Determinism

Some of AMD’s server processors allow “determinism” to be set to either “performance” (consistent speed across nodes/sockets) or “power” (consistent power across nodes/sockets). While normally workloads require predictability and thus “consistent performance” – this can be at the expense of speed (not taking advantage of power/thermal headroom for higher speed) and even power (too much power consumed by some sockets/nodes).

As “power deterministic” mode allows each processor to run at its maximum performance, there can be reasonable deviations across processors – but this extra performance would go unused if each thread were assigned the same workload. In effect, it is similar to the “hybrid” issue above: some cores can sustain a larger workload than others, so the workload needs to vary accordingly. Again, the changes to Sandra’s workload scheduler allow each core to be fully utilised – at least when benchmarking.

Note: In most systems the deviation between nodes/sockets is relatively small if headroom (thermal/power) is small.

Hypervisors

More and more installations are now running in virtualised mode under a (Type 1) Hypervisor: using Hyper-V, Docker, programming tools for various systems (Android, etc.) or even enabling “Memory Integrity” all mean the system will be silently modified to run transparently under a hypervisor (Hyper-V on Windows).

As a result, Sandra will now detect and report hypervisor details when uploading benchmarks to the SiSoftware Official Live Ranker, as even when running transparently (“host mode”) there can be deviation between benchmark scores, especially when I/O operations (disk, network but even memory) are involved; some mitigations for vulnerabilities apply to both the hypervisor and the host/guest operating system, with a “double-impact” on performance.
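For background, x86 hypervisors conventionally advertise themselves through CPUID: bit 31 of ECX in leaf 1 signals that a hypervisor is present, and leaf 0x40000000 returns a 12-byte vendor signature in EBX, ECX and EDX (for example “Microsoft Hv” for Hyper-V). The sketch below only decodes register values obtained by some other means, since Python cannot execute CPUID itself; the function names are hypothetical and this is not a description of Sandra's own detection.

```python
import struct

HYPERVISOR_BIT = 1 << 31  # CPUID leaf 1, ECX bit 31

# Well-known vendor signatures returned by CPUID leaf 0x40000000.
KNOWN_SIGNATURES = {
    "Microsoft Hv": "Hyper-V",
    "VMwareVMware": "VMware",
    "XenVMMXenVMM": "Xen",
    "KVMKVMKVM": "KVM",
}

def hypervisor_present(leaf1_ecx: int) -> bool:
    """Check the hypervisor-present bit of CPUID leaf 1 ECX."""
    return bool(leaf1_ecx & HYPERVISOR_BIT)

def hypervisor_signature(ebx: int, ecx: int, edx: int) -> str:
    """Rebuild the 12-byte signature from the three registers
    (each register holds 4 bytes, little-endian)."""
    raw = struct.pack("<III", ebx, ecx, edx)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")

def hypervisor_name(ebx: int, ecx: int, edx: int) -> str:
    """Map the signature to a product name, or return it unchanged."""
    signature = hypervisor_signature(ebx, ecx, edx)
    return KNOWN_SIGNATURES.get(signature, signature)
```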

Note: We will publish an article detailing the deviation seen with different hypervisors (Hyper-V, VMware, Xen, etc.).

Reviews using Sandra 20/20:

Update & Download

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra 20/20/7 (2020 R7) Released – updates and fixes

We are pleased to release R7 (version 30.49) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Updates & Optimisations
    • CPU Benchmarks: AMD Ryzen 4000 series (APU) preliminary support.
    • GPGPU (CUDA/OpenCL) Benchmarks: nVidia Ampere preliminary support.
    • Database: Optimise performance when accessing/updating benchmark results.
    • Branding (Benchmarks/Ranker): Updates manufacturer list.
  • Support & Fixes
    • Internet Benchmarks: Fix website access due to obsolete agent string.
    • Disk Benchmarks: Fix crash on fragmented media (HDD/SSD).
    • Database: Fix update/insert issues with specific benchmark results.

SiSoftware Sandra 20/20/6 (2020 R6) Released – 2 brand-new benchmarks!

We are pleased to release R6 (version 30.45) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

Internet DNS Benchmark: Benchmark the performance of the DNS service. Measure the latency of both cached and un-cached DNS queries to local and remote DNS servers.
Internet Overall Score Benchmark: A combined performance index of all Internet benchmarks (Connection (Bandwidth/Latency), Peerage (Bandwidth/Latency) and DNS (cached/un-cached Query Latency)). Rate the overall performance of your Internet connection.
  • Benchmarks:
    • New: Internet DNS Benchmark: measure cached & un-cached DNS query latency for local and public DNS servers.
    • New: Internet Overall Score: using the existing Internet benchmarks (Connection, Peerage and brand-new DNS), compute an overall score denoting the Internet connection quality.
    • Internet Connection, Internet Peerage Benchmarks: updated list of top (300) websites to test against; additional multi-threading optimisations
  • Hardware Support:
    • Additional future hardware support and optimisations.
    • Additional CPU features support
    • Various stability and reliability improvements
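The cached versus un-cached distinction in the DNS benchmark can be approximated by timing the first lookup of a name and then the repeated lookups that follow. The sketch below is a simplified illustration, not Sandra's method: the function names are hypothetical, it uses the system resolver via `socket.getaddrinfo`, and it cannot flush any resolver cache, so the first query is only un-cached if the name has not been looked up recently.

```python
import socket
import statistics
import time

def resolve_system(name):
    """Resolve `name` through the operating system's resolver."""
    socket.getaddrinfo(name, None)

def query_latency_ms(name, resolve=resolve_system, repeats=5):
    """Return (first_query_ms, median_repeat_ms) for `name`."""
    def timed():
        start = time.perf_counter()
        resolve(name)
        return (time.perf_counter() - start) * 1000.0

    first = timed()                               # likely un-cached
    repeated = [timed() for _ in range(repeats)]  # likely cached
    return first, statistics.median(repeated)
```

The `resolve` parameter lets a different resolver (for example one that queries a specific public DNS server) be swapped in without changing the timing code.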

SiSoftware Sandra 20/20/5 (2020 R5) Released – Updated Hardware Support

We are pleased to release R5 (version 30.41) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Benchmarks:
    • Internet Connection, Internet Peerage Benchmarks: updated list of top websites to test against; additional multi-threading optimisations
  • Hardware Support:
    • Additional IceLake (ICL Gen10 Core), Future* (RKL, TGL Gen11 Core) AVX512, VAES, SHA-HWA support (see CPU, GP-GPU, Cache & Memory, AVX512 improvement reviews)
    • Additional CPU features support
    • Various stability and reliability improvements

SiSoftware Sandra 20/20/4 (2020 R4a) Released – Updated Benchmarks

Note: The original R4 release text has been updated below. The (*) denotes new changes.

We are pleased to release R4a (version 30.39) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Benchmarks:
    • Crypto AES Benchmarks*: Optimised AVX512/AVX2-VAES code to outperform AES-HWA where possible.
    • Crypto SHA Benchmarks*: Select AVX512 multi-buffer instead of SHA-HWA where supported.
    • Network (LAN), Wireless (WLAN/WWAN) Benchmarks: multi-threaded transfer tests and increased packet size to better utilise 10GbE (and faster) links. [Note: multi-threaded CPU required]
    • Internet Connection, Internet Peerage Benchmarks: multi-threaded transfer tests and increased packet size to better utilise Gigabit (and faster) connections.
  • Hardware Support:
    • Updated IceLake (ICL Gen10 Core), Future* (RKL, TGL Gen11 Core) AVX512, VAES, SHA-HWA support (see CPU, GP-GPU, Cache & Memory, AVX512 improvement reviews)
    • Updated CometLake (Gen10 Core) support (see CPU, GP-GPU, Cache & Memory reviews)
    • Updated CPU features support*
    • Updated NVMe support
    • Enhanced Biometrics information (fingerprint, face, voice, audio, etc. sensors)
    • Updated WiFi support (WiFi 6/802.11ax, WPA3)
    • Various stability and reliability improvements

SiSoftware Sandra 20/20/3 (2020 R3) Released – Updated Benchmarks

We are pleased to release R3 (version 30.31) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • Additional PCIe extended capabilities support
  • CPU Cryptography Benchmarks:
    • Block size changed to ~1500 bytes similar to Ethernet packet
    • Various stability and reliability improvements
  • GPGPU Cryptography Benchmarks:
    • Block size changed to ~1500 bytes similar to Ethernet packet
    • Various stability and reliability improvements
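To show what a packet-sized block means for a cryptography test, the sketch below hashes a stream of data as independent blocks of about 1500 bytes and reports throughput. It is only an illustration under stated assumptions: the function name is hypothetical, SHA-256 from Python's `hashlib` stands in for whatever algorithms and buffers Sandra actually uses, and 1500 bytes is taken as the approximate Ethernet payload size.

```python
import hashlib
import os
import time

BLOCK_SIZE = 1500  # bytes, roughly one Ethernet payload

def hash_throughput(total_bytes, block_size=BLOCK_SIZE, algorithm="sha256"):
    """Hash `total_bytes` of data as independent blocks.

    Returns (number_of_blocks, throughput_in_MB_per_second)."""
    block = os.urandom(block_size)       # one random block, reused
    blocks = total_bytes // block_size
    start = time.perf_counter()
    for _ in range(blocks):
        hashlib.new(algorithm, block).digest()  # each block hashed alone
    elapsed = time.perf_counter() - start
    megabytes = blocks * block_size / 1e6
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return blocks, rate
```

Small blocks put more weight on per-block setup cost than large buffers do, which is closer to how packet-based network encryption behaves.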

SiSoftware Sandra 20/20/2 (2020 R2) Released – Stability Fixes

We are pleased to release R2 (version 30.27) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • PCIe extended capabilities support
  • Software Support:
    • ReFS format Disk benchmark stability issues
  • CPU Benchmarks:
    • Tools (Visual C++ compiler 2019) Update
  • GPGPU Benchmarks:
    • CUDA: Updated SDK 10.2/10.1
    • OpenCL: Updated SDK support

SiSoftware Sandra 20/20/1 (2020 R1a) Released – Updated Hardware Support

Update November 25th: Released patch (version 30.24) to add further hardware and software support.

Update October 24th: Released patch (version 30.21) to correct Windows 7 / Server 2008/R2 run-time issues.

We are pleased to release R1 (version 30.24) update for Sandra 20/20 (2020) with the following updates:

Sandra 20/20 (2020) Press Release

  • Hardware Support:
    • AMD Ryzen2 (series 3000 Matisse), Stoney Ridge updated support
    • Intel Cascade Lake (CSL), Comet Lake (CML), Cannon Lake (CNL), Ice Lake (ICL) updated support
  • CPU Benchmarks:
    • Tools (Visual C++ compiler 2019) Update
  • GPGPU Benchmarks:
    • CUDA: Updated SDK 10.2/10.1
    • OpenCL: Updated SDK support

SiSoftware Sandra 20/20 (2020) Released!

FOR IMMEDIATE RELEASE

Contact: Press Office

SiSoftware Sandra 20/20 (2020) Released:
Brand-new benchmarks (AI/ML), hardware support

Updates: R1, R2, R3, R4, R5, R6, R7, R8.

London, UK, July 18th, 2019 – We are pleased to announce the launch of SiSoftware Sandra 20/20 (2020), the latest version of our award-winning utility, which includes remote analysis, benchmarking and diagnostic features for PCs, servers, mobile devices and networks.

It adds two Neural Networks AI/ML (Artificial Intelligence/Machine Learning) benchmarks for both CPU and GP (GPU) to measure both CNN (Convolution Neural Network) & RNN (Recurrent Neural Networks) performance on modern hardware.

It also adds hardware support and optimisations for brand-new CPU architectures (AMD Ryzen 2 (3000 series); Intel IceLake, CometLake) not forgetting GPGPU architectures across the various interfaces (CUDA, OpenCL, DirectX ComputeShader, OpenGL Compute).

As SiSoftware operates a “just-in-time” release cycle, some features were introduced in Sandra 2017 service packs: in Sandra Titanium they have been updated and enhanced based on all the feedback received.

Operating System Module

Broad Operating System Support

All current versions supported: Windows 10, 8.1*, 8*, 7*; Server 2019, 2016, 2012/R2 and 2008/R2*

Brand new AI/ML benchmarks featuring both CNN & RNN networks testing both inference/forward and training/back-propagation performance. Brand new Internet DNS and Overall Internet Score benchmarks.

Processor Neural Networks (AI/ML)

A combined performance index of CNN (inference/forward & training) & RNN (inference/forward & training) for all precisions (single/FP32, double/FP64 floating-point) and instruction sets (AVX512, AVX2/FMA, AVX, SSE4, SSE2, RTM/HLE with NUMA and large-page support)

Ranker: Processor Neural Networks (Normal/Single Precision)
Ranker: Processor Neural Networks (High/Double Precision)

GP (GPU) Neural Networks (AI/ML)

A combined performance index of CNN (inference/forward & training) & RNN (inference/forward & training) for all precisions (half/FP16, single/FP32 floating-point) and platforms (CUDA, OpenCL, DirectX Compute)

GP (GPU) Neural Networks (Normal/Single Precision)
GP (GPU) Neural Networks (Low/Half Precision)

CNN (Convolution Neural Network) Architecture

Detailed document on the CNN architecture, data-sets and results that underpin our choices for the new benchmarks.

The new Neural Networks (AI/ML) Benchmarks: CNN Architecture

RNN (Recurrent Neural Network) Architecture

Detailed document on the RNN architecture, data-sets and results that underpin our choices for the new benchmarks.

The new Neural Networks (AI/ML) Benchmarks: RNN Architecture

Internet DNS Benchmark

Benchmark the performance of the DNS service. Measure the latency of both cached and un-cached DNS queries to local and remote DNS servers.

Internet Overall Score Benchmark

A combined performance index of all Internet benchmarks (Connection (Bandwidth/Latency), Peerage (Bandwidth/Latency) and DNS (cached/un-cached Query Latency)). Rate the overall performance of your Internet connection.

Major changes

  • All connections to website engines (Ranker, Information, Price) are now secured by SSL (HTTPS).
  • Sandra client (management console) is now installed as native 64-bit (on x64 and arm64) and thus needs 64-bit Access components (2016, 2013, 2010, etc.) or SQL Server (2017, 2016, 2014, etc) for its database.

Key features of Sandra 20/20

  • 4 native architectures support (x86, x64, ARM64** – Windows; ARM, ARM64, x86, x64 – Android)
  • Huge official hardware support through technology partners (AMD/ATI, nVidia, Intel).
  • 4 native (GP)GPU/APU platforms support (OpenCL 2.1+, CUDA 10.1+, DirectX Compute Shader 11/10+, OpenGL Compute 4.5+, Vulkan 1.0+).
  • 4 native Graphics platforms support (DirectX 11.x/10.x, OpenGL 4.0+, Vulkan 1.0+).
  • 9 language versions (English, German, French, Italian, Spanish, Japanese, Chinese (Traditional, Simplified), Russian) in a single installer.
  • Enhanced Sandra Lite (Eval) version (free for personal/educational use, evaluation for other uses)

Articles & Benchmarks

For more details, please see the following articles:

Purchasing

For more details, and to purchase the commercial versions, please click here.

Updating or Upgrading

To update your existing commercial version, please click here.

Downloading

For more details, and to download the Lite (Evaluation) version, please click here.

Reviewers and Editors

For your free review copies, please contact us.

About SiSoftware

SiSoftware, founded in 1995, is one of the leading providers of computer analysis, diagnostic and benchmarking software. The flagship product, known as “SANDRA”, was launched in 1997 and has become one of the most widely used products in its field. Many worldwide IT publications, magazines and review sites use SANDRA to analyse the performance of today’s computers. Thousands of on-line reviews of computer hardware that use SANDRA are catalogued on our website alone.

Since launch, SiSoftware has always been at the forefront of the technology arena, being among the first providers of benchmarks that show the power of emerging new technologies such as multi-core, GPGPU, OpenCL, OpenGL, DirectCompute, x64, ARM64, ARM, NUMA, SMT (Hyper-Threading), SMP (multi-threading), AVX512, AVX2/FMA3, AVX, NEON/2, SSE4.2/4, SSSE3, SSE2, SSE, Java and .NET.

SiSoftware is located in London, UK. For more information, please visit www.sisoftware.net, www.sisoftware.eu, or www.sisoftware.co.uk

The new Neural Networks (AI/ML) Benchmarks: RNN Architecture

What is a Recurrent Neural Network (RNN/LSTM)?

An RNN is a type of neural network that is primarily made of neurons that store their previous states and thus are said to ‘have memory’. In effect this allows them to ‘remember’ patterns or sequences.

However, they can still be used as ‘classifiers’ i.e. recognising visual patterns in images and thus can be used in visual recognition software.

What is VGG(net) and why use it now?

VGGNet is the baseline (or benchmark) CNN-type network that, while it did not win the ILSVRC 2014 competition (won by GoogLeNet/Inception), is still the preferred choice in the community for classification due to its uniform and thus relatively simple architecture.

While it is generally implemented using CNN layers, either directly or in combination (like ResNet), it can also be implemented using RNN layers, which is what we have done here.

We believe this is a good test scenario and thus a relevant benchmark for today’s common systems.

We are considering much more complex neurons, like LSTM, for future tests specifically designed for high-end systems such as those used in research and academia.

What is the MNIST dataset and why use it now?

The MNIST database (https://en.wikipedia.org/wiki/MNIST_database) is a decently sized dataset of handwritten digits used for training and testing image processing systems like neural networks. It contains 60K training and 10K testing images of 28×28 pixel anti-aliased gray levels. The number of classes is only 10 (digits ‘0’ to ‘9’).

While they are only 28×28 and not colour, they can be up-scaled to any size by common up-scaling algorithms to test neural networks with little source data.
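As an illustration of such up-scaling, the sketch below uses nearest-neighbour replication, the simplest of the common algorithms: with an integer factor of 8, a 28×28 image becomes 224×224. The source does not say which algorithm Sandra uses, so the choice of nearest-neighbour and the function name are assumptions.

```python
def upscale_nearest(image, factor):
    """Up-scale a grey-scale image (a list of pixel rows) by an integer
    factor, repeating every pixel `factor` times in each direction."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    scaled = []
    for row in image:
        # Repeat each pixel horizontally...
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # ...then repeat the widened row vertically (as separate copies).
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled
```

Smoother algorithms (bilinear, bicubic) would blend neighbouring pixels instead of copying them, at a higher compute cost.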

Today (2019) the digits would be captured in much higher resolution similar to the standard input resolution of the image processing networks of today (between 200×200 and 300×300 pixels).

As Sandra is designed to be small and easily downloadable, it is not possible to include gigabytes (GB) of data for either inference or training. Even the low-resolution (32x32x3) ILSVRC is 3GB thus unusable for our purpose.

What is Sandra’s RNN network architecture and why was it designed this way?

Due to the low complexity of the data and in order to maintain good performance even on low-end hardware, a standard RNN was chosen as the architecture. The features are:

  • Input is 224x224x1 as MNIST images are grey-scale only (up-scaled from 28×28)
  • Output is 10 as there are only 10 classes
  • 4 layer network, 1 RNN, 3 fully connected layers
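The layer layout listed above can be sketched as a forward (inference) pass. Everything beyond that list is an assumption made for illustration: the image is presented one 224-pixel row per time-step, the hidden and fully connected layer sizes are invented, tanh/ReLU/softmax are stand-in activations, biases are omitted for brevity and the weights are random rather than trained.

```python
import math
import random

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def random_matrix(rows, cols, rng, scale=0.05):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

def build_network(input_size=224, hidden=16, fc_sizes=(32, 16, 10), seed=0):
    """1 recurrent layer followed by 3 fully connected layers."""
    rng = random.Random(seed)
    net = {"w_in": random_matrix(hidden, input_size, rng),
           "w_rec": random_matrix(hidden, hidden, rng),
           "fc": []}
    previous = hidden
    for size in fc_sizes:
        net["fc"].append(random_matrix(size, previous, rng))
        previous = size
    return net

def forward(net, image_rows):
    """Return 10 class probabilities for one grey-scale image."""
    hidden = [0.0] * len(net["w_rec"])
    for row in image_rows:  # one time-step per image row (assumption)
        drive = matvec(net["w_in"], row)
        memory = matvec(net["w_rec"], hidden)  # the previous state
        hidden = [math.tanh(d + m) for d, m in zip(drive, memory)]
    activations = hidden
    for index, weights in enumerate(net["fc"]):
        activations = matvec(weights, activations)
        if index < len(net["fc"]) - 1:  # ReLU on all but the last layer
            activations = [max(0.0, a) for a in activations]
    peak = max(activations)  # softmax, shifted for numerical stability
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]
```

Only the final hidden state feeds the fully connected layers, so the classification is produced once the whole image has been presented.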

What are the implementation details of the network?

The CPU version of the neural network supports all common instruction sets and precisions and will be continuously updated as the industry moves forward.

  • Both inference/forward and train/back-propagation tested and supported.
  • Precision: single and double floating-point supported with future half/FP16.
  • SIMD Instruction Sets: CPU, SSE2, SSE4.x, AVX, AVX2/FMA and AVX512 with future VNNI.
  • Threads/Cores: Up to the operating system maximum of 384 threads, in 64-thread groups, is supported with hard affinity, as in all other benchmarks.
  • NUMA: NUMA is supported up to 16 nodes with data allocated to the closest node.
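The 64-thread grouping mentioned above can be illustrated with a little arithmetic: a flat thread number maps to a group number and a one-bit affinity mask within that group, so 384 threads span 6 groups. The sketch assumes every group is fully populated with 64 logical processors, which real systems do not guarantee; the function name is hypothetical.

```python
GROUP_SIZE = 64    # logical processors per group
MAX_THREADS = 384  # maximum quoted above (6 groups of 64)

def affinity_for(thread_index):
    """Map a flat thread number to (group, affinity_mask), with exactly
    one bit set in the mask so the thread is pinned to one processor."""
    if not 0 <= thread_index < MAX_THREADS:
        raise ValueError("thread index out of range")
    group, bit = divmod(thread_index, GROUP_SIZE)
    return group, 1 << bit
```

For example, thread 64 is the first processor of group 1, and thread 383 is the last processor of group 5.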

What kind of BTT (Back-propagation Through Time) is used?

Unfortunately, as we only know the output (digit) at the end of the sequence (i.e. once all pixels have been presented), we cannot calculate intermediate errors and so cannot use TBTT (Truncated BTT), which relies on known outputs at intermediate sequence time-steps.

What kind of detection rate and error does Sandra’s implementation achieve?

Naturally due to the low source resolution, a much shallower/simpler network would have sufficed. However due to up-scaling and the relatively large number of training images there is no danger of over-fitting.

It achieves a % detection rate (over the 10K testing images) after just 1 epoch (Epoch 0) and % after 30 epochs.

Training (30 epochs) took just X* hours on an i9-7900X (10C/20T) using AVX512/single-precision.

Does Sandra fully infer or train the full image set when benchmarking?

As with all other Sandra benchmarks, the tests are limited to 30 seconds (in order to complete reasonably quickly) – within this time as many images as possible, picked at random from the data-sets (60K train, 10K test), are processed.
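A time-limited run of this kind can be sketched as a loop that keeps picking random items until the time budget expires and then reports how many were processed. The function and parameter names are hypothetical and the `process` callback stands in for one inference or training step; this is not Sandra's implementation.

```python
import random
import time

def run_for(budget_seconds, dataset, process, seed=None):
    """Process randomly chosen items until the time budget is used up.

    Returns (items_processed, items_per_second)."""
    rng = random.Random(seed)
    start = time.perf_counter()
    deadline = start + budget_seconds
    processed = 0
    while time.perf_counter() < deadline:
        process(rng.choice(dataset))  # one image per iteration
        processed += 1
    elapsed = time.perf_counter() - start
    rate = processed / elapsed if elapsed > 0 else float("inf")
    return processed, rate
```

A call such as `run_for(30.0, training_images, train_step)` would then give a score in images per second that does not depend on the size of the data-set.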