NUMA performance improvement for AMD ThreadRipper in Sandra SP2

What is NUMA?

Modern CPUs have had a built-in memory controller for many years now, starting with the K8/Opteron, in order to provide higher bandwidth and lower latency. As a result, in SMP systems each CPU has its own memory controller and its own system memory that it can access at high speed, while to access other memory it must send requests to the other CPUs. NUMA is a way of describing such systems; it allows the operating system and applications to allocate memory on the node they are running on for best performance.

As ThreadRipper is really two (2) Ryzen dies connected internally through InfinityFabric – it is basically a 2-CPU SMP system and thus a 2-node NUMA system.

While it is possible to configure it in UMA (Uniform Memory Access mode) where all memory appears to be unified and interleaved between nodes, for best performance the NUMA mode is recommended when the operating system and applications support it.
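To illustrate why placement matters, here is a toy model (not Sandra's code and not a real NUMA API). It counts what fraction of page accesses stay on the accessing thread's own node when a buffer is interleaved across nodes, compared with placing each page on the node of the thread that processes it. The page count, thread count and all function names are hypothetical; a real application would request node-local memory through operating system facilities such as `VirtualAllocExNuma` on Windows or `numa_alloc_onnode` from libnuma on Linux.

```python
# Toy model only: sizes and function names are illustrative assumptions.

def thread_node(thread: int, threads: int, nodes: int) -> int:
    """Node a thread runs on, assuming threads are spread evenly over nodes."""
    return thread * nodes // threads


def local_fraction(pages: int, threads: int, nodes: int, placement) -> float:
    """Fraction of page accesses served by the accessing thread's own node.

    Each thread processes one contiguous chunk of the buffer; `placement`
    maps a page index to the node that holds that page.
    """
    per_thread = pages // threads
    local = 0
    for t in range(threads):
        node = thread_node(t, threads, nodes)
        for page in range(t * per_thread, (t + 1) * per_thread):
            if placement(page) == node:
                local += 1
    return local / (per_thread * threads)


PAGES, THREADS, NODES = 1024, 16, 2


def interleaved(page: int) -> int:
    # UMA-style placement: consecutive pages alternate between nodes
    return page % NODES


def node_local(page: int) -> int:
    # NUMA-aware placement: a page lives on the node of the thread that uses it
    return thread_node(page // (PAGES // THREADS), THREADS, NODES)


uma = local_fraction(PAGES, THREADS, NODES, interleaved)
numa = local_fraction(PAGES, THREADS, NODES, node_local)
print(f"interleaved: {uma:.0%} local, node-local: {numa:.0%} local")
```

In this model half of all accesses cross to the other node when memory is interleaved, while node-local placement keeps every access local. How much that costs in practice depends on how much compute hides the remote latency.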

While Sandra has always supported NUMA in the standard benchmarks, some of the newer benchmarks had not been updated with NUMA support, especially since multi-core systems have pretty much killed SMP systems on the desktop, leaving only expensive servers to bring SMP / NUMA support.

Note that all the NUMA improvements here would apply to competitor NUMA (e.g. Intel) systems, thus it is not just for ThreadRipper – with EPYC systems likely showing a far higher improvement too.

In this article we test NUMA performance; please see our other articles on:

Native Performance

We are testing native performance using various instruction sets: AVX512, AVX2/FMA3, AVX to determine the gains the new instruction sets bring.

Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.
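The bracketed percentages in the results appear to be the relative gain of the NUMA score over the UMA score. A minimal sketch of that calculation, assuming this definition (the function name is ours, not Sandra's):

```python
def improvement_pct(numa_score: float, uma_score: float) -> float:
    """Relative gain of the NUMA result over the UMA result, in percent."""
    return (numa_score / uma_score - 1.0) * 100.0


# Scores taken from the results table (higher is better)
for name, numa, uma in [
    ("AES256 (GB/s)", 27.1, 11.3),
    ("SGEMM (GFLOPS)", 395, 184),
    ("Blur 3x3 (MPix/s)", 2090, 1220),
]:
    print(f"{name}: {improvement_pct(numa, uma):+.1f}%")
```

Small differences from the published percentages are to be expected, since the published scores are themselves rounded.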

Environment: Windows 10 x64, latest AMD and Intel drivers. Turbo / Dynamic Overclocking was enabled on both configurations.

| Native Benchmarks | NUMA 2-nodes | UMA single-node | Comments |
| --- | --- | --- | --- |
| Native Integer (Int32) Multi-Media (Mpix/s) | 965 [+2.8%] | 938 | The ‘lightest’ workload should show some NUMA overhead but we can only manage 3% here. |
| Native Long (Int64) Multi-Media (Mpix/s) | 312 [+2.3%] | 305 | With a 64-bit integer workload the improvement drops to 2%. |
| Native Quad-Int (Int128) Multi-Media (Mpix/s) | 10.9 [=] | 10.9 | Emulating Int128 greatly increases the compute workload, making the NUMA overhead insignificant. |
| Native Float/FP32 Multi-Media (Mpix/s) | 997 [+1.2%] | 985 | Again no meaningful improvement here. |
| Native Double/FP64 Multi-Media (Mpix/s) | 562 [+1%] | 556 | Again no meaningful improvement here. |
| Native Quad-Float/FP128 Multi-Media (Mpix/s) | 27 [=] | 26.85 | In this heavy algorithm, using FP64 to mantissa-extend FP128, we see no improvement. |

Fractals are compute intensive with few memory accesses, mainly to store results, thus we see a maximum of 3% improvement with NUMA support with the rest insignificant. However, this is a simple 2-node system; bigger 4/8-node systems would likely show bigger gains.

| Native Benchmarks | NUMA 2-nodes | UMA single-node | Comments |
| --- | --- | --- | --- |
| Crypto AES256 (GB/s) | 27.1 [+139%] | 11.3 | Hardware-accelerated AES is memory-bandwidth bound, thus NUMA support matters; even in this 2-node system we see an over 2x improvement of 139%! |
| Crypto AES128 (GB/s) | 27.4 [+142%] | 11.3 | Similar to above, we see a massive 142% improvement by allocating memory on the right NUMA node. |
| Crypto SHA2-256 (GB/s) | 32.3 [+50%] | 21.4 | SHA is also hardware accelerated but operates on a single input buffer (with a small output hash value buffer), and here our improvement drops to 50%, still very much significant. |
| Crypto SHA1 (GB/s) | 34.2 [+56%] | 21.8 | Similar to above, we see an even larger 56% improvement from supporting NUMA. |
| Crypto SHA2-512 (GB/s) | 6.36 [=] | 6.35 | SHA2-512 is not hardware accelerated (AVX2 used) but is heavily compute bound, thus our improvement drops to nothing. |

Finally, in streaming algorithms we see just how much NUMA support matters: even on this 2-node system we see over 2x improvement of 140% when working with 2 buffers (in/out). When using a single buffer our improvement drops to 50% but is still very much significant. TR needs NUMA support to shine.

| Native Benchmarks | NUMA 2-nodes | UMA single-node | Comments |
| --- | --- | --- | --- |
| SGEMM (GFLOPS) float/FP32 | 395 [+114%] | 184 | As with crypto, GEMM benefits greatly from NUMA support with an incredible 114% improvement by allocating the (3) buffers on the right NUMA nodes. |
| DGEMM (GFLOPS) double/FP64 | 183 [+131%] | 79 | Changing to FP64 brings an even more incredible 131%. |
| SFFT (GFLOPS) float/FP32 | 11.6 [+86%] | 6.25 | FFT also shows big gains from NUMA support with 86% improvement just by allocating the buffers (2+1 const) on the right nodes. |
| DFFT (GFLOPS) double/FP64 | 10.6 [+112%] | 5 | With FP64 the improvement again increases, to 112%. |
| SNBODY (GFLOPS) float/FP32 | 479 [=] | 483 | Strangely, N-Body does not benefit from NUMA support, with no appreciable improvement. |
| DNBODY (GFLOPS) double/FP64 | 189 [=] | 191 | With an FP64 workload nothing much changes. |

As with crypto, buffer-heavy algorithms (GEMM, FFT) greatly benefit from NUMA support, with performance roughly doubling (86-131%) by allocating on the right NUMA nodes, while N-Body shows no appreciable change; in effect TR needs NUMA in order to perform better than a standard Ryzen!

| Native Benchmarks | NUMA 2-nodes | UMA single-node | Comments |
| --- | --- | --- | --- |
| CPU Image Processing Blur (3×3) Filter (MPix/s) | 2090 [+71%] | 1220 | The least compute brings the highest benefit from NUMA support: here it is 71%. |
| CPU Image Processing Sharpen (5×5) Filter (MPix/s) | 886 [=] | 890 | Same algorithm, but more compute brings the improvement down to nothing. |
| CPU Image Processing Motion-Blur (7×7) Filter (MPix/s) | 494 [=] | 495 | Again the same algorithm but with even more compute; again no benefit. |
| CPU Image Processing Edge Detection (2*5×5) Sobel Filter (MPix/s) | 720 [=] | 719 | Using two buffers does not seem to show any benefit either. |
| CPU Image Processing Noise Removal (5×5) Median Filter (MPix/s) | 116 [=] | 117 | A different algorithm, but more compute again means no benefit. |
| CPU Image Processing Oil Painting Quantise Filter (MPix/s) | 40.3 [=] | 40.7 | Using the new gather support in AVX2 does not help matters even with NUMA support. |
| CPU Image Processing Diffusion Randomise (XorShift) Filter (MPix/s) | 1880 [+90%] | 982 | Here a 64-bit integer workload with many gathers, and not compute heavy, brings a 90% improvement. |
| CPU Image Processing Marbling Perlin Noise 2D Filter (MPix/s) | 397 [=] | 396 | Heavy compute brings the improvement down to nothing. |

As with other SIMD tests, low-compute algorithms see 70-90% improvement from NUMA support; heavy-compute algorithms bring the improvement down to zero. It all depends on whether the overhead of accessing other nodes can be masked by compute; in effect TR seems to perform pretty well.

SiSoftware Official Ranker Scores

Final Thoughts / Conclusions

It is clear that ThreadRipper, just like any other SMP system today, needs NUMA support in applications to shine: we see over 2x improvement in bandwidth-heavy algorithms. However, in compute-heavy algorithms TR is able to mask the overhead pretty well, with NUMA bringing almost no improvement. For software without NUMA support, UMA mode should be used.

Let’s remember we are only testing a 2-node system here; a 4+ node system is likely to show higher improvements, and with EPYC systems starting at 1-socket 4-node we can potentially have common 4-socket 16-node systems that absolutely need NUMA for best performance. We look forward to testing such a system as soon as possible 😉

 

Sandra Platinum (2017) SP1a

We are pleased to release SP1a (Service Pack 1a – version 24.30) update for Sandra Platinum (2017) with the following updates:

Sandra Platinum (2017) Press Release

  • Tools update allowing further ports of benchmarks to AVX512, e.g.:
    • CPU Multi-Media: 128-bit (octa) floating-point benchmark
    • CPU Scientific: GEMM and N-Body (single and double floating-point)
    • CPU Image Processing: SIMD filters (AVX2/FMA3, AVX, SSE4*, SSE2) performance improvement
    • further benchmarks will be enabled as tools are further updated

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

Sandra Platinum (2017) SP1

We are pleased to announce SP1 (Service Pack 1 – version 24.27) for Sandra Platinum (2017) with updated hardware support and benchmark optimisations:

Sandra Platinum (2017) Press Release

  • Updated hardware support including:
    • Intel HEDT/Workstation/Server Skylake-X/Kabylake-X
    • Intel Core Coffeelake
    • AMD HEDT/Workstation/Server Threadripper
    • Updated DDR4, NVMem (non-volatile), PMem (persistent) memory support
  • Updated CPU benchmarks including:
    • Updated AVX512 benchmarks (Multi-Media, Cryptography/Hashing, Memory & Cache Bandwidth)
    • Further benchmarks will be updated to AVX512 in due course
  • Benchmark Fixes
    • Fix: CPU Power Management Efficiency benchmark running with more than 16 threads.
    • Fix: SGEMM AVX2/FMA running with non-power of 2 threads
    • Fix: Database AVX512 scores would not be entered

New hardware reviews with Sandra Platinum (2017) SP1:

Commercial version customers can download the free updates from their software distributor; Lite users please download from your favourite download site.

Download Sandra Lite

SiSoftware Sandra Platinum updated to RTMa

We are updating Sandra Platinum (2017) to RTMa (version 24.18) with a few important fixes and optimizations. Please update to this version as soon as applicable.

Sandra 2017 Press Release

* Windows 7/Server 2008 R2: failure to launch due to missing TPM support.

If the BitLocker/TPM2 update was not installed and TPM support was missing, Sandra would fail to launch.

* AMD Ryzen: updated support and fixes.

Better detection and information through further testing.

* Intel Atom Braswell and later: multiplier detection issues.

Benchmarks would fail to validate due to incorrect values.

– CPU Scientific: crash using the SSE2/3 FFT implementation on Atom CPUs.

* GPGPU: optimisations and fixes.

– OpenCL: optimisations for image processing, especially relating to FP16/half processing:

http://www.sisoftware.eu/2017/04/14/fp16-gpgpu-image-processing-performance-quality/

– DX ComputeShader: fixes for image processing enabling both FP32 and FP16 performance.

* GPGPU: test data randomised.

– Cryptography: CUDA, OpenCL, DX CS: replaced with random data which reduces cache locality and thus performance especially on APUs. AES performance has thus reduced by up to 50-70%.

Using randomised/high-entropy data is meant to improve the “fairness” of the benchmarks, preventing “best-case” scenarios where results may be better than expected. While hardware may perform better given low-entropy data, the new randomised results are preferable to the previous ones.
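As a rough illustration of the difference between low-entropy and high-entropy test data, the sketch below uses compressibility as a simple proxy for entropy. This is our own illustration, not how Sandra generates or evaluates its test buffers; the buffer size and helper name are arbitrary assumptions.

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB test buffer (arbitrary size for illustration)

low_entropy = bytes(SIZE)        # all zeros: a "best-case" input
high_entropy = os.urandom(SIZE)  # random bytes supplied by the OS


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near 1.0 means high entropy."""
    return len(zlib.compress(data)) / len(data)


low_ratio = compressed_ratio(low_entropy)
high_ratio = compressed_ratio(high_entropy)
print(f"zeros: {low_ratio:.4f}, random: {high_ratio:.4f}")
```

The all-zero buffer compresses to almost nothing while the random buffer barely compresses at all, which is why patterned inputs can flatter any hardware or software path that exploits redundancy.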

* Benchmark Results Ranker: old results (pre Sandra 2014) removed.

– The reference/aggregated results have been updated with newer 2017 results given higher weight.

Download Sandra Lite

Sandra USB supplied on SanDisk* Extreme disks

We are now providing Sandra USB versions on one of the fastest USB3 flash disks – the SanDisk* Extreme! With read bandwidth ~200MB/s you won’t be waiting for Sandra to start off the USB drive (write speed is not bad either at ~60MB/s).

Now they are not the smallest of flash drives but then again you are unlikely to lose them. Here they are for your viewing pleasure:

Sandra USB on SanDisk Extreme

SP1a for SiSoftware Sandra 2016 Released!

We are happy to release SP1a (Service Pack 1a) to SiSoftware Sandra 2016.

This is a minor update that improves stability and adds a few optimisations that were developed after further testing of the SP1 release.

The SP1a update also enables the Marbling: Perlin Noise 2D (3 octaves) Filter for both GPGPUs (CUDA, OpenCL) and CPU.

Sandra 2016 SP1 New Image Filters

SP1 for SiSoftware Sandra 2016 Released!

We are happy to release SP1 (Service Pack 1) to SiSoftware Sandra 2016.

This release introduces initial AVX512 benchmarks with all SIMD benchmarks due to be ported once compiler support becomes available:

CPU Multi-Media (Fractal Generation): single, double floating-point; integer, long benchmarks ported to AVX512. [See article Future performance with AVX512]

CPU Crypto (SHA Hashing): SHA2-256 and SHA2-512 multi-buffer ported to AVX512.

– Hardware support for future arch (AMD and Intel).

.Net Multi-Media native vector support is vector-width independent and thus will automatically support AVX512 with a future CLR release.

GPU Image Processing: New, more complex filters:

  • Oil Painting: Quantise (9×9) Filter: CUDA, OpenCL
  • Diffusion: Randomise (256) Filter: CUDA, OpenCL
  • Marbling: Perlin Noise 2D (3 octaves) Filter: CUDA, OpenCL

CPU Image Processing: New, more complex filters

  • Oil Painting: Quantise (9×9) Filter: AVX2/FMA, AVX, SSE2
  • Diffusion: Randomise (256) Filter: AVX2/FMA, AVX, SSE2
  • Marbling: Perlin Noise 2D (3 octaves) Filter: AVX2/FMA, AVX, SSE2

Sandra 2016 SP1 New Image Filters

More benchmarks will be ported to AVX512 subject to compiler support; currently Microsoft’s VC++ does not support AVX512 intrinsics and in the interest of fairness we do not use specialised compilers.

Please see our article – Future performance with AVX512 – for a primer on AVX512 and projected performance improvements due to AVX512 and 512-bit transfers.

SiSoftware Sandra 2016 RTMa Released

We are providing an update to Sandra 2016, RTMa (version 22.15) with various updates and fixes:

  • .Net native Vector support: (floating-point single/double) in the latest 4.6 CLR RyuJIT. The CLR automatically uses AVX/SSE2 SIMD as supported by the CPU. (see .Net Vectors (CLR 4.6 RyuJIT) Performance article for more information)
  • CPU Image Processing: Did not run the SIMD code-paths (FMA, AVX, SSE2), only the FPU one, resulting in low performance.
  • GPGPU Image Processing: Minor performance optimisation for median/de-noise filter.
  • GPGPU Crypto: SHA performance optimisations for nVidia cards in CUDA and OpenCL (SHA1 especially).
  • Overall Score 2016: score may not generate in all cases.
  • Windows 10: 1511 SDK update (build 10586 2015 November update)
  • Website Change: Due to the transition to WP, links and feeds were broken.

We recommend you update your version of Sandra 2016 as soon as possible.

SiSoftware Sandra 2016 Released

FOR IMMEDIATE RELEASE

Contact: Press Office

SiSoftware Sandra 2016 Released:
Brand-new benchmarks, OpenGL CS, RTM Windows 10 and future Server 2016* support

Superseded By: Sandra Platinum (2017)

Updates: RTMa, SP1, SP1a, SP2, SP3, SP4.

Articles: .Net Vector Performance, AVX512 Future Performance.

London, UK, November 16th 2015 – We are pleased to announce the launch of SiSoftware Sandra 2016, the latest version of our award-winning utility, which includes remote analysis, benchmarking and diagnostic features for PCs, servers, mobile devices and networks.

We have expanded our portfolio with brand-new CPU and GPGPU Image Processing benchmarks (testing common filters) that support all modern instruction sets (AVX512, AVX2/FMA, AVX, SSE2) as well as GPGPU interfaces (CUDA, OpenCL, DirectX 10/11 ComputeShader). Since Compute is now supported in the very latest versions of OpenGL (4.3+) we have also ported the GPGPU benchmarks to this new interface.

With the public release of Windows 10 RTM we have transitioned to the brand-new tools in order to use the very latest technologies, including future DirectX 12 (both shader and compute).

As SiSoftware operates a “just-in-time” release cycle, some features were introduced in Sandra 2015 service packs: in Sandra 2016 they have been updated and enhanced based on all the feedback received.

Here is an in-depth new feature list of Sandra 2016:

Windows

Broad Operating System Support
All current OS versions supported: Windows 10 RTM/AU, 8.1, 8, 7; Server 2016*, 2012/R2 and 2008/R2

  • New Benchmark Module: GPGPU Image Processing (common filters: blur, sharpen, sobel, median/de-noise, oil painting, diffuse/random, marbling/perlin noise) supporting all modern interfaces (CUDA, OpenCL, DirectX ComputeShader)
  • New Benchmark Module: CPU Image Processing (common filters: blur, sharpen, sobel, median/de-noise, oil painting, diffuse, marbling) supporting all modern vectorised SIMD instruction sets (AVX512*, AVX2/FMA, AVX, SSE2)
  • New OpenGL Compute Support: Ported GPGPU benchmarks to OpenGL (4.3+) Compute Shader (Fractals, Crypto, Image Processing)
  • New GPU Precision: FP16/half-float precision benchmarks (Financial, Scientific)
  • New CPU Test: 64-bit Integer Dhrystone measuring 64-bit integer workload performance.
  • New .Net vector Support: Native vector to SIMD (AVX512*, FMA/AVX, SSE2, etc.) conversion in the latest 4.6 CLR (RyuJIT).
  • New Transcode Test: HEVC/H.265 media transcode test, brand-new high-bitrate master AVC1 media file 1080p and UHD/4K (commercial versions) for UHD/4K, 3K, 1440p transcoding benchmarking.
  • Updated Benchmark: Updated Overall Score (2016) by adding new benchmarks to the index.
  • New Operating System Support: Full support for Windows 10 RTM, 8.1, 8, 7 as well as Server 2016*, 2012/R2, 2008/R2.
  • New Hardware Support: Modern and future hardware support.
CPU Image Processing Benchmark

CPU, GPGPU Image Processing
Common filters: blur, sharpen, sobel, median/de-noise, oil filter, diffuse/random, marbling/perlin noise

Image/photo manipulation is an increasingly common task, with GPGPUs increasingly used to accelerate filter processing in popular programs (e.g. Photoshop). This brand-new benchmark set tests the performance of various filters:

  • Blur: 3×3 Convolution Filter
  • Sharpen: 5×5 Convolution Filter
  • Motion Blur: 7×7 Convolution Filter
  • Edge Detection: Horizontal + Vertical 5×5 Sobel Filter
  • De-Noise: 5×5 Median Filter
  • Oil Painting: Quantise (9×9) Filter [SP1]
  • Diffusion: Randomise / XorShift (256) Filter [SP1]
  • Marbling: Simplex Noise 2D Perlin (3O) Filter [SP1a]
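To make the convolution filters above concrete, here is a minimal pure-Python sketch of a 3×3 blur. It assumes a uniform (box) kernel and copies edge pixels unchanged; the source does not specify the kernel weights or edge handling Sandra uses, and the function name is ours. A real implementation would process many pixels per instruction with SIMD.

```python
def blur3x3(image):
    """3x3 box blur on a 2D list of numbers; edge pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # start from a copy so edges are preserved
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            # Sum the 3x3 neighbourhood centred on (x, y)
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    total += image[y + dy][x + dx]
            out[y][x] = total / 9  # uniform kernel: plain average
    return out


img = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
blurred = blur3x3(img)
print(blurred[1][1])
```

The larger 5×5 and 7×7 filters follow the same pattern with more taps per pixel, which is why they do more compute for each byte of memory traffic.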
GPGPU Image Processing Benchmark

CPU, GPGPU Image Processing
Modern vectorised and GPU interfaces

Image/photo manipulation is greatly accelerated through vectorised SIMD instruction sets (AVX512, AVX2/FMA, AVX, SSE2) operating on multiple pixels at the same time, but also increasingly accelerated by GPGPUs in modern programs (e.g. Photoshop). This brand-new benchmark set supports all GPGPU interfaces as well as SIMD instruction sets:

  • GPGPU: CUDA (7.5), OpenCL (2.0, 1.2), DirectX Compute Shader (11/10), OpenGL Compute Shader (4.3+) [future DirectX 12 support]
  • CPU: AVX512*, AVX2/FMA, AVX, SSE2 instruction sets


Sandra 2016 Image Filters
Sandra 2016 SP1 New Image Filters

System Overall Benchmark

Updated Overall Score 2016 benchmark for complete system performance evaluation
16 benchmarks to fully evaluate computer performance

While each benchmark measures the performance of a specific device (CPU, Memory, (GP)GPU, Storage, etc.), there is a real need for a benchmark to evaluate the overall computer performance: this new benchmark is a weighted average of the individual scores of the existing benchmarks:

  • Native CPU Arithmetic, Cryptographic, Multi-Media (SIMD), Financial and Scientific: measures native processing performance using the very latest instruction sets (AVX512*, AVX2/FMA, AVX, SSE2)
  • .Net/Java Arithmetic: measures software virtual machine performance (e.g. for .Net WPF/Silverlight/Modern applications)
  • Memory and Cache Bandwidth and Latency: measures memory and caches performance
  • File System/Storage Bandwidth and I/O: measures storage performance
  • GP (General Processing) / HC (Heterogeneous Compute) (GPU/APU) Arithmetic, Cryptographic, Financial, Scientific: measures (GP)GPU/APU processing performance
  • GP (General Processing) / HC (Heterogeneous Compute) (GPU/APU) Memory Bandwidth and Latency: measures (GP)GPU/APU memory performance
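The weighted average described above can be sketched as follows. The component names, scores and weights are made up for illustration; the source does not state the actual weights or normalisation used for the Overall Score.

```python
def overall_score(scores: dict, weights: dict) -> float:
    """Weighted average of individual benchmark scores."""
    total_weight = sum(weights[name] for name in scores)
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


# Hypothetical normalised scores and weights (not Sandra's real values)
scores = {"cpu": 120.0, "memory": 80.0, "storage": 50.0, "gpgpu": 200.0}
weights = {"cpu": 4, "memory": 2, "storage": 1, "gpgpu": 3}

print(overall_score(scores, weights))
```

Dividing by the total weight means the weights only need to express relative importance; they do not have to sum to one.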

Key features of Sandra 2016

  • 4 native architectures support (x86, x64 – Windows; ARM, ARM64, x86, x64 – Android)
  • Huge official hardware support through technology partners (AMD/ATI, nVidia, Intel).
  • 4 native (GP)GPU/APU platforms support (OpenCL 1.1+, CUDA 7.5+, DirectX Compute Shader 11+, OpenGL Compute 4.3+).
  • 4 native Graphics platforms support (DirectX 12.x, DirectX 11.x, DirectX 10.x, OpenGL 3.0+).
  • 9 language versions (English, German, French, Italian, Spanish, Japanese, Chinese (Traditional, Simplified), Russian) in a single installer.
  • Enhanced Sandra Lite (Eval) version (free for personal/educational use, evaluation for other uses)

Relevant Articles

For more details, please see the following articles:

Purchasing

For more details, and to purchase the commercial versions, please click here.

Updating or Upgrading

To update your existing commercial version, please click here.

Downloading

For more details, and to download the Lite (Evaluation) version, please click here.

Reviewers and Editors

For your free review copies, please contact us.

About SiSoftware

SiSoftware, founded in 1995, is one of the leading providers of computer analysis, diagnostic and benchmarking software. The flagship product, known as “SANDRA”, was launched in 1997 and has become one of the most widely used products in its field. Nearly 700 worldwide IT publications, magazines and review sites use SANDRA to analyse the performance of today’s computers. Over 9,000 on-line reviews of computer hardware that use SANDRA are catalogued on our website alone.

Since launch, SiSoftware has always been at the forefront of the technology arena, being among the first providers of benchmarks that show the power of emerging new technologies such as multi-core, GPGPU, OpenCL, OpenGL, DirectCompute, x64, ARM, MIPS, NUMA, SMT (Hyper-Threading), SMP (multi-threading), AVX512, AVX2, AVX, FMA, NEON, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE, Java and .NET.

SiSoftware is located in London, UK. For more information, please visit http://www.sisoftware.net, http://www.sisoftware.eu, http://www.sisoftware.info or http://www.sisoftware.co.uk

SiSoftware Sandra 2015 Released

FOR IMMEDIATE RELEASE

Contact: Press Office

SiSoftware Sandra 2015 Released:
Windows 10/Server 2016 (Beta) Support, New Benchmarks

Superseded: This version has been replaced by Sandra 2016.

Updates: SP4 (Nov 19th 2015), SP3 (Sep 17th 2015), SP2b (Jul 6th 2015), SP2 (Jun 1st 2015), SP1a (Mar 17th 2015), SP1 (Feb 16th 2015), 2015i (Nov 26th 2014).

London, UK, November 10th 2014 – We are pleased to announce the launch of SiSoftware Sandra 2015, the latest version of our award-winning utility, which includes remote analysis, benchmarking and diagnostic features for PCs, servers, mobile devices and networks.

With the public release of Windows 10 Tech Preview, we have updated Sandra to the latest tools for Windows 10, while also supporting Windows 8.1, 8, 7 and Vista as well as Server 2016, 2012/R2 and 2008/R2; however, Windows XP and Server 2003/R2 are no longer supported.

As SiSoftware operates a “just-in-time” release cycle, some features were introduced in Sandra 2014 service packs: in Sandra 2015 they have been updated and enhanced based on all the feedback received.

Here is an in-depth new feature list of Sandra 2015:

Windows

Broad Operating System Support
All current OS versions supported: Windows 10, 8.1, 8, 7, Vista; Server 2016, 2012/R2 and 2008/R2

  • Enhanced for Windows 10 / Server 2016 Desktop mode using the latest API.
  • New theme for Windows 10 / 8.1 / 8 with high-DPI support (200%, 192ppi) and 256×256 pixel icons.
  • Full support for Windows 8.1, 8, 7, Vista as well as Server 2012/R2, 2008/R2.
  • Updated hardware support for both current and future hardware
Ranker

Updated Device Performance Certification*
Certify the validity and quality of your benchmarks results [Commercial versions]

Device Performance Certification validates whether the benchmark result (score) you have obtained upon benchmarking your device is valid (i.e. the device you tested is performing correctly) and how it compares to the scores obtained by other users when testing the same device.

By aggregating the results submitted for each device and performing statistical analysis (e.g. computing mean/average, standard deviation, etc.) we can use statistical tools to work out whether the score is within the expected range (confidence intervals).

Based on the variability of scores you can determine whether the performance of your device is consistent or varies significantly from test to test.
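The kind of check described above can be sketched with the standard library. This is an assumption-laden illustration: it treats scores as roughly normally distributed and flags anything outside the mean plus or minus 1.96 standard deviations, whereas the source does not say which statistical test or threshold the Ranker actually applies. The sample scores and function names are hypothetical.

```python
import statistics


def expected_range(submitted, z=1.96):
    """Range around the mean in which most scores should fall (assumes ~normal data)."""
    mean = statistics.mean(submitted)
    stdev = statistics.stdev(submitted)  # sample standard deviation
    return mean - z * stdev, mean + z * stdev


def within_expected_range(score, submitted, z=1.96):
    """True if `score` lies inside the expected range of the submitted scores."""
    low, high = expected_range(submitted, z)
    return low <= score <= high


def variability(submitted):
    """Coefficient of variation: standard deviation relative to the mean."""
    return statistics.stdev(submitted) / statistics.mean(submitted)


# Hypothetical scores submitted by other users for the same device
submitted = [100.0, 102.0, 98.0, 101.0, 99.0]

print(within_expected_range(101.5, submitted))
print(within_expected_range(80.0, submitted))
print(f"variability: {variability(submitted):.1%}")
```

A low coefficient of variation suggests the device performs consistently from test to test, while a high one suggests results vary significantly between runs or configurations.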

System Overall Benchmark

Updated Overall Score (2015) benchmark for complete system performance evaluation
14 benchmarks to fully evaluate computer performance

While each benchmark measures the performance of a specific device (CPU, Memory, (GP)GPU, Storage, etc.), there is a real need for a benchmark to evaluate the overall computer performance: this new benchmark is a weighted average of the individual scores of the existing benchmarks:

  • Native CPU Arithmetic, Cryptographic, Multi-Media (SIMD), Financial and Scientific: measures native processing performance using the very latest instruction sets (AVX2, FMA, AVX, SSE2 and future AVX-512F)
  • .Net/Java Arithmetic: measures software virtual machine performance (e.g. for .Net WPF/Silverlight/Modern applications)
  • Memory and Cache Bandwidth and Latency: measures memory and caches performance
  • File System/Storage Bandwidth and I/O: measures storage performance
  • GP (General Processing) / HC (Heterogeneous Compute) (GPU/APU) Arithmetic, Financial, Scientific: measures (GP)GPU/APU processing performance
  • GP (General Processing) / HC (Heterogeneous Compute) (GPU/APU) Memory Bandwidth and Latency: measures (GP)GPU/APU memory performance
Windows

Brand new style for Windows 10

Windows 10 / 8.x have their own style (Modern, ex-Metro) shared with Windows Phone 8.x. Love it or hate it, it is here to stay. We have also provided Windows 7, Vista users with an updated, modern style:

Buy

Price Engine: invaluable

Why? The Price engine enhances the user’s experience by providing product pictures and additional specifications, as well as the latest price. It enables the calculation of important metrics like Performance vs. Price and Capacity vs. Price (for storage media), which are extremely useful when making comparisons. All this is done automatically rather than by manually searching for pricing, a great time saver.
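Metrics such as Performance vs. Price reduce to a simple ratio. A small sketch, with invented device names, scores and prices (the function name and the figures are ours, purely for illustration):

```python
def per_price(value: float, price: float) -> float:
    """Units of performance (or capacity) obtained per unit of currency."""
    if price <= 0:
        raise ValueError("price must be positive")
    return value / price


# Hypothetical devices: (name, benchmark score, price)
devices = [("Device A", 500.0, 250.0), ("Device B", 650.0, 400.0)]

# Rank by value for money, best first
ranked = sorted(devices, key=lambda d: per_price(d[1], d[2]), reverse=True)
for name, score, price in ranked:
    print(f"{name}: {per_price(score, price):.2f} points per currency unit")
```

Here the slower device still ranks first, which is exactly the kind of comparison a raw score alone would hide.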

Key features of Sandra 2015:

  • 4 native architectures support (x86, x64 – Windows; ARM, ARM64, x86, x64 – Android)
  • Huge official hardware support through technology partners (AMD/ATI, nVidia, Intel).
  • 4 native (GP)GPU/APU platforms support (OpenCL, CUDA, DirectX Compute Shader, OpenGL Compute).
  • 4 native Graphics platforms support (DirectX 12.x, DirectX 11.x, DirectX 10.x, OpenGL 3.0+).
  • 9 language versions (English, German, French, Italian, Spanish, Japanese, Chinese (Traditional, Simplified), Russian) in a single installer.
  • Enhanced Sandra Lite (Eval) version (free for personal/educational use, evaluation for other uses)

About SiSoftware

SiSoftware, founded in 1995, is one of the leading providers of computer analysis, diagnostic and benchmarking software. The flagship product, known as “SANDRA”, was launched in 1997 and has become one of the most widely used products in its field. Nearly 700 worldwide IT publications, magazines and review sites use SANDRA to analyse the performance of today’s computers. Over 9,000 on-line reviews of computer hardware that use SANDRA are catalogued on our website alone.

Since launch, SiSoftware has always been at the forefront of the technology arena, being among the first providers of benchmarks that show the power of emerging new technologies such as multi-core, GPGPU, OpenCL, DirectCompute, x64, ARM, MIPS, NUMA, SMT (Hyper-Threading), SMP (multi-threading), AVX3, AVX2, AVX, FMA4, FMA, NEON, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE, Java and .NET.

SiSoftware is located in London, UK. For more information, please visit http://www.sisoftware.net, http://www.sisoftware.eu, http://www.sisoftware.info or http://www.sisoftware.co.uk