Q & A – Cache & Memory Benchmark

This document answers some frequently asked questions about Sandra. Please read the Help File as well!

This module is built on the same technology as the well-known Memory Benchmark module; see that module for more information about it. This topic deals exclusively with the differences between the two modules.

Q: Why does it take so long to run the test?
A:
To support SMP, Multi-Core, SMT (Hyper-Threading), etc., the framework is quite complex and thus has significant overhead. To get a true index, the tests need to be run many times and an index computed from the distribution of the results, which yields a stable index. Generally this benchmark should take 5 to 10 times as long as the Memory Benchmark.
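
As an illustration of computing an index from the distribution of repeated runs, here is a minimal Python sketch. It uses a trimmed mean, which is an assumption made for this example; the statistic Sandra actually uses is not described here. The function name `stable_index`, the `trim_fraction` parameter and the sample values are all hypothetical.

```python
import statistics

def stable_index(samples, trim_fraction=0.1):
    """Trimmed mean: drop the lowest and highest trim_fraction of the samples.

    Assumption for illustration only, not Sandra's documented method.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    k = int(len(ordered) * trim_fraction)  # samples to drop at each end
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)

# Hypothetical bandwidth samples in MB/s; one run was disturbed by other activity.
runs = [1010.0, 1005.0, 990.0, 1000.0, 995.0, 400.0, 1002.0, 998.0, 1001.0, 999.0]

print("plain mean  :", statistics.fmean(runs))
print("stable index:", stable_index(runs))
```

A single disturbed run pulls the plain mean down noticeably, while the trimmed value stays close to the typical result. This is why many runs are needed, and why the benchmark takes longer.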

Q: Why is the memory index (i.e. using large blocks > L2/L3 cache) lower than the Memory Benchmark index?
A:
The index is lower because streaming, buffering and block pre-fetch are not used to increase performance. The same test is run regardless of block size, whereas an optimised test would apply different techniques to blocks that fit in the data caches and to blocks that go to memory.
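
The idea of running one unchanged test over every block size can be sketched in Python as below. This is only an illustration of the sweep structure, not Sandra's implementation: the function names, block sizes and `total_bytes` value are hypothetical, and in Python the interpreter overhead dominates for small blocks, so the figures are not comparable to the real benchmark.

```python
import time

def copy_bandwidth(block_bytes, total_bytes=8 * 1024 * 1024):
    """Copy one block repeatedly with the same routine and return MB/s."""
    src = bytearray(block_bytes)
    dst = bytearray(block_bytes)
    repeats = max(1, total_bytes // block_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src  # same copy routine whatever the block size
    elapsed = time.perf_counter() - start
    return (repeats * block_bytes) / (1024 * 1024) / max(elapsed, 1e-9)

def sweep(block_sizes):
    """Run the identical test for each block size; no per-size tuning."""
    return {size: copy_bandwidth(size) for size in block_sizes}

if __name__ == "__main__":
    sizes = [4 * 1024, 64 * 1024, 1024 * 1024, 16 * 1024 * 1024]
    for size, mbps in sweep(sizes).items():
        print(f"{size // 1024:>6} kB: {mbps:10.1f} MB/s")
```

Because nothing is tuned per block size, the results for blocks larger than the caches reflect untuned memory access rather than the best bandwidth the memory could deliver.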

The memory index should correspond to the legacy ALU/FPU tests in the Memory Benchmark. On modern systems you must disable EMMX/SSE/SSE2 instructions to fall back to these tests.

Q: Why doesn’t this module use streaming/buffering/block pre-fetch?
A:
These techniques are very useful when streaming large amounts of data, but not when small blocks are involved, as in this test.

Q: Why is there no MMX test?
A:
Both MMX and FPU work on 64 bits of data. Unless streaming instructions are used, there is no compelling reason to use MMX instead of FPU. Moreover, all the tests (as in the Memory Benchmark) use 64-bit floats, while MMX supports integer data only.

Q: Why does the P4 get such a boost from SSE(2) while the PIII does not get any?
A:
Large transfer sizes (128-bit) work better on the NetBurst architecture than smaller ones (32/64-bit). The PIII reaches its limit with normal 64-bit transfers. Put another way, the P4 needs SSE(2) code, not legacy code, to reach its full potential.
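
To put the transfer widths in perspective, the short Python sketch below counts how many transfers are needed to move one block at each width. The helper name `transfers_needed` and the 64 kB block size are chosen for illustration only. Fewer transfers do not by themselves guarantee higher bandwidth; the counts only show how much work each width implies.

```python
def transfers_needed(block_bytes, width_bits):
    """Number of width_bits-wide transfers needed to move block_bytes."""
    if width_bits <= 0 or width_bits % 8:
        raise ValueError("width must be a positive multiple of 8 bits")
    width_bytes = width_bits // 8
    return -(-block_bytes // width_bytes)  # ceiling division

block = 64 * 1024  # hypothetical 64 kB block
for width in (32, 64, 128):
    print(f"{width:>3}-bit: {transfers_needed(block, width)} transfers")
```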

 
