What is “SKL-X”?
“Skylake-X” (E/EP) is the server/workstation/HEDT version of the desktop/mobile Skylake CPU: the 6th-gen Core/Xeon replacing the current Haswell/Broadwell-E designs. It naturally does not contain an integrated GPU; what it does contain is more cores, more PCIe lanes and more memory channels (up to six 64-bit channels) for huge memory bandwidth.
While it may seem an “old core”, the 7th-gen Kabylake core is not much more than a stepping update, and even the future 8th-gen Coffeelake is rumored to use the very same core. What SKL-X does add is the much-anticipated 512-bit AVX512 instruction set (ISA), which is not enabled in the current desktop/mobile parts.
SKL-X supports not only DDR4 but also NVM-DIMMs (non-volatile memory DIMMs) and PMem (Persistent Memory), which should revolutionise future computing: no need for memory refresh, and immediate sleep/resume (no need to save/restore memory from storage).
In this article we test CPU cache and memory performance.
Hardware Specifications
We are comparing the top-end desktop Core i9 with current competing architectures from both AMD and Intel, as well as with its predecessor.
CPU Specifications | Intel i9 7900X (Skylake-X) | AMD Ryzen 1700X | Intel i7 6700K (Skylake) | Intel i7 5820K (Haswell-E) | Comments
--- | --- | --- | --- | --- | ---
TLB 4kB pages | 64 4-way / 64 8-way; 1536 8-way | 64 full-way; 1536 8-way | 64 4-way / 64 8-way; 1536 6-way | 64 4-way; 1024 8-way | Ryzen has comparatively ‘better’ TLBs than all Intel CPUs.
TLB 2MB pages | 8 full-way; 1536 2-way | 64 full-way; 1536 2-way | 8 full-way; 1536 6-way | 8 full-way; 1024 8-way | Again Ryzen has ‘better’ TLBs than all Intel versions.
Memory Controller Speed (MHz) | 800-3300 | 600-1200 | 800-4000 | 1200-4000 | Intel’s UNC clock runs higher than Ryzen.
Memory Speed (MHz) Max | 3200 / 2667 | 2400 / 2667 | 2533 / 2667 | 2133 / 2133 | SKL-X can officially go as high as Ryzen and normal SKL @ 2667 but runs happily at 3200 MT/s.
Memory Channels / Width | 4 / 256-bit (max 6 / 384-bit) | 2 / 128-bit | 2 / 128-bit | 4 / 256-bit | SKL-X has 2 memory controllers, each with up to 3 channels, for massive memory bandwidth.
Memory Timing (clocks) | 16-18-18-36 6-54-19-4 2T | 14-16-16-32 7-54-18-9 2T | 16-18-18-36 5-54-21-10 2T | 14-15-15-36 4-51-16-3 2T | SKL-X can run timings as tight as normal SKL or Ryzen.
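To put the memory rows of the table into more familiar units, the sketch below derives theoretical peak DRAM bandwidth, CAS latency in nanoseconds and L2 TLB "reach" from the listed figures. It rests on a few assumptions that the article does not state explicitly: that the first value in each "Memory Speed" cell is the tested transfer rate in MT/s, that the first number in each timing string is the CAS latency in memory clocks, and that the second TLB figure in each cell is the L2 TLB entry count. The function names are purely illustrative, and the results are theoretical ceilings, not measured values.

```python
# Theoretical figures derived from the specification table.
# ASSUMPTIONS (not confirmed by the article): first "Memory Speed" value is
# the tested rate in MT/s; first timing number is CAS latency in clocks;
# second TLB figure is the L2 TLB entry count.

def peak_bandwidth_gbs(channels, mt_per_s, channel_bits=64):
    """Peak DRAM bandwidth in GB/s: channels x bytes per transfer x MT/s."""
    return channels * (channel_bits // 8) * mt_per_s * 1e6 / 1e9

def cas_latency_ns(cas_clocks, mt_per_s):
    """CAS latency in ns. DDR moves two transfers per clock, so clock MHz = MT/s / 2."""
    clock_mhz = mt_per_s / 2
    return cas_clocks / clock_mhz * 1e3

def tlb_reach_bytes(entries, page_bytes):
    """Amount of memory a TLB can map without a miss."""
    return entries * page_bytes

# name: (channels, tested MT/s, CAS clocks, L2 TLB entries), read off the table
systems = {
    "i9 7900X (SKL-X)": (4, 3200, 16, 1536),
    "Ryzen 1700X":      (2, 2400, 14, 1536),
    "i7 6700K (SKL)":   (2, 2533, 16, 1536),
    "i7 5820K (HSW-E)": (4, 2133, 14, 1024),
}

if __name__ == "__main__":
    for name, (channels, rate, cas, tlb_entries) in systems.items():
        bandwidth = peak_bandwidth_gbs(channels, rate)
        cas_ns = cas_latency_ns(cas, rate)
        reach_mib = tlb_reach_bytes(tlb_entries, 4 * 1024) / (1 << 20)
        print(f"{name:18} {bandwidth:6.1f} GB/s peak, "
              f"CAS {cas_ns:5.2f} ns, 4kB-page TLB reach {reach_mib:.0f} MiB")
```

Under these assumptions the quad-channel SKL-X configuration has a ceiling of about 102 GB/s against roughly 38 GB/s for dual-channel Ryzen at 2400 MT/s, which is the gap the "Memory Channels / Width" row is pointing at.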
Core Topology and Testing
Intel has dropped the (dual) ring bus(es) and instead opted for a mesh interconnect between cores. On desktop parts this should not cause latency differences between cores (as happens with Ryzen), but on high-end server parts with many cores (up to 28) this may not be the case. The much larger L2 cache (1MB vs. the old 256kB) should alleviate this issue, though the L3 cache seems to have been reduced quite a bit.
Native Performance
We are testing bandwidth and latency performance using all the available SIMD instruction sets (AVX, AVX2/FMA, AVX512) supported by the CPUs.
Results Interpretation: Higher values (GOPS, MB/s, etc.) mean better performance.
Environment: Windows 10 x64, latest AMD and Intel drivers. Turbo / Dynamic Overclocking was enabled on both configurations.
If there was any doubt, SKL-X does not disappoint: aggregate cache (L1D and L2) and memory bandwidths are massive, and server versions will likely deliver even more. The smaller L3 cache does falter, though, which is a bit of a surprise; the larger L2 caches must have forced some compromises.
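For readers who want a feel for how bandwidth changes with working-set size, here is a minimal single-threaded sketch. It is not the benchmark used for the results discussed here, which is native, multi-threaded SIMD code; it only times bulk buffer copies of different sizes so that the buffers land in L1D, L2, L3 or main memory. The function name, the chosen sizes and the convention of counting both the bytes read and the bytes written are assumptions made for illustration.

```python
import time

def copy_bandwidth_gbs(size_bytes, min_seconds=0.05, batch=8):
    """Copy one buffer into another repeatedly and return GB/s.

    Counts read + write traffic (2 x size per copy), which is one common
    convention; the article's benchmark may count differently.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    copies = 0
    start = time.perf_counter()
    while True:
        for _ in range(batch):
            dst[:] = src      # whole-buffer slice assignment, a bulk copy in CPython
        copies += batch
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            break
    return 2 * size_bytes * copies / elapsed / 1e9

if __name__ == "__main__":
    # Per-buffer sizes chosen (as an assumption) to fall in L1D, L2, L3 and DRAM.
    for size in (16 << 10, 256 << 10, 4 << 20, 64 << 20):
        print(f"{size >> 10:>8} kB per buffer: {copy_bandwidth_gbs(size):8.1f} GB/s")
```

For the smallest buffers, interpreter call overhead is a noticeable share of each copy, so the numbers understate what the caches can deliver; treat the output as showing the shape of the curve rather than absolute values.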
Latency is a bit disappointing compared to the “normal” SKL/KBL we have on desktop, but is still better than the older HSW-E and the competing Ryzen. Again, the clock latencies of the L1 and L2 caches are OK (despite the L2 being 4 times bigger); the L3 and the memory controller are the source of the increased latencies.
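Latency tests of this kind are commonly built on pointer chasing: every load depends on the result of the previous one, so neither the prefetchers nor out-of-order execution can hide the access time. The sketch below illustrates that access pattern only and is an assumption about the general technique, not a description of the benchmark used here. It builds a random single-cycle chain with Sattolo's algorithm, walks it, and includes a helper for converting nanoseconds to core clocks. All names are hypothetical. In Python the interpreter overhead per hop is far larger than any cache latency, so a native implementation (typically with one pointer per 64-byte cache line) would be needed to reproduce figures like those discussed above.

```python
import random
import time
from array import array

def build_chain(n, seed=0):
    """Random single-cycle permutation (Sattolo's algorithm); chain[i] is the next index."""
    rng = random.Random(seed)
    chain = array("q", range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)          # j < i guarantees one cycle through all n slots
        chain[i], chain[j] = chain[j], chain[i]
    return chain

def chase(chain, hops):
    """Follow the chain for `hops` dependent loads; return (final index, ns per hop)."""
    idx = 0
    start = time.perf_counter()
    for _ in range(hops):
        idx = chain[idx]              # the next address depends on this load
    elapsed = time.perf_counter() - start
    return idx, elapsed / hops * 1e9

def ns_to_clocks(latency_ns, core_ghz):
    """Convert a latency in ns to core clocks at the given frequency."""
    return latency_ns * core_ghz

if __name__ == "__main__":
    # 8-byte entries: 4K, 128K and 1M entries give 32 kB, 1 MB and 8 MB chains.
    for entries in (1 << 12, 1 << 17, 1 << 20):
        chain = build_chain(entries)
        _, ns_per_hop = chase(chain, 200_000)
        print(f"{entries * 8 >> 10:>6} kB chain: {ns_per_hop:7.1f} ns/hop "
              "(dominated by interpreter overhead)")
```

The single-cycle property matters: an ordinary shuffle can produce short loops that stay inside a small part of the buffer, which would make a large working set look as fast as a small one.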
SiSoftware Official Ranker Scores
Final Thoughts / Conclusions
After the strong CPU performance we did not expect the cache and memory performance to disappoint, and it does not. SKL-X is a big improvement over the older versions (HSW-E) and the competition, with few weaknesses.
The mesh interconnect does seem to exhibit higher inter-core latencies with only a small increase in bandwidth; perhaps this can be fixed.
The much-reduced L3 cache does disappoint in both bandwidth and latency; the memory controllers provide huge bandwidth but at the expense of higher latencies.
All in all, if you can afford it, there is no question that SKL-X is worth it. But it is better to wait and see what AMD’s Threadripper has in store before making your choice… 😉