nVidia RTX 3090, 3080: Ampere GPGPU performance in CUDA and OpenCL

nVidia 3090 RTX (Ampere)

What is “Ampere”? It is the latest arch(itecture) (SM8.x) from nVidia launching with the new Series 30 mainstream cards (RTX 3090, 3080 and soon 3070, 3060) cards, a major update from the previous “Turing”/”Volta” (Series 20, SM 7.x). A “Titan” pro-sumer version will also launch soon while the data-center A100 … Read more…

nVidia Titan RTX / 2080Ti: Turing GPGPU performance in CUDA and OpenCL

nVidia RTX 2080 TI (Turing)

What is “Titan RTX / 2080Ti”? It is the latest high-end “pro-sumer” card from nVidia with the next-generation “Turing” architecture, the update to the current “Volta” architecture that has had a limited release in Titan/Quadro cards. It powers the new Series 20 top-end (with RTX) and Series 16 mainstream (without … Read more…

nVidia Titan V/X: FP16 and Tensor CUDA Performance

What is FP16 (“half”)? FP16 (aka “half” floating-point) is the IEEE lower-precision floating-point representation that has recently begun to be supported by GPGPUs for compute (e.g. Intel EV9+ Skylake GPU, nVidia Pascal/Turing) and soon by CPUs (BFloat16). While originally meant for mobile devices in order to reduce memory and compute … Read more…

nVidia Titan V: Volta GPGPU performance in CUDA and OpenCL

What is “Titan V”? It is the latest high-end “pro-sumer” card from nVidia with the next-generation “Volta” architecture, the next generation to the current “Pascal” architecture on the Series 10 cards. Based on the top-end 100 chipset (not lower 102 or 104) it boasts full speed FP64/FP16 performance as well … Read more…

nVidia Titan X: Pascal GPGPU Performance in CUDA and OpenCL

What is “Titan X (Pascal)”? It is the current high-end “pro-sumer” card from nVidia using the current generation “Pascal” architecture – equivalent to the Series 10 cards. It is based on the 2nd-from-the-top 102 chipset (not the top-end 100) thus it does not feature full speed FP64/FP16 performance that is … Read more…