Why Are Nvidia's T4 Tensor Core Cards Special?


Published: 5-10-2021



We’ve explained what “Tensor cores” are in a previous blog post, but what you may not know is that Nvidia keeps working to make these already blistering GPU cores even faster. The T4 PCIe card was introduced in 2018, but it’s still a product that many people have never heard of. Yet it can play a crucial part in modern data centers.

What Does the T4 Do?

The T4 is designed to accelerate the tensor math typically used in machine learning and other high-performance workloads. It can also perform general-purpose GPU computing using its CUDA cores, so anything written to use CUDA on regular GPUs should work here as well.
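To make “tensor math” concrete: the core operation Tensor Cores accelerate is a matrix multiply-accumulate, D = A × B + C, applied to small matrix tiles. The pure-Python sketch below shows that operation on a tiny 2×2 example. The function name `matmul_accumulate` is made up for illustration, and real applications reach the hardware through CUDA libraries or a machine learning framework rather than loops like these.

```python
def matmul_accumulate(a, b, c):
    """Return D = A x B + C for matrices given as lists of rows.

    Illustrative only: this runs on the CPU and simply shows the
    arithmetic that tensor hardware performs on small tiles.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("A's column count must match B's row count")
    if len(c) != rows or any(len(row) != cols for row in c):
        raise ValueError("C must have the same shape as A x B")

    # Start from a copy of C, then accumulate the products into it.
    d = [row[:] for row in c]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                d[i][j] += a[i][k] * b[k][j]
    return d


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 0], [0, 1]]
print(matmul_accumulate(a, b, c))  # prints [[20, 22], [43, 51]]
```

A neural network layer is largely many of these multiply-accumulate steps in a row, which is why hardware dedicated to them pays off for machine learning.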


What the T4 doesn’t do is connect to a display and act as a graphics card. It has no back-panel I/O at all and is designed to be installed in low-profile server systems.

The T4’s Specs

The T4 card is essentially what you get when you take an Nvidia RTX GPU and remove the graphics-card features, such as display outputs. What’s left are the CUDA cores, the dedicated Tensor Cores and the ray-tracing acceleration hardware found in RTX GPUs. The T4 is based on the Turing architecture specifically, so its tensor and ray-tracing hardware matches that of equivalent RTX cards on a per-core basis.


The T4 in particular has:


  • 2560 CUDA cores.

  • 320 Turing Tensor cores.

  • 16GB of GDDR6 with ECC.


Compared to a regular x86 CPU typically used in servers, the T4 is much, much faster at processing jobs such as training neural nets or drawing inferences from data.


It’s about more than just hardware performance, however. The T4 card is passively cooled, low-profile and draws only 70W at its peak. In environments where the energy cost of computation is a major factor, this can make it far cheaper to run machine learning and GPGPU tasks on something like the T4 than on a typical server-grade CPU.
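As a rough illustration of the energy argument, the sketch below estimates what it costs to keep a device running at a given power draw for a year. Only the 70W figure comes from the T4’s peak draw; the 200W CPU figure and the $0.12 per kWh electricity price are assumptions chosen purely for the example, and the function name `annual_energy_cost` is hypothetical.

```python
def annual_energy_cost(watts, price_per_kwh, hours_per_day=24.0, days=365):
    """Estimate the yearly electricity cost of a device at constant draw."""
    kwh = watts / 1000.0 * hours_per_day * days
    return kwh * price_per_kwh


T4_PEAK_WATTS = 70            # peak draw quoted for the T4
ASSUMED_CPU_WATTS = 200       # assumption: not a measured figure
ASSUMED_PRICE_PER_KWH = 0.12  # assumption: price in dollars per kWh

t4_cost = annual_energy_cost(T4_PEAK_WATTS, ASSUMED_PRICE_PER_KWH)
cpu_cost = annual_energy_cost(ASSUMED_CPU_WATTS, ASSUMED_PRICE_PER_KWH)

print(f"T4 at peak, all year:  ${t4_cost:.2f}")
print(f"Assumed CPU, all year: ${cpu_cost:.2f}")
```

Power draw is only half of the comparison. The cost per job also depends on how quickly each device finishes the work, so a fair estimate would multiply these figures by measured run times for your own workload.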

Who is the T4 For?

The T4 has a place both in server systems and in workstations, especially workstations used for machine learning tasks such as creating deep fakes or upscaling footage with high-end AI upscalers. The use cases for machine-learning acceleration are growing by the day, and adding a dedicated card to handle that work while you keep working in the foreground could be a cost-effective way to boost your available processing power.


For server owners in data centers, or perhaps just in SMEs or creative groups who need to share processing time, T4 cards and the like offer a way to accelerate offline rendering or machine-learning workloads in a small package. Many servers already have several low-profile PCIe slots to spare, which means that T4s can act as in-place upgrades and free up traditional CPUs for other tasks.

A Niche Card With Wide Applications

While a headless GPU packed with specialized silicon isn’t a component we’d recommend to every customer, the application of machine learning methods and the need to perform tensor math is growing rapidly. The T4 makes a whole lot of sense for a surprisingly large number of consumers.






LIST OF COMPATIBLE WORKSTATIONS


Titan W422 Octane - Intel Xeon W Cascade Lake - VR Design - CUDA GPU Rendering Workstation PC up to 18 Cores

There are plenty of high-end computers out there that boast multiple GPUs. What they don’t tell you is that just because you can put a whole bunch of GPUs into a computer doesn’t mean you’ll get the full benefit. The Titan W422 Octane has no such problems. We’ve carefully chosen the best processor and motherboard combination to let you wring as much performance from your multi-GPU setup as possible. In fact, every component in this workstation has been chosen around the concept of multi-GPU computing. If you’re looking to crunch some GPU workloads, this is where to start.



Starting Price: $5,100.00
Titan A499 OCTANE PRO - AMD Ryzen Threadripper Pro Workstation PC up to 64 Cores

There are plenty of computers out there with “Pro” tacked on to the end of their names, but none like the Titan A499 Octane Pro deliver an equal amount of CPU threads and performance at a price that won’t make you feel like you’re paying for a brand. Like all Titan Workstations, this system is built using top grade internal components to ensure you get the maximum performance out of your processor.




Starting Price: $5,195.00
Titan X550 - Dual CPUs Intel Xeon Scalable Quad Quadro / Tesla GPU Computing Server up to 56 Cores

You want compute power and you want it now. Whether you need GPU performance, CPU performance or both, the X550 brings the cutting edge of technology straight to your desktop or server rack. Offering a staggering dual Xeon Scalable and Quad GPU configuration, this is one of the most serious number-crunching machines money can buy.




Starting Price: $5,982.00
Titan A575 - Up to 8x NVIDIA Multi GPUs Computing Server w/ Dual AMD Epyc and up to 128 Cores

The A575 is a multi threaded, multi GPU capable system with the option to install up to 8 dual slot GPUs. Perfect for those who want GPU Supercomputing ability in a convenient rack-mounted form, the Titan A575 is a uniquely designed, flexible parallel processing workstation server. Up to 128 multi-threaded AMD Epyc cores and eight GPUs mean no compromises for medical, nuclear, oil & gas or render farm parallel computing applications.





Starting Price: $8,450.00
Titan X575 - Up to 10x NVIDIA Multi GPUs Computing Server w/ Dual Xeon Scalable and up to 56 Cores

The X575 is a multi threaded, multi GPU capable system with the option to install up to 10 dual slot GPUs. Perfect for those who want GPU Supercomputing ability in a convenient rack-mounted form, the Titan X575 is a uniquely designed, flexible parallel processing workstation server. Up to 56 hyper-threaded Intel cores and ten GPUs mean no compromises for medical, nuclear, oil & gas or render farm parallel computing applications.





Starting Price: $8,950.00
   
 