



Why Are Nvidia's T4 Tensor Core Cards Special?


Published: 5-10-2021



We’ve explained what “Tensor cores” are in a previous blog post, but what you may not know is that Nvidia is working hard to make these already blisteringly fast cores even faster. The T4 PCIe card was first introduced in 2018, but many people have still never heard of it. Yet it can play a crucial part in modern data centers.

What Does the T4 Do?

The T4 is designed to accelerate the tensor math at the heart of machine learning applications and other high-performance workloads. It can also perform general-purpose GPU computing tasks using its CUDA cores, so anything that’s been written to use CUDA on regular GPUs should work here as well.


What the T4 doesn’t do is connect to a display device and act as a graphics card. It doesn’t have any back-panel IO at all and is designed to be installed in low-profile server systems.

The T4’s Specs

The T4 card is essentially what you get when you take an Nvidia RTX GPU and remove the graphics-card features, such as display outputs. What you have left over are the CUDA cores, dedicated Tensor Cores and the ray-tracing acceleration hardware found in RTX GPUs. The T4 is based on the Turing architecture specifically, so its tensor and ray-tracing hardware matches that of equivalent RTX cards on a per-core basis.


The T4 in particular has:


  • 2560 CUDA cores.

  • 320 Turing Tensor cores.

  • 16GB of GDDR6 with ECC.
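As a quick sanity check on those numbers, here is a minimal sketch of how the two core counts line up. The per-SM figures (64 CUDA cores and 8 Tensor cores per Turing streaming multiprocessor) are not stated in this article; they are an assumption taken from Nvidia's published description of Turing, and the variable names are ours.

```python
# Core counts quoted in the article for the T4.
CUDA_CORES = 2560
TENSOR_CORES = 320

# Assumption (not from the article): each Turing streaming multiprocessor (SM)
# contains 64 CUDA cores and 8 Tensor cores.
CUDA_CORES_PER_SM = 64
TENSOR_CORES_PER_SM = 8

# Both core counts should imply the same number of SMs.
sm_from_cuda = CUDA_CORES // CUDA_CORES_PER_SM
sm_from_tensor = TENSOR_CORES // TENSOR_CORES_PER_SM

print(f"SMs implied by CUDA core count:   {sm_from_cuda}")
print(f"SMs implied by Tensor core count: {sm_from_tensor}")
```

Both counts point to the same number of SMs, which is what you would expect if the T4 uses the same building blocks as Turing-based RTX cards.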


Compared to a regular x86 CPU typically used in servers, the T4 is much faster at highly parallel jobs such as training neural nets or drawing inferences from data.


It’s about more than just hardware performance, however. The T4 card is passively cooled, low-profile and draws only 70W at its peak. In environments where the energy cost of computation is a major factor, this can make it far cheaper to run machine learning and GPGPU tasks on something like the T4 than on a typical server-grade CPU, because the card finishes more work for every watt it draws.
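To put the 70W figure in perspective, here is a rough back-of-the-envelope sketch of what it costs to run a device flat out for a year. Only the 70W peak draw comes from this article; the 205W CPU figure and the $0.12 per kWh electricity price are hypothetical placeholders, so substitute your own numbers.

```python
def annual_energy_cost(watts, price_per_kwh, hours_per_day=24.0, days=365):
    """Return the electricity cost of running a device at a constant power draw."""
    kwh = watts / 1000.0 * hours_per_day * days
    return kwh * price_per_kwh


T4_PEAK_WATTS = 70        # peak draw quoted in the article
ASSUMED_CPU_WATTS = 205   # hypothetical server CPU power draw, for comparison only
ASSUMED_PRICE = 0.12      # hypothetical electricity price in USD per kWh

t4_cost = annual_energy_cost(T4_PEAK_WATTS, ASSUMED_PRICE)
cpu_cost = annual_energy_cost(ASSUMED_CPU_WATTS, ASSUMED_PRICE)

print(f"T4 at peak, all year:          ${t4_cost:.2f}")
print(f"Assumed CPU at peak, all year: ${cpu_cost:.2f}")
```

Bear in mind that this only compares raw power draw. The real saving depends on how much work each device completes per watt, which varies from one workload to the next.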

Who is the T4 For?

The T4 has a place both in server systems and in workstations. It is especially useful in workstations where you need to do machine learning tasks such as creating deep fakes or upscaling footage with high-end AI upscalers. The use cases for machine-learning acceleration are growing by the day, and adding a dedicated card to handle that work while you keep working in the foreground could be a cost-effective way to boost your available processing power.


For server owners in data centers, or in SMEs and creative groups that need to share processing time, T4 cards and the like offer a way to accelerate offline rendering or machine learning workloads in a small package. Many servers already have several low-profile PCIe slots to spare, which means that T4s can act as in-place upgrades and free up traditional CPUs for other tasks.

A Niche Card With Wide Applications

While a headless GPU packed with specialized silicon isn’t a component we’d recommend to every customer, the application of machine learning methods and the need to perform tensor math is growing rapidly. The T4 makes a whole lot of sense for a surprisingly large number of consumers.






LIST OF COMPATIBLE WORKSTATIONS


Titan W422 Octane - Intel Xeon W-2200 Series Processors Workstation PC for VR Design, CUDA GPU Rendering up to 18 CPU Cores

There are plenty of high-end computers out there that boast multiple GPUs. What they don’t tell you is that just because you can put a whole bunch of GPUs into a computer doesn’t mean you’ll get the full benefit. The Titan W422 Octane has no such problems. We’ve carefully chosen the best processor and motherboard combination to let you wring as much performance from your multi-GPU setup as possible. In fact, every component in this workstation has been chosen around the concept of multi-GPU computing. If you’re looking to crunch some GPU workloads, this is where to start.



Starting Price: $4,995.00
Titan A499 OCTANE PRO - AMD Ryzen Threadripper Pro 5000 WX Series Workstation PC - up to 64 cores

There are plenty of computers out there with “Pro” tacked on to the end of their names, but none deliver the CPU thread count and performance of the Titan A499 Octane Pro at a price that won’t make you feel like you’re paying for a brand. Like all Titan Workstations, this system is built using top-grade internal components to ensure you get the maximum performance out of your processor.




Starting Price: $5,974.25
Titan X550 - Dual 2nd Gen Intel Xeon Scalable Processors Workstation PC For High CPU / GPU Computing Server up to 56 CPU Cores

You want compute power and you want it now. Whether you need GPU performance, CPU performance or both, the X550 brings the cutting edge of technology straight to your desktop or server rack. Offering a staggering dual Xeon Scalable and quad-GPU configuration, this is one of the most serious number-crunching machines money can buy.




Starting Price: $6,880.00
Titan A600 - Dual AMD EPYC Milan 8 GPU Server PC for AI / Deep Learning HPC up to 128 cores - Supermicro 4124GS-TNR

AI and Machine Learning are quickly becoming the most important fields for high-performance computing. More and more of our customers have asked for dedicated AI and Machine Learning systems, which is where the Titan A600 comes in. Even HAL 9000 would be intimidated by this much Deep Learning power!





Starting Price: $9,717.50
Titan X575 - Dual 2nd Gen Intel Xeon Scalable Processors Server Computer for up to 10x NVIDIA Video Cards / GPUs and up to 56 CPU Cores

The X575 is a multi-threaded, multi-GPU system with the option to install up to 10 dual-slot GPUs. Perfect for those who want GPU supercomputing ability in a convenient rack-mounted form, the Titan X575 is a uniquely designed, flexible parallel processing workstation server. Up to 56 hyper-threaded Intel cores and ten GPUs mean no compromises for medical, nuclear, oil & gas or render farm parallel computing applications.





Starting Price: $10,295.00
   
 