What Are RT Cores in Nvidia GPUs?

Published: 6-2-2021


The latest GPUs from Nvidia have special hardware inside that accelerates ray-traced graphics, allowing them to be rendered in real time. These RT (ray tracing) cores have pushed what’s possible in real-time rendering to new heights, but what are RT cores really, and how do they work?

The Parts of an RTX GPU

An Nvidia RTX GPU, which is the product series in question here, has three main types of processor.

The first are its CUDA cores. These are general computation cores. Well, as “general” as modern GPU cores can be. They are small, relatively simple processors. There are thousands of them working in parallel on modern GPUs. These are the processors that work out how to shade each pixel you see on screen and pull off all the other effects you see in modern traditional graphics.

Next, RTX cards have Tensor cores. We’ve written about Tensor cores before, but in short they are built to do a kind of math known as tensor math, which largely comes down to multiplying and adding matrices of numbers. These calculations are fundamental to machine learning and artificial intelligence, particularly for neural networks. These cores can be used to accelerate machine learning software in any realm that uses tensor math, but they also play a role in graphics. Nvidia uses them to clean up ray-traced images and also to intelligently upsample images rendered at lower resolutions using a technology known as DLSS (Deep Learning Super Sampling).
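To make "tensor math" a little more concrete, here is a minimal sketch of the multiply-and-add operation that Tensor cores are commonly described as accelerating: D = A × B + C on small matrices. The function name `matmul_add` is made up for this illustration, and plain Python is nothing like how the hardware does it; it only shows the arithmetic.

```python
def matmul_add(a, b, c):
    """Return a * b + c for matrices stored as lists of rows.

    Hypothetical helper for illustration only: real Tensor cores do this
    on small tiles in hardware, many tiles at a time.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) + c[i][j]
         for j in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
d = matmul_add(a, b, c)  # [[20, 23], [44, 51]]
```

A neural network layer is, roughly speaking, a very large number of these operations, which is why dedicated hardware for them pays off.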

Finally, we get to the RT cores this post is about. They have the job of doing the math of ray tracing as quickly as possible: fast enough to show a moving image on screen at playable frame rates. But hang on a second, what is ray tracing to begin with?

Ray Tracing in a Nutshell

In real life, what you see is the result of photons of different wavelengths hitting your retina after being gathered and focused there by the lens of your eye.

Before those photons enter your eye, they’ve been bouncing around the world, interacting with all the objects around you. That’s how the scene around you is constructed: by photons bouncing around, interacting with objects, being absorbed or reflected, and then coming to rest with you.

Traditionally, real-time 3D computer graphics have not been rendered in a way that’s anything like this. Why? Because simulating the way light works is incredibly computationally intensive. Ray tracing has been used extensively for offline rendering, where one frame may take hours to compute. That’s how Hollywood’s animated CG blockbusters and the visual effects for live-action titles are made.

RT cores specifically accelerate the key math needed to trace virtual rays of light through a scene. With ray tracing, though, the rays are actually fired from the “eye” (the virtual camera) into the scene, which is obviously not how we see in real life. Within the simulation, however, the result is more or less the same.
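Firing rays from the "eye" can be sketched in a few lines. The example below assumes a simple pinhole camera sitting at the origin and looking down the negative Z axis; the function name `primary_ray` and the 90-degree field of view are choices made for this illustration, not anything specific to Nvidia's hardware.

```python
import math


def primary_ray(px, py, width, height, fov_deg=90.0):
    """Return a unit direction for the ray through pixel (px, py).

    Assumption for this sketch: pinhole camera at the origin, looking
    down -Z, with pixel (0, 0) in the top-left corner of the image.
    """
    aspect = width / height
    scale = math.tan(math.radians(fov_deg) / 2.0)
    # Map the pixel centre to screen coordinates in the range [-1, 1].
    x = (2.0 * (px + 0.5) / width - 1.0) * aspect * scale
    y = (1.0 - 2.0 * (py + 0.5) / height) * scale
    length = math.sqrt(x * x + y * y + 1.0)
    return (x / length, y / length, -1.0 / length)


# The middle pixel of a 3x3 image looks straight ahead, down -Z.
centre = primary_ray(1, 1, 3, 3)
```

One ray like this is generated for every pixel (often several), and each one then has to be tested against the scene, which is the part the RT cores speed up.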

RT Cores are ASICs

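RT cores are an example of ASIC-style, application-specific hardware (ASIC stands for application-specific integrated circuit): circuitry built to do one job very quickly rather than general-purpose computation. You may have heard of ASICs in the context of cryptocurrency, where chips are designed to process only the cryptographic math of one specific coin.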

In short, RT cores are extra circuits that sit alongside the more general-purpose CUDA cores and can be brought into the rendering pipeline when a ray-tracing calculation comes along.

The CUDA cores hand off that job to the RT cores and then use the resulting answers to the ray-tracing math to render the scene and correctly shade the pixels in front of your eyeballs.

What Do RT Cores Actually Accelerate?

But we can go into a little more detail than that! RT cores aren’t actually doing the full-fat job of ray tracing. Nvidia has found a less computationally intense way of quickly calculating light ray bounces around the scene.

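Scene geometry is organized into a data structure known as a BVH (Bounding Volume Hierarchy). It’s a tree of nested bounding volumes, typically boxes, in which each volume encloses a group of objects (or smaller volumes) in the scene. If a ray misses a volume, everything inside it can be skipped.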
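As a rough illustration of what a BVH looks like in code, here is a minimal sketch that builds one over a list of axis-aligned boxes by repeatedly splitting the objects in half along the longest axis. The names `BVHNode`, `union` and `build_bvh`, and the median-split strategy, are assumptions for this example; production builders in drivers and game engines are far more sophisticated.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner)


@dataclass
class BVHNode:
    box: Box                            # bounds of everything below this node
    left: Optional["BVHNode"] = None
    right: Optional["BVHNode"] = None
    items: Optional[List[int]] = None   # object indices, set on leaves only


def union(a: Box, b: Box) -> Box:
    """Smallest box that encloses both a and b."""
    return (tuple(min(a[0][i], b[0][i]) for i in range(3)),
            tuple(max(a[1][i], b[1][i]) for i in range(3)))


def build_bvh(boxes: List[Box], indices: Optional[List[int]] = None,
              leaf_size: int = 2) -> BVHNode:
    """Build a BVH over a non-empty list of object bounding boxes."""
    if indices is None:
        indices = list(range(len(boxes)))
    bounds = boxes[indices[0]]
    for i in indices[1:]:
        bounds = union(bounds, boxes[i])
    if len(indices) <= leaf_size:
        return BVHNode(box=bounds, items=indices)
    # Split along the longest axis, at the median of the box centres.
    extent = [bounds[1][a] - bounds[0][a] for a in range(3)]
    axis = extent.index(max(extent))
    ordered = sorted(indices, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(ordered) // 2
    return BVHNode(box=bounds,
                   left=build_bvh(boxes, ordered[:mid], leaf_size),
                   right=build_bvh(boxes, ordered[mid:], leaf_size))


# Four unit cubes in a row along the X axis.
cubes = [((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0.0, 2.0, 4.0, 6.0)]
root = build_bvh(cubes)
```

Here the root box covers all four cubes, and its two children each cover a pair of them, so a ray that misses the left child's box never needs to be tested against the first two cubes at all.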

The RT cores actually look for ray intersections within this BVH structure. Whether a ray intersects anything, according to tests against the BVH, feeds into the shading of the relevant pixel. Each test is relatively simple, but the RT cores can do them in massive volume and at incredible speed.
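To give a feel for how simple each individual test is, here is a sketch of the two classic intersection tests used when walking a BVH: a ray against a bounding box (the "slab" test) and a ray against a triangle (the Möller-Trumbore algorithm). These are textbook software versions with made-up function names, shown on the assumption that they resemble the kind of work the hardware does; they are not Nvidia's implementation.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does the ray (t >= 0) pass through the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Parallel to this pair of planes: miss unless we start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Möller-Trumbore: return the hit distance t, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


eye, forward = (0.0, 0.0, 5.0), (0.0, 0.0, -1.0)
in_box = ray_hits_box(eye, forward, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
hit_t = ray_hits_triangle(eye, forward,
                          (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```

Each test is only a handful of multiplications, additions and comparisons. The cost comes from having to run enormous numbers of them every frame, which is why moving them into dedicated circuits makes such a difference.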

On its own, however, this approach is fairly low-fidelity and results in a grainy image, since only a limited number of rays can be traced per pixel in real time. This is where the Tensor cores come in, applying a machine-learning denoiser in real time to clean up the picture.
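The graininess can be illustrated with a toy experiment. In the sketch below, every pixel of a flat grey surface should come out at brightness 0.5, but each simulated ray randomly either reaches the light or is blocked. With very few rays per pixel, neighbouring pixels disagree a lot, which shows up as grain. The scene, the function names and the sample counts are all invented for this example, and it does not model the machine-learning denoiser itself.

```python
import random
import statistics


def render_pixel(rng, samples, true_brightness=0.5):
    """Average 'samples' random rays; each one is either lit (1) or blocked (0)."""
    hits = sum(1.0 if rng.random() < true_brightness else 0.0
               for _ in range(samples))
    return hits / samples


def image_noise(samples, pixels=1000, seed=0):
    """Spread of pixel values across an image that should be uniformly grey."""
    rng = random.Random(seed)
    values = [render_pixel(rng, samples) for _ in range(pixels)]
    return statistics.pstdev(values)


noise_few_rays = image_noise(samples=1)    # what a real-time budget might allow
noise_many_rays = image_noise(samples=64)  # closer to an offline render
```

The spread of pixel values with one ray per pixel is several times larger than with 64. Real-time rendering cannot afford the extra rays, so a denoiser is used to estimate what the smooth, many-ray image would have looked like.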

That’s what those RT cores do, explained simply enough that even we can understand it!







LIST OF COMPATIBLE WORKSTATIONS


Titan S64 - Intel Xeon W-3300 Series Processors 4U Rackmount Workstation PC for AI, Deep Learning up to 38 CPU Cores


Machine learning is undoubtedly a crucial part of software development as well as providing AI cloud services for a wide variety of application types. The latest server CPUs are packed with specialized machine learning hardware acceleration, and the Titan S64 brings you a rack-mounted AI powerhouse solution in an affordable and compact package.





4U Rackmount Workstation / Server Computer for:

Animation and Modeling • Design & Visualization • 3D Rendering • Deep Learning • Data Analysis • AI • Machine Learning • Media / Video Streaming • CGI

Starting Price: $4,995.00
Titan W64 Octane - Intel Xeon W-3300 Series Processors Workstation PC for AI, Deep Learning up to 38 CPU Cores


Built on Intel Xeon W-3300 series CPU technology with Ice Lake in its veins, the Titan W64 Octane is a cool computer with some hot performance numbers. It is a CPU monster with the new Intel Deep Learning Boost (Intel DL Boost), but it can double as a titan of GPU-centric workloads as well, all thanks to the latest, cutting-edge workstation CPU technology from Intel. On the new Xeon W, they added more pipes! Specifically, there are now 64 dedicated PCIe lanes. That means four x16 slots for four full-speed, high-end GPUs.





Intel Xeon W-3300 Series Processors Workstation Computer for:


GPU Parallel Computing • Deep Learning • AI • 3D Modeling • Engineering • Computer Animation • Video Editing • Design & Visualization • Rendering

Starting Price: $5,345.00
Titan A790 - AMD Ryzen Threadripper Pro 7000 Series Workstation PC - up to 96 cores


Our Titan A790 OCTANE PRO is a high-performance workstation PC powered by the AMD Ryzen Threadripper Pro 7000 Series, capable of handling up to 96 cores. This workstation is designed for professionals requiring extreme computing power at a highly-competitive price.





AMD Threadripper Pro 7000 Series Workstation Computer for:

3D Rendering • CAD/CAM • Product Design • 3D Modeling • CGI • Computer Animation • Video Editing • Design & Visualization • Machine Learning • Fluid Dynamics

Starting Price: $5,585.00
Titan W599 Octane - Dual 2nd Gen Intel Xeon Scalable Processors Workstation PC for Quad GPU CUDA 3D Rendering and Simulations up to 56 CPU Cores


We’re always in a race to build the most powerful machines for every use case and budget point, which means we can never take a break from trying to improve on our very best-performing machines. With this latest-generation Titan W599 Octane, the envelope of what we can build has been pushed again. The W599 is one of our most powerful and most customized rigs to date. It’s designed to cover both CPU and GPU processing needs using a new "Six Channel Memory Configuration". So if you’re thinking about entering the world of high-performance desktop computing, this should be your first stop.







Dual 2nd Gen Intel Xeon Scalable Processors Workstation Computer for:

GPU Parallel Computing • Data Processing • 3D Modeling • Engineering • Computer Animation • Video Editing • Design & Visualization • Rendering

Starting Price: $5,665.00
Titan A499 OCTANE PRO - AMD Ryzen Threadripper Pro 5000 WX Series Workstation PC - up to 64 cores

There are plenty of computers out there with “Pro” tacked on to the end of their names, but none deliver as many CPU threads and as much performance as the Titan A499 Octane Pro at a price that won’t make you feel like you’re paying for a brand. Like all Titan Workstations, this system is built using top-grade internal components to ensure you get the maximum performance out of your processor.





AMD Threadripper Pro 5000 WX Series Workstation Computer for:

3D Rendering • Deep Learning • Data Analysis • AI • Machine Learning • Media / Video Streaming • Cloud Gaming • Animation and Modeling • Design & Visualization • Diagnostic Imaging

Starting Price: $5,974.25
Titan A790 OCTANE PRO - AMD Ryzen Threadripper Pro 7000 Series Workstation PC - up to 96 cores


Our Titan A790 OCTANE PRO is a high-performance workstation PC powered by the AMD Ryzen Threadripper Pro 7000 Series, capable of handling up to 96 cores. This workstation is designed for professionals requiring extreme computing power at a highly-competitive price.





AMD Threadripper Pro 7000 Series Workstation Computer for:

3D Rendering • CAD/CAM • Product Design • 3D Modeling • CGI • Computer Animation • Video Editing • Design & Visualization • Machine Learning • Fluid Dynamics

Starting Price: $6,085.00
Titan S600 - Dual AMD EPYC Milan CPUs + 8x GPUs Server PC for AI / Deep Learning HPC up to 128 cores - Supermicro 4124GS-TNR


The future of AI and Machine Learning is here with the Titan S600. This powerhouse workstation is dedicated to supporting the most demanding AI and Machine Learning systems. With the power of two AMD EPYC Milan CPUs and support for 8 GPUs, the S600 is a monster of deep learning capabilities. Even the most advanced AI would be in awe of the processing power housed within the S600.





4U Rackmount Workstation / Server Computer for:

Deep Learning • Data Analysis • AI • Machine Learning • Media / Video Streaming • Cloud Gaming • Animation and Modeling • Design & Visualization • 3D Rendering • Diagnostic Imaging

Starting Price: $9,717.50
Titan S575 - Dual 2nd Gen Intel Xeon Scalable CPUs + 10x GPUs Server PC for AI / Deep Learning HPC up to 56 Cores - Supermicro 4029GP-TRT


The S575 is a multi-threaded, multi-GPU-capable system with the option to install up to 10 dual-slot GPUs. Perfect for those who want GPU supercomputing ability in a convenient rack-mounted form, the Titan S575 is a uniquely designed, flexible parallel processing workstation server. Up to 56 hyper-threaded Intel cores and ten GPUs mean no compromises for medical, nuclear, oil & gas or render farm parallel computing applications.





4U Rackmount Workstation / Server Computer for:

Deep Learning • Data Analysis • AI • Machine Learning • Media / Video Streaming • Cloud Gaming • Animation and Modeling • Design & Visualization • 3D Rendering • Diagnostic Imaging

Starting Price: $10,295.00
   
 