



What Are RT Cores in Nvidia GPUs?

Published: 6-2-2021


The latest GPUs from Nvidia have special hardware inside that accelerates ray-traced graphics, allowing them to be rendered in real time. These RT (ray tracing) cores have pushed what’s possible in real-time rendering to new heights, but what are RT cores really, and how do they work?

The Parts of an RTX GPU

An Nvidia RTX GPU, which is the product series in question here, has three main types of processor.

The first are its CUDA cores. These are general computation cores. Well, as “general” as modern GPU cores can be. They are small, relatively simple processors. There are thousands of them working in parallel on modern GPUs. These are the processors that work out how to shade each pixel you see on screen and pull off all the other effects you see in modern traditional graphics.

Next, RTX cards have Tensor cores. We’ve written about Tensor cores before, but in short they are built to do a sort of math known as tensor math. These calculations are fundamental to machine learning and artificial intelligence, particularly for neural networks. These cores can be used to accelerate machine learning software in any realm that uses tensor math, but they also play a role in graphics. Nvidia uses them to clean up ray traced images and also to intelligently upsample images rendered at lower resolutions using a technology known as DLSS (Deep Learning Super Sampling).
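To make "tensor math" a little more concrete, here is a minimal sketch in plain Python of the kind of operation involved: multiplying two small matrices and adding a third (D = A × B + C). This only illustrates the arithmetic. The function name `matmul_add` is our own invention, and real Tensor cores do this work in hardware on small tiles at reduced precision rather than in a loop like this.

```python
def matmul_add(a, b, c):
    """Return D = A x B + C for square matrices given as lists of rows.

    Illustrative only: a hypothetical helper showing the arithmetic,
    not how the hardware is programmed.
    """
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    identity = [[1, 0], [0, 1]]
    ones = [[1, 1], [1, 1]]
    print(matmul_add(A, identity, ones))
```

A neural network layer boils down to many of these multiply-and-add steps, which is why hardware that does them in bulk speeds up machine learning so much.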

Finally, we get to the RT cores this post is about. They have the job of doing the math of ray tracing as quickly as possible: fast enough to show a moving image on screen at playable frame rates. But hang on a second, what is ray tracing to begin with?

Ray Tracing in a Nutshell

In real life, what you see is the result of photons of different wavelengths hitting your retina after being gathered and focused there by the lens of your eye.

Before those photons enter your eye, they’ve been bouncing around the world, interacting with all the objects around you. That’s how the scene around you is constructed: photons bounce around, interact with objects, are absorbed or reflected, and finally come to rest with you.

Traditionally, real-time 3D computer graphics have not been rendered in a way that’s anything like this. Why? Because simulating the way light works is incredibly computationally intensive. Ray tracing has been used extensively for offline rendering, where one frame may take hours to compute. That’s how Hollywood blockbuster animated CG films and visual effects for live-action titles are made.

RT cores specifically accelerate the key math needed to trace virtual rays of light through a scene. With ray tracing, though, the rays are actually fired from the “eye” into the scene, which is obviously not how we see in real life. Within the simulation, however, the result is more or less the same.
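The idea of firing rays from the "eye" into the scene can be sketched in a few lines of Python. This is a toy example under our own assumptions, not how a GPU or any real renderer is written: the camera sits at the origin looking down the negative z axis, the scene is a single sphere, and the names `primary_ray`, `hit_sphere` and `render` are hypothetical.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def primary_ray(px, py, width, height, fov_deg=60.0):
    """Return (origin, direction) for the ray from the eye through pixel (px, py)."""
    aspect = width / height
    scale = math.tan(math.radians(fov_deg) / 2)
    # Map the pixel centre to screen space; y is flipped so that +y points up.
    x = (2 * (px + 0.5) / width - 1) * aspect * scale
    y = (1 - 2 * (py + 0.5) / height) * scale
    return (0.0, 0.0, 0.0), normalize((x, y, -1.0))


def hit_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the sphere, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c  # direction is unit length, so the quadratic simplifies
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0 else None


def render(width=16, height=8):
    """One ray per pixel: '#' where the ray hits the sphere, '.' where it misses."""
    rows = []
    for py in range(height):
        row = ""
        for px in range(width):
            origin, direction = primary_ray(px, py, width, height)
            hit = hit_sphere(origin, direction, (0.0, 0.0, -3.0), 1.0)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

Running it prints a rough blob of `#` characters in the middle of the grid. Only rays that could reach the eye are ever traced, which is the whole point of starting from the eye rather than from the light.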

RT Cores are ASICs

RT cores are an example of an ASIC, or application-specific integrated circuit. You may have heard of ASICs in the context of cryptocurrency, where chips are designed to process only the cryptographic math of one specific coin.

In short, RT cores are extra circuits that sit alongside the more general-purpose CUDA cores and can be brought into the rendering pipeline when a ray-tracing calculation comes along.

The CUDA cores hand off that job to the RT cores and then use the resulting answers to the ray-tracing math to render the scene and correctly shade the pixels in front of your eyeballs.

What Do RT Cores Actually Accelerate?

But we can go into a little more detail than that! RT cores aren’t actually doing the full-fat job of ray tracing. Nvidia has found a less computationally intense way of quickly calculating light ray bounces around the scene.

Scene geometry is organized into a data structure known as a BVH (Bounding Volume Hierarchy). It’s a tree of nested bounding volumes, typically boxes, that describes how the objects in a scene are grouped in 3D space.
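As a rough illustration of what such a hierarchy looks like, here is a small Python sketch that builds one from a list of axis-aligned bounding boxes. The median split along the widest axis is one simple strategy we picked for the example. Production builders use more sophisticated heuristics, and the `AABB`, `Node` and `build` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class AABB:
    """Axis-aligned bounding box given by its min and max corners."""
    lo: Vec3
    hi: Vec3

    def union(self, other: "AABB") -> "AABB":
        return AABB(tuple(map(min, self.lo, other.lo)),
                    tuple(map(max, self.hi, other.hi)))

    def centroid(self, axis: int) -> float:
        return 0.5 * (self.lo[axis] + self.hi[axis])


@dataclass
class Node:
    box: AABB
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    items: Optional[List[int]] = None  # object indices, set on leaves only


def build(boxes, indices=None, leaf_size=1):
    """Recursively group object boxes into a binary tree of enclosing boxes."""
    if indices is None:
        indices = list(range(len(boxes)))
    bounds = boxes[indices[0]]
    for i in indices[1:]:
        bounds = bounds.union(boxes[i])
    if len(indices) <= leaf_size:
        return Node(bounds, items=indices)
    # Split at the median centroid along the widest axis (an assumed heuristic).
    extent = [bounds.hi[a] - bounds.lo[a] for a in range(3)]
    axis = extent.index(max(extent))
    ordered = sorted(indices, key=lambda i: boxes[i].centroid(axis))
    mid = len(ordered) // 2
    return Node(bounds,
                left=build(boxes, ordered[:mid], leaf_size),
                right=build(boxes, ordered[mid:], leaf_size))


if __name__ == "__main__":
    scene = [AABB((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0.0, 2.0, 4.0, 6.0)]
    root = build(scene)
    print(root.box)
```

Every node's box fully encloses the boxes of its children, so a ray that misses a node's box cannot hit anything stored beneath it.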

The RT cores actually look for ray intersections within this BVH structure. The results of those intersection tests feed into the shading of the relevant pixels. Each test is relatively simple, but the RT cores can do them in massive volume and at incredible speed.

This approach is, however, fairly low-fidelity and results in a grainy image. That is where the Tensor cores come in, applying a machine-learning denoiser in real time to clean up the picture.
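To show why testing against a hierarchy pays off, here is a hedged Python sketch of a ray-versus-box test (the classic "slab" method) plus a traversal that skips whole branches the ray misses. It is a software illustration of the idea only. The dictionary layout and the `make_hierarchy`, `ray_hits_box` and `traverse` names are made up for this example and say nothing about how the hardware is actually wired.

```python
def ray_hits_box(origin, direction, lo, hi):
    """Slab test: does the ray overlap the axis-aligned box for some t >= 0?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < a or o > b:
                return False
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far


def make_hierarchy(first, last):
    """Hypothetical helper: balanced tree over unit boxes placed at x = 2 * i."""
    lo = (2.0 * first, 0.0, 0.0)
    hi = (2.0 * (last - 1) + 1.0, 1.0, 1.0)
    if last - first == 1:
        return {"lo": lo, "hi": hi, "object": first}
    mid = (first + last) // 2
    return {"lo": lo, "hi": hi,
            "children": [make_hierarchy(first, mid), make_hierarchy(mid, last)]}


def traverse(node, origin, direction, stats):
    """Return the objects whose boxes the ray hits, counting every box test."""
    stats["box_tests"] += 1
    if not ray_hits_box(origin, direction, node["lo"], node["hi"]):
        return []  # the whole subtree is skipped
    if "object" in node:
        return [node["object"]]
    hits = []
    for child in node["children"]:
        hits += traverse(child, origin, direction, stats)
    return hits


if __name__ == "__main__":
    root = make_hierarchy(0, 64)  # 64 boxes in a row
    stats = {"box_tests": 0}
    hits = traverse(root, (80.5, 0.5, 5.0), (0.0, 0.0, -1.0), stats)
    print(hits, stats)
```

With 64 boxes in the scene, this ray finds the one box it hits after 13 box tests instead of 64. The gap widens quickly as scenes grow, because the number of tests grows roughly with the depth of the tree rather than with the number of objects.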
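Where does the grain come from? The article doesn't go into it, but a common explanation is that a real-time budget only allows a handful of rays per pixel, so each pixel's brightness is a noisy estimate. The toy Python simulation below illustrates that assumption. It models a pixel where 30% of rays reach a light (an arbitrary number chosen for the example) and measures how much the estimate wobbles at different ray counts. It is not a denoiser, and the function names are hypothetical.

```python
import random
import statistics


def estimate_pixel(samples, rng, lit_fraction=0.3):
    """Estimate a pixel's brightness from `samples` random rays.

    `lit_fraction` is an assumed 'true' brightness: the chance a ray finds light.
    """
    hits = sum(1 for _ in range(samples) if rng.random() < lit_fraction)
    return hits / samples


def pixel_spread(samples, trials=2000, seed=1):
    """Standard deviation of the estimate over many trials: a proxy for grain."""
    rng = random.Random(seed)
    estimates = [estimate_pixel(samples, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(f"{n:>2} rays per pixel -> spread {pixel_spread(n):.3f}")
```

The spread shrinks as the ray count goes up, but only slowly, which is why cleaning up a low-sample image with a denoiser is more practical in real time than simply firing more rays.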

That’s what those RT cores do explained simply enough so that even we can understand it!







LIST OF COMPATIBLE WORKSTATIONS


Titan S64 - Intel Xeon W-3300 Series Processors 4U Rackmount Workstation PC for AI, Deep Learning up to 38 CPU Cores

Machine learning is undoubtedly a crucial part of software development as well as providing AI cloud services for a wide variety of application types. The latest server CPUs are packed with specialized machine learning hardware acceleration, and the Titan S64 brings you a rack-mounted AI powerhouse solution in an affordable and compact package.




Starting Price: $4,995.00
Titan W64 Octane - Intel Xeon W-3300 Series Processors Workstation PC for AI, Deep Learning up to 38 CPU Cores

Our Titan W64 Octane makes all the difference in the computer world. Built on Intel Xeon 3300-series CPU technology with Ice Lake technology in its veins, the Titan W64 Octane is a cool computer with some hot performance numbers. It’s not only a multi-core CPU monster with new Intel Deep Learning Boost (Intel DL Boost), but can double as a titan of GPU-centric workloads as well. All thanks to the latest, cutting-edge workstation CPU technology from Intel. On the new Xeon W they added more pipes! Specifically, there are now 64 dedicated PCIe lanes. That means four x16 slots for four full-speed, high-end GPUs.




Starting Price: $5,345.00
Titan W599 Octane - Dual 2nd Gen Intel Xeon Scalable Processors Workstation PC for Quad GPU CUDA 3D Rendering and Simulations up to 56 CPU Cores

We’re always in a race to build the most powerful machines for every use case and budget point. Which means we can never take a break from trying to improve on our very best performing machines. With this latest generation Titan W599 Octane the envelope of what we can build has been pushed again. The W599 is one of our most powerful and most customized rigs to date. It’s designed to cover both CPU and GPU processing needs using new "Six Channel Memory Configuration". So if you’re thinking about entering the world of high-performance desktop computing, this should be your first stop.




Starting Price: $5,665.00
Titan A499 OCTANE PRO - AMD Ryzen Threadripper Pro 5000 WX Series Workstation PC - up to 64 cores

There are plenty of computers out there with “Pro” tacked on to the end of their names, but none like the Titan A499 Octane Pro deliver an equal amount of CPU threads and performance at a price that won’t make you feel like you’re paying for a brand. Like all Titan Workstations, this system is built using top grade internal components to ensure you get the maximum performance out of your processor.




Starting Price: $5,974.25
Titan A600 - Dual AMD EPYC Milan 8 GPU Server PC for AI / Deep Learning HPC up to 128 cores - Supermicro 4124GS-TNR

AI and Machine Learning are quickly becoming the most important fields for high-performance computing. More and more of our customers have asked for dedicated AI and Machine Learning systems, which is where the Titan A600 comes in. Even HAL 9000 would be intimidated by this much Deep Learning power!





Starting Price: $9,717.50
Titan X575 - Dual 2nd Gen Intel Xeon Scalable Processors Server Computer for up to 10x NVIDIA Video Cards / GPUs and up to 56 CPU Cores

The X575 is a multi threaded, multi GPU capable system with the option to install up to 10 dual slot GPUs. Perfect for those who want GPU Supercomputing ability in a convenient rack-mounted form, the Titan X575 is a uniquely designed, flexible parallel processing workstation server. Up to 56 hyper-threaded Intel cores and ten GPUs mean no compromises for medical, nuclear, oil & gas or render farm parallel computing applications.





Starting Price: $10,295.00
   
 