
Nvidia's ever-optimistic CEO, Jensen Huang, has revealed more details of the company's next-gen AI platform at the same GTC event where its quantum computing plans were outlined. He calls it the Vera Rubin "superchip" and claims it contains no fewer than six trillion transistors. Yup, six trillion.
In very simple terms, that means the Vera Rubin superchip sports more than 60 times the transistor count of an Nvidia RTX 5090 gaming GPU. And that is a completely bananas comparison.
What's more, the Vera Rubin superchip is capable of 100 petaflops of raw compute. And that is exactly 100 times the compute Nvidia claims for its neat little DGX Spark AI box. Oh, and it packs 88 custom Arm cores supporting 176 software threads for conventional CPU compute, too.
On a side note, that Arm CPU element (which on its own is known simply as Vera) is interesting from a PC perspective, as its custom-designed CPU cores, which notably support multithreading, could conceivably turn up in a PC processor in the future. Watch this space.
At this point you may be wondering how any of this is possible. Six trillion transistors, 100 petaflops of compute, 88 CPU cores, all in one chip? OK, Vera Rubin is expected to move to TSMC's latest N3 process node. But the N5-derived node that Nvidia currently uses for its GPUs tops out at a little over 100 billion transistors for a single, monolithic chip.

N3 is denser than N5, for sure. But nowhere near enough to get from about 100 billion to six trillion transistors. Actually, it turns out that Vera Rubin isn't just more than one chip, it's more than one chip package.
Huang held up an engineering sample of Vera Rubin, and three main chip packages are clearly visible: one for the CPU and two for GPUs. Later in the presentation, it becomes clear that each GPU package actually contains two GPU dies. So that's four GPUs in total, plus the CPU. A close-up of the Vera CPU package also reveals it to be a chiplet design, further raising the overall transistor count headroom.
That said, each Rubin AI GPU die is monolithic and described as reticle-sized, which means it's as big as TSMC's production process will allow. But even if you were generous and assumed that TSMC could squeeze 200 billion transistors into a single monolithic N3 chip, you'd still need 30 such dies to hit that six trillion figure.
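If you fancy checking that back-of-envelope sum yourself, here it is as a quick Python sketch. To be clear, the 200 billion per-die ceiling is our generous assumption from above, not a published TSMC spec:

```python
# Rough sanity check on the six-trillion-transistor claim.
# The per-die ceiling is a generous assumption, not a published TSMC figure.
claimed_total = 6e12       # Huang's six trillion transistors
generous_n3_die = 200e9    # hypothetical monolithic N3 die at its very best

dies_needed = claimed_total / generous_n3_die
print(f"Monolithic N3 dies needed: {dies_needed:.0f}")  # prints 30
```

Thirty reticle-sized dies is an awful lot more silicon than the four GPU dies and one chiplet-based CPU actually on the board, which is what makes the headline number so curious.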

Actually, Nvidia also revealed a "Rubin Ultra" GPU package with four Rubin GPU dies, so this stuff is just going to scale and scale. But, still, it's unclear how Nvidia arrives at its transistor count. I'd have to assume that everything, including all the memory for the GPUs and CPU, plus the supporting chips on the Vera Rubin motherboard, is thrown in.
However Nvidia is coming up with the numbers, the Vera Rubin AI board (as opposed to "chip", as Huang calls it) is an unfathomably complex thing. Indeed, Huang says it's 100 times faster than the equivalent Nvidia product from nine years ago, the DGX-1, which he says he delivered to what must then have been a nascent OpenAI.
The rest, as they say, is history. And trillions and trillions of transistors. And trillions of dollars for Nvidia. For the record, Huang says Vera Rubin goes into production "this time next year, maybe even slightly earlier", and it probably won't be cheap. So start saving.