theregister.com

Nvidia’s AI suite may get a whole lot pricier, thanks to Jensen’s GPU math mistake

Comment At its GPU Technology Conference last month, Nvidia broke with convention by shifting its definition of what counts as a GPU.

"One of the things I made a mistake on: Blackwell is really two GPUs in one Blackwell chip," CEO Jensen Huang explained on stage at GTC. "We called that one chip a GPU and that was wrong. The reason for that is it screws up all the NVLink nomenclature."

However, Nvidia's shift to counting GPU dies, rather than SXM modules, as individual GPUs doesn't just simplify NVLink model numbers and naming conventions. It could also double the number of AI Enterprise licenses Nvidia can charge for.

Nvidia's AI Enterprise suite, which covers a host of AI frameworks including access to its inference microservices (NIMs), runs $4,500 a year per GPU, or $1 per GPU per hour in the cloud. This meant an Nvidia HGX B200 with eight modules (one Blackwell GPU per module) cost $36,000 a year, or $8 per hour in the cloud.

But with the new HGX B300 NVL16, Nvidia is now counting each die as a GPU. And since the system also has eight modules, each with two dies, that brings the total to 16 GPUs. That means, assuming no changes to Nvidia's AI Enterprise subscription pricing, its latest HGX boxes will set you back twice as much.
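For the record, the licensing math works out like this (a quick sketch in Python; the function and constant names are ours, not Nvidia's, and the figures are the list prices quoted above):

```python
# Nvidia AI Enterprise list prices cited in this article
ANNUAL_PER_GPU = 4_500   # USD per GPU per year
HOURLY_PER_GPU = 1.00    # USD per GPU per hour in the cloud

def annual_license_cost(modules: int, dies_per_module: int, count_dies: bool) -> int:
    """Annual AI Enterprise cost for one HGX system, under the old
    (per-module) or new (per-die) definition of a GPU."""
    gpus = modules * (dies_per_module if count_dies else 1)
    return gpus * ANNUAL_PER_GPU

# HGX B200: 8 modules, each package counted as one GPU
print(annual_license_cost(8, 2, count_dies=False))  # 36000

# HGX B300 NVL16: same 8 modules, but each of the 16 dies now counts
print(annual_license_cost(8, 2, count_dies=True))   # 72000
```

Same chassis, same module count, double the license bill.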

The change in naming convention is a departure from last year's Blackwell systems. In our Blackwell launch coverage, Nvidia took issue with us calling Blackwell a "chiplet" architecture – multiple separate dies or chiplets linked within one processor package – arguing that it's actually a "two-reticle limited die architecture that acts as a unified, single GPU."

It's not like the latest B300 GPUs are that much more powerful than last year's B200, either. As a quick refresher, the HGX B300 offers roughly 1.5x the memory capacity, at 2.3TB versus 1.5TB on the B200, while 4-bit floating point (FP4) performance is up roughly 50 percent to just over 105 dense petaFLOPS per system. That jump only benefits workloads that can take advantage of FP4, however; at higher precisions, the B300 offers no floating point advantage over the older system.

Confusingly, this change only applies to Nvidia's air-cooled B300 boxes and not the more powerful GB300 NVL72 systems, which continue to count the packages as GPUs.

So what gives? Well, according to Nvidia's VP and GM of Hyperscale and HPC, Ian Buck, there is a technical reason for this.

The main difference is that the B300 package offered on the HGX chassis lacks the chip-to-chip (C2C) interconnect found on previous-gen Blackwell accelerators. This means the two dies really are two distinct 144GB GPUs sharing a common package, which Buck explained allowed Nvidia to achieve better power and thermals. It does come with a drawback, though: with no C2C link between the two, if one die wants to access the other's memory, the request has to go off-package, over the NVLink switch, and then take a U-turn.

The GB300, on the other hand, retains the C2C interface, avoiding the off-package memory detour. Because the two dies can directly communicate and share memory, they're treated as a single, unified GPU - at least as far as Nvidia's software and licensing are concerned.

This technical exception won't last long, however, with the launch of Nvidia's Vera Rubin superchips, which will embrace the B300-style naming convention and start counting individual dies as GPUs, hence the NVL144 designation.

This is also how Nvidia's Vera Rubin Ultra platform, coming in late 2027, can claim 576 GPUs per rack. As we previously explored, it's really just 144 modules — what prior to Blackwell Ultra we would have considered a GPU — with four dies per module.

If we had to guess, we'd wager that in the year since Nvidia unveiled Blackwell, the GPU giant realized it was leaving subscription software revenue on the table. We can only guess because, when we asked Nvidia how the naming change would impact AI Enterprise licensing, we were told pricing details hadn't been finalized yet.

"Pricing details are still being finalized for B300 and no details to share on Rubin beyond what was shown in the GTC keynote at this time," a spokesperson, who clarified that this also included AI Enterprise pricing, told El Reg. ®
