The Information: What to Expect From GTC: Nvidia’s Groq Chip

It’s party time in San Jose, Calif.! Nvidia on Monday kicks off its annual GTC developer conference in the city, the biggest event on the chip giant’s calendar—which, given the number of public events featuring its CEO, Jensen Huang, is saying something. Huang is scheduled to deliver his keynote tomorrow morning, and it is sure to be a must-watch for people in the AI sector.

One of the biggest things he’s expected to announce is a new chip system that combines Nvidia’s technology with that of Groq, the independent chip firm whose tech Nvidia licensed in a roughly $20 billion deal late last year. This is the first time Nvidia has integrated another company’s AI processor directly into its server racks. Nvidia’s current flagship systems rely almost entirely on its own processors and high-speed interconnects to link the components together.

So why has Nvidia broken the mold for Groq? One reason: inference. That’s the technical term for the process of operating AI models that spit out answers to questions asked by ordinary people, as opposed to the process of training AI models. Nvidia’s chips are outstanding for training models, which is why they’ve been so much in demand in recent years. They’re good at the inference stage as well. But Groq’s chips are tailored for inference, which is going to become an increasing part of AI data center workloads.

Nvidia is expected to name OpenAI as a buyer of the new chip, which could power the AI agent that assists with the firm’s coding tasks. It’s worth noting that if Nvidia does unveil the Groq-Nvidia system as expected, it will be quite a departure from Huang’s comment in January that what Nvidia would build with Groq would be “quite unique and quite cool, but it won’t affect our core business.”

Groq’s chip—known as a language processing unit (yes, we have GPUs, TPUs and now LPUs!)—is expected to be mass-produced in the second half of this year at Samsung Electronics’ foundry, according to two people with direct knowledge of Nvidia’s plans. It will be the first time Nvidia has manufactured a server chip outside Taiwan Semiconductor Manufacturing Co., the Taiwanese chipmaker that has supplied virtually all of its flagship AI chips.

Diversifying away from a single supplier makes a lot of sense for Nvidia. But the company plans to eventually move production of the LPU back to TSMC, the people said, because the next generation of the LPU will integrate better with a coming version of Nvidia’s next AI chip.

Here are some technical details of the new Nvidia-Groq system for the AI chip nerds out there. The Nvidia-Groq rack will use a different architecture from that of existing Nvidia racks. It will contain 256 Groq chips in a single rack, and Intel processors will help manage communication between them, a role Nvidia’s own hardware typically performs in its GPU systems, according to a person involved in the project. The decision to use Intel components suggests Nvidia’s existing technologies don’t yet integrate cleanly with the LPU.

Still, this is only the beginning of Nvidia’s plans for Groq’s technology. The company is exploring ways to integrate the LPU more deeply into its future road map for chips. One idea under consideration would be to fuse Groq’s processor with a Feynman GPU (Nvidia’s next generation after Rubin) into a single chip. That would improve performance while lowering costs, according to two people involved in the development. Not that Huang will talk about any of that tomorrow!

Anissa Gardizy and Wayne Ma contributed to this item.