The Information: How Google’s AI Chips Stack Up to Nvidia’s

The Takeaway

Google’s TPU sales push outside its cloud directly challenges Nvidia’s dominance.
Nvidia’s most advanced GPU is twice as powerful per chip as Google’s TPU.
Google plans to produce about 5 million TPUs in 2027, up from more than 3 million in 2026, boosting supply.

Google’s efforts to sell or lease its artificial intelligence server chips so they can run in any company’s data center—and not just in Google Cloud—have generated headlines, stock moves and a noteworthy response from AI chip leader Nvidia.

It’s not lost on Nvidia that two of the world’s best AI models, from Google and Anthropic, were developed fully or partly using AI server chips made by Google rather than with Nvidia’s graphics processing units.

That reality has prompted another large AI developer—Meta Platforms, one of Nvidia’s biggest chip customers—to seriously consider using Google’s Tensor Processing Units to develop new models.

Nvidia’s chips still lead in performance across a wide variety of AI tasks, and the company’s Cuda software remains the dominant way to run AI on them. And unlike TPUs, which until recently were only available through Google Cloud, Nvidia chips can be accessed from any cloud provider.

But if Google continues down the path of competing more directly with Nvidia by providing its chips outside Google Cloud, more AI developers will want to evaluate each company’s chips and servers side by side.

Below we answer some frequently asked questions about how TPUs and GPUs stack up.

Does the most advanced GPU have better price and performance compared to the most advanced TPU when customers rent these chips in the cloud?

That depends on the price a cloud provider charges for GPUs, which can vary based on the length of a developer’s commitment to that chip system. Still, comparing them head to head is difficult because of the software involved in running apps on them.

For customers that already use Nvidia’s Cuda programming language to run their AI on server chips, renting Nvidia chips is more cost-effective, whereas developers that have time and resources to rewrite their programs can save money by using TPUs.

Still, for most developers, Nvidia’s software makes it fast and easy to start running AI applications on GPUs.

Sophisticated TPU customers such as Anthropic, Apple and Meta may face fewer challenges using TPUs because they are more adept at writing software for running AI on server chips.

Based on interviews with former Google and Nvidia employees, TPUs offer potential cost benefits over GPUs, depending on how many AI computing workloads the customer runs and what type they are. TPUs can be especially cost-efficient for customers using Google’s Gemini models because the models were developed using TPUs.

Nvidia CEO Jensen Huang has said that even if rival chips were priced at zero dollars, companies would still prefer Nvidia chips. Is this accurate?

It’s not so simple. Taiwan Semiconductor Manufacturing Co., which produces Nvidia’s chips, is careful not to devote too much of its chipmaking and chip packaging capacity to one company, so Nvidia is unlikely to get as much capacity as it wants to meet customers’ demand. Because Nvidia typically won’t have enough capacity to meet total demand, there will be demand for rival chips.

What is the difference between the most advanced TPU (Ironwood) and the most advanced GPU (Blackwell) in terms of computations or other key measures, such as energy efficiency?

On a per-chip basis, Google’s most advanced TPU is half as powerful as Nvidia’s most advanced GPU, as measured by the number of floating-point operations per second, or FLOPS, it can handle (usually quoted in teraFLOPS, or trillions of operations per second), according to one industry executive. (This is a common way AI developers measure a chip’s computational power.)

Google can string together servers housing thousands of its TPUs into one pod, making them particularly useful and cost-efficient for developing new AI models, whereas Nvidia can only connect a maximum of 256 of its GPU chips. (Nvidia chip customers can overcome that limitation by using additional networking cables to connect servers in their data centers.)
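A rough way to see why pod size can offset a per-chip deficit is to multiply per-chip throughput by the number of chips linked together. The sketch below uses relative units only: the two-to-one per-chip ratio and the 256-GPU limit come from the figures above, while the TPU pod size of 4,096 is a hypothetical stand-in for "thousands." It ignores networking and software overhead, so it is back-of-the-envelope arithmetic rather than a benchmark.

```python
# Back-of-the-envelope comparison of aggregate compute in one tightly linked system.
# Relative units only. The 2:1 per-chip ratio and the 256-GPU limit come from the
# article; the TPU pod size is a hypothetical stand-in for "thousands".

GPU_RELATIVE_FLOPS = 2.0      # one GPU is roughly twice one TPU, per the article
TPU_RELATIVE_FLOPS = 1.0
MAX_LINKED_GPUS = 256         # figure cited in the article
ASSUMED_TPU_POD_SIZE = 4096   # hypothetical value, not from the article


def aggregate_compute(per_chip: float, chip_count: int) -> float:
    """Ideal aggregate throughput, ignoring interconnect and software overhead."""
    return per_chip * chip_count


gpu_system = aggregate_compute(GPU_RELATIVE_FLOPS, MAX_LINKED_GPUS)
tpu_pod = aggregate_compute(TPU_RELATIVE_FLOPS, ASSUMED_TPU_POD_SIZE)

print(f"GPU system: {gpu_system:.0f} units, TPU pod: {tpu_pod:.0f} units")
print(f"Pod advantage under these assumptions: {tpu_pod / gpu_system:.1f}x")
```

Changing ASSUMED_TPU_POD_SIZE shows how sensitive the comparison is to the pod size, which the article does not specify.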

How do TPUs run AI differently from GPUs?

GPUs can handle a wide variety of computational tasks, from rendering videogame graphics to training large language models. The chips excel in repetitive math operations that machine-learning models need, particularly multiplying grids of numbers together, a process known as matrix multiplication.

Google’s TPUs are even more specialized than GPUs, built to handle matrix multiplication and run certain AI models faster. TPUs do so through a systolic array, a grid of simple calculators that pass data to each other in a cadenced pattern. This design keeps numbers flowing continuously through the calculations without needing to constantly fetch data from the chips’ memory, which would waste time and energy.
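To make the systolic-array idea concrete, here is a minimal software simulation of one common textbook arrangement, in which each cell keeps a running sum while values from one matrix move right and values from the other move down on every clock tick. It is an illustrative sketch in plain Python, not a description of how any actual TPU is wired; the function name and the cycle-by-cycle layout are assumptions made for the example.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) using a simulated grid of n x m cells.

    Each cell holds one value arriving from its left neighbor, one arriving
    from the neighbor above, and a running sum. Row i of `a` enters the left
    edge i ticks late and column j of `b` enters the top edge j ticks late,
    so matching values meet in the right cell at the right time.
    """
    n, k_dim, m = len(a), len(a[0]), len(b[0])
    assert len(b) == k_dim, "inner dimensions must match"

    acc = [[0] * m for _ in range(n)]    # running sums, one per cell
    a_reg = [[0] * m for _ in range(n)]  # value currently moving rightward
    b_reg = [[0] * m for _ in range(n)]  # value currently moving downward

    for t in range(n + m + k_dim - 2):   # ticks needed to drain the grid
        # Shift every row one cell to the right and feed the left edge.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i
            a_reg[i][0] = a[i][k] if 0 <= k < k_dim else 0
        # Shift every column one cell down and feed the top edge.
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j
            b_reg[0][j] = b[k][j] if 0 <= k < k_dim else 0
        # Every cell multiplies the two values passing through it.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)
```

Note that each input value is read from "memory" (the input lists) only once, at the edge of the grid, and then handed from cell to cell, which is the property the paragraph above describes.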

TPUs are more efficient because they do only one thing, but that also means they work well only with certain software. GPUs come with better software for running the chips, and developers can use them for a wider variety of tasks.

What are the advantages and disadvantages of TPUs compared to GPUs in terms of how they handle LLMs or large vision or video models?

TPUs give Google’s AI developers a cost advantage over GPUs because the company’s AI models, applications and data centers were designed with TPUs in mind.

But TPUs work smoothly only with certain AI software tools, such as TensorFlow, while most AI researchers use PyTorch, which runs better on GPUs. TensorFlow and PyTorch allow developers to train and run AI models without needing to write specific software code from scratch.

GPUs can have higher performance than TPUs if a developer takes the time to fully utilize them by writing custom software for running the developer’s specific AI models, according to multiple engineers with experience using both types of chips. (That kind of customization generally isn’t possible with TPUs.)

For video and vision models, TPUs excel at the repetitive mathematical operations that image recognition requires. They handle convolutions, the core calculations in image models, by converting them into matrix multiplications.
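One standard way to turn a convolution into a matrix multiplication is to unroll every image patch into a row and then take dot products with the flattened filter, a trick often called im2col. The sketch below shows the idea in plain Python for a single-channel image with stride 1 and no padding; it is a simplified illustration, and the function names are made up for this example rather than taken from any TPU software.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch (stride 1, no padding) into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for r in range(h - kh + 1):
        for c in range(w - kw + 1):
            rows.append([image[r + dr][c + dc]
                         for dr in range(kh) for dc in range(kw)])
    return rows


def conv2d_as_matmul(image, kernel):
    """Apply a filter to an image as a (patches x kernel) matrix product.

    As in most machine-learning libraries, the filter is not flipped, so this
    is technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    patches = im2col(image, kh, kw)                   # one row per output pixel
    flat_kernel = [v for row in kernel for v in row]  # filter as a column vector
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in patches]
    out_w = len(image[0]) - kw + 1
    # Fold the flat result back into a 2-D output image.
    return [flat_out[i:i + out_w] for i in range(0, len(flat_out), out_w)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
output = conv2d_as_matmul(image, kernel)
print(output)
```

Once the patches are laid out this way, the heavy lifting is a single matrix product, which is the operation the chip is built to accelerate.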

But some engineers say GPUs pull ahead of TPUs in developing vision models because that process often involves experimenting with complex image transformations, such as rotating, cropping, or adjusting colors.

Which companies use TPUs?

Apple has long used TPUs to train its largest language models, according to former Apple employees and research papers published by its AI group. AI image firm Midjourney in 2023 said it was using TPUs to develop its models.

AI developer Cohere previously developed models using TPUs but last year transitioned to GPUs due to technical problems it encountered with earlier versions of TPUs, according to a person with knowledge of the shift.

What would it take for Google to start selling a lot of TPUs outside Google Cloud?

Google would need to overhaul its entire supply chain, mirroring Nvidia’s business model, not only to secure enough chips from foundries but also to make sure customers can install the chips and use them reliably. This means Google would have to make a large investment in developing a sales distribution network, including server designers that produce equipment to house the chips, and in hiring many engineers to provide customer support and other services to TPU buyers.

What is the cost of producing the most advanced TPU versus that of producing the most advanced GPU?

The underlying cost may be similar. Google uses more expensive and advanced chipmaking technology at TSMC for Ironwood than Nvidia used for Blackwell. But the Ironwood chip is smaller, which means TSMC can cut more chips from one wafer. That compensates for the extra cost associated with expensive silicon. Both chips use the same type of high-bandwidth memory, people with knowledge of the production said.
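The trade-off between a pricier wafer and a smaller chip can be sketched with a widely used approximation for the number of whole dies that fit on a round wafer. Every number below is hypothetical: the die areas, the relative wafer costs and the 300 mm wafer diameter are placeholders chosen for illustration, not figures for Ironwood or Blackwell, and the estimate ignores manufacturing yield and packaging.

```python
import math


def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    d, s = wafer_diameter_mm, die_area_mm2
    gross = math.pi * (d / 2) ** 2 / s           # dies if the wafer were all usable
    edge_loss = math.pi * d / math.sqrt(2 * s)   # partial dies lost at the rim
    return int(gross - edge_loss)


def cost_per_die(relative_wafer_cost: float, die_area_mm2: float) -> float:
    """Wafer cost spread evenly over all dies (yield is ignored)."""
    return relative_wafer_cost / dies_per_wafer(die_area_mm2)


# Hypothetical inputs: a larger die on a cheaper process versus a smaller die
# on a process assumed to cost 30% more per wafer.
larger_die = cost_per_die(relative_wafer_cost=1.0, die_area_mm2=800.0)
smaller_die = cost_per_die(relative_wafer_cost=1.3, die_area_mm2=600.0)

print(f"Dies per wafer: {dies_per_wafer(800.0)} vs {dies_per_wafer(600.0)}")
print(f"Relative cost per die: {larger_die:.4f} vs {smaller_die:.4f}")
```

Under these made-up inputs the two per-die costs land close together, which is the kind of offsetting effect the paragraph above describes.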

Nvidia has a 63% operating profit margin from selling GPUs and other hardware, whereas Google’s cloud unit has an operating profit margin of 24%, though TPU rentals make up a tiny portion of Google’s overall cloud sales.

How many TPUs does Google produce, and how does that compare to other AI chips?

Google plans to produce more than 3 million TPUs in 2026 and about 5 million in 2027, according to Morgan Stanley’s latest estimate. A Google employee with knowledge of the TPU program said the company has told some TPU customers it plans to produce an even higher figure in 2027, but it isn’t clear whether TSMC will agree to produce that many TPUs that year.

Google places orders for its most powerful TPUs through Broadcom, which works with TSMC and also provides some secondary technology for the TPU chips themselves.

Nvidia currently produces around three times as many GPUs as Google produces TPUs, according to two people with knowledge of the production.

What role does Broadcom play in developing TPUs?

In addition to managing TSMC’s production of the most powerful TPUs, Broadcom has been responsible for the TPU’s physical design, including the all-important chip packaging, and essentially develops the chip based on blueprints Google created. Chip packaging refers to the assembly of the chip, which has become a more important part of the process as shrinking transistors on the chip becomes harder.

Broadcom also provides Google with a crucial piece of intellectual property for designing TPUs: a serializer/deserializer, or SerDes in industry parlance. This is the best technology for moving data at high speeds from one TPU to another to enable parallel computing, in which multiple chips work in unison—an important step for developing LLMs.

Google and Broadcom have sometimes butted heads over Broadcom’s prices for the TPUs, prompting Google to seek alternative partners such as MediaTek, which will soon produce a less-powerful TPU that aims to help Google save on the cost of running its AI.

What is Broadcom’s cut from developing TPUs?

It is at least $8 billion, according to analysts.

What might the economics be if Google sells or leases out TPUs so they end up in other companies’ data centers?

It isn’t clear how much gross profit margin Google generates from renting out TPUs to its cloud customers. It can sell many other services to cloud customers in addition to the server chip rentals.

If Google sold or leased out TPUs to other companies’ data centers, those facilities would need to be designed in a highly specific way, similar to Google’s, to reap the cost benefits of TPUs as Google does for its own AI uses, a former TPU executive said. Plus, doing so would mean Google would forgo other types of revenue it generates from cloud customers, such as storage and databases, so it might levy an extra cost on TPU buyers to make up for that lost potential revenue.

Why would Google pursue a business model that’s closer to Nvidia’s?

Google has told potential TPU customers that some technology and financial services firms want to house TPUs in their own data centers—meaning non-Google data centers—for security and other reasons.

Google has been talking to rival cloud providers about hosting TPUs for some customers, The Information reported in September.

At a minimum, customers that use TPUs or discuss using those chips instead of Nvidia’s could pressure Nvidia to lower its GPU prices for them. Making TPUs more ubiquitous could also help Google convince more customers to use its Gemini AI models, which are optimized for TPUs.

Developers are more familiar with Nvidia chips and the software that runs them than with the software for running TPUs. Are new solutions like JAX and PyTorch/XLA bridging the gap?

The short answer is no, though Google is working hard to change that. And it has pitched potential TPU customers on using those chips in tandem with specially made Google software that makes it easier to run them.