The Information

Apple to Renew Push for AI That Runs on Devices, Instead of the Cloud

As the tech industry pours fortunes into data centers for AI, Apple is expected to increasingly showcase the benefits of running models locally on iPhones and other devices.

The Takeaway

Apple plans to push on-device AI at its developer conference, leveraging its custom silicon.
Apple has approved the use of an Nvidia privacy technology to handle the processing of AI tasks in Google Cloud.
Apple is distilling Google’s Gemini for local use, but some AI tasks still need the cloud.

At Apple’s annual developer conference next month, the star of the show will be a series of long-delayed artificial intelligence upgrades to the iPhone. But the company is also expected to emphasize what could be an underrated asset in its efforts to catch up in AI: Its ability to run AI models on the billions of Apple devices in circulation.

People familiar with the company’s plans for its Worldwide Developers Conference say that Apple is likely to showcase how its 15 years of experience designing custom silicon for iPhones, Watches and Macs will give it an advantage when running AI models locally on those devices. Typically, AI models are run in expensive data centers filled with powerful AI chips.

Many AI queries from Apple’s devices will still have to be processed in the cloud because of their complexity and need for access to troves of online information. For example, as part of an Apple agreement with Google, some user queries to a new version of Siri will run in Google Cloud on a licensed version of the search giant’s Gemini model. Apple recently approved the use of a privacy technology from Nvidia in that setting, suggesting it will use Nvidia AI chips for at least some of its computing needs in Google Cloud, according to people familiar with the matter.

But running models locally could reduce the risk of exposing consumers’ data and prevent advertising companies from monetizing their personal information. It could save business customers money by reducing their consumption of “tokens”—the units of text that cloud AI models base their pricing on. And for Apple, pushing more AI chores onto devices could allow it to continue to avoid the eye-watering investments its tech peers have made in data centers.
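The token-savings argument above comes down to simple arithmetic, which the following minimal sketch illustrates. Every name and number in it (the function `monthly_cloud_cost`, the query volume, the tokens per query, the price per million tokens) is a hypothetical placeholder chosen for illustration, not a figure from Apple, Google or any cloud provider.

```python
def monthly_cloud_cost(queries_per_month, tokens_per_query,
                       price_per_million_tokens, on_device_share=0.0):
    """Estimate a monthly cloud AI bill under token-based pricing.

    on_device_share is the (hypothetical) fraction of queries handled
    locally, which therefore consume no billable cloud tokens.
    """
    if not 0.0 <= on_device_share <= 1.0:
        raise ValueError("on_device_share must be between 0 and 1")
    cloud_tokens = queries_per_month * tokens_per_query * (1.0 - on_device_share)
    return cloud_tokens / 1_000_000 * price_per_million_tokens


# Illustrative inputs only: 1 million queries, 500 tokens each, $2 per million tokens.
all_cloud = monthly_cloud_cost(1_000_000, 500, 2.0)
half_local = monthly_cloud_cost(1_000_000, 500, 2.0, on_device_share=0.5)
print(all_cloud, half_local)
```

Under these made-up inputs, moving half of the queries onto devices halves the token bill. Actual savings would depend on real pricing and on which queries can be handled locally.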

As part of its Google agreement, Apple is using a version of Google’s large Gemini model to train a smaller version of the model that can run locally on Apple devices, a process known as distillation, said people familiar with the effort. Apple is also looking to acquire smaller companies that can help it shrink AI models to run on its devices, people familiar with the company said. One such company it has considered acquiring is Liquid AI, a Cambridge, Mass.-based startup specializing in running AI locally on devices, said people familiar with Apple’s strategy.
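For readers curious about what distillation involves, here is a minimal sketch of one common textbook formulation: the small "student" model is trained to match the large "teacher" model's softened output probabilities. It is a generic illustration and not a description of Apple's or Google's actual method. The function names, the temperature value and the example scores are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature softens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from teacher to student on softened distributions.

    Scaled by temperature squared, as is conventional in this formulation.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical scores over three candidate next words.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 1.0]
print(distillation_loss(teacher, student))
```

In training, the student's parameters would be adjusted to push this loss toward zero. The loss reaches zero only when the student reproduces the teacher's distribution exactly.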

Apple first began touting the privacy benefits of running models on devices in 2024, when it introduced Apple Intelligence, a collection of AI features. Since then, it has largely been quiet on the topic, after an embarrassing series of stumbles, including a tepid response to its new AI features and a delay with the launch of the new Siri.

At the same time, Apple has largely sat on the sidelines as the biggest companies in tech have poured fortunes into building AI computing capacity in the cloud. Last year, Meta Platforms, for example, spent $72 billion on capital expenditures—mostly stemming from data centers—while Microsoft spent $88 billion. Apple, meanwhile, spent only $12.72 billion on capital expenditures during that period.

At times, the company’s hesitation to pour more money into AI has prompted criticism from investors and pundits who believe the company could risk being left behind in a future where AI is a key ingredient in personal devices. As the tech industry’s AI investments have swelled to mammoth proportions—Microsoft, for one, is projecting $190 billion in capital expenditures this year—some technologists have started to worry about overspending on AI computing capacity and look more favorably at Apple’s relatively conservative bets in the category.

“I think the data center boom is a mistake,” said David Stout, chief executive of Austin-based AI startup webAI. “Intelligence is getting smaller. Data centers won’t disappear, but the majority of work will happen at the edge. Apple made the bet correctly there.”

Stout is among a growing cohort of AI developers who are making bets on new businesses based on Apple hardware. WebAI develops specialized AI applications for enterprises that run locally on Apple chips. For example, webAI builds tools for aviation customers with an AI trained on a massive manual describing the intricacies of a Boeing Dreamliner engine to assist in engine maintenance.

The models can run on an iPad or Mac, without the need for an internet connection. Apple’s computers have also taken off among techies who use them to run OpenClaw, an open-source tool for creating AI agents that can autonomously run a computer.

Technology analyst Richard Kramer of Arete Research said in a recent note to investors that he estimates Apple has $50 billion worth of compute capacity in on-device chips “funded by users.”

Mark Suman, a former Apple senior engineering project manager who worked on internal AI systems before leaving in 2024, said that the billions of Apple devices around the world collectively amount to a powerful source of AI computing capacity.

“Apple could deploy the largest edge-compute AI that anybody has in the world,” said Suman, who is now co-founder of Maple, a startup offering customers an encrypted system for accessing AI models in the cloud. “It’s really a matter of time before they leverage that.”

Apple still can’t rely entirely on on-device models for its AI strategy. Google’s full Gemini model has trillions of parameters, a rough measure of the complexity of an AI model. The full model requires so much computing horsepower that Apple has struggled to get it to work on its own internal server infrastructure, called Private Cloud Compute, which runs on the same Apple chips used in Mac computers, said people familiar with the situation.

Apple will likely have to take advantage of Google’s cloud infrastructure to run parts of the new Siri, former Apple engineers said. Even so, Apple is looking for ways to run AI services in the cloud while offering improved privacy protections. Its decision in recent weeks to approve the use of Nvidia’s confidential compute system to handle some of the processing of the bigger Gemini-based model inside Google Cloud is one such effort, people familiar with the partnership said.

Confidential compute is a security feature inside Nvidia graphics processing units that encrypts data and AI models as they are being processed. When enabled, it slightly slows down the processing of AI queries in the cloud, but it could help Apple keep its promises about protecting users’ privacy.

The plan would also represent a reversal from Apple’s original announcement of Apple Intelligence, in which the company said that Apple hardware inside its Private Cloud Compute system would process all AI queries that weren’t processed locally on users’ devices. Despite that, Apple is likely to continue to use the Private Cloud Compute brand, said people familiar with the partnership.