>>> GOOGL I/O conf: CEO: We now process 3.2 quadrillion tokens per month,

Summary I/O conf: CEO: We now process 3.2 quadrillion tokens per month, a sevenfold jump from last year
- It's been 10 years since we shifted the company to be 'AI-first'
- Now have 375+ customers that process over 1T tokens per month
- AI mode search now has over 1B monthly users
- Gemini now has 900M monthly users
- Now have 13 products with over 1 billion users each, including five with more than 3 billion users
- To roll out 'Ask YouTube' AI search in the US this summer
- Demonstrates 'Doc Live' product allowing users to create and format docs with voice
- OpenAI and Kakao are adopting SynthID for watermarking their content (along with Nvidia which already adopted SynthID)
- Unveils Gemini 3.5 Flash model for agents and coding; available for everyone today; Gemini 3.5 Pro coming next month; Gemini 3.5 to be twice as cheap as rival models
- Introduces Gemini Spark, a personal 24/7 AI agent
- Introduces $100/month plan for developers; Reduces Ultra plan by $50 to $200/month
- Gemini users can now generate and edit Canva designs; Turn nano banana images into layered, editable designs in Canva
- DeepMind Technologies CEO Hassabis: Artificial General Intelligence is just a few years away
- Announces Gemini Omni AI model for video generation and editing; We are starting with video, but eventually Omni will be able to take any input and provide any output

The Information : Is the Gap Widening Between Anthropic and Open-Source Models?

Is the Gap Widening Between Anthropic and Open-Source Models?

Some developers have told me that the rising costs of frontier AI models from Anthropic and other firms could prompt them to shift to cheaper open-source AI. After all, when companies as sophisticated as Uber are accidentally blowing through their entire year’s AI budget in a matter of months, it makes sense to cut back by using a less capable open-source model to automate simpler tasks. (In fact, companies like Uber and Airbnb are doing exactly that!)

It’s not clear whether open-source AI is good enough to meet the challenge, though. For instance, one executive at a major customer of OpenAI and Anthropic told me that they’ve been trying to use open-source models like Moonshot AI’s Kimi K2.6 and DeepSeek V4. But while these models have performed well on benchmarks and are good at answering more surface-level questions in a variety of areas, they tend to struggle with follow-up questions or deeper lines of questioning, this executive said.

For instance, you could imagine a model doing well on a popular brainteaser but then struggling if you tweak a few assumptions or details in the brainteaser.

Of course, this is just one developer’s experience, and usage of open-source models does seem to be growing overall, based on data from inference provider OpenRouter.

But other data suggests that the performance gap between open- and closed-source AI is widening, such as an analysis from the National Institute of Standards and Technology. NIST’s analysis determined that the capabilities of DeepSeek V4, which was released in April, lag behind frontier models by about eight months. In comparison, DeepSeek R1, which was released in January 2025, lagged behind frontier models by around three to four months, according to the NIST analysis.

The executive I spoke with had a few ideas why the open-source models they were using tended to struggle on deeper or more detailed questions.

One possibility is that the models’ training datasets had limited coverage, meaning that they didn’t include enough examples of the full range of situations the models were expected to handle, the exec said.

For instance, the model could have been trained on the public web, which contains lots of intro-level blog posts explaining various topics, like ways to train a model. The web, however, does not contain more complex examples of what to do when those model training techniques go wrong.

Additionally, the open-source models may have been trained with a lot of data produced by closed-source models like OpenAI’s GPT and Anthropic’s Claude, a process known as distillation, the executive said. (OpenAI has previously accused Chinese AI developers of using distillation, and even well-known AI CEOs like Elon Musk have admitted to developing their AI using distillation of other frontier models.)

Distillation can be a cheap and quick way to improve the performance of a “dumber” model, but it can also backfire if researchers cut corners. You can think of bad distillation as telling a high schooler to memorize the answers to a physics exam: they might know enough to sound knowledgeable initially, but if you were to ask them any follow-up questions, their lack of knowledge would become apparent.

The increasing costs of AI could offset these performance shortcomings, however. If closed-source models continue to get more expensive, developers may increasingly look for cheaper alternatives.

And new tricks and workarounds like “ask-expert-mcp,” software that allows a weaker or cheaper model to ask a stronger model for help when it gets stuck, could also help make open-source models more usable.
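
The escalation idea described above can be illustrated with a minimal sketch. It is not the interface of “ask-expert-mcp,” which the article does not describe. The Answer type, the confident flag and both stub models are hypothetical stand-ins, assuming only that the cheaper model can signal when it is stuck.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    confident: bool  # hypothetical signal that the model is not stuck


Model = Callable[[str], Answer]


def cheap_model(question: str) -> Answer:
    """Stub for a cheaper model: only handles questions it has seen."""
    known = {"What is 2 + 2?": "4"}
    if question in known:
        return Answer(known[question], confident=True)
    return Answer("not sure", confident=False)


def expert_model(question: str) -> Answer:
    """Stub for a stronger, more expensive model."""
    return Answer(f"expert answer to: {question}", confident=True)


def answer_with_escalation(
    question: str, cheap: Model, expert: Model
) -> Tuple[str, str]:
    """Try the cheap model first; call the expert only when it is stuck."""
    first = cheap(question)
    if first.confident:
        return first.text, "cheap"
    return expert(question).text, "expert"


if __name__ == "__main__":
    for q in ["What is 2 + 2?", "Why did my training run diverge?"]:
        text, source = answer_with_escalation(q, cheap_model, expert_model)
        print(f"[{source}] {text}")
```

In a setup like this, the expensive model is billed only for the questions the cheaper model cannot handle, which is the cost trade-off the paragraph above points to.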

The Information : Amazon’s Nvidia Alternative Starts Winning Over AI Developers

Amazon’s Nvidia Alternative Starts Winning Over AI Developers

The Takeaway
  • Scarcity of Nvidia GPUs has made Amazon’s pitch more attractive
  • Amazon worked closely with Anthropic on improving Trainium software
  • Developers say Trainium documentation and support have improved recently

Amazon’s yearslong effort to build a serious alternative to Nvidia’s dominant AI chips is starting to gain traction.

Anthropic and OpenAI, which have struck multibillion-dollar investment and infrastructure deals with Amazon, have already committed to renting large amounts of current and future Trainium capacity. Now, recent software improvements are prompting smaller developers to consider moving more workloads to Trainium, half a dozen people who use or work with the chips said.

That includes Daniel Svonava, CEO of Superlinked, an infrastructure firm that helps companies run AI models on rented infrastructure. He said Amazon’s pitch on Trainium, including potential cost savings by switching to the chip, only recently started becoming more compelling.

“Our response has always been the lack of software support being a barrier,” Svonava said. “That’s the thing that changed in the last couple months. That barrier has been removed.”

The scarcity of Nvidia chips has also made Amazon’s pitch more attractive, with sales reps telling the startup they have limited availability on the latest graphics processing units. At the same time, Amazon has indicated it has more Trainium capacity available and is willing to be flexible on price, he said. Amazon has given Superlinked $200,000 worth of AWS credits, which it is using to test Trainium.

Bojan Jakimovski, machine-learning lead at Loka, which helps businesses train their own AI models, said interest in Trainium started spiking in the past couple of months in part because of issues securing Nvidia GPUs. With Trainium, clients know “they will have a reserved spot when they need to develop,” Jakimovski said.

One Loka client switched its inference workloads to second-generation Trainium chips earlier this year after tests showed doing so could cut costs up to 35% compared with Nvidia’s H100, Jakimovski said. For training a large language model, though, he would still recommend Nvidia.

The new interest comes as Amazon is betting that Trainium can improve the economics of its AI cloud business. In a January interview with The Information, CEO Andy Jassy said that while Amazon plans to continue buying Nvidia chips, “if you’re building a big inference business” that charges less and has sustainable margins, “you’re strategically disadvantaged if you don’t have your own custom silicon.”

Last month, Jassy said Amazon’s custom silicon business, including Trainium and Graviton, has reached a more than $20 billion annualized run rate, or roughly $50 billion if measured as a stand-alone chip seller. That $20 billion reflects revenue from customers using Trainium and Graviton directly through Amazon’s EC2 service, an Amazon spokesperson said. It excludes offerings such as Amazon’s Bedrock, which lets customers access AI models, and internal Amazon workloads.

“Customers are choosing Trainium because of years of architectural decisions that compound at scale: a chip designed in partnership with leading AI labs to be more efficient in how it computes, communicates and operates,” the spokesperson said. “Today, Trainium2 is being chosen by the leading AI labs in the world for their most demanding workloads, and Trainium3 delivers a 30 to 40 percent price performance improvement over Trainium 2.”

The Anthropic Test

Getting to this point took years of software work and close collaboration with Anthropic. Nvidia’s advantage was as much in software as in hardware: developers had spent years building around CUDA, while Amazon had to make its Trainium software, called Neuron, easy enough to justify switching.

Amazon announced Trainium in 2020 through its Annapurna Labs unit, initially pitching it as a cheaper way to train machine-learning models on AWS. When the first-generation chips launched, early internal users included Amazon’s search teams, which helped shape the chip’s development, according to someone with knowledge of the matter.

But when Amazon staff began ramping up generative AI products in late 2022, some teams did not use Trainium broadly, and Amazon’s Nova large language models were first trained on Nvidia GPUs, according to a former employee.

Amazon announced in 2023 that Anthropic would use Trainium and Inferentia to train and run future models, and by the following year had committed $8 billion to Anthropic. The two companies also teamed up to make Trainium faster and more efficient.

Anthropic and Amazon engineers worked closely to optimize Trainium for Anthropic’s models, talking frequently, according to Carlos Escapa, a former AWS executive who worked on selling Anthropic models. Anthropic and Amazon made software improvements that could also benefit other customers.

“The collaboration between Anthropic and AWS on the NKI [Neuron Kernel Interface] has been very, very deep,” Escapa said, referring to Amazon software that lets developers fine-tune how models run on Trainium chips. “And some of these features that have been developed for Anthropic have also become very useful for other companies.”

Some of the work involved software changes that helped Trainium perform more processes simultaneously, Escapa said. Anthropic co-founder Tom Brown has publicly described the broader effort as “a game of Tetris,” where a tight chip architecture makes models cheaper and faster.

By the end of 2024, Amazon had launched its second-generation Trainium chip broadly and announced Project Rainier, a large Trainium cluster for Anthropic. Inside Amazon, Trainium use began picking up in some areas, with Nova starting to use Trainium in 2024 and ramping up since then with pretraining in particular, a former employee said.

Bedrock, which offers access to Anthropic and other models, initially relied on GPUs, according to two people with knowledge of the product. One of the people said some Bedrock workloads in 2024 required roughly twice as many Trainium chips as Nvidia chips to handle the same workload.

Some members of the Trainium team were frustrated that the Bedrock team was not adopting Trainium faster, one of the people said, but over time, Bedrock staff became more convinced that the chips were competitive.

Amazon said it prioritized limited Trainium capacity for external customers such as Anthropic as demand accelerated. The company also said Bedrock used Trainium for models and tasks where the chips offered better cost and performance, while relying on GPUs for other models to keep Bedrock’s selection broad.

As the software matured, Amazon said, more Bedrock workloads moved to Trainium, which now runs the majority of Bedrock inference across more than 125,000 customers. Amazon also said it is planning to train its largest internal models on Trainium going forward.

Software Catch-Up

Outside Amazon, developers had their own frustrations with Trainium. Julien Simon, an AI operating partner at a private equity firm, first ran into issues while working at Hugging Face, where he spent three years. Hugging Face had worked with Trainium chips, and Amazon was sometimes slow to support newer models on the startup’s open-source platform, Simon said.

“You would reach out to [Amazon] saying, ‘We need you to support this slightly newer model that came out on Hugging Face last week,’ and the answer would be, ‘Yeah, maybe in six months,’” Simon said. Amazon says such delays were addressed with later open-source software integrations, and a current product director at Hugging Face, Jeff Boudier, said the company has been happy with the relationship.

Simon said he ran into similar issues at open-source AI developer Arcee, where from mid-2024 to mid-2025 he tried training and deploying models on Trainium chips. “Anything that works out of the box on [Nvidia’s software] requires custom engineering and custom development and back-and-forth with the Amazon teams,” he said.

Arcee’s CEO, Mark McQuade, said the company stopped trying to use first-generation Trainium chips in early 2025 and hasn’t tried newer versions.

Trainium also initially struggled with some capabilities, including support for dynamic shapes, which help models handle inputs that vary in size or format. Without that support, developers had to do more manual work to adapt models for different requests, said Jakimovski, who started testing the technology in late 2024.
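
The article does not say what that manual work looked like. One common workaround on hardware compiled for fixed input shapes is to pad every request up to one of a few preset sizes, sketched below. The bucket sizes, the padding ID and the function name are assumptions for illustration, not part of Amazon’s Neuron software.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

BUCKETS = [8, 16, 32, 64]  # hypothetical fixed lengths the model was compiled for
PAD_ID = 0                 # hypothetical padding token ID


def pad_to_bucket(
    token_ids: List[int],
    buckets: Sequence[int] = BUCKETS,
    pad_id: int = PAD_ID,
) -> Tuple[List[int], List[int]]:
    """Pad a variable-length input up to the smallest bucket that fits.

    Returns the padded tokens and a mask (1 = real token, 0 = padding),
    so the model only ever sees a small set of fixed shapes.
    """
    n = len(token_ids)
    i = bisect_left(buckets, n)  # index of the first bucket with size >= n
    if i == len(buckets):
        raise ValueError(f"input of length {n} exceeds largest bucket {buckets[-1]}")
    size = buckets[i]
    padding = size - n
    return token_ids + [pad_id] * padding, [1] * n + [0] * padding


if __name__ == "__main__":
    padded, mask = pad_to_bucket([5, 6, 7])
    print(len(padded), mask)
```

The cost of this approach is wasted compute on padding and extra code to manage it, which is the kind of overhead that native dynamic-shape support removes.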

Kevin Gomes, a graduate student at Cornell University conducting AI research with Trainium, said he also found the chips difficult to use at first because the documentation was lacking. “It’s not very well documented, so you have no idea how to fix it,” Gomes said.

In recent months, however, several customers said Amazon has made Trainium much easier to use by improving documentation and support and making the chips work better with popular open-source tools.

That included a native PyTorch integration Amazon unveiled in December, an important step because PyTorch is many developers’ default programming platform and has long worked best with Nvidia. Before the integration, developers often had to write code in PyTorch and adapt it to Amazon’s Neuron software.

Amazon also fixed Trainium’s dynamic shapes issue and made its software easier to customize, helping models run faster, Jakimovski said. “They started to listen to developers,” he said. “I didn’t have that feeling two years ago.”

The Next Test

In February, Amazon said OpenAI would take around 2 gigawatts of Trainium capacity, including Trainium3 and the upcoming Trainium4, alongside an initial $15 billion Amazon investment.

Amazon has also paired Trainium with Cerebras, another OpenAI chip partner. OpenAI announced in January that it would use Cerebras systems for 750 megawatts of high-speed inference compute. Two months later, Amazon said it would deploy Cerebras systems in AWS data centers alongside Trainium to deliver faster inference through Bedrock.

Amazon later expanded its Anthropic partnership, with Anthropic committing to spend more than $100 billion on AWS over the next decade, including on Trainium capacity. Amazon also said it would invest another $5 billion in Anthropic, with up to $20 billion more tied to future milestones.

Late last month, Amazon said Trainium2 is largely sold out, Trainium3 is nearly fully subscribed and much of Trainium4, which is about 18 months from broad availability, is reserved. The Amazon spokesperson said Trainium’s customer base “extends well beyond OpenAI and Anthropic,” citing examples including Uber and Decart.

Still, Amazon has not detailed how much of that demand comes from a small number of large customers, and it’s unclear how much of a market exists beyond those AI giants and Amazon’s own services. Many AI-heavy companies buy model access through application programming interfaces or Amazon’s Bedrock instead of renting chips directly, Svonava said, and most companies don’t want to actively evaluate the underlying chips.

Trainium also hasn’t fully displaced Nvidia inside Amazon. Some of the models underpinning Amazon’s shopping AI still use Nvidia chips exclusively, someone with direct knowledge of the product said. Anthropic is still securing Nvidia capacity too, recently announcing a deal with SpaceX to access more than 220,000 Nvidia chips within the coming month.

>>> US Gapping down

Gapping down
In reaction to earnings/guidance:
  • XP -5.1%, LUXE -3.9%, NRGV -1.6%
Other news:
  • SITM -5.3% (Offers $1.1 billion of 2031 convertible senior notes to help fund Renesas timing business asset acquisition)
  • YMT -4.5% (Received Nasdaq notice for failing minimum market value listing requirement; has until November 9, 2026 to regain compliance.)
  • CRWV -3.7% (Blackstone announced a joint venture with Google to launch a U.S.-based TPU cloud company.)
  • NBIS -3.3% (Announced a joint venture with Google to launch a U.S. TPU cloud company, backed by Blackstone's $5 billion commitment.)
  • STX -3.1% (Underperformed yesterday after CEO said building new factories would take too long to keep pace with surging memory demand.)
  • AKAM -2.6% (Announced a $2.6 billion convertible senior notes offering and plans to use proceeds for note hedges and share repurchases.)
  • LITE -2.3% (AIXTRON announced it received multiple orders for its G10-AsP platform from Lumentum.)
  • HUT -2.3% (Agreed to invest about $16 million to expand water infrastructure for its River Bend AI data center campus.)
  • VRT -2.0% (Follow-through selling on heels of yesterday's weakness.)
  • KRRO -1.7% (Added KRRO-111 for potential AATD treatment to its pipeline, a GalNAc-conjugated oligonucleotide designed to restore normal AAT protein.)
  • GSIT -1.7% (Awarded Phase I of a Smart City project by Hsinchu County, Taiwan, deploying its Gemini-II APU pilot system.)
  • ERIC -1.6% (Partners with Net Feasa to provide real-time agentic AI-based connectivity and monitoring for maritime and container shipping.)
  • ALGT -1.5% (Added eight new nonstop routes and offered introductory one-way fares as low as $59 plus bonus rewards points.)
  • CGEN -1.4% (Files for a $400 million mixed securities shelf offering.)
  • ONTO -1.4% (Priced upsized $1.3 billion private offering of 0.00% convertible senior notes due 2031.)

>>> US Gapping up

Gapping up
In reaction to earnings/guidance:
  • AGYS +19.6%, EXP +6.5%, AS +2.9%, BILI +1.3%, CMBT +7.7%, ZBIO +5.7%, HSAI +1.2%
Other news:
  • RLAY +10.6% (Announced promising Phase 2 zovegalisib data in vascular anomalies, showing 60% volumetric response and supportive safety profile)
  • GILT +7.9% (Reached milestone with Boeing toward offering Sidewinder electronically steered antenna as future line-fit in-flight connectivity solution)
  • INFU +3.9% (Authorizes a new $20 million stock repurchase program running from July 1, 2026 through June 30, 2028.)
  • WVE +3.9% (Announces positive RestorAATion-2 trial update as WVE-006 achieved MZ-like phenotype with biweekly and monthly dosing)
  • ECX +2.9% (Announced strategic framework agreement with May Mobility to develop and deliver autonomy-enabled vehicles and exclusive AV technology components.)
  • TLX +2.7% (Completes patient enrollment for IPAX-2 study of TLX101-Tx in newly diagnosed glioblastoma; no dose-limiting toxicities observed.)
  • MLYS +2.2% (Announced Phase 3 Launch-HTN lorundrostat data in hypertension and CKD will be presented orally at ESH 2026.)
  • GDRX +2.0% (Integrated its discounts into TrumpRx.gov's expanded generic drug price transparency platform covering more than 600 medications.)
  • CGEM +1.9% (Receives FDA orphan drug designation for CLN-049 in relapsed/refractory acute myeloid leukemia.)

>>> US Research Calls I

Research Calls I
  • Upgrades:
    • American Tower (AMT) upgraded to Outperform from Market Perform at Bernstein, tgt $207
    • Assured Guaranty (AGO) upgraded to Buy from Neutral at UBS, tgt $94
    • Credicorp (BAP) upgraded to Buy from Hold at HSBC, tgt $350
    • Jazz Pharmaceuticals (JAZZ) upgraded to Buy from Neutral at UBS, tgt $307
    • Progyny (PGNY) upgraded to Buy from Hold at Canaccord, tgt $30
    • StubHub Holdings (STUB) upgraded to Buy from Neutral at Guggenheim, tgt $12.50
  • Downgrades:
    • Biofrontera (BFRI) downgraded to Speculative Buy from Buy at Benchmark, tgt $3
    • CrowdStrike (CRWD) downgraded to Sell from Buy at DZ Bank, tgt $500
    • Fortinet (FTNT) downgraded to Hold from Buy at DZ Bank, tgt $125
    • James River Group Holdings (JRVR) downgraded to Neutral from Buy at UBS, tgt $4.75
    • The Hanover Insurance Group (THG) downgraded to Market Perform from Outperform at BMO Capital, tgt $203
  • Others:
    • Alnylam Pharmaceuticals (ALNY) initiated with a Buy at Citigroup, tgt $380
    • Ascendis Pharma (ASND) initiated with a Buy at Citigroup, tgt $355
    • BioMarin Pharmaceutical (BMRN) initiated with a Buy at Citigroup, tgt $75
    • BridgeBio Pharma (BBIO) initiated with a Neutral at Citigroup, tgt $82
    • CG Oncology (CGON) initiated with a Peer Perform at Wolfe Research
    • Corpay (CPAY) initiated with a Buy at Loop Capital, tgt $406
    • CytomX Therapeutics (CTMX) initiated with an Outperform at Wolfe Research, tgt $6
    • Cytokinetics (CYTK) initiated with a Buy at Citigroup, tgt $99
    • Interface (TILE) initiated with a Buy at Benchmark, tgt $36
    • Ionis Pharmaceuticals (IONS) initiated with a Buy at Citigroup, tgt $115
    • Lightspeed POS (LSPD) reinstated with an Underperform at BofA Securities, tgt $10
    • Mirum Pharmaceuticals (MIRM) initiated with an Outperform at Wolfe Research, tgt $145
    • NeOnc Technologies (NTHI) initiated with a Buy at Alliance Global, tgt $13
    • Oklo (OKLO) initiated with a Peer Perform at Wolfe Research
    • Pilgrim's Pride (PPC) reinstated with a Neutral at UBS, tgt $30
    • Rapport Therapeutics (RAPP) assumed with a Buy at Truist, tgt $56
    • Tyra Biosciences (TYRA) initiated with a Peer Perform at Wolfe Research
    • X-energy (XE) initiated with a Hold at Jefferies, tgt $28
    • X-energy (XE) initiated with a Buy at Guggenheim, tgt $57
    • X-energy (XE) initiated with a Buy at TD Cowen, tgt $35
    • X-energy (XE) initiated with a Buy at UBS, tgt $40
    • X-energy (XE) initiated with an Overweight at JPMorgan, tgt $38
    • X-energy (XE) initiated with an Overweight at Morgan Stanley, tgt $41
    • X-energy (XE) initiated with a Peer Perform at Wolfe Research
    • Zeta Global Holdings (ZETA) reinstated with a Buy at BofA Securities, tgt $24