The Information: Tencent’s New Model Shows Improvement, Partly Thanks to Anthropic

The Takeaway

Tencent’s new Hy3 AI model earns positive reviews from developers.
Tencent employees used Anthropic’s Claude to fine-tune its Hy3 model.
Claude access violates Anthropic’s China ban.

Chinese tech giant Tencent’s latest AI model has generated positive reviews from developers. But the company probably owes some of that success to Anthropic.

Tencent employees used Anthropic’s Claude to assist them with evaluating and fine-tuning the model, known as Hy3, to improve its performance, according to two people with direct knowledge of the matter and Tencent’s internal memos reviewed by The Information. That’s despite the fact that Anthropic doesn’t offer its models and services to companies in countries such as China that are considered to be U.S. adversaries.

It is unclear how Tencent’s China-based employees obtained access to Claude, although the model is widely used by Chinese tech companies and startups, which get access through intermediary services that use non-Chinese mobile numbers or non-Chinese credit cards.

Earlier this month, Anthropic tightened identification requirements for usage of its models, requiring some customers to provide a government-issued photo ID and an image of themselves taken with their phone or webcam. The move triggered concerns among the many affected Chinese startups and programmers who rely on Claude.

Tencent released a free preview of Hy3 last week. The company says it is “the most intelligent model” in its series of AI models known as HY, or Hunyuan. Hy3 is Tencent’s first major model release since Yao Shunyu, a former OpenAI researcher, joined Tencent as its chief AI scientist in September. Yao’s hiring is part of Tencent’s ongoing makeover to catch up to rivals Alibaba and ByteDance in foundational model development.

When developing Hy3, Tencent employees used Anthropic’s AI coding tool, Claude Code, in the post-training stage known as reinforcement learning from human feedback.

Tencent enlisted employees to participate as human evaluators in that phase of the model’s development. The company provided them with instructions on how to install Claude Code, according to one of the people and the memos.

Tencent capped each evaluator’s access to Claude Code at “thousands of tokens,” one of the memos showed. A token is approximately four characters in the English language.
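As a rough illustration of that rule of thumb, the sketch below converts between characters and tokens. The four-characters-per-token ratio is only an approximation for English text, the function names are hypothetical, and the 5,000-token budget in the usage note is an assumed figure, since the memo said only “thousands of tokens.”

```python
# Rough heuristic from the article: one token is about four English characters.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return an approximate token count for English text."""
    # Ceiling division, so any non-empty text counts as at least one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def max_chars_for_budget(token_budget: int) -> int:
    """Return roughly how many English characters fit in a token budget."""
    return token_budget * CHARS_PER_TOKEN
```

Under these assumptions, `max_chars_for_budget(5000)` gives 20,000 characters. Real tokenizers vary by model and by language, so this is an estimate rather than an exact count.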

An Anthropic spokesperson said that “for national security reasons, Anthropic does not currently offer commercial access to Claude in China, or to subsidiaries of their companies located outside of the country.”

The spokesperson also said the company continues to detect and prevent attempts to illicitly use Claude to train competing models.

“Our safeguards and threat intelligence teams actively monitor for distillation attacks, and we take swift action when we identify accounts or organizations engaged in it,” the spokesperson said, without directly commenting on Tencent’s use of Claude Code.

Distillation, a common technique in AI model development, involves training a smaller or less capable model based on the outputs of a larger or more capable one. The process usually requires feeding outputs of the stronger model into a new model under development.
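In its simplest form, the process described above amounts to collecting a stronger model’s outputs and using them as training targets for the weaker one. The sketch below shows only the data-collection step, under stated assumptions: `toy_teacher` is a stand-in function, not a real model call, and all names are hypothetical.

```python
from typing import Callable


def build_distillation_set(
    teacher: Callable[[str], str], prompts: list[str]
) -> list[dict[str, str]]:
    """Pair each prompt with the teacher's output as a supervised target."""
    return [{"prompt": p, "target": teacher(p)} for p in prompts]


def toy_teacher(prompt: str) -> str:
    # Stand-in for a stronger model; a real teacher would be a model call.
    return prompt.upper()
```

A student model would then be fine-tuned on these prompt and target pairs. That training step is omitted here.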

Tencent employees don’t consider their use of Claude Code in Hy3’s development to be distillation, because they didn’t feed Claude’s responses into Hy3. They said their work with Claude Code is more akin to benchmarking, using another model considered best in its class as a reference so the new model learns to make better decisions through trial and error. It is very common for AI model developers to use leading models such as Claude for such purposes during post-training work, according to employees at other AI companies.

For example, in benchmarking, Tencent employees posed the same programming questions to two unidentified models and scored the responses in different dimensions, according to one of the people with direct knowledge and the internal memos. The employees didn’t know which models they were interacting with while scoring. They also relied on Claude Code to generate examples of high-quality AI behavior in real time and used them as guiding references to analyze the Tencent model and filter out low-quality responses.
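The blind side-by-side scoring described above can be sketched as follows. This is a minimal illustration, not Tencent’s actual tooling: the scoring dimensions, function names and model names are all hypothetical, since the memos’ specifics were not reported.

```python
import random
from statistics import mean

# Hypothetical scoring dimensions; the article does not name the real ones.
DIMENSIONS = ("correctness", "readability", "efficiency")


def blind_pair(response_by_model, rng):
    """Shuffle two models' responses and hide them behind labels "A" and "B".

    Returns (blinded, key). `blinded` maps label -> response text and is what
    the evaluator sees. `key` maps label -> model name and stays hidden until
    scoring is finished.
    """
    if len(response_by_model) != 2:
        raise ValueError("expected responses from exactly two models")
    models = list(response_by_model)
    rng.shuffle(models)
    labels = ("A", "B")
    blinded = {label: response_by_model[m] for label, m in zip(labels, models)}
    key = dict(zip(labels, models))
    return blinded, key


def unblind(scores_by_label, key):
    """Average each label's per-dimension scores and map back to model names."""
    return {
        key[label]: mean(scores[d] for d in DIMENSIONS)
        for label, scores in scores_by_label.items()
    }
```

An evaluator would score `blinded["A"]` and `blinded["B"]` on each dimension without seeing `key`, and only afterward would `unblind` reveal which model earned which average.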

Although it is not considered state of the art, the Hy3 preview has received positive reviews from developers for its efficiency in coding relative to similar-size models. It’s currently ranked ninth in the programming category by usage, according to OpenRouter, which helps AI app developers access hundreds of models from a single application programming interface.