China AI LLM Weekly Usage Surpasses US by 2x: What 7.9 Trillion Tokens Really Means

Published on: 2026-05-25

Summary: Openrate's latest data shows that in the first week of May 2026, China's AI large language model weekly API usage reached 7.941 trillion tokens, while the US recorded only 3.76 trillion—China leads by a factor of 2.11. This gap isn't accidental. It reflects the accelerating pace of AI application deployment in China: from enterprise-grade Agents to personal assistants, from e-commerce customer service to government services, domestic LLMs are leveraging scale to claim market dominance. This article dissects the structural reasons behind 7.9 trillion tokens and explores what it means for China's AI industry.

7.9 Trillion Tokens: A Number That Demands Attention

Let's start with the data itself. Openrate's statistics cover the API usage of major AI LLM service providers worldwide, including OpenAI, Google, Anthropic, Baidu, Alibaba, ByteDance, and Tencent. The data for the first week of May 2026 shows:

  • China: 7.941 trillion tokens weekly
  • United States: 3.76 trillion tokens weekly
  • Other regions: approximately 1.2 trillion tokens

China accounts for 61.6% of the global total usage (approximately 12.9 trillion tokens).
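
The share and ratio follow directly from the three regional figures. A minimal sketch that recomputes them from the numbers quoted above (the variable names are illustrative, not from Openrate):

```python
# Weekly token volumes in trillions, as quoted in the list above.
weekly_tokens = {"China": 7.941, "United States": 3.76, "Other regions": 1.2}

total = sum(weekly_tokens.values())
china_share = weekly_tokens["China"] / total
china_vs_us = weekly_tokens["China"] / weekly_tokens["United States"]

print(f"Global total: {total:.1f} trillion tokens")   # 12.9
print(f"China share of total: {china_share:.1%}")     # 61.6%
print(f"China / US ratio: {china_vs_us:.2f}x")        # 2.11x
```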

This isn't the first time China's AI usage has led, but a gap exceeding 2x is unprecedented. In the same period of 2025, China was approximately 1.4x the US; by late 2025, the gap widened to 1.7x; by May 2026, it broke through 2x.

The growth rate comparison is even more telling: China's AI usage grew 312% year-over-year, while the US grew 89%. China's growth rate is 3.5 times that of the US.
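
As a rough cross-check, the growth rates can be turned back into implied volumes for the same week of 2025. The sketch below assumes both year-over-year rates are measured against that same week, which the source does not state. Under that assumption the implied year-ago ratio comes out close to parity rather than the roughly 1.4x cited above, so the growth rates and the ratio series may rest on different baselines or provider coverage.

```python
# Year-over-year growth rates as quoted above (312% and 89%).
china_growth = 3.12
us_growth = 0.89

# How many times faster China grew than the US.
growth_ratio = china_growth / us_growth

# Assumption: both rates compare the first week of May 2026
# with the same week of 2025.
china_now, us_now = 7.941, 3.76  # trillions of tokens per week
china_year_ago = china_now / (1 + china_growth)
us_year_ago = us_now / (1 + us_growth)
implied_ratio_year_ago = china_year_ago / us_year_ago

print(f"Growth-rate ratio: {growth_ratio:.1f}x")
print(f"Implied China volume a year ago: {china_year_ago:.2f} trillion")
print(f"Implied US volume a year ago: {us_year_ago:.2f} trillion")
print(f"Implied China / US ratio a year ago: {implied_ratio_year_ago:.2f}x")
```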

These numbers are striking, but they require careful interpretation. Raw token volume is a measure of activity, not of capability or value creation. To understand what 7.9 trillion tokens really means, we need to look at the structural factors driving this volume—and the qualitative questions that volume alone cannot answer.

Why China? Dissecting the Structural Reasons Behind 7.9 Trillion

Numbers don't speak for themselves, but the structure behind them is worth examining. China's AI usage far exceeds that of the US not because of a single factor but because of several overlapping forces.

Reason 1: Application-Driven AI, Not Research-Driven AI

The core driver of the US AI ecosystem is "frontier research"—OpenAI, Google DeepMind, and Anthropic compete on whose model is smarter and has stronger reasoning capabilities. Applications are secondary.

China is different. The core driver of China's AI ecosystem is "application deployment"—Baidu, Alibaba, ByteDance, and Tencent compete on whose model is used in more scenarios. Parameter count and benchmark scores are means, not ends.

This leads to a critical difference: US AI usage is concentrated in a few high-end scenarios (code generation, research assistance, creative writing), while China's AI usage is distributed across a vast number of everyday scenarios (customer service, translation, content generation, government services, education). The latter's base is inherently larger.

An e-commerce platform's AI customer service agent might handle millions of conversations per day, each consuming hundreds to thousands of tokens. When dozens of such e-commerce platforms in China are running AI customer service simultaneously, token consumption naturally skyrockets. Multiply this across every major industry—finance, healthcare, education, government—and the scale becomes self-reinforcing.
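
To get a feel for the order of magnitude, here is a back-of-the-envelope sketch. The conversation counts, token counts, and platform counts are illustrative assumptions chosen to match "millions of conversations," "hundreds to thousands of tokens," and "dozens of platforms"; they are not measured figures.

```python
def customer_service_weekly_tokens(conversations_per_day, tokens_per_conversation,
                                   platforms, days=7):
    """Estimate weekly tokens for AI customer service across several platforms."""
    return conversations_per_day * tokens_per_conversation * platforms * days

# Hypothetical low and high scenarios, not data from the article.
low = customer_service_weekly_tokens(1_000_000, 200, 24)
high = customer_service_weekly_tokens(5_000_000, 2_000, 36)

print(f"Low scenario:  {low / 1e12:.3f} trillion tokens per week")
print(f"High scenario: {high / 1e12:.3f} trillion tokens per week")
```

Even the high scenario covers only one use case in one industry, which is why the multiplication across sectors matters for the aggregate figure.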

This structural difference has important implications. In the US, AI is often positioned as a premium tool for knowledge workers—a force multiplier for skilled professionals. In China, AI is positioned as infrastructure—a utility that touches every aspect of daily life. The former creates high-value usage; the latter creates high-volume usage. Both are valid strategies, but they produce very different token counts.

Reason 2: Price War-Triggered Usage Explosion

Starting in the second half of 2025, China's AI industry experienced an unprecedented price war. Baidu's ERNIE, Alibaba's Qwen, ByteDance's Doubao, and Tencent's Hunyuan successively slashed API prices to near-free levels.

Specifically, in early 2025, mainstream domestic LLM API prices were approximately 0.12 RMB per 1,000 tokens; by May 2026, prices had dropped to 0.008 RMB per 1,000 tokens, a decline of about 93% in under a year and a half.
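
The size of the drop is easy to verify, and it is more tangible when applied to a workload. In the sketch below, the one-billion-token workload is a hypothetical example; only the two prices come from the text.

```python
old_price = 0.12    # RMB per 1,000 tokens, early 2025
new_price = 0.008   # RMB per 1,000 tokens, May 2026

decline = (old_price - new_price) / old_price

def cost_rmb(tokens, price_per_1k):
    """Cost in RMB of processing `tokens` at a given price per 1,000 tokens."""
    return tokens / 1000 * price_per_1k

workload = 1_000_000_000  # hypothetical: one billion tokens

print(f"Price decline: {decline:.1%}")                                # 93.3%
print(f"Cost at old price: {cost_rmb(workload, old_price):,.0f} RMB") # 120,000 RMB
print(f"Cost at new price: {cost_rmb(workload, new_price):,.0f} RMB") # 8,000 RMB
```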

The direct consequence of this price collapse was that the cost barrier to usage all but disappeared. Previously, only core business functions could justify calling LLM APIs; now even internal tools, edge-case business processes, and individual developers are using them. When cost stops being a binding constraint, usage expands rapidly.

The price war dynamics are worth examining in detail. Unlike the US market, where OpenAI and Anthropic have maintained premium pricing (with cheaper models serving as loss leaders for ecosystem lock-in), Chinese providers engaged in a race to the bottom. This was driven by several factors: the availability of open-source foundation models (which lowered the floor for competitive pricing), the strategic importance of capturing developer mindshare in a rapidly growing market, and the willingness of large tech conglomerates to subsidize AI divisions with revenue from other business lines.

The result is a market where the marginal cost of AI usage approaches zero for consumers. This is great for adoption but raises serious questions about sustainability. When prices are this low, the only path to profitability is through massive volume—and even then, the margins are razor-thin.

Reason 3: Scaled Deployment of Enterprise AI Agents

This is the most easily overlooked but most important factor.

Starting in the second half of 2025, the deployment of enterprise-grade AI Agents in China entered an explosive phase. Unlike the US's "personal assistant + code tools" model, Chinese enterprise AI Agents are concentrated in the following scenarios:

  • E-commerce: Intelligent customer service, product description generation, product recommendation
  • Finance: Risk control assistance, compliance review, investment research reports
  • Government services: Service guidance, policy interpretation, document drafting assistance
  • Manufacturing: Quality control, supply chain optimization, predictive maintenance
  • Education: AI teaching assistants, personalized exercises, essay grading

These scenarios share a common characteristic: high-frequency, batch, continuous usage. An AI customer service agent might run 24/7, processing dozens of conversations per second; an investment research agent might generate hundreds of report summaries per day. They don't operate like ChatGPT, which individual users open occasionally—they run like industrial equipment, continuously.

The industrial metaphor is apt. In the same way that factories run machines continuously to maximize return on capital investment, Chinese enterprises are running AI Agents continuously to maximize return on technology investment. The token volume generated by this mode of operation dwarfs the volume generated by individual chat interactions, even when those interactions number in the hundreds of millions.
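
A small sketch makes the contrast between a continuously running agent and an occasional chat user concrete. All rates and token counts here are assumptions for illustration ("dozens of conversations per second" is taken as 24).

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

def agent_weekly_tokens(conversations_per_second, tokens_per_conversation):
    """Tokens consumed in a week by an agent that runs around the clock."""
    return conversations_per_second * tokens_per_conversation * SECONDS_PER_WEEK

def chat_user_weekly_tokens(sessions_per_week, tokens_per_session):
    """Tokens consumed in a week by one occasional chat user."""
    return sessions_per_week * tokens_per_session

# Hypothetical inputs, not figures from the article.
agent = agent_weekly_tokens(conversations_per_second=24, tokens_per_conversation=500)
user = chat_user_weekly_tokens(sessions_per_week=5, tokens_per_session=1_000)
equivalent_users = agent / user

print(f"One 24/7 agent: {agent:,} tokens per week")
print(f"One occasional user: {user:,} tokens per week")
print(f"Users needed to match one agent: {equivalent_users:,.0f}")
```

Under these assumptions, a single always-on deployment consumes as much as well over a million occasional users.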

KaiheAiBox's practice in the local AI deployment field also confirms this trend. The KaiheAiBox A1 Agent Computer enables enterprises to run AI Agents locally without continuously calling cloud APIs. But even with local inference, the enterprise demand for "continuously running AI" is real—they need AI online 24/7, not just when they open an app.

Reason 4: Mobile AI Penetration Rate Differential

China's mobile internet user base (approximately 1.05 billion) far exceeds that of the US (approximately 300 million), and AI applications are rapidly penetrating the mobile segment.

ByteDance's Doubao, Baidu's ERNIE App, and Tencent's Yuanbao all have monthly active users exceeding 50 million. Doubao's monthly active users reportedly exceed 100 million, a number that surpasses ChatGPT's monthly active users in the US.

The usage pattern of mobile AI is fragmented and high-frequency: voice-asking the weather, photo-based object recognition, menu translation, generating social media captions. Each individual call doesn't consume many tokens, but daily active users multiplied by a dozen or more calls per day creates an enormous aggregate volume.
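
The aggregation works as a simple product. In the sketch below, the daily active user count and tokens per call are hypothetical, since the article gives only monthly active users; the dozen calls per day comes from the paragraph above.

```python
def mobile_weekly_tokens(daily_active_users, calls_per_day, tokens_per_call, days=7):
    """Estimate weekly tokens from many small mobile AI calls."""
    return daily_active_users * calls_per_day * tokens_per_call * days

# Hypothetical inputs for a single app: 30 million daily users,
# 12 calls each per day, 150 tokens per call.
estimate = mobile_weekly_tokens(30_000_000, 12, 150)

print(f"Estimated volume: {estimate / 1e12:.3f} trillion tokens per week")
```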

The mobile dimension is particularly significant in China because of the country's unique digital ecosystem. In the US, desktop computing still plays a major role in AI usage—developers writing code, analysts processing data, writers drafting documents. In China, mobile-first is not just a preference but a structural reality. Hundreds of millions of users interact with AI primarily through their phones, in bite-sized sessions throughout the day. This creates a usage pattern that is qualitatively different from the desktop-centric pattern prevalent in the US: more sessions, shorter durations, simpler tasks, but vastly more total interactions.

The Other Side of the 2x Gap: Quality vs. Quantity Concerns

Having outlined the structural reasons, we also need to look calmly at the other side of this number.

High usage volume ≠ technical leadership.

The US's 3.76 trillion tokens are concentrated in higher-quality scenarios: GPT-5 and Claude 4 reasoning tasks, Sora video generation, Copilot code assistance. These scenarios demand far more from model capabilities than customer service conversations and content generation.

In other words, if we calculate "economic value generated per token," the US may still lead. A significant proportion of the 3.76 trillion tokens comes from high-value B2B scenarios; a large portion of the 7.9 trillion tokens comes from low-unit-price consumer-facing and enterprise edge scenarios.

This is the core contradiction of China's AI industry: scale leadership, but insufficient value density.

Another concern is profitability. The price war has driven API profits to the floor. Baidu, Alibaba, and ByteDance's AI businesses are currently operating at a loss, subsidized by other business units within their conglomerates. How long this "burn money for scale" model can be sustained depends on whether high-value business models can be built on the scale foundation.
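
For a sense of scale, the sketch below estimates the API revenue that 7.941 trillion weekly tokens would generate if every token were billed at the 0.008 RMB per 1,000 tokens cited earlier. That uniform price is a simplifying assumption: real traffic mixes free tiers, discounted tiers, and premium models, so this is an order-of-magnitude illustration, not a revenue figure.

```python
weekly_tokens_china = 7.941e12   # tokens per week, first week of May 2026
price_per_1k_rmb = 0.008         # assumed uniform price, RMB per 1,000 tokens

weekly_revenue_rmb = weekly_tokens_china / 1000 * price_per_1k_rmb
annualized_revenue_rmb = weekly_revenue_rmb * 52  # assumes volume stays flat

print(f"Weekly revenue at list price: {weekly_revenue_rmb / 1e6:.1f} million RMB")
print(f"Annualized at flat volume: {annualized_revenue_rmb / 1e9:.1f} billion RMB")
```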

The comparison with the US market is instructive. OpenAI, despite its own profitability challenges, has been able to command premium pricing for its most capable models. Enterprises willingly pay $200+ per month per seat for ChatGPT Enterprise because the productivity gains justify the cost. In China, the equivalent enterprise products are often priced at a fraction of this, reflecting both a different willingness-to-pay and a different competitive dynamic where free alternatives are always just a click away.

Three Judgments About China's AI Industry

Based on the 7.9 trillion token data and the structural analysis behind it, I offer three judgments:

Judgment 1: China's AI Application Density Will Double Again Before End of 2026

The price war continues, enterprise Agent deployment is just beginning to accelerate, and mobile AI penetration still has vast untapped markets in lower-tier cities. These three factors combined make it highly probable that China's AI usage will reach 15-20 trillion tokens per week by the end of 2026.
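
It helps to translate that range into a required growth pace. The sketch below assumes about 34 weeks between the first week of May and the end of December 2026, and steady compound growth week over week; both are simplifications.

```python
current = 7.941        # trillion tokens per week, first week of May 2026
weeks_remaining = 34   # assumption: first week of May to end of December 2026

def required_weekly_growth(start, target, weeks):
    """Constant week-over-week growth rate needed to go from start to target."""
    return (target / start) ** (1 / weeks) - 1

doubled = current * 2
growth_low = required_weekly_growth(current, 15.0, weeks_remaining)
growth_high = required_weekly_growth(current, 20.0, weeks_remaining)

print(f"A straight doubling gives {doubled:.1f} trillion tokens per week")
print(f"Reaching 15 trillion needs about {growth_low:.1%} growth per week")
print(f"Reaching 20 trillion needs about {growth_high:.1%} growth per week")
```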

The untapped potential in lower-tier cities deserves elaboration. While first-tier cities like Beijing, Shanghai, and Shenzhen have high AI adoption rates, the hundreds of millions of users in second, third, and fourth-tier cities represent a massive growth opportunity. These users are already comfortable with mobile-first digital services (WeChat Pay, short video platforms, e-commerce). Adapting AI services for this population—simpler interfaces, more voice interaction, tighter integration with existing super-apps—could unlock the next wave of token growth.

Judgment 2: The Inflection Point from "Usage Scale" to "Value Density" Will Arrive in 2027

Scale is not the goal; value is. Once usage volume is large enough, the industry's center of gravity will shift from "getting more people to use AI" to "making AI create more value." The specific manifestation: the proportion of high-end Agents (reasoning, decision-making, creative work) will increase, while the proportion of low-end Agents (simple conversations, template generation) will decrease.

This transition is already visible in the early signals. Several Chinese AI companies have begun shifting their marketing messages from "cheapest API" to "most capable agent." Enterprise customers, having experimented with basic AI applications, are now demanding more sophisticated capabilities. The market is maturing from "any AI is better than no AI" to "I need AI that actually solves my specific problem."

Judgment 3: Local AI Deployment Will Become a New Choice for Chinese Enterprises

Currently, the vast majority of the 7.9 trillion tokens are cloud API calls. But enterprise demand for data privacy, controllable costs, and low-latency responses is growing. The local AI deployment model represented by the KaiheAiBox A1 Agent Computer offers enterprises an alternative: run AI on your own premises, keep data local, and eliminate per-usage billing anxiety.

Local AI won't replace cloud AI, but it will become an important complement. For high-frequency, standardized, data-sensitive scenarios, the cost-effectiveness advantage of local deployment will become increasingly apparent. The hybrid architecture—local inference for routine tasks, cloud inference for complex ones—is likely to become the dominant enterprise AI deployment pattern.
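
A minimal sketch of what such hybrid routing could look like. The `Task` fields, the complexity score, and the 0.7 threshold are all hypothetical and chosen for illustration; this is not how the KaiheAiBox A1 or any specific product routes requests.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    complexity: float     # hypothetical score: 0.0 (routine) to 1.0 (hard)
    data_sensitive: bool  # True if the data must stay on premises

def route(task: Task, complexity_threshold: float = 0.7) -> str:
    """Return "local" or "cloud" for a task under a simple hybrid policy."""
    # Sensitive data stays local regardless of difficulty.
    if task.data_sensitive:
        return "local"
    # Routine work runs locally; complex work goes to the cloud.
    return "cloud" if task.complexity >= complexity_threshold else "local"

tasks = [
    Task("FAQ reply", 0.2, False),
    Task("Contract review", 0.9, True),
    Task("Market research synthesis", 0.85, False),
]
for task in tasks:
    print(f"{task.name}: {route(task)}")
```

In this policy the privacy rule takes precedence over the complexity rule, which reflects the point that data-sensitive scenarios are the strongest case for local deployment.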

This mirrors the broader trend in enterprise computing. Just as organizations moved from purely on-premises to cloud to hybrid cloud, AI deployment is following a similar trajectory. The initial wave was "everything in the cloud" for simplicity and speed. The next wave is "sensitive workloads on-premises, burst capacity in the cloud" for security and cost optimization.

The Real Question After 7.9 Trillion Tokens

7.9 trillion tokens is a milestone number, but it's not the destination. It proves China's advantage in AI application scale while exposing the shortfall in value density.

The question truly worth pondering isn't "why does China's AI usage exceed the US," but rather:

When usage volume is no longer the bottleneck, what do we do with all these tokens?

Do we use cheaper AI to do more of the same commoditized things, or do we use smarter AI to do more valuable things? Do we continue the race-to-the-bottom price war, or do we find paths that let AI genuinely create profits for enterprises?

7.9 trillion tokens has given China's AI industry a massive base advantage. But a base advantage only translates into a competitive advantage when it's converted into a quality advantage. From "largest scale" to "highest value"—this is the chasm China's AI industry must cross next.

The stakes are high. If China can make this transition, leveraging its scale advantage to build genuinely superior AI applications, then 7.9 trillion tokens will be remembered as the foundation of a dominant position. If it cannot, then 7.9 trillion tokens will be remembered as an impressive number that failed to translate into lasting competitive advantage. The difference between these two outcomes depends not on technology alone but on strategic choices: where to invest, what to optimize, and which trade-offs to make.

One thing is certain: the era of measuring AI progress primarily by token volume is ending. The next chapter will be measured by value created per token. And that's a much harder metric to game.


KaiheAiBox | The Agent Computer for Everyone · AI Frontier Tracker

© KAIHE AI - Agent Computer Specialist