A Chinese Company Quietly Released a Thinking AI Agent

Published on: 2026-05-27

Summary: While the West debates whether AI agents can "think," a Chinese AI company has shipped one that does, not in the philosophical sense but in the practical one: it plans before acting, remembers past interactions, reflects on its mistakes, and iterates on its own output. This is the third level of AI agent evolution: cognitive-planning agents that go beyond instruction-execution and adaptive-execution. The technology builds on several years of reasoning research: Chain-of-Thought → Tree-of-Thought → ReAct → Reflection. China has three structural advantages in this race: application scenario richness, user data volume, and iteration speed. Representative products (Manus, AutoGLM, and Kimi) are already in production. The hardware implication: thinking agents need 24/7 runtime, low power consumption, and physical isolation from the user's primary device. That's exactly what KaiheAiBox A1 provides.


Three Levels of AI Agents

Not all AI agents are created equal. Understanding the three levels is essential for understanding why "thinking agents" matter.

Level 1: Instruction-Execution Agents

What they do: Execute a specific instruction with a specific tool.

Example: "Set a timer for 10 minutes" → Siri sets the timer.

Limitation: They can only do what they're explicitly told. If the instruction is ambiguous, they fail. If the execution path breaks, they stop.

Current state: This is what most people think of when they hear "AI agent." Siri, Alexa, Google Assistant, and most chatbot-based agents are Level 1.

Level 2: Adaptive-Execution Agents

What they do: Execute a task with the ability to adapt when the execution path changes.

Example: "Order me a large iced latte from Luckin Coffee" → Agent opens Luckin app → "Large iced latte is sold out" → Agent adapts: "Large iced Americano instead?" → User confirms → Agent completes the order.

Improvement over Level 1: The agent can handle unexpected states (sold out items, app crashes, network errors) without human intervention.

Limitation: The agent still follows a pre-defined task structure. It can adapt the execution, but it can't redefine the task. If the user's real goal is "I need caffeine," the agent won't suggest switching to a different coffee shop — it'll just try harder to complete the Luckin order.
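The Luckin example can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: `Shop`, `adaptive_order`, and the `confirm` callback are hypothetical names, and the stock list is made up to mirror the sold-out latte.

```python
from dataclasses import dataclass, field


@dataclass
class Shop:
    """Hypothetical stand-in for a coffee app; not a real Luckin API."""
    name: str
    in_stock: set = field(default_factory=set)

    def order(self, item: str) -> bool:
        # True means the order went through.
        return item in self.in_stock


def adaptive_order(shop, wanted, fallbacks, confirm):
    """Level 2 behavior: adapt the execution, never the task.

    Tries the requested item, then each fallback the user confirms.
    The search never leaves the one shop it was given.
    """
    if shop.order(wanted):
        return wanted
    for alternative in fallbacks:
        if confirm(alternative) and shop.order(alternative):
            return alternative
    return None  # a Level 2 agent stops here; it does not try another shop


luckin = Shop("Luckin Coffee", in_stock={"large iced Americano"})
result = adaptive_order(
    luckin,
    wanted="large iced latte",
    fallbacks=["large iced Americano"],
    confirm=lambda item: True,  # assume the user says yes
)
print(result)
```

Note what is missing: nothing in `adaptive_order` can ask whether another shop would serve the user's goal better. That gap is what Level 3 addresses.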

Current state: AutoGLM, Rabbit R1, and most "AI agent" products shipping today are Level 2.

Level 3: Cognitive-Planning Agents

What they do: Understand the user's underlying goal, plan a multi-step strategy, execute the plan, reflect on results, and iterate.

Example: "I need to prepare for my board meeting next Thursday" → Agent:

  1. Plans: "I need to: (a) review last quarter's financials, (b) check action items from the previous meeting, (c) prepare a status update on the three strategic initiatives, (d) anticipate likely questions."
  2. Executes: Pulls financial data from the ERP, retrieves previous meeting minutes, collects status updates from project managers, identifies likely questions based on past meeting patterns.
  3. Reflects: "The Q3 revenue was 15% below target. The board will likely ask about this. I should prepare a detailed explanation."
  4. Iterates: Adds a section on Q3 revenue analysis, including root cause and corrective actions.
  5. Delivers: A complete board meeting preparation package, delivered 24 hours before the meeting.

Improvement over Level 2: The agent doesn't just execute — it plans, reflects, and iterates. It understands the "why" behind the "what."
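One way to picture the plan, execute, reflect, iterate cycle is a work queue that reflection is allowed to extend. The sketch below is an assumption-laden toy: `plan`, `execute`, and `reflect` are hypothetical stubs returning canned values, where a real agent would call a model and real systems such as an ERP.

```python
from collections import deque


def plan(goal):
    """Hypothetical planner: a real agent would ask a model for these steps."""
    return deque([
        "review last quarter's financials",
        "check action items from the previous meeting",
        "prepare status update on strategic initiatives",
        "anticipate likely questions",
    ])


def execute(step):
    """Stub executor returning canned findings instead of calling real systems."""
    findings = {
        "review last quarter's financials": {"q3_revenue_vs_target": -0.15},
    }
    return findings.get(step, {})


def reflect(step, result, done):
    """Return follow-up steps suggested by a result (empty list if none)."""
    follow_up = "explain Q3 revenue shortfall"
    if result.get("q3_revenue_vs_target", 0) < -0.10 and follow_up not in done:
        return [follow_up]
    return []


def run_agent(goal, max_steps=20):
    queue, done = plan(goal), []
    while queue and len(done) < max_steps:
        step = queue.popleft()
        result = execute(step)
        done.append(step)
        # Iterate: reflection may extend the plan with new work.
        queue.extend(reflect(step, result, done))
    return done


package = run_agent("prepare for board meeting")
print(package)
```

The design point is that the plan is mutable. The four planned steps run in order, and the 15% shortfall found in the first one adds a fifth step that nobody asked for explicitly.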

Current state: This is what the "thinking AI agent" from the title delivers. It's not perfect, but it's real, and it's shipping in China today.

The jump from Level 2 to Level 3 is not incremental. It's the difference between a GPS that follows a route and a chauffeur who knows when to take a detour. One follows instructions; the other understands the journey.

The Technology: How Thinking Agents Work

The "thinking" in thinking agents isn't magic. It's the result of several years of fast-moving research in AI reasoning:

Chain-of-Thought (CoT) — 2022

The breakthrough: instead of asking a model to produce an answer directly, ask it to produce the reasoning steps first.

Before CoT: "What is 347 × 892?" → Model replies with a single number and no working (often wrong)

After CoT: "What is 347 × 892? Think step by step." → Model: "Let me break this down: 347 × 800 = 277,600; 347 × 90 = 31,230; 347 × 2 = 694; Total = 277,600 + 31,230 + 694 = 309,524" (usually correct)

CoT taught models to show their work, which dramatically improved accuracy on reasoning tasks.
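The decomposition in the example is ordinary place-value arithmetic, which makes it easy to check. The helper below (a hypothetical name, written only for this illustration and assuming a non-negative integer multiplier) reproduces the same intermediate steps a CoT response would spell out.

```python
def multiply_step_by_step(a: int, b: int):
    """Split b by place value and sum the partial products."""
    steps = []
    total = 0
    place = 1
    for digit in reversed(str(b)):
        part = int(digit) * place
        if part:  # skip zero digits, they contribute nothing
            steps.append((part, a * part))
            total += a * part
        place *= 10
    steps.reverse()  # largest place value first, matching the prose
    return steps, total


steps, total = multiply_step_by_step(347, 892)
for part, product in steps:
    print(f"347 × {part} = {product:,}")
print(f"Total = {total:,}")
```

Each printed line is a claim that can be verified on its own, which is the practical benefit of showing the work.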

Tree-of-Thought (ToT) — 2023

CoT produces a single reasoning chain. ToT produces multiple reasoning chains and evaluates which one is most promising.

Example: "Design a marketing campaign for a new smartwatch"

  • Branch 1: Focus on fitness features → evaluate: "Good, but crowded market"
  • Branch 2: Focus on fashion/lifestyle → evaluate: "Differentiated, but may not resonate with tech audience"
  • Branch 3: Focus on health monitoring → evaluate: "Strong, growing demand, regulatory tailwind"
  • Select Branch 3, expand further.

ToT taught models to explore multiple options before committing.
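Structurally, this is a search over a tree of candidate thoughts where only the most promising branches are expanded. The sketch below uses a tiny hard-coded tree. The numeric scores and the two second-level branches are invented for illustration; in a real system a model would both propose the branches and rate them.

```python
# Hypothetical thought tree and scores, for illustration only.
CHILDREN = {
    "smartwatch campaign": ["fitness features", "fashion/lifestyle", "health monitoring"],
    "health monitoring": ["heart-rate alerts", "sleep tracking"],
}
SCORES = {
    "fitness features": 0.5,    # good, but crowded market
    "fashion/lifestyle": 0.6,   # differentiated, weaker tech appeal
    "health monitoring": 0.9,   # strong, growing demand
    "heart-rate alerts": 0.8,
    "sleep tracking": 0.7,
}


def tree_of_thought(root, depth, beam_width=1):
    """Level-by-level search that keeps only the best-scoring branches."""
    frontier = [[root]]
    for _ in range(depth):
        candidates = [
            path + [child]
            for path in frontier
            for child in CHILDREN.get(path[-1], [])
        ]
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda path: SCORES[path[-1]], reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]


best_path = tree_of_thought("smartwatch campaign", depth=2)
print(" -> ".join(best_path))
```

Raising `beam_width` keeps more branches alive per level, trading extra evaluation cost for a lower risk of discarding a branch that only looks weak early on.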

ReAct (Reasoning + Acting) — 2023

ReAct combines reasoning with action. The model doesn't just think — it takes actions, observes the results, and adjusts its reasoning.

Example: "Find the best price for a MacBook Pro M4"

  1. Reason: I need to check multiple retailers
  2. Act: Search JD.com → ¥14,999
  3. Observe: Price is higher than expected
  4. Reason: Let me check Tmall
  5. Act: Search Tmall → ¥14,499
  6. Observe: Better price, but is it the lowest?
  7. Reason: Check Pinduoduo
  8. Act: Search Pinduoduo → ¥13,899
  9. Observe: Lowest price found
  10. Report: Best price is ¥13,899 on Pinduoduo

ReAct taught models to interleave thinking with doing.
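The loop behind that trace is short. In this sketch the prices are copied from the example, `search` is a hypothetical stand-in for a real tool call, and `reason` is a hard-coded policy occupying the place where a model would decide the next action from what it has observed so far.

```python
# Prices copied from the example; the dict stands in for live searches.
PRICES = {"JD.com": 14999, "Tmall": 14499, "Pinduoduo": 13899}


def search(retailer):
    """Hypothetical tool call; a real agent would query the retailer here."""
    return PRICES[retailer]


def reason(observations, retailers):
    """Stub policy: pick the next action based on what has been observed."""
    unchecked = [r for r in retailers if r not in observations]
    if unchecked:
        return ("search", unchecked[0])
    best = min(observations, key=observations.get)
    return ("finish", best)


def react_best_price(retailers, max_turns=10):
    observations, trace = {}, []
    for _ in range(max_turns):
        action, arg = reason(observations, retailers)   # Reason
        if action == "finish":
            trace.append(f"Report: best price is ¥{observations[arg]:,} on {arg}")
            return arg, observations[arg], trace
        observations[arg] = search(arg)                 # Act, then record the observation
        trace.append(f"Act: search {arg} -> ¥{observations[arg]:,}")
    raise RuntimeError("no answer within max_turns")


retailer, price, trace = react_best_price(["JD.com", "Tmall", "Pinduoduo"])
print("\n".join(trace))
```

The `max_turns` cap matters in practice: an agent that alternates between thinking and acting needs a budget, or a confused policy can loop forever.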

Reflection — 2024-2025

Reflection adds the final piece: the model evaluates its own output and revises it.

Example: After writing a report, the model:

  1. Reviews the report for coherence, accuracy, and completeness
  2. Identifies gaps: "The financial analysis section is missing the Q3 comparison"
  3. Generates a revision: Adds Q3 comparison
  4. Reviews again: "Now the conclusion doesn't match the updated data"
  5. Revises the conclusion
  6. Reviews again: "Satisfactory"

Reflection taught models to self-correct.
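Reduced to its skeleton, reflection is a critique-and-revise loop with a stopping condition. The sketch below mirrors the report example with a toy data model. `critique` and `revise` are hypothetical stubs using simple rules, where a real agent would have a model review and rewrite the text.

```python
def critique(report):
    """Stub reviewer: return the first problem found, or None if satisfactory."""
    if "Q3 comparison" not in report["financials"]:
        return "financials missing Q3 comparison"
    if report["conclusion_based_on"] != report["data_version"]:
        return "conclusion does not match updated data"
    return None


def revise(report, problem):
    """Stub reviser applying one fix per reported problem."""
    report = dict(report)  # leave the caller's draft untouched
    if problem == "financials missing Q3 comparison":
        report["financials"] += " + Q3 comparison"
        report["data_version"] += 1  # new data invalidates the old conclusion
    elif problem == "conclusion does not match updated data":
        report["conclusion_based_on"] = report["data_version"]
    return report


def reflect_until_ok(report, max_rounds=5):
    history = []
    for _ in range(max_rounds):
        problem = critique(report)
        if problem is None:
            return report, history
        history.append(problem)
        report = revise(report, problem)
    return report, history  # budget exhausted; problems may remain


draft = {"financials": "Q1-Q2 summary", "data_version": 1, "conclusion_based_on": 1}
final, history = reflect_until_ok(draft)
print(history)
```

As in the prose example, fixing the first problem creates the second one, so a single review pass would not be enough. The round limit keeps the loop from running indefinitely when the critic is never satisfied.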

Putting It Together

A thinking agent combines all four:

  1. CoT → Show reasoning steps
  2. ToT → Explore multiple approaches
  3. ReAct → Take actions and observe results
  4. Reflection → Self-evaluate and improve

The result: an agent that can plan, execute, learn, and improve — all without human intervention for routine tasks.

China's Three Advantages

Why is China at the forefront of thinking agents? Three structural advantages:

Advantage 1: Application Scenario Richness

China's super-app ecosystem (WeChat, Alipay, Taobao, Douyin, Meituan) creates application scenarios that don't exist elsewhere:

  • WeChat integrates messaging, payments, mini-programs, social media, and enterprise tools into one platform. An agent that can navigate WeChat can serve virtually any daily-life need.
  • Meituan aggregates food delivery, ride-hailing, hotel booking, movie tickets, and more. An agent that can use Meituan can automate a wide range of local services.
  • Douyin provides a content consumption + e-commerce platform where an agent can both entertain and shop.

In the West, these functions are spread across dozens of apps with no unified platform. The integration density in China makes agent development more productive: one integration covers more ground.

Advantage 2: User Data Volume

China's 1.05 billion internet users generate data at a scale that enables:

  • Better model fine-tuning on Chinese-language tasks
  • More training examples for agent behavior patterns
  • Faster feedback loops from real user interactions

The data advantage is particularly strong for thinking agents, which require extensive training on multi-step reasoning patterns. More data = better reasoning = more useful agents.

Advantage 3: Iteration Speed

Chinese AI companies ship fast. The typical cycle from prototype to production in China's AI industry is 4-6 weeks, compared to 3-6 months in Silicon Valley. This isn't because Chinese engineers are faster — it's because the ecosystem (users, platforms, regulatory environment) supports rapid iteration.

Key factors:

  • Regulatory flexibility: China's AI regulations are evolving but currently more permissive for agent experimentation than the EU's AI Act.
  • User tolerance for imperfection: Chinese consumers are more willing to use beta products and provide feedback, which accelerates the iteration cycle.
  • Platform cooperation: WeChat, Alipay, and other platforms actively support AI agent integrations, providing APIs and developer support.

Representative Products: Three to Watch

Manus (Manus AI)

Manus is the most prominent Chinese thinking agent. Launched in March 2025, it's positioned as a "general-purpose AI agent" that can handle complex, multi-step tasks across domains.

Key capabilities:

  • Multi-step task planning with ReAct-style reasoning
  • Self-reflection and output revision
  • Integration with Chinese platforms (WeChat, Taobao, Baidu)
  • Both local and cloud deployment options

Manus gained attention when a user posted a video of it autonomously planning and booking a complete travel itinerary (flights, hotel, restaurant reservations, local transportation) from a single natural language request: "Plan a 5-day trip to Chengdu for two, budget ¥8,000."

AutoGLM (Zhipu AI)

As discussed in our earlier article, AutoGLM is a mobile AI agent that understands phone screens and simulates tap operations. The thinking agent upgrade adds:

  • Proactive task suggestions ("You usually order coffee at 9am. Shall I order your usual?")
  • Multi-step planning across apps ("Book the restaurant, then send the details to the group chat")
  • Error recovery with reflection ("The booking failed because the time slot is full. Let me try the next available slot.")

Kimi (Moonshot AI)

Kimi started as a long-context chatbot (200K token window) and evolved into a thinking agent focused on research and analysis tasks:

  • Deep research: read 50+ documents, synthesize insights, produce a structured report
  • Source verification: cross-reference claims across multiple sources
  • Iterative analysis: if the initial analysis is incomplete, Kimi can extend it without starting over

Kimi's strength is in knowledge work — tasks that require reading, understanding, and synthesizing large volumes of information.

The Hardware Implication

Thinking agents have specific hardware requirements that differ from both traditional software and Level 1-2 agents:

24/7 runtime. A thinking agent works best when it's always available. If you ask it to "monitor my email and flag anything urgent," it needs to be checking your email continuously — not just when you open the app.

Low power consumption. An always-on device shouldn't cost ¥100/month in electricity. The agent framework itself is lightweight; the heavy lifting (LLM inference) happens in the cloud.

Physical isolation. A thinking agent that has access to your email, calendar, bank transactions, and WeChat is extremely sensitive. You don't want it running on your primary computer (where a malware infection could compromise the agent). Physical isolation — a separate device — provides a security boundary.

KaiheAiBox A1 is designed for exactly this:

Requirement | Phone | Laptop | Cloud VPS | KaiheAiBox A1
24/7 runtime | ❌ OS kills background | ❌ Sleeps/updates | |
Low power | | ❌ ~60W | N/A | ✅ ~15W
Physical isolation | ❌ Primary device | ❌ Primary device | ❌ Data in cloud | ✅ Separate device
Data sovereignty | ⚠️ | ⚠️ | | ✅ Data stays local
API access | | | |

A thinking agent is the most sensitive AI workload you'll ever run. It has access to your email, your calendar, your financial data, and your communication channels. The hardware it runs on must be physically isolated, always-on, and locally-secure. That's not a laptop. That's not a phone. That's an Agent Computer.
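The low-power requirement is easy to put numbers on. The calculation below takes the approximate figures from the table above (~60W, read here as the laptop figure, and ~15W for KaiheAiBox A1) and assumes constant draw around the clock. The electricity price of ¥0.60/kWh is an assumption chosen purely for illustration, so substitute your own tariff.

```python
def monthly_cost_yuan(watts, price_per_kwh, hours_per_day=24, days=30):
    """Electricity cost of a device drawing a constant load."""
    kwh = watts * hours_per_day * days / 1000
    return kwh * price_per_kwh


PRICE = 0.60  # assumed tariff in ¥/kWh, not a quoted rate

for label, watts in [("Laptop", 60), ("KaiheAiBox A1", 15)]:
    print(f"{label}: ¥{monthly_cost_yuan(watts, PRICE):.2f}/month")
```

At the assumed rate, 15W running continuously comes to 10.8 kWh and roughly ¥6.48 per month, against 43.2 kWh and roughly ¥25.92 for a constant 60W load. Both are well under the ¥100/month ceiling mentioned earlier, and the four-fold difference holds at any tariff.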

What This Means for the Industry

Three predictions for the next 12-18 months:

  1. Thinking agents will become the default. Level 2 agents (instruction + adaptation) will feel primitive once users experience Level 3 agents (planning + reflection). The bar for "AI agent" will rise from "executes tasks" to "thinks about tasks."

  2. China will lead in thinking agent deployment. The combination of scenario richness, data volume, and iteration speed gives Chinese companies a structural advantage in shipping thinking agents that work in real-world conditions.

  3. Hardware will become the bottleneck. As thinking agents become more capable, the question shifts from "can the agent do this?" to "where does the agent run?" The hardware requirement — always-on, low-power, physically-isolated, locally-secure — is not met by any existing consumer device except purpose-built Agent Computers like KaiheAiBox A1.

The Bottom Line

A Chinese company quietly released a thinking AI agent. It's not a chatbot that sometimes gets things right. It's a system that plans, executes, reflects, and improves — the third level of AI agent evolution.

The technology is built on several years of reasoning research (CoT → ToT → ReAct → Reflection). China has structural advantages in deploying it (scenarios, data, speed). And the hardware to run it, an always-on, low-power, locally-secure Agent Computer, already exists.

The question isn't whether thinking agents will become mainstream. The question is how fast. If the historical parallels hold (smartphones, mobile payment, short video), the answer is: faster than you think.


© KAIHE AI - Agent Computer Specialist