Tencent Hunyuan Hy3 Preview: 295B Parameters with Fast-Slow Thinking Fusion, Trained in Three Months
📖 Glossary
AI Box (also known as Agent Computer or Agent PC) is a dedicated local hardware device that runs AI Agents. It ships with an AI agent management system pre-installed, is plug-and-play, and runs 24/7. Users can command the AI remotely via Discord, Slack, Telegram, WhatsApp, and more.
Abstract: Tencent's Hunyuan team has released the Hy3 preview model, a 295B-parameter model built on an MoE architecture. The headline innovation is fast-slow thinking fusion: one model that can both respond instantly to simple questions and reason deeply on complex tasks. The project went from initiation to launch in just three months, which Tencent credits to significantly improved training efficiency. Hy3 is another heavyweight contender in China's LLM landscape.
On June 16, Tencent's Hunyuan team released the Hy3 preview model: 295B total parameters, an MoE architecture, and approximately 32B activated parameters. This is not simple parameter stacking. Hy3's core innovation is fast-slow thinking fusion, in which one model balances speed and depth.
What Is Fast-Slow Thinking Fusion?
Most LLMs are either fast or deep, rarely both. Fast models (like GPT-5.6 Air) respond quickly but reason shallowly. Deep models (like Claude Opus 4.8, GPT-5.6 Sol) reason powerfully but respond slowly at higher cost.
Hy3's approach: two reasoning modes within a single model.
Fast Thinking Mode: For simple problems (casual conversation, information queries, translation), the model takes a lightweight reasoning path with millisecond-level response. Feels like chatting with a quick assistant.
Slow Thinking Mode: For complex problems (math derivation, code generation, multi-step logic), the model automatically switches to a deep reasoning path, spending more time "thinking" before answering. Quality approaches specialized reasoning models.
Users don't need to select a mode manually: the model judges problem difficulty and routes automatically. This is notably convenient in practice, because there is no deliberating over "should I enable reasoning mode for this question?"
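The automatic routing described above can be pictured with a toy sketch. The source does not describe how Hy3 judges difficulty, so everything here is an assumption for illustration: the cue words, weights, threshold, and function names are hypothetical, and a real router would more likely be learned inside the model than written as rules.

```python
import re
from dataclasses import dataclass

# Hypothetical cues that hint at multi-step reasoning (assumption, not Hy3's method).
SLOW_CUES = re.compile(r"\b(prove|derive|debug|implement|optimize)\b|step by step")
MATH_OR_CODE_CHARS = set("=^{}")


@dataclass
class Route:
    mode: str     # "fast" or "slow"
    score: float  # heuristic difficulty estimate in [0, 1]


def estimate_difficulty(prompt: str) -> float:
    """Crude difficulty proxy built from cue words, prompt length, and symbols."""
    text = prompt.lower()
    score = 0.0
    if SLOW_CUES.search(text):
        score += 0.4
    if len(text.split()) > 60:
        score += 0.3
    if MATH_OR_CODE_CHARS & set(prompt):
        score += 0.3
    return min(score, 1.0)


def route(prompt: str, threshold: float = 0.3) -> Route:
    """Send prompts at or above the threshold to the slow (deep) path."""
    score = estimate_difficulty(prompt)
    return Route(mode="slow" if score >= threshold else "fast", score=score)
```

With these made-up weights, `route("Translate 'good morning' into French").mode` is `"fast"`, while `route("Prove that the sum of two even integers is even").mode` is `"slow"`. The point is only the shape of the decision: one entry point, a difficulty estimate, and two paths behind it.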

Key Parameters
According to Tencent Hunyuan's published technical report:
| Specification | Value |
|---|---|
| Total Parameters | 295B |
| Architecture | MoE (Mixture of Experts) |
| Activated Parameters | ~32B |
| Context Length | 256K tokens |
| Training Duration | ~3 months |
| Training Data | 20 trillion tokens |
295B total parameters places Hy3 in the top tier of Chinese models. The advantage of the MoE architecture is that only a subset of expert networks activates for each token during inference, so the actual computation is far less than for a dense model of the same size. With about 32B activated parameters, per-token compute is comparable to a 32B dense model while capability is significantly higher, although the full 295B of weights must still be held in memory.
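The compute-versus-memory trade-off above can be made concrete with a few lines of arithmetic. This is a back-of-the-envelope sketch, not a figure from Tencent: the 2 FLOPs per active parameter per token rule of thumb and the 16-bit weight format are assumptions, and the variable names are illustrative.

```python
TOTAL_PARAMS = 295e9     # from the article
ACTIVE_PARAMS = 32e9     # approximate, from the article
BYTES_PER_PARAM = 2      # assumption: 16-bit weights (bf16 or fp16)

# Share of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Rule of thumb (assumption): about 2 FLOPs per active parameter per token.
forward_flops_per_token = 2 * ACTIVE_PARAMS

# Every expert must be resident in memory, even though only some run per token.
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"active fraction:         {active_fraction:.1%}")
print(f"forward FLOPs per token: {forward_flops_per_token:.2e}")
print(f"weight memory (16-bit):  {weight_memory_gb:.0f} GB")
```

Under these assumptions roughly 10.8% of the parameters are active per token, but the weights alone occupy about 590 GB, so the model computes like a 32B model while needing the memory of a 295B one.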
The 256K context window is double GPT-5.6's 128K, which suits ultra-long documents. The 20 trillion tokens of training data is also among the largest publicly reported figures.
Benchmark Performance
According to Tencent's published data:
Math Reasoning: 92.1% on GSM8K and 68.3% on MATH. This is close to GPT-5.5's level and ahead of open-source models in the same parameter class.
Code Generation: 87.4% on HumanEval and 81.2% on MBPP, a leading position among Chinese models.
Chinese Understanding: 83.7% on C-Eval and 82.9% on CMMLU. Chinese capability is a strength, in line with Tencent's product positioning.
Fast-Slow Switching Effect: On simple Q&A tasks, fast thinking mode responds 4x faster than slow thinking mode, with an accuracy difference within 2%. On complex reasoning tasks, slow thinking mode achieves 15-20% higher accuracy than fast mode.
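To see what the reported 4x speedup could mean for a mixed workload, here is a small worked example. It is a simplification rather than a measurement: it assumes every request would take the same time in slow mode, that all simple requests are routed to fast mode, and that routing itself is free. The function name is illustrative.

```python
def relative_latency(simple_share: float, fast_speedup: float = 4.0) -> float:
    """Mean latency relative to answering everything in slow mode.

    simple_share: fraction of requests that are simple and take the fast path.
    fast_speedup: how many times faster fast mode is (4x per the reported data).
    """
    if not 0.0 <= simple_share <= 1.0:
        raise ValueError("simple_share must be between 0 and 1")
    if fast_speedup <= 0:
        raise ValueError("fast_speedup must be positive")
    return simple_share / fast_speedup + (1.0 - simple_share)


for share in (0.25, 0.5, 0.75):
    print(f"{share:.0%} simple traffic -> {relative_latency(share):.4f}x of always-slow latency")
```

With half the traffic being simple, mean latency falls to 0.625 of the always-slow baseline; at 75% simple traffic it falls to 0.4375. The benefit therefore depends heavily on the workload mix, not only on the 4x figure.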

What Three-Month Training Means
Going from initiation to launch in three months is remarkably fast for a model with more than 100B parameters. For comparison, GPT-5.5's training cycle was approximately 6-8 months, according to public information.
Tencent attributes the training efficiency improvement to three factors: a more efficient data mixture strategy, parallel training optimizations for the MoE architecture, and improved scheduling efficiency in its proprietary training framework.
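To put the three-month figure in perspective, the published numbers allow a rough estimate of the training compute involved. This is a sketch under stated assumptions, not data from the technical report: it uses the common approximation of about 6 FLOPs per active parameter per training token, treats the 20 trillion tokens as a single pass, and reads "~3 months" as 90 days of continuous training.

```python
ACTIVE_PARAMS = 32e9       # approximate, from the article
TRAINING_TOKENS = 20e12    # from the article
TRAINING_DAYS = 90         # assumption: "~3 months" read as 90 days of training

training_seconds = TRAINING_DAYS * 24 * 3600

# Rule of thumb (assumption): ~6 FLOPs per active parameter per training token.
train_flops = 6 * ACTIVE_PARAMS * TRAINING_TOKENS

# Average throughput needed to finish within the assumed window.
tokens_per_second = TRAINING_TOKENS / training_seconds
sustained_flops_per_second = train_flops / training_seconds

print(f"total training compute: {train_flops:.2e} FLOPs")
print(f"required throughput:    {tokens_per_second:.2e} tokens/s")
print(f"sustained compute:      {sustained_flops_per_second:.2e} FLOP/s")
```

Under these assumptions the estimate comes to about 3.8e24 training FLOPs, roughly 2.6 million tokens per second, and around 0.5 exaFLOP/s sustained. Because the three months run from project initiation to launch, actual training time was probably shorter, which would push the required throughput higher.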
Short training cycles enable fast iteration. The LLM landscape can shift every two months, so shorter training cycles mean faster adaptation. Hy3 is a preview version; the formal release is expected to improve further in subsequent iterations.
Impact on Industry Landscape
China's LLM competition is settling into a three-way race: Zhipu's GLM series takes the open-source route, DeepSeek pursues cost-effectiveness, and Tencent Hunyuan pursues product integration. Hy3's release gives Tencent the foundation-model capability to compete head-to-head with the other two.
For developers, another strong option is welcome. Hy3's fast-slow thinking fusion reduces model selection friction — no need to choose between speed-type and reasoning-type models. One model adapts to all scenarios.
For Kaihe AIBOX users, Hunyuan Hy3's addition means one more callable cloud LLM. Kaihe AIBOX's local multi-Agent + cloud LLM architecture lets Agents automatically select models based on task type: everyday conversation uses fast models, deep reasoning uses slow models. Hy3's adaptive fast-slow switching simplifies this routing layer — Agents can use Hy3 as a single model for all scenarios, letting the model itself decide fast or slow.
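The simplification of the agent-side routing layer can be sketched as a before-and-after. This is purely illustrative: the task types, the model identifiers (including `hy3-preview`), and the function names are hypothetical and do not come from Kaihe AIBOX or Tencent documentation.

```python
# Hypothetical model identifiers, for illustration only.
ROUTING_TABLE = {
    "chat": "fast-model",
    "translation": "fast-model",
    "math": "reasoning-model",
    "code": "reasoning-model",
}
FALLBACK_MODEL = "reasoning-model"
UNIFIED_MODEL = "hy3-preview"  # hypothetical ID, not a confirmed API name


def pick_model_routed(task_type: str) -> str:
    """Before: the agent maps each task type to a fast or a reasoning model."""
    return ROUTING_TABLE.get(task_type, FALLBACK_MODEL)


def pick_model_unified(task_type: str) -> str:
    """After: every task goes to one model, which chooses fast or slow itself."""
    return UNIFIED_MODEL
```

In the routed version, every new task type needs a table entry and a judgment about which model fits. In the unified version, that decision moves into the model, at the cost of trusting its own difficulty estimate.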
Data Sources
Data from Tencent Hunyuan team's official technical report, CSDN technical community evaluations, and public benchmark leaderboards.