DeepSeek-V4 Open-Source Release: The Million-Token Era Is Here, and It's Affordable
📖 Glossary
AI Box (also known as Agent Computer or Agent PC) is a dedicated local hardware device that runs AI Agents. It comes pre-installed with an AI agent management system, is plug-and-play, and runs 24/7. Users can remotely command the AI to work via Discord, Slack, Telegram, WhatsApp, and more.
Abstract: DeepSeek has released V4 as fully open source. V4-Flash costs just 2 RMB per million output tokens, and all versions natively support 1M context. What does open-source + 1M + low pricing actually mean? What's the practical impact for local deployment and Kaihe AIBOX users? Let's break it down.
DeepSeek just dropped another bombshell on the industry.
The V4 preview was released and open-sourced at the same time: 1M context, MIT license, and V4-Flash at 2 RMB per million output tokens. It's the same playbook as V3: slash prices, open-source the model, then close with long context.
What Changed in V4
Compared to V3:
Native 1M context support. V3 had 128K. V4 jumps to 1M. Not a plugin extension — native support, meaning the model was trained with long context in mind from the start, not stretched after the fact.
Architecture improvements. DeepSeek hasn't shared many architecture details, but benchmarks show better inference speed and more consistent quality across long texts. V3 started degrading past 64K; V4 degrades less across the full 1M range.
Multiple versions launched together. V4 is the flagship. V4-Flash is the lightweight version. Flash targets speed and price. Flagship targets capability ceiling.
What Does 2 RMB/Million Tokens Actually Mean
Price comparison:
| Model | Input Price (RMB per million tokens) | Output Price (RMB per million tokens) |
|---|---|---|
| DeepSeek-V4-Flash | 0.1 | 2 |
| DeepSeek-V4 | 1 | 16 |
| GPT-5.5 | 15 | 60 |
| Claude Opus 4.6 | 15 | 75 |
| Gemini 3.1 Pro | 7.5 | 30 |

V4-Flash output pricing is one-thirtieth of GPT-5.5's.
What does this mean? A long document analysis task that costs 10 RMB on GPT-5.5 costs roughly 0.3 RMB or less on V4-Flash, since input tokens are discounted even more steeply (0.1 vs. 15 RMB per million).
For heavy API users, monthly bills drop by an order of magnitude or more. For occasional personal users, it's basically free.
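To make the arithmetic concrete, here is a minimal sketch that prices a single request using the figures from the table above. The function name `task_cost` and the example workload (a 500K-token document in, a 5K-token summary out) are illustrative assumptions, not anything published by DeepSeek.

```python
# Prices in RMB per million tokens, copied from the table above.
PRICES = {
    "DeepSeek-V4-Flash": {"input": 0.1, "output": 2.0},
    "DeepSeek-V4": {"input": 1.0, "output": 16.0},
    "GPT-5.5": {"input": 15.0, "output": 60.0},
    "Claude Opus 4.6": {"input": 15.0, "output": 75.0},
    "Gemini 3.1 Pro": {"input": 7.5, "output": 30.0},
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in RMB of one request at the listed prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 500K tokens of input, 5K tokens of output.
    for model in PRICES:
        print(f"{model}: {task_cost(model, 500_000, 5_000):.3f} RMB")
```

For input-heavy work like this, the gap is wider than the 30x output ratio suggests, because input tokens dominate the bill and the input price gap is larger.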
What Open-Source Means
MIT license — same as GLM-5.2 and DeepSeek-V3. Commercial use is free. No requirement to open-source modifications. Almost zero restrictions.
Practical impact:
Local deployment becomes viable. Run a quantized V4 on your own server or Kaihe AIBOX A1. Data stays on-device, so prompts and documents never pass through a third-party API.
Enterprise customization. With access to the model weights, you can fine-tune on industry data. Legal, medical, finance: fill in the domain-specific terminology and logic that general models miss.
Ecosystem acceleration. Open-source means the community can build tools, plugins, and Agent framework integrations on top of V4. DeepSeek provides the foundation; everyone builds the upper layers together.
What 1M Context Changes
1M tokens is roughly 730,000 Chinese characters. In practice:
Legal: Full contract + relevant regulations + case precedents — dump it all in for cross-analysis. Previously required batching; now it's one shot.
Software development: Drop an entire mid-size codebase in and ask for cross-file refactoring suggestions. No need to manually select files.
Enterprise documents: A year's worth of meeting minutes, project reports, and emails — have AI do annual summaries and trend analysis.
Academic research: Feed dozens of papers simultaneously and have AI produce literature reviews with cross-references.
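Before dumping everything in, it helps to check whether the material actually fits. The sketch below uses the ratio implied above (1M tokens for roughly 730,000 Chinese characters) as a rough estimate. It is not a real tokenizer, and the helper names and the 8,000-token output reserve are assumptions made for illustration.

```python
CONTEXT_TOKENS = 1_000_000
# Ratio implied by the article: 1M tokens is roughly 730,000 Chinese characters.
CHARS_PER_TOKEN = 0.73


def estimate_tokens(text: str) -> int:
    """Rough token estimate for Chinese text; not a real tokenizer."""
    return round(len(text) / CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserve_for_output: int = 8_000) -> bool:
    """Check whether all documents plus an output reserve fit in the 1M window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_output <= CONTEXT_TOKENS
```

English text and source code tokenize differently, so treat the result as a ballpark figure and leave some headroom.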

But 1M context raises a known question: does the model "forget" what it read? The standard check is the "needle in a haystack" test: can the model find one specific fact (the needle) hidden in a massive block of text (the haystack)?
DeepSeek's benchmarks show V4 achieves over 95% information retrieval accuracy across the full 1M range. Real-world results await independent testing after launch.
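If you want to run your own check rather than wait, a needle-in-a-haystack test is simple to sketch. The code below is a minimal illustration, not DeepSeek's benchmark: the filler sentence, the passphrase wording, and the `ask_model` callback (which you would wire to a real model) are all assumptions. A stand-in "model" that only reads the first 2,000 characters is included to show how misses show up.

```python
from typing import Callable


def build_haystack(filler: str, needle: str, n_sentences: int, depth: float) -> str:
    """Place `needle` at a relative depth (0.0 = start, 1.0 = end) among filler sentences."""
    sentences = [filler] * n_sentences
    sentences.insert(int(depth * n_sentences), needle)
    return " ".join(sentences)


def retrieval_accuracy(ask_model: Callable[[str], str], secret: str,
                       n_sentences: int, depths: list[float]) -> float:
    """Fraction of needle positions at which the model's answer contains the secret."""
    needle = f"The secret passphrase is {secret}."
    question = "\n\nWhat is the secret passphrase?"
    hits = 0
    for depth in depths:
        haystack = build_haystack("The sky was grey that day.", needle, n_sentences, depth)
        if secret in ask_model(haystack + question):
            hits += 1
    return hits / len(depths)


def truncating_model(prompt: str) -> str:
    """Stand-in for a real model: it only 'sees' the first 2,000 characters."""
    visible = prompt[:2000]
    marker = "The secret passphrase is "
    if marker in visible:
        start = visible.index(marker) + len(marker)
        return visible[start:start + 12]
    return "I don't know."
```

Sweeping both the haystack length and the needle depth gives the usual grid of retrieval accuracy by position and context size.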
Running DeepSeek-V4 on Kaihe AIBOX
Two options:
Cloud API. Call V4 through DeepSeek's API and pay per token. With V4-Flash at 2 RMB per million output tokens, daily use costs almost nothing. Configure the API key in OpenClaw's dashboard on Kaihe AIBOX, and Agents can call it automatically.
Local deployment. Because V4 is open-source, quantized versions can run on Kaihe AIBOX A1. The A1 hardware supports INT4/INT8 quantized inference. Local means data never leaves the device, no API fees, available 24/7. The trade-off: quantization reduces precision, and speed depends on hardware.
You can also combine the two in an edge-cloud setup: daily lightweight tasks run on the local quantized V4, while heavy reasoning and 1M-context tasks go through the cloud API. You choose.
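The edge-cloud split can be expressed as a small routing rule. This is a sketch under stated assumptions, not how OpenClaw or Kaihe AIBOX actually routes requests: the `Task` fields, the 32,000-token local budget, and the rule that sensitive data must stay local are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: the context budget we assume the local quantized model can handle.
LOCAL_CONTEXT_LIMIT = 32_000


@dataclass
class Task:
    prompt_tokens: int
    needs_deep_reasoning: bool = False
    contains_sensitive_data: bool = False


def choose_backend(task: Task) -> str:
    """Return "local" or "cloud" for a task, keeping sensitive data on-device."""
    fits_locally = task.prompt_tokens <= LOCAL_CONTEXT_LIMIT
    if task.contains_sensitive_data:
        # Assumed policy: sensitive prompts never leave the device.
        if not fits_locally:
            raise ValueError("sensitive task exceeds the local context budget")
        return "local"
    if not fits_locally or task.needs_deep_reasoning:
        return "cloud"
    return "local"
```

Failing loudly on an oversized sensitive task is a deliberate choice here: silently sending it to the cloud would defeat the point of running locally.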
AI Box (also known as Agent Computer or Agent PC) is a dedicated local hardware device that runs AI Agents. It comes pre-installed with an AI agent management system, is plug-and-play, and runs 24/7. Users can remotely command the AI via WeChat, Feishu, Discord, Slack, and more. Kaihe AIBOX ships with OpenClaw and Hermes Agent, supporting DeepSeek and other mainstream models.
Why "Affordable Era" Isn't Hype
Long context used to be a flagship-only feature: GPT-5.5 had 256K, Claude had 200K, and both were expensive. DeepSeek-V4 stacks 1M + open-source + low pricing together, pulling long-context capability from "premium add-on" to "standard feature."
For users: no extra payment for long context, no lock-in to closed-source providers, no commercial compliance headaches for local deployment.
Affordable doesn't mean "cheap but weak." V4-Flash already matches several flagship models on coding and reasoning. Capability at this level for 2 RMB per million output tokens was hard to find even six months ago.
Want to Go Deeper?
Getting Started
- Kaihe AIBOX Official Website (agentaibox.com): see what an Agent Computer with pre-installed Agents looks like
- "GLM-5.2 Fully Open-Sourced: 1M Context + MIT License, China's Best-Value LLM": another 1M + MIT Chinese open-source model
Going Further
- "From AI Hype to Real Impact: 5 Signals That 2026 Is the Year of Value Validation": where the Agent industry stands right now
Kaihe AIBOX | The Agent Computer That Works 7×24 for You · AI Frontier