Is OpenClaw Safe? Engineers Sound the Alarm — And Here's Your Complete Security Guide
Abstract: China's NVDB security alert, a critical CVE scoring 8.8, and malicious plugins stealing credentials — the default OpenClaw setup is far riskier than you think. This guide walks you through every vulnerability and how to fix it.
1. The Red Flag from China's Cyber Authority: OpenClaw Isn't Plug-and-Play
In May 2026, China's National Information Security Vulnerability Database (NVDB), operated under the Ministry of Industry and Information Technology, issued a security alert that sent shockwaves through the AI agent community: certain OpenClaw instances running default configurations pose high-severity risks, and all users are urged to audit their deployments immediately.
This wasn't an overreaction. Security researchers scanning the public internet found over 270,000 OpenClaw instances exposed due to misconfiguration — the vast majority running without any authentication mechanism. That's the equivalent of leaving a computer stuffed with private data and API keys wide open on the internet with the front door not just unlocked but removed entirely.
Where does the problem originate? OpenClaw's design philosophy prioritizes "out-of-the-box usability." Default settings favor convenience over security. For a developer experimenting in a local sandbox, this is perfectly fine. But the moment OpenClaw is deployed on a server, a cloud VM, or even a smart home device, those defaults become the single most dangerous attack surface.
The NVDB alert specifically highlighted three categories of risk: unauthenticated Gateway exposure, insufficient permission controls for third-party Skills, and the absence of sandboxing for agent execution environments. Each of these, on its own, represents a significant vulnerability. Combined, as they often are in default configurations, they compound one another: an exposed Gateway provides the way in, over-privileged Skills provide the capabilities, and the lack of a sandbox lets the damage spread to the host.
The scale of the problem is staggering. A follow-up scan by an independent security research group found that of the 270,000+ exposed instances, approximately 62% were running completely unpatched versions, 78% had no authentication configured, and 91% were using the default Gateway port number. These numbers paint a picture not of isolated misconfigurations but of a systemic problem rooted in the software's default behavior.
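To put those percentages in absolute terms, here is a quick back-of-the-envelope calculation. It assumes a base of exactly 270,000 instances (the scan reported "270,000+", so the results are lower bounds) and that each approximate percentage applies to the full set.

```python
# Rough lower-bound counts, assuming a base of exactly 270,000 exposed instances.
EXPOSED_INSTANCES = 270_000

# Percentages quoted from the follow-up scan described above.
SHARES_PERCENT = {
    "unpatched": 62,
    "no_authentication": 78,
    "default_port": 91,
}


def affected_counts(total: int, shares_percent: dict[str, int]) -> dict[str, int]:
    """Convert percentage shares into absolute instance counts (integer math)."""
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


counts = affected_counts(EXPOSED_INSTANCES, SHARES_PERCENT)
for name, count in counts.items():
    print(f"{name}: roughly {count:,} instances")
```

Under these assumptions, that is roughly 167,400 unpatched instances, 210,600 without authentication, and 245,700 on the default port.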
To understand why this matters so much, consider what an OpenClaw instance typically contains. Unlike a traditional web server that might host static content or a simple application, an OpenClaw instance holds an intelligent agent that has been configured with the user's personal credentials, API keys, browser sessions, and workflow automations. It's not just data that's at risk — it's the user's entire digital operational capability. A compromised OpenClaw instance gives an attacker the ability to act as the user across multiple platforms and services simultaneously.
The alert from NVDB was particularly significant because China represents one of the largest and fastest-growing markets for AI agent technology. With millions of small and medium businesses adopting AI tools for content creation, customer service, and operational automation, the potential impact of a widespread vulnerability is enormous. The Chinese government's decision to publicly warn about OpenClaw security — rather than quietly coordinating patches — underscored the severity and the urgency of the situation.
2. CVE-2026-25253: Inside the ClawJacked Remote Takeover Vulnerability
The most alarming disclosure came from Australian security firm Dvuln, which identified CVE-2026-25253 — a vulnerability rated CVSS 8.8 (High Severity). They named it "ClawJacked," a play on the idea of an OpenClaw instance being hijacked.
How it works: ClawJacked exploits a design flaw in which OpenClaw Gateway's default listening port does not enforce authentication. The Gateway serves as the central access point for all agent communications, handling session management, tool invocation routing, and credential storage. An attacker who knows the target IP and port can communicate directly with the Gateway via standard HTTP requests and perform the following actions without any authorization:
- Read the agent's entire conversation history, including private messages containing sensitive personal information, financial data, and confidential business discussions
- Extract API keys and access credentials stored in configuration files — these may include credentials for cloud services (AWS, Azure, GCP), database connections, payment gateway tokens, and third-party service integrations
- Execute arbitrary operations under the agent's identity, including sending emails, manipulating the file system, calling external APIs, and interacting with other services the agent has access to
- Implant malicious Skills for persistent control, ensuring that even after the initial vulnerability is patched, the attacker retains a backdoor into the system
Dvuln's technical write-up revealed that the exploitation is trivially simple. A basic curl command, with no special tools or expertise required, can retrieve full conversation logs from an unauthenticated Gateway. More sophisticated attacks involve crafting tool invocation requests that instruct the agent to exfiltrate data or execute harmful commands — and because the agent has legitimate access to these tools, the attacks blend in with normal operation and are difficult to distinguish from authorized activity.
During verification, Dvuln researchers discovered that a staggering number of exposed instances hadn't even changed the default port number — making them trivially discoverable through search engines like Shodan and Censys, as well as automated port scanners. They demonstrated a proof-of-concept scan that identified over 10,000 vulnerable instances in under five minutes using nothing more than a standard laptop and publicly available scanning tools.
Impact scope: All OpenClaw instances without authentication enabled, particularly Gateway services directly exposed to the public internet. As of the disclosure date, affected instances exceeded 270,000. The vulnerability affects all OpenClaw versions prior to v4.5, which introduced mandatory authentication prompts during initial setup.
What makes this particularly dangerous is the nature of data that OpenClaw agents typically handle. Unlike a web server that might expose static content, an OpenClaw instance contains an agent that has been configured with the user's personal credentials, API keys, and workflow automations. Compromising an OpenClaw instance is not like defacing a website — it's more like giving someone full access to your digital life. They can read your emails, access your cloud storage, make purchases on your behalf, and communicate as you across social media platforms. The damage potential is not limited to a single system; it extends to every system and service that the agent has been granted access to.
The ClawJacked vulnerability also highlighted a broader issue in the AI agent ecosystem: the attack surface of an autonomous agent is fundamentally different from that of a traditional application. A web application has a well-defined perimeter — it accepts specific types of requests and returns specific types of responses. An AI agent, by contrast, is designed to be flexible and adaptable, capable of performing a wide variety of tasks across multiple systems. This flexibility, while powerful, means that a compromised agent can cause damage across a much broader range of systems and services than a compromised traditional application.

3. Beyond the Vulnerability: Two Emerging Risk Categories
ClawJacked is just the tip of the iceberg. Security teams from Tencent PC Manager and password management leader 1Password have independently flagged deeper structural risks within the OpenClaw ecosystem — risks that no single patch can address and that require systemic changes to the platform's architecture and governance.
Risk 1: Permission Sprawl and the "Over-Privileged Skill" Problem
Tencent PC Manager's security report highlights a structural "broad grant, narrow audit" problem in OpenClaw's permission model. When users install Skills (plugin extensions that add capabilities to agents), they typically grant sweeping permissions — file system read/write, full network access, system command execution — with little to no granularity or review.
The root cause is understandable: OpenClaw's Skill installation flow was designed for simplicity. A one-click install experience means users rarely pause to examine what permissions they're granting. The permission prompt, when it exists at all, presents a monolithic list that users instinctively approve without reading — much like the way most people click "Accept All" on cookie consent dialogs without examining what they're consenting to.
This means a malicious or vulnerable Skill can, without the user's knowledge, perform any of the following actions:
- Access the entire file system, including documents, photos, and configuration files
- Exfiltrate browser cookies and saved passwords from Chrome, Firefox, Edge, and other browsers
- Install backdoor programs or persistent daemons that survive agent restarts
- Use the agent's network access to communicate with external command-and-control servers
- Modify other installed Skills to propagate malicious code laterally across the agent's tool ecosystem
Compounding the issue, OpenClaw's current permission auditing mechanisms remain immature. There is no built-in way to generate a comprehensive report of what each Skill has accessed or modified. Users who suspect a Skill may be misbehaving have limited forensic tools at their disposal — they can check recent file modifications and network connections, but they cannot easily attribute specific actions to specific Skills.
Tencent's report specifically flagged the "dependency chain" problem: a seemingly innocuous Skill may depend on a shared library that has its own vulnerabilities, creating a transitive risk that the user cannot easily identify. This mirrors the well-documented supply chain risks in npm and PyPI ecosystems, but with higher stakes because OpenClaw Skills operate with more system-level access than typical JavaScript or Python packages. A compromised npm package might steal environment variables; a compromised OpenClaw Skill can read your email, access your bank accounts, and communicate with your customers.
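The transitive nature of that risk can be illustrated with a small graph walk. This is only a sketch: the Skill and library names, the dependency map, and the list of vulnerable libraries are all hypothetical, and OpenClaw does not necessarily expose dependency data in this form.

```python
from collections import deque

# Hypothetical dependency graph: Skill or library name -> direct dependencies.
DEPENDENCIES = {
    "quick-translator": ["http-helper", "text-utils"],
    "http-helper": ["legacy-parser"],
    "text-utils": [],
    "legacy-parser": [],
}

# Hypothetical set of libraries with known vulnerabilities.
KNOWN_VULNERABLE = {"legacy-parser"}


def transitive_dependencies(root: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every dependency reachable from root (breadth-first, cycle-safe)."""
    seen: set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(graph.get(name, []))
    return seen


def vulnerable_dependencies(root: str, graph: dict[str, list[str]],
                            vulnerable: set[str]) -> set[str]:
    """Vulnerable libraries that root pulls in, directly or indirectly."""
    return transitive_dependencies(root, graph) & vulnerable


print(vulnerable_dependencies("quick-translator", DEPENDENCIES, KNOWN_VULNERABLE))
```

In this example the translator Skill never references the vulnerable parser directly, yet it still inherits the risk through its HTTP helper.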
The permission model also suffers from a temporal dimension: Skills that were legitimate when installed can be updated to include malicious code in subsequent versions. The current update mechanism does not re-prompt users for permission review when a Skill is updated, meaning that a Skill that was originally safe can become dangerous without the user's awareness.
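One way to close this gap yourself is to compare the permission set of an update against what was originally approved and to hold the update for review when anything new appears. The sketch below assumes permissions can be read as plain string labels; the labels shown are hypothetical and are not OpenClaw's actual permission names.

```python
def added_permissions(installed: set[str], updated: set[str]) -> set[str]:
    """Permissions requested by the update that the installed version lacked."""
    return updated - installed


def needs_review(installed: set[str], updated: set[str]) -> bool:
    """True when an update asks for anything beyond what was already granted."""
    return bool(added_permissions(installed, updated))


# Hypothetical permission labels for two versions of the same Skill.
installed_v1 = {"fs.read"}
updated_v2 = {"fs.read", "network.outbound", "shell.exec"}

if needs_review(installed_v1, updated_v2):
    print("Hold update, new permissions:", sorted(added_permissions(installed_v1, updated_v2)))
```

An update that only drops permissions passes silently; one that adds any permission is held for a human decision.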
Risk 2: Plugin Supply Chain Security and the Malicious Skills Epidemic
1Password's security team issued a more concrete and alarming warning: they identified multiple malicious Skills disguised as popular utility tools circulating in the OpenClaw Skills marketplace. These Skills, bearing innocuous names like "File Format Converter Pro," "Quick Translator Plus," and "Smart Document Organizer," were actively siphoning users' password databases, browser credentials, cryptocurrency wallet private keys, and SSH keys in the background.
The attack pattern follows a well-established playbook from other plugin ecosystems, but adapted to take advantage of the unique characteristics of the OpenClaw platform:
- Trojan packaging: The malicious Skill includes legitimate functionality (the file converter actually converts files) alongside hidden data exfiltration code. This dual-purpose approach ensures positive user reviews and avoids early detection. The legitimate functionality also means that security researchers testing the Skill in isolation may not observe any suspicious behavior unless they specifically look for data exfiltration patterns.
- Delayed execution: The malicious payload activates only after a random delay (typically 48–72 hours after installation), making it harder to correlate the installation with the observed data theft. By the time a user notices unauthorized access to their accounts, the connection to the recently installed Skill has faded from memory.
- Data staging: Stolen credentials are collected locally in an encrypted temporary file before being exfiltrated in a single burst, reducing the number of outbound network requests that might trigger monitoring alerts. This approach minimizes the network footprint of the attack, making it harder to detect through traffic analysis.
- Multi-hop exfiltration: Stolen data is routed through multiple proxy servers before reaching the attacker's collection point, obscuring the ultimate destination and making it extremely difficult to trace the attack back to its source.
- Selective targeting: Some of the more sophisticated malicious Skills only activate their data collection when they detect specific conditions, for example when the user is accessing financial websites, when cryptocurrency wallet software is running, or when specific file types (like .env files containing API keys) are present on the system. This targeted approach makes the attacks even harder to detect through general monitoring, because the malicious behavior only manifests under specific circumstances that may not be present during security testing.
1Password's investigation revealed that at least seven malicious Skills had collectively been installed by over 15,000 users before being detected and removed. The stolen data included approximately 23,000 unique credentials spanning banking, email, social media, and cloud storage services. The total estimated financial impact of the stolen credentials was in the millions of dollars, though exact figures are difficult to determine because many victims were unaware that their credentials had been compromised.
These attacks exploit user trust in the plugin ecosystem. Compared to mature package registries like npm, PyPI, and Maven Central, the OpenClaw Skills marketplace is still in its early stages — with auditing mechanisms and security scanning that haven't caught up with the ecosystem's rapid growth. Code review is largely manual, dependency scanning is optional, and there is no formal security certification process for published Skills. The marketplace's review process currently takes an average of 3–5 days for new Skills, but this timeline is primarily focused on functionality verification rather than security analysis.
A malicious Skill can exist for weeks or even months before detection, creating a prolonged exploitation window that puts every user who installs it during that period at risk. The rapid growth of the Skills marketplace — from 47 Skills in March to over 600 by May — has stretched the review team's capacity, and security researchers believe that more malicious Skills may still be lurking undetected in the marketplace.
4. The Complete Security Playbook: Six Steps to Lock Down OpenClaw
Once the risks are clearly understood, the countermeasures follow naturally. The six recommendations below are ordered by priority — the earlier steps are the most critical and should be implemented immediately. Each step includes specific implementation guidance.
Step 1: Enable Gateway Authentication (Non-Negotiable)
This is the single most important security measure you can take. Force authentication in the OpenClaw configuration file with the following actions:
- Set a strong password or API token-based authentication. Use a randomly generated token of at least 32 characters. Never use default credentials, dictionary words, or predictable patterns. Store the token in a secure credential manager, not in plain text configuration files.
- Enable TLS-encrypted communication. All traffic between clients and the Gateway must be encrypted. Self-signed certificates are acceptable for internal use; for public-facing deployments, use certificates from a recognized Certificate Authority. The overhead of TLS is negligible and the security benefit is enormous.
- Never expose an unauthenticated Gateway to the public internet. If you must make the Gateway accessible remotely, use a VPN or SSH tunnel rather than direct port exposure. Cloud providers offer managed VPN solutions that can be configured in minutes.
- If you only use OpenClaw locally, bind the Gateway to 127.0.0.1 instead of 0.0.0.0. This eliminates external access at the network level. No firewall rules needed; the Gateway simply won't accept connections from any address other than localhost.
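Two of the points above are easy to check mechanically: token strength and the bind address. The sketch below uses only the Python standard library. How the token and bind address are actually written into OpenClaw's configuration is not covered here, and the function names are illustrative.

```python
import ipaddress
import secrets

MIN_TOKEN_LENGTH = 32  # minimum length recommended above


def generate_gateway_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token; 32 random bytes encode to 43 characters."""
    return secrets.token_urlsafe(num_bytes)


def is_loopback_bind(host: str) -> bool:
    """True only if host is a literal loopback IP address.

    Hostnames are rejected rather than resolved, to keep the check conservative.
    """
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


token = generate_gateway_token()
assert len(token) >= MIN_TOKEN_LENGTH
print("127.0.0.1 is local-only:", is_loopback_bind("127.0.0.1"))
print("0.0.0.0 is local-only:", is_loopback_bind("0.0.0.0"))
```

The `secrets` module draws from the operating system's cryptographic random source, which is why it is preferred over `random` for credentials.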
After enabling authentication, verify it works by attempting to connect without credentials — the connection should be immediately refused. Also test with incorrect credentials to ensure the Gateway doesn't fall back to unauthenticated mode. A common misconfiguration is enabling authentication in the config file but failing to restart the Gateway service, leaving the instance vulnerable despite the configuration change. Always verify that the new settings are actually in effect.
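The verification step can be reduced to interpreting what the Gateway returns to a request sent without credentials. The sketch below assumes the Gateway answers over HTTP with standard status codes (401 or 403 when it rejects a request), which this article does not confirm; a refused or dropped connection would never reach this function. Sending the probe itself is left out, because the endpoint path depends on your deployment.

```python
from enum import Enum


class AuthCheck(Enum):
    ENFORCED = "enforced"
    OPEN = "open"
    INCONCLUSIVE = "inconclusive"


def classify_unauthenticated_probe(status_code: int) -> AuthCheck:
    """Interpret the HTTP status returned to a request sent WITHOUT credentials."""
    if status_code in (401, 403):
        # The server rejected the anonymous request: authentication is in effect.
        return AuthCheck.ENFORCED
    if 200 <= status_code < 300:
        # The anonymous request succeeded: the Gateway is still open.
        return AuthCheck.OPEN
    # Redirects, 404s, and server errors do not prove anything either way.
    return AuthCheck.INCONCLUSIVE


for code in (200, 401, 404):
    print(code, classify_unauthenticated_probe(code).value)
```

Run the same classification after every configuration change and service restart, and treat anything other than "enforced" as a failed check.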
Step 2: Change the Default Port
The default port is the primary target for automated scanners. Scripts that mass-scan the internet for vulnerable OpenClaw instances start with the default port, and the follow-up scan cited earlier found 91% of the 270,000+ exposed instances on it. Switching to a non-standard port, combined with firewall rules that restrict access to known IP ranges, takes an instance out of the path of most automated scan-based attacks.
Choose a port number in the range of 49152–65535 (the "dynamic/private" port range) to minimize conflicts with well-known services. Document the chosen port in your internal configuration management system so that team members can connect without confusion. Remember that changing the port is not a substitute for authentication — it's an additional layer of defense that slows down unsophisticated attackers and reduces noise in your logs. Security professionals refer to this as "security through obscurity," and while it shouldn't be your only defense, it's a valuable addition to a layered security strategy.
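Picking and validating a port in that range takes only a few lines. This is a minimal sketch; whether the chosen port is actually free on your host still has to be checked separately.

```python
import secrets

# IANA dynamic/private port range, as recommended above.
DYNAMIC_PORT_MIN = 49152
DYNAMIC_PORT_MAX = 65535


def is_dynamic_port(port: int) -> bool:
    """True if port lies inside the dynamic/private range (inclusive)."""
    return DYNAMIC_PORT_MIN <= port <= DYNAMIC_PORT_MAX


def pick_dynamic_port() -> int:
    """Choose a port uniformly at random from the dynamic/private range."""
    span = DYNAMIC_PORT_MAX - DYNAMIC_PORT_MIN + 1
    return DYNAMIC_PORT_MIN + secrets.randbelow(span)


port = pick_dynamic_port()
print("Candidate Gateway port:", port)
```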
Step 3: Run in a Sandbox
Execute OpenClaw within a sandboxed environment that restricts file system and network access. This ensures that even if an attacker compromises the agent, the blast radius is limited to the sandbox.
Docker containers are currently the simplest and most widely adopted sandboxing approach. Create an isolated Docker container for each agent with the following constraints:
- Read-only root filesystem where possible, with explicit volume mounts only for directories the agent needs to write to
- Network restrictions using Docker networking (for example, a user-defined or internal network, or no network at all) together with host firewall rules, to limit outbound connections to required services only
- Resource limits (CPU, memory) to prevent denial-of-service scenarios where a compromised agent consumes all available resources
- No privileged mode: never run the container with the --privileged flag, as this effectively disables all container isolation
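The constraints above map onto standard `docker run` flags. The sketch below only assembles the argument list and does not start a container. The image name and host path are hypothetical, and `--cap-drop ALL` and `--security-opt no-new-privileges` are extra hardening flags that go beyond the list above.

```python
def build_docker_run_args(image: str, writable_dir: str, network: str = "none",
                          memory: str = "512m", cpus: str = "1.0",
                          extra_args: tuple[str, ...] = ()) -> list[str]:
    """Assemble a hardened `docker run` command line as an argument list."""
    extra = list(extra_args)
    if any(arg.startswith("--privileged") for arg in extra):
        raise ValueError("refusing to build a command that uses --privileged")
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--volume", f"{writable_dir}:/data",    # the only writable mount
        "--network", network,                   # "none" disables networking entirely
        "--memory", memory,                     # memory limit
        "--cpus", cpus,                         # CPU limit
        "--cap-drop", "ALL",                    # extra: drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # extra: block privilege escalation
        *extra,
        image,
    ]


# Hypothetical image name and host directory, for illustration only.
args = build_docker_run_args("example/openclaw-agent:latest", "/srv/agent-data")
print(" ".join(args))
```

The default of `network="none"` is deliberately the strictest choice; pass the name of a restricted user-defined network when the agent genuinely needs outbound access.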
For higher-security requirements, consider using gVisor or Kata Containers for stronger isolation guarantees than standard Docker provides. These technologies add an additional layer of kernel-level isolation that makes container escape attacks significantly more difficult. Organizations handling sensitive data — financial records, healthcare information, personal data subject to privacy regulations — should strongly consider these hardened isolation options.
Step 4: Apply the Principle of Least Privilege
Grant each Skill only the minimum permissions required to complete its task. This requires a shift in mindset from "install and forget" to "review and authorize."
- Before installing any Skill, review its requested permissions. If a "translation tool" asks for file system write access or network permissions beyond what translation requires, that's a red flag.
- Regularly audit installed Skills' permission lists. Remove plugins you no longer use — unused Skills represent unnecessary attack surface. Every installed Skill is a potential entry point for an attacker, even if you're not actively using it.
- For Skills involving sensitive operations (file writes, network requests, system commands), review the source code before installation. Open-source Skills should have a public repository you can inspect. Closed-source Skills should be treated with extra caution.
- Create permission profiles for different categories of tasks. A "content creation" profile might allow file reads but restrict network access, while a "web automation" profile might allow browser control but restrict file system access. This compartmentalization limits the damage if any single profile is compromised.
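The profile idea from the last point can be sketched as a simple allow-list check. The profile names follow the examples above, but the permission labels are hypothetical, and this is not OpenClaw's built-in permission mechanism.

```python
# Hypothetical permission profiles: profile name -> permissions it allows.
PROFILES = {
    "content_creation": frozenset({"fs.read"}),
    "web_automation": frozenset({"browser.control"}),
}


def excess_permissions(requested: set[str], profile: str) -> set[str]:
    """Permissions a Skill requests beyond what the named profile allows."""
    return set(requested) - PROFILES[profile]


def is_allowed(requested: set[str], profile: str) -> bool:
    """True when every requested permission is covered by the profile."""
    return not excess_permissions(requested, profile)


# A "translation tool" that also asks for write access is flagged.
translator_request = {"fs.read", "fs.write"}
print(sorted(excess_permissions(translator_request, "content_creation")))
```

Anything returned by `excess_permissions` is exactly the red flag described in the first point: a permission the task category does not justify.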
Step 5: Keep Everything Updated
The OpenClaw team conducted a major security hardening initiative in the April 2026 v4.5 release, patching multiple known vulnerabilities including ClawJacked. Ensuring your instance runs the latest stable version is the lowest-cost security measure available.
Establish a regular update cadence — at minimum, check for updates weekly. Subscribe to the OpenClaw security mailing list and GitHub security advisories to receive timely notifications about critical patches. For business-critical deployments, consider implementing automated update pipelines that test new versions in a staging environment before deploying to production.
When updating, always review the release notes for security-relevant changes. Some updates may change default behaviors or require manual configuration adjustments to fully benefit from security improvements. The v4.5 release, for example, introduced mandatory authentication prompts during setup — but users who had already set up their instances without authentication needed to manually enable it, as the update didn't retroactively enforce authentication on existing configurations.
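Because upgrading alone does not switch authentication on, an audit needs to check both the version and the setting. The sketch below assumes version strings look like "v4.5" or "4.4.2" and that you already know whether authentication is enabled; how to read either value from a running instance is not shown.

```python
import re

FIRST_HARDENED_VERSION = (4, 5)  # v4.5, the security hardening release


def parse_version(text: str) -> tuple[int, ...]:
    """Parse strings like 'v4.5' or '4.4.2' into a tuple of integers."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def audit_findings(version: str, auth_enabled: bool) -> list[str]:
    """List the follow-up actions an instance still needs."""
    findings = []
    if parse_version(version) < FIRST_HARDENED_VERSION:
        findings.append("upgrade to v4.5 or later")
    if not auth_enabled:
        # Upgrading does not retroactively enable authentication.
        findings.append("enable Gateway authentication manually")
    return findings


print(audit_findings("v4.4.2", auth_enabled=False))
print(audit_findings("v4.5", auth_enabled=False))
```

Comparing parsed tuples instead of raw strings avoids the classic mistake of ranking "4.10" below "4.5".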
Step 6: Monitor and Audit Continuously
Enable operation logging at the Gateway level and configure alerts for suspicious activity patterns:
- Unusual tool invocation patterns — a Skill calling tools it has never called before, or a sudden spike in network requests from an agent that normally makes few external calls
- Off-hours activity — agent actions occurring at times when no user is expected to be active, especially if they involve sensitive operations
- Bulk data access — reading large numbers of files or making extensive API calls in a short period, which could indicate data exfiltration
- Failed authentication attempts — repeated login failures may indicate brute-force attacks or credential stuffing attempts
- Unexpected outbound connections — connections to IP addresses or domains that don't correspond to known services, which could indicate command-and-control communication
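Two of these patterns, off-hours activity and repeated authentication failures, can be detected with very little code. The sketch below assumes each log entry has already been parsed into a dictionary with `time`, `type`, and `source` keys. That schema, the event type names, and the thresholds are hypothetical, not OpenClaw's actual log format.

```python
from collections import Counter
from datetime import datetime


def off_hours_events(events: list[dict], start_hour: int = 8,
                     end_hour: int = 20) -> list[dict]:
    """Events whose timestamp falls outside the [start_hour, end_hour) window."""
    return [e for e in events if not (start_hour <= e["time"].hour < end_hour)]


def brute_force_sources(events: list[dict], threshold: int = 5) -> set[str]:
    """Source addresses with at least `threshold` failed authentication events."""
    failures = Counter(e["source"] for e in events if e["type"] == "auth_failure")
    return {source for source, count in failures.items() if count >= threshold}


# Hypothetical sample: one tool call at 03:15 and five failed logins from one address.
sample = [{"time": datetime(2026, 5, 1, 3, 15), "type": "tool_call", "source": "10.0.0.5"}]
sample += [
    {"time": datetime(2026, 5, 1, 14, minute), "type": "auth_failure", "source": "203.0.113.7"}
    for minute in range(5)
]

print("Off-hours events:", len(off_hours_events(sample)))
print("Possible brute-force sources:", brute_force_sources(sample))
```

Tune the working-hours window and the failure threshold to your own usage; the defaults here are placeholders.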
Regularly review audit logs — at least weekly for personal deployments, daily for business-critical instances. Follow official OpenClaw security bulletins and CVE databases to respond to newly disclosed vulnerabilities at the earliest opportunity.
Consider integrating OpenClaw logs with a SIEM (Security Information and Event Management) system for automated threat detection and correlation with other security events in your infrastructure. This is especially important for organizations running multiple OpenClaw instances or deploying agents across different environments. A SIEM can correlate events across your entire security perimeter — detecting, for example, that an unusual OpenClaw tool invocation coincided with an anomalous network connection from a different system, indicating a coordinated attack.
5. The Bigger Picture: Open Source AI Security Is an Ecosystem Problem
The vulnerabilities affecting OpenClaw are not unique to this one project. They are symptoms of a broader challenge that affects every open-source AI agent platform: how do you balance rapid innovation with security, in an ecosystem where most users are not security professionals?
Consider the parallels with other transformative open-source technologies. Kubernetes, in its early years, faced similar security growing pains — default configurations that prioritized ease of use over security, a steep learning curve for securing production deployments, and a lag between vulnerability disclosure and widespread patching. The Kubernetes community eventually addressed these issues through a combination of hardened defaults, security-focused tooling (like Pod Security Policies and later Pod Security Standards), and extensive security documentation. But the process took years, and many organizations learned about Kubernetes security the hard way — through breaches and incidents that could have been prevented with proper configuration.
The WordPress ecosystem offers another instructive parallel. WordPress powers over 40% of all websites on the internet, and its plugin ecosystem is one of the largest in the world. But this success came with security challenges: malicious plugins, vulnerable themes, and default configurations that left sites exposed to automated attacks. The WordPress community responded with automated vulnerability scanning, plugin review processes, and security hardening guides. Today, WordPress security is a mature discipline — but it took over a decade of incidents and responses to get there.
OpenClaw is at an earlier stage of this maturity curve. The project's explosive growth means that the number of deployments is scaling faster than the security infrastructure around them. The 270,000+ exposed instances identified by security researchers represent not just a vulnerability in OpenClaw's default configuration, but a failure of the broader ecosystem — documentation, tutorials, deployment guides, and cloud marketplace images — to adequately communicate and enforce security best practices.
The Skills supply chain issue mirrors challenges that npm, PyPI, and other package registries have faced for years. These ecosystems have invested heavily in automated security scanning (npm audit, PyPI's malware detection), code signing, and dependency analysis tools. OpenClaw's Skills marketplace will need similar investments — and the community will need time to develop and deploy them. The difference is that OpenClaw Skills operate with more system-level access than typical npm packages, making the consequences of a compromised Skill more severe than a compromised JavaScript library.
There's also a regulatory dimension that's just beginning to emerge. The NVDB alert from China's Ministry of Industry and Information Technology may be a harbinger of more formal regulatory attention to AI agent security. As AI agents gain the ability to access financial systems, handle personal data, and make decisions that affect business operations, regulators around the world are likely to impose minimum security requirements, just as they have for financial services, healthcare IT, and other critical infrastructure. The European Union's AI Act, which entered into force in August 2024 with obligations phasing in from 2025 onward, already treats AI systems used as safety components in critical infrastructure as "high-risk" and imposes specific security and transparency requirements on them. Organizations that adopt AI agents without considering security compliance may find themselves facing regulatory scrutiny in addition to the operational risks of a breach.
The key lesson from the OpenClaw security incidents of early 2026 is this: the security of autonomous AI agents is fundamentally different from the security of traditional software. Traditional software has well-understood attack surfaces (input validation, authentication boundaries, privilege escalation paths). Autonomous AI agents introduce new categories of risk: prompt injection attacks that manipulate the agent's reasoning process, tool-use vulnerabilities that exploit the gap between what the agent intends and what the tools actually do, and the cascading failure mode where a compromised agent can cause damage across multiple integrated systems simultaneously because it has legitimate access to all of them.
Securing this new paradigm requires new thinking — not just applying old security practices to a new technology, but developing security frameworks specifically designed for autonomous, tool-using AI systems. This means thinking about agent behavior as a security boundary, not just network and application boundaries. It means treating the agent's reasoning process as an attack surface that can be manipulated through carefully crafted inputs. And it means recognizing that the combination of autonomy and broad system access that makes AI agents so powerful also makes them uniquely dangerous when compromised.
The OpenClaw community, to its credit, has begun this work with the v4.5 security hardening release and the Skills review process. But there is much more to be done, and the pace of innovation means that new security challenges will continue to emerge faster than they can be fully addressed. The security of AI agents will be an ongoing journey, not a destination — and the organizations that recognize this reality will be better positioned to benefit from the technology while managing its risks.
6. What About Non-Technical Users? Let Professionals Handle Security
The six steps above are straightforward for developers with Linux operations experience and a security mindset. But for non-technical users — e-commerce operators, content creators, small business owners, freelancers — configuring each security policy individually is not just inconvenient; it's practically impossible.
Most non-technical users don't have the expertise to evaluate Skill permissions, configure Docker sandboxes, set up TLS certificates, or interpret audit logs. They install OpenClaw because they want an AI agent to help them with their work, not because they want to become part-time security administrators. And yet, the consequences of a security breach — stolen credentials, compromised accounts, data exfiltration — are just as severe for them as for any enterprise. In some ways, they're even more vulnerable: a small business owner whose payment credentials are stolen may not have the resources to recover from the financial impact, and a freelancer whose client data is breached may lose their reputation and their livelihood.
The asymmetric nature of this problem is worth emphasizing. The cost of implementing proper security is moderate — a few hours of configuration for someone who knows what they're doing. The cost of a security breach is potentially catastrophic — financial loss, legal liability, reputation damage, and the cascading effects of credential compromise across multiple services. For most users, the risk-reward calculus strongly favors delegating security to experts rather than attempting to handle it themselves.
This is precisely the problem that KaiheAiBox's Agent Computer solves. It provides a stable, always-on (24/7) OpenClaw deployment environment with security hardening applied at the factory:
- Physical isolation ensures your agents are never exposed to the public internet, eliminating the entire category of remote attack vectors
- Authentication and sandboxing are enabled by default — no configuration required, no risk of forgetting to enable them
- Pre-configured security policies are continuously maintained and updated by a professional security team that monitors threat intelligence and applies patches proactively
- Curated Skills marketplace with pre-vetted plugins that have undergone security review before being made available, reducing the risk of installing malicious Skills
- Automated monitoring and alerting that detects and responds to anomalous behavior without user intervention, providing enterprise-grade security without requiring enterprise-grade expertise
You focus on putting your agents to work — security is handled by the infrastructure. It's the same reason most people use cloud email instead of running their own mail server: the security expertise required to do it properly is substantial, and the cost of getting it wrong is high. The economics strongly favor delegating security to specialists who do it at scale.
This approach also has a compounding benefit: a managed security environment gets stronger over time as the security team encounters and responds to new threats. Every vulnerability that's patched, every attack pattern that's identified, every security improvement that's deployed benefits all users simultaneously. An individual user managing their own security has to learn each lesson independently; a managed service learns once and protects everyone.
For organizations that need to maintain their own OpenClaw deployments — whether for data sovereignty, compliance, or custom integration requirements — the six steps outlined above provide a solid security foundation. But even these organizations should consider engaging professional security consultants for initial setup and periodic audits. The cost of a security review is a fraction of the cost of a security breach.
OpenClaw's security challenges are not a question of "whether to address them" but of "who addresses them and how." The vulnerabilities are real, the attack vectors are proven, and the consequences of ignoring them are severe. For most users, choosing a reliable managed environment is far wiser than building a security stack from scratch — and far less likely to result in learning about security through the painful experience of a breach.
The AI agent revolution is real, and the productivity gains are substantial. But no productivity gain is worth a compromised bank account, a stolen identity, or a breached business. Security is not optional — it's the foundation on which all the benefits of autonomous AI agents are built. Get the foundation right, and everything else follows. Get it wrong, and the entire edifice collapses.
KaiheAiBox · OpenClaw Zone