OpenClaw Security Fix Plan: 24 Issues, 6 Fix Sets

TL;DR: Here's a battle plan to address all 24 of OpenClaw's security issues. Six fix sets cover everything: rate limiting with cost circuit breakers, encryption at rest, exec approval overhaul, container isolation for subagents, session isolation with auth tightening, and audit logging. The security work takes about 235 lines of core patches and 4 extensions; the full plan, including UX and team features, comes to about 410 lines of core patches and 11 modular extensions. The plugin-first architecture keeps roughly 80% of changes out of the core, making upstream merges survivable while adding features the project desperately needs: cost tracking, team orchestration, and progress indicators.

OpenClaw Soft Fork: Security, UX & Multi-Agent Improvements

Context

OpenClaw is the most popular open-source AI agent (190K+ stars, MIT license), but has critical gaps in security (135K+ exposed instances, ClawHavoc supply chain attack, plaintext tokens, no rate limiting), UX (15+ hours setup, broken dashboard, no progress indicators, $200-3600/month cost surprises), and multi-agent orchestration (no named teams, no task queues, no plan approval). The creator just joined OpenAI (Feb 14, 2026) and the project is moving to a foundation.

Goal: Soft fork that implements security hardening, UX improvements, and team orchestration as modular extensions + minimal core patches, keeping the ability to merge upstream releases.


Architecture: Plugin-First, Core-Minimal

OpenClaw's plugin SDK is powerful. Plugins can:

  • Register custom tools, hooks, gateway RPC methods, HTTP endpoints, CLI commands, background services
  • Intercept/block/modify: messages (message_sending), tool calls (before_tool_call), agent behavior (before_agent_start), subagent spawning (subagent_spawning)
  • Add auth providers, channels, config schemas (plugin-scoped)

Plugins cannot: intercept gateway requests before core auth, add WebSocket middleware, extend main config schema, override core auth, wrap existing gateway methods, modify session routing, or add new hook types.

Strategy: ~80% of changes as extensions in /extensions/, ~20% as surgical core patches (adding hook points, middleware layer, config fields). Core patches kept small and isolated for easy upstream merging.


Security Issue Inventory (24 Issues)

Before detailing fixes, here's the full enumeration of security issues in upstream OpenClaw:

Critical

  1. No rate limiting on inference/tool execution: the auth endpoint has a 10/min limit, but inference requests and tool execution are completely unprotected
  2. Exec approval bypass: approvals use substring matching (rm matches arm and /bin/barm), and aliases, redirects, and pipes all bypass the check
  3. Plaintext credential storage: all OAuth tokens, API keys, and channel creds are stored in plaintext on disk
  4. Malicious skill execution: ClawHavoc deployed 1,184+ malicious skills; skills run in-process with full gateway privileges
  5. Session data leakage: the default DM scope "main" means all users of a shared gateway see each other's sessions

High

  6. No audit logging: no centralized trail, no immutable log, can't investigate incidents
  7. Admin scope is superscope: CLI gets [admin, approvals, pairing] by default, and admin bypasses all scope checks
  8. No tenant isolation: user A can access user B's sessions via API (no session ownership validation)
  9. Device token replay: no nonce consumption tracking, no replay window enforcement
  10. No TLS enforcement: local connections accept plaintext, and there is no option to require TLS for non-localhost
  11. WebSocket resource exhaustion: no per-connection rate limiting, 25MB max payload, 50MB send buffer

Medium

  12. Large payload DoS: 25MB max WebSocket payload with no per-message cost accounting
  13. Session cap exhaustion: 500 session limit with no per-user quota, so one user can exhaust the cap
  14. Session transcript exposure: transcripts stored as plaintext JSONL, no integrity verification
  15. Config drift: model config in 4 places, changes don't propagate, form editor silently drops unknown keys
  16. Concurrent write interleaving: parallel agent runs on the same session produce interleaved output
  17. Plugins run in-process: full gateway privileges, no filesystem/network/process restrictions
  18. Session metadata lacks validation: ~40 fields with no schema enforcement on write

Low

  19. No CSP headers on HTTP endpoints: XSS risk on control UI and custom HTTP routes
  20. Dependency pinning gaps: some transitive dependencies unpinned
  21. Log redaction incomplete: API keys and tokens can appear in debug logs
  22. No signed releases: binary releases not cryptographically signed
  23. Stale session cleanup race: 30-day pruning can race with active session writes
  24. Heartbeat timing side-channel: 30s heartbeat interval reveals connection liveness to network observers

Phase 1: Security Hardening (Minimum Fix Set)

Six fixes cover the 24 security issues, with #24 accepted as low risk rather than patched (see Residual Low-Severity Items). Each fix lists which issues it addresses.

1.1 Rate Limiting + Cost Circuit Breaker (Core Patch + Extension)

Addresses: #1 (no rate limiting), #9 (device token replay — rate-limit auth attempts), #11 (WebSocket resource exhaustion), #12 (large payload DoS — per-message cost accounting), #13 (session cap exhaustion — per-user session quota)

Core patch (~15 LOC): Add before_request hook point in src/gateway/server-methods.ts handleGatewayRequest() — a single early-interception hook that runs before method dispatch.

Extension (extensions/security-ratelimit/):

  • Per-IP sliding window on auth attempts (10/min default, matches existing, but now covers all endpoints)
  • Per-session sliding window on inference requests (configurable, default 30/min)
  • Per-tool execution limits (max N bash commands/minute, default 20/min)
  • Per-user session quota (max sessions per user, default 50)
  • Per-message size accounting (reject oversized payloads before processing)
  • Global API spend circuit breaker (kill inference when spend exceeds $/hour or $/day budget)
  • Device token auth: rate-limit per-deviceId to mitigate replay attacks
  • Uses before_request hook (from core patch) + before_tool_call hook (existing)
  • Config: plugin-scoped rate limit settings with sensible defaults for a personal org running around the clock

Default tuning for personal org: Generous limits that catch abuse without interfering with normal use. Inference: 30/min (enough for rapid iteration), tool execution: 20/min (covers aggressive coding sessions), cost: $50/day circuit breaker (prevents runaway but allows heavy days). All configurable and auto-refinable based on usage history.
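
The two mechanisms above, the sliding window and the spend circuit breaker, can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions: the class names are hypothetical, state is kept in memory only, and nothing here uses OpenClaw's plugin SDK. Wiring into before_request and before_tool_call is left out.

```typescript
// Sketch only: class and method names are hypothetical, not OpenClaw APIs.

/** Allows at most `limit` events per `windowMs` for each key (IP, session, tool). */
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Records the event and returns true if the key is still under its limit. */
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}

/** Opens (blocks inference) once spend inside the window exceeds the budget. */
class SpendCircuitBreaker {
  private readonly charges: { at: number; usd: number }[] = [];
  private readonly budgetUsd: number;
  private readonly windowMs: number;

  constructor(budgetUsd: number, windowMs: number) {
    this.budgetUsd = budgetUsd;
    this.windowMs = windowMs;
  }

  /** Records the cost of one completed inference call. */
  record(usd: number, now: number = Date.now()): void {
    this.charges.push({ at: now, usd });
  }

  /** True when spend inside the window is over budget. */
  isOpen(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const spent = this.charges
      .filter((c) => c.at > cutoff)
      .reduce((sum, c) => sum + c.usd, 0);
    return spent > this.budgetUsd;
  }
}

// Defaults from the plan: 30 inference requests/min and a $50/day breaker.
const inferenceLimiter = new SlidingWindowLimiter(30, 60_000);
const dailyBreaker = new SpendCircuitBreaker(50, 24 * 60 * 60 * 1000);
```

A hook handler would presumably check `dailyBreaker.isOpen()` first and then `inferenceLimiter.allow(sessionKey)`, rejecting the request if either says no. Passing `now` explicitly keeps both classes easy to test.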

1.2 Encryption at Rest (Extension)

Addresses: #3 (plaintext credentials), #14 (session transcript exposure), #21 (log redaction — encrypted secrets can't leak into logs)

Extension (extensions/security-vault/):

  • Encrypt credentials, tokens, API keys on disk using OS keychain (macOS Keychain, Linux Secret Service via keytar or @aspect/secret-store)
  • Fallback: AES-256-GCM with user-provided master key (for headless/CI environments)
  • Session transcripts: encrypt JSONL files at rest, decrypt on read into memory
  • Hook into gateway_start to unlock vault, gateway_stop to verify seal
  • Custom CLI command: openclaw vault init, openclaw vault rotate, openclaw vault status
  • Targets: ~/.openclaw/oauth/, config auth sections, exec-approvals.json, channel creds.json, session transcript files
  • Log sanitizer: hook into logging to redact any value matching a known secret pattern before write
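
For the headless fallback, a minimal sketch of AES-256-GCM sealing with Node's built-in crypto module might look like the following. The `seal`/`unseal` names and the blob layout are assumptions for illustration, and deriving the key from the master key with scrypt is one reasonable choice rather than something the plan specifies.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical on-disk layout; every field is base64-encoded.
interface SealedBlob {
  salt: string; // per-blob salt for key derivation
  iv: string;   // 96-bit GCM nonce, never reused with the same key
  tag: string;  // GCM authentication tag
  data: string; // ciphertext
}

/** Encrypts a secret with a key derived from the user-provided master key. */
function seal(plaintext: string, masterKey: string): SealedBlob {
  const salt = randomBytes(16);
  const key = scryptSync(masterKey, salt, 32); // 32 bytes = AES-256
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

/** Decrypts a blob; throws if the key is wrong or the blob was modified. */
function unseal(blob: SealedBlob, masterKey: string): string {
  const key = scryptSync(masterKey, Buffer.from(blob.salt, "base64"), 32);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(blob.iv, "base64"));
  decipher.setAuthTag(Buffer.from(blob.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(blob.data, "base64")),
    decipher.final(), // authentication failure surfaces here
  ]);
  return plain.toString("utf8");
}
```

Because GCM is authenticated, the same mechanism also gives the transcript files integrity checking: a modified ciphertext fails to decrypt instead of yielding silently corrupted data.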

1.3 Exec Approval Overhaul (Core Patch)

Addresses: #2 (exec approval bypass — substring matching, metacharacters, pipes, aliases)

Core patch (~120 LOC): Replace substring matching in src/infra/exec-approvals.ts with proper command parsing.

Changes to src/infra/exec-approvals.ts:

  • Parse commands into shell AST (using shell-quote or bash-parser)
  • Normalize before matching: resolve aliases, expand paths, strip env var prefixes
  • Block shell metacharacters (&&, ||, ;, |, backticks, $()) unless explicitly allowed
  • Add --dry-run mode that shows what would execute without executing
  • Validate against pipe chains, not just first command
  • Reject commands containing null bytes, ANSI escape sequences, or Unicode direction overrides
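
To make the difference from substring matching concrete, here is a deliberately simplified sketch. It is not a shell parser: it splits on whitespace and ignores quoting, so it stands in for the AST step only to show exact program-name matching and up-front rejection of metacharacters. `checkCommand` and the `approved` set are hypothetical names, and treating the list as a set of approved program names is an assumption.

```typescript
// Characters rejected outright: NUL, ANSI escape (ESC), Unicode direction overrides.
const CONTROL_CHARS = /[\u0000\u001b\u202a-\u202e\u2066-\u2069]/;
// Shell metacharacters blocked by default: ; & | backtick, redirects, newline, $(
const METACHARS = /[;&|`<>\n]|\$\(/;
// Leading VAR=value assignments are stripped before matching.
const ENV_ASSIGNMENT = /^[A-Za-z_][A-Za-z0-9_]*=/;

interface Verdict {
  allowed: boolean;
  reason: string;
}

/** Matches the program name exactly instead of searching the raw string. */
function checkCommand(command: string, approved: ReadonlySet<string>): Verdict {
  if (CONTROL_CHARS.test(command)) {
    return { allowed: false, reason: "control or direction-override character" };
  }
  if (METACHARS.test(command)) {
    return { allowed: false, reason: "shell metacharacter" };
  }
  const tokens = command.trim().split(/\s+/).filter((t) => t.length > 0);
  // First token that is not an env-var prefix is the program to run.
  const program = tokens.find((t) => !ENV_ASSIGNMENT.test(t));
  if (program === undefined) {
    return { allowed: false, reason: "empty command" };
  }
  // Normalize /bin/rm to rm so path spelling cannot change the decision.
  const name = program.slice(program.lastIndexOf("/") + 1);
  return approved.has(name)
    ? { allowed: true, reason: "exact match" }
    : { allowed: false, reason: `"${name}" is not approved` };
}
```

With `approved = new Set(["rm"])`, the command `arm --version` is refused because `arm` is compared as a whole token, while `rm && cat /etc/passwd` is refused before any matching happens. A real implementation would still need proper parsing for quoting, aliases, and pipe chains, as the bullets above describe.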

1.4 Container Isolation for Subagents (Core Patch + Extension)

Addresses: #4 (malicious skill execution), #7 (admin superscope — blast radius containment), #11 (WebSocket resource exhaustion — resource caps), #12 (large payload DoS — memory caps), #13 (session cap exhaustion — container limits), #17 (plugins run in-process)

Derived from: docker-claude project learnings. The Go code doesn't port (OpenClaw is TypeScript), but the architecture decisions carry over directly.

Core patch (~25 LOC):

  • Add plugin_load hook point in plugin loader
  • Add sandbox option to plugin manifest schema
  • Add container spawn mode to subagent spawn options in src/agents/subagent-spawn.ts

Extension (extensions/container-isolation/):

  • Hook into subagent_spawning (existing) to route spawns through Docker instead of in-process child_process.fork()
  • Hook into plugin_load (from core patch) to run untrusted skills in containers

Container architecture:

  • Network: Bridge network per team (openclaw-team-{id}), not per agent. Agents on same team can communicate; cross-team traffic blocked
  • Filesystem: Volume mounts scoped to workspace + agent-specific config only. No host system directories. Read-only where possible
  • Environment: API keys injected via env vars at container creation, never baked into images
  • Resources: Per-container CPU (1 core default), memory (512MB default), PID limit (256), no privileged mode
  • Lifecycle: Container labels (openclaw.managed=true, openclaw.team={id}, openclaw.agent={id}) for management. Auto-cleanup on agent exit with configurable grace period
  • Image: Minimal base image with Node.js + OpenClaw agent runtime, no shell by default for skill containers
  • Communication: Gateway communicates with containerized agents via Unix socket (mounted volume) or localhost TCP. <4ms per-message overhead vs in-process

Graceful degradation: If Docker is unavailable (no socket, permissions), fall back to isolated-vm V8 sandboxing for skills and warn. Subagents fall back to in-process with a security warning.

Signed skill packages: Hash verification before container launch. Manifest declares required capabilities (filesystem paths, network hosts, tools). User approves capability grants on first load.
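
The resource, network, and label settings listed under Container architecture map onto standard `docker run` flags. The sketch below only builds the argument list; the `AgentContainerSpec` shape, the `/workspace` mount point, and the function name are assumptions, and the per-team bridge network is assumed to exist already.

```typescript
// Hypothetical spec shape; field names are illustrative, not OpenClaw types.
interface AgentContainerSpec {
  teamId: string;
  agentId: string;
  image: string;
  workspaceDir: string;            // host path mounted into the container
  secrets: Record<string, string>; // API keys injected at creation time
  cpus?: number;
  memoryMb?: number;
  pidsLimit?: number;
}

interface DockerInvocation {
  args: string[];              // arguments for the docker binary
  env: Record<string, string>; // extra environment for the docker CLI process
}

function buildDockerRun(spec: AgentContainerSpec): DockerInvocation {
  const args = [
    "run", "--rm", "--detach",
    "--network", `openclaw-team-${spec.teamId}`, // one bridge network per team
    "--cpus", String(spec.cpus ?? 1),
    "--memory", `${spec.memoryMb ?? 512}m`,
    "--pids-limit", String(spec.pidsLimit ?? 256),
    "--read-only", // root filesystem is read-only; only the workspace is writable
    "--label", "openclaw.managed=true",
    "--label", `openclaw.team=${spec.teamId}`,
    "--label", `openclaw.agent=${spec.agentId}`,
    "--volume", `${spec.workspaceDir}:/workspace`,
  ];
  // "--env NAME" with no value copies NAME from the docker CLI's own
  // environment, which keeps secret values off the command line.
  for (const name of Object.keys(spec.secrets)) {
    args.push("--env", name);
  }
  args.push(spec.image);
  return { args, env: { ...spec.secrets } };
}
```

The result could be handed to Node's `child_process.spawn("docker", args, { env: { ...process.env, ...env } })`. Note that `--privileged` never appears, which is what "no privileged mode" amounts to at this level.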

1.5 Session Isolation + Auth Tightening (Core Patch)

Addresses: #5 (session data leakage), #7 (admin superscope — least-privilege defaults), #8 (no tenant isolation), #9 (device token replay — nonce validation), #10 (no TLS enforcement), #18 (session metadata validation), #23 (stale session cleanup race)

Core patches (~75 LOC total across 5 files):

src/routing/resolve-route.ts (~5 LOC):

  • Change default DM scope from "main" to "per-channel-peer"

src/gateway/session-utils.ts (~30 LOC):

  • Session ownership validation: every session read/write checks that requesting user owns the session key
  • User A cannot enumerate or access user B's sessions via any gateway method
  • Optimistic locking: add version field to session metadata, reject stale writes (fixes #23 race condition)
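
A rough sketch of how ownership checks and optimistic locking could fit together, using an in-memory map in place of the real session store. `OwnedSessionStore` and its methods are hypothetical names, not the actual shape of session-utils.ts.

```typescript
interface SessionRecord {
  key: string;
  ownerId: string;
  version: number; // incremented on every successful write
  metadata: Record<string, unknown>;
}

class OwnedSessionStore {
  private readonly sessions = new Map<string, SessionRecord>();

  create(key: string, ownerId: string): SessionRecord {
    const record: SessionRecord = { key, ownerId, version: 1, metadata: {} };
    this.sessions.set(key, record);
    return { ...record };
  }

  /** Every read checks ownership before returning anything. */
  read(key: string, userId: string): SessionRecord {
    const record = this.sessions.get(key);
    // Same error for "missing" and "not yours", so callers cannot probe
    // for the existence of other users' session keys.
    if (record === undefined || record.ownerId !== userId) {
      throw new Error("session not found");
    }
    return { ...record };
  }

  /** Rejects the write if the caller's copy is older than the stored one. */
  write(
    key: string,
    userId: string,
    expectedVersion: number,
    metadata: Record<string, unknown>,
  ): SessionRecord {
    const current = this.read(key, userId);
    if (current.version !== expectedVersion) {
      throw new Error(`stale write: expected v${expectedVersion}, found v${current.version}`);
    }
    const next: SessionRecord = { ...current, version: current.version + 1, metadata };
    this.sessions.set(key, next);
    return { ...next };
  }

  /** Enumeration only ever returns the caller's own sessions. */
  listFor(userId: string): string[] {
    return Array.from(this.sessions.values())
      .filter((s) => s.ownerId === userId)
      .map((s) => s.key);
  }
}
```

The version check is also what closes the cleanup race: a pruning job that read version 3 cannot overwrite or delete a session that an active write has since moved to version 4.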

src/gateway/method-scopes.ts (~10 LOC):

  • Change CLI_DEFAULT_OPERATOR_SCOPES from [admin, approvals, pairing] to [read, write, approvals]
  • Admin scope must be explicitly granted, not assumed
  • This is a breaking change but the correct security posture

src/gateway/server.impl.ts (~15 LOC):

  • Enforce TLS for non-localhost connections (reject plaintext WebSocket upgrade if remote IP is not loopback)
  • Device token nonce tracking: maintain nonce set per device, reject replayed nonces within a 5-minute window
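
One way the nonce window could work is sketched below. It assumes each device token carries an issue timestamp, which the plan does not state: tokens older than the window are refused outright, so nonces only need to be remembered for five minutes. `NonceTracker` is a hypothetical name, and clock-skew tolerance is left out.

```typescript
class NonceTracker {
  // deviceId -> (nonce -> issuedAt timestamp in ms)
  private readonly seen = new Map<string, Map<string, number>>();
  private readonly windowMs: number;

  constructor(windowMs: number = 5 * 60_000) {
    this.windowMs = windowMs;
  }

  /** Returns true for a fresh token, false for a stale or replayed one. */
  accept(deviceId: string, nonce: string, issuedAt: number, now: number = Date.now()): boolean {
    // Too old, or claims to come from the future: refuse without recording.
    if (now - issuedAt > this.windowMs || issuedAt > now) return false;

    const nonces = this.seen.get(deviceId) ?? new Map<string, number>();
    // Forget nonces whose tokens could no longer pass the age check anyway.
    nonces.forEach((at, n) => {
      if (now - at > this.windowMs) nonces.delete(n);
    });
    if (nonces.has(nonce)) return false; // replay inside the window

    nonces.set(nonce, issuedAt);
    this.seen.set(deviceId, nonces);
    return true;
  }
}
```

Tying the memory bound to the token age check is the design point: without the age check, a captured token could simply be replayed after its nonce had been forgotten.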

src/gateway/protocol/ws-connection.ts (~15 LOC):

  • Session metadata schema validation on write (AJV schema for the ~40 fields, reject unknown/malformed)

1.6 Audit Logging (Extension)

Addresses: #6 (no audit logging), #15 (config drift — log config changes for detection), #16 (concurrent write interleaving — log concurrent access for forensics), #19 (no CSP — audit log endpoint has CSP headers), #20 (dependency pinning — audit log of dependency changes), #22 (no signed releases — log release verification events)

Extension (extensions/security-audit/):

  • Structured append-only audit log for security-sensitive actions
  • Events: auth attempts (success/fail), permission changes, exec approvals, tool invocations, session access, config changes, plugin loads, container lifecycle, dependency updates
  • Hash-chain integrity (each entry includes SHA-256 hash of previous entry)
  • Tamper detection: periodic integrity verification, alert on chain break
  • Hooks used: gateway_start, before_tool_call, after_tool_call, message_received, message_sending, subagent_spawning, plugin_load
  • Custom HTTP endpoint: /audit/query with CSP headers (addresses #19 for this endpoint)
  • Custom CLI: openclaw audit search, openclaw audit verify, openclaw audit export
  • Retention: configurable rotation (default 90 days), compressed archives
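
The hash-chain idea is small enough to sketch in full. The entry shape and function names below are assumptions for illustration; the chain itself is the technique described above, with each entry's SHA-256 hash covering the previous entry's hash.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape for one line of the append-only log.
interface AuditEntry {
  seq: number;
  timestamp: string;
  event: string;
  detail: Record<string, unknown>;
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // SHA-256 over all other fields
}

const GENESIS = "0".repeat(64);

function hashEntry(e: Omit<AuditEntry, "hash">): string {
  // An array fixes field order; detail keeps its own key order through JSON.
  const payload = JSON.stringify([e.seq, e.timestamp, e.event, e.detail, e.prevHash]);
  return createHash("sha256").update(payload).digest("hex");
}

/** Appends an entry linked to the current tail of the log. */
function appendEntry(
  log: AuditEntry[],
  event: string,
  detail: Record<string, unknown>,
  timestamp: string = new Date().toISOString(),
): AuditEntry {
  const last = log.length > 0 ? log[log.length - 1] : undefined;
  const body = {
    seq: log.length,
    timestamp,
    event,
    detail,
    prevHash: last ? last.hash : GENESIS,
  };
  const entry: AuditEntry = { ...body, hash: hashEntry(body) };
  log.push(entry);
  return entry;
}

/** Returns the index of the first entry that breaks the chain, or -1 if intact. */
function firstBrokenIndex(log: AuditEntry[]): number {
  let prevHash = GENESIS;
  return log.findIndex((entry) => {
    const ok = entry.prevHash === prevHash && entry.hash === hashEntry(entry);
    prevHash = entry.hash;
    return !ok;
  });
}
```

Editing or deleting any entry changes a hash that a later entry depends on, so periodic verification finds the first point of tampering. An attacker who can rewrite the whole file can still rebuild the chain, so the latest hash would need to be anchored somewhere outside the log for stronger guarantees.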

Residual Low-Severity Items

Issues #20 (dependency pinning), #22 (signed releases), and #24 (heartbeat timing) are operational hygiene rather than code fixes:

  • #20: Add npm audit to CI, pin transitive deps in lockfile
  • #22: Add GPG signing to release workflow
  • #24: Accept as low-risk; randomizing heartbeat interval adds complexity for minimal gain

Security Coverage Matrix

Fix | Issues Addressed | Type
1.1 Rate Limiting | #1, #9 (partial), #11, #12, #13 | Core (~15 LOC) + Extension
1.2 Encryption at Rest | #3, #14, #21 | Extension only
1.3 Exec Approval Overhaul | #2 | Core (~120 LOC)
1.4 Container Isolation | #4, #7 (partial), #11, #12, #13, #17 | Core (~25 LOC) + Extension
1.5 Session Isolation + Auth | #5, #7 (partial), #8, #9, #10, #18, #23 | Core (~75 LOC)
1.6 Audit Logging | #6, #15, #16, #19, #20, #22 | Extension only
Total | All 24 issues | ~235 LOC core + 4 extensions

Implementation Priority

Order | Fix | Effort | Impact
1 | Session Isolation + Auth (1.5) | ~1 day | Highest: fixes data leakage in every multi-user deployment
2 | Exec Approval Overhaul (1.3) | ~1 day | Single file, fixes the most embarrassing vulnerability
3 | Encryption at Rest (1.2) | ~2 days | Extension only, no core patches needed
4 | Rate Limiting (1.1) | ~2 days | Tiny core patch enables full extension
5 | Audit Logging (1.6) | ~3 days | Extension only, should go in before container isolation for observability
6 | Container Isolation (1.4) | ~1-2 weeks | Biggest lift, most issues covered, benefits from audit logging being in place

Phase 2: UX Improvements

2.1 Cost Dashboard & Budget Tracking (Extension)

Extension (extensions/cost-tracker/):

  • Track token usage per session, per user, per model in real-time
  • Estimate cost before sending (show token count + estimated $)
  • Session-level and user-level cost summaries
  • Alerts when approaching configurable budget limits
  • Hooks: llm_input (count input tokens), llm_output (count output tokens), before_agent_start (show estimate)
  • Custom gateway method: cost.summary, cost.budget.set
  • Custom HTTP endpoint: /cost/dashboard (web UI)

Why this matters: Users report $200/day to $3,600/month surprises. No per-session, per-user, or per-tool cost budgets exist.

2.2 Streaming UX / Progress Indicators (Extension + Core Patch)

Core patch (small): Add agent_progress event type to broadcast system in src/gateway/server-broadcast.ts. Emit progress events from agent runner.

Extension (extensions/ux-progress/):

  • Hook before_tool_call: emit "executing tool X..." progress event
  • Hook before_agent_start: emit "thinking..." indicator
  • Hook after_tool_call: emit tool result summary
  • Track tool execution duration, show elapsed time
  • TUI integration: render progress bar/spinner for long operations

Why this matters: Users complain the terminal sits static with no spinner, progress bar, or ETA during complex requests.

2.3 Configuration Consolidation (Core Patch)

Core patch: Consolidate model configuration into single source of truth.

Changes to src/config/config.ts and src/config/io.ts:

  • Model config lives in one place (main config), propagates to session state, cron payloads, model allowlist
  • Config validation on save with clear error messages (not silent dropping of unknown keys)
  • Form editor preserves unknown keys (don't silently drop)
  • Add openclaw config check command that validates entire config and reports conflicts
  • Add openclaw config diff to show divergence between the 4 current storage locations

Why this matters: Model config stored in 4 separate places. Changes to main config don't propagate. Form editor silently drops unknown fields.

2.4 Multi-Client Session Coordination (Core Patch)

Core patch: Add session-level queuing in src/gateway/server.impl.ts chat handlers.

Changes:

  • Queue concurrent sends to same session (max 1 active agent run per session, additional sends wait or return "busy")
  • Add chat.persisted event emitted after disk write completes (durability guarantee)
  • Adaptive backpressure: degrade streaming quality (larger batch chunks, lower frequency) before disconnecting slow clients
  • Session metadata optimistic locking (version field, reject stale writes)

Why this matters: Concurrent sends from same user interleave. Slow clients force-disconnected. No persistence guarantee.
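
The "max 1 active agent run per session" rule can be sketched as a per-session promise chain. This shows only the waiting variant, not the "busy" response or backpressure, and `SessionRunQueue` is a hypothetical name rather than anything in server.impl.ts.

```typescript
class SessionRunQueue {
  // sessionKey -> promise that settles when the last queued run finishes
  private readonly tails = new Map<string, Promise<void>>();

  /** Runs `task` after every earlier task for the same session has settled. */
  run<T>(sessionKey: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(sessionKey) ?? Promise.resolve();
    const result = previous.then(task);
    // The tail swallows errors so one failed run does not block the queue.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionKey, tail);
    // Drop the map entry once this run was the last one queued.
    void tail.then(() => {
      if (this.tails.get(sessionKey) === tail) this.tails.delete(sessionKey);
    });
    return result;
  }
}
```

Runs for different session keys never wait on each other, while two sends to the same session execute strictly in arrival order, which is what prevents interleaved output. Emitting chat.persisted would presumably happen inside the task, after the disk write resolves.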

2.5 Context Window Intelligence (Extension)

Extension (extensions/context-optimizer/):

  • Smarter context compaction: preserve recent + semantically important (not just most recent)
  • Show token usage breakdown (system prompt vs history vs tool output)
  • Workspace-aware context pruning: tool outputs from irrelevant files pruned first
  • Hooks: before_compaction (existing), before_prompt_build (existing)
  • Custom tool: context_status showing current window usage

Phase 3: Multi-Agent Team Orchestration

3.1 Named Teams (Extension + Core Patch)

Core patch (small): Add team_id field to SubagentRunRecord in src/agents/subagent-registry.types.ts. Add team-related hooks: team_created, team_member_joined, team_task_assigned.

Extension (extensions/teams/):

  • Team model: named groups of agents with roles (lead, coder, reviewer, planner)
  • Team CRUD: create, list, update, delete teams
  • Member management: add/remove agents, assign roles
  • Per-role tool scoping (reviewers get read-only tools, coders get edit tools)
  • Custom gateway methods: team.create, team.list, team.addMember, etc.
  • Custom tools: team_create, team_message, team_assign (registered via plugin SDK)
  • Persistence: team state in plugin-scoped SQLite DB
  • Builds on existing subagent registry -- teams are a named grouping layer on top of spawn/steer/kill

3.2 Shared Task Queue (Extension)

Extension (extensions/task-queue/):

  • Task model: title, description, status (pending/in_progress/completed/blocked), assignee, dependencies
  • Dependency tracking: task A blocks task B
  • Auto-assignment: unassigned tasks offered to idle team members
  • Custom tools: task_create, task_claim, task_complete, task_list, task_block
  • Persistence: SQLite DB (plugin-scoped)
  • Hooks: subagent_ended (existing) -- when agent finishes, check for next task
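
Dependency tracking and auto-assignment reduce to one filter over the task list. A minimal sketch, assuming tasks are held in an array and that the `Task` shape and function names below are illustrative rather than the extension's schema:

```typescript
type TaskStatus = "pending" | "in_progress" | "completed" | "blocked";

interface Task {
  id: string;
  title: string;
  status: TaskStatus;
  assignee?: string;
  dependsOn: string[]; // ids of tasks that must complete first
}

/** Pending, unassigned tasks whose dependencies have all completed. */
function readyTasks(tasks: Task[]): Task[] {
  const done = new Set(tasks.filter((t) => t.status === "completed").map((t) => t.id));
  return tasks.filter(
    (t) =>
      t.status === "pending" &&
      t.assignee === undefined &&
      t.dependsOn.every((dep) => done.has(dep)),
  );
}

/** Hands the first ready task to an idle agent; undefined when nothing is unblocked. */
function claimNext(tasks: Task[], agentId: string): Task | undefined {
  const next = readyTasks(tasks)[0];
  if (next === undefined) return undefined;
  next.status = "in_progress";
  next.assignee = agentId;
  return next;
}
```

A subagent_ended handler could call `claimNext` for the agent that just finished. With SQLite as the store, the claim would need to run inside a transaction so two idle agents cannot take the same task.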

3.3 Plan Approval Workflow (Extension)

Extension (extensions/plan-approval/):

  • Agent proposes plan (structured document with steps)
  • Plan sent to team lead or human for approval
  • Approval/rejection with feedback
  • Approved plan becomes task queue entries
  • Custom tools: plan_propose, plan_approve, plan_reject
  • Custom HTTP endpoint: /plans/pending (web UI for human approval)

3.4 Team Dashboard (Extension)

Extension (extensions/team-dashboard/):

  • Web UI showing: active team members, their status, current tasks, task queue, activity log
  • Real-time updates via gateway events
  • Custom HTTP routes serving dashboard UI

Summary: Core Patches vs Extensions

Core Patches (~410 lines total)

File | Change | Lines Est.
src/gateway/server-methods.ts | Add before_request hook point | ~15
src/gateway/server-broadcast.ts | Add agent_progress event type | ~10
src/gateway/server.impl.ts | Session queuing, chat.persisted, backpressure, TLS enforcement, nonce tracking | ~95
src/gateway/session-utils.ts | Session ownership validation, optimistic locking | ~30
src/gateway/method-scopes.ts | CLI default scopes to least-privilege | ~10
src/gateway/protocol/ws-connection.ts | Session metadata schema validation | ~15
src/infra/exec-approvals.ts | Shell AST parsing, command normalization | ~120
src/routing/resolve-route.ts | Change default DM scope to per-channel-peer | ~5
src/agents/subagent-spawn.ts | Add container spawn mode | ~15
src/agents/subagent-registry.types.ts | Add team_id field | ~3
src/config/config.ts + io.ts | Config consolidation, form editor fix, validation | ~60
src/plugins/types.ts | Add new hook types (before_request, plugin_load, team hooks) | ~20
Plugin loader | Add plugin_load hook, sandbox manifest option | ~15

Extensions (11 total)

Extension | Purpose | Phase
extensions/security-ratelimit/ | Rate limiting (IP, user, session, cost, tool) | 1
extensions/security-vault/ | Encryption at rest (OS keychain, AES fallback) | 1
extensions/container-isolation/ | Docker container isolation for subagents + skills | 1
extensions/security-audit/ | Audit logging (append-only, hash-chain) | 1
extensions/cost-tracker/ | Cost dashboard, budgets, alerts | 2
extensions/ux-progress/ | Progress indicators, streaming UX | 2
extensions/context-optimizer/ | Context compaction, token usage visibility | 2
extensions/teams/ | Named teams, roles, member management | 3
extensions/task-queue/ | Shared task queue with dependencies | 3
extensions/plan-approval/ | Plan proposal and approval workflow | 3
extensions/team-dashboard/ | Web UI for team monitoring | 3

Verification Plan

Security (Phase 1):

  • Reproduce SecurityScorecard's auth-bypass test -- verify it's blocked
  • Send 1000 rapid inference requests -- verify rate limiter engages at configured threshold
  • Attempt exec approval bypass with shell metacharacters (rm && cat /etc/passwd) -- verify AST parser catches it
  • Load malicious skill (simulated) -- verify container sandbox blocks filesystem/network access
  • Verify isolated-vm fallback engages when Docker unavailable
  • Connect two users to shared gateway -- verify session isolation (user A cannot see user B's sessions)
  • Verify CLI no longer gets admin scope by default
  • Replay a captured device token -- verify nonce rejection
  • Connect via plaintext WebSocket from remote IP -- verify TLS enforcement rejects it
  • Verify all tokens encrypted on disk (hexdump config files should show ciphertext)
  • Run openclaw audit verify -- confirm hash chain integrity
  • Verify per-user session quota prevents cap exhaustion

UX (Phase 2):

  • Connect CLI + Web + mobile simultaneously to same session
  • Send concurrent messages from two clients -- verify queuing (second waits, not interleaved)
  • Disconnect slow client gracefully -- verify degradation before disconnect
  • Run multi-turn conversation -- verify cost tracker accuracy against provider billing
  • Change model in main config -- verify it propagates to session state and cron
  • Run openclaw config check -- verify it catches config drift

Multi-Agent (Phase 3):

  • Create team with 3 agents (lead, coder, reviewer)
  • Assign tasks with dependencies -- verify blocked tasks don't start early
  • Kill a team member -- verify tasks reassigned
  • Propose plan from agent -- verify approval flow reaches human
  • Check team dashboard -- verify real-time status updates

Regression:

  • Run OpenClaw's existing Vitest suite -- verify no regressions
  • Verify all 14+ channel integrations still work
  • Verify subagent spawn/steer/kill still works with team_id and container spawn mode added
  • Verify plugin loading still works with new hook types

Fork Maintenance Strategy

Keeping mergeable with upstream:

  1. Core patches in isolated commits with clear descriptions (upstreamable as PRs)
  2. Extensions in /extensions/ directory (clean separation, no merge conflicts)
  3. Track upstream releases, merge regularly, resolve conflicts only in ~410 lines of core patches
  4. Run upstream test suite after each merge
  5. Tag fork releases aligned with upstream versions (e.g., upstream-2026.2.22-fork.1)

What could break on upstream merge:

  • Changes to handleGatewayRequest() signature (our before_request hook)
  • Changes to exec-approvals.ts (our AST parser replaces their implementation)
  • Changes to plugin hook types (our new hook additions)
  • Changes to SubagentRunRecord type (our team_id field)
  • Changes to subagent-spawn.ts (our container spawn mode)
  • Changes to method-scopes.ts (our CLI default scope change)
  • Changes to ws-connection.ts (our metadata validation)

These are all small, predictable conflict zones.