OpenClaw Architecture Deep Dive


TL;DR: OpenClaw's architecture is a masterclass in centralized routing—one gateway server connects 14+ messaging platforms to AI agents through a WebSocket protocol. Follow a real Telegram message through the entire stack: from webhook to route resolution to LLM invocation to streamed response. The single-process design is elegant but fragile—one crash takes down everything, and file-based sessions create race conditions you'll wish you'd known about before deploying.

Overview

OpenClaw is a multi-client, multi-channel AI agent platform built in TypeScript/Node.js. At its core is a gateway server that routes messages between clients (CLI, TUI, web, mobile, messaging channels) and agent sessions. This document traces the architecture through a concrete example: a Telegram message flowing through the entire system.


High-Level Architecture

                    ┌─────────────┐
                    │  Telegram   │
                    │  WhatsApp   │
                    │  Discord    │   Channel Integrations
                    │  Slack      │   (14+ platforms)
                    │  Signal     │
                    │  iMessage   │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │   Gateway   │   WebSocket + HTTP Server
                    │   Server    │   Central routing hub
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
       ┌──────▼───┐  ┌────▼────┐  ┌───▼──────┐
       │  Route   │  │ Session │  │  Plugin  │
       │ Resolver │  │  Store  │  │  System  │
       └──────┬───┘  └────┬────┘  └───┬──────┘
              │            │            │
              └────────────┼────────────┘
                           │
                    ┌──────▼──────┐
                    │   Agent     │   Pi Agent Core
                    │   Runner    │   LLM + Tools
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
       ┌──────▼───┐  ┌────▼────┐  ┌───▼──────┐
       │   LLM    │  │  Tool   │  │ Subagent │
       │ Provider │  │ System  │  │ Registry │
       └──────────┘  └─────────┘  └──────────┘

Component Breakdown

1. Gateway Server

Location: src/gateway/server.impl.ts (~178 files in gateway directory)

The gateway is the central hub. It:

  • Runs HTTP/HTTPS + WebSocket servers
  • Manages all client connections (CLI, TUI, web, mobile, channel bots)
  • Routes messages to the correct agent session
  • Broadcasts agent events (streaming tokens, tool calls) to connected clients
  • Manages the lifecycle of channel integrations
  • Provides RPC interface for internal components

Key runtime state:

clients: Set<GatewayWsClient>          // All connected clients
chatRunState: Map<runId, ChatRun>      // Active agent runs
chatRunBuffers: Map<runId, Buffer>     // Token streaming buffers
nodeRegistry: Map<nodeId, Node>        // Remote execution nodes

2. WebSocket Protocol

Location: src/gateway/protocol/, src/gateway/server/ws-connection.ts

Three frame types over JSON:

  • Request: {type: "req", id: UUID, method: string, params: {...}}
  • Response: {type: "res", id: UUID, ok: boolean, payload: {...}}
  • Event: {type: "event", event: string, payload: {...}, seq: number}
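
The three frame shapes above can be sketched as a TypeScript discriminated union with a small parser. This is an illustrative sketch: the type names and `parseFrame` helper are assumptions, not the actual protocol code in src/gateway/protocol/.

```typescript
// Hypothetical sketch of the three wire-frame shapes as a discriminated
// union; field names follow the protocol description in the text.
type RequestFrame  = { type: "req";   id: string; method: string; params: unknown };
type ResponseFrame = { type: "res";   id: string; ok: boolean; payload: unknown };
type EventFrame    = { type: "event"; event: string; payload: unknown; seq: number };
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Narrow an incoming JSON message to a Frame, rejecting unknown types.
function parseFrame(raw: string): Frame {
  const msg = JSON.parse(raw);
  if (msg?.type === "req" || msg?.type === "res" || msg?.type === "event") {
    return msg as Frame;
  }
  throw new Error(`unknown frame type: ${msg?.type}`);
}

const f = parseFrame('{"type":"event","event":"chat","payload":{},"seq":1}');
```

The `type` discriminator is what lets a single WebSocket carry RPC traffic and event streams side by side: one switch on `type` dispatches each incoming message.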

Connection handshake:

Client connects (TCP upgrade)
  → Server sends: "connect.challenge" {nonce}
  → Client sends: "connect" RPC with auth + device identity
  → Server validates, sends: HelloOk {policy, role}
  → Connection ready

3. Authentication

Location: src/gateway/auth.ts

Five auth modes:

  • Token: SHA-256 validated shared secret
  • Password: Timing-safe comparison via safeEqualSecret()
  • Device token: RSA signature-based per-device identity
  • Tailscale: User identity via Tailscale whois headers
  • None: Open access (loopback only)
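
The password mode's timing-safe comparison can be sketched with Node's crypto primitives. This is a plausible shape for the `safeEqualSecret()` mentioned above, not the actual implementation in src/gateway/auth.ts:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Sketch of a timing-safe secret comparison. Hashing both inputs first
// yields equal-length buffers, so timingSafeEqual never throws on a length
// mismatch, and comparison time does not depend on where the strings differ.
function safeEqualSecret(candidate: string, expected: string): boolean {
  const a = createHash("sha256").update(candidate).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain `===` on secrets can short-circuit at the first differing character, which is exactly the side channel this construction avoids.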

Rate limiting: 10 failed attempts → 5-minute lockout per IP (auth only, not API).
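
The lockout policy is simple enough to sketch in a few lines. Constants match the numbers above; the function names and per-IP map are assumptions for illustration:

```typescript
// Minimal sketch of the per-IP auth lockout: 10 failed attempts trigger
// a 5-minute lockout. State is kept in-process, matching the gateway's
// single-process design.
const MAX_FAILURES = 10;
const LOCKOUT_MS = 5 * 60 * 1000;

type AttemptState = { failures: number; lockedUntil: number };
const attempts = new Map<string, AttemptState>();

function isLockedOut(ip: string, now = Date.now()): boolean {
  const s = attempts.get(ip);
  return !!s && now < s.lockedUntil;
}

function recordFailure(ip: string, now = Date.now()): void {
  const s = attempts.get(ip) ?? { failures: 0, lockedUntil: 0 };
  s.failures += 1;
  if (s.failures >= MAX_FAILURES) {
    s.lockedUntil = now + LOCKOUT_MS;
    s.failures = 0; // reset the counter once the lockout starts
  }
  attempts.set(ip, s);
}
```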

4. Routing

Location: src/routing/resolve-route.ts

Session key format: agent:{agentId}:{channel}:{accountId}:{peerKind}:{peerId}

Route resolution checks bindings in priority order:

  1. Exact peer match (user's configured agent for this DM)
  2. Parent peer match (for threads, inherit parent's agent)
  3. Guild+roles match (Discord server + roles)
  4. Guild match (Discord server only)
  5. Team match (Slack workspace)
  6. Account match (channel account)
  7. Channel match (channel type)
  8. Default agent
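
The tier walk above is first-match-wins. A sketch under assumed types (the real resolver in src/routing/resolve-route.ts is richer, and tiers 3 and 4 are collapsed here):

```typescript
// Sketch of priority-ordered binding resolution. All shapes are
// assumptions made for the example.
type Binding = { agentId: string };
type RouteQuery = {
  peerId?: string; parentPeerId?: string; guildId?: string;
  teamId?: string; accountId?: string; channel?: string;
};
type BindingTable = {
  peers: Map<string, Binding>;
  guilds: Map<string, Binding>;
  teams: Map<string, Binding>;
  accounts: Map<string, Binding>;
  channels: Map<string, Binding>;
  defaultAgent: string;
};

function resolveAgent(q: RouteQuery, t: BindingTable): string {
  const tiers: (Binding | undefined)[] = [
    q.peerId ? t.peers.get(q.peerId) : undefined,             // 1. exact peer
    q.parentPeerId ? t.peers.get(q.parentPeerId) : undefined, // 2. parent peer (threads)
    q.guildId ? t.guilds.get(q.guildId) : undefined,          // 3-4. guild (+roles)
    q.teamId ? t.teams.get(q.teamId) : undefined,             // 5. team
    q.accountId ? t.accounts.get(q.accountId) : undefined,    // 6. account
    q.channel ? t.channels.get(q.channel) : undefined,        // 7. channel
  ];
  return tiers.find(Boolean)?.agentId ?? t.defaultAgent;      // 8. default
}
```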

5. Plugin SDK

Location: src/plugin-sdk/

Plugins can register:

  • registerTool() -- custom agent tools
  • registerHook() -- lifecycle hooks (20+ hook points)
  • registerGatewayMethod() -- new RPC methods
  • registerHttpHandler() / registerHttpRoute() -- HTTP endpoints
  • registerChannel() -- new communication channels
  • registerCommand() -- chat commands (bypass LLM)
  • registerCli() -- CLI commands
  • registerService() -- background services
  • registerProvider() -- auth providers

Available hooks: before_tool_call, after_tool_call, message_received, message_sending, message_sent, before_agent_start, agent_end, before_compaction, subagent_spawning, subagent_ended, gateway_start, gateway_stop, llm_input, llm_output, and more.
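
To make the registration surface concrete, here is what a minimal plugin might look like. The `PluginApi` interface and `activate` entry point are invented for this example; consult src/plugin-sdk/ for the real types and signatures.

```typescript
// Illustrative-only plugin sketch. Only three of the register* functions
// from the list above are stubbed out here.
type HookName = "before_tool_call" | "after_tool_call" | "message_received";

interface PluginApi {
  registerTool(name: string, execute: (input: unknown) => Promise<unknown>): void;
  registerHook(hook: HookName, fn: (payload: unknown) => void): void;
  registerCommand(name: string, fn: (args: string[]) => string): void;
}

function activate(api: PluginApi): void {
  // A custom agent tool: echoes its input back.
  api.registerTool("echo", async (input) => input);
  // A lifecycle hook: observe every inbound message.
  api.registerHook("message_received", (payload) => {
    console.log("saw message:", payload);
  });
  // A chat command: handled directly, bypassing the LLM.
  api.registerCommand("ping", () => "pong");
}
```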

6. Subagent System

Location: src/agents/subagent-registry.ts (940 LOC), src/agents/subagent-spawn.ts (528 LOC)

Two spawn modes:

  • "run": One-shot ephemeral agent. Executes task, announces result, cleaned up.
  • "session": Persistent thread-bound agent. Stays active for follow-ups.

Key capabilities:

  • Depth limiting (maxSpawnDepth, maxChildrenPerAgent)
  • Steer: redirect running agent mid-task (abort + restart with new instructions)
  • Cascading kill: recursively terminate all descendants
  • Disk persistence: state survives gateway restarts
  • Announce flow: exponential backoff delivery (1s/2s/4s/8s, max 3 retries, 5-min expiry)
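
The announce retry schedule above reduces to two small functions. Names and the exact retry accounting are assumptions; the constants come from the bullet list:

```typescript
// Sketch of the announce backoff: delays double from 1s (1s/2s/4s/8s),
// and delivery is abandoned once the retry budget or the 5-minute
// expiry window is exhausted.
const BASE_DELAY_MS = 1_000;
const MAX_RETRIES = 3;
const EXPIRY_MS = 5 * 60 * 1000;

// Delay before retry attempt n (1-based): 1s, 2s, 4s, ...
function announceDelayMs(retry: number): number {
  return BASE_DELAY_MS * 2 ** (retry - 1);
}

function shouldRetry(retry: number, elapsedMs: number): boolean {
  return retry <= MAX_RETRIES && elapsedMs < EXPIRY_MS;
}
```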

Complete Message Flow: Telegram Example

Scenario: A Telegram user sends "Help me write a Python script" to the OpenClaw bot.

Step 1: Telegram Webhook Receives Message

File: src/telegram/webhook.ts -- startTelegramWebhook()

The gateway starts a dedicated HTTP server for Telegram webhooks. When Telegram sends an update:

POST /telegram-webhook
→ grammy's webhookCallback() parses the update
→ Bot middleware chain executes
→ Sequential processing per chat (getTelegramSequentialKey)

Request body limits: 1MB max, 30s timeout.

Step 2: Message Parsing

File: src/telegram/bot.ts -- createTelegramBot() (line 115)

The grammy Bot instance extracts:

chatId = msg.chat.id                    // e.g., 123456789
text = msg.text ?? msg.caption          // "Help me write a Python script"
senderId = msg.from?.id                 // e.g., 123456789
senderUsername = msg.from?.username      // e.g., "js0n"
isGroup = msg.chat.type === "group"     // false (DM)
threadId = msg.message_thread_id        // undefined (not a forum topic)

Messages are serialized per chat via sequentialize(getTelegramSequentialKey) -- one message processed at a time per chat to prevent race conditions.
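
The per-chat serialization can be sketched as promise chaining keyed by chat (this is the general pattern behind grammy's sequentialize; the simplification of the key to a plain string is an assumption):

```typescript
// Sketch of per-chat serialization: each task for the same key is chained
// onto the previous one, so at most one handler runs per chat at a time,
// while different chats proceed in parallel.
const chains = new Map<string, Promise<void>>();

function runSequentially(key: string, task: () => Promise<void>): Promise<void> {
  const prev = chains.get(key) ?? Promise.resolve();
  const next = prev.then(task, task); // run even if the previous task failed
  chains.set(key, next);
  return next;
}
```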

Step 3: Access Control

File: src/telegram/bot-message-context.ts -- buildTelegramMessageContext()

For DMs, checks DM policy:

if (dmPolicy === "disabled") → reject
if (dmPolicy === "open") → allow
if (dmPolicy === "allowlist") → check senderId against allowFrom list
if (dmPolicy === "pairing") → check if paired, else send pairing code

File: src/telegram/bot-access.ts -- resolveSenderAllowMatch()

  • Checks allow.hasWildcard (if "*" in config, allow everyone)
  • Checks if senderId in allow.entries set (normalized phone/ID)
  • Returns {allowed: boolean, matchKey, matchSource}
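
Those three bullets translate into a small pure function. The shapes below follow the bullet list, but the normalization step and exact semantics are assumptions:

```typescript
// Sketch of the allowlist check: wildcard first, then a lookup of the
// normalized sender ID against the configured entries.
type AllowConfig = { hasWildcard: boolean; entries: Set<string> };
type AllowMatch = { allowed: boolean; matchKey?: string; matchSource?: "wildcard" | "entry" };

function resolveSenderAllowMatch(senderId: string, allow: AllowConfig): AllowMatch {
  if (allow.hasWildcard) {
    return { allowed: true, matchKey: "*", matchSource: "wildcard" };
  }
  const key = senderId.trim().toLowerCase(); // stand-in for phone/ID normalization
  if (allow.entries.has(key)) {
    return { allowed: true, matchKey: key, matchSource: "entry" };
  }
  return { allowed: false };
}
```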

If pairing required and user not yet paired:

const { code, created } = await upsertChannelPairingRequest({
  channel: "telegram",
  accountId,
  senderId: candidate,
  senderDisplayName
})
// Sends pairing code back to user: "Reply with code: ABCD1234"

Step 4: Route Resolution

File: src/routing/resolve-route.ts -- resolveAgentRoute()


Input:

{
  cfg: loadConfig(),
  channel: "telegram",
  accountId: "default",          // Telegram bot account
  peer: { kind: "direct", id: "123456789" }
}

Walks binding tiers:

  1. Check peer bindings → no match
  2. Check account bindings → no match
  3. Check channel bindings → no match
  4. Fall back to default agent → "main"

File: src/routing/session-key.ts -- buildAgentSessionKey()

Generates session key:

agent:main:telegram:direct:123456789

DM scope (default "main") determines isolation level:

  • "main" → all DMs share one session (default, insecure for multi-user)
  • "per-peer" → separate per user
  • "per-channel-peer" → separate per channel+user
  • "per-account-channel-peer" → full isolation
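
The four isolation levels can be sketched as a key builder. The key shapes below are assumptions extrapolated from the examples in this document, not the actual output of buildAgentSessionKey():

```typescript
// Sketch of how the DM scope widens or narrows the session key. The
// "main" scope collapses every DM into one shared session, which is why
// the default is flagged as insecure for multi-user deployments.
type DmScope = "main" | "per-peer" | "per-channel-peer" | "per-account-channel-peer";

function buildDmSessionKey(
  scope: DmScope, agentId: string, channel: string, accountId: string, peerId: string,
): string {
  switch (scope) {
    case "main":
      return `agent:${agentId}:main`;
    case "per-peer":
      return `agent:${agentId}:direct:${peerId}`;
    case "per-channel-peer":
      return `agent:${agentId}:${channel}:direct:${peerId}`;
    case "per-account-channel-peer":
      return `agent:${agentId}:${channel}:${accountId}:direct:${peerId}`;
  }
}
```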

Step 5: Session Loading

File: src/auto-reply/reply/get-reply.ts -- getReplyFromConfig()

Calls initSessionState() which:

File: src/config/sessions.ts

Loads session from disk:

~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json

Session structure:

{
  "channel": "telegram",
  "origin": { "provider": "telegram", "from": "123456789" },
  "modelOverride": null,
  "chatType": "direct",
  "sessionId": "<uuid>",
  "createdAt": "2026-02-21T...",
  "lastActivity": "2026-02-21T...",
  "messages": [
    {"role": "user", "content": [{"type": "text", "text": "previous message"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "previous response"}]}
  ]
}

If session doesn't exist, creates a new one with empty message history.
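
The load-or-create behavior is a small function. Field names mirror the session structure shown above; the function itself is a sketch, not the real src/config/sessions.ts code:

```typescript
import { existsSync, readFileSync } from "node:fs";
import { randomUUID } from "node:crypto";

// Sketch of load-or-create for the on-disk session shape. A missing file
// yields a fresh session with an empty message history.
type StoredMessage = { role: "user" | "assistant"; content: { type: "text"; text: string }[] };
type Session = {
  channel: string;
  sessionId: string;
  createdAt: string;
  lastActivity: string;
  messages: StoredMessage[];
};

function loadOrCreateSession(path: string, channel: string): Session {
  if (existsSync(path)) {
    return JSON.parse(readFileSync(path, "utf8")) as Session;
  }
  const now = new Date().toISOString();
  return { channel, sessionId: randomUUID(), createdAt: now, lastActivity: now, messages: [] };
}
```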

Step 6: System Prompt Construction

File: src/auto-reply/reply/get-reply-run.ts -- runPreparedReply()

Builds the extra system prompt:

const extraSystemPrompt = [
  inboundMetaPrompt,         // "Message received from Telegram DM, user: @js0n"
  groupChatContext,           // null (DM, not group)
  groupIntro,                // null
  groupSystemPrompt          // null
].filter(Boolean).join("\n\n");

The agent also has a base system prompt from its agent config (~/.openclaw/agents/main/).

Step 7: Agent Dispatch & LLM Invocation

File: src/agents/pi-embedded-runner/run/attempt.ts -- runEmbeddedAttempt()

Creates agent session:

const { session: activeSession } = await createAgentSession({
  workspace: resolvedWorkspace,
  model: params.model,           // e.g., "claude-sonnet-4-5-20250929"
  tools: builtInTools,           // bash, file ops, web, etc.
  customTools: allCustomTools,   // plugin-registered tools
  sessionManager,
  resourceLoader
});

Applies system prompt, then invokes:

await activeSession.prompt(effectivePrompt)
// effectivePrompt = "Help me write a Python script"

This calls the LLM via the configured provider (Anthropic, OpenAI, etc.) through @mariozechner/pi-ai's streaming interface. The provider selection comes from:

  • Agent config → session override → environment → config file

Step 8: Tool Execution (If Agent Uses Tools)

File: src/agents/pi-tools.ts -- createOpenClawCodingTools()

If the agent decides to write a Python file, it invokes the bash or file.write tool:

Agent: "I'll create the script for you."
Tool call: file.write { path: "/workspace/script.py", content: "#!/usr/bin/env python3\n..." }

Tool execution flow:

  1. Agent outputs tool call in response stream
  2. Pi agent core intercepts, finds matching tool
  3. Tool's execute() function runs (with security policy checks)
  4. Tool result added to conversation history
  5. Control returns to LLM with tool result
  6. LLM generates next response (may call more tools or produce final text)
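
Steps 1-6 form the classic agent loop. A sketch under assumed types (`ModelTurn` and the callback signatures are invented; the real loop lives in the Pi agent core):

```typescript
// Sketch of the tool-use loop: the model either produces final text or a
// tool call; tool results are appended to history and the model is
// invoked again, up to a hard iteration cap.
type ModelTurn =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; tool: string; input: unknown };

async function agentLoop(
  callModel: (history: string[]) => Promise<ModelTurn>,
  runTool: (tool: string, input: unknown) => Promise<string>,
  history: string[],
): Promise<string> {
  for (let step = 0; step < 16; step++) {                // cap to avoid infinite loops
    const turn = await callModel(history);
    if (turn.kind === "text") return turn.text;          // final response
    const result = await runTool(turn.tool, turn.input); // security checks live here
    history.push(`tool:${turn.tool} -> ${result}`);      // feed result back to the LLM
  }
  throw new Error("too many tool iterations");
}
```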

Exec approval (for bash tools):

Agent wants to run: python3 script.py
→ Check exec-approvals.json allowlist
→ If not allowed: send approval request to user (Telegram inline keyboard)
→ User taps "Allow" → command executes
→ Result returned to agent

Step 9: Response Routing Back to Telegram

File: src/telegram/bot-message-dispatch.ts -- dispatchTelegramMessage()

The agent's response streams back through the dispatch chain:

const { queuedFinal } = await dispatchReplyWithBufferedBlockDispatcher({
  ctx: context,
  dispatcherOptions: {
    sendTyping,              // Shows "typing..." in Telegram
    replyDispatcher,
    channel: "telegram",
  },
  replyOptions: {
    onBlockReply: (payload) => {
      // Each block of response text sent to Telegram
    },
  }
});

File: src/telegram/send.ts -- sendMessageTelegram()

Formats response for Telegram (Markdown → HTML conversion) and sends:

const response = await api.sendMessage(
  chatId,                    // 123456789
  formattedText,             // HTML-formatted response
  {
    parse_mode: "HTML",
    reply_to_message_id: originalMessageId,
    message_thread_id: threadSpec.id,
    reply_markup: buttons     // Inline keyboards if applicable
  }
);

Long responses are automatically split into multiple messages (Telegram's 4096 char limit). Images, code blocks, and files are handled with appropriate Telegram API methods (sendPhoto, sendDocument, etc.).
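
A splitter for the 4096-character limit might look like the sketch below, preferring to break on a newline near the boundary so paragraphs and code blocks are less likely to be cut mid-line. This is a simplification of whatever src/telegram/send.ts actually does:

```typescript
// Sketch of message splitting at Telegram's 4096-char limit, breaking on
// a newline when a reasonable one exists inside the window.
const TELEGRAM_LIMIT = 4096;

function splitMessage(text: string, limit = TELEGRAM_LIMIT): string[] {
  const parts: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const window = rest.slice(0, limit);
    const nl = window.lastIndexOf("\n");
    const cut = nl > limit / 2 ? nl : limit; // only use the newline if it's not too early
    parts.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, "");
  }
  parts.push(rest);
  return parts;
}
```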

Step 10: Session Persistence

File: src/config/sessions.ts

After the agent run completes, the session is saved:

~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json

Updated with:

  • New user message appended to messages[]
  • New assistant response appended to messages[]
  • lastActivity timestamp updated
  • Token usage metrics updated

Timing: Disk write happens after the response is sent to Telegram. If the gateway crashes between send and persist, the message is lost from session history (but the user already saw it in Telegram).


Complete Flow Diagram

Telegram User: "Help me write a Python script"
       │
       ▼
[1] Telegram API → POST /telegram-webhook
       │           src/telegram/webhook.ts
       ▼
[2] grammy Bot.handleUpdate()
       │  Extracts: chatId, text, senderId, isGroup
       │  src/telegram/bot.ts
       ▼
[3] buildTelegramMessageContext()
       │  Checks: dmPolicy, allowFrom, pairing status
       │  src/telegram/bot-message-context.ts
       │  src/telegram/bot-access.ts
       ▼
[4] resolveAgentRoute()
       │  Matches: channel → account → default agent
       │  Generates: sessionKey = "agent:main:telegram:direct:123456789"
       │  src/routing/resolve-route.ts
       ▼
[5] loadSessionStore()
       │  Reads: ~/.openclaw/sessions/main/...json
       │  Loads: previous messages, model overrides
       │  src/config/sessions.ts
       ▼
[6] getReplyFromConfig() → runPreparedReply()
       │  Builds: system prompt + user message
       │  src/auto-reply/reply/get-reply.ts
       │  src/auto-reply/reply/get-reply-run.ts
       ▼
[7] runEmbeddedAttempt()
       │  Creates agent session with tools
       │  Calls: activeSession.prompt(message)
       │  → LLM API call (Anthropic/OpenAI/etc.)
       │  src/agents/pi-embedded-runner/run/attempt.ts
       ▼
[8] Tool execution (if needed)
       │  bash, file.write, web_fetch, etc.
       │  With exec approval gates
       │  src/agents/pi-tools.ts
       ▼
[9] dispatchTelegramMessage()
       │  Formats: Markdown → HTML
       │  Splits: long messages (4096 char limit)
       │  Sends: api.sendMessage(chatId, text, {parse_mode: "HTML"})
       │  src/telegram/bot-message-dispatch.ts
       │  src/telegram/send.ts
       ▼
[10] updateSessionStore()
        Persists: messages, usage, metadata
        Path: ~/.openclaw/sessions/main/...json
        src/config/sessions.ts
       ▼
Telegram User sees response in chat

Gateway Broadcast: How Other Clients See This

While the Telegram message is being processed, other connected clients (CLI, TUI, Web) see the same activity in real time:

Agent starts processing
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: "Here's"}]}}
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " a"}]}}
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " Python"}]}}
  ...
  → Gateway emits: {event: "chat", state: "final", message: {content: [{text: "Here's a Python script..."}]}}

All clients connected to the same session key see streaming tokens via WebSocket events. Broadcast is scope-filtered (only clients with appropriate permissions see the events) and backpressure-managed (slow clients may be dropped).
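
The scope-filtered fan-out reduces to a membership check per client. The shapes below are assumptions; the real broadcast path also handles backpressure, which this sketch omits:

```typescript
// Sketch of scope-filtered broadcast: each event frame is delivered only
// to clients whose subscription set covers the session key.
type Client = {
  id: string;
  sessionKeys: Set<string>;
  send: (frame: string) => void;
};

function broadcast(clients: Set<Client>, sessionKey: string, event: object): void {
  const frame = JSON.stringify({ type: "event", event: "chat", payload: event, sessionKey });
  for (const c of clients) {
    if (c.sessionKeys.has(sessionKey)) c.send(frame);
  }
}
```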


Key Architectural Characteristics

Strengths

  • Single gateway = single coordination point: All routing, auth, and session management centralized
  • Channel-agnostic core: Agent doesn't know or care if the message came from Telegram, Discord, or CLI
  • Streaming-first: Tokens stream to all connected clients as they arrive from the LLM
  • Plugin extensibility: Most behavior can be modified via hooks without touching core
  • Sequential processing per chat: grammy's sequentialize() prevents race conditions

Weaknesses

  • Single process: All agents run in one Node.js process; crash = everything down
  • File-based sessions: JSONL/JSON5 on disk; no transactional guarantees, potential race conditions
  • Default session isolation is "main": Multi-user deployments leak data unless explicitly configured
  • No persistence ACK: Client sees message before disk write; crash = message lost from history
  • Concurrent send interleaving: Multiple sends to same session run in parallel, not queued