OpenClaw Architecture Deep Dive


TL;DR: OpenClaw's architecture is a masterclass in centralized routing—one gateway server connects 14+ messaging platforms to AI agents through a WebSocket protocol. Follow a real Telegram message through the entire stack: from webhook to route resolution to LLM invocation to streamed response. The single-process design is elegant but fragile—one crash takes down everything, and file-based sessions create race conditions you'll wish you'd known about before deploying.

Overview

OpenClaw is a multi-client, multi-channel AI agent platform built in TypeScript/Node.js. At its core is a gateway server that routes messages between clients (CLI, TUI, web, mobile, messaging channels) and agent sessions. This document traces the architecture through a concrete example: a Telegram message flowing through the entire system.


High-Level Architecture

                    ┌─────────────┐
                    │  Telegram   │
                    │  WhatsApp   │
                    │  Discord    │   Channel Integrations
                    │  Slack      │   (14+ platforms)
                    │  Signal     │
                    │  iMessage   │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │   Gateway   │   WebSocket + HTTP Server
                    │   Server    │   Central routing hub
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
       ┌──────▼───┐  ┌────▼────┐  ┌───▼──────┐
       │  Route   │  │ Session │  │  Plugin  │
       │ Resolver │  │  Store  │  │  System  │
       └──────┬───┘  └────┬────┘  └───┬──────┘
              │            │            │
              └────────────┼────────────┘
                           │
                    ┌──────▼──────┐
                    │   Agent     │   Pi Agent Core
                    │   Runner    │   LLM + Tools
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
       ┌──────▼───┐  ┌────▼────┐  ┌───▼──────┐
       │   LLM    │  │  Tool   │  │ Subagent │
       │ Provider │  │ System  │  │ Registry │
       └──────────┘  └─────────┘  └──────────┘

Component Breakdown

1. Gateway Server

Location: src/gateway/server.impl.ts (~178 files in gateway directory)

The gateway is the central hub. It:

  • Runs HTTP/HTTPS + WebSocket servers
  • Manages all client connections (CLI, TUI, web, mobile, channel bots)
  • Routes messages to the correct agent session
  • Broadcasts agent events (streaming tokens, tool calls) to connected clients
  • Manages the lifecycle of channel integrations
  • Provides RPC interface for internal components

Key runtime state:

clients: Set<GatewayWsClient>          // All connected clients
chatRunState: Map<runId, ChatRun>      // Active agent runs
chatRunBuffers: Map<runId, Buffer>     // Token streaming buffers
nodeRegistry: Map<nodeId, Node>        // Remote execution nodes

2. WebSocket Protocol

Location: src/gateway/protocol/, src/gateway/server/ws-connection.ts

Three frame types over JSON:

  • Request: {type: "req", id: UUID, method: string, params: {...}}
  • Response: {type: "res", id: UUID, ok: boolean, payload: {...}}
  • Event: {type: "event", event: string, payload: {...}, seq: number}
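
The three frame shapes above can be sketched as a TypeScript discriminated union with a small parser. This is an illustrative sketch: the type names and `parseFrame` helper are assumptions, not the actual protocol code in src/gateway/protocol/.

```typescript
// Hypothetical sketch of the three wire-frame shapes as a discriminated
// union; field names follow the protocol description in the text.
type RequestFrame  = { type: "req";   id: string; method: string; params: unknown };
type ResponseFrame = { type: "res";   id: string; ok: boolean; payload: unknown };
type EventFrame    = { type: "event"; event: string; payload: unknown; seq: number };
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Narrow an incoming JSON message to a Frame, rejecting unknown types.
function parseFrame(raw: string): Frame {
  const msg = JSON.parse(raw);
  if (msg?.type === "req" || msg?.type === "res" || msg?.type === "event") {
    return msg as Frame;
  }
  throw new Error(`unknown frame type: ${msg?.type}`);
}

const f = parseFrame('{"type":"event","event":"chat","payload":{},"seq":1}');
```

The `type` discriminator is what lets a single WebSocket carry RPC traffic and event streams side by side: one switch on `type` dispatches each incoming message.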

Connection handshake:

Client connects (TCP upgrade)
  → Server sends: "connect.challenge" {nonce}
  → Client sends: "connect" RPC with auth + device identity
  → Server validates, sends: HelloOk {policy, role}
  → Connection ready

3. Authentication

Location: src/gateway/auth.ts

Five auth modes:

  • Token: SHA-256 validated shared secret
  • Password: Timing-safe comparison via safeEqualSecret()
  • Device token: RSA signature-based per-device identity
  • Tailscale: User identity via Tailscale whois headers
  • None: Open access (loopback only)
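
The password mode's timing-safe comparison can be sketched with Node's crypto primitives. This is a plausible shape for the `safeEqualSecret()` mentioned above, not the actual implementation in src/gateway/auth.ts:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Sketch of a timing-safe secret comparison. Hashing both inputs first
// yields equal-length buffers, so timingSafeEqual never throws on a length
// mismatch, and comparison time does not depend on where the strings differ.
function safeEqualSecret(candidate: string, expected: string): boolean {
  const a = createHash("sha256").update(candidate).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain `===` on secrets can short-circuit at the first differing character, which is exactly the side channel this construction avoids.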

Rate limiting: 10 failed attempts → 5-minute lockout per IP (auth only, not API).
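
The lockout policy is simple enough to sketch in a few lines. Constants match the numbers above; the function names and per-IP map are assumptions for illustration:

```typescript
// Minimal sketch of the per-IP auth lockout: 10 failed attempts trigger
// a 5-minute lockout. State is kept in-process, matching the gateway's
// single-process design.
const MAX_FAILURES = 10;
const LOCKOUT_MS = 5 * 60 * 1000;

type AttemptState = { failures: number; lockedUntil: number };
const attempts = new Map<string, AttemptState>();

function isLockedOut(ip: string, now = Date.now()): boolean {
  const s = attempts.get(ip);
  return !!s && now < s.lockedUntil;
}

function recordFailure(ip: string, now = Date.now()): void {
  const s = attempts.get(ip) ?? { failures: 0, lockedUntil: 0 };
  s.failures += 1;
  if (s.failures >= MAX_FAILURES) {
    s.lockedUntil = now + LOCKOUT_MS;
    s.failures = 0; // reset the counter once the lockout starts
  }
  attempts.set(ip, s);
}
```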

4. Routing

Location: src/routing/resolve-route.ts

Session key format: agent:{agentId}:{channel}:{accountId}:{peerKind}:{peerId}

Route resolution checks bindings in priority order:

  1. Exact peer match (user's configured agent for this DM)
  2. Parent peer match (for threads, inherit parent's agent)
  3. Guild+roles match (Discord server + roles)
  4. Guild match (Discord server only)
  5. Team match (Slack workspace)
  6. Account match (channel account)
  7. Channel match (channel type)
  8. Default agent
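
The tier walk above is first-match-wins. A sketch under assumed types (the real resolver in src/routing/resolve-route.ts is richer, and tiers 3 and 4 are collapsed here):

```typescript
// Sketch of priority-ordered binding resolution. All shapes are
// assumptions made for the example.
type Binding = { agentId: string };
type RouteQuery = {
  peerId?: string; parentPeerId?: string; guildId?: string;
  teamId?: string; accountId?: string; channel?: string;
};
type BindingTable = {
  peers: Map<string, Binding>;
  guilds: Map<string, Binding>;
  teams: Map<string, Binding>;
  accounts: Map<string, Binding>;
  channels: Map<string, Binding>;
  defaultAgent: string;
};

function resolveAgent(q: RouteQuery, t: BindingTable): string {
  const tiers: (Binding | undefined)[] = [
    q.peerId ? t.peers.get(q.peerId) : undefined,             // 1. exact peer
    q.parentPeerId ? t.peers.get(q.parentPeerId) : undefined, // 2. parent peer (threads)
    q.guildId ? t.guilds.get(q.guildId) : undefined,          // 3-4. guild (+roles)
    q.teamId ? t.teams.get(q.teamId) : undefined,             // 5. team
    q.accountId ? t.accounts.get(q.accountId) : undefined,    // 6. account
    q.channel ? t.channels.get(q.channel) : undefined,        // 7. channel
  ];
  return tiers.find(Boolean)?.agentId ?? t.defaultAgent;      // 8. default
}
```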

5. Plugin SDK

Location: src/plugin-sdk/

Plugins can register:

  • registerTool() -- custom agent tools
  • registerHook() -- lifecycle hooks (20+ hook points)
  • registerGatewayMethod() -- new RPC methods
  • registerHttpHandler() / registerHttpRoute() -- HTTP endpoints
  • registerChannel() -- new communication channels
  • registerCommand() -- chat commands (bypass LLM)
  • registerCli() -- CLI commands
  • registerService() -- background services
  • registerProvider() -- auth providers

Available hooks: before_tool_call, after_tool_call, message_received, message_sending, message_sent, before_agent_start, agent_end, before_compaction, subagent_spawning, subagent_ended, gateway_start, gateway_stop, llm_input, llm_output, and more.
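
To make the registration surface concrete, here is what a minimal plugin might look like. The `PluginApi` interface and `activate` entry point are invented for this example; consult src/plugin-sdk/ for the real types and signatures.

```typescript
// Illustrative-only plugin sketch. Only three of the register* functions
// from the list above are stubbed out here.
type HookName = "before_tool_call" | "after_tool_call" | "message_received";

interface PluginApi {
  registerTool(name: string, execute: (input: unknown) => Promise<unknown>): void;
  registerHook(hook: HookName, fn: (payload: unknown) => void): void;
  registerCommand(name: string, fn: (args: string[]) => string): void;
}

function activate(api: PluginApi): void {
  // A custom agent tool: echoes its input back.
  api.registerTool("echo", async (input) => input);
  // A lifecycle hook: observe every inbound message.
  api.registerHook("message_received", (payload) => {
    console.log("saw message:", payload);
  });
  // A chat command: handled directly, bypassing the LLM.
  api.registerCommand("ping", () => "pong");
}
```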

6. Subagent System

Location: src/agents/subagent-registry.ts (940 LOC), src/agents/subagent-spawn.ts (528 LOC)

Two spawn modes:

  • "run": One-shot ephemeral agent. Executes task, announces result, cleaned up.
  • "session": Persistent thread-bound agent. Stays active for follow-ups.

Key capabilities:

  • Depth limiting (maxSpawnDepth, maxChildrenPerAgent)
  • Steer: redirect running agent mid-task (abort + restart with new instructions)
  • Cascading kill: recursively terminate all descendants
  • Disk persistence: state survives gateway restarts
  • Announce flow: exponential backoff delivery (1s/2s/4s/8s, max 3 retries, 5-min expiry)
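
The announce retry schedule above reduces to two small functions. Names and the exact retry accounting are assumptions; the constants come from the bullet list:

```typescript
// Sketch of the announce backoff: delays double from 1s (1s/2s/4s/8s),
// and delivery is abandoned once the retry budget or the 5-minute
// expiry window is exhausted.
const BASE_DELAY_MS = 1_000;
const MAX_RETRIES = 3;
const EXPIRY_MS = 5 * 60 * 1000;

// Delay before retry attempt n (1-based): 1s, 2s, 4s, ...
function announceDelayMs(retry: number): number {
  return BASE_DELAY_MS * 2 ** (retry - 1);
}

function shouldRetry(retry: number, elapsedMs: number): boolean {
  return retry <= MAX_RETRIES && elapsedMs < EXPIRY_MS;
}
```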

Complete Message Flow: Telegram Example

Scenario: A Telegram user sends "Help me write a Python script" to the OpenClaw bot.

Step 1: Telegram Webhook Receives Message

File: src/telegram/webhook.ts -- startTelegramWebhook()

The gateway starts a dedicated HTTP server for Telegram webhooks. When Telegram sends an update:

POST /telegram-webhook
→ grammy's webhookCallback() parses the update
→ Bot middleware chain executes
→ Sequential processing per chat (getTelegramSequentialKey)

Request body limits: 1MB max, 30s timeout.

Step 2: Message Parsing

File: src/telegram/bot.ts -- createTelegramBot() (line 115)

The grammy Bot instance extracts:

chatId = msg.chat.id                    // e.g., 123456789
text = msg.text ?? msg.caption          // "Help me write a Python script"
senderId = msg.from?.id                 // e.g., 123456789
senderUsername = msg.from?.username      // e.g., "js0n"
isGroup = msg.chat.type === "group"     // false (DM)
threadId = msg.message_thread_id        // undefined (not a forum topic)

Messages are serialized per chat via sequentialize(getTelegramSequentialKey) -- one message processed at a time per chat to prevent race conditions.
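
The per-chat serialization can be sketched as promise chaining keyed by chat (this is the general pattern behind grammy's sequentialize; the simplification of the key to a plain string is an assumption):

```typescript
// Sketch of per-chat serialization: each task for the same key is chained
// onto the previous one, so at most one handler runs per chat at a time,
// while different chats proceed in parallel.
const chains = new Map<string, Promise<void>>();

function runSequentially(key: string, task: () => Promise<void>): Promise<void> {
  const prev = chains.get(key) ?? Promise.resolve();
  const next = prev.then(task, task); // run even if the previous task failed
  chains.set(key, next);
  return next;
}
```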

Step 3: Access Control

File: src/telegram/bot-message-context.ts -- buildTelegramMessageContext()

For DMs, checks DM policy:

if (dmPolicy === "disabled") → reject
if (dmPolicy === "open") → allow
if (dmPolicy === "allowlist") → check senderId against allowFrom list
if (dmPolicy === "pairing") → check if paired, else send pairing code

File: src/telegram/bot-access.ts -- resolveSenderAllowMatch()

  • Checks allow.hasWildcard (if "*" in config, allow everyone)
  • Checks if senderId in allow.entries set (normalized phone/ID)
  • Returns {allowed: boolean, matchKey, matchSource}
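
Those three bullets translate into a small pure function. The shapes below follow the bullet list, but the normalization step and exact semantics are assumptions:

```typescript
// Sketch of the allowlist check: wildcard first, then a lookup of the
// normalized sender ID against the configured entries.
type AllowConfig = { hasWildcard: boolean; entries: Set<string> };
type AllowMatch = { allowed: boolean; matchKey?: string; matchSource?: "wildcard" | "entry" };

function resolveSenderAllowMatch(senderId: string, allow: AllowConfig): AllowMatch {
  if (allow.hasWildcard) {
    return { allowed: true, matchKey: "*", matchSource: "wildcard" };
  }
  const key = senderId.trim().toLowerCase(); // stand-in for phone/ID normalization
  if (allow.entries.has(key)) {
    return { allowed: true, matchKey: key, matchSource: "entry" };
  }
  return { allowed: false };
}
```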

If pairing required and user not yet paired:

const { code, created } = await upsertChannelPairingRequest({
  channel: "telegram",
  accountId,
  senderId: candidate,
  senderDisplayName
})
// Sends pairing code back to user: "Reply with code: ABCD1234"

Step 4: Route Resolution

File: src/routing/resolve-route.ts -- resolveAgentRoute()


Input:

{
  cfg: loadConfig(),
  channel: "telegram",
  accountId: "default",          // Telegram bot account
  peer: { kind: "direct", id: "123456789" }
}

Walks binding tiers:

  1. Check peer bindings → no match
  2. Check account bindings → no match
  3. Check channel bindings → no match
  4. Fall back to default agent → "main"

File: src/routing/session-key.ts -- buildAgentSessionKey()

Generates session key:

agent:main:telegram:direct:123456789

DM scope (default "main") determines isolation level:

  • "main" → all DMs share one session (default, insecure for multi-user)
  • "per-peer" → separate per user
  • "per-channel-peer" → separate per channel+user
  • "per-account-channel-peer" → full isolation
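
The four isolation levels can be sketched as a key builder. The key shapes below are assumptions extrapolated from the examples in this document, not the actual output of buildAgentSessionKey():

```typescript
// Sketch of how the DM scope widens or narrows the session key. The
// "main" scope collapses every DM into one shared session, which is why
// the default is flagged as insecure for multi-user deployments.
type DmScope = "main" | "per-peer" | "per-channel-peer" | "per-account-channel-peer";

function buildDmSessionKey(
  scope: DmScope, agentId: string, channel: string, accountId: string, peerId: string,
): string {
  switch (scope) {
    case "main":
      return `agent:${agentId}:main`;
    case "per-peer":
      return `agent:${agentId}:direct:${peerId}`;
    case "per-channel-peer":
      return `agent:${agentId}:${channel}:direct:${peerId}`;
    case "per-account-channel-peer":
      return `agent:${agentId}:${channel}:${accountId}:direct:${peerId}`;
  }
}
```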

Step 5: Session Loading

File: src/auto-reply/reply/get-reply.ts -- getReplyFromConfig()

Calls initSessionState() which:

File: src/config/sessions.ts

Loads session from disk:

~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json

Session structure:

{
  "channel": "telegram",
  "origin": { "provider": "telegram", "from": "123456789" },
  "modelOverride": null,
  "chatType": "direct",
  "sessionId": "<uuid>",
  "createdAt": "2026-02-21T...",
  "lastActivity": "2026-02-21T...",
  "messages": [
    {"role": "user", "content": [{"type": "text", "text": "previous message"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "previous response"}]}
  ]
}

If session doesn't exist, creates a new one with empty message history.
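
The load-or-create behavior is a small function. Field names mirror the session structure shown above; the function itself is a sketch, not the real src/config/sessions.ts code:

```typescript
import { existsSync, readFileSync } from "node:fs";
import { randomUUID } from "node:crypto";

// Sketch of load-or-create for the on-disk session shape. A missing file
// yields a fresh session with an empty message history.
type StoredMessage = { role: "user" | "assistant"; content: { type: "text"; text: string }[] };
type Session = {
  channel: string;
  sessionId: string;
  createdAt: string;
  lastActivity: string;
  messages: StoredMessage[];
};

function loadOrCreateSession(path: string, channel: string): Session {
  if (existsSync(path)) {
    return JSON.parse(readFileSync(path, "utf8")) as Session;
  }
  const now = new Date().toISOString();
  return { channel, sessionId: randomUUID(), createdAt: now, lastActivity: now, messages: [] };
}
```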

Step 6: System Prompt Construction

File: src/auto-reply/reply/get-reply-run.ts -- runPreparedReply()

Builds the extra system prompt:

const extraSystemPrompt = [
  inboundMetaPrompt,         // "Message received from Telegram DM, user: @js0n"
  groupChatContext,           // null (DM, not group)
  groupIntro,                // null
  groupSystemPrompt          // null
].filter(Boolean).join("\n\n");

The agent also has a base system prompt from its agent config (~/.openclaw/agents/main/).

Step 7: Agent Dispatch & LLM Invocation

File: src/agents/pi-embedded-runner/run/attempt.ts -- runEmbeddedAttempt()

Creates agent session:

const { session: activeSession } = await createAgentSession({
  workspace: resolvedWorkspace,
  model: params.model,           // e.g., "claude-sonnet-4-5-20250929"
  tools: builtInTools,           // bash, file ops, web, etc.
  customTools: allCustomTools,   // plugin-registered tools
  sessionManager,
  resourceLoader
});

Applies system prompt, then invokes:

await activeSession.prompt(effectivePrompt)
// effectivePrompt = "Help me write a Python script"

This calls the LLM via the configured provider (Anthropic, OpenAI, etc.) through @mariozechner/pi-ai's streaming interface. The provider selection comes from:

  • Agent config → session override → environment → config file

Step 8: Tool Execution (If Agent Uses Tools)

File: src/agents/pi-tools.ts -- createOpenClawCodingTools()

If the agent decides to write a Python file, it invokes the bash or file.write tool:

Agent: "I'll create the script for you."
Tool call: file.write { path: "/workspace/script.py", content: "#!/usr/bin/env python3\n..." }

Tool execution flow:

  1. Agent outputs tool call in response stream
  2. Pi agent core intercepts, finds matching tool
  3. Tool's execute() function runs (with security policy checks)
  4. Tool result added to conversation history
  5. Control returns to LLM with tool result
  6. LLM generates next response (may call more tools or produce final text)
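
Steps 1-6 form the classic agent loop. A sketch under assumed types (`ModelTurn` and the callback signatures are invented; the real loop lives in the Pi agent core):

```typescript
// Sketch of the tool-use loop: the model either produces final text or a
// tool call; tool results are appended to history and the model is
// invoked again, up to a hard iteration cap.
type ModelTurn =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; tool: string; input: unknown };

async function agentLoop(
  callModel: (history: string[]) => Promise<ModelTurn>,
  runTool: (tool: string, input: unknown) => Promise<string>,
  history: string[],
): Promise<string> {
  for (let step = 0; step < 16; step++) {                // cap to avoid infinite loops
    const turn = await callModel(history);
    if (turn.kind === "text") return turn.text;          // final response
    const result = await runTool(turn.tool, turn.input); // security checks live here
    history.push(`tool:${turn.tool} -> ${result}`);      // feed result back to the LLM
  }
  throw new Error("too many tool iterations");
}
```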

Exec approval (for bash tools):

Agent wants to run: python3 script.py
→ Check exec-approvals.json allowlist
→ If not allowed: send approval request to user (Telegram inline keyboard)
→ User taps "Allow" → command executes
→ Result returned to agent

Step 9: Response Routing Back to Telegram

File: src/telegram/bot-message-dispatch.ts -- dispatchTelegramMessage()

The agent's response streams back through the dispatch chain:

const { queuedFinal } = await dispatchReplyWithBufferedBlockDispatcher({
  ctx: context,
  dispatcherOptions: {
    sendTyping,              // Shows "typing..." in Telegram
    replyDispatcher,
    channel: "telegram",
  },
  replyOptions: {
    onBlockReply: (payload) => {
      // Each block of response text sent to Telegram
    },
  }
});

File: src/telegram/send.ts -- sendMessageTelegram()

Formats response for Telegram (Markdown → HTML conversion) and sends:

const response = await api.sendMessage(
  chatId,                    // 123456789
  formattedText,             // HTML-formatted response
  {
    parse_mode: "HTML",
    reply_to_message_id: originalMessageId,
    message_thread_id: threadSpec.id,
    reply_markup: buttons     // Inline keyboards if applicable
  }
);

Long responses are automatically split into multiple messages (Telegram's 4096 char limit). Images, code blocks, and files are handled with appropriate Telegram API methods (sendPhoto, sendDocument, etc.).
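
A splitter for the 4096-character limit might look like the sketch below, preferring to break on a newline near the boundary so paragraphs and code blocks are less likely to be cut mid-line. This is a simplification of whatever src/telegram/send.ts actually does:

```typescript
// Sketch of message splitting at Telegram's 4096-char limit, breaking on
// a newline when a reasonable one exists inside the window.
const TELEGRAM_LIMIT = 4096;

function splitMessage(text: string, limit = TELEGRAM_LIMIT): string[] {
  const parts: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const window = rest.slice(0, limit);
    const nl = window.lastIndexOf("\n");
    const cut = nl > limit / 2 ? nl : limit; // only use the newline if it's not too early
    parts.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, "");
  }
  parts.push(rest);
  return parts;
}
```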

Step 10: Session Persistence

File: src/config/sessions.ts

After the agent run completes, the session is saved:

~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json

Updated with:

  • New user message appended to messages[]
  • New assistant response appended to messages[]
  • lastActivity timestamp updated
  • Token usage metrics updated

Timing: Disk write happens after the response is sent to Telegram. If the gateway crashes between send and persist, the message is lost from session history (but the user already saw it in Telegram).


Complete Flow Diagram

Telegram User: "Help me write a Python script"
       │
       ▼
[1] Telegram API → POST /telegram-webhook
       │           src/telegram/webhook.ts
       ▼
[2] grammy Bot.handleUpdate()
       │  Extracts: chatId, text, senderId, isGroup
       │  src/telegram/bot.ts
       ▼
[3] buildTelegramMessageContext()
       │  Checks: dmPolicy, allowFrom, pairing status
       │  src/telegram/bot-message-context.ts
       │  src/telegram/bot-access.ts
       ▼
[4] resolveAgentRoute()
       │  Matches: channel → account → default agent
       │  Generates: sessionKey = "agent:main:telegram:direct:123456789"
       │  src/routing/resolve-route.ts
       ▼
[5] loadSessionStore()
       │  Reads: ~/.openclaw/sessions/main/...json
       │  Loads: previous messages, model overrides
       │  src/config/sessions.ts
       ▼
[6] getReplyFromConfig() → runPreparedReply()
       │  Builds: system prompt + user message
       │  src/auto-reply/reply/get-reply.ts
       │  src/auto-reply/reply/get-reply-run.ts
       ▼
[7] runEmbeddedAttempt()
       │  Creates agent session with tools
       │  Calls: activeSession.prompt(message)
       │  → LLM API call (Anthropic/OpenAI/etc.)
       │  src/agents/pi-embedded-runner/run/attempt.ts
       ▼
[8] Tool execution (if needed)
       │  bash, file.write, web_fetch, etc.
       │  With exec approval gates
       │  src/agents/pi-tools.ts
       ▼
[9] dispatchTelegramMessage()
       │  Formats: Markdown → HTML
       │  Splits: long messages (4096 char limit)
       │  Sends: api.sendMessage(chatId, text, {parse_mode: "HTML"})
       │  src/telegram/bot-message-dispatch.ts
       │  src/telegram/send.ts
       ▼
[10] updateSessionStore()
        Persists: messages, usage, metadata
        Path: ~/.openclaw/sessions/main/...json
        src/config/sessions.ts
       ▼
Telegram User sees response in chat

Gateway Broadcast: How Other Clients See This

While the Telegram message is being processed, other connected clients (CLI, TUI, Web) see the same activity in real time:

Agent starts processing
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: "Here's"}]}}
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " a"}]}}
  → Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " Python"}]}}
  ...
  → Gateway emits: {event: "chat", state: "final", message: {content: [{text: "Here's a Python script..."}]}}

All clients connected to the same session key see streaming tokens via WebSocket events. Broadcast is scope-filtered (only clients with appropriate permissions see the events) and backpressure-managed (slow clients may be dropped).
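
The scope-filtered fan-out reduces to a membership check per client. The shapes below are assumptions; the real broadcast path also handles backpressure, which this sketch omits:

```typescript
// Sketch of scope-filtered broadcast: each event frame is delivered only
// to clients whose subscription set covers the session key.
type Client = {
  id: string;
  sessionKeys: Set<string>;
  send: (frame: string) => void;
};

function broadcast(clients: Set<Client>, sessionKey: string, event: object): void {
  const frame = JSON.stringify({ type: "event", event: "chat", payload: event, sessionKey });
  for (const c of clients) {
    if (c.sessionKeys.has(sessionKey)) c.send(frame);
  }
}
```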


Key Architectural Characteristics

Strengths

  • Single gateway = single coordination point: All routing, auth, and session management centralized
  • Channel-agnostic core: Agent doesn't know or care if the message came from Telegram, Discord, or CLI
  • Streaming-first: Tokens stream to all connected clients as they arrive from the LLM
  • Plugin extensibility: Most behavior can be modified via hooks without touching core
  • Sequential processing per chat: grammy's sequentialize() prevents race conditions

Weaknesses

  • Single process: All agents run in one Node.js process; crash = everything down
  • File-based sessions: JSONL/JSON5 on disk; no transactional guarantees, potential race conditions
  • Default session isolation is "main": Multi-user deployments leak data unless explicitly configured
  • No persistence ACK: Client sees message before disk write; crash = message lost from history
  • Concurrent send interleaving: Multiple sends to same session run in parallel, not queued