OpenClaw Architecture Deep Dive
TL;DR: OpenClaw's architecture is a masterclass in centralized routing—one gateway server connects 14+ messaging platforms to AI agents through a WebSocket protocol. Follow a real Telegram message through the entire stack: from webhook to route resolution to LLM invocation to streamed response. The single-process design is elegant but fragile—one crash takes down everything, and file-based sessions create race conditions you'll wish you'd known about before deploying.
Overview
OpenClaw is a multi-client, multi-channel AI agent platform built in TypeScript/Node.js. At its core is a gateway server that routes messages between clients (CLI, TUI, web, mobile, messaging channels) and agent sessions. This document traces the architecture through a concrete example: a Telegram message flowing through the entire system.
High-Level Architecture
┌─────────────┐
│ Telegram │
│ WhatsApp │
│ Discord │ Channel Integrations
│ Slack │ (14+ platforms)
│ Signal │
│ iMessage │
└──────┬──────┘
│
┌──────▼──────┐
│ Gateway │ WebSocket + HTTP Server
│ Server │ Central routing hub
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌──────▼───┐ ┌────▼────┐ ┌───▼──────┐
│ Route │ │ Session │ │ Plugin │
│ Resolver │ │ Store │ │ System │
└──────┬───┘ └────┬────┘ └───┬──────┘
│ │ │
└────────────┼────────────┘
│
┌──────▼──────┐
│ Agent │ Pi Agent Core
│ Runner │ LLM + Tools
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌──────▼───┐ ┌────▼────┐ ┌───▼──────┐
│ LLM │ │ Tool │ │ Subagent │
│ Provider │ │ System │ │ Registry │
└──────────┘ └─────────┘ └──────────┘
Component Breakdown
1. Gateway Server
Location: src/gateway/server.impl.ts (~178 files in gateway directory)
The gateway is the central hub. It:
- Runs HTTP/HTTPS + WebSocket servers
- Manages all client connections (CLI, TUI, web, mobile, channel bots)
- Routes messages to the correct agent session
- Broadcasts agent events (streaming tokens, tool calls) to connected clients
- Manages channel integrations lifecycle
- Provides RPC interface for internal components
Key runtime state:
clients: Set<GatewayWsClient> // All connected clients
chatRunState: Map<runId, ChatRun> // Active agent runs
chatRunBuffers: Map<runId, Buffer> // Token streaming buffers
nodeRegistry: Map<nodeId, Node> // Remote execution nodes
2. WebSocket Protocol
Location: src/gateway/protocol/, src/gateway/server/ws-connection.ts
Three frame types over JSON:
- Request: {type: "req", id: UUID, method: string, params: {...}}
- Response: {type: "res", id: UUID, ok: boolean, payload: {...}}
- Event: {type: "event", event: string, payload: {...}, seq: number}
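The three frame shapes above map naturally onto a TypeScript discriminated union. A minimal sketch (type and function names here are illustrative, not the actual protocol module's exports):

```typescript
// Illustrative types for the three gateway frame shapes.
type RequestFrame = { type: "req"; id: string; method: string; params: Record<string, unknown> };
type ResponseFrame = { type: "res"; id: string; ok: boolean; payload: Record<string, unknown> };
type EventFrame = { type: "event"; event: string; payload: Record<string, unknown>; seq: number };
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Narrow an incoming frame by its discriminant and summarize it.
function describeFrame(frame: Frame): string {
  switch (frame.type) {
    case "req":
      return `request ${frame.id} -> ${frame.method}`;
    case "res":
      return `response ${frame.id} (${frame.ok ? "ok" : "error"})`;
    case "event":
      return `event ${frame.event} #${frame.seq}`;
  }
}
```

The discriminated union lets the compiler enforce that every frame type is handled at each dispatch site.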
Connection handshake:
Client connects (TCP upgrade)
→ Server sends: "connect.challenge" {nonce}
→ Client sends: "connect" RPC with auth + device identity
→ Server validates, sends: HelloOk {policy, role}
→ Connection ready
3. Authentication
Location: src/gateway/auth.ts
Five auth modes:
- Token: SHA-256 validated shared secret
- Password: Timing-safe comparison via safeEqualSecret()
- Device token: RSA signature-based per-device identity
- Tailscale: User identity via Tailscale whois headers
- None: Open access (loopback only)
Rate limiting: 10 failed attempts → 5-minute lockout per IP (auth only, not API).
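The lockout rule (10 failures, then a 5-minute per-IP lockout) reduces to a small in-memory counter. A sketch under that reading; the names and reset-on-lockout behavior are assumptions, not the actual src/gateway/auth.ts implementation:

```typescript
const MAX_FAILURES = 10;
const LOCKOUT_MS = 5 * 60 * 1000;

interface FailureRecord { count: number; lockedUntil: number }
const failures = new Map<string, FailureRecord>();

// Returns true if this IP is currently locked out.
function isLockedOut(ip: string, now: number): boolean {
  const rec = failures.get(ip);
  return rec !== undefined && now < rec.lockedUntil;
}

// Record a failed auth attempt; start a lockout on the 10th failure.
function recordFailure(ip: string, now: number): void {
  const rec = failures.get(ip) ?? { count: 0, lockedUntil: 0 };
  rec.count += 1;
  if (rec.count >= MAX_FAILURES) {
    rec.lockedUntil = now + LOCKOUT_MS;
    rec.count = 0; // assumed: counter resets once the lockout starts
  }
  failures.set(ip, rec);
}
```

Passing `now` explicitly keeps the limiter deterministic and easy to test.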
4. Routing
Location: src/routing/resolve-route.ts
Session key format: agent:{agentId}:{channel}:{accountId}:{peerKind}:{peerId}
Route resolution checks bindings in priority order:
- Exact peer match (user's configured agent for this DM)
- Parent peer match (for threads, inherit parent's agent)
- Guild+roles match (Discord server + roles)
- Guild match (Discord server only)
- Team match (Slack workspace)
- Account match (channel account)
- Channel match (channel type)
- Default agent
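The priority walk amounts to "first matching binding in tier order wins, else default." A minimal sketch; the binding shape and field names are assumptions, not the real resolve-route.ts types:

```typescript
type BindingTier = "peer" | "parentPeer" | "guildRoles" | "guild" | "team" | "account" | "channel";

interface RouteContext { channel: string; accountId: string; peerId: string; guildId?: string }
interface Binding { tier: BindingTier; match: (ctx: RouteContext) => boolean; agentId: string }

// Tiers in descending priority, mirroring the list above.
const TIER_ORDER: BindingTier[] = ["peer", "parentPeer", "guildRoles", "guild", "team", "account", "channel"];

function resolveAgent(bindings: Binding[], ctx: RouteContext, defaultAgent: string): string {
  for (const tier of TIER_ORDER) {
    const hit = bindings.find((b) => b.tier === tier && b.match(ctx));
    if (hit) return hit.agentId;
  }
  return defaultAgent; // no binding matched: fall back to the default agent
}
```

Because tiers are checked in a fixed order, a DM-specific binding always beats a channel-wide one, regardless of configuration order.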
5. Plugin SDK
Location: src/plugin-sdk/
Plugins can register:
- registerTool() -- custom agent tools
- registerHook() -- lifecycle hooks (20+ hook points)
- registerGatewayMethod() -- new RPC methods
- registerHttpHandler() / registerHttpRoute() -- HTTP endpoints
- registerChannel() -- new communication channels
- registerCommand() -- chat commands (bypass LLM)
- registerCli() -- CLI commands
- registerService() -- background services
- registerProvider() -- auth providers
Available hooks: before_tool_call, after_tool_call, message_received, message_sending, message_sent, before_agent_start, agent_end, before_compaction, subagent_spawning, subagent_ended, gateway_start, gateway_stop, llm_input, llm_output, and more.
6. Subagent System
Location: src/agents/subagent-registry.ts (940 LOC), src/agents/subagent-spawn.ts (528 LOC)
Two spawn modes:
- "run": One-shot ephemeral agent. Executes task, announces result, cleaned up.
- "session": Persistent thread-bound agent. Stays active for follow-ups.
Key capabilities:
- Depth limiting (maxSpawnDepth, maxChildrenPerAgent)
- Steer: redirect running agent mid-task (abort + restart with new instructions)
- Cascading kill: recursively terminate all descendants
- Disk persistence: state survives gateway restarts
- Announce flow: exponential backoff delivery (1s/2s/4s/8s, max 3 retries, 5-min expiry)
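The announce retry schedule can be sketched as two pure functions. How the 1s/2s/4s/8s ladder interacts with the max-retry count is my reading of the description above, not a transcription of the real code:

```typescript
const ANNOUNCE_EXPIRY_MS = 5 * 60 * 1000;

// Delay before retry attempt n (0-based): 1s, 2s, 4s, 8s (capped).
function announceDelayMs(attempt: number): number {
  return 1000 * 2 ** Math.min(attempt, 3);
}

// An announce is retried only while the retry budget remains and it has not expired.
function shouldRetry(attempt: number, createdAt: number, now: number, maxRetries = 3): boolean {
  return attempt < maxRetries && now - createdAt < ANNOUNCE_EXPIRY_MS;
}
```

Keeping the schedule pure (no timers inside) makes the retry policy trivially unit-testable.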
Complete Message Flow: Telegram Example
Scenario: A Telegram user sends "Help me write a Python script" to the OpenClaw bot.
Step 1: Telegram Webhook Receives Message
File: src/telegram/webhook.ts → startTelegramWebhook()
The gateway starts a dedicated HTTP server for Telegram webhooks. When Telegram sends an update:
POST /telegram-webhook
→ grammy's webhookCallback() parses the update
→ Bot middleware chain executes
→ Sequential processing per chat (getTelegramSequentialKey)
Request body limits: 1MB max, 30s timeout.
Step 2: Message Parsing
File: src/telegram/bot.ts → createTelegramBot() (line 115)
The grammy Bot instance extracts:
chatId = msg.chat.id // e.g., 123456789
text = msg.text ?? msg.caption // "Help me write a Python script"
senderId = msg.from?.id // e.g., 123456789
senderUsername = msg.from?.username // e.g., "js0n"
isGroup = msg.chat.type === "group" // false (DM)
threadId = msg.message_thread_id // undefined (not a forum topic)
Messages are serialized per chat via sequentialize(getTelegramSequentialKey) -- one message processed at a time per chat to prevent race conditions.
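The effect of per-chat serialization can be sketched as a per-key promise chain: tasks with the same key run strictly in order, while different keys proceed concurrently. This is a simplification of what grammy's sequentialize() middleware does, not its implementation:

```typescript
// Per-key promise chains: tasks sharing a key run one at a time, in arrival order.
const chains = new Map<string, Promise<void>>();

function runSequential(key: string, task: () => Promise<void>): Promise<void> {
  const prev = chains.get(key) ?? Promise.resolve();
  const next = prev.then(task, task); // run the next task even if the previous one failed
  chains.set(key, next);
  return next;
}
```

Two messages from the same chat therefore never interleave their handlers, which is exactly the race-condition guarantee described above.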
Step 3: Access Control
File: src/telegram/bot-message-context.ts → buildTelegramMessageContext()
For DMs, checks DM policy:
if (dmPolicy === "disabled") → reject
if (dmPolicy === "open") → allow
if (dmPolicy === "allowlist") → check senderId against allowFrom list
if (dmPolicy === "pairing") → check if paired, else send pairing code
File: src/telegram/bot-access.ts → resolveSenderAllowMatch()
- Checks allow.hasWildcard (if "*" in config, allow everyone)
- Checks if senderId is in the allow.entries set (normalized phone/ID)
- Returns {allowed: boolean, matchKey, matchSource}
If pairing required and user not yet paired:
const { code, created } = await upsertChannelPairingRequest({
channel: "telegram",
accountId,
senderId: candidate,
senderDisplayName
})
// Sends pairing code back to user: "Reply with code: ABCD1234"
Step 4: Route Resolution
File: src/routing/resolve-route.ts → resolveAgentRoute()
Input:
{
cfg: loadConfig(),
channel: "telegram",
accountId: "default", // Telegram bot account
peer: { kind: "direct", id: "123456789" }
}
Walks binding tiers:
- Check peer bindings → no match
- Check account bindings → no match
- Check channel bindings → no match
- Fall back to default agent → "main"
File: src/routing/session-key.ts → buildAgentSessionKey()
Generates session key:
agent:main:telegram:direct:123456789
DM scope (default "main") determines isolation level:
- "main" → all DMs share one session (default, insecure for multi-user)
- "per-peer" → separate per user
- "per-channel-peer" → separate per channel+user
- "per-account-channel-peer" → full isolation
Step 5: Session Loading
File: src/auto-reply/reply/get-reply.ts → getReplyFromConfig()
Calls initSessionState() (src/config/sessions.ts), which loads the session from disk:
~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json
Session structure:
{
"channel": "telegram",
"origin": { "provider": "telegram", "from": "123456789" },
"modelOverride": null,
"chatType": "direct",
"sessionId": "<uuid>",
"createdAt": "2026-02-21T...",
"lastActivity": "2026-02-21T...",
"messages": [
{"role": "user", "content": [{"type": "text", "text": "previous message"}]},
{"role": "assistant", "content": [{"type": "text", "text": "previous response"}]}
]
}
If session doesn't exist, creates a new one with empty message history.
Step 6: System Prompt Construction
File: src/auto-reply/reply/get-reply-run.ts → runPreparedReply()
Builds the extra system prompt:
const extraSystemPrompt = [
inboundMetaPrompt, // "Message received from Telegram DM, user: @js0n"
groupChatContext, // null (DM, not group)
groupIntro, // null
groupSystemPrompt // null
].filter(Boolean).join("\n\n");
The agent also has a base system prompt from its agent config (~/.openclaw/agents/main/).
Step 7: Agent Dispatch & LLM Invocation
File: src/agents/pi-embedded-runner/run/attempt.ts → runEmbeddedAttempt()
Creates agent session:
const { session: activeSession } = await createAgentSession({
workspace: resolvedWorkspace,
model: params.model, // e.g., "claude-sonnet-4-5-20250929"
tools: builtInTools, // bash, file ops, web, etc.
customTools: allCustomTools, // plugin-registered tools
sessionManager,
resourceLoader
});
Applies system prompt, then invokes:
await activeSession.prompt(effectivePrompt)
// effectivePrompt = "Help me write a Python script"
This calls the LLM via the configured provider (Anthropic, OpenAI, etc.) through @mariozechner/pi-ai's streaming interface. The provider selection comes from:
- Agent config → session override → environment → config file
Step 8: Tool Execution (If Agent Uses Tools)
File: src/agents/pi-tools.ts → createOpenClawCodingTools()
If the agent decides to write a Python file, it invokes the bash or file.write tool:
Agent: "I'll create the script for you."
Tool call: file.write { path: "/workspace/script.py", content: "#!/usr/bin/env python3\n..." }
Tool execution flow:
- Agent outputs tool call in response stream
- Pi agent core intercepts, finds matching tool
- Tool's execute() function runs (with security policy checks)
- Tool result added to conversation history
- Control returns to LLM with tool result
- LLM generates next response (may call more tools or produce final text)
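The six steps above form the standard agentic tool cycle. A minimal sketch with a stubbed model; nothing here is the actual pi agent core API:

```typescript
interface ToolCall { name: string; args: Record<string, unknown> }
interface ModelTurn { text?: string; toolCall?: ToolCall }
type Tool = (args: Record<string, unknown>) => string;
// The model is stubbed as a function from the transcript so far to its next turn.
type Model = (transcript: string[]) => ModelTurn;

function runToolLoop(model: Model, tools: Record<string, Tool>, prompt: string, maxSteps = 8): string {
  const transcript = [`user: ${prompt}`];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(transcript);
    if (turn.toolCall) {
      // Intercept the tool call, execute it, and feed the result back into the transcript.
      const result = tools[turn.toolCall.name](turn.toolCall.args);
      transcript.push(`tool(${turn.toolCall.name}): ${result}`);
      continue;
    }
    return turn.text ?? ""; // final text response: the loop ends
  }
  throw new Error("tool loop exceeded maxSteps");
}
```

The maxSteps cap matters: without it, a model that keeps emitting tool calls would loop forever.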
Exec approval (for bash tools):
Agent wants to run: python3 script.py
→ Check exec-approvals.json allowlist
→ If not allowed: send approval request to user (Telegram inline keyboard)
→ User taps "Allow" → command executes
→ Result returned to agent
Step 9: Response Routing Back to Telegram
File: src/telegram/bot-message-dispatch.ts → dispatchTelegramMessage()
The agent's response streams back through the dispatch chain:
const { queuedFinal } = await dispatchReplyWithBufferedBlockDispatcher({
ctx: context,
dispatcherOptions: {
sendTyping, // Shows "typing..." in Telegram
replyDispatcher,
channel: "telegram",
},
replyOptions: {
onBlockReply: (payload) => {
// Each block of response text sent to Telegram
},
}
});
File: src/telegram/send.ts → sendMessageTelegram()
Formats response for Telegram (Markdown → HTML conversion) and sends:
const response = await api.sendMessage(
chatId, // 123456789
formattedText, // HTML-formatted response
{
parse_mode: "HTML",
reply_to_message_id: originalMessageId,
message_thread_id: threadSpec.id,
reply_markup: buttons // Inline keyboards if applicable
}
);
Long responses are automatically split into multiple messages (Telegram's 4096 char limit). Images, code blocks, and files are handled with appropriate Telegram API methods (sendPhoto, sendDocument, etc.).
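Splitting at the 4096-character limit can be sketched as below; the real send.ts likely splits more carefully (e.g. keeping HTML tags and code fences intact), so treat this as an illustration of the idea only:

```typescript
const TELEGRAM_MAX_CHARS = 4096;

// Split text into chunks under the limit, preferring newline boundaries.
function splitForTelegram(text: string, limit = TELEGRAM_MAX_CHARS): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Break at the last newline inside the window, else hard-cut at the limit.
    const cut = rest.lastIndexOf("\n", limit);
    const at = cut > 0 ? cut : limit;
    chunks.push(rest.slice(0, at));
    rest = rest.slice(at).replace(/^\n/, "");
  }
  chunks.push(rest);
  return chunks;
}
```

Preferring newline boundaries keeps paragraphs and list items from being severed mid-line when a response spans multiple Telegram messages.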
Step 10: Session Persistence
File: src/config/sessions.ts
After the agent run completes, the session is saved:
~/.openclaw/sessions/main/agent:main:telegram:direct:123456789.json
Updated with:
- New user message appended to messages[]
- New assistant response appended to messages[]
- lastActivity timestamp updated
- Token usage metrics updated
Timing: Disk write happens after the response is sent to Telegram. If the gateway crashes between send and persist, the message is lost from session history (but the user already saw it in Telegram).
Complete Flow Diagram
Telegram User: "Help me write a Python script"
│
▼
[1] Telegram API → POST /telegram-webhook
│ src/telegram/webhook.ts
▼
[2] grammy Bot.handleUpdate()
│ Extracts: chatId, text, senderId, isGroup
│ src/telegram/bot.ts
▼
[3] buildTelegramMessageContext()
│ Checks: dmPolicy, allowFrom, pairing status
│ src/telegram/bot-message-context.ts
│ src/telegram/bot-access.ts
▼
[4] resolveAgentRoute()
│ Matches: channel → account → default agent
│ Generates: sessionKey = "agent:main:telegram:direct:123456789"
│ src/routing/resolve-route.ts
▼
[5] loadSessionStore()
│ Reads: ~/.openclaw/sessions/main/...json
│ Loads: previous messages, model overrides
│ src/config/sessions.ts
▼
[6] getReplyFromConfig() → runPreparedReply()
│ Builds: system prompt + user message
│ src/auto-reply/reply/get-reply.ts
│ src/auto-reply/reply/get-reply-run.ts
▼
[7] runEmbeddedAttempt()
│ Creates agent session with tools
│ Calls: activeSession.prompt(message)
│ → LLM API call (Anthropic/OpenAI/etc.)
│ src/agents/pi-embedded-runner/run/attempt.ts
▼
[8] Tool execution (if needed)
│ bash, file.write, web_fetch, etc.
│ With exec approval gates
│ src/agents/pi-tools.ts
▼
[9] dispatchTelegramMessage()
│ Formats: Markdown → HTML
│ Splits: long messages (4096 char limit)
│ Sends: api.sendMessage(chatId, text, {parse_mode: "HTML"})
│ src/telegram/bot-message-dispatch.ts
│ src/telegram/send.ts
▼
[10] updateSessionStore()
Persists: messages, usage, metadata
Path: ~/.openclaw/sessions/main/...json
src/config/sessions.ts
▼
Telegram User sees response in chat
Gateway Broadcast: How Other Clients See This
While the Telegram message is being processed, other connected clients (CLI, TUI, Web) see the same activity in real-time:
Agent starts processing
→ Gateway emits: {event: "chat", state: "delta", message: {content: [{text: "Here's"}]}}
→ Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " a"}]}}
→ Gateway emits: {event: "chat", state: "delta", message: {content: [{text: " Python"}]}}
...
→ Gateway emits: {event: "chat", state: "final", message: {content: [{text: "Here's a Python script..."}]}}
All clients connected to the same session key see streaming tokens via WebSocket events. Broadcast is scope-filtered (only clients with appropriate permissions see the events) and backpressure-managed (slow clients may be dropped).
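Scope-filtered broadcast reduces to iterating the client set and skipping clients that lack the required permission. The shapes below are illustrative, not the gateway's actual client type:

```typescript
interface WsClient { id: string; scopes: Set<string>; send: (frame: string) => void }

// Send an event only to clients holding the required scope; returns the delivery count.
function broadcast(clients: Iterable<WsClient>, event: string, payload: unknown, requiredScope: string): number {
  let delivered = 0;
  const frame = JSON.stringify({ type: "event", event, payload });
  for (const client of clients) {
    if (!client.scopes.has(requiredScope)) continue; // scope filter
    client.send(frame);
    delivered++;
  }
  return delivered;
}
```

A production version would also bound each client's outbound queue, which is where the backpressure-based dropping of slow clients comes in.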
Key Architectural Characteristics
Strengths
- Single gateway = single coordination point: All routing, auth, and session management centralized
- Channel-agnostic core: Agent doesn't know or care if the message came from Telegram, Discord, or CLI
- Streaming-first: Tokens stream to all connected clients as they arrive from the LLM
- Plugin extensibility: Most behavior can be modified via hooks without touching core
- Sequential processing per chat: grammy's sequentialize() prevents race conditions
Weaknesses
- Single process: All agents run in one Node.js process; crash = everything down
- File-based sessions: JSONL/JSON5 on disk; no transactional guarantees, potential race conditions
- Default session isolation is "main": Multi-user deployments leak data unless explicitly configured
- No persistence ACK: Client sees message before disk write; crash = message lost from history
- Concurrent send interleaving: Multiple sends to same session run in parallel, not queued