AO messaging redesign — research findings

Author: Harshit (with Claude)
Date: 2026-05-10
Status: Findings ready for team review; no code shipped yet.

TL;DR

We're replacing tmux send-keys for orchestrator → agent notifications. The original assumption — "file-based for everyone" — was wrong. The right answer is per-agent transports, dispatched from each agent plugin's sendMessage. One common record (inbox.jsonl for durability), different delivery mechanisms.

| Agent | Transport | Status |
| --- | --- | --- |
| Claude Code | bg-watcher → process exit wakes agent | ✅ proven end-to-end |
| Copilot CLI | bg-watcher (same as Claude Code) | ✅ proven by Harshit on Copilot |
| OpenCode | POST /session/{id}/prompt_async to its HTTP server | ✅ proven end-to-end |
| Codex CLI | bg-bash is broken; keep send-keys for now | ❌ blocked on openai/codex#22003 |
| Cursor | bg-process completion does not auto-inject; needs send-keys equivalent | ❌ empirically verified — same bucket as Codex |

OpenCode integration is strictly easier than Claude Code's: no sidecar process inside the agent's session, no inbox file required for delivery — the orchestrator just hits an HTTP endpoint.


What we set out to solve

The orchestrator needs to deliver structured notifications to running agents — CI failed, review came in, PR ready to merge, etc. Today this goes through runtime.sendMessage, which for tmux runtime calls tmux send-keys and pastes the message into the agent's pty (packages/plugins/runtime-tmux/src/index.ts:132-169).

Problems with the current path (admitted in code comments):

  - The payload is an unstructured string pasted into the pty; there is no schema and no machine-readable record.
  - Delivery is fire-and-forget: nothing auditable confirms the agent actually received the message.
  - The wake-up depends on terminal byte streams reaching an idle REPL, which is timing-sensitive and fragile.

Goal: replace it with something where the data is structured, the delivery is auditable, and the wake-up doesn't depend on terminal byte streams.
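To make "structured" concrete, here is a hypothetical shape for one notification record, matching the { sessionId, kind, payload, ts } entry sketched later in this doc. Field names and the kind values are illustrative, not a committed schema:

```typescript
// Illustrative shape for a structured notification record (not a committed schema).
interface InboxEntry {
  id: number;                       // monotonic, for idempotency / acknowledgement
  sessionId: string;                // which agent session this targets
  kind: string;                     // e.g. "ci-failed", "review", "pr-ready"
  payload: Record<string, unknown>; // structured event data
  ts: string;                       // ISO-8601 timestamp
}

// One entry serialises to one JSON line in inbox.jsonl:
const entry: InboxEntry = {
  id: 1,
  sessionId: "ses_abc123",
  kind: "ci-failed",
  payload: { runUrl: "https://ci.example/run/42" },
  ts: new Date().toISOString(),
};
const line = JSON.stringify(entry);
```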


Experiment 1 — Claude Code: file watcher + bg-process exit

The pattern

A small Node script runs in the background as a one-shot watcher. It uses fs.watchFile to block until the target file (./inbox.jsonl) grows past a known offset. When new bytes arrive, it reads them, prints one JSON line to stdout, and exits with code 0.

Claude Code's runtime auto-injects a <task-notification> system reminder into the agent's context the moment a backgrounded shell command exits, while the agent is idle. The agent reads the new inbox content, treats it as a user prompt, and acts. Latency: ~500 ms (the watchFile poll interval).

What we proved

Live-tested in this session. Watcher script: ~30 lines of Node, no installs (uses built-in fs.watchFile, equivalent to chokidar's underlying mechanism on macOS). Spawned as Bash(run_in_background: true). Wrote to inbox.jsonl from a side terminal — agent received the wake-up, read the new lines, processed as user input, respawned watcher with the new offset. Loop runs forever.

Caveats

  - ~500 ms latency floor from the fs.watchFile poll interval.
  - One-shot: after every wake-up the agent must respawn the watcher with the new offset.
  - The 30-minute timeout exits with an empty payload; the agent must respawn the watcher then as well.
  - Injection happens only while the agent is idle; messages arriving mid-turn are picked up at the next idle point.

Watcher script (canonical)

#!/usr/bin/env node
// Usage: watcher.js <filePath> [fromOffset]
// Blocks until <filePath> grows past fromOffset, prints one JSON line, exits 0.
const fs = require("node:fs");
const [, , filePath, fromArg] = process.argv;
const fromOffset = Number.parseInt(fromArg ?? "0", 10) || 0;
const readNew = () => {
  if (!fs.existsSync(filePath)) return null; // file may not exist yet
  const size = fs.statSync(filePath).size;
  if (size <= fromOffset) return null;
  const fd = fs.openSync(filePath, "r");
  const buf = Buffer.alloc(size - fromOffset);
  fs.readSync(fd, buf, 0, buf.length, fromOffset);
  fs.closeSync(fd);
  return { newOffset: size, content: buf.toString("utf8") };
};
// Print one JSON payload and exit 0; the bg-process exit is the wake-up signal.
const emit = (p) => { process.stdout.write(JSON.stringify(p) + "\n"); process.exit(0); };
const initial = readNew();
if (initial?.content) emit({ event: "initial", ...initial });
// fs.watchFile polls stat() every 500 ms and keeps the event loop alive.
fs.watchFile(filePath, { interval: 500 }, () => {
  const r = readNew();
  if (r?.content) { fs.unwatchFile(filePath); emit({ event: "change", ...r }); }
});
// Safety valve: exit with an empty payload after 30 min so the agent can respawn.
setTimeout(() => emit({ event: "timeout", newOffset: fromOffset, content: "" }), 30 * 60 * 1000).unref();
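The byte-offset contract the watcher and its consumer share can be exercised in a self-contained sketch. This uses a temp file rather than a real inbox, and the id values are illustrative:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// The consumer remembers the byte offset it has read up to; each wake-up
// reads only the bytes past that offset (same arithmetic as readNew() above).
const inbox = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "ao-")), "inbox.jsonl");

fs.appendFileSync(inbox, JSON.stringify({ id: 1, kind: "ci-failed" }) + "\n");
const offset = fs.statSync(inbox).size; // consumer has seen everything so far

// Orchestrator appends a new entry (the "side terminal" write in the test above):
fs.appendFileSync(inbox, JSON.stringify({ id: 2, kind: "review" }) + "\n");

// Consumer reads only the bytes past its saved offset:
const size = fs.statSync(inbox).size;
const buf = Buffer.alloc(size - offset);
const fd = fs.openSync(inbox, "r");
fs.readSync(fd, buf, 0, buf.length, offset);
fs.closeSync(fd);
console.log(buf.toString("utf8").trim()); // only the id:2 line
```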

Per-agent compatibility — what works, what doesn't

The "wake-up channel from external event to agent's REPL" is the bottleneck. Here's what we found, both empirically and via the source.

✅ Claude Code — works natively

run_in_background: true on Bash + auto <task-notification> injection. Already used in production by AO in this session.

✅ Copilot CLI — works (surprising win)

Public docs and open issues (github/copilot-cli#2682) suggested bg shell completion notification only worked for MCP background agents, not bare bg shells. Harshit empirically verified that bare bg shell completion does wake the agent — works the same way as Claude Code with the same watcher script.

✅ OpenCode — works via HTTP, not via watcher (see Experiment 2 below)

❌ Codex CLI — broken

Bg-bash exists but the agent harness fails to detect process completion: known stuck-loop bug (openai/codex#14314, "Waited for background terminal" infinite log spam). Proper bg-session support still open at openai/codex#3968. Harshit filed a duplicate report with the AO use case at openai/codex#22003.

For Codex we must keep the existing send-keys-equivalent until upstream fixes land.

❌ Cursor — bg-completion does not wake the agent (empirically verified 2026-05-10)

Tested live with cursor-agent v1.x. Spawned the same Node fs.watchFile watcher as a backgrounded shell command from inside the agent's session, ended the agent's turn, then appended a line to ./inbox.jsonl from outside. The watcher detected the change, exited cleanly with the expected JSON payload (event:"change", newOffset:170, content matching), but cursor-agent's harness did not inject the output into the model's context. The agent only acted on the watcher's payload after the human typed a fresh message — at which point the harness made the prior bg-job's stdout available, but as a tool-result lookup, not a wake-up.

The historical bug Harshit asked us to investigate (forum.cursor.com #122392) was resolved in v1.3.7 (July 2025) — that one was about foreground command completion detection. The wake-up-on-bg-exit gap is a separate, still-open architectural issue.

Implication: Cursor goes in the same bucket as Codex — orchestrator → agent delivery needs a send-keys-style nudge until Cursor adds a wake-up channel. There's no HTTP server like OpenCode (we checked).


Experiment 2 — OpenCode: orchestrator hits the HTTP API

The discovery

OpenCode auto-starts a Hono HTTP server on port 4096 (or --port <N>) when the TUI launches, and opencode serve runs it headlessly. Among its routes, the two that matter here are:

  - POST /session/{id}/prompt_async: enqueue a user prompt (fire-and-forget; returns HTTP 204)
  - GET /session/{id}/message: read the session's message history

This is already a structured external messaging channel. We don't need to invent one.

What we proved

Live-tested with Harshit's running TUI bound to http://127.0.0.1:4096:

curl -X POST http://127.0.0.1:4096/session/ses_1ef2ff3d4ffe8HKxAfuQtDWNTC/prompt_async \
  -H 'Content-Type: application/json' \
  -d '{
    "parts":[{"type":"text","text":"Hello from AO orchestrator!"}],
    "model":{"providerID":"zai-coding-plan","modelID":"glm-5.1"}
  }'
# → HTTP 204

Server stored the user message. Agent ran inference. Response streamed back. Verified via GET /session/.../message.

Implication for AO's agent-opencode plugin

The OpenCode plugin's sendMessage becomes about a dozen lines:

async sendMessage(handle: RuntimeHandle, message: string): Promise<void> {
  const port = handle.data.opencodePort as number;
  const sessionID = handle.data.opencodeSessionID as string;
  const model = handle.data.opencodeModel as { providerID: string; modelID: string };
  await fetch(`http://127.0.0.1:${port}/session/${sessionID}/prompt_async`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      parts: [{ type: "text", text: message }],
      model,
    }),
  });
}

agent.getLaunchCommand becomes opencode serve --port <free-port> (so we know the port up-front). Capture port, sessionID, and model into handle.data at session creation. No tmux involvement, no inbox watcher, no plugin-side code inside the agent process. End of story.
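If the headless serve launch is adopted (open question 4 below resolves differently, but either way the port must be known up-front), picking a free port before launch could look like the sketch below. The buildLaunch shape is hypothetical; AO's actual getLaunchCommand signature may differ:

```typescript
import * as net from "node:net";

// Sketch, not AO's actual implementation: bind to port 0 so the OS assigns
// a free port, then close the probe server and reuse that port number.
function pickFreePort(): Promise<number> {
  return new Promise((resolve, reject) => {
    const srv = net.createServer();
    srv.listen(0, "127.0.0.1", () => {
      const { port } = srv.address() as net.AddressInfo;
      srv.close(() => resolve(port));
    });
    srv.on("error", reject);
  });
}

// Hypothetical launch builder: the port goes into handle.data so sendMessage
// can later POST to http://127.0.0.1:<port>/session/{id}/prompt_async.
async function buildLaunch(): Promise<{ command: string; data: Record<string, unknown> }> {
  const port = await pickFreePort();
  return {
    command: `opencode serve --port ${port}`,
    // sessionID / model would be captured once the server is up; omitted here.
    data: { opencodePort: port },
  };
}
```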


Bug filed during this work

anomalyco/opencode#26671: "TUI does not live-render messages when prompt is POSTed externally via /session/{id}/prompt_async"

OpenCode's TUI doesn't live-render messages whose prompt originated from an external HTTP POST — they appear only after Ctrl+C + resume. Web UI renders correctly. Server persists correctly. Cosmetic for AO (the human watches via the dashboard, not the agent's TUI), but worth fixing upstream.


Recommended messaging architecture for AO

Per-agent transport, single inbox record

orchestrator
   │
   ├─ append { sessionId, kind, payload, ts } to {workspace}/.ao/inbox.jsonl   (durable record)
   │
   └─ runtime/agent plugin's sendMessage()
        │
        ├─ Claude Code   → ensure inbox-watcher running in agent's bash; agent picks up via wake-up
        ├─ Copilot CLI   → same as Claude Code
        ├─ OpenCode      → fetch POST /session/{id}/prompt_async to the agent's local server
        ├─ Codex         → existing send-keys path (until upstream fixes #22003)
        └─ Cursor        → existing send-keys path (no wake-up channel exists, verified empirically)

inbox.jsonl becomes the durable log — replay, audit, debugging, idempotency. Each entry has a monotonic id; the agent persists the last-consumed id in its workspace.

The transport is per-agent. One mechanism per plugin, dispatched from sendMessage. The orchestrator doesn't care which.
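The per-agent dispatch could be sketched as follows. Agent names come from the table above; the Deliver signature is an assumption for illustration, since real delivery lives in each plugin's sendMessage:

```typescript
// Illustrative dispatch table; not AO's plugin API.
type AgentKind = "claude-code" | "copilot-cli" | "opencode" | "codex" | "cursor";

type Deliver = (message: string) => Promise<void>;

function transportFor(
  agent: AgentKind,
  impls: Record<"watcher" | "http" | "sendKeys", Deliver>,
): Deliver {
  switch (agent) {
    case "claude-code":
    case "copilot-cli":
      return impls.watcher;  // inbox file + bg-watcher wake-up
    case "opencode":
      return impls.http;     // POST /session/{id}/prompt_async
    case "codex":
    case "cursor":
      return impls.sendKeys; // legacy send-keys path
  }
}
```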

What this buys us

  - Structured, machine-readable notifications instead of strings pasted into a pty.
  - A durable, auditable record (inbox.jsonl) for replay, debugging, and idempotency.
  - Wake-ups that no longer depend on terminal byte streams, for every agent with a native channel.
  - Per-plugin isolation: each agent migrates independently and can fall back to send-keys.

Migration path (rough sketch)

  1. Add inbox-write helper to core (appendInboxEntry(workspacePath, entry)).
  2. Update each agent plugin's sendMessage to use its native transport.
  3. For Claude Code / Copilot — install the inbox watcher in setupWorkspaceHooks (system prompt instructs the agent to keep one running).
  4. For OpenCode — capture port / sessionID / model in handle.data; sendMessage POSTs to the local server.
  5. For Codex — keep the existing send-keys path; revisit when #22003 lands.
  6. Replace sendWithConfirmation's "best-effort + dispatch-hash" with explicit ack via the inbox last-consumed id.

This is incremental. Each agent plugin migrates independently. The orchestrator-side append-to-inbox can ship first behind a feature flag.
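The step-1 helper could be sketched as below. The name appendInboxEntry comes from the migration list; the signature, entry shape, and line-count id scheme are assumptions (fine for a single-writer orchestrator, not safe for concurrent writers):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch of appendInboxEntry: append one JSON line to {workspace}/.ao/inbox.jsonl
// and return its monotonic id, derived by counting existing lines.
function appendInboxEntry(
  workspacePath: string,
  entry: { sessionId: string; kind: string; payload: unknown },
): number {
  const dir = path.join(workspacePath, ".ao");
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, "inbox.jsonl");
  const existing = fs.existsSync(file)
    ? fs.readFileSync(file, "utf8").split("\n").filter(Boolean).length
    : 0;
  const id = existing + 1;
  fs.appendFileSync(file, JSON.stringify({ id, ts: new Date().toISOString(), ...entry }) + "\n");
  return id;
}

// Example against a temp workspace (illustrative only):
const ws = fs.mkdtempSync(path.join(os.tmpdir(), "ao-ws-"));
const firstId = appendInboxEntry(ws, { sessionId: "ses_abc123", kind: "ci-failed", payload: {} });
```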


Open questions for the team

  1. Where does the inbox file live? {workspace}/.ao/inbox.jsonl (gitignored, like the existing .ao/AGENTS.md pattern) is the natural fit. Confirm.
  2. Acknowledgement model. Last-consumed id persisted by the agent → orchestrator polls a .ao/inbox.cursor file? Or push-back via the same transport (e.g. agent calls a hook)? Push-back is cleaner but more code.
  3. Migration timeline. Ship the per-agent transports incrementally over multiple PRs, or wait until all four are ready and flip in one go? Incremental seems safer; behind AO_EXPERIMENTAL_FILE_INBOX env var.
  4. OpenCode launch command. Switching agent-opencode's launch to opencode serve --port <free-port> (headless). Resolved: keep the TUI launch. The TUI's bound HTTP server on port 4096 is sufficient for AO to POST into; the original concern (TUI not live-rendering external POSTs) has a fix in flight at PR #26687. Once that merges, humans attached via TUI see orchestrator-injected messages in real time too.
  5. Codex parity. Do we want to invest engineering time helping unblock #22003 (PR to upstream), or accept the status quo (send-keys for Codex) until OpenAI ships?
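To make the cursor-file option in question 2 concrete, a polling check could look like the sketch below. The .ao/inbox.cursor name comes from the question; the single-integer file format (last-consumed id) is an assumption:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch of the cursor-file acknowledgement option (question 2); purely
// illustrative. The agent writes its last-consumed id; the orchestrator polls.
function readLastConsumed(workspacePath: string): number {
  const file = path.join(workspacePath, ".ao", "inbox.cursor");
  if (!fs.existsSync(file)) return 0; // nothing consumed yet
  return Number.parseInt(fs.readFileSync(file, "utf8").trim(), 10) || 0;
}

function isAcked(workspacePath: string, entryId: number): boolean {
  return readLastConsumed(workspacePath) >= entryId;
}
```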

References

AO codebase

  - packages/plugins/runtime-tmux/src/index.ts:132-169 (current tmux send-keys sendMessage path)

Claude Code

  - run_in_background Bash + <task-notification> injection (verified live in this session)

Copilot CLI

  - github/copilot-cli#2682 (bg shell completion notification, MCP vs bare shells)

OpenCode

  - anomalyco/opencode#26671 (TUI does not live-render externally POSTed prompts)
  - PR #26687 (fix in flight for the above)

Codex CLI

  - openai/codex#14314 ("Waited for background terminal" stuck-loop bug)
  - openai/codex#3968 (proper bg-session support, still open)
  - openai/codex#22003 (duplicate report with the AO use case)

Cursor

  - forum.cursor.com #122392 (foreground completion detection; resolved in v1.3.7, July 2025)

Filed by AO during this work

  - openai/codex#22003
  - anomalyco/opencode#26671