AO messaging redesign — research findings

Author: Harshit (with Claude)
Date: 2026-05-10
Status: Findings ready for team review; no code shipped yet.

TL;DR

We're replacing tmux send-keys for orchestrator → agent notifications. The original assumption — "file-based for everyone" — was wrong. The right answer is per-agent transports, dispatched from each agent plugin's sendMessage. One common record (inbox.jsonl for durability), different delivery mechanisms.

| Agent | Transport | Status |
| --- | --- | --- |
| Claude Code | bg-watcher → process exit wakes agent | ✅ proven end-to-end |
| Copilot CLI | bg-watcher (same as Claude Code) | ✅ proven by Harshit on Copilot |
| OpenCode | POST /session/{id}/prompt_async to its HTTP server | ✅ proven end-to-end |
| Codex CLI | bg-bash is broken; keep send-keys for now | ❌ blocked on openai/codex#22003 |
| Cursor | bg-process completion does not auto-inject; needs send-keys equivalent | ❌ empirically verified — same bucket as Codex |

OpenCode integration is strictly easier than Claude Code's: no sidecar process inside the agent's session, no inbox file required for delivery — the orchestrator just hits an HTTP endpoint.


What we set out to solve

The orchestrator needs to deliver structured notifications to running agents — CI failed, review came in, PR ready to merge, etc. Today this goes through runtime.sendMessage, which for tmux runtime calls tmux send-keys and pastes the message into the agent's pty (packages/plugins/runtime-tmux/src/index.ts:132-169).

Problems with the current path (admitted in code comments):

  - The payload is an unstructured string pasted into the pty; there is no schema and no machine-readable record.
  - Delivery is fire-and-forget: nothing auditable confirms the agent actually received the message.
  - The wake-up depends on terminal byte streams reaching an idle REPL, which is timing-sensitive and fragile.

Goal: replace it with something where the data is structured, the delivery is auditable, and the wake-up doesn't depend on terminal byte streams.
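To make "structured" concrete, here is a hypothetical shape for one notification record, matching the { sessionId, kind, payload, ts } entry sketched later in this doc. Field names and the kind values are illustrative, not a committed schema:

```typescript
// Illustrative shape for a structured notification record (not a committed schema).
interface InboxEntry {
  id: number;                       // monotonic, for idempotency / acknowledgement
  sessionId: string;                // which agent session this targets
  kind: string;                     // e.g. "ci-failed", "review", "pr-ready"
  payload: Record<string, unknown>; // structured event data
  ts: string;                       // ISO-8601 timestamp
}

// One entry serialises to one JSON line in inbox.jsonl:
const entry: InboxEntry = {
  id: 1,
  sessionId: "ses_abc123",
  kind: "ci-failed",
  payload: { runUrl: "https://ci.example/run/42" },
  ts: new Date().toISOString(),
};
const line = JSON.stringify(entry);
```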


Experiment 1 — Claude Code: file watcher + bg-process exit

The pattern

A small Node script runs in the background as a one-shot watcher. It uses fs.watchFile to block until the target file (./inbox.jsonl) grows past a known offset. When new bytes arrive, it reads them, prints one JSON line to stdout, and exits with code 0.

Claude Code's runtime auto-injects a <task-notification> system reminder into the agent's context the moment a backgrounded shell command exits, while the agent is idle. The agent reads the new inbox content, treats it as a user prompt, and acts. Latency: ~500 ms (the watchFile poll interval).

What we proved

Live-tested in this session. Watcher script: ~30 lines of Node, no installs (uses built-in fs.watchFile, equivalent to chokidar's underlying mechanism on macOS). Spawned as Bash(run_in_background: true). Wrote to inbox.jsonl from a side terminal — agent received the wake-up, read the new lines, processed as user input, respawned watcher with the new offset. Loop runs forever.

Caveats

  - ~500 ms latency floor from the fs.watchFile poll interval.
  - One-shot: after every wake-up the agent must respawn the watcher with the new offset.
  - The 30-minute timeout exits with an empty payload; the agent must respawn the watcher then as well.
  - Injection happens only while the agent is idle; messages arriving mid-turn are picked up at the next idle point.

Watcher script (canonical)

#!/usr/bin/env node
// Usage: watcher.js <filePath> [fromOffset]
// Blocks until <filePath> grows past fromOffset, prints one JSON line, exits 0.
const fs = require("node:fs");
const [, , filePath, fromArg] = process.argv;
const fromOffset = Number.parseInt(fromArg ?? "0", 10) || 0;
const readNew = () => {
  if (!fs.existsSync(filePath)) return null; // file may not exist yet
  const size = fs.statSync(filePath).size;
  if (size <= fromOffset) return null;
  const fd = fs.openSync(filePath, "r");
  const buf = Buffer.alloc(size - fromOffset);
  fs.readSync(fd, buf, 0, buf.length, fromOffset);
  fs.closeSync(fd);
  return { newOffset: size, content: buf.toString("utf8") };
};
// Print one JSON payload and exit 0; the bg-process exit is the wake-up signal.
const emit = (p) => { process.stdout.write(JSON.stringify(p) + "\n"); process.exit(0); };
const initial = readNew();
if (initial?.content) emit({ event: "initial", ...initial });
// fs.watchFile polls stat() every 500 ms and keeps the event loop alive.
fs.watchFile(filePath, { interval: 500 }, () => {
  const r = readNew();
  if (r?.content) { fs.unwatchFile(filePath); emit({ event: "change", ...r }); }
});
// Safety valve: exit with an empty payload after 30 min so the agent can respawn.
setTimeout(() => emit({ event: "timeout", newOffset: fromOffset, content: "" }), 30 * 60 * 1000).unref();
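The byte-offset contract the watcher and its consumer share can be exercised in a self-contained sketch. This uses a temp file rather than a real inbox, and the id values are illustrative:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// The consumer remembers the byte offset it has read up to; each wake-up
// reads only the bytes past that offset (same arithmetic as readNew() above).
const inbox = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "ao-")), "inbox.jsonl");

fs.appendFileSync(inbox, JSON.stringify({ id: 1, kind: "ci-failed" }) + "\n");
const offset = fs.statSync(inbox).size; // consumer has seen everything so far

// Orchestrator appends a new entry (the "side terminal" write in the test above):
fs.appendFileSync(inbox, JSON.stringify({ id: 2, kind: "review" }) + "\n");

// Consumer reads only the bytes past its saved offset:
const size = fs.statSync(inbox).size;
const buf = Buffer.alloc(size - offset);
const fd = fs.openSync(inbox, "r");
fs.readSync(fd, buf, 0, buf.length, offset);
fs.closeSync(fd);
console.log(buf.toString("utf8").trim()); // only the id:2 line
```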

Per-agent compatibility — what works, what doesn't

The "wake-up channel from external event to agent's REPL" is the bottleneck. Here's what we found, both empirically and via the source.

✅ Claude Code — works natively

run_in_background: true on Bash + auto <task-notification> injection. Already used in production by AO in this session.

✅ Copilot CLI — works (surprising win)

Public docs and open issues (github/copilot-cli#2682) suggested bg shell completion notification only worked for MCP background agents, not bare bg shells. Harshit empirically verified that bare bg shell completion does wake the agent — works the same way as Claude Code with the same watcher script.

✅ OpenCode — works via HTTP, not via watcher (see Experiment 2 below)

❌ Codex CLI — broken

Bg-bash exists but the agent harness fails to detect process completion: known stuck-loop bug (openai/codex#14314, "Waited for background terminal" infinite log spam). Proper bg-session support still open at openai/codex#3968. Harshit filed a duplicate report with the AO use case at openai/codex#22003.

For Codex we must keep the existing send-keys-equivalent until upstream fixes land.

❌ Cursor — bg-completion does not wake the agent (empirically verified 2026-05-10)

Tested live with cursor-agent v1.x. Spawned the same Node fs.watchFile watcher as a backgrounded shell command from inside the agent's session, ended the agent's turn, then appended a line to ./inbox.jsonl from outside. The watcher detected the change, exited cleanly with the expected JSON payload (event:"change", newOffset:170, content matching), but cursor-agent's harness did not inject the output into the model's context. The agent only acted on the watcher's payload after the human typed a fresh message — at which point the harness made the prior bg-job's stdout available, but as a tool-result lookup, not a wake-up.

The historical bug Harshit asked us to investigate (forum.cursor.com #122392) was resolved in v1.3.7 (July 2025) — that one was about foreground command completion detection. The wake-up-on-bg-exit gap is a separate, still-open architectural issue.

Implication: Cursor goes in the same bucket as Codex — orchestrator → agent delivery needs a send-keys-style nudge until Cursor adds a wake-up channel. There's no HTTP server like OpenCode (we checked).


Experiment 2 — OpenCode: orchestrator hits the HTTP API

The discovery

OpenCode auto-starts a Hono HTTP server on port 4096 (or --port <N>) when the TUI launches, and opencode serve runs it headlessly. Among its routes, the two that matter here are:

  - POST /session/{id}/prompt_async: enqueue a user prompt (fire-and-forget; returns HTTP 204)
  - GET /session/{id}/message: read the session's message history

This is already a structured external messaging channel. We don't need to invent one.

What we proved

Live-tested with Harshit's running TUI bound to http://127.0.0.1:4096:

curl -X POST http://127.0.0.1:4096/session/ses_1ef2ff3d4ffe8HKxAfuQtDWNTC/prompt_async \
  -H 'Content-Type: application/json' \
  -d '{
    "parts":[{"type":"text","text":"Hello from AO orchestrator!"}],
    "model":{"providerID":"zai-coding-plan","modelID":"glm-5.1"}
  }'
# → HTTP 204

Server stored the user message. Agent ran inference. Response streamed back. Verified via GET /session/.../message.

Implication for AO's agent-opencode plugin

The OpenCode plugin's sendMessage becomes about a dozen lines:

async sendMessage(handle: RuntimeHandle, message: string): Promise<void> {
  const port = handle.data.opencodePort as number;
  const sessionID = handle.data.opencodeSessionID as string;
  const model = handle.data.opencodeModel as { providerID: string; modelID: string };
  await fetch(`http://127.0.0.1:${port}/session/${sessionID}/prompt_async`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      parts: [{ type: "text", text: message }],
      model,
    }),
  });
}

agent.getLaunchCommand becomes opencode serve --port <free-port> (so we know the port up-front). Capture port, sessionID, and model into handle.data at session creation. No tmux involvement, no inbox watcher, no plugin-side code inside the agent process. End of story.
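If the headless serve launch is adopted (open question 4 below resolves differently, but either way the port must be known up-front), picking a free port before launch could look like the sketch below. The buildLaunch shape is hypothetical; AO's actual getLaunchCommand signature may differ:

```typescript
import * as net from "node:net";

// Sketch, not AO's actual implementation: bind to port 0 so the OS assigns
// a free port, then close the probe server and reuse that port number.
function pickFreePort(): Promise<number> {
  return new Promise((resolve, reject) => {
    const srv = net.createServer();
    srv.listen(0, "127.0.0.1", () => {
      const { port } = srv.address() as net.AddressInfo;
      srv.close(() => resolve(port));
    });
    srv.on("error", reject);
  });
}

// Hypothetical launch builder: the port goes into handle.data so sendMessage
// can later POST to http://127.0.0.1:<port>/session/{id}/prompt_async.
async function buildLaunch(): Promise<{ command: string; data: Record<string, unknown> }> {
  const port = await pickFreePort();
  return {
    command: `opencode serve --port ${port}`,
    // sessionID / model would be captured once the server is up; omitted here.
    data: { opencodePort: port },
  };
}
```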


Bug filed during this work

anomalyco/opencode#26671: "TUI does not live-render messages when prompt is POSTed externally via /session/{id}/prompt_async"

OpenCode's TUI doesn't live-render messages whose prompt originated from an external HTTP POST — they appear only after Ctrl+C + resume. Web UI renders correctly. Server persists correctly. Cosmetic for AO (the human watches via the dashboard, not the agent's TUI), but worth fixing upstream.


Recommended messaging architecture for AO

Per-agent transport, single inbox record

orchestrator
   │
   ├─ append { sessionId, kind, payload, ts } to {workspace}/.ao/inbox.jsonl   (durable record)
   │
   └─ runtime/agent plugin's sendMessage()
        │
        ├─ Claude Code   → ensure inbox-watcher running in agent's bash; agent picks up via wake-up
        ├─ Copilot CLI   → same as Claude Code
        ├─ OpenCode      → fetch POST /session/{id}/prompt_async to the agent's local server
        ├─ Codex         → existing send-keys path (until upstream fixes #22003)
        └─ Cursor        → existing send-keys path (no wake-up channel exists, verified empirically)

inbox.jsonl becomes the durable log — replay, audit, debugging, idempotency. Each entry has a monotonic id; the agent persists the last-consumed id in its workspace.

The transport is per-agent. One mechanism per plugin, dispatched from sendMessage. The orchestrator doesn't care which.
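The per-agent dispatch could be sketched as follows. Agent names come from the table above; the Deliver signature is an assumption for illustration, since real delivery lives in each plugin's sendMessage:

```typescript
// Illustrative dispatch table; not AO's plugin API.
type AgentKind = "claude-code" | "copilot-cli" | "opencode" | "codex" | "cursor";

type Deliver = (message: string) => Promise<void>;

function transportFor(
  agent: AgentKind,
  impls: Record<"watcher" | "http" | "sendKeys", Deliver>,
): Deliver {
  switch (agent) {
    case "claude-code":
    case "copilot-cli":
      return impls.watcher;  // inbox file + bg-watcher wake-up
    case "opencode":
      return impls.http;     // POST /session/{id}/prompt_async
    case "codex":
    case "cursor":
      return impls.sendKeys; // legacy send-keys path
  }
}
```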

What this buys us

  - Structured, machine-readable notifications instead of strings pasted into a pty.
  - A durable, auditable record (inbox.jsonl) for replay, debugging, and idempotency.
  - Wake-ups that no longer depend on terminal byte streams, for every agent with a native channel.
  - Per-plugin isolation: each agent migrates independently and can fall back to send-keys.

Migration path (rough sketch)

  1. Add inbox-write helper to core (appendInboxEntry(workspacePath, entry)).
  2. Update each agent plugin's sendMessage to use its native transport.
  3. For Claude Code / Copilot — install the inbox watcher in setupWorkspaceHooks (system prompt instructs the agent to keep one running).
  4. For OpenCode — capture port / sessionID / model in handle.data; sendMessage POSTs to the local server.
  5. For Codex — keep the existing send-keys path; revisit when #22003 lands.
  6. Replace sendWithConfirmation's "best-effort + dispatch-hash" with explicit ack via the inbox last-consumed id.

This is incremental. Each agent plugin migrates independently. The orchestrator-side append-to-inbox can ship first behind a feature flag.
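The step-1 helper could be sketched as below. The name appendInboxEntry comes from the migration list; the signature, entry shape, and line-count id scheme are assumptions (fine for a single-writer orchestrator, not safe for concurrent writers):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch of appendInboxEntry: append one JSON line to {workspace}/.ao/inbox.jsonl
// and return its monotonic id, derived by counting existing lines.
function appendInboxEntry(
  workspacePath: string,
  entry: { sessionId: string; kind: string; payload: unknown },
): number {
  const dir = path.join(workspacePath, ".ao");
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, "inbox.jsonl");
  const existing = fs.existsSync(file)
    ? fs.readFileSync(file, "utf8").split("\n").filter(Boolean).length
    : 0;
  const id = existing + 1;
  fs.appendFileSync(file, JSON.stringify({ id, ts: new Date().toISOString(), ...entry }) + "\n");
  return id;
}

// Example against a temp workspace (illustrative only):
const ws = fs.mkdtempSync(path.join(os.tmpdir(), "ao-ws-"));
const firstId = appendInboxEntry(ws, { sessionId: "ses_abc123", kind: "ci-failed", payload: {} });
```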


Open questions for the team

  1. Where does the inbox file live? {workspace}/.ao/inbox.jsonl (gitignored, like the existing .ao/AGENTS.md pattern) is the natural fit. Confirm.
  2. Acknowledgement model. Last-consumed id persisted by the agent → orchestrator polls a .ao/inbox.cursor file? Or push-back via the same transport (e.g. agent calls a hook)? Push-back is cleaner but more code.
  3. Migration timeline. Ship the per-agent transports incrementally over multiple PRs, or wait until all four are ready and flip in one go? Incremental seems safer; behind AO_EXPERIMENTAL_FILE_INBOX env var.
  4. OpenCode launch command. Switching agent-opencode's launch to opencode serve --port <free-port> (headless). Resolved: keep the TUI launch. The TUI's bound HTTP server on port 4096 is sufficient for AO to POST into; the original concern (TUI not live-rendering external POSTs) has a fix in flight at PR #26687. Once that merges, humans attached via TUI see orchestrator-injected messages in real time too.
  5. Codex parity. Do we want to invest engineering time helping unblock #22003 (PR to upstream), or accept the status quo (send-keys for Codex) until OpenAI ships?
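To make the cursor-file option in question 2 concrete, a polling check could look like the sketch below. The .ao/inbox.cursor name comes from the question; the single-integer file format (last-consumed id) is an assumption:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch of the cursor-file acknowledgement option (question 2); purely
// illustrative. The agent writes its last-consumed id; the orchestrator polls.
function readLastConsumed(workspacePath: string): number {
  const file = path.join(workspacePath, ".ao", "inbox.cursor");
  if (!fs.existsSync(file)) return 0; // nothing consumed yet
  return Number.parseInt(fs.readFileSync(file, "utf8").trim(), 10) || 0;
}

function isAcked(workspacePath: string, entryId: number): boolean {
  return readLastConsumed(workspacePath) >= entryId;
}
```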

References

AO codebase

  - packages/plugins/runtime-tmux/src/index.ts:132-169 (current tmux send-keys sendMessage path)

Claude Code

  - run_in_background Bash + <task-notification> injection (verified live in this session)

Copilot CLI

  - github/copilot-cli#2682 (bg shell completion notification, MCP vs bare shells)

OpenCode

  - anomalyco/opencode#26671 (TUI does not live-render externally POSTed prompts)
  - PR #26687 (fix in flight for the above)

Codex CLI

  - openai/codex#14314 ("Waited for background terminal" stuck-loop bug)
  - openai/codex#3968 (proper bg-session support, still open)
  - openai/codex#22003 (duplicate report with the AO use case)

Cursor

  - forum.cursor.com #122392 (foreground completion detection; resolved in v1.3.7, July 2025)

Filed by AO during this work

  - openai/codex#22003
  - anomalyco/opencode#26671