PR #1466 / Storage Redesign / Merge Case

Make AO storage match how AO actually works.

Agent Orchestrator is project-centered. The current storage layer is hash-centered. PR #1466 removes that mismatch, replaces fragile string metadata with JSON, and gives users a deliberate migration path.

Branch storage-redesign
Diff +6,014 / -2,177
Files 80 changed
Commits 50
Test files 40
Review rounds 11

This is not a cosmetic refactor. It removes a foundational inconsistency.

Users, the dashboard, and commands identify work by project. The filesystem should too.

Today AO stores project state under generated hash directories and keeps a storageKey indirection alive so the rest of the system can find it. That indirection leaks into core, CLI, web, recovery, config, and tests.

PR #1466 changes the storage model to the same unit the product already exposes: projectId. It also moves session metadata to JSON so lifecycle state, runtime handles, agent reports, and dashboard data are represented as structured data instead of encoded strings.

1 Storage Model
0 Runtime storageKey
54 Migration Tests
80 Files Changed
11 Review Rounds

The design becomes obvious on disk.

Before
~/.agent-orchestrator/{hash}-{project}/sessions/{sessionId}
~/.agent-orchestrator/{hash}-{project}/worktrees/
metadata: key=value strings
config: storageKey required
identity: sha256(configDir).slice(0,12)
After
~/.agent-orchestrator/projects/{projectId}/sessions/{sessionId}.json
~/.agent-orchestrator/projects/{projectId}/worktrees/
metadata: typed JSON
config: projectId is enough
identity: basename_sha256(path+origin).slice(0,10)
| Area | Old Model | New Model |
| --- | --- | --- |
| Project identity | Generated 12-hex hash + storageKey indirection. | Single canonical projectId (basename_hash10). |
| Session metadata | Flat key=value strings with ad hoc parsing. | JSON files with typed structured fields. |
| Metadata hooks | Shell wrappers write key=value; jq not required. | Shell wrappers write JSON with jq, fallback to sed when jq unavailable. |
| Project deletion | Resolve hash identity, then delete. | Delete projects/{projectId} directly. |
| Debugging | Need generated hash + flat string encodings. | Open a project folder, read JSON. |
| Project registration | Per-config registry, implicit. | Global config registry with collision detection, 409 responses. |
| Migration | No clear boundary. | ao migrate-storage with dry-run, rollback, partial-failure safety. |
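
The "After" layout is simple enough to express as three pure functions. This is a minimal sketch, not the code in paths.ts: the helper names and the unexpanded `~` root are assumptions made for illustration.

```typescript
// Hypothetical helpers illustrating the V2 layout. The real functions live in
// core/src/paths.ts and may have different names and signatures.
const STORAGE_ROOT = "~/.agent-orchestrator"; // left unexpanded for illustration

function projectDirSketch(projectId: string): string {
  return `${STORAGE_ROOT}/projects/${projectId}`;
}

function sessionFileSketch(projectId: string, sessionId: string): string {
  return `${projectDirSketch(projectId)}/sessions/${sessionId}.json`;
}

function worktreesDirSketch(projectId: string): string {
  return `${projectDirSketch(projectId)}/worktrees`;
}

// Every location is derived from projectId alone; no storageKey lookup is needed.
const examplePaths = {
  session: sessionFileSketch("myapp_0123456789", "s1"),
  worktrees: worktreesDirSketch("myapp_0123456789"),
};
```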

Where the lines actually go.

80 files, but not all files are equal. Most changes are mechanical (path replacements, test updates). The core logic that needs careful review is concentrated in a few files.

core (49 files): +4,531 / -1,535
cli (10 files): +575 / -286
web (15 files): +270 / -167
plugins (1 file): +42 / -23
root (2 files): +406

Core subsystem detail

The core package contains the highest-impact changes. Here is the breakdown by subsystem:

| Subsystem | Change | Nature |
| --- | --- | --- |
| migration/storage-v2.ts | +1,360 (new file) | Core migration logic. Primary review target. |
| global-config.ts | +137 / -321 | Global registry, collision detection, project registration. |
| paths.ts | +127 / -17 | V2 path functions, assertSafeProjectId, parseTmuxNameV2. |
| session-manager.ts | +101 / -97 | Wired to new paths and JSON metadata model. |
| agent-workspace-hooks.ts | +35 / -11 | Shell wrapper hooks updated for JSON metadata format. |
| types.ts | +18 / -39 | SessionMetadata type restructured for JSON. |
| __tests__/ (26 files) | +2,275 / -664 | Comprehensive test coverage including 54 migration tests. |

About half of the core package diff is test code (+2,275 in tests). Of the remaining production code, the migration engine is the largest single file (+1,360). The core design decisions live in storage-v2.ts, global-config.ts, paths.ts, and session-manager.ts.

The old abstraction is becoming a tax on every feature.

Product cost

Projects do not own their own storage

AO presents projects as the main unit, but storage still uses generated hash identities. That makes lookup, deletion, restoration, and recovery more complex than the product model requires.

Engineering cost

Structured state is squeezed into strings

Lifecycle state, runtime handles, dashboard state, and agent reports are object-shaped. Keeping them in key=value files creates parsing bridges and ambiguous values like "off" versus "false".
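
The ambiguity is easy to reproduce. The sketch below uses a hypothetical flat parser and made-up field names (autoRestart, attempts); it is not AO's actual parser, only an illustration of why every reader of a key=value file has to guess at types.

```typescript
// Hypothetical flat parser: every value comes back as a string, so callers
// must decide for themselves whether "off", "false", or "0" means false.
function parseFlatSketch(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf("=");
    if (idx > 0) out[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return out;
}

const flat = parseFlatSketch("autoRestart=off\nattempts=3");
// flat["autoRestart"] is the string "off"; flat["attempts"] is the string "3".

// JSON keeps the types the writer intended (field names are illustrative).
const typed = JSON.parse('{"autoRestart": false, "attempts": 3}') as {
  autoRestart: boolean;
  attempts: number;
};
```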

Migration cost

Waiting makes this harder

More sessions, plugins, metadata fields, and project flows all increase the eventual migration surface. This PR pays the cost once with an explicit upgrade path.

Review cost

Future storage changes get simpler

Reviewers no longer need to trace storageKey generation, origin files, collision handling, and flat metadata conversion just to reason about session behavior.

Hashed identity, JSON metadata, safe migration.

Identity

One canonical project ID

generateExternalId(path, originUrl) produces basename_sha256hash10. This single ID is used everywhere: storage paths, tmux session names, global registry, CLI, and dashboard. No more internal/external ID distinction.
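
A rough sketch of the ID shape, assuming the hash input is the plain concatenation of path and origin URL (the source only says sha256(path+origin), so the exact encoding is a guess). The function name carries a Sketch suffix to mark it as illustrative rather than the real generateExternalId.

```typescript
import { createHash } from "node:crypto";
import { basename } from "node:path";

// Sketch only: how path and origin are combined before hashing is an assumption.
function generateExternalIdSketch(projectPath: string, originUrl: string): string {
  const digest = createHash("sha256").update(projectPath + originUrl).digest("hex");
  // basename of the project path, an underscore, then the first 10 hex chars.
  return `${basename(projectPath)}_${digest.slice(0, 10)}`;
}

// Same inputs always give the same ID; a different origin gives a different one.
const idA = generateExternalIdSketch("/home/dev/myapp", "git@github.com:acme/myapp.git");
const idB = generateExternalIdSketch("/home/dev/myapp", "git@github.com:other/myapp.git");
```

Because the ID is derived from stable inputs, two checkouts with the same basename but different origins should land in different project folders.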

Metadata

JSON replaces key=value

Session files are .json with typed fields. Shell hooks (both PATH wrappers and Claude Code PostToolUse hooks) write JSON natively, with a sed-based fallback when jq is unavailable.

Migration

Explicit upgrade path

ao migrate-storage converts hash-based dirs to projects/{projectId}, rewrites worktree references, converts metadata to JSON, and sanitizes unsafe legacy IDs. Dry-run and rollback are first-class.

Project ID safety

assertSafeProjectId validates all project IDs at the path boundary: must match /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/, length 1-128, no path traversal. Legacy IDs that don't match are sanitized during migration via sanitizeLegacyProjectId.
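
The rule above fits in a few lines. This sketch uses the regex and length bounds quoted in the text; the function names, the error message, and the choice to rely on the regex alone to exclude traversal are assumptions, and the real assertSafeProjectId may perform additional checks.

```typescript
// Pattern and length bounds are taken from the description above.
const SAFE_PROJECT_ID = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;

function isSafeProjectIdSketch(id: string): boolean {
  return id.length >= 1 && id.length <= 128 && SAFE_PROJECT_ID.test(id);
}

// Throwing variant for use at the path boundary (message text is made up).
function assertSafeProjectIdSketch(id: string): void {
  if (!isSafeProjectIdSketch(id)) {
    throw new Error(`Unsafe project ID: ${JSON.stringify(id)}`);
  }
}
```

The pattern admits no slashes and no leading dot, so inputs such as `../etc` or `a/b` are rejected before they can reach a path join.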

The risky part is migration. The PR treats migration as the product.

The branch went through 11 review rounds focused on the failure modes that matter: rollback, moved worktrees, corrupt metadata, active tmux sessions, archive collisions, cross-device moves, partial failures, and interrupted migration state.

Fixed

Rollback data loss

Rollback checks for post-migration sessions and preserves V2 project data instead of recursively deleting it.

Fixed

Moved worktree paths

Migration moves worktrees, then patches session JSON to point at the new location. Previously the metadata still referenced the old path.

Fixed

Partial failure safety

If some projects fail migration, config is not stripped and the migration marker is not removed. Only a fully clean run completes the migration.

Fixed

Corrupt metadata

Bad JSON no longer takes down session listing or dashboard reads. Atomic write-and-rename prevents partial writes.
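
Both halves of that fix follow a well-known pattern. The sketch below is an assumption about the general technique, not the PR's code: write to a temporary file in the same directory and rename it over the target, and treat unparseable files as missing rather than fatal. Helper names and the temp-file suffix are hypothetical.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a temp file next to the target, then rename over it. On POSIX
// filesystems rename replaces the target atomically when both paths are on
// the same device, so readers never observe a half-written file.
function writeJsonAtomicSketch(file: string, data: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(data, null, 2));
  renameSync(tmp, file);
}

// Return null instead of throwing so one corrupt file cannot break a listing.
function readJsonOrNullSketch(file: string): unknown {
  try {
    return JSON.parse(readFileSync(file, "utf8"));
  } catch {
    return null;
  }
}

// Usage: round-trip a session file, then simulate corruption.
const sketchDir = mkdtempSync(join(tmpdir(), "ao-sketch-"));
const sketchFile = join(sketchDir, "session.json");
writeJsonAtomicSketch(sketchFile, { status: "working" });
const roundTrip = readJsonOrNullSketch(sketchFile);
writeFileSync(sketchFile, "{ not json");
const corrupt = readJsonOrNullSketch(sketchFile);
```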

Fixed

Unsafe legacy project IDs

Legacy project IDs with spaces, special chars, or path traversal are sanitized via sanitizeLegacyProjectId during migration.

Fixed

jq-unavailable JSON corruption

When jq is not installed, shell hooks now use sed-based JSON updates instead of falling back to key=value format that corrupts JSON files.

Fixed

False active-session blocks

Tmux detection is scoped to known project prefixes instead of broadly matching unrelated sessions. Dry-run skips active checks entirely.

Fixed

Archive collisions

Archive filenames include compact timestamps and PID suffixes to avoid concurrent overwrite cases.
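
One way to build such a name, purely as an illustration: the separator, field order, and timestamp layout below are assumptions, and the format used in the PR may differ.

```typescript
// Hypothetical naming scheme: sessionId, compact UTC timestamp, then PID.
function archiveNameSketch(sessionId: string, now: Date, pid: number): string {
  // "2024-05-01T12:34:56.789Z" becomes "20240501T123456Z"
  const stamp = now.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
  return `${sessionId}_${stamp}_${pid}.json`;
}

const exampleArchive = archiveNameSketch("s1", new Date("2024-05-01T12:34:56.789Z"), 4242);
```

The PID suffix is what separates two processes archiving the same session within the same second.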

Fixed

Project collision in web UI

POST /api/projects returns structured 409 with collision info. The AddProjectModal shows existing project, suggested ID, and resolution options.
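
The collision flow can be modeled without any HTTP framework. Everything in this sketch is hypothetical: the response field names, the numeric-suffix suggestion strategy, and the in-memory registry stand in for whatever the real route and global config do.

```typescript
// Illustrative response shapes; the actual payload fields are not specified
// in this guide and are assumed here.
interface CollisionBody {
  error: "project_collision";
  existingProjectId: string;
  suggestedProjectId: string;
}

type RegisterResult =
  | { status: 201; body: { projectId: string } }
  | { status: 409; body: CollisionBody };

function registerSketch(registry: Set<string>, requestedId: string): RegisterResult {
  if (!registry.has(requestedId)) {
    registry.add(requestedId);
    return { status: 201, body: { projectId: requestedId } };
  }
  // Suggest the first free numeric suffix instead of silently overwriting.
  let n = 2;
  while (registry.has(`${requestedId}-${n}`)) n++;
  return {
    status: 409,
    body: {
      error: "project_collision",
      existingProjectId: requestedId,
      suggestedProjectId: `${requestedId}-${n}`,
    },
  };
}
```

A structured body like this is what lets a client such as AddProjectModal offer choices instead of showing a generic error.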

11 review rounds shaped this PR.

This is not a one-shot branch. The migration logic was iterated across multiple review cycles, each catching real edge cases that would have hit production users.

Phase 1 — Core refactor

JSON metadata format, V2 path functions, storageKey removal, session manager wiring.

Phase 2 — Migration engine

Inventory, dry-run, worktree moves, metadata conversion, active-session detection.

Review round 1

Worktree path rewriting, archive location fixes, recovery log improvements.

Review round 2

Rollback data-loss fix, stray worktree recursion, JSON parse whitelist.

Review round 3

Crash safety, atomic ops, corrupt data handling, and hardening for 8 edge cases (EC-1 through EC-8).

Review round 4

V1 detection, git worktree repair, storageKey preservation for downgrade safety.

Review round 5

Zod schema gaps, worktree repair, rollback safety, status priority alignment.

Hashed identity

Single canonical project ID via generateExternalId. Collision detection in global config.

Review round 6 (final)

7 findings fixed: worktree metadata patching, partial-failure safety, legacy ID sanitization, jq-unavailable JSON corruption, CLI effective-ID return, web route safety, config repair fallback.

Copilot review

3 fixes: parseTmuxNameV2 digit-prefix regex, DELETE route unsafe-ID guard, POST route 409 collision response.

How to review this efficiently.

This is a large PR. Not all 80 files need equal attention. Most changes are mechanical (path replacements, test updates). Focus review time on the design decisions below.

Priority 1: Migration safety (read carefully)

| File | Lines | What to check |
| --- | --- | --- |
| core/src/migration/storage-v2.ts | +1,360 | The migration engine. Inventory, move logic, worktree repair, rollback, partial failure handling, legacy ID sanitization. This is the highest-risk file. |
| core/src/global-config.ts | +137/-321 | Global registry, collision detection, registerProjectInGlobalConfig. How duplicate project IDs are handled. |
| core/src/paths.ts | +127/-17 | assertSafeProjectId validation, V2 path functions. Security boundary for project ID input. |

Priority 2: Metadata format (scan for correctness)

| File | Lines | What to check |
| --- | --- | --- |
| core/src/agent-workspace-hooks.ts | +35/-11 | Shell wrapper hooks. JSON write logic and the sed-based fallback when jq unavailable. |
| plugins/agent-claude-code/src/index.ts | +42/-23 | PostToolUse hook template literal. Same JSON/sed pattern as workspace hooks. |
| core/src/session-manager.ts | +101/-97 | Main consumer of new paths and JSON metadata. Session CRUD operations. |

Priority 3: Mechanical changes (skim or skip)

| Category | Files | Nature |
| --- | --- | --- |
| Web | 15 files | Replace storageKey with projectId in components, API routes, tests. |
| CLI | 10 files | Use projectId instead of storageKey. New migrate-storage command. |
| Plugins | 1 file | agent-claude-code PostToolUse hook template updated for JSON. |
| Test updates | 40 files | Updated fixtures, path expectations, new migration test suite (54 tests). |

Suggested review order: Start with paths.ts (understand the new path contract), then storage-v2.ts (migration logic), then global-config.ts (registration), then agent-workspace-hooks.ts (shell hooks). Everything else follows from these four files.

40 test files. 54 migration-specific tests.

The migration is the riskiest part of this PR, so it has the deepest test coverage. The migration test suite covers every failure mode that was found during review.

Migration tests (54)

storage-v2.ts coverage

Inventory detection, dry-run fidelity, worktree moves and git repair, metadata JSON conversion, archive timestamp formats, rollback preservation, crash recovery, partial failure, legacy ID sanitization, active-session blocking, .migrated marker semantics.

Core tests (26 files)

+2,275 / -664

Session manager spawn/restore/archive, paths and project ID validation, metadata read/write/corrupt handling, lifecycle state machine, config validation, agent workspace hooks, global config registry.

Web tests (5 files)

+193 / -110

API route tests (projects CRUD, session routes), component tests (Dashboard, SessionCard), serialization tests, project detail route with degraded config handling.

CLI tests

Part of 10 CLI files

CLI command tests (start, spawn, status, migrate-storage, completion). Plugin test changes are minimal (1 file).

Issues found and fixed during review.

These are the concrete issues caught during the review process. Each was fixed, tested, and pushed before this guide was written.

Review finding

Worktree metadata not patched after move

tryMoveWorktree() moved the directory but left session JSON pointing at the old path. Now patches worktree field in the JSON after moving.

Review finding

Partial migration stripped config

If 3 of 5 projects migrated and 2 failed, config was still stripped and the migration marker removed. Now both are guarded by projectErrors.length === 0.
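
The guard itself is small. In this sketch only the projectErrors.length === 0 condition comes from the finding above; the function name, the error shape, and the two callbacks are hypothetical stand-ins for the real finalization steps in storage-v2.ts.

```typescript
interface ProjectErrorSketch {
  projectId: string;
  message: string;
}

// Hypothetical finalization step: destructive cleanup runs only after a fully
// clean migration. Returns true when the migration was finalized.
function finalizeMigrationSketch(
  projectErrors: ProjectErrorSketch[],
  actions: { stripLegacyConfig: () => void; removeMigrationMarker: () => void },
): boolean {
  if (projectErrors.length === 0) {
    actions.stripLegacyConfig();
    actions.removeMigrationMarker();
    return true;
  }
  // Leave config and marker untouched so the failed projects can be retried.
  return false;
}
```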

Review finding

Legacy IDs with special chars crash paths

Project IDs from legacy data could contain spaces, colons, or shell-unsafe characters. sanitizeLegacyProjectId() normalizes these during migration.

Review finding

jq-unavailable corrupts JSON

Shell hooks fell back to writing key=value into JSON files when jq was missing. Now uses sed-based JSON update preserving the file format.

Review finding

CLI didn't return effective project ID

registerProject returned void so the CLI always printed the requested ID, not the actual registered ID (which may have a suffix). Now returns the effective ID.

Review finding

Config repair couldn't find wrapped projects

repairWrappedLocalProjectConfig looked up by hashed project ID, but wrapped configs use the original name as key. Added single-entry fallback.

Copilot review

parseTmuxNameV2 rejected digit-prefixed names

Regex required [a-zA-Z] start, but config schema allows digit-leading sessionPrefix. Fixed to [a-zA-Z0-9].
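
The difference is a single character class. In the sketch below only the leading class ([a-zA-Z] versus [a-zA-Z0-9]) comes from the finding; the remainder of each pattern is an assumption, since the full parseTmuxNameV2 regex is not reproduced in this guide.

```typescript
// Only the first character class reflects the fix; the tail of each pattern
// is a simplified placeholder.
const prefixBefore = /^[a-zA-Z][a-zA-Z0-9_-]*$/;
const prefixAfter = /^[a-zA-Z0-9][a-zA-Z0-9_-]*$/;

const digitPrefix = "9to5"; // a digit-leading sessionPrefix
const prefixResults = {
  before: prefixBefore.test(digitPrefix),
  after: prefixAfter.test(digitPrefix),
};
```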

Copilot review

DELETE route crashed on unsafe project IDs

getProjectDir(id) throws for IDs with path traversal. DELETE handler now catches and returns 400 instead of 500.
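
The shape of that fix, sketched without a web framework: the resolver is passed in as a parameter so the example stays self-contained, and the result type and error text are hypothetical. The real handler calls getProjectDir and returns an HTTP response.

```typescript
type DeleteResultSketch =
  | { status: 200; deleted: string }
  | { status: 400; error: string };

// Convert a throwing resolver into a nullable one so validation failures can
// be told apart from later I/O failures.
function tryResolveDirSketch(id: string, resolveDir: (id: string) => string): string | null {
  try {
    return resolveDir(id);
  } catch {
    return null;
  }
}

function handleDeleteSketch(id: string, resolveDir: (id: string) => string): DeleteResultSketch {
  const dir = tryResolveDirSketch(id, resolveDir);
  if (dir === null) {
    // Bad input is the caller's fault: answer 400 rather than crash with 500.
    return { status: 400, error: "Invalid project ID" };
  }
  // A real handler would remove `dir` here; this sketch only reports it.
  return { status: 200, deleted: dir };
}
```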

Merge this so future AO work does not build on the wrong storage abstraction.

The alternative is to keep teaching every new feature about storageKey, hash directories, flat metadata, and legacy collision behavior. That is not cheaper; it spreads the migration cost across future PRs.

PR #1466 pays that cost once across 80 files, with a visible migration command, 54 migration-specific tests, 11 review rounds of hardening, and the safety checks expected for production user data.

50 commits
80 files
+6,014 / -2,177 lines
40 test files
54 migration tests
11 review rounds
6 changesets