ace-swarm 2.1.2 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (121)
  1. package/CHANGELOG.md +52 -1
  2. package/README.md +86 -40
  3. package/assets/.agents/ACE/AGENT_REGISTRY.md +7 -1
  4. package/assets/.agents/ACE/agent-eval/instructions.md +41 -1
  5. package/assets/.agents/ACE/agent-memory/instructions.md +35 -1
  6. package/assets/.agents/ACE/agent-observability/instructions.md +35 -1
  7. package/assets/.agents/ACE/agent-release/instructions.md +34 -1
  8. package/assets/.agents/ACE/agent-security/instructions.md +35 -1
  9. package/assets/.agents/ACE/agent-skeptic/instructions.md +49 -0
  10. package/assets/.agents/ACE/orchestrator/AGENTS.md +11 -0
  11. package/assets/.github/hooks/ace-copilot.json +16 -16
  12. package/assets/agent-state/ACE_WORKFLOW.md +65 -0
  13. package/assets/agent-state/INTERFACE_REGISTRY.md +1 -0
  14. package/assets/agent-state/MODULES/schemas/ACE_RUNTIME_PROFILE.schema.json +79 -0
  15. package/assets/agent-state/MODULES/schemas/VERICIFY_PROCESS_POST_LOG.schema.json +1 -0
  16. package/assets/scripts/ace-hook-dispatch.mjs +447 -0
  17. package/assets/scripts/copilot-hook-dispatch.mjs +1 -265
  18. package/assets/scripts/render-mcp-configs.sh +328 -1
  19. package/assets/tasks/README.md +26 -0
  20. package/dist/ace-autonomy.d.ts +137 -0
  21. package/dist/ace-autonomy.d.ts.map +1 -0
  22. package/dist/ace-autonomy.js +472 -0
  23. package/dist/ace-autonomy.js.map +1 -0
  24. package/dist/ace-context.d.ts +29 -0
  25. package/dist/ace-context.d.ts.map +1 -0
  26. package/dist/ace-context.js +240 -0
  27. package/dist/ace-context.js.map +1 -0
  28. package/dist/ace-internal-tools.d.ts +8 -0
  29. package/dist/ace-internal-tools.d.ts.map +1 -0
  30. package/dist/ace-internal-tools.js +76 -0
  31. package/dist/ace-internal-tools.js.map +1 -0
  32. package/dist/ace-server-instructions.d.ts +12 -0
  33. package/dist/ace-server-instructions.d.ts.map +1 -0
  34. package/dist/ace-server-instructions.js +299 -0
  35. package/dist/ace-server-instructions.js.map +1 -0
  36. package/dist/agent-runtime/role-adapters.d.ts.map +1 -1
  37. package/dist/agent-runtime/role-adapters.js +47 -6
  38. package/dist/agent-runtime/role-adapters.js.map +1 -1
  39. package/dist/helpers.d.ts.map +1 -1
  40. package/dist/helpers.js +90 -0
  41. package/dist/helpers.js.map +1 -1
  42. package/dist/internal-tool-runtime.d.ts +21 -0
  43. package/dist/internal-tool-runtime.d.ts.map +1 -0
  44. package/dist/internal-tool-runtime.js +136 -0
  45. package/dist/internal-tool-runtime.js.map +1 -0
  46. package/dist/local-model-runtime.d.ts +36 -0
  47. package/dist/local-model-runtime.d.ts.map +1 -0
  48. package/dist/local-model-runtime.js +161 -0
  49. package/dist/local-model-runtime.js.map +1 -0
  50. package/dist/model-bridge.d.ts +54 -0
  51. package/dist/model-bridge.d.ts.map +1 -0
  52. package/dist/model-bridge.js +587 -0
  53. package/dist/model-bridge.js.map +1 -0
  54. package/dist/orchestrator-supervisor.d.ts +100 -0
  55. package/dist/orchestrator-supervisor.d.ts.map +1 -0
  56. package/dist/orchestrator-supervisor.js +399 -0
  57. package/dist/orchestrator-supervisor.js.map +1 -0
  58. package/dist/prompts.d.ts.map +1 -1
  59. package/dist/prompts.js +101 -0
  60. package/dist/prompts.js.map +1 -1
  61. package/dist/public-surface.d.ts.map +1 -1
  62. package/dist/public-surface.js +6 -0
  63. package/dist/public-surface.js.map +1 -1
  64. package/dist/resources.d.ts.map +1 -1
  65. package/dist/resources.js +29 -0
  66. package/dist/resources.js.map +1 -1
  67. package/dist/runtime-executor.d.ts.map +1 -1
  68. package/dist/runtime-executor.js +121 -0
  69. package/dist/runtime-executor.js.map +1 -1
  70. package/dist/runtime-profile.d.ts +18 -0
  71. package/dist/runtime-profile.d.ts.map +1 -1
  72. package/dist/runtime-profile.js +39 -3
  73. package/dist/runtime-profile.js.map +1 -1
  74. package/dist/schemas.js +1 -1
  75. package/dist/schemas.js.map +1 -1
  76. package/dist/server.d.ts +5 -1
  77. package/dist/server.d.ts.map +1 -1
  78. package/dist/server.js +9 -1
  79. package/dist/server.js.map +1 -1
  80. package/dist/shared.d.ts +3 -3
  81. package/dist/shared.d.ts.map +1 -1
  82. package/dist/shared.js +1 -0
  83. package/dist/shared.js.map +1 -1
  84. package/dist/tools-agent.d.ts +1 -0
  85. package/dist/tools-agent.d.ts.map +1 -1
  86. package/dist/tools-agent.js +456 -1
  87. package/dist/tools-agent.js.map +1 -1
  88. package/dist/tools-framework.d.ts.map +1 -1
  89. package/dist/tools-framework.js +366 -128
  90. package/dist/tools-framework.js.map +1 -1
  91. package/dist/tools-memory.d.ts.map +1 -1
  92. package/dist/tools-memory.js +80 -0
  93. package/dist/tools-memory.js.map +1 -1
  94. package/dist/tui/agent-runner.d.ts +6 -0
  95. package/dist/tui/agent-runner.d.ts.map +1 -1
  96. package/dist/tui/agent-runner.js +15 -1
  97. package/dist/tui/agent-runner.js.map +1 -1
  98. package/dist/tui/agent-worker.d.ts +3 -1
  99. package/dist/tui/agent-worker.d.ts.map +1 -1
  100. package/dist/tui/agent-worker.js +117 -9
  101. package/dist/tui/agent-worker.js.map +1 -1
  102. package/dist/tui/chat.d.ts +19 -0
  103. package/dist/tui/chat.d.ts.map +1 -1
  104. package/dist/tui/chat.js +108 -0
  105. package/dist/tui/chat.js.map +1 -1
  106. package/dist/tui/index.d.ts +1 -0
  107. package/dist/tui/index.d.ts.map +1 -1
  108. package/dist/tui/index.js +3 -0
  109. package/dist/tui/index.js.map +1 -1
  110. package/dist/vericify-bridge.d.ts +5 -1
  111. package/dist/vericify-bridge.d.ts.map +1 -1
  112. package/dist/vericify-bridge.js +10 -0
  113. package/dist/vericify-bridge.js.map +1 -1
  114. package/dist/vericify-context.d.ts +10 -0
  115. package/dist/vericify-context.d.ts.map +1 -0
  116. package/dist/vericify-context.js +72 -0
  117. package/dist/vericify-context.js.map +1 -0
  118. package/dist/workspace-manager.d.ts.map +1 -1
  119. package/dist/workspace-manager.js +13 -2
  120. package/dist/workspace-manager.js.map +1 -1
  121. package/package.json +2 -1
package/CHANGELOG.md CHANGED
@@ -1,6 +1,57 @@
  # Changelog
 
- This changelog is based on local tagged release history and repository diffs.
+ ## [2.3.0] - 2026-03-28
+
+ ### Added
+
+ **Orchestrator Supervisor:**
+ - Added `run_orchestrator` MCP tool for full task-plan execution with explicit step routing, sequential/parallel dispatch, and handoff create/ack flows.
+ - Added Vericify context carry-forward and delta recovery for state continuity across plan execution.
+ - Added circuit breaker open/close hooks for upstream failure handling.
+ - Added plan amendment support during mid-execution.
+
+ **Model Bridge Context Management:**
+ - Added context budget calculation with configurable reserve (default 25%) to prevent overflow.
+ - Added tier fallback (full → compressed → brief) when context exceeds budget.
+ - Added automatic conversation history compaction (summarize older messages, retain recent messages + system).
+ - Added large tool-result write-through to disk with configurable truncation (default 3000 chars).
+ - Added retry-once on transient provider errors (abort, timeout, rate-limit).
+
+ **ACE Compliance Enforcement:**
+ - Added 5-layer ACE compliance enforcement: MCP instructions, per-host instruction files, lifecycle hooks, tool-level guidance, and bootstrap-on-first-tool.
+ - Added per-host instruction file generation for CLAUDE.md, .cursorrules, .github/copilot-instructions.md, and Codex AGENTS.md.
+ - Added lifecycle hook dispatcher (`ace-hook-dispatch.mjs`) with role-aware tool hints for 7 hosts.
+ - Added tool-level soft enforcement via guidance appended to all tool results.
+ - Added bootstrap nudge lifecycle for lazy ACE context loading on first mutation tool.
+
+ **Tool Governance:**
+ - Added named sets for explicit tool classification: BOOTSTRAP_TOOLS, REFRESH_TOOLS, NAMED_MUTATION_TOOLS.
+ - Added maintenance comments on mutable tool sets to prevent accidental out-of-sync registrations.
+ - Added Vericify resolution scoping with `startsWith` validation to prevent parent-directory leakage.
+
+ ### Changed
+
+ - Changed agent worker and runner behavior so blocked or idle TUI agents can accept new input and resume without relaunching.
+ - Expanded hook dispatch context to emit role-aware ACE tool hints from session, subagent, and tool lifecycle events.
+ - Updated bootstrap and config rendering to generate multi-host hook artifacts under `.vscode`, `.cursor`, and `.mcp-config`.
+
+ ### Validation
+
+ - Verified `208/208` tests passing (up from 202) with full orchestrator supervisor, model bridge, and compliance enforcement coverage.
+
+ ## [2.2.0] - 2026-03-26
+
+ ### Added
+ - Added `ace turnkey` command for rapid workspace bootstrap with one command.
+ - Added `ace tui` command for a full terminal operator surface with status, tasks, chat, and telemetry.
+ - Expanded the packaged skill system to 15 skills, including `landing-review-watcher`, `problem-triage`, `skill-auditor`, and `astgrep-index`.
+ - Finalized the 17-agent role system (4 swarm, 13 composable).
+ - Added automatic MCP client configuration for Codex, VS Code, Claude Desktop, Cursor, and Antigravity.
+ - Expanded the MCP tool surface to 70+ tools across 12 domain-specific modules.
+
+ ### Packaging
+ - Bumped package version from `2.1.0` to `2.2.0`.
+ - Verified 170/170 tests passing on the final v2.2.0 release candidate.
 
  ## [2.1.0] - 2026-03-13
 
package/README.md CHANGED
@@ -63,30 +63,62 @@ ACE Swarm
  ## Recent Releases
 
  ```text
- +---------+------------+------------------------------------------------------+
- | Version | Date | Highlights |
- +---------+------------+------------------------------------------------------+
- | 2.1.0 | 2026-03-13 | README redesign, ASCII docs, packaged changelog |
- | 2.0.7 | 2026-03-13 | runtime executor/profile/tools, trackers, workspace |
- | 2.0.6 | 2026-03-10 | bootstrap/workspace helper hardening |
- +---------+------------+------------------------------------------------------+
+ +------------+------------+------------------------------------------------------------------+
+ | Version | Date | Highlights |
+ +------------+------------+------------------------------------------------------------------+
+ | 2.3.0 | 2026-03-28 | Orchestrator supervisor, context-aware model bridge, ACE |
+ | | | compliance enforcement, Vericify integration, tool governance |
+ | 2.2.0 | 2026-03-26 | turnkey bootstrap, TUI, expanded skills/roles, client configs |
+ | 2.1.0 | 2026-03-13 | README redesign, ASCII docs, packaged changelog |
+ +------------+------------+------------------------------------------------------------------+
  ```
 
- See [CHANGELOG.md](./CHANGELOG.md) for the recent tagged release history that can be verified from the local repository.
+ See [CHANGELOG.md](./CHANGELOG.md) for detailed release notes.
+
+ ### What's New in 2.3.0
+
+ **Orchestrator Supervisor (`run_orchestrator` MCP tool):**
+ - Full task-plan execution loop with explicit step routing and sequential/parallel dispatch
+ - Handoff create/ack flows with fallback ID generation
+ - Vericify context carry-forward and delta recovery for state continuity
+ - Circuit breaker open/close for upstream failure handling
+ - Plan amendment support during execution
+
+ **Model Bridge Context Management:**
+ - Context budget calculation with configurable reserve (default 25%)
+ - Tier fallback (full → compressed → brief) when context overflows
+ - Automatic conversation history compaction (summarize older messages, retain recent + system)
+ - Large tool-result write-through to disk with configurable truncation
+ - Retry-once on transient provider errors (abort, timeout, rate-limit)
+
+ **ACE Compliance Enforcement (5 Stacked Layers):**
+ - Layer 1: MCP server initialization instructions (all hosts)
+ - Layer 2: Per-host instruction files (CLAUDE.md, .cursorrules, .github/copilot-instructions.md)
+ - Layer 3: Lifecycle hooks with role-aware tool hints (7 hosts)
+ - Layer 4: Tool-level soft enforcement (guidance on all tool results)
+ - Layer 5: Bootstrap-on-first-tool lazy enforcement
+
+ **Tool Governance & Vericify:**
+ - Named sets for explicit tool classification (BOOTSTRAP_TOOLS, REFRESH_TOOLS, NAMED_MUTATION_TOOLS)
+ - Vericify resolution scoping to prevent parent-directory leakage
+ - Full Vericify API integration with graceful degradation fallback
+
+ **Test Coverage:** 208 passing tests (up from 202)
 
  ## Core Surfaces
 
  ```text
- +------------------+------------------------------------------+--------------------------------------------------+
- | Surface | Internals | Public entrypoints |
- +------------------+------------------------------------------+--------------------------------------------------+
- | Bootstrap | workspace state, tasks, configs | `ace init`, `ace turnkey`, `ace mcp-config` |
- | Agent runtime | swarm + composable roles | MCP server, TUI `/agent`, packaged agent assets |
- | Durable state | ledgers, TODOs, handoffs, status events | runtime stores under `agent-state/*` |
- | Dispatch | queue, lock table, lease, recovery | `enqueue_job`, `dispatch_jobs`, `complete_job` |
- | Change analysis | delta scan, semantic diff, rewrite hints | `scan_workspace_delta`, `semantic_diff`, etc. |
- | Operator UI | tabs, chat, telemetry, model switching | `ace tui` |
- +------------------+------------------------------------------+--------------------------------------------------+
+ +------------------+----------------------------------------------+------------------------------------------------------+
+ | Surface | Internals | Public entrypoints |
+ +------------------+----------------------------------------------+------------------------------------------------------+
+ | Bootstrap | workspace state, tasks, configs | `ace init`, `ace turnkey`, `ace mcp-config` |
+ | Agent runtime | swarm roles, model bridge, tool planning | MCP server, TUI `/agent`, packaged agent assets |
+ | Durable state | ledgers, TODOs, handoffs, status events | runtime stores under `agent-state/*` |
+ | Dispatch | queue, lock table, lease, recovery | `enqueue_job`, `dispatch_jobs`, `complete_job` |
+ | Supervision | plan amendment, handoff, Vericify, breakers | `superviseTaskPlan`, `amendTaskPlan` |
+ | Change analysis | delta scan, semantic diff, rewrite hints | `scan_workspace_delta`, `semantic_diff`, etc. |
+ | Operator UI | tabs, ACE chat bridge, telemetry, providers | `ace tui` |
+ +------------------+----------------------------------------------+------------------------------------------------------+
  ```
 
  ## Workspace Internals
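
The context-budget and tier-fallback bullets under "Model Bridge Context Management" in the hunk above can be sketched as follows. This is a minimal illustration, not the package's `ModelBridge` code: the names `contextBudget`, `pickTier`, and `TierCandidate` are hypothetical, and the token counts are assumed to come from some external estimator. Only the 25% default reserve and the full → compressed → brief order are taken from the release notes.

```typescript
// Hypothetical names throughout; this is not the package's ModelBridge API.
type Tier = "full" | "compressed" | "brief";

interface TierCandidate {
  tier: Tier;
  tokens: number; // estimated size of the context rendered at this tier
}

// Usable budget = context window minus a reserved fraction (default 25%).
function contextBudget(windowTokens: number, reserve: number = 0.25): number {
  if (reserve < 0 || reserve >= 1) {
    throw new RangeError("reserve must be in [0, 1)");
  }
  return Math.floor(windowTokens * (1 - reserve));
}

// Walk full -> compressed -> brief and return the first tier that fits.
// Returns null when even the smallest candidate exceeds the budget.
function pickTier(candidates: TierCandidate[], budget: number): Tier | null {
  const order: Tier[] = ["full", "compressed", "brief"];
  for (const tier of order) {
    const candidate = candidates.find((c) => c.tier === tier);
    if (candidate !== undefined && candidate.tokens <= budget) {
      return tier;
    }
  }
  return null;
}
```

With an 8000-token window the default reserve leaves a 6000-token budget, so a 7000-token full rendering is skipped in favor of a 5000-token compressed one. Returning `null` rather than silently truncating leaves the caller to decide what happens when nothing fits.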
@@ -191,24 +223,37 @@ See [CHANGELOG.md](./CHANGELOG.md) for the recent tagged release history that ca
  Provider notes:
 
  - Ollama works directly through `ace tui`.
+ - ACE workspaces automatically route chat tabs through the ACE model bridge, including a pre-execution tool-selection pass and streamed tool/result updates in the transcript.
  - OpenAI-compatible providers use `OPENAI_API_KEY` and optional `OPENAI_BASE_URL`, or provider-scoped equivalents such as `CODEX_API_KEY` and `CODEX_BASE_URL`.
+ - `/agent` sessions can now accept follow-up input while blocked or idle, so operators can continue a run without relaunching the worker tab.
  - VS Code Copilot model hints may appear in discovery, but the standalone TUI does not directly execute through the VS Code chat runtime without an extension-side bridge.
 
+ ## Supervisor And Bridge Notes
+
+ - `ModelBridge` now performs a tool-planning turn before execution and can amend the model-selected scope mid-run when another valid ACE tool is required.
+ - Supervisor helpers now support sequential task-plan execution with amendments, handoff create/ack flows, Vericify context carry-forward, and circuit-breaker open/close hooks.
+ - Hook dispatch now emits richer role-aware ACE context across workspace lifecycle events, including Codex, Claude, VS Code, Cursor, Gemini, and Antigravity hook bundles.
+
  ## Bootstrap Outputs
 
  ```text
- +----------------------------------+----------------------------------------------+
- | Path | Output |
- +----------------------------------+----------------------------------------------+
- | `agent-state/*` | state, ledgers, schemas, reports |
- | `.agents/ACE/*` | packaged agent instructions |
- | `.agents/skills/*` | packaged skills |
- | `tasks/*` | task packs, examples, handoff templates |
- | `scripts/ace/*` | helper scripts |
- | `.vscode/mcp.json` | VS Code MCP config |
- | `.mcp-config/*` | Codex / VS Code / Claude / Cursor / Antigravity |
- | `.github/hooks/*.json` | optional workspace hook policies |
- +----------------------------------+----------------------------------------------+
+ +-----------------------------------+------------------------------------------------------+
+ | Path | Output |
+ +-----------------------------------+------------------------------------------------------+
+ | `agent-state/*` | state, ledgers, schemas, reports |
+ | `.agents/ACE/*` | packaged agent instructions |
+ | `.agents/skills/*` | packaged skills |
+ | `tasks/*` | task packs, examples, handoff templates |
+ | `scripts/ace/*` | helper scripts |
+ | `.vscode/mcp.json` | VS Code MCP config |
+ | `.vscode/ace-hooks.json` | VS Code hook bundle |
+ | `.cursor/hooks.json` | Cursor hook bundle |
+ | `.mcp-config/codex.hooks.toml` | Codex hook config |
+ | `.mcp-config/gemini.hooks.json` | Gemini hook bundle |
+ | `.mcp-config/antigravity.hooks.json` | Antigravity hook bundle |
+ | `.mcp-config/*` | client configs for Codex / VS Code / Claude / Cursor / Gemini / Antigravity |
+ | `.github/hooks/*.json` | optional workspace hook policies |
+ +-----------------------------------+------------------------------------------------------+
  ```
 
  Local-model bootstrap:
@@ -223,16 +268,17 @@ ace doctor --llm ollama --model llama3.1:8b
  ## Client Config Targets
 
  ```text
- +---------------+--------------------------------------------------------------+
- | Client | Target |
- +---------------+--------------------------------------------------------------+
- | Codex | `$CODEX_HOME/config.toml` or `~/.codex/config.toml` |
- | VS Code | `.vscode/mcp.json` |
- | Claude | macOS: `~/Library/Application Support/Claude/...` |
- | | Linux: `~/.config/Claude/claude_desktop_config.json` |
- | Cursor | `~/.cursor/mcp.json` |
- | Antigravity | import `.mcp-config/antigravity.mcp.json` in client UI |
- +---------------+--------------------------------------------------------------+
+ +---------------+---------------------------------------------------------------------+
+ | Client | Target |
+ +---------------+---------------------------------------------------------------------+
+ | Codex | `$CODEX_HOME/config.toml` or `~/.codex/config.toml` |
+ | VS Code | `.vscode/mcp.json` and `.vscode/ace-hooks.json` |
+ | Claude | macOS: `~/Library/Application Support/Claude/...` |
+ | | Linux: `~/.config/Claude/claude_desktop_config.json` |
+ | Cursor | `~/.cursor/mcp.json` or workspace `.cursor/hooks.json` |
+ | Gemini | import `.mcp-config/gemini.hooks.json` with the generated MCP config |
+ | Antigravity | import `.mcp-config/antigravity.mcp.json` and `.mcp-config/antigravity.hooks.json` |
+ +---------------+---------------------------------------------------------------------+
  ```
 
  ## Included Assets
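
The README and changelog changes describe Vericify resolution scoping with `startsWith` validation to prevent parent-directory leakage. A minimal sketch of that kind of check is shown below, assuming a Node.js runtime. The helper name `isInsideWorkspace` is hypothetical and the package's actual resolver may differ. The trailing-separator step matters because a bare `startsWith` on the root would also accept sibling directories that merely share a prefix.

```typescript
import * as path from "node:path";

// Hypothetical helper; the package's actual Vericify resolver may differ.
function isInsideWorkspace(workspaceRoot: string, candidate: string): boolean {
  const root = path.resolve(workspaceRoot);
  // Resolve relative candidates against the root; absolute ones stay absolute.
  const target = path.resolve(root, candidate);
  // Append the separator so "/ws-other" is not accepted as a child of "/ws".
  const prefix = root.endsWith(path.sep) ? root : root + path.sep;
  return target === root || target.startsWith(prefix);
}
```

A relative path such as `agent-state/state.json` resolves inside the root and is accepted, while `../outside.json` or an absolute path under a sibling directory is rejected.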
package/assets/.agents/ACE/AGENT_REGISTRY.md CHANGED
@@ -2,6 +2,12 @@
 
  Version: 3.0.0
 
+ ## Hierarchy Lock
+
+ - Primary ingress is locked to four swarm agents: `orchestrator`, `vos`, `ui`, and `coders`.
+ - Composable agents are delegated specialists that operate under an active swarm task, not replacements for the swarm layer.
+ - Direct operator invocation of a composable agent is allowed only for bounded specialist work; it is not the default routing path for new general tasks.
+
  ## Registry Table
 
  | Agent | Group | Objective | Inputs | Outputs | Emits | Consumes | Default Skills |
@@ -26,7 +32,7 @@ Version: 3.0.0
 
  ## Dependency Rules
 
- 1. Orchestrator routes; composable modules execute bounded contracts.
+ 1. Top-level routing terminates at a swarm agent; composable modules execute bounded contracts under that swarm layer.
  2. Skeptic and ops can block downstream flow when gates fail.
  3. Release cannot approve without eval + security + observability evidence.
  4. Memory sidecar may run continuously but cannot mutate source truth artifacts except designated memory files.
package/assets/.agents/ACE/agent-eval/instructions.md CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-eval'
  ---
  # agent-eval Execution Instructions
 
- ## Objective
+ ## Operating Objective
 
  Run deterministic evaluation checks and publish confidence/regression outcomes.
 
+ ## When To Use It
+
+ - After autonomy, routing, runtime, or hook changes that can shift ACE behavior.
+ - Before promotion when evidence needs a deterministic regression verdict.
+ - When comparing candidate implementations, profiles, or prompt/runtime variants.
+ - After bug fixes that claim to close a previous failure mode.
+ - Do not use this role for speculative design or open-ended research; evaluation requires a concrete target and pass/fail surface.
+
  ## Required Loop
 
  1. `[STATE_ANALYSIS]` Identify triggered evaluation scope.
@@ -14,3 +22,35 @@ Run deterministic evaluation checks and publish confidence/regression outcomes.
  3. `[EXECUTION_LOG]` Run suites and capture raw outcomes.
  4. `[ARTIFACT_UPDATE]` Update `EVAL_REPORT.md` and evidence links.
  5. `[VERIFICATION]` Emit pass/fail/hold decision.
+
+ ## Evaluation Inputs
+
+ - `TASK.md`, `SCOPE.md`, `QUALITY_GATES.md`, and `HANDOFF.json` for the active objective and gate surface.
+ - Package or workspace test suites, fixtures, and golden outputs.
+ - `STATUS_EVENTS.ndjson` and `run-ledger.json` when regression history or prior failures matter.
+ - `EVIDENCE_LOG.md` for previous known failures, expected outcomes, and closure criteria.
+
+ ## Decision Rules
+
+ - `pass`: deterministic suites meet the declared threshold and no unexplained regressions remain.
+ - `hold`: evaluation signal is incomplete, stale, or inconclusive, so promotion should pause pending clearer evidence.
+ - `fail`: a regression, contract violation, or unexplained drift is reproduced with raw evidence.
+
+ ## Evidence And Artifact Contract
+
+ - Update `EVAL_REPORT.md` with suite name, fixture/scope, threshold, and verdict.
+ - Preserve raw outputs or exit codes in `EVIDENCE_LOG.md` rather than paraphrasing them away.
+ - Link failures to the owning gate, risk, or handoff so downstream routing stays deterministic.
+ - If a suite is missing or non-deterministic, log that explicitly as a hold condition instead of silently skipping it.
+
+ ## Example Invocations
+
+ - `Run eval on the autonomy preflight path after this runtime change.`
+ - `Compare the new routing behavior against the previous fixture set.`
+ - `Re-run the regression suite for the reported skeptic failure before release.`
+
+ ## Troubleshooting
+
+ - If no deterministic suite exists, emit `hold` and specify the missing fixture or harness.
+ - If outcomes differ across runs, treat the instability itself as evidence and document the drift condition.
+ - If the evaluation target is unclear, route back to `agent-spec` or `agent-ops` instead of guessing the scope.
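
The `pass` / `hold` / `fail` rules added to the agent-eval instructions can be read as a small decision function. The sketch below is one possible encoding, not something shipped in the package: `SuiteResult` and `decideEval` are hypothetical names, and treating a below-threshold score as `fail` is an assumption, since the instructions do not spell that case out.

```typescript
type Verdict = "pass" | "hold" | "fail";

// Hypothetical shape for one suite outcome; not a type from the package.
interface SuiteResult {
  name: string;
  deterministic: boolean; // false when outcomes differ across runs
  score: number | null; // null when the suite is missing or did not run
  threshold: number;
  regressionReproduced: boolean; // a regression was reproduced with raw evidence
}

function decideEval(results: SuiteResult[]): Verdict {
  // A reproduced regression is a fail regardless of the other suites.
  if (results.some((r) => r.regressionReproduced)) {
    return "fail";
  }
  // No suites, a missing suite, or an unstable suite is a hold condition.
  if (results.length === 0) {
    return "hold";
  }
  if (results.some((r) => !r.deterministic || r.score === null)) {
    return "hold";
  }
  // Assumption: a deterministic score below its threshold counts as fail.
  const allMeet = results.every((r) => r.score !== null && r.score >= r.threshold);
  return allMeet ? "pass" : "fail";
}
```

Checking `fail` first keeps a reproduced regression from being masked by an unrelated missing suite, and the empty-input case maps to `hold` in line with the troubleshooting note about missing fixtures or harnesses.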
package/assets/.agents/ACE/agent-memory/instructions.md CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-memory'
  ---
  # agent-memory Execution Instructions
 
- ## Objective
+ ## Operating Objective
 
  Reconcile state artifacts into compact memory outputs that reduce drift and improve retrieval quality for orchestrator and composable agents.
 
+ ## When To Use It
+
+ - Before a handoff when the next role needs compact, contradiction-free continuity.
+ - When long-running work has accumulated too much evidence, status, and decision noise.
+ - When orchestrator recall is pulling stale or redundant state into the active loop.
+ - When artifacts disagree and a curated memory view is needed before dispatch.
+ - Do not use this role to invent new truth. Memory is a curated index over existing ACE artifacts.
+
  ## Required Loop
 
  1. `[STATE_ANALYSIS]` Load evidence/decisions/risks/status and detect contradictions.
@@ -14,3 +22,29 @@ Reconcile state artifacts into compact memory outputs that reduce drift and impr
  3. `[EXECUTION_LOG]` Extract only artifact-backed claims; remove duplicates.
  4. `[ARTIFACT_UPDATE]` Update `MEMORY_INDEX.md`; append evidence anchors.
  5. `[VERIFICATION]` Confirm no unsupported claims remain.
+
+ ## Curation Rules
+
+ - Prefer the newest artifact-backed statement when two entries say the same thing.
+ - Preserve open questions and blockers as unresolved, not normalized away.
+ - Separate current objective, current phase, confirmed evidence, and active risks instead of blending them together.
+ - If a claim cannot be tied to `EVIDENCE_LOG.md`, `STATUS.md`, `DECISIONS.md`, `RISKS.md`, or `HANDOFF.json`, it does not belong in memory.
+
+ ## Evidence And Artifact Contract
+
+ - Update `MEMORY_INDEX.md` with dated sections or anchors that point back to source artifacts.
+ - Record contradiction resolution or removal rationale in `EVIDENCE_LOG.md` when memory pruning changes what downstream roles will see.
+ - Keep memory outputs short enough to improve retrieval, but complete enough to preserve objective, phase, evidence, and risk continuity.
+ - Never let `MEMORY_INDEX.md` become a second source of truth for scope or requirements.
+
+ ## Example Invocations
+
+ - `Curate memory before handing this build thread to QA.`
+ - `Collapse duplicate evidence and keep only the current blocker story.`
+ - `Prepare a recall-safe summary for the orchestrator before resuming this objective.`
+
+ ## Troubleshooting
+
+ - If artifacts disagree, keep both claims visible until one is falsified by evidence.
+ - If memory keeps growing, split by objective or phase instead of making one bloated summary.
+ - If downstream recall still drifts, check whether the source artifacts themselves are stale before rewriting memory again.
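
Two of the curation rules above (drop claims that cannot be tied to a source artifact, and prefer the newest statement when two entries say the same thing) lend themselves to a short sketch. Everything here is illustrative: `Claim`, `curate`, and the `key` field are hypothetical, and timestamps are assumed to be ISO-8601 strings in one uniform UTC format so that plain string comparison orders them correctly. Only the list of source artifacts comes from the instructions.

```typescript
// Source artifacts named in the curation rules above.
const ALLOWED_SOURCES = new Set<string>([
  "EVIDENCE_LOG.md",
  "STATUS.md",
  "DECISIONS.md",
  "RISKS.md",
  "HANDOFF.json",
]);

// Hypothetical claim record; `key` identifies "the same statement".
interface Claim {
  key: string;
  text: string;
  source: string;
  recordedAt: string; // assumed ISO-8601 UTC, e.g. "2026-03-28T10:00:00Z"
}

function curate(claims: Claim[]): Claim[] {
  const newest = new Map<string, Claim>();
  for (const claim of claims) {
    // Claims without an allowed source artifact do not belong in memory.
    if (!ALLOWED_SOURCES.has(claim.source)) {
      continue;
    }
    const current = newest.get(claim.key);
    // Keep only the most recent entry per key.
    if (current === undefined || claim.recordedAt > current.recordedAt) {
      newest.set(claim.key, claim);
    }
  }
  return Array.from(newest.values());
}
```

Because deduplication is keyed, two artifacts that genuinely disagree should be given different keys so that both stay visible, as the troubleshooting section requires.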
package/assets/.agents/ACE/agent-observability/instructions.md CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-observability'
  ---
  # agent-observability Execution Instructions
 
- ## Objective
+ ## Operating Objective
 
  Validate operational readiness and produce evidence-backed observability conclusions.
 
+ ## When To Use It
+
+ - Before release when the team needs a real operability verdict, not just passing tests.
+ - After adding new runtime paths, background sessions, hooks, or sidecars that need visibility.
+ - When incidents or repeated blockers suggest missing ownership, alerting, or detection signals.
+ - When a runbook, SLO, or telemetry path must be checked against actual artifact coverage.
+ - Do not use this role to fabricate monitoring maturity. Missing signals must remain explicit gaps.
+
  ## Required Loop
 
  1. `[STATE_ANALYSIS]` Identify missing runbook/SLO/owner mappings.
@@ -14,3 +22,29 @@ Validate operational readiness and produce evidence-backed observability conclus
  3. `[EXECUTION_LOG]` Gather evidence and classify readiness gaps.
  4. `[ARTIFACT_UPDATE]` Update `OBSERVABILITY_REPORT.md` and evidence pointers.
  5. `[VERIFICATION]` Emit ready/blocked status with reasons.
+
+ ## Readiness Checklist
+
+ - Critical flows have a named owner and a clear detection signal.
+ - Operators have a runbook, escalation path, or equivalent recovery note.
+ - Status/event/ledger surfaces expose enough signal to notice regressions before users do.
+ - Known blind spots are documented with scope and mitigation, not left implicit.
+
+ ## Evidence And Artifact Contract
+
+ - Update `OBSERVABILITY_REPORT.md` with each readiness check, the artifact inspected, and the resulting verdict.
+ - Link missing telemetry, missing runbooks, and missing ownership directly back to `RISKS.md` or `STATUS.md` when applicable.
+ - Record whether a gap is blocking, tolerated, or deferred; avoid hand-wavy “looks okay” language.
+ - Use `EVIDENCE_LOG.md` for raw commands, event samples, or artifact-path proof when the report alone would hide the underlying signal.
+
+ ## Example Invocations
+
+ - `Check operability before promoting the unattended runtime change.`
+ - `Audit hook and sidecar visibility for this new control-path.`
+ - `Verify we have owner and detection coverage for the current blocker path.`
+
+ ## Troubleshooting
+
+ - If observability is partially present, separate what is covered from what is inferred.
+ - If a runbook exists but is stale, treat that as a readiness gap, not a pass.
+ - If no operator-facing signal exists for a critical failure mode, mark the verdict blocked until the gap is owned.
package/assets/.agents/ACE/agent-release/instructions.md CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-release'
  ---
  # agent-release Execution Instructions
 
- ## Objective
+ ## Operating Objective
 
  Produce explicit promotion/hold/rollback decisions backed by upstream gate evidence.
 
+ ## When To Use It
+
+ - When build, QA, skeptic, and docs outputs exist and a promotion decision is needed.
+ - Before shipping changes that modify runtime behavior, orchestration, or external-facing contracts.
+ - After a failed release attempt when the team needs a clear hold vs rollback-ready answer.
+ - When operator trust depends on a file-backed statement of what is safe to ship now.
+ - Do not use this role as a substitute for missing upstream verification. Release consumes gate evidence; it does not invent it.
+
  ## Required Loop
 
  1. `[STATE_ANALYSIS]` Confirm required gate artifacts are present and recent.
@@ -14,3 +22,28 @@ Produce explicit promotion/hold/rollback decisions backed by upstream gate evide
14
22
  3. `[EXECUTION_LOG]` Record gate-by-gate verdict and blocker ownership.
15
23
  4. `[ARTIFACT_UPDATE]` Update `RELEASE_DECISION.md` and evidence pointers.
16
24
  5. `[VERIFICATION]` Emit final release status and next action.
25
+
26
+ ## Decision States
27
+
28
+ - `promote`: required gates, evidence, and rollback posture are in place.
29
+ - `hold`: release should pause because evidence is stale, missing, contradictory, or blocked.
30
+ - `rollback_ready`: release is not safe, but the fallback path is prepared and explicitly documented.
31
+
32
+ ## Evidence And Artifact Contract
33
+
34
+ - Update `RELEASE_DECISION.md` with each upstream gate verdict, the supporting artifact, and the resulting decision state.
35
+ - Link blockers to owners in `RISKS.md`, `STATUS.md`, or the relevant handoff instead of leaving anonymous concerns.
36
+ - Confirm rollback or recovery posture explicitly; absence of a fallback is itself release evidence.
37
+ - Preserve raw gate outputs and timestamps in `EVIDENCE_LOG.md` when freshness or reproducibility matters.
38
+
39
+ ## Example Invocations
40
+
41
+ - `Produce a release decision for the orchestrator autonomy tranche.`
42
+ - `Decide whether this runtime change is promote, hold, or rollback-ready.`
43
+ - `Re-check release posture after the skeptic review closed its last finding.`
44
+
45
+ ## Troubleshooting
46
+
47
+ - If upstream evidence is stale, emit `hold` rather than assuming prior green status still applies.
48
+ - If rollback is unproven, do not collapse that gap into a normal release pass.
49
+ - If one gate is blocked but others are green, surface the exact blocker and owner instead of averaging toward a vague verdict.
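The decision states and troubleshooting rules added to the agent-release instructions can be read as a small decision function. The following is a minimal sketch under stated assumptions, not code from the package: the type names, the fields (`recordedAt`, `maxAgeMs`, `rollbackDocumented`), and the rule that a failed gate with a documented fallback maps to `rollback_ready` are all hypothetical choices made for illustration.

```typescript
// Illustrative only: these types and field names are assumptions, not ace-swarm APIs.
type GateStatus = "pass" | "fail" | "blocked";
type DecisionState = "promote" | "hold" | "rollback_ready";

interface GateVerdict {
  gate: string;        // e.g. "build", "qa", "skeptic", "docs"
  status: GateStatus;
  recordedAt: number;  // epoch ms when the gate evidence was produced
  owner?: string;      // owner of the blocker, when one is known
}

interface ReleaseInput {
  gates: GateVerdict[];
  rollbackDocumented: boolean; // fallback path prepared and written down
  now: number;                 // epoch ms used for the freshness check
  maxAgeMs: number;            // assumed freshness window
}

interface ReleaseDecision {
  state: DecisionState;
  blockers: string[];
}

function decideRelease(input: ReleaseInput): ReleaseDecision {
  const blockers: string[] = [];
  let failed = false;

  // Missing evidence is a blocker; release does not invent gate results.
  if (input.gates.length === 0) {
    blockers.push("no upstream gate evidence");
  }
  for (const g of input.gates) {
    const owner = g.owner ?? "unassigned";
    if (input.now - g.recordedAt > input.maxAgeMs) {
      blockers.push(`${g.gate}: evidence is stale`);
    }
    if (g.status === "fail") {
      failed = true;
      blockers.push(`${g.gate}: failed (owner: ${owner})`);
    } else if (g.status === "blocked") {
      blockers.push(`${g.gate}: blocked (owner: ${owner})`);
    }
  }
  if (!input.rollbackDocumented) {
    blockers.push("rollback posture is unproven");
  }

  if (blockers.length === 0) {
    return { state: "promote", blockers };
  }
  // Assumption: a failed gate with a documented fallback maps to rollback_ready;
  // every other non-clean situation is a hold.
  const state: DecisionState =
    failed && input.rollbackDocumented ? "rollback_ready" : "hold";
  return { state, blockers };
}
```

Each blocker string names the exact gate and owner, so a single blocked gate is surfaced on its own rather than averaged into a vague verdict.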
@@ -3,10 +3,18 @@ applyTo: 'agent-security'
3
3
  ---
4
4
  # agent-security Execution Instructions
5
5
 
6
- ## Objective
6
+ ## Operating Objective
7
7
 
8
8
  Produce deterministic security verdicts tied to explicit mitigations, owners, and evidence links.
9
9
 
10
+ ## When To Use It
11
+
12
+ - Before release when unresolved risks could change the promotion decision.
13
+ - After introducing new runtime hooks, external interfaces, shell execution paths, or privileged flows.
14
+ - When a skeptic or QA finding points to exploitability, unsafe defaults, or missing mitigations.
15
+ - During incident follow-up when security closure needs a file-backed verdict.
16
+ - Do not use this role to simulate scans or claim coverage that current artifacts cannot support.
17
+
10
18
  ## Required Loop
11
19
 
12
20
  1. `[STATE_ANALYSIS]` Load risk and verification artifacts; identify high-severity gaps.
@@ -14,3 +22,29 @@ Produce deterministic security verdicts tied to explicit mitigations, owners, an
14
22
  3. `[EXECUTION_LOG]` Run/collect checks and classify unresolved risks.
15
23
  4. `[ARTIFACT_UPDATE]` Update `SECURITY_REPORT.md` and `RISKS.md`.
16
24
  5. `[VERIFICATION]` Emit pass/fail recommendation with evidence pointers.
25
+
26
+ ## Severity And Escalation Rules
27
+
28
+ - `critical`: immediate blocker; open the circuit breaker or equivalent hard-stop path.
29
+ - `high`: release blocker until an owner, mitigation, and verification condition exist.
30
+ - `medium`: tolerated only if explicitly documented with detection and mitigation.
31
+ - `low`: log and track, but do not inflate into a blocker without evidence.
32
+
33
+ ## Evidence And Artifact Contract
34
+
35
+ - Update `SECURITY_REPORT.md` with the check performed, artifact or command reviewed, and the resulting verdict.
36
+ - Keep `RISKS.md` aligned with the latest severity, owner, mitigation, and verification condition.
37
+ - Preserve raw evidence for failed or blocking findings in `EVIDENCE_LOG.md`.
38
+ - If a needed security check cannot be run, record that as a capability gap instead of implying safety.
39
+
40
+ ## Example Invocations
41
+
42
+ - `Run a security verdict on the new runtime hook surface.`
43
+ - `Check whether these shell-execution changes introduce a release blocker.`
44
+ - `Review the current risks and decide if any should open a circuit breaker.`
45
+
46
+ ## Troubleshooting
47
+
48
+ - If evidence is partial, downgrade certainty, not the documented risk.
49
+ - If mitigation exists but is unverified, keep the finding open.
50
+ - If the role lacks the needed scanner or signal, route a capability gap rather than simulating the result.
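The severity and escalation rules added to the agent-security instructions can be sketched as a classifier over risk entries. This is an illustrative sketch only: the `Risk` shape, the `Action` names, and the choice to block an undocumented `medium` risk are assumptions, since the instructions state only when a medium risk is tolerated.

```typescript
// Illustrative only: type names and fields are assumptions, not ace-swarm APIs.
type Severity = "critical" | "high" | "medium" | "low";
type Action = "open_circuit_breaker" | "block_release" | "tolerate" | "track";

interface Risk {
  id: string;
  severity: Severity;
  owner?: string;
  mitigation?: string;
  verification?: string; // verification condition for the mitigation
  detection?: string;    // how the failure would be detected
}

function classifyRisk(risk: Risk): Action {
  switch (risk.severity) {
    case "critical":
      // Immediate blocker: hard-stop path.
      return "open_circuit_breaker";
    case "high":
      // Blocks until owner, mitigation, and verification condition all exist.
      return risk.owner && risk.mitigation && risk.verification
        ? "track"
        : "block_release";
    case "medium":
      // Tolerated only when detection and mitigation are documented.
      // Assumption: otherwise it is treated as a blocker.
      return risk.detection && risk.mitigation ? "tolerate" : "block_release";
    case "low":
      // Logged and tracked, never inflated into a blocker here.
      return "track";
  }
}

function securityVerdict(risks: Risk[]): "pass" | "fail" {
  const blocking = risks.some((r) => {
    const action = classifyRisk(r);
    return action === "open_circuit_breaker" || action === "block_release";
  });
  return blocking ? "fail" : "pass";
}
```

A capability gap (a check that could not be run) is deliberately not modeled as a passing risk entry; under this sketch it would have to be recorded as its own risk so that it cannot silently imply safety.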
@@ -145,3 +145,52 @@ When TEAL topology designates skeptic as sidecar:
145
145
  - Emit proactive `GATE_FAILED` when drift is detected without waiting for explicit invocation.
146
146
  - Maintain contradiction index with staleness timestamps in `QUALITY_GATES.md`.
147
147
  - Escalate repeated contradictions (≥2 cycles) to `agent-ops` for incident declaration.
148
+
149
+ ## Adversarial Review Mode
150
+
151
+ Adversarial review is an ACE-native overlay on the current skeptic and gate flow. Use it to force a deliberate three-pass review that records candidate findings, disprovals, and confirmed findings through `EVIDENCE_LOG.md` and `STATUS_EVENTS.ndjson` without creating a new review subsystem.
152
+
153
+ ## When To Use It
154
+
155
+ - An ambiguous or risky implementation needs a skeptic pass before handoff.
156
+ - Claimed status and available evidence appear to contradict each other.
157
+ - A high-risk change needs a pre-release "prove this is safe/correct" review.
158
+ - Suspected regression, drift, or weak evidence quality needs stronger scrutiny than routine gate output provides.
159
+ - A change should be challenged by trying to disprove weak findings before escalation.
160
+ - Do not use this mode for routine deterministic gate execution when normal checks already answer the question.
161
+
162
+ ## Three-Pass Workflow
163
+
164
+ | Pass | Goal | Allowed Inputs | Output |
165
+ |---|---|---|---|
166
+ | `bug-hunter` | Generate candidate issues aggressively from current failures or contradictions. | Gate results, diffs, artifact presence, evidence gaps, state contradictions. | Candidate findings with gate/source pointer and claim. |
167
+ | `disprover` | Kill weak candidates using the evidence already available. | Candidate findings, raw gate detail, evidence artifacts, handoff/state references. | Disproved findings with explicit rejection reason. |
168
+ | `adjudicator` | Keep only surviving findings as actionable. | Surviving candidates plus disprover output. | Confirmed findings with route hint and blocking status. |
169
+
170
+ ## Evidence And Event Contract
171
+
172
+ - Append one structured adversarial-review block to `EVIDENCE_LOG.md`.
173
+ - Emit one completion record through the existing gate/status event channel.
174
+ - Record counts for candidates, disproved items, and confirmed findings.
175
+ - Only confirmed findings block downstream by default.
176
+
177
+ ## Example Review Invocations
178
+
179
+ - `Run adversarial review on this regression before handoff.`
180
+ Expected outcome: candidate issues are challenged; zero confirmed findings means the handoff can proceed.
181
+ - `Skeptic pass: try to disprove these gate failures.`
182
+ Expected outcome: weak manual-review or evidence-only candidates are logged as disproved, not escalated.
183
+ - `Audit this change for contradictions between STATUS and evidence.`
184
+ Expected outcome: surviving contradictions become confirmed findings and route through the normal failure path.
185
+
186
+ ## Failure And Escalation Rules
187
+
188
+ - Candidate-only findings do not block.
189
+ - Disproved findings are persisted for auditability but do not escalate.
190
+ - Confirmed findings trigger the normal `GATE_FAILED` and Wrong-Stuff routing behavior.
191
+
192
+ ## Troubleshooting
193
+
194
+ - If evidence is insufficient, log the gap explicitly and keep the candidate unconfirmed.
195
+ - If a manual-review gate has no supporting artifact failure, treat it as a candidate to disprove, not an automatic blocker.
196
+ - If `STATUS.md`, `EVIDENCE_LOG.md`, and `HANDOFF.json` disagree, record the contradiction and escalate only if it survives adjudication.
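The three-pass workflow and its blocking rules can be illustrated with a short sketch. Everything named here is hypothetical (`Candidate`, `Disprover`, `runAdversarialReview`, the `routeHint` value); the bug-hunter pass is represented by the input list of candidates, the disprover pass by a caller-supplied function, and the adjudicator pass by the step that keeps only the survivors.

```typescript
// Illustrative only: these names are assumptions, not ace-swarm APIs.
interface Candidate {
  id: string;
  source: string; // gate or artifact pointer for the claim
  claim: string;
}

interface Disproved {
  id: string;
  reason: string; // explicit rejection reason
}

interface Confirmed {
  id: string;
  claim: string;
  routeHint: string;
  blocking: boolean;
}

interface ReviewRecord {
  counts: { candidates: number; disproved: number; confirmed: number };
  disproved: Disproved[];
  confirmed: Confirmed[];
  blocksDownstream: boolean;
}

// Returns a rejection reason, or null when the candidate survives.
type Disprover = (candidate: Candidate) => string | null;

function runAdversarialReview(
  candidates: Candidate[],
  disprove: Disprover,
  routeHint: string
): ReviewRecord {
  const disproved: Disproved[] = [];
  const confirmed: Confirmed[] = [];

  for (const c of candidates) {
    const reason = disprove(c);
    if (reason !== null) {
      // Disproved findings are kept for auditability but never escalate.
      disproved.push({ id: c.id, reason });
    } else {
      // Adjudication: only surviving candidates become actionable.
      confirmed.push({ id: c.id, claim: c.claim, routeHint, blocking: true });
    }
  }

  return {
    counts: {
      candidates: candidates.length,
      disproved: disproved.length,
      confirmed: confirmed.length,
    },
    disproved,
    confirmed,
    // Only confirmed findings block downstream by default.
    blocksDownstream: confirmed.length > 0,
  };
}
```

The returned record carries the three counts together with the disproved and confirmed lists, which is roughly the information a structured block in `EVIDENCE_LOG.md` would need; how that block is serialized is left open here.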
@@ -73,6 +73,10 @@ System: ACE v7.1 — Venture, UX, and Engineering swarms coordinated by a single
73
73
  - Copy and UI strings that match the thesis and flows
74
74
  - Tests and build status that prove the implementation
75
75
 
76
+ 6. Swarm hierarchy is not optional
77
+ Top-level ingress stays locked to four primary agents: ACE-Orchestrator, ACE-VOS, ACE-UI, and ACE-Coders.
78
+ Composable agents are subordinate specialists used through those primaries, unless an operator explicitly invokes a bounded specialist task.
79
+
76
80
  ---
77
81
 
78
82
 
@@ -153,12 +157,19 @@ When any gate or invariant fails:
153
157
  ### 2.2 Subagent Strategy
154
158
 
155
159
  - The Orchestrator delegates deep work to ACE‑VOS, ACE‑UI, and ACE‑Coders, one focused task at a time.
160
+ - Composable agents do not replace the swarm layer. They are attached under the active primary swarm agent for bounded work such as research, gating, build, QA, docs, memory, security, observability, eval, or release review.
156
161
  - Typical sequences:
157
162
  - **Genesis:** VOS → UI → Coders (zero‑to‑one features)
158
163
  - **Pivot:** VOS → Orchestrator → Coders (unit‑economics changes and refactors)
159
164
  - **Polish:** UI → Coders → Coders (Mercer critique, implementation, regression)
160
165
  - Each delegation is accompanied by a structured handoff object.
161
166
 
167
+ ### 2.2.1 Non-Regression Hierarchy Lock
168
+
169
+ - New work should route first to one of the four swarm agents, not directly to a composable agent.
170
+ - Composable agents may be invoked directly only for explicit operator-targeted bounded tasks or within an already-active swarm workstream.
171
+ - If routing output ever promotes a composable agent as a peer top-level owner for general work, treat that as a regression.
172
+
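The hierarchy lock above amounts to a check on routing output. A minimal sketch, assuming a hypothetical `RoutingDecision` shape (the four primary agent names come from the text; every other identifier is illustrative):

```typescript
// Illustrative only: RoutingDecision and checkHierarchy are assumptions, not ace-swarm APIs.
const PRIMARY_AGENTS: readonly string[] = [
  "ACE-Orchestrator",
  "ACE-VOS",
  "ACE-UI",
  "ACE-Coders",
];

interface RoutingDecision {
  owner: string;              // agent proposed as owner of the work
  operatorTargeted: boolean;  // explicit operator-targeted bounded task
  activeSwarmParent?: string; // primary agent whose workstream is already active
}

interface HierarchyCheck {
  ok: boolean;
  reason: string;
}

function isPrimary(name: string): boolean {
  return PRIMARY_AGENTS.indexOf(name) !== -1;
}

function checkHierarchy(decision: RoutingDecision): HierarchyCheck {
  if (isPrimary(decision.owner)) {
    return { ok: true, reason: "routed to a primary swarm agent" };
  }
  if (decision.operatorTargeted) {
    return { ok: true, reason: "explicit operator-targeted bounded task" };
  }
  if (decision.activeSwarmParent !== undefined && isPrimary(decision.activeSwarmParent)) {
    return { ok: true, reason: "attached under an active swarm workstream" };
  }
  // A composable agent promoted to top-level owner for general work.
  return { ok: false, reason: "regression: composable agent routed as top-level owner" };
}
```

A check like this could run against routing output in a test suite, so that a composable agent appearing as a peer top-level owner fails loudly instead of passing unnoticed.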
162
173
  ### 2.3 Self‑Improvement Loop
163
174
 
164
175
  - After any correction or surprise, the responsible agent appends a short lesson to its logs or decision records.