ace-swarm 2.1.2 → 2.3.0
- package/CHANGELOG.md +52 -1
- package/README.md +86 -40
- package/assets/.agents/ACE/AGENT_REGISTRY.md +7 -1
- package/assets/.agents/ACE/agent-eval/instructions.md +41 -1
- package/assets/.agents/ACE/agent-memory/instructions.md +35 -1
- package/assets/.agents/ACE/agent-observability/instructions.md +35 -1
- package/assets/.agents/ACE/agent-release/instructions.md +34 -1
- package/assets/.agents/ACE/agent-security/instructions.md +35 -1
- package/assets/.agents/ACE/agent-skeptic/instructions.md +49 -0
- package/assets/.agents/ACE/orchestrator/AGENTS.md +11 -0
- package/assets/.github/hooks/ace-copilot.json +16 -16
- package/assets/agent-state/ACE_WORKFLOW.md +65 -0
- package/assets/agent-state/INTERFACE_REGISTRY.md +1 -0
- package/assets/agent-state/MODULES/schemas/ACE_RUNTIME_PROFILE.schema.json +79 -0
- package/assets/agent-state/MODULES/schemas/VERICIFY_PROCESS_POST_LOG.schema.json +1 -0
- package/assets/scripts/ace-hook-dispatch.mjs +447 -0
- package/assets/scripts/copilot-hook-dispatch.mjs +1 -265
- package/assets/scripts/render-mcp-configs.sh +328 -1
- package/assets/tasks/README.md +26 -0
- package/dist/ace-autonomy.d.ts +137 -0
- package/dist/ace-autonomy.d.ts.map +1 -0
- package/dist/ace-autonomy.js +472 -0
- package/dist/ace-autonomy.js.map +1 -0
- package/dist/ace-context.d.ts +29 -0
- package/dist/ace-context.d.ts.map +1 -0
- package/dist/ace-context.js +240 -0
- package/dist/ace-context.js.map +1 -0
- package/dist/ace-internal-tools.d.ts +8 -0
- package/dist/ace-internal-tools.d.ts.map +1 -0
- package/dist/ace-internal-tools.js +76 -0
- package/dist/ace-internal-tools.js.map +1 -0
- package/dist/ace-server-instructions.d.ts +12 -0
- package/dist/ace-server-instructions.d.ts.map +1 -0
- package/dist/ace-server-instructions.js +299 -0
- package/dist/ace-server-instructions.js.map +1 -0
- package/dist/agent-runtime/role-adapters.d.ts.map +1 -1
- package/dist/agent-runtime/role-adapters.js +47 -6
- package/dist/agent-runtime/role-adapters.js.map +1 -1
- package/dist/helpers.d.ts.map +1 -1
- package/dist/helpers.js +90 -0
- package/dist/helpers.js.map +1 -1
- package/dist/internal-tool-runtime.d.ts +21 -0
- package/dist/internal-tool-runtime.d.ts.map +1 -0
- package/dist/internal-tool-runtime.js +136 -0
- package/dist/internal-tool-runtime.js.map +1 -0
- package/dist/local-model-runtime.d.ts +36 -0
- package/dist/local-model-runtime.d.ts.map +1 -0
- package/dist/local-model-runtime.js +161 -0
- package/dist/local-model-runtime.js.map +1 -0
- package/dist/model-bridge.d.ts +54 -0
- package/dist/model-bridge.d.ts.map +1 -0
- package/dist/model-bridge.js +587 -0
- package/dist/model-bridge.js.map +1 -0
- package/dist/orchestrator-supervisor.d.ts +100 -0
- package/dist/orchestrator-supervisor.d.ts.map +1 -0
- package/dist/orchestrator-supervisor.js +399 -0
- package/dist/orchestrator-supervisor.js.map +1 -0
- package/dist/prompts.d.ts.map +1 -1
- package/dist/prompts.js +101 -0
- package/dist/prompts.js.map +1 -1
- package/dist/public-surface.d.ts.map +1 -1
- package/dist/public-surface.js +6 -0
- package/dist/public-surface.js.map +1 -1
- package/dist/resources.d.ts.map +1 -1
- package/dist/resources.js +29 -0
- package/dist/resources.js.map +1 -1
- package/dist/runtime-executor.d.ts.map +1 -1
- package/dist/runtime-executor.js +121 -0
- package/dist/runtime-executor.js.map +1 -1
- package/dist/runtime-profile.d.ts +18 -0
- package/dist/runtime-profile.d.ts.map +1 -1
- package/dist/runtime-profile.js +39 -3
- package/dist/runtime-profile.js.map +1 -1
- package/dist/schemas.js +1 -1
- package/dist/schemas.js.map +1 -1
- package/dist/server.d.ts +5 -1
- package/dist/server.d.ts.map +1 -1
- package/dist/server.js +9 -1
- package/dist/server.js.map +1 -1
- package/dist/shared.d.ts +3 -3
- package/dist/shared.d.ts.map +1 -1
- package/dist/shared.js +1 -0
- package/dist/shared.js.map +1 -1
- package/dist/tools-agent.d.ts +1 -0
- package/dist/tools-agent.d.ts.map +1 -1
- package/dist/tools-agent.js +456 -1
- package/dist/tools-agent.js.map +1 -1
- package/dist/tools-framework.d.ts.map +1 -1
- package/dist/tools-framework.js +366 -128
- package/dist/tools-framework.js.map +1 -1
- package/dist/tools-memory.d.ts.map +1 -1
- package/dist/tools-memory.js +80 -0
- package/dist/tools-memory.js.map +1 -1
- package/dist/tui/agent-runner.d.ts +6 -0
- package/dist/tui/agent-runner.d.ts.map +1 -1
- package/dist/tui/agent-runner.js +15 -1
- package/dist/tui/agent-runner.js.map +1 -1
- package/dist/tui/agent-worker.d.ts +3 -1
- package/dist/tui/agent-worker.d.ts.map +1 -1
- package/dist/tui/agent-worker.js +117 -9
- package/dist/tui/agent-worker.js.map +1 -1
- package/dist/tui/chat.d.ts +19 -0
- package/dist/tui/chat.d.ts.map +1 -1
- package/dist/tui/chat.js +108 -0
- package/dist/tui/chat.js.map +1 -1
- package/dist/tui/index.d.ts +1 -0
- package/dist/tui/index.d.ts.map +1 -1
- package/dist/tui/index.js +3 -0
- package/dist/tui/index.js.map +1 -1
- package/dist/vericify-bridge.d.ts +5 -1
- package/dist/vericify-bridge.d.ts.map +1 -1
- package/dist/vericify-bridge.js +10 -0
- package/dist/vericify-bridge.js.map +1 -1
- package/dist/vericify-context.d.ts +10 -0
- package/dist/vericify-context.d.ts.map +1 -0
- package/dist/vericify-context.js +72 -0
- package/dist/vericify-context.js.map +1 -0
- package/dist/workspace-manager.d.ts.map +1 -1
- package/dist/workspace-manager.js +13 -2
- package/dist/workspace-manager.js.map +1 -1
- package/package.json +2 -1
package/CHANGELOG.md
CHANGED
@@ -1,6 +1,57 @@
 # Changelog
 
-
+## [2.3.0] - 2026-03-28
+
+### Added
+
+**Orchestrator Supervisor:**
+- Added `run_orchestrator` MCP tool for full task-plan execution with explicit step routing, sequential/parallel dispatch, and handoff create/ack flows.
+- Added Vericify context carry-forward and delta recovery for state continuity across plan execution.
+- Added circuit breaker open/close hooks for upstream failure handling.
+- Added plan amendment support during mid-execution.
+
+**Model Bridge Context Management:**
+- Added context budget calculation with configurable reserve (default 25%) to prevent overflow.
+- Added tier fallback (full → compressed → brief) when context exceeds budget.
+- Added automatic conversation history compaction (summarize older messages, retain recent messages + system).
+- Added large tool-result write-through to disk with configurable truncation (default 3000 chars).
+- Added retry-once on transient provider errors (abort, timeout, rate-limit).
+
+**ACE Compliance Enforcement:**
+- Added 5-layer ACE compliance enforcement: MCP instructions, per-host instruction files, lifecycle hooks, tool-level guidance, and bootstrap-on-first-tool.
+- Added per-host instruction file generation for CLAUDE.md, .cursorrules, .github/copilot-instructions.md, and Codex AGENTS.md.
+- Added lifecycle hook dispatcher (`ace-hook-dispatch.mjs`) with role-aware tool hints for 7 hosts.
+- Added tool-level soft enforcement via guidance appended to all tool results.
+- Added bootstrap nudge lifecycle for lazy ACE context loading on first mutation tool.
+
+**Tool Governance:**
+- Added named sets for explicit tool classification: BOOTSTRAP_TOOLS, REFRESH_TOOLS, NAMED_MUTATION_TOOLS.
+- Added maintenance comments on mutable tool sets to prevent accidental out-of-sync registrations.
+- Added Vericify resolution scoping with `startsWith` validation to prevent parent-directory leakage.
+
+### Changed
+
+- Changed agent worker and runner behavior so blocked or idle TUI agents can accept new input and resume without relaunching.
+- Expanded hook dispatch context to emit role-aware ACE tool hints from session, subagent, and tool lifecycle events.
+- Updated bootstrap and config rendering to generate multi-host hook artifacts under `.vscode`, `.cursor`, and `.mcp-config`.
+
+### Validation
+
+- Verified `208/208` tests passing (up from 202) with full orchestrator supervisor, model bridge, and compliance enforcement coverage.
+
+## [2.2.0] - 2026-03-26
+
+### Added
+- Added `ace turnkey` command for rapid workspace bootstrap with one command.
+- Added `ace tui` command for a full terminal operator surface with status, tasks, chat, and telemetry.
+- Expanded the packaged skill system to 15 skills, including `landing-review-watcher`, `problem-triage`, `skill-auditor`, and `astgrep-index`.
+- Finalized the 17-agent role system (4 swarm, 13 composable).
+- Added automatic MCP client configuration for Codex, VS Code, Claude Desktop, Cursor, and Antigravity.
+- Expanded the MCP tool surface to 70+ tools across 12 domain-specific modules.
+
+### Packaging
+- Bumped package version from `2.1.0` to `2.2.0`.
+- Verified 170/170 tests passing on the final v2.2.0 release candidate.
 
 ## [2.1.0] - 2026-03-13
 
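The 2.3.0 changelog mentions Vericify resolution scoping with `startsWith` validation. The package's actual check is not shown in this diff, so the following is only a minimal sketch of the general technique. The helper name `isInsideRoot` is hypothetical, and both arguments are assumed to be absolute POSIX paths that were normalized beforehand (no `..` segments).

```typescript
// Hypothetical helper, not taken from the ace-swarm sources.
// Assumes both paths are absolute, POSIX-style, and already normalized.
function isInsideRoot(root: string, candidate: string): boolean {
  // Compare against "root/" rather than "root": a bare startsWith(root)
  // would also accept sibling directories such as "/work/app-secrets".
  const prefix = root.endsWith("/") ? root : root + "/";
  // The root directory itself counts as inside.
  if (candidate + "/" === prefix) return true;
  return candidate.startsWith(prefix);
}

// Usage: only resolve state files that sit under the workspace root.
const workspaceRoot = "/work/app";
const accepted = isInsideRoot(workspaceRoot, "/work/app/agent-state/STATUS.md");
const rejected = isInsideRoot(workspaceRoot, "/work/app-secrets/token.json");
console.log(accepted, rejected);
```

The trailing-separator comparison is the design point: without it, a prefix test lets a sibling or parent path that merely shares leading characters pass the check.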
package/README.md
CHANGED
@@ -63,30 +63,62 @@ ACE Swarm
 ## Recent Releases
 
 ```text
-
-| Version
-
-| 2.
-
-| 2.0
-
++------------+------------+------------------------------------------------------------------+
+| Version | Date | Highlights |
++------------+------------+------------------------------------------------------------------+
+| 2.3.0 | 2026-03-28 | Orchestrator supervisor, context-aware model bridge, ACE |
+| | | compliance enforcement, Vericify integration, tool governance |
+| 2.2.0 | 2026-03-26 | turnkey bootstrap, TUI, expanded skills/roles, client configs |
+| 2.1.0 | 2026-03-13 | README redesign, ASCII docs, packaged changelog |
++------------+------------+------------------------------------------------------------------+
 ```
 
-See [CHANGELOG.md](./CHANGELOG.md) for
+See [CHANGELOG.md](./CHANGELOG.md) for detailed release notes.
+
+### What's New in 2.3.0
+
+**Orchestrator Supervisor (`run_orchestrator` MCP tool):**
+- Full task-plan execution loop with explicit step routing and sequential/parallel dispatch
+- Handoff create/ack flows with fallback ID generation
+- Vericify context carry-forward and delta recovery for state continuity
+- Circuit breaker open/close for upstream failure handling
+- Plan amendment support during execution
+
+**Model Bridge Context Management:**
+- Context budget calculation with configurable reserve (default 25%)
+- Tier fallback (full → compressed → brief) when context overflows
+- Automatic conversation history compaction (summarize older messages, retain recent + system)
+- Large tool-result write-through to disk with configurable truncation
+- Retry-once on transient provider errors (abort, timeout, rate-limit)
+
+**ACE Compliance Enforcement (5 Stacked Layers):**
+- Layer 1: MCP server initialization instructions (all hosts)
+- Layer 2: Per-host instruction files (CLAUDE.md, .cursorrules, .github/copilot-instructions.md)
+- Layer 3: Lifecycle hooks with role-aware tool hints (7 hosts)
+- Layer 4: Tool-level soft enforcement (guidance on all tool results)
+- Layer 5: Bootstrap-on-first-tool lazy enforcement
+
+**Tool Governance & Vericify:**
+- Named sets for explicit tool classification (BOOTSTRAP_TOOLS, REFRESH_TOOLS, NAMED_MUTATION_TOOLS)
+- Vericify resolution scoping to prevent parent-directory leakage
+- Full Vericify API integration with graceful degradation fallback
+
+**Test Coverage:** 208 passing tests (up from 202)
 
 ## Core Surfaces
 
 ```text
-
-| Surface | Internals
-
-| Bootstrap | workspace state, tasks, configs
-| Agent runtime | swarm
-| Durable state | ledgers, TODOs, handoffs, status events
-| Dispatch | queue, lock table, lease, recovery
-
-
-
++------------------+----------------------------------------------+------------------------------------------------------+
+| Surface | Internals | Public entrypoints |
++------------------+----------------------------------------------+------------------------------------------------------+
+| Bootstrap | workspace state, tasks, configs | `ace init`, `ace turnkey`, `ace mcp-config` |
+| Agent runtime | swarm roles, model bridge, tool planning | MCP server, TUI `/agent`, packaged agent assets |
+| Durable state | ledgers, TODOs, handoffs, status events | runtime stores under `agent-state/*` |
+| Dispatch | queue, lock table, lease, recovery | `enqueue_job`, `dispatch_jobs`, `complete_job` |
+| Supervision | plan amendment, handoff, Vericify, breakers | `superviseTaskPlan`, `amendTaskPlan` |
+| Change analysis | delta scan, semantic diff, rewrite hints | `scan_workspace_delta`, `semantic_diff`, etc. |
+| Operator UI | tabs, ACE chat bridge, telemetry, providers | `ace tui` |
++------------------+----------------------------------------------+------------------------------------------------------+
 ```
 
 ## Workspace Internals
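The README's "Model Bridge Context Management" bullets describe a context budget with a default 25% reserve and a full → compressed → brief tier fallback. A minimal sketch of how those two ideas could fit together follows. It assumes the reserve is a fraction of the model's context window that is held back, and that each tier has a precomputed token estimate. The names `contextBudget`, `pickTier`, and `TierSizes` are hypothetical and do not come from the package.

```typescript
type Tier = "full" | "compressed" | "brief";

// Hypothetical shape: estimated token counts for each rendering of the context.
interface TierSizes {
  full: number;
  compressed: number;
  brief: number;
}

// Usable budget = context window minus the reserved fraction (25% by default).
function contextBudget(windowTokens: number, reserve: number = 0.25): number {
  if (reserve < 0 || reserve >= 1) {
    throw new RangeError("reserve must be in [0, 1)");
  }
  return Math.floor(windowTokens * (1 - reserve));
}

// Walk full -> compressed -> brief and return the first tier that fits.
// Returns null when even the brief tier exceeds the budget.
function pickTier(sizes: TierSizes, budget: number): Tier | null {
  const order: Tier[] = ["full", "compressed", "brief"];
  for (const tier of order) {
    if (sizes[tier] <= budget) return tier;
  }
  return null;
}

// Usage: an 8000-token window leaves a 6000-token budget at the default reserve.
const budget = contextBudget(8000);
const chosen = pickTier({ full: 9000, compressed: 5000, brief: 1000 }, budget);
console.log(budget, chosen);
```

Returning `null` instead of silently choosing `brief` keeps the overflow case visible to the caller, which can then compact history or truncate tool results.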
@@ -191,24 +223,37 @@ See [CHANGELOG.md](./CHANGELOG.md) for the recent tagged release history that ca
 Provider notes:
 
 - Ollama works directly through `ace tui`.
+- ACE workspaces automatically route chat tabs through the ACE model bridge, including a pre-execution tool-selection pass and streamed tool/result updates in the transcript.
 - OpenAI-compatible providers use `OPENAI_API_KEY` and optional `OPENAI_BASE_URL`, or provider-scoped equivalents such as `CODEX_API_KEY` and `CODEX_BASE_URL`.
+- `/agent` sessions can now accept follow-up input while blocked or idle, so operators can continue a run without relaunching the worker tab.
 - VS Code Copilot model hints may appear in discovery, but the standalone TUI does not directly execute through the VS Code chat runtime without an extension-side bridge.
 
+## Supervisor And Bridge Notes
+
+- `ModelBridge` now performs a tool-planning turn before execution and can amend the model-selected scope mid-run when another valid ACE tool is required.
+- Supervisor helpers now support sequential task-plan execution with amendments, handoff create/ack flows, Vericify context carry-forward, and circuit-breaker open/close hooks.
+- Hook dispatch now emits richer role-aware ACE context across workspace lifecycle events, including Codex, Claude, VS Code, Cursor, Gemini, and Antigravity hook bundles.
+
 ## Bootstrap Outputs
 
 ```text
-
-| Path
-
-| `agent-state/*`
-| `.agents/ACE/*`
-| `.agents/skills/*`
-| `tasks/*`
-| `scripts/ace/*`
-| `.vscode/mcp.json`
-| `.
-| `.
-
++-----------------------------------+------------------------------------------------------+
+| Path | Output |
++-----------------------------------+------------------------------------------------------+
+| `agent-state/*` | state, ledgers, schemas, reports |
+| `.agents/ACE/*` | packaged agent instructions |
+| `.agents/skills/*` | packaged skills |
+| `tasks/*` | task packs, examples, handoff templates |
+| `scripts/ace/*` | helper scripts |
+| `.vscode/mcp.json` | VS Code MCP config |
+| `.vscode/ace-hooks.json` | VS Code hook bundle |
+| `.cursor/hooks.json` | Cursor hook bundle |
+| `.mcp-config/codex.hooks.toml` | Codex hook config |
+| `.mcp-config/gemini.hooks.json` | Gemini hook bundle |
+| `.mcp-config/antigravity.hooks.json` | Antigravity hook bundle |
+| `.mcp-config/*` | client configs for Codex / VS Code / Claude / Cursor / Gemini / Antigravity |
+| `.github/hooks/*.json` | optional workspace hook policies |
++-----------------------------------+------------------------------------------------------+
 ```
 
 Local-model bootstrap:
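The "Supervisor And Bridge Notes" section refers to circuit-breaker open/close hooks. The supervisor's real interface is not visible in this diff, so the sketch below only illustrates the general pattern: a failure counter that opens the breaker at a threshold and fires a hook on each state transition. The class name, the `onOpen`/`onClose` hook names, and the threshold parameter are all assumptions.

```typescript
// Hypothetical hook shape; not the ace-swarm supervisor API.
interface BreakerHooks {
  onOpen?: (failures: number) => void;
  onClose?: () => void;
}

class CircuitBreaker {
  private failures = 0;
  private opened = false;
  private readonly threshold: number;
  private readonly hooks: BreakerHooks;

  constructor(threshold: number, hooks: BreakerHooks = {}) {
    this.threshold = threshold;
    this.hooks = hooks;
  }

  isOpen(): boolean {
    return this.opened;
  }

  // Record an upstream failure; open the breaker once the threshold is reached.
  // The open hook fires only on the closed -> open transition.
  recordFailure(): void {
    this.failures += 1;
    if (!this.opened && this.failures >= this.threshold) {
      this.opened = true;
      this.hooks.onOpen?.(this.failures);
    }
  }

  // Record a success; reset the counter and close the breaker if it was open.
  recordSuccess(): void {
    this.failures = 0;
    if (this.opened) {
      this.opened = false;
      this.hooks.onClose?.();
    }
  }
}

// Usage: log transitions while dispatching plan steps.
const stepBreaker = new CircuitBreaker(3, {
  onOpen: (n) => console.log(`breaker opened after ${n} failures`),
  onClose: () => console.log("breaker closed"),
});
stepBreaker.recordFailure();
```

Firing hooks only on transitions, rather than on every failure, keeps downstream status events free of duplicates.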
@@ -223,16 +268,17 @@ ace doctor --llm ollama --model llama3.1:8b
 ## Client Config Targets
 
 ```text
-
-| Client | Target
-
-| Codex | `$CODEX_HOME/config.toml` or `~/.codex/config.toml`
-| VS Code | `.vscode/mcp.json`
-| Claude | macOS: `~/Library/Application Support/Claude/...`
-| | Linux: `~/.config/Claude/claude_desktop_config.json`
-| Cursor | `~/.cursor/mcp.json`
-
-
++---------------+---------------------------------------------------------------------+
+| Client | Target |
++---------------+---------------------------------------------------------------------+
+| Codex | `$CODEX_HOME/config.toml` or `~/.codex/config.toml` |
+| VS Code | `.vscode/mcp.json` and `.vscode/ace-hooks.json` |
+| Claude | macOS: `~/Library/Application Support/Claude/...` |
+| | Linux: `~/.config/Claude/claude_desktop_config.json` |
+| Cursor | `~/.cursor/mcp.json` or workspace `.cursor/hooks.json` |
+| Gemini | import `.mcp-config/gemini.hooks.json` with the generated MCP config |
+| Antigravity | import `.mcp-config/antigravity.mcp.json` and `.mcp-config/antigravity.hooks.json` |
++---------------+---------------------------------------------------------------------+
 ```
 
 ## Included Assets
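The Codex row of the "Client Config Targets" table lists `$CODEX_HOME/config.toml` or `~/.codex/config.toml`. One plausible reading is that `CODEX_HOME` takes precedence when set, with the home-directory path as the fallback. That precedence is an assumption, not something the diff confirms. The function below is a hypothetical illustration of it and takes the environment and home directory as plain arguments so that it has no runtime dependencies.

```typescript
// Hypothetical resolver for the Codex config target; the precedence
// (CODEX_HOME first, then ~/.codex) is assumed, not confirmed by the package.
function codexConfigPath(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const codexHome = env["CODEX_HOME"];
  const base =
    codexHome !== undefined && codexHome.length > 0
      ? codexHome
      : `${homeDir}/.codex`;
  return `${base}/config.toml`;
}

// Usage with explicit inputs (POSIX-style paths assumed).
console.log(codexConfigPath({ CODEX_HOME: "/opt/codex" }, "/home/dev"));
console.log(codexConfigPath({}, "/home/dev"));
```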
package/assets/.agents/ACE/AGENT_REGISTRY.md
CHANGED
@@ -2,6 +2,12 @@
 
 Version: 3.0.0
 
+## Hierarchy Lock
+
+- Primary ingress is locked to four swarm agents: `orchestrator`, `vos`, `ui`, and `coders`.
+- Composable agents are delegated specialists that operate under an active swarm task, not replacements for the swarm layer.
+- Direct operator invocation of a composable agent is allowed only for bounded specialist work; it is not the default routing path for new general tasks.
+
 ## Registry Table
 
 | Agent | Group | Objective | Inputs | Outputs | Emits | Consumes | Default Skills |
@@ -26,7 +32,7 @@ Version: 3.0.0
 
 ## Dependency Rules
 
-1.
+1. Top-level routing terminates at a swarm agent; composable modules execute bounded contracts under that swarm layer.
 2. Skeptic and ops can block downstream flow when gates fail.
 3. Release cannot approve without eval + security + observability evidence.
 4. Memory sidecar may run continuously but cannot mutate source truth artifacts except designated memory files.
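The "Hierarchy Lock" section and dependency rule 1 say that top-level routing terminates at one of four swarm agents and that composable agents run under an active swarm task. A minimal sketch of such a routing check follows. The `Route` shape and `validateRoute` function are hypothetical, and the bounded-specialist exception for direct operator invocation is deliberately not modeled.

```typescript
// The four swarm agent names come from the registry text; everything else is illustrative.
const SWARM_AGENTS = ["orchestrator", "vos", "ui", "coders"] as const;
type SwarmAgent = (typeof SWARM_AGENTS)[number];

function isSwarmAgent(name: string): name is SwarmAgent {
  return (SWARM_AGENTS as readonly string[]).includes(name);
}

// Hypothetical routing record: a composable agent names the swarm agent it runs under.
interface Route {
  agent: string;
  parentSwarm?: string;
}

// Returns null when the route is acceptable, or a reason string when it is not.
function validateRoute(route: Route): string | null {
  if (isSwarmAgent(route.agent)) return null;
  if (route.parentSwarm !== undefined && isSwarmAgent(route.parentSwarm)) {
    return null;
  }
  return `"${route.agent}" is not a swarm agent and has no active swarm parent`;
}

// Usage: a composable agent delegated by the orchestrator passes the check.
console.log(validateRoute({ agent: "agent-eval", parentSwarm: "orchestrator" }));
console.log(validateRoute({ agent: "agent-eval" }));
```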
package/assets/.agents/ACE/agent-eval/instructions.md
CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-eval'
 ---
 # agent-eval Execution Instructions
 
-## Objective
+## Operating Objective
 
 Run deterministic evaluation checks and publish confidence/regression outcomes.
 
+## When To Use It
+
+- After autonomy, routing, runtime, or hook changes that can shift ACE behavior.
+- Before promotion when evidence needs a deterministic regression verdict.
+- When comparing candidate implementations, profiles, or prompt/runtime variants.
+- After bug fixes that claim to close a previous failure mode.
+- Do not use this role for speculative design or open-ended research; evaluation requires a concrete target and pass/fail surface.
+
 ## Required Loop
 
 1. `[STATE_ANALYSIS]` Identify triggered evaluation scope.
@@ -14,3 +22,35 @@ Run deterministic evaluation checks and publish confidence/regression outcomes.
 3. `[EXECUTION_LOG]` Run suites and capture raw outcomes.
 4. `[ARTIFACT_UPDATE]` Update `EVAL_REPORT.md` and evidence links.
 5. `[VERIFICATION]` Emit pass/fail/hold decision.
+
+## Evaluation Inputs
+
+- `TASK.md`, `SCOPE.md`, `QUALITY_GATES.md`, and `HANDOFF.json` for the active objective and gate surface.
+- Package or workspace test suites, fixtures, and golden outputs.
+- `STATUS_EVENTS.ndjson` and `run-ledger.json` when regression history or prior failures matter.
+- `EVIDENCE_LOG.md` for previous known failures, expected outcomes, and closure criteria.
+
+## Decision Rules
+
+- `pass`: deterministic suites meet the declared threshold and no unexplained regressions remain.
+- `hold`: evaluation signal is incomplete, stale, or inconclusive, so promotion should pause pending clearer evidence.
+- `fail`: a regression, contract violation, or unexplained drift is reproduced with raw evidence.
+
+## Evidence And Artifact Contract
+
+- Update `EVAL_REPORT.md` with suite name, fixture/scope, threshold, and verdict.
+- Preserve raw outputs or exit codes in `EVIDENCE_LOG.md` rather than paraphrasing them away.
+- Link failures to the owning gate, risk, or handoff so downstream routing stays deterministic.
+- If a suite is missing or non-deterministic, log that explicitly as a hold condition instead of silently skipping it.
+
+## Example Invocations
+
+- `Run eval on the autonomy preflight path after this runtime change.`
+- `Compare the new routing behavior against the previous fixture set.`
+- `Re-run the regression suite for the reported skeptic failure before release.`
+
+## Troubleshooting
+
+- If no deterministic suite exists, emit `hold` and specify the missing fixture or harness.
+- If outcomes differ across runs, treat the instability itself as evidence and document the drift condition.
+- If the evaluation target is unclear, route back to `agent-spec` or `agent-ops` instead of guessing the scope.
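The agent-eval "Decision Rules" can be read as a small decision function. The sketch below is one possible encoding, not the package's implementation. The `EvalSummary` fields are hypothetical, and treating a below-threshold pass rate without a reproduced regression as `hold` (rather than `fail`) is an interpretation of the rules above.

```typescript
type EvalVerdict = "pass" | "hold" | "fail";

// Hypothetical summary of one evaluation run; field names are illustrative.
interface EvalSummary {
  suiteFound: boolean; // a deterministic suite exists for the target
  stableAcrossRuns: boolean; // repeated runs produced the same outcome
  passRate: number; // fraction of checks that passed, 0..1
  threshold: number; // declared pass threshold, 0..1
  reproducedRegressions: number; // regressions reproduced with raw evidence
}

function decide(summary: EvalSummary): EvalVerdict {
  // A reproduced regression is a fail regardless of the pass rate.
  if (summary.reproducedRegressions > 0) return "fail";
  // Missing or unstable suites are explicit hold conditions, never silent skips.
  if (!summary.suiteFound || !summary.stableAcrossRuns) return "hold";
  // Assumption: a shortfall without a reproduced regression is inconclusive.
  return summary.passRate >= summary.threshold ? "pass" : "hold";
}

// Usage: a clean, stable run above threshold.
console.log(
  decide({
    suiteFound: true,
    stableAcrossRuns: true,
    passRate: 1,
    threshold: 0.95,
    reproducedRegressions: 0,
  }),
);
```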
package/assets/.agents/ACE/agent-memory/instructions.md
CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-memory'
 ---
 # agent-memory Execution Instructions
 
-## Objective
+## Operating Objective
 
 Reconcile state artifacts into compact memory outputs that reduce drift and improve retrieval quality for orchestrator and composable agents.
 
+## When To Use It
+
+- Before a handoff when the next role needs compact, contradiction-free continuity.
+- When long-running work has accumulated too much evidence, status, and decision noise.
+- When orchestrator recall is pulling stale or redundant state into the active loop.
+- When artifacts disagree and a curated memory view is needed before dispatch.
+- Do not use this role to invent new truth. Memory is a curated index over existing ACE artifacts.
+
 ## Required Loop
 
 1. `[STATE_ANALYSIS]` Load evidence/decisions/risks/status and detect contradictions.
@@ -14,3 +22,29 @@ Reconcile state artifacts into compact memory outputs that reduce drift and impr
 3. `[EXECUTION_LOG]` Extract only artifact-backed claims; remove duplicates.
 4. `[ARTIFACT_UPDATE]` Update `MEMORY_INDEX.md`; append evidence anchors.
 5. `[VERIFICATION]` Confirm no unsupported claims remain.
+
+## Curation Rules
+
+- Prefer the newest artifact-backed statement when two entries say the same thing.
+- Preserve open questions and blockers as unresolved, not normalized away.
+- Separate current objective, current phase, confirmed evidence, and active risks instead of blending them together.
+- If a claim cannot be tied to `EVIDENCE_LOG.md`, `STATUS.md`, `DECISIONS.md`, `RISKS.md`, or `HANDOFF.json`, it does not belong in memory.
+
+## Evidence And Artifact Contract
+
+- Update `MEMORY_INDEX.md` with dated sections or anchors that point back to source artifacts.
+- Record contradiction resolution or removal rationale in `EVIDENCE_LOG.md` when memory pruning changes what downstream roles will see.
+- Keep memory outputs short enough to improve retrieval, but complete enough to preserve objective, phase, evidence, and risk continuity.
+- Never let `MEMORY_INDEX.md` become a second source of truth for scope or requirements.
+
+## Example Invocations
+
+- `Curate memory before handing this build thread to QA.`
+- `Collapse duplicate evidence and keep only the current blocker story.`
+- `Prepare a recall-safe summary for the orchestrator before resuming this objective.`
+
+## Troubleshooting
+
+- If artifacts disagree, keep both claims visible until one is falsified by evidence.
+- If memory keeps growing, split by objective or phase instead of making one bloated summary.
+- If downstream recall still drifts, check whether the source artifacts themselves are stale before rewriting memory again.
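Two of the agent-memory curation rules are mechanical enough to sketch: drop claims that cannot be tied to one of the five named artifacts, and keep the newest entry when two entries say the same thing. The `Claim` shape and `curate` function below are hypothetical. The sketch treats "say the same thing" as exact text equality and assumes timestamps are ISO 8601 strings in the same timezone, so that string comparison orders them correctly.

```typescript
// Artifact names come from the curation rules; the rest is illustrative.
const MEMORY_SOURCES = [
  "EVIDENCE_LOG.md",
  "STATUS.md",
  "DECISIONS.md",
  "RISKS.md",
  "HANDOFF.json",
] as const;

interface Claim {
  text: string;
  source?: string; // artifact the claim is anchored to, if any
  recordedAt: string; // ISO 8601 timestamp, same timezone assumed
}

function curate(claims: Claim[]): Claim[] {
  const allowed = new Set<string>(MEMORY_SOURCES);
  const newestByText = new Map<string, Claim>();
  for (const claim of claims) {
    // Claims without an allowed source artifact do not belong in memory.
    if (claim.source === undefined || !allowed.has(claim.source)) continue;
    const previous = newestByText.get(claim.text);
    // Keep the newest artifact-backed statement for duplicate text.
    if (previous === undefined || claim.recordedAt > previous.recordedAt) {
      newestByText.set(claim.text, claim);
    }
  }
  return Array.from(newestByText.values());
}

// Usage: the unsourced claim is dropped and the newer duplicate wins.
const curated = curate([
  { text: "Build is blocked on QA", source: "STATUS.md", recordedAt: "2026-03-20T10:00:00Z" },
  { text: "Build is blocked on QA", source: "HANDOFF.json", recordedAt: "2026-03-27T09:00:00Z" },
  { text: "Operators prefer the new flow", recordedAt: "2026-03-27T09:00:00Z" },
]);
console.log(curated);
```

Contradictory claims with different text are left untouched by this sketch, which matches the troubleshooting note about keeping both visible until one is falsified.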
package/assets/.agents/ACE/agent-observability/instructions.md
CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-observability'
 ---
 # agent-observability Execution Instructions
 
-## Objective
+## Operating Objective
 
 Validate operational readiness and produce evidence-backed observability conclusions.
 
+## When To Use It
+
+- Before release when the team needs a real operability verdict, not just passing tests.
+- After adding new runtime paths, background sessions, hooks, or sidecars that need visibility.
+- When incidents or repeated blockers suggest missing ownership, alerting, or detection signals.
+- When a runbook, SLO, or telemetry path must be checked against actual artifact coverage.
+- Do not use this role to fabricate monitoring maturity. Missing signals must remain explicit gaps.
+
 ## Required Loop
 
 1. `[STATE_ANALYSIS]` Identify missing runbook/SLO/owner mappings.
@@ -14,3 +22,29 @@ Validate operational readiness and produce evidence-backed observability conclus
 3. `[EXECUTION_LOG]` Gather evidence and classify readiness gaps.
 4. `[ARTIFACT_UPDATE]` Update `OBSERVABILITY_REPORT.md` and evidence pointers.
 5. `[VERIFICATION]` Emit ready/blocked status with reasons.
+
+## Readiness Checklist
+
+- Critical flows have a named owner and a clear detection signal.
+- Operators have a runbook, escalation path, or equivalent recovery note.
+- Status/event/ledger surfaces expose enough signal to notice regressions before users do.
+- Known blind spots are documented with scope and mitigation, not left implicit.
+
+## Evidence And Artifact Contract
+
+- Update `OBSERVABILITY_REPORT.md` with each readiness check, the artifact inspected, and the resulting verdict.
+- Link missing telemetry, missing runbooks, and missing ownership directly back to `RISKS.md` or `STATUS.md` when applicable.
+- Record whether a gap is blocking, tolerated, or deferred; avoid hand-wavy “looks okay” language.
+- Use `EVIDENCE_LOG.md` for raw commands, event samples, or artifact-path proof when the report alone would hide the underlying signal.
+
+## Example Invocations
+
+- `Check operability before promoting the unattended runtime change.`
+- `Audit hook and sidecar visibility for this new control-path.`
+- `Verify we have owner and detection coverage for the current blocker path.`
+
+## Troubleshooting
+
+- If observability is partially present, separate what is covered from what is inferred.
+- If a runbook exists but is stale, treat that as a readiness gap, not a pass.
+- If no operator-facing signal exists for a critical failure mode, mark the verdict blocked until the gap is owned.
package/assets/.agents/ACE/agent-release/instructions.md
CHANGED
@@ -3,10 +3,18 @@ applyTo: 'agent-release'
 ---
 # agent-release Execution Instructions
 
-## Objective
+## Operating Objective
 
 Produce explicit promotion/hold/rollback decisions backed by upstream gate evidence.
 
+## When To Use It
+
+- When build, QA, skeptic, and docs outputs exist and a promotion decision is needed.
+- Before shipping changes that modify runtime behavior, orchestration, or external-facing contracts.
+- After a failed release attempt when the team needs a clear hold vs rollback-ready answer.
+- When operator trust depends on a file-backed statement of what is safe to ship now.
+- Do not use this role as a substitute for missing upstream verification. Release consumes gate evidence; it does not invent it.
+
 ## Required Loop
 
 1. `[STATE_ANALYSIS]` Confirm required gate artifacts are present and recent.
@@ -14,3 +22,28 @@ Produce explicit promotion/hold/rollback decisions backed by upstream gate evide
 3. `[EXECUTION_LOG]` Record gate-by-gate verdict and blocker ownership.
 4. `[ARTIFACT_UPDATE]` Update `RELEASE_DECISION.md` and evidence pointers.
 5. `[VERIFICATION]` Emit final release status and next action.
+
+## Decision States
+
+- `promote`: required gates, evidence, and rollback posture are in place.
+- `hold`: release should pause because evidence is stale, missing, contradictory, or blocked.
+- `rollback_ready`: release is not safe, but the fallback path is prepared and explicitly documented.
+
+## Evidence And Artifact Contract
+
+- Update `RELEASE_DECISION.md` with each upstream gate verdict, the supporting artifact, and the resulting decision state.
+- Link blockers to owners in `RISKS.md`, `STATUS.md`, or the relevant handoff instead of leaving anonymous concerns.
+- Confirm rollback or recovery posture explicitly; absence of a fallback is itself release evidence.
+- Preserve raw gate outputs and timestamps in `EVIDENCE_LOG.md` when freshness or reproducibility matters.
+
+## Example Invocations
+
+- `Produce a release decision for the orchestrator autonomy tranche.`
+- `Decide whether this runtime change is promote, hold, or rollback-ready.`
+- `Re-check release posture after the skeptic review closed its last finding.`
+
+## Troubleshooting
+
+- If upstream evidence is stale, emit `hold` rather than assuming prior green status still applies.
+- If rollback is unproven, do not collapse that gap into a normal release pass.
+- If one gate is blocked but others are green, surface the exact blocker and owner instead of averaging toward a vague verdict.
@@ -3,10 +3,18 @@ applyTo: 'agent-security'
|
|
|
3
3
|
---
|
|
4
4
|
# agent-security Execution Instructions
|
|
5
5
|
|
|
6
|
-
## Objective
|
|
6
|
+
## Operating Objective
|
|
7
7
|
|
|
8
8
|
Produce deterministic security verdicts tied to explicit mitigations, owners, and evidence links.
|
|
9
9
|
|
|
10
|
+
## When To Use It
|
|
11
|
+
|
|
12
|
+
- Before release when unresolved risks could change the promotion decision.
|
|
13
|
+
- After introducing new runtime hooks, external interfaces, shell execution paths, or privileged flows.
|
|
14
|
+
- When a skeptic or QA finding points to exploitability, unsafe defaults, or missing mitigations.
|
|
15
|
+
- During incident follow-up when security closure needs a file-backed verdict.
|
|
16
|
+
- Do not use this role to simulate scans or claim coverage that current artifacts cannot support.
|
|
17
|
+
|
|
10
18
|
## Required Loop
|
|
11
19
|
|
|
12
20
|
1. `[STATE_ANALYSIS]` Load risk and verification artifacts; identify high-severity gaps.
|
|
@@ -14,3 +22,29 @@ Produce deterministic security verdicts tied to explicit mitigations, owners, an
|
|
|
14
22
|
3. `[EXECUTION_LOG]` Run/collect checks and classify unresolved risks.
|
|
15
23
|
4. `[ARTIFACT_UPDATE]` Update `SECURITY_REPORT.md` and `RISKS.md`.
|
|
16
24
|
5. `[VERIFICATION]` Emit pass/fail recommendation with evidence pointers.
|
|
25
|
+
|
|
26
|
+
## Severity And Escalation Rules
|
|
27
|
+
|
|
28
|
+
- `critical`: immediate blocker; open the circuit breaker or equivalent hard-stop path.
|
|
29
|
+
- `high`: release blocker until an owner, mitigation, and verification condition exist.
|
|
30
|
+
- `medium`: tolerated only if explicitly documented with detection and mitigation.
|
|
31
|
+
- `low`: log and track, but do not inflate into a blocker without evidence.
|
|
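As a rough sketch, the severity rules above can be expressed as a mapping from a risk record to an action. The `Risk` shape, the `Action` names, and `escalate` are assumptions made for illustration; the package does not document such an API.

```typescript
// Hypothetical shapes for illustration; not part of ace-swarm.
type Severity = "critical" | "high" | "medium" | "low";
type Action = "open_circuit_breaker" | "block_release" | "tolerate" | "track";

interface Risk {
  id: string;
  severity: Severity;
  owner?: string;
  mitigation?: string;
  verification?: string; // verification condition
  detection?: string;
}

function escalate(risk: Risk): Action {
  switch (risk.severity) {
    case "critical":
      // Immediate blocker: hard-stop path.
      return "open_circuit_breaker";
    case "high":
      // Blocks until owner, mitigation, and verification condition all exist.
      return Boolean(risk.owner && risk.mitigation && risk.verification)
        ? "track"
        : "block_release";
    case "medium":
      // Tolerated only when detection and mitigation are documented.
      return Boolean(risk.detection && risk.mitigation) ? "tolerate" : "block_release";
    case "low":
      // Logged and tracked, never inflated into a blocker here.
      return "track";
  }
}
```

Treating an undocumented `medium` risk as a blocker is one conservative interpretation of "tolerated only if explicitly documented".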
32
|
+
|
|
33
|
+
## Evidence And Artifact Contract
|
|
34
|
+
|
|
35
|
+
- Update `SECURITY_REPORT.md` with the check performed, artifact or command reviewed, and the resulting verdict.
|
|
36
|
+
- Keep `RISKS.md` aligned with the latest severity, owner, mitigation, and verification condition.
|
|
37
|
+
- Preserve raw evidence for failed or blocking findings in `EVIDENCE_LOG.md`.
|
|
38
|
+
- If a needed security check cannot be run, record that as a capability gap instead of implying safety.
|
|
39
|
+
|
|
40
|
+
## Example Invocations
|
|
41
|
+
|
|
42
|
+
- `Run a security verdict on the new runtime hook surface.`
|
|
43
|
+
- `Check whether these shell-execution changes introduce a release blocker.`
|
|
44
|
+
- `Review the current risks and decide if any should open a circuit breaker.`
|
|
45
|
+
|
|
46
|
+
## Troubleshooting
|
|
47
|
+
|
|
48
|
+
- If evidence is partial, downgrade certainty, not the documented risk.
|
|
49
|
+
- If mitigation exists but is unverified, keep the finding open.
|
|
50
|
+
- If the role lacks the needed scanner or signal, route a capability gap rather than simulating the result.
|
|
@@ -145,3 +145,52 @@ When TEAL topology designates skeptic as sidecar:
|
|
|
145
145
|
- Emit proactive `GATE_FAILED` when drift is detected without waiting for explicit invocation.
|
|
146
146
|
- Maintain contradiction index with staleness timestamps in `QUALITY_GATES.md`.
|
|
147
147
|
- Escalate repeated contradictions (≥2 cycles) to `agent-ops` for incident declaration.
|
|
148
|
+
|
|
149
|
+
## Adversarial Review Mode
|
|
150
|
+
|
|
151
|
+
Adversarial review is an ACE-native overlay on the current skeptic and gate flow. Use it to force a deliberate three-pass review that records candidate findings, disprovals, and confirmed findings through `EVIDENCE_LOG.md` and `STATUS_EVENTS.ndjson` without creating a new review subsystem.
|
|
152
|
+
|
|
153
|
+
## When To Use It
|
|
154
|
+
|
|
155
|
+
- Ambiguous or risky implementation review needs a skeptic pass before handoff.
|
|
156
|
+
- Claimed status and available evidence appear to contradict each other.
|
|
157
|
+
- A high-risk change needs a pre-release "prove this is safe/correct" review.
|
|
158
|
+
- Regression, drift, or evidence-quality suspicion needs stronger scrutiny than routine gate output.
|
|
159
|
+
- A change should be challenged by trying to disprove weak findings before escalation.
|
|
160
|
+
- Do not use this mode for routine deterministic gate execution when normal checks already answer the question.
|
|
161
|
+
|
|
162
|
+
## Three-Pass Workflow
|
|
163
|
+
|
|
164
|
+
| Pass | Goal | Allowed Inputs | Output |
|
|
165
|
+
|---|---|---|---|
|
|
166
|
+
| `bug-hunter` | Generate candidate issues aggressively from current failures or contradictions. | Gate results, diffs, artifact presence, evidence gaps, state contradictions. | Candidate findings with gate/source pointer and claim. |
|
|
167
|
+
| `disprover` | Kill weak candidates using the evidence already available. | Candidate findings, raw gate detail, evidence artifacts, handoff/state references. | Disproved findings with explicit rejection reason. |
|
|
168
|
+
| `adjudicator` | Keep only surviving findings as actionable. | Surviving candidates plus disprover output. | Confirmed findings with route hint and blocking status. |
|
|
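The table above describes a pipeline in which candidates are generated, challenged, and then adjudicated. A minimal sketch of that flow follows, assuming the `bug-hunter` output is already available as a list and the `disprover` is a caller-supplied function; every type and function name here is hypothetical.

```typescript
// Hypothetical shapes for illustration; not part of ace-swarm.
interface Candidate {
  id: string;
  claim: string;
  source: string; // gate/source pointer
}
interface Disproved extends Candidate {
  rejectionReason: string;
}
interface Confirmed extends Candidate {
  blocking: boolean;
  routeHint: string;
}

// Returns a rejection reason, or null when the candidate survives.
type Disprover = (c: Candidate) => string | null;

function adversarialReview(
  candidates: Candidate[],
  tryDisprove: Disprover
): { disproved: Disproved[]; confirmed: Confirmed[] } {
  const disproved: Disproved[] = [];
  const confirmed: Confirmed[] = [];
  for (const c of candidates) {
    const reason = tryDisprove(c); // disprover pass
    if (reason !== null) {
      disproved.push({ ...c, rejectionReason: reason });
    } else {
      // adjudicator pass: only survivors become actionable
      confirmed.push({ ...c, blocking: true, routeHint: "GATE_FAILED" });
    }
  }
  return { disproved, confirmed };
}
```

Keeping the disprover as an injected function means the same loop works whether the rejection logic is manual, rule-based, or model-driven.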
169
|
+
|
|
170
|
+
## Evidence And Event Contract
|
|
171
|
+
|
|
172
|
+
- Append one structured adversarial-review block to `EVIDENCE_LOG.md`.
|
|
173
|
+
- Emit one completion record through the existing gate/status event channel.
|
|
174
|
+
- Record counts for candidates, disproved items, and confirmed findings.
|
|
175
|
+
- Only confirmed findings block downstream by default.
|
|
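One way to picture the completion record is as a single newline-delimited JSON line carrying the three counts. The record layout below is an assumption: the event name, field names, and `toCompletionEvent` are hypothetical and may not match the schema that `STATUS_EVENTS.ndjson` actually uses.

```typescript
// Hypothetical record layout for illustration; the real event schema may differ.
interface ReviewCounts {
  candidates: number;
  disproved: number;
  confirmed: number;
}

function toCompletionEvent(counts: ReviewCounts, timestamp: string): string {
  const record = {
    event: "adversarial_review_completed", // assumed event name
    timestamp,
    ...counts,
    // Only confirmed findings block downstream by default.
    blocking: counts.confirmed > 0,
  };
  // NDJSON: one JSON object per line, terminated by a newline.
  return JSON.stringify(record) + "\n";
}
```

A caller would append the returned line to the event file; candidate-only and disproved counts are recorded for auditability but do not set `blocking`.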
176
|
+
|
|
177
|
+
## Example Review Invocations
|
|
178
|
+
|
|
179
|
+
- `Run adversarial review on this regression before handoff.`
|
|
180
|
+
Expected outcome: candidate issues are challenged; zero confirmed findings means the handoff can proceed.
|
|
181
|
+
- `Skeptic pass: try to disprove these gate failures.`
|
|
182
|
+
Expected outcome: weak manual-review or evidence-only candidates are logged as disproved, not escalated.
|
|
183
|
+
- `Audit this change for contradictions between STATUS and evidence.`
|
|
184
|
+
Expected outcome: surviving contradictions become confirmed findings and route through the normal failure path.
|
|
185
|
+
|
|
186
|
+
## Failure And Escalation Rules
|
|
187
|
+
|
|
188
|
+
- Candidate-only findings do not block.
|
|
189
|
+
- Disproved findings are persisted for auditability but do not escalate.
|
|
190
|
+
- Confirmed findings trigger the normal `GATE_FAILED` and Wrong-Stuff routing behavior.
|
|
191
|
+
|
|
192
|
+
## Troubleshooting
|
|
193
|
+
|
|
194
|
+
- If evidence is insufficient, log the gap explicitly and keep the candidate unconfirmed.
|
|
195
|
+
- If a manual-review gate has no supporting artifact failure, treat it as a candidate to disprove, not an automatic blocker.
|
|
196
|
+
- If `STATUS.md`, `EVIDENCE_LOG.md`, and `HANDOFF.json` disagree, record the contradiction and escalate only if it survives adjudication.
|
|
@@ -73,6 +73,10 @@ System: ACE v7.1 — Venture, UX, and Engineering swarms coordinated by a single
|
|
|
73
73
|
- Copy and UI strings that match the thesis and flows
|
|
74
74
|
- Tests and build status that prove the implementation
|
|
75
75
|
|
|
76
|
+
6. Swarm hierarchy is not optional
|
|
77
|
+
Top-level ingress stays locked to four primary agents: ACE-Orchestrator, ACE-VOS, ACE-UI, and ACE-Coders.
|
|
78
|
+
Composable agents are subordinate specialists used through those primaries, unless an operator explicitly invokes a bounded specialist task.
|
|
79
|
+
|
|
76
80
|
---
|
|
77
81
|
|
|
78
82
|
|
|
@@ -153,12 +157,19 @@ When any gate or invariant fails:
|
|
|
153
157
|
### 2.2 Subagent Strategy
|
|
154
158
|
|
|
155
159
|
- The Orchestrator delegates deep work to ACE‑VOS, ACE‑UI, and ACE‑Coders, one focused task at a time.
|
|
160
|
+
- Composable agents do not replace the swarm layer. They are attached under the active primary swarm agent for bounded work such as research, gating, build, QA, docs, memory, security, observability, eval, or release review.
|
|
156
161
|
- Typical sequences:
|
|
157
162
|
- **Genesis:** VOS → UI → Coders (zero‑to‑one features)
|
|
158
163
|
- **Pivot:** VOS → Orchestrator → Coders (unit‑economics changes and refactors)
|
|
159
164
|
- **Polish:** UI → Coders → Coders (Mercer critique, implementation, regression)
|
|
160
165
|
- Each delegation is accompanied by a structured handoff object.
|
|
161
166
|
|
|
167
|
+
### 2.2.1 Non-Regression Hierarchy Lock
|
|
168
|
+
|
|
169
|
+
- New work should route first to one of the four swarm agents, not directly to a composable agent.
|
|
170
|
+
- Composable agents may be invoked directly only for explicit operator-targeted bounded tasks or within an already-active swarm workstream.
|
|
171
|
+
- If routing output ever promotes a composable agent as a peer top-level owner for general work, treat that as a regression.
|
|
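The hierarchy lock above amounts to a routing check that can be sketched in a few lines. The four primary agent names come from the text; `RoutingRequest`, its flags, and `checkRouting` are hypothetical and only illustrate the rule.

```typescript
// The four primary agents named in the instructions.
const PRIMARY_AGENTS: readonly string[] = [
  "ACE-Orchestrator",
  "ACE-VOS",
  "ACE-UI",
  "ACE-Coders",
];

// Hypothetical request shape for illustration; not part of ace-swarm.
interface RoutingRequest {
  target: string;
  operatorTargetedBoundedTask: boolean; // explicit operator invocation of a bounded task
  activeSwarmWorkstream: boolean; // already inside a primary agent's workstream
}

function checkRouting(req: RoutingRequest): "ok" | "regression" {
  // New work routed to a primary swarm agent is always fine.
  if (PRIMARY_AGENTS.indexOf(req.target) !== -1) return "ok";
  // A composable agent is allowed only in the two bounded cases.
  if (req.operatorTargetedBoundedTask || req.activeSwarmWorkstream) return "ok";
  // Otherwise a composable agent was promoted to top-level owner.
  return "regression";
}
```

A check like this could run against routing output in tests so that a composable agent appearing as a peer top-level owner is caught early.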
172
|
+
|
|
162
173
|
### 2.3 Self‑Improvement Loop
|
|
163
174
|
|
|
164
175
|
- After any correction or surprise, the responsible agent appends a short lesson to its logs or decision records.
|