agestra 4.12.2 → 4.12.4

@@ -12,7 +12,7 @@
12
12
  "name": "agestra",
13
13
  "source": "./",
14
14
  "description": "Orchestrate Ollama, Gemini, and Codex for multi-AI debates, code review, and cross-validation",
15
- "version": "4.12.2",
15
+ "version": "4.12.4",
16
16
  "author": {
17
17
  "name": "mua-vtuber"
18
18
  },
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agestra",
3
- "version": "4.12.2",
3
+ "version": "4.12.4",
4
4
  "description": "Claude Code plugin — orchestrate Ollama, Gemini, and Codex for multi-AI debates, code review, and cross-validation",
5
5
  "mcpServers": {
6
6
  "agestra": {
package/AGENTS.md CHANGED
@@ -27,7 +27,7 @@ This repository includes a Codex-friendly host wrapper for Agestra.
27
27
  - `environment_check` and `provider_list`: inspect host/provider state first
28
28
  - `agent_debate_structured` (with `agent_debate_approve`/`_continue`/`_reject`) and `agent_debate_review`: run approval-gated multi-provider review flows
29
29
  - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: use for autonomous Codex/Gemini worker tasks
30
- - `qa_run`: verify implementation before reporting completion
30
+ - `qa_run`: run workspace build/test verification before reporting implementation completion
31
31
 
32
32
  ## Project Assets
33
33
 
package/GEMINI.md CHANGED
@@ -31,4 +31,4 @@ Each command delegates to the shared workflow specs in `commands/*.md`.
31
31
  - `agent_debate_structured`, `agent_debate_approve`/`_continue`/`_reject`, `agent_debate_review`: structured multi-provider reviews and approval-gated debates
32
32
  - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: autonomous worker lifecycle
33
33
  - `workspace_*`: document-backed review and aggregation flows
34
- - `qa_run`: final verification step before completion
34
+ - `qa_run`: workspace build/test verification before implementation completion
package/README.md CHANGED
@@ -206,13 +206,13 @@ Turborepo monorepo with 8 packages:
206
206
  | Tool | Description |
207
207
  |------|-------------|
208
208
  | `agent_debate_start` | Start a multi-provider debate (non-blocking, optional quality loop + validator) |
209
- | `agent_debate_status` | Check debate status and transcript |
209
+ | `agent_debate_status` | Check legacy debate status or structured session progress, phase, participant activity, and document paths |
210
210
  | `agent_debate_create` | Create a turn-based debate session (returns debate ID) |
211
211
  | `agent_debate_turn` | Execute one provider's turn; supports `provider: "claude"` for Claude's independent participation |
212
212
  | `agent_debate_conclude` | End a debate and generate final transcript |
213
- | `agent_debate_structured` | Start an approval-gated structured debate — individual reviews, optional alias clarification, rounds with vote aggregation, parks in `ready-for-approval` (no synthesis until approved) |
213
+ | `agent_debate_structured` | Start an approval-gated structured debate in the background — individual reviews, optional alias clarification, JSON consensus rounds, status polling, and no synthesis until approved |
214
214
  | `agent_debate_approve` | Leader-approve a ready-for-approval structured debate; writes the synthesis document and closes the session |
215
- | `agent_debate_continue` | Run additional rounds on a ready-for-approval (or escalated) structured-debate session (3/5/10) |
215
+ | `agent_debate_continue` | Start additional background rounds on a ready-for-approval or escalated structured-debate session (3/5/10), then poll status |
216
216
  | `agent_debate_reject` | Reject a structured-debate session without writing synthesis |
217
217
  | `agent_debate_review` | Send a document to multiple providers for independent review |
218
218
  | `agent_cross_validate` | Cross-validate outputs (agent-tier validators only) |
@@ -272,7 +272,7 @@ Turborepo monorepo with 8 packages:
272
272
 
273
273
  | Tool | Description |
274
274
  |------|-------------|
275
- | `qa_run` | Run automatic QA with detected build/test commands and a PASS/FAIL summary |
275
+ | `qa_run` | Run vetted workspace build/test QA profiles and return a PASS/FAIL summary |
276
276
 
277
277
  ### Trace / Observability (3)
278
278
 
@@ -48,13 +48,13 @@ You operate in one of four modes depending on how you are invoked:
48
48
 
49
49
  ### Mode: Structured Debate
50
50
 
51
- **Preferred entry point:** Call `agent_debate_structured` with the topic, scope, participants, source documents, and leader. The moderator engine owns the full lifecycle: individual reviews, JSON consensus ledger creation, optional alias clarification, sequential provider turns, strict JSON response validation, generated debate markdown, approval snapshot, and final synthesis after leader approval.
51
+ **Preferred entry point:** Call `agent_debate_structured` with `mode`, topic, scope, participants, optional `source_documents`, and leader. Use `mode: "review"` for code/document review and `mode: "idea"` for idea/design option discovery. `source_documents` is optional and must use `{ "document_id": "...", "provider": "..." }` entries when independent documents already exist. The tool creates a structured session record immediately and returns `status: running`; use `agent_debate_status` to monitor phase, provider progress, item summary, and document paths. The moderator engine owns the full lifecycle: individual/source material loading, JSON consensus ledger creation, optional alias clarification, sequential provider turns, strict JSON response validation, generated debate markdown, structured session record, and final synthesis after leader approval.
52
52
 
53
53
  The JSON consensus ledger is the source of truth. Debate markdown and synthesis markdown are generated human-readable artifacts. The moderator may inspect and report their paths, but must not edit markdown to change item status, provider stance, or consensus state.
54
54
 
55
55
  ### Phase 1: Individual reviews
56
56
 
57
- Before any consensus round, every participant produces an independent review of the scope. These reviews are written under `.agestra/workspace/individual/`. Each participant lists candidate items with fields such as `{ title, severity, location, statement }`; the engine keeps links back to these source documents in each consensus item.
57
+ Before any consensus round, every participant produces independent source material unless `source_documents` were supplied. These documents are written or read under `.agestra/workspace/individual/`. Each document must include a `<proposals>` block with `<item id="..." title="..." severity="..." location="...">...</item>` entries; the engine keeps links back to these source documents in each consensus item.
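To make the expected document shape concrete, the following is a minimal sketch of how a `<proposals>` block could be read. The function name `parse_proposals`, the regex-based approach, the `body` key, and the sample findings are assumptions for illustration; only the tag and attribute names come from the text above, and the engine's real parser may work differently.

```python
import re

# Hypothetical helpers: these patterns are an illustration, not the engine's parser.
PROPOSALS_RE = re.compile(r"<proposals>(.*?)</proposals>", re.DOTALL)
ITEM_RE = re.compile(r"<item\s+([^>]*)>(.*?)</item>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_proposals(document: str) -> list[dict]:
    """Extract every <item> inside the first <proposals> block of a document."""
    match = PROPOSALS_RE.search(document)
    if match is None:
        raise ValueError("document has no <proposals> block")
    items = []
    for attrs, body in ITEM_RE.findall(match.group(1)):
        item = dict(ATTR_RE.findall(attrs))  # id, title, severity, location
        item["body"] = body.strip()          # free-text content of the item
        items.append(item)
    return items


# Sample document with made-up findings.
sample = """
# Code Review
<proposals>
<item id="P1" title="Unchecked input" severity="HIGH" location="src/api.ts:42">
Request body is used without validation.
</item>
<item id="P2" title="Missing test" severity="LOW" location="">
No test covers the retry path.
</item>
</proposals>
"""

proposals = parse_proposals(sample)
```

A document without a `<proposals>` block raises `ValueError` in this sketch, which mirrors the "must include" wording above.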
58
58
 
59
59
  ### Phase 2: Consensus ledger creation
60
60
 
@@ -105,15 +105,15 @@ After each accepted provider turn, the engine recomputes item status from ledger
105
105
 
106
106
  The engine persists the JSON ledger atomically, then regenerates:
107
107
  - the aggregate debate markdown in `debates/`
108
- - the terminal consensus report
109
- - the approval snapshot when the session reaches `ready-for-approval`
108
+ - the structured status/session record (`{sessionId}.session.json`)
109
+ - the terminal consensus report when a blocking engine caller requests one
110
110
 
111
111
  ### Phase 5: Leader approval gate
112
112
 
113
113
  The moderator does not write the final synthesis file on its own. Three dedicated MCP tools close out the flow:
114
114
 
115
115
  - `agent_debate_approve`: writes the synthesis markdown, updates ledger document paths, and transitions to `approved`.
116
- - `agent_debate_continue`: loads the persisted ledger/snapshot and runs additional consensus rounds.
116
+ - `agent_debate_continue`: loads the persisted ledger/session record, starts additional consensus rounds in the background, and returns `running`.
117
117
  - `agent_debate_reject`: closes without synthesis. With `spawn_issue = true`, an issue document can be written under `individual/` listing non-accepted items.
118
118
 
119
119
  Idempotency: a second call on a terminal state (`approved`, `rejected`, `leader-timeout`) returns the cached outcome. Calling approval-gate tools on a `running` or `error` session returns `isError: true` with a descriptive state message.
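The idempotency rule can be sketched as a small guard in front of the three approval-gate tools. This is an illustration under stated assumptions: `gate_call`, the in-memory session dict, and the `cached` and `outcome` fields are hypothetical. Only the state names and the `isError` flag are taken from the text above.

```python
# States after which a session is closed; a repeat call replays the stored outcome.
TERMINAL_STATES = {"approved", "rejected", "leader-timeout"}

# Hypothetical mapping from gate action to the state it produces.
NEXT_STATE = {"approve": "approved", "reject": "rejected", "continue": "running"}


def gate_call(session: dict, action: str) -> dict:
    """Apply an approval-gate action to a session dict (assumed shape)."""
    status = session["status"]
    if status in TERMINAL_STATES:
        # Idempotent replay: nothing changes, the cached outcome is returned.
        return {"isError": False, "cached": True, "outcome": session["outcome"]}
    if status in ("running", "error"):
        # Gate tools are not valid in these states.
        return {"isError": True,
                "message": f"cannot {action} a session in state '{status}'"}
    new_status = NEXT_STATE[action]
    session["status"] = new_status
    session["outcome"] = {"action": action, "status": new_status}
    return {"isError": False, "cached": False, "outcome": session["outcome"]}


session = {"status": "ready-for-approval"}
first = gate_call(session, "approve")   # performs the transition
second = gate_call(session, "reject")   # replayed: session stays approved
```

Note that the second call uses a different action and still returns the first outcome, which is what "returns the cached outcome" implies for a terminal session.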
@@ -129,23 +129,23 @@ Idempotency: a second call on a terminal state (`approved`, `rejected`, `leader-
129
129
  │ accepted/ │ + user chose escalate
130
130
  │ rejected │
131
131
  ▼ ▼
132
- ready-for-approval ◀── snapshot JSON written to disk (D12)
132
+ ready-for-approval ◀── session JSON written to disk
133
133
  │ │ │
134
134
  _approve │ │ │ _continue
135
135
  ▼ │ ▼
136
- approved │ running (snapshot reloaded; max_rounds += additional_rounds)
137
- (snapshot
136
+ approved │ running (session reloaded; max_rounds += additional_rounds)
137
+ (session
138
138
  kept) │ _reject
139
139
 
140
- rejected (snapshot kept)
140
+ rejected (session kept)
141
141
 
142
- (ready-for-approval ─ 24h no tool call ─▶ leader-timeout [snapshot kept])
142
+ (ready-for-approval ─ 24h no tool call ─▶ leader-timeout [session kept])
143
143
  (running ─ uncaught internal error ─▶ error)
144
144
  ```
145
145
 
146
- **Snapshot and ledger persistence (D12).** On entry to `ready-for-approval`, the engine writes `{workspaceBaseDir}/.agestra/workspace/debates/{sessionId}.approval.json` atomically and keeps `{sessionId}.consensus.json` as the durable consensus ledger. The snapshot carries session config, consensus-derived aggregate status, rounds, document paths, `readyAt`, and `deadline`. The leader must invoke one of the three approval-gate tools within `STRUCTURED_DEBATE_APPROVAL_TIMEOUT_MS` (24 hours); otherwise the background sweep (scheduled by `STRUCTURED_DEBATE_SESSION_SWEEP_INTERVAL_MS`, default 1 hour) scans the `debates/` directory, finds snapshots with `deadline < now` still in `ready-for-approval`, and transitions them to `leader-timeout` (snapshot kept in place so the leader can still inspect/reject afterwards).
146
+ **Session and ledger persistence.** The engine writes `{workspaceBaseDir}/.agestra/workspace/debates/{sessionId}.session.json` atomically and keeps `{sessionId}.consensus.json` as the durable consensus ledger. The session record carries lifecycle status, current phase, participant progress, session config, consensus-derived aggregate status, rounds, document paths, `readyAt`, and `deadline`. The leader must invoke one of the three approval-gate tools within `STRUCTURED_DEBATE_APPROVAL_TIMEOUT_MS` (24 hours); otherwise the background sweep (scheduled by `STRUCTURED_DEBATE_SESSION_SWEEP_INTERVAL_MS`, default 1 hour) scans the `debates/` directory, finds sessions with `deadline < now` still in `ready-for-approval`, and transitions them to `leader-timeout` (session record kept in place so the leader can still inspect/reject afterwards). Legacy `.approval.json` records may be read for migration, but new writes use `.session.json`.
147
147
 
148
- The JSON consensus ledger is the truth of content and item state. The approval snapshot is the resumable gate state. Generated markdown is readable output only. Since handlers read persisted state from disk first (memory is a write-through cache), approval and continuation keep working after server restart.
148
+ The JSON consensus ledger is the truth of content and item state. The structured session record is the resumable gate/progress state. Generated markdown is readable output only. Since handlers read persisted state from disk first (memory is a write-through cache), status, approval, and continuation keep working after server restart.
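The timeout sweep described above can be pictured roughly as follows. This is a sketch, not the package's implementation: `sweep_expired_sessions` is a hypothetical name, and it assumes `deadline` is stored as epoch milliseconds and that records carry a `sessionId` field. The atomic write uses the common write-to-temp-then-rename pattern.

```python
import json
import os
import tempfile
from pathlib import Path


def sweep_expired_sessions(debates_dir: Path, now_ms: int) -> list[str]:
    """Move overdue ready-for-approval sessions to leader-timeout."""
    timed_out = []
    for path in sorted(debates_dir.glob("*.session.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        if record.get("status") != "ready-for-approval":
            continue  # only parked sessions can time out
        deadline = record.get("deadline")
        if deadline is None or deadline >= now_ms:
            continue  # still inside the approval window
        record["status"] = "leader-timeout"
        # Write to a temp file first, then rename over the original,
        # so a reader never sees a half-written record.
        tmp_path = path.with_suffix(path.suffix + ".tmp")
        tmp_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        os.replace(tmp_path, path)
        timed_out.append(record.get("sessionId", path.name))
    return timed_out


# Demo with two made-up session records in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    debates = Path(tmp_dir)
    (debates / "s1.session.json").write_text(json.dumps(
        {"sessionId": "s1", "status": "ready-for-approval", "deadline": 1_000}))
    (debates / "s2.session.json").write_text(json.dumps(
        {"sessionId": "s2", "status": "running", "deadline": 1_000}))
    swept = sweep_expired_sessions(debates, now_ms=2_000)
    s1_status = json.loads((debates / "s1.session.json").read_text())["status"]
    s2_status = json.loads((debates / "s2.session.json").read_text())["status"]
```

The record file stays in place after the transition, matching the note that the session record is kept so the leader can still inspect or reject it.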
149
149
 
150
150
  </Approval_Gate_State_Machine>
151
151
 
@@ -156,7 +156,7 @@ All paths relative to `workspaceBaseDir` (`.agestra/workspace/` under the projec
156
156
  ```
157
157
  .agestra/workspace/
158
158
  individual/ — each participant's initial independent review (pre-debate; no votes)
159
- debates/ — generated debate markdown + {sessionId}.consensus.json + {sessionId}.approval.json
159
+ debates/ — generated debate markdown + {sessionId}.consensus.json + {sessionId}.session.json
160
160
  synthesis/ — leader-approved final synthesis document (written only on _approve)
161
161
  reviews/ — legacy, read-only; no new writes
162
162
  ```
@@ -454,9 +454,9 @@ If `max_rounds` is hit with open proposals, the moderator surfaces the choice to
454
454
 
455
455
  <Tool_Usage>
456
456
  - `provider_list` — check available providers at the start.
457
- - `agent_debate_structured` — **recommended entry point for Structured Debate**: runs individual reviews, optional alias clarification, JSON consensus turns, ledger persistence, generated debate markdown, and the approval gate. Does NOT write synthesis.
457
+ - `agent_debate_structured` — **recommended entry point for Structured Debate**: accepts `mode: "review" | "idea"` and optional `source_documents`, starts or loads individual source material, runs optional alias clarification, JSON consensus turns, ledger persistence, generated debate markdown, and the approval gate in the background. Returns `running`; poll `agent_debate_status`. Does NOT write synthesis.
458
458
  - `agent_debate_approve` — write synthesis markdown, mark the snapshot `approved`, close the session.
459
- - `agent_debate_continue` — force additional rounds on a `ready-for-approval` session.
459
+ - `agent_debate_continue` — force additional rounds on a `ready-for-approval` or `escalated` session; returns `running`, then poll status.
460
460
  - `agent_debate_reject` — close without synthesis; optionally spawn an issue branch listing non-accepted proposals.
461
461
  - Legacy manual debate primitives — diagnostic use only; do not use them for review, idea, or design consensus workflows.
462
462
  - `agent_debate_review` — send a document to providers for structured review (Document Review mode).
@@ -248,15 +248,17 @@ Run formal verification with automatic fix loop:
248
248
 
249
249
  > Used when Work Mode in Phase 2 was **Multi-AI**. Replaces Phase 5 (QA) and Phase 6 (Quality Gate) in a single coordinated cross-AI review. In Leader-host-only mode, skip this phase.
250
250
 
251
- Run the structured-debate MCP flow. This is a **two-step** lifecycle: the moderator runs the debate to a terminal aggregation state, then parks the session in `ready-for-approval` waiting for the leader (this agent) to finalize. The moderator does NOT write the synthesis file on its own — approval must be explicit.
251
+ Run the structured-debate MCP flow. This is a **background lifecycle**: `agent_debate_structured` creates a durable session record immediately and returns `status: running`; the leader polls `agent_debate_status` until the moderator parks the session in `ready-for-approval`, `escalated`, or `error`. The moderator does NOT write the synthesis file on its own — approval must be explicit.
252
252
 
253
253
  #### 5M.1 Start the debate
254
254
 
255
255
  Call `agent_debate_structured` with:
256
256
 
257
257
  - `topic` — short slug (used in file names under `.agestra/workspace/`).
258
+ - `mode` — `"review"` for QA/review consensus, `"idea"` for exploratory design or option discovery.
258
259
  - `scope` — concrete framing: file list, task description, or the design doc path.
259
260
  - `participants` — the provider/agent IDs the user specified at Work Mode selection, or the qualified set from `trace_summary`.
261
+ - `source_documents` — optional pre-created individual documents, each as `{ "document_id": "...", "provider": "..." }`.
260
262
  - `auto_inject_specialists` — default `true`. When true, the moderator auto-adds host reviewer/QA specialists on top of `participants` based on topic heuristics (currently exposed as `claude-reviewer` and/or `claude-qa` for compatibility). When the user wants verbatim participants only, pass `false`.
261
263
  - `exclude_participants` — participant IDs to never include, applied regardless of `auto_inject_specialists`. Use this when the user explicitly wants a provider (including Ollama — there is no automatic Ollama filter anymore) kept out.
262
264
  - `leader` — omit unless you need to override the session-context leader.
@@ -264,14 +266,14 @@ Call `agent_debate_structured` with:
264
266
  - `individual_review_prompt` / `files` — optional framing for the individual-review fan-out.
265
267
  - `locale` — pass the locale resolved from `agestra.config.json` (fall back to providers.config locale). The moderator uses it for human-facing text; provider prompts remain English regardless.
266
268
 
267
- The tool returns a `StructuredDebateRunResult` with the debate snapshot and a `debate_id`. Capture both.
269
+ The tool returns a session ID and `status: running`. Capture the `session_id` and use `agent_debate_status` for progress and artifact paths.
268
270
 
269
- #### 5M.2 Await terminal state
271
+ #### 5M.2 Poll terminal state
270
272
 
271
- The result `status` will be one of:
273
+ Call `agent_debate_status` periodically. The structured status includes phase, current provider, round, participant progress, item summary, and document paths. Stop polling when `status` is one of:
272
274
 
273
- - `ready-for-approval` (subtype `consensus`) — every proposal was accepted or rejected and aggregation converged.
274
- - `ready-for-approval` (subtype `escalated`) — `max_rounds` was reached without consensus and the user elected to escalate during moderator prompts.
275
+ - `ready-for-approval` — every proposal was accepted/rejected or aggregation reached the approval gate.
276
+ - `escalated` — `max_rounds` was reached with unresolved items.
275
277
  - `error` — aggregation failed. Treat as an orchestration failure; do NOT call approve/continue/reject.
276
278
 
277
279
  In either `ready-for-approval` subtype the synthesis has NOT been written yet. The terminal report names the three follow-up tools; do not skip them.
@@ -283,7 +285,7 @@ A 24h inactivity timer starts the moment the session enters `ready-for-approval`
283
285
  Before deciding, read the on-disk outputs — the debate writes three folders under the workspace:
284
286
 
285
287
  - `.agestra/workspace/individual/` — per-participant individual reviews (`individual_{participant}_{topic}_{date}_{seq}.md`). Includes auto-injected host specialists like `claude-reviewer` / `claude-qa` when present.
286
- - `.agestra/workspace/debates/` — debate transcript (`debate_{topic}_{date}_{seq}.md`) plus the approval snapshot (`{sessionId}.approval.json`). The snapshot remains after `approve` / `reject` for idempotent replays and audit.
288
+ - `.agestra/workspace/debates/` — debate transcript (`debate_{topic}_{date}_{seq}.md`), consensus ledger (`{sessionId}.consensus.json`), and structured session record (`{sessionId}.session.json`). The session record remains after `approve` / `reject` for idempotent replays and audit.
287
289
  - `.agestra/workspace/synthesis/` — the final synthesis document, written only after `agent_debate_approve` succeeds.
288
290
 
289
291
  Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whether the debate outcome matches the design.
@@ -292,9 +294,9 @@ Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whe
292
294
 
293
295
  Pick exactly one of the three follow-up tools, based on inspection:
294
296
 
295
- 1. **Accept the outcome** → call `agent_debate_approve` with `debate_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the snapshot to `approved`, and returns `synthesisDocPath`. Proceed to Phase 7 and relay the path to the user.
296
- 2. **Need more deliberation** → call `agent_debate_continue` with `debate_id` and `additional_rounds` (`3`, `5`, or `10` only). The engine resumes the round loop from the prior snapshot and eventually re-parks the session in `ready-for-approval`. Loop back to 5M.2. Use this when the debate was close but unresolved, or when `escalated` came too early.
297
- 3. **Reject the outcome** → call `agent_debate_reject` with `debate_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
297
+ 1. **Accept the outcome** → call `agent_debate_approve` with `session_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the session record to `approved`, and returns `synthesisDocPath`. Proceed to Phase 7 and relay the path to the user.
298
+ 2. **Need more deliberation** → call `agent_debate_continue` with `session_id` and `additional_rounds` (`3`, `5`, or `10` only). The handler returns `status: running`; poll `agent_debate_status` again until it reaches the approval gate. Use this when the debate was close but unresolved, or when `escalated` came too early.
299
+ 3. **Reject the outcome** → call `agent_debate_reject` with `session_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
298
300
 
299
301
  All three tools are idempotent on terminal states — re-calling returns the cached outcome.
300
302
 
@@ -413,8 +415,8 @@ The design document is the authority. If an AI's output conflicts with the desig
413
415
  - `provider_list` / `provider_health` — check external AI availability
414
416
  - `trace_summary` / `trace_record` / `trace_compare` — provider quality tracking
415
417
  - `ai_chat` / `ai_analyze_files` / `ai_compare` — query external AI
416
- - `agent_debate_structured` — start a structured multi-AI debate (individual reviews → clarification → rounds → aggregation → `ready-for-approval`). Supports `auto_inject_specialists` (default `true`) to auto-add host reviewer/QA specialists (compatibility IDs: `claude-reviewer` / `claude-qa`) based on topic, and `exclude_participants` as the escape hatch (also the way to keep Ollama or any other provider out — there is no automatic Ollama filter).
417
- - `agent_debate_approve` / `agent_debate_continue` / `agent_debate_reject` — leader-only finalization tools for a `ready-for-approval` session. `approve` writes the synthesis under `.agestra/workspace/synthesis/`; `continue(additional_rounds=N)` accepts only `3`, `5`, or `10`; `reject(reason=..., spawn_issue?=true)` closes the session with no synthesis.
418
+ - `agent_debate_structured` — start a structured multi-AI debate in the background (individual/source material → clarification → JSON consensus rounds → aggregation → approval gate). It returns `status: running`; poll `agent_debate_status`. Supports `mode: "review" | "idea"`, optional `source_documents`, `auto_inject_specialists` (default `true`) to auto-add host reviewer/QA specialists (compatibility IDs: `claude-reviewer` / `claude-qa`) based on topic, and `exclude_participants` as the escape hatch (also the way to keep Ollama or any other provider out — there is no automatic Ollama filter).
419
+ - `agent_debate_approve` / `agent_debate_continue` / `agent_debate_reject` — leader-only finalization tools for a structured session at the approval gate. `approve` writes the synthesis under `.agestra/workspace/synthesis/`; `continue(additional_rounds=N)` accepts only `3`, `5`, or `10` and returns `running`; `reject(reason=..., spawn_issue?=true)` closes the session with no synthesis.
418
420
  - Low-level debate primitives — legacy / diagnostic use only; prefer the structured debate tools for review, idea, and design workflows.
419
421
  - `agent_cross_validate` — cross-validate outputs between providers
420
422
  - `cli_worker_spawn` / `cli_worker_status` / `cli_worker_collect` / `cli_worker_stop` — manage Codex/Gemini CLI workers
@@ -57,6 +57,7 @@ Follow the structured consensus document model from `docs/superpowers/specs/2026
57
57
  The JSON consensus ledger is the source of truth. Generated Markdown must not be parsed or hand-edited to change provider stances, item status, or consensus state.
58
58
 
59
59
  1. Start an approval-gated structured debate with `agent_debate_structured`.
60
+ - **mode:** use `"idea"` for exploratory architecture/design option discovery. Use `"review"` only when reviewing an already-written design artifact.
60
61
  - **topic:** the design subject.
61
62
  - **participants:** only providers reported available by `environment_check` / `provider_list`, plus the host design specialist when the engine supports it.
62
63
  - **scope:** the design subject plus any user-provided constraints, relevant existing design docs, and code areas that should anchor the design.
@@ -69,22 +70,23 @@ The JSON consensus ledger is the source of truth. Generated Markdown must not be
69
70
  4. Recommended approach — one choice with justification.
70
71
  5. Implementation plan — step-by-step build sequence with dependencies.
71
72
  6. Risks and mitigations.
73
+ - The tool returns immediately with `status: running`; capture the `session_id`.
72
74
 
73
- 2. Let the MCP moderator engine own the consensus flow.
75
+ 2. Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
74
76
  - The engine writes individual first-pass documents under `individual/`.
75
- - The engine owns provider turn order, JSON turn packets, response validation, ledger updates, aggregated debate Markdown rendering, synthesis rendering, and the final terminal table.
76
- - Participants submit explicit JSON stances through the MCP consensus turn packet.
77
+ - The engine owns provider turn order, JSON turn packets, response validation, ledger updates, aggregated debate Markdown rendering, status rendering, synthesis rendering, and final terminal table generation for direct engine callers.
78
+ - Participants submit explicit JSON stances through the MCP consensus turn packet. Structured consensus turns must return JSON only in the canonical `{ provider, round, items }` shape.
77
79
  - The leader/moderator must not infer agreement from prose and must not edit provider stances manually.
78
80
  - There must be one generated aggregated debate Markdown document per run, not one Markdown document per provider turn or round.
79
81
 
80
82
  3. Use the approval gate.
81
- - If the terminal report says the session is `ready-for-approval`, inspect the report and call exactly one of:
83
+ - If status says the session is `ready-for-approval` or `escalated`, inspect the status/artifacts and call exactly one of:
82
84
  - `agent_debate_approve` to write the synthesis document.
83
85
  - `agent_debate_continue` to run 3, 5, or 10 more rounds.
84
86
  - `agent_debate_reject` to close without synthesis.
85
87
  - If the result is `error`, do not approve; report the orchestration failure.
86
88
 
87
89
  4. Present the final result.
88
- - Name the debate Markdown path, consensus JSON ledger path, approval snapshot path if surfaced, and synthesis document path if approved.
90
+ - Name the debate Markdown path, consensus JSON ledger path, structured session record path if surfaced, and synthesis document path if approved.
89
91
  - Summarize accepted design decisions, excluded options, and unresolved/disputed items.
90
92
  - Preserve each provider's rationale for disputed positions.
package/commands/idea.md CHANGED
@@ -46,31 +46,33 @@ In parallel:
46
46
  - For each available external provider, call `ai_chat` with `save_as_document` using:
47
47
  - **save_as_document.kind:** `"individual"`
48
48
  - **save_as_document.title:** `Idea Exploration — {provider}`
49
- - **save_as_document.metadata:** `{ "Task": "{topic}", "Mode": "Independent", "Round": "0" }`
50
- - **prompt:** The Mode A or Mode B prompt from the `agestra-idea` skill, including the user-interview context and the structured output requirements (Title, Category, Evidence, Description, Effort, Priority).
49
+ - **save_as_document.metadata:** `{ "Provider": "{provider}", "Task": "{topic}", "Mode": "Independent", "Round": "0" }`
50
+ - **prompt:** The Mode A or Mode B prompt from the `agestra-idea` skill, including the user-interview context and the structured output requirements (Title, Category, Evidence, Description, Effort, Priority). The response must include a `<proposals>` block. Each idea is an `<item>` with `id`, `title`, `severity`, and optional `location`; put category, evidence, effort, and priority in the item body.
51
51
 
52
52
  Collect every returned Document ID.
53
53
 
54
54
  ### 3.2 Structured consensus flow
55
55
 
56
56
  Call `agent_debate_structured` with:
57
+ - **mode:** `"idea"`.
57
58
  - **topic:** the user's idea-discovery topic.
58
59
  - **participants:** the providers that `environment_check` reports as Available, plus the host ideator.
59
- - **source document IDs:** every individual exploration document from 3.1.
60
+ - **source_documents:** optional; use only when 3.1 created individual documents, and pass `{ "document_id": "...", "provider": "..." }` for each document.
60
61
  - **leader:** the current host/leader identity.
62
+ - Capture the returned `session_id`; the tool returns quickly with `status: running`.
61
63
 
62
64
  The MCP moderator engine handles the rest:
63
65
  - It converts each independent idea into stable `ITEM-*` records in `{sessionId}.consensus.json`.
64
66
  - It sends each provider a JSON turn packet, one provider at a time, with every assigned item listed exactly once.
65
- - Providers answer with `agree`, `disagree`, `opinion`, or `revise`; `disagree`, `opinion`, and `revise` require a comment, and `revise` creates a child item.
67
+ - Providers answer with JSON only in the canonical `{ provider, round, items }` shape. Each item uses `agree`, `disagree`, `opinion`, or `revise`; `disagree`, `opinion`, and `revise` require a comment, and `revise` creates a child item.
66
68
  - Malformed JSON is retried once. Repeated failures are recorded as `no_response`; unavailable providers are removed from the active participant set with a moderator note.
67
69
  - The debate markdown is regenerated from the JSON ledger after each turn. Treat it as a readable report, not as the source of truth.
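The turn rules in the list above can be sketched as a validator plus a retry-once wrapper. This is illustrative only: the function names and the per-item keys `id`, `stance`, and `comment` are assumptions, since the text only fixes the outer `{ provider, round, items }` shape and the four stance values. The sketch also retries on any validation failure, not just malformed JSON.

```python
import json

STANCES = {"agree", "disagree", "opinion", "revise"}
COMMENT_REQUIRED = STANCES - {"agree"}


def validate_turn(raw: str, provider: str, round_no: int, assigned: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the turn is acceptable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(data, dict) or not isinstance(data.get("items"), list):
        return ["expected an object with provider, round, and an items list"]
    problems = []
    if data.get("provider") != provider:
        problems.append("provider mismatch")
    if data.get("round") != round_no:
        problems.append("round mismatch")
    seen = []
    for item in data["items"]:
        if not isinstance(item, dict):
            problems.append("item is not an object")
            continue
        item_id, stance = item.get("id"), item.get("stance")
        seen.append(item_id)
        if stance not in STANCES:
            problems.append(f"{item_id}: unknown stance {stance!r}")
        elif stance in COMMENT_REQUIRED and not item.get("comment"):
            problems.append(f"{item_id}: stance '{stance}' requires a comment")
    # Every assigned item must appear exactly once.
    if len(seen) != len(assigned) or set(seen) != assigned:
        problems.append("items must list every assigned item exactly once")
    return problems


def collect_turn(ask, provider: str, round_no: int, assigned: set[str]) -> dict:
    """Ask the provider, retry once on failure, then record no_response."""
    for _attempt in range(2):  # first attempt plus one retry
        raw = ask()
        if not validate_turn(raw, provider, round_no, assigned):
            return json.loads(raw)
    return {"provider": provider, "round": round_no, "status": "no_response"}


good = json.dumps({"provider": "gemini", "round": 1, "items": [
    {"id": "ITEM-1", "stance": "agree"},
    {"id": "ITEM-2", "stance": "revise", "comment": "Narrow the scope."},
]})
bad = json.dumps({"provider": "gemini", "round": 1, "items": [
    {"id": "ITEM-1", "stance": "disagree"},
    {"id": "ITEM-2", "stance": "agree"},
]})
assigned_items = {"ITEM-1", "ITEM-2"}
good_problems = validate_turn(good, "gemini", 1, assigned_items)
bad_problems = validate_turn(bad, "gemini", 1, assigned_items)
fallback = collect_turn(lambda: "not json", "gemini", 1, assigned_items)
```

Returning a list of problems instead of raising makes it easy to attach the reasons to a moderator note when a provider ends up as `no_response`.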
68
70
 
69
71
  ### 3.3 Approval gate
70
72
 
71
- When `agent_debate_structured` returns:
72
- - If the terminal report is `ready-for-approval`, call `agent_debate_approve` to write the final synthesis document.
73
- - If the result needs more discussion, call `agent_debate_continue` with the requested additional rounds.
73
+ Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
74
+ - If status is `ready-for-approval`, call `agent_debate_approve` to write the final synthesis document.
75
+ - If status is `escalated` or the result needs more discussion, call `agent_debate_continue` with the requested additional rounds.
74
76
  - If the result should be closed without synthesis, call `agent_debate_reject` and preserve the ledger/debate documents for inspection.
75
77
 
76
78
  ### 3.4 Present to the user
@@ -76,11 +76,11 @@ Spawn `agestra:agestra-team-lead` in Multi-AI mode with:
76
76
  - instruction to use isolated CLI workers for suitable Codex/Gemini tasks
77
77
  - instruction to review changes with `agent_changes_review`
78
78
  - instruction to merge only after review
79
- - instruction to run `qa_run` and `agestra:agestra-qa` before final completion
79
+ - instruction to run `qa_run` as workspace build/test verification, then `agestra:agestra-qa` if design-compliance or orchestration review is needed before final completion
80
80
 
81
81
  ## Step 6: Final verification
82
82
 
83
83
  Before reporting completion:
84
- 1. Run `qa_run`
84
+ 1. Run `qa_run` against the target workspace build/test profile. Treat it as project verification, not an orchestration-health check.
85
85
  2. If deeper design-compliance verification is needed, spawn `agestra:agestra-qa`
86
86
  3. Summarize implementation result, review result, and QA outcome
@@ -78,16 +78,19 @@ Call `environment_check` to determine which providers and modes are available.
78
78
  a. In parallel:
79
79
  - Spawn the `agestra:agestra-reviewer` agent for host-local independent analysis.
80
80
  After the agent completes, save the host reviewer's result as a document via `workspace_create_document`:
81
+ - **kind:** `"individual"`
81
82
  - **title:** `Code Review — host/reviewer`
82
- - **metadata:** `{ "Provider": "host/reviewer", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent" }`
83
+ - **metadata:** `{ "Provider": "host/reviewer", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent", "Round": "0" }`
83
84
  - **content:** The reviewer agent's full output.
84
85
  - For each available provider, call `ai_chat` with `save_as_document` to let each AI produce its own document directly:
86
+ - **save_as_document.kind:** `"individual"`
85
87
  - **save_as_document.title:** `Code Review — {provider}`
86
- - **save_as_document.metadata:** `{ "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent" }`
88
+ - **save_as_document.metadata:** `{ "Provider": "{provider}", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent", "Round": "0" }`
87
89
  - **prompt:**
88
90
 
89
91
  > Review the following code. Focus on: [selected focus areas].
90
92
  > For each finding, provide severity (CRITICAL/HIGH/MEDIUM/LOW), file:line location, and evidence.
93
+ > Include a `<proposals>` block. Each finding is an `<item>` with `id`, `title`, `severity`, and `location`; put the evidence and impact in the item body.
91
94
  >
92
95
  > Target: [the review target]
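The `<proposals>` block requested by the new prompt line might be consumed roughly as follows. This is a sketch under loud assumptions: the diff does not say whether `id`, `title`, `severity`, and `location` are XML attributes or child elements, so attributes are assumed here, and `extract_proposals` is a hypothetical helper rather than part of Agestra.

```python
import re
import xml.etree.ElementTree as ET

SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}


def extract_proposals(text):
    """Pull findings out of the first <proposals> block in a provider reply.

    Assumes (not confirmed by the source) that each <item> carries
    id/title/severity/location as attributes and the evidence as its body.
    """
    match = re.search(r"<proposals>.*?</proposals>", text, re.DOTALL)
    if match is None:
        return []
    root = ET.fromstring(match.group(0))
    findings = []
    for item in root.findall("item"):
        severity = item.get("severity", "")
        if severity not in SEVERITIES:
            raise ValueError(f"unexpected severity: {severity!r}")
        findings.append({
            "id": item.get("id"),
            "title": item.get("title"),
            "severity": severity,
            "location": item.get("location"),
            "body": (item.text or "").strip(),
        })
    return findings
```

Strict XML parsing will reject item bodies containing unescaped `<` or `&`, which code snippets often do; a real consumer would likely need a more tolerant parser.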
93
96
 
@@ -101,19 +104,22 @@ Call `environment_check` to determine which providers and modes are available.
101
104
  - The moderator's integrated document becomes the starting document.
102
105
 
103
106
  2. Start a structured debate session with `agent_debate_structured`.
104
- - Use the integrated document's title/topic as the debate topic.
105
- - Reuse the same reviewer set from step 1, excluding `ollama`.
106
- - Pass the independent document IDs as source context when available.
107
-
108
- 3. Let the MCP moderator engine run the consensus loop.
107
+ - **mode:** `"review"`.
108
+ - **topic:** the review target and selected focus summary.
109
+ - **participants:** the same reviewer set from step 1, excluding `ollama`.
110
+ - **source_documents:** optional; use only when step 1 created individual documents, and pass `{ "document_id": "...", "provider": "..." }` for each document.
111
+ - **leader:** the current host/leader identity when known.
112
+ - The tool returns quickly with `status: running`; capture the `session_id`.
113
+
114
+ 3. Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
109
115
  - The engine assigns stable `ITEM-*` IDs, builds `{sessionId}.consensus.json`, and sends each provider a JSON turn packet.
110
- - Every active provider must answer every assigned item with exactly one structured stance: `agree`, `disagree`, `opinion`, or `revise`.
116
+ - Every active provider must return JSON only in the canonical `{ provider, round, items }` shape, with exactly one structured stance for every assigned item: `agree`, `disagree`, `opinion`, or `revise`.
111
117
  - The engine retries malformed JSON once, records non-responses explicitly, and removes unavailable providers from the active participant set with a moderator note.
112
118
  - The engine regenerates the aggregate debate markdown from the JSON ledger. Do not hand-edit the generated debate markdown to change consensus state.
113
119
 
114
120
  4. Use the approval gate.
115
- - If the terminal report says the session is `ready-for-approval`, call `agent_debate_approve` to write the synthesis.
116
- - If more discussion is needed, call `agent_debate_continue`.
121
+ - If status says the session is `ready-for-approval`, call `agent_debate_approve` to write the synthesis.
122
+ - If status says `escalated` or more discussion is needed, call `agent_debate_continue`.
117
123
  - If the result should be closed without synthesis, call `agent_debate_reject`.
118
124
 
119
125
  5. Present the final result: