npm - agestra - Versions diffs - 4.12.2 → 4.12.4 - Mend

agestra 4.12.2 → 4.12.4


Files changed (16)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/AGENTS.md +1 -1
package/GEMINI.md +1 -1
package/README.md +4 -4
package/agents/agestra-moderator.md +15 -15
package/agents/agestra-team-lead.md +14 -12
package/commands/design.md +7 -5
package/commands/idea.md +9 -7
package/commands/implement.md +2 -2
package/commands/review.md +16 -10
package/dist/bundle.js +290 -284
package/package.json +3 -2
package/skills/design.md +1 -0
package/skills/idea.md +19 -30
package/skills/review.md +10 -6

package/.claude-plugin/marketplace.json CHANGED

@@ -12,7 +12,7 @@
       "name": "agestra",
       "source": "./",
       "description": "Orchestrate Ollama, Gemini, and Codex for multi-AI debates, code review, and cross-validation",
-      "version": "4.12.2",
+      "version": "4.12.4",
       "author": {
         "name": "mua-vtuber"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agestra",
-  "version": "4.12.2",
+  "version": "4.12.4",
   "description": "Claude Code plugin — orchestrate Ollama, Gemini, and Codex for multi-AI debates, code review, and cross-validation",
   "mcpServers": {
     "agestra": {

package/AGENTS.md CHANGED

@@ -27,7 +27,7 @@ This repository includes a Codex-friendly host wrapper for Agestra.
 - `environment_check` and `provider_list`: inspect host/provider state first
 - `agent_debate_structured` (with `agent_debate_approve`/`_continue`/`_reject`) and `agent_debate_review`: run approval-gated multi-provider review flows
 - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: use for autonomous Codex/Gemini worker tasks
-- `qa_run`: verify implementation before reporting completion
+- `qa_run`: run workspace build/test verification before reporting implementation completion
 ## Project Assets

package/GEMINI.md CHANGED

@@ -31,4 +31,4 @@ Each command delegates to the shared workflow specs in `commands/*.md`.
 - `agent_debate_structured`, `agent_debate_approve`/`_continue`/`_reject`, `agent_debate_review`: structured multi-provider reviews and approval-gated debates
 - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: autonomous worker lifecycle
 - `workspace_*`: document-backed review and aggregation flows
-- `qa_run`: final verification step before completion
+- `qa_run`: workspace build/test verification before implementation completion

package/README.md CHANGED

@@ -206,13 +206,13 @@ Turborepo monorepo with 8 packages:
 | Tool | Description |
 |------|-------------|
 | `agent_debate_start` | Start a multi-provider debate (non-blocking, optional quality loop + validator) |
-| `agent_debate_status` | Check debate status and transcript |
+| `agent_debate_status` | Check legacy debate status or structured session progress, phase, participant activity, and document paths |
 | `agent_debate_create` | Create a turn-based debate session (returns debate ID) |
 | `agent_debate_turn` | Execute one provider's turn; supports `provider: "claude"` for Claude's independent participation |
 | `agent_debate_conclude` | End a debate and generate final transcript |
-| `agent_debate_structured` | Start an approval-gated structured debate — individual reviews, optional alias clarification, rounds with vote aggregation, parks in `ready-for-approval` (no synthesis until approved) |
+| `agent_debate_structured` | Start an approval-gated structured debate in the background — individual reviews, optional alias clarification, JSON consensus rounds, status polling, and no synthesis until approved |
 | `agent_debate_approve` | Leader-approve a ready-for-approval structured debate; writes the synthesis document and closes the session |
-| `agent_debate_continue` | Run additional rounds on a ready-for-approval (or escalated) structured-debate session (3/5/10) |
+| `agent_debate_continue` | Start additional background rounds on a ready-for-approval or escalated structured-debate session (3/5/10), then poll status |
 | `agent_debate_reject` | Reject a structured-debate session without writing synthesis |
 | `agent_debate_review` | Send a document to multiple providers for independent review |
 | `agent_cross_validate` | Cross-validate outputs (agent-tier validators only) |
@@ -272,7 +272,7 @@ Turborepo monorepo with 8 packages:
 | Tool | Description |
 |------|-------------|
-| `qa_run` | Run automatic QA with detected build/test commands and a PASS/FAIL summary |
+| `qa_run` | Run vetted workspace build/test QA profiles and return a PASS/FAIL summary |
 ### Trace / Observability (3)

package/agents/agestra-moderator.md CHANGED

@@ -48,13 +48,13 @@ You operate in one of four modes depending on how you are invoked:
 ### Mode: Structured Debate
-**Preferred entry point:** Call `agent_debate_structured` with the topic, scope, participants, source documents, and leader. The moderator engine owns the full lifecycle: individual reviews, JSON consensus ledger creation, optional alias clarification, sequential provider turns, strict JSON response validation, generated debate markdown, approval snapshot, and final synthesis after leader approval.
+**Preferred entry point:** Call `agent_debate_structured` with `mode`, topic, scope, participants, optional `source_documents`, and leader. Use `mode: "review"` for code/document review and `mode: "idea"` for idea/design option discovery. `source_documents` is optional and must use `{ "document_id": "...", "provider": "..." }` entries when independent documents already exist. The tool creates a structured session record immediately and returns `status: running`; use `agent_debate_status` to monitor phase, provider progress, item summary, and document paths. The moderator engine owns the full lifecycle: individual/source material loading, JSON consensus ledger creation, optional alias clarification, sequential provider turns, strict JSON response validation, generated debate markdown, structured session record, and final synthesis after leader approval.
 The JSON consensus ledger is the source of truth. Debate markdown and synthesis markdown are generated human-readable artifacts. The moderator may inspect and report their paths, but must not edit markdown to change item status, provider stance, or consensus state.
 ### Phase 1: Individual reviews
-Before any consensus round, every participant produces an independent review of the scope. These reviews are written under `.agestra/workspace/individual/`. Each participant lists candidate items with fields such as `{ title, severity, location, statement }`; the engine keeps links back to these source documents in each consensus item.
+Before any consensus round, every participant produces independent source material unless `source_documents` were supplied. These documents are written or read under `.agestra/workspace/individual/`. Each document must include a `<proposals>` block with `<item id="..." title="..." severity="..." location="...">...</item>` entries; the engine keeps links back to these source documents in each consensus item.
 ### Phase 2: Consensus ledger creation
@@ -105,15 +105,15 @@ After each accepted provider turn, the engine recomputes item status from ledger
 The engine persists the JSON ledger atomically, then regenerates:
 - the aggregate debate markdown in `debates/`
-- the terminal consensus report
-- the approval snapshot when the session reaches `ready-for-approval`
+- the structured status/session record (`{sessionId}.session.json`)
+- the terminal consensus report when a blocking engine caller requests one
 ### Phase 5: Leader approval gate
 The moderator does not write the final synthesis file on its own. Three dedicated MCP tools close out the flow:
 - `agent_debate_approve`: writes the synthesis markdown, updates ledger document paths, and transitions to `approved`.
-- `agent_debate_continue`: loads the persisted ledger/snapshot and runs additional consensus rounds.
+- `agent_debate_continue`: loads the persisted ledger/session record, starts additional consensus rounds in the background, and returns `running`.
 - `agent_debate_reject`: closes without synthesis. With `spawn_issue = true`, an issue document can be written under `individual/` listing non-accepted items.
 Idempotency: a second call on a terminal state (`approved`, `rejected`, `leader-timeout`) returns the cached outcome. Calling approval-gate tools on a `running` or `error` session returns `isError: true` with a descriptive state message.
@@ -129,23 +129,23 @@ Idempotency: a second call on a terminal state (`approved`, `rejected`, `leader-
              │ accepted/       │ + user chose escalate
              │ rejected        │
              ▼                 ▼
-          ready-for-approval  ◀── snapshot JSON written to disk (D12)
+          ready-for-approval  ◀── session JSON written to disk
               │    │    │
    _approve  │    │    │ _continue
               ▼    │    ▼
-          approved │   running (snapshot reloaded; max_rounds += additional_rounds)
-     (snapshot    │
+          approved │   running (session reloaded; max_rounds += additional_rounds)
+     (session     │
        kept)      │ _reject
                    ▼
-               rejected (snapshot kept)
+               rejected (session kept)
-(ready-for-approval ─ 24h no tool call ─▶ leader-timeout [snapshot kept])
+(ready-for-approval ─ 24h no tool call ─▶ leader-timeout [session kept])
 (running ─ uncaught internal error ─▶ error)
 ```
-**Snapshot and ledger persistence (D12).** On entry to `ready-for-approval`, the engine writes `{workspaceBaseDir}/.agestra/workspace/debates/{sessionId}.approval.json` atomically and keeps `{sessionId}.consensus.json` as the durable consensus ledger. The snapshot carries session config, consensus-derived aggregate status, rounds, document paths, `readyAt`, and `deadline`. The leader must invoke one of the three approval-gate tools within `STRUCTURED_DEBATE_APPROVAL_TIMEOUT_MS` (24 hours); otherwise the background sweep (scheduled by `STRUCTURED_DEBATE_SESSION_SWEEP_INTERVAL_MS`, default 1 hour) scans the `debates/` directory, finds snapshots with `deadline < now` still in `ready-for-approval`, and transitions them to `leader-timeout` (snapshot kept in place so the leader can still inspect/reject afterwards).
+**Session and ledger persistence.** The engine writes `{workspaceBaseDir}/.agestra/workspace/debates/{sessionId}.session.json` atomically and keeps `{sessionId}.consensus.json` as the durable consensus ledger. The session record carries lifecycle status, current phase, participant progress, session config, consensus-derived aggregate status, rounds, document paths, `readyAt`, and `deadline`. The leader must invoke one of the three approval-gate tools within `STRUCTURED_DEBATE_APPROVAL_TIMEOUT_MS` (24 hours); otherwise the background sweep (scheduled by `STRUCTURED_DEBATE_SESSION_SWEEP_INTERVAL_MS`, default 1 hour) scans the `debates/` directory, finds sessions with `deadline < now` still in `ready-for-approval`, and transitions them to `leader-timeout` (session record kept in place so the leader can still inspect/reject afterwards). Legacy `.approval.json` records may be read for migration, but new writes use `.session.json`.
-The JSON consensus ledger is the truth of content and item state. The approval snapshot is the resumable gate state. Generated markdown is readable output only. Since handlers read persisted state from disk first (memory is a write-through cache), approval and continuation keep working after server restart.
+The JSON consensus ledger is the truth of content and item state. The structured session record is the resumable gate/progress state. Generated markdown is readable output only. Since handlers read persisted state from disk first (memory is a write-through cache), status, approval, and continuation keep working after server restart.
 </Approval_Gate_State_Machine>
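
The timeout sweep described above can be sketched as a pure function over session records. This is a minimal illustration, not the engine's code: the `SessionRecord` interface, the `sweepExpired` name, and the epoch-millisecond encoding of `deadline` are all assumptions made for the example. Only the statuses and the `deadline < now` rule come from the text.

```typescript
// Hypothetical record shape: only the fields needed for the sweep are modeled.
type SessionStatus =
  | "running"
  | "ready-for-approval"
  | "escalated"
  | "approved"
  | "rejected"
  | "leader-timeout"
  | "error";

interface SessionRecord {
  sessionId: string;
  status: SessionStatus;
  deadline?: number; // assumed to be epoch milliseconds
}

// Return a new list in which every overdue ready-for-approval session is
// marked leader-timeout. Records are copied, never dropped, which mirrors
// the "session kept" notes in the state diagram.
function sweepExpired(sessions: SessionRecord[], now: number): SessionRecord[] {
  return sessions.map((s) =>
    s.status === "ready-for-approval" && s.deadline !== undefined && s.deadline < now
      ? { ...s, status: "leader-timeout" as const }
      : s
  );
}
```

Keeping the sweep free of file I/O makes the transition rule easy to test; a real sweep would presumably load and persist the `{sessionId}.session.json` files around a call like this.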
@@ -156,7 +156,7 @@ All paths relative to `workspaceBaseDir` (`.agestra/workspace/` under the projec
 ```
 .agestra/workspace/
   individual/   — each participant's initial independent review (pre-debate; no votes)
-  debates/      — generated debate markdown + {sessionId}.consensus.json + {sessionId}.approval.json
+  debates/      — generated debate markdown + {sessionId}.consensus.json + {sessionId}.session.json
   synthesis/    — leader-approved final synthesis document (written only on _approve)
   reviews/      — legacy, read-only; no new writes
 ```
@@ -454,9 +454,9 @@ If `max_rounds` is hit with open proposals, the moderator surfaces the choice to
 <Tool_Usage>
 - `provider_list` — check available providers at the start.
-- `agent_debate_structured` — **recommended entry point for Structured Debate**: runs individual reviews, optional alias clarification, JSON consensus turns, ledger persistence, generated debate markdown, and the approval gate. Does NOT write synthesis.
+- `agent_debate_structured` — **recommended entry point for Structured Debate**: accepts `mode: "review" | "idea"` and optional `source_documents`, starts or loads individual source material, runs optional alias clarification, JSON consensus turns, ledger persistence, generated debate markdown, and the approval gate in the background. Returns `running`; poll `agent_debate_status`. Does NOT write synthesis.
 - `agent_debate_approve` — write synthesis markdown, mark the snapshot `approved`, close the session.
-- `agent_debate_continue` — force additional rounds on a `ready-for-approval` session.
+- `agent_debate_continue` — force additional rounds on a `ready-for-approval` or `escalated` session; returns `running`, then poll status.
 - `agent_debate_reject` — close without synthesis; optionally spawn an issue branch listing non-accepted proposals.
 - Legacy manual debate primitives — diagnostic use only; do not use them for review, idea, or design consensus workflows.
 - `agent_debate_review` — send a document to providers for structured review (Document Review mode).

package/agents/agestra-team-lead.md CHANGED

@@ -248,15 +248,17 @@ Run formal verification with automatic fix loop:
 > Used when Work Mode in Phase 2 was **Multi-AI**. Replaces Phase 5 (QA) and Phase 6 (Quality Gate) in a single coordinated cross-AI review. In Leader-host-only mode, skip this phase.
-Run the structured-debate MCP flow. This is a **two-step** lifecycle: the moderator runs the debate to a terminal aggregation state, then parks the session in `ready-for-approval` waiting for the leader (this agent) to finalize. The moderator does NOT write the synthesis file on its own — approval must be explicit.
+Run the structured-debate MCP flow. This is a **background lifecycle**: `agent_debate_structured` creates a durable session record immediately and returns `status: running`; the leader polls `agent_debate_status` until the moderator parks the session in `ready-for-approval`, `escalated`, or `error`. The moderator does NOT write the synthesis file on its own — approval must be explicit.
 #### 5M.1 Start the debate
 Call `agent_debate_structured` with:
 - `topic` — short slug (used in file names under `.agestra/workspace/`).
+- `mode` — `"review"` for QA/review consensus, `"idea"` for exploratory design or option discovery.
 - `scope` — concrete framing: file list, task description, or the design doc path.
 - `participants` — the provider/agent IDs the user specified at Work Mode selection, or the qualified set from `trace_summary`.
+- `source_documents` — optional pre-created individual documents, each as `{ "document_id": "...", "provider": "..." }`.
 - `auto_inject_specialists` — default `true`. When true, the moderator auto-adds host reviewer/QA specialists on top of `participants` based on topic heuristics (currently exposed as `claude-reviewer` and/or `claude-qa` for compatibility). When the user wants verbatim participants only, pass `false`.
 - `exclude_participants` — participant IDs to never include, applied regardless of `auto_inject_specialists`. Use this when the user explicitly wants a provider (including Ollama — there is no automatic Ollama filter anymore) kept out.
 - `leader` — omit unless you need to override the session-context leader.
@@ -264,14 +266,14 @@ Call `agent_debate_structured` with:
 - `individual_review_prompt` / `files` — optional framing for the individual-review fan-out.
 - `locale` — pass the locale resolved from `agestra.config.json` (fall back to providers.config locale). The moderator uses it for human-facing text; provider prompts remain English regardless.
-The tool returns a `StructuredDebateRunResult` with the debate snapshot and a `debate_id`. Capture both.
+The tool returns a session ID and `status: running`. Capture the `session_id` and use `agent_debate_status` for progress and artifact paths.
-#### 5M.2 Await terminal state
+#### 5M.2 Poll terminal state
-The result `status` will be one of:
+Call `agent_debate_status` periodically. The structured status includes phase, current provider, round, participant progress, item summary, and document paths. Stop polling when `status` is one of:
-- `ready-for-approval` (subtype `consensus`) — every proposal was accepted or rejected and aggregation converged.
-- `ready-for-approval` (subtype `escalated`) — `max_rounds` was reached without consensus and the user elected to escalate during moderator prompts.
+- `ready-for-approval` — every proposal was accepted/rejected or aggregation reached the approval gate.
+- `escalated` — `max_rounds` was reached with unresolved items.
 - `error` — aggregation failed. Treat as an orchestration failure; do NOT call approve/continue/reject.
 In either `ready-for-approval` subtype the synthesis has NOT been written yet. The terminal report names the three follow-up tools; do not skip them.
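
The polling step in 5M.2 can be sketched as a small loop. This is an illustration under stated assumptions: `CallTool` is a hypothetical stand-in for whatever the host uses to invoke MCP tools, and both the `session_id` argument name for `agent_debate_status` and the `{ status }` reply shape are assumed rather than confirmed by the text. The interval and poll limit are arbitrary defaults.

```typescript
// Hypothetical transport: invokes an MCP tool by name and resolves its result.
type CallTool = (
  name: string,
  args: Record<string, unknown>
) => Promise<{ status: string }>;

// Statuses at which 5M.2 says to stop polling.
const STOP_STATUSES = ["ready-for-approval", "escalated", "error"];

async function pollDebate(
  callTool: CallTool,
  sessionId: string,
  intervalMs = 5000,
  maxPolls = 120,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<string> {
  for (let i = 0; i < maxPolls; i++) {
    const { status } = await callTool("agent_debate_status", { session_id: sessionId });
    if (STOP_STATUSES.indexOf(status) !== -1) return status;
    await sleep(intervalMs);
  }
  // Give up rather than loop forever if the session never leaves `running`.
  throw new Error(`session ${sessionId} still running after ${maxPolls} polls`);
}
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.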
@@ -283,7 +285,7 @@ A 24h inactivity timer starts the moment the session enters `ready-for-approval`
 Before deciding, read the on-disk outputs — the debate writes three folders under the workspace:
 - `.agestra/workspace/individual/` — per-participant individual reviews (`individual_{participant}_{topic}_{date}_{seq}.md`). Includes auto-injected host specialists like `claude-reviewer` / `claude-qa` when present.
-- `.agestra/workspace/debates/` — debate transcript (`debate_{topic}_{date}_{seq}.md`) plus the approval snapshot (`{sessionId}.approval.json`). The snapshot remains after `approve` / `reject` for idempotent replays and audit.
+- `.agestra/workspace/debates/` — debate transcript (`debate_{topic}_{date}_{seq}.md`), consensus ledger (`{sessionId}.consensus.json`), and structured session record (`{sessionId}.session.json`). The session record remains after `approve` / `reject` for idempotent replays and audit.
 - `.agestra/workspace/synthesis/` — the final synthesis document, written only after `agent_debate_approve` succeeds.
 Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whether the debate outcome matches the design.
@@ -292,9 +294,9 @@ Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whe
 Pick exactly one of the three follow-up tools, based on inspection:
-1. **Accept the outcome** → call `agent_debate_approve` with `debate_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the snapshot to `approved`, and returns `synthesisDocPath`. Proceed to Phase 7 and relay the path to the user.
-2. **Need more deliberation** → call `agent_debate_continue` with `debate_id` and `additional_rounds` (`3`, `5`, or `10` only). The engine resumes the round loop from the prior snapshot and eventually re-parks the session in `ready-for-approval`. Loop back to 5M.2. Use this when the debate was close but unresolved, or when `escalated` came too early.
-3. **Reject the outcome** → call `agent_debate_reject` with `debate_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
+1. **Accept the outcome** → call `agent_debate_approve` with `session_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the session record to `approved`, and returns `synthesisDocPath`. Proceed to Phase 7 and relay the path to the user.
+2. **Need more deliberation** → call `agent_debate_continue` with `session_id` and `additional_rounds` (`3`, `5`, or `10` only). The handler returns `status: running`; poll `agent_debate_status` again until it reaches the approval gate. Use this when the debate was close but unresolved, or when `escalated` came too early.
+3. **Reject the outcome** → call `agent_debate_reject` with `session_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
 All three tools are idempotent on terminal states — re-calling returns the cached outcome.
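
The three follow-up choices can be sketched as one helper that builds the tool call. The argument names (`session_id`, `leader_note`, `additional_rounds`, `reason`, `spawn_issue`) are taken from the list above; the `GateAction` and `GateCall` types and the `buildGateCall` function are hypothetical names introduced only for this example.

```typescript
type GateAction =
  | { kind: "approve"; leaderNote?: string }
  | { kind: "continue"; additionalRounds: number }
  | { kind: "reject"; reason: string; spawnIssue?: boolean };

interface GateCall {
  tool: string;
  args: Record<string, unknown>;
}

function buildGateCall(sessionId: string, status: string, action: GateAction): GateCall {
  // The gate tools apply only once the session has parked; `running` and
  // `error` sessions are refused here instead of being sent to the server.
  if (status !== "ready-for-approval" && status !== "escalated") {
    throw new Error(`session is ${status}; approval-gate tools do not apply`);
  }
  switch (action.kind) {
    case "approve":
      return {
        tool: "agent_debate_approve",
        args: { session_id: sessionId, leader_note: action.leaderNote },
      };
    case "continue":
      // Only 3, 5, or 10 additional rounds are accepted.
      if ([3, 5, 10].indexOf(action.additionalRounds) === -1) {
        throw new RangeError("additional_rounds must be 3, 5, or 10");
      }
      return {
        tool: "agent_debate_continue",
        args: { session_id: sessionId, additional_rounds: action.additionalRounds },
      };
    case "reject":
      return {
        tool: "agent_debate_reject",
        args: { session_id: sessionId, reason: action.reason, spawn_issue: action.spawnIssue ?? false },
      };
  }
}
```

Validating `additional_rounds` on the caller side is a design choice for the sketch; the server may well perform the same check.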
@@ -413,8 +415,8 @@ The design document is the authority. If an AI's output conflicts with the desig
 - `provider_list` / `provider_health` — check external AI availability
 - `trace_summary` / `trace_record` / `trace_compare` — provider quality tracking
 - `ai_chat` / `ai_analyze_files` / `ai_compare` — query external AI
-- `agent_debate_structured` — start a structured multi-AI debate (individual reviews → clarification → rounds → aggregation → `ready-for-approval`). Supports `auto_inject_specialists` (default `true`) to auto-add host reviewer/QA specialists (compatibility IDs: `claude-reviewer` / `claude-qa`) based on topic, and `exclude_participants` as the escape hatch (also the way to keep Ollama or any other provider out — there is no automatic Ollama filter).
-- `agent_debate_approve` / `agent_debate_continue` / `agent_debate_reject` — leader-only finalization tools for a `ready-for-approval` session. `approve` writes the synthesis under `.agestra/workspace/synthesis/`; `continue(additional_rounds=N)` accepts only `3`, `5`, or `10`; `reject(reason=..., spawn_issue?=true)` closes the session with no synthesis.
+- `agent_debate_structured` — start a structured multi-AI debate in the background (individual/source material → clarification → JSON consensus rounds → aggregation → approval gate). It returns `status: running`; poll `agent_debate_status`. Supports `mode: "review" | "idea"`, optional `source_documents`, `auto_inject_specialists` (default `true`) to auto-add host reviewer/QA specialists (compatibility IDs: `claude-reviewer` / `claude-qa`) based on topic, and `exclude_participants` as the escape hatch (also the way to keep Ollama or any other provider out — there is no automatic Ollama filter).
+- `agent_debate_approve` / `agent_debate_continue` / `agent_debate_reject` — leader-only finalization tools for a structured session at the approval gate. `approve` writes the synthesis under `.agestra/workspace/synthesis/`; `continue(additional_rounds=N)` accepts only `3`, `5`, or `10` and returns `running`; `reject(reason=..., spawn_issue?=true)` closes the session with no synthesis.
 - Low-level debate primitives — legacy / diagnostic use only; prefer the structured debate tools for review, idea, and design workflows.
 - `agent_cross_validate` — cross-validate outputs between providers
 - `cli_worker_spawn` / `cli_worker_status` / `cli_worker_collect` / `cli_worker_stop` — manage Codex/Gemini CLI workers

package/commands/design.md CHANGED

@@ -57,6 +57,7 @@ Follow the structured consensus document model from `docs/superpowers/specs/2026
 The JSON consensus ledger is the source of truth. Generated Markdown must not be parsed or hand-edited to change provider stances, item status, or consensus state.
 1. Start an approval-gated structured debate with `agent_debate_structured`.
+   - **mode:** use `"idea"` for exploratory architecture/design option discovery. Use `"review"` only when reviewing an already-written design artifact.
    - **topic:** the design subject.
    - **participants:** only providers reported available by `environment_check` / `provider_list`, plus the host design specialist when the engine supports it.
    - **scope:** the design subject plus any user-provided constraints, relevant existing design docs, and code areas that should anchor the design.
@@ -69,22 +70,23 @@ The JSON consensus ledger is the source of truth. Generated Markdown must not be
      4. Recommended approach — one choice with justification.
      5. Implementation plan — step-by-step build sequence with dependencies.
      6. Risks and mitigations.
+   - The tool returns immediately with `status: running`; capture the `session_id`.
-2. Let the MCP moderator engine own the consensus flow.
+2. Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
    - The engine writes individual first-pass documents under `individual/`.
-   - The engine owns provider turn order, JSON turn packets, response validation, ledger updates, aggregated debate Markdown rendering, synthesis rendering, and the final terminal table.
-   - Participants submit explicit JSON stances through the MCP consensus turn packet.
+   - The engine owns provider turn order, JSON turn packets, response validation, ledger updates, aggregated debate Markdown rendering, status rendering, synthesis rendering, and final terminal table generation for direct engine callers.
+   - Participants submit explicit JSON stances through the MCP consensus turn packet. Structured consensus turns must return JSON only in the canonical `{ provider, round, items }` shape.
    - The leader/moderator must not infer agreement from prose and must not edit provider stances manually.
    - There must be one generated aggregated debate Markdown document per run, not one Markdown document per provider turn or round.
 3. Use the approval gate.
-   - If the terminal report says the session is `ready-for-approval`, inspect the report and call exactly one of:
+   - If status says the session is `ready-for-approval` or `escalated`, inspect the status/artifacts and call exactly one of:
      - `agent_debate_approve` to write the synthesis document.
      - `agent_debate_continue` to run 3, 5, or 10 more rounds.
      - `agent_debate_reject` to close without synthesis.
    - If the result is `error`, do not approve; report the orchestration failure.
 4. Present the final result.
-   - Name the debate Markdown path, consensus JSON ledger path, approval snapshot path if surfaced, and synthesis document path if approved.
+   - Name the debate Markdown path, consensus JSON ledger path, structured session record path if surfaced, and synthesis document path if approved.
    - Summarize accepted design decisions, excluded options, and unresolved/disputed items.
    - Preserve each provider's rationale for disputed positions.

package/commands/idea.md CHANGED

@@ -46,31 +46,33 @@ In parallel:
 - For each available external provider, call `ai_chat` with `save_as_document` using:
   - **save_as_document.kind:** `"individual"`
   - **save_as_document.title:** `Idea Exploration — {provider}`
-  - **save_as_document.metadata:** `{ "Task": "{topic}", "Mode": "Independent", "Round": "0" }`
-  - **prompt:** The Mode A or Mode B prompt from the `agestra-idea` skill, including the user-interview context and the structured output requirements (Title, Category, Evidence, Description, Effort, Priority).
+  - **save_as_document.metadata:** `{ "Provider": "{provider}", "Task": "{topic}", "Mode": "Independent", "Round": "0" }`
+  - **prompt:** The Mode A or Mode B prompt from the `agestra-idea` skill, including the user-interview context and the structured output requirements (Title, Category, Evidence, Description, Effort, Priority). The response must include a `<proposals>` block. Each idea is an `<item>` with `id`, `title`, `severity`, and optional `location`; put category, evidence, effort, and priority in the item body.
 Collect every returned Document ID.
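
To make the `<proposals>` format concrete, here is a minimal regex-based reader. It is a sketch only and not the engine's parser: it assumes double-quoted attributes, no `>` inside attribute values, and no nested `<item>` tags. The `ProposalItem` interface and `parseProposals` name are hypothetical.

```typescript
interface ProposalItem {
  id: string;
  title: string;
  severity: string;
  location?: string;
  body: string; // category, evidence, effort, and priority live here
}

function parseProposals(text: string): ProposalItem[] {
  // Take the first <proposals> block; everything outside it is ignored.
  const block = /<proposals>([\s\S]*?)<\/proposals>/.exec(text);
  if (!block) return [];

  const items: ProposalItem[] = [];
  const itemRe = /<item\b([^>]*)>([\s\S]*?)<\/item>/g;
  let m: RegExpExecArray | null;
  while ((m = itemRe.exec(block[1])) !== null) {
    // Collect name="value" pairs from the opening tag.
    const attrs: Record<string, string> = {};
    const attrRe = /([\w-]+)="([^"]*)"/g;
    let a: RegExpExecArray | null;
    while ((a = attrRe.exec(m[1])) !== null) attrs[a[1]] = a[2];

    // id, title, and severity are required; skip items missing any of them.
    if (!attrs["id"] || !attrs["title"] || !attrs["severity"]) continue;
    items.push({
      id: attrs["id"],
      title: attrs["title"],
      severity: attrs["severity"],
      location: attrs["location"],
      body: m[2].trim(),
    });
  }
  return items;
}
```

A stricter consumer might report skipped items instead of silently dropping them; the sketch keeps the happy path short.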
 ### 3.2 Structured consensus flow
 Call `agent_debate_structured` with:
+- **mode:** `"idea"`.
 - **topic:** the user's idea-discovery topic.
 - **participants:** the providers that `environment_check` reports as Available, plus the host ideator.
-- **source document IDs:** every individual exploration document from 3.1.
+- **source_documents:** optional; use only when 3.1 created individual documents, and pass `{ "document_id": "...", "provider": "..." }` for each document.
 - **leader:** the current host/leader identity.
+- Capture the returned `session_id`; the tool returns quickly with `status: running`.
 The MCP moderator engine handles the rest:
 - It converts each independent idea into stable `ITEM-*` records in `{sessionId}.consensus.json`.
 - It sends each provider a JSON turn packet, one provider at a time, with every assigned item listed exactly once.
-- Providers answer with `agree`, `disagree`, `opinion`, or `revise`; `disagree`, `opinion`, and `revise` require a comment, and `revise` creates a child item.
+- Providers answer with JSON only in the canonical `{ provider, round, items }` shape. Each item uses `agree`, `disagree`, `opinion`, or `revise`; `disagree`, `opinion`, and `revise` require a comment, and `revise` creates a child item.
 - Malformed JSON is retried once. Repeated failures are recorded as `no_response`; unavailable providers are removed from the active participant set with a moderator note.
 - The debate markdown is regenerated from the JSON ledger after each turn. Treat it as a readable report, not as the source of truth.
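
The response rules above can be sketched as a validator. The outer `{ provider, round, items }` shape and the four stances come from the text; the per-item field names (`item_id`, `stance`, `comment`) and the check that a response covers each assigned item exactly once are assumptions made for illustration, as are the `TurnItem` and `validateTurn` names.

```typescript
type Stance = "agree" | "disagree" | "opinion" | "revise";
const STANCES: readonly string[] = ["agree", "disagree", "opinion", "revise"];

// Assumed per-item field names; the source only fixes the outer shape.
interface TurnItem {
  item_id: string;
  stance: Stance;
  comment?: string;
}

// Returns human-readable problems; an empty array means every check passed.
function validateTurn(raw: string, assignedIds: string[]): string[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return ["response is not valid JSON"];
  }
  if (typeof data !== "object" || data === null) return ["response is not a JSON object"];

  const turn = data as { provider?: unknown; round?: unknown; items?: unknown };
  const errors: string[] = [];
  if (typeof turn.provider !== "string") errors.push("provider must be a string");
  if (typeof turn.round !== "number") errors.push("round must be a number");
  if (!Array.isArray(turn.items)) return errors.concat("items must be an array");

  const seen = new Map<string, number>();
  for (const entry of turn.items as unknown[]) {
    if (typeof entry !== "object" || entry === null) {
      errors.push("item must be an object");
      continue;
    }
    const item = entry as Partial<TurnItem>;
    const id = typeof item.item_id === "string" ? item.item_id : "(missing item_id)";
    seen.set(id, (seen.get(id) ?? 0) + 1);
    if (typeof item.stance !== "string" || STANCES.indexOf(item.stance) === -1) {
      errors.push(`${id}: unknown stance`);
    } else if (item.stance !== "agree" && !(typeof item.comment === "string" && item.comment.trim())) {
      // disagree, opinion, and revise all require a non-empty comment.
      errors.push(`${id}: ${item.stance} requires a comment`);
    }
  }
  // Assumption: the response should cover each assigned item exactly once.
  for (const id of assignedIds) {
    if (seen.get(id) !== 1) errors.push(`${id}: must appear exactly once`);
  }
  return errors;
}
```

A caller could retry once when the list is non-empty and record `no_response` on a second failure, which matches the retry behavior described in the list above.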
 ### 3.3 Approval gate
-When `agent_debate_structured` returns:
-- If the terminal report is `ready-for-approval`, call `agent_debate_approve` to write the final synthesis document.
-- If the result needs more discussion, call `agent_debate_continue` with the requested additional rounds.
+Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
+- If status is `ready-for-approval`, call `agent_debate_approve` to write the final synthesis document.
+- If status is `escalated` or the result needs more discussion, call `agent_debate_continue` with the requested additional rounds.
 - If the result should be closed without synthesis, call `agent_debate_reject` and preserve the ledger/debate documents for inspection.
 ### 3.4 Present to the user

package/commands/implement.md CHANGED

@@ -76,11 +76,11 @@ Spawn `agestra:agestra-team-lead` in Multi-AI mode with:
 - instruction to use isolated CLI workers for suitable Codex/Gemini tasks
 - instruction to review changes with `agent_changes_review`
 - instruction to merge only after review
-- instruction to run `qa_run` and `agestra:agestra-qa` before final completion
+- instruction to run `qa_run` as workspace build/test verification, then `agestra:agestra-qa` if design-compliance or orchestration review is needed before final completion
 ## Step 6: Final verification
 Before reporting completion:
-1. Run `qa_run`
+1. Run `qa_run` against the target workspace build/test profile. Treat it as project verification, not an orchestration-health check.
 2. If deeper design-compliance verification is needed, spawn `agestra:agestra-qa`
 3. Summarize implementation result, review result, and QA outcome

package/commands/review.md CHANGED

@@ -78,16 +78,19 @@ Call `environment_check` to determine which providers and modes are available.
    a. In parallel:
       - Spawn the `agestra:agestra-reviewer` agent for host-local independent analysis.
         After the agent completes, save the host reviewer's result as a document via `workspace_create_document`:
+        - **kind:** `"individual"`
         - **title:** `Code Review — host/reviewer`
-        - **metadata:** `{ "Provider": "host/reviewer", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent" }`
+        - **metadata:** `{ "Provider": "host/reviewer", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent", "Round": "0" }`
         - **content:** The reviewer agent's full output.
       - For each available provider, call `ai_chat` with `save_as_document` to let each AI produce its own document directly:
+        - **save_as_document.kind:** `"individual"`
         - **save_as_document.title:** `Code Review — {provider}`
-        - **save_as_document.metadata:** `{ "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent" }`
+        - **save_as_document.metadata:** `{ "Provider": "{provider}", "Task": "{review target}", "Focus": "{selected focus areas}", "Mode": "Independent", "Round": "0" }`
         - **prompt:**
           > Review the following code. Focus on: [selected focus areas].
           > For each finding, provide severity (CRITICAL/HIGH/MEDIUM/LOW), file:line location, and evidence.
+          > Include a `<proposals>` block. Each finding is an `<item>` with `id`, `title`, `severity`, and `location`; put the evidence and impact in the item body.
           >
           > Target: [the review target]
@@ -101,19 +104,22 @@ Call `environment_check` to determine which providers and modes are available.
       - The moderator's integrated document becomes the starting document.
 2. Start a structured debate session with `agent_debate_structured`.
-   - Use the integrated document's title/topic as the debate topic.
-   - Reuse the same reviewer set from step 1, excluding `ollama`.
-   - Pass the independent document IDs as source context when available.
-3. Let the MCP moderator engine run the consensus loop.
+   - **mode:** `"review"`.
+   - **topic:** the review target and selected focus summary.
+   - **participants:** the same reviewer set from step 1, excluding `ollama`.
+   - **source_documents:** optional; use only when step 1 created individual documents, and pass `{ "document_id": "...", "provider": "..." }` for each document.
+   - **leader:** the current host/leader identity when known.
+   - The tool returns quickly with `status: running`; capture the `session_id`.
+3. Poll `agent_debate_status` until the session reaches `ready-for-approval`, `escalated`, or `error`.
    - The engine assigns stable `ITEM-*` IDs, builds `{sessionId}.consensus.json`, and sends each provider a JSON turn packet.
-   - Every active provider must answer every assigned item with exactly one structured stance: `agree`, `disagree`, `opinion`, or `revise`.
+   - Every active provider must return JSON only in the canonical `{ provider, round, items }` shape, with exactly one structured stance for every assigned item: `agree`, `disagree`, `opinion`, or `revise`.
    - The engine retries malformed JSON once, records non-responses explicitly, and removes unavailable providers from the active participant set with a moderator note.
    - The engine regenerates the aggregate debate markdown from the JSON ledger. Do not hand-edit the generated debate markdown to change consensus state.
 4. Use the approval gate.
-   - If the terminal report says the session is `ready-for-approval`, call `agent_debate_approve` to write the synthesis.
-   - If more discussion is needed, call `agent_debate_continue`.
+   - If status says the session is `ready-for-approval`, call `agent_debate_approve` to write the synthesis.
+   - If status says `escalated` or more discussion is needed, call `agent_debate_continue`.
    - If the result should be closed without synthesis, call `agent_debate_reject`.
 5. Present the final result: