agestra 4.8.3 → 4.8.4
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/AGENTS.md +1 -1
- package/GEMINI.md +1 -1
- package/README.ja.md +4 -1
- package/README.ko.md +4 -1
- package/README.md +4 -1
- package/README.zh.md +4 -1
- package/agents/agestra-moderator.md +379 -289
- package/agents/agestra-team-lead.md +71 -7
- package/commands/review.md +1 -1
- package/dist/bundle.js +219 -182
- package/package.json +1 -1
- package/skills/review.md +1 -1
@@ -191,14 +191,17 @@ After each task completes:
  - Naming conventions are consistent
  - No conflicting changes to shared files
  - Import/export chains are complete
- 6. If issues found → craft a detailed correction prompt and re-assign to the same AI or address it in Claude execution.
  6. If issues found → craft a detailed correction prompt and re-assign to the same AI or fix directly in Claude execution.
  7. If all checks pass:
  - For CLI worker tasks: call `agent_changes_accept` to merge worktree changes
  - For rejected CLI worker tasks: call `agent_changes_reject` with reason
- - Proceed to
+ - Proceed to verification:
+ - **Multi-AI mode** → Phase 5M (Structured Debate) replaces the separate QA and Quality Gate phases.
+ - **Claude-only mode** → Phase 5 (QA Cycle) followed by Phase 6 (Quality Gate).

- ### Phase 5: QA Cycle
+ ### Phase 5: QA Cycle (Claude-only mode)
+
+ > Used when Work Mode in Phase 2 was **Claude only**. In Multi-AI mode, skip to Phase 5M.

  Run formal verification with automatic fix loop:
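The routing this hunk introduces (Work Mode decides which verification phases run) can be sketched roughly as follows. This is only an illustration of the rule as worded in the diff: the `WorkMode` type and the `verificationPhases` function are hypothetical names, not part of the package.

```typescript
// Hypothetical model of the Work Mode chosen in Phase 2.
type WorkMode = "multi-ai" | "claude-only";

// Returns the verification phases that follow the post-task checks.
// Phase labels are copied from the diff text; everything else is assumed.
function verificationPhases(mode: WorkMode): string[] {
  return mode === "multi-ai"
    ? ["Phase 5M: Structured Debate"]
    : ["Phase 5: QA Cycle", "Phase 6: Quality Gate"];
}
```

In this reading, Multi-AI mode never enters Phase 5 or Phase 6, which matches the note that the structured debate replaces both.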
@@ -228,7 +231,65 @@ Run formal verification with automatic fix loop:
  - `INTEGRATION_BREAK` → cross-component conflict, re-assign with both sides' context
  - `TEST_FAILURE` → implementation bug, re-assign with test output and expected behavior

- ### Phase
+ ### Phase 5M: Structured Debate (Multi-AI mode)
+
+ > Used when Work Mode in Phase 2 was **Multi-AI**. Replaces Phase 5 (QA) and Phase 6 (Quality Gate) in a single coordinated cross-AI review. In Claude-only mode, skip this phase.
+
+ Run the structured-debate MCP flow. This is a **two-step** lifecycle: the moderator runs the debate to a terminal aggregation state, then parks the session in `ready-for-approval` waiting for the leader (this agent) to finalize. The moderator does NOT write the synthesis file on its own — approval must be explicit.
+
+ #### 5M.1 Start the debate
+
+ Call `agent_debate_structured` with:
+
+ - `topic` — short slug (used in file names under `.agestra/workspace/`).
+ - `scope` — concrete framing: file list, task description, or the design doc path.
+ - `participants` — the provider/agent IDs the user specified at Work Mode selection, or the qualified set from `trace_summary`.
+ - `auto_inject_specialists` — default `true`. When true, the moderator auto-adds `claude-reviewer` and/or `claude-qa` on top of `participants` based on topic heuristics (e.g. review-ish topics pull the reviewer, QA/verification-ish topics pull qa). When the user wants verbatim participants only, pass `false`.
+ - `exclude_participants` — participant IDs to never include, applied regardless of `auto_inject_specialists`. Use this when the user explicitly wants a provider (including Ollama — there is no automatic Ollama filter anymore) kept out.
+ - `leader` — omit unless you need to override the session-context leader.
+ - `max_rounds` — default `10`. Raise for contested topics, lower for quick smoke-debates.
+ - `individual_review_prompt` / `files` — optional framing for the individual-review fan-out.
+ - `locale` — pass the locale resolved from `agestra.config.json` (fall back to providers.config locale). The moderator uses it for human-facing text; provider prompts remain English regardless.
+
+ The tool returns a `StructuredDebateRunResult` with the debate snapshot and a `debate_id`. Capture both.
+
+ #### 5M.2 Await terminal state
+
+ The result `status` will be one of:
+
+ - `ready-for-approval` (subtype `consensus`) — every proposal was accepted or rejected and aggregation converged.
+ - `ready-for-approval` (subtype `escalated`) — `max_rounds` was reached without consensus and the user elected to escalate during moderator prompts.
+ - `error` — aggregation failed. Treat as an orchestration failure; do NOT call approve/continue/reject.
+
+ In either `ready-for-approval` subtype the synthesis has NOT been written yet. The terminal report names the three follow-up tools; do not skip them.
+
+ A 24h inactivity timer starts the moment the session enters `ready-for-approval`. If the leader does nothing, the session transitions to `leader-timeout` and only `agent_debate_reject` is accepted afterwards for cleanup.
+
+ #### 5M.3 Inspect artifacts
+
+ Before deciding, read the on-disk outputs — the debate writes three folders under the workspace:
+
+ - `.agestra/workspace/individual/` — per-participant individual reviews (`individual_{participant}_{topic}_{date}_{seq}.md`). Includes auto-injected specialists like `claude-reviewer` / `claude-qa` when present.
+ - `.agestra/workspace/debates/` — debate transcript (`debate_{topic}_{date}_{seq}.md`) plus the approval snapshot (`{sessionId}.approval.json`) while the session is parked.
+ - `.agestra/workspace/synthesis/` — the final synthesis document, written only after `agent_debate_approve` succeeds.
+
+ Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whether the debate outcome matches the design.
+
+ #### 5M.4 Finalize (leader decision)
+
+ Pick exactly one of the three follow-up tools, based on inspection:
+
+ 1. **Accept the outcome** → call `agent_debate_approve` with `debate_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, deletes the snapshot, and returns `synthesisDocPath`. Proceed to Phase 7 and relay the path to the user.
+ 2. **Need more deliberation** → call `agent_debate_continue` with `debate_id` and `additional_rounds` (typical values: `3`, `5`, or `10`; max `20`). The engine resumes the round loop from the prior snapshot and eventually re-parks the session in `ready-for-approval`. Loop back to 5M.2. Use this when the debate was close but unresolved, or when `escalated` came too early.
+ 3. **Reject the outcome** → call `agent_debate_reject` with `debate_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
+
+ All three tools are idempotent on terminal states — re-calling returns the cached outcome.
+
+ When the session is `escalated`, explain the situation to the user in supervised mode before choosing `continue` vs `reject`. In autonomous mode, prefer `continue` with `additional_rounds: 5` once; if it escalates again, `reject` with a clear reason and fall back to targeted fix tasks in Phase 3.
+
+ ### Phase 6: Quality Gate (Claude-only mode)
+
+ > Used when Work Mode in Phase 2 was **Claude only**. In Multi-AI mode, the structured debate in Phase 5M subsumes this gate.

  Run the `agestra-reviewer` agent with TRUST 5 framework:
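As a reading aid for the Phase 5M finalization rules added in this hunk, the autonomous-mode decision could be modeled roughly like this. The status, subtype, and tool names come from the diff text. The `DebateResult` and `LeaderAction` types, the `chooseAutonomousAction` function, and the choice to reject a consensus outcome that conflicts with the design are assumptions made for illustration; the package's real result shape may differ.

```typescript
// Hypothetical shapes; only the string literals are taken from the diff.
type DebateResult =
  | { status: "ready-for-approval"; subtype: "consensus" | "escalated" }
  | { status: "error" }
  | { status: "leader-timeout" };

type LeaderAction =
  | { tool: "agent_debate_approve" }
  | { tool: "agent_debate_continue"; additional_rounds: number }
  | { tool: "agent_debate_reject"; reason: string }
  | { tool: "none"; reason: string };

// Picks exactly one follow-up action for autonomous mode.
function chooseAutonomousAction(
  result: DebateResult,
  outcomeMatchesDesign: boolean, // leader's judgment after inspecting artifacts
  alreadyContinued: boolean,     // true if one continuation was already spent
): LeaderAction {
  switch (result.status) {
    case "error":
      // Orchestration failure: none of the three tools may be called.
      return { tool: "none", reason: "aggregation failed" };
    case "leader-timeout":
      // After the 24h timer, only reject is accepted, for cleanup.
      return { tool: "agent_debate_reject", reason: "leader timeout cleanup" };
    case "ready-for-approval":
      if (result.subtype === "escalated") {
        // Continue once with 5 rounds, then reject on a second escalation.
        return alreadyContinued
          ? { tool: "agent_debate_reject", reason: "escalated again after one continuation" }
          : { tool: "agent_debate_continue", additional_rounds: 5 };
      }
      // Assumption: a consensus that conflicts with the design is rejected.
      return outcomeMatchesDesign
        ? { tool: "agent_debate_approve" }
        : { tool: "agent_debate_reject", reason: "outcome conflicts with the design document" };
  }
}
```

Supervised mode is deliberately left out of the sketch, since the text requires explaining an `escalated` session to the user before choosing.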
@@ -250,8 +311,9 @@ Provide a clear summary to the user:
  - How tasks were distributed (which AI/worker did what)
  - Task completion summary: total tasks, completed, failed, re-routed
  - What changed (files modified, features added)
- -
- -
+ - Verification summary:
+ - Claude-only: QA cycle count + what was auto-fixed, TRUST 5 Quality Gate result
+ - Multi-AI: structured debate outcome (`approved` / `rejected`, with round count), `auto_inject_specialists` state, final synthesis path (if approved) from `.agestra/workspace/synthesis/`, and links to the individual reviews under `.agestra/workspace/individual/` and the transcript under `.agestra/workspace/debates/`
  - Any issues found and how they were resolved

  </Workflow>
@@ -338,7 +400,9 @@ The design document is the authority. If an AI's output conflicts with the desig
  - `provider_list` / `provider_health` — check external AI availability
  - `trace_summary` / `trace_record` / `trace_compare` — provider quality tracking
  - `ai_chat` / `ai_analyze_files` / `ai_compare` — query external AI
- - `
+ - `agent_debate_structured` — start a structured multi-AI debate (individual reviews → clarification → rounds → aggregation → `ready-for-approval`). Supports `auto_inject_specialists` (default `true`) to auto-add `claude-reviewer` / `claude-qa` based on topic, and `exclude_participants` as the escape hatch (also the way to keep Ollama or any other provider out — there is no automatic Ollama filter).
+ - `agent_debate_approve` / `agent_debate_continue` / `agent_debate_reject` — leader-only finalization tools for a `ready-for-approval` session. `approve` writes the synthesis under `.agestra/workspace/synthesis/`; `continue(additional_rounds=N)` extends the debate (typical N ∈ {3, 5, 10}, max 20); `reject(reason=..., spawn_issue?=true)` closes the session with no synthesis.
+ - `agent_debate_create/turn/status/summary/list/close/reset` — low-level debate primitives (legacy / diagnostic use).
  - `agent_cross_validate` — cross-validate outputs between providers
  - `cli_worker_spawn` / `cli_worker_status` / `cli_worker_collect` / `cli_worker_stop` — manage Codex/Gemini CLI workers
  - `agent_changes_review` / `agent_changes_accept` / `agent_changes_reject` — review/merge worktree changes
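The tool list above caps `continue(additional_rounds=N)` at 20. A caller-side guard for that limit might look like the sketch below. The helper name `buildContinueArgs` is hypothetical, the lower bound of 1 is an assumption (the diff states only the maximum), and the actual tool may validate differently.

```typescript
// "max 20" is taken from the tool description in the diff.
const MAX_ADDITIONAL_ROUNDS = 20;

// Hypothetical helper: builds the argument object for agent_debate_continue,
// throwing if the round count is not an integer in the assumed range 1..20.
function buildContinueArgs(
  debateId: string,
  additionalRounds: number,
): { debate_id: string; additional_rounds: number } {
  if (
    !Number.isInteger(additionalRounds) ||
    additionalRounds < 1 ||
    additionalRounds > MAX_ADDITIONAL_ROUNDS
  ) {
    throw new RangeError(
      `additional_rounds must be an integer in 1..${MAX_ADDITIONAL_ROUNDS}`,
    );
  }
  return { debate_id: debateId, additional_rounds: additionalRounds };
}
```

Failing fast on the caller side avoids spending a tool call on a request the engine would presumably refuse.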
package/commands/review.md
CHANGED
@@ -60,7 +60,7 @@ Call `environment_check` to determine which providers and modes are available.
  - Treat the Claude reviewer agent as asynchronous work that may legitimately take several minutes. Poll about once per minute; an empty or slowly growing output file is not a failure by itself.
  - Do NOT stop or replace the Claude reviewer with a duplicate main-thread review unless there is an explicit error, the user asks to cancel, or there has been no visible progress for at least 8 minutes. For large review scopes, allow up to 15 minutes before declaring the reviewer stalled.
  - If a background reviewer is still running, tell the user you are waiting and continue the orchestration. Do not short-circuit the workflow just because another provider finished earlier.
- -
+ - Use the turn-based flow (`agent_debate_create` + iterative `agent_debate_review` / `agent_debate_turn` + `agent_debate_conclude`) or the approval-gated flow (`agent_debate_structured` + `agent_debate_approve`/`_continue`/`_reject`) so long-running review rounds do not get cut off by host tool-call time limits.

  **Team composition:** `agestra:agestra-moderator` (leader) + `agestra:agestra-reviewer` (Claude) + external AIs for review (`gemini`, `codex`, registered Claude-backed reviewers, etc.; `ollama` excluded)
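The stall rule in the context lines of this hunk (no visible progress for at least 8 minutes, or up to 15 minutes for large review scopes) reduces to a small threshold check. The sketch below assumes progress is tracked as minutes since the last visible output; the constant and function names are hypothetical.

```typescript
// Thresholds in minutes, as stated in the review.md text.
const STALL_MINUTES_DEFAULT = 8;
const STALL_MINUTES_LARGE_SCOPE = 15;

// Hypothetical helper: true once the reviewer has shown no visible
// progress for at least the applicable threshold.
function isReviewerStalled(
  minutesWithoutProgress: number,
  largeScope: boolean,
): boolean {
  const threshold = largeScope
    ? STALL_MINUTES_LARGE_SCOPE
    : STALL_MINUTES_DEFAULT;
  return minutesWithoutProgress >= threshold;
}
```

An explicit error or a user cancellation would still end the wait immediately; those paths are outside this check.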