agestra 4.13.0 → 4.13.2
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/README.ja.md +34 -12
- package/README.ko.md +44 -61
- package/README.md +43 -62
- package/README.zh.md +34 -12
- package/agents/agestra-designer.md +1 -1
- package/agents/agestra-ideator.md +1 -1
- package/agents/agestra-moderator.md +1 -2
- package/agents/agestra-qa.md +22 -10
- package/agents/agestra-reviewer.md +1 -1
- package/agents/agestra-security.md +1 -1
- package/agents/agestra-team-lead.md +85 -23
- package/commands/implement.md +13 -3
- package/commands/qa.md +11 -4
- package/dist/bundle.js +1 -1
- package/package.json +1 -1
- package/scripts/host-assets/categories.mjs +156 -0
package/README.zh.md
CHANGED
|
@@ -50,7 +50,7 @@ npm install -g agestra
|
|
|
50
50
|
agestra-install gemini --assets --scope user
|
|
51
51
|
```
|
|
52
52
|
|
|
53
|
-
Gemini works with the repository-root [GEMINI.md](GEMINI.md)
|
|
53
|
+
Gemini works with the repository-root [GEMINI.md](GEMINI.md), [`.gemini/commands/agestra/`](.gemini/commands/agestra), and the generated skills. With project scope, `--assets` writes managed files; with user scope, `--assets` installs the Agestra Gemini native extension. To register only the MCP server, use `npm run install:gemini` or `agestra-install gemini`. `npm run install:gemini:assets` defaults to user scope; to install project-scope managed files from a checkout, run `node scripts/install-host-mcp.mjs gemini --assets --scope project`.
|
|
54
54
|
|
|
55
55
|
Gemini commands available after installation:
|
|
56
56
|
|
|
@@ -59,6 +59,8 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
59
59
|
- `/agestra:design`
|
|
60
60
|
- `/agestra:idea`
|
|
61
61
|
- `/agestra:implement`
|
|
62
|
+
- `/agestra:qa`
|
|
63
|
+
- `/agestra:security`
|
|
62
64
|
|
|
63
65
|
### Prerequisites
|
|
64
66
|
|
|
@@ -83,9 +85,9 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
83
85
|
|
|
84
86
|
| Host | Natural entry point |
|
|
85
87
|
|------|----------|
|
|
86
|
-
| Claude Code | `/agestra review`, `/agestra design`, `/agestra idea`, `/agestra implement` |
|
|
88
|
+
| Claude Code | `/agestra setup`, `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea`, `/agestra implement` |
|
|
87
89
|
| Codex CLI | Make requests directly in natural language, following the guidance in `AGENTS.md` |
|
|
88
|
-
| Gemini CLI | `/agestra:review`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` |
|
|
90
|
+
| Gemini CLI | `/agestra:setup`, `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` |
|
|
89
91
|
|
|
90
92
|
All three hosts drive the same MCP server and share the workflow specifications in `commands/*.md`.
|
|
91
93
|
|
|
@@ -93,24 +95,29 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
93
95
|
|
|
94
96
|
| Command | Description |
|
|
95
97
|
|------|------|
|
|
98
|
+
| `/agestra setup` | Initial AI provider selection and setup |
|
|
96
99
|
| `/agestra review [target]` | Review code quality, security, and integration completeness |
|
|
100
|
+
| `/agestra qa [target]` | Verify the implementation and produce PASS/FAIL evidence |
|
|
101
|
+
| `/agestra security [target]` | Run a dedicated security review |
|
|
97
102
|
| `/agestra idea [topic]` | Discover improvements by comparing with similar projects |
|
|
98
103
|
| `/agestra design [subject]` | Explore architecture and design trade-offs before implementation |
|
|
99
|
-
| `/agestra setup` | Initial AI provider selection and setup |
|
|
100
104
|
| `/agestra implement [task]` | Carry out implementation in Claude only or Multi-AI mode |
|
|
101
105
|
|
|
102
|
-
|
|
106
|
+
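When external providers are available, the review, QA, security, design, and idea workflows go through team-lead into multi-AI cross-validation. When no provider is detected, the current host's local specialist agent handles the request automatically.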
|
|
103
107
|
|
|
104
108
|
## Agents
|
|
105
109
|
|
|
106
110
|
| Agent | Model | Role |
|
|
107
111
|
|------|------|------|
|
|
108
112
|
| `agestra-team-lead` | Sonnet | Overall orchestrator: environment checks, quality-based provider routing, work-mode selection, CLI Worker supervision, driving the QA loop |
|
|
113
|
+
| `agestra-implementer` | Sonnet | Scoped implementation executor: code changes, test updates, local verification |
|
|
114
|
+
| `agestra-e2e-writer` | Sonnet | Persistent E2E test author: writes only approved browser-flow tests |
|
|
109
115
|
| `agestra-reviewer` | Opus | Strict quality reviewer: security, orphaned implementations, spec drift, test gaps |
|
|
110
116
|
| `agestra-designer` | Opus | Architecture explorer: Socratic questioning, trade-off analysis |
|
|
111
117
|
| `agestra-ideator` | Sonnet | Improvement finder: web research, competitor analysis |
|
|
112
118
|
| `agestra-moderator` | Sonnet | Multi-mode moderator: debate with consensus detection, independent aggregation, document review, conflict resolution |
|
|
113
119
|
| `agestra-qa` | Opus | QA verifier: checks design conformance and issues a PASS/FAIL verdict |
|
|
120
|
+
| `agestra-security` | Opus | Security reviewer: threat model, auth/data-flow risks, dependency and secret hygiene |
|
|
114
121
|
|
|
115
122
|
## Skills
|
|
116
123
|
|
|
@@ -125,6 +132,9 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
125
132
|
| `design` | Architecture exploration workflow with Multi-AI mode selection |
|
|
126
133
|
| `idea` | Improvement discovery workflow with Multi-AI mode selection |
|
|
127
134
|
| `review` | Code quality, security, and hard-coding review workflow with Multi-AI mode selection |
|
|
135
|
+
| `qa` | Design-contract verification and PASS/FAIL evidence workflow |
|
|
136
|
+
| `security` | Dedicated security review workflow |
|
|
137
|
+
| `e2e` | Persistent browser E2E test authoring workflow |
|
|
128
138
|
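| `leader` | Multi-AI/provider orchestration entry point: captures explicit provider, debate, consensus, or cross-validation signals, classifies the domain, then delegates to `agestra-team-lead` |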
|
|
129
139
|
|
|
130
140
|
---
|
|
@@ -147,7 +157,7 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
147
157
|
|
|
148
158
|
- **Provider abstraction**: every backend implements `AIProvider` (`chat`, `healthCheck`, `getCapabilities`). Adding a provider only requires adding a provider package and registering its factory.
|
|
149
159
|
- **Zero-config**: providers are detected automatically at startup; no manual configuration is needed.
|
|
150
|
-
- **Host-native**: Claude uses the plugin package, Codex uses `AGENTS.md
|
|
160
|
+
- **Host-native**: Claude uses the plugin package, Codex uses `AGENTS.md` and custom agents, and Gemini uses `GEMINI.md`, commands, skills, or the native extension. All hosts share the same MCP server and workflow core.
|
|
151
161
|
- **Modular dispatch**: each tool category is an independent module exposing `getTools()` and `handleTool()`. The server collects and dispatches them dynamically.
|
|
152
162
|
- **Atomic writes**: all file operations write to a temporary file and then rename it, which avoids corruption.
|
|
153
163
|
- **Dead-end tracking**: failed approaches are recorded and injected into subsequent prompts.
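To make the modular dispatch bullet in the list above concrete, here is a minimal TypeScript sketch. It is not taken from the Agestra source: only `getTools()` and `handleTool()` are named in the README, while the `ToolModule` and `ToolServer` shapes, the string return type, and the handler body are assumptions made for illustration.

```typescript
// Hypothetical shapes: only getTools() and handleTool() are named in the README.
interface ToolModule {
  getTools(): string[];
  handleTool(name: string, args: Record<string, unknown>): string;
}

class ToolServer {
  private readonly routes = new Map<string, ToolModule>();

  // Collect every tool a module declares and remember which module owns it.
  register(mod: ToolModule): void {
    for (const tool of mod.getTools()) {
      if (this.routes.has(tool)) {
        throw new Error(`duplicate tool: ${tool}`);
      }
      this.routes.set(tool, mod);
    }
  }

  listTools(): string[] {
    return Array.from(this.routes.keys());
  }

  // Forward a call to the module that declared the tool.
  dispatch(name: string, args: Record<string, unknown>): string {
    const mod = this.routes.get(name);
    if (mod === undefined) {
      throw new Error(`unknown tool: ${name}`);
    }
    return mod.handleTool(name, args);
  }
}

// Example module: the tool names appear in the README, the handler body is invented.
const hostAssets: ToolModule = {
  getTools: () => ["host_assets_status", "host_assets_install"],
  handleTool: (name) => `handled ${name}`,
};

const server = new ToolServer();
server.register(hostAssets);
console.log(server.dispatch("host_assets_status", {}));
```

Keeping the routing table inside the server means a new tool category only has to implement the two methods and be registered once.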
|
|
@@ -155,11 +165,11 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
155
165
|
|
|
156
166
|
### Work modes
|
|
157
167
|
|
|
158
|
-
**Text work** (review, design, idea): providers available → ultimate debate mode; no providers → Claude only
|
|
168
|
+
**Text work** (review, QA, security, design, idea): providers available → ultimate debate mode; no providers → Claude only
|
|
159
169
|
|
|
160
170
|
**Implementation work** (team-lead orchestration):
|
|
161
|
-
- **
|
|
162
|
-
-
|
|
171
|
+
- **Claude only**: Claude completes the implementation directly, using project/global agents.
|
|
172
|
+
- **With other AIs**: CLI Workers (Codex/Gemini) code autonomously in isolated git worktrees, Ollama handles simple tasks, and Claude supervises and merges.
|
|
163
173
|
|
|
164
174
|
---
|
|
165
175
|
|
|
@@ -237,7 +247,7 @@ Gemini 会结合仓库根目录下的 [GEMINI.md](GEMINI.md) 与 [`.gemini/comma
|
|
|
237
247
|
|
|
238
248
|
| Tool | Description |
|
|
239
249
|
|------|------|
|
|
240
|
-
| `host_assets_status` | Check generated host-native assets such as Codex custom agents |
|
|
250
|
+
| `host_assets_status` | Check generated host-native assets such as Codex custom agents and Gemini assets |
|
|
241
251
|
| `host_assets_install` | Explicitly install or refresh managed host-native assets |
|
|
242
252
|
| `host_assets_uninstall` | Remove managed host-native assets tracked by Agestra |
|
|
243
253
|
|
|
@@ -326,21 +336,30 @@ agestra/
|
|
|
326
336
|
├── .gemini/
|
|
327
337
|
│ └── commands/
|
|
328
338
|
│ └── agestra/
|
|
339
|
+
│ ├── setup.toml # /agestra:setup for Gemini CLI
|
|
329
340
|
│ ├── review.toml # /agestra:review for Gemini CLI
|
|
330
341
|
│ ├── design.toml # /agestra:design for Gemini CLI
|
|
331
342
|
│ ├── idea.toml # /agestra:idea for Gemini CLI
|
|
332
|
-
│
|
|
343
|
+
│ ├── implement.toml # /agestra:implement for Gemini CLI
|
|
344
|
+
│ ├── qa.toml # /agestra:qa for Gemini CLI
|
|
345
|
+
│ └── security.toml # /agestra:security for Gemini CLI
|
|
333
346
|
├── commands/
|
|
347
|
+
│ ├── setup.md # /agestra setup: provider setup
|
|
334
348
|
│ ├── review.md # /agestra review: quality verification
|
|
349
|
+
│ ├── qa.md # /agestra qa: PASS/FAIL verification
|
|
350
|
+
│ ├── security.md # /agestra security: security review
|
|
335
351
|
│ ├── idea.md # /agestra idea: improvement discovery
|
|
336
352
|
│ ├── design.md # /agestra design: architecture exploration
|
|
337
353
|
│ └── implement.md # /agestra implement: implementation workflow
|
|
338
354
|
├── agents/
|
|
355
|
+
│ ├── agestra-implementer.md # Scoped implementation executor (Sonnet)
|
|
356
|
+
│ ├── agestra-e2e-writer.md # Persistent E2E test author (Sonnet)
|
|
339
357
|
│ ├── agestra-reviewer.md # Strict quality reviewer (Opus)
|
|
340
358
|
│ ├── agestra-designer.md # Architecture explorer (Opus)
|
|
341
359
|
│ ├── agestra-ideator.md # Improvement finder (Sonnet)
|
|
342
360
|
│ ├── agestra-moderator.md # Multi-mode moderator (Sonnet)
|
|
343
361
|
│ ├── agestra-qa.md # QA verifier (Opus, does not write code)
|
|
362
|
+
│ ├── agestra-security.md # Security reviewer (Opus)
|
|
344
363
|
│ └── agestra-team-lead.md # Overall orchestrator (Sonnet, does not write code)
|
|
345
364
|
├── skills/
|
|
346
365
|
│ ├── provider-guide.md # Provider routing and mode guide
|
|
@@ -352,6 +371,9 @@ agestra/
|
|
|
352
371
|
│ ├── design.md # Architecture exploration workflow
|
|
353
372
|
│ ├── idea.md # Improvement discovery workflow
|
|
354
373
|
│ ├── review.md # Code quality review workflow
|
|
374
|
+
│ ├── qa.md # Design-contract QA workflow
|
|
375
|
+
│ ├── security.md # Dedicated security review workflow
|
|
376
|
+
│ ├── e2e.md # Persistent E2E test authoring workflow
|
|
355
377
|
│ └── leader.md # Multi-AI orchestration router
|
|
356
378
|
├── hooks/
|
|
357
379
|
│ └── user-prompt-submit.md # Tool recommendation hook
|
|
@@ -403,7 +425,7 @@ npm run uninstall:gemini
|
|
|
403
425
|
npm run uninstall:gemini:assets
|
|
404
426
|
```
|
|
405
427
|
|
|
406
|
-
`*:assets`
|
|
428
|
+
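The `*:assets` uninstall commands remove both the host registration and any unmodified generated host assets. Codex assets are custom-agent files. Gemini project-scope assets are managed files, and Gemini user-scope assets are removed with `gemini extensions uninstall agestra`. If a user has edited a generated asset, Agestra keeps the file and reports it. With a global npm install, run `agestra-uninstall codex --assets` or `agestra-uninstall gemini --assets --scope user`.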
|
|
407
429
|
|
|
408
430
|
To also delete generated project data, remove the `.agestra/` directory manually.
|
|
409
431
|
|
|
@@ -36,7 +36,7 @@ description: |
|
|
|
36
36
|
model: opus
|
|
37
37
|
color: blue
|
|
38
38
|
codexSandboxMode: workspace-write
|
|
39
|
-
|
|
39
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, Write
|
|
40
40
|
---
|
|
41
41
|
|
|
42
42
|
<Role>
|
|
@@ -35,7 +35,7 @@ description: |
|
|
|
35
35
|
model: sonnet
|
|
36
36
|
color: green
|
|
37
37
|
codexSandboxMode: workspace-write
|
|
38
|
-
|
|
38
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, Write
|
|
39
39
|
---
|
|
40
40
|
|
|
41
41
|
<Role>
|
|
@@ -61,7 +61,7 @@ description: |
|
|
|
61
61
|
model: sonnet
|
|
62
62
|
color: cyan
|
|
63
63
|
codexSandboxMode: read-only
|
|
64
|
-
|
|
64
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, mcp__plugin_agestra_agestra__provider_list, mcp__plugin_agestra_agestra__agent_debate_structured, mcp__plugin_agestra_agestra__agent_debate_status, mcp__plugin_agestra_agestra__agent_debate_approve, mcp__plugin_agestra_agestra__agent_debate_continue, mcp__plugin_agestra_agestra__agent_debate_reject, mcp__plugin_agestra_agestra__agent_debate_review, mcp__plugin_agestra_agestra__ai_chat, mcp__plugin_agestra_agestra__workspace_read, mcp__plugin_agestra_agestra__workspace_create_document
|
|
65
65
|
---
|
|
66
66
|
|
|
67
67
|
<Role>
|
|
@@ -512,5 +512,4 @@ If `max_rounds` is hit with open proposals, the moderator surfaces the choice to
|
|
|
512
512
|
- `ai_chat` — query individual providers for feedback (Independent Aggregation mode).
|
|
513
513
|
- `workspace_create_document` — create analysis or aggregated documents (Independent Aggregation mode).
|
|
514
514
|
- `workspace_read` — read individual provider documents by ID (Independent Aggregation mode).
|
|
515
|
-
- `workspace_replace_document_content` — replace generated debate or synthesis markdown when the engine regenerates output from the ledger.
|
|
516
515
|
</Tool_Usage>
|
package/agents/agestra-qa.md
CHANGED
|
@@ -1,18 +1,29 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: agestra-qa
|
|
3
3
|
description: |
|
|
4
|
-
Host-local document-first QA verifier. Validates implementation against docs/plans design
|
|
4
|
+
Host-local document-first QA evidence verifier. Validates implementation against docs/plans design
|
|
5
5
|
contracts, Implementation Progress evidence, build/test results, runtime behavior, basic safety
|
|
6
6
|
hygiene, and optional E2E/browser flows. Writes QA report artifacts under docs/reports/qa/.
|
|
7
|
-
Does NOT modify source code or add persistent test files.
|
|
8
|
-
agestra-team-lead
|
|
7
|
+
Does NOT modify source code or add persistent test files. When configured external providers are
|
|
8
|
+
available, normal /agestra qa requests should route through agestra-team-lead for the QA Brigade;
|
|
9
|
+
this agent supplies the host-owned evidence pass, especially for build/test and
|
|
10
|
+
E2E/runtime checks.
|
|
9
11
|
|
|
10
12
|
<example>
|
|
11
|
-
Context: Implementation is done and
|
|
13
|
+
Context: Implementation is done and configured providers are available
|
|
12
14
|
user: "I've finished the implementation, please run QA"
|
|
15
|
+
assistant: "I'll use the agestra-team-lead agent to run the QA Brigade, with host-owned runtime evidence."
|
|
16
|
+
<commentary>
|
|
17
|
+
Default QA with providers — team-lead forms the QA Brigade, runs host QA evidence collection, then coordinates provider verdicts.
|
|
18
|
+
</commentary>
|
|
19
|
+
</example>
|
|
20
|
+
|
|
21
|
+
<example>
|
|
22
|
+
Context: Implementation is done and needs explicit single-host verification
|
|
23
|
+
user: "Run QA using only the host"
|
|
13
24
|
assistant: "I'll use the agestra-qa agent to verify the implementation against the design."
|
|
14
25
|
<commentary>
|
|
15
|
-
|
|
26
|
+
Explicit host-only post-implementation verification — QA checks the design document, progress ledger,
|
|
16
27
|
build/test commands, and selected runtime flows.
|
|
17
28
|
</commentary>
|
|
18
29
|
</example>
|
|
@@ -22,8 +33,9 @@ description: |
|
|
|
22
33
|
user: "Please QA all the way through the actual screen flows"
|
|
23
34
|
assistant: "I'll use the agestra-qa agent and ask whether to run the full E2E path."
|
|
24
35
|
<commentary>
|
|
25
|
-
QA explains E2E cost, then verifies existing E2E tests or temporary browser flows. Persistent
|
|
26
|
-
test-file creation or maintenance is handed to agestra-e2e-writer after approval.
|
|
36
|
+
QA explains E2E cost, then the host verifies existing E2E tests or temporary browser flows. Persistent
|
|
37
|
+
test-file creation or maintenance is handed to agestra-e2e-writer after approval. External providers
|
|
38
|
+
may review the resulting artifacts through team-lead, but do not run E2E/browser flows themselves.
|
|
27
39
|
</commentary>
|
|
28
40
|
</example>
|
|
29
41
|
|
|
@@ -32,14 +44,14 @@ description: |
|
|
|
32
44
|
user: "Verify it together with Codex and Gemini"
|
|
33
45
|
assistant: "I'll use the agestra-team-lead agent to run a multi-AI structured QA debate."
|
|
34
46
|
<commentary>
|
|
35
|
-
Multi-AI verification — must go through team-lead which runs structured debate (mode:review)
|
|
36
|
-
with external providers cross-validating. Do NOT call agestra-qa directly here.
|
|
47
|
+
Multi-AI verification — must go through team-lead which forms the QA Brigade and runs structured debate (mode:review)
|
|
48
|
+
with external providers cross-validating host evidence. Do NOT call agestra-qa directly here.
|
|
37
49
|
</commentary>
|
|
38
50
|
</example>
|
|
39
51
|
model: opus
|
|
40
52
|
color: yellow
|
|
41
53
|
codexSandboxMode: workspace-write
|
|
42
|
-
|
|
54
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, Write
|
|
43
55
|
---
|
|
44
56
|
|
|
45
57
|
<Role>
|
|
@@ -37,7 +37,7 @@ description: |
|
|
|
37
37
|
model: opus
|
|
38
38
|
color: red
|
|
39
39
|
codexSandboxMode: workspace-write
|
|
40
|
-
|
|
40
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, Write
|
|
41
41
|
---
|
|
42
42
|
|
|
43
43
|
<Role>
|
|
@@ -18,7 +18,7 @@ description: |
|
|
|
18
18
|
model: opus
|
|
19
19
|
color: red
|
|
20
20
|
codexSandboxMode: workspace-write
|
|
21
|
-
|
|
21
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, Write
|
|
22
22
|
---
|
|
23
23
|
|
|
24
24
|
<Role>
|
|
@@ -70,7 +70,7 @@ description: |
|
|
|
70
70
|
model: sonnet
|
|
71
71
|
color: magenta
|
|
72
72
|
codexSandboxMode: read-only
|
|
73
|
-
|
|
73
|
+
tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, mcp__plugin_agestra_agestra__environment_check, mcp__plugin_agestra_agestra__provider_list, mcp__plugin_agestra_agestra__provider_health, mcp__plugin_agestra_agestra__trace_query, mcp__plugin_agestra_agestra__trace_summary, mcp__plugin_agestra_agestra__trace_visualize, mcp__plugin_agestra_agestra__ai_chat, mcp__plugin_agestra_agestra__ai_analyze_files, mcp__plugin_agestra_agestra__ai_compare, mcp__plugin_agestra_agestra__agent_debate_structured, mcp__plugin_agestra_agestra__agent_debate_status, mcp__plugin_agestra_agestra__agent_debate_approve, mcp__plugin_agestra_agestra__agent_debate_continue, mcp__plugin_agestra_agestra__agent_debate_reject, mcp__plugin_agestra_agestra__agent_cross_validate, mcp__plugin_agestra_agestra__cli_worker_spawn, mcp__plugin_agestra_agestra__cli_worker_status, mcp__plugin_agestra_agestra__cli_worker_collect, mcp__plugin_agestra_agestra__cli_worker_stop, mcp__plugin_agestra_agestra__agent_changes_review, mcp__plugin_agestra_agestra__agent_changes_accept, mcp__plugin_agestra_agestra__agent_changes_reject
|
|
74
74
|
---
|
|
75
75
|
|
|
76
76
|
<Role>
|
|
@@ -100,13 +100,14 @@ If invoked with **Domain: review**, do not enter implementation decomposition, w
|
|
|
100
100
|
|
|
101
101
|
If invoked with **Domain: security**, do not enter implementation decomposition, worker routing, or code-changing phases. Execute the structured security workflow in `commands/security.md`, then report security findings, tool-assisted checks run/skipped/declined/unavailable, report artifact path, residual risk, and SECURITY PASS / PASS WITH HARDENING / SECURITY BLOCK. Security must not run destructive exploit tests, and must not install tools or run heavyweight/networked scans without explicit user approval.
|
|
102
102
|
|
|
103
|
-
If invoked with **Domain: qa** or **Domain: implement, Submode: qa-only**, skip Phase 2 (Task Design), Phase 3 (Parallel Execution), and Phase 4 (Result Inspection) entirely — there is no code to write. Instead:
|
|
103
|
+
If invoked with **Domain: qa** or **Domain: implement, Submode: qa-only**, skip Phase 2 (Task Design), Phase 3 (Parallel Execution), and Phase 4 (Result Inspection) entirely — there is no product code to write. Instead:
|
|
104
104
|
|
|
105
105
|
1. Run Phase 1 (Situation Assessment) to confirm available providers and the design document scope.
|
|
106
106
|
2. Preserve the QA depth from the handoff packet: Standard QA / Full QA with E2E / Decide automatically.
|
|
107
|
-
3.
|
|
108
|
-
-
|
|
109
|
-
-
|
|
107
|
+
3. Choose QA verification routing independently from implementation routing:
|
|
108
|
+
- If the user explicitly requested host-only QA, or no configured external providers are available, run Phase 5 (Host QA Evidence Pass): spawn `agestra-qa` against the existing changes, classify failures, and report verdict. No QA Fix Loop unless the user explicitly requests follow-up fixes.
|
|
109
|
+
- Otherwise, run Phase 5M (QA Brigade) by default. Start with host-owned `agestra-qa` evidence collection, then hand off to the moderator engine via `agent_debate_structured`. The moderator engine runs the configured and available review-capable providers plus the host QA participant through the existing `ITEM-*` / JSON stance ledger flow. Give each participant an explicit QA lens and require independent PASS / CONDITIONAL / FAIL recommendations in their source material. Treat the structured debate as a brigade cross-check: every participant reviews the design, code, diff, host evidence, and peer findings; the JSON consensus ledger merges consensus and preserves dissent.
|
|
110
|
+
- E2E/runtime execution is host-owned only. External providers may review the host QA report, command output, screenshots, traces, and E2E findings, but must not run browser/dev-server flows or create persistent E2E files directly.
|
|
110
111
|
4. Skip Phase 6 (Post-implementation Review) — that's the reviewer's territory, not QA-only.
|
|
111
112
|
5. Phase 7 report: surface QA depth, E2E status, QA verdict, spec-to-code mapping summary, classified failures (`BUILD_ERROR` / `DESIGN_GAP` / `PROGRESS_MISMATCH` / `INTEGRATION_BREAK` / `TEST_FAILURE` / `E2E_FAILURE` / `SAFETY_HYGIENE_RISK`), any `E2E_TEST_WORK_REQUEST`, and the synthesis path (multi-AI) or QA agent report path (host-local).
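The QA routing rule from step 3 above can be sketched as a small pure function. This is an illustrative reading of the prompt text, not code from the package: the `QaRoute` labels, the `QaRoutingInput` shape, and the `mayRunE2E` helper are hypothetical names.

```typescript
// Route names are illustrative labels for the two phases described above.
type QaRoute = "phase-5-host-evidence" | "phase-5M-qa-brigade";

interface QaRoutingInput {
  hostOnlyRequested: boolean; // the user explicitly asked for host-only QA
  availableProviders: string[]; // configured external providers that are available
}

// Host-only QA when requested or when no provider is available; QA Brigade otherwise.
function chooseQaRoute(input: QaRoutingInput): QaRoute {
  if (input.hostOnlyRequested || input.availableProviders.length === 0) {
    return "phase-5-host-evidence";
  }
  return "phase-5M-qa-brigade";
}

// E2E/runtime execution is host-owned: external providers only review the artifacts.
function mayRunE2E(participant: "host" | "external"): boolean {
  return participant === "host";
}

console.log(chooseQaRoute({ hostOnlyRequested: false, availableProviders: ["codex", "gemini"] }));
```

The routing input deliberately ignores how the code was written, which mirrors the statement that QA routing is chosen independently from implementation routing.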
|
|
112
113
|
|
|
@@ -154,7 +155,7 @@ Decompose the work into independent, assignable tasks:
|
|
|
154
155
|
|
|
155
156
|
| Option | Description |
|
|
156
157
|
|--------|-------------|
|
|
157
|
-
| **Leader-host only** | The current host uses `agestra-implementer` and specialist agents/prompts; no external coding workers |
|
|
158
|
+
| **Leader-host only** | The current host uses `agestra-implementer` and specialist agents/prompts; no external coding workers. QA routing still follows the configured-provider default unless host-only QA is requested |
|
|
158
159
|
| **Multi-AI** | CLI AIs work autonomously when suitable, Ollama handles simple proposal work, host-local agents handle scoped implementation/review/QA |
|
|
159
160
|
|
|
160
161
|
If no external providers available: skip selection, proceed with Leader-host only.
|
|
@@ -252,7 +253,7 @@ Execute approved tasks across available execution paths:
|
|
|
252
253
|
|
|
253
254
|
**Result Integration:**
|
|
254
255
|
- Leader-host implementation: changes are already applied on the main branch (no merge needed).
|
|
255
|
-
- CLI workers: call `agent_changes_review` to
|
|
256
|
+
- CLI workers: call `agent_changes_review` to inspect the full diff. Do **not** accept here — Phase 4 step 7 owns the supervised/autonomous accept gate.
|
|
256
257
|
- File overlap between tracks: detect conflicts between implementer-applied changes and CLI worker worktrees. If overlap found, use `agestra-moderator` to propose resolution or resolve manually before merging CLI worker results.
|
|
257
258
|
|
|
258
259
|
### Phase 4: Result Inspection
|
|
@@ -273,15 +274,17 @@ After each task completes:
|
|
|
273
274
|
- Import/export chains are complete
|
|
274
275
|
6. If issues found → craft a detailed correction prompt and re-assign to the same AI or send a scoped fix task to `agestra-implementer`.
|
|
275
276
|
7. If all checks pass:
|
|
276
|
-
- For CLI worker tasks:
|
|
277
|
+
- For CLI worker tasks: gate `agent_changes_accept` by execution mode.
|
|
278
|
+
- **Supervised (default):** Summarize the diff (files touched, scope, risk highlights) and use `AskUserQuestion` to confirm the merge before calling `agent_changes_accept`. Call `agent_changes_reject` only after an explicit user rejection with a reason. If the user does not respond or `AskUserQuestion` is unavailable, leave the worker worktree pending, report the task ID, and wait for a later accept/reject decision.
|
|
279
|
+
- **Autonomous:** Record the review evidence in your status update (files, design alignment notes), then call `agent_changes_accept`. Escalate to the user instead of auto-accepting when the diff exceeds the worker's stated scope, adds unrequested files, or touches a file flagged as high-risk in Phase 2.
|
|
277
280
|
- For rejected CLI worker tasks: call `agent_changes_reject` with reason
|
|
278
281
|
- Proceed to verification:
|
|
279
|
-
|
|
280
|
-
|
|
282
|
+
- If configured external providers are available and the user did not explicitly request host-only QA → Phase 5M (QA Brigade).
|
|
283
|
+
- If no configured external providers are available, or the user explicitly requested host-only QA → Phase 5 (Host QA Evidence Pass) followed by Phase 6 (Post-implementation Review).
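As a rough sketch of the accept gate in step 7, the decision can be modeled as a function of execution mode and a few diff flags. The type names, the flag names, and the three action labels are assumptions for illustration; the real tools involved are `AskUserQuestion`, `agent_changes_accept`, and `agent_changes_reject` as named in the text.

```typescript
type ExecutionMode = "supervised" | "autonomous";

// "ask-user": confirm via AskUserQuestion before calling agent_changes_accept.
// "accept": record review evidence, then call agent_changes_accept.
// "escalate": hand the decision to the user instead of auto-accepting.
type GateAction = "ask-user" | "accept" | "escalate";

// Hypothetical summary of what the leader learned from agent_changes_review.
interface WorkerDiffSummary {
  exceedsStatedScope: boolean;
  addsUnrequestedFiles: boolean;
  touchesHighRiskFile: boolean; // file flagged as high-risk in Phase 2
}

function acceptGate(mode: ExecutionMode, diff: WorkerDiffSummary): GateAction {
  if (mode === "supervised") {
    // Supervised is the default: the merge always needs explicit user confirmation.
    return "ask-user";
  }
  const needsEscalation =
    diff.exceedsStatedScope || diff.addsUnrequestedFiles || diff.touchesHighRiskFile;
  return needsEscalation ? "escalate" : "accept";
}

console.log(
  acceptGate("autonomous", {
    exceedsStatedScope: false,
    addsUnrequestedFiles: true,
    touchesHighRiskFile: false,
  }),
);
```

Note that the sketch never returns a reject action: per the text, `agent_changes_reject` follows only an explicit user rejection, so it sits outside this automatic gate.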
|
|
281
284
|
|
|
282
|
-
### Phase 5: QA
|
|
285
|
+
### Phase 5: Host QA Evidence Pass
|
|
283
286
|
|
|
284
|
-
> Used when
|
|
287
|
+
> Used when no configured external providers are available, the user explicitly requested host-only QA, or Phase 5M needs host-owned executable evidence before provider cross-validation.
|
|
285
288
|
|
|
286
289
|
Run formal verification with automatic fix loop:
|
|
287
290
|
|
|
@@ -322,9 +325,59 @@ Run formal verification with automatic fix loop:
|
|
|
322
325
|
- After the tests exist or are updated, re-run `agestra-qa`.
|
|
323
326
|
- If declined, keep the QA verdict/residual risk honest and do not mark E2E as covered.
|
|
324
327
|
|
|
325
|
-
### Phase 5M:
|
|
328
|
+
### Phase 5M: QA Brigade
|
|
326
329
|
|
|
327
|
-
> Used
|
|
330
|
+
> Used for QA whenever configured external providers are available, unless the user explicitly requested host-only QA. This is the default for `/agestra qa` and post-implementation QA. It can also be used after Leader-host-only implementation because QA routing is separate from code-writing routing.
|
|
331
|
+
|
|
332
|
+
The QA Brigade should feel like the review workflow's full formation, not a lightweight second opinion. Build a broad verification team and make the differences between providers useful.
|
|
333
|
+
|
|
334
|
+
For QA topics, collect host-owned executable evidence first:
|
|
335
|
+
|
|
336
|
+
1. Spawn `agestra-qa` with the design document, change scope, QA depth, and report artifact expectation under `docs/reports/qa/`.
|
|
337
|
+
2. If QA depth includes E2E/runtime verification, only the host QA path runs browser/dev-server flows, screenshots, traces, or existing E2E commands.
|
|
338
|
+
3. If `agestra-qa` returns `E2E_TEST_WORK_REQUEST`, pause for user approval before routing that packet to `agestra-e2e-writer`; do not ask external providers to create or repair persistent E2E tests.
|
|
339
|
+
4. Use the host QA report path, command output, screenshots/traces, and E2E findings as evidence for provider cross-validation.
|
|
340
|
+
|
|
341
|
+
#### 5M.0 Brigade formation
|
|
342
|
+
|
|
343
|
+
Build the QA Brigade handoff before starting the moderator debate:
|
|
344
|
+
|
|
345
|
+
| Brigade member | Role |
|
|
346
|
+
|---|---|
|
|
347
|
+
| Host `agestra-qa` / structured `claude-qa` participant | Evidence lead and debate participant: design/progress audit, build/test commands, host-owned E2E/runtime evidence, report artifact, and JSON stance turns |
|
|
348
|
+
| Configured review-capable providers | Independent QA judges: each reviews the design, diff/code, host QA evidence, and peer claims |
|
|
349
|
+
| `agestra-reviewer` lens | Optional support lens for production readiness, UX/product feel, maintainability, and test adequacy when those affect acceptability; do not turn QA into a general review |
|
|
350
|
+
| `agestra-security` lens | Optional support lens for basic safety hygiene escalation when QA finds secrets, auth, file, command, network, or permission risk; use `/agestra security` for a dedicated audit |
|
|
351
|
+
| `agestra-e2e-writer` | Not a brigade reviewer. Use only after an approved `E2E_TEST_WORK_REQUEST` for persistent E2E test work |
|
|
352
|
+
|
|
353
|
+
Default participant policy:
|
|
354
|
+
- Include every configured and available review-capable provider by default, not only the "best" one. Use `trace_summary` to assign lenses and order attention, not to shrink the brigade unless a provider is unavailable, explicitly excluded, or clearly unqualified for the requested lens.
|
|
355
|
+
- Exclude `ollama` by default unless the user explicitly requested it for lightweight cross-checking.
|
|
356
|
+
- Keep the host QA participant in the flow even when external providers are present, because executable evidence, E2E/runtime observation, and local command output are host-owned. In structured debate, this is the `claude-qa` compatibility participant when auto-injected or explicitly listed.
|
|
357
|
+
- Assign distinct lenses so the output is not three copies of the same review. Suggested lenses: spec-to-code compliance, progress-table truthfulness, integration/regression risk, edge/error states, test adequacy, safety hygiene, and E2E artifact interpretation.
|
|
358
|
+
- Each brigade member must issue an independent PASS / CONDITIONAL PASS / FAIL recommendation with evidence and confidence in its individual source material. Disagreement is useful; preserve minority reports in the final synthesis.
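The default participant policy above could be approximated as follows. This is a sketch under stated assumptions: the `Provider` and `BrigadeMember` shapes and the round-robin lens assignment are invented for illustration (the text says lenses should be assigned using `trace_summary`, which is not modeled here), while the `claude-qa` and `ollama` IDs and the lens names come from the text.

```typescript
// Hypothetical provider record; the prompt text does not define this shape.
interface Provider {
  id: string;
  available: boolean;
  reviewCapable: boolean;
}

interface BrigadeMember {
  id: string;
  lens: string;
}

// Lens names are the suggestions listed in the policy above.
const LENSES: string[] = [
  "spec-to-code compliance",
  "progress-table truthfulness",
  "integration/regression risk",
  "edge/error states",
  "test adequacy",
  "safety hygiene",
  "E2E artifact interpretation",
];

// "claude-qa" is the host QA compatibility participant named in the text.
const HOST_QA_ID = "claude-qa";

function formBrigade(
  providers: Provider[],
  includeOllama: boolean,
  excluded: string[],
): BrigadeMember[] {
  const external = providers
    .filter((p) => p.available && p.reviewCapable) // every available review-capable provider
    .filter((p) => includeOllama || p.id !== "ollama") // ollama only on explicit request
    .filter((p) => excluded.indexOf(p.id) === -1) // honor explicit exclusions
    .map((p) => p.id);
  // The host QA participant always stays in the flow.
  const ids = [HOST_QA_ID].concat(external);
  // Assumption: round-robin lenses; they stay distinct while there are at most seven members.
  return ids.map((id, i) => ({ id, lens: LENSES[i % LENSES.length] ?? "general QA" }));
}

const brigade = formBrigade(
  [
    { id: "codex", available: true, reviewCapable: true },
    { id: "gemini", available: true, reviewCapable: true },
    { id: "ollama", available: true, reviewCapable: true },
  ],
  false,
  [],
);
console.log(brigade.map((m) => `${m.id}: ${m.lens}`).join("\n"));
```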
|
|
359
|
+
|
|
360
|
+
#### 5M.0a QA mapping onto the existing JSON ledger
|
|
361
|
+
|
|
362
|
+
Do not invent a separate QA adjudication schema. Use the moderator's existing structured-debate contract.
|
|
363
|
+
|
|
364
|
+
Each candidate QA finding must become a normal consensus `ITEM-*` with source references. Participants vote through the existing JSON stance contract:
|
|
365
|
+
|
|
366
|
+
| Stance | QA meaning |
|
|
367
|
+
|---|---|
|
|
368
|
+
| `agree` | Include this finding as a QA issue; the evidence supports it and the severity/scope are acceptable |
|
|
369
|
+
| `disagree` | Do not include this finding: false positive, over-severe, duplicate, out-of-scope, already covered, or evidence is insufficient |
|
|
370
|
+
| `revise` | The issue is real, but the claim, severity, scope, wording, or fix direction must change; include `proposedItem` |
|
|
371
|
+
| `opinion` | The item requires a product/design/leader judgment rather than a QA fact decision |
|
|
372
|
+
|
|
373
|
+
Ledger interpretation:
|
|
374
|
+
- `accepted` means all active participants agree; only accepted blocking/conditional QA items can drive the final FAIL / CONDITIONAL PASS.
|
|
375
|
+
- `excluded` means all active participants disagree; do not include it in the final QA issue list except as a brief overruled/minority note when useful.
|
|
376
|
+
- `superseded` means the moderator accepted a revision or merge into another item; report the canonical item, not both duplicates.
|
|
377
|
+
- `needs_opinion`, `unresolved`, and `no_response` mean the item is still open. Continue rounds when useful; if escalated to the leader, report it as open/dissenting rather than pretending consensus.
|
|
378
|
+
- Evidence-insufficient findings should normally receive `disagree`, not `opinion`. Use `opinion` only for genuine product/design judgment calls.
|
|
379
|
+
- The moderator handles duplicate/merge/superseded state in the ledger. Participants may point out duplication in comments or propose a `revise`, but they do not manually merge markdown.
|
|
380
|
+
- The leader does not decide item inclusion by hand. The leader inspects the JSON ledger and chooses approve / continue / reject at the approval gate.
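One way to read the ledger interpretation above is as two small functions: one classifying an item from participant stances, one deriving a verdict from accepted items. This is a hedged sketch, not the moderator engine's logic. The precedence between `no_response`, `needs_opinion`, and `unresolved`, the `Severity` labels, and the mapping of blocking items to FAIL and conditional items to CONDITIONAL PASS are assumptions; `superseded` is omitted because it depends on the moderator accepting a revision or merge.

```typescript
type Stance = "agree" | "disagree" | "revise" | "opinion";
type ItemStatus = "accepted" | "excluded" | "needs_opinion" | "unresolved" | "no_response";

// One entry per active participant; null means that participant has not responded.
function classifyItem(stances: (Stance | null)[]): ItemStatus {
  if (stances.length === 0 || stances.some((s) => s === null)) {
    return "no_response";
  }
  if (stances.every((s) => s === "agree")) {
    return "accepted"; // all active participants agree
  }
  if (stances.every((s) => s === "disagree")) {
    return "excluded"; // all active participants disagree
  }
  if (stances.some((s) => s === "opinion")) {
    return "needs_opinion"; // a product/design/leader judgment is required
  }
  return "unresolved"; // mixed agree/disagree/revise: still open
}

// Hypothetical severity labels for QA items.
type Severity = "blocking" | "conditional" | "suggestion";

interface LedgerItem {
  id: string;
  status: ItemStatus;
  severity: Severity;
}

// Only accepted blocking/conditional items drive the verdict; open items never do.
function finalVerdict(items: LedgerItem[]): "PASS" | "CONDITIONAL PASS" | "FAIL" {
  const accepted = items.filter((item) => item.status === "accepted");
  if (accepted.some((item) => item.severity === "blocking")) {
    return "FAIL";
  }
  if (accepted.some((item) => item.severity === "conditional")) {
    return "CONDITIONAL PASS";
  }
  return "PASS";
}

console.log(classifyItem(["agree", "revise", "agree"]));
```

Open items still belong in the report as open or dissenting; the sketch only shows that they do not change the verdict.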
|
|
328
381
|
|
|
329
382
|
Run the structured-debate MCP flow. This is a **background lifecycle**: `agent_debate_structured` creates a durable session record immediately and returns `status: running`; the leader polls `agent_debate_status` until the moderator parks the session in `ready-for-approval`, `escalated`, or `error`. The moderator does NOT write the synthesis file on its own — approval must be explicit.
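The background lifecycle described above amounts to a bounded polling loop. The sketch below is synchronous and takes a hypothetical `getStatus` callback standing in for a call to `agent_debate_status`; a real MCP client would await the tool call and pause between polls. The status strings are the ones named in the text.

```typescript
type DebateStatus = "running" | "ready-for-approval" | "escalated" | "error";

// Poll until the moderator parks the session, or give up after maxPolls attempts.
function pollUntilParked(getStatus: () => DebateStatus, maxPolls: number): DebateStatus {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const current = getStatus();
    if (current !== "running") {
      return current; // parked: ready-for-approval, escalated, or error
    }
  }
  return "running"; // still running; the caller decides whether to keep waiting
}

// Canned responses used in place of a real agent_debate_status call.
const canned: DebateStatus[] = ["running", "running", "ready-for-approval"];
const parked = pollUntilParked(() => canned.shift() ?? "error", 10);
console.log(parked);
```

Reaching `ready-for-approval` only ends the polling; the synthesis is not written until the leader explicitly approves.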
|
|
330
383
|
|
|
@@ -332,11 +385,11 @@ Run the structured-debate MCP flow. This is a **background lifecycle**: `agent_d
|
|
|
332
385
|
|
|
333
386
|
Call `agent_debate_structured` with:
|
|
334
387
|
|
|
335
|
-
- `topic` — short slug (used in file names under `.agestra/workspace/`).
|
|
388
|
+
- `topic` — short slug (used in file names under `.agestra/workspace/`), prefixed or framed as QA Brigade when useful.
|
|
336
389
|
- `mode` — `"review"` for QA/review/security consensus, `"idea"` for exploratory design or option discovery.
|
|
337
|
-
- `scope` — concrete framing: file list, task description,
|
|
338
|
-
- `participants` — the provider/agent IDs the user specified
|
|
339
|
-
- `source_documents` — optional pre-created individual documents, each as `{ "document_id": "...", "provider": "..." }`.
|
|
390
|
+
- `scope` — concrete framing: file list, task description, design doc path, changed files, and host QA report/evidence path.
|
|
391
|
+
- `participants` — the provider/agent IDs the user specified, or all configured and available review-capable providers from `provider_list`, plus the host QA participant (`claude-qa` compatibility ID) through auto-injection or explicit listing. For QA, use `trace_summary` for lens assignment rather than narrowing by default. Exclude `ollama` unless explicitly requested for lightweight cross-checking.
|
|
392
|
+
- `source_documents` — optional pre-created individual documents, each as `{ "document_id": "...", "provider": "..." }`. For QA, pass the host QA report/evidence packet as source material for the matching host QA participant. The `provider` value must be present in `participants`.
|
|
340
393
|
- `auto_inject_specialists` — default `true`. When true, the moderator auto-adds host reviewer/QA/security specialists on top of `participants` based on topic heuristics (currently exposed as `claude-reviewer`, `claude-qa`, and/or `claude-security` for compatibility). When the user wants verbatim participants only, pass `false`.
|
|
341
394
|
- `exclude_participants` — participant IDs to never include, applied regardless of `auto_inject_specialists`. Use this when the user explicitly wants a provider (including Ollama — there is no automatic Ollama filter anymore) kept out.
|
|
342
395
|
- `leader` — omit unless you need to override the session-context leader.
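The parameter list above can be assembled and sanity-checked before the call. The sketch below is an assumption-laden illustration: the helper name `build_debate_args` is hypothetical, the participant IDs in the usage note are placeholders, and the only rules it enforces are the two stated in the list (the allowed `mode` values and that each `source_documents` provider appears in `participants`).

```python
from typing import Dict, List, Optional

ALLOWED_MODES = {"review", "idea"}


def build_debate_args(
    topic: str,
    mode: str,
    scope: str,
    participants: List[str],
    source_documents: Optional[List[Dict[str, str]]] = None,
    auto_inject_specialists: bool = True,
    exclude_participants: Optional[List[str]] = None,
) -> Dict:
    """Build an argument dict for agent_debate_structured (hypothetical helper)."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    docs = list(source_documents or [])
    for doc in docs:
        # Stated rule: the provider of a source document must be a participant.
        if doc["provider"] not in participants:
            raise ValueError(f"provider {doc['provider']!r} is not in participants")
    args: Dict = {
        "topic": topic,
        "mode": mode,
        "scope": scope,
        "participants": list(participants),
        "auto_inject_specialists": auto_inject_specialists,
        "exclude_participants": list(exclude_participants or []),
    }
    if docs:
        args["source_documents"] = docs
    # `leader` is deliberately omitted so the session-context leader applies.
    return args
```

For example, `build_debate_args("qa-brigade-login", "review", "docs/plans/login.md", ["provider-a", "claude-qa"], source_documents=[{"document_id": "doc-1", "provider": "claude-qa"}])` passes validation, while listing a document for a provider outside `participants` raises `ValueError`.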
|
|
@@ -368,11 +421,20 @@ Before deciding, read the on-disk outputs — the debate writes three folders un
|
|
|
368
421
|
|
|
369
422
|
Use `Read` / `Grep` against these paths plus the in-result snapshot to judge whether the debate outcome matches the design.
|
|
370
423
|
|
|
424
|
+
For QA Brigade sessions, inspect whether the synthesis contains:
|
|
425
|
+
- Participant list and assigned lenses.
|
|
426
|
+
- Independent verdicts from each participant.
|
|
427
|
+
- `ITEM-*` ledger status summary: accepted, excluded, superseded, needs_opinion, unresolved, and no_response items.
|
|
428
|
+
- Consensus verdict and confidence.
|
|
429
|
+
- Dissenting findings or minority concerns.
|
|
430
|
+
- Evidence mapping back to design requirements, code locations, commands, reports, screenshots, or traces.
|
|
431
|
+
- Clear distinction between QA-blocking failures, conditional concerns, and general review suggestions.
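The ledger status summary in the checklist above could be tallied along these lines. This is only a sketch: the item shape (`id` and `status` keys) is a hypothetical illustration, not the package's ledger schema. The status names are the ones listed above, and the rule that only accepted items feed the verdict comes from the `commands/qa.md` change in this same diff.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Status names as listed in the synthesis checklist above.
LEDGER_STATUSES = (
    "accepted",
    "excluded",
    "superseded",
    "needs_opinion",
    "unresolved",
    "no_response",
)


def summarize_ledger(items: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Count ITEM-* entries per status, reporting zero for absent statuses."""
    counts = Counter(item["status"] for item in items)
    unknown = set(counts) - set(LEDGER_STATUSES)
    if unknown:
        raise ValueError(f"unknown ledger status: {sorted(unknown)}")
    return {status: counts.get(status, 0) for status in LEDGER_STATUSES}


def verdict_relevant_ids(items: Iterable[Dict[str, str]]) -> List[str]:
    """Return IDs of accepted items; only these affect the final verdict."""
    return [item["id"] for item in items if item["status"] == "accepted"]
```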
|
|
432
|
+
|
|
371
433
|
#### 5M.4 Finalize (leader decision)
|
|
372
434
|
|
|
373
435
|
Pick exactly one of the three follow-up tools, based on inspection:
|
|
374
436
|
|
|
375
|
-
1. **Accept the outcome** → call `agent_debate_approve` with `session_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the session record to `approved`, and returns `synthesisDocPath`.
|
|
437
|
+
1. **Accept the outcome** → call `agent_debate_approve` with `session_id` and an optional `leader_note` (appended to the synthesis footer under "Leader approval notes"). The moderator writes the synthesis markdown, updates the session record to `approved`, and returns `synthesisDocPath`. If this is QA-only, proceed to Phase 7. If this is an implementation flow and the QA verdict is PASS or CONDITIONAL PASS, proceed to Phase 6 unless the debate explicitly included the post-implementation review lens. If this is an implementation flow and the QA verdict is FAIL, return to Phase 3 with targeted fixes or escalate to the user instead of claiming completion.
|
|
376
438
|
2. **Need more deliberation** → call `agent_debate_continue` with `session_id` and `additional_rounds` (`3`, `5`, or `10` only). The handler returns `status: running`; poll `agent_debate_status` again until it reaches the approval gate. Use this when the debate was close but unresolved, or when `escalated` came too early.
|
|
377
439
|
3. **Reject the outcome** → call `agent_debate_reject` with `session_id` and a `reason` (captured in the transcript footer). Optionally set `spawn_issue: true` to write a lightweight issue branch document into `individual/` listing non-accepted proposals for later handling. No synthesis is produced. The debate is closed.
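The phase routing that follows an accepted outcome (option 1 above) can be written as a small decision function. The function name and return labels are hypothetical; the branches mirror the added text, and the label `skip-phase-6` only records that Phase 6 is not run when the debate already covered the review lens, without assuming what the package does next.

```python
def next_step_after_approval(
    qa_only: bool,
    qa_verdict: str,
    review_lens_included: bool = False,
) -> str:
    """Map an approved QA debate to the next workflow step (illustrative labels)."""
    if qa_only:
        return "phase-7"
    if qa_verdict in ("PASS", "CONDITIONAL PASS"):
        # Phase 6 is skipped only when the debate explicitly included the
        # post-implementation review lens.
        return "skip-phase-6" if review_lens_included else "phase-6"
    if qa_verdict == "FAIL":
        # Never claim completion on FAIL: fix in Phase 3 or escalate to the user.
        return "phase-3-or-escalate"
    raise ValueError(f"unexpected QA verdict: {qa_verdict!r}")
```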
|
|
378
440
|
|
|
@@ -380,9 +442,9 @@ All three tools are idempotent on terminal states — re-calling returns the cac
|
|
|
380
442
|
|
|
381
443
|
When the session is `escalated`, explain the situation to the user in supervised mode before choosing `continue` vs `reject`. In autonomous mode, prefer `continue` with `additional_rounds: 5` once; if it escalates again, `reject` with a clear reason and fall back to targeted fix tasks in Phase 3.
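The escalation policy above can be sketched as follows. The helper names and the returned dict shape are hypothetical; the tool names, the allowed `additional_rounds` values, and the continue-once-then-reject rule are taken from the text.

```python
from typing import Dict

# agent_debate_continue accepts only these round counts, per the text above.
ALLOWED_ADDITIONAL_ROUNDS = (3, 5, 10)


def continue_args(session_id: str, additional_rounds: int) -> Dict:
    """Validate and build arguments for agent_debate_continue."""
    if additional_rounds not in ALLOWED_ADDITIONAL_ROUNDS:
        raise ValueError(f"additional_rounds must be one of {ALLOWED_ADDITIONAL_ROUNDS}")
    return {"session_id": session_id, "additional_rounds": additional_rounds}


def on_escalated(run_mode: str, session_id: str, already_continued: bool) -> Dict:
    """Pick the leader's next action for an `escalated` session."""
    if run_mode == "supervised":
        # Supervised mode: explain to the user before choosing continue vs reject.
        return {"action": "explain_to_user"}
    if run_mode != "autonomous":
        raise ValueError(f"unknown run mode: {run_mode!r}")
    if not already_continued:
        # Autonomous mode: continue once with five extra rounds.
        return {"action": "agent_debate_continue", "args": continue_args(session_id, 5)}
    # Escalated again: reject with a clear reason, then fall back to Phase 3.
    reason = "escalated again after one continuation; falling back to targeted fix tasks in Phase 3"
    return {"action": "agent_debate_reject", "args": {"session_id": session_id, "reason": reason}}
```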
|
|
382
444
|
|
|
383
|
-
### Phase 6: Post-implementation Review
|
|
445
|
+
### Phase 6: Post-implementation Review
|
|
384
446
|
|
|
385
|
-
> Used
|
|
447
|
+
> Used for implementation flows after QA passes when review was not already included in Phase 5M. Skip this phase for QA-only submode.
|
|
386
448
|
|
|
387
449
|
Run the `agestra-reviewer` agent for review/critique:
|
|
388
450
|
|
|
@@ -406,8 +468,8 @@ Provide a clear summary to the user:
|
|
|
406
468
|
- Task completion summary: total tasks, completed, failed, re-routed
|
|
407
469
|
- What changed (files modified, features added)
|
|
408
470
|
- Verification summary:
|
|
409
|
-
-
|
|
410
|
-
-
|
|
471
|
+
- Host-only QA/review: QA depth, E2E status, QA report path, QA cycle count + what was auto-fixed, review report path, review verdict
|
|
472
|
+
- QA Brigade / configured-provider QA: host QA report path, E2E host-only status, participant list, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus verdict, dissenting findings, structured debate outcome (`approved` / `rejected`, with round count), `auto_inject_specialists` state, final synthesis path (if approved) from `.agestra/workspace/synthesis/`, and links to the individual reviews under `.agestra/workspace/individual/` and the transcript under `.agestra/workspace/debates/`
|
|
411
473
|
- Any issues found and how they were resolved
|
|
412
474
|
|
|
413
475
|
</Workflow>
|
package/commands/implement.md
CHANGED
|
@@ -45,7 +45,7 @@ Use AskUserQuestion to present the recommended routing in the user's language, o
|
|
|
45
45
|
|
|
46
46
|
| Option | Condition | Description |
|
|
47
47
|
|--------|-----------|-------------|
|
|
48
|
-
| **Leader-host only** | Always | The current host delegates code changes to `agestra-implementer
|
|
48
|
+
| **Leader-host only** | Always | The current host delegates code changes to `agestra-implementer`; QA still follows the configured-provider default unless host-only QA is requested |
|
|
49
49
|
| **Suggested AI distribution** | team mode available | The leader proposes which enabled AIs should handle which tasks, asks for approval, then dispatches |
|
|
50
50
|
|
|
51
51
|
If team mode is not available, skip the question and use Leader-host only.
|
|
@@ -67,6 +67,11 @@ Determine QA depth for the post-implementation verification:
|
|
|
67
67
|
- **Full QA with E2E** when the user explicitly asks for E2E/runtime verification, or when the work is centered on UI flows, auth, file operations, public release, destructive actions, or complex state transitions.
|
|
68
68
|
- If Full QA may require long setup, a dev server, browser automation, screenshots, or persistent E2E test files, explain the time/token cost and ask before enabling it.
|
|
69
69
|
|
|
70
|
+
Determine QA routing separately from implementation routing:
|
|
71
|
+
- When configured external providers are available, team-lead routes post-implementation QA through the QA Brigade, even if implementation itself used Leader-host-only mode.
|
|
72
|
+
- If the user explicitly asks for host-only QA, or no external providers are available, use host-local QA only.
|
|
73
|
+
- E2E/runtime execution is always host-owned. External providers may review the host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly.
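A minimal sketch of the routing rules above, assuming a hypothetical function name and return shape. It takes no implementation-mode argument, which reflects that QA routing is decided separately from implementation routing, and it always reports the host as the E2E owner.

```python
from typing import Dict, List


def choose_qa_routing(available_external: List[str], host_only_requested: bool) -> Dict:
    """Decide post-implementation QA routing independently of implementation mode."""
    if host_only_requested or not available_external:
        return {"qa": "host-local", "e2e_owner": "host", "external_reviewers": []}
    return {
        "qa": "qa-brigade",
        # External providers only review host-produced artifacts; they never
        # run browser/dev-server flows or create persistent E2E files.
        "e2e_owner": "host",
        "external_reviewers": list(available_external),
    }
```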
|
|
74
|
+
|
|
70
75
|
## Step 5: Execute via team-lead
|
|
71
76
|
|
|
72
77
|
Spawn `agestra:agestra-team-lead` with a self-contained handoff packet. The team-lead agent is the single execution entry point — this command does NOT call `cli_worker_spawn`, `ai_chat`, `agent_debate_*`, or spawn `agestra-implementer` / `agestra-qa` directly.
|
|
@@ -80,6 +85,9 @@ Handoff packet:
|
|
|
80
85
|
- **Design doc reference:** path under `docs/plans/` if Step 4 produced or referenced one
|
|
81
86
|
- **Progress tracking:** implementers must update the design document's top-level Implementation Progress table with Planned / In Progress / Implemented / Verified / Blocked / Deferred status and evidence; they must not rewrite approved scope to hide incomplete work
|
|
82
87
|
- **QA depth:** Standard QA / Full QA with E2E / Decide automatically, based on Step 4
|
|
88
|
+
- **QA routing:** team-lead orchestrates the QA Brigade by default when external providers are available; host-only QA applies only when explicitly requested or when no external providers are available
|
|
89
|
+
- **QA formation:** host executable evidence lead + all configured and available review-capable providers with distinct QA lenses
|
|
90
|
+
- **E2E/runtime execution:** host-owned only
|
|
83
91
|
- **Available providers:** from `environment_check` / `provider_list`
|
|
84
92
|
- **Requested providers:** explicit names captured from user wording; otherwise "all available"
|
|
85
93
|
- **Locale:** from `setup_status`
|
|
@@ -90,7 +98,8 @@ Team-lead owns the rest:
|
|
|
90
98
|
|
|
91
99
|
**Leader-host-only mode:**
|
|
92
100
|
- Delegates code edits to `agestra:agestra-implementer`
|
|
93
|
-
- Runs
|
|
101
|
+
- Runs host-owned QA evidence collection (`agestra:agestra-qa`) with auto-fix loop when fixes are needed
|
|
102
|
+
- Orchestrates the QA Brigade by default when external providers are available
|
|
94
103
|
- Routes approved persistent E2E test work to `agestra:agestra-e2e-writer` only when QA requests it
|
|
95
104
|
- Runs Phase 6 post-implementation review (`agestra:agestra-reviewer`) for critique, blast radius, AI-slop/cleanup notes, and blocking concerns
|
|
96
105
|
|
|
@@ -103,7 +112,7 @@ Team-lead owns the rest:
|
|
|
103
112
|
|
|
104
113
|
**QA-only submode (`submode: qa-only`):**
|
|
105
114
|
- Skips Phase 2/3/4 (no code changes)
|
|
106
|
-
- Runs
|
|
115
|
+
- Runs Phase 5M (QA Brigade) by default when providers are available; otherwise runs Phase 5 (host-local QA) against existing code
|
|
107
116
|
- Returns a PASS / CONDITIONAL PASS / FAIL verdict; never spawns implementer or CLI workers
|
|
108
117
|
- Exception: if QA returns `E2E_TEST_WORK_REQUEST`, ask the user whether to create or update persistent E2E tests. If approved, route only that packet to `agestra:agestra-e2e-writer` as a separate E2E test-writing task, then re-run QA.
|
|
109
118
|
|
|
@@ -116,6 +125,7 @@ When team-lead returns, surface:
|
|
|
116
125
|
- QA report path under `docs/reports/qa/`
|
|
117
126
|
- Test/build outcome (`qa_run` result if executed)
|
|
118
127
|
- QA verdict (PASS / CONDITIONAL PASS / FAIL with classified failures if any)
|
|
128
|
+
- QA Brigade participants, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus, and notable dissenting findings when multi-AI QA ran
|
|
119
129
|
- Review report path under `docs/reports/review/` and review verdict (APPROVE / APPROVE WITH CONCERNS / BLOCKING CONCERNS) when review ran
|
|
120
130
|
- Synthesis paths under `.agestra/workspace/synthesis/` if structured debate ran
|
|
121
131
|
- Communicate in the user's language
|
package/commands/qa.md
CHANGED
|
@@ -40,13 +40,15 @@ Ask the user once:
|
|
|
40
40
|
|
|
41
41
|
If the user chooses Full QA and persistent E2E test files must be added or updated, QA must ask approval and route test-file work to `agestra-e2e-writer`. QA itself remains read-only for source code and persistent tests.
|
|
42
42
|
|
|
43
|
+
Even in multi-AI QA, E2E/runtime execution is host-owned. External providers may review the design, code, host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly.
|
|
44
|
+
|
|
43
45
|
QA writes a Markdown report under `docs/reports/qa/` unless the user explicitly asks for chat-only output.
|
|
44
46
|
|
|
45
47
|
## Step 3: Route execution
|
|
46
48
|
|
|
47
49
|
Call `environment_check` and `provider_list`.
|
|
48
50
|
|
|
49
|
-
**Branch A — No external providers available or
|
|
51
|
+
**Branch A — No external providers available, or the user explicitly requested host-only QA:**
|
|
50
52
|
Spawn `agestra:agestra-qa` host specialist directly with:
|
|
51
53
|
- QA target
|
|
52
54
|
- Design document path
|
|
@@ -55,22 +57,26 @@ Spawn `agestra:agestra-qa` host specialist directly with:
|
|
|
55
57
|
- Report artifact path expectation: `docs/reports/qa/YYYY-MM-DD-qa-[target].md`
|
|
56
58
|
- Locale
|
|
57
59
|
|
|
58
|
-
**Branch B —
|
|
60
|
+
**Branch B — 1+ configured external providers available (default QA Brigade):**
|
|
59
61
|
Hand off to `agestra:agestra-team-lead` with:
|
|
60
62
|
|
|
61
63
|
- **Domain:** `qa`
|
|
62
64
|
- **Submode:** `qa-only`
|
|
63
65
|
- **Mode:** `multi-ai`
|
|
66
|
+
- **QA formation:** QA Brigade
|
|
64
67
|
- **QA target:** from Step 1
|
|
65
68
|
- **QA depth:** Standard QA / Full QA with E2E / Decide automatically
|
|
69
|
+
- **E2E/runtime execution:** host-owned only; external providers cross-validate artifacts and findings, not browser/dev-server execution
|
|
66
70
|
- **Design doc reference:** path under `docs/plans/`
|
|
67
71
|
- **Report artifact path expectation:** `docs/reports/qa/YYYY-MM-DD-qa-[target].md`
|
|
68
72
|
- **Available providers:** from `environment_check`, exclude `ollama` unless explicitly requested for lightweight cross-checking
|
|
69
|
-
- **Requested providers:** explicit names captured from user wording; otherwise "all available review-capable"
|
|
73
|
+
- **Requested providers:** explicit names captured from user wording; otherwise "all configured and available review-capable providers"
|
|
74
|
+
- **Brigade lenses:** host executable evidence, spec-to-code compliance, implementation progress truthfulness, integration/regression risk, edge/error states, test adequacy, basic safety hygiene, and E2E artifact review when E2E ran
|
|
75
|
+
- **JSON finding flow:** candidate findings become `ITEM-*` ledger items; participants use the existing `agree` / `disagree` / `opinion` / `revise` stance contract; only ledger-accepted items affect the final verdict
|
|
70
76
|
- **Locale:** from `setup_status`
|
|
71
77
|
- **Original user request:** preserve verbatim
|
|
72
78
|
|
|
73
|
-
Team-lead owns
|
|
79
|
+
Team-lead owns the QA Brigade handoff and leader approval gate. The moderator engine owns provider fan-out, `ITEM-*` creation, JSON stance turns, consensus ledger aggregation, minority/open items, and synthesis after approval. This command must not call `agent_debate_structured` directly. Do not ask for a separate multi-AI confirmation in Branch B; provider selection already came from setup. Honor explicit host-only wording.
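The Branch A / Branch B decision in Step 3 can be sketched as below. The packet keys are hypothetical renderings of the bullet labels, and treating an Ollama-only environment as Branch A when Ollama was not explicitly requested is an assumption; the source only says to exclude `ollama` from the available providers unless requested.

```python
from typing import Dict, List, Optional


def route_qa_command(
    providers: List[str],
    requested: Optional[List[str]] = None,
    host_only: bool = False,
    ollama_requested: bool = False,
) -> Dict:
    """Choose Branch A (host specialist) or Branch B (QA Brigade handoff)."""
    # Exclude ollama unless explicitly requested for lightweight cross-checking.
    candidates = [p for p in providers if p != "ollama" or ollama_requested]
    if host_only or not candidates:
        return {"branch": "A", "spawn": "agestra:agestra-qa"}
    return {
        "branch": "B",
        "handoff_to": "agestra:agestra-team-lead",
        "domain": "qa",
        "submode": "qa-only",
        "mode": "multi-ai",
        "qa_formation": "QA Brigade",
        "e2e_runtime_execution": "host-owned only",
        "available_providers": candidates,
        "requested_providers": requested
        or "all configured and available review-capable providers",
    }
```

Note that the command only builds the handoff; calling `agent_debate_structured` stays with team-lead, as stated above.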
|
|
74
80
|
|
|
75
81
|
## Step 4: Present the final result
|
|
76
82
|
|
|
@@ -79,6 +85,7 @@ When QA returns:
|
|
|
79
85
|
- Link or name the design document used
|
|
80
86
|
- Link the QA report artifact under `docs/reports/qa/`
|
|
81
87
|
- Show PASS / CONDITIONAL PASS / FAIL
|
|
88
|
+
- In QA Brigade mode, summarize participants, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus, and notable dissenting findings
|
|
82
89
|
- Summarize progress-table mismatches, design gaps, build/test failures, E2E failures, and basic safety hygiene risks
|
|
83
90
|
- If QA returned `E2E_TEST_WORK_REQUEST`, ask the user whether to create or update persistent E2E tests. If approved, route the request to `agestra:agestra-e2e-writer` or team-lead as a separate E2E test-writing task, then re-run QA after tests exist. If declined, record E2E as residual risk.
|
|
84
91
|
- Recommend `/agestra review` for critique or `/agestra security` for dedicated security audit when needed
|