opencastle 0.7.0 → 0.8.0
- package/README.md +3 -3
- package/package.json +1 -1
- package/src/dashboard/node_modules/.vite/deps/_metadata.json +6 -6
- package/src/orchestrator/agent-workflows/feature-implementation.md +10 -3
- package/src/orchestrator/prompts/bug-fix.prompt.md +11 -6
- package/src/orchestrator/prompts/implement-feature.prompt.md +9 -4
- package/src/orchestrator/prompts/quick-refinement.prompt.md +9 -5
- package/src/orchestrator/prompts/resolve-pr-comments.prompt.md +18 -4
- package/src/orchestrator/skills/fast-review/SKILL.md +15 -4
- package/src/orchestrator/skills/validation-gates/SKILL.md +152 -15
package/README.md
CHANGED
|
@@ -85,11 +85,11 @@ npx opencastle@1.2.3 update # pin to a specific version
|
|
|
85
85
|
|
|
86
86
|
## What's Inside
|
|
87
87
|
|
|
88
|
-
**
|
|
88
|
+
**Specialist Agents.** Developer, UI/UX, Database, Security, Testing, Reviewer, and more.
|
|
89
89
|
|
|
90
|
-
**
|
|
90
|
+
**On-Demand Skills.** Loaded on demand to keep context windows lean. Auto-selected during init based on your stack.
|
|
91
91
|
|
|
92
|
-
**
|
|
92
|
+
**Workflow Templates.** Features, bug fixes, data pipelines, security audits — reproducible execution templates.
|
|
93
93
|
|
|
94
94
|
**Quality gates.** Fast review after every step. Panel majority vote for high-stakes changes. Lint, test, build checks.
|
|
95
95
|
|
package/package.json
CHANGED
package/src/dashboard/node_modules/.vite/deps/_metadata.json
CHANGED
|
@@ -1,25 +1,25 @@
|
|
|
1
1
|
{
|
|
2
|
-
"hash": "
|
|
2
|
+
"hash": "ef84f2a1",
|
|
3
3
|
"configHash": "30f8ea04",
|
|
4
|
-
"lockfileHash": "
|
|
5
|
-
"browserHash": "
|
|
4
|
+
"lockfileHash": "fc571d9a",
|
|
5
|
+
"browserHash": "822a1467",
|
|
6
6
|
"optimized": {
|
|
7
7
|
"astro > cssesc": {
|
|
8
8
|
"src": "../../../../../node_modules/cssesc/cssesc.js",
|
|
9
9
|
"file": "astro___cssesc.js",
|
|
10
|
-
"fileHash": "
|
|
10
|
+
"fileHash": "d234c1d0",
|
|
11
11
|
"needsInterop": true
|
|
12
12
|
},
|
|
13
13
|
"astro > aria-query": {
|
|
14
14
|
"src": "../../../../../node_modules/aria-query/lib/index.js",
|
|
15
15
|
"file": "astro___aria-query.js",
|
|
16
|
-
"fileHash": "
|
|
16
|
+
"fileHash": "bcc57b41",
|
|
17
17
|
"needsInterop": true
|
|
18
18
|
},
|
|
19
19
|
"astro > axobject-query": {
|
|
20
20
|
"src": "../../../../../node_modules/axobject-query/lib/index.js",
|
|
21
21
|
"file": "astro___axobject-query.js",
|
|
22
|
-
"fileHash": "
|
|
22
|
+
"fileHash": "196f4f3c",
|
|
23
23
|
"needsInterop": true
|
|
24
24
|
}
|
|
25
25
|
},
|
package/src/orchestrator/agent-workflows/feature-implementation.md
CHANGED
|
@@ -204,15 +204,22 @@ If there are no open questions, explicitly state: "No open questions — plan is
|
|
|
204
204
|
3. Verify no files outside partitions were modified
|
|
205
205
|
4. Check all tracker issue acceptance criteria
|
|
206
206
|
5. Run panel review if high-stakes (security, DB, architecture)
|
|
207
|
-
6.
|
|
208
|
-
|
|
209
|
-
|
|
207
|
+
6. **Final Smoke Test (Gate 10)** — verify the complete feature end-to-end:
|
|
208
|
+
- Full clean build of all affected projects (not incremental)
|
|
209
|
+
- End-to-end browser walkthrough of the complete user flow
|
|
210
|
+
- Verify all states: loading, empty, populated, error, partial
|
|
211
|
+
- Cross-task integration check (e.g., migration + component + page compose correctly)
|
|
212
|
+
- Final responsive sweep at all breakpoints (if UI changes)
|
|
213
|
+
7. Move all issues to Done
|
|
214
|
+
8. Update session checkpoint → delete checkpoint
|
|
215
|
+
9. Update `.github/customizations/project/roadmap.md`
|
|
210
216
|
|
|
211
217
|
### Exit Criteria
|
|
212
218
|
|
|
213
219
|
- [ ] All phases verified
|
|
214
220
|
- [ ] All tracker issues Done
|
|
215
221
|
- [ ] Full build passes
|
|
222
|
+
- [ ] **Final smoke test passed** — complete user flow verified end-to-end
|
|
216
223
|
- [ ] Roadmap updated
|
|
217
224
|
- [ ] Delivery Outcome completed (see `general.instructions.md`) — branch pushed, PR opened (not merged), tracker linked
|
|
218
225
|
|
package/src/orchestrator/prompts/bug-fix.prompt.md
CHANGED
|
@@ -107,12 +107,17 @@ Delegate to the appropriate specialist agent via **sub-agent** (inline). For bug
|
|
|
107
107
|
|
|
108
108
|
> Load the **validation-gates** skill for detailed steps on each gate.
|
|
109
109
|
|
|
110
|
-
Every bug fix must pass ALL
|
|
111
|
-
|
|
112
|
-
1. **
|
|
113
|
-
2. **
|
|
114
|
-
3. **
|
|
115
|
-
4. **
|
|
110
|
+
Every bug fix must pass ALL applicable gates:
|
|
111
|
+
|
|
112
|
+
1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
|
|
113
|
+
2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
|
|
114
|
+
3. **Gate 3: Blast Radius Check** — verify the fix is minimal and scoped (bug fixes should be ≤100 lines, ≤3 files; escalate if larger)
|
|
115
|
+
4. **Gate 4: Dependency Audit** (when `package.json` or lockfiles change) — vulnerability scan, license check, bundle size, duplicates
|
|
116
|
+
5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates the fix. No auto-PASS for sensitive files
|
|
117
|
+
6. **Gate 6: Bug-Specific Verification** (MANDATORY) — start dev server, reproduce original bug (should be gone), verify correct behavior, test edge cases, screenshot before/after, check both apps if shared code
|
|
118
|
+
7. **Gate 7: Browser Testing** (for UI-related bugs) — clear cache, start server, verify fix + responsive + screenshots
|
|
119
|
+
8. **Gate 8: Regression Testing** — run tests for all projects consuming modified files, browser-test adjacent functionality
|
|
120
|
+
9. **Gate 9: Panel Review** (only if needed) — use **panel-majority-vote** skill if fix touches auth/authorization, RLS, security headers/CSP, or sensitive data
|
|
116
121
|
|
|
117
122
|
### 6. Delivery
|
|
118
123
|
|
package/src/orchestrator/prompts/implement-feature.prompt.md
CHANGED
|
@@ -79,10 +79,15 @@ Include the self-improvement reminder in every delegation prompt (see `general.i
|
|
|
79
79
|
|
|
80
80
|
Every subtask must pass ALL gates before being marked Done:
|
|
81
81
|
|
|
82
|
-
1. **Gate 1:
|
|
83
|
-
2. **Gate 2:
|
|
84
|
-
3. **Gate 3:
|
|
85
|
-
4. **Gate 4:
|
|
82
|
+
1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
|
|
83
|
+
2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
|
|
84
|
+
3. **Gate 3: Blast Radius Check** — verify scope is expected (≤200 lines, ≤5 files normal; escalate if >500 lines or >10 files)
|
|
85
|
+
4. **Gate 4: Dependency Audit** (when `package.json` changes) — vulnerability scan, license check, bundle size, duplicates
|
|
86
|
+
5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates every delegation output. No auto-PASS for sensitive files
|
|
87
|
+
6. **Gate 6: Browser Testing** (MANDATORY for UI changes) — clear cache, start server, verify features + responsive + screenshots
|
|
88
|
+
7. **Gate 7: Regression Testing** — full test suite for affected projects, browser-test adjacent pages if shared components changed
|
|
89
|
+
8. **Gate 8: Panel Review** (for high-stakes changes) — use **panel-majority-vote** skill for security, DB migrations, architecture
|
|
90
|
+
9. **Gate 9: Final Smoke Test** — after all tasks Done, verify the complete feature end-to-end as a cohesive unit
|
|
86
91
|
|
|
87
92
|
### 5. Delivery
|
|
88
93
|
|
package/src/orchestrator/prompts/quick-refinement.prompt.md
CHANGED
|
@@ -98,11 +98,15 @@ Delegate to the appropriate specialist agent(s). Since follow-ups are scoped and
|
|
|
98
98
|
|
|
99
99
|
> Load the **validation-gates** skill for detailed steps on each gate.
|
|
100
100
|
|
|
101
|
-
Every follow-up, no matter how small, must pass these
|
|
102
|
-
|
|
103
|
-
1. **
|
|
104
|
-
2. **
|
|
105
|
-
3. **
|
|
101
|
+
Every follow-up, no matter how small, must pass these gates:
|
|
102
|
+
|
|
103
|
+
1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
|
|
104
|
+
2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
|
|
105
|
+
3. **Gate 3: Blast Radius Check** — verify scope matches the "small follow-up" expectation (≤100 lines, ≤3 files normal; if larger, escalate to `implement-feature` workflow)
|
|
106
|
+
4. **Gate 4: Dependency Audit** (when `package.json` or lockfiles change) — vulnerability scan, license check, bundle size, duplicates
|
|
107
|
+
5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates the change. No auto-PASS for sensitive files
|
|
108
|
+
6. **Gate 6: Browser Testing** (MANDATORY for any visual change) — clear cache, start server, verify scenario + responsive + screenshot evidence
|
|
109
|
+
7. **Gate 7: Regression Testing** — if shared component/library modified, run tests for all consuming projects and browser-test at least one page per affected app
|
|
106
110
|
|
|
107
111
|
### 6. Delivery
|
|
108
112
|
|
package/src/orchestrator/prompts/resolve-pr-comments.prompt.md
CHANGED
|
@@ -57,10 +57,21 @@ Acceptance criteria:
|
|
|
57
57
|
|
|
58
58
|
### Phase 4: Verify & Report
|
|
59
59
|
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
60
|
+
> Load the **validation-gates** skill for detailed steps on each gate.
|
|
61
|
+
|
|
62
|
+
All fixes must pass applicable gates before pushing:
|
|
63
|
+
|
|
64
|
+
1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
|
|
65
|
+
2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
|
|
66
|
+
3. **Gate 3: Blast Radius Check** — verify changes are scoped to commented files only; flag any files modified outside the comment scope
|
|
67
|
+
4. **Gate 4: Fast Review** (MANDATORY) — single reviewer sub-agent validates the combined PR comment fixes
|
|
68
|
+
5. **Gate 5: Regression Testing** — run tests for all projects consuming modified files
|
|
69
|
+
|
|
70
|
+
After all gates pass:
|
|
71
|
+
|
|
72
|
+
6. **Commit fixes** — Use descriptive commit messages referencing the PR: `TAS-XX: Address PR review — [summary]`
|
|
73
|
+
7. **Push to the same branch** — The PR updates automatically
|
|
74
|
+
8. **Report back** — Provide a structured summary of what was resolved
|
|
64
75
|
|
|
65
76
|
## Output Format
|
|
66
77
|
|
|
@@ -85,9 +96,12 @@ After resolving comments, report:
|
|
|
85
96
|
| path/to/file.ts | [question/debate] | [your take + options] |
|
|
86
97
|
|
|
87
98
|
### Verification
|
|
99
|
+
- Secret Scanning: PASS/FAIL
|
|
88
100
|
- Lint: PASS/FAIL
|
|
89
101
|
- Tests: PASS/FAIL
|
|
90
102
|
- Build: PASS/FAIL
|
|
103
|
+
- Blast Radius: PASS/ESCALATED
|
|
104
|
+
- Fast Review: PASS/FAIL
|
|
91
105
|
|
|
92
106
|
### Commits
|
|
93
107
|
- `abc1234` TAS-XX: Address PR review — [summary]
|
package/src/orchestrator/skills/fast-review/SKILL.md
CHANGED
|
@@ -123,7 +123,9 @@ CONFIDENCE: low | medium | high
|
|
|
123
123
|
**Auto-PASS conditions (skip reviewer):**
|
|
124
124
|
- The delegation was pure research/exploration with no code changes
|
|
125
125
|
- The delegation only modified documentation files (`.md`)
|
|
126
|
-
- All deterministic gates already passed AND the change is ≤10 lines across ≤2 files
|
|
126
|
+
- All deterministic gates already passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see validation-gates Gate 3 sensitive file list)
|
|
127
|
+
|
|
128
|
+
> **Sensitive file override:** Changes to auth/middleware files, database migrations, RLS policies, security headers, CSP configuration, environment variable schemas, or CI/CD configuration **always** require a reviewer — even for 1-line changes. Auto-PASS never applies to these files.
|
|
127
129
|
|
|
128
130
|
### Step 4: Handle Verdict
|
|
129
131
|
|
|
@@ -247,14 +249,23 @@ Fast review sits between the agent's output and the Team Lead's acceptance:
|
|
|
247
249
|
Agent completes work
|
|
248
250
|
│
|
|
249
251
|
▼
|
|
250
|
-
|
|
252
|
+
Secret Scanning ← validation-gates Gate 1
|
|
253
|
+
│
|
|
254
|
+
▼
|
|
255
|
+
Deterministic checks (lint, test, build) ← validation-gates Gate 2
|
|
256
|
+
│
|
|
257
|
+
▼
|
|
258
|
+
Blast Radius Check ← validation-gates Gate 3
|
|
259
|
+
│
|
|
260
|
+
▼
|
|
261
|
+
Dependency Audit (if packages changed) ← validation-gates Gate 4
|
|
251
262
|
│
|
|
252
263
|
▼
|
|
253
|
-
Fast Review (this skill) ← validation-gates Gate
|
|
264
|
+
Fast Review (this skill) ← validation-gates Gate 5
|
|
254
265
|
│
|
|
255
266
|
├── PASS → Accept, move to next task
|
|
256
267
|
├── FAIL → Retry loop (up to 2x)
|
|
257
|
-
└── 3x FAIL → Escalate to Panel (Gate
|
|
268
|
+
└── 3x FAIL → Escalate to Panel (Gate 9)
|
|
258
269
|
```
|
|
259
270
|
|
|
260
271
|
### Relationship to on-post-delegate Hook
|
package/src/orchestrator/skills/validation-gates/SKILL.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: validation-gates
|
|
3
|
-
description: "Shared validation gates for all orchestration workflows — deterministic checks, browser testing, cache management, regression checks. Referenced by prompt templates to maintain single source of truth."
|
|
3
|
+
description: "Shared validation gates for all orchestration workflows — secret scanning, deterministic checks, blast radius analysis, dependency auditing, browser testing, cache management, regression checks, and final smoke tests. Referenced by prompt templates to maintain single source of truth."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
<!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .github/customizations/ directory instead. -->
|
|
@@ -9,7 +9,57 @@ description: "Shared validation gates for all orchestration workflows — determ
|
|
|
9
9
|
|
|
10
10
|
Canonical reference for validation gates shared across all orchestration workflows. Prompt templates reference this skill to avoid duplication.
|
|
11
11
|
|
|
12
|
-
|
|
12
|
+
**Gate summary:**
|
|
13
|
+
|
|
14
|
+
| Gate | Name | Runs When |
|
|
15
|
+
|------|------|-----------|
|
|
16
|
+
| 1 | Secret Scanning | Every delegation |
|
|
17
|
+
| 2 | Deterministic Checks | Every delegation |
|
|
18
|
+
| 3 | Blast Radius Check | Every delegation |
|
|
19
|
+
| 4 | Dependency Audit | When `package.json` or lockfiles change |
|
|
20
|
+
| 5 | Fast Review | Every delegation (with auto-PASS exceptions) |
|
|
21
|
+
| 6 | Cache Clearing | Before browser testing |
|
|
22
|
+
| 7 | Browser Testing | UI changes |
|
|
23
|
+
| 8 | Regression Testing | Every delegation |
|
|
24
|
+
| 9 | Panel Review | High-stakes changes only |
|
|
25
|
+
| 10 | Final Smoke Test | Feature completion (after all tasks Done) |
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## Gate 1: Secret Scanning
|
|
30
|
+
|
|
31
|
+
> **HARD GATE — Constitution rule #1.** No tokens, keys, passwords, or connection strings in code, logs, commits, or terminal output.
|
|
32
|
+
|
|
33
|
+
Scan every diff **before** any other gate. A secret leak caught after merge is exponentially more expensive than one caught at review time.
|
|
34
|
+
|
|
35
|
+
### What to scan
|
|
36
|
+
|
|
37
|
+
Run a regex scan of all changed files for patterns that match common secret formats:
|
|
38
|
+
|
|
39
|
+
```bash
|
|
40
|
+
# Scan staged/changed files for common secret patterns
|
|
41
|
+
grep -rn -E '(AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{20,}|ghp_[a-zA-Z0-9]{36}|glpat-[a-zA-Z0-9\-]{20}|xox[bpors]-[a-zA-Z0-9\-]+|eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}|-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----|mongodb(\+srv)?://[^\s]+|postgres(ql)?://[^\s]+|mysql://[^\s]+|redis://[^\s]+)' <changed-files>
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Also check for:
|
|
45
|
+
- Hardcoded `password`, `secret`, `api_key`, `apiKey`, `token` assignments (not just references)
|
|
46
|
+
- `.env` file contents copied into source files
|
|
47
|
+
- Base64-encoded secrets (common obfuscation attempt)
|
|
48
|
+
|
|
49
|
+
### On detection
|
|
50
|
+
|
|
51
|
+
- **BLOCK immediately** — do not proceed to Gate 2
|
|
52
|
+
- Flag the specific file and line number
|
|
53
|
+
- Re-delegate to the agent with explicit instruction to use environment variables instead
|
|
54
|
+
- If a secret was already committed, **rotate it immediately** — git history is permanent
|
|
55
|
+
|
|
56
|
+
### Exceptions
|
|
57
|
+
|
|
58
|
+
- Test fixtures with obviously fake values (e.g., `sk-test-1234567890`)
|
|
59
|
+
- Documentation examples with placeholder values (e.g., `YOUR_API_KEY_HERE`)
|
|
60
|
+
- Pattern matches inside comments that are clearly explanatory
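The scan-and-block behaviour described for Gate 1 could be wired up roughly as follows. This is only a sketch: the function name `scan_for_secrets`, the shortened pattern (three of the formats from the grep command in this gate), and the convention that a non-zero exit status means BLOCK are all illustrative assumptions, not part of OpenCastle.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with OpenCastle): scan the given files
# and return non-zero if any line looks like a secret.
scan_for_secrets() {
  # Shortened, illustrative pattern: AWS access key ID, GitHub token,
  # and PEM private key header. Extend with the full Gate 1 pattern as needed.
  local pattern='(AKIA[0-9A-Z]{16}|ghp_[a-zA-Z0-9]{36}|-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----)'

  # Nothing to scan means nothing to block.
  (( $# > 0 )) || return 0

  # -n prints line numbers and -H prints file names, so every hit
  # identifies the specific file and line, as the gate requires.
  if grep -nHE -e "$pattern" -- "$@"; then
    echo "Gate 1: BLOCK (possible secret found, do not proceed to Gate 2)" >&2
    return 1
  fi
  return 0
}

# Possible usage against staged files (requires bash 4+ for mapfile):
#   mapfile -t changed < <(git diff --cached --name-only --diff-filter=ACMR)
#   scan_for_secrets "${changed[@]}" || exit 1
```

Hits still need a human or reviewer pass against the exceptions listed above (fake fixture values, placeholders, explanatory comments), since a plain regex cannot tell those apart from real credentials.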
|
|
61
|
+
|
|
62
|
+
## Gate 2: Deterministic Checks
|
|
13
63
|
|
|
14
64
|
Run for every affected project (resolve exact commands via the **codebase-tool** skill):
|
|
15
65
|
|
|
@@ -19,31 +69,84 @@ Run for every affected project (resolve exact commands via the **codebase-tool**
|
|
|
19
69
|
|
|
20
70
|
All must pass with zero errors. Run for **every** project that consumed modified files, not just the primary project.
|
|
21
71
|
|
|
22
|
-
## Gate
|
|
72
|
+
## Gate 3: Blast Radius Check
|
|
73
|
+
|
|
74
|
+
Assess the scope of changes to catch scope creep and ensure reviewers can evaluate the diff effectively.
|
|
75
|
+
|
|
76
|
+
### Thresholds
|
|
77
|
+
|
|
78
|
+
| Metric | Normal | Warning | Escalate |
|
|
79
|
+
|--------|--------|---------|----------|
|
|
80
|
+
| Lines changed | ≤200 | 201–500 | >500 |
|
|
81
|
+
| Files changed | ≤5 | 6–10 | >10 |
|
|
82
|
+
| Projects affected | ≤1 | 2 | >2 |
|
|
83
|
+
|
|
84
|
+
### Actions
|
|
85
|
+
|
|
86
|
+
- **Normal** — proceed to Gate 4
|
|
87
|
+
- **Warning** — log a note in the delegation record. Ask: *"Was this scope expected?"* If yes, proceed. If unexpected, investigate whether the agent drifted from the partition
|
|
88
|
+
- **Escalate** — **STOP.** The Team Lead must review the diff before proceeding:
|
|
89
|
+
1. Verify all changed files are within the agent's assigned partition
|
|
90
|
+
2. Check whether the task should have been split into smaller subtasks
|
|
91
|
+
3. If scope creep: revert extra changes, re-delegate with tighter scope
|
|
92
|
+
4. If legitimately large: proceed, but **always run fast review** (no auto-PASS) and consider panel review
|
|
93
|
+
|
|
94
|
+
### Sensitive files
|
|
95
|
+
|
|
96
|
+
Changes to these file categories always trigger Warning regardless of line count:
|
|
97
|
+
|
|
98
|
+
- Auth/middleware files (e.g., `middleware.ts`, `auth.ts`, `**/auth/**`)
|
|
99
|
+
- Database migrations, RLS policies
|
|
100
|
+
- Security headers, CSP configuration (`next.config.*`, `vercel.json`)
|
|
101
|
+
- Environment variable schemas (`.env.example`, `env.ts`)
|
|
102
|
+
- CI/CD configuration (`.github/workflows/**`)
|
|
103
|
+
- Package manager configs (`package.json`, lockfiles) — also triggers Gate 4
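As a rough illustration of how the Gate 3 thresholds and the sensitive-file rule combine, the sketch below classifies a change as `normal`, `warning`, or `escalate`. The function name `classify_blast_radius` and its argument order are hypothetical, and the `migrations/` path is an assumed location, since the skill text names the category but gives no pattern for it (RLS policies are left out for the same reason).

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a change against the Gate 3 thresholds.
# Usage: classify_blast_radius <lines> <files> <projects> [changed-file ...]
classify_blast_radius() {
  local lines=$1 files=$2 projects=$3
  shift 3
  local level=normal f

  # Warning band: 201-500 lines, 6-10 files, or 2 projects.
  if (( lines > 200 || files > 5 || projects > 1 )); then
    level=warning
  fi
  # Escalate band: more than 500 lines, 10 files, or 2 projects.
  if (( lines > 500 || files > 10 || projects > 2 )); then
    level=escalate
  fi

  # Sensitive files raise a Normal result to Warning regardless of size.
  if [[ $level == normal ]]; then
    for f in "$@"; do
      case "$f" in
        middleware.ts|*/middleware.ts|auth.ts|*/auth.ts|auth/*|*/auth/*) level=warning ;;
        next.config.*|*/next.config.*|vercel.json|*/vercel.json) level=warning ;;
        .env.example|*/.env.example|env.ts|*/env.ts) level=warning ;;
        .github/workflows/*) level=warning ;;
        package.json|*/package.json|package-lock.json|*/package-lock.json) level=warning ;;
        yarn.lock|*/yarn.lock|pnpm-lock.yaml|*/pnpm-lock.yaml) level=warning ;;
        migrations/*|*/migrations/*) level=warning ;;  # assumed path, not from the skill
      esac
    done
  fi

  printf '%s\n' "$level"
}
```

For example, `classify_blast_radius 12 1 1 src/auth/session.ts` reports `warning` even though the change is tiny, while `classify_blast_radius 12 1 1 src/page.tsx` reports `normal`.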
|
|
104
|
+
|
|
105
|
+
## Gate 4: Dependency Audit
|
|
106
|
+
|
|
107
|
+
> Runs only when `package.json`, `yarn.lock`, `package-lock.json`, `pnpm-lock.yaml`, or similar lockfiles are modified.
|
|
108
|
+
|
|
109
|
+
When agents add, remove, or update npm packages, verify:
|
|
110
|
+
|
|
111
|
+
1. **Vulnerability scan** — Run `npm audit` (or the project's equivalent). No new `high` or `critical` vulnerabilities
|
|
112
|
+
2. **License compatibility** — New packages must use MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, or ISC licenses. Flag any copyleft (GPL, LGPL, AGPL) or proprietary licenses for human review
|
|
113
|
+
3. **Bundle size impact** — For frontend packages, note the minified + gzipped size. Flag packages >50KB gzipped that have lighter alternatives
|
|
114
|
+
4. **Duplicate functionality** — Check whether the new dependency overlaps with an existing one (e.g., adding `moment` when `date-fns` is already installed)
|
|
115
|
+
5. **Maintenance health** — Flag packages with no updates in >2 years or <100 weekly downloads
|
|
116
|
+
|
|
117
|
+
### On failure
|
|
118
|
+
|
|
119
|
+
- **Vulnerability:** BLOCK. Re-delegate with instruction to use a patched version or alternative package
|
|
120
|
+
- **License concern:** Flag for human review. Do not block, but document in the PR description
|
|
121
|
+
- **Size/duplicate:** Flag as SHOULD-FIX in the fast review. Not blocking unless egregious (>200KB)
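The trigger condition for Gate 4 can be expressed as a small check over the changed file list. The sketch below is an assumption about how an orchestrator script might do this; `needs_dependency_audit` is a hypothetical name, and only the manifest and lockfile names mentioned in this gate are covered ("similar lockfiles" would need to be added per project).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if any changed file is a package
# manifest or one of the lockfiles named in Gate 4.
needs_dependency_audit() {
  local f
  for f in "$@"; do
    # ${f##*/} strips the directory part, so nested workspaces match too.
    case "${f##*/}" in
      package.json|package-lock.json|yarn.lock|pnpm-lock.yaml) return 0 ;;
    esac
  done
  return 1
}

# Possible usage for the vulnerability scan step. npm's --audit-level=high
# makes the command fail only for high or critical findings. Note that it
# reports all such findings, not only newly introduced ones.
#   if needs_dependency_audit "${changed[@]}"; then
#     npm audit --audit-level=high || echo "Gate 4: BLOCK (vulnerability)" >&2
#   fi
```

License, bundle size, duplicate, and maintenance checks are not covered by this sketch and would still need separate tooling or reviewer judgment.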
|
|
122
|
+
|
|
123
|
+
## Gate 5: Fast Review (MANDATORY)
|
|
23
124
|
|
|
24
125
|
> **HARD GATE:** Every agent delegation output must pass fast review before acceptance. This is non-negotiable — even for overnight/unattended runs. Load the **fast-review** skill for the full procedure.
|
|
25
126
|
|
|
26
|
-
After
|
|
127
|
+
After gates 1–4 pass:
|
|
27
128
|
|
|
28
129
|
1. **Spawn a single reviewer sub-agent** with the review prompt from the fast-review skill
|
|
29
130
|
2. **On PASS** — proceed to remaining gates
|
|
30
131
|
3. **On FAIL** — re-delegate to the same agent with reviewer feedback (up to 2 retries)
|
|
31
|
-
4. **On 3x FAIL** — escalate to panel review (Gate
|
|
132
|
+
4. **On 3x FAIL** — escalate to panel review (Gate 9)
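The PASS / FAIL / 3x FAIL flow above amounts to one initial review plus up to two retries. A minimal sketch, assuming the reviewer can be invoked as a command whose exit status is the verdict (the function name `fast_review_loop` and that calling convention are hypothetical, and the re-delegation step is only represented by a log line):

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a reviewer command up to three times.
# Usage: fast_review_loop <reviewer-command> [args ...]
# The reviewer command is assumed to exit 0 on PASS and non-zero on FAIL.
fast_review_loop() {
  local attempt
  for attempt in 1 2 3; do
    if "$@"; then
      echo "PASS on attempt $attempt"
      return 0
    fi
    if (( attempt < 3 )); then
      # In a real workflow this is where the task would be re-delegated
      # to the same agent with the reviewer's feedback.
      echo "FAIL on attempt $attempt, re-delegating with feedback" >&2
    fi
  done
  echo "ESCALATE: 3x FAIL, hand off to panel review (Gate 9)"
  return 1
}
```

Progress messages go to stderr so that stdout carries only the final verdict, which keeps the result easy to capture in a calling script.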
|
|
32
133
|
|
|
33
134
|
The reviewer validates: acceptance criteria met, file partition respected, no regressions, type safety, error handling, security basics, and edge cases.
|
|
34
135
|
|
|
35
136
|
**Auto-PASS conditions** (skip the reviewer sub-agent):
|
|
36
137
|
- Pure research/exploration with no code changes
|
|
37
138
|
- Only `.md` files were modified
|
|
38
|
-
- All deterministic gates passed AND the change is ≤10 lines across ≤2 files
|
|
139
|
+
- All deterministic gates passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see Gate 3 sensitive file list)
|
|
39
140
|
|
|
40
|
-
|
|
141
|
+
> **Sensitive file override:** If any changed file falls into the sensitive file categories listed in Gate 3 (auth, migrations, security headers, env schemas, CI/CD), auto-PASS is **never** applied — even for 1-line changes. These files always get a human-quality review.
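The auto-PASS conditions and the sensitive file override can be read as a single decision, sketched below. The function name `can_auto_pass` and its arguments are hypothetical; whether a change touches a sensitive file is passed in as a flag here rather than recomputed, and "pure research" is approximated as an empty changed-file list.

```bash
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if the reviewer sub-agent may be skipped.
# Usage: can_auto_pass <lines> <gates_passed:yes|no> <touches_sensitive:yes|no> [changed-file ...]
can_auto_pass() {
  local lines=$1 gates_passed=$2 sensitive=$3
  shift 3
  local f docs_only=yes

  # Sensitive file override: never auto-PASS, even for 1-line changes.
  [[ $sensitive == yes ]] && return 1

  # Pure research/exploration: no files were changed.
  (( $# == 0 )) && return 0

  # Documentation-only change: every changed file ends in .md.
  for f in "$@"; do
    [[ $f == *.md ]] || docs_only=no
  done
  [[ $docs_only == yes ]] && return 0

  # Small change: deterministic gates passed, at most 10 lines in at most 2 files.
  [[ $gates_passed == yes ]] && (( lines <= 10 && $# <= 2 ))
}
```

Checking the override first means a later condition can never accidentally re-enable auto-PASS for a sensitive file.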
|
|
142
|
+
|
|
143
|
+
## Gate 6: Cache Clearing (BEFORE Browser Testing)
|
|
41
144
|
|
|
42
145
|
**Always clear before testing.** Testing stale code wastes time and produces false results.
|
|
43
146
|
|
|
44
147
|
Clear framework caches and task runner caches before starting the dev server for browser testing. See the **codebase-tool** skill for cache-clearing commands.
|
|
45
148
|
|
|
46
|
-
## Gate
|
|
149
|
+
## Gate 7: Browser Testing (MANDATORY for UI Changes)
|
|
47
150
|
|
|
48
151
|
> **HARD GATE:** A task with UI changes is NOT done until you have screenshots in Chrome proving the feature works. "The code looks correct" is not proof. "Tests pass" is not proof. Only a screenshot of the working UI in Chrome is proof.
|
|
49
152
|
|
|
@@ -59,7 +162,7 @@ Clear framework caches and task runner caches before starting the dev server for
|
|
|
59
162
|
|
|
60
163
|
Load the **browser-testing** skill for Chrome MCP commands, breakpoint details, and reporting format.
|
|
61
164
|
|
|
62
|
-
## Gate
|
|
165
|
+
## Gate 8: Regression Testing
|
|
63
166
|
|
|
64
167
|
New features must not break existing functionality:
|
|
65
168
|
|
|
@@ -68,7 +171,7 @@ New features must not break existing functionality:
|
|
|
68
171
|
3. **Verify navigation** — Ensure routing, links, and back-button behavior still work
|
|
69
172
|
4. **Check shared components** — If a component from a shared library was modified, test it in all apps that consume it
|
|
70
173
|
|
|
71
|
-
## Gate
|
|
174
|
+
## Gate 9: Panel Review (High-Stakes Only)
|
|
72
175
|
|
|
73
176
|
Use the **panel-majority-vote** skill for:
|
|
74
177
|
|
|
@@ -79,16 +182,50 @@ Use the **panel-majority-vote** skill for:
|
|
|
79
182
|
|
|
80
183
|
If the panel returns BLOCK, extract MUST-FIX items, re-delegate to the same agent, and re-run the panel. Never skip, never halt. Max 3 attempts, then escalate to Architect.
|
|
81
184
|
|
|
185
|
+
## Gate 10: Final Smoke Test (Feature-Level)
|
|
186
|
+
|
|
187
|
+
> Runs once after ALL tasks in a feature are Done — not per-task.
|
|
188
|
+
|
|
189
|
+
Individual tasks pass gates 1–9 independently. But the combined result may have integration issues that per-task testing misses. This gate verifies the feature as a cohesive unit.
|
|
190
|
+
|
|
191
|
+
### Steps
|
|
192
|
+
|
|
193
|
+
1. **Full build** — Build all affected projects from clean state (not incremental)
|
|
194
|
+
2. **Full test suite** — Run tests across all projects that consumed any changed files
|
|
195
|
+
3. **End-to-end browser walkthrough** — Navigate the complete user flow from start to finish:
|
|
196
|
+
- Verify all states: loading, empty, populated, error, partial
|
|
197
|
+
- Test every state transition end-to-end (not just individual screens)
|
|
198
|
+
- Confirm data flows correctly between pages/components
|
|
199
|
+
- Test the happy path AND at least one error path
|
|
200
|
+
4. **Cross-task integration check** — Verify that outputs from different tasks (e.g., DB migration + component + page) compose correctly
|
|
201
|
+
5. **Smoke test at all breakpoints** — If the feature has UI, one final responsive sweep
|
|
202
|
+
|
|
203
|
+
### When to skip
|
|
204
|
+
|
|
205
|
+
- Non-UI features with comprehensive test coverage (e.g., pure backend/data pipeline work where tests verify integration)
|
|
206
|
+
- Single-task features (Gate 8 already covers regression)
|
|
207
|
+
|
|
208
|
+
### On failure
|
|
209
|
+
|
|
210
|
+
Re-delegate the specific failing integration point to the agent responsible for that layer. Do NOT re-run the entire feature implementation.
|
|
211
|
+
|
|
212
|
+
---
|
|
213
|
+
|
|
82
214
|
## Universal Completion Checklist
|
|
83
215
|
|
|
84
216
|
Use this checklist for any orchestration workflow:
|
|
85
217
|
|
|
86
|
-
- [ ]
|
|
87
|
-
- [ ]
|
|
88
|
-
- [ ]
|
|
89
|
-
- [ ]
|
|
218
|
+
- [ ] **No secrets in diff** (Gate 1)
|
|
219
|
+
- [ ] Lint, test, and build pass for all affected projects (Gate 2)
|
|
220
|
+
- [ ] Blast radius assessed — scope is expected (Gate 3)
|
|
221
|
+
- [ ] Dependency audit passed if packages changed (Gate 4)
|
|
222
|
+
- [ ] **Fast review passed** (mandatory — load **fast-review** skill) (Gate 5)
|
|
223
|
+
- [ ] Dev server started with **clean cache** (Gate 6)
|
|
224
|
+
- [ ] UI changes verified in Chrome with screenshots at all breakpoints (Gate 7)
|
|
90
225
|
- [ ] Every acceptance criteria item visually confirmed — not just "page loads"
|
|
91
|
-
- [ ] No regressions in adjacent functionality
|
|
226
|
+
- [ ] No regressions in adjacent functionality (Gate 8)
|
|
227
|
+
- [ ] Panel review passed for high-stakes changes (Gate 9)
|
|
228
|
+
- [ ] **Final smoke test passed** for multi-task features (Gate 10)
|
|
92
229
|
- [ ] Shared code changes tested across all consuming apps
|
|
93
230
|
- [ ] No duplicated code — shared logic extracted to libraries
|
|
94
231
|
- [ ] Lessons learned captured if any retries occurred
|