opencastle 0.7.0 → 0.8.0

package/README.md CHANGED
@@ -85,11 +85,11 @@ npx opencastle@1.2.3 update # pin to a specific version
85
85
 
86
86
  ## What's Inside
87
87
 
88
- **18 agents.** Developer, UI/UX, Database, Security, Testing, Reviewer, and more.
88
+ **Specialist Agents.** Developer, UI/UX, Database, Security, Testing, Reviewer, and more.
89
89
 
90
- **34 skills.** Loaded on demand to keep context windows lean. Auto-selected during init based on your stack.
90
+ **On-Demand Skills.** Loaded on demand to keep context windows lean. Auto-selected during init based on your stack.
91
91
 
92
- **8 workflows.** Features, bug fixes, data pipelines, security audits — reproducible execution templates.
92
+ **Workflow Templates.** Features, bug fixes, data pipelines, security audits — reproducible execution templates.
93
93
 
94
94
  **Quality gates.** Fast review after every step. Panel majority vote for high-stakes changes. Lint, test, build checks.
95
95
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "opencastle",
3
- "version": "0.7.0",
3
+ "version": "0.8.0",
4
4
  "type": "module",
5
5
  "description": "Multi-agent orchestration framework for AI coding assistants",
6
6
  "bin": {
@@ -1,25 +1,25 @@
1
1
  {
2
- "hash": "34cc5b0e",
2
+ "hash": "ef84f2a1",
3
3
  "configHash": "30f8ea04",
4
- "lockfileHash": "51ed83e7",
5
- "browserHash": "7f7adab5",
4
+ "lockfileHash": "fc571d9a",
5
+ "browserHash": "822a1467",
6
6
  "optimized": {
7
7
  "astro > cssesc": {
8
8
  "src": "../../../../../node_modules/cssesc/cssesc.js",
9
9
  "file": "astro___cssesc.js",
10
- "fileHash": "00126710",
10
+ "fileHash": "d234c1d0",
11
11
  "needsInterop": true
12
12
  },
13
13
  "astro > aria-query": {
14
14
  "src": "../../../../../node_modules/aria-query/lib/index.js",
15
15
  "file": "astro___aria-query.js",
16
- "fileHash": "8c2d97b4",
16
+ "fileHash": "bcc57b41",
17
17
  "needsInterop": true
18
18
  },
19
19
  "astro > axobject-query": {
20
20
  "src": "../../../../../node_modules/axobject-query/lib/index.js",
21
21
  "file": "astro___axobject-query.js",
22
- "fileHash": "229c73d4",
22
+ "fileHash": "196f4f3c",
23
23
  "needsInterop": true
24
24
  }
25
25
  },
@@ -204,15 +204,22 @@ If there are no open questions, explicitly state: "No open questions — plan is
204
204
  3. Verify no files outside partitions were modified
205
205
  4. Check all tracker issue acceptance criteria
206
206
  5. Run panel review if high-stakes (security, DB, architecture)
207
- 6. Move all issues to Done
208
- 7. Update session checkpoint → delete checkpoint
209
- 8. Update `.github/customizations/project/roadmap.md`
207
+ 6. **Final Smoke Test (Gate 10)** — verify the complete feature end-to-end:
208
+ - Full clean build of all affected projects (not incremental)
209
+ - End-to-end browser walkthrough of the complete user flow
210
+ - Verify all states: loading, empty, populated, error, partial
211
+ - Cross-task integration check (e.g., migration + component + page compose correctly)
212
+ - Final responsive sweep at all breakpoints (if UI changes)
213
+ 7. Move all issues to Done
214
+ 8. Update session checkpoint → delete checkpoint
215
+ 9. Update `.github/customizations/project/roadmap.md`
210
216
 
211
217
  ### Exit Criteria
212
218
 
213
219
  - [ ] All phases verified
214
220
  - [ ] All tracker issues Done
215
221
  - [ ] Full build passes
222
+ - [ ] **Final smoke test passed** — complete user flow verified end-to-end
216
223
  - [ ] Roadmap updated
217
224
  - [ ] Delivery Outcome completed (see `general.instructions.md`) — branch pushed, PR opened (not merged), tracker linked
218
225
 
@@ -107,12 +107,17 @@ Delegate to the appropriate specialist agent via **sub-agent** (inline). For bug
107
107
 
108
108
  > Load the **validation-gates** skill for detailed steps on each gate.
109
109
 
110
- Every bug fix must pass ALL of these checks:
111
-
112
- 1. **Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
113
- 2. **Bug-Specific Verification** (mandatory) — start dev server, reproduce original bug (should be gone), verify correct behavior, test edge cases, screenshot before/after, check both apps if shared code
114
- 3. **Regression Check** — run tests for all projects consuming modified files, browser-test adjacent functionality
115
- 4. **Panel Review** (only if needed) — use **panel-majority-vote** skill if fix touches auth/authorization, RLS, security headers/CSP, or sensitive data
110
+ Every bug fix must pass ALL applicable gates:
111
+
112
+ 1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
113
+ 2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
114
+ 3. **Gate 3: Blast Radius Check** — verify the fix is minimal and scoped (bug fixes should be ≤100 lines, ≤3 files; escalate if larger)
115
+ 4. **Gate 4: Dependency Audit** (when `package.json` or lockfiles change) — vulnerability scan, license check, bundle size, duplicates
116
+ 5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates the fix. No auto-PASS for sensitive files
117
+ 6. **Gate 6: Bug-Specific Verification** (MANDATORY) — start dev server, reproduce original bug (should be gone), verify correct behavior, test edge cases, screenshot before/after, check both apps if shared code
118
+ 7. **Gate 7: Browser Testing** (for UI-related bugs) — clear cache, start server, verify fix + responsive + screenshots
119
+ 8. **Gate 8: Regression Testing** — run tests for all projects consuming modified files, browser-test adjacent functionality
120
+ 9. **Gate 9: Panel Review** (only if needed) — use **panel-majority-vote** skill if fix touches auth/authorization, RLS, security headers/CSP, or sensitive data
116
121
 
117
122
  ### 6. Delivery
118
123
 
@@ -79,10 +79,15 @@ Include the self-improvement reminder in every delegation prompt (see `general.i
79
79
 
80
80
  Every subtask must pass ALL gates before being marked Done:
81
81
 
82
- 1. **Gate 1: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
83
- 2. **Gate 2: Browser Testing** (MANDATORY for UI changes) — clear cache, start server, verify features + responsive + screenshots
84
- 3. **Gate 3: Regression Testing** — full test suite for affected projects, browser-test adjacent pages if shared components changed
85
- 4. **Gate 4: Panel Review** (for high-stakes changes) — use **panel-majority-vote** skill for security, DB migrations, architecture
82
+ 1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
83
+ 2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
84
+ 3. **Gate 3: Blast Radius Check** — verify scope is expected (≤200 lines, ≤5 files normal; escalate if >500 lines or >10 files)
85
+ 4. **Gate 4: Dependency Audit** (when `package.json` changes) — vulnerability scan, license check, bundle size, duplicates
86
+ 5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates every delegation output. No auto-PASS for sensitive files
87
+ 6. **Gate 6: Browser Testing** (MANDATORY for UI changes) — clear cache, start server, verify features + responsive + screenshots
88
+ 7. **Gate 7: Regression Testing** — full test suite for affected projects, browser-test adjacent pages if shared components changed
89
+ 8. **Gate 8: Panel Review** (for high-stakes changes) — use **panel-majority-vote** skill for security, DB migrations, architecture
90
+ 9. **Gate 9: Final Smoke Test** — after all tasks Done, verify the complete feature end-to-end as a cohesive unit
86
91
 
87
92
  ### 5. Delivery
88
93
 
@@ -98,11 +98,15 @@ Delegate to the appropriate specialist agent(s). Since follow-ups are scoped and
98
98
 
99
99
  > Load the **validation-gates** skill for detailed steps on each gate.
100
100
 
101
- Every follow-up, no matter how small, must pass these checks:
102
-
103
- 1. **Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
104
- 2. **Browser Testing** (MANDATORY for any visual change) — clear cache, start server, verify scenario + responsive + screenshot evidence
105
- 3. **Regression Check** — if shared component/library modified, run tests for all consuming projects and browser-test at least one page per affected app
101
+ Every follow-up, no matter how small, must pass these gates:
102
+
103
+ 1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
104
+ 2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
105
+ 3. **Gate 3: Blast Radius Check** — verify scope matches the "small follow-up" expectation (≤100 lines, ≤3 files normal; if larger, escalate to `implement-feature` workflow)
106
+ 4. **Gate 4: Dependency Audit** (when `package.json` or lockfiles change) — vulnerability scan, license check, bundle size, duplicates
107
+ 5. **Gate 5: Fast Review** (MANDATORY) — single reviewer sub-agent validates the change. No auto-PASS for sensitive files
108
+ 6. **Gate 6: Browser Testing** (MANDATORY for any visual change) — clear cache, start server, verify scenario + responsive + screenshot evidence
109
+ 7. **Gate 7: Regression Testing** — if shared component/library modified, run tests for all consuming projects and browser-test at least one page per affected app
106
110
 
107
111
  ### 6. Delivery
108
112
 
@@ -57,10 +57,21 @@ Acceptance criteria:
57
57
 
58
58
  ### Phase 4: Verify & Report
59
59
 
60
- 1. **Run verification** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands)
61
- 2. **Commit fixes** — Use descriptive commit messages referencing the PR: `TAS-XX: Address PR review — [summary]`
62
- 3. **Push to the same branch** — The PR updates automatically
63
- 4. **Report back** — Provide a structured summary of what was resolved
60
+ > Load the **validation-gates** skill for detailed steps on each gate.
61
+
62
+ All fixes must pass applicable gates before pushing:
63
+
64
+ 1. **Gate 1: Secret Scanning** — scan diff for API keys, tokens, passwords, connection strings — block immediately if found
65
+ 2. **Gate 2: Deterministic Checks** — run lint, test, and build for all affected projects (see the **codebase-tool** skill for commands) — all zero errors
66
+ 3. **Gate 3: Blast Radius Check** — verify changes are scoped to commented files only; flag any files modified outside the comment scope
67
+ 4. **Gate 4: Fast Review** (MANDATORY) — single reviewer sub-agent validates the combined PR comment fixes
68
+ 5. **Gate 5: Regression Testing** — run tests for all projects consuming modified files
69
+
70
+ After all gates pass:
71
+
72
+ 6. **Commit fixes** — Use descriptive commit messages referencing the PR: `TAS-XX: Address PR review — [summary]`
73
+ 7. **Push to the same branch** — The PR updates automatically
74
+ 8. **Report back** — Provide a structured summary of what was resolved
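Reviewer note: the new Gate 3 in this workflow (changes scoped to commented files only) amounts to a set difference between the files the fix touched and the files the review comments referred to. A minimal sketch, assuming both lists have already been collected; the function name `files_outside_comment_scope` and the sample paths are hypothetical and not part of OpenCastle:

```python
def files_outside_comment_scope(changed_files: list[str],
                                commented_files: list[str]) -> list[str]:
    """Return changed files that no PR review comment referred to, sorted."""
    return sorted(set(changed_files) - set(commented_files))


# Hypothetical example data: two commented files, three changed files.
changed = ["src/api/user.ts", "src/ui/Button.tsx", "package.json"]
commented = ["src/api/user.ts", "src/ui/Button.tsx"]
print(files_outside_comment_scope(changed, commented))  # ['package.json']
```

An empty result means the fix stayed within the comment scope; anything else is what the gate asks to flag.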
64
75
 
65
76
  ## Output Format
66
77
 
@@ -85,9 +96,12 @@ After resolving comments, report:
85
96
  | path/to/file.ts | [question/debate] | [your take + options] |
86
97
 
87
98
  ### Verification
99
+ - Secret Scanning: PASS/FAIL
88
100
  - Lint: PASS/FAIL
89
101
  - Tests: PASS/FAIL
90
102
  - Build: PASS/FAIL
103
+ - Blast Radius: PASS/ESCALATED
104
+ - Fast Review: PASS/FAIL
91
105
 
92
106
  ### Commits
93
107
  - `abc1234` TAS-XX: Address PR review — [summary]
@@ -123,7 +123,9 @@ CONFIDENCE: low | medium | high
123
123
  **Auto-PASS conditions (skip reviewer):**
124
124
  - The delegation was pure research/exploration with no code changes
125
125
  - The delegation only modified documentation files (`.md`)
126
- - All deterministic gates already passed AND the change is ≤10 lines across ≤2 files
126
+ - All deterministic gates already passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see validation-gates Gate 3 sensitive file list)
127
+
128
+ > **Sensitive file override:** Changes to auth/middleware files, database migrations, RLS policies, security headers, CSP configuration, environment variable schemas, or CI/CD configuration **always** require a reviewer — even for 1-line changes. Auto-PASS never applies to these files.
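Reviewer note: the tightened auto-PASS rule, including the sensitive file override, could be expressed as a small predicate. This is only a sketch under stated assumptions: `can_auto_pass`, its parameters, and the glob patterns in `SENSITIVE_PATTERNS` are hypothetical names chosen for illustration, and the authoritative sensitive file list lives in the validation-gates skill.

```python
from fnmatch import fnmatch

# Hypothetical patterns approximating the sensitive file categories;
# the real list is maintained in the validation-gates skill.
SENSITIVE_PATTERNS = [
    "*middleware.ts", "*auth.ts", "*/auth/*", "*migrations/*",
    "next.config.*", "vercel.json", ".env.example", "*env.ts",
    ".github/workflows/*",
]


def is_sensitive(path: str) -> bool:
    """True if the path matches any assumed sensitive-file pattern."""
    return any(fnmatch(path, pattern) for pattern in SENSITIVE_PATTERNS)


def can_auto_pass(changed_files: dict[str, int], deterministic_passed: bool) -> bool:
    """changed_files maps each changed path to its number of changed lines."""
    if not changed_files:
        return True  # pure research/exploration, no code changes
    if any(is_sensitive(path) for path in changed_files):
        return False  # sensitive file override: always require a reviewer
    if all(path.endswith(".md") for path in changed_files):
        return True  # documentation-only change
    small = sum(changed_files.values()) <= 10 and len(changed_files) <= 2
    return deterministic_passed and small
```

Checking the override before the size rule is a design choice in this sketch: it guarantees that a one-line change to a sensitive file can never slip through on size alone.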
127
129
 
128
130
  ### Step 4: Handle Verdict
129
131
 
@@ -247,14 +249,23 @@ Fast review sits between the agent's output and the Team Lead's acceptance:
247
249
  Agent completes work
248
250
 
249
251
 
250
- Deterministic checks (lint, test, build) ← validation-gates Gate 1
252
+ Secret Scanning ← validation-gates Gate 1
253
+
254
+
255
+ Deterministic checks (lint, test, build) ← validation-gates Gate 2
256
+
257
+
258
+ Blast Radius Check ← validation-gates Gate 3
259
+
260
+
261
+ Dependency Audit (if packages changed) ← validation-gates Gate 4
251
262
 
252
263
 
253
- Fast Review (this skill) ← validation-gates Gate 1.5
264
+ Fast Review (this skill) ← validation-gates Gate 5
254
265
 
255
266
  ├── PASS → Accept, move to next task
256
267
  ├── FAIL → Retry loop (up to 2x)
257
- └── 3x FAIL → Escalate to Panel (Gate 5)
268
+ └── 3x FAIL → Escalate to Panel (Gate 9)
258
269
  ```
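Reviewer note: the verdict handling at the bottom of this diagram (one review, up to two retries, escalation after the third FAIL) can be sketched as a short loop. The names `run_fast_review`, `review`, and `redelegate` are hypothetical stand-ins for the reviewer sub-agent and the re-delegation step; they are assumptions for illustration, not OpenCastle APIs.

```python
from typing import Callable

MAX_RETRIES = 2  # "Retry loop (up to 2x)" from the diagram


def run_fast_review(
    review: Callable[[], bool],
    redelegate: Callable[[int], None],
) -> str:
    """Return "ACCEPT" on a PASS verdict, or "ESCALATE_TO_PANEL" after 3 FAILs.

    review() is a hypothetical callable returning True for PASS.
    redelegate(attempt) is a hypothetical callable that sends the reviewer
    feedback back to the same agent before the next attempt.
    """
    for attempt in range(1, MAX_RETRIES + 2):  # attempts 1, 2, 3
        if review():
            return "ACCEPT"
        if attempt <= MAX_RETRIES:
            redelegate(attempt)
    return "ESCALATE_TO_PANEL"
```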
259
270
 
260
271
  ### Relationship to on-post-delegate Hook
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: validation-gates
3
- description: "Shared validation gates for all orchestration workflows — deterministic checks, browser testing, cache management, regression checks. Referenced by prompt templates to maintain single source of truth."
3
+ description: "Shared validation gates for all orchestration workflows — secret scanning, deterministic checks, blast radius analysis, dependency auditing, browser testing, cache management, regression checks, and final smoke tests. Referenced by prompt templates to maintain single source of truth."
4
4
  ---
5
5
 
6
6
  <!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .github/customizations/ directory instead. -->
@@ -9,7 +9,57 @@ description: "Shared validation gates for all orchestration workflows — determ
9
9
 
10
10
  Canonical reference for validation gates shared across all orchestration workflows. Prompt templates reference this skill to avoid duplication.
11
11
 
12
- ## Gate 1: Deterministic Checks
12
+ **Gate summary:**
13
+
14
+ | Gate | Name | Runs When |
15
+ |------|------|-----------|
16
+ | 1 | Secret Scanning | Every delegation |
17
+ | 2 | Deterministic Checks | Every delegation |
18
+ | 3 | Blast Radius Check | Every delegation |
19
+ | 4 | Dependency Audit | When `package.json` or lockfiles change |
20
+ | 5 | Fast Review | Every delegation (with auto-PASS exceptions) |
21
+ | 6 | Cache Clearing | Before browser testing |
22
+ | 7 | Browser Testing | UI changes |
23
+ | 8 | Regression Testing | Every delegation |
24
+ | 9 | Panel Review | High-stakes changes only |
25
+ | 10 | Final Smoke Test | Feature completion (after all tasks Done) |
26
+
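Reviewer note: the "Runs When" column of the summary table reads naturally as a selection function. The sketch below is one possible reading, not the skill's implementation: the function name and flags are hypothetical, Gate 6 is assumed to run only when browser testing (Gate 7) runs, and the auto-PASS exceptions for Gate 5 are not modeled.

```python
def applicable_gates(*, packages_changed: bool = False, ui_changed: bool = False,
                     high_stakes: bool = False,
                     feature_complete: bool = False) -> list[int]:
    """Return the gate numbers that apply to one delegation, in run order."""
    gates = [1, 2, 3]          # secret scanning, deterministic checks, blast radius
    if packages_changed:
        gates.append(4)        # dependency audit
    gates.append(5)            # fast review
    if ui_changed:
        gates += [6, 7]        # cache clearing, then browser testing
    gates.append(8)            # regression testing
    if high_stakes:
        gates.append(9)        # panel review
    if feature_complete:
        gates.append(10)       # final smoke test
    return gates
```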
27
+ ---
28
+
29
+ ## Gate 1: Secret Scanning
30
+
31
+ > **HARD GATE — Constitution rule #1.** No tokens, keys, passwords, or connection strings in code, logs, commits, or terminal output.
32
+
33
+ Scan every diff **before** any other gate. A secret leak caught after merge is exponentially more expensive than one caught at review time.
34
+
35
+ ### What to scan
36
+
37
+ Run a regex scan of all changed files for patterns that match common secret formats:
38
+
39
+ ```bash
40
+ # Scan staged/changed files for common secret patterns
41
+ grep -rn -E '(AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{20,}|ghp_[a-zA-Z0-9]{36}|glpat-[a-zA-Z0-9\-]{20}|xox[bpors]-[a-zA-Z0-9\-]+|eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}|-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----|mongodb(\+srv)?://[^\s]+|postgres(ql)?://[^\s]+|mysql://[^\s]+|redis://[^\s]+)' <changed-files>
42
+ ```
43
+
44
+ Also check for:
45
+ - Hardcoded `password`, `secret`, `api_key`, `apiKey`, `token` assignments (not just references)
46
+ - `.env` file contents copied into source files
47
+ - Base64-encoded secrets (common obfuscation attempt)
48
+
49
+ ### On detection
50
+
51
+ - **BLOCK immediately** — do not proceed to Gate 2
52
+ - Flag the specific file and line number
53
+ - Re-delegate to the agent with explicit instruction to use environment variables instead
54
+ - If a secret was already committed, **rotate it immediately** — git history is permanent
55
+
56
+ ### Exceptions
57
+
58
+ - Test fixtures with obviously fake values (e.g., `sk-test-1234567890`)
59
+ - Documentation examples with placeholder values (e.g., `YOUR_API_KEY_HERE`)
60
+ - Pattern matches inside comments that are clearly explanatory
61
+
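Reviewer note: the grep command in this gate could also be run as a small script that reports file and line number, which is what the "On detection" steps ask for. The sketch below reuses the same alternation, ported to Python's `re` syntax, where `\s` inside a character class means whitespace (POSIX bracket expressions do not guarantee that). The function names `scan_text` and `scan_files` are hypothetical, and the sketch covers only the regex scan, not the additional checks or the exceptions listed above.

```python
import re
from pathlib import Path

# Same alternation as the grep example, split into one pattern per line.
SECRET_PATTERN = re.compile("|".join([
    r"AKIA[0-9A-Z]{16}",
    r"sk-[a-zA-Z0-9]{20,}",
    r"ghp_[a-zA-Z0-9]{36}",
    r"glpat-[a-zA-Z0-9\-]{20}",
    r"xox[bpors]-[a-zA-Z0-9\-]+",
    r"eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}",
    r"-----BEGIN (?:RSA |EC |DSA )?PRIVATE KEY-----",
    r"mongodb(?:\+srv)?://[^\s]+",
    r"postgres(?:ql)?://[^\s]+",
    r"mysql://[^\s]+",
    r"redis://[^\s]+",
]))


def scan_text(text: str, filename: str = "<text>") -> list[tuple[str, int, str]]:
    """Return (filename, line number, matched text) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            findings.append((filename, lineno, match.group(0)))
    return findings


def scan_files(paths: list[str]) -> list[tuple[str, int, str]]:
    """Scan each changed file; any finding should block before Gate 2."""
    findings = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        findings.extend(scan_text(text, path))
    return findings
```

A non-empty result from `scan_files` corresponds to the BLOCK outcome; the tuples carry the file and line number to include when re-delegating.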
62
+ ## Gate 2: Deterministic Checks
13
63
 
14
64
  Run for every affected project (resolve exact commands via the **codebase-tool** skill):
15
65
 
@@ -19,31 +69,84 @@ Run for every affected project (resolve exact commands via the **codebase-tool**
19
69
 
20
70
  All must pass with zero errors. Run for **every** project that consumed modified files, not just the primary project.
21
71
 
22
- ## Gate 1.5: Fast Review (MANDATORY)
72
+ ## Gate 3: Blast Radius Check
73
+
74
+ Assess the scope of changes to catch scope creep and ensure reviewers can evaluate the diff effectively.
75
+
76
+ ### Thresholds
77
+
78
+ | Metric | Normal | Warning | Escalate |
79
+ |--------|--------|---------|----------|
80
+ | Lines changed | ≤200 | 201–500 | >500 |
81
+ | Files changed | ≤5 | 6–10 | >10 |
82
+ | Projects affected | ≤1 | 2 | >2 |
83
+
84
+ ### Actions
85
+
86
+ - **Normal** — proceed to Gate 4
87
+ - **Warning** — log a note in the delegation record. Ask: *"Was this scope expected?"* If yes, proceed. If unexpected, investigate whether the agent drifted from the partition
88
+ - **Escalate** — **STOP.** The Team Lead must review the diff before proceeding:
89
+ 1. Verify all changed files are within the agent's assigned partition
90
+ 2. Check whether the task should have been split into smaller subtasks
91
+ 3. If scope creep: revert extra changes, re-delegate with tighter scope
92
+ 4. If legitimately large: proceed, but **always run fast review** (no auto-PASS) and consider panel review
93
+
94
+ ### Sensitive files
95
+
96
+ Changes to these file categories always trigger Warning regardless of line count:
97
+
98
+ - Auth/middleware files (e.g., `middleware.ts`, `auth.ts`, `**/auth/**`)
99
+ - Database migrations, RLS policies
100
+ - Security headers, CSP configuration (`next.config.*`, `vercel.json`)
101
+ - Environment variable schemas (`.env.example`, `env.ts`)
102
+ - CI/CD configuration (`.github/workflows/**`)
103
+ - Package manager configs (`package.json`, lockfiles) — also triggers Gate 4
104
+
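Reviewer note: the thresholds table and the sensitive-files rule in this gate can be read as a classification function. The sketch below assumes that the most severe level across the three metrics wins, which the text implies but does not state; `Level`, `THRESHOLDS`, and `classify_blast_radius` are hypothetical names.

```python
from enum import IntEnum


class Level(IntEnum):
    NORMAL = 0
    WARNING = 1
    ESCALATE = 2


# (normal upper bound, warning upper bound) per metric, from the table.
THRESHOLDS = {
    "lines": (200, 500),
    "files": (5, 10),
    "projects": (1, 2),
}


def _level(value: int, normal_max: int, warning_max: int) -> Level:
    if value <= normal_max:
        return Level.NORMAL
    if value <= warning_max:
        return Level.WARNING
    return Level.ESCALATE


def classify_blast_radius(lines: int, files: int, projects: int,
                          touches_sensitive: bool = False) -> Level:
    """Return the most severe level across all metrics (an assumption).

    touches_sensitive raises the result to at least WARNING, mirroring the
    sensitive-files rule; how sensitivity is detected is left to the caller.
    """
    values = {"lines": lines, "files": files, "projects": projects}
    worst = max(_level(values[name], *bounds) for name, bounds in THRESHOLDS.items())
    if touches_sensitive:
        worst = max(worst, Level.WARNING)
    return worst
```

For example, a 120-line change across 11 files in one project classifies as `Level.ESCALATE` because the file count alone crosses the escalation threshold.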
105
+ ## Gate 4: Dependency Audit
106
+
107
+ > Runs only when `package.json`, `yarn.lock`, `package-lock.json`, `pnpm-lock.yaml`, or similar lockfiles are modified.
108
+
109
+ When agents add, remove, or update npm packages, verify:
110
+
111
+ 1. **Vulnerability scan** — Run `npm audit` (or the project's equivalent). No new `high` or `critical` vulnerabilities
112
+ 2. **License compatibility** — New packages must use MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, or ISC licenses. Flag any copyleft (GPL, LGPL, AGPL) or proprietary licenses for human review
113
+ 3. **Bundle size impact** — For frontend packages, note the minified + gzipped size. Flag packages >50KB gzipped that have lighter alternatives
114
+ 4. **Duplicate functionality** — Check whether the new dependency overlaps with an existing one (e.g., adding `moment` when `date-fns` is already installed)
115
+ 5. **Maintenance health** — Flag packages with no updates in >2 years or <100 weekly downloads
116
+
117
+ ### On failure
118
+
119
+ - **Vulnerability:** BLOCK. Re-delegate with instruction to use a patched version or alternative package
120
+ - **License concern:** Flag for human review. Do not block, but document in the PR description
121
+ - **Size/duplicate:** Flag as SHOULD-FIX in the fast review. Not blocking unless egregious (>200KB)
122
+
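Reviewer note: the "On failure" rules of this gate map findings to actions, which the following sketch makes explicit. It assumes the inputs were collected beforehand (for example from `npm audit` output and package metadata) and that the 200KB "egregious" limit refers to gzipped size, which the text does not spell out. `ALLOWED_LICENSES` mirrors the list in the gate; `audit_dependency` is a hypothetical name, and the duplicate and maintenance checks are omitted.

```python
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}


def audit_dependency(license_id: str, gzipped_kb: float,
                     new_high_or_critical_vulns: int) -> list[str]:
    """Return the actions Gate 4 would call for; an empty list means pass."""
    actions = []
    if new_high_or_critical_vulns > 0:
        actions.append("BLOCK: use a patched version or an alternative package")
    if license_id not in ALLOWED_LICENSES:
        actions.append("FLAG: license needs human review; document in the PR description")
    if gzipped_kb > 200:
        # Assumption: the ">200KB" limit is measured on the gzipped bundle.
        actions.append("BLOCK: bundle size is egregious (>200KB)")
    elif gzipped_kb > 50:
        actions.append("SHOULD-FIX: >50KB gzipped; check for lighter alternatives")
    return actions
```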
123
+ ## Gate 5: Fast Review (MANDATORY)
23
124
 
24
125
  > **HARD GATE:** Every agent delegation output must pass fast review before acceptance. This is non-negotiable — even for overnight/unattended runs. Load the **fast-review** skill for the full procedure.
25
126
 
26
- After deterministic checks (Gate 1) pass:
127
+ After gates 1–4 pass:
27
128
 
28
129
  1. **Spawn a single reviewer sub-agent** with the review prompt from the fast-review skill
29
130
  2. **On PASS** — proceed to remaining gates
30
131
  3. **On FAIL** — re-delegate to the same agent with reviewer feedback (up to 2 retries)
31
- 4. **On 3x FAIL** — escalate to panel review (Gate 5)
132
+ 4. **On 3x FAIL** — escalate to panel review (Gate 9)
32
133
 
33
134
  The reviewer validates: acceptance criteria met, file partition respected, no regressions, type safety, error handling, security basics, and edge cases.
34
135
 
35
136
  **Auto-PASS conditions** (skip the reviewer sub-agent):
36
137
  - Pure research/exploration with no code changes
37
138
  - Only `.md` files were modified
38
- - All deterministic gates passed AND the change is ≤10 lines across ≤2 files
139
+ - All deterministic gates passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see Gate 3 sensitive file list)
39
140
 
40
- ## Gate 2: Cache Clearing (BEFORE Browser Testing)
141
+ > **Sensitive file override:** If any changed file falls into the sensitive file categories listed in Gate 3 (auth, migrations, security headers, env schemas, CI/CD), auto-PASS is **never** applied — even for 1-line changes. These files always get a human-quality review.
142
+
143
+ ## Gate 6: Cache Clearing (BEFORE Browser Testing)
41
144
 
42
145
  **Always clear before testing.** Testing stale code wastes time and produces false results.
43
146
 
44
147
  Clear framework caches and task runner caches before starting the dev server for browser testing. See the **codebase-tool** skill for cache-clearing commands.
45
148
 
46
- ## Gate 3: Browser Testing (MANDATORY for UI Changes)
149
+ ## Gate 7: Browser Testing (MANDATORY for UI Changes)
47
150
 
48
151
  > **HARD GATE:** A task with UI changes is NOT done until you have screenshots in Chrome proving the feature works. "The code looks correct" is not proof. "Tests pass" is not proof. Only a screenshot of the working UI in Chrome is proof.
49
152
 
@@ -59,7 +162,7 @@ Clear framework caches and task runner caches before starting the dev server for
59
162
 
60
163
  Load the **browser-testing** skill for Chrome MCP commands, breakpoint details, and reporting format.
61
164
 
62
- ## Gate 4: Regression Testing
165
+ ## Gate 8: Regression Testing
63
166
 
64
167
  New features must not break existing functionality:
65
168
 
@@ -68,7 +171,7 @@ New features must not break existing functionality:
68
171
  3. **Verify navigation** — Ensure routing, links, and back-button behavior still work
69
172
  4. **Check shared components** — If a component from a shared library was modified, test it in all apps that consume it
70
173
 
71
- ## Gate 5: Panel Review (High-Stakes Only)
174
+ ## Gate 9: Panel Review (High-Stakes Only)
72
175
 
73
176
  Use the **panel-majority-vote** skill for:
74
177
 
@@ -79,16 +182,50 @@ Use the **panel-majority-vote** skill for:
79
182
 
80
183
  If the panel returns BLOCK, extract MUST-FIX items, re-delegate to the same agent, and re-run the panel. Never skip, never halt. Max 3 attempts, then escalate to Architect.
81
184
 
185
+ ## Gate 10: Final Smoke Test (Feature-Level)
186
+
187
+ > Runs once after ALL tasks in a feature are Done — not per-task.
188
+
189
+ Individual tasks pass gates 1–9 independently. But the combined result may have integration issues that per-task testing misses. This gate verifies the feature as a cohesive unit.
190
+
191
+ ### Steps
192
+
193
+ 1. **Full build** — Build all affected projects from clean state (not incremental)
194
+ 2. **Full test suite** — Run tests across all projects that consumed any changed files
195
+ 3. **End-to-end browser walkthrough** — Navigate the complete user flow from start to finish:
196
+ - Verify all states: loading, empty, populated, error, partial
197
+ - Test every state transition end-to-end (not just individual screens)
198
+ - Confirm data flows correctly between pages/components
199
+ - Test the happy path AND at least one error path
200
+ 4. **Cross-task integration check** — Verify that outputs from different tasks (e.g., DB migration + component + page) compose correctly
201
+ 5. **Smoke test at all breakpoints** — If the feature has UI, one final responsive sweep
202
+
203
+ ### When to skip
204
+
205
+ - Non-UI features with comprehensive test coverage (e.g., pure backend/data pipeline work where tests verify integration)
206
+ - Single-task features (Gate 8 already covers regression)
207
+
208
+ ### On failure
209
+
210
+ Re-delegate the specific failing integration point to the agent responsible for that layer. Do NOT re-run the entire feature implementation.
211
+
212
+ ---
213
+
82
214
  ## Universal Completion Checklist
83
215
 
84
216
  Use this checklist for any orchestration workflow:
85
217
 
86
- - [ ] Lint, test, and build pass for all affected projects
87
- [ ] **Fast review passed** (mandatory — load **fast-review** skill)
88
- - [ ] Dev server started with **clean cache** (clear framework + task runner caches — see the **codebase-tool** skill)
89
- - [ ] UI changes verified in Chrome with screenshots at all breakpoints
218
+ - [ ] **No secrets in diff** (Gate 1)
219
+ - [ ] Lint, test, and build pass for all affected projects (Gate 2)
220
+ - [ ] Blast radius assessed — scope is expected (Gate 3)
221
+ - [ ] Dependency audit passed if packages changed (Gate 4)
222
+ - [ ] **Fast review passed** (mandatory — load **fast-review** skill) (Gate 5)
223
+ - [ ] Dev server started with **clean cache** (Gate 6)
224
+ - [ ] UI changes verified in Chrome with screenshots at all breakpoints (Gate 7)
90
225
  - [ ] Every acceptance criteria item visually confirmed — not just "page loads"
91
- - [ ] No regressions in adjacent functionality
226
+ - [ ] No regressions in adjacent functionality (Gate 8)
227
+ - [ ] Panel review passed for high-stakes changes (Gate 9)
228
+ - [ ] **Final smoke test passed** for multi-task features (Gate 10)
92
229
  - [ ] Shared code changes tested across all consuming apps
93
230
  - [ ] No duplicated code — shared logic extracted to libraries
94
231
  - [ ] Lessons learned captured if any retries occurred