devlyn-cli 1.15.0 → 2.1.0

Files changed (158)
  1. package/AGENTS.md +104 -0
  2. package/CLAUDE.md +135 -21
  3. package/README.md +43 -125
  4. package/benchmark/auto-resolve/BENCHMARK-DESIGN.md +272 -0
  5. package/benchmark/auto-resolve/README.md +114 -0
  6. package/benchmark/auto-resolve/RUBRIC.md +162 -0
  7. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/NOTES.md +30 -0
  8. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/expected.json +68 -0
  9. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/metadata.json +10 -0
  10. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/setup.sh +4 -0
  11. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/spec.md +45 -0
  12. package/benchmark/auto-resolve/fixtures/F1-cli-trivial-flag/task.txt +8 -0
  13. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/NOTES.md +54 -0
  14. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/expected-pair-plan-registry.json +170 -0
  15. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/expected.json +84 -0
  16. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/metadata.json +21 -0
  17. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/pair-plan.sample-fail.json +214 -0
  18. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/pair-plan.sample-pass.json +223 -0
  19. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/setup.sh +5 -0
  20. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/spec.md +56 -0
  21. package/benchmark/auto-resolve/fixtures/F2-cli-medium-subcommand/task.txt +14 -0
  22. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/NOTES.md +28 -0
  23. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/expected-pair-plan-registry.json +162 -0
  24. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/expected.json +65 -0
  25. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/metadata.json +19 -0
  26. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/setup.sh +4 -0
  27. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/spec.md +56 -0
  28. package/benchmark/auto-resolve/fixtures/F3-backend-contract-risk/task.txt +9 -0
  29. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/NOTES.md +40 -0
  30. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/expected.json +57 -0
  31. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/metadata.json +10 -0
  32. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/setup.sh +6 -0
  33. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/spec.md +49 -0
  34. package/benchmark/auto-resolve/fixtures/F4-web-browser-design/task.txt +9 -0
  35. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/NOTES.md +38 -0
  36. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/expected.json +65 -0
  37. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/metadata.json +10 -0
  38. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/setup.sh +55 -0
  39. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/spec.md +49 -0
  40. package/benchmark/auto-resolve/fixtures/F5-fix-loop-red-green/task.txt +7 -0
  41. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/NOTES.md +38 -0
  42. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/expected.json +77 -0
  43. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/metadata.json +10 -0
  44. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/setup.sh +4 -0
  45. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/spec.md +49 -0
  46. package/benchmark/auto-resolve/fixtures/F6-dep-audit-native-module/task.txt +10 -0
  47. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/NOTES.md +50 -0
  48. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/expected.json +76 -0
  49. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/metadata.json +10 -0
  50. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/setup.sh +36 -0
  51. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/spec.md +46 -0
  52. package/benchmark/auto-resolve/fixtures/F7-out-of-scope-trap/task.txt +7 -0
  53. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/NOTES.md +50 -0
  54. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/expected.json +63 -0
  55. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/metadata.json +10 -0
  56. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/setup.sh +4 -0
  57. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/spec.md +48 -0
  58. package/benchmark/auto-resolve/fixtures/F8-known-limit-ambiguous/task.txt +1 -0
  59. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/NOTES.md +93 -0
  60. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/expected.json +74 -0
  61. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/metadata.json +10 -0
  62. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/setup.sh +28 -0
  63. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/spec.md +62 -0
  64. package/benchmark/auto-resolve/fixtures/F9-e2e-ideate-to-resolve/task.txt +5 -0
  65. package/benchmark/auto-resolve/fixtures/SCHEMA.md +130 -0
  66. package/benchmark/auto-resolve/fixtures/test-repo/README.md +27 -0
  67. package/benchmark/auto-resolve/fixtures/test-repo/bin/cli.js +63 -0
  68. package/benchmark/auto-resolve/fixtures/test-repo/package-lock.json +823 -0
  69. package/benchmark/auto-resolve/fixtures/test-repo/package.json +22 -0
  70. package/benchmark/auto-resolve/fixtures/test-repo/playwright.config.js +17 -0
  71. package/benchmark/auto-resolve/fixtures/test-repo/server/index.js +37 -0
  72. package/benchmark/auto-resolve/fixtures/test-repo/tests/cli.test.js +25 -0
  73. package/benchmark/auto-resolve/fixtures/test-repo/tests/server.test.js +58 -0
  74. package/benchmark/auto-resolve/fixtures/test-repo/web/index.html +37 -0
  75. package/benchmark/auto-resolve/scripts/build-pair-eligible-manifest.py +174 -0
  76. package/benchmark/auto-resolve/scripts/check-f9-artifacts.py +256 -0
  77. package/benchmark/auto-resolve/scripts/compile-report.py +331 -0
  78. package/benchmark/auto-resolve/scripts/iter-0033c-compare.py +552 -0
  79. package/benchmark/auto-resolve/scripts/judge-opus-pass.sh +430 -0
  80. package/benchmark/auto-resolve/scripts/judge.sh +359 -0
  81. package/benchmark/auto-resolve/scripts/oracle-scope-tier-a.py +260 -0
  82. package/benchmark/auto-resolve/scripts/oracle-scope-tier-b.py +274 -0
  83. package/benchmark/auto-resolve/scripts/oracle-test-fidelity.py +328 -0
  84. package/benchmark/auto-resolve/scripts/pair-plan-idgen.py +401 -0
  85. package/benchmark/auto-resolve/scripts/pair-plan-lint.py +468 -0
  86. package/benchmark/auto-resolve/scripts/run-fixture.sh +691 -0
  87. package/benchmark/auto-resolve/scripts/run-iter-0033c.sh +234 -0
  88. package/benchmark/auto-resolve/scripts/run-suite.sh +214 -0
  89. package/benchmark/auto-resolve/scripts/ship-gate.py +222 -0
  90. package/bin/devlyn.js +175 -17
  91. package/config/skills/_shared/adapters/README.md +64 -0
  92. package/config/skills/_shared/adapters/gpt-5-5.md +29 -0
  93. package/config/skills/_shared/adapters/opus-4-7.md +29 -0
  94. package/config/skills/{devlyn:auto-resolve/scripts → _shared}/archive_run.py +26 -0
  95. package/config/skills/_shared/codex-config.md +54 -0
  96. package/config/skills/_shared/codex-monitored.sh +141 -0
  97. package/config/skills/_shared/engine-preflight.md +35 -0
  98. package/config/skills/_shared/expected.schema.json +93 -0
  99. package/config/skills/_shared/pair-plan-schema.md +298 -0
  100. package/config/skills/_shared/runtime-principles.md +110 -0
  101. package/config/skills/_shared/spec-verify-check.py +519 -0
  102. package/config/skills/devlyn:ideate/SKILL.md +99 -429
  103. package/config/skills/devlyn:ideate/references/elicitation.md +97 -0
  104. package/config/skills/devlyn:ideate/references/from-spec-mode.md +54 -0
  105. package/config/skills/devlyn:ideate/references/project-mode.md +76 -0
  106. package/config/skills/devlyn:ideate/references/spec-template.md +102 -0
  107. package/config/skills/devlyn:resolve/SKILL.md +172 -184
  108. package/config/skills/devlyn:resolve/references/free-form-mode.md +68 -0
  109. package/config/skills/devlyn:resolve/references/phases/build-gate.md +45 -0
  110. package/config/skills/devlyn:resolve/references/phases/cleanup.md +39 -0
  111. package/config/skills/devlyn:resolve/references/phases/implement.md +42 -0
  112. package/config/skills/devlyn:resolve/references/phases/plan.md +42 -0
  113. package/config/skills/devlyn:resolve/references/phases/verify.md +69 -0
  114. package/config/skills/devlyn:resolve/references/state-schema.md +106 -0
  115. package/{config/skills → optional-skills}/devlyn:design-system/SKILL.md +1 -0
  116. package/{config/skills → optional-skills}/devlyn:reap/SKILL.md +1 -0
  117. package/{config/skills → optional-skills}/devlyn:team-design-ui/SKILL.md +5 -0
  118. package/package.json +12 -2
  119. package/scripts/lint-skills.sh +431 -0
  120. package/config/skills/devlyn:auto-resolve/SKILL.md +0 -252
  121. package/config/skills/devlyn:auto-resolve/evals/evals.json +0 -21
  122. package/config/skills/devlyn:auto-resolve/evals/task-doctor-subcommand.md +0 -42
  123. package/config/skills/devlyn:auto-resolve/references/build-gate.md +0 -130
  124. package/config/skills/devlyn:auto-resolve/references/engine-routing.md +0 -82
  125. package/config/skills/devlyn:auto-resolve/references/findings-schema.md +0 -103
  126. package/config/skills/devlyn:auto-resolve/references/phases/phase-1-build.md +0 -54
  127. package/config/skills/devlyn:auto-resolve/references/phases/phase-2-evaluate.md +0 -45
  128. package/config/skills/devlyn:auto-resolve/references/phases/phase-3-critic.md +0 -84
  129. package/config/skills/devlyn:auto-resolve/references/pipeline-routing.md +0 -114
  130. package/config/skills/devlyn:auto-resolve/references/pipeline-state.md +0 -201
  131. package/config/skills/devlyn:auto-resolve/scripts/terminal_verdict.py +0 -96
  132. package/config/skills/devlyn:browser-validate/SKILL.md +0 -164
  133. package/config/skills/devlyn:browser-validate/references/flow-testing.md +0 -118
  134. package/config/skills/devlyn:browser-validate/references/tier1-chrome.md +0 -137
  135. package/config/skills/devlyn:browser-validate/references/tier2-playwright.md +0 -195
  136. package/config/skills/devlyn:browser-validate/references/tier3-curl.md +0 -57
  137. package/config/skills/devlyn:clean/SKILL.md +0 -285
  138. package/config/skills/devlyn:design-ui/SKILL.md +0 -351
  139. package/config/skills/devlyn:discover-product/SKILL.md +0 -124
  140. package/config/skills/devlyn:evaluate/SKILL.md +0 -564
  141. package/config/skills/devlyn:feature-spec/SKILL.md +0 -630
  142. package/config/skills/devlyn:ideate/references/challenge-rubric.md +0 -122
  143. package/config/skills/devlyn:ideate/references/codex-critic-template.md +0 -42
  144. package/config/skills/devlyn:ideate/references/templates/item-spec.md +0 -90
  145. package/config/skills/devlyn:implement-ui/SKILL.md +0 -466
  146. package/config/skills/devlyn:preflight/SKILL.md +0 -355
  147. package/config/skills/devlyn:preflight/references/auditors/browser-auditor.md +0 -32
  148. package/config/skills/devlyn:preflight/references/auditors/code-auditor.md +0 -86
  149. package/config/skills/devlyn:preflight/references/auditors/docs-auditor.md +0 -38
  150. package/config/skills/devlyn:product-spec/SKILL.md +0 -603
  151. package/config/skills/devlyn:recommend-features/SKILL.md +0 -286
  152. package/config/skills/devlyn:review/SKILL.md +0 -161
  153. package/config/skills/devlyn:team-resolve/SKILL.md +0 -631
  154. package/config/skills/devlyn:team-review/SKILL.md +0 -493
  155. package/config/skills/devlyn:update-docs/SKILL.md +0 -463
  156. package/config/skills/workflow-routing/SKILL.md +0 -73
  157. package/{config/skills → optional-skills}/devlyn:reap/scripts/reap.sh +0 -0
  158. package/{config/skills → optional-skills}/devlyn:reap/scripts/scan.sh +0 -0
@@ -1,493 +0,0 @@
1
- Perform a multi-perspective code review by assembling a specialized Agent Team. Each reviewer audits the changes from their domain expertise — security, code quality, testing, product, design, and performance — ensuring nothing slips through.
2
-
3
- <review_scope>
4
- $ARGUMENTS
5
- </review_scope>
6
-
7
- <team_workflow>
8
-
9
- ## Phase 1: SCOPE ASSESSMENT (You are the Review Lead — work solo first)
10
-
11
- Before spawning any reviewers, assess the changeset:
12
-
13
- 1. Run `git diff --name-only HEAD` to get all changed files
14
- 2. Run `git diff HEAD` to get the full diff
15
- 3. Read all changed files in parallel (use parallel tool calls)
16
- 4. Classify the changes using the scope matrix below
17
- 5. Decide which reviewers to spawn
18
-
19
- <scope_classification>
20
- Classify the changes and select reviewers:
21
-
22
- **Always spawn** (every review):
23
- - security-reviewer
24
- - quality-reviewer
25
- - test-analyst
26
-
27
- **UI/interaction changes** (components, pages, views, user-facing behavior):
28
- - Add: ux-reviewer
29
-
30
- **Visual/styling changes** (CSS, Tailwind, design tokens, layout, animation, theming):
31
- - Add: ui-reviewer
32
-
33
- **Accessibility-sensitive changes** (forms, interactive elements, dynamic content, modals, navigation):
34
- - Add: accessibility-reviewer
35
-
36
- **Product behavior changes** (feature logic, user flows, business rules, copy, redirects):
37
- - Add: product-validator
38
-
39
- **API changes** (routes, endpoints, GraphQL schema, request/response shapes, middleware):
40
- - Add: api-reviewer
41
-
42
- **Performance-sensitive changes** (queries, data fetching, loops, algorithms, heavy imports, rendering):
43
- - Add: performance-reviewer
44
-
45
- **Security-sensitive changes** (auth, crypto, env, config, secrets, middleware, API routes):
46
- - Escalate: security-reviewer gets HIGH priority task with extra scrutiny mandate
47
-
48
- </scope_classification>
49
-
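The scope matrix in the removed text amounts to a mapping from change categories to reviewer roles. A minimal Python sketch of that idea follows. It assumes path-based heuristics: the `RULES` patterns and the `select_reviewers` helper are hypothetical and not part of the skill, which leaves classification to the Review Lead's judgment.

```python
import re

# The three roles the scope matrix spawns on every review.
ALWAYS = ["security-reviewer", "quality-reviewer", "test-analyst"]

# Hypothetical path heuristics. The removed skill text classifies changes by
# judgment, not by regex, so these patterns are illustrative assumptions only.
RULES = [
    (re.compile(r"\.(css|scss)$|tailwind|tokens"), "ui-reviewer"),
    (re.compile(r"(components|pages|views)/"), "ux-reviewer"),
    (re.compile(r"(routes|api|graphql)/"), "api-reviewer"),
]


def select_reviewers(changed_files):
    """Return the always-on roles plus any role whose pattern matches a path."""
    selected = list(ALWAYS)
    for path in changed_files:
        for pattern, role in RULES:
            if pattern.search(path) and role not in selected:
                selected.append(role)
    return selected
```

For example, `select_reviewers(["src/api/users.ts"])` would add `api-reviewer` to the three always-on roles under these assumed patterns.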
50
- Announce to the user:
51
- ```
52
- Review team assembling for: [N] changed files
53
- Reviewers: [list of roles being spawned and why each was chosen]
54
- ```
55
-
56
- ## Phase 2: TEAM ASSEMBLY
57
-
58
- Use the Agent Teams infrastructure:
59
-
60
- 1. **TeamCreate** with name `review-{branch-or-short-hash}` (e.g., `review-fix-auth-flow`)
61
- 2. **Spawn reviewers** using the `Task` tool with `team_name` and `name` parameters. Each reviewer is a separate Claude instance with its own context.
62
- 3. **TaskCreate** review tasks for each reviewer — include the changed file list, relevant diff sections, and their specific checklist.
63
- 4. **Assign tasks** using TaskUpdate with `owner` set to the reviewer name.
64
-
65
- **IMPORTANT**: Do NOT hardcode a model. All reviewers inherit the user's active model automatically.
66
-
67
- **IMPORTANT**: When spawning reviewers, replace `{team-name}` in each prompt below with the actual team name you chose. Include the specific changed file paths in each reviewer's spawn prompt.
68
-
69
- ### Engine-Routed Reviewer Spawning
70
-
71
- If the caller passed `--engine auto` or `--engine codex` (check the orchestrator's context or the pipeline config), read the auto-resolve skill's `references/engine-routing.md` for per-role routing under "team-review roles".
72
-
73
- **For roles routed to Codex**: Instead of spawning a Claude Agent reviewer, call `mcp__codex-cli__codex` with:
74
- - `model`: `"gpt-5.4"`
75
- - `reasoningEffort`: `"xhigh"`
76
- - `sandbox`: per routing table (`"read-only"` or `"workspace-write"`)
77
- - `workingDirectory`: project root
78
- - `prompt`: the full reviewer prompt below, with changed file paths and diff included inline
79
-
80
- Codex reviewers cannot use TeamCreate/SendMessage — the Review Lead (you) collects their output directly from the MCP call response and relays cross-cutting findings to other reviewers.
81
-
82
- **For roles routed to Claude**: Spawn via Task tool as normal (prompts below).
83
-
84
- **For Dual roles** (e.g., security-reviewer): Run BOTH a Claude Agent reviewer AND a `mcp__codex-cli__codex` call in parallel with the same prompt. Merge findings per `engine-routing.md` "How to Spawn a Dual Role" section.
85
-
86
- If `--engine auto` or no `--engine` flag: routes each reviewer role to the optimal model based on benchmark data (see `engine-routing.md`). If `--engine claude`: all roles use Claude Agent reviewers.
87
-
88
- ### Reviewer Prompts
89
-
90
- When spawning each reviewer via the Task tool (or passing to `mcp__codex-cli__codex` for Codex-routed roles), use these prompts:
91
-
92
- <security_reviewer_prompt>
93
- You are the **Security Reviewer** on an Agent Team performing a code review.
94
-
95
- **Your perspective**: Security engineer
96
- **Your mandate**: OWASP-focused review. Find credentials, injection, XSS, validation gaps, path traversal, dependency CVEs.
97
-
98
- **Your checklist** (CRITICAL severity — blocks approval):
99
- - Hardcoded credentials, API keys, tokens, secrets
100
- - SQL injection (unsanitized queries)
101
- - XSS (unescaped user input in HTML/JSX)
102
- - Missing input validation at system boundaries
103
- - Insecure dependencies (known CVEs)
104
- - Path traversal (unsanitized file paths)
105
- - Improper authentication or authorization checks
106
- - Sensitive data exposure in logs or error messages
107
-
108
- **Tools available**: Read, Grep, Glob, Bash (npm audit, grep for secrets patterns, etc.)
109
-
110
- **Your process**:
111
- 1. Read all changed files
112
- 2. Check each file against your checklist
113
- 3. For each issue found, note: severity, file:line, what the issue is, why it matters
114
- 4. Run `npm audit` or equivalent if dependencies changed
115
- 5. Check for secrets patterns: grep for API_KEY, SECRET, TOKEN, PASSWORD, etc.
116
-
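Step 5 of the process above names the keywords to search for but not the exact pattern. One way to sketch the scan in Python is shown below. The keyword list comes from the prompt, while the assignment-shaped regex and the `find_secret_candidates` helper are assumptions for illustration, and a real review would likely rely on a dedicated secret scanner.

```python
import re

# Keywords are taken from step 5; requiring a quoted literal after ':' or '='
# is an assumption made here to reduce false positives.
SECRET_PATTERN = re.compile(
    r"(API_KEY|SECRET|TOKEN|PASSWORD)\w*\s*[:=]\s*['\"][^'\"]+['\"]",
    re.IGNORECASE,
)


def find_secret_candidates(text):
    """Return (line_number, stripped_line) pairs that look like hardcoded secrets."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

A line such as `key = process.env.API_KEY` is not flagged by this sketch, because the keyword is not followed by a quoted literal.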
117
- **Your deliverable**: Send a message to the team lead with:
118
- 1. List of security issues found (severity, file:line, description)
119
- 2. "CLEAN" if no issues found
120
- 3. Any security concerns about the overall change pattern
121
- 4. Cross-cutting concerns to flag for other reviewers
122
-
123
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Alert other teammates about security-relevant findings via SendMessage.
124
- </security_reviewer_prompt>
125
-
126
- <quality_reviewer_prompt>
127
- You are the **Quality Reviewer** on an Agent Team performing a code review.
128
-
129
- **Your perspective**: Senior engineer / code quality guardian
130
- **Your mandate**: Architecture, patterns, readability, function size, nesting, error handling, naming, over-engineering.
131
-
132
- **Your checklist**:
133
- HIGH severity (blocks approval):
134
- - Functions > 50 lines → split
135
- - Files > 800 lines → decompose
136
- - Nesting > 4 levels → flatten or extract
137
- - Missing error handling at boundaries
138
- - `console.log` in production code → remove
139
- - Unresolved TODO/FIXME → resolve or remove
140
- - Missing JSDoc for public APIs
141
-
142
- MEDIUM severity (fix or justify):
143
- - Mutation where immutable patterns preferred
144
- - Inconsistent naming or structure
145
- - Over-engineering: unnecessary abstractions, unused config, premature optimization
146
- - Code duplication that should be extracted
147
-
148
- LOW severity (fix if quick):
149
- - Unused imports/dependencies
150
- - Unreferenced functions/variables
151
- - Commented-out code
152
- - Obsolete files
153
-
154
- **Tools available**: Read, Grep, Glob
155
-
156
- **Your process**:
157
- 1. Read all changed files
158
- 2. Check each file against your checklist by severity
159
- 3. For each issue found, note: severity, file:line, what the issue is, why it matters
160
- 4. Check for consistency with existing codebase patterns
161
-
162
- **Your deliverable**: Send a message to the team lead with:
163
- 1. List of issues found grouped by severity (HIGH, MEDIUM, LOW) with file:line
164
- 2. "CLEAN" if no issues found
165
- 3. Overall code quality assessment
166
- 4. Pattern consistency observations
167
-
168
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Share relevant findings with other reviewers via SendMessage.
169
- </quality_reviewer_prompt>
170
-
171
- <test_analyst_prompt>
172
- You are the **Test Analyst** on an Agent Team performing a code review.
173
-
174
- **Your perspective**: QA lead
175
- **Your mandate**: Test coverage, test quality, missing scenarios, edge cases. Run the test suite.
176
-
177
- **Your checklist** (MEDIUM severity):
178
- - Missing tests for new functionality
179
- - Untested edge cases (null, empty, boundary values, error states)
180
- - Test quality (assertions are meaningful, not just "doesn't crash")
181
- - Integration test coverage for cross-module changes
182
- - Mocking correctness (mocks reflect real behavior)
183
- - Test file naming and organization consistency
184
-
185
- **Tools available**: Read, Grep, Glob, Bash (including running tests)
186
-
187
- **Your process**:
188
- 1. Read all changed files to understand what changed
189
- 2. Find existing test files for the changed code
190
- 3. Assess test coverage for the changes
191
- 4. Run the full test suite and report results
192
- 5. Run the project linter (`npm run lint` or equivalent) and report any lint errors/warnings on changed files
193
- 6. Identify missing test scenarios and edge cases
194
-
195
- **Your deliverable**: Send a message to the team lead with:
196
- 1. Test suite results: PASS or FAIL (with failure details)
197
- 2. Lint results: PASS or FAIL (with issue details on changed files)
198
- 3. Coverage gaps: what changed code lacks tests
199
- 4. Missing edge cases that should be tested
200
- 5. Test quality assessment
201
- 6. Recommended tests to add
202
-
203
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Share test results with other reviewers via SendMessage.
204
- </test_analyst_prompt>
205
-
206
- <ux_reviewer_prompt>
207
- You are the **UX Reviewer** on an Agent Team performing a code review.
208
-
209
- **Your perspective**: Interaction design specialist
210
- **Your mandate**: Review user-facing changes for interaction quality, flow correctness, and missing UI states. Catch UX regressions before they ship.
211
-
212
- **Your checklist** (MEDIUM severity):
213
- - Missing UI states: loading, error, empty, disabled, success — every async operation needs all of these
214
- - UX regressions: existing user flows that worked before and may now be broken
215
- - Interaction model consistency: does this behave like the rest of the app?
216
- - Focus management: after dialog close, form submit, or route change — where does focus go?
217
- - Feedback latency: does the user get immediate feedback on actions?
218
- - Error message quality: are error messages actionable and human-readable?
219
- - Copy/text: is it clear, consistent, and typo-free?
220
- - Edge cases in flows: what happens with 0 items, 1 item, 100+ items?
221
-
222
- **Tools available**: Read, Grep, Glob
223
-
224
- **Your process**:
225
- 1. Read all changed components and pages
226
- 2. Trace every user flow affected by the changes from entry to completion
227
- 3. Check each interactive element against your checklist
228
- 4. Look for missing states in async operations (loading spinners, error boundaries, empty states)
229
- 5. Compare behavior against existing similar patterns in the codebase
230
-
231
- **Your deliverable**: Send a message to the team lead with:
232
- 1. UX issues found (severity, file:line, description)
233
- 2. "CLEAN" if no issues found
234
- 3. Missing UI states that must be added before shipping
235
- 4. UX regressions detected
236
- 5. Flow diagrams or step-by-step descriptions of broken interactions
237
-
238
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Communicate with ui-reviewer about visual states and with accessibility-reviewer about interaction-level a11y concerns via SendMessage.
239
- </ux_reviewer_prompt>
240
-
241
- <ui_reviewer_prompt>
242
- You are the **UI Reviewer** on an Agent Team performing a code review.
243
-
244
- **Your perspective**: Visual design specialist
245
- **Your mandate**: Review styling and visual changes for design system consistency, visual hierarchy, and aesthetic quality. Catch design regressions and token misuse.
246
-
247
- **Your checklist** (MEDIUM severity):
248
- - Design token usage: are raw values used where tokens should be? (hardcoded colors, spacing px values, font sizes)
249
- - Spacing consistency: does this follow the project's spacing scale (4px/8px grid)?
250
- - Typography: correct font weight, size, line-height per the type scale?
251
- - Color consistency: are semantic color tokens used correctly (e.g., `text-muted` not `text-gray-400`)?
252
- - Visual hierarchy: does the eye naturally land in the right place?
253
- - Component consistency: does this look like it belongs in the same product?
254
- - Responsive behavior: does this break at mobile/tablet breakpoints?
255
- - Animation/transitions: are easing and duration values consistent with the rest of the app?
256
- - Dark mode / theme compatibility: does this work across all themes if the product supports them?
257
- - Icon usage: correct size, stroke weight, and optical alignment?
258
-
259
- **Tools available**: Read, Grep, Glob
260
-
261
- **Your process**:
262
- 1. Read all changed style files, components, and layout files
263
- 2. Check for raw values that should use design tokens
264
- 3. Compare visual patterns against existing components in the codebase
265
- 4. Look for responsive breakpoint handling
266
- 5. Check for theme/dark mode compatibility
267
-
268
- **Your deliverable**: Send a message to the team lead with:
269
- 1. Visual issues found (severity, file:line, description)
270
- 2. "CLEAN" if no issues found
271
- 3. Design token violations (raw values that should be tokens)
272
- 4. Visual inconsistencies vs. existing components
273
- 5. Responsive/theming gaps
274
-
275
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Alert ux-reviewer about visual state issues and accessibility-reviewer about contrast or focus indicator issues via SendMessage.
276
- </ui_reviewer_prompt>
277
-
278
- <accessibility_reviewer_prompt>
279
- You are the **Accessibility Reviewer** on an Agent Team performing a code review.
280
-
281
- **Your perspective**: WCAG 2.1 AA compliance specialist
282
- **Your mandate**: Ensure changed code is usable by everyone, including people using assistive technologies.
283
-
284
- **Your checklist** (HIGH severity for CRITICAL violations, MEDIUM for gaps):
285
- - Semantic HTML: correct elements for their semantic meaning (button not div, nav not div, etc.)
286
- - ARIA labels: interactive elements without visible labels need `aria-label` or `aria-labelledby`
287
- - ARIA roles: custom interactive elements need correct roles
288
- - Keyboard navigation: all interactions reachable and operable without a mouse
289
- - Focus indicators: visible focus rings on all interactive elements (not `outline: none` without replacement)
290
- - Focus management: dialogs trap focus; focus returns correctly on close
291
- - Color contrast: text ≥ 4.5:1, large text ≥ 3:1, UI components ≥ 3:1
292
- - Screen reader announcements: dynamic content updates announced via `aria-live` or role changes
293
- - Image alt text: informative images have descriptive alt; decorative images have `alt=""`
294
- - Form labels: every input has an associated label (not just placeholder)
295
- - Error association: error messages linked to inputs via `aria-describedby`
296
- - Motion: `prefers-reduced-motion` respected for animations
297
-
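The contrast thresholds in the checklist above (4.5:1 for text, 3:1 for large text) can be checked numerically. The sketch below follows the WCAG 2.1 definitions of relative luminance and contrast ratio for 8-bit sRGB colors. The function names are illustrative and not part of the skill, and UI-component contrast is not covered.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text=False):
    """Check the AA text thresholds: 4.5:1 normally, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

Black on white gives the maximum ratio of 21:1, while a mid-gray such as `(119, 119, 119)` on white falls just short of 4.5:1 and passes only as large text.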
298
- **Tools available**: Read, Grep, Glob
299
-
300
- **Your process**:
301
- 1. Read all changed components focusing on interactive elements and dynamic content
302
- 2. Check semantic structure of the markup
303
- 3. Audit ARIA usage for correctness (not just presence)
304
- 4. Trace keyboard navigation paths through changed flows
305
- 5. Check color values against contrast ratios if possible
306
-
307
- **Your deliverable**: Send a message to the team lead with:
308
- 1. Accessibility violations (severity, file:line, WCAG criterion, recommended fix)
309
- 2. "CLEAN" if no issues found
310
- 3. Patterns that need consistent a11y fixes across the codebase
311
-
312
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Alert ux-reviewer and ui-reviewer about interaction and visual a11y issues via SendMessage.
313
- </accessibility_reviewer_prompt>
314
-
315
- <product_validator_prompt>
316
- You are the **Product Validator** on an Agent Team performing a code review.
317
-
318
- **Your perspective**: Product manager / business logic guardian
319
- **Your mandate**: Validate that changes match product intent and business rules. Catch feature regressions. Flag scope drift.
320
-
321
- **Your checklist** (MEDIUM severity):
322
- - Behavior matches product spec / user expectations
323
- - Business rules are correctly implemented (pricing, permissions, limits, validations)
324
- - No feature regressions (existing product behaviors still work as expected)
325
- - Edge cases in business logic (zero state, max limits, concurrent actions)
326
- - Copy/text matches approved language (not placeholder text or developer copy)
327
- - Feature flag or rollout considerations (is this safely gated?)
328
- - Documentation or changelog requirements for user-visible changes
329
-
330
- **Tools available**: Read, Grep, Glob
331
-
332
- **Your process**:
333
- 1. Read all changed files, focusing on business logic and user-facing behavior
334
- 2. Trace the user flows affected by the changes
335
- 3. Check business rule implementation against any spec files or comments
336
- 4. Identify behavior changes that users or other features depend on
337
-
338
- **Your deliverable**: Send a message to the team lead with:
339
- 1. Product/behavior issues found (severity, file:line, description)
340
- 2. "CLEAN" if no issues found
341
- 3. Business logic correctness assessment
342
- 4. Any behavior changes that need user communication or changelog entries
343
-
344
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Share product intent context with ux-reviewer and quality-reviewer via SendMessage.
345
- </product_validator_prompt>
346
-
347
- <api_reviewer_prompt>
348
- You are the **API Reviewer** on an Agent Team performing a code review.
349
-
350
- **Your perspective**: API design and contract specialist
351
- **Your mandate**: Ensure API changes are consistent, backwards-compatible, and well-structured.
352
-
353
- **Your checklist** (HIGH severity for breaking changes):
354
- - Breaking changes: removed fields, renamed endpoints, changed response shapes, different status codes
355
- - Consistency: do new endpoints follow the same conventions as existing ones? (naming, casing, error envelope, pagination)
356
- - HTTP semantics: correct verbs (GET idempotent, POST for creation, PUT/PATCH for update, DELETE for removal)
- - Status codes: correct codes returned (201 for creation, 400 for validation errors, 401 vs 403, etc.)
- - Error format: errors returned in the consistent error envelope format
- - Input validation: request payloads validated at the API boundary
- - Authentication: is the right auth mechanism applied to new routes?
- - Versioning: if breaking, is this behind a version prefix?
- - Over-fetching: does the response return more data than the client needs?
-
- **Tools available**: Read, Grep, Glob
-
- **Your process**:
- 1. Read all changed route handlers, controllers, and schema files
- 2. Compare against existing API patterns in the codebase
- 3. Check for breaking changes vs. existing client usage
- 4. Verify error handling consistency
-
- **Your deliverable**: Send a message to the team lead with:
- 1. API issues found (severity, file:line, description)
- 2. "CLEAN" if no issues found
- 3. Breaking change risk assessment
- 4. Consistency gaps vs. existing API conventions
-
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Alert security-reviewer about auth/validation gaps and quality-reviewer about structural issues via SendMessage.
- </api_reviewer_prompt>
-
- <performance_reviewer_prompt>
- You are the **Performance Reviewer** on an Agent Team performing a code review.
-
- **Your perspective**: Performance engineer
- **Your mandate**: Algorithmic complexity, N+1 queries, unnecessary re-renders, bundle size impact, memory leaks.
-
- **Your checklist** (HIGH severity when relevant):
- - O(n²) or worse algorithms where O(n) is possible
- - N+1 query patterns (database, API calls in loops)
- - Unnecessary re-renders (React: missing memo, unstable references, inline objects/functions)
- - Large bundle imports where tree-shakeable alternatives exist
- - Memory leaks (event listeners, subscriptions, intervals not cleaned up)
- - Synchronous operations that should be async
- - Missing pagination or unbounded data fetching
-
- **Tools available**: Read, Grep, Glob, Bash
-
- **Your process**:
- 1. Read all changed files, focusing on data flow and computation
- 2. Check each change against your checklist
- 3. Analyze algorithmic complexity of new/changed logic
- 4. Check import sizes and bundle impact
- 5. Look for resource lifecycle issues
-
- **Your deliverable**: Send a message to the team lead with:
- 1. Performance issues found (severity, file:line, description)
- 2. "CLEAN" if no issues found
- 3. Performance risk assessment for the changes
- 4. Optimization recommendations (if any)
-
- Read the team config at ~/.claude/teams/{team-name}/config.json to discover teammates. Alert other reviewers about performance concerns that affect their domains via SendMessage.
- </performance_reviewer_prompt>
-
- ## Phase 3: PARALLEL REVIEW
-
- All reviewers work simultaneously. They will:
- - Review from their unique perspective using their checklist
- - Message each other about cross-cutting concerns
- - Send their final findings to you (Review Lead)
-
- Wait for all reviewers to report back. If a reviewer goes idle after sending findings, that's normal — they're done with their review.
-
- ## Phase 4: MERGE & FIX (You, Review Lead)
-
- After receiving all reviewer findings:
-
- 1. Read all findings carefully
- 2. Deduplicate: if multiple reviewers flagged the same file:line, keep the highest severity
- 3. Fix all CRITICAL issues directly — these block approval
- 4. Fix all HIGH issues directly — these block approval
- 5. For MEDIUM issues: fix them, or justify deferral with a concrete reason
- 6. For LOW issues: fix if quick (< 1 minute each)
- 7. Document every action taken
-
- ## Phase 5: VALIDATION (You, Review Lead)
-
- After all fixes are applied:
-
- 1. Run the full test suite
- 2. If tests fail → chain to `/devlyn:team-resolve` for the failing tests
- 3. Re-read fixed files to verify fixes didn't introduce new issues
- 4. Generate the final review summary
-
- ## Phase 6: CLEANUP
-
- After review is complete:
- 1. Send `shutdown_request` to all reviewers via SendMessage
- 2. Wait for shutdown confirmations
- 3. Call TeamDelete to clean up the team
-
- </team_workflow>
-
- <output_format>
- Present the final review in this format:
-
- <team_review_summary>
-
- ### Review Complete
-
- **Approval**: [BLOCKED / APPROVED]
- - BLOCKED if any CRITICAL or HIGH issues remain unfixed OR lint/tests fail
-
- **Team Composition**: [N] reviewers
- - **Security Reviewer**: [N issues found / Clean]
- - **Quality Reviewer**: [N issues found / Clean]
- - **Test Analyst**: [Tests PASS/FAIL, Lint PASS/FAIL, N coverage gaps]
- - **[Conditional reviewers]**: [findings summary]
-
- **Lint**: [PASS / FAIL]
- - [lint summary or issue details]
-
- **Tests**: [PASS / FAIL]
- - [test summary or failure details]
-
- **Cross-Cutting Concerns**:
- - [Issues flagged by multiple reviewers]
-
- **Fixed**:
- - [CRITICAL/Security] file.ts:42 — [what was fixed]
- - [HIGH/Quality] utils.ts:156 — [what was fixed]
- - [HIGH/Performance] query.ts:23 — [what was fixed]
-
- **Verified**:
- - [Items that passed all reviewer checklists]
-
- **Deferred** (with justification):
- - [MEDIUM/severity] description — [concrete reason for deferral]
-
- ### Recommendation
- If any issues were deferred or if the fix was complex, consider running `/devlyn:team-resolve` on the specific concern for deeper analysis.
-
- </team_review_summary>
- </output_format>