@titan-design/brain 0.6.1 → 0.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +66 -44
- package/dist/{brain-service-4ETWBOIO.js → brain-service-TYFNTBT6.js} +2 -2
- package/dist/{chunk-KSJZ7CMP.js → chunk-4GDSQB2E.js} +1 -1
- package/dist/{chunk-HNC656YT.js → chunk-IESQY2UZ.js} +7 -5
- package/dist/{chunk-AJKFX2TM.js → chunk-LLAHWRO4.js} +2 -1
- package/dist/cli.js +2472 -223
- package/dist/{command-resolution-MO7LSFOT.js → command-resolution-EJ6LTC2Z.js} +1 -1
- package/dist/{search-AKSAQJOR.js → search-NPTRJV4W.js} +2 -2
- package/dist/templates/brainstorm-design.md +30 -0
- package/dist/templates/brainstorm-explore.md +30 -0
- package/dist/templates/brainstorm-interview.md +26 -0
- package/dist/templates/brainstorm-propose.md +28 -0
- package/dist/templates/brainstorm-write-doc.md +30 -0
- package/dist/templates/implementation-compact.md +46 -0
- package/dist/templates/implementation.md +92 -0
- package/dist/templates/ops.md +18 -0
- package/dist/templates/planning-critic.md +123 -0
- package/dist/templates/planning-decompose.md +221 -0
- package/dist/templates/planning-design.md +162 -0
- package/dist/templates/planning-interview.md +74 -0
- package/dist/templates/planning-research.md +114 -0
- package/dist/templates/planning-spectests.md +84 -0
- package/dist/templates/review-agent.md +155 -0
- package/dist/templates/review-fixup.md +79 -0
- package/dist/templates/validation-gates.md +84 -0
- package/dist/templates/writing-plans.md +48 -0
- package/package.json +11 -3
- package/skill/SKILL.md +53 -25
package/dist/templates/planning-design.md
@@ -0,0 +1,162 @@

# Planning Design: {{TASK_ID}}

Produce design artifacts for a planning workflow. Read the research brief, synthesize with interview answers (if provided), and write spec, design, and acceptance criteria. Do NOT implement or write code — only produce design documents.

---

## Setup

- Plan: `{{PLAN_ID}}`
- Project: `{{PROJECT_PREFIX}}`
- Location: `{{REPO_PATH}}`
- Complexity: {{COMPLEXITY}}
- Build: `{{BUILD_CMD}}` | Test: `{{TEST_CMD}}` | Typecheck: `{{TYPECHECK_CMD}}` | Lint: `{{LINT_CMD}}`

## Input

### Research Brief

Read `{{REPO_PATH}}/.plans/{{PLAN_ID}}/research-brief.md` for context gathered during the research phase. This is your primary source of truth for existing code, patterns, constraints, and external findings.

### Interview Answers (high complexity only)

{{INTERVIEW_ANSWERS}}

### Task Description

{{TASK_DESCRIPTION}}

## Instructions

Produce three artifacts in `{{REPO_PATH}}/.plans/{{PLAN_ID}}/`. These documents are the contract for all subsequent phases — critic review, spec tests, work decomposition, and implementation. Be concrete: file paths, type signatures, function names. The decompose phase uses these to create implementation tasks with file ownership.

### Artifact 1: spec.md

Write `{{REPO_PATH}}/.plans/{{PLAN_ID}}/spec.md` — the problem specification.

```markdown
# Spec: <feature name>

## Problem
[What problem does this solve? What happens without it? 2-3 sentences.]

## Requirements
[Numbered list of functional requirements. Each must be testable.]

## Constraints
[What can't change. Backward compatibility. Performance bounds. API contracts.]

## Out of Scope
[Explicit list of what this work does NOT include.]

## Dependencies
[Other systems, packages, or tasks this depends on.]
```

Keep it to 1-2 pages. Reference interview answers where applicable.

### Artifact 2: design.md

Write `{{REPO_PATH}}/.plans/{{PLAN_ID}}/design.md` — the technical approach.

```markdown
# Design: <feature name>

## Approach
[2-3 paragraph summary of the technical approach.]

## Files to Create/Modify
[Exact paths for every file that will be created or modified.]

| Action | Path | Purpose |
|--------|------|---------|
| Create | `src/features/foo.ts` | Core logic for X |
| Modify | `src/store/bar.ts` | Add Y state |
| Create | `src/__tests__/foo.test.ts` | Tests for X |

## API Shapes / Type Signatures
[TypeScript interfaces, function signatures, or data structures. Concrete, not abstract.]

## Data Flow
[How data moves through the system. Which components talk to which.]

## Key Decisions

| Decision | Chosen | Alternative | Rationale |
|----------|--------|-------------|-----------|
| State management | Zustand | Context | Simpler API, existing pattern |

## Risks and Mitigations
[What could go wrong. How to handle it.]

## Scaffolding vs. Implementation
[What should be built first (wave 1) vs. what can be parallelized (wave 2+).]
- **Scaffolding (wave 1):** [types, interfaces, shared utilities, store structure]
- **Implementation (wave 2+):** [features that can be built in parallel on top of scaffolding]

## PR Boundaries
[Should this be one PR or multiple? If multiple, what is the boundary?]
- Option A: One PR for everything (simpler review, all-or-nothing)
- Option B: One PR per wave (incremental, reviewable chunks)
- Recommendation: [chosen option with rationale]
```

### Artifact 3: acceptance-criteria.md

Write `{{REPO_PATH}}/.plans/{{PLAN_ID}}/acceptance-criteria.md` — testable conditions.

```markdown
# Acceptance Criteria: <feature name>

## Criteria

### AC-01: <short name>
**Given:** <precondition>
**When:** <action>
**Then:** <expected result>

### AC-02: <short name>
**Given:** <precondition>
**When:** <action>
**Then:** <expected result>

[Continue for all criteria...]

## Edge Cases

### EC-01: <short name>
**Given:** <edge condition>
**When:** <action>
**Then:** <expected behavior>

## Non-Functional

### NF-01: <short name>
[Performance, accessibility, or other non-functional requirement with measurable threshold]
```

Every requirement in spec.md must map to at least one acceptance criterion. Every acceptance criterion must be testable — no vague conditions like "works correctly."

## Quality Checks

Before reporting, verify your artifacts:

- Every spec requirement has a corresponding AC
- Every AC uses Given/When/Then format with concrete, testable conditions
- design.md files table has exact paths (no placeholders like `src/...`)
- API shapes include TypeScript types, not prose descriptions
- Key decisions table has at least one entry with a considered alternative
- Scaffolding section clearly separates wave 1 (blocking) from wave 2+ (parallelizable)
- PR boundaries section has a concrete recommendation
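Parts of this checklist can be mechanized. A rough self-check sketch, assuming the `AC-`/`EC-`/`NF-` heading convention from the template above (the sample file is created inline purely for illustration):

```bash
# Create a sample acceptance-criteria.md just for this demo.
cat > acceptance-criteria.md <<'EOF'
### AC-01: create session
### AC-02: reject duplicate
### EC-01: empty input
### NF-01: renders under 100ms
EOF
# Count each criterion class by its heading prefix.
grep -c '^### AC-' acceptance-criteria.md   # acceptance criteria
grep -c '^### EC-' acceptance-criteria.md   # edge cases
grep -c '^### NF-' acceptance-criteria.md   # non-functional
```

The counts only catch obviously empty sections; checking that each spec requirement actually maps to an AC still requires reading both documents.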
## Output

When complete, report to the orchestrator:

- **Design approach** — 3-5 sentence summary of the technical approach
- **Files** — count of files to create and modify
- **Acceptance criteria** — count of ACs, edge cases, and non-functional criteria
- **Key decisions** — list of decisions made (one line each)
- **Open questions** — any unresolved questions that need human input before implementation proceeds

Stay in design mode. Do not write implementation code or tests, and do not make changes to the codebase. Flag ambiguity as open questions rather than resolving it with assumptions.
package/dist/templates/planning-interview.md
@@ -0,0 +1,74 @@

# Planning Interview Guide

## When to Use

This guide is for the **orchestrator** (not a sub-agent). Use it during the interview phase of the planning workflow for **high-complexity** tasks. The orchestrator asks questions one at a time in the active session, not as a batch.

## Prerequisites

Before starting the interview:
- Research phase must be complete
- Read `.plans/{{PLAN_ID}}/research-brief.md` for context
- Have the task description ready: {{TASK_DESCRIPTION}}

## Interview Flow

Ask these questions **one at a time**. Wait for the answer before proceeding. Adapt follow-ups based on responses.

### Core Questions

**1. Problem Framing**
> "What specific problem does this solve? What happens if we don't build it?"

Listen for: clarity of problem statement, urgency, impact scope.

**2. Constraints**
> "What can't change? What existing behavior must be preserved?"

Listen for: backward compatibility requirements, performance constraints, API contracts.

**3. Scope Boundary**
> "What's explicitly out of scope for this work?"

Listen for: feature creep risks, adjacent work that should be separate tasks.

**4. Success Criteria**
> "How will we know this works? What would you test manually?"

Listen for: testable conditions that become acceptance criteria.

**5. Prior Art**
> "Is there existing code that does something similar? Any patterns we should follow or avoid?"

Listen for: reusable code, anti-patterns from past experience.

### Research-Informed Follow-ups

Based on the research brief, ask targeted questions about:
- Knowledge gaps identified in research (ask the human to fill them)
- Competing approaches found (ask which direction the human prefers)
- Risks identified (ask about acceptable trade-offs)

Example: "Research found that [X approach] and [Y approach] are both viable. [X] is simpler but [Y] handles [edge case]. Which direction do you lean?"

### Complexity Reassessment

After the interview, reassess complexity:
- If answers reveal the task is simpler than expected: keep the current complexity level (complexity only goes UP)
- If answers reveal cross-cutting concerns, new patterns, or architectural decisions not in the original description: escalate to high complexity
- If the human identifies prerequisites or decomposition needs: note them for the decompose phase

## Recording Answers

Do NOT write answers to a file. Keep them in orchestrator context for passing to the design agent. The design agent receives interview answers as part of its {{INTERVIEW_ANSWERS}} variable.

## Output

After the interview, the orchestrator should have:
1. Validated assumptions from the research brief
2. Clear scope boundaries
3. Testable success criteria (raw, to be formalized in acceptance-criteria.md)
4. Complexity assessment (confirmed or escalated)
5. Any new research questions to investigate before design

If new research is needed, dispatch a targeted research agent before proceeding to design.
package/dist/templates/planning-research.md
@@ -0,0 +1,114 @@

# Planning Research: {{TASK_ID}}

Explore-type agent gathering context for a planning workflow. Read broadly, produce a structured brief. Do NOT design or implement — only gather information.

---

## Setup

- Plan: `{{PLAN_ID}}`
- Project: `{{PROJECT_PREFIX}}`
- Location: `{{REPO_PATH}}`
- Output: `.plans/{{PLAN_ID}}/research-brief.md`

## Task

{{TASK_DESCRIPTION}}

## Focus Areas

{{RESEARCH_FOCUS}}

## Instructions

Work through each research area below. Spend proportional effort on the focus areas listed above.

### 1. Codebase Exploration

Search the codebase at `{{REPO_PATH}}` for:

- **Related code** — existing features, shared utilities, similar patterns. Use grep/glob to find relevant files.
- **Test patterns** — how are tests structured? What frameworks, helpers, and conventions are used?
- **Configuration** — read CLAUDE.md, eslint config, tsconfig. Note conventions that constrain the design.
- **Recent activity** — `git log --oneline -20 -- <relevant paths>` to see what's been changing nearby.

Record file paths and brief descriptions. Quote key code only when the exact text matters.
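In practice the exploration pass above might look like the following sketch. The feature keyword (`session`) and the `src/` layout are hypothetical; substitute terms from the actual task:

```bash
# Sketch of the exploration pass. "session" is a hypothetical feature
# keyword; REPO_PATH defaults to the current directory for this demo.
REPO_PATH="${REPO_PATH:-.}"
cd "$REPO_PATH"
# Related code: files mentioning the feature keyword
grep -rli "session" src/ 2>/dev/null | head -20 || true
# Test patterns: where tests live and how they are named
find . -name "*.test.*" -not -path "*/node_modules/*" 2>/dev/null | head -10
# Recent activity near the relevant paths
git log --oneline -20 -- src/ 2>/dev/null || true
```

Each hit becomes a "file path plus brief description" line in the brief; quoting whole files is rarely necessary.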
### 2. Documentation Review

Read:

- Project README and CLAUDE.md at `{{REPO_PATH}}`
- Any design docs in `docs/` or `.plans/`
- Inline documentation in related files
- Brain KB: `brain search "<relevant terms>"` (if the brain CLI is available)

### 3. External Research

Search for:

- Best practices for the problem domain
- Similar implementations in open source projects
- Known pitfalls and anti-patterns
- Relevant blog posts, papers, or documentation

Cite sources. Prefer authoritative references (official docs, well-known projects) over random blog posts.

### 4. Produce Research Brief

Create the directory if needed:

```bash
mkdir -p {{REPO_PATH}}/.plans/{{PLAN_ID}}
```

Write your findings to `{{REPO_PATH}}/.plans/{{PLAN_ID}}/research-brief.md` using this structure:

```markdown
# Research Brief: {{TASK_ID}}

Plan: {{PLAN_ID}} | Project: {{PROJECT_PREFIX}}

## Existing Code

- [relevant files with absolute paths and what they do]
- [patterns and conventions observed]
- [test infrastructure and conventions]

## External Findings

- [best practices with sources/links]
- [similar solutions found in open source]
- [relevant research or documentation]

## Knowledge Gaps

- [things you could not determine from available sources]
- [areas where multiple valid approaches exist and a decision is needed]
- [assumptions that need human validation]

## Recommendations

- [specific approaches worth considering, with trade-offs]
- [risks identified and potential mitigations]

## Suggested Interview Questions

Based on the knowledge gaps above, these questions would help clarify the design:

1. [question targeting the most critical knowledge gap]
2. [question about scope or constraints]
3. [question about priorities or trade-offs]
4. [additional questions as needed, aim for 3-5 total]
```

## Output

When complete, report to the orchestrator:

- **Key findings** — 3-5 bullet point summary of the most important discoveries
- **Knowledge gaps** — count and brief list of unresolved questions
- **Interview questions** — the suggested questions from the brief (if any)
- **Research brief path** — absolute path to the written file

Stay in research mode. Do not propose implementations, write code, or make design decisions. Flag ambiguity as knowledge gaps rather than resolving it yourself.
package/dist/templates/planning-spectests.md
@@ -0,0 +1,84 @@

# Planning Spec Tests: {{TASK_ID}}

## Context

- Plan: {{PLAN_ID}}
- Project: {{PROJECT_PREFIX}}
- Location: `{{REPO_PATH}}`
- Test command: `{{TEST_CMD}}`

## Input Artifacts

Read these files before writing any tests:

1. `.plans/{{PLAN_ID}}/acceptance-criteria.md` — the testable conditions to implement
2. `.plans/{{PLAN_ID}}/design.md` — the technical approach (for understanding API shapes and data flow)

## Your Task

Write failing test files that map each acceptance criterion to one or more test cases. These tests are the executable specification — they define what "done" means for the implementation agent.

### Rules

1. **One test per acceptance criterion minimum** — every Given/When/Then in acceptance-criteria.md must have a corresponding test
2. **Arrange-Act-Assert structure** — clear setup, action, assertion in every test
3. **Follow project conventions** — use the same test framework (Vitest), file naming patterns, and import styles as existing tests in the repo
4. **Tests MUST fail** — you are writing tests before the implementation exists. If a test passes, either the feature already exists (check!) or the test is wrong
5. **No implementation code** — do not create source files, stubs, or mocks for the feature being tested. Import paths should reference where the code WILL be (per design.md)
6. **Descriptive test names** — test names describe the scenario, not the method (e.g., "creates active session when user has no existing sessions" not "test createSession")

### Process

For each acceptance criterion in acceptance-criteria.md:

1. Read the criterion
2. Determine which file(s) the test belongs in (based on design.md file paths)
3. Write the test using Arrange-Act-Assert
4. Add edge case tests where the criterion implies boundary conditions

### Test File Placement

Place test files in the project's standard test directory, following existing patterns:
- If tests live in `__tests__/`, put them there
- If tests live alongside source as `*.test.ts`, follow that pattern
- Mirror the source file structure from design.md

## Verification

After writing all test files, run `{{TEST_CMD}}` and confirm failures. Compilation errors from missing imports are expected and acceptable — they prove the tests reference code that does not yet exist.

If any test passes unexpectedly, investigate: the feature may already exist, or the test assertion is wrong. Fix the test so it fails for the right reason.
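Note that the verification gate above inverts the usual expectation: at this stage a zero exit status from the test command is the failure case. A minimal sketch of that check, with `false` standing in for the real `{{TEST_CMD}}`:

```bash
# At the spec-test stage, the test command is EXPECTED to fail.
# "false" is a stand-in for the project's real test command.
TEST_CMD="false"
if $TEST_CMD; then
  echo "UNEXPECTED PASS: feature may already exist or assertions are wrong"
else
  echo "EXPECTED FAIL: spec tests correctly precede implementation"
fi
```

The same inverted check is what a CI gate for this phase would encode.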
## Handoff

After writing and verifying all test files:

1. **Do NOT commit** — these tests are pre-implementation artifacts
2. **Git stash the tests** — run:

```bash
cd {{REPO_PATH}}
git add <test files you created>
git stash push -m "spec-tests-{{PLAN_ID}}" -- <test files>
```

3. **Report** using the output format below

## Output

```
## Spec Test Summary

### Tests Written
- <file path> — <N> tests covering criteria: <list of criteria IDs>

### Coverage
- Total acceptance criteria: <N>
- Criteria with tests: <N>
- Criteria without tests: <N> (explain why if any)

### Stash Reference
- `git stash list` entry: spec-tests-{{PLAN_ID}}

### Notes
- <any assumptions made about API shapes>
- <any criteria that were ambiguous>
```
package/dist/templates/review-agent.md
@@ -0,0 +1,155 @@

# PR Review Agent Template

Blank-slate code review for a GitHub PR. Fill in variables and pass as agent prompt.

---

## Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `{{OWNER}}` | GitHub repo owner | `voltras` |
| `{{REPO}}` | GitHub repo name | `mobile` |
| `{{PR_NUMBER}}` | Pull request number | `42` |
| `{{BRANCH}}` | Feature branch name | `feat/android-mvp` |
| `{{BASE}}` | Base branch to diff against | `main` |
| `{{REPO_PATH}}` | Absolute path to repo on disk | `/Users/hjewkes/Documents/projects/voltras-workspace/voltras/mobile` |
| `{{PROJECT_PREFIX}}` | Project key from project-map.json | `VLT` |
| `{{REVIEW_THRESHOLD}}` | Risk score that triggers human review | `4` |

---

## Agent Prompt

You are reviewing PR #{{PR_NUMBER}} in {{OWNER}}/{{REPO}}.

Branch: `{{BRANCH}}` targeting `{{BASE}}`
Repo path: `{{REPO_PATH}}`
Project: `{{PROJECT_PREFIX}}` | Human review threshold: `{{REVIEW_THRESHOLD}}`

### Step 1: Read the full diff

```bash
cd {{REPO_PATH}} && git fetch origin && git diff {{BASE}}...{{BRANCH}}
```

### Step 2: Read all changed files in full

List changed files:

```bash
cd {{REPO_PATH}} && git diff --name-only {{BASE}}...{{BRANCH}}
```

Read every changed file in its entirety (not just the diff hunks). You need surrounding context to evaluate correctness, naming, architecture, and test adequacy.

### Step 3: Review against checklist

For EVERY finding, assign one of:
- `[FIX]` — Must fix before merge. Explain why.
- `[WON'T FIX: <reason>]` — Acceptable as-is. Explain why.

There is no "suggestion" category. Decide: fix it or justify leaving it.

**Checklist:**

1. **Code quality** — naming, readability, function length (max ~30 lines), dead code, commented-out code
2. **Edge cases** — invalid inputs, null/undefined, boundary conditions, empty collections
3. **Error handling** — failures handled at boundaries, missing guards, swallowed errors
4. **Test coverage** — adequate tests? missing scenarios? tests verify behavior not implementation? Arrange-Act-Assert?
5. **Backward compatibility** — does existing behavior change? are defaults preserved? breaking API changes?
6. **Architecture** — coupling, abstraction quality, separation of concerns, composition over inheritance
7. **Performance** — unnecessary allocations, O(n) where O(1) possible, memory leaks, redundant re-renders
8. **Type safety** — exhaustive switches, unchecked casts, `any` types, missing generics
9. **Security** — injection risks, sensitive data exposure, path traversal, shell injection

### Step 4: Post a GitHub review with inline comments

Separate your findings into two groups:

**A. Line-specific findings** — these become inline comments. Comments can ONLY target lines that appear in the diff (changed lines plus ~3 lines of surrounding context). Use `side: "RIGHT"` for the new file version (99% of cases) and `side: "LEFT"` for deleted lines only.

**B. Structural/architectural findings** — these go in the top-level review body. Include anything that does not map to a specific diff line: missing tests, architectural concerns, naming patterns across files, etc.

Post the review using the verified API pattern below. This is the ONLY reliable method — do not use alternative approaches.

```bash
OWNER="{{OWNER}}"
REPO="{{REPO}}"
PR_NUMBER={{PR_NUMBER}}

echo '{
  "event": "COMMENT",
  "body": "## Code Review\n\nVerdict: <PASS or NEEDS WORK>\n\n### Summary\n<high-level summary of the changes and overall quality>\n\n### Structural Findings\n<any findings that do not map to a specific diff line>\n\n### FIX Items\n<numbered list of all FIX items, or \"None\" if verdict is PASS>",
  "comments": [
    {
      "path": "<relative file path>",
      "line": <line number in the new file>,
      "side": "RIGHT",
      "body": "[FIX] <description of the issue and why it must be fixed>"
    }
  ]
}' | gh api "repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews" \
  -X POST --input -
```

**Critical constraints for the API call:**
- `comments` array can be empty `[]` if all findings are structural
- Each comment `line` must be a line number visible in the diff — not arbitrary file lines
- Use `start_line` and `start_side` for multi-line comment ranges
- `commit_id` is optional and defaults to PR HEAD
- Do NOT use the `pulls/{id}/comments` endpoint — it requires `position` (diff offset), not `line`
- Do NOT use `-f` / `-F` flags for nested objects — use `--input -` with piped JSON
- Submitted reviews (`event: "COMMENT"`) cannot be deleted — double-check before posting
- Build the JSON programmatically if there are many comments to avoid syntax errors
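The last constraint above, building the review JSON programmatically, can be sketched with `jq`. The `findings.tsv` format here (one tab-separated `path`, `line`, `body` per row) is a hypothetical intermediate, not part of the workflow:

```bash
# Hypothetical findings file: path<TAB>line<TAB>comment body, one per row.
printf 'src/foo.ts\t42\t[FIX] missing null check on session lookup\n' > findings.tsv

# Assemble the review payload; jq handles all JSON escaping.
body=$(printf '## Code Review\n\nVerdict: NEEDS WORK')
jq -Rn --arg body "$body" '
  {event: "COMMENT",
   body: $body,
   comments: [inputs | split("\t")
     | {path: .[0], line: (.[1] | tonumber), side: "RIGHT", body: .[2]}]}
' < findings.tsv > review.json
cat review.json
# Post with: gh api "repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews" -X POST --input review.json
```

Letting `jq` construct the payload avoids the quoting and escaping mistakes that hand-written `echo '{...}'` blocks invite once the comment count grows.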
### Step 5: Assess risk level

Score the PR on a 1–5 risk scale:

| Score | Criteria | Examples |
|-------|----------|----------|
| 1 | Docs, comments, test-only, config formatting | README update, adding test cases |
| 2 | Internal refactors, dead code removal, minor dep bumps | Rename internal function, remove unused import |
| 3 | New features within existing patterns, non-breaking additions | New endpoint following existing conventions |
| 4 | API changes, new patterns, cross-cutting changes, user-facing behavior | New state management approach, UI flow change |
| 5 | Security-sensitive, protocol/NDA data, breaking API, publishing | Auth changes, protocol byte changes, npm publish |

**Scoring rules:**
- Use the HIGHEST applicable score across all changes in the PR
- Any change touching files matching `**/protocol/**`, `**/auth/**`, or `**/security/**` is automatically risk 4+
- Publishing workflows (npm publish, app store, release scripts) are automatically risk 5
- If unsure between two levels, round UP
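How the score interacts with `{{REVIEW_THRESHOLD}}` reduces to a simple comparison. This is an illustrative sketch only; the values are hypothetical and the real routing logic lives in `coordination/scripts/review-route.sh`:

```bash
# Illustrative values; in the real flow these come from the review
# output and the {{REVIEW_THRESHOLD}} template variable.
RISK=4
REVIEW_THRESHOLD=4
# A score at or above the threshold triggers human review.
if [ "$RISK" -ge "$REVIEW_THRESHOLD" ]; then
  echo "route: human review required"
else
  echo "route: eligible for automated handling"
fi
```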
### Step 6: Determine verdict

- **PASS** — No `[FIX]` items remain. Code is ready to merge.
- **NEEDS WORK** — One or more `[FIX]` items exist. List them.

### Step 7: Output orchestrator summary

After posting the review, output a structured summary for the orchestrator (not posted to GitHub):

```
## Orchestrator Summary

### Verdict: <PASS or NEEDS WORK>
### Risk: <1-5>

### FIX Items (<count>)
- <file:line> — <brief description>

### High-Complexity Areas (human attention recommended)
- <area and why it needs human review>

### Open Questions
- <any ambiguity about requirements or approach>

### Security Concerns
- <any security issues, or "None">

### Estimated Fix Effort
- <trivial / small / medium — to help orchestrator decide whether to resume agent or re-review>
```

The orchestrator uses the risk score and verdict to route the review — see `coordination/scripts/review-route.sh`.
package/dist/templates/review-fixup.md
@@ -0,0 +1,79 @@

# Review Fixup: {{TASK_ID}}

## Setup

- Location: `{{REPO_PATH}}`
- Branch: `{{BRANCH_NAME}}`
- PR: `{{OWNER}}/{{REPO}}#{{PR_NUMBER}}`
- Build: `{{BUILD_CMD}}` | Test: `{{TEST_CMD}}` | Typecheck: `{{TYPECHECK_CMD}}` | Lint: `{{LINT_CMD}}`

## Step 1: Fetch review comments

```bash
cd {{REPO_PATH}} && git checkout {{BRANCH_NAME}} && git pull origin {{BRANCH_NAME}}
```

```bash
gh api repos/{{OWNER}}/{{REPO}}/pulls/{{PR_NUMBER}}/reviews --jq '.[].body'
gh api repos/{{OWNER}}/{{REPO}}/pulls/{{PR_NUMBER}}/comments --jq '.[] | {path, line, body}'
```

## Step 2: Categorize findings

Parse each comment. Extract items tagged `[FIX]` — these are required changes. Skip items tagged `[WON'T FIX]` or purely informational comments. Build a checklist of actionable fixes with file path, line, and description.
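A sketch of that filtering with `jq`, using inline sample data in the `{path, line, body}` shape from Step 1 (in the real flow the input comes from the `gh api` calls above):

```bash
# Sample comments in the {path, line, body} shape from Step 1.
cat > comments.json <<'EOF'
[{"path":"src/a.ts","line":10,"body":"[FIX] guard against null session"},
 {"path":"src/b.ts","line":5,"body":"[WON'T FIX: intentional] terse naming"}]
EOF
# Keep only required changes, formatted as a fix checklist.
jq -r '.[] | select(.body | startswith("[FIX]")) | "\(.path):\(.line) \(.body)"' comments.json
```

This prints one checklist line per required `[FIX]` item and drops everything else.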
## Step 3: Implement fixes

For each `[FIX]` item:

1. Read the referenced file in full
2. Implement the fix
3. Re-read the file to confirm the change addresses the comment

Stay within the files referenced by review comments. Do not refactor unrelated code.

## Step 4: Verify

Run in order — all must pass:

1. `{{TYPECHECK_CMD}}`
2. `{{TEST_CMD}}`
3. `{{LINT_CMD}}`
4. `npx prettier --check .` — fix with `npx prettier --write .` if needed
5. `{{BUILD_CMD}}`

**Retry policy:** If any step fails, investigate and fix. Try at least **twice** before giving up. On failure after 2 attempts, report: (1) what is failing, (2) what you tried, (3) your theory on the root cause.

## Step 5: Commit

Stage only changed files — do not use `git add -A` or `git add .`.

```bash
cd {{REPO_PATH}}
git add <changed files>
git commit -m "Address review feedback for {{TASK_ID}}"
```

Do NOT push — the orchestrator handles push and follow-up.

## Step 6: Report

```
## Fixup Summary

### Fixed (<count>)
- <file:line> -- <what was fixed>

### Skipped (<count>)
- <file:line> -- <reason skipped (WON'T FIX / informational)>

### Verification
- Typecheck: PASS/FAIL
- Tests: PASS/FAIL (<count> tests)
- Lint: PASS/FAIL
- Prettier: PASS/FAIL
- Build: PASS/FAIL

### Issues
- <any problems encountered, or "None">
```