golem-cc 2.0.0 → 2.1.0

@@ -1,11 +1,11 @@
1
1
  ---
2
2
  name: golem:plan
3
3
  description: Create implementation plan from specs
4
- allowed-tools: [Read, Write, Glob, Grep, Bash]
4
+ allowed-tools: [Read, Write, Glob, Grep, Bash, Task]
5
5
  ---
6
6
 
7
7
  <objective>
8
- Analyze specs versus existing code and create .golem/IMPLEMENTATION_PLAN.md with prioritized, staged tasks.
8
+ Spawn an agent team to analyze specs, explore the codebase from multiple angles, and create a robust implementation plan. Each teammate owns a layer/module, and a devil's advocate challenges the proposed architecture before the plan is finalized.
9
9
  </objective>
10
10
 
11
11
  <execution_context>
@@ -39,54 +39,87 @@ cat .golem/IMPLEMENTATION_PLAN.md 2>/dev/null || echo "No existing plan"
39
39
 
40
40
  <process>
41
41
 
42
- ## 1. Read All Specs
42
+ ## Phase 1: Analyze Specs
43
43
 
44
- Read each file in `.golem/specs/` completely. Extract:
45
- - Concrete requirements (must have, should have)
46
- - Acceptance criteria and tests
47
- - Technical constraints
44
+ Before spawning the team, read all specs and identify:
45
+ - The distinct layers/modules involved
46
+ - Key requirements and constraints
48
47
  - Dependencies between specs
49
48
 
50
- ## 2. Analyze Existing Code
49
+ ## Phase 2: Spawn Planning Team
51
50
 
52
- Search the codebase to understand current state:
53
- - What's already implemented?
54
- - Current architecture and patterns
55
- - Reusable components
56
- - Where do changes need to go?
51
+ Create an agent team based on the layers/modules identified:
57
52
 
58
- ## 3. Gap Analysis
59
-
60
- For each requirement, determine:
61
- - **Done**: Already implemented and tested
62
- - **Partial**: Partially implemented, needs completion
63
- - **Missing**: Not implemented at all
64
- - **Blocked**: Depends on something not yet built
53
+ ```
54
+ Create an agent team to plan implementation for these specs.
55
+ Spawn teammates based on the layers involved, plus a devil's advocate.
56
+
57
+ Example for a typical web app:
58
+ 1. **Frontend Planner** - Owns UI/UX implementation:
59
+ - Component structure
60
+ - State management
61
+ - User interactions
62
+ - Styling approach
63
+
64
+ 2. **Backend Planner** - Owns API/service implementation:
65
+ - Endpoint design
66
+ - Data flow
67
+ - Business logic
68
+ - Error handling
69
+
70
+ 3. **Data Planner** - Owns data layer:
71
+ - Schema changes
72
+ - Migrations
73
+ - Query patterns
74
+ - Data validation
75
+
76
+ 4. **Devil's Advocate** - Challenges the architecture:
77
+ - "This is overengineered for the requirements"
78
+ - "What happens when X fails?"
79
+ - "Why not use existing pattern Y?"
80
+ - "This will be hard to test/maintain"
81
+
82
+ Rules for Devil's Advocate:
83
+ - Must articulate WHY something is problematic
84
+ - Must propose a concrete alternative
85
+ - Must back down when concern is adequately addressed
86
+ - Goal is better architecture, not blocking progress
87
+
88
+ Adjust team composition based on actual project structure.
89
+ For a CLI tool: maybe Parser, Core Logic, Output Formatting.
90
+ For a mobile app: maybe UI, State, Network, Storage.
91
+ ```
65
92
 
66
- ## 4. Generate Staged Tasks
93
+ ## Phase 3: Team Analysis
67
94
 
68
- Create tasks grouped into STAGES. Each stage represents a logical milestone that, when complete, should be squashed into a single commit.
95
+ Each teammate analyzes their layer:
69
96
 
70
- Tasks within a stage:
71
- - Atomic - completable in one focused session
72
- - Testable - has clear verification
73
- - Affects 1-3 files typically
74
- - Minimal dependencies
97
+ ### Layer Planners investigate:
98
+ - What existing code can we reuse?
99
+ - What new components are needed?
100
+ - What's the gap between spec and current state?
101
+ - How does my layer integrate with others?
102
+ - What order should tasks be done?
75
103
 
76
- **Bad task**: "Implement authentication"
77
- **Good task**: "Add password validation function with min length check"
104
+ ### Devil's Advocate challenges:
105
+ - Is this architecture too complex?
106
+ - Are there simpler patterns we should use?
107
+ - What are the failure modes?
108
+ - Will this be maintainable?
109
+ - Are we gold-plating?
78
110
 
79
- ## 5. Prioritize
111
+ ## Phase 4: Cross-Layer Synthesis
80
112
 
81
- Order stages by:
82
- 1. **Critical path** - What blocks other work?
83
- 2. **Dependencies** - What must be built first?
84
- 3. **Risk** - Tackle unknowns early
85
- 4. **Value** - Core functionality before nice-to-haves
113
+ After individual analysis:
114
+ 1. Teammates share their proposed tasks
115
+ 2. Identify dependencies between layers
116
+ 3. Devil's advocate challenges integration points
117
+ 4. Resolve conflicts and simplify where possible
118
+ 5. Agree on stage boundaries
86
119
 
87
- ## 6. Write Plan
120
+ ## Phase 5: Generate Plan
88
121
 
89
- Create `.golem/IMPLEMENTATION_PLAN.md`:
122
+ Synthesize team findings into `.golem/IMPLEMENTATION_PLAN.md`:
90
123
 
91
124
  ```markdown
92
125
  # Implementation Plan
@@ -100,10 +133,15 @@ Based on: .golem/specs/*.md
100
133
  - Completed: 0
101
134
  - Current: Stage 1
102
135
 
136
+ ## Architecture Decisions
137
+ {Key decisions from team discussion}
138
+ {What the Devil's Advocate challenged and how it was resolved}
139
+
103
140
  ---
104
141
 
105
142
  ## Stage 1: {Stage Name}
106
143
  Commit message: {type}({scope}): {description for squash commit}
144
+ Owner: {which layer this primarily affects}
107
145
 
108
146
  ### [ ] 1.1. {Task title}
109
147
  Files: {expected files}
@@ -126,24 +164,42 @@ Files: {files}
126
164
  Notes: {notes}
127
165
 
128
166
  ...
129
- ```
130
167
 
131
- ## 7. Sync Status
168
+ ---
132
169
 
133
- Update ticket status:
134
- ```bash
135
- golem-api ticket:status $TICKET_ID planning --note "Implementation plan created with N stages"
170
+ ## Risks & Mitigations
171
+ {What the Devil's Advocate identified as risks}
172
+ {How we plan to mitigate them}
136
173
  ```
137
174
 
175
+ ## Phase 6: Cleanup & Sync
176
+
177
+ 1. Clean up the agent team
178
+ 2. Update ticket status:
179
+ ```bash
180
+ golem-api ticket:status $TICKET_ID planning --note "Implementation plan created with N stages"
181
+ ```
182
+ 3. Show next steps:
183
+ ```
184
+ Plan complete! Next steps:
185
+ 1. Review .golem/IMPLEMENTATION_PLAN.md
186
+ 2. Run /golem:build to start building
187
+ ```
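The `N` in the ticket note above could be derived from the plan file rather than typed by hand. A minimal sketch, assuming stage headings follow the `## Stage 1: {Stage Name}` template shown earlier; `count_stages` is a hypothetical helper, not part of golem-cc:

```bash
# Hypothetical helper: count "## Stage N:" headings in a plan file.
count_stages() {
  # grep -c prints the number of matching lines (0 when none match).
  grep -cE '^## Stage [0-9]+:' "$1"
}

# Demo against a throwaway plan file (contents are made up for illustration).
plan=$(mktemp)
printf '# Implementation Plan\n\n## Stage 1: Setup\n### [ ] 1.1. Init\n\n## Stage 2: API\n' > "$plan"
echo "Implementation plan created with $(count_stages "$plan") stages"
rm -f "$plan"
```

In real use the argument would be `.golem/IMPLEMENTATION_PLAN.md`, and the result could be interpolated into the `--note` string.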
188
+
138
189
  </process>
139
190
 
140
191
  <success_criteria>
141
192
  - [ ] All specs analyzed
193
+ - [ ] Agent team explored from multiple layer perspectives
194
+ - [ ] Devil's advocate challenged architecture
142
195
  - [ ] Gap analysis completed
143
196
  - [ ] Tasks grouped into logical stages
144
197
  - [ ] Each task is atomic and testable
145
198
  - [ ] Dependencies mapped correctly
199
+ - [ ] Architecture decisions documented
200
+ - [ ] Risks identified and mitigated
146
201
  - [ ] .golem/IMPLEMENTATION_PLAN.md written
202
+ - [ ] Team cleaned up
147
203
  - [ ] No code changes made (planning only)
148
204
  </success_criteria>
149
205
 
@@ -151,6 +207,8 @@ golem-api ticket:status $TICKET_ID planning --note "Implementation plan created
151
207
  - Do NOT implement anything in planning mode
152
208
  - Do NOT modify source code
153
209
  - ONLY create/update .golem/IMPLEMENTATION_PLAN.md
154
- - Stages should represent logical milestones
155
- - Each stage gets squashed to one commit when complete
210
+ - The Devil's Advocate must actively challenge, not just observe
211
+ - Document architecture decisions so we remember WHY
212
+ - Document risks so we're prepared for them
213
+ - Clean up the team when done
156
214
  </important>
@@ -0,0 +1,376 @@
1
+ ---
2
+ name: golem:review
3
+ description: Run parallel code review before PR
4
+ allowed-tools: [Read, Glob, Grep, Bash, Write, Task]
5
+ ---
6
+
7
+ <objective>
8
+ Run automated security scans first, then spawn an agent team for comprehensive code review. Security issues block the review. The agent team reviews from multiple perspectives: security patterns, performance, correctness, and test coverage. A devil's advocate challenges the implementation. Generate a review report and block PR creation until issues are resolved.
9
+ </objective>
10
+
11
+ <context>
12
+ Current ticket:
13
+ ```bash
14
+ TICKET_ID=$(basename "$(pwd)" | grep -oE '(INC|SR)-[0-9]+' || echo "")
15
+ if [ -n "$TICKET_ID" ]; then
16
+ cat .golem/tickets/$TICKET_ID.yaml 2>/dev/null
17
+ fi
18
+ ```
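The ticket lookup above relies on the working directory name containing an `INC-` or `SR-` identifier. A small sketch of that extraction as a reusable function; `ticket_id_from_path` is a hypothetical name, and taking only the first match is an assumption the original does not state:

```bash
# Hypothetical helper: extract the first INC-/SR- ticket ID from a directory path.
ticket_id_from_path() {
  # basename keeps only the final path component; grep -o prints just the match.
  basename "$1" | grep -oE '(INC|SR)-[0-9]+' | head -n 1
}

ticket_id_from_path "/work/INC-1234-login-fix"   # prints INC-1234
```

When the directory name holds no ticket ID the function prints nothing, which matches the empty-string fallback in the original snippet.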
19
+
20
+ Changes to review:
21
+ ```bash
22
+ git diff origin/main..HEAD --stat
23
+ ```
24
+
25
+ Files changed:
26
+ ```bash
27
+ git diff origin/main..HEAD --name-only
28
+ ```
29
+
30
+ Load specs for context:
31
+ ```bash
32
+ for f in .golem/specs/*.md; do [ -e "$f" ] || continue; echo "=== $f ==="; cat "$f"; echo; done 2>/dev/null
33
+ ```
34
+
35
+ Load implementation plan:
36
+ ```bash
37
+ cat .golem/IMPLEMENTATION_PLAN.md 2>/dev/null
38
+ ```
39
+
40
+ Check available security tools:
41
+ ```bash
42
+ echo "Security tools:"
43
+ command -v gitleaks >/dev/null 2>&1 && echo " gitleaks: OK" || echo " gitleaks: MISSING"
44
+ command -v semgrep >/dev/null 2>&1 && echo " semgrep: OK" || echo " semgrep: MISSING"
45
+ command -v trivy >/dev/null 2>&1 && echo " trivy: OK" || echo " trivy: MISSING"
46
+ ```
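The availability check above repeats one line per tool. A possible refactoring, built on the POSIX `command -v` builtin, is sketched here; `check_tool` is a hypothetical helper name:

```bash
# Hypothetical helper: report whether a tool is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "  $1: OK"
  else
    echo "  $1: MISSING"
  fi
}

echo "Security tools:"
for tool in gitleaks semgrep trivy; do
  check_tool "$tool"
done
```

Adding another scanner then only means extending the `for` list.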
47
+ </context>
48
+
49
+ <process>
50
+
51
+ ## Phase 0: Security Gate (BLOCKING)
52
+
53
+ Before any code review, run automated security scans. This gate must pass.
54
+
55
+ ### 0.1 Secrets Scan (gitleaks)
56
+ ```bash
57
+ if command -v gitleaks &> /dev/null; then
58
+ echo "=== SECRETS SCAN ==="
59
+ if [ -f .gitleaks.toml ]; then
60
+ gitleaks detect --config .gitleaks.toml --no-git -v 2>&1
61
+ else
62
+ gitleaks detect --no-git -v 2>&1
63
+ fi
64
+ SECRETS_EXIT=$?
65
+ if [ $SECRETS_EXIT -ne 0 ]; then
66
+ echo "CRITICAL: Secrets detected in codebase!"
67
+ fi
68
+ fi
69
+ ```
70
+
71
+ ### 0.2 SAST Scan (semgrep)
72
+ ```bash
73
+ if command -v semgrep &> /dev/null; then
74
+ echo "=== SAST SCAN ==="
75
+ SAST_OUTPUT=$(semgrep scan --config auto --json 2>/dev/null)  # keep stderr out of the JSON so jq can parse it
76
+ SAST_FINDINGS=$(echo "$SAST_OUTPUT" | jq '.results | length' 2>/dev/null || echo "0")
77
+ echo "Findings: $SAST_FINDINGS"
78
+ if [ "$SAST_FINDINGS" != "0" ] && [ -n "$SAST_FINDINGS" ]; then
79
+ echo "$SAST_OUTPUT" | jq '.results[] | {rule: .check_id, file: .path, line: .start.line, message: .extra.message}' 2>/dev/null
80
+ fi
81
+ fi
82
+ ```
83
+
84
+ ### 0.3 Dependency Scan (pnpm audit)
85
+ ```bash
86
+ if [ -f package.json ]; then
87
+ echo "=== DEPENDENCY SCAN ==="
88
+ DEPS_OUTPUT=$(pnpm audit --json 2>/dev/null)  # keep stderr out of the JSON so jq can parse it
89
+ DEPS_HIGH=$(echo "$DEPS_OUTPUT" | jq '.metadata.vulnerabilities.high // 0' 2>/dev/null || echo "0")
90
+ DEPS_CRITICAL=$(echo "$DEPS_OUTPUT" | jq '.metadata.vulnerabilities.critical // 0' 2>/dev/null || echo "0")
91
+ echo "High: $DEPS_HIGH, Critical: $DEPS_CRITICAL"
92
+ fi
93
+ ```
94
+
95
+ ### 0.4 Security Gate Decision
96
+
97
+ **If ANY security scan fails (secrets found, critical SAST findings, critical deps):**
98
+ ```
99
+ SECURITY GATE: BLOCKED
100
+
101
+ Critical security issues must be resolved before code review.
102
+
103
+ Issues found:
104
+ - {list issues}
105
+
106
+ Fix these issues, then run /golem:review again.
107
+ ```
108
+ **STOP HERE. Do not proceed to agent team review.**
109
+
110
+ **If security scans pass (or tools not available):**
111
+ Proceed to Phase 1.
112
+
113
+ ---
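The gate decision in 0.4 is described in prose only. One way it might be expressed in shell is sketched below. `security_gate` is a hypothetical function, and the second argument assumes SAST findings have already been filtered down to the ones considered blocking, which the scan in 0.2 does not do on its own:

```bash
# Hypothetical helper: decide the gate from three scan results.
#   $1 = gitleaks exit status (0 means no leaks reported)
#   $2 = number of blocking SAST findings (assumed pre-filtered by severity)
#   $3 = number of critical dependency vulnerabilities
security_gate() {
  if [ "$1" -ne 0 ] || [ "$2" -gt 0 ] || [ "$3" -gt 0 ]; then
    echo "BLOCKED"
    return 1
  fi
  echo "PASSED"
}

security_gate 0 0 0 || true    # prints PASSED
security_gate 1 0 0 || true    # prints BLOCKED
```

With the variables from the scans above, a call might look like `security_gate "${SECRETS_EXIT:-0}" "${SAST_FINDINGS:-0}" "${DEPS_CRITICAL:-0}"`, where the `:-0` defaults cover tools that were not installed and therefore never set their variable.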
114
+
115
+ ## Phase 1: Identify Review Scope
116
+
117
+ After security gate passes:
118
+ 1. Get list of all changed files
119
+ 2. Get the full diff
120
+ 3. Load specs to understand intent
121
+ 4. Load plan to understand expected changes
122
+
123
+ ## Phase 2: Spawn Review Team
124
+
125
+ Create an agent team with specialized reviewers:
126
+
127
+ ```
128
+ Create an agent team to review the changes before PR creation.
129
+ Security scans have passed. Now review code quality.
130
+
131
+ Spawn five reviewers:
132
+
133
+ 1. **Security Patterns Reviewer** - Reviews code for security anti-patterns:
134
+ - Auth/authz logic correctness
135
+ - Input validation patterns
136
+ - Output encoding
137
+ - Session management
138
+ - Error handling (no info leakage)
139
+ - Logging (no sensitive data)
140
+ (Note: Automated tools already ran - focus on logic/patterns)
141
+
142
+ 2. **Performance Reviewer** - Identifies performance issues:
143
+ - N+1 queries
144
+ - Missing indexes
145
+ - Memory leaks
146
+ - Unnecessary allocations
147
+ - Blocking operations
148
+ - Missing caching opportunities
149
+ - O(n²) or worse algorithms
150
+
151
+ 3. **Correctness Reviewer** - Validates logic and behavior:
152
+ - Does it match the spec?
153
+ - Edge cases handled?
154
+ - Error handling complete?
155
+ - Race conditions?
156
+ - Null/undefined handling?
157
+ - Off-by-one errors?
158
+
159
+ 4. **Test Reviewer** - Validates test coverage:
160
+ - Are new features tested?
161
+ - Are edge cases covered?
162
+ - Are error paths tested?
163
+ - Test quality (not just quantity)
164
+ - Integration tests where needed?
165
+ - Mocks used appropriately?
166
+
167
+ 5. **Devil's Advocate** - Challenges the implementation:
168
+ - "This is overengineered"
169
+ - "This will be hard to maintain"
170
+ - "This doesn't match the spec"
171
+ - "There's a simpler way to do this"
172
+ - "This will break when X happens"
173
+
174
+ Rules for Devil's Advocate:
175
+ - Must articulate WHY something is problematic
176
+ - Must propose a concrete fix or alternative
177
+ - Must back down when concern is adequately addressed
178
+ - Severity ratings: critical, major, minor, nit
179
+
180
+ Have reviewers work in parallel, share findings, and challenge
181
+ each other. The goal is to catch issues before they hit production.
182
+ ```
183
+
184
+ ## Phase 3: Parallel Review
185
+
186
+ Each reviewer examines the changes from their perspective.
187
+
188
+ ### Security Patterns Reviewer checks:
189
+ - Authentication/authorization logic
190
+ - Input handling patterns
191
+ - Data sanitization
192
+ - Session handling
193
+ - Error messages (no stack traces to users)
194
+ - Audit logging
195
+
196
+ ### Performance Reviewer checks:
197
+ - Database queries and patterns
198
+ - Loop complexity
199
+ - Memory usage patterns
200
+ - Async/await usage
201
+ - Caching strategies
202
+ - Bundle size impact
203
+
204
+ ### Correctness Reviewer checks:
205
+ - Business logic matches specs
206
+ - All acceptance criteria addressed
207
+ - Edge cases handled
208
+ - Error states handled
209
+ - Types are correct
210
+ - Contracts are honored
211
+
212
+ ### Test Reviewer checks:
213
+ - New code has tests
214
+ - Tests actually verify behavior
215
+ - Edge cases tested
216
+ - Error paths tested
217
+ - Tests are maintainable
218
+ - No flaky test patterns
219
+
220
+ ### Devil's Advocate challenges:
221
+ - Complexity vs requirements
222
+ - Maintainability concerns
223
+ - Alternative approaches
224
+ - Future-proofing vs YAGNI
225
+ - Consistency with codebase
226
+
227
+ ## Phase 4: Synthesize Findings
228
+
229
+ After individual reviews:
230
+ 1. Each reviewer reports findings with severity
231
+ 2. Devil's advocate challenges or reinforces findings
232
+ 3. Deduplicate overlapping issues
233
+ 4. Prioritize by severity
234
+
235
+ Severity levels:
236
+ - **Critical**: Must fix before merge (security holes, data loss, crashes)
237
+ - **Major**: Should fix before merge (bugs, missing tests, performance issues)
238
+ - **Minor**: Nice to fix (code style, minor improvements)
239
+ - **Nit**: Optional (naming, formatting, suggestions)
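The deduplicate-and-prioritize steps could be sketched with standard text tools. The `severity|file:line|title` input format and the name `prioritize_findings` are assumptions made for illustration; the source does not define how reviewers exchange findings:

```bash
# Hypothetical helper: read findings on stdin, one per line, as
#   severity|file:line|title
# Drop repeats of the same file:line + title (first occurrence wins), then
# order by severity: critical, major, minor, nit. Unknown severities sort last.
prioritize_findings() {
  awk -F'|' '
    BEGIN { rank["critical"] = 0; rank["major"] = 1; rank["minor"] = 2; rank["nit"] = 3 }
    !seen[$2 FS $3]++ {
      r = ($1 in rank) ? rank[$1] : 9
      print r FS $0            # prefix each line with its numeric rank
    }
  ' | sort -t'|' -k1,1n -s | cut -d'|' -f2-   # stable sort by rank, then strip it
}

printf '%s\n' \
  'minor|a.ts:3|Rename var' \
  'critical|b.ts:10|SQL injection' \
  'minor|a.ts:3|Rename var' \
  'major|c.ts:7|Missing test' | prioritize_findings
```

Titles containing a `|` would break this format, so a real pipeline would likely want a structured format such as JSON instead.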
240
+
241
+ ## Phase 5: Generate Review Report
242
+
243
+ Write `.golem/REVIEW_REPORT.md`:
244
+
245
+ ```markdown
246
+ # Code Review Report
247
+
248
+ Ticket: {INC-XXXX}
249
+ Reviewed: {ISO timestamp}
250
+ Changes: {N} files, +{additions}/-{deletions}
251
+
252
+ ## Security Scans (Automated)
253
+
254
+ | Scan | Status |
255
+ |------|--------|
256
+ | Secrets (gitleaks) | {PASS/SKIPPED} |
257
+ | SAST (semgrep) | {PASS/N findings} |
258
+ | Dependencies (pnpm) | {PASS/N high, M critical} |
259
+
260
+ ## Code Review Summary
261
+
262
+ | Category | Critical | Major | Minor | Nit |
263
+ |----------|----------|-------|-------|-----|
264
+ | Security Patterns | {n} | {n} | {n} | {n} |
265
+ | Performance | {n} | {n} | {n} | {n} |
266
+ | Correctness | {n} | {n} | {n} | {n} |
267
+ | Tests | {n} | {n} | {n} | {n} |
268
+ | Devil's Advocate | {n} | {n} | {n} | {n} |
269
+
270
+ ## Verdict
271
+ {BLOCKED | APPROVED | APPROVED_WITH_COMMENTS}
272
+
273
+ ---
274
+
275
+ ## Critical Issues
276
+ {Must be resolved before merge}
277
+
278
+ ### [{Category}] {Issue title}
279
+ **File**: {path}:{line}
280
+ **Description**: {what's wrong}
281
+ **Risk**: {what could happen}
282
+ **Fix**: {how to fix it}
283
+
284
+ ---
285
+
286
+ ## Major Issues
287
+ {Should be resolved before merge}
288
+
289
+ ### [{Category}] {Issue title}
290
+ **File**: {path}:{line}
291
+ **Description**: {what's wrong}
292
+ **Impact**: {impact}
293
+ **Fix**: {how to fix it}
294
+
295
+ ---
296
+
297
+ ## Minor Issues
298
+ {Nice to have}
299
+
300
+ ### [{Category}] {Issue title}
301
+ **File**: {path}:{line}
302
+ **Suggestion**: {improvement}
303
+
304
+ ---
305
+
306
+ ## Nits
307
+ {Optional improvements}
308
+
309
+ - {nit 1}
310
+ - {nit 2}
311
+
312
+ ---
313
+
314
+ ## What Went Well
315
+ {Positive observations from the review}
316
+
317
+ ## Recommendations
318
+ {Suggestions for future work, not blockers}
319
+ ```
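The `Changes:` line in the report header could be filled from `git diff --numstat`, which prints one tab-separated `added deleted path` row per file and uses `-` for binary files. A minimal sketch; `summarize_numstat` is a hypothetical helper:

```bash
# Hypothetical helper: summarize `git diff --numstat` output read from stdin.
summarize_numstat() {
  awk -F'\t' '
    { files++ }                                   # every row is one changed file
    $1 != "-" { added += $1; deleted += $2 }      # skip counts for binary files
    END { printf "Changes: %d files, +%d/-%d\n", files, added, deleted }
  '
}

# Typical use inside a repository (not run here):
#   git diff origin/main..HEAD --numstat | summarize_numstat
printf '10\t2\tsrc/a.ts\n-\t-\tlogo.png\n3\t0\tsrc/b.ts\n' | summarize_numstat
```

The sample input stands in for real `git` output so the snippet runs outside a repository.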
320
+
321
+ ## Phase 6: Verdict & Next Steps
322
+
323
+ Based on findings:
324
+
325
+ **If Critical or Major issues exist:**
326
+ ```
327
+ Review complete: BLOCKED
328
+
329
+ {N} critical and {M} major issues must be resolved.
330
+ See .golem/REVIEW_REPORT.md for details.
331
+
332
+ Fix the issues, then run /golem:review again.
333
+ ```
334
+
335
+ **If only Minor/Nit issues:**
336
+ ```
337
+ Review complete: APPROVED
338
+
339
+ {N} minor suggestions documented in .golem/REVIEW_REPORT.md
340
+ You may proceed with `golem pr` to create the pull request.
341
+ ```
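The two branches above reduce to a check on the critical and major counts. A sketch of that rule as a function; `review_verdict` is a hypothetical name, and only the two verdicts described in this phase are modeled:

```bash
# Hypothetical helper: $1 = critical issue count, $2 = major issue count.
review_verdict() {
  if [ "$1" -gt 0 ] || [ "$2" -gt 0 ]; then
    echo "BLOCKED"
  else
    echo "APPROVED"
  fi
}

review_verdict 0 2   # prints BLOCKED
review_verdict 0 0   # prints APPROVED
```

Minor and nit counts are deliberately not inputs, since by the rule above they never block the PR.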
342
+
343
+ ## Phase 7: Cleanup
344
+
345
+ 1. Clean up the agent team
346
+ 2. Update ticket status:
347
+ ```bash
348
+ golem-api ticket:status $TICKET_ID review --note "Code review: {verdict}"
349
+ ```
350
+
351
+ </process>
352
+
353
+ <success_criteria>
354
+ - [ ] Security scans passed (gitleaks, semgrep, pnpm audit)
355
+ - [ ] All changed files reviewed by agent team
356
+ - [ ] Security patterns perspective covered
357
+ - [ ] Performance perspective covered
358
+ - [ ] Correctness perspective covered
359
+ - [ ] Test coverage verified
360
+ - [ ] Devil's advocate challenged implementation
361
+ - [ ] Issues categorized by severity
362
+ - [ ] Review report generated
363
+ - [ ] Clear verdict provided
364
+ - [ ] Team cleaned up
365
+ </success_criteria>
366
+
367
+ <important>
368
+ - Security scans run FIRST and block if they fail
369
+ - This is a READ-ONLY review phase - do NOT fix issues
370
+ - Issues must include specific file:line references
371
+ - Issues must include concrete fix suggestions
372
+ - Critical/Major issues BLOCK the PR
373
+ - The Devil's Advocate should be ruthless but fair
374
+ - Document what went WELL, not just problems
375
+ - Clean up the team when done
376
+ </important>