tlc-claude-code 2.1.0 → 2.2.1
- package/.claude/agents/builder.md +144 -0
- package/.claude/agents/planner.md +143 -0
- package/.claude/agents/reviewer.md +160 -0
- package/.claude/commands/tlc/build.md +59 -50
- package/.claude/commands/tlc/review-plan.md +363 -0
- package/.claude/commands/tlc/review.md +155 -53
- package/CLAUDE.md +1 -0
- package/bin/install.js +105 -8
- package/bin/postinstall.js +60 -1
- package/bin/setup-autoupdate.js +206 -0
- package/bin/setup-autoupdate.test.js +124 -0
- package/bin/tlc.js +0 -0
- package/package.json +2 -2
- package/scripts/project-docs.js +1 -1
- package/server/lib/cost-tracker.test.js +49 -12
- package/server/lib/orchestration/agent-dispatcher.js +114 -0
- package/server/lib/orchestration/agent-dispatcher.test.js +110 -0
- package/server/lib/orchestration/orchestrator.js +130 -0
- package/server/lib/orchestration/orchestrator.test.js +192 -0
- package/server/lib/orchestration/tmux-manager.js +101 -0
- package/server/lib/orchestration/tmux-manager.test.js +109 -0
- package/server/lib/orchestration/worktree-manager.js +132 -0
- package/server/lib/orchestration/worktree-manager.test.js +129 -0
- package/server/lib/review/plan-reviewer.js +260 -0
- package/server/lib/review/plan-reviewer.test.js +269 -0
- package/server/lib/review/review-schemas.js +173 -0
- package/server/lib/review/review-schemas.test.js +152 -0
- package/server/setup.sh +271 -271
|
@@ -0,0 +1,363 @@
|
|
|
1
|
+
# /tlc:review-plan - Review a TLC Phase Plan
|
|
2
|
+
|
|
3
|
+
Review a `.planning/phases/{N}-PLAN.md` file for structure, scope, architecture, and completeness. Uses Claude in-session plus Codex (if available) for consensus.
|
|
4
|
+
|
|
5
|
+
## What This Does
|
|
6
|
+
|
|
7
|
+
1. Auto-detects the target plan file (or uses the phase number you provide)
|
|
8
|
+
2. **Runs Claude in-session analysis** against all quality dimensions
|
|
9
|
+
3. **Invokes Codex** (if available in router state) for a second opinion
|
|
10
|
+
4. Combines verdicts — both must APPROVE for overall APPROVED
|
|
11
|
+
5. Generates a structured report with per-provider attribution
|
|
12
|
+
|
|
13
|
+
## Usage
|
|
14
|
+
|
|
15
|
+
```
|
|
16
|
+
/tlc:review-plan # Auto-detect latest PLAN.md
|
|
17
|
+
/tlc:review-plan 7 # Review .planning/phases/7-PLAN.md
|
|
18
|
+
/tlc:review-plan 12 # Review .planning/phases/12-PLAN.md
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
## Process
|
|
22
|
+
|
|
23
|
+
### Step 1: Load Context
|
|
24
|
+
|
|
25
|
+
**Read router state first**, then load the plan file.
|
|
26
|
+
|
|
27
|
+
```javascript
|
|
28
|
+
// 1. Read persistent router state (written by session-init hook)
|
|
29
|
+
const routerState = JSON.parse(fs.readFileSync('.tlc/.router-state.json', 'utf-8'));
|
|
30
|
+
|
|
31
|
+
// 2. Read config for capability mappings
|
|
32
|
+
const config = JSON.parse(fs.readFileSync('.tlc.json', 'utf-8'));
|
|
33
|
+
const reviewProviders = config.router?.capabilities?.['review-plan']?.providers
|
|
34
|
+
|| config.router?.capabilities?.review?.providers
|
|
35
|
+
|| ['claude'];
|
|
36
|
+
|
|
37
|
+
// 3. Filter to only AVAILABLE providers (from state file)
|
|
38
|
+
const availableReviewers = reviewProviders.filter(p =>
|
|
39
|
+
routerState.providers[p]?.available === true
|
|
40
|
+
);
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
**The state file (`.tlc/.router-state.json`) is authoritative for availability.** Never run `which codex` yourself — read the state file. If the state file is missing or stale (>1 hour), probe manually and write a fresh one.
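The missing-or-stale rule above could be sketched as follows. This is a minimal sketch, not the package's implementation: `readRouterState`, its return shape, and the choice to measure staleness by file modification time are all assumptions made for illustration.

```javascript
const fs = require('node:fs');

// One hour, per the ">1 hour" rule in the text above.
const MAX_STATE_AGE_MS = 60 * 60 * 1000;

// Hypothetical helper: returns the parsed state, or a reason why it cannot be trusted.
// Assumption: staleness is judged by the file's modification time.
function readRouterState(statePath, now = Date.now()) {
  let stat;
  try {
    stat = fs.statSync(statePath);
  } catch {
    return { state: null, reason: 'missing' };
  }
  if (now - stat.mtimeMs > MAX_STATE_AGE_MS) {
    return { state: null, reason: 'stale' };
  }
  try {
    return { state: JSON.parse(fs.readFileSync(statePath, 'utf-8')), reason: null };
  } catch {
    return { state: null, reason: 'unreadable' };
  }
}
```

A caller would probe providers and rewrite the file only when `reason` is non-null.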
|
|
44
|
+
|
|
45
|
+
**Locate the plan file:**
|
|
46
|
+
|
|
47
|
+
```bash
|
|
48
|
+
# If phase number provided (e.g., "7"):
|
|
49
|
+
PLAN_FILE=".planning/phases/7-PLAN.md"
|
|
50
|
+
|
|
51
|
+
# If no argument — find the highest-numbered PLAN.md:
|
|
52
|
+
ls .planning/phases/*-PLAN.md | sort -t/ -k3,3n | tail -1
|
|
53
|
+
```
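The same auto-detection can be expressed in JavaScript, which avoids relying on shell sort behaviour. A minimal sketch, assuming plan files are named `{N}-PLAN.md` as described above; `latestPlanFile` is a hypothetical helper name.

```javascript
// Hypothetical helper: pick the highest-numbered "{N}-PLAN.md" from a list of file names.
function latestPlanFile(fileNames) {
  let best = null;
  for (const name of fileNames) {
    const match = /^(\d+)-PLAN\.md$/.exec(name);
    if (!match) continue; // skips DISCUSSION files and anything else
    const phase = Number(match[1]); // numeric compare, so 12 beats 7
    if (best === null || phase > best.phase) best = { phase, name };
  }
  return best; // null when no plan files exist
}
```

It would typically be fed the result of `fs.readdirSync('.planning/phases')`.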
|
|
54
|
+
|
|
55
|
+
Also optionally read for context (if they exist):
|
|
56
|
+
- `.planning/phases/{N}-DISCUSSION.md` — design decisions
|
|
57
|
+
- `PROJECT.md` — project name, tech stack
|
|
58
|
+
- `.planning/ROADMAP.md` — phase dependencies
|
|
59
|
+
|
|
60
|
+
### Step 2: Claude In-Session Review
|
|
61
|
+
|
|
62
|
+
Analyze the plan across five dimensions. For each, produce a list of issues (empty = passing).
|
|
63
|
+
|
|
64
|
+
#### Dimension 1: Structure
|
|
65
|
+
|
|
66
|
+
Check every `### Task N:` block for:
|
|
67
|
+
|
|
68
|
+
| Field | Check | Fail Condition |
|
|
69
|
+
|-------|-------|----------------|
|
|
70
|
+
| `**Goal:**` | Present and non-empty | Missing or blank |
|
|
71
|
+
| `**Files:**` | At least one file listed | No files listed |
|
|
72
|
+
| `**Acceptance Criteria:**` | At least one `- [ ]` checkbox | Missing or empty section |
|
|
73
|
+
| `**Test Cases:**` | At least one bullet | Missing or empty section |
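The table above can be approximated mechanically. The sketch below is illustrative only: `checkStructure` is a hypothetical name, and it checks only that each field label is present, not that the section under it is non-empty.

```javascript
// Required bold field labels, taken from the Structure table.
const REQUIRED_FIELDS = ['Goal', 'Files', 'Acceptance Criteria', 'Test Cases'];

// Hypothetical helper: returns one issue string per missing field label.
function checkStructure(planText) {
  const issues = [];
  // Split before each "### Task N:" heading and drop any preamble.
  const blocks = planText
    .split(/^(?=### Task \d+:)/m)
    .filter(block => block.startsWith('### Task'));
  for (const block of blocks) {
    const title = block.split('\n', 1)[0].replace(/^### /, '');
    for (const field of REQUIRED_FIELDS) {
      if (!block.includes(`**${field}:**`)) {
        issues.push(`${title}: missing **${field}:**`);
      }
    }
  }
  return issues; // empty array = passing
}
```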
|
|
74
|
+
|
|
75
|
+
#### Dimension 2: Scope
|
|
76
|
+
|
|
77
|
+
| Check | Fail Condition |
|
|
78
|
+
|-------|----------------|
|
|
79
|
+
| Task title specificity | Title contains vague words: `system`, `entire`, `all`, `everything`, `complete`, `whole`, `full` |
|
|
80
|
+
| Vertical slice | Task mixes unrelated concerns (e.g., "Add auth AND refactor DB") |
|
|
81
|
+
| No files listed | Task has empty `**Files:**` section |
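The title-specificity check lends itself to a simple word match. A minimal sketch, assuming whole-word, case-insensitive matching (the source does not say how matching is done); `findVagueWords` is a hypothetical name.

```javascript
// Vague words listed in the Scope table.
const VAGUE_WORDS = ['system', 'entire', 'all', 'everything', 'complete', 'whole', 'full'];

// Hypothetical helper: returns the vague words found in a task title.
// Whole-word matching keeps "Install" from tripping on "all".
function findVagueWords(title) {
  const words = title.toLowerCase().match(/[a-z]+/g) || [];
  return VAGUE_WORDS.filter(word => words.includes(word));
}
```

The vertical-slice check in the same table needs judgment about unrelated concerns, so it is left to the reviewer rather than a pattern.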
|
|
82
|
+
|
|
83
|
+
#### Dimension 3: Architecture
|
|
84
|
+
|
|
85
|
+
| Check | Fail Condition |
|
|
86
|
+
|-------|----------------|
|
|
87
|
+
| File line estimates | Any file has `estimated: NNN lines` where NNN > 1000 |
|
|
88
|
+
| Folder density | Any folder listed that already has > 15 files (check disk if accessible) |
|
|
89
|
+
| Module structure | Files dumped into flat `src/services/` or `src/controllers/` with > 5 items |
|
|
90
|
+
|
|
91
|
+
#### Dimension 4: Completeness
|
|
92
|
+
|
|
93
|
+
| Check | Fail Condition |
|
|
94
|
+
|-------|----------------|
|
|
95
|
+
| Prerequisites section | No `## Prerequisites` block in the plan |
|
|
96
|
+
| Dependencies section | No `## Dependencies` block (warn, not fail) |
|
|
97
|
+
| Phase ordering | Prerequisites reference phases that don't exist in ROADMAP (if readable) |
|
|
98
|
+
|
|
99
|
+
#### Dimension 5: TDD Readiness
|
|
100
|
+
|
|
101
|
+
Every task must have testable acceptance criteria:
|
|
102
|
+
|
|
103
|
+
| Check | Fail Condition |
|
|
104
|
+
|-------|----------------|
|
|
105
|
+
| Criteria are checkboxes | Criteria written as prose instead of `- [ ] ...` checkboxes |
|
|
106
|
+
| Criteria are specific | Criterion is too vague to write a test for (e.g., "works correctly") |
|
|
107
|
+
| Test cases listed | `**Test Cases:**` section is absent or has zero items |
|
|
108
|
+
| Test cases cover happy + sad paths | Only happy-path tests listed (warn) |
|
|
109
|
+
|
|
110
|
+
**Scoring:**
|
|
111
|
+
- 0 structure issues + 0 scope issues + 0 architecture issues + 0 completeness issues + all tasks TDD-ready = **APPROVED**
|
|
112
|
+
- Any issues in structure, completeness, or TDD = **CHANGES_REQUESTED**
|
|
113
|
+
- Scope/architecture issues alone = **CHANGES_REQUESTED**
|
|
114
|
+
|
|
115
|
+
### Step 3: Invoke Codex (if available)
|
|
116
|
+
|
|
117
|
+
**CRITICAL: This step runs automatically when Codex is available in router state.**
|
|
118
|
+
|
|
119
|
+
Check availability:
|
|
120
|
+
```javascript
|
|
121
|
+
const codexAvailable = routerState.providers?.codex?.available === true
|
|
122
|
+
&& availableReviewers.includes('codex');
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
If Codex is available, invoke it with `exec` in read-only sandbox mode:
|
|
126
|
+
|
|
127
|
+
```bash
|
|
128
|
+
# Write the plan-review JSON schema to a temp file
|
|
129
|
+
cat > /tmp/plan-review-schema.json << 'EOF'
|
|
130
|
+
{
|
|
131
|
+
"$schema": "http://json-schema.org/draft-07/schema#",
|
|
132
|
+
"type": "object",
|
|
133
|
+
"required": ["verdict", "structureIssues", "scopeIssues", "suggestions"],
|
|
134
|
+
"properties": {
|
|
135
|
+
"verdict": { "type": "string", "enum": ["APPROVED", "CHANGES_REQUESTED"] },
|
|
136
|
+
"structureIssues": { "type": "array", "items": { "type": "string" } },
|
|
137
|
+
"scopeIssues": { "type": "array", "items": { "type": "string" } },
|
|
138
|
+
"suggestions": { "type": "array", "items": { "type": "string" } }
|
|
139
|
+
}
|
|
140
|
+
}
|
|
141
|
+
EOF
|
|
142
|
+
|
|
143
|
+
# Invoke Codex with the plan content
|
|
144
|
+
codex exec -s read-only --output-schema /tmp/plan-review-schema.json \
|
|
145
|
+
"Review this TLC phase plan for quality. Check:
|
|
146
|
+
1. Structure: every task has Goal, Files, Acceptance Criteria (checkboxes), and Test Cases.
|
|
147
|
+
2. Scope: no vague titles, each task is a vertical slice with specific deliverables.
|
|
148
|
+
3. Architecture: no files planned >1000 lines, no folders >15 files.
|
|
149
|
+
4. Completeness: Prerequisites section present, dependencies are explicit.
|
|
150
|
+
5. TDD: every acceptance criterion is a testable, specific checkbox; test cases cover happy and sad paths.
|
|
151
|
+
|
|
152
|
+
Respond with JSON matching the output schema.
|
|
153
|
+
|
|
154
|
+
PLAN CONTENT:
|
|
155
|
+
$(cat .planning/phases/${PHASE_NUMBER}-PLAN.md)"
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
If `codex exec` is not available (older Codex without `exec` subcommand), fall back to:
|
|
159
|
+
|
|
160
|
+
```bash
|
|
161
|
+
codex --print \
|
|
162
|
+
"Review this TLC phase plan. For each issue found, format as:
|
|
163
|
+
- [structure|scope|architecture|completeness|tdd] <description>
|
|
164
|
+
End your response with either: Verdict: APPROVED or Verdict: CHANGES_REQUESTED
|
|
165
|
+
|
|
166
|
+
$(cat .planning/phases/${PHASE_NUMBER}-PLAN.md)"
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
Parse the fallback output using the `Verdict: APPROVED|CHANGES_REQUESTED` pattern.
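Parsing the fallback output might look like the sketch below. It assumes Codex follows the requested line format; `parseFallbackReview` is a hypothetical name, and returning `null` when no verdict line is found is a design choice made here, not documented behaviour.

```javascript
// Hypothetical parser for the plain-text fallback format requested in the prompt above.
function parseFallbackReview(output) {
  // Take the last verdict line, since the prompt asks for it at the end.
  const verdicts = [...output.matchAll(/Verdict:\s*(APPROVED|CHANGES_REQUESTED)/g)];
  const last = verdicts[verdicts.length - 1];

  const issues = [];
  const issueLine = /^- \[(structure|scope|architecture|completeness|tdd)\]\s+(.+)$/gm;
  for (const match of output.matchAll(issueLine)) {
    issues.push({ dimension: match[1], description: match[2].trim() });
  }
  return { verdict: last ? last[1] : null, issues };
}
```

A `null` verdict lets the caller decide whether to retry or to treat the provider as unavailable.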
|
|
170
|
+
|
|
171
|
+
**If Codex is unavailable:** Skip this step. Claude-only review proceeds. Note in the report: `Codex: unavailable (skipped)`.
|
|
172
|
+
|
|
173
|
+
### Step 4: Combine Verdicts
|
|
174
|
+
|
|
175
|
+
```
|
|
176
|
+
Overall verdict rules:
|
|
177
|
+
- ALL available providers approve → APPROVED
|
|
178
|
+
- ANY provider requests changes → CHANGES_REQUESTED
|
|
179
|
+
- Only 1 provider available → that provider's verdict is final
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
Combine issues from all providers. Tag each issue with its source: `[Claude]` or `[Codex]`.
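The verdict rules and issue tagging above can be sketched in a few lines. The input shape (`label`, `available`, `verdict`, `issues`) and the name `combineVerdicts` are assumptions for illustration.

```javascript
// Hypothetical combiner: unanimous approval among available providers is required.
function combineVerdicts(results) {
  const available = results.filter(result => result.available);
  if (available.length === 0) {
    throw new Error('no review providers available');
  }
  const approved = available.filter(result => result.verdict === 'APPROVED').length;
  // Tag each issue with its source, e.g. "[Codex] ...".
  const issues = available.flatMap(result =>
    result.issues.map(issue => `[${result.label}] ${issue}`)
  );
  return {
    verdict: approved === available.length ? 'APPROVED' : 'CHANGES_REQUESTED',
    tally: `${approved}/${available.length}`,
    issues,
  };
}
```

With a single available provider the same rule reduces to "that provider's verdict is final".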
|
|
183
|
+
|
|
184
|
+
### Step 5: Generate Report
|
|
185
|
+
|
|
186
|
+
```markdown
|
|
187
|
+
# Plan Review Report
|
|
188
|
+
|
|
189
|
+
**Plan:** .planning/phases/{N}-PLAN.md
|
|
190
|
+
**Phase:** {N}
|
|
191
|
+
**Tasks:** {count}
|
|
192
|
+
**Date:** {ISO timestamp}
|
|
193
|
+
|
|
194
|
+
## Verdict: APPROVED | CHANGES_REQUESTED
|
|
195
|
+
|
|
196
|
+
## Dimension Results
|
|
197
|
+
|
|
198
|
+
### Structure
|
|
199
|
+
✅ All tasks have Goal, Files, Acceptance Criteria, and Test Cases
|
|
200
|
+
— or —
|
|
201
|
+
❌ 2 tasks missing acceptance criteria
|
|
202
|
+
├── Task 3: "Add OAuth flow" — missing **Acceptance Criteria:** section
|
|
203
|
+
└── Task 5: "Write migration" — missing **Test Cases:** section
|
|
204
|
+
|
|
205
|
+
### Scope
|
|
206
|
+
✅ All task titles are specific and scoped
|
|
207
|
+
— or —
|
|
208
|
+
⚠️ Task 2: "Complete user system" — title too vague (contains "complete", "system")
|
|
209
|
+
|
|
210
|
+
### Architecture
|
|
211
|
+
✅ No files planned over 1000 lines
|
|
212
|
+
✅ No folders over 15 files
|
|
213
|
+
|
|
214
|
+
### Completeness
|
|
215
|
+
✅ Prerequisites section present
|
|
216
|
+
⚠️ No ## Dependencies section (recommended)
|
|
217
|
+
|
|
218
|
+
### TDD Readiness
|
|
219
|
+
✅ All tasks have testable criteria as checkboxes
|
|
220
|
+
— or —
|
|
221
|
+
❌ Task 4 criteria written as prose, not checkboxes
|
|
222
|
+
|
|
223
|
+
## Provider Results
|
|
224
|
+
|
|
225
|
+
✅ Claude: APPROVED
|
|
226
|
+
✅ Codex: APPROVED (GPT-5.2)
|
|
227
|
+
— or —
|
|
228
|
+
❌ Codex: unavailable (skipped)
|
|
229
|
+
|
|
230
|
+
## Combined Issues
|
|
231
|
+
|
|
232
|
+
[Claude] Task 3 missing acceptance criteria
|
|
233
|
+
[Codex] Task 2 scope too broad — split into separate tasks
|
|
234
|
+
|
|
235
|
+
## Action Required
|
|
236
|
+
|
|
237
|
+
1. Add **Acceptance Criteria:** checkboxes to Task 3
|
|
238
|
+
2. Narrow Task 2 title and split "auth" from "profile" concerns
|
|
239
|
+
3. Add test cases covering auth failure paths in Task 1
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
## Example Output
|
|
243
|
+
|
|
244
|
+
### Plan Passes Review
|
|
245
|
+
|
|
246
|
+
```
|
|
247
|
+
/tlc:review-plan 7
|
|
248
|
+
|
|
249
|
+
Loading router state from .tlc/.router-state.json...
|
|
250
|
+
Plan review providers: claude, codex
|
|
251
|
+
|
|
252
|
+
Target plan: .planning/phases/7-PLAN.md (5 tasks)
|
|
253
|
+
|
|
254
|
+
─────────────────────────────────────────
|
|
255
|
+
Claude in-session review...
|
|
256
|
+
─────────────────────────────────────────
|
|
257
|
+
|
|
258
|
+
Structure: ✅ All 5 tasks complete
|
|
259
|
+
Scope: ✅ All titles specific
|
|
260
|
+
Architecture: ✅ No oversized files planned
|
|
261
|
+
Completeness: ✅ Prerequisites present
|
|
262
|
+
TDD: ✅ All criteria as checkboxes
|
|
263
|
+
|
|
264
|
+
Claude verdict: ✅ APPROVED
|
|
265
|
+
|
|
266
|
+
─────────────────────────────────────────
|
|
267
|
+
Invoking Codex (GPT-5.2) for plan review...
|
|
268
|
+
─────────────────────────────────────────
|
|
269
|
+
|
|
270
|
+
Codex verdict: ✅ APPROVED
|
|
271
|
+
- Clear task decomposition
|
|
272
|
+
- Good test coverage planning
|
|
273
|
+
- Prerequisites are explicit
|
|
274
|
+
|
|
275
|
+
Provider Results:
|
|
276
|
+
✅ Claude: APPROVED
|
|
277
|
+
✅ Codex: APPROVED
|
|
278
|
+
|
|
279
|
+
─────────────────────────────────────────
|
|
280
|
+
✅ APPROVED — Plan is ready for /tlc:build (2/2 agree)
|
|
281
|
+
─────────────────────────────────────────
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
### Plan Needs Changes
|
|
285
|
+
|
|
286
|
+
```
|
|
287
|
+
/tlc:review-plan 12
|
|
288
|
+
|
|
289
|
+
Loading router state from .tlc/.router-state.json...
|
|
290
|
+
Plan review providers: claude, codex
|
|
291
|
+
|
|
292
|
+
Target plan: .planning/phases/12-PLAN.md (4 tasks)
|
|
293
|
+
|
|
294
|
+
─────────────────────────────────────────
|
|
295
|
+
Claude in-session review...
|
|
296
|
+
─────────────────────────────────────────
|
|
297
|
+
|
|
298
|
+
Structure: ❌ 2 issues
|
|
299
|
+
├── Task 2: missing **Acceptance Criteria:**
|
|
300
|
+
└── Task 4: missing **Test Cases:**
|
|
301
|
+
Scope: ⚠️ 1 issue
|
|
302
|
+
└── Task 1: "Complete full system" — title too vague
|
|
303
|
+
Architecture: ✅ No oversized files planned
|
|
304
|
+
Completeness: ❌ 1 issue
|
|
305
|
+
└── No ## Prerequisites section
|
|
306
|
+
TDD: ❌ 1 issue
|
|
307
|
+
└── Task 3: criteria written as prose, not checkboxes
|
|
308
|
+
|
|
309
|
+
Claude verdict: ❌ CHANGES_REQUESTED
|
|
310
|
+
|
|
311
|
+
─────────────────────────────────────────
|
|
312
|
+
Invoking Codex (GPT-5.2) for plan review...
|
|
313
|
+
─────────────────────────────────────────
|
|
314
|
+
|
|
315
|
+
Codex verdict: ❌ CHANGES_REQUESTED
|
|
316
|
+
- Task 2 has no way to verify completion
|
|
317
|
+
- Task 1 scope covers too many concerns
|
|
318
|
+
- Missing error-path test cases throughout
|
|
319
|
+
|
|
320
|
+
Provider Results:
|
|
321
|
+
❌ Claude: CHANGES_REQUESTED
|
|
322
|
+
❌ Codex: CHANGES_REQUESTED
|
|
323
|
+
|
|
324
|
+
Combined Issues:
|
|
325
|
+
[Claude] Task 2 missing acceptance criteria
|
|
326
|
+
[Claude] Task 4 missing test cases
|
|
327
|
+
[Claude] Task 1 title too vague ("complete", "system")
|
|
328
|
+
[Claude] No ## Prerequisites section
|
|
329
|
+
[Claude] Task 3 criteria are prose, not checkboxes
|
|
330
|
+
[Codex] Task 2 has no verifiable completion criteria
|
|
331
|
+
[Codex] Task 1 mixes auth, profile, and settings concerns
|
|
332
|
+
[Codex] Missing error/failure path test cases in Tasks 1–3
|
|
333
|
+
|
|
334
|
+
─────────────────────────────────────────────────────
|
|
335
|
+
❌ CHANGES_REQUESTED (0/2 approved)
|
|
336
|
+
|
|
337
|
+
Action required:
|
|
338
|
+
1. Add **Acceptance Criteria:** checkboxes to Task 2
|
|
339
|
+
2. Add **Test Cases:** to Task 4
|
|
340
|
+
3. Rename Task 1 — split auth/profile/settings into separate tasks
|
|
341
|
+
4. Add ## Prerequisites section listing prior phases
|
|
342
|
+
5. Rewrite Task 3 criteria as - [ ] checkboxes
|
|
343
|
+
6. Add sad-path test cases to Tasks 1–3 (Codex)
|
|
344
|
+
─────────────────────────────────────────────────────
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
## Flags
|
|
348
|
+
|
|
349
|
+
| Flag | Description |
|
|
350
|
+
|------|-------------|
|
|
351
|
+
| `--no-external` | Skip Codex, use Claude only |
|
|
352
|
+
| `--codex-only` | Use only Codex (skip Claude in-session deep checks) |
|
|
353
|
+
| `--strict` | Treat warnings (vague titles, missing deps) as failures |
|
|
354
|
+
|
|
355
|
+
## Integration
|
|
356
|
+
|
|
357
|
+
This review runs automatically:
|
|
358
|
+
- At the end of `/tlc:plan` (informational — does not block)
|
|
359
|
+
- Can be run manually before `/tlc:build` to validate a plan
|
|
360
|
+
|
|
361
|
+
## ARGUMENTS
|
|
362
|
+
|
|
363
|
+
$ARGUMENTS
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# /tlc:review - Review Current Branch
|
|
2
2
|
|
|
3
|
-
Review changes on current branch before pushing.
|
|
3
|
+
Review changes on current branch before pushing. **Runs in a loop until clean.**
|
|
4
4
|
|
|
5
5
|
## What This Does
|
|
6
6
|
|
|
@@ -9,7 +9,10 @@ Review changes on current branch before pushing.
|
|
|
9
9
|
3. Checks test coverage for all changed files
|
|
10
10
|
4. Analyzes commit order for TDD compliance
|
|
11
11
|
5. Scans for security issues
|
|
12
|
-
6.
|
|
12
|
+
6. If issues found → **fix them automatically, then re-review**
|
|
13
|
+
7. **Repeats until both Claude and Codex approve** — no arbitrary limit, runs until clean
|
|
14
|
+
|
|
15
|
+
**This is an iterative review loop, not a single pass.** Think of it as RL-mode: keep fixing and re-checking until every provider approves or stuck detection escalates to the user.
|
|
13
16
|
|
|
14
17
|
**This runs automatically at the end of `/tlc:build`.**
|
|
15
18
|
|
|
@@ -192,52 +195,38 @@ For each provider in `reviewProviders` (except `claude` which is the current ses
|
|
|
192
195
|
|
|
193
196
|
**How to invoke:**
|
|
194
197
|
|
|
195
|
-
|
|
198
|
+
**Codex** — use its built-in `review` subcommand. It reads the git diff natively, no need to pipe files:
|
|
199
|
+
|
|
196
200
|
```bash
|
|
197
|
-
#
|
|
198
|
-
|
|
199
|
-
line_count=$(wc -l < /tmp/review-diff-full.patch)
|
|
201
|
+
# Review current branch against main (default)
|
|
202
|
+
codex review --base main "Focus on: bugs, security issues, missing edge cases, test coverage gaps. End with APPROVED or CHANGES_REQUESTED."
|
|
200
203
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
# Split by file for targeted review
|
|
205
|
-
git diff --name-only main...HEAD | while read file; do
|
|
206
|
-
git diff main...HEAD -- "$file" > "/tmp/review-chunk-${file//\//_}.patch"
|
|
207
|
-
done
|
|
208
|
-
|
|
209
|
-
# Or truncate with warning
|
|
210
|
-
head -500 /tmp/review-diff-full.patch > /tmp/review-diff.patch
|
|
211
|
-
echo "... truncated (showing first 500 of $line_count lines)" >> /tmp/review-diff.patch
|
|
212
|
-
else
|
|
213
|
-
cp /tmp/review-diff-full.patch /tmp/review-diff.patch
|
|
214
|
-
fi
|
|
215
|
-
```
|
|
204
|
+
# Review uncommitted changes only
|
|
205
|
+
codex review --uncommitted
|
|
216
206
|
|
|
217
|
-
|
|
207
|
+
# Review a specific commit
|
|
208
|
+
codex review --commit <sha>
|
|
218
209
|
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
|
|
210
|
+
# Custom instructions via prompt
|
|
211
|
+
codex review --base main "Check for TDD compliance: were tests written before implementation? Flag any implementation without corresponding test files."
|
|
212
|
+
```
|
|
222
213
|
|
|
223
|
-
|
|
224
|
-
2. **Bugs**: Identify potential bugs, edge cases, null checks
|
|
225
|
-
3. **Security**: SQL injection, XSS, command injection, auth issues
|
|
226
|
-
4. **Performance**: N+1 queries, unnecessary loops, memory leaks
|
|
227
|
-
5. **Code Quality**: Naming, duplication, complexity, SOLID principles
|
|
228
|
-
6. **Test Coverage**: Are the changes properly tested?
|
|
214
|
+
`codex review` outputs a structured review to stdout. No `--print` flag needed — it's already non-interactive.
|
|
229
215
|
|
|
230
|
-
|
|
231
|
-
- File and line number
|
|
232
|
-
- Severity (critical/high/medium/low)
|
|
233
|
-
- What's wrong
|
|
234
|
-
- How to fix it
|
|
216
|
+
**Gemini** — no built-in review command, so save diff and invoke:
|
|
235
217
|
|
|
236
|
-
|
|
218
|
+
```bash
|
|
219
|
+
# Save diff for Gemini
|
|
220
|
+
git diff main...HEAD > /tmp/review-diff.patch
|
|
221
|
+
line_count=$(wc -l < /tmp/review-diff.patch)
|
|
237
222
|
|
|
238
|
-
|
|
223
|
+
# Truncate if too large
|
|
224
|
+
if [ "$line_count" -gt 500 ]; then
|
|
225
|
+
head -500 /tmp/review-diff.patch > /tmp/review-diff-truncated.patch
|
|
226
|
+
echo "... truncated (showing first 500 of $line_count lines)" >> /tmp/review-diff-truncated.patch
|
|
227
|
+
mv /tmp/review-diff-truncated.patch /tmp/review-diff.patch
|
|
228
|
+
fi
|
|
239
229
|
|
|
240
|
-
# For Gemini - detailed review prompt
|
|
241
230
|
gemini --print "Review this code diff as a senior engineer. Provide:
|
|
242
231
|
- Specific line-by-line feedback
|
|
243
232
|
- Security vulnerabilities with file:line references
|
|
@@ -245,14 +234,10 @@ gemini --print "Review this code diff as a senior engineer. Provide:
|
|
|
245
234
|
- Code quality issues
|
|
246
235
|
- Missing test coverage
|
|
247
236
|
|
|
248
|
-
Be thorough and specific.
|
|
249
|
-
```
|
|
237
|
+
Be thorough and specific. End with APPROVED or CHANGES_REQUESTED.
|
|
250
238
|
|
|
251
|
-
|
|
252
|
-
|
|
253
|
-
- Each file's changes reviewed separately
|
|
254
|
-
- Results aggregated across all chunks
|
|
255
|
-
- Alternative: truncate with warning (shows first 500 lines)
|
|
239
|
+
$(cat /tmp/review-diff.patch)"
|
|
240
|
+
```
|
|
256
241
|
|
|
257
242
|
**Note:** Each CLI has its own syntax. Check `codex --help` and `gemini --help` for exact flags. The `--print` flag outputs the response without interactive mode.
|
|
258
243
|
|
|
@@ -277,7 +262,124 @@ Invoking Codex (GPT-5.2) for review...
|
|
|
277
262
|
- Any CHANGES_REQUESTED = overall CHANGES_REQUESTED
|
|
278
263
|
- Issues from all providers are combined in the report
|
|
279
264
|
|
|
280
|
-
### Step 7:
|
|
265
|
+
### Step 7: Fix-and-Recheck Loop (RL Mode)
|
|
266
|
+
|
|
267
|
+
**This is the core innovation. Reviews are not one-shot — they loop until clean.**
|
|
268
|
+
|
|
269
|
+
After Steps 2-6 complete, if ANY issues were found:
|
|
270
|
+
|
|
271
|
+
```
|
|
272
|
+
┌─────────────────────────────────────────────┐
|
|
273
|
+
│ REVIEW LOOP (runs until clean) │
|
|
274
|
+
│ │
|
|
275
|
+
│ 1. Collect all issues from Claude + Codex │
|
|
276
|
+
│ 2. Fix each issue: │
|
|
277
|
+
│ - Missing tests → write them │
|
|
278
|
+
│ - Security issues → patch the code │
|
|
279
|
+
│ - Coding standards → refactor │
|
|
280
|
+
│ - Codex feedback → apply fixes │
|
|
281
|
+
│ 3. Run tests to verify fixes don't break │
|
|
282
|
+
│ 4. Commit fixes │
|
|
283
|
+
│ 5. Re-run Steps 2-6 (full review again) │
|
|
284
|
+
│ 6. If new issues → loop back to 1 │
|
|
285
|
+
│ 7. If clean → exit loop, generate report │
|
|
286
|
+
└─────────────────────────────────────────────┘
|
|
287
|
+
```
|
|
288
|
+
|
|
289
|
+
**Iteration rules:**
|
|
290
|
+
|
|
291
|
+
1. **No arbitrary limit** — keep looping until BOTH providers return APPROVED. The code isn't done until it's clean.
|
|
292
|
+
2. **Each iteration runs the FULL check** — don't skip steps. Fresh eyes each time.
|
|
293
|
+
3. **Re-invoke Codex each iteration** — `codex review --base main` sees the latest fixes
|
|
294
|
+
4. **Commit after each fix round** — `git commit -m "fix: address review feedback (round N)"`
|
|
295
|
+
5. **Track what was fixed** — maintain a running list for the final report
|
|
296
|
+
6. **Stuck detection** — if the SAME issue appears 3 times after being "fixed", stop and escalate to the user with context on what was tried. This is the only exit condition besides clean.
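Stuck detection from rule 6 could be tracked with a per-issue counter. A minimal sketch: `StuckDetector` is a hypothetical class, and how two findings are recognised as "the same issue" (here, an identical string key) is an assumption the source leaves open.

```javascript
// Hypothetical tracker: counts the rounds in which each issue key has appeared.
class StuckDetector {
  constructor(threshold = 3) {
    this.threshold = threshold; // default mirrors the rule above
    this.seen = new Map(); // issue key -> number of rounds it appeared in
  }

  // Record one round's issues; returns the keys that have reached the threshold.
  recordRound(issueKeys) {
    const stuck = [];
    for (const key of new Set(issueKeys)) { // count each issue once per round
      const count = (this.seen.get(key) || 0) + 1;
      this.seen.set(key, count);
      if (count >= this.threshold) stuck.push(key);
    }
    return stuck;
  }
}
```

A non-empty return value would end the loop and hand the listed issues to the user.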
|
|
297
|
+
|
|
298
|
+
**Fix priority order:**
|
|
299
|
+
1. Security issues (HIGH severity) — fix these first, they block everything
|
|
300
|
+
2. Missing tests — write them before touching implementation
|
|
301
|
+
3. Implementation bugs flagged by Codex — apply fixes
|
|
302
|
+
4. Coding standards (file size, `any` types, return types) — refactor
|
|
303
|
+
5. Style/naming/docs — lowest priority, fix if time permits
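The priority order above amounts to a stable sort over issue categories. In this sketch the category names and `orderFixes` are hypothetical; the `fixAll` option mirrors the documented `--fix-all` flag, which by default skips style issues.

```javascript
// Categories in fix order, highest priority first (names are illustrative).
const FIX_PRIORITY = ['security', 'missing-test', 'bug', 'standards', 'style'];

// Unknown categories sort last rather than first.
function priorityRank(category) {
  const index = FIX_PRIORITY.indexOf(category);
  return index === -1 ? FIX_PRIORITY.length : index;
}

// Hypothetical helper: returns a new array of issues in the order they should be fixed.
function orderFixes(issues, { fixAll = false } = {}) {
  return issues
    .filter(issue => fixAll || issue.category !== 'style')
    .sort((a, b) => priorityRank(a.category) - priorityRank(b.category));
}
```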
|
|
304
|
+
|
|
305
|
+
**What gets auto-fixed vs flagged for human:**
|
|
306
|
+
|
|
307
|
+
| Issue Type | Auto-Fix | Human Review |
|
|
308
|
+
|-----------|----------|--------------|
|
|
309
|
+
| Missing test file | Write it | - |
|
|
310
|
+
| Hardcoded secret | Replace with env var | If unclear what var to use |
|
|
311
|
+
| `any` type | Replace with proper interface | If domain type unclear |
|
|
312
|
+
| File >1000 lines | Split into sub-modules | If split strategy unclear |
|
|
313
|
+
| Security vulnerability | Patch it | If fix might break behavior |
|
|
314
|
+
| Codex-flagged bug | Apply suggestion | If suggestion conflicts with Claude |
|
|
315
|
+
| Merge conflict | - | Always human |
|
|
316
|
+
|
|
317
|
+
**Example iteration:**
|
|
318
|
+
|
|
319
|
+
```
|
|
320
|
+
───────────────────────────────────────
|
|
321
|
+
Review Round 1
|
|
322
|
+
───────────────────────────────────────
|
|
323
|
+
Claude: CHANGES_REQUESTED
|
|
324
|
+
- Missing test for src/utils.js
|
|
325
|
+
- Hardcoded API key in src/config.js
|
|
326
|
+
Codex: CHANGES_REQUESTED
|
|
327
|
+
- Missing null check in src/parser.js:42
|
|
328
|
+
- No error handling in src/api.js:88
|
|
329
|
+
|
|
330
|
+
Fixing 4 issues...
|
|
331
|
+
✅ Created src/utils.test.js (3 tests)
|
|
332
|
+
✅ Replaced hardcoded key with process.env.API_KEY
|
|
333
|
+
✅ Added null check in parser.js:42
|
|
334
|
+
✅ Added try/catch in api.js:88
|
|
335
|
+
✅ All tests pass
|
|
336
|
+
✅ Committed: fix: address review feedback (round 1)
|
|
337
|
+
|
|
338
|
+
───────────────────────────────────────
|
|
339
|
+
Review Round 2
|
|
340
|
+
───────────────────────────────────────
|
|
341
|
+
Claude: CHANGES_REQUESTED
|
|
342
|
+
- src/utils.test.js missing edge case for empty input
|
|
343
|
+
Codex: APPROVED
|
|
344
|
+
|
|
345
|
+
Fixing 1 issue...
|
|
346
|
+
✅ Added empty input edge case test
|
|
347
|
+
✅ All tests pass
|
|
348
|
+
✅ Committed: fix: address review feedback (round 2)
|
|
349
|
+
|
|
350
|
+
───────────────────────────────────────
|
|
351
|
+
Review Round 3
|
|
352
|
+
───────────────────────────────────────
|
|
353
|
+
Claude: APPROVED
|
|
354
|
+
Codex: APPROVED
|
|
355
|
+
|
|
356
|
+
✅ All providers agree — exiting loop.
|
|
357
|
+
───────────────────────────────────────
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
**Stuck detection (only exit besides clean):**
|
|
361
|
+
|
|
362
|
+
```
|
|
363
|
+
───────────────────────────────────────
|
|
364
|
+
Review Round 7
|
|
365
|
+
───────────────────────────────────────
|
|
366
|
+
Claude: APPROVED
|
|
367
|
+
Codex: CHANGES_REQUESTED
|
|
368
|
+
- Complex refactor needed in src/legacy.js ← appeared 3 times
|
|
369
|
+
|
|
370
|
+
⚠️ STUCK: This issue has reappeared 3 times after being fixed.
|
|
371
|
+
Escalating to human review.
|
|
372
|
+
|
|
373
|
+
Attempts made:
|
|
374
|
+
Round 3: Extracted helper function → Codex still flagged
|
|
375
|
+
Round 5: Split into two modules → Codex still flagged
|
|
376
|
+
Round 7: Added interface layer → Codex still flagged
|
|
377
|
+
|
|
378
|
+
This needs human judgment. Run /tlc:review again after manual fix.
|
|
379
|
+
───────────────────────────────────────
|
|
380
|
+
```
|
|
381
|
+
|
|
382
|
+
### Step 8: Generate Report
|
|
281
383
|
|
|
282
384
|
```markdown
|
|
283
385
|
# Code Review Report
|
|
@@ -305,14 +407,11 @@ Invoking Codex (GPT-5.2) for review...
|
|
|
305
407
|
- TDD Score: 75%
|
|
306
408
|
```
|
|
307
409
|
|
|
308
|
-
### Step
|
|
410
|
+
### Step 9: Return Verdict
|
|
309
411
|
|
|
310
|
-
**APPROVED** - All
|
|
412
|
+
**APPROVED** - All providers agree after N iterations. Ready to push/merge.
|
|
311
413
|
|
|
312
|
-
**CHANGES_REQUESTED** -
|
|
313
|
-
- Missing tests → Add tests for flagged files
|
|
314
|
-
- Low TDD score → Consider reordering commits or adding test commits
|
|
315
|
-
- Security issues → Fix flagged patterns
|
|
414
|
+
**CHANGES_REQUESTED** - Stuck detection triggered with issues remaining. Needs human review.
|
|
316
415
|
|
|
317
416
|
## Example Output
|
|
318
417
|
|
|
@@ -433,6 +532,9 @@ Action required:
|
|
|
433
532
|
| `--providers <list>` | Override providers (e.g., `--providers codex,gemini`) |
|
|
434
533
|
| `--codex-only` | Use only Codex for review |
|
|
435
534
|
| `--no-external` | Skip external providers, use Claude only |
|
|
535
|
+
| `--stuck-threshold <N>` | How many times the same issue reappears before escalating to human (default: 3) |
|
|
536
|
+
| `--no-fix` | Single-pass review only — report issues but don't auto-fix |
|
|
537
|
+
| `--fix-all` | Fix even low-priority style issues (default: skip style) |
|
|
436
538
|
|
|
437
539
|
## Integration
|
|
438
540
|
|
package/CLAUDE.md
CHANGED
|
@@ -19,6 +19,7 @@ When the user says X → invoke `Skill(skill="tlc:...")`:
|
|
|
19
19
|
| "plan", "break this down" | `/tlc:plan` |
|
|
20
20
|
| "build", "implement", "add feature" | `/tlc:build` |
|
|
21
21
|
| "review", "check code" | `/tlc:review` |
|
|
22
|
+
| "review plan", "check plan" | `/tlc:review-plan` |
|
|
22
23
|
| "status", "what's next", "where are we" | `/tlc:progress` |
|
|
23
24
|
| "discuss", "talk about approach" | `/tlc:discuss` |
|
|
24
25
|
| "test", "run tests" | `/tlc:status` |
|