create-ccc-tutor 0.1.0
- package/README.md +41 -0
- package/bin/cli.js +76 -0
- package/package.json +28 -0
- package/template/.claude/commands/abandon.md +7 -0
- package/template/.claude/commands/add-anti-flag.md +7 -0
- package/template/.claude/commands/add-constitution-clause.md +7 -0
- package/template/.claude/commands/audit-spec.md +7 -0
- package/template/.claude/commands/commit.md +7 -0
- package/template/.claude/commands/constitution-edit.md +7 -0
- package/template/.claude/commands/db-schema.md +7 -0
- package/template/.claude/commands/exam.md +66 -0
- package/template/.claude/commands/execution-plan.md +7 -0
- package/template/.claude/commands/feature-draft.md +7 -0
- package/template/.claude/commands/handoff.md +7 -0
- package/template/.claude/commands/implement.md +7 -0
- package/template/.claude/commands/init.md +7 -0
- package/template/.claude/commands/next.md +7 -0
- package/template/.claude/commands/offload.md +7 -0
- package/template/.claude/commands/pickup.md +7 -0
- package/template/.claude/commands/recall.md +7 -0
- package/template/.claude/commands/remember.md +7 -0
- package/template/.claude/commands/slide.md +87 -0
- package/template/.claude/commands/spec-finalize.md +7 -0
- package/template/.claude/commands/test-fix.md +7 -0
- package/template/.claude/commands/uninstall.md +7 -0
- package/template/.claude/settings.json +161 -0
- package/template/.claude-plugin/plugin.json +41 -0
- package/template/.codex/config.toml +24 -0
- package/template/.codex/hooks.json +4 -0
- package/template/.codex/install-skills.sh +18 -0
- package/template/.codex/skills/exam/SKILL.md +61 -0
- package/template/.codex/skills/slide/SKILL.md +69 -0
- package/template/.harness/agents/README.md +70 -0
- package/template/.harness/agents/_template/junior-agent-template.md +116 -0
- package/template/.harness/agents/backend-reviewer.md +153 -0
- package/template/.harness/agents/frontend-reviewer.md +158 -0
- package/template/.harness/agents/security-reviewer.md +148 -0
- package/template/.harness/agents/test-fixer.md +147 -0
- package/template/.harness/docs/doc-sync.md +29 -0
- package/template/.harness/docs/git-hygiene.md +56 -0
- package/template/.harness/docs/spec-model.md +47 -0
- package/template/.harness/docs/tool-map.md +120 -0
- package/template/.harness/docs/workflow.md +59 -0
- package/template/.harness/scripts/README.md +70 -0
- package/template/.harness/scripts/auditor-gate.sh +388 -0
- package/template/.harness/scripts/bootstrap-check.sh +103 -0
- package/template/.harness/scripts/budget-monitor.sh +223 -0
- package/template/.harness/scripts/check-prereqs.sh +165 -0
- package/template/.harness/scripts/checkpoint-recall.sh +136 -0
- package/template/.harness/scripts/checkpoint-write.sh +281 -0
- package/template/.harness/scripts/decision-log-append.sh +90 -0
- package/template/.harness/scripts/env-check.sh +286 -0
- package/template/.harness/scripts/format-edit.sh +80 -0
- package/template/.harness/scripts/lint-bans.sh +110 -0
- package/template/.harness/scripts/memory-archive.sh +129 -0
- package/template/.harness/scripts/memory-recall.sh +197 -0
- package/template/.harness/scripts/memory-snapshot.sh +124 -0
- package/template/.harness/scripts/post-migration.sh +58 -0
- package/template/.harness/scripts/precommit-cycles.sh +74 -0
- package/template/.harness/scripts/precommit-typecheck.sh +69 -0
- package/template/.harness/scripts/scratchpad-recall.sh +83 -0
- package/template/.harness/scripts/scratchpad-update.sh +39 -0
- package/template/.harness/scripts/standalone-bootstrap.md +443 -0
- package/template/.harness/skills/abandon/SKILL.md +157 -0
- package/template/.harness/skills/add-anti-flag/SKILL.md +205 -0
- package/template/.harness/skills/add-constitution-clause/SKILL.md +244 -0
- package/template/.harness/skills/audit-spec/SKILL.md +395 -0
- package/template/.harness/skills/commit/SKILL.md +270 -0
- package/template/.harness/skills/constitution-edit/SKILL.md +292 -0
- package/template/.harness/skills/db-schema/SKILL.md +145 -0
- package/template/.harness/skills/db-schema/references/methodology.md +202 -0
- package/template/.harness/skills/execution-plan/SKILL.md +346 -0
- package/template/.harness/skills/feature-draft/SKILL.md +426 -0
- package/template/.harness/skills/handoff/SKILL.md +211 -0
- package/template/.harness/skills/implement/SKILL.md +355 -0
- package/template/.harness/skills/init/SKILL.md +805 -0
- package/template/.harness/skills/next/SKILL.md +245 -0
- package/template/.harness/skills/offload/SKILL.md +134 -0
- package/template/.harness/skills/pickup/SKILL.md +213 -0
- package/template/.harness/skills/recall/SKILL.md +159 -0
- package/template/.harness/skills/remember/SKILL.md +205 -0
- package/template/.harness/skills/spec-finalize/SKILL.md +196 -0
- package/template/.harness/skills/test-fix/SKILL.md +363 -0
- package/template/.harness/skills/uninstall/SKILL.md +370 -0
- package/template/.harness/state/install.json +83 -0
- package/template/AGENTS.md +262 -0
- package/template/CCC_MAGI_LICENSE +201 -0
- package/template/CCC_MAGI_README.md +986 -0
- package/template/CLAUDE.md +658 -0
- package/template/codex.md +39 -0
- package/template/constitution.md +164 -0
- package/template/course/README.md +15 -0
- package/template/course/course_code(example)/exam/README.md +2 -0
- package/template/course/course_code(example)/slide/slide_example-1.pdf +40 -0
- package/template/course/course_code(example)/slide/slide_example-2.pdf +40 -0
- package/template/docs/features/slide-query-implementation.md +79 -0
- package/template/docs/features/slide-query.md +211 -0
- package/template/docs-harness/README.md +42 -0
- package/template/docs-harness/adoption-playbook.md +373 -0
- package/template/docs-harness/ccc-step1-driver-template.md +288 -0
- package/template/docs-harness/cli-configs-README.md +78 -0
- package/template/docs-harness/context-architecture-v2.md +249 -0
- package/template/docs-harness/design-spec.md +437 -0
- package/template/docs-harness/memory-layer.md +135 -0
- package/template/docs-harness/retrospective-notes.md +204 -0
- package/template/gitignore +106 -0
|
@@ -0,0 +1,355 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: implement
|
|
3
|
+
description: This skill should be used at the end of stage 5 of the feature workflow, after the user has implemented the feature per the execution plan. It mechanically picks the required junior reviewers from `git diff` (no self-assessment), runs them in parallel, and on approval invokes a different-model auditor pass on the full diff. Use this always to close stage 5 — the gate prevents reviewer-skip mistakes and shared-model blind spots. Trigger when the user invokes /implement, says "implement this feature", "write the code per plan", "implementation done", "ready for review", or moves from coding to verification.
|
|
4
|
+
argument-hint: [feature-name]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# /implement
|
|
8
|
+
|
|
9
|
+
Drive the close of Stage 5: orchestrate the mechanical review chain on the implementation diff.
|
|
10
|
+
|
|
11
|
+
Implementation itself is creative coding done with main Claude using the execution plan. This skill takes over once implementation is "done" per the user. Two layers of independence remove the implementer-grades-own-work bias:
|
|
12
|
+
|
|
13
|
+
1. **Context-level** — junior reviewer subagents (from `{{junior_reviewers}}`) with fresh contexts.
|
|
14
|
+
2. **Model-level** — auditor ({{auditor_model}}) audits the full diff for what subagents, sharing model priors, may miss together.
|
|
15
|
+
|
|
16
|
+
Reviewer selection is **mechanical from the diff**, never self-assessed.
|
|
17
|
+
|
|
18
|
+
> *Constitutional basis: Constitution § 1 (cross-model audit is mandatory).*
|
|
19
|
+
|
|
20
|
+
## Authoritative sources
|
|
21
|
+
|
|
22
|
+
1. `{{spec_dir}}<feature>-plan.md` — the execution plan (Stage 4 artifact)
|
|
23
|
+
2. `{{spec_dir}}<feature>.md` — the **CEO spec** (plain language, canonical intent)
|
|
24
|
+
3. `{{implementation_dir}}<feature>-implementation.md` — manager-domain notes (when present)
|
|
25
|
+
4. `.harness/agents/` — junior reviewer subagent definitions (mechanical rule enforcement only; judgment is the auditor's job)
|
|
26
|
+
5. `.harness/scripts/auditor-gate.sh` — the auditor review gate
|
|
27
|
+
6. `AGENTS.md` (root) — auditor standing context, including `{{anti_flag_rules}}`
|
|
28
|
+
7. Root `CLAUDE.md` § Workflow — Stage 5 flow
|
|
29
|
+
|
|
30
|
+
## Lane awareness
|
|
31
|
+
|
|
32
|
+
The auditor's prompt and review depth depend on the lane:
|
|
33
|
+
|
|
34
|
+
- **Full workflow** (default) — full auditor review on the diff (Step 6 below).
|
|
35
|
+
- **Stability-fix** — full auditor review at Step 6, plus a **mandatory failing-test-first procedural check at Step 0** (see below). The check halts the skill if the diff contains a fix without a corresponding failing test that was confirmed to fail pre-fix.
|
|
36
|
+
- **Trivial-change** — auditor runs in Quick mode (BLOCKING-only): security holes, data loss, outright defects only. No advisory or strong items. Use the Quick prompt in Step 6.
|
|
37
|
+
|
|
38
|
+
If the lane is unclear, ask the CEO before proceeding.
|
|
39
|
+
|
|
40
|
+
## Step 0 — Stability-fix lane: failing-test-first enforcement (conditional)
|
|
41
|
+
|
|
42
|
+
**Run this step only when the lane is stability-fix.** Skip for full workflow and trivial-change.
|
|
43
|
+
|
|
44
|
+
This step enforces the test-first ordering required by root `CLAUDE.md` § Lanes (stability-fix lane). Without it, a manager can apply a fix and forget the failing test — the precise failure mode this rule exists to prevent.
|
|
45
|
+
|
|
46
|
+
The procedural sequence the user must have followed before invoking `/implement` on the stability-fix lane:
|
|
47
|
+
|
|
48
|
+
1. Bug analyzed; root-cause hypothesis written down.
|
|
49
|
+
2. **Failing test authored first** — added to a test file per the project's `{{test_framework}}` convention, with a `// Verifies scenario X.Y` comment tying it to a CEO-spec scenario.
|
|
50
|
+
3. Test confirmed to fail on the broken (pre-fix) code — the user runs `{{test_runner_command}} <test-path>` and watches it fail.
|
|
51
|
+
4. Fix applied to source.
|
|
52
|
+
5. Test confirmed to pass on the fixed code.
|
|
53
|
+
|
|
54
|
+
Then `/implement` is invoked.
|
|
55
|
+
|
|
56
|
+
The skill verifies steps 2–5 before proceeding to the standard reviewer chain. **Capture the baseline first**:
|
|
57
|
+
|
|
58
|
+
```bash
BASELINE=<sha> # last commit before stability-fix work began; ask user once if ambiguous

git diff "$BASELINE" --name-only > /tmp/implement-stab-files.txt
# Use {{test_framework}}'s file pattern to distinguish tests from source:
grep -E '\.test\.(ts|tsx|js|jsx|py|go|rs)$|_test\.(go|py)$|test_.*\.(py)$' /tmp/implement-stab-files.txt > /tmp/implement-stab-tests.txt || true
grep -vE '\.test\.(ts|tsx|js|jsx|py|go|rs)$|_test\.(go|py)$|test_.*\.(py)$' /tmp/implement-stab-files.txt > /tmp/implement-stab-source.txt || true
```
|
|
66
|
+
|
|
67
|
+
**Mechanical check 1 — diff must include a test file.** If `/tmp/implement-stab-tests.txt` is empty but `/tmp/implement-stab-source.txt` is non-empty, halt:
|
|
68
|
+
|
|
69
|
+
> "Stability-fix lane requires a failing test written before the fix (per CLAUDE.md § Lanes). The diff contains source changes (`<list>`) but no new or modified test file. Halt — write the failing test first, confirm it fails on the broken code, then re-apply the fix and re-invoke `/implement`. Do NOT proceed to the reviewer chain until the test is in the diff."
|
|
70
|
+
|
|
71
|
+
**Mechanical check 2 — at least one test in the diff carries `// Verifies scenario X.Y`.** Inspect `git diff "$BASELINE" -- $(cat /tmp/implement-stab-tests.txt)` for the comment pattern. If no test in the diff carries it, halt:
|
|
72
|
+
|
|
73
|
+
> "Stability-fix lane requires the failing test to reference a CEO-spec scenario via `// Verifies scenario X.Y`. The diff has test changes but none carry the comment. Halt — add the comment to the test that exercises the regression, then re-invoke `/implement`."
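
The two mechanical checks above could be scripted roughly as follows. This is a minimal sketch, not the skill's actual implementation: the function names `stab_check_tests_present` and `stab_check_scenario_comment` are hypothetical, and reading `X.Y` as "digits, dot, digits" is an assumption.

```bash
# Hypothetical helpers for Step 0's mechanical checks (names are illustrative).

# Check 1: return 1 when source files changed but no test file did.
# $1 = file listing changed test paths, $2 = file listing changed source paths.
stab_check_tests_present() {
  if [ ! -s "$1" ] && [ -s "$2" ]; then
    echo "HALT: source changes but no new or modified test file" >&2
    return 1
  fi
}

# Check 2: return 1 unless a surviving (non-removed) diff line carries the comment.
# Reads unified-diff text on stdin. Assumes X.Y means digits.digits.
stab_check_scenario_comment() {
  if grep -v '^-' | grep -Eq '// Verifies scenario [0-9]+\.[0-9]+'; then
    return 0
  fi
  echo "HALT: no test in the diff carries '// Verifies scenario X.Y'" >&2
  return 1
}
```

A caller would pass the two list files captured earlier to the first function and pipe the test-file diff into the second. Removed lines are ignored so that deleting the comment does not count as carrying it.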
|
|
74
|
+
|
|
75
|
+
**User confirmation — pre-fix failure was observed.** After both mechanical checks pass, ask the user once:
|
|
76
|
+
|
|
77
|
+
> "Stability-fix procedural confirmation:
|
|
78
|
+
>
|
|
79
|
+
> - The failing test in this diff (`<test path>::<test name>`) was run against the broken code BEFORE the fix was applied — and it failed?
|
|
80
|
+
> - After the fix was applied, that same test passes now?
|
|
81
|
+
>
|
|
82
|
+
> Both `yes` → proceed to reviewer chain.
|
|
83
|
+
> Either `no` → halt; revert the fix locally, run the test, watch it fail, then re-apply the fix and re-invoke `/implement`."
|
|
84
|
+
|
|
85
|
+
If the user answers `no` to either, halt. Do not proceed.
|
|
86
|
+
|
|
87
|
+
If the user answers `yes` to both, record the confirmation in the working state and proceed to Step 1.
|
|
88
|
+
|
|
89
|
+
**Anti-loophole.** "I forgot to write the test but the fix is obvious" is not an exception — write the test now, even after the fact, and confirm it fails on the broken code by reverting the fix locally to a temp branch first. The point of test-first is the _demonstration that the test catches the bug_, not the chronological order. If the user pushes back on this, surface the rule once, then accept the user's override only after they explicitly state "override; recording in commit body."
|
|
90
|
+
|
|
91
|
+
## Invocation
|
|
92
|
+
|
|
93
|
+
- Typical: `/implement <feature-name>`
|
|
94
|
+
- `$ARGUMENTS` identifies the feature
|
|
95
|
+
|
|
96
|
+
If `$ARGUMENTS` is not provided, identify the feature from context (recent edits matching `{{feature_folder_pattern}}`, the most recent execution plan). Ask the user if it is ambiguous.
|
|
97
|
+
|
|
98
|
+
## Step 1 — Identify the baseline
|
|
99
|
+
|
|
100
|
+
The diff to review is `git diff <baseline>` where baseline is the commit before Stage 5 implementation began.
|
|
101
|
+
|
|
102
|
+
Conventions:
|
|
103
|
+
|
|
104
|
+
- If Stage 3 produced a migration commit, baseline = that commit
|
|
105
|
+
- Otherwise baseline = the commit immediately before Stage 5 started (typically the plan commit, or the prior feature's final commit)
|
|
106
|
+
- If the working tree has uncommitted changes (Stage 5 changes typically remain uncommitted until `/commit` at Stage 8), they are part of the diff: use `git diff <baseline>` (not `git diff <baseline>..HEAD`)
|
|
107
|
+
|
|
108
|
+
If the baseline is ambiguous, ask the user: "what's the last commit before you started implementation?" Do not guess.
|
|
109
|
+
|
|
110
|
+
Capture: `BASELINE=<sha>`.
|
|
111
|
+
|
|
112
|
+
## Step 2 — Determine touched layers
|
|
113
|
+
|
|
114
|
+
Inspect:
|
|
115
|
+
|
|
116
|
+
```bash
# Quote the baseline ref so an empty or unusual value fails loudly instead of silently changing the diff
git diff --stat "$BASELINE" > /tmp/implement-diffstat.txt
git diff "$BASELINE" > /tmp/implement-fulldiff.txt
```
|
|
120
|
+
|
|
121
|
+
Determine required junior reviewers **mechanically** from path + content. The mapping is defined at /init time based on `{{junior_reviewers}}` and the project's `{{client_code_paths}}` / `{{backend_code_paths}}`. Typical patterns:
|
|
122
|
+
|
|
123
|
+
- **Frontend reviewer** is required if any path matches `{{client_code_paths}}`.
|
|
124
|
+
- **Backend reviewer** is required if any path matches `{{backend_code_paths}}` (skipped entirely on projects without a backend).
|
|
125
|
+
- **Security / privacy reviewer** is required when the diff touches auth, access-control, or PII-bearing code. Specific triggers are declared by the security reviewer's subagent definition (e.g., access-control predicate keywords, auth-feature paths, migrations that add columns to PII-bearing tables).
|
|
126
|
+
|
|
127
|
+
If the diff has no relevant paths, stop and report: "no reviewable changes detected; was implementation actually done?"
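
The path-based part of this selection can be sketched as below. It is an illustration only: `select_reviewers`, `CLIENT_PATHS_RE`, and `BACKEND_PATHS_RE` are hypothetical names, the default regexes are made-up stand-ins for `{{client_code_paths}}` and `{{backend_code_paths}}`, and the security reviewer is left out because its triggers live in its own subagent definition.

```bash
# Hypothetical stand-ins for {{client_code_paths}} / {{backend_code_paths}},
# written as extended regexes. Real values are filled in at /init time.
CLIENT_PATHS_RE=${CLIENT_PATHS_RE:-'^(src/app|src/components)/'}
BACKEND_PATHS_RE=${BACKEND_PATHS_RE:-'^(server|api)/'}

# Reads changed paths on stdin, one per line.
# Prints "reviewer<TAB>first matching path" for each required reviewer.
# Returns 1 when no path matched, i.e. there is nothing reviewable.
select_reviewers() {
  awk -v client="$CLIENT_PATHS_RE" -v backend="$BACKEND_PATHS_RE" '
    $0 ~ client  && !("frontend-reviewer" in hit) { hit["frontend-reviewer"] = $0; n++ }
    $0 ~ backend && !("backend-reviewer" in hit)  { hit["backend-reviewer"] = $0; n++ }
    END {
      if ("frontend-reviewer" in hit) printf "frontend-reviewer\t%s\n", hit["frontend-reviewer"]
      if ("backend-reviewer" in hit)  printf "backend-reviewer\t%s\n", hit["backend-reviewer"]
      exit (n == 0)
    }'
}
```

Feeding it `git diff "$BASELINE" --name-only` yields both the reviewer set and the path that triggered each one, which is the reason the user is shown before spawn.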
|
|
128
|
+
|
|
129
|
+
Surface to the user the list of required reviewers and the reason each was selected (which path or content match triggered it). The user may not opt out of any selected reviewer because selection is mechanical; the user only confirms that the diff baseline and reviewer set look right before spawn.
|
|
130
|
+
|
|
131
|
+
**Wait for user response before continuing.**
|
|
132
|
+
|
|
133
|
+
## Step 3 — Verify version-sensitive APIs
|
|
134
|
+
|
|
135
|
+
Claude's training data has an effective cutoff, so Claude can confidently suggest stale APIs for libraries that have moved past it. Stage 4 should have flagged version-sensitive decisions with `• verified:` annotations in the plan, but implementation drift happens.
|
|
136
|
+
|
|
137
|
+
**Spot-check the diff** for uses of libraries in `{{high_trap_libraries}}` that lack a `• verified:` note in `{{spec_dir}}<feature>-plan.md`.
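
The spot-check above could be approximated with a helper like the one below. This is a rough sketch under stated assumptions: `unverified_libs` is a hypothetical name, and it assumes a library counts as verified when some plan line contains both `• verified:` and the library name, which the plan format may not guarantee.

```bash
# Prints each library that appears on an added diff line but has no
# "• verified:" plan line naming it. Hypothetical helper.
# Args: <diff-file> <plan-file> <library>...
unverified_libs() {
  local diff_file=$1 plan_file=$2 lib
  shift 2
  for lib in "$@"; do
    # '^\+[^+]' keeps added lines and skips the '+++ b/...' file headers.
    # -F treats the library name as a fixed string, not a regex.
    if grep -E '^\+[^+]' "$diff_file" | grep -Fq -- "$lib"; then
      if ! grep -F -- '• verified:' "$plan_file" | grep -Fq -- "$lib"; then
        echo "$lib"
      fi
    fi
  done
}
```

Each name it prints is a candidate for the `context7` lookup described in this step; an empty result means every high-trap library in the diff already has a note.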
|
|
138
|
+
|
|
139
|
+
For each unverified use, query `context7` for the relevant library and confirm the call matches the current API. If it doesn't match (a prop that no longer exists, a method signature change, an import path moved), halt and surface to the user — fix before invoking reviewers. If it does match, annotate the plan retroactively with the `• verified:` note.
|
|
140
|
+
|
|
141
|
+
**Skip** for: language primitives, stable framework patterns, project-internal code, and patterns the codebase already exercises correctly elsewhere — those are stronger evidence than docs.
|
|
142
|
+
|
|
143
|
+
If `context7` is unreachable or unhelpful for a specific library, do NOT guess. Surface "unverified API in implementation: `<library>` — recommend manual check" and halt.
|
|
144
|
+
|
|
145
|
+
**Canonical-source escalation.** `context7` is a third-party docs mirror and can be incomplete. If a lookup conclusion would drive a _destructive change_ (removing config, deleting a feature, renaming a field, contradicting working code) or is _negative_ ("X doesn't exist"), `context7` alone is insufficient. Also fetch the canonical source via WebFetch (official docs site or upstream GitHub README), and surface the canonical URL in the report. Treating absence-from-mirror as proof-of-absence is the failure mode this rule prevents. Adding new code based on a lookup needs one source; removing or contradicting working code based on a lookup needs canonical confirmation.
|
|
146
|
+
|
|
147
|
+
## Step 4 — Spawn junior reviewers in parallel
|
|
148
|
+
|
|
149
|
+
For each required reviewer, invoke via the Task tool (`subagent_type: "<reviewer>"`).
|
|
150
|
+
|
|
151
|
+
**Spawn all required reviewers in a single message with multiple Task tool calls** so they run concurrently. Sequential invocation wastes time and lets earlier verdicts color how you frame later prompts.
|
|
152
|
+
|
|
153
|
+
Construct each reviewer's prompt to include:
|
|
154
|
+
|
|
155
|
+
- The baseline ref (so they can run `git diff` themselves if they want)
|
|
156
|
+
- The relevant subset of paths from `/tmp/implement-diffstat.txt`
|
|
157
|
+
- An instruction to read `{{spec_dir}}<feature>.md` and `<feature>-plan.md` for context
|
|
158
|
+
|
|
159
|
+
Do NOT pass:
|
|
160
|
+
|
|
161
|
+
- Your own interpretation of the diff
|
|
162
|
+
- "I believe this is correct" framing
|
|
163
|
+
- Pre-summary of what the diff does (let the reviewer read it themselves)
|
|
164
|
+
|
|
165
|
+
Pass artifacts and criteria. Not interpretation.
|
|
166
|
+
|
|
167
|
+
## Step 5 — Surface reviewer verdicts
|
|
168
|
+
|
|
169
|
+
Read each verdict report verbatim. Do not summarize, filter, or aggregate.
|
|
170
|
+
|
|
171
|
+
For each reviewer, surface to the user:
|
|
172
|
+
|
|
173
|
+
- The verdict line (`PASS` / `CONCERNS` / `FAIL` / `BLOCK` / `ESCALATE: <other-reviewer>`)
|
|
174
|
+
- Blocking findings (if any)
|
|
175
|
+
- Warnings and advisory items (if any)
|
|
176
|
+
|
|
177
|
+
Branch:
|
|
178
|
+
|
|
179
|
+
- **Any reviewer returns `FAIL` or `BLOCK`** — halt. Stage 5 is incomplete. The user fixes the flagged issues and re-invokes `/implement`.
|
|
180
|
+
- **Any reviewer returns `ESCALATE: <other>`** — verify `<other>` was already in the required-reviewers set. If yes, its verdict is already in this batch; proceed to the next branch. If no, spawn `<other>` now and surface its verdict before proceeding.
|
|
181
|
+
- **All required reviewers return `PASS` or `CONCERNS`** — proceed to Step 6. Surface any `CONCERNS` warnings explicitly so the CEO can weigh them before commit.
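
The branching above reduces to a small decision function. The sketch below is illustrative only: `next_action` is a hypothetical name, verdicts are assumed to arrive as bare verdict lines, and treating an unrecognized verdict as a halt is an assumption rather than something this skill specifies.

```bash
# Decide the next action from junior-reviewer verdict lines. Hypothetical helper.
# $1 = space-separated list of reviewers already in the required set
# $2.. = one verdict per argument, e.g. "PASS" or "ESCALATE: security-reviewer"
# Prints "halt", "spawn <reviewer>...", or "proceed".
next_action() {
  local required=" $1 " v target missing=""
  shift
  # FAIL or BLOCK from any reviewer halts, regardless of the other verdicts.
  for v in "$@"; do
    case $v in
      FAIL|BLOCK) echo "halt"; return 0 ;;
    esac
  done
  for v in "$@"; do
    case $v in
      PASS|CONCERNS) ;;
      "ESCALATE: "*)
        target=${v#"ESCALATE: "}
        case $required in
          *" $target "*) ;;                 # already part of this batch
          *) missing="$missing $target" ;;  # must be spawned now
        esac ;;
      *) echo "halt"; return 0 ;;           # unknown verdict: halt (assumption)
    esac
  done
  if [ -n "$missing" ]; then echo "spawn$missing"; else echo "proceed"; fi
}
```

After a `spawn` result, the new reviewer's verdict would be added to the list and the function called again, so a late `FAIL` still halts.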
|
|
182
|
+
|
|
183
|
+
## Step 6 — Auditor review pass (judgment layer)
|
|
184
|
+
|
|
185
|
+
Invoke the gate. Pick the prompt by lane.
|
|
186
|
+
|
|
187
|
+
**Full workflow / Stability-fix lane** — full review with adversarial preset.
|
|
188
|
+
|
|
189
|
+
The preset (`.harness/scripts/auditor-prompts/adversarial.md`) wraps the focus text below with adversarial-review framing: skepticism stance, attack-surface checklist (auth, data loss, idempotency, races, partial failure, schema drift, observability), and "prefer one strong finding over filler" calibration. The focus text below carries the stage-specific guardrails the preset doesn't know about.
|
|
190
|
+
|
|
191
|
+
```bash
DIFF_FILE=$(mktemp /tmp/implement-auditor-diff.XXXXXX)
git diff "$BASELINE" > "$DIFF_FILE"

AUDITOR_GATE_PRESET=adversarial \
AUDITOR_GATE_TARGET_LABEL="<feature> Stage 5 implementation diff" \
bash .harness/scripts/auditor-gate.sh review <feature> 5 \
"Review the implementation diff against the CEO spec at {{spec_dir}}<feature>.md, the implementation notes at {{implementation_dir}}<feature>-implementation.md (if present), and the plan at {{spec_dir}}<feature>-plan.md. Junior reviewers from {{junior_reviewers}} have already approved project-rule conformance; they are mechanical rule reviewers, not judgment. Your job is the judgment layer: catch what shared-model planning + rule-conformance review miss together. Beyond the preset's attack surface, also weigh: alternative approaches that meaningfully reduce risk; hidden assumptions in the implementation.

**Spec-vs-reality match (mandatory axis).** The CEO spec at {{spec_dir}}<feature>.md is a behavioral document written for the final decision maker (the CEO, who reads it end-to-end at smoke time). It describes what the feature does from a user-facing perspective: what the user sees, what they can do, what happens to their data, what guarantees the product makes. It does NOT describe implementation mechanism, and is BANNED from doing so by CLAUDE.md's two-file model. Your audit must respect that boundary.

Read the spec end-to-end (not just sections touched by this diff). Flag a sentence ONLY when:
- It asserts a user-observable behavior the code provably doesn't deliver (timing, atomicity boundary, recovery path the user can actually take, what an attempted action returns, what gets scrubbed / cascaded / revoked from the user's perspective, what the user sees after the action), OR
- It asserts a guarantee (\"either everything commits or nothing does\", \"the user is signed out on every device\") that the code doesn't enforce.

Do NOT flag:
- Plain-language vocabulary that doesn't map 1:1 to a code identifier. If the user-facing meaning is correct, the wording is correct.
- Sentences that omit implementation mechanism. The CEO spec is supposed to omit mechanism; absence of jargon is not absence of behavior.
- Wording that could be tightened toward technical precision; that would break the two-file model.

This axis exists because spec wording often gets touched outside this diff's scope and unaudited behavioral drift compounds silently across rounds. It does NOT exist to police plain-language imprecision.

Do NOT flag: anti-flag rules in AGENTS.md (already reviewed by junior reviewers), formatting (formatter handles it), naming preferences, refactor opinions, or suggestions for additional test coverage (Stage 6 is for that)." \
"$DIFF_FILE"
```
|
|
216
|
+
|
|
217
|
+
**Trivial-change lane** — Quick mode (BLOCKING-only):
|
|
218
|
+
|
|
219
|
+
```bash
DIFF_FILE=$(mktemp /tmp/implement-auditor-diff.XXXXXX)
git diff "$BASELINE" > "$DIFF_FILE"

bash .harness/scripts/auditor-gate.sh review <feature> 5-trivial \
"Review this trivial-change diff. Per CLAUDE.md's trivial-change lane, this is < 20 LOC, no new feature surface, no schema change, no new dependency, no intent change. Run in Quick mode: report ONLY items that meet the BLOCKING bar: security holes, data loss, outright defects. Do NOT flag: code style, refactor opportunities, alternative approaches, naming, advisory items, suggestions for additional tests, anything that wouldn't block a normal commit. If you find non-trivial concerns, that's a signal the lane is misclassified. Say so explicitly so the user can re-classify; do NOT silently flag them as advisory. If nothing meets BLOCKING, return PASS." \
"$DIFF_FILE"
```
|
|
227
|
+
|
|
228
|
+
Read the gate's exit code:
|
|
229
|
+
|
|
230
|
+
- **Exit 0 (PASS / CONCERNS / WAIVED)** — surface `✓ Stage 5 complete: implementation reviewed by [list of junior reviewers] + {{auditor_model}} cross-model audit.` Mention any advisory items. For CONCERNS, also surface the logged warning path (`.harness/audits/concerns-*.json`) and remind the CEO to review before commit. For WAIVED, surface the `waiver_reason`. Stage 5 complete; user may proceed to `/test-fix <feature>`.
|
|
231
|
+
- **Exit 2 (FAIL)** — surface every blocking item from the auditor verbatim, halt. The user addresses, re-invokes `/implement`.
|
|
232
|
+
- **Exit 1 (script error / Universal Core WAIVED rejected / missing waiver_reason / legacy verdict)** — surface stderr, halt.
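
The exit-code handling above amounts to a three-way branch. A minimal sketch follows; `gate_outcome` and the `FEATURE` / `PROMPT` variables in the comment are hypothetical, and mapping any code other than 0 or 2 to the error branch is an assumption, since only 0, 1, and 2 are described here.

```bash
# Map the auditor gate's exit code to a Stage 5 outcome. Hypothetical helper.
gate_outcome() {
  case $1 in
    0) echo "complete" ;;        # PASS / CONCERNS / WAIVED
    2) echo "halt-blocking" ;;   # FAIL: surface blocking items verbatim
    *) echo "halt-error" ;;      # 1, or anything unexpected: surface stderr
  esac
}

# Example wiring (not executed here; FEATURE and PROMPT are placeholders):
#   bash .harness/scripts/auditor-gate.sh review "$FEATURE" 5 "$PROMPT" "$DIFF_FILE"
#   outcome=$(gate_outcome "$?")
```

Keeping the error branch as the default means an unexpected exit code halts instead of being mistaken for approval.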
|
|
233
|
+
|
|
234
|
+
## Trust contract
|
|
235
|
+
|
|
236
|
+
- **Reviewer selection is mechanical from `git diff`.** Skill has no judgment authority to skip a reviewer "because the diff looks fine."
|
|
237
|
+
- **Verdicts are surfaced verbatim.** No "reviewer says it's fine, proceeding."
|
|
238
|
+
- **The auditor always runs once the junior reviewers pass.** No skipping the audit because "the diff looks clean." (Constitution § 1.)
|
|
239
|
+
- **On disagreement** (junior reviewers PASS, auditor FAILs): auditor wins by default. Surface both views; user overrides explicitly if they disagree.
|
|
240
|
+
- **Verdict semantics.** The auditor gate's four verdicts are `PASS` (advance silently), `CONCERNS` (advance with logged warning at `.harness/audits/concerns-*.json` for CEO commit-time review), `FAIL` (halt), and `WAIVED` (CEO override only; rejected by the gate if any blocking item cites Universal Core).
|
|
241
|
+
|
|
242
|
+
## Completion criteria
|
|
243
|
+
|
|
244
|
+
Stage 5 is complete when:
|
|
245
|
+
|
|
246
|
+
- Every required junior reviewer returned `PASS` or `CONCERNS` (`FAIL` or `BLOCK` halts)
|
|
247
|
+
- `.harness/scripts/auditor-gate.sh` returned exit 0 for Stage 5 (`PASS`, `CONCERNS`, or `WAIVED`)
|
|
248
|
+
- `.harness/state/auditor-approvals/<feature>-stage5.json` exists with a non-FAIL `verdict`
|
|
249
|
+
|
|
250
|
+
The user should be able to proceed to `/test-fix <feature>` immediately after.
|
|
251
|
+
|
|
252
|
+
## Anti-patterns the skill blocks
|
|
253
|
+
|
|
254
|
+
- Implementer self-assessing whether a reviewer is needed → mechanical from `git diff --stat`
|
|
255
|
+
- Implementer rationalizing past a reviewer's FAIL → halt, no auto-proceed
|
|
256
|
+
- Skipping the auditor because "junior reviewers already approved" → auditor unconditional
|
|
257
|
+
- Treating auditor disagreement as noise → halt, user explicitly overrides if needed
|
|
258
|
+
- Sequential reviewer spawn (lets earlier verdicts color later prompts) → all required reviewers in a single parallel batch
|
|
259
|
+
|
|
260
|
+
---
|
|
261
|
+
|
|
262
|
+
## Checkpoint + decision-log integration (MAGI Archivist) — including mid-flight
|
|
263
|
+
|
|
264
|
+
Stage 5 differs from other stages: it can take hours and edit many files. MAGI Archivist tracks progress **at the file level**, not just at stage end. This is what enables `/pickup` to pick up at "5/8 files done" instead of "Stage 5 incomplete".
|
|
265
|
+
|
|
266
|
+
### At Stage 5 START — declare the file plan
|
|
267
|
+
|
|
268
|
+
After reading the execution plan but before writing any code:
|
|
269
|
+
|
|
270
|
+
```bash
# Count plan bullets that name a file (optionally wrapped in backticks).
# Backticks are literal inside single quotes; escaping them as \` changes the regex meaning in GNU grep.
TOTAL_FILES=$(grep -cE '^[[:space:]]*-[[:space:]]+`?[^[:space:]`]+' docs/features/<feature>-plan.md | tr -d ' ')
.harness/scripts/checkpoint-write.sh \
  --feature <feature-slug> \
  --stage 5 \
  --stage-in-progress "$(jq -nc --argjson total "$TOTAL_FILES" '{stage_number:5, files_total:$total, files_done_list:[], files_remaining_list:[], last_action:"Stage 5 started", resume_hint:"Run /implement to begin"}')"
```
|
|
277
|
+
|
|
278
|
+
### MID-FLIGHT — after each file's Edit/Write completes
|
|
279
|
+
|
|
280
|
+
Update TWO things in parallel (so /pickup + the visible todolist both stay accurate):
|
|
281
|
+
|
|
282
|
+
**1. Checkpoint (for /pickup):**
|
|
283
|
+
```bash
# Call this AFTER every file the implementer fully completes (not on partial edits)
.harness/scripts/checkpoint-write.sh \
  --feature <feature-slug> \
  --file-done <path/to/file>
```
|
|
289
|
+
|
|
290
|
+
This makes `/pickup` reports show: *"3/8 files done — continue at src/auth/middleware.ts (next)"*.
|
|
291
|
+
|
|
292
|
+
**2. Visible TodoList (for CEO real-time visibility):**
|
|
293
|
+
|
|
294
|
+
**For Claude Code** (`CLAUDE_PROJECT_DIR` set): use the built-in **TaskUpdate tool** (the same task list /execution-plan populated). For each file you complete:
|
|
295
|
+
- BEFORE writing: call `TaskUpdate` with `taskId: <id-of-file's-task>` and `status: "in_progress"`
|
|
296
|
+
- AFTER writing: call `TaskUpdate` with `status: "completed"`
|
|
297
|
+
|
|
298
|
+
The CEO sees the sidebar update live: green checks march down the list as you write each file.
|
|
299
|
+
|
|
300
|
+
**For other CLIs** (Codex / Cursor / Gemini / etc.): update `.harness/state/workflow-checkpoints/<feature>.todo.md` (the markdown file /execution-plan created):
|
|
301
|
+
|
|
302
|
+
```bash
# Mark in-progress
sed -i.bak "s|^- \[ \] \*\*N.\*\* \`<file>\`|- [~] **N.** \`<file>\`|" .harness/state/workflow-checkpoints/<feature>.todo.md && rm .harness/state/workflow-checkpoints/<feature>.todo.md.bak

# After done: mark completed
sed -i.bak "s|^- \[~\] \*\*N.\*\* \`<file>\`|- [x] **N.** \`<file>\`|" .harness/state/workflow-checkpoints/<feature>.todo.md && rm .harness/state/workflow-checkpoints/<feature>.todo.md.bak
```
|
|
309
|
+
|
|
310
|
+
(Use `perl -i -pe` if sed -i is fragile across platforms.)
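
As another way to sidestep `sed -i` differences, the same update can be done with `awk` and a temp file. The helper below is a sketch, not part of the harness: `mark_todo` is a hypothetical name, and it assumes todo lines look exactly like ``- [ ] **N.** `path` `` as in the commands above. It matches the line prefix as a literal string, so dots and slashes in the path need no escaping (paths containing backslashes are not handled).

```bash
# Flip the checkbox mark on one todo line. Hypothetical helper.
# usage: mark_todo <todo-file> <item-number> <path> <from-mark> <to-mark>
#   marks: ' ' (pending), '~' (in progress), 'x' (completed)
mark_todo() {
  local file=$1 n=$2 path=$3 from=$4 to=$5 tmp
  tmp=$(mktemp) || return 1
  # index(...) == 1 means the line starts with the literal old prefix.
  awk -v old="- [${from}] **${n}.** \`${path}\`" \
      -v new="- [${to}] **${n}.** \`${path}\`" '
    index($0, old) == 1 { $0 = new substr($0, length(old) + 1) }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, `mark_todo "$TODO" 3 src/auth/middleware.ts ' ' '~'` marks item 3 in progress and `mark_todo "$TODO" 3 src/auth/middleware.ts '~' 'x'` completes it, where `TODO` is the path to the feature's `.todo.md` file.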
|
|
311
|
+
|
|
312
|
+
**Why both visibility mechanisms**: Claude Code's native task list (updated via `TaskUpdate`) gives the slickest UX in Claude. The markdown todolist gives all other CLIs a usable fallback. Both stay in sync with the checkpoint, so /pickup works regardless of CLI.
|
|
313
|
+
|
|
314
|
+
### At Stage 5 END — close out + audit verdict
|
|
315
|
+
|
|
316
|
+
After all files are complete and both the reviewer chain and the auditor gate have passed:
|
|
317
|
+
|
|
318
|
+
```bash
.harness/scripts/checkpoint-write.sh \
  --feature <feature-slug> \
  --stage 6 \
  --stage-complete 5 \
  --stage-in-progress 'null' \
  --append-audit "$(jq -c '{stage:5, verdict, risk:.risk_score, at:(now|todate)}' .harness/state/auditor-approvals/<feature>-stage5.json)"

# Log any escalation or override:
.harness/scripts/decision-log-append.sh \
  --feature <feature-slug> --stage 5 --by "CEO" \
  --decision "<e.g. 'override frontend-reviewer false positive on FlashList ref'>"
```
|
|
331
|
+
|
|
332
|
+
**Without mid-flight tracking, a crash at file 4/8 makes `/pickup` think Stage 5 hasn't started.**
|
|
333
|
+
|
|
334
|
+
---
|
|
335
|
+
|
|
336
|
+
## Final message to CEO (natural-language, Stage 5 → Stage 6)
|
|
337
|
+
|
|
338
|
+
After Stage 5 completes (all files implemented + reviewer chain + auditor verdict), display (in CEO's OS locale):
|
|
339
|
+
|
|
340
|
+
```
✅ Stage 5 complete: the code for <feature> is written
Changes: N files (all ✅ in the todolist)
MAGI Reviewer chain: <X reviewers ran, all passed / N concerns>
MAGI Verdict: <PASS/CONCERNS/WAIVED>, risk = M

Next, you can:
👉 "continue" / "run tests": I run Stage 6 (write tests automatically + run them)
(requires test_required to be enabled in your project)
👉 "show git diff first": I list the changes
👉 "manual test first": you try it yourself, then come back and run /test-fix
👉 "fix bug X": I fix the specified issue
👉 "abandon": git stash, then abandon this feature
```

On "continue" → invoke `/test-fix` silently (if `test_required = true`), or directly suggest a CEO smoke test (Stage 7) if tests are skipped.
|