@su-record/vibe 2.9.21 → 2.9.22
- package/CLAUDE.md +4 -3
- package/commands/vibe.contract.md +29 -29
- package/commands/vibe.regress.md +20 -20
- package/commands/vibe.run.md +6 -6
- package/commands/vibe.spec.md +6 -6
- package/commands/vibe.test.md +96 -0
- package/commands/vibe.verify.md +9 -9
- package/package.json +1 -1
- package/skills/vibe-contract/SKILL.md +58 -58
- package/skills/vibe-regress/SKILL.md +94 -94
- package/skills/vibe-spec/SKILL.md +12 -12
- package/skills/vibe-test/SKILL.md +247 -0
package/CLAUDE.md
CHANGED
|
@@ -69,8 +69,9 @@ No `console.log` in commits · No hardcoded strings/numbers · No commented-out
|
|
|
69
69
|
`/vibe.spec` is the single entry point — orchestrates interview → plan → spec → review → `/vibe.run` → `/vibe.verify` → `/vibe.contract` → `/vibe.trace`. For UI types (website/webapp/mobile), `/vibe.figma` branches in parallel. Smart Resume detects existing `.claude/vibe/{interviews,plans,specs}/*.md` to skip phases.
|
|
70
70
|
|
|
71
71
|
**Quality-loop commands** (bug → prevention):
|
|
72
|
-
- `/vibe.regress` —
|
|
73
|
-
- `/vibe.contract` — API
|
|
72
|
+
- `/vibe.regress` — Regression test auto-evolution. Auto-registers on `/vibe.verify` failure; `generate` produces preventive tests; `cluster` promotes recurring patterns.
|
|
73
|
+
- `/vibe.contract` — API contract drift detection. Compares the contract extracted from the SPEC against the implementation; P1 drift auto-propagates to `/vibe.regress`.
|
|
74
|
+
- `/vibe.test` — vibe self-test across the CC ↔ coco harnesses. Subcommands: `parity` (static), `report` (runtime), `compare` (diff). P1 drift auto-propagates to `/vibe.regress`. Recommended before every release.
|
|
74
75
|
|
|
75
76
|
| Task Size | Approach |
|
|
76
77
|
|---|---|
|
|
@@ -98,7 +99,7 @@ No `console.log` in commits · No hardcoded strings/numbers · No commented-out
|
|
|
98
99
|
|
|
99
100
|
## Git
|
|
100
101
|
|
|
101
|
-
**Include**: `.claude/vibe/{plans,specs,features,todos,research,regressions,contracts}/`, `.claude/vibe/config.json`, `CLAUDE.md`
|
|
102
|
+
**Include**: `.claude/vibe/{plans,specs,features,todos,research,regressions,contracts,test-reports}/`, `.claude/vibe/config.json`, `CLAUDE.md`
|
|
102
103
|
**Exclude**: `~/.claude/{rules,commands,agents,skills}/`, `.claude/settings.local.json`
|
|
103
104
|
|
|
104
105
|
<!-- VIBE:END -->
|
|
package/commands/vibe.contract.md
CHANGED
|
@@ -5,57 +5,57 @@ argument-hint: "extract | check | diff [feature-name]"
|
|
|
5
5
|
|
|
6
6
|
# /vibe.contract
|
|
7
7
|
|
|
8
|
-
**API Contract Drift Detection** —
|
|
8
|
+
**API Contract Drift Detection** — when implementation diverges from the SPEC's API contract, catch it immediately.
|
|
9
9
|
|
|
10
|
-
> SPEC
|
|
10
|
+
> The SPEC is the source of truth. If the implementation silently leaves the SPEC, tests can pass while the contract breaks.
|
|
11
11
|
|
|
12
12
|
## Usage
|
|
13
13
|
|
|
14
14
|
```
|
|
15
|
-
/vibe.contract extract <feature> # SPEC
|
|
16
|
-
/vibe.contract check <feature> #
|
|
17
|
-
/vibe.contract diff <feature> #
|
|
15
|
+
/vibe.contract extract <feature> # SPEC → contract record at .claude/vibe/contracts/<feature>.md
|
|
16
|
+
/vibe.contract check <feature> # contract vs implementation, drift report
|
|
17
|
+
/vibe.contract diff <feature> # changed fields since last check
|
|
18
18
|
```
|
|
19
19
|
|
|
20
20
|
## What counts as an "API contract"
|
|
21
21
|
|
|
22
|
-
|
|
22
|
+
A contract = any **interface shape** that external consumers (clients, other services) depend on:
|
|
23
23
|
|
|
24
24
|
- HTTP endpoint: method + path + request schema + response schema + status codes
|
|
25
25
|
- GraphQL: query/mutation name + args + return shape
|
|
26
|
-
-
|
|
27
|
-
-
|
|
26
|
+
- Event/message: topic + payload schema
|
|
27
|
+
- Exported TypeScript function signature (when explicitly marked as public API)
|
|
28
28
|
|
|
29
29
|
## Process
|
|
30
30
|
|
|
31
31
|
Load skill `vibe-contract` with subcommand: `$ARGUMENTS`
|
|
32
32
|
|
|
33
|
-
|
|
33
|
+
**Core steps**:
|
|
34
34
|
|
|
35
|
-
1. **extract**: SPEC
|
|
36
|
-
2. **check**:
|
|
37
|
-
3. **diff**:
|
|
35
|
+
1. **extract**: parse SPEC sections like `## API` / `## Endpoints` / `## Interface` and persist as a structured contract record
|
|
36
|
+
2. **check**: locate matching endpoints in the implementation, compare signature/schema, report drift as P1 findings
|
|
37
|
+
3. **diff**: compare against the previous snapshot, surface only **changed fields** (noise minimized)
|
|
38
38
|
|
|
39
39
|
## Drift severity
|
|
40
40
|
|
|
41
|
-
| Drift type | Severity |
|
|
41
|
+
| Drift type | Severity | Example |
|
|
42
42
|
|---|---|---|
|
|
43
|
-
| Missing endpoint | P1 | SPEC
|
|
44
|
-
| Missing required field in response | P1 | SPEC response
|
|
43
|
+
| Missing endpoint | P1 | SPEC says `GET /users/:id`, implementation has none |
|
|
44
|
+
| Missing required field in response | P1 | SPEC response includes `email`, implementation drops it |
|
|
45
45
|
| Type change (breaking) | P1 | `userId: number` → `userId: string` |
|
|
46
|
-
| Added required request field | P1 |
|
|
47
|
-
| Added optional field | P3 |
|
|
48
|
-
| Status code added | P2 |
|
|
49
|
-
| Status code removed | P1 |
|
|
46
|
+
| Added required request field | P1 | breaks existing clients |
|
|
47
|
+
| Added optional field | P3 | extension is allowed |
|
|
48
|
+
| Status code added | P2 | client must handle a new case |
|
|
49
|
+
| Status code removed | P1 | expected response disappeared |
|
|
50
50
|
|
|
51
|
-
**P1
|
|
51
|
+
**On any P1 drift**: treat as failure regardless of `/vibe.verify` outcome — tests can pass while the contract breaks.
|
|
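The severity table and the P1 rule above can be sketched as a lookup plus a pass/fail check. This is a minimal illustration, not the package's implementation; the `DriftKind` identifiers and the function name are hypothetical labels for the table rows.

```typescript
// Severity levels as used in the drift table.
type Severity = "P1" | "P2" | "P3";

// One identifier per table row (names are illustrative assumptions).
type DriftKind =
  | "missing-endpoint"
  | "missing-required-response-field"
  | "breaking-type-change"
  | "added-required-request-field"
  | "added-optional-field"
  | "status-code-added"
  | "status-code-removed";

const SEVERITY: Record<DriftKind, Severity> = {
  "missing-endpoint": "P1",
  "missing-required-response-field": "P1",
  "breaking-type-change": "P1",
  "added-required-request-field": "P1",
  "added-optional-field": "P3",
  "status-code-added": "P2",
  "status-code-removed": "P1",
};

// Any P1 drift fails the check, even when every scenario test passed.
function contractCheckPasses(drifts: DriftKind[]): boolean {
  return !drifts.some((kind) => SEVERITY[kind] === "P1");
}
```

P2 and P3 findings would still be reported as warnings; only P1 flips the outcome.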
52
52
|
|
|
53
53
|
## Storage Format
|
|
54
54
|
|
|
55
55
|
```
|
|
56
56
|
.claude/vibe/contracts/
|
|
57
|
-
<feature>.md #
|
|
58
|
-
<feature>.snapshot.md #
|
|
57
|
+
<feature>.md # extracted contract (SSOT)
|
|
58
|
+
<feature>.snapshot.md # implementation snapshot at last check (for diff)
|
|
59
59
|
```
|
|
60
60
|
|
|
61
61
|
### Contract schema (frontmatter)
|
|
@@ -81,24 +81,24 @@ endpoints:
|
|
|
81
81
|
|
|
82
82
|
## Integration with /vibe.verify
|
|
83
83
|
|
|
84
|
-
`/vibe.verify <feature>`
|
|
84
|
+
After `/vibe.verify <feature>` scenarios pass, auto-chain:
|
|
85
85
|
|
|
86
86
|
```
|
|
87
87
|
scenarios pass → /vibe.contract check <feature>
|
|
88
88
|
├─ no drift → ✅ complete
|
|
89
|
-
└─ drift found → ❌ report + auto
|
|
89
|
+
└─ drift found → ❌ report + auto /vibe.regress register (tag: integration)
|
|
90
90
|
```
|
|
91
91
|
|
|
92
92
|
## Integration with /vibe.spec
|
|
93
93
|
|
|
94
|
-
`/vibe.spec`
|
|
94
|
+
Right after `/vibe.spec` finishes writing the SPEC, auto-invoke `/vibe.contract extract`. The resulting contract becomes the reference for the subsequent `/vibe.run`.
|
|
95
95
|
|
|
96
96
|
## Done Criteria
|
|
97
97
|
|
|
98
|
-
- [ ] `extract
|
|
99
|
-
- [ ] `check
|
|
100
|
-
- [ ] P1
|
|
101
|
-
- [ ] `diff
|
|
98
|
+
- [ ] `extract` exits cleanly when SPEC has no API section (not every feature has one)
|
|
99
|
+
- [ ] `check` is silent when no drift; otherwise prints findings grouped by severity
|
|
100
|
+
- [ ] Every P1 drift triggers `/vibe.regress register --from-contract`
|
|
101
|
+
- [ ] `diff` says "first run" when no prior snapshot exists
|
|
102
102
|
|
|
103
103
|
---
|
|
104
104
|
|
package/commands/vibe.regress.md
CHANGED
|
@@ -5,51 +5,51 @@ argument-hint: "register | generate | list | import | cluster [args]"
|
|
|
5
5
|
|
|
6
6
|
# /vibe.regress
|
|
7
7
|
|
|
8
|
-
**Regression Auto-Evolution** —
|
|
8
|
+
**Regression Auto-Evolution** — never fix the same bug twice.
|
|
9
9
|
|
|
10
|
-
>
|
|
10
|
+
> Bugs are recorded, preventive tests are generated automatically, and recurring patterns get promoted into shared tests.
|
|
11
11
|
|
|
12
12
|
## Usage
|
|
13
13
|
|
|
14
14
|
```
|
|
15
|
-
/vibe.regress register "<symptom>" #
|
|
16
|
-
/vibe.regress generate <slug> # bug → vitest
|
|
17
|
-
/vibe.regress list #
|
|
18
|
-
/vibe.regress import # git log
|
|
19
|
-
/vibe.regress cluster # 3+
|
|
15
|
+
/vibe.regress register "<symptom>" # Manual register (rare — most calls are automatic)
|
|
16
|
+
/vibe.regress generate <slug> # bug record → vitest file
|
|
17
|
+
/vibe.regress list # Open items
|
|
18
|
+
/vibe.regress import # Backfill from git log `fix:` commits
|
|
19
|
+
/vibe.regress cluster # 3+ similar bugs → propose shared test
|
|
20
20
|
```
|
|
21
21
|
|
|
22
22
|
## Auto-integration
|
|
23
23
|
|
|
24
|
-
- `/vibe.verify`
|
|
25
|
-
- `/vibe.run "<feature>"`
|
|
24
|
+
- `/vibe.verify` failure → auto-invokes `register` (no manual step)
|
|
25
|
+
- `/vibe.run "<feature>"` start → warns about open regressions for that feature
|
|
26
26
|
|
|
27
27
|
## Process
|
|
28
28
|
|
|
29
29
|
Load skill `vibe-regress` with subcommand: `$ARGUMENTS`
|
|
30
30
|
|
|
31
|
-
`vibe-regress`
|
|
31
|
+
The `vibe-regress` skill performs registration, generation, and clustering.
|
|
32
32
|
|
|
33
|
-
|
|
33
|
+
**Core steps** (see `skills/vibe-regress/SKILL.md` for details):
|
|
34
34
|
|
|
35
|
-
1.
|
|
36
|
-
2. `.claude/vibe/regressions/<slug>.md`
|
|
37
|
-
3. `generate
|
|
38
|
-
4. `cluster
|
|
39
|
-
5. `import
|
|
35
|
+
1. Parse subcommand
|
|
36
|
+
2. Read/write `.claude/vibe/regressions/<slug>.md` (frontmatter schema enforced)
|
|
37
|
+
3. On `generate`, detect the project's test stack → choose template (vitest / jest)
|
|
38
|
+
4. On `cluster`, group by `root-cause-tag`; ≥3 entries → propose a shared test
|
|
39
|
+
5. On `import`, parse `git log --grep='^fix:'`; skip duplicates by commit hash
|
|
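Step 4 (group by `root-cause-tag`, propose a shared test at 3 or more entries) could look roughly like the sketch below. The `BugRecord` shape and the `findClusters` name are assumptions made for illustration; the real skill reads these values from the frontmatter of each regression file.

```typescript
interface BugRecord {
  slug: string;
  rootCauseTag: string; // mirrors the `root-cause-tag` frontmatter key
}

// Group slugs by tag and keep only tags that reach the threshold (3 by default).
function findClusters(records: BugRecord[], threshold: number = 3): Record<string, string[]> {
  const byTag: Record<string, string[]> = {};
  for (const record of records) {
    if (!Object.prototype.hasOwnProperty.call(byTag, record.rootCauseTag)) {
      byTag[record.rootCauseTag] = [];
    }
    byTag[record.rootCauseTag].push(record.slug);
  }
  const clusters: Record<string, string[]> = {};
  for (const tag of Object.keys(byTag)) {
    if (byTag[tag].length >= threshold) clusters[tag] = byTag[tag];
  }
  return clusters;
}
```

Each key in the result is a candidate for a shared test; tags below the threshold are simply left out.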
40
40
|
|
|
41
41
|
## Output
|
|
42
42
|
|
|
43
|
-
- `.claude/vibe/regressions/<slug>.md` —
|
|
44
|
-
-
|
|
45
|
-
- `list`
|
|
43
|
+
- `.claude/vibe/regressions/<slug>.md` — bug record (frontmatter + reproduction / root cause)
|
|
44
|
+
- Project test dir — generated vitest file (`*.regression.test.ts`)
|
|
45
|
+
- `list` prints a terminal table
|
|
46
46
|
|
|
47
47
|
## Storage Format
|
|
48
48
|
|
|
49
49
|
```markdown
|
|
50
50
|
---
|
|
51
51
|
slug: login-jwt-expiry-off-by-one
|
|
52
|
-
symptom: "JWT
|
|
52
|
+
symptom: "JWT expiry cuts off one second early"
|
|
53
53
|
root-cause-tag: timezone
|
|
54
54
|
fix-commit: abc1234
|
|
55
55
|
test-path: src/auth/__tests__/login.regression.test.ts
|
package/commands/vibe.run.md
CHANGED
|
@@ -48,18 +48,18 @@ Execute **Scenario-Driven Implementation** with automatic quality verification.
|
|
|
48
48
|
|
|
49
49
|
### Pre-Run Regression Check (MANDATORY, before implementation starts)
|
|
50
50
|
|
|
51
|
-
|
|
51
|
+
Run immediately after start:
|
|
52
52
|
|
|
53
53
|
```
|
|
54
54
|
Load skill `vibe-regress` with: list --feature "{feature-name}"
|
|
55
55
|
```
|
|
56
56
|
|
|
57
|
-
-
|
|
58
|
-
- interactive
|
|
59
|
-
- ultrawork
|
|
60
|
-
-
|
|
57
|
+
- If any open regressions exist:
|
|
58
|
+
- interactive mode: ask the user "generate preventive tests first, then proceed?"
|
|
59
|
+
- ultrawork mode: auto-invoke `/vibe.regress generate <slug>` for each, then proceed
|
|
60
|
+
- No open regressions → silently continue
|
|
61
61
|
|
|
62
|
-
|
|
62
|
+
Also load `.claude/vibe/contracts/{feature-name}.md` if present — use it as the contract reference during implementation.
|
|
63
63
|
|
|
64
64
|
### Core Flow
|
|
65
65
|
|
package/commands/vibe.spec.md
CHANGED
|
@@ -372,15 +372,15 @@ Load skill `vibe-spec-review` with feature: {feature-name}
|
|
|
372
372
|
5. Review Debate Team (when there are 2+ P1/P2 issues)
|
|
373
373
|
6. Final user checkpoint
|
|
374
374
|
|
|
375
|
-
### Phase 4.5: Contract Extract (
|
|
375
|
+
### Phase 4.5: Contract Extract (auto, only for features with an API)
|
|
376
376
|
|
|
377
377
|
```
|
|
378
378
|
Load skill `vibe-contract` with: extract "{feature-name}"
|
|
379
379
|
```
|
|
380
380
|
|
|
381
|
-
SPEC
|
|
381
|
+
If the SPEC has a `## API` / `## Endpoints` / `## Interface` section, extract the contract to `.claude/vibe/contracts/{feature-name}.md`. If the section is absent, exit cleanly (not every feature has an API).
|
|
382
382
|
|
|
383
|
-
|
|
383
|
+
The contract is referenced during Phase 5a implementation, and used by `/vibe.verify` for drift detection.
|
|
384
384
|
|
|
385
385
|
### Phase 5a: Logic Track
|
|
386
386
|
|
|
@@ -388,9 +388,9 @@ If the SPEC has a `## API` / `## Endpoints` / `## Interface` section, the contract
|
|
|
388
388
|
/vibe.run "{feature-name}"
|
|
389
389
|
```
|
|
390
390
|
|
|
391
|
-
SPEC →
|
|
392
|
-
- `/vibe.regress list --feature {feature-name}` —
|
|
393
|
-
- `.claude/vibe/contracts/{feature-name}.md` —
|
|
391
|
+
SPEC → code. Auto-checks at start:
|
|
392
|
+
- `/vibe.regress list --feature {feature-name}` — warn if any open regressions exist
|
|
393
|
+
- `.claude/vibe/contracts/{feature-name}.md` — load if present, use as implementation guide
|
|
394
394
|
|
|
395
395
|
### Phase 5b: UI Track (only when type ∈ {website, webapp, mobile})
|
|
396
396
|
|
|
package/commands/vibe.test.md
ADDED
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Self-test vibe across CC and coco — verify every command/skill/hook/agent/tool is callable and behaves identically
|
|
3
|
+
argument-hint: "parity | report | compare [args]"
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# /vibe.test
|
|
7
|
+
|
|
8
|
+
**Vibe Self-Test** — verify vibe works identically in both Claude Code and coco.
|
|
9
|
+
|
|
10
|
+
> Catch features broken on one harness before users do.
|
|
11
|
+
|
|
12
|
+
## Usage
|
|
13
|
+
|
|
14
|
+
```
|
|
15
|
+
/vibe.test parity # Static parity (file set + content sync) — local, fast
|
|
16
|
+
/vibe.test report # Invoke every feature in current harness, write JSON+MD report
|
|
17
|
+
/vibe.test compare <cc-report> <coco-report> # Diff two reports, classify P1/P2/P3
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
## Key Constraint
|
|
21
|
+
|
|
22
|
+
`/vibe.test report` only tests the **harness it runs in**. Run from CC for CC results, run from coco for coco results. Then `compare` merges them.
|
|
23
|
+
|
|
24
|
+
```
|
|
25
|
+
[CC] /vibe.test report → .claude/vibe/test-reports/<ts>-cc.{json,md}
|
|
26
|
+
[coco] /vibe.test report → .coco/vibe/test-reports/<ts>-coco.{json,md}
|
|
27
|
+
[any] /vibe.test compare → diff with parity findings
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
## Subcommand: parity (static check, stage 1)
|
|
31
|
+
|
|
32
|
+
No harness execution — file system comparison only:
|
|
33
|
+
|
|
34
|
+
| Check | Compared |
|
|
35
|
+
|---|---|
|
|
36
|
+
| **install set** | `~/.claude/{commands,skills,agents}/` vs `~/.coco/{commands,skills,agents}/` file set |
|
|
37
|
+
| **content sync** | `CLAUDE.md` ↔ `AGENTS.md` body (excluding header/meta blocks) |
|
|
38
|
+
| **path config** | `.claude/vibe/` vs `.coco/vibe/` directory layout |
|
|
39
|
+
| **doc references** | Paths cited in CLAUDE.md/AGENTS.md actually resolve in install dir |
|
|
40
|
+
|
|
41
|
+
**Output**: console table + `.claude/vibe/test-reports/<ts>-parity.json`
|
|
42
|
+
|
|
43
|
+
This stage alone catches:
|
|
44
|
+
- New commands missing on one harness (e.g. if `/vibe.regress` had been added only to CC)
|
|
45
|
+
- AGENTS.md holding stale paths (e.g. `.codex/` references after a coco rename)
|
|
46
|
+
- CLAUDE.md ↔ AGENTS.md body drift
|
|
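The **install set** check amounts to a two-way set difference over file lists. A minimal sketch follows, assuming both directories have already been listed into arrays of relative paths; `ParityFinding` and `compareInstallSets` are hypothetical names, not part of the package.

```typescript
interface ParityFinding {
  file: string;
  missingOn: "cc" | "coco";
}

// Compare two lists of relative paths (e.g. "commands/vibe.test.md")
// and report every file that exists on only one side.
function compareInstallSets(ccFiles: string[], cocoFiles: string[]): ParityFinding[] {
  const findings: ParityFinding[] = [];
  for (const file of ccFiles) {
    if (cocoFiles.indexOf(file) === -1) findings.push({ file, missingOn: "coco" });
  }
  for (const file of cocoFiles) {
    if (ccFiles.indexOf(file) === -1) findings.push({ file, missingOn: "cc" });
  }
  return findings;
}
```

An empty result means the two install sets match; a command added to only one harness shows up as a single finding.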
47
|
+
|
|
48
|
+
## Subcommand: report (runtime invocation)
|
|
49
|
+
|
|
50
|
+
Probes every shipped feature in the current harness and writes a JSON+MD report.
|
|
51
|
+
|
|
52
|
+
| Category | Probe |
|
|
53
|
+
|---|---|
|
|
54
|
+
| commands | frontmatter validity, body delegates to a skill |
|
|
55
|
+
| skills | frontmatter validity, triggers non-empty |
|
|
56
|
+
| hooks | run matching vitest suite |
|
|
57
|
+
| agents | frontmatter validity, declared tools exist in harness |
|
|
58
|
+
| tools | run matching vitest suite or smoke-call with minimal input |
|
|
59
|
+
|
|
60
|
+
No external LLM calls. Interactive commands are not actually invoked — structural validation only. See `skills/vibe-test/SKILL.md` for full probe spec and failure-handling rules.
|
|
61
|
+
|
|
62
|
+
## Subcommand: compare (diff two reports)
|
|
63
|
+
|
|
64
|
+
Compare two JSON reports and classify findings:
|
|
65
|
+
- **P1**: feature exists on only one side → missing
|
|
66
|
+
- **P2**: both sides have it but response shape differs → behavioral drift
|
|
67
|
+
- **P3**: only message wording differs, semantics identical → informational
|
|
68
|
+
|
|
69
|
+
P1 findings auto-invoke `/vibe.regress register --from-test`.
|
|
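The P1/P2/P3 classification can be sketched as follows. The report layout is not specified in this file, so the `ProbeResult` fields (`shape`, `message`) are assumptions chosen only to make the three cases concrete.

```typescript
interface ProbeResult {
  shape: string;   // hypothetical: normalized description of the response shape
  message: string; // hypothetical: human-readable status text
}
type Report = Record<string, ProbeResult>;
type Finding = { feature: string; severity: "P1" | "P2" | "P3" };

function compareReports(cc: Report, coco: Report): Finding[] {
  const findings: Finding[] = [];
  const seen: Record<string, boolean> = {};
  const features = Object.keys(cc).concat(Object.keys(coco));
  for (const feature of features) {
    if (seen[feature] === true) continue; // visit each feature once
    seen[feature] = true;
    const a = cc[feature];
    const b = coco[feature];
    if (!a || !b) {
      findings.push({ feature, severity: "P1" }); // exists on one side only
    } else if (a.shape !== b.shape) {
      findings.push({ feature, severity: "P2" }); // behavioral drift
    } else if (a.message !== b.message) {
      findings.push({ feature, severity: "P3" }); // wording only
    }
  }
  return findings;
}
```

Features that match on both sides produce no finding; the P1 entries are the ones that would be handed to `/vibe.regress register --from-test`.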
70
|
+
|
|
71
|
+
## Process
|
|
72
|
+
|
|
73
|
+
Load skill `vibe-test` with subcommand: `$ARGUMENTS`
|
|
74
|
+
|
|
75
|
+
See `skills/vibe-test/SKILL.md` for detailed logic.
|
|
76
|
+
|
|
77
|
+
## Storage
|
|
78
|
+
|
|
79
|
+
```
|
|
80
|
+
.claude/vibe/test-reports/ (CC side)
|
|
81
|
+
.coco/vibe/test-reports/ (coco side)
|
|
82
|
+
<YYYYMMDD-HHmm>-<harness>.json
|
|
83
|
+
<YYYYMMDD-HHmm>-<harness>.md
|
|
84
|
+
<YYYYMMDD-HHmm>-compare.md (compare output)
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
## Done Criteria
|
|
88
|
+
|
|
89
|
+
- [ ] `parity` runs without external calls — local file inspection only (fast, deterministic)
|
|
90
|
+
- [ ] If only one install dir exists, exit cleanly with guidance (not an error)
|
|
91
|
+
- [ ] `compare` warns when reports are not within ±1 minute of each other (timing drift = false positives)
|
|
92
|
+
- [ ] P1 drift auto-registers via `/vibe.regress`
|
|
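The ±1 minute criterion could be checked from the `<YYYYMMDD-HHmm>` stamp in the report file names. The sketch below is one possible approach under stated assumptions: both stamps are taken to be in the same time zone, and the function names are hypothetical.

```typescript
// Parse the leading `YYYYMMDD-HHmm` stamp of a report file name into minutes.
// Date.UTC is used only to get a DST-free difference between two stamps.
function stampToMinutes(fileName: string): number | null {
  const m = /^(\d{4})(\d{2})(\d{2})-(\d{2})(\d{2})-/.exec(fileName);
  if (!m) return null;
  const ms = Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3]), Number(m[4]), Number(m[5]));
  return ms / 60000;
}

// True when both names carry a valid stamp and differ by at most the tolerance.
function reportsAreCloseEnough(ccName: string, cocoName: string, toleranceMinutes: number = 1): boolean {
  const a = stampToMinutes(ccName);
  const b = stampToMinutes(cocoName);
  if (a === null || b === null) return false;
  return Math.abs(a - b) <= toleranceMinutes;
}
```

When this returns `false`, `compare` would print its timing-drift warning rather than abort.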
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
ARGUMENTS: $ARGUMENTS
|
package/commands/vibe.verify.md
CHANGED
|
@@ -235,9 +235,9 @@ For each failed scenario:
|
|
|
235
235
|
location: {file:line}
|
|
236
236
|
```
|
|
237
237
|
|
|
238
|
-
- `--from-verify`
|
|
239
|
-
-
|
|
240
|
-
-
|
|
238
|
+
- `--from-verify` mode skips user confirmation (the user is already attentive in a verify-failure context; minimize friction)
|
|
239
|
+
- The registered bug's slug appears as a link in the Failure Report's "Fix" section
|
|
240
|
+
- Follow up with `/vibe.regress generate <slug>` to produce a preventive test
|
|
241
241
|
|
|
242
242
|
### Failure Report
|
|
243
243
|
|
|
@@ -402,18 +402,18 @@ node -e "import('{{VIBE_PATH_URL}}/node_modules/@su-record/vibe/dist/tools/index
|
|
|
402
402
|
**When Codex finds a P2:**
|
|
403
403
|
- Record it in the TODO file, then mark as complete
|
|
404
404
|
|
|
405
|
-
## Post-Verify Contract Check (
|
|
405
|
+
## Post-Verify Contract Check (auto, only when a contract file exists)
|
|
406
406
|
|
|
407
|
-
|
|
407
|
+
After all scenarios pass, auto-invoke:
|
|
408
408
|
|
|
409
409
|
```
|
|
410
410
|
Load skill `vibe-contract` with: check "{feature-name}"
|
|
411
411
|
```
|
|
412
412
|
|
|
413
|
-
- `.claude/vibe/contracts/{feature-name}.md
|
|
414
|
-
- drift
|
|
415
|
-
- **P1 drift** → verify
|
|
416
|
-
- P2/P3 drift →
|
|
413
|
+
- Skip if `.claude/vibe/contracts/{feature-name}.md` does not exist
|
|
414
|
+
- No drift → verify still passes
|
|
415
|
+
- **P1 drift** → demote verify to fail; auto-call `/vibe.regress register --from-contract`
|
|
416
|
+
- P2 / P3 drift → warning only; verify still passes
|
|
417
417
|
|
|
418
418
|
## Next Step
|
|
419
419
|
|
package/package.json
CHANGED
package/skills/vibe-contract/SKILL.md
CHANGED
|
@@ -9,18 +9,18 @@ chain-next: []
|
|
|
9
9
|
|
|
10
10
|
# vibe.contract — API Contract Drift Detection
|
|
11
11
|
|
|
12
|
-
**Purpose**:
|
|
12
|
+
**Purpose**: catch divergence between the SPEC's external contract and the actual implementation. Passing tests ≠ contract preserved.
|
|
13
13
|
|
|
14
14
|
## Why this exists
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
Hidden vibe-coding weakness: as the implementation grows, response shapes drift away from what the SPEC documents. Scenario tests still pass — but **external consumers break**. Manual SPEC-vs-code review is high-friction, so mechanize it.
|
|
17
17
|
|
|
18
18
|
## Storage Contract
|
|
19
19
|
|
|
20
20
|
```
|
|
21
21
|
.claude/vibe/contracts/
|
|
22
|
-
<feature>.md #
|
|
23
|
-
<feature>.snapshot.md #
|
|
22
|
+
<feature>.md # contract SSOT (extracted from SPEC)
|
|
23
|
+
<feature>.snapshot.md # implementation snapshot (last check)
|
|
24
24
|
```
|
|
25
25
|
|
|
26
26
|
### Contract frontmatter schema
|
|
@@ -30,9 +30,9 @@ chain-next: []
|
|
|
30
30
|
feature: string
|
|
31
31
|
extracted-from: .claude/vibe/specs/<feature>.md
|
|
32
32
|
extracted-at: ISO-8601
|
|
33
|
-
source-spec-hash: sha256 #
|
|
33
|
+
source-spec-hash: sha256 # for change detection
|
|
34
34
|
endpoints:
|
|
35
|
-
- id: unique-kebab-id #
|
|
35
|
+
- id: unique-kebab-id # e.g. get-user-by-id
|
|
36
36
|
kind: http | graphql | event | function
|
|
37
37
|
# http
|
|
38
38
|
method: GET | POST | PUT | DELETE | PATCH
|
|
@@ -64,75 +64,75 @@ endpoints:
|
|
|
64
64
|
|
|
65
65
|
## Subcommands
|
|
66
66
|
|
|
67
|
-
### 1. `extract <feature>` —
|
|
67
|
+
### 1. `extract <feature>` — pull contract out of the SPEC
|
|
68
68
|
|
|
69
|
-
|
|
70
|
-
1. SPEC
|
|
71
|
-
2.
|
|
69
|
+
**Steps**:
|
|
70
|
+
1. Load SPEC file (single file or split folder)
|
|
71
|
+
2. Search sections in this order:
|
|
72
72
|
- `## API` / `## Endpoints` / `## Interface` / `## Contract`
|
|
73
|
-
- Markdown
|
|
74
|
-
-
|
|
75
|
-
3.
|
|
76
|
-
4.
|
|
77
|
-
5. `source-spec-hash`:
|
|
78
|
-
6. `.claude/vibe/contracts/<feature>.md`
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
### 2. `check <feature>` —
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
1. `.claude/vibe/contracts/<feature>.md
|
|
86
|
-
2.
|
|
87
|
-
- http:
|
|
88
|
-
- graphql: resolver
|
|
89
|
-
- event:
|
|
90
|
-
- function:
|
|
91
|
-
3.
|
|
92
|
-
4.
|
|
93
|
-
5.
|
|
94
|
-
|
|
95
|
-
### 3. `diff <feature>` —
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
1. `.snapshot.md
|
|
99
|
-
2.
|
|
100
|
-
3.
|
|
73
|
+
- Markdown tables (method/path/request/response headers)
|
|
74
|
+
- OpenAPI/JSON Schema snippets inside code blocks
|
|
75
|
+
3. Extraction failure (no such section) → **exit cleanly with `no-contract` state**. Not every feature has an API.
|
|
76
|
+
4. Success → convert to the frontmatter structure
|
|
77
|
+
5. `source-spec-hash`: sha256 of SPEC content (for next extract to detect change)
|
|
78
|
+
6. Save to `.claude/vibe/contracts/<feature>.md` (no-op if file exists with the same hash)
|
|
79
|
+
|
|
80
|
+
**Caveat**: extraction is LLM-driven. Mark low-confidence fields with `# unconfirmed` so the user can review.
|
|
81
|
+
|
|
82
|
+
### 2. `check <feature>` — contract vs implementation
|
|
83
|
+
|
|
84
|
+
**Steps**:
|
|
85
|
+
1. Load `.claude/vibe/contracts/<feature>.md`. If missing → **suggest extract first**.
|
|
86
|
+
2. For each endpoint, find implementation:
|
|
87
|
+
- http: detect framework (Express, Fastify, Next.js API routes, Hono, ...)
|
|
88
|
+
- graphql: locate resolver files
|
|
89
|
+
- event: producer/consumer code
|
|
90
|
+
- function: module export
|
|
91
|
+
3. Extract implementation signature/schema → compare against contract
|
|
92
|
+
4. Classify drift (severity table in command file)
|
|
93
|
+
5. Persist snapshot at `.claude/vibe/contracts/<feature>.snapshot.md` (current implementation state)
|
|
94
|
+
|
|
95
|
+
### 3. `diff <feature>` — changes since last snapshot
|
|
96
|
+
|
|
97
|
+
**Steps**:
|
|
98
|
+
1. If `.snapshot.md` does not exist → say "first run" and exit
|
|
99
|
+
2. Re-extract current implementation; compare to existing snapshot
|
|
100
|
+
3. Output **only changed fields** in ASCII diff form:
|
|
101
101
|
```
|
|
102
102
|
endpoints/get-user-by-id/response/200:
|
|
103
103
|
- email: string
|
|
104
|
-
+ email: string | null ← nullability
|
|
105
|
-
+ phoneNumber: string ←
|
|
104
|
+
+ email: string | null ← nullability added (P1 breaking)
|
|
105
|
+
+ phoneNumber: string ← new field (P3 safe)
|
|
106
106
|
```
|
|
107
|
-
4.
|
|
107
|
+
4. On any drift, auto-call `/vibe.regress register --from-contract`
|
|
108
108
|
|
|
109
109
|
## Drift Severity Matrix
|
|
110
110
|
|
|
111
|
-
(command file
|
|
111
|
+
(matches command file — keep both in sync on edits)
|
|
112
112
|
|
|
113
113
|
## Integration Points
|
|
114
114
|
|
|
115
115
|
### From /vibe.spec
|
|
116
116
|
|
|
117
|
-
|
|
117
|
+
Auto-invoke right after the SPEC is written:
|
|
118
118
|
```
|
|
119
119
|
Load skill `vibe-contract` with: extract <feature>
|
|
120
120
|
```
|
|
121
|
-
|
|
121
|
+
Failure does not stop `/vibe.spec` (extraction is optional). On success, `/vibe.run` references this contract.
|
|
122
122
|
|
|
123
123
|
### From /vibe.verify
|
|
124
124
|
|
|
125
|
-
|
|
125
|
+
After all scenarios pass:
|
|
126
126
|
```
|
|
127
127
|
Load skill `vibe-contract` with: check <feature>
|
|
128
128
|
```
|
|
129
|
-
- drift
|
|
130
|
-
- P1 drift → verify
|
|
131
|
-
- P2/P3 drift →
|
|
129
|
+
- no drift → verify still passes
|
|
130
|
+
- P1 drift → demote verify to fail; auto-register
|
|
131
|
+
- P2 / P3 drift → warning only; verify still passes
|
|
132
132
|
|
|
133
133
|
### To /vibe.regress
|
|
134
134
|
|
|
135
|
-
P1 drift
|
|
135
|
+
On P1 drift:
|
|
136
136
|
```
|
|
137
137
|
Load skill `vibe-regress` with:
|
|
138
138
|
subcommand: register --from-contract
|
|
@@ -143,15 +143,15 @@ Load skill `vibe-regress` with:
|
|
|
143
143
|
|
|
144
144
|
## Framework Detection Rules
|
|
145
145
|
|
|
146
|
-
HTTP framework
|
|
147
|
-
1. `package.json` dependencies
|
|
146
|
+
HTTP framework detection order:
|
|
147
|
+
1. `package.json` dependencies: `next` → Next.js API routes
|
|
148
148
|
2. `fastify` → Fastify
|
|
149
149
|
3. `express` → Express
|
|
150
150
|
4. `hono` → Hono
|
|
151
151
|
5. `@nestjs/core` → NestJS
|
|
152
|
-
6.
|
|
152
|
+
6. None detected → ask user for manual mapping
|
|
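The detection order above can be sketched as an ordered lookup over the `dependencies` object of `package.json`. This is an illustration under assumptions: `detectHttpFramework` and `DETECTION_ORDER` are hypothetical names, and the caller is expected to have parsed `package.json` already.

```typescript
type Framework = "Next.js" | "Fastify" | "Express" | "Hono" | "NestJS";

// Order matters: the first matching dependency wins.
const DETECTION_ORDER: Array<[string, Framework]> = [
  ["next", "Next.js"],
  ["fastify", "Fastify"],
  ["express", "Express"],
  ["hono", "Hono"],
  ["@nestjs/core", "NestJS"],
];

// Returns null when nothing matches; the caller should then ask the user
// for a manual mapping instead of silently skipping.
function detectHttpFramework(dependencies: Record<string, string>): Framework | null {
  for (const [pkg, framework] of DETECTION_ORDER) {
    if (Object.prototype.hasOwnProperty.call(dependencies, pkg)) return framework;
  }
  return null;
}
```

Keeping the order in a single array makes the precedence explicit and easy to keep in sync with the list.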
153
153
|
|
|
154
|
-
|
|
154
|
+
After detection, grep for each framework's **route definition pattern** to map endpoints:
|
|
155
155
|
- Next.js: `pages/api/**` or `app/api/**/route.ts`
|
|
156
156
|
- Express: `app.get|post|put|delete|patch\(`
|
|
157
157
|
- Fastify: `fastify.get|post|...` or route configuration
|
|
@@ -159,8 +159,8 @@ HTTP framework detection order:
|
|
|
159
159
|
|
|
160
160
|
## Done Criteria
|
|
161
161
|
|
|
162
|
-
- [ ] `extract
|
|
163
|
-
- [ ] `source-spec-hash
|
|
164
|
-
- [ ] `check
|
|
165
|
-
- [ ] P1 drift
|
|
166
|
-
- [ ]
|
|
162
|
+
- [ ] `extract` does not error on SPEC without an API section
|
|
163
|
+
- [ ] `source-spec-hash`-based re-extract is a no-op when unchanged
|
|
164
|
+
- [ ] `check` reports each drift with severity + location (file:line)
|
|
165
|
+
- [ ] P1 drift always invokes `/vibe.regress`
|
|
166
|
+
- [ ] On framework detection failure, ask the user — do not silently skip
|
|
package/skills/vibe-regress/SKILL.md
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: vibe-regress
|
|
3
3
|
tier: core
|
|
4
|
-
description: "Regression test auto-evolution. Registers bugs (auto from /vibe.verify failures or manual), generates preventive vitest/jest files from bug records, clusters repeated patterns (3+ same root-cause-tag) into shared tests, and imports historical `fix:` commits from git log. Storage: .claude/vibe/regressions/<slug>.md. Must use this skill when user runs /vibe.regress, when /vibe.verify produces a failure, or when the user says 'don't let this happen again' / '
|
|
4
|
+
description: "Regression test auto-evolution. Registers bugs (auto from /vibe.verify failures or manual), generates preventive vitest/jest files from bug records, clusters repeated patterns (3+ same root-cause-tag) into shared tests, and imports historical `fix:` commits from git log. Storage: .claude/vibe/regressions/<slug>.md. Must use this skill when user runs /vibe.regress, when /vibe.verify produces a failure, or when the user says 'don't let this happen again' / 'regression test' / '회귀 테스트'."
|
|
5
5
|
triggers: [regress, regression, "회귀", "다시는", "반복 버그", "fix commit"]
|
|
6
6
|
priority: 70
|
|
7
7
|
chain-next: []
|
|
@@ -9,92 +9,92 @@ chain-next: []
|
|
|
9
9
|
|
|
10
10
|
# vibe.regress — Regression Auto-Evolution
|
|
11
11
|
|
|
12
|
-
**Purpose**:
|
|
12
|
+
**Purpose**: never fix the same bug twice. Each fix grows a preventive test.
|
|
13
13
|
|
|
14
14
|
## Why this exists
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
A classic vibe-coding weakness: LLMs reintroduce bugs of the same class. Regression tests are the only mechanical defense. But if the human has to write the test every time, it gets skipped — so automate.
|
|
17
17
|
|
|
18
18
|
## Storage Contract
|
|
19
19
|
|
|
20
20
|
```
|
|
21
21
|
.claude/vibe/regressions/
|
|
22
|
-
<bug-slug>.md #
|
|
23
|
-
_cluster-<tag>.md #
|
|
22
|
+
<bug-slug>.md # one file per bug
|
|
23
|
+
_cluster-<tag>.md # shared-test design produced by `cluster`
|
|
24
24
|
```
|
|
25
25
|
|
|
26
|
-
### Frontmatter schema (
|
|
26
|
+
### Frontmatter schema (strict)
|
|
27
27
|
|
|
28
28
|
```yaml
|
|
29
|
-
slug: string # kebab-case,
|
|
30
|
-
symptom: string #
|
|
31
|
-
root-cause-tag: enum #
|
|
32
|
-
fix-commit: string # git hash (
|
|
33
|
-
test-path: string #
|
|
29
|
+
slug: string # kebab-case, globally unique
|
|
30
|
+
symptom: string # one line, user-facing
|
|
31
|
+
root-cause-tag: enum # only the allowed tags below
|
|
32
|
+
fix-commit: string # git hash (or "pending")
|
|
33
|
+
test-path: string # generated test file path (or "pending")
|
|
34
34
|
status: open | test-generated | resolved
|
|
35
35
|
registered: YYYY-MM-DD
|
|
36
|
-
feature: string #
|
|
36
|
+
feature: string # related feature name (matches SPEC)
|
|
37
37
|
```
|
|
38
38
|
|
|
39
39
|
### Allowed `root-cause-tag` values
|
|
40
40
|
|
|
41
|
-
|
|
41
|
+
Clustering depends on this, so use **only the predefined set**:
|
|
42
42
|
|
|
43
|
-
- `timezone` —
|
|
44
|
-
- `nullability` — null/undefined/empty
|
|
45
|
-
- `concurrency` — race
|
|
46
|
-
- `boundary` — off-by-one, edge
|
|
47
|
-
- `encoding` — charset, URL encoding,
|
|
48
|
-
- `validation` —
|
|
49
|
-
- `auth` —
|
|
50
|
-
- `state-sync` —
|
|
51
|
-
- `integration` —
|
|
52
|
-
- `type-narrow` — TypeScript
|
|
53
|
-
- `other` —
|
|
43
|
+
- `timezone` — timezone / DST / off-by-one in time
|
|
44
|
+
- `nullability` — null / undefined / empty handling
|
|
45
|
+
- `concurrency` — race conditions
|
|
46
|
+
- `boundary` — off-by-one, edge values
|
|
47
|
+
- `encoding` — charset, URL encoding, escaping
|
|
48
|
+
- `validation` — missing input validation
|
|
49
|
+
- `auth` — authn/authz logic
|
|
50
|
+
- `state-sync` — client/server state mismatch
|
|
51
|
+
- `integration` — external API call failure
|
|
52
|
+
- `type-narrow` — TypeScript type narrowing mistake
|
|
53
|
+
- `other` — when nothing fits (add new tags later)
|
|
54
54
|
|
|
55
|
-
|
|
55
|
+
**Rule**: if a new tag is needed, do not force-fit into an existing one — register as `other`. Once `other` reaches 3 entries, propose adding a new tag.
|
|
56
56
|
|
|
57
57
|
## Subcommands
|
|
58
58
|
|
|
59
|
-
### 1. `register "<symptom>"` —
|
|
59
|
+
### 1. `register "<symptom>"` — manual registration
|
|
60
60
|
|
|
61
|
-
|
|
61
|
+
Most calls are automatic; manual use is rare (bugs found outside `/vibe.verify`, or production incidents).
|
|
62
62
|
|
|
63
|
-
|
|
64
|
-
1. `getCurrentTime
|
|
65
|
-
2. `git log -1 --format=%H
|
|
66
|
-
3.
|
|
63
|
+
**Steps**:
|
|
64
|
+
1. `getCurrentTime` for today's date
|
|
65
|
+
2. `git log -1 --format=%H` for current commit hash (fix-commit candidate)
|
|
66
|
+
3. Conversation extracts:
|
|
67
67
|
- Reproduction steps (Given/When/Then)
|
|
68
|
-
- Root
|
|
69
|
-
- Fix
|
|
70
|
-
4. `root-cause-tag
|
|
71
|
-
5. slug
|
|
72
|
-
6. `.claude/vibe/regressions/<slug>.md`
|
|
73
|
-
|
|
74
|
-
### 2. `generate <slug>` —
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
1. bug
|
|
78
|
-
2.
|
|
79
|
-
- `package.json
|
|
80
|
-
-
|
|
81
|
-
3.
|
|
82
|
-
-
|
|
83
|
-
-
|
|
84
|
-
4.
|
|
85
|
-
5.
|
|
86
|
-
6. bug
|
|
87
|
-
7.
|
|
88
|
-
|
|
89
|
-
### 3. `list` —
|
|
68
|
+
- Root-cause paragraph
|
|
69
|
+
- Fix description
|
|
70
|
+
4. `root-cause-tag` is **inferred from the allowed set, then confirmed with the user**. If unclear → `other`.
|
|
71
|
+
5. Generate slug: kebab-case keywords from the symptom; on collision append `-2`
|
|
72
|
+
6. Write `.claude/vibe/regressions/<slug>.md` (status: `open`)
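
Step 5 can be illustrated with a minimal sketch. It assumes the whole symptom is slugified (the skill itself picks keywords, which is not modeled here) and that collisions beyond `-2` continue with `-3`, `-4`, and so on; `toSlug` is a hypothetical name.

```typescript
// Hypothetical helper: kebab-case a symptom and avoid collisions with existing slugs.
function toSlug(symptom: string, existing: ReadonlySet<string>): string {
  const base = symptom
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  if (!existing.has(base)) return base;
  // On collision append -2; counting further (-3, ...) is an assumption.
  let n = 2;
  while (existing.has(`${base}-${n}`)) n += 1;
  return `${base}-${n}`;
}
```

The resulting slug would then name the file under `.claude/vibe/regressions/`.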
|
|
73
|
+
|
|
74
|
+
### 2. `generate <slug>` — generate preventive test
|
|
75
|
+
|
|
76
|
+
**Steps**:
|
|
77
|
+
1. Read bug file
|
|
78
|
+
2. Detect test stack:
|
|
79
|
+
- From `package.json` `devDependencies`: prefer `vitest` over `jest`
|
|
80
|
+
- If neither → **ask user, then stop**
|
|
81
|
+
3. Decide test location:
|
|
82
|
+
- Sibling `__tests__/` next to the implementation file, OR
|
|
83
|
+
- The project's existing test dir (vitest config `test.include`)
|
|
84
|
+
4. File name: `<original-file>.regression.test.ts`
|
|
85
|
+
5. Body: render `templates/test-vitest.md` or `templates/test-jest.md`
|
|
86
|
+
6. Update bug frontmatter: `test-path`, `status: test-generated`
|
|
87
|
+
7. **Run the test immediately**: it should fail if the bug is not yet fixed and pass if it is. Record the outcome in frontmatter.
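
The stack detection in step 2 might look like the sketch below. `PackageJsonLike` and `detectTestStack` are illustrative names; the only behavior taken from the steps is "prefer `vitest` over `jest`, otherwise ask the user and stop".

```typescript
type TestStack = "vitest" | "jest" | null;

// Minimal shape of the package.json fields this sketch reads.
interface PackageJsonLike {
  devDependencies?: Record<string, string>;
}

// Hypothetical helper: prefer vitest over jest; null means "ask the user, then stop".
function detectTestStack(pkg: PackageJsonLike): TestStack {
  const dev = pkg.devDependencies ?? {};
  if (dev["vitest"] !== undefined) return "vitest";
  if (dev["jest"] !== undefined) return "jest";
  return null;
}
```

Returning `null` instead of guessing keeps the "ask user, then stop" branch explicit for the caller.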
|
|
88
|
+
|
|
89
|
+
### 3. `list` — open items
|
|
90
90
|
|
|
91
91
|
```
|
|
92
|
-
/vibe.regress list # status != resolved
|
|
93
|
-
/vibe.regress list --feature login # feature
|
|
94
|
-
/vibe.regress list --tag timezone # tag
|
|
92
|
+
/vibe.regress list # status != resolved
|
|
93
|
+
/vibe.regress list --feature login # filter by feature
|
|
94
|
+
/vibe.regress list --tag timezone # filter by tag
|
|
95
95
|
```
|
|
96
96
|
|
|
97
|
-
|
|
97
|
+
Terminal table:
|
|
98
98
|
|
|
99
99
|
```
|
|
100
100
|
SLUG FEATURE TAG STATUS AGE
|
|
@@ -102,36 +102,36 @@ login-jwt-expiry-off-by-one login timezone test-generated 3d
|
|
|
102
102
|
cart-stock-race-double-deduct cart concurrency open 1d
|
|
103
103
|
```
|
|
104
104
|
|
|
105
|
-
### 4. `import` — git log
|
|
105
|
+
### 4. `import` — backfill from git log
|
|
106
106
|
|
|
107
|
-
|
|
107
|
+
**Steps**:
|
|
108
108
|
1. `git log --grep='^fix:' --format='%H|%s|%ci' --since=<last-import-date>`
|
|
109
|
-
- `last-import-date
|
|
110
|
-
2.
|
|
111
|
-
-
|
|
112
|
-
-
|
|
113
|
-
-
|
|
114
|
-
3.
|
|
115
|
-
4.
|
|
109
|
+
- `last-import-date` lives in `.claude/vibe/regressions/.import-cursor` (defaults to 90 days ago)
|
|
110
|
+
2. For each commit:
|
|
111
|
+
- If a bug file with the same `fix-commit` already exists → **skip**
|
|
112
|
+
- Otherwise infer symptom + root-cause-tag from message/diff (LLM call)
|
|
113
|
+
- Write a new bug file (status: `resolved` — already fixed)
|
|
114
|
+
3. Update `.import-cursor`
|
|
115
|
+
4. Suggest `generate` for newly imported entries
|
|
116
116
|
|
|
117
|
-
|
|
117
|
+
**Note**: only `fix:` commits are considered. Projects not using Conventional Commits can override with `--grep-pattern`.
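
As a rough sketch of steps 1 and 2, the `%H|%s|%ci` output can be parsed and deduplicated as follows. `FixCommit`, `parseGitLog`, and `newCommits` are hypothetical names; reading existing `fix-commit` values from the bug files is assumed to happen elsewhere.

```typescript
interface FixCommit {
  hash: string;
  subject: string;
  date: string;
}

// Parse lines produced by: git log --format='%H|%s|%ci'
function parseGitLog(output: string): FixCommit[] {
  const commits: FixCommit[] = [];
  for (const raw of output.split("\n")) {
    const line = raw.trim();
    // The subject may itself contain "|", so take the hash from the
    // front and the date from the back.
    const first = line.indexOf("|");
    const last = line.lastIndexOf("|");
    if (first < 0 || first === last) continue; // blank or malformed line
    commits.push({
      hash: line.slice(0, first),
      subject: line.slice(first + 1, last),
      date: line.slice(last + 1),
    });
  }
  return commits;
}

// Skip commits that already have a bug file with the same fix-commit hash.
function newCommits(commits: FixCommit[], known: ReadonlySet<string>): FixCommit[] {
  return commits.filter((c) => !known.has(c.hash));
}
```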
|
|
118
118
|
|
|
119
|
-
### 5. `cluster` —
|
|
119
|
+
### 5. `cluster` — promote recurring patterns
|
|
120
120
|
|
|
121
|
-
|
|
122
|
-
1.
|
|
123
|
-
2.
|
|
124
|
-
3.
|
|
125
|
-
- 3
|
|
126
|
-
- `_cluster-<tag>.md`
|
|
127
|
-
-
|
|
128
|
-
4.
|
|
121
|
+
**Steps**:
|
|
122
|
+
1. Aggregate `root-cause-tag` across all bug files
|
|
123
|
+
2. **A tag with ≥3 entries** becomes a cluster candidate
|
|
124
|
+
3. For each candidate:
|
|
125
|
+
- Feed the candidate's reproductions (3 or more) to an LLM to extract the common cause and shared test cases
|
|
126
|
+
- Write `_cluster-<tag>.md` (links to the original bug slugs)
|
|
127
|
+
- Propose a shared test skeleton at `<project-test-dir>/_cluster-<tag>.regression.test.ts` (create only with user approval)
|
|
128
|
+
4. Original bug files are **not deleted** — history preserved
|
|
129
129
|
|
|
130
|
-
|
|
130
|
+
**Important**: `cluster` is never automatic. Users invoke it explicitly to avoid premature abstraction.
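
Steps 1 and 2 amount to a group-and-threshold pass, sketched here under the assumption that bug files have already been parsed into `{ slug, rootCauseTag }` objects (`BugEntry` and `clusterCandidates` are illustrative names).

```typescript
interface BugEntry {
  slug: string;
  rootCauseTag: string;
}

const CLUSTER_THRESHOLD = 3; // "a tag with >= 3 entries becomes a cluster candidate"

// Group slugs by root-cause-tag and keep only tags at or above the threshold.
function clusterCandidates(bugs: BugEntry[]): Map<string, string[]> {
  const byTag = new Map<string, string[]>();
  for (const bug of bugs) {
    const slugs = byTag.get(bug.rootCauseTag) ?? [];
    slugs.push(bug.slug);
    byTag.set(bug.rootCauseTag, slugs);
  }
  const candidates = new Map<string, string[]>();
  byTag.forEach((slugs, tag) => {
    if (slugs.length >= CLUSTER_THRESHOLD) candidates.set(tag, slugs);
  });
  return candidates;
}
```

Each entry of the returned map carries the original slugs, which is what `_cluster-<tag>.md` would link back to.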
|
|
131
131
|
|
|
132
132
|
## Integration with /vibe.verify
|
|
133
133
|
|
|
134
|
-
`/vibe.verify`
|
|
134
|
+
When `/vibe.verify` fails it calls:
|
|
135
135
|
|
|
136
136
|
```
|
|
137
137
|
Load skill `vibe-regress` with: register --from-verify
|
|
@@ -141,34 +141,34 @@ Load skill `vibe-regress` with: register --from-verify
|
|
|
141
141
|
<location>: {file:line}
|
|
142
142
|
```
|
|
143
143
|
|
|
144
|
-
`--from-verify`
|
|
144
|
+
`--from-verify` behavior:
|
|
145
145
|
- symptom = scenario name + error summary
|
|
146
|
-
- feature =
|
|
147
|
-
- root-cause-tag = error pattern
|
|
146
|
+
- feature = forwarded feature name
|
|
147
|
+
- root-cause-tag = inferred from error pattern (default `other` if unclear)
|
|
148
148
|
- status = `open`
|
|
149
|
-
-
|
|
149
|
+
- **Skip user confirmation** — the user is already attentive in a verify-failure context, and friction must be minimized
|
|
150
150
|
|
|
151
151
|
## Integration with /vibe.run
|
|
152
152
|
|
|
153
|
-
`/vibe.run "<feature>"
|
|
153
|
+
At the start of `/vibe.run "<feature>"`:
|
|
154
154
|
|
|
155
|
-
1.
|
|
156
|
-
2.
|
|
155
|
+
1. Filter `.claude/vibe/regressions/*.md` for `feature: <feature-name>` + `status != resolved`
|
|
156
|
+
2. If any open items:
|
|
157
157
|
```
|
|
158
158
|
⚠️ Open regressions for this feature:
|
|
159
159
|
- login-jwt-expiry-off-by-one (timezone, 3d old)
|
|
160
160
|
- login-session-leak (auth, 1w old)
|
|
161
|
-
|
|
161
|
+
|
|
162
162
|
Fix these before adding new behavior? [y/N]
|
|
163
163
|
```
|
|
164
|
-
3. `y` → `/vibe.regress generate
|
|
165
|
-
4. `N` →
|
|
164
|
+
3. `y` → chain to `/vibe.regress generate` for items not yet test-generated
|
|
165
|
+
4. `N` → continue (ultrawork mode answers `N` automatically and records a TODO)
|
|
166
166
|
|
|
167
167
|
## Done Criteria
|
|
168
168
|
|
|
169
|
-
- [ ]
|
|
170
|
-
- [ ]
|
|
171
|
-
- [ ] `root-cause-tag
|
|
172
|
-
- [ ] `generate
|
|
173
|
-
- [ ] `import
|
|
174
|
-
- [ ] `cluster
|
|
169
|
+
- [ ] Subcommand-less invocation prints usage
|
|
170
|
+
- [ ] Frontmatter schema strictly enforced (missing fields rejected)
|
|
171
|
+
- [ ] `root-cause-tag` outside the allowed set → warn + force `other`
|
|
172
|
+
- [ ] After `generate`, the test is **actually run** to verify
|
|
173
|
+
- [ ] `import` deduplicates by `fix-commit` hash
|
|
174
|
+
- [ ] `cluster` does nothing under 3 entries (false-positive guard)
|
|
@@ -400,21 +400,21 @@ Read ~/.claude/vibe/languages/typescript-react.md
|
|
|
400
400
|
Before spawning any research agents, check for a prior persisted dataset:
|
|
401
401
|
|
|
402
402
|
```bash
|
|
403
|
-
# Slug = kebab-case of feature/topic, max 50 chars
|
|
403
|
+
# Slug = kebab-case of the feature/topic, max 50 chars
|
|
404
404
|
ls .claude/vibe/research/<slug>/paper.md 2>/dev/null
|
|
405
405
|
```
|
|
406
406
|
|
|
407
407
|
**If `paper.md` exists:**
|
|
408
408
|
1. Read `.claude/vibe/research/<slug>/paper.md`
|
|
409
409
|
2. Read `.claude/vibe/research/<slug>/awesome-list.md` (if present)
|
|
410
|
-
3. Inject the **Findings**, **Recommendation**, and **Security considerations** sections into SPEC Context
|
|
410
|
+
3. Inject the **Findings**, **Recommendation**, and **Security considerations** sections verbatim into SPEC Context, prefixed with `> Source: .claude/vibe/research/<slug>/paper.md (cached {{FILE_MTIME}})`
|
|
411
411
|
4. **Skip step 3** (parallel research) entirely — do not re-run GPT/Gemini/Claude agents
|
|
412
|
-
5.
|
|
412
|
+
5. Print: `✅ Research cache hit: <slug> (saved ~30s of LLM calls)`
|
|
413
413
|
|
|
414
414
|
**Cache invalidation:**
|
|
415
|
-
- User passes `--refresh-research` → delete dir,
|
|
416
|
-
- `paper.md` mtime older than 30 days → warn user, ask to refresh or reuse
|
|
417
|
-
-
|
|
415
|
+
- User passes `--refresh-research` → delete dir, rerun step 3 from scratch
|
|
416
|
+
- `paper.md` mtime older than 30 days → warn the user, ask to refresh or reuse
|
|
417
|
+
- `stack` in `paper.md` frontmatter differs from current stack → auto-refresh
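
The three invalidation rules can be read as one decision function. The sketch below is an assumption-laden illustration: `CacheState` and `decideCache` are hypothetical names, and the order in which the rules are checked (flag, then stack drift, then age) is a guess, since the text does not specify precedence.

```typescript
type CacheDecision = "hit" | "refresh" | "ask-user";

interface CacheState {
  refreshFlag: boolean;  // user passed --refresh-research
  paperMtimeMs: number;  // mtime of paper.md, epoch milliseconds
  nowMs: number;         // current time, epoch milliseconds
  cachedStack: string;   // `stack` from paper.md frontmatter
  currentStack: string;  // stack detected for the current project
}

const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Precedence (flag, then stack drift, then age) is an assumption of this sketch.
function decideCache(s: CacheState): CacheDecision {
  if (s.refreshFlag) return "refresh";
  if (s.cachedStack !== s.currentStack) return "refresh";
  if (s.nowMs - s.paperMtimeMs > MAX_AGE_MS) return "ask-user";
  return "hit";
}
```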
|
|
418
418
|
|
|
419
419
|
**If `paper.md` does NOT exist:**
|
|
420
420
|
Proceed to step 3. After step 3 completes, the synthesizer **must** write the 3 artifacts (see `parallel-research/orchestrator.md` Phase 5) so the next `/vibe.spec` run on this topic hits the cache.
|
|
@@ -565,19 +565,19 @@ Task(subagent_type="ui-layout-architect",
|
|
|
565
565
|
|
|
566
566
|
### 3.9 Persist Research Cache (AFTER research completes, BEFORE SPEC write)
|
|
567
567
|
|
|
568
|
-
> The "no Write during research" rule
|
|
568
|
+
> The "no Write during research" rule from step 3 does **not** apply here — research is done; artifacts are safe to persist.
|
|
569
569
|
|
|
570
|
-
After parallel research + UI/UX intelligence complete
|
|
570
|
+
After parallel research + UI/UX intelligence complete, before writing the SPEC, save the merged research to `.claude/vibe/research/<slug>/`:
|
|
571
571
|
|
|
572
572
|
1. Compute slug: kebab-case of feature name, max 50 chars
|
|
573
573
|
2. Write **three files** using templates from `parallel-research/templates/`:
|
|
574
574
|
- `.claude/vibe/research/<slug>/synthesis.md` — raw merged findings (all agent outputs)
|
|
575
|
-
- `.claude/vibe/research/<slug>/awesome-list.md` — curated links/repos/patterns (
|
|
575
|
+
- `.claude/vibe/research/<slug>/awesome-list.md` — curated links/repos/patterns (every entry needs a one-line "why"; drop entries without one)
|
|
576
576
|
- `.claude/vibe/research/<slug>/paper.md` — structured survey (Abstract → Background → Method → Findings → Recommendation → Security → References)
|
|
577
|
-
3. Include a frontmatter header in `paper.md` with `stack:` field so step 2.9 can detect stack drift
|
|
578
|
-
4. If the directory already exists (user
|
|
577
|
+
3. Include a frontmatter header in `paper.md` with a `stack:` field so step 2.9 can detect stack drift
|
|
578
|
+
4. If the directory already exists (user passed `--refresh-research`), overwrite
|
|
579
579
|
|
|
580
|
-
This makes the next `/vibe.spec` or `/vibe.research` invocation on the same topic hit the cache at step 2.9.
|
|
580
|
+
This makes the next `/vibe.spec` (or future `/vibe.research`) invocation on the same topic hit the cache at step 2.9.
|
|
581
581
|
|
|
582
582
|
### 4. Write SPEC Document (PTCF Structure)
|
|
583
583
|
|
|
@@ -0,0 +1,247 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: vibe-test
|
|
3
|
+
tier: core
|
|
4
|
+
description: "Self-test vibe across CC and coco. Subcommands: parity (static file/content comparison between ~/.claude and ~/.coco install dirs), report (runtime invocation of every command/skill/hook/agent/tool in the current harness), compare (diff two JSON reports). P1 drift (one-side missing) auto-registers via vibe-regress. Must use this skill when user runs /vibe.test, when verifying multi-harness compatibility before release, or when the user says 'parity', 'self-test', 'CC vs coco', 'both harnesses'."
|
|
5
|
+
triggers: [test, parity, self-test, "양쪽", "CC vs coco", "harness 동일"]
|
|
6
|
+
priority: 70
|
|
7
|
+
chain-next: []
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# vibe-test — Multi-Harness Self-Test
|
|
11
|
+
|
|
12
|
+
**Purpose**: mechanically verify vibe presents the same surface in Claude Code and coco. Catch features broken on one harness before users do.
|
|
13
|
+
|
|
14
|
+
## Why this exists
|
|
15
|
+
|
|
16
|
+
Vibe explicitly supports two harnesses (CC, coco). When new commands are added, only one side might get updated, or `AGENTS.md` ↔ `CLAUDE.md` may drift, and there is no automated check until a user reports it. This skill closes that gap.
|
|
17
|
+
|
|
18
|
+
## Storage Contract
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
.claude/vibe/test-reports/ # CC side artifacts
|
|
22
|
+
.coco/vibe/test-reports/ # coco side artifacts (when run from coco)
|
|
23
|
+
|
|
24
|
+
<YYYYMMDD-HHmm>-cc.json # machine-comparable
|
|
25
|
+
<YYYYMMDD-HHmm>-cc.md # human summary
|
|
26
|
+
<YYYYMMDD-HHmm>-coco.json
|
|
27
|
+
<YYYYMMDD-HHmm>-coco.md
|
|
28
|
+
<YYYYMMDD-HHmm>-parity.json # output of `parity` subcommand
|
|
29
|
+
<YYYYMMDD-HHmm>-compare.md # output of `compare` subcommand
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
### Report schema (JSON)
|
|
33
|
+
|
|
34
|
+
```json
|
|
35
|
+
{
|
|
36
|
+
"harness": "cc | coco",
|
|
37
|
+
"version": "2.9.21",
|
|
38
|
+
"timestamp": "2026-04-14T18:30:00+09:00",
|
|
39
|
+
"vibe-version": "from package.json",
|
|
40
|
+
"commands": [
|
|
41
|
+
{ "name": "vibe.spec", "loaded": true, "first-response-ok": true, "error": null }
|
|
42
|
+
],
|
|
43
|
+
"skills": [
|
|
44
|
+
{ "name": "vibe-spec", "trigger-recognized": true, "context-injected": true, "error": null }
|
|
45
|
+
],
|
|
46
|
+
"hooks": [
|
|
47
|
+
{ "name": "pre-tool-guard", "test-suite": "passed | failed", "tests": "32/32" }
|
|
48
|
+
],
|
|
49
|
+
"agents": [],
|
|
50
|
+
"tools": []
|
|
51
|
+
}
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
## Subcommand: `parity` — static comparison (stage 1, in-scope target)
|
|
55
|
+
|
|
56
|
+
No harness execution. Only file system + body inspection. Fast and deterministic.
|
|
57
|
+
|
|
58
|
+
### Steps
|
|
59
|
+
|
|
60
|
+
1. **Confirm both install dirs exist**:
|
|
61
|
+
- CC: `~/.claude/{commands,skills,agents}/`
|
|
62
|
+
- coco: `~/.coco/{commands,skills,agents}/` (`COCO_HOME` env takes precedence)
|
|
63
|
+
- If either side is missing, exit cleanly with guidance (not an error)
|
|
64
|
+
|
|
65
|
+
2. **Install set diff**:
|
|
66
|
+
```bash
|
|
67
|
+
find ~/.claude/commands -type f -name '*.md' -exec basename {} \; | sort > /tmp/cc-cmds
|
|
68
|
+
find ~/.coco/commands -type f -name '*.md' -exec basename {} \; | sort > /tmp/coco-cmds
|
|
69
|
+
diff /tmp/cc-cmds /tmp/coco-cmds
|
|
70
|
+
```
|
|
71
|
+
Repeat for skills/agents. Persist diff entries to `parity.json` field `install-set-diff`.
|
|
72
|
+
|
|
73
|
+
3. **Content sync (CLAUDE.md ↔ AGENTS.md)**:
|
|
74
|
+
- Read both files; strip header block (leading `> ` lines plus filename mentions)
|
|
75
|
+
- Normalize body: map `.claude` ↔ `.coco`, `Claude Code` ↔ `coco`, `CLAUDE.md` ↔ `AGENTS.md`
|
|
76
|
+
- Lines that still differ after normalization go into `content-drift`
|
|
77
|
+
|
|
78
|
+
4. **Path reference validation**:
|
|
79
|
+
- Extract `~/.claude/`, `.claude/vibe/` patterns from CLAUDE.md → confirm each resolves under the actual install dir
|
|
80
|
+
- Extract `~/.coco/`, `.coco/vibe/` patterns from AGENTS.md → same check
|
|
81
|
+
- Wrong paths (e.g. AGENTS.md referencing `.codex/` after a rename) are classified as `path-error`
|
|
82
|
+
|
|
83
|
+
5. **Console output**:
|
|
84
|
+
```
|
|
85
|
+
📊 PARITY REPORT
|
|
86
|
+
|
|
87
|
+
Install set:
|
|
88
|
+
✅ commands: 14/14 matched
|
|
89
|
+
❌ skills: 1 missing in coco (vibe-test)
|
|
90
|
+
|
|
91
|
+
Content sync:
|
|
92
|
+
✅ CLAUDE.md ↔ AGENTS.md normalized diff: clean
|
|
93
|
+
|
|
94
|
+
Path references:
|
|
95
|
+
✅ all paths resolve to existing dirs
|
|
96
|
+
|
|
97
|
+
📈 Parity Score: 95/100
|
|
98
|
+
📁 Saved: .claude/vibe/test-reports/20260414-1830-parity.json
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
6. **Auto-register P1 drift**:
|
|
102
|
+
- On `install-set-diff` finding → call `/vibe.regress register --from-test`
|
|
103
|
+
- symptom: `"Parity drift: <category> missing in <harness>"`
|
|
104
|
+
- root-cause-tag: `integration`
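
The normalization in step 3 could be sketched like this. It is a simplified illustration: header stripping is omitted, lines are compared by position rather than with a real diff algorithm, and `normalizeAgentsLine` and `contentDrift` are hypothetical names.

```typescript
// Map coco-side wording onto the CC-side wording so the two files can be compared.
function normalizeAgentsLine(line: string): string {
  return line
    .replace(/AGENTS\.md/g, "CLAUDE.md")
    .replace(/\.coco\b/g, ".claude")      // paths first, so "~/.coco" becomes "~/.claude"
    .replace(/\bcoco\b/g, "Claude Code"); // then the remaining harness name
}

// Naive positional comparison; a real implementation would likely use a proper diff.
function contentDrift(claudeMd: string, agentsMd: string): string[] {
  const left = claudeMd.split("\n");
  const right = agentsMd.split("\n").map(normalizeAgentsLine);
  const drift: string[] = [];
  const len = Math.max(left.length, right.length);
  for (let i = 0; i < len; i += 1) {
    const a = left[i] ?? "";
    const b = right[i] ?? "";
    if (a !== b) drift.push(`line ${i + 1}: "${a}" vs "${b}"`);
  }
  return drift;
}
```

An empty result corresponds to the "normalized diff: clean" line in the console output; anything else would go into `content-drift`.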
|
|
105
|
+
|
|
106
|
+
## Subcommand: `report` — runtime invocation
|
|
107
|
+
|
|
108
|
+
Inspect every shipped feature in the current harness, capture pass/fail, and emit the JSON+MD report defined above.
|
|
109
|
+
|
|
110
|
+
### Probe philosophy
|
|
111
|
+
|
|
112
|
+
- **No external LLM calls.** The probe is structural + execution-based, not generative. Cost ≈ a few file reads plus running `vitest`.
|
|
113
|
+
- **Interactive commands are NOT actually invoked.** Calling `/vibe.spec` would block on the interview loop. The probe checks structural validity only and records `invocable: true` if the file is well-formed.
|
|
114
|
+
- **Hooks and tools have real unit tests** in the repo — run them, do not simulate.
|
|
115
|
+
- A probe failure never stops the run. Each entry's `error` field captures the cause; the report keeps going.
|
|
116
|
+
|
|
117
|
+
### Steps
|
|
118
|
+
|
|
119
|
+
1. **Resolve install dir for current harness**:
|
|
120
|
+
- CC: `~/.claude/`
|
|
121
|
+
- coco: `~/.coco/` (`COCO_HOME` overrides)
|
|
122
|
+
- Detect the harness by checking `process.env.COCO_HOME` first, then by which install dir the skill is currently being read from. If both dirs are present, use the harness this skill was invoked from.
|
|
123
|
+
|
|
124
|
+
2. **Probe `commands`** — for each `<install>/commands/*.md`:
|
|
125
|
+
- `loaded`: file exists and is non-empty
|
|
126
|
+
- `frontmatter-valid`: YAML frontmatter parses; required keys present (`description`)
|
|
127
|
+
- `argument-hint-present`: optional but recorded
|
|
128
|
+
- `body-references-skill`: body contains `Load skill ` or `## Process` (signal that the command delegates correctly)
|
|
129
|
+
- Result: `{ name, loaded, frontmatter-valid, body-references-skill, error }`
|
|
130
|
+
|
|
131
|
+
3. **Probe `skills`** — for each `<install>/skills/*/SKILL.md`:
|
|
132
|
+
- `loaded`: file exists
|
|
133
|
+
- `frontmatter-valid`: YAML parses with required keys: `name`, `tier`, `description`, `triggers`
|
|
134
|
+
- `triggers-non-empty`: triggers array has ≥1 entry
|
|
135
|
+
- `description-mentions-trigger-conditions`: heuristic — description contains `Must use this skill when` or equivalent (vibe convention)
|
|
136
|
+
- Result: `{ name, loaded, frontmatter-valid, triggers-count, error }`
|
|
137
|
+
|
|
138
|
+
4. **Probe `hooks`** — for each `<install>/hooks/scripts/*.js` (or repo `hooks/scripts/` if testing the source):
|
|
139
|
+
- If a matching `__tests__/<hook-name>.test.js` exists → run `npx vitest run hooks/scripts/__tests__/<hook>.test.js --reporter=json` and parse the result
|
|
140
|
+
- If no test exists → mark `test-suite: "no-tests"` (warn, not fail)
|
|
141
|
+
- Result: `{ name, test-suite: "passed" | "failed" | "no-tests", tests: "<passed>/<total>", error }`
|
|
142
|
+
|
|
143
|
+
5. **Probe `agents`** — for each `<install>/agents/*.md`:
|
|
144
|
+
- `loaded`, `frontmatter-valid` (required: `name`, `description`, `tools`)
|
|
145
|
+
- `tools-list-valid`: every tool in the `tools` array matches a known harness tool (Read, Glob, Grep, Bash, Edit, Write, WebSearch, WebFetch, Task, plus the agent-specific Skill etc.)
|
|
146
|
+
- Result: `{ name, loaded, frontmatter-valid, tools-list-valid, error }`
|
|
147
|
+
|
|
148
|
+
6. **Probe `tools`** — for each tool exported from `dist/tools/index.js`:
|
|
149
|
+
- If a matching test file exists in `src/tools/__tests__/` → run vitest and capture pass/fail
|
|
150
|
+
- If no test → call the tool with a minimal known-safe input (e.g. `validateCodeQuality` against a tiny fixture) and verify the response is well-shaped JSON
|
|
151
|
+
- Result: `{ name, test-suite | smoke-call, status, error }`
|
|
152
|
+
|
|
153
|
+
7. **Compile JSON + Markdown reports** to `<project-vibe-dir>/test-reports/<YYYYMMDD-HHmm>-<harness>.{json,md}` per the schema above.
|
|
154
|
+
|
|
155
|
+
8. **Print summary**:
|
|
156
|
+
```
|
|
157
|
+
📊 RUNTIME REPORT (cc)
|
|
158
|
+
commands: 14/14 loaded, 14/14 frontmatter-valid
|
|
159
|
+
skills: 28/28 loaded, 1 missing description-mentions-trigger-conditions
|
|
160
|
+
hooks: 7/7 test suites passed (118/118 tests)
|
|
161
|
+
agents: 42/42 loaded, 0 with invalid tools
|
|
162
|
+
tools: 9/9 passing
|
|
163
|
+
📈 Score: 99/100
|
|
164
|
+
📁 .claude/vibe/test-reports/20260414-1845-cc.json
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
### Failure handling
|
|
168
|
+
|
|
169
|
+
| Probe failure | Action |
|
|
170
|
+
|---|---|
|
|
171
|
+
| frontmatter parse error | record + continue |
|
|
172
|
+
| missing required key | record + continue |
|
|
173
|
+
| vitest run failure | capture stderr summary into `error` field, continue |
|
|
174
|
+
| tool smoke-call exception | record exception type + continue |
|
|
175
|
+
| install dir not found | abort with clear message — cannot probe what is not installed |
|
|
176
|
+
|
|
177
|
+
### What this catches
|
|
178
|
+
|
|
179
|
+
- A new command added in source but missed by `postinstall` (file present in repo, absent from `~/.claude/commands/`)
|
|
180
|
+
- Skill with malformed frontmatter (would fail to register at runtime)
|
|
181
|
+
- Agent listing a tool that does not exist in the harness
|
|
182
|
+
- Hook unit test regression (matches existing CI guard but locally observable)
|
|
183
|
+
- Tool that broke between the test fixture and the shipped build
|
|
184
|
+
|
|
185
|
+
### What this does NOT catch
|
|
186
|
+
|
|
187
|
+
- LLM behavioral drift (interactive command actually behaving differently)
|
|
188
|
+
- Race conditions in agent orchestration
|
|
189
|
+
- Real-world failures that depend on user input
|
|
190
|
+
|
|
191
|
+
These belong to higher-effort future work (functional e2e, currently not in scope).
|
|
192
|
+
|
|
193
|
+
## Subcommand: `compare` — diff two reports
|
|
194
|
+
|
|
195
|
+
```
|
|
196
|
+
/vibe.test compare <cc-report.json> <coco-report.json>
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
### Steps
|
|
200
|
+
|
|
201
|
+
1. Load both JSON files. Compare timestamps; warn if they differ by more than 1 minute ("report timing skew detected, confidence low")
|
|
202
|
+
2. Match entries per category by `name`
|
|
203
|
+
3. Classify:
|
|
204
|
+
- **P1**: present on only one side → missing
|
|
205
|
+
- **P2**: present both sides but mismatched booleans (`loaded`, `first-response-ok`, `trigger-recognized`) → behavioral drift
|
|
206
|
+
- **P3**: only error wording differs, behavior identical → informational
|
|
207
|
+
4. Persist result as `<ts>-compare.md`
|
|
208
|
+
5. P1 findings auto-register via `/vibe.regress`
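
The classification in step 3 might be sketched as below. Assumptions: entries of one category are already loaded from the two reports, every boolean field is compared (the text names `loaded`, `first-response-ok`, and `trigger-recognized` as examples), and `Entry`, `Finding`, and `classify` are illustrative names.

```typescript
type Severity = "P1" | "P2" | "P3";
type Entry = Record<string, unknown> & { name: string };

interface Finding {
  name: string;
  severity: Severity;
  detail: string;
}

// Compare one category (e.g. commands) from the CC report against the coco report.
function classify(cc: Entry[], coco: Entry[]): Finding[] {
  const findings: Finding[] = [];
  const cocoByName = new Map<string, Entry>();
  coco.forEach((e) => cocoByName.set(e.name, e));
  const seen = new Set<string>();
  for (const a of cc) {
    seen.add(a.name);
    const b = cocoByName.get(a.name);
    if (b === undefined) {
      findings.push({ name: a.name, severity: "P1", detail: "missing in coco" });
      continue;
    }
    // P2: same entry on both sides, but a boolean field disagrees.
    const mismatched = Object.keys(a).filter(
      (k) => typeof a[k] === "boolean" && typeof b[k] === "boolean" && a[k] !== b[k],
    );
    if (mismatched.length > 0) {
      findings.push({ name: a.name, severity: "P2", detail: `mismatch: ${mismatched.join(", ")}` });
    } else if (a["error"] !== b["error"]) {
      // P3: booleans agree, only the error wording differs.
      findings.push({ name: a.name, severity: "P3", detail: "error wording differs" });
    }
  }
  for (const b of coco) {
    if (!seen.has(b.name)) findings.push({ name: b.name, severity: "P1", detail: "missing in cc" });
  }
  return findings;
}
```

Only the `P1` findings would be forwarded to `/vibe.regress`; `P2` and `P3` stay in the compare report.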
|
|
209
|
+
|
|
210
|
+
## Integration Points
|
|
211
|
+
|
|
212
|
+
### Release flow
|
|
213
|
+
|
|
214
|
+
Recommended pre-release ritual:
|
|
215
|
+
```
|
|
216
|
+
1. From CC: /vibe.test parity → must pass
|
|
217
|
+
2. From coco: /vibe.test parity → must pass (when feasible)
|
|
218
|
+
3. Both green → pnpm release
|
|
219
|
+
```
|
|
220
|
+
|
|
221
|
+
### To /vibe.regress
|
|
222
|
+
|
|
223
|
+
On P1 drift:
|
|
224
|
+
```
|
|
225
|
+
Load skill `vibe-regress` with:
|
|
226
|
+
subcommand: register --from-test
|
|
227
|
+
symptom: "<category> drift: <name> missing in <harness>"
|
|
228
|
+
root-cause-tag: integration
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
## Done Criteria
|
|
232
|
+
|
|
233
|
+
### Subcommand: parity
|
|
234
|
+
- [ ] Works without any external calls
|
|
235
|
+
- [ ] Missing one install dir → clean exit with guidance (not an error)
|
|
236
|
+
- [ ] `install-set-diff`, `content-drift`, `path-error` reported as separate categories
|
|
237
|
+
- [ ] P1 findings invoke `/vibe.regress` automatically
|
|
238
|
+
- [ ] `compare` warns on report timing skew (timestamps more than 1 minute apart)
|
|
239
|
+
|
|
240
|
+
### Subcommand: report
|
|
241
|
+
- [ ] No external LLM calls (cost = file reads + vitest runs only)
|
|
242
|
+
- [ ] Interactive commands probed structurally, never actually invoked
|
|
243
|
+
- [ ] Hook and tool tests run via real vitest, not simulated
|
|
244
|
+
- [ ] A probe failure on one entry never stops the run
|
|
245
|
+
- [ ] JSON report matches the schema in "Storage Contract"
|
|
246
|
+
- [ ] Markdown summary printed to console after run completes
|
|
247
|
+
- [ ] Install dir absent → abort with clear message (not silent)
|