@tgoodington/intuition 10.5.0 → 10.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@tgoodington/intuition",
-  "version": "10.5.0",
+  "version": "10.7.0",
   "description": "Domain-adaptive workflow system for Claude Code: prompt, outline, assemble specialist teams, detail with domain experts, build with format producers, test code output. Supports v8 compat (design, engineer, build) and v9 specialist workflows with 14 domain specialists and 6 format producers.",
   "keywords": [
     "claude-code",
@@ -130,11 +130,17 @@ If the user's initial CAPTURE response already covers some dimensions, skip them
 
 Every question in REFINE follows these principles:
 
-**Derive from their words.** Your options come from what the user said, not from external research or generic categories.
+**Derive from their words.** Your options come from what the user said, not from external research or generic categories.
 
-
-
-
+**ANTI-PATTERN: Always presenting 3 options.** This is the most common failure mode. You must count the actual distinct possibilities the user's words imply — sometimes that's 2, sometimes 5, sometimes 6. Three is not a safe default; it's a lazy one. Before presenting options, explicitly list the distinct possibilities you've identified and use ALL of them.
+
+Examples at different scales:
+
+- 2 options: "You said 'handle transfers' — does that mean (a) bulk migration when someone leaves, or (b) real-time co-ownership?"
+- 3 options: "You mentioned 'fast' — is that (a) sub-second response times, (b) same-day turnaround, or (c) perceived speed through progressive loading?"
+- 4 options: "The notification system could be (a) email-only, (b) in-app real-time, (c) digest-based batching, or (d) user-configured per event type."
+- 5 options: "For auth you've got (a) email/password, (b) SSO via existing provider, (c) magic links, (d) OAuth social login, or (e) passkeys."
+- 6+ options: "The dashboard layout could be (a) single KPI grid, (b) tabbed by department, (c) role-based views, (d) customizable drag-and-drop, (e) narrative/report style, or (f) a combined feed with filters."
 
 Always include a trailing "or something else entirely?" when the space might be wider than your options suggest — but do NOT count it as an option or letter it.
 
@@ -25,6 +25,8 @@ These are non-negotiable. Violating any of these means the protocol has failed.
 9. You MUST run the Exit Protocol after writing the test report. NEVER route to `/intuition-handoff`.
 10. You MUST update `.project-memory-state.json` as part of the Exit Protocol.
 11. You MUST NOT use `run_in_background` for subagents in Steps 2 and 5. All research and test-creation agents MUST complete before their next step begins.
+12. You MUST NOT create tests for non-code deliverables (SKILL.md files, markdown docs, JSON config, static HTML/CSS). Pattern-matching source content is not testing. Classify deliverables in Step 1.5 and skip non-code files entirely.
+13. You MUST design smoke tests for infrastructure scripts (install, deploy, build, publish scripts) that actually execute the script in an isolated temp environment. Do NOT grep infrastructure script source code.
 
 ## CONTEXT PATH RESOLUTION
 
@@ -39,15 +41,16 @@ On startup, before reading any files:
 ## PROTOCOL: COMPLETE FLOW
 
 ```
-Step 1:
-Step
-Step
-Step
-Step
+Step 1: Read context (state, build_report, blueprints, decisions, outline)
+Step 1.5: Classify deliverables (code / infrastructure-script / non-code)
+Step 2: Analyze test infrastructure (2 parallel intuition-researcher agents)
+Step 3: Design test strategy — code and infrastructure only (self-contained domain reasoning)
+Step 4: Confirm test plan with user (including skipped non-code files)
+Step 5: Create tests (delegate to sonnet code-writer subagents)
 Step 5.5: Spec compliance audit (assertion provenance + abstraction level coverage)
-Step 6:
-Step 7:
-Step 8:
+Step 6: Run tests + fix cycle (debugger-style autonomy)
+Step 7: Write test_report.md
+Step 8: Exit Protocol (state update, completion)
 ```
 
 ## RESUME LOGIC
 
@@ -75,6 +78,33 @@ Read these files:
 
 From these files, extract: **build_report** → files modified (scope boundary), task results, deviations, decision compliance, deferred test deliverables. **Blueprints** → Section 5 behavioral contracts (signatures, return schemas, error conditions, naming), Section 6 AC mapping, Section 9 file paths. **test_advisory** → edge cases, critical paths, failure modes. **Decisions** → index of all [USER] and [SPEC] decisions with chosen options (used in Step 6 boundary checking).
 
+## STEP 1.5: DELIVERABLE CLASSIFICATION
+
+After reading the build report, classify every output file into one of three categories. This determines what gets tested and how. Files classified as `non-code` are excluded from test design entirely — no structural validation, no grep-based content tests.
+
+| Category | Examples | Test Approach |
+|----------|----------|---------------|
+| **Code** | .py, .js, .ts, .jsx, .tsx, .go, .rs, .java, .rb, .php modules | Unit/integration tests (Tiers 1-3) |
+| **Infrastructure script** | postinstall hooks, deploy scripts, build/publish scripts, CLI tools | Smoke tests (actually execute in isolated temp environment) |
+| **Non-code** | SKILL.md, .md docs, .json config/schema, static .html/.css, .yaml config | **Skip** — not executable, not meaningfully testable |
+
+**Classification rules:**
+1. Read the `Files Modified` section of build_report.md
+2. For each file, classify by extension AND purpose:
+   - SKILL.md files → **non-code** (prompt engineering artifacts — testing them via pattern matching is low-signal and expensive)
+   - Markdown documentation → **non-code**
+   - JSON schema definitions or config-only changes (e.g., adding `"private": true` to package.json) → **non-code**
+   - HTML templates rendered by a server framework (Jinja2, EJS, etc.) → **code** (tested indirectly via route tests, not directly)
+   - Static HTML/CSS with no server logic → **non-code**
+   - Python/JavaScript/TypeScript modules with functions, classes, or route handlers → **code**
+   - Scripts invoked via npm hooks, CLI, or build pipelines → **infrastructure script**
+3. Record the classification for use in Steps 3-5
+
+**If ALL deliverables are non-code**, present via AskUserQuestion:
+"All deliverables are non-code (prompt files, config, documentation). Standard testing does not apply. Options: Skip testing / Proceed anyway"
+
+Default recommendation: Skip testing. Write a minimal test_report.md noting no testable code was produced.
+
 ## STEP 2: RESEARCH (2 Parallel Research Agents)
 
 Spawn two `intuition-researcher` agents in parallel (both Task calls in a single response). Do NOT use `run_in_background` — you MUST wait for both agents to return before proceeding to Step 3:
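The classification rules the new STEP 1.5 section adds are mechanical enough to sketch in code. A minimal illustration, not part of the package (the `classify` helper and its `invoked_by_hook_or_cli` flag are hypothetical; real classification also inspects purpose, e.g. server-rendered HTML counts as code, which an extension check alone cannot capture):

```python
from pathlib import Path

# Extension set follows the Step 1.5 table's "Code" row.
CODE_EXTS = {".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs", ".java", ".rb", ".php"}

def classify(path: str, invoked_by_hook_or_cli: bool = False) -> str:
    """Sketch of the Step 1.5 rules: purpose checks first, then extension."""
    p = Path(path)
    if p.name == "SKILL.md":
        return "non-code"                   # prompt artifact, skipped entirely
    if invoked_by_hook_or_cli:
        return "infrastructure-script"      # npm hook / CLI / build pipeline
    if p.suffix in CODE_EXTS:
        return "code"
    return "non-code"                       # .md, .json, .yaml, static .html/.css

print(classify("src/engine.ts"))                                    # code
print(classify("scripts/install.js", invoked_by_hook_or_cli=True))  # infrastructure-script
print(classify("skills/foo/SKILL.md"))                              # non-code
```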
@@ -138,14 +168,22 @@ If process_flow.md conflicts with actual implementation, check build_report.md f
 
 ### File-to-Tier Mapping
 
-
-
-
-
-
-
-
-
+Only files classified as `code` or `infrastructure-script` in Step 1.5 appear here. Non-code files are excluded entirely — do NOT create structural validation or grep-based tests for them.
+
+| File Type | Category | Test Approach |
+|-----------|----------|---------------|
+| Route / controller | Code | Tier 1 (AC tests via HTTP) |
+| Engine / orchestrator | Code | Tier 1 (AC tests of engine API) |
+| Service / provider | Code | Tier 2 (blueprint contract) |
+| Model / schema | Code | Tier 2 (blueprint contract) |
+| Utility / helper | Code | Tier 3, or Tier 2 if blueprint specifies |
+| Install / deploy / build script | Infrastructure | Smoke test (execute in temp env) |
+| CLI tool | Infrastructure | Smoke test (execute with test args) |
+| Server-rendered template (.html with server logic) | Code | Tested indirectly via Tier 1 route tests |
+| SKILL.md / prompt file | Non-code | **Skip** |
+| Markdown / documentation | Non-code | **Skip** |
+| JSON config / schema-only changes | Non-code | **Skip** |
+| Static HTML / CSS | Non-code | **Skip** |
 
 ### Tier Distribution Minimums
 
@@ -183,12 +221,36 @@ Before finalizing the test plan, review specialist domain knowledge from bluepri
 
 Incorporate specialist insights as advisory, not prescriptive — you own the test strategy.
 
+### Smoke Test Design (for infrastructure scripts)
+
+For files classified as `infrastructure-script` in Step 1.5, design smoke tests that **actually execute the script** in an isolated environment. Do NOT write structural validation tests that grep the script's source code — pattern-matching source code catches almost nothing useful and wastes tokens.
+
+**Isolation strategy:**
+- Set `HOME` (or equivalent) to a temp directory to avoid modifying real user data
+- Create required directory structures (source files, config) in the temp environment
+- Clean up after each test (or use the test framework's temp directory support)
+
+**What smoke tests MUST verify:**
+- Script runs without errors (exit code 0) under normal conditions
+- Script creates expected output files and directories
+- Script handles missing prerequisites gracefully (exit code non-zero, meaningful error message)
+- Script preserves data it should preserve (e.g., user config not overwritten on update)
+- Script output matches cross-domain contracts (e.g., generated manifest schema matches the consuming endpoint's expected format)
+
+**What smoke tests MUST NOT do:**
+- Grep source code for variable names, array contents, or string patterns
+- Test internal implementation details — test observable behavior only
+- Validate that specific lines of code exist — that is not testing
+
+Smoke tests count as **Tier 1** if they exercise an acceptance criterion's observable behavior, or **Tier 2** if they verify a blueprint behavioral contract. They follow the same tier distribution and negative test minimums as code tests.
+
 ### Output
 
 Write the test strategy to `{context_path}/scratch/test_strategy.md`. This serves as both an audit trail and a resume marker for crash recovery.
 
 The test strategy document MUST contain:
-- **
+- **Deliverable classification**: List every file from build_report, its category (code / infrastructure-script / non-code), and rationale. Non-code files are listed as skipped with brief reason.
+- **AC coverage matrix**: For each acceptance criterion, which test(s) cover it, at what tier, and at what abstraction level. Every AC with observable behavior MUST have at least one Tier 1 test. ACs that apply exclusively to non-code deliverables should be noted as "not testable — non-code deliverable."
 - **Tier distribution**: Total count per tier with percentages. Verify: Tier 1 ≥ 40%, Tier 3 ≤ 30%. If not met, adjust plan before proceeding.
 - **Negative test inventory**: List each negative/error-path test explicitly. Verify: ≥ 30% of Tier 1/2 tests are negative. If not met, add more error-path tests.
 - Test files to create (path, tier, target source file)
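The isolation strategy the new smoke-test section describes can be sketched end to end. The snippet below is illustrative only: it generates a stand-in "install script" inline, whereas a real smoke test would invoke the actual deliverable:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for an install script: creates a directory under $HOME.
SCRIPT = 'import os, pathlib; pathlib.Path(os.environ["HOME"], ".claude").mkdir()'

def run_isolated(script_body: str):
    """Run a script with HOME pointed at a fresh temp dir; return (exit code, temp HOME)."""
    home = Path(tempfile.mkdtemp())
    env = {**os.environ, "HOME": str(home)}
    proc = subprocess.run([sys.executable, "-c", script_body],
                          env=env, capture_output=True)
    return proc.returncode, home

code, home = run_isolated(SCRIPT)
assert code == 0                    # ran cleanly under normal conditions
assert (home / ".claude").is_dir()  # observable effect checked on disk, never by grepping source
print("smoke test passed")
```

The same harness covers the missing-prerequisite case: run a script that fails and assert on the non-zero exit code and error output.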
@@ -196,7 +258,7 @@ The test strategy document MUST contain:
 - Mock requirements per file (mock external deps only for Tier 1/2; Tier 3 may mock internal seams). For infra projects: flag files needing mock-depth assertions (call args, call order, call count).
 - Framework command to run tests
 - Estimated test count and distribution by tier
-- **Mutation spot-check candidates**: 3 source files with highest Tier 1/2 coverage, and one candidate mutation per file
+- **Mutation spot-check candidates**: 3 source files with highest Tier 1/2 coverage, and one candidate mutation per file (only `code` and `infrastructure-script` files are eligible)
 - Which specialist recommendations were incorporated (and which were skipped, with rationale)
 - Any acceptance criteria where the expected behavior is ambiguous (flagged for potential SPEC_AMBIGUOUS markers)
 
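The three numeric gates the strategy document must satisfy (Tier 1 at least 40%, Tier 3 at most 30%, negatives at least 30% of Tier 1/2) are easy to verify mechanically. A sketch with made-up counts; the helper is illustrative, not part of the protocol:

```python
def check_distribution(tier1: int, tier2: int, tier3: int, negative_t12: int) -> list[str]:
    """Return a list of violated gates; empty list means the plan passes."""
    total = tier1 + tier2 + tier3
    problems = []
    if tier1 / total < 0.40:
        problems.append(f"Tier 1 is {tier1 / total:.0%}, below the 40% minimum")
    if tier3 / total > 0.30:
        problems.append(f"Tier 3 is {tier3 / total:.0%}, above the 30% cap")
    if negative_t12 / (tier1 + tier2) < 0.30:
        problems.append(f"negatives are {negative_t12 / (tier1 + tier2):.0%} of Tier 1/2, below 30%")
    return problems

print(check_distribution(tier1=12, tier2=8, tier3=5, negative_t12=7))  # [] -- plan OK
```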
@@ -208,13 +270,14 @@ Present the test plan via AskUserQuestion:
 Question: "Test plan ready:
 
 **Framework:** [detected framework]
+**Deliverables:** [N] code files + [N] infrastructure scripts tested, [N] non-code files skipped
 **Test files:** [N] files
-**Test cases:** ~[total] tests covering [file count]
-- Tier 1 (AC tests): [N] tests ([X]% of total, min 40%) covering [M] of [P] acceptance criteria
+**Test cases:** ~[total] tests covering [file count] testable files
+- Tier 1 (AC tests): [N] tests ([X]% of total, min 40%) covering [M] of [P] testable acceptance criteria
 - Tier 2 (blueprint contracts): [N] tests
 - Tier 3 (coverage): [N] tests ([X]% of total, max 30%)
 **Negative tests:** [N] of [M] Tier 1/2 tests ([X]%, min 30%)
-**
+**Skipped (non-code):** [list skipped file types and count, e.g., '7 SKILL.md files, 2 config changes']
 **Coverage target:** [threshold]%
 **Post-pass:** Mutation spot-check on 3 files
 
@@ -293,6 +356,48 @@ Label every test with: `# Coverage test — not derived from spec`
 Write focused unit tests for uncovered code paths. Follow existing test style.
 ```
 
+### Infrastructure Script Smoke Test Writer Prompt
+
+```
+You are a smoke test writer. Your tests actually EXECUTE the script and verify observable behavior. You do NOT grep source code.
+
+**Framework:** [detected framework + version]
+**Test conventions:** [naming pattern, directory structure, import style from Step 2]
+
+**Script under test:** [script file path]
+**Script purpose:** [what the script does — from build report]
+**Script invocation:** [how the script is run — npm hook, CLI command, etc.]
+
+**Spec oracle — what the script SHOULD do:**
+- Acceptance criteria: [paste relevant ACs]
+- Blueprint spec: Read [relevant blueprint path] — Section 5 for behavioral contracts
+- Cross-domain contracts: [any output schemas consumed by other components]
+
+**Test cases to implement:**
+[List each test case with: name, tier, what it validates, expected observable outcome, isolation requirements]
+
+## ISOLATION RULES
+- Create a temp directory for each test (use the framework's temp directory support)
+- Set HOME or equivalent env vars to temp directory before running the script
+- Create any prerequisite files/directories the script expects in the temp environment
+- NEVER run the script against real user directories (~/.claude/, etc.)
+- Clean up temp directories after each test
+
+## WHAT TO TEST
+- Script exit code under normal conditions (0 = success)
+- Files and directories created by the script (verify existence, verify contents match expected schema)
+- Script behavior when prerequisites are missing (non-zero exit, error message)
+- Data preservation (files that should survive re-runs are not overwritten)
+- Output format matches downstream consumer contracts
+
+## WHAT NOT TO TEST
+- Do NOT read the script's source code to validate its internal structure
+- Do NOT grep for variable names, array contents, or string patterns in source
+- Do NOT test that specific code constructs exist — test what the script DOES
+
+Write the complete test file. Follow existing test style.
+```
+
 SYNCHRONIZATION GATE: After all subagents return, verify each test file exists on disk using Glob. If any file is missing, retry that subagent once (foreground) with error context. Do NOT proceed to Step 5.5 until every planned test file is confirmed on disk.
 
 ## STEP 5.5: SPEC COMPLIANCE AUDIT
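Of the behaviors the smoke-test writer prompt asks for, data preservation across re-runs is the easiest to get wrong. A self-contained sketch (the config filename and the inline stand-in script are illustrative, not files from this package):

```python
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in installer: writes a default config only if none exists (the behavior under test).
SCRIPT = (
    "import json, os, pathlib\n"
    "cfg = pathlib.Path(os.environ['HOME'], 'config.json')\n"
    "if not cfg.exists():\n"
    "    cfg.write_text(json.dumps({'theme': 'default'}))\n"
)

home = Path(tempfile.mkdtemp())
env = {**os.environ, "HOME": str(home)}
cfg = home / "config.json"
cfg.write_text(json.dumps({"theme": "custom"}))  # pre-existing user config

# Re-run the "installer" against the isolated HOME.
subprocess.run([sys.executable, "-c", SCRIPT], env=env, check=True)

assert json.loads(cfg.read_text())["theme"] == "custom"  # user data survived the re-run
print("preservation check passed")
```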
@@ -445,7 +550,8 @@ Write `{context_path}/test_report.md`:
 - **Tests created:** [N] (Tier 1: [N], Tier 2: [N], Tier 3: [N])
 - **Passing:** [N]
 - **Failing:** [N]
-- **AC coverage:** [M]/[P] acceptance criteria have Tier 1 tests
+- **AC coverage:** [M]/[P] testable acceptance criteria have Tier 1 tests
+- **Skipped deliverables:** [N] non-code files ([list types: SKILL.md, config, etc.])
 - **Coverage:** [X]% (target: [Y]%)
 
 ## Test Files Created
@@ -453,6 +559,11 @@ Write `{context_path}/test_report.md`:
 |------|------|-------|--------|
 | [path] | [1/2/3] | [count] | [what it tests — AC reference or blueprint section] |
 
+## Skipped Deliverables (Non-Code)
+| File | Type | Reason |
+|------|------|--------|
+| [path] | [SKILL.md / config / markdown / etc.] | Non-code — not executable, not testable |
+
 ## Failures & Resolutions
 
 ### [Test name]