bigpowers 1.0.0
- package/.gitmessage +5 -0
- package/.releaserc.json +17 -0
- package/CHANGELOG.md +61 -0
- package/CLAUDE.md +61 -0
- package/CONVENTIONS.md +140 -0
- package/GEMINI.md +53 -0
- package/LICENSE +21 -0
- package/README.md +116 -0
- package/RELEASE.md +108 -0
- package/SKILL-INDEX.md +146 -0
- package/assess-impact/SKILL.md +76 -0
- package/audit-code/HEURISTICS.md +43 -0
- package/audit-code/SKILL.md +81 -0
- package/bin/bigpowers.js +27 -0
- package/change-request/REFERENCE.md +60 -0
- package/change-request/SKILL.md +42 -0
- package/commit-message/REFERENCE.md +81 -0
- package/commit-message/SKILL.md +39 -0
- package/countable-story-format.md +293 -0
- package/craft-skill/REFERENCE.md +88 -0
- package/craft-skill/SKILL.md +55 -0
- package/deepen-architecture/DEEPENING.md +37 -0
- package/deepen-architecture/INTERFACE-DESIGN.md +44 -0
- package/deepen-architecture/LANGUAGE.md +53 -0
- package/deepen-architecture/SKILL.md +76 -0
- package/define-language/SKILL.md +75 -0
- package/define-success/SKILL.md +60 -0
- package/delegate-task/SKILL.md +70 -0
- package/design-interface/SKILL.md +94 -0
- package/develop-tdd/SKILL.md +160 -0
- package/develop-tdd/deep-modules.md +33 -0
- package/develop-tdd/interface-design.md +31 -0
- package/develop-tdd/mocking.md +59 -0
- package/develop-tdd/refactoring.md +10 -0
- package/develop-tdd/tests.md +71 -0
- package/dispatch-agents/SKILL.md +72 -0
- package/edit-document/SKILL.md +14 -0
- package/elaborate-spec/SKILL.md +79 -0
- package/enforce-first/SKILL.md +75 -0
- package/execute-plan/SKILL.md +84 -0
- package/grill-me/REFERENCE.md +63 -0
- package/grill-me/SKILL.md +25 -0
- package/guard-git/REFERENCE.md +136 -0
- package/guard-git/SKILL.md +39 -0
- package/guard-git/scripts/block-dangerous-git.sh +41 -0
- package/guard-git/scripts/lib/git-guardrails-core.sh +29 -0
- package/hook-commits/SKILL.md +91 -0
- package/hooks/pre-tool-use.sh +130 -0
- package/index.js +6 -0
- package/inspect-quality/SKILL.md +101 -0
- package/investigate-bug/SKILL.md +111 -0
- package/kickoff-branch/SKILL.md +87 -0
- package/map-codebase/SKILL.md +66 -0
- package/migrate-spec/REFERENCE-GSD.md +137 -0
- package/migrate-spec/REFERENCE.md +186 -0
- package/migrate-spec/SKILL.md +150 -0
- package/model-domain/ADR-FORMAT.md +47 -0
- package/model-domain/CONTEXT-FORMAT.md +77 -0
- package/model-domain/SKILL.md +82 -0
- package/opencode.json +4 -0
- package/orchestrate-project/REFERENCE.md +89 -0
- package/orchestrate-project/SKILL.md +59 -0
- package/organize-workspace/REFERENCE.md +80 -0
- package/organize-workspace/SKILL.md +74 -0
- package/package.json +45 -0
- package/plan-refactor/SKILL.md +75 -0
- package/plan-release/SKILL.md +75 -0
- package/plan-work/SKILL.md +124 -0
- package/playwright.config.ts +56 -0
- package/release-branch/SKILL.md +116 -0
- package/request-review/SKILL.md +70 -0
- package/respond-review/SKILL.md +68 -0
- package/scripts/audit-compliance.sh +256 -0
- package/scripts/cleanup-worktrees.sh +44 -0
- package/scripts/install-cursor-skills-local.sh +13 -0
- package/scripts/install-cursor-skills.sh +34 -0
- package/scripts/install.sh +240 -0
- package/scripts/project-survey.sh +54 -0
- package/scripts/sync-skills.sh +110 -0
- package/seed-conventions/SKILL.md +185 -0
- package/session-state/SKILL.md +69 -0
- package/skills-lock.json +157 -0
- package/spike-prototype/SKILL.md +92 -0
- package/survey-context/SKILL.md +93 -0
- package/terse-mode/SKILL.md +35 -0
- package/trace-requirement/SKILL.md +68 -0
- package/using-bigpowers/SKILL.md +65 -0
- package/validate-fix/SKILL.md +93 -0
- package/visual-dashboard/SKILL.md +49 -0
- package/visual-dashboard/scripts/frame-template.html +189 -0
- package/visual-dashboard/scripts/helper.js +83 -0
- package/visual-dashboard/scripts/server.cjs +345 -0
- package/visual-dashboard/scripts/start-server.sh +121 -0
- package/visual-dashboard/scripts/stop-server.sh +46 -0
- package/wire-observability/SKILL.md +90 -0
- package/write-document/SKILL.md +63 -0
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: define-language
|
|
3
|
+
description: Extract a DDD-style ubiquitous language glossary from the current conversation, flagging ambiguities and proposing canonical terms. Saves to specs/UBIQUITOUS_LANGUAGE.md. Use when user wants to define domain terms, build a glossary, harden terminology, create a ubiquitous language, or mentions "domain model" or "DDD".
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Define Language
|
|
7
|
+
|
|
8
|
+
Extract and formalize domain terminology from the current conversation into a consistent glossary, saved to `specs/UBIQUITOUS_LANGUAGE.md`.
|
|
9
|
+
|
|
10
|
+
## Process
|
|
11
|
+
|
|
12
|
+
1. **Scan the conversation** for domain-relevant nouns, verbs, and concepts
|
|
13
|
+
2. **Identify problems**:
|
|
14
|
+
- Same word used for different concepts (ambiguity)
|
|
15
|
+
- Different words used for the same concept (synonyms)
|
|
16
|
+
- Vague or overloaded terms
|
|
17
|
+
3. **Propose a canonical glossary** with opinionated term choices
|
|
18
|
+
4. **Write to `specs/UBIQUITOUS_LANGUAGE.md`** in the working directory using the format below
|
|
19
|
+
5. **Output a summary** inline in the conversation
|
|
20
|
+
|
|
21
|
+
## Output Format
|
|
22
|
+
|
|
23
|
+
Write a `specs/UBIQUITOUS_LANGUAGE.md` file with this structure:
|
|
24
|
+
|
|
25
|
+
```md
|
|
26
|
+
# Ubiquitous Language
|
|
27
|
+
|
|
28
|
+
## Order lifecycle
|
|
29
|
+
|
|
30
|
+
| Term | Definition | Aliases to avoid |
|
|
31
|
+
| ----------- | ------------------------------------------------------- | --------------------- |
|
|
32
|
+
| **Order** | A customer's request to purchase one or more items | Purchase, transaction |
|
|
33
|
+
| **Invoice** | A request for payment sent to a customer after delivery | Bill, payment request |
|
|
34
|
+
|
|
35
|
+
## People
|
|
36
|
+
|
|
37
|
+
| Term | Definition | Aliases to avoid |
|
|
38
|
+
| ------------ | ------------------------------------------- | ---------------------- |
|
|
39
|
+
| **Customer** | A person or organization that places orders | Client, buyer, account |
|
|
40
|
+
| **User** | An authentication identity in the system | Login, account |
|
|
41
|
+
|
|
42
|
+
## Relationships
|
|
43
|
+
|
|
44
|
+
- An **Invoice** belongs to exactly one **Customer**
|
|
45
|
+
- An **Order** produces one or more **Invoices**
|
|
46
|
+
|
|
47
|
+
## Example dialogue
|
|
48
|
+
|
|
49
|
+
> **Dev:** "When a **Customer** places an **Order**, do we create the **Invoice** immediately?"
|
|
50
|
+
> **Domain expert:** "No — an **Invoice** is only generated once a **Fulfillment** is confirmed."
|
|
51
|
+
|
|
52
|
+
## Flagged ambiguities
|
|
53
|
+
|
|
54
|
+
- "account" was used to mean both **Customer** and **User** — these are distinct concepts.
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
## Rules
|
|
58
|
+
|
|
59
|
+
- **Be opinionated.** When multiple words exist for the same concept, pick the best one and list the others as aliases to avoid.
|
|
60
|
+
- **Flag conflicts explicitly.** If a term is used ambiguously, call it out in "Flagged ambiguities" with a clear recommendation.
|
|
61
|
+
- **Only include terms relevant for domain experts.** Skip names of modules or classes unless they have domain meaning.
|
|
62
|
+
- **Keep definitions tight.** One sentence max. Define what it IS, not what it does.
|
|
63
|
+
- **Show relationships.** Use bold term names and express cardinality where obvious.
|
|
64
|
+
- **Group terms into multiple tables** when natural clusters emerge. One table is fine if terms are cohesive.
|
|
65
|
+
- **Write an example dialogue.** 3–5 exchanges between a dev and domain expert showing terms used precisely.
|
|
66
|
+
|
|
67
|
+
## Re-running
|
|
68
|
+
|
|
69
|
+
When invoked again in the same conversation:
|
|
70
|
+
|
|
71
|
+
1. Read the existing `specs/UBIQUITOUS_LANGUAGE.md`
|
|
72
|
+
2. Incorporate any new terms from subsequent discussion
|
|
73
|
+
3. Update definitions if understanding has evolved
|
|
74
|
+
4. Re-flag any new ambiguities
|
|
75
|
+
5. Rewrite the example dialogue to incorporate new terms
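The "Flagged ambiguities" step can be partly mechanized. The following is a minimal sketch, not part of the skill itself: it assumes the glossary rows have already been parsed into a hypothetical `GlossaryTerm` shape, and it reports any alias listed under more than one canonical term (such as "account" in the sample tables above).

```typescript
// Hypothetical shape for one glossary row; the field names are illustrative.
interface GlossaryTerm {
  term: string;
  definition: string;
  aliasesToAvoid: string[];
}

// Return every alias that appears under more than one canonical term.
function findAmbiguousAliases(terms: GlossaryTerm[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const { term, aliasesToAvoid } of terms) {
    for (const alias of aliasesToAvoid) {
      const key = alias.trim().toLowerCase();
      owners.set(key, [...(owners.get(key) ?? []), term]);
    }
  }
  const ambiguous = new Map<string, string[]>();
  owners.forEach((owningTerms, alias) => {
    if (owningTerms.length > 1) ambiguous.set(alias, owningTerms);
  });
  return ambiguous;
}

// Rows copied from the "People" table in the sample output.
const peopleTerms: GlossaryTerm[] = [
  {
    term: "Customer",
    definition: "A person or organization that places orders",
    aliasesToAvoid: ["Client", "buyer", "account"],
  },
  {
    term: "User",
    definition: "An authentication identity in the system",
    aliasesToAvoid: ["Login", "account"],
  },
];

const ambiguousAliases = findAmbiguousAliases(peopleTerms);
```

A check like this only catches aliases that were written down twice; ambiguity that lives in how people speak still needs the conversational scan described in the process.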
|
|
@@ -0,0 +1,60 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: define-success
|
|
3
|
+
description: Convert an imperative task statement into explicit "step → verify: <cmd>" pairs before implementation begins. Use before plan-work when success criteria are unclear, when a task lacks verifiable checkpoints, or when user says "how will we know this is done?".
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Define Success
|
|
7
|
+
|
|
8
|
+
Transform "do X" into "step → verify: <cmd>" pairs. This is the pre-flight check before `plan-work` or `develop-tdd` — it makes success observable and removes ambiguity about when you're done.
|
|
9
|
+
|
|
10
|
+
## Why this matters
|
|
11
|
+
|
|
12
|
+
"Implement user authentication" is not a plan. It has no checkpoints, no evidence requirement, and no way to know if you're done. The Karpathy principle: every step must be independently verifiable with a runnable command. If you can't verify it, you can't prove it works.
|
|
13
|
+
|
|
14
|
+
## Process
|
|
15
|
+
|
|
16
|
+
### 1. Read the task statement
|
|
17
|
+
|
|
18
|
+
Take the task as stated (from conversation, or from `specs/TASKS.md`, or from `specs/SCOPE.md`).
|
|
19
|
+
|
|
20
|
+
### 2. Break into observable outcomes
|
|
21
|
+
|
|
22
|
+
For each thing the task requires, identify:
|
|
23
|
+
- The smallest unit of observable behavior that proves something works
|
|
24
|
+
- The command that proves it
|
|
25
|
+
|
|
26
|
+
Work at the level of behaviors (what the system does) not implementation steps (how you'll write the code).
|
|
27
|
+
|
|
28
|
+
### 3. Write the pairs
|
|
29
|
+
|
|
30
|
+
Format each pair as:
|
|
31
|
+
```
|
|
32
|
+
N. [What must be true] → verify: <runnable command>
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
Examples:
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
Task: "Add user registration to the API"
|
|
39
|
+
|
|
40
|
+
1. POST /users accepts {email, name} and returns {id, email, name} → verify: curl -s -X POST http://localhost:3000/users -H 'Content-Type: application/json' -d '{"email":"test@test.com","name":"Test"}' | jq .id
|
|
41
|
+
2. Duplicate email is rejected with 409 → verify: npm test -- user-registration.test.ts
|
|
42
|
+
3. Missing email is rejected with 400 and descriptive error → verify: npm test -- user-validation.test.ts
|
|
43
|
+
4. Password is hashed (never stored in plaintext) → verify: npm test -- user-security.test.ts
|
|
44
|
+
5. All existing tests still pass → verify: npm test
|
|
45
|
+
```
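As an illustration of how the pair format could be checked before handing it to `plan-work`, here is a small sketch that parses numbered lines and separates complete pairs from steps that have no verify command. The function and type names are hypothetical, and the third sample line is invented to show the failure case.

```typescript
interface VerifyPair {
  step: number;
  outcome: string;
  verify: string;
}

// Split numbered lines into complete pairs and steps that lack a verify command.
function parsePairs(text: string): { pairs: VerifyPair[]; unverified: string[] } {
  const pairPattern = /^(\d+)\.\s+(.+?)\s+→\s+verify:\s+(.+)$/;
  const numberedPattern = /^\d+\.\s+/;
  const pairs: VerifyPair[] = [];
  const unverified: string[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    const match = pairPattern.exec(line);
    if (match) {
      pairs.push({ step: Number(match[1]), outcome: match[2], verify: match[3] });
    } else if (numberedPattern.test(line)) {
      // A numbered step with no "→ verify:" part is not yet observable.
      unverified.push(line);
    }
  }
  return { pairs, unverified };
}

const parsedPlan = parsePairs(`
2. Duplicate email is rejected with 409 → verify: npm test -- user-registration.test.ts
5. All existing tests still pass → verify: npm test
6. Welcome email is sent
`);
```

Any entry in `unverified` is a prompt to return to step 2 and find the command that proves it.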
|
|
46
|
+
|
|
47
|
+
### 4. Challenge completeness
|
|
48
|
+
|
|
49
|
+
Ask yourself:
|
|
50
|
+
- Is there any behavior the task requires that isn't covered by a verify step?
|
|
51
|
+
- Is every verify step runnable right now without additional setup?
|
|
52
|
+
- Does the final step verify the whole thing end-to-end?
|
|
53
|
+
|
|
54
|
+
Add any missing pairs.
|
|
55
|
+
|
|
56
|
+
### 5. Output
|
|
57
|
+
|
|
58
|
+
Present the pairs to the user and ask: "Does this capture everything the task requires? Anything missing?"
|
|
59
|
+
|
|
60
|
+
Once confirmed, these pairs become the skeleton for `plan-work`'s steps. Pass them along when calling `plan-work`.
|
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: delegate-task
|
|
3
|
+
description: Delegate one complex task to a single subagent, review its work in two stages before merging back. Sequential — one agent at a time, with oversight. Use when a task is complex and requires careful review before the result is accepted. Distinct from dispatch-agents (no parallelism here; reviewer sees full diff before proceeding).
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Delegate Task
|
|
7
|
+
|
|
8
|
+
Delegate a single complex task to a subagent with a two-stage review gate before accepting the result. Use when oversight of a single task matters more than speed.
|
|
9
|
+
|
|
10
|
+
**Distinct from `dispatch-agents`:** This skill runs one subagent sequentially with a mandatory review. `dispatch-agents` runs multiple subagents in parallel without inter-task review gates.
|
|
11
|
+
|
|
12
|
+
## Process
|
|
13
|
+
|
|
14
|
+
### 1. Define the task
|
|
15
|
+
|
|
16
|
+
Before spawning the agent, read `specs/STATE.md` if it exists. Then write a minimal self-contained brief using this template (brief size directly controls token cost and hallucination risk — do not pad):
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
Goal: [one sentence — specific, measurable outcome]
|
|
20
|
+
In scope: [explicit file or module list]
|
|
21
|
+
Out of bounds: [what NOT to do]
|
|
22
|
+
Constraints: [relevant CONVENTIONS.md rules, existing patterns, test requirements]
|
|
23
|
+
Verify: [runnable command]
|
|
24
|
+
Prior decisions: [relevant entries from specs/STATE.md — omit section if none apply]
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
Do not include full file contents, full conversation history, or decisions unrelated to this task.
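One way to keep briefs minimal and uniform is to build them from a typed structure. The sketch below is an assumption about how that might look, not something the skill prescribes: `TaskBrief` and `renderBrief` are hypothetical names, and the file paths in the example are invented. It omits the "Prior decisions" line when there is nothing relevant, as the template asks.

```typescript
// Hypothetical typed form of the brief template above.
interface TaskBrief {
  goal: string;
  inScope: string[];
  outOfBounds: string[];
  constraints: string[];
  verify: string;
  priorDecisions?: string[];
}

// Render the brief as plain text, one labelled line per field.
function renderBrief(brief: TaskBrief): string {
  const lines = [
    `Goal: ${brief.goal}`,
    `In scope: ${brief.inScope.join(", ")}`,
    `Out of bounds: ${brief.outOfBounds.join("; ")}`,
    `Constraints: ${brief.constraints.join("; ")}`,
    `Verify: ${brief.verify}`,
  ];
  // Omit the section entirely when no prior decisions apply.
  if (brief.priorDecisions && brief.priorDecisions.length > 0) {
    lines.push(`Prior decisions: ${brief.priorDecisions.join("; ")}`);
  }
  return lines.join("\n");
}

const exampleBrief = renderBrief({
  goal: "POST /users rejects duplicate emails with 409",
  inScope: ["src/users/register.ts", "src/users/register.test.ts"], // invented paths
  outOfBounds: ["Do not change the database schema"],
  constraints: ["Follow CONVENTIONS.md naming", "Add tests for new behavior"],
  verify: "npm test -- user-registration.test.ts",
});
```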
|
|
28
|
+
|
|
29
|
+
### 2. Spawn the subagent
|
|
30
|
+
|
|
31
|
+
Use the Agent tool to spawn the subagent with the complete brief. Include:
|
|
32
|
+
- All context the agent needs (it starts cold — no shared state)
|
|
33
|
+
- Reference to CONVENTIONS.md constraints
|
|
34
|
+
- The verify command it must run before reporting done
|
|
35
|
+
|
|
36
|
+
### 3. Stage 1 review — output inspection
|
|
37
|
+
|
|
38
|
+
When the subagent returns, review its report before looking at the diff:
|
|
39
|
+
- Did it run the verify command? Did it pass?
|
|
40
|
+
- Does it explain what it changed and why?
|
|
41
|
+
- Are there any concerns raised by the agent?
|
|
42
|
+
|
|
43
|
+
If the report raises red flags, ask the subagent for clarification or re-run with adjusted instructions.
|
|
44
|
+
|
|
45
|
+
### 4. Stage 2 review — diff inspection
|
|
46
|
+
|
|
47
|
+
Inspect the actual diff:
|
|
48
|
+
```bash
|
|
49
|
+
git diff main...HEAD
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
Check:
|
|
53
|
+
- [ ] Changes are scoped to what was asked — nothing extra
|
|
54
|
+
- [ ] No `any`, no `@ts-ignore`, no disabled lint rules
|
|
55
|
+
- [ ] Tests added for new behavior
|
|
56
|
+
- [ ] CONVENTIONS.md compliance (naming, structure, no gh issue creation)
|
|
57
|
+
- [ ] Boy Scout Rule: touched areas are cleaner than before
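Part of this checklist can be pre-screened automatically before reading the diff by eye. The sketch below is a rough heuristic under stated assumptions: it takes unified diff text (for example the output of the `git diff` command above) as a string, looks only at added lines, and assumes ESLint-style `eslint-disable` comments are what "disabled lint rules" means in the project. The function name and patterns are illustrative, and regex matching will produce false positives and misses.

```typescript
// Illustrative patterns; tune them to the project's own conventions.
const forbiddenPatterns: { label: string; pattern: RegExp }[] = [
  { label: "any type", pattern: /(:\s*any\b|\bas any\b)/ },
  { label: "@ts-ignore", pattern: /@ts-ignore/ },
  { label: "disabled lint rule", pattern: /eslint-disable/ },
];

// Report forbidden constructs that appear on lines the diff adds.
function findForbiddenAdditions(diff: string): string[] {
  const findings: string[] = [];
  for (const line of diff.split("\n")) {
    // Added lines start with "+"; "+++" is a file header, not content.
    if (!line.startsWith("+") || line.startsWith("+++")) continue;
    for (const { label, pattern } of forbiddenPatterns) {
      if (pattern.test(line)) findings.push(`${label}: ${line.slice(1).trim()}`);
    }
  }
  return findings;
}

const sampleDiff = [
  "+++ b/src/user.ts",
  "+const payload: any = JSON.parse(body);",
  "+// @ts-ignore",
  "-const old = 1;",
  "+const name: string = payload.name;",
].join("\n");

const diffFindings = findForbiddenAdditions(sampleDiff);
```

An empty result does not replace the review; scope, test coverage, and the Boy Scout Rule still need a human or agent reading the diff.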
|
|
58
|
+
|
|
59
|
+
### 5. Decision
|
|
60
|
+
|
|
61
|
+
- **Accept**: merge the result into the main working branch
|
|
62
|
+
- **Revise**: send back to the subagent with specific feedback
|
|
63
|
+
- **Reject**: discard and re-approach differently
|
|
64
|
+
|
|
65
|
+
**After accepting**, append to `specs/STATE.md` under `## Active Decisions`:
|
|
66
|
+
```
|
|
67
|
+
**[task short name]**: [what approach the agent chose and why — one sentence]
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
Report the decision and rationale to the user.
|
|
@@ -0,0 +1,94 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: design-interface
|
|
3
|
+
description: Generate multiple radically different interface designs for a module using parallel sub-agents, then compare trade-offs. Based on "Design It Twice" from A Philosophy of Software Design. Use when user wants to design an API, explore interface options, compare module shapes, or mentions "design it twice".
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Design Interface
|
|
7
|
+
|
|
8
|
+
Based on "Design It Twice" from "A Philosophy of Software Design": your first idea is unlikely to be the best. Generate multiple radically different designs, then compare.
|
|
9
|
+
|
|
10
|
+
## Workflow
|
|
11
|
+
|
|
12
|
+
### 1. Gather Requirements
|
|
13
|
+
|
|
14
|
+
Before designing, understand:
|
|
15
|
+
|
|
16
|
+
- [ ] What problem does this module solve?
|
|
17
|
+
- [ ] Who are the callers? (other modules, external users, tests)
|
|
18
|
+
- [ ] What are the key operations?
|
|
19
|
+
- [ ] Any constraints? (performance, compatibility, existing patterns)
|
|
20
|
+
- [ ] What should be hidden inside vs exposed?
|
|
21
|
+
|
|
22
|
+
Ask: "What does this module need to do? Who will use it?"
|
|
23
|
+
|
|
24
|
+
### 2. Generate Designs (Parallel Sub-Agents)
|
|
25
|
+
|
|
26
|
+
Spawn 3+ sub-agents simultaneously using the Task tool. Each must produce a **radically different** approach.
|
|
27
|
+
|
|
28
|
+
```
|
|
29
|
+
Prompt template for each sub-agent:
|
|
30
|
+
|
|
31
|
+
Design an interface for: [module description]
|
|
32
|
+
|
|
33
|
+
Requirements: [gathered requirements]
|
|
34
|
+
|
|
35
|
+
Constraints for this design: [assign a different constraint to each agent]
|
|
36
|
+
- Agent 1: "Minimize method count - aim for 1-3 methods max"
|
|
37
|
+
- Agent 2: "Maximize flexibility - support many use cases"
|
|
38
|
+
- Agent 3: "Optimize for the most common case"
|
|
39
|
+
- Agent 4: "Take inspiration from [specific paradigm/library]"
|
|
40
|
+
|
|
41
|
+
Output format:
|
|
42
|
+
1. Interface signature (types/methods)
|
|
43
|
+
2. Usage example (how caller uses it)
|
|
44
|
+
3. What this design hides internally
|
|
45
|
+
4. Trade-offs of this approach
|
|
46
|
+
```
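To make the "different constraint per agent" idea concrete, here is a minimal sketch that fills in the template once per constraint. `buildDesignPrompts` is a hypothetical helper and the module description in the example is invented; how the prompts are actually dispatched depends on the agent tooling in use.

```typescript
// Constraints taken from the template above; one per sub-agent.
const designConstraints = [
  "Minimize method count - aim for 1-3 methods max",
  "Maximize flexibility - support many use cases",
  "Optimize for the most common case",
];

// Produce one prompt per constraint so no two agents share a brief.
function buildDesignPrompts(moduleDescription: string, requirements: string[]): string[] {
  return designConstraints.map((constraint) =>
    [
      `Design an interface for: ${moduleDescription}`,
      `Requirements: ${requirements.join("; ")}`,
      `Constraints for this design: ${constraint}`,
      "Output format:",
      "1. Interface signature (types/methods)",
      "2. Usage example (how caller uses it)",
      "3. What this design hides internally",
      "4. Trade-offs of this approach",
    ].join("\n"),
  );
}

const designPrompts = buildDesignPrompts("a rate limiter for outbound API calls", [
  "per-client limits",
  "callers are other backend modules",
]);
```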
|
|
47
|
+
|
|
48
|
+
### 3. Present Designs
|
|
49
|
+
|
|
50
|
+
Show each design with:
|
|
51
|
+
|
|
52
|
+
1. **Interface signature** — types, methods, params
|
|
53
|
+
2. **Usage examples** — how callers actually use it in practice
|
|
54
|
+
3. **What it hides** — complexity kept internal
|
|
55
|
+
|
|
56
|
+
Present designs sequentially so the user can absorb each approach before comparison.
|
|
57
|
+
|
|
58
|
+
### 4. Compare Designs
|
|
59
|
+
|
|
60
|
+
After showing all designs, compare them on:
|
|
61
|
+
|
|
62
|
+
- **Interface simplicity**: fewer methods, simpler params
|
|
63
|
+
- **General-purpose vs specialized**: flexibility vs focus
|
|
64
|
+
- **Implementation efficiency**: does shape allow efficient internals?
|
|
65
|
+
- **Depth**: small interface hiding significant complexity (good) vs large interface with thin implementation (bad)
|
|
66
|
+
- **Ease of correct use** vs **ease of misuse**
|
|
67
|
+
|
|
68
|
+
Discuss trade-offs in prose, not tables. Highlight where designs diverge most.
|
|
69
|
+
|
|
70
|
+
### 5. Synthesize
|
|
71
|
+
|
|
72
|
+
Often the best design combines insights from multiple options. Ask:
|
|
73
|
+
|
|
74
|
+
- "Which design best fits your primary use case?"
|
|
75
|
+
- "Any elements from other designs worth incorporating?"
|
|
76
|
+
|
|
77
|
+
## Evaluation Criteria
|
|
78
|
+
|
|
79
|
+
From "A Philosophy of Software Design":
|
|
80
|
+
|
|
81
|
+
**Interface simplicity**: Fewer methods, simpler params = easier to learn and use correctly.
|
|
82
|
+
|
|
83
|
+
**General-purpose**: Can handle future use cases without changes. But beware over-generalization.
|
|
84
|
+
|
|
85
|
+
**Implementation efficiency**: Does interface shape allow efficient implementation? Or force awkward internals?
|
|
86
|
+
|
|
87
|
+
**Depth**: Small interface hiding significant complexity = deep module (good). Large interface with thin implementation = shallow module (avoid).
|
|
88
|
+
|
|
89
|
+
## Anti-Patterns
|
|
90
|
+
|
|
91
|
+
- Don't let sub-agents produce similar designs — enforce radical difference
|
|
92
|
+
- Don't skip comparison — the value is in contrast
|
|
93
|
+
- Don't implement — this is purely about interface shape
|
|
94
|
+
- Don't evaluate based on implementation effort
|
|
@@ -0,0 +1,160 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: develop-tdd
|
|
3
|
+
description: Test-driven development with red-green-refactor loop using vertical slices. Use when user wants to build features or fix bugs using TDD, mentions "red-green-refactor", wants integration tests, asks for test-first development, or wants to implement a task from specs/PLAN.md.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Develop TDD
|
|
7
|
+
|
|
8
|
+
> **HARD GATE** — Do NOT proceed if on `main` or `master`. Run `kickoff-branch` first to create a feature branch or worktree.
|
|
9
|
+
>
|
|
10
|
+
> **HARD GATE** — Do NOT write code before you have a plan. If you are starting a new task, run `plan-work` to create `specs/PLAN.md`. If you are fixing a bug, run `investigate-bug` to create `specs/DIAGNOSIS.md`.
|
|
11
|
+
>
|
|
12
|
+
> **RECURSIVE DISCIPLINE** — This lifecycle applies to EVERY task, including updating these skills. Never skip planning because a task is "meta" or "just documentation."
|
|
13
|
+
|
|
14
|
+
## Philosophy
|
|
15
|
+
|
|
16
|
+
**Core principle**: Tests should verify behavior through public interfaces, not implementation details. Code can change entirely; tests shouldn't.
|
|
17
|
+
|
|
18
|
+
**Good tests** are integration-style: they exercise real code paths through public APIs. They describe _what_ the system does, not _how_ it does it. A good test reads like a specification — "user can checkout with valid cart" tells you exactly what capability exists. These tests survive refactors because they don't care about internal structure.
|
|
19
|
+
|
|
20
|
+
**Bad tests** are coupled to implementation. They mock internal collaborators, test private methods, or verify through external means. The warning sign: your test breaks when you refactor, but behavior hasn't changed.
|
|
21
|
+
|
|
22
|
+
See [tests.md](tests.md) for examples and [mocking.md](mocking.md) for mocking guidelines.
|
|
23
|
+
|
|
24
|
+
## Anti-Pattern: Horizontal Slices
|
|
25
|
+
|
|
26
|
+
**DO NOT write all tests first, then all implementation.** This is "horizontal slicing" — treating RED as "write all tests" and GREEN as "write all code."
|
|
27
|
+
|
|
28
|
+
This produces **crap tests**:
|
|
29
|
+
|
|
30
|
+
- Tests written in bulk test _imagined_ behavior, not _actual_ behavior
|
|
31
|
+
- You end up testing the _shape_ of things rather than user-facing behavior
|
|
32
|
+
- Tests become insensitive to real changes
|
|
33
|
+
|
|
34
|
+
**Correct approach**: Vertical slices via tracer bullets. One test → one implementation → repeat.
|
|
35
|
+
|
|
36
|
+
```
|
|
37
|
+
WRONG (horizontal):
|
|
38
|
+
RED: test1, test2, test3, test4, test5
|
|
39
|
+
GREEN: impl1, impl2, impl3, impl4, impl5
|
|
40
|
+
|
|
41
|
+
RIGHT (vertical):
|
|
42
|
+
RED→GREEN: test1→impl1
|
|
43
|
+
RED→GREEN: test2→impl2
|
|
44
|
+
RED→GREEN: test3→impl3
|
|
45
|
+
...
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
## Red Flags
|
|
49
|
+
|
|
50
|
+
If you find yourself thinking these things, you are likely deviating from production-grade craft. Stop and reconsider.
|
|
51
|
+
|
|
52
|
+
| Red Flag | Reality |
|
|
53
|
+
| :--- | :--- |
|
|
54
|
+
| "This is too simple to need tests." | Simple code is where bugs hide. If it's simple, the test is cheap. |
|
|
55
|
+
| "I'll refactor this later." | "Later" is when technical debt becomes a bankruptcy. Refactor while Green. |
|
|
56
|
+
| "The tests are already comprehensive." | If you're adding behavior, you need a new test. Coverage != Correctness. |
|
|
57
|
+
| "I'm just fixing a small bug." | Small bugs often indicate deep interface flaws. Investigate root cause. |
|
|
58
|
+
| "I need to mock this internal class." | Mocking internals couples tests to implementation. Mock only I/O. |
|
|
59
|
+
| "This refactor is out of scope." | Leave the code cleaner than you found it (Boy Scout Rule). |
|
|
60
|
+
|
|
61
|
+
## Workflow
|
|
62
|
+
|
|
63
|
+
### 1. Planning
|
|
64
|
+
|
|
65
|
+
Before writing any code:
|
|
66
|
+
|
|
67
|
+
- [ ] Read `specs/PLAN.md` or `specs/DIAGNOSIS.md` if they exist — understand the task and verify steps
|
|
68
|
+
- [ ] Confirm with user what interface changes are needed
|
|
69
|
+
- [ ] Confirm with user which behaviors to test (prioritize)
|
|
70
|
+
- [ ] Identify opportunities for [deep modules](deep-modules.md) (small interface, deep implementation)
|
|
71
|
+
- [ ] Design interfaces for [testability](interface-design.md)
|
|
72
|
+
- [ ] List the behaviors to test (not implementation steps)
|
|
73
|
+
- [ ] Get user approval on the plan
|
|
74
|
+
|
|
75
|
+
Ask: "What should the public interface look like? Which behaviors are most important to test?"
|
|
76
|
+
|
|
77
|
+
**You can't test everything.** Confirm with the user exactly which behaviors matter most. Focus testing effort on critical paths and complex logic.
|
|
78
|
+
|
|
79
|
+
Apply the **enforce-first** F.I.R.S.T rubric when writing tests: Fast, Independent, Repeatable, Self-Validating, Timely.
|
|
80
|
+
|
|
81
|
+
### 2. Tracer Bullet
|
|
82
|
+
|
|
83
|
+
Write ONE test that confirms ONE thing about the system:
|
|
84
|
+
|
|
85
|
+
```
|
|
86
|
+
RED: Write test for first behavior → test fails
|
|
87
|
+
GREEN: Write minimal code to pass → test passes
|
|
88
|
+
COMMIT: git commit -m "<type>(<scope>): first tracer bullet..."
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
This is your tracer bullet — proves the path works end-to-end.
|
|
92
|
+
|
|
93
|
+
### 3. Incremental Loop
|
|
94
|
+
|
|
95
|
+
> **STREAM CONTINUITY** — When writing file content, output in continuous chunks of ~200 lines. Do not pause. Continue immediately until complete. If you need time, emit a placeholder comment rather than going silent.
|
|
96
|
+
|
|
97
|
+
For each remaining behavior:
|
|
98
|
+
|
|
99
|
+
```
|
|
100
|
+
RED: Write next test → fails
|
|
101
|
+
GREEN: Minimal code to pass → passes
|
|
102
|
+
COMMIT: git commit -m "<type>(<scope>): <behavior description>"
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
Rules:
|
|
106
|
+
|
|
107
|
+
- One test at a time
|
|
108
|
+
- Only enough code to pass current test
|
|
109
|
+
- Don't anticipate future tests
|
|
110
|
+
- Keep tests focused on observable behavior
|
|
111
|
+
- **Atomic Commits**: Commit after every GREEN phase to record progress and prevent large diffs.
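The loop is easier to see with a tiny worked case. The sketch below is illustrative only: the cart API is hypothetical, and `check` is a stand-in for a real test runner so the example stays self-contained. Each slice's test was (notionally) written and seen failing before the code that satisfies it.

```typescript
// Minimal stand-in for a test runner so the sketch needs no dependencies.
function check(name: string, condition: boolean): void {
  if (!condition) throw new Error(`FAILED: ${name}`);
}

// Hypothetical public interface; tests below touch nothing else.
interface Cart {
  add(priceCents: number): void;
  totalCents(): number;
}

function createCart(): Cart {
  const prices: number[] = [];
  return {
    add: (priceCents) => {
      prices.push(priceCents);
    },
    totalCents: () => prices.reduce((sum, price) => sum + price, 0),
  };
}

// Slice 1 (RED→GREEN→COMMIT): an empty cart totals zero.
check("empty cart totals zero", createCart().totalCents() === 0);

// Slice 2 (RED→GREEN→COMMIT): written only after slice 1 was green.
const sliceCart = createCart();
sliceCart.add(1250);
sliceCart.add(399);
check("cart totals its items", sliceCart.totalCents() === 1649);
```

Both checks go through `add` and `totalCents` only, so swapping the internal array for a running total would not break them.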
|
|
112
|
+
|
|
113
|
+
### 4. Visual Slices (UI Alternate Workflow)
|
|
114
|
+
|
|
115
|
+
For UI components (SwiftUI, React, Flutter) where behavioral unit testing is brittle or low-signal:
|
|
116
|
+
|
|
117
|
+
1. **Test-First Logic**: Extract logic (state transitions, formatting, validation) into a separate Controller, ViewModel, or Hook. This logic MUST follow pure TDD (Red-Green-Refactor).
|
|
118
|
+
2. **Visual Verification**: For the View/Component itself, use "Visual Slices":
|
|
119
|
+
- **RED**: Write the component signature and a basic preview/test snapshot that fails (or displays placeholder).
|
|
120
|
+
- **GREEN**: Implement the UI and verify visually via manual run, preview, or snapshot test.
|
|
121
|
+
- **REFINE**: Adjust styling and layout until it matches the "rich aesthetics" requirement.
|
|
122
|
+
3. **COMMIT**: git commit -m "feat(ui): <component name> visual slice verified"
|
|
123
|
+
|
|
124
|
+
### 5. Refactor
|
|
125
|
+
|
|
126
|
+
After all tests pass, look for [refactor candidates](refactoring.md):
|
|
127
|
+
|
|
128
|
+
- [ ] Extract duplication
|
|
129
|
+
- [ ] Deepen modules (move complexity behind simple interfaces)
|
|
130
|
+
- [ ] Apply SOLID principles where natural
|
|
131
|
+
- [ ] Consider what new code reveals about existing code
|
|
132
|
+
- [ ] Run tests after each refactor step
|
|
133
|
+
|
|
134
|
+
**Never refactor while RED.** Get to GREEN first.
|
|
135
|
+
|
|
136
|
+
### 6. Verify step
|
|
137
|
+
|
|
138
|
+
After every behavior cycle, run the verify command from `specs/PLAN.md` if one exists for this step. Show evidence before declaring the step done.
|
|
139
|
+
|
|
140
|
+
### 7. Manual Verification Handover
|
|
141
|
+
|
|
142
|
+
Once the story is complete and all tests pass:
|
|
143
|
+
1. Locate the **Verification Script** in `specs/RELEASE-PLAN.md` for this story.
|
|
144
|
+
2. Present the script to the user as a step-by-step guide.
|
|
145
|
+
3. Wait for the user to confirm the behavioral correctness before moving to the next story or declaring the task done.
|
|
146
|
+
|
|
147
|
+
## Checklist Per Cycle
|
|
148
|
+
|
|
149
|
+
```
|
|
150
|
+
[ ] Test describes behavior, not implementation
|
|
151
|
+
[ ] No test is ignored without an explicit ambiguity note (T4)
|
|
152
|
+
[ ] Boundary conditions tested: empty, max, min, off-by-one (T5)
|
|
153
|
+
[ ] Tests verify behavior through public interface only — no private methods (T8)
|
|
154
|
+
[ ] Test would survive internal refactor
|
|
155
|
+
[ ] Code is minimal for this test
|
|
156
|
+
[ ] No speculative features added
|
|
157
|
+
[ ] Every new abstraction has an explicit "Reason for Depth" justification
|
|
158
|
+
[ ] Progress committed (Conventional Commits)
|
|
159
|
+
[ ] verify: command passes
|
|
160
|
+
```
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# Deep Modules
|
|
2
|
+
|
|
3
|
+
From "A Philosophy of Software Design":
|
|
4
|
+
|
|
5
|
+
**Deep module** = small interface + lots of implementation
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
┌─────────────────────┐
|
|
9
|
+
│ Small Interface │ ← Few methods, simple params
|
|
10
|
+
├─────────────────────┤
|
|
11
|
+
│ │
|
|
12
|
+
│ │
|
|
13
|
+
│ Deep Implementation│ ← Complex logic hidden
|
|
14
|
+
│ │
|
|
15
|
+
│ │
|
|
16
|
+
└─────────────────────┘
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
**Shallow module** = large interface + little implementation (avoid)
|
|
20
|
+
|
|
21
|
+
```
|
|
22
|
+
┌─────────────────────────────────┐
|
|
23
|
+
│ Large Interface │ ← Many methods, complex params
|
|
24
|
+
├─────────────────────────────────┤
|
|
25
|
+
│ Thin Implementation │ ← Just passes through
|
|
26
|
+
└─────────────────────────────────┘
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
When designing interfaces, ask:
|
|
30
|
+
|
|
31
|
+
- Can I reduce the number of methods?
|
|
32
|
+
- Can I simplify the parameters?
|
|
33
|
+
- Can I hide more complexity inside?
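A small contrast may help here. The following sketch is an invented example, not taken from the book: `slugify` is a deep module in miniature (one function, one parameter, several normalization rules hidden inside), while `shallowSlugApi` exposes the same work as steps the caller must know about and sequence correctly.

```typescript
// Deep: one call; accent stripping, casing, and separator rules stay inside.
function slugify(title: string): string {
  return title
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accent marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse everything else into dashes
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Shallow: the same logic, but every step leaks into the interface.
const shallowSlugApi = {
  stripAccents: (s: string) => s.normalize("NFKD").replace(/[\u0300-\u036f]/g, ""),
  lowercase: (s: string) => s.toLowerCase(),
  replaceSeparators: (s: string) => s.replace(/[^a-z0-9]+/g, "-"),
  trimDashes: (s: string) => s.replace(/^-+|-+$/g, ""),
};

const deepResult = slugify("Crème Brûlée: A Recipe!");
const shallowResult = shallowSlugApi.trimDashes(
  shallowSlugApi.replaceSeparators(
    shallowSlugApi.lowercase(shallowSlugApi.stripAccents("Crème Brûlée: A Recipe!")),
  ),
);
```

Both produce the same slug, but only the shallow version can be called in the wrong order.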
|
|
@@ -0,0 +1,31 @@
|
|
|
1
|
+
# Interface Design for Testability
|
|
2
|
+
|
|
3
|
+
Good interfaces make testing natural:
|
|
4
|
+
|
|
5
|
+
1. **Accept dependencies, don't create them**
|
|
6
|
+
|
|
7
|
+
```typescript
|
|
8
|
+
// Testable
|
|
9
|
+
function processOrder(order, paymentGateway) {}
|
|
10
|
+
|
|
11
|
+
// Hard to test
|
|
12
|
+
function processOrder(order) {
|
|
13
|
+
const gateway = new StripeGateway();
|
|
14
|
+
}
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
2. **Return results, don't produce side effects**
|
|
18
|
+
|
|
19
|
+
```typescript
|
|
20
|
+
// Testable
|
|
21
|
+
function calculateDiscount(cart): Discount {}
|
|
22
|
+
|
|
23
|
+
// Hard to test
|
|
24
|
+
function applyDiscount(cart): void {
|
|
25
|
+
cart.total -= discount;
|
|
26
|
+
}
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
3. **Small surface area**
|
|
30
|
+
- Fewer methods = fewer tests needed
|
|
31
|
+
- Fewer params = simpler test setup
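The first two points can be combined in one short sketch. The types and names below are hypothetical, and the gateway is kept synchronous for brevity (a real payment call would be asynchronous). Because `processOrder` accepts its gateway and returns a result, a test needs only a hand-written fake and an equality check.

```typescript
// Hypothetical boundary interface; a real one would likely return a Promise.
interface PaymentGateway {
  charge(amountCents: number): { ok: boolean };
}

interface Order {
  id: string;
  totalCents: number;
}

// Accepts its dependency and returns a result instead of mutating the order.
function processOrder(order: Order, gateway: PaymentGateway): "paid" | "declined" {
  const result = gateway.charge(order.totalCents);
  return result.ok ? "paid" : "declined";
}

// Hand-written fakes; no mocking library is needed.
const approvingGateway: PaymentGateway = { charge: () => ({ ok: true }) };
const decliningGateway: PaymentGateway = { charge: () => ({ ok: false }) };

const paidStatus = processOrder({ id: "order-1", totalCents: 1999 }, approvingGateway);
const declinedStatus = processOrder({ id: "order-2", totalCents: 1999 }, decliningGateway);
```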
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# When to Mock
|
|
2
|
+
|
|
3
|
+
Mock at **system boundaries** only:
|
|
4
|
+
|
|
5
|
+
- External APIs (payment, email, etc.)
|
|
6
|
+
- Databases (sometimes - prefer test DB)
|
|
7
|
+
- Time/randomness
|
|
8
|
+
- File system (sometimes)
|
|
9
|
+
|
|
10
|
+
Don't mock:
|
|
11
|
+
|
|
12
|
+
- Your own classes/modules
|
|
13
|
+
- Internal collaborators
|
|
14
|
+
- Anything you control
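Time is a good example of a boundary worth replacing in tests. The sketch below shows one possible approach, assuming a hypothetical `Clock` function type that defaults to the real clock: production code calls `isExpired(expiresAt)`, while tests pass a fixed clock so results do not depend on when they run.

```typescript
// Hypothetical boundary type: anything that can report the current time.
type Clock = () => Date;

// Defaults to the system clock; tests can inject a fixed one.
function isExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
  return now().getTime() >= expiresAt.getTime();
}

// Fixed clock for tests: always midnight UTC on 2024-01-01.
const fixedClock: Clock = () => new Date("2024-01-01T00:00:00Z");

const pastTokenExpired = isExpired(new Date("2023-12-31T23:59:59Z"), fixedClock);
const futureTokenExpired = isExpired(new Date("2024-01-01T00:00:01Z"), fixedClock);
```

Only the clock is replaced; the expiry logic under test is the real code.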
|
|
15
|
+
|
|
16
|
+
## Designing for Mockability
|
|
17
|
+
|
|
18
|
+
At system boundaries, design interfaces that are easy to mock:
|
|
19
|
+
|
|
20
|
+
**1. Use dependency injection**
|
|
21
|
+
|
|
22
|
+
Pass external dependencies in rather than creating them internally:
|
|
23
|
+
|
|
24
|
+
```typescript
|
|
25
|
+
// Easy to mock
|
|
26
|
+
function processPayment(order, paymentClient) {
|
|
27
|
+
return paymentClient.charge(order.total);
|
|
28
|
+
}
|
|
29
|
+
|
|
30
|
+
// Hard to mock
|
|
31
|
+
function processPayment(order) {
|
|
32
|
+
const client = new StripeClient(process.env.STRIPE_KEY);
|
|
33
|
+
return client.charge(order.total);
|
|
34
|
+
}
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
**2. Prefer SDK-style interfaces over generic fetchers**
|
|
38
|
+
|
|
39
|
+
Create specific functions for each external operation instead of one generic function with conditional logic:

```typescript
// GOOD: Each function is independently mockable
const api = {
  getUser: (id) => fetch(`/users/${id}`),
  getOrders: (userId) => fetch(`/users/${userId}/orders`),
  createOrder: (data) => fetch('/orders', { method: 'POST', body: data }),
};

// BAD: Mocking requires conditional logic inside the mock
const api = {
  fetch: (endpoint, options) => fetch(endpoint, options),
};
```

The SDK approach means:
- Each mock returns one specific shape
- No conditional logic in test setup
- Easier to see which endpoints a test exercises
- Type safety per endpoint
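
As a rough sketch of what this buys in a test, the stub below implements an SDK-style interface with one function per operation. The `Api`, `User`, and `OrderSummary` types and the `describeUser` function are hypothetical, and the calls are synchronous only to keep the example short; real endpoints would return promises.

```typescript
// Hypothetical types; the endpoint shapes are assumed for illustration.
interface User { id: string; name: string; }
interface OrderSummary { id: string; userId: string; }

interface Api {
  getUser(id: string): User;
  getOrders(userId: string): OrderSummary[];
}

// Code under test depends only on the operations it actually uses.
function describeUser(api: Api, id: string): string {
  const user = api.getUser(id);
  const orders = api.getOrders(id);
  return `${user.name} has ${orders.length} order(s)`;
}

// Each stub returns one shape; there is no branching on endpoint strings.
const stubApi: Api = {
  getUser: (id) => ({ id, name: "Alice" }),
  getOrders: (userId) => [{ id: "o1", userId }],
};

const summary = describeUser(stubApi, "u1");
```

With a generic `fetch(endpoint, options)` wrapper, the same stub would have to inspect the endpoint string and return a different shape for each case.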
@@ -0,0 +1,10 @@
# Refactor Candidates

After each TDD cycle, look for:

- **Duplication** → Extract function/class
- **Long methods** → Break into private helpers (keep tests on public interface)
- **Shallow modules** → Combine or deepen
- **Feature envy** → Move logic to where data lives
- **Primitive obsession** → Introduce value objects
- **Existing code** the new code reveals as problematic
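
To illustrate one item from the list, the sketch below replaces a bare `number` amount with a small value object, which is one common way to address primitive obsession. The `Money` class, its rules, and the currency codes are assumptions for illustration.

```typescript
// Hypothetical value object; the name and rules are assumed for illustration.
class Money {
  private constructor(
    readonly cents: number,
    readonly currency: string,
  ) {}

  // Validation lives in one place instead of at every call site.
  static of(cents: number, currency: string): Money {
    if (!Number.isInteger(cents)) {
      throw new Error("cents must be an integer");
    }
    return new Money(cents, currency);
  }

  // Returns a new Money rather than mutating either operand.
  add(other: Money): Money {
    if (other.currency !== this.currency) {
      throw new Error("currency mismatch");
    }
    return Money.of(this.cents + other.cents, this.currency);
  }
}

const total = Money.of(1999, "USD").add(Money.of(501, "USD"));
```

Existing tests on the public interface should keep passing through this kind of refactor, since only the internal representation changes.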
@@ -0,0 +1,71 @@
# Good and Bad Tests

## Good Tests

**Integration-style**: Test through real interfaces, not mocks of internal parts.

```typescript
// GOOD: Tests observable behavior
test("user can checkout with valid cart", async () => {
  const cart = createCart();
  cart.add(product);
  const result = await checkout(cart, paymentMethod);
  expect(result.status).toBe("confirmed");
});
```

Characteristics:

- Tests behavior users/callers care about
- Uses public API only
- Survives internal refactors
- Describes WHAT, not HOW
- One logical assertion per test

## Bad Tests

**Implementation-detail tests**: Coupled to internal structure.

```typescript
// BAD: Tests implementation details
test("checkout calls paymentService.process", async () => {
  // Spy on the internal collaborator (jest.mock takes a module path, not an object)
  const processSpy = jest.spyOn(paymentService, "process");
  await checkout(cart, payment);
  expect(processSpy).toHaveBeenCalledWith(cart.total);
});
```

Red flags:

- Mocking internal collaborators
- Testing private methods
- Asserting on call counts/order
- Test breaks when refactoring without behavior change
- Test name describes HOW not WHAT
- Verifying through external means instead of interface

```typescript
// BAD: Bypasses interface to verify
test("createUser saves to database", async () => {
  await createUser({ name: "Alice" });
  const row = await db.query("SELECT * FROM users WHERE name = ?", ["Alice"]);
  expect(row).toBeDefined();
});

// GOOD: Verifies through interface
test("createUser makes user retrievable", async () => {
  const user = await createUser({ name: "Alice" });
  const retrieved = await getUser(user.id);
  expect(retrieved.name).toBe("Alice");
});
```

## Clean Test Heuristics (Uncle Bob, *Clean Code*, Ch. 17)

Apply these specific heuristics to maintain a high-quality suite:

- **T1: Insufficient Tests**: A test suite should test everything that could possibly break. Don't stop at "it seems to work."
- **T4: Ignored Tests**: Never ignore a test without documenting the ambiguity. An ignored test is a silent warning of a gap in understanding.
- **T5: Test Boundary Conditions**: Most bugs happen at the edges. Test the exact boundaries (e.g., empty strings, max integers, off-by-one indices).
- **T6: Exhaustively Test Near Bugs**: Bugs congregate. If you find one, there are likely others nearby; test that area thoroughly.
- **T9: Tests Should Be Fast**: Slow tests don't get run. Keep them fast so they remain part of the core developer loop.
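
As a small illustration of T5, the sketch below checks a pagination helper at its edges rather than only in the middle of its range. The `paginate` function and the chosen cases are hypothetical, and a plain table of cases is used instead of a specific test runner so the example stays self-contained.

```typescript
// Hypothetical function under test: returns the requested 1-based page.
function paginate<T>(items: T[], page: number, pageSize: number): T[] {
  if (page < 1 || pageSize < 1) return [];
  const start = (page - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

// Boundary cases: empty input, page zero, partial last page,
// one past the end, and a page size equal to the list length.
const cases: Array<{ label: string; actual: number[]; expected: number[] }> = [
  { label: "empty list", actual: paginate<number>([], 1, 2), expected: [] },
  { label: "page 0", actual: paginate([1, 2, 3], 0, 2), expected: [] },
  { label: "last partial page", actual: paginate([1, 2, 3], 2, 2), expected: [3] },
  { label: "one past the last page", actual: paginate([1, 2, 3], 3, 2), expected: [] },
  { label: "page size equals length", actual: paginate([1, 2, 3], 1, 3), expected: [1, 2, 3] },
];

// Collect any case whose actual output differs from the expected output.
const failures = cases.filter(
  (c) => JSON.stringify(c.actual) !== JSON.stringify(c.expected),
);
```

In a real suite each row would typically become its own named test so a failure points directly at the boundary that broke.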