@skilly-hand/skilly-hand 0.29.2 → 0.29.3
- package/CHANGELOG.md +16 -0
- package/catalog/README.md +2 -2
- package/catalog/skills/spec-driven-development/SKILL.md +95 -144
- package/catalog/skills/spec-driven-development/agents/apply.md +30 -15
- package/catalog/skills/spec-driven-development/agents/orchestrate.md +23 -14
- package/catalog/skills/spec-driven-development/agents/plan.md +19 -17
- package/catalog/skills/spec-driven-development/agents/verify.md +40 -19
- package/catalog/skills/spec-driven-development/assets/delta-spec-template.md +50 -15
- package/catalog/skills/spec-driven-development/assets/design-template.md +20 -14
- package/catalog/skills/spec-driven-development/assets/spec-template.md +41 -20
- package/catalog/skills/spec-driven-development/assets/validation-checklist.md +28 -21
- package/catalog/skills/spec-driven-development/manifest.json +4 -4
- package/catalog/skills/test-driven-development/SKILL.md +92 -117
- package/catalog/skills/test-driven-development/assets/tdd-cycle.md +63 -447
- package/catalog/skills/test-driven-development/manifest.json +5 -5
- package/package.json +1 -1
- package/packages/catalog/package.json +1 -1
- package/packages/cli/package.json +1 -1
- package/packages/core/package.json +1 -1
- package/packages/detectors/package.json +1 -1
@@ -1,51 +1,86 @@
-# [
-
-Use this template when changing behavior in an existing feature.
+# [Work Name] - Delta Spec

 ## Why

-[
+[Reason for changing established behavior.]

-##
+## Baseline

-[
+[Maintained requirement source or a concise description of current behavior.]

 ## ADDED Requirements

 ### Requirement: [Name]

-[MUST
+[MUST, SHOULD, or MAY statement.]

 #### Scenario: [Name]

 - GIVEN [initial state]
 - WHEN [action]
-- THEN [
+- THEN [observable result]

 ## MODIFIED Requirements

 ### Requirement: [Name]

-[
+[Complete replacement requirement.]

-
+Previously: [Previous requirement or behavior.]

 ## REMOVED Requirements

 ### Requirement: [Name]

-
+Reason: [Why it is removed.]
+
+## Constraints
+
+### Must
+
+- [Enforceable requirement]
+
+### Must Not
+
+- [Disallowed behavior]
+
+### Out of Scope
+
+- [Boundary]
+
+## Approval Policy
+
+- Mode: [explicit checkpoint | self-review]
+- Trigger for reapproval: [change condition]

 ## Tasks

 ### T1: [Title]

-**What:** [
+**What:** [Observable outcome]
+**Required Capabilities:** [Semantic capabilities, or `none`]
+**Files:** [Expected files, or `discover`]
+**Scenario:** [GIVEN / WHEN / THEN, when behavioral]
+**Verify:** [Project-discovered command or concrete manual check]
+**Done:** [One-sentence completion condition]

-
+## Progress

-
+| Task | Status | Evidence |
+| --- | --- | --- |
+| T1 | TODO | |

 ## Validation

-- [Checks that prove the
+- [Checks that prove the behavior delta]
+- [Regression checks for retained behavior]
+
+## Reconciliation
+
+- Baseline update required: [yes | no]
+- Reconciliation result: [pending | complete | not applicable]
+
+## Change Log
+
+| Date | Change | Affected Tasks | Approval |
+| --- | --- | --- | --- |
@@ -1,31 +1,37 @@
-# Design: [
+# Design: [Work Name]

-Use
+Use this artifact only for decisions whose rationale or trade-offs will matter after implementation.

 ## Context

-[
+[Verified system state and reason a decision is needed.]

 ## Goals

-- [
-- [Secondary outcome]
+- [Desired outcome]

 ## Non-Goals

-- [
-- [Another excluded area]
+- [Explicit exclusion]

-##
+## Decision

-### Decision
+### [Decision Name]

-
+- Choice: [Selected approach]
+- Rationale: [Why it fits the constraints]
+- Required capabilities: [Semantic capabilities, or `none`]

-
+## Alternatives Considered

-
+| Alternative | Benefit | Cost | Reason Not Selected |
+| --- | --- | --- | --- |

-## Risks
+## Risks and Mitigations

-
+| Risk | Impact | Mitigation | Verification |
+| --- | --- | --- | --- |
+
+## Revisit Conditions
+
+- [Evidence or change that should reopen this decision]
@@ -1,56 +1,77 @@
-# [
+# [Work Name]

 ## Why

-[
+[Problem, value, and why it matters now.]

 ## What

-[Concrete
+[Concrete and testable deliverable.]

 ## Constraints

 ### Must

-- [
-
+- [Enforceable requirement]
+
+### Should
+
+- [Preferred outcome; deviations require a reason]

 ### Must Not

-- [Disallowed
-- [Disallowed dependencies or behavioral changes]
+- [Disallowed behavior or approach]

 ### Out of Scope

-- [
+- [Explicit boundary]

 ## Current State

-- [
-
+- [Verified files, behavior, dependencies, and conventions]
+
+## Approval Policy
+
+- Mode: [explicit checkpoint | self-review]
+- Trigger for reapproval: [scope, constraint, risk, or design change]

 ## Tasks

 ### T1: [Title]

-**What:** [
+**What:** [Observable outcome]
+
+**Required Capabilities:** [Semantic capabilities, or `none`]

-**Files:**
+**Files:** [Expected files, or `discover`]

-**
+**Scenario:**

-
+- GIVEN [initial state]
+- WHEN [action]
+- THEN [observable result]

-
+**Verify:** [Project-discovered command or concrete manual check]

-**
+**Done:** [One-sentence completion condition]

-
+## Progress

-
+| Task | Status | Evidence |
+| --- | --- | --- |
+| T1 | TODO | |
+
+Valid states: `TODO`, `IN_PROGRESS`, `BLOCKED`, `DONE`.

 ## Validation

 - [Feature-level automated check]
-- [Feature-level manual scenario]
-- [
+- [Feature-level manual scenario, if needed]
+- [Constraint or regression check]
+
+## Change Log
+
+Record requirement, scope, or design changes. Do not log routine progress.
+
+| Date | Change | Affected Tasks | Approval |
+| --- | --- | --- | --- |
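
The `## Progress` table and its list of valid states lend themselves to a mechanical check. The following is a minimal sketch in Python, assuming the table keeps the three-column `Task | Status | Evidence` layout from the template; `parse_progress` and `invalid_rows` are hypothetical helper names and are not part of skilly-hand.

```python
# Valid states as listed in the spec template.
VALID_STATES = {"TODO", "IN_PROGRESS", "BLOCKED", "DONE"}


def parse_progress(markdown: str) -> list[tuple[str, str, str]]:
    """Return (task, status, evidence) rows from a three-column Markdown table."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue
        # Skip the header row, the --- separator row, and rows with an empty task cell.
        if cells[0] == "Task" or set(cells[0]) <= {"-", ":"}:
            continue
        rows.append((cells[0], cells[1], cells[2]))
    return rows


def invalid_rows(rows: list[tuple[str, str, str]]) -> list[str]:
    """Return the tasks whose status is not one of the template's valid states."""
    return [task for task, status, _ in rows if status not in VALID_STATES]


table = """
| Task | Status | Evidence |
| --- | --- | --- |
| T1 | DONE | unit tests pass |
| T2 | WIP | |
"""
rows = parse_progress(table)
print(invalid_rows(rows))  # ['T2']
```

A check like this could run before archiving, so that a status outside the four listed tokens is caught early rather than during review.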
@@ -1,33 +1,40 @@
 # Spec Validation Checklist

-
+## Portability
+
+- [ ] The workflow is executable without a named external skill, agent, vendor, service, framework, VCS, package manager, or test runner.
+- [ ] Required capabilities are semantic and can use a local fallback.
+- [ ] Commands were discovered from the target project or are clearly marked as placeholders.
+- [ ] Protocol states use portable ASCII tokens.

 ## Spec Quality

-- [ ]
-- [ ]
-- [ ]
-- [ ]
-- [ ]
+- [ ] `Why` and `What` are concrete.
+- [ ] Constraints and out-of-scope boundaries are enforceable.
+- [ ] Current-state claims reference verified context.
+- [ ] Approval and reapproval policies are explicit.
+- [ ] Architecture decisions are captured only when their rationale matters.

 ## Task Quality

-- [ ]
-- [ ] Each task
-- [ ]
-- [ ]
+- [ ] Each task has one observable outcome.
+- [ ] Each task declares capabilities, files, verify step, and definition of done.
+- [ ] Behavioral tasks include an acceptance scenario.
+- [ ] Dependencies and blockers are visible.
+- [ ] Tasks exist only in `spec.md`.

-##
+## Execution Evidence

-- [ ]
-- [ ]
-- [ ]
+- [ ] Every `DONE` task has reproducible evidence.
+- [ ] Failed or unavailable checks are recorded honestly.
+- [ ] Scope changes updated the spec before implementation continued.
+- [ ] Superseded evidence is identified.

-## Pre-Archive
+## Pre-Archive

-- [ ] All
-- [ ] Feature
-- [ ]
-- [ ]
-- [ ]
-- [ ]
+- [ ] All tasks are `DONE` and no blocker remains.
+- [ ] Feature validation and constraint checks pass.
+- [ ] The portable final review gate passes.
+- [ ] Manual checks are complete or explicitly approved.
+- [ ] Delta reconciliation is complete when applicable.
+- [ ] Archive name uses `<YYYY-MM-DD>-<work-name>`.
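
The archive naming rule in the final checklist item can also be checked mechanically. The following Python sketch rests on one assumption worth stating loudly: the checklist does not define what `<work-name>` may contain, so the lowercase kebab-case pattern below is a hypothetical choice, as is the helper name `is_valid_archive_name`.

```python
import re
from datetime import date

# Assumption: <work-name> is a lowercase kebab-case slug. The checklist only
# fixes the overall `<YYYY-MM-DD>-<work-name>` shape.
ARCHIVE_NAME = re.compile(r"(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)")


def is_valid_archive_name(name: str) -> bool:
    """Check the `<YYYY-MM-DD>-<work-name>` shape and that the date exists."""
    match = ARCHIVE_NAME.fullmatch(name)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups()[:3])
    try:
        date(year, month, day)  # raises ValueError for dates such as 2026-02-30
    except ValueError:
        return False
    return True


print(is_valid_archive_name("2026-06-20-portable-sdd"))  # True
print(is_valid_archive_name("2026-02-30-portable-sdd"))  # False
print(is_valid_archive_name("portable-sdd"))             # False
```

Validating the date through `datetime.date` rather than the regular expression alone rejects names that merely look like dates.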
@@ -1,7 +1,7 @@
 {
 "id": "spec-driven-development",
 "title": "Spec-Driven Development",
-"description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks.",
+"description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. Trigger: planning or executing feature work, bug fixes, and multi-phase implementation.",
 "portable": true,
 "tags": ["core", "workflow", "planning"],
 "detectors": ["always"],
@@ -10,10 +10,10 @@
 "agentSupport": ["codex", "claude", "cursor", "gemini", "copilot", "antigravity", "windsurf", "trae"],
 "skillMetadata": {
 "author": "skilly-hand",
-"last-edit": "2026-
+"last-edit": "2026-06-20",
 "license": "Apache-2.0",
-"version": "1.0
-"changelog": "Added
+"version": "1.1.0",
+"changelog": "Added a portable SDD lifecycle with capability-based routing, task evidence, change control, and archive invariants; prevents fixed tool dependencies and duplicated task state; affects planning, apply, verify, orchestrate, and spec templates",
 "auto-invoke": "Planning or executing feature work, bug fixes, and multi-phase implementation",
 "allowed-modes": [
 "plan",
@@ -1,13 +1,13 @@
 ---
 name: "test-driven-development"
-description: "Guide implementation
+description: "Guide implementation through evidence-based RED, GREEN, and REFACTOR cycles without assuming a language, framework, or test runner. Trigger: implementing testable behavior or reproducing a regression with tests first."
 skillMetadata:
 author: "skilly-hand"
-last-edit: "2026-
+last-edit: "2026-06-20"
 license: "Apache-2.0"
-version: "1.
-changelog: "
-auto-invoke: "Implementing
+version: "1.1.0"
+changelog: "Rebuilt TDD guidance around portable cycle evidence, expected RED failures, behavior-preserving refactors, and project-discovered test conventions; prevents framework assumptions and untested behavior during refactor; affects core workflow, examples, and verification guidance"
+auto-invoke: "Implementing testable behavior or reproducing a regression with tests first"
 allowed-tools:
 - "Read"
 - "Edit"
@@ -20,158 +20,133 @@ skillMetadata:

 ## When to Use

-Use
+Use TDD when desired behavior can be expressed before implementation, when fixing a reproducible regression, or when changing logic that benefits from a tight feedback loop.

-
-- Adding behavior to existing code where the expected outcome can be defined upfront.
-- Debugging a regression by writing a failing test that reproduces the bug first.
-- Reviewing or pair-programming on code where test-first discipline is required.
+Do not force TDD onto exploratory spikes, generated artifacts, environment-only setup, or behavior that cannot be observed reliably. Time-box exploration, discard or isolate spike code, then begin TDD once an interface is understood.

-
+## Portable Contract

--
--
--
+- Discover the project's language, test runner, commands, naming, and file placement before writing tests.
+- Prefer existing project conventions over examples in this skill.
+- Do not require a framework, package manager, assertion library, coverage tool, or external service.
+- If no runnable test harness exists, record the blocker or establish the smallest project-appropriate harness as separately approved work.

-
-
-## Critical Patterns
-
-### Pattern 1: RED First, Always
+## The Cycle

-
+### 1. Understand

-
-- The feature is actually needed.
-- You understand the requirements before touching implementation.
+Define one observable behavior and choose the lowest test level that can prove it without hiding important integration risk.

-
+### 2. RED

-
+Write a test before production behavior changes, then run it.

-
+A valid RED requires:

--
--
--
+- The new or changed test fails.
+- The failure is caused by the missing or incorrect target behavior.
+- The failure message or observation is understood.
+- Unrelated failures are separated from the cycle.

-
+If the test already passes, do not weaken it or write implementation blindly. Determine whether the behavior already exists, the assertion observes the wrong thing, or the test setup bypasses the relevant path.

-###
+### 3. GREEN

-
+Implement the smallest behavior that makes the RED test pass. Run the focused test, then the smallest relevant regression set.

-
-- Tests must stay green throughout every refactoring step.
-- If a refactor breaks a test, revert — the refactor was wrong.
+Do not add speculative validation, configuration, interfaces, or error cases that the current behavior does not require.

-###
+### 4. REFACTOR

-
+Improve structure while preserving observable behavior. Keep tests green after each meaningful change.

-
-- A test name should complete the sentence: *"it should ___"*.
-- If a test asserts two behaviors, split it into two tests.
+Allowed examples include renaming, removing duplication, simplifying control flow, or extracting an internal helper. Adding a new output, error case, persistence rule, side effect, or public option is not refactoring; start another RED cycle for it.

-
+### 5. Record Evidence

-
+Capture enough evidence to reproduce the cycle:

 ```text
-
-
-
-
-
-Fixing a bug? -> Write failing test that reproduces the bug first
+Behavior: <one observable outcome>
+RED: <command/check> -> FAIL because <expected reason>
+GREEN: <command/check> -> PASS
+REFACTOR: <command/check> -> PASS | NOT_NEEDED
+Regression: <relevant suite/check> -> PASS | NOT_RUN with reason
 ```

-
-
-## Code Examples
-
-### Example 1: GIVEN / WHEN / THEN Structure
+## Test Scope Selection

-
-
-
-
-
+| Need | Prefer |
+| --- | --- |
+| Pure logic or narrow rule | Unit test |
+| Collaboration between local modules | Integration test |
+| Boundary with a stable external contract | Contract test or boundary integration test |
+| User-visible workflow across the system | End-to-end test |
+| Existing behavior with unclear intent | Characterization test before change |

-
-const result = add(a, b);
+Use the lowest level that proves the behavior, but do not mock away the boundary where the defect or risk lives. A task may need more than one level when risks differ.

-
-expect(result).toBe(7);
-});
-```
+## Test Design Rules

-
+- One behavioral reason to fail per test. Multiple assertions are acceptable when they describe one outcome.
+- Use the project's preferred structure, such as Given/When/Then or Arrange/Act/Assert.
+- Assert observable results rather than private implementation details.
+- Keep setup focused and make test data reveal intent.
+- Test meaningful boundaries and error behavior, not every syntactic branch.
+- A regression test must fail on the faulty baseline and pass after the fix.

-
-// calculator.test.ts
-import { divide } from './calculator';
+## Test Doubles

-
-// GIVEN / WHEN / THEN
-expect(() => divide(10, 0)).toThrow('Cannot divide by zero');
-});
-```
+Use fakes, stubs, spies, or mocks only when they make the test faster, deterministic, or able to isolate an owned boundary.

-
+- Prefer simple state-based assertions over interaction assertions.
+- Verify interactions when the interaction itself is the contract.
+- Do not mock the unit under test.
+- Avoid reproducing complex third-party behavior in hand-written mocks.
+- Keep at least one integration check when a mocked boundary carries material compatibility risk.

-
+## Async and Determinism

-
-
-
-
-return a / b;
-}
-```
+- Prefer controllable clocks, schedulers, events, and in-memory boundaries over real delays or network calls.
+- Await observable completion; do not let assertions run after the test finishes.
+- Remove order dependence and shared mutable state.
+- Treat flaky tests as defects. Diagnose timing, isolation, and lifecycle issues instead of adding blind retries.

-
+## Coverage

-
+Coverage shows what executed, not whether behavior was specified well.

-
-
-
+- Respect thresholds already configured by the project.
+- Use uncovered critical behavior to guide new scenarios.
+- Do not invent a universal percentage.
+- Never add low-value assertions solely to increase a metric.

-
-if (denominator === 0) throw new Error(DIVIDE_BY_ZERO_MESSAGE);
-return numerator / denominator;
-}
-```
+## Bug-Fix Cycle

-
+1. Reproduce the defect at the lowest useful level.
+2. Confirm the test fails for the reported reason.
+3. Apply the smallest correction.
+4. Confirm the regression test and relevant existing tests pass.
+5. Refactor only after the correction is protected.

-
-
-## Commands
-
-```bash
-# Run a single test file
-npm test -- {test-file}
-
-# Run all tests
-npm test
-
-# Run tests in watch mode
-npm test -- --watch
-
-# Type check without emitting
-npx tsc --noEmit
-
-# Lint check
-npm run lint
+## Decision Tree

-
-
+```text
+Can the behavior be observed reliably?
+NO -> clarify the interface or isolate exploration first
+YES -> choose the lowest useful test level
+
+Does the new test fail for the expected reason?
+NO, it passes -> inspect baseline, assertion, and setup
+NO, unrelated failure -> fix or isolate the test environment
+YES -> implement minimum GREEN behavior
+
+Did implementation add behavior not demanded by the test?
+YES -> remove it or start a new RED cycle
+NO -> run relevant regression checks, then refactor if useful
 ```

----
-
 ## Resources

--
+- Portable cycle examples and evidence template: [assets/tdd-cycle.md](assets/tdd-cycle.md)
+- Multi-step delivery workflow: [../spec-driven-development/SKILL.md](../spec-driven-development/SKILL.md)
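
The RED and GREEN steps described in the rewritten skill can be illustrated with a small, self-contained run. The following is a sketch only, written in Python with the standard-library `unittest` module; the skill itself deliberately assumes no language or test runner. The `slugify_*` functions and `run_check` are hypothetical names invented for this illustration. The sketch separates an assertion failure (a valid RED, because the target behavior is missing) from an error raised for an unrelated reason.

```python
import unittest


def slugify_baseline(text: str) -> str:
    # Hypothetical starting point: lowercases but does not replace spaces.
    return text.lower()


def slugify_fixed(text: str) -> str:
    # Smallest change that satisfies the test below.
    return text.lower().replace(" ", "-")


def run_check(slugify) -> str:
    """Run one behavior test against `slugify` and classify the outcome."""

    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(slugify("Work Name"), "work-name")

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    result = unittest.TestResult()
    suite.run(result)
    if result.errors:
        return "ERROR"  # the test crashed for an unrelated reason; not a valid RED
    if result.failures:
        return "FAIL"   # the assertion failed: the target behavior is missing
    return "PASS"


print("RED:  ", run_check(slugify_baseline))  # FAIL
print("GREEN:", run_check(slugify_fixed))     # PASS
```

The same test fails on the baseline and passes after the fix, which is the property the skill requires of a regression test. An `ERROR` outcome would instead point at the test environment, matching the "NO, unrelated failure" branch of the decision tree.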