@skilly-hand/skilly-hand 0.29.1 → 0.29.3

Files changed (29)
  1. package/CHANGELOG.md +31 -0
  2. package/catalog/README.md +2 -2
  3. package/catalog/skills/figma-mcp-0to1/SKILL.md +13 -14
  4. package/catalog/skills/figma-mcp-0to1/agents/canvas-creation-playbook.md +15 -2
  5. package/catalog/skills/figma-mcp-0to1/agents/install-auth.md +8 -8
  6. package/catalog/skills/figma-mcp-0to1/agents/tool-function-catalog.md +11 -5
  7. package/catalog/skills/figma-mcp-0to1/agents/troubleshooting-ops.md +8 -4
  8. package/catalog/skills/figma-mcp-0to1/assets/client-config-snippets.md +3 -8
  9. package/catalog/skills/figma-mcp-0to1/assets/prompt-recipes.md +19 -4
  10. package/catalog/skills/figma-mcp-0to1/manifest.json +3 -3
  11. package/catalog/skills/figma-mcp-0to1/references/official-tools-matrix.md +27 -28
  12. package/catalog/skills/spec-driven-development/SKILL.md +95 -144
  13. package/catalog/skills/spec-driven-development/agents/apply.md +30 -15
  14. package/catalog/skills/spec-driven-development/agents/orchestrate.md +23 -14
  15. package/catalog/skills/spec-driven-development/agents/plan.md +19 -17
  16. package/catalog/skills/spec-driven-development/agents/verify.md +40 -19
  17. package/catalog/skills/spec-driven-development/assets/delta-spec-template.md +50 -15
  18. package/catalog/skills/spec-driven-development/assets/design-template.md +20 -14
  19. package/catalog/skills/spec-driven-development/assets/spec-template.md +41 -20
  20. package/catalog/skills/spec-driven-development/assets/validation-checklist.md +28 -21
  21. package/catalog/skills/spec-driven-development/manifest.json +4 -4
  22. package/catalog/skills/test-driven-development/SKILL.md +92 -117
  23. package/catalog/skills/test-driven-development/assets/tdd-cycle.md +63 -447
  24. package/catalog/skills/test-driven-development/manifest.json +5 -5
  25. package/package.json +1 -1
  26. package/packages/catalog/package.json +1 -1
  27. package/packages/cli/package.json +1 -1
  28. package/packages/core/package.json +1 -1
  29. package/packages/detectors/package.json +1 -1
@@ -1,31 +1,37 @@
1
- # Design: [Feature Name]
1
+ # Design: [Work Name]
2
2
 
3
- Use with `spec.md` when implementation includes meaningful architectural decisions.
3
+ Use this artifact only for decisions whose rationale or trade-offs will matter after implementation.
4
4
 
5
5
  ## Context
6
6
 
7
- [Current system state, constraints, and why a decision is needed now.]
7
+ [Verified system state and reason a decision is needed.]
8
8
 
9
9
  ## Goals
10
10
 
11
- - [Primary outcome]
12
- - [Secondary outcome]
11
+ - [Desired outcome]
13
12
 
14
13
  ## Non-Goals
15
14
 
16
- - [Explicitly excluded area]
17
- - [Another excluded area]
15
+ - [Explicit exclusion]
18
16
 
19
- ## Decisions
17
+ ## Decision
20
18
 
21
- ### Decision: [Name]
19
+ ### [Decision Name]
22
20
 
23
- [What was chosen and why]
21
+ - Choice: [Selected approach]
22
+ - Rationale: [Why it fits the constraints]
23
+ - Required capabilities: [Semantic capabilities, or `none`]
24
24
 
25
- ### Alternatives Considered
25
+ ## Alternatives Considered
26
26
 
27
- - [Alternative option and why not chosen]
27
+ | Alternative | Benefit | Cost | Reason Not Selected |
28
+ | --- | --- | --- | --- |
28
29
 
29
- ## Risks / Trade-offs
30
+ ## Risks and Mitigations
30
31
 
31
- - **[Risk]:** [Impact + mitigation or acceptance rationale]
32
+ | Risk | Impact | Mitigation | Verification |
33
+ | --- | --- | --- | --- |
34
+
35
+ ## Revisit Conditions
36
+
37
+ - [Evidence or change that should reopen this decision]
@@ -1,56 +1,77 @@
1
- # [Feature Name]
1
+ # [Work Name]
2
2
 
3
3
  ## Why
4
4
 
5
- [1-2 sentences on the problem and why it matters now.]
5
+ [Problem, value, and why it matters now.]
6
6
 
7
7
  ## What
8
8
 
9
- [Concrete, testable deliverable.]
9
+ [Concrete and testable deliverable.]
10
10
 
11
11
  ## Constraints
12
12
 
13
13
  ### Must
14
14
 
15
- - [Required patterns, architecture, or conventions]
16
- - [Required dependency or existing system usage]
15
+ - [Enforceable requirement]
16
+
17
+ ### Should
18
+
19
+ - [Preferred outcome; deviations require a reason]
17
20
 
18
21
  ### Must Not
19
22
 
20
- - [Disallowed approaches]
21
- - [Disallowed dependencies or behavioral changes]
23
+ - [Disallowed behavior or approach]
22
24
 
23
25
  ### Out of Scope
24
26
 
25
- - [Adjacent but excluded features]
27
+ - [Explicit boundary]
26
28
 
27
29
  ## Current State
28
30
 
29
- - [Relevant existing files and behavior]
30
- - [Known dependencies and integration points]
31
+ - [Verified files, behavior, dependencies, and conventions]
32
+
33
+ ## Approval Policy
34
+
35
+ - Mode: [explicit checkpoint | self-review]
36
+ - Trigger for reapproval: [scope, constraint, risk, or design change]
31
37
 
32
38
  ## Tasks
33
39
 
34
40
  ### T1: [Title]
35
41
 
36
- **What:** [Specific implementation change]
42
+ **What:** [Observable outcome]
43
+
44
+ **Required Capabilities:** [Semantic capabilities, or `none`]
37
45
 
38
- **Files:** `path/to/file`, `path/to/test`
46
+ **Files:** [Expected files, or `discover`]
39
47
 
40
- **Verify:** [Command or manual check]
48
+ **Scenario:**
41
49
 
42
- ---
50
+ - GIVEN [initial state]
51
+ - WHEN [action]
52
+ - THEN [observable result]
43
53
 
44
- ### T2: [Title]
54
+ **Verify:** [Project-discovered command or concrete manual check]
45
55
 
46
- **What:** [Specific implementation change]
56
+ **Done:** [One-sentence completion condition]
47
57
 
48
- **Files:** `path/to/file`
58
+ ## Progress
49
59
 
50
- **Verify:** [Command or manual check]
60
+ | Task | Status | Evidence |
61
+ | --- | --- | --- |
62
+ | T1 | TODO | |
63
+
64
+ Valid states: `TODO`, `IN_PROGRESS`, `BLOCKED`, `DONE`.
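
The four progress states can be modeled as a closed set, which makes an invalid token or a `DONE` row without evidence easy to detect. The following is a minimal sketch under assumptions: `ProgressRow`, `isTaskStatus`, and `doneWithoutEvidence` are hypothetical names for illustration and are not part of the package.

```typescript
// Hypothetical model of one row in the Progress table (illustrative names only).
type TaskStatus = "TODO" | "IN_PROGRESS" | "BLOCKED" | "DONE";

interface ProgressRow {
  task: string;
  status: TaskStatus;
  evidence: string;
}

const VALID_STATES: readonly TaskStatus[] = ["TODO", "IN_PROGRESS", "BLOCKED", "DONE"];

// Narrow an arbitrary string to one of the four valid ASCII state tokens.
function isTaskStatus(value: string): value is TaskStatus {
  return (VALID_STATES as readonly string[]).indexOf(value) !== -1;
}

// Return the task IDs that are marked DONE but carry no evidence text.
function doneWithoutEvidence(rows: ProgressRow[]): string[] {
  return rows
    .filter((row) => row.status === "DONE" && row.evidence.trim() === "")
    .map((row) => row.task);
}
```

A check like `doneWithoutEvidence` mirrors the validation checklist item that every `DONE` task has reproducible evidence. How evidence is actually stored is left to the project.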
51
65
 
52
66
  ## Validation
53
67
 
54
68
  - [Feature-level automated check]
55
- - [Feature-level manual scenario]
56
- - [Additional release-specific verification]
69
+ - [Feature-level manual scenario, if needed]
70
+ - [Constraint or regression check]
71
+
72
+ ## Change Log
73
+
74
+ Record requirement, scope, or design changes. Do not log routine progress.
75
+
76
+ | Date | Change | Affected Tasks | Approval |
77
+ | --- | --- | --- | --- |
@@ -1,33 +1,40 @@
1
1
  # Spec Validation Checklist
2
2
 
3
- Use this checklist before implementation and again before archive.
3
+ ## Portability
4
+
5
+ - [ ] The workflow is executable without a named external skill, agent, vendor, service, framework, VCS, package manager, or test runner.
6
+ - [ ] Required capabilities are semantic and can use a local fallback.
7
+ - [ ] Commands were discovered from the target project or are clearly marked as placeholders.
8
+ - [ ] Protocol states use portable ASCII tokens.
4
9
 
5
10
  ## Spec Quality
6
11
 
7
- - [ ] Title is specific and unambiguous.
8
- - [ ] `Why` explains urgency and value.
9
- - [ ] `What` is concrete and testable.
10
- - [ ] Constraints are enforceable (`Must`, `Must Not`, `Out of Scope`).
11
- - [ ] `Current State` references real files or systems.
12
+ - [ ] `Why` and `What` are concrete.
13
+ - [ ] Constraints and out-of-scope boundaries are enforceable.
14
+ - [ ] Current-state claims reference verified context.
15
+ - [ ] Approval and reapproval policies are explicit.
16
+ - [ ] Architecture decisions are captured only when their rationale matters.
12
17
 
13
18
  ## Task Quality
14
19
 
15
- - [ ] Tasks are small and scoped.
16
- - [ ] Each task has `What`, `Files`, and `Verify`.
17
- - [ ] Task verification can be completed quickly.
18
- - [ ] No task mixes unrelated concerns.
20
+ - [ ] Each task has one observable outcome.
21
+ - [ ] Each task declares capabilities, files, verify step, and definition of done.
22
+ - [ ] Behavioral tasks include an acceptance scenario.
23
+ - [ ] Dependencies and blockers are visible.
24
+ - [ ] Tasks exist only in `spec.md`.
19
25
 
20
- ## Implementation Readiness
26
+ ## Execution Evidence
21
27
 
22
- - [ ] Success criteria are explicit.
23
- - [ ] No critical ambiguity remains.
24
- - [ ] Architecture decisions are captured in `design.md` if needed.
28
+ - [ ] Every `DONE` task has reproducible evidence.
29
+ - [ ] Failed or unavailable checks are recorded honestly.
30
+ - [ ] Scope changes updated the spec before implementation continued.
31
+ - [ ] Superseded evidence is identified.
25
32
 
26
- ## Pre-Archive Verification
33
+ ## Pre-Archive
27
34
 
28
- - [ ] All planned tasks are complete.
29
- - [ ] Feature-level validation passes.
30
- - [ ] Constraints were respected.
31
- - [ ] Final `review-rangers` gate completed with no unresolved blockers.
32
- - [ ] No unintended scope creep.
33
- - [ ] Work is moved from `.sdd/active/` to `.sdd/archive/`.
35
+ - [ ] All tasks are `DONE` and no blocker remains.
36
+ - [ ] Feature validation and constraint checks pass.
37
+ - [ ] The portable final review gate passes.
38
+ - [ ] Manual checks are complete or explicitly approved.
39
+ - [ ] Delta reconciliation is complete when applicable.
40
+ - [ ] Archive name uses `<YYYY-MM-DD>-<work-name>`.
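
The archive naming rule can be checked mechanically. The sketch below assumes a lowercase, hyphen-separated work name and a UTC calendar date; neither assumption is stated in the checklist, and `archiveName` and `isArchiveName` are hypothetical helpers.

```typescript
// Build "<YYYY-MM-DD>-<work-name>".
// Assumption: the work name is slugified to lowercase words joined by hyphens.
function archiveName(date: Date, workName: string): string {
  const day = date.toISOString().slice(0, 10); // UTC calendar date, e.g. "2026-06-20"
  const slug = workName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into a single hyphen
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
  return `${day}-${slug}`;
}

// Accept only names that follow the date-prefixed pattern assumed above.
const ARCHIVE_NAME_PATTERN = /^\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isArchiveName(name: string): boolean {
  return ARCHIVE_NAME_PATTERN.test(name);
}
```

The pattern checks shape only; it does not confirm that the date is a real calendar day.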
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "id": "spec-driven-development",
3
3
  "title": "Spec-Driven Development",
4
- "description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks.",
4
+ "description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. Trigger: planning or executing feature work, bug fixes, and multi-phase implementation.",
5
5
  "portable": true,
6
6
  "tags": ["core", "workflow", "planning"],
7
7
  "detectors": ["always"],
@@ -10,10 +10,10 @@
10
10
  "agentSupport": ["codex", "claude", "cursor", "gemini", "copilot", "antigravity", "windsurf", "trae"],
11
11
  "skillMetadata": {
12
12
  "author": "skilly-hand",
13
- "last-edit": "2026-04-03",
13
+ "last-edit": "2026-06-20",
14
14
  "license": "Apache-2.0",
15
- "version": "1.0.3",
16
- "changelog": "Added OpenSpec complementary support routing guidance to spec-driven-development instructions; improves planning continuity and review clarity when local SDD needs reinforcement; affects spec-driven-development SKILL guidance and manifest metadata",
15
+ "version": "1.1.0",
16
+ "changelog": "Added a portable SDD lifecycle with capability-based routing, task evidence, change control, and archive invariants; prevents fixed tool dependencies and duplicated task state; affects planning, apply, verify, orchestrate, and spec templates",
17
17
  "auto-invoke": "Planning or executing feature work, bug fixes, and multi-phase implementation",
18
18
  "allowed-modes": [
19
19
  "plan",
@@ -1,13 +1,13 @@
1
1
  ---
2
2
  name: "test-driven-development"
3
- description: "Guide implementation using the RED → GREEN → REFACTOR TDD cycle: write a failing test first, write the minimum code to pass, then refactor while tests stay green."
3
+ description: "Guide implementation through evidence-based RED, GREEN, and REFACTOR cycles without assuming a language, framework, or test runner. Trigger: implementing testable behavior or reproducing a regression with tests first."
4
4
  skillMetadata:
5
5
  author: "skilly-hand"
6
- last-edit: "2026-04-04"
6
+ last-edit: "2026-06-20"
7
7
  license: "Apache-2.0"
8
- version: "1.0.0"
9
- changelog: "Initial TDD skill ported from legacy scannlab-sdd tdd-templates; enables RED→GREEN→REFACTOR workflow across any stack; affects catalog skill coverage for test-first development"
10
- auto-invoke: "Implementing features, services, or components using test-driven development (TDD) or RED→GREEN→REFACTOR cycles"
8
+ version: "1.1.0"
9
+ changelog: "Rebuilt TDD guidance around portable cycle evidence, expected RED failures, behavior-preserving refactors, and project-discovered test conventions; prevents framework assumptions and untested behavior during refactor; affects core workflow, examples, and verification guidance"
10
+ auto-invoke: "Implementing testable behavior or reproducing a regression with tests first"
11
11
  allowed-tools:
12
12
  - "Read"
13
13
  - "Edit"
@@ -20,158 +20,133 @@ skillMetadata:
20
20
 
21
21
  ## When to Use
22
22
 
23
- Use this skill when:
23
+ Use TDD when desired behavior can be expressed before implementation, when fixing a reproducible regression, or when changing logic that benefits from a tight feedback loop.
24
24
 
25
- - Implementing a new feature, service, component, or function from scratch.
26
- - Adding behavior to existing code where the expected outcome can be defined upfront.
27
- - Debugging a regression by writing a failing test that reproduces the bug first.
28
- - Reviewing or pair-programming on code where test-first discipline is required.
25
+ Do not force TDD onto exploratory spikes, generated artifacts, environment-only setup, or behavior that cannot be observed reliably. Time-box exploration, discard or isolate spike code, then begin TDD once an interface is understood.
29
26
 
30
- Do not use this skill for:
27
+ ## Portable Contract
31
28
 
32
- - Exploratory prototyping where requirements are entirely undefined.
33
- - Snapshot or visual regression tests driven by existing UI.
34
- - Infrastructure or environment setup with no testable behavior.
29
+ - Discover the project's language, test runner, commands, naming, and file placement before writing tests.
30
+ - Prefer existing project conventions over examples in this skill.
31
+ - Do not require a framework, package manager, assertion library, coverage tool, or external service.
32
+ - If no runnable test harness exists, record the blocker or establish the smallest project-appropriate harness as separately approved work.
35
33
 
36
- ---
37
-
38
- ## Critical Patterns
39
-
40
- ### Pattern 1: RED First, Always
34
+ ## The Cycle
41
35
 
42
- Write a failing test before writing any implementation code. This proves:
36
+ ### 1. Understand
43
37
 
44
- - The test is meaningful (not passing by accident).
45
- - The feature is actually needed.
46
- - You understand the requirements before touching implementation.
38
+ Define one observable behavior and choose the lowest test level that can prove it without hiding important integration risk.
47
39
 
48
- Never write implementation code without a failing test that demands it.
40
+ ### 2. RED
49
41
 
50
- ### Pattern 2: Minimum Code to GREEN
42
+ Write a test before production behavior changes, then run it.
51
43
 
52
- Write the **smallest possible code** to make the test pass:
44
+ A valid RED requires:
53
45
 
54
- - No extra features beyond what the test requires.
55
- - No premature optimization or defensive handling.
56
- - No "while I'm here, let me add…" additions.
46
+ - The new or changed test fails.
47
+ - The failure is caused by the missing or incorrect target behavior.
48
+ - The failure message or observation is understood.
49
+ - Unrelated failures are separated from the cycle.
57
50
 
58
- The goal is to satisfy the test, nothing more.
51
+ If the test already passes, do not weaken it or write implementation blindly. Determine whether the behavior already exists, the assertion observes the wrong thing, or the test setup bypasses the relevant path.
59
52
 
60
- ### Pattern 3: REFACTOR With Tests GREEN
53
+ ### 3. GREEN
61
54
 
62
- Only improve code structure **after** all tests pass:
55
+ Implement the smallest behavior that makes the RED test pass. Run the focused test, then the smallest relevant regression set.
63
56
 
64
- - Extract constants, improve naming, simplify logic.
65
- - Tests must stay green throughout every refactoring step.
66
- - If a refactor breaks a test, revert — the refactor was wrong.
57
+ Do not add speculative validation, configuration, interfaces, or error cases that the current behavior does not require.
67
58
 
68
- ### Pattern 4: One Scenario Per Test
59
+ ### 4. REFACTOR
69
60
 
70
- Each test must validate exactly one behavior:
61
+ Improve structure while preserving observable behavior. Keep tests green after each meaningful change.
71
62
 
72
- - Use explicit GIVEN / WHEN / THEN structure in test bodies.
73
- - A test name should complete the sentence: *"it should ___"*.
74
- - If a test asserts two behaviors, split it into two tests.
63
+ Allowed examples include renaming, removing duplication, simplifying control flow, or extracting an internal helper. Adding a new output, error case, persistence rule, side effect, or public option is not refactoring; start another RED cycle for it.
75
64
 
76
- ---
65
+ ### 5. Record Evidence
77
66
 
78
- ## Decision Tree
67
+ Capture enough evidence to reproduce the cycle:
79
68
 
80
69
  ```text
81
- Starting a new feature or function? -> Write failing test first (RED)
82
- Test is failing as expected? -> Write minimum code to pass (GREEN)
83
- Test is passing? -> Improve code structure without changing behavior (REFACTOR)
84
- Refactor broke a test? -> Revert; the refactor introduced a behavior change
85
- Test is already passing before writing code? -> Test is not meaningful; redesign it
86
- Fixing a bug? -> Write failing test that reproduces the bug first
70
+ Behavior: <one observable outcome>
71
+ RED: <command/check> -> FAIL because <expected reason>
72
+ GREEN: <command/check> -> PASS
73
+ REFACTOR: <command/check> -> PASS | NOT_NEEDED
74
+ Regression: <relevant suite/check> -> PASS | NOT_RUN with reason
87
75
  ```
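
The evidence template above can also be kept as structured data and rendered to the same text. This is a sketch under assumptions: the `CycleEvidence` shape, the `formatEvidence` helper, and the `NOT_RUN: <reason>` rendering are illustrative choices that the skill does not define.

```typescript
// Hypothetical structured form of one TDD cycle record.
interface CycleEvidence {
  behavior: string;
  red: { command: string; failureReason: string };
  green: { command: string };
  refactor: { command: string } | "NOT_NEEDED";
  regression: { command: string } | { notRunReason: string };
}

// Render the record in the same five-line layout as the text template.
function formatEvidence(e: CycleEvidence): string {
  const refactor =
    e.refactor === "NOT_NEEDED" ? "NOT_NEEDED" : `${e.refactor.command} -> PASS`;
  const regression =
    "command" in e.regression
      ? `${e.regression.command} -> PASS`
      : `NOT_RUN: ${e.regression.notRunReason}`;
  return [
    `Behavior: ${e.behavior}`,
    `RED: ${e.red.command} -> FAIL because ${e.red.failureReason}`,
    `GREEN: ${e.green.command} -> PASS`,
    `REFACTOR: ${refactor}`,
    `Regression: ${regression}`,
  ].join("\n");
}
```

Commands passed in should be the ones discovered in the target project, or clearly marked placeholders.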
88
76
 
89
- ---
90
-
91
- ## Code Examples
92
-
93
- ### Example 1: GIVEN / WHEN / THEN Structure
77
+ ## Test Scope Selection
94
78
 
95
- ```typescript
96
- it('should return the sum of two numbers', () => {
97
- // GIVEN: Two positive integers
98
- const a = 3;
99
- const b = 4;
79
+ | Need | Prefer |
80
+ | --- | --- |
81
+ | Pure logic or narrow rule | Unit test |
82
+ | Collaboration between local modules | Integration test |
83
+ | Boundary with a stable external contract | Contract test or boundary integration test |
84
+ | User-visible workflow across the system | End-to-end test |
85
+ | Existing behavior with unclear intent | Characterization test before change |
100
86
 
101
- // WHEN: Sum is computed
102
- const result = add(a, b);
87
+ Use the lowest level that proves the behavior, but do not mock away the boundary where the defect or risk lives. A task may need more than one level when risks differ.
103
88
 
104
- // THEN: Result equals their sum
105
- expect(result).toBe(7);
106
- });
107
- ```
89
+ ## Test Design Rules
108
90
 
109
- ### Example 2: RED - Write Failing Test First
91
+ - One behavioral reason to fail per test. Multiple assertions are acceptable when they describe one outcome.
92
+ - Use the project's preferred structure, such as Given/When/Then or Arrange/Act/Assert.
93
+ - Assert observable results rather than private implementation details.
94
+ - Keep setup focused and make test data reveal intent.
95
+ - Test meaningful boundaries and error behavior, not every syntactic branch.
96
+ - A regression test must fail on the faulty baseline and pass after the fix.
110
97
 
111
- ```typescript
112
- // calculator.test.ts
113
- import { divide } from './calculator';
98
+ ## Test Doubles
114
99
 
115
- it('should throw when dividing by zero', () => {
116
- // GIVEN / WHEN / THEN
117
- expect(() => divide(10, 0)).toThrow('Cannot divide by zero');
118
- });
119
- ```
100
+ Use fakes, stubs, spies, or mocks only when they make the test faster, deterministic, or able to isolate an owned boundary.
120
101
 
121
- Run the test; it **must** fail before writing any implementation.
102
+ - Prefer simple state-based assertions over interaction assertions.
103
+ - Verify interactions when the interaction itself is the contract.
104
+ - Do not mock the unit under test.
105
+ - Avoid reproducing complex third-party behavior in hand-written mocks.
106
+ - Keep at least one integration check when a mocked boundary carries material compatibility risk.
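
As one illustration of preferring state-based assertions, the sketch below uses a hand-written in-memory fake instead of a mocking library. `UserStore`, `InMemoryUserStore`, and `registerUser` are hypothetical names invented for this example; a real project would follow its own conventions.

```typescript
// Hypothetical owned boundary: somewhere to keep user names.
interface UserStore {
  save(id: string, name: string): void;
  find(id: string): string | undefined;
}

// In-memory fake: simple real behavior, no mocking framework required.
class InMemoryUserStore implements UserStore {
  private readonly users = new Map<string, string>();

  save(id: string, name: string): void {
    this.users.set(id, name);
  }

  find(id: string): string | undefined {
    return this.users.get(id);
  }
}

// Hypothetical unit under test: trims the name before storing it.
function registerUser(store: UserStore, id: string, rawName: string): void {
  store.save(id, rawName.trim());
}

// State-based check: assert on what the store holds afterwards,
// not on which methods were called or in what order.
const store = new InMemoryUserStore();
registerUser(store, "u1", "  Ada  ");
if (store.find("u1") !== "Ada") {
  throw new Error("expected the trimmed name to be stored");
}
```

The check would still pass if `registerUser` were restructured internally, which is the practical benefit over asserting on individual calls.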
122
107
 
123
- ### Example 3: GREEN — Write Minimum Implementation
108
+ ## Async and Determinism
124
109
 
125
- ```typescript
126
- // calculator.ts
127
- export function divide(a: number, b: number): number {
128
- if (b === 0) throw new Error('Cannot divide by zero');
129
- return a / b;
130
- }
131
- ```
110
+ - Prefer controllable clocks, schedulers, events, and in-memory boundaries over real delays or network calls.
111
+ - Await observable completion; do not let assertions run after the test finishes.
112
+ - Remove order dependence and shared mutable state.
113
+ - Treat flaky tests as defects. Diagnose timing, isolation, and lifecycle issues instead of adding blind retries.
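
A controllable clock can be as small as an injected function. The sketch below assumes a hypothetical `createToken` unit that expires after a time-to-live; `Clock` and `createManualClock` are illustrative names and are not taken from any library.

```typescript
// A clock is any function that returns the current time in milliseconds.
type Clock = () => number;

// Hypothetical unit under test: a token that expires ttlMs after creation.
function createToken(clock: Clock, ttlMs: number): { isExpired: () => boolean } {
  const issuedAt = clock();
  return { isExpired: () => clock() - issuedAt >= ttlMs };
}

// Test-controlled clock: time moves only when the test advances it.
function createManualClock(startMs = 0): { now: Clock; advance: (ms: number) => void } {
  let current = startMs;
  return {
    now: () => current,
    advance: (ms) => {
      current += ms;
    },
  };
}

// The expiry boundary is exercised without any real delay.
const clock = createManualClock();
const token = createToken(clock.now, 1000);
clock.advance(999);
if (token.isExpired()) throw new Error("token should still be valid at 999 ms");
clock.advance(1);
if (!token.isExpired()) throw new Error("token should expire at 1000 ms");
```

Production code could pass `Date.now` as the clock, so the same function runs unchanged outside tests.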
132
114
 
133
- Run the test — it should now pass. No additional logic yet.
115
+ ## Coverage
134
116
 
135
- ### Example 4: REFACTOR - Improve Without Changing Behavior
117
+ Coverage shows what executed, not whether behavior was specified well.
136
118
 
137
- ```typescript
138
- // calculator.ts (refactored)
139
- const DIVIDE_BY_ZERO_MESSAGE = 'Cannot divide by zero';
119
+ - Respect thresholds already configured by the project.
120
+ - Use uncovered critical behavior to guide new scenarios.
121
+ - Do not invent a universal percentage.
122
+ - Never add low-value assertions solely to increase a metric.
140
123
 
141
- export function divide(numerator: number, denominator: number): number {
142
- if (denominator === 0) throw new Error(DIVIDE_BY_ZERO_MESSAGE);
143
- return numerator / denominator;
144
- }
145
- ```
124
+ ## Bug-Fix Cycle
146
125
 
147
- Run the test; it must still pass after renaming and extracting the constant.
126
+ 1. Reproduce the defect at the lowest useful level.
127
+ 2. Confirm the test fails for the reported reason.
128
+ 3. Apply the smallest correction.
129
+ 4. Confirm the regression test and relevant existing tests pass.
130
+ 5. Refactor only after the correction is protected.
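
The steps above can be sketched by running one regression check against both the faulty baseline and the corrected code. The example reuses the divide-by-zero case from the previous version of this skill; `divideBaseline`, `divideFixed`, and `rejectsZeroDivisor` are hypothetical names, and the faulty baseline is invented for illustration.

```typescript
type Divide = (a: number, b: number) => number;

// Hypothetical faulty baseline: quietly returns Infinity for a zero divisor.
const divideBaseline: Divide = (a, b) => a / b;

// Smallest correction: reject a zero divisor and change nothing else.
const divideFixed: Divide = (a, b) => {
  if (b === 0) throw new Error("Cannot divide by zero");
  return a / b;
};

// Regression check: true only when dividing by zero throws.
function rejectsZeroDivisor(divide: Divide): boolean {
  try {
    divide(10, 0);
    return false;
  } catch (_error) {
    return true;
  }
}

// Steps 1-2: the check fails on the baseline for the reported reason.
if (rejectsZeroDivisor(divideBaseline)) throw new Error("baseline should fail the check");
// Steps 3-4: the check passes after the correction, and existing behavior holds.
if (!rejectsZeroDivisor(divideFixed)) throw new Error("fix should pass the check");
if (divideFixed(10, 2) !== 5) throw new Error("existing behavior should be preserved");
```

In a real project the baseline is the code before the fix, not a second function, and the check lives in the project's own test runner.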
148
131
 
149
- ---
150
-
151
- ## Commands
152
-
153
- ```bash
154
- # Run a single test file
155
- npm test -- {test-file}
156
-
157
- # Run all tests
158
- npm test
159
-
160
- # Run tests in watch mode
161
- npm test -- --watch
162
-
163
- # Type check without emitting
164
- npx tsc --noEmit
165
-
166
- # Lint check
167
- npm run lint
132
+ ## Decision Tree
168
133
 
169
- # Full verify (tests + lint + type check + build)
170
- npm test && npm run lint && npx tsc --noEmit && npm run build
134
+ ```text
135
+ Can the behavior be observed reliably?
136
+ NO -> clarify the interface or isolate exploration first
137
+ YES -> choose the lowest useful test level
138
+
139
+ Does the new test fail for the expected reason?
140
+ NO, it passes -> inspect baseline, assertion, and setup
141
+ NO, unrelated failure -> fix or isolate the test environment
142
+ YES -> implement minimum GREEN behavior
143
+
144
+ Did implementation add behavior not demanded by the test?
145
+ YES -> remove it or start a new RED cycle
146
+ NO -> run relevant regression checks, then refactor if useful
171
147
  ```
172
148
 
173
- ---
174
-
175
149
  ## Resources
176
150
 
177
- - Full cycle examples with Angular: [assets/tdd-cycle.md](assets/tdd-cycle.md)
151
+ - Portable cycle examples and evidence template: [assets/tdd-cycle.md](assets/tdd-cycle.md)
152
+ - Multi-step delivery workflow: [../spec-driven-development/SKILL.md](../spec-driven-development/SKILL.md)