npm: @skilly-hand/skilly-hand 0.29.2 → 0.29.3

Files changed (20)

package/catalog/skills/spec-driven-development/assets/delta-spec-template.md CHANGED

@@ -1,51 +1,86 @@
-# [Feature Name] - Delta Spec
-Use this template when changing behavior in an existing feature.
+# [Work Name] - Delta Spec
 ## Why
-[Why this behavior change is needed now.]
+[Reason for changing established behavior.]
-## What
+## Baseline
-[Concrete behavior change and expected outcome.]
+[Maintained requirement source or a concise description of current behavior.]
 ## ADDED Requirements
 ### Requirement: [Name]
-[MUST/SHOULD/MAY statement]
+[MUST, SHOULD, or MAY statement.]
 #### Scenario: [Name]
 - GIVEN [initial state]
 - WHEN [action]
-- THEN [expected outcome]
+- THEN [observable result]
 ## MODIFIED Requirements
 ### Requirement: [Name]
-[New requirement text]
+[Complete replacement requirement.]
-(Previously: [old requirement text])
+Previously: [Previous requirement or behavior.]
 ## REMOVED Requirements
 ### Requirement: [Name]
-(Reason: [why removed])
+Reason: [Why it is removed.]
+## Constraints
+### Must
+- [Enforceable requirement]
+### Must Not
+- [Disallowed behavior]
+### Out of Scope
+- [Boundary]
+## Approval Policy
+- Mode: [explicit checkpoint | self-review]
+- Trigger for reapproval: [change condition]
 ## Tasks
 ### T1: [Title]
-**What:** [Specific implementation]
+**What:** [Observable outcome]
+**Required Capabilities:** [Semantic capabilities, or `none`]
+**Files:** [Expected files, or `discover`]
+**Scenario:** [GIVEN / WHEN / THEN, when behavioral]
+**Verify:** [Project-discovered command or concrete manual check]
+**Done:** [One-sentence completion condition]
-**Files:** `path/to/file`, `path/to/test`
+## Progress
-**Verify:** [Command or manual check]
+| Task | Status | Evidence |
+| --- | --- | --- |
+| T1 | TODO | |
 ## Validation
-- [Checks that prove the delta behavior is correct]
+- [Checks that prove the behavior delta]
+- [Regression checks for retained behavior]
+## Reconciliation
+- Baseline update required: [yes | no]
+- Reconciliation result: [pending | complete | not applicable]
+## Change Log
+| Date | Change | Affected Tasks | Approval |
+| --- | --- | --- | --- |

package/catalog/skills/spec-driven-development/assets/design-template.md CHANGED

@@ -1,31 +1,37 @@
-# Design: [Feature Name]
+# Design: [Work Name]
-Use with `spec.md` when implementation includes meaningful architectural decisions.
+Use this artifact only for decisions whose rationale or trade-offs will matter after implementation.
 ## Context
-[Current system state, constraints, and why a decision is needed now.]
+[Verified system state and reason a decision is needed.]
 ## Goals
-- [Primary outcome]
-- [Secondary outcome]
+- [Desired outcome]
 ## Non-Goals
-- [Explicitly excluded area]
-- [Another excluded area]
+- [Explicit exclusion]
-## Decisions
+## Decision
-### Decision: [Name]
+### [Decision Name]
-[What was chosen and why]
+- Choice: [Selected approach]
+- Rationale: [Why it fits the constraints]
+- Required capabilities: [Semantic capabilities, or `none`]
-### Alternatives Considered
+## Alternatives Considered
-- [Alternative option and why not chosen]
+| Alternative | Benefit | Cost | Reason Not Selected |
+| --- | --- | --- | --- |
-## Risks / Trade-offs
+## Risks and Mitigations
-- **[Risk]:** [Impact + mitigation or acceptance rationale]
+| Risk | Impact | Mitigation | Verification |
+| --- | --- | --- | --- |
+## Revisit Conditions
+- [Evidence or change that should reopen this decision]

package/catalog/skills/spec-driven-development/assets/spec-template.md CHANGED

@@ -1,56 +1,77 @@
-# [Feature Name]
+# [Work Name]
 ## Why
-[1-2 sentences on the problem and why it matters now.]
+[Problem, value, and why it matters now.]
 ## What
-[Concrete, testable deliverable.]
+[Concrete and testable deliverable.]
 ## Constraints
 ### Must
-- [Required patterns, architecture, or conventions]
-- [Required dependency or existing system usage]
+- [Enforceable requirement]
+### Should
+- [Preferred outcome; deviations require a reason]
 ### Must Not
-- [Disallowed approaches]
-- [Disallowed dependencies or behavioral changes]
+- [Disallowed behavior or approach]
 ### Out of Scope
-- [Adjacent but excluded features]
+- [Explicit boundary]
 ## Current State
-- [Relevant existing files and behavior]
-- [Known dependencies and integration points]
+- [Verified files, behavior, dependencies, and conventions]
+## Approval Policy
+- Mode: [explicit checkpoint | self-review]
+- Trigger for reapproval: [scope, constraint, risk, or design change]
 ## Tasks
 ### T1: [Title]
-**What:** [Specific implementation change]
+**What:** [Observable outcome]
+**Required Capabilities:** [Semantic capabilities, or `none`]
-**Files:** `path/to/file`, `path/to/test`
+**Files:** [Expected files, or `discover`]
-**Verify:** [Command or manual check]
+**Scenario:**
----
+- GIVEN [initial state]
+- WHEN [action]
+- THEN [observable result]
-### T2: [Title]
+**Verify:** [Project-discovered command or concrete manual check]
-**What:** [Specific implementation change]
+**Done:** [One-sentence completion condition]
-**Files:** `path/to/file`
+## Progress
-**Verify:** [Command or manual check]
+| Task | Status | Evidence |
+| --- | --- | --- |
+| T1 | TODO | |
+Valid states: `TODO`, `IN_PROGRESS`, `BLOCKED`, `DONE`.
 ## Validation
 - [Feature-level automated check]
-- [Feature-level manual scenario]
-- [Additional release-specific verification]
+- [Feature-level manual scenario, if needed]
+- [Constraint or regression check]
+## Change Log
+Record requirement, scope, or design changes. Do not log routine progress.
+| Date | Change | Affected Tasks | Approval |
+| --- | --- | --- | --- |
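
Reviewer note: the new `Progress` table restricts task status to `TODO`, `IN_PROGRESS`, `BLOCKED`, and `DONE`. As a minimal sketch of how a consumer might enforce that, assuming a three-column row layout like the one in the template, the hypothetical helper below parses one table row and rejects unknown states. The names `parseProgressRow`, `ProgressRow`, and `isTaskState` are illustrative and do not appear in the package.

```typescript
// Hypothetical helper, not part of @skilly-hand/skilly-hand.
// The four states are the ones the template lists as valid.
const VALID_STATES = ["TODO", "IN_PROGRESS", "BLOCKED", "DONE"] as const;
type TaskState = (typeof VALID_STATES)[number];

interface ProgressRow {
  task: string;
  status: TaskState;
  evidence: string;
}

// Type guard: true only for one of the four template states.
function isTaskState(value: string): value is TaskState {
  return (VALID_STATES as readonly string[]).indexOf(value) !== -1;
}

// Parses one markdown row such as "| T1 | TODO | |".
// Returns null for the header and separator rows (or any row that does not
// have exactly three cells) and throws when the status is not a valid state.
function parseProgressRow(line: string): ProgressRow | null {
  const cells = line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
  if (cells.length !== 3) return null;
  const [task, status, evidence] = cells;
  if (task === "Task" || /^-+$/.test(task)) return null;
  if (!isTaskState(status)) {
    throw new Error(`Unknown status "${status}" for task ${task}`);
  }
  return { task, status, evidence };
}
```

Throwing on an unknown state, rather than coercing it, keeps a typo such as `FINISHED` from silently counting as progress.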

package/catalog/skills/spec-driven-development/assets/validation-checklist.md CHANGED

@@ -1,33 +1,40 @@
 # Spec Validation Checklist
-Use this checklist before implementation and again before archive.
+## Portability
+- [ ] The workflow is executable without a named external skill, agent, vendor, service, framework, VCS, package manager, or test runner.
+- [ ] Required capabilities are semantic and can use a local fallback.
+- [ ] Commands were discovered from the target project or are clearly marked as placeholders.
+- [ ] Protocol states use portable ASCII tokens.
 ## Spec Quality
-- [ ] Title is specific and unambiguous.
-- [ ] `Why` explains urgency and value.
-- [ ] `What` is concrete and testable.
-- [ ] Constraints are enforceable (`Must`, `Must Not`, `Out of Scope`).
-- [ ] `Current State` references real files or systems.
+- [ ] `Why` and `What` are concrete.
+- [ ] Constraints and out-of-scope boundaries are enforceable.
+- [ ] Current-state claims reference verified context.
+- [ ] Approval and reapproval policies are explicit.
+- [ ] Architecture decisions are captured only when their rationale matters.
 ## Task Quality
-- [ ] Tasks are small and scoped.
-- [ ] Each task has `What`, `Files`, and `Verify`.
-- [ ] Task verification can be completed quickly.
-- [ ] No task mixes unrelated concerns.
+- [ ] Each task has one observable outcome.
+- [ ] Each task declares capabilities, files, verify step, and definition of done.
+- [ ] Behavioral tasks include an acceptance scenario.
+- [ ] Dependencies and blockers are visible.
+- [ ] Tasks exist only in `spec.md`.
-## Implementation Readiness
+## Execution Evidence
-- [ ] Success criteria are explicit.
-- [ ] No critical ambiguity remains.
-- [ ] Architecture decisions are captured in `design.md` if needed.
+- [ ] Every `DONE` task has reproducible evidence.
+- [ ] Failed or unavailable checks are recorded honestly.
+- [ ] Scope changes updated the spec before implementation continued.
+- [ ] Superseded evidence is identified.
-## Pre-Archive Verification
+## Pre-Archive
-- [ ] All planned tasks are complete.
-- [ ] Feature-level validation passes.
-- [ ] Constraints were respected.
-- [ ] Final `review-rangers` gate completed with no unresolved blockers.
-- [ ] No unintended scope creep.
-- [ ] Work is moved from `.sdd/active/` to `.sdd/archive/`.
+- [ ] All tasks are `DONE` and no blocker remains.
+- [ ] Feature validation and constraint checks pass.
+- [ ] The portable final review gate passes.
+- [ ] Manual checks are complete or explicitly approved.
+- [ ] Delta reconciliation is complete when applicable.
+- [ ] Archive name uses `<YYYY-MM-DD>-<work-name>`.
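
Reviewer note: three of the new checklist items are mechanical enough to automate: every task is `DONE`, every `DONE` task has evidence, and the archive name follows `<YYYY-MM-DD>-<work-name>`. The sketch below is one possible reading, not the package's implementation. `archiveName`, `archiveBlockers`, and `TaskProgress` are hypothetical names, and both the slug rule (lowercase, non-alphanumeric runs collapsed to `-`) and the use of the UTC date are assumptions the checklist does not state.

```typescript
// Hypothetical sketch; none of these names come from the package.
interface TaskProgress {
  task: string;
  status: "TODO" | "IN_PROGRESS" | "BLOCKED" | "DONE";
  evidence: string;
}

// Builds "<YYYY-MM-DD>-<work-name>" from the UTC calendar date.
// Assumption: the work name is lowercased and every run of characters
// outside a-z and 0-9 becomes a single "-".
function archiveName(date: Date, workName: string): string {
  const day = date.toISOString().slice(0, 10);
  const slug = workName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${day}-${slug}`;
}

// Lists the reasons the work is not ready to archive.
// An empty array means both mechanical checks passed.
function archiveBlockers(rows: TaskProgress[]): string[] {
  const problems: string[] = [];
  for (const row of rows) {
    if (row.status !== "DONE") {
      problems.push(`${row.task} is ${row.status}`);
    } else if (row.evidence.trim() === "") {
      problems.push(`${row.task} is DONE without evidence`);
    }
  }
  return problems;
}
```

The remaining checklist items (review gate, manual checks, delta reconciliation) depend on human or project-specific judgment and are left out of the sketch.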

package/catalog/skills/spec-driven-development/manifest.json CHANGED

@@ -1,7 +1,7 @@
 {
   "id": "spec-driven-development",
   "title": "Spec-Driven Development",
-  "description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks.",
+  "description": "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. Trigger: planning or executing feature work, bug fixes, and multi-phase implementation.",
   "portable": true,
   "tags": ["core", "workflow", "planning"],
   "detectors": ["always"],
@@ -10,10 +10,10 @@
   "agentSupport": ["codex", "claude", "cursor", "gemini", "copilot", "antigravity", "windsurf", "trae"],
   "skillMetadata": {
     "author": "skilly-hand",
-    "last-edit": "2026-04-03",
+    "last-edit": "2026-06-20",
     "license": "Apache-2.0",
-    "version": "1.0.3",
-    "changelog": "Added OpenSpec complementary support routing guidance to spec-driven-development instructions; improves planning continuity and review clarity when local SDD needs reinforcement; affects spec-driven-development SKILL guidance and manifest metadata",
+    "version": "1.1.0",
+    "changelog": "Added a portable SDD lifecycle with capability-based routing, task evidence, change control, and archive invariants; prevents fixed tool dependencies and duplicated task state; affects planning, apply, verify, orchestrate, and spec templates",
     "auto-invoke": "Planning or executing feature work, bug fixes, and multi-phase implementation",
     "allowed-modes": [
       "plan",

package/catalog/skills/test-driven-development/SKILL.md CHANGED

@@ -1,13 +1,13 @@
 ---
 name: "test-driven-development"
-description: "Guide implementation using the RED → GREEN → REFACTOR TDD cycle: write a failing test first, write the minimum code to pass, then refactor while tests stay green."
+description: "Guide implementation through evidence-based RED, GREEN, and REFACTOR cycles without assuming a language, framework, or test runner. Trigger: implementing testable behavior or reproducing a regression with tests first."
 skillMetadata:
   author: "skilly-hand"
-  last-edit: "2026-04-04"
+  last-edit: "2026-06-20"
   license: "Apache-2.0"
-  version: "1.0.0"
-  changelog: "Initial TDD skill ported from legacy scannlab-sdd tdd-templates; enables RED→GREEN→REFACTOR workflow across any stack; affects catalog skill coverage for test-first development"
-  auto-invoke: "Implementing features, services, or components using test-driven development (TDD) or RED→GREEN→REFACTOR cycles"
+  version: "1.1.0"
+  changelog: "Rebuilt TDD guidance around portable cycle evidence, expected RED failures, behavior-preserving refactors, and project-discovered test conventions; prevents framework assumptions and untested behavior during refactor; affects core workflow, examples, and verification guidance"
+  auto-invoke: "Implementing testable behavior or reproducing a regression with tests first"
   allowed-tools:
     - "Read"
     - "Edit"
@@ -20,158 +20,133 @@ skillMetadata:
 ## When to Use
-Use this skill when:
+Use TDD when desired behavior can be expressed before implementation, when fixing a reproducible regression, or when changing logic that benefits from a tight feedback loop.
-- Implementing a new feature, service, component, or function from scratch.
-- Adding behavior to existing code where the expected outcome can be defined upfront.
-- Debugging a regression by writing a failing test that reproduces the bug first.
-- Reviewing or pair-programming on code where test-first discipline is required.
+Do not force TDD onto exploratory spikes, generated artifacts, environment-only setup, or behavior that cannot be observed reliably. Time-box exploration, discard or isolate spike code, then begin TDD once an interface is understood.
-Do not use this skill for:
+## Portable Contract
-- Exploratory prototyping where requirements are entirely undefined.
-- Snapshot or visual regression tests driven by existing UI.
-- Infrastructure or environment setup with no testable behavior.
+- Discover the project's language, test runner, commands, naming, and file placement before writing tests.
+- Prefer existing project conventions over examples in this skill.
+- Do not require a framework, package manager, assertion library, coverage tool, or external service.
+- If no runnable test harness exists, record the blocker or establish the smallest project-appropriate harness as separately approved work.
----
-## Critical Patterns
-### Pattern 1: RED First, Always
+## The Cycle
-Write a failing test before writing any implementation code. This proves:
+### 1. Understand
-- The test is meaningful (not passing by accident).
-- The feature is actually needed.
-- You understand the requirements before touching implementation.
+Define one observable behavior and choose the lowest test level that can prove it without hiding important integration risk.
-Never write implementation code without a failing test that demands it.
+### 2. RED
-### Pattern 2: Minimum Code to GREEN
+Write a test before production behavior changes, then run it.
-Write the **smallest possible code** to make the test pass:
+A valid RED requires:
-- No extra features beyond what the test requires.
-- No premature optimization or defensive handling.
-- No "while I'm here, let me add…" additions.
+- The new or changed test fails.
+- The failure is caused by the missing or incorrect target behavior.
+- The failure message or observation is understood.
+- Unrelated failures are separated from the cycle.
-The goal is to satisfy the test, nothing more.
+If the test already passes, do not weaken it or write implementation blindly. Determine whether the behavior already exists, the assertion observes the wrong thing, or the test setup bypasses the relevant path.
-### Pattern 3: REFACTOR With Tests GREEN
+### 3. GREEN
-Only improve code structure **after** all tests pass:
+Implement the smallest behavior that makes the RED test pass. Run the focused test, then the smallest relevant regression set.
-- Extract constants, improve naming, simplify logic.
-- Tests must stay green throughout every refactoring step.
-- If a refactor breaks a test, revert — the refactor was wrong.
+Do not add speculative validation, configuration, interfaces, or error cases that the current behavior does not require.
-### Pattern 4: One Scenario Per Test
+### 4. REFACTOR
-Each test must validate exactly one behavior:
+Improve structure while preserving observable behavior. Keep tests green after each meaningful change.
-- Use explicit GIVEN / WHEN / THEN structure in test bodies.
-- A test name should complete the sentence: *"it should ___"*.
-- If a test asserts two behaviors, split it into two tests.
+Allowed examples include renaming, removing duplication, simplifying control flow, or extracting an internal helper. Adding a new output, error case, persistence rule, side effect, or public option is not refactoring; start another RED cycle for it.
----
+### 5. Record Evidence
-## Decision Tree
+Capture enough evidence to reproduce the cycle:
 ```text
-Starting a new feature or function?          -> Write failing test first (RED)
-Test is failing as expected?                 -> Write minimum code to pass (GREEN)
-Test is passing?                             -> Improve code structure without changing behavior (REFACTOR)
-Refactor broke a test?                       -> Revert — refactor introduced a behavior change
-Test is already passing before writing code? -> Test is not meaningful; redesign it
-Fixing a bug?                                -> Write failing test that reproduces the bug first
+Behavior: <one observable outcome>
+RED: <command/check> -> FAIL because <expected reason>
+GREEN: <command/check> -> PASS
+REFACTOR: <command/check> -> PASS | NOT_NEEDED
+Regression: <relevant suite/check> -> PASS | NOT_RUN with reason
 ```
----
-## Code Examples
-### Example 1: GIVEN / WHEN / THEN Structure
+## Test Scope Selection
-```typescript
-it('should return the sum of two numbers', () => {
-  // GIVEN: Two positive integers
-  const a = 3;
-  const b = 4;
+| Need | Prefer |
+| --- | --- |
+| Pure logic or narrow rule | Unit test |
+| Collaboration between local modules | Integration test |
+| Boundary with a stable external contract | Contract test or boundary integration test |
+| User-visible workflow across the system | End-to-end test |
+| Existing behavior with unclear intent | Characterization test before change |
-  // WHEN: Sum is computed
-  const result = add(a, b);
+Use the lowest level that proves the behavior, but do not mock away the boundary where the defect or risk lives. A task may need more than one level when risks differ.
-  // THEN: Result equals their sum
-  expect(result).toBe(7);
-});
-```
+## Test Design Rules
-### Example 2: RED — Write Failing Test First
+- One behavioral reason to fail per test. Multiple assertions are acceptable when they describe one outcome.
+- Use the project's preferred structure, such as Given/When/Then or Arrange/Act/Assert.
+- Assert observable results rather than private implementation details.
+- Keep setup focused and make test data reveal intent.
+- Test meaningful boundaries and error behavior, not every syntactic branch.
+- A regression test must fail on the faulty baseline and pass after the fix.
-```typescript
-// calculator.test.ts
-import { divide } from './calculator';
+## Test Doubles
-it('should throw when dividing by zero', () => {
-  // GIVEN / WHEN / THEN
-  expect(() => divide(10, 0)).toThrow('Cannot divide by zero');
-});
-```
+Use fakes, stubs, spies, or mocks only when they make the test faster, deterministic, or able to isolate an owned boundary.
-Run the test — it **must** fail before writing any implementation.
+- Prefer simple state-based assertions over interaction assertions.
+- Verify interactions when the interaction itself is the contract.
+- Do not mock the unit under test.
+- Avoid reproducing complex third-party behavior in hand-written mocks.
+- Keep at least one integration check when a mocked boundary carries material compatibility risk.
-### Example 3: GREEN — Write Minimum Implementation
+## Async and Determinism
-```typescript
-// calculator.ts
-export function divide(a: number, b: number): number {
-  if (b === 0) throw new Error('Cannot divide by zero');
-  return a / b;
-}
-```
+- Prefer controllable clocks, schedulers, events, and in-memory boundaries over real delays or network calls.
+- Await observable completion; do not let assertions run after the test finishes.
+- Remove order dependence and shared mutable state.
+- Treat flaky tests as defects. Diagnose timing, isolation, and lifecycle issues instead of adding blind retries.
-Run the test — it should now pass. No additional logic yet.
+## Coverage
-### Example 4: REFACTOR — Improve Without Changing Behavior
+Coverage shows what executed, not whether behavior was specified well.
-```typescript
-// calculator.ts (refactored)
-const DIVIDE_BY_ZERO_MESSAGE = 'Cannot divide by zero';
+- Respect thresholds already configured by the project.
+- Use uncovered critical behavior to guide new scenarios.
+- Do not invent a universal percentage.
+- Never add low-value assertions solely to increase a metric.
-export function divide(numerator: number, denominator: number): number {
-  if (denominator === 0) throw new Error(DIVIDE_BY_ZERO_MESSAGE);
-  return numerator / denominator;
-}
-```
+## Bug-Fix Cycle
-Run the test — it must still pass after renaming and extracting the constant.
+1. Reproduce the defect at the lowest useful level.
+2. Confirm the test fails for the reported reason.
+3. Apply the smallest correction.
+4. Confirm the regression test and relevant existing tests pass.
+5. Refactor only after the correction is protected.
----
-## Commands
-```bash
-# Run a single test file
-npm test -- {test-file}
-# Run all tests
-npm test
-# Run tests in watch mode
-npm test -- --watch
-# Type check without emitting
-npx tsc --noEmit
-# Lint check
-npm run lint
+## Decision Tree
-# Full verify (tests + lint + type check + build)
-npm test && npm run lint && npx tsc --noEmit && npm run build
+```text
+Can the behavior be observed reliably?
+  NO  -> clarify the interface or isolate exploration first
+  YES -> choose the lowest useful test level
+Does the new test fail for the expected reason?
+  NO, it passes          -> inspect baseline, assertion, and setup
+  NO, unrelated failure  -> fix or isolate the test environment
+  YES                    -> implement minimum GREEN behavior
+Did implementation add behavior not demanded by the test?
+  YES -> remove it or start a new RED cycle
+  NO  -> run relevant regression checks, then refactor if useful
 ```
----
 ## Resources
-- Full cycle examples with Angular: [assets/tdd-cycle.md](assets/tdd-cycle.md)
+- Portable cycle examples and evidence template: [assets/tdd-cycle.md](assets/tdd-cycle.md)
+- Multi-step delivery workflow: [../spec-driven-development/SKILL.md](../spec-driven-development/SKILL.md)
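
Reviewer note: the new "Record Evidence" step replaces the removed TypeScript examples with a plain-text template (`Behavior`, `RED`, `GREEN`, `REFACTOR`, `Regression`). A minimal sketch of producing that record from structured data follows. The `CycleEvidence` shape and `formatEvidence` function are hypothetical, and the rendering of `NOT_NEEDED` and `NOT_RUN` is one interpretation of the template rather than something the skill defines.

```typescript
// Hypothetical types; the field names are illustrative and not defined by the skill.
interface CycleEvidence {
  behavior: string;
  red: { check: string; failureReason: string };
  green: { check: string };
  // Either the check that passed after refactoring, or NOT_NEEDED.
  refactor: { check: string } | "NOT_NEEDED";
  // notRunReason is set only when the regression check was skipped.
  regression: { check: string; notRunReason?: string };
}

// Renders the five-line evidence record, one line per template field.
function formatEvidence(e: CycleEvidence): string {
  const refactor =
    typeof e.refactor === "string" ? "NOT_NEEDED" : `${e.refactor.check} -> PASS`;
  const regressionOutcome = e.regression.notRunReason
    ? `NOT_RUN (${e.regression.notRunReason})`
    : "PASS";
  return [
    `Behavior: ${e.behavior}`,
    `RED: ${e.red.check} -> FAIL because ${e.red.failureReason}`,
    `GREEN: ${e.green.check} -> PASS`,
    `REFACTOR: ${refactor}`,
    `Regression: ${e.regression.check} -> ${regressionOutcome}`,
  ].join("\n");
}
```

Requiring `failureReason` in the type mirrors the rule that a valid RED fails for an understood reason; a record without one cannot be constructed.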