npm - @skilly-hand/skilly-hand - Versions diffs - 0.29.2 → 0.29.3 - Mend

@skilly-hand/skilly-hand 0.29.2 → 0.29.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/CHANGELOG.md CHANGED

@@ -16,6 +16,22 @@ All notable changes to this project are documented in this file.
 ### Removed
 - _None._
+## [0.29.3] - 2026-06-20
+[View on npm](https://www.npmjs.com/package/@skilly-hand/skilly-hand/v/0.29.3)
+### Added
+- _None._
+### Changed
+- Reworked `spec-driven-development` around a portable lifecycle with repository-native capability discovery, evidence-backed task state, explicit change control, and archive invariants.
+- Rebuilt `test-driven-development` guidance around evidence-based RED, GREEN, and REFACTOR cycles without fixed framework or test-runner assumptions.
+### Fixed
+- _None._
+### Removed
+- _None._
 ## [0.29.2] - 2026-06-20
 [View on npm](https://www.npmjs.com/package/@skilly-hand/skilly-hand/v/0.29.2)

package/catalog/README.md CHANGED

@@ -20,8 +20,8 @@ Published portable skills consumed by the `skilly-hand` CLI.
 | `react-guidelines` | Guide React and Next.js code generation, review, and performance tuning using latest stable React verification and modern framework best practices. Trigger: generating, reviewing, refactoring, or optimizing React code artifacts in React projects. | react, frontend, workflow, best-practices | all |
 | `review-rangers` | Review code, decisions, and artifacts through a multi-perspective committee and a domain expert safety guard, then synthesize a structured verdict. | core, workflow, review, quality | all |
 | `roaster` | Challenge plans with constructive roast-style critique that exposes weak assumptions, missing angles, shallow sequencing, and unclear success criteria. Trigger: when the user proposes, requests, or evaluates a plan of any kind. | core, workflow, planning, quality | all |
-| `spec-driven-development` | Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. | core, workflow, planning | all |
-| `test-driven-development` | Guide implementation using the RED → GREEN → REFACTOR TDD cycle: write a failing test first, write the minimum code to pass, then refactor while tests stay green. | testing, workflow, quality, core | all |
+| `spec-driven-development` | Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. Trigger: planning or executing feature work, bug fixes, and multi-phase implementation. | core, workflow, planning | all |
+| `test-driven-development` | Guide implementation through evidence-based RED, GREEN, and REFACTOR cycles without assuming a language, framework, or test runner. Trigger: implementing testable behavior or reproducing a regression with tests first. | testing, workflow, quality, core | all |
 | `token-optimizer` | Classify task complexity and right-size reasoning depth, context gathering, and response detail to reduce wasted tokens. | core, workflow, efficiency | all |
 | `user-story-crafting` | Create and refine user stories with structured quality gates, splitting heuristics, and lightweight story mapping for release slicing. Trigger: writing, restructuring, splitting, or sequencing user stories for delivery-ready backlog work. | product, workflow, planning, quality | all |

package/catalog/skills/spec-driven-development/SKILL.md CHANGED

@@ -1,12 +1,12 @@
 ---
 name: "spec-driven-development"
-description: "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks."
+description: "Plan, execute, and verify multi-step work through versioned specs with small, testable tasks. Trigger: planning or executing feature work, bug fixes, and multi-phase implementation."
 skillMetadata:
   author: "skilly-hand"
-  last-edit: "2026-04-03"
+  last-edit: "2026-06-20"
   license: "Apache-2.0"
-  version: "1.0.3"
-  changelog: "Added OpenSpec complementary support routing guidance to spec-driven-development instructions; improves planning continuity and review clarity when local SDD needs reinforcement; affects spec-driven-development SKILL guidance and manifest metadata"
+  version: "1.1.0"
+  changelog: "Added a portable SDD lifecycle with capability-based routing, task evidence, change control, and archive invariants; prevents fixed tool dependencies and duplicated task state; affects planning, apply, verify, orchestrate, and spec templates"
   auto-invoke: "Planning or executing feature work, bug fixes, and multi-phase implementation"
   allowed-tools:
     - "Read"
@@ -22,192 +22,143 @@ skillMetadata:
 ## When to Use
-Use this skill when:
+Use this skill when work spans multiple steps, requirements need written boundaries, or progress must survive across contributors or sessions.
-- Work spans multiple commits or phases.
-- Requirements are easy to misinterpret without written constraints.
-- You need a repeatable plan that can be reviewed before coding.
-- Several contributors or sessions may touch the same feature.
+Skip it for trivial edits, urgent recovery work, and tasks with no meaningful verification path.
-Do not use this skill for:
+## Portable Contract
-- Trivial one-file edits.
-- Emergency fixes where immediate response matters more than process.
-- Tasks with no meaningful verification path.
+The workflow MUST remain executable with this skill alone.
----
-## Core Workflow
-1. Define the spec in `.sdd/active/<feature-name>/spec.md`.
-2. Review and refine scope, constraints, and tasks.
-3. Execute one small task at a time.
-4. Verify each task and the end-to-end outcome, ending with a required `review-rangers` final gate.
-5. Archive to `.sdd/archive/` when complete.
-Recommended task size:
-- Up to 3 files per task.
-- Around 30 minutes of implementation effort.
-- Clear, concrete verify step.
+- Treat integrations as optional capabilities, never required product names.
+- Discover available tools, commands, and repository conventions before selecting them.
+- When a capability is unavailable, use a local structured fallback or record a blocker.
+- Keep requirements, tasks, progress, evidence, and changes in `spec.md` as the single source of truth.
+- Do not create a second task list that can drift from the spec.
----
-## OpenSpec Complementary Support
+## Lifecycle
-Default execution SHOULD remain the local `.sdd` workflow.
+```text
+DRAFT -> APPROVED -> IN_PROGRESS -> VERIFYING -> COMPLETE -> ARCHIVED
+             |             |            |
+             +----------> BLOCKED <------+
+```
-Recur to OpenSpec support when the task needs complementary structure for:
+Rules:
-- Multi-session continuity where planning context must persist across chats.
-- Shareable planning artifacts for review before implementation.
-- Requirement-delta clarity that benefits from explicit change proposals.
+1. Planning creates or updates `.sdd/active/<work-name>/spec.md`.
+2. Implementation begins after the approval policy is satisfied.
+3. Only one task should normally be `IN_PROGRESS` at a time.
+4. A task becomes `DONE` only after its verify step passes and evidence is recorded.
+5. Changed requirements return affected tasks to planning before implementation continues.
+6. Archive only after feature validation passes and no task remains open or blocked.
-Routing rules:
+Valid task states: `TODO`, `IN_PROGRESS`, `BLOCKED`, `DONE`.
-- Keep the local `.sdd/active/<feature-name>/spec.md` as the execution source of truth unless the team explicitly standardizes on OpenSpec paths.
-- If OpenSpec is unavailable, continue in `.sdd` and document assumptions directly in the active spec.
+## Approval Policy
-| Use local SDD only | Use OpenSpec support |
-| --- | --- |
-| Single-session or straightforward work with clear requirements | Work spans multiple sessions and needs persistent planning context |
-| Existing `.sdd` artifacts already provide enough review clarity | Team needs proposal/design/tasks artifacts for async review |
-| Requirement changes are small and easy to track in-place | Requirement deltas are complex and need explicit change framing |
+Use an explicit human checkpoint when the user requests one, requirements remain ambiguous, risk is material, or the next action is difficult to reverse. Otherwise, a documented self-review may satisfy approval.
-Reference (informational): [https://openspec.dev/](https://openspec.dev/)
----
+Record the chosen approval policy in the spec. Do not assume every environment supports interactive checkpoints.
 ## Spec Structure
-A practical spec includes:
-- `Why`: problem and urgency.
-- `What`: concrete deliverable.
-- `Constraints`: `MUST`, `MUST NOT`, out-of-scope boundaries.
-- `Current State`: relevant code context.
-- `Tasks`: small implementation units with verify steps.
-- `Validation`: full feature checks after all tasks.
-For existing features with behavior changes, use a delta format (`ADDED`, `MODIFIED`, `REMOVED`) instead of rewriting everything.
----
+A practical spec contains:
-## When to Use Delta vs Full Spec
+- `Why`: problem and value.
+- `What`: concrete, testable deliverable.
+- `Constraints`: enforceable `MUST`, `SHOULD`, `MAY`, and `MUST NOT` statements.
+- `Out of Scope`: explicit boundaries.
+- `Current State`: verified context and integration points.
+- `Approval Policy`: checkpoint or self-review rule.
+- `Tasks`: small units with scenarios, capabilities, files, verify steps, and done definitions.
+- `Progress`: task state and evidence.
+- `Validation`: end-to-end checks.
+- `Change Log`: requirement or scope changes that affect execution.
-| Use Full Spec | Use Delta Spec |
-| --- | --- |
-| New feature with no previous spec | Behavior change to an existing feature |
-| Greenfield implementation | Bug fix or requirement adjustment |
-| No requirement baseline exists | Existing requirements already exist |
+### Task Contract
-## Archive Behavior
+Each task MUST define:
-When archiving a delta spec, apply changes to the base specification:
+```markdown
+### T1: Title
-- `ADDED`: append new requirements.
-- `MODIFIED`: replace the previous requirement text.
-- `REMOVED`: delete the requirement and keep a short reason in commit history.
-Then move work from `.sdd/active/<feature-name>/` to `.sdd/archive/<feature-name>/`.
-## Task Design Principles
-- Keep task scope small: if a task touches more than 3 files or needs more than about 30 minutes, split it.
-- Keep verification fast: each task should be verifiable in 2 minutes or less.
-- Keep completion explicit: each task must have a one-sentence definition of done.
+**What:** Observable outcome.
+**Required Capabilities:** Semantic needs, or `none`.
+**Files:** Expected scope, or `discover` when not yet known.
+**Scenario:** GIVEN / WHEN / THEN, when behavior is involved.
+**Verify:** Project-discovered command or concrete manual check.
+**Done:** One sentence describing completion.
+```
-## Decision Tree: When to Break Tasks Smaller
+Capabilities describe needs such as test design, accessibility review, or security analysis. They MUST NOT require a particular skill, agent, vendor, or service. Resolve them against what is actually available at execution time.
-```text
-Does the task touch > 3 files?
-  YES -> split it
+## Full vs Delta Spec
-Will the task take > 30 minutes?
-  YES -> split it
+Use a full spec for new work without an existing requirement baseline. Use a delta spec for changes to established behavior.
-Can the task be verified in <= 2 minutes?
-  NO -> add a tighter verify step
+- `ADDED`: new requirement and scenarios.
+- `MODIFIED`: complete replacement requirement plus previous behavior reference.
+- `REMOVED`: removed requirement plus reason.
-Can "done" be described in one sentence?
-  NO -> task is too vague; split it
-```
+Before archiving a delta, reconcile it with the maintained requirement baseline when one exists. If no baseline exists, archive the delta as the historical record.
-## Common Mistakes to Avoid
+## Task Sizing
-### 1) Vague Constraints
+Prefer tasks that:
-```text
-WRONG:
-Must use best practices.
+- Have one observable outcome.
+- Touch a small, related file set.
+- Can be completed without hidden dependencies.
+- Have a fast, deterministic verify step.
+- Have a one-sentence definition of done.
-RIGHT:
-MUST use existing auth middleware.
-MUST NOT add new runtime dependencies.
-```
+Split a task when its concerns, dependencies, or verification cannot be explained independently. File counts and time estimates are heuristics, not universal gates.
-### 2) Oversized Tasks
+## Change Control
-```text
-WRONG:
-T1: Build the whole authentication feature.
+When requirements change during execution:
-RIGHT:
-T1: Add token verification middleware.
-T2: Add login endpoint behavior.
-T3: Add integration test for login flow.
-```
+1. Stop the affected task at a stable point.
+2. Record the change and reason in `Change Log`.
+3. Update affected constraints, scenarios, tasks, and validation.
+4. Mark invalidated evidence as superseded.
+5. Reapply the approval policy before continuing.
-### 3) Missing Verification
+Do not silently stretch a task to absorb new behavior.
-```text
-WRONG:
-Verify: It works.
+## Verification and Review
-RIGHT:
-Verify: npm test -- src/auth/login.test.ts
-```
+Verification checks behavior against the spec, not against implementation intent.
-### 4) Mixed Concerns in One Task
+- Run every task verify step using project-discovered commands.
+- Check every `MUST` and `MUST NOT` constraint explicitly.
+- Separate automated evidence, manual evidence, warnings, and blockers.
+- Perform a final structured review using an available review capability or the fallback checklist in `agents/verify.md`.
+- A missing optional integration is not a failure when the local fallback was completed.
-```text
-WRONG:
-T1: Create component and migrate all pages to it.
+## Archive Invariants
-RIGHT:
-T1: Create component.
-T2: Migrate page A.
-T3: Migrate page B.
-```
+Archive to `.sdd/archive/<YYYY-MM-DD>-<work-name>/` only when:
-Use the full preflight and pre-archive checks in [assets/validation-checklist.md](assets/validation-checklist.md).
+- All tasks are `DONE`.
+- Validation passes or approved manual checks are recorded.
+- No blocker is unresolved.
+- Constraint and final-review evidence is present.
+- Delta reconciliation is complete when applicable.
----
+Generate the ISO date from the current environment; do not assume a particular shell command or VCS.
 ## Modes
-Use these mode guides for role-focused execution:
-- Planning mode: [agents/plan.md](agents/plan.md)
-- Implementation mode: [agents/apply.md](agents/apply.md)
-- Verification mode: [agents/verify.md](agents/verify.md)
-- Orchestrator mode: [agents/orchestrate.md](agents/orchestrate.md)
----
+- Planning: [agents/plan.md](agents/plan.md)
+- Implementation: [agents/apply.md](agents/apply.md)
+- Verification: [agents/verify.md](agents/verify.md)
+- Orchestration: [agents/orchestrate.md](agents/orchestrate.md)
 ## Templates
-- Feature spec: [assets/spec-template.md](assets/spec-template.md)
+- Full spec: [assets/spec-template.md](assets/spec-template.md)
 - Delta spec: [assets/delta-spec-template.md](assets/delta-spec-template.md)
 - Design decisions: [assets/design-template.md](assets/design-template.md)
 - Validation checklist: [assets/validation-checklist.md](assets/validation-checklist.md)
----
-## Commands
-```bash
-mkdir -p .sdd/active/<feature-name>
-cp .skilly-hand/catalog/spec-driven-development/assets/spec-template.md .sdd/active/<feature-name>/spec.md
-cp .skilly-hand/catalog/spec-driven-development/assets/design-template.md .sdd/active/<feature-name>/design.md
-```
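
Reviewer sketch (not part of the package): the task-state rule and the archive invariants introduced in this SKILL.md revision can be illustrated in a few lines of Python. The names `Task`, `mark_done`, `archive_blockers`, and `archive_path` are hypothetical and do not appear in `skilly-hand`. The boolean arguments are assumptions that stand in for evidence a real spec would record in `spec.md`.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Task states listed in the revised SKILL.md.
VALID_STATES = ("TODO", "IN_PROGRESS", "BLOCKED", "DONE")


@dataclass
class Task:
    """Hypothetical in-memory view of one task row from spec.md."""
    task_id: str
    state: str = "TODO"
    evidence: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.state not in VALID_STATES:
            raise ValueError(f"unknown task state: {self.state}")


def mark_done(task: Task, verify_passed: bool, evidence: str) -> None:
    """Move a task to DONE only when verification passed and evidence is recorded."""
    if not verify_passed or not evidence.strip():
        raise ValueError(
            f"{task.task_id}: DONE requires a passing verify step and recorded evidence"
        )
    task.evidence.append(evidence)
    task.state = "DONE"


def archive_blockers(
    tasks: List[Task],
    validation_passed: bool,
    final_review_recorded: bool,
    delta_reconciled: bool = True,
) -> List[str]:
    """Return unmet archive invariants; an empty list means archiving is allowed."""
    problems: List[str] = []
    open_tasks = [t.task_id for t in tasks if t.state != "DONE"]
    if open_tasks:
        problems.append("tasks not DONE: " + ", ".join(open_tasks))
    if not validation_passed:
        problems.append("validation has not passed")
    if not final_review_recorded:
        problems.append("final-review evidence is missing")
    if not delta_reconciled:
        problems.append("delta reconciliation is incomplete")
    return problems


def archive_path(work_name: str, today: Optional[date] = None) -> str:
    """Build the dated archive directory; the date comes from the environment by default."""
    day = today or date.today()
    return f".sdd/archive/{day.isoformat()}-{work_name}/"
```

Returning a list of unmet invariants, rather than a single boolean, keeps the "smallest decision needed" visible when archiving is refused. This is a design choice of the sketch, not something the skill prescribes.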

package/catalog/skills/spec-driven-development/agents/apply.md CHANGED

@@ -2,25 +2,40 @@
 ## Purpose
-Implement approved spec tasks in small, verifiable increments.
+Implement approved tasks one at a time and keep the spec synchronized with execution evidence.
-## Inputs
+## Procedure
-- Spec name/path under `.sdd/active/`.
-- Task range or specific task IDs.
+1. Read `spec.md`, relevant design decisions, and repository instructions.
+2. Confirm the approval policy is satisfied and dependencies are `DONE`.
+3. Resolve required capabilities from available local skills, tools, or structured self-review.
+4. Mark the selected task `IN_PROGRESS`.
+5. Implement only the assigned outcome.
+6. When the task declares test-driven work, follow the available TDD guidance or the portable RED/GREEN/REFACTOR contract described in the task.
+7. Run the task verify step using commands discovered from the project.
+8. Record concise evidence: command or check, result, and relevant output summary.
+9. Mark the task `DONE` only after verification passes; otherwise mark it `BLOCKED` with the cause.
+10. Append a change-log entry only for requirement, scope, or design changes, not routine progress.
-## Procedure
+## Scope Change Rule
+If implementation reveals new behavior, conflicting constraints, or a wider blast radius, stop the affected task and return it to planning. Do not silently add work or rewrite acceptance criteria to match the implementation.
+## Evidence Contract
+Evidence must be reproducible and proportionate:
+```text
+Check: <command or manual procedure>
+Result: PASS | FAIL | NOT_RUN
+Summary: <short factual result>
+```
-1. Read `spec.md` and `design.md` (if present).
-2. Execute one task at a time.
-3. Keep changes scoped to the assigned task.
-4. Run the task verify step before marking done.
-5. Record progress against task IDs.
-6. Stop and report blockers when constraints conflict.
+Do not claim a command ran when it did not. Record unavailable or human-only checks explicitly.
 ## Quality Bar
-- Behavior matches task intent.
-- Verify steps pass before moving on.
-- No hidden scope creep.
-- Progress summary lists completed tasks and changed files.
+- Changes remain inside task scope.
+- Verification evidence supports the `DONE` state.
+- Optional integrations have a documented local fallback.
+- No automatic commit, issue update, or remote action is assumed.
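
Reviewer sketch (not part of the package): the Evidence Contract added to apply.md maps naturally onto a small record type. The class `Evidence` and the helper `next_task_state` are hypothetical names. Treating `NOT_RUN` the same as `FAIL` when choosing the next state is an assumption drawn from step 9 ("otherwise mark it `BLOCKED`"), not an explicit rule in the diff.

```python
from dataclasses import dataclass

# Result values named in the Evidence Contract.
RESULTS = ("PASS", "FAIL", "NOT_RUN")


@dataclass(frozen=True)
class Evidence:
    """One evidence entry: what was checked, the outcome, and a short factual summary."""
    check: str
    result: str
    summary: str

    def __post_init__(self) -> None:
        if self.result not in RESULTS:
            raise ValueError(f"result must be one of {RESULTS}, got {self.result!r}")

    def render(self) -> str:
        """Format the entry in the three-line layout shown in apply.md."""
        return f"Check: {self.check}\nResult: {self.result}\nSummary: {self.summary}"


def next_task_state(evidence: Evidence) -> str:
    """DONE only on PASS; anything else leaves the task BLOCKED (assumption for NOT_RUN)."""
    return "DONE" if evidence.result == "PASS" else "BLOCKED"
```

Rejecting unknown result values at construction time makes it harder to record a check as passed when it was never run, which is the failure the contract warns against.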

package/catalog/skills/spec-driven-development/agents/orchestrate.md CHANGED

@@ -2,24 +2,33 @@
 ## Purpose
-Coordinate planning, implementation, and verification through explicit checkpoints.
+Coordinate planning, implementation, verification, and archive while remaining usable without delegation or external integrations.
-## Inputs
+## Workflow
-- High-level request.
-- Context and constraints.
+1. PLAN: create or update the active spec.
+2. APPROVE: apply the spec's checkpoint or self-review policy.
+3. APPLY: execute one ready task at a time.
+4. VERIFY TASK: require evidence before selecting the next task.
+5. REPLAN: return changed requirements or blocked dependencies to planning.
+6. VERIFY FEATURE: run validation and the portable final review gate.
+7. ARCHIVE: enforce archive invariants and move the completed work once.
-## Workflow
+## Capability Resolution
+- Delegate a phase only when a suitable capability is available and delegation improves the outcome.
+- Otherwise execute the corresponding local mode directly.
+- Never invent resource names, identifiers, issue keys, commands, or service availability.
+- Remote writes, commits, and comments require their own discovered workflow and user authorization.
+## Checkpoint Policy
-1. PLAN: Produce or update the spec.
-2. REVIEW CHECKPOINT: Confirm the plan is approved.
-3. APPLY: Execute agreed task batch.
-4. VERIFY CHECKPOINT: Validate outputs against the spec and run the required final `review-rangers` gate.
-5. REPEAT: Continue by phase or task batch.
-6. ARCHIVE: Move completed work from `.sdd/active/` to `.sdd/archive/`.
+Pause for explicit user approval when required by the spec, risk, ambiguity, or an irreversible action. For low-risk autonomous work, record a self-review decision and continue.
 ## Coordination Rules
-- Pause at checkpoints before continuing.
-- Keep phase summaries short and decision-oriented.
-- Surface blockers with options, not ambiguity.
+- Keep `spec.md` authoritative.
+- Keep at most one task `IN_PROGRESS` unless independence is explicit.
+- Do not advance past failed verification.
+- Surface blockers with evidence and the smallest decision needed.
+- Do not declare completion while open tasks or unresolved blockers remain.
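
Reviewer sketch (not part of the package): the coordination rules above ("execute one ready task at a time", "at most one task `IN_PROGRESS`") suggest a simple selection step. The function `next_ready_task` and the `depends_on` mapping are hypothetical; the diff mentions dependencies but does not define how a spec encodes them.

```python
from typing import Dict, List, Optional


def next_ready_task(
    states: Dict[str, str],
    depends_on: Dict[str, List[str]],
) -> Optional[str]:
    """Return the ID of the next task to start, or None when nothing should start."""
    # Default rule: do not start a second task while one is IN_PROGRESS.
    if any(state == "IN_PROGRESS" for state in states.values()):
        return None
    # Walk tasks in spec order and pick the first TODO whose dependencies are all DONE.
    for task_id, state in states.items():
        if state != "TODO":
            continue
        deps = depends_on.get(task_id, [])
        if all(states.get(dep) == "DONE" for dep in deps):
            return task_id
    return None
```

The sketch ignores the "unless independence is explicit" exception. Supporting parallel tasks would need an extra, explicitly recorded flag per task.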

package/catalog/skills/spec-driven-development/agents/plan.md CHANGED

@@ -2,26 +2,28 @@
 ## Purpose
-Turn a request into an executable, reviewable spec before implementation starts.
-## Inputs
-- Goal or problem statement.
-- Known constraints.
-- Current state references (files, systems, behaviors).
+Turn a request into an executable spec without assuming a framework, toolchain, or external service.
 ## Procedure
-1. Clarify scope and success criteria.
-2. Decide full spec vs delta spec.
-3. Fill spec sections: Why, What, Constraints, Current State, Tasks, Validation.
-4. Break tasks into small units with explicit verify commands.
-5. Add `design.md` if architecture decisions or trade-offs are non-obvious.
-6. Return the spec summary for review and approval.
+1. Inspect the repository for applicable instructions, conventions, commands, and existing behavior.
+2. Clarify only missing information that cannot be discovered safely.
+3. Choose a full or delta spec.
+4. Define scope, constraints, non-goals, approval policy, and validation.
+5. Create tasks using the task contract from the parent skill.
+6. Add behavioral scenarios where outcomes can be observed.
+7. Describe required capabilities semantically and use `none` when no special capability is needed.
+8. Initialize one progress row per task with `TODO`.
+9. Self-review with the validation checklist and apply the approval policy.
 ## Quality Bar
-- Constraints are enforceable, not vague.
-- Every task has files and a verify step.
-- Out-of-scope items are explicit.
-- Another engineer can execute without guessing intent.
+- Another implementer can execute without guessing intent.
+- No task requires a named agent, skill, vendor, framework, or command that has not been discovered locally.
+- Tasks, progress rows, and validation are mutually consistent.
+- Risks and meaningful alternatives live in `design.md`; routine choices stay in the spec.
+- `spec.md` is the only task source of truth.
+## Blockers
+Mark planning `BLOCKED` only when a required decision or fact cannot be discovered and a reasonable assumption would create material risk. Record the missing input and its impact.
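
Reviewer sketch (not part of the package): step 8 ("one progress row per task with `TODO`") and the quality bar ("tasks, progress rows, and validation are mutually consistent") can be checked mechanically. Both function names are hypothetical, and representing progress as a plain mapping from task ID to state is an assumption about how a spec might be parsed.

```python
from typing import Dict, List


def init_progress(task_ids: List[str]) -> Dict[str, str]:
    """Create one progress row per task, all starting at TODO."""
    return {task_id: "TODO" for task_id in task_ids}


def progress_mismatches(task_ids: List[str], progress: Dict[str, str]) -> List[str]:
    """List tasks without a progress row and progress rows without a task."""
    issues: List[str] = []
    for task_id in task_ids:
        if task_id not in progress:
            issues.append(f"{task_id}: missing progress row")
    for task_id in progress:
        if task_id not in task_ids:
            issues.append(f"{task_id}: progress row without a task")
    return issues
```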

package/catalog/skills/spec-driven-development/agents/verify.md CHANGED

@@ -2,31 +2,52 @@
 ## Purpose
-Validate that implementation matches the approved spec and passes quality checks.
+Validate implementation against the approved spec using reproducible evidence and a portable final review.
-## Inputs
+## Procedure
-- Spec path.
-- Implementation result from apply mode.
+1. Read the complete spec and relevant design decisions.
+2. Confirm every task has a terminal state and `DONE` tasks have evidence.
+3. Run or inspect each task verify step.
+4. Run feature-level validation discovered from the repository and listed in the spec.
+5. Evaluate every `MUST` and `MUST NOT` constraint independently.
+6. Identify manual checks that still require human confirmation.
+7. Perform the final review gate below.
+8. Report blockers, warnings, and evidence separately.
-## Procedure
+## Portable Final Review Gate
+Use an installed review capability when one is available and applicable. Otherwise perform this local review:
+- Correctness: implementation satisfies every scenario and constraint.
+- Regression risk: affected existing behavior has appropriate checks.
+- Scope: no unapproved behavior or dependency was introduced.
+- Maintainability: changes follow discovered project conventions.
+- Safety: security, privacy, accessibility, data, and destructive-operation risks were considered when relevant.
+- Evidence: results are reproducible and unsupported claims are absent.
+Any unresolved correctness, constraint, or safety blocker fails verification. Missing optional tooling does not fail verification when this fallback is completed.
+## Report Contract
+```markdown
+## Verification Report
-1. Read acceptance intent from `spec.md` and `design.md` (if present).
-2. Run task-level verification evidence checks.
-3. Run feature-level validation commands.
-4. Confirm constraints (`MUST`, `MUST NOT`) were respected.
-5. Run a final structured `review-rangers` pass over the full change set.
-6. Report pass/fail per area with concrete evidence.
+### Task Evidence
+| Task | Check | Result | Evidence |
-### Required Final Gate (`review-rangers`)
+### Constraints
+| Constraint | Result | Evidence |
-- Validate selected agent targets vs actual instruction files/symlinks written.
-- Validate stale managed target cleanup after re-install/reselection.
-- Validate backup and restore safety (including uninstall restore behavior).
-- Any unresolved `review-rangers` blocker keeps verification in failed state.
+### Final Review
+- Result: PASS | FAIL
+- Blockers: none | list
+- Warnings: none | list
+- Manual checks: none | list
+```
 ## Quality Bar
-- End-to-end validation is explicit.
-- Gaps are tied to exact tasks or constraints.
-- Report separates blockers from non-blocking warnings.
+- Results distinguish `PASS`, `FAIL`, and `NOT_RUN`.
+- Manual validation is never reported as automated evidence.
+- Archive readiness is an explicit conclusion, not an implication.
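
Reviewer sketch (not part of the package): the Final Review section of the Report Contract can be generated from three lists. The function names are hypothetical. The rule that any unresolved blocker yields `FAIL` follows the new verify.md text; joining list items with commas is a formatting assumption, since the template only says `none | list`.

```python
from typing import List


def final_review_result(blockers: List[str]) -> str:
    """Any unresolved blocker fails verification; otherwise the review passes."""
    return "FAIL" if blockers else "PASS"


def render_final_review(
    blockers: List[str],
    warnings: List[str],
    manual_checks: List[str],
) -> str:
    """Render the Final Review section, keeping blockers, warnings, and manual checks separate."""

    def fmt(items: List[str]) -> str:
        return ", ".join(items) if items else "none"

    return "\n".join(
        [
            "### Final Review",
            f"- Result: {final_review_result(blockers)}",
            f"- Blockers: {fmt(blockers)}",
            f"- Warnings: {fmt(warnings)}",
            f"- Manual checks: {fmt(manual_checks)}",
        ]
    )
```

Warnings and pending manual checks do not change the result in this sketch. Whether outstanding manual checks should also block archiving is left to the spec's approval policy.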