npm - @static-var/keystone - Versions diffs - 0.1.0 - Mend

@static-var/keystone 0.1.0

Files changed (30)

package/.agents/plugins/marketplace.json +24 -0
package/.claude-plugin/marketplace.json +24 -0
package/.claude-plugin/plugin.json +12 -0
package/.codex-plugin/plugin.json +12 -0
package/.pi/extensions/keystone.ts +172 -0
package/HOW_IT_WORKS.md +424 -0
package/Makefile +19 -0
package/README.md +253 -0
package/package.json +86 -0
package/packaging.allowlist +32 -0
package/scripts/build-metadata.py +99 -0
package/scripts/package-keystone.sh +59 -0
package/scripts/validate-keystone.py +261 -0
package/scripts/validate-package.py +140 -0
package/skills/keystone/SKILL.md +69 -0
package/skills/keystone/modules/breakdown.md +239 -0
package/skills/keystone/modules/build.md +284 -0
package/skills/keystone/modules/debug.md +198 -0
package/skills/keystone/modules/gates/isolation.md +56 -0
package/skills/keystone/modules/gates/proof.md +54 -0
package/skills/keystone/modules/gates/red.md +59 -0
package/skills/keystone/modules/gates/review.md +56 -0
package/skills/keystone/modules/gates/ship.md +57 -0
package/skills/keystone/modules/health.md +124 -0
package/skills/keystone/modules/helpers/subagents.md +134 -0
package/skills/keystone/modules/research.md +86 -0
package/skills/keystone/modules/review.md +270 -0
package/skills/keystone/modules/router.md +36 -0
package/skills/keystone/modules/shape.md +125 -0
package/skills/keystone/modules/ship.md +130 -0

package/skills/keystone/modules/breakdown.md ADDED

@@ -0,0 +1,239 @@
+# Keystone Breakdown Module
+## Core principle
+Breakdown turns a goal into sequenced, reviewable vertical slices of work.
+A good breakdown makes implementation easier because every slice has a visible result, clear constraints, and a verification path. It makes review easier because reviewers can compare changes against stated goals, requirements, risks, and acceptance checks.
+Breakdown is planning, not execution. A plan is not proof; only inspected changes, tests, demos, or other evidence prove completion.
+## Load when
+Use this module when the user asks to:
+- break down a feature, fix, refactor, migration, tool, system, or project
+- produce milestones, implementation steps, tickets, issues, vertical slices, or phases
+- decide sequencing, dependencies, iterations, scope cuts, or parallelization
+- turn an ambiguous goal into implementable work
+- prepare work for coding agents, reviewers, or subagents
+- plan greenfield architecture before implementation
+- split a large task into reviewable chunks without exposing `/plan`
+Also load when implementation is requested but the goal is broad enough that coding immediately would hide major product, architecture, or sequencing decisions.
+## Not for
+Do not use Breakdown for:
+- implementation, file edits, refactors, migrations, or generated code
+- debugging a known failure; use `debug` first, then return to Breakdown if a repair plan is needed
+- research-only tasks; use `research` first when facts are missing
+- copy/design shaping as the primary work; use `shape` first when output is prose, UX, or design direction
+- final verification, release readiness, or completion claims; use `review`, `health`, or `ship`
+- renaming this module to `plan`, creating `/plan`, or exposing the internal planner name as a public command
+## Outcome contract
+A Breakdown output must include:
+1. Goal: the intended outcome in one or two sentences.
+2. Context inspected: files, docs, requirements, or assumptions used.
+3. Requirements inventory: critical requirements, non-functional requirements, good-to-haves, constraints, and open questions.
+4. Stack and architecture context: current or proposed technologies, boundaries, integrations, and limitations.
+5. Iteration layering: iteration 1, iteration 2, iteration 3, etc., with explicit scope cuts.
+6. Vertical slices: ordered work items that each deliver an end-to-end outcome.
+7. Verification gates: how each slice can be tested, reviewed, or demonstrated.
+8. Risks and dependencies: what can block, invalidate, or reorder the work.
+9. Handoff: recommended next primary module and any subagent/reasoning suggestions.
+If information is missing, state the assumption or ask the smallest set of questions required to avoid a bad plan.
+## Modes
+### Feature breakdown
+Use for new capabilities in an existing product. Focus on the user/operator goal, existing entry points, data paths, APIs, UI surfaces, tests, deployment constraints, and the smallest end-to-end slice that proves the feature path. Add progressive enrichment after the core loop works. Avoid horizontal buckets like "database", "backend", "frontend", and "tests" unless they are nested inside a vertical slice.
+### Greenfield architecture breakdown
+Use for new projects, tools, services, apps, packages, or substantial standalone systems. Start architecture-first: name the primary users, runtime, language, framework, hosting, storage, auth, observability, CI, packaging constraints, system boundaries, and integration points. Iteration 1 should prove the architecture can run, test, deploy, and support one meaningful vertical path, not merely create folders.
+### Refactor/migration breakdown
+Use when behavior should remain stable while internals change. Focus on the current behavior contract, compatibility expectations, affected surfaces, consumers, migration seams, adapters, flags, dual-run paths, rollback strategy, observability, and characterization tests. Prefer strangler-style or seam-first slices over broad rewrites.
+### Subagent-parallel breakdown
+Use when independent workstreams can proceed safely in isolated workspaces. Build the dependency graph before delegation. Name shared files, interfaces, merge-risk hotspots, subagent roles, reasoning levels, context packets, expected artifacts, integration order, and review checkpoints. Only parallelize slices that can be verified independently or integrated behind a clear contract.
+## Process
+1. Identify the goal.
+   - Restate the desired end state, not just the requested activity.
+   - Identify who benefits and what observable change proves value.
+   - If the goal is unclear and cannot be inferred from context, ask one focused question.
+2. Inspect before asking broad questions.
+   - Read relevant files, docs, issues, architecture notes, tests, package manifests, and existing modules when available.
+   - Use `research` for unfamiliar libraries, external constraints, or repository-wide discovery.
+   - Ask clarifying questions only when inspection cannot resolve a decision that materially changes the breakdown.
+3. Build the requirements inventory.
+   - Separate critical requirements from non-functional requirements and good-to-haves.
+   - Capture explicit user constraints, protected files, deadlines, compatibility needs, and review expectations.
+   - Mark assumptions and unknowns instead of silently inventing requirements.
+4. Identify stack and constraints.
+   - Note languages, frameworks, package managers, deployment targets, test tooling, persistence, auth, APIs, and platform limits.
+   - For greenfield work, propose architecture choices only after naming tradeoffs and constraints.
+   - For existing systems, prefer the established stack unless the goal requires a change.
+5. Choose iteration layers.
+   - Define iteration 1 as the smallest coherent outcome that proves the path.
+   - Define later iterations as progressively richer outcomes, not random leftovers.
+   - Make explicit what is intentionally deferred.
+6. Slice vertically.
+   - Each slice should cross needed layers to produce a reviewable result.
+   - Include data/model/API/UI/test/docs work inside the slice when needed for that result.
+   - Avoid phase plans that finish entire subsystems before any end-to-end value appears.
+7. Add verification gates.
+   - Give each slice an acceptance check, test command, manual review path, or demo criterion.
+   - Include regression checks for migrations and refactors.
+   - State when review should happen and what reviewers should inspect.
+8. Prepare the handoff.
+   - Recommend the next Keystone module.
+   - Identify subagent opportunities, required context, and reasoning level.
+   - Call out risks, dependencies, and open questions that should block implementation if unresolved.
+## Requirements inventory
+Use this inventory before sequencing work:
+- Goal: what outcome the user wants.
+- Critical requirements: must be true or the work fails.
+- Non-functional requirements: performance, security, reliability, accessibility, privacy, maintainability, observability, compatibility, cost, release, and operational concerns.
+- Good-to-haves: valuable but deferrable enhancements.
+- Constraints: protected files, APIs, tech stack, deadlines, repo conventions, deployment targets, team/process limits.
+- Current state: what exists now, with file/source references when available.
+- Unknowns: decisions or facts still unresolved.
+- Assumptions: temporary beliefs used to proceed.
+When requirements conflict, surface the conflict before writing slices.
+## Iteration layering
+Iteration layers should describe increasing confidence and capability:
+- Iteration 1: skeleton or core path. One thin, end-to-end result that validates the architecture, integration point, or user journey.
+- Iteration 2: completeness and resilience. Add common cases, validation, error states, compatibility, and tests around the proven path.
+- Iteration 3: polish and scale. Add edge cases, performance, accessibility, observability, docs, migration cleanup, and nice-to-haves.
+- Later iterations: optional expansion, hardening, automation, or product refinements.
+For greenfield projects, iteration 1 must include a runnable or executable foundation plus one meaningful vertical path. For migrations, iteration 1 should establish safety: characterization checks, seams, adapters, or observability before broad movement.
+## Task quality bar
+Each task or slice must have:
+- a name that describes the delivered outcome
+- user/operator/developer value
+- inputs and dependencies
+- files or areas likely involved, when known
+- exact acceptance criteria
+- verification method
+- rollback or safety note when risk is non-trivial
+- review focus: what a reviewer should inspect
+A weak task says "build backend" or "add tests". A strong slice says "Persist saved searches end-to-end behind the existing search UI, with API validation, storage migration, and regression coverage for loading saved searches."
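As a rough illustration of the task quality bar above, the checklist could be encoded as a type plus a gap finder. This is a sketch only: the `Slice` shape, the `risky` flag, `qualityGaps`, and the sample field values are hypothetical and are not part of the Keystone package.

```typescript
// Hypothetical shape for one slice; field names mirror the quality bar, not a Keystone API.
interface Slice {
  name: string;
  value: string;
  dependencies: string[];
  areas: string[]; // may stay empty when the affected files are not yet known
  acceptance: string[];
  verification: string;
  risky: boolean; // assumption: the planner flags non-trivial risk explicitly
  safetyNote?: string;
  reviewFocus: string;
}

// Returns the quality-bar items that a slice is still missing.
function qualityGaps(slice: Slice): string[] {
  const gaps: string[] = [];
  if (slice.name.trim() === "") gaps.push("name");
  if (slice.value.trim() === "") gaps.push("value");
  if (slice.acceptance.length === 0) gaps.push("acceptance criteria");
  if (slice.verification.trim() === "") gaps.push("verification method");
  if (slice.risky && !slice.safetyNote) gaps.push("rollback or safety note");
  if (slice.reviewFocus.trim() === "") gaps.push("review focus");
  return gaps;
}

// The weak task from the text: a name and nothing else.
const weak: Slice = {
  name: "build backend",
  value: "",
  dependencies: [],
  areas: [],
  acceptance: [],
  verification: "",
  risky: true,
  reviewFocus: "",
};

// The strong slice from the text, with illustrative (invented) details filled in.
const strong: Slice = {
  name: "Persist saved searches end-to-end behind the existing search UI",
  value: "Users can reload searches they saved earlier",
  dependencies: [],
  areas: ["search UI", "search API", "storage migration"],
  acceptance: ["A saved search reloads with the same filters"],
  verification: "Regression test for loading saved searches",
  risky: true,
  safetyNote: "Storage migration can be rolled back",
  reviewFocus: "API validation and migration safety",
};

const weakGaps = qualityGaps(weak); // five gaps: everything except the name
const strongGaps = qualityGaps(strong); // no gaps
```

A check like this only catches missing fields; whether a name really describes a delivered outcome still needs human or reviewer judgment.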
+## Subagents and reasoning
+Default reasoning: `high`.
+Use subagents when planning benefits from independent context gathering, critique, or parallel workstream design:
+- researcher subagents for separate code areas, external APIs, or prior art
+- architecture reviewer subagents for greenfield foundations and migrations
+- risk reviewer subagents for security, data loss, compatibility, or release plans
+- worker subagents only after slices are independent and interfaces are stable
+For each proposed subagent, specify role, reasoning level, context packet, expected output artifact, files or areas off limits, and integration/review checkpoint. Do not use subagents to bypass ambiguity. Resolve shared interfaces and sequencing first.
+## Hard rules
+- No implementation under Breakdown.
+- No file mutation unless the user explicitly requested a planning artifact and the artifact itself is in scope.
+- Do not rename `breakdown` to `plan`.
+- Do not expose `/plan`.
+- Do not claim the plan proves completion.
+- Do not skip goal identification.
+- Do not ask broad clarifying questions before inspecting available context.
+- Do not produce horizontal-only plans.
+- Do not hide assumptions, unresolved questions, or conflicts.
+- Do not route risky plans directly to `build` without verification gates and review points.
+## Failure modes
+- Activity plan: lists actions but never states the outcome.
+- Horizontal buckets: separates backend/frontend/tests so no slice is independently valuable.
+- Big-bang architecture: designs everything before proving one runnable path.
+- Faux certainty: treats assumptions as facts.
+- Question spam: asks what inspection could answer.
+- Implementation leak: starts coding, editing files, or choosing exact code structure beyond planning needs.
+- Review-hostile output: lacks acceptance criteria, test commands, or reviewer focus.
+- Parallelism theater: delegates coupled workstreams that collide on shared files or undefined interfaces.
+- Plan-as-proof: reports success because a plan exists.
+## Output format
+Use this structure unless the user requested a different planning artifact:
+```markdown
+# Breakdown: <goal>
+## Goal
+<one or two sentences describing the desired outcome>
+## Context inspected
+- <files, docs, issues, sources, or "none available">
+## Requirements inventory
+### Critical requirements
+- <must-have>
+### Non-functional requirements
+- <quality/operational constraint>
+### Good-to-haves
+- <deferrable enhancement>
+### Stack and architecture context
+- Stack: <technologies, frameworks, platforms, deployment targets>
+- Boundaries: <modules, layers, services, UI/data/domain seams>
+- Integrations: <APIs, persistence, queues, auth, payment, external systems>
+- Limitations and constraints: <protected files, compatibility, process, deadline, cost, operational limits>
+### Unknowns and assumptions
+- Unknown: <question that matters>
+- Assumption: <assumption used for this breakdown>
+## Iteration layering
+### Iteration 1: <core path / architecture skeleton / safety seam>
+- Outcome: <reviewable result>
+- Scope: <included>
+- Deferred: <not included>
+### Iteration 2: <complete common cases>
+- Outcome: <reviewable result>
+- Scope: <included>
+- Deferred: <not included>
+### Iteration 3: <hardening / polish / scale>
+- Outcome: <reviewable result>
+- Scope: <included>
+- Deferred: <not included>
+## Vertical slices
+1. <slice name>
+   - Value: <who benefits and how>
+   - Work: <end-to-end changes at a planning level>
+   - Dependencies: <prior slices or decisions>
+   - Acceptance: <observable completion criteria>
+   - Verification: <test/review/demo method>
+   - Review focus: <what reviewers should inspect>
+## Risks and dependencies
+- <risk, impact, mitigation>
+## Subagent opportunities
+- <role, reasoning, context packet, expected artifact>
+## Handoff
+Next module: `<research|build|debug|review|health|ship|shape>` because <reason>.
+```
+Keep the output detailed enough to guide implementation and review, but short enough that each slice remains actionable.

package/skills/keystone/modules/build.md ADDED

@@ -0,0 +1,284 @@
+# Keystone Build Module
+## Core principle
+Build changes through evidence, not vibes: isolate first, specify the next observable behavior, prove the test can fail, make the smallest correct change, then refactor without changing behavior.
+Build is the mutation module. It may edit scoped project artifacts after the isolation gate passes, but it does not decide that work is shipped. Completion means "implemented with proof and ready for review/ship," not finalized.
+## Load when
+Use Build when the user asks to:
+- implement, create, edit, or wire behavior
+- add a feature, screen, endpoint, command, migration, or integration
+- refactor existing code while preserving behavior
+- change architecture, module boundaries, interfaces, or contracts
+- apply an approved plan from `breakdown`
+- delegate independent implementation work to subagents/workers
+- make focused content or configuration changes that require mutation
+## Not for
+Do not use Build for:
+- routing unclear work: use `router`
+- researching unknowns without mutation: use `research`
+- shaping requirements or acceptance criteria: use `shape`
+- decomposing large work before implementation: use `breakdown`
+- diagnosing a failure whose cause is unknown: use `debug`
+- reviewing completed changes: use `review`
+- releasing, merging, publishing, or finalizing: use `ship`
+## Outcome contract
+Before Build exits, it must be able to report:
+- isolation was checked before the first mutation via `gates/isolation.md`
+- the exact user scope and protected files were respected
+- the intended behavior or refactor invariant is stated plainly
+- tests, examples, or checks prove the change, or gaps are explicitly disclosed
+- red-capable tests were used for behavior changes whenever practical
+- delegated work, if any, was verified by the parent before acceptance
+Build must not claim work is done because code "looks right." Proof is required before completion claims.
+## Modes
+### TDD feature build
+Use when adding or changing observable behavior.
+Contract:
+- define one behavior slice at a time
+- write or identify the smallest test/check that should fail before the change
+- run it and confirm the failure is meaningful, not caused by setup noise
+- implement the smallest code that makes it pass
+- run the focused check again
+- run relevant regression checks
+- refactor only while checks stay green
+Prefer tests that exercise real behavior over tests that only verify mocks, implementation details, or snapshots. If a failing test cannot be created, state why and use the strongest available proof.
+TDD exceptions are rare but real. When a red-capable automated test is impractical, state the reason before editing and write an alternative proof plan. Acceptable cases include documentation-only edits, generated files, one-off migrations where rollback is the proof, external systems unavailable in the environment, exploratory spikes that will be thrown away, or emergency config changes. The alternative proof plan must name the observable check, manual verification, diff review, sample input/output, dry run, or rollback validation that will replace red/green.
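The red step in the contract above hinges on telling a meaningful failure from setup noise. A minimal sketch of that decision, assuming a hypothetical `CheckOutcome` record summarizing one run of the focused check (none of these names come from Keystone or from any test framework):

```typescript
// Hypothetical summary of one run of a focused check.
type CheckOutcome =
  | { status: "passed" }
  | { status: "failed"; kind: "assertion" | "setup"; message: string };

type RedVerdict = "meaningful-red" | "setup-noise" | "not-red";

// Classifies a pre-implementation run of the check.
function classifyRed(outcome: CheckOutcome): RedVerdict {
  if (outcome.status === "passed") {
    // The check cannot fail yet, so it proves nothing about the change.
    return "not-red";
  }
  // Only an assertion about the target behavior counts as a real red.
  return outcome.kind === "assertion" ? "meaningful-red" : "setup-noise";
}

// Implementation should start only after a meaningful red.
function mayImplement(outcome: CheckOutcome): boolean {
  return classifyRed(outcome) === "meaningful-red";
}

const missingBehavior: CheckOutcome = {
  status: "failed",
  kind: "assertion",
  message: "expected saved search to be listed",
};
const brokenFixture: CheckOutcome = {
  status: "failed",
  kind: "setup",
  message: "cannot connect to test database",
};

const verdicts = [
  classifyRed(missingBehavior), // "meaningful-red"
  classifyRed(brokenFixture), // "setup-noise"
  classifyRed({ status: "passed" }), // "not-red"
];
```

In practice the `kind` distinction usually comes from reading the failure message rather than from a structured field; the sketch just makes the decision explicit.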
+### Refactor
+Use when preserving behavior while improving structure, names, duplication, readability, or boundaries.
+Contract:
+- name the behavior that must not change
+- find existing tests or add characterization tests before edits when coverage is weak
+- make small mechanical changes first
+- keep public contracts stable unless the user requested a contract change
+- run regression checks before and after meaningful refactor steps
+- stop if behavior questions appear; route back to `shape` or `debug` as needed
+A characterization test captures what the current system does before you change structure. It is not a claim that current behavior is ideal; it is a tripwire that prevents accidental behavior changes while refactoring. Write it around externally visible behavior, important edge cases, or bug-compatible outputs that must stay stable until the user approves a behavior change.
+Concise example: before extracting invoice total formatting, add a test that `renderInvoiceSummary({ subtotal: 1000, discount: 125, currency: "USD" })` still returns `"Subtotal $10.00 · Discount $1.25 · Total $8.75"`; then refactor behind that externally visible output.
+Refactoring is not a license to redesign everything nearby.
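The invoice example above could be pinned down roughly as follows. The body of `renderInvoiceSummary` is a hypothetical stand-in (the source names the function and its expected output but does not show an implementation), and the assumption that amounts are given in cents is inferred from that expected output.

```typescript
// Hypothetical input shape; amounts are assumed to be in minor units (cents).
interface InvoiceInput {
  subtotal: number;
  discount: number;
  currency: "USD";
}

// Stand-in formatter for this sketch; only USD is handled.
function formatUsd(cents: number): string {
  return "$" + (cents / 100).toFixed(2);
}

// Stand-in for the pre-refactor function whose output must stay stable.
function renderInvoiceSummary(input: InvoiceInput): string {
  const total = input.subtotal - input.discount;
  return [
    "Subtotal " + formatUsd(input.subtotal),
    "Discount " + formatUsd(input.discount),
    "Total " + formatUsd(total),
  ].join(" · ");
}

// Characterization check: records what the system does today, not what it should ideally do.
function characterizeInvoiceSummary(): void {
  const actual = renderInvoiceSummary({ subtotal: 1000, discount: 125, currency: "USD" });
  const expected = "Subtotal $10.00 · Discount $1.25 · Total $8.75";
  if (actual !== expected) {
    throw new Error(`invoice summary changed: got "${actual}"`);
  }
}

// Run before and after each refactor step, e.g. after extracting formatUsd.
characterizeInvoiceSummary();
```

Because the check asserts only on the returned string, total formatting can be extracted or restructured freely as long as the visible output stays identical.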
+### Architecture-sensitive build
+Use when the change affects boundaries, state management, cross-module dependencies, platform conventions, or long-lived maintainability.
+Contract:
+- identify the domain shape before choosing architecture
+- choose the lightest architecture that fits the problem
+- define contracts/interfaces where they reduce coupling or clarify ownership
+- keep domain rules separate from transport, UI, persistence, and framework glue
+- use patterns only when they remove real pressure, not to decorate simple code
+- validate that the result is pleasant for developers to read, understand, maintain, and look at
+SOLID is a pressure-test, not dogma:
+- **SRP:** Can this unit change for one clear reason, or are unrelated policies bundled together?
+- **OCP:** Can the next known variation be added without editing fragile existing logic, or is a simpler edit safer for now?
+- **LSP:** Can substitutes honor the same contract without surprising callers?
+- **ISP:** Are callers forced to depend on methods, fields, events, or permissions they do not use?
+- **DIP:** Do high-level policies depend on stable abstractions at real boundaries, or are abstractions hiding one local call?
+Examples of appropriate taste:
+- mobile UI may use MVVM/MVI when state, events, and rendering need separation
+- clean architecture may fit when domain rules must survive UI, database, or API changes
+- SOLID and design patterns are justified only when they relieve current pressure
+### Delegated/parallel build
+Use when two or more implementation slices are independent enough to proceed without shared mutable state or ambiguous ownership.
+Contract:
+- split by outcome, not by vague activity
+- pass each worker a narrow scope, files or directories, constraints, and completion criteria
+- define contracts/interfaces before parallel work begins when slices must meet
+- state what must not be edited
+- require each worker to report files changed, checks run, and risks
+- verify delegated work yourself before integrating or claiming completion
+- reconcile overlaps deliberately; do not let workers race on the same files
+Do not delegate fuzzy architecture judgment without a concrete interface or decision boundary.
+Concise subagent brief template:
+```markdown
+Goal: one observable outcome
+Scope: allowed files/directories
+Do not edit: protected files/behaviors
+Contract: interfaces, invariants, data shape, or acceptance criteria
+Proof: required tests/checks/manual verification
+Report: files changed, verification output, risks/gaps
+```
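Before handing out briefs like the template above, the parent can check that worker scopes do not collide. A minimal sketch, assuming a hypothetical `WorkerBrief` shape and invented file paths; it compares scope entries as exact strings only, so a directory that contains another worker's file would not be detected.

```typescript
// Hypothetical worker brief; fields loosely follow the brief template.
interface WorkerBrief {
  worker: string;
  goal: string;
  scope: string[]; // allowed files or directories
}

interface ScopeOverlap {
  path: string;
  workers: string[];
}

// Finds scope entries that are assigned to more than one worker.
function findScopeOverlaps(briefs: WorkerBrief[]): ScopeOverlap[] {
  const owners: Record<string, string[]> = {};
  for (const brief of briefs) {
    for (const path of brief.scope) {
      const workers = owners[path] ?? [];
      workers.push(brief.worker);
      owners[path] = workers;
    }
  }
  return Object.keys(owners)
    .map((path) => ({ path, workers: owners[path] ?? [] }))
    .filter((entry) => entry.workers.length > 1);
}

// Illustrative briefs with invented paths.
const briefs: WorkerBrief[] = [
  {
    worker: "worker-a",
    goal: "Validate saved-search API input",
    scope: ["src/search/api.ts", "src/search/store.ts"],
  },
  {
    worker: "worker-b",
    goal: "Show saved searches in the search UI",
    scope: ["src/search/store.ts", "src/search/ui.tsx"],
  },
];

// One overlap: src/search/store.ts is claimed by both workers.
const overlaps = findScopeOverlaps(briefs);
```

Any reported overlap is a cue to assign explicit ownership or to sequence the slices instead of running them in parallel.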
+## Process
+1. Confirm scope.
+   - Restate the requested mutation in one sentence.
+   - Identify protected files and out-of-scope behavior.
+   - If scope is unsafe or ambiguous, ask one focused question or route to `shape`.
+2. Pass isolation before mutation.
+   - Load/check `gates/isolation.md`.
+   - Know the workspace, branch/worktree state, and dirty files.
+   - Stop if unrelated changes could be overwritten.
+3. Choose the mode.
+4. Define the next outcome.
+   - Write the smallest observable behavior, invariant, or contract.
+   - Avoid "make it better" as an implementation target.
+5. Establish proof before code.
+   - For behavior changes, create or identify a red-capable test/check.
+   - Run the focused check and confirm that it fails for the intended reason, so that it will pass only once the intended change is made.
+   - For refactors, establish characterization or regression coverage.
+   - If proof is impossible in the environment, record the limitation before editing and use an alternative proof plan.
+6. Implement the smallest correct slice.
+   - Edit only files in scope.
+   - Keep changes narrow and reversible.
+   - Prefer clear names and direct control flow.
+   - Do not invent broad architecture to satisfy a small behavior.
+7. Green.
+   - Run the focused test/check.
+   - If it fails unexpectedly, use `debug`; do not stack guesses.
+8. Refactor.
+   - Remove duplication introduced by the slice.
+   - Improve readability without broadening behavior.
+   - Keep tests green after cleanup.
+9. Regression check.
+   - Run the most focused relevant suite available.
+   - Never replace verification with a summary.
+10. Early smell check.
+    - Stop and simplify if the diff introduces god functions, vague `manager`/`helper` names, hidden control flow, stringly APIs, layer violations, or speculative interfaces.
+11. Architecture pressure-test.
+    - For architecture-sensitive changes, answer the SOLID questions and remove abstractions that do not survive them.
+12. Handoff.
+    - Summarize changed files and behavior.
+    - Include commands run and results.
+    - Disclose unverified areas.
+    - Recommend `review` or `ship` as the next module when appropriate.
+## Architecture taste checklist
+Use this checklist before accepting the design:
+- Does the architecture match the domain complexity rather than the agent's desire to sound senior?
+- Is each module/class/function responsible for one understandable thing?
+- Are domain concepts named in the user's language?
+- Are interfaces/contracts placed at real boundaries, not between every pair of functions?
+- Is state ownership explicit?
+- Are side effects isolated enough to test meaningful behavior?
+- Would a developer know where to add the next similar behavior?
+- Is the simplest path also readable, or has simplicity become cleverness?
+- Is duplication removed only after the repeated concept is real?
+- Are patterns such as Strategy, Repository, Adapter, Observer, MVVM, MVI, or Clean Architecture justified by current pressure?
+- Does the diff pass the SOLID pressure-test questions without adding dogmatic ceremony?
+- Does the diff avoid god functions, vague managers/helpers, hidden control flow, stringly APIs, layer violations, and speculative interfaces?
+## Subagents and reasoning
+Default reasoning: `medium`.
+Escalate to `high` when:
+- architecture boundaries are being changed
+- tests are failing for unclear reasons
+- multiple agents must coordinate through contracts
+- data loss, security, billing, release, or migration risk exists
+- the user asks for broad refactoring or platform-specific architecture judgment
+Use subagents/workers when:
+- slices can be verified independently
+- files do not overlap, or ownership is explicitly assigned
+- exploration can happen without mutation
+A delegation brief must include:
+- goal/outcome
+- allowed files or directories
+- forbidden files or behaviors
+- relevant interfaces/contracts
+- expected tests/checks
+- report format for files changed, verification, and risks/gaps
+Verify delegated work by:
+- reading the diff, not just the report
+- running or reviewing the reported checks
+- checking contract compatibility between slices
+- rejecting broad edits, invented abstractions, or unproved claims
+## Hard rules
+- Build must pass `gates/isolation.md` before the first mutation.
+- Build must not ship, merge, publish, release, or finalize work.
+- Do not edit files outside the user's scope.
+- Do not claim completion without proof or explicit disclosure of missing proof.
+- Do not skip red/green/refactor for behavior changes unless there is a stated, practical reason and alternative proof plan.
+- Do not use subagents as a way to avoid understanding the result.
+- Do not introduce architecture that the current domain pressure does not justify.
+## Failure modes
+Watch for these and correct course:
+- **Green-only development:** tests are added after implementation and never proven red.
+- **Mock theater:** tests prove mocks were called, not that behavior works.
+- **Scope creep:** nearby cleanup, README edits, scripts, or unrelated modules change without request.
+- **Invented architecture:** factories, managers, providers, repositories, or layers appear without pressure.
+- **Shallow abstraction:** a wrapper hides one call site and makes the code harder to follow.
+- **Brittle hack:** timing sleeps, magic constants, global state, or special cases mask the real issue.
+- **God function:** one routine owns validation, orchestration, persistence, formatting, and error policy.
+- **Vague names:** `manager`, `helper`, `util`, or `common` hide responsibility instead of naming it.
+- **Hidden control flow:** callbacks, observers, magic registration, or framework hooks make execution hard to trace without clear benefit.
+- **Stringly API:** strings encode commands, states, fields, or permissions that should be typed, enumerated, or centralized.
+- **Layer violation:** UI, transport, persistence, or framework code reaches across boundaries into another layer's policy.
+- **Speculative interface:** abstraction exists for imagined future variants, not current pressure.
+- **Ambiguous delegation:** workers receive goals like "improve this" without contracts or boundaries.
+- **Unverified handoff:** delegated changes are accepted from a summary alone.
+- **Finalization leak:** Build says work is shipped, merged, or ready for users instead of ready for review/ship.
+## Output format
+When handing back from Build, respond with:
+- `Summary`: what changed, in bullets
+- `Files changed`: exact paths
+- `Verification`: commands/checks run and results
+- `Delegation`: subagents used, contracts passed, and how their work was verified; or `none`
+- `Risks / gaps`: anything unverified, deferred, or worth reviewing
+- `Next`: usually `review` or `ship`, never a claim that Build finalized the work