npm - @kiwidata/grimoire - Versions diffs - 0.1.1 - Mend

@kiwidata/grimoire 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (215)

package/.claude-plugin/plugin.json +8 -0
package/AGENTS.md +217 -0
package/README.md +748 -0
package/bin/grimoire.js +2 -0
package/dist/cli/index.d.ts +2 -0
package/dist/cli/index.d.ts.map +1 -0
package/dist/cli/index.js +42 -0
package/dist/cli/index.js.map +1 -0
package/dist/commands/archive.d.ts +3 -0
package/dist/commands/archive.d.ts.map +1 -0
package/dist/commands/archive.js +22 -0
package/dist/commands/archive.js.map +1 -0
package/dist/commands/branch-check.d.ts +3 -0
package/dist/commands/branch-check.d.ts.map +1 -0
package/dist/commands/branch-check.js +16 -0
package/dist/commands/branch-check.js.map +1 -0
package/dist/commands/check.d.ts +3 -0
package/dist/commands/check.d.ts.map +1 -0
package/dist/commands/check.js +22 -0
package/dist/commands/check.js.map +1 -0
package/dist/commands/ci.d.ts +3 -0
package/dist/commands/ci.d.ts.map +1 -0
package/dist/commands/ci.js +18 -0
package/dist/commands/ci.js.map +1 -0
package/dist/commands/diff.d.ts +3 -0
package/dist/commands/diff.d.ts.map +1 -0
package/dist/commands/diff.js +10 -0
package/dist/commands/diff.js.map +1 -0
package/dist/commands/docs.d.ts +3 -0
package/dist/commands/docs.d.ts.map +1 -0
package/dist/commands/docs.js +11 -0
package/dist/commands/docs.js.map +1 -0
package/dist/commands/health.d.ts +3 -0
package/dist/commands/health.d.ts.map +1 -0
package/dist/commands/health.js +13 -0
package/dist/commands/health.js.map +1 -0
package/dist/commands/init.d.ts +3 -0
package/dist/commands/init.d.ts.map +1 -0
package/dist/commands/init.js +21 -0
package/dist/commands/init.js.map +1 -0
package/dist/commands/list.d.ts +3 -0
package/dist/commands/list.d.ts.map +1 -0
package/dist/commands/list.js +22 -0
package/dist/commands/list.js.map +1 -0
package/dist/commands/log.d.ts +3 -0
package/dist/commands/log.d.ts.map +1 -0
package/dist/commands/log.js +15 -0
package/dist/commands/log.js.map +1 -0
package/dist/commands/map.d.ts +3 -0
package/dist/commands/map.d.ts.map +1 -0
package/dist/commands/map.js +17 -0
package/dist/commands/map.js.map +1 -0
package/dist/commands/pr.d.ts +3 -0
package/dist/commands/pr.d.ts.map +1 -0
package/dist/commands/pr.js +17 -0
package/dist/commands/pr.js.map +1 -0
package/dist/commands/status.d.ts +3 -0
package/dist/commands/status.d.ts.map +1 -0
package/dist/commands/status.js +12 -0
package/dist/commands/status.js.map +1 -0
package/dist/commands/test-quality.d.ts +3 -0
package/dist/commands/test-quality.d.ts.map +1 -0
package/dist/commands/test-quality.js +37 -0
package/dist/commands/test-quality.js.map +1 -0
package/dist/commands/trace.d.ts +3 -0
package/dist/commands/trace.d.ts.map +1 -0
package/dist/commands/trace.js +12 -0
package/dist/commands/trace.js.map +1 -0
package/dist/commands/update.d.ts +3 -0
package/dist/commands/update.d.ts.map +1 -0
package/dist/commands/update.js +22 -0
package/dist/commands/update.js.map +1 -0
package/dist/commands/validate.d.ts +3 -0
package/dist/commands/validate.d.ts.map +1 -0
package/dist/commands/validate.js +17 -0
package/dist/commands/validate.js.map +1 -0
package/dist/core/archive.d.ts +9 -0
package/dist/core/archive.d.ts.map +1 -0
package/dist/core/archive.js +92 -0
package/dist/core/archive.js.map +1 -0
package/dist/core/branch-check.d.ts +27 -0
package/dist/core/branch-check.d.ts.map +1 -0
package/dist/core/branch-check.js +205 -0
package/dist/core/branch-check.js.map +1 -0
package/dist/core/check.d.ts +24 -0
package/dist/core/check.d.ts.map +1 -0
package/dist/core/check.js +372 -0
package/dist/core/check.js.map +1 -0
package/dist/core/ci.d.ts +24 -0
package/dist/core/ci.d.ts.map +1 -0
package/dist/core/ci.js +162 -0
package/dist/core/ci.js.map +1 -0
package/dist/core/detect.d.ts +10 -0
package/dist/core/detect.d.ts.map +1 -0
package/dist/core/detect.js +368 -0
package/dist/core/detect.js.map +1 -0
package/dist/core/diff.d.ts +29 -0
package/dist/core/diff.d.ts.map +1 -0
package/dist/core/diff.js +197 -0
package/dist/core/diff.js.map +1 -0
package/dist/core/doc-style.d.ts +16 -0
package/dist/core/doc-style.d.ts.map +1 -0
package/dist/core/doc-style.js +192 -0
package/dist/core/doc-style.js.map +1 -0
package/dist/core/docs.d.ts +6 -0
package/dist/core/docs.d.ts.map +1 -0
package/dist/core/docs.js +478 -0
package/dist/core/docs.js.map +1 -0
package/dist/core/health.d.ts +7 -0
package/dist/core/health.d.ts.map +1 -0
package/dist/core/health.js +489 -0
package/dist/core/health.js.map +1 -0
package/dist/core/hooks.d.ts +5 -0
package/dist/core/hooks.d.ts.map +1 -0
package/dist/core/hooks.js +168 -0
package/dist/core/hooks.js.map +1 -0
package/dist/core/init.d.ts +9 -0
package/dist/core/init.d.ts.map +1 -0
package/dist/core/init.js +563 -0
package/dist/core/init.js.map +1 -0
package/dist/core/list.d.ts +4 -0
package/dist/core/list.d.ts.map +1 -0
package/dist/core/list.js +170 -0
package/dist/core/list.js.map +1 -0
package/dist/core/log.d.ts +8 -0
package/dist/core/log.d.ts.map +1 -0
package/dist/core/log.js +150 -0
package/dist/core/log.js.map +1 -0
package/dist/core/map.d.ts +9 -0
package/dist/core/map.d.ts.map +1 -0
package/dist/core/map.js +302 -0
package/dist/core/map.js.map +1 -0
package/dist/core/pr.d.ts +9 -0
package/dist/core/pr.d.ts.map +1 -0
package/dist/core/pr.js +273 -0
package/dist/core/pr.js.map +1 -0
package/dist/core/shared-setup.d.ts +52 -0
package/dist/core/shared-setup.d.ts.map +1 -0
package/dist/core/shared-setup.js +221 -0
package/dist/core/shared-setup.js.map +1 -0
package/dist/core/status.d.ts +6 -0
package/dist/core/status.d.ts.map +1 -0
package/dist/core/status.js +114 -0
package/dist/core/status.js.map +1 -0
package/dist/core/test-quality.d.ts +33 -0
package/dist/core/test-quality.d.ts.map +1 -0
package/dist/core/test-quality.js +378 -0
package/dist/core/test-quality.js.map +1 -0
package/dist/core/trace.d.ts +6 -0
package/dist/core/trace.d.ts.map +1 -0
package/dist/core/trace.js +211 -0
package/dist/core/trace.js.map +1 -0
package/dist/core/update.d.ts +10 -0
package/dist/core/update.d.ts.map +1 -0
package/dist/core/update.js +149 -0
package/dist/core/update.js.map +1 -0
package/dist/core/validate.d.ts +20 -0
package/dist/core/validate.d.ts.map +1 -0
package/dist/core/validate.js +275 -0
package/dist/core/validate.js.map +1 -0
package/dist/index.d.ts +19 -0
package/dist/index.d.ts.map +1 -0
package/dist/index.js +20 -0
package/dist/index.js.map +1 -0
package/dist/utils/config.d.ts +61 -0
package/dist/utils/config.d.ts.map +1 -0
package/dist/utils/config.js +172 -0
package/dist/utils/config.js.map +1 -0
package/dist/utils/fs.d.ts +17 -0
package/dist/utils/fs.d.ts.map +1 -0
package/dist/utils/fs.js +38 -0
package/dist/utils/fs.js.map +1 -0
package/dist/utils/paths.d.ts +10 -0
package/dist/utils/paths.d.ts.map +1 -0
package/dist/utils/paths.js +35 -0
package/dist/utils/paths.js.map +1 -0
package/dist/utils/spawn.d.ts +5 -0
package/dist/utils/spawn.d.ts.map +1 -0
package/dist/utils/spawn.js +34 -0
package/dist/utils/spawn.js.map +1 -0
package/package.json +68 -0
package/skills/grimoire-apply/SKILL.md +274 -0
package/skills/grimoire-audit/SKILL.md +129 -0
package/skills/grimoire-branch-guard/SKILL.md +111 -0
package/skills/grimoire-bug/SKILL.md +160 -0
package/skills/grimoire-bug-explore/SKILL.md +242 -0
package/skills/grimoire-bug-report/SKILL.md +237 -0
package/skills/grimoire-bug-session/SKILL.md +222 -0
package/skills/grimoire-bug-triage/SKILL.md +274 -0
package/skills/grimoire-commit/SKILL.md +150 -0
package/skills/grimoire-discover/SKILL.md +297 -0
package/skills/grimoire-draft/SKILL.md +202 -0
package/skills/grimoire-plan/SKILL.md +329 -0
package/skills/grimoire-pr/SKILL.md +134 -0
package/skills/grimoire-pr-review/SKILL.md +240 -0
package/skills/grimoire-refactor/SKILL.md +251 -0
package/skills/grimoire-remove/SKILL.md +112 -0
package/skills/grimoire-review/SKILL.md +247 -0
package/skills/grimoire-verify/SKILL.md +223 -0
package/skills/references/bug-classification.md +154 -0
package/skills/references/build-vs-buy.md +77 -0
package/skills/references/elicitation-personas.md +118 -0
package/skills/references/refactor-register-format.md +88 -0
package/skills/references/refactor-scan-categories.md +102 -0
package/skills/references/schema-format.md +68 -0
package/skills/references/security-compliance.md +110 -0
package/skills/references/testing-contracts.md +93 -0
package/templates/context.yml +110 -0
package/templates/debt-exceptions.yml +61 -0
package/templates/decision.md +50 -0
package/templates/dupignore +93 -0
package/templates/example.feature +24 -0
package/templates/manifest.md +29 -0
package/templates/mapignore +58 -0
package/templates/mapkeys +65 -0

package/skills/references/build-vs-buy.md ADDED

@@ -0,0 +1,77 @@
+# Build vs Buy Research
+Research methodology for evaluating existing solutions before designing custom code. Used by draft (conduct research), plan (validate decision), review (check prior art).
+## When to Research
+- **Level 1 (Trivial)**: Skip entirely
+- **Level 2 (Simple)**: Check built-ins and first-party ecosystem only
+- **Level 3-4 (Moderate/Complex)**: Full research across all categories
+## Research Categories
+Search for existing solutions across these categories (skip categories that clearly don't apply):
+- **Language/framework built-ins**: Does the framework already have this? (e.g., Django has auth, React has context, Express has middleware). Check official docs.
+- **First-party ecosystem**: Official plugins, extensions, or companion packages from the framework maintainers.
+- **Popular libraries**: Search the relevant package registry (npm, PyPI, crates.io, etc.) for well-maintained packages. Use web search to find comparison articles, "best of" lists, and Stack Overflow recommendations.
+- **Open-source projects**: GitHub repos that solve the same problem as a standalone tool or reference implementation.
+- **SaaS/managed services**: Hosted solutions that handle the problem as a service (e.g., Auth0 for auth, Stripe for payments, Algolia for search).
+For each candidate found, gather:
+- **Name and link** to docs/repo
+- **Maintenance signals**: last release date, commit frequency, open issues, download count
+- **Fit**: does it match the project's language, framework, and deployment constraints?
+- **Scope match**: does it solve 100% of the need, 80%, or just a part?
+- **Trade-offs**: what design decisions does it impose? What would the project give up by adopting it?
+## Decision Framework
+| Signal | Points toward **adopt** | Points toward **build** |
+|--------|------------------------|------------------------|
+| Scope match | Solves ≥80% of the need | Solves <50% or forces unwanted constraints |
+| Maintenance | Active, >1 maintainer, regular releases | Abandoned, single maintainer, or unmaintained fork |
+| Integration cost | Drop-in or <1 day to integrate | Requires significant adapter code or workarounds |
+| Customization | Configurable or extensible where needed | Core behavior can't be changed without forking |
+| Dependencies | Few, well-known transitive deps | Heavy dependency tree or conflicts with project deps |
+| Security | Audited, follows best practices, no known CVEs | Unaudited, handles sensitive data unsafely |
+| Licensing | Compatible with project license | Incompatible or ambiguous license |
+| Project constraints | Fits deployment target, bundle size, performance needs | Doesn't fit runtime environment or adds unacceptable overhead |
+When the decision is close, **prefer adopting** — maintaining custom code is almost always more expensive than people expect.
+## If Building: Learn from What Exists
+When the decision is to build custom code, **study existing implementations before designing**:
+- **Document the prior art**: For each relevant existing tool, note its architecture, data flows, API design, and key abstractions. What patterns does it use? What did its maintainers learn over time (check changelogs, migration guides, design docs)?
+- **Identify what's different**: Be precise about why the project's needs diverge. "We need something different" is not enough — state the specific requirements that existing tools don't meet.
+- **Borrow deliberately**: List specific design patterns, data flow approaches, API shapes, or architectural decisions from existing tools that should inform the custom implementation. This prevents reinventing what others have already refined.
+- **Scope the custom work**: Define the minimum viable version. If an existing tool does 10 things and you only need 3, build those 3. Don't replicate the full feature set.
+## Present Findings Format
+Present a structured summary **before drafting any artifacts**:
+```markdown
+## Prior Art Research
+### Existing Solutions Found
+1. **[name]** — [one-line description]. [fit assessment]. [key trade-off].
+2. **[name]** — ...
+### Recommendation
+- **Adopt [name]** because [reasons] → draft becomes an ADR documenting the adoption
+- OR **Build custom** because [specific gaps: requirement X isn't met by any option, constraint Y rules out adoption]. Borrowing [patterns/flows] from [existing tool].
+- OR **Hybrid**: adopt [name] for [scope] and build custom [scope] because [reasons]
+### If Building: What Makes This Different
+- [Requirement that no existing tool meets]
+- [Constraint that rules out adoption]
+- [Design decision that must differ from prior art, and why]
+### If Building: Borrowed from Prior Art
+- [Pattern/flow/API shape] from [tool] — because [reason it's proven]
+```
+Wait for user agreement on the direction before proceeding.
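
Note on the Decision Framework table in build-vs-buy.md: the file lists eight signals and a tie-break rule ("prefer adopting" when the decision is close) but does not say how the signals are combined. The following is a minimal sketch, assuming each signal is one unweighted vote and that only an exact tie counts as "close". All type and function names are hypothetical and not part of the package.

```typescript
// Hypothetical sketch: names are illustrative, not part of @kiwidata/grimoire.
type Lean = "adopt" | "build" | "neutral";

// The eight signals from the Decision Framework table.
type DecisionSignal =
  | "scopeMatch"
  | "maintenance"
  | "integrationCost"
  | "customization"
  | "dependencies"
  | "security"
  | "licensing"
  | "projectConstraints";

type Assessment = { [S in DecisionSignal]: Lean };

const DECISION_SIGNALS: DecisionSignal[] = [
  "scopeMatch", "maintenance", "integrationCost", "customization",
  "dependencies", "security", "licensing", "projectConstraints",
];

// Assumption: every signal is one unweighted vote.
function recommend(assessment: Assessment): "adopt" | "build" {
  let adopt = 0;
  let build = 0;
  for (const signal of DECISION_SIGNALS) {
    if (assessment[signal] === "adopt") adopt += 1;
    if (assessment[signal] === "build") build += 1;
  }
  // A tie is treated as a close decision, which the source resolves toward adopting.
  return build > adopt ? "build" : "adopt";
}
```

In practice some signals, such as an incompatible license, likely act as vetoes rather than votes; the sketch leaves that judgment to the reviewer.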

package/skills/references/elicitation-personas.md ADDED

@@ -0,0 +1,118 @@
+# Elicitation Personas
+Persona-driven questions to surface requirements. Used by draft (gather requirements), plan (check completeness), review (evaluate design).
+## How to Use
+- **In draft**: Ask these questions to gather requirements before drafting.
+- **In plan**: Use as a completeness checklist — flag gaps in the specs, don't ask the user.
+- **In review**: Use as evaluation criteria — check if the design addresses each concern.
+## Depth by Complexity Level
+| Level | Depth |
+|-------|-------|
+| 1 (Trivial) | Skip entirely |
+| 2 (Simple) | Outcome + non-goals, then 1-3 targeted questions from the most relevant persona |
+| 3 (Moderate) | Outcome + non-goals, then applicable personas in a single focused batch |
+| 4 (Complex) | Outcome + non-goals, then all applicable personas in batches of 3-5, wait between batches |
+Don't ask every question — only ask questions whose answers aren't already clear.
+## Outcome & Scope — Always Ask First
+Before diving into persona questions, establish the outcome and boundaries. These two questions prevent the most common spec failures — building the wrong thing and building too much:
+- **Outcome**: What problem are you solving, and how will you know it's solved? What does the user or system look like *after* this change that's different from today?
+- **Non-goals**: What is explicitly out of scope? What should this change NOT do, NOT handle, or NOT affect? Are there adjacent features, edge cases, or future plans that we should deliberately leave alone?
+Record the answers. Non-goals become constraints in the manifest and guard rails during drafting — if a scenario starts creeping into a non-goal, stop and flag it.
+## Product Manager — Functional Completeness
+Ask when: the change has user-facing behavior.
+- Who are the actors? Is there more than one user role involved? Do they see/do different things?
+- What triggers this — user action, scheduled job, external event, or another system?
+- What does success look like from the user's perspective? What do they see/receive?
+- What are the business rules? Validation constraints, conditional logic, calculations, limits?
+- What happens when the user makes a mistake? Invalid input, wrong state, missing data?
+- Are there state transitions? What states can this entity be in, and what moves it between them?
+- Is there a UI component? What does each state look like — loading, empty, error, success, partial?
+- *(If UI)* Do you have designs or mockups? Check `.grimoire/config.yaml` under `project.design_tool` for where designs live. Ask the user to point to the specific screen/flow — reference it in the requirements summary so downstream skills can consult it.
+- Does this interact with existing features? Could it conflict with or depend on other workflows?
+- Does this need to support multiple languages or locales? If so, which ones?
+- What accessibility standard applies? (WCAG 2.1 AA is the most common default. If the project has no standard, ask.)
+## Senior Engineer — Architecture & Integration
+Ask when: the change introduces new components, services, dependencies, or data flows.
+- What's the deployment context? Does this run in the same service or cross service boundaries?
+- What existing components does this touch? Are there shared modules, APIs, or databases involved?
+- Are there concurrency concerns? Multiple users or processes acting on the same data?
+- What's the data flow? Where does data enter, how is it transformed, where does it end up?
+- Are there ordering or idempotency requirements? What happens if this runs twice?
+- Does this need to be backwards-compatible with existing clients, APIs, or data formats?
+- Is there a rollout concern? Can this be deployed incrementally, or is it all-or-nothing? Feature flag needed?
+- What are the performance expectations? Response time target, expected throughput, data volume at scale?
+- How will you observe this in production? What metrics, logs, or alerts should exist? What does "healthy" look like on a dashboard?
+- What's the availability target? What happens during partial outages — degrade gracefully or fail fast?
+- *(If adopting)* What customization or configuration does the library need for this project's constraints?
+- *(If adopting)* Where does the library's responsibility end and custom code begin?
+## Security Engineer — Security & Compliance
+Ask when: the change involves authentication, authorization, user input, sensitive data, or external-facing endpoints.
+- Who should have access to this? Are there roles, permissions, or ownership rules?
+- Does this handle sensitive data (PII, credentials, financial, health)? Where is it stored and transmitted?
+- Is there user-provided input? What's the attack surface — injection, XSS, CSRF, file upload?
+- Are there compliance requirements (GDPR, HIPAA, PCI-DSS, SOC2)? Data residency or retention rules?
+- Does this cross a trust boundary? Is data coming from an external system or untrusted source?
+## QA Engineer — Testability & Edge Cases
+Ask when: the change has complex behavior, multiple paths, or integration points.
+- What are the boundary values? Min/max lengths, zero vs. one, empty collections, null states?
+- What are the timing edge cases? Concurrent edits, race conditions, timeout during processing?
+- What external dependencies could fail? How should the system behave when they do — retry, fallback, error?
+- Is there existing behavior that could regress? What should still work exactly as before?
+## Data Engineer — Data & Schema
+Ask when: the change creates, modifies, or removes data models, or integrates with external APIs.
+- What data entities are involved? What are the relationships between them?
+- What are the field constraints? Required, unique, nullable, max length, valid ranges, enums?
+- How does this data grow? Is there a retention policy, archival strategy, or cleanup needed?
+- Is there existing data that needs migrating? Can the migration run live or does it need downtime?
+- Are there external API contracts? What fields does the client read, and what happens if the schema changes?
+## Requirements Summary Template
+After elicitation, summarize what you learned in a short **Requirements Summary**. This becomes the foundation for scenarios and decisions. Format:
+```markdown
+## Requirements Summary
+**Outcome**: [what problem this solves, how we'll know it's solved]
+**Non-goals**: [what's explicitly out of scope — won't do, won't handle, won't affect]
+**Actors**: [who]
+**Trigger**: [what starts this]
+**Happy path**: [what success looks like]
+**Business rules**: [validation, constraints, logic]
+**Error cases**: [what can go wrong]
+**Data**: [what's created/modified, key constraints]
+**Security**: [access control, sensitive data, compliance]
+**Performance**: [response time, throughput, data volume targets — if applicable]
+**Observability**: [key metrics, alerts, what "healthy" looks like — if applicable]
+**Availability**: [uptime target, degradation strategy — if applicable]
+**Accessibility**: [WCAG level, requirements — if applicable]
+**i18n**: [supported locales — if applicable]
+**Design reference**: [link to mockup/design, or "none" — if UI change]
+**Open questions**: [anything the user couldn't answer yet — flag as unvalidated assumptions]
+```
+Wait for user confirmation of the summary before proceeding to draft.
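
Note on the "Depth by Complexity Level" table in elicitation-personas.md: the batching rules can be read as a small planning function. This is a rough sketch under stated assumptions: `questions` is already filtered to unanswered questions from the applicable personas, the outcome and non-goals questions have been asked beforehand, and a batch size of 5 is taken as the upper bound of the "3-5" range (so the last batch may be smaller than 3). The names `planBatches` and `chunk` are hypothetical.

```typescript
// Hypothetical sketch: names are illustrative, not part of @kiwidata/grimoire.
type Level = 1 | 2 | 3 | 4;

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Returns the batches of persona questions to ask for a complexity level.
function planBatches(level: Level, questions: string[]): string[][] {
  if (level === 1 || questions.length === 0) return []; // Level 1: skip entirely
  if (level === 2) return [questions.slice(0, 3)];      // at most 3 targeted questions
  if (level === 3) return [questions];                  // one focused batch
  return chunk(questions, 5);                           // Level 4: wait between batches
}
```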

package/skills/references/refactor-register-format.md ADDED

@@ -0,0 +1,88 @@
+# Debt Register Format
+Reference for `grimoire-refactor` step 4. The register is `.grimoire/docs/debt-register.yml`.
+## Required Fields
+- `id` — `debt-NNN`, monotonically increasing
+- `category` — one of: `hotspot`, `structural_bloat`, `data_structure`, `circular_dependency`, `dependency_staleness`, `broken_promise`, `duplication`, `dead_code`, `test_debt`
+- `severity` — `high`, `medium`, `low`
+- `location` — file path (with optional `:line`), or `path <> path` for relationships
+- `title` — short human-readable summary
+- `detail` — evidence: what was measured, what threshold was exceeded, consequences
+- `fingerprint` — `sha256(category + normalized_location)` for dedup across scans
+- `status` — `open` | `triaged` | `in-progress` | `resolved` | `accepted`
+## Optional Fields
+- `metrics` — numeric measurements (churn count, complexity score, line count, etc.)
+- `suggestion` — recommended refactoring approach
+- `effort` — `small` (<1 hour), `medium` (1-4 hours), `large` (>4 hours)
+- `consequences` — what happens if NOT addressed (forces articulation of impact)
+- `causes` — `evolution`, `deadline`, `knowledge`, `dependency`
+- `quadrant` — Fowler's: `deliberate-prudent`, `deliberate-reckless`, `inadvertent-prudent`, `inadvertent-reckless`
+- `change_id` — grimoire change created to address this item
+- `first_detected` / `last_detected` — date tracking
+## Example
+```yaml
+# Grimoire Debt Register
+# Generated by /grimoire:refactor
+# Last scanned: 2026-04-06
+summary:
+  total: 12
+  high: 3
+  medium: 5
+  low: 4
+  open: 9
+  accepted: 2
+  in_progress: 1
+  resolved: 0
+items:
+  - id: debt-001
+    fingerprint: a1b2c3d4
+    category: hotspot
+    severity: high
+    location: src/api/views.py
+    title: "High-churn, high-complexity API view module"
+    detail: "38 commits in 6 months, cyclomatic complexity 24. Handles 12 endpoints with mixed concerns."
+    consequences: "Every API change requires understanding 847 lines. Bug rate 3x project average."
+    causes: evolution
+    metrics: { churn: 38, complexity: 24, lines: 847 }
+    suggestion: "Split by resource: UserViews, OrderViews, ProductViews. Extract validation."
+    effort: medium
+    status: open
+    first_detected: 2026-04-06
+    last_detected: 2026-04-06
+  - id: debt-002
+    fingerprint: c9d0e1f2
+    category: data_structure
+    severity: medium
+    location: src/types/config.ts
+    title: "AppConfig interface with 28 optional fields"
+    detail: "28 fields, 22 optional. Used in 3 different contexts."
+    consequences: "Every consumer must null-check fields always present in its context."
+    causes: evolution
+    metrics: { fields: 28, optional_ratio: 0.79 }
+    suggestion: "Split into ServerConfig, ClientConfig, CLIConfig with shared BaseConfig."
+    effort: medium
+    status: accepted
+    quadrant: deliberate-prudent
+    exception_reason: "Splitting would break plugin API. Revisit in Q4."
+    first_detected: 2026-04-06
+    last_detected: 2026-04-06
+```
+## Register Rules
+- Status lifecycle: `open` -> `triaged` -> `in-progress` -> `resolved` (or `accepted` via exception)
+- `first_detected` set once, never updated. `last_detected` updated each scan.
+- Items matched by exception: `status: accepted` with `quadrant` and `exception_reason`
+- Expired exceptions: revert to `status: open` with note in detail
+- `consequences` forces articulation of real impact, not just code aesthetics
+- On refresh: match by fingerprint, preserve status/first_detected, update last_detected/metrics, mark undetected items as `resolved`
+- Sort: severity (high first), then hotspot score within severity
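
Note on the refresh rule in refactor-register-format.md ("match by fingerprint, preserve status/first_detected, update last_detected/metrics, mark undetected items as `resolved`"): the sketch below shows one way that rule could be applied. It is not the package's implementation. The `Finding` and `DebtItem` shapes, `refreshRegister`, and `formatDebtId` are hypothetical names; new items are assumed to start as `open`; and sorting uses severity only, because the hotspot score is not among the listed register fields.

```typescript
// Hypothetical sketch: names are illustrative, not part of @kiwidata/grimoire.
type Severity = "high" | "medium" | "low";
type DebtStatus = "open" | "triaged" | "in-progress" | "resolved" | "accepted";

// What a scan reports before register bookkeeping is applied.
interface Finding {
  fingerprint: string;
  category: string;
  severity: Severity;
  location: string;
  title: string;
  detail: string;
  metrics?: { [metric: string]: number };
}

interface DebtItem extends Finding {
  id: string; // debt-NNN
  status: DebtStatus;
  first_detected: string; // ISO date
  last_detected: string; // ISO date
}

const SEVERITY_ORDER: { [S in Severity]: number } = { high: 0, medium: 1, low: 2 };

function formatDebtId(n: number): string {
  const digits = String(n);
  return "debt-" + (digits.length >= 3 ? digits : ("000" + digits).slice(-3));
}

function refreshRegister(existing: DebtItem[], scanned: Finding[], today: string): DebtItem[] {
  const scannedByFingerprint: { [fp: string]: Finding } = Object.create(null);
  for (const finding of scanned) scannedByFingerprint[finding.fingerprint] = finding;

  const known: { [fp: string]: boolean } = Object.create(null);
  const result: DebtItem[] = [];
  let maxId = 0;

  for (const item of existing) {
    known[item.fingerprint] = true;
    maxId = Math.max(maxId, parseInt(item.id.replace("debt-", ""), 10) || 0);
    const match: Finding | undefined = scannedByFingerprint[item.fingerprint];
    if (match === undefined) {
      // Not detected in this scan: mark as resolved.
      result.push({ ...item, status: "resolved" });
    } else {
      // Detected again: keep status and first_detected, refresh metrics and date.
      result.push({ ...item, metrics: match.metrics, last_detected: today });
    }
  }

  for (const finding of scanned) {
    if (known[finding.fingerprint]) continue;
    known[finding.fingerprint] = true;
    maxId += 1; // ids increase monotonically
    result.push({
      ...finding,
      id: formatDebtId(maxId),
      status: "open",
      first_detected: today,
      last_detected: today,
    });
  }

  // Sort by severity, high first.
  return result.sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]);
}
```

Read literally, the rule also moves an `accepted` item to `resolved` once a scan no longer detects it; the sketch follows that reading.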

package/skills/references/refactor-scan-categories.md ADDED

@@ -0,0 +1,102 @@
+# Refactor Scan Categories
+Reference for `grimoire-refactor` step 2. Each category produces findings with a category, location, severity, and suggested action.
+## 2a. Hotspots (churn x complexity)
+Files that change frequently AND are hard to change. Highest-ROI refactoring targets.
+**How to scan:**
+1. Change frequency: `git log --format=format: --name-only --since="6 months ago" | sort | uniq -c | sort -rn | head -50`
+2. Complexity: run `config.tools.complexity` (configured during init — e.g., radon, eslint complexity plugin, or line count + nesting depth as proxy)
+3. Multiply: `churn_rank x complexity_rank = hotspot_score`
+4. Top 10-20 files by hotspot score are targets
+**Severity:** high = top 5 (churn >20 AND complexity above threshold), medium = 6-15, low = 16+
+## 2b. Structural Bloat
+| Signal | Threshold | Meaning |
+|---|---|---|
+| Oversized files | >300 lines (Python), >500 (TS/JS), >400 (Go) | File does too much — split |
+| Long functions | >50 lines or >4 nesting levels | Extract or flatten |
+| God classes | >10 public methods or >500 lines | Split by responsibility |
+| Too many exports | >15 from one module | Grab bag, not a module |
+| Deep nesting | >4 levels of indentation in logic | Guard clauses, extract, pipeline |
+| Wrapper-only layers | Function body is a single delegation call | Inline or remove |
+| Large switch/if-else | >5 branches | Lookup table, strategy, polymorphism |
+**Severity:** high = 2x+ threshold, medium = 1-2x, low = marginally over
+## 2c. Data Structure Complexity
+| Signal | Meaning |
+|---|---|
+| Models >15 fields | Represents multiple concepts — split |
+| >3 nesting levels | Flatten or normalize |
+| Type unions >4 variants | Separate types or polymorphism |
+| >70% field overlap between types | Consolidate or extract shared base |
+| Config with conditional logic | Business logic hiding as config |
+| >50% optional fields | God DTO serving multiple use cases |
+| Enums >10 values | Proper type hierarchy |
+**How to scan:** Read `schema.yml` if exists, scan ORM models / interfaces / dataclasses, count fields and nesting.
+**Severity:** high = >25 fields or >4 nesting, medium = 15-25 or 3-4 nesting, low = structural smell but manageable
+## 2d. Circular Dependencies
+**How to scan:**
+- JS/TS: `dependency-cruiser` or `madge` if available, else trace imports from area docs
+- Python: trace imports, look for `TYPE_CHECKING` blocks (circular import workaround signal)
+- Go: circular imports are compile errors — look for oversized packages to split
+**Severity:** high = >3 modules or crosses architecture boundaries, medium = 2-module cycles, low = within single area
+## 2e. Dependency Staleness
+**How to scan:** Run `config.tools.dep_audit` if configured, or:
+- Node: `npm outdated --json`
+- Python: `pip list --outdated --format=json`
+- Count major versions behind, check last publish date
+**Severity:** high = >2 major versions behind or unmaintained (2+ years no release), medium = 1-2 major behind, low = minor/patch behind
+## 2f. Broken Promises
+TODO/FIXME/HACK/XXX comments that have aged.
+**How to scan:**
+1. Find comments: `grep -rn 'TODO\|FIXME\|HACK\|XXX' --include="*.py" --include="*.ts" --include="*.js" --include="*.go" ...`
+2. Age from `git blame` — when was this line last touched?
+3. Older = higher priority
+**Severity:** high = >1 year old, medium = 3 months to 1 year, low = <3 months
+## 2g. Duplication
+**How to scan:**
+- Read `.grimoire/docs/.snapshot.json` `duplicates` section if present
+- Or run `config.tools.duplicates` if configured (e.g., jscpd)
+- Group by area — within-area dupes are easy to consolidate
+**Severity:** high = >30 lines or >3 copies, medium = 10-30 lines or 2 copies, low = <10 lines
+## 2h. Dead Code
+**How to scan:**
+- Run `config.tools.dead_code` if configured (e.g., knip, vulture)
+- Cross-reference area docs' reusable code tables (in table but never imported = dead)
+- If `codebase-memory-mcp` available: `query_graph` for functions with zero callers
+**Severity:** high = entire unused modules/classes, medium = unused exported functions, low = unused imports/variables
+## 2i. Test Debt
+**How to scan:**
+- Get coverage report if available — files <50% coverage
+- Cross-reference with complexity — high complexity + low coverage = dangerous
+- Check for trivial assertions (`assert True`, `expect(true).toBe(true)`)
+- Check for over-mocked tests (testing mocks, not behavior)
+**Severity:** high = complex code (top quartile) with <30% coverage, medium = moderate complexity with <50%, low = simple code with low coverage
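
Note on step 2a in refactor-scan-categories.md (`churn_rank x complexity_rank = hotspot_score`): the file does not say which direction the ranks run. Since the top files by score are the targets, the sketch below assumes rank 1 is the lowest value, so a larger product means a hotter file. Ties receive distinct ranks in sort order, which is a simplification. `FileStats`, `rankAscending`, and `hotspots` are hypothetical names, and the input is assumed to come from the churn and complexity commands listed in that step.

```typescript
// Hypothetical sketch: names are illustrative, not part of @kiwidata/grimoire.
interface FileStats {
  path: string;
  churn: number;      // commits touching the file in the window
  complexity: number; // output of the configured complexity tool
}

interface Hotspot {
  path: string;
  churnRank: number;
  complexityRank: number;
  score: number;
}

// Assumption: rank 1 = lowest value, so the busiest file gets the largest rank.
function rankAscending(files: FileStats[], key: "churn" | "complexity"): { [path: string]: number } {
  const sorted = files.slice().sort((a, b) => a[key] - b[key]);
  const ranks: { [path: string]: number } = Object.create(null);
  sorted.forEach((file, index) => {
    ranks[file.path] = index + 1;
  });
  return ranks;
}

// Multiply the two ranks and return the `limit` highest-scoring files.
function hotspots(files: FileStats[], limit: number): Hotspot[] {
  const churnRanks = rankAscending(files, "churn");
  const complexityRanks = rankAscending(files, "complexity");
  return files
    .map((file) => ({
      path: file.path,
      churnRank: churnRanks[file.path],
      complexityRank: complexityRanks[file.path],
      score: churnRanks[file.path] * complexityRanks[file.path],
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Using ranks rather than raw values keeps one extreme measurement from dominating the score, which may be why the source multiplies ranks.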

package/skills/references/schema-format.md ADDED

@@ -0,0 +1,68 @@
+# Data Schema Format
+Reference for `grimoire-discover` step 5. The schema is `.grimoire/docs/data/schema.yml`.
+## Format
+```yaml
+# Grimoire Data Schema
+# Auto-generated by /grimoire:discover
+# Last updated: YYYY-MM-DD
+# Source: <what was scanned>
+users:
+  type: table                          # table | collection | document | external_api
+  source: src/models/user.py:12        # where defined in code
+  note: "Core user identity"           # optional context
+  fields:
+    id: { type: integer, pk: true }
+    email: { type: varchar, unique: true, not_null: true }
+    role:
+      type: enum
+      values: [admin, member, guest]
+      default: member
+    preferences:                        # nested (Mongo, JSON column, etc.)
+      type: object
+      fields:
+        theme: { type: string, default: "light" }
+  indexes:
+    - fields: [email]
+      unique: true
+  relationships:
+    - type: has_many
+      target: posts
+      foreign_key: author_id
+# --- External APIs ---
+stripe_payments:
+  type: external_api
+  provider: Stripe
+  schema_ref: https://stripe.com/docs/api/charges   # full spec location
+  client: src/integrations/stripe.py                 # where we call it
+  auth: api_key
+  endpoints:
+    create_charge:
+      method: POST
+      path: /v1/charges
+      fields:
+        amount: { type: integer, note: "in cents" }
+        currency: { type: string }
+      response:
+        id: { type: string }
+        status: { type: string, enum: [succeeded, failed] }
+      error_response:
+        message: { type: string }
+```
+## Rules
+- Document what exists in the code, not what the database contains
+- `source:` points to ORM model or migration file — schema.yml is a summary, code is truth
+- `type:` — `table` for SQL, `collection` for Mongo/document, `document` for nested sub-documents, `external_api` for consumed APIs
+- Nested `fields` for embedded objects/arrays (document DBs, JSON columns)
+- `note:` only when field name isn't self-explanatory
+- `relationships` when ORM defines them explicitly
+- For external APIs: `schema_ref` is most important — point to the full spec
+- For external APIs: `client` points to where the codebase calls the API
+- Don't duplicate entire OpenAPI specs — summarize endpoints you actually use, point to full spec via `schema_ref`
+- If the project has no data layer, skip this step entirely
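As a rough illustration of the rules above, the sketch below checks already-parsed `schema.yml` entries for a valid `type:` and, for `external_api` entries, for `schema_ref` and `client`. The function name `check_entry` and the decision to treat these keys as required are assumptions made for this example, not part of Grimoire.

```python
ALLOWED_TYPES = {"table", "collection", "document", "external_api"}


def check_entry(name: str, entry: dict) -> list[str]:
    """Return a list of problems found in one parsed schema.yml entry."""
    problems = []
    entry_type = entry.get("type")
    if entry_type not in ALLOWED_TYPES:
        problems.append(f"{name}: unknown type {entry_type!r}")
    if entry_type == "external_api":
        # Assumption: treat schema_ref and client as required for external APIs.
        for key in ("schema_ref", "client"):
            if key not in entry:
                problems.append(f"{name}: external_api is missing {key}")
    elif "source" not in entry:
        # Assumption: data-store entries should point back to the defining code.
        problems.append(f"{name}: missing source")
    return problems


# Entries as they might look after parsing the YAML into dicts.
schema = {
    "users": {"type": "table", "source": "src/models/user.py:12"},
    "stripe_payments": {"type": "external_api", "provider": "Stripe"},
}

for entry_name, entry in schema.items():
    for problem in check_entry(entry_name, entry):
        print(problem)
```

Here the `users` entry passes, while `stripe_payments` is reported twice because it has neither `schema_ref` nor `client`.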

package/skills/references/security-compliance.md ADDED

@@ -0,0 +1,110 @@
+# Security & Compliance Reference
+Loaded by skills when feature files have security tags or `project.compliance` is configured in `.grimoire/config.yaml`.
+## Security Tags
+Apply these Gherkin tags to scenarios with security implications. Downstream skills (plan, review, verify) use them to enforce stricter checks.
+| Tag | When to apply |
+|---|---|
+| `@security` | Authentication, authorization, access control, cryptographic operations |
+| `@auth` | Login, logout, session management, token handling, role-based access |
+| `@pii` | Create, read, update, or delete personally identifiable information |
+| `@input-validation` | User input that could be malicious (forms, APIs, file uploads) |
+| `@secrets` | API keys, credentials, tokens, or secret management |
+### Compliance-specific tags (only when `project.compliance` is configured)
+| Tag | When to apply |
+|---|---|
+| `@pci-dss` | Payment card data, cardholder data environment, payment processing |
+| `@hipaa` | Protected health information, patient data, healthcare records |
+| `@gdpr` | EU personal data, consent management, data subject rights |
+| `@soc2` | Audit logging, access controls, availability requirements |
+Multiple tags can apply to one scenario.
+## What Each Tag Requires
+### In planning (grimoire-plan)
+- `@security` / `@auth` — specify which auth library/framework; include a negative scenario task (assert 401/403)
+- `@pii` — tasks for encryption at rest, access logging, data minimization; if GDPR in compliance, add consent + erasure tasks
+- `@input-validation` — explicit validation/sanitization at boundary; negative test tasks for malicious input (SQLi, XSS, path traversal)
+- `@secrets` — specify env vars or secret store, never hardcoded; add a task to verify no secrets in source
+- `@pci-dss` — no card data in logs, TLS, tokenization, audit trail for cardholder data access
+- `@hipaa` — access controls with audit logging, encryption at rest/transit, minimum necessary access
+- `@gdpr` — lawful basis, consent mechanism if needed, data subject rights (access, rectify, erase, port), retention limits
+- `@soc2` — audit logging for all access, change management documentation, availability monitoring
+### In verification (grimoire-verify)
+- `@security` / `@auth` — confirm auth checks exist in implementation, negative test covers unauthorized access (401/403)
+- `@pii` — data encrypted at rest, access logged, no PII in log output
+- `@input-validation` — validation at boundary, negative tests for malicious input exist
+- `@secrets` — values from env/secret store, no hardcoded credentials in source
+- `@pci-dss` — no card data in logs, TLS for transmission, audit trail present
+- `@hipaa` — access controls + audit logging, encryption at rest/transit
+- `@gdpr` — consent mechanism if applicable, erasure support, data retention limits
+- `@soc2` — audit logging, access controls, availability monitoring
+A security-tagged scenario with no corresponding security verification in the tests is a **CRITICAL** issue.
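To make the tag-driven checks more concrete, here is a minimal sketch of how a tool might find the scenarios that carry these tags. It is not how Grimoire's skills work internally; `tagged_scenarios` is a hypothetical helper, and it only looks at tags written directly above a scenario (feature-level tags, which Gherkin scenarios inherit, are ignored).

```python
SECURITY_TAGS = {"@security", "@auth", "@pii", "@input-validation", "@secrets"}
COMPLIANCE_TAGS = {"@pci-dss", "@hipaa", "@gdpr", "@soc2"}


def tagged_scenarios(feature_text: str) -> dict[str, set[str]]:
    """Map each scenario name to the security/compliance tags directly above it."""
    found = {}
    pending = set()  # tags seen since the last non-tag line
    for raw_line in feature_text.splitlines():
        line = raw_line.strip()
        if line.startswith("@"):
            pending.update(line.split())
        elif line.startswith(("Scenario:", "Scenario Outline:")):
            name = line.split(":", 1)[1].strip()
            relevant = pending & (SECURITY_TAGS | COMPLIANCE_TAGS)
            if relevant:
                found[name] = relevant
            pending = set()
        elif line:
            # Any other non-blank line ends the current run of tags.
            pending = set()
    return found


FEATURE = """
Feature: Account access

  @auth @security
  Scenario: Reject an expired session token
    Given a session token that expired an hour ago
    When the user requests their profile
    Then the response status is 401

  Scenario: View the public landing page
    When a visitor opens the home page
    Then the page loads
"""

print(tagged_scenarios(FEATURE))
```

Only the first scenario is returned; each name in the result is a scenario whose tests should contain a matching security verification.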
+## STRIDE Threat Analysis
+For each new endpoint, data flow, or trust boundary a change introduces:
+| Threat | Question |
+|---|---|
+| **S**poofing | Can an attacker impersonate a user or service? Auth checks at every entry point? |
+| **T**ampering | Can input or data in transit be modified? Integrity validated (checksums, signatures, CSRF)? |
+| **R**epudiation | Are security-relevant actions logged? Could an attacker act without a trace? |
+| **I**nformation Disclosure | Could errors, logs, or responses leak sensitive data (stack traces, PII, tokens)? |
+| **D**enial of Service | Are there unbounded operations (large uploads, expensive queries, no rate limits)? |
+| **E**levation of Privilege | Can a user escalate to admin? Role/permission checks at the right layer? |
+Skip categories that don't apply. Don't manufacture threats.
+## OWASP Top 10 Surface Scan
+For changed files, do a lightweight scan against the OWASP Top 10 (2021) categories:
+| OWASP Category | What to check in the diff |
+|---|---|
+| A01: Broken Access Control | New endpoints missing auth decorators/middleware; direct object references without ownership checks |
+| A02: Cryptographic Failures | Weak hashing, missing encryption for sensitive data, hardcoded keys |
+| A03: Injection | String concatenation in SQL/commands/templates, `eval()`, `innerHTML` with user data |
+| A04: Insecure Design | Missing rate limiting on auth endpoints, no account lockout |
+| A05: Security Misconfiguration | Debug mode enabled, default credentials, overly permissive CORS |
+| A06: Vulnerable Components | New dependencies without version pins, known-vulnerable packages |
+| A07: Auth Failures | Weak password requirements, session tokens in URLs |
+| A08: Data Integrity Failures | Insecure deserialization (`pickle`, `yaml.load` without a safe loader), missing integrity checks |
+| A09: Logging Failures | Security events not logged, PII/secrets in log output |
+| A10: SSRF | User-controlled URLs in server-side HTTP requests without allowlist |
+Tag each finding with OWASP category and CWE ID.
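As one example of what the A03 row looks like in a diff, the sketch below contrasts string concatenation with parameter binding, using the standard-library `sqlite3` module. The table, the function names, and the payload are made up for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
connection.execute("INSERT INTO users (email) VALUES ('a@example.com')")
connection.execute("INSERT INTO users (email) VALUES ('b@example.com')")


def find_user_unsafe(email: str) -> list:
    # Flag in review: user input concatenated into the SQL text (A03, CWE-89).
    query = "SELECT id, email FROM users WHERE email = '" + email + "'"
    return connection.execute(query).fetchall()


def find_user_safe(email: str) -> list:
    # Parameter binding keeps the input as data, never as SQL.
    return connection.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


payload = "' OR '1'='1"
print(len(find_user_unsafe(payload)))  # 2: the injected condition matches every row
print(len(find_user_safe(payload)))    # 0: no row has this literal email
```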
+## CWE Quick Reference
+| Finding | OWASP / CWE |
+|---|---|
+| Missing auth checks | A01:2021 / CWE-862 |
+| SQL injection | A03:2021 / CWE-89 |
+| Command injection | A03:2021 / CWE-78 |
+| XSS | A03:2021 / CWE-79 |
+| Custom/weak crypto | A02:2021 / CWE-327, CWE-328 |
+| Hardcoded secrets | A07:2021 / CWE-798 |
+| SSRF | A10:2021 / CWE-918 |
+| Insecure deserialization | A08:2021 / CWE-502 |
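The table above could be carried into a review script as a plain lookup. This is only a sketch: `tag_finding`, the report format, and the file location in the usage line are hypothetical.

```python
# OWASP category and CWE IDs, copied from the quick-reference table.
CWE_REFERENCE = {
    "missing auth checks": ("A01:2021", ["CWE-862"]),
    "sql injection": ("A03:2021", ["CWE-89"]),
    "command injection": ("A03:2021", ["CWE-78"]),
    "xss": ("A03:2021", ["CWE-79"]),
    "custom/weak crypto": ("A02:2021", ["CWE-327", "CWE-328"]),
    "hardcoded secrets": ("A07:2021", ["CWE-798"]),
    "ssrf": ("A10:2021", ["CWE-918"]),
    "insecure deserialization": ("A08:2021", ["CWE-502"]),
}


def tag_finding(kind: str, location: str) -> str:
    """Format one finding with its OWASP category and CWE IDs."""
    key = kind.strip().lower()
    if key not in CWE_REFERENCE:
        # Unknown kinds are reported untagged rather than guessed at.
        return f"{location} [untagged] {kind}"
    category, cwe_ids = CWE_REFERENCE[key]
    return f"{location} [{category} / {', '.join(cwe_ids)}] {kind}"


print(tag_finding("SQL injection", "src/api/search.py:40"))
```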
+## Compliance Framework Verification
+Only applies when `project.compliance` is configured.
+- **`owasp`** — OWASP Top 10 risks addressed (see surface scan above)
+- **`pci-dss`** — No card numbers in logs, TLS for transmission, tokenization, audit trail, access controls on cardholder data
+- **`hipaa`** — Access controls + audit logging, encryption at rest/transit, minimum necessary access, BAA implications for third-party services
+- **`gdpr`** — Lawful basis identified, consent mechanism if needed, data subject rights (access, rectify, erase, port), retention limits, privacy by design
+- **`soc2`** — Audit logging for access and changes, availability monitoring, logical access controls, change management documentation
+- **`iso27001`** — Risk assessment documented, information classification applied, access control policy followed, incident response considered
+Missing compliance coverage on a tagged scenario is a **blocker**.

package/skills/references/testing-contracts.md ADDED

@@ -0,0 +1,93 @@
+# Testing & Contract Reference
+Loaded by skills that involve writing tests, mocking external services, or verifying contract compliance.
+## Mocking Strategy
+**Mock at the HTTP boundary, not at the client level.**
+- **DO mock**: the HTTP transport layer using the project's HTTP mocking library (check `config.tools` or existing test imports for: `responses`, `httpx_mock`, `nock`, `msw`, `wiremock`). Fixture responses must match the contract in `schema.yml`.
+- **DON'T mock**: your own client wrapper. If you mock `stripe_client.create_charge()`, you're testing that your code calls a function — not that it handles the real response shape. The client wrapper is the code under test.
+- **DON'T mock**: internal services within the same repo. Use the real code. Mocking between internal modules hides integration bugs that only surface in production.
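A minimal sketch of the distinction above, using only the standard library: the transport call (`urllib.request.urlopen`) is replaced, while the client wrapper runs for real. In an actual project, the HTTP mocking library already in use would take the place of `mock.patch`. The `create_charge` wrapper, the `api.example.test` host, the JSON request body, and the fixture values are all hypothetical.

```python
import io
import json
import urllib.request
from unittest import mock

# Hypothetical fixture: a concrete instance of the create_charge response contract.
FIXTURE = {"id": "ch_123", "status": "succeeded"}


def create_charge(amount: int, currency: str) -> dict:
    """Hypothetical client wrapper: this is the code under test, so it is not mocked."""
    body = json.dumps({"amount": amount, "currency": currency}).encode()
    request = urllib.request.Request(
        "https://api.example.test/v1/charges", data=body, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return {"id": payload["id"], "status": payload["status"]}


def test_create_charge_parses_fixture() -> None:
    # Replace only the transport; BytesIO stands in for the HTTP response body.
    fake_response = io.BytesIO(json.dumps(FIXTURE).encode())
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as transport:
        result = create_charge(1000, "usd")

    sent_request = transport.call_args.args[0]
    assert sent_request.get_method() == "POST"
    assert json.loads(sent_request.data) == {"amount": 1000, "currency": "usd"}
    assert result == {"id": "ch_123", "status": "succeeded"}


test_create_charge_parses_fixture()
```

Because the wrapper's own parsing runs against the fixture, a change in how it reads `id` or `status` would make the test fail, which a mock of `create_charge` itself could not detect.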
+## Fixture Management
+- Fixtures live alongside tests (e.g., `tests/fixtures/stripe_create_charge.json`)
+- One fixture per endpoint, named after the endpoint, not the test
+- Each fixture is a concrete instance of the `schema.yml` contract
+- When the contract changes, the fixture must change — stale fixtures are false-positive tests
+- Include at least one error response fixture per external API (matching `error_response` in `schema.yml`)
+## Contract Test Requirements
+Every external API integration needs contract tests. Run them against recorded/fixture responses (not live calls) so they pass locally without network access, and assert that:
+1. Every `required: true` response field is read and typed correctly in the client
+2. Request payloads match the documented shape (required fields present, types correct)
+3. Error response handling matches the documented `error_response` shape
+For contract regression tests: if the client starts reading a new field or stops sending a required field, the test must fail.
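One way to keep fixtures honest is to check each one against the contract it claims to instantiate. The sketch below assumes the `response` block of an endpoint has been parsed into a dict and that fields carry `required: true`; `contract_violations` and the type mapping are illustrative names, not Grimoire APIs.

```python
# Assumption: the create_charge `response` block, parsed from schema.yml,
# with `required: true` set on both fields.
RESPONSE_CONTRACT = {
    "id": {"type": "string", "required": True},
    "status": {"type": "string", "required": True},
}

# Assumption: a simple mapping from schema type names to Python types.
PYTHON_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def contract_violations(fixture: dict, contract: dict) -> list[str]:
    """Compare one fixture with its contract and list every mismatch."""
    violations = []
    for field, spec in contract.items():
        if field not in fixture:
            if spec.get("required"):
                violations.append(f"missing required field: {field}")
            continue
        expected = PYTHON_TYPES[spec["type"]]
        value = fixture[field]
        # bool is a subclass of int in Python, so reject it unless bool is expected.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            violations.append(f"{field}: expected {spec['type']}")
    # Fields the contract does not document make the fixture fictional.
    for field in sorted(fixture.keys() - contract.keys()):
        violations.append(f"undocumented field: {field}")
    return violations


print(contract_violations({"id": "ch_123", "status": "succeeded"}, RESPONSE_CONTRACT))
print(contract_violations({"id": 123}, RESPONSE_CONTRACT))
```

The first call prints an empty list; the second reports the wrong type for `id` and the missing `status`.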
+## Mocking Anti-Patterns
+- Mocking your own client wrapper and asserting it was called — tests wiring, not behavior
+- `unittest.mock.patch` on the function under test — replacing the thing you're testing
+- Fixture responses that don't match any documented contract — fictional, prove nothing
+- Mocking so aggressively that the test still passes when the production code is removed
+- Test creates a mock and asserts against the mock's return value (circular)
+## Verify Before Using
+Before importing a module, calling a function, or adding a dependency — confirm it exists.
+**Imports and functions:**
+- Check area docs' Reusable Code table first (exact paths and line numbers)
+- If importing from a file you haven't read, read it first
+- If an import fails, don't guess — read the actual module for the real export name
+**Dependencies and packages:**
+- Only use packages already declared in `package.json` / `requirements.txt` / `pyproject.toml` / equivalent
+- If a task requires a new package, confirm that it exists under that exact name (the plan should specify it)
+- Never guess at a package name
+**APIs and endpoints:**
+- Check `schema.yml` for external API contracts (real endpoints, methods, field names)
+- For internal APIs, read the area doc or route file — don't assume paths
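For Python projects, the "confirm it exists" step can be partly automated. The sketch below is one possible check, not a replacement for reading the module: `export_exists` is a hypothetical helper, and note that `import_module` executes the module's top-level code.

```python
import importlib
import importlib.util


def export_exists(module_name: str, attribute: str) -> bool:
    """Return True only if the module can be found and exposes the attribute."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name does not exist.
        return False
    if spec is None:
        return False
    module = importlib.import_module(module_name)
    return hasattr(module, attribute)


print(export_exists("json", "loads"))         # True
print(export_exists("json", "parse"))         # False: json has no such export
print(export_exists("no_such_pkg.sub", "x"))  # False: the package is not installed
```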
+## Step Definition Conventions
+Step definitions are organized by **domain concept**, NOT by feature file. One step file per feature file is an anti-pattern — steps should be reusable across features.
+**Before writing step definitions, check the project's existing test setup.** Read test config files, existing step definitions, and `package.json` / `requirements.txt` / `pyproject.toml` to determine which framework is in use and follow its conventions.
+**Key rules:**
+- NEVER create one step definition file per feature file
+- Given steps are most likely to be shared — put them in a common location
+- When/Then steps are more domain-specific — group by domain
+- If a step is used by 2+ features, move it to the shared/common file
+- Step definition bodies should be thin — delegate to helper functions, page objects, or API clients
+- **Match the project's existing patterns.** Don't introduce a new framework.
+Common patterns by ecosystem (use as reference, not gospel — follow the project's actual conventions):
+**Python (Behave):** `features/steps/` with `auth_steps.py`, `common_steps.py`, `environment.py`
+**Python (pytest-bdd):** `tests/conftest.py` for shared fixtures + `tests/step_defs/test_auth.py` per domain
+**JavaScript/TypeScript (Cucumber.js):** `features/step_definitions/` with `auth.steps.ts`, `common.steps.ts`, `features/support/world.ts`
+**React / Frontend (Playwright/Cypress + Cucumber):** `e2e/steps/` with domain step files + `e2e/pages/` for page objects
+## Step Definition Quality
+Every Then step must have a specific assertion with an exact expected value:
+- **Strong:** `assert result == "expected_value"`, `expect(status).toBe(302)`
+- **Weak:** `assert result is not None`, `expect(result).toBeDefined()`
+- **Trivial:** `assert True`, `pass`, empty body — always CRITICAL
+Anti-patterns:
+- `def step_impl(): pass` — empty body, always passes
+- Asserting against the return value of the function you just wrote (circular)
+- `assert True` or `assert response is not None` — trivially true
+- Catching exceptions in the step def so it never fails
+- No `assert`/`expect` in a Then step — CRITICAL
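The difference between a weak and a strong assertion shows up as soon as the code under test is wrong. In this sketch, `slugify` is a deliberately buggy, made-up helper, and the two `then_*` functions stand in for the bodies of Then steps (no BDD framework is used here).

```python
def slugify(title: str) -> str:
    """Deliberately buggy helper: it lowercases but forgets to replace spaces."""
    return title.lower()


def then_slug_weak(result: str) -> None:
    # Weak: passes for any non-None value, including the buggy one.
    assert result is not None


def then_slug_strong(result: str) -> None:
    # Strong: pins the exact expected value.
    assert result == "hello-world"


result = slugify("Hello World")
then_slug_weak(result)  # passes despite the bug

try:
    then_slug_strong(result)
except AssertionError:
    print("strong assertion caught the bug")
```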