@kiwidata/grimoire 0.1.5 → 0.2.0

Files changed (89)
  1. package/.claude-plugin/plugin.json +2 -2
  2. package/AGENTS.md +21 -25
  3. package/LICENSE +36 -0
  4. package/README.md +12 -4
  5. package/dist/cli/index.js +2 -41
  6. package/dist/cli/index.js.map +1 -1
  7. package/dist/cli/program.d.ts +4 -0
  8. package/dist/cli/program.d.ts.map +1 -0
  9. package/dist/cli/program.js +47 -0
  10. package/dist/cli/program.js.map +1 -0
  11. package/dist/commands/comment-lint.d.ts +3 -0
  12. package/dist/commands/comment-lint.d.ts.map +1 -0
  13. package/dist/commands/comment-lint.js +14 -0
  14. package/dist/commands/comment-lint.js.map +1 -0
  15. package/dist/commands/configure.d.ts.map +1 -1
  16. package/dist/commands/configure.js +2 -1
  17. package/dist/commands/configure.js.map +1 -1
  18. package/dist/core/branch-check.d.ts.map +1 -1
  19. package/dist/core/branch-check.js +2 -16
  20. package/dist/core/branch-check.js.map +1 -1
  21. package/dist/core/check.d.ts.map +1 -1
  22. package/dist/core/check.js +4 -11
  23. package/dist/core/check.js.map +1 -1
  24. package/dist/core/ci.d.ts.map +1 -1
  25. package/dist/core/ci.js +2 -2
  26. package/dist/core/ci.js.map +1 -1
  27. package/dist/core/comment-lint.d.ts +18 -0
  28. package/dist/core/comment-lint.d.ts.map +1 -0
  29. package/dist/core/comment-lint.js +215 -0
  30. package/dist/core/comment-lint.js.map +1 -0
  31. package/dist/core/doc-style.d.ts +1 -0
  32. package/dist/core/doc-style.d.ts.map +1 -1
  33. package/dist/core/doc-style.js +1 -1
  34. package/dist/core/doc-style.js.map +1 -1
  35. package/dist/core/docs.d.ts.map +1 -1
  36. package/dist/core/docs.js +4 -11
  37. package/dist/core/docs.js.map +1 -1
  38. package/dist/core/health.d.ts.map +1 -1
  39. package/dist/core/health.js +4 -6
  40. package/dist/core/health.js.map +1 -1
  41. package/dist/core/hooks.d.ts.map +1 -1
  42. package/dist/core/hooks.js +44 -41
  43. package/dist/core/hooks.js.map +1 -1
  44. package/dist/core/init.js +1 -0
  45. package/dist/core/init.js.map +1 -1
  46. package/dist/core/list.d.ts.map +1 -1
  47. package/dist/core/list.js +6 -9
  48. package/dist/core/list.js.map +1 -1
  49. package/dist/core/pr.d.ts.map +1 -1
  50. package/dist/core/pr.js +0 -8
  51. package/dist/core/pr.js.map +1 -1
  52. package/dist/core/status.d.ts.map +1 -1
  53. package/dist/core/status.js +5 -5
  54. package/dist/core/status.js.map +1 -1
  55. package/dist/core/update.d.ts.map +1 -1
  56. package/dist/core/update.js +23 -10
  57. package/dist/core/update.js.map +1 -1
  58. package/dist/core/validate.js +1 -1
  59. package/dist/core/validate.js.map +1 -1
  60. package/dist/utils/config.d.ts +2 -0
  61. package/dist/utils/config.d.ts.map +1 -1
  62. package/dist/utils/config.js +4 -0
  63. package/dist/utils/config.js.map +1 -1
  64. package/dist/utils/frontmatter.d.ts +6 -0
  65. package/dist/utils/frontmatter.d.ts.map +1 -0
  66. package/dist/utils/frontmatter.js +13 -0
  67. package/dist/utils/frontmatter.js.map +1 -0
  68. package/dist/utils/paths.d.ts +1 -1
  69. package/dist/utils/paths.d.ts.map +1 -1
  70. package/dist/utils/paths.js +5 -4
  71. package/dist/utils/paths.js.map +1 -1
  72. package/dist/utils/stdin.d.ts +3 -0
  73. package/dist/utils/stdin.d.ts.map +1 -0
  74. package/dist/utils/stdin.js +13 -0
  75. package/dist/utils/stdin.js.map +1 -0
  76. package/package.json +24 -3
  77. package/skills/grimoire-apply/SKILL.md +8 -1
  78. package/skills/grimoire-audit/SKILL.md +4 -1
  79. package/skills/grimoire-bug/SKILL.md +7 -3
  80. package/skills/grimoire-draft/SKILL.md +145 -211
  81. package/skills/grimoire-plan/SKILL.md +4 -28
  82. package/skills/grimoire-pr-review/SKILL.md +1 -0
  83. package/skills/grimoire-refactor/SKILL.md +1 -1
  84. package/skills/grimoire-review/SKILL.md +1 -1
  85. package/skills/grimoire-verify/SKILL.md +12 -0
  86. package/skills/references/artifact-map.md +45 -0
  87. package/skills/references/review-personas.md +9 -3
  88. package/skills/references/test-baseline.md +55 -0
  89. package/templates/draft.md +108 -0
@@ -41,6 +41,7 @@ Two modes:
  ### 2. Load Artifacts
  For change verification:
  - Read `manifest.md`, proposed `.feature` files, decision records, `tasks.md`
+ - Read `baseline.md` if present (the test state captured at change start) — it's how you tell a regression from a failure that was already red
 
  For baseline verification:
  - Read all `features/**/*.feature` and `.grimoire/decisions/*.md`
@@ -76,6 +77,17 @@ Flag issues:
  - Decision's Confirmation criteria not verifiable → WARNING
  - Decision consequences not addressed → WARNING
 
+ ### 3.C2 Regression vs Baseline
+
+ Run the configured suites (`config.tools.unit_test`, `config.tools.bdd_test`) and classify each failure against `baseline.md`:
+
+ - Failing now **and** in the baseline → **pre-existing**, already accepted by the user at change start. Not a regression. Do not blame the change.
+ - Failing now, **not** in the baseline → **regression** introduced by this change → CRITICAL. Must be fixed before the change finalizes.
+ - Passing now, failing in the baseline → incidentally fixed; note it, don't require it.
+ - **No `baseline.md` / baseline skipped** → you cannot classify. List all failures and say plainly they're untriaged. Do NOT assert "existing tests pass" or call anything "pre-existing" without a baseline to back it.
+
+ The rule: a failure is "pre-existing" only if it's in `baseline.md`. Otherwise it's the change's. Full protocol: `../references/test-baseline.md`.
+
  ### 3.D Test Quality Intelligence
 
  Go beyond "does a step definition exist?" to "would this test catch a real bug?"
@@ -0,0 +1,45 @@
+ # Artifact Map & Reading Discipline
+
+ Loaded by skills that read a change's specs before acting (`grimoire-plan`, `grimoire-draft`, `grimoire-design`, `grimoire-review`, `grimoire-pr-review`). This is the single home for **what each grimoire artifact is** and **how to read them**. Skills link here instead of restating it; they keep only the reading focus specific to their job.
+
+ ---
+
+ ## The artifacts
+
+ Per-change (under `.grimoire/changes/<change-id>/`):
+
+ - **`draft.md`** — the living design doc the change was designed on (diagram/sketch, decision ledger, pseudo-code, Decided/Open ledger). The single source the other artifacts were **projected** from at the end of `grimoire-draft`. Ephemeral: retained read-only as the agreed-design reference through the pipeline, deleted when `grimoire-apply` clears the change folder. Read it for the *intent and rationale* behind the projected artifacts; the features/constraints/decisions remain the authoritative homes.
+ - **`manifest.md`** — change summary, complexity level, and the Why. Level 3-4 also carry Assumptions, Pre-Mortem, and **Prior Art** (the build-vs-buy rationale). Generated from `draft.md` at projection.
+ - **`features/*.feature`** — behavioral specifications. Edited live in `features/` on the branch.
+ - **decision records** — architectural choices for this change, edited live in `.grimoire/decisions/`, including Cost of Ownership sections.
+ - **`tasks.md`** — the implementation plan (present once planned).
+ - **`data.yml`** — proposed schema changes (present only when the change touches the data model).
+
+ Project-wide (under `.grimoire/`):
+
+ - **`config.yaml`** — language, tools, conventions, `comment_style`, `commit_style`, `compliance`, `dep_audit`.
+ - **`docs/<area>.md`** — per-area Purpose, Boundaries, Conventions, and "Where New Code Goes". Intent and placement, not live structure.
+ - **`docs/data/schema.yml`** — the full data model: tables/collections, field types, relationships, indexes, external API contracts with `source:` pointers. Read this instead of individual model files.
+ - **`docs/context.yml`** — deployment environment, related services, infrastructure dependencies, CI/CD, observability. Tells you runtime constraints (Lambda → no long-running processes), cross-service boundaries (auth lives in a sibling service), and what's available (Redis, RabbitMQ).
+ - **`brand/tokens.json`**, **`brand/voice.md`** — design grounding (see `brand-tokens-format.md`).
+
+ ---
+
+ ## Reading discipline
+
+ **Grimoire docs first, codebase second.** `.grimoire/docs/` is a pre-computed map — where code lives, what utilities exist, what patterns to follow, what the data layer looks like. Read it *instead of* exploring raw source. Read specific source files only when the docs don't have what you need.
+
+ **Graph for live structure.** Area docs give intent and placement; they do not carry exact symbols. For function names, file paths, line numbers, reusable utilities, and call graphs, query the graph — `search_graph` / `get_code_snippet` / `get_architecture`. Combine the two: area doc says *where new code goes*, the graph says *what's already there to reuse*.
+
+ **Do NOT read the entire codebase for "context."** Area docs + data schema + the graph already give you specific paths and assertions. Reading dozens of source files wastes context and does not produce better output. Read specific source only to verify a detail the docs can't answer (exact signature, exact import path, existing step-definition setup).
+
+ ---
+
+ ## Staleness gate
+
+ For each area doc you load, compare its `last_updated` against `git log -1 --format=%ci <directory>`. If the doc is older than the most recent commit to its directory, it's stale — its paths, utility names, and patterns may be wrong.
+
+ - **Level 1-2:** warn (`Area doc for <area> is behind recent commits — rely on the graph for structure`) and proceed. Mark inferred paths with `<!-- inferred: area doc may be stale -->`.
+ - **Level 3-4:** blocker. Do not proceed until the user refreshes via `grimoire-discover` targeted refresh. Acting on stale docs at this complexity produces wrong paths and misses recent utilities — re-doing the work costs more than refreshing first.
+
+ If area docs don't exist at all, tell the user to run `/grimoire:discover` first.
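Editorial sketch of the staleness gate described in the hunk above. This is an illustration only, not code from the package: `Gate`, `gitCiToIso`, and `stalenessGate` are hypothetical names. It assumes `last_updated` is a date string that `Date.parse` accepts (for example `YYYY-MM-DD`) and that the commit timestamp is the raw output of `git log -1 --format=%ci <directory>`, obtained elsewhere.

```typescript
// Hypothetical result type: what the gate tells the calling skill to do.
type Gate = "fresh" | "warn" | "block";

// Convert git's %ci output ("2024-05-03 10:00:00 +0200") to strict ISO 8601,
// so parsing does not depend on engine-specific date formats.
function gitCiToIso(ci: string): string {
  const m = /^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) ([+-]\d{2})(\d{2})$/.exec(ci.trim());
  if (!m) throw new Error(`unrecognized %ci timestamp: ${ci}`);
  return `${m[1]}T${m[2]}${m[3]}:${m[4]}`;
}

// Compare an area doc's last_updated with the newest commit to its directory.
// Level 1-2 changes warn and proceed; level 3-4 changes are blocked.
function stalenessGate(
  lastUpdated: string,
  lastCommitCi: string,
  level: 1 | 2 | 3 | 4,
): Gate {
  const docTime = Date.parse(lastUpdated);
  const commitTime = Date.parse(gitCiToIso(lastCommitCi));
  if (Number.isNaN(docTime) || Number.isNaN(commitTime)) {
    throw new Error("unparseable timestamp");
  }
  if (docTime >= commitTime) return "fresh";
  return level <= 2 ? "warn" : "block";
}
```

Under these assumptions a date-only `last_updated` is read as midnight UTC, so a doc refreshed on the same day as a later commit would still be reported stale. Whether that is the intended behavior is not stated in the source.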
@@ -227,12 +227,18 @@ Every security finding gets OWASP 2021 + CWE tags. See CWE quick-reference in `.
 
  Skip if change is purely internal.
 
+ **Coverage-gap routing (apply before recommending any new artifact).** A coverage gap does NOT default to "write a `.feature`." Route each gap to its one home using the feature-file admission test in `../grimoire-draft/SKILL.md` (§ jurisdiction table + the four admission gates):
+ - A `.feature` scenario is warranted **only** for an actor-observable behavior that passes all four gates (external actor, observable outcome, domain language, survives reimplementation).
+ - An invariant — observability/logging guarantee, perf budget, security control, compliance rule — is a **constraint**. Recommend it be recorded/verified in `.grimoire/docs/constraints.md`, never as a new `.feature`.
+ - When a behavior gap belongs in features, the default is **extend an existing feature file** in the same domain — recommend a new file only if no existing file fits, and say which were considered. Don't propose a `.feature` per finding.
+ - Test gaps for already-specified behavior are a *missing test*, not a missing feature — recommend the test, not a new spec.
+
  Evaluate:
- - **Test presence**: Every new user-facing behavior has a test? Every scenario from linked feature file has step definitions?
+ - **Test presence**: Every new user-facing behavior has a test? Every scenario from a linked feature file has step definitions? Missing test = recommend the test; only recommend a new/extended `.feature` if the behavior itself is unspecified *and* passes the admission test.
  - **Test quality**: Tests asserting outputs, or just that code "ran"? Over-mocked tests = red flag.
  - **Negative paths**: For each happy path, is there a failure-path test?
- - **Edge cases**: Empty states, concurrent users, interruptions, boundary values?
- - **Observability**: New feature — how will it be debugged in prod? Structured logs / metrics / error surfaces?
+ - **Edge cases**: Empty states, concurrent users, interruptions, boundary values? A missing edge case is a missing test/scenario in the relevant existing feature — not grounds for a new feature file.
+ - **Observability**: New feature — how will it be debugged in prod? Structured logs / metrics / error surfaces? Observability is a **constraint**: verify it's asserted in `constraints.md`; do not recommend a `.feature` for it.
  - **Regression risk** *(PR/pre-commit)*: Which existing tests cover the touched code? Were any tests removed or weakened?
  - **Accessibility**: New UI — keyboard nav, aria labels, contrast?
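Editorial sketch of the coverage-gap routing added in this hunk, written as a small decision function. Nothing here comes from the package: `CoverageGap`, `Route`, and `routeCoverageGap` are hypothetical names, the order of the checks is an assumption, and `"not-a-feature"` is a placeholder for gaps that fail the admission gates, since the hunk does not name a home for them.

```typescript
// Hypothetical routing outcomes, one per "home" named in the hunk.
type Route =
  | "constraint"               // record/verify in .grimoire/docs/constraints.md
  | "missing-test"             // behavior is already specified; recommend the test
  | "extend-existing-feature"  // default when the behavior belongs in features
  | "new-feature-file"         // only when no existing feature file fits
  | "not-a-feature";           // fails the admission gates (placeholder outcome)

// Hypothetical description of a single finding.
interface CoverageGap {
  isInvariant: boolean; // observability, perf budget, security control, compliance rule
  behaviorAlreadySpecified: boolean;
  gates: {
    externalActor: boolean;
    observableOutcome: boolean;
    domainLanguage: boolean;
    survivesReimplementation: boolean;
  };
  existingFeatureFileFits: boolean;
}

function routeCoverageGap(gap: CoverageGap): Route {
  // Invariants are constraints, never new .feature files.
  if (gap.isInvariant) return "constraint";
  // A test gap for specified behavior is a missing test, not a missing spec.
  if (gap.behaviorAlreadySpecified) return "missing-test";
  // A scenario is warranted only if all four admission gates pass.
  const g = gap.gates;
  const passesAllGates =
    g.externalActor && g.observableOutcome && g.domainLanguage && g.survivesReimplementation;
  if (!passesAllGates) return "not-a-feature";
  // Prefer extending an existing file in the same domain.
  return gap.existingFeatureFileFits ? "extend-existing-feature" : "new-feature-file";
}
```

A reviewer applying the routing by hand would walk the same questions in some order; the function only makes explicit that each gap gets exactly one recommendation.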
 
@@ -0,0 +1,55 @@
+ # Test Baseline Reference
+
+ Loaded by skills that mutate code (`grimoire-apply`, `grimoire-bug`, `grimoire-refactor`) and the skill that checks for regressions (`grimoire-verify`).
+
+ ## Why
+
+ "That's a pre-existing failure" is unfalsifiable if you never recorded what was failing *before* you started. Without a baseline, verify diffs against nothing — a regression you introduced and a failure that was already red look identical, and the user finds out at the end instead of signing off at the start.
+
+ The fix is cheap: you already run the suite when you pick up a change. **Capture which tests were already failing, save it, and let the user accept it before any code is touched.** Verify then flags only *new* failures as regressions.
+
+ This is not a new gate. It's saving the result of a run you already do.
+
+ ## Capture (at the start of a code change)
+
+ Do this once, before writing the first test or touching production code — as part of the suite run you'd do anyway to understand the starting state.
+
+ 1. Run the configured suites: `config.tools.unit_test` and `config.tools.bdd_test`. Use what's configured; don't invent commands.
+ 2. Record the result to `.grimoire/changes/<change-id>/baseline.md` (ephemeral scaffolding, discarded with the change folder like `tasks.md`). For a bug fix with no change folder, record inline in the commit/test note and present to the user instead.
+ 3. Present the pre-existing failures to the user and get explicit acceptance before proceeding.
+
+ ### baseline.md format
+
+ ```markdown
+ # Test Baseline — <change-id>
+
+ captured: <date> # the day you ran it; if unavailable, omit
+ unit: <pass>/<total> passing command: <config.tools.unit_test>
+ bdd: <pass>/<total> passing command: <config.tools.bdd_test>
+
+ ## Pre-existing failures (accepted by user before change)
+ - <test id / name> — <one-line reason if known, else "pre-existing, cause unknown">
+ - ...
+
+ ## Notes
+ - <e.g. "unit suite not configured — baseline skipped for unit">
+ ```
+
+ If nothing was failing, say so explicitly: `## Pre-existing failures: none — clean baseline`.
+
+ ## Skip rules
+
+ - **No test command configured** for a suite → skip that suite, write `baseline skipped — no <suite> command configured` under Notes. Don't fabricate a command.
+ - **User opts out** → write `baseline skipped — user opted out`. Verify must then NOT claim a clean diff; it reports failures as unclassified (could be pre-existing or new).
+ - A skipped baseline is recorded, not silent. Verify needs to know it can't trust a diff.
+
+ ## Diff (at verify)
+
+ `grimoire-verify` reads `baseline.md` and classifies the current suite result against it:
+
+ - Test failing now **and** in baseline → **pre-existing** (already accepted; not a regression, don't blame the change).
+ - Test failing now, **not** in baseline → **regression** introduced by this change → CRITICAL, must fix before finalize.
+ - Test passing now, failing in baseline → incidentally fixed; note it, don't require it.
+ - **No baseline / baseline skipped** → state plainly that failures cannot be classified, and list them all for the user to triage. Do not assert "existing tests pass" or "pre-existing failure" without a baseline to back it.
+
+ The rule that replaces the old unfalsifiable claim: **a failure is "pre-existing" only if it's in `baseline.md`.** Otherwise it's yours.
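Editorial sketch of the "Diff (at verify)" classification from the hunk above. It is an illustration under stated assumptions, not the package's implementation: `Verdict`, `Classified`, and `classifyAgainstBaseline` are hypothetical names, test ids are assumed to be stable strings that match between the baseline and the current run, and a missing or skipped baseline is modeled as `null`.

```typescript
// Hypothetical verdicts mirroring the four cases in the hunk.
type Verdict = "pre-existing" | "regression" | "incidentally-fixed" | "untriaged";

interface Classified {
  test: string;
  verdict: Verdict;
}

// failingNow: ids of tests failing in the current run.
// baselineFailures: ids listed under "Pre-existing failures" in baseline.md,
// or null when there is no baseline or it was skipped.
function classifyAgainstBaseline(
  failingNow: string[],
  baselineFailures: string[] | null,
): Classified[] {
  // Without a baseline nothing can be called pre-existing or new.
  if (baselineFailures === null) {
    return failingNow.map((test): Classified => ({ test, verdict: "untriaged" }));
  }
  const baseline = new Set(baselineFailures);
  const now = new Set(failingNow);
  const result: Classified[] = [];
  for (const test of failingNow) {
    // In the baseline: already accepted. Not in the baseline: a regression.
    result.push({ test, verdict: baseline.has(test) ? "pre-existing" : "regression" });
  }
  for (const test of baselineFailures) {
    // Failing at capture, not failing now: note it, do not require it.
    if (!now.has(test)) result.push({ test, verdict: "incidentally-fixed" });
  }
  return result;
}
```

One caveat of this sketch: a baseline failure that no longer appears in the failing set is reported as incidentally fixed even if the test was deleted or skipped, which a real check would probably want to distinguish.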
@@ -0,0 +1,108 @@
+ ---
+ status: draft
+ change-id: <kebab-case-verb-led>
+ kind: greenfield | refactor
+ # NO complexity here. Complexity is an OUTPUT of design — scored at projection
+ # (after agreement) and written to manifest.md, never to this file.
+ ---
+
+ <!--
+ draft.md — the ONE living surface you design a change on.
+
+ This is where the whole change lives as a single coherent picture: diagram/sketch,
+ rationale, a decision ledger, pseudo-code, and an open-question ledger. You and the
+ user iterate HERE (this is the interview). Nothing is written to features/,
+ constraints.md, or decisions/ until the design is agreed — then it is PROJECTED into
+ those homes. This file is ephemeral: retained read-only as reference through the
+ pipeline, deleted when the change folder is cleared at grimoire-apply finalize. Git
+ history preserves it.
+
+ Required sections: At a glance · Why · Decisions · Decided / Open.
+ As-needed: Current state (REQUIRED for kind=refactor) · Sketches · Constraints · Cut.
+ Delete the guidance comments and any section that carries no weight for this change.
+ -->
+
+ # <change> — draft
+
+ **Date:** <YYYY-MM-DD> · **Provenance:** <prior passes / branches / source docs, if any>
+
+ ## At a glance
+
+ <!--
+ Make the whole change graspable in one screen. Pick the medium that fits:
+ - greenfield → an ASCII flow / box diagram of the system or pipeline
+ - refactor → a pseudo-code sketch of the target shape (annotate with decision IDs)
+ If grimoire-design (Figma) output exists for this change, its visual + component/state
+ material anchors this section.
+ -->
+
+ ## Why
+
+ <!--
+ greenfield: the objective (what problem, how you'll know it's solved) + non-goals.
+ refactor: the pain points justifying the change — each a NAMED, LOCATED smell with a
+ file:line breadcrumb, not an abstract complaint.
+ -->
+
+ ## Current state <!-- REQUIRED for kind=refactor; omit for greenfield -->
+
+ <!--
+ How the touched system works TODAY, with breadcrumbs to live code. Mandate the codebase
+ graph: index_repository first if needed, then search_graph / trace_path / get_code_snippet
+ for qualified names, callers, and call chains. Follow with a severity-ranked Gaps/drift
+ list — the audit findings that motivate the redesign.
+ -->
+
+ ## Decisions
+
+ <!--
+ ONE inline ledger. Each row: a stable ID, the decision, and its WHY. Use sub-IDs (D1a)
+ and cross-references (D7 cites D3) freely — this is how coupled decisions stay legible
+ in one place. At projection, each NOVEL decision becomes a MADR (novelty gate applies —
+ obvious tooling picks fold into the baseline ADR, they don't mint a record).
+ -->
+
+ | # | Decision | Why |
+ |----|----------|-----|
+ | D1 | | |
+
+ ## Sketches <!-- as-needed; expected for refactor -->
+
+ <!--
+ Pseudo-code / key considerations for the target shape. Mark it "not final" — it is shape,
+ not contract. Annotate lines with the decision IDs they realize (# D8).
+ -->
+
+ ## Constraints <!-- as-needed -->
+
+ <!--
+ Invariants this change must hold (security / NFR / observability / compliance). One line
+ each: assertion · rationale · how-verified. These project to .grimoire/docs/constraints.md
+ — NOT to a feature file.
+ -->
+
+ ## Decided / Open
+
+ <!--
+ Two lists. Decided = settled calls (cross-reference the D-IDs). Open = live unknowns.
+ Resolve an Open IN PLACE — rewrite it as `RESOLVED: <answer> (Dn)`, do not delete it.
+ The struck-through trail is the record of the thinking. Design is "done" when Decided is
+ stable and Open is empty-or-deferred.
+ -->
+
+ **Decided:**
+ -
+
+ **Open:**
+ -
+
+ ## Cut / deferred <!-- as-needed; greenfield-leaning -->
+
+ <!--
+ What was deliberately removed or deferred, so it is not silently lost. Table:
+ cut · what it was · why cut · re-add when. Design by subtraction, recorded.
+ -->
+
+ | Cut | What it was | Why cut | Re-add when |
+ |-----|-------------|---------|-------------|
+ | | | | |