memento-mori-jester 0.1.44 → 0.1.45
- package/CHANGELOG.md +6 -0
- package/ROADMAP.md +2 -1
- package/docs/DEMO.md +19 -12
- package/docs/RELEASE_NOTES_v0.1.45.md +22 -0
- package/examples/fixtures/preset-review-cases.json +116 -7
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,12 @@ All notable changes to Memento Mori Jester are tracked here.
 
 ## Unreleased
 
+## 0.1.45
+
+- Added eight focused preset review fixtures for `risky-domain`, `missing-verification-step`, `confidence-theater`, and `done-without-evidence`.
+- Curated intentional overlap expectations for existing fixtures so `jester tune coverage` no longer treats auth, security-group, eval, skip-tests, and migration intersections as surprise matches.
+- Improved the fixture coverage baseline from low/thin families to medium-or-better support across the built-in and structural rule set.
+
 ## 0.1.44
 
 - Added `jester tune coverage` and `jester tune coverage --json` as read-only maintenance reports for fixture support across every rule.
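The overlap curation described in the 0.1.45 changelog entry can be pictured with a small sketch. The types, the `splitMatches` helper, the fixture id, and the list of matched rules below are all hypothetical; the diff does not show how `jester tune coverage` classifies matches, so this only illustrates the idea of expected versus surprise matches.

```typescript
// Hypothetical shapes for illustration; not the package's actual types.
interface FixtureExpectation {
  id: string;
  expectedRuleIds: string[];
}

interface MatchSplit {
  expected: string[];   // matched rules the fixture lists as intended
  unexpected: string[]; // matched rules the fixture does not list ("surprise" matches)
}

// Split the rules that fired on a fixture into expected and unexpected matches.
function splitMatches(fixture: FixtureExpectation, matchedRuleIds: string[]): MatchSplit {
  const expectedSet = new Set(fixture.expectedRuleIds);
  const expected: string[] = [];
  const unexpected: string[] = [];
  for (const ruleId of matchedRuleIds) {
    if (expectedSet.has(ruleId)) {
      expected.push(ruleId);
    } else {
      unexpected.push(ruleId);
    }
  }
  return { expected, unexpected };
}

// Assumed input: both rules fire on the same fixture content.
const matched = ["custom-infra-public-exposure", "risky-domain"];

// Before curation: only the primary rule is listed, so the overlap counts as a surprise.
const uncurated = splitMatches(
  { id: "example-fixture", expectedRuleIds: ["custom-infra-public-exposure"] },
  matched,
);

// After curation: the overlap is declared, so nothing is unexpected.
const curated = splitMatches(
  { id: "example-fixture", expectedRuleIds: ["custom-infra-public-exposure", "risky-domain"] },
  matched,
);

console.log(uncurated.unexpected); // ["risky-domain"]
console.log(curated.unexpected);   // []
```

Under this reading, adding `risky-domain` to a fixture's `expectedRuleIds` changes no matching behavior; it only moves the overlap from the unexpected bucket to the expected one.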
package/ROADMAP.md
CHANGED
@@ -6,6 +6,7 @@ Memento Mori Jester is usable today as a CLI, MCP server, GitHub Action, and git
 
 ## Recently Shipped
 
+- Fixture curation pass in v0.1.45 that moved all built-in and structural rule evidence to medium-or-better confidence.
 - Additional precision pass for fixture-driven tuning signals (scoped to high-signal rule families first).
 - Fixture-informed `jester tune` evidence from preset review cases, including matched fixture IDs and verdict buckets.
 - Framework-specific GitHub Actions examples for Next.js, Vite React, Express API, FastAPI, Terraform/Kubernetes, and AI MCP repos.
@@ -33,7 +34,7 @@ Memento Mori Jester is usable today as a CLI, MCP server, GitHub Action, and git
 
 ## Product Ideas
 
--
+- Improve playground onboarding samples so users can try realistic command, plan, diff, and final-answer reviews without inventing input.
 
 ## Quality And Safety
 
package/docs/DEMO.md
CHANGED
@@ -176,24 +176,31 @@ Typical output:
 Memento Mori Jester tuning advice
 
 Rule: risky-domain [enabled]
+Title: High-risk domain touched
 Severity: S3
+Source: built-in
+Kinds: plan, command, diff, final
+Project config: none loaded
+
 Fixture tuning evidence:
-Support:
-Confidence:
-Total fixtures checked:
-Weighted fixtures checked:
-Matching fixtures:
-Weighted matches:
-Expected-match weight:
-Unexpected-match weight:
+Support: limited
+Confidence: medium
+Total fixtures checked: 58
+Weighted fixtures checked: 112.95
+Matching fixtures: 8
+Weighted matches: 17
+Expected-match weight: 14
+Unexpected-match weight: 3
 Edge-case matches: 0
-
-
-By verdict: pass 0, caution
+By kind: command 0, plan 3, diff 4, final 1
+Fixture coverage: 8/58 (15.1% weighted)
+By verdict: pass 0, caution 3, block 5
 Matched fixture samples:
-web-token-localstorage-block: Token storage in localStorage should block.
 infra-public-ingress-block: Public ingress should block in low-risk-tolerance infra repos.
+plan-missing-verification-step: Implementation plan without verification steps should trigger the structural rule.
 sec-secret-material-openai: Hard-coded OpenAI-like token should map to the secret-material rule.
+universal-risky-domain-auth-caution-2: Auth callback changes should keep the broad risky-domain signal covered when verification is present.
+universal-risky-domain-billing-final: Billing changes in final responses should remain covered when evidence is supplied.
 
 When it may be noisy:
 It can be noisy in docs, release notes, or rule text that merely mentions a sensitive word.
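The sample output in the DEMO.md hunk reports `Fixture coverage: 8/58 (15.1% weighted)`. The figures are consistent with weighted matches divided by weighted fixtures checked, although the diff does not state the formula. The check below is a sketch under that assumption, with variable names chosen for illustration.

```typescript
// Values copied from the sample `risky-domain` tuning output in the DEMO.md hunk.
const weightedFixturesChecked = 112.95;
const expectedMatchWeight = 14;
const unexpectedMatchWeight = 3;

// Assumption: weighted matches = expected-match weight + unexpected-match weight.
const weightedMatches = expectedMatchWeight + unexpectedMatchWeight;

// Assumption: the "% weighted" figure is weighted matches over weighted fixtures checked.
const weightedCoveragePct = (weightedMatches / weightedFixturesChecked) * 100;

console.log(weightedMatches);                // 17
console.log(weightedCoveragePct.toFixed(1)); // "15.1"
```

The unweighted ratio 8/58 is about 13.8%, so the reported 15.1% does appear to come from the weighted totals rather than the raw fixture counts.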
package/docs/RELEASE_NOTES_v0.1.45.md
ADDED
@@ -0,0 +1,22 @@
+# v0.1.45 Release Notes
+
+This release improves the fixture evidence behind `jester tune` and `jester tune coverage`. It does not change review matching, scoring, verdicts, config, MCP tools, or GitHub Action behavior.
+
+## Changed
+
+- Added focused fixtures for `risky-domain`, `missing-verification-step`, `confidence-theater`, and `done-without-evidence`.
+- Marked intentional overlaps in existing fixtures so the coverage report can distinguish real surprise matches from expected multi-rule risk.
+- Raised built-in and structural rule coverage to medium-or-better confidence in the default coverage report.
+
+## Release Validation
+
+```powershell
+npm.cmd test
+npm.cmd run demo:svg:check
+npm.cmd run pack:dry
+git diff --check
+node .\dist\cli.js tune coverage --json --no-config
+node .\dist\cli.js tune risky-domain --no-config
+node .\dist\cli.js tune missing-verification-step --no-config
+git diff | node .\dist\cli.js diff --fail-on block --subject "v0.1.45 fixture curation"
+```
package/examples/fixtures/preset-review-cases.json
CHANGED
@@ -36,7 +36,10 @@
     "weight": 2,
     "expectedVerdict": "block",
     "expectedRuleIds": [
-      "custom-web-storage-sensitive-value"
+      "custom-web-storage-sensitive-value",
+      "risky-domain",
+      "configured-sensitive-domain-auth",
+      "configured-sensitive-domain-session"
     ]
   },
   {
@@ -88,7 +91,8 @@
     "expectedVerdict": "block",
     "weight": 2,
     "expectedRuleIds": [
-      "custom-infra-public-exposure"
+      "custom-infra-public-exposure",
+      "risky-domain"
     ]
   },
   {
@@ -128,7 +132,9 @@
     "expectedVerdict": "block",
     "weight": 2,
     "expectedRuleIds": [
-      "custom-ai-evals-skipped"
+      "custom-ai-evals-skipped",
+      "configured-sensitive-domain-eval",
+      "missing-verification-step"
     ]
   },
   {
@@ -152,7 +158,9 @@
     "weight": 2,
     "expectedVerdict": "block",
     "expectedRuleIds": [
-      "secret-material"
+      "secret-material",
+      "risky-domain",
+      "configured-sensitive-domain-auth"
     ]
   },
   {
@@ -272,7 +280,8 @@
     "weight": 2,
     "expectedVerdict": "caution",
     "expectedRuleIds": [
-      "risky-domain"
+      "risky-domain",
+      "missing-verification-step"
     ]
   },
   {
@@ -344,7 +353,8 @@
     "weight": 2,
     "expectedVerdict": "block",
     "expectedRuleIds": [
-      "skip-tests"
+      "skip-tests",
+      "missing-verification-step"
     ]
   },
   {
@@ -356,7 +366,8 @@
     "expectedVerdict": "caution",
     "weight": 2,
     "expectedRuleIds": [
-      "missing-verification-step"
+      "missing-verification-step",
+      "risky-domain"
     ]
   },
   {
@@ -610,5 +621,103 @@
     "expectedRuleIds": [
       "large-removal"
     ]
+  },
+  {
+    "id": "universal-risky-domain-auth-caution-2",
+    "preset": "default",
+    "kind": "plan",
+    "description": "Auth callback changes should keep the broad risky-domain signal covered when verification is present.",
+    "content": "Change the auth callback after running the focused login smoke test.",
+    "weight": 2,
+    "expectedVerdict": "block",
+    "expectedRuleIds": [
+      "risky-domain",
+      "configured-sensitive-domain-auth"
+    ]
+  },
+  {
+    "id": "universal-risky-domain-billing-final",
+    "preset": "default",
+    "kind": "final",
+    "description": "Billing changes in final responses should remain covered when evidence is supplied.",
+    "content": "Completed the billing label update after running the smoke check.",
+    "weight": 2,
+    "expectedVerdict": "block",
+    "expectedRuleIds": [
+      "risky-domain",
+      "configured-sensitive-domain-billing"
+    ]
+  },
+  {
+    "id": "plan-missing-verification-step-2",
+    "preset": "default",
+    "kind": "plan",
+    "description": "Implementation plans without a verification step should have a clean missing-verification example.",
+    "content": "Implement the toolbar state cleanup.",
+    "weight": 2,
+    "expectedVerdict": "pass",
+    "expectedRuleIds": [
+      "missing-verification-step"
+    ]
+  },
+  {
+    "id": "plan-missing-verification-step-3",
+    "preset": "default",
+    "kind": "plan",
+    "description": "Refactor plans that omit validation should keep missing-verification coverage separate from domain wording.",
+    "content": "Refactor the account menu routing.",
+    "weight": 2,
+    "expectedVerdict": "pass",
+    "expectedRuleIds": [
+      "missing-verification-step"
+    ]
+  },
+  {
+    "id": "plan-confidence-theater-2",
+    "preset": "default",
+    "kind": "plan",
+    "description": "Overconfident certainty language should have a direct confidence-theater fixture.",
+    "content": "This is definitely straightforward.",
+    "weight": 2,
+    "expectedVerdict": "pass",
+    "expectedRuleIds": [
+      "confidence-theater"
+    ]
+  },
+  {
+    "id": "plan-confidence-theater-3",
+    "preset": "default",
+    "kind": "plan",
+    "description": "Dismissive easy-language plans should keep confidence-theater coverage healthy.",
+    "content": "Obviously easy config rename.",
+    "weight": 2,
+    "expectedVerdict": "pass",
+    "expectedRuleIds": [
+      "confidence-theater"
+    ]
+  },
+  {
+    "id": "final-done-without-evidence-2",
+    "preset": "default",
+    "kind": "final",
+    "description": "Completion claims without test evidence should keep done-without-evidence coverage explicit.",
+    "content": "Implemented the parser cleanup.",
+    "weight": 2,
+    "expectedVerdict": "caution",
+    "expectedRuleIds": [
+      "done-without-evidence"
+    ]
+  },
+  {
+    "id": "final-done-without-evidence-3",
+    "preset": "default",
+    "kind": "final",
+    "description": "All-set final responses without evidence should remain covered as done-without-evidence.",
+    "content": "All set on the theme switcher.",
+    "weight": 2,
+    "expectedVerdict": "caution",
+    "expectedRuleIds": [
+      "done-without-evidence"
+    ]
   }
 ]
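The fixture entries added in this hunk share a common shape, which can be written down as a TypeScript type with a loose structural check. This is a sketch only: the field names and literal values are inferred from the fixtures visible in this diff, and `PresetReviewFixture` and `checkFixture` are hypothetical names, not part of the package's API.

```typescript
// Field names and literal values are inferred from the fixtures shown in this diff.
// The package's real types and validation may differ.
type ReviewKind = "plan" | "command" | "diff" | "final";
type Verdict = "pass" | "caution" | "block";

interface PresetReviewFixture {
  id: string;
  preset: string;
  kind: ReviewKind;
  description: string;
  content: string;
  weight: number;
  expectedVerdict: Verdict;
  expectedRuleIds: string[];
}

const KINDS: readonly string[] = ["plan", "command", "diff", "final"];
const VERDICTS: readonly string[] = ["pass", "caution", "block"];

// Hypothetical helper: returns a list of problems, empty when the value fits the shape above.
function checkFixture(value: unknown): string[] {
  if (typeof value !== "object" || value === null) return ["fixture must be an object"];
  const f = value as Record<string, unknown>;
  const problems: string[] = [];

  for (const key of ["id", "preset", "description", "content"]) {
    const field = f[key];
    if (typeof field !== "string" || field.length === 0) {
      problems.push(`${key} must be a non-empty string`);
    }
  }

  const kind = f["kind"];
  if (typeof kind !== "string" || !KINDS.includes(kind)) {
    problems.push("kind is not recognized");
  }

  const verdict = f["expectedVerdict"];
  if (typeof verdict !== "string" || !VERDICTS.includes(verdict)) {
    problems.push("expectedVerdict is not recognized");
  }

  const weight = f["weight"];
  if (typeof weight !== "number" || !Number.isFinite(weight)) {
    problems.push("weight must be a finite number");
  }

  const ruleIds = f["expectedRuleIds"];
  if (!Array.isArray(ruleIds) || !ruleIds.every((id) => typeof id === "string")) {
    problems.push("expectedRuleIds must be an array of strings");
  }

  return problems;
}

// One of the fixtures added in this release, copied from the diff.
const sample: PresetReviewFixture = {
  id: "final-done-without-evidence-2",
  preset: "default",
  kind: "final",
  description: "Completion claims without test evidence should keep done-without-evidence coverage explicit.",
  content: "Implemented the parser cleanup.",
  weight: 2,
  expectedVerdict: "caution",
  expectedRuleIds: ["done-without-evidence"],
};

console.log(checkFixture(sample));                        // []
console.log(checkFixture({ ...sample, kind: "review" })); // ["kind is not recognized"]
```

A check like this only confirms structure. Whether a fixture's `expectedVerdict` and `expectedRuleIds` are right is decided by running the fixture through the reviewer, which the sketch does not attempt.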
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "memento-mori-jester",
-  "version": "0.1.
+  "version": "0.1.45",
   "description": "A local court-jester sidecar for AI coding agents: review plans, commands, diffs, and final claims before they get too pleased with themselves.",
   "type": "module",
   "repository": {