rhachet-roles-bhuild 0.14.1 → 0.14.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,55 @@
+ reviews:
+   self:
+     - slug: has-critical-paths-identified
+       say: |
+         double-check: did you identify the critical paths?
+
+         - are the happy paths marked as critical?
+         - for each critical path, is it clear why it must be frictionless?
+         - did you consider what would happen if each critical path failed?
+
+         for each critical path, verify pit of success:
+         - narrower inputs: can we constrain inputs to prevent misuse?
+         - convenient: can we infer inputs rather than require them?
+         - expressive: does it pull users into the inferred happy path while still allowing them to express differences?
+         - failsafes: what happens when things go wrong? does it recover gracefully?
+         - failfasts: does it fail early and clearly when inputs are invalid?
+         - idempotency: can the operation be retried safely?
+
+         critical paths are the "golden paths" — the flows that most users take.
+         if these aren't frictionless, users will fail. fix the friction now.
+
+     - slug: has-ergonomics-reviewed
+       say: |
+         double-check: did you review the ergonomics?
+
+         for each input/output pair:
+         - does the input feel natural? if not, how can we simplify it?
+         - does the output feel natural? if not, what would be clearer?
+         - is there any friction? if so, how can we remove it?
+
+         pit of success principles:
+         - intuitive design: can users succeed without documentation?
+         - convenient: can we infer inputs rather than require them?
+         - expressive: does it pull users into the inferred happy path while still allowing them to express differences?
+         - composable: can this be combined with other operations easily?
+         - lower trust contracts: do we validate at boundaries?
+         - deeper behavior: do we handle edge cases gracefully?
+
+         awkward inputs and outputs are bugs. fix them now, before implementation.
+         every friction point you leave becomes a support ticket later.
+
+     - slug: has-play-test-convention
+       say: |
+         double-check: are journey tests named correctly?
+
+         journey test files should use the `.play.test.ts` suffix:
+         - `feature.play.test.ts` — journey test
+         - `feature.play.integration.test.ts` — if the repo requires an integration runner
+         - `feature.play.acceptance.test.ts` — if the repo requires an acceptance runner
+
+         this distinguishes journey tests (step-by-step user experience tests)
+         from unit tests (`.test.ts`) and integration tests (`.integration.test.ts`).
+
+         if the repo doesn't support `.play.test.ts` directly, plan to use
+         `.play.integration.test.ts` or `.play.acceptance.test.ts` instead.
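for illustration, a minimal `jest.config.ts` sketch of how a repo might route `.play.test.ts` files to their own runner; the project split and patterns are assumptions, not part of this package:

```ts
// hypothetical jest.config.ts fragment: split journey tests into their own
// project; the patterns and project names are assumptions, adjust per repo
import type { Config } from 'jest';

const config: Config = {
  projects: [
    {
      displayName: 'unit',
      testMatch: ['**/*.test.ts'],
      // keep journey/integration/acceptance variants out of the unit runner
      testPathIgnorePatterns: ['\\.play\\.', '\\.integration\\.', '\\.acceptance\\.'],
    },
    {
      displayName: 'journey',
      testMatch: ['**/*.play.test.ts'],
    },
  ],
};

export default config;
```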
@@ -14,12 +14,118 @@ for each user experience in the vision, define how it will be reproduced in tests
 
 ---
 
+ ## journey test sketches
+
+ for each experience, sketch the journey test with full BDD structure.
+
+ ### structure
+
+ journey tests use `given/when/then` blocks with `[tN]` labels:
+
+ ```
+ given('[case1] {scenario description}')
+   when('[t0] before any changes')
+     then('{precondition holds}')
+     then('input/output matches snapshot') ← snapshot!
+   when('[t1] {first action}')
+     then('{expected outcome}')
+     then('input/output matches snapshot') ← snapshot!
+   when('[t2] {second action}')
+     then('{expected outcome}')
+     then('input/output matches snapshot') ← snapshot!
+ ```
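for illustration, a minimal runnable sketch of that structure under jest; the `given/when/then` names are assumed here to be plain local aliases for jest globals, not a specific BDD library:

```ts
// feature.play.test.ts: a hypothetical journey test skeleton; the
// given/when/then names are local aliases for jest globals
const given = describe;
const when = describe;
const then = it;

given('[case1] a fresh workspace without the feature', () => {
  when('[t0] before any changes', () => {
    then('the precondition holds', () => {
      expect(true).toBe(true); // placeholder: assert the real precondition here
    });
    then('input/output matches snapshot', () => {
      expect({ state: 'before' }).toMatchSnapshot(); // capture the before state
    });
  });

  when('[t1] the first action runs', () => {
    then('the expected outcome occurs', () => {
      expect(true).toBe(true); // placeholder: assert the real outcome here
    });
    then('input/output matches snapshot', () => {
      expect({ state: 'after-t1' }).toMatchSnapshot(); // capture what the caller sees
    });
  });
});
```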
+
+ ### step table
+
+ for each journey, create a step table:
+
+ | step | action | user sees |
+ |------|--------|-----------|
+ | t0 | before any changes | {describe what user sees} |
+ | t1 | {first action} | {describe what user sees} |
+ | t2 | {second action} | {describe what user sees} |
+
+ ### input/output pairs
+
+ for each step, document:
+ - **input**: what the caller provides
+ - **output**: what the caller receives (terminal, screen, response)
+
+ example (CLI):
+ ```
+ #### t1 success case (snapshot target)
+ $ rhx init.behavior --name my-feature
+
+ init.behavior
+
+ created .behavior/v2024_03_12.my-feature/
+ ├─ 0.wish.md
+ └─ ... (more files)
+ ```
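for illustration, a minimal jest sketch of how the CLI pair above could be captured as a snapshot; it assumes the `rhx` binary is on PATH and scrubs the date-stamped directory name so the snapshot stays stable:

```ts
// hypothetical capture of the CLI step above; assumes the `rhx` binary
// is installed and on PATH, and that jest is the test runner
import { execSync } from 'node:child_process';

it('[t1] init.behavior success case', () => {
  const stdout = execSync('rhx init.behavior --name my-feature').toString();

  // scrub the date-stamped directory name so the snapshot stays stable
  const scrubbed = stdout.replace(/v\d{4}_\d{2}_\d{2}/g, '<date>');
  expect(scrubbed).toMatchSnapshot();
});
```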
+
+ example (SDK):
+ ```
+ #### t1 success case (snapshot target)
+ // input
+ const customer = await sdk.createCustomer({ email: 'test@example.com' });
+
+ // output
+ { id: 'cus_abc123', email: 'test@example.com', status: 'active' }
+ ```
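and a minimal jest sketch of asserting the SDK pair above; the `sdk` client shape is a hypothetical stand-in declared from the example, and the generated id is normalized so the snapshot is deterministic:

```ts
// hypothetical sdk shape from the example above; a real client would replace this
declare const sdk: {
  createCustomer(input: { email: string }): Promise<{ id: string; email: string; status: string }>;
};

it('[t1] createCustomer success case', async () => {
  const customer = await sdk.createCustomer({ email: 'test@example.com' });

  // normalize the generated id so the snapshot stays deterministic across runs
  expect({ ...customer, id: '<id>' }).toMatchSnapshot();
});
```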
+
+ ### snapshot coverage plan
+
+ mark which outputs need `.snap` files:
+
+ - [ ] t0 before state → `.snap`
+ - [ ] t1 success input/output → `.snap`
+ - [ ] t1 error input/output → `.snap`
+ - [ ] t2 after state → `.snap`
+
+ ### file convention
+
+ journey test files use the `.play.test.ts` suffix:
+ - `feature.play.test.ts` — journey test
+ - `feature.play.integration.test.ts` — journey test run as integration
+ - `feature.play.acceptance.test.ts` — journey test run as acceptance
+
+ this distinguishes journey tests from unit tests (`.test.ts`).
+
+ ---
+
+ ## critical paths
+
+ identify the happy paths that must be frictionless.
+
+ | critical path | description | why critical |
+ |---------------|-------------|--------------|
+ | {path 1} | {what user does} | {why this must work} |
+ | {path 2} | {what user does} | {why this must work} |
+
+ critical paths are the "golden paths" — the main flows that most users take.
+ if these fail or have friction, the product fails.
+
+ ---
+
+ ## ergonomics review
+
+ for each input/output pair, review:
+ - does the input feel natural? is it what the user would expect to provide?
+ - does the output feel natural? is it what the user would expect to see?
+ - is there friction? what could be smoother?
+
+ | journey | input ergonomics | output ergonomics | friction notes |
+ |---------|------------------|-------------------|----------------|
+ | {journey 1} | {natural / awkward} | {natural / awkward} | {any friction} |
+
+ ---
+
 ## reproduction feasibility
 
 for each experience, confirm it can be reproduced:
 - what test utilities are available?
 - what setup is required?
- - show a concrete test sketch
+ - show a concrete test sketch (use the journey structure above)
 
 ---
 
@@ -0,0 +1,51 @@
+ reviews:
+   self:
+     - slug: has-complete-implementation-record
+       say: |
+         double-check: did you document everything that was implemented?
+
+         - is every file change recorded in the filediff tree?
+         - is every codepath change recorded in the codepath tree?
+         - is every test recorded in the test coverage section?
+
+         silent changes are dangerous. if it's not documented, it didn't happen.
+         go back and check git diff against origin/main.
+
+     - slug: has-divergence-analysis
+       say: |
+         double-check: did you find all the divergences?
+
+         compare blueprint vs implementation for each section:
+         - summary: does the actual match the declared?
+         - filediff: are all files accounted for?
+         - codepath: are all codepaths accounted for?
+         - test coverage: are all tests accounted for?
+
+         be skeptical. assume you missed something.
+         what would a hostile reviewer find that you overlooked?
+
+     - slug: has-divergence-addressed
+       say: |
+         double-check: did you address each divergence properly?
+
+         for each divergence:
+         - if repaired: did you actually make the fix? is it visible in git?
+         - if backed up: is the rationale convincing? would a skeptic accept it?
+
+         question each backup skeptically:
+         - is this truly an improvement, or just laziness?
+         - did we just not want to do the work the blueprint required?
+         - could this divergence cause problems later?
+
+         a backup without strong rationale is a defect. repair it instead.
+
+     - slug: has-no-silent-scope-creep
+       say: |
+         double-check: did any scope creep into the implementation?
+
+         - did you add features not in the blueprint?
+         - did you change things "while you were in there"?
+         - did you refactor code unrelated to the wish?
+
+         scope creep is a divergence. document it and address it.
+         enumerate each with a [repair] or [backup] decision in the review file.
@@ -0,0 +1,88 @@
+ evaluate what was implemented against the blueprint
+
+ .what = articulate exactly what was implemented, then check for divergences from the blueprint.
+
+ .why = the blueprint declared what the execution would adhere to.
+ - divergences may be intentional improvements or accidental drift
+ - each divergence must be either repaired or backed up with rationale
+ - this gate prevents silent deviations from the approved design
+
+ ---
+
+ reference the blueprint:
+ - $BEHAVIOR_DIR_REL/3.3.1.blueprint.product.v1.i1.md
+
+ ---
+
+ ## summary (as implemented)
+
+ state what was actually built. mirror the blueprint summary structure.
+
+ ---
+
+ ## filediff tree (as implemented)
+
+ include a treestruct of the filediffs that were actually made.
+
+ **legend:**
+ - `[+]` created — file created
+ - `[~]` updated — file updated
+ - `[-]` deleted — file deleted
+
+ ---
+
+ ## codepath tree (as implemented)
+
+ include a treestruct of the codepaths that were actually implemented.
+
+ **legend:**
+ - `[+]` created — codepath created
+ - `[~]` updated — codepath updated
+ - `[○]` retained — codepath retained
+ - `[-]` deleted — codepath deleted
+ - `[←]` reused — codepath reused from elsewhere
+ - `[→]` ejected — codepath decomposed for reuse
+
+ ---
+
+ ## test coverage (as implemented)
+
+ document what tests were actually written:
+ - unit tests
+ - integration tests
+ - acceptance tests
+
+ ---
+
+ ## divergence analysis
+
+ for each section (summary, filediff, codepath, test coverage), compare:
+ - what the blueprint declared
+ - what was actually implemented
+
+ ### divergences found
+
+ | section | blueprint declared | actual implemented | divergence type |
+ |---------|--------------------|--------------------|-----------------|
+ | ... | ... | ... | added/removed/changed |
+
+ ### divergence resolution
+
+ for each divergence, you must either:
+
+ **repair** — fix the implementation to match the blueprint:
+ - what needs to change to match the blueprint?
+ - make the change, then update the "as implemented" section above
+
+ **backup** — document why the divergence is acceptable:
+ - why did the implementation diverge?
+ - why is the divergence better than the blueprint?
+ - should the blueprint be updated for future reference?
+
+ | divergence | resolution | rationale |
+ |------------|------------|-----------|
+ | ... | repair/backup | ... |
+
+ ---
+
+ emit into $BEHAVIOR_DIR_REL/5.2.evaluation.v1.i1.md
@@ -52,3 +52,72 @@ reviews:
 
         to "fix tests" via changed intent is not a fix — it is at worst
         malicious deception, at best reckless negligence. unacceptable.
+
+     - slug: has-journey-tests-from-repros
+       say: |
+         double-check: did you implement each journey sketched in repros?
+
+         look back at the repros artifact:
+         - $BEHAVIOR_DIR_REL/3.2.distill.repros.experience.*.md
+
+         for each journey test sketch in repros:
+         - is there a test file for it?
+         - does the test follow the BDD given/when/then structure?
+         - does each `when([tN])` step exist?
+
+         if any journey was planned but not implemented, go back and add it.
+
+     - slug: has-snapshot-coverage
+       say: |
+         double-check: do snapshots capture input/output for caller visibility?
+
+         for each journey test:
+         - does it have `.toMatchSnapshot()` or equivalent assertions?
+         - does the snapshot show what the caller would actually see?
+         - for CLI: is stdout/stderr captured?
+         - for UI: are screens captured?
+         - for SDK: are responses captured?
+
+         snapshots let reviewers see the actual output without needing to run the code.
+         if snapshots are absent, the reviewer can't verify the user experience.
+
+     - slug: has-critical-paths-frictionless
+       say: |
+         double-check: are the critical paths frictionless in practice?
+
+         look back at the repros artifact for critical paths:
+         - $BEHAVIOR_DIR_REL/3.2.distill.repros.experience.*.md
+
+         for each critical path:
+         - run through it manually — is it smooth?
+         - are there unexpected errors?
+         - does it feel effortless to the user?
+
+         critical paths must "just work." if there's friction, fix it now.
+
+     - slug: has-ergonomics-validated
+       say: |
+         double-check: does the actual input/output match what felt right at repros?
+
+         compare the implemented input/output to what was sketched in repros:
+         - does the actual input match the planned input?
+         - does the actual output match the planned output?
+         - did the design change between repros and implementation?
+
+         if the ergonomics drifted, either:
+         - update repros to reflect the better design, or
+         - fix the implementation to match the planned ergonomics
+
+     - slug: has-play-test-convention
+       say: |
+         double-check: are journey test files named correctly?
+
+         journey tests should use the `.play.test.ts` suffix:
+         - `feature.play.test.ts` — journey test
+         - `feature.play.integration.test.ts` — if the repo requires an integration runner
+         - `feature.play.acceptance.test.ts` — if the repo requires an acceptance runner
+
+         verify:
+         - are journey tests in the right location?
+         - do they have the `.play.` suffix?
+         - if not supported, is the fallback convention used?
@@ -36,6 +36,7 @@ reference the below for full context
 - $BEHAVIOR_DIR_REL/0.wish.md
 - $BEHAVIOR_DIR_REL/1.vision.md
 - $BEHAVIOR_DIR_REL/2.1.criteria.blackbox.md (if declared)
+ - $BEHAVIOR_DIR_REL/3.2.distill.repros.experience.*.md (if declared) ← **repros artifact**
 
 ---
 
@@ -51,11 +52,14 @@ this is your roadmap. emit it first, then work through it step by step.
 ```
 ## verification checklist
 
- ### behavior coverage
- | behavior (from wish/vision) | test file | status |
- |-----------------------------|-----------|--------|
- | {behavior 1} | {path} | ⏳ |
- | {behavior 2} | {path} | |
+ ### behavior coverage (with reference to repros)
+
+ for each journey sketched in repros, verify it was implemented with snapshots.
+
+ | journey (from repros) | test file | snapshots? | critical path? | ergonomics ok? | status |
+ |-----------------------|-----------|------------|----------------|----------------|--------|
+ | {journey 1} | {path} | ✓ / ✗ | ✓ frictionless / needs work | ✓ natural / needs work | ⏳ |
+ | {journey 2} | {path} | ✓ / ✗ | ✓ frictionless / needs work | ✓ natural / needs work | ⏳ |
 ...
 
 ### zero skips verified
@@ -24,5 +24,36 @@ reviews:
         - what inputs are unusual but valid?
         - are boundaries tested?
 
+     - slug: has-acceptance-test-citations
+       say: |
+         coverage check: cite the acceptance test for each playtest step.
+
+         for each step in the playtest:
+         - which acceptance test file verifies this behavior?
+         - which specific test case (given/when/then) covers it?
+         - cite the exact file path and test name
+
+         if a step lacks acceptance test coverage:
+         - is this a gap that needs a new test?
+         - or is this behavior untestable via automation?
+
+         the playtest and acceptance tests should align. cite the proof.
+
+     - slug: has-self-run-verification
+       say: |
+         dogfood check: did you run the playtest yourself?
+
+         before you hand off to the foreman, run every step yourself:
+         - follow each instruction exactly as written
+         - verify each expected outcome matches reality
+         - note any friction, confusion, or missing context
+
+         if you found issues while you ran it:
+         - did you fix the instructions?
+         - did you update the expected outcomes?
+         - is the playtest now accurate to what you observed?
+
+         the foreman deserves a playtest that works. prove it works by self-testing first.
+
   judges:
     - npx rhachet run --repo bhrain --skill route.stone.judge --mechanism approved? --stone $stone --route $route
package/package.json CHANGED
@@ -2,7 +2,7 @@
   "name": "rhachet-roles-bhuild",
   "author": "ehmpathy",
   "description": "roles for building resilient systems, via rhachet",
- "version": "0.14.1",
+ "version": "0.14.3",
   "repository": "ehmpathy/rhachet-roles-bhuild",
   "homepage": "https://github.com/ehmpathy/rhachet-roles-bhuild",
   "keywords": [
@@ -89,11 +89,11 @@
   "esbuild-register": "3.6.0",
   "husky": "8.0.3",
   "jest": "30.2.0",
- "rhachet": "1.37.14",
+ "rhachet": "1.37.15",
   "rhachet-brains-anthropic": "0.3.3",
   "rhachet-roles-bhrain": "0.18.1",
   "rhachet-roles-bhuild": "link:.",
- "rhachet-roles-ehmpathy": "1.27.12",
+ "rhachet-roles-ehmpathy": "1.27.13",
   "tsc-alias": "1.8.10",
   "tsx": "4.20.6",
   "typescript": "5.4.5",