orchestr8 3.2.0 → 3.3.0
|
@@ -40,6 +40,7 @@ Implement the feature according to the plan. Work incrementally, making tests pa
|
|
|
40
40
|
- Match existing patterns in the codebase
|
|
41
41
|
- Validate inputs defensively
|
|
42
42
|
- Handle errors gracefully
|
|
43
|
+
- If tests pass but behaviour feels wrong or forced, consult the failure-mode rituals in `.blueprint/ways_of_working/DEVELOPMENT_RITUAL.md`
|
|
43
44
|
|
|
44
45
|
## Completion
|
|
45
46
|
|
|
@@ -1,142 +1,178 @@
|
|
|
1
|
-
# Development Ritual
|
|
1
|
+
# Development Ritual
|
|
2
2
|
|
|
3
3
|
This document defines:
|
|
4
|
-
-
|
|
5
|
-
-
|
|
6
|
-
- **
|
|
4
|
+
- The **pipeline stages** and what each agent must deliver
|
|
5
|
+
- **Checklists** each agent must satisfy before handoff
|
|
6
|
+
- **Failure-mode rituals** that override normal flow when triggered
|
|
7
|
+
- The **feedback and handoff** mechanisms that connect stages
|
|
7
8
|
|
|
8
|
-
A stage is not complete until its
|
|
9
|
+
A stage is not complete until its checklist is satisfied.
|
|
9
10
|
|
|
10
11
|
---
|
|
11
12
|
|
|
12
|
-
##
|
|
13
|
+
## Pipeline Stages
|
|
13
14
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
15
|
+
```
|
|
16
|
+
Alex (feature spec) → Cass (user stories) → Nigel (tests) → Codey (plan → implement) → Auto-commit → Human QA
|
|
17
|
+
```
|
|
17
18
|
|
|
18
|
-
|
|
19
|
+
Each agent reads the previous agent's outputs and produces artifacts for the next. Context is passed via **handoff summaries** (max 30 lines) to keep token usage efficient. The pipeline uses a **feedback chain** where each agent rates the previous agent's work before starting their own.
|
|
20
|
+
|
|
21
|
+
Tests define behaviour. The human validates intent after auto-commit.
|
|
19
22
|
|
|
20
23
|
---
|
|
21
24
|
|
|
22
|
-
##
|
|
25
|
+
## Handoff Mechanism
|
|
23
26
|
|
|
24
|
-
|
|
27
|
+
Between stages, the pipeline creates a handoff summary (`handoff-{agent}.md`, max 30 lines) that passes key context to the next agent. This keeps each agent focused without re-reading everything from scratch.
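
As a rough illustration of the 30-line limit described above, a handoff summary could be checked as follows. This is a minimal sketch, not the package's implementation: the helper names `handoff_filename` and `within_handoff_limit` are hypothetical, and counting blank lines toward the limit is an assumption.

```python
MAX_HANDOFF_LINES = 30  # limit stated in the document


def handoff_filename(agent: str) -> str:
    """Build the summary file name following the `handoff-{agent}.md` pattern."""
    return f"handoff-{agent}.md"


def within_handoff_limit(text: str, max_lines: int = MAX_HANDOFF_LINES) -> bool:
    """Return True if the summary fits the line budget.

    Assumption: every line counts, including blank ones.
    """
    return len(text.splitlines()) <= max_lines
```

A caller might reject or trim a summary when `within_handoff_limit` returns `False` before passing it to the next agent.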
|
|
25
28
|
|
|
26
|
-
|
|
29
|
+
Each agent also provides **feedback** on the previous agent's output:
|
|
30
|
+
- **Rating** (1-5) on quality
|
|
31
|
+
- **Issues** list (if any)
|
|
32
|
+
- **Recommendation**: `proceed`, `pause`, or `revise`
|
|
33
|
+
|
|
34
|
+
If the rating falls below the configured threshold (default: 3.0), the pipeline pauses for human review. See `feedback-config` for threshold settings.
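
The gating rule above might be sketched like this. The `Feedback` class and `should_pause` function are hypothetical names introduced for illustration; only the 1-5 rating range, the three recommendation values, and the 3.0 default threshold come from the text. How the recommendation itself influences the pipeline is not specified here, so the sketch gates on the rating alone.

```python
from dataclasses import dataclass, field
from typing import List

VALID_RECOMMENDATIONS = {"proceed", "pause", "revise"}


@dataclass
class Feedback:
    """Hypothetical container for one agent's feedback on the previous stage."""
    rating: float
    recommendation: str
    issues: List[str] = field(default_factory=list)


def should_pause(feedback: Feedback, threshold: float = 3.0) -> bool:
    """Return True when the rating falls below the threshold."""
    if not 1 <= feedback.rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    if feedback.recommendation not in VALID_RECOMMENDATIONS:
        raise ValueError(f"unknown recommendation: {feedback.recommendation}")
    # Only the rating is compared; a rating equal to the threshold does not pause.
    return feedback.rating < threshold
```

Under these assumptions, a rating of exactly 3.0 proceeds, because the rule only pauses when the rating falls below the threshold.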
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## Agent Checklists
|
|
39
|
+
|
|
40
|
+
### Alex (System Specification)
|
|
41
|
+
|
|
42
|
+
Before writing the feature spec:
|
|
43
|
+
- [ ] Read the system specification
|
|
44
|
+
- [ ] Read relevant business context (`.business_context/`)
|
|
45
|
+
- [ ] Read the feature template
|
|
46
|
+
|
|
47
|
+
Before handoff:
|
|
48
|
+
- [ ] Feature spec written to `FEATURE_SPEC.md`
|
|
49
|
+
- [ ] Intent, scope, actors, rules, and dependencies covered
|
|
50
|
+
- [ ] Ambiguities flagged explicitly
|
|
51
|
+
- [ ] Assumptions labelled as such
|
|
52
|
+
- [ ] Spec aligns with system boundaries
|
|
53
|
+
|
|
54
|
+
### Cass (Story Writer)
|
|
55
|
+
|
|
56
|
+
Before writing stories:
|
|
57
|
+
- [ ] Read the feature spec
|
|
58
|
+
- [ ] Read the system specification for context
|
|
59
|
+
- [ ] Identified primary behaviour, entry/exit conditions, branching logic
|
|
60
|
+
|
|
61
|
+
Before handoff:
|
|
62
|
+
- [ ] Each story file (`story-{slug}.md`) has a single clear goal
|
|
63
|
+
- [ ] Acceptance criteria are in Given/When/Then, max 5-7 per story
|
|
64
|
+
- [ ] Routing is explicit (no "goes to next screen")
|
|
65
|
+
- [ ] Out-of-scope items listed
|
|
66
|
+
- [ ] Assumptions flagged
|
|
67
|
+
|
|
68
|
+
### Nigel (Tester)
|
|
27
69
|
|
|
28
|
-
```text
|
|
29
|
-
[ ] Read story and acceptance criteria
|
|
30
|
-
[ ] Read tester understanding & test plan
|
|
31
|
-
[ ] Ran baseline tests
|
|
32
|
-
[ ] Implemented behaviour
|
|
33
|
-
[ ] Tests passing
|
|
34
|
-
[ ] Lint passing
|
|
35
|
-
[ ] Summary written
|
|
36
|
-
```
|
|
37
|
-
### Tester CLI Ritual (Nigel)
|
|
38
70
|
Before writing tests:
|
|
39
|
-
[ ]
|
|
40
|
-
[ ] Acceptance criteria are testable
|
|
41
|
-
[ ] Ambiguities identified
|
|
42
|
-
[ ] Assumptions written down
|
|
43
|
-
|
|
44
|
-
Before
|
|
45
|
-
[ ]
|
|
46
|
-
[ ]
|
|
47
|
-
[ ] Happy path tests written
|
|
48
|
-
[ ] Edge
|
|
49
|
-
[ ] Tests runnable via
|
|
50
|
-
[ ] Traceability table complete
|
|
51
|
-
[ ] Open questions listed
|
|
52
|
-
|
|
53
|
-
If any box is unchecked
|
|
54
|
-
|
|
55
|
-
|
|
71
|
+
- [ ] Read all story files and the feature spec
|
|
72
|
+
- [ ] Acceptance criteria are testable
|
|
73
|
+
- [ ] Ambiguities identified
|
|
74
|
+
- [ ] Assumptions written down
|
|
75
|
+
|
|
76
|
+
Before handoff:
|
|
77
|
+
- [ ] `test-spec.md` written (understanding, AC-to-test mapping, assumptions)
|
|
78
|
+
- [ ] Executable test file written
|
|
79
|
+
- [ ] Happy path tests written
|
|
80
|
+
- [ ] Edge case and error tests written
|
|
81
|
+
- [ ] Tests runnable via the project's configured test command (see `.claude/stack-config.json`)
|
|
82
|
+
- [ ] Traceability table complete (every AC mapped to test IDs)
|
|
83
|
+
- [ ] Open questions listed
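
The traceability requirement (every AC mapped to test IDs) lends itself to a mechanical check. The sketch below assumes the table has already been loaded into a dictionary. The function name and the `AC-1` / `T-1` identifier formats are hypothetical, as the document does not define them.

```python
from typing import Dict, List


def unmapped_criteria(traceability: Dict[str, List[str]]) -> List[str]:
    """Return acceptance criteria that have no test IDs mapped to them."""
    return sorted(ac for ac, test_ids in traceability.items() if not test_ids)
```

An empty result would indicate that the traceability table is complete.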
|
|
84
|
+
|
|
85
|
+
If any box is unchecked, raise it before handoff.
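
One way to detect unchecked boxes before handoff is to scan the checklist text for `- [ ]` items. This is a sketch under the assumption that checklists use the Markdown task-list syntax shown in this document. The function name `unchecked_items` is hypothetical.

```python
import re
from typing import List

# Matches "- [ ] text" (unchecked) and "- [x] text" (checked) task-list items.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def unchecked_items(markdown: str) -> List[str]:
    """Return the text of every unchecked checklist item."""
    items = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            items.append(match.group(2).strip())
    return items
```

Any item returned would be something to raise before handoff.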
|
|
86
|
+
|
|
87
|
+
### Codey (Developer) - Planning
|
|
88
|
+
|
|
89
|
+
Before writing the plan:
|
|
90
|
+
- [ ] Read feature spec, stories, test-spec, and executable tests
|
|
91
|
+
- [ ] Built mental model of happy path, edge cases, error flows
|
|
92
|
+
- [ ] Identified what already exists vs what is new
|
|
93
|
+
|
|
94
|
+
Before handoff:
|
|
95
|
+
- [ ] `IMPLEMENTATION_PLAN.md` written (summary, files table, steps, risks)
|
|
96
|
+
- [ ] Steps ordered to make tests pass incrementally
|
|
97
|
+
- [ ] No implementation code written yet
|
|
98
|
+
|
|
99
|
+
### Codey (Developer) - Implementation
|
|
100
|
+
|
|
56
101
|
Before coding:
|
|
57
|
-
[ ] Read
|
|
58
|
-
[ ]
|
|
59
|
-
[ ] Read executable tests
|
|
60
|
-
[ ] Ran baseline tests (expected failures only)
|
|
102
|
+
- [ ] Read implementation plan and tests
|
|
103
|
+
- [ ] Ran baseline tests (note expected failures)
|
|
61
104
|
|
|
62
105
|
During coding:
|
|
63
|
-
[ ] Implemented behaviour incrementally
|
|
64
|
-
[ ] Ran
|
|
65
|
-
[ ] Did not weaken or delete tests
|
|
66
|
-
|
|
67
|
-
Before
|
|
68
|
-
[ ] All tests passing
|
|
69
|
-
[ ] Lint passing
|
|
70
|
-
[ ] No unexplained skip
|
|
71
|
-
[ ] Changes summarised
|
|
72
|
-
[ ] Assumptions restated
|
|
73
|
-
|
|
74
|
-
If tests pass but confidence is low
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
[ ]
|
|
87
|
-
[ ]
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
Outcome
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
-
|
|
100
|
-
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
-
|
|
109
|
-
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
[ ]
|
|
120
|
-
[ ]
|
|
121
|
-
|
|
122
|
-
-
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
Meta-Rules (Always On)
|
|
138
|
-
- Tests are the behavioural contract
|
|
139
|
-
- Green builds are necessary, not sufficient
|
|
140
|
-
- Assumptions must be written down
|
|
141
|
-
- No silent changes
|
|
142
|
-
- When in doubt, slow down and ask the human
|
|
106
|
+
- [ ] Implemented behaviour incrementally (one file at a time)
|
|
107
|
+
- [ ] Ran tests after each file change
|
|
108
|
+
- [ ] Did not weaken or delete Nigel's tests
|
|
109
|
+
|
|
110
|
+
Before handoff:
|
|
111
|
+
- [ ] All tests passing
|
|
112
|
+
- [ ] Lint passing
|
|
113
|
+
- [ ] No unexplained `skip` or `todo`
|
|
114
|
+
- [ ] Changes summarised (files changed, test status, blockers)
|
|
115
|
+
- [ ] Assumptions restated
|
|
116
|
+
|
|
117
|
+
If tests pass but confidence is low, trigger a failure-mode ritual (see below).
|
|
118
|
+
|
|
119
|
+
---
|
|
120
|
+
|
|
121
|
+
## Failure-Mode Rituals
|
|
122
|
+
|
|
123
|
+
These override normal flow. When triggered, stop and follow the steps explicitly.
|
|
124
|
+
|
|
125
|
+
### Tests pass, but behaviour feels wrong
|
|
126
|
+
|
|
127
|
+
**Trigger:** Behaviour technically matches tests but not intent, or something feels "too easy."
|
|
128
|
+
|
|
129
|
+
- [ ] Re-read the original user story
|
|
130
|
+
- [ ] Re-state intended behaviour in plain English
|
|
131
|
+
- [ ] Identify mismatch: story vs tests vs implementation
|
|
132
|
+
- [ ] Decide: tests are wrong, story is underspecified, or implementation misinterpreted behaviour
|
|
133
|
+
|
|
134
|
+
**Outcome:** Update tests (Nigel), clarify ACs (Cass), or fix implementation (Codey). Never "let it slide."
|
|
135
|
+
|
|
136
|
+
### Tests are unclear or contradictory
|
|
137
|
+
|
|
138
|
+
**Trigger:** Assertions conflict, test names don't match expectations, or passing tests don't explain behaviour.
|
|
139
|
+
|
|
140
|
+
- [ ] Identify the specific confusing test(s)
|
|
141
|
+
- [ ] State what behaviour they appear to encode
|
|
142
|
+
- [ ] Compare to acceptance criteria
|
|
143
|
+
- [ ] Propose corrected test behaviour
|
|
144
|
+
|
|
145
|
+
**Outcome:** Nigel revises tests. Codey does not guess.
|
|
146
|
+
|
|
147
|
+
### Tests fail for non-behaviour reasons
|
|
148
|
+
|
|
149
|
+
**Trigger:** Environment/setup issues, brittle timing, or global state leakage.
|
|
150
|
+
|
|
151
|
+
- [ ] Confirm failure is not missing behaviour
|
|
152
|
+
- [ ] Isolate failing test
|
|
153
|
+
- [ ] Remove flakiness or hidden coupling
|
|
154
|
+
- [ ] Re-run full suite
|
|
155
|
+
|
|
156
|
+
**Outcome:** Stabilise tests before continuing feature work.
|
|
157
|
+
|
|
158
|
+
### Implementation feels forced
|
|
159
|
+
|
|
160
|
+
**Trigger:** Logic seems unnatural or overly complex to make tests pass.
|
|
161
|
+
|
|
162
|
+
- [ ] Pause implementation
|
|
163
|
+
- [ ] Identify which test is driving the awkward behaviour
|
|
164
|
+
- [ ] Re-check acceptance criteria
|
|
165
|
+
- [ ] Raise concern to the human
|
|
166
|
+
|
|
167
|
+
**Outcome:** Adjust tests or clarify intent. Prefer simpler behaviour aligned to the story.
|
|
168
|
+
|
|
169
|
+
---
|
|
170
|
+
|
|
171
|
+
## Meta-Rules (Always On)
|
|
172
|
+
|
|
173
|
+
- Tests are the behavioural contract
|
|
174
|
+
- Green builds are necessary, not sufficient
|
|
175
|
+
- No silent changes; all assumptions written down
|
|
176
|
+
- When in doubt, slow down and ask the human
|
|
177
|
+
|
|
178
|
+
See `GUARDRAILS.md` for the full shared constraints (source restrictions, escalation protocol, anti-patterns).
|