prizmkit 1.1.1 → 1.1.3
- package/bundled/VERSION.json +3 -3
- package/bundled/adapters/claude/agent-adapter.js +18 -0
- package/bundled/adapters/claude/command-adapter.js +1 -27
- package/bundled/agents/prizm-dev-team-critic.md +2 -0
- package/bundled/agents/prizm-dev-team-dev.md +2 -0
- package/bundled/agents/prizm-dev-team-reviewer.md +2 -0
- package/bundled/dev-pipeline/README.md +63 -63
- package/bundled/dev-pipeline/assets/feature-list-example.json +1 -1
- package/bundled/dev-pipeline/assets/prizm-dev-team-integration.md +1 -1
- package/bundled/dev-pipeline/{launch-daemon.sh → launch-feature-daemon.sh} +33 -33
- package/bundled/dev-pipeline/launch-refactor-daemon.sh +454 -0
- package/bundled/dev-pipeline/lib/branch.sh +1 -1
- package/bundled/dev-pipeline/reset-feature.sh +3 -3
- package/bundled/dev-pipeline/reset-refactor.sh +312 -0
- package/bundled/dev-pipeline/{retry-bug.sh → retry-bugfix.sh} +47 -59
- package/bundled/dev-pipeline/retry-feature.sh +41 -54
- package/bundled/dev-pipeline/retry-refactor.sh +358 -0
- package/bundled/dev-pipeline/run-bugfix.sh +6 -0
- package/bundled/dev-pipeline/{run.sh → run-feature.sh} +31 -31
- package/bundled/dev-pipeline/run-refactor.sh +787 -0
- package/bundled/dev-pipeline/scripts/generate-bootstrap-prompt.py +177 -10
- package/bundled/dev-pipeline/scripts/generate-refactor-prompt.py +419 -0
- package/bundled/dev-pipeline/scripts/init-refactor-pipeline.py +393 -0
- package/bundled/dev-pipeline/scripts/update-refactor-status.py +726 -0
- package/bundled/dev-pipeline/templates/agent-prompts/critic-code-challenge.md +13 -0
- package/bundled/dev-pipeline/templates/agent-prompts/critic-plan-challenge.md +7 -0
- package/bundled/dev-pipeline/templates/agent-prompts/dev-fix.md +7 -0
- package/bundled/dev-pipeline/templates/agent-prompts/dev-implement.md +26 -0
- package/bundled/dev-pipeline/templates/agent-prompts/dev-resume.md +5 -0
- package/bundled/dev-pipeline/templates/agent-prompts/reviewer-analyze.md +5 -0
- package/bundled/dev-pipeline/templates/agent-prompts/reviewer-review.md +12 -0
- package/bundled/dev-pipeline/templates/bootstrap-tier1.md +29 -2
- package/bundled/dev-pipeline/templates/bootstrap-tier2.md +8 -7
- package/bundled/dev-pipeline/templates/bootstrap-tier3.md +11 -10
- package/bundled/dev-pipeline/templates/bugfix-bootstrap-prompt.md +2 -3
- package/bundled/dev-pipeline/templates/feature-list-schema.json +1 -1
- package/bundled/dev-pipeline/templates/refactor-list-schema.json +159 -0
- package/bundled/dev-pipeline/templates/sections/ac-verification-checklist.md +13 -0
- package/bundled/dev-pipeline/templates/sections/feature-context.md +1 -1
- package/bundled/dev-pipeline/templates/sections/phase-analyze-agent.md +9 -8
- package/bundled/dev-pipeline/templates/sections/phase-analyze-full.md +9 -8
- package/bundled/dev-pipeline/templates/sections/phase-browser-verification.md +2 -1
- package/bundled/dev-pipeline/templates/sections/phase-critic-code.md +8 -10
- package/bundled/dev-pipeline/templates/sections/phase-critic-plan-full.md +9 -10
- package/bundled/dev-pipeline/templates/sections/phase-critic-plan.md +8 -9
- package/bundled/dev-pipeline/templates/sections/phase-implement-agent.md +7 -10
- package/bundled/dev-pipeline/templates/sections/phase-implement-full.md +8 -15
- package/bundled/dev-pipeline/templates/sections/phase-review-agent.md +7 -12
- package/bundled/dev-pipeline/templates/sections/phase-review-full.md +8 -19
- package/bundled/dev-pipeline/templates/sections/test-failure-recovery.md +75 -0
- package/bundled/skills/_metadata.json +33 -6
- package/bundled/skills/app-planner/SKILL.md +105 -320
- package/bundled/skills/app-planner/assets/app-design-guide.md +101 -0
- package/bundled/skills/app-planner/references/frontend-design-guide.md +1 -1
- package/bundled/skills/app-planner/references/project-brief-guide.md +49 -80
- package/bundled/skills/bug-fix-workflow/SKILL.md +2 -2
- package/bundled/skills/bug-planner/SKILL.md +68 -5
- package/bundled/skills/bug-planner/scripts/validate-bug-list.py +3 -2
- package/bundled/skills/bugfix-pipeline-launcher/SKILL.md +19 -5
- package/bundled/skills/{dev-pipeline-launcher → feature-pipeline-launcher}/SKILL.md +32 -32
- package/bundled/skills/feature-planner/SKILL.md +337 -0
- package/bundled/skills/{app-planner → feature-planner}/assets/evaluation-guide.md +4 -4
- package/bundled/skills/{app-planner → feature-planner}/assets/planning-guide.md +3 -171
- package/bundled/skills/{app-planner → feature-planner}/references/browser-interaction.md +6 -5
- package/bundled/skills/feature-planner/references/decomposition-patterns.md +75 -0
- package/bundled/skills/{app-planner → feature-planner}/references/error-recovery.md +8 -8
- package/bundled/skills/{app-planner → feature-planner}/references/incremental-feature-planning.md +1 -1
- package/bundled/skills/{app-planner/references/new-app-planning.md → feature-planner/references/new-project-planning.md} +1 -1
- package/bundled/skills/{app-planner → feature-planner}/scripts/validate-and-generate.py +4 -4
- package/bundled/skills/feature-workflow/SKILL.md +23 -23
- package/bundled/skills/prizm-kit/SKILL.md +1 -3
- package/bundled/skills/prizmkit-analyze/SKILL.md +2 -5
- package/bundled/skills/prizmkit-code-review/SKILL.md +2 -2
- package/bundled/skills/prizmkit-committer/SKILL.md +4 -8
- package/bundled/skills/prizmkit-deploy/SKILL.md +1 -5
- package/bundled/skills/prizmkit-implement/SKILL.md +3 -50
- package/bundled/skills/prizmkit-init/SKILL.md +5 -77
- package/bundled/skills/prizmkit-plan/SKILL.md +1 -12
- package/bundled/skills/prizmkit-prizm-docs/SKILL.md +6 -24
- package/bundled/skills/prizmkit-prizm-docs/assets/PRIZM-SPEC.md +21 -0
- package/bundled/skills/prizmkit-retrospective/SKILL.md +12 -117
- package/bundled/skills/recovery-workflow/SKILL.md +166 -316
- package/bundled/skills/recovery-workflow/evals/evals.json +29 -13
- package/bundled/skills/recovery-workflow/scripts/detect-recovery-state.py +232 -274
- package/bundled/skills/refactor-pipeline-launcher/SKILL.md +352 -0
- package/bundled/skills/refactor-planner/SKILL.md +436 -0
- package/bundled/skills/refactor-planner/assets/planning-guide.md +292 -0
- package/bundled/skills/refactor-planner/references/behavior-preservation.md +301 -0
- package/bundled/skills/refactor-planner/references/refactor-scoping-guide.md +221 -0
- package/bundled/skills/refactor-planner/scripts/validate-and-generate-refactor.py +786 -0
- package/bundled/skills/refactor-workflow/SKILL.md +299 -319
- package/package.json +1 -1
- package/src/clean.js +3 -3
- package/src/scaffold.js +6 -6
- package/bundled/skills/prizmkit-plan/assets/spec-template.md +0 -56
- package/bundled/skills/prizmkit-plan/references/clarify-guide.md +0 -67
- package/src/config.js +0 -504
- package/src/prompts.js +0 -210
- package/bundled/skills/{dev-pipeline-launcher → feature-pipeline-launcher}/scripts/preflight-check.py +0 -0
|
@@ -0,0 +1,292 @@
|
|
|
1
|
+
# Refactor Planning Reference Guide
|
|
2
|
+
|
|
3
|
+
This guide provides structured patterns, decision matrices, and templates for decomposing refactoring goals into well-scoped, executable items. It is intended as a practical reference for the AI during interactive refactor planning sessions.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## 1. Identifying Refactoring Boundaries
|
|
8
|
+
|
|
9
|
+
Refactoring boundaries define where one refactor item ends and another begins. Good boundaries produce items that are independently executable and independently verifiable.
|
|
10
|
+
|
|
11
|
+
### Boundary Heuristics
|
|
12
|
+
| Signal | Boundary Type | Example |
|--------|--------------|---------|
| Different files/modules | Module boundary | "Extract auth logic" vs "Extract validation logic" |
| Different refactoring operations | Operation boundary | "Rename function" vs "Extract class" |
| Different risk levels | Risk boundary | "Safe rename" vs "Restructure module internals" |
| Different test suites affected | Test boundary | "Changes unit tests only" vs "Changes integration tests" |
| Sequential dependency | Dependency boundary | "Rename X" must complete before "Move X to new module" |
|
|
20
|
+
|
|
21
|
+
### Rules for Setting Boundaries
|
|
22
|
+
|
|
23
|
+
1. **One operation type per item.** Don't mix a rename with a structural extraction in the same item.
|
|
24
|
+
2. **One module scope per item** (unless the refactoring specifically targets cross-module concerns like decoupling).
|
|
25
|
+
3. **Each item should be independently testable.** After completing item R-001, all tests should pass before starting R-002.
|
|
26
|
+
4. **If an item requires more than 3 files to change simultaneously**, consider splitting it.
|
|
27
|
+
5. **If behavior preservation requires different strategies for different parts**, split into separate items with appropriate strategies.
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
31
|
+
## 2. Description Writing Guide
|
|
32
|
+
|
|
33
|
+
Refactor item descriptions are the primary input for autonomous pipeline sessions. A thin description forces the AI to guess about scope and safety constraints.
|
|
34
|
+
|
|
35
|
+
### Minimum Word Counts
|
|
36
|
+
| Complexity | Minimum Words | Warning Threshold |
|------------|---------------|-------------------|
| low | 15 | 30 |
| medium | 15 | 50 |
| high | 15 | 80 |
|
|
42
|
+
|
|
43
|
+
A description shorter than 15 words is a validation error. One that meets the minimum but falls below the warning threshold for its complexity triggers a warning.
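The thresholds can be checked mechanically. The sketch below is illustrative only: `check_description` and `THRESHOLDS` are hypothetical names, and it assumes a "word" is any whitespace-separated token, which may not match how the bundled validator counts.

```python
# (minimum words, warning threshold) per complexity, copied from the table above.
THRESHOLDS = {"low": (15, 30), "medium": (15, 50), "high": (15, 80)}


def check_description(description: str, complexity: str) -> str:
    """Return "error", "warning", or "ok" for a refactor item description."""
    minimum, warn_below = THRESHOLDS[complexity]
    # Assumption: words are whitespace-separated tokens.
    word_count = len(description.split())
    if word_count < minimum:
        return "error"
    if word_count < warn_below:
        return "warning"
    return "ok"
```

A 20-word description would pass the minimum at every complexity level but still draw a warning, since all three warning thresholds are above 20.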
|
|
44
|
+
|
|
45
|
+
### What to Include
|
|
46
|
+
|
|
47
|
+
Every refactor description should cover:
|
|
48
|
+
|
|
49
|
+
1. **What to change** — specific files, functions, classes, or patterns being refactored
|
|
50
|
+
2. **How to change it** — the refactoring operation (extract, rename, move, inline, simplify)
|
|
51
|
+
3. **Why** — the motivation (reduce complexity, improve testability, remove duplication)
|
|
52
|
+
4. **Constraints** — what must NOT change (public API, behavior contracts, external interfaces)
|
|
53
|
+
5. **Verification** — how to confirm the refactoring succeeded without breaking behavior
|
|
54
|
+
|
|
55
|
+
### Good vs Bad Examples
|
|
56
|
+
**Bad** (11 words; too thin):
```
"Extract the validation logic from the handler into a separate module."
```

**Good** (55 words; implementation-ready):
```
"Extract all input validation functions from src/api/handler.js (validateEmail, validatePassword, validateUsername) into a new src/utils/validators.js module. Update all imports in handler.js and any other files that import these functions directly. Preserve the exact function signatures and return types. The handler.js file should import from the new location. All existing tests must continue to pass without modification."
```

**Bad** (10 words; too thin):
```
"Convert the user service from callbacks to async/await pattern throughout."
```

**Good** (67 words; implementation-ready):
```
"Convert src/services/user-service.js from callback-based functions to async/await. Target functions: createUser, findUserById, updateUser, deleteUser (4 functions total). Each function currently accepts a callback as the last parameter and calls it with (err, result). Convert to return Promises and use async/await internally. Update all callers in src/routes/user-routes.js to use await instead of passing callbacks. Preserve all error handling behavior; errors that were passed to callbacks should now be thrown."
```
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## 3. Common Refactoring Patterns
|
|
80
|
+
|
|
81
|
+
Use these patterns as starting points when decomposing refactoring goals.
|
|
82
|
+
|
|
83
|
+
### Pattern A: Extract Method/Function
|
|
84
|
+
|
|
85
|
+
**When**: A function is too long, has multiple responsibilities, or contains duplicated logic.
|
|
86
|
+
```
R-001: Extract [specific logic] from [source function] into [new function name]
  Type: extract
  Scope: [source file]
  Complexity: low
  Preservation: test-gate
```
|
|
94
|
+
|
|
95
|
+
### Pattern B: Extract Class/Module
|
|
96
|
+
|
|
97
|
+
**When**: A file/class is too large, a group of functions share a common concern, or a module has multiple responsibilities.
|
|
98
|
+
```
R-001: Create new module [name] with extracted [concern] logic
  Type: extract
  Scope: [source file, new file]
  Complexity: medium
  Preservation: test-gate

R-002: Update imports to use new [name] module (deps: R-001)
  Type: restructure
  Scope: [all importing files]
  Complexity: low
  Preservation: test-gate
```
|
|
112
|
+
|
|
113
|
+
### Pattern C: Move Module/File
|
|
114
|
+
|
|
115
|
+
**When**: A file is in the wrong directory, or the module organization needs restructuring.
|
|
116
|
+
```
R-001: Move [file] from [old path] to [new path]
  Type: restructure
  Scope: [file, all importers]
  Complexity: low-medium (depends on import count)
  Preservation: test-gate
```
|
|
124
|
+
|
|
125
|
+
### Pattern D: Inline (Reverse of Extract)
|
|
126
|
+
|
|
127
|
+
**When**: An abstraction is unnecessary, a wrapper adds no value, or indirection hurts readability.
|
|
128
|
+
```
R-001: Inline [function/module] into [target]
  Type: simplify
  Scope: [source file, target file]
  Complexity: low
  Preservation: test-gate
```
|
|
136
|
+
|
|
137
|
+
### Pattern E: Rename (Variable, Function, Class, File)
|
|
138
|
+
|
|
139
|
+
**When**: Names are misleading, inconsistent, or don't follow conventions.
|
|
140
|
+
```
R-001: Rename [old name] to [new name] across codebase
  Type: rename
  Scope: [all files containing the name]
  Complexity: low
  Preservation: test-gate
```
|
|
148
|
+
|
|
149
|
+
### Pattern F: Decouple Dependencies
|
|
150
|
+
|
|
151
|
+
**When**: Circular dependencies, tight coupling between modules, or difficulty testing in isolation.
|
|
152
|
+
```
R-001: Define interface/contract for [dependency] (deps: none)
  Type: decouple
  Scope: [new interface file]
  Complexity: medium
  Preservation: test-gate

R-002: Implement [dependency] behind new interface (deps: R-001)
  Type: decouple
  Scope: [implementation file]
  Complexity: medium
  Preservation: test-gate

R-003: Update consumers to use interface instead of concrete (deps: R-002)
  Type: decouple
  Scope: [all consumer files]
  Complexity: medium
  Preservation: test-gate
```
|
|
172
|
+
|
|
173
|
+
### Pattern G: Architecture Migration
|
|
174
|
+
|
|
175
|
+
**When**: Converting between paradigms (callbacks to promises, classes to functions, monolith to modules).
|
|
176
|
+
```
R-001: Add new [pattern] alongside old [pattern] (deps: none)
  Type: migrate
  Scope: [target files]
  Complexity: medium-high
  Preservation: test-gate or snapshot

R-002: Migrate [specific area] to new pattern (deps: R-001)
  Type: migrate
  Scope: [area files]
  Complexity: medium
  Preservation: test-gate

R-003: Remove old [pattern] code (deps: R-002)
  Type: simplify
  Scope: [cleaned files]
  Complexity: low
  Preservation: test-gate
```
|
|
196
|
+
|
|
197
|
+
---
|
|
198
|
+
|
|
199
|
+
## 4. Dependency Ordering Rules
|
|
200
|
+
|
|
201
|
+
Correct ordering minimizes risk and ensures each step is independently verifiable.
|
|
202
|
+
|
|
203
|
+
### Ordering Priority (execute in this order)
|
|
204
|
+
|
|
205
|
+
1. **Safe renames** — Lowest risk. Pure name changes with no structural impact. Can be reverted trivially.
|
|
206
|
+
2. **Extract/inline** — Moderate risk. Changes module boundaries but doesn't reorganize architecture.
|
|
207
|
+
3. **Structural changes** — Higher risk. Reorganizes file layout, module hierarchy, or dependency graph.
|
|
208
|
+
4. **Migrations** — Highest risk. Changes programming patterns or paradigms.
|
|
209
|
+
|
|
210
|
+
### Dependency Rules
|
|
211
|
+
|
|
212
|
+
1. **No circular dependencies.** Dependencies MUST form a directed acyclic graph (DAG).
|
|
213
|
+
2. **Minimal dependency sets.** Each item should depend only on items it directly needs.
|
|
214
|
+
3. **Rename before restructure.** If you're renaming something AND moving it, rename first (easier to track).
|
|
215
|
+
4. **Create before consume.** If item A creates a new module and item B uses it, B depends on A.
|
|
216
|
+
5. **Interface before implementation.** If decoupling, define the interface before implementing behind it.
|
|
217
|
+
6. **Preserve before remove.** If migrating, ensure new code works before removing old code.
|
|
218
|
+
|
|
219
|
+
### Validation Checklist
|
|
220
|
+
|
|
221
|
+
- [ ] No item depends on itself
|
|
222
|
+
- [ ] No circular dependency chains exist
|
|
223
|
+
- [ ] Every item ID referenced in a dependency list is defined in the plan
|
|
224
|
+
- [ ] The graph can be topologically sorted
|
|
225
|
+
- [ ] Renames appear before structural changes that reference the renamed entities
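The first four checklist items can be automated. The following is a minimal sketch, not the bundled validator: `validate_dependencies` is a hypothetical helper, and the plan is assumed to be a mapping from item ID to the list of IDs it depends on. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The last checklist item (renames before structural changes) needs item types and is not covered here.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dependencies(items: dict[str, list[str]]) -> list[str]:
    """Return a valid execution order, or raise ValueError describing the problem.

    `items` maps each item ID to the IDs it depends on (assumed shape).
    """
    for item_id, deps in items.items():
        if item_id in deps:
            raise ValueError(f"{item_id} depends on itself")
        unknown = [dep for dep in deps if dep not in items]
        if unknown:
            raise ValueError(f"{item_id} references undefined items: {unknown}")
    try:
        # static_order() yields every dependency before the items that need it.
        return list(TopologicalSorter(items).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
```

For a linear chain such as `{"R-001": [], "R-002": ["R-001"], "R-003": ["R-002"]}` the returned order is R-001, R-002, R-003.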
|
|
226
|
+
|
|
227
|
+
---
|
|
228
|
+
|
|
229
|
+
## 5. Acceptance Criteria for Refactoring
|
|
230
|
+
|
|
231
|
+
Refactoring acceptance criteria focus on structural improvement AND behavior preservation. They differ from feature acceptance criteria.
|
|
232
|
+
|
|
233
|
+
### Standard Criteria Templates
|
|
234
|
+
|
|
235
|
+
**For extract operations:**
|
|
236
|
+
- [ ] New module/function exists at [target path]
|
|
237
|
+
- [ ] Original location imports from new location (no duplication)
|
|
238
|
+
- [ ] All existing tests pass without modification
|
|
239
|
+
- [ ] No new circular dependencies introduced
|
|
240
|
+
|
|
241
|
+
**For rename operations:**
|
|
242
|
+
- [ ] Old name does not appear anywhere in codebase (except git history)
|
|
243
|
+
- [ ] All references updated to new name
|
|
244
|
+
- [ ] All existing tests pass without modification
|
|
245
|
+
|
|
246
|
+
**For restructure operations:**
|
|
247
|
+
- [ ] Files are in their new locations
|
|
248
|
+
- [ ] All import paths updated
|
|
249
|
+
- [ ] Module boundary is clean (no reaching into internal paths)
|
|
250
|
+
- [ ] All existing tests pass without modification
|
|
251
|
+
|
|
252
|
+
**For decouple operations:**
|
|
253
|
+
- [ ] Interface/contract defined and documented
|
|
254
|
+
- [ ] Implementation satisfies interface
|
|
255
|
+
- [ ] Consumers depend on interface, not implementation
|
|
256
|
+
- [ ] No circular dependencies remain
|
|
257
|
+
- [ ] All existing tests pass without modification
|
|
258
|
+
|
|
259
|
+
**For migrate operations:**
|
|
260
|
+
- [ ] New pattern is used in target area
|
|
261
|
+
- [ ] Old pattern code is removed (no dead code)
|
|
262
|
+
- [ ] All existing tests pass (may need test updates to use new pattern)
|
|
263
|
+
- [ ] Behavior is identical (verified via test-gate or snapshot)
|
|
264
|
+
|
|
265
|
+
### Writing Principles
|
|
266
|
+
|
|
267
|
+
1. **Always include "all existing tests pass"** — this is the fundamental refactoring invariant.
|
|
268
|
+
2. **Be specific about structural outcomes** — "files are organized by feature" is vague; "auth files are in src/features/auth/" is concrete.
|
|
269
|
+
3. **Include negative criteria** — "no circular dependencies", "no dead code", "no duplicated logic".
|
|
270
|
+
4. **Keep count manageable** — 3-5 criteria per item. More than 6 suggests the item should be split.
|
|
271
|
+
|
|
272
|
+
---
|
|
273
|
+
|
|
274
|
+
## 6. Complexity Estimation for Refactoring
|
|
275
|
+
| Factor | Low | Medium | High |
|--------|-----|--------|------|
| File count | 1-2 files | 3-5 files | 6+ files |
| Cross-module scope | Same module | 2 modules | 3+ modules |
| Test coverage | High (>80%) | Moderate (40-80%) | Low (<40%) |
| Pattern familiarity | Well-known (rename, extract) | Common (restructure) | Novel (custom migration) |
| Dependency changes | None | Minor (1-2 imports) | Significant (module graph changes) |
|
|
283
|
+
|
|
284
|
+
**Rule**: Take the highest individual factor as the overall complexity. When in doubt, estimate higher.
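As a small illustration of the rule, assuming each factor has already been rated `low`, `medium`, or `high` (the function name and factor keys are hypothetical):

```python
LEVELS = ["low", "medium", "high"]


def overall_complexity(factors: dict[str, str]) -> str:
    """Return the highest rating among the individual factors."""
    return max(factors.values(), key=LEVELS.index)
```

An item rated `low` on file count but `high` on test coverage risk is therefore `high` overall.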
|
|
285
|
+
|
|
286
|
+
### Complexity Red Flags (Consider Splitting)
|
|
287
|
+
|
|
288
|
+
- Item touches more than 5 files
|
|
289
|
+
- Item requires changes to both test files and source files in non-trivial ways
|
|
290
|
+
- Item involves both structural change AND pattern migration
|
|
291
|
+
- Item has more than 6 acceptance criteria
|
|
292
|
+
- Item's description exceeds 100 words (suggests multiple operations combined)
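The countable red flags lend themselves to a quick check. This sketch covers only the three numeric ones; the field names (`files`, `acceptance_criteria`, `description`) are assumptions for illustration and may not match the actual refactor list schema.

```python
def red_flags(item: dict) -> list[str]:
    """Return the numeric red flags that apply to a refactor item."""
    flags = []
    if len(item.get("files", [])) > 5:
        flags.append("touches more than 5 files")
    if len(item.get("acceptance_criteria", [])) > 6:
        flags.append("more than 6 acceptance criteria")
    # Assumption: words are whitespace-separated tokens.
    if len(item.get("description", "").split()) > 100:
        flags.append("description exceeds 100 words")
    return flags
```

The remaining two flags (non-trivial changes to both tests and source, and mixing structural change with pattern migration) need human or AI judgment and are left out on purpose.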
|
|
@@ -0,0 +1,301 @@
|
|
|
1
|
+
# Behavior Preservation Guide
|
|
2
|
+
|
|
3
|
+
This guide covers strategies for ensuring that refactoring changes structure without changing behavior. Every refactor item must declare a behavior preservation strategy.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## 1. Preservation Strategies
|
|
8
|
+
|
|
9
|
+
### Strategy: test-gate
|
|
10
|
+
|
|
11
|
+
**Definition**: Run the full test suite after each refactoring change. All previously passing tests must continue to pass.
|
|
12
|
+
|
|
13
|
+
**How it works**:
|
|
14
|
+
1. Run the full test suite before starting the refactor item (establish baseline)
|
|
15
|
+
2. Implement the refactoring change
|
|
16
|
+
3. Run the full test suite again
|
|
17
|
+
4. Compare: all tests that passed before must still pass
|
|
18
|
+
5. If any test fails -> revert the change, investigate, and retry
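Step 4 compares against the baseline rather than demanding a fully green suite. A minimal sketch of that comparison, assuming test results have already been collected as a mapping from test name to pass/fail (how they are extracted from the test runner is out of scope, and `find_regressions` is a hypothetical name):

```python
def find_regressions(baseline: dict[str, bool], after: dict[str, bool]) -> list[str]:
    """Return tests that passed in the baseline but fail or are missing afterwards."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not after.get(name, False)
    )
```

A test that was already failing before the refactor is not reported, while a previously passing test that disappears is treated as a regression.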
|
|
19
|
+
|
|
20
|
+
**When to use**:
|
|
21
|
+
- Target area has good test coverage (>60%)
|
|
22
|
+
- Tests are reliable (no flaky tests in the target area)
|
|
23
|
+
- Test suite runs in reasonable time (<5 minutes for the relevant subset)
|
|
24
|
+
- Tests cover the behavior contracts you need to preserve
|
|
25
|
+
|
|
26
|
+
**Strengths**:
|
|
27
|
+
- Most reliable automated strategy
|
|
28
|
+
- Catches regressions immediately
|
|
29
|
+
- Well-understood and widely practiced
|
|
30
|
+
- Works with any test framework
|
|
31
|
+
|
|
32
|
+
**Limitations**:
|
|
33
|
+
- Only as good as test coverage — untested behavior can still break
|
|
34
|
+
- Slow test suites may bottleneck iteration speed
|
|
35
|
+
- Flaky tests raise false alarms (failures unrelated to the change) and erode trust in the gate
|
|
36
|
+
|
|
37
|
+
**Configuration in refactor-list.json**:
```json
{
  "behavior_preservation": "test-gate",
  "test_command": "npm test"
}
```
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
### Strategy: snapshot
|
|
48
|
+
|
|
49
|
+
**Definition**: Capture the observable output/state of the target code before and after refactoring, then compare.
|
|
50
|
+
|
|
51
|
+
**How it works**:
|
|
52
|
+
1. Identify observable outputs of the target code (API responses, rendered UI, log output, file output)
|
|
53
|
+
2. Capture a "before" snapshot by exercising the code with representative inputs
|
|
54
|
+
3. Implement the refactoring change
|
|
55
|
+
4. Capture an "after" snapshot with the same inputs
|
|
56
|
+
5. Compare: snapshots must match (or differ only in acceptable ways like formatting)
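Step 5 usually needs a normalization pass so that acceptable differences do not cause mismatches. The sketch below is one possible approach under stated assumptions: the only acceptable differences are ISO-8601 timestamps and whitespace layout, and `normalize` and `snapshots_match` are hypothetical helpers.

```python
import re

# Assumed acceptable differences: ISO-8601 timestamps and whitespace layout.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?")


def normalize(snapshot: str) -> str:
    """Mask timestamps and collapse whitespace runs to single spaces."""
    masked = TIMESTAMP.sub("<timestamp>", snapshot)
    return re.sub(r"\s+", " ", masked).strip()


def snapshots_match(before: str, after: str) -> bool:
    """Compare two captured outputs after normalization."""
    return normalize(before) == normalize(after)
```

Collapsing whitespace also hides whitespace changes inside string values, so a stricter comparison is needed when such values are part of the behavior contract.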
|
|
57
|
+
|
|
58
|
+
**When to use**:
|
|
59
|
+
- Test coverage is insufficient but behavior is observable
|
|
60
|
+
- The code produces deterministic output for given inputs
|
|
61
|
+
- You can identify representative inputs that exercise the key behavior paths
|
|
62
|
+
- API endpoints, CLI tools, data processing pipelines
|
|
63
|
+
|
|
64
|
+
**Strengths**:
|
|
65
|
+
- Works even when formal tests are missing
|
|
66
|
+
- Captures real behavior rather than test assertions
|
|
67
|
+
- Can detect subtle regressions that tests might miss
|
|
68
|
+
|
|
69
|
+
**Limitations**:
|
|
70
|
+
- Requires deterministic behavior (non-deterministic outputs need normalization)
|
|
71
|
+
- May miss edge cases if representative inputs are incomplete
|
|
72
|
+
- Snapshot comparison tools may need configuration for acceptable differences
|
|
73
|
+
- More manual setup than test-gate
|
|
74
|
+
|
|
75
|
+
**Configuration in refactor-list.json**:
```json
{
  "behavior_preservation": "snapshot",
  "snapshot_targets": ["API responses for /api/users/*", "CLI output for --help flag"]
}
```
|
|
82
|
+
|
|
83
|
+
---
|
|
84
|
+
|
|
85
|
+
### Strategy: manual
|
|
86
|
+
|
|
87
|
+
**Definition**: Human verification is required to confirm behavior is preserved. Used as a last resort.
|
|
88
|
+
|
|
89
|
+
**When to use**:
|
|
90
|
+
- No test coverage AND no easily observable deterministic output
|
|
91
|
+
- UI-heavy changes where visual regression is the primary concern
|
|
92
|
+
- Legacy code with unknown behavior contracts
|
|
93
|
+
- Code that interacts with external services in non-reproducible ways
|
|
94
|
+
|
|
95
|
+
**How it works**:
|
|
96
|
+
1. Document the current behavior (screenshots, recordings, written descriptions)
|
|
97
|
+
2. Implement the refactoring change
|
|
98
|
+
3. Human manually verifies the behavior matches the documentation
|
|
99
|
+
4. Human signs off on the change
|
|
100
|
+
|
|
101
|
+
**Strengths**:
|
|
102
|
+
- Works for any situation
|
|
103
|
+
- Humans can assess subjective quality (UI layout, user experience)
|
|
104
|
+
- Can catch issues that automated tools miss
|
|
105
|
+
|
|
106
|
+
**Limitations**:
|
|
107
|
+
- Slowest strategy — blocks on human availability
|
|
108
|
+
- Error-prone — humans miss regressions, especially subtle ones
|
|
109
|
+
- Not scalable — each item needs separate human attention
|
|
110
|
+
- Not repeatable — different humans may verify differently
|
|
111
|
+
|
|
112
|
+
**Configuration in refactor-list.json**:
```json
{
  "behavior_preservation": "manual",
  "verification_notes": "Manually verify login flow works: email login, social login, password reset"
}
```
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
## 2. Choosing the Right Strategy
|
|
123
|
+
|
|
124
|
+
Use this decision tree to select the appropriate strategy for each refactor item:
|
|
125
|
+
```
Does the target area have test coverage >60%?
├── YES: Are the tests reliable (no flaky tests)?
│   ├── YES → test-gate
│   └── NO: Fix flaky tests first, then → test-gate
│           (or if fixing is out of scope → snapshot)
└── NO: Does the code produce deterministic, observable output?
    ├── YES → snapshot
    └── NO → manual (flag as high-risk)
```
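The decision tree can be written as a small function. This is a literal sketch of the tree above, not the pipeline's own logic; `choose_strategy` and its parameter names are hypothetical.

```python
def choose_strategy(
    coverage_pct: float,
    tests_reliable: bool,
    deterministic_output: bool,
    can_fix_flaky: bool = True,
) -> str:
    """Pick a preservation strategy by following the decision tree."""
    if coverage_pct > 60:
        if tests_reliable or can_fix_flaky:
            # Unreliable tests must be fixed first before relying on test-gate.
            return "test-gate"
        return "snapshot"
    # Coverage at or below 60%: fall back on observable output, else manual.
    return "snapshot" if deterministic_output else "manual"
```

A result of `manual` should also be flagged as high-risk in the plan, as the tree notes.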
|
|
136
|
+
|
|
137
|
+
### Strategy Selection Table
|
|
138
|
+
| Test Coverage | Output Observable | Recommended Strategy | Risk Level |
|--------------|-------------------|---------------------|------------|
| High (>60%) | Yes | test-gate | Low |
| High (>60%) | No | test-gate | Low |
| Medium (30-60%) | Yes | test-gate + snapshot | Medium |
| Medium (30-60%) | No | test-gate (acknowledge gaps) | Medium |
| Low (<30%) | Yes | snapshot | Medium-High |
| Low (<30%) | No | manual | High |
| None (0%) | Yes | snapshot | High |
| None (0%) | No | manual | Very High |
|
|
149
|
+
|
|
150
|
+
### Mixed Strategies
|
|
151
|
+
|
|
152
|
+
For complex items, you can combine strategies:
|
|
153
|
+
- **test-gate + snapshot**: Run tests AND compare output snapshots. Provides defense in depth.
|
|
154
|
+
- **test-gate + manual**: Run tests AND have a human verify UI/UX aspects.
|
|
155
|
+
- Use the primary strategy in the `behavior_preservation` field and note the secondary in `verification_notes`.
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## 3. Common Behavior-Breaking Pitfalls
|
|
160
|
+
|
|
161
|
+
These are patterns that frequently cause unintended behavior changes during refactoring. Check for each one when planning.
|
|
162
|
+
|
|
163
|
+
### 3.1 Side Effect Ordering
|
|
164
|
+
|
|
165
|
+
**Pitfall**: Reordering function calls or module initialization can change side effects.
|
|
166
|
+
|
|
167
|
+
**Example**: Moving `initLogger()` ahead of `loadConfig()` when the logger reads its settings from the loaded config.
|
|
168
|
+
|
|
169
|
+
**Prevention**: Map side effects and their dependencies before restructuring. Document execution order constraints.
|
|
170
|
+
|
|
171
|
+
### 3.2 Error Handling Changes
|
|
172
|
+
|
|
173
|
+
**Pitfall**: Extracting code into a new function changes which errors are caught and where.
|
|
174
|
+
|
|
175
|
+
**Example**: A try/catch block that previously caught errors from inline code no longer catches them when the code is extracted to a separate function with its own error handling.
|
|
176
|
+
|
|
177
|
+
**Prevention**: Trace error propagation paths before and after. Ensure the same errors reach the same handlers.
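A compact illustration of this pitfall, with hypothetical function names throughout: the inline version lets the caller's handler apply a fallback, while the extracted helper swallows the error so that handler never runs.

```python
def parse_port_inline(raw: str) -> int:
    # Before: the conversion is inline, so this handler sees the ValueError.
    try:
        port = int(raw)
    except ValueError:
        return 8080
    return port


def _to_int(raw: str):
    # Extracted helper with its own error handling: the error never propagates.
    try:
        return int(raw)
    except ValueError:
        return None


def parse_port_extracted(raw: str):
    # After: the outer handler is dead code for bad input.
    try:
        port = _to_int(raw)
    except ValueError:
        return 8080
    return port
```

Both versions agree on valid input, so a test suite that only covers the happy path would not catch the difference.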
|
|
178
|
+
|
|
179
|
+
### 3.3 Closure and Scope Changes
|
|
180
|
+
|
|
181
|
+
**Pitfall**: Moving code changes what variables are in scope, especially with closures.
|
|
182
|
+
|
|
183
|
+
**Example**: Extracting a closure that captures `this` into a standalone function loses the `this` binding.
|
|
184
|
+
|
|
185
|
+
**Prevention**: Identify all captured variables. Ensure they are passed as parameters or the binding is preserved.
|
|
186
|
+
|
|
187
|
+
### 3.4 Import Order Side Effects
|
|
188
|
+
|
|
189
|
+
**Pitfall**: In some languages/frameworks, import order matters (module initialization, polyfills, monkey-patching).
|
|
190
|
+
|
|
191
|
+
**Example**: Moving an import of a polyfill to a different position causes it to load after the code that needs it.
|
|
192
|
+
|
|
193
|
+
**Prevention**: Identify imports with side effects. Document order constraints. Test module initialization explicitly.
|
|
194
|
+
|
|
195
|
+
### 3.5 Default Parameter Changes
|
|
196
|
+
|
|
197
|
+
**Pitfall**: Extracting a function and adding default parameters changes behavior for callers that relied on the old defaults.
|
|
198
|
+
|
|
199
|
+
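**Example**: Original: `function process(data, format) { format = format || 'json'; ... }`. Refactored: `function process(data, format = 'json') { ... }`. These behave differently for `process(data, '')`: the original falls back to `'json'` for any falsy value, while the default parameter applies only when `format` is `undefined`, so the empty string passes through unchanged.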
|
|
200
|
+
|
|
201
|
+
**Prevention**: Audit all default value logic. Use identical defaulting behavior in the refactored version.
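The example above is JavaScript, but the same trap exists in other languages. A Python analogue, with hypothetical function names, where an `or` fallback is replaced by a default argument:

```python
def process_original(data, fmt=None):
    # Any falsy value, including "" and None, falls back to "json".
    fmt = fmt or "json"
    return fmt


def process_refactored(data, fmt="json"):
    # Only an omitted argument gets the default; "" and None pass through.
    return fmt
```

Callers that pass an empty string or `None` explicitly get `"json"` from the original and their own value back from the refactored version.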
|
|
202
|
+
|
|
203
|
+
### 3.6 Async/Await Conversion Gotchas
|
|
204
|
+
|
|
205
|
+
**Pitfall**: Converting callbacks to async/await can change error propagation, timing, and concurrency.
|
|
206
|
+
|
|
207
|
+
**Example**: Callback errors that were silently swallowed now surface as unhandled promise rejections.
|
|
208
|
+
|
|
209
|
+
**Prevention**: Map all error paths in the callback version. Ensure async version handles every path. Test with error scenarios.

### 3.7 Type Coercion Changes

**Pitfall**: Moving code between contexts can change implicit type coercion behavior.

**Example**: `==` comparisons that relied on type coercion break when types change due to new module boundaries.

**Prevention**: Prefer strict equality. Audit type assumptions at module boundaries.
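A small sketch of how switching to strict equality can change results when a caller passes a string id. The `users` data and function names are hypothetical. The preserved version normalizes the type at the boundary; `Number()` matches what `==` does for string inputs, though not for every value (for example `null`), so the actual callers should be audited.

```javascript
// Hypothetical data: ids are stored as numbers.
const users = [
  { id: 1, name: 'ada' },
  { id: 2, name: 'lin' },
];

// Before: loose equality lets a string id such as '2' match the number 2.
function findBefore(id) {
  return users.find((u) => u.id == id);
}

// After (behavior changed): strict equality no longer matches '2'.
function findStrictNaive(id) {
  return users.find((u) => u.id === id);
}

// After (behavior preserved for string ids): convert once at the boundary.
function findStrictPreserved(id) {
  const numericId = Number(id);
  return users.find((u) => u.id === numericId);
}

console.log(findBefore('2').name);          // prints "lin"
console.log(findStrictNaive('2'));          // prints undefined
console.log(findStrictPreserved('2').name); // prints "lin"
```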

### 3.8 Timing and Race Conditions

**Pitfall**: Restructuring async code can change execution timing, revealing or creating race conditions.

**Example**: Splitting a synchronous operation into two async steps creates a window where state is inconsistent.

**Prevention**: Identify shared mutable state. Ensure atomicity is preserved. Test concurrent scenarios.
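The window can be reproduced with a check-then-update on shared state. This is an illustrative sketch with hypothetical names; `await Promise.resolve()` stands in for any asynchronous step, such as I/O. In the racy version both callers pass the check before either one updates the balance. The safe version keeps the check and the update in one synchronous section.

```javascript
// Racy: an await sits between the check and the update.
async function withdrawRacy(account, amount) {
  if (account.balance >= amount) {
    await Promise.resolve(); // stand-in for an async step
    account.balance -= amount;
    return true;
  }
  return false;
}

// Safe: async work finishes first; check and update stay atomic.
async function withdrawSafe(account, amount) {
  await Promise.resolve(); // stand-in for an async step
  if (account.balance >= amount) {
    account.balance -= amount;
    return true;
  }
  return false;
}

// Runs two withdrawals of 80 concurrently against a balance of 100.
async function runTwoWithdrawals(withdraw) {
  const account = { balance: 100 };
  const results = await Promise.all([withdraw(account, 80), withdraw(account, 80)]);
  return { results, balance: account.balance };
}
```

With `withdrawRacy`, both withdrawals succeed and the balance ends at -60; with `withdrawSafe`, only the first succeeds and the balance ends at 20.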

---

## 4. Test Coverage Assessment

Before refactoring, assess the test coverage of the target area to select the appropriate preservation strategy.

### Quick Coverage Assessment

If a formal coverage tool is available:

```bash
# JavaScript (Istanbul/nyc)
npx nyc --reporter=text -- npm test -- --grep "target-module"

# Python (pytest-cov)
pytest --cov=target_module --cov-report=term-missing

# Go
go test -coverprofile=coverage.out ./target-package/...
```

### Manual Coverage Assessment

When coverage tools aren't available, assess manually:

1. **List all public functions/methods** in the target area
2. **Search for test files** that import or reference the target
3. **Check test assertions**: do they test behavior or just structure?
4. **Identify untested paths**: error handling, edge cases, default behavior

### Coverage-Based Planning Decisions

| Coverage Level | Planning Decision |
|---------------|-------------------|
| >80% | Proceed with test-gate. High confidence in behavior preservation. |
| 60-80% | Proceed with test-gate. Note gaps in refactor item descriptions for the pipeline to be cautious about. |
| 30-60% | Consider writing additional tests before refactoring (as a prerequisite R-000 item). Or use snapshot strategy for low-coverage areas. |
| <30% | Strongly recommend writing tests first. If user declines, use snapshot or manual strategy and flag as high risk. |
| 0% | WARN user explicitly. Recommend writing tests as a prerequisite. If user insists on proceeding, use manual strategy and document all known behaviors. |
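
The table can be read as a simple lookup. The sketch below is one possible encoding, not the pipeline's implementation: `planForCoverage` and the returned `action` and `strategy` labels are hypothetical, and the handling of boundary values (exactly 30, 60, and 80, where the table's ranges touch) is an assumption.

```javascript
// Maps a coverage percentage to a planning decision from the table.
// Assumptions: 0 is its own row, 30 and 60 fall into the higher band,
// and 80 falls into the 60-80 band.
function planForCoverage(percent) {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError(`coverage must be between 0 and 100, got ${percent}`);
  }
  if (percent === 0) {
    return { action: 'warn-user', strategy: 'manual' };
  }
  if (percent < 30) {
    return { action: 'recommend-tests-first', strategy: 'snapshot-or-manual' };
  }
  if (percent < 60) {
    return { action: 'consider-prerequisite-tests', strategy: 'snapshot' };
  }
  if (percent <= 80) {
    return { action: 'proceed-and-note-gaps', strategy: 'test-gate' };
  }
  return { action: 'proceed', strategy: 'test-gate' };
}

console.log(planForCoverage(72).action); // prints "proceed-and-note-gaps"
```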

### Adding Tests as a Prerequisite Item

When test coverage is insufficient, add a prerequisite refactor item:

```
Refactor Item R-000:
Title: Add test coverage for [target area] before refactoring
Type: restructure
Scope: [test files]
Priority: critical
Complexity: medium
Behavior Preservation: manual (no existing tests to gate against)
Acceptance Criteria:
- Test coverage for [target area] reaches >60%
- Tests cover: [list key behavior contracts]
- All new tests pass
Dependencies: none
```

This item runs first, establishing the test baseline that all subsequent items use for their test-gate strategy.

---

## 5. Behavior Verification Checklist

Before marking a refactor item as complete, verify:

- [ ] All previously-passing tests still pass
- [ ] No new warnings or deprecation notices in test output
- [ ] No new lint errors introduced
- [ ] Public API surface is unchanged (same exports, same function signatures)
- [ ] Error messages and error codes are unchanged (consumers may depend on these)
- [ ] Logging output is unchanged (monitoring/alerting may depend on log patterns)
- [ ] Configuration interface is unchanged (env vars, config files, CLI flags)
- [ ] Performance characteristics are within acceptable bounds (no regression >10%)
- [ ] No dead code left behind (unused imports, unreachable functions)
|