deepflow 0.1.90 → 0.1.91
- package/bin/install.js +13 -4
- package/bin/ratchet.js +327 -0
- package/bin/ratchet.test.js +869 -0
- package/hooks/df-snapshot-guard.js +105 -0
- package/hooks/df-snapshot-guard.test.js +506 -0
- package/package.json +1 -1
- package/src/commands/df/execute.md +36 -17
- package/templates/config-template.yaml +24 -0
@@ -101,24 +101,37 @@ Context ≥50% → checkpoint and exit. Before spawning: `TaskUpdate(status: "in
 
 ### 5.5. RATCHET CHECK
 
-Run
+Run `node bin/ratchet.js` in the worktree directory after each agent completes:
+```bash
+node bin/ratchet.js --worktree ${WORKTREE_PATH} --snapshot .deepflow/auto-snapshot.txt
+```
+
+The script handles all health checks internally and outputs structured JSON:
+```json
+{"status": "PASS"|"FAIL"|"SALVAGEABLE", "reason": "...", "details": "..."}
+```
+
+**Exit codes:** 0 = PASS, 1 = FAIL (script already ran `git revert HEAD --no-edit`), 2 = SALVAGEABLE (lint/typecheck only; build+tests passed).
+
+**You MUST NOT inspect, classify, or reinterpret test failures. FAIL means revert. No exceptions.**
 
-
-
-
-
-| `Cargo.toml` | `cargo build` | `cargo test` | — | `cargo clippy` |
-| `go.mod` | `go build ./...` | `go test ./...` | — | `go vet ./...` |
+**Prohibited actions during ratchet:**
+- No `git stash` or `git checkout` for investigation purposes
+- No inline edits to pre-existing test files
+- No reading raw test output to decide what "really" failed
 
-
+**Broken-tests policy:** Updating pre-existing tests requires a separate dedicated task in PLAN.md with explicit justification — never inline during execution.
+
+**Orchestrator response by exit code:**
+- **Exit 0 (PASS):** Commit stands. Proceed to §5.6 wave test agent.
+- **Exit 1 (FAIL):** Script already reverted. Set `TaskUpdate(status: "pending")`. Report: `"✗ T{n}: reverted"`.
+- **Exit 2 (SALVAGEABLE):** Spawn `Agent(model="haiku")` to fix lint/typecheck issues. Re-run `node bin/ratchet.js`. If still non-zero → revert both commits, set status pending.
 
 **Edit scope validation:** `git diff HEAD~1 --name-only` vs allowed globs. Violation → revert, report.
 **Impact completeness:** diff vs Impact callers/duplicates. Gap → advisory warning (no revert).
 
 **Metric gate (Optimize only):** Run `eval "${metric_command}"` with cwd=`${WORKTREE_PATH}` (never `cd && eval`). Parse float (non-numeric → revert). Compare using `direction`+`min_improvement_threshold`. Both ratchet AND metric must pass → keep. Ratchet pass + metric stagnant → revert. Secondary metrics: regression > `regression_threshold` (5%) → WARNING in auto-report.md (no revert).
 
-**Output truncation:** Success → suppress. Build fail → last 15 lines. Test fail → names + last 20 lines. Typecheck/lint → count + first 5 errors.
-
 **Token tracking result (on pass):** Read `end_percentage`. Sum token fields from `.deepflow/token-history.jsonl` between start/end timestamps (awk ISO 8601 compare). Write to `.deepflow/results/T{N}.yaml`:
 ```yaml
 tokens:
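Reviewer sketch (not part of the package): the exit-code rules added to §5.5 amount to a small decision table. The helper below is one hypothetical way to express it. The function name `ratchetAction`, the `afterSalvage` flag, and the action labels are all illustrative assumptions, not identifiers from deepflow.

```javascript
// Hypothetical helper: maps a ratchet.js exit code to the orchestrator's next step.
// `afterSalvage` is true when this is the re-run that follows the lint/typecheck fix agent.
function ratchetAction(exitCode, afterSalvage = false) {
  if (![0, 1, 2].includes(exitCode)) {
    throw new RangeError(`unexpected ratchet exit code: ${exitCode}`);
  }
  if (exitCode === 0) {
    // PASS: the commit stands; the wave test agent (§5.6) runs next.
    return { action: "proceed-to-wave-test", taskStatus: null };
  }
  if (afterSalvage) {
    // Still non-zero after the fix attempt: revert both commits, task goes back to pending.
    return { action: "revert-both-commits", taskStatus: "pending" };
  }
  if (exitCode === 1) {
    // FAIL: the script has already reverted HEAD; only report and reset the task.
    return { action: "report-reverted", taskStatus: "pending" };
  }
  // SALVAGEABLE: build and tests passed, so one lint/typecheck fix attempt is allowed.
  return { action: "spawn-lint-fix-agent", taskStatus: null };
}
```

Keeping the mapping free of any test-output inspection mirrors the rule that FAIL always means revert.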
@@ -131,10 +144,6 @@ tokens:
 ```
 Omit if context.json/token-history.jsonl/awk unavailable. Never fail ratchet for tracking errors.
 
-**Evaluate:** All pass → commit stands. Failure → partial salvage:
-1. Lint/typecheck-only (build+tests passed): spawn `Agent(model="haiku")` to fix. Re-ratchet. Fail → revert both.
-2. Build/test failure → `git revert HEAD --no-edit` (no salvage).
-
 ### 5.6. WAVE TEST AGENT
 
 <!-- AC-8: After wave ratchet passes, Opus test agent spawns and writes unit tests -->
@@ -147,8 +156,11 @@ Omit if context.json/token-history.jsonl/awk unavailable. Never fail ratchet for
 
 **Flow:**
 1. Capture the implementation diff: `git -C ${WORKTREE_PATH} diff HEAD~1` → store as `IMPL_DIFF`.
-2.
-
+2. Gather dedup context:
+- Read `.deepflow/auto-snapshot.txt` → store full file list as `SNAPSHOT_FILES`.
+- Extract existing test function names: `grep -h 'describe\|it(\|test(\|def test_\|func Test' $(cat .deepflow/auto-snapshot.txt) 2>/dev/null | head -50` → store as `EXISTING_TEST_NAMES`.
+3. Spawn `Agent(model="opus")` with Wave Test prompt (§6), passing `SNAPSHOT_FILES` and `EXISTING_TEST_NAMES`. `run_in_background=true`. End turn, wait.
+4. On notification:
 a. Run ratchet check (§5.5) — all new + pre-existing tests must pass.
 b. **Tests pass** → commit stands. **Re-snapshot** immediately so wave N+1 ratchet includes wave N tests:
 ```bash
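Reviewer sketch (not part of the package): the `grep ... | head -50` step in the new flow can be mirrored in JavaScript to see what it actually captures. The names `TEST_LINE` and `extractTestLines` are hypothetical, and the function takes file contents as strings rather than reading the files listed in `.deepflow/auto-snapshot.txt`.

```javascript
// Same alternation as the grep pattern: in GNU grep's basic syntax `\|` is "or"
// and an unescaped `(` is a literal parenthesis.
const TEST_LINE = /describe|it\(|test\(|def test_|func Test/;

// Returns the first `limit` matching lines across all sources (head -50 equivalent).
function extractTestLines(sources, limit = 50) {
  const matches = [];
  for (const source of sources) {
    for (const line of source.split("\n")) {
      if (TEST_LINE.test(line)) {
        matches.push(line);
        if (matches.length >= limit) return matches;
      }
    }
  }
  return matches;
}
```

The pattern is a coarse substring match: a line containing `submit(` also matches `it(`, so `EXISTING_TEST_NAMES` may include some non-test lines, and the 50-line cap may drop names in large suites.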
@@ -167,7 +179,7 @@ Omit if context.json/token-history.jsonl/awk unavailable. Never fail ratchet for
 {failure_feedback}
 Fix the issues above. Do NOT repeat the same mistakes.
 ```
-- On implementer notification: ratchet check (§5.5). Passed → goto step
+- On implementer notification: ratchet check (§5.5). Passed → goto step 2 (gather dedup context, spawn test agent again). Failed → same retry logic.
 - If `attempt_count >= 3`:
 - Revert ALL commits back to pre-task state: `git -C ${WORKTREE_PATH} reset --hard {pre_task_commit}`
 - `TaskUpdate(status: "pending")`
@@ -279,9 +291,16 @@ Implementation diff:
 Files changed: {changed_files}
 Existing test patterns: {test_file_examples from auto-snapshot.txt, first 3}
 
+Pre-existing test files (from auto-snapshot.txt):
+{SNAPSHOT_FILES}
+
+Existing test function names (do NOT duplicate these):
+{EXISTING_TEST_NAMES}
+
 --- END ---
 Write thorough unit tests covering: happy paths, edge cases, error handling.
 Follow existing test conventions in the codebase.
+Do not duplicate tests for functionality already covered by the existing tests listed above.
 Commit as: test({spec}): wave-{N} unit tests
 Do NOT modify implementation files. ONLY add/edit test files.
 Last line of your response MUST be: TASK_STATUS:pass or TASK_STATUS:fail
@@ -95,6 +95,30 @@ quality:
 # Timeout in seconds to wait for the dev server to become ready (default: 30)
 browser_timeout: 30
 
+# Ratchet configuration for /df:verify health gate
+# Ratchet snapshots baseline metrics (tests passing, coverage, type checks) before execution
+# and ensures subsequent runs don't regress. These overrides control which commands ratchet monitors.
+ratchet:
+  # Override auto-detected build command for ratchet health checks
+  # If empty, ratchet will detect from indicator files (package.json scripts, Cargo.toml, etc.)
+  # Examples: "npm run build", "cargo build", "go build ./..."
+  build_command: ""
+
+  # Override auto-detected test command for ratchet health checks
+  # If empty, ratchet will detect from indicator files
+  # Examples: "npm test", "pytest", "go test ./...", "cargo test"
+  test_command: ""
+
+  # Override auto-detected typecheck command for ratchet health checks
+  # If empty, ratchet will detect from indicator files (e.g., "tsc --noEmit" for TypeScript)
+  # Examples: "tsc --noEmit", "mypy .", "cargo check"
+  typecheck_command: ""
+
+  # Override auto-detected lint command for ratchet health checks
+  # If empty, ratchet will detect from indicator files
+  # Examples: "eslint .", "flake8", "cargo clippy"
+  lint_command: ""
+
 # deepflow-dashboard team mode settings
 # dashboard_url: URL of the shared team server for POST ingestion
 # Leave blank (or omit) to use local-only mode (no data is pushed)
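Reviewer sketch (not part of the package): the config comments describe an "override, else auto-detect" rule for each command. The snippet below shows one way that precedence could work. It is an assumption about behavior, not the logic in `bin/ratchet.js`. The `DETECTED` table reuses the `Cargo.toml` and `go.mod` rows removed from execute.md in this diff, assuming their columns are build, test, typecheck, lint. The names `DETECTED` and `resolveCommands` are hypothetical.

```javascript
// Assumed detection table (indicator file -> default commands); null means "no command".
const DETECTED = {
  "Cargo.toml": { build: "cargo build", test: "cargo test", typecheck: null, lint: "cargo clippy" },
  "go.mod": { build: "go build ./...", test: "go test ./...", typecheck: null, lint: "go vet ./..." },
};

// ratchetConfig: the parsed `ratchet:` mapping; filesPresent: file names in the worktree root.
function resolveCommands(ratchetConfig, filesPresent) {
  const indicator = Object.keys(DETECTED).find((file) => filesPresent.includes(file));
  const detected = indicator ? DETECTED[indicator] : {};
  const resolved = {};
  for (const kind of ["build", "test", "typecheck", "lint"]) {
    // A non-empty override wins; an empty or missing one falls back to detection.
    const override = (ratchetConfig[`${kind}_command`] || "").trim();
    resolved[kind] = override !== "" ? override : detected[kind] ?? null;
  }
  return resolved;
}
```

For example, a Go worktree with `test_command: "pytest"` set would keep the detected build and lint commands and use the override only for tests.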