devlyn-cli 1.5.2 → 1.5.3
package/CLAUDE.md
CHANGED
@@ -61,7 +61,7 @@ This runs the full pipeline automatically: **Build → Browser Validate → Eval
 For web projects, the Browser Validate phase starts the dev server and tests the implemented feature in a real browser — clicking buttons, filling forms, verifying results. If the feature doesn't work, findings feed back into the fix loop.
 
 Optional flags:
-- `--max-rounds
+- `--max-rounds 6` — increase max evaluate-fix iterations (default: 4)
 - `--skip-browser` — skip browser validation phase (auto-skipped for non-web changes)
 - `--skip-review` — skip team-review phase
 - `--skip-clean` — skip clean phase
@@ -25,7 +25,7 @@ This pipeline runs hands-free. The user launches it to walk away and come back t
 
 1. Extract the task/issue description from `<pipeline_config>`.
 2. Determine optional flags from the input (defaults in parentheses):
-- `--max-rounds N` (
+- `--max-rounds N` (4) — max evaluate-fix loops before stopping with a report
 - `--skip-review` (false) — skip team-review phase
 - `--security-review` (auto) — run dedicated security audit. Auto-detects: runs when changes touch auth, secrets, user data, API endpoints, env/config, or crypto. Force with `--security-review always` or skip with `--security-review skip`
 - `--skip-clean` (false) — skip clean phase
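The flag defaults in this hunk can be sketched as a small parser. This is only an illustration: the hunk describes flags that the pipeline prompt reads from its input, and the function name `parse_pipeline_flags`, the use of `argparse`, and the explicit `auto` choice for `--security-review` are assumptions, not part of devlyn-cli.

```python
import argparse


def parse_pipeline_flags(argv):
    """Hypothetical parser; defaults mirror the values shown in parentheses in the hunk."""
    parser = argparse.ArgumentParser(prog="pipeline-flags-sketch")
    # Max evaluate-fix loops before stopping with a report (default 4).
    parser.add_argument("--max-rounds", type=int, default=4)
    # Boolean skips default to false.
    parser.add_argument("--skip-review", action="store_true")
    parser.add_argument("--skip-clean", action="store_true")
    # "auto" is the default; "always" forces the audit, "skip" disables it.
    parser.add_argument(
        "--security-review",
        choices=["auto", "always", "skip"],
        default="auto",
    )
    return parser.parse_args(argv)


flags = parse_pipeline_flags(["--max-rounds", "6", "--security-review", "always"])
print(flags.max_rounds, flags.skip_review, flags.security_review, flags.skip_clean)
# prints: 6 False always False
```

With an empty argument list, the sketch returns the defaults from the hunk: 4 rounds, no skips, and `auto` security review.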
@@ -146,7 +146,9 @@ You are an independent evaluator. Your job is to grade work produced by another
 - pattern description
 ```
 
-Verdict rules: BLOCKED = any CRITICAL issues. NEEDS WORK = HIGH issues that should be fixed. PASS WITH ISSUES = only
+Verdict rules: BLOCKED = any CRITICAL issues. NEEDS WORK = HIGH or MEDIUM issues that should be fixed. PASS WITH ISSUES = only LOW cosmetic notes. PASS = clean.
+
+Important: Do NOT label findings as "pre-existing" or "out of scope" to avoid fixing them. If a problem exists in the current code and relates to the done criteria, it's a finding regardless of when it was introduced. The goal is working software, not blame attribution.
 
 Calibration examples to guide your judgment:
 - A catch block that logs but doesn't surface error to user = HIGH (not MEDIUM). Logging is not error handling.
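The 1.5.3 verdict rules in this hunk amount to a severity-to-verdict mapping, which might be expressed roughly as follows. The function name `verdict_for` and the representation of findings as a list of severity strings are assumptions made for illustration; in the package itself the evaluator is a prompt, not code.

```python
def verdict_for(severities):
    """Map a list of finding severities to a verdict, per the 1.5.3 wording."""
    found = set(severities)
    if "CRITICAL" in found:
        return "BLOCKED"            # any CRITICAL issue blocks
    if found & {"HIGH", "MEDIUM"}:
        return "NEEDS WORK"         # MEDIUM now counts alongside HIGH
    if "LOW" in found:
        return "PASS WITH ISSUES"   # only LOW cosmetic notes remain
    return "PASS"                   # clean


print(verdict_for(["LOW", "MEDIUM"]))  # prints: NEEDS WORK
print(verdict_for([]))                 # prints: PASS
```

Checking the most severe level first keeps the rules mutually exclusive: a findings list with both CRITICAL and LOW entries resolves to BLOCKED.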
@@ -161,10 +163,10 @@ Do NOT delete `.devlyn/done-criteria.md` or `.devlyn/EVAL-FINDINGS.md` — the o
 3. **If `--with-codex` includes `evaluate` or `both`**: Read `references/codex-integration.md` and follow the "PHASE 2-CODEX: CROSS-MODEL EVALUATE" section. This runs Codex as a second evaluator and merges findings into `EVAL-FINDINGS.md`.
 4. Branch on verdict (from the merged findings if Codex was used):
 - `PASS` → skip to PHASE 3
-- `PASS WITH ISSUES` →
+- `PASS WITH ISSUES` → go to PHASE 2.5 (fix loop) — LOW-only issues are still issues; fix them
 - `NEEDS WORK` → go to PHASE 2.5 (fix loop)
 - `BLOCKED` → go to PHASE 2.5 (fix loop)
-5. If `.devlyn/EVAL-FINDINGS.md` was not created, treat as
+5. If `.devlyn/EVAL-FINDINGS.md` was not created, treat as NEEDS WORK and log a warning — absence of evidence is not evidence of absence
 
 ## PHASE 2.5: FIX LOOP (conditional)
 
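The branching in steps 4 and 5 of this hunk, as worded in 1.5.3, could be sketched like this. The helper `next_phase` and its arguments are hypothetical; the sketch assumes the caller has already checked whether `.devlyn/EVAL-FINDINGS.md` exists and has extracted the verdict from it.

```python
import logging


def next_phase(findings_exists, verdict=None):
    """Return the phase to run next after evaluation (illustrative only)."""
    if not findings_exists:
        # A missing findings file is treated as NEEDS WORK, with a warning.
        logging.warning("EVAL-FINDINGS.md was not created; treating as NEEDS WORK")
        verdict = "NEEDS WORK"
    # Only a clean PASS skips the fix loop; every other verdict enters it.
    return "PHASE 3" if verdict == "PASS" else "PHASE 2.5"


print(next_phase(True, "PASS"))              # prints: PHASE 3
print(next_phase(True, "PASS WITH ISSUES"))  # prints: PHASE 2.5
```

In this sketch the missing-file case overrides whatever verdict was passed in, so a stale or assumed `PASS` cannot bypass the fix loop.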
@@ -174,7 +176,7 @@ Spawn a subagent using the Agent tool with `mode: "bypassPermissions"` to fix th
 
 Agent prompt — pass this to the Agent tool:
 
-Read `.devlyn/EVAL-FINDINGS.md` — it contains specific issues found by an independent evaluator. Fix every CRITICAL and
+Read `.devlyn/EVAL-FINDINGS.md` — it contains specific issues found by an independent evaluator. Fix every finding regardless of severity (CRITICAL, HIGH, MEDIUM, and LOW). The pipeline loops until the evaluator returns PASS — there is no "shippable with issues" shortcut.
 
 The original done criteria are in `.devlyn/done-criteria.md` — your fixes must still satisfy those criteria. Do not delete or weaken criteria to make them pass.
 
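Taken together with the `--max-rounds` default of 4, the "loop until PASS" behaviour described in this prompt might look like the sketch below. The callables `evaluate` and `fix` stand in for the evaluator and fix subagents and are purely hypothetical, as is the way rounds are counted here (one round per fix attempt).

```python
def run_fix_loop(evaluate, fix, max_rounds=4):
    """Alternate fix and evaluate until the verdict is PASS or rounds run out."""
    verdict = evaluate()
    rounds = 0
    while verdict != "PASS" and rounds < max_rounds:
        fix()                 # address every finding, regardless of severity
        rounds += 1
        verdict = evaluate()  # re-grade the result
    return verdict, rounds


# Simulated evaluator that passes on its third evaluation.
verdicts = iter(["NEEDS WORK", "PASS WITH ISSUES", "PASS"])
print(run_fix_loop(lambda: next(verdicts), lambda: None))
# prints: ('PASS', 2)
```

If the evaluator never returns PASS, the loop stops after `max_rounds` fix attempts and returns the last verdict, which corresponds to stopping with a report.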