@delegance/claude-autopilot 7.4.0 → 7.4.2

package/CHANGELOG.md CHANGED
@@ -2,6 +2,99 @@
 
  - v5.6 Phase 7 (docs reconciliation) — pending.
 
+ ## 7.4.2 (2026-05-11)
+
+ **v7.4.2 — risk-tiered codex pass policy in autopilot skill.**
+ Docs-only PR. Codifies finding N2 from the v7.4.1 codex strategic
+ review into `skills/autopilot/SKILL.md`.
+
+ **New policy table** in the skill:
+
+ | Spec risk | # of codex passes |
+ |---|---|
+ | **Low** (CLI UX, doc-only, scaffolding, CI tweaks) | 1 |
+ | **Medium** (new exec modes, auth, billing, data-access, env vars, API contracts) | 2 |
+ | **High** (sandboxing, multi-tenancy, auto-merge, repo-mutation, secrets, RPC/SECURITY DEFINER) | 3 + external review |
+
+ **Convention:** spec docs declare `risk: low | medium | high` in
+ frontmatter. Omitted defaults to **medium** (safer than defaulting
+ to low).
+
+ **v7.x examples** included in the skill text:
+ * v7.1.7 (low) — 1 pass, 0 CRITICALs in practice.
+ * v7.4.0 (low) — 1 pass, 2 CRITICALs caught pre-impl.
+ * v7.0 Phase 6 (high) — 3 passes, would have shipped
+ credential-exfiltration vector C3 without all three.
+ * v8.0 spec (high) — 2 passes done, needs 3rd before v8 alpha.
+
+ No code change. Bumping to 7.4.2.
+
+ ## 7.4.1 (2026-05-11)
+
+ **v7.4.1 — strategic pivot doc from codex 5.5 review.** Docs-only
+ PR. Records the decision to pause v8 daemon implementation pending
+ customer discovery, plus 8 other findings from the codex strategic
+ review of full project state on 2026-05-11.
+
+ **Key outcome:** "ship v8 daemon" is NOT the next milestone. The CLI
+ chat-session loop is the validated asset; v8 is unvalidated. New
+ priority order: (1) customer discovery sprint, (2) hosted beta
+ readiness slice (operational), (3) org-tier revocation completion,
+ (4) risk-tiered codex pass policy in the autopilot skill.
+
+ **Process changes adopted:**
+
+ * **Risk-tiered codex passes** (1 for low-risk CLI UX, 2 for new
+ exec/auth/billing/data-access modes, 3 for sandboxing /
+ multi-tenancy / repo-mutation).
+ * **Strategic codex review every ~10 PRs** (separate from per-spec
+ passes — catches "ship more without validating demand" trap).
+ * **Bounded benchmark suite gate** (4 repo shapes only, run
+ pre-release + after major workflow changes — already in v8 spec).
+
+ **v8 IF customer discovery validates demand:** local-only alpha
+ first (per W5 of codex review). NO hosted workers, NO billing, NO
+ auto-merge until alpha demand is proven.
+
+ Full doc at `docs/strategy/2026-05-11-codex-pivot.md`.
+
+ ## 7.4.0 (2026-05-11)
+
+ **v7.4.0 — scaffold per-stack support (Python + FastAPI).** Closes
+ the v7.1.6/v7.1.8 benchmark caveat ("n=1, Node 22 ESM only —
+ Python/Rust/Go remain v8 follow-ups") and gates v8 spec
+ stabilization criteria #2 (4-repo benchmark suite).
+
+ * **Stack detection precedence** (codex C1): explicit `--stack` >
+ FastAPI > Python > Node > detected-but-unsupported > Node fallback.
+ FastAPI checked BEFORE Python so FastAPI specs that include
+ `pyproject.toml` aren't mis-classified.
+ * **FastAPI scaffold completeness** (codex C2): generates a runnable
+ `src/<package>/main.py` with `app = FastAPI()`, `/health` route,
+ `run()` function, plus `tests/test_main.py` (otherwise the
+ `[project.scripts]` entry was dangling).
+ * **Name normalization** (codex W1): PEP 503 distribution name +
+ valid Python identifier package name. `my-pkg-2` → distribution
+ `my-pkg-2`, package `my_pkg_2`. Hatchling explicit `packages`
+ config always present.
+ * **Detected-but-unsupported** (codex W2): Go/Rust/Ruby specs →
+ exit 3 with diagnostic, NOT silent fallback to Node.
+ * **Polyglot guard** (codex W3): specs listing both `package.json`
+ AND `pyproject.toml` without `--stack` → exit 3.
+ * **Narrow dep extraction** (codex W6): 3 patterns only, no inferred
+ versions, dedup by PEP 503 normalized name. FastAPI auto-includes
+ `fastapi>=0.110` + `uvicorn[standard]>=0.27`.
+ * **Module split**: `scaffold.ts` is now the dispatcher;
+ per-stack scaffolders live under
+ `src/cli/scaffold/{node,python,types}.ts`.
+ * **New flags**: `--stack <node|python|fastapi>`, `--list-stacks`.
+ * **Integration test** (codex N3): scaffolds FastAPI + creates
+ isolated venv (handles PEP 668) + `pip install -e .` +
+ import-app. Skipped cleanly when `python3` unavailable.
+
+ 1563 → 1597 CLI tests; tsc clean; build clean. PR #155 spec +
+ #156 impl. Version 7.3.0 → 7.4.0.
+
  ## 7.3.0 (2026-05-10)
 
  **v7.3.0 — library export surface for v8 daemon.** Minor bump
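Editorial sketch (not part of the published diff): the stack detection precedence, polyglot guard, and name normalization described in the 7.4.0 entry above could look roughly like the following. Every identifier here (`SpecSignals`, `detectStack`, `pep503Name`, `pythonPackageName`) is hypothetical, as are the signals used to recognise FastAPI or an unsupported language and the position of the polyglot guard relative to the other checks. The package's real `scaffold.ts` may be organised quite differently.

```typescript
// Hypothetical types; the changelog does not show the real ones.
type Stack = "node" | "python" | "fastapi";

interface SpecSignals {
  explicitStack?: Stack;    // value of --stack, if the user passed it
  files: string[];          // manifest file names the spec lists
  mentionsFastApi: boolean; // assumed signal: the spec references FastAPI
  otherLanguage?: string;   // assumed signal: e.g. "go", "rust", "ruby"
}

type Detection =
  | { kind: "ok"; stack: Stack }
  | { kind: "exit"; code: 3; reason: string };

function detectStack(s: SpecSignals): Detection {
  // An explicit --stack always wins.
  if (s.explicitStack) return { kind: "ok", stack: s.explicitStack };

  const hasNode = s.files.includes("package.json");
  const hasPython = s.files.includes("pyproject.toml");

  // Polyglot guard: both manifests and no --stack is ambiguous, so exit 3.
  // (Running the guard before the FastAPI check is an assumption.)
  if (hasNode && hasPython) {
    return { kind: "exit", code: 3, reason: "polyglot spec: pass --stack" };
  }

  // FastAPI is checked before plain Python, because a FastAPI spec
  // usually lists pyproject.toml too and would otherwise match Python.
  if (s.mentionsFastApi) return { kind: "ok", stack: "fastapi" };
  if (hasPython) return { kind: "ok", stack: "python" };
  if (hasNode) return { kind: "ok", stack: "node" };

  // Detected-but-unsupported: fail loudly instead of scaffolding Node.
  if (s.otherLanguage) {
    return { kind: "exit", code: 3, reason: `unsupported stack: ${s.otherLanguage}` };
  }

  // Nothing recognisable: Node fallback.
  return { kind: "ok", stack: "node" };
}

// PEP 503 normalization: collapse runs of "-", "_", "." to "-" and lowercase.
function pep503Name(raw: string): string {
  return raw.replace(/[-_.]+/g, "-").toLowerCase();
}

// Derive an importable package name from the distribution name.
// Prefixing "_" when the name starts with a digit is an assumption;
// Python keywords are not handled in this sketch.
function pythonPackageName(raw: string): string {
  const base = pep503Name(raw).replace(/-/g, "_").replace(/[^a-z0-9_]/g, "");
  return /^[0-9]/.test(base) ? `_${base}` : base;
}
```

With these definitions, `pep503Name("my-pkg-2")` yields `my-pkg-2` and `pythonPackageName("my-pkg-2")` yields `my_pkg_2`, matching the example in the changelog. Returning a tagged result instead of calling `process.exit` directly keeps the precedence logic testable without spawning the CLI.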
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@delegance/claude-autopilot",
- "version": "7.4.0",
+ "version": "7.4.2",
  "type": "module",
  "publishConfig": {
  "tag": "next"
package/skills/autopilot/SKILL.md CHANGED
@@ -25,6 +25,58 @@ The ONLY time you stop is if a step **fails and cannot be recovered**. Otherwise
 
  Brief status lines like `[autopilot] Step 3: Executing plan...` are fine. Full summaries, questions, or check-ins are not.
 
+ ## Codex pass policy (risk-tiered)
+
+ > Adopted from the v7.4.1 strategic review (see
+ > `docs/strategy/2026-05-11-codex-pivot.md`, codex finding N2).
+ >
+ > The v8 spec pass-2 finding 3 CRITICALs the original spec missed
+ > (especially sandbox / credential exfiltration) was concrete evidence
+ > that 1 codex pass is insufficient for security-sensitive architecture.
+ > But running 3 passes on every CLI polish spec adds latency without
+ > proportional value.
+
+ **Tier the spec by risk; pass count follows.**
+
+ | Spec risk | Triggers | # of codex passes |
+ |---|---|---|
+ | **Low** | CLI UX changes, doc-only PRs, scaffolding extensions, config polish, CI workflow tweaks | **1 pass** (this skill's existing pattern — codex on the committed spec) |
+ | **Medium** | New execution modes, auth changes, billing flows, data-access patterns, new env vars, API contracts | **2 passes** (1 on the draft spec, 1 on the merged spec after edits) |
+ | **High** | Sandboxing, multi-tenancy, auto-merge, anything that mutates user repos, new secrets-handling, RPC/SECURITY DEFINER changes | **3 passes** + external review (1 draft, 1 post-edit, 1 on the impl PR diff) |
+
+ **How to apply.** Spec docs declare risk in their frontmatter:
+
+ ```markdown
+ ---
+ title: <topic>
+ risk: low | medium | high
+ ---
+ ```
+
+ If the spec's `risk:` is omitted, default to **medium** (safer than
+ defaulting to low; matches the v8 spec pattern where pass-1 was
+ clearly insufficient).
+
+ The brainstorming skill's per-step codex pass (approach selection,
+ architecture, components, error handling, implementation prep) is
+ ALWAYS run — it's how we get a draft spec good enough to merge.
+ This tier policy applies to the **post-brainstorm** passes and to
+ the codex PR review at Step 7 below.
+
+ **Examples from v7.x:**
+
+ * v7.1.7 (setup polish — CLAUDE.md scaffold + .gitignore + dedup): low.
+ 1 pass on the committed spec. Caught zero CRITICALs in practice.
+ * v7.4.0 (Python/FastAPI scaffold extension): low. 1 pass. Found
+ 2 CRITICALs (FastAPI precedence, dangling entrypoint) — both
+ fixed pre-impl, no PR-pass surprises.
+ * v7.0 Phase 6 (engine-off removal + middleware revocation): high.
+ 3 passes (spec, post-edit, PR diff). Each pass surfaced new
+ trust-boundary issues; without all three the launch would have
+ shipped with the credential-exfiltration vector C3.
+ * v8.0 spec (standalone daemon): high. 2 passes so far + needs a
+ 3rd before any v8 alpha implementation.
+
  ## Pipeline
 
  Execute these steps in order. Do NOT pause between steps unless a step fails.
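Editorial sketch (not part of the published diff): the risk-tiered policy added in the hunk above maps a spec's frontmatter `risk:` value to a pass count, defaulting to medium when the key is omitted. One way to express that mapping is shown below. The names `Risk`, `PassPolicy`, `readRisk`, and `passPolicyFor` are hypothetical, the frontmatter parsing is a deliberately naive regex and not a YAML parser, and treating an unrecognised value the same as an omitted one is an assumption the skill text does not state.

```typescript
type Risk = "low" | "medium" | "high";

interface PassPolicy {
  passes: number;          // number of codex passes for the tier
  externalReview: boolean; // the high tier also calls for external review
}

// Pass counts taken from the policy table in the skill text.
const POLICY: Record<Risk, PassPolicy> = {
  low: { passes: 1, externalReview: false },
  medium: { passes: 2, externalReview: false },
  high: { passes: 3, externalReview: true },
};

const KNOWN_RISKS: Risk[] = ["low", "medium", "high"];

// Read `risk:` from a leading `---` frontmatter block.
// Omitted key -> "medium", per the skill text.
// Unrecognised value -> "medium" as well (assumption).
function readRisk(specText: string): Risk {
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(specText);
  if (!frontmatter) return "medium";
  const line = /^risk:[ \t]*(\S+)[ \t]*$/m.exec(frontmatter[1]);
  const value = line ? line[1].toLowerCase() : "";
  return KNOWN_RISKS.find((r) => r === value) ?? "medium";
}

function passPolicyFor(specText: string): PassPolicy {
  return POLICY[readRisk(specText)];
}
```

For example, a spec beginning with `---`, `title: daemon`, `risk: high`, `---` resolves to three passes plus external review, while a spec with no frontmatter at all resolves to two passes. The sketch covers only the post-brainstorm passes; the brainstorming skill's per-step pass always runs regardless of tier.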