@delegance/claude-autopilot 7.4.0 → 7.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,143 @@
2
2
 
3
3
  - v5.6 Phase 7 (docs reconciliation) — pending.
4
4
 
5
+ ## 7.4.3 (2026-05-11)
6
+
7
+ **v7.4.3 — FastAPI scaffold respects spec-derived package name.**
8
+ Patch release. Real-package end-to-end test on v7.4.2 surfaced
9
+ this regression: the v7.4.0 Python/FastAPI scaffolder always used
10
+ `basename(cwd)` for the package directory and ignored spec-listed
11
+ `src/<pkg>/main.py` paths. Result: scaffolding a spec that listed
12
+ `src/fastapi_test/main.py` from a directory called `v742-fastapi`
13
+ produced TWO competing trees:
14
+
15
+ * `src/v742_fastapi/` — auto-generated FastAPI app (correct
16
+ content, wrong location)
17
+ * `src/fastapi_test/main.py` — empty placeholder from the spec's
18
+ bullet (right location, no content)
19
+
20
+ `pyproject.toml`'s `[project.scripts]` pointed at the
21
+ auto-generated tree (`v742_fastapi.main:run`) — the spec's intent
22
+ was clearly the named package.
23
+
24
+ **Fix:** new `packageNameFromSpec(parsed)` extracts the package
25
+ name from the first `src/<pkg>/<*>.py` entry in the spec's
26
+ `## Files` section. Falls back to the cwd-derived default only
27
+ when the spec doesn't list any `src/<pkg>/` path.
28
+
29
+ 8 new tests in `tests/scaffold-python.test.ts` cover:
30
+ * extraction from `src/<pkg>/main.py` (the conventional case)
31
+ * extraction from `src/<pkg>/<other>.py`
32
+ * null when no `src/<pkg>/<*>.py` listed
33
+ * null for non-`src/` paths
34
+ * first-match-wins on multiple `src/<pkg>/` entries
35
+ * rejects invalid Python identifier characters in the path
36
+ * the exact regression case (`src/fastapi_test/main.py` from
37
+ cwd `v742-fastapi`)
38
+ * fallback to cwd basename when spec has no `src/<pkg>/`
39
+
40
+ Plus 2 end-to-end tests verifying:
41
+ * scaffold from `cwd=v742-real` + `src/intentional_pkg/main.py`
42
+ spec → SINGLE `src/intentional_pkg/` directory (no competing
43
+ tree); `pyproject.toml` consistent throughout
44
+ * scaffold from `cwd=myapp` + spec without `src/<pkg>/` → still
45
+ uses `src/myapp/` (preserves v7.4.0 default behavior)
46
+
47
+ 1597 → 1606 CLI tests (+9). tsc clean. build clean.
48
+
49
+ ## 7.4.2 (2026-05-11)
50
+
51
+ **v7.4.2 — risk-tiered codex pass policy in autopilot skill.**
52
+ Docs-only PR. Codifies finding N2 from the v7.4.1 codex strategic
53
+ review into `skills/autopilot/SKILL.md`.
54
+
55
+ **New policy table** in the skill:
56
+
57
+ | Spec risk | # of codex passes |
58
+ |---|---|
59
+ | **Low** (CLI UX, doc-only, scaffolding, CI tweaks) | 1 |
60
+ | **Medium** (new exec modes, auth, billing, data-access, env vars, API contracts) | 2 |
61
+ | **High** (sandboxing, multi-tenancy, auto-merge, repo-mutation, secrets, RPC/SECURITY DEFINER) | 3 + external review |
62
+
63
+ **Convention:** spec docs declare `risk: low | medium | high` in
64
+ frontmatter. Omitted defaults to **medium** (safer than defaulting
65
+ to low).
66
+
67
+ **v7.x examples** included in the skill text:
68
+ * v7.1.7 (low) — 1 pass, 0 CRITICALs in practice.
69
+ * v7.4.0 (low) — 1 pass, 2 CRITICALs caught pre-impl.
70
+ * v7.0 Phase 6 (high) — 3 passes, would have shipped credential-
71
+ exfiltration vector C3 without all three.
72
+ * v8.0 spec (high) — 2 passes done, needs 3rd before v8 alpha.
73
+
74
+ No code change. Bumping to 7.4.2.
75
+
76
+ ## 7.4.1 (2026-05-11)
77
+
78
+ **v7.4.1 — strategic pivot doc from codex 5.5 review.** Docs-only
79
+ PR. Records the decision to pause v8 daemon implementation pending
80
+ customer discovery, plus 8 other findings from the codex strategic
81
+ review of full project state on 2026-05-11.
82
+
83
+ **Key outcome:** "ship v8 daemon" is NOT the next milestone. The CLI
84
+ chat-session loop is the validated asset; v8 is unvalidated. New
85
+ priority order: (1) customer discovery sprint, (2) hosted beta
86
+ readiness slice (operational), (3) org-tier revocation completion,
87
+ (4) risk-tiered codex pass policy in the autopilot skill.
88
+
89
+ **Process changes adopted:**
90
+
91
+ * **Risk-tiered codex passes** (1 for low-risk CLI UX, 2 for new
92
+ exec/auth/billing/data-access modes, 3 for sandboxing /
93
+ multi-tenancy / repo-mutation).
94
+ * **Strategic codex review every ~10 PRs** (separate from per-spec
95
+ passes — catches "ship more without validating demand" trap).
96
+ * **Bounded benchmark suite gate** (4 repo shapes only, run
97
+ pre-release + after major workflow changes — already in v8 spec).
98
+
99
+ **v8 IF customer discovery validates demand:** local-only alpha
100
+ first (per W5 of codex review). NO hosted workers, NO billing, NO
101
+ auto-merge until alpha demand is proven.
102
+
103
+ Full doc at `docs/strategy/2026-05-11-codex-pivot.md`.
104
+
105
+ ## 7.4.0 (2026-05-11)
106
+
107
+ **v7.4.0 — scaffold per-stack support (Python + FastAPI).** Closes
108
+ the v7.1.6/v7.1.8 benchmark caveat ("n=1, Node 22 ESM only —
109
+ Python/Rust/Go remain v8 follow-ups") and gates v8 spec
110
+ stabilization criteria #2 (4-repo benchmark suite).
111
+
112
+ * **Stack detection precedence** (codex C1): explicit `--stack` >
113
+ FastAPI > Python > Node > detected-but-unsupported > Node fallback.
114
+ FastAPI checked BEFORE Python so FastAPI specs that include
115
+ `pyproject.toml` aren't mis-classified.
116
+ * **FastAPI scaffold completeness** (codex C2): generates a runnable
117
+ `src/<package>/main.py` with `app = FastAPI()`, `/health` route,
118
+ `run()` function, plus `tests/test_main.py` (otherwise the
119
+ `[project.scripts]` entry was dangling).
120
+ * **Name normalization** (codex W1): PEP 503 distribution name +
121
+ valid Python identifier package name. `my-pkg-2` → distribution
122
+ `my-pkg-2`, package `my_pkg_2`. Hatchling explicit `packages`
123
+ config always present.
124
+ * **Detected-but-unsupported** (codex W2): Go/Rust/Ruby specs →
125
+ exit 3 with diagnostic, NOT silent fallback to Node.
126
+ * **Polyglot guard** (codex W3): specs listing both `package.json`
127
+ AND `pyproject.toml` without `--stack` → exit 3.
128
+ * **Narrow dep extraction** (codex W6): 3 patterns only, no inferred
129
+ versions, dedup by PEP 503 normalized name. FastAPI auto-includes
130
+ `fastapi>=0.110` + `uvicorn[standard]>=0.27`.
131
+ * **Module split**: `scaffold.ts` is now the dispatcher;
132
+ per-stack scaffolders live under
133
+ `src/cli/scaffold/{node,python,types}.ts`.
134
+ * **New flags**: `--stack <node|python|fastapi>`, `--list-stacks`.
135
+ * **Integration test** (codex N3): scaffolds FastAPI + creates
136
+ isolated venv (handles PEP 668) + `pip install -e .` + import-
137
+ app. Skipped cleanly when `python3` unavailable.
138
+
139
+ 1563 → 1597 CLI tests; tsc clean; build clean. PR #155 spec +
140
+ #156 impl. Version 7.3.0 → 7.4.0.
141
+
5
142
  ## 7.3.0 (2026-05-10)
6
143
 
7
144
  **v7.3.0 — library export surface for v8 daemon.** Minor bump
@@ -1,4 +1,4 @@
1
- import type { ScaffoldResult, ScaffoldRunContext } from './types.ts';
1
+ import type { ParsedFiles, ScaffoldResult, ScaffoldRunContext } from './types.ts';
2
2
  /**
3
3
  * PEP 503 distribution-name normalization, restricted to what we need
4
4
  * here. Lowercase, runs of `[._-]+` collapse to a single `-`, leading +
@@ -65,6 +65,16 @@ export declare function buildFastapiTest(packageName: string): string;
65
65
  * not listed in `## Files` — without them the generated pyproject.toml
66
66
  * is invalid (missing package dir) or has dead config (no tests).
67
67
  */
68
+ /**
69
+ * Extract the Python package name from a spec's `## Files` paths if the
70
+ * spec lists a `src/<pkg>/<*>.py` entry. Returns null if no spec-derived
71
+ * package name is present — caller falls back to the cwd-derived default.
72
+ *
73
+ * v7.4.3 hotfix — the v7.4.0 scaffolder always used basename(cwd) and
74
+ * ignored spec-listed src/<pkg>/ paths, producing two competing trees
75
+ * (one auto-generated, one empty placeholder from the spec).
76
+ */
77
+ export declare function packageNameFromSpec(parsed: ParsedFiles): string | null;
68
78
  export declare function scaffoldPython(ctx: ScaffoldRunContext, opts: {
69
79
  isFastapi: boolean;
70
80
  }): Promise<ScaffoldResult>;
@@ -206,11 +206,30 @@ pytest
206
206
  * not listed in `## Files` — without them the generated pyproject.toml
207
207
  * is invalid (missing package dir) or has dead config (no tests).
208
208
  */
209
+ /**
210
+ * Extract the Python package name from a spec's `## Files` paths if the
211
+ * spec lists a `src/<pkg>/<*>.py` entry. Returns null if no spec-derived
212
+ * package name is present — caller falls back to the cwd-derived default.
213
+ *
214
+ * v7.4.3 hotfix — the v7.4.0 scaffolder always used basename(cwd) and
215
+ * ignored spec-listed src/<pkg>/ paths, producing two competing trees
216
+ * (one auto-generated, one empty placeholder from the spec).
217
+ */
218
+ export function packageNameFromSpec(parsed) {
219
+ for (const p of parsed.paths) {
220
+ const m = /^src\/([a-zA-Z_][a-zA-Z0-9_]*)\/[^/]+\.py$/.exec(p);
221
+ if (m && m[1])
222
+ return m[1];
223
+ }
224
+ return null;
225
+ }
209
226
  export async function scaffoldPython(ctx, opts) {
210
227
  const { cwd, parsed, dryRun } = ctx;
211
228
  const { isFastapi } = opts;
229
+ // v7.4.3: prefer spec-derived package name; fall back to cwd basename.
230
+ const specPackage = packageNameFromSpec(parsed);
212
231
  const distributionName = normalizeDistributionName(path.basename(cwd));
213
- const packageName = packageNameFromDistribution(distributionName);
232
+ const packageName = specPackage ?? packageNameFromDistribution(distributionName);
214
233
  const filesCreated = [];
215
234
  const filesSkippedExisting = [];
216
235
  const dirsCreated = [];
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@delegance/claude-autopilot",
3
- "version": "7.4.0",
3
+ "version": "7.4.3",
4
4
  "type": "module",
5
5
  "publishConfig": {
6
6
  "tag": "next"
@@ -25,6 +25,58 @@ The ONLY time you stop is if a step **fails and cannot be recovered**. Otherwise
25
25
 
26
26
  Brief status lines like `[autopilot] Step 3: Executing plan...` are fine. Full summaries, questions, or check-ins are not.
27
27
 
28
+ ## Codex pass policy (risk-tiered)
29
+
30
+ > Adopted from the v7.4.1 strategic review (see
31
+ > `docs/strategy/2026-05-11-codex-pivot.md`, codex finding N2).
32
+ >
33
+ > The v8 spec pass-2 finding 3 CRITICALs the original spec missed
34
+ > (especially sandbox / credential exfiltration) was concrete evidence
35
+ > that 1 codex pass is insufficient for security-sensitive architecture.
36
+ > But running 3 passes on every CLI polish spec adds latency without
37
+ > proportional value.
38
+
39
+ **Tier the spec by risk; pass count follows.**
40
+
41
+ | Spec risk | Triggers | # of codex passes |
42
+ |---|---|---|
43
+ | **Low** | CLI UX changes, doc-only PRs, scaffolding extensions, config polish, CI workflow tweaks | **1 pass** (this skill's existing pattern — codex on the committed spec) |
44
+ | **Medium** | New execution modes, auth changes, billing flows, data-access patterns, new env vars, API contracts | **2 passes** (1 on the draft spec, 1 on the merged spec after edits) |
45
+ | **High** | Sandboxing, multi-tenancy, auto-merge, anything that mutates user repos, new secrets-handling, RPC/SECURITY DEFINER changes | **3 passes** + external review (1 draft, 1 post-edit, 1 on the impl PR diff) |
46
+
47
+ **How to apply.** Spec docs declare risk in their frontmatter:
48
+
49
+ ```markdown
50
+ ---
51
+ title: <topic>
52
+ risk: low | medium | high
53
+ ---
54
+ ```
55
+
56
+ If the spec's `risk:` is omitted, default to **medium** (safer than
57
+ defaulting to low; matches the v8 spec pattern where pass-1 was
58
+ clearly insufficient).
59
+
60
+ The brainstorming skill's per-step codex pass (approach selection,
61
+ architecture, components, error handling, implementation prep) is
62
+ ALWAYS run — it's how we get a draft spec good enough to merge.
63
+ This tier policy applies to the **post-brainstorm** passes and to
64
+ the codex PR review at Step 7 below.
65
+
66
+ **Examples from v7.x:**
67
+
68
+ * v7.1.7 (setup polish — CLAUDE.md scaffold + .gitignore + dedup): low.
69
+ 1 pass on the committed spec. Caught zero CRITICALs in practice.
70
+ * v7.4.0 (Python/FastAPI scaffold extension): low. 1 pass. Found
71
+ 2 CRITICALs (FastAPI precedence, dangling entrypoint) — both
72
+ fixed pre-impl, no PR-pass surprises.
73
+ * v7.0 Phase 6 (engine-off removal + middleware revocation): high.
74
+ 3 passes (spec, post-edit, PR diff). Each pass surfaced new
75
+ trust-boundary issues; without all three the launch would have
76
+ shipped with the credential-exfiltration vector C3.
77
+ * v8.0 spec (standalone daemon): high. 2 passes so far + needs a
78
+ 3rd before any v8 alpha implementation.
79
+
28
80
  ## Pipeline
29
81
 
30
82
  Execute these steps in order. Do NOT pause between steps unless a step fails.