@delegance/claude-autopilot 7.4.0 → 7.4.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +137 -0
- package/dist/src/cli/scaffold/python.d.ts +11 -1
- package/dist/src/cli/scaffold/python.js +20 -1
- package/package.json +1 -1
- package/skills/autopilot/SKILL.md +52 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,143 @@
 
 - v5.6 Phase 7 (docs reconciliation) — pending.
 
+## 7.4.3 (2026-05-11)
+
+**v7.4.3 — FastAPI scaffold respects spec-derived package name.**
+Patch release. Real-package end-to-end test on v7.4.2 surfaced
+this regression: the v7.4.0 Python/FastAPI scaffolder always used
+`basename(cwd)` for the package directory and ignored spec-listed
+`src/<pkg>/main.py` paths. Result: scaffolding a spec that listed
+`src/fastapi_test/main.py` from a directory called `v742-fastapi`
+produced TWO competing trees:
+
+* `src/v742_fastapi/` — auto-generated FastAPI app (correct
+content, wrong location)
+* `src/fastapi_test/main.py` — empty placeholder from the spec's
+bullet (right location, no content)
+
+`pyproject.toml`'s `[project.scripts]` pointed at the
+auto-generated tree (`v742_fastapi.main:run`) — the spec's intent
+was clearly the named package.
+
+**Fix:** new `packageNameFromSpec(parsed)` extracts the package
+name from the first `src/<pkg>/<*>.py` entry in the spec's
+`## Files` section. Falls back to the cwd-derived default only
+when the spec doesn't list any `src/<pkg>/` path.
+
+8 new tests in `tests/scaffold-python.test.ts` cover:
+* extraction from `src/<pkg>/main.py` (the conventional case)
+* extraction from `src/<pkg>/<other>.py`
+* null when no `src/<pkg>/<*>.py` listed
+* null for non-`src/` paths
+* first-match-wins on multiple `src/<pkg>/` entries
+* rejects invalid Python identifier characters in the path
+* the exact regression case (`src/fastapi_test/main.py` from
+cwd `v742-fastapi`)
+* fallback to cwd basename when spec has no `src/<pkg>/`
+
+Plus 2 end-to-end tests verifying:
+* scaffold from `cwd=v742-real` + `src/intentional_pkg/main.py`
+spec → SINGLE `src/intentional_pkg/` directory (no competing
+tree); `pyproject.toml` consistent throughout
+* scaffold from `cwd=myapp` + spec without `src/<pkg>/` → still
+uses `src/myapp/` (preserves v7.4.0 default behavior)
+
+1597 → 1606 CLI tests (+9). tsc clean. build clean.
+
+## 7.4.2 (2026-05-11)
+
+**v7.4.2 — risk-tiered codex pass policy in autopilot skill.**
+Docs-only PR. Codifies finding N2 from the v7.4.1 codex strategic
+review into `skills/autopilot/SKILL.md`.
+
+**New policy table** in the skill:
+
+| Spec risk | # of codex passes |
+|---|---|
+| **Low** (CLI UX, doc-only, scaffolding, CI tweaks) | 1 |
+| **Medium** (new exec modes, auth, billing, data-access, env vars, API contracts) | 2 |
+| **High** (sandboxing, multi-tenancy, auto-merge, repo-mutation, secrets, RPC/SECURITY DEFINER) | 3 + external review |
+
+**Convention:** spec docs declare `risk: low | medium | high` in
+frontmatter. Omitted defaults to **medium** (safer than defaulting
+to low).
+
+**v7.x examples** included in the skill text:
+* v7.1.7 (low) — 1 pass, 0 CRITICALs in practice.
+* v7.4.0 (low) — 1 pass, 2 CRITICALs caught pre-impl.
+* v7.0 Phase 6 (high) — 3 passes, would have shipped credential-
+exfiltration vector C3 without all three.
+* v8.0 spec (high) — 2 passes done, needs 3rd before v8 alpha.
+
+No code change. Bumping to 7.4.2.
+
+## 7.4.1 (2026-05-11)
+
+**v7.4.1 — strategic pivot doc from codex 5.5 review.** Docs-only
+PR. Records the decision to pause v8 daemon implementation pending
+customer discovery, plus 8 other findings from the codex strategic
+review of full project state on 2026-05-11.
+
+**Key outcome:** "ship v8 daemon" is NOT the next milestone. The CLI
+chat-session loop is the validated asset; v8 is unvalidated. New
+priority order: (1) customer discovery sprint, (2) hosted beta
+readiness slice (operational), (3) org-tier revocation completion,
+(4) risk-tiered codex pass policy in the autopilot skill.
+
+**Process changes adopted:**
+
+* **Risk-tiered codex passes** (1 for low-risk CLI UX, 2 for new
+exec/auth/billing/data-access modes, 3 for sandboxing /
+multi-tenancy / repo-mutation).
+* **Strategic codex review every ~10 PRs** (separate from per-spec
+passes — catches "ship more without validating demand" trap).
+* **Bounded benchmark suite gate** (4 repo shapes only, run
+pre-release + after major workflow changes — already in v8 spec).
+
+**v8 IF customer discovery validates demand:** local-only alpha
+first (per W5 of codex review). NO hosted workers, NO billing, NO
+auto-merge until alpha demand is proven.
+
+Full doc at `docs/strategy/2026-05-11-codex-pivot.md`.
+
+## 7.4.0 (2026-05-11)
+
+**v7.4.0 — scaffold per-stack support (Python + FastAPI).** Closes
+the v7.1.6/v7.1.8 benchmark caveat ("n=1, Node 22 ESM only —
+Python/Rust/Go remain v8 follow-ups") and gates v8 spec
+stabilization criteria #2 (4-repo benchmark suite).
+
+* **Stack detection precedence** (codex C1): explicit `--stack` >
+FastAPI > Python > Node > detected-but-unsupported > Node fallback.
+FastAPI checked BEFORE Python so FastAPI specs that include
+`pyproject.toml` aren't mis-classified.
+* **FastAPI scaffold completeness** (codex C2): generates a runnable
+`src/<package>/main.py` with `app = FastAPI()`, `/health` route,
+`run()` function, plus `tests/test_main.py` (otherwise the
+`[project.scripts]` entry was dangling).
+* **Name normalization** (codex W1): PEP 503 distribution name +
+valid Python identifier package name. `my-pkg-2` → distribution
+`my-pkg-2`, package `my_pkg_2`. Hatchling explicit `packages`
+config always present.
+* **Detected-but-unsupported** (codex W2): Go/Rust/Ruby specs →
+exit 3 with diagnostic, NOT silent fallback to Node.
+* **Polyglot guard** (codex W3): specs listing both `package.json`
+AND `pyproject.toml` without `--stack` → exit 3.
+* **Narrow dep extraction** (codex W6): 3 patterns only, no inferred
+versions, dedup by PEP 503 normalized name. FastAPI auto-includes
+`fastapi>=0.110` + `uvicorn[standard]>=0.27`.
+* **Module split**: `scaffold.ts` is now the dispatcher;
+per-stack scaffolders live under
+`src/cli/scaffold/{node,python,types}.ts`.
+* **New flags**: `--stack <node|python|fastapi>`, `--list-stacks`.
+* **Integration test** (codex N3): scaffolds FastAPI + creates
+isolated venv (handles PEP 668) + `pip install -e .` + import-
+app. Skipped cleanly when `python3` unavailable.
+
+1563 → 1597 CLI tests; tsc clean; build clean. PR #155 spec +
+#156 impl. Version 7.3.0 → 7.4.0.
+
 ## 7.3.0 (2026-05-10)
 
 **v7.3.0 — library export surface for v8 daemon.** Minor bump
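The 7.4.0 changelog entry describes the stack detection order in prose only. The following TypeScript sketch shows one way that ordering could be expressed. It is not the package's implementation: `SpecSignals`, `Detection`, and `detectStack` are hypothetical names, the boolean signals stand in for whatever the real detector inspects, and running the polyglot guard immediately after the explicit `--stack` check is an assumption, since the changelog does not say where that guard sits.

```typescript
// Hypothetical sketch of the documented precedence; not the package's code.
type Stack = "node" | "python" | "fastapi";

interface SpecSignals {
  explicitStack?: Stack;      // value passed via --stack, if any
  mentionsFastapi: boolean;   // assumed signal: the spec references FastAPI
  listsPyproject: boolean;    // the spec lists pyproject.toml
  listsPackageJson: boolean;  // the spec lists package.json
  unsupported?: string;       // e.g. "go", "rust", "ruby" when detected
}

type Detection =
  | { ok: true; stack: Stack }
  | { ok: false; exitCode: 3; reason: string };

function detectStack(s: SpecSignals): Detection {
  // An explicit --stack always wins.
  if (s.explicitStack) return { ok: true, stack: s.explicitStack };

  // Polyglot guard. Its position in the chain is an assumption.
  if (s.listsPackageJson && s.listsPyproject) {
    return {
      ok: false,
      exitCode: 3,
      reason: "spec lists both package.json and pyproject.toml; pass --stack",
    };
  }

  // FastAPI is checked before Python so that a FastAPI spec that also
  // lists pyproject.toml is not classified as plain Python.
  if (s.mentionsFastapi) return { ok: true, stack: "fastapi" };
  if (s.listsPyproject) return { ok: true, stack: "python" };
  if (s.listsPackageJson) return { ok: true, stack: "node" };

  // Detected-but-unsupported stacks fail loudly instead of falling back.
  if (s.unsupported) {
    return {
      ok: false,
      exitCode: 3,
      reason: `detected unsupported stack: ${s.unsupported}`,
    };
  }

  // Nothing detected: Node fallback.
  return { ok: true, stack: "node" };
}

// A FastAPI spec that also lists pyproject.toml resolves to "fastapi".
console.log(
  detectStack({ mentionsFastapi: true, listsPyproject: true, listsPackageJson: false }),
);
```

Returning a tagged result rather than calling `process.exit(3)` inside the function keeps the ordering testable; a caller would translate the `ok: false` branch into the exit code.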
package/dist/src/cli/scaffold/python.d.ts
CHANGED
@@ -1,4 +1,4 @@
-import type { ScaffoldResult, ScaffoldRunContext } from './types.ts';
+import type { ParsedFiles, ScaffoldResult, ScaffoldRunContext } from './types.ts';
 /**
  * PEP 503 distribution-name normalization, restricted to what we need
  * here. Lowercase, runs of `[._-]+` collapse to a single `-`, leading +
@@ -65,6 +65,16 @@ export declare function buildFastapiTest(packageName: string): string;
  * not listed in `## Files` — without them the generated pyproject.toml
  * is invalid (missing package dir) or has dead config (no tests).
  */
+/**
+ * Extract the Python package name from a spec's `## Files` paths if the
+ * spec lists a `src/<pkg>/<*>.py` entry. Returns null if no spec-derived
+ * package name is present — caller falls back to the cwd-derived default.
+ *
+ * v7.4.3 hotfix — the v7.4.0 scaffolder always used basename(cwd) and
+ * ignored spec-listed src/<pkg>/ paths, producing two competing trees
+ * (one auto-generated, one empty placeholder from the spec).
+ */
+export declare function packageNameFromSpec(parsed: ParsedFiles): string | null;
 export declare function scaffoldPython(ctx: ScaffoldRunContext, opts: {
     isFastapi: boolean;
 }): Promise<ScaffoldResult>;
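The doc comment in this declaration file mentions PEP 503 normalization (lowercase, runs of `[._-]+` collapsed to a single `-`), and the changelog gives `my-pkg-2` → package `my_pkg_2` as the expected mapping. The bodies of the package's own helpers are not part of this diff, so the sketch below uses hypothetical names (`sketchDistributionName`, `sketchPackageName`) and covers only those two documented steps. Mapping `-` to `_` is an assumption about how the identifier is derived, and edge cases such as a leading digit are not handled.

```typescript
// Hypothetical helpers illustrating the documented behavior; the package's
// normalizeDistributionName / packageNameFromDistribution are not shown here.

function sketchDistributionName(raw: string): string {
  // PEP 503: lowercase, and collapse runs of ".", "_" and "-" into one "-".
  return raw.toLowerCase().replace(/[._-]+/g, "-");
}

function sketchPackageName(distribution: string): string {
  // Assumption: swapping "-" for "_" yields the importable package name.
  return distribution.replace(/-/g, "_");
}

console.log(sketchDistributionName("my-pkg-2")); // my-pkg-2
console.log(sketchPackageName("my-pkg-2")); // my_pkg_2
console.log(sketchPackageName(sketchDistributionName("v742-fastapi"))); // v742_fastapi
```

The last call reproduces the directory name from the 7.4.3 regression report: a cwd of `v742-fastapi` maps to `v742_fastapi`, which is why the pre-fix scaffolder created `src/v742_fastapi/`.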
package/dist/src/cli/scaffold/python.js
CHANGED
@@ -206,11 +206,30 @@ pytest
  * not listed in `## Files` — without them the generated pyproject.toml
  * is invalid (missing package dir) or has dead config (no tests).
  */
+/**
+ * Extract the Python package name from a spec's `## Files` paths if the
+ * spec lists a `src/<pkg>/<*>.py` entry. Returns null if no spec-derived
+ * package name is present — caller falls back to the cwd-derived default.
+ *
+ * v7.4.3 hotfix — the v7.4.0 scaffolder always used basename(cwd) and
+ * ignored spec-listed src/<pkg>/ paths, producing two competing trees
+ * (one auto-generated, one empty placeholder from the spec).
+ */
+export function packageNameFromSpec(parsed) {
+    for (const p of parsed.paths) {
+        const m = /^src\/([a-zA-Z_][a-zA-Z0-9_]*)\/[^/]+\.py$/.exec(p);
+        if (m && m[1])
+            return m[1];
+    }
+    return null;
+}
 export async function scaffoldPython(ctx, opts) {
     const { cwd, parsed, dryRun } = ctx;
     const { isFastapi } = opts;
+    // v7.4.3: prefer spec-derived package name; fall back to cwd basename.
+    const specPackage = packageNameFromSpec(parsed);
     const distributionName = normalizeDistributionName(path.basename(cwd));
-    const packageName = packageNameFromDistribution(distributionName);
+    const packageName = specPackage ?? packageNameFromDistribution(distributionName);
     const filesCreated = [];
     const filesSkippedExisting = [];
     const dirsCreated = [];
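To see what the new function accepts and rejects, the snippet below restates `packageNameFromSpec` from this hunk with type annotations and runs it against a few inputs. The regex and loop are copied from the diff. The `ParsedFiles` interface is an assumed minimal shape (only a `paths` array), since the real type lives in `./types.ts` and is not shown.

```typescript
// Assumed minimal shape; the real ParsedFiles type is defined in ./types.ts.
interface ParsedFiles {
  paths: string[];
}

// Same regex and loop as the compiled function in the diff, with types added.
function packageNameFromSpec(parsed: ParsedFiles): string | null {
  for (const p of parsed.paths) {
    const m = /^src\/([a-zA-Z_][a-zA-Z0-9_]*)\/[^/]+\.py$/.exec(p);
    if (m && m[1]) return m[1];
  }
  return null;
}

// Regression case from the changelog: the spec names src/fastapi_test/main.py.
console.log(packageNameFromSpec({ paths: ["pyproject.toml", "src/fastapi_test/main.py"] })); // fastapi_test

// A hyphen is not a valid identifier character, so the path is rejected.
console.log(packageNameFromSpec({ paths: ["src/my-pkg/main.py"] })); // null

// Only one path segment may follow <pkg>, so nested modules do not match.
console.log(packageNameFromSpec({ paths: ["src/pkg/api/routes.py"] })); // null

// First match wins when several packages are listed.
console.log(packageNameFromSpec({ paths: ["src/alpha/main.py", "src/beta/main.py"] })); // alpha
```

One consequence of the pattern worth noting: a spec that lists only nested paths such as `src/pkg/api/routes.py` yields `null`, so the scaffolder would fall back to the cwd-derived name in that case.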
package/package.json
CHANGED
package/skills/autopilot/SKILL.md
CHANGED
@@ -25,6 +25,58 @@ The ONLY time you stop is if a step **fails and cannot be recovered**. Otherwise
 
 Brief status lines like `[autopilot] Step 3: Executing plan...` are fine. Full summaries, questions, or check-ins are not.
 
+## Codex pass policy (risk-tiered)
+
+> Adopted from the v7.4.1 strategic review (see
+> `docs/strategy/2026-05-11-codex-pivot.md`, codex finding N2).
+>
+> The v8 spec pass-2 finding 3 CRITICALs the original spec missed
+> (especially sandbox / credential exfiltration) was concrete evidence
+> that 1 codex pass is insufficient for security-sensitive architecture.
+> But running 3 passes on every CLI polish spec adds latency without
+> proportional value.
+
+**Tier the spec by risk; pass count follows.**
+
+| Spec risk | Triggers | # of codex passes |
+|---|---|---|
+| **Low** | CLI UX changes, doc-only PRs, scaffolding extensions, config polish, CI workflow tweaks | **1 pass** (this skill's existing pattern — codex on the committed spec) |
+| **Medium** | New execution modes, auth changes, billing flows, data-access patterns, new env vars, API contracts | **2 passes** (1 on the draft spec, 1 on the merged spec after edits) |
+| **High** | Sandboxing, multi-tenancy, auto-merge, anything that mutates user repos, new secrets-handling, RPC/SECURITY DEFINER changes | **3 passes** + external review (1 draft, 1 post-edit, 1 on the impl PR diff) |
+
+**How to apply.** Spec docs declare risk in their frontmatter:
+
+```markdown
+---
+title: <topic>
+risk: low | medium | high
+---
+```
+
+If the spec's `risk:` is omitted, default to **medium** (safer than
+defaulting to low; matches the v8 spec pattern where pass-1 was
+clearly insufficient).
+
+The brainstorming skill's per-step codex pass (approach selection,
+architecture, components, error handling, implementation prep) is
+ALWAYS run — it's how we get a draft spec good enough to merge.
+This tier policy applies to the **post-brainstorm** passes and to
+the codex PR review at Step 7 below.
+
+**Examples from v7.x:**
+
+* v7.1.7 (setup polish — CLAUDE.md scaffold + .gitignore + dedup): low.
+1 pass on the committed spec. Caught zero CRITICALs in practice.
+* v7.4.0 (Python/FastAPI scaffold extension): low. 1 pass. Found
+2 CRITICALs (FastAPI precedence, dangling entrypoint) — both
+fixed pre-impl, no PR-pass surprises.
+* v7.0 Phase 6 (engine-off removal + middleware revocation): high.
+3 passes (spec, post-edit, PR diff). Each pass surfaced new
+trust-boundary issues; without all three the launch would have
+shipped with the credential-exfiltration vector C3.
+* v8.0 spec (standalone daemon): high. 2 passes so far + needs a
+3rd before any v8 alpha implementation.
+
 ## Pipeline
 
 Execute these steps in order. Do NOT pause between steps unless a step fails.
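The risk-tier policy added to the skill is written for a human or agent reader, but the mapping it describes is mechanical. As an illustration only, the TypeScript sketch below reads the `risk:` key from a spec's frontmatter and looks up the pass count from the table, defaulting to medium when the key is absent. `riskFromSpec` and `CODEX_PASSES` are hypothetical names that exist nowhere in the package, and the frontmatter is matched with a regex rather than a YAML parser, which is an assumption made to keep the example self-contained.

```typescript
type Risk = "low" | "medium" | "high";

// Pass counts from the policy table. The high tier additionally calls for
// an external review, which a plain number cannot express.
const CODEX_PASSES: Record<Risk, number> = { low: 1, medium: 2, high: 3 };

// Hypothetical helper: extracts `risk:` from leading frontmatter.
function riskFromSpec(markdown: string): Risk {
  // Capture the text between the opening and closing `---` lines.
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown)?.[1] ?? "";
  const value = /^risk:[ \t]*(low|medium|high)[ \t]*$/m.exec(frontmatter)?.[1];
  if (value === "low" || value === "medium" || value === "high") return value;
  // Omitted or unrecognized: default to medium, per the policy.
  return "medium";
}

const highRiskSpec = "---\ntitle: sandbox redesign\nrisk: high\n---\n# Spec body";
const untaggedSpec = "---\ntitle: cli polish\n---\n# Spec body";

console.log(CODEX_PASSES[riskFromSpec(highRiskSpec)]); // 3
console.log(CODEX_PASSES[riskFromSpec(untaggedSpec)]); // 2
```

Treating an unrecognized value the same as a missing one is a choice made for this sketch; the skill text only specifies the default for an omitted `risk:`.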