PyPI - specfuse-loop - Versions diffs - 0.2.0__tar.gz → 0.3.0__tar.gz - Mend

specfuse-loop 0.2.0tar.gz → 0.3.0tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (128)

{specfuse_loop-0.2.0/specfuse_loop.egg-info → specfuse_loop-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: specfuse-loop
-Version: 0.2.0
+Version: 0.3.0
 Summary: Local-first executor for the Specfuse Plan + Work Unit gate-cycle methodology.
 Author: Specfuse contributors
 License: Apache-2.0

{specfuse_loop-0.2.0 → specfuse_loop-0.3.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "specfuse-loop"
-version = "0.2.0"
+version = "0.3.0"
 description = "Local-first executor for the Specfuse Plan + Work Unit gate-cycle methodology."
 readme = "README.md"
 requires-python = ">=3.10"
@@ -56,8 +56,12 @@ specfuse-lint = "specfuse.loop.lint_plan:main"
 [tool.setuptools.packages.find]
 # Ship ONLY the specfuse namespace package. Without this scope, setuptools'
-# auto-discovery sweeps tests/, docs/, and scripts/ into the wheel. The driver's
-# data (templates, rules, features) lives in the consumer's `.specfuse/`, not in
-# the wheel, so the package is pure code.
+# auto-discovery sweeps tests/, docs/, and scripts/ into the wheel.
 include = ["specfuse*"]
 namespaces = true
+[tool.setuptools.package-data]
+# Include the scaffold seed so `specfuse init` / `specfuse upgrade` can copy
+# templates, rules, and other seed files out of the wheel without needing the
+# loop source repo on disk.
+"specfuse.loop" = ["data/**/*", "data/*"]
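Reviewer sketch: one way a command such as `specfuse init` might copy that seed out of an installed wheel, assuming the files are reachable through `importlib.resources`. The helper name `copy_seed` and the `.specfuse` destination in the usage comment are hypothetical illustrations, not taken from the package.

```python
from importlib.resources import files
from pathlib import Path


def copy_seed(src, dest: Path) -> list[Path]:
    """Recursively copy a resource tree (any Traversable) into dest.

    Returns the files written. Hypothetical helper, not the package's API.
    """
    written = []
    dest.mkdir(parents=True, exist_ok=True)
    for entry in src.iterdir():
        target = dest / entry.name
        if entry.is_dir():
            # Descend into sub-directories such as templates/ or docs/.
            written.extend(copy_seed(entry, target))
        else:
            target.write_bytes(entry.read_bytes())
            written.append(target)
    return written


# Usage (assumes specfuse-loop >= 0.3.0 is installed in the environment):
# copy_seed(files("specfuse.loop") / "data", Path(".specfuse"))
```

Reading through `importlib.resources` rather than `__file__` keeps the copy working when the package is not unpacked on disk as plain directories.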

specfuse_loop-0.3.0/specfuse/loop/data/LEARNINGS.template.md ADDED

@@ -0,0 +1,69 @@
+# LEARNINGS
+Durable, reusable rules distilled from every gate's retrospective. The `lessons` work
+unit appends here; planning reads here before detailing any new feature. This is the
+feedback loop that makes each plan better than the last.
+Append only. Phrase each entry as a rule that would change how a FUTURE work unit is
+written or executed, not a one-off observation. De-duplicate against what is here.
+Feature-specific observations stay in that feature's `RETROSPECTIVE.md` and are not
+promoted here.
+## Format
+```
+- [FEAT-YYYY-NNNN/G1] Implementation WUs must name the module a new route/handler
+  lives in; "add it to the router" cost a blocked attempt when no router existed yet.
+```
+## Entries
+<!-- lessons work units append below this line -->
+<!--
+The entries below are GENERIC methodology lessons that ship with the scaffold —
+they are about how to write and run work units, not about any one project. Your
+project's own `lessons` work units append project-specific rules beneath them.
+-->
+- [meta/first-live-use] Scope a feature's acceptance criteria to the feature's
+  own footprint — its slug, the paths it creates or edits, the symbols it
+  introduces. Acceptance criteria that grep or scan the WHOLE repo will trip
+  on pre-existing, unrelated state and (correctly) cause the agent to emit
+  `status: blocked` even when the WU's own work is fine. Example failure mode:
+  a "no TODO comments anywhere in the tree" check that fires on legacy code
+  the WU never touches. Rule: bound checks to the feature's path prefixes
+  (e.g. `src/<slug>/**`) or to files the WU declares in `generated_surfaces` /
+  `files_changed`; repo-wide invariants belong in a separate hygiene WU or in
+  the repo's `code` gate set, not in a per-feature acceptance criterion.
+- [meta/first-live-use] Name what the WU is expected to PRODUCE, not only what
+  it must NOT touch. The "Do not touch" section bounds the WU on one side;
+  without an equally-explicit "produces" list, an agent can helpfully write
+  files that should belong to a later WU (e.g. docs that were T92's job
+  showing up in T01's commit) without the verification gates objecting. Rule:
+  in addition to the "Do not touch" list, the WU's Acceptance criteria should
+  name the specific files/sections the WU is expected to author. A reviewer
+  reading the diff should be able to point at every changed file and find it
+  in either the WU's produces-list or the gate's verification output.
+- [meta/first-live-use] The "hygiene WU" pattern — when a substantive WU
+  discovers a pre-existing bug in a path its "Do not touch" rule forbids
+  (typical case: shared module, infrastructure config, dependency version),
+  the right move is to insert a narrow hygiene WU EARLIER in the gate (or as
+  a precursor gate) that fixes only that issue. Not: loosen the blocked WU's
+  scope to permit the cross-cutting fix (muddies its boundary). Not: fix it
+  manually out-of-loop and pretend the gate ran clean (silent drift between
+  the methodology's history and git's). The hygiene WU should have a single,
+  obvious acceptance criterion and pass on its own verification; the original
+  blocked WU then runs after, unmodified.
+- [meta/loop-driver-bugs] Driver bookkeeping (frontmatter status flips,
+  events.jsonl appends, per-attempt notes) must be committed if it should
+  survive across WUs — uncommitted writes are wiped by the inter-attempt
+  `git reset --hard`. Agent-work commits (per-WU squash) are separate from
+  bookkeeping commits (`chore(loop): ...`). When authoring WUs whose
+  verification commands themselves write to disk, remember the agent's
+  working tree is reset between failed attempts — scratch files written
+  during a failed attempt won't persist into the next attempt's prompt unless
+  the agent explicitly buffers them in the prompt-facing failure note the
+  driver hands to the next attempt.

specfuse_loop-0.3.0/specfuse/loop/data/VERSION ADDED

@@ -0,0 +1 @@
+0.3.0

specfuse_loop-0.3.0/specfuse/loop/data/docs/concepts/architecture-addendum-gates-and-iterative-planning.md ADDED

@@ -0,0 +1,97 @@
+# Architecture Addendum — Gates and the iterative planning cycle (Model B)
+> **Status: adopted (2026-06).** Gate placement is resolved as **Model B — gates live in the
+> loop, per component, NOT in the orchestrator PM.** The gate cycle was proven on a real
+> multi-gate feature (loop `FEAT-2026-0003`: plan-next drafted real, armable next gates across
+> three cycles). This addendum records that decision and what it means for the orchestrator.
+>
+> **This supersedes the earlier Model-A proposal** (an earlier revision of this file that
+> proposed folding gate identification / `plan-next` / per-gate `plan_review` into the PM agent as
+> a `v1.7.0` behavioral change). That fold-in was **not adopted**; the PM does not gain gate
+> machinery. See the orchestrator repo's `docs/gate-placement-proposal.md` (Model A vs B, decision
+> criteria) and `docs/naming-convention.md` for the canonical contracts.
+---
+## 1. The decision
+The orchestrator coordinates one level above a single-repo goal. An **initiative**
+(`INIT-YYYY-NNNN`) is decomposed by the PM into a **`feature_graph`** of **features**
+(`INIT-YYYY-NNNN/FNN`), each a single-repo goal dispatched to one component. **Each dispatched
+implementation feature == a loop feature**: the receiving component's loop decomposes it into
+**gates** and **work units** and grinds it through its gate cycle.
+So gates are **internal to the loop**, not orchestrator state. The orchestrator owns
+`initiative → feature` decomposition, cross-repo dependency ordering, and the spec/generated
+interface contracts between features; it does **not** identify gates, run `plan-next`, or hold a
+per-gate `plan_review`. The loop owns all of that, per [`methodology.md`](methodology.md).
+**Why Model B (summary).** The loop is single-repo + edit-and-commit; codegen freezes the
+cross-repo interface (generated `emit-*`/`on-*` contracts are immutable `_generated/`), so
+component-loops grind hand-code against frozen boundaries and cannot break each other. This
+dissolves the hardest part of Model A (predicting cross-repo gate boundaries inside the PM) and
+keeps the gate cycle built once, in the loop. Full rationale + the rejected Model A:
+`gate-placement-proposal.md`.
+## 2. What changed in the orchestrator (minimal — no PM gate machinery)
+The orchestrator change is the **initiative/feature reframe**, already folded into
+`orchestrator-architecture.md` §1A and `naming-convention.md`:
+- **Vocabulary / IDs:** initiative → feature → gate → work unit; `INIT-YYYY-NNNN/FNN/TNN`
+  (legacy/component-local `FEAT-…/TNN`). Root token = origin.
+- **State machines (unchanged in shape):** the "feature state machine" is the **initiative**
+  lifecycle; the "task state machine" is the **feature** lifecycle. Gates/WUs do **not** appear in
+  the orchestrator's state machines — they are loop-internal.
+- **PM agent (reframed, not gate-extended):** `feature-decomposition` (was task-decomposition)
+  produces a `feature_graph`; `issue-drafting` files feature issues labelled by type;
+  `plan-review` reviews the `feature_graph`; `dependency-recomputation` (runtime `scripts/poller.py`)
+  flips features `pending → ready`. **No** gate identification, `plan-next`, or per-gate
+  `plan_review` in the PM — those were the Model-A additions and are dropped.
+- **Dispatch by feature type:** `implementation` → the component-loop (loop GitHub feature-pick on
+  `specfuse:feature`); `qa_*` → the QA agent (`specfuse:qa-feature`), a distinct cross-repo role,
+  **not** a loop. QA is the exception to uniform-loop dispatch.
+- **Per-gate autonomy / arming** lives in the loop (not the PM): autonomy flows orchestrator →
+  loop (`review`/`supervised` stop at each gate for a human arm; `auto` self-arms safe gates under
+  the methodology §9 conjunction). The merge gate stays human until the QA loop is trusted.
+The orchestrator's earlier "no gates" behavior is therefore not changed by *adding* gates to it —
+gates were placed in the loop instead.
+## 3. What the loop owns (the gate layer)
+Per [`methodology.md`](methodology.md): the gate cycle (plan → execute → close → review & arm), the
+four-type closing sequence (`retrospective → lessons → docs → plan-next`), `plan-next` drafting
+the next gate (never arming it), `LEARNINGS.md`, and per-gate autonomy. These are **loop-internal**
+to each dispatched feature; the orchestrator sees only the feature's overall state (via issue
+labels) and its completion (PR merge → merge-watcher → `state:done`).
+## 4. Reconciliation with the orchestrator (the surface-specific seams)
+Per the collaboration charter §2 / methodology §10, only these differ between surfaces:
+| Concern        | Loop (single-repo)                       | Orchestrator (multi-repo)                         |
+|----------------|------------------------------------------|---------------------------------------------------|
+| State backend  | WU/GATE/PLAN frontmatter, git-tracked    | GitHub issue labels + the initiative registry     |
+| Dispatch       | driver shells out (`claude -p`)          | poller routes by type → loop / QA agent           |
+| Branch / merge | one branch, squash per WU                | branch + PR per **feature**, merge watcher        |
+| Report-back    | RESULT block                             | `task_completed` event (+ `state:*` labels) via the loop's `GitHubBackend` |
+The loop's `loop.py` is the reference for the orchestrator's poller (its dispatch/verify/retry/
+gate-stop semantics decompose across the poller, PM dependency-recomputation, and the merge
+watcher); the orchestrator does not import `loop.py`.
+## 5. Status of the old Model-A sections
+The prior revision's §A.2–§A.11 (feature-state `in_progress → plan_review` oscillation, gate
+skeleton in the PM, `plan-next` as a PM skill, per-gate `plan_review`, the auto-arm conjunction in
+the PM, PM `v1.7.0`, the `gates`/`task.gate` frontmatter fields) described **Model A and are not
+implemented.** The `feature-frontmatter.schema.json` `gates` array is not used by the orchestrator
+(gates are loop-internal). If a future need arises to surface gate state at the orchestrator level,
+re-open this addendum deliberately.
+## 6. Remaining (gated)
+- `specfuse/methodology` extraction — once the gate-cycle contracts stop changing run-to-run
+  (charter §4; two contract fixes landed during the FEAT-2026-0003 dogfood — let them soak).
+- Loop kit → `stable` in the orchestrator distribution manifest (same soak gate).
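Reviewer sketch of the `dependency-recomputation` step described in §2 (flipping features `pending → ready`), assuming a feature becomes ready once every feature it depends on is `done`. The dictionary shape, the `depends_on` key, and the example IDs are assumptions for illustration; the addendum does not show the poller's real data model.

```python
def recompute_ready(features):
    """Flip each pending feature to ready once all of its dependencies are done.

    `features` maps a feature ID to {"state": str, "depends_on": [ids]}.
    Returns the IDs that were flipped. The data shape is an assumption.
    """
    flipped = []
    for feature_id, feature in features.items():
        if feature["state"] != "pending":
            continue
        if all(features[dep]["state"] == "done" for dep in feature["depends_on"]):
            feature["state"] = "ready"
            flipped.append(feature_id)
    return flipped


# Hypothetical three-feature graph: F02 waits on F01, F03 waits on F02.
graph = {
    "INIT-2026-0001/F01": {"state": "done", "depends_on": []},
    "INIT-2026-0001/F02": {"state": "pending", "depends_on": ["INIT-2026-0001/F01"]},
    "INIT-2026-0001/F03": {"state": "pending", "depends_on": ["INIT-2026-0001/F02"]},
}
print(recompute_ready(graph))  # ['INIT-2026-0001/F02']
```

F03 stays `pending` because F02 is only `ready`, not `done`; no gate or work-unit state appears anywhere in the graph, which matches the Model B split.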

specfuse_loop-0.3.0/specfuse/loop/data/docs/concepts/ralph-lineage.md ADDED

@@ -0,0 +1,66 @@
+# Why the loop exists — lineage and positioning
+## The Ralph lineage
+The loop descends from the "Ralph" technique: in its purest form, a bash loop
+that feeds a prompt to a coding agent repeatedly until the work is done, with a
+fresh context each iteration and durable state kept in files (git history, a
+progress file, a task list) rather than in the context window. Its insight is
+that for large work, *stubbornness plus fresh context* beats a single clever
+pass — the loop is the hero, not the model.
+Ralph's known weakness is the thinness of its task list: a bare list of TODOs
+gives an agent nothing to enforce patterns against, so it drifts. The ecosystem's
+answer to "that's too coarse for serious work" has been to make the units of work
+granular and self-contained enough that ephemeral workers can pick them up,
+execute, and hand off — orchestration of many such workers ("Gas Town"-style).
+The Specfuse Loop is that idea with the planning rigor added back in. It keeps
+Ralph's fresh-context-per-iteration property but moves it to **work-unit
+granularity**, and it replaces the thin task list with the **Plan + Work Unit**
+pattern: crisp work units with hard "do not touch" boundaries, explicit
+acceptance criteria, and machine-checkable verification gates. The up-front
+planning investment is precisely what earns the right to let execution run
+unattended — the richer the unit, the longer the loop can safely run before a
+human checkpoint.
+Two things distinguish it from vanilla Ralph:
+- **Verification is the exit oracle, not the agent's say-so.** The driver re-runs
+  the unit's gates and they decide done — eliminating Ralph's classic
+  premature-"done" failure.
+- **Gates are human checkpoints by design.** The loop runs unattended *within* a
+  gate and stops *at* it. Reflection, a cross-feature learnings rollup, and
+  drafting the next batch happen systematically as the gate's closing sequence,
+  not when someone remembers to ask.
+## Where it sits in Specfuse
+Specfuse is a methodology and an organization, not a single tool. Three
+independently-adoptable projects live under it:
+- **`specfuse/codegen`** turns OpenAPI / AsyncAPI / Arazzo specifications into
+  deterministic source code — the boilerplate no one should hand-write and no
+  agent should hallucinate.
+- **`specfuse/loop`** (this project) executes the Plan + Work Unit pattern in a
+  single repository, with no specification required and no agent-coordination
+  overhead. The lightweight surface.
+- **`specfuse/orchestrator`** coordinates specialized agents across many
+  component repositories from validated specifications — the heavyweight surface
+  for multi-repo, spec-first feature delivery.
+The loop and the orchestrator are two execution surfaces of **one** methodology
+(see [`methodology.md`](methodology.md)); they share the gate cycle, the
+work-unit contract, the correlation-ID scheme, and the verification discipline.
+The loop is the right home for work that lives in one repo or has no formal
+spec; the orchestrator is the right home when the work genuinely spans repos and
+is driven by specifications that `codegen` can turn into a stable foundation.
+## What it is not
+- Not a general-purpose AI coding platform. It does one shape of work:
+  plan-driven, gated, fresh-context execution in a single repo.
+- Not a replacement for human judgment. Every gate is a human checkpoint; the
+  loop keeps agents *inside* a loop, it does not remove the loop.
+- Not a hosted service. It runs on your machine, against your repo, under your
+  accounts.

specfuse_loop-0.3.0/specfuse/loop/data/docs/getting-started.md ADDED

@@ -0,0 +1,188 @@
+# Getting started
+This walks you from an empty project to a feature delivered by the loop, then
+shows what to do when a run halts. It assumes you've read the one-minute pitch in
+the [README](../README.md); for the full contracts see
+[`methodology.md`](methodology.md) and for the interactive operations see
+[`skills.md`](skills.md).
+The loop is stdlib-only Python plus Claude Code. There is no install step in your
+target repo — `init.sh` copies a self-contained scaffold in.
+---
+## 1. Install the scaffold
+From your checkout of `specfuse/loop`, point `init.sh` at the repo you want to
+drive:
+```bash
+./init.sh /path/to/your-project
+```
+This writes `.specfuse/` (templates, rules, skills, the driver, and the durable
+docs) into your project and wires `.claude/` so Claude Code discovers the skills.
+It refuses if `.specfuse/` already exists — use `./init.sh --upgrade` to update an
+existing install in place without touching your authored files.
+> **Don't gitignore `.specfuse/`.** The loop's durable state lives there and must
+> be committed for the loop to work. `init.sh` warns if it detects the directory
+> is ignored.
+## 2. Match verification to your stack
+`init.sh` seeds `.specfuse/verification.yml`. Open it and make the `code` gate set
+run *your* project's checks:
+```yaml
+code:
+  - name: tests
+    command: "pytest -q"
+  - name: coverage
+    command: "coverage report --fail-under=90"
+  - name: lint
+    command: "ruff check ."
+  - name: security
+    command: "bandit -r src -ll"
+```
+These commands are the **exit oracle**: the driver re-runs them itself after every
+work unit and *they* decide whether the unit is done — the agent's own self-report
+is advisory only ([methodology §5](methodology.md)). Keep this set in lock-step
+with your GitHub branch protection, or an agent can pass locally and still be
+unmergeable.
+If your repo already has CI worth deriving from, run **`/derive-verification`** in
+Claude Code instead of editing by hand — it inspects your CI and tooling and
+drafts the file for you.
+## 3. Author your first feature
+Two ways to create a feature folder under `.specfuse/features/`:
+- **Interactively (recommended):** run **`/pick-feature`** to choose from your
+  roadmap, then **`/draft-feature`**. Draft-feature asks framing questions, then
+  proposes a gate skeleton and gate 1's work units, writing only on your accept.
+- **By hand:** copy the worked example,
+  `.specfuse/features/FEAT-2026-0001-health-endpoint/`, and adapt it. It's a
+  deliberately small two-unit feature that exercises the whole loop. Or start from
+  the bare templates in `.specfuse/templates/`.
+A feature folder holds:
+| File | Owns | Who writes it |
+|------|------|---------------|
+| `PLAN.md` | the *shape*: gate order, WU membership, dependency edges, feature status | you / `draft-feature` (gate 1); `plan-next` (later gates) |
+| `GATE-NN.md` | one gate's status and definition of done | you / the planner |
+| `WU-*.md` | a single work unit: frontmatter + the prompt a fresh session receives | you / `draft-feature` / `plan-next` |
+Then create the branch named in `PLAN.md`'s frontmatter (`branch:`).
+## 4. Validate before running
+```bash
+python .specfuse/scripts/lint_plan.py .specfuse/features/FEAT-2026-0001-health-endpoint
+```
+The linter checks structure: every WU has the five mandatory sections, the closing
+sequence is present and well-formed, dependencies resolve, IDs are well-formed.
+Fix anything it flags before dispatching — it's far cheaper than a failed
+dispatch.
+## 5. Dry-run, then run
+```bash
+python .specfuse/scripts/loop.py --dry-run     # show the gate walked, in dep order, no dispatch
+python .specfuse/scripts/loop.py               # the real thing
+```
+With no `--feature` flag the driver picks the single `active` feature. For each
+ready work unit it:
+1. marks the WU `in_progress`,
+2. dispatches a **fresh** `claude -p` session with that unit's model and prompt,
+3. runs the unit's verification **itself** as the exit oracle,
+4. on pass, makes **one squashed commit** carrying the `Feature: FEAT-.../TNN`
+   trailer.
+A failed attempt is discarded and the unit is re-dispatched to a fresh session
+carrying the failure evidence, up to three attempts; after that the unit is
+escalated to `blocked_human` and the gate halts.
+> **One driver per working tree.** The driver holds an exclusive lock on
+> `.specfuse/.loop.lock`. A second driver on the same checkout exits immediately.
+> To run two features at once, use separate `git worktree` checkouts. `--dry-run`
+> is exempt.
+## 6. The gate boundary — where you come back in
+A gate ends with a **closing sequence** that runs automatically: it writes a
+retrospective, promotes durable lessons to `LEARNINGS.md`, reconciles docs, and —
+crucially — **drafts the next gate's work units** (as `draft`) so the next gate is
+waiting for you to review.
+Two things can happen at the boundary:
+- **The gate auto-closes.** On a clean, on-plan gate the deterministic predicate
+  (`gate_eval.py`) closes it without a reflective session — but `plan-next` still
+  drafts the next gate, so the human review step still fires
+  ([methodology §3](methodology.md)).
+- **The driver halts with `awaiting_review`.** The next gate's WUs are in `draft`
+  and the driver will refuse to execute them until you arm them. **Arming is the
+  human checkpoint and is deliberately not automated.**
+Run **`/arm-gate`**. It walks each drafted WU — accept / revise / reject — flips
+the ones you accept to `pending`, marks the finished gate `passed`, and prints the
+resume command. Read the `GATE-NN-REVIEW.md` the planner wrote first: it's
+weighted toward where the planner was *least* certain.
+Then re-run `loop.py`. Repeat until the terminal gate is `done`.
+## 7. Wrap up
+When the terminal gate is `done`, run **`/wrap-feature`**: it pushes the feature
+branch, opens a PR, optionally watches CI, and points at the next pick. Then
+**`/roadmap-archive`** moves the finished feature's detail out of the active
+roadmap.
+---
+## Operating a running loop
+The driver runs unattended within a gate, but real runs hit snags. The map:
+| Symptom | What it means | Do this |
+|---------|---------------|---------|
+| Driver halts, a WU is `blocked_human` | A unit failed three attempts or hit an escalation trigger | Run **`/gate-status`** for a diagnosis (root cause, options, recommended action) |
+| You fixed the blocker (creds, dep, spec) | The WU is still `blocked_human` | Run **`/unblock-wu`** to re-arm it (`blocked_human → pending`, attempts reset), then re-run |
+| Driver exits "could not acquire lock" | Another driver owns this checkout | Find/stop the other driver, or use a separate `git worktree` |
+| A gate is `awaiting_review` | Normal gate boundary | Run **`/arm-gate`** (§6) |
+| The feature isn't worth finishing | — | Run **`/abandon-feature`** — flips every WU/gate/PLAN/roadmap surface cleanly |
+| A WU "passed" but wrote no code | Hollow pass | Tighten the WU's acceptance criteria and verification; see [`authoring-work-units`](skills.md) |
+**Where the durable state lives** (nothing important is in a context window):
+- `PLAN.md` / `GATE-NN.md` / `WU-*.md` frontmatter — current status of everything.
+- `events.jsonl` (per feature) — the event log; every dispatch emits an
+  `attempt_outcome`.
+- `RETROSPECTIVE.md` — feature-local raw observations from each close.
+- `LEARNINGS.md` (at the root of `.specfuse/`) — cross-feature durable lessons, read
+  at planning time so each plan is better than the last. Run
+  **`/learnings-suggest`** periodically to mine recurring failures into new
+  entries.
+When in doubt after a halt, start with **`/gate-status`** — it reads all of the
+above and tells you where you stand.
+## Fixing a bug (not a feature)
+Bugs don't go through the feature methodology. Run **`/fix-bug`** with the issue
+number or report: it's 1 bug = 1 branch = 1 PR, test-first. It refuses and
+proposes promoting to a feature if the work turns out large or risky.
+## Next
+- [`methodology.md`](methodology.md) — the full gate-cycle contract.
+- [`skills.md`](skills.md) — every skill, by lifecycle phase.
+- [`concepts/ralph-lineage.md`](concepts/ralph-lineage.md) — why the loop is
+  shaped the way it is.
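Reviewer sketch of the dispatch/verify/retry cycle from §5 of the guide, assuming dispatch and verification are injected as callables. The names `run_work_unit`, `dispatch`, and `verify` are hypothetical, and the `"done"` return value is a placeholder for whatever status the driver records on a pass. The real driver also resets the working tree between attempts and makes the squashed commit, both omitted here.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # "up to three attempts" per the guide


def run_work_unit(
    dispatch: Callable[[Optional[str]], None],
    verify: Callable[[], Optional[str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> str:
    """Dispatch a unit until verification passes or attempts run out.

    dispatch(failure_note) starts a fresh session, handing over the previous
    attempt's failure evidence (None on the first attempt).
    verify() returns None on pass, or a description of the failure.
    """
    failure_note = None
    for _ in range(max_attempts):
        dispatch(failure_note)
        failure_note = verify()  # the driver, not the agent, decides "done"
        if failure_note is None:
            return "done"  # placeholder status name (assumption)
    return "blocked_human"


# Fake session: verification fails once, then passes.
results = iter(["tests failed", None])
notes = []
status = run_work_unit(notes.append, lambda: next(results))
print(status, notes)  # done [None, 'tests failed']
```

The second dispatch receives the first attempt's failure text, which is the only state carried between otherwise fresh sessions in this sketch.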
