space-architect 2.0.0.rc1 → 2.0.0.rc2

@@ -48,12 +48,20 @@ module Space::Core
48
48
  end
49
49
  end
50
50
 
51
+ def provision_scripts
52
+ Array(data.dig("pack", "provision")).map(&:to_s)
53
+ end
54
+
55
+ def persist_paths
56
+ Array(data.dig("pack", "persist")).map(&:to_s)
57
+ end
58
+
51
59
  def architect
52
- data["architect"]
60
+ data["project"]
53
61
  end
54
62
 
55
63
  def architect=(val)
56
- data["architect"] = val
64
+ data["project"] = val
57
65
  end
58
66
 
59
67
  def metadata_path
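
The two new accessors lean on `Array()` and `Hash#dig` to tolerate a missing `pack` block. A minimal sketch of that behaviour, assuming `data` is the parsed `space.yaml` hash; the `PackConfig` wrapper and the sample values are hypothetical, and only the two method bodies come from the diff:

```ruby
# Hypothetical stand-in for the object that wraps the parsed space.yaml data.
class PackConfig
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Same body as in the diff: nil, a scalar, or a list all normalise
  # to an Array of Strings.
  def provision_scripts
    Array(data.dig("pack", "provision")).map(&:to_s)
  end

  def persist_paths
    Array(data.dig("pack", "persist")).map(&:to_s)
  end
end

# Sample inputs are illustrative only.
empty  = PackConfig.new({})
scalar = PackConfig.new({ "pack" => { "provision" => "scripts/setup.sh" } })
list   = PackConfig.new({ "pack" => { "persist" => ["/root/.claude", :"/data"] } })

p empty.provision_scripts   # missing "pack" key gives an empty list
p scalar.provision_scripts  # a single string is wrapped in a list
p list.persist_paths        # non-string entries are stringified
```

One caveat of this shape: `dig` raises `TypeError` when `pack` itself is a scalar rather than a mapping, so a malformed `space.yaml` would surface as an exception instead of an empty list.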
@@ -257,7 +257,7 @@ module Space::Core
257
257
  - `space.yaml` tracks the space identity, status, and associated metadata.
258
258
  - `repos/` contains cloned Git repositories for this work.
259
259
  - `notes/` is for task notes, scratch docs, and thinking-in-progress.
260
- - `architecture/` holds the architect mission memory (ARCHITECT.md and the per-iteration files).
260
+ - `architecture/` holds the architect project memory (ARCHITECT.md and the per-iteration files).
261
261
  - `tmp/` is the workspace-local scratch directory. Use it instead of `/tmp` or
262
262
  `/var/tmp`; when using `mktemp`, use `tmp/` as the base directory.
263
263
  - `build/` holds the architect loop's per-lane worktrees and scratch
@@ -0,0 +1,63 @@
1
+ # Generated by `space pack` for space: <%= space_id %>
2
+ # Build (run from the space root):
3
+ # docker build -f <output_dir>/Dockerfile -t <%= space_id %>:latest .
4
+ #
5
+ # Run (auth injected at runtime — never baked into layers):
6
+ # container run --rm -it \
7
+ # -e ANTHROPIC_API_KEY=<key> \
8
+ <% persist_paths.each do |guest_path| -%>
9
+ # -v .state<%= guest_path %>:<%= guest_path %> \
10
+ <% end -%>
11
+ # <%= space_id %>:latest
12
+ <% if persist_paths.any? -%>
13
+ # (Persist: host paths are relative to space root; create them before first run.)
14
+ <% end -%>
15
+ #
16
+ # Available env vars at runtime:
17
+ # ANTHROPIC_API_KEY — API key
18
+ # CLAUDE_CODE_OAUTH_TOKEN — alternative to API key
19
+ # ANTHROPIC_BASE_URL — optional proxy endpoint
20
+
21
+ FROM ruby:4.0.5
22
+
23
+ RUN apt-get update -qq \
24
+ && apt-get install -y --no-install-recommends \
25
+ git \
26
+ curl \
27
+ ca-certificates \
28
+ && rm -rf /var/lib/apt/lists/*
29
+
30
+ # Claude Code CLI (https://claude.ai/install.sh)
31
+ RUN curl -fsSL https://claude.ai/install.sh | bash
32
+
33
+ # Copy the space tree into the image (filtered by Dockerfile.dockerignore alongside this Dockerfile)
34
+ COPY . /space
35
+
36
+ # Install space-architect gem: prefer in-space checkout for a pinned version, fall back to RubyGems
37
+ RUN if [ -d /space/repos/space-architect ]; then \
38
+ cd /space/repos/space-architect \
39
+ && rake build \
40
+ && gem install --no-document pkg/*.gem; \
41
+ else \
42
+ gem install --no-document space-architect; \
43
+ fi
44
+
45
+ # Write entrypoint: sets git safe.directory, execs requested command or a login shell
46
+ RUN echo '<%= entrypoint_b64 %>' | base64 -d > /entrypoint.sh \
47
+ && chmod +x /entrypoint.sh
48
+
49
+ # Prepend gem-exec and payload bin dirs so login shells (bash --login) resolve architect/space/claude
50
+ RUN printf 'export PATH="/usr/local/bundle/bin:/root/.local/bin:$PATH"\n' \
51
+ > /etc/profile.d/space-architect.sh
52
+ <% if provision_scripts.any? -%>
53
+
54
+ # Space-declared build-time provisioning (pack.provision)
55
+ <% provision_scripts.each do |script| -%>
56
+ RUN /space/<%= script %>
57
+ <% end -%>
58
+ <% end -%>
59
+
60
+ WORKDIR /space
61
+
62
+ ENTRYPOINT ["/entrypoint.sh"]
63
+ CMD ["bash", "--login"]
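
The `-%>` tags in the template only trim the trailing newline when ERB runs with trim mode `"-"`. A minimal sketch of how the "Run" comment block could render, assuming that trim mode is what `space pack` uses; the `render_run_hint` helper and the sample values are hypothetical:

```ruby
require "erb"

# Fragment adapted from the "Run" comment block of the template above.
TEMPLATE = <<~TPL
  # container run --rm -it \\
  <% persist_paths.each do |guest_path| -%>
  #   -v .state<%= guest_path %>:<%= guest_path %> \\
  <% end -%>
  #   <%= space_id %>:latest
TPL

# Hypothetical helper: trim_mode "-" makes a closing "-%>" swallow the
# newline that follows the tag, so control lines leave no blank lines.
def render_run_hint(space_id:, persist_paths:)
  ERB.new(TEMPLATE, trim_mode: "-")
     .result_with_hash(space_id: space_id, persist_paths: persist_paths)
end

puts render_run_hint(space_id: "demo", persist_paths: ["/root/.claude"])
```

With an empty `persist_paths` the loop emits nothing and the hint collapses to the two fixed lines, which matches the template's intent of listing one `-v` flag per persisted path.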
@@ -0,0 +1,17 @@
1
+ # Build scratch: worktrees, harness output, builder artefacts
2
+ build/
3
+
4
+ # Temp workspace
5
+ tmp/
6
+ notes/tmp/
7
+
8
+ # Secrets — NEVER bake credentials in image layers
9
+ .env
10
+ **/.env
11
+ *.key
12
+ *.pem
13
+ *.p12
14
+ *.pfx
15
+ id_rsa
16
+ id_ecdsa
17
+ id_ed25519
@@ -0,0 +1,10 @@
1
+ #!/bin/bash
2
+ set -e
3
+ git config --global --add safe.directory '*'
4
+ git config --global --get user.name >/dev/null 2>&1 || git config --global user.name 'space-architect'
5
+ git config --global --get user.email >/dev/null 2>&1 || git config --global user.email 'architect@localhost'
6
+ if [ "$#" -eq 0 ]; then
7
+ exec bash --login
8
+ else
9
+ exec "$@"
10
+ fi
@@ -2,6 +2,6 @@
2
2
 
3
3
  module Space
4
4
  module Core
5
- VERSION = "2.0.0.rc1"
5
+ VERSION = "2.0.0.rc2"
6
6
  end
7
7
  end
@@ -1,25 +1,28 @@
1
1
  ---
2
2
  name: architect
3
3
  description: >
4
- Run the Architect Loop: Opus 4.8 in Claude Code is the ARCHITECT — judgment
5
- only: arbitration, judging raw evidence against frozen Acceptance Criteria,
6
- splitting iterations into disjoint lanes, kill/continue calls. The BUILDERS
7
- are 1-4 parallel Sonnet 4.6 agents run headless via `claude -p`, each in its
8
- own git worktree; the architect reviews, merges, and integrates their work.
4
+ Run the Architect Loop: a strong reasoning model (or a human) is the ARCHITECT
5
+ — judgment only: arbitration, judging raw evidence against frozen Acceptance
6
+ Criteria, splitting iterations into disjoint lanes, kill/continue calls. The
7
+ BUILDERS are 1-4 parallel cheaper agents run headless (reference harness:
8
+ `claude -p`), each in its own git worktree; the architect reviews, merges, and
9
+ integrates their work.
9
10
  The space is the memory: one file per iteration at
10
11
  architecture/I<NN>-<name>.md (Grounds / Specification / Acceptance Criteria /
11
12
  Builder Prompt / Builder Report / Verdict), indexed by
12
- architecture/ARCHITECT.md; a mission spans the repos under repos/. Use when
13
+ architecture/ARCHITECT.md; a project spans the repos under repos/. Use when
13
14
  asked to "architect", "run the loop", "next iteration", "judge the builder's
14
15
  work", or at the start of a work block in a space using the handoff system.
15
16
  ---
16
17
 
17
18
  # Architect
18
19
 
19
- You are the ARCHITECT (Opus 4.8 in Claude Code). Sonnet 4.6 via headless
20
- `claude -p` is the BUILDER, the same harness, one tier down. The space is the
21
- memory: mission artifacts live in the space's `architecture/` dir (committed),
22
- scratch in `build/` (gitignored); the mission spans the repos under `repos/`.
20
+ You are the ARCHITECT, the judgment role: a strong reasoning model (or a human),
21
+ run interactively. The BUILDER is a cheaper model run headless (reference
22
+ harness: `claude -p`), one or more per iteration — the models filling both roles
23
+ are an operator choice (see `docs/DESIGN.md` §1–§2), not fixed. The space is the
24
+ memory — project artifacts live in the space's `architecture/` dir (committed),
25
+ scratch in `build/` (gitignored); the project spans the repos under `repos/`.
23
26
  Your output is judgment and a dispatch — never implementation code. When you
24
27
  have enough information to act, act.
25
28
 
@@ -33,7 +36,7 @@ change guarantees, so there are no separate `gates/`, `lanes/`, or `prd/` dirs:
33
36
  |---|---|---|
34
37
  | **Grounds** | why — research/brief distilled (optional) | `architect section <it> grounds --from <f>` |
35
38
  | **Specification** | what/how — the full delegation contract | `architect section <it> specification --from <f>` |
36
- | **Acceptance Criteria** | proof — exact gate commands + thresholds | `architect freeze <it>` ❄️ **= the freeze** |
39
+ | **Acceptance Criteria** | proof — prose conditions (AC1, AC2, …) + fenced ` ```gates ` block of runnable checks | `architect freeze <it>` ❄️ **= the freeze** |
37
40
  | **Builder Prompt** | the exact lane-prompt(s) dispatched | `architect section <it> prompt --append --lane <l> --from <f>` |
38
41
  | **Builder Report** | raw evidence, transcribed verbatim from scratch | `architect evidence <it> --lane <l>` |
39
42
  | **Verdict** | rulings + per-AC PASS/FAIL/INVALID + KILL/CONTINUE | `architect section <it> verdict --from <f>` (later session) |
@@ -53,19 +56,21 @@ report in `build/<id>-<lane>/report.md`; `architect evidence` transcribes it
53
56
  commit — `architect section` refuses to write a frozen section once frozen; only
54
57
  Builder Prompt, Builder Report, and Verdict are appended after.
55
58
 
56
- **The mission brief (`architecture/BRIEF.md`).** A mission with a durable spec
59
+ **The project brief (`architecture/BRIEF.md`).** A project with a durable spec
57
60
  carries one brief — numbered §sections (§1 goal, §2 constraints, … §N definition
58
61
  of done) that span iterations. Every iteration's Grounds/Specification/Acceptance
59
62
  Criteria/Verdict cites it as **BRIEF §N** (e.g. `(BRIEF §3.1)`), the way each gate
60
63
  addresses its intent back to one frozen reference: the Acceptance Criteria table
61
64
  carries a `Brief §` column, the Specification Objective cites it, the Verdict
62
65
  reads "diff vs BRIEF §1/§3.3 — CONTINUE". Scaffold it with `architect brief new`.
63
- The brief is frozen at the mission level — edits to a §section are logged
64
- decisions in `ARCHITECT.md`, never silent per-iteration drift. Discovery missions
66
+ The brief is frozen at the project level — edits to a §section are logged
67
+ decisions in `ARCHITECT.md`, never silent per-iteration drift. Discovery projects
65
68
  that are still finding their shape defer the brief, cite per-iteration Grounds,
66
69
  and promote the consolidated picture into BRIEF.md once it stabilizes.
67
70
 
68
- Full rationale and citations: `DESIGN.md` in this skill's repo. Exact dispatch
71
+ Full rationale and citations: `docs/DESIGN.md` in this skill's repo, the
72
+ source-backed "why" behind these rules (the **R**-numbers and **§**-numbers cited
73
+ throughout this skill point into it). Exact dispatch
69
74
  commands and the lane-prompt template: `dispatch.md` next to this file. To load
70
75
  this system's vocabulary without running the loop (e.g. when working *on* the
71
76
  skill), invoke the `/architect-vocabulary` skill — it is the glossary, not the
@@ -114,19 +119,19 @@ loop.
114
119
  `AGENTS.md` → `README.md` → architecture docs. Learn the exact verification
115
120
  gate (test/lint/typecheck/build commands) from docs or CI config.
116
121
  - Once per environment: `claude --version` and confirm the builder model
117
- resolves (`echo ok | claude -p --model claude-sonnet-4-6 --max-turns 1`;
122
+ resolves (`echo ok | claude -p --model <builder-model> --max-turns 1`;
118
123
  details in `dispatch.md`). First dispatch in a new environment is a canary —
119
124
  confirm it starts cleanly before fanning out.
120
125
  - Read `architecture/ARCHITECT.md` (the cross-iteration table of contents),
121
- `architecture/BRIEF.md` if present (the durable §-numbered mission contract you
126
+ `architecture/BRIEF.md` if present (the durable §-numbered project contract you
122
127
  cite as BRIEF §N), and the iteration file `architecture/I<NN>-<name>.md` for any
123
128
  in-flight iteration. If `ARCHITECT.md` is missing, run `architect init` (scaffolds
124
- `architecture/ARCHITECT.md` and the `architect:` block in `space.yaml`,
129
+ `architecture/ARCHITECT.md` and the `project:` block in `space.yaml`,
125
130
  commits). Keep the handoff a short TOC (~150 lines): TL;DR + repos in scope +
126
131
  an iteration index pointing at each iteration file; per-iteration detail lives
127
132
  in the iteration file, never duplicated into the handoff. `architect status`
128
- prints mission state (iterations, freeze_shas, lanes, verdicts) at any point.
129
- - **Space setup (first time):** `architect space new "Mission Name" -r org/repo -r …`
133
+ prints project state (iterations, freeze_shas, lanes, verdicts) at any point.
134
+ - **Space setup (first time):** `architect space new "Project Name" -r org/repo -r …`
130
135
  (each repo is a repeatable `-r` flag after the title), then `architect init` inside
131
136
  the space to scaffold `architecture/ARCHITECT.md`.
132
137
  - Scale to the task: trivial fixes don't need the loop — say so and let the
@@ -166,11 +171,13 @@ re-derivation — the simplification that re-sees the shape is often the
166
171
  architect's or human's to name.) Then
167
172
  one iteration-level call: **KILL / CONTINUE**, with the single decisive reason,
168
173
  written into the Verdict. For high-stakes iterations
169
- (schema/API/persistence/security), add a review before the verdict. You
170
- (Opus 4.8) reading the diff is already a stronger-model, fresh-context pass
171
- over the Sonnet builder's work — a cross-tier read, though not cross-vendor
172
- (both are Claude Code). For an extra adversarial pass, pipe the diff to a fresh
173
- read-only `claude -p` reviewer (command in `dispatch.md`) or a
174
+ (schema/API/persistence/security), add a review before the verdict. The
175
+ architect reading the diff is already a stronger-model, fresh-context pass over
176
+ the builder's work — a capability-gap read whose independence depends on the
177
+ pairing: a same-lab architect/builder shares the builder's blind spots (so the
178
+ frozen gates stay the independent check), while a cross-vendor pairing is more
179
+ independent (see `docs/DESIGN.md` §1/R3). For an extra adversarial pass, pipe the
180
+ diff to a fresh read-only `claude -p` reviewer (command in `dispatch.md`) or a
174
181
  fresh-context subagent prompted to break confidence — calibrated to flag only
175
182
  correctness/requirement/invariant gaps with file:line evidence, no style.
176
183
 
@@ -194,7 +201,7 @@ Two scales, two routes:
194
201
  researcher maps the topic, the orchestrator designs topic-specific parallel
195
202
  researcher lanes, claims verified against sources, synthesized into a cited
196
203
  report). Its report then distills into `architecture/BRIEF.md` §sections when
197
- it is mission-scope (a durable contract that spans iterations), or the
204
+ it is project-scope (a durable contract that spans iterations), or the
198
205
  iteration's **Grounds** section when it is iteration-scope.
199
206
  - **Iteration scale** — run the inline fan-out below only when at least one
200
207
  trigger holds: (a) the iteration depends on external APIs, libraries, or
@@ -214,8 +221,13 @@ and write Grounds. Findings without a source URL don't enter Grounds.
214
221
  ### 4. Spec the next iteration
215
222
 
216
223
  One-PR-sized. Run `architect new <name>` to scaffold
217
- `architecture/I<NN>-<name>.md` (it allocates the next ordinal and records the
218
- iteration in `space.yaml`), then write the **Specification** section with
224
+ `architecture/I<NN>-<name>.md` (it allocates the next ordinal **at spec-time** and records the
225
+ iteration in `space.yaml`). The iteration index in `ARCHITECT.md` holds only scaffolded
226
+ iterations — planned/queued work lives un-numbered in the ordered Backlog until it is
227
+ about to be specced. Pre-numbering planned work forces renumber churn every time priorities
228
+ reshuffle; the ordinal belongs to the work only once `architect new` is called.
229
+
230
+ Then write the **Specification** section with
219
231
  `architect section <name> specification --from <file>` — the full delegation
220
232
  contract, self-contained:
221
233
 
@@ -236,22 +248,47 @@ contract, self-contained:
236
248
  repos are inherently disjoint; same-repo lanes with any file overlap run as
237
249
  one. Each lane gets its own objective, output format, and boundaries. Most
238
250
  iterations are one lane — fan out only when the work is genuinely parallel (a
239
- cross-repo mission often is).
251
+ cross-repo project often is). Two first-class patterns — runnable recipes in `dispatch.md`: **parallel +
252
+ fast-follow** (disjoint lanes integrate first; a fast-follow lane off
253
+ `project/<slug>` carries the seam — see `### Parallel + fast-follow`) and
254
+ **serial deferred judgment** (iterations run to gates-green with `architect
255
+ verdict` withheld; one later batch session judges each against its own frozen
256
+ AC — see `### Serial deferred judgment`).
240
257
  - **Effort call** — thinking budget set in the lane-prompt via the escalation
241
258
  keywords (`think hard` … `ultrathink`); default unattended builder work high,
242
259
  downgrade a routine, tightly-specified lane (record which and why). Claude
243
260
  Code has no per-invocation effort flag — see `dispatch.md`.
244
261
 
245
- Then write the **Acceptance Criteria** section (exact gate commands +
246
- thresholds, each row carrying a `Brief §` column that addresses it back to
247
- intent) and run `architect freeze <name>`. What must be frozen before dispatch
248
- is the Acceptance Criteria: `architect freeze` commits any pending content in the
249
- frozen region (Grounds/Specification/Acceptance Criteria) in one freeze commit,
250
- records the `freeze_sha` in `space.yaml`, and prints the frozen AC back; **that
251
- commit is the freeze** ❄️ and is the last thing before dispatch. You needn't
252
- sequence Grounds and Specification into separate commits first; the freeze
253
- snapshots the whole frozen region and refuses to re-freeze once a frozen section
254
- changed afterward.
262
+ **Spike (probe) iterations.** When the open question is too uncertain for a
263
+ build (the repo can't answer it and routine API-verification won't resolve it),
264
+ spec a *spike* (probe) instead of a build iteration. A spike is
265
+ investigate-only: its deliverable is a **recommendation**, not merged behavior.
266
+ Its lane reads, experiments in throwaway scratch (never the worktree), and
267
+ writes a structured recommendation to its scratch report; there is usually
268
+ nothing to integrate. Acceptance Criteria are **read-bound**: gates are
269
+ minimal (at most suite-green confirming the probe broke nothing), because the
270
+ proof is the architect reading the recommendation against the question the spike
271
+ was set, not a runnable check; the AC names that question. The spike verdict
272
+ uses **ADOPT / REVISE / REJECT** (not KILL/CONTINUE): the architect transcribes
273
+ the recommendation into Builder Report, then the Verdict records the
274
+ disposition and, if adopted, names the follow-up build iteration it spawns. A
275
+ spike's CONTINUE means "recommendation accepted + disposition recorded." This
276
+ is distinct from discovery-scale research (`/architect-research`, which surveys
277
+ a whole topic): a spike is one iteration-sized, decision-oriented probe run
278
+ through the normal builder/lane machinery.
279
+
280
+ Then write the **Acceptance Criteria** section — prose conditions (AC1, AC2, …)
281
+ that the architect judges against, followed by a fenced ` ```gates ` block of
282
+ runnable checks (each gate carries `id`, `ac`, `cmd`, and `expect`; `cwd` is
283
+ optional) — and run `architect freeze <name>`. What must be frozen before
284
+ dispatch is the Acceptance Criteria: `architect freeze` lints the gates block
285
+ (absent or empty gates is allowed but warns; malformed fails), commits any
286
+ pending content in the frozen region (Grounds/Specification/Acceptance Criteria)
287
+ in one freeze commit, records the `freeze_sha` in `space.yaml`, and prints the
288
+ frozen AC back; **that commit is the freeze** ❄️ and is the last thing before
289
+ dispatch. You needn't sequence Grounds and Specification into separate commits
290
+ first — the freeze snapshots the whole frozen region and refuses to re-freeze
291
+ once a frozen section changed afterward.
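
The lint rule described above (absent or empty gates warn, malformed gates fail) can be sketched as follows. This is an illustration under stated assumptions, not the shipped linter: it takes gates that are already parsed into an array of hashes with string keys, since the diff does not show the concrete syntax of the `gates` block, and the `lint_gates` name and its messages are hypothetical.

```ruby
# Field names come from the skill text; everything else is an assumption.
REQUIRED_GATE_KEYS = %w[id ac cmd expect].freeze

# Returns [status, messages] where status is :ok, :warn, or :fail.
def lint_gates(gates)
  # Absent or empty gates are allowed but flagged.
  return [:warn, ["no gates declared"]] if gates.nil? || gates.empty?

  errors = []
  gates.each_with_index do |gate, index|
    unless gate.is_a?(Hash)
      errors << "gate #{index}: not a mapping"
      next
    end
    # `cwd` is optional, so only the four required keys are checked.
    missing = REQUIRED_GATE_KEYS - gate.keys
    errors << "gate #{index}: missing #{missing.join(', ')}" unless missing.empty?
  end

  errors.empty? ? [:ok, []] : [:fail, errors]
end

# Sample gate values are illustrative only.
p lint_gates([])
p lint_gates([{ "id" => "G1", "ac" => "AC1", "cmd" => "rake test", "expect" => "exit 0" }])
p lint_gates([{ "id" => "G2" }])
```

Keeping the warn and fail outcomes distinct mirrors the text: a read-bound spike can freeze with no gates at all, while a half-written gate blocks the freeze.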
255
292
 
256
293
  ### 5. Dispatch (one fresh `claude -p` per lane, worktree-isolated)
257
294
 
@@ -279,14 +316,23 @@ files and writes raw results to `build/<id>-<lane>/report.md` — it never
279
316
  touches `architecture/`, so lanes never collide and the Acceptance Criteria
280
317
  stay untouchable.
281
318
 
282
- Do not block — end the turn or do other judgment work; multi-hour runs are
319
+ Do not block — end the turn or do other judgment work; long runs (30–60 minutes) are
283
320
  normal. Print the lane-prompts too, so the human can run any lane in an
284
321
  interactive `claude` session instead. Whenever you return to a running lane,
285
322
  check liveness: the lane's `run.jsonl` must still be growing. If it has been
286
323
  silent 15+ minutes on one in-flight command, follow "Stall detection and
287
324
  rescue" in `dispatch.md` — kill the stuck child process, not the run.
288
325
 
289
- ### 6. Post-flight and integrate (when the runs complete)
326
+ When all lanes complete, **the dispatch session's job is done** — babysit
327
+ liveness per `dispatch.md` but do not run gates, transcribe evidence, integrate
328
+ lanes, or write the Verdict. Hand off to a fresh judging session (§6).
329
+
330
+ ### 6. Post-flight, judge, and integrate (judging session)
331
+
332
+ A fresh judging session — not the session that dispatched (see §1 and §5) —
333
+ opens with the **MECHANICAL POST-FLIGHT CHECKS** and owns everything through the
334
+ Verdict and integration. Because this session did not dispatch, §1's
335
+ fresh-session-judgment is intact: it is the correct session to evaluate results.
290
336
 
291
337
  `architect verify <iteration>` REPORTS (it never judges) per lane: frozen
292
338
  sections untouched, no builder commits, scratch report present, in-bounds.
@@ -307,20 +353,30 @@ commits it, and echoes the builder's STATUS line. The builder never wrote into
307
353
 
308
354
  **Then integrate** — you decide which lanes pass, the CLI does the git
309
355
  mechanics. `architect integrate <iteration> --lanes <passing-set>` commits each
310
- named lane on its branch and merges it `--no-ff` into the repo's integration
311
- branch `lane/<iteration>`, in order; it **refuses** a lane that left builder
356
+ named lane on its branch and merges it `--no-ff` into the stable
357
+ `project/<slug>` branch (slug derived from `space.title`; persistent and shared
358
+ across all iterations), in order; it **refuses** a lane that left builder
312
359
  commits or wrote out-of-bounds, and stops on a merge conflict — which means the
313
360
  lane plan wasn't disjoint, a spec defect: kill the conflicting lane and re-spec
314
- it (never hand-resolve). Then run `architect gate <iteration>` against the
315
- integration branch as a smoke check (raw output; the verdict stays yours). A
316
- cross-repo mission yields one `lane/<iteration>` branch per touched repo. Update
317
- the iteration index in `architecture/ARCHITECT.md` (recording each repo's
318
- integration branch), remove the worktrees (`architect integrate … --teardown`,
319
- or `architect worktree remove <iteration> <lane>`), and commit the space.
320
-
321
- **Do not judge now** — the Verdict on the integration branch belongs to the
322
- next architect session; merge to each repo's main only on a CONTINUE verdict
323
- there.
361
+ it (never hand-resolve). A cross-repo project yields one `project/<slug>` branch
362
+ per touched repo. `main` is never touched per-iteration; `--teardown` deletes
363
+ only the per-lane `lane/<iteration>-<lane>` branches and worktrees, never the
364
+ project branch. Update the iteration index in `architecture/ARCHITECT.md`
365
+ (recording the `project/<slug>` branch), remove the worktrees (`architect
366
+ integrate … --teardown`, or `architect worktree remove <iteration> <lane>`), and
367
+ commit the space.
368
+
369
+ **Run the frozen gates cold:** `architect gate <iteration>` runs the frozen
370
+ gate commands against the integration tree and streams raw output (a runner, not
371
+ a judge). Read the output, check the diff against the Specification and the
372
+ cited BRIEF §sections (per §2), then write the **Verdict** (`architect verdict
373
+ <iteration> continue|kill --from <file>`): disagreement rulings, per-AC
374
+ PASS/FAIL/INVALID, the KILL/CONTINUE call.
375
+
376
+ At project end, `architect land` prints the single `gh pr create --base main
377
+ --head project/<slug>` command per touched repo and writes a PR body to
378
+ `build/land/` — no push, no `gh` call; the human runs it from the repo when
379
+ the project is ready to ship.
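
The per-repo command that `architect land` is described as printing can be sketched like this. It is an illustration only: the `land_commands` helper, the slug, and the repo names are hypothetical, and the real command also writes a PR body to `build/land/`, which is omitted here.

```ruby
require "shellwords"

# Builds one `gh pr create` command string per touched repo.
# Nothing is executed; the strings are meant to be printed for a human.
def land_commands(slug:, repos:)
  head = "project/#{slug}"
  repos.to_h do |repo|
    [repo, Shellwords.join(["gh", "pr", "create", "--base", "main", "--head", head])]
  end
end

# Sample slug and repo names are illustrative only.
land_commands(slug: "demo", repos: ["api", "web"]).each do |repo, command|
  puts "#{repo}: #{command}"
end
```

Building the string with `Shellwords.join` keeps the printed command safe to paste even if a slug ever contained shell-sensitive characters.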
324
380
 
325
381
  ## Maintenance
326
382