RubyGems - space-architect - Versions diffs - 1.2.0 → 1.3.0 - Mend

space-architect 1.2.0 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

checksums.yaml +4 -4
data/README.md +9 -0
data/lib/space_architect/cli/architect.rb +26 -0
data/lib/space_architect/skill_installer.rb +107 -0
data/lib/space_architect/terminal.rb +15 -0
data/lib/space_architect/version.rb +1 -1
data/lib/space_architect.rb +1 -0
data/skill/architect/SKILL.md +329 -0
data/skill/architect/dispatch.md +308 -0
data/skill/architect/research.md +89 -0
data/skill/architect-research/SKILL.md +165 -0
data/skill/architect-research/lanes.md +191 -0
data/skill/architect-vocabulary/SKILL.md +141 -0
data/vendor/repo-tender/lib/space_architect/pristine/cli.rb +1 -10
data/vendor/repo-tender/lib/space_architect/pristine.rb +7 -0
metadata +8 -1

data/skill/architect-research/lanes.md ADDED

@@ -0,0 +1,191 @@
+# Source-class tactics library — researcher preamble, scout block, verified endpoints
+Lanes are DESIGNED per topic by the orchestrator (SKILL.md step 2); the
+sections below are search tactics and verified endpoints per source class —
+draw on whichever a designed lane needs, mix freely. Endpoints verified
+unauthenticated June 2026. Every researcher block starts with this preamble,
+then the lane-specific objective:
+```
+You are a web research agent. Answer ONE assigned objective. Do not write code,
+do not make recommendations — judgment belongs to the orchestrator reading your
+output. Budget: <N> searches; if two consecutive searches yield no new
+load-bearing facts, stop and return. HARD CONTEXT RULES: never open a full
+page when the search snippet answers the question; quote at most 2 sentences
+per source; the moment you can answer, STOP and write your findings — partial
+findings beat context exhaustion (researchers that fill their window die
+without writing anything). OUTPUT: markdown findings — every finding
+carries source URL, source date, the exact figure or a short direct quote, and
+a confidence tag (high = primary source / med = reputable secondary / low =
+single blog or forum post). Prefer primary sources. Record exact version
+numbers and dates. When sources disagree, report the disagreement — do not
+resolve it. If you cannot find evidence, write NOT FOUND — never fill gaps from
+prior knowledge without flagging it. End with the 2-3 findings most likely to
+change a design decision.
+```
+**Lane scoping rule (learned 2026-06-12):** cap each researcher at ~5 subjects
+(repos, vendors, people). Doc-heavy lanes burn the context window on fetched
+pages — two of nine researchers in one session died of context exhaustion
+before writing any findings. A researcher that dies returns NOTHING (`-o`
+only materializes on a clean finish). If a lane dies this way, bisect it into
+narrower lanes and re-dispatch; don't re-run it as-is.
+## Lane 0 — Scout (brainstorm scale; dispatches before lane design)
+Objective template: map the terrain of <topic> — do NOT gather findings.
+Return: (1) canonical terminology and the names the field itself uses;
+(2) the 5–10 load-bearing systems/papers/repos/vendors, one line each on why
+they matter; (3) the named people whose positions recur; (4) which source
+classes look rich vs empty for this topic (papers? repos? vendor blogs?
+forums?); (5) the topic's natural fault lines — the 3–6 sub-questions an
+expert would split it into. Budget ~10 searches; breadth over depth; snippet
+over page. Output is a MAP for the orchestrator to design lanes from —
+structure matters more than completeness.
+## Lane 1 — Academic (latest papers)
+Objective: the current academic state of <topic> — most recent survey, the
+frontier preprints, and which papers the field treats as load-bearing.
+Pipeline: **survey first → frontier sweep → snowball → score.**
+- Recent survey: Semantic Scholar `publicationTypes=Review`, or arXiv
+  `ti:survey AND abs:<topic>` (last ~18 months). The survey supplies canonical
+  terminology and the seed bibliography.
+- Frontier sweep (newest first):
+  `https://export.arxiv.org/api/query?search_query=cat:<cs.XX>+AND+abs:%22<topic>%22&sortBy=submittedDate&sortOrder=descending&max_results=25`
+  (Atom XML; uppercase AND/OR; ≥3s between calls) and
+  `https://api.semanticscholar.org/graph/v1/paper/search?query=<topic>&fields=title,year,citationCount,tldr,venue,externalIds&limit=20&year=2025-2026`
+  (expect 429s — back off and retry; the `tldr` field is gold for triage).
+  Community signal: `https://huggingface.co/api/daily_papers?limit=20` and
+  `https://huggingface.co/papers/trending`. **Papers With Code is dead**
+  (shut down July 2025; HF Papers is the successor) — never cite it.
+- Snowball from the 2-3 most relevant seeds — a reliable "latest papers" method:
+  forward citations
+  `https://api.semanticscholar.org/graph/v1/paper/arXiv:<id>/citations?fields=title,year,isInfluential&limit=100`
+  and semantic neighbors
+  `https://api.semanticscholar.org/recommendations/v1/papers/forpaper/arXiv:<id>?limit=20`.
+  Fallback when S2 throttles: OpenAlex —
+  `https://api.openalex.org/works?search=<topic>&sort=publication_date:desc&per-page=25&mailto=research@example.com`.
+- Score candidates: citations-per-month (not raw count — meaningless for 2026
+  papers), venue/OpenReview decision (`https://api2.openreview.net/notes/search?term=<topic>&limit=25`
+  has actual reviewer scores), code availability, HF traction. Red flags:
+  preprint-only after 18+ months, self-citation-heavy.
+## Lane 2 — Popular repos (what the ecosystem actually uses)
+Objective: the 5-10 repos/libraries the ecosystem has actually adopted for
+<topic>, with adoption evidence beyond stars.
+- Discovery: GitHub search —
+  `topic:<topic> stars:>1000 archived:false sort:stars`,
+  `"<topic>" in:name,description,readme stars:>2000`, plus awesome-lists as
+  recall boosters (`awesome <topic> in:name stars:>1000`) — re-check `pushed:`
+  on every list entry; lists go stale.
+- **Adoption evidence beats stars**: dependents count via
+  `https://api.deps.dev/v3/systems/<npm|pypi|...>/packages/<name>` or
+  `https://packages.ecosyste.ms` (keyless, 5k req/hr); registry download
+  *trends* (`https://pypistats.org/api/packages/<pkg>/recent`,
+  `https://api.npmjs.org/downloads/point/last-month/<pkg>`).
+- **Fake-star check**: ~4.5M fake stars documented in the wild. Stars without
+  proportional forks/issues/dependents = flag it. Report stars AND dependents
+  AND last release for every repo.
+## Lane 3 — Cutting-edge repos (emerging, not hype)
+Objective: what's emerging in <topic> in the last ~6 months that practitioners
+are actually adopting — and which hyped repos are already abandoned.
+- Where bleeding-edge surfaces first: HF daily/trending papers (code-linked);
+  Hacker News via Algolia —
+  `https://hn.algolia.com/api/v1/search_by_date?query=<topic>&tags=story&numericFilters=points>50`
+  (also `query=github.com` + topic for Show HNs); `https://lobste.rs/t/<tag>.json`;
+  GitHub `topic:<topic> created:>{90d ago} stars:>100 pushed:>{14d ago} sort:stars`;
+  OSS Insight (`https://ossinsight.io/collections/trending`) for transparent
+  velocity ranking.
+- **Emerging-vs-hype gate** (report which side each repo lands on):
+  EMERGING = created recently AND pushed <14d AND star velocity sustained ≥2
+  weeks AND issues getting maintainer responses AND linked from a paper or a
+  track-record org AND forks/issues growing in proportion to stars.
+  HYPE = week-one star spike then stalled pushes, unanswered issues, README
+  promises >> code, single contributor, no tests/releases. Any single signal
+  is gameable; the conjunction is not.
+## Lane 4 — Production-grade design patterns
+Objective: how 2-3 production libraries adjacent to <topic> design
+the thing we're about to build — API ergonomics, error handling, extension
+points, testing patterns — and where they differ.
+- Select subjects with the production-grade gate: pushed <6mo (or explicitly
+  stable + responsive issues), tagged releases + changelog in last 12mo,
+  dependents >100 (ecosystem-adjusted), ≥2 active maintainers, CI runs tests
+  on PRs, OSI license, no unaddressed criticals on `https://osv.dev`.
+  Ignore raw stars and commit counts.
+- Reading order — never start at file #1: README + manifest (entry points,
+  exports = the deliberate public surface) → trace ONE canonical happy-path
+  call end to end → tests for the relevant feature (executable documentation
+  of edge-case policy) → 3 closed issues + 2 merged PRs in the area (the
+  "why not" you can't get from code).
+- Extract four categories per library: **API ergonomics** (cost of the 90%
+  case in lines, defaults, config layering), **error handling** (exception
+  hierarchy root, retried vs raised, boundary translation), **extension
+  points** (grep for hook/adapter/middleware/plugin/register/Protocol),
+  **testing patterns** (fixture strategy, how I/O is faked, regression-test-
+  per-bug convention).
+- Then the **cross-library diff**: patterns all of them share are load-bearing;
+  where they differ is a trade-off to document.
+- Tools: GitHub code search (`symbol:<Name>`, `/regex/`, `repo:`, `path:`),
+  `https://grep.app` (usage in the wild), `https://sourcegraph.com/search`.
+  For "what do people actually call", search downstream dependents' code, not
+  the library.
+## Lane 5 — General web
+Objective: everything the other lanes structurally miss on <topic> — expert
+blog posts, postmortems and failure reports, comparisons, official vendor
+docs/changelogs, pricing/operational constraints.
+- Standard multi-angle sweep: official docs/changelogs; named-expert posts;
+  "<X> postmortem" / "<X> at scale" / "<X> problems" for failure reports;
+  "<X> vs <Y>" for comparisons. Date-restrict queries on fast-moving topics.
+- Source hierarchy applies hardest here: SEO listicles and AI-generated
+  aggregators are pointers, never citations — chase them to the primary
+  source or drop the claim.
+## Lane 6 — Expert opinion (second wave — dispatch after lanes 1-5 return)
+Objective: what the named experts in <topic> are saying right now — positions,
+warnings, predictions, and especially disagreements — from their blogs, talks,
+and social posts.
+- **Build the roster first** (why this lane runs second): survey and top-paper
+  authors (lane 1), maintainers of the leading repos (lanes 2-3), and names
+  that recur across lane 5 results. Pick 5-8; record each expert's affiliation
+  — you'll need it for conflict-of-interest tagging.
+- Where to find their voice, in reliability order:
+  1. **Personal blogs / newsletters** — the primary source for considered
+     positions; search `"<name>" <topic>` and `site:<their-domain> <topic>`.
+  2. **HN comments** — keyless and reliable:
+     `https://hn.algolia.com/api/v1/search?tags=comment,author_<username>&query=<topic>`
+     (many experts comment under well-known usernames).
+  3. **Conference talks / podcasts** — search `"<name>" talk <topic> 2026`;
+     prefer transcripts or the speaker's own writeup over third-party recaps.
+  4. **X** — login-walled for agents. Use search-engine indexing
+     (`site:x.com "<name>" <topic>`) and direct profile URLs
+     (`x.com/<handle>`); don't rely on third-party viewers (flaky) and note
+     that Bluesky's public search API has been closed (403) since March 2025
+     — profile pages only.
+  5. **Reddit / lobste.rs** threads and AMAs (via indexed search:
+     `site:reddit.com "<name>" <topic>`).
+- **Opinion is its own evidence class.** An expert opinion is judgment —
+  datable, revisable, and sometimes conflicted. For every position report:
+  the exact quote or close paraphrase, where and when stated, and any conflict
+  of interest (vendor employee talking their book, author promoting their own
+  tool). An opinion NEVER counts toward the ≥2-source rule for factual claims
+  — facts get verified in the other lanes; this lane reports who believes
+  what and why.
+- **The highest-value output is disagreement**: where credible experts
+  contradict each other is exactly where the genuinely open questions are.
+  Map who stands where and what evidence each side cites.
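
Lane 1 of lanes.md above scores candidate papers by citations-per-month rather than raw citation count, since raw counts say little about very recent papers. A minimal Ruby sketch of that score follows. The `Paper` struct, its field names, the sample figures, and the one-month floor on paper age are all assumptions made for illustration; none of them come from the gem or from the APIs listed in the file.

```ruby
require "date"

# Hypothetical record type for this sketch; field names are assumptions.
Paper = Struct.new(:title, :published_on, :citation_count, keyword_init: true)

# Citations per month since publication, measured at `as_of`.
# Age is floored at one month (an assumption) so a paper published in the
# current month does not divide by zero.
def citations_per_month(paper, as_of:)
  months = (as_of.year - paper.published_on.year) * 12 +
           (as_of.month - paper.published_on.month)
  paper.citation_count.to_f / [months, 1].max
end

# Highest citation velocity first.
def rank_by_citation_velocity(papers, as_of:)
  papers.sort_by { |paper| -citations_per_month(paper, as_of: as_of) }
end

# Sample data, invented for the example.
papers = [
  Paper.new(title: "Older survey", published_on: Date.new(2023, 6, 1), citation_count: 360),
  Paper.new(title: "Recent preprint", published_on: Date.new(2026, 3, 1), citation_count: 45)
]
as_of = Date.new(2026, 6, 1)

rank_by_citation_velocity(papers, as_of: as_of).each do |paper|
  puts format("%-16s %.1f citations/month", paper.title, citations_per_month(paper, as_of: as_of))
end
```

With these sample figures the recent preprint ranks first (15.0 citations per month over 3 months) ahead of the older survey (10.0 over 36 months), even though its raw count is eight times smaller.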

data/skill/architect-vocabulary/SKILL.md ADDED

@@ -0,0 +1,141 @@
+---
+name: architect-vocabulary
+description: >
+  Load the Architect system's vocabulary and a short "where you are"
+  orientation — space, mission, iteration, lane, brief, builder, architect,
+  gate, freeze, verdict, research, variant set — for when you're standing in a
+  space-architect workspace (or working on the skill itself) and need the terms
+  understood in conversation but do NOT want to run the loop. Reference only:
+  it does not dispatch builders, freeze, judge, or write iteration files.
+  Invoke as /architect-vocabulary.
+---
+# Architect Vocabulary
+This skill loads **terminology and orientation only**. It is the glossary, not
+the loop.
+## What this skill is — and isn't
+- **Is:** the shared vocabulary of `space-architect` plus a quick orientation to
+  where things live, so these terms are understood for the rest of the session.
+- **Isn't:** the build loop. Do **not** run `architect new` / `freeze` /
+  `dispatch` / `integrate`, do **not** write or edit anything under
+  `architecture/`, and do **not** render verdicts on builder work. That is the
+  separate **`/architect`** skill — invoke it deliberately when you actually
+  want to run the loop. Read-only `architect` commands (`status`, `show`) are
+  fine here.
+## Vocabulary
+**Roles**
+- **architect** — the judgment role (a human, or Claude Opus 4.8 in judgment
+  mode): arbitrates disagreements, writes and freezes iteration files, calls
+  kill/continue, merges builder output. Never writes implementation code.
+- **builder** — the implementation role: Claude Sonnet 4.6 run headless via
+  `architect dispatch` (`claude -p`), one per lane in its own worktree. Reports
+  raw evidence; never grades its own work; never edits `architecture/`.
+**The workspace**
+- **space** — a task-scoped workspace directory holding repos, notes, and
+  artifacts under one root. `architect` finds it by walking up from `$PWD` to the
+  nearest `space.yaml`.
+- **space.yaml** — the space's identity file: id, title, status, repos, notes,
+  tags, plus the `architect:` block (mission state — iterations, freeze shas,
+  lanes).
+- **mission** — an Architect Loop instance living inside a space; spans the
+  repos under `repos/`.
+**The unit of work**
+- **iteration** — one PR-sized unit of work, captured as a single self-contained
+  file `architecture/I<NN>-<name>.md`, grown section by section. Its sections:
+  - **Grounds** — *why*: research/brief distilled (optional).
+  - **Specification** — *what/how*: the full delegation contract.
+  - **Acceptance Criteria** — *proof*: exact gate commands + thresholds; this is
+    what gets frozen.
+  - **Builder Prompt** — the exact lane-prompt(s) dispatched.
+  - **Builder Report** — raw evidence, transcribed verbatim from build scratch.
+  - **Verdict** — rulings + per-AC PASS/FAIL/INVALID + KILL/CONTINUE.
+- **lane** — a parallel slice of an iteration (1–4 per iteration), each
+  declaring a disjoint target repo + file-touch set. Lanes in different repos are
+  inherently disjoint; same-repo lanes that overlap files run as one. Each runs
+  in its own worktree under `build/<id>-<lane>/`.
+- **worktree** — the isolated git worktree a lane builds in, off the target
+  repo's base commit, so lanes never collide.
+- **dispatch** — launching a fresh headless builder for a lane (`architect
+  dispatch <iteration> <lane>`), streaming output to `build/<id>-<lane>/run.jsonl`.
+**Contracts and checkpoints**
+- **brief** (`architecture/BRIEF.md`) — the durable, §-numbered mission contract
+  that spans iterations; frozen at the mission level and cited as **BRIEF §N**.
+- **ARCHITECT.md** (`architecture/ARCHITECT.md`) — the cross-iteration index /
+  table of contents and mission-wide state.
+- **freeze** ❄️ — committing the frozen region (Grounds / Specification /
+  Acceptance Criteria) *before* dispatch (`architect freeze`). Records the
+  **freeze_sha** in `space.yaml`; any later change to those sections is an
+  automatic iteration FAIL.
+- **gate** — a frozen verification command + threshold (test/lint/typecheck/
+  build). `architect gate` runs them and streams raw output — it is a runner,
+  never a judge.
+**Outcomes**
+- **verdict** — the architect's ruling on an iteration, written after evidence:
+  - per-criterion: **PASS** / **FAIL** / **INVALID** (INVALID = not measured the
+    way the gate specifies).
+  - iteration-level: **KILL** / **CONTINUE**.
+- **variant set** — an iteration built as multiple `(harness, model)` lanes over
+  one frozen spec, judged head-to-head against the same Acceptance Criteria; the
+  winner is selected with the human in the loop, not unilaterally.
+**Research**
+- **research** — two scales:
+  - **discovery scale** (brainstorming, technology selection, state-of-the-art)
+    → the **`/architect-research`** skill: a scout maps the topic, the
+    orchestrator designs parallel researcher **lanes**, claims are verified
+    against sources, and the synthesis distills into a brief §section or an
+    iteration's Grounds.
+  - **iteration scale** → an inline fan-out run only when an iteration needs
+    facts the repo doesn't already have.
+**Repos**
+- **evergreen** / copy-on-write / **`src` engine** — repo provisioning: when an
+  up-to-date local copy exists under `evergreen_dir`, `architect` copies it into
+  the space (copy-on-write on APFS) instead of cloning over the network. The
+  vendored `src` engine keeps those evergreen checkouts tended.
+## Where you are
+A space's directory layout:
+```text
+space.yaml        # identity + mission state (the architect: block)
+README.md
+repos/            # the repos the mission spans
+notes/            # scratch, prompts, logs
+architecture/     # ARCHITECT.md index + I<NN>-<name>.md iteration files (+ BRIEF.md)
+build/            # lane worktrees + scratch: build/<id>-<lane>/
+tmp/              # workspace-local temp — use instead of /tmp
+```
+Safe **read-only** commands to orient yourself (these don't run the loop):
+```sh
+architect status            # mission state: iterations, freeze shas, lanes, verdicts
+architect space show        # the space you're standing in
+architect space list        # all your spaces
+```
+## Maintenance
+This glossary is a **self-contained copy** of terms defined canonically in the
+`architect` skill (`SKILL.md`) and the project `README.md`. It is installed as an
+isolated skill, so it can't reference those at runtime — when the vocabulary
+changes there, re-read this file against them and update it to keep the two from
+drifting.
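
The glossary above says `architect` finds a space by walking up from `$PWD` to the nearest `space.yaml`. One way such a lookup could be sketched in Ruby is shown below. `find_space_root` is a hypothetical helper written for this example; the gem's actual lookup code is not part of this diff and may differ.

```ruby
require "pathname"

# Walk up from `start` toward the filesystem root and return the nearest
# directory that contains a space.yaml, or nil when no ancestor has one.
# Hypothetical helper for illustration only.
def find_space_root(start = Dir.pwd)
  Pathname.new(start).expand_path.ascend do |dir|
    return dir if dir.join("space.yaml").file?
  end
  nil
end

root = find_space_root
puts(root ? "space root: #{root}" : "not inside a space")
```

Called from anywhere under a space, for example from `repos/<name>/lib`, the helper would return the directory holding `space.yaml`; outside any space it returns `nil`.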

data/vendor/repo-tender/lib/space_architect/pristine/cli.rb CHANGED

@@ -1,7 +1,6 @@
 # frozen_string_literal: true
 require "dry/cli"
-require "space_architect/pristine"
 module SpaceArchitect::Pristine
   # CLI surface — thin translation layer between argv and the
@@ -126,12 +125,4 @@ module SpaceArchitect::Pristine
   end
 end
-# Subcommand files — each defines its command classes and
-# registers them under their group prefix.
-require "space_architect/pristine/cli/repo"
-require "space_architect/pristine/cli/org"
-require "space_architect/pristine/cli/sync"
-require "space_architect/pristine/cli/status"
-require "space_architect/pristine/cli/config"
-require "space_architect/pristine/cli/daemon"
-require "space_architect/pristine/cli/clone"

data/vendor/repo-tender/lib/space_architect/pristine.rb CHANGED

@@ -35,3 +35,10 @@ require "space_architect/pristine/ui/mode"
 require "space_architect/pristine/ui/plain_reporter"
 require "space_architect/pristine/ui/json_reporter"
 require "space_architect/pristine/cli"
+require "space_architect/pristine/cli/repo"
+require "space_architect/pristine/cli/org"
+require "space_architect/pristine/cli/sync"
+require "space_architect/pristine/cli/status"
+require "space_architect/pristine/cli/config"
+require "space_architect/pristine/cli/daemon"
+require "space_architect/pristine/cli/clone"

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: space-architect
 version: !ruby/object:Gem::Version
-  version: 1.2.0
+  version: 1.3.0
 platform: ruby
 authors:
 - Eric Jacobs
@@ -236,6 +236,7 @@ files:
 - lib/space_architect/repo_reference.rb
 - lib/space_architect/repo_resolver.rb
 - lib/space_architect/shell_integration.rb
+- lib/space_architect/skill_installer.rb
 - lib/space_architect/slugger.rb
 - lib/space_architect/space.rb
 - lib/space_architect/space_store.rb
@@ -247,6 +248,12 @@ files:
 - lib/space_architect/version.rb
 - lib/space_architect/warnings.rb
 - lib/space_architect/xdg.rb
+- skill/architect-research/SKILL.md
+- skill/architect-research/lanes.md
+- skill/architect-vocabulary/SKILL.md
+- skill/architect/SKILL.md
+- skill/architect/dispatch.md
+- skill/architect/research.md
 - vendor/repo-tender/lib/space_architect/pristine.rb
 - vendor/repo-tender/lib/space_architect/pristine/cli.rb
 - vendor/repo-tender/lib/space_architect/pristine/cli/clone.rb