npm - @therocketcode/gsd-core - Versions diffs - 1.10.0 → 1.11.1 - Mend

@therocketcode/gsd-core 1.10.0 → 1.11.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15) hide show

package/.claude-plugin/plugin.json +1 -1
package/commands/gsd/autonomous.md +0 -1
package/commands/gsd/execute-phase.md +0 -1
package/commands/gsd/plan-phase.md +0 -1
package/gemini-extension.json +1 -1
package/gsd-core/references/brownfield-adaptation.md +63 -0
package/gsd-core/references/engineering-standards.md +1 -1
package/gsd-core/references/scout-codebase.md +82 -148
package/gsd-core/workflows/cicd-strategy.md +5 -0
package/gsd-core/workflows/discuss-phase.md +5 -7
package/gsd-core/workflows/infrastructure-strategy.md +5 -0
package/gsd-core/workflows/model-domain.md +15 -0
package/gsd-core/workflows/recommend-architecture.md +15 -0
package/gsd-core/workflows/testing-strategy.md +5 -0
package/package.json +1 -1

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
   "name": "gsd-core",
   "displayName": "GSD Core",
-  "version": "1.10.0",
+  "version": "1.11.1",
   "description": "GSD Core is a meta-prompting, context engineering, and spec-driven development system for AI coding agents.",
   "author": {
     "name": "TheRocketCodeMX",

package/commands/gsd/autonomous.md CHANGED Viewed

@@ -2,7 +2,6 @@
 name: gsd:autonomous
 description: Run all remaining phases autonomously — discuss→plan→execute per phase
 argument-hint: "[--from N] [--to N] [--only N] [--interactive]"
-context: fork
 effort: xhigh
 allowed-tools:
   - Read

package/commands/gsd/execute-phase.md CHANGED Viewed

@@ -2,7 +2,6 @@
 name: gsd:execute-phase
 description: Execute all plans in a phase with wave-based parallelization
 argument-hint: "<phase-number> [--wave N] [--gaps-only] [--interactive] [--tdd]"
-context: fork
 effort: xhigh
 allowed-tools:
   - Read

package/commands/gsd/plan-phase.md CHANGED Viewed

@@ -2,7 +2,6 @@
 name: gsd:plan-phase
 description: Create detailed phase plan (PLAN.md) with verification loop
 argument-hint: "[phase] [--auto] [--research] [--skip-research] [--research-phase <N>] [--view] [--gaps] [--skip-verify] [--prd <file>] [--ingest <path-or-glob>] [--ingest-format <auto|nygard|madr|narrative>] [--reviews] [--text] [--tdd] [--mvp]"
-context: fork
 effort: xhigh
 allowed-tools:
   - Read

package/gemini-extension.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gsd-core",
-  "version": "1.10.0",
+  "version": "1.11.1",
   "description": "GSD Core — a meta-prompting, context engineering, and spec-driven development system for AI coding agents. Loads gsd's operating context into every Gemini CLI session.",
   "contextFileName": "GEMINI.md"
 }

package/gsd-core/references/brownfield-adaptation.md ADDED Viewed

@@ -0,0 +1,63 @@
+# Brownfield Adaptation — Working an Existing Codebase
+How GSD adapts when there is **already a codebase** (or existing infra/pipeline/tests), not a clean greenfield. The strategy chain and scout were greenfield-first by default; this reference is the brownfield half. Pairs with `engineering-standards.md` (the invariant bar), `recommend-architecture.md` (the evolution/strangler content), and the `map-codebase` outputs (`.planning/codebase/*.md`) it consumes.
+## The principle
+**Assess what exists, then recommend an *evolution path* — never "design the ideal from scratch" and impose it.** Greenfield defines; brownfield reconciles. The strategy skills, in brownfield mode, read the existing reality first (the `map-codebase` maps + the code) and produce a *current-state → target → migration* recommendation, not a blank-slate decision.
+## Follow vs. improve vs. refactor (the decision matrix)
+Three options, one matrix keyed by: **blast radius · reversibility · test-coverage-present · churn × centrality · am-I-already-editing-this.**
+- **IMPROVE incrementally — the default.** Bring the code you touch up to the standard, scoped to the change. Most work lives here.
+- **REFACTOR — reserved.** Only for **hotspots** (high churn × complexity, where the entrenchment tax compounds), and **only after characterization tests pin current behavior**. Gate it behind those tests; never refactor blind.
+- **FOLLOW the existing pattern, even if poor — a deliberate, time-boxed tactic, never a silent default.** Choose it when blast radius is large, there is no coverage, the code is cold (rarely changes), or no target standard is agreed. Stay **locally consistent** — matching the surrounding code beats a lone island of "correct" — but do **not** propagate the bad pattern into *new* modules.
+**The consistency rule:** make the target standard explicit *first*; **all new code follows the target**; only **old** patterns you actually touch get refactored; prefer **codemods** to convert at scale. "Be consistent with bad code" is *local* advice for the change region — not a mandate to spread it.
+## The safe-change sequence (Feathers)
+Before changing legacy code: **change point → find a seam → break the dependency at the seam → write characterization tests (pin the *current* behavior, bugs included) → then change.** Where full coverage is too costly, keep new code clean with **Sprout** (write the new logic in a fresh tested unit, call it from the legacy code) or **Wrap** (wrap the legacy call to add behavior around it).
+**Where to start testing:** the highest **churn × risk** hotspots from git history — you need a *local* safety net around the change region, not global coverage. Do not block work on a coverage percentage.
+## Architecture & domain on legacy
+- **Do not "DDD the monolith"** (Evans: it almost always disappoints). Carve a **bubble context** around the core subdomain, defend it with an **Anti-Corruption Layer**, and grow it via **Strangler Fig** — new behavior to the clean core, old paths kept serving until strangled.
+- **Incremental always beats big-bang rewrite** (rewrites lose near-universally). The real axis is big-bang vs incremental, not rewrite vs not. Even the genuine rewrite case (a whole technology generation is obsolete) is done as **incremental subsystem replacement**, never a stop-the-world rewrite.
+- **Reverse-engineer, don't re-invent:** in brownfield, `model-domain` extracts the ubiquitous language and subdomains *from the existing code + docs* (names, modules, call paths) and reconciles them with the user's vision, flagging mismatches — it does not interview from a blank slate.
+## Keeping improvement bounded
+- **Two hats (Beck):** never mix a structural change and a behavioral change in one diff — separate commits.
+- **Preparatory refactoring:** refactor only enough to make the change easy ("make the change easy, then make the easy change"), not the whole neighborhood.
+- **Note bigger problems and return** — record the larger refactor as a candidate (a roadmap/backlog item), don't chase it now. Unbounded improvement is scope creep.
+## The decision card (how to surface it — never impose)
+When the assessment finds a gap between current code and the standard/strategy, present a **decision card** and let the human choose:
+```
+[area] current: <what the code does today>
+       target:  <the standard / strategy recommendation>
+       gap cost: <blast radius · churn/centrality · reversibility>
+       options:  ▸ Follow (match existing, time-boxed)   ▸ Improve (default, scoped to what we touch)   ▸ Refactor (gated on characterization tests)
+```
+**Default-select Improve.** Gate Refactor behind characterization tests. Record the choice. The framework recommends and surfaces cost; the user owns the appetite for change.
+## Anti-patterns
+- Imposing a from-scratch ideal architecture/test/infra strategy on an existing system (ignoring the grain).
+- Refactoring a hotspot with no characterization tests (changing behavior blind).
+- Big-bang rewrite; "we'll clean it all up" sweeping refactors mixed with feature work.
+- Propagating an existing bad pattern into new modules in the name of "consistency."
+- DDD-ing the whole legacy monolith instead of carving a bubble context.
+- Blocking work on a global coverage target instead of a local safety net.
+*Basis: Feathers (Working Effectively with Legacy Code — seams, characterization tests, Sprout/Wrap); Fowler (Strangler Fig, preparatory refactoring, "make the change easy"); Beck (two hats, Tidy First); Evans (bubble context, ACL; "don't DDD the monolith"); Tornhill (churn×complexity hotspots); the big-bang-rewrite-loses consensus. Full citations in the research corpus.*
+## Consumes / produces
+Consumed by the strategy chain (`model-domain`, `recommend-architecture`, `testing-strategy`, `infrastructure-strategy`, `cicd-strategy`) and the executor **when existing code is present**; reads `map-codebase`'s `.planning/codebase/*.md`. Pairs with `engineering-standards.md` (rule 3, "match what exists") and `recommend-architecture.md`'s evolution/strangler section.

package/gsd-core/references/engineering-standards.md CHANGED Viewed

@@ -17,7 +17,7 @@ Shared reference injected into the producing agents (`gsd-planner`, `gsd-phase-r
 1. **Understand before you change.** Read the real code paths, the tests-as-spec, and — for this subdomain — the DOMAIN-MODEL classification and the ADR rung, before deciding or writing. Most defects are acting on an incomplete mental model, or building to the wrong structure.
 2. **Build to the architecture, fully — on top of the universal floor.** A cheap baseline applies to *every* project, even simple ones: dependency inversion at **true external boundaries only** (DB, 3rd-party, clock/IO) + a Functional-Core/Imperative-Shell shape (pure logic separable from side-effecting glue) + strong independent tests. That floor is part of the invariant bar — skipping it (reaching into the DB/3rd-party from everywhere, untestable) is under-engineering even on a CRUD app. *On top of that*, implement the rung the ADR chose for this subdomain — ports/aggregates/boundaries — properly. Don't skip mandated structure "to keep it simple" (under-engineering); don't add un-mandated internal ceremony "to be robust" (over-engineering). "Don't add un-mandated structure" (rule 4) means the *ceremony above the floor*, never the floor itself.
-3. **Match what exists; deep modules, clear names.** Make new code look like it belongs — reuse the established patterns, names, and conventions (`gsd-pattern-mapper` supplies the analogs); don't add a second way to do something that already has one. Prefer a simple interface over real substance to many shallow pass-throughs (Ousterhout — and "extract into tiny functions" is genuinely contested; don't cargo-cult classitis). Descriptive names are one of the few empirically-validated quality levers.
+3. **Match what exists; deep modules, clear names.** Make new code look like it belongs — reuse the established patterns, names, and conventions (`gsd-pattern-mapper` supplies the analogs); don't add a second way to do something that already has one. On EXISTING code, "match what exists" is governed by the follow/improve/refactor matrix in `@~/.claude/gsd-core/references/brownfield-adaptation.md`: match the local grain by default, improve what you touch, and refactor only hotspots under characterization tests — never propagate a bad existing pattern into NEW code in the name of consistency. Prefer a simple interface over real substance to many shallow pass-throughs (Ousterhout — and "extract into tiny functions" is genuinely contested; don't cargo-cult classitis). Descriptive names are one of the few empirically-validated quality levers.
 4. **Don't invent un-mandated structure (but build everything the ADR mandates).** Prefer duplication to a *speculative* abstraction — duplication is far cheaper than the wrong abstraction (Metz); wait until the right shape is obvious, inline it back when it stops fitting (AHA); DRY is about *knowledge*, not coincidentally-similar code. Skip speculative layers/options/config for imagined futures (YAGNI; Fowler's cost-of-carry). None of this overrides the architecture's *deliberate, required* abstractions — a mandated port/aggregate is not speculative; declining it is under-engineering, not YAGNI.
 5. **Bounded boy-scout.** Improve what you touch when the change needs it and it's honestly better; keep structural changes in a *separate* commit from behavioral ones (Beck's two hats); flag larger refactors instead of silently sprawling — unbounded refactoring is scope creep.
 6. **Correct AND complete — edge cases first.** Before coding, enumerate the edge cases and failure modes the behavior must handle; a change that serves only the happy path is unfinished. This completeness is what makes code simple — not doing less.

package/gsd-core/references/scout-codebase.md CHANGED Viewed

@@ -1,183 +1,117 @@
-# Codebase scout — map selection + scaled deep-scout protocol
+# Codebase scout — mandatory phase exploration via parallel agents
 > Lazy-loaded reference for the `scout_codebase` step in
-> `workflows/discuss-phase.md` (extracted via #2551 progressive-disclosure
-> refactor). Read this when the workflow needs to scout the codebase before
-> identifying gray areas.
->
-> The scout has **two paths**. Assess depth FIRST (see "Depth assessment"),
-> then run exactly one:
-> - **Shallow path** (default for trivial/reversible phases) — the light
->   inline map-selection scan below. Cheap.
-> - **Deep path** (complex / high-blast-radius / hard-to-reverse / touches
->   the core) — parallel read-only explorers on distinct lenses, then an
->   orchestrator confirm-or-refute gate. ~15× the tokens; earned, not default.
->
-> Both feed the same internal `<codebase_context>`. The deep path adds a
-> VERIFIED-vs-INFERRED split and a reconciled, evidence-backed understanding.
-## Depth assessment (run first — gates the cheap vs deep path)
-Do NOT always-max. Multi-agent scouting burns ~15× the tokens of a single
-read; spend it only where blast-radius justifies it. Score the phase on four
-axes (from the phase title, ROADMAP entry, and prior context):
-| Axis | Light end (→ shallow) | Heavy end (→ deep) |
-|---|---|---|
-| **Complexity** | one file, clear-cut, CRUD | cross-cutting, many moving parts |
-| **Blast radius** | isolated, leaf code | touches the core / shared infra / many call sites |
-| **Reversibility** | trivially revertable | hard to reverse (schema, public API, data migration, payments) |
-| **Decomposability** | n/a | understanding splits cleanly into independent angles |
-**Routing rule:**
-- **All axes light** → **shallow path**. Run the map-selection scan only.
-- **Any of: high complexity, high blast-radius, low reversibility, or "touches
-  the core"** → **deep path** (parallel lenses + confirm-or-refute gate).
-- **Tightly-coupled / sequential understanding** (the angles can't be explored
-  independently) → prefer **one deep continuous-context pass** over many
-  parallel explorers (Cognition: parallel workers drift on conflicting
-  assumptions). Use the deep path's verification gate, but a single explorer.
-**Overrides:** an explicit `--deep` forces the deep path; `--shallow` forces
-the light path. With no flag, the routing rule above decides. Log the resolved
-path and the axis that triggered it (e.g. `[scout] deep — schema change,
-low reversibility`).
-## Shallow path — phase-type → recommended maps
-The cheap, single-pass scan. Read 2–3 maps based on inferred phase type. Do
-NOT read all seven — that inflates context without improving discussion
-quality.
+> `workflows/discuss-phase.md`. Read this before scouting a phase.
-| Phase type (infer from title + ROADMAP entry) | Read these maps |
-|---|---|
-| UI / frontend / styling / design | CONVENTIONS.md, STRUCTURE.md, STACK.md |
-| Backend / API / service / data model | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
-| Integration / third-party / provider | STACK.md, INTEGRATIONS.md, ARCHITECTURE.md |
-| Infrastructure / DevOps / CI / deploy | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
-| Testing / QA / coverage | TESTING.md, CONVENTIONS.md, STRUCTURE.md |
-| Documentation / content | CONVENTIONS.md, STRUCTURE.md |
-| Mixed / unclear | STACK.md, ARCHITECTURE.md, CONVENTIONS.md |
+## Exploring the phase with parallel agents is MANDATORY — always
-Read CONCERNS.md only if the phase explicitly addresses known concerns or
-security issues.
+**If you are in discuss-phase, you WILL spawn dedicated parallel read-only explorer agents to explore this phase. This is not optional, not gated on how much you already know, and not something you can satisfy by reading a few files in your own context.** What scales is the *number* of explorers and *which* lenses — never *whether* you explore.
-## Single-read rule
+(Proportionality lives at the command boundary, not here: genuinely trivial work routes through `/gsd:fast` or `/gsd:quick`. The moment you are running discuss-phase, the phase is substantive enough to plan — so it is substantive enough to explore. There is no "too small to scout" case inside discuss-phase.)
-Read each map file in a **single** Read call. Do not read the same file at
-two different offsets — split reads break prompt-cache reuse and cost more
-than a single full read.
+### Why — so you respect it instead of routing around it
-## No-maps fallback
+- **You are exploring THIS phase specifically — not reusing the global context you think you have.** Whatever you already know is about the project in general; the phase needs targeted grounding on the exact code paths, data flows, invariants, and failure modes it will touch. General context is not phase context.
+- **Exploration COMPLEMENTS and STRESS-TESTS what you know — it does not repeat it.** Its job is to surface what you *don't* know and to verify what you *think* you know against ground truth. The points you're most confident about are the ones most worth checking — confidence is where stale assumptions hide.
+- **The agent that "already has the context" is the highest-risk one.** ~20–30% of even careful single-context conclusions don't survive a check against raw tool output (stale memory, an assumed call path, a pattern that changed). Skipping exploration because you feel informed is exactly how a wrong premise gets locked into CONTEXT.md and propagated through plan → execute.
+- **This is the highest-leverage moment in the whole workflow.** Decisions locked here become canonical references the planner and executor must follow. A cheap, thorough exploration now prevents expensive rework later. Thoroughness beats tokens here, always.
+- **Parallel agents beat one big pass** because independent lenses see what a single sweep misses, and isolating each lens in its own context keeps each one sharp.
-If `.planning/codebase/*.md` does not exist:
-1. Extract key terms from the phase goal (e.g., "feed" → "post", "card",
-   "list"; "auth" → "login", "session", "token")
-2. `grep -rlE "{term1}|{term2}" src/ app/ --include="*.ts" ...` (use `-E`
-   for extended regex so the `|` alternation works on both GNU grep and BSD
-   grep / macOS), and `ls` the conventional component/hook/util dirs
-3. Read the 3–5 most relevant files
+### Rationalization-killers — every one of these is INVALID
-## Output (internal `<codebase_context>`)
+If you catch yourself thinking any of these, STOP — you are about to under-explore. Spawn the explorers anyway:
+| The rationalization | Why it's invalid |
+|---|---|
+| "I already have the context from committed docs / the maps." | Docs and maps are *grounding input you feed the explorers* — never a substitute for exploring the phase. They're also potentially stale; the explorers verify them. |
+| "It's mostly already mapped." | "Mostly" and "this phase, verified, right now" are different. Explore the delta and re-verify the rest. |
+| "I know this codebase well." | Then exploration is cheap and confirms it. If it doesn't confirm it, you needed it. |
+| "The angles are tightly coupled, so I'll just do it inline myself." | Tight coupling means *fewer explorers or one dedicated explorer* — never inline-in-your-own-context. A dedicated exploration pass is still mandatory. |
+| "It's a small/clear phase." | Then it routed to /gsd:fast or /gsd:quick. You're in discuss-phase, so it isn't. |
+| "I'll save tokens." | Re-work from a wrong premise costs far more than the exploration. Thoroughness over tokens — the user has said this repeatedly. |
+**Violating the letter of this (a couple of inline greps labelled "deep scout") is violating the spirit.** The deliverable is dedicated parallel explorers + the confirm-or-refute gate, every time.
-From the scan, identify:
-- **Reusable assets** — components, hooks, utilities usable in this phase
-- **Established patterns** — state management, styling, data fetching
-- **Integration points** — routes, nav, providers where new code connects
-- **Creative options** — approaches the architecture enables or constrains
+## Mode (detect FIRST — *what* to scout; never *whether*)
-Used in `analyze_phase` and `present_gray_areas`. NOT written to a file —
-session-only.
+Detect per change-site, not per repo (a brownfield repo can have a brand-new module; a greenfield repo can carry rich docs). Mode picks the lenses; it never licenses skipping exploration.
-On the **deep path**, this same `<codebase_context>` additionally carries a
-`VERIFIED facts` / `INFERRED facts` split and the explicit open questions left
-at the stop (see below).
+- **Brownfield** — real source exists for this area → **code-comprehension** lenses (below). Depth-bounded: localize to spans + blast radius.
+- **Greenfield** — little/no code yet (new project / foundational phase) → **external pre-research** lenses: verify *current* library/API versions & signatures against the web (training memory is stale — ~1 in 5 generated packages is hallucinated; post-cutoff APIs run <50% without docs), find reference implementations, surface pitfalls, reconcile locked decisions against ground truth. Breadth-bounded: ~5 good sources optimal. **Never skip because "there's no code to read" — that conflation is the classic foundational-rework miss.**
+- **Greenfield-with-docs** — thin code but real specs/research exist → honor the docs the way brownfield honors tests; reconcile against them.
----
+Mixed phases run both disciplines.
-## Deep path — parallel read-only explorers on distinct lenses
+## Breadth assessment (how MANY explorers — not whether)
-Run this only when the depth assessment routed here. Reuses the exact
-parallel-dispatch machinery the advisor mode (`modes/advisor.md`
-`advisor_research`) and `discuss-phase-assumptions.md` already use: spawn
-`Agent()` calls simultaneously, then synthesize. Do NOT invent new machinery.
+Always spawn explorers; scale the count to the phase:
+- **Focused phase** (one subsystem, clear boundary) → 2 lenses.
+- **Sprawling / high-blast-radius / hard-to-reverse / touches-the-core** → 3–4+ lenses.
+- **Tightly-coupled / sequential** (angles can't be explored independently — Cognition: parallel workers drift on conflicting assumptions) → **one dedicated deep explorer** with continuous context, still read-only, still through the confirm-or-refute gate. This is "fewer explorers," NOT "do it inline."
-**Read-only is a hard constraint.** Explorers answer questions; they never
-write code (Cognition: parallel writers amplify conflicting decisions; parallel
-read-only scouts are safe). They glob/grep/read/inspect git history only.
+`--deep` forces the maximal fan-out; `--shallow` reduces to the 2-lens minimum (it does NOT mean "no agents"). Log the resolved breadth + the trigger (e.g. `[scout] 4 lenses — touches the core, low reversibility`).
-### Lens menu — pick 2–4 relevant to the phase type
+## Lens menus — pick the relevant ones (each explorer owns a DISTINCT angle)
-Spawn one explorer per lens, each owning a DISTINCT angle (not N identical
-agents — that just duplicates work). Give each Anthropic's 4-part contract:
-objective, output format, tools/sources, and boundaries ("don't cover X —
-that's another explorer's job").
+Give each explorer Anthropic's 4-part contract: objective, output format, tools/sources, boundaries ("don't cover X — another explorer owns it"). Feed each the grounding input (existing maps/docs) as starting context, with instructions to *verify and go deeper*, not restate.
+### Brownfield lenses (existing code)
 | Lens | Owns | Pick when |
 |---|---|---|
 | **by-layer / structure & dependency** | architecture, modules, who-calls-whom, the real dependency graph | almost always (the spine) |
-| **by-data-flow** | how data enters / transforms / exits; invariants; persistence | data models, schema, state, pipelines |
+| **by-data-flow** | how data enters/transforms/exits; invariants; persistence | data models, schema, state, pipelines |
 | **by-control-flow / call-path** | real execution paths, entry points (traced, not assumed) | behavior changes, request/handler flows |
-| **by-goals/intent + tests-as-spec** | what it's supposed to do; tests read as the executable spec; existing conventions to imitate | any phase modifying established behavior |
+| **by-goals/intent + tests-as-spec** | what it should do; tests as the executable spec; conventions to imitate | any phase modifying established behavior |
 | **by-failure-mode** | error handling, edge cases, where it breaks, retry/timeout paths | high-blast-radius, low-reversibility, infra |
-Default selection: **by-layer + by-goals/tests** for most phases; add
-**by-data-flow** for data/schema work, **by-control-flow** for behavior work,
-**by-failure-mode** for high-blast-radius/infra.
+### Greenfield lenses (external pre-research — area has no code yet)
+| Lens | Owns | Pick when |
+|---|---|---|
+| **by-ecosystem / version ground-truth** | current lib/framework versions, signatures, breaking changes vs stale memory; deps exist & maintained | always — the greenfield spine |
+| **by-reference-implementation** | how comparable real systems structure this — layout, seams, gotchas | foundational scaffolding; unfamiliar domain |
+| **by-canonical-doc reconciliation** | the project's locked decisions (spec/ADR/STACK) checked against current ground truth | a strategy chain produced artifacts |
+| **by-pitfalls / failure-modes** | footguns, deprecations, security advisories for the chosen stack | high-blast-radius foundational choices |
+### Grounding input (feed it; don't let it replace exploration)
+Maps in `.planning/codebase/*.md` (if present) and design docs are *inputs* to the explorers. If no maps exist for a brownfield area, the explorers build understanding directly (glob/grep/read/git-history). Reading a map is never the exploration — it's the starting point the explorers verify and extend.
+When maps exist, select which to feed by phase type (don't dump all seven):
+| Phase type (infer from title + ROADMAP entry) | Read these maps |
+|---|---|
+| UI / frontend / styling / design | CONVENTIONS.md, STRUCTURE.md, STACK.md |
+| Backend / API / service / data model | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
+| Integration / third-party / provider | STACK.md, INTEGRATIONS.md, ARCHITECTURE.md |
+| Infrastructure / DevOps / CI / deploy | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
+| Testing / QA / coverage | TESTING.md, CONVENTIONS.md, STRUCTURE.md |
+| Documentation / content | CONVENTIONS.md, STRUCTURE.md |
+| Mixed / unclear | STACK.md, ARCHITECTURE.md, CONVENTIONS.md |
+Read each map in a **single** Read call — never split reads of the same file at two offsets (it breaks prompt-cache reuse and costs more than one full read).
 ### Explorer contract (required of every lens)
-Each explorer MUST return:
-- **Load-bearing claims with `file:line` citations** — and, for counts or
-  exhaustiveness claims, the command + its raw output. No bare assertions.
-- **An honest coverage statement** — "searched 2 of 5 dirs", not
-  "searched all locations". Partial is fine; lying-by-omission is not.
-- **A VERIFIED vs INFERRED split** — facts read from tool output vs
-  conclusions drawn from them.
+Each explorer MUST return: **load-bearing claims with `file:line` (or `url`+access-date) citations** — counts/exhaustiveness claims include the command + raw output; **an honest coverage statement** ("searched 2 of 5 dirs", not "searched all"); **a VERIFIED vs INFERRED split**.
-> **ORCHESTRATOR RULE — CODEX RUNTIME**: After calling all Agent() calls above
-> to spawn explorers, do NOT independently scout or analyze the codebase while
-> the subagents are active. Wait for all explorers to return before
-> synthesizing. This prevents duplicate work and wasted context.
+> **ORCHESTRATOR RULE — CODEX RUNTIME**: After spawning the Agent() explorers, do NOT independently scout/analyze while they run. Wait for all to return, then synthesize. (Reuses the exact parallel-dispatch machinery in `modes/advisor.md` and `discuss-phase-assumptions.md` — do not invent new machinery.)
 ## Orchestrator confirm-or-refute gate (the load-bearing part)
-~20–30% of subagent reports contain at least one claim that doesn't match the
-underlying tool output, and self-verification is unreliable. So the
-orchestrator does NOT trust the explorers' prose. After all explorers return,
-before adopting anything into `<codebase_context>`:
-1. **Spot-check the highest-risk / load-bearing claims against RAW TOOL
-   OUTPUT** — re-read the actual file at the cited `file:line`, re-run the
-   grep/count yourself. Check claims against the tool output, never against the
-   agent's summary. Prioritize: inflated/exhaustiveness claims ("found 15
-   files", "searched all", "applied successfully"), and any claim a gray area
-   or plan would hang on. You need not re-verify everything — re-derive the
-   pivotal facts.
-2. **Require citations for load-bearing claims.** A load-bearing claim with no
-   `file:line` (or no command+output) is treated as INFERRED, not VERIFIED,
-   until you ground it yourself.
-3. **Reconcile contradictions — mandatory, never silently pick.** When two
-   explorers disagree (e.g. "3 call sites" vs "5 call sites"), surface the
-   contradiction and resolve it against ground truth (re-grep), don't merge or
-   pick one quietly.
-4. **Separate VERIFIED from INFERRED** in the resulting `<codebase_context>`.
-   Carry both forward labelled — never collapse inference into fact.
-5. **Never summarize a summary.** On any disputed or load-bearing point,
-   re-ground at the source rather than paraphrasing an explorer's paraphrase.
+Self-verification is unreliable and ~20–30% of subagent reports contain a claim that doesn't match tool output — so do NOT trust explorer prose. Before adopting anything into `<codebase_context>`:
+1. **Spot-check the highest-risk / load-bearing claims against RAW TOOL OUTPUT** — re-read the cited `file:line`, re-run the grep/count yourself. Check the tool output, never the summary. Prioritize inflated/exhaustiveness claims and anything a gray area or plan will hang on. Re-derive the pivotal facts; you needn't re-verify everything.
+2. **Require citations for load-bearing claims** — an uncited one is INFERRED, not VERIFIED, until you ground it.
+3. **Reconcile contradictions — mandatory, never silently pick.** Two explorers disagree → resolve against ground truth (re-grep), don't merge or pick quietly.
+4. **Separate VERIFIED from INFERRED** in `<codebase_context>` — carry both labelled; never collapse inference into fact.
+5. **Never summarize a summary** — on any disputed/load-bearing point, re-ground at the source.
 ## Sufficiency stop (avoid analysis paralysis)
-State the stop criterion explicitly. Stop scouting and proceed to
-`analyze_phase` when **all three** hold:
-- **Confidence threshold met** — the chosen lenses are covered and the real
-  call-path / data-flow for the change site was traced, not assumed. Satisfice
-  (~good enough to decide), don't optimize.
-- **Budget cap not exceeded** — within the tool-call/explorer budget for this
-  task class (cheap fact-finding gets few; complex/high-blast gets more).
-- **Saturation** — the last round surfaced **no new load-bearing facts AND all
-  contradictions are resolved.**
-Record any remaining open questions explicitly (bounded, named) and carry them
-into `<codebase_context>` — don't keep scouting to chase curiosity. Only once
-the gate passes is the understanding "sufficient AND verified".
+Stop and proceed to `analyze_phase` when **all three** hold: **confidence threshold met** (lenses covered; the real call-path/data-flow for the change site traced, not assumed; satisfice, don't optimize); **budget cap not exceeded** (scaled to the phase class); **saturation** (last round surfaced no new load-bearing facts AND all contradictions resolved). Record remaining open questions explicitly (bounded, named) and carry them into `<codebase_context>`. Only then is the understanding "sufficient AND verified".
+## Output (internal `<codebase_context>`)
+Reusable assets · established patterns · integration points · creative options the architecture enables/constrains — plus the **VERIFIED / INFERRED** split and the named open questions. Used in `analyze_phase` and `present_gray_areas`; session-only, not written to a file.

package/gsd-core/workflows/cicd-strategy.md CHANGED Viewed

@@ -4,6 +4,7 @@ Recommend a CI/CD strategy matched to the test strategy, the target infrastructu
 <required_reading>
 @~/.claude/gsd-core/references/cicd-strategy.md
+@~/.claude/gsd-core/references/brownfield-adaptation.md
 @~/.claude/gsd-core/templates/cicd-strategy.md
 </required_reading>
@@ -36,8 +37,12 @@ cat .planning/TEST-STRATEGY.md 2>/dev/null || true
 cat .planning/INFRA-STRATEGY.md 2>/dev/null || true
 cat .planning/adr/*.md 2>/dev/null || true
 cat .planning/PROJECT.md 2>/dev/null || true
+cat .planning/codebase/STACK.md 2>/dev/null || true
+ls .planning/codebase/STACK.md >/dev/null 2>&1 && echo "BROWNFIELD" || echo "GREENFIELD"
 ```
+**Brownfield mode (existing pipeline).** If `BROWNFIELD` (a `map-codebase` run produced `.planning/codebase/STACK.md`, or CI config already exists), **read `@~/.claude/gsd-core/references/brownfield-adaptation.md` now** and recommend a **transition plan**, not a rip-and-replace: **audit the existing pipeline** (from STACK.md + the CI config) against the tier→stage map (Step 5), the **≤10-min PR budget**, and the **pinned-`sub` OIDC / secrets** standard (Step 4). Recommend incremental fixes via **decision cards** (current → target → gap cost → Follow / Improve / Refactor), e.g. "15-min gate → split to PR-smoke + nightly," "long-lived service-account key → OIDC federation," "moved-tag actions → SHA-pin." **Default-select Improve**; sequence the changes (don't enable everything at once). Greenfield (no map, no pipeline) keeps the from-scratch pipeline design below as the default.
 **Read `@~/.claude/gsd-core/references/cicd-strategy.md` now** — it defines the GHA-default platform decision, OIDC-with-pinned-`sub`, the secrets split, the tier→stage mapping with the ≤10-min PR budget, the flaky canon, the merge-queue trigger, the deployment ladder, the free-six supply-chain table stakes, and the anti-pattern table.
 **Grounding maturity governs elicitation depth.** When upstream artifacts (spec, ADR, strategies, research) already answer a step, draft-from-docs and present for confirmation — cite the source, don't re-interview. Reserve questions for genuine decision points and contradictions. Honor a posture stated in `$ARGUMENTS` without re-asking.

package/gsd-core/workflows/discuss-phase.md CHANGED Viewed

@@ -279,15 +279,13 @@ Parse JSON for: `todo_count`, `matches[]` (each with `file`, `title`, `area`, `s
 </step>
 <step name="scout_codebase">
-Scan existing code to inform gray area identification — depth SCALED to the phase, not always-max.
+Explore THIS phase before identifying gray areas, by spawning **dedicated parallel read-only explorer agents**. **This is MANDATORY and is NOT gated on how much context you already have** — you are exploring this phase specifically (to complement and verify what you think you know), not reusing global context. Reading maps/docs in your own context is NOT a substitute. (Trivial work routes to /gsd:fast or /gsd:quick; if you're in discuss-phase, the phase warrants exploration.)
-Read `@~/.claude/gsd-core/references/scout-codebase.md` — it contains the depth assessment, the shallow map-selection scan, the deep-path lens menu, the orchestrator confirm-or-refute gate, and the sufficiency stop. Then:
+**Read `@~/.claude/gsd-core/references/scout-codebase.md` now** — why-it's-mandatory, rationalization-killers, mode detection, lens menus, breadth scaling, confirm-or-refute gate, sufficiency stop. Then:
-1. **Assess depth first** (reference's "Depth assessment"). Score the phase on complexity × blast-radius × reversibility × decomposability. `--deep`/`--shallow` in $ARGUMENTS override; otherwise auto-route. Log the resolved path and the trigger.
-2. **Shallow path** (default — trivial/reversible phases, ~10% context): `ls .planning/codebase/*.md`, select 2–3 maps via the reference's table (or grep fallback if none exist), build internal `<codebase_context>`.
-3. **Deep path** (complex / high-blast-radius / hard-to-reverse / touches-the-core): spawn parallel READ-ONLY explorers on 2–4 distinct lenses, reusing the advisor-mode/`discuss-phase-assumptions` parallel `Agent()` dispatch + ORCHESTRATOR RULE (do not invent new machinery). Each returns `file:line`-cited claims, honest coverage, and a VERIFIED-vs-INFERRED split. Then run the reference's **confirm-or-refute gate** (spot-check highest-risk claims against RAW TOOL OUTPUT not prose; reconcile contradictions against ground truth; keep VERIFIED separate from INFERRED; never summarize-a-summary) and stop per its sufficiency criterion. Build `<codebase_context>` from the reconciled, evidence-backed understanding.
+1. **Detect mode** (brownfield / greenfield / greenfield-with-docs) — picks *which lenses*, never *whether*.
+2. **Spawn parallel read-only explorers** — count scales with breadth (2 focused → 4+ sprawling; `--shallow` = 2-lens floor, never zero; tightly-coupled = one dedicated explorer, never inline). Reuse the advisor / `discuss-phase-assumptions` `Agent()` dispatch + ORCHESTRATOR RULE; don't invent machinery. Existing maps/docs are grounding INPUT to verify, not the answer. Each returns cited (`file:line`/`url`) claims, honest coverage, VERIFIED-vs-INFERRED.
+3. **Confirm-or-refute gate** — spot-check highest-risk claims against RAW TOOL OUTPUT not prose; reconcile contradictions against ground truth; keep VERIFIED vs INFERRED; never summarize-a-summary. Stop per the sufficiency criterion; build `<codebase_context>` from the reconciled evidence.
 </step>
 <step name="analyze_phase">

package/gsd-core/workflows/infrastructure-strategy.md CHANGED Viewed

@@ -5,6 +5,7 @@ Recommend an infrastructure strategy matched to the project's actual traffic sha
 <required_reading>
 @~/.claude/gsd-core/references/infrastructure-strategy.md
 @~/.claude/gsd-core/references/data-environments.md
+@~/.claude/gsd-core/references/brownfield-adaptation.md
 @~/.claude/gsd-core/templates/infra-strategy.md
 </required_reading>
@@ -37,8 +38,12 @@ cat .planning/PRODUCT-BRIEF.md 2>/dev/null || true
 cat .planning/REQUIREMENTS.md 2>/dev/null || true
 cat .planning/adr/*.md 2>/dev/null || true
 cat .planning/TEST-STRATEGY.md 2>/dev/null || true
+cat .planning/codebase/STACK.md 2>/dev/null || true
+ls .planning/codebase/STACK.md >/dev/null 2>&1 && echo "BROWNFIELD" || echo "GREENFIELD"
 ```
+**Brownfield mode (existing infra).** If `BROWNFIELD` (a `map-codebase` run produced `.planning/codebase/STACK.md`, or infra/config already exists), **read `@~/.claude/gsd-core/references/brownfield-adaptation.md` now** and recommend right-size-or-migrate, not a from-scratch build: assess the existing infra (from STACK.md + config — current compute rung, data layer, IaC, CI) **against the ladder**, e.g. "currently self-managed K8s at low utilization → the CAST-AI data says serverless-containers is cheaper → here's the decision card." Recommend evolution via **strangler** — route *new* workloads to the target rung and migrate existing ones incrementally — **never a big-bang re-platform**. Surface each gap as a **decision card** (current → target → gap cost: **cost · blast radius · reversibility** → Follow / Improve / Refactor-migrate) and **default-select Improve**. Walk Steps 4–7 below as the current-state→target reconciliation rather than a blank-slate walk. Greenfield (no map, no infra) keeps the from-scratch ladder walk as the default.
 **Read `@~/.claude/gsd-core/references/infrastructure-strategy.md` now** — it defines the compute ladder with quantified move-up triggers, the crossover numbers (Fargate-vs-EC2, the CAST AI utilization data, the <4-engineers floor), the per-cloud asymmetries and equivalences table, the observability floor, the when-you-actually-need triggers, the IaC floor, the anti-patterns, and the meta-tell.
 **Grounding maturity governs elicitation depth.** When upstream artifacts (spec, ADR, strategies, research) already answer a step, draft-from-docs and present for confirmation — cite the source, don't re-interview. Reserve questions for genuine decision points and contradictions. Honor a posture stated in `$ARGUMENTS` without re-asking.

package/gsd-core/workflows/model-domain.md CHANGED Viewed

@@ -4,6 +4,7 @@ Establish strategic Domain-Driven Design foundations for a greenfield project: a
 <required_reading>
 @~/.claude/gsd-core/references/domain-modeling.md
+@~/.claude/gsd-core/references/brownfield-adaptation.md
 @~/.claude/gsd-core/templates/domain-model.md
 </required_reading>
@@ -45,6 +46,20 @@ From the project docs, build an internal draft:
 - Which nouns/verbs recur (candidate ubiquitous-language terms)?
 - Are there distinct business areas or user roles (candidate subdomains)?
+**Brownfield mode (existing code).** Greenfield (define from docs/vision) is the default below. But when `.planning/codebase/*.md` exists OR real source is present, **reverse-engineer the domain from the code first, then reconcile** — don't model from a blank slate (see `@~/.claude/gsd-core/references/brownfield-adaptation.md`, "Reverse-engineer, don't re-invent"):
+```bash
+ls .planning/codebase/ARCHITECTURE.md .planning/codebase/STRUCTURE.md >/dev/null 2>&1 && echo "HAS_MAPS" || echo "NO_MAPS"
+```
+If maps exist, `cat .planning/codebase/ARCHITECTURE.md .planning/codebase/STRUCTURE.md` (and STACK.md if useful). If real source exists but no maps, suggest `/gsd:map-codebase` first, or skim the source tree directly. Then, *before* drafting from vision:
+- **Extract ubiquitous-language terms** from module/type/table names and call paths (the words the code already speaks) — these seed Step 3's draft.
+- **Derive candidate subdomains** from the structure (modules, packages, service boundaries in ARCHITECTURE.md/STRUCTURE.md) — these seed Step 4's area list.
+- **Reconcile against the user's vision** and flag mismatches, mirroring discover-product's gap map — here **KEEP** (code and vision agree), **REFINE** (present but vague, partial, or drifting from vision), **CONTESTED** (code and vision disagree, or vision wants something the code lacks). Present the reconciliation in one message; carry CONTESTED items into the Step 6 notes.
+- **Don't DDD the monolith.** When the existing system is a tangle, do NOT model the whole legacy estate as clean subdomains. Carve a **bubble context** around the core subdomain and mark an **ACL at its edge** (record it as a candidate boundary in Step 5/6) — model the core cleanly, wrap the mess. Recording only; the translator is tactical.
+Greenfield behavior (draft from docs/vision) remains the default when no code exists.
 **Grounding maturity governs elicitation depth.** When the upstream docs are mature (a design spec, research corpus, or detailed brief already forged the vocabulary and areas), default every step to **draft-from-docs + confirm**: present complete drafts for correction, state which docs grounded them, and reserve actual questions for *genuine decision points* — contested classifications, the one-core call, complexity contradictions. The 2–3 probing rounds below are for thin grounding (a bare PROJECT.md), not a re-interview of what the docs already answer. Honor a posture stated in `$ARGUMENTS` without re-asking.
 **Check for an existing model:**

package/gsd-core/workflows/recommend-architecture.md CHANGED Viewed

@@ -4,6 +4,7 @@ Recommend an application architecture matched to the project's actual complexity
 <required_reading>
 @~/.claude/gsd-core/references/architecture-decision.md
+@~/.claude/gsd-core/references/brownfield-adaptation.md
 @~/.claude/gsd-core/templates/adr.md
 </required_reading>
@@ -40,6 +41,20 @@ cat .planning/DOMAIN-MODEL.md 2>/dev/null || true
 **Grounding maturity governs elicitation depth.** When upstream artifacts (DOMAIN-MODEL, a design spec, research) already answer a question below, draft-from-docs and present for confirmation — cite the source, don't re-interview. Reserve `AskUserQuestion` for genuine decision points: contested rungs, gate answers the docs don't record, and contradictions (the reconcile rule still ALWAYS runs). Honor a posture stated in `$ARGUMENTS` without re-asking.
+**Brownfield mode (existing architecture).** Greenfield (recommend the target from scratch) is the default. But when existing code or `map-codebase` maps are present, **assess the current topology first and recommend an evolution path** — never impose a from-scratch ideal on the running system (see `@~/.claude/gsd-core/references/brownfield-adaptation.md`):
+```bash
+ls .planning/codebase/ARCHITECTURE.md >/dev/null 2>&1 && echo "HAS_MAPS" || echo "NO_MAPS"
+```
+If maps exist, `cat .planning/codebase/ARCHITECTURE.md` (consume STRUCTURE.md/CONCERNS.md as needed); if real source exists but no maps, suggest `/gsd:map-codebase` first. Then run Steps 3–5 in **assess-then-evolve** form:
+- **Score the current topology against the meta-tell** (Step 5) — where is the existing system already over- or under-engineered relative to the floor and the per-subdomain rungs?
+- **The target is unchanged — the floor (functional-core/imperative-shell + tests) and the rung ladder still apply.** Brownfield changes only *how you get there*: incrementally, not in one cut.
+- **Recommend via the decision card**, not a verdict: per affected area record `current state · target · gap cost (blast radius · churn/centrality · reversibility) · Follow | Improve | Refactor`. **Default-select Improve**; **gate Refactor behind characterization tests**; Follow is a deliberate, time-boxed choice for cold/uncovered code. Record the card in the ADR's Consequences/promotion-triggers, surfacing cost — the user owns the appetite.
+- **The evolution path uses Strangler Fig + ACL + bubble context** — exactly the *Evolving the topology* mechanics this skill already documents in Step 4. In brownfield, that section is the recommended path (new behavior to the clean core, old paths strangled), not just a future-split footnote. Don't duplicate it here — reference it.
+Greenfield behavior remains the default when no existing code is present.
 **If `NO_DOMAIN_MODEL`:** tell the user "No domain model found — I'll ask the complexity questions directly. (Consider `/gsd:model-domain` first for a sharper result.)" Then gather, per major area: is it core/supporting/generic, and how complex (rich rules vs CRUD)?
 From DOMAIN-MODEL.md (or the answers): extract each subdomain's type + complexity, **plus** the bounded contexts (owns / talks-to), the context-map relationships, and any flagged polysemes — Step 4.5 consumes them. The **core** subdomain's complexity is the primary driver of Axis A.

package/gsd-core/workflows/testing-strategy.md CHANGED Viewed

@@ -4,6 +4,7 @@ Recommend a test strategy matched to the architecture: WHAT to test, at WHICH le
 <required_reading>
 @~/.claude/gsd-core/references/test-strategy.md
+@~/.claude/gsd-core/references/brownfield-adaptation.md
 @~/.claude/gsd-core/templates/test-strategy.md
 </required_reading>
@@ -37,10 +38,14 @@ cat .planning/REQUIREMENTS.md 2>/dev/null || true
 cat .planning/DOMAIN-MODEL.md 2>/dev/null || true
 cat .planning/adr/*.md 2>/dev/null || true
 cat TESTING-STANDARDS.md 2>/dev/null || true
+cat .planning/codebase/TESTING.md 2>/dev/null || true
+ls .planning/codebase/TESTING.md >/dev/null 2>&1 && echo "BROWNFIELD" || echo "GREENFIELD"
 ```
 **Read `@~/.claude/gsd-core/references/test-strategy.md` now** — it defines behavior-over-implementation, sociable-by-default, test-once-at-cheapest-level, shape-follows-architecture (size axis), the gnarly-bits list, persistent-vs-transient e2e, and coverage-as-floor + mutation.
+**Brownfield mode (existing test suite / untested legacy).** If `BROWNFIELD` (a `map-codebase` run produced `.planning/codebase/TESTING.md`, or existing code is present), **read `@~/.claude/gsd-core/references/brownfield-adaptation.md` now** and recommend the test strategy as an *evolution*, not a from-scratch shape: assess the current tests from TESTING.md (framework, structure, coverage) and frame Steps 3–6 as additions on top of what exists, keeping the existing framework/conventions unless there's a decision-card reason to change. Key rule: for **untested legacy you intend to change, write characterization tests first** (pin the *current* behavior, bugs included) — starting at the **highest churn × risk hotspots** (git history), as a *local* safety net around the change region; never block on a global coverage %. Surface each gap (missing level, weak assertions, money-as-float, etc.) as a **decision card** (current → target → gap cost → Follow / Improve / Refactor) and **default-select Improve**; gate any Refactor behind those characterization tests. Greenfield (no map, no code) keeps the from-scratch derivation below as the default.
 **Grounding maturity governs elicitation depth.** When upstream artifacts (spec, ADR, strategies, research) already answer a step, draft-from-docs and present for confirmation — cite the source, don't re-interview. Reserve questions for genuine decision points and contradictions. Honor a posture stated in `$ARGUMENTS` without re-asking.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@therocketcode/gsd-core",
-  "version": "1.10.0",
+  "version": "1.11.1",
   "description": "GSD Core is a meta-prompting, context engineering, and spec-driven development system for AI coding agents.",
   "bin": {
     "gsd-core": "bin/install.js",