gsd-pi 2.15.0 → 2.15.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (46)

package/dist/resource-loader.d.ts CHANGED

@@ -1,6 +1,7 @@
 import { DefaultResourceLoader } from '@gsd/pi-coding-agent';
 export declare function discoverExtensionEntryPaths(extensionsDir: string): string[];
 export declare function readManagedResourceVersion(agentDir: string): string | null;
+export declare function readManagedResourceSyncedAt(agentDir: string): number | null;
 export declare function getNewerManagedResourceVersion(agentDir: string, currentVersion: string): string | null;
 /**
  * Syncs all bundled resources to agentDir (~/.gsd/agent/) on every launch.

package/dist/resource-loader.js CHANGED

@@ -85,7 +85,7 @@ function getBundledGsdVersion() {
     }
 }
 function writeManagedResourceManifest(agentDir) {
-    const manifest = { gsdVersion: getBundledGsdVersion() };
+    const manifest = { gsdVersion: getBundledGsdVersion(), syncedAt: Date.now() };
     writeFileSync(getManagedResourceManifestPath(agentDir), JSON.stringify(manifest));
 }
 export function readManagedResourceVersion(agentDir) {
@@ -97,6 +97,15 @@ export function readManagedResourceVersion(agentDir) {
         return null;
     }
 }
+export function readManagedResourceSyncedAt(agentDir) {
+    try {
+        const manifest = JSON.parse(readFileSync(getManagedResourceManifestPath(agentDir), 'utf-8'));
+        return typeof manifest?.syncedAt === 'number' ? manifest.syncedAt : null;
+    }
+    catch {
+        return null;
+    }
+}
 export function getNewerManagedResourceVersion(agentDir, currentVersion) {
     const managedVersion = readManagedResourceVersion(agentDir);
     if (!managedVersion) {
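
Note on this change: the manifest gains a `syncedAt` timestamp, and the new reader returns `null` for a missing file, invalid JSON, or a manifest written before the field existed. The following is a minimal sketch of that round trip, not the package's code. The `ManagedResourceManifest` interface and the helper names are hypothetical; the file name `managed-resources.json` is taken from the `auto.ts` hunk in this diff.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical manifest shape, inferred from the diff above.
interface ManagedResourceManifest {
  gsdVersion: string;
  syncedAt?: number; // absent in manifests written before this change
}

// File name as read by auto.ts in this diff.
const manifestPath = (agentDir: string): string =>
  join(agentDir, "managed-resources.json");

function writeManifest(agentDir: string, manifest: ManagedResourceManifest): void {
  writeFileSync(manifestPath(agentDir), JSON.stringify(manifest));
}

// Same contract as the new reader: any failure or non-numeric field yields null.
function readSyncedAt(agentDir: string): number | null {
  try {
    const manifest = JSON.parse(readFileSync(manifestPath(agentDir), "utf-8"));
    return typeof manifest?.syncedAt === "number" ? manifest.syncedAt : null;
  } catch {
    return null;
  }
}

const dir = mkdtempSync(join(tmpdir(), "gsd-manifest-"));
const missing = readSyncedAt(dir); // no manifest on disk yet
writeManifest(dir, { gsdVersion: "2.15.0" });
const legacy = readSyncedAt(dir); // old-style manifest without syncedAt
writeManifest(dir, { gsdVersion: "2.15.1", syncedAt: 1234 });
const current = readSyncedAt(dir);
```

Callers therefore need to treat `null` as "unknown" rather than "never synced", which is how the staleness check in `auto.ts` uses it.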

package/dist/resources/GSD-WORKFLOW.md CHANGED

@@ -4,8 +4,8 @@
 >
 > **When to read this:** At the start of any session working on GSD-managed work, or when loaded by `/gsd`.
 >
-> **After reading this, always read `.gsd/state.md` to find out what's next.**
-> If the milestone has a `context.md`, read that too — it contains project-specific decisions, reference paths, and implementation guidance that this generic methodology doc does not.
+> **After reading this, always read `.gsd/STATE.md` to find out what's next.**
+> If the milestone has a `M###-CONTEXT.md`, read that too. If the active slice has an `S##-CONTEXT.md`, read that as well — these files contain project-specific decisions, reference paths, and implementation guidance that this generic methodology doc does not.
 ---
@@ -13,13 +13,14 @@
 Read these files in order and act on what they say:
-1. **`.gsd/state.md`** — Where are we? What's the next action?
-2. **`.gsd/milestones/<active>/roadmap.md`** — What's the plan? Which slices are done? (state.md tells you which milestone is active)
-3. **`.gsd/milestones/<active>/context.md`** — Project-specific decisions, reference paths, constraints. Read this before doing implementation work.
-4. If a slice is active, read its **`plan.md`** — Which tasks exist? Which are done?
-5. If a task was interrupted, check for **`continue.md`** in the active slice directory — Resume from there.
+1. **`.gsd/STATE.md`** — Where are we? What's the next action?
+2. **`.gsd/milestones/<active>/M###-ROADMAP.md`** — What's the plan? Which slices are done? (`STATE.md` tells you which milestone is active)
+3. **`.gsd/milestones/<active>/M###-CONTEXT.md`** — Milestone-level project decisions, reference paths, constraints. Read this before doing implementation work.
+4. If a slice is active and has one, read **`S##-CONTEXT.md`** — Slice-specific decisions and constraints.
+5. If a slice is active, read its **`S##-PLAN.md`** — Which tasks exist? Which are done?
+6. If a task was interrupted, check for **`continue.md`** in the active slice directory — Resume from there.
-Then do the thing `state.md` says to do next.
+Then do the thing `STATE.md` says to do next.
 ---
@@ -41,32 +42,32 @@ All artifacts live in `.gsd/` at the project root:
 ```
 .gsd/
-  state.md                                  # Dashboard — always read first
-  decisions.md                              # Append-only decisions register
+  STATE.md                                  # Dashboard — always read first (derived cache; runtime, gitignored)
+  DECISIONS.md                              # Append-only decisions register
   milestones/
     M001/
-      roadmap.md                            # Milestone plan (checkboxes = state)
-      context.md                            # Optional: user decisions from discuss phase
-      research.md                           # Optional: codebase/tech research
-      summary.md                            # Milestone rollup (updated as slices complete)
+      M001-ROADMAP.md                       # Milestone plan (checkboxes = state)
+      M001-CONTEXT.md                       # Optional: user decisions from discuss phase
+      M001-RESEARCH.md                      # Optional: codebase/tech research
+      M001-SUMMARY.md                       # Milestone rollup (updated as slices complete)
       slices/
         S01/
-          plan.md                           # Task decomposition for this slice
-          context.md                        # Optional: slice-level user decisions
-          research.md                       # Optional: slice-level research
-          summary.md                        # Slice summary (written on completion)
-          uat.md                            # Non-blocking human test script (written on completion)
+          S01-PLAN.md                       # Task decomposition for this slice
+          S01-CONTEXT.md                    # Optional: slice-level user decisions
+          S01-RESEARCH.md                   # Optional: slice-level research
+          S01-SUMMARY.md                    # Slice summary (written on completion)
+          S01-UAT.md                        # Non-blocking human test script (written on completion)
           continue.md                       # Ephemeral: resume point if interrupted
           tasks/
-            T01-plan.md                     # Individual task plan
-            T01-summary.md                  # Task summary with frontmatter
+            T01-PLAN.md                     # Individual task plan
+            T01-SUMMARY.md                  # Task summary with frontmatter
 ```
 ---
 ## File Format Reference
-### `roadmap.md`
+### `M###-ROADMAP.md`
 ```markdown
 # M001: Title of the Milestone
@@ -93,7 +94,7 @@ All artifacts live in `.gsd/` at the project root:
 **Parsing rules:** `- [x]` = done, `- [ ]` = not done. The `risk:` and `depends:[]` tags are inline metadata parsed from the line. `depends:[]` lists slice IDs this slice requires to be complete first.
-**Boundary Map** (required section in roadmap.md):
+**Boundary Map** (required section in M###-ROADMAP.md):
 After the slices section, include a `## Boundary Map` that shows what each slice produces and consumes:
@@ -123,7 +124,7 @@ The boundary map is a **planning artifact** — not runnable code. It:
 - Enables deterministic verification that slices actually connect
 - Gets updated during slice planning if new interfaces emerge
-### `plan.md` (slice-level)
+### `S##-PLAN.md` (slice-level)
 ```markdown
 # S01: Slice Title
@@ -148,7 +149,7 @@ The boundary map is a **planning artifact** — not runnable code. It:
 - path/to/another.ts
 ```
-### `TNN-plan.md` (task-level)
+### `T##-PLAN.md` (task-level)
 ```markdown
 # T01: Task Title
@@ -188,7 +189,7 @@ Critical wiring between artifacts:
 **Must-haves are what make verification mechanically checkable.** Truths are checked by running commands or reading output. Artifacts are checked by confirming files exist with real content. Key links are checked by confirming imports/references actually connect the pieces.
-### `state.md`
+### `STATE.md`
 ```markdown
 # GSD State
@@ -209,10 +210,10 @@ Critical wiring between artifacts:
 Exact next thing to do.
 ```
-### `context.md` (from discuss phase)
+### `M###-CONTEXT.md` / `S##-CONTEXT.md` (from discuss phase)
 ```markdown
-# S01: Slice Title — Context
+# M001: Milestone or Slice Title — Context
 **Gathered:** 2026-03-07
 **Status:** Ready for planning
@@ -228,7 +229,7 @@ Exact next thing to do.
 - Ideas that came up but belong in other slices
 ```
-### `decisions.md` (append-only register)
+### `DECISIONS.md` (append-only register)
 ```markdown
 # Decisions Register
@@ -265,7 +266,7 @@ Work flows through these phases. Each phase produces a file.
 ### Phase 1: Discuss (Optional)
 **Purpose:** Capture user decisions on gray areas before planning.
-**Produces:** `context.md` at milestone or slice level.
+**Produces:** `M###-CONTEXT.md` for milestone-level discussion or `S##-CONTEXT.md` for slice-level discussion.
 **When to use:** When the scope has ambiguities the user should weigh in on.
 **When to skip:** When the user already knows exactly what they want, or told you to just go.
@@ -273,18 +274,18 @@ Work flows through these phases. Each phase produces a file.
 1. Read the roadmap to understand the scope.
 2. Identify 3-5 gray areas — implementation decisions the user cares about.
 3. Use `ask_user_questions` to discuss each area.
-4. Write decisions to `context.md`.
+4. Write decisions to the appropriate context file (`M###-CONTEXT.md` or `S##-CONTEXT.md`).
 5. Do NOT discuss how to implement — only what the user wants.
 ### Phase 2: Research (Optional)
 **Purpose:** Scout the codebase and relevant docs before planning.
-**Produces:** `research.md` at milestone or slice level.
+**Produces:** `M###-RESEARCH.md` at milestone level or `S##-RESEARCH.md` at slice level.
 **When to use:** When working in unfamiliar code, with unfamiliar libraries, or on complex integrations.
 **When to skip:** When the codebase is familiar and the work is straightforward.
 **How to do it manually:**
-1. Read `context.md` if it exists — know what decisions are locked.
+1. Read `M###-CONTEXT.md` and/or `S##-CONTEXT.md` if they exist — know what decisions are locked.
 2. Scout relevant code: `rg`, `find`, read key files.
 3. Use `resolve_library` / `get_library_docs` if needed.
 4. Write findings to `research.md` with these sections:
@@ -324,24 +325,24 @@ The **Don't Hand-Roll** and **Common Pitfalls** sections prevent the most expens
 ### Phase 3: Plan
 **Purpose:** Decompose work into context-window-sized tasks with must-haves.
-**Produces:** `plan.md` + individual `T01-plan.md` files.
+**Produces:** `S##-PLAN.md` + individual `T01-PLAN.md` files.
 **For a milestone (roadmap):**
-1. Read `context.md`, `research.md`, and `.gsd/decisions.md` if they exist.
+1. Read `M###-CONTEXT.md`, `M###-RESEARCH.md`, and `.gsd/DECISIONS.md` if they exist.
 2. Decompose the vision into 4-10 demoable vertical slices.
 3. Order by risk (high-risk first to validate feasibility early).
-4. Write `roadmap.md` with checkboxes, risk levels, dependencies, demo sentences.
+4. Write `M###-ROADMAP.md` with checkboxes, risk levels, dependencies, demo sentences.
 5. **Write the boundary map** — for each slice, specify what it produces (functions, types, interfaces, endpoints) and what it consumes from upstream slices. This forces interface thinking before implementation and enables deterministic verification that slices actually connect.
 **For a slice (task decomposition):**
-1. Read the slice's entry in `roadmap.md` **and its boundary map section** — know what interfaces this slice must produce and consume.
-2. Read `context.md`, `research.md`, and `.gsd/decisions.md` if they exist for this slice.
+1. Read the slice's entry in `M###-ROADMAP.md` **and its boundary map section** — know what interfaces this slice must produce and consume.
+2. Read `M###-CONTEXT.md`, `S##-CONTEXT.md`, `M###-RESEARCH.md`, `S##-RESEARCH.md`, and `.gsd/DECISIONS.md` if they exist for this slice.
 3. Read summaries from dependency slices (check `depends:[]` in roadmap).
 4. Verify that upstream slices' actual outputs match what the boundary map says this slice consumes. If they diverge, update the boundary map.
 5. Decompose into 1-7 tasks, each fitting one context window.
 6. Each task needs: title, description, steps (3-10), must-haves (observable verification criteria).
 7. Must-haves should reference boundary map contracts — e.g. "exports `generateToken()` as specified in boundary map S01→S02".
-8. Write `plan.md` and individual `TNN-plan.md` files.
+8. Write `S##-PLAN.md` and individual `T##-PLAN.md` files.
 ### Phase 4: Execute
@@ -349,10 +350,10 @@ The **Don't Hand-Roll** and **Common Pitfalls** sections prevent the most expens
 **Produces:** Code changes + `[DONE:n]` markers.
 **How to do it manually:**
-1. Read the task's `TNN-plan.md`.
+1. Read the task's `T##-PLAN.md`.
 2. Read relevant summaries from prior tasks (for context on what's already built).
 3. Execute each step. Mark progress with `[DONE:n]` in responses.
-4. If you made an architectural, pattern, or library decision, append it to `.gsd/decisions.md`.
+4. If you made an architectural, pattern, or library decision, append it to `.gsd/DECISIONS.md`.
 5. If interrupted or context is getting full, write `continue.md` (see below).
 ### Phase 5: Verify
@@ -400,7 +401,7 @@ When verification finds gaps, include a **Gaps** section with what's missing, im
 ### Phase 6: Summarize
 **Purpose:** Record what happened for downstream tasks.
-**Produces:** `TNN-summary.md`, and when slice completes, `summary.md`.
+**Produces:** `T##-SUMMARY.md`, and when slice completes, `S##-SUMMARY.md`.
 **Task summary format:**
 ```markdown
@@ -421,7 +422,7 @@ key_decisions:
 patterns_established:
   - "Pattern name and where it lives"
 drill_down_paths:
-  - .gsd/milestones/M001/slices/S01/tasks/T01-plan.md
+  - .gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md
 duration: 15min
 verification_result: pass
 completed_at: 2026-03-07T16:00:00Z
@@ -445,7 +446,7 @@ What differed from the plan and why (or "None").
 The one-liner must be substantive: "JWT auth with refresh rotation using jose" not "Authentication implemented."
-**Slice summary:** Written when all tasks in a slice complete. Compresses all task summaries. Includes `drill_down_paths` to each task summary. During slice completion, review task summaries for `key_decisions` and ensure any significant ones are captured in `.gsd/decisions.md`.
+**Slice summary:** Written when all tasks in a slice complete. Compresses all task summaries. Includes `drill_down_paths` to each task summary. During slice completion, review task summaries for `key_decisions` and ensure any significant ones are captured in `.gsd/DECISIONS.md`.
 **Milestone summary:** Updated each time a slice completes. Compresses all slice summaries. This is what gets injected into later slice planning instead of loading many individual summaries.
@@ -454,16 +455,16 @@ The one-liner must be substantive: "JWT auth with refresh rotation using jose" n
 **Purpose:** Mark work done and move to the next thing.
 **After a task completes:**
-1. Mark the task done in `plan.md` (checkbox).
+1. Mark the task done in `S##-PLAN.md` (checkbox).
 2. Check if there's a next task in the slice → execute it.
-3. If slice is complete → write slice summary, mark slice done in `roadmap.md`.
+3. If slice is complete → write slice summary, mark slice done in `M###-ROADMAP.md`.
 **After a slice completes:**
-1. Write slice `summary.md` (compresses all task summaries).
-2. Write slice `uat.md` — a non-blocking human test script derived from the slice's must-haves and demo sentence. The agent does NOT wait for UAT results.
-3. Mark the slice checkbox in `roadmap.md` as `[x]`.
-4. Update `state.md` with new position.
-5. Update milestone `summary.md` with the completed slice's contributions.
+1. Write slice `S##-SUMMARY.md` (compresses all task summaries).
+2. Write slice `S##-UAT.md` — a non-blocking human test script derived from the slice's must-haves and demo sentence. The agent does NOT wait for UAT results.
+3. Mark the slice checkbox in `M###-ROADMAP.md` as `[x]`.
+4. Update `STATE.md` with new position.
+5. Update milestone `M###-SUMMARY.md` with the completed slice's contributions.
 6. Continue to next slice immediately. The user tests the UAT whenever convenient.
 7. If the user reports UAT failures later, create fix tasks in the current or a new slice.
 8. If all slices done → milestone complete.
@@ -513,17 +514,17 @@ The EXACT first thing to do when resuming. Not vague. Specific.
 ## State Management
-### `state.md` is a derived cache
+### `STATE.md` is a derived cache
 It is NOT the source of truth. It's a convenience dashboard.
 **Sources of truth:**
-- `roadmap.md` → which slices exist and which are done
-- `plan.md` → which tasks exist within a slice
-- `TNN-summary.md` → what happened during a task
-- `summary.md` (slice/milestone) → compressed outcomes
+- `M###-ROADMAP.md` → which slices exist and which are done
+- `S##-PLAN.md` → which tasks exist within a slice
+- `T##-SUMMARY.md` → what happened during a task
+- `S##-SUMMARY.md` and `M###-SUMMARY.md` → compressed slice and milestone outcomes
-**Update `state.md`** after every significant action:
+**Update `STATE.md`** after every significant action:
 - Active milestone/slice/task
 - Recent decisions (last 3-5)
 - Blockers
@@ -611,9 +612,9 @@ Tasks completed:
 When planning or executing a task, load relevant prior context:
-1. Check the current slice's `depends:[]` in `roadmap.md`.
+1. Check the current slice's `depends:[]` in `M###-ROADMAP.md`.
 2. Load summaries from those dependency slices.
-3. Start with the **highest available level** — milestone `summary.md` first.
+3. Start with the **highest available level** — milestone `M###-SUMMARY.md` first.
 4. Only drill down to slice/task summaries if you need specific detail.
 5. Stay within **~2500 tokens** of total injected summary context.
 6. If the dependency chain is too large, drop the oldest/least-relevant summaries first.
@@ -630,32 +631,33 @@ These are soft caps — exceed them when genuinely needed, but don't let summari
 ## Project-Specific Context
-This methodology doc is generic. Project-specific guidance belongs in the milestone's `context.md`:
+This methodology doc is generic. Project-specific guidance belongs in the milestone and slice context files:
-- **`.gsd/milestones/<active>/context.md`** — Architecture decisions, reference file paths, per-slice doc reading guides, implementation constraints, and any project-specific protocols (worktrees, testing, etc.)
+- **`.gsd/milestones/<active>/M###-CONTEXT.md`** — milestone-level architecture decisions, reference file paths, and implementation constraints
+- **`.gsd/milestones/<active>/slices/S##/S##-CONTEXT.md`** — slice-level decisions, edge cases, and narrow implementation guidance when present
-**Always read the active milestone's `context.md` before starting implementation work.** It tells you what decisions are locked, what files to reference, and how to verify your work in this specific project.
+**Always read the active milestone's `M###-CONTEXT.md` before starting implementation work.** If the active slice also has `S##-CONTEXT.md`, read that too. These files tell you what decisions are locked, what files to reference, and how to verify your work in this specific project.
 ---
 ## Checklist for a Fresh Session
-1. Read `.gsd/state.md` — what's the next action?
+1. Read `.gsd/STATE.md` — what's the next action?
 2. Check for `continue.md` in the active slice — is there interrupted work?
 3. If resuming: read `continue.md`, delete it, pick up from "Next Action".
-4. If starting fresh: read the active slice's `plan.md`, find the next incomplete task.
-5. If in a planning or research phase, read `.gsd/decisions.md` — respect existing decisions.
+4. If starting fresh: read the active slice's `S##-PLAN.md`, find the next incomplete task.
+5. If in a planning or research phase, read `.gsd/DECISIONS.md` — respect existing decisions.
 6. Read relevant summaries from prior tasks/slices for context.
 7. Do the work.
 8. Verify the must-haves.
 9. Write the summary.
-10. Mark done, update `state.md`, advance.
-11. If context is getting full or you're done for now: write `continue.md` if mid-task, or update `state.md` with next action if between tasks.
+10. Mark done, update `STATE.md`, advance.
+11. If context is getting full or you're done for now: write `continue.md` if mid-task, or update `STATE.md` with next action if between tasks.
 ## When Context Gets Large
 If you sense context pressure (many files read, long execution, lots of tool output):
 1. **If mid-task:** Write `continue.md` with exact resume state. Tell the user: "Context is getting full. I've saved progress to continue.md. Start a new session and run `/gsd` to pick up where you left off, or `/gsd auto` to resume in auto-execution mode."
-2. **If between tasks:** Just update `state.md` with the next action. No continue file needed — the next session will read state.md and pick up the next task cleanly.
+2. **If between tasks:** Just update `STATE.md` with the next action. No continue file needed — the next session will read STATE.md and pick up the next task cleanly.
 3. **Don't fight it.** The whole system is designed for this. A fresh session with the right files loaded is better than a stale session with degraded reasoning.
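
Note on this change: the workflow document now uses ID-prefixed, upper-case artifact names (`M###-ROADMAP.md`, `S##-PLAN.md`, `T##-PLAN.md`). As a rough illustration of the convention, the paths can be derived from the IDs alone. The helper names and union types below are hypothetical and are not the package's own path utilities.

```typescript
// Hypothetical helpers that encode the naming scheme described in the diff above.
type MilestoneArtifact = "ROADMAP" | "CONTEXT" | "RESEARCH" | "SUMMARY";
type SliceArtifact = "PLAN" | "CONTEXT" | "RESEARCH" | "SUMMARY" | "UAT";
type TaskArtifact = "PLAN" | "SUMMARY";

// .gsd/milestones/M001/M001-ROADMAP.md
const milestoneFile = (mid: string, kind: MilestoneArtifact): string =>
  `.gsd/milestones/${mid}/${mid}-${kind}.md`;

// .gsd/milestones/M001/slices/S01/S01-PLAN.md
const sliceFile = (mid: string, sid: string, kind: SliceArtifact): string =>
  `.gsd/milestones/${mid}/slices/${sid}/${sid}-${kind}.md`;

// .gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md
const taskFile = (mid: string, sid: string, tid: string, kind: TaskArtifact): string =>
  `.gsd/milestones/${mid}/slices/${sid}/tasks/${tid}-${kind}.md`;

const roadmap = milestoneFile("M001", "ROADMAP");
const slicePlan = sliceFile("M001", "S01", "PLAN");
const taskSummary = taskFile("M001", "S01", "T01", "SUMMARY");
```

`STATE.md`, `DECISIONS.md`, and `continue.md` sit outside this scheme because they carry no ID prefix.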

package/dist/resources/extensions/gsd/auto-dashboard.ts CHANGED

@@ -265,6 +265,16 @@ export function updateProgressWidget(
       tui.requestRender();
     }, 800);
+    // Refresh progress cache from disk every 5s so the widget reflects
+    // task/slice completion mid-unit. Without this, the progress bar only
+    // updates at dispatch time, appearing frozen during long-running units.
+    const progressRefreshTimer = mid ? setInterval(() => {
+      try {
+        updateSliceProgressCache(accessors.getBasePath(), mid.id, slice?.id);
+        cachedLines = undefined;
+      } catch { /* non-fatal */ }
+    }, 5_000) : null;
     return {
       render(width: number): string[] {
         if (cachedLines && cachedWidth === width) return cachedLines;
@@ -416,6 +426,7 @@ export function updateProgressWidget(
       },
       dispose() {
         clearInterval(pulseTimer);
+        if (progressRefreshTimer) clearInterval(progressRefreshTimer);
       },
     };
   });
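
Note on this change: the widget now pairs a periodic cache invalidation with a matching `clearInterval` in `dispose()`. The pattern can be sketched in isolation as below. Everything here is illustrative: `createProgressWidget`, `readDone`, and the exposed `refresh` method are hypothetical, and the sketch re-reads progress at render time instead of calling `updateSliceProgressCache`.

```typescript
// Hypothetical, simplified widget: caches its rendered line until invalidated.
function createProgressWidget(readDone: () => number, total: number, refreshMs = 5_000) {
  let cachedLine: string | undefined;

  // Dropping the cache forces the next render() to re-read progress.
  const refresh = (): void => {
    cachedLine = undefined;
  };
  const timer = setInterval(refresh, refreshMs);

  return {
    render(): string {
      if (cachedLine === undefined) cachedLine = `${readDone()}/${total} tasks`;
      return cachedLine;
    },
    refresh, // exposed so the timer's effect can be triggered directly
    dispose(): void {
      // Without this, the interval would keep the process alive after teardown.
      clearInterval(timer);
    },
  };
}

let done = 1;
const widget = createProgressWidget(() => done, 4);
const first = widget.render();
done = 2;
const stale = widget.render(); // still the cached line
widget.refresh(); // what the timer does every refreshMs
const fresh = widget.render();
widget.dispose();
```

The diff guards its timer with `mid ? ... : null`, so its `dispose()` has to check for `null` before clearing; the sketch always creates the timer and can skip that check.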

package/dist/resources/extensions/gsd/auto-prompts.ts CHANGED

@@ -383,6 +383,7 @@ export async function buildResearchMilestonePrompt(mid: string, midTitle: string
   const outputRelPath = relMilestoneFile(base, mid, "RESEARCH");
   return loadPrompt("research-milestone", {
+    workingDirectory: base,
     milestoneId: mid, milestoneTitle: midTitle,
     milestonePath: relMilestonePath(base, mid),
     contextPath: contextRel,
@@ -422,6 +423,7 @@ export async function buildPlanMilestonePrompt(mid: string, midTitle: string, ba
   const outputRelPath = relMilestoneFile(base, mid, "ROADMAP");
   const secretsOutputPath = relMilestoneFile(base, mid, "SECRETS");
   return loadPrompt("plan-milestone", {
+    workingDirectory: base,
     milestoneId: mid, milestoneTitle: midTitle,
     milestonePath: relMilestonePath(base, mid),
     contextPath: contextRel,
@@ -667,6 +669,7 @@ export async function buildCompleteMilestonePrompt(
   const milestoneSummaryPath = `${relMilestonePath(base, mid)}/${mid}-SUMMARY.md`;
   return loadPrompt("complete-milestone", {
+    workingDirectory: base,
     milestoneId: mid,
     milestoneTitle: midTitle,
     roadmapPath: roadmapRel,
@@ -715,6 +718,7 @@ export async function buildReplanSlicePrompt(
   const replanPath = `${relSlicePath(base, mid, sid)}/${sid}-REPLAN.md`;
   return loadPrompt("replan-slice", {
+    workingDirectory: base,
     milestoneId: mid,
     sliceId: sid,
     sliceTitle: sTitle,
@@ -748,6 +752,7 @@ export async function buildRunUatPrompt(
   const uatType = extractUatType(uatContent) ?? "human-experience";
   return loadPrompt("run-uat", {
+    workingDirectory: base,
     milestoneId: mid,
     sliceId,
     uatPath,
@@ -780,6 +785,7 @@ export async function buildReassessRoadmapPrompt(
   const assessmentPath = relSliceFile(base, mid, completedSliceId, "ASSESSMENT");
   return loadPrompt("reassess-roadmap", {
+    workingDirectory: base,
     milestoneId: mid,
     milestoneTitle: midTitle,
     completedSliceId,

package/dist/resources/extensions/gsd/auto-recovery.ts CHANGED

@@ -149,7 +149,12 @@ export function verifyExpectedArtifact(unitType: string, unitId: string, base: s
           const roadmap = parseRoadmap(roadmapContent);
           const slice = roadmap.slices.find(s => s.id === sid);
           if (slice && !slice.done) return false;
-        } catch (e) { /* corrupt roadmap — be lenient and treat as verified */ void e; }
+        } catch {
+          // Corrupt/unparseable roadmap — fail verification so the unit
+          // re-runs and has a chance to fix the roadmap. Silently passing
+          // here could advance past an incomplete slice.
+          return false;
+        }
       }
     }
   }
@@ -251,6 +256,11 @@ export function skipExecuteTask(
       const re = new RegExp(`^(- \\[) \\] (\\*\\*${escapedTid}:)`, "m");
       if (re.test(planContent)) {
         writeFileSync(planAbs, planContent.replace(re, "$1x] $2"), "utf-8");
+      } else {
+        // Regex didn't match — checkbox format differs from expected pattern.
+        // Return false so callers know the plan was NOT updated and can
+        // fall through to other recovery strategies instead of assuming success.
+        return false;
       }
     }
   }
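
Note on this hunk: the new `else` branch reports failure when the checkbox regex does not match, instead of silently leaving the plan unchanged. A standalone sketch of that behaviour follows. `markTaskDone` and `escapeRegExp` are hypothetical names, and the sample plan lines assume the `- [ ] **T01: Title**` format implied by the regex.

```typescript
// Standard idiom for escaping regex metacharacters in a literal string.
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Returns the updated plan text, or null when no unchecked box for `tid` matched.
function markTaskDone(planContent: string, tid: string): string | null {
  // Same pattern as the diff: "- [ ] **T01:" at the start of a line.
  const re = new RegExp(`^(- \\[) \\] (\\*\\*${escapeRegExp(tid)}:)`, "m");
  if (!re.test(planContent)) return null; // caller can try another recovery strategy
  return planContent.replace(re, "$1x] $2");
}

// Assumed plan format, inferred from the regex above.
const plan = "- [x] **T01: Set up schema**\n- [ ] **T02: Add endpoint**\n";
const updated = markTaskDone(plan, "T02");
const unmatched = markTaskDone("* [ ] T02: Add endpoint\n", "T02");
```

Returning a distinct "not updated" value lets the caller tell a plan that was changed apart from one whose checkbox format did not match.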
@@ -290,7 +300,10 @@ export function removePersistedKey(base: string, key: string): void {
     if (existsSync(file)) {
       let keys: string[] = JSON.parse(readFileSync(file, "utf-8"));
       keys = keys.filter(k => k !== key);
-      writeFileSync(file, JSON.stringify(keys), "utf-8");
+      // Atomic write: tmp file + rename prevents partial writes on crash
+      const tmpFile = file + ".tmp";
+      writeFileSync(tmpFile, JSON.stringify(keys), "utf-8");
+      renameSync(tmpFile, file);
     }
   } catch (e) { /* non-fatal: removePersistedKey failure */ void e; }
 }
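
Note on this hunk: writing to a temporary file and renaming it over the target means a crash mid-write leaves the old file intact, because a rename within one filesystem replaces the target in a single step on POSIX systems. A minimal sketch, assuming a hypothetical `writeJsonAtomic` helper and made-up key values in the `unitType/unitId` form:

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write to a sibling temp file, then rename over the target.
function writeJsonAtomic(file: string, value: unknown): void {
  const tmpFile = file + ".tmp";
  writeFileSync(tmpFile, JSON.stringify(value), "utf-8");
  renameSync(tmpFile, file); // readers see the old or the new content, never a partial file
}

const dir = mkdtempSync(join(tmpdir(), "gsd-atomic-"));
const file = join(dir, "completed-units.json");

// Illustrative keys only; the exact unitId format is an assumption.
writeJsonAtomic(file, ["execute-task/M001/S01/T01", "execute-task/M001/S01/T02"]);

const keys: string[] = JSON.parse(readFileSync(file, "utf-8"));
writeJsonAtomic(file, keys.filter(k => k !== "execute-task/M001/S01/T01"));

const remaining: string[] = JSON.parse(readFileSync(file, "utf-8"));
const tmpLeftBehind = existsSync(file + ".tmp");
```

The temp file must live on the same filesystem as the target for the rename to stay a single step, which is why it is created next to the target rather than in the system temp directory.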
@@ -412,8 +425,12 @@ export async function selfHealRuntimeRecords(
       const { unitType, unitId } = record;
       const artifactPath = resolveExpectedArtifactPath(unitType, unitId, base);
-      // Case 1: Artifact exists — unit completed but closeout didn't finish
-      if (artifactPath && existsSync(artifactPath)) {
+      // Case 1: Artifact exists — unit completed but closeout didn't finish.
+      // Use verifyExpectedArtifact (not just existsSync) so that execute-task
+      // also checks the plan checkbox is marked [x]. Without this, a task
+      // whose summary exists but checkbox is unchecked would be incorrectly
+      // marked as completed, causing deriveState to re-dispatch it endlessly.
+      if (artifactPath && existsSync(artifactPath) && verifyExpectedArtifact(unitType, unitId, base)) {
         clearUnitRuntimeRecord(base, unitType, unitId);
         // Also persist completion key if missing
         const key = `${unitType}/${unitId}`;

package/dist/resources/extensions/gsd/auto.ts CHANGED

@@ -39,7 +39,7 @@ import {
   readUnitRuntimeRecord,
   writeUnitRuntimeRecord,
 } from "./unit-runtime.js";
-import { resolveAutoSupervisorConfig, resolveModelWithFallbacksForUnit, loadEffectiveGSDPreferences } from "./preferences.js";
+import { resolveAutoSupervisorConfig, resolveModelWithFallbacksForUnit, loadEffectiveGSDPreferences, resolveSkillDiscoveryMode } from "./preferences.js";
 import { sendDesktopNotification } from "./notifications.js";
 import type { GSDPreferences } from "./preferences.js";
 import {
@@ -68,6 +68,7 @@ import {
 } from "./metrics.js";
 import { join } from "node:path";
 import { sep as pathSep } from "node:path";
+import { homedir } from "node:os";
 import { readdirSync, readFileSync, existsSync, mkdirSync, writeFileSync, unlinkSync, statSync } from "node:fs";
 import { execSync, execFileSync } from "node:child_process";
 import {
@@ -156,6 +157,33 @@ const unitRecoveryCount = new Map<string, number>();
 /** Persisted completed-unit keys — survives restarts. Loaded from .gsd/completed-units.json. */
 const completedKeySet = new Set<string>();
+/** Resource sync timestamp captured at auto-mode start. If the managed-resources
+ *  manifest changes mid-session (e.g. /gsd:update or dev edit + copy-resources),
+ *  templates on disk may expect variables the in-memory code doesn't provide.
+ *  Detect this and stop gracefully instead of crashing. */
+let resourceSyncedAtOnStart: number | null = null;
+function readResourceSyncedAt(): number | null {
+  const agentDir = process.env.GSD_CODING_AGENT_DIR || join(homedir(), ".gsd", "agent");
+  const manifestPath = join(agentDir, "managed-resources.json");
+  try {
+    const manifest = JSON.parse(readFileSync(manifestPath, "utf-8"));
+    return typeof manifest?.syncedAt === "number" ? manifest.syncedAt : null;
+  } catch {
+    return null;
+  }
+}
+function checkResourcesStale(): string | null {
+  if (resourceSyncedAtOnStart === null) return null;
+  const current = readResourceSyncedAt();
+  if (current === null) return null;
+  if (current !== resourceSyncedAtOnStart) {
+    return "GSD resources were updated since this session started. Restart gsd to load the new code.";
+  }
+  return null;
+}
 /**
  * Resolve whether auto-mode should use worktree isolation.
  * Returns true for worktree mode (default), false for branch mode.
@@ -618,6 +646,7 @@ export async function startAuto(
   resetHookState();
   restoreHookState(base);
   autoStartTime = Date.now();
+  resourceSyncedAtOnStart = readResourceSyncedAt();
   completedUnits = [];
   currentUnit = null;
   currentMilestoneId = state.activeMilestone?.id ?? null;
@@ -1141,6 +1170,18 @@ async function dispatchNextUnit(
     await new Promise(r => setTimeout(r, 200));
   }
+  // Resource version guard: detect mid-session resource updates.
+  // Templates are read from disk on each dispatch but extension code is loaded
+  // once at startup. If resources were re-synced (e.g. /gsd:update, npm update,
+  // or dev copy-resources), templates may expect variables the in-memory code
+  // doesn't provide. Stop gracefully instead of crashing.
+  const staleMsg = checkResourcesStale();
+  if (staleMsg) {
+    await stopAuto(ctx, pi);
+    ctx.ui.notify(staleMsg, "error");
+    return;
+  }
   // Clear all caches so deriveState sees fresh disk state (#431).
   // Parse cache is also cleared — doctor may have re-populated it with
   // stale data between handleAgentEnd and this dispatch call (Path B fix).
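
Note on this change: the guard captures `syncedAt` once at start and compares it on every dispatch, treating an unreadable value on either side as "not stale". The logic can be sketched without the filesystem as below. `createStaleGuard` is a hypothetical name, and the injected `readSyncedAt` callback stands in for reading the manifest.

```typescript
// Hypothetical factory: captures the start value and returns the per-dispatch check.
function createStaleGuard(readSyncedAt: () => number | null): () => string | null {
  const onStart = readSyncedAt();
  return () => {
    if (onStart === null) return null; // nothing to compare against
    const current = readSyncedAt();
    if (current === null) return null; // manifest unreadable right now: do not stop
    return current !== onStart
      ? "GSD resources were updated since this session started. Restart gsd to load the new code."
      : null;
  };
}

// Simulated manifest value, mutated to mimic a mid-session re-sync.
let syncedAt: number | null = 1000;
const checkStale = createStaleGuard(() => syncedAt);

const beforeUpdate = checkStale();
syncedAt = 2000;
const afterUpdate = checkStale();
syncedAt = null;
const unreadable = checkStale();
```

Comparing with `!==` rather than `>` means any change to the timestamp stops the session, including a manifest rewritten with an earlier clock value.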

package/dist/resources/extensions/gsd/prompts/complete-milestone.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Complete Milestone {{milestoneId}} ("{{milestoneTitle}}")
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 ## Your Role in the Pipeline
 All slices are done. You are closing out the milestone — verifying that the assembled work actually delivers the promised outcome, writing the milestone summary, and updating project state. The milestone summary is the final record. After you finish, the system merges the worktree back to the integration branch. If there are queued milestones, the next one starts its own research → plan → execute cycle from a clean slate — the milestone summary is how it learns what was already built.

package/dist/resources/extensions/gsd/prompts/plan-milestone.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Plan Milestone {{milestoneId}} ("{{milestoneTitle}}")
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 All relevant context has been preloaded below — start working immediately without re-reading these files.
 {{inlinedContext}}

package/dist/resources/extensions/gsd/prompts/reassess-roadmap.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Reassess Roadmap — Milestone {{milestoneId}} after {{completedSliceId}}
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 ## Your Role in the Pipeline
 A slice just completed. The **complete-slice agent** verified the work and wrote a slice summary. You decide whether the remaining roadmap still makes sense given what was actually built. If you change the roadmap, the next slice's **researcher** and **planner** agents work from your updated version. If you confirm it's fine, the pipeline moves to the next slice immediately.

package/dist/resources/extensions/gsd/prompts/replan-slice.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Replan Slice {{sliceId}} ("{{sliceTitle}}") — Milestone {{milestoneId}}
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 A completed task reported `blocker_discovered: true`, meaning the current slice plan cannot be executed as-is. Your job is to rewrite the remaining tasks in the slice plan to address the blocker while preserving all completed work.
 All relevant context has been preloaded below — the roadmap, current slice plan, the blocker task summary, and decisions are inlined. Start working immediately without re-reading these files.

package/dist/resources/extensions/gsd/prompts/research-milestone.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Research Milestone {{milestoneId}} ("{{milestoneTitle}}")
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 All relevant context has been preloaded below — start working immediately without re-reading these files.
 {{inlinedContext}}

package/dist/resources/extensions/gsd/prompts/run-uat.md CHANGED

@@ -2,6 +2,10 @@ You are executing GSD auto-mode.
 ## UNIT: Run UAT — {{milestoneId}}/{{sliceId}}
+## Working Directory
+Your working directory is `{{workingDirectory}}`. All file reads, writes, and shell commands MUST operate relative to this directory. Do NOT `cd` to any other directory.
 All relevant context has been preloaded below. Start working immediately without re-reading these files.
 {{inlinedContext}}
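
Note on these template changes: each prompt now references `{{workingDirectory}}`, and the matching builders in `auto-prompts.ts` pass `workingDirectory: base`. The sketch below illustrates why templates and builder code have to stay in sync. `renderTemplate` is a hypothetical stand-in; the package's real `loadPrompt` is not shown in this diff, and its behaviour on a missing variable is an assumption.

```typescript
// Hypothetical renderer: substitutes {{name}} placeholders and rejects unknown ones.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value = vars[name];
    if (value === undefined) throw new Error(`Missing template variable: ${name}`);
    return value;
  });
}

const template = "Your working directory is `{{workingDirectory}}`. Unit: {{milestoneId}}.";

// Builder and template agree on the variable set.
const rendered = renderTemplate(template, { workingDirectory: "/repo", milestoneId: "M001" });

// New template on disk, older in-memory builder that does not pass workingDirectory.
let missingError = "";
try {
  renderTemplate(template, { milestoneId: "M001" });
} catch (e) {
  missingError = e instanceof Error ? e.message : String(e);
}
```

The second call models the mismatch that the resource staleness guard in `auto.ts` is meant to catch before a dispatch.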