@snipcodeit/mgw 0.4.0 → 0.6.0

@@ -38,7 +38,7 @@ If no state file exists → issue not triaged yet. Run triage inline:
  - Execute the mgw:issue triage flow (steps from issue.md) inline.
  - After triage, reload state file.
 
- If state file exists → load it. **Run migrateProjectState() to ensure retry fields exist:**
+ If state file exists → load it. **Run migrateProjectState() to ensure retry and checkpoint fields exist:**
  ```bash
  node -e "
  const { migrateProjectState } = require('./lib/state.cjs');
@@ -46,6 +46,161 @@ migrateProjectState();
  " 2>/dev/null || true
  ```
 
+ **Checkpoint detection — check for resumable progress before stage routing:**
+
+ After loading state and running migration, detect whether a prior pipeline run left
+ a checkpoint with meaningful progress (beyond triage). If found, present the user
+ with Resume/Fresh/Skip options before proceeding.
+
+ ```bash
+ # Detect checkpoint with progress beyond triage
+ CHECKPOINT_DATA=$(node -e "
+ const { detectCheckpoint, resumeFromCheckpoint } = require('./lib/state.cjs');
+ const cp = detectCheckpoint(${ISSUE_NUMBER});
+ if (!cp) {
+ console.log('none');
+ } else {
+ const resume = resumeFromCheckpoint(${ISSUE_NUMBER});
+ console.log(JSON.stringify(resume));
+ }
+ " 2>/dev/null || echo "none")
+ ```
+
+ If checkpoint is found (`CHECKPOINT_DATA !== "none"`):
+
+ Parse the checkpoint data and display to the user:
+ ```bash
+ CHECKPOINT_STEP=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.checkpoint.pipeline_step);
+ ")
+ RESUME_ACTION=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.resumeAction);
+ ")
+ RESUME_STAGE=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.resumeStage);
+ ")
+ COMPLETED_STEPS=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.completedSteps.join(', '));
+ ")
+ ARTIFACTS_COUNT=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.checkpoint.artifacts.length);
+ ")
+ STARTED_AT=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.checkpoint.started_at || 'unknown');
+ ")
+ UPDATED_AT=$(echo "$CHECKPOINT_DATA" | node -e "
+ const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf-8'));
+ console.log(d.checkpoint.updated_at || 'unknown');
+ ")
+ ```
+
+ Display checkpoint state and prompt user:
+ ```
+ AskUserQuestion(
+ header: "Checkpoint Detected for #${ISSUE_NUMBER}",
+ question: "A prior pipeline run left progress at step '${CHECKPOINT_STEP}'.
+
+ | | |
+ |---|---|
+ | **Last step** | ${CHECKPOINT_STEP} |
+ | **Completed steps** | ${COMPLETED_STEPS} |
+ | **Artifacts** | ${ARTIFACTS_COUNT} file(s) |
+ | **Resume action** | ${RESUME_ACTION} → stage: ${RESUME_STAGE} |
+ | **Started** | ${STARTED_AT} |
+ | **Last updated** | ${UPDATED_AT} |
+
+ How would you like to proceed?",
+ options: [
+ { label: "Resume", description: "Resume from checkpoint — skip completed steps (${COMPLETED_STEPS}), jump to ${RESUME_STAGE}" },
+ { label: "Fresh", description: "Discard checkpoint and re-run pipeline from scratch" },
+ { label: "Skip", description: "Skip this issue entirely" }
+ ]
+ )
+ ```
+
+ Handle user choice:
+
+ | Choice | Action |
+ |--------|--------|
+ | **Resume** | Load checkpoint context. Set `pipeline_stage` in state to `${RESUME_STAGE}`. Log: "MGW: Resuming #${ISSUE_NUMBER} from checkpoint (step: ${CHECKPOINT_STEP}, action: ${RESUME_ACTION})." Skip triage/worktree stages that already completed and jump directly to the resume stage in the pipeline. The `resume.context` object carries step-specific data (e.g., `quick_dir`, `plan_num`, `phase_number`) needed by the target stage. |
+ | **Fresh** | Clear checkpoint via `clearCheckpoint()`. Reset `pipeline_stage` to `"triaged"`. Log: "MGW: Checkpoint cleared for #${ISSUE_NUMBER}. Starting fresh." Continue with normal pipeline flow. |
+ | **Skip** | Log: "MGW: Skipping #${ISSUE_NUMBER} per user request." STOP pipeline. |
+
+ ```bash
+ case "$USER_CHOICE" in
+ Resume)
+ # Load resume context and jump to the appropriate stage
+ node -e "
+ const fs = require('fs'), path = require('path');
+ const activeDir = path.join(process.cwd(), '.mgw', 'active');
+ const files = fs.readdirSync(activeDir);
+ const file = files.find(f => f.startsWith('${ISSUE_NUMBER}-') && f.endsWith('.json'));
+ const filePath = path.join(activeDir, file);
+ const state = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
+ // The pipeline_stage already reflects prior progress — do not overwrite
+ // unless the resume target is more advanced than current stage
+ console.log('Resuming from checkpoint: ' + JSON.stringify(state.checkpoint.resume));
+ " 2>/dev/null || true
+ # Set RESUME_MODE=true — downstream stages check this flag to skip completed work
+ RESUME_MODE=true
+ RESUME_CONTEXT="${CHECKPOINT_DATA}"
+ ;;
+ Fresh)
+ node -e "
+ const { clearCheckpoint } = require('./lib/state.cjs');
+ clearCheckpoint(${ISSUE_NUMBER});
+ console.log('Checkpoint cleared for #${ISSUE_NUMBER}');
+ " 2>/dev/null || true
+ # Reset pipeline_stage to triaged for fresh start
+ node -e "
+ const fs = require('fs'), path = require('path');
+ const activeDir = path.join(process.cwd(), '.mgw', 'active');
+ const files = fs.readdirSync(activeDir);
+ const file = files.find(f => f.startsWith('${ISSUE_NUMBER}-') && f.endsWith('.json'));
+ const filePath = path.join(activeDir, file);
+ const state = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
+ state.pipeline_stage = 'triaged';
+ fs.writeFileSync(filePath, JSON.stringify(state, null, 2));
+ " 2>/dev/null || true
+ RESUME_MODE=false
+ ;;
+ Skip)
+ echo "MGW: Skipping #${ISSUE_NUMBER} per user request."
+ exit 0
+ ;;
+ esac
+ ```
+
+ If no checkpoint found (or checkpoint is at triage step only), continue with
+ normal pipeline stage routing below.
+
+ **Initialize checkpoint** when pipeline first transitions past triage:
+ ```bash
+ # Checkpoint initialization — called once when pipeline execution begins.
+ # Sets pipeline_step to "triage" with route selection progress.
+ # Subsequent stages update the checkpoint via updateCheckpoint().
+ # All checkpoint writes are atomic (write to .tmp then rename).
+ node -e "
+ const { updateCheckpoint } = require('./lib/state.cjs');
+ updateCheckpoint(${ISSUE_NUMBER}, {
+ pipeline_step: 'triage',
+ step_progress: {
+ comment_check_done: true,
+ route_selected: '${GSD_ROUTE}'
+ },
+ resume: {
+ action: 'begin-execution',
+ context: { gsd_route: '${GSD_ROUTE}', branch: '${BRANCH_NAME}' }
+ }
+ });
+ " 2>/dev/null || true
+ ```
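
As an editor's illustration (not part of the package diff): the resumability rule the hunk above relies on — a checkpoint counts as resumable only when its `pipeline_step` is beyond `"triage"` — can be sketched in a few lines. `isResumable` is a hypothetical name; the package's actual `detectCheckpoint()` in `lib/state.cjs` also loads the state file.

```javascript
// Minimal sketch of the resumability rule; hypothetical helper, not lib/state.cjs.
const CHECKPOINT_STEP_ORDER = ['triage', 'plan', 'execute', 'verify', 'pr'];

// Resumable only when the checkpoint step is beyond "triage" (index > 0).
function isResumable(checkpoint) {
  if (!checkpoint) return false;
  return CHECKPOINT_STEP_ORDER.indexOf(checkpoint.pipeline_step) > 0;
}

console.log(isResumable(null));                        // false
console.log(isResumable({ pipeline_step: 'triage' })); // false
console.log(isResumable({ pipeline_step: 'execute' })); // true
```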
  Check pipeline_stage:
  - "triaged" → proceed to GSD execution
  - "planning" / "executing" → resume from where we left off
@@ -359,7 +514,37 @@ NEW_COMMENTS=$(gh issue view $ISSUE_NUMBER --json comments \
  --jq "[.comments[-${NEW_COUNT}:]] | .[] | {author: .author.login, body: .body, createdAt: .createdAt}" 2>/dev/null)
  ```
 
- 2. **Spawn classification agent:**
+ 2. **Spawn classification agent (with diagnostic capture):**
+
+ <!-- mgw:criticality=advisory spawn_point=comment-classifier -->
+ <!-- Advisory: comment classification failure does not block the pipeline.
+ If this agent fails, log a warning and treat all new comments as
+ informational (safe default — pipeline continues with stale data).
+
+ Graceful degradation pattern:
+ ```
+ CLASSIFICATION_RESULT=$(wrapAdvisoryAgent(Task(...), 'comment-classifier', {
+ issueNumber: ISSUE_NUMBER,
+ fallback: '{"classification":"informational","reasoning":"comment classifier unavailable","new_requirements":[],"blocking_reason":""}'
+ }))
+ ```
+ -->
+
+ **Pre-spawn diagnostic hook:**
+ ```bash
+ CLASSIFIER_PROMPT="<full classifier prompt assembled above>"
+ DIAG_CLASSIFIER=$(node -e "
+ const dh = require('${REPO_ROOT}/lib/diagnostic-hooks.cjs');
+ const id = dh.beforeAgentSpawn({
+ agentType: 'general-purpose',
+ issueNumber: ${ISSUE_NUMBER},
+ prompt: process.argv[1],
+ repoRoot: '${REPO_ROOT}'
+ });
+ process.stdout.write(id);
+ " "$CLASSIFIER_PROMPT" 2>/dev/null || echo "")
+ ```
+
  ```
  Task(
  prompt="
@@ -414,6 +599,18 @@ Return ONLY valid JSON:
  )
  ```
 
+ **Post-spawn diagnostic hook:**
+ ```bash
+ # Derive the exit reason in shell first — a ${VAR ? ... : ...} ternary is not valid bash
+ EXIT_REASON=$([ -n "$CLASSIFICATION_RESULT" ] && echo "success" || echo "error")
+ node -e "
+ const dh = require('${REPO_ROOT}/lib/diagnostic-hooks.cjs');
+ dh.afterAgentSpawn({
+ diagId: '${DIAG_CLASSIFIER}',
+ exitReason: '${EXIT_REASON}',
+ repoRoot: '${REPO_ROOT}'
+ });
+ " 2>/dev/null || true
+ ```
+
  3. **React based on classification:**
 
  | Classification | Action |
@@ -355,6 +355,77 @@ The agent is read-only (general-purpose, no code execution). It reads project st
  and codebase to classify, then MGW presents the result and offers follow-up actions
  (file new issue, post comment on related issue, etc.).
 
+ ## Diagnostic Capture Hooks
+
+ Every Task() spawn in the mgw:run pipeline SHOULD be instrumented with diagnostic
+ capture hooks using `lib/diagnostic-hooks.cjs`. This provides per-agent telemetry
+ (timing, prompt hash, exit reason, failure classification) without blocking pipeline
+ execution.
+
+ **Required modules:**
+ - `lib/diagnostic-hooks.cjs` — Before/after hooks for Task() spawns
+ - `lib/agent-diagnostics.cjs` — Underlying diagnostic logger (writes to `.mgw/diagnostics/`)
+
+ **Pattern — wrap every Task() spawn:**
+
+ ```bash
+ # 1. Before spawning: record start time and hash prompt
+ DIAG_ID=$(node -e "
+ const dh = require('${REPO_ROOT}/lib/diagnostic-hooks.cjs');
+ const id = dh.beforeAgentSpawn({
+ agentType: '${AGENT_TYPE}', // gsd-planner, gsd-executor, etc.
+ issueNumber: ${ISSUE_NUMBER},
+ prompt: '${PROMPT_SUMMARY}', // short description, not full prompt
+ repoRoot: '${REPO_ROOT}'
+ });
+ process.stdout.write(id);
+ " 2>/dev/null || echo "")
+
+ # 2. Spawn Task() agent (unchanged)
+ Task(
+ prompt="...",
+ subagent_type="${AGENT_TYPE}",
+ description="..."
+ )
+
+ # 3. After agent completes: record exit reason and write diagnostic entry
+ EXIT_REASON=$( <check artifact exists> && echo "success" || echo "error" )
+ node -e "
+ const dh = require('${REPO_ROOT}/lib/diagnostic-hooks.cjs');
+ dh.afterAgentSpawn({
+ diagId: '${DIAG_ID}',
+ exitReason: '${EXIT_REASON}',
+ repoRoot: '${REPO_ROOT}'
+ });
+ " 2>/dev/null || true
+ ```
+
+ **Key design principles:**
+ - **Non-blocking:** All hook calls are wrapped in `2>/dev/null || true` (bash) or
+ try/catch (JS). If diagnostic capture fails, pipeline continues normally.
+ - **No prompt storage:** Only a hash of the prompt is stored, not the full text.
+ Use `shortHash()` from `agent-diagnostics.cjs` for longer prompts.
+ - **Exit reason detection:** Use artifact existence checks (e.g., `PLAN.md` exists
+ means planner succeeded) rather than relying on Task() return values.
+ - **Graceful degradation:** If `agent-diagnostics.cjs` is not available (dependency
+ PR not merged), `diagnostic-hooks.cjs` logs a warning and returns empty handles.
+
+ **Diagnostic entries are written to:** `.mgw/diagnostics/<issueNumber>-<timestamp>.json`
+
+ **Instrumented agent spawns in mgw:run:**
+
+ | Agent | File | Step |
+ |-------|------|------|
+ | Comment classifier | `run/triage.md` | preflight_comment_check |
+ | Planner (quick) | `run/execute.md` | execute_gsd_quick step 3 |
+ | Plan-checker (quick) | `run/execute.md` | execute_gsd_quick step 6 |
+ | Executor (quick) | `run/execute.md` | execute_gsd_quick step 7 |
+ | Verifier (quick) | `run/execute.md` | execute_gsd_quick step 9 |
+ | Planner (milestone) | `run/execute.md` | execute_gsd_milestone step b |
+ | Executor (milestone) | `run/execute.md` | execute_gsd_milestone step d |
+ | Verifier (milestone) | `run/execute.md` | execute_gsd_milestone step e |
+ | PR creator | `run/pr-create.md` | create_pr |
+
  ## Anti-Patterns
 
  - **NEVER** use Skill invocation from within a Task() agent — Skills don't resolve inside subagents
@@ -227,13 +227,186 @@ File: `.mgw/active/<number>-<slug>.json`
  "gsd_route": null,
  "gsd_artifacts": { "type": null, "path": null },
  "pipeline_stage": "new|triaged|needs-info|needs-security-review|discussing|approved|planning|diagnosing|executing|verifying|pr-created|done|failed|blocked",
  "comments_posted": [],
  "linked_pr": null,
  "linked_issues": [],
- "linked_branches": []
+ "linked_branches": [],
+ "checkpoint": null
  }
  ```
 
+ ## Checkpoint Schema
240
+
241
+ The `checkpoint` field in `.mgw/active/<number>-<slug>.json` tracks fine-grained pipeline
242
+ execution progress. It enables resume after failures, context switches, or multi-session
243
+ execution. The field is `null` until pipeline execution begins (set during the triage-to-
244
+ executing transition).
245
+
246
+ ### Checkpoint Object Structure
247
+
248
+ ```json
249
+ {
250
+ "checkpoint": {
251
+ "schema_version": 1,
252
+ "pipeline_step": "triage|plan|execute|verify|pr",
253
+ "step_progress": {},
254
+ "last_agent_output": null,
255
+ "artifacts": [],
256
+ "resume": {
257
+ "action": null,
258
+ "context": {}
259
+ },
260
+ "started_at": "2026-03-06T12:00:00Z",
261
+ "updated_at": "2026-03-06T12:05:00Z",
262
+ "step_history": []
263
+ }
264
+ }
265
+ ```
266
+
267
+ ### Checkpoint Fields
268
+
269
+ | Field | Type | Default | Description |
270
+ |-------|------|---------|-------------|
271
+ | `schema_version` | integer | `1` | Schema version for forward-compatibility. Consumers check this before parsing. New fields can be added without bumping; bump only for breaking structural changes. |
272
+ | `pipeline_step` | string | `"triage"` | Current high-level pipeline step. Values: `"triage"`, `"plan"`, `"execute"`, `"verify"`, `"pr"`. Maps to GSD lifecycle stages but at a coarser grain than `pipeline_stage`. |
273
+ | `step_progress` | object | `{}` | Step-specific progress data. Shape varies by `pipeline_step` (see Step Progress Shapes below). Unknown keys are preserved on read -- consumers must not strip unrecognized fields. |
274
+ | `last_agent_output` | string\|null | `null` | File path (relative to repo root) of the last successful agent output. Updated after each agent spawn completes. Used for resume context injection. |
275
+ | `artifacts` | array | `[]` | Accumulated artifact paths produced during this pipeline run. Each entry is `{ "path": "relative/path", "type": "plan\|summary\|verification\|commit", "created_at": "ISO" }`. Append-only -- never remove entries. |
276
+ | `resume` | object | `{ "action": null, "context": {} }` | Instructions for resuming execution. `action` is a string describing what to do next (e.g., `"spawn-executor"`, `"retry-verifier"`, `"create-pr"`). `context` carries step-specific data needed for resume (e.g., `{ "phase_number": 3, "plan_path": ".planning/..." }`). |
277
+ | `started_at` | string | ISO timestamp | When checkpoint tracking began for this pipeline run. |
278
+ | `updated_at` | string | ISO timestamp | When the checkpoint was last modified. Updated on every checkpoint write. |
279
+ | `step_history` | array | `[]` | Ordered log of completed steps. Each entry: `{ "step": "plan", "completed_at": "ISO", "agent_type": "gsd-planner", "output_path": "..." }`. Append-only. |
280
+
281
+ ### Step Progress Shapes
282
+
283
+ The `step_progress` object has a different shape depending on the current `pipeline_step`.
284
+ These are the documented shapes; future pipeline steps can define their own without breaking
285
+ existing consumers (unknown keys are preserved).
286
+
287
+ **When `pipeline_step` is `"triage"`:**
288
+ ```json
289
+ {
290
+ "comment_check_done": false,
291
+ "route_selected": null
292
+ }
293
+ ```
294
+
295
+ **When `pipeline_step` is `"plan"`:**
296
+ ```json
297
+ {
298
+ "plan_path": null,
299
+ "plan_checked": false,
300
+ "revision_count": 0
301
+ }
302
+ ```
303
+
304
+ **When `pipeline_step` is `"execute"`:**
305
+ ```json
306
+ {
307
+ "gsd_phase": null,
308
+ "total_phases": null,
309
+ "current_task": null,
310
+ "tasks_completed": 0,
311
+ "tasks_total": null,
312
+ "commits": []
313
+ }
314
+ ```
315
+
316
+ **When `pipeline_step` is `"verify"`:**
317
+ ```json
318
+ {
319
+ "verification_path": null,
320
+ "must_haves_checked": false,
321
+ "artifact_check_done": false,
322
+ "keylink_check_done": false
323
+ }
324
+ ```
325
+
326
+ **When `pipeline_step` is `"pr"`:**
327
+ ```json
328
+ {
329
+ "branch_pushed": false,
330
+ "pr_number": null,
331
+ "pr_url": null
332
+ }
333
+ ```
334
+
335
+ ### Forward Compatibility Contract
336
+
337
+ 1. **New fields can be added** to the checkpoint object at any level without incrementing
338
+ `schema_version`. Consumers must tolerate unknown fields (preserve on read-modify-write,
339
+ ignore on read-only access).
340
+
341
+ 2. **New `pipeline_step` values** can be introduced freely. Existing step_progress shapes
342
+ are not affected. The `step_progress` for an unrecognized step should be treated as an
343
+ opaque object (pass through unchanged).
344
+
345
+ 3. **`schema_version` bump** is required only when an existing field changes its type,
346
+ semantics, or is removed. When bumped, `migrateProjectState()` in `lib/state.cjs` must
347
+ handle the migration.
348
+
349
+ 4. **`artifacts` and `step_history` are append-only**. Consumers should never modify or
350
+ remove entries from these arrays. They may be compacted during archival (when pipeline
351
+ reaches `done` stage and state moves to `.mgw/completed/`).
352
+
353
+ 5. **`resume.context` is opaque** to all consumers except the specific resume handler for
354
+ the given `resume.action`. This allows step-specific resume data to evolve independently.
355
+
356
+ ### Checkpoint Lifecycle
357
+
358
+ ```
359
+ triage (checkpoint initialized, pipeline_step="triage")
360
+ |
361
+ v
362
+ plan (pipeline_step="plan", step_progress tracks planning state)
363
+ |
364
+ v
365
+ execute (pipeline_step="execute", step_progress tracks GSD phase/task progress)
366
+ |
367
+ v
368
+ verify (pipeline_step="verify", step_progress tracks verification checks)
369
+ |
370
+ v
371
+ pr (pipeline_step="pr", step_progress tracks PR creation)
372
+ |
373
+ v
374
+ done (checkpoint frozen — archived to .mgw/completed/)
375
+ ```
376
+
377
+ ### Checkpoint Update Pattern
378
+
379
+ ```bash
380
+ # Update checkpoint at key pipeline stages using updateCheckpoint()
381
+ node -e "
382
+ const { updateCheckpoint } = require('./lib/state.cjs');
383
+ updateCheckpoint(${ISSUE_NUMBER}, {
384
+ pipeline_step: 'execute',
385
+ step_progress: {
386
+ gsd_phase: ${PHASE_NUMBER},
387
+ tasks_completed: ${COMPLETED},
388
+ tasks_total: ${TOTAL}
389
+ },
390
+ last_agent_output: '${OUTPUT_PATH}',
391
+ resume: {
392
+ action: 'continue-execution',
393
+ context: { phase_number: ${PHASE_NUMBER} }
394
+ }
395
+ });
396
+ "
397
+ ```
398
+
399
+ ### Consumers
400
+
401
+ | Consumer | Access Pattern |
402
+ |----------|---------------|
403
+ | run/triage.md | Initialize checkpoint at triage (`pipeline_step: "triage"`) |
404
+ | run/execute.md | Update checkpoint after each agent spawn (`pipeline_step: "plan"\|"execute"\|"verify"`) |
405
+ | run/pr-create.md | Update checkpoint at PR creation (`pipeline_step: "pr"`) |
406
+ | milestone.md | Read checkpoint to determine resume point for failed issues |
407
+ | status.md | Read checkpoint for detailed progress display |
408
+ | sync.md | Compare checkpoint state against GitHub for drift detection |
409
+
237
410
  ## Stage Flow Diagram
238
411
 
239
412
  ```
@@ -264,6 +437,84 @@ blocked --> triaged (re-triage after blocker resolved)
  Any stage --> failed (unrecoverable error)
  ```
 
+ ## Pipeline Checkpoints
+
+ Fine-grained pipeline progress tracking within `.mgw/active/<number>-<slug>.json`.
+ The `checkpoint` field starts as `null` and is initialized when the pipeline first
+ transitions past triage. Each subsequent stage writes an atomic checkpoint update.
+
+ ### Checkpoint Schema
+
+ ```json
+ {
+ "checkpoint": {
+ "schema_version": 1,
+ "pipeline_step": "triage|plan|execute|verify|pr",
+ "step_progress": {},
+ "last_agent_output": null,
+ "artifacts": [],
+ "resume": { "action": null, "context": {} },
+ "started_at": "ISO timestamp",
+ "updated_at": "ISO timestamp",
+ "step_history": []
+ }
+ }
+ ```
+
+ | Field | Type | Merge Strategy | Description |
+ |-------|------|---------------|-------------|
+ | `schema_version` | number | — | Checkpoint format version (currently 1) |
+ | `pipeline_step` | string | overwrite | Current pipeline step: `triage`, `plan`, `execute`, `verify`, `pr` |
+ | `step_progress` | object | shallow merge | Step-specific progress (e.g., `{ plan_path: "...", plan_checked: false }`) |
+ | `last_agent_output` | string\|null | overwrite | Path or URL of the last agent's output |
+ | `artifacts` | array | append-only | `[{ path, type, created_at }]` — never removed, only appended |
+ | `resume` | object | full replace | `{ action, context }` — what to do if pipeline restarts |
+ | `started_at` | string | — | ISO timestamp when checkpoint was first created |
+ | `updated_at` | string | auto | ISO timestamp of last update (set automatically) |
+ | `step_history` | array | append-only | `[{ step, completed_at, agent_type, output_path }]` — audit trail |
+
+ ### Atomic Writes
+
+ All checkpoint writes use `atomicWriteJson()` from `lib/state.cjs`:
+
+ ```bash
+ # atomicWriteJson(filePath, data) — write to .tmp then rename.
+ # POSIX rename is atomic on the same filesystem, so a crash mid-write
+ # never leaves a corrupt state file.
+ ```
+
+ The `updateCheckpoint()` function uses `atomicWriteJson()` internally. Commands
+ should always use `updateCheckpoint()` rather than writing checkpoints directly:
+
+ ```bash
+ node -e "
+ const { updateCheckpoint } = require('./lib/state.cjs');
+ updateCheckpoint(${ISSUE_NUMBER}, {
+ pipeline_step: 'plan',
+ step_progress: { plan_path: '...', plan_checked: false },
+ artifacts: [{ path: '...', type: 'plan', created_at: new Date().toISOString() }],
+ step_history: [{ step: 'plan', completed_at: new Date().toISOString(), agent_type: 'gsd-planner', output_path: '...' }],
+ resume: { action: 'spawn-executor', context: { quick_dir: '...' } }
+ });
+ " 2>/dev/null || true
+ ```
+
+ ### Checkpoint Lifecycle
+
+ | Pipeline Step | Checkpoint `pipeline_step` | Resume Action |
+ |--------------|---------------------------|---------------|
+ | Triage complete | `triage` | `begin-execution` |
+ | Planner complete | `plan` | `run-plan-checker` or `spawn-executor` |
+ | Executor complete | `execute` | `spawn-verifier` or `create-pr` |
+ | Verifier complete | `verify` | `create-pr` |
+ | PR created | `pr` | `cleanup` |
+
+ ### Migration
+
+ `migrateProjectState()` adds the `checkpoint: null` field to any issue state
+ files that predate checkpoint support. The field is initialized lazily — it
+ stays `null` until the pipeline actually runs.
+
  ## Slug Generation
 
  Use gsd-tools for consistent slug generation:
@@ -396,6 +647,91 @@ GSD phase directory (`.planning/phases/{NN}-{slug}/`) to operate in.
  Issues created outside of `/mgw:project` (e.g., manually filed bugs) will not have
  a `phase_number`. In this case, `/mgw:run` falls back to the quick pipeline.
 
+ ## Checkpoint Resume Detection
+
+ When `mgw:run` starts for an issue, the validate_and_load step checks whether a prior
+ pipeline run left a checkpoint with progress beyond the initial triage step. This enables
+ resuming interrupted sessions without re-doing completed work.
+
+ ### Resume Detection Functions (lib/state.cjs)
+
+ | Function | Signature | Returns | Description |
+ |----------|-----------|---------|-------------|
+ | `detectCheckpoint` | `(issueNumber)` | `object\|null` | Checks if active state file has a non-null checkpoint with `pipeline_step` beyond `"triage"`. Returns the checkpoint data if resumable, `null` otherwise. |
+ | `resumeFromCheckpoint` | `(issueNumber)` | `object\|null` | Returns checkpoint data plus computed `resumeStage`, `resumeAction`, and `completedSteps`. Maps `resume.action` to the pipeline stage to jump to. |
+ | `clearCheckpoint` | `(issueNumber)` | `{ cleared: boolean }` | Resets the checkpoint field to `null` in the active state file. Used for "Fresh start" option. |
+
+ ### Resume Action to Stage Mapping
+
+ The `resume.action` field in the checkpoint tells `resumeFromCheckpoint()` which pipeline
+ stage to jump to:
+
+ | resume.action | resumeStage | Meaning |
+ |---------------|-------------|---------|
+ | `run-plan-checker` | `planning` | Plan exists, needs quality check |
+ | `spawn-executor` | `executing` | Plan complete, execute next |
+ | `continue-execution` | `executing` | Mid-execution resume |
+ | `spawn-verifier` | `verifying` | Execution done, verify next |
+ | `create-pr` | `pr-pending` | Verification done, create PR |
+ | `begin-execution` | `planning` | Triage done, begin planning |
+ | `null` / unknown | `planning` | Safe default |
+
+ ### Resume Detection Flow
680
+
681
+ ```
682
+ mgw:run #N starts
683
+ |
684
+ v
685
+ Load state file → migrateProjectState()
686
+ |
687
+ v
688
+ detectCheckpoint(N)
689
+ |
690
+ +---> null (no checkpoint or triage-only) → proceed with normal stage routing
691
+ |
692
+ +---> checkpoint found → display state to user
693
+ |
694
+ v
695
+ AskUserQuestion: Resume / Fresh / Skip
696
+ |
697
+ +---> Resume: load checkpoint context, set RESUME_MODE=true,
698
+ | jump to resume.action stage (skip completed steps)
699
+ |
700
+ +---> Fresh: clearCheckpoint(N), reset pipeline_stage to "triaged",
701
+ | continue normal pipeline
702
+ |
703
+ +---> Skip: exit pipeline for this issue
704
+ ```
705
+
706
+ ### Pipeline Step Order
707
+
708
+ The `CHECKPOINT_STEP_ORDER` constant defines the ordered progression of checkpoint steps:
709
+
710
+ ```
711
+ triage → plan → execute → verify → pr
712
+ ```
713
+
714
+ Only checkpoints with `pipeline_step` at index > 0 (beyond `"triage"`) are considered
715
+ resumable. A checkpoint at `"triage"` means nothing meaningful has been completed yet.
716
+
717
+ ### Resume Context
718
+
719
+ When resuming, the `resume.context` object carries step-specific data needed by the
720
+ target stage. The context shape varies by `resume.action`:
721
+
722
+ | resume.action | Context fields |
723
+ |---------------|----------------|
724
+ | `spawn-executor` | `{ quick_dir, plan_num }` |
725
+ | `run-plan-checker` | `{ quick_dir, plan_num }` |
726
+ | `spawn-verifier` | `{ quick_dir, plan_num }` |
727
+ | `create-pr` | `{ quick_dir, plan_num }` |
728
+ | `continue-execution` | `{ phase_number }` |
729
+ | `begin-execution` | `{ gsd_route, branch }` |
730
+
731
+ Downstream pipeline stages read `resume.context` to pick up where the prior run left
732
+ off. For example, the executor stage uses `quick_dir` and `plan_num` to locate the
733
+ existing plan files rather than re-creating them.
734
+
399
735
  ## Consumers
400
736
 
401
737
  | Pattern | Referenced By |
@@ -410,3 +746,6 @@ a `phase_number`. In this case, `/mgw:run` falls back to the quick pipeline.
  | Project state | milestone.md, next.md, ask.md |
  | Gate result schema | issue.md (populate), run.md (validate) |
  | Board status sync | board-sync.md (utility), issue.md (triage transitions), run.md (pipeline transitions) |
+ | Checkpoint resume | run.md (detect + prompt), milestone.md (detect resume point for failed issues) |
+ | Checkpoint writes | triage.md (init), execute.md (plan/execute/verify), pr-create.md (pr) |
+ | Atomic writes | lib/state.cjs (`atomicWriteJson`, `updateCheckpoint`) |
package/dist/bin/mgw.cjs CHANGED
@@ -1,7 +1,7 @@
  #!/usr/bin/env node
  'use strict';
 
- var index = require('../index-B-_JvYpz.cjs');
+ var index = require('../index-CHrVAIMY.cjs');
  var require$$0 = require('commander');
  var require$$1 = require('path');
  var require$$2 = require('fs');
@@ -11,7 +11,7 @@ require('events');
 
  var mgw$1 = {};
 
- var version = "0.4.0";
+ var version = "0.6.0";
  var require$$11 = {
  version: version};