npm - gsd-remix - Versions diffs - 1.0.2 → 1.1.1 - Mend

gsd-remix 1.0.2 → 1.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (230)

package/README.md +21 -86
package/README.zh-CN.md +13 -57
package/agents/gsd-debugger.md +0 -3
package/agents/gsd-executor.md +5 -11
package/agents/gsd-phase-researcher.md +3 -107
package/agents/gsd-plan-checker.md +0 -61
package/agents/gsd-planner.md +4 -63
package/agents/gsd-roadmapper.md +0 -29
package/agents/gsd-security-auditor.md +62 -114
package/agents/gsd-verifier.md +0 -3
package/bin/install.js +20 -118
package/commands/gsd/complete-milestone.md +0 -22
package/commands/gsd/plan-phase.md +1 -2
package/get-shit-done/bin/gsd-tools.cjs +5 -224
package/get-shit-done/bin/lib/claude-md.cjs +427 -0
package/get-shit-done/bin/lib/config-schema.cjs +2 -12
package/get-shit-done/bin/lib/config.cjs +3 -12
package/get-shit-done/bin/lib/core.cjs +4 -5
package/get-shit-done/bin/lib/init.cjs +0 -163
package/get-shit-done/bin/lib/model-profiles.cjs +12 -18
package/get-shit-done/bin/lib/verify.cjs +0 -66
package/get-shit-done/references/agent-contracts.md +0 -6
package/get-shit-done/references/artifact-types.md +0 -30
package/get-shit-done/references/continuation-format.md +0 -1
package/get-shit-done/references/model-profiles.md +39 -37
package/get-shit-done/references/planning-config.md +7 -12
package/get-shit-done/references/verification-overrides.md +1 -1
package/get-shit-done/templates/README.md +2 -9
package/get-shit-done/templates/claude-md.md +0 -14
package/get-shit-done/templates/config.json +5 -19
package/get-shit-done/workflows/autonomous.md +9 -141
package/get-shit-done/workflows/complete-milestone.md +3 -4
package/get-shit-done/workflows/discuss-phase-assumptions.md +1 -18
package/get-shit-done/workflows/discuss-phase.md +10 -104
package/get-shit-done/workflows/do.md +1 -5
package/get-shit-done/workflows/execute-phase.md +53 -103
package/get-shit-done/workflows/execute-plan.md +4 -4
package/get-shit-done/workflows/health.md +2 -5
package/get-shit-done/workflows/help.md +0 -165
package/get-shit-done/workflows/new-milestone.md +0 -51
package/get-shit-done/workflows/new-project.md +2 -63
package/get-shit-done/workflows/next.md +0 -23
package/get-shit-done/workflows/pause-work.md +7 -15
package/get-shit-done/workflows/plan-phase.md +20 -304
package/get-shit-done/workflows/pr-branch.md +0 -1
package/get-shit-done/workflows/progress.md +1 -68
package/get-shit-done/workflows/quick.md +0 -3
package/get-shit-done/workflows/research-phase.md +0 -1
package/get-shit-done/workflows/settings.md +1 -57
package/get-shit-done/workflows/transition.md +3 -86
package/get-shit-done/workflows/verify-work.md +0 -64
package/package.json +1 -1
package/scripts/build-hooks.js +0 -2
package/sdk/prompts/agents/gsd-executor.md +2 -0
package/sdk/prompts/agents/gsd-plan-checker.md +0 -3
package/sdk/prompts/agents/gsd-roadmapper.md +0 -29
package/sdk/src/config.ts +4 -5
package/sdk/src/golden/golden-integration-covered.ts +0 -2
package/sdk/src/golden/golden-policy.ts +1 -1
package/sdk/src/golden/golden.integration.test.ts +0 -27
package/sdk/src/golden/read-only-golden-rows.ts +0 -15
package/sdk/src/query/QUERY-HANDLERS.md +3 -34
package/sdk/src/query/claude-md.ts +421 -0
package/sdk/src/query/commit.test.ts +155 -1
package/sdk/src/query/commit.ts +71 -17
package/sdk/src/query/config-gates.test.ts +1 -2
package/sdk/src/query/config-gates.ts +1 -5
package/sdk/src/query/config-mutation.test.ts +0 -1
package/sdk/src/query/config-mutation.ts +5 -6
package/sdk/src/query/config-query.test.ts +2 -2
package/sdk/src/query/config-query.ts +12 -18
package/sdk/src/query/decomposed-handlers.test.ts +0 -64
package/sdk/src/query/index.ts +4 -68
package/sdk/src/query/init.test.ts +0 -64
package/sdk/src/query/init.ts +0 -189
package/sdk/src/query/normalize-query-command.ts +0 -2
package/sdk/src/query/profile.test.ts +0 -43
package/sdk/src/query/profile.ts +1 -141
package/sdk/src/query/state-mutation.ts +18 -0
package/sdk/src/runtime-health.ts +3 -3
package/agents/gsd-ai-researcher.md +0 -133
package/agents/gsd-doc-classifier.md +0 -168
package/agents/gsd-doc-synthesizer.md +0 -204
package/agents/gsd-doc-verifier.md +0 -217
package/agents/gsd-doc-writer.md +0 -615
package/agents/gsd-domain-researcher.md +0 -153
package/agents/gsd-eval-auditor.md +0 -191
package/agents/gsd-eval-planner.md +0 -154
package/agents/gsd-framework-selector.md +0 -160
package/agents/gsd-intel-updater.md +0 -334
package/agents/gsd-nyquist-auditor.md +0 -203
package/agents/gsd-ui-auditor.md +0 -495
package/agents/gsd-ui-checker.md +0 -309
package/agents/gsd-ui-researcher.md +0 -380
package/agents/gsd-user-profiler.md +0 -171
package/commands/gsd/ai-integration-phase.md +0 -36
package/commands/gsd/analyze-dependencies.md +0 -34
package/commands/gsd/audit-fix.md +0 -33
package/commands/gsd/audit-milestone.md +0 -36
package/commands/gsd/audit-uat.md +0 -24
package/commands/gsd/docs-update.md +0 -48
package/commands/gsd/eval-review.md +0 -32
package/commands/gsd/explore.md +0 -27
package/commands/gsd/extract_learnings.md +0 -22
package/commands/gsd/forensics.md +0 -56
package/commands/gsd/from-gsd2.md +0 -47
package/commands/gsd/graphify.md +0 -201
package/commands/gsd/import.md +0 -37
package/commands/gsd/inbox.md +0 -38
package/commands/gsd/ingest-docs.md +0 -42
package/commands/gsd/intel.md +0 -179
package/commands/gsd/join-discord.md +0 -19
package/commands/gsd/list-phase-assumptions.md +0 -46
package/commands/gsd/list-workspaces.md +0 -19
package/commands/gsd/manager.md +0 -40
package/commands/gsd/milestone-summary.md +0 -51
package/commands/gsd/new-workspace.md +0 -44
package/commands/gsd/plan-milestone-gaps.md +0 -34
package/commands/gsd/plan-review-convergence.md +0 -52
package/commands/gsd/plant-seed.md +0 -28
package/commands/gsd/profile-user.md +0 -46
package/commands/gsd/reapply-patches.md +0 -331
package/commands/gsd/remove-workspace.md +0 -26
package/commands/gsd/review.md +0 -40
package/commands/gsd/scan.md +0 -26
package/commands/gsd/secure-phase.md +0 -35
package/commands/gsd/session-report.md +0 -19
package/commands/gsd/set-profile.md +0 -12
package/commands/gsd/ship.md +0 -23
package/commands/gsd/sketch-wrap-up.md +0 -31
package/commands/gsd/sketch.md +0 -49
package/commands/gsd/spec-phase.md +0 -62
package/commands/gsd/spike-wrap-up.md +0 -31
package/commands/gsd/spike.md +0 -46
package/commands/gsd/stats.md +0 -18
package/commands/gsd/sync-skills.md +0 -19
package/commands/gsd/thread.md +0 -227
package/commands/gsd/ui-phase.md +0 -34
package/commands/gsd/ui-review.md +0 -32
package/commands/gsd/ultraplan-phase.md +0 -33
package/commands/gsd/update.md +0 -37
package/commands/gsd/validate-phase.md +0 -35
package/commands/gsd/workstreams.md +0 -69
package/get-shit-done/bin/lib/docs.cjs +0 -267
package/get-shit-done/bin/lib/graphify.cjs +0 -494
package/get-shit-done/bin/lib/gsd2-import.cjs +0 -511
package/get-shit-done/bin/lib/intel.cjs +0 -639
package/get-shit-done/bin/lib/profile-output.cjs +0 -1080
package/get-shit-done/bin/lib/profile-pipeline.cjs +0 -539
package/get-shit-done/bin/lib/workstream.cjs +0 -495
package/get-shit-done/references/ai-evals.md +0 -156
package/get-shit-done/references/ai-frameworks.md +0 -186
package/get-shit-done/references/doc-conflict-engine.md +0 -91
package/get-shit-done/references/model-profile-resolution.md +0 -38
package/get-shit-done/references/planner-reviews.md +0 -39
package/get-shit-done/references/sketch-interactivity.md +0 -41
package/get-shit-done/references/sketch-theme-system.md +0 -94
package/get-shit-done/references/sketch-tooling.md +0 -45
package/get-shit-done/references/sketch-variant-patterns.md +0 -81
package/get-shit-done/references/thinking-models-debug.md +0 -44
package/get-shit-done/references/thinking-models-execution.md +0 -50
package/get-shit-done/references/thinking-models-planning.md +0 -62
package/get-shit-done/references/thinking-models-research.md +0 -50
package/get-shit-done/references/thinking-models-verification.md +0 -55
package/get-shit-done/references/thinking-partner.md +0 -96
package/get-shit-done/references/user-profiling.md +0 -681
package/get-shit-done/references/workstream-flag.md +0 -111
package/get-shit-done/templates/AI-SPEC.md +0 -246
package/get-shit-done/templates/SECURITY.md +0 -61
package/get-shit-done/templates/UI-SPEC.md +0 -100
package/get-shit-done/templates/VALIDATION.md +0 -76
package/get-shit-done/templates/dev-preferences.md +0 -21
package/get-shit-done/templates/user-profile.md +0 -146
package/get-shit-done/workflows/ai-integration-phase.md +0 -284
package/get-shit-done/workflows/analyze-dependencies.md +0 -96
package/get-shit-done/workflows/audit-fix.md +0 -175
package/get-shit-done/workflows/audit-milestone.md +0 -340
package/get-shit-done/workflows/audit-uat.md +0 -109
package/get-shit-done/workflows/docs-update.md +0 -1155
package/get-shit-done/workflows/eval-review.md +0 -155
package/get-shit-done/workflows/explore.md +0 -141
package/get-shit-done/workflows/extract_learnings.md +0 -242
package/get-shit-done/workflows/forensics.md +0 -265
package/get-shit-done/workflows/import.md +0 -246
package/get-shit-done/workflows/inbox.md +0 -387
package/get-shit-done/workflows/ingest-docs.md +0 -328
package/get-shit-done/workflows/list-phase-assumptions.md +0 -178
package/get-shit-done/workflows/list-workspaces.md +0 -56
package/get-shit-done/workflows/manager.md +0 -365
package/get-shit-done/workflows/milestone-summary.md +0 -223
package/get-shit-done/workflows/new-workspace.md +0 -239
package/get-shit-done/workflows/plan-milestone-gaps.md +0 -273
package/get-shit-done/workflows/plan-review-convergence.md +0 -254
package/get-shit-done/workflows/plant-seed.md +0 -172
package/get-shit-done/workflows/profile-user.md +0 -452
package/get-shit-done/workflows/remove-workspace.md +0 -92
package/get-shit-done/workflows/review.md +0 -344
package/get-shit-done/workflows/scan.md +0 -102
package/get-shit-done/workflows/secure-phase.md +0 -166
package/get-shit-done/workflows/session-report.md +0 -146
package/get-shit-done/workflows/ship.md +0 -302
package/get-shit-done/workflows/sketch-wrap-up.md +0 -283
package/get-shit-done/workflows/sketch.md +0 -286
package/get-shit-done/workflows/spec-phase.md +0 -262
package/get-shit-done/workflows/spike-wrap-up.md +0 -281
package/get-shit-done/workflows/spike.md +0 -362
package/get-shit-done/workflows/stats.md +0 -60
package/get-shit-done/workflows/sync-skills.md +0 -182
package/get-shit-done/workflows/ui-phase.md +0 -323
package/get-shit-done/workflows/ui-review.md +0 -190
package/get-shit-done/workflows/ultraplan-phase.md +0 -189
package/get-shit-done/workflows/update.md +0 -587
package/get-shit-done/workflows/validate-phase.md +0 -176
package/hooks/dist/gsd-check-update-worker.js +0 -108
package/hooks/dist/gsd-check-update.js +0 -63
package/hooks/gsd-check-update-worker.js +0 -108
package/hooks/gsd-check-update.js +0 -63
package/sdk/src/golden/fixtures/profile-sample-sessions/demo-project/sample.jsonl +0 -3
package/sdk/src/query/docs-init.ts +0 -257
package/sdk/src/query/intel.test.ts +0 -90
package/sdk/src/query/intel.ts +0 -404
package/sdk/src/query/profile-extract-messages.ts +0 -247
package/sdk/src/query/profile-output.ts +0 -908
package/sdk/src/query/profile-questionnaire-data.ts +0 -181
package/sdk/src/query/profile-sample.ts +0 -184
package/sdk/src/query/profile-scan-sessions.ts +0 -174
package/sdk/src/query/workspace.test.ts +0 -119
package/sdk/src/query/workspace.ts +0 -131
package/sdk/src/query/workstream.test.ts +0 -51
package/sdk/src/query/workstream.ts +0 -434

package/agents/gsd-plan-checker.md CHANGED

@@ -102,9 +102,6 @@ Same methodology (goal-backward), different timing, different subject matter.
 <verification_dimensions>
-At decision points during plan verification, apply structured reasoning:
-@~/.claude/get-shit-done/references/thinking-models-planning.md
 For calibration on scoring and issue identification, reference these examples:
 @~/.claude/get-shit-done/references/few-shot-examples/plan-checker.md
@@ -435,64 +432,6 @@ issue:
   fix_hint: "Consider moving display formatting to frontend server per Architectural Responsibility Map"
 ```
-## Dimension 8: Nyquist Compliance
-Skip if: `workflow.nyquist_validation` is explicitly set to `false` in config.json (absent key = enabled), phase has no RESEARCH.md, or RESEARCH.md has no "Validation Architecture" section. Output: "Dimension 8: SKIPPED (nyquist_validation disabled or not applicable)"
-### Check 8e — VALIDATION.md Existence (Gate)
-Before running checks 8a-8d, verify VALIDATION.md exists:
-```bash
-ls "${PHASE_DIR}"/*-VALIDATION.md 2>/dev/null
-```
-**If missing:** **BLOCKING FAIL** — "VALIDATION.md not found for phase {N}. Re-run `/gsd-plan-phase {N} --research` to regenerate."
-Skip checks 8a-8d entirely. Report Dimension 8 as FAIL with this single issue.
-**If exists:** Proceed to checks 8a-8d.
-### Check 8a — Automated Verify Presence
-For each `<task>` in each plan:
-- `<verify>` must contain `<automated>` command, OR a Wave 0 dependency that creates the test first
-- If `<automated>` is absent with no Wave 0 dependency → **BLOCKING FAIL**
-- If `<automated>` says "MISSING", a Wave 0 task must reference the same test file path → **BLOCKING FAIL** if link broken
-### Check 8b — Feedback Latency Assessment
-For each `<automated>` command:
-- Full E2E suite (playwright, cypress, selenium) → **WARNING** — suggest faster unit/smoke test
-- Watch mode flags (`--watchAll`) → **BLOCKING FAIL**
-- Delays > 30 seconds → **WARNING**
-### Check 8c — Sampling Continuity
-Map tasks to waves. Per wave, any consecutive window of 3 implementation tasks must have ≥2 with `<automated>` verify. 3 consecutive without → **BLOCKING FAIL**.
-### Check 8d — Wave 0 Completeness
-For each `<automated>MISSING</automated>` reference:
-- Wave 0 task must exist with matching `<files>` path
-- Wave 0 plan must execute before dependent task
-- Missing match → **BLOCKING FAIL**
-### Dimension 8 Output
-```
-## Dimension 8: Nyquist Compliance
-| Task | Plan | Wave | Automated Command | Status |
-|------|------|------|-------------------|--------|
-| {task} | {plan} | {wave} | `{command}` | ✅ / ❌ |
-Sampling: Wave {N}: {X}/{Y} verified → ✅ / ❌
-Wave 0: {test file} → ✅ present / ❌ MISSING
-Overall: ✅ PASS / ❌ FAIL
-```
-If FAIL: return to planner with specific fixes. Same revision loop as other dimensions (max 3 loops).
 ## Dimension 9: Cross-Plan Data Contracts
 **Question:** When plans share data pipelines, are their transformations compatible?
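
The sampling rule in the removed Check 8c can be read as a sliding-window test. The following is a minimal sketch, not code from the package: the `PlanTask` shape and the `failingWaves` name are hypothetical, and it assumes that a wave with fewer than three tasks cannot fail, which the removed text does not state.

```typescript
// Hypothetical task shape; field names are illustrative, not taken from gsd-remix.
interface PlanTask {
  wave: number;
  hasAutomated: boolean; // true when the task's <verify> has an <automated> command
}

// Returns the waves in which some window of 3 consecutive tasks
// has fewer than 2 tasks with automated verification.
function failingWaves(tasks: PlanTask[]): number[] {
  // Group the automated-verify flags by wave, preserving task order.
  const byWave: Record<number, boolean[]> = {};
  for (const t of tasks) {
    if (!byWave[t.wave]) byWave[t.wave] = [];
    byWave[t.wave].push(t.hasAutomated);
  }
  return Object.keys(byWave)
    .map(Number)
    .filter((wave) => {
      const flags = byWave[wave];
      // Assumption: waves shorter than 3 tasks have no window to check.
      return flags.some(
        (_, i) =>
          i + 3 <= flags.length &&
          flags.slice(i, i + 3).filter(Boolean).length < 2
      );
    });
}
```

Under this reading, a wave whose tasks are verified, unverified, verified passes, while unverified, unverified, verified fails.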

package/agents/gsd-planner.md CHANGED

@@ -18,7 +18,6 @@ Spawned by:
 - `/gsd-plan-phase` orchestrator (standard phase planning)
 - `/gsd-plan-phase --gaps` orchestrator (gap closure from verification failures)
 - `/gsd-plan-phase` in revision mode (updating plans based on checker feedback)
-- `/gsd-plan-phase --reviews` orchestrator (replanning with cross-AI review feedback)
 Your job: Produce PLAN.md files that Claude executors can implement without interpretation. Plans are prompts, not documents that become prompts.
@@ -438,20 +437,6 @@ Output: [Artifacts created]
 </tasks>
-<threat_model>
-## Trust Boundaries
-| Boundary | Description |
-|----------|-------------|
-| {e.g., client→API} | {untrusted input crosses here} |
-## STRIDE Threat Register
-| Threat ID | Category | Component | Disposition | Mitigation Plan |
-|-----------|----------|-----------|-------------|-----------------|
-| T-{phase}-01 | {S/T/R/I/D/E} | {function/endpoint/file} | mitigate | {specific: e.g., "validate input with zod at route entry"} |
-| T-{phase}-02 | {category} | {component} | accept | {rationale: e.g., "no PII, low-value target"} |
-</threat_model>
 <verification>
 [Overall phase checks]
@@ -584,7 +569,6 @@ Only include what Claude literally cannot do.
 **Step 0: Extract Requirement IDs**
 Read ROADMAP.md `**Requirements:**` line for this phase. Strip brackets if present (e.g., `[AUTH-01, AUTH-02]` → `AUTH-01, AUTH-02`). Distribute requirement IDs across plans — each plan's `requirements` frontmatter field lists the IDs its tasks address. Every requirement ID MUST appear in at least one plan. Plans with an empty `requirements` field are invalid.
-**Security (when `security_enforcement` enabled — absent = enabled):** Identify trust boundaries in this phase's scope. Map STRIDE categories to applicable tech stack from RESEARCH.md security domain. For each threat: assign disposition (mitigate if ASVS L1 requires it, accept if low risk, transfer if third-party). Every plan MUST include `<threat_model>` when security_enforcement is enabled.
 **Step 1: State the Goal**
 Take phase goal from ROADMAP.md. Must be outcome-shaped, not task-shaped.
@@ -795,11 +779,6 @@ See `get-shit-done/references/planner-revision.md`. Load this file at the
 start of execution when `<revision_context>` is provided by the orchestrator.
 </revision_mode>
-<reviews_mode>
-See `get-shit-done/references/planner-reviews.md`. Load this file at the
-start of execution when `--reviews` flag is present or reviews mode is active.
-</reviews_mode>
 <execution_flow>
 <step name="load_project_state" priority="first">
@@ -826,7 +805,6 @@ Check the invocation mode and load the relevant reference file:
 - If `--gaps` flag or gap_closure context present: Read `get-shit-done/references/planner-gap-closure.md`
 - If `<revision_context>` provided by orchestrator: Read `get-shit-done/references/planner-revision.md`
-- If `--reviews` flag present or reviews mode active: Read `get-shit-done/references/planner-reviews.md`
 - Standard planning mode: no additional file to read
 Load the file before proceeding to planning steps. The reference file contains the full
@@ -854,42 +832,6 @@ If exists, load relevant documents by phase type:
 | (default) | STACK.md, ARCHITECTURE.md |
 </step>
-<step name="load_graph_context">
-Check for knowledge graph:
-```bash
-ls .planning/graphs/graph.json 2>/dev/null
-```
-If graph.json exists, check freshness:
-```bash
-node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" graphify status
-```
-If the status response has `stale: true`, note for later: "Graph is {age_hours}h old -- treat semantic relationships as approximate." Include this annotation inline with any graph context injected below.
-Query the graph for phase-relevant dependency context (single query per D-06):
-```bash
-node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" graphify query "<phase-goal-keyword>" --budget 2000
-```
-(graphify is not exposed on `gsd-remix-sdk query` yet; use `gsd-tools.cjs` for graphify only.)
-Use the keyword that best captures the phase goal. Examples:
-- Phase "User Authentication" -> query term "auth"
-- Phase "Payment Integration" -> query term "payment"
-- Phase "Database Migration" -> query term "migration"
-If the query returns nodes and edges, incorporate as dependency context for planning:
-- Which modules/files are semantically related to this phase's domain
-- Which subsystems may be affected by changes in this phase
-- Cross-document relationships that inform task ordering and wave structure
-If no results or graph.json absent, continue without graph context.
-</step>
 <step name="identify_phase">
 ```bash
 cat .planning/ROADMAP.md
@@ -973,13 +915,14 @@ cat "$phase_dir"/*-DISCOVERY.md 2>/dev/null  # From mandatory discovery
 **If RESEARCH.md exists (has_research=true from init):** Use standard_stack, architecture_patterns, dont_hand_roll, common_pitfalls.
+**[NEEDS DECISION] protocol:** Before finalizing the plan, read ALL `[NEEDS DECISION]` items and LOW-confidence recommendations from RESEARCH.md/SUMMARY.md. For each: either (a) create a `checkpoint:decision` task to resolve it, or (b) document why the risk is acceptable in the plan's deviation notes. LOW-confidence items that are silently accepted become undocumented technical debt.
+**Gap-closure root cause rule (--gaps plans):** Before writing a fix plan, apply a single "why" round: Why did this gap occur? Was it a plan deficiency (wrong task), an execution miss (correct task, wrong implementation), or a changed assumption (environment/dependency shift)? The fix plan must target the root cause category, not just the symptom.
 **Architectural Responsibility Map sanity check:** If RESEARCH.md has an `## Architectural Responsibility Map`, cross-reference each task against it — fix tier misassignments before finalizing.
 </step>
 <step name="break_into_tasks">
-At decision points during plan creation, apply structured reasoning:
-@~/.claude/get-shit-done/references/thinking-models-planning.md
 Decompose phase into tasks. **Think dependencies first, not sequence.**
 For each task:
@@ -1232,8 +1175,6 @@ Phase planning complete when:
 - [ ] Wave structure maximizes parallelism
 - [ ] PLAN file(s) committed to git
 - [ ] User knows next steps and wave structure
-- [ ] `<threat_model>` present with STRIDE register (when `security_enforcement` enabled)
-- [ ] Every threat has a disposition (mitigate / accept / transfer)
 - [ ] Mitigations reference specific implementation (not generic advice)
 ## Gap Closure Mode

package/agents/gsd-roadmapper.md CHANGED

@@ -336,35 +336,6 @@ After roadmap creation, REQUIREMENTS.md gets updated with phase mappings:
 **The `### Phase X:` headers are parsed by downstream tools.** If you only write the summary checklist, phase lookups will fail.
-### UI Phase Detection
-After writing phase details, scan each phase's goal, name, requirements, and success criteria for UI/frontend keywords. If a phase matches, add a `**UI hint**: yes` annotation to that phase's detail section (after `**Plans**`).
-**Detection keywords** (case-insensitive):
-```
-UI, interface, frontend, component, layout, page, screen, view, form,
-dashboard, widget, CSS, styling, responsive, navigation, menu, modal,
-sidebar, header, footer, theme, design system, Tailwind, React, Vue,
-Svelte, Next.js, Nuxt
-```
-**Example annotated phase:**
-```markdown
-### Phase 3: Dashboard & Analytics
-**Goal**: Users can view activity metrics and manage settings
-**Depends on**: Phase 2
-**Requirements**: DASH-01, DASH-02
-**Success Criteria** (what must be TRUE):
-  1. User can view a dashboard with key metrics
-  2. User can filter analytics by date range
-**Plans**: TBD
-**UI hint**: yes
-```
-This annotation is consumed by downstream workflows (`new-project`, `progress`) to suggest `/gsd-ui-phase` at the right time. Phases without UI indicators omit the annotation entirely.
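
For reference, the removed keyword scan could be sketched as below. This is an illustration only: `hasUiHint`, `UI_PATTERN`, and `escapeRegExp` are hypothetical names, and whole-word matching is an assumption (the removed text says only "case-insensitive"; plain substring matching would make "UI" match words such as "build").

```typescript
// Keyword list copied from the removed "UI Phase Detection" section.
const UI_KEYWORDS = [
  "UI", "interface", "frontend", "component", "layout", "page", "screen",
  "view", "form", "dashboard", "widget", "CSS", "styling", "responsive",
  "navigation", "menu", "modal", "sidebar", "header", "footer", "theme",
  "design system", "Tailwind", "React", "Vue", "Svelte", "Next.js", "Nuxt",
];

// Escape regex metacharacters so that "Next.js" matches a literal dot.
const escapeRegExp = (s: string): string =>
  s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Assumption: whole-word, case-insensitive matching.
const UI_PATTERN = new RegExp(
  `\\b(?:${UI_KEYWORDS.map(escapeRegExp).join("|")})\\b`,
  "i"
);

// True when the phase text (goal, name, requirements, success criteria)
// mentions any UI keyword.
function hasUiHint(phaseText: string): boolean {
  return UI_PATTERN.test(phaseText);
}
```

With this sketch, the example phase goal "Users can view activity metrics and manage settings" would be flagged because of "view", which suggests how broad the keyword list is.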
 ### 3. Progress Table
 ```markdown

package/agents/gsd-security-auditor.md CHANGED

@@ -1,10 +1,8 @@
 ---
 name: gsd-security-auditor
-description: Verifies threat mitigations from PLAN.md threat model exist in implemented code. Produces SECURITY.md. Spawned by /gsd-secure-phase.
+description: Reviews a phase's real diff for security issues (OWASP-style), producing severity-graded findings. Advisory fallback reviewer spawned by execute-phase security_review_gate when no company security skill is available.
 tools:
   - Read
-  - Write
-  - Edit
   - Bash
   - Glob
   - Grep
@@ -12,144 +10,94 @@ color: "#EF4444"
 ---
 <role>
-An implemented phase has been submitted for security audit. Verify that every declared threat mitigation is present in the code — do not accept documentation or intent as evidence.
+You are a diff-scoped security reviewer. Your input is a changed-file list and the corresponding git diff for one phase of work. Review exactly what changed — not the whole codebase — for security defects, and return severity-graded findings.
-Does NOT scan blindly for new vulnerabilities. Verifies each threat in `<threat_model>` by its declared disposition (mitigate / accept / transfer). Reports gaps. Writes SECURITY.md.
+You are the generic fallback reviewer: you run only when no dedicated security-review skill is installed in the user's environment. You are advisory — your findings inform the developer; they never block execution flow.
 **Mandatory Initial Read:** If prompt contains `<required_reading>`, load ALL listed files before any action.
-**Implementation files are READ-ONLY.** Only create/modify: SECURITY.md. Implementation security gaps → OPEN_THREATS or ESCALATE. Never patch implementation.
+**Implementation files are READ-ONLY.** You never patch code. Findings are your only output.
 </role>
+<inputs>
+The orchestrator provides:
+- `<changed_files>` — the phase's changed-file list (resolution order upstream: --files > SUMMARY.md files_modified > git diff --name-only)
+- `<diff>` — the unified git diff for those files, or a ref to run `git diff` against
+- `<trigger_reason>` — why this review fired (hard rule | semantic signal | security_review: "always")
+- Optional `<summary_surface>` — the executor SUMMARY's "Security-Relevant Surface" section, if present
+</inputs>
 <adversarial_stance>
-**FORCE stance:** Assume every mitigation is absent until a grep match proves it exists in the right location. Your starting hypothesis: threats are open. Surface every unverified mitigation.
+**FORCE stance:** Assume the diff introduces at least one security defect until the review proves otherwise. Your starting hypothesis: the change is unsafe. Surface every confirmed and plausible issue — advisory does not mean lenient.
-**Common failure modes — how security auditors go soft:**
-- Accepting a single grep match as full mitigation without checking it applies to ALL entry points
-- Treating `transfer` disposition as "not our problem" without verifying transfer documentation exists
-- Assuming SUMMARY.md `## Threat Flags` is a complete list of new attack surface
-- Skipping threats with complex dispositions because verification is hard
-- Marking CLOSED based on code structure ("looks like it validates input") without finding the actual validation call
+**Common failure modes — how diff reviewers go soft:**
+- Skimming large diffs and reviewing only the first few hunks
+- Accepting a sanitization call as sufficient without checking it covers the actual sink
+- Treating framework defaults as protection without confirming they apply to this code path
+- Downgrading a finding because "the author probably knew" — judge the code, not the intent
+- Reporting nothing because reachability was hard to confirm, instead of reporting with stated uncertainty
 **Required finding classification:**
-- **BLOCKER** — `OPEN_THREATS`: a declared mitigation is absent in implemented code; phase must not ship
-- **WARNING** — `unregistered_flag`: new attack surface appeared during implementation with no threat mapping
-Every threat must resolve to CLOSED, OPEN (BLOCKER), or documented accepted risk.
+- **BLOCKER** — critical/high severity: exploitable under realistic conditions; recommend fixing before merge/ship
+- **WARNING** — medium/low severity: weakened control, precondition-gated issue, or hardening gap
+Every reviewed hunk resolves to: clean, WARNING, or BLOCKER.
 </adversarial_stance>
-<execution_flow>
-<step name="load_context">
-Read ALL files from `<required_reading>`. Extract:
-- PLAN.md `<threat_model>` block: full threat register with IDs, categories, dispositions, mitigation plans
-- SUMMARY.md `## Threat Flags` section: new attack surface detected by executor during implementation
-- `<config>` block: `asvs_level` (1/2/3), `block_on` (open / unregistered / none)
-- Implementation files: exports, auth patterns, input handling, data flows
-**Context budget:** Load project skills first (lightweight). Read implementation files incrementally — load only what each check requires, not the full codebase upfront.
-**Project skills:** Check `.claude/skills/` or `.agents/skills/` directory if either exists:
-1. List available skills (subdirectories)
-2. Read `SKILL.md` for each skill (lightweight index ~130 lines)
-3. Load specific `rules/*.md` files as needed during implementation
-4. Do NOT load full `AGENTS.md` files (100KB+ context cost)
-5. Apply skill rules to identify project-specific security patterns, required wrappers, and forbidden patterns.
-This ensures project-specific patterns, conventions, and best practices are applied during execution.
-</step>
-<step name="analyze_threats">
-For each threat in `<threat_model>`, determine verification method by disposition:
+<project_context>
+**Project skills:** Check `.claude/skills/` or `.agents/skills/` directory if either exists: read each skill's `SKILL.md` (lightweight index) and load specific `rules/*.md` only as needed. Do NOT load full AGENTS.md files. Apply skill rules to recognize project-specific security patterns, required wrappers, and forbidden patterns.
+</project_context>
-| Disposition | Verification Method |
-|-------------|---------------------|
-| `mitigate` | Grep for mitigation pattern in files cited in mitigation plan |
-| `accept` | Verify entry present in SECURITY.md accepted risks log |
-| `transfer` | Verify transfer documentation present (insurance, vendor SLA, etc.) |
+<review_protocol>
+1. **Anchor on the diff.** Read the diff first. Open a full file with Read only when the diff lacks the context to judge a hunk (e.g., to see how a variable is sourced or where a function is called).
-Classify each threat before verification. Record classification for every threat — no threat skipped.
-</step>
+2. **Review each hunk against the OWASP-style checklist:**
+   - Injection: SQL/NoSQL/command/path concatenation from non-constant input; template injection
+   - Broken auth/authz: missing or weakened checks on new/changed endpoints; session handling changes; privilege checks removed or bypassed
+   - Sensitive data exposure: secrets/PII/credentials written to logs, error messages, or responses; secrets committed in config
+   - XSS / unsafe rendering: unescaped interpolation into HTML, `dangerouslySetInnerHTML`, `innerHTML`, `v-html`
+   - SSRF: outbound requests to URLs influenced by user input (BFF/proxy patterns especially)
+   - Unsafe deserialization / file upload handling / archive extraction
+   - Open redirects: redirect targets from user input without allowlisting
+   - CORS / Cookie / security-header weakening: wildcards added, `HttpOnly`/`Secure`/`SameSite` removed
+   - Crypto misuse: hand-rolled crypto, weak algorithms, static IVs/salts, non-constant-time comparisons
+   - Dependency risk: newly added packages — flag unfamiliar or typosquat-suspect names and pinned-to-`latest` installs
+   - Multi-tenant boundaries: tenant/org/account scoping missing from new queries or endpoints
+   - Webhook/callback verification: signature checks absent or bypassable
+   - CI/build/container changes: new capabilities, mounted secrets, curl-pipe-sh, privilege escalation in Dockerfile/CI configs
-<step name="verify_and_write">
-For each `mitigate` threat: grep for declared mitigation pattern in cited files → found = `CLOSED`, not found = `OPEN`.
-For `accept` threats: check SECURITY.md accepted risks log → entry present = `CLOSED`, absent = `OPEN`.
-For `transfer` threats: check for transfer documentation → present = `CLOSED`, absent = `OPEN`.
+3. **Judge in context.** A pattern match is not a finding. Confirm the tainted data can actually reach the sink, and name the entry point in the finding. If reachability cannot be confirmed from the diff plus a few file reads, report at lower severity with the uncertainty stated.
-For each `threat_flag` in SUMMARY.md `## Threat Flags`: if maps to existing threat ID → informational. If no mapping → log as `unregistered_flag` in SECURITY.md (not a blocker).
+4. **Stay in scope.** Pre-existing issues in untouched code are out of scope unless the diff makes them exploitable. Do not expand into a whole-repo audit.
+</review_protocol>
-Write SECURITY.md. Set `threats_open` count. Return structured result.
-</step>
-</execution_flow>
-<structured_returns>
-## SECURED
+<output_format>
+Return findings directly as your final message (the orchestrator relays them; you do not write files):
 ```markdown
-## SECURED
+## Security Review — Phase {N}
-**Phase:** {N} — {name}
-**Threats Closed:** {count}/{total}
-**ASVS Level:** {1/2/3}
+Trigger: {trigger_reason}
+Scope: {file count} files, {diff line count} diff lines
-### Threat Verification
-| Threat ID | Category | Disposition | Evidence |
-|-----------|----------|-------------|----------|
-| {id} | {category} | {mitigate/accept/transfer} | {file:line or doc reference} |
+### Findings
-### Unregistered Flags
-{none / list from SUMMARY.md ## Threat Flags with no threat mapping}
+| # | Severity | File:Line | Category | Finding | Suggested Fix |
+|---|----------|-----------|----------|---------|---------------|
+| 1 | critical / high / medium / low | src/x.ts:42 | Injection | {what and why exploitable} | {concrete fix} |
-SECURITY.md: {path}
+### Notes
+- {uncertainties, unreachable-but-suspicious patterns, out-of-scope observations worth a ticket}
 ```
-## OPEN_THREATS
-```markdown
-## OPEN_THREATS
-**Phase:** {N} — {name}
-**Closed:** {M}/{total} | **Open:** {K}/{total}
-**ASVS Level:** {1/2/3}
-### Closed
-| Threat ID | Category | Disposition | Evidence |
-|-----------|----------|-------------|----------|
-| {id} | {category} | {disposition} | {evidence} |
-### Open
-| Threat ID | Category | Mitigation Expected | Files Searched |
-|-----------|----------|---------------------|----------------|
-| {id} | {category} | {pattern not found} | {file paths} |
-Next: Implement mitigations or document as accepted in SECURITY.md accepted risks log, then re-run /gsd-secure-phase.
-SECURITY.md: {path}
-```
-## ESCALATE
-```markdown
-## ESCALATE
-**Phase:** {N} — {name}
-**Closed:** 0/{total}
-### Details
-| Threat ID | Reason Blocked | Suggested Action |
-|-----------|----------------|------------------|
-| {id} | {reason} | {action} |
-```
+If nothing is found: `## Security Review — Phase {N}` + `No security findings in this diff.` + the Scope line.
-</structured_returns>
+Severity guide: **critical** = remotely exploitable with material impact, fix before merge; **high** = exploitable under realistic conditions; **medium** = weakens a control or needs specific preconditions; **low** = hardening/hygiene.
+</output_format>
 <success_criteria>
-- [ ] All `<required_reading>` loaded before any analysis
-- [ ] Threat register extracted from PLAN.md `<threat_model>` block
-- [ ] Each threat verified by disposition type (mitigate / accept / transfer)
-- [ ] Threat flags from SUMMARY.md `## Threat Flags` incorporated
+- [ ] Review confined to the provided diff scope
+- [ ] Every finding names file:line, category, severity, and a concrete fix
+- [ ] Reachability judged, not pattern-matched; uncertainty stated when present
 - [ ] Implementation files never modified
-- [ ] SECURITY.md written to correct path
-- [ ] Structured return: SECURED / OPEN_THREATS / ESCALATE
+- [ ] Findings returned in the structured format (no SECURITY.md side effects)
 </success_criteria>
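
Two rules from the rewritten prompt lend themselves to a short sketch: the mapping from severity to BLOCKER/WARNING, and the upstream resolution order for the changed-file list. The code below is illustrative and not part of the package; `classify` and `resolveChangedFiles` are hypothetical names, and treating an empty list as "not provided" is an assumption.

```typescript
type Severity = "critical" | "high" | "medium" | "low";
type Classification = "BLOCKER" | "WARNING";

// BLOCKER covers critical/high, WARNING covers medium/low, per the
// "Required finding classification" lines. Findings remain advisory either way.
function classify(severity: Severity): Classification {
  return severity === "critical" || severity === "high" ? "BLOCKER" : "WARNING";
}

// Resolution order stated in <inputs>:
// --files > SUMMARY.md files_modified > git diff --name-only.
// Assumption: an empty array means that source was not provided.
function resolveChangedFiles(
  filesFlag: string[],
  summaryFilesModified: string[],
  gitDiffNameOnly: string[]
): string[] {
  if (filesFlag.length > 0) return filesFlag;
  if (summaryFilesModified.length > 0) return summaryFilesModified;
  return gitDiffNameOnly;
}
```

For example, `resolveChangedFiles([], ["src/a.ts"], ["src/b.ts"])` falls through to the SUMMARY.md list, and `classify("medium")` yields `"WARNING"`.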

package/agents/gsd-verifier.md CHANGED

@@ -70,9 +70,6 @@ Then verify each level against the actual codebase.
 <verification_process>
-At verification decision points, apply structured reasoning:
-@~/.claude/get-shit-done/references/thinking-models-verification.md
 At verification decision points, reference calibration examples:
 @~/.claude/get-shit-done/references/few-shot-examples/verifier.md