shipwright-cli 2.1.2 → 2.2.1 (npm)

Files changed (136)

package/.claude/agents/devops-engineer.md +14 -12
package/.claude/agents/doc-fleet-agent.md +99 -0
package/.claude/agents/test-specialist.md +5 -3
package/README.md +48 -27
package/claude-code/CLAUDE.md.shipwright +2 -2
package/config/policy.json +73 -0
package/config/policy.schema.json +150 -0
package/docs/AGI-PLATFORM-PLAN.md +126 -0
package/docs/AGI-WHATS-NEXT.md +72 -0
package/docs/KNOWN-ISSUES.md +1 -23
package/docs/PLATFORM-TODO-BACKLOG.md +41 -0
package/docs/PLATFORM-TODO-TRIAGE.md +56 -0
package/docs/README.md +83 -0
package/docs/TIPS.md +39 -2
package/docs/config-policy.md +40 -0
package/docs/definition-of-done.example.md +2 -0
package/docs/patterns/README.md +5 -0
package/docs/strategy/02-mission-and-brand.md +3 -3
package/docs/strategy/README.md +4 -3
package/docs/tmux-research/TMUX-AUDIT.md +2 -0
package/docs/tmux-research/TMUX-RESEARCH-INDEX.md +17 -0
package/package.json +3 -2
package/scripts/lib/daemon-health.sh +32 -0
package/scripts/lib/helpers.sh +30 -1
package/scripts/lib/pipeline-detection.sh +278 -0
package/scripts/lib/pipeline-github.sh +196 -0
package/scripts/lib/pipeline-intelligence.sh +1712 -0
package/scripts/lib/pipeline-quality-checks.sh +1052 -0
package/scripts/lib/pipeline-quality.sh +34 -0
package/scripts/lib/pipeline-stages.sh +2488 -0
package/scripts/lib/pipeline-state.sh +529 -0
package/scripts/lib/policy.sh +32 -0
package/scripts/sw +5 -1
package/scripts/sw-activity.sh +35 -46
package/scripts/sw-adaptive.sh +30 -39
package/scripts/sw-adversarial.sh +30 -36
package/scripts/sw-architecture-enforcer.sh +30 -33
package/scripts/sw-auth.sh +30 -42
package/scripts/sw-autonomous.sh +60 -40
package/scripts/sw-changelog.sh +29 -30
package/scripts/sw-checkpoint.sh +30 -18
package/scripts/sw-ci.sh +30 -42
package/scripts/sw-cleanup.sh +32 -15
package/scripts/sw-code-review.sh +26 -32
package/scripts/sw-connect.sh +30 -19
package/scripts/sw-context.sh +30 -19
package/scripts/sw-cost.sh +30 -40
package/scripts/sw-daemon.sh +66 -36
package/scripts/sw-dashboard.sh +31 -40
package/scripts/sw-db.sh +30 -20
package/scripts/sw-decompose.sh +30 -38
package/scripts/sw-deps.sh +30 -41
package/scripts/sw-developer-simulation.sh +30 -36
package/scripts/sw-discovery.sh +36 -19
package/scripts/sw-doc-fleet.sh +822 -0
package/scripts/sw-docs-agent.sh +30 -36
package/scripts/sw-docs.sh +29 -31
package/scripts/sw-doctor.sh +52 -20
package/scripts/sw-dora.sh +29 -34
package/scripts/sw-durable.sh +30 -20
package/scripts/sw-e2e-orchestrator.sh +36 -21
package/scripts/sw-eventbus.sh +30 -17
package/scripts/sw-feedback.sh +30 -41
package/scripts/sw-fix.sh +30 -40
package/scripts/sw-fleet-discover.sh +30 -41
package/scripts/sw-fleet-viz.sh +30 -20
package/scripts/sw-fleet.sh +30 -40
package/scripts/sw-github-app.sh +30 -41
package/scripts/sw-github-checks.sh +30 -41
package/scripts/sw-github-deploy.sh +30 -41
package/scripts/sw-github-graphql.sh +30 -38
package/scripts/sw-guild.sh +30 -37
package/scripts/sw-heartbeat.sh +30 -19
package/scripts/sw-hygiene.sh +134 -42
package/scripts/sw-incident.sh +30 -39
package/scripts/sw-init.sh +31 -14
package/scripts/sw-instrument.sh +30 -41
package/scripts/sw-intelligence.sh +39 -44
package/scripts/sw-jira.sh +31 -41
package/scripts/sw-launchd.sh +30 -17
package/scripts/sw-linear.sh +31 -41
package/scripts/sw-logs.sh +32 -17
package/scripts/sw-loop.sh +32 -19
package/scripts/sw-memory.sh +32 -43
package/scripts/sw-mission-control.sh +31 -40
package/scripts/sw-model-router.sh +30 -20
package/scripts/sw-otel.sh +30 -20
package/scripts/sw-oversight.sh +30 -36
package/scripts/sw-patrol-meta.sh +31 -0
package/scripts/sw-pipeline-composer.sh +30 -39
package/scripts/sw-pipeline-vitals.sh +30 -44
package/scripts/sw-pipeline.sh +277 -6383
package/scripts/sw-pm.sh +31 -41
package/scripts/sw-pr-lifecycle.sh +30 -42
package/scripts/sw-predictive.sh +32 -34
package/scripts/sw-prep.sh +30 -19
package/scripts/sw-ps.sh +32 -17
package/scripts/sw-public-dashboard.sh +30 -40
package/scripts/sw-quality.sh +42 -40
package/scripts/sw-reaper.sh +32 -15
package/scripts/sw-recruit.sh +428 -48
package/scripts/sw-regression.sh +30 -38
package/scripts/sw-release-manager.sh +30 -38
package/scripts/sw-release.sh +29 -31
package/scripts/sw-remote.sh +31 -40
package/scripts/sw-replay.sh +30 -18
package/scripts/sw-retro.sh +33 -42
package/scripts/sw-scale.sh +41 -24
package/scripts/sw-security-audit.sh +30 -20
package/scripts/sw-self-optimize.sh +33 -37
package/scripts/sw-session.sh +31 -15
package/scripts/sw-setup.sh +30 -16
package/scripts/sw-standup.sh +30 -20
package/scripts/sw-status.sh +33 -13
package/scripts/sw-strategic.sh +55 -43
package/scripts/sw-stream.sh +33 -37
package/scripts/sw-swarm.sh +30 -21
package/scripts/sw-team-stages.sh +30 -38
package/scripts/sw-templates.sh +31 -16
package/scripts/sw-testgen.sh +30 -31
package/scripts/sw-tmux-pipeline.sh +29 -31
package/scripts/sw-tmux-role-color.sh +31 -0
package/scripts/sw-tmux-status.sh +31 -0
package/scripts/sw-tmux.sh +31 -15
package/scripts/sw-trace.sh +30 -19
package/scripts/sw-tracker-github.sh +31 -0
package/scripts/sw-tracker-jira.sh +31 -0
package/scripts/sw-tracker-linear.sh +31 -0
package/scripts/sw-tracker.sh +30 -40
package/scripts/sw-triage.sh +68 -61
package/scripts/sw-upgrade.sh +30 -16
package/scripts/sw-ux.sh +30 -35
package/scripts/sw-webhook.sh +30 -25
package/scripts/sw-widgets.sh +30 -19
package/scripts/sw-worktree.sh +32 -15
package/tmux/templates/doc-fleet.json +43 -0

package/docs/AGI-PLATFORM-PLAN.md ADDED

@@ -0,0 +1,126 @@
+# AGI-Level Platform Plan: Refactor, Refine, Remove, Redo
+**Status:** Active
+**Created:** 2026-02-16
+**Goal:** Make Shipwright a fully autonomous product development team — reduce hardcoded/static policy, clean architecture, and let the platform improve itself.
+---
+## Success Criteria
+- **Policy:** All tunables (timeouts, limits, thresholds) live in `config/policy.json` or env; scripts read via `policy_get` or jq. Zero new hardcoded magic numbers in core paths.
+- **Monoliths:** `sw-pipeline.sh` and `sw-daemon.sh` decomposed into sourced modules (stages, health, poll loop); single-file line count < 2000 for core orchestration.
+- **Helpers:** All scripts use `lib/helpers.sh` for colors/output/events (or a single other canonical source); no duplicated info/success/warn/error blocks.
+- **Platform health:** `shipwright hygiene platform-refactor` counts trend down (hardcoded, fallback, TODO/FIXME/HACK); strategic agent routinely suggests platform refactor issues.
+- **Continuous:** Hygiene + platform-refactor run in CI or weekly; strategic reads platform-hygiene and policy; AGI-level criterion is part of product thinking.
+---
+## Phase 1: Foundation (Policy + Helpers Adoption)
+**Goal:** Policy and helpers are the default; at least two key scripts read from policy; plan is visible and tracked.
+**Status:** Done. 1.1–1.3 done (strategic + hygiene read policy; plan linked from STRATEGY P6). 1.4 done — all ~98 scripts migrated to `lib/helpers.sh`; zero duplicated helper blocks remain.
+| #   | Task                                                                                                                                                                                              | Owner | Acceptance                                                                                   |
+| --- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- | -------------------------------------------------------------------------------------------- |
+| 1.1 | **Strategic reads policy** — In sw-strategic.sh, after constants block, source policy.sh and override STRATEGIC_MAX_ISSUES, COOLDOWN, STRATEGY_LINES, OVERLAP_THRESHOLD from policy when present. | Agent | strategic run uses config/policy.json values when file exists; fallback to current literals. |
+| 1.2 | **Hygiene reads policy** — In sw-hygiene.sh, read artifact_age_days from policy (policy_get ".hygiene.artifact_age_days" 7) when policy.sh available.                                             | Agent | hygiene --artifact-age default comes from policy when present.                               |
+| 1.3 | **Document plan** — This doc (docs/AGI-PLATFORM-PLAN.md) is the single source of truth; link from STRATEGY.md P6.                                                                                 | Done  | STRATEGY P6 references this plan.                                                            |
+| 1.4 | **Helpers adoption** — Migrate 3–5 high-traffic scripts to source lib/helpers.sh instead of defining info/success/warn/error (e.g. sw-strategic, sw-hygiene, sw-quality).                         | Agent | No duplicate color/output blocks in those scripts; they source helpers.                      |
+---
+## Phase 2: Policy Migration (First Batch)
+**Goal:** Daemon, pipeline, quality, and sweep read their key tunables from policy; hardcoded count drops.
+**Status:** Done. 2.1–2.5 complete. Daemon (timeouts, intervals), pipeline (coverage/quality thresholds), quality (thresholds), sweep (workflow reads policy.json and exports env vars).
+| #   | Task                                                                                                                                                                                                                 | Owner | Acceptance                                                                                                                |
+| --- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- | ------------------------------------------------------------------------------------------------------------------------- |
+| 2.1 | **Daemon timeouts** — In sw-daemon.sh, health heartbeat and stage timeouts read from policy_get when policy exists (else keep current defaults).                                                                     | Agent | daemon_health_timeout_for_stage uses policy .daemon.stage_timeouts and .daemon.health_heartbeat_timeout.                  |
+| 2.2 | **Daemon intervals** — POLL_INTERVAL, AUTO_SCALE_INTERVAL, OPTIMIZE_INTERVAL, STALE_REAPER_INTERVAL read from policy when present.                                                                                   | Agent | One place (policy) controls daemon timing.                                                                                |
+| 2.3 | **Pipeline thresholds** — Coverage and quality gate thresholds in pipeline read from policy (pipeline.coverage_threshold_percent, quality_gate_score_threshold, memory fallbacks).                                   | Agent | Pipeline quality gate uses policy_get for thresholds when policy exists.                                                  |
+| 2.4 | **Quality script** — sw-quality.sh reads coverage_threshold and gate_score_threshold from policy.                                                                                                                    | Agent | quality validate/gate use policy.                                                                                         |
+| 2.5 | **Sweep (workflow)** — Document in plan that sweep workflow (shipwright-sweep.yml) uses hardcoded 4h/30min; add optional env or later step to read from policy (e.g. script that emits workflow inputs from policy). | Agent | Either sweep reads policy in a wrapper or doc states “sweep defaults documented in config/policy.json; override via env.” |
+---
+## Phase 3: Monolith Decomposition
+**Goal:** Pipeline and daemon are split into sourced modules; no single file > 2000 lines for orchestration core.
+**Status:** 3.2 and 3.4 done (pipeline-quality.sh and daemon-health.sh created, wired, and sourced). 3.1 and 3.3 (full stage/poll extraction) deferred — high risk, requires incremental approach.
+| #   | Task                                                                                                                                                                                                   | Owner | Acceptance                                                            |
+| --- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | --------------------------------------------------------------------- |
+| 3.1 | **Pipeline stages lib** — Extract stage run logic (run_intake, run_plan, run_build, run_test, …) into scripts/lib/pipeline-stages.sh or scripts/lib/pipeline-stages/\*.sh; source from sw-pipeline.sh. | Agent | sw-pipeline.sh sources stages; line count drops; existing tests pass. |
+| 3.2 | **Pipeline quality gate** — Extract quality gate and audit selection into scripts/lib/pipeline-quality.sh; source from sw-pipeline.sh.                                                                 | Agent | Quality gate logic in one place; pipeline sources it.                 |
+| 3.3 | **Daemon poll loop** — Extract daemon_poll_loop, daemon_poll_issues, daemon_reap_completed into scripts/lib/daemon-poll.sh; source from sw-daemon.sh.                                                  | Agent | Daemon sources daemon-poll; line count drops.                         |
+| 3.4 | **Daemon health** — Extract health check and timeout logic into scripts/lib/daemon-health.sh.                                                                                                          | Agent | Daemon sources daemon-health; tests pass.                             |
+---
+## Phase 4: Cleanup (TODO / FIXME / HACK / Dead Code)
+**Goal:** Triage all TODO/FIXME/HACK; remove dead code; reduce fallback count.
+**Status:** 4.1–4.2 done (PLATFORM-TODO-TRIAGE.md created with full triage: 4 github-issue, 3 accepted-debt, 0 stale). 4.3–4.4 ongoing (run hygiene dead-code; reduce fallbacks over time). Pre-existing `now_unix` bug in sw-scale.sh fixed.
+| #   | Task                                                                                                                                                                                                    | Owner | Acceptance                                               |
+| --- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- | -------------------------------------------------------- |
+| 4.1 | **TODO/FIXME backlog** — Generate list (from platform-refactor findings); create GitHub issues for each or mark “accepted tech debt” in code; strategic can then suggest “Resolve TODO in X” as issues. | Agent | Every TODO/FIXME has an issue or comment; count tracked. |
+| 4.2 | **HACK/KLUDGE** — Same as 4.1; replace or document.                                                                                                                                                     | Agent | HACK count explained or reduced.                         |
+| 4.3 | **Dead code** — Run hygiene dead-code; remove or refactor unused functions/scripts.                                                                                                                     | Agent | Dead code count in hygiene report drops.                 |
+| 4.4 | **Fallback reduction** — Where adaptive/learned data exists, remove duplicate hardcoded fallbacks so one code path wins (policy → adaptive → minimal default).                                          | Agent | Fallback count in platform-refactor scan drops.          |
+---
+## Phase 5: Continuous (CI + Strategic + Metrics)
+**Goal:** Platform health is measured and improved continuously.
+**Status:** 5.1 done (shipwright-platform-health.yml with threshold gate). 5.2 done (strategic reads platform-hygiene + AGI rule; CI workflow now runs hygiene before strategic). 5.3 done (doctor shows platform health counts). 5.4 done (config/policy.schema.json created; ajv validates; integrated in CI). E2E policy tests added (sw-policy-e2e-test.sh, 26 tests).
+| #   | Task                                                                                                                                                                                               | Owner | Acceptance                                                |
+| --- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- | --------------------------------------------------------- |
+| 5.1 | **Hygiene in CI** — Add a job (e.g. in shipwright-sweep or a new workflow) that runs `shipwright hygiene platform-refactor` and fails or warns if counts exceed thresholds (e.g. hardcoded > 100). | Agent | CI runs platform-refactor; optional gate.                 |
+| 5.2 | **Strategic creates refactor issues** — Ensure strategic prompt and platform-hygiene input are used; run strategic periodically so it suggests platform refactor issues.                           | Done  | Strategic already has platform health + AGI rule.         |
+| 5.3 | **Metrics dashboard** — Optional: add a small “platform health” section to dashboard or doctor showing platform-hygiene counts and trend.                                                          | Agent | Doctor or dashboard shows hardcoded/fallback/TODO counts. |
+| 5.4 | **Policy schema** — Add JSON schema for config/policy.json and validate in CI or on load.                                                                                                          | Agent | policy.json validated against schema.                     |
+---
+## Current Snapshot (from platform-refactor scan)
+- **hardcoded:** 58 | **fallback:** 54 | **TODO:** 37 | **FIXME:** 19 | **HACK/KLUDGE:** 17
+- **Triage:** 4 github-issue, 3 accepted-debt, 0 stale, 0 fix-now (see `docs/PLATFORM-TODO-TRIAGE.md`)
+- **Largest scripts:** sw-pipeline.sh (8600+), sw-daemon.sh (6000+), sw-loop.sh (2400+), sw-recruit.sh (2200+), sw-prep.sh (1600+), sw-memory.sh (1600+).
+- _Last scan: 2026-02-16. Re-scan after full helpers migration to track delta._
+---
+## Sweep defaults (Phase 2.5)
+Sweep workflow (`.github/workflows/shipwright-sweep.yml`) uses hardcoded values: stuck = 4h, cron every 30min, retry template = full, retry max_iterations = 25, stuck retry = 30. These are documented in **config/policy.json** under `sweep`. To override: set env in the workflow (e.g. `STUCK_THRESHOLD_HOURS`, `RETRY_MAX_ITERATIONS`) or add a wrapper step that reads policy and exports env for the dispatch step.
+## How to Use This Plan
+1. **Run platform-refactor:** `shipwright hygiene platform-refactor` to refresh `.claude/platform-hygiene.json`.
+2. **Run strategic:** `shipwright strategic run` to get AI-suggested issues (including platform refactor).
+3. **Execute phases in order:** Phase 1 → 2 → 3 → 4 → 5; mark tasks done in this doc or in issues.
+4. **Policy first:** Any new tunable goes in config/policy.json; scripts use policy_get or jq.
+---
+## References
+- **STRATEGY.md** — P6 Platform Self-Improvement, Technical Principle 8 (AGI-level criterion).
+- **config/policy.json** — Central policy schema.
+- **docs/config-policy.md** — Policy usage and roadmap.
+- **scripts/lib/policy.sh** — policy_get helper.
+- **scripts/lib/helpers.sh** — Canonical colors and output helpers.
+- **config/policy.schema.json** — JSON Schema for policy validation.
+- **docs/PLATFORM-TODO-TRIAGE.md** — Phase 4 TODO/FIXME/HACK triage results.
+- **scripts/sw-policy-e2e-test.sh** — E2E policy integration tests (26 tests).
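
Note on the hunk above: the plan refers to a `policy_get` helper in `scripts/lib/policy.sh` and shows one call, `policy_get ".hygiene.artifact_age_days" 7`. The body of that helper is not part of this excerpt, so the following is only a minimal sketch of how such a helper could behave. It assumes a jq path as the first argument and a default as the second; `POLICY_FILE` is a hypothetical variable name introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real scripts/lib/policy.sh may differ.
# POLICY_FILE is an assumed variable name, not taken from the package.
POLICY_FILE="${POLICY_FILE:-config/policy.json}"

# policy_get <jq-path> <default>
# Prints the value at <jq-path>, or <default> when the policy file,
# jq, or the key itself is missing.
policy_get() {
  local path="$1" default="${2:-}" value=""
  if [ -f "$POLICY_FILE" ] && command -v jq >/dev/null 2>&1; then
    # `// empty` makes jq print nothing for null, false, or missing keys.
    value="$(jq -r "${path} // empty" "$POLICY_FILE" 2>/dev/null)" || value=""
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi
  printf '%s\n' "$default"
}

# Usage mirroring task 1.2: policy value when present, else 7.
ARTIFACT_AGE_DAYS="$(policy_get ".hygiene.artifact_age_days" 7)"
echo "artifact_age_days=${ARTIFACT_AGE_DAYS}"
```

Falling back to the literal default keeps a script usable when `config/policy.json` or jq is absent, which matches the "fallback to current literals" acceptance wording in task 1.1.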

package/docs/AGI-WHATS-NEXT.md ADDED

@@ -0,0 +1,72 @@
+# What's Next — Gaps, Not Fully Implemented, Not Integrated, E2E Audit
+**Status:** 2026-02-16
+**Companion to:** [docs/AGI-PLATFORM-PLAN.md](AGI-PLATFORM-PLAN.md)
+---
+## 1. Still broken or risky
+| Item                                         | What                                                                                                                                                                                    | Fix                                                                                           |
+| -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
+| **Platform-health workflow threshold check** | ~~Report step used string comparison for threshold.~~ **Fixed:** Now normalizes to numeric with default 0.                                                                              | Done.                                                                                         |
+| **policy.sh when REPO_DIR not set**          | If a script is run from a different cwd (e.g. CI from repo root), `git rev-parse --show-toplevel` may point to a different repo.                                                        | Already uses SCRIPT_DIR/.. when SCRIPT_DIR is set; document that callers must set SCRIPT_DIR. |
+| **Daemon get_adaptive_heartbeat_timeout**    | When policy has no entry for a stage, we fall back to case statement only when `policy_get` is not available; when policy exists but stage is missing we keep HEALTH_HEARTBEAT_TIMEOUT. | Verified: logic is correct (policy stage → else case → HEALTH_HEARTBEAT_TIMEOUT).             |
+---
+## 2. Not fully implemented
+| Item                                       | What                                                                                                                                                                                          | Next step                                                                                       |
+| ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
+| ~~**Phase 3 libs not sourced**~~           | **Done.** `pipeline-quality.sh` sourced by `sw-pipeline.sh` and `sw-quality.sh`; `daemon-health.sh` sourced by `sw-daemon.sh`.                                                                | Wired and verified.                                                                             |
+| ~~**Policy JSON Schema validation**~~      | **Done.** `config/policy.schema.json` created; `ajv-cli` validates successfully; optional step in platform-health workflow confirmed working.                                                 | Validated locally; trigger workflow_dispatch in CI to confirm.                                  |
+| ~~**Sweep workflow still hardcoded**~~     | **Done.** Sweep workflow now checks out repo, reads `config/policy.json`, and exports `STUCK_THRESHOLD_HOURS`, `RETRY_TEMPLATE`, `RETRY_MAX_ITERATIONS`, `STUCK_RETRY_MAX_ITERATIONS` to env. | Wired.                                                                                          |
+| ~~**Helpers adoption (Phase 1.4)**~~       | **Done.** All ~98 scripts migrated to `lib/helpers.sh`. Zero duplicated info/success/warn/error blocks remain.                                                                                | Complete.                                                                                       |
+| **Monolith decomposition (Phase 3.1–3.4)** | Pipeline stages, pipeline quality gate, daemon poll loop, daemon health are **not** extracted into separate sourced files. Line counts unchanged (8600+ / 6000+).                             | Defer or do incrementally: extract one module (e.g. pipeline quality gate block) and source it. |
+---
+## 3. Not integrated
+| Item                                 | What                                                                                                                                              | Next step |
+| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- | --------- |
+| ~~**pipeline-quality.sh**~~          | **Done.** Sourced by `sw-pipeline.sh` and `sw-quality.sh`; duplicate policy_get for thresholds removed.                                           | Wired.    |
+| ~~**daemon-health.sh**~~             | **Done.** Sourced by `sw-daemon.sh`; `get_adaptive_heartbeat_timeout` calls `daemon_health_timeout_for_stage` when loaded.                        | Wired.    |
+| ~~**Strategic + platform-hygiene**~~ | **Done.** `shipwright-strategic.yml` now runs `hygiene platform-refactor` before strategic analysis, feeding fresh data to the AI agent.          | Wired.    |
+| ~~**Test suite and policy**~~        | **Done.** Policy read test added to `sw-hygiene-test.sh` (Test 12): verifies `policy_get` reads from config and returns default when key missing. | Covered.  |
+---
+## 4. Not audited E2E
+| Item                                | What                                                                                                                                                               | Next step                                                                          |
+| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------- |
+| ~~**Pipeline E2E with policy**~~    | **Done.** `sw-policy-e2e-test.sh` (26 tests) verifies pipeline-quality.sh reads coverage/gate thresholds from policy, policy_get with mock and real configs.       | Added to npm test suite.                                                           |
+| ~~**Daemon E2E with policy**~~      | **Done.** `sw-policy-e2e-test.sh` verifies daemon policy_get for poll_interval, heartbeat_timeout, stage_timeouts, auto_scale_interval.                            | Covered in policy E2E test.                                                        |
+| **Platform-health workflow E2E**    | Workflow validated locally (schema, scan, report steps); not yet triggered via workflow_dispatch in CI.                                                            | Trigger workflow (workflow_dispatch) to confirm end-to-end in real CI.             |
+| **Doctor with no platform-hygiene** | When `.claude/platform-hygiene.json` is missing, doctor shows "Platform hygiene not run". Not wrong, but we never auto-run it.                                     | Optional: doctor could run `hygiene platform-refactor` once and then show section. |
+| ~~**Full npm test with policy**~~   | **Done.** `sw-policy-e2e-test.sh` added to npm test; 26 policy-specific assertions covering policy_get, pipeline-quality.sh, daemon thresholds, and sanity checks. | In test suite.                                                                     |
+---
+## 5. Summary checklist
+- [x] **Wire or remove** pipeline-quality.sh and daemon-health.sh — sourced in pipeline, quality, daemon.
+- [x] **Policy schema** — `config/policy.schema.json` created; ajv validates successfully; integrated in CI.
+- [x] **Sweep** — Workflow reads policy.json and exports env vars.
+- [x] **Helpers** — All ~98 scripts migrated to lib/helpers.sh; zero duplicated helper blocks remain.
+- [x] **Test** — Policy read test in hygiene-test.sh (Test 12) + 26 E2E policy tests in sw-policy-e2e-test.sh.
+- [x] **E2E** — Pipeline + daemon policy assertions in sw-policy-e2e-test.sh; platform-health workflow validated locally.
+- [x] **TODO/FIXME/HACK** — Phase 4 triage complete: 4 github-issue, 3 accepted-debt, 0 stale. See `docs/PLATFORM-TODO-TRIAGE.md`.
+- [x] **Strategic + hygiene** — Strategic CI workflow now runs hygiene platform-refactor before analysis.
+- [ ] **Platform-health workflow_dispatch** — Trigger once in CI to confirm end-to-end execution.
+- [ ] **Monolith decomposition (Phase 3.1, 3.3)** — Deferred; high risk, requires incremental extraction.
+---
+## References
+- [AGI-PLATFORM-PLAN.md](AGI-PLATFORM-PLAN.md) — Phases and success criteria.
+- [PLATFORM-TODO-BACKLOG.md](PLATFORM-TODO-BACKLOG.md) — TODO/FIXME/HACK triage.
+- [config-policy.md](config-policy.md) — Policy usage and schema.
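
Note on the hunk above: section 1 records that the platform-health report step compared thresholds as strings and now "normalizes to numeric with default 0". The workflow itself is not shown in this diff, so the following is only a sketch of that idea. The function names `to_int` and `exceeds_threshold` are hypothetical; the sample values 58 and 100 are the hardcoded count from the plan's snapshot and the example threshold from task 5.1.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; function names are illustrative only.

# to_int <value>: print <value> if it is a non-negative integer, else 0.
to_int() {
  case "${1:-}" in
    ''|*[!0-9]*) printf '0\n' ;;
    *) printf '%s\n' "$1" ;;
  esac
}

# exceeds_threshold <count> <threshold>: succeed when count > threshold,
# comparing numerically (-gt) rather than as strings.
exceeds_threshold() {
  [ "$(to_int "${1:-}")" -gt "$(to_int "${2:-}")" ]
}

if exceeds_threshold "58" "100"; then
  echo "platform-refactor: hardcoded count over threshold"
else
  echo "platform-refactor: hardcoded count within threshold"
fi
```

A string comparison would rank "58" above "100" because "5" sorts after "1", so the gate would trip incorrectly. Normalizing first also means an empty or malformed count is treated as 0 instead of aborting the step.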

package/docs/KNOWN-ISSUES.md CHANGED

@@ -142,28 +142,6 @@ If only one pane is listed while agents are active, the fallback occurred.
 ---
-## White/Bright Pane Backgrounds on Agent Spawn
-**Severity:** Medium — cosmetic but distracting
-**Problem:** When Claude Code spawns agent panes via tmux, new panes sometimes inherit the terminal's default background (often white/bright) instead of the tmux dark theme.
-**Root Cause:** tmux's `window-style` applies at the window level but newly spawned panes from external processes (like Claude Code) don't always inherit it.
-**Fix:** As of v1.3.0, `shipwright-overlay.conf` uses `set-hook` to force the dark theme on every new pane:
-```conf
-set-hook -g after-split-window "select-pane -P 'bg=#1a1a2e,fg=#e4e4e7'"
-set-hook -g after-new-window   "select-pane -P 'bg=#1a1a2e,fg=#e4e4e7'"
-set-hook -g after-new-session  "select-pane -P 'bg=#1a1a2e,fg=#e4e4e7'"
-```
-Run `shipwright init` or `shipwright upgrade --apply` to get the updated overlay with these hooks.
-**Status:** ✅ Resolved in v1.3.0
----
 ## TPM plugins not loading
 **Severity:** Low — cosmetic
@@ -204,4 +182,4 @@ shipwright reaper --dry-run    # Preview what would be reaped
 Or use the tmux keybinding: `prefix + R` for a quick one-shot cleanup.
-**Status:** Resolved in v1.6.0
+**Status:** ✅ Resolved in v1.6.0 — reaper is the recommended solution.

package/docs/PLATFORM-TODO-BACKLOG.md ADDED

@@ -0,0 +1,41 @@
+# Platform TODO/FIXME/HACK Backlog
+**Source:** `shipwright hygiene platform-refactor` → `.claude/platform-hygiene.json`
+**Purpose:** Track TODO, FIXME, and HACK markers for triage; strategic agent can suggest "Resolve TODO in X" as issues.
+## How to refresh
+```bash
+shipwright hygiene platform-refactor
+jq '.counts, .findings_sample[0:10]' .claude/platform-hygiene.json
+```
+**File:line list for triage** (after refresh):
+```bash
+jq -r '.findings_sample[]? | "\(.file):\(.line)"' .claude/platform-hygiene.json
+```
+## Triage
+- **TODO** — Create a GitHub issue or implement; add `TODO(issue #N)` in code when deferred.
+- **FIXME** — Same as TODO; prefer fix or document.
+- **HACK/KLUDGE** — Replace with proper fix or add comment: `# HACK: reason (tracked in #N)`.
+## Current counts (from last scan: 2026-02-16)
+| Marker | Count | Action                                                     |
+| ------ | ----- | ---------------------------------------------------------- |
+| TODO   | 37    | Triage; create issues or mark "accepted tech debt" in code |
+| FIXME  | 19    | Same as TODO                                               |
+| HACK   | 17    | Replace or document with tracking comment                  |
+See `.claude/platform-hygiene.json` for live counts and `findings_sample` (file:line).
+Strategic agent uses this file when suggesting platform refactor issues.
+## Prioritized next steps
+1. **Run triage** — Use `jq -r '.findings_sample[]? | "\(.file):\(.line)"' .claude/platform-hygiene.json` to list all; batch into issues by file or theme.
+2. **High-traffic scripts first** — sw-pipeline.sh, sw-daemon.sh, and sw-recruit.sh have the most findings; address critical-path items first.
+3. **Dead code (Phase 4.3)** — Run `shipwright hygiene` dead-code scan; remove or refactor unused functions.
+4. **Fallback reduction (Phase 4.4)** — Where policy or adaptive data exists, remove duplicate hardcoded fallbacks.
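
As an illustration of the "batch into issues by file" step in the backlog above (a sketch, not part of the package): the function below counts findings per file so the noisiest scripts surface first. It assumes `.findings_sample` is an array of objects with `file` and `line` keys, which is what the document's own `jq` filter implies. The helper name `findings_by_file` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: findings_by_file is a hypothetical helper, not a shipwright command.
# Assumes the report has a .findings_sample array of {file, line} objects.
findings_by_file() {
  local report="${1:-.claude/platform-hygiene.json}"
  # Group findings by file, order groups from most to fewest findings,
  # and print "<count><TAB><file>" per group.
  jq -r '
    [.findings_sample[]?]
    | group_by(.file)
    | sort_by(length)
    | reverse
    | .[]
    | "\(length)\t\(.[0].file)"
  ' "$report"
}
```

Calling `findings_by_file` with no argument reads `.claude/platform-hygiene.json` from the current directory; pass another path to inspect a saved report.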

package/docs/PLATFORM-TODO-TRIAGE.md ADDED

@@ -0,0 +1,56 @@
+# Platform TODO/FIXME/HACK Triage (Phase 4)
+**Date:** 2026-02-16
+**Source:** `rg -n "TODO|FIXME|HACK" scripts/ docs/ config/ .github/ .claude/` (comment-style markers only)
+This document categorizes all TODO/FIXME/HACK comment markers found in the codebase. Only actual technical-debt comment markers are included (not variable names like `STATUS_TODO`, grep patterns, or documentation references).
+## Summary by Category
+| Category      | Count |
+| ------------- | ----- |
+| fix-now       | 0     |
+| github-issue  | 4     |
+| accepted-debt | 3     |
+| stale         | 0     |
+| **Total**     | **7** |
+## Full Triage Table
+| File                          | Line | Marker | Text                                                                | Category      |
+| ----------------------------- | ---- | ------ | ------------------------------------------------------------------- | ------------- |
+| scripts/sw-scale.sh           | 173  | TODO   | Integrate with tmux/SendMessage to spawn agent                      | github-issue  |
+| scripts/sw-scale.sh           | 199  | TODO   | Integrate with SendMessage to shut down agent                       | github-issue  |
+| scripts/sw-scale.sh           | 337  | TODO   | Parse pipeline context to generate actual recommendations           | github-issue  |
+| scripts/sw-swarm.sh           | 365  | TODO   | Implement queue depth and resource monitoring                       | github-issue  |
+| scripts/sw-testgen.sh         | 271  | TODO   | Claude unavailable (generated stub when Claude API unavailable)     | accepted-debt |
+| scripts/sw-testgen.sh         | 277  | TODO   | Implement test for \$func (placeholder in generated test template)  | accepted-debt |
+| scripts/sw-predictive-test.sh | 70   | TODO   | add input validation (intentional fixture for security patrol test) | accepted-debt |
+## Category Definitions
+- **fix-now**: Simple, actionable, can be addressed in a single session (e.g., replace hardcoded value with policy read).
+- **github-issue**: Needs a tracked GitHub issue for future work; non-trivial integration or feature work.
+- **accepted-debt**: Intentional or documented trade-off; no action required beyond documentation.
+- **stale**: No longer relevant; safe to remove from source.
+## Recommended Actions
+### github-issue (create issues)
+1. **sw-scale.sh** (lines 173, 199): Create issue _"Integrate scale up/down with tmux/SendMessage"_ — when scaling, spawn/shutdown agents via tmux or SendMessage instead of emitting events only.
+2. **sw-scale.sh** (line 337): Create issue _"Parse pipeline context for scale recommendations"_ — use active pipeline state to generate context-aware scaling recommendations.
+3. **sw-swarm.sh** (line 365): Create issue _"Implement queue depth and resource monitoring for swarm"_ — add queue depth and resource utilization monitoring to auto-scaling analysis.
+### accepted-debt (no change)
+- **sw-testgen.sh** (lines 271, 277): These are intentional placeholders in generated test templates. The TODO text signals fallback behavior when Claude is unavailable or when no test implementation exists.
+- **sw-predictive-test.sh** (line 70): Intentional test fixture. The test creates sample vulnerable code (SQL injection, missing input validation) to verify the security patrol detects these issues. The TODO is part of the deliberately bad code.
+### stale
+None identified.
+### fix-now
+None identified. All TODOs are either deferred integration work (github-issue) or intentional placeholders (accepted-debt).
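
The triage above counts only comment-style markers and skips variable names such as `STATUS_TODO` and grep patterns. One way to approximate that filter with plain `grep` is sketched below (not part of the package). The regex is an assumption about what "comment-style" means for shell sources, and `scan_markers` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: scan_markers is a hypothetical helper; the regex is an assumption.
scan_markers() {
  # -r recurse, -n line numbers, -H always print the file name, -E extended regex.
  # Matches "# TODO:", "#FIXME later", "# HACK(" and similar, where the marker
  # directly follows a "#". Does not match STATUS_TODO=1 or "TODO|FIXME" patterns.
  grep -rnHE '#[[:space:]]*(TODO|FIXME|HACK)([^[:alnum:]_|]|$)' "$@"
}
```

For example, `scan_markers scripts/` prints `file:line:text` for each hit. Markers embedded in generated templates (such as an `echo "# TODO ..."` line) still match, which is consistent with the triage table listing them as accepted debt.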

package/docs/README.md ADDED

@@ -0,0 +1,83 @@
+# Shipwright Documentation
+Navigation hub for all Shipwright docs. Start here or jump to a section.
+---
+## Root Documentation
+| Doc                                          | Purpose                                             |
+| -------------------------------------------- | --------------------------------------------------- |
+| [../README.md](../README.md)                 | Project overview, quick start, features             |
+| [../STRATEGY.md](../STRATEGY.md)             | Vision, priorities, technical principles            |
+| [../CHANGELOG.md](../CHANGELOG.md)           | Version history                                     |
+| [../.claude/CLAUDE.md](../.claude/CLAUDE.md) | 100+ commands, architecture, development guidelines |
+---
+## docs/ Sections
+### Strategy & GTM
+| Doc                                                                  | Purpose                                               |
+| -------------------------------------------------------------------- | ----------------------------------------------------- |
+| [strategy/README.md](strategy/README.md)                             | Strategic docs index — market research, brand, GTM    |
+| [strategy/01-market-research.md](strategy/01-market-research.md)     | Market size, competitive landscape, customer segments |
+| [strategy/02-mission-and-brand.md](strategy/02-mission-and-brand.md) | Mission, vision, brand positioning, messaging         |
+| [strategy/03-gtm-and-roadmap.md](strategy/03-gtm-and-roadmap.md)     | Go-to-market, 4-phase roadmap, success metrics        |
+### Team Patterns (Wave-Style)
+| Doc                                                                      | Purpose                                       |
+| ------------------------------------------------------------------------ | --------------------------------------------- |
+| [patterns/README.md](patterns/README.md)                                 | Wave patterns index — parallel agent work     |
+| [patterns/feature-implementation.md](patterns/feature-implementation.md) | Multi-component feature builds                |
+| [patterns/research-exploration.md](patterns/research-exploration.md)     | Codebase exploration                          |
+| [patterns/test-generation.md](patterns/test-generation.md)               | Test coverage campaigns                       |
+| [patterns/refactoring.md](patterns/refactoring.md)                       | Large-scale transformations                   |
+| [patterns/bug-hunt.md](patterns/bug-hunt.md)                             | Tracking complex bugs                         |
+| [patterns/audit-loop.md](patterns/audit-loop.md)                         | Self-reflection and quality gates in the loop |
+### tmux Research
+| Doc                                                                                              | Purpose                                   |
+| ------------------------------------------------------------------------------------------------ | ----------------------------------------- |
+| [tmux-research/TMUX-RESEARCH-INDEX.md](tmux-research/TMUX-RESEARCH-INDEX.md)                     | Index and reading guide                   |
+| [tmux-research/TMUX-BEST-PRACTICES-2025-2026.md](tmux-research/TMUX-BEST-PRACTICES-2025-2026.md) | Configuration bible                       |
+| [tmux-research/TMUX-ARCHITECTURE.md](tmux-research/TMUX-ARCHITECTURE.md)                         | Visual architecture, integration patterns |
+| [tmux-research/TMUX-QUICK-REFERENCE.md](tmux-research/TMUX-QUICK-REFERENCE.md)                   | Fast lookup, keybindings                  |
+| [tmux-research/TMUX-AUDIT.md](tmux-research/TMUX-AUDIT.md)                                       | Shipwright tmux config audit report       |
+### Platform & AGI
+| Doc                                                  | Purpose                                     |
+| ---------------------------------------------------- | ------------------------------------------- |
+| [AGI-PLATFORM-PLAN.md](AGI-PLATFORM-PLAN.md)         | Phased refactor for autonomous product dev  |
+| [AGI-WHATS-NEXT.md](AGI-WHATS-NEXT.md)               | Gaps, not-yet-implemented, E2E audit status |
+| [PLATFORM-TODO-BACKLOG.md](PLATFORM-TODO-BACKLOG.md) | TODO/FIXME/HACK triage backlog              |
+| [config-policy.md](config-policy.md)                 | Policy config schema and usage              |
+### Reference & Troubleshooting
+| Doc                                                            | Purpose                                             |
+| -------------------------------------------------------------- | --------------------------------------------------- |
+| [TIPS.md](TIPS.md)                                             | Power user tips, team patterns                      |
+| [KNOWN-ISSUES.md](KNOWN-ISSUES.md)                             | Tracked bugs and workarounds                        |
+| [definition-of-done.example.md](definition-of-done.example.md) | Template for `shipwright loop --definition-of-done` |
+---
+## .claude/ Agent Definitions
+| File                                                                 | Purpose                                                                 |
+| -------------------------------------------------------------------- | ----------------------------------------------------------------------- |
+| [../.claude/DEFINITION-OF-DONE.md](../.claude/DEFINITION-OF-DONE.md) | Pipeline completion checklist                                           |
+| [../.claude/agents/](../.claude/agents/)                             | Role definitions (pipeline-agent, code-reviewer, test-specialist, etc.) |
+---
+## See Also
+- [demo/README.md](../demo/README.md) — Demo app for pipeline testing
+- [claude-code/CLAUDE.md.shipwright](../claude-code/CLAUDE.md.shipwright) — Downstream repo template
+- [.github/pull_request_template.md](../.github/pull_request_template.md) — PR checklist

package/docs/TIPS.md CHANGED

@@ -145,7 +145,7 @@ Press `prefix + G` to toggle zoom on the current pane. This makes one agent fill
 ### Synchronized input
-Press `prefix + Alt-t` to toggle synchronized panes. When enabled, anything you type goes to ALL panes simultaneously. Useful for:
+Press `prefix + S` or `prefix + Alt-t` (M-t) to toggle synchronized panes. When enabled, anything you type goes to ALL panes simultaneously. Useful for:
 - Stopping all agents at once (`Ctrl-C` in all panes)
 - Running the same command in all agent directories
@@ -154,7 +154,7 @@ Press `prefix + Alt-t` to toggle synchronized panes. When enabled, anything you
 ### Capture pane contents
-Press `prefix + Alt-s` to save the current pane's visible content to a file in `/tmp/`. Useful for debugging agent output after the fact.
+Press `prefix + Alt-s` (M-s) to save the current pane's visible content to a file in `/tmp/`. Useful for debugging agent output after the fact.
 ---
@@ -341,3 +341,40 @@ Each agent writes findings/results to a file in this directory. The team lead re
 | [Test Generation](patterns/test-generation.md)               | 3-4+  | 2-3    | Coverage campaigns          |
 | [Refactoring](patterns/refactoring.md)                       | 3-4   | 2      | Large-scale transformations |
 | [Bug Hunt](patterns/bug-hunt.md)                             | 3-4   | 2-3    | Complex, elusive bugs       |
+---
+## Shipwright-Specific Tips
+### Use `--worktree` for parallel builds
+When running multiple agents or pipelines concurrently, use worktree isolation to avoid conflicts:
+```bash
+shipwright pipeline start --issue 42 --worktree
+shipwright loop "Refactor auth" --agents 2 --worktree
+```
+### Keep docs in sync
+```bash
+shipwright docs check   # Report stale AUTO sections (exit 1 if any)
+shipwright docs sync    # Regenerate all stale sections
+```
+### Definition of Done for loops
+Use a DoD file with `shipwright loop` to prevent premature completion:
+```bash
+shipwright loop "Add RBAC" --quality-gates --definition-of-done dod.md
+```
+Template at `docs/definition-of-done.example.md` (or `~/.shipwright/templates/` after install).
+### Run all test suites
+```bash
+npm test              # All 96+ test suites
+./scripts/sw-pipeline-test.sh   # Pipeline tests only (no real Claude/GitHub)
+```

package/docs/config-policy.md ADDED

@@ -0,0 +1,40 @@
+# Shipwright Policy Configuration
+**Location:** `config/policy.json` (repo) or `~/.shipwright/policy.json` (user override)
+All tunable policy — timeouts, limits, thresholds — should live in policy config. Scripts may still have in-code defaults for backwards compatibility but should prefer policy when present. Adaptive/learned values (e.g. from `~/.shipwright/adaptive-*.json`, optimization outputs) override policy when available.
+## Why centralize policy?
+- **AGI-level self-improvement:** Strategic agent and platform-refactor scans can suggest moving more values here instead of hardcoding.
+- **Single place to tune:** Daemon, pipeline, quality, strategic, and sweep behavior can be adjusted without editing scripts.
+- **Clean architecture:** Policy is data, not code; easier to validate, document, and evolve.
+## Schema (high level)
+| Section     | Purpose                                                                                                                                                 |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `daemon`    | Poll interval, health timeouts per stage, auto-scale/optimize/stale-reaper intervals, stale thresholds, `stale_timeout_multiplier`, `stale_state_hours` |
+| `pipeline`  | Max iterations, coverage/quality gate thresholds, memory baseline fallbacks, `max_cycles_convergence_cap`                                               |
+| `quality`   | Coverage and gate score thresholds, `audit_weights` (test_pass, coverage, security, etc.)                                                               |
+| `strategic` | Max issues per cycle, cooldown, overlap threshold, strategy line limit                                                                                  |
+| `sweep`     | Cron interval, stuck threshold, retry template and iteration caps                                                                                       |
+| `hygiene`   | Artifact age for cleanup                                                                                                                                |
+| `recruit`   | Agent recruitment: self_tune, match thresholds, model, promote thresholds                                                                               |
+## Usage
+- **From bash:** Prefer `jq` to read values, e.g. `jq -r '.daemon.poll_interval_seconds // 60' config/policy.json`.
+- **Override:** If `~/.shipwright/policy.json` exists, scripts may merge or prefer it over repo `config/policy.json`.
+- **Adaptive overrides:** Daemon/pipeline already use learned timeouts and iteration counts when present; those continue to override policy.
+## Schema validation
+- **config/policy.schema.json** — JSON Schema (draft-07) for policy. In CI, `jq empty config/policy.json` checks only that the file is well-formed JSON; for full schema validation, run `ajv validate -s config/policy.schema.json -d config/policy.json`.
+## Roadmap
+- ~~Add `scripts/lib/policy.sh`~~ (done)
+- ~~Migrate daemon/pipeline/quality/strategic/hygiene defaults to read from policy~~ (done)
+- Strategic agent can recommend issues like "Move more tunables to config/policy.json."
+- Future: Add `recruit` section to `policy.schema.json` for full validation; schema currently allows additional properties.
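
The lookup order described in the policy document (user override, then repo policy, then an in-code default) could be sketched as below. This is not the contents of `scripts/lib/policy.sh`: the function name `policy_get` and the `POLICY_USER_FILE` / `POLICY_REPO_FILE` variables are hypothetical, and the sketch prefers the user file per key rather than merging. Adaptive overrides are not modeled.

```shell
#!/usr/bin/env bash
# Sketch only: policy_get and the POLICY_*_FILE variables are hypothetical names,
# not the actual API of scripts/lib/policy.sh.
policy_get() {
  local path="$1" default="$2"
  local user_file="${POLICY_USER_FILE:-$HOME/.shipwright/policy.json}"
  local repo_file="${POLICY_REPO_FILE:-config/policy.json}"
  local file value
  # Check the user override first, then the repo policy.
  for file in "$user_file" "$repo_file"; do
    [ -f "$file" ] || continue
    # "// empty" yields no output when the key is missing or null.
    # Caveat: jq's // also treats false as missing, so boolean false falls through.
    value="$(jq -r "$path // empty" "$file" 2>/dev/null)"
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  # Neither file supplied a value: use the in-code default.
  printf '%s\n' "$default"
}
```

With this shape, a call such as `policy_get '.daemon.poll_interval_seconds' 60` returns the user's value if set, otherwise the repo value, otherwise `60`.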

package/docs/definition-of-done.example.md CHANGED

@@ -3,6 +3,8 @@
 Use this template with `shipwright loop --definition-of-done <file>` to enforce completion criteria.
 Copy and customize for your project.
+> **Shipwright contributors:** See `.claude/DEFINITION-OF-DONE.md` for the project-specific checklist (bash standards, pipeline compliance, etc.).
 ## Checklist
 - [ ] All specified functionality is implemented

package/docs/patterns/README.md CHANGED

@@ -35,6 +35,7 @@ Each wave:
 | [Test Generation](test-generation.md)               | Comprehensive test coverage campaigns     | 3-4+          | 2-3 agents |
 | [Refactoring](refactoring.md)                       | Large-scale code transformations          | 3-4           | 2 agents   |
 | [Bug Hunt](bug-hunt.md)                             | Tracking down complex, elusive bugs       | 3-4           | 2-3 agents |
+| [Audit Loop](audit-loop.md)                         | Self-reflection, quality gates in loop    | N/A           | 1-2 agents |
 ---
@@ -147,3 +148,7 @@ Choose the right model for each agent's task:
 | Architecture decisions, complex debugging | `opus`   | Best reasoning         |
 | Test generation                           | `sonnet` | Good pattern matching  |
 | Documentation, reports                    | `sonnet` | Clear writing          |
+---
+See also: [docs/README.md](../README.md) — Documentation hub

package/docs/strategy/02-mission-and-brand.md CHANGED

@@ -57,7 +57,7 @@ Shipwright orchestrates autonomous Claude Code agent teams with full-cycle deliv
 **Headline:** From GitHub issue to merged PR with zero human in the loop
 **Description:** Label a GitHub issue with `shipwright` and walk away. The daemon watches, triages, plans, designs, builds, tests, reviews, gates, and merges — all while learning from failures and adapting its approach. Teams get back 20+ hours per engineer per month by eliminating manual triage, code review, and routine feature shipping.
-**Proof Point:** Shipwright dogfoods itself — this repo processes its own issues with zero human intervention. See [live examples](../../actions/workflows/shipwright-pipeline.yml).
+**Proof Point:** Shipwright dogfoods itself — this repo processes its own issues with zero human intervention. See [live examples](../../.github/workflows/shipwright-pipeline.yml).
 ### 2. **Autonomous Teams That Learn**
@@ -572,8 +572,8 @@ This document will evolve. Update it when:
 - Metrics show messaging isn't landing
 - Team adds new features that shift differentiation
-**Last updated:** 2025-02-14
-**Next review:** 2025-05-14 (quarterly)
+**Last updated:** 2026-02-14
+**Next review:** 2026-05-14 (quarterly)
 **Owner:** Brand / Product Marketing
 ---

package/docs/strategy/README.md CHANGED

@@ -167,6 +167,7 @@ Enterprise Edition ($5K-$50K/yr)
 See also:
-- `/README.md` — Project overview
-- `.claude/CLAUDE.md` — Technical documentation
-- `.github/workflows/shipwright-pipeline.yml` — Production CI/CD
+- [docs/README.md](../README.md) — Documentation hub
+- [README.md](../../README.md) — Project overview
+- [.claude/CLAUDE.md](../../.claude/CLAUDE.md) — Technical documentation
+- [.github/workflows/shipwright-pipeline.yml](../../.github/workflows/shipwright-pipeline.yml) — Production CI/CD

package/docs/tmux-research/TMUX-AUDIT.md CHANGED

@@ -5,6 +5,8 @@
 **Repository:** `/Users/sethford/Documents/shipwright`
 **Scope:** tmux configuration, CLI scripts, and integration code
+> **Note:** This audit is a point-in-time snapshot. Some issues may have been resolved since (e.g. sw-reaper.sh now uses `#{pane_id}` per line 117). Re-run the audit or verify each finding before acting.
 ---
 ## Executive Summary

package/docs/tmux-research/TMUX-RESEARCH-INDEX.md CHANGED

@@ -1,5 +1,7 @@
 # tmux Research Index: Best-in-Class 2025-2026
+↑ [docs/](../README.md) — Documentation hub
 **Research Completion Date**: February 12, 2026
 **Focus Areas**: 14 comprehensive topics
 **Total Documentation**: 4 detailed guides + this index
@@ -87,6 +89,21 @@ This research compiled best-in-class tmux configurations, patterns, and integrat
 ---
+### 4. TMUX-AUDIT.md
+**Shipwright tmux configuration audit (this repo)**
+| Section            | Content                                              |
+| ------------------ | ---------------------------------------------------- |
+| Executive Summary  | Findings, severity counts (critical/major/minor)     |
+| Critical Issues    | Race conditions, command injection, pane format bugs |
+| Major Issues       | Pane referencing, error handling, version compat     |
+| Test Coverage Gaps | Integration scenarios, edge cases                    |
+**When to Use**: Understanding Shipwright's tmux integration issues, triage for fixes
+---
 ## Key Findings Summary
 ### Best-in-Class Configuration (2025-2026)