npm - shipwright-cli - Versions diffs - 3.1.0 → 3.3.0 - Mend

shipwright-cli 3.1.0 → 3.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (283)

package/.claude/agents/code-reviewer.md +2 -0
package/.claude/agents/devops-engineer.md +2 -0
package/.claude/agents/doc-fleet-agent.md +2 -0
package/.claude/agents/pipeline-agent.md +2 -0
package/.claude/agents/shell-script-specialist.md +2 -0
package/.claude/agents/test-specialist.md +2 -0
package/.claude/hooks/agent-crash-capture.sh +32 -0
package/.claude/hooks/post-tool-use.sh +3 -2
package/.claude/hooks/pre-tool-use.sh +35 -3
package/README.md +22 -8
package/claude-code/hooks/config-change.sh +18 -0
package/claude-code/hooks/instructions-reloaded.sh +7 -0
package/claude-code/hooks/worktree-create.sh +25 -0
package/claude-code/hooks/worktree-remove.sh +20 -0
package/config/code-constitution.json +130 -0
package/config/defaults.json +25 -2
package/config/policy.json +1 -1
package/dashboard/middleware/auth.ts +134 -0
package/dashboard/middleware/constants.ts +21 -0
package/dashboard/public/index.html +8 -6
package/dashboard/public/styles.css +176 -97
package/dashboard/routes/auth.ts +38 -0
package/dashboard/server.ts +117 -25
package/dashboard/services/config.ts +26 -0
package/dashboard/services/db.ts +118 -0
package/dashboard/src/canvas/pixel-agent.ts +298 -0
package/dashboard/src/canvas/pixel-sprites.ts +440 -0
package/dashboard/src/canvas/shipyard-effects.ts +367 -0
package/dashboard/src/canvas/shipyard-scene.ts +616 -0
package/dashboard/src/canvas/submarine-layout.ts +267 -0
package/dashboard/src/components/header.ts +8 -7
package/dashboard/src/core/api.ts +5 -0
package/dashboard/src/core/router.ts +1 -0
package/dashboard/src/design/submarine-theme.ts +253 -0
package/dashboard/src/main.ts +2 -0
package/dashboard/src/types/api.ts +12 -1
package/dashboard/src/views/activity.ts +2 -1
package/dashboard/src/views/metrics.ts +69 -1
package/dashboard/src/views/shipyard.ts +39 -0
package/dashboard/types/index.ts +166 -0
package/docs/plans/2026-02-28-compound-audit-and-shipyard-design.md +186 -0
package/docs/plans/2026-02-28-skipper-shipwright-implementation-plan.md +1182 -0
package/docs/plans/2026-02-28-skipper-shipwright-integration-design.md +531 -0
package/docs/plans/2026-03-01-ai-powered-skill-injection-design.md +298 -0
package/docs/plans/2026-03-01-ai-powered-skill-injection-plan.md +1109 -0
package/docs/plans/2026-03-01-capabilities-cleanup-plan.md +658 -0
package/docs/plans/2026-03-01-clean-architecture-plan.md +924 -0
package/docs/plans/2026-03-01-compound-audit-cascade-design.md +191 -0
package/docs/plans/2026-03-01-compound-audit-cascade-plan.md +921 -0
package/docs/plans/2026-03-01-deep-integration-plan.md +851 -0
package/docs/plans/2026-03-01-pipeline-audit-trail-design.md +145 -0
package/docs/plans/2026-03-01-pipeline-audit-trail-plan.md +770 -0
package/docs/plans/2026-03-01-refined-depths-brand-design.md +382 -0
package/docs/plans/2026-03-01-refined-depths-implementation.md +599 -0
package/docs/plans/2026-03-01-skipper-kernel-integration-design.md +203 -0
package/docs/plans/2026-03-01-unified-platform-design.md +272 -0
package/docs/plans/2026-03-07-claude-code-feature-integration-design.md +189 -0
package/docs/plans/2026-03-07-claude-code-feature-integration-plan.md +1165 -0
package/docs/research/BACKLOG_QUICK_REFERENCE.md +352 -0
package/docs/research/CUTTING_EDGE_RESEARCH_2026.md +546 -0
package/docs/research/RESEARCH_INDEX.md +439 -0
package/docs/research/RESEARCH_SOURCES.md +440 -0
package/docs/research/RESEARCH_SUMMARY.txt +275 -0
package/docs/superpowers/specs/2026-03-10-pipeline-quality-revolution-design.md +341 -0
package/package.json +2 -2
package/scripts/lib/adaptive-model.sh +427 -0
package/scripts/lib/adaptive-timeout.sh +316 -0
package/scripts/lib/audit-trail.sh +309 -0
package/scripts/lib/auto-recovery.sh +471 -0
package/scripts/lib/bandit-selector.sh +431 -0
package/scripts/lib/bootstrap.sh +104 -2
package/scripts/lib/causal-graph.sh +455 -0
package/scripts/lib/compat.sh +126 -0
package/scripts/lib/compound-audit.sh +337 -0
package/scripts/lib/constitutional.sh +454 -0
package/scripts/lib/context-budget.sh +359 -0
package/scripts/lib/convergence.sh +594 -0
package/scripts/lib/cost-optimizer.sh +634 -0
package/scripts/lib/daemon-adaptive.sh +14 -2
package/scripts/lib/daemon-dispatch.sh +106 -17
package/scripts/lib/daemon-failure.sh +34 -4
package/scripts/lib/daemon-patrol.sh +25 -4
package/scripts/lib/daemon-poll-github.sh +361 -0
package/scripts/lib/daemon-poll-health.sh +299 -0
package/scripts/lib/daemon-poll.sh +27 -611
package/scripts/lib/daemon-state.sh +119 -66
package/scripts/lib/daemon-triage.sh +10 -0
package/scripts/lib/dod-scorecard.sh +442 -0
package/scripts/lib/error-actionability.sh +300 -0
package/scripts/lib/formal-spec.sh +461 -0
package/scripts/lib/helpers.sh +180 -5
package/scripts/lib/intent-analysis.sh +409 -0
package/scripts/lib/loop-convergence.sh +350 -0
package/scripts/lib/loop-iteration.sh +682 -0
package/scripts/lib/loop-progress.sh +48 -0
package/scripts/lib/loop-restart.sh +185 -0
package/scripts/lib/memory-effectiveness.sh +506 -0
package/scripts/lib/mutation-executor.sh +352 -0
package/scripts/lib/outcome-feedback.sh +521 -0
package/scripts/lib/pipeline-cli.sh +336 -0
package/scripts/lib/pipeline-commands.sh +1216 -0
package/scripts/lib/pipeline-detection.sh +101 -3
package/scripts/lib/pipeline-execution.sh +897 -0
package/scripts/lib/pipeline-github.sh +28 -3
package/scripts/lib/pipeline-intelligence-compound.sh +431 -0
package/scripts/lib/pipeline-intelligence-scoring.sh +407 -0
package/scripts/lib/pipeline-intelligence-skip.sh +181 -0
package/scripts/lib/pipeline-intelligence.sh +104 -1138
package/scripts/lib/pipeline-quality-bash-compat.sh +182 -0
package/scripts/lib/pipeline-quality-checks.sh +17 -711
package/scripts/lib/pipeline-quality-gates.sh +563 -0
package/scripts/lib/pipeline-stages-build.sh +730 -0
package/scripts/lib/pipeline-stages-delivery.sh +965 -0
package/scripts/lib/pipeline-stages-intake.sh +1133 -0
package/scripts/lib/pipeline-stages-monitor.sh +407 -0
package/scripts/lib/pipeline-stages-review.sh +1022 -0
package/scripts/lib/pipeline-stages.sh +161 -2901
package/scripts/lib/pipeline-state.sh +36 -5
package/scripts/lib/pipeline-util.sh +487 -0
package/scripts/lib/policy-learner.sh +438 -0
package/scripts/lib/process-reward.sh +493 -0
package/scripts/lib/project-detect.sh +649 -0
package/scripts/lib/quality-profile.sh +334 -0
package/scripts/lib/recruit-commands.sh +885 -0
package/scripts/lib/recruit-learning.sh +739 -0
package/scripts/lib/recruit-roles.sh +648 -0
package/scripts/lib/reward-aggregator.sh +458 -0
package/scripts/lib/rl-optimizer.sh +362 -0
package/scripts/lib/root-cause.sh +427 -0
package/scripts/lib/scope-enforcement.sh +445 -0
package/scripts/lib/session-restart.sh +493 -0
package/scripts/lib/skill-memory.sh +300 -0
package/scripts/lib/skill-registry.sh +775 -0
package/scripts/lib/spec-driven.sh +476 -0
package/scripts/lib/test-helpers.sh +18 -7
package/scripts/lib/test-holdout.sh +429 -0
package/scripts/lib/test-optimizer.sh +511 -0
package/scripts/shipwright-file-suggest.sh +45 -0
package/scripts/skills/adversarial-quality.md +61 -0
package/scripts/skills/api-design.md +44 -0
package/scripts/skills/architecture-design.md +50 -0
package/scripts/skills/brainstorming.md +43 -0
package/scripts/skills/data-pipeline.md +44 -0
package/scripts/skills/deploy-safety.md +64 -0
package/scripts/skills/documentation.md +38 -0
package/scripts/skills/frontend-design.md +45 -0
package/scripts/skills/generated/.gitkeep +0 -0
package/scripts/skills/generated/_refinements/.gitkeep +0 -0
package/scripts/skills/generated/_refinements/adversarial-quality.patch.md +3 -0
package/scripts/skills/generated/_refinements/architecture-design.patch.md +3 -0
package/scripts/skills/generated/_refinements/brainstorming.patch.md +3 -0
package/scripts/skills/generated/cli-version-management.md +29 -0
package/scripts/skills/generated/collection-system-validation.md +99 -0
package/scripts/skills/generated/large-scale-c-refactoring-coordination.md +97 -0
package/scripts/skills/generated/pattern-matching-similarity-scoring.md +195 -0
package/scripts/skills/generated/test-parallelization-detection.md +65 -0
package/scripts/skills/observability.md +79 -0
package/scripts/skills/performance.md +48 -0
package/scripts/skills/pr-quality.md +49 -0
package/scripts/skills/product-thinking.md +43 -0
package/scripts/skills/security-audit.md +49 -0
package/scripts/skills/systematic-debugging.md +40 -0
package/scripts/skills/testing-strategy.md +47 -0
package/scripts/skills/two-stage-review.md +52 -0
package/scripts/skills/validation-thoroughness.md +55 -0
package/scripts/sw +9 -3
package/scripts/sw-activity.sh +9 -8
package/scripts/sw-adaptive.sh +8 -7
package/scripts/sw-adversarial.sh +2 -1
package/scripts/sw-architecture-enforcer.sh +3 -1
package/scripts/sw-auth.sh +12 -2
package/scripts/sw-autonomous.sh +5 -1
package/scripts/sw-changelog.sh +4 -1
package/scripts/sw-checkpoint.sh +2 -1
package/scripts/sw-ci.sh +15 -6
package/scripts/sw-cleanup.sh +4 -26
package/scripts/sw-code-review.sh +45 -20
package/scripts/sw-connect.sh +2 -1
package/scripts/sw-context.sh +2 -1
package/scripts/sw-cost.sh +107 -5
package/scripts/sw-daemon.sh +71 -11
package/scripts/sw-dashboard.sh +3 -1
package/scripts/sw-db.sh +71 -20
package/scripts/sw-decide.sh +8 -2
package/scripts/sw-decompose.sh +360 -17
package/scripts/sw-deps.sh +4 -1
package/scripts/sw-developer-simulation.sh +4 -1
package/scripts/sw-discovery.sh +378 -5
package/scripts/sw-doc-fleet.sh +4 -1
package/scripts/sw-docs-agent.sh +3 -1
package/scripts/sw-docs.sh +2 -1
package/scripts/sw-doctor.sh +453 -2
package/scripts/sw-dora.sh +4 -1
package/scripts/sw-durable.sh +12 -7
package/scripts/sw-e2e-orchestrator.sh +17 -16
package/scripts/sw-eventbus.sh +13 -4
package/scripts/sw-evidence.sh +364 -12
package/scripts/sw-feedback.sh +550 -9
package/scripts/sw-fix.sh +20 -1
package/scripts/sw-fleet-discover.sh +6 -2
package/scripts/sw-fleet-viz.sh +9 -4
package/scripts/sw-fleet.sh +5 -1
package/scripts/sw-github-app.sh +18 -4
package/scripts/sw-github-checks.sh +3 -2
package/scripts/sw-github-deploy.sh +3 -2
package/scripts/sw-github-graphql.sh +18 -7
package/scripts/sw-guild.sh +5 -1
package/scripts/sw-heartbeat.sh +5 -30
package/scripts/sw-hello.sh +67 -0
package/scripts/sw-hygiene.sh +10 -3
package/scripts/sw-incident.sh +273 -5
package/scripts/sw-init.sh +18 -2
package/scripts/sw-instrument.sh +10 -2
package/scripts/sw-intelligence.sh +44 -7
package/scripts/sw-jira.sh +5 -1
package/scripts/sw-launchd.sh +2 -1
package/scripts/sw-linear.sh +4 -1
package/scripts/sw-logs.sh +4 -1
package/scripts/sw-loop.sh +436 -1076
package/scripts/sw-memory.sh +357 -3
package/scripts/sw-mission-control.sh +6 -1
package/scripts/sw-model-router.sh +483 -27
package/scripts/sw-otel.sh +15 -4
package/scripts/sw-oversight.sh +14 -5
package/scripts/sw-patrol-meta.sh +334 -0
package/scripts/sw-pipeline-composer.sh +7 -1
package/scripts/sw-pipeline-vitals.sh +12 -6
package/scripts/sw-pipeline.sh +54 -2653
package/scripts/sw-pm.sh +16 -8
package/scripts/sw-pr-lifecycle.sh +2 -1
package/scripts/sw-predictive.sh +17 -5
package/scripts/sw-prep.sh +185 -2
package/scripts/sw-ps.sh +5 -25
package/scripts/sw-public-dashboard.sh +17 -4
package/scripts/sw-quality.sh +14 -6
package/scripts/sw-reaper.sh +8 -25
package/scripts/sw-recruit.sh +156 -2303
package/scripts/sw-regression.sh +19 -12
package/scripts/sw-release-manager.sh +3 -1
package/scripts/sw-release.sh +4 -1
package/scripts/sw-remote.sh +3 -1
package/scripts/sw-replay.sh +7 -1
package/scripts/sw-retro.sh +158 -1
package/scripts/sw-review-rerun.sh +3 -1
package/scripts/sw-scale.sh +14 -5
package/scripts/sw-security-audit.sh +6 -1
package/scripts/sw-self-optimize.sh +173 -6
package/scripts/sw-session.sh +9 -3
package/scripts/sw-setup.sh +3 -1
package/scripts/sw-stall-detector.sh +406 -0
package/scripts/sw-standup.sh +15 -7
package/scripts/sw-status.sh +3 -1
package/scripts/sw-strategic.sh +14 -6
package/scripts/sw-stream.sh +13 -4
package/scripts/sw-swarm.sh +20 -7
package/scripts/sw-team-stages.sh +13 -6
package/scripts/sw-templates.sh +7 -31
package/scripts/sw-testgen.sh +17 -6
package/scripts/sw-tmux-pipeline.sh +4 -1
package/scripts/sw-tmux-role-color.sh +2 -0
package/scripts/sw-tmux-status.sh +1 -1
package/scripts/sw-tmux.sh +37 -1
package/scripts/sw-trace.sh +3 -1
package/scripts/sw-tracker-github.sh +3 -0
package/scripts/sw-tracker-jira.sh +3 -0
package/scripts/sw-tracker-linear.sh +3 -0
package/scripts/sw-tracker.sh +3 -1
package/scripts/sw-triage.sh +3 -2
package/scripts/sw-upgrade.sh +3 -1
package/scripts/sw-ux.sh +5 -2
package/scripts/sw-webhook.sh +5 -2
package/scripts/sw-widgets.sh +9 -4
package/scripts/sw-worktree.sh +15 -3
package/scripts/test-skill-injection.sh +1233 -0
package/templates/pipelines/autonomous.json +27 -3
package/templates/pipelines/cost-aware.json +34 -8
package/templates/pipelines/deployed.json +12 -0
package/templates/pipelines/enterprise.json +12 -0
package/templates/pipelines/fast.json +6 -0
package/templates/pipelines/full.json +27 -3
package/templates/pipelines/hotfix.json +6 -0
package/templates/pipelines/standard.json +12 -0
package/templates/pipelines/tdd.json +12 -0

package/docs/plans/2026-03-01-skipper-kernel-integration-design.md ADDED

@@ -0,0 +1,203 @@
+# Skipper Kernel Integration Design — Hand-as-Toolbox
+**Date:** 2026-03-01
+**Status:** Approved
+**Approach:** A — Hand-as-Toolbox (bundled Hand + custom Rust tools)
+## Context
+The `skipper-shipwright` Rust crate (9 modules, 356 tests, zero clippy warnings) implements Shipwright's pipeline engine, decision engine, memory, fleet management, and intelligence layer. It needs to be wired into the Skipper kernel as a first-class Hand so that:
+- Shipwright activates via `POST /api/hands/shipwright/activate`
+- The Hand agent calls custom tools that delegate to the Rust crate
+- Pipeline state, failure patterns, and decisions persist via kernel memory
+- Events flow through the kernel event bus
+- The whole system is proven working end-to-end
+## Architecture
+```
+User / API
+    │
+    ▼
+Skipper Kernel
+    │
+    ├── HandRegistry.activate("shipwright")
+    │       │
+    │       ▼
+    │   AgentManifest (from HAND.toml)
+    │       │
+    │       ▼
+    │   spawn_agent() → Shipwright Agent
+    │       │
+    │       ▼
+    │   Agent Loop (LLM ↔ Tools)
+    │       │
+    │       ▼
+    │   tool_runner::execute_tool()
+    │       │
+    │       ├── "shipwright_pipeline_start" → skipper_shipwright::tools::pipeline_start()
+    │       ├── "shipwright_pipeline_status" → skipper_shipwright::tools::pipeline_status()
+    │       ├── "shipwright_stage_advance"   → skipper_shipwright::tools::stage_advance()
+    │       ├── "shipwright_decision_run"    → skipper_shipwright::tools::decision_run()
+    │       ├── "shipwright_memory_search"   → skipper_shipwright::tools::memory_search()
+    │       ├── "shipwright_memory_store"    → skipper_shipwright::tools::memory_store_pattern()
+    │       ├── "shipwright_fleet_status"    → skipper_shipwright::tools::fleet_status()
+    │       └── "shipwright_intelligence"    → skipper_shipwright::tools::intelligence()
+    │               │
+    │               ▼
+    │       skipper-shipwright crate
+    │       (Pipeline, Decision, Memory, Fleet, Intelligence)
+    │               │
+    │               ▼
+    │       KernelHandle (memory_store, task_post, publish_event, spawn_agent)
+    │
+    └── MemorySubstrate (SQLite + vector search)
+```
+## Custom Tools
+8 tools registered as built-in (not MCP/skill) because Shipwright is a first-class bundled Hand:
+| Tool Name                    | Purpose                         | Key Inputs                                      |
+| ---------------------------- | ------------------------------- | ----------------------------------------------- |
+| `shipwright_pipeline_start`  | Start a delivery pipeline       | `goal` or `issue_number`, `template`            |
+| `shipwright_pipeline_status` | Get current pipeline state      | `pipeline_id` (optional)                        |
+| `shipwright_stage_advance`   | Advance stage or report failure | `pipeline_id`, `outcome`                        |
+| `shipwright_decision_run`    | Run autonomous decision cycle   | `dry_run`, `signal_filter`                      |
+| `shipwright_memory_search`   | Search failure patterns         | `query`, `repo`, `limit`                        |
+| `shipwright_memory_store`    | Record a failure pattern        | `error_class`, `signature`, `root_cause`, `fix` |
+| `shipwright_fleet_status`    | Fleet overview across repos     | (none)                                          |
+| `shipwright_intelligence`    | Run intelligence analysis       | `repo_path`, `analysis_type`                    |
+### Tool Implementation Pattern
+```rust
+// skipper-shipwright/src/tools.rs
+pub fn tool_definitions() -> Vec<ToolDefinition> { /* 8 definitions */ }
+pub async fn pipeline_start(
+    input: &serde_json::Value,
+    kernel: Option<&Arc<dyn KernelHandle>>,
+) -> ToolResult { /* ... */ }
+```
+### Dispatch Wiring
+In `skipper-runtime/src/tool_runner.rs`:
+```rust
+// In execute_tool() match block:
+name if name.starts_with("shipwright_") => {
+    skipper_shipwright::tools::dispatch(tool_use_id, name, input, kernel).await
+}
+// In builtin_tool_definitions():
+defs.extend(skipper_shipwright::tools::tool_definitions());
+```
+## HAND.toml
+```toml
+id = "shipwright"
+name = "Shipwright Hand"
+description = "Autonomous delivery pipeline — turns issues into tested, reviewed PRs"
+category = "engineering"
+icon = "⚓"
+tools = [
+    "shipwright_pipeline_start", "shipwright_pipeline_status",
+    "shipwright_stage_advance", "shipwright_decision_run",
+    "shipwright_memory_search", "shipwright_memory_store",
+    "shipwright_fleet_status", "shipwright_intelligence",
+    "shell_exec", "file_read", "file_write", "file_list",
+    "web_fetch", "memory_store", "memory_recall",
+    "knowledge_add_entity", "knowledge_add_relation",
+    "event_publish", "agent_spawn", "agent_send", "schedule_create",
+]
+[[settings]]
+key = "pipeline_template"
+label = "Default Pipeline Template"
+type = "select"
+options = ["fast", "standard", "full", "hotfix", "autonomous", "cost-aware"]
+default = "standard"
+[[settings]]
+key = "max_parallel"
+label = "Max Parallel Pipelines"
+type = "number"
+default = "2"
+[[settings]]
+key = "auto_decide"
+label = "Autonomous Decision Engine"
+type = "boolean"
+default = "false"
+[agent]
+name = "shipwright-hand"
+module = "builtin:chat"
+provider = "default"
+model = "default"
+max_iterations = 200
+system_prompt = """You are Shipwright..."""
+[dashboard]
+title = "Shipwright Pipeline"
+[[dashboard.metrics]]
+key = "active_pipelines"
+label = "Active Pipelines"
+type = "gauge"
+[[dashboard.metrics]]
+key = "stages_completed"
+label = "Stages Completed"
+type = "counter"
+[[dashboard.metrics]]
+key = "success_rate"
+label = "Success Rate"
+type = "percentage"
+```
+## Memory Wiring
+Adapter pattern — `KernelMemoryAdapter` wraps `KernelHandle` for the shipwright crate:
+- **With kernel** (production): delegates to `KernelHandle::memory_store` / `memory_recall` (persistent SQLite + vector search)
+- **Without kernel** (tests/standalone): uses existing in-memory `ShipwrightMemory`
+Stored data:
+- Pipeline state: `pipeline:{id}`
+- Failure patterns: `failure:{repo}:{error_class}`
+- Decision logs: `decision:{date}:{id}`
+- Scoring weights: `weights:current`
+## Files to Create/Modify
+| File                                                      | Action | Purpose                                  |
+| --------------------------------------------------------- | ------ | ---------------------------------------- |
+| `crates/skipper-hands/bundled/shipwright/HAND.toml`      | Create | Hand definition                          |
+| `crates/skipper-hands/bundled/shipwright/SKILL.md`       | Create | Domain knowledge (from fork)             |
+| `crates/skipper-shipwright/src/tools.rs`                 | Create | 8 tool handlers + definitions + dispatch |
+| `crates/skipper-shipwright/src/memory/kernel_adapter.rs` | Create | KernelHandle memory bridge               |
+| `crates/skipper-shipwright/src/lib.rs`                   | Modify | Export tools module                      |
+| `crates/skipper-hands/src/bundled.rs`                    | Modify | Register Shipwright in bundled_hands()   |
+| `crates/skipper-runtime/src/tool_runner.rs`              | Modify | Wire dispatch + definitions              |
+| `crates/skipper-runtime/Cargo.toml`                      | Modify | Add skipper-shipwright dep              |
+| `crates/skipper-shipwright/Cargo.toml`                   | Modify | Add skipper-types dep                   |
+| `crates/skipper-shipwright/tests/tool_tests.rs`          | Create | Tool handler unit tests                  |
+## E2E Verification
+1. `cargo build --workspace --lib` — compiles
+2. `cargo test --workspace` — all tests pass
+3. `cargo clippy --workspace --all-targets -- -D warnings` — zero warnings
+4. Hand activation: `POST /api/hands/shipwright/activate` spawns agent
+5. Pipeline via agent: send message, agent calls `shipwright_pipeline_start`
+6. Memory persistence: store failure pattern, search it back
+7. Fleet status: `shipwright_fleet_status` returns data
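
Note on the Memory Wiring section of the design doc above: the adapter behaviour and key formats it lists could look roughly like the following. This is a minimal sketch under stated assumptions, not the crate's code. The `KernelMemory` trait is a hypothetical stand-in for the `memory_store` / `memory_recall` part of `KernelHandle` (whose real signatures are not shown in the diff), values are plain strings, and `MemoryAdapter` and the key-builder function names are illustrative only.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the kernel memory API. The real `KernelHandle`
/// trait is not part of this diff, so these signatures are assumptions.
pub trait KernelMemory {
    fn memory_store(&mut self, key: &str, value: &str);
    fn memory_recall(&self, key: &str) -> Option<String>;
}

// Key builders following the formats listed under "Stored data".
pub fn pipeline_key(id: &str) -> String {
    format!("pipeline:{id}")
}

pub fn failure_key(repo: &str, error_class: &str) -> String {
    format!("failure:{repo}:{error_class}")
}

pub fn decision_key(date: &str, id: &str) -> String {
    format!("decision:{date}:{id}")
}

pub const WEIGHTS_KEY: &str = "weights:current";

/// Delegates to the kernel when one is attached (production) and falls
/// back to a process-local map otherwise (tests / standalone).
pub struct MemoryAdapter {
    kernel: Option<Box<dyn KernelMemory>>,
    local: HashMap<String, String>,
}

impl MemoryAdapter {
    pub fn new(kernel: Option<Box<dyn KernelMemory>>) -> Self {
        Self { kernel, local: HashMap::new() }
    }

    pub fn store(&mut self, key: &str, value: &str) {
        match self.kernel.as_mut() {
            Some(k) => k.memory_store(key, value),
            None => {
                self.local.insert(key.to_string(), value.to_string());
            }
        }
    }

    pub fn recall(&self, key: &str) -> Option<String> {
        match self.kernel.as_ref() {
            Some(k) => k.memory_recall(key),
            None => self.local.get(key).cloned(),
        }
    }
}
```

Constructing the adapter with `MemoryAdapter::new(None)` exercises the standalone path, which is the mode the design doc reserves for tests. Keeping the key builders as free functions means both paths share one naming scheme.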

package/docs/plans/2026-03-01-unified-platform-design.md ADDED

@@ -0,0 +1,272 @@
+# Unified Platform Strategy — Shipwright + Skipper
+March 2026
+---
+## Vision
+Skipper subsumes Shipwright. Long-term, Skipper's Rust kernel becomes the execution engine. Shipwright bash scripts become the reference implementation that Skipper agents call. Pipeline, daemon, fleet, memory all run through Skipper eventually.
+## Strategy: Parallel Streams
+Three concurrent workstreams, each owning distinct files to avoid merge conflicts:
+| Stream                    | Scope                            | File Ownership                                                                                                     |
+| ------------------------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| 1. Clean Architecture     | Skipper Rust refactoring         | `crates/skipper-api/src/`, `skipper-kernel/src/`, `skipper-cli/src/`, `skipper-types/src/`, `skipper-runtime/src/` |
+| 2. Deep Integration       | Shipwright → Skipper wiring      | `crates/skipper-shipwright/`, `crates/skipper-hands/bundled/shipwright/`, new route files only                     |
+| 3. Capabilities + Cleanup | Claude Code, bash cleanup, brand | `.claude/`, `scripts/`, `dashboard/`, `website/`, docs                                                             |
+Each stream runs in its own worktree. Merge order: Stream 1 first (foundation), then Stream 2 (integration into clean modules), then Stream 3 (no Rust conflicts).
+---
+## Stream 1: Clean Architecture
+### Problem
+Five god-files concentrate too much logic:
+| File             | Lines | Issue                                |
+| ---------------- | ----- | ------------------------------------ |
+| `routes.rs`      | 8,983 | Every API endpoint in one file       |
+| `main.rs` (CLI)  | 5,671 | Every CLI command in one file        |
+| `kernel.rs`      | 5,177 | All kernel operations in one file    |
+| `tool_runner.rs` | 3,625 | All tool implementations in one file |
+| `config.rs`      | 3,579 | All config types in one file         |
+### Solution
+Decompose each into domain-oriented module trees.
+#### routes.rs → routes/
+```
+crates/skipper-api/src/routes/
+├── mod.rs          (~100 lines — router builder, re-exports)
+├── agents.rs       (~1,200 lines — CRUD, messaging, lifecycle)
+├── budget.rs       (~400 lines — budget, cost, per-agent spend)
+├── channels.rs     (~1,500 lines — channel CRUD, templates, bridge)
+├── hands.rs        (~800 lines — hand registry, install, config)
+├── network.rs      (~600 lines — OFP peers, A2A, federation)
+├── security.rs     (~500 lines — security dashboard, audit)
+├── settings.rs     (~400 lines — config read/write)
+├── triggers.rs     (~600 lines — trigger CRUD, webhooks)
+├── skills.rs       (~400 lines — skill registry)
+├── static_files.rs (~200 lines — HTML/asset serving)
+└── health.rs       (~100 lines — health, status)
+```
+#### kernel.rs → kernel/
+```
+crates/skipper-kernel/src/kernel/
+├── mod.rs          (~500 lines — KernelBuilder, startup, shutdown)
+├── agents.rs       (~1,200 lines — spawn, kill, list, lifecycle)
+├── workflows.rs    (~1,000 lines — workflow engine, triggers)
+├── channels.rs     (~800 lines — channel management)
+├── config.rs       (~600 lines — runtime config management)
+└── budget.rs       (~400 lines — budget tracking, enforcement)
+```
+#### main.rs → commands/
+```
+crates/skipper-cli/src/
+├── main.rs         (~200 lines — arg parsing, dispatcher)
+├── commands/
+│   ├── mod.rs      (~50 lines — re-exports)
+│   ├── agent.rs    (~800 lines — agent lifecycle)
+│   ├── hand.rs     (~600 lines — hand management)
+│   ├── skill.rs    (~400 lines — skill management)
+│   ├── config.rs   (~500 lines — config management)
+│   ├── channel.rs  (~400 lines — channel management)
+│   ├── auth.rs     (~300 lines — auth/login)
+│   └── daemon.rs   (~400 lines — daemon start/stop)
+```
+#### tool_runner.rs → tools/
+```
+crates/skipper-runtime/src/tools/
+├── mod.rs          (~300 lines — dispatcher, tool definition registry)
+├── filesystem.rs   (~500 lines — read, write, glob, grep)
+├── web.rs          (~400 lines — web search, fetch)
+├── shell.rs        (~400 lines — bash execution, sandbox)
+├── agents.rs       (~300 lines — agent spawn/management)
+├── notebook.rs     (~200 lines — jupyter tools)
+└── shipwright.rs   (~400 lines — feature-gated shipwright tools)
+```
+#### config.rs → config/
+```
+crates/skipper-types/src/config/
+├── mod.rs          (~500 lines — core KernelConfig, top-level types)
+├── channels.rs     (~1,000 lines — channel configs per provider)
+├── models.rs       (~800 lines — model/provider configs)
+├── budget.rs       (~400 lines — budget/cost config)
+└── display.rs      (~300 lines — Display impls, formatting)
+```
+### Principles
+- **Thin routes:** Max ~20 lines per handler. Extract, validate, delegate, format.
+- **Domain grouping:** Group by domain (agents, channels, budget), not by HTTP verb.
+- **Error types:** One `ApiError` enum with `impl IntoResponse`. No `.unwrap()` in handlers.
+- **No behavior changes:** Pure structural refactoring. Every test must continue to pass.
+### Verification
+After each decomposition:
+```bash
+cargo build --workspace --lib
+cargo test --workspace
+cargo clippy --workspace --all-targets -- -D warnings
+```
+---
+## Stream 2: Deep Integration
+### Current State
+The `skipper-shipwright` crate has 8 tools that manage pipeline state in-memory via `ShipwrightState`. The real Shipwright capabilities (bash scripts, daemon, memory, fleet) are not wired.
+### Target State
+Skipper agents can do everything Shipwright CLI can, natively.
+### Tool Enhancements
+| Tool                             | Current           | Target                                                                                            |
+| -------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------- |
+| `shipwright_pipeline_start`      | In-memory state   | Spawns `sw-pipeline.sh` subprocess, streams stage progress, writes events to Skipper event store  |
+| `shipwright_pipeline_status`     | In-memory read    | Reads `.claude/pipeline-state.md` + subprocess status. Stage timing, iteration, test results      |
+| `shipwright_decision`            | Stub scoring      | Calls `sw-decide.sh` for template selection, risk scoring. Falls back to in-memory if unavailable |
+| `shipwright_memory_store/recall` | In-memory HashMap | Reads/writes `~/.shipwright/memory/`. Syncs with Skipper memory store                             |
+| `shipwright_fleet_status`        | Stub              | Reads `~/.shipwright/fleet-config.json`, daemon state, worker pool                                |
+| `shipwright_intelligence`        | Stub              | Calls `sw-intelligence.sh analyze` with caching                                                   |
+| `shipwright_cost`                | Stub              | Reads `~/.shipwright/costs.json` and `budget.json`                                                |
+| `shipwright_daemon`              | Stub              | Start/stop/configure daemon. Read daemon metrics                                                  |
+### New Files
+```
+crates/skipper-shipwright/src/subprocess.rs    (~300 lines — spawn/monitor bash scripts)
+crates/skipper-shipwright/src/memory_bridge.rs (~200 lines — bridge to real memory files)
+crates/skipper-shipwright/src/fleet_bridge.rs  (~200 lines — read fleet/daemon state)
+crates/skipper-api/src/routes/pipelines.rs     (~400 lines — pipeline status API)
+```
+### Dashboard Integration
+- **Pipelines tab:** Active/completed pipelines with stage progress bars
+- **Fleet view:** Multi-repo fleet status, worker pool, per-repo queue depth
+- **Unified memory:** Shipwright memory readable through Skipper dashboard
+---
+## Stream 3: Claude Code Capabilities + Bash Cleanup
+### Bash Cleanup
+#### Decompose pipeline-stages.sh (3,225 lines)
+```
+scripts/lib/
+├── pipeline-stages-intake.sh    (~400 lines — intake, plan, design)
+├── pipeline-stages-build.sh     (~800 lines — build, test)
+├── pipeline-stages-review.sh    (~600 lines — review, compound_quality)
+├── pipeline-stages-delivery.sh  (~500 lines — pr, merge, deploy)
+├── pipeline-stages-monitor.sh   (~400 lines — validate, monitor)
+└── pipeline-stages.sh           (~100 lines — sources sub-files, exports stage list)
+```
+#### Adopt shared test harness
+Update 108 test scripts to `source "$SCRIPT_DIR/lib/test-helpers.sh"` instead of defining own helper functions. Estimated 15-20% code reduction.
+#### Decompose sw-loop.sh (3,366 lines)
+```
+scripts/lib/
+├── loop-iteration.sh     (~600 lines — single iteration logic)
+├── loop-convergence.sh   (~400 lines — convergence detection)
+├── loop-restart.sh       (~300 lines — session restart logic)
+├── loop-progress.sh      (~200 lines — progress.md management)
+```
+### Claude Code Capabilities
+#### 1. Skipper MCP Server
+Register Skipper's API as an MCP server for Claude Code:
+```json
+{
+  "mcpServers": {
+    "skipper": {
+      "command": "curl",
+      "args": ["-s", "http://127.0.0.1:4200/mcp"],
+      "description": "Skipper Agent OS"
+    }
+  }
+}
+```
+Tools: `skipper_spawn_agent`, `skipper_list_agents`, `skipper_send_message`, `skipper_pipeline_status`, `skipper_fleet_status`.
+#### 2. New Skills
+```
+.claude/skills/
+├── pipeline-monitor.md    — Check pipeline progress, surface blockers
+├── fleet-overview.md      — Multi-repo fleet status
+├── agent-debug.md         — Debug a stuck/failing Skipper agent
+├── cost-report.md         — Token usage and cost analysis
+```
+#### 3. Enhanced Hooks
+Add a hook that detects Skipper agent crashes and auto-captures diagnostics to memory.
+#### 4. Brand Implementation
+Execute the Refined Depths plan (`docs/plans/2026-03-01-refined-depths-implementation.md`). Touches only dashboard CSS, HTML, docs.
+---
+## Merge Strategy
+1. **Stream 1 merges first** — Pure structural refactoring, no behavior changes
+2. **Stream 2 merges second** — Integration code lands in the newly clean module structure
+3. **Stream 3 merges last** — No Rust file conflicts, only bash/CSS/docs
+If streams finish at different times, merge them in this priority order as they complete. Stream 3 is the exception: it touches no Rust, so it can merge independently at any time.
+---
+## Success Criteria
+- All 5 god-files decomposed to <1,000 lines per module
+- All 2,190+ existing tests pass
+- Zero clippy warnings
+- 8 Shipwright tools call real scripts (not stubs)
+- Pipeline status visible in Skipper dashboard
+- Shared test harness adopted by >90% of test scripts
+- MCP server functional for Claude Code integration
+- Refined Depths brand applied to both dashboards
+---
+## Risk Mitigations
+| Risk                                                    | Mitigation                                                                          |
+| ------------------------------------------------------- | ----------------------------------------------------------------------------------- |
+| Merge conflicts between streams                         | Strict file ownership boundaries. No stream touches another's files.                |
+| Decomposition breaks behavior                           | Pure structural moves, no logic changes. Run full test suite after every file move. |
+| Integration subprocess spawning fails on some platforms | Graceful fallback to in-memory stubs when bash unavailable                          |
+| Large PR size for Stream 1                              | Decompose one god-file per PR (5 PRs total)                                         |

package/docs/plans/2026-03-07-claude-code-feature-integration-design.md ADDED

@@ -0,0 +1,189 @@
+# Claude Code Feature Integration into Shipwright
+**Date:** 2026-03-07
+**Status:** Approved
+**Goal:** Integrate all new Claude Code features into Shipwright infrastructure and documentation
+## Problem
+Shipwright documents many Claude Code features in its global CLAUDE.md but doesn't actually leverage them in its pipeline, daemon, or agent infrastructure. Key gaps include effort-level routing, fallback models, structured output schemas, HTTP/prompt/agent hooks, lifecycle hooks, and MCP configuration.
+## Design
+### 1. CLI Flags Integration
+#### `--effort-level` (low/medium/high)
+Add `--effort` flag to `sw-loop.sh`, `sw-pipeline.sh`, and `sw-fix.sh`.
+Default routing by stage:
+- `low`: intake, formatting, audit/haiku agents
+- `medium`: standard builds, test execution
+- `high`: design, review, compound_quality
+Extend `select_adaptive_model()` to return effort level alongside model. The intelligence engine and self-optimizer can learn optimal effort levels per stage.
+Pass through to Claude CLI as `--effort-level` on all `claude -p` invocations.
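The default routing above could be sketched as a small lookup function. The name `effort_for_stage`, the optional override argument, and the stage labels `formatting` and `audit` are assumptions for illustration; the design only lists the groupings and does not show how `select_adaptive_model()` would return them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map a pipeline stage to a default effort level,
# following the routing list in this design. An explicit override wins.
effort_for_stage() {
  local stage="$1"
  local override="${2:-}"
  if [[ -n "$override" ]]; then
    echo "$override"
    return 0
  fi
  case "$stage" in
    intake|formatting|audit)        echo "low" ;;
    design|review|compound_quality) echo "high" ;;
    *)                              echo "medium" ;;  # builds, tests, anything unlisted
  esac
}

# Show the default for a few stages.
for stage in intake build review; do
  printf '%s -> %s\n' "$stage" "$(effort_for_stage "$stage")"
done
```

Treating `medium` as the catch-all keeps unknown or newly added stages on the middle setting instead of failing.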
+#### `--fallback-model`
+Add `--fallback-model` flag to `sw-loop.sh` and `sw-pipeline.sh`, default `sonnet`.
+Every `claude -p` invocation gets `--fallback-model` so agents auto-recover from rate limits without pipeline failure.
+New `fallback_model` field in `daemon-config.json`, injected into spawned pipelines.
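One way the two flags might be threaded into each invocation is to build them once and append them to every call. This is a minimal sketch: `claude_extra_args` is a hypothetical helper, the flag names are taken from this design as written, and the command is printed, not executed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: emit the extra flags this design adds to each `claude -p` call.
# Prints one argument per line so callers can read them into an array.
claude_extra_args() {
  local effort="${1:-medium}"
  local fallback="${2:-sonnet}"   # default value taken from the design
  printf '%s\n' "--effort-level" "$effort" "--fallback-model" "$fallback"
}

# Collect the flags into an array (avoids mapfile so it also runs on bash 3.2).
args=()
while IFS= read -r arg; do
  args+=("$arg")
done < <(claude_extra_args high)

# Print the command instead of running it, since this is only a sketch.
echo "claude -p \"<prompt>\" ${args[*]}"
```

A value read from the `fallback_model` field in `daemon-config.json` would be passed as the second argument in place of the default.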
+#### `--json-schema` for Structured Output
+Create `schemas/` directory with reusable JSON Schema files:
+| Schema                  | Purpose                        |
+| ----------------------- | ------------------------------ |
+| `iteration-result.json` | Loop iteration progress report |
+| `audit-result.json`     | Audit agent pass/fail/findings |
+| `quality-gate.json`     | Quality gate evaluation result |
+| `stage-handoff.json`    | Pipeline stage context handoff |
+Use `--json-schema <file>` on Claude CLI invocations where structured output replaces free-text parsing (audit agents, quality gates, loop progress detection).
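For a sense of what one of these schema files might look like, the sketch below writes a small JSON Schema for an audit result into a temporary directory. Every field name (`status`, `findings`) is an assumption; the design names the file and its purpose but not its contents.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a hypothetical audit-result schema; all field names are assumptions.
schema_dir="$(mktemp -d)/schemas"
mkdir -p "$schema_dir"
cat > "$schema_dir/audit-result.json" <<'EOF'
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["status", "findings"],
  "properties": {
    "status": { "type": "string", "enum": ["pass", "fail"] },
    "findings": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "additionalProperties": false
}
EOF
echo "wrote $schema_dir/audit-result.json"
```

Constraining `status` to an enum is what would let a caller branch on the result without parsing free text.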
+### 2. Hook System Expansion
+#### HTTP Hooks
+Register HTTP hooks in `settings.json` that POST pipeline events to:
+- Dashboard server (`http://localhost:PORT/api/events`) when running
+- Configurable webhook URLs from `daemon-config.json` -> `webhooks[]`
+Format: Same JSON payload as `events.jsonl`, sent as POST body.
+Support `headers` with env var interpolation:
+```json
+{
+  "type": "http",
+  "url": "https://hooks.slack.com/services/...",
+  "headers": {
+    "Authorization": "Bearer $SLACK_TOKEN"
+  }
+}
+```
+#### Prompt Hooks (LLM-evaluated gates)
+Add prompt hooks for quality-sensitive events:
+- `PostToolUse` on `Bash`: "Did the tests pass based on this output?"
+- PR stage: "Is this PR description complete and accurate?"
+- Uses Haiku by default (cheap, fast)
+#### Agent Hooks (multi-turn verification)
+For `compound_quality` stage: agent hook with tool access verifies codebase state matches claimed changes. Configure with `maxTurns: 10`, `timeout: 60`.
+For deploy stage: agent hook runs smoke tests and verifies deployment.
+#### Lifecycle Hooks
+| Hook                                        | Use                                                                          |
+| ------------------------------------------- | ---------------------------------------------------------------------------- |
+| `WorktreeCreate`                            | Auto-copy `.claude/settings.json` and `daemon-config.json` into new worktree |
+| `WorktreeRemove`                            | Clean up heartbeat files and stale state for removed worktree agents         |
+| `InstructionsLoaded` (matcher: `"compact"`) | Reload project rules after auto-compaction                                   |
+| `ConfigChange`                              | Daemon reacts to config changes without restart                              |
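The `WorktreeCreate` row could be implemented along the following lines. This is a sketch under stated assumptions: `copy_worktree_config` is a hypothetical name, `daemon-config.json` is assumed to sit at the repository root, and how the hook learns the new worktree path is not covered by this design, so both paths are plain arguments here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper for a WorktreeCreate hook: copy config files from the
# main checkout into a new worktree, skipping files that do not exist.
copy_worktree_config() {
  local src_root="$1" dst_root="$2" rel
  for rel in ".claude/settings.json" "daemon-config.json"; do
    if [[ -f "$src_root/$rel" ]]; then
      mkdir -p "$(dirname "$dst_root/$rel")"
      cp "$src_root/$rel" "$dst_root/$rel"
    fi
  done
}

# Demo against throwaway directories; only settings.json exists in the source.
src="$(mktemp -d)"
dst="$(mktemp -d)"
mkdir -p "$src/.claude"
echo '{}' > "$src/.claude/settings.json"
copy_worktree_config "$src" "$dst"
ls -A "$dst"
```

Skipping missing files keeps the hook harmless in projects that have no daemon configuration.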
+#### PreToolUse Input Modification
+- Auto-inject `set -euo pipefail` reminder into Bash tool commands targeting `.sh` files (existing behavior, now also modifies input)
+- Block `git push --no-verify` via exit code 2
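The blocking rule above might look like the following inside a `PreToolUse` hook. This is a sketch, not the shipped hook: the function names are hypothetical, and it assumes the hook payload arrives as JSON on stdin and that returning exit code 2 blocks the tool call, as the bullet describes. It matches on the raw payload to avoid a `jq` dependency; a real hook would more likely extract the command field first.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return 2 when the text contains a `git push` that also passes --no-verify,
# and 0 otherwise. The message goes to stderr so the caller can surface it.
check_bash_command() {
  local cmd="$1"
  if [[ "$cmd" =~ git[[:space:]]+push ]] && [[ "$cmd" == *"--no-verify"* ]]; then
    echo "Blocked: git push --no-verify is not allowed" >&2
    return 2
  fi
  return 0
}

# Read the hook payload from stdin and check it as a raw string.
pre_tool_use_main() {
  local payload
  payload="$(cat)"
  check_bash_command "$payload"
}
```

The hook script would end with `pre_tool_use_main` so that the function's return value becomes the script's exit code.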
+### 3. Environment & MCP Configuration
+#### New env vars in settings.json
+```json
+{
+  "env": {
+    "ENABLE_TOOL_SEARCH": "auto",
+    "MAX_MCP_OUTPUT_TOKENS": "50000",
+    "CLAUDE_CODE_EFFORT_LEVEL": "medium"
+  }
+}
+```
+`sw-init.sh` and `sw-prep.sh` set these during project setup. Pipeline stages override effort level per-stage via env.
+#### Managed MCP (managed-mcp.json)
+Generate template during `shipwright prep`:
+- Allow project-relevant MCP servers
+- Deny potentially dangerous servers in pipeline agents
+- Configure `allowedMcpServers` / `deniedMcpServers` patterns
+- Useful for fleet/daemon mode where agents shouldn't have unrestricted MCP access
+#### File Suggestion (fileSuggestion)
+Create `scripts/shipwright-file-suggest.sh` for custom `@` autocomplete:
+- `pipeline-state.md`, `daemon-config.json`, `fleet-config.json`
+- Agent definitions (`.claude/agents/*.md`)
+- Loop state, schemas, pipeline artifacts
+Register via `"fileSuggestion": "./scripts/shipwright-file-suggest.sh"` in settings.
+### 4. Documentation Updates
+- `.claude/CLAUDE.md`: New sections for CLI flags, expanded hooks, MCP config, env vars
+- `sw-init.sh`: Setup output mentions new features
+- `sw-doctor.sh`: Validate new settings (effort level, fallback model, webhook URLs)
+- `schemas/README.md`: Schema documentation
+## Implementation Order
+1. **PR 1: CLI Flags** — `--effort`, `--fallback-model`, `--json-schema` + schemas directory
+2. **PR 2: Hook System** — HTTP hooks, prompt/agent hooks, lifecycle hooks, input modification
+3. **PR 3: Environment & MCP** — env vars, managed-mcp.json, fileSuggestion
+4. **PR 4: Documentation** — CLAUDE.md updates, doctor checks, init output
+## Files Changed (by PR)
+### PR 1: CLI Flags
+- `scripts/sw-loop.sh` — flag parsing, pass-through
+- `scripts/sw-pipeline.sh` — per-stage effort routing
+- `scripts/sw-fix.sh` — flag pass-through
+- `scripts/lib/pipeline-stages-*.sh` — effort level in stage dispatch
+- `scripts/lib/loop-iteration.sh` — structured output for iteration results
+- New: `schemas/iteration-result.json`
+- New: `schemas/audit-result.json`
+- New: `schemas/quality-gate.json`
+- New: `schemas/stage-handoff.json`
+- `scripts/sw-loop-test.sh` — test new flags
+- `scripts/sw-pipeline-test.sh` — test effort routing
+### PR 2: Hook System
+- `.claude/settings.json` — register new hooks
+- `scripts/sw-init.sh` — generate hook config
+- `scripts/sw-prep.sh` — generate hook config
+- New: `.claude/hooks/worktree-create.sh`
+- New: `.claude/hooks/worktree-remove.sh`
+- New: `.claude/hooks/instructions-reloaded.sh`
+- New: `.claude/hooks/config-change.sh`
+- `dashboard/server.ts` — accept HTTP hook POSTs on `/api/events`
+### PR 3: Environment & MCP
+- `.claude/settings.json` — new env vars
+- `scripts/sw-init.sh` — set env vars during setup
+- `scripts/sw-prep.sh` — generate managed-mcp.json
+- `scripts/sw-doctor.sh` — validate new settings
+- New: `scripts/shipwright-file-suggest.sh`
+- New: `.claude/managed-mcp.json` (template)
+### PR 4: Documentation
+- `.claude/CLAUDE.md` — all new sections
+- `README.md` — feature mentions