shipwright-cli 3.2.0 → 3.3.0
- package/.claude/agents/code-reviewer.md +2 -0
- package/.claude/agents/devops-engineer.md +2 -0
- package/.claude/agents/doc-fleet-agent.md +2 -0
- package/.claude/agents/pipeline-agent.md +2 -0
- package/.claude/agents/shell-script-specialist.md +2 -0
- package/.claude/agents/test-specialist.md +2 -0
- package/.claude/hooks/agent-crash-capture.sh +32 -0
- package/.claude/hooks/post-tool-use.sh +3 -2
- package/.claude/hooks/pre-tool-use.sh +35 -3
- package/README.md +4 -4
- package/claude-code/hooks/config-change.sh +18 -0
- package/claude-code/hooks/instructions-reloaded.sh +7 -0
- package/claude-code/hooks/worktree-create.sh +25 -0
- package/claude-code/hooks/worktree-remove.sh +20 -0
- package/config/code-constitution.json +130 -0
- package/dashboard/middleware/auth.ts +134 -0
- package/dashboard/middleware/constants.ts +21 -0
- package/dashboard/public/index.html +2 -6
- package/dashboard/public/styles.css +100 -97
- package/dashboard/routes/auth.ts +38 -0
- package/dashboard/server.ts +66 -25
- package/dashboard/services/config.ts +26 -0
- package/dashboard/services/db.ts +118 -0
- package/dashboard/src/canvas/pixel-agent.ts +298 -0
- package/dashboard/src/canvas/pixel-sprites.ts +440 -0
- package/dashboard/src/canvas/shipyard-effects.ts +367 -0
- package/dashboard/src/canvas/shipyard-scene.ts +616 -0
- package/dashboard/src/canvas/submarine-layout.ts +267 -0
- package/dashboard/src/components/header.ts +8 -7
- package/dashboard/src/core/router.ts +1 -0
- package/dashboard/src/design/submarine-theme.ts +253 -0
- package/dashboard/src/main.ts +2 -0
- package/dashboard/src/types/api.ts +2 -1
- package/dashboard/src/views/activity.ts +2 -1
- package/dashboard/src/views/shipyard.ts +39 -0
- package/dashboard/types/index.ts +166 -0
- package/docs/plans/2026-02-28-compound-audit-and-shipyard-design.md +186 -0
- package/docs/plans/2026-02-28-skipper-shipwright-implementation-plan.md +1182 -0
- package/docs/plans/2026-02-28-skipper-shipwright-integration-design.md +531 -0
- package/docs/plans/2026-03-01-ai-powered-skill-injection-design.md +298 -0
- package/docs/plans/2026-03-01-ai-powered-skill-injection-plan.md +1109 -0
- package/docs/plans/2026-03-01-capabilities-cleanup-plan.md +658 -0
- package/docs/plans/2026-03-01-clean-architecture-plan.md +924 -0
- package/docs/plans/2026-03-01-compound-audit-cascade-design.md +191 -0
- package/docs/plans/2026-03-01-compound-audit-cascade-plan.md +921 -0
- package/docs/plans/2026-03-01-deep-integration-plan.md +851 -0
- package/docs/plans/2026-03-01-pipeline-audit-trail-design.md +145 -0
- package/docs/plans/2026-03-01-pipeline-audit-trail-plan.md +770 -0
- package/docs/plans/2026-03-01-refined-depths-brand-design.md +382 -0
- package/docs/plans/2026-03-01-refined-depths-implementation.md +599 -0
- package/docs/plans/2026-03-01-skipper-kernel-integration-design.md +203 -0
- package/docs/plans/2026-03-01-unified-platform-design.md +272 -0
- package/docs/plans/2026-03-07-claude-code-feature-integration-design.md +189 -0
- package/docs/plans/2026-03-07-claude-code-feature-integration-plan.md +1165 -0
- package/docs/research/BACKLOG_QUICK_REFERENCE.md +352 -0
- package/docs/research/CUTTING_EDGE_RESEARCH_2026.md +546 -0
- package/docs/research/RESEARCH_INDEX.md +439 -0
- package/docs/research/RESEARCH_SOURCES.md +440 -0
- package/docs/research/RESEARCH_SUMMARY.txt +275 -0
- package/docs/superpowers/specs/2026-03-10-pipeline-quality-revolution-design.md +341 -0
- package/package.json +2 -2
- package/scripts/lib/adaptive-model.sh +427 -0
- package/scripts/lib/adaptive-timeout.sh +316 -0
- package/scripts/lib/audit-trail.sh +309 -0
- package/scripts/lib/auto-recovery.sh +471 -0
- package/scripts/lib/bandit-selector.sh +431 -0
- package/scripts/lib/bootstrap.sh +104 -2
- package/scripts/lib/causal-graph.sh +455 -0
- package/scripts/lib/compat.sh +126 -0
- package/scripts/lib/compound-audit.sh +337 -0
- package/scripts/lib/constitutional.sh +454 -0
- package/scripts/lib/context-budget.sh +359 -0
- package/scripts/lib/convergence.sh +594 -0
- package/scripts/lib/cost-optimizer.sh +634 -0
- package/scripts/lib/daemon-adaptive.sh +10 -0
- package/scripts/lib/daemon-dispatch.sh +106 -17
- package/scripts/lib/daemon-failure.sh +34 -4
- package/scripts/lib/daemon-patrol.sh +23 -2
- package/scripts/lib/daemon-poll-github.sh +361 -0
- package/scripts/lib/daemon-poll-health.sh +299 -0
- package/scripts/lib/daemon-poll.sh +27 -611
- package/scripts/lib/daemon-state.sh +112 -66
- package/scripts/lib/daemon-triage.sh +10 -0
- package/scripts/lib/dod-scorecard.sh +442 -0
- package/scripts/lib/error-actionability.sh +300 -0
- package/scripts/lib/formal-spec.sh +461 -0
- package/scripts/lib/helpers.sh +177 -4
- package/scripts/lib/intent-analysis.sh +409 -0
- package/scripts/lib/loop-convergence.sh +350 -0
- package/scripts/lib/loop-iteration.sh +682 -0
- package/scripts/lib/loop-progress.sh +48 -0
- package/scripts/lib/loop-restart.sh +185 -0
- package/scripts/lib/memory-effectiveness.sh +506 -0
- package/scripts/lib/mutation-executor.sh +352 -0
- package/scripts/lib/outcome-feedback.sh +521 -0
- package/scripts/lib/pipeline-cli.sh +336 -0
- package/scripts/lib/pipeline-commands.sh +1216 -0
- package/scripts/lib/pipeline-detection.sh +100 -2
- package/scripts/lib/pipeline-execution.sh +897 -0
- package/scripts/lib/pipeline-github.sh +28 -3
- package/scripts/lib/pipeline-intelligence-compound.sh +431 -0
- package/scripts/lib/pipeline-intelligence-scoring.sh +407 -0
- package/scripts/lib/pipeline-intelligence-skip.sh +181 -0
- package/scripts/lib/pipeline-intelligence.sh +100 -1136
- package/scripts/lib/pipeline-quality-bash-compat.sh +182 -0
- package/scripts/lib/pipeline-quality-checks.sh +17 -715
- package/scripts/lib/pipeline-quality-gates.sh +563 -0
- package/scripts/lib/pipeline-stages-build.sh +730 -0
- package/scripts/lib/pipeline-stages-delivery.sh +965 -0
- package/scripts/lib/pipeline-stages-intake.sh +1133 -0
- package/scripts/lib/pipeline-stages-monitor.sh +407 -0
- package/scripts/lib/pipeline-stages-review.sh +1022 -0
- package/scripts/lib/pipeline-stages.sh +59 -2929
- package/scripts/lib/pipeline-state.sh +36 -5
- package/scripts/lib/pipeline-util.sh +487 -0
- package/scripts/lib/policy-learner.sh +438 -0
- package/scripts/lib/process-reward.sh +493 -0
- package/scripts/lib/project-detect.sh +649 -0
- package/scripts/lib/quality-profile.sh +334 -0
- package/scripts/lib/recruit-commands.sh +885 -0
- package/scripts/lib/recruit-learning.sh +739 -0
- package/scripts/lib/recruit-roles.sh +648 -0
- package/scripts/lib/reward-aggregator.sh +458 -0
- package/scripts/lib/rl-optimizer.sh +362 -0
- package/scripts/lib/root-cause.sh +427 -0
- package/scripts/lib/scope-enforcement.sh +445 -0
- package/scripts/lib/session-restart.sh +493 -0
- package/scripts/lib/skill-memory.sh +300 -0
- package/scripts/lib/skill-registry.sh +775 -0
- package/scripts/lib/spec-driven.sh +476 -0
- package/scripts/lib/test-helpers.sh +18 -7
- package/scripts/lib/test-holdout.sh +429 -0
- package/scripts/lib/test-optimizer.sh +511 -0
- package/scripts/shipwright-file-suggest.sh +45 -0
- package/scripts/skills/adversarial-quality.md +61 -0
- package/scripts/skills/api-design.md +44 -0
- package/scripts/skills/architecture-design.md +50 -0
- package/scripts/skills/brainstorming.md +43 -0
- package/scripts/skills/data-pipeline.md +44 -0
- package/scripts/skills/deploy-safety.md +64 -0
- package/scripts/skills/documentation.md +38 -0
- package/scripts/skills/frontend-design.md +45 -0
- package/scripts/skills/generated/.gitkeep +0 -0
- package/scripts/skills/generated/_refinements/.gitkeep +0 -0
- package/scripts/skills/generated/_refinements/adversarial-quality.patch.md +3 -0
- package/scripts/skills/generated/_refinements/architecture-design.patch.md +3 -0
- package/scripts/skills/generated/_refinements/brainstorming.patch.md +3 -0
- package/scripts/skills/generated/cli-version-management.md +29 -0
- package/scripts/skills/generated/collection-system-validation.md +99 -0
- package/scripts/skills/generated/large-scale-c-refactoring-coordination.md +97 -0
- package/scripts/skills/generated/pattern-matching-similarity-scoring.md +195 -0
- package/scripts/skills/generated/test-parallelization-detection.md +65 -0
- package/scripts/skills/observability.md +79 -0
- package/scripts/skills/performance.md +48 -0
- package/scripts/skills/pr-quality.md +49 -0
- package/scripts/skills/product-thinking.md +43 -0
- package/scripts/skills/security-audit.md +49 -0
- package/scripts/skills/systematic-debugging.md +40 -0
- package/scripts/skills/testing-strategy.md +47 -0
- package/scripts/skills/two-stage-review.md +52 -0
- package/scripts/skills/validation-thoroughness.md +55 -0
- package/scripts/sw +9 -3
- package/scripts/sw-activity.sh +9 -2
- package/scripts/sw-adaptive.sh +2 -1
- package/scripts/sw-adversarial.sh +2 -1
- package/scripts/sw-architecture-enforcer.sh +3 -1
- package/scripts/sw-auth.sh +12 -2
- package/scripts/sw-autonomous.sh +5 -1
- package/scripts/sw-changelog.sh +4 -1
- package/scripts/sw-checkpoint.sh +2 -1
- package/scripts/sw-ci.sh +5 -1
- package/scripts/sw-cleanup.sh +4 -26
- package/scripts/sw-code-review.sh +10 -4
- package/scripts/sw-connect.sh +2 -1
- package/scripts/sw-context.sh +2 -1
- package/scripts/sw-cost.sh +48 -3
- package/scripts/sw-daemon.sh +66 -9
- package/scripts/sw-dashboard.sh +3 -1
- package/scripts/sw-db.sh +59 -16
- package/scripts/sw-decide.sh +8 -2
- package/scripts/sw-decompose.sh +360 -17
- package/scripts/sw-deps.sh +4 -1
- package/scripts/sw-developer-simulation.sh +4 -1
- package/scripts/sw-discovery.sh +325 -2
- package/scripts/sw-doc-fleet.sh +4 -1
- package/scripts/sw-docs-agent.sh +3 -1
- package/scripts/sw-docs.sh +2 -1
- package/scripts/sw-doctor.sh +453 -2
- package/scripts/sw-dora.sh +4 -1
- package/scripts/sw-durable.sh +4 -3
- package/scripts/sw-e2e-orchestrator.sh +17 -16
- package/scripts/sw-eventbus.sh +7 -1
- package/scripts/sw-evidence.sh +364 -12
- package/scripts/sw-feedback.sh +550 -9
- package/scripts/sw-fix.sh +20 -1
- package/scripts/sw-fleet-discover.sh +6 -2
- package/scripts/sw-fleet-viz.sh +4 -1
- package/scripts/sw-fleet.sh +5 -1
- package/scripts/sw-github-app.sh +16 -3
- package/scripts/sw-github-checks.sh +3 -2
- package/scripts/sw-github-deploy.sh +3 -2
- package/scripts/sw-github-graphql.sh +18 -7
- package/scripts/sw-guild.sh +5 -1
- package/scripts/sw-heartbeat.sh +5 -30
- package/scripts/sw-hello.sh +67 -0
- package/scripts/sw-hygiene.sh +6 -1
- package/scripts/sw-incident.sh +265 -1
- package/scripts/sw-init.sh +18 -2
- package/scripts/sw-instrument.sh +10 -2
- package/scripts/sw-intelligence.sh +42 -6
- package/scripts/sw-jira.sh +5 -1
- package/scripts/sw-launchd.sh +2 -1
- package/scripts/sw-linear.sh +4 -1
- package/scripts/sw-logs.sh +4 -1
- package/scripts/sw-loop.sh +432 -1128
- package/scripts/sw-memory.sh +356 -2
- package/scripts/sw-mission-control.sh +6 -1
- package/scripts/sw-model-router.sh +481 -26
- package/scripts/sw-otel.sh +13 -4
- package/scripts/sw-oversight.sh +14 -5
- package/scripts/sw-patrol-meta.sh +334 -0
- package/scripts/sw-pipeline-composer.sh +5 -1
- package/scripts/sw-pipeline-vitals.sh +2 -1
- package/scripts/sw-pipeline.sh +53 -2664
- package/scripts/sw-pm.sh +12 -5
- package/scripts/sw-pr-lifecycle.sh +2 -1
- package/scripts/sw-predictive.sh +7 -1
- package/scripts/sw-prep.sh +185 -2
- package/scripts/sw-ps.sh +5 -25
- package/scripts/sw-public-dashboard.sh +15 -3
- package/scripts/sw-quality.sh +2 -1
- package/scripts/sw-reaper.sh +8 -25
- package/scripts/sw-recruit.sh +156 -2303
- package/scripts/sw-regression.sh +19 -12
- package/scripts/sw-release-manager.sh +3 -1
- package/scripts/sw-release.sh +4 -1
- package/scripts/sw-remote.sh +3 -1
- package/scripts/sw-replay.sh +7 -1
- package/scripts/sw-retro.sh +158 -1
- package/scripts/sw-review-rerun.sh +3 -1
- package/scripts/sw-scale.sh +10 -3
- package/scripts/sw-security-audit.sh +6 -1
- package/scripts/sw-self-optimize.sh +6 -3
- package/scripts/sw-session.sh +9 -3
- package/scripts/sw-setup.sh +3 -1
- package/scripts/sw-stall-detector.sh +406 -0
- package/scripts/sw-standup.sh +15 -7
- package/scripts/sw-status.sh +3 -1
- package/scripts/sw-strategic.sh +4 -1
- package/scripts/sw-stream.sh +7 -1
- package/scripts/sw-swarm.sh +18 -6
- package/scripts/sw-team-stages.sh +13 -6
- package/scripts/sw-templates.sh +5 -29
- package/scripts/sw-testgen.sh +7 -1
- package/scripts/sw-tmux-pipeline.sh +4 -1
- package/scripts/sw-tmux-role-color.sh +2 -0
- package/scripts/sw-tmux-status.sh +1 -1
- package/scripts/sw-tmux.sh +3 -1
- package/scripts/sw-trace.sh +3 -1
- package/scripts/sw-tracker-github.sh +3 -0
- package/scripts/sw-tracker-jira.sh +3 -0
- package/scripts/sw-tracker-linear.sh +3 -0
- package/scripts/sw-tracker.sh +3 -1
- package/scripts/sw-triage.sh +2 -1
- package/scripts/sw-upgrade.sh +3 -1
- package/scripts/sw-ux.sh +5 -2
- package/scripts/sw-webhook.sh +3 -1
- package/scripts/sw-widgets.sh +3 -1
- package/scripts/sw-worktree.sh +15 -3
- package/scripts/test-skill-injection.sh +1233 -0
- package/templates/pipelines/autonomous.json +27 -3
- package/templates/pipelines/cost-aware.json +34 -8
- package/templates/pipelines/deployed.json +12 -0
- package/templates/pipelines/enterprise.json +12 -0
- package/templates/pipelines/fast.json +6 -0
- package/templates/pipelines/full.json +27 -3
- package/templates/pipelines/hotfix.json +6 -0
- package/templates/pipelines/standard.json +12 -0
- package/templates/pipelines/tdd.json +12 -0
@@ -0,0 +1,298 @@
# Design: AI-Powered Skill Injection

## Problem

The skill injection system uses static rules: label grep for issue type, hardcoded lookup tables for skill selection, keyword regex for body analysis, linear formulas for complexity weighting. These heuristics can't understand nuance: "Add OAuth login page" is both frontend and security, but the grep picks whichever matches first. Skills are concatenated verbatim into prompts regardless of what the issue actually needs, producing bloated, generic guidance.

The skill memory system records outcomes but never feeds them back into selection. Generated recommendations are never consumed. The system doesn't learn.

## Approach: LLM-as-Router

**One haiku LLM call at intake replaces all heuristics.** The LLM reads the issue, selects from the skill library, generates new skills to fill gaps, and produces targeted rationale. Cost: ~$0.002 for the intake call, or ~$0.004 per pipeline run once the completion-time outcome call (section 5) is included.

The 17 curated skill files become a "skill library" that the LLM selects from intelligently. The static registry stays as the last layer of a three-level fallback chain (LLM, then adaptive keyword selection, then static registry).

### Why not per-stage LLM calls?

Per-stage calls (4-5x more) add latency and cost for marginal benefit. Intake has full issue context; subsequent stages add incremental context (plan, diff) but the skill selection rarely needs to change. If mid-pipeline adaptation is needed later, the architecture supports it, but start with one call.

### Why not LLM-as-Synthesizer (generate prompts from scratch)?

Generating custom prompts per-issue loses the curated knowledge in skill files, is non-deterministic, and can't fall back gracefully. The skill files are a knowledge base: the LLM should route to them, not replace them.

---

## Design

### 1. Smart Intake Analysis (`skill_analyze_issue`)

**New function in `skill-registry.sh`.**

Calls `_intelligence_call_claude()` (haiku, cached, with fallback) with:

1. **Issue context**: title + body + labels
2. **Skill catalog**: compact index of all skills (curated + generated) with one-line descriptions
3. **Memory context**: top recommendations from `skill_memory_get_recommendations()` for this issue type
4. **Intelligence analysis**: reuses output from `intelligence_analyze_issue()` if available

**Returns structured JSON:**

```json
{
  "issue_type": "frontend",
  "confidence": 0.92,
  "secondary_domains": ["accessibility", "real-time"],
  "complexity_assessment": {
    "score": 6,
    "reasoning": "WebSocket integration with CSS animation and ARIA — moderate"
  },
  "skill_plan": {
    "plan": ["brainstorming", "frontend-design", "product-thinking"],
    "design": ["architecture-design", "frontend-design"],
    "build": ["frontend-design"],
    "review": ["two-stage-review"],
    "compound_quality": ["adversarial-quality"]
  },
  "skill_rationale": {
    "frontend-design": "Progress bar needs ARIA progressbar role, responsive CSS, touch targets",
    "product-thinking": "UX decision: bar vs percentage text vs stage breakdown"
  },
  "generated_skills": [
    {
      "name": "websocket-realtime",
      "reason": "Issue requires WebSocket event handling — no existing skill covers real-time data flow",
      "content": "## WebSocket Real-Time Patterns\n\n### Connection Management\n..."
    }
  ],
  "review_focus": ["accessibility compliance", "responsive breakpoints"],
  "risk_areas": ["ETA accuracy with non-uniform stage times"]
}
```

**Written to:** `$ARTIFACTS_DIR/skill-plan.json`

**Generated skills saved to:** `scripts/skills/generated/{name}.md`

**Fallback chain:**

1. `skill_analyze_issue()`: LLM-powered (primary)
2. `skill_select_adaptive()`: body keywords + complexity weighting (secondary)
3. `skill_get_prompts()`: static registry (tertiary)

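The three-layer chain above can be pictured as a loop that tries each layer in order and stops at the first one that succeeds with non-empty output. This is only a sketch: `skill_resolve` is a hypothetical wrapper name, and the three function bodies are stubs, since the real signatures and return values are not shown in this document.

```bash
# Stubs standing in for the three real layers (behavior is made up for the demo).
skill_analyze_issue()   { return 1; }               # pretend the LLM call failed
skill_select_adaptive() { echo "frontend-design"; }
skill_get_prompts()     { echo "static-default"; }

# Hypothetical wrapper: try each layer in order, first success with output wins.
skill_resolve() {
    local layer out
    for layer in skill_analyze_issue skill_select_adaptive skill_get_prompts; do
        if out=$("$layer" "$@") && [[ -n "$out" ]]; then
            printf '%s\n' "$out"
            return 0
        fi
    done
    return 1
}

skill_resolve "Add OAuth login page"   # prints: frontend-design
```

Treating empty output as a failure keeps a layer that exits 0 but selects nothing from short-circuiting the chain.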
### 2. Skill Catalog Builder (`skill_build_catalog`)

**New function in `skill-registry.sh`.**

Scans both directories and builds a compact index for the LLM prompt:

```
scripts/skills/           → curated skills (17)
scripts/skills/generated/ → AI-generated skills (grows over time)
```

**Output format** (one line per skill, ~30 tokens each):

```
- brainstorming: Socratic design refinement — task decomposition, alternatives, risk analysis, definition of done
- frontend-design: UI/UX patterns — accessibility (ARIA, WCAG), responsive design, component architecture, performance
- websocket-realtime [generated]: WebSocket event handling — connection management, reconnection, state sync
```

**Includes memory context** when available:

```
- frontend-design: UI/UX patterns — accessibility, responsive, components [85% success rate for frontend/plan]
- testing-strategy: Test design — coverage, edge cases, property-based [40% success rate for frontend/plan]
```

The LLM sees which skills have proven track records for this issue type.

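A minimal sketch of the directory scan, under one loud assumption: the one-line description is taken from the first non-blank line of each skill file (with any leading `#` removed). The document does not say where descriptions come from, so the real builder may use a different source. The memory suffix (`[85% success rate ...]`) is left out here.

```bash
skill_build_catalog() {
    local skills_dir="${1:-scripts/skills}" file name desc tag
    for file in "$skills_dir"/*.md "$skills_dir"/generated/*.md; do
        [[ -f "$file" ]] || continue    # an unmatched glob stays literal; skip it
        name=$(basename "$file" .md)
        tag=""
        [[ "$file" == "$skills_dir"/generated/* ]] && tag=" [generated]"
        # Assumption: first non-blank line, minus leading '#', is the description.
        desc=$(sed -n -e 's/^#* *//' -e '/./{p;q;}' "$file")
        printf -- '- %s%s: %s\n' "$name" "$tag" "$desc"
    done
}
```

Calling `skill_build_catalog scripts/skills` would print one `- name: description` line per curated skill, then one `- name [generated]: description` line per generated skill. Files under `_refinements/` are not listed, because the `generated/*.md` glob does not descend into subdirectories.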
### 3. Downstream Stage Consumption (`skill_load_from_plan`)

**New function in `skill-registry.sh`.** Replaces per-stage `skill_select_adaptive()` calls.

```bash
skill_load_from_plan() {
    local stage="$1"   # pipeline stage name, e.g. "plan" or "build"
    # 1. Read $ARTIFACTS_DIR/skill-plan.json
    # 2. Extract skills array for this stage
    # 3. For each skill:
    #    - Load from scripts/skills/{name}.md or scripts/skills/generated/{name}.md
    #    - If _refinements/{name}.patch.md exists → append refinement
    # 4. Prepend skill_rationale for each skill (targeted guidance)
    # 5. Return combined prompt text
    # Fallback: if skill-plan.json missing → skill_select_adaptive()
}
```

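One way the numbered steps could be filled in, assuming `jq` is available for reading the plan. `SKILLS_DIR` is a hypothetical override added so the sketch can be pointed at a test directory, the `skill_select_adaptive` body is a stub (its real signature is not shown), and the slug check on skill names is an added precaution rather than something the design specifies.

```bash
# Stub for the existing keyword-based selector (real signature not shown here).
skill_select_adaptive() { echo "[adaptive selection for stage: $1]"; }

skill_load_from_plan() {
    local stage="$1"
    local plan="${ARTIFACTS_DIR:-.}/skill-plan.json"
    local dir="${SKILLS_DIR:-scripts/skills}"   # SKILLS_DIR is a hypothetical override
    local name file patch

    # Fallback: without a plan on disk, defer to the adaptive selector.
    if [[ ! -f "$plan" ]]; then
        skill_select_adaptive "$stage"
        return
    fi

    # Rationale goes first so the targeted guidance is read before the skill text.
    echo "### Why these skills were selected:"
    jq -r --arg s "$stage" '
        .skill_rationale as $why
        | (.skill_plan[$s] // [])[]
        | select($why[.] != null)
        | "- \(.): \($why[.])"' "$plan"
    echo

    while IFS= read -r name; do
        # Names become file paths, so accept slug-style names only.
        [[ "$name" =~ ^[a-z0-9][a-z0-9-]*$ ]] || continue
        file="$dir/$name.md"
        [[ -f "$file" ]] || file="$dir/generated/$name.md"
        [[ -f "$file" ]] || continue
        cat "$file"
        patch="$dir/generated/_refinements/$name.patch.md"
        if [[ -f "$patch" ]]; then cat "$patch"; fi
        echo
    done < <(jq -r --arg s "$stage" '(.skill_plan[$s] // [])[]' "$plan")
}
```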
**Prompt output structure:**

```markdown
## Skill Guidance (frontend issue, AI-selected)

### Why these skills were selected:
- frontend-design: Progress bar needs ARIA progressbar role, responsive CSS, touch targets
- websocket-realtime: Pipeline updates arrive via WebSocket every 2s

### Frontend Design Expertise
[frontend-design.md content + any refinements]

### WebSocket Real-Time Patterns (auto-generated)
[generated/websocket-realtime.md content]
```

The rationale acts as a focusing lens: Claude reads it first and knows what to pay attention to.

### 4. Dynamic Skill Generation

**During intake**, if the LLM determines no existing skill covers a domain:

1. The `generated_skills` array in the JSON contains the new skill content
2. `skill_analyze_issue()` writes each generated skill to `scripts/skills/generated/{name}.md`
3. The skill is immediately available for the current pipeline
4. Future pipelines see it in the catalog and can select it without regenerating

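Step 2 above could look roughly like the following, assuming `jq` is available. `skill_save_generated` is a hypothetical helper name, not a function listed in this design. Two behaviors are assumptions added for safety: names are restricted to plain slugs because LLM output ends up as a file path, and an existing file is never overwritten so that a skill generated by an earlier run is reused as-is.

```bash
skill_save_generated() {
    local plan="$1"
    local out_dir="${2:-scripts/skills/generated}"
    local count i name

    count=$(jq '.generated_skills // [] | length' "$plan") || return 1
    mkdir -p "$out_dir"
    for ((i = 0; i < count; i++)); do
        name=$(jq -r --argjson i "$i" '.generated_skills[$i].name // empty' "$plan")
        # The name is LLM output and becomes a file path: accept plain slugs only.
        [[ "$name" =~ ^[a-z0-9][a-z0-9-]*$ ]] || continue
        # Assumption: never overwrite a skill that an earlier run already generated.
        [[ -e "$out_dir/$name.md" ]] && continue
        jq -r --argjson i "$i" '.generated_skills[$i].content // empty' "$plan" \
            > "$out_dir/$name.md"
    done
}
```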
**Directory structure:**

```
scripts/skills/
├── brainstorming.md           # Curated (hand-written)
├── frontend-design.md         # Curated
├── ...                        # 17 curated total
└── generated/                 # AI-generated, growing library
    ├── websocket-realtime.md
    ├── i18n-localization.md
    └── _refinements/          # Outcome-driven patches
        └── frontend-design.patch.md
```

**Generated skill lifecycle:**

| Verdict Count | Action |
|---|---|
| 3+ `keep` or `keep_and_refine` | Graduate to curated directory |
| 3+ `prune` | Delete the file |
| Mixed | Keep in generated, track |

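The lifecycle table can be read as a small decision function over a skill's verdict history. In this sketch `skill_lifecycle_action` is a hypothetical name and the return strings are made up. The table does not say what happens when a skill has both 3+ keep-type verdicts and 3+ `prune` verdicts, so checking the keep count first is an assumption.

```bash
# Arguments: one verdict per past run (keep, keep_and_refine, or prune).
skill_lifecycle_action() {
    local keep=0 prune=0 verdict
    for verdict in "$@"; do
        case "$verdict" in
            keep|keep_and_refine) keep=$((keep + 1)) ;;
            prune)                prune=$((prune + 1)) ;;
        esac
    done
    if   (( keep  >= 3 )); then echo "graduate"   # move to the curated directory
    elif (( prune >= 3 )); then echo "delete"     # remove the generated file
    else                        echo "track"      # mixed: keep in generated/
    fi
}

skill_lifecycle_action keep keep_and_refine keep   # prints: graduate
skill_lifecycle_action keep prune                  # prints: track
```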
### 5. Outcome Learning Loop (`skill_analyze_outcome`)

**New function in `skill-registry.sh`.**

Fires at pipeline completion (success or failure). One haiku call receives:

1. The skill plan (`skill-plan.json`)
2. Pipeline outcome (stages passed/failed)
3. Review feedback (if review ran)
4. Error context (if stages failed)

**Returns:**

```json
{
  "skill_effectiveness": {
    "frontend-design": {
      "verdict": "effective",
      "evidence": "Plan included ARIA section, review confirmed compliance",
      "learning": "stat-bar CSS reuse hint was directly followed"
    }
  },
  "refinements": [
    {
      "skill": "frontend-design",
      "addition": "For dashboard features, mention existing CSS patterns to encourage reuse"
    }
  ],
  "generated_skill_verdict": {
    "websocket-realtime": "keep_and_refine"
  }
}
```

**Actions:**

| Field | What happens |
|---|---|
| `skill_effectiveness` | Written to skill memory with verdict + evidence (replaces bare boolean) |
| `refinements` | Saved to `scripts/skills/generated/_refinements/{skill}.patch.md` |
| `generated_skill_verdict` | `keep` / `keep_and_refine` / `prune`, controls lifecycle |

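For the `refinements` row, a minimal sketch of persisting one refinement is shown here. `skill_apply_refinement` and the `SKILLS_DIR` override are hypothetical names. Storing each addition as a bullet line and skipping exact duplicates are assumptions, since the patch file format is not specified in this design.

```bash
skill_apply_refinement() {
    local skill="$1" addition="$2"
    local dir="${SKILLS_DIR:-scripts/skills}/generated/_refinements"
    # The skill name becomes a file path, so accept plain slugs only.
    [[ "$skill" =~ ^[a-z0-9][a-z0-9-]*$ ]] || return 1
    mkdir -p "$dir"
    # Skip text that is already recorded so repeated runs do not duplicate it.
    if [[ -f "$dir/$skill.patch.md" ]] &&
       grep -qxF -- "- $addition" "$dir/$skill.patch.md"; then
        return 0
    fi
    printf -- '- %s\n' "$addition" >> "$dir/$skill.patch.md"
}
```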
**The feedback loop:**

```
Intake → LLM reads catalog + memory → selects + generates skills
    ↓
Pipeline runs with targeted skill guidance
    ↓
Completion → LLM analyzes outcome → updates memory + refines skills
    ↓
Next pipeline → intake LLM sees refined skills + richer memory
    ↓
System gets smarter with every run
```

### 6. Integration Points

**`stage_intake()` in `pipeline-stages.sh`:**
- After `intelligence_analyze_issue()`, call `skill_analyze_issue()`
- Write `skill-plan.json` to artifacts
- Set `INTELLIGENCE_ISSUE_TYPE` from skill plan (replaces label grep)

**`stage_plan()`, `stage_build()`, `stage_review()`, etc.:**
- Replace `skill_select_adaptive()` calls with `skill_load_from_plan()`, passing each stage's own name (for example `skill_load_from_plan("plan")` in `stage_plan()`)
- Fallback to `skill_select_adaptive()` if `skill-plan.json` missing

**`sw-pipeline.sh` completion handler:**
- Call `skill_analyze_outcome()` after pipeline finishes
- Apply refinements, lifecycle verdicts

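The completion hook has to fire on both success and failure without changing the pipeline's own exit status. A sketch of that shape follows; `run_pipeline` and `pipeline_stages` are hypothetical names, and both function bodies below are stubs rather than the real implementations in `sw-pipeline.sh`.

```bash
# Stubs: pipeline_stages and this skill_analyze_outcome body are placeholders.
pipeline_stages()       { [[ "$1" != "fail" ]]; }
skill_analyze_outcome() { echo "outcome recorded (exit=$1)"; }

run_pipeline() {
    local status=0
    pipeline_stages "$@" || status=$?
    # Learning must never change the pipeline's own result.
    skill_analyze_outcome "$status" || true
    return "$status"
}
```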

---

## Files

| File | Action | Purpose |
|---|---|---|
| `scripts/lib/skill-registry.sh` | MODIFY | Add `skill_analyze_issue()`, `skill_build_catalog()`, `skill_load_from_plan()`, `skill_analyze_outcome()` |
| `scripts/lib/skill-memory.sh` | MODIFY | Upgrade `skill_memory_record()` to store verdict + evidence + learning (not just boolean) |
| `scripts/lib/pipeline-stages.sh` | MODIFY | Replace `skill_select_adaptive()` calls with `skill_load_from_plan()` in all stages; add `skill_analyze_issue()` to intake |
| `scripts/sw-pipeline.sh` | MODIFY | Add `skill_analyze_outcome()` at pipeline completion |
| `scripts/skills/generated/` | CREATE | Directory for AI-generated skills |
| `scripts/skills/generated/_refinements/` | CREATE | Directory for outcome-driven patches |
| `scripts/test-skill-injection.sh` | MODIFY | Add test suites for LLM-powered selection, generation, outcome loop |

---

## Cost

| Call | When | Model | Cost |
|---|---|---|---|
| `skill_analyze_issue()` | Intake | haiku | ~$0.002 |
| `skill_analyze_outcome()` | Completion | haiku | ~$0.002 |
| **Total per pipeline** | | | **~$0.004** |

Existing pipeline LLM costs (plan/design/build/review with opus/sonnet) are ~$2-8 per run. The skill intelligence adds 0.05-0.2% overhead.

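The overhead range follows directly from the two totals: ~$0.004 added against a $8 run at the low end and a $2 run at the high end. A quick check with `awk`, where `skill_overhead_pct` is a hypothetical helper name:

```bash
# Percentage that an added cost represents relative to a base cost.
skill_overhead_pct() {
    awk -v added="$1" -v base="$2" 'BEGIN { printf "%.2f\n", added / base * 100 }'
}

skill_overhead_pct 0.004 8   # prints: 0.05  (an $8 run)
skill_overhead_pct 0.004 2   # prints: 0.20  (a $2 run)
```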

---

## Fallback Guarantees

Every new function has a fallback to the existing system:

| New Function | Fallback | Condition |
|---|---|---|
| `skill_analyze_issue()` | `skill_select_adaptive()` | LLM call fails |
| `skill_load_from_plan()` | `skill_select_adaptive()` | `skill-plan.json` missing |
| `skill_analyze_outcome()` | `skill_memory_record()` (boolean) | LLM call fails |

The static registry and keyword detection remain as safety nets. Because each fallback is the pre-existing code path, existing pipelines should see no regression when the LLM call is unavailable.

---

## Testing

| Suite | Tests |
|---|---|
| Skill catalog builder | Scans both directories, includes generated skills, formats correctly |
| LLM skill analysis | Mock haiku response, verify skill-plan.json written correctly |
| Generated skill lifecycle | Create, use, verdict, graduate, prune |
| Refinement patches | Write patch, verify it appends to skill content |
| Outcome analysis | Mock response, verify memory updated with verdicts |
| Fallback chain | LLM failure → adaptive → static, each level works independently |
| Integration | Full flow: intake → plan → build → review → outcome |

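The "mock haiku response" rows rely on a common bash testing pattern: shadow the LLM helper with a shell function that returns canned output. The sketch below shows that pattern only. The `skill_analyze_issue` body is a minimal stand-in written for this example, not the real function, and the single-argument call to `_intelligence_call_claude` is an assumption about its signature.

```bash
# Minimal stand-in for the function under test; the real one lives in skill-registry.sh.
skill_analyze_issue() {
    local response
    response=$(_intelligence_call_claude "$1") || return 1
    [[ -n "$response" ]] || return 1
    printf '%s\n' "$response" > "$ARTIFACTS_DIR/skill-plan.json"
}

ARTIFACTS_DIR=$(mktemp -d)

# Mock: shadow the LLM helper with a shell function that returns canned JSON.
_intelligence_call_claude() { echo '{"issue_type":"frontend"}'; }
if skill_analyze_issue "Add progress bar" &&
   grep -q '"issue_type":"frontend"' "$ARTIFACTS_DIR/skill-plan.json"; then
    echo "PASS: plan written"
fi

# Failure path: the helper fails, so no plan file may appear.
rm -f "$ARTIFACTS_DIR/skill-plan.json"
_intelligence_call_claude() { return 1; }
if ! skill_analyze_issue "Add progress bar" &&
   [[ ! -e "$ARTIFACTS_DIR/skill-plan.json" ]]; then
    echo "PASS: no plan on failure"
fi
```

Because bash resolves function names at call time, redefining `_intelligence_call_claude` between cases is enough to switch the mock's behavior without touching the function under test.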