@kontourai/flow-agents 0.1.1 → 0.2.0

Files changed (97)
  1. package/.github/dependabot.yml +23 -0
  2. package/.github/workflows/publish-npm.yml +1 -1
  3. package/.github/workflows/release-please.yml +31 -0
  4. package/.github/workflows/runtime-compat.yml +118 -0
  5. package/CHANGELOG.md +38 -0
  6. package/CONTRIBUTING.md +4 -0
  7. package/README.md +58 -19
  8. package/build/src/cli/init.js +215 -5
  9. package/build/src/cli/utterance-check.js +236 -0
  10. package/build/src/cli.js +3 -0
  11. package/build/src/tools/build-universal-bundles.js +268 -0
  12. package/build/src/tools/filter-installed-packs.js +3 -0
  13. package/build/src/tools/validate-source-tree.js +6 -1
  14. package/context/scripts/telemetry/lib/config.sh +5 -1
  15. package/context/settings/flow-agents-settings.json +7 -0
  16. package/docs/agent-system-guidebook.md +4 -5
  17. package/docs/context-map.md +1 -0
  18. package/docs/index.md +46 -6
  19. package/docs/integrations/conformance.md +246 -0
  20. package/docs/integrations/framework-adapter.md +275 -0
  21. package/docs/integrations/harness-install.md +213 -0
  22. package/docs/integrations/index.md +54 -0
  23. package/docs/north-star.md +3 -3
  24. package/docs/repository-structure.md +1 -1
  25. package/docs/skills-map.md +10 -4
  26. package/docs/spec/runtime-hook-surface.md +472 -0
  27. package/docs/survey-utterance-check.md +308 -0
  28. package/docs/vision.md +45 -0
  29. package/docs/workflow-usage-guide.md +1 -1
  30. package/evals/acceptance/run.sh +4 -2
  31. package/evals/acceptance/test_opencode_harness.sh +121 -0
  32. package/evals/acceptance/test_pi_harness.sh +98 -0
  33. package/evals/integration/test_bundle_install.sh +226 -1
  34. package/evals/integration/test_bundle_lifecycle.sh +641 -0
  35. package/evals/integration/test_utterance_check.sh +518 -0
  36. package/evals/run.sh +2 -0
  37. package/evals/static/test_universal_bundles.sh +137 -2
  38. package/integrations/strands/README.md +256 -0
  39. package/integrations/strands/example.py +74 -0
  40. package/integrations/strands/flow_agents_strands/__init__.py +27 -0
  41. package/integrations/strands/flow_agents_strands/hooks.py +194 -0
  42. package/integrations/strands/flow_agents_strands/policy.py +348 -0
  43. package/integrations/strands/flow_agents_strands/steering.py +172 -0
  44. package/integrations/strands/flow_agents_strands/telemetry.py +238 -0
  45. package/integrations/strands/pyproject.toml +38 -0
  46. package/integrations/strands/tests/__init__.py +0 -0
  47. package/integrations/strands/tests/test_hooks.py +304 -0
  48. package/integrations/strands/tests/test_policy.py +315 -0
  49. package/integrations/strands/tests/test_telemetry.py +184 -0
  50. package/integrations/strands-ts/README.md +224 -0
  51. package/integrations/strands-ts/bin/conformance-shim.mjs +257 -0
  52. package/integrations/strands-ts/package.json +53 -0
  53. package/integrations/strands-ts/src/hooks.ts +208 -0
  54. package/integrations/strands-ts/src/index.ts +22 -0
  55. package/integrations/strands-ts/src/policy.ts +345 -0
  56. package/integrations/strands-ts/src/telemetry.ts +251 -0
  57. package/integrations/strands-ts/test/test-policy.ts +322 -0
  58. package/integrations/strands-ts/test/test-telemetry.ts +226 -0
  59. package/integrations/strands-ts/tsconfig.json +20 -0
  60. package/package.json +7 -2
  61. package/packaging/conformance/README.md +142 -0
  62. package/packaging/conformance/fixtures/config-protection--allow-no-path.json +18 -0
  63. package/packaging/conformance/fixtures/config-protection--allow-safe-file.json +20 -0
  64. package/packaging/conformance/fixtures/config-protection--block-biome.json +20 -0
  65. package/packaging/conformance/fixtures/config-protection--block-eslintrc.json +20 -0
  66. package/packaging/conformance/fixtures/quality-gate--allow-no-path.json +17 -0
  67. package/packaging/conformance/fixtures/quality-gate--allow-nonexistent-file.json +19 -0
  68. package/packaging/conformance/fixtures/stop-goal-fit--allow-clean-cwd.json +17 -0
  69. package/packaging/conformance/fixtures/stop-goal-fit--block-strict-mode.json +23 -0
  70. package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +21 -0
  71. package/packaging/conformance/fixtures/workflow-steering--allow-no-state.json +16 -0
  72. package/packaging/conformance/fixtures/workflow-steering--inject-active-state.json +29 -0
  73. package/packaging/conformance/fixtures/workflow-steering--inject-subagent-steering.json +25 -0
  74. package/packaging/conformance/package.json +4 -0
  75. package/packaging/conformance/run-conformance.js +322 -0
  76. package/packaging/manifest.json +59 -0
  77. package/schemas/flow-agents-settings.schema.json +48 -0
  78. package/scripts/README.md +5 -0
  79. package/scripts/dogfood.js +16 -0
  80. package/scripts/hooks/opencode-hook-adapter.js +123 -0
  81. package/scripts/hooks/opencode-telemetry-hook.js +101 -0
  82. package/scripts/hooks/pi-hook-adapter.js +123 -0
  83. package/scripts/hooks/pi-telemetry-hook.js +105 -0
  84. package/scripts/hooks/run-hook.js +8 -0
  85. package/scripts/hooks/utterance-check.js +327 -0
  86. package/scripts/telemetry/lib/config.sh +5 -1
  87. package/skills/idea-to-backlog/SKILL.md +1 -1
  88. package/src/cli/init.ts +219 -6
  89. package/src/cli/utterance-check.ts +324 -0
  90. package/src/cli.ts +3 -0
  91. package/src/tools/build-universal-bundles.ts +266 -0
  92. package/src/tools/filter-installed-packs.ts +3 -0
  93. package/src/tools/validate-source-tree.ts +6 -1
  94. package/build/src/cli/docs-preview.js +0 -39
  95. package/build/src/cli/export-bookmarks.js +0 -38
  96. package/build/src/cli/import-bookmarks.js +0 -50
  97. package/build/src/cli/instinct-cli.js +0 -93
@@ -0,0 +1,213 @@
1
+ ---
2
+ title: Harness Install
3
+ ---
4
+
5
+ # Harness Install
6
+
7
+ This page walks through three harness installs: Claude Code (the L2 reference runtime), opencode, and pi. All three follow the same model — `npm run build:bundles` generates the bundle, `flow-agents init` places it — but each runtime expects different files at different paths.
8
+
9
+ ## How harness bundles work
10
+
11
+ `npm run build:bundles` generates one bundle per runtime under `dist/<runtime>/`. Each bundle contains:
12
+
13
+ - A host-specific configuration file that maps lifecycle events to shell commands invoking the canonical hook adapter wrapper.
14
+ - A host-specific adapter wrapper (`<runtime>-hook-adapter.js`) that reads stdin JSON from the host, invokes `run-hook.js` with the canonical script path and profile, translates the exit code to the host-native response format, and fails open on errors.
15
+ - A host-specific telemetry wrapper (`<runtime>-telemetry-hook.js`) that maps host event names to canonical telemetry event names and invokes `scripts/telemetry/telemetry.sh`.
16
+ - An `install.sh` that places the generated files at the host-expected paths.
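
The adapter-wrapper behavior described in the list above (translate an exit code, fail open on errors) might be sketched as follows. This is a minimal illustration, not the shipped `<runtime>-hook-adapter.js`: the function name `toHostDecision`, the `{ decision, reason }` response shape, and the convention that exit code 2 means "block" are all assumptions made for the example.

```javascript
// Hypothetical sketch of an adapter's exit-code translation; not the real adapter.
const BLOCK_EXIT_CODE = 2; // assumed convention, not confirmed by the source

function toHostDecision(result) {
  // `result` mimics what child_process.spawnSync returns: { status, stderr, error }.
  if (!result || result.error || typeof result.status !== "number") {
    // Fail open: a hook runtime error must never block the host.
    return { decision: "allow", reason: "hook runtime error (fail open)" };
  }
  if (result.status === BLOCK_EXIT_CODE) {
    return { decision: "block", reason: String(result.stderr || "").trim() };
  }
  if (result.status !== 0) {
    // Unexpected exit codes are treated as hook failures, so they also fail open.
    return {
      decision: "allow",
      reason: `unexpected exit code ${result.status} (fail open)`,
    };
  }
  return { decision: "allow", reason: "" };
}
```

Keeping the translation a pure function of the child-process result makes the fail-open rule easy to test without spawning a real hook.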
17
+
18
+ `flow-agents init` (from `npx @kontourai/flow-agents`) calls `install.sh` for the selected runtime.
19
+
20
+ ## Claude Code
21
+
22
+ Claude Code is the L2 reference implementation. All four policy classes are wired: workflow steering, quality gate, stop-goal-fit, and config protection.
23
+
24
+ ### Install
25
+
26
+ ```bash
27
+ npx @kontourai/flow-agents init --runtime claude-code --dest /path/to/workspace --yes
28
+ ```
29
+
30
+ The install script writes hook wiring into `.claude/settings.json` inside the destination workspace. The hooks object in `settings.json` maps Claude Code lifecycle events (`UserPromptSubmit`, `PreToolUse`, `PostToolUse`, `Stop`) to shell commands invoking the adapter:
31
+
32
+ ```bash
33
+ bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
34
+ node "$root/scripts/hooks/claude-telemetry-hook.js" UserPromptSubmit dev'
35
+ bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
36
+ node "$root/scripts/hooks/claude-hook-adapter.js" UserPromptSubmit \
37
+ workflow-steering workflow-steering.js default'
38
+ ```
39
+
40
+ Telemetry always fires first and is always non-blocking (timeout: 10 s). Policy hooks fire second and may block on `PreToolUse` (timeout: 30 s). Both fail open on hook runtime errors.
41
+
42
+ ### Dogfood variant (repo-local)
43
+
44
+ Inside the `flow-agents` source repo itself, the dogfood script writes hook wiring that points at the local `scripts/hooks/` directory rather than a published package:
45
+
46
+ ```bash
47
+ npm run dogfood -- --runtime claude-code
48
+ ```
49
+
50
+ The destination defaults to the repo root. Pass `--dest` to override.
51
+
52
+ ### Scope-collision warning
53
+
54
+ When `init` detects that an existing `.claude/settings.json` already has hooks entries for the same lifecycle events, it emits a scope-collision warning to stderr:
55
+
56
+ ```
57
+ [flow-agents] WARNING: .claude/settings.json already has hooks for UserPromptSubmit.
58
+ Existing entries will be preserved; Flow Agents hooks will be appended.
59
+ Review .claude/settings.json to confirm hook ordering is correct.
60
+ ```
61
+
62
+ The install appends rather than replaces, so existing hooks are not removed. Review the settings file after install to confirm the ordering is what you want.
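
The append-rather-than-replace behavior can be illustrated with a small merge function. This is a sketch under assumptions: `appendHooks` is a hypothetical helper, and the hooks object is modeled simply as a map from lifecycle event name to an array of entries. The real `init` implementation and the exact `settings.json` schema may differ.

```javascript
// Hypothetical sketch of "append, don't replace"; not the actual init code.
function appendHooks(existingHooks, flowAgentsHooks) {
  const merged = {};
  const collisions = [];
  // Copy existing entries first so their relative order is preserved.
  for (const [event, entries] of Object.entries(existingHooks || {})) {
    merged[event] = [...entries];
  }
  for (const [event, entries] of Object.entries(flowAgentsHooks)) {
    if (merged[event] && merged[event].length > 0) {
      // The caller would emit the scope-collision warning for these events.
      collisions.push(event);
    }
    merged[event] = [...(merged[event] || []), ...entries];
  }
  return { merged, collisions };
}
```

Because existing entries always come first in the merged array, any ordering concern after install is limited to whether the appended Flow Agents hooks should run earlier.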
63
+
64
+ ### Resulting file layout
65
+
66
+ ```
67
+ <workspace>/
68
+ .claude/
69
+ settings.json ← hook wiring (appended by install)
70
+ scripts/
71
+ hooks/
72
+ claude-hook-adapter.js
73
+ claude-telemetry-hook.js
74
+ run-hook.js
75
+ config-protection.js
76
+ quality-gate.js
77
+ stop-goal-fit.js
78
+ workflow-steering.js
79
+
80
+ skills/
81
+
82
+ .flow-agents/ ← runtime workflow artifacts (not committed)
83
+ ```
84
+
85
+ ## opencode
86
+
87
+ opencode is an L1 adapter. It has no native `prompt.submit`-equivalent event, so workflow steering is approximated at `session.created` rather than at each user turn. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.1</a>.
88
+
89
+ ### Install
90
+
91
+ ```bash
92
+ npx @kontourai/flow-agents init --runtime opencode --dest /path/to/workspace --yes
93
+ ```
94
+
95
+ ### Dogfood variant
96
+
97
+ ```bash
98
+ npm run dogfood -- --runtime opencode
99
+ ```
100
+
101
+ ### Resulting file layout
102
+
103
+ ```
104
+ <workspace>/
105
+ .opencode/
106
+ plugins/
107
+ flow-agents.js ← auto-loaded at opencode startup
108
+ agents/
109
+ dev.md ← agent prompts (opencode markdown format)
110
+ tool-planner.md
111
+ tool-worker.md
112
+
113
+ skills/
114
+ deliver.md
115
+ fix-bug.md
116
+
117
+ opencode.json ← workspace instructions pointer
118
+ scripts/
119
+ hooks/
120
+ opencode-hook-adapter.js
121
+ opencode-telemetry-hook.js
122
+ run-hook.js
123
+
124
+ skills/
125
+
126
+ ```
127
+
128
+ `opencode.json` at the workspace root is a minimal config file:
129
+
130
+ ```json
131
+ {
132
+ "instructions": "This workspace uses Flow Agents. See AGENTS.md for conventions, skills, and workflow guidance."
133
+ }
134
+ ```
135
+
136
+ The plugin at `.opencode/plugins/flow-agents.js` is auto-loaded at opencode startup. It exports `FlowAgentsPlugin` and registers handlers for:
137
+
138
+ | opencode event | What fires |
139
+ | --- | --- |
140
+ | `session.created` | Telemetry + workflow steering (session-start context injection) |
141
+ | `tool.execute.before` | Telemetry + config-protection (blocking via thrown Error) |
142
+ | `tool.execute.after` | Telemetry + quality gate |
143
+ | `session.idle` | Telemetry + stop-goal-fit (warning only — not a true stop event) |
144
+ | `session.error`, `session.compacted`, `permission.asked`, `file.edited` | Telemetry only |
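
Read as data, the table above amounts to a lookup from opencode event name to the policy class that runs alongside telemetry. The sketch below restates it. The `OPENCODE_EVENT_POLICIES` constant and the `policyFor` helper are hypothetical names chosen for illustration and are not taken from the plugin source.

```javascript
// Hypothetical restatement of the event table; telemetry fires for every listed event.
const OPENCODE_EVENT_POLICIES = {
  "session.created": "workflow-steering",
  "tool.execute.before": "config-protection", // blocking via thrown Error
  "tool.execute.after": "quality-gate",
  "session.idle": "stop-goal-fit", // warning only
  "session.error": null,
  "session.compacted": null,
  "permission.asked": null,
  "file.edited": null,
};

function policyFor(eventName) {
  // Returns the policy class name, null for telemetry-only events,
  // and undefined for events that are not in the table at all.
  const known = Object.prototype.hasOwnProperty.call(
    OPENCODE_EVENT_POLICIES,
    eventName
  );
  return known ? OPENCODE_EVENT_POLICIES[eventName] : undefined;
}
```

Distinguishing `null` (telemetry only) from `undefined` (unhandled) keeps the missing `prompt.submit` event visible as an absent key.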
145
+
146
+ **Accepted gaps**: opencode has no `prompt.submit` hook, so workflow steering fires only on `session.created` — not at each user turn. `session.idle` is the closest event to a stop hook but does not reliably fire on session completion. These gaps are declared in the conformance level (L1) and in the plugin source comments.
147
+
148
+ **Agents**: opencode receives agent prompts as markdown files in `.opencode/agents/`. The main orchestrator is `dev.md`; specialist tools (planner, worker, reviewer, etc.) are additional markdown files in the same directory.
149
+
150
+ ## pi
151
+
152
+ pi is an L1 adapter. It has no stop hook, so stop-goal-fit cannot fire at session end. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.3</a>.
153
+
154
+ ### Install
155
+
156
+ ```bash
157
+ npx @kontourai/flow-agents init --runtime pi --dest /path/to/workspace --yes
158
+ ```
159
+
160
+ ### Dogfood variant
161
+
162
+ ```bash
163
+ npm run dogfood -- --runtime pi
164
+ ```
165
+
166
+ ### Resulting file layout
167
+
168
+ ```
169
+ <workspace>/
170
+ .pi/
171
+ extensions/
172
+ flow-agents.ts ← auto-discovered at startup (needs project trust)
173
+ skills/
174
+ deliver.md
175
+ fix-bug.md
176
+
177
+ AGENTS.md ← agent instructions (pi uses AGENTS.md, not a registry)
178
+ scripts/
179
+ hooks/
180
+ pi-hook-adapter.js
181
+ pi-telemetry-hook.js
182
+ run-hook.js
183
+
184
+ skills/
185
+
186
+ ```
187
+
188
+ The extension at `.pi/extensions/flow-agents.ts` is auto-discovered at startup. It registers handlers for:
189
+
190
+ | pi event | What fires |
191
+ | --- | --- |
192
+ | `session_start` | Telemetry |
193
+ | `before_agent_start` | Telemetry + workflow steering (injects context into system prompt) |
194
+ | `tool_call` | Telemetry + config-protection (blocking via `{ block: true }` return) |
195
+ | `tool_result` | Telemetry + quality gate |
196
+ | `session_shutdown` | Telemetry + stop-goal-fit (warning only — not a true stop event) |
197
+
198
+ **Accepted gaps**: pi has no stop hook. `session_shutdown` is used as the closest equivalent but does not carry the same semantics as a stop event. This gap is declared in the conformance level (L1) and in the extension source comments.
199
+
200
+ **Agents**: pi has no named-subagent registry. Agent guidance is delivered through `AGENTS.md` at the workspace root, plus the skills in `.pi/skills/` and the extension. The `flow-agents.ts` extension comment says explicitly: "pi has no named-subagent registry. Agents are not exported for pi."
201
+
202
+ ### Scope-collision warning
203
+
204
+ Same behavior as Claude Code: if an existing `.pi/extensions/` directory contains a file with conflicting event registrations, `init` warns and appends. Review the extension file after install.
205
+
206
+ ## Related references
207
+
208
+ - `dist/opencode/` — generated opencode bundle (do not edit by hand)
209
+ - `dist/pi/` — generated pi bundle (do not edit by hand)
210
+ - `dist/claude-code/` — generated Claude Code bundle
211
+ - `scripts/hooks/run-hook.js` — canonical hook runner
212
+ - <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> — event taxonomy, policy classes, conformance levels
213
+ - <a href="conformance.html">Conformance</a> — how to self-certify a new adapter
@@ -0,0 +1,54 @@
1
+ ---
2
+ title: Integration Examples
3
+ ---
4
+
5
+ # Integration Examples
6
+
7
+ Flow Agents reaches host runtimes and agent frameworks through two distinct distribution models. This section provides worked examples for each model and a guide to the conformance kit for third-party adapter authors.
8
+
9
+ ## Distribution models at a glance
10
+
11
+ **Harness runtimes** ship as self-contained bundles under `dist/<runtime>/`. The `npm run build:bundles` command generates each bundle from the canonical manifest and policy scripts. `flow-agents init` (or the dogfood variant) places the generated files at the host-expected paths inside a target workspace. Claude Code, Codex, Kiro, opencode, and pi are harness adapters.
12
+
13
+ **Framework adapters** live in `integrations/<name>/` as language-native packages. They register Flow Agents callbacks with the framework's lifecycle system using the framework's native registration API. `integrations/strands/` is the reference implementation: `flow-agents-strands` is a Python `HookProvider` that wires into AWS Strands Agents without requiring the Strands SDK at import time.
14
+
15
+ **Third-party adapters** self-certify by running the conformance kit in `packaging/conformance/`. The kit provides golden fixtures and a runner that pipes each fixture through the adapter command and reports a per-level verdict.
16
+
17
+ ## Conformance levels
18
+
19
+ | Level | What is required |
20
+ | --- | --- |
21
+ | L0 | Telemetry only — at least `agentSpawn` fires on session start |
22
+ | L1 | L0 plus workflow steering and stop-goal-fit in warning mode |
23
+ | L2 | L1 plus config protection (blocking) and quality gate — the reference level |
24
+
25
+ Claude Code and Codex are L2 reference implementations. opencode is L1 (no prompt-submit hook). pi is L1 (no stop hook). The Strands adapter is L0 plus config protection via `BeforeToolCallEvent` cancellation.
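
One way to read the level table is as a cumulative check over an adapter's capabilities. The helper below is a sketch, not part of the conformance kit: `conformanceLevel` and its capability flag names are hypothetical. It reads the table literally, so declared gaps (such as opencode's missing prompt-submit hook) are not modeled.

```javascript
// Hypothetical helper restating the level table; capability names are illustrative.
function conformanceLevel(caps) {
  if (!caps.telemetry) return null; // below L0: no telemetry at all
  const meetsL1 = Boolean(caps.workflowSteering && caps.stopGoalFitWarning);
  if (!meetsL1) return "L0";
  const meetsL2 = Boolean(caps.configProtectionBlocking && caps.qualityGate);
  return meetsL2 ? "L2" : "L1";
}
```

Under this reading, an adapter with telemetry and blocking config protection but no workflow steering still reports `L0`, which matches the "L0 plus config protection" description of the Strands adapter.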
26
+
27
+ The <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> defines the canonical event taxonomy, policy classes, conformance levels, and engine contract in full.
28
+
29
+ ## Pages in this section
30
+
31
+ <div class="doc-grid">
32
+ <a class="doc-card" href="harness-install.html">
33
+ <strong>Harness Install</strong>
34
+ <span>Worked examples of installing into a Claude Code project and into the two newest runtimes, opencode and pi. Includes the dogfood variant and scope-collision warning behavior.</span>
35
+ </a>
36
+ <a class="doc-card" href="framework-adapter.html">
37
+ <strong>Framework Adapter</strong>
38
+ <span>Worked example based on <code>integrations/strands/</code>: constructing FlowAgentsHooks, telemetry emitted, the engine-contract binding for policy, and documented limitations.</span>
39
+ </a>
40
+ <a class="doc-card" href="conformance.html">
41
+ <strong>Conformance</strong>
42
+ <span>How a third-party adapter self-certifies: the engine contract 1.0, running the conformance runner, what each level requires, and how to declare gaps.</span>
43
+ </a>
44
+ <a class="doc-card" href="../spec/runtime-hook-surface.html">
45
+ <strong>Runtime Hook Surface Spec</strong>
46
+ <span>Canonical event taxonomy, four policy classes, conformance levels L0/L1/L2, mapping tables, and the engine contract for adapter authors.</span>
47
+ </a>
48
+ </div>
49
+
50
+ ---
51
+
52
+ ## TypeScript native-import adapter
53
+
54
+ `integrations/strands-ts/` (`@kontourai/flow-agents-strands`) is the first native-import consumer of the policy engine contract. It binds the `config-protection.js` `run()` function directly — no subprocess on the hot path. It achieves **L2** conformance. See `integrations/strands-ts/README.md` and the [Framework Adapter](framework-adapter.html) page for the full comparison with the Python adapter.
@@ -152,9 +152,9 @@ The goal is not to add ceremony. The goal is to make agents more reliable while
152
152
  | [x] | Standards register | Supported standards and Flow Agents-owned formats are documented with adoption rules. |
153
153
  | [ ] | Structured workflow state | Draft schemas, contracts, validation, explicit current-session identity, delegation-safe agent event logs, sidecar writer commands, and direct workflow-skill writer instructions exist for state, acceptance, evidence, handoff, critique, release, and learning; automatic enforcement remains partial. |
154
154
  | [ ] | Context map | Generated repo/context map exists; workflow steering and core planner/worker/verifier agents now use it, but broader agent coverage remains. |
155
- | [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; broader file/task-aware guidance remains. |
155
+ | [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; the opt-in utterance evidence-check hook (ADR 0003 §9) badges unsupported agent statements via Survey; broader file/task-aware guidance remains. |
156
156
  | [x] | Sandbox policy | `context/contracts/sandbox-policy.md` and https://github.com/kontourai/flow-agents/blob/main/docs/sandbox-policy.md classify local read-only, local edit, worktree, container, cloud sandbox, and privileged integration modes. |
157
- | [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper can record native Veritas reports as optional Flow Agents evidence. |
157
+ | [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper records native Veritas reports as optional evidence; utterance trust reports from `@kontourai/survey` cover agent statements. |
158
158
  | [ ] | Feedback loop | Runtime telemetry, outcomes, evals, and recurring corrections feed back into docs, skills, rules, or backlog. |
159
159
  | [ ] | Export validation | Codex, Claude Code, and Kiro exports preserve the same operating layers and now install telemetry, Goal Fit, and workflow steering hook wiring; adapter output, installed-command coverage, Claude live hook influence, and Kiro live strict-stop coverage exist. |
160
160
 
@@ -180,7 +180,7 @@ Tasks:
180
180
 
181
181
  - Document the public layers: rules, skills, powers, agents, workflows, knowledge, and evidence. **Done:** see https://github.com/kontourai/flow-agents/blob/main/docs/operating-layers.md.
182
182
  - Mark which directories are canonical source, generated exports, runtime state, and optional integrations.
183
- - Decide which workflow skills are part of the core pack and which are optional domain packs. **Started:** `packaging/packs.json` defines core, development, knowledge, AWS, and experimental packs.
183
+ - Decide which workflow skills are part of the core pack and which are optional domain packs. **Started:** `packaging/packs.json` defines core and development packs.
184
184
  - Add a standards register that lists each external standard, how Flow Agents uses it, and what Flow Agents-owned schemas still exist. **Done:** see https://github.com/kontourai/flow-agents/blob/main/docs/standards-register.md.
185
185
  - Add a "do not invent without checking standards" rule to contributor docs.
186
186
 
@@ -96,7 +96,7 @@ specific row that matches the change.
96
96
  | Bundle/export shape | `packaging/`, `src/tools/build-universal-bundles.ts`, and source directories copied into bundles | `bash evals/static/test_universal_bundles.sh` |
97
97
  | Installer or local runtime setup behavior | `scripts/install-*.sh`, package bins, and generated bundle install scripts | `bash evals/integration/test_bundle_install.sh` |
98
98
  | Workflow artifact, sidecar, or provider contract | `context/contracts/`, `schemas/`, `src/cli/workflow-*`, and matching eval fixtures | `npm run workflow:validate-artifacts --` and workflow integration evals |
99
- | Flow Kit catalog or bundled kit content | `kits/`, Flow Definition files, and kit repository fixtures | `npm run flow-kit -- validate` or `bash evals/integration/test_flow_kit_repository.sh` |
99
+ | Flow Kit catalog or bundled kit content | `kits/`, Flow Definition files, and kit repository fixtures | `npm run validate:source -- --kit <path>` or `bash evals/integration/test_flow_kit_repository.sh` |
100
100
  | Durable developer guidance | `docs/`; regenerate/check the context map when navigation or durable contracts change | `npm run context-map:check --` |
101
101
  | Eval scenario or fixture | `evals/static/`, `evals/integration/`, `evals/fixtures/`, or `evals/cases/` | The owning eval plus `bash evals/run.sh static` when contracts are touched |
102
102
  | Optional external integration configuration | `integrations/` or `veritas.claims.json`; keep local run output ignored | The integration-specific eval or documented dry run |
@@ -45,6 +45,9 @@ flowchart LR
45
45
  Learn -->|new work| Shape
46
46
  ```
47
47
 
48
+ > `publish-change` is a CLI-driven workflow step, not a loadable skill.
49
+ > `goal-fit` is a hook-enforced check, not a loadable skill.
50
+
48
51
  ## Current Shape
49
52
 
50
53
  The operating model now has first-class coverage from idea intake through trusted delivery:
@@ -76,7 +79,7 @@ This view shows how each phase is composed. The left rail is the durable phase s
76
79
  <div class="phase-step"><span>01</span><strong>Discovery & shaping</strong></div>
77
80
  <div class="phase-lanes">
78
81
  <section class="phase-lane phase-lane--primary"><h3>Primary</h3><p><code>builder-shape</code> <code>idea-to-backlog</code></p></section>
79
- <section class="phase-lane"><h3>Support</h3><p><code>knowledge-search</code> <code>search-first</code> <code>explore</code> <code>crowdsource</code> <code>frontend-design</code> <code>github-cli</code> <code>knowledge-capture</code></p></section>
82
+ <section class="phase-lane"><h3>Support</h3><p><code>search-first</code> <code>explore</code> <code>frontend-design</code> <code>github-cli</code> <code>knowledge-capture</code></p></section>
80
83
  <section class="phase-lane"><h3>Nested sections / future primitives</h3><p>intake/dedupe, separate ideas, thinnest meaningful slice, opportunity review, explore options, <code>shape-work</code>, prioritize work, sync executable backlog</p></section>
81
84
  <section class="phase-lane phase-lane--gate"><h3>Gate & artifact</h3><p>Idea, slice, shape, and backlog gates. Writes shaped briefs and GitHub issue links in <code>.flow-agents/&lt;slug&gt;/</code>.</p></section>
82
85
  </div>
@@ -112,7 +115,7 @@ This view shows how each phase is composed. The left rail is the durable phase s
112
115
  <div class="phase-step"><span>05</span><strong>Learning & improvement</strong></div>
113
116
  <div class="phase-lanes">
114
117
  <section class="phase-lane phase-lane--primary"><h3>Primary</h3><p><code>learning-review</code></p></section>
115
- <section class="phase-lane"><h3>Support</h3><p><code>knowledge-capture</code> <code>observe</code> <code>idea-to-backlog</code> <code>eval-rebuild</code></p></section>
118
+ <section class="phase-lane"><h3>Support</h3><p><code>knowledge-capture</code> <code>idea-to-backlog</code> <code>eval-rebuild</code></p></section>
116
119
  <section class="phase-lane"><h3>Nested sections / future primitives</h3><p>facts vs interpretation, follow-up routing, docs promotion review, knowledge updates, eval updates, skill/backlog improvements</p></section>
117
120
  <section class="phase-lane phase-lane--gate"><h3>Gate & artifact</h3><p>Learning gate. Writes outcomes, gaps, docs promotion state, follow-ups, knowledge updates, and verdict.</p></section>
118
121
  </div>
@@ -121,11 +124,11 @@ This view shows how each phase is composed. The left rail is the durable phase s
121
124
 
122
125
  | Phase | Primary workflow skill | Supporting skills | Nested sections / future primitive candidates |
123
126
  | --- | --- | --- | --- |
124
- | Idea discovery and shaping | `builder-shape`, `idea-to-backlog` | `knowledge-search`, `search-first`, `explore`, `crowdsource`, `frontend-design`, `github-cli`, `knowledge-capture` | intake/dedupe, separate ideas, thinnest meaningful slice, opportunity review, explore options, shape work, prioritize work, sync executable backlog |
127
+ | Idea discovery and shaping | `builder-shape`, `idea-to-backlog` | `search-first`, `explore`, `frontend-design`, `github-cli`, `knowledge-capture` | intake/dedupe, separate ideas, thinnest meaningful slice, opportunity review, explore options, shape work, prioritize work, sync executable backlog |
125
128
  | Backlog pickup | `pull-work` | `github-cli` | board snapshot, WIP check, grouping/dependency check, Probe decision, worktree decision, handoff |
126
129
  | Execution planning and build | `design-probe`, `pickup-probe`, `plan-work`, `execute-plan`, `review-work`, `verify-work` | `feedback-loop`, `browser-test`, `deliver`, `fix-bug`, `tdd-workflow` | Probe notes, Builder Kit Probe record, Definition Of Done, execution plan, parallel waves, implementation session state, critique report, verification report, Goal Fit Gate |
127
130
  | Evidence and release confidence | `evidence-gate`, `release-readiness` | `github-cli`, `eval-rebuild` | criteria-to-evidence map, CI confidence, scope/integrity check, publish-change, rollback review, observability review, final acceptance docs, post-deploy plan |
128
- | Learning and improvement | `learning-review` | `knowledge-capture`, `observe`, `idea-to-backlog`, `eval-rebuild` | facts vs interpretation, docs promotion review, follow-up routing, knowledge updates, eval/skill/backlog improvements |
131
+ | Learning and improvement | `learning-review` | `knowledge-capture`, `idea-to-backlog`, `eval-rebuild` | facts vs interpretation, docs promotion review, follow-up routing, knowledge updates, eval/skill/backlog improvements |
129
132
 
130
133
  The highest-leverage future extractions are likely `shape-work`, `test-map`, `scope-and-integrity-check`, and `remediate-ci`. They are still nested because their behavior is present, but not yet large enough to need separate activation contracts.
131
134
 
@@ -190,6 +193,9 @@ flowchart LR
190
193
  Learning -->|systemic change| Eval[eval-rebuild / backlog / skill update]
191
194
  ```
192
195
 
196
+ > `publish-change` is a CLI-driven workflow step, not a loadable skill.
197
+ > `goal-fit` is a hook-enforced check, not a loadable skill.
198
+
193
199
  ## Eval Coverage
194
200
 
195
201
  Workflow evals are layered to match this map: