@kontourai/flow-agents 0.1.1 → 0.2.0

Files changed (97)
  1. package/.github/dependabot.yml +23 -0
  2. package/.github/workflows/publish-npm.yml +1 -1
  3. package/.github/workflows/release-please.yml +31 -0
  4. package/.github/workflows/runtime-compat.yml +118 -0
  5. package/CHANGELOG.md +38 -0
  6. package/CONTRIBUTING.md +4 -0
  7. package/README.md +58 -19
  8. package/build/src/cli/init.js +215 -5
  9. package/build/src/cli/utterance-check.js +236 -0
  10. package/build/src/cli.js +3 -0
  11. package/build/src/tools/build-universal-bundles.js +268 -0
  12. package/build/src/tools/filter-installed-packs.js +3 -0
  13. package/build/src/tools/validate-source-tree.js +6 -1
  14. package/context/scripts/telemetry/lib/config.sh +5 -1
  15. package/context/settings/flow-agents-settings.json +7 -0
  16. package/docs/agent-system-guidebook.md +4 -5
  17. package/docs/context-map.md +1 -0
  18. package/docs/index.md +46 -6
  19. package/docs/integrations/conformance.md +246 -0
  20. package/docs/integrations/framework-adapter.md +275 -0
  21. package/docs/integrations/harness-install.md +213 -0
  22. package/docs/integrations/index.md +54 -0
  23. package/docs/north-star.md +3 -3
  24. package/docs/repository-structure.md +1 -1
  25. package/docs/skills-map.md +10 -4
  26. package/docs/spec/runtime-hook-surface.md +472 -0
  27. package/docs/survey-utterance-check.md +308 -0
  28. package/docs/vision.md +45 -0
  29. package/docs/workflow-usage-guide.md +1 -1
  30. package/evals/acceptance/run.sh +4 -2
  31. package/evals/acceptance/test_opencode_harness.sh +121 -0
  32. package/evals/acceptance/test_pi_harness.sh +98 -0
  33. package/evals/integration/test_bundle_install.sh +226 -1
  34. package/evals/integration/test_bundle_lifecycle.sh +641 -0
  35. package/evals/integration/test_utterance_check.sh +518 -0
  36. package/evals/run.sh +2 -0
  37. package/evals/static/test_universal_bundles.sh +137 -2
  38. package/integrations/strands/README.md +256 -0
  39. package/integrations/strands/example.py +74 -0
  40. package/integrations/strands/flow_agents_strands/__init__.py +27 -0
  41. package/integrations/strands/flow_agents_strands/hooks.py +194 -0
  42. package/integrations/strands/flow_agents_strands/policy.py +348 -0
  43. package/integrations/strands/flow_agents_strands/steering.py +172 -0
  44. package/integrations/strands/flow_agents_strands/telemetry.py +238 -0
  45. package/integrations/strands/pyproject.toml +38 -0
  46. package/integrations/strands/tests/__init__.py +0 -0
  47. package/integrations/strands/tests/test_hooks.py +304 -0
  48. package/integrations/strands/tests/test_policy.py +315 -0
  49. package/integrations/strands/tests/test_telemetry.py +184 -0
  50. package/integrations/strands-ts/README.md +224 -0
  51. package/integrations/strands-ts/bin/conformance-shim.mjs +257 -0
  52. package/integrations/strands-ts/package.json +53 -0
  53. package/integrations/strands-ts/src/hooks.ts +208 -0
  54. package/integrations/strands-ts/src/index.ts +22 -0
  55. package/integrations/strands-ts/src/policy.ts +345 -0
  56. package/integrations/strands-ts/src/telemetry.ts +251 -0
  57. package/integrations/strands-ts/test/test-policy.ts +322 -0
  58. package/integrations/strands-ts/test/test-telemetry.ts +226 -0
  59. package/integrations/strands-ts/tsconfig.json +20 -0
  60. package/package.json +7 -2
  61. package/packaging/conformance/README.md +142 -0
  62. package/packaging/conformance/fixtures/config-protection--allow-no-path.json +18 -0
  63. package/packaging/conformance/fixtures/config-protection--allow-safe-file.json +20 -0
  64. package/packaging/conformance/fixtures/config-protection--block-biome.json +20 -0
  65. package/packaging/conformance/fixtures/config-protection--block-eslintrc.json +20 -0
  66. package/packaging/conformance/fixtures/quality-gate--allow-no-path.json +17 -0
  67. package/packaging/conformance/fixtures/quality-gate--allow-nonexistent-file.json +19 -0
  68. package/packaging/conformance/fixtures/stop-goal-fit--allow-clean-cwd.json +17 -0
  69. package/packaging/conformance/fixtures/stop-goal-fit--block-strict-mode.json +23 -0
  70. package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +21 -0
  71. package/packaging/conformance/fixtures/workflow-steering--allow-no-state.json +16 -0
  72. package/packaging/conformance/fixtures/workflow-steering--inject-active-state.json +29 -0
  73. package/packaging/conformance/fixtures/workflow-steering--inject-subagent-steering.json +25 -0
  74. package/packaging/conformance/package.json +4 -0
  75. package/packaging/conformance/run-conformance.js +322 -0
  76. package/packaging/manifest.json +59 -0
  77. package/schemas/flow-agents-settings.schema.json +48 -0
  78. package/scripts/README.md +5 -0
  79. package/scripts/dogfood.js +16 -0
  80. package/scripts/hooks/opencode-hook-adapter.js +123 -0
  81. package/scripts/hooks/opencode-telemetry-hook.js +101 -0
  82. package/scripts/hooks/pi-hook-adapter.js +123 -0
  83. package/scripts/hooks/pi-telemetry-hook.js +105 -0
  84. package/scripts/hooks/run-hook.js +8 -0
  85. package/scripts/hooks/utterance-check.js +327 -0
  86. package/scripts/telemetry/lib/config.sh +5 -1
  87. package/skills/idea-to-backlog/SKILL.md +1 -1
  88. package/src/cli/init.ts +219 -6
  89. package/src/cli/utterance-check.ts +324 -0
  90. package/src/cli.ts +3 -0
  91. package/src/tools/build-universal-bundles.ts +266 -0
  92. package/src/tools/filter-installed-packs.ts +3 -0
  93. package/src/tools/validate-source-tree.ts +6 -1
  94. package/build/src/cli/docs-preview.js +0 -39
  95. package/build/src/cli/export-bookmarks.js +0 -38
  96. package/build/src/cli/import-bookmarks.js +0 -50
  97. package/build/src/cli/instinct-cli.js +0 -93
package/docs/index.md CHANGED
@@ -4,12 +4,12 @@ title: Kontour Flow Agents
4
4
 
5
5
  # Flow Agents
6
6
 
7
- <p class="home-lede">Coding agents are powerful and forgetful. Flow Agents wraps Codex, Claude Code, Kiro, and CI agents in an operating layer that keeps long-running work inspectable from idea to release readiness so you ask for outcomes and the system supplies the path, the state, the checks, and the proof.</p>
7
+ <p class="home-lede">A portable process-discipline layer for agentic work: canonical policies, evidence, and telemetry that compile to whatever hook surface a host exposes (coding-agent harnesses today, agent frameworks next). Flow Agents keeps work inspectable from idea to release readiness so you ask for outcomes and the system supplies the path, the state, the checks, and the proof.</p>
8
8
 
9
9
  <div class="value-grid">
10
10
  <section>
11
- <strong>Stay on the path</strong>
12
- <span>Turn loose requests into shaped work, plans, implementation waves, review, verification, evidence, and release decisions: the same workflow in every supported runtime.</span>
11
+ <strong>Four canonical policies</strong>
12
+ <span>Workflow steering, quality gate, stop-goal-fit, and config protection, each a canonical script under <code>scripts/hooks/</code> that compiles to the host's native hook format. Claude Code and Codex are the L2 reference implementations.</span>
13
13
  </section>
14
14
  <section>
15
15
  <strong>Survive context loss</strong>
@@ -41,7 +41,29 @@ flowchart LR
41
41
  Evidence -->|not verified| Plan
42
42
  ```
43
43
 
44
- Flow Agents adds the operating layer around the model: skills choose the right workflow, sidecars preserve state, hooks catch stop-short behavior, and evals keep the bundle honest as it changes. The gate semantics underneath — definitions, runs, evidence, route-back — belong to <a href="https://kontourai.github.io/flow/">Kontour Flow</a>; Flow Agents makes that enforcement native inside agent harnesses.
44
+ Flow Agents adds the operating layer around the model: skills choose the right workflow, sidecars preserve state, hooks enforce the four canonical policies, and evals keep the bundle honest as it changes. The gate semantics underneath — definitions, runs, evidence, route-back — belong to <a href="https://kontourai.github.io/flow/">Kontour Flow</a>; Flow Agents compiles those policies to whatever hook surface a host exposes.
45
+
46
+ ## Process-discipline layer
47
+
48
+ The four canonical policy classes are defined in the <a href="spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> using a runtime-neutral vocabulary. Adapters translate them to the host's native hook format at three conformance levels: <strong>L0</strong> (telemetry only), <strong>L1</strong> (steering + stop-goal-fit warning), and <strong>L2</strong> (all four policies with blocking capability).
49
+
50
+ ### Runtime and support matrix
51
+
52
+ | Tier | Runtime | Ships | Conformance |
53
+ | --- | --- | --- | --- |
54
+ | Core harness | Claude Code | install + hooks + bundle | L2 — reference implementation |
55
+ | Core harness | Codex | install + hooks + bundle | L2 — reference implementation |
56
+ | Core harness | Kiro | install + hooks + bundle | L2 |
57
+ | Core harness | opencode | agents, skills, plugin, opencode.json | L1 — no prompt-submit hook |
58
+ | Core harness | pi | extension, skills, AGENTS.md | L1 — no stop hook |
59
+ | Official framework adapter | AWS Strands (Python) | `integrations/strands/` spike/preview | L0 + config protection via cancellation |
60
+ | Conformance-certified | Community / third-party | Self-certify | Conformance kit in development |
61
+
62
+ Documented gaps: opencode has no native `prompt.submit`-equivalent event; pi has no stop hook; Codex live hook influence on model context is limited. The <a href="spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> names every gap explicitly using the canonical event taxonomy.
63
+
64
+ ## Framework adapters
65
+
66
+ The same canonical policies wire into agent frameworks as in-process language-native packages. `integrations/strands/` contains `flow-agents-strands`, a Python `HookProvider` that emits the canonical telemetry taxonomy and enforces config protection via `BeforeToolCallEvent` cancellation — 50 unit tests, no Strands SDK required. This is a spike/preview. See <a href="spec/runtime-hook-surface.html">the spec</a> for the full framework adapter mapping and minimum viable adapter pseudocode.
45
67
 
46
68
  ## Quick Start
47
69
 
@@ -49,7 +71,13 @@ Flow Agents adds the operating layer around the model: skills choose the right w
49
71
  npx @kontourai/flow-agents init --dest /path/to/workspace
50
72
  ```
51
73
 
52
- Until the first npm release lands, the same command works from a checkout: clone the repo, `npm install && npm run build`, then `node build/src/cli.js init --dest /path/to/workspace`.
74
+ Runtime-specific installs:
75
+
76
+ ```bash
77
+ npx @kontourai/flow-agents init --runtime claude-code --dest /path/to/workspace --yes
78
+ npx @kontourai/flow-agents init --runtime opencode --dest /path/to/workspace --yes
79
+ npx @kontourai/flow-agents init --runtime pi --dest /path/to/workspace --yes
80
+ ```
53
81
 
54
82
  Then ask for the workflow you want, in plain language:
55
83
 
@@ -78,6 +106,14 @@ Use fix-bug. Reproduce the problem, diagnose root cause, implement the fix, and
78
106
  <strong>Workflow Map</strong>
79
107
  <span>See the core skills, gates, artifacts, and route-back behavior.</span>
80
108
  </a>
109
+ <a class="doc-card" href="spec/runtime-hook-surface.html">
110
+ <strong>Runtime Hook Surface</strong>
111
+ <span>Canonical event taxonomy, four policy classes, conformance levels L0/L1/L2, and host mapping tables for adapter authors.</span>
112
+ </a>
113
+ <a class="doc-card" href="vision.html">
114
+ <strong>Vision and Direction</strong>
115
+ <span>Where Flow Agents is going: kits beyond coding, TypeScript framework adapters, and Kontour Console as the unifying telemetry surface.</span>
116
+ </a>
81
117
  <a class="doc-card" href="north-star.html">
82
118
  <strong>North Star</strong>
83
119
  <span>The product promise, design principles, operating layers, and roadmap.</span>
@@ -118,11 +154,15 @@ Use fix-bug. Reproduce the problem, diagnose root cause, implement the fix, and
118
154
  <strong>Developer Reference</strong>
119
155
  <span>The generated repo map: commands, agents, skills, scripts, and contracts.</span>
120
156
  </a>
157
+ <a class="doc-card" href="integrations/index.html">
158
+ <strong>Integration Examples</strong>
159
+ <span>Worked examples for harness runtimes (Claude Code, opencode, pi), framework adapters (AWS Strands), and third-party self-certification with the conformance kit.</span>
160
+ </a>
121
161
  </div>
122
162
 
123
163
  ## The Kontour family
124
164
 
125
- Kontour AI shows the work behind AI. <a href="https://kontourai.github.io/flow/">Flow</a> proves why a process was allowed to advance. Veritas makes AI-authored code changes inspectable. Flow Agents packages those foundations into the agent tools you already use — so trustworthy autonomy doesn't require a perfect prompt, perfect memory, or a new runtime.
165
+ Kontour AI shows the work behind AI. <a href="https://kontourai.github.io/flow/">Flow</a> proves why a process was allowed to advance. <a href="https://kontourai.io/veritas">Veritas</a> makes AI-authored code changes inspectable. <a href="https://kontourai.io/survey">Survey</a> and <a href="https://kontourai.io/surface">Surface</a> carry the evidence underneath. Flow Agents packages those foundations into the agent tools you already use — so trustworthy autonomy doesn't require a perfect prompt, perfect memory, or a new runtime.
126
166
 
127
167
  ## Why it matters
128
168
 
package/docs/integrations/conformance.md ADDED
@@ -0,0 +1,246 @@
1
+ ---
2
+ title: Conformance
3
+ ---
4
+
5
+ # Conformance
6
+
7
+ This page explains how a third-party adapter self-certifies against the Flow Agents policy engine contract. It covers the engine contract version 1.0, how to run the conformance kit, what each conformance level requires, and how to declare gaps using the opencode and pi built-in examples as the pattern.
8
+
9
+ Everything in this page is grounded in `packaging/conformance/` and `docs/spec/runtime-hook-surface.md`. No behavior is inferred.
10
+
11
+ ## Engine contract 1.0
12
+
13
+ The engine contract is the versioned public interface between Flow Agents policy scripts and adapters. Third-party adapters bind to this contract. Breaking changes will increment the major version and be announced via CHANGELOG.
14
+
15
+ The contract is defined in <a href="../spec/runtime-hook-surface.html">the spec, section 8</a>. In summary:
16
+
17
+ **Invocation — subprocess form** (standard, used by all current adapters):
18
+
19
+ ```bash
20
+ echo '<JSON payload>' | node scripts/hooks/run-hook.js <hookId> <scriptRelativePath> [profilesCsv]
21
+ ```
22
+
23
+ - `hookId`: identifier for the hook (e.g., `config-protection`). Used for profile/disable checks.
24
+ - `scriptRelativePath`: path relative to `scripts/hooks/` (e.g., `config-protection.js`).
25
+ - `profilesCsv`: comma-separated profile names. Hooks not in the current `SA_HOOK_PROFILE` are skipped.
26
+ - Payload is read from stdin. Max 1 MiB. If truncated, `SA_HOOK_INPUT_TRUNCATED=1` is set.
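The subprocess form above could be driven from Python roughly as follows. This is a minimal sketch, not code from the package: the helper names `build_run_hook_argv` and `encode_payload` are hypothetical, and refusing oversized payloads up front is a choice made for this example (the contract itself only says the runner flags truncation).

```python
import json
from typing import List, Optional, Sequence

MAX_STDIN_BYTES = 1024 * 1024  # 1 MiB stdin limit stated in the contract


def build_run_hook_argv(
    hook_id: str,
    script_relative_path: str,
    profiles: Optional[Sequence[str]] = None,
    runner: str = "scripts/hooks/run-hook.js",
) -> List[str]:
    """Build the argv for `node run-hook.js <hookId> <script> [profilesCsv]`."""
    argv = ["node", runner, hook_id, script_relative_path]
    if profiles:
        # The optional third argument is a comma-separated profile list.
        argv.append(",".join(profiles))
    return argv


def encode_payload(payload: dict) -> bytes:
    """Serialize the payload for stdin, rejecting anything over 1 MiB.

    Rejecting here is an assumption of this sketch; it avoids sending a
    payload that the runner would have to truncate.
    """
    data = json.dumps(payload).encode("utf-8")
    if len(data) > MAX_STDIN_BYTES:
        raise ValueError("payload exceeds the 1 MiB stdin limit")
    return data
```

The two results can be passed to `subprocess.run(argv, input=data, capture_output=True)`; the profile names a caller supplies are whatever the installation defines.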
27
+
28
+ **Invocation — native import form** (for Node.js adapters, preferred for performance):
29
+
30
+ ```javascript
31
+ const { run } = require('./scripts/hooks/config-protection.js');
32
+ const output = run(rawJsonString, { truncated: false, maxStdin: 1024 * 1024 });
33
+ ```
34
+
35
+ All four policy scripts export `module.exports = { run }`.
36
+
37
+ **Version query:**
38
+
39
+ ```bash
40
+ node scripts/hooks/run-hook.js --contract-version
41
+ # → {"contract_version":"1.0","runner":"run-hook.js"}
42
+ ```
43
+
44
+ **Exit code semantics:**
45
+
46
+ | Exit code | Semantics |
47
+ | --- | --- |
48
+ | `0` | Allow — policy has no objection |
49
+ | `2` | Block — policy vetoes the action |
50
+ | other | Error — treat as allow (fail-open) |
51
+
52
+ **Fail-open rule**: Hook runtime errors must never block agent work. Every policy except `config-protection` exits 0 always on non-policy errors. `config-protection` exits 2 only on a protected file match or a truncated payload; runtime errors exit 0.
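The exit code table and the fail-open rule reduce to a small mapping on the adapter side. The sketch below is illustrative; `Decision` and `decision_from_exit_code` are hypothetical names, not part of the package.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"


def decision_from_exit_code(exit_code: int) -> Decision:
    """Map a hook exit code to a decision, failing open on errors."""
    if exit_code == 2:
        return Decision.BLOCK
    # 0 is an explicit allow; every other code is an error and is
    # treated as allow so a broken hook never blocks agent work.
    return Decision.ALLOW
```

Only exit code 2 blocks. A crash, a missing `node` binary surfaced as code 127, or any other non-zero code is treated the same as 0.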
53
+
54
+ ## What each conformance level requires
55
+
56
+ Conformance levels are defined in <a href="../spec/runtime-hook-surface.html">the spec, section 4</a>.
57
+
58
+ ### L0: Telemetry only
59
+
60
+ The adapter wires the telemetry script to at least one lifecycle event. No policy hooks are required.
61
+
62
+ **Required:** At minimum, `agentSpawn` telemetry fires on session start.
63
+
64
+ **Permitted gaps:** All four policy classes (workflow steering, quality gate, stop-goal-fit, config protection) may be absent.
65
+
66
+ **Use case:** Framework adapters and runtimes where the telemetry signal is valuable but blocking or context injection is not feasible.
67
+
68
+ ### L1: Steering
69
+
70
+ The adapter implements L0 plus workflow steering and stop-goal-fit in warning mode.
71
+
72
+ **Required:**
73
+ - L0 telemetry.
74
+ - Workflow steering fires on `userPromptSubmit` (or the closest equivalent — document which event is used and any fidelity loss).
75
+ - Stop-goal-fit fires on `stop` in warning-only mode (exits 0 always).
76
+
77
+ **Permitted gaps:** Quality gate and config protection may be absent. Stop-goal-fit runs in warning mode only.
78
+
79
+ **Use case:** Harness adapters where the runtime supports prompt-submit and stop hooks, but tool-level blocking is not available or desired.
80
+
81
+ ### L2: Enforcing gates
82
+
83
+ The adapter implements L1 plus all blocking policy classes.
84
+
85
+ **Required:**
86
+ - L1 steering and stop telemetry.
87
+ - Config protection fires on `preToolUse` and can block (exit 2 translates to a deny response).
88
+ - Quality gate fires on `postToolUse`.
89
+ - Stop-goal-fit fires on `stop` with `FLOW_AGENTS_GOAL_FIT_STRICT` configurable.
90
+
91
+ **Permitted gaps:** None. All four policy classes must be wired. Any missing host trigger must be documented as a named gap in the conformance declaration.
92
+
93
+ **Use case:** Claude Code and Codex are L2 reference implementations.
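The three levels are cumulative, which can be expressed as a small classifier. This is a sketch under stated assumptions: the `Wiring` fields are hypothetical names for the requirements listed above, and the conformance kit decides levels by running fixtures, not by inspecting flags like these.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Wiring:
    telemetry: bool = False                   # agentSpawn telemetry fires on session start
    workflow_steering: bool = False           # fires on userPromptSubmit or closest equivalent
    stop_goal_fit_warning: bool = False       # fires on stop in warning-only mode
    config_protection_blocking: bool = False  # fires on preToolUse and can block
    quality_gate: bool = False                # fires on postToolUse
    stop_goal_fit_strict: bool = False        # strict mode is configurable


def conformance_level(w: Wiring) -> Optional[str]:
    """Return the highest level whose requirements are all met, or None."""
    if not w.telemetry:
        return None
    if not (w.workflow_steering and w.stop_goal_fit_warning):
        return "L0"
    if not (w.config_protection_blocking and w.quality_gate and w.stop_goal_fit_strict):
        return "L1"
    return "L2"
```

An adapter that wires config protection but lacks steering still classifies as L0 here, because each level requires everything below it.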
94
+
95
+ ## Running the conformance kit
96
+
97
+ The conformance kit is in `packaging/conformance/`. It requires no npm dependencies — only Node.js.
98
+
99
+ **Self-test the canonical engine (must report L2):**
100
+
101
+ ```bash
102
+ node packaging/conformance/run-conformance.js --self
103
+ ```
104
+
105
+ **Test a third-party adapter at L2:**
106
+
107
+ ```bash
108
+ node packaging/conformance/run-conformance.js \
109
+ --adapter-cmd "node /path/to/your-adapter.js" \
110
+ --level L2
111
+ ```
112
+
113
+ **Test at L1 only:**
114
+
115
+ ```bash
116
+ node packaging/conformance/run-conformance.js \
117
+ --adapter-cmd "node /path/to/your-adapter.js" \
118
+ --level L1
119
+ ```
120
+
121
+ **CLI reference:**
122
+
123
+ ```
124
+ node packaging/conformance/run-conformance.js [options]
125
+
126
+ --self Run against the canonical engine (target L2)
127
+ --adapter-cmd CMD Shell command to pipe fixtures to (adapter under test)
128
+ --level L0|L1|L2 Minimum conformance level to enforce (default: L2 for --self, L0 for --adapter-cmd)
129
+ --fixtures DIR Override fixture directory (default: packaging/conformance/fixtures/)
130
+ --verbose Print fixture payloads and full output in per-fixture results
131
+ ```
132
+
133
+ Exit codes: `0` = target level reached, `1` = target level not reached, `2` = usage error.
134
+
135
+ ### Adapter contract for the runner
136
+
137
+ Your adapter command:
138
+ - Receives a canonical JSON payload on stdin (one JSON object).
139
+ - Writes the input JSON (or augmented form) to stdout on allow.
140
+ - Exits `0` to allow, `2` to block, any other code for error (treated as allow, fail-open).
141
+
142
+ The runner invokes your command exactly once per fixture via `sh -c "<your-cmd>"`.
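An adapter command that satisfies the runner contract above can be very small. The following is a minimal sketch, not a conforming adapter: the `PROTECTED_BASENAMES` set is illustrative (the real list lives in `config-protection.js`), the `tool_input.path` field is taken from the payload example elsewhere in these docs, and writing the block reason to stderr is an assumption of this example.

```python
import json
import os
import sys
from typing import Tuple

# Illustrative only; the authoritative set is defined by config-protection.js.
PROTECTED_BASENAMES = frozenset({".eslintrc", "biome.json"})


def evaluate(raw: str) -> Tuple[int, str, str]:
    """Return (exit_code, stdout, stderr) for one canonical JSON payload."""
    try:
        payload = json.loads(raw)
        tool_input = payload.get("tool_input") or {}
        path = tool_input.get("path")
    except (ValueError, AttributeError):
        # Malformed or unexpected input: fail open and echo it back.
        return 0, raw, ""
    if isinstance(path, str) and os.path.basename(path) in PROTECTED_BASENAMES:
        return 2, "", f"blocked: {path} is a protected config file"
    # Allow: write the input JSON back to stdout unchanged.
    return 0, raw, ""


def main() -> None:
    """Read one payload from stdin, then exit 0 (allow) or 2 (block)."""
    code, out, err = evaluate(sys.stdin.read())
    if out:
        sys.stdout.write(out)
    if err:
        sys.stderr.write(err + "\n")
    sys.exit(code)
```

To use it as an `--adapter-cmd` target, save it to a file, call `main()` at the bottom, and pass `"python3 /path/to/that-file.py"` as the command.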
143
+
144
+ ### Fixture inventory
145
+
146
+ The fixtures in `packaging/conformance/fixtures/` cover all four policy classes:
147
+
148
+ | Fixture | Policy class | Event | Level |
149
+ | --- | --- | --- | --- |
150
+ | `config-protection--block-eslintrc.json` | config-protection | preToolUse | L2 |
151
+ | `config-protection--block-biome.json` | config-protection | preToolUse | L2 |
152
+ | `config-protection--allow-safe-file.json` | config-protection | preToolUse | L2 |
153
+ | `config-protection--allow-no-path.json` | config-protection | preToolUse | L2 |
154
+ | `quality-gate--allow-nonexistent-file.json` | quality-gate | postToolUse | L2 |
155
+ | `quality-gate--allow-no-path.json` | quality-gate | postToolUse | L2 |
156
+ | `stop-goal-fit--allow-clean-cwd.json` | stop-goal-fit | stop | L1 |
157
+ | `stop-goal-fit--warn-active-delivery.json` | stop-goal-fit | stop | L1 |
158
+ | `stop-goal-fit--block-strict-mode.json` | stop-goal-fit | stop | L2 |
159
+ | `workflow-steering--allow-no-state.json` | workflow-steering | userPromptSubmit | L1 |
160
+ | `workflow-steering--inject-active-state.json` | workflow-steering | userPromptSubmit | L1 |
161
+ | `workflow-steering--inject-subagent-steering.json` | workflow-steering | postToolUse | L1 |
162
+
163
+ Fixtures with `workspace_setup` create a temporary directory with the listed files before invoking the adapter and clean up afterward. The `cwd` field in those payloads is replaced with the temp directory path at runtime.
164
+
165
+ ## How to declare gaps
166
+
167
+ If your adapter legitimately cannot satisfy a fixture — because the host runtime has no blocking `preToolUse` equivalent, or no stop hook — declare the gap explicitly in your adapter documentation. The opencode and pi adapters are the reference pattern.
168
+
169
+ ### opencode: no prompt-submit hook
170
+
171
+ opencode has no native `prompt.submit`-equivalent event. Workflow steering cannot fire at each user turn. The gap is declared in the plugin source comment and in the conformance declaration:
172
+
173
+ ```yaml
174
+ conformance_level: L1
175
+ host: opencode
176
+ event_coverage:
177
+ agentSpawn: session.created (full fidelity)
178
+ userPromptSubmit: no native equivalent — workflow steering fires at session.created only
179
+ preToolUse: tool.execute.before (full fidelity, blocking available via thrown Error)
180
+ postToolUse: tool.execute.after (full fidelity)
181
+ stop: session.idle (reduced fidelity — fires on idle, not on completion)
182
+ permissionRequest: permission.asked (telemetry only — no blocking capability)
183
+ policy_coverage:
184
+ workflow_steering: partial — injected at session.created only, not at each turn
185
+ quality_gate: wired at tool.execute.after
186
+ stop_goal_fit: degraded — session.idle does not reliably fire at completion
187
+ config_protection: wired at tool.execute.before (blocking)
188
+ gaps:
189
+ - event: userPromptSubmit
190
+ reason: opencode has no prompt.submit equivalent
191
+ degradation: Workflow steering fires once at session.created instead of at each user turn
192
+ - event: stop
193
+ reason: session.idle is the closest event but is not a true completion signal
194
+ degradation: stop-goal-fit warnings may not fire reliably at session end
195
+ ```
196
+
197
+ ### pi: no stop hook
198
+
199
+ pi has no stop hook. Stop-goal-fit cannot fire at session end. The gap is declared in the extension source comment and in the conformance declaration:
200
+
201
+ ```yaml
202
+ conformance_level: L1
203
+ host: pi
204
+ event_coverage:
205
+ agentSpawn: session_start (full fidelity)
206
+ userPromptSubmit: before_agent_start (reduced fidelity — fires at agent start, not per-turn)
207
+ preToolUse: tool_call (full fidelity, blockable via return { block: true })
208
+ postToolUse: tool_result (full fidelity)
209
+ stop: no native equivalent — session_shutdown used as closest analogue
210
+ policy_coverage:
211
+ workflow_steering: partial — injected at before_agent_start, not at each user turn
212
+ quality_gate: wired at tool_result
213
+ stop_goal_fit: degraded — session_shutdown does not reliably carry stop semantics
214
+ config_protection: wired at tool_call (blocking)
215
+ gaps:
216
+ - event: stop
217
+ reason: pi has no stop hook
218
+ degradation: stop-goal-fit cannot fire; agent may complete without the check
219
+ workaround: Run stop-goal-fit checks explicitly in CI or via a post-session script
220
+ ```
221
+
222
+ ## Including a conformance declaration in your adapter
223
+
224
+ After running the conformance kit, include a declaration in your adapter documentation:
225
+
226
+ ```yaml
227
+ conformance_level: L2 # or L0 / L1
228
+ engine_contract_version: "1.0"
229
+ runner_version: "run-conformance.js"
230
+ test_date: 2026-06-11
231
+ verdict: PASS
232
+ fixture_count: 12
233
+ fixtures_passed: 12
234
+ gaps: []
235
+ ```
236
+
237
+ If any fixtures fail, list them under `gaps` with a description of the degradation behavior. Declared gaps do not prevent reaching a lower conformance level — they make the adapter's behavior honest and auditable.
238
+
239
+ ## Related references
240
+
241
+ - `packaging/conformance/run-conformance.js` — conformance runner
242
+ - `packaging/conformance/fixtures/` — golden fixtures
243
+ - `packaging/conformance/README.md` — conformance kit README
244
+ - <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec §8</a> — engine contract 1.0 in full
245
+ - <a href="harness-install.html">Harness Install</a> — worked install examples for opencode and pi
246
+ - <a href="framework-adapter.html">Framework Adapter</a> — worked example of a language-native adapter
package/docs/integrations/framework-adapter.md ADDED
@@ -0,0 +1,275 @@
1
+ ---
2
+ title: Framework Adapter
3
+ ---
4
+
5
+ # Framework Adapter
6
+
7
+ This page walks through the `integrations/strands/` reference implementation: a Python `HookProvider` for AWS Strands Agents. It covers how to construct `FlowAgentsHooks`, what telemetry it emits, how the policy gate binds to the canonical engine contract, and the documented limitations of this spike.
8
+
9
+ Everything in this page is grounded in the files under `integrations/strands/`. No behavior is inferred or aspirational unless explicitly labeled as direction.
10
+
11
+ ## Harness adapters vs. framework adapters
12
+
13
+ Harness adapters (Claude Code, Codex, Kiro, opencode, pi) integrate with coding-agent runtimes that have their own hook format: JSON on stdin, exit codes, and lifecycle events named by the harness. Each harness adapter normalizes its runtime's hook payloads into the canonical Flow Agents telemetry taxonomy and delegates to `scripts/telemetry/telemetry.sh`.
14
+
15
+ Framework adapters are in-process packages. Strands Agents is not a coding-agent harness — it is a general-purpose Python agent SDK. Its hook surface (`HookProvider` / `HookRegistry`) is class-based and synchronous. There is no stdin/stdout protocol and no process exit codes as block signals. Hook callbacks receive typed Python event objects and can mutate them in place.
16
+
17
+ Despite the surface differences, the same canonical event taxonomy is used. The JSONL output from `FlowAgentsHooks` is structurally identical to the output produced by `claude-telemetry-hook.js` and `codex-telemetry-hook.js`.
18
+
19
+ ## Constructing FlowAgentsHooks
20
+
21
+ `FlowAgentsHooks` is the main entry point. It implements the Strands `HookProvider` protocol via duck typing, so `strands-agents` is not required at import time.
22
+
23
+ ```python
24
+ from flow_agents_strands import FlowAgentsHooks
25
+
26
+ hooks = FlowAgentsHooks(
27
+ workspace=".", # root of your project (reads .flow-agents/)
28
+ agent_name="my-agent", # appears in telemetry events
29
+ )
30
+ ```
31
+
32
+ Constructor parameters (all optional):
33
+
34
+ | Parameter | Default | Purpose |
35
+ | --- | --- | --- |
36
+ | `sink_path` | `<workspace>/.flow-agents/.telemetry/full.jsonl` | JSONL telemetry output path or directory |
37
+ | `workspace` | `os.getcwd()` | Root of the workspace; used to discover `.flow-agents/` |
38
+ | `agent_name` | `"strands-agent"` | Agent identifier embedded in telemetry events |
39
+ | `runtime` | `"strands"` | Runtime label embedded in telemetry events |
40
+ | `policy_gate` | `PolicyGate()` | Optional custom `PolicyGate` instance (for testing) |
41
+
42
+ `FlowAgentsHooks` is usable without `strands-agents` installed. Telemetry emission and `steering_context()` work in any Python environment. The `register_hooks` method (which wires callbacks into a `HookRegistry`) requires `strands-agents` and raises `ImportError` if the SDK is absent.
43
+
44
+ ## Wiring into an Agent
45
+
46
+ ```python
47
+ from strands import Agent
48
+ from strands.models import BedrockModel
49
+ from flow_agents_strands import FlowAgentsHooks
50
+
51
+ hooks = FlowAgentsHooks(workspace=".")
52
+
53
+ # Load steering context BEFORE constructing the agent.
54
+ # Strands' BeforeInvocationEvent does not expose a mutable system prompt,
55
+ # so steering must be injected at construction time.
56
+ system_prompt = (
57
+ "You are a helpful assistant.\n"
58
+ + hooks.steering_context()
59
+ )
60
+
61
+ model = BedrockModel(model_id="anthropic.claude-3-5-sonnet-20241022-v2:0")
62
+ agent = Agent(model=model, system_prompt=system_prompt, hooks=[hooks])
63
+
64
+ result = agent("List the files in this directory.")
65
+ ```
66
+
67
+ `register_hooks` is called by the Strands runtime when `hooks=[hooks]` is passed to `Agent`. It registers five callbacks:
68
+
69
+ | Strands event | Canonical event | What fires |
70
+ | --- | --- | --- |
71
+ | `AgentInitializedEvent` | `agentSpawn` | `emit_session_start()` — records `session.start` |
72
+ | `BeforeInvocationEvent` | `userPromptSubmit` | `emit("userPromptSubmit")` — records `turn.user` |
73
+ | `AfterInvocationEvent` | `stop` | `emit_session_end(duration_s=…)` — records `session.end` |
74
+ | `BeforeToolCallEvent` | `preToolUse` | Telemetry + policy gate (config-protection) |
75
+ | `AfterToolCallEvent` | `postToolUse` | `emit_tool_result(…)` — records `tool.result` |
76
+
77
+ This mapping is the `STRANDS_TO_CANONICAL` dict exposed at module level by `integrations/strands/flow_agents_strands/telemetry.py`:
78
+
79
+ ```python
80
+ STRANDS_TO_CANONICAL = {
81
+ "AgentInitializedEvent": "agentSpawn",
82
+ "BeforeInvocationEvent": "userPromptSubmit",
83
+ "AfterInvocationEvent": "stop",
84
+ "BeforeToolCallEvent": "preToolUse",
85
+ "AfterToolCallEvent": "postToolUse",
86
+ "AfterModelCallEvent": "postToolUse", # closest analogue; no tool name
87
+ "MessageAddedEvent": "userPromptSubmit",
88
+ }
89
+ ```
90
+
91
+ ## Telemetry emitted
92
+
93
+ Events are written to `.flow-agents/.telemetry/full.jsonl` by default. The record shape matches `build_base_event()` in `scripts/telemetry/telemetry.sh`:
94
+
95
+ ```json
96
+ {
97
+ "schema_version": "0.3.0",
98
+ "timestamp": "1718000000000",
99
+ "session_id": "<uuid>",
100
+ "event_id": "<uuid>",
101
+ "event_type": "tool.invoke",
102
+ "agent": { "name": "my-agent", "runtime": "strands", "version": "unknown" },
103
+ "hook": {
104
+ "event_name": "preToolUse",
105
+ "source": "strands",
106
+ "stop_hook_active": null,
107
+ "raw_input": null
108
+ },
109
+ "tool": { "name": "edit", "normalized_name": "fs_write", "input": { ... } }
110
+ }
111
+ ```
112
+
113
+ Canonical names map to schema `event_type` values via `_CANONICAL_TO_SCHEMA` in `telemetry.py`:
114
+
115
+ | Canonical name | Schema event_type |
116
+ | --- | --- |
117
+ | `agentSpawn` | `session.start` |
118
+ | `userPromptSubmit` | `turn.user` |
119
+ | `preToolUse` | `tool.invoke` |
120
+ | `permissionRequest` | `tool.permission_request` |
121
+ | `postToolUse` | `tool.result` |
122
+ | `stop` | `session.end` |
123
+
124
+ Telemetry is always fail-open: if the JSONL file cannot be written (`OSError`), the exception is swallowed silently. Telemetry must never block agent work.
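
The fail-open behaviour can be illustrated with a small JSONL writer. This is a sketch under stated assumptions, not the adapter's `TelemetrySink`: the function name `append_event` and its boolean return value are hypothetical, and only `OSError` is swallowed, as the paragraph above describes.

```python
import json
from pathlib import Path


def append_event(path, record: dict) -> bool:
    """Append one JSON line to `path`; never raise on I/O failure.

    Returns True if the line was written, False if an OSError was swallowed.
    (The return value is an illustration aid, not documented adapter behaviour.)
    """
    try:
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        with target.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return True
    except OSError:
        # Fail open: telemetry must never block agent work.
        return False
```

A caller can ignore the return value entirely; it exists here only so the swallowed failure is observable in a test.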
125
+
126
+ ## Policy gate: config-protection
127
+
128
+ The config-protection policy binds to the canonical Node.js engine via subprocess. The binding is in `integrations/strands/flow_agents_strands/policy.py` in the `PolicyGate` class.
129
+
130
+ **Primary mode — engine subprocess:**
131
+
132
+ On `BeforeToolCallEvent`, if the tool name is a write-like tool (one of `edit`, `write`, `fs_write`, `apply_patch`, `create_file`, `str_replace_editor`), the gate serializes the event to a canonical JSON payload and spawns:
133
+
134
+ ```bash
135
+ echo '{"hook_event_name":"PreToolUse","tool_name":"edit","tool_input":{"path":"biome.json"}}' \
136
+ | node scripts/hooks/run-hook.js config-protection config-protection.js
137
+ ```
138
+
139
+ The engine exits with code 2 (block) or 0 (allow). On exit code 2, the gate sets `event.cancel_tool` to the block reason read from stderr; Strands then cancels the call and surfaces that message as the tool result. Any other exit code fails open, so the tool call proceeds.
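
The exit-code contract above can be sketched as follows. The names `decision_from_exit` and `check_with_engine`, the timeout, and the fallback reason text are assumptions for illustration. The sketch also simplifies one point: when the subprocess cannot be started it allows the call, whereas the real gate degrades to Python evaluation.

```python
import json
import subprocess
from typing import Optional

# Write-like tool names as listed in this section.
WRITE_LIKE_TOOLS = frozenset(
    {"edit", "write", "fs_write", "apply_patch", "create_file", "str_replace_editor"}
)


def decision_from_exit(returncode: int, stderr: str) -> Optional[str]:
    """Return the block reason for exit code 2, else None (allow or fail open)."""
    if returncode == 2:
        # Fallback text for an empty stderr is an assumption of this sketch.
        return stderr.strip() or "blocked by config-protection"
    return None


def check_with_engine(engine_path: str, tool_name: str, tool_input: dict,
                      timeout_s: float = 5.0) -> Optional[str]:
    """Run the Node.js engine for write-like tools and map its exit code."""
    if tool_name not in WRITE_LIKE_TOOLS:
        return None  # the gate only inspects write-like tools
    payload = json.dumps({
        "hook_event_name": "PreToolUse",
        "tool_name": tool_name,
        "tool_input": tool_input,
    })
    try:
        proc = subprocess.run(
            ["node", engine_path, "config-protection", "config-protection.js"],
            input=payload, capture_output=True, text=True, timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # simplified: allow the call if the engine cannot run
    return decision_from_exit(proc.returncode, proc.stderr)
```

A hook callback would assign a non-`None` result to `event.cancel_tool` and leave the event untouched otherwise.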
140
+
141
+ The engine is located by `_find_engine_paths()` in this priority order:
142
+
143
+ 1. `FLOW_AGENTS_ENGINE_PATH` environment variable (explicit override).
144
+ 2. Relative to the package source file: `../../../../scripts/hooks/run-hook.js` (works from a repo checkout).
145
+ 3. Walked up from `os.getcwd()` looking for `node_modules/@kontourai/flow-agents/scripts/hooks/run-hook.js` (npm-installed package).
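
The search order above can be sketched as a single function. This is not the body of `_find_engine_paths()`: the name `locate_engine` is hypothetical, and two details are assumptions. The sketch reads `../../../../` as four levels up from the source file itself (the repo root for `integrations/strands/flow_agents_strands/policy.py`), and it skips an override that does not point at an existing file.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ENGINE_REL = Path("scripts") / "hooks" / "run-hook.js"
NPM_REL = Path("node_modules") / "@kontourai" / "flow-agents" / ENGINE_REL


def locate_engine(source_file, cwd,
                  env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the first run-hook.js found in the documented priority order."""
    # 1. Explicit override via environment variable.
    override = env.get("FLOW_AGENTS_ENGINE_PATH")
    if override and Path(override).is_file():
        return Path(override)

    # 2. Repo checkout: four levels up from the package source file.
    parents = Path(source_file).resolve().parents
    if len(parents) > 3:
        checkout = parents[3] / ENGINE_REL
        if checkout.is_file():
            return checkout

    # 3. npm install: walk up from the working directory.
    start = Path(cwd).resolve()
    for directory in (start, *start.parents):
        candidate = directory / NPM_REL
        if candidate.is_file():
            return candidate
    return None
```

Passing `source_file`, `cwd`, and `env` as arguments instead of reading `__file__` and `os.getcwd()` directly keeps the lookup testable against a temporary directory tree.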
146
+
147
+ **Fallback mode — Python evaluation:**
148
+
149
+ If `node` is not on PATH or `run-hook.js` cannot be located, the gate degrades to a built-in Python implementation of the same logic. The fallback uses the same `PROTECTED_FILES` frozenset as `config-protection.js` and is auditable. The degradation is not silent: a one-time `RuntimeWarning` is printed to stderr.
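
A rough sketch of the fallback path, with a one-time `RuntimeWarning`, is shown below. Several parts are assumptions: the set contains only `biome.json` (the file used in the example payload) as a stand-in for the real `PROTECTED_FILES`, matching is done on the file's base name, the target is read from `tool_input["path"]`, and the function name `fallback_block_reason` is hypothetical.

```python
import warnings
from pathlib import PurePath
from typing import Optional

# Hypothetical stand-in; the real PROTECTED_FILES set lives in the adapter.
PROTECTED_FILES = frozenset({"biome.json"})

_warned = False  # module-level flag so the warning is emitted only once


def fallback_block_reason(tool_input: dict,
                          protected: frozenset = PROTECTED_FILES) -> Optional[str]:
    """Return a block reason if the target file is protected, else None."""
    global _warned
    if not _warned:
        warnings.warn(
            "Node.js engine unavailable; using Python config-protection fallback",
            RuntimeWarning,
            stacklevel=2,
        )
        _warned = True
    path = tool_input.get("path")
    # Base-name matching is an assumption of this sketch.
    if isinstance(path, str) and PurePath(path).name in protected:
        return f"config-protection: {PurePath(path).name} is a protected file"
    return None
```

The same function also covers the custom-set case: passing a different `protected` frozenset evaluates the check in Python without involving the engine.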
150
+
151
+ **Custom protected set:**
152
+
153
+ If a `PolicyGate` is constructed with a custom `protected_files` frozenset, Python evaluation is used directly (the engine subprocess cannot receive a runtime-custom set). This path is intended for tests and local override only.
154
+
155
+ ## Workflow steering
156
+
157
+ Strands' `BeforeInvocationEvent` does not expose a mutable system prompt at callback time.
158
+
159
+ The spike approach: call `hooks.steering_context()` at `Agent` construction time and append the result to the system prompt. `steering_context()` reads the current workflow state from `.flow-agents/` and returns a text block. It also emits a `turn.user` telemetry event so the injection is recorded in the JSONL log.
160
+
161
+ This is a one-shot snapshot. It does not re-evaluate on every turn the way `workflow-steering.js` does at `UserPromptSubmit`. See item 2 under Documented limitations for the productization path.
162
+
163
+ ## Documented limitations
164
+
165
+ The following limitations come from `integrations/strands/README.md` and reflect the current spike state. They are known gaps, not defects to be worked around silently.
166
+
167
+ 1. **Node.js subprocess dependency**: The primary policy binding spawns a Node.js subprocess for each `BeforeToolCallEvent` involving a write-like tool. If `node` is not on PATH or the package is not installed, the gate degrades to the Python fallback with a one-time `RuntimeWarning`. To pin the engine location, set `FLOW_AGENTS_ENGINE_PATH` to the absolute path of `run-hook.js`; `node` must still be on PATH for the subprocess path to be used.
168
+
169
+ 2. **Steering seam**: Strands does not allow mutating the system prompt from `BeforeInvocationEvent`. The workaround (`steering_context()` at Agent construction) is a one-shot snapshot; it does not re-evaluate on every turn. Productization would require either a custom Strands model wrapper that injects context per-turn, or upstream SDK support for mutable system-prompt context in the invocation event.
170
+
171
+ 3. **session.usage event omitted**: The JS harness emits a `session.usage` event on stop with token counts. The Strands `AfterInvocationEvent` does not expose token-usage data in the hook payload, so this event is not emitted.
172
+
173
+ 4. **No analytics channel**: The harness adapters write to two channels (full + analytics) with different redaction profiles. This spike writes only to the `full` channel.
174
+
175
+ 5. **No Console/HTTP sink**: The bash transport supports POSTing events to a Console endpoint. This adapter writes JSONL only.
176
+
177
+ 6. **Runtime version is "unknown"**: Strands does not expose its version through the hook event; `agent.version` is hardcoded to `"unknown"`.
178
+
179
+ 7. **No subagent/delegation event**: The Strands SDK does not have a built-in delegation tool; the `subagentStart`/`subagentStop` telemetry path is not wired.
180
+
181
+ 8. **Quality-gate policy omitted**: `quality-gate.js` invokes ruff/biome after edits. There is no clear Strands analogue yet.
182
+
183
+ ## Conformance declaration
184
+
185
+ The Strands adapter is L0 plus config protection via `BeforeToolCallEvent` cancellation. A full conformance declaration would read:
186
+
187
+ ```
188
+ conformance_level: L0 (+ config-protection via BeforeToolCallEvent)
189
+ host: AWS Strands Agents
190
+ event_coverage:
191
+   agentSpawn: AgentInitializedEvent (full fidelity)
192
+   userPromptSubmit: BeforeInvocationEvent (no per-turn injection — spike limitation)
193
+   preToolUse: BeforeToolCallEvent (full fidelity, cancellable)
194
+   postToolUse: AfterToolCallEvent (full fidelity)
195
+   stop: AfterInvocationEvent (full fidelity)
196
+   permissionRequest: no native equivalent
197
+   subagentStart: no native equivalent
198
+   subagentStop: no native equivalent
199
+ policy_coverage:
200
+   workflow_steering: partial — injected once at Agent construction, not per-turn
201
+   quality_gate: omitted — no current Strands analogue
202
+   stop_goal_fit: omitted — AfterInvocationEvent used for telemetry only
203
+   config_protection: wired at BeforeToolCallEvent (blocking via event.cancel_tool)
204
+ ```
205
+
206
+ ## Running tests
207
+
208
+ The spike ships 50 unit tests, none of which require the Strands SDK to be installed:
209
+
210
+ ```bash
211
+ cd integrations/strands
212
+ python3 -m unittest discover
213
+ ```
214
+
215
+ ## Related references
216
+
217
+ - `integrations/strands/flow_agents_strands/hooks.py` — `FlowAgentsHooks` and `register_hooks`
218
+ - `integrations/strands/flow_agents_strands/telemetry.py` — `TelemetrySink`, `STRANDS_TO_CANONICAL`
219
+ - `integrations/strands/flow_agents_strands/policy.py` — `PolicyGate`, engine subprocess binding
220
+ - `integrations/strands/flow_agents_strands/steering.py` — `SteeringContext`
221
+ - `integrations/strands/README.md` — spike README with quickstart and full limitations list
222
+ - <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec §6.2</a> — framework adapter contract and minimum viable adapter pseudocode
223
+ - <a href="conformance.html">Conformance</a> — how to self-certify using the conformance kit
224
+
225
+ ---
226
+
227
+ ## TypeScript native-import adapter (`integrations/strands-ts/`)
228
+
229
+ `@kontourai/flow-agents-strands` is the first **native-import** consumer of the policy engine contract. Where the Python adapter spawns a subprocess for each `BeforeToolCallEvent` policy check, the TS adapter calls `config-protection.js`'s exported `run()` function directly — zero subprocess overhead on the hot path.
230
+
231
+ ### Key differences from the Python adapter
232
+
233
+ | | Python adapter | TypeScript adapter |
234
+ |--|----------------|-------------------|
235
+ | Engine binding | subprocess (`node run-hook.js …`) | `require("config-protection.js").run()` — in-process |
236
+ | Strands SDK | `register_hooks(registry)` → `registry.add_callback` | `registerHooks(registry)` → `registry.addCallback` |
237
+ | Cancel signal | `event.cancel_tool = reason` | `event.cancel = reason` (TS variant) |
238
+ | Conformance | L0 + config-protection | L2 (all four policy classes via shim) |
239
+ | Test framework | stdlib unittest (Python) | node:test (no extra deps) |
240
+
241
+ ### Constructing FlowAgentsHooks (TypeScript)
242
+
243
+ ```typescript
244
+ import { FlowAgentsHooks } from "@kontourai/flow-agents-strands";
245
+
246
+ const hooks = new FlowAgentsHooks({
247
+   workspace: ".",           // root of your project
248
+   agentName: "my-agent",
249
+   // engineRoot: "/path/to/flow-agents"  // optional: explicit engine path
250
+ });
251
+ ```
252
+
253
+ ### Event mapping
254
+
255
+ The TS adapter exports `STRANDS_TO_CANONICAL` matching the Python adapter's dict:
256
+
257
+ | Strands TS Event | Canonical event |
258
+ |------------------|-----------------|
259
+ | `BeforeInvocationEvent` | `userPromptSubmit` |
260
+ | `AfterInvocationEvent` | `stop` |
261
+ | `BeforeToolCallEvent` | `preToolUse` |
262
+ | `AfterToolCallEvent` | `postToolUse` |
263
+ | `AgentInitializedEvent` | `agentSpawn` |
264
+
265
+ ### Conformance
266
+
267
+ The TS adapter achieves **L2** via `bin/conformance-shim.mjs`:
268
+
269
+ ```bash
270
+ node packaging/conformance/run-conformance.js \
271
+ --adapter-cmd "node integrations/strands-ts/bin/conformance-shim.mjs" \
272
+ --level L2
273
+ ```
274
+
275
+ 12/12 fixtures pass. See `integrations/strands-ts/README.md` for the full conformance declaration and limitations.