@kontourai/flow-agents 0.1.2 → 0.2.0
- package/.github/dependabot.yml +23 -0
- package/.github/workflows/release-please.yml +31 -0
- package/.github/workflows/runtime-compat.yml +118 -0
- package/CHANGELOG.md +23 -0
- package/CONTRIBUTING.md +4 -0
- package/README.md +53 -10
- package/build/src/cli/init.js +215 -5
- package/build/src/cli/utterance-check.js +65 -1
- package/build/src/tools/build-universal-bundles.js +268 -0
- package/build/src/tools/filter-installed-packs.js +3 -0
- package/build/src/tools/validate-source-tree.js +5 -1
- package/context/scripts/telemetry/lib/config.sh +5 -1
- package/context/settings/flow-agents-settings.json +7 -0
- package/docs/context-map.md +1 -0
- package/docs/index.md +45 -4
- package/docs/integrations/conformance.md +246 -0
- package/docs/integrations/framework-adapter.md +275 -0
- package/docs/integrations/harness-install.md +213 -0
- package/docs/integrations/index.md +54 -0
- package/docs/north-star.md +2 -2
- package/docs/spec/runtime-hook-surface.md +472 -0
- package/docs/survey-utterance-check.md +211 -94
- package/docs/vision.md +45 -0
- package/evals/acceptance/run.sh +4 -2
- package/evals/acceptance/test_opencode_harness.sh +121 -0
- package/evals/acceptance/test_pi_harness.sh +98 -0
- package/evals/integration/test_bundle_install.sh +226 -1
- package/evals/integration/test_bundle_lifecycle.sh +641 -0
- package/evals/integration/test_utterance_check.sh +291 -44
- package/evals/run.sh +2 -0
- package/evals/static/test_universal_bundles.sh +137 -2
- package/integrations/strands/README.md +256 -0
- package/integrations/strands/example.py +74 -0
- package/integrations/strands/flow_agents_strands/__init__.py +27 -0
- package/integrations/strands/flow_agents_strands/hooks.py +194 -0
- package/integrations/strands/flow_agents_strands/policy.py +348 -0
- package/integrations/strands/flow_agents_strands/steering.py +172 -0
- package/integrations/strands/flow_agents_strands/telemetry.py +238 -0
- package/integrations/strands/pyproject.toml +38 -0
- package/integrations/strands/tests/__init__.py +0 -0
- package/integrations/strands/tests/test_hooks.py +304 -0
- package/integrations/strands/tests/test_policy.py +315 -0
- package/integrations/strands/tests/test_telemetry.py +184 -0
- package/integrations/strands-ts/README.md +224 -0
- package/integrations/strands-ts/bin/conformance-shim.mjs +257 -0
- package/integrations/strands-ts/package.json +53 -0
- package/integrations/strands-ts/src/hooks.ts +208 -0
- package/integrations/strands-ts/src/index.ts +22 -0
- package/integrations/strands-ts/src/policy.ts +345 -0
- package/integrations/strands-ts/src/telemetry.ts +251 -0
- package/integrations/strands-ts/test/test-policy.ts +322 -0
- package/integrations/strands-ts/test/test-telemetry.ts +226 -0
- package/integrations/strands-ts/tsconfig.json +20 -0
- package/package.json +7 -2
- package/packaging/conformance/README.md +142 -0
- package/packaging/conformance/fixtures/config-protection--allow-no-path.json +18 -0
- package/packaging/conformance/fixtures/config-protection--allow-safe-file.json +20 -0
- package/packaging/conformance/fixtures/config-protection--block-biome.json +20 -0
- package/packaging/conformance/fixtures/config-protection--block-eslintrc.json +20 -0
- package/packaging/conformance/fixtures/quality-gate--allow-no-path.json +17 -0
- package/packaging/conformance/fixtures/quality-gate--allow-nonexistent-file.json +19 -0
- package/packaging/conformance/fixtures/stop-goal-fit--allow-clean-cwd.json +17 -0
- package/packaging/conformance/fixtures/stop-goal-fit--block-strict-mode.json +23 -0
- package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +21 -0
- package/packaging/conformance/fixtures/workflow-steering--allow-no-state.json +16 -0
- package/packaging/conformance/fixtures/workflow-steering--inject-active-state.json +29 -0
- package/packaging/conformance/fixtures/workflow-steering--inject-subagent-steering.json +25 -0
- package/packaging/conformance/package.json +4 -0
- package/packaging/conformance/run-conformance.js +322 -0
- package/packaging/manifest.json +59 -0
- package/schemas/flow-agents-settings.schema.json +48 -0
- package/scripts/README.md +4 -0
- package/scripts/dogfood.js +16 -0
- package/scripts/hooks/opencode-hook-adapter.js +123 -0
- package/scripts/hooks/opencode-telemetry-hook.js +101 -0
- package/scripts/hooks/pi-hook-adapter.js +123 -0
- package/scripts/hooks/pi-telemetry-hook.js +105 -0
- package/scripts/hooks/run-hook.js +8 -0
- package/scripts/hooks/utterance-check.js +124 -22
- package/scripts/telemetry/lib/config.sh +5 -1
- package/src/cli/init.ts +219 -6
- package/src/cli/utterance-check.ts +71 -1
- package/src/tools/build-universal-bundles.ts +266 -0
- package/src/tools/filter-installed-packs.ts +3 -0
- package/src/tools/validate-source-tree.ts +5 -1
|
@@ -0,0 +1,246 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Conformance
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# Conformance
|
|
6
|
+
|
|
7
|
+
This page explains how a third-party adapter self-certifies against the Flow Agents policy engine contract. It covers the engine contract version 1.0, how to run the conformance kit, what each conformance level requires, and how to declare gaps using the opencode and pi built-in examples as the pattern.
|
|
8
|
+
|
|
9
|
+
Everything on this page is grounded in `packaging/conformance/` and `docs/spec/runtime-hook-surface.md`; no behavior is inferred.
|
|
10
|
+
|
|
11
|
+
## Engine contract 1.0
|
|
12
|
+
|
|
13
|
+
The engine contract is the versioned public interface between Flow Agents policy scripts and adapters, and third-party adapters bind to it. Breaking changes increment the major version and are announced in the CHANGELOG.
|
|
14
|
+
|
|
15
|
+
The contract is defined in <a href="../spec/runtime-hook-surface.html">the spec, section 8</a>. In summary:
|
|
16
|
+
|
|
17
|
+
**Invocation — subprocess form** (standard, used by all current adapters):
|
|
18
|
+
|
|
19
|
+
```bash
echo '<JSON payload>' | node scripts/hooks/run-hook.js <hookId> <scriptRelativePath> [profilesCsv]
```
|
|
22
|
+
|
|
23
|
+
- `hookId`: identifier for the hook (e.g., `config-protection`). Used for profile/disable checks.
|
|
24
|
+
- `scriptRelativePath`: path relative to `scripts/hooks/` (e.g., `config-protection.js`).
|
|
25
|
+
- `profilesCsv`: comma-separated profile names. Hooks not in the current `SA_HOOK_PROFILE` are skipped.
|
|
26
|
+
- Payload is read from stdin. Max 1 MiB. If truncated, `SA_HOOK_INPUT_TRUNCATED=1` is set.
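The truncation rule in the last bullet can be sketched in Python as follows. This illustrates the described behavior only and is not the code in `run-hook.js`; `read_capped` and `child_env` are hypothetical helper names, and the idea that the flag is passed through the child environment is an assumption based on the variable name.

```python
import io
import os

MAX_STDIN = 1024 * 1024  # 1 MiB, the limit stated above


def read_capped(stream, limit=MAX_STDIN):
    """Read at most `limit` bytes and report whether more input was available."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized payload
    truncated = len(data) > limit
    return data[:limit], truncated


def child_env(truncated, base=None):
    """Build an environment mapping, flagging truncation (assumed mechanism)."""
    env = dict(os.environ if base is None else base)
    if truncated:
        env["SA_HOOK_INPUT_TRUNCATED"] = "1"
    return env


payload, truncated = read_capped(io.BytesIO(b'{"tool_name":"edit"}'))
print(truncated)  # False for this small payload
```

Reading one byte past the limit is what lets the caller tell "exactly 1 MiB" apart from "more than 1 MiB" without buffering the whole input.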
|
|
27
|
+
|
|
28
|
+
**Invocation — native import form** (for Node.js adapters, preferred for performance):
|
|
29
|
+
|
|
30
|
+
```javascript
const { run } = require('./scripts/hooks/config-protection.js');
const output = run(rawJsonString, { truncated: false, maxStdin: 1024 * 1024 });
```
|
|
34
|
+
|
|
35
|
+
All four policy scripts export `module.exports = { run }`.
|
|
36
|
+
|
|
37
|
+
**Version query:**
|
|
38
|
+
|
|
39
|
+
```bash
node scripts/hooks/run-hook.js --contract-version
# → {"contract_version":"1.0","runner":"run-hook.js"}
```
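An adapter could use the version query as a startup guard. The sketch below parses the JSON shown above and compares only the major component, on the assumption (taken from the contract description) that breaking changes bump the major version. `is_compatible` is a hypothetical helper, not part of the package.

```python
import json


def is_compatible(version_output, supported_major=1):
    """Return True when the runner reports the supported major contract version."""
    info = json.loads(version_output)
    major = int(info["contract_version"].split(".")[0])
    return major == supported_major


sample = '{"contract_version":"1.0","runner":"run-hook.js"}'
print(is_compatible(sample))  # True
```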
|
|
43
|
+
|
|
44
|
+
**Exit code semantics:**
|
|
45
|
+
|
|
46
|
+
| Exit code | Semantics |
| --- | --- |
| `0` | Allow: policy has no objection |
| `2` | Block: policy vetoes the action |
| other | Error: treat as allow (fail-open) |
|
|
51
|
+
|
|
52
|
+
**Fail-open rule**: Hook runtime errors must never block agent work. Every policy except `config-protection` always exits 0 on non-policy errors. `config-protection` is the one exception: it exits 2 on a protected-file match or a truncated payload, and exits 0 on any other runtime error.
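For adapters that use the subprocess form, the exit-code table and the fail-open rule reduce to a small mapping. The following is a minimal sketch under stated assumptions: `decision_from_exit_code` and `run_policy` are hypothetical names, the 30-second timeout is an arbitrary illustrative value, and the runner path assumes the working directory is the package root.

```python
import subprocess


def decision_from_exit_code(code):
    """Map an engine exit code to a decision, failing open on anything unexpected."""
    if code == 2:
        return "block"
    return "allow"  # 0 is an explicit allow; any other code is an error treated as allow


def run_policy(payload_json, hook_id, script,
               runner="scripts/hooks/run-hook.js", timeout_s=30):
    """Pipe a JSON payload to the runner; a runtime failure never blocks work."""
    try:
        proc = subprocess.run(
            ["node", runner, hook_id, script],
            input=payload_json, capture_output=True, text=True, timeout=timeout_s,
        )
    except (OSError, subprocess.SubprocessError):
        return "allow", ""  # node missing, timeout, and similar failures fail open
    return decision_from_exit_code(proc.returncode), proc.stderr.strip()


print(decision_from_exit_code(2), decision_from_exit_code(1))  # block allow
```

Keeping the mapping in its own function makes the fail-open behavior easy to unit-test without spawning Node.js.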
|
|
53
|
+
|
|
54
|
+
## What each conformance level requires
|
|
55
|
+
|
|
56
|
+
Conformance levels are defined in <a href="../spec/runtime-hook-surface.html">the spec, section 4</a>.
|
|
57
|
+
|
|
58
|
+
### L0: Telemetry only
|
|
59
|
+
|
|
60
|
+
The adapter wires the telemetry script to at least one lifecycle event. No policy hooks are required.
|
|
61
|
+
|
|
62
|
+
**Required:** At minimum, `agentSpawn` telemetry fires on session start.
|
|
63
|
+
|
|
64
|
+
**Permitted gaps:** All four policy classes (workflow steering, quality gate, stop-goal-fit, config protection) may be absent.
|
|
65
|
+
|
|
66
|
+
**Use case:** Framework adapters and runtimes where the telemetry signal is valuable but blocking or context injection is not feasible.
|
|
67
|
+
|
|
68
|
+
### L1: Steering
|
|
69
|
+
|
|
70
|
+
The adapter implements L0 plus workflow steering and stop-goal-fit in warning mode.
|
|
71
|
+
|
|
72
|
+
**Required:**
|
|
73
|
+
- L0 telemetry.
|
|
74
|
+
- Workflow steering fires on `userPromptSubmit` (or the closest equivalent — document which event is used and any fidelity loss).
|
|
75
|
+
- Stop-goal-fit fires on `stop` in warning-only mode (exits 0 always).
|
|
76
|
+
|
|
77
|
+
**Permitted gaps:** Quality gate and config protection may be absent. Stop-goal-fit runs in warning mode only.
|
|
78
|
+
|
|
79
|
+
**Use case:** Harness adapters where the runtime supports prompt-submit and stop hooks, but tool-level blocking is not available or desired.
|
|
80
|
+
|
|
81
|
+
### L2: Enforcing gates
|
|
82
|
+
|
|
83
|
+
The adapter implements L1 plus all blocking policy classes.
|
|
84
|
+
|
|
85
|
+
**Required:**
|
|
86
|
+
- Everything required at L1 (telemetry, workflow steering, and stop-goal-fit).
|
|
87
|
+
- Config protection fires on `preToolUse` and can block (exit 2 translates to a deny response).
|
|
88
|
+
- Quality gate fires on `postToolUse`.
|
|
89
|
+
- Stop-goal-fit fires on `stop` with `FLOW_AGENTS_GOAL_FIT_STRICT` configurable.
|
|
90
|
+
|
|
91
|
+
**Permitted gaps:** None. All four policy classes must be wired. Any missing host trigger must be documented as a named gap in the conformance declaration.
|
|
92
|
+
|
|
93
|
+
**Use case:** Claude Code and Codex are L2 reference implementations.
|
|
94
|
+
|
|
95
|
+
## Running the conformance kit
|
|
96
|
+
|
|
97
|
+
The conformance kit is in `packaging/conformance/`. It requires no npm dependencies — only Node.js.
|
|
98
|
+
|
|
99
|
+
**Self-test the canonical engine (must report L2):**
|
|
100
|
+
|
|
101
|
+
```bash
node packaging/conformance/run-conformance.js --self
```
|
|
104
|
+
|
|
105
|
+
**Test a third-party adapter at L2:**
|
|
106
|
+
|
|
107
|
+
```bash
node packaging/conformance/run-conformance.js \
  --adapter-cmd "node /path/to/your-adapter.js" \
  --level L2
```
|
|
112
|
+
|
|
113
|
+
**Test at L1 only:**
|
|
114
|
+
|
|
115
|
+
```bash
node packaging/conformance/run-conformance.js \
  --adapter-cmd "node /path/to/your-adapter.js" \
  --level L1
```
|
|
120
|
+
|
|
121
|
+
**CLI reference:**
|
|
122
|
+
|
|
123
|
+
```
node packaging/conformance/run-conformance.js [options]

  --self              Run against the canonical engine (target L2)
  --adapter-cmd CMD   Shell command to pipe fixtures to (adapter under test)
  --level L0|L1|L2    Minimum conformance level to enforce (default: L2 for --self, L0 for --adapter-cmd)
  --fixtures DIR      Override fixture directory (default: packaging/conformance/fixtures/)
  --verbose           Print fixture payloads and full output in per-fixture results
```
|
|
132
|
+
|
|
133
|
+
Exit codes: `0` = target level reached, `1` = target level not reached, `2` = usage error.
|
|
134
|
+
|
|
135
|
+
### Adapter contract for the runner
|
|
136
|
+
|
|
137
|
+
Your adapter command:
|
|
138
|
+
- Receives a canonical JSON payload on stdin (one JSON object).
|
|
139
|
+
- Writes the input JSON (or augmented form) to stdout on allow.
|
|
140
|
+
- Exits `0` to allow, `2` to block, any other code for error (treated as allow, fail-open).
|
|
141
|
+
|
|
142
|
+
The runner invokes your command exactly once per fixture via `sh -c "<your-cmd>"`.
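The three rules above can be sketched as a tiny adapter command. This is an illustration of the stdin/stdout/exit-code contract only: the protected set is a stand-in whose names merely echo the block fixtures, the payload shape (`tool_input.path`) is taken from the example payload in the Framework Adapter page, and `evaluate` and `main` are hypothetical names.

```python
import json
import os
import sys

# Illustrative stand-in for a real policy; not the engine's protected-file list.
PROTECTED_BASENAMES = frozenset({".eslintrc", "biome.json"})


def evaluate(raw):
    """Return (exit_code, stdout_text) for one payload read from stdin."""
    try:
        payload = json.loads(raw)
        path = (payload.get("tool_input") or {}).get("path")
    except (ValueError, AttributeError):
        return 0, raw  # malformed input is a runtime error: fail open
    if isinstance(path, str) and os.path.basename(path) in PROTECTED_BASENAMES:
        return 2, ""  # block: nothing is echoed to stdout
    return 0, raw  # allow: echo the input JSON unchanged


def main():
    """Entry point for use as an adapter command (call this at the end of a script)."""
    code, out = evaluate(sys.stdin.read())
    if code == 2:
        sys.stderr.write("blocked by policy\n")
    else:
        sys.stdout.write(out)
    sys.exit(code)


print(evaluate('{"tool_name":"edit","tool_input":{"path":"biome.json"}}')[0])  # 2
```

Saved as a script that ends with a call to `main()`, a file like this could be passed to the runner through `--adapter-cmd`; a real adapter would delegate the decision to the engine rather than hard-code a file list.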
|
|
143
|
+
|
|
144
|
+
### Fixture inventory
|
|
145
|
+
|
|
146
|
+
The fixtures in `packaging/conformance/fixtures/` cover all four policy classes:
|
|
147
|
+
|
|
148
|
+
| Fixture | Policy class | Event | Level |
| --- | --- | --- | --- |
| `config-protection--block-eslintrc.json` | config-protection | preToolUse | L2 |
| `config-protection--block-biome.json` | config-protection | preToolUse | L2 |
| `config-protection--allow-safe-file.json` | config-protection | preToolUse | L2 |
| `config-protection--allow-no-path.json` | config-protection | preToolUse | L2 |
| `quality-gate--allow-nonexistent-file.json` | quality-gate | postToolUse | L2 |
| `quality-gate--allow-no-path.json` | quality-gate | postToolUse | L2 |
| `stop-goal-fit--allow-clean-cwd.json` | stop-goal-fit | stop | L1 |
| `stop-goal-fit--warn-active-delivery.json` | stop-goal-fit | stop | L1 |
| `stop-goal-fit--block-strict-mode.json` | stop-goal-fit | stop | L2 |
| `workflow-steering--allow-no-state.json` | workflow-steering | userPromptSubmit | L1 |
| `workflow-steering--inject-active-state.json` | workflow-steering | userPromptSubmit | L1 |
| `workflow-steering--inject-subagent-steering.json` | workflow-steering | postToolUse | L1 |
|
|
162
|
+
|
|
163
|
+
Fixtures with `workspace_setup` create a temporary directory with the listed files before invoking the adapter and clean up afterward. The `cwd` field in those payloads is replaced with the temp directory path at runtime.
|
|
164
|
+
|
|
165
|
+
## How to declare gaps
|
|
166
|
+
|
|
167
|
+
If your adapter legitimately cannot satisfy a fixture — because the host runtime has no blocking `preToolUse` equivalent, or no stop hook — declare the gap explicitly in your adapter documentation. The opencode and pi adapters are the reference pattern.
|
|
168
|
+
|
|
169
|
+
### opencode: no prompt-submit hook
|
|
170
|
+
|
|
171
|
+
opencode has no native `prompt.submit`-equivalent event. Workflow steering cannot fire at each user turn. The gap is declared in the plugin source comment and in the conformance declaration:
|
|
172
|
+
|
|
173
|
+
```yaml
conformance_level: L1
host: opencode
event_coverage:
  agentSpawn: session.created (full fidelity)
  userPromptSubmit: no native equivalent; workflow steering fires at session.created only
  preToolUse: tool.execute.before (full fidelity, blocking available via thrown Error)
  postToolUse: tool.execute.after (full fidelity)
  stop: session.idle (reduced fidelity; fires on idle, not on completion)
  permissionRequest: permission.asked (telemetry only; no blocking capability)
policy_coverage:
  workflow_steering: partial; injected at session.created only, not at each turn
  quality_gate: wired at tool.execute.after
  stop_goal_fit: degraded; session.idle does not reliably fire at completion
  config_protection: wired at tool.execute.before (blocking)
gaps:
  - event: userPromptSubmit
    reason: opencode has no prompt.submit equivalent
    degradation: Workflow steering fires once at session.created instead of at each user turn
  - event: stop
    reason: session.idle is the closest event but is not a true completion signal
    degradation: stop-goal-fit warnings may not fire reliably at session end
```
|
|
196
|
+
|
|
197
|
+
### pi: no stop hook
|
|
198
|
+
|
|
199
|
+
pi has no stop hook. Stop-goal-fit cannot fire at session end. The gap is declared in the extension source comment and in the conformance declaration:
|
|
200
|
+
|
|
201
|
+
```yaml
conformance_level: L1
host: pi
event_coverage:
  agentSpawn: session_start (full fidelity)
  userPromptSubmit: before_agent_start (reduced fidelity; fires at agent start, not per-turn)
  # Quoted because the value contains ": ", which a plain YAML scalar cannot hold.
  preToolUse: "tool_call (full fidelity, blockable via return { block: true })"
  postToolUse: tool_result (full fidelity)
  stop: no native equivalent; session_shutdown used as closest analogue
policy_coverage:
  workflow_steering: partial; injected at before_agent_start, not at each user turn
  quality_gate: wired at tool_result
  stop_goal_fit: degraded; session_shutdown does not reliably carry stop semantics
  config_protection: wired at tool_call (blocking)
gaps:
  - event: stop
    reason: pi has no stop hook
    degradation: stop-goal-fit cannot fire; agent may complete without the check
    workaround: Run stop-goal-fit checks explicitly in CI or via a post-session script
```
|
|
221
|
+
|
|
222
|
+
## Including a conformance declaration in your adapter
|
|
223
|
+
|
|
224
|
+
After running the conformance kit, include a declaration in your adapter documentation:
|
|
225
|
+
|
|
226
|
+
```yaml
conformance_level: L2   # or L0 / L1
engine_contract_version: "1.0"
runner_version: "run-conformance.js"
test_date: 2026-06-11
verdict: PASS
fixture_count: 12
fixtures_passed: 12
gaps: []
```
|
|
236
|
+
|
|
237
|
+
If any fixtures fail, list them under `gaps` with a description of the degradation behavior. Declared gaps do not prevent reaching a lower conformance level — they make the adapter's behavior honest and auditable.
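One way to assemble such a declaration from per-fixture results is sketched below. The aggregation rule used here (a level is reached when every fixture tagged at or below it passes) is an assumption drawn from the fixture inventory table, not a description of how `run-conformance.js` computes its verdict. `highest_level`, `build_declaration`, and the `fixture` key inside `gaps` are hypothetical.

```python
LEVELS = ["L0", "L1", "L2"]


def highest_level(results):
    """results: list of (fixture_name, fixture_level, passed) tuples."""
    reached = "L0"  # no fixtures are tagged L0 in the inventory
    for level in LEVELS[1:]:
        required = [r for r in results if LEVELS.index(r[1]) <= LEVELS.index(level)]
        if all(passed for _, _, passed in required):
            reached = level
        else:
            break
    return reached


def build_declaration(results):
    """Summarize results using the field names from the declaration above."""
    failed = [name for name, _, passed in results if not passed]
    return {
        "conformance_level": highest_level(results),
        "engine_contract_version": "1.0",
        "fixture_count": len(results),
        "fixtures_passed": len(results) - len(failed),
        # Each failed fixture becomes a gap the author still has to describe.
        "gaps": [{"fixture": name, "degradation": "TODO: describe"} for name in failed],
    }


results = [
    ("workflow-steering--allow-no-state.json", "L1", True),
    ("stop-goal-fit--block-strict-mode.json", "L2", False),
]
decl = build_declaration(results)
print(decl["conformance_level"], decl["fixtures_passed"])  # L1 1
```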
|
|
238
|
+
|
|
239
|
+
## Related references
|
|
240
|
+
|
|
241
|
+
- `packaging/conformance/run-conformance.js` — conformance runner
|
|
242
|
+
- `packaging/conformance/fixtures/` — golden fixtures
|
|
243
|
+
- `packaging/conformance/README.md` — conformance kit README
|
|
244
|
+
- <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec §8</a> — engine contract 1.0 in full
|
|
245
|
+
- <a href="harness-install.html">Harness Install</a> — worked install examples for opencode and pi
|
|
246
|
+
- <a href="framework-adapter.html">Framework Adapter</a> — worked example of a language-native adapter
|
|
@@ -0,0 +1,275 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Framework Adapter
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# Framework Adapter
|
|
6
|
+
|
|
7
|
+
This page walks through the `integrations/strands/` reference implementation: a Python `HookProvider` for AWS Strands Agents. It covers how to construct `FlowAgentsHooks`, what telemetry it emits, how the policy gate binds to the canonical engine contract, and the documented limitations of this spike.
|
|
8
|
+
|
|
9
|
+
Everything on this page is grounded in the files under `integrations/strands/`. No behavior is inferred or aspirational unless explicitly labeled as direction.
|
|
10
|
+
|
|
11
|
+
## Harness adapters vs. framework adapters
|
|
12
|
+
|
|
13
|
+
Harness adapters (Claude Code, Codex, Kiro, opencode, pi) integrate with coding-agent runtimes that have their own hook format: JSON on stdin, exit codes, and lifecycle events named by the harness. Each harness adapter normalizes its runtime's hook payloads into the canonical Flow Agents telemetry taxonomy and delegates to `scripts/telemetry/telemetry.sh`.
|
|
14
|
+
|
|
15
|
+
Framework adapters are in-process packages. Strands Agents is not a coding-agent harness — it is a general-purpose Python agent SDK. Its hook surface (`HookProvider` / `HookRegistry`) is class-based and synchronous. There is no stdin/stdout protocol and no process exit codes as block signals. Hook callbacks receive typed Python event objects and can mutate them in place.
|
|
16
|
+
|
|
17
|
+
Despite the surface differences, the same canonical event taxonomy is used. The JSONL output from `FlowAgentsHooks` is structurally identical to the output produced by `claude-telemetry-hook.js` and `codex-telemetry-hook.js`.
|
|
18
|
+
|
|
19
|
+
## Constructing FlowAgentsHooks
|
|
20
|
+
|
|
21
|
+
`FlowAgentsHooks` is the main entry point. It implements the Strands `HookProvider` protocol via duck typing, so `strands-agents` is not required at import time.
|
|
22
|
+
|
|
23
|
+
```python
from flow_agents_strands import FlowAgentsHooks

hooks = FlowAgentsHooks(
    workspace=".",            # root of your project (reads .flow-agents/)
    agent_name="my-agent",    # appears in telemetry events
)
```
|
|
31
|
+
|
|
32
|
+
Constructor parameters (all optional):
|
|
33
|
+
|
|
34
|
+
| Parameter | Default | Purpose |
| --- | --- | --- |
| `sink_path` | `<workspace>/.flow-agents/.telemetry/full.jsonl` | JSONL telemetry output path or directory |
| `workspace` | `os.getcwd()` | Root of the workspace; used to discover `.flow-agents/` |
| `agent_name` | `"strands-agent"` | Agent identifier embedded in telemetry events |
| `runtime` | `"strands"` | Runtime label embedded in telemetry events |
| `policy_gate` | `PolicyGate()` | Optional custom `PolicyGate` instance (for testing) |
|
|
41
|
+
|
|
42
|
+
`FlowAgentsHooks` is usable without `strands-agents` installed. Telemetry emission and `steering_context()` work in any Python environment. The `register_hooks` method (which wires callbacks into a `HookRegistry`) requires `strands-agents` and raises `ImportError` if the SDK is absent.
|
|
43
|
+
|
|
44
|
+
## Wiring into an Agent
|
|
45
|
+
|
|
46
|
+
```python
from strands import Agent
from strands.models import BedrockModel
from flow_agents_strands import FlowAgentsHooks

hooks = FlowAgentsHooks(workspace=".")

# Load steering context BEFORE constructing the agent.
# Strands' BeforeInvocationEvent does not expose a mutable system prompt,
# so steering must be injected at construction time.
system_prompt = (
    "You are a helpful assistant.\n"
    + hooks.steering_context()
)

model = BedrockModel(model_id="anthropic.claude-3-5-sonnet-20241022-v2:0")
agent = Agent(model=model, system_prompt=system_prompt, hooks=[hooks])

result = agent("List the files in this directory.")
```
|
|
66
|
+
|
|
67
|
+
`register_hooks` is called by the Strands runtime when `hooks=[hooks]` is passed to `Agent`. It registers five callbacks:
|
|
68
|
+
|
|
69
|
+
| Strands event | Canonical event | What fires |
| --- | --- | --- |
| `AgentInitializedEvent` | `agentSpawn` | `emit_session_start()` (records `session.start`) |
| `BeforeInvocationEvent` | `userPromptSubmit` | `emit("userPromptSubmit")` (records `turn.user`) |
| `AfterInvocationEvent` | `stop` | `emit_session_end(duration_s=…)` (records `session.end`) |
| `BeforeToolCallEvent` | `preToolUse` | Telemetry + policy gate (config-protection) |
| `AfterToolCallEvent` | `postToolUse` | `emit_tool_result(…)` (records `tool.result`) |
|
|
76
|
+
|
|
77
|
+
This mapping is the `STRANDS_TO_CANONICAL` dict exposed at module level by `integrations/strands/flow_agents_strands/telemetry.py`:
|
|
78
|
+
|
|
79
|
+
```python
STRANDS_TO_CANONICAL = {
    "AgentInitializedEvent": "agentSpawn",
    "BeforeInvocationEvent": "userPromptSubmit",
    "AfterInvocationEvent": "stop",
    "BeforeToolCallEvent": "preToolUse",
    "AfterToolCallEvent": "postToolUse",
    "AfterModelCallEvent": "postToolUse",  # closest analogue; no tool name
    "MessageAddedEvent": "userPromptSubmit",
}
```
|
|
90
|
+
|
|
91
|
+
## Telemetry emitted
|
|
92
|
+
|
|
93
|
+
Events are written to `.flow-agents/.telemetry/full.jsonl` by default. The record shape matches `build_base_event()` in `scripts/telemetry/telemetry.sh`:
|
|
94
|
+
|
|
95
|
+
```json
{
  "schema_version": "0.3.0",
  "timestamp": "1718000000000",
  "session_id": "<uuid>",
  "event_id": "<uuid>",
  "event_type": "tool.invoke",
  "agent": { "name": "my-agent", "runtime": "strands", "version": "unknown" },
  "hook": {
    "event_name": "preToolUse",
    "source": "strands",
    "stop_hook_active": null,
    "raw_input": null
  },
  "tool": { "name": "edit", "normalized_name": "fs_write", "input": { ... } }
}
```
|
|
112
|
+
|
|
113
|
+
Canonical names map to schema `event_type` values via `_CANONICAL_TO_SCHEMA` in `telemetry.py`:
|
|
114
|
+
|
|
115
|
+
| Canonical name | Schema event_type |
| --- | --- |
| `agentSpawn` | `session.start` |
| `userPromptSubmit` | `turn.user` |
| `preToolUse` | `tool.invoke` |
| `permissionRequest` | `tool.permission_request` |
| `postToolUse` | `tool.result` |
| `stop` | `session.end` |
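Chaining the two mappings gives the schema `event_type` for any Strands event name. The dictionaries below copy the values listed on this page; `CANONICAL_TO_SCHEMA` stands in for the private `_CANONICAL_TO_SCHEMA`, and `schema_event_type` is a hypothetical helper rather than a function exported by the package.

```python
STRANDS_TO_CANONICAL = {
    "AgentInitializedEvent": "agentSpawn",
    "BeforeInvocationEvent": "userPromptSubmit",
    "AfterInvocationEvent": "stop",
    "BeforeToolCallEvent": "preToolUse",
    "AfterToolCallEvent": "postToolUse",
    "AfterModelCallEvent": "postToolUse",
    "MessageAddedEvent": "userPromptSubmit",
}

CANONICAL_TO_SCHEMA = {
    "agentSpawn": "session.start",
    "userPromptSubmit": "turn.user",
    "preToolUse": "tool.invoke",
    "permissionRequest": "tool.permission_request",
    "postToolUse": "tool.result",
    "stop": "session.end",
}


def schema_event_type(strands_event_name):
    """Resolve a Strands event class name to a schema event_type, or None if unmapped."""
    canonical = STRANDS_TO_CANONICAL.get(strands_event_name)
    return CANONICAL_TO_SCHEMA.get(canonical)


print(schema_event_type("BeforeToolCallEvent"))  # tool.invoke
```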
|
|
123
|
+
|
|
124
|
+
Telemetry is always fail-open: if the JSONL file cannot be written (`OSError`), the exception is swallowed silently. Telemetry must never block agent work.
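The fail-open write can be sketched as an append that swallows `OSError`. This is a simplified illustration, not the code in `telemetry.py`: `append_event` is a hypothetical name, the record carries only a subset of the fields shown above, and the timestamp format (epoch milliseconds as a string) is inferred from the example record.

```python
import json
import os
import tempfile
import time
import uuid


def append_event(sink_path, event_type, session_id, agent_name="my-agent"):
    """Append one record to the JSONL sink; return False instead of raising on I/O failure."""
    record = {
        "schema_version": "0.3.0",
        "timestamp": str(int(time.time() * 1000)),  # assumed: epoch milliseconds
        "session_id": session_id,
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        "agent": {"name": agent_name, "runtime": "strands", "version": "unknown"},
    }
    try:
        os.makedirs(os.path.dirname(sink_path) or ".", exist_ok=True)
        with open(sink_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    except OSError:
        return False  # fail open: telemetry must never block agent work
    return True


with tempfile.TemporaryDirectory() as tmp:
    sink = os.path.join(tmp, ".flow-agents", ".telemetry", "full.jsonl")
    ok = append_event(sink, "tool.invoke", session_id=str(uuid.uuid4()))
    with open(sink, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
print(ok, len(lines))  # True 1
```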
|
|
125
|
+
|
|
126
|
+
## Policy gate: config-protection
|
|
127
|
+
|
|
128
|
+
The config-protection policy binds to the canonical Node.js engine via subprocess. The binding is in `integrations/strands/flow_agents_strands/policy.py` in the `PolicyGate` class.
|
|
129
|
+
|
|
130
|
+
**Primary mode — engine subprocess:**
|
|
131
|
+
|
|
132
|
+
On `BeforeToolCallEvent`, if the tool name is a write-like tool (one of `edit`, `write`, `fs_write`, `apply_patch`, `create_file`, `str_replace_editor`), the gate serializes the event to a canonical JSON payload and spawns:
|
|
133
|
+
|
|
134
|
+
```bash
echo '{"hook_event_name":"PreToolUse","tool_name":"edit","tool_input":{"path":"biome.json"}}' \
  | node scripts/hooks/run-hook.js config-protection config-protection.js
```
|
|
138
|
+
|
|
139
|
+
The engine exits 2 (block) or 0 (allow). Exit code 2 causes `event.cancel_tool` to be set to the block reason from stderr. Strands cancels the call and surfaces the message as the tool result. All other exit codes fail open.
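How the engine result becomes a cancellation can be sketched with a stand-in event object. The write-like tool names and the `cancel_tool` attribute come from the text above; `apply_engine_result`, the fallback reason string, and the use of `SimpleNamespace` in place of a real `BeforeToolCallEvent` are assumptions made for illustration.

```python
from types import SimpleNamespace

WRITE_LIKE_TOOLS = frozenset(
    {"edit", "write", "fs_write", "apply_patch", "create_file", "str_replace_editor"}
)


def apply_engine_result(event, tool_name, returncode, stderr):
    """Set event.cancel_tool only when the engine blocked a write-like tool call."""
    if tool_name not in WRITE_LIKE_TOOLS:
        return False  # the gate is not consulted for other tools
    if returncode != 2:
        return False  # 0 is allow; every other code fails open
    event.cancel_tool = stderr.strip() or "Blocked by config-protection"
    return True


event = SimpleNamespace(cancel_tool=None)  # stand-in for a Strands BeforeToolCallEvent
apply_engine_result(event, "edit", 2, "biome.json is protected\n")
print(event.cancel_tool)  # biome.json is protected
```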
|
|
140
|
+
|
|
141
|
+
The engine is located by `_find_engine_paths()` in this priority order:
|
|
142
|
+
|
|
143
|
+
1. `FLOW_AGENTS_ENGINE_PATH` environment variable (explicit override).
|
|
144
|
+
2. Relative to the package source file: `../../../../scripts/hooks/run-hook.js` (works from a repo checkout).
|
|
145
|
+
3. Walked up from `os.getcwd()` looking for `node_modules/@kontourai/flow-agents/scripts/hooks/run-hook.js` (npm-installed package).
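The lookup order can be sketched as follows. This is not the body of `_find_engine_paths()`: `find_engine` is a hypothetical name, the override is only accepted here if the file exists (the real behavior for a missing override is not stated), and the four `..` segments are read as starting from the source file itself, which lands on the repository root for `integrations/strands/flow_agents_strands/policy.py`.

```python
import os
from pathlib import Path

ENGINE_REL = Path("scripts") / "hooks" / "run-hook.js"
NPM_REL = Path("node_modules") / "@kontourai" / "flow-agents" / ENGINE_REL


def find_engine(package_file, cwd=None, env=None):
    """Return the first run-hook.js found in the documented priority order, else None."""
    env = os.environ if env is None else env

    # 1. Explicit override via environment variable.
    override = env.get("FLOW_AGENTS_ENGINE_PATH")
    if override and Path(override).is_file():
        return Path(override)

    # 2. Repo checkout: <file>/../../../.. resolves to the fourth ancestor directory.
    ancestors = Path(package_file).resolve().parents
    if len(ancestors) > 3:
        candidate = ancestors[3] / ENGINE_REL
        if candidate.is_file():
            return candidate

    # 3. npm install: walk up from the working directory.
    start = Path(cwd or os.getcwd()).resolve()
    for directory in (start, *start.parents):
        candidate = directory / NPM_REL
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` when nothing is found leaves the caller free to switch to a fallback, such as the Python evaluation described next.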
|
|
146
|
+
|
|
147
|
+
**Fallback mode — Python evaluation:**
|
|
148
|
+
|
|
149
|
+
If `node` is not on PATH or `run-hook.js` cannot be located, the gate degrades to a built-in Python implementation of the same logic and emits a one-time `RuntimeWarning`. The Python fallback uses the same `PROTECTED_FILES` frozenset as `config-protection.js` and is auditable. This is not silent: the warning is printed once to stderr.
|
|
150
|
+
|
|
151
|
+
**Custom protected set:**
|
|
152
|
+
|
|
153
|
+
If a `PolicyGate` is constructed with a custom `protected_files` frozenset, Python evaluation is used directly (the engine subprocess cannot receive a runtime-custom set). This path is intended for tests and local override only.
|
|
154
|
+
|
|
155
|
+
## Workflow steering
|
|
156
|
+
|
|
157
|
+
Strands' `BeforeInvocationEvent` does not expose a mutable system prompt at callback time.
|
|
158
|
+
|
|
159
|
+
The spike approach: call `hooks.steering_context()` at `Agent` construction time and append the result to the system prompt. `steering_context()` reads the current workflow state from `.flow-agents/` and returns a text block. It also emits a `turn.user` telemetry event so the injection is recorded in the JSONL log.
|
|
160
|
+
|
|
161
|
+
This is a one-shot snapshot. It does not re-evaluate on every turn the way `workflow-steering.js` does at `UserPromptSubmit`. See the Limitations section for the productization path.
|
|
162
|
+
|
|
163
|
+
## Documented limitations
|
|
164
|
+
|
|
165
|
+
The following limitations are from `integrations/strands/README.md` and reflect the current spike state. They are not defects to be worked around silently — they are honest gaps.
|
|
166
|
+
|
|
167
|
+
1. **Node.js subprocess dependency**: The primary policy binding spawns a Node.js subprocess for each `BeforeToolCallEvent` involving a write-like tool. If `node` is not on PATH or the package is not installed, the gate degrades to the Python fallback with a one-time `RuntimeWarning`. To force the subprocess path, set `FLOW_AGENTS_ENGINE_PATH` to the absolute path of `run-hook.js`.
|
|
168
|
+
|
|
169
|
+
2. **Steering seam**: Strands does not allow mutating the system prompt from `BeforeInvocationEvent`. The workaround (`steering_context()` at Agent construction) is a one-shot snapshot; it does not re-evaluate on every turn. Productization would require either a custom Strands model wrapper that injects context per-turn, or upstream SDK support for mutable system-prompt context in the invocation event.
|
|
170
|
+
|
|
171
|
+
3. **session.usage event omitted**: The JS harness emits a `session.usage` event on stop with token counts. The Strands `AfterInvocationEvent` does not expose token-usage data in the hook payload, so this event is not emitted.
|
|
172
|
+
|
|
173
|
+
4. **No analytics channel**: The harness adapters write to two channels (full + analytics) with different redaction profiles. This spike writes only to the `full` channel.
|
|
174
|
+
|
|
175
|
+
5. **No Console/HTTP sink**: The bash transport supports POSTing events to a Console endpoint. This adapter writes JSONL only.
|
|
176
|
+
|
|
177
|
+
6. **Runtime version is "unknown"**: Strands does not expose its version through the hook event; `agent.version` is hardcoded to `"unknown"`.
|
|
178
|
+
|
|
179
|
+
7. **No subagent/delegation event**: The Strands SDK does not have a built-in delegation tool; the `subagentStart`/`subagentStop` telemetry path is not wired.
|
|
180
|
+
|
|
181
|
+
8. **Quality-gate policy omitted**: `quality-gate.js` invokes ruff/biome after edits. There is no clear Strands analogue yet.
|
|
182
|
+
|
|
183
|
+
## Conformance declaration
|
|
184
|
+
|
|
185
|
+
The Strands adapter is L0 plus config protection via `BeforeToolCallEvent` cancellation. A full conformance declaration would read:
|
|
186
|
+
|
|
187
|
+
```yaml
conformance_level: L0 (+ config-protection via BeforeToolCallEvent)
host: AWS Strands Agents
event_coverage:
  agentSpawn: AgentInitializedEvent (full fidelity)
  userPromptSubmit: BeforeInvocationEvent (no per-turn injection; spike limitation)
  preToolUse: BeforeToolCallEvent (full fidelity, cancellable)
  postToolUse: AfterToolCallEvent (full fidelity)
  stop: AfterInvocationEvent (full fidelity)
  permissionRequest: no native equivalent
  subagentStart: no native equivalent
  subagentStop: no native equivalent
policy_coverage:
  workflow_steering: partial; injected once at Agent construction, not per-turn
  quality_gate: omitted; no current Strands analogue
  stop_goal_fit: omitted; AfterInvocationEvent used for telemetry only
  config_protection: wired at BeforeToolCallEvent (blocking via event.cancel_tool)
```
|
|
205
|
+
|
|
206
|
+
## Running tests
|
|
207
|
+
|
|
208
|
+
The spike ships 50 unit tests, none of which requires the Strands SDK:
|
|
209
|
+
|
|
210
|
+
```bash
cd integrations/strands
python3 -m unittest discover
```
|
|
214
|
+
|
|
215
|
+
## Related references
|
|
216
|
+
|
|
217
|
+
- `integrations/strands/flow_agents_strands/hooks.py` — `FlowAgentsHooks` and `register_hooks`
|
|
218
|
+
- `integrations/strands/flow_agents_strands/telemetry.py` — `TelemetrySink`, `STRANDS_TO_CANONICAL`
|
|
219
|
+
- `integrations/strands/flow_agents_strands/policy.py` — `PolicyGate`, engine subprocess binding
|
|
220
|
+
- `integrations/strands/flow_agents_strands/steering.py` — `SteeringContext`
|
|
221
|
+
- `integrations/strands/README.md` — spike README with quickstart and full limitations list
|
|
222
|
+
- <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec §6.2</a> — framework adapter contract and minimum viable adapter pseudocode
|
|
223
|
+
- <a href="conformance.html">Conformance</a> — how to self-certify using the conformance kit
|
|
224
|
+
|
|
225
|
+
---
|
|
226
|
+
|
|
227
|
+
## TypeScript native-import adapter (`integrations/strands-ts/`)
|
|
228
|
+
|
|
229
|
+
`@kontourai/flow-agents-strands` is the first **native-import** consumer of the policy engine contract. Where the Python adapter spawns a subprocess for each `BeforeToolCallEvent` policy check, the TS adapter calls `config-protection.js`'s exported `run()` function directly — zero subprocess overhead on the hot path.
|
|
230
|
+
|
|
231
|
+
### Key differences from the Python adapter
|
|
232
|
+
|
|
233
|
+
| | Python adapter | TypeScript adapter |
|--|----------------|-------------------|
| Engine binding | subprocess (`node run-hook.js …`) | `require("config-protection.js").run()` (in-process) |
| Strands SDK | `register_hooks(registry)` → `registry.add_callback` | `registerHooks(registry)` → `registry.addCallback` |
| Cancel signal | `event.cancel_tool = reason` | `event.cancel = reason` (TS variant) |
| Conformance | L0 + config-protection | L2 (all four policy classes via shim) |
| Test framework | stdlib unittest (Python) | node:test (no extra deps) |
|
|
240
|
+
|
|
241
|
+
### Constructing FlowAgentsHooks (TypeScript)
|
|
242
|
+
|
|
243
|
+
```typescript
import { FlowAgentsHooks } from "@kontourai/flow-agents-strands";

const hooks = new FlowAgentsHooks({
  workspace: ".",        // root of your project
  agentName: "my-agent",
  // engineRoot: "/path/to/flow-agents"  // optional: explicit engine path
});
```
|
|
252
|
+
|
|
253
|
+
### Event mapping
|
|
254
|
+
|
|
255
|
+
The TS adapter exports `STRANDS_TO_CANONICAL` matching the Python adapter's dict:
|
|
256
|
+
|
|
257
|
+
| Strands TS Event | Canonical event |
|------------------|-----------------|
| `BeforeInvocationEvent` | `userPromptSubmit` |
| `AfterInvocationEvent` | `stop` |
| `BeforeToolCallEvent` | `preToolUse` |
| `AfterToolCallEvent` | `postToolUse` |
| `AgentInitializedEvent` | `agentSpawn` |
|
|
264
|
+
|
|
265
|
+
### Conformance
|
|
266
|
+
|
|
267
|
+
The TS adapter achieves **L2** via `bin/conformance-shim.mjs`:
|
|
268
|
+
|
|
269
|
+
```bash
node packaging/conformance/run-conformance.js \
  --adapter-cmd "node integrations/strands-ts/bin/conformance-shim.mjs" \
  --level L2
```
|
|
274
|
+
|
|
275
|
+
12/12 fixtures pass. See `integrations/strands-ts/README.md` for the full conformance declaration and limitations.
|