npm - @kontourai/flow-agents - Versions diffs - 0.1.2 → 0.3.0 - Mend

@kontourai/flow-agents 0.1.2 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (117)

package/.github/dependabot.yml +23 -0
package/.github/workflows/release-please.yml +31 -0
package/.github/workflows/runtime-compat.yml +118 -0
package/CHANGELOG.md +46 -0
package/CONTRIBUTING.md +4 -0
package/README.md +80 -18
package/build/src/cli/flow-kit.js +9 -4
package/build/src/cli/init.js +215 -5
package/build/src/cli/runtime-adapter.js +9 -5
package/build/src/cli/telemetry-doctor.js +4 -1
package/build/src/cli/utterance-check.js +65 -1
package/build/src/runtime-adapters.js +34 -0
package/build/src/tools/build-universal-bundles.js +285 -0
package/build/src/tools/filter-installed-packs.js +3 -0
package/build/src/tools/validate-source-tree.js +5 -1
package/console.telemetry.json +115 -20
package/context/scripts/telemetry/lib/config.sh +5 -1
package/context/settings/flow-agents-settings.json +7 -0
package/docs/_layouts/default.html +2 -0
package/docs/context-map.md +1 -0
package/docs/index.md +53 -4
package/docs/integrations/conformance.md +246 -0
package/docs/integrations/framework-adapter.md +275 -0
package/docs/integrations/harness-install.md +213 -0
package/docs/integrations/index.md +58 -0
package/docs/integrations/knowledge-kit-live.md +211 -0
package/docs/kit-authoring-guide.md +169 -0
package/docs/north-star.md +2 -2
package/docs/spec/runtime-hook-surface.md +525 -0
package/docs/survey-utterance-check.md +211 -94
package/docs/vision.md +45 -0
package/evals/acceptance/run.sh +13 -2
package/evals/acceptance/test_knowledge_kit_live.sh +221 -0
package/evals/acceptance/test_opencode_harness.sh +121 -0
package/evals/acceptance/test_pi_harness.sh +113 -0
package/evals/integration/test_bundle_install.sh +226 -1
package/evals/integration/test_bundle_lifecycle.sh +641 -0
package/evals/integration/test_runtime_adapter_activation.sh +113 -1
package/evals/integration/test_utterance_check.sh +291 -44
package/evals/run.sh +2 -0
package/evals/static/test_universal_bundles.sh +137 -2
package/integrations/strands/README.md +256 -0
package/integrations/strands/example.py +74 -0
package/integrations/strands/examples/knowledge_kit_live.py +461 -0
package/integrations/strands/flow_agents_strands/__init__.py +27 -0
package/integrations/strands/flow_agents_strands/hooks.py +194 -0
package/integrations/strands/flow_agents_strands/policy.py +348 -0
package/integrations/strands/flow_agents_strands/steering.py +225 -0
package/integrations/strands/flow_agents_strands/telemetry.py +238 -0
package/integrations/strands/pyproject.toml +38 -0
package/integrations/strands/tests/__init__.py +0 -0
package/integrations/strands/tests/test_hooks.py +392 -0
package/integrations/strands/tests/test_policy.py +315 -0
package/integrations/strands/tests/test_telemetry.py +184 -0
package/integrations/strands-ts/README.md +224 -0
package/integrations/strands-ts/bin/conformance-shim.mjs +257 -0
package/integrations/strands-ts/package.json +53 -0
package/integrations/strands-ts/src/hooks.ts +312 -0
package/integrations/strands-ts/src/index.ts +22 -0
package/integrations/strands-ts/src/policy.ts +345 -0
package/integrations/strands-ts/src/telemetry.ts +251 -0
package/integrations/strands-ts/test/test-policy.ts +322 -0
package/integrations/strands-ts/test/test-steering.ts +159 -0
package/integrations/strands-ts/test/test-telemetry.ts +226 -0
package/integrations/strands-ts/tsconfig.json +20 -0
package/kits/catalog.json +6 -0
package/kits/knowledge/adapters/default-store/index.js +821 -0
package/kits/knowledge/adapters/flow-runner/index.js +1179 -0
package/kits/knowledge/adapters/flow-runner/telemetry.js +174 -0
package/kits/knowledge/docs/README.md +135 -0
package/kits/knowledge/docs/store-contract.md +526 -0
package/kits/knowledge/evals/consolidation/suite.test.js +1234 -0
package/kits/knowledge/evals/contract-suite/suite.test.js +670 -0
package/kits/knowledge/evals/ingest-compile/suite.test.js +574 -0
package/kits/knowledge/evals/synthesis/suite.test.js +909 -0
package/kits/knowledge/flows/compile.flow.json +60 -0
package/kits/knowledge/flows/consolidate.flow.json +77 -0
package/kits/knowledge/flows/ingest.flow.json +60 -0
package/kits/knowledge/flows/store-contract.flow.json +48 -0
package/kits/knowledge/flows/synthesize.flow.json +77 -0
package/kits/knowledge/kit.json +78 -0
package/package.json +7 -2
package/packaging/conformance/README.md +142 -0
package/packaging/conformance/fixtures/config-protection--allow-no-path.json +18 -0
package/packaging/conformance/fixtures/config-protection--allow-safe-file.json +20 -0
package/packaging/conformance/fixtures/config-protection--block-biome.json +20 -0
package/packaging/conformance/fixtures/config-protection--block-eslintrc.json +20 -0
package/packaging/conformance/fixtures/quality-gate--allow-no-path.json +17 -0
package/packaging/conformance/fixtures/quality-gate--allow-nonexistent-file.json +19 -0
package/packaging/conformance/fixtures/stop-goal-fit--allow-clean-cwd.json +17 -0
package/packaging/conformance/fixtures/stop-goal-fit--block-strict-mode.json +23 -0
package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +21 -0
package/packaging/conformance/fixtures/workflow-steering--allow-no-state.json +16 -0
package/packaging/conformance/fixtures/workflow-steering--inject-active-state.json +29 -0
package/packaging/conformance/fixtures/workflow-steering--inject-subagent-steering.json +25 -0
package/packaging/conformance/package.json +4 -0
package/packaging/conformance/run-conformance.js +322 -0
package/packaging/manifest.json +59 -0
package/schemas/flow-agents-settings.schema.json +48 -0
package/scripts/README.md +4 -0
package/scripts/dogfood.js +16 -0
package/scripts/hooks/opencode-hook-adapter.js +123 -0
package/scripts/hooks/opencode-telemetry-hook.js +101 -0
package/scripts/hooks/pi-hook-adapter.js +123 -0
package/scripts/hooks/pi-telemetry-hook.js +105 -0
package/scripts/hooks/run-hook.js +8 -0
package/scripts/hooks/utterance-check.js +124 -22
package/scripts/telemetry/lib/config.sh +5 -1
package/src/cli/flow-kit.ts +10 -4
package/src/cli/init.ts +219 -6
package/src/cli/runtime-adapter.ts +10 -5
package/src/cli/telemetry-doctor.ts +4 -1
package/src/cli/utterance-check.ts +71 -1
package/src/runtime-adapters.ts +35 -0
package/src/tools/build-universal-bundles.ts +283 -0
package/src/tools/filter-installed-packs.ts +3 -0
package/src/tools/validate-source-tree.ts +5 -1

package/docs/integrations/harness-install.md ADDED

@@ -0,0 +1,213 @@
+---
+title: Harness Install
+---
+# Harness Install
+This page walks through three harness installs: Claude Code (the L2 reference runtime), opencode, and pi. All three follow the same model — `npm run build:bundles` generates the bundle, `flow-agents init` places it — but each runtime expects different files at different paths.
+## How harness bundles work
+`npm run build:bundles` generates one bundle per runtime under `dist/<runtime>/`. Each bundle contains:
+- A host-specific configuration file that maps lifecycle events to shell commands invoking the canonical hook adapter wrapper.
+- A host-specific adapter wrapper (`<runtime>-hook-adapter.js`) that reads stdin JSON from the host, invokes `run-hook.js` with the canonical script path and profile, translates the exit code to the host-native response format, and fails open on errors.
+- A host-specific telemetry wrapper (`<runtime>-telemetry-hook.js`) that maps host event names to canonical telemetry event names and invokes `scripts/telemetry/telemetry.sh`.
+- An `install.sh` that places the generated files at the host-expected paths.
+`flow-agents init` (from `npx @kontourai/flow-agents`) calls `install.sh` for the selected runtime.
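The fail-open translation step described in the adapter-wrapper bullet above can be sketched as follows. This is a minimal illustration, not the contents of `<runtime>-hook-adapter.js`: the function name `translateHookResult`, the exit-code convention (0 allows, 2 blocks), and the returned object shape are all assumptions.

```javascript
// Hypothetical sketch: the name, exit-code convention, and response shape
// are assumptions for illustration, not the shipped adapter.
function translateHookResult(result) {
  // A failure of the hook runtime itself (spawn error, timeout, crash)
  // must never block the host, so it is treated as "allow" (fail open).
  if (result.error) {
    return { decision: "allow", reason: `hook error ignored: ${result.error}` };
  }
  // Assumed convention: exit code 2 means the policy blocks the action.
  if (result.exitCode === 2) {
    const message = (result.stderr || "").trim();
    return { decision: "block", reason: message || "blocked by policy" };
  }
  // Exit code 0 allows; any other unexpected code also fails open.
  const note = result.exitCode === 0 ? "" : `unexpected exit code ${result.exitCode}`;
  return { decision: "allow", reason: note };
}

console.log(translateHookResult({ exitCode: 2, stderr: "protected config\n" }));
console.log(translateHookResult({ error: "ETIMEDOUT" }));
```

Keeping the translation in a pure function like this makes the fail-open rule easy to test separately from the stdin handling and the `run-hook.js` invocation.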
+## Claude Code
+Claude Code is the L2 reference implementation. All four policy classes are wired: workflow steering, quality gate, stop-goal-fit, and config protection.
+### Install
+```bash
+npx @kontourai/flow-agents init --runtime claude-code --dest /path/to/workspace --yes
+```
+The install script writes hook wiring into `.claude/settings.json` inside the destination workspace. The hooks object in `settings.json` maps Claude Code lifecycle events (`UserPromptSubmit`, `PreToolUse`, `PostToolUse`, `Stop`) to shell commands invoking the adapter:
+```bash
+bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
+  node "$root/scripts/hooks/claude-telemetry-hook.js" UserPromptSubmit dev'
+bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
+  node "$root/scripts/hooks/claude-hook-adapter.js" UserPromptSubmit \
+    workflow-steering workflow-steering.js default'
+```
+Telemetry always fires first and is always non-blocking (timeout: 10 s). Policy hooks fire second and may block on `PreToolUse` (timeout: 30 s). Both fail open on hook runtime errors.
+### Dogfood variant (repo-local)
+Inside the `flow-agents` source repo itself, the dogfood script writes hook wiring that points at the local `scripts/hooks/` directory rather than a published package:
+```bash
+npm run dogfood -- --runtime claude-code
+```
+The destination defaults to the repo root. Pass `--dest` to override.
+### Scope-collision warning
+When `init` detects that an existing `.claude/settings.json` already has hooks entries for the same lifecycle events, it emits a scope-collision warning to stderr:
+```
+[flow-agents] WARNING: .claude/settings.json already has hooks for UserPromptSubmit.
+Existing entries will be preserved; Flow Agents hooks will be appended.
+Review .claude/settings.json to confirm hook ordering is correct.
+```
+The install appends rather than replaces, so existing hooks are not removed. Review the settings file after install to confirm the ordering is what you want.
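The append-and-warn behavior above can be sketched as a small merge function. This is an illustration only: `appendHooks` is a hypothetical name, and the sketch treats the hooks object as a plain map from lifecycle event to an array of entries, which simplifies whatever shape `init` actually reads and writes.

```javascript
// Hypothetical sketch: `appendHooks` and the simplified hooks shape
// (event name -> array of entries) are assumptions for illustration.
function appendHooks(existingHooks, newHooks) {
  const merged = {};
  const collisions = [];
  // Copy existing entries so the caller's object is not mutated.
  for (const [event, entries] of Object.entries(existingHooks)) {
    merged[event] = [...entries];
  }
  for (const [event, entries] of Object.entries(newHooks)) {
    // An event that already has entries is a scope collision: warn, then append.
    if (merged[event] && merged[event].length > 0) {
      collisions.push(event);
    }
    merged[event] = [...(merged[event] || []), ...entries];
  }
  return { merged, collisions };
}

const result = appendHooks(
  { UserPromptSubmit: ["existing-command"] },
  { UserPromptSubmit: ["flow-agents-command"], Stop: ["flow-agents-stop"] }
);
for (const event of result.collisions) {
  console.error(`WARNING: settings already has hooks for ${event}; appending.`);
}
console.log(JSON.stringify(result.merged));
```

Because existing entries stay first, any pre-existing hook keeps its position and the new entries run after it, which is why the ordering is worth reviewing after install.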
+### Resulting file layout
+```
+<workspace>/
+  .claude/
+    settings.json          ← hook wiring (appended by install)
+  scripts/
+    hooks/
+      claude-hook-adapter.js
+      claude-telemetry-hook.js
+      run-hook.js
+      config-protection.js
+      quality-gate.js
+      stop-goal-fit.js
+      workflow-steering.js
+      …
+  skills/
+    …
+  .flow-agents/            ← runtime workflow artifacts (not committed)
+```
+## opencode
+opencode is an L1 adapter. It has no native `prompt.submit`-equivalent event, so workflow steering is approximated at `session.created` rather than at each user turn. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.1</a>.
+### Install
+```bash
+npx @kontourai/flow-agents init --runtime opencode --dest /path/to/workspace --yes
+```
+### Dogfood variant
+```bash
+npm run dogfood -- --runtime opencode
+```
+### Resulting file layout
+```
+<workspace>/
+  .opencode/
+    plugins/
+      flow-agents.js       ← auto-loaded at opencode startup
+    agents/
+      dev.md               ← agent prompts (opencode markdown format)
+      tool-planner.md
+      tool-worker.md
+      …
+    skills/
+      deliver.md
+      fix-bug.md
+      …
+  opencode.json            ← workspace instructions pointer
+  scripts/
+    hooks/
+      opencode-hook-adapter.js
+      opencode-telemetry-hook.js
+      run-hook.js
+      …
+  skills/
+    …
+```
+`opencode.json` at the workspace root is a minimal config file:
+```json
+{
+  "instructions": "This workspace uses Flow Agents. See AGENTS.md for conventions, skills, and workflow guidance."
+}
+```
+The plugin at `.opencode/plugins/flow-agents.js` is auto-loaded at opencode startup. It exports `FlowAgentsPlugin` and registers handlers for:
+| opencode event | What fires |
+| --- | --- |
+| `session.created` | Telemetry + workflow steering (session-start context injection) |
+| `tool.execute.before` | Telemetry + config-protection (blocking via thrown Error) |
+| `tool.execute.after` | Telemetry + quality gate |
+| `session.idle` | Telemetry + stop-goal-fit (warning only — not a true stop event) |
+| `session.error`, `session.compacted`, `permission.asked`, `file.edited` | Telemetry only |
+**Accepted gaps**: opencode has no `prompt.submit` hook, so workflow steering fires only on `session.created` — not at each user turn. `session.idle` is the closest event to a stop hook but does not reliably fire on session completion. These gaps are declared in the conformance level (L1) and in the plugin source comments.
+**Agents**: opencode receives agent prompts as markdown files in `.opencode/agents/`. The main orchestrator is `dev.md`; specialist tools (planner, worker, reviewer, etc.) are additional markdown files in the same directory.
+## pi
+pi is an L1 adapter. It has no stop hook, so stop-goal-fit cannot fire at session end. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.3</a>.
+### Install
+```bash
+npx @kontourai/flow-agents init --runtime pi --dest /path/to/workspace --yes
+```
+### Dogfood variant
+```bash
+npm run dogfood -- --runtime pi
+```
+### Resulting file layout
+```
+<workspace>/
+  .pi/
+    extensions/
+      flow-agents.ts       ← auto-discovered at startup (needs project trust)
+    skills/
+      deliver.md
+      fix-bug.md
+      …
+  AGENTS.md                ← agent instructions (pi uses AGENTS.md, not a registry)
+  scripts/
+    hooks/
+      pi-hook-adapter.js
+      pi-telemetry-hook.js
+      run-hook.js
+      …
+  skills/
+    …
+```
+The extension at `.pi/extensions/flow-agents.ts` is auto-discovered at startup. It registers handlers for:
+| pi event | What fires |
+| --- | --- |
+| `session_start` | Telemetry |
+| `before_agent_start` | Telemetry + workflow steering (injects context into system prompt) |
+| `tool_call` | Telemetry + config-protection (blocking via `{ block: true }` return) |
+| `tool_result` | Telemetry + quality gate |
+| `session_shutdown` | Telemetry + stop-goal-fit (warning only — not a true stop event) |
+**Accepted gaps**: pi has no stop hook. `session_shutdown` is used as the closest equivalent but does not carry the same semantics as a stop event. This gap is declared in the conformance level (L1) and in the extension source comments.
+**Agents**: pi has no named-subagent registry. Agent guidance is delivered through `AGENTS.md` at the workspace root, plus the skills in `.pi/skills/` and the extension. The `flow-agents.ts` extension comment says explicitly: "pi has no named-subagent registry. Agents are not exported for pi."
+### Scope-collision warning
+Same behavior as Claude Code: if an existing `.pi/extensions/` directory contains a file with conflicting event registrations, `init` warns and appends. Review the extension file after install.
+## Related references
+- `dist/opencode/` — generated opencode bundle (do not edit by hand)
+- `dist/pi/` — generated pi bundle (do not edit by hand)
+- `dist/claude-code/` — generated Claude Code bundle
+- `scripts/hooks/run-hook.js` — canonical hook runner
+- <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> — event taxonomy, policy classes, conformance levels
+- <a href="conformance.html">Conformance</a> — how to self-certify a new adapter

package/docs/integrations/index.md ADDED

@@ -0,0 +1,58 @@
+---
+title: Integration Examples
+---
+# Integration Examples
+Flow Agents reaches host runtimes and agent frameworks through two distinct distribution models. This section provides worked examples for each model and a guide to the conformance kit for third-party adapter authors.
+## Distribution models at a glance
+**Harness runtimes** ship as self-contained bundles under `dist/<runtime>/`. The `npm run build:bundles` command generates each bundle from the canonical manifest and policy scripts. `flow-agents init` (or the dogfood variant) places the generated files at the host-expected paths inside a target workspace. Claude Code, Codex, Kiro, opencode, and pi are harness adapters.
+**Framework adapters** live in `integrations/<name>/` as language-native packages. They register Flow Agents callbacks with the framework's lifecycle system using the framework's native registration API. `integrations/strands/` is the reference implementation: `flow-agents-strands` is a Python `HookProvider` that wires into AWS Strands Agents without requiring the Strands SDK at import time.
+**Third-party adapters** self-certify by running the conformance kit in `packaging/conformance/`. The kit provides golden fixtures and a runner that pipes each fixture through the adapter command and reports per-level verdict.
+## Conformance levels
+| Level | What is required |
+| --- | --- |
+| L0 | Telemetry only — at least `agentSpawn` fires on session start |
+| L1 | L0 plus workflow steering and stop-goal-fit in warning mode |
+| L2 | L1 plus config protection (blocking) and quality gate — the reference level |
+Claude Code and Codex are L2 reference implementations. opencode is L1 (no prompt-submit hook). pi is L1 (no stop hook). The Strands adapter is L0 plus config protection via `BeforeToolCallEvent` cancellation.
+The <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> defines the canonical event taxonomy, policy classes, conformance levels, and engine contract in full.
+## Pages in this section
+<div class="doc-grid">
+  <a class="doc-card" href="harness-install.html">
+    <strong>Harness Install</strong>
+    <span>Worked example of installing into a Claude Code project and into the two newest runtimes, opencode and pi. Includes the dogfood variant and scope-collision warning behavior.</span>
+  </a>
+  <a class="doc-card" href="framework-adapter.html">
+    <strong>Framework Adapter</strong>
+    <span>Worked example based on <code>integrations/strands/</code>: constructing FlowAgentsHooks, telemetry emitted, the engine-contract binding for policy, and documented limitations.</span>
+  </a>
+  <a class="doc-card" href="conformance.html">
+    <strong>Conformance</strong>
+    <span>How a third-party adapter self-certifies: the engine contract 1.0, running the conformance runner, what each level requires, and how to declare gaps.</span>
+  </a>
+  <a class="doc-card" href="../spec/runtime-hook-surface.html">
+    <strong>Runtime Hook Surface Spec</strong>
+    <span>Canonical event taxonomy, four policy classes, conformance levels L0/L1/L2, mapping tables, and the engine contract for adapter authors.</span>
+  </a>
+  <a class="doc-card" href="knowledge-kit-live.html">
+    <strong>Knowledge Kit Live Example</strong>
+    <span>End-to-end proof of the Knowledge Kit ingest + compile flows against a real Strands agent (OllamaModel / qwen3:1.7b). No API key required. Includes acceptance test with telemetry and provenance assertions.</span>
+  </a>
+</div>
+---
+## TypeScript native-import adapter
+`integrations/strands-ts/` (`@kontourai/flow-agents-strands`) is the first native-import consumer of the policy engine contract. It binds the `config-protection.js` `run()` function directly, with no subprocess on the hot path, and achieves **L2** conformance. See `integrations/strands-ts/README.md` and the [Framework Adapter](framework-adapter.html) page for the full comparison with the Python adapter.

package/docs/integrations/knowledge-kit-live.md ADDED

@@ -0,0 +1,211 @@
+---
+title: Knowledge Kit Live Example
+---
+# Knowledge Kit Live Example
+This page documents `integrations/strands/examples/knowledge_kit_live.py`: a keyless, ollama-backed end-to-end proof of the Knowledge Kit's ingest and compile flows running against a real Strands agent.
+Everything on this page is grounded in the source files and in the acceptance test that was run to validate the commands. Limitations are documented honestly.
+## What it proves
+The example exercises the full `knowledge.ingest` → `knowledge.compile` pipeline in a temporary workspace:
+- Two raw records are created programmatically via direct Node.js subprocess calls to the kit's flow-runner (`kits/knowledge/adapters/flow-runner/index.js`).
+- One raw record is created by the Strands agent calling the `capture_knowledge` tool.
+- The Strands agent calls `compile_knowledge` with all three raw record IDs, producing a compiled record with verified provenance links.
+Two telemetry streams are asserted:
+| Stream | Path | Contents |
+| --- | --- | --- |
+| Kit gate telemetry | `<workspace>/.telemetry/full.jsonl` | `tool.invoke` + `tool.result` per ingest/compile gate point |
+| Session telemetry | `<workspace>/.flow-agents/.telemetry/full.jsonl` | `session.start`, `turn.user`, `tool.invoke`, `tool.result`, `session.end` from FlowAgentsHooks |
+## Prerequisites
+- ollama installed and `qwen3:1.7b` pulled:
+  ```bash
+  ollama pull qwen3:1.7b
+  ```
+- Python venv with `strands-agents[ollama]` at `/tmp/strands-py-live/venv`:
+  ```bash
+  python3 -m venv /tmp/strands-py-live/venv
+  /tmp/strands-py-live/venv/bin/pip install 'strands-agents[ollama]'
+  ```
+- Node.js on PATH (for the kit's ESM flow-runner and bridge script).
+## Running the example
+```bash
+# From the repo root:
+ollama serve &
+FLOW_AGENTS_ROOT=$(pwd) \
+  /tmp/strands-py-live/venv/bin/python3 \
+  integrations/strands/examples/knowledge_kit_live.py
+```
+Expected output (session IDs and UUIDs vary):
+```
+=== Knowledge Kit S5: Keyless Live Example ===
+Repo root: /path/to/flow-agents
+Node.js: v24.16.0
+Workspace: /tmp/knowledge-kit-live-xxxxxxxx
+Corpus: 3 doc snippets
+  docs/integrations/framework-adapter.md (engineering.docs)
+  docs/integrations/index.md (engineering.docs)
+  kits/knowledge/docs/README.md (research.notes)
+--- Step 1: Programmatic captures (2 records) ---
+  docs/integrations/framework-adapter.md → <raw-id-1>
+  docs/integrations/index.md → <raw-id-2>
+--- Step 2: Agent-driven capture ---
+  Agent turn: 2.9s
+  Reply snippet: 'The captured knowledge record has been successfully stored with ID: ...'
+  Raw records in store: 3
+--- Step 3: Agent-driven compile ---
+  Agent turn: 4.3s
+  Reply snippet: 'The compiled knowledge record has been successfully created with ID: ...'
+  Compiled records in store: 1
+--- Provenance verification ---
+  Compiled record: <compiled-id>
+  Source IDs present in provenance: True
+  Source links in graph index: 3
+Kit gate telemetry (.telemetry/full.jsonl): 18 events
+  [tool.invoke] knowledge.ingest.classify-gate
+  [tool.result] knowledge.ingest.classify-gate
+  ...
+  [tool.invoke] knowledge.compile.link-gate
+  [tool.result] knowledge.compile.link-gate
+Session telemetry (.flow-agents/.telemetry/full.jsonl): 9 events
+  [session.start]
+  [turn.user]
+  [tool.invoke] (capture_knowledge)
+  [tool.result] (capture_knowledge)
+  [session.end]
+  [turn.user]
+  [tool.invoke] (compile_knowledge)
+  [tool.result] (compile_knowledge)
+  [session.end]
+--- Summary ---
+Kit event types:     ['tool.invoke', 'tool.result']
+Session event types: ['session.end', 'session.start', 'tool.invoke', 'tool.result', 'turn.user']
+Raw records:         3
+Compiled records:    1
+Provenance ok:       True
+Overall: PASS
+```
+## Running the acceptance test
+The acceptance harness gates on ollama binary, model presence, and venv presence. If any gate is absent it skips cleanly.
+```bash
+# Run the knowledge-kit-live acceptance test directly:
+bash evals/acceptance/test_knowledge_kit_live.sh
+# Or through the acceptance runner:
+bash evals/acceptance/run.sh knowledge-kit-live
+```
+The harness asserts:
+| Assertion | What is checked |
+| --- | --- |
+| A1 | Example script exits 0 |
+| A2 | `<workspace>/.telemetry/full.jsonl` contains `tool.invoke` + `tool.result` |
+| A3 | `<workspace>/.flow-agents/.telemetry/full.jsonl` contains `session.start`, `tool.invoke`, `tool.result` |
+| A4 | No `.telemetry` directory leaked to the workspace parent |
+| A5 | At least 1 compiled record in the knowledge store |
+| A6 | Compiled record has `source_ids` provenance referencing raw records |
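Assertions A2 and A3 amount to checking that a JSONL file contains certain event types. A minimal sketch of that check follows. It is not the code in `test_knowledge_kit_live.sh`: the helper names are hypothetical, and the sketch assumes each line is a JSON object whose event type lives in an `event` field, which the page does not confirm.

```javascript
// Hypothetical sketch: helper names and the `event` field name are assumptions.
// Collects the distinct event types found in JSONL content.
function eventTypes(jsonl) {
  const types = new Set();
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines and the trailing newline
    types.add(JSON.parse(line).event);
  }
  return types;
}

// Returns the required event types that never appear in the content.
function missingEvents(jsonl, required) {
  const seen = eventTypes(jsonl);
  return required.filter((name) => !seen.has(name));
}

const sample = [
  '{"event":"session.start"}',
  '{"event":"tool.invoke","tool":"capture_knowledge"}',
  '{"event":"tool.result","tool":"capture_knowledge"}',
].join("\n") + "\n";

console.log(missingEvents(sample, ["session.start", "tool.invoke", "tool.result"]));
console.log(missingEvents(sample, ["session.end"]));
```

An empty result means the assertion passes; a non-empty result names exactly which event types were absent.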
+## How the kit tools work
+The example defines two Strands `@tool` functions that call the kit's flow-runner via Node.js subprocess:
+```python
+@tool
+def capture_knowledge(text: str, category: str) -> str:
+    """Capture raw knowledge text. Returns JSON: {"id": "<uuid>"}."""
+    meta_json = json.dumps({"category": category})
+    data = _call_node_bridge(bridge, "capture", text, meta_json, workspace=workspace)
+    return json.dumps(data)
+@tool
+def compile_knowledge(id1: str, id2: str, id3: str) -> str:
+    """Compile three raw records into a compiled record. Returns JSON: {"id": ...}."""
+    raw_ids = [i for i in [id1, id2, id3] if i and i.strip()]
+    data = _call_node_bridge(bridge, "compile", json.dumps(raw_ids), workspace=workspace)
+    return json.dumps(data)
+```
+The bridge script (`_kit_bridge.mjs`) is written into the workspace at runtime. It imports the kit's ESM modules using absolute paths resolved from `FLOW_AGENTS_ROOT`:
+```javascript
+import { DefaultKnowledgeStore } from "<FLOW_AGENTS_ROOT>/kits/knowledge/adapters/default-store/index.js";
+import { capture, compile } from "<FLOW_AGENTS_ROOT>/kits/knowledge/adapters/flow-runner/index.js";
+```
+Kit gate telemetry is written by the Node flow-runner to `<workspace>/.telemetry/full.jsonl` (via the `FLOW_AGENTS_WORKSPACE` env var). This path is separate from the FlowAgentsHooks telemetry path (`<workspace>/.flow-agents/.telemetry/full.jsonl`) — both files are asserted in the acceptance test.
+## Why two programmatic + one agent-driven capture
+`qwen3:1.7b` (1.7B parameters) reliably handles single-tool prompts, but complex multi-capture prompts cause it to loop or produce unexpected output. The example uses programmatic captures for the first two records to keep runtime bounded (~30 seconds total), and agent-driven calls for the third capture and the compile step. This gives evidence that:
+- The `capture_knowledge` and `compile_knowledge` tools are callable from a real Strands agent.
+- FlowAgentsHooks records session events for those calls.
+- The kit's gate telemetry is written correctly for all operations regardless of call path.
+The acceptance harness asserts on filesystem evidence, not on model output quality.
+## console.telemetry.json mapping
+A `knowledge` flow entry is registered in `console.telemetry.json` to make knowledge flow events visible in the Flow Agents Console:
+```json
+{
+  "id": "knowledge",
+  "label": "Knowledge flows",
+  "match": { "attribute": "flow", "includes": "knowledge." },
+  "titleAttribute": "title",
+  "detailAttributes": { ... }
+}
+```
+This matches telemetry events where the `flow` attribute includes `"knowledge."` — for example, the kit gate events emitted by the flow-runner use `knowledge.ingest` and `knowledge.compile` as the flow identifiers.
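The `match` rule above reads as a substring test on one attribute. A minimal sketch of such a matcher follows. The function name `matchesRule` and the event shape (an `attributes` object on each event) are assumptions; the Console's actual matching code is not shown in this diff.

```javascript
// Hypothetical sketch: `matchesRule` and the `attributes` event shape are
// assumptions for illustration.
function matchesRule(rule, event) {
  const attributes = event.attributes || {};
  const value = attributes[rule.match.attribute];
  // Only string attributes can contain the configured substring.
  return typeof value === "string" && value.includes(rule.match.includes);
}

const knowledgeRule = {
  id: "knowledge",
  match: { attribute: "flow", includes: "knowledge." },
};

console.log(matchesRule(knowledgeRule, { attributes: { flow: "knowledge.ingest" } }));
console.log(matchesRule(knowledgeRule, { attributes: { flow: "builder.build" } }));
```

Note that the trailing dot in `"knowledge."` keeps a flow named exactly `knowledge` from matching under this reading.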
+## Documented limitations
+1. **Model quality**: `qwen3:1.7b` is a 1.7B-parameter model. It works for single-tool prompts but has limited reliability for complex multi-step instructions. Larger models will work more reliably but require API keys or more memory.
+2. **Single-turn scope**: Each agent invocation covers one operation. Multi-turn chaining with full context tracking across many captures is out of scope for this sprint.
+3. **Steering seam**: The `FlowAgentsHooks` spike injects workflow steering context once at `Agent` construction time. Per-turn steering re-evaluation is not implemented. See `docs/integrations/framework-adapter.md` § Limitations for details.
+4. **Kit telemetry path**: The kit's flow-runner writes telemetry to `<workspace>/.telemetry/full.jsonl` (not the `.flow-agents/.telemetry/` subdirectory used by `FlowAgentsHooks`). Both paths are separate by design: kit telemetry captures gate-point evidence, session telemetry captures agent lifecycle events.
+5. **compile_knowledge tool signature**: The tool takes three separate `id1`, `id2`, `id3` parameters instead of a JSON array. This is because `qwen3:1.7b` does not reliably produce valid JSON array syntax when prompted. This signature change is limited to this example and does not affect the kit's flow-runner API.
+## Related references
+- `integrations/strands/examples/knowledge_kit_live.py` — the example script
+- `evals/acceptance/test_knowledge_kit_live.sh` — the acceptance test
+- `kits/knowledge/adapters/flow-runner/index.js` — the kit flow-runner (capture + compile)
+- `kits/knowledge/adapters/default-store/index.js` — the store adapter
+- `kits/knowledge/kit.json` — kit manifest
+- <a href="framework-adapter.html">Framework Adapter</a> — `FlowAgentsHooks` documentation and limitations
+- <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> — canonical event taxonomy

package/docs/kit-authoring-guide.md ADDED

@@ -0,0 +1,169 @@
+---
+title: Flow Kit Authoring Guide
+---
+# Flow Kit Authoring Guide
+A Flow Kit is a portable workflow bundle you author once and install into any Flow Agents workspace. It lets you package one or more Flow Definitions — plus optional skills, docs, adapters, evals, and assets — under a single validated manifest. The same install, validation, and activation path that ships the built-in Builder Kit is available to your own kits.
+This guide walks you from an empty directory to a validated, locally installed kit.
+## Concepts
+- **Kit** — a directory with a root `kit.json` manifest and the assets it declares. The manifest is the contract; Flow Agents validates it before anything is copied.
+- **Flow Definition** — a `.flow.json` file that declares steps, gates, and expected evidence. Validation of the Flow Definition semantics belongs to [Kontour Flow](https://kontourai.github.io/flow/); the kit contract delegates to it.
+- **Activation** — the step that reads the installed kit and writes runtime-local files into your workspace. Today the `codex-local` adapter is the only adapter, and it activates only Flow Definition assets.
+## Directory layout
+```text
+my-kit/
+  kit.json            ← required manifest
+  flows/
+    review.flow.json  ← at least one Flow Definition
+  docs/               ← optional
+    README.md
+```
+All paths declared in `kit.json` must be relative to the kit directory and must not contain `..`. The kit must be fully self-contained so it can be installed from any machine or worktree.
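The path rule above (relative, no `..`) can be sketched as a small predicate. This is an illustration, not the validator shipped in the package: `isSafeKitPath` is a hypothetical name, and the sketch interprets "must not contain `..`" as "no path segment equal to `..`", which is an assumption.

```javascript
// Hypothetical sketch: the name and the segment-based reading of the `..`
// rule are assumptions for illustration.
function isSafeKitPath(p) {
  if (typeof p !== "string" || p.length === 0) return false;
  // Reject absolute paths: POSIX roots, backslash roots, and Windows drives.
  if (p.startsWith("/") || p.startsWith("\\") || /^[A-Za-z]:/.test(p)) return false;
  // Reject any `..` segment, with either path separator.
  return !p.split(/[\\/]/).includes("..");
}

console.log(isSafeKitPath("flows/review.flow.json"));
console.log(isSafeKitPath("../outside/review.flow.json"));
console.log(isSafeKitPath("/etc/review.flow.json"));
```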
+## Minimal kit.json
+```json
+{
+  "schema_version": "1.0",
+  "id": "my-kit",
+  "name": "My Kit",
+  "description": "A minimal kit that adds a review flow.",
+  "flows": [
+    {
+      "id": "my-kit.review",
+      "path": "flows/review.flow.json",
+      "description": "Review a change against agreed criteria."
+    }
+  ]
+}
+```
+Required fields:
+| Field | Rule |
+|---|---|
+| `schema_version` | Must be `"1.0"` |
+| `id` | Stable kebab-case string, e.g. `review-kit` |
+| `name` | Non-empty display name |
+| `flows` | Non-empty list; each entry must have `id` and `path` |
+Optional fields: `product_name`, `description`, `skills`, `docs`, `adapters`, `evals`, `assets`. Optional fields list relative asset paths or objects with `id`, `path`, and optional `description`. They are declared for provenance but only Flow Definition assets are activated today; others appear in diagnostics as `skipped_assets`.
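The required-field rules in the table above can be sketched as a checker that returns a list of problems. This is a minimal illustration and not the validation behind `npm run validate:source`: `validateManifest` is a hypothetical name, and the kebab-case regular expression is one plausible reading of "stable kebab-case string".

```javascript
// Hypothetical sketch: the function name and the kebab-case pattern are
// assumptions for illustration.
function validateManifest(manifest) {
  const errors = [];
  if (manifest.schema_version !== "1.0") {
    errors.push('schema_version must be "1.0"');
  }
  if (typeof manifest.id !== "string" || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(manifest.id)) {
    errors.push("id must be a kebab-case string");
  }
  if (typeof manifest.name !== "string" || manifest.name.trim() === "") {
    errors.push("name must be a non-empty string");
  }
  if (!Array.isArray(manifest.flows) || manifest.flows.length === 0) {
    errors.push("flows must be a non-empty list");
  } else {
    manifest.flows.forEach((flow, index) => {
      if (!flow || typeof flow.id !== "string" || typeof flow.path !== "string") {
        errors.push(`flows[${index}] must have id and path`);
      }
    });
  }
  return errors;
}

const minimal = {
  schema_version: "1.0",
  id: "my-kit",
  name: "My Kit",
  flows: [{ id: "my-kit.review", path: "flows/review.flow.json" }],
};
console.log(validateManifest(minimal));
console.log(validateManifest({ schema_version: "2.0", id: "My Kit", name: "", flows: [] }));
```

Returning every problem at once, rather than stopping at the first, lets an author fix a manifest in one pass.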
+## Minimal flow file
+A Flow Definition at minimum needs `id`, `version`, `steps`, and `gates`. Steps form a linked list; each gate names the step it guards and the evidence it expects.
+```json
+{
+  "id": "my-kit.review",
+  "version": "1.0",
+  "steps": [
+    { "id": "review", "next": "done" },
+    { "id": "done", "next": null }
+  ],
+  "gates": {
+    "review-gate": {
+      "step": "review",
+      "expects": [
+        {
+          "id": "review-finding",
+          "kind": "surface.claim",
+          "required": true,
+          "description": "The change was reviewed and findings were recorded.",
+          "claim": {
+            "type": "my-kit.review.finding",
+            "subject": "artifact",
+            "accepted_statuses": ["trusted", "accepted"]
+          }
+        }
+      ]
+    }
+  }
+}
+```
+The `id` in the flow file should match the `id` of its entry in the `flows` list in `kit.json`. See `kits/builder/flows/shape.flow.json` and `kits/builder/flows/build.flow.json` in this repository for fuller examples of multi-step flows with required and optional gate evidence.
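The structure described above (matching ids, steps chained by `next`, gates naming the step they guard) can be checked with a short sketch. The function `check_flow` and its messages are hypothetical, and the rules that `next` and a gate's `step` must point at a declared step are assumptions drawn from the example, not confirmed validator behavior.

```python
def check_flow(flow: dict, declared_id: str) -> list[str]:
    """Return human-readable problems found in a Flow Definition (illustrative only)."""
    problems = []
    if flow.get("id") != declared_id:
        problems.append("flow id does not match the kit.json entry")

    steps = flow.get("steps", [])
    step_ids = {step["id"] for step in steps}

    # Assumed rule: each step's `next` is either null or another declared step.
    for step in steps:
        nxt = step.get("next")
        if nxt is not None and nxt not in step_ids:
            problems.append(f"step {step['id']} points at unknown step {nxt}")

    # Assumed rule: each gate guards a step that exists in this flow.
    for gate_id, gate in flow.get("gates", {}).items():
        if gate.get("step") not in step_ids:
            problems.append(f"gate {gate_id} guards unknown step {gate.get('step')}")
    return problems
```

Run against the minimal flow file above with `declared_id="my-kit.review"`, this sketch returns an empty list.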
+## Validate
+Before installing or sharing a kit, run validation from the flow-agents checkout:
+```bash
+npm run validate:source -- --kit path/to/my-kit
+```
+This runs the same repository contract validation used by `install-local`. A validation failure exits nonzero with a diagnostic. Fix errors and re-run until it passes cleanly.
+The full source-tree validation (no `--kit` flag) additionally validates the built-in catalog and Builder Kit:
+```bash
+npm run validate:source --
+```
+## Install locally
+Once validation passes, install the kit into a target workspace:
+```bash
+npx @kontourai/flow-agents flow-kit install-local path/to/my-kit --dest /path/to/workspace
+```
+`--dest` is the installed Flow Agents bundle root. When omitted, the command uses the current directory. From a contributor checkout of this repository, the equivalent form is `npm run flow-kit -- <command>`.
+Confirm the install:
+```bash
+npx @kontourai/flow-agents flow-kit list --dest /path/to/workspace
+npx @kontourai/flow-agents flow-kit status my-kit --dest /path/to/workspace
+```
+`list` prints one summary line per installed kit. `status` prints JSON provenance including the SHA256 content hash and `installed` or `missing` state.
+To replace an existing install after you update the kit source:
+```bash
+npx @kontourai/flow-agents flow-kit install-local path/to/my-kit --dest /path/to/workspace --update
+```
+## Activate
+After installing, run activate to write runtime-local files into the workspace:
+```bash
+npx @kontourai/flow-agents flow-kit activate --dest /path/to/workspace --format json
+```
+The `codex-local` adapter is selected automatically. It writes Flow Definition copies under `.flow-agents/runtime/codex/flows/<kit-id>/` and an `activation.json` manifest. Declared `skills`, `docs`, `adapters`, `evals`, and `assets` are recorded as `skipped_assets` — they are not an error, just not activated yet.
+When installing through `npx @kontourai/flow-agents init` with the Codex runtime, pass `--activate-kits` to run activation as part of init:
+```bash
+npx @kontourai/flow-agents init --runtime codex --dest /path/to/workspace --activate-kits --yes
+```
+## Troubleshooting
+Common validation errors and fixes are documented in the [Flow Kit Repository Contract](flow-kit-repository-contract.md#common-failures). The most frequent are:
+- `kit.json: .schema_version must be "1.0"` — update the manifest.
+- `kit.json: .id must be a stable kebab-case string` — use a lowercase id like `review-kit`.
+- `kit.json: .flows must be a non-empty list` — declare at least one Flow Definition.
+- `kit.json: flows[0].path points at missing Flow Definition` — add the file or fix the path.
+- `kit.json: docs[0].path points at missing asset` — add the asset or remove the entry.
+For path errors: all declared paths must be relative, must not contain `..`, and must point at existing files. Absolute paths are rejected because a kit must be portable between machines.
+For conflicts on re-install: if you install a different source with an existing kit id, the command fails unless you pass `--update`. Use `--force` to re-copy an existing same-source install after validation.
+See the [Flow Kit Repository Contract](flow-kit-repository-contract.md) for the full validation rules, registry schema, activation diagnostics, and the install/update/force semantics.
+## Direction
+Flow Kits are designed to be shareable workflow units: authored once, carried across teams and workspaces. The intended growth path is distribution from git remotes and a curated catalog of Kontour-authored kits covering work modes beyond software delivery. Today, installation is local-path only; remote fetch is explicitly a non-goal in this version.

package/docs/north-star.md CHANGED

@@ -152,9 +152,9 @@ The goal is not to add ceremony. The goal is to make agents more reliable while
 | [x] | Standards register | Supported standards and Flow Agents-owned formats are documented with adoption rules. |
 | [ ] | Structured workflow state | Draft schemas, contracts, validation, explicit current-session identity, delegation-safe agent event logs, sidecar writer commands, and direct workflow-skill writer instructions exist for state, acceptance, evidence, handoff, critique, release, and learning; automatic enforcement remains partial. |
 | [ ] | Context map | Generated repo/context map exists; workflow steering and core planner/worker/verifier agents now use it, but broader agent coverage remains. |
-| [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; broader file/task-aware guidance remains. |
+| [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; the opt-in utterance evidence-check hook (ADR 0003 §9) badges unsupported agent statements via Survey; broader file/task-aware guidance remains. |
 | [x] | Sandbox policy | `context/contracts/sandbox-policy.md` and https://github.com/kontourai/flow-agents/blob/main/docs/sandbox-policy.md classify local read-only, local edit, worktree, container, cloud sandbox, and privileged integration modes. |
-| [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper can record native Veritas reports as optional Flow Agents evidence. |
+| [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper records native Veritas reports as optional evidence; utterance trust reports from `@kontourai/survey` cover agent statements. |
 | [ ] | Feedback loop | Runtime telemetry, outcomes, evals, and recurring corrections feed back into docs, skills, rules, or backlog. |
 | [ ] | Export validation | Codex, Claude Code, and Kiro exports preserve the same operating layers and now install telemetry, Goal Fit, and workflow steering hook wiring; adapter output, installed-command coverage, Claude live hook influence, and Kiro live strict-stop coverage exist. |