@kontourai/flow-agents 0.1.1 → 0.2.0
- package/.github/dependabot.yml +23 -0
- package/.github/workflows/publish-npm.yml +1 -1
- package/.github/workflows/release-please.yml +31 -0
- package/.github/workflows/runtime-compat.yml +118 -0
- package/CHANGELOG.md +38 -0
- package/CONTRIBUTING.md +4 -0
- package/README.md +58 -19
- package/build/src/cli/init.js +215 -5
- package/build/src/cli/utterance-check.js +236 -0
- package/build/src/cli.js +3 -0
- package/build/src/tools/build-universal-bundles.js +268 -0
- package/build/src/tools/filter-installed-packs.js +3 -0
- package/build/src/tools/validate-source-tree.js +6 -1
- package/context/scripts/telemetry/lib/config.sh +5 -1
- package/context/settings/flow-agents-settings.json +7 -0
- package/docs/agent-system-guidebook.md +4 -5
- package/docs/context-map.md +1 -0
- package/docs/index.md +46 -6
- package/docs/integrations/conformance.md +246 -0
- package/docs/integrations/framework-adapter.md +275 -0
- package/docs/integrations/harness-install.md +213 -0
- package/docs/integrations/index.md +54 -0
- package/docs/north-star.md +3 -3
- package/docs/repository-structure.md +1 -1
- package/docs/skills-map.md +10 -4
- package/docs/spec/runtime-hook-surface.md +472 -0
- package/docs/survey-utterance-check.md +308 -0
- package/docs/vision.md +45 -0
- package/docs/workflow-usage-guide.md +1 -1
- package/evals/acceptance/run.sh +4 -2
- package/evals/acceptance/test_opencode_harness.sh +121 -0
- package/evals/acceptance/test_pi_harness.sh +98 -0
- package/evals/integration/test_bundle_install.sh +226 -1
- package/evals/integration/test_bundle_lifecycle.sh +641 -0
- package/evals/integration/test_utterance_check.sh +518 -0
- package/evals/run.sh +2 -0
- package/evals/static/test_universal_bundles.sh +137 -2
- package/integrations/strands/README.md +256 -0
- package/integrations/strands/example.py +74 -0
- package/integrations/strands/flow_agents_strands/__init__.py +27 -0
- package/integrations/strands/flow_agents_strands/hooks.py +194 -0
- package/integrations/strands/flow_agents_strands/policy.py +348 -0
- package/integrations/strands/flow_agents_strands/steering.py +172 -0
- package/integrations/strands/flow_agents_strands/telemetry.py +238 -0
- package/integrations/strands/pyproject.toml +38 -0
- package/integrations/strands/tests/__init__.py +0 -0
- package/integrations/strands/tests/test_hooks.py +304 -0
- package/integrations/strands/tests/test_policy.py +315 -0
- package/integrations/strands/tests/test_telemetry.py +184 -0
- package/integrations/strands-ts/README.md +224 -0
- package/integrations/strands-ts/bin/conformance-shim.mjs +257 -0
- package/integrations/strands-ts/package.json +53 -0
- package/integrations/strands-ts/src/hooks.ts +208 -0
- package/integrations/strands-ts/src/index.ts +22 -0
- package/integrations/strands-ts/src/policy.ts +345 -0
- package/integrations/strands-ts/src/telemetry.ts +251 -0
- package/integrations/strands-ts/test/test-policy.ts +322 -0
- package/integrations/strands-ts/test/test-telemetry.ts +226 -0
- package/integrations/strands-ts/tsconfig.json +20 -0
- package/package.json +7 -2
- package/packaging/conformance/README.md +142 -0
- package/packaging/conformance/fixtures/config-protection--allow-no-path.json +18 -0
- package/packaging/conformance/fixtures/config-protection--allow-safe-file.json +20 -0
- package/packaging/conformance/fixtures/config-protection--block-biome.json +20 -0
- package/packaging/conformance/fixtures/config-protection--block-eslintrc.json +20 -0
- package/packaging/conformance/fixtures/quality-gate--allow-no-path.json +17 -0
- package/packaging/conformance/fixtures/quality-gate--allow-nonexistent-file.json +19 -0
- package/packaging/conformance/fixtures/stop-goal-fit--allow-clean-cwd.json +17 -0
- package/packaging/conformance/fixtures/stop-goal-fit--block-strict-mode.json +23 -0
- package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +21 -0
- package/packaging/conformance/fixtures/workflow-steering--allow-no-state.json +16 -0
- package/packaging/conformance/fixtures/workflow-steering--inject-active-state.json +29 -0
- package/packaging/conformance/fixtures/workflow-steering--inject-subagent-steering.json +25 -0
- package/packaging/conformance/package.json +4 -0
- package/packaging/conformance/run-conformance.js +322 -0
- package/packaging/manifest.json +59 -0
- package/schemas/flow-agents-settings.schema.json +48 -0
- package/scripts/README.md +5 -0
- package/scripts/dogfood.js +16 -0
- package/scripts/hooks/opencode-hook-adapter.js +123 -0
- package/scripts/hooks/opencode-telemetry-hook.js +101 -0
- package/scripts/hooks/pi-hook-adapter.js +123 -0
- package/scripts/hooks/pi-telemetry-hook.js +105 -0
- package/scripts/hooks/run-hook.js +8 -0
- package/scripts/hooks/utterance-check.js +327 -0
- package/scripts/telemetry/lib/config.sh +5 -1
- package/skills/idea-to-backlog/SKILL.md +1 -1
- package/src/cli/init.ts +219 -6
- package/src/cli/utterance-check.ts +324 -0
- package/src/cli.ts +3 -0
- package/src/tools/build-universal-bundles.ts +266 -0
- package/src/tools/filter-installed-packs.ts +3 -0
- package/src/tools/validate-source-tree.ts +6 -1
- package/build/src/cli/docs-preview.js +0 -39
- package/build/src/cli/export-bookmarks.js +0 -38
- package/build/src/cli/import-bookmarks.js +0 -50
- package/build/src/cli/instinct-cli.js +0 -93
@@ -0,0 +1,213 @@
---
title: Harness Install
---

# Harness Install

This page walks through three harness installs: Claude Code (the L2 reference runtime), opencode, and pi. All three follow the same model (`npm run build:bundles` generates the bundle, `flow-agents init` places it), but each runtime expects different files at different paths.

## How harness bundles work

`npm run build:bundles` generates one bundle per runtime under `dist/<runtime>/`. Each bundle contains:

- A host-specific configuration file that maps lifecycle events to shell commands invoking the canonical hook adapter wrapper.
- A host-specific adapter wrapper (`<runtime>-hook-adapter.js`) that reads stdin JSON from the host, invokes `run-hook.js` with the canonical script path and profile, translates the exit code to the host-native response format, and fails open on errors.
- A host-specific telemetry wrapper (`<runtime>-telemetry-hook.js`) that maps host event names to canonical telemetry event names and invokes `scripts/telemetry/telemetry.sh`.
- An `install.sh` that places the generated files at the host-expected paths.

`flow-agents init` (from `npx @kontourai/flow-agents`) calls `install.sh` for the selected runtime.
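
The adapter wrapper's "translate the exit code, fail open on errors" step described above could look roughly like the following TypeScript sketch. The exit-code convention (2 means block), the `HookOutcome` and `HostDecision` shapes, and the function name are assumptions made for illustration; the generated `<runtime>-hook-adapter.js` files may use different codes and host-native response formats.

```typescript
// Hypothetical result of running a canonical hook script via run-hook.js.
interface HookOutcome {
  exitCode: number | null; // null models a crash or timeout (assumption)
  stderr: string;
}

// Hypothetical host-neutral decision; real adapters emit host-native formats.
interface HostDecision {
  action: "allow" | "block";
  reason?: string;
}

// Assumed convention for this sketch only: exit code 2 means "block".
const BLOCK_EXIT_CODE = 2;

function translateOutcome(outcome: HookOutcome): HostDecision {
  if (outcome.exitCode === BLOCK_EXIT_CODE) {
    const reason = outcome.stderr.trim() || "blocked by policy";
    return { action: "block", reason };
  }
  // Exit 0, unknown exit codes, crashes, and timeouts all fail open.
  return { action: "allow" };
}
```

Treating every non-block outcome as "allow" keeps a broken hook from stalling the host, which is the fail-open behavior the bundle description calls for.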

## Claude Code

Claude Code is the L2 reference implementation. All four policy classes are wired: workflow steering, quality gate, stop-goal-fit, and config protection.

### Install

```bash
npx @kontourai/flow-agents init --runtime claude-code --dest /path/to/workspace --yes
```

The install script writes hook wiring into `.claude/settings.json` inside the destination workspace. The hooks object in `settings.json` maps Claude Code lifecycle events (`UserPromptSubmit`, `PreToolUse`, `PostToolUse`, `Stop`) to shell commands invoking the adapter:

```bash
bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
  node "$root/scripts/hooks/claude-telemetry-hook.js" UserPromptSubmit dev'
bash -lc 'root="${FLOW_AGENTS_CLAUDE_CODE_ROOT:-$(pwd)}"; \
  node "$root/scripts/hooks/claude-hook-adapter.js" UserPromptSubmit \
  workflow-steering workflow-steering.js default'
```

Telemetry always fires first and is always non-blocking (timeout: 10 s). Policy hooks fire second and may block on `PreToolUse` (timeout: 30 s). Both fail open on hook runtime errors.
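
As a rough illustration of the ordering and timeouts just described, the TypeScript sketch below builds one event's hook list with the telemetry command first (10 s) and the policy commands after it (30 s). The `CommandHook` and `HookGroup` shapes and the `buildEventGroup` name are assumptions for this example; the entry fields are modeled loosely on Claude Code's hook settings and should be checked against the file that `install.sh` actually generates.

```typescript
// Assumed entry shape for this sketch; timeouts are in seconds.
interface CommandHook {
  type: "command";
  command: string;
  timeout: number;
}

interface HookGroup {
  hooks: CommandHook[];
}

const TELEMETRY_TIMEOUT_S = 10; // telemetry is non-blocking
const POLICY_TIMEOUT_S = 30; // policy hooks may block on PreToolUse

function buildEventGroup(telemetryCommand: string, policyCommands: string[]): HookGroup {
  const telemetry: CommandHook = {
    type: "command",
    command: telemetryCommand,
    timeout: TELEMETRY_TIMEOUT_S,
  };
  const policies = policyCommands.map(
    (command): CommandHook => ({ type: "command", command, timeout: POLICY_TIMEOUT_S })
  );
  // Telemetry is listed first so it precedes every policy hook in the list.
  return { hooks: [telemetry].concat(policies) };
}
```

The sketch only fixes list order; whether a host runs listed hooks strictly in sequence is host-specific and not modeled here.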

### Dogfood variant (repo-local)

Inside the `flow-agents` source repo itself, the dogfood script writes hook wiring that points at the local `scripts/hooks/` directory rather than a published package:

```bash
npm run dogfood -- --runtime claude-code
```

The destination defaults to the repo root. Pass `--dest` to override.

### Scope-collision warning

When `init` detects that an existing `.claude/settings.json` already has hooks entries for the same lifecycle events, it emits a scope-collision warning to stderr:

```
[flow-agents] WARNING: .claude/settings.json already has hooks for UserPromptSubmit.
Existing entries will be preserved; Flow Agents hooks will be appended.
Review .claude/settings.json to confirm hook ordering is correct.
```

The install appends rather than replaces, so existing hooks are not removed. Review the settings file after install to confirm the ordering is what you want.
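
The append-and-warn behavior can be pictured with a small TypeScript sketch. It assumes a simplified in-memory model where each lifecycle event maps to a list of command strings; `HooksByEvent`, `MergeResult`, and `appendHooks` are hypothetical names, and the real installer works on the full `settings.json` structure.

```typescript
// Simplified model for illustration: event name -> list of hook commands.
type HooksByEvent = Record<string, string[]>;

interface MergeResult {
  merged: HooksByEvent;
  collisions: string[]; // events that already had hooks before the merge
}

function appendHooks(existing: HooksByEvent, incoming: HooksByEvent): MergeResult {
  const merged: HooksByEvent = {};
  for (const event of Object.keys(existing)) {
    merged[event] = (existing[event] || []).slice(); // copy; never mutate the input
  }
  const collisions: string[] = [];
  for (const event of Object.keys(incoming)) {
    const current = merged[event] || [];
    if (current.length > 0) collisions.push(event); // would trigger the warning
    merged[event] = current.concat(incoming[event] || []); // append, never replace
  }
  return { merged, collisions };
}
```

A caller could print one warning per entry in `collisions` and then write `merged` back, which leaves pre-existing hooks ahead of the appended ones.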

### Resulting file layout

```
<workspace>/
  .claude/
    settings.json          ← hook wiring (appended by install)
  scripts/
    hooks/
      claude-hook-adapter.js
      claude-telemetry-hook.js
      run-hook.js
      config-protection.js
      quality-gate.js
      stop-goal-fit.js
      workflow-steering.js
      …
  skills/
    …
  .flow-agents/            ← runtime workflow artifacts (not committed)
```

## opencode

opencode is an L1 adapter. It has no native `prompt.submit`-equivalent event, so workflow steering is approximated at `session.created` rather than at each user turn. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.1</a>.

### Install

```bash
npx @kontourai/flow-agents init --runtime opencode --dest /path/to/workspace --yes
```

### Dogfood variant

```bash
npm run dogfood -- --runtime opencode
```

### Resulting file layout

```
<workspace>/
  .opencode/
    plugins/
      flow-agents.js       ← auto-loaded at opencode startup
    agents/
      dev.md               ← agent prompts (opencode markdown format)
      tool-planner.md
      tool-worker.md
      …
    skills/
      deliver.md
      fix-bug.md
      …
  opencode.json            ← workspace instructions pointer
  scripts/
    hooks/
      opencode-hook-adapter.js
      opencode-telemetry-hook.js
      run-hook.js
      …
  skills/
    …
```

`opencode.json` at the workspace root is a minimal config file:

```json
{
  "instructions": "This workspace uses Flow Agents. See AGENTS.md for conventions, skills, and workflow guidance."
}
```

The plugin at `.opencode/plugins/flow-agents.js` is auto-loaded at opencode startup. It exports `FlowAgentsPlugin` and registers handlers for:

| opencode event | What fires |
| --- | --- |
| `session.created` | Telemetry + workflow steering (session-start context injection) |
| `tool.execute.before` | Telemetry + config-protection (blocking via thrown Error) |
| `tool.execute.after` | Telemetry + quality gate |
| `session.idle` | Telemetry + stop-goal-fit (warning only; not a true stop event) |
| `session.error`, `session.compacted`, `permission.asked`, `file.edited` | Telemetry only |

**Accepted gaps**: opencode has no `prompt.submit` hook, so workflow steering fires only on `session.created`, not at each user turn. `session.idle` is the closest event to a stop hook but does not reliably fire on session completion. These gaps are declared in the conformance level (L1) and in the plugin source comments.
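
The event table above can be read as a lookup from opencode event to policy class, with blocking reserved for `tool.execute.before`. The TypeScript sketch below encodes that reading; `OPENCODE_POLICY_BY_EVENT`, `policyFor`, and `enforce` are hypothetical names and are not taken from the generated plugin.

```typescript
type Policy = "workflow-steering" | "config-protection" | "quality-gate" | "stop-goal-fit";

// null means "telemetry only" for that event, per the table above.
const OPENCODE_POLICY_BY_EVENT: Record<string, Policy | null> = {
  "session.created": "workflow-steering",
  "tool.execute.before": "config-protection",
  "tool.execute.after": "quality-gate",
  "session.idle": "stop-goal-fit",
  "session.error": null,
  "session.compacted": null,
  "permission.asked": null,
  "file.edited": null,
};

function policyFor(event: string): Policy | null {
  const known = Object.prototype.hasOwnProperty.call(OPENCODE_POLICY_BY_EVENT, event);
  return known ? (OPENCODE_POLICY_BY_EVENT[event] ?? null) : null;
}

// Only tool.execute.before blocks, and it does so by throwing an Error.
function enforce(event: string, policyBlocked: boolean, reason: string): void {
  if (event === "tool.execute.before" && policyBlocked) {
    throw new Error(reason);
  }
  // Every other event is advisory in this sketch: warn at most, never block.
}
```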

**Agents**: opencode receives agent prompts as markdown files in `.opencode/agents/`. The main orchestrator is `dev.md`; specialist tools (planner, worker, reviewer, etc.) are additional markdown files in the same directory.

## pi

pi is an L1 adapter. It has no stop hook, so stop-goal-fit cannot fire at session end. This is a documented gap: see <a href="../spec/runtime-hook-surface.html">the spec, section 2.3</a>.

### Install

```bash
npx @kontourai/flow-agents init --runtime pi --dest /path/to/workspace --yes
```

### Dogfood variant

```bash
npm run dogfood -- --runtime pi
```

### Resulting file layout

```
<workspace>/
  .pi/
    extensions/
      flow-agents.ts       ← auto-discovered at startup (needs project trust)
    skills/
      deliver.md
      fix-bug.md
      …
  AGENTS.md                ← agent instructions (pi uses AGENTS.md, not a registry)
  scripts/
    hooks/
      pi-hook-adapter.js
      pi-telemetry-hook.js
      run-hook.js
      …
  skills/
    …
```

The extension at `.pi/extensions/flow-agents.ts` is auto-discovered at startup. It registers handlers for:

| pi event | What fires |
| --- | --- |
| `session_start` | Telemetry |
| `before_agent_start` | Telemetry + workflow steering (injects context into system prompt) |
| `tool_call` | Telemetry + config-protection (blocking via `{ block: true }` return) |
| `tool_result` | Telemetry + quality gate |
| `session_shutdown` | Telemetry + stop-goal-fit (warning only; not a true stop event) |

**Accepted gaps**: pi has no stop hook. `session_shutdown` is used as the closest equivalent but does not carry the same semantics as a stop event. This gap is declared in the conformance level (L1) and in the extension source comments.
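
Where opencode blocks by throwing, the pi table says `tool_call` blocks by returning `{ block: true }`. The TypeScript sketch below shows that return-value style for a config-protection check. The `reason` field, the `onToolCall` name, and the list of protected file-name prefixes are assumptions chosen to echo the conformance fixture names (`block-eslintrc`, `block-biome`, `allow-no-path`, `allow-safe-file`); the real policy defines its own rules.

```typescript
// Hypothetical return shape; only `block: true` is named in the table above.
interface ToolCallDecision {
  block: boolean;
  reason?: string;
}

// Illustrative list only; the real config-protection policy defines its own.
const PROTECTED_PREFIXES = [".eslintrc", "biome.json"];

function onToolCall(filePath: string | undefined): ToolCallDecision | undefined {
  // No path on the tool call: nothing to protect, so allow.
  if (!filePath) return undefined;
  const baseName = filePath.split("/").pop() || "";
  for (const prefix of PROTECTED_PREFIXES) {
    if (baseName.indexOf(prefix) === 0) {
      return { block: true, reason: `edits to ${baseName} are protected` };
    }
  }
  // Returning undefined lets the tool call proceed.
  return undefined;
}
```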

**Agents**: pi has no named-subagent registry. Agent guidance is delivered through `AGENTS.md` at the workspace root, plus the skills in `.pi/skills/` and the extension. The `flow-agents.ts` extension comment says explicitly: "pi has no named-subagent registry. Agents are not exported for pi."

### Scope-collision warning

Same behavior as Claude Code: if an existing `.pi/extensions/` directory contains a file with conflicting event registrations, `init` warns and appends. Review the extension file after install.

## Related references

- `dist/opencode/`: generated opencode bundle (do not edit by hand)
- `dist/pi/`: generated pi bundle (do not edit by hand)
- `dist/claude-code/`: generated Claude Code bundle
- `scripts/hooks/run-hook.js`: canonical hook runner
- <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a>: event taxonomy, policy classes, conformance levels
- <a href="conformance.html">Conformance</a>: how to self-certify a new adapter

@@ -0,0 +1,54 @@
---
title: Integration Examples
---

# Integration Examples

Flow Agents reaches host runtimes and agent frameworks through two distinct distribution models. This section provides worked examples for each model and a guide to the conformance kit for third-party adapter authors.

## Distribution models at a glance

**Harness runtimes** ship as self-contained bundles under `dist/<runtime>/`. The `npm run build:bundles` command generates each bundle from the canonical manifest and policy scripts. `flow-agents init` (or the dogfood variant) places the generated files at the host-expected paths inside a target workspace. Claude Code, Codex, Kiro, opencode, and pi are harness adapters.

**Framework adapters** live in `integrations/<name>/` as language-native packages. They register Flow Agents callbacks with the framework's lifecycle system using the framework's native registration API. `integrations/strands/` is the reference implementation: `flow-agents-strands` is a Python `HookProvider` that wires into AWS Strands Agents without requiring the Strands SDK at import time.

**Third-party adapters** self-certify by running the conformance kit in `packaging/conformance/`. The kit provides golden fixtures and a runner that pipes each fixture through the adapter command and reports per-level verdict.

## Conformance levels

| Level | What is required |
| --- | --- |
| L0 | Telemetry only: at least `agentSpawn` fires on session start |
| L1 | L0 plus workflow steering and stop-goal-fit in warning mode |
| L2 | L1 plus config protection (blocking) and quality gate (the reference level) |

Claude Code and Codex are L2 reference implementations. opencode is L1 (no prompt-submit hook). pi is L1 (no stop hook). The Strands adapter is L0 plus config protection via `BeforeToolCallEvent` cancellation.
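
The cumulative structure of the level table can be sketched as a small TypeScript function. This encodes only the table rows; it does not model event-level gaps such as a missing prompt-submit or stop hook, which the paragraph above also treats as limiting a runtime to L1. The `Capability` labels and the `conformanceLevel` name are hypothetical, and the conformance kit's runner is the authority on actual verdicts.

```typescript
// Hypothetical capability labels derived from the table rows above.
type Capability =
  | "telemetry"
  | "workflow-steering"
  | "stop-goal-fit-warning"
  | "config-protection"
  | "quality-gate";

type Level = "none" | "L0" | "L1" | "L2";

function conformanceLevel(caps: readonly Capability[]): Level {
  const has = (c: Capability): boolean => caps.indexOf(c) !== -1;
  if (!has("telemetry")) return "none";
  // L1 needs both steering and stop-goal-fit (warning mode) on top of L0.
  if (!(has("workflow-steering") && has("stop-goal-fit-warning"))) return "L0";
  // L2 needs both config protection and quality gate on top of L1.
  return has("config-protection") && has("quality-gate") ? "L2" : "L1";
}
```

Under this reading, an adapter with telemetry and config protection but no workflow steering stays at L0, which matches the "L0 plus config protection" description of the Strands adapter.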

The <a href="../spec/runtime-hook-surface.html">Runtime Hook Surface spec</a> defines the canonical event taxonomy, policy classes, conformance levels, and engine contract in full.

## Pages in this section

<div class="doc-grid">
<a class="doc-card" href="harness-install.html">
<strong>Harness Install</strong>
<span>Worked example installing into a Claude Code project, and the two newest runtimes: opencode and pi. Includes the dogfood variant and scope-collision warning behavior.</span>
</a>
<a class="doc-card" href="framework-adapter.html">
<strong>Framework Adapter</strong>
<span>Worked example based on <code>integrations/strands/</code>: constructing FlowAgentsHooks, telemetry emitted, the engine-contract binding for policy, and documented limitations.</span>
</a>
<a class="doc-card" href="conformance.html">
<strong>Conformance</strong>
<span>How a third-party adapter self-certifies: the engine contract 1.0, running the conformance runner, what each level requires, and how to declare gaps.</span>
</a>
<a class="doc-card" href="../spec/runtime-hook-surface.html">
<strong>Runtime Hook Surface Spec</strong>
<span>Canonical event taxonomy, four policy classes, conformance levels L0/L1/L2, mapping tables, and the engine contract for adapter authors.</span>
</a>
</div>

---

## TypeScript native-import adapter

`integrations/strands-ts/` (`@kontourai/flow-agents-strands`) is the first native-import consumer of the policy engine contract. It binds the `config-protection.js` `run()` function directly, with no subprocess on the hot path, and achieves **L2** conformance. See `integrations/strands-ts/README.md` and the [Framework Adapter](framework-adapter.html) page for the full comparison with the Python adapter.

package/docs/north-star.md
CHANGED
@@ -152,9 +152,9 @@ The goal is not to add ceremony. The goal is to make agents more reliable while
 | [x] | Standards register | Supported standards and Flow Agents-owned formats are documented with adoption rules. |
 | [ ] | Structured workflow state | Draft schemas, contracts, validation, explicit current-session identity, delegation-safe agent event logs, sidecar writer commands, and direct workflow-skill writer instructions exist for state, acceptance, evidence, handoff, critique, release, and learning; automatic enforcement remains partial. |
 | [ ] | Context map | Generated repo/context map exists; workflow steering and core planner/worker/verifier agents now use it, but broader agent coverage remains. |
-| [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; broader file/task-aware guidance remains. |
+| [ ] | JIT guidance | Stop hook checks sidecars; workflow steering reads `state.json`, `critique.json`, context-map availability, and high-risk state after non-subagent tools; the opt-in utterance evidence-check hook (ADR 0003 §9) badges unsupported agent statements via Survey; broader file/task-aware guidance remains. |
 | [x] | Sandbox policy | `context/contracts/sandbox-policy.md` and https://github.com/kontourai/flow-agents/blob/main/docs/sandbox-policy.md classify local read-only, local edit, worktree, container, cloud sandbox, and privileged integration modes. |
-| [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper
+| [ ] | Evidence integration | Evidence sidecars now carry `standard_refs` for SARIF, OpenTelemetry, JUnit/TAP, Veritas, and custom proof; a local Veritas readiness wrapper records native Veritas reports as optional evidence; utterance trust reports from `@kontourai/survey` cover agent statements. |
 | [ ] | Feedback loop | Runtime telemetry, outcomes, evals, and recurring corrections feed back into docs, skills, rules, or backlog. |
 | [ ] | Export validation | Codex, Claude Code, and Kiro exports preserve the same operating layers and now install telemetry, Goal Fit, and workflow steering hook wiring; adapter output, installed-command coverage, Claude live hook influence, and Kiro live strict-stop coverage exist. |
 
@@ -180,7 +180,7 @@ Tasks:
 
 - Document the public layers: rules, skills, powers, agents, workflows, knowledge, and evidence. **Done:** see https://github.com/kontourai/flow-agents/blob/main/docs/operating-layers.md.
 - Mark which directories are canonical source, generated exports, runtime state, and optional integrations.
-- Decide which workflow skills are part of the core pack and which are optional domain packs. **Started:** `packaging/packs.json` defines core
+- Decide which workflow skills are part of the core pack and which are optional domain packs. **Started:** `packaging/packs.json` defines core and development packs.
 - Add a standards register that lists each external standard, how Flow Agents uses it, and what Flow Agents-owned schemas still exist. **Done:** see https://github.com/kontourai/flow-agents/blob/main/docs/standards-register.md.
 - Add a "do not invent without checking standards" rule to contributor docs.
 
package/docs/repository-structure.md
CHANGED
@@ -96,7 +96,7 @@ specific row that matches the change.
 | Bundle/export shape | `packaging/`, `src/tools/build-universal-bundles.ts`, and source directories copied into bundles | `bash evals/static/test_universal_bundles.sh` |
 | Installer or local runtime setup behavior | `scripts/install-*.sh`, package bins, and generated bundle install scripts | `bash evals/integration/test_bundle_install.sh` |
 | Workflow artifact, sidecar, or provider contract | `context/contracts/`, `schemas/`, `src/cli/workflow-*`, and matching eval fixtures | `npm run workflow:validate-artifacts --` and workflow integration evals |
-| Flow Kit catalog or bundled kit content | `kits/`, Flow Definition files, and kit repository fixtures | `npm run
+| Flow Kit catalog or bundled kit content | `kits/`, Flow Definition files, and kit repository fixtures | `npm run validate:source -- --kit <path>` or `bash evals/integration/test_flow_kit_repository.sh` |
 | Durable developer guidance | `docs/`; regenerate/check the context map when navigation or durable contracts change | `npm run context-map:check --` |
 | Eval scenario or fixture | `evals/static/`, `evals/integration/`, `evals/fixtures/`, or `evals/cases/` | The owning eval plus `bash evals/run.sh static` when contracts are touched |
 | Optional external integration configuration | `integrations/` or `veritas.claims.json`; keep local run output ignored | The integration-specific eval or documented dry run |
package/docs/skills-map.md
CHANGED
@@ -45,6 +45,9 @@ flowchart LR
 Learn -->|new work| Shape
 ```
 
+> `publish-change` is a CLI-driven workflow step, not a loadable skill.
+> `goal-fit` is a hook-enforced check, not a loadable skill.
+
 ## Current Shape
 
 The operating model now has first-class coverage from idea intake through trusted delivery:
@@ -76,7 +79,7 @@ This view shows how each phase is composed. The left rail is the durable phase s
 <div class="phase-step"><span>01</span><strong>Discovery & shaping</strong></div>
 <div class="phase-lanes">
 <section class="phase-lane phase-lane--primary"><h3>Primary</h3><p><code>builder-shape</code> <code>idea-to-backlog</code></p></section>
-<section class="phase-lane"><h3>Support</h3><p><code>
+<section class="phase-lane"><h3>Support</h3><p><code>search-first</code> <code>explore</code> <code>frontend-design</code> <code>github-cli</code> <code>knowledge-capture</code></p></section>
 <section class="phase-lane"><h3>Nested sections / future primitives</h3><p>intake/dedupe, separate ideas, thinnest meaningful slice, opportunity review, explore options, <code>shape-work</code>, prioritize work, sync executable backlog</p></section>
 <section class="phase-lane phase-lane--gate"><h3>Gate & artifact</h3><p>Idea, slice, shape, and backlog gates. Writes shaped briefs and GitHub issue links in <code>.flow-agents/<slug>/</code>.</p></section>
 </div>
@@ -112,7 +115,7 @@ This view shows how each phase is composed. The left rail is the durable phase s
 <div class="phase-step"><span>05</span><strong>Learning & improvement</strong></div>
 <div class="phase-lanes">
 <section class="phase-lane phase-lane--primary"><h3>Primary</h3><p><code>learning-review</code></p></section>
-<section class="phase-lane"><h3>Support</h3><p><code>knowledge-capture</code> <code>
+<section class="phase-lane"><h3>Support</h3><p><code>knowledge-capture</code> <code>idea-to-backlog</code> <code>eval-rebuild</code></p></section>
 <section class="phase-lane"><h3>Nested sections / future primitives</h3><p>facts vs interpretation, follow-up routing, docs promotion review, knowledge updates, eval updates, skill/backlog improvements</p></section>
 <section class="phase-lane phase-lane--gate"><h3>Gate & artifact</h3><p>Learning gate. Writes outcomes, gaps, docs promotion state, follow-ups, knowledge updates, and verdict.</p></section>
 </div>
@@ -121,11 +124,11 @@ This view shows how each phase is composed. The left rail is the durable phase s
 
 | Phase | Primary workflow skill | Supporting skills | Nested sections / future primitive candidates |
 | --- | --- | --- | --- |
-| Idea discovery and shaping | `builder-shape`, `idea-to-backlog` | `
+| Idea discovery and shaping | `builder-shape`, `idea-to-backlog` | `search-first`, `explore`, `frontend-design`, `github-cli`, `knowledge-capture` | intake/dedupe, separate ideas, thinnest meaningful slice, opportunity review, explore options, shape work, prioritize work, sync executable backlog |
 | Backlog pickup | `pull-work` | `github-cli` | board snapshot, WIP check, grouping/dependency check, Probe decision, worktree decision, handoff |
 | Execution planning and build | `design-probe`, `pickup-probe`, `plan-work`, `execute-plan`, `review-work`, `verify-work` | `feedback-loop`, `browser-test`, `deliver`, `fix-bug`, `tdd-workflow` | Probe notes, Builder Kit Probe record, Definition Of Done, execution plan, parallel waves, implementation session state, critique report, verification report, Goal Fit Gate |
 | Evidence and release confidence | `evidence-gate`, `release-readiness` | `github-cli`, `eval-rebuild` | criteria-to-evidence map, CI confidence, scope/integrity check, publish-change, rollback review, observability review, final acceptance docs, post-deploy plan |
-| Learning and improvement | `learning-review` | `knowledge-capture`, `
+| Learning and improvement | `learning-review` | `knowledge-capture`, `idea-to-backlog`, `eval-rebuild` | facts vs interpretation, docs promotion review, follow-up routing, knowledge updates, eval/skill/backlog improvements |
 
 The highest-leverage future extractions are likely `shape-work`, `test-map`, `scope-and-integrity-check`, and `remediate-ci`. They are still nested because their behavior is present, but not yet large enough to need separate activation contracts.
 
@@ -190,6 +193,9 @@ flowchart LR
 Learning -->|systemic change| Eval[eval-rebuild / backlog / skill update]
 ```
 
+> `publish-change` is a CLI-driven workflow step, not a loadable skill.
+> `goal-fit` is a hook-enforced check, not a loadable skill.
+
 ## Eval Coverage
 
 Workflow evals are layered to match this map: