@kontourai/flow-agents 1.0.1 → 1.2.0

Files changed (111)
  1. package/.github/workflows/ci.yml +110 -0
  2. package/.github/workflows/runtime-compat.yml +5 -2
  3. package/CHANGELOG.md +42 -0
  4. package/README.md +26 -5
  5. package/build/src/cli/console-learning-projection.js +19 -2
  6. package/build/src/cli/effective-backlog-settings.js +18 -2
  7. package/build/src/cli/fixture-retirement-audit.js +19 -2
  8. package/build/src/cli/init.js +19 -2
  9. package/build/src/cli/{flow-kit.js → kit.js} +122 -108
  10. package/build/src/cli/promote-workflow-artifact.js +19 -2
  11. package/build/src/cli/publish-change-helper.js +19 -2
  12. package/build/src/cli/pull-work-provider.js +19 -2
  13. package/build/src/cli/runtime-adapter.js +20 -2
  14. package/build/src/cli/usage-feedback.js +19 -2
  15. package/build/src/cli/utterance-check.js +19 -2
  16. package/build/src/cli/validate-hook-influence.js +19 -2
  17. package/build/src/cli/validate-source-tree.js +4 -4
  18. package/build/src/cli/veritas-governance.js +19 -2
  19. package/build/src/cli/workflow-artifact-cleanup-audit.js +19 -2
  20. package/build/src/cli.js +3 -3
  21. package/build/src/flow-kit/validate.js +58 -62
  22. package/build/src/runtime-adapters.js +55 -24
  23. package/build/src/tools/build-universal-bundles.js +83 -19
  24. package/build/src/tools/generate-context-map.js +68 -9
  25. package/build/src/tools/validate-package.js +19 -2
  26. package/build/src/tools/validate-source-tree.js +51 -3
  27. package/context/scripts/telemetry/console-presets.sh +1 -1
  28. package/docs/adr/0007-flow-skill-kit-tool-boundary.md +169 -0
  29. package/docs/adr/0007-skill-audit.md +112 -0
  30. package/docs/adr/0008-kit-operation-boundary.md +88 -0
  31. package/docs/context-map.md +18 -22
  32. package/docs/flow-kit-repository-contract.md +5 -5
  33. package/docs/getting-started.md +177 -0
  34. package/docs/index.md +19 -8
  35. package/docs/kit-authoring-guide.md +46 -10
  36. package/docs/knowledge-kit.md +2 -2
  37. package/docs/spec/runtime-hook-surface.md +1 -1
  38. package/docs/vision.md +1 -1
  39. package/docs/workflow-usage-guide.md +1 -1
  40. package/evals/ci/run-baseline.sh +55 -8
  41. package/evals/fixtures/builder-kit-workflow-state/happy-path.json +2 -2
  42. package/evals/fixtures/builder-kit-workflow-state/mid-work-resume.json +2 -2
  43. package/evals/fixtures/console-learning-projection/artifacts/console-learning-correction/learning.json +1 -1
  44. package/evals/fixtures/pull-work-provider/github-issues.json +5 -5
  45. package/evals/integration/test_activate_npx_context.sh +2 -2
  46. package/evals/integration/test_bundle_install.sh +17 -12
  47. package/evals/integration/test_console_learning_projection.sh +1 -1
  48. package/evals/integration/test_flow_kit_install_git.sh +7 -7
  49. package/evals/integration/test_flow_kit_repository.sh +4 -4
  50. package/evals/integration/test_kit_conformance_levels.sh +1 -1
  51. package/evals/integration/test_local_flow_kit_install.sh +7 -7
  52. package/evals/integration/test_publish_change_helper.sh +1 -1
  53. package/evals/integration/test_pull_work_provider.sh +1 -1
  54. package/evals/integration/test_runtime_adapter_activation.sh +140 -19
  55. package/evals/lib/node.sh +2 -2
  56. package/evals/run.sh +2 -0
  57. package/evals/static/test_console_presets.sh +49 -0
  58. package/evals/static/test_workflow_skills.sh +15 -15
  59. package/integrations/strands/flow_agents_strands/steering.py +1 -1
  60. package/integrations/strands-ts/src/hooks.ts +1 -1
  61. package/kits/builder/kit.json +17 -0
  62. package/{skills → kits/builder/skills}/builder-shape/SKILL.md +4 -4
  63. package/{skills → kits/builder/skills}/idea-to-backlog/SKILL.md +1 -1
  64. package/kits/knowledge/kit.json +16 -9
  65. package/package.json +8 -5
  66. package/packaging/packs.json +1 -21
  67. package/scripts/README.md +1 -1
  68. package/scripts/kit.js +2 -0
  69. package/scripts/telemetry/console-presets.sh +1 -1
  70. package/skills/README.md +23 -0
  71. package/src/cli/console-learning-projection.ts +7 -1
  72. package/src/cli/effective-backlog-settings.ts +6 -1
  73. package/src/cli/fixture-retirement-audit.ts +7 -1
  74. package/src/cli/init.ts +7 -1
  75. package/src/cli/{flow-kit.ts → kit.ts} +124 -109
  76. package/src/cli/promote-workflow-artifact.ts +7 -1
  77. package/src/cli/publish-change-helper.ts +7 -1
  78. package/src/cli/pull-work-provider.ts +7 -1
  79. package/src/cli/runtime-adapter.ts +8 -1
  80. package/src/cli/usage-feedback.ts +7 -1
  81. package/src/cli/utterance-check.ts +7 -1
  82. package/src/cli/validate-hook-influence.ts +7 -1
  83. package/src/cli/validate-source-tree.ts +4 -4
  84. package/src/cli/veritas-governance.ts +7 -1
  85. package/src/cli/workflow-artifact-cleanup-audit.ts +7 -1
  86. package/src/cli.ts +3 -3
  87. package/src/flow-kit/validate.ts +63 -57
  88. package/src/runtime-adapters.ts +54 -26
  89. package/src/tools/build-universal-bundles.ts +67 -14
  90. package/src/tools/generate-context-map.ts +43 -7
  91. package/src/tools/validate-package.ts +7 -1
  92. package/src/tools/validate-source-tree.ts +34 -2
  93. package/scripts/flow-kit.js +0 -2
  94. package/skills/context-budget/SKILL.md +0 -40
  95. package/skills/explore/SKILL.md +0 -137
  96. package/skills/feedback-loop/SKILL.md +0 -87
  97. package/skills/frontend-design/SKILL.md +0 -80
  98. /package/{skills → kits/builder/skills}/deliver/SKILL.md +0 -0
  99. /package/{skills → kits/builder/skills}/design-probe/SKILL.md +0 -0
  100. /package/{skills → kits/builder/skills}/evidence-gate/SKILL.md +0 -0
  101. /package/{skills → kits/builder/skills}/execute-plan/SKILL.md +0 -0
  102. /package/{skills → kits/builder/skills}/fix-bug/SKILL.md +0 -0
  103. /package/{skills → kits/builder/skills}/learning-review/SKILL.md +0 -0
  104. /package/{skills → kits/builder/skills}/pickup-probe/SKILL.md +0 -0
  105. /package/{skills → kits/builder/skills}/plan-work/SKILL.md +0 -0
  106. /package/{skills → kits/builder/skills}/pull-work/SKILL.md +0 -0
  107. /package/{skills → kits/builder/skills}/release-readiness/SKILL.md +0 -0
  108. /package/{skills → kits/builder/skills}/review-work/SKILL.md +0 -0
  109. /package/{skills → kits/builder/skills}/tdd-workflow/SKILL.md +0 -0
  110. /package/{skills → kits/builder/skills}/verify-work/SKILL.md +0 -0
  111. /package/{skills → kits/knowledge/skills}/knowledge-capture/SKILL.md +0 -0
@@ -0,0 +1,112 @@
1
+ ---
2
+ title: "Skill Audit 2026-06-15: Flow / Skill / Kit / Tool Boundary"
3
+ ---
4
+
5
+ # Skill Audit: Flow / Skill / Kit / Tool Boundary
6
+
7
+ **Date:** 2026-06-15
8
+ **Companion to:** [ADR 0007](./0007-flow-skill-kit-tool-boundary.md)
9
+ **Scope:** All 26 skills in `skills/` — no skills declared inside kit directories were found separate from those already listed here.
10
+
11
+ ---
12
+
13
+ ## Classification Key
14
+
15
+ | Label | Meaning |
16
+ | --- | --- |
17
+ | **KIT-SKILL** | The agent's procedural method for one step of a kit-owned flow. Belongs in the kit that owns that flow. |
18
+ | **TOOL** | A raw capability the agent wields. Not tied to any flow step. Should be provided by the runtime or harness, not packaged as a "skill." |
19
+ | **ORPHAN** | Procedural but no flow step can be cited as the home. Either implies a missing/implicit flow, or signals scope drift. |
20
+
21
+ Flow step IDs used below are from:
22
+
23
+ - `kits/builder/flows/build.flow.json` — steps: `pull-work`, `design-probe`, `plan`, `execute`, `verify`, `merge-ready`, `pr-open`, `merge-ready-ci`, `learn`, `done`
24
+ - `kits/builder/flows/shape.flow.json` — steps: `shape`, `breakdown`, `file-issues`, `shape-done`
25
+ - `kits/knowledge/flows/ingest.flow.json` — steps: `capture`, `classify`, `route`
26
+ - `kits/knowledge/flows/compile.flow.json` — steps: `select-raws`, `compile`, `link`
27
+ - `kits/knowledge/flows/synthesize.flow.json` — steps: `detect-cluster`, `propose`, `evidence-gate`, `apply-or-reject`
28
+ - `kits/knowledge/flows/consolidate.flow.json` — steps: `related-event`, `propose`, `evidence-gate`, `apply-or-reject`
29
+ - `kits/knowledge/flows/retire.flow.json` — steps: `identify`, `propose-retirement`, `evidence-gate`, `apply-or-reject`
30
+ - `kits/knowledge/flows/store-contract.flow.json` — steps: `verify-contract`
31
+
32
+ ---
33
+
34
+ ## Full Audit Table
35
+
36
+ | Skill | What It Does | Classification | Kit + Flow Step (if KIT-SKILL) / Rationale (if TOOL or ORPHAN) |
37
+ | --- | --- | --- | --- |
38
+ | `agentic-engineering` | Principles for eval-first loops, task decomposition (15-minute units), model routing (Haiku/Sonnet/Opus), and session strategy. | TOOL | Documents how to use the agent's cognitive capabilities and model-selection judgment. It is guidance the agent *applies* while using tools, not a method for a specific flow step. It is not tied to any flow or kit. |
39
+ | `browser-test` | Delegates browser automation tasks — screenshots, accessibility checks, form filling, UI testing, DOM inspection — to `tool-playwright`. | TOOL | Wraps raw access to a browser automation capability (`tool-playwright`). No flow step backs it; it is a harness/runtime capability the agent directs. Equivalent to "how to run Playwright." |
40
+ | `builder-shape` | User-facing entry into the Builder Kit shape flow — invokes `idea-to-backlog` as a primitive and links `kits/builder/flows/shape.flow.json`; stops at the backlog gate unless issue sync is requested. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.shape`. **Step:** `shape` (and through it, `breakdown` and `file-issues` via `idea-to-backlog` delegation). `builder-shape` is the agent's procedural method for satisfying the `builder.shape` flow's entry step. |
41
+ | `context-budget` | Audits token overhead across installed Flow Agents bundles; scans components and produces a budget report with per-component breakdown and optimization suggestions. | ORPHAN | Procedural and agent-driven, but there is no flow in any kit that has a step for "audit the agent's own context budget." **Implies missing flow:** an implicit "context-health" or "self-maintenance" flow. Until that flow exists and is owned by a kit, this skill is unanchored. |
42
+ | `deliver` | Orchestrates the full plan → execute → review → verify loop, including preflight (pull-work, pickup-probe), looping on failures, and delivery confirmation. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Steps:** orchestrates across `pull-work`, `design-probe`, `plan`, `execute`, `verify`, `merge-ready` in sequence. `deliver` is the agent's top-level orchestration method for the builder build flow. It subsumes multiple build-flow steps and is the primary orchestrator skill for that flow. |
43
+ | `dependency-update` | Analyzes and upgrades project dependencies — delegates registry/advisory lookups to `tool-dependencies-updater`, then presents a plan and applies approved updates. | TOOL | Orchestrating a dependency scanner subagent (`tool-dependencies-updater`) is a raw capability use. There is no kit-owned flow with a "dependency-update" step. |
44
+ | `design-probe` | Generic one-question-at-a-time alignment interview — turns unclear goals, designs, or workflow states into shared understanding before planning or execution. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `design-probe`. The skill's own SKILL.md names the Builder Kit `design-probe` step explicitly. It also applies outside Builder Kit, but the canonical flow binding is `builder.build:design-probe`. |
45
+ | `eval-rebuild` | Defines project-specific rebuild/reinstall steps for the eval feedback loop so the `eval-builder` agent knows how to rebuild after editing a prompt or skill. | TOOL | This is harness/tooling guidance for how to run evals — a raw capability instruction with no flow step home. It is not a method for any kit-owned step; it is instructions about how the agent's own evaluation tooling works. |
46
+ | `evidence-gate` | Evaluates whether completed work has enough trustworthy evidence, scope integrity, and provider/runtime signal to publish, continue fixing, or request a human decision. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `verify` (the gate evaluation that determines whether the verify step's evidence satisfies the gate claim `builder.verify.tests`). Also maps to `merge-ready` evidence checks. The skill explicitly separates from release-readiness and handles the `verify`-step gate logic. |
47
+ | `execute-plan` | Parallel execution primitive — reads a plan artifact, fans out to `tool-worker` subagents in waves, and updates the session artifact between waves. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `execute`. The skill is the agent's procedural method for the `execute` step of the builder build flow. |
48
+ | `explore` | Fans out parallel subagents to map codebase structure, entry points, dependencies, architectural patterns, config, tests, and documentation accuracy in one pass. | ORPHAN | Procedural and multi-wave but there is no kit-owned flow step for "explore a codebase." It is used as a support skill during discovery/shaping and debugging but is not anchored to a specific flow step. **Implies missing flow:** a "codebase-onboarding" or "repository-exploration" flow with an `explore` step, or it belongs as a tool-like capability rather than a flow step skill. |
49
+ | `feedback-loop` | Verifies that completed implementation actually works by classifying the change (visual vs. integration) and delegating to the appropriate verification method (Playwright or direct command execution). | ORPHAN | There is no kit-owned flow step called "feedback-loop." It overlaps with the `verify` step of the builder build flow, but its scope is narrower (per-implementation-task confirmation) and it is used as a support skill during `execute-plan`, not as the canonical agent method for the `verify` step. **Implies missing flow:** or this is a tool-like capability (a "quick verify" affordance) that could be subsumed into `verify-work`. |
50
+ | `fix-bug` | Bug-fix orchestrator — adds a diagnosis phase (reproduce + root-cause via `tool-planner`), then chains plan → execute → review → verify identical to `deliver`. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Steps:** `design-probe` (root-cause/diagnosis maps to alignment before planning), `plan`, `execute`, `verify`. `fix-bug` is an alternative entry into the builder build flow for defect work; it adds a diagnosis front-end and otherwise implements the same flow steps as `deliver`. |
51
+ | `frontend-design` | Delegates frontend implementation to `tool-worker` with curated design guidelines (typography, color, motion, spatial composition, anti-patterns); requires Playwright visual verification after implementation. | ORPHAN | There is no kit-owned flow with a "frontend-design" step. This skill injects design taste into the `execute` step of the builder build flow but is not the canonical method for that step — it is used as a support layer inside `execute-plan`/`deliver`. **Implies missing flow:** a "frontend" or "UI-design" flow with dedicated design and verify steps, or this is more accurately a library of guidelines that the `execute` step (via `execute-plan`) consumes. |
52
+ | `github-cli` | Uses the `gh` CLI to interact with GitHub — PRs, issues, repos, releases, Actions, gists, search, and arbitrary API calls. | TOOL | `gh` is a raw capability — a command-line tool the agent wields. The skill is a how-to for operating that tool, not a method for a kit-owned flow step. Used as support across many flow steps without being bound to one. |
53
+ | `idea-to-backlog` | Turns raw product or technical ideas into shaped, prioritized, executable GitHub issue backlog through intake, separation, opportunity review, shaping, prioritization, and issue creation. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.shape`. **Steps:** `shape` (idea intake → shaped problem/outcome/constraints/non-goals/success/risk), `breakdown` (slices and thinnest meaningful slices), `file-issues` (creating GitHub issues with provider-neutral metadata). `idea-to-backlog` is the primary agent method that implements all three active steps of `builder.shape`. |
54
+ | `knowledge-capture` | Saves durable knowledge, pointers, decisions, lessons, corrections, and source references into the knowledge base using pointer or curated-knowledge modes. | KIT-SKILL | **Kit:** knowledge. **Flow:** `knowledge.ingest`. **Step:** `capture` (the first step of `knowledge.ingest`: capture raw text → produce a raw record). This skill is the agent's method for the `capture` step of the knowledge ingest flow. |
55
+ | `learning-review` | Captures post-merge/post-deploy/post-incident learnings and routes them back to backlog, workflow skills, tests, docs, or knowledge; includes correction telemetry and a verdict. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `learn`. The skill is the agent's method for the `learn` step of the builder build flow: turn delivery outcomes into durable learning and follow-up routing. |
56
+ | `pickup-probe` | Builder Kit specialization of the `design-probe` flow step — records scope, provider state, WIP/conflict scan, revision freshness, decisions, unresolved questions, accepted gaps, and planning readiness for selected backlog work. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `design-probe` (the `pickup-probe` skill is explicitly described in its SKILL.md as "the Builder Kit pickup specialization of the `design-probe` flow step"). It implements the `design-probe` step for the productized pickup path. |
57
+ | `plan-work` | Planning primitive — delegates codebase analysis and execution plan creation to `tool-planner`; produces a plan artifact, `acceptance.json`, and `handoff.json`. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `plan`. The skill is the agent's method for the `plan` step of the builder build flow. |
58
+ | `pull-work` | Selects ready GitHub issues from the backlog, enforces WIP limits, checks dependencies, determines worktree isolation, and hands selected work to planning. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `pull-work`. The skill is the agent's method for the `pull-work` step of the builder build flow. |
59
+ | `release-readiness` | Decides whether evidence-backed work is ready to merge, release, deploy, or hold — checks committed/pushed state, provider change record, CI/checks, rollback plan, observability, and docs; produces a structured release decision. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Steps:** `merge-ready` and `merge-ready-ci`. The skill implements the agent-facing logic for both merge readiness gates: it consumes evidence-gate output, checks operational and CI state, and produces a merge/release/deploy/hold decision. |
60
+ | `review-work` | Report-only critique primitive — delegates to `tool-code-reviewer`, `tool-security-reviewer`, and optionally `tool-dependencies-updater`; records findings through the `critique.json` artifact/sink. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `verify` (review is part of the verify gate's quality checks) or more precisely as an intermediate step between `execute` and the formal `verify` gate. The SKILL.md describes it as a gate that must be satisfied before verification. Mapped to: between `execute` and `verify` in `builder.build`. |
61
+ | `search-first` | Research-before-coding workflow — searches the codebase, package registries, GitHub, and web in parallel; evaluates candidates; and decides to adopt, extend, or build before writing code. | TOOL | This is a research/lookup methodology, not the agent's method for a specific flow step. It is used as a support behavior across multiple steps (shaping, planning, execution) without being anchored to one. It could be seen as a harness capability (web + registry search). |
62
+ | `tdd-workflow` | TDD orchestrator — wraps plan → execute → review → verify with test-first constraints, git checkpoints (RED/GREEN/REFACTOR), and a coverage gate (>= 80%). | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Steps:** `plan`, `execute`, `verify`. `tdd-workflow` is an alternative parameterization of the builder build flow that enforces test-first discipline across those three steps. |
63
+ | `verify-work` | Verification primitive — delegates to `tool-verifier` and `tool-playwright`; maps evidence to acceptance criteria; updates `evidence.json` and `acceptance.json`. | KIT-SKILL | **Kit:** builder. **Flow:** `builder.build`. **Step:** `verify`. The skill is the canonical agent method for the `verify` step. |
64
+
65
+ ---
66
+
67
+ ## Summary Counts
68
+
69
+ | Category | Count | Notes |
70
+ | --- | --- | --- |
71
+ | **KIT-SKILL (builder kit)** | 15 | `builder-shape`, `deliver`, `design-probe`, `evidence-gate`, `execute-plan`, `fix-bug`, `idea-to-backlog`, `learning-review`, `pickup-probe`, `plan-work`, `pull-work`, `release-readiness`, `review-work`, `tdd-workflow`, `verify-work` |
72
+ | **KIT-SKILL (knowledge kit)** | 1 | `knowledge-capture` implements `knowledge.ingest:capture` |
73
+ | **TOOL** | 6 | `agentic-engineering`, `browser-test`, `dependency-update`, `eval-rebuild`, `github-cli`, `search-first` |
74
+ | **ORPHAN** | 4 | `context-budget`, `explore`, `feedback-loop`, `frontend-design` |
75
+
76
+ Note: `knowledge-capture` is also used inside builder flows as a support dependency, but it is counted only once. Its canonical home is the Knowledge Kit (`knowledge.ingest:capture`).
77
+
78
+ Final count (26 skills in total):
79
+
80
+ - **KIT-SKILL:** 16 total: 15 belonging to the builder kit (across `builder.build` and `builder.shape` flows), 1 belonging to the knowledge kit (`knowledge.ingest:capture`)
81
+ - **TOOL:** 6
82
+ - **ORPHAN:** 4
83
+
84
+ ---
85
+
86
+ ## Orphans With "Implies Missing Flow" Detail
87
+
88
+ | Orphan Skill | Implication | Disposition |
89
+ | --- | --- | --- |
90
+ | `context-budget` | Implies missing flow: a "context-health" or "agent-self-audit" flow. No kit currently owns context-budget management as a named flow. This could eventually become a `builder.context-audit` or standalone kit flow. Alternatively, if the repo decides context budgeting is always a harness concern, this should be reclassified as a TOOL and the skill dissolved or folded into harness documentation. | **REMOVED** (2026-06-15). Agent self-maintenance; not a flow-step skill. Conceptually adjacent to `learning-review`; preserved intent noted in ADR 0007. |
91
+ | `explore` | Implies missing flow: a "codebase-onboarding" or "repository-exploration" flow with discrete steps (structure, entry points, dependencies, patterns, docs accuracy). Alternatively, `explore` is a multi-step capability the agent uses across many flow phases — in which case it is more accurately a TOOL (raw codebase-reading capability orchestrated across subagents) than a flow-step skill. | **REMOVED** (2026-06-15). Reclassified as a tool (parallel codebase-reading capability). Preserved intent: seed of a possible future `codebase-onboarding` flow — see ADR 0007. |
92
+ | `feedback-loop` | Implies missing flow: or more precisely, it overlaps with the `verify` step of `builder.build` without being the canonical method for it. The skill is used as a lightweight per-task verification inside `execute-plan`. If the builder build flow added a sub-step or explicit "local-verify" step between `execute` and the formal `verify` gate, `feedback-loop` would map there. Otherwise, it should be subsumed into `verify-work` or reclassified as support tooling. | **REMOVED** (2026-06-15). Subsumed: concern now handled by `verify-work` plus flow route-back. |
93
+ | `frontend-design` | Implies missing flow: a "frontend" or "UI-kit" flow with steps for design direction, implementation, and visual verification. Alternatively, the design guidelines could be packaged as a context resource injected into `execute-plan`/`tool-worker` rather than as a separate "skill." If it stays a skill, it belongs in a hypothetical UI Kit that owns a `frontend.build` flow with design and verify steps. | **REMOVED** (2026-06-15). Preserved intent: "plan-work but for UI" — seed of a possible future UI/Frontend Kit with design + visual-verify steps. Revisit if a UI kit is built. |
94
+
95
+ ---
96
+
97
+ ## Implementation Record (Issue #62, 2026-06-15)
98
+
99
+ The dispositions in this audit table were implemented in PR #62:
100
+
101
+ - **15 KIT-SKILLs moved to Builder Kit:** `builder-shape`, `deliver`, `design-probe`, `evidence-gate`, `execute-plan`, `fix-bug`, `idea-to-backlog`, `learning-review`, `pickup-probe`, `plan-work`, `pull-work`, `release-readiness`, `review-work`, `tdd-workflow`, `verify-work`. These were moved from `skills/<name>/` to `kits/builder/skills/<name>/` and declared in the `skills` array of `kits/builder/kit.json`.
102
+ - **1 KIT-SKILL moved to Knowledge Kit:** `knowledge-capture` — moved to `kits/knowledge/skills/knowledge-capture/` and declared in `kits/knowledge/kit.json` `skills` array.
103
+ - **4 ORPHANS deleted:** `context-budget`, `explore`, `feedback-loop`, `frontend-design` — removed per Brian's 2026-06-15 ruling above.
104
+ - **6 TOOLs left in place:** `agentic-engineering`, `browser-test`, `dependency-update`, `eval-rebuild`, `github-cli`, `search-first` — remain in `skills/` pending separate reclassification. See `skills/README.md`.
105
+
106
+ **Structural changes:**
107
+ - `src/tools/build-universal-bundles.ts`: `collectAllSkills()` function added; bundle builders now collect skills from both `skills/` (tool-skills) and kit-declared `skills` arrays. Runtime bundles (`.claude/skills/`, `.codex/skills/`, etc.) include all kit-owned skills unchanged.
108
+ - `src/tools/generate-context-map.ts`: `allSkillPaths()` function added; context map generation now includes kit-owned skills.
109
+ - `src/tools/validate-source-tree.ts`: `validateLegacyRefs()` updated to skip legacy-ref matches that resolve as declared kit-owned asset subpaths.
110
+ - `packaging/packs.json`: Skill entries limited to the 6 remaining tool-skills in `skills/`. Kit-owned skills are no longer listed in packs (they're always included in the bundle as kit assets).
111
+ - `flow-agents kit inspect kits/builder` now reports `k1: true` (skills present).
112
+ - `flow-agents kit inspect kits/knowledge` now reports `k1: true` (skills present).
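Editor's sketch (not part of the package diff): the structural change recorded above has bundle builders collecting skills from two places, the top-level `skills/` directory and each kit's declared `skills` array. The following is a minimal illustration of that merge under stated assumptions. The `KitManifest` and `SkillRef` shapes, the function name `collectSkillRefs`, and the idea that `kit.json` lists skills by bare name are all hypothetical; the real `collectAllSkills()` signature does not appear in this diff.

```typescript
// Hypothetical manifest shape: only the fields this sketch needs.
interface KitManifest {
  id: string;        // assumed to match the kit directory name, e.g. "builder"
  skills?: string[]; // assumed to hold bare skill names
}

// Hypothetical record describing where one skill lives and who owns it.
interface SkillRef {
  name: string;
  path: string;
  owner: string; // "tool-skill" for skills/ entries, otherwise the kit id
}

// Merge tool-skills from skills/ with kit-declared skills.
// Tool-skills come first, then kits in the order given.
function collectSkillRefs(toolSkillNames: string[], kits: KitManifest[]): SkillRef[] {
  const refs: SkillRef[] = toolSkillNames.map((name) => ({
    name,
    path: `skills/${name}/SKILL.md`,
    owner: "tool-skill",
  }));
  for (const kit of kits) {
    for (const name of kit.skills ?? []) {
      refs.push({
        name,
        path: `kits/${kit.id}/skills/${name}/SKILL.md`,
        owner: kit.id,
      });
    }
  }
  return refs;
}
```

Keeping the owner on each record is one possible way to let a packaging step list only tool-skills while still bundling kit-owned skills, which is the split the `packaging/packs.json` note describes.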
@@ -0,0 +1,88 @@
1
+ ---
2
+ title: "ADR 0008: Kit Operation Boundary"
3
+ ---
4
+
5
+ # ADR 0008: Kit Operation Boundary
6
+
7
+ **Date:** 2026-06-15
8
+ **Status:** Accepted
9
+
10
+ ---
11
+
12
+ ## Context
13
+
14
+ A kit is the SEAM where Flow and Flow Agents meet: Flow owns the container (manifest + flows), Flow Agents owns the extension (skills, adapters, docs, activation), and a kit fuses both into one package. So "which layer owns an operation on a kit?" is ill-posed whenever the operation touches both halves.
15
+
16
+ Of the kit operations: `validate` touches only the container (cleanly Flow); `activate` touches only a specific agent runtime — codex-local/strands-local (cleanly Flow Agents); `install` and `inspect` STRADDLE the seam.
17
+
18
+ A concrete duplication motivated this decision. flow-agents reimplements Flow's container contract: `src/flow-kit/validate.ts` has its own `validateCoreContainer` (checking `schema_version`, `id`, `name`, and `flows`), while Flow exposes the authoritative `validateKitContainer` from its `src/index.ts`, and flow-agents does not even depend on `@kontourai/flow`. The two contracts can silently drift.
19
+
20
+ This ADR records the design decision reached with Brian Anderson on 2026-06-15.
21
+
22
+ ---
23
+
24
+ ## Decision
25
+
26
+ ### The Dividing Test
27
+
28
+ Does the operation need to INTERPRET the agent extension (what a skill or adapter MEANS), or only the container (manifest + flows + the *names* of declared asset classes)? Container-only → Flow. Extension-interpreting → Flow Agents.
29
+
30
+ ### Flow Owns the Agent-Blind Kit Operations
31
+
32
+ `flow kit validate`, `flow kit install` (fetch + validate + place a kit package), `flow kit inspect` (container validity + flows + declared asset-class NAMES — the K0/structural view). Flow knows NOTHING about what a skill or adapter means.
33
+
34
+ ### Flow Agents Owns the Extension Operations
35
+
36
+ `flow-agents kit activate` (wire the extension into a runtime), plus the extension-interpreting augmentation of install (place skill/adapter assets) and inspect (interpret asset classes → K1/K2 + runtime targets). Flow Agents COMPOSES on Flow's primitives; it never reimplements them.
37
+
38
+ ### The Agent-Blind Guardrail
39
+
40
+ Flow's kit operations must NEVER interpret extension semantics — fetch, validate, place, report-structure, full stop. Holding this line keeps Flow's operations genuinely generic even though flow-agents is currently the only consumer; the line between the layers is precisely "does it interpret the extension?"
41
+
42
+ ### DRY via Delegation
43
+
44
+ flow-agents depends on `@kontourai/flow`, deletes `validateCoreContainer`, and delegates all container work to Flow's primitives. The container contract lives ONCE, in Flow.
45
+
46
+ ### Flow Agents Is the Reference Consumer
47
+
48
+ Flow Agents is the worked example for any future producer building on Flow — lean on Flow's agent-blind primitives, add your own extension layer in your own CLI.
49
+
50
+ ---
51
+
52
+ ## Consequences
53
+
54
+ ### CLI Surface
55
+
56
+ `flow kit <validate|install|inspect>` (container, agent-blind) and `flow-agents kit <install|inspect|activate>` (extension-composing). The standalone `flow-kit` binary and the flat `flow validate-kit` verb are deprecated with aliases.
57
+
58
+ ### Position C Adopted
59
+
60
+ This adopts position C (generic kit operations live in Flow) over position B (whole-kit lifecycle stays in flow-agents). C was chosen because doing it twice (B now, migrate later) is wasteful, and the agent-blind guardrail removes the premature-abstraction risk that motivated B.
61
+
62
+ ### Breaking CLI Change
63
+
64
+ Breaking CLI change on published 1.x in BOTH repos → deprecation aliases + a coordinated release.
65
+
66
+ ### Separate-Product-Ready
67
+
68
+ Because the primitives live in Flow with flow-agents as a consumer, Flow Kits could later be productized (container + primitives + marketplace) without re-architecture.
69
+
70
+ ---
71
+
72
+ ## Alternatives Considered
73
+
74
+ ### Position B (kit lifecycle entirely in flow-agents, defer generic Flow ops)
75
+
76
+ Rejected. Defensible short-term (flow-agents is the only layer that comprehends a whole kit today) but causes a double-migration; the agent-blind guardrail makes C's genericity real now, removing B's main justification.
77
+
78
+ ### Position A (container ops = Flow, runtime ops = flow-agents, by category)
79
+
80
+ Rejected earlier in the discussion — kits are not pure containers, so a category split mislocates install/inspect.
81
+
82
+ ---
83
+
84
+ ## References
85
+
86
+ - [ADR 0007: Flow / Skill / Kit / Tool Boundary](./0007-flow-skill-kit-tool-boundary.md) — the skill/tool boundary; same conversation.
87
+ - GitHub #62 (Builder Kit skill placement), #50 / #79 (marketplace / trust layer), #52 / #60 (agentless gate-eval proving Flow Definitions are agentless-capable).
88
+ - kontourai/flow container spec (flow PR #67) — establishes Flow owns the container contract + `validateKitContainer`.
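Editor's sketch (not part of the package diff): the dividing test in ADR 0008 can be pictured as two layers, where the agent-blind layer only checks the container and reports asset-class names, and the extension layer composes on that result instead of re-checking the container. Everything below is a hypothetical stand-in. `inspectContainer` and `inspectKit` are illustrative names, not the real `validateKitContainer` or CLI internals, and treating every non-container key as an asset class is an assumption. The only facts taken from the source are the container fields (`schema_version`, `id`, `name`, `flows`) and that `k1` means skills are present.

```typescript
type Manifest = Record<string, unknown>;

// Container fields named in the ADR's description of validateCoreContainer.
const CONTAINER_FIELDS = ["schema_version", "id", "name", "flows"];

interface ContainerView {
  valid: boolean;         // all container fields are present
  assetClasses: string[]; // names only; never interpreted at this layer
}

// Agent-blind layer (stand-in for Flow): checks the container and
// reports declared asset-class names without knowing what they mean.
function inspectContainer(manifest: Manifest): ContainerView {
  const valid = CONTAINER_FIELDS.every((field) => manifest[field] !== undefined);
  const assetClasses = Object.keys(manifest).filter(
    (key) => !CONTAINER_FIELDS.includes(key)
  );
  return { valid, assetClasses };
}

// Extension layer (stand-in for Flow Agents): delegates container work
// to the layer above, then interprets the `skills` asset class.
function inspectKit(manifest: Manifest): ContainerView & { k1: boolean } {
  const base = inspectContainer(manifest);
  const skills = manifest["skills"];
  const k1 = base.valid && Array.isArray(skills) && skills.length > 0;
  return { ...base, k1 };
}
```

The point of the shape is that `inspectKit` never repeats the field checks, so under this sketch the container contract has a single definition, which is the "DRY via Delegation" consequence the ADR describes.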
@@ -65,18 +65,18 @@ Primary tools: `npm run workflow:sidecar`, `npm run workflow:validate-artifacts`
65
65
 
66
66
  | Skill | Source | When To Load |
67
67
  | --- | --- | --- |
68
- | deliver | skills/deliver/SKILL.md | Delivery workflow — selected work to delivered code. Ensures pull-work + pickup-probe preflight, then chains plan-work → execute-plan → review-work → verify-work → loop on failure without requiring user interaction between cleanly determ... |
69
- | evidence-gate | skills/evidence-gate/SKILL.md | Evaluate whether completed work is trustworthy enough for human review, merge, or release. Use after implementation, verify-work, provider checks, CI, or remediation to map acceptance criteria to evidence, inspect scope integrity, classi... |
70
- | execute-plan | skills/execute-plan/SKILL.md | Parallel execution primitive — plan artifact path to implemented code via tool-worker (x4). Reads plan directly. Updates session file between waves. |
71
- | fix-bug | skills/fix-bug/SKILL.md | Bug fix orchestrator — diagnose → plan-work → execute-plan → review-work → verify-work → loop. Diagnosis phase is unique to bugs, then chains the same primitives. |
72
- | idea-to-backlog | skills/idea-to-backlog/SKILL.md | Turn raw product or technical ideas into shaped, prioritized, executable GitHub issue backlog. Use for idea intake, ideation, product shaping, spike/prototype decisions, PRD-like feature briefs, prioritization, and backlog creation befor... |
73
- | learning-review | skills/learning-review/SKILL.md | Capture post-merge, post-deploy, or post-incident learnings and feed them back into backlog, workflow skills, tests, docs, or knowledge. Use after release readiness, post-deploy checks, retrospectives, failed gates, or repeated workflow... |
74
- | plan-work | skills/plan-work/SKILL.md | Code planning primitive — goal + directory to structured execution plan. Delegates to tool-planner. No resume, no ideation. |
75
- | pull-work | skills/pull-work/SKILL.md | Select ready GitHub issues from the executable backlog and prepare them for implementation. Use when choosing what to work on next, reviewing a kanban-style issue board, enforcing WIP limits, grouping issues, deciding worktree isolation,... |
76
- | release-readiness | skills/release-readiness/SKILL.md | Decide whether evidence-backed work is ready to merge, release, deploy, or hold. Use after evidence-gate PASS, before merge/release/deploy, and for post-deploy verification planning. |
77
- | review-work | skills/review-work/SKILL.md | Review primitive - run report-only code, security, dependency, architecture/standards, and IaC/policy critique before verification; records findings through the critique artifact/sink, currently critique.json locally. |
78
- | tdd-workflow | skills/tdd-workflow/SKILL.md | Test-driven development — RED → GREEN → REFACTOR with git checkpoints. Wraps plan-work → execute-plan → review-work → verify-work with test-first constraints and coverage gates. |
79
- | verify-work | skills/verify-work/SKILL.md | Verification primitive — session file path to structured evidence verdict via tool-verifier + tool-playwright. Reads acceptance criteria from plan artifact. |
68
+ | deliver | kits/builder/skills/deliver/SKILL.md | Delivery workflow — selected work to delivered code. Ensures pull-work + pickup-probe preflight, then chains plan-work → execute-plan → review-work → verify-work → loop on failure without requiring user interaction between cleanly determ... |
69
+ | evidence-gate | kits/builder/skills/evidence-gate/SKILL.md | Evaluate whether completed work is trustworthy enough for human review, merge, or release. Use after implementation, verify-work, provider checks, CI, or remediation to map acceptance criteria to evidence, inspect scope integrity, classi... |
70
+ | execute-plan | kits/builder/skills/execute-plan/SKILL.md | Parallel execution primitive — plan artifact path to implemented code via tool-worker (x4). Reads plan directly. Updates session file between waves. |
71
+ | fix-bug | kits/builder/skills/fix-bug/SKILL.md | Bug fix orchestrator — diagnose → plan-work → execute-plan → review-work → verify-work → loop. Diagnosis phase is unique to bugs, then chains the same primitives. |
72
+ | idea-to-backlog | kits/builder/skills/idea-to-backlog/SKILL.md | Turn raw product or technical ideas into shaped, prioritized, executable GitHub issue backlog. Use for idea intake, ideation, product shaping, spike/prototype decisions, PRD-like feature briefs, prioritization, and backlog creation befor... |
73
+ | learning-review | kits/builder/skills/learning-review/SKILL.md | Capture post-merge, post-deploy, or post-incident learnings and feed them back into backlog, workflow skills, tests, docs, or knowledge. Use after release readiness, post-deploy checks, retrospectives, failed gates, or repeated workflow... |
74
+ | plan-work | kits/builder/skills/plan-work/SKILL.md | Code planning primitive — goal + directory to structured execution plan. Delegates to tool-planner. No resume, no ideation. |
75
+ | pull-work | kits/builder/skills/pull-work/SKILL.md | Select ready GitHub issues from the executable backlog and prepare them for implementation. Use when choosing what to work on next, reviewing a kanban-style issue board, enforcing WIP limits, grouping issues, deciding worktree isolation,... |
76
+ | release-readiness | kits/builder/skills/release-readiness/SKILL.md | Decide whether evidence-backed work is ready to merge, release, deploy, or hold. Use after evidence-gate PASS, before merge/release/deploy, and for post-deploy verification planning. |
77
+ | review-work | kits/builder/skills/review-work/SKILL.md | Review primitive - run report-only code, security, dependency, architecture/standards, and IaC/policy critique before verification; records findings through the critique artifact/sink, currently critique.json locally. |
78
+ | tdd-workflow | kits/builder/skills/tdd-workflow/SKILL.md | Test-driven development — RED → GREEN → REFACTOR with git checkpoints. Wraps plan-work → execute-plan → review-work → verify-work with test-first constraints and coverage gates. |
79
+ | verify-work | kits/builder/skills/verify-work/SKILL.md | Verification primitive — session file path to structured evidence verdict via tool-verifier + tool-playwright. Reads acceptance criteria from plan artifact. |
80
80
 
81
81
  ## Support Skills
82
82
 
@@ -84,17 +84,13 @@ Primary tools: `npm run workflow:sidecar`, `npm run workflow:validate-artifacts`
84
84
  | --- | --- | --- |
85
85
  | agentic-engineering | skills/agentic-engineering/SKILL.md | Eval-first execution, task decomposition, and cost-aware model routing for AI-driven development workflows. |
86
86
  | browser-test | skills/browser-test/SKILL.md | Headless browser automation via Playwright — screenshots, accessibility checks, form filling, UI testing, DOM inspection. |
87
- | builder-shape | skills/builder-shape/SKILL.md | Invoke Builder Kit shape from a raw idea or the current conversation context without requiring the user to name idea-to-backlog. Delegates shaping to idea-to-backlog, records the Builder Kit Flow Definition link, and stops at the backlog... |
88
- | context-budget | skills/context-budget/SKILL.md | Audit token overhead across Flow Agents bundles — agent specs, skills, context files, MCP servers. Produces budget report with per-component breakdown and optimization suggestions. |
87
+ | builder-shape | kits/builder/skills/builder-shape/SKILL.md | Invoke Builder Kit shape from a raw idea or the current conversation context without requiring the user to name idea-to-backlog. Delegates shaping to idea-to-backlog, records the Builder Kit Flow Definition link, and stops at the backlog... |
89
88
  | dependency-update | skills/dependency-update/SKILL.md | Analyze and upgrade project dependencies — latest versions, security vulnerabilities, actionable update plan across all package managers. |
90
- | design-probe | skills/design-probe/SKILL.md | Generic one-question-at-a-time design probing interview for turning unclear goals, designs, or workflow states into shared understanding before planning or execution. |
89
+ | design-probe | kits/builder/skills/design-probe/SKILL.md | Generic one-question-at-a-time design probing interview for turning unclear goals, designs, or workflow states into shared understanding before planning or execution. |
91
90
  | eval-rebuild | skills/eval-rebuild/SKILL.md | Project-specific build and install commands for the eval feedback loop. Injected into eval-builder agent. Replace this skill for different build systems. |
92
- | explore | skills/explore/SKILL.md | Parallel codebase exploration — fans out subagents to map structure, entry points, dependencies, patterns, config, and tests in one pass. |
93
- | feedback-loop | skills/feedback-loop/SKILL.md | Verify implementation actually works. Visual changes → Playwright; integration changes → commands/tests. Run after completing builds. |
94
- | frontend-design | skills/frontend-design/SKILL.md | Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics. |
95
91
  | github-cli | skills/github-cli/SKILL.md | Interact with GitHub via gh CLI — PRs, issues, repos, releases, workflows, gists. |
96
- | knowledge-capture | skills/knowledge-capture/SKILL.md | Save durable knowledge, lightweight pointers, user corrections, decisions, lessons, relationship context, or source references into the knowledge base. Use when the user says save, remember, capture, file this, bookmark context, or when... |
97
- | pickup-probe | skills/pickup-probe/SKILL.md | Builder Kit work-item/docs/provider-grounded Probe specialization used at the design-probe flow step before plan-work. |
92
+ | knowledge-capture | kits/knowledge/skills/knowledge-capture/SKILL.md | Save durable knowledge, lightweight pointers, user corrections, decisions, lessons, relationship context, or source references into the knowledge base. Use when the user says save, remember, capture, file this, bookmark context, or when... |
93
+ | pickup-probe | kits/builder/skills/pickup-probe/SKILL.md | Builder Kit work-item/docs/provider-grounded Probe specialization used at the design-probe flow step before plan-work. |
98
94
  | search-first | skills/search-first/SKILL.md | Research-before-coding workflow. Search for existing tools, libraries, and patterns before writing custom code. |
99
95
 
100
96
  ## Agents
@@ -129,8 +125,8 @@ Pack composition is defined in `packaging/packs.json`. The current builder expor
129
125
 
130
126
  | Pack | Default | Skills | Agents | Powers | Purpose |
131
127
  | --- | --- | --- | --- | --- | --- |
132
- | core | yes | 9 | 5 | 1 | Small default surface for reliable coding and workflow execution. |
133
- | development | no | 17 | 9 | 1 | Development workflow depth for backlog, release, dependency, GitHub, TDD, and frontend work. |
128
+ | core | yes | 2 | 5 | 1 | Small default surface for reliable coding and workflow execution. |
129
+ | development | no | 4 | 9 | 1 | Development workflow depth for backlog, release, dependency, GitHub, TDD, and frontend work. |
134
130
 
135
131
  ## Current Workflow State
136
132
 
@@ -23,11 +23,11 @@ npm run validate:source --
23
23
  Installed Flow Agents bundles include a local-only install command for Flow Kit repositories that already exist on disk:
24
24
 
25
25
  ```bash
26
- npm run flow-kit -- install-local path/to/local-kit --dest /path/to/installed-flow-agents
27
- npm run flow-kit -- list --dest /path/to/installed-flow-agents
28
- npm run flow-kit -- status --dest /path/to/installed-flow-agents
29
- npm run flow-kit -- status example-kit --dest /path/to/installed-flow-agents
30
- npm run flow-kit -- activate --dest /path/to/installed-flow-agents --format json
26
+ npm run kit -- install path/to/local-kit --dest /path/to/installed-flow-agents
27
+ npm run kit -- list --dest /path/to/installed-flow-agents
28
+ npm run kit -- status --dest /path/to/installed-flow-agents
29
+ npm run kit -- status example-kit --dest /path/to/installed-flow-agents
30
+ npm run kit -- activate --dest /path/to/installed-flow-agents --format json
31
31
  ```
32
32
 
33
33
  `--dest` is the installed bundle or workspace root. When omitted, the command uses the current working directory. Tests and automation should pass a temp destination; the command does not need to write to a user home directory.
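As a rough sketch of the `--dest` guidance above, a test helper might resolve the destination and create a throwaway directory like this. The helper names (`resolveDest`, `makeTempDest`, `kitArgs`) and the `flow-agents-test-` prefix are made up for illustration; only the `npm run kit -- <subcommand> --dest <path>` shape comes from the commands shown.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Mirror the documented default: no --dest means the current working directory.
function resolveDest(destFlag) {
  return path.resolve(destFlag ?? process.cwd());
}

// Create a throwaway destination so a test never writes to a home directory.
// The "flow-agents-test-" prefix is a hypothetical name for this sketch.
function makeTempDest() {
  return fs.mkdtempSync(path.join(os.tmpdir(), "flow-agents-test-"));
}

// Build the npm argument list a test might spawn for a kit subcommand.
function kitArgs(subcommand, dest) {
  return ["run", "kit", "--", subcommand, "--dest", resolveDest(dest)];
}

const tempDest = makeTempDest();
console.log(["npm", ...kitArgs("list", tempDest)].join(" "));
fs.rmSync(tempDest, { recursive: true, force: true });
```

Passing the argument list to a process spawner, rather than building a shell string, avoids quoting problems when the temp path contains spaces.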
@@ -0,0 +1,177 @@
1
+ ---
2
+ title: Builder Kit Quick Start
3
+ ---
4
+
5
+ # Builder Kit Quick Start
6
+
7
+ This guide takes you from nothing to a running, gated build flow in about two minutes. By the end you will have Flow Agents installed in your coding agent's workspace and understand how the Builder Kit's two flows — `builder.shape` and `builder.build` — turn a raw idea into a merged change with evidence.
8
+
9
+ ## 1. Install
10
+
11
+ Run this from any workspace you want to add discipline to:
12
+
13
+ ```bash
14
+ npx @kontourai/flow-agents init --runtime <your-agent> --dest .
15
+ ```
16
+
17
+ Where `--runtime` is one of `claude-code`, `codex`, `kiro`, `opencode`, or `pi`. For a fully unattended install:
18
+
19
+ ```bash
20
+ npx @kontourai/flow-agents init --runtime claude-code --dest . --yes
21
+ npx @kontourai/flow-agents init --runtime codex --dest . --yes
22
+ npx @kontourai/flow-agents init --runtime opencode --dest . --yes
23
+ ```
24
+
25
+ The installer copies agents, skills, context contracts, hook scripts, Kit assets, and the Flow Agents telemetry descriptor into the workspace. The Builder Kit installs automatically. Your agent reads those files at startup; no plugin registry required.
26
+
27
+ **What lands in the workspace:**
28
+
29
+ - `agents/`, `skills/`, `context/` — skill definitions and shared contracts the agent follows
30
+ - `scripts/hooks/` — four canonical policy scripts (steering, quality gate, stop-goal-fit, config protection) wired to the host's native hook surface
31
+ - `kits/builder/` — Builder Kit flows and skills
32
+ - `console.telemetry.json` — telemetry descriptor (writes locally by default)
33
+
34
+ At L2 conformance (Claude Code, Codex, Kiro) all four hooks are active and the stop hook blocks early exits that lack evidence. At L1 (opencode, pi) steering and stop-goal-fit run but without blocking capability; see the [Runtime Hook Surface spec](spec/runtime-hook-surface.html) for the gaps.
35
+
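The conformance split described above can be written down as a lookup table. This is a sketch for orientation only: the runtime names and levels are transcribed from the guide, while the hook identifiers, the function name `hookCapabilities`, and the return shape are assumptions rather than the package's actual API.

```javascript
// Runtime-to-level table transcribed from the guide text.
const CONFORMANCE = new Map([
  ["claude-code", "L2"],
  ["codex", "L2"],
  ["kiro", "L2"],
  ["opencode", "L1"],
  ["pi", "L1"],
]);

// Hook identifiers are hypothetical spellings of the four canonical policies.
const ALL_HOOKS = ["steering", "quality-gate", "stop-goal-fit", "config-protection"];
const L1_HOOKS = ["steering", "stop-goal-fit"];

// Summarize what a runtime's level means: which hooks run and whether
// the stop hook can actually block.
function hookCapabilities(runtime) {
  const level = CONFORMANCE.get(runtime);
  if (level === undefined) throw new Error(`unknown runtime: ${runtime}`);
  return {
    level,
    activeHooks: level === "L2" ? ALL_HOOKS : L1_HOOKS,
    canBlock: level === "L2",
  };
}

console.log(hookCapabilities("kiro"));
console.log(hookCapabilities("pi"));
```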
36
+ ## 2. What the Builder Kit gives you
37
+
38
+ The Builder Kit installs two flows:
39
+
40
+ | Flow | ID | What it does |
41
+ |---|---|---|
42
+ | Shape | `builder.shape` | Turns a raw idea into slices and executable work items |
43
+ | Build | `builder.build` | Takes a ready work item through design probe → plan → execute → verify → PR → merge readiness → learn |
44
+
45
+ These are not freeform chat sessions. Each flow has **evidence gates** — named checkpoints that expect specific claims before the next step starts. The agent cannot silently skip a gate; it either satisfies the expectation or the transition is blocked (at L2) or flagged (at L1).
46
+
47
+ **Shape flow gates** (`builder.shape`):
48
+
49
+ - `shape-gate` — problem, outcome, constraints, non-goals, success criteria, and risk are stated
50
+ - `breakdown-gate` — work is split into independently useful slices
51
+ - `file-issues-gate` — each slice becomes a filed work item with enough context to pull later
52
+
53
+ **Build flow gates** (`builder.build`):
54
+
55
+ - `pull-work-gate` — a ready work item is selected with scope and acceptance context
56
+ - `design-probe-gate` — goal fit, blockers, dependencies, and planning readiness are recorded before a plan is written
57
+ - `plan-gate` — the plan names files, changes, acceptance evidence, and sequencing
58
+ - `execute-gate` — changed files are recorded and unrelated work is excluded
59
+ - `verify-gate` — tests or checks have evidence tied to the implementation (up to 3 route-back attempts before blocking)
60
+ - `merge-ready-gate` — scope, evidence, and residual risks support a merge-ready decision
61
+ - `pr-open-gate` — a pull request exists with linked work and verification evidence
62
+ - `merge-ready-ci-gate` — CI and review status support merge
63
+ - `learn-gate` — decisions and delivery learnings are recorded for future work
64
+
65
+ The gate semantics live in [Kontour Flow](https://kontourai.github.io/flow/); Flow Agents compiles them to whatever hook surface your agent exposes.
66
+
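To make the gate behavior concrete, here is a minimal sketch of a single gate check. It is not Kontour Flow's schema or evaluator: the `expects` field, the claim names, and the `evaluateGate` function are invented for illustration, loosely following the `plan-gate` description (files, changes, acceptance evidence, sequencing).

```javascript
// Hypothetical gate shape: an id plus the claim names it expects.
const planGate = {
  id: "plan-gate",
  expects: ["files", "changes", "acceptanceEvidence", "sequencing"],
};

// Compare expected claim names against the supplied claims.
// A gap blocks the transition at L2 and is only flagged at L1.
function evaluateGate(gate, claims, level) {
  const missing = gate.expects.filter((name) => !Object.hasOwn(claims, name));
  if (missing.length === 0) {
    return { gate: gate.id, outcome: "pass", missing };
  }
  return { gate: gate.id, outcome: level === "L2" ? "blocked" : "flagged", missing };
}

const partialClaims = { files: ["src/cli.js"], changes: "add step counter" };
console.log(evaluateGate(planGate, partialClaims, "L2"));
console.log(evaluateGate(planGate, partialClaims, "L1"));
```

Returning the list of missing claims, rather than a bare boolean, matches the documented behavior of explaining what is missing when a gate does not pass.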
67
+ ## 3. A two-minute first run
68
+
69
+ ### Step 1 — Shape an idea
70
+
71
+ In your coding agent, paste this:
72
+
73
+ ```text
74
+ Use Builder Kit shape. I want to add a progress indicator to the CLI output so
75
+ users can see what step the installer is on. Keep it simple — just a step count
76
+ like "[2/5] Copying agents". Shape this into an executable work item and stop
77
+ at the backlog gate.
78
+ ```
79
+
80
+ The agent will run the `builder-shape` / `idea-to-backlog` skill, which:
81
+
82
+ 1. inventories the idea and classifies it
83
+ 2. proposes the thinnest meaningful slice (the step counter) and names what is out of scope
84
+ 3. drafts a shaped work item with a stated outcome, non-goals, acceptance criteria, and a verification expectation
85
+ 4. stops at the `breakdown-gate` and waits for you to confirm before creating GitHub issues
86
+
87
+ You will see the agent write a local artifact at `.flow-agents/<slug>/<slug>--idea-to-backlog.md`. That artifact is the machine-readable input to the next stage — not a summary in the chat window.
88
+
89
+ To continue and file the GitHub issue:
90
+
91
+ ```text
92
+ That looks right. File the GitHub issue and stop.
93
+ ```
94
+
95
+ The agent runs the `file-issues` step, checks the `file-issues-gate`, and stops. You now have a shaped, filed work item that the build flow can pull.
96
+
97
+ ### Step 2 — Build that work item
98
+
99
+ ```text
100
+ Use deliver for the issue you just filed. Pull it, probe the design, plan it,
101
+ implement it, review it, verify it, and stop if any evidence is missing.
102
+ ```
103
+
104
+ The `deliver` skill orchestrates the full `builder.build` flow:
105
+
106
+ 1. **pull-work** — selects the issue, confirms scope and acceptance criteria (`pull-work-gate`)
107
+ 2. **design-probe** — checks goal fit, identifies blockers and dependencies, and records planning readiness before touching a file (`design-probe-gate`)
108
+ 3. **plan-work** — delegates to `tool-planner`, which writes a structured plan artifact naming files, changes, sequencing, and acceptance evidence (`plan-gate`)
109
+ 4. **execute-plan** — fans out to up to four `tool-worker` subagents in parallel waves (`execute-gate`)
110
+ 5. **review-work** — code and optional security review (`critique.json` sidecar)
111
+ 6. **verify-work** — tests and checks with evidence tied to the change; if evidence is missing the verify-gate triggers a route-back (`verify-gate`)
112
+ 7. **release-readiness** — scope, evidence, and risk assessment (`merge-ready-gate`)
113
+ 8. **pull-request** — PR with linked work item and verification evidence (`pr-open-gate`)
114
+
115
+ You can also invoke each skill individually if you want explicit control:
116
+
117
+ ```text
118
+ Use pull-work to select issue #42.
119
+ ```
120
+
121
+ ```text
122
+ Use plan-work on the session artifact from the pull-work step.
123
+ ```
124
+
125
+ ```text
126
+ Use verify-work on the current branch and report what evidence is present.
127
+ ```
128
+
129
+ ### What you observe
130
+
131
+ - **Between each step**, the agent writes a local session sidecar under `.flow-agents/<slug>/` — `state.json`, `acceptance.json`, `evidence.json`, and `handoff.json`. These survive compaction, tab close, or a new session. A future session resumes from recorded state.
132
+ - **At each gate**, the agent either presents the evidence and moves forward, or blocks and explains what is missing. It does not make up a confident summary and proceed.
133
+ - **The stop-goal-fit hook** (at L2) prevents the agent from stopping when evidence is still incomplete — you see a warning or block rather than "all done!" on partial work.
134
+ - **If verify fails**, the verify-gate routes back to execution (or plan, or design-probe, depending on the failure class) and tries again — up to three times before hard-blocking.
135
+
136
+ This is guided, not fully automated. The agent handles the mechanics; you make product decisions. Gates are explicit handoff points, not invisible checkboxes.
137
+
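The verify route-back behavior described in the observations above can be sketched as a bounded retry loop. This is an interpretation, not the actual gate engine: it assumes one initial verification plus at most three route-backs, and the failure-class names, the `ROUTE_BACK_TARGET` table, and `verifyWithRouteBack` are hypothetical.

```javascript
const MAX_ROUTE_BACKS = 3;

// Hypothetical mapping from failure class to the step the flow returns to.
const ROUTE_BACK_TARGET = {
  implementation: "execute-plan",
  plan: "plan-work",
  design: "design-probe",
};

// verify() returns { ok: true } or { ok: false, failureClass }.
// runStep(stepName) re-runs the step that the failure routes back to.
function verifyWithRouteBack(verify, runStep) {
  const routeBacks = [];
  for (let attempt = 0; attempt <= MAX_ROUTE_BACKS; attempt++) {
    const result = verify();
    if (result.ok) return { outcome: "pass", routeBacks };
    if (attempt === MAX_ROUTE_BACKS) break; // route-back budget exhausted
    const target = ROUTE_BACK_TARGET[result.failureClass] ?? "execute-plan";
    routeBacks.push(target);
    runStep(target);
  }
  return { outcome: "blocked", routeBacks };
}

// Example: verification fails twice, then succeeds.
const scripted = [
  { ok: false, failureClass: "plan" },
  { ok: false, failureClass: "implementation" },
  { ok: true },
];
let index = 0;
console.log(verifyWithRouteBack(() => scripted[index++], () => {}));
```

Recording each route-back target gives the hard-block case a trail to report, which fits the guide's point that a blocked gate explains itself instead of stopping silently.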
138
+ ## 4. Inspect what you installed
139
+
140
+ After installing, you can inspect the Builder Kit's declared contents:
141
+
142
+ ```bash
143
+ node build/src/cli.js kit inspect kits/builder
144
+ ```
145
+
146
+ (Or, from a global install: `flow-agents kit inspect kits/builder`)
147
+
148
+ This prints the kit id, name, declared flows, skills, and conformance level (K0/K1). It does not require a running agent or active session.
149
+
150
+ To see the raw flow definitions with their gate expectations:
151
+
152
+ ```bash
153
+ cat kits/builder/flows/shape.flow.json
154
+ cat kits/builder/flows/build.flow.json
155
+ ```
156
+
157
+ ## 5. Verify your setup
158
+
159
+ After installing, run the source validation to confirm the workspace is coherent:
160
+
161
+ ```bash
162
+ npm run validate:source
163
+ ```
164
+
165
+ For a full static eval pass (docs layout, legacy-term checks, bundle assertions):
166
+
167
+ ```bash
168
+ npm run eval:static
169
+ ```
170
+
171
+ ## What to read next
172
+
173
+ - [Workflow Usage Guide](workflow-usage-guide.html) — example prompts and expected behavior for every skill and stage
174
+ - [Agent System Guidebook](agent-system-guidebook.html) — how the pieces fit together conceptually
175
+ - [Kit Authoring Guide](kit-authoring-guide.html) — author your own Flow Kit from scratch
176
+ - [Runtime Hook Surface spec](spec/runtime-hook-surface.html) — hook events, conformance levels, and host gaps
177
+ - [Workflow Artifact Lifecycle](workflow-artifact-lifecycle.html) — when to promote local artifacts to durable docs
package/docs/index.md CHANGED
@@ -71,24 +71,31 @@ The same canonical policies wire into agent frameworks as in-process language-na
71
71
 
72
72
  ## Quick Start
73
73
 
74
+ Install into your workspace in one command:
75
+
74
76
  ```bash
75
- npx @kontourai/flow-agents init --dest /path/to/workspace
77
+ npx @kontourai/flow-agents init --runtime <your-agent> --dest .
76
78
  ```
77
79
 
78
- Runtime-specific installs:
80
+ Where `--runtime` is `claude-code`, `codex`, `kiro`, `opencode`, or `pi`. The Builder Kit installs automatically and gives your agent two gated flows: `builder.shape` (idea → slices → filed work items) and `builder.build` (selected work item → design probe → plan → execute → verify → PR → learn).
79
81
 
80
- ```bash
81
- npx @kontourai/flow-agents init --runtime claude-code --dest /path/to/workspace --yes
82
- npx @kontourai/flow-agents init --runtime opencode --dest /path/to/workspace --yes
83
- npx @kontourai/flow-agents init --runtime pi --dest /path/to/workspace --yes
82
+ Ask your agent to shape an idea:
83
+
84
+ ```text
85
+ Use Builder Kit shape. I want to add a progress indicator to the CLI output
86
+ so users can see what step the installer is on. Shape this into an executable
87
+ work item and stop at the backlog gate.
84
88
  ```
85
89
 
86
- Then ask for the workflow you want, in plain language:
90
+ Then build it:
87
91
 
88
92
  ```text
89
- Use deliver for this GitHub issue. Plan it, implement it, review it, verify it, and stop if evidence is missing.
93
+ Use deliver for the issue you just filed. Pull it, probe the design, plan it,
94
+ implement it, verify it, and stop if any evidence is missing.
90
95
  ```
91
96
 
97
+ Each step has an evidence gate. The agent cannot proceed past a gate without the expected evidence — it either presents it or blocks and explains what is missing. See the <a href="getting-started.html">Builder Kit Quick Start</a> for a full two-minute walkthrough with worked examples and an explanation of what you observe at each gate.
98
+
92
99
  For bugs:
93
100
 
94
101
  ```text
@@ -98,6 +105,10 @@ Use fix-bug. Reproduce the problem, diagnose root cause, implement the fix, and
98
105
  ## Explore the docs
99
106
 
100
107
  <div class="doc-grid">
108
+ <a class="doc-card" href="getting-started.html">
109
+ <strong>Builder Kit Quick Start</strong>
110
+ <span>Zero to a running, gated build flow in two minutes: install, shape an idea into a work item, build it through the builder.shape and builder.build flows, and see what the evidence gates do.</span>
111
+ </a>
101
112
  <a class="doc-card" href="workflow-usage-guide.html">
102
113
  <strong>Workflow Usage Guide</strong>
103
114
  <span>Every stage from shaping ideas to learning review, with example prompts and expected behavior.</span>