@kontourai/flow-agents 0.1.1
- package/.githooks/pre-push +11 -0
- package/.github/workflows/ci.yml +210 -0
- package/.github/workflows/docs-pages.yml +52 -0
- package/.github/workflows/publish-npm.yml +104 -0
- package/AGENTS.md +26 -0
- package/CHANGELOG.md +66 -0
- package/CODE_OF_CONDUCT.md +25 -0
- package/CONTEXT.md +300 -0
- package/CONTRIBUTING.md +44 -0
- package/LICENSE +201 -0
- package/README.md +129 -0
- package/SECURITY.md +33 -0
- package/agent-cards/dev.json +19 -0
- package/agents/dev.json +127 -0
- package/agents/tool-code-reviewer.json +61 -0
- package/agents/tool-dependencies-updater.json +118 -0
- package/agents/tool-explore-config.json +92 -0
- package/agents/tool-explore-deps.json +92 -0
- package/agents/tool-explore-entry.json +92 -0
- package/agents/tool-explore-patterns.json +92 -0
- package/agents/tool-explore-structure.json +92 -0
- package/agents/tool-explore-tests.json +92 -0
- package/agents/tool-planner.json +57 -0
- package/agents/tool-playwright.json +145 -0
- package/agents/tool-security-reviewer.json +56 -0
- package/agents/tool-verifier.json +61 -0
- package/agents/tool-worker.json +58 -0
- package/build/src/cli/console-learning-projection.js +123 -0
- package/build/src/cli/docs-preview.js +39 -0
- package/build/src/cli/effective-backlog-settings.js +102 -0
- package/build/src/cli/export-bookmarks.js +38 -0
- package/build/src/cli/fixture-retirement-audit.js +140 -0
- package/build/src/cli/flow-kit.js +138 -0
- package/build/src/cli/import-bookmarks.js +50 -0
- package/build/src/cli/init.js +239 -0
- package/build/src/cli/instinct-cli.js +93 -0
- package/build/src/cli/promote-workflow-artifact.js +63 -0
- package/build/src/cli/publish-change-helper.js +154 -0
- package/build/src/cli/pull-work-provider.js +469 -0
- package/build/src/cli/runtime-adapter.js +23 -0
- package/build/src/cli/telemetry-doctor.js +221 -0
- package/build/src/cli/usage-feedback.js +443 -0
- package/build/src/cli/validate-hook-influence.js +152 -0
- package/build/src/cli/validate-source-tree.js +31 -0
- package/build/src/cli/validate-workflow-artifacts.js +486 -0
- package/build/src/cli/veritas-governance.js +262 -0
- package/build/src/cli/workflow-artifact-cleanup-audit.js +272 -0
- package/build/src/cli/workflow-sidecar.js +816 -0
- package/build/src/cli.js +89 -0
- package/build/src/flow-kit/validate.js +75 -0
- package/build/src/lib/args.js +45 -0
- package/build/src/lib/fs.js +62 -0
- package/build/src/lib/workflow-learning-projection.js +334 -0
- package/build/src/runtime-adapters.js +146 -0
- package/build/src/tools/build-universal-bundles.js +397 -0
- package/build/src/tools/common.js +56 -0
- package/build/src/tools/filter-installed-packs.js +132 -0
- package/build/src/tools/generate-context-map.js +198 -0
- package/build/src/tools/validate-package.js +64 -0
- package/build/src/tools/validate-source-tree.js +622 -0
- package/console.telemetry.json +176 -0
- package/context/base-rules.md +17 -0
- package/context/code-review-standards.md +62 -0
- package/context/coding-standards.md +42 -0
- package/context/common/orchestrators.md +12 -0
- package/context/common/subagents.md +28 -0
- package/context/contracts/artifact-contract.md +182 -0
- package/context/contracts/builder-kit-workflow-state-contract.md +319 -0
- package/context/contracts/delivery-contract.md +69 -0
- package/context/contracts/execution-contract.md +53 -0
- package/context/contracts/governance-adapter-contract.md +67 -0
- package/context/contracts/planning-contract.md +85 -0
- package/context/contracts/review-contract.md +104 -0
- package/context/contracts/sandbox-policy.md +52 -0
- package/context/contracts/verification-contract.md +134 -0
- package/context/contracts/work-item-contract.md +215 -0
- package/context/deferred/demo-mode.md +33 -0
- package/context/deferred/languages/go.md +31 -0
- package/context/deferred/languages/python.md +31 -0
- package/context/deferred/languages/typescript.md +34 -0
- package/context/deferred/parallelization.md +35 -0
- package/context/deferred/worktree-isolation.md +24 -0
- package/context/development-workflow.md +50 -0
- package/context/scripts/context-budget/budget-scan.sh +166 -0
- package/context/scripts/detect-tools.sh +3 -0
- package/context/scripts/discover-agents.sh +28 -0
- package/context/scripts/git-status.sh +49 -0
- package/context/scripts/hooks/config-protection.js +79 -0
- package/context/scripts/hooks/desktop-notify.sh +39 -0
- package/context/scripts/hooks/governance-audit.sh +135 -0
- package/context/scripts/hooks/lib/audit-transport.sh +40 -0
- package/context/scripts/hooks/lib/hook-flags.js +49 -0
- package/context/scripts/hooks/lib/patterns.sh +57 -0
- package/context/scripts/hooks/lib/resolve-formatter.js +80 -0
- package/context/scripts/hooks/post-edit-accumulator.js +66 -0
- package/context/scripts/hooks/pre-commit-quality.js +194 -0
- package/context/scripts/hooks/quality-gate.js +93 -0
- package/context/scripts/hooks/report-only-guard.js +21 -0
- package/context/scripts/hooks/run-hook.js +136 -0
- package/context/scripts/hooks/stop-format-typecheck.js +141 -0
- package/context/scripts/hooks/stop-goal-fit.js +337 -0
- package/context/scripts/hooks/workflow-steering.js +250 -0
- package/context/scripts/telemetry/console-presets.sh +14 -0
- package/context/scripts/telemetry/install-console-config.sh +214 -0
- package/context/scripts/telemetry/lib/config.sh +85 -0
- package/context/scripts/telemetry/lib/enrich.sh +115 -0
- package/context/scripts/telemetry/lib/redact.sh +22 -0
- package/context/scripts/telemetry/lib/session.sh +63 -0
- package/context/scripts/telemetry/lib/transport.sh +183 -0
- package/context/scripts/telemetry/lib/usage.sh +29 -0
- package/context/scripts/telemetry/sync-agents.sh +173 -0
- package/context/scripts/telemetry/telemetry.conf +23 -0
- package/context/scripts/telemetry/telemetry.sh +387 -0
- package/context/scripts/validate-package.sh +89 -0
- package/context/settings/backlog-provider-settings.json +54 -0
- package/context/templates/core/identity.md +26 -0
- package/context/templates/core/user.md +15 -0
- package/docs/_config.yml +15 -0
- package/docs/_layouts/default.html +87 -0
- package/docs/adr/0001-flow-agents-consumes-flow.md +77 -0
- package/docs/adr/0002-flow-kits-as-extension-unit.md +13 -0
- package/docs/adr/0003-flow-agents-coordinates-kits-and-adapters.md +13 -0
- package/docs/adr/0004-gates-expect-surface-claims.md +15 -0
- package/docs/adr/0005-kubernetes-inspired-resource-contracts.md +48 -0
- package/docs/adr/0006-typescript-first-source-policy.md +98 -0
- package/docs/agent-system-guidebook.md +391 -0
- package/docs/agent-usage-feedback-loop.md +351 -0
- package/docs/assets/favicon.svg +13 -0
- package/docs/assets/og-image.png +0 -0
- package/docs/assets/site.css +774 -0
- package/docs/assets/site.js +139 -0
- package/docs/configurable-workflow-routing.md +174 -0
- package/docs/context-map.md +145 -0
- package/docs/developer-architecture.md +145 -0
- package/docs/developer-hook-setup.md +61 -0
- package/docs/fixture-ownership.md +44 -0
- package/docs/flow-kit-repository-contract.md +180 -0
- package/docs/index.md +129 -0
- package/docs/kontour-resource-contract.md +358 -0
- package/docs/migrations.md +64 -0
- package/docs/north-star.md +322 -0
- package/docs/operating-layers.md +110 -0
- package/docs/repository-structure.md +132 -0
- package/docs/sandbox-policy.md +56 -0
- package/docs/skills-map.md +203 -0
- package/docs/standards-register.md +96 -0
- package/docs/veritas-integration.md +165 -0
- package/docs/work-item-adapters.md +72 -0
- package/docs/workflow-artifact-lifecycle.md +141 -0
- package/docs/workflow-eval-strategy.md +295 -0
- package/docs/workflow-shared-contracts.md +51 -0
- package/docs/workflow-usage-guide.md +443 -0
- package/evals/ARCHITECTURE.md +143 -0
- package/evals/CONVENTIONS.md +58 -0
- package/evals/README.md +128 -0
- package/evals/acceptance/run.sh +29 -0
- package/evals/acceptance/test_claude_harness.sh +242 -0
- package/evals/acceptance/test_codex_harness.sh +108 -0
- package/evals/acceptance/test_kiro_harness.sh +128 -0
- package/evals/cases/dev/404.html +97 -0
- package/evals/cases/dev/code-review.yaml +44 -0
- package/evals/cases/dev/dashboard.html +300 -0
- package/evals/cases/dev/deliver.yaml +66 -0
- package/evals/cases/dev/dependency-update.yaml +16 -0
- package/evals/cases/dev/explore.yaml +20 -0
- package/evals/cases/dev/index.html +370 -0
- package/evals/cases/dev/package-lock.json +28 -0
- package/evals/cases/dev/package.json +16 -0
- package/evals/cases/dev/plan-work.yaml +20 -0
- package/evals/cases/dev/promptfooconfig.yaml +666 -0
- package/evals/cases/dev/search-first.yaml +20 -0
- package/evals/cases/dev/tdd-workflow.yaml +48 -0
- package/evals/cases/dev/verify-work.yaml +44 -0
- package/evals/cases/dev/workflow.yaml +34 -0
- package/evals/ci/run-baseline.sh +283 -0
- package/evals/fixtures/backlog-provider-settings/global-default.json +44 -0
- package/evals/fixtures/backlog-provider-settings/project-override.json +53 -0
- package/evals/fixtures/builder-kit-workflow-state/baseline-freshness-resolution-hint.json +139 -0
- package/evals/fixtures/builder-kit-workflow-state/direct-primitive-stop.json +59 -0
- package/evals/fixtures/builder-kit-workflow-state/empty-board-route-shape.json +55 -0
- package/evals/fixtures/builder-kit-workflow-state/happy-path.json +71 -0
- package/evals/fixtures/builder-kit-workflow-state/mid-work-resume.json +80 -0
- package/evals/fixtures/builder-kit-workflow-state/missing-prestep-recovery.json +65 -0
- package/evals/fixtures/builder-kit-workflow-state/product-build-chaining.json +60 -0
- package/evals/fixtures/builder-kit-workflow-state/stale-continuation-requires-new-probe.json +57 -0
- package/evals/fixtures/console-learning-projection/artifacts/console-learning-correction/learning.json +50 -0
- package/evals/fixtures/console-learning-projection/artifacts/console-learning-open-route/learning.json +41 -0
- package/evals/fixtures/flow-kit-repository/invalid-absolute-path/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-asset-section/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-asset-section/kit.json +11 -0
- package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/kit.json +9 -0
- package/evals/fixtures/flow-kit-repository/invalid-id/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-id/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-malformed-json/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-missing-flow/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-missing-id/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-missing-id/kit.json +7 -0
- package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/kit.json +7 -0
- package/evals/fixtures/flow-kit-repository/invalid-name/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-name/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-schema-version/flows/review.flow.json +6 -0
- package/evals/fixtures/flow-kit-repository/invalid-schema-version/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/invalid-traversal/kit.json +8 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/adapters/example.json +3 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/assets/example.txt +1 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/docs/README.md +3 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/flows/runtime.flow.json +26 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-evals/example.json +3 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-skills/mixed/SKILL.md +3 -0
- package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit.json +44 -0
- package/evals/fixtures/flow-kit-repository/valid-local-kit/docs/README.md +3 -0
- package/evals/fixtures/flow-kit-repository/valid-local-kit/flows/review.flow.json +26 -0
- package/evals/fixtures/flow-kit-repository/valid-local-kit/kit.json +20 -0
- package/evals/fixtures/hook-influence/cases.json +336 -0
- package/evals/fixtures/pull-work-provider/github-issues.json +170 -0
- package/evals/fixtures/pull-work-wip-shepherding/global-wip-informs.json +43 -0
- package/evals/fixtures/pull-work-wip-shepherding/personal-wip-blocks.json +42 -0
- package/evals/fixtures/surface-trust/accepted-claim-trust-report.json +31 -0
- package/evals/fixtures/surface-trust/artifact-absent.json +19 -0
- package/evals/fixtures/surface-trust/integrity-mismatch-trust-report.json +32 -0
- package/evals/fixtures/surface-trust/missing-authority-trust-report.json +27 -0
- package/evals/fixtures/surface-trust/provider-absent.json +19 -0
- package/evals/fixtures/surface-trust/rejected-claim-trust-report.json +30 -0
- package/evals/fixtures/surface-trust/stale-claim-trust-snapshot.json +31 -0
- package/evals/fixtures/usage-feedback/sample-full.jsonl +11 -0
- package/evals/fixtures/usage-feedback/sample-outcomes.jsonl +1 -0
- package/evals/fixtures/veritas-governance-adapter/fake-veritas-pass.sh +18 -0
- package/evals/fixtures/veritas-governance-adapter/fake-veritas-secret-fail.sh +10 -0
- package/evals/fixtures/veritas-governance-adapter/fake-veritas-unconfigured.sh +4 -0
- package/evals/integration/test_bundle_install.sh +541 -0
- package/evals/integration/test_console_learning_projection.sh +192 -0
- package/evals/integration/test_context_map.sh +65 -0
- package/evals/integration/test_effective_backlog_settings.sh +58 -0
- package/evals/integration/test_fixture_retirement_audit.sh +58 -0
- package/evals/integration/test_flow_agents_statusline.sh +93 -0
- package/evals/integration/test_flow_kit_repository.sh +90 -0
- package/evals/integration/test_goal_fit_hook.sh +482 -0
- package/evals/integration/test_hook_category_behaviors.sh +190 -0
- package/evals/integration/test_hook_influence_cases.sh +69 -0
- package/evals/integration/test_local_flow_kit_install.sh +145 -0
- package/evals/integration/test_publish_change_helper.sh +176 -0
- package/evals/integration/test_pull_work_provider.sh +140 -0
- package/evals/integration/test_runtime_adapter_activation.sh +106 -0
- package/evals/integration/test_telemetry.sh +485 -0
- package/evals/integration/test_telemetry_doctor.sh +193 -0
- package/evals/integration/test_usage_feedback_dashboard.sh +169 -0
- package/evals/integration/test_usage_feedback_global.sh +117 -0
- package/evals/integration/test_usage_feedback_import.sh +227 -0
- package/evals/integration/test_usage_feedback_outcomes.sh +165 -0
- package/evals/integration/test_usage_feedback_report.sh +263 -0
- package/evals/integration/test_veritas_governance_adapter.sh +235 -0
- package/evals/integration/test_workflow_artifact_cleanup_audit.sh +287 -0
- package/evals/integration/test_workflow_artifacts.sh +1247 -0
- package/evals/integration/test_workflow_sidecar_writer.sh +2112 -0
- package/evals/integration/test_workflow_steering_hook.sh +337 -0
- package/evals/lib/assertions/delegated-to.js +40 -0
- package/evals/lib/assertions/max-tool-calls.js +15 -0
- package/evals/lib/assertions/no-write-tools.js +27 -0
- package/evals/lib/assertions/pass-at-k.js +39 -0
- package/evals/lib/assertions/telemetry-utils.js +105 -0
- package/evals/lib/assertions/tool-called.js +39 -0
- package/evals/lib/assertions/verify-after-fix.js +61 -0
- package/evals/lib/claude-judge.sh +40 -0
- package/evals/lib/claude-provider.sh +74 -0
- package/evals/lib/codex-judge.sh +39 -0
- package/evals/lib/codex-provider.sh +81 -0
- package/evals/lib/eval-dev.sh +5 -0
- package/evals/lib/eval-judge.sh +22 -0
- package/evals/lib/eval-provider.sh +26 -0
- package/evals/lib/eval-report.sh +73 -0
- package/evals/lib/kiro-dev.sh +4 -0
- package/evals/lib/kiro-judge.sh +17 -0
- package/evals/lib/kiro-provider.sh +62 -0
- package/evals/lib/node.sh +111 -0
- package/evals/promptfooconfig.yaml +70 -0
- package/evals/run.sh +309 -0
- package/evals/static/test_evidence_refs.sh +141 -0
- package/evals/static/test_package.sh +407 -0
- package/evals/static/test_repo_hooks.sh +68 -0
- package/evals/static/test_universal_bundles.sh +274 -0
- package/evals/static/test_workflow_skills.sh +1207 -0
- package/install.sh +64 -0
- package/integrations/veritas/flow-agents.adapter.json +138 -0
- package/integrations/veritas/flow-agents.authority-settings.json +26 -0
- package/integrations/veritas/flow-agents.repo-standards.json +82 -0
- package/kits/builder/flows/build.flow.json +218 -0
- package/kits/builder/flows/shape.flow.json +127 -0
- package/kits/builder/kit.json +19 -0
- package/kits/catalog.json +11 -0
- package/package.json +130 -0
- package/packaging/README.md +60 -0
- package/packaging/manifest.json +173 -0
- package/packaging/packs.json +69 -0
- package/powers/dependency-checker/POWER.md +20 -0
- package/powers/dependency-checker/mcp.json +20 -0
- package/powers/playwright/POWER.md +25 -0
- package/powers/playwright/mcp.json +12 -0
- package/prompts/code-audit.md +123 -0
- package/prompts/kcommit.md +88 -0
- package/schemas/backlog-provider-settings.schema.json +138 -0
- package/schemas/workflow-acceptance.schema.json +216 -0
- package/schemas/workflow-critique.schema.json +113 -0
- package/schemas/workflow-evidence.schema.json +357 -0
- package/schemas/workflow-handoff.schema.json +52 -0
- package/schemas/workflow-learning.schema.json +223 -0
- package/schemas/workflow-release.schema.json +172 -0
- package/schemas/workflow-state.schema.json +80 -0
- package/scripts/README.md +111 -0
- package/scripts/build-universal-bundles.js +3 -0
- package/scripts/check-content-boundary.cjs +99 -0
- package/scripts/context-budget/budget-scan.sh +166 -0
- package/scripts/detect-tools.sh +3 -0
- package/scripts/discover-agents.sh +28 -0
- package/scripts/effective-backlog-settings.js +2 -0
- package/scripts/filter-installed-packs.js +2 -0
- package/scripts/flow-kit.js +2 -0
- package/scripts/generate-context-map.js +2 -0
- package/scripts/git-status.sh +49 -0
- package/scripts/hooks/claude-hook-adapter.js +174 -0
- package/scripts/hooks/claude-telemetry-hook.js +115 -0
- package/scripts/hooks/codex-hook-adapter.js +176 -0
- package/scripts/hooks/codex-telemetry-hook.js +95 -0
- package/scripts/hooks/config-protection.js +79 -0
- package/scripts/hooks/desktop-notify.sh +39 -0
- package/scripts/hooks/governance-audit.sh +135 -0
- package/scripts/hooks/lib/audit-transport.sh +40 -0
- package/scripts/hooks/lib/hook-flags.js +49 -0
- package/scripts/hooks/lib/patterns.sh +57 -0
- package/scripts/hooks/lib/resolve-formatter.js +80 -0
- package/scripts/hooks/post-edit-accumulator.js +66 -0
- package/scripts/hooks/pre-commit-quality.js +194 -0
- package/scripts/hooks/quality-gate.js +93 -0
- package/scripts/hooks/report-only-guard.js +21 -0
- package/scripts/hooks/run-hook.js +136 -0
- package/scripts/hooks/stop-format-typecheck.js +141 -0
- package/scripts/hooks/stop-goal-fit.js +337 -0
- package/scripts/hooks/workflow-steering.js +250 -0
- package/scripts/install-codex-home.sh +106 -0
- package/scripts/package.json +3 -0
- package/scripts/promote-workflow-artifact.js +2 -0
- package/scripts/publish-change-helper.js +2 -0
- package/scripts/pull-work-provider.js +2 -0
- package/scripts/setup-repo-hooks.sh +8 -0
- package/scripts/statusline/flow-agents-statusline.js +157 -0
- package/scripts/telemetry/console-presets.sh +14 -0
- package/scripts/telemetry/install-console-config.sh +214 -0
- package/scripts/telemetry/lib/config.sh +85 -0
- package/scripts/telemetry/lib/enrich.sh +115 -0
- package/scripts/telemetry/lib/redact.sh +22 -0
- package/scripts/telemetry/lib/session.sh +63 -0
- package/scripts/telemetry/lib/transport.sh +183 -0
- package/scripts/telemetry/lib/usage.sh +29 -0
- package/scripts/telemetry/sync-agents.sh +173 -0
- package/scripts/telemetry/telemetry.conf +23 -0
- package/scripts/telemetry/telemetry.sh +387 -0
- package/scripts/usage-feedback.js +2 -0
- package/scripts/validate-hook-influence-cases.js +2 -0
- package/scripts/validate-package.sh +89 -0
- package/scripts/validate-source-tree.js +9 -0
- package/skills/agentic-engineering/SKILL.md +62 -0
- package/skills/browser-test/SKILL.md +51 -0
- package/skills/builder-shape/SKILL.md +76 -0
- package/skills/context-budget/SKILL.md +40 -0
- package/skills/deliver/SKILL.md +241 -0
- package/skills/dependency-update/SKILL.md +68 -0
- package/skills/design-probe/SKILL.md +107 -0
- package/skills/eval-rebuild/SKILL.md +39 -0
- package/skills/evidence-gate/SKILL.md +186 -0
- package/skills/execute-plan/SKILL.md +110 -0
- package/skills/explore/SKILL.md +137 -0
- package/skills/feedback-loop/SKILL.md +87 -0
- package/skills/fix-bug/SKILL.md +133 -0
- package/skills/frontend-design/SKILL.md +80 -0
- package/skills/github-cli/SKILL.md +63 -0
- package/skills/idea-to-backlog/SKILL.md +267 -0
- package/skills/knowledge-capture/SKILL.md +55 -0
- package/skills/learning-review/SKILL.md +115 -0
- package/skills/pickup-probe/SKILL.md +114 -0
- package/skills/plan-work/SKILL.md +176 -0
- package/skills/pull-work/SKILL.md +309 -0
- package/skills/release-readiness/SKILL.md +121 -0
- package/skills/review-work/SKILL.md +161 -0
- package/skills/search-first/SKILL.md +66 -0
- package/skills/tdd-workflow/SKILL.md +140 -0
- package/skills/verify-work/SKILL.md +109 -0
- package/src/cli/console-learning-projection.ts +140 -0
- package/src/cli/effective-backlog-settings.ts +99 -0
- package/src/cli/fixture-retirement-audit.ts +154 -0
- package/src/cli/flow-kit.ts +139 -0
- package/src/cli/init.ts +248 -0
- package/src/cli/promote-workflow-artifact.ts +64 -0
- package/src/cli/publish-change-helper.ts +143 -0
- package/src/cli/pull-work-provider.ts +481 -0
- package/src/cli/runtime-adapter.ts +24 -0
- package/src/cli/telemetry-doctor.ts +243 -0
- package/src/cli/usage-feedback.ts +418 -0
- package/src/cli/validate-hook-influence.ts +119 -0
- package/src/cli/validate-source-tree.ts +30 -0
- package/src/cli/validate-workflow-artifacts.ts +411 -0
- package/src/cli/veritas-governance.ts +322 -0
- package/src/cli/workflow-artifact-cleanup-audit.ts +281 -0
- package/src/cli/workflow-sidecar.ts +676 -0
- package/src/cli.ts +95 -0
- package/src/flow-kit/validate.ts +74 -0
- package/src/lib/args.ts +43 -0
- package/src/lib/fs.ts +62 -0
- package/src/lib/workflow-learning-projection.ts +491 -0
- package/src/runtime-adapters.ts +154 -0
- package/src/tools/build-universal-bundles.ts +366 -0
- package/src/tools/common.ts +61 -0
- package/src/tools/filter-installed-packs.ts +129 -0
- package/src/tools/generate-context-map.ts +199 -0
- package/src/tools/validate-package.ts +57 -0
- package/src/tools/validate-source-tree.ts +488 -0
- package/tsconfig.json +19 -0
- package/veritas.claims.json +6 -0
@@ -0,0 +1,391 @@
---
title: Agent System Guidebook
---

# Agent System Guidebook

This is the plain-language map of how Flow Agents is assembled and how it should feel to use.

> **Which doc do I want?** This page explains *how the system thinks*: layers, state, hooks, evidence, and the UX rules behind them. If you want to *drive a workflow right now*, with stage-by-stage prompts and expected behavior, use the [Workflow Usage Guide](workflow-usage-guide.md). For the one-line summary of every skill and gate, use the [Skills Map](skills-map.md).

The short version: Flow Agents is not one large prompt. It is a portable operating layer that wraps agent runtimes with durable instructions, task-specific procedures, scoped tools, specialist agents, Flow-backed workflow state, hooks, evidence, and learning loops. The goal is to make ordinary agent use more reliable without asking the user to understand all of that machinery.

## The User Experience

The user should be able to speak naturally:

```text
Plan this out and start making it happen. Keep going until the work is done.
```

Flow Agents should translate that into a disciplined workflow:

```mermaid
flowchart LR
  User[User asks for outcome]
  Route[Choose relevant skill]
  State[Create or resume workflow state]
  Team[Delegate scoped work]
  Check[Critique and verify]
  Hook[Hook catches stop-short gaps]
  Learn[Capture improvements]

  User --> Route --> State --> Team --> Check --> Hook --> Learn
  Hook -->|gap found| Team
```

The user sees a simple conversation. Underneath, the system keeps track of what is being done, who is doing it, how it will be checked, what is still missing, and what should improve next time. Generic process enforcement belongs to Kontour Flow; Flow Agents makes that enforcement native inside agent harnesses.
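The loop in the flowchart can be sketched as a small stage walker. This is a minimal illustration, not the package's implementation: the stage names mirror the diagram, while `runWorkflow`, the `GapProbe` callback, and the `maxAttempts` bound are hypothetical names introduced here.

```typescript
// Stage names mirror the flowchart; everything else is illustrative.
type Stage = "route" | "state" | "team" | "check" | "hook" | "learn";

const ORDER: Stage[] = ["route", "state", "team", "check", "hook", "learn"];

// Hypothetical probe: returns true when the hook finds a stop-short gap.
type GapProbe = (attempt: number) => boolean;

// Walk the stages in order; a gap found at the hook sends work back to "team".
// maxAttempts is an assumed safety bound so the sketch always terminates.
function runWorkflow(hasGap: GapProbe, maxAttempts = 5): Stage[] {
  const visited: Stage[] = [];
  let attempt = 0;
  let i = 0;
  while (i < ORDER.length) {
    const stage = ORDER[i] as Stage;
    visited.push(stage);
    if (stage === "hook" && attempt < maxAttempts && hasGap(attempt)) {
      attempt += 1;
      i = ORDER.indexOf("team"); // gap found: return to delegated work
      continue;
    }
    i += 1;
  }
  return visited;
}

// One gap on the first pass, none afterwards.
const trace = runWorkflow((attempt) => attempt === 0);
console.log(trace.join(" -> "));
// route -> state -> team -> check -> hook -> team -> check -> hook -> learn
```

The point of the sketch is only that learning is reached after the hook stops reporting gaps; how a real hook decides that is outside what this page describes.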

## The Simple Model

Think of Flow Agents as three visible ideas and four hidden supports.

<div class="concept-strip">
  <section>
    <strong>Ask</strong>
    <span>The user states the outcome in normal language.</span>
  </section>
  <section>
    <strong>Work</strong>
    <span>The agent plans, delegates, edits, researches, or captures knowledge.</span>
  </section>
  <section>
    <strong>Prove</strong>
    <span>The system checks whether the result is actually done.</span>
  </section>
</div>

The hidden supports are:

| Support | Plain-English Job |
| --- | --- |
| Instructions | Remember the rules and defaults the user should not have to repeat. |
| Procedures | Load the right playbook for the task at hand. |
| State | Remember the active workflow even after long context or multiple agents. |
| Evidence | Require proof, critique, or an explicit gap before calling work done. |

## What The User Says Vs What Flow Agents Does

| User Says | Flow Agents Should Do |
| --- | --- |
| “Research this and narrow focus.” | Use research/ideation skills, collect patterns, identify standards, and turn broad ideas into a smaller direction. |
| “Plan it out and start making it happen.” | Create a workflow artifact, define acceptance criteria, plan execution, then start implementation. |
| “Keep going until nothing is left.” | Use hooks and workflow state to avoid stopping early, then continue through verification and docs. |
| “Use subagents to critique.” | Delegate critique as report-only work, record findings, fix failures, and re-run verification. |
| “Can we validate this on Flow Agents itself?” | Use Flow Agents artifacts, hooks, evals, and learning loops on Flow Agents itself. |
| “What slug are we on?” | Resolve `.flow-agents/current.json`; do not depend on memory from chat. |
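
The last row can be illustrated with a small resolver. This is a sketch under a stated assumption: the guidebook names `.flow-agents/current.json` but does not show its schema, so the `slug` field and the `resolveActiveTaskDir` helper are hypothetical. The function takes the file's text rather than reading the disk so it stays self-contained.

```typescript
// Assumed shape of `.flow-agents/current.json`; the real schema may differ.
interface CurrentPointer {
  slug: string;
}

// Parse the pointer file's text and return the artifact directory of the active task.
function resolveActiveTaskDir(currentJsonText: string): string {
  const parsed = JSON.parse(currentJsonText) as Partial<CurrentPointer> | null;
  const slug = parsed && typeof parsed === "object" ? parsed.slug : undefined;
  if (typeof slug !== "string" || slug.length === 0) {
    // Fail loudly instead of guessing from chat memory.
    throw new Error("current.json does not name an active task slug");
  }
  return `.flow-agents/${slug}/`;
}

console.log(resolveActiveTaskDir('{"slug":"add-login-page"}'));
// .flow-agents/add-login-page/
```

Throwing on a missing slug matches the intent of the row: the answer comes from recorded state, or there is no answer.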

## Mental Model

Flow Agents works like an agent workbench with seven cooperating layers:

| Layer | What It Means | Where It Lives |
| --- | --- | --- |
| Rules | Durable behavior that should apply before a task starts. | `AGENTS.md`, `context/` |
| Skills | Repeatable procedures the agent loads only when relevant. | `skills/*/SKILL.md` |
| Powers | Tool bundles and activation guidance for integrations. | `powers/` |
| Agents | Specialist roles with scoped responsibilities. | `agents/`, `agent-cards/` |
| Workflows | State, gates, handoffs, and task memory. | Kontour Flow concepts, `.flow-agents/`, `npm run workflow:sidecar --` |
| Hooks | Just-in-time reminders or blockers from current workflow state. | `hooks/`, exported runtime configs |
| Evidence | Tests, evals, telemetry, findings, and outcome records. | `evals/`, `.telemetry/`, sidecars |

Each layer should stay small enough to explain independently. When the system feels complicated, the fix is usually to move behavior to the right layer, not to add more global prompt text.

```mermaid
flowchart TB
  Rules[Rules<br/>What should always be true]
  Skills[Skills<br/>How to do repeatable work]
  Powers[Powers<br/>What tools can be used]
  Agents[Agents<br/>Who should do specialized work]
  Workflows[Workflows<br/>What Flow-backed path is happening now]
  Hooks[Hooks<br/>When to intervene just in time]
  Evidence[Evidence<br/>How we know it worked]

  Rules --> Skills
  Skills --> Workflows
  Powers --> Agents
  Agents --> Workflows
  Workflows --> Hooks
  Workflows --> Evidence
  Evidence --> Skills
```
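
One way to read the diagram is as a small directed graph. The sketch below copies the edges from the diagram and adds two hypothetical helpers, `downstream` and `onFeedbackLoop`, to show that Skills, Workflows, and Evidence form a feedback loop while Rules and Powers only feed in from outside.

```typescript
type Layer = "Rules" | "Skills" | "Powers" | "Agents" | "Workflows" | "Hooks" | "Evidence";

// Edges copied from the diagram.
const EDGES: ReadonlyArray<[Layer, Layer]> = [
  ["Rules", "Skills"],
  ["Skills", "Workflows"],
  ["Powers", "Agents"],
  ["Agents", "Workflows"],
  ["Workflows", "Hooks"],
  ["Workflows", "Evidence"],
  ["Evidence", "Skills"],
];

// Every layer reachable from `start` by following edges (breadth-first).
function downstream(start: Layer): Layer[] {
  const seen: Layer[] = [];
  const queue: Layer[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as Layer;
    for (const [from, to] of EDGES) {
      if (from === current && seen.indexOf(to) === -1) {
        seen.push(to);
        queue.push(to);
      }
    }
  }
  return seen;
}

// A layer sits on a feedback loop when it can reach itself again.
function onFeedbackLoop(layer: Layer): boolean {
  return downstream(layer).indexOf(layer) !== -1;
}

console.log(downstream("Rules").join(", ")); // Skills, Workflows, Hooks, Evidence
console.log(onFeedbackLoop("Skills"), onFeedbackLoop("Hooks")); // true false
```

The `Evidence --> Skills` edge is what closes the loop: outcomes feed back into the procedures that produced them.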

## What We Actually Built

### Portable Source

The repo root is the canonical source. Generated bundles under `dist/` are outputs, not editing targets. The source files are exported into Codex, Claude Code, and Kiro shapes so each runtime receives the same operating system through its native conventions.

The important source areas are:

| Path | Purpose |
| --- | --- |
| `AGENTS.md` | Repo-level agent rules and source-of-truth instructions. |
| `skills/` | Procedure packages such as `plan-work`, `execute-plan`, `verify-work`, and knowledge workflows. |
| `agents/` | Specialist role definitions and tool boundaries. |
| `context/contracts/` | Shared workflow contracts for planning, execution, verification, delivery, sandboxing, and governance adapters. |
| `scripts/` | Build, validation, hook, telemetry, and sidecar tooling. |
| `evals/` | Static, behavioral, integration, and bundle-install tests. |
| `packaging/` | Pack definitions and cross-runtime export rules. |
| `docs/` | Durable explanation of the operating model and roadmap. |

Flow Agents currently carries local workflow sidecars and hooks while Flow is being separated into its own Kontour product layer. The intended boundary is that Flow owns generic steps, gates, transitions, Flow Runs, exceptions, and Flow Reports; Flow Agents owns the agent-facing modes, skills, provider settings, runtime adapters, and Console experience that make those flows useful.

### Skills

Skills are the reusable procedures. They are intentionally more specific than broad rules and lighter than a full app.

For example:

- `plan-work` turns a goal into a concrete plan with acceptance criteria and file ownership.
- `execute-plan` coordinates implementation from that plan.
- `verify-work` gathers evidence and reports gaps.
- `deliver` chains planning, execution, review, verification, goal fit, and final acceptance.

The pattern is: put a repeatable procedure in a skill, then make activation explicit enough that the agent can load it just in time.
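
As a rough illustration of just-in-time activation, the sketch below selects skill files by matching a request against trigger phrases. The `skills/<name>/SKILL.md` paths follow the layout described on this page, but the `triggers` lists and the `skillsFor` helper are hypothetical. Real activation is decided by the agent runtime from each skill's own description, not by keyword matching.

```typescript
interface Skill {
  name: string;
  // Hypothetical activation phrases, invented for this sketch.
  triggers: string[];
  // Path of the procedure file that would be loaded on activation.
  path: string;
}

const SKILLS: Skill[] = [
  { name: "plan-work", triggers: ["plan"], path: "skills/plan-work/SKILL.md" },
  { name: "execute-plan", triggers: ["start making", "implement"], path: "skills/execute-plan/SKILL.md" },
  { name: "verify-work", triggers: ["verify", "prove"], path: "skills/verify-work/SKILL.md" },
];

// Return only the skills whose triggers appear in the request; nothing else is loaded.
function skillsFor(request: string): Skill[] {
  const text = request.toLowerCase();
  return SKILLS.filter((skill) => skill.triggers.some((t) => text.indexOf(t) !== -1));
}

const picked = skillsFor("Plan this out and start making it happen.");
console.log(picked.map((s) => s.path).join("\n"));
// skills/plan-work/SKILL.md
// skills/execute-plan/SKILL.md
```

The design point is that unmatched procedures stay out of context entirely, which keeps the always-on rules small.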
|
|
146
|
+
|
|
147
|
+
### Agents
|
|
148
|
+
|
|
149
|
+
Agents are specialist roles, not miscellaneous prompts. A good agent has a narrow job and a clear report boundary.
|
|
150
|
+
|
|
151
|
+
Examples:
|
|
152
|
+
|
|
153
|
+
- `tool-planner` produces execution plans but does not edit files.
- `tool-worker` edits source within assigned ownership.
- `tool-verifier` runs checks and reports evidence without fixing code.
- `tool-code-reviewer` critiques quality and maintainability.
- `tool-security-reviewer` checks security risk.
|
|
158
|
+
|
|
159
|
+
The orchestrator owns coordination. Specialists should not quietly rewrite the whole workflow state because that makes parallel work hard to trust.
|
|
160
|
+
|
|
161
|
+
### Workflows And Sidecars
|
|
162
|
+
|
|
163
|
+
Workflow artifacts live under:
|
|
164
|
+
|
|
165
|
+
```text
.flow-agents/<task-slug>/
```
|
|
168
|
+
|
|
169
|
+
The Markdown artifact is the human-readable session record. The JSON sidecars, listed in the table below, are the machine-readable state.
|
|
170
|
+
|
|
171
|
+
These artifacts are the current Flow Agents implementation surface for Flow-backed workflow state. As Flow matures, Flow Agents should map these sidecars to Flow Runs and Flow Reports instead of inventing a separate generic enforcement model.
|
|
172
|
+
|
|
173
|
+
| Sidecar | Purpose |
| --- | --- |
| `state.json` | Current phase, status, next action, and artifact refs. |
| `acceptance.json` | Acceptance criteria and goal-fit status. |
| `handoff.json` | Summary, blockers, warnings, and next steps. |
| `evidence.json` | Checks, verdicts, and `NOT_VERIFIED` gaps. |
| `critique.json` | Review findings and pass/fail critique state. |
| `release.json` | Merge, release, deploy, hold, or rollback readiness. |
| `learning.json` | Post-work lessons and routed follow-ups. |
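
To make the table concrete, here is a minimal sketch of how two of these sidecars might be typed and read together. The field names (`phase`, `nextAction`, `artifactRefs`, `checks`) and the `PASS`/`FAIL` verdict strings are assumptions inferred from the descriptions above, not the package's actual schemas; only `NOT_VERIFIED` appears in the guide.

```typescript
// Hypothetical shapes inferred from the table; the real schemas may differ.
type SidecarVerdict = "PASS" | "FAIL" | "NOT_VERIFIED";

interface StateSidecarSketch {
  phase: string; // current phase, for example "verify"
  status: string; // current status, for example "in_progress"
  nextAction: string; // the recorded next concrete action
  artifactRefs: string[]; // paths to related workflow artifacts
}

interface EvidenceSidecarSketch {
  checks: { name: string; verdict: SidecarVerdict }[];
}

// Build a one-line summary from state plus any evidence gaps.
function summarizeSidecars(
  state: StateSidecarSketch,
  evidence: EvidenceSidecarSketch
): string {
  const gaps = evidence.checks
    .filter((check) => check.verdict === "NOT_VERIFIED")
    .map((check) => check.name);
  const gapText = gaps.length === 0 ? "no gaps" : "gaps: " + gaps.join(", ");
  return (
    state.phase + "/" + state.status + " -> " + state.nextAction +
    " (" + gapText + ")"
  );
}
```

The point of the sketch is that an agent can answer "where are we and what is unproven?" from files alone, without chat history.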
|
|
182
|
+
|
|
183
|
+
Runtime writes to `state.json` and `handoff.json` go through the sidecar transition guard in `npm run workflow:sidecar --`. The guard is an interim Flow Definition-compatible adapter: it can read the Builder Kit `builder.build` Flow Definition shape for step order, route-back reasons, and route-back max attempts, and it falls back to a legacy-compatible direct primitive policy when no Builder Kit workflow metadata is present. Flow remains the owner of transition semantics; this adapter exists only to fail closed until Flow core exposes the authoritative transition validator.
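
The fail-closed idea can be sketched as a pure function. This is an illustration only: the `FlowDefinitionSketch` shape, its field names, and the rule that only the next step or an approved route-back is allowed are assumptions, not the Builder Kit schema or the actual guard logic.

```typescript
// Hypothetical Flow Definition shape; the real `builder.build` schema may differ.
interface FlowDefinitionSketch {
  id: string;
  steps: string[]; // ordered step names
  routeBackReasons: string[]; // reasons that permit moving backward
  routeBackMaxAttempts: number; // cap on repeated route-backs
}

interface TransitionRequestSketch {
  from: string;
  to: string;
  routeBackReason?: string;
  priorRouteBackAttempts: number;
}

type GuardResult = { ok: true } | { ok: false; reason: string };

function checkTransition(
  def: FlowDefinitionSketch,
  req: TransitionRequestSketch
): GuardResult {
  const fromIndex = def.steps.indexOf(req.from);
  const toIndex = def.steps.indexOf(req.to);
  // Fail closed: unknown steps are rejected rather than guessed at.
  if (fromIndex === -1 || toIndex === -1) {
    return { ok: false, reason: "unknown step" };
  }
  // Moving to the immediately following step is always allowed.
  if (toIndex === fromIndex + 1) {
    return { ok: true };
  }
  // Moving backward needs an approved reason and remaining attempts.
  if (toIndex < fromIndex) {
    if (
      req.routeBackReason === undefined ||
      def.routeBackReasons.indexOf(req.routeBackReason) === -1
    ) {
      return { ok: false, reason: "route-back reason not allowed" };
    }
    if (req.priorRouteBackAttempts >= def.routeBackMaxAttempts) {
      return { ok: false, reason: "route-back attempts exhausted" };
    }
    return { ok: true };
  }
  // Skipping ahead or staying on the same step is rejected in this sketch.
  return { ok: false, reason: "step order violated" };
}
```

Every path that is not explicitly approved ends in a rejection, which is what "fail closed" means here.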
|
|
184
|
+
|
|
185
|
+
Rejected transitions do not rewrite `state.json` or `handoff.json`. The writer appends structured diagnostics to `transition-diagnostics.jsonl` beside the workflow sidecars, including the command, actor, from/to phase and status, Flow Definition id, route-back reason, attempt details when relevant, and required downstream gates. Route-back loop accounting is stored separately in `transition-attempts.json`, not in `state.json`, so the authoritative workflow state stays schema-clean.
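
As a rough illustration of the append-only diagnostics, the sketch below serializes one record per line in the JSON Lines style. The record type and its field names are hypothetical, modeled on the fields the paragraph lists, and the log is kept as an in-memory string rather than a real `transition-diagnostics.jsonl` file.

```typescript
// Hypothetical record shape based on the fields named above.
interface TransitionDiagnosticSketch {
  command: string;
  actor: string;
  from: { phase: string; status: string };
  to: { phase: string; status: string };
  flowDefinitionId: string;
  routeBackReason?: string;
  requiredDownstreamGates: string[];
}

// One JSON object per line, following the JSON Lines convention.
function toJsonlLine(record: TransitionDiagnosticSketch): string {
  return JSON.stringify(record) + "\n";
}

// Append a record without touching earlier lines.
function appendDiagnostic(
  log: string,
  record: TransitionDiagnosticSketch
): string {
  return log + toJsonlLine(record);
}
```

Because a rejection only ever adds a line to this log, the state and handoff sidecars stay exactly as they were before the rejected attempt.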
|
|
186
|
+
|
|
187
|
+
The sidecar writer is the main helper:
|
|
188
|
+
|
|
189
|
+
```bash
npm run workflow:sidecar -- ensure-session ...
npm run workflow:sidecar -- current --format path
npm run workflow:sidecar -- record-agent-event ...
```
|
|
194
|
+
|
|
195
|
+
`ensure-session` creates or selects the workflow and writes `.flow-agents/current.json`. `current` resolves the active workflow path. `record-agent-event` lets parallel workers append progress to `agents/<agent-id>/events.jsonl` without guessing the slug.
|
|
196
|
+
|
|
197
|
+
This is the key answer to multi-agent coordination: agents should not rely on conversational memory for the current slug. The orchestrator resolves the active workflow and passes the path to delegates. Delegates append events. The orchestrator consolidates those events into root state, evidence, critique, and handoff.
|
|
198
|
+
|
|
199
|
+
```mermaid
flowchart LR
Current[current.json<br/>active workflow]
Root[Workflow root<br/>state/evidence/critique]
A1[worker A<br/>events.jsonl]
A2[worker B<br/>events.jsonl]
A3[reviewer<br/>events.jsonl]
Orch[orchestrator<br/>consolidates]

Current --> Root
Root --> A1
Root --> A2
Root --> A3
A1 --> Orch
A2 --> Orch
A3 --> Orch
Orch --> Root
```
|
|
217
|
+
|
|
218
|
+
This keeps the UX simple in a multi-agent session. The user does not manage slugs, artifact paths, or worker state. The orchestrator resolves the active workflow and gives each delegate the right place to report.
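
The consolidation step could look roughly like the sketch below. The event shape (`agentId`, `at`, `kind`, `message`) and the summary fields are assumptions for illustration; the real `events.jsonl` schema is not shown in this guide. Event logs are passed in as strings so the example stays free of file I/O.

```typescript
// Hypothetical event shape; the real events.jsonl schema may differ.
interface AgentEventSketch {
  agentId: string;
  at: string; // ISO 8601 UTC timestamp
  kind: "progress" | "finding" | "done";
  message: string;
}

interface RootSummarySketch {
  finishedAgents: string[];
  openFindings: string[];
}

// Only the orchestrator would call this; delegates just append events.
function consolidateEvents(
  eventLogs: { [agentId: string]: string }
): RootSummarySketch {
  const events: AgentEventSketch[] = [];
  for (const agentId of Object.keys(eventLogs)) {
    for (const line of eventLogs[agentId].split("\n")) {
      if (line.trim() !== "") {
        events.push(JSON.parse(line) as AgentEventSketch);
      }
    }
  }
  // Same-format UTC timestamps order correctly as plain strings,
  // so late-arriving events still land in the right place.
  events.sort((a, b) => (a.at < b.at ? -1 : a.at > b.at ? 1 : 0));
  const finishedAgents: string[] = [];
  const openFindings: string[] = [];
  for (const event of events) {
    if (event.kind === "done" && finishedAgents.indexOf(event.agentId) === -1) {
      finishedAgents.push(event.agentId);
    }
    if (event.kind === "finding") {
      openFindings.push(event.agentId + ": " + event.message);
    }
  }
  return { finishedAgents: finishedAgents, openFindings: openFindings };
}
```

Delegates never see or write the summary; they only contribute lines, which is why parallel work cannot clobber root state in this model.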
|
|
219
|
+
|
|
220
|
+
### Hooks
|
|
221
|
+
|
|
222
|
+
Hooks are the just-in-time guidance layer.
|
|
223
|
+
|
|
224
|
+
They inspect current workflow state and remind or block at the moment the agent is about to drift, stop early, or lose useful process state. The current hook surface includes:
|
|
225
|
+
|
|
226
|
+
- goal-fit checks before stopping
- ambient workflow steering at user prompt submit
- phase-transition steering after delegated subagent tool use
- runtime adapters for Codex, Claude Code, and Kiro
- strict modes for requiring sidecars or critique records
|
|
231
|
+
|
|
232
|
+
The reason hooks matter is that they keep guidance active even when the context window is crowded or the model is no longer tracking the original plan well.
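
A goal-fit check before stopping might be reduced to a decision function like the one below. This is a sketch under stated assumptions: the input fields, the `allow`/`remind`/`block` actions, and the way strict mode escalates a reminder into a block are illustrative and not taken from the actual hook scripts.

```typescript
// Hypothetical inputs; a real hook would read these from the sidecars.
interface StopCheckInput {
  acceptanceCriteria: { id: string; met: boolean }[];
  critiqueOpen: boolean; // an unresolved review finding exists
  hasCritiqueRecord: boolean; // a critique record was written at all
  strict: boolean; // strict mode: missing records block instead of remind
}

type HookDecision =
  | { action: "allow" }
  | { action: "remind"; message: string }
  | { action: "block"; message: string };

function goalFitBeforeStop(input: StopCheckInput): HookDecision {
  const unmet = input.acceptanceCriteria
    .filter((criterion) => !criterion.met)
    .map((criterion) => criterion.id);
  if (unmet.length > 0) {
    return {
      action: "block",
      message: "Unmet acceptance criteria: " + unmet.join(", "),
    };
  }
  if (input.critiqueOpen) {
    return { action: "block", message: "Critique findings are still open." };
  }
  if (!input.hasCritiqueRecord) {
    const message = "No critique record found for this workflow.";
    return input.strict
      ? { action: "block", message: message }
      : { action: "remind", message: message };
  }
  return { action: "allow" };
}
```

The decision depends only on recorded state, so it gives the same answer whether or not the model still remembers the plan.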
|
|
233
|
+
|
|
234
|
+
### Evals
|
|
235
|
+
|
|
236
|
+
Evals are how we keep this from becoming vibes.
|
|
237
|
+
|
|
238
|
+
The repo has tests for:
|
|
239
|
+
|
|
240
|
+
- source-tree validity
- workflow skill contracts
- artifact and sidecar schema behavior
- hook output for Codex, Claude Code, and Kiro
- bundle install smoke tests
- telemetry import/report/dashboard behavior
- context-map drift
- workflow sidecar races, including late parallel-agent events
|
|
248
|
+
|
|
249
|
+
The intended pattern is that every important workflow rule gets a test at the lowest useful layer: static checks for text contracts, integration checks for scripts and hooks, and behavioral evals for runtime agent behavior when practical.
|
|
250
|
+
|
|
251
|
+
### Packs
|
|
252
|
+
|
|
253
|
+
Packs keep the global surface understandable.
|
|
254
|
+
|
|
255
|
+
`packaging/packs.json` groups capabilities into sets such as:
|
|
256
|
+
|
|
257
|
+
- `core`
- `development`
- `knowledge`
- `aws`
- `experimental`
|
|
262
|
+
|
|
263
|
+
All-pack installs remain the default today. `FLOW_AGENTS_PACKS` lets users opt into a smaller installed surface, and domain depth belongs in packs so a global setup can be narrowed without changing the source bundle.
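
Pack selection can be pictured with the sketch below. The pack names come from the list above, but the comma-separated value format for `FLOW_AGENTS_PACKS` and the error on unknown names are assumptions; the guide does not specify how the variable is parsed.

```typescript
// Pack names are from the guide; the parsing rules here are assumed.
const ALL_PACKS = ["core", "development", "knowledge", "aws", "experimental"];

function selectPacks(envValue: string | undefined): string[] {
  // Unset or empty keeps today's default: install every pack.
  if (envValue === undefined || envValue.trim() === "") {
    return ALL_PACKS.slice();
  }
  const requested = envValue
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name !== "");
  const unknown = requested.filter((name) => ALL_PACKS.indexOf(name) === -1);
  if (unknown.length > 0) {
    throw new Error("Unknown pack(s): " + unknown.join(", "));
  }
  // Keep the canonical order and drop duplicates.
  return ALL_PACKS.filter((name) => requested.indexOf(name) !== -1);
}
```

In a Node.js script this would typically be called as `selectPacks(process.env.FLOW_AGENTS_PACKS)`, so the source bundle stays unchanged while the installed surface shrinks.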
|
|
264
|
+
|
|
265
|
+
## How A Request Flows
|
|
266
|
+
|
|
267
|
+
For a serious development task, the intended flow is:
|
|
268
|
+
|
|
269
|
+
```text
user request
-> rules establish boundaries
-> relevant skill loads
-> workflow session is created or resumed
-> planner defines acceptance criteria
-> workers implement scoped pieces
-> reviewers critique without fixing
-> verifiers collect evidence without fixing
-> hooks prevent premature stopping
-> evidence gate decides pass/fail/not verified
-> release readiness decides merge/release/deploy/hold
-> learning review routes improvements back into docs, evals, skills, or backlog
```
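
The evidence gate step in the flow above can be sketched as a small aggregation. The `PASS` and `FAIL` strings and the rule that an empty evidence list counts as `NOT_VERIFIED` are assumptions for illustration; only `NOT_VERIFIED` is named in this guide.

```typescript
// "PASS" and "FAIL" are assumed names; NOT_VERIFIED comes from the guide.
type GateVerdict = "PASS" | "FAIL" | "NOT_VERIFIED";

function evidenceGate(verdicts: GateVerdict[]): GateVerdict {
  // No evidence at all is a gap, not a pass.
  if (verdicts.length === 0) {
    return "NOT_VERIFIED";
  }
  // Any failing check fails the gate outright.
  if (verdicts.indexOf("FAIL") !== -1) {
    return "FAIL";
  }
  // Otherwise a single unverified check keeps the gate unverified.
  if (verdicts.indexOf("NOT_VERIFIED") !== -1) {
    return "NOT_VERIFIED";
  }
  return "PASS";
}
```

Treating "no proof yet" as its own outcome keeps an unverified result from being reported as either success or failure.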
|
|
283
|
+
|
|
284
|
+
|
|
285
|
+
## Example: Development Work
|
|
286
|
+
|
|
287
|
+
For development work (session, plan, execute, critique, verify, document, commit), the stage-by-stage walkthrough with example prompts and expected behavior lives in the [Workflow Usage Guide](workflow-usage-guide.md). The UX contract is the same as everywhere else in this guidebook: the user states the outcome, and the system supplies the path, the state, the checks, and the proof.
|
|
288
|
+
|
|
289
|
+
## Example: Meeting Or Sales Knowledge
|
|
290
|
+
|
|
291
|
+
User prompt:
|
|
292
|
+
|
|
293
|
+
```text
Prepare me for this customer call and remember the follow-ups afterward.
```
|
|
296
|
+
|
|
297
|
+
What Flow Agents should do:
|
|
298
|
+
|
|
299
|
+
| Step | What Happens | What The User Should Notice |
| --- | --- | --- |
| 1. Gather | Search notes, calendar, contacts, transcripts, and account context. | Relevant history appears without manual digging. |
| 2. Synthesize | Summarize people, open loops, decisions, risks, and likely agenda. | The prep is short but grounded. |
| 3. Capture | Turn meeting notes into durable knowledge afterward. | Follow-ups and decisions are not lost. |
| 4. Link | Connect people, orgs, prior calls, and commitments. | Future prep gets better. |
| 5. Learn | Route repeated gaps into docs, skills, or source integrations. | The system improves without hidden magic. |
|
|
306
|
+
|
|
307
|
+
The same UX principle applies: the user asks for the outcome; Flow Agents chooses the support structure.
|
|
308
|
+
|
|
309
|
+
## Example: Context Is Full
|
|
310
|
+
|
|
311
|
+
User prompt:
|
|
312
|
+
|
|
313
|
+
```text
Keep going. Do whatever is next.
```
|
|
316
|
+
|
|
317
|
+
What should happen:
|
|
318
|
+
|
|
319
|
+
```mermaid
flowchart LR
Prompt[Short user prompt]
Current[current.json]
State[state.json]
Handoff[handoff.json]
Evidence[evidence.json]
Action[Next concrete action]

Prompt --> Current --> State --> Handoff --> Evidence --> Action
```
|
|
330
|
+
|
|
331
|
+
The agent should not reconstruct the session from memory. It should resolve the active workflow from `.flow-agents/current.json`, read the sidecars, and continue from the recorded next action. If evidence is missing, it should say `NOT_VERIFIED` or keep working. If critique is open, it should fix or route the finding. If everything is clean, it can deliver.
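
The resume rules in this paragraph can be sketched as one decision function. The snapshot fields and the precedence (open critique first, then the recorded next action, then evidence gaps, then delivery) are assumptions; the paragraph lists the cases but does not fix an order.

```typescript
// Hypothetical snapshot assembled from the sidecars after resolving current.json.
interface ResumeSnapshot {
  nextAction: string | null; // recorded next action, if any
  missingEvidence: string[]; // checks with no proof yet
  openFindings: string[]; // unresolved critique findings
}

type ResumeStep =
  | { kind: "address_critique"; findings: string[] }
  | { kind: "continue"; action: string }
  | { kind: "report_not_verified"; gaps: string[] }
  | { kind: "deliver" };

function decideResumeStep(snapshot: ResumeSnapshot): ResumeStep {
  // Open critique is fixed or routed before anything else (assumed order).
  if (snapshot.openFindings.length > 0) {
    return { kind: "address_critique", findings: snapshot.openFindings };
  }
  // Otherwise keep working from the recorded next action.
  if (snapshot.nextAction !== null) {
    return { kind: "continue", action: snapshot.nextAction };
  }
  // Nothing left to do but proof is missing: say NOT_VERIFIED.
  if (snapshot.missingEvidence.length > 0) {
    return { kind: "report_not_verified", gaps: snapshot.missingEvidence };
  }
  // Everything is clean.
  return { kind: "deliver" };
}
```

A prompt as short as "Keep going" is enough because every branch is decided by recorded state rather than by what the model recalls.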
|
|
332
|
+
|
|
333
|
+
## One-Page Cheat Sheet
|
|
334
|
+
|
|
335
|
+
| If You Want | Say This | Flow Agents Should |
| --- | --- | --- |
| Shape an idea | “Research this and narrow focus.” | Find patterns, standards, risks, and the thinnest useful slice. |
| Build something | “Plan it out and start making it happen.” | Create workflow state, plan, execute, critique, verify, and document. |
| Continue autonomously | “Keep going until nothing is left.” | Use sidecars and hooks to find the next action without relying on chat memory. |
| Use parallel help | “Delegate critique while you continue.” | Give agents the workflow root and collect append-only events. |
| Prove it worked | “Do not stop until it is verified.” | Map acceptance criteria to checks, evidence, and remaining gaps. |
| Improve the system | “Validate this on Flow Agents itself.” | Use `dogfood-pass` to record evidence, critique, state, handoff, and routed learning. |
|
|
343
|
+
|
|
344
|
+
## Good UX Rules
|
|
345
|
+
|
|
346
|
+
- The first screen should answer “what can I ask this to do?”
- The user should not need to know whether a capability is a rule, skill, hook, power, or agent.
- The system should expose evidence and gaps in plain language.
- Advanced detail should be inspectable, not mandatory.
- The workflow should continue when safe, pause when evidence is missing, and explain exactly why.
- Defaults should be conservative, but the user should be able to opt into more autonomy.
- Long-running work should survive context compaction, restarts, and parallel agents.
|
|
353
|
+
|
|
354
|
+
## Design Rules I Used
|
|
355
|
+
|
|
356
|
+
When I changed the system, I interpreted the architecture this way:
|
|
357
|
+
|
|
358
|
+
- Keep the user-facing surface simple: the user asks for an outcome, not a workflow lecture.
- Keep durable state outside the model context so work survives compaction and long sessions.
- Keep root workflow state owned by the orchestrator so parallel workers do not overwrite each other.
- Let delegates append events instead of editing shared state directly.
- Prefer JSON Schema, OpenTelemetry, SARIF, MCP, OpenAPI, OAuth/OIDC, CommonMark, and other standards before inventing formats.
- Add Flow Agents-owned formats only when they are small, versioned, inspectable, and validated.
- Treat `NOT_VERIFIED` as a real result, not a failure to write a confident answer.
- Add hooks where timing matters and skills where procedure matters.
- Add evals for behavior that we expect future agents to preserve.
- Promote useful working artifacts into durable docs once the pattern matters beyond one task.
|
|
368
|
+
|
|
369
|
+
## What Good Looks Like
|
|
370
|
+
|
|
371
|
+
Flow Agents is working when:
|
|
372
|
+
|
|
373
|
+
- an agent can resume the right task without remembering the slug from chat
- parallel agents can report progress without corrupting shared state
- hooks nudge the agent before it stops short
- verification evidence maps back to the original user outcome
- docs explain the workflow without requiring code spelunking
- repeated corrections become tests, skills, rules, docs, or backlog items
- the same bundle works across Codex, Claude Code, and Kiro
|
|
380
|
+
|
|
381
|
+
## Where This Still Needs To Grow
|
|
382
|
+
|
|
383
|
+
The next useful improvements are:
|
|
384
|
+
|
|
385
|
+
- stronger live behavioral evals that prove hook output changes agent behavior across every runtime, not only that hooks emit guidance
- richer guide examples for non-code knowledge workflows
- clearer pack selection guidance for global installs
- a Veritas advisory-readiness spike through the optional governance adapter boundary
- a self-validation loop that automatically proposes docs, eval, or skill updates after repeated workflow friction
|
|
390
|
+
|
|
391
|
+
The north star is not more ceremony. It is reliable autonomy: the system should quietly keep the agent on track while the user stays focused on the work they actually wanted done.
|