npm - @kontourai/flow-agents - Versions diffs - 0.1.1 - Mend

@kontourai/flow-agents 0.1.1

This diff shows the changes between publicly released versions of the package as they appear in the public registry. It is provided for informational purposes only.

Files changed (418)

package/.githooks/pre-push +11 -0
package/.github/workflows/ci.yml +210 -0
package/.github/workflows/docs-pages.yml +52 -0
package/.github/workflows/publish-npm.yml +104 -0
package/AGENTS.md +26 -0
package/CHANGELOG.md +66 -0
package/CODE_OF_CONDUCT.md +25 -0
package/CONTEXT.md +300 -0
package/CONTRIBUTING.md +44 -0
package/LICENSE +201 -0
package/README.md +129 -0
package/SECURITY.md +33 -0
package/agent-cards/dev.json +19 -0
package/agents/dev.json +127 -0
package/agents/tool-code-reviewer.json +61 -0
package/agents/tool-dependencies-updater.json +118 -0
package/agents/tool-explore-config.json +92 -0
package/agents/tool-explore-deps.json +92 -0
package/agents/tool-explore-entry.json +92 -0
package/agents/tool-explore-patterns.json +92 -0
package/agents/tool-explore-structure.json +92 -0
package/agents/tool-explore-tests.json +92 -0
package/agents/tool-planner.json +57 -0
package/agents/tool-playwright.json +145 -0
package/agents/tool-security-reviewer.json +56 -0
package/agents/tool-verifier.json +61 -0
package/agents/tool-worker.json +58 -0
package/build/src/cli/console-learning-projection.js +123 -0
package/build/src/cli/docs-preview.js +39 -0
package/build/src/cli/effective-backlog-settings.js +102 -0
package/build/src/cli/export-bookmarks.js +38 -0
package/build/src/cli/fixture-retirement-audit.js +140 -0
package/build/src/cli/flow-kit.js +138 -0
package/build/src/cli/import-bookmarks.js +50 -0
package/build/src/cli/init.js +239 -0
package/build/src/cli/instinct-cli.js +93 -0
package/build/src/cli/promote-workflow-artifact.js +63 -0
package/build/src/cli/publish-change-helper.js +154 -0
package/build/src/cli/pull-work-provider.js +469 -0
package/build/src/cli/runtime-adapter.js +23 -0
package/build/src/cli/telemetry-doctor.js +221 -0
package/build/src/cli/usage-feedback.js +443 -0
package/build/src/cli/validate-hook-influence.js +152 -0
package/build/src/cli/validate-source-tree.js +31 -0
package/build/src/cli/validate-workflow-artifacts.js +486 -0
package/build/src/cli/veritas-governance.js +262 -0
package/build/src/cli/workflow-artifact-cleanup-audit.js +272 -0
package/build/src/cli/workflow-sidecar.js +816 -0
package/build/src/cli.js +89 -0
package/build/src/flow-kit/validate.js +75 -0
package/build/src/lib/args.js +45 -0
package/build/src/lib/fs.js +62 -0
package/build/src/lib/workflow-learning-projection.js +334 -0
package/build/src/runtime-adapters.js +146 -0
package/build/src/tools/build-universal-bundles.js +397 -0
package/build/src/tools/common.js +56 -0
package/build/src/tools/filter-installed-packs.js +132 -0
package/build/src/tools/generate-context-map.js +198 -0
package/build/src/tools/validate-package.js +64 -0
package/build/src/tools/validate-source-tree.js +622 -0
package/console.telemetry.json +176 -0
package/context/base-rules.md +17 -0
package/context/code-review-standards.md +62 -0
package/context/coding-standards.md +42 -0
package/context/common/orchestrators.md +12 -0
package/context/common/subagents.md +28 -0
package/context/contracts/artifact-contract.md +182 -0
package/context/contracts/builder-kit-workflow-state-contract.md +319 -0
package/context/contracts/delivery-contract.md +69 -0
package/context/contracts/execution-contract.md +53 -0
package/context/contracts/governance-adapter-contract.md +67 -0
package/context/contracts/planning-contract.md +85 -0
package/context/contracts/review-contract.md +104 -0
package/context/contracts/sandbox-policy.md +52 -0
package/context/contracts/verification-contract.md +134 -0
package/context/contracts/work-item-contract.md +215 -0
package/context/deferred/demo-mode.md +33 -0
package/context/deferred/languages/go.md +31 -0
package/context/deferred/languages/python.md +31 -0
package/context/deferred/languages/typescript.md +34 -0
package/context/deferred/parallelization.md +35 -0
package/context/deferred/worktree-isolation.md +24 -0
package/context/development-workflow.md +50 -0
package/context/scripts/context-budget/budget-scan.sh +166 -0
package/context/scripts/detect-tools.sh +3 -0
package/context/scripts/discover-agents.sh +28 -0
package/context/scripts/git-status.sh +49 -0
package/context/scripts/hooks/config-protection.js +79 -0
package/context/scripts/hooks/desktop-notify.sh +39 -0
package/context/scripts/hooks/governance-audit.sh +135 -0
package/context/scripts/hooks/lib/audit-transport.sh +40 -0
package/context/scripts/hooks/lib/hook-flags.js +49 -0
package/context/scripts/hooks/lib/patterns.sh +57 -0
package/context/scripts/hooks/lib/resolve-formatter.js +80 -0
package/context/scripts/hooks/post-edit-accumulator.js +66 -0
package/context/scripts/hooks/pre-commit-quality.js +194 -0
package/context/scripts/hooks/quality-gate.js +93 -0
package/context/scripts/hooks/report-only-guard.js +21 -0
package/context/scripts/hooks/run-hook.js +136 -0
package/context/scripts/hooks/stop-format-typecheck.js +141 -0
package/context/scripts/hooks/stop-goal-fit.js +337 -0
package/context/scripts/hooks/workflow-steering.js +250 -0
package/context/scripts/telemetry/console-presets.sh +14 -0
package/context/scripts/telemetry/install-console-config.sh +214 -0
package/context/scripts/telemetry/lib/config.sh +85 -0
package/context/scripts/telemetry/lib/enrich.sh +115 -0
package/context/scripts/telemetry/lib/redact.sh +22 -0
package/context/scripts/telemetry/lib/session.sh +63 -0
package/context/scripts/telemetry/lib/transport.sh +183 -0
package/context/scripts/telemetry/lib/usage.sh +29 -0
package/context/scripts/telemetry/sync-agents.sh +173 -0
package/context/scripts/telemetry/telemetry.conf +23 -0
package/context/scripts/telemetry/telemetry.sh +387 -0
package/context/scripts/validate-package.sh +89 -0
package/context/settings/backlog-provider-settings.json +54 -0
package/context/templates/core/identity.md +26 -0
package/context/templates/core/user.md +15 -0
package/docs/_config.yml +15 -0
package/docs/_layouts/default.html +87 -0
package/docs/adr/0001-flow-agents-consumes-flow.md +77 -0
package/docs/adr/0002-flow-kits-as-extension-unit.md +13 -0
package/docs/adr/0003-flow-agents-coordinates-kits-and-adapters.md +13 -0
package/docs/adr/0004-gates-expect-surface-claims.md +15 -0
package/docs/adr/0005-kubernetes-inspired-resource-contracts.md +48 -0
package/docs/adr/0006-typescript-first-source-policy.md +98 -0
package/docs/agent-system-guidebook.md +391 -0
package/docs/agent-usage-feedback-loop.md +351 -0
package/docs/assets/favicon.svg +13 -0
package/docs/assets/og-image.png +0 -0
package/docs/assets/site.css +774 -0
package/docs/assets/site.js +139 -0
package/docs/configurable-workflow-routing.md +174 -0
package/docs/context-map.md +145 -0
package/docs/developer-architecture.md +145 -0
package/docs/developer-hook-setup.md +61 -0
package/docs/fixture-ownership.md +44 -0
package/docs/flow-kit-repository-contract.md +180 -0
package/docs/index.md +129 -0
package/docs/kontour-resource-contract.md +358 -0
package/docs/migrations.md +64 -0
package/docs/north-star.md +322 -0
package/docs/operating-layers.md +110 -0
package/docs/repository-structure.md +132 -0
package/docs/sandbox-policy.md +56 -0
package/docs/skills-map.md +203 -0
package/docs/standards-register.md +96 -0
package/docs/veritas-integration.md +165 -0
package/docs/work-item-adapters.md +72 -0
package/docs/workflow-artifact-lifecycle.md +141 -0
package/docs/workflow-eval-strategy.md +295 -0
package/docs/workflow-shared-contracts.md +51 -0
package/docs/workflow-usage-guide.md +443 -0
package/evals/ARCHITECTURE.md +143 -0
package/evals/CONVENTIONS.md +58 -0
package/evals/README.md +128 -0
package/evals/acceptance/run.sh +29 -0
package/evals/acceptance/test_claude_harness.sh +242 -0
package/evals/acceptance/test_codex_harness.sh +108 -0
package/evals/acceptance/test_kiro_harness.sh +128 -0
package/evals/cases/dev/404.html +97 -0
package/evals/cases/dev/code-review.yaml +44 -0
package/evals/cases/dev/dashboard.html +300 -0
package/evals/cases/dev/deliver.yaml +66 -0
package/evals/cases/dev/dependency-update.yaml +16 -0
package/evals/cases/dev/explore.yaml +20 -0
package/evals/cases/dev/index.html +370 -0
package/evals/cases/dev/package-lock.json +28 -0
package/evals/cases/dev/package.json +16 -0
package/evals/cases/dev/plan-work.yaml +20 -0
package/evals/cases/dev/promptfooconfig.yaml +666 -0
package/evals/cases/dev/search-first.yaml +20 -0
package/evals/cases/dev/tdd-workflow.yaml +48 -0
package/evals/cases/dev/verify-work.yaml +44 -0
package/evals/cases/dev/workflow.yaml +34 -0
package/evals/ci/run-baseline.sh +283 -0
package/evals/fixtures/backlog-provider-settings/global-default.json +44 -0
package/evals/fixtures/backlog-provider-settings/project-override.json +53 -0
package/evals/fixtures/builder-kit-workflow-state/baseline-freshness-resolution-hint.json +139 -0
package/evals/fixtures/builder-kit-workflow-state/direct-primitive-stop.json +59 -0
package/evals/fixtures/builder-kit-workflow-state/empty-board-route-shape.json +55 -0
package/evals/fixtures/builder-kit-workflow-state/happy-path.json +71 -0
package/evals/fixtures/builder-kit-workflow-state/mid-work-resume.json +80 -0
package/evals/fixtures/builder-kit-workflow-state/missing-prestep-recovery.json +65 -0
package/evals/fixtures/builder-kit-workflow-state/product-build-chaining.json +60 -0
package/evals/fixtures/builder-kit-workflow-state/stale-continuation-requires-new-probe.json +57 -0
package/evals/fixtures/console-learning-projection/artifacts/console-learning-correction/learning.json +50 -0
package/evals/fixtures/console-learning-projection/artifacts/console-learning-open-route/learning.json +41 -0
package/evals/fixtures/flow-kit-repository/invalid-absolute-path/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-asset-section/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-asset-section/kit.json +11 -0
package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/kit.json +9 -0
package/evals/fixtures/flow-kit-repository/invalid-id/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-id/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-malformed-json/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-missing-flow/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-missing-id/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-missing-id/kit.json +7 -0
package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/kit.json +7 -0
package/evals/fixtures/flow-kit-repository/invalid-name/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-name/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-schema-version/flows/review.flow.json +6 -0
package/evals/fixtures/flow-kit-repository/invalid-schema-version/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/invalid-traversal/kit.json +8 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/adapters/example.json +3 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/assets/example.txt +1 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/docs/README.md +3 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/flows/runtime.flow.json +26 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-evals/example.json +3 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-skills/mixed/SKILL.md +3 -0
package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit.json +44 -0
package/evals/fixtures/flow-kit-repository/valid-local-kit/docs/README.md +3 -0
package/evals/fixtures/flow-kit-repository/valid-local-kit/flows/review.flow.json +26 -0
package/evals/fixtures/flow-kit-repository/valid-local-kit/kit.json +20 -0
package/evals/fixtures/hook-influence/cases.json +336 -0
package/evals/fixtures/pull-work-provider/github-issues.json +170 -0
package/evals/fixtures/pull-work-wip-shepherding/global-wip-informs.json +43 -0
package/evals/fixtures/pull-work-wip-shepherding/personal-wip-blocks.json +42 -0
package/evals/fixtures/surface-trust/accepted-claim-trust-report.json +31 -0
package/evals/fixtures/surface-trust/artifact-absent.json +19 -0
package/evals/fixtures/surface-trust/integrity-mismatch-trust-report.json +32 -0
package/evals/fixtures/surface-trust/missing-authority-trust-report.json +27 -0
package/evals/fixtures/surface-trust/provider-absent.json +19 -0
package/evals/fixtures/surface-trust/rejected-claim-trust-report.json +30 -0
package/evals/fixtures/surface-trust/stale-claim-trust-snapshot.json +31 -0
package/evals/fixtures/usage-feedback/sample-full.jsonl +11 -0
package/evals/fixtures/usage-feedback/sample-outcomes.jsonl +1 -0
package/evals/fixtures/veritas-governance-adapter/fake-veritas-pass.sh +18 -0
package/evals/fixtures/veritas-governance-adapter/fake-veritas-secret-fail.sh +10 -0
package/evals/fixtures/veritas-governance-adapter/fake-veritas-unconfigured.sh +4 -0
package/evals/integration/test_bundle_install.sh +541 -0
package/evals/integration/test_console_learning_projection.sh +192 -0
package/evals/integration/test_context_map.sh +65 -0
package/evals/integration/test_effective_backlog_settings.sh +58 -0
package/evals/integration/test_fixture_retirement_audit.sh +58 -0
package/evals/integration/test_flow_agents_statusline.sh +93 -0
package/evals/integration/test_flow_kit_repository.sh +90 -0
package/evals/integration/test_goal_fit_hook.sh +482 -0
package/evals/integration/test_hook_category_behaviors.sh +190 -0
package/evals/integration/test_hook_influence_cases.sh +69 -0
package/evals/integration/test_local_flow_kit_install.sh +145 -0
package/evals/integration/test_publish_change_helper.sh +176 -0
package/evals/integration/test_pull_work_provider.sh +140 -0
package/evals/integration/test_runtime_adapter_activation.sh +106 -0
package/evals/integration/test_telemetry.sh +485 -0
package/evals/integration/test_telemetry_doctor.sh +193 -0
package/evals/integration/test_usage_feedback_dashboard.sh +169 -0
package/evals/integration/test_usage_feedback_global.sh +117 -0
package/evals/integration/test_usage_feedback_import.sh +227 -0
package/evals/integration/test_usage_feedback_outcomes.sh +165 -0
package/evals/integration/test_usage_feedback_report.sh +263 -0
package/evals/integration/test_veritas_governance_adapter.sh +235 -0
package/evals/integration/test_workflow_artifact_cleanup_audit.sh +287 -0
package/evals/integration/test_workflow_artifacts.sh +1247 -0
package/evals/integration/test_workflow_sidecar_writer.sh +2112 -0
package/evals/integration/test_workflow_steering_hook.sh +337 -0
package/evals/lib/assertions/delegated-to.js +40 -0
package/evals/lib/assertions/max-tool-calls.js +15 -0
package/evals/lib/assertions/no-write-tools.js +27 -0
package/evals/lib/assertions/pass-at-k.js +39 -0
package/evals/lib/assertions/telemetry-utils.js +105 -0
package/evals/lib/assertions/tool-called.js +39 -0
package/evals/lib/assertions/verify-after-fix.js +61 -0
package/evals/lib/claude-judge.sh +40 -0
package/evals/lib/claude-provider.sh +74 -0
package/evals/lib/codex-judge.sh +39 -0
package/evals/lib/codex-provider.sh +81 -0
package/evals/lib/eval-dev.sh +5 -0
package/evals/lib/eval-judge.sh +22 -0
package/evals/lib/eval-provider.sh +26 -0
package/evals/lib/eval-report.sh +73 -0
package/evals/lib/kiro-dev.sh +4 -0
package/evals/lib/kiro-judge.sh +17 -0
package/evals/lib/kiro-provider.sh +62 -0
package/evals/lib/node.sh +111 -0
package/evals/promptfooconfig.yaml +70 -0
package/evals/run.sh +309 -0
package/evals/static/test_evidence_refs.sh +141 -0
package/evals/static/test_package.sh +407 -0
package/evals/static/test_repo_hooks.sh +68 -0
package/evals/static/test_universal_bundles.sh +274 -0
package/evals/static/test_workflow_skills.sh +1207 -0
package/install.sh +64 -0
package/integrations/veritas/flow-agents.adapter.json +138 -0
package/integrations/veritas/flow-agents.authority-settings.json +26 -0
package/integrations/veritas/flow-agents.repo-standards.json +82 -0
package/kits/builder/flows/build.flow.json +218 -0
package/kits/builder/flows/shape.flow.json +127 -0
package/kits/builder/kit.json +19 -0
package/kits/catalog.json +11 -0
package/package.json +130 -0
package/packaging/README.md +60 -0
package/packaging/manifest.json +173 -0
package/packaging/packs.json +69 -0
package/powers/dependency-checker/POWER.md +20 -0
package/powers/dependency-checker/mcp.json +20 -0
package/powers/playwright/POWER.md +25 -0
package/powers/playwright/mcp.json +12 -0
package/prompts/code-audit.md +123 -0
package/prompts/kcommit.md +88 -0
package/schemas/backlog-provider-settings.schema.json +138 -0
package/schemas/workflow-acceptance.schema.json +216 -0
package/schemas/workflow-critique.schema.json +113 -0
package/schemas/workflow-evidence.schema.json +357 -0
package/schemas/workflow-handoff.schema.json +52 -0
package/schemas/workflow-learning.schema.json +223 -0
package/schemas/workflow-release.schema.json +172 -0
package/schemas/workflow-state.schema.json +80 -0
package/scripts/README.md +111 -0
package/scripts/build-universal-bundles.js +3 -0
package/scripts/check-content-boundary.cjs +99 -0
package/scripts/context-budget/budget-scan.sh +166 -0
package/scripts/detect-tools.sh +3 -0
package/scripts/discover-agents.sh +28 -0
package/scripts/effective-backlog-settings.js +2 -0
package/scripts/filter-installed-packs.js +2 -0
package/scripts/flow-kit.js +2 -0
package/scripts/generate-context-map.js +2 -0
package/scripts/git-status.sh +49 -0
package/scripts/hooks/claude-hook-adapter.js +174 -0
package/scripts/hooks/claude-telemetry-hook.js +115 -0
package/scripts/hooks/codex-hook-adapter.js +176 -0
package/scripts/hooks/codex-telemetry-hook.js +95 -0
package/scripts/hooks/config-protection.js +79 -0
package/scripts/hooks/desktop-notify.sh +39 -0
package/scripts/hooks/governance-audit.sh +135 -0
package/scripts/hooks/lib/audit-transport.sh +40 -0
package/scripts/hooks/lib/hook-flags.js +49 -0
package/scripts/hooks/lib/patterns.sh +57 -0
package/scripts/hooks/lib/resolve-formatter.js +80 -0
package/scripts/hooks/post-edit-accumulator.js +66 -0
package/scripts/hooks/pre-commit-quality.js +194 -0
package/scripts/hooks/quality-gate.js +93 -0
package/scripts/hooks/report-only-guard.js +21 -0
package/scripts/hooks/run-hook.js +136 -0
package/scripts/hooks/stop-format-typecheck.js +141 -0
package/scripts/hooks/stop-goal-fit.js +337 -0
package/scripts/hooks/workflow-steering.js +250 -0
package/scripts/install-codex-home.sh +106 -0
package/scripts/package.json +3 -0
package/scripts/promote-workflow-artifact.js +2 -0
package/scripts/publish-change-helper.js +2 -0
package/scripts/pull-work-provider.js +2 -0
package/scripts/setup-repo-hooks.sh +8 -0
package/scripts/statusline/flow-agents-statusline.js +157 -0
package/scripts/telemetry/console-presets.sh +14 -0
package/scripts/telemetry/install-console-config.sh +214 -0
package/scripts/telemetry/lib/config.sh +85 -0
package/scripts/telemetry/lib/enrich.sh +115 -0
package/scripts/telemetry/lib/redact.sh +22 -0
package/scripts/telemetry/lib/session.sh +63 -0
package/scripts/telemetry/lib/transport.sh +183 -0
package/scripts/telemetry/lib/usage.sh +29 -0
package/scripts/telemetry/sync-agents.sh +173 -0
package/scripts/telemetry/telemetry.conf +23 -0
package/scripts/telemetry/telemetry.sh +387 -0
package/scripts/usage-feedback.js +2 -0
package/scripts/validate-hook-influence-cases.js +2 -0
package/scripts/validate-package.sh +89 -0
package/scripts/validate-source-tree.js +9 -0
package/skills/agentic-engineering/SKILL.md +62 -0
package/skills/browser-test/SKILL.md +51 -0
package/skills/builder-shape/SKILL.md +76 -0
package/skills/context-budget/SKILL.md +40 -0
package/skills/deliver/SKILL.md +241 -0
package/skills/dependency-update/SKILL.md +68 -0
package/skills/design-probe/SKILL.md +107 -0
package/skills/eval-rebuild/SKILL.md +39 -0
package/skills/evidence-gate/SKILL.md +186 -0
package/skills/execute-plan/SKILL.md +110 -0
package/skills/explore/SKILL.md +137 -0
package/skills/feedback-loop/SKILL.md +87 -0
package/skills/fix-bug/SKILL.md +133 -0
package/skills/frontend-design/SKILL.md +80 -0
package/skills/github-cli/SKILL.md +63 -0
package/skills/idea-to-backlog/SKILL.md +267 -0
package/skills/knowledge-capture/SKILL.md +55 -0
package/skills/learning-review/SKILL.md +115 -0
package/skills/pickup-probe/SKILL.md +114 -0
package/skills/plan-work/SKILL.md +176 -0
package/skills/pull-work/SKILL.md +309 -0
package/skills/release-readiness/SKILL.md +121 -0
package/skills/review-work/SKILL.md +161 -0
package/skills/search-first/SKILL.md +66 -0
package/skills/tdd-workflow/SKILL.md +140 -0
package/skills/verify-work/SKILL.md +109 -0
package/src/cli/console-learning-projection.ts +140 -0
package/src/cli/effective-backlog-settings.ts +99 -0
package/src/cli/fixture-retirement-audit.ts +154 -0
package/src/cli/flow-kit.ts +139 -0
package/src/cli/init.ts +248 -0
package/src/cli/promote-workflow-artifact.ts +64 -0
package/src/cli/publish-change-helper.ts +143 -0
package/src/cli/pull-work-provider.ts +481 -0
package/src/cli/runtime-adapter.ts +24 -0
package/src/cli/telemetry-doctor.ts +243 -0
package/src/cli/usage-feedback.ts +418 -0
package/src/cli/validate-hook-influence.ts +119 -0
package/src/cli/validate-source-tree.ts +30 -0
package/src/cli/validate-workflow-artifacts.ts +411 -0
package/src/cli/veritas-governance.ts +322 -0
package/src/cli/workflow-artifact-cleanup-audit.ts +281 -0
package/src/cli/workflow-sidecar.ts +676 -0
package/src/cli.ts +95 -0
package/src/flow-kit/validate.ts +74 -0
package/src/lib/args.ts +43 -0
package/src/lib/fs.ts +62 -0
package/src/lib/workflow-learning-projection.ts +491 -0
package/src/runtime-adapters.ts +154 -0
package/src/tools/build-universal-bundles.ts +366 -0
package/src/tools/common.ts +61 -0
package/src/tools/filter-installed-packs.ts +129 -0
package/src/tools/generate-context-map.ts +199 -0
package/src/tools/validate-package.ts +57 -0
package/src/tools/validate-source-tree.ts +488 -0
package/tsconfig.json +19 -0
package/veritas.claims.json +6 -0

package/skills/builder-shape/SKILL.md ADDED

@@ -0,0 +1,76 @@
+---
+name: "builder-shape"
+description: "Invoke Builder Kit shape from a raw idea or the current conversation context without requiring the user to name idea-to-backlog. Delegates shaping to idea-to-backlog, records the Builder Kit Flow Definition link, and stops at the backlog gate unless GitHub issue sync is explicitly requested."
+---
+# Builder Shape
+Invoke the Builder Kit `shape` flow for raw product ideas, vague build goals, current conversation context, PRD-like concepts, spikes, prototypes, or work that needs alignment before implementation.
+## Contract
+- Product surface: let the user ask for "Builder Kit shape", "builder shape", or "shape this with Builder Kit" without naming `idea-to-backlog`.
+- Proactive suggestion: when a user starts planning a feature, product, PRD, roadmap item, or vague build idea without naming a workflow, briefly suggest Builder Kit shape as the structured path before implementation. Phrase it as an option, not a forced gate, unless the request is too ambiguous to plan responsibly.
+- Delegation: use `skills/idea-to-backlog/SKILL.md` as the shaping primitive. Do not duplicate or replace its workflow, artifact contract, issue shape, or gate rules.
+- Product-level auto-guidance: when the user invokes Builder Kit shape, guide them through `design-probe` alignment and then the `idea-to-backlog` workflow directly; do not require them to type `design-probe` or `idea-to-backlog` as additional skill names.
+- do not require them to type `idea-to-backlog`; Builder Kit shape owns the user-facing route into that primitive.
+- Flow reference: link every Builder Kit shape artifact to the Builder Kit Flow Definition at `kits/builder/flows/shape.flow.json`.
+- Input: start from the user's raw idea, pasted notes, or the current conversation context.
+- Probe/alignment: when the idea, user outcome, constraints, non-goals, success signal, risk, or bundle relationship is unclear, run `design-probe` style alignment before continuing.
+- Default stop: stop at the backlog gate by default. Do not create GitHub issues, sync to a project, or hand off to `pull-work` unless the user explicitly asks for that next step.
+- Boundary: do not run Builder Kit build execution, remote kit install, package extraction, downstream delivery workflows, `plan-work`, `execute-plan`, `review-work`, `verify-work`, `evidence-gate`, or release workflows from this invocation.
+- Compatibility: Direct `idea-to-backlog` usage remains valid and should behave exactly as described in `skills/idea-to-backlog/SKILL.md`.
+- Primitive recovery: if a user invokes `idea-to-backlog` or another primitive with missing shaping context and appears to want the product flow, explain that Builder Kit shape is the entry point and offer to route there.
+## Invocation
+Use this skill when the user says things like:
+- `Use Builder Kit shape for this idea: ...`
+- `Builder shape the current conversation into backlog candidates.`
+- `Shape this with Builder Kit, but do not create issues yet.`
+- `Run Builder Kit shape and sync GitHub issues only after I confirm.`
+When activated:
+1. Read `skills/idea-to-backlog/SKILL.md`.
+2. State that Builder Kit shape delegates to `idea-to-backlog` and uses `kits/builder/flows/shape.flow.json`.
+3. Gather the raw idea or current conversation context.
+4. If needed, use `design-probe`: ask one Probe/alignment question at a time before shaping. Prefer questions that clarify user outcome, constraints, non-goals, success criteria, risk, or whether bundled ideas truly belong together.
+5. Create or update the standard `.flow-agents/<slug>/<slug>--idea-to-backlog.md` artifact using the `idea-to-backlog` artifact contract.
+6. Add a `builder_kit_shape` or equivalent note in the artifact that links to `kits/builder/flows/shape.flow.json` and records that the product-level Builder Kit shape surface was used.
+7. Stop at `next_gate: Backlog Gate` unless the user explicitly requested GitHub issue sync.
+8. If the user asked for guided Builder Kit continuation, record the expected next step as `pull-work` after issue sync or backlog approval; otherwise record manual mode and stop.
+## Artifact Requirements
+The artifact must keep the standard `idea-to-backlog` sections:
+- `source_ideas`
+- `idea_inventory`
+- `slice_candidates`
+- `bundle_justification`
+- `dependency_map`
+- `phase`
+- `decisions`
+- `opportunity_briefs`
+- `shaped_work`
+- `risk_release_notes`
+- `backlog_links`
+- `parked_or_rejected`
+- `open_questions`
+- `next_gate`
+For Builder Kit shape invocations, also include:
+- Builder Kit Flow Definition: `kits/builder/flows/shape.flow.json`
+- Explicit issue-sync status, such as `not_requested`, `requested`, or `completed`
+- A backlog-gate decision that says whether the workflow stopped before issue creation
+## GitHub Issue Sync
+Issue sync is explicit-only.
+- If the user did not ask to create or sync issues, set `backlog_links` to `not_requested` or an empty recorded status and stop at the backlog gate.
+- If the user asks to create or sync issues, follow the GitHub issue rules in `skills/idea-to-backlog/SKILL.md`.
+- If provider details are missing, ask for them instead of assuming a GitHub repository, project, labels, milestone, or assignee.
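
The explicit-only issue-sync rule and the default backlog-gate stop described in this skill can be sketched as a small decision function. This is a minimal illustration, not code from the package: the names `ShapeRequest`, `BacklogGateRecord`, and `decideBacklogGate`, and every field name, are hypothetical. Only the status values (`not_requested`, `requested`, `completed`), the flow path, and the `pull-work` next step come from the skill text.

```typescript
// Hypothetical sketch: type and field names are illustrative and are not
// taken from the flow-agents package.
type IssueSyncStatus = "not_requested" | "requested" | "completed";

interface ShapeRequest {
  issueSyncRequested: boolean; // user explicitly asked to create or sync issues
  issuesSynced: boolean; // the sync has already been carried out
  guidedContinuation: boolean; // user asked for guided Builder Kit continuation
}

interface BacklogGateRecord {
  flowDefinition: string;
  issueSyncStatus: IssueSyncStatus;
  stopAtBacklogGate: boolean;
  stoppedBeforeIssueCreation: boolean;
  expectedNextStep: "pull-work" | "manual";
}

function decideBacklogGate(request: ShapeRequest): BacklogGateRecord {
  // Sync is explicit-only: without a request the status never advances.
  let issueSyncStatus: IssueSyncStatus = "not_requested";
  if (request.issueSyncRequested) {
    issueSyncStatus = request.issuesSynced ? "completed" : "requested";
  }
  return {
    flowDefinition: "kits/builder/flows/shape.flow.json",
    issueSyncStatus,
    // Default stop: only an explicit sync request moves past the backlog gate.
    stopAtBacklogGate: !request.issueSyncRequested,
    stoppedBeforeIssueCreation: issueSyncStatus !== "completed",
    // Guided continuation records pull-work; otherwise manual mode.
    expectedNextStep: request.guidedContinuation ? "pull-work" : "manual",
  };
}

// Example: a plain "shape this, but do not create issues yet" request.
const exampleRecord = decideBacklogGate({
  issueSyncRequested: false,
  issuesSynced: false,
  guidedContinuation: false,
});
console.log(exampleRecord.issueSyncStatus, exampleRecord.stopAtBacklogGate);
```

Keeping the status as a closed union makes it harder for an artifact writer to record an ad hoc value, which seems to match the skill's intent that the sync status be explicit.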

package/skills/context-budget/SKILL.md ADDED

@@ -0,0 +1,40 @@
+---
+name: context-budget
+description: >-
+  Audit token overhead across Flow Agents bundles — agent specs, skills, context files,
+  MCP servers. Produces budget report with per-component breakdown and optimization suggestions.
+---
+# Context Budget Audit
+Scan installed Flow Agents bundles and estimate token overhead per component. Produces a structured budget report with optimization suggestions.
+## Workflow
+### Phase 1: Inventory
+Run `bash context/scripts/context-budget/budget-scan.sh` to discover all loaded components. The script walks `~/.flow-agents/` and outputs JSON with per-bundle breakdowns.
+### Phase 2: Classify
+Bucket each component from the scan output:
+- **Always loaded**: context files matching package dependency patterns, skill frontmatter descriptions
+- **On-demand**: full SKILL.md body (loaded on skill activation), deferred context (`context/deferred/`)
+- **Per-agent**: agent-spec systemPrompt, agent-specific MCP servers
+### Phase 3: Detect Issues
+Flag problems from the scan data:
+- Heavy agent specs: systemPrompt > 200 lines
+- Bloated skill descriptions: frontmatter description > 30 words
+- MCP over-subscription: agent with > 10 MCP servers or > 50 total tools
+- Context bloat: any single context file > 100 lines
+- Deferred candidates: context files > 2% of model context that aren't safety/routing
+### Phase 4: Report
+Structured output:
+- Per-bundle breakdown (tokens by category)
+- Per-agent breakdown (what each agent loads at spawn)
+- Top-N optimization suggestions ranked by token savings
+- Use the `--verbose` flag on `budget-scan.sh` for per-file token counts
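
The Phase 3 thresholds above could be applied to scan data roughly as follows. This is a sketch under stated assumptions: the skill does not show the JSON layout that `budget-scan.sh` emits, so the `ScanResult` shape, the `detectIssues` function, and the issue labels are all hypothetical. Only the numeric thresholds (200 lines, 30 words, 10 MCP servers, 50 tools, 100 lines, 2% of model context) come from the skill text.

```typescript
// Hypothetical input shape: the real budget-scan.sh JSON layout is not shown
// in the skill, so these interfaces are assumptions made for illustration.
interface AgentSpec { name: string; systemPromptLines: number; mcpServers: number; totalTools: number; }
interface SkillMeta { name: string; description: string; }
interface ContextFile { path: string; lines: number; tokens: number; safetyOrRouting: boolean; }
interface ScanResult { agents: AgentSpec[]; skills: SkillMeta[]; contextFiles: ContextFile[]; }

// Count whitespace-separated words; an empty string counts as zero words.
const countWords = (text: string): number =>
  text.trim().split(/\s+/).filter(Boolean).length;

function detectIssues(scan: ScanResult, modelContextTokens: number): string[] {
  const issues: string[] = [];
  for (const agent of scan.agents) {
    // Heavy agent specs: systemPrompt > 200 lines.
    if (agent.systemPromptLines > 200) issues.push(`heavy-agent-spec: ${agent.name}`);
    // MCP over-subscription: > 10 MCP servers or > 50 total tools.
    if (agent.mcpServers > 10 || agent.totalTools > 50) {
      issues.push(`mcp-over-subscription: ${agent.name}`);
    }
  }
  for (const skill of scan.skills) {
    // Bloated skill descriptions: frontmatter description > 30 words.
    if (countWords(skill.description) > 30) issues.push(`bloated-description: ${skill.name}`);
  }
  for (const file of scan.contextFiles) {
    // Context bloat: any single context file > 100 lines.
    if (file.lines > 100) issues.push(`context-bloat: ${file.path}`);
    // Deferred candidates: > 2% of model context and not safety/routing.
    if (!file.safetyOrRouting && file.tokens > 0.02 * modelContextTokens) {
      issues.push(`deferred-candidate: ${file.path}`);
    }
  }
  return issues;
}

// Example with made-up numbers; the 200000-token context size is an assumption.
const exampleIssues = detectIssues(
  {
    agents: [{ name: "dev", systemPromptLines: 250, mcpServers: 3, totalTools: 60 }],
    skills: [{ name: "explore", description: "Explore a codebase quickly." }],
    contextFiles: [{ path: "context/big.md", lines: 120, tokens: 5000, safetyOrRouting: false }],
  },
  200000,
);
console.log(exampleIssues);
```

Returning plain labels keeps the sketch short. A fuller report would presumably also carry the token counts needed to rank suggestions by savings, as Phase 4 describes.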

package/skills/deliver/SKILL.md ADDED

@@ -0,0 +1,241 @@
+---
+name: "deliver"
+description: "Delivery workflow — selected work to delivered code. Ensures pull-work + pickup-probe preflight, then chains plan-work → execute-plan → review-work → verify-work → loop on failure without requiring user interaction between cleanly determined stages."
+---
+# Deliver
+Takes a goal, chains the `plan-work`, `execute-plan`, `review-work`, and `verify-work` primitives, and loops until the user-facing goal is met. The orchestrator coordinates; it never touches source files.
+## Agents
+Inherited from primitives:
+| Agent | Used by |
+|---|---|
+| tool-planner | plan-work |
+| tool-worker (x4) | execute-plan |
+| tool-code-reviewer | review-work |
+| tool-security-reviewer | review-work (conditional — security-sensitive changes) |
+| tool-verifier | verify-work |
+| tool-playwright | verify-work |
+## Orchestrator Rule
+You never use `read`, `glob`, `grep`, or `code` on source files. You only read/write the session file and artifact files in `.flow-agents/<slug>/`.
+## Shared Contracts
+Follow:
+- `context/contracts/artifact-contract.md`
+- `context/contracts/planning-contract.md`
+- `context/contracts/execution-contract.md`
+- `context/contracts/review-contract.md`
+- `context/contracts/verification-contract.md`
+- `context/contracts/delivery-contract.md`
+This skill owns orchestration across the full loop. The contracts own artifact shape, Definition Of Done, execution handoff, verification verdicts, Goal Fit, and Final Acceptance.
+When you report progress or final evidence, use exact delegate ids such as `tool-planner`, `tool-worker`, `tool-verifier`, and `tool-playwright`. Do not collapse them to generic labels when the gate is part of acceptance evidence.
+## Sidecar Writer Adoption
+When the repository provides `npm run workflow:sidecar --`, use it for routine workflow state instead of hand-writing JSON:
+- `ensure-session` before planning starts
+- `current --format path` when resuming or handing work to delegates
+- `record-agent-event` for delegated progress, handoffs, blockers, and evidence pointers
+- `advance-state` at each phase transition
+- `record-evidence` after verification
+- `record-critique` or `import-critique` after review
+- `record-release` for release-readiness decisions
+- `record-learning` for learning-review outcomes
+- `dogfood-pass` for Flow Agents repo changes that should record evidence, critique, optional learning, state, and handoff in one validated pass
+After writer updates, run `npm run workflow:validate-artifacts -- --require-sidecars .flow-agents/<slug>` when local validation is available. If the writer or validation is unavailable or blocked by sandbox policy, record the exact gap in the session artifact as `NOT_VERIFIED` instead of pretending structured state exists.
+`ensure-session` maintains `.flow-agents/current.json`. The orchestrator owns root `state.json` and `handoff.json` updates. Delegated agents must be given the workflow artifact root and should append events under `agents/<agent-id>/events.jsonl` through `record-agent-event` instead of guessing the slug or rewriting root state.
+## Input
+- **Goal**: what to build (from conversation context or explicit instruction)
+- **Directory**: working directory
+- **Selected work evidence**: existing `pull-work` and `pickup-probe` artifacts when the user is continuing provider-backed or productized backlog work
+## TDD Mode
+If the user requests test-driven development, activate the `tdd-workflow` skill instead. It wraps the same plan → execute → verify chain with test-first constraints and git checkpoints. The `deliver` skill is for standard (implementation-first) workflows.
+## Required Preflight
+Before planning implementation, determine whether the request is direct ad hoc delivery or pickup of provider-backed/productized backlog work.
+- If the user asks to pick up work, continue backlog work, build the next item, or deliver a selected issue, run or consume `pull-work` first. `pull-work` must enforce board selection, WIP/shepherding, dependency, grouping, and worktree logic.
+- After `pull-work`, run or consume `pickup-probe` before `plan-work`. The pickup Probe must record selected item ids, scope, acceptance quality, provider state, WIP/conflict scan, dependency freshness, expected modified files, sandbox/worktree mode, decisions, unresolved questions, accepted gaps, and planning readiness.
+- If current artifacts already prove `pull-work` and `pickup-probe` are fresh for the selected item or justified group, consume those artifacts and continue to `plan-work`.
+- If the preflight is missing, stale, contradictory, or for a different selected item, stop before planning and route through `pull-work -> pickup-probe`; for pickup/planning gaps, route `decision_gap` back to `design-probe`.
+- If the user gives a raw product idea instead of ready backlog work, suggest Builder Kit shape (`design-probe` + `idea-to-backlog`) rather than forcing delivery.
+Direct ad hoc implementation requests that are not provider-backed backlog pickup may still start at `plan-work`, but `deliver` must record why pull/pickup preflight was not applicable.
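The preflight rules above can be read as a small decision function. The following is a minimal sketch, not part of the package: the `Preflight` type, its field names, and the returned strings are hypothetical, and it leaves out the raw-product-idea case that is routed to Builder Kit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Preflight:
    """Hypothetical summary of what the existing artifacts prove."""
    selected_item: Optional[str]   # item id the artifacts were produced for
    pull_work_fresh: bool
    pickup_probe_fresh: bool
    contradictory: bool = False


def next_step(is_backlog_pickup: bool,
              requested_item: Optional[str],
              pre: Optional[Preflight]) -> str:
    if not is_backlog_pickup:
        # Direct ad hoc delivery may start at plan-work, but the reason
        # for skipping pull/pickup preflight must be recorded.
        return "plan-work (record why preflight was not applicable)"
    if pre is None or pre.contradictory or pre.selected_item != requested_item:
        # Missing, contradictory, or produced for a different item.
        return "pull-work -> pickup-probe"
    if not (pre.pull_work_fresh and pre.pickup_probe_fresh):
        # Stale artifacts are treated the same as missing ones.
        return "pull-work -> pickup-probe"
    return "plan-work"
```

In this sketch, only fresh, consistent artifacts for the requested item allow planning to start directly.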
+## Session File
+Path: `.flow-agents/<slug>/<slug>--deliver.md`
+```markdown
+# <Goal one-liner>
+branch: <branch>
+worktree: <worktree>
+created: <date>
+status: planning | executing | reviewing | verifying | delivered
+type: deliver
+iteration: 0
+## Workflow Rules (re-read at each phase transition)
+- Reviewers and verifiers are REPORT ONLY — they never fix code
+- Any code change requires re-review + re-verify before delivery
+- Loop exits only when review + verify are both clean in same iteration
+- Loop exits only after the Goal Fit Gate is fully checked or explicitly accepted
+- CRITICAL/HIGH → re-plan → execute → review → verify
+- MEDIUM/FAIL → execute fix pass → review → verify
+- Temporary planning and execution artifacts live in `.flow-agents/<slug>/`; durable feature documentation is promoted after CI/merge
+- Local runtime work stays under `.flow-agents/` and remains untracked; durable outcomes must be promoted before merge to `main`
+## Plan
+(populated by plan-work)
+## Definition Of Done
+(copied from plan-work; this is the user-facing stop condition)
+## Execution Progress
+(populated by execute-plan)
+## Verification Report
+(populated by verify-work)
+## Goal Fit Gate
+Use the Goal Fit Gate from `context/contracts/delivery-contract.md`.
+## Final Acceptance
+Use the Final Acceptance checklist from `context/contracts/delivery-contract.md`.
+## History
+- iteration 1: partial — auth routes done, form validation missing
+- iteration 2: pass — all acceptance criteria met
+```
+The `status:` values in this Markdown session file are human-readable delivery progress labels. They are not the machine-readable `state.phase` enum; structured workflow sidecars must use the canonical lifecycle values from `context/contracts/artifact-contract.md`. In particular, review-work records critique through the critique artifact/sink while the sidecar lifecycle remains in a canonical phase such as `execution`, not a `review` phase.
+## Workflow
+### 1. Create session file
+Create the session file with `status: planning`, `iteration: 0`. Use the sidecar writer when available:
+```bash
+npm run workflow:sidecar -- ensure-session \
+  --source-request "<original request>" \
+  --summary "<current delivery goal>" \
+  --criterion "<acceptance criterion>"
+```
+### 2. Plan (plan-work)
+Invoke plan-work with the goal, directory, session file path, and any pull-work / pickup-probe artifact refs. The plan must include `## Definition Of Done`. Present the plan to the user when a user decision is actually needed; otherwise record the plan artifact and continue automatically to execution.
+This is a delegation gate. `plan-work` must delegate to `tool-planner` when that delegate is available, even if the environment is read-only or the repo cannot yet be modified. If the gate is blocked, preserve the attempted delegation/blocker in the session artifact and treat the delivery as `NOT_VERIFIED` or incomplete rather than substituting a local plan.
+### 3. Execute (execute-plan)
+Re-read the session file `## Workflow Rules` section before proceeding. Then invoke execute-plan with the plan artifact path and session file path.
+### 4. Review (REPORT ONLY — review-work)
+Invoke `review-work` with the session file path. Reviewers produce findings through the critique artifact/sink, currently `critique.json` locally. **They NEVER fix code.** No writes, no patches, no "found and fixed."
+This is a delegation gate. `review-work` must delegate to `tool-code-reviewer` when that delegate is available. If security-sensitive files or behaviors are in scope, it must also delegate to `tool-security-reviewer`. Architecture and standards concerns are part of the code review scope unless the project configures a more specific reviewer.
+### 5. Verify (REPORT ONLY — verify-work)
+Invoke verify-work with the session file path. Verifiers run checks and report status, including acceptance criteria and Goal Fit. **They NEVER fix code.** No format fixes, no lint auto-fixes, no patches.
+This is a delegation gate. `verify-work` must delegate to `tool-verifier` when that delegate is available. If UI or browser-facing behavior is in scope, delegate that evidence collection to `tool-playwright` as well. If the gate is blocked, report the exact `NOT_VERIFIED` evidence gap; do not replace verification with an orchestrator-only summary.
+### 6. Route on findings
+Combine the critique artifact/sink verdict + verification verdict:
+- **Clean** (no issues, all PASS) → deliver
+- **Goal Fit Gate incomplete** → fix pass or final acceptance decision
+- **CRITICAL or HIGH review findings** → re-plan (step 7a)
+- **MEDIUM review findings needing code changes** → fix pass (step 7b)
+- **Any verification FAIL** → fix pass (step 7b)
+- **Any NOT_VERIFIED** → surface to user, they decide
+When the route is deterministic, continue without asking the user between stages. Use the local stop/steering hooks when available to resume automatically after phase transitions. Ask the user only for explicit approval, missing authority, unsafe escalation, accepted gaps, unresolved `NOT_VERIFIED`, provider decisions, or scope changes.
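The routing list in step 6 can be sketched as a single function. This is an illustration under stated assumptions: the `Verdicts` type and the constant names are hypothetical, the precedence order (re-plan before fix pass before user decision) is an assumption the skill text does not spell out, and every MEDIUM finding is treated as needing a code change.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels for the routes named in step 6.
REPLAN = "re-plan (7a)"
FIX_PASS = "fix pass (7b)"
ASK_USER = "surface to user"
GOAL_FIT = "fix pass or final acceptance decision"
DELIVER = "deliver"


@dataclass
class Verdicts:
    review_severities: List[str] = field(default_factory=list)  # e.g. ["HIGH", "MEDIUM"]
    verification: List[str] = field(default_factory=list)       # "PASS", "FAIL", "NOT_VERIFIED"
    goal_fit_complete: bool = False


def route(v: Verdicts) -> str:
    """Pick the next step; the precedence order here is an assumption."""
    if any(s in ("CRITICAL", "HIGH") for s in v.review_severities):
        return REPLAN
    if "MEDIUM" in v.review_severities or "FAIL" in v.verification:
        return FIX_PASS
    if "NOT_VERIFIED" in v.verification:
        return ASK_USER
    if v.review_severities or any(r != "PASS" for r in v.verification):
        # The list in step 6 does not route other findings, so the sketch refuses to guess.
        raise ValueError("no route defined for these findings")
    if not v.goal_fit_complete:
        return GOAL_FIT
    return DELIVER
```

Only the last branch corresponds to the "Clean" route: zero review findings, all checks `PASS`, and a complete Goal Fit Gate.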
+### 7. Loop (mandatory re-verify)
+**Any code change requires a subsequent clean review + verify pass. No exceptions.**
+#### 7a. Re-plan (CRITICAL/HIGH issues)
+1. Increment `iteration` in session file
+2. Re-invoke plan-work with: original goal + failure summary → updated plan
+3. Back to step 3 (Execute) → then step 4 (Review) → step 5 (Verify)
+#### 7b. Fix pass (MEDIUM issues / verification failures)
+1. Increment `iteration` in session file
+2. Back to step 3 (Execute) with the specific findings to fix
+3. Then step 4 (Review) → step 5 (Verify)
+**The loop exits ONLY when review + verify both produce zero findings, all PASS in the same iteration, and Goal Fit Gate is complete.** Not when fixes are applied — when fixes are *verified clean and useful to the user*.
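The "same iteration" requirement is easy to miss, so here is a small sketch of the exit check. The `IterationResult` type and its fields are hypothetical, and `goal_fit_complete` stands for a gate that is either fully checked or explicitly accepted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IterationResult:
    iteration: int
    review_findings: int      # number of findings from review-work
    verify_all_pass: bool     # every verify-work check reported PASS
    goal_fit_complete: bool   # gate fully checked or explicitly accepted


def may_exit(r: IterationResult) -> bool:
    # All three conditions must hold within one iteration.
    return r.review_findings == 0 and r.verify_all_pass and r.goal_fit_complete


def first_exit(history: List[IterationResult]) -> Optional[int]:
    """Return the first iteration that may exit, or None if none qualifies."""
    for r in history:
        if may_exit(r):
            return r.iteration
    return None


history = [
    IterationResult(0, review_findings=2, verify_all_pass=True, goal_fit_complete=False),
    IterationResult(1, review_findings=0, verify_all_pass=False, goal_fit_complete=False),
    IterationResult(2, review_findings=0, verify_all_pass=True, goal_fit_complete=True),
]
print(first_exit(history))  # prints 2
```

A clean verify in iteration 0 and a clean review in iteration 1 do not combine; only iteration 2 satisfies every condition at once.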
+### 8. Goal Fit Gate
+Before final response, update `## Goal Fit Gate` in the session file. If any box is unchecked, either keep working or surface the exact decision needed. Do not hide open gaps in a summary.
+Record the final local state with `advance-state`. Use `status: verified` only when verification and critique are clean; use `status: needs_decision`, `failed`, or `not_verified` for unresolved gaps.
+### 9. Publish Verified Change
+After review, verification, evidence, and Goal Fit are clean for the same diff:
+1. Confirm the working tree contains only verified scope.
+2. Commit the verified diff.
+3. Push the branch.
+4. Open or update the provider change record with issue links, closing refs, evidence links, and verification summary, or record an explicit no-provider-change reason.
+5. Wait for provider checks/CI or record missing checks as `NOT_VERIFIED`.
+Do not invoke `release-readiness` before this gate unless the user explicitly accepts a no-provider-change/no-push path and the reason is recorded in the session artifact. For GitHub, the first `ChangeProvider` adapter example is a PR with PR checks.
+### 10. Final Acceptance And Docs Promotion
+After CI passes and the work is merged or otherwise accepted:
+1. Update `## Final Acceptance` in the session file.
+2. Archive the working artifacts under `.flow-agents/<slug>/archive/` or keep a stable link to them.
+3. Record provider records, verification evidence, durable docs targets, accepted gaps, and follow-up routing in durable docs or provider records.
+4. Promote the relevant plan, decision, evidence, and usage notes into long-lived docs such as `docs/`, `README.md`, or a project decision record.
+5. Link the long-lived doc back to the provider record, archived plan artifact, or accepted evidence when useful so future readers can see why and how the feature was built.
+6. Confirm `.flow-agents/` runtime artifacts remain untracked before merge to `main`.
+7. Hand off to `learning-review` when the delivery exposed workflow, testing, documentation, or product follow-up.
+### 11. Deliver
+1. Include the verification report verbatim in your delivery message
+2. `git diff --stat`
+3. Summarize: what was built, iterations taken, issues resolved, Goal Fit status, and final acceptance/docs status
+4. Set `status: delivered`
+{context?}

package/skills/dependency-update/SKILL.md ADDED

@@ -0,0 +1,68 @@
+---
+name: "dependency-update"
+description: "Analyze and upgrade project dependencies — latest versions, security vulnerabilities, actionable update plan across all package managers."
+---
+# Dependency Analysis & Upgrade
+Delegate dependency analysis to `tool-dependencies-updater`, which has MCP access to package registries and security advisory databases.
+## When to Use
+- User asks to check for outdated dependencies
+- User wants to upgrade packages to latest versions
+- User asks about security vulnerabilities in dependencies
+- During project audits or onboarding to assess dependency health
+- Before major releases to ensure dependencies are current
+## Execution
+Spawn `tool-dependencies-updater` with a clear task description. The subagent handles all registry lookups via MCP tools.
+### Basic Audit
+```
+Delegate to tool-dependencies-updater:
+"Scan this project for all dependency manifests, check every dependency against
+the latest available version, run security advisory checks on outdated packages,
+and report findings grouped by risk level (critical/major/minor)."
+```
+### Targeted Update
+```
+Delegate to tool-dependencies-updater:
+"Check the latest versions for dependencies in <manifest_file>. Focus on
+<specific packages or ecosystem> and flag any with known security advisories."
+```
+### Security-Focused
+```
+Delegate to tool-dependencies-updater:
+"Search for known security vulnerabilities (CVEs) affecting the current
+dependency versions in this project. Prioritize critical and high severity
+issues. Include advisory IDs and recommended fix versions."
+```
+## After the Subagent Reports
+Once `tool-dependencies-updater` returns its findings:
+1. Review the update plan with the user before making changes
+2. For CRITICAL (security) updates — recommend immediate action
+3. For MAJOR version bumps — warn about potential breaking changes, check changelogs if needed
+4. For MINOR/PATCH updates — generally safe to batch-apply
+5. Apply updates to manifest files (package.json, requirements.txt, etc.)
+6. Run install commands (`npm install`, `pip install -r requirements.txt`, etc.)
+7. Run tests to verify nothing broke
+8. If tests fail after updates, investigate and either fix compatibility issues or pin to last working version
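The CRITICAL / MAJOR / MINOR-PATCH triage in the steps above can be sketched as follows. This is illustrative only: the function names are hypothetical, versions are assumed to be plain `MAJOR.MINOR.PATCH` strings, and the actual risk grouping comes from what `tool-dependencies-updater` reports.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string; pre-release tags are not handled."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify(current: str, latest: str, has_advisory: bool) -> str:
    """Assign an update to one of the risk buckets used in the steps above."""
    if has_advisory:
        return "CRITICAL"        # security update: recommend immediate action
    cur, new = parse(current), parse(latest)
    if new <= cur:
        return "UP_TO_DATE"
    if new[0] > cur[0]:
        return "MAJOR"           # warn about potential breaking changes
    return "MINOR/PATCH"         # generally safe to batch-apply


print(classify("1.2.3", "2.0.0", has_advisory=False))  # prints MAJOR
```

Tuple comparison keeps the ordering check short, at the cost of ignoring pre-release and build metadata.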
+## Key Principles
+- ALWAYS delegate registry lookups to the subagent — it has the MCP tools, you don't
+- NEVER update dependencies without showing the user the plan first
+- NEVER blindly apply major version bumps — they may require migration steps
+- Group related updates (e.g., all React packages together) to avoid partial upgrades
+- If the subagent reports packages it couldn't check, note them for manual review
+- If the subagent is rate limited, suggest setting the `GITHUB_TOKEN` environment variable

package/skills/design-probe/SKILL.md ADDED

@@ -0,0 +1,107 @@
+---
+name: "design-probe"
+description: "Generic one-question-at-a-time design probing interview for turning unclear goals, designs, or workflow states into shared understanding before planning or execution."
+---
+# Design Probe
+Use `design-probe` when a goal, design, workflow route, implementation boundary, acceptance criterion, or recovery path is not clear enough to plan or execute responsibly.
+This skill is generic. It is not Builder Kit-only. Builder Kit uses the flow step name `design-probe` during pickup and guided build workflows, but the same probing contract applies to any project, feature, architecture, product idea, or implementation handoff that needs alignment.
+This skill is modeled after Matt Pocock's `grill-me`: interview the user relentlessly about the relevant plan or design until shared understanding exists, walk the design tree branch by branch, provide a recommended answer for each question, ask one question at a time, and explore the codebase or local docs instead of asking when the answer is discoverable.
+## Contract
+- Explore first: inspect available local docs, plans, artifacts, contracts, code, tests, issue text, and prior decisions before asking the user when the answer is discoverable.
+- Stay grounded: cite the local sources or code paths that shaped the question when they matter.
+- Walk the design tree branch by branch: resolve dependencies between decisions one-by-one before moving to the next branch instead of mixing independent concerns.
+- Be relentless about ambiguity: keep probing fuzzy goals, overloaded terms, implicit non-goals, missing constraints, and weak success signals until they are resolved or explicitly accepted as gaps.
+- Ask exactly one alignment question at a time.
+- Include a recommended answer with every question and briefly explain why it is recommended.
+- Make the recommendation actionable enough that the user can accept it directly.
+- Record decisions, unresolved questions, accepted gaps, and planning readiness as the interview progresses.
+- Stop when shared understanding exists, or when the remaining uncertainty is explicitly recorded as an accepted gap.
+- Do not silently convert uncertainty into implementation work.
+## When To Use
+Use this skill for:
+- Ambiguous product or feature goals.
+- Conflicting requirements or unclear non-goals.
+- Missing acceptance criteria or unclear evidence expectations.
+- Architecture or workflow decisions that block planning.
+- Direct primitive recovery when upstream context or state is missing.
+- Guided workflow next-step selection when artifacts do not clearly identify whether to ask, plan, execute, verify, or stop.
+Do not use this skill to replace implementation planning, backlog shaping, verification, or release review. Use it only until the design decision surface is aligned enough for the next workflow primitive.
+## Discovery Before Asking
+Before asking the first question:
+1. Read the user's request and identify the decision branch that blocks progress.
+2. Search local context that could answer it, such as `README`, `CONTEXT.md`, `docs/`, `context/contracts/`, relevant skills, active workflow artifacts, schemas, tests, and nearby implementation files.
+3. Prefer existing project vocabulary and documented decisions over inventing new terms.
+4. If local evidence resolves the branch, record the inferred decision and move to the next branch.
+5. Ask only when the branch remains ambiguous, contradictory, risky, or value-laden.
+## Interview Loop
+For each unresolved branch:
+1. State the branch being resolved in one short sentence.
+2. Ask exactly one question.
+3. Provide a recommended answer in the same message.
+4. Explain the practical consequence of accepting the recommendation.
+5. Wait for the user's answer before asking another question.
+6. Record the outcome before continuing.
+Question format:
+```markdown
+Question: <one alignment question>
+Recommended answer: <specific answer the user can accept>
+Why: <brief reason and consequence>
+```
+If the user answers with a new ambiguity, treat that as the next branch. If the user accepts the recommendation, record it as a decision and continue.
+## Records
+Maintain a compact running record in the active artifact, or in the conversation when no artifact exists:
+- `decisions`: choices that are aligned or locally inferable.
+- `unresolved_questions`: questions still blocking planning or execution.
+- `accepted_gaps`: uncertainties the user explicitly accepts, including the consequence.
+- `planning_readiness`: one of `ready`, `needs_more_probe`, or `accepted_gap_ready`.
+- `next_action`: the recommended next workflow step, such as `shape`, `plan-work`, `execute-plan`, `verify-work`, `needs_user`, or `stop`.
+When workflow artifacts exist, update the appropriate session, handoff, Probe record, or planning artifact according to the local artifact contract. Do not invent a project-specific storage format when the repository already defines one.
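One way to picture the running record is a small data structure with the five fields listed above. This is a sketch only: the `ProbeRecord` class is hypothetical, the sample strings are invented for illustration, and deriving `planning_readiness` from the other fields is an assumption (the skill text only says it takes one of three values). Where a repository defines its own artifact format, that format takes precedence.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class ProbeRecord:
    decisions: List[str] = field(default_factory=list)
    unresolved_questions: List[str] = field(default_factory=list)
    accepted_gaps: List[str] = field(default_factory=list)
    next_action: str = "needs_user"

    @property
    def planning_readiness(self) -> str:
        # Assumed derivation: open questions block planning, accepted gaps qualify it.
        if self.unresolved_questions:
            return "needs_more_probe"
        if self.accepted_gaps:
            return "accepted_gap_ready"
        return "ready"

    def to_dict(self) -> Dict[str, object]:
        data = asdict(self)
        data["planning_readiness"] = self.planning_readiness
        return data


record = ProbeRecord(decisions=["Use the existing artifact contract"])
record.unresolved_questions.append("Which acceptance evidence is required?")
print(record.planning_readiness)  # prints needs_more_probe
```

Computing readiness from the lists keeps the record from claiming `ready` while a blocking question is still open.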
+## Stop Conditions
+Stop probing when one of these is true:
+- Shared understanding exists and the next action is clear.
+- The user explicitly accepts a gap and its consequence, and the next action can proceed with that gap recorded.
+- The next action is to stop because the goal is out of scope, not worth pursuing, or blocked by an external dependency.
+Before stopping, summarize:
+- Decisions made.
+- Remaining unresolved questions, if any.
+- Accepted gaps, if any.
+- Planning readiness.
+- Recommended next action.
+## Boundaries
+- Do not ask multiple questions in one turn.
+- Do not ask for information already discoverable from local docs, code, tests, schemas, or workflow artifacts.
+- Do not broaden the probe into unrelated architecture review, backlog shaping, or implementation.
+- Do not treat Builder Kit terminology as required outside Builder Kit workflows.
+- Do not overwrite downstream workflow authority: if another contract owns planning, verification, release, or gate semantics, hand off to that contract once probing is complete.

package/skills/eval-rebuild/SKILL.md ADDED

@@ -0,0 +1,39 @@
+---
+name: "eval-rebuild"
+description: "Project-specific build and install commands for the eval feedback loop. Injected into eval-builder agent. Replace this skill for different build systems."
+---
+# Eval Rebuild
+This skill defines how to rebuild and reinstall agents after making source edits. The eval-builder agent calls this after fixing a prompt or skill.
+## Build System
+This project uses a flat standalone structure — no build step required. Edits to agent specs, skills, and context take effect immediately.
+## Source & Installed Locations (same)
+| What | Where |
+|------|-------|
+| Agent configs | `~/.flow-agents/agents/*.json` |
+| Skills | `~/.flow-agents/skills/*/SKILL.md` |
+| Context files | `~/.flow-agents/context/**/*.md` |
+| Evals | `~/.flow-agents/evals/` |
+## Rebuild Commands
+No rebuild needed — edits are live. If Claude Code caches agent configs, restart the session.
+## Post-Edit Verification
+```bash
+bash ~/.flow-agents/evals/run.sh static
+```
+## Adapting for Other Projects
+To use the eval framework with a different build system, replace this skill with one that defines your project's:
+1. Source locations (where agent specs and skills live)
+2. Rebuild commands (your build + install pipeline)
+3. Post-rebuild verification (how to check it worked)
+4. Installed locations (where the runtime reads agent configs from)