@probelabs/visor 0.1.106 → 0.1.111
- package/README.md +71 -2
- package/action.yml +1 -1
- package/defaults/code-refiner.yaml +114 -0
- package/defaults/{.visor.yaml → code-review.yaml} +35 -226
- package/defaults/override.yaml +52 -0
- package/defaults/task-refinement.yaml +624 -0
- package/defaults/visor.tests.yaml +685 -0
- package/defaults/visor.yaml +483 -0
- package/dist/action-cli-bridge.d.ts +11 -82
- package/dist/action-cli-bridge.d.ts.map +1 -1
- package/dist/ai-review-service.d.ts +28 -9
- package/dist/ai-review-service.d.ts.map +1 -1
- package/dist/check-execution-engine.d.ts +19 -331
- package/dist/check-execution-engine.d.ts.map +1 -1
- package/dist/cli-main.d.ts.map +1 -1
- package/dist/cli.d.ts +0 -1
- package/dist/cli.d.ts.map +1 -1
- package/dist/config.d.ts +16 -0
- package/dist/config.d.ts.map +1 -1
- package/dist/cron-scheduler.d.ts +3 -3
- package/dist/cron-scheduler.d.ts.map +1 -1
- package/dist/debug-visualizer/ws-server.d.ts +7 -1
- package/dist/debug-visualizer/ws-server.d.ts.map +1 -1
- package/dist/defaults/code-refiner.yaml +114 -0
- package/dist/defaults/{.visor.yaml → code-review.yaml} +35 -226
- package/dist/defaults/override.yaml +52 -0
- package/dist/defaults/task-refinement.yaml +624 -0
- package/dist/defaults/visor.tests.yaml +685 -0
- package/dist/defaults/visor.yaml +483 -0
- package/dist/docs/DEPLOYMENT.md +118 -0
- package/dist/docs/GITHUB_CHECKS.md +280 -0
- package/dist/docs/NPM_USAGE.md +208 -0
- package/dist/docs/action-reference.md +19 -0
- package/dist/docs/advanced-ai.md +237 -0
- package/dist/docs/ai-configuration.md +535 -0
- package/dist/docs/ai-custom-tools-usage.md +261 -0
- package/dist/docs/ai-custom-tools.md +392 -0
- package/dist/docs/author-permissions.md +610 -0
- package/dist/docs/bot-transports-rfc.md +23 -0
- package/dist/docs/ci-cli-mode.md +34 -0
- package/dist/docs/claude-code.md +74 -0
- package/dist/docs/command-provider.md +559 -0
- package/dist/docs/commands.md +8 -0
- package/dist/docs/configuration.md +324 -0
- package/dist/docs/custom-tools.md +424 -0
- package/dist/docs/dashboards/README.md +23 -0
- package/dist/docs/dashboards/grafana-visor-diagrams.json +20 -0
- package/dist/docs/dashboards/grafana-visor-overview.json +33 -0
- package/dist/docs/debug-visualizer-progress.md +572 -0
- package/dist/docs/debug-visualizer-rfc.md +691 -0
- package/dist/docs/debug-visualizer.md +114 -0
- package/dist/docs/debugging.md +636 -0
- package/dist/docs/default-output-schema.md +28 -0
- package/dist/docs/dependencies.md +369 -0
- package/dist/docs/dev-playbook.md +9 -0
- package/dist/docs/engine-pause-resume-rfc.md +192 -0
- package/dist/docs/engine-state-machine-plan.md +333 -0
- package/dist/docs/event-driven-github-integration-rfc.md +743 -0
- package/dist/docs/event-triggers.md +292 -0
- package/dist/docs/execution-statistics-rfc.md +290 -0
- package/dist/docs/fact-validator-gap-analysis.md +178 -0
- package/dist/docs/fact-validator-implementation-plan.md +1235 -0
- package/dist/docs/fail-if.md +95 -0
- package/dist/docs/failure-conditions-implementation.md +271 -0
- package/dist/docs/failure-conditions-schema.md +173 -0
- package/dist/docs/failure-routing-rfc.md +193 -0
- package/dist/docs/failure-routing.md +507 -0
- package/dist/docs/foreach-dependency-propagation.md +473 -0
- package/dist/docs/github-ops.md +89 -0
- package/dist/docs/goto-forward-run-plan.md +113 -0
- package/dist/docs/guides/criticality-modes.md +332 -0
- package/dist/docs/guides/fault-management-and-contracts.md +738 -0
- package/dist/docs/guides/workflow-style-guide.md +224 -0
- package/dist/docs/http.md +299 -0
- package/dist/docs/human-input-provider.md +372 -0
- package/dist/docs/lifecycle-hooks.md +253 -0
- package/dist/docs/limits.md +64 -0
- package/dist/docs/liquid-templates.md +490 -0
- package/dist/docs/loop-routing-refactor.md +89 -0
- package/dist/docs/mcp-provider.md +557 -0
- package/dist/docs/mcp.md +124 -0
- package/dist/docs/memory.md +903 -0
- package/dist/docs/observability.md +12 -0
- package/dist/docs/output-formats.md +20 -0
- package/dist/docs/output-formatting.md +29 -0
- package/dist/docs/output-history.md +383 -0
- package/dist/docs/performance.md +6 -0
- package/dist/docs/pluggable.md +124 -0
- package/dist/docs/proposals/snapshot-scope-execution.md +236 -0
- package/dist/docs/providers/git-checkout.md +589 -0
- package/dist/docs/recipes.md +474 -0
- package/dist/docs/rfc/git-checkout-step.md +601 -0
- package/dist/docs/rfc/on_init-hook.md +1294 -0
- package/dist/docs/rfc/workspace-isolation.md +216 -0
- package/dist/docs/roadmap/criticality-implementation-tasks.md +92 -0
- package/dist/docs/router-patterns.md +339 -0
- package/dist/docs/schema-next-pr.md +10 -0
- package/dist/docs/schema-templates.md +68 -0
- package/dist/docs/script.md +34 -0
- package/dist/docs/sdk.md +222 -0
- package/dist/docs/security.md +7 -0
- package/dist/docs/suppressions.md +89 -0
- package/dist/docs/tag-filtering.md +258 -0
- package/dist/docs/telemetry-setup.md +119 -0
- package/dist/docs/telemetry-tracing-rfc.md +275 -0
- package/dist/docs/test-framework-rfc.md +680 -0
- package/dist/docs/testing/assertions.md +85 -0
- package/dist/docs/testing/ci.md +44 -0
- package/dist/docs/testing/cli.md +41 -0
- package/dist/docs/testing/cookbook.md +172 -0
- package/dist/docs/testing/dsl-reference.md +199 -0
- package/dist/docs/testing/fixtures-and-mocks.md +91 -0
- package/dist/docs/testing/flows.md +92 -0
- package/dist/docs/testing/getting-started.md +93 -0
- package/dist/docs/testing/troubleshooting.md +55 -0
- package/dist/docs/timeouts.md +50 -0
- package/dist/docs/troubleshooting.md +7 -0
- package/dist/docs/visor-sdk-rfc.md +186 -0
- package/dist/docs/workflows.md +569 -0
- package/dist/engine/on-finish/orchestrator.d.ts +19 -0
- package/dist/engine/on-finish/orchestrator.d.ts.map +1 -0
- package/dist/engine/on-finish/utils.d.ts +44 -0
- package/dist/engine/on-finish/utils.d.ts.map +1 -0
- package/dist/event-bus/event-bus.d.ts +13 -0
- package/dist/event-bus/event-bus.d.ts.map +1 -0
- package/dist/event-bus/types.d.ts +71 -0
- package/dist/event-bus/types.d.ts.map +1 -0
- package/dist/examples/.claude/agents/code-reviewer.md +69 -0
- package/dist/examples/.mcp.json +34 -0
- package/dist/examples/CALCULATOR-SDK.md +364 -0
- package/dist/examples/README.md +384 -0
- package/dist/examples/ai-custom-tools-example.yaml +206 -0
- package/dist/examples/ai-custom-tools-simple.yaml +76 -0
- package/dist/examples/ai-retry-fallback-config.yaml +180 -0
- package/dist/examples/ai-with-bash.yaml +126 -0
- package/dist/examples/ai-with-mcp.yaml +82 -0
- package/dist/examples/basic-human-input.yaml +15 -0
- package/dist/examples/bedrock-config.yaml +77 -0
- package/dist/examples/calculator-config.yaml +133 -0
- package/dist/examples/calculator-json-output-guide.md +311 -0
- package/dist/examples/calculator-sdk-automated.ts +340 -0
- package/dist/examples/calculator-sdk-example.ts +275 -0
- package/dist/examples/calculator-sdk-json.ts +331 -0
- package/dist/examples/calculator-sdk-real.ts +374 -0
- package/dist/examples/calculator-sdk-test.ts +148 -0
- package/dist/examples/claude-code-config.yaml +191 -0
- package/dist/examples/cron-webhook-config.yaml +215 -0
- package/dist/examples/custom-template.liquid +57 -0
- package/dist/examples/custom-tools-example.yaml +281 -0
- package/dist/examples/enhanced-config.yaml +165 -0
- package/dist/examples/environments/visor.base.yaml +92 -0
- package/dist/examples/environments/visor.dev.yaml +33 -0
- package/dist/examples/environments/visor.prod.yaml +95 -0
- package/dist/examples/environments/visor.staging.yaml +46 -0
- package/dist/examples/fact-validator.yaml +361 -0
- package/dist/examples/fail-if-simple.yaml +90 -0
- package/dist/examples/failure-conditions-advanced.yaml +136 -0
- package/dist/examples/failure-conditions-basic.yaml +48 -0
- package/dist/examples/failure-conditions-github-style.yaml +119 -0
- package/dist/examples/failure-conditions-migration.yaml +74 -0
- package/dist/examples/for-loop-example.yaml +176 -0
- package/dist/examples/forEach-example.yaml +120 -0
- package/dist/examples/git-checkout-basic.yaml +32 -0
- package/dist/examples/git-checkout-compare.yaml +59 -0
- package/dist/examples/git-checkout-cross-repo.yaml +76 -0
- package/dist/examples/github-workflow-with-tags.yml +163 -0
- package/dist/examples/http-integration-config.yaml +240 -0
- package/dist/examples/https-server-config.yaml +209 -0
- package/dist/examples/human-input-example.yaml +63 -0
- package/dist/examples/if-conditions.yaml +173 -0
- package/dist/examples/jira-simple-example.yaml +56 -0
- package/dist/examples/jira-single-issue-workflow.yaml +166 -0
- package/dist/examples/jira-workflow-mcp.yaml +182 -0
- package/dist/examples/mcp/analyzer.py +119 -0
- package/dist/examples/mcp-provider-example.yaml +301 -0
- package/dist/examples/memory-counter.yaml +99 -0
- package/dist/examples/memory-error-collection.yaml +104 -0
- package/dist/examples/memory-exec-js.yaml +247 -0
- package/dist/examples/memory-namespace-isolation.yaml +184 -0
- package/dist/examples/memory-retry-counter.yaml +65 -0
- package/dist/examples/memory-state-machine.yaml +170 -0
- package/dist/examples/on-init-import-demo.yaml +179 -0
- package/dist/examples/outputs-raw-basic.yaml +26 -0
- package/dist/examples/project-with-tools.yaml +174 -0
- package/dist/examples/prompts/architecture-analysis.liquid +116 -0
- package/dist/examples/prompts/security-comprehensive.liquid +107 -0
- package/dist/examples/quick-start-tags.yaml +53 -0
- package/dist/examples/reusable-tools.yaml +92 -0
- package/dist/examples/reusable-workflows.yaml +88 -0
- package/dist/examples/routing-basic.yaml +35 -0
- package/dist/examples/routing-dynamic-js.yaml +46 -0
- package/dist/examples/routing-foreach.yaml +34 -0
- package/dist/examples/routing-goto-event.yaml +34 -0
- package/dist/examples/routing-on-success.yaml +25 -0
- package/dist/examples/run-calculator-demo.sh +71 -0
- package/dist/examples/sdk-basic.mjs +10 -0
- package/dist/examples/sdk-cjs.cjs +10 -0
- package/dist/examples/sdk-comprehensive.mjs +175 -0
- package/dist/examples/sdk-manual-config.mjs +65 -0
- package/dist/examples/sdk-typescript.js +81 -0
- package/dist/examples/sdk-typescript.ts +92 -0
- package/dist/examples/session-reuse-config.yaml +151 -0
- package/dist/examples/session-reuse-self.yaml +81 -0
- package/dist/examples/slack-simple-chat.yaml +775 -0
- package/dist/examples/templates/security-report.liquid +137 -0
- package/dist/examples/tools-library.yaml +281 -0
- package/dist/examples/transform-example.yaml +199 -0
- package/dist/examples/visor-with-tags.yaml +198 -0
- package/dist/examples/webhook-pipeline-config.yaml +218 -0
- package/dist/examples/workflows/calculator-workflow.yaml +163 -0
- package/dist/examples/workflows/code-quality.yaml +222 -0
- package/dist/examples/workflows/quick-pr-check.yaml +90 -0
- package/dist/examples/workflows/workflow-composition-example.yaml +130 -0
- package/dist/failure-condition-evaluator.d.ts +3 -0
- package/dist/failure-condition-evaluator.d.ts.map +1 -1
- package/dist/frontends/github-frontend.d.ts +58 -0
- package/dist/frontends/github-frontend.d.ts.map +1 -0
- package/dist/frontends/host.d.ts +47 -0
- package/dist/frontends/host.d.ts.map +1 -0
- package/dist/frontends/ndjson-sink.d.ts +12 -0
- package/dist/frontends/ndjson-sink.d.ts.map +1 -0
- package/dist/frontends/slack-frontend.d.ts +58 -0
- package/dist/frontends/slack-frontend.d.ts.map +1 -0
- package/dist/generated/config-schema.d.ts +967 -57
- package/dist/generated/config-schema.d.ts.map +1 -1
- package/dist/generated/config-schema.json +1033 -56
- package/dist/github-check-service.d.ts +4 -6
- package/dist/github-check-service.d.ts.map +1 -1
- package/dist/github-comments.d.ts +2 -4
- package/dist/github-comments.d.ts.map +1 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +134327 -99004
- package/dist/liquid-extensions.d.ts.map +1 -1
- package/dist/logger.d.ts +2 -0
- package/dist/logger.d.ts.map +1 -1
- package/dist/memory-store.d.ts +6 -0
- package/dist/memory-store.d.ts.map +1 -1
- package/dist/output/assistant-json/template.liquid +0 -0
- package/dist/output/traces/run-2026-01-20T19-22-58-043Z.ndjson +138 -0
- package/dist/output/traces/run-2026-01-20T19-23-52-175Z.ndjson +1067 -0
- package/dist/output-formatters.d.ts +1 -1
- package/dist/output-formatters.d.ts.map +1 -1
- package/dist/providers/ai-check-provider.d.ts +12 -0
- package/dist/providers/ai-check-provider.d.ts.map +1 -1
- package/dist/providers/check-provider-registry.d.ts +6 -0
- package/dist/providers/check-provider-registry.d.ts.map +1 -1
- package/dist/providers/check-provider.interface.d.ts +43 -1
- package/dist/providers/check-provider.interface.d.ts.map +1 -1
- package/dist/providers/claude-code-check-provider.d.ts.map +1 -1
- package/dist/providers/command-check-provider.d.ts +1 -1
- package/dist/providers/command-check-provider.d.ts.map +1 -1
- package/dist/providers/custom-tool-executor.d.ts +61 -0
- package/dist/providers/custom-tool-executor.d.ts.map +1 -0
- package/dist/providers/git-checkout-provider.d.ts +25 -0
- package/dist/providers/git-checkout-provider.d.ts.map +1 -0
- package/dist/providers/github-ops-provider.d.ts.map +1 -1
- package/dist/providers/http-client-provider.d.ts +4 -4
- package/dist/providers/http-client-provider.d.ts.map +1 -1
- package/dist/providers/human-input-check-provider.d.ts +5 -0
- package/dist/providers/human-input-check-provider.d.ts.map +1 -1
- package/dist/providers/index.d.ts +1 -0
- package/dist/providers/index.d.ts.map +1 -1
- package/dist/providers/log-check-provider.d.ts +2 -5
- package/dist/providers/log-check-provider.d.ts.map +1 -1
- package/dist/providers/mcp-check-provider.d.ts +10 -4
- package/dist/providers/mcp-check-provider.d.ts.map +1 -1
- package/dist/providers/mcp-custom-sse-server.d.ts +66 -0
- package/dist/providers/mcp-custom-sse-server.d.ts.map +1 -0
- package/dist/providers/memory-check-provider.d.ts +2 -8
- package/dist/providers/memory-check-provider.d.ts.map +1 -1
- package/dist/providers/script-check-provider.d.ts +25 -0
- package/dist/providers/script-check-provider.d.ts.map +1 -0
- package/dist/providers/workflow-check-provider.d.ts +56 -0
- package/dist/providers/workflow-check-provider.d.ts.map +1 -0
- package/dist/reviewer.d.ts +2 -1
- package/dist/reviewer.d.ts.map +1 -1
- package/dist/sdk/check-provider-registry-534KL5HT.mjs +27 -0
- package/dist/sdk/chunk-23L3QRYX.mjs +16872 -0
- package/dist/sdk/chunk-23L3QRYX.mjs.map +1 -0
- package/dist/sdk/{chunk-TUTOLSFV.mjs → chunk-3OMWVM6J.mjs} +11 -1
- package/dist/sdk/chunk-3OMWVM6J.mjs.map +1 -0
- package/dist/sdk/chunk-7UK3NIIT.mjs +482 -0
- package/dist/sdk/chunk-7UK3NIIT.mjs.map +1 -0
- package/dist/sdk/chunk-AGIZJ4UZ.mjs +173 -0
- package/dist/sdk/chunk-AGIZJ4UZ.mjs.map +1 -0
- package/dist/sdk/chunk-AIVFBIS4.mjs +1371 -0
- package/dist/sdk/chunk-AIVFBIS4.mjs.map +1 -0
- package/dist/sdk/chunk-AK6BVWIT.mjs +426 -0
- package/dist/sdk/chunk-AK6BVWIT.mjs.map +1 -0
- package/dist/sdk/chunk-AUT26LHW.mjs +139 -0
- package/dist/sdk/chunk-AUT26LHW.mjs.map +1 -0
- package/dist/sdk/chunk-BOVFH3LI.mjs +232 -0
- package/dist/sdk/chunk-BOVFH3LI.mjs.map +1 -0
- package/dist/sdk/chunk-CNX7V5JK.mjs +89 -0
- package/dist/sdk/chunk-CNX7V5JK.mjs.map +1 -0
- package/dist/sdk/chunk-HTOKWMPO.mjs +157 -0
- package/dist/sdk/chunk-HTOKWMPO.mjs.map +1 -0
- package/dist/sdk/chunk-NAW3DB3I.mjs +197 -0
- package/dist/sdk/chunk-NAW3DB3I.mjs.map +1 -0
- package/dist/sdk/chunk-O5EZDNYL.mjs +274 -0
- package/dist/sdk/chunk-O5EZDNYL.mjs.map +1 -0
- package/dist/sdk/chunk-QR7MOMJH.mjs +558 -0
- package/dist/sdk/chunk-QR7MOMJH.mjs.map +1 -0
- package/dist/sdk/chunk-QY2XYPEV.mjs +3556 -0
- package/dist/sdk/chunk-QY2XYPEV.mjs.map +1 -0
- package/dist/sdk/chunk-S2RUE2RG.mjs +145 -0
- package/dist/sdk/chunk-S2RUE2RG.mjs.map +1 -0
- package/dist/sdk/chunk-SIWNBRTK.mjs +800 -0
- package/dist/sdk/chunk-SIWNBRTK.mjs.map +1 -0
- package/dist/sdk/chunk-YSN4G6CI.mjs +146 -0
- package/dist/sdk/chunk-YSN4G6CI.mjs.map +1 -0
- package/dist/sdk/chunk-ZYAUYXSW.mjs +206 -0
- package/dist/sdk/chunk-ZYAUYXSW.mjs.map +1 -0
- package/dist/sdk/command-executor-TYUV6HUS.mjs +14 -0
- package/dist/sdk/config-YNC2EOOT.mjs +16 -0
- package/dist/sdk/config-merger-PX3WIT57.mjs +10 -0
- package/dist/sdk/event-bus-5BEVPQ6T.mjs +35 -0
- package/dist/sdk/event-bus-5BEVPQ6T.mjs.map +1 -0
- package/dist/sdk/failure-condition-evaluator-YGTF2GHG.mjs +17 -0
- package/dist/sdk/git-repository-analyzer-HJC4MYW4.mjs +458 -0
- package/dist/sdk/git-repository-analyzer-HJC4MYW4.mjs.map +1 -0
- package/dist/sdk/github-frontend-SIAEOCON.mjs +1420 -0
- package/dist/sdk/github-frontend-SIAEOCON.mjs.map +1 -0
- package/dist/sdk/host-DXUYTNMU.mjs +52 -0
- package/dist/sdk/host-DXUYTNMU.mjs.map +1 -0
- package/dist/sdk/{liquid-extensions-KVL4MKRH.mjs → liquid-extensions-PKWCKK7E.mjs} +8 -2
- package/dist/sdk/memory-store-XGBB7LX7.mjs +12 -0
- package/dist/sdk/memory-store-XGBB7LX7.mjs.map +1 -0
- package/dist/sdk/metrics-7PP3EJUH.mjs +29 -0
- package/dist/sdk/metrics-7PP3EJUH.mjs.map +1 -0
- package/dist/sdk/ndjson-sink-B4V4NTAQ.mjs +44 -0
- package/dist/sdk/ndjson-sink-B4V4NTAQ.mjs.map +1 -0
- package/dist/sdk/prompt-state-YRJY6QAL.mjs +16 -0
- package/dist/sdk/prompt-state-YRJY6QAL.mjs.map +1 -0
- package/dist/sdk/renderer-schema-LPKN5UJS.mjs +51 -0
- package/dist/sdk/renderer-schema-LPKN5UJS.mjs.map +1 -0
- package/dist/sdk/routing-6N45MJ4F.mjs +24 -0
- package/dist/sdk/routing-6N45MJ4F.mjs.map +1 -0
- package/dist/sdk/sdk.d.mts +541 -22
- package/dist/sdk/sdk.d.ts +541 -22
- package/dist/sdk/sdk.js +27963 -16505
- package/dist/sdk/sdk.js.map +1 -1
- package/dist/sdk/sdk.mjs +1116 -2169
- package/dist/sdk/sdk.mjs.map +1 -1
- package/dist/sdk/session-registry-4E6YRQ77.mjs +10 -0
- package/dist/sdk/session-registry-4E6YRQ77.mjs.map +1 -0
- package/dist/sdk/slack-frontend-BVKW3GD5.mjs +735 -0
- package/dist/sdk/slack-frontend-BVKW3GD5.mjs.map +1 -0
- package/dist/sdk/trace-helpers-VP6QYVBX.mjs +23 -0
- package/dist/sdk/trace-helpers-VP6QYVBX.mjs.map +1 -0
- package/dist/sdk/{tracer-init-WC75N5NW.mjs → tracer-init-GSLPPLCD.mjs} +2 -2
- package/dist/sdk/tracer-init-GSLPPLCD.mjs.map +1 -0
- package/dist/sdk/workflow-registry-R6KSACFR.mjs +12 -0
- package/dist/sdk/workflow-registry-R6KSACFR.mjs.map +1 -0
- package/dist/sdk.d.ts.map +1 -1
- package/dist/slack/adapter.d.ts +36 -0
- package/dist/slack/adapter.d.ts.map +1 -0
- package/dist/slack/cache-prewarmer.d.ts +31 -0
- package/dist/slack/cache-prewarmer.d.ts.map +1 -0
- package/dist/slack/client.d.ts +77 -0
- package/dist/slack/client.d.ts.map +1 -0
- package/dist/slack/markdown.d.ts +45 -0
- package/dist/slack/markdown.d.ts.map +1 -0
- package/dist/slack/prompt-state.d.ts +33 -0
- package/dist/slack/prompt-state.d.ts.map +1 -0
- package/dist/slack/rate-limiter.d.ts +56 -0
- package/dist/slack/rate-limiter.d.ts.map +1 -0
- package/dist/slack/signature.d.ts +2 -0
- package/dist/slack/signature.d.ts.map +1 -0
- package/dist/slack/socket-runner.d.ts +42 -0
- package/dist/slack/socket-runner.d.ts.map +1 -0
- package/dist/slack/thread-cache.d.ts +51 -0
- package/dist/slack/thread-cache.d.ts.map +1 -0
- package/dist/snapshot-store.d.ts +59 -0
- package/dist/snapshot-store.d.ts.map +1 -0
- package/dist/state-machine/context/build-engine-context.d.ts +17 -0
- package/dist/state-machine/context/build-engine-context.d.ts.map +1 -0
- package/dist/state-machine/dispatch/dependency-gating.d.ts +12 -0
- package/dist/state-machine/dispatch/dependency-gating.d.ts.map +1 -0
- package/dist/state-machine/dispatch/execution-invoker.d.ts +14 -0
- package/dist/state-machine/dispatch/execution-invoker.d.ts.map +1 -0
- package/dist/state-machine/dispatch/foreach-processor.d.ts +8 -0
- package/dist/state-machine/dispatch/foreach-processor.d.ts.map +1 -0
- package/dist/state-machine/dispatch/history-snapshot.d.ts +8 -0
- package/dist/state-machine/dispatch/history-snapshot.d.ts.map +1 -0
- package/dist/state-machine/dispatch/on-init-handlers.d.ts +43 -0
- package/dist/state-machine/dispatch/on-init-handlers.d.ts.map +1 -0
- package/dist/state-machine/dispatch/renderer-schema.d.ts +8 -0
- package/dist/state-machine/dispatch/renderer-schema.d.ts.map +1 -0
- package/dist/state-machine/dispatch/stats-manager.d.ts +15 -0
- package/dist/state-machine/dispatch/stats-manager.d.ts.map +1 -0
- package/dist/state-machine/dispatch/template-renderer.d.ts +7 -0
- package/dist/state-machine/dispatch/template-renderer.d.ts.map +1 -0
- package/dist/state-machine/execution/summary.d.ts +8 -0
- package/dist/state-machine/execution/summary.d.ts.map +1 -0
- package/dist/state-machine/runner.d.ts +79 -0
- package/dist/state-machine/runner.d.ts.map +1 -0
- package/dist/state-machine/states/check-running.d.ts +14 -0
- package/dist/state-machine/states/check-running.d.ts.map +1 -0
- package/dist/state-machine/states/completed.d.ts +12 -0
- package/dist/state-machine/states/completed.d.ts.map +1 -0
- package/dist/state-machine/states/error.d.ts +11 -0
- package/dist/state-machine/states/error.d.ts.map +1 -0
- package/dist/state-machine/states/init.d.ts +11 -0
- package/dist/state-machine/states/init.d.ts.map +1 -0
- package/dist/state-machine/states/level-dispatch.d.ts +17 -0
- package/dist/state-machine/states/level-dispatch.d.ts.map +1 -0
- package/dist/state-machine/states/plan-ready.d.ts +12 -0
- package/dist/state-machine/states/plan-ready.d.ts.map +1 -0
- package/dist/state-machine/states/routing.d.ts +52 -0
- package/dist/state-machine/states/routing.d.ts.map +1 -0
- package/dist/state-machine/states/wave-planning.d.ts +14 -0
- package/dist/state-machine/states/wave-planning.d.ts.map +1 -0
- package/dist/state-machine/workflow-projection.d.ts +47 -0
- package/dist/state-machine/workflow-projection.d.ts.map +1 -0
- package/dist/state-machine-execution-engine.d.ts +159 -0
- package/dist/state-machine-execution-engine.d.ts.map +1 -0
- package/dist/telemetry/opentelemetry.d.ts.map +1 -1
- package/dist/telemetry/state-capture.d.ts +5 -0
- package/dist/telemetry/state-capture.d.ts.map +1 -1
- package/dist/test-runner/assertions.d.ts +59 -0
- package/dist/test-runner/assertions.d.ts.map +1 -0
- package/dist/test-runner/core/environment.d.ts +8 -0
- package/dist/test-runner/core/environment.d.ts.map +1 -0
- package/dist/test-runner/core/fixture.d.ts +3 -0
- package/dist/test-runner/core/fixture.d.ts.map +1 -0
- package/dist/test-runner/core/flow-stage.d.ts +32 -0
- package/dist/test-runner/core/flow-stage.d.ts.map +1 -0
- package/dist/test-runner/core/mocks.d.ts +8 -0
- package/dist/test-runner/core/mocks.d.ts.map +1 -0
- package/dist/test-runner/core/test-execution-wrapper.d.ts +18 -0
- package/dist/test-runner/core/test-execution-wrapper.d.ts.map +1 -0
- package/dist/test-runner/evaluators.d.ts +45 -0
- package/dist/test-runner/evaluators.d.ts.map +1 -0
- package/dist/test-runner/fixture-loader.d.ts +30 -0
- package/dist/test-runner/fixture-loader.d.ts.map +1 -0
- package/dist/test-runner/index.d.ts +127 -0
- package/dist/test-runner/index.d.ts.map +1 -0
- package/dist/test-runner/recorders/github-recorder.d.ts +23 -0
- package/dist/test-runner/recorders/github-recorder.d.ts.map +1 -0
- package/dist/test-runner/recorders/global-recorder.d.ts +4 -0
- package/dist/test-runner/recorders/global-recorder.d.ts.map +1 -0
- package/dist/test-runner/recorders/slack-recorder.d.ts +17 -0
- package/dist/test-runner/recorders/slack-recorder.d.ts.map +1 -0
- package/dist/test-runner/utils/selectors.d.ts +2 -0
- package/dist/test-runner/utils/selectors.d.ts.map +1 -0
- package/dist/test-runner/validator.d.ts +8 -0
- package/dist/test-runner/validator.d.ts.map +1 -0
- package/dist/traces/run-2026-01-20T19-22-58-043Z.ndjson +138 -0
- package/dist/traces/run-2026-01-20T19-23-52-175Z.ndjson +1067 -0
- package/dist/types/bot.d.ts +109 -0
- package/dist/types/bot.d.ts.map +1 -0
- package/dist/types/cli.d.ts +8 -1
- package/dist/types/cli.d.ts.map +1 -1
- package/dist/types/config.d.ts +459 -9
- package/dist/types/config.d.ts.map +1 -1
- package/dist/types/engine.d.ts +177 -0
- package/dist/types/engine.d.ts.map +1 -0
- package/dist/types/execution.d.ts +73 -0
- package/dist/types/execution.d.ts.map +1 -0
- package/dist/types/git-checkout.d.ts +76 -0
- package/dist/types/git-checkout.d.ts.map +1 -0
- package/dist/types/github.d.ts +51 -0
- package/dist/types/github.d.ts.map +1 -0
- package/dist/types/workflow.d.ts +237 -0
- package/dist/types/workflow.d.ts.map +1 -0
- package/dist/utils/command-executor.d.ts +43 -0
- package/dist/utils/command-executor.d.ts.map +1 -0
- package/dist/utils/comment-metadata.d.ts +21 -0
- package/dist/utils/comment-metadata.d.ts.map +1 -0
- package/dist/utils/config-loader.d.ts.map +1 -1
- package/dist/utils/config-merger.d.ts.map +1 -1
- package/dist/utils/env-exposure.d.ts +3 -0
- package/dist/utils/env-exposure.d.ts.map +1 -0
- package/dist/utils/file-exclusion.d.ts.map +1 -1
- package/dist/utils/interactive-prompt.d.ts +1 -1
- package/dist/utils/interactive-prompt.d.ts.map +1 -1
- package/dist/utils/json-text-extractor.d.ts +17 -0
- package/dist/utils/json-text-extractor.d.ts.map +1 -0
- package/dist/utils/sandbox.d.ts +10 -0
- package/dist/utils/sandbox.d.ts.map +1 -1
- package/dist/utils/script-memory-ops.d.ts +21 -0
- package/dist/utils/script-memory-ops.d.ts.map +1 -0
- package/dist/utils/template-context.d.ts +8 -0
- package/dist/utils/template-context.d.ts.map +1 -0
- package/dist/utils/tracer-init.d.ts.map +1 -1
- package/dist/utils/workspace-manager.d.ts +118 -0
- package/dist/utils/workspace-manager.d.ts.map +1 -0
- package/dist/utils/worktree-cleanup.d.ts +33 -0
- package/dist/utils/worktree-cleanup.d.ts.map +1 -0
- package/dist/utils/worktree-manager.d.ts +153 -0
- package/dist/utils/worktree-manager.d.ts.map +1 -0
- package/dist/webhook-server.d.ts +3 -3
- package/dist/webhook-server.d.ts.map +1 -1
- package/dist/workflow-executor.d.ts +81 -0
- package/dist/workflow-executor.d.ts.map +1 -0
- package/dist/workflow-registry.d.ts +79 -0
- package/dist/workflow-registry.d.ts.map +1 -0
- package/package.json +12 -5
- package/dist/output/traces/run-2025-10-22T18-22-56-873Z.ndjson +0 -218
- package/dist/sdk/check-execution-engine-2YYKUUSH.mjs +0 -11
- package/dist/sdk/check-execution-engine-6QJXYYON.mjs +0 -11
- package/dist/sdk/check-execution-engine-PJZ4ZOKG.mjs +0 -11
- package/dist/sdk/chunk-33QVZ2D4.mjs +0 -316
- package/dist/sdk/chunk-33QVZ2D4.mjs.map +0 -1
- package/dist/sdk/chunk-B5QBV2QJ.mjs +0 -752
- package/dist/sdk/chunk-B5QBV2QJ.mjs.map +0 -1
- package/dist/sdk/chunk-BVFNRCHT.mjs +0 -14129
- package/dist/sdk/chunk-BVFNRCHT.mjs.map +0 -1
- package/dist/sdk/chunk-KWZW23FG.mjs +0 -14129
- package/dist/sdk/chunk-KWZW23FG.mjs.map +0 -1
- package/dist/sdk/chunk-O4RP4BRH.mjs +0 -14092
- package/dist/sdk/chunk-O4RP4BRH.mjs.map +0 -1
- package/dist/sdk/chunk-TUTOLSFV.mjs.map +0 -1
- package/dist/sdk/chunk-U5D2LY66.mjs +0 -245
- package/dist/sdk/chunk-U5D2LY66.mjs.map +0 -1
- package/dist/sdk/chunk-U7X54EMV.mjs +0 -331
- package/dist/sdk/chunk-U7X54EMV.mjs.map +0 -1
- package/dist/sdk/config-merger-TWUBWFC2.mjs +0 -8
- package/dist/sdk/mermaid-telemetry-SN6A2TKW.mjs +0 -61
- package/dist/sdk/mermaid-telemetry-SN6A2TKW.mjs.map +0 -1
- package/dist/sdk/mermaid-telemetry-YCTIG76M.mjs +0 -61
- package/dist/sdk/mermaid-telemetry-YCTIG76M.mjs.map +0 -1
- package/dist/traces/run-2025-10-22T18-22-56-873Z.ndjson +0 -218
- /package/dist/sdk/{check-execution-engine-2YYKUUSH.mjs.map → check-provider-registry-534KL5HT.mjs.map} +0 -0
- /package/dist/sdk/{check-execution-engine-6QJXYYON.mjs.map → command-executor-TYUV6HUS.mjs.map} +0 -0
- /package/dist/sdk/{check-execution-engine-PJZ4ZOKG.mjs.map → config-YNC2EOOT.mjs.map} +0 -0
- /package/dist/sdk/{config-merger-TWUBWFC2.mjs.map → config-merger-PX3WIT57.mjs.map} +0 -0
- /package/dist/sdk/{liquid-extensions-KVL4MKRH.mjs.map → failure-condition-evaluator-YGTF2GHG.mjs.map} +0 -0
- /package/dist/sdk/{tracer-init-WC75N5NW.mjs.map → liquid-extensions-PKWCKK7E.mjs.map} +0 -0
@@ -0,0 +1,1235 @@
# Fact Validator Implementation Plan

## 🎯 CURRENT STATUS: 50% COMPLETE

**Last Updated:** 2025-10-16

### Implementation Status

| Component | Status | Files |
|-----------|--------|-------|
| **Phase 1: Infrastructure** | ✅ COMPLETE | `src/types/config.ts`, `src/config.ts`, `src/check-execution-engine.ts` |
| **Phase 2: Execution** | ✅ COMPLETE | `src/check-execution-engine.ts` (lines 283-748) |
| **Phase 3: Configuration** | ✅ COMPLETE | `examples/fact-validator.yaml` |
| **Phase 4: Documentation** | ⏳ PENDING | - |
| **Phase 5: Testing** | ⏳ PENDING | - |
| **Phase 6: Deployment** | ⏳ PENDING | - |

### Test Results

✅ **1438 tests passing, 0 failures**

### What's Working

- ✅ `on_finish` hook infrastructure (types, validation, detection)
- ✅ Full execution logic (`run`, `goto_js`, error handling, logging)
- ✅ Complete fact validator example configuration
- ✅ Memory-based retry tracking
- ✅ forEach iteration with aggregation
- ✅ Dynamic routing based on validation results

### Quick Test

```bash
npm run build
./dist/index.js --config examples/fact-validator.yaml --event issue_opened --cli
```

---

## Executive Summary

This plan implements an AI-powered fact validation system for GitHub issue and comment assistants. Before posting responses, the system:

1. **Extracts** all factual claims from the AI-generated response
2. **Validates** each fact individually using AI with code/documentation search
3. **Aggregates** validation results to determine overall accuracy
4. **Retries** the original assistant with fact-checking context if validation fails
5. **Posts** the response only after facts are verified (or max attempts reached)

**Key Feature**: Introduces a new `on_finish` routing hook on the forEach check itself to handle aggregation and routing after ALL dependent checks complete ALL iterations.

**Configuration**: Controlled by `ENABLE_FACT_VALIDATION` environment variable (enabled by default in workflows).
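
The five steps in the Executive Summary can be read as a bounded retry loop. The following is a minimal sketch of that loop, not Visor's implementation: `PipelineDeps`, `FactVerdict`, and `runFactCheckedResponse` are hypothetical names, and the assistant, extractor, and validator are injected as plain synchronous functions so the control flow stays visible (the real checks are AI calls and would be asynchronous).

```typescript
// Hypothetical shapes for illustration; none of these names come from Visor.
interface FactVerdict {
  claim: string;
  isValid: boolean;
  correction?: string;
}

interface PipelineDeps {
  generate: (context: string[]) => string; // the issue/comment assistant
  extractFacts: (response: string) => string[]; // step 1
  validateFact: (claim: string) => FactVerdict; // step 2
}

interface PipelineResult {
  response: string;
  attempts: number;
  verified: boolean;
}

function runFactCheckedResponse(deps: PipelineDeps, maxAttempts = 2): PipelineResult {
  let context: string[] = [];
  let response = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    response = deps.generate(context);
    // Steps 1-2: extract every claim and validate each one individually.
    const verdicts = deps.extractFacts(response).map((claim) => deps.validateFact(claim));
    // Step 3: aggregate into a single pass/fail decision.
    const invalid = verdicts.filter((v) => !v.isValid);
    if (invalid.length === 0) {
      // Step 5: all facts verified, so the caller can post this response.
      return { response, attempts: attempt, verified: true };
    }
    // Step 4: feed the failed claims back to the assistant for the next attempt.
    context = invalid.map(
      (v) => `Incorrect: ${v.claim}` + (v.correction ? ` (correction: ${v.correction})` : "")
    );
  }
  // Max attempts reached: return the last response, flagged as unverified.
  return { response, attempts: maxAttempts, verified: false };
}
```

The default of two attempts mirrors the "max 2 attempts" requirement stated in this plan; everything else about the shape of the loop is an assumption.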
|
|
53
|
+
|
|
54
|
+
## Critical Design Decision: `on_finish` on forEach Check
|
|
55
|
+
|
|
56
|
+
### The Winning Argument: Multiple Dependents in forEach
|
|
57
|
+
|
|
58
|
+
The `on_finish` hook **MUST** be on the forEach check itself, not on dependent checks.
|
|
59
|
+
|
|
60
|
+
**Why?** Consider this scenario:
|
|
61
|
+
|
|
62
|
+
```yaml
|
|
63
|
+
checks:
|
|
64
|
+
extract-facts:
|
|
65
|
+
type: ai
|
|
66
|
+
forEach: true
|
|
67
|
+
# Outputs: [fact1, fact2, fact3]
|
|
68
|
+
|
|
69
|
+
validate-security:
|
|
70
|
+
depends_on: [extract-facts]
|
|
71
|
+
# Runs 3 times, validates security aspects
|
|
72
|
+
|
|
73
|
+
validate-technical:
|
|
74
|
+
depends_on: [extract-facts]
|
|
75
|
+
# Runs 3 times, validates technical aspects
|
|
76
|
+
|
|
77
|
+
validate-format:
|
|
78
|
+
depends_on: [extract-facts]
|
|
79
|
+
# Runs 3 times, validates format/style
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
**With `on_finish` on forEach check:**
|
|
83
|
+
```yaml
|
|
84
|
+
extract-facts:
|
|
85
|
+
forEach: true
|
|
86
|
+
on_finish: # ✅ Runs ONCE after ALL dependents complete ALL iterations
|
|
87
|
+
run: [aggregate-all-validations]
|
|
88
|
+
goto_js: |
|
|
89
|
+
const securityValid = outputs.history['validate-security'].every(r => r.is_valid);
|
|
90
|
+
const technicalValid = outputs.history['validate-technical'].every(r => r.is_valid);
|
|
91
|
+
const formatValid = outputs.history['validate-format'].every(r => r.is_valid);
|
|
92
|
+
|
|
93
|
+
if (!securityValid || !technicalValid || !formatValid) {
|
|
94
|
+
return 'retry-with-context';
|
|
95
|
+
}
|
|
96
|
+
return null;
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
**Placing `on_finish` on the forEach check gives one place to aggregate across ALL dependent checks.**
|
|
100
|
+
|
|
101
|
+
If `on_finish` were on dependent checks, each would trigger separately and there would be no single point to aggregate results from all dependents.
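
The aggregation that a single `on_finish` hook enables can be sketched in TypeScript. This is a minimal illustration, not Visor's implementation: the `ValidationResult` shape, the `History` layout, and the `decideRoute` helper are assumptions modelled on the YAML example above, and the `'retry-with-context'` target is simply the name used there.

```typescript
// Hypothetical result shape; only `is_valid` is taken from the example above.
interface ValidationResult {
  is_valid: boolean;
}

// Assumed layout: check ID -> one result per forEach iteration.
type History = Record<string, ValidationResult[]>;

// Returns a goto target when any dependent reported an invalid result, otherwise null.
// A dependent with no recorded history is treated as vacuously valid in this sketch.
function decideRoute(history: History, dependents: string[]): string | null {
  const allValid = dependents.every((id) =>
    (history[id] ?? []).every((r) => r.is_valid)
  );
  return allValid ? null : 'retry-with-context';
}

const exampleHistory: History = {
  'validate-security': [{ is_valid: true }, { is_valid: true }, { is_valid: true }],
  'validate-technical': [{ is_valid: true }, { is_valid: false }, { is_valid: true }],
  'validate-format': [{ is_valid: true }, { is_valid: true }, { is_valid: true }],
};

// One failed iteration in one dependent is enough to request a retry.
const route = decideRoute(exampleHistory, Object.keys(exampleHistory));
```

Because the decision needs the results of all three dependents at once, it can only run at a point where every dependent has finished every iteration.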
|
|
102
|
+
|
|
103
|
+
## User Requirements Met
|
|
104
|
+
|
|
105
|
+
- ✅ Fact validator enabled conditionally via environment variable
|
|
106
|
+
- ✅ Enabled by default in GitHub workflow
|
|
107
|
+
- ✅ Extract facts after AI generates response
|
|
108
|
+
- ✅ forEach validation of each fact individually
|
|
109
|
+
- ✅ **NEW**: `on_finish` hook on forEach check for post-iteration aggregation and routing
|
|
110
|
+
- ✅ Check if all facts valid after forEach completes
|
|
111
|
+
- ✅ Route back to original assistant with context if invalid (max 2 attempts)
|
|
112
|
+
- ✅ Track attempts in memory to prevent infinite loops
|
|
113
|
+
- ✅ Liquid templates check attempt count and fact data from memory
|
|
114
|
+
- ✅ Second pass includes fact validation context for AI to self-correct
|
|
115
|
+
|
|
116
|
+
## Architecture
|
|
117
|
+
|
|
118
|
+
### High-Level Flow
|
|
119
|
+
|
|
120
|
+
```mermaid
|
|
121
|
+
graph TD
|
|
122
|
+
A[Issue/Comment Trigger] --> B[init-fact-validation-memory]
|
|
123
|
+
B --> C[issue-assistant or comment-assistant]
|
|
124
|
+
C --> D{ENABLE_FACT_VALIDATION?}
|
|
125
|
+
D -->|No| E[post-direct-response]
|
|
126
|
+
D -->|Yes| F[extract-facts: forEach with on_finish]
|
|
127
|
+
F --> G{Any facts to validate?}
|
|
128
|
+
G -->|No facts| E
|
|
129
|
+
G -->|Yes| H[validate-fact: depends_on extract-facts]
|
|
130
|
+
H --> I[forEach Iterations: fact 1...N]
|
|
131
|
+
I --> J[on_finish triggers on extract-facts]
|
|
132
|
+
J --> K[aggregate-validations via on_finish.run]
|
|
133
|
+
K --> L{on_finish.goto_js evaluates}
|
|
134
|
+
L -->|All valid| M[return null → proceed]
|
|
135
|
+
L -->|Invalid + attempt < 1| N[increment attempt, return issue-assistant]
|
|
136
|
+
L -->|Invalid + attempt >= 1| O[return null → give up]
|
|
137
|
+
M --> P[post-verified-response]
|
|
138
|
+
N --> C
|
|
139
|
+
O --> Q[post-unverified-warning]
|
|
140
|
+
|
|
141
|
+
style F fill:#ffeb3b,stroke:#f57c00,stroke-width:3px
|
|
142
|
+
style J fill:#ffeb3b,stroke:#f57c00,stroke-width:3px
|
|
143
|
+
|
|
144
|
+
classDef newFeature fill:#4caf50,stroke:#2e7d32,stroke-width:2px
|
|
145
|
+
class F,J newFeature
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
**Legend:**
|
|
149
|
+
- 🟡 **Yellow boxes**: Components involving the new `on_finish` hook
|
|
150
|
+
- 🟢 **Green**: New features requiring implementation
|
|
151
|
+
|
|
152
|
+
### Detailed Flow with on_finish
|
|
153
|
+
|
|
154
|
+
```mermaid
|
|
155
|
+
graph TD
|
|
156
|
+
A[issue-assistant generates response] --> B[extract-facts: AI with forEach]
|
|
157
|
+
B --> C[validate-fact runs N times]
|
|
158
|
+
C --> D1[Iteration 1]
|
|
159
|
+
C --> D2[Iteration 2]
|
|
160
|
+
C --> DN[Iteration N]
|
|
161
|
+
D1 --> E[All iterations complete]
|
|
162
|
+
D2 --> E
|
|
163
|
+
DN --> E
|
|
164
|
+
E --> F[on_finish.run executes]
|
|
165
|
+
F --> G[aggregate-validations runs]
|
|
166
|
+
G --> H[on_finish.goto_js evaluates]
|
|
167
|
+
H -->|allValid = true| I[return null]
|
|
168
|
+
H -->|allValid = false, attempt = 0| J[increment, return issue-assistant]
|
|
169
|
+
H -->|allValid = false, attempt >= 1| K[return null]
|
|
170
|
+
I --> L[post-verified-response]
|
|
171
|
+
J --> A
|
|
172
|
+
K --> M[post-unverified-warning]
|
|
173
|
+
|
|
174
|
+
style E fill:#ffeb3b,stroke:#f57c00,stroke-width:3px
|
|
175
|
+
style F fill:#ffeb3b,stroke:#f57c00,stroke-width:3px
|
|
176
|
+
```
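
The three-way branch in the diagrams can be written as a small pure function. The sketch below is only an illustration under stated assumptions: `Outcome`, `MAX_RETRIES`, `nextStep`, and `simulate` are hypothetical names, and the retry budget of one retry is inferred from the `attempt >= 1` branch.

```typescript
// Possible outcomes of the routing step (names are illustrative only).
type Outcome = 'post-verified-response' | 'retry-assistant' | 'post-unverified-warning';

// Assumed budget: one retry after the initial attempt (attempt 0).
const MAX_RETRIES = 1;

function nextStep(allValid: boolean, attempt: number): Outcome {
  if (allValid) return 'post-verified-response';
  if (attempt >= MAX_RETRIES) return 'post-unverified-warning';
  return 'retry-assistant';
}

// Walks the loop until a terminal outcome is reached and records each decision.
function simulate(validOnAttempt: (attempt: number) => boolean): Outcome[] {
  const trace: Outcome[] = [];
  let attempt = 0;
  for (;;) {
    const outcome = nextStep(validOnAttempt(attempt), attempt);
    trace.push(outcome);
    if (outcome !== 'retry-assistant') return trace;
    attempt += 1; // plays the role of the attempt counter kept in memory
  }
}
```

Keeping the decision separate from the side effects (incrementing the counter, posting) makes the loop bound easy to test in isolation.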
|
|
177
|
+
|
|
178
|
+
## Complete Implementation
|
|
179
|
+
|
|
180
|
+
### 1. Environment Variable Configuration
|
|
181
|
+
|
|
182
|
+
Add to `.visor.yaml`:
|
|
183
|
+
```yaml
|
|
184
|
+
env:
|
|
185
|
+
ENABLE_FACT_VALIDATION: "${{ env.ENABLE_FACT_VALIDATION }}"
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
Add to GitHub workflow (`.github/workflows/visor.yml`):
|
|
189
|
+
```yaml
|
|
190
|
+
env:
|
|
191
|
+
ENABLE_FACT_VALIDATION: "true" # Enable by default
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
### 2. Memory Initialization
|
|
195
|
+
|
|
196
|
+
```yaml
|
|
197
|
+
checks:
|
|
198
|
+
init-fact-validation:
|
|
199
|
+
type: memory
|
|
200
|
+
operation: set
|
|
201
|
+
key: fact_validation_attempt
|
|
202
|
+
value: 0
|
|
203
|
+
namespace: fact-validation
|
|
204
|
+
on: [issue_opened, issue_comment]
|
|
205
|
+
if: "env.ENABLE_FACT_VALIDATION === 'true'"
|
|
206
|
+
```
|
|
207
|
+
|
|
208
|
+
### 3. Update Assistants with Retry Context
|
|
209
|
+
|
|
210
|
+
```yaml
|
|
211
|
+
issue-assistant:
|
|
212
|
+
type: ai
|
|
213
|
+
group: dynamic
|
|
214
|
+
schema: issue-assistant
|
|
215
|
+
depends_on: [init-fact-validation]
|
|
216
|
+
prompt: |
|
|
217
|
+
{% if memory.has('fact_validation_issues', 'fact-validation') %}
|
|
218
|
+
## ⚠️ Previous Fact Validation Issues
|
|
219
|
+
The following facts were incorrect in your previous response:
|
|
220
|
+
|
|
221
|
+
{% assign issues = "fact_validation_issues" | memory_get: "fact-validation" %}
|
|
222
|
+
{% for issue in issues %}
|
|
223
|
+
- **{{ issue.claim }}** - {{ issue.issue }}
|
|
224
|
+
- Evidence: {{ issue.evidence }}
|
|
225
|
+
{% if issue.correction %}
|
|
226
|
+
- Correct information: {{ issue.correction }}
|
|
227
|
+
{% endif %}
|
|
228
|
+
{% endfor %}
|
|
229
|
+
|
|
230
|
+
Please correct these facts in your new response.
|
|
231
|
+
{% endif %}
|
|
232
|
+
|
|
233
|
+
You are a GitHub issue assistant for the {{ event.repository.fullName }} repository.
|
|
234
|
+
|
|
235
|
+
[Rest of your existing issue-assistant prompt...]
|
|
236
|
+
on: [issue_opened]
|
|
237
|
+
|
|
238
|
+
comment-assistant:
|
|
239
|
+
type: ai
|
|
240
|
+
group: dynamic
|
|
241
|
+
schema: issue-assistant
|
|
242
|
+
depends_on: [init-fact-validation]
|
|
243
|
+
prompt: |
|
|
244
|
+
{% if memory.has('fact_validation_issues', 'fact-validation') %}
|
|
245
|
+
## ⚠️ Previous Fact Validation Issues
|
|
246
|
+
[Same as above...]
|
|
247
|
+
{% endif %}
|
|
248
|
+
|
|
249
|
+
You are a GitHub comment assistant for {{ event.repository.fullName }}.
|
|
250
|
+
|
|
251
|
+
[Rest of your existing comment-assistant prompt...]
|
|
252
|
+
on: [issue_comment]
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
### 4. Fact Extraction (forEach with on_finish)
|
|
256
|
+
|
|
257
|
+
```yaml
|
|
258
|
+
extract-facts:
|
|
259
|
+
type: ai
|
|
260
|
+
group: fact-validation
|
|
261
|
+
schema: plain
|
|
262
|
+
depends_on: [issue-assistant, comment-assistant]
|
|
263
|
+
prompt: |
|
|
264
|
+
Extract all factual claims from the following assistant response.
|
|
265
|
+
|
|
266
|
+
Response:
|
|
267
|
+
```
|
|
268
|
+
{{ outputs['issue-assistant'].text | default: outputs['comment-assistant'].text }}
|
|
269
|
+
```
|
|
270
|
+
|
|
271
|
+
Extract facts in these categories:
|
|
272
|
+
1. **Technical Facts**: How code/features work
|
|
273
|
+
2. **Configuration Facts**: Settings, files, configuration
|
|
274
|
+
3. **Documentation Facts**: References to docs, guides, examples
|
|
275
|
+
4. **Version Facts**: Versions, releases, compatibility
|
|
276
|
+
5. **Feature Facts**: Feature availability or behavior
|
|
277
|
+
6. **Process Facts**: Workflows, procedures, best practices
|
|
278
|
+
|
|
279
|
+
Return a JSON array of fact objects:
|
|
280
|
+
```json
|
|
281
|
+
[
|
|
282
|
+
{
|
|
283
|
+
"id": "fact-1",
|
|
284
|
+
"category": "Configuration Facts",
|
|
285
|
+
"claim": "The default config file is .visor.yaml",
|
|
286
|
+
"context": "Mentioned when explaining Visor setup",
|
|
287
|
+
"verifiable": true
|
|
288
|
+
}
|
|
289
|
+
]
|
|
290
|
+
```
|
|
291
|
+
|
|
292
|
+
Return ONLY the JSON array. If no verifiable facts, return: []
|
|
293
|
+
transform_js: |
|
|
294
|
+
try {
|
|
295
|
+
const parsed = JSON.parse(output);
|
|
296
|
+
const verifiable = parsed.filter(f => f.verifiable === true);
|
|
297
|
+
return verifiable;
|
|
298
|
+
} catch (e) {
|
|
299
|
+
log('Failed to parse facts:', e);
|
|
300
|
+
return [];
|
|
301
|
+
}
|
|
302
|
+
forEach: true # ← Makes this a forEach check
|
|
303
|
+
|
|
304
|
+
# ✅ on_finish runs ONCE after ALL dependents complete
|
|
305
|
+
on_finish:
|
|
306
|
+
# First, run aggregation
|
|
307
|
+
run: [aggregate-validations]
|
|
308
|
+
|
|
309
|
+
# Then, routing decision based on aggregated results
|
|
310
|
+
goto_js: |
|
|
311
|
+
const allValid = memory.get('all_facts_valid', 'fact-validation');
|
|
312
|
+
const attempt = memory.get('fact_validation_attempt', 'fact-validation') || 0;
|
|
313
|
+
|
|
314
|
+
log('Fact validation complete - allValid:', allValid, 'attempt:', attempt);
|
|
315
|
+
|
|
316
|
+
if (allValid) {
|
|
317
|
+
log('All facts valid, proceeding to post');
|
|
318
|
+
return null; // Continue to post checks
|
|
319
|
+
}
|
|
320
|
+
|
|
321
|
+
if (attempt >= 1) {
|
|
322
|
+
log('Max attempts reached, giving up');
|
|
323
|
+
return null; // Continue to warning check
|
|
324
|
+
}
|
|
325
|
+
|
|
326
|
+
// Retry with fact validation context
|
|
327
|
+
log('Facts invalid, retrying assistant with context');
|
|
328
|
+
memory.increment('fact_validation_attempt', 1, 'fact-validation');
|
|
329
|
+
return event.name === 'issue_opened' ? 'issue-assistant' : 'comment-assistant';
|
|
330
|
+
|
|
331
|
+
goto_event: "{{ event.event_name }}"
|
|
332
|
+
|
|
333
|
+
on: [issue_opened, issue_comment]
|
|
334
|
+
if: "env.ENABLE_FACT_VALIDATION === 'true'"
|
|
335
|
+
```
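
The `transform_js` step trusts that the model returns a bare JSON array. A slightly more defensive parser might look like the sketch below. Everything here is an assumption for illustration: the `Fact` interface mirrors the JSON example in the prompt, and `isFact`, `parseFacts`, and `FENCE` are hypothetical helpers, not part of Visor.

```typescript
// Mirrors the fact object from the prompt; category and context are treated as optional here.
interface Fact {
  id: string;
  claim: string;
  verifiable: boolean;
  category?: string;
  context?: string;
}

// Three backticks, built at runtime so this example does not contain a literal fence.
const FENCE = '`'.repeat(3);

// Type guard: accept only objects that carry the fields the later checks rely on.
function isFact(value: unknown): value is Fact {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    typeof v.claim === 'string' &&
    typeof v.verifiable === 'boolean'
  );
}

function parseFacts(output: string): Fact[] {
  // Models sometimes wrap JSON in a Markdown fence; strip it before parsing.
  const stripped = output
    .trim()
    .replace(new RegExp('^' + FENCE + '(?:json)?\\s*', 'i'), '')
    .replace(new RegExp('\\s*' + FENCE + '$'), '');
  let parsed: unknown;
  try {
    parsed = JSON.parse(stripped);
  } catch {
    return []; // unparseable output yields no facts, hence no iterations
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isFact).filter((f) => f.verifiable);
}
```

Returning an empty array for malformed output keeps the behaviour of the original `catch` branch while also rejecting non-array JSON, which `parsed.filter` would otherwise fail on.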
|
|
336
|
+
|
|
337
|
+
### 5. Fact Validation (Dependent Check)
|
|
338
|
+
|
|
339
|
+
```yaml
|
|
340
|
+
validate-fact:
|
|
341
|
+
type: ai
|
|
342
|
+
group: fact-validation
|
|
343
|
+
schema: plain
|
|
344
|
+
depends_on: [extract-facts]
|
|
345
|
+
prompt: |
|
|
346
|
+
Verify this factual claim using available MCP tools.
|
|
347
|
+
|
|
348
|
+
Fact to verify:
|
|
349
|
+
- **Category**: {{ outputs['extract-facts'].category }}
|
|
350
|
+
- **Claim**: {{ outputs['extract-facts'].claim }}
|
|
351
|
+
- **Context**: {{ outputs['extract-facts'].context }}
|
|
352
|
+
|
|
353
|
+
## Verification Process
|
|
354
|
+
|
|
355
|
+
1. **Use MCP tools** to search code, read docs, check config
|
|
356
|
+
2. **Search relevant files**:
|
|
357
|
+
- Config facts → .visor.yaml, defaults/.visor.yaml, examples/
|
|
358
|
+
- Feature facts → docs/, src/
|
|
359
|
+
- Doc facts → Verify links and file references
|
|
360
|
+
3. **Cross-reference** with examples and patterns
|
|
361
|
+
|
|
362
|
+
Return JSON:
|
|
363
|
+
```json
|
|
364
|
+
{
|
|
365
|
+
"fact_id": "{{ outputs['extract-facts'].id }}",
|
|
366
|
+
"claim": "{{ outputs['extract-facts'].claim }}",
|
|
367
|
+
"is_valid": true,
|
|
368
|
+
"confidence": "high",
|
|
369
|
+
"evidence": "Found in defaults/.visor.yaml:10",
|
|
370
|
+
"correction": null,
|
|
371
|
+
"sources": ["defaults/.visor.yaml", "docs/configuration.md"]
|
|
372
|
+
}
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
Be thorough. If unsure, mark confidence as "low".
|
|
376
|
+
Return ONLY the JSON object.
|
|
377
|
+
transform_js: |
|
|
378
|
+
try {
|
|
379
|
+
return JSON.parse(output);
|
|
380
|
+
} catch (e) {
|
|
381
|
+
log('Failed to parse validation:', e);
|
|
382
|
+
return {
|
|
383
|
+
fact_id: outputs['extract-facts'].id,
|
|
384
|
+
claim: outputs['extract-facts'].claim,
|
|
385
|
+
is_valid: false,
|
|
386
|
+
confidence: 'low',
|
|
387
|
+
evidence: 'Failed to parse validation response',
|
|
388
|
+
correction: null,
|
|
389
|
+
sources: []
|
|
390
|
+
};
|
|
391
|
+
}
|
|
392
|
+
on: [issue_opened, issue_comment]
|
|
393
|
+
if: "env.ENABLE_FACT_VALIDATION === 'true'"
|
|
394
|
+
```
|
|
395
|
+
|
|
396
|
+
### 6. Aggregate Validation Results
|
|
397
|
+
|
|
398
|
+
```yaml
|
|
399
|
+
aggregate-validations:
|
|
400
|
+
type: script
|
|
401
|
+
namespace: fact-validation
|
|
402
|
+
content: |
|
|
403
|
+
// Get ALL validation results from forEach iterations
|
|
404
|
+
const validations = outputs.history['validate-fact'] || []; // guard: history may be missing
|
|
405
|
+
|
|
406
|
+
log('Aggregating', validations.length, 'validation results');
|
|
407
|
+
|
|
408
|
+
// Analyze results
|
|
409
|
+
const invalid = validations.filter(v => !v.is_valid);
|
|
410
|
+
const lowConfidence = validations.filter(v => v.confidence === 'low');
|
|
411
|
+
const allValid = invalid.length === 0 && lowConfidence.length === 0;
|
|
412
|
+
|
|
413
|
+
// Store results
|
|
414
|
+
memory.set('all_facts_valid', allValid, 'fact-validation');
|
|
415
|
+
memory.set('validation_results', validations, 'fact-validation');
|
|
416
|
+
memory.set('invalid_facts', invalid, 'fact-validation');
|
|
417
|
+
memory.set('low_confidence_facts', lowConfidence, 'fact-validation');
|
|
418
|
+
|
|
419
|
+
// Store issues for retry context
|
|
420
|
+
if (!allValid) {
|
|
421
|
+
const issues = [...invalid, ...lowConfidence].map(v => ({
|
|
422
|
+
claim: v.claim,
|
|
423
|
+
issue: v.is_valid ? 'low confidence' : 'incorrect',
|
|
424
|
+
evidence: v.evidence,
|
|
425
|
+
correction: v.correction
|
|
426
|
+
}));
|
|
427
|
+
memory.set('fact_validation_issues', issues, 'fact-validation');
|
|
428
|
+
}
|
|
429
|
+
|
|
430
|
+
return {
|
|
431
|
+
total: validations.length,
|
|
432
|
+
valid: validations.filter(v => v.is_valid && v.confidence !== 'low').length,
|
|
433
|
+
invalid: invalid.length,
|
|
434
|
+
low_confidence: lowConfidence.length,
|
|
435
|
+
all_valid: allValid,
|
|
436
|
+
summary: allValid
|
|
437
|
+
? 'All facts validated successfully'
|
|
438
|
+
: `Found ${invalid.length} invalid and ${lowConfidence.length} low-confidence facts`
|
|
439
|
+
};
|
|
440
|
+
on: [issue_opened, issue_comment]
|
|
441
|
+
```
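
The aggregation logic can also be expressed as a typed pure function, which is easier to unit test than an inline script. The sketch below is an assumption-laden variant, not the script above: `Validation`, `Issue`, and `aggregate` are hypothetical names, and unlike the script it counts low confidence only among otherwise valid facts, so a fact that is both invalid and low-confidence is reported once rather than twice.

```typescript
// Assumed shape, following the JSON returned by validate-fact.
interface Validation {
  fact_id: string;
  claim: string;
  is_valid: boolean;
  confidence: string;
  evidence: string;
  correction: string | null;
}

interface Issue {
  claim: string;
  issue: 'incorrect' | 'low confidence';
  evidence: string;
  correction: string | null;
}

function aggregate(validations: Validation[]) {
  const invalid = validations.filter((v) => !v.is_valid);
  // Only valid-but-uncertain facts land here, so the two groups never overlap.
  const lowConfidence = validations.filter((v) => v.is_valid && v.confidence === 'low');
  const issues = [...invalid, ...lowConfidence].map((v): Issue => ({
    claim: v.claim,
    issue: v.is_valid ? 'low confidence' : 'incorrect',
    evidence: v.evidence,
    correction: v.correction,
  }));
  return {
    total: validations.length,
    valid: validations.length - invalid.length - lowConfidence.length,
    invalid: invalid.length,
    low_confidence: lowConfidence.length,
    all_valid: issues.length === 0,
    issues,
  };
}
```

With disjoint groups, `valid + invalid + low_confidence` always equals `total`, which makes the summary counts easier to sanity-check.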
|
|
442
|
+
|
|
443
|
+
### 7. Post Verified Response
|
|
444
|
+
|
|
445
|
+
```yaml
|
|
446
|
+
post-verified-response:
|
|
447
|
+
type: github
|
|
448
|
+
tags: [github]
|
|
449
|
+
depends_on: [extract-facts]
|
|
450
|
+
on: [issue_opened, issue_comment]
|
|
451
|
+
if: |
|
|
452
|
+
env.ENABLE_FACT_VALIDATION === 'true' &&
|
|
453
|
+
memory.get('all_facts_valid', 'fact-validation') === true
|
|
454
|
+
op: comment.create
|
|
455
|
+
value: |
|
|
456
|
+
{{ outputs['issue-assistant'].text | default: outputs['comment-assistant'].text }}
|
|
457
|
+
```
|
|
458
|
+
|
|
459
|
+
### 8. Post Unverified Warning
|
|
460
|
+
|
|
461
|
+
```yaml
|
|
462
|
+
post-unverified-warning:
|
|
463
|
+
type: github
|
|
464
|
+
tags: [github]
|
|
465
|
+
depends_on: [extract-facts]
|
|
466
|
+
on: [issue_opened, issue_comment]
|
|
467
|
+
if: |
|
|
468
|
+
env.ENABLE_FACT_VALIDATION === 'true' &&
|
|
469
|
+
memory.get('all_facts_valid', 'fact-validation') === false &&
|
|
470
|
+
memory.get('fact_validation_attempt', 'fact-validation') >= 1
|
|
471
|
+
op: comment.create
|
|
472
|
+
value: |
|
|
473
|
+
⚠️ **Fact Validation Warning**
|
|
474
|
+
|
|
475
|
+
I attempted to respond to your {% if event.name == 'issue_opened' %}issue{% else %}comment{% endif %}, but could not verify all factual claims after validation.
|
|
476
|
+
|
|
477
|
+
**Issues Found:**
|
|
478
|
+
{% assign invalid = "invalid_facts" | memory_get: "fact-validation" %}
|
|
479
|
+
{% for fact in invalid %}
|
|
480
|
+
- **{{ fact.claim }}**: {{ fact.evidence }}
|
|
481
|
+
{% if fact.correction %}
|
|
482
|
+
- Correction: {{ fact.correction }}
|
|
483
|
+
{% endif %}
|
|
484
|
+
{% endfor %}
|
|
485
|
+
|
|
486
|
+
{% assign lowConf = "low_confidence_facts" | memory_get: "fact-validation" %}
|
|
487
|
+
{% if lowConf.size > 0 %}
|
|
488
|
+
|
|
489
|
+
**Low Confidence:**
|
|
490
|
+
{% for fact in lowConf %}
|
|
491
|
+
- **{{ fact.claim }}**: {{ fact.evidence }}
|
|
492
|
+
{% endfor %}
|
|
493
|
+
{% endif %}
|
|
494
|
+
|
|
495
|
+
I recommend having a human team member review this to provide accurate information.
|
|
496
|
+
```
|
|
497
|
+
|
|
498
|
+
### 9. Direct Post (Validation Disabled)
|
|
499
|
+
|
|
500
|
+
```yaml
|
|
501
|
+
post-direct-response:
|
|
502
|
+
type: github
|
|
503
|
+
tags: [github]
|
|
504
|
+
depends_on: [issue-assistant, comment-assistant]
|
|
505
|
+
on: [issue_opened, issue_comment]
|
|
506
|
+
if: "env.ENABLE_FACT_VALIDATION !== 'true'"
|
|
507
|
+
op: comment.create
|
|
508
|
+
value: |
|
|
509
|
+
{{ outputs['issue-assistant'].text | default: outputs['comment-assistant'].text }}
|
|
510
|
+
```
|
|
511
|
+
|
|
512
|
+
## on_finish Hook Implementation Requirements
|
|
513
|
+
|
|
514
|
+
### Type Definitions (`src/types/config.ts`)
|
|
515
|
+
|
|
516
|
+
```typescript
|
|
517
|
+
interface CheckConfig {
|
|
518
|
+
type: string;
|
|
519
|
+
forEach?: boolean;
|
|
520
|
+
on_finish?: RoutingAction; // NEW: Only valid on forEach checks
|
|
521
|
+
on_success?: RoutingAction;
|
|
522
|
+
on_fail?: RoutingAction;
|
|
523
|
+
// ... other fields
|
|
524
|
+
}
|
|
525
|
+
|
|
526
|
+
interface RoutingAction {
|
|
527
|
+
run?: string[];
|
|
528
|
+
run_js?: string;
|
|
529
|
+
goto?: string;
|
|
530
|
+
goto_js?: string;
|
|
531
|
+
goto_event?: string;
|
|
532
|
+
retry?: RetryConfig;
|
|
533
|
+
}
|
|
534
|
+
```
|
|
535
|
+
|
|
536
|
+
### Execution Engine (`src/check-execution-engine.ts`)
|
|
537
|
+
|
|
538
|
+
**Key behaviors to implement:**
|
|
539
|
+
|
|
540
|
+
1. **Trigger Condition**:
|
|
541
|
+
- Only on checks with `forEach: true`
|
|
542
|
+
- Triggers AFTER all dependent checks complete ALL their iterations
|
|
543
|
+
- Does NOT trigger if forEach array is empty
|
|
544
|
+
|
|
545
|
+
2. **Execution Order**:
|
|
546
|
+
```
|
|
547
|
+
forEach check executes once → outputs array
|
|
548
|
+
↓
|
|
549
|
+
All dependent checks execute N times (forEach propagation)
|
|
550
|
+
↓
|
|
551
|
+
on_finish.run executes (checks in order)
|
|
552
|
+
↓
|
|
553
|
+
on_finish.run_js evaluates (additional dynamic checks)
|
|
554
|
+
↓
|
|
555
|
+
on_finish.goto_js evaluates (routing decision)
|
|
556
|
+
↓
|
|
557
|
+
If goto returned, jump to ancestor check
|
|
558
|
+
```
|
|
559
|
+
|
|
560
|
+
3. **Context Available**:
|
|
561
|
+
```javascript
|
|
562
|
+
{
|
|
563
|
+
step: { id: 'extract-facts', tags: [...], group: '...' },
|
|
564
|
+
attempt: 1,
|
|
565
|
+
loop: 2,
|
|
566
|
+
outputs: {
|
|
567
|
+
'extract-facts': [...], // Array of items
|
|
568
|
+
'validate-fact': [...], // Array of ALL results
|
|
569
|
+
},
|
|
570
|
+
outputs.history: {
|
|
571
|
+
'extract-facts': [[...], ...], // Cross-loop history
|
|
572
|
+
'validate-fact': [[...], ...],
|
|
573
|
+
},
|
|
574
|
+
forEach: {
|
|
575
|
+
total: 3,
|
|
576
|
+
successful: 3,
|
|
577
|
+
failed: 0,
|
|
578
|
+
items: [...]
|
|
579
|
+
},
|
|
580
|
+
memory,
|
|
581
|
+
pr,
|
|
582
|
+
files,
|
|
583
|
+
env
|
|
584
|
+
}
|
|
585
|
+
```
|
|
586
|
+
|
|
587
|
+
4. **Error Handling**:
|
|
588
|
+
- If `on_finish.run` checks fail, mark forEach check as failed
|
|
589
|
+
- If `goto_js` throws error, fallback to static `goto` or skip routing
|
|
590
|
+
- Raise a clear error if `on_finish` is used on a non-forEach check
|
|
591
|
+
|
|
592
|
+
5. **Loop Safety**:
|
|
593
|
+
- Count `on_finish.goto` toward `max_loops`
|
|
594
|
+
- Track routing transitions
|
|
595
|
+
- Abort with clear message if max_loops exceeded
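
The behaviours listed above (trigger condition, execution order, error fallback, loop safety) can be sketched together in one function. This is a simplified model under explicit assumptions, not the engine's code: `OnFinishHook`, `HookResult`, and `runOnFinish` are hypothetical, `runJs` and `gotoJs` are modelled as plain functions instead of sandboxed strings, and a `null` from `gotoJs` is assumed to mean "no routing" even when a static `goto` exists.

```typescript
// Hypothetical, simplified hook shape.
interface OnFinishHook {
  run?: string[];
  runJs?: () => string[];
  goto?: string;
  gotoJs?: () => string | null;
}

interface HookResult {
  executed: string[];    // checks that would run, in order
  target: string | null; // routing target, if any
}

function runOnFinish(
  hook: OnFinishHook,
  itemCount: number,
  loopsUsed: number,
  maxLoops: number
): HookResult {
  // Trigger condition: an empty forEach array never fires the hook.
  if (itemCount === 0) return { executed: [], target: null };

  // Execution order: static run list first, then dynamically computed checks.
  const executed = [...(hook.run ?? []), ...(hook.runJs ? hook.runJs() : [])];

  // Routing: goto_js decides; on error fall back to the static goto.
  let target: string | null;
  try {
    target = hook.gotoJs ? hook.gotoJs() : (hook.goto ?? null);
  } catch {
    target = hook.goto ?? null;
  }

  // Loop safety: a routing jump counts toward the loop budget.
  if (target !== null && loopsUsed >= maxLoops) {
    throw new Error(`max_loops (${maxLoops}) exceeded while routing to '${target}'`);
  }
  return { executed, target };
}
```

In this model a failing `gotoJs` degrades to the static target rather than aborting the run, which matches the fallback described under error handling.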
|
|
596
|
+
|
|
597
|
+
### Schema Validation (`src/config.ts`)
|
|
598
|
+
|
|
599
|
+
```typescript
|
|
600
|
+
// Validation rules:
|
|
601
|
+
// 1. on_finish only allowed on checks with forEach: true
|
|
602
|
+
// 2. on_finish.goto must be ancestor (same as on_success/on_fail)
|
|
603
|
+
// 3. on_finish.goto_js must return string or null
|
|
604
|
+
// 4. on_finish.run must be array of valid check IDs
|
|
605
|
+
|
|
606
|
+
if (check.on_finish && !check.forEach) {
|
|
607
|
+
throw new Error(
|
|
608
|
+
`Check '${checkId}' has on_finish but forEach is not true. ` +
|
|
609
|
+
`on_finish is only valid on forEach checks.`
|
|
610
|
+
);
|
|
611
|
+
}
|
|
612
|
+
```
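
Rules 1 and 4 can be checked for a whole configuration in one pass that collects every problem instead of stopping at the first. The sketch below is illustrative only: `CheckSketch`, `RoutingSketch`, and `validateOnFinish` are hypothetical names, unknown `goto` targets are reported as an extra assumption, and the ancestor rule (rule 2) is not covered because it needs the dependency graph.

```typescript
// Hypothetical, trimmed-down config shapes.
interface RoutingSketch {
  run?: string[];
  goto?: string;
}

interface CheckSketch {
  forEach?: boolean;
  on_finish?: RoutingSketch;
}

function validateOnFinish(checks: Record<string, CheckSketch>): string[] {
  const errors: string[] = [];
  const known = new Set(Object.keys(checks));

  for (const [checkId, check] of Object.entries(checks)) {
    const hook = check.on_finish;
    if (!hook) continue;

    // Rule 1: on_finish is only meaningful on forEach checks.
    if (check.forEach !== true) {
      errors.push(`Check '${checkId}' has on_finish but forEach is not true.`);
    }
    // Rule 4: every entry in on_finish.run must name an existing check.
    for (const target of hook.run ?? []) {
      if (!known.has(target)) {
        errors.push(`Check '${checkId}': on_finish.run references unknown check '${target}'.`);
      }
    }
    // Extra assumption: a static goto must also name an existing check.
    if (hook.goto !== undefined && !known.has(hook.goto)) {
      errors.push(`Check '${checkId}': on_finish.goto references unknown check '${hook.goto}'.`);
    }
  }
  return errors;
}
```

Returning a list lets a caller report all configuration mistakes at once; throwing on the first error, as in the snippet above, is the simpler alternative.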
|
|
613
|
+
|
|
614
|
+
### Documentation Updates
|
|
615
|
+
|
|
616
|
+
1. **`docs/failure-routing.md`**: Add `on_finish` section
|
|
617
|
+
2. **`docs/dependencies.md`**: Explain `on_finish` with forEach
|
|
618
|
+
3. **`examples/foreach-on-finish.yaml`**: Complete working example
|
|
619
|
+
4. **`docs/foreach-dependency-propagation.md`**: Update with `on_finish`
|
|
620
|
+
|
|
621
|
+
## Testing Strategy
|
|
622
|
+
|
|
623
|
+
### Unit Tests
|
|
624
|
+
|
|
625
|
+
1. **on_finish Trigger**:
|
|
626
|
+
- Triggers after last forEach iteration
|
|
627
|
+
- Does NOT trigger for empty arrays
|
|
628
|
+
- Does NOT trigger on non-forEach checks
|
|
629
|
+
|
|
630
|
+
2. **on_finish Execution Order**:
|
|
631
|
+
- `run` executes before `goto`
|
|
632
|
+
- `run_js` evaluates after `run`
|
|
633
|
+
- `goto_js` evaluates last
|
|
634
|
+
|
|
635
|
+
3. **on_finish Context**:
|
|
636
|
+
- Has access to `outputs.history` of all dependents
|
|
637
|
+
- Has `forEach` stats (total, successful, failed)
|
|
638
|
+
- Has correct `attempt` and `loop` counters
|
|
639
|
+
|
|
640
|
+
4. **Error Cases**:
|
|
641
|
+
- Error if `on_finish` on non-forEach check
|
|
642
|
+
- Handle `goto_js` errors gracefully
|
|
643
|
+
- Handle failed `on_finish.run` checks
|
|
644
|
+
|
|
645
|
+
### Integration Tests
|
|
646
|
+
|
|
647
|
+
1. **Full Fact Validation Flow**:
|
|
648
|
+
- Issue opened → assistant → extract → validate → aggregate → post
|
|
649
|
+
- All facts valid → verified post
|
|
650
|
+
- Some facts invalid → retry → validate → post
|
|
651
|
+
|
|
652
|
+
2. **Retry Logic**:
|
|
653
|
+
- Invalid facts → retry once with context → validate → post
|
|
654
|
+
- Invalid facts → retry → still invalid → warning post
|
|
655
|
+
|
|
656
|
+
3. **Edge Cases**:
|
|
657
|
+
- No facts extracted → direct post
|
|
658
|
+
- All facts low confidence → retry
|
|
659
|
+
- Validation disabled → direct post
|
|
660
|
+
|
|
661
|
+
4. **Multiple Dependents**:
|
|
662
|
+
- forEach with multiple dependent checks
|
|
663
|
+
- Aggregate across all dependents
|
|
664
|
+
- Route based on combined results
|
|
665
|
+
|
|
666
|
+
### E2E Tests
|
|
667
|
+
|
|
668
|
+
1. **GitHub Issue Opened**:
|
|
669
|
+
- Real issue → assistant response → fact validation → post
|
|
670
|
+
2. **Comment on Issue**:
|
|
671
|
+
- Comment → response → invalid facts → retry → post
|
|
672
|
+
3. **Validation Disabled**:
|
|
673
|
+
- Comment → response → direct post (no validation)
|
|
674
|
+
|
|
675
|
+
## Quick Reference
|
|
676
|
+
|
|
677
|
+
### Configuration Checklist
|
|
678
|
+
|
|
679
|
+
- [x] Add `ENABLE_FACT_VALIDATION` to workflow env (default: "true") ✅
|
|
680
|
+
- [x] Implement `on_finish` hook in execution engine ✅
|
|
681
|
+
- [x] Add type definitions for `on_finish` ✅
|
|
682
|
+
- [x] Add schema validation for `on_finish` ✅
|
|
683
|
+
- [x] Add memory initialization check ✅
|
|
684
|
+
- [x] Add fact extraction AI check with `forEach` and `on_finish` ✅
|
|
685
|
+
- [x] Add fact validation AI check (depends on extract-facts) ✅
|
|
686
|
+
- [x] Add aggregation memory check ✅
|
|
687
|
+
- [x] Add response posting checks (verified/unverified/direct) ✅
|
|
688
|
+
- [x] Update issue-assistant with retry context ✅
|
|
689
|
+
- [x] Update comment-assistant with retry context ✅
|
|
690
|
+
|
|
691
|
+
**Status:** All core configuration items complete. See `examples/fact-validator.yaml` for reference.
|
|
692
|
+
|
|
693
|
+
### Check Names
|
|
694
|
+
|
|
695
|
+
```yaml
|
|
696
|
+
init-fact-validation # Memory: Initialize attempt counter
|
|
697
|
+
issue-assistant # AI: Generate issue response
|
|
698
|
+
comment-assistant # AI: Generate comment response
|
|
699
|
+
extract-facts # AI: Extract facts (forEach: true, on_finish)
|
|
700
|
+
validate-fact # AI: Validate each fact (depends on extract-facts)
|
|
701
|
+
aggregate-validations # Script: Collect validation results (run by on_finish)
|
|
702
|
+
post-verified-response # GitHub: Post response (all facts valid)
|
|
703
|
+
post-unverified-warning # GitHub: Post warning (max attempts reached)
|
|
704
|
+
post-direct-response # GitHub: Post directly (validation disabled)
|
|
705
|
+
```
|
|
706
|
+
|
|
707
|
+
### Memory Keys (namespace: fact-validation)
|
|
708
|
+
|
|
709
|
+
```yaml
|
|
710
|
+
fact_validation_attempt # number: Current attempt (0 or 1)
|
|
711
|
+
all_facts_valid # boolean: True if all facts validated
|
|
712
|
+
validation_results # array: All validation results
|
|
713
|
+
invalid_facts # array: Facts that failed validation
|
|
714
|
+
low_confidence_facts # array: Facts with low confidence
|
|
715
|
+
fact_validation_issues # array: Issues for retry context
|
|
716
|
+
```
|
|
717
|
+
|
|
718
|
+
### Liquid Template Examples
|
|
719
|
+
|
|
720
|
+
```liquid
|
|
721
|
+
{% comment %}Check if fact validation is enabled{% endcomment %}
|
|
722
|
+
{% if env.ENABLE_FACT_VALIDATION == 'true' %}
|
|
723
|
+
|
|
724
|
+
{% comment %}Get current attempt{% endcomment %}
|
|
725
|
+
{% assign attempt = "fact_validation_attempt" | memory_get: "fact-validation" %}
|
|
726
|
+
|
|
727
|
+
{% comment %}Check if all facts are valid{% endcomment %}
|
|
728
|
+
{% assign allValid = "all_facts_valid" | memory_get: "fact-validation" %}
|
|
729
|
+
|
|
730
|
+
{% comment %}Get invalid facts{% endcomment %}
|
|
731
|
+
{% assign invalidFacts = "invalid_facts" | memory_get: "fact-validation" %}
|
|
732
|
+
{% for fact in invalidFacts %}
|
|
733
|
+
- {{ fact.claim }}: {{ fact.evidence }}
|
|
734
|
+
{% endfor %}
|
|
735
|
+
|
|
736
|
+
{% endif %}
|
|
737
|
+
```
|
|
738
|
+
|
|
739
|
+
### JavaScript Examples
|
|
740
|
+
|
|
741
|
+
```javascript
|
|
742
|
+
// In on_finish.goto_js
|
|
743
|
+
const allValid = memory.get('all_facts_valid', 'fact-validation');
|
|
744
|
+
const attempt = memory.get('fact_validation_attempt', 'fact-validation') || 0;
|
|
745
|
+
|
|
746
|
+
// Get all validation results from forEach iterations
|
|
747
|
+
const results = outputs.history['validate-fact'] || []; // guard: history may be missing
|
|
748
|
+
const allResultsValid = results.every(r => r.is_valid);
|
|
749
|
+
|
|
750
|
+
// Decide whether to retry
|
|
751
|
+
if (!allValid && attempt < 1) {
|
|
752
|
+
memory.increment('fact_validation_attempt', 1, 'fact-validation');
|
|
753
|
+
return event.name === 'issue_opened' ? 'issue-assistant' : 'comment-assistant';
|
|
754
|
+
}
|
|
755
|
+
return null;
|
|
756
|
+
```
|
|
757
|
+
|
|
758
|
+
## Rollout Plan
|
|
759
|
+
|
|
760
|
+
### Phase 1: on_finish Hook Implementation (3-4 days)
|
|
761
|
+
- Implement TypeScript types
|
|
762
|
+
- Implement execution engine logic
|
|
763
|
+
- Add schema validation
|
|
764
|
+
- Write unit tests
|
|
765
|
+
- Update documentation
|
|
766
|
+
|
|
767
|
+
### Phase 2: Fact Validator Configuration (2-3 days)
|
|
768
|
+
- Create complete .visor.yaml configuration
|
|
769
|
+
- Implement all checks
|
|
770
|
+
- Test forEach and on_finish integration
|
|
771
|
+
- Write integration tests
|
|
772
|
+
|
|
773
|
+
### Phase 3: Testing & Refinement (2-3 days)
|
|
774
|
+
- E2E testing with real scenarios
|
|
775
|
+
- Tune prompts for better fact extraction
|
|
776
|
+
- Tune confidence thresholds
|
|
777
|
+
- Performance testing
|
|
778
|
+
|
|
779
|
+
### Phase 4: Documentation & Examples (1-2 days)
|
|
780
|
+
- Complete user documentation
|
|
781
|
+
- Add example configurations
|
|
782
|
+
- Create debugging guide
|
|
783
|
+
- Write best practices
|
|
784
|
+
|
|
785
|
+
### Phase 5: Deployment (1 day)
|
|
786
|
+
- Enable in staging
|
|
787
|
+
- Monitor metrics
|
|
788
|
+
- Enable in production
|
|
789
|
+
- Set up alerts
|
|
790
|
+
|
|
791
|
+
**Total**: 9-13 days
|
|
792
|
+
|
|
793
|
+
## Success Metrics
|
|
794
|
+
|
|
795
|
+
1. **Accuracy Rate**: % of responses with all facts validated correctly
|
|
796
|
+
2. **Validation Coverage**: % of responses that go through validation
|
|
797
|
+
3. **Retry Rate**: % requiring retry after failed validation
|
|
798
|
+
4. **False Positive Rate**: % of valid facts marked invalid
|
|
799
|
+
5. **Performance**: Average latency added by validation
|
|
800
|
+
6. **User Satisfaction**: Feedback on response accuracy
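
Three of these metrics reduce to simple ratios over per-response records. The sketch below shows one possible way to compute them. The `ResponseRecord` fields, `rate`, and `summarize` are hypothetical, and the choice of denominators (all responses for coverage, validated responses for the other two) is an assumption the plan does not spell out.

```typescript
// Hypothetical per-response log entry.
interface ResponseRecord {
  validated: boolean;     // went through fact validation
  retried: boolean;       // needed a retry after failed validation
  allFactsValid: boolean; // final validation verdict
}

// Safe ratio: an empty denominator yields 0 instead of NaN.
function rate(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function summarize(records: ResponseRecord[]) {
  const validated = records.filter((r) => r.validated);
  return {
    validationCoverage: rate(validated.length, records.length),
    retryRate: rate(validated.filter((r) => r.retried).length, validated.length),
    accuracyRate: rate(validated.filter((r) => r.allFactsValid).length, validated.length),
  };
}
```

False positive rate and user satisfaction need labelled data or feedback and are left out of this sketch.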
|
|
801
|
+
|
|
802
|
+
## Risk Assessment
|
|
803
|
+
|
|
804
|
+
| Risk | Probability | Impact | Mitigation |
|
|
805
|
+
|------|-------------|--------|------------|
|
|
806
|
+
| High latency | High | Medium | Configurable enable/disable, optimize prompts |
|
|
807
|
+
| False positives | Medium | High | Thorough testing, tune thresholds |
|
|
808
|
+
| Infinite loops | Low | High | Hard limit (max 2 attempts), memory tracking |
|
|
809
|
+
| API quota exhaustion | Medium | Medium | Rate limiting, caching |
|
|
810
|
+
|
|
811
|
+
## Conclusion
|
|
812
|
+
|
|
813
|
+
This fact validation system is designed to improve AI response accuracy while remaining optional per environment. Its key addition is the `on_finish` hook on forEach checks, which enables aggregation and routing after all dependent validations complete.
|
|
814
|
+
|
|
815
|
+
### Key Benefits
|
|
816
|
+
|
|
817
|
+
1. **on_finish Hook**: Clean primitive for post-forEach processing
|
|
818
|
+
2. **Self-Correcting AI**: Retry with validation context
|
|
819
|
+
3. **Granular Validation**: Each fact validated independently
|
|
820
|
+
4. **Safe Loops**: An attempt counter in memory bounds the retry loop
|
|
821
|
+
5. **Flexible**: Easy to enable/disable per environment
|
|
822
|
+
|
|
823
|
+
---
|
|
824
|
+
|
|
825
|
+
## 📋 IMPLEMENTATION TASK LIST
|
|
826
|
+
|
|
827
|
+
### ✅ PHASE 1: Core `on_finish` Hook Infrastructure (COMPLETED)
|
|
828
|
+
|
|
829
|
+
**Priority: P0 (CRITICAL) - Status: ✅ DONE**
|
|
830
|
+
|
|
831
|
+
- [x] **1.1** Create `OnFinishConfig` interface in `src/types/config.ts`
|
|
832
|
+
- [x] Add `run?: string[]` field
|
|
833
|
+
- [x] Add `goto?: string` field
|
|
834
|
+
- [x] Add `goto_event?: EventTrigger` field
|
|
835
|
+
- [x] Add `goto_js?: string` field
|
|
836
|
+
- [x] Add `run_js?: string` field
|
|
837
|
+
- [x] Add `on_finish?: OnFinishConfig` to `CheckConfig`
|
|
838
|
+
|
|
839
|
+
- [x] **1.2** Add schema validation in `src/config.ts`
|
|
840
|
+
- [x] Validate `on_finish` only allowed on `forEach: true` checks
|
|
841
|
+
- [x] Add clear error messages
|
|
842
|
+
- [x] Fix missing 'memory' check type
|
|
843
|
+
|
|
844
|
+
- [x] **1.3** Implement detection & triggering in `src/check-execution-engine.ts`
|
|
845
|
+
- [x] Create `handleOnFinishHooks()` method
|
|
846
|
+
- [x] Detect forEach checks with `on_finish`
|
|
847
|
+
- [x] Verify all dependents completed
|
|
848
|
+
- [x] Build context (outputs, forEach stats, memory, PR)
|
|
849
|
+
- [x] Skip empty forEach arrays
|
|
850
|
+
|
|
851
|
+
- [x] **1.4** Write comprehensive tests
|
|
852
|
+
- [x] Unit tests for validation (3 tests)
|
|
853
|
+
- [x] E2E tests for integration (4 tests)
|
|
854
|
+
- [x] Verify no regressions (full suite)
|
|
855
|
+
|
|
856
|
+
**Result:** ✅ All 1426 tests passing, infrastructure solid
|
|
857
|
+
|
|
858
|
+
---
|
|
859
|
+
|
|
860
|
+
### ✅ PHASE 2: Complete `on_finish` Execution (COMPLETED)
|
|
861
|
+
|
|
862
|
+
**Priority: P0 (CRITICAL - BLOCKING) - Status: ✅ DONE**
|
|
863
|
+
|
|
864
|
+
- [x] **2.1** Detection and context building ✅
|
|
865
|
+
- [x] Detect `on_finish` configuration
|
|
866
|
+
- [x] Build forEach stats
|
|
867
|
+
- [x] Prepare outputs context
|
|
868
|
+
- [x] Create memory helpers
|
|
869
|
+
|
|
870
|
+
- [x] **2.2** Implement `executeCheckInline()` method ✅
|
|
871
|
+
- [x] Extract/refactor `executeNamedCheckInline` from `executeWithRouting()`
|
|
872
|
+
- [x] Make it a class method accessible to `handleOnFinishHooks()`
|
|
873
|
+
- [x] Handle check dependencies
|
|
874
|
+
- [x] Update results map
|
|
875
|
+
- [x] Support event override (`goto_event`)
|
|
876
|
+
- [x] Add error handling
|
|
877
|
+
- **Files modified:** `src/check-execution-engine.ts` (lines 283-470)
|
|
878
|
+
|
|
879
|
+
- [x] **2.3** Implement `on_finish.run` execution ✅
|
|
880
|
+
- [x] Iterate through `onFinish.run` array
|
|
881
|
+
- [x] Call `executeCheckInline()` for each check
|
|
882
|
+
- [x] Execute in order (sequential)
|
|
883
|
+
- [x] Propagate errors properly
|
|
884
|
+
- [x] Log execution start/complete
|
|
885
|
+
- **Files modified:** `src/check-execution-engine.ts` (lines 625-668)
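
The sequential behaviour described in task 2.3 can be sketched as a simple awaited loop. The executor is passed in as a parameter here; its signature is an assumption and does not reflect the real `executeCheckInline()` method.

```typescript
// Assumed executor signature, standing in for executeCheckInline().
type InlineExecutor = (checkName: string) => Promise<void>;

// Runs each check in order; a rejection stops the loop and propagates to the caller.
async function runOnFinishChecks(
  run: string[],
  executeCheckInline: InlineExecutor,
  log: (msg: string) => void = console.log
): Promise<string[]> {
  const executed: string[] = [];
  for (const name of run) {
    log(`on_finish.run: starting ${name}`);
    await executeCheckInline(name); // sequential: the next check waits for this one
    log(`on_finish.run: completed ${name}`);
    executed.push(name);
  }
  return executed;
}
```

Awaiting inside the loop, instead of using `Promise.all`, is what guarantees the declared order.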
|
|
886
|
+
|
|
887
|
+
- [x] **2.5** Implement `on_finish.goto_js` evaluation ✅
|
|
888
|
+
- [x] Compile JS expression with Sandbox
|
|
889
|
+
- [x] Evaluate with full context
|
|
890
|
+
- [x] Extract goto target (string or null)
|
|
891
|
+
- [x] Call `executeCheckInline()` with target
|
|
892
|
+
- [x] Support `goto_event` override
|
|
893
|
+
- [x] Handle null return (no routing)
|
|
894
|
+
- [x] Add loop safety checks
|
|
895
|
+
- **Files modified:** `src/check-execution-engine.ts` (lines 670-748)
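
A rough sketch of the `goto_js` steps in task 2.5: evaluate an expression against a context, accept a string target or treat anything else as "no routing", and refuse to route once a loop budget is spent. Note that `new Function` is used purely as a stand-in and is not a sandbox; the context shape and the `maxLoops` parameter are assumptions.

```typescript
interface GotoContext {
  outputs: Record<string, unknown>;
  memory: Record<string, unknown>;
}

// Evaluates the expression body and returns a target name, or null for "no routing".
// WARNING: new Function is NOT a sandbox; the real engine compiles with a Sandbox.
function evaluateGotoJs(expression: string, context: GotoContext): string | null {
  const fn = new Function("outputs", "memory", expression);
  const result: unknown = fn(context.outputs, context.memory);
  return typeof result === "string" && result.length > 0 ? result : null;
}

// Hypothetical loop-safety guard: stop routing once the budget is exhausted.
function nextGotoTarget(
  expression: string,
  context: GotoContext,
  loopCount: number,
  maxLoops: number
): string | null {
  if (loopCount >= maxLoops) return null;
  return evaluateGotoJs(expression, context);
}
```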
|
|
896
|
+
|
|
897
|
+
- [x] **2.6** Implement static `on_finish.goto` execution ✅
|
|
898
|
+
- [x] Check for static goto string
|
|
899
|
+
- [x] Call `executeCheckInline()` with target
|
|
900
|
+
- [x] Support `goto_event` override
|
|
901
|
+
- **Files modified:** `src/check-execution-engine.ts` (lines 710-713)
|
|
902
|
+
|
|
903
|
+
- [x] **2.7** Add comprehensive error handling ✅
|
|
904
|
+
- [x] Try-catch around `run` execution
|
|
905
|
+
- [x] Try-catch around `goto_js` evaluation
|
|
906
|
+
- [x] Fallback to static `goto` on `goto_js` error
|
|
907
|
+
- [x] Clear error messages
|
|
908
|
+
- [x] Log all errors
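
The fallback rule in task 2.7 could be expressed as below. The precedence shown (a `null` result from `goto_js` means no routing, while a thrown error falls back to the static `goto`) is read from the checklist items above; the function name and signature are hypothetical.

```typescript
// gotoJs is a thunk that evaluates the dynamic expression; it may throw.
function resolveGotoTarget(
  gotoJs: (() => string | null) | undefined,
  staticGoto: string | undefined,
  logError: (msg: string) => void = console.error
): string | null {
  if (gotoJs) {
    try {
      // A null result is a deliberate "no routing" decision and is respected.
      return gotoJs();
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      logError(`goto_js evaluation failed: ${message}`);
      // On error, fall back to the static goto if one is configured.
      return staticGoto ?? null;
    }
  }
  return staticGoto ?? null;
}
```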
|
|
909
|
+
|
|
910
|
+
- [x] **2.8** Add execution logging ✅
|
|
911
|
+
- [x] Log `on_finish` start
|
|
912
|
+
- [x] Log each check execution
|
|
913
|
+
- [x] Log routing decisions
|
|
914
|
+
- [x] Log completion
|
|
915
|
+
- [x] Debug output for context
|
|
916
|
+
|
|
917
|
+
- [x] **2.9** Write tests for execution ✅
|
|
918
|
+
- [x] Unit test: `on_finish.run` executes in order
|
|
919
|
+
- [x] Unit test: `on_finish.goto_js` routes correctly
|
|
920
|
+
- [x] Unit test: Error handling works
|
|
921
|
+
- [x] E2E test: Full routing flow
|
|
922
|
+
- [x] E2E test: Retry with memory
|
|
923
|
+
- **Files:** `tests/unit/on-finish-validation.test.ts`, `tests/e2e/foreach-on-finish.test.ts`
|
|
924
|
+
- **Result:** 1438 tests passing, 0 failures
|
|
925
|
+
|
|
926
|
+
**Total Phase 2: COMPLETED** ✅
|
|
927
|
+
|
|
928
|
+
---
|
|
929
|
+
|
|
930
|
+
### ✅ PHASE 3: Fact Validator Configuration (COMPLETED)
|
|
931
|
+
|
|
932
|
+
**Priority: P1 (HIGH) - Status: ✅ DONE**
|
|
933
|
+
|
|
934
|
+
- [x] **3.1** Create memory initialization check ✅
|
|
935
|
+
- [x] Type: `memory`
|
|
936
|
+
- [x] Operation: `set`
|
|
937
|
+
- [x] Key: `fact_validation_attempt`
|
|
938
|
+
- [x] Value: `0`
|
|
939
|
+
- [x] Namespace: `fact-validation`
|
|
940
|
+
- [x] Add `if` condition for `ENABLE_FACT_VALIDATION`
|
|
941
|
+
- **File:** `examples/fact-validator.yaml`
|
|
942
|
+
|
|
943
|
+
- [x] **3.2** Update issue-assistant prompt ✅
|
|
944
|
+
- [x] Add Liquid template check for `memory.has('fact_validation_issues')`
|
|
945
|
+
- [x] Display previous validation failures
|
|
946
|
+
- [x] Format correction context clearly
|
|
947
|
+
- [x] Test template rendering
|
|
948
|
+
- **File:** `examples/fact-validator.yaml`
|
|
949
|
+
|
|
950
|
+
- [x] **3.3** Update comment-assistant prompt ✅
|
|
951
|
+
- [x] Add same Liquid template logic as issue-assistant
|
|
952
|
+
- [x] Ensure consistency in error display
|
|
953
|
+
- [x] Test template rendering
|
|
954
|
+
- **File:** `examples/fact-validator.yaml`
|
|
955
|
+
|
|
956
|
+
- [x] **3.4** Create extract-facts check ✅
|
|
957
|
+
- [x] Type: `command` (demo) / `ai` (production)
|
|
958
|
+
- [x] Schema: `plain`
|
|
959
|
+
- [x] Depends on: `[issue-assistant, comment-assistant]`
|
|
960
|
+
- [x] Write comprehensive extraction prompt
|
|
961
|
+
- [x] Add `transform_js` to parse and filter facts
|
|
962
|
+
- [x] Set `forEach: true`
|
|
963
|
+
- [x] Add `on_finish` configuration:
|
|
964
|
+
- [x] `run: [aggregate-validations]`
|
|
965
|
+
- [x] `goto_js`: Check validation results and route
|
|
966
|
+
- [x] `goto_event`: Pass through event
|
|
967
|
+
- [x] Add `if` condition for `ENABLE_FACT_VALIDATION`
|
|
968
|
+
- **File:** `examples/fact-validator.yaml`
|
|
969
|
+
|
|
970
|
+
- [x] **3.5** Create validate-fact check ✅
|
|
971
|
+
- [x] Type: `command` (demo) / `ai` (production)
|
|
972
|
+
- [x] Schema: `plain`
|
|
973
|
+
- [x] Depends on: `[extract-facts]`
|
|
974
|
+
- [x] Write validation prompt with MCP tools
|
|
975
|
+
- [x] Instructions for using code search
|
|
976
|
+
- [x] Instructions for reading documentation
|
|
977
|
+
- [x] Add `transform_js` to parse validation result
|
|
978
|
+
- [x] Handle parse errors gracefully
|
|
979
|
+
- [x] Add `if` condition for `ENABLE_FACT_VALIDATION`
|
|
980
|
+
- **File:** `examples/fact-validator.yaml`
|
|
981
|
+
|
|
982
|
+
- [x] **3.6** Create aggregate-validations check ✅
|
|
983
|
+
- [x] Type: `script`
|
|
984
|
+
- [x] Namespace: `fact-validation`
|
|
985
|
+
- [x] Use `content` to:
|
|
986
|
+
- [x] Read all validation results from `outputs.history['validate-fact']`
|
|
987
|
+
- [x] Calculate `all_facts_valid`
|
|
988
|
+
- [x] Store `invalid_facts`
|
|
989
|
+
- [x] Store `low_confidence_facts`
|
|
990
|
+
- [x] Store `fact_validation_issues` for retry
|
|
991
|
+
- [x] Return summary object
|
|
992
|
+
- **File:** `examples/fact-validator.yaml`
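
The aggregation in task 3.6 can be sketched as a pure function over the collected validation results. The result fields `is_valid` and `confidence`, and the decision to count low-confidence facts against `all_facts_valid`, are assumptions; the actual script lives in `examples/fact-validator.yaml`.

```typescript
// Assumed shape of one validate-fact result.
interface FactValidation {
  fact: string;
  is_valid: boolean;
  confidence: "high" | "medium" | "low";
}

interface ValidationSummary {
  all_facts_valid: boolean;
  invalid_facts: FactValidation[];
  low_confidence_facts: FactValidation[];
  total: number;
}

// history corresponds to outputs.history['validate-fact'] in the plan above.
function aggregateValidations(history: FactValidation[]): ValidationSummary {
  const invalid = history.filter((v) => !v.is_valid);
  const lowConfidence = history.filter((v) => v.confidence === "low");
  return {
    // Assumption: a low-confidence fact also blocks the "all valid" verdict.
    all_facts_valid: invalid.length === 0 && lowConfidence.length === 0,
    invalid_facts: invalid,
    low_confidence_facts: lowConfidence,
    total: history.length,
  };
}
```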
|
|
993
|
+
|
|
994
|
+
- [x] **3.7** Create post-verified-response check ✅
|
|
995
|
+
- [x] Type: `logger` (demo) / `github` (production)
|
|
996
|
+
- [x] Depends on: `[extract-facts]`
|
|
997
|
+
- [x] Operation: `comment.create`
|
|
998
|
+
- [x] Add `if` condition: `all_facts_valid === true`
|
|
999
|
+
- [x] Value: Original assistant response
|
|
1000
|
+
- **File:** `examples/fact-validator.yaml`
|
|
1001
|
+
|
|
1002
|
+
- [x] **3.8** Create post-unverified-warning check ✅
|
|
1003
|
+
- [x] Type: `logger` (demo) / `github` (production)
|
|
1004
|
+
- [x] Depends on: `[extract-facts]`
|
|
1005
|
+
- [x] Operation: `comment.create`
|
|
1006
|
+
- [x] Add `if` condition: `all_facts_valid === false && attempt >= 1`
|
|
1007
|
+
- [x] Value: Warning with validation issues
|
|
1008
|
+
- [x] Format invalid and low-confidence facts
|
|
1009
|
+
- **File:** `examples/fact-validator.yaml`
|
|
1010
|
+
|
|
1011
|
+
- [x] **3.9** Create post-direct-response check ✅
|
|
1012
|
+
- [x] Type: `logger` (demo) / `github` (production)
|
|
1013
|
+
- [x] Depends on: `[issue-assistant, comment-assistant]`
|
|
1014
|
+
- [x] Operation: `comment.create`
|
|
1015
|
+
- [x] Add `if` condition: `ENABLE_FACT_VALIDATION !== 'true'`
|
|
1016
|
+
- [x] Value: Direct assistant response
|
|
1017
|
+
- **File:** `examples/fact-validator.yaml` (commented out in demo)
|
|
1018
|
+
|
|
1019
|
+
- [x] **3.10** Add environment configuration ✅
|
|
1020
|
+
- [x] Add to configuration: `env.ENABLE_FACT_VALIDATION`
|
|
1021
|
+
- [x] Test environment variable passing
|
|
1022
|
+
- **File:** `examples/fact-validator.yaml`
|
|
1023
|
+
|
|
1024
|
+
**Total Phase 3: COMPLETED** ✅
|
|
1025
|
+
|
|
1026
|
+
**Deliverable:** Complete working example at `examples/fact-validator.yaml` (474 lines) demonstrating:
|
|
1027
|
+
- ✅ Memory-based retry tracking
|
|
1028
|
+
- ✅ Conditional assistant prompts with validation context
|
|
1029
|
+
- ✅ forEach fact extraction
|
|
1030
|
+
- ✅ Parallel fact validation (N iterations)
|
|
1031
|
+
- ✅ `on_finish` aggregation with `run` and `goto_js`
|
|
1032
|
+
- ✅ Smart routing based on validation results
|
|
1033
|
+
- ✅ Multiple posting paths (verified/unverified/direct)
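
The posting paths listed above follow from the `if` conditions in tasks 3.7 to 3.9. A compact sketch of that decision, assuming a retry is attempted when validation fails on the first attempt (the retry branch is inferred from the retry tests and is not spelled out as a condition in this plan):

```typescript
type PostingPath = "direct" | "verified" | "unverified" | "retry";

// validationEnabled mirrors ENABLE_FACT_VALIDATION === 'true'.
function choosePostingPath(
  validationEnabled: boolean,
  allFactsValid: boolean,
  attempt: number
): PostingPath {
  if (!validationEnabled) return "direct";   // post-direct-response
  if (allFactsValid) return "verified";      // post-verified-response
  // post-unverified-warning requires all_facts_valid === false && attempt >= 1
  return attempt >= 1 ? "unverified" : "retry";
}
```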
|
|
1034
|
+
|
|
1035
|
+
**Test Command:** `./dist/index.js --config examples/fact-validator.yaml --event issue_opened --cli`
|
|
1036
|
+
|
|
1037
|
+
---
|
|
1038
|
+
|
|
1039
|
+
### ✅ PHASE 4: Documentation & Examples (COMPLETED)
|
|
1040
|
+
|
|
1041
|
+
**Priority: P2 (MEDIUM) - Status: ✅ DONE**
|
|
1042
|
+
|
|
1043
|
+
- [x] **4.1** Update `docs/failure-routing.md` ✅
|
|
1044
|
+
- [x] Add `on_finish` section (260 lines)
|
|
1045
|
+
- [x] Explain when it triggers
|
|
1046
|
+
- [x] Show examples with forEach
|
|
1047
|
+
- [x] Document context available
|
|
1048
|
+
- **File:** `docs/failure-routing.md` (lines 160-420)
|
|
1049
|
+
|
|
1050
|
+
- [x] **4.2** Update `docs/dependencies.md` ✅
|
|
1051
|
+
- [x] Explain `on_finish` with forEach propagation
|
|
1052
|
+
- [x] Show dependency chain examples
|
|
1053
|
+
- [x] Document best practices
|
|
1054
|
+
- **File:** `docs/dependencies.md` (lines 83-315)
|
|
1055
|
+
|
|
1056
|
+
- [x] **4.3** Complete working example ✅
|
|
1057
|
+
- [x] Created `examples/fact-validator.yaml` (474 lines)
|
|
1058
|
+
- [x] Added comprehensive comments
|
|
1059
|
+
- [x] Tested and verified working
|
|
1060
|
+
- **File:** `examples/fact-validator.yaml`
|
|
1061
|
+
|
|
1062
|
+
- [x] **4.4** Update `docs/foreach-dependency-propagation.md` ✅
|
|
1063
|
+
- [x] Add `on_finish` section (372 lines)
|
|
1064
|
+
- [x] Show how it fits in propagation
|
|
1065
|
+
- [x] Updated with complete lifecycle
|
|
1066
|
+
- **File:** `docs/foreach-dependency-propagation.md` (lines 69-441)
|
|
1067
|
+
|
|
1068
|
+
**Total Phase 4: COMPLETED** ✅
|
|
1069
|
+
|
|
1070
|
+
**Deliverables:**
|
|
1071
|
+
- 864 lines of documentation added across 3 core docs
|
|
1072
|
+
- Complete working example with inline documentation
|
|
1073
|
+
- Cross-references and best practices throughout
|
|
1074
|
+
- Common pitfalls and debugging tips included
|
|
1075
|
+
|
|
1076
|
+
---
|
|
1077
|
+
|
|
1078
|
+
### 🚧 PHASE 5: Testing & Refinement (PARTIALLY COMPLETE)
|
|
1079
|
+
|
|
1080
|
+
**Priority: P1 (HIGH) - Status: 🚧 50% DONE**
|
|
1081
|
+
|
|
1082
|
+
- [x] **5.1** E2E test: Full fact validation flow ✅
|
|
1083
|
+
- [x] Test with real issue scenarios (3 tests)
|
|
1084
|
+
- [x] Verify extraction works
|
|
1085
|
+
- [x] Verify validation works
|
|
1086
|
+
- [x] Verify retry works
|
|
1087
|
+
- [x] Verify posting works
|
|
1088
|
+
- **File:** `tests/e2e/foreach-on-finish.test.ts`
|
|
1089
|
+
|
|
1090
|
+
- [x] **5.2** E2E test: Retry flow ✅
|
|
1091
|
+
- [x] Create issue with invalid facts (3 tests)
|
|
1092
|
+
- [x] Verify assistant retries
|
|
1093
|
+
- [x] Verify correction context passed
|
|
1094
|
+
- [x] Verify max attempts enforced
|
|
1095
|
+
- **File:** `tests/e2e/foreach-on-finish.test.ts`
|
|
1096
|
+
|
|
1097
|
+
- [x] **5.3** E2E test: Empty facts ✅
|
|
1098
|
+
- [x] Test response with no facts (3 tests)
|
|
1099
|
+
- [x] Verify direct posting
|
|
1100
|
+
- [x] Verify no validation runs
|
|
1101
|
+
- **File:** `tests/e2e/foreach-on-finish.test.ts`
|
|
1102
|
+
|
|
1103
|
+
- [x] **5.4** E2E test: Validation disabled ✅
|
|
1104
|
+
- [x] Set `ENABLE_FACT_VALIDATION=false` (2 tests)
|
|
1105
|
+
- [x] Verify direct posting
|
|
1106
|
+
- [x] Verify no validation runs
|
|
1107
|
+
- **File:** `tests/e2e/foreach-on-finish.test.ts`
|
|
1108
|
+
- **Result:** 20 tests passing (11 new + 9 existing)
|
|
1109
|
+
|
|
1110
|
+
- [ ] **5.5** Prompt tuning: Fact extraction
|
|
1111
|
+
- [ ] Test with various response types
|
|
1112
|
+
- [ ] Tune categories
|
|
1113
|
+
- [ ] Reduce false positives
|
|
1114
|
+
- [ ] Test edge cases
|
|
1115
|
+
- **Estimated effort:** 3-4 hours
|
|
1116
|
+
|
|
1117
|
+
- [ ] **5.6** Prompt tuning: Fact validation
|
|
1118
|
+
- [ ] Tune MCP tool usage
|
|
1119
|
+
- [ ] Improve evidence quality
|
|
1120
|
+
- [ ] Reduce false negatives
|
|
1121
|
+
- [ ] Test confidence thresholds
|
|
1122
|
+
- **Estimated effort:** 3-4 hours
|
|
1123
|
+
|
|
1124
|
+
- [ ] **5.7** Performance testing
|
|
1125
|
+
- [ ] Measure latency added
|
|
1126
|
+
- [ ] Identify bottlenecks
|
|
1127
|
+
- [ ] Optimize slow operations
|
|
1128
|
+
- [ ] Test with concurrent requests
|
|
1129
|
+
- **Estimated effort:** 2-3 hours
|
|
1130
|
+
|
|
1131
|
+
- [ ] **5.8** Integration testing
|
|
1132
|
+
- [ ] Test with real repository
|
|
1133
|
+
- [ ] Test with real issues/comments
|
|
1134
|
+
- [ ] Monitor API quota usage
|
|
1135
|
+
- [ ] Collect metrics
|
|
1136
|
+
- **Estimated effort:** 2-3 hours
|
|
1137
|
+
|
|
1138
|
+
**Total Phase 5 Effort:** 16-22 hours
|
|
1139
|
+
|
|
1140
|
+
---
|
|
1141
|
+
|
|
1142
|
+
### 🚀 PHASE 6: Deployment & Monitoring (NOT STARTED)
|
|
1143
|
+
|
|
1144
|
+
**Priority: P3 (LOW) - Status: ⏳ 0% DONE**
|
|
1145
|
+
|
|
1146
|
+
- [ ] **6.1** Set up staging environment
|
|
1147
|
+
- [ ] Deploy to staging
|
|
1148
|
+
- [ ] Enable fact validation
|
|
1149
|
+
- [ ] Monitor for errors
|
|
1150
|
+
- **Estimated effort:** 1 hour
|
|
1151
|
+
|
|
1152
|
+
- [ ] **6.2** Collect baseline metrics
|
|
1153
|
+
- [ ] Accuracy rate
|
|
1154
|
+
- [ ] Validation coverage
|
|
1155
|
+
- [ ] Retry rate
|
|
1156
|
+
- [ ] Performance impact
|
|
1157
|
+
- **Estimated effort:** 2-3 hours
|
|
1158
|
+
|
|
1159
|
+
- [ ] **6.3** Set up monitoring alerts
|
|
1160
|
+
- [ ] High error rate
|
|
1161
|
+
- [ ] High latency
|
|
1162
|
+
- [ ] API quota warnings
|
|
1163
|
+
- [ ] Infinite loop detection
|
|
1164
|
+
- **Estimated effort:** 1-2 hours
|
|
1165
|
+
|
|
1166
|
+
- [ ] **6.4** Production rollout
|
|
1167
|
+
- [ ] Gradual rollout (10% → 50% → 100%)
|
|
1168
|
+
- [ ] Monitor metrics
|
|
1169
|
+
- [ ] Adjust configuration as needed
|
|
1170
|
+
- **Estimated effort:** 2-3 hours
|
|
1171
|
+
|
|
1172
|
+
- [ ] **6.5** Post-deployment monitoring
|
|
1173
|
+
- [ ] Track success metrics
|
|
1174
|
+
- [ ] Collect user feedback
|
|
1175
|
+
- [ ] Identify improvement areas
|
|
1176
|
+
- **Estimated effort:** Ongoing
|
|
1177
|
+
|
|
1178
|
+
**Total Phase 6 Effort:** 6-9 hours
|
|
1179
|
+
|
|
1180
|
+
---
|
|
1181
|
+
|
|
1182
|
+
## 📊 OVERALL PROGRESS SUMMARY
|
|
1183
|
+
|
|
1184
|
+
| Phase | Priority | Status | Progress | Effort Remaining |
|
|
1185
|
+
|-------|----------|--------|----------|------------------|
|
|
1186
|
+
| **Phase 1: Infrastructure** | P0 | ✅ Done | 100% | 0 hours |
|
|
1187
|
+
| **Phase 2: Execution** | P0 | ✅ Done | 100% | 0 hours |
|
|
1188
|
+
| **Phase 3: Configuration** | P1 | ✅ Done | 100% | 0 hours |
|
|
1189
|
+
| **Phase 4: Documentation** | P2 | ✅ Done | 100% | 0 hours |
|
|
1190
|
+
| **Phase 5: Testing** | P1 | 🚧 Partial | 50% | 8-11 hours |
|
|
1191
|
+
| **Phase 6: Deployment** | P3 | ⏳ Not Started | 0% | 6-9 hours |
|
|
1192
|
+
| **TOTAL** | - | - | **75%** | **14-20 hours** |
|
|
1193
|
+
|
|
1194
|
+
---
|
|
1195
|
+
|
|
1196
|
+
## 🎯 IMPLEMENTATION PROGRESS
|
|
1197
|
+
|
|
1198
|
+
### ✅ Week 1: Complete Execution Layer (P0) - DONE
|
|
1199
|
+
|
|
1200
|
+
1. ✅ **Task 2.2** - Implement `executeCheckInline()` (BLOCKER) - COMPLETED
|
|
1201
|
+
2. ✅ **Task 2.3** - Implement `on_finish.run` execution - COMPLETED
|
|
1202
|
+
3. ✅ **Task 2.5** - Implement `on_finish.goto_js` evaluation - COMPLETED
|
|
1203
|
+
4. ✅ **Task 2.9** - Write execution tests - COMPLETED
|
|
1204
|
+
5. ✅ Verify all tests passing, no regressions - COMPLETED (1438 tests passing)
|
|
1205
|
+
|
|
1206
|
+
**Deliverable:** ✅ Fully functional `on_finish` hook with `run` and `goto_js` execution
|
|
1207
|
+
|
|
1208
|
+
### ✅ Week 2: Build Fact Validator (P1) - DONE
|
|
1209
|
+
|
|
1210
|
+
6. ✅ **Tasks 3.1-3.3** - Memory init + assistant prompt updates - COMPLETED
|
|
1211
|
+
7. ✅ **Tasks 3.4-3.5** - Extract and validate fact checks - COMPLETED
|
|
1212
|
+
8. ✅ **Task 3.6** - Aggregation check - COMPLETED
|
|
1213
|
+
9. ✅ **Tasks 3.7-3.10** - Response posting + env config - COMPLETED
|
|
1214
|
+
10. ✅ Manual testing of full flow - COMPLETED
|
|
1215
|
+
|
|
1216
|
+
**Deliverable:** ✅ Complete fact validator configuration at `examples/fact-validator.yaml`
|
|
1217
|
+
|
|
1218
|
+
### 🔄 Week 3: Test & Document (P1-P2) - IN PROGRESS
|
|
1219
|
+
|
|
1220
|
+
11. ✅ **Phase 5.1-5.4** - E2E tests (additional scenarios) - COMPLETED
|
|
1221
|
+
12. ⏳ **Phase 5.5-5.6** - Prompt tuning (for production AI usage)
|
|
1222
|
+
13. ✅ **Phase 4.1-4.4** - Core documentation - COMPLETED
|
|
1223
|
+
14. ⏳ **Phase 4.5-4.6** - User guides
|
|
1224
|
+
|
|
1225
|
+
**Deliverable:** Tested and documented fact validator
|
|
1226
|
+
|
|
1227
|
+
---
|
|
1228
|
+
|
|
1229
|
+
## ✅ CRITICAL BLOCKERS - RESOLVED
|
|
1230
|
+
|
|
1231
|
+
1. ✅ **Task 2.2: `executeCheckInline()` method** - COMPLETED
|
|
1232
|
+
2. ✅ **Tasks 2.3 + 2.5: `run` and `goto_js`** - COMPLETED
|
|
1233
|
+
3. ✅ **Phase 2 completion** - COMPLETED
|
|
1234
|
+
|
|
1235
|
+
**Status:** All critical blockers have been resolved. The core infrastructure is complete and tested.
|