@sanity/ailf 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +89 -0
- package/bin/ailf.js +64 -0
- package/canonical/grader-references/README.md +88 -0
- package/canonical/grader-references/groq.yaml +234 -0
- package/canonical/grader-references/studio-setup.yaml +275 -0
- package/canonical/reference-solutions/.gitkeep +1 -0
- package/canonical/reference-solutions/frameworks/nuxt.ts +119 -0
- package/canonical/reference-solutions/frameworks/remix.tsx +100 -0
- package/canonical/reference-solutions/functions/publish-webhook.ts +60 -0
- package/canonical/reference-solutions/groq/advanced-filtering.ts +379 -0
- package/canonical/reference-solutions/groq/blog-queries.ts +137 -0
- package/canonical/reference-solutions/groq/joins-references.ts +300 -0
- package/canonical/reference-solutions/nextjs/app-router-integration.tsx +128 -0
- package/canonical/reference-solutions/studio-setup/blog-schema.ts +143 -0
- package/canonical/reference-solutions/studio-setup/custom-tool.tsx +78 -0
- package/canonical/reference-solutions/visual-editing/live-preview.tsx +137 -0
- package/canonical/reference-solutions/visual-editing/presentation-nextjs.tsx +130 -0
- package/config/airbyte/ai_literacy_framework.connector.yaml +639 -0
- package/config/bigquery/README.md +74 -0
- package/config/bigquery/views/area_scores.sql +87 -0
- package/config/bigquery/views/reports.sql +49 -0
- package/config/features.yaml +116 -0
- package/config/models.yaml +115 -0
- package/config/prompts.yaml +75 -0
- package/config/rubrics.yaml +62 -0
- package/config/schedules.yaml +43 -0
- package/config/sinks.yaml +54 -0
- package/config/sources.yaml +51 -0
- package/config/thresholds.yaml +49 -0
- package/dist/_vendor/ailf-core/examples/index.d.ts +190 -0
- package/dist/_vendor/ailf-core/examples/index.js +285 -0
- package/dist/_vendor/ailf-core/index.d.ts +17 -0
- package/dist/_vendor/ailf-core/index.js +17 -0
- package/dist/_vendor/ailf-core/ports/cache-store.d.ts +72 -0
- package/dist/_vendor/ailf-core/ports/cache-store.js +17 -0
- package/dist/_vendor/ailf-core/ports/config-source.d.ts +33 -0
- package/dist/_vendor/ailf-core/ports/config-source.js +15 -0
- package/dist/_vendor/ailf-core/ports/context.d.ts +172 -0
- package/dist/_vendor/ailf-core/ports/context.js +14 -0
- package/dist/_vendor/ailf-core/ports/doc-fetcher.d.ts +131 -0
- package/dist/_vendor/ailf-core/ports/doc-fetcher.js +12 -0
- package/dist/_vendor/ailf-core/ports/eval-runner.d.ts +24 -0
- package/dist/_vendor/ailf-core/ports/eval-runner.js +8 -0
- package/dist/_vendor/ailf-core/ports/index.d.ts +15 -0
- package/dist/_vendor/ailf-core/ports/index.js +7 -0
- package/dist/_vendor/ailf-core/ports/logger.d.ts +36 -0
- package/dist/_vendor/ailf-core/ports/logger.js +11 -0
- package/dist/_vendor/ailf-core/ports/pipeline-step.d.ts +46 -0
- package/dist/_vendor/ailf-core/ports/pipeline-step.js +8 -0
- package/dist/_vendor/ailf-core/ports/task-source.d.ts +159 -0
- package/dist/_vendor/ailf-core/ports/task-source.js +72 -0
- package/dist/_vendor/ailf-core/schemas/callback-payload.d.ts +24 -0
- package/dist/_vendor/ailf-core/schemas/callback-payload.js +29 -0
- package/dist/_vendor/ailf-core/schemas/eval-config.d.ts +55 -0
- package/dist/_vendor/ailf-core/schemas/eval-config.js +78 -0
- package/dist/_vendor/ailf-core/schemas/index.d.ts +16 -0
- package/dist/_vendor/ailf-core/schemas/index.js +16 -0
- package/dist/_vendor/ailf-core/schemas/pipeline-request.d.ts +125 -0
- package/dist/_vendor/ailf-core/schemas/pipeline-request.js +67 -0
- package/dist/_vendor/ailf-core/schemas/pipeline.d.ts +531 -0
- package/dist/_vendor/ailf-core/schemas/pipeline.js +318 -0
- package/dist/_vendor/ailf-core/schemas/schedules.d.ts +68 -0
- package/dist/_vendor/ailf-core/schemas/schedules.js +74 -0
- package/dist/_vendor/ailf-core/schemas/sinks.d.ts +207 -0
- package/dist/_vendor/ailf-core/schemas/sinks.js +108 -0
- package/dist/_vendor/ailf-core/services/comparison-formatters.d.ts +18 -0
- package/dist/_vendor/ailf-core/services/comparison-formatters.js +189 -0
- package/dist/_vendor/ailf-core/services/config-helpers.d.ts +41 -0
- package/dist/_vendor/ailf-core/services/config-helpers.js +86 -0
- package/dist/_vendor/ailf-core/services/index.d.ts +12 -0
- package/dist/_vendor/ailf-core/services/index.js +12 -0
- package/dist/_vendor/ailf-core/services/scoring.d.ts +49 -0
- package/dist/_vendor/ailf-core/services/scoring.js +222 -0
- package/dist/_vendor/ailf-core/types/index.d.ts +1082 -0
- package/dist/_vendor/ailf-core/types/index.js +21 -0
- package/dist/_vendor/ailf-core/types/scoring-input.d.ts +54 -0
- package/dist/_vendor/ailf-core/types/scoring-input.js +9 -0
- package/dist/_vendor/ailf-shared/dimension-names.d.ts +21 -0
- package/dist/_vendor/ailf-shared/dimension-names.js +27 -0
- package/dist/_vendor/ailf-shared/document-ref.d.ts +29 -0
- package/dist/_vendor/ailf-shared/document-ref.js +1 -0
- package/dist/_vendor/ailf-shared/eval-modes.d.ts +12 -0
- package/dist/_vendor/ailf-shared/eval-modes.js +8 -0
- package/dist/_vendor/ailf-shared/index.d.ts +16 -0
- package/dist/_vendor/ailf-shared/index.js +16 -0
- package/dist/_vendor/ailf-shared/noise-threshold.d.ts +9 -0
- package/dist/_vendor/ailf-shared/noise-threshold.js +9 -0
- package/dist/_vendor/ailf-shared/score-grades.d.ts +17 -0
- package/dist/_vendor/ailf-shared/score-grades.js +23 -0
- package/dist/adapters/cache/content-lake-cache.d.ts +24 -0
- package/dist/adapters/cache/content-lake-cache.js +59 -0
- package/dist/adapters/cache/filesystem-cache.d.ts +18 -0
- package/dist/adapters/cache/filesystem-cache.js +54 -0
- package/dist/adapters/cache/index.d.ts +2 -0
- package/dist/adapters/cache/index.js +2 -0
- package/dist/adapters/config-sources/cli-config-adapter.d.ts +17 -0
- package/dist/adapters/config-sources/cli-config-adapter.js +23 -0
- package/dist/adapters/config-sources/file-config-adapter.d.ts +26 -0
- package/dist/adapters/config-sources/file-config-adapter.js +96 -0
- package/dist/adapters/config-sources/index.d.ts +2 -0
- package/dist/adapters/config-sources/index.js +2 -0
- package/dist/adapters/doc-fetchers/index.d.ts +1 -0
- package/dist/adapters/doc-fetchers/index.js +1 -0
- package/dist/adapters/doc-fetchers/sanity-doc-fetcher.d.ts +76 -0
- package/dist/adapters/doc-fetchers/sanity-doc-fetcher.js +620 -0
- package/dist/adapters/eval-runners/index.d.ts +1 -0
- package/dist/adapters/eval-runners/index.js +1 -0
- package/dist/adapters/eval-runners/promptfoo-eval-adapter.d.ts +14 -0
- package/dist/adapters/eval-runners/promptfoo-eval-adapter.js +63 -0
- package/dist/adapters/index.d.ts +12 -0
- package/dist/adapters/index.js +12 -0
- package/dist/adapters/loggers/console-logger.d.ts +22 -0
- package/dist/adapters/loggers/console-logger.js +54 -0
- package/dist/adapters/loggers/index.d.ts +9 -0
- package/dist/adapters/loggers/index.js +9 -0
- package/dist/adapters/loggers/json-logger.d.ts +18 -0
- package/dist/adapters/loggers/json-logger.js +33 -0
- package/dist/adapters/loggers/quiet-logger.d.ts +16 -0
- package/dist/adapters/loggers/quiet-logger.js +30 -0
- package/dist/adapters/task-sources/composite-task-source.d.ts +20 -0
- package/dist/adapters/task-sources/composite-task-source.js +59 -0
- package/dist/adapters/task-sources/content-lake-task-source.d.ts +20 -0
- package/dist/adapters/task-sources/content-lake-task-source.js +219 -0
- package/dist/adapters/task-sources/index.d.ts +7 -0
- package/dist/adapters/task-sources/index.js +7 -0
- package/dist/adapters/task-sources/repo-schemas.d.ts +245 -0
- package/dist/adapters/task-sources/repo-schemas.js +234 -0
- package/dist/adapters/task-sources/repo-task-source.d.ts +22 -0
- package/dist/adapters/task-sources/repo-task-source.js +104 -0
- package/dist/adapters/task-sources/repo-trigger.d.ts +52 -0
- package/dist/adapters/task-sources/repo-trigger.js +153 -0
- package/dist/adapters/task-sources/repo-validation.d.ts +49 -0
- package/dist/adapters/task-sources/repo-validation.js +164 -0
- package/dist/adapters/task-sources/yaml-task-source.d.ts +18 -0
- package/dist/adapters/task-sources/yaml-task-source.js +136 -0
- package/dist/agent-observer/agentic-provider.d.ts +132 -0
- package/dist/agent-observer/agentic-provider.js +983 -0
- package/dist/agent-observer/classifier.d.ts +62 -0
- package/dist/agent-observer/classifier.js +269 -0
- package/dist/agent-observer/index.d.ts +7 -0
- package/dist/agent-observer/index.js +4 -0
- package/dist/agent-observer/pricing.d.ts +35 -0
- package/dist/agent-observer/pricing.js +82 -0
- package/dist/agent-observer/provider.d.ts +77 -0
- package/dist/agent-observer/provider.js +151 -0
- package/dist/agent-observer/proxy.d.ts +91 -0
- package/dist/agent-observer/proxy.js +321 -0
- package/dist/agent-observer/test-imports.d.ts +7 -0
- package/dist/agent-observer/test-imports.js +185 -0
- package/dist/agent-observer/types.d.ts +137 -0
- package/dist/agent-observer/types.js +16 -0
- package/dist/assertions/source-isolation.d.ts +72 -0
- package/dist/assertions/source-isolation.js +117 -0
- package/dist/cli.d.ts +24 -0
- package/dist/cli.js +199 -0
- package/dist/commands/agent-report.d.ts +5 -0
- package/dist/commands/agent-report.js +69 -0
- package/dist/commands/baseline.d.ts +9 -0
- package/dist/commands/baseline.js +141 -0
- package/dist/commands/cache.d.ts +13 -0
- package/dist/commands/cache.js +135 -0
- package/dist/commands/calculate-scores.d.ts +8 -0
- package/dist/commands/calculate-scores.js +48 -0
- package/dist/commands/compare.d.ts +8 -0
- package/dist/commands/compare.js +120 -0
- package/dist/commands/completion.d.ts +18 -0
- package/dist/commands/completion.js +260 -0
- package/dist/commands/coverage-audit.d.ts +7 -0
- package/dist/commands/coverage-audit.js +40 -0
- package/dist/commands/discovery-report.d.ts +10 -0
- package/dist/commands/discovery-report.js +44 -0
- package/dist/commands/eval.d.ts +9 -0
- package/dist/commands/eval.js +35 -0
- package/dist/commands/explain-handler.d.ts +34 -0
- package/dist/commands/explain-handler.js +719 -0
- package/dist/commands/fetch-docs.d.ts +8 -0
- package/dist/commands/fetch-docs.js +128 -0
- package/dist/commands/generate-configs.d.ts +8 -0
- package/dist/commands/generate-configs.js +46 -0
- package/dist/commands/grader/index.d.ts +11 -0
- package/dist/commands/grader/index.js +118 -0
- package/dist/commands/init.d.ts +19 -0
- package/dist/commands/init.js +150 -0
- package/dist/commands/interactive.d.ts +12 -0
- package/dist/commands/interactive.js +238 -0
- package/dist/commands/lookup-doc.d.ts +15 -0
- package/dist/commands/lookup-doc.js +84 -0
- package/dist/commands/measure-retrieval.d.ts +5 -0
- package/dist/commands/measure-retrieval.js +65 -0
- package/dist/commands/pipeline-action.d.ts +71 -0
- package/dist/commands/pipeline-action.js +305 -0
- package/dist/commands/pipeline.d.ts +62 -0
- package/dist/commands/pipeline.js +53 -0
- package/dist/commands/pr-comment.d.ts +8 -0
- package/dist/commands/pr-comment.js +47 -0
- package/dist/commands/publish.d.ts +26 -0
- package/dist/commands/publish.js +253 -0
- package/dist/commands/readiness-report.d.ts +10 -0
- package/dist/commands/readiness-report.js +104 -0
- package/dist/commands/shared/options.d.ts +29 -0
- package/dist/commands/shared/options.js +57 -0
- package/dist/commands/update-quality-scores.d.ts +5 -0
- package/dist/commands/update-quality-scores.js +20 -0
- package/dist/commands/validate-tasks.d.ts +16 -0
- package/dist/commands/validate-tasks.js +93 -0
- package/dist/commands/validate.d.ts +9 -0
- package/dist/commands/validate.js +73 -0
- package/dist/commands/webhook-server.d.ts +5 -0
- package/dist/commands/webhook-server.js +30 -0
- package/dist/commands/weekly-digest.d.ts +10 -0
- package/dist/commands/weekly-digest.js +104 -0
- package/dist/composition-root.d.ts +26 -0
- package/dist/composition-root.js +107 -0
- package/dist/interpolate.d.ts +26 -0
- package/dist/interpolate.js +70 -0
- package/dist/job-store.d.ts +104 -0
- package/dist/job-store.js +188 -0
- package/dist/lib/agent-behavior-report.d.ts +8 -0
- package/dist/lib/agent-behavior-report.js +185 -0
- package/dist/lib/baseline.d.ts +19 -0
- package/dist/lib/baseline.js +153 -0
- package/dist/lib/calculate-scores.d.ts +23 -0
- package/dist/lib/calculate-scores.js +42 -0
- package/dist/lib/compare.d.ts +18 -0
- package/dist/lib/compare.js +170 -0
- package/dist/lib/coverage-audit.d.ts +4 -0
- package/dist/lib/coverage-audit.js +42 -0
- package/dist/lib/discovery-report.d.ts +13 -0
- package/dist/lib/discovery-report.js +57 -0
- package/dist/lib/fetch-docs.d.ts +30 -0
- package/dist/lib/fetch-docs.js +171 -0
- package/dist/lib/generate-configs.d.ts +25 -0
- package/dist/lib/generate-configs.js +42 -0
- package/dist/lib/grader-api.d.ts +21 -0
- package/dist/lib/grader-api.js +34 -0
- package/dist/lib/grader-compare.d.ts +19 -0
- package/dist/lib/grader-compare.js +91 -0
- package/dist/lib/grader-consistency.d.ts +27 -0
- package/dist/lib/grader-consistency.js +79 -0
- package/dist/lib/grader-sensitivity.d.ts +19 -0
- package/dist/lib/grader-sensitivity.js +75 -0
- package/dist/lib/grader-validate.d.ts +19 -0
- package/dist/lib/grader-validate.js +78 -0
- package/dist/lib/measure-retrieval.d.ts +14 -0
- package/dist/lib/measure-retrieval.js +71 -0
- package/dist/lib/pr-comment.d.ts +16 -0
- package/dist/lib/pr-comment.js +28 -0
- package/dist/lib/readiness-report.d.ts +13 -0
- package/dist/lib/readiness-report.js +108 -0
- package/dist/lib/webhook-server.d.ts +11 -0
- package/dist/lib/webhook-server.js +24 -0
- package/dist/lib/weekly-digest.d.ts +24 -0
- package/dist/lib/weekly-digest.js +148 -0
- package/dist/orchestration/build-app-context.d.ts +27 -0
- package/dist/orchestration/build-app-context.js +81 -0
- package/dist/orchestration/build-step-sequence.d.ts +15 -0
- package/dist/orchestration/build-step-sequence.js +84 -0
- package/dist/orchestration/config-to-source-overrides.d.ts +9 -0
- package/dist/orchestration/config-to-source-overrides.js +28 -0
- package/dist/orchestration/env-bridge.d.ts +21 -0
- package/dist/orchestration/env-bridge.js +66 -0
- package/dist/orchestration/index.d.ts +11 -0
- package/dist/orchestration/index.js +11 -0
- package/dist/orchestration/pipeline-orchestrator.d.ts +24 -0
- package/dist/orchestration/pipeline-orchestrator.js +153 -0
- package/dist/orchestration/step-runner.d.ts +20 -0
- package/dist/orchestration/step-runner.js +88 -0
- package/dist/orchestration/steps/calculate-scores-step.d.ts +13 -0
- package/dist/orchestration/steps/calculate-scores-step.js +95 -0
- package/dist/orchestration/steps/callback-step.d.ts +24 -0
- package/dist/orchestration/steps/callback-step.js +76 -0
- package/dist/orchestration/steps/compare-step.d.ts +14 -0
- package/dist/orchestration/steps/compare-step.js +92 -0
- package/dist/orchestration/steps/discovery-report-step.d.ts +13 -0
- package/dist/orchestration/steps/discovery-report-step.js +55 -0
- package/dist/orchestration/steps/fetch-docs-shell.d.ts +17 -0
- package/dist/orchestration/steps/fetch-docs-shell.js +30 -0
- package/dist/orchestration/steps/fetch-docs-step.d.ts +14 -0
- package/dist/orchestration/steps/fetch-docs-step.js +135 -0
- package/dist/orchestration/steps/gap-analysis-step.d.ts +16 -0
- package/dist/orchestration/steps/gap-analysis-step.js +136 -0
- package/dist/orchestration/steps/generate-configs-step.d.ts +14 -0
- package/dist/orchestration/steps/generate-configs-step.js +85 -0
- package/dist/orchestration/steps/grader-consistency-step.d.ts +13 -0
- package/dist/orchestration/steps/grader-consistency-step.js +64 -0
- package/dist/orchestration/steps/index.d.ts +19 -0
- package/dist/orchestration/steps/index.js +19 -0
- package/dist/orchestration/steps/mirror-repo-tasks-step.d.ts +21 -0
- package/dist/orchestration/steps/mirror-repo-tasks-step.js +94 -0
- package/dist/orchestration/steps/publish-report-step.d.ts +26 -0
- package/dist/orchestration/steps/publish-report-step.js +216 -0
- package/dist/orchestration/steps/readiness-step.d.ts +13 -0
- package/dist/orchestration/steps/readiness-step.js +91 -0
- package/dist/orchestration/steps/report-step.d.ts +12 -0
- package/dist/orchestration/steps/report-step.js +49 -0
- package/dist/orchestration/steps/run-eval-step.d.ts +17 -0
- package/dist/orchestration/steps/run-eval-step.js +195 -0
- package/dist/orchestration/steps/validate-step.d.ts +12 -0
- package/dist/orchestration/steps/validate-step.js +41 -0
- package/dist/pipeline/agent-behavior-report.d.ts +53 -0
- package/dist/pipeline/agent-behavior-report.js +132 -0
- package/dist/pipeline/attribution.d.ts +47 -0
- package/dist/pipeline/attribution.js +226 -0
- package/dist/pipeline/baseline.d.ts +37 -0
- package/dist/pipeline/baseline.js +141 -0
- package/dist/pipeline/cache.d.ts +101 -0
- package/dist/pipeline/cache.js +283 -0
- package/dist/pipeline/calculate-scores.d.ts +102 -0
- package/dist/pipeline/calculate-scores.js +1128 -0
- package/dist/pipeline/callback-delivery.d.ts +50 -0
- package/dist/pipeline/callback-delivery.js +89 -0
- package/dist/pipeline/checks.d.ts +39 -0
- package/dist/pipeline/checks.js +280 -0
- package/dist/pipeline/classify-url.d.ts +61 -0
- package/dist/pipeline/classify-url.js +93 -0
- package/dist/pipeline/compare.d.ts +31 -0
- package/dist/pipeline/compare.js +208 -0
- package/dist/pipeline/coverage-audit.d.ts +39 -0
- package/dist/pipeline/coverage-audit.js +165 -0
- package/dist/pipeline/degradations.d.ts +85 -0
- package/dist/pipeline/degradations.js +242 -0
- package/dist/pipeline/discovery-report.d.ts +55 -0
- package/dist/pipeline/discovery-report.js +178 -0
- package/dist/pipeline/eval-constants.d.ts +68 -0
- package/dist/pipeline/eval-constants.js +111 -0
- package/dist/pipeline/eval-fingerprint.d.ts +66 -0
- package/dist/pipeline/eval-fingerprint.js +175 -0
- package/dist/pipeline/expand-tasks.d.ts +220 -0
- package/dist/pipeline/expand-tasks.js +421 -0
- package/dist/pipeline/failure-modes.d.ts +46 -0
- package/dist/pipeline/failure-modes.js +348 -0
- package/dist/pipeline/fetch-url-content.d.ts +44 -0
- package/dist/pipeline/fetch-url-content.js +93 -0
- package/dist/pipeline/gap-analysis.d.ts +48 -0
- package/dist/pipeline/gap-analysis.js +231 -0
- package/dist/pipeline/generate-configs.d.ts +72 -0
- package/dist/pipeline/generate-configs.js +395 -0
- package/dist/pipeline/grader-api.d.ts +49 -0
- package/dist/pipeline/grader-api.js +200 -0
- package/dist/pipeline/grader-compare-runner.d.ts +44 -0
- package/dist/pipeline/grader-compare-runner.js +301 -0
- package/dist/pipeline/grader-comparison.d.ts +111 -0
- package/dist/pipeline/grader-comparison.js +161 -0
- package/dist/pipeline/grader-consistency-runner.d.ts +60 -0
- package/dist/pipeline/grader-consistency-runner.js +270 -0
- package/dist/pipeline/grader-consistency.d.ts +103 -0
- package/dist/pipeline/grader-consistency.js +146 -0
- package/dist/pipeline/grader-sensitivity-runner.d.ts +40 -0
- package/dist/pipeline/grader-sensitivity-runner.js +282 -0
- package/dist/pipeline/grader-sensitivity.d.ts +94 -0
- package/dist/pipeline/grader-sensitivity.js +144 -0
- package/dist/pipeline/grader-validate-runner.d.ts +38 -0
- package/dist/pipeline/grader-validate-runner.js +229 -0
- package/dist/pipeline/grader-validation.d.ts +107 -0
- package/dist/pipeline/grader-validation.js +169 -0
- package/dist/pipeline/map-request-to-config.d.ts +19 -0
- package/dist/pipeline/map-request-to-config.js +80 -0
- package/dist/pipeline/measure-retrieval.d.ts +59 -0
- package/dist/pipeline/measure-retrieval.js +111 -0
- package/dist/pipeline/mirror-repo-tasks.d.ts +86 -0
- package/dist/pipeline/mirror-repo-tasks.js +350 -0
- package/dist/pipeline/plan-format.d.ts +33 -0
- package/dist/pipeline/plan-format.js +202 -0
- package/dist/pipeline/plan.d.ts +169 -0
- package/dist/pipeline/plan.js +708 -0
- package/dist/pipeline/pr-comment.d.ts +19 -0
- package/dist/pipeline/pr-comment.js +502 -0
- package/dist/pipeline/probe.d.ts +52 -0
- package/dist/pipeline/probe.js +390 -0
- package/dist/pipeline/provenance.d.ts +47 -0
- package/dist/pipeline/provenance.js +146 -0
- package/dist/pipeline/readiness-report.d.ts +87 -0
- package/dist/pipeline/readiness-report.js +205 -0
- package/dist/pipeline/release-classification.d.ts +54 -0
- package/dist/pipeline/release-classification.js +238 -0
- package/dist/pipeline/release-report.d.ts +37 -0
- package/dist/pipeline/release-report.js +222 -0
- package/dist/pipeline/repo-eval-comment.d.ts +37 -0
- package/dist/pipeline/repo-eval-comment.js +165 -0
- package/dist/pipeline/repo-threshold-evaluator.d.ts +89 -0
- package/dist/pipeline/repo-threshold-evaluator.js +162 -0
- package/dist/pipeline/resolve-mappings.d.ts +35 -0
- package/dist/pipeline/resolve-mappings.js +72 -0
- package/dist/pipeline/retrieval-metrics.d.ts +39 -0
- package/dist/pipeline/retrieval-metrics.js +136 -0
- package/dist/pipeline/reverse-mapping.d.ts +67 -0
- package/dist/pipeline/reverse-mapping.js +88 -0
- package/dist/pipeline/schemas.d.ts +9 -0
- package/dist/pipeline/schemas.js +9 -0
- package/dist/pipeline/steps/calculate-scores-step.d.ts +11 -0
- package/dist/pipeline/steps/calculate-scores-step.js +89 -0
- package/dist/pipeline/steps/compare-step.d.ts +18 -0
- package/dist/pipeline/steps/compare-step.js +90 -0
- package/dist/pipeline/steps/eval-step.d.ts +53 -0
- package/dist/pipeline/steps/eval-step.js +347 -0
- package/dist/pipeline/steps/fetch-docs-step.d.ts +11 -0
- package/dist/pipeline/steps/fetch-docs-step.js +84 -0
- package/dist/pipeline/steps/generate-configs-step.d.ts +11 -0
- package/dist/pipeline/steps/generate-configs-step.js +98 -0
- package/dist/pipeline/steps/grader-consistency-step.d.ts +21 -0
- package/dist/pipeline/steps/grader-consistency-step.js +74 -0
- package/dist/pipeline/steps/publish-report-step.d.ts +57 -0
- package/dist/pipeline/steps/publish-report-step.js +243 -0
- package/dist/pipeline/steps/report-step.d.ts +13 -0
- package/dist/pipeline/steps/report-step.js +56 -0
- package/dist/pipeline/steps/update-scores-step.d.ts +11 -0
- package/dist/pipeline/steps/update-scores-step.js +42 -0
- package/dist/pipeline/targeted-loo.d.ts +88 -0
- package/dist/pipeline/targeted-loo.js +203 -0
- package/dist/pipeline/thresholds.d.ts +27 -0
- package/dist/pipeline/thresholds.js +245 -0
- package/dist/pipeline/types.d.ts +10 -0
- package/dist/pipeline/types.js +10 -0
- package/dist/pipeline/validate.d.ts +67 -0
- package/dist/pipeline/validate.js +406 -0
- package/dist/pipeline/webhook-server.d.ts +37 -0
- package/dist/pipeline/webhook-server.js +133 -0
- package/dist/report-store.d.ts +84 -0
- package/dist/report-store.js +208 -0
- package/dist/sanity/client.d.ts +38 -0
- package/dist/sanity/client.js +86 -0
- package/dist/sanity/portable-text.d.ts +11 -0
- package/dist/sanity/portable-text.js +211 -0
- package/dist/sanity/queries.d.ts +133 -0
- package/dist/sanity/queries.js +300 -0
- package/dist/schedules/digest.d.ts +116 -0
- package/dist/schedules/digest.js +156 -0
- package/dist/schedules/index.d.ts +12 -0
- package/dist/schedules/index.js +10 -0
- package/dist/schedules/loader.d.ts +31 -0
- package/dist/schedules/loader.js +73 -0
- package/dist/schedules/schema.d.ts +9 -0
- package/dist/schedules/schema.js +9 -0
- package/dist/scripts/agent-behavior-report.d.ts +19 -0
- package/dist/scripts/agent-behavior-report.js +315 -0
- package/dist/scripts/baseline.d.ts +43 -0
- package/dist/scripts/baseline.js +267 -0
- package/dist/scripts/calculate-scores.d.ts +166 -0
- package/dist/scripts/calculate-scores.js +1296 -0
- package/dist/scripts/compare.d.ts +22 -0
- package/dist/scripts/compare.js +334 -0
- package/dist/scripts/coverage-audit.d.ts +44 -0
- package/dist/scripts/coverage-audit.js +209 -0
- package/dist/scripts/debug-eval.d.ts +19 -0
- package/dist/scripts/debug-eval.js +73 -0
- package/dist/scripts/discovery-report.d.ts +58 -0
- package/dist/scripts/discovery-report.js +250 -0
- package/dist/scripts/fetch-docs.d.ts +35 -0
- package/dist/scripts/fetch-docs.js +472 -0
- package/dist/scripts/generate-configs.d.ts +66 -0
- package/dist/scripts/generate-configs.js +459 -0
- package/dist/scripts/grader-api.d.ts +27 -0
- package/dist/scripts/grader-api.js +206 -0
- package/dist/scripts/grader-compare.d.ts +22 -0
- package/dist/scripts/grader-compare.js +368 -0
- package/dist/scripts/grader-consistency.d.ts +20 -0
- package/dist/scripts/grader-consistency.js +313 -0
- package/dist/scripts/grader-sensitivity.d.ts +22 -0
- package/dist/scripts/grader-sensitivity.js +354 -0
- package/dist/scripts/grader-validate.d.ts +19 -0
- package/dist/scripts/grader-validate.js +267 -0
- package/dist/scripts/measure-retrieval.d.ts +10 -0
- package/dist/scripts/measure-retrieval.js +145 -0
- package/dist/scripts/migrate-tasks-to-content-lake.d.ts +24 -0
- package/dist/scripts/migrate-tasks-to-content-lake.js +327 -0
- package/dist/scripts/pipeline.d.ts +76 -0
- package/dist/scripts/pipeline.js +1031 -0
- package/dist/scripts/pr-comment.d.ts +10 -0
- package/dist/scripts/pr-comment.js +510 -0
- package/dist/scripts/readiness-report.d.ts +88 -0
- package/dist/scripts/readiness-report.js +342 -0
- package/dist/scripts/update-quality-scores.d.ts +15 -0
- package/dist/scripts/update-quality-scores.js +184 -0
- package/dist/scripts/validate-task-sources.d.ts +21 -0
- package/dist/scripts/validate-task-sources.js +210 -0
- package/dist/scripts/validate.d.ts +13 -0
- package/dist/scripts/validate.js +79 -0
- package/dist/scripts/webhook-server.d.ts +26 -0
- package/dist/scripts/webhook-server.js +147 -0
- package/dist/scripts/weekly-digest.d.ts +24 -0
- package/dist/scripts/weekly-digest.js +144 -0
- package/dist/sinks/bigquery/index.d.ts +131 -0
- package/dist/sinks/bigquery/index.js +222 -0
- package/dist/sinks/format-slack.d.ts +64 -0
- package/dist/sinks/format-slack.js +306 -0
- package/dist/sinks/index.d.ts +23 -0
- package/dist/sinks/index.js +18 -0
- package/dist/sinks/loader.d.ts +18 -0
- package/dist/sinks/loader.js +82 -0
- package/dist/sinks/retry.d.ts +24 -0
- package/dist/sinks/retry.js +52 -0
- package/dist/sinks/schema.d.ts +9 -0
- package/dist/sinks/schema.js +9 -0
- package/dist/sinks/slack/format.d.ts +65 -0
- package/dist/sinks/slack/format.js +327 -0
- package/dist/sinks/slack/index.d.ts +27 -0
- package/dist/sinks/slack/index.js +78 -0
- package/dist/sinks/slack-sink.d.ts +27 -0
- package/dist/sinks/slack-sink.js +78 -0
- package/dist/sinks/types.d.ts +59 -0
- package/dist/sinks/types.js +44 -0
- package/dist/sinks/webhook/index.d.ts +19 -0
- package/dist/sinks/webhook/index.js +50 -0
- package/dist/sinks/webhook-sink.d.ts +19 -0
- package/dist/sinks/webhook-sink.js +50 -0
- package/dist/sources.d.ts +104 -0
- package/dist/sources.js +292 -0
- package/dist/webhook/budget.d.ts +42 -0
- package/dist/webhook/budget.js +60 -0
- package/dist/webhook/debounce.d.ts +67 -0
- package/dist/webhook/debounce.js +76 -0
- package/dist/webhook/dispatch.d.ts +45 -0
- package/dist/webhook/dispatch.js +84 -0
- package/dist/webhook/eval-request-handler.d.ts +87 -0
- package/dist/webhook/eval-request-handler.js +181 -0
- package/dist/webhook/handler.d.ts +88 -0
- package/dist/webhook/handler.js +203 -0
- package/dist/webhook/index.d.ts +17 -0
- package/dist/webhook/index.js +12 -0
- package/dist/webhook/types.d.ts +109 -0
- package/dist/webhook/types.js +10 -0
- package/package.json +72 -0
- package/tasks/.expanded.agentic.yaml +51 -0
- package/tasks/.expanded.yaml +66 -0
- package/tasks/frameworks.yaml +98 -0
- package/tasks/functions.yaml +51 -0
- package/tasks/groq.yaml +216 -0
- package/tasks/nextjs-live.yaml +62 -0
- package/tasks/studio-setup.yaml +111 -0
- package/tasks/visual-editing.yaml +120 -0
|
@@ -0,0 +1,639 @@
|
|
|
1
|
+
# Airbyte Declarative Source — AI Literacy Framework
|
|
2
|
+
#
|
|
3
|
+
# Extracts evaluation reports from the Sanity Content Lake and delivers them
|
|
4
|
+
# to BigQuery (or any Airbyte-supported destination).
|
|
5
|
+
#
|
|
6
|
+
# Architecture: Sanity Content Lake → Airbyte (scheduled poll) → BigQuery
|
|
7
|
+
# This replaces the direct BigQuerySink with ELT managed by the data team.
|
|
8
|
+
#
|
|
9
|
+
# Two streams:
|
|
10
|
+
# 1. reports — one row per evaluation run, GROQ-projected to flat columns
|
|
11
|
+
# 2. area_scores — one row per report with nested model×area scores;
|
|
12
|
+
# use BigQuery views (see bigquery/views/) to UNNEST into flat rows
|
|
13
|
+
#
|
|
14
|
+
# Both streams use incremental sync with _createdAt as the cursor, so only
|
|
15
|
+
# new reports are transferred on each sync.
|
|
16
|
+
#
|
|
17
|
+
# @see docs/design-docs/report-store/bigquery.md — target schema
|
|
18
|
+
# @see docs/design-docs/report-store/airbyte-elt.md — integration design
|
|
19
|
+
version: 6.48.15
|
|
20
|
+
|
|
21
|
+
type: DeclarativeSource
|
|
22
|
+
|
|
23
|
+
check:
|
|
24
|
+
type: CheckStream
|
|
25
|
+
stream_names:
|
|
26
|
+
- reports
|
|
27
|
+
|
|
28
|
+
definitions:
|
|
29
|
+
streams:
|
|
30
|
+
# ------------------------------------------------------------------
|
|
31
|
+
# Stream 1: reports — flat row per evaluation run
|
|
32
|
+
# ------------------------------------------------------------------
|
|
33
|
+
# GROQ projection flattens nested provenance/summary into top-level
|
|
34
|
+
# columns matching the ailf.reports BigQuery table schema.
|
|
35
|
+
# The comparison field is intentionally excluded — it duplicates
|
|
36
|
+
# entire ScoreSummary objects and can always be recomputed from
|
|
37
|
+
# two report rows.
|
|
38
|
+
reports:
|
|
39
|
+
type: DeclarativeStream
|
|
40
|
+
name: reports
|
|
41
|
+
retriever:
|
|
42
|
+
type: SimpleRetriever
|
|
43
|
+
decoder:
|
|
44
|
+
type: JsonDecoder
|
|
45
|
+
requester:
|
|
46
|
+
          $ref: "#/definitions/base_requester"
          path: /v2026-03-12/data/query/{{ config['dataset'] }}
          http_method: GET
          request_parameters:
            query: >-
              *[_type=="ailf.report" && _createdAt > "{{
              stream_interval.start_time or '1970-01-01T00:00:00Z' }}" &&
              _createdAt <= "{{ stream_interval.end_time }}" ]|order(_createdAt
              asc){
                "report_id": reportId,
                "completed_at": completedAt,
                "duration_ms": durationMs,
                tag,
                "mode": provenance.mode,
                "source_name": provenance.source.name,
                "source_base_url": provenance.source.baseUrl,
                "source_dataset": provenance.source.dataset,
                "source_perspective": provenance.source.perspective,
                "grader_model": provenance.graderModel,
                "trigger_type": provenance.trigger.type,
                "trigger_caller_repo": select(
                  provenance.trigger.type == "cross-repo" =>
                  provenance.trigger.callerRepo,
                  null
                ),
                "git_repo": provenance.git.repo,
                "git_branch": provenance.git.branch,
                "git_sha": provenance.git.sha,
                "git_pr_number": provenance.git.prNumber,
                "avg_score": summary.overall.avgScore,
                "avg_doc_lift": summary.overall.avgDocLift,
                "total_cost": summary.overall.cost.total,
                "grader_cost": summary.overall.cost.graderTotal,
                "area_count": count(provenance.areas),
                "model_count": count(provenance.models),
                "areas": provenance.areas,
                "models": provenance.models[].id,
                "avg_actual_score": summary.overall.avgActualScore,
                "avg_retrieval_gap": summary.overall.avgRetrievalGap,
                "avg_infrastructure_efficiency":
                summary.overall.avgInfrastructureEfficiency,
                "promptfoo_url": provenance.promptfooUrl,
                "promptfoo_urls": provenance.promptfooUrls[] { mode, url },
                _createdAt
              }
        record_selector:
          type: RecordSelector
          extractor:
            type: DpathExtractor
            field_path:
              - result
      primary_key:
        - report_id
      incremental_sync:
        type: DatetimeBasedCursor
        cursor_field: _createdAt
        cursor_datetime_formats:
          - "%Y-%m-%dT%H:%M:%S.%fZ"
          - "%Y-%m-%dT%H:%M:%SZ"
        datetime_format: "%Y-%m-%dT%H:%M:%SZ"
        start_datetime:
          type: MinMaxDatetime
          datetime: "{{ config.get('start_date', '2026-01-01T00:00:00Z') }}"
          datetime_format: "%Y-%m-%dT%H:%M:%SZ"
        step: P30D
        cursor_granularity: PT1S
      schema_loader:
        type: InlineSchemaLoader
        schema:
          $ref: "#/schemas/reports"

    # ------------------------------------------------------------------
    # Stream 2: area_scores — per-model per-area score rows
    # ------------------------------------------------------------------
    # GROQ extracts the nested perModel→scores arrays with report-level
    # context (report_id, completed_at, mode, source_name). The nesting
    # is preserved because GROQ cannot explode arrays into flat rows.
    #
    # BigQuery consumers should query the `ailf.area_scores` view
    # (defined in bigquery/views/area_scores.sql) which UNNESTs the
    # nested arrays into one flat row per area per model per report.
    area_scores:
      type: DeclarativeStream
      name: area_scores
      retriever:
        type: SimpleRetriever
        decoder:
          type: JsonDecoder
        requester:
          $ref: "#/definitions/base_requester"
          path: /v2026-03-12/data/query/{{ config['dataset'] }}
          http_method: GET
          request_parameters:
            query: >-
              *[_type=="ailf.report" && _createdAt > "{{
              stream_interval.start_time or '1970-01-01T00:00:00Z' }}" &&
              _createdAt <= "{{ stream_interval.end_time }}" ]|order(_createdAt
              asc){
                "report_id": reportId,
                "completed_at": completedAt,
                "mode": provenance.mode,
                "source_name": provenance.source.name,
                "model_scores": summary.perModel[]{
                  "model_id": modelId,
                  "areas": scores[]{
                    "area": feature,
                    "total_score": totalScore,
                    "task_completion": taskCompletion,
                    "code_correctness": codeCorrectness,
                    "doc_coverage": docCoverage,
                    "doc_lift": coalesce(docLift, liftFromDocs),
                    "ceiling_score": coalesce(ceilingScore, withDocsScore),
                    "floor_score": coalesce(floorScore, withoutDocsScore),
                    "actual_score": actualScore,
                    "retrieval_gap": retrievalGap,
                    "infrastructure_efficiency": infrastructureEfficiency,
                    "total_cost": totalCost,
                    "test_count": testCount
                  }
                },
                "fallback_model_id": provenance.models[0].id,
                "fallback_scores": summary.scores[]{
                  "area": feature,
                  "total_score": totalScore,
                  "task_completion": taskCompletion,
                  "code_correctness": codeCorrectness,
                  "doc_coverage": docCoverage,
                  "doc_lift": coalesce(docLift, liftFromDocs),
                  "ceiling_score": coalesce(ceilingScore, withDocsScore),
                  "floor_score": coalesce(floorScore, withoutDocsScore),
                  "actual_score": actualScore,
                  "retrieval_gap": retrievalGap,
                  "infrastructure_efficiency": infrastructureEfficiency,
                  "total_cost": totalCost,
                  "test_count": testCount
                },
                _createdAt
              }
        record_selector:
          type: RecordSelector
          extractor:
            type: DpathExtractor
            field_path:
              - result
      primary_key:
        - report_id
      incremental_sync:
        type: DatetimeBasedCursor
        cursor_field: _createdAt
        cursor_datetime_formats:
          - "%Y-%m-%dT%H:%M:%S.%fZ"
          - "%Y-%m-%dT%H:%M:%SZ"
        datetime_format: "%Y-%m-%dT%H:%M:%SZ"
        start_datetime:
          type: MinMaxDatetime
          datetime: "{{ config.get('start_date', '2026-01-01T00:00:00Z') }}"
          datetime_format: "%Y-%m-%dT%H:%M:%SZ"
        step: P30D
        cursor_granularity: PT1S
      schema_loader:
        type: InlineSchemaLoader
        schema:
          $ref: "#/schemas/area_scores"

  base_requester:
    type: HttpRequester
    url_base: https://{{ config['project_id'] }}.api.sanity.io
    authenticator:
      type: BearerAuthenticator
      api_token: "{{ config['api_key'] }}"

streams:
  - $ref: "#/definitions/streams/reports"
  - $ref: "#/definitions/streams/area_scores"

spec:
  type: Spec
  connection_specification:
    type: object
    $schema: http://json-schema.org/draft-07/schema#
    required:
      - api_key
      - dataset
      - project_id
    properties:
      api_key:
        type: string
        order: 0
        title: Sanity API Token
        description: >-
          A Sanity API token with read access to the dataset containing
          ailf.report documents. Generate one at
          https://www.sanity.io/manage/project/<project_id>/api#tokens
        airbyte_secret: true
      dataset:
        type: string
        order: 1
        title: Dataset
        description: >-
          The Sanity dataset containing evaluation reports.
        default: next
      project_id:
        type: string
        order: 2
        title: Project ID
        description: >-
          The Sanity project ID (e.g., "3do82whm").
        default: 3do82whm
      start_date:
        type: string
        order: 3
        title: Start Date
        description: >-
          Only sync reports created after this date (ISO 8601). Defaults to
          2026-01-01 if not set.
        default: "2026-01-01T00:00:00Z"
        examples:
          - "2026-01-01T00:00:00Z"
          - "2026-06-01T00:00:00Z"
    additionalProperties: true

metadata:
  assist: {}
  testedStreams:
    reports:
      hasRecords: true
      streamHash: null
      hasResponse: true
      primaryKeysAreUnique: true
      primaryKeysArePresent: true
      responsesAreSuccessful: true
    area_scores:
      hasRecords: true
      streamHash: null
      hasResponse: true
      primaryKeysAreUnique: true
      primaryKeysArePresent: true
      responsesAreSuccessful: true
  autoImportSchema:
    reports: false
    area_scores: false

# ======================================================================
# Inline schemas — manually defined to match the designed BigQuery tables.
# autoImportSchema is OFF so these don't drift with Sanity document changes.
# ======================================================================

schemas:
  # ------------------------------------------------------------------
  # reports schema — flat, matches ailf.reports BigQuery table
  # ------------------------------------------------------------------
  reports:
    type: object
    $schema: http://json-schema.org/schema#
    required:
      - report_id
    properties:
      report_id:
        type: string
        description: UUID v7 report identifier (primary key)
      completed_at:
        type:
          - string
          - "null"
        description: ISO 8601 timestamp when the evaluation completed
      duration_ms:
        type:
          - number
          - "null"
        description: Pipeline execution time in milliseconds
      tag:
        type:
          - string
          - "null"
        description: Optional human-supplied label
      mode:
        type:
          - string
          - "null"
        description: "Evaluation mode: baseline, observed, or agentic"
      source_name:
        type:
          - string
          - "null"
        description: Documentation source name (e.g., "production")
      source_base_url:
        type:
          - string
          - "null"
        description: Documentation source base URL
      source_dataset:
        type:
          - string
          - "null"
        description: Sanity dataset used for evaluation
      source_perspective:
        type:
          - string
          - "null"
        description: Sanity perspective (for content release evaluations)
      grader_model:
        type:
          - string
          - "null"
        description: Model used for LLM grading
      trigger_type:
        type:
          - string
          - "null"
        description:
          "What triggered the evaluation: manual, ci, scheduled, webhook,
          cross-repo"
      trigger_caller_repo:
        type:
          - string
          - "null"
        description: Caller repository for cross-repo triggers
      git_repo:
        type:
          - string
          - "null"
        description: Source repository (when run from CI)
      git_branch:
        type:
          - string
          - "null"
        description: Source branch
      git_sha:
        type:
          - string
          - "null"
        description: Commit SHA
      git_pr_number:
        type:
          - number
          - "null"
        description: Pull request number (if applicable)
      avg_score:
        type:
          - number
          - "null"
        description: Overall average AI literacy score (0–100)
      avg_doc_lift:
        type:
          - number
          - "null"
        description: Overall documentation lift score
      total_cost:
        type:
          - number
          - "null"
        description: Total evaluation cost in USD
      grader_cost:
        type:
          - number
          - "null"
        description: Grader model cost in USD
      area_count:
        type:
          - number
          - "null"
        description: Number of feature areas evaluated
      model_count:
        type:
          - number
          - "null"
        description: Number of models evaluated
      areas:
        type:
          - array
          - "null"
        items:
          type: string
        description: List of evaluated feature area names
      models:
        type:
          - array
          - "null"
        items:
          type: string
        description: List of evaluated model IDs
      avg_actual_score:
        type:
          - number
          - "null"
        description: Average score from agent-retrieved docs (full-mode only)
      avg_retrieval_gap:
        type:
          - number
          - "null"
        description: Average ceiling minus actual across areas (full-mode only)
      avg_infrastructure_efficiency:
        type:
          - number
          - "null"
        description: Average actual/ceiling ratio across areas (full-mode only)
      promptfoo_url:
        type:
          - string
          - "null"
        description: Legacy single Promptfoo share URL
      promptfoo_urls:
        type:
          - array
          - "null"
        description: Per-mode Promptfoo share URLs (one per sub-eval)
        items:
          type: object
          properties:
            mode:
              type: string
              description: "Evaluation mode: baseline, agentic, observed"
            url:
              type: string
              description: Promptfoo share URL for this mode
      _createdAt:
        type:
          - string
          - "null"
        description:
          Sanity document creation timestamp (used as incremental cursor)
    additionalProperties: true

  # ------------------------------------------------------------------
  # area_scores schema — nested model→area scores for BigQuery UNNEST
  # ------------------------------------------------------------------
  area_scores:
    type: object
    $schema: http://json-schema.org/schema#
    required:
      - report_id
    properties:
      report_id:
        type: string
        description: FK to reports table (UUID v7)
      completed_at:
        type:
          - string
          - "null"
        description: Denormalized timestamp for partitioning
      mode:
        type:
          - string
          - "null"
        description: Denormalized evaluation mode for clustering
      source_name:
        type:
          - string
          - "null"
        description: Denormalized source name for clustering
      model_scores:
        type:
          - array
          - "null"
        description:
          Per-model score breakdowns (UNNEST in BigQuery to get flat rows)
        items:
          type: object
          properties:
            model_id:
              type:
                - string
                - "null"
            areas:
              type:
                - array
                - "null"
              items:
                type: object
                properties:
                  area:
                    type:
                      - string
                      - "null"
                  total_score:
                    type:
                      - number
                      - "null"
                  task_completion:
                    type:
                      - number
                      - "null"
                  code_correctness:
                    type:
                      - number
                      - "null"
                  doc_coverage:
                    type:
                      - number
                      - "null"
                  doc_lift:
                    type:
                      - number
                      - "null"
                  ceiling_score:
                    type:
                      - number
                      - "null"
                  floor_score:
                    type:
                      - number
                      - "null"
                  total_cost:
                    type:
                      - number
                      - "null"
                  test_count:
                    type:
                      - number
                      - "null"
                  actual_score:
                    type:
                      - number
                      - "null"
                  retrieval_gap:
                    type:
                      - number
                      - "null"
                  infrastructure_efficiency:
                    type:
                      - number
                      - "null"
      fallback_model_id:
        type:
          - string
          - "null"
        description:
          First model ID from provenance (used when perModel is absent)
      fallback_scores:
        type:
          - array
          - "null"
        description: Aggregate scores without per-model breakdown (fallback)
        items:
          type: object
          properties:
            area:
              type:
                - string
                - "null"
            total_score:
              type:
                - number
                - "null"
            task_completion:
              type:
                - number
                - "null"
            code_correctness:
              type:
                - number
                - "null"
            doc_coverage:
              type:
                - number
                - "null"
            doc_lift:
              type:
                - number
                - "null"
            ceiling_score:
              type:
                - number
                - "null"
            floor_score:
              type:
                - number
                - "null"
            total_cost:
              type:
                - number
                - "null"
            test_count:
              type:
                - number
                - "null"
            actual_score:
              type:
                - number
                - "null"
            retrieval_gap:
              type:
                - number
                - "null"
            infrastructure_efficiency:
              type:
                - number
                - "null"
      _createdAt:
        type:
          - string
          - "null"
        description: Sanity document creation timestamp (incremental cursor)
    additionalProperties: true
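
Both streams share the same `incremental_sync` settings (`step: P30D`, `cursor_granularity: PT1S`, a default `start_date` of `2026-01-01T00:00:00Z`, and two accepted cursor formats), and both substitute `stream_interval.start_time` and `stream_interval.end_time` into the GROQ filter. The Python sketch below illustrates how such settings could translate into date windows and rendered filters. It is an illustration under assumptions, not the Airbyte CDK's implementation: `parse_cursor`, `slice_windows`, and `render_filter` are hypothetical helpers, and the real cursor's boundary handling may differ.

```python
from datetime import datetime, timedelta, timezone

# Formats copied from the manifest's cursor_datetime_formats / datetime_format.
CURSOR_FORMATS = ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"]
OUTPUT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_cursor(value):
    """Try each accepted cursor format in order (hypothetical helper)."""
    for fmt in CURSOR_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised cursor value: {value!r}")


def slice_windows(start, end, step=timedelta(days=30),
                  granularity=timedelta(seconds=1)):
    """Yield (window_start, window_end) pairs covering start..end.

    Assumption: each window ends one granularity unit before the next begins.
    """
    cursor = start
    while cursor <= end:
        window_end = min(cursor + step - granularity, end)
        yield cursor, window_end
        cursor = window_end + granularity


def render_filter(window_start, window_end):
    """Render the filter part of the GROQ query for one window."""
    return (
        f'*[_type=="ailf.report" && _createdAt > "{window_start.strftime(OUTPUT_FORMAT)}" '
        f'&& _createdAt <= "{window_end.strftime(OUTPUT_FORMAT)}"]'
    )


if __name__ == "__main__":
    first = parse_cursor("2026-01-01T00:00:00Z")
    last = parse_cursor("2026-03-15T00:00:00Z")
    for window in slice_windows(first, last):
        print(render_filter(*window))
```

Trying the fractional-seconds format first mirrors the order in which the manifest lists its cursor formats.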
@@ -0,0 +1,74 @@
# BigQuery Schema & Views

SQL definitions for the BigQuery analytics layer. These create the flattened
tables and views that power SQL analytics, BI dashboards (Looker, Sheets), and
ad-hoc data exploration.

## Architecture

```
Sanity Content Lake
    │
    ▼
Airbyte (scheduled sync)
    ├─ Stream: reports      → ailf_raw.reports (flat, GROQ-projected)
    └─ Stream: area_scores  → ailf_raw.area_scores (nested model→area arrays)
    │
    ▼
BigQuery views (this directory)
    ├─ ailf.reports      → direct passthrough (already flat from GROQ projection)
    └─ ailf.area_scores  → UNNEST flattening (one row per area per model per report)
```
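
To make the "one row per area per model per report" shape concrete, here is a small Python sketch of the same flattening applied to a single `area_scores` record. It is not the view's SQL. `flatten_area_scores` is a hypothetical helper, and the fallback branch is an assumption based on the connector schema's description of `fallback_model_id` ("used when perModel is absent").

```python
def flatten_area_scores(record):
    """Return one flat dict per (report, model, area) for a raw record."""
    context = {
        key: record.get(key)
        for key in ("report_id", "completed_at", "mode", "source_name")
    }
    rows = []
    model_scores = record.get("model_scores") or []
    if model_scores:
        for model in model_scores:
            for area in model.get("areas") or []:
                rows.append({**context, "model_id": model.get("model_id"), **area})
    else:
        # Assumed fallback: no per-model breakdown, so attribute the
        # aggregate scores to the first model ID from provenance.
        for area in record.get("fallback_scores") or []:
            rows.append(
                {**context, "model_id": record.get("fallback_model_id"), **area}
            )
    return rows


if __name__ == "__main__":
    sample = {
        "report_id": "r-1",
        "mode": "baseline",
        "model_scores": [
            {"model_id": "model-a", "areas": [{"area": "groq", "total_score": 80}]},
            {"model_id": "model-b", "areas": [{"area": "groq", "total_score": 70}]},
        ],
    }
    for row in flatten_area_scores(sample):
        print(row)
```

Each output row repeats the report-level context, which matches the denormalized `completed_at`, `mode`, and `source_name` columns described in the connector schema.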

## Files

| File                    | Purpose                                                                         |
| ----------------------- | ------------------------------------------------------------------------------- |
| `views/area_scores.sql` | Flattens nested `model_scores` array into one row per area per model per report |
| `views/reports.sql`     | Clean passthrough view with correct types and column ordering                   |

## Setup

The Airbyte connection loads data into a raw dataset (e.g., `ailf_raw`). The
views defined here reference that raw dataset and present the designed schema
from `docs/design-docs/report-store/bigquery.md`.

### 1. Create the raw dataset (Airbyte writes here)

```bash
bq mk --dataset data-platform-302218:ailf_raw
```

### 2. Create the analytics dataset (views live here)

```bash
bq mk --dataset data-platform-302218:ailf
```

### 3. Create the views

```bash
bq query --use_legacy_sql=false < views/reports.sql
bq query --use_legacy_sql=false < views/area_scores.sql
```
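
The three steps above can also be scripted. The Python sketch below only assembles the same `bq` command lines in order and prints them by default. `setup_steps` and `run_steps` are hypothetical helpers, and running with `dry_run=False` assumes the `bq` CLI is installed and authenticated and that the working directory contains `views/`.

```python
import subprocess


def setup_steps(project="data-platform-302218", raw="ailf_raw", analytics="ailf"):
    """Return (argv, sql_path) pairs in the order the README lists them."""
    return [
        (["bq", "mk", "--dataset", f"{project}:{raw}"], None),
        (["bq", "mk", "--dataset", f"{project}:{analytics}"], None),
        (["bq", "query", "--use_legacy_sql=false"], "views/reports.sql"),
        (["bq", "query", "--use_legacy_sql=false"], "views/area_scores.sql"),
    ]


def run_steps(steps, dry_run=True):
    """Print each command, or execute it with the SQL file on stdin."""
    for argv, sql_path in steps:
        if dry_run:
            suffix = f" < {sql_path}" if sql_path else ""
            print(" ".join(argv) + suffix)
        elif sql_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(sql_path, "rb") as sql:
                subprocess.run(argv, stdin=sql, check=True)


if __name__ == "__main__":
    run_steps(setup_steps())
```

Keeping the dataset creation steps ahead of the view queries preserves the order given in the headings.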

## Naming conventions

- **`ailf_raw.*`** — raw Airbyte-loaded tables (nested JSON, Airbyte metadata
  columns)
- **`ailf.*`** — analytics views (flat, typed, designed schema from bigquery.md)

Airbyte adds metadata columns (`_airbyte_raw_id`, `_airbyte_extracted_at`,
`_airbyte_meta`) to its output tables. The views strip these so downstream
consumers see only the designed schema.
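
As an illustration of what "stripping" means for a single row, the Python sketch below drops the three metadata columns named above from a raw record. The views do this in SQL; `strip_airbyte_metadata` is a hypothetical helper, and the sample values are made up.

```python
# Metadata columns named in the README; other Airbyte columns may exist.
AIRBYTE_METADATA_COLUMNS = {
    "_airbyte_raw_id",
    "_airbyte_extracted_at",
    "_airbyte_meta",
}


def strip_airbyte_metadata(row):
    """Return a copy of the row without the Airbyte metadata columns."""
    return {
        key: value
        for key, value in row.items()
        if key not in AIRBYTE_METADATA_COLUMNS
    }


if __name__ == "__main__":
    raw_row = {
        "report_id": "r-1",
        "avg_score": 82.5,
        "_airbyte_raw_id": "example-raw-id",
        "_airbyte_extracted_at": "2026-02-01T00:00:00Z",
        "_airbyte_meta": {},
    }
    print(strip_airbyte_metadata(raw_row))
```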

## Schema evolution

Views are the transformation layer. When the report format evolves:

1. Update the GROQ projections in the Airbyte connector YAML
2. Update the view SQL to map new fields
3. Backfill is automatic — views always reflect current data
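
The connector's `area_scores` projection contains calls such as `coalesce(docLift, liftFromDocs)`, which suggests that some fields have already been renamed and that the projection accepts either name. The Python sketch below mimics that pattern for one score object. `coalesce` and `project_area` are hypothetical helpers, and reading the second name in each pair as an older field name is an assumption.

```python
def coalesce(*values):
    """Return the first argument that is not None, like GROQ's coalesce()."""
    for value in values:
        if value is not None:
            return value
    return None


def project_area(score):
    """Map one score object to output column names.

    The field pairs are copied from the connector's GROQ projection.
    """
    return {
        "area": score.get("feature"),
        "doc_lift": coalesce(score.get("docLift"), score.get("liftFromDocs")),
        "ceiling_score": coalesce(
            score.get("ceilingScore"), score.get("withDocsScore")
        ),
        "floor_score": coalesce(
            score.get("floorScore"), score.get("withoutDocsScore")
        ),
    }


if __name__ == "__main__":
    print(project_area({"feature": "groq", "liftFromDocs": 12, "withDocsScore": 90}))
    print(project_area({"feature": "groq", "docLift": 0, "ceilingScore": 88}))
```

Testing against `None` rather than truthiness keeps a legitimate score of `0` from being replaced by the alternate field.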

@see docs/design-docs/report-store/bigquery.md — canonical schema definition
@see packages/eval/config/airbyte/ — Airbyte connector configuration