@sanity/ailf 0.1.0

Files changed (530)
  1. package/README.md +89 -0
  2. package/bin/ailf.js +64 -0
  3. package/canonical/grader-references/README.md +88 -0
  4. package/canonical/grader-references/groq.yaml +234 -0
  5. package/canonical/grader-references/studio-setup.yaml +275 -0
  6. package/canonical/reference-solutions/.gitkeep +1 -0
  7. package/canonical/reference-solutions/frameworks/nuxt.ts +119 -0
  8. package/canonical/reference-solutions/frameworks/remix.tsx +100 -0
  9. package/canonical/reference-solutions/functions/publish-webhook.ts +60 -0
  10. package/canonical/reference-solutions/groq/advanced-filtering.ts +379 -0
  11. package/canonical/reference-solutions/groq/blog-queries.ts +137 -0
  12. package/canonical/reference-solutions/groq/joins-references.ts +300 -0
  13. package/canonical/reference-solutions/nextjs/app-router-integration.tsx +128 -0
  14. package/canonical/reference-solutions/studio-setup/blog-schema.ts +143 -0
  15. package/canonical/reference-solutions/studio-setup/custom-tool.tsx +78 -0
  16. package/canonical/reference-solutions/visual-editing/live-preview.tsx +137 -0
  17. package/canonical/reference-solutions/visual-editing/presentation-nextjs.tsx +130 -0
  18. package/config/airbyte/ai_literacy_framework.connector.yaml +639 -0
  19. package/config/bigquery/README.md +74 -0
  20. package/config/bigquery/views/area_scores.sql +87 -0
  21. package/config/bigquery/views/reports.sql +49 -0
  22. package/config/features.yaml +116 -0
  23. package/config/models.yaml +115 -0
  24. package/config/prompts.yaml +75 -0
  25. package/config/rubrics.yaml +62 -0
  26. package/config/schedules.yaml +43 -0
  27. package/config/sinks.yaml +54 -0
  28. package/config/sources.yaml +51 -0
  29. package/config/thresholds.yaml +49 -0
  30. package/dist/_vendor/ailf-core/examples/index.d.ts +190 -0
  31. package/dist/_vendor/ailf-core/examples/index.js +285 -0
  32. package/dist/_vendor/ailf-core/index.d.ts +17 -0
  33. package/dist/_vendor/ailf-core/index.js +17 -0
  34. package/dist/_vendor/ailf-core/ports/cache-store.d.ts +72 -0
  35. package/dist/_vendor/ailf-core/ports/cache-store.js +17 -0
  36. package/dist/_vendor/ailf-core/ports/config-source.d.ts +33 -0
  37. package/dist/_vendor/ailf-core/ports/config-source.js +15 -0
  38. package/dist/_vendor/ailf-core/ports/context.d.ts +172 -0
  39. package/dist/_vendor/ailf-core/ports/context.js +14 -0
  40. package/dist/_vendor/ailf-core/ports/doc-fetcher.d.ts +131 -0
  41. package/dist/_vendor/ailf-core/ports/doc-fetcher.js +12 -0
  42. package/dist/_vendor/ailf-core/ports/eval-runner.d.ts +24 -0
  43. package/dist/_vendor/ailf-core/ports/eval-runner.js +8 -0
  44. package/dist/_vendor/ailf-core/ports/index.d.ts +15 -0
  45. package/dist/_vendor/ailf-core/ports/index.js +7 -0
  46. package/dist/_vendor/ailf-core/ports/logger.d.ts +36 -0
  47. package/dist/_vendor/ailf-core/ports/logger.js +11 -0
  48. package/dist/_vendor/ailf-core/ports/pipeline-step.d.ts +46 -0
  49. package/dist/_vendor/ailf-core/ports/pipeline-step.js +8 -0
  50. package/dist/_vendor/ailf-core/ports/task-source.d.ts +159 -0
  51. package/dist/_vendor/ailf-core/ports/task-source.js +72 -0
  52. package/dist/_vendor/ailf-core/schemas/callback-payload.d.ts +24 -0
  53. package/dist/_vendor/ailf-core/schemas/callback-payload.js +29 -0
  54. package/dist/_vendor/ailf-core/schemas/eval-config.d.ts +55 -0
  55. package/dist/_vendor/ailf-core/schemas/eval-config.js +78 -0
  56. package/dist/_vendor/ailf-core/schemas/index.d.ts +16 -0
  57. package/dist/_vendor/ailf-core/schemas/index.js +16 -0
  58. package/dist/_vendor/ailf-core/schemas/pipeline-request.d.ts +125 -0
  59. package/dist/_vendor/ailf-core/schemas/pipeline-request.js +67 -0
  60. package/dist/_vendor/ailf-core/schemas/pipeline.d.ts +531 -0
  61. package/dist/_vendor/ailf-core/schemas/pipeline.js +318 -0
  62. package/dist/_vendor/ailf-core/schemas/schedules.d.ts +68 -0
  63. package/dist/_vendor/ailf-core/schemas/schedules.js +74 -0
  64. package/dist/_vendor/ailf-core/schemas/sinks.d.ts +207 -0
  65. package/dist/_vendor/ailf-core/schemas/sinks.js +108 -0
  66. package/dist/_vendor/ailf-core/services/comparison-formatters.d.ts +18 -0
  67. package/dist/_vendor/ailf-core/services/comparison-formatters.js +189 -0
  68. package/dist/_vendor/ailf-core/services/config-helpers.d.ts +41 -0
  69. package/dist/_vendor/ailf-core/services/config-helpers.js +86 -0
  70. package/dist/_vendor/ailf-core/services/index.d.ts +12 -0
  71. package/dist/_vendor/ailf-core/services/index.js +12 -0
  72. package/dist/_vendor/ailf-core/services/scoring.d.ts +49 -0
  73. package/dist/_vendor/ailf-core/services/scoring.js +222 -0
  74. package/dist/_vendor/ailf-core/types/index.d.ts +1082 -0
  75. package/dist/_vendor/ailf-core/types/index.js +21 -0
  76. package/dist/_vendor/ailf-core/types/scoring-input.d.ts +54 -0
  77. package/dist/_vendor/ailf-core/types/scoring-input.js +9 -0
  78. package/dist/_vendor/ailf-shared/dimension-names.d.ts +21 -0
  79. package/dist/_vendor/ailf-shared/dimension-names.js +27 -0
  80. package/dist/_vendor/ailf-shared/document-ref.d.ts +29 -0
  81. package/dist/_vendor/ailf-shared/document-ref.js +1 -0
  82. package/dist/_vendor/ailf-shared/eval-modes.d.ts +12 -0
  83. package/dist/_vendor/ailf-shared/eval-modes.js +8 -0
  84. package/dist/_vendor/ailf-shared/index.d.ts +16 -0
  85. package/dist/_vendor/ailf-shared/index.js +16 -0
  86. package/dist/_vendor/ailf-shared/noise-threshold.d.ts +9 -0
  87. package/dist/_vendor/ailf-shared/noise-threshold.js +9 -0
  88. package/dist/_vendor/ailf-shared/score-grades.d.ts +17 -0
  89. package/dist/_vendor/ailf-shared/score-grades.js +23 -0
  90. package/dist/adapters/cache/content-lake-cache.d.ts +24 -0
  91. package/dist/adapters/cache/content-lake-cache.js +59 -0
  92. package/dist/adapters/cache/filesystem-cache.d.ts +18 -0
  93. package/dist/adapters/cache/filesystem-cache.js +54 -0
  94. package/dist/adapters/cache/index.d.ts +2 -0
  95. package/dist/adapters/cache/index.js +2 -0
  96. package/dist/adapters/config-sources/cli-config-adapter.d.ts +17 -0
  97. package/dist/adapters/config-sources/cli-config-adapter.js +23 -0
  98. package/dist/adapters/config-sources/file-config-adapter.d.ts +26 -0
  99. package/dist/adapters/config-sources/file-config-adapter.js +96 -0
  100. package/dist/adapters/config-sources/index.d.ts +2 -0
  101. package/dist/adapters/config-sources/index.js +2 -0
  102. package/dist/adapters/doc-fetchers/index.d.ts +1 -0
  103. package/dist/adapters/doc-fetchers/index.js +1 -0
  104. package/dist/adapters/doc-fetchers/sanity-doc-fetcher.d.ts +76 -0
  105. package/dist/adapters/doc-fetchers/sanity-doc-fetcher.js +620 -0
  106. package/dist/adapters/eval-runners/index.d.ts +1 -0
  107. package/dist/adapters/eval-runners/index.js +1 -0
  108. package/dist/adapters/eval-runners/promptfoo-eval-adapter.d.ts +14 -0
  109. package/dist/adapters/eval-runners/promptfoo-eval-adapter.js +63 -0
  110. package/dist/adapters/index.d.ts +12 -0
  111. package/dist/adapters/index.js +12 -0
  112. package/dist/adapters/loggers/console-logger.d.ts +22 -0
  113. package/dist/adapters/loggers/console-logger.js +54 -0
  114. package/dist/adapters/loggers/index.d.ts +9 -0
  115. package/dist/adapters/loggers/index.js +9 -0
  116. package/dist/adapters/loggers/json-logger.d.ts +18 -0
  117. package/dist/adapters/loggers/json-logger.js +33 -0
  118. package/dist/adapters/loggers/quiet-logger.d.ts +16 -0
  119. package/dist/adapters/loggers/quiet-logger.js +30 -0
  120. package/dist/adapters/task-sources/composite-task-source.d.ts +20 -0
  121. package/dist/adapters/task-sources/composite-task-source.js +59 -0
  122. package/dist/adapters/task-sources/content-lake-task-source.d.ts +20 -0
  123. package/dist/adapters/task-sources/content-lake-task-source.js +219 -0
  124. package/dist/adapters/task-sources/index.d.ts +7 -0
  125. package/dist/adapters/task-sources/index.js +7 -0
  126. package/dist/adapters/task-sources/repo-schemas.d.ts +245 -0
  127. package/dist/adapters/task-sources/repo-schemas.js +234 -0
  128. package/dist/adapters/task-sources/repo-task-source.d.ts +22 -0
  129. package/dist/adapters/task-sources/repo-task-source.js +104 -0
  130. package/dist/adapters/task-sources/repo-trigger.d.ts +52 -0
  131. package/dist/adapters/task-sources/repo-trigger.js +153 -0
  132. package/dist/adapters/task-sources/repo-validation.d.ts +49 -0
  133. package/dist/adapters/task-sources/repo-validation.js +164 -0
  134. package/dist/adapters/task-sources/yaml-task-source.d.ts +18 -0
  135. package/dist/adapters/task-sources/yaml-task-source.js +136 -0
  136. package/dist/agent-observer/agentic-provider.d.ts +132 -0
  137. package/dist/agent-observer/agentic-provider.js +983 -0
  138. package/dist/agent-observer/classifier.d.ts +62 -0
  139. package/dist/agent-observer/classifier.js +269 -0
  140. package/dist/agent-observer/index.d.ts +7 -0
  141. package/dist/agent-observer/index.js +4 -0
  142. package/dist/agent-observer/pricing.d.ts +35 -0
  143. package/dist/agent-observer/pricing.js +82 -0
  144. package/dist/agent-observer/provider.d.ts +77 -0
  145. package/dist/agent-observer/provider.js +151 -0
  146. package/dist/agent-observer/proxy.d.ts +91 -0
  147. package/dist/agent-observer/proxy.js +321 -0
  148. package/dist/agent-observer/test-imports.d.ts +7 -0
  149. package/dist/agent-observer/test-imports.js +185 -0
  150. package/dist/agent-observer/types.d.ts +137 -0
  151. package/dist/agent-observer/types.js +16 -0
  152. package/dist/assertions/source-isolation.d.ts +72 -0
  153. package/dist/assertions/source-isolation.js +117 -0
  154. package/dist/cli.d.ts +24 -0
  155. package/dist/cli.js +199 -0
  156. package/dist/commands/agent-report.d.ts +5 -0
  157. package/dist/commands/agent-report.js +69 -0
  158. package/dist/commands/baseline.d.ts +9 -0
  159. package/dist/commands/baseline.js +141 -0
  160. package/dist/commands/cache.d.ts +13 -0
  161. package/dist/commands/cache.js +135 -0
  162. package/dist/commands/calculate-scores.d.ts +8 -0
  163. package/dist/commands/calculate-scores.js +48 -0
  164. package/dist/commands/compare.d.ts +8 -0
  165. package/dist/commands/compare.js +120 -0
  166. package/dist/commands/completion.d.ts +18 -0
  167. package/dist/commands/completion.js +260 -0
  168. package/dist/commands/coverage-audit.d.ts +7 -0
  169. package/dist/commands/coverage-audit.js +40 -0
  170. package/dist/commands/discovery-report.d.ts +10 -0
  171. package/dist/commands/discovery-report.js +44 -0
  172. package/dist/commands/eval.d.ts +9 -0
  173. package/dist/commands/eval.js +35 -0
  174. package/dist/commands/explain-handler.d.ts +34 -0
  175. package/dist/commands/explain-handler.js +719 -0
  176. package/dist/commands/fetch-docs.d.ts +8 -0
  177. package/dist/commands/fetch-docs.js +128 -0
  178. package/dist/commands/generate-configs.d.ts +8 -0
  179. package/dist/commands/generate-configs.js +46 -0
  180. package/dist/commands/grader/index.d.ts +11 -0
  181. package/dist/commands/grader/index.js +118 -0
  182. package/dist/commands/init.d.ts +19 -0
  183. package/dist/commands/init.js +150 -0
  184. package/dist/commands/interactive.d.ts +12 -0
  185. package/dist/commands/interactive.js +238 -0
  186. package/dist/commands/lookup-doc.d.ts +15 -0
  187. package/dist/commands/lookup-doc.js +84 -0
  188. package/dist/commands/measure-retrieval.d.ts +5 -0
  189. package/dist/commands/measure-retrieval.js +65 -0
  190. package/dist/commands/pipeline-action.d.ts +71 -0
  191. package/dist/commands/pipeline-action.js +305 -0
  192. package/dist/commands/pipeline.d.ts +62 -0
  193. package/dist/commands/pipeline.js +53 -0
  194. package/dist/commands/pr-comment.d.ts +8 -0
  195. package/dist/commands/pr-comment.js +47 -0
  196. package/dist/commands/publish.d.ts +26 -0
  197. package/dist/commands/publish.js +253 -0
  198. package/dist/commands/readiness-report.d.ts +10 -0
  199. package/dist/commands/readiness-report.js +104 -0
  200. package/dist/commands/shared/options.d.ts +29 -0
  201. package/dist/commands/shared/options.js +57 -0
  202. package/dist/commands/update-quality-scores.d.ts +5 -0
  203. package/dist/commands/update-quality-scores.js +20 -0
  204. package/dist/commands/validate-tasks.d.ts +16 -0
  205. package/dist/commands/validate-tasks.js +93 -0
  206. package/dist/commands/validate.d.ts +9 -0
  207. package/dist/commands/validate.js +73 -0
  208. package/dist/commands/webhook-server.d.ts +5 -0
  209. package/dist/commands/webhook-server.js +30 -0
  210. package/dist/commands/weekly-digest.d.ts +10 -0
  211. package/dist/commands/weekly-digest.js +104 -0
  212. package/dist/composition-root.d.ts +26 -0
  213. package/dist/composition-root.js +107 -0
  214. package/dist/interpolate.d.ts +26 -0
  215. package/dist/interpolate.js +70 -0
  216. package/dist/job-store.d.ts +104 -0
  217. package/dist/job-store.js +188 -0
  218. package/dist/lib/agent-behavior-report.d.ts +8 -0
  219. package/dist/lib/agent-behavior-report.js +185 -0
  220. package/dist/lib/baseline.d.ts +19 -0
  221. package/dist/lib/baseline.js +153 -0
  222. package/dist/lib/calculate-scores.d.ts +23 -0
  223. package/dist/lib/calculate-scores.js +42 -0
  224. package/dist/lib/compare.d.ts +18 -0
  225. package/dist/lib/compare.js +170 -0
  226. package/dist/lib/coverage-audit.d.ts +4 -0
  227. package/dist/lib/coverage-audit.js +42 -0
  228. package/dist/lib/discovery-report.d.ts +13 -0
  229. package/dist/lib/discovery-report.js +57 -0
  230. package/dist/lib/fetch-docs.d.ts +30 -0
  231. package/dist/lib/fetch-docs.js +171 -0
  232. package/dist/lib/generate-configs.d.ts +25 -0
  233. package/dist/lib/generate-configs.js +42 -0
  234. package/dist/lib/grader-api.d.ts +21 -0
  235. package/dist/lib/grader-api.js +34 -0
  236. package/dist/lib/grader-compare.d.ts +19 -0
  237. package/dist/lib/grader-compare.js +91 -0
  238. package/dist/lib/grader-consistency.d.ts +27 -0
  239. package/dist/lib/grader-consistency.js +79 -0
  240. package/dist/lib/grader-sensitivity.d.ts +19 -0
  241. package/dist/lib/grader-sensitivity.js +75 -0
  242. package/dist/lib/grader-validate.d.ts +19 -0
  243. package/dist/lib/grader-validate.js +78 -0
  244. package/dist/lib/measure-retrieval.d.ts +14 -0
  245. package/dist/lib/measure-retrieval.js +71 -0
  246. package/dist/lib/pr-comment.d.ts +16 -0
  247. package/dist/lib/pr-comment.js +28 -0
  248. package/dist/lib/readiness-report.d.ts +13 -0
  249. package/dist/lib/readiness-report.js +108 -0
  250. package/dist/lib/webhook-server.d.ts +11 -0
  251. package/dist/lib/webhook-server.js +24 -0
  252. package/dist/lib/weekly-digest.d.ts +24 -0
  253. package/dist/lib/weekly-digest.js +148 -0
  254. package/dist/orchestration/build-app-context.d.ts +27 -0
  255. package/dist/orchestration/build-app-context.js +81 -0
  256. package/dist/orchestration/build-step-sequence.d.ts +15 -0
  257. package/dist/orchestration/build-step-sequence.js +84 -0
  258. package/dist/orchestration/config-to-source-overrides.d.ts +9 -0
  259. package/dist/orchestration/config-to-source-overrides.js +28 -0
  260. package/dist/orchestration/env-bridge.d.ts +21 -0
  261. package/dist/orchestration/env-bridge.js +66 -0
  262. package/dist/orchestration/index.d.ts +11 -0
  263. package/dist/orchestration/index.js +11 -0
  264. package/dist/orchestration/pipeline-orchestrator.d.ts +24 -0
  265. package/dist/orchestration/pipeline-orchestrator.js +153 -0
  266. package/dist/orchestration/step-runner.d.ts +20 -0
  267. package/dist/orchestration/step-runner.js +88 -0
  268. package/dist/orchestration/steps/calculate-scores-step.d.ts +13 -0
  269. package/dist/orchestration/steps/calculate-scores-step.js +95 -0
  270. package/dist/orchestration/steps/callback-step.d.ts +24 -0
  271. package/dist/orchestration/steps/callback-step.js +76 -0
  272. package/dist/orchestration/steps/compare-step.d.ts +14 -0
  273. package/dist/orchestration/steps/compare-step.js +92 -0
  274. package/dist/orchestration/steps/discovery-report-step.d.ts +13 -0
  275. package/dist/orchestration/steps/discovery-report-step.js +55 -0
  276. package/dist/orchestration/steps/fetch-docs-shell.d.ts +17 -0
  277. package/dist/orchestration/steps/fetch-docs-shell.js +30 -0
  278. package/dist/orchestration/steps/fetch-docs-step.d.ts +14 -0
  279. package/dist/orchestration/steps/fetch-docs-step.js +135 -0
  280. package/dist/orchestration/steps/gap-analysis-step.d.ts +16 -0
  281. package/dist/orchestration/steps/gap-analysis-step.js +136 -0
  282. package/dist/orchestration/steps/generate-configs-step.d.ts +14 -0
  283. package/dist/orchestration/steps/generate-configs-step.js +85 -0
  284. package/dist/orchestration/steps/grader-consistency-step.d.ts +13 -0
  285. package/dist/orchestration/steps/grader-consistency-step.js +64 -0
  286. package/dist/orchestration/steps/index.d.ts +19 -0
  287. package/dist/orchestration/steps/index.js +19 -0
  288. package/dist/orchestration/steps/mirror-repo-tasks-step.d.ts +21 -0
  289. package/dist/orchestration/steps/mirror-repo-tasks-step.js +94 -0
  290. package/dist/orchestration/steps/publish-report-step.d.ts +26 -0
  291. package/dist/orchestration/steps/publish-report-step.js +216 -0
  292. package/dist/orchestration/steps/readiness-step.d.ts +13 -0
  293. package/dist/orchestration/steps/readiness-step.js +91 -0
  294. package/dist/orchestration/steps/report-step.d.ts +12 -0
  295. package/dist/orchestration/steps/report-step.js +49 -0
  296. package/dist/orchestration/steps/run-eval-step.d.ts +17 -0
  297. package/dist/orchestration/steps/run-eval-step.js +195 -0
  298. package/dist/orchestration/steps/validate-step.d.ts +12 -0
  299. package/dist/orchestration/steps/validate-step.js +41 -0
  300. package/dist/pipeline/agent-behavior-report.d.ts +53 -0
  301. package/dist/pipeline/agent-behavior-report.js +132 -0
  302. package/dist/pipeline/attribution.d.ts +47 -0
  303. package/dist/pipeline/attribution.js +226 -0
  304. package/dist/pipeline/baseline.d.ts +37 -0
  305. package/dist/pipeline/baseline.js +141 -0
  306. package/dist/pipeline/cache.d.ts +101 -0
  307. package/dist/pipeline/cache.js +283 -0
  308. package/dist/pipeline/calculate-scores.d.ts +102 -0
  309. package/dist/pipeline/calculate-scores.js +1128 -0
  310. package/dist/pipeline/callback-delivery.d.ts +50 -0
  311. package/dist/pipeline/callback-delivery.js +89 -0
  312. package/dist/pipeline/checks.d.ts +39 -0
  313. package/dist/pipeline/checks.js +280 -0
  314. package/dist/pipeline/classify-url.d.ts +61 -0
  315. package/dist/pipeline/classify-url.js +93 -0
  316. package/dist/pipeline/compare.d.ts +31 -0
  317. package/dist/pipeline/compare.js +208 -0
  318. package/dist/pipeline/coverage-audit.d.ts +39 -0
  319. package/dist/pipeline/coverage-audit.js +165 -0
  320. package/dist/pipeline/degradations.d.ts +85 -0
  321. package/dist/pipeline/degradations.js +242 -0
  322. package/dist/pipeline/discovery-report.d.ts +55 -0
  323. package/dist/pipeline/discovery-report.js +178 -0
  324. package/dist/pipeline/eval-constants.d.ts +68 -0
  325. package/dist/pipeline/eval-constants.js +111 -0
  326. package/dist/pipeline/eval-fingerprint.d.ts +66 -0
  327. package/dist/pipeline/eval-fingerprint.js +175 -0
  328. package/dist/pipeline/expand-tasks.d.ts +220 -0
  329. package/dist/pipeline/expand-tasks.js +421 -0
  330. package/dist/pipeline/failure-modes.d.ts +46 -0
  331. package/dist/pipeline/failure-modes.js +348 -0
  332. package/dist/pipeline/fetch-url-content.d.ts +44 -0
  333. package/dist/pipeline/fetch-url-content.js +93 -0
  334. package/dist/pipeline/gap-analysis.d.ts +48 -0
  335. package/dist/pipeline/gap-analysis.js +231 -0
  336. package/dist/pipeline/generate-configs.d.ts +72 -0
  337. package/dist/pipeline/generate-configs.js +395 -0
  338. package/dist/pipeline/grader-api.d.ts +49 -0
  339. package/dist/pipeline/grader-api.js +200 -0
  340. package/dist/pipeline/grader-compare-runner.d.ts +44 -0
  341. package/dist/pipeline/grader-compare-runner.js +301 -0
  342. package/dist/pipeline/grader-comparison.d.ts +111 -0
  343. package/dist/pipeline/grader-comparison.js +161 -0
  344. package/dist/pipeline/grader-consistency-runner.d.ts +60 -0
  345. package/dist/pipeline/grader-consistency-runner.js +270 -0
  346. package/dist/pipeline/grader-consistency.d.ts +103 -0
  347. package/dist/pipeline/grader-consistency.js +146 -0
  348. package/dist/pipeline/grader-sensitivity-runner.d.ts +40 -0
  349. package/dist/pipeline/grader-sensitivity-runner.js +282 -0
  350. package/dist/pipeline/grader-sensitivity.d.ts +94 -0
  351. package/dist/pipeline/grader-sensitivity.js +144 -0
  352. package/dist/pipeline/grader-validate-runner.d.ts +38 -0
  353. package/dist/pipeline/grader-validate-runner.js +229 -0
  354. package/dist/pipeline/grader-validation.d.ts +107 -0
  355. package/dist/pipeline/grader-validation.js +169 -0
  356. package/dist/pipeline/map-request-to-config.d.ts +19 -0
  357. package/dist/pipeline/map-request-to-config.js +80 -0
  358. package/dist/pipeline/measure-retrieval.d.ts +59 -0
  359. package/dist/pipeline/measure-retrieval.js +111 -0
  360. package/dist/pipeline/mirror-repo-tasks.d.ts +86 -0
  361. package/dist/pipeline/mirror-repo-tasks.js +350 -0
  362. package/dist/pipeline/plan-format.d.ts +33 -0
  363. package/dist/pipeline/plan-format.js +202 -0
  364. package/dist/pipeline/plan.d.ts +169 -0
  365. package/dist/pipeline/plan.js +708 -0
  366. package/dist/pipeline/pr-comment.d.ts +19 -0
  367. package/dist/pipeline/pr-comment.js +502 -0
  368. package/dist/pipeline/probe.d.ts +52 -0
  369. package/dist/pipeline/probe.js +390 -0
  370. package/dist/pipeline/provenance.d.ts +47 -0
  371. package/dist/pipeline/provenance.js +146 -0
  372. package/dist/pipeline/readiness-report.d.ts +87 -0
  373. package/dist/pipeline/readiness-report.js +205 -0
  374. package/dist/pipeline/release-classification.d.ts +54 -0
  375. package/dist/pipeline/release-classification.js +238 -0
  376. package/dist/pipeline/release-report.d.ts +37 -0
  377. package/dist/pipeline/release-report.js +222 -0
  378. package/dist/pipeline/repo-eval-comment.d.ts +37 -0
  379. package/dist/pipeline/repo-eval-comment.js +165 -0
  380. package/dist/pipeline/repo-threshold-evaluator.d.ts +89 -0
  381. package/dist/pipeline/repo-threshold-evaluator.js +162 -0
  382. package/dist/pipeline/resolve-mappings.d.ts +35 -0
  383. package/dist/pipeline/resolve-mappings.js +72 -0
  384. package/dist/pipeline/retrieval-metrics.d.ts +39 -0
  385. package/dist/pipeline/retrieval-metrics.js +136 -0
  386. package/dist/pipeline/reverse-mapping.d.ts +67 -0
  387. package/dist/pipeline/reverse-mapping.js +88 -0
  388. package/dist/pipeline/schemas.d.ts +9 -0
  389. package/dist/pipeline/schemas.js +9 -0
  390. package/dist/pipeline/steps/calculate-scores-step.d.ts +11 -0
  391. package/dist/pipeline/steps/calculate-scores-step.js +89 -0
  392. package/dist/pipeline/steps/compare-step.d.ts +18 -0
  393. package/dist/pipeline/steps/compare-step.js +90 -0
  394. package/dist/pipeline/steps/eval-step.d.ts +53 -0
  395. package/dist/pipeline/steps/eval-step.js +347 -0
  396. package/dist/pipeline/steps/fetch-docs-step.d.ts +11 -0
  397. package/dist/pipeline/steps/fetch-docs-step.js +84 -0
  398. package/dist/pipeline/steps/generate-configs-step.d.ts +11 -0
  399. package/dist/pipeline/steps/generate-configs-step.js +98 -0
  400. package/dist/pipeline/steps/grader-consistency-step.d.ts +21 -0
  401. package/dist/pipeline/steps/grader-consistency-step.js +74 -0
  402. package/dist/pipeline/steps/publish-report-step.d.ts +57 -0
  403. package/dist/pipeline/steps/publish-report-step.js +243 -0
  404. package/dist/pipeline/steps/report-step.d.ts +13 -0
  405. package/dist/pipeline/steps/report-step.js +56 -0
  406. package/dist/pipeline/steps/update-scores-step.d.ts +11 -0
  407. package/dist/pipeline/steps/update-scores-step.js +42 -0
  408. package/dist/pipeline/targeted-loo.d.ts +88 -0
  409. package/dist/pipeline/targeted-loo.js +203 -0
  410. package/dist/pipeline/thresholds.d.ts +27 -0
  411. package/dist/pipeline/thresholds.js +245 -0
  412. package/dist/pipeline/types.d.ts +10 -0
  413. package/dist/pipeline/types.js +10 -0
  414. package/dist/pipeline/validate.d.ts +67 -0
  415. package/dist/pipeline/validate.js +406 -0
  416. package/dist/pipeline/webhook-server.d.ts +37 -0
  417. package/dist/pipeline/webhook-server.js +133 -0
  418. package/dist/report-store.d.ts +84 -0
  419. package/dist/report-store.js +208 -0
  420. package/dist/sanity/client.d.ts +38 -0
  421. package/dist/sanity/client.js +86 -0
  422. package/dist/sanity/portable-text.d.ts +11 -0
  423. package/dist/sanity/portable-text.js +211 -0
  424. package/dist/sanity/queries.d.ts +133 -0
  425. package/dist/sanity/queries.js +300 -0
  426. package/dist/schedules/digest.d.ts +116 -0
  427. package/dist/schedules/digest.js +156 -0
  428. package/dist/schedules/index.d.ts +12 -0
  429. package/dist/schedules/index.js +10 -0
  430. package/dist/schedules/loader.d.ts +31 -0
  431. package/dist/schedules/loader.js +73 -0
  432. package/dist/schedules/schema.d.ts +9 -0
  433. package/dist/schedules/schema.js +9 -0
  434. package/dist/scripts/agent-behavior-report.d.ts +19 -0
  435. package/dist/scripts/agent-behavior-report.js +315 -0
  436. package/dist/scripts/baseline.d.ts +43 -0
  437. package/dist/scripts/baseline.js +267 -0
  438. package/dist/scripts/calculate-scores.d.ts +166 -0
  439. package/dist/scripts/calculate-scores.js +1296 -0
  440. package/dist/scripts/compare.d.ts +22 -0
  441. package/dist/scripts/compare.js +334 -0
  442. package/dist/scripts/coverage-audit.d.ts +44 -0
  443. package/dist/scripts/coverage-audit.js +209 -0
  444. package/dist/scripts/debug-eval.d.ts +19 -0
  445. package/dist/scripts/debug-eval.js +73 -0
  446. package/dist/scripts/discovery-report.d.ts +58 -0
  447. package/dist/scripts/discovery-report.js +250 -0
  448. package/dist/scripts/fetch-docs.d.ts +35 -0
  449. package/dist/scripts/fetch-docs.js +472 -0
  450. package/dist/scripts/generate-configs.d.ts +66 -0
  451. package/dist/scripts/generate-configs.js +459 -0
  452. package/dist/scripts/grader-api.d.ts +27 -0
  453. package/dist/scripts/grader-api.js +206 -0
  454. package/dist/scripts/grader-compare.d.ts +22 -0
  455. package/dist/scripts/grader-compare.js +368 -0
  456. package/dist/scripts/grader-consistency.d.ts +20 -0
  457. package/dist/scripts/grader-consistency.js +313 -0
  458. package/dist/scripts/grader-sensitivity.d.ts +22 -0
  459. package/dist/scripts/grader-sensitivity.js +354 -0
  460. package/dist/scripts/grader-validate.d.ts +19 -0
  461. package/dist/scripts/grader-validate.js +267 -0
  462. package/dist/scripts/measure-retrieval.d.ts +10 -0
  463. package/dist/scripts/measure-retrieval.js +145 -0
  464. package/dist/scripts/migrate-tasks-to-content-lake.d.ts +24 -0
  465. package/dist/scripts/migrate-tasks-to-content-lake.js +327 -0
  466. package/dist/scripts/pipeline.d.ts +76 -0
  467. package/dist/scripts/pipeline.js +1031 -0
  468. package/dist/scripts/pr-comment.d.ts +10 -0
  469. package/dist/scripts/pr-comment.js +510 -0
  470. package/dist/scripts/readiness-report.d.ts +88 -0
  471. package/dist/scripts/readiness-report.js +342 -0
  472. package/dist/scripts/update-quality-scores.d.ts +15 -0
  473. package/dist/scripts/update-quality-scores.js +184 -0
  474. package/dist/scripts/validate-task-sources.d.ts +21 -0
  475. package/dist/scripts/validate-task-sources.js +210 -0
  476. package/dist/scripts/validate.d.ts +13 -0
  477. package/dist/scripts/validate.js +79 -0
  478. package/dist/scripts/webhook-server.d.ts +26 -0
  479. package/dist/scripts/webhook-server.js +147 -0
  480. package/dist/scripts/weekly-digest.d.ts +24 -0
  481. package/dist/scripts/weekly-digest.js +144 -0
  482. package/dist/sinks/bigquery/index.d.ts +131 -0
  483. package/dist/sinks/bigquery/index.js +222 -0
  484. package/dist/sinks/format-slack.d.ts +64 -0
  485. package/dist/sinks/format-slack.js +306 -0
  486. package/dist/sinks/index.d.ts +23 -0
  487. package/dist/sinks/index.js +18 -0
  488. package/dist/sinks/loader.d.ts +18 -0
  489. package/dist/sinks/loader.js +82 -0
  490. package/dist/sinks/retry.d.ts +24 -0
  491. package/dist/sinks/retry.js +52 -0
  492. package/dist/sinks/schema.d.ts +9 -0
  493. package/dist/sinks/schema.js +9 -0
  494. package/dist/sinks/slack/format.d.ts +65 -0
  495. package/dist/sinks/slack/format.js +327 -0
  496. package/dist/sinks/slack/index.d.ts +27 -0
  497. package/dist/sinks/slack/index.js +78 -0
  498. package/dist/sinks/slack-sink.d.ts +27 -0
  499. package/dist/sinks/slack-sink.js +78 -0
  500. package/dist/sinks/types.d.ts +59 -0
  501. package/dist/sinks/types.js +44 -0
  502. package/dist/sinks/webhook/index.d.ts +19 -0
  503. package/dist/sinks/webhook/index.js +50 -0
  504. package/dist/sinks/webhook-sink.d.ts +19 -0
  505. package/dist/sinks/webhook-sink.js +50 -0
  506. package/dist/sources.d.ts +104 -0
  507. package/dist/sources.js +292 -0
  508. package/dist/webhook/budget.d.ts +42 -0
  509. package/dist/webhook/budget.js +60 -0
  510. package/dist/webhook/debounce.d.ts +67 -0
  511. package/dist/webhook/debounce.js +76 -0
  512. package/dist/webhook/dispatch.d.ts +45 -0
  513. package/dist/webhook/dispatch.js +84 -0
  514. package/dist/webhook/eval-request-handler.d.ts +87 -0
  515. package/dist/webhook/eval-request-handler.js +181 -0
  516. package/dist/webhook/handler.d.ts +88 -0
  517. package/dist/webhook/handler.js +203 -0
  518. package/dist/webhook/index.d.ts +17 -0
  519. package/dist/webhook/index.js +12 -0
  520. package/dist/webhook/types.d.ts +109 -0
  521. package/dist/webhook/types.js +10 -0
  522. package/package.json +72 -0
  523. package/tasks/.expanded.agentic.yaml +51 -0
  524. package/tasks/.expanded.yaml +66 -0
  525. package/tasks/frameworks.yaml +98 -0
  526. package/tasks/functions.yaml +51 -0
  527. package/tasks/groq.yaml +216 -0
  528. package/tasks/nextjs-live.yaml +62 -0
  529. package/tasks/studio-setup.yaml +111 -0
  530. package/tasks/visual-editing.yaml +120 -0
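
The connector config in the hunk that follows describes incremental sync with `_createdAt` as the cursor: each poll asks only for reports created after the previous cursor value and up to the end of the current interval. As a rough illustration of that windowing, the TypeScript sketch below builds the same kind of GROQ filter by hand. The names `SyncWindow` and `buildReportFilter` are hypothetical and do not come from the package or from Airbyte; only the filter shape and the `1970-01-01T00:00:00Z` fallback are taken from the config.

```typescript
// Illustrative sketch only: SyncWindow and buildReportFilter are hypothetical
// names, not part of @sanity/ailf or the Airbyte CDK.

interface SyncWindow {
  /** Exclusive lower bound (previous cursor value); absent on the first sync. */
  startTime?: string;
  /** Inclusive upper bound of the current sync interval. */
  endTime: string;
}

// Fallback lower bound, matching the default in the connector's query template.
const EPOCH = "1970-01-01T00:00:00Z";

function buildReportFilter(interval: SyncWindow): string {
  const start = interval.startTime || EPOCH;
  // JSON.stringify yields a double-quoted, escaped string literal.
  return (
    `*[_type=="ailf.report" && _createdAt > ${JSON.stringify(start)} && ` +
    `_createdAt <= ${JSON.stringify(interval.endTime)}]|order(_createdAt asc)`
  );
}

// First sync: no stored cursor, so the window opens at the epoch.
console.log(buildReportFilter({ endTime: "2026-01-02T00:00:00Z" }));

// Later sync: resume strictly after the last stored cursor value.
console.log(
  buildReportFilter({
    startTime: "2026-01-01T00:00:00Z",
    endTime: "2026-01-02T00:00:00Z",
  }),
);
```

Because the lower bound is exclusive and the upper bound inclusive, consecutive windows do not overlap, so a report created exactly on a boundary should be picked up by one sync only.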
package/config/airbyte/ai_literacy_framework.connector.yaml +639 -0
@@ -0,0 +1,639 @@
+# Airbyte Declarative Source — AI Literacy Framework
+#
+# Extracts evaluation reports from the Sanity Content Lake and delivers them
+# to BigQuery (or any Airbyte-supported destination).
+#
+# Architecture: Sanity Content Lake → Airbyte (scheduled poll) → BigQuery
+# This replaces the direct BigQuerySink with ELT managed by the data team.
+#
+# Two streams:
+#   1. reports — one row per evaluation run, GROQ-projected to flat columns
+#   2. area_scores — one row per report with nested model×area scores;
+#      use BigQuery views (see bigquery/views/) to UNNEST into flat rows
+#
+# Both streams use incremental sync with _createdAt as the cursor, so only
+# new reports are transferred on each sync.
+#
+# @see docs/design-docs/report-store/bigquery.md — target schema
+# @see docs/design-docs/report-store/airbyte-elt.md — integration design
+version: 6.48.15
+
+type: DeclarativeSource
+
+check:
+  type: CheckStream
+  stream_names:
+    - reports
+
+definitions:
+  streams:
+    # ------------------------------------------------------------------
+    # Stream 1: reports — flat row per evaluation run
+    # ------------------------------------------------------------------
+    # GROQ projection flattens nested provenance/summary into top-level
+    # columns matching the ailf.reports BigQuery table schema.
+    # The comparison field is intentionally excluded — it duplicates
+    # entire ScoreSummary objects and can always be recomputed from
+    # two report rows.
+    reports:
+      type: DeclarativeStream
+      name: reports
+      retriever:
+        type: SimpleRetriever
+        decoder:
+          type: JsonDecoder
+        requester:
+          $ref: "#/definitions/base_requester"
+          path: /v2026-03-12/data/query/{{ config['dataset'] }}
+          http_method: GET
+          request_parameters:
+            query: >-
+              *[_type=="ailf.report" && _createdAt > "{{
+              stream_interval.start_time or '1970-01-01T00:00:00Z' }}" &&
+              _createdAt <= "{{ stream_interval.end_time }}" ]|order(_createdAt
+              asc){
+                "report_id": reportId,
+                "completed_at": completedAt,
+                "duration_ms": durationMs,
+                tag,
+                "mode": provenance.mode,
+                "source_name": provenance.source.name,
+                "source_base_url": provenance.source.baseUrl,
+                "source_dataset": provenance.source.dataset,
+                "source_perspective": provenance.source.perspective,
+                "grader_model": provenance.graderModel,
+                "trigger_type": provenance.trigger.type,
+                "trigger_caller_repo": select(
+                  provenance.trigger.type == "cross-repo" =>
+                  provenance.trigger.callerRepo,
+                  null
+                ),
+                "git_repo": provenance.git.repo,
+                "git_branch": provenance.git.branch,
+                "git_sha": provenance.git.sha,
+                "git_pr_number": provenance.git.prNumber,
+                "avg_score": summary.overall.avgScore,
+                "avg_doc_lift": summary.overall.avgDocLift,
+                "total_cost": summary.overall.cost.total,
+                "grader_cost": summary.overall.cost.graderTotal,
+                "area_count": count(provenance.areas),
+                "model_count": count(provenance.models),
+                "areas": provenance.areas,
+                "models": provenance.models[].id,
+                "avg_actual_score": summary.overall.avgActualScore,
+                "avg_retrieval_gap": summary.overall.avgRetrievalGap,
+                "avg_infrastructure_efficiency":
+                  summary.overall.avgInfrastructureEfficiency,
+                "promptfoo_url": provenance.promptfooUrl,
+                "promptfoo_urls": provenance.promptfooUrls[] { mode, url },
89
+ _createdAt
90
+ }
91
+ record_selector:
92
+ type: RecordSelector
93
+ extractor:
94
+ type: DpathExtractor
95
+ field_path:
96
+ - result
97
+ primary_key:
98
+ - report_id
99
+ incremental_sync:
100
+ type: DatetimeBasedCursor
101
+ cursor_field: _createdAt
102
+ cursor_datetime_formats:
103
+ - "%Y-%m-%dT%H:%M:%S.%fZ"
104
+ - "%Y-%m-%dT%H:%M:%SZ"
105
+ datetime_format: "%Y-%m-%dT%H:%M:%SZ"
106
+ start_datetime:
107
+ type: MinMaxDatetime
108
+ datetime: "{{ config.get('start_date', '2026-01-01T00:00:00Z') }}"
109
+ datetime_format: "%Y-%m-%dT%H:%M:%SZ"
110
+ step: P30D
111
+ cursor_granularity: PT1S
112
+ schema_loader:
113
+ type: InlineSchemaLoader
114
+ schema:
115
+ $ref: "#/schemas/reports"
116
+
117
+ # ------------------------------------------------------------------
118
+     # Stream 2: area_scores (one row per report, nested per-model per-area scores)
119
+ # ------------------------------------------------------------------
120
+ # GROQ extracts the nested perModel→scores arrays with report-level
121
+ # context (report_id, completed_at, mode, source_name). The nesting
122
+ # is preserved because GROQ cannot explode arrays into flat rows.
123
+ #
124
+ # BigQuery consumers should query the `ailf.area_scores` view
125
+ # (defined in bigquery/views/area_scores.sql) which UNNESTs the
126
+ # nested arrays into one flat row per area per model per report.
127
+ area_scores:
128
+ type: DeclarativeStream
129
+ name: area_scores
130
+ retriever:
131
+ type: SimpleRetriever
132
+ decoder:
133
+ type: JsonDecoder
134
+ requester:
135
+ $ref: "#/definitions/base_requester"
136
+ path: /v2026-03-12/data/query/{{ config['dataset'] }}
137
+ http_method: GET
138
+ request_parameters:
139
+ query: >-
140
+ *[_type=="ailf.report" && _createdAt > "{{
141
+ stream_interval.start_time or '1970-01-01T00:00:00Z' }}" &&
142
+ _createdAt <= "{{ stream_interval.end_time }}" ]|order(_createdAt
143
+ asc){
144
+ "report_id": reportId,
145
+ "completed_at": completedAt,
146
+ "mode": provenance.mode,
147
+ "source_name": provenance.source.name,
148
+ "model_scores": summary.perModel[]{
149
+ "model_id": modelId,
150
+ "areas": scores[]{
151
+ "area": feature,
152
+ "total_score": totalScore,
153
+ "task_completion": taskCompletion,
154
+ "code_correctness": codeCorrectness,
155
+ "doc_coverage": docCoverage,
156
+ "doc_lift": coalesce(docLift, liftFromDocs),
157
+ "ceiling_score": coalesce(ceilingScore, withDocsScore),
158
+ "floor_score": coalesce(floorScore, withoutDocsScore),
159
+ "actual_score": actualScore,
160
+ "retrieval_gap": retrievalGap,
161
+ "infrastructure_efficiency": infrastructureEfficiency,
162
+ "total_cost": totalCost,
163
+ "test_count": testCount
164
+ }
165
+ },
166
+ "fallback_model_id": provenance.models[0].id,
167
+ "fallback_scores": summary.scores[]{
168
+ "area": feature,
169
+ "total_score": totalScore,
170
+ "task_completion": taskCompletion,
171
+ "code_correctness": codeCorrectness,
172
+ "doc_coverage": docCoverage,
173
+ "doc_lift": coalesce(docLift, liftFromDocs),
174
+ "ceiling_score": coalesce(ceilingScore, withDocsScore),
175
+ "floor_score": coalesce(floorScore, withoutDocsScore),
176
+ "actual_score": actualScore,
177
+ "retrieval_gap": retrievalGap,
178
+ "infrastructure_efficiency": infrastructureEfficiency,
179
+ "total_cost": totalCost,
180
+ "test_count": testCount
181
+ },
182
+ _createdAt
183
+ }
184
+ record_selector:
185
+ type: RecordSelector
186
+ extractor:
187
+ type: DpathExtractor
188
+ field_path:
189
+ - result
190
+ primary_key:
191
+ - report_id
192
+ incremental_sync:
193
+ type: DatetimeBasedCursor
194
+ cursor_field: _createdAt
195
+ cursor_datetime_formats:
196
+ - "%Y-%m-%dT%H:%M:%S.%fZ"
197
+ - "%Y-%m-%dT%H:%M:%SZ"
198
+ datetime_format: "%Y-%m-%dT%H:%M:%SZ"
199
+ start_datetime:
200
+ type: MinMaxDatetime
201
+ datetime: "{{ config.get('start_date', '2026-01-01T00:00:00Z') }}"
202
+ datetime_format: "%Y-%m-%dT%H:%M:%SZ"
203
+ step: P30D
204
+ cursor_granularity: PT1S
205
+ schema_loader:
206
+ type: InlineSchemaLoader
207
+ schema:
208
+ $ref: "#/schemas/area_scores"
209
+
210
+ base_requester:
211
+ type: HttpRequester
212
+ url_base: https://{{ config['project_id'] }}.api.sanity.io
213
+ authenticator:
214
+ type: BearerAuthenticator
215
+ api_token: "{{ config['api_key'] }}"
216
+
217
+ streams:
218
+ - $ref: "#/definitions/streams/reports"
219
+ - $ref: "#/definitions/streams/area_scores"
220
+
221
+ spec:
222
+ type: Spec
223
+ connection_specification:
224
+ type: object
225
+ $schema: http://json-schema.org/draft-07/schema#
226
+ required:
227
+ - api_key
228
+ - dataset
229
+ - project_id
230
+ properties:
231
+ api_key:
232
+ type: string
233
+ order: 0
234
+ title: Sanity API Token
235
+ description: >-
236
+ A Sanity API token with read access to the dataset containing
237
+ ailf.report documents. Generate one at
238
+ https://www.sanity.io/manage/project/<project_id>/api#tokens
239
+ airbyte_secret: true
240
+ dataset:
241
+ type: string
242
+ order: 1
243
+ title: Dataset
244
+ description: >-
245
+ The Sanity dataset containing evaluation reports.
246
+ default: next
247
+ project_id:
248
+ type: string
249
+ order: 2
250
+ title: Project ID
251
+ description: >-
252
+ The Sanity project ID (e.g., "3do82whm").
253
+ default: 3do82whm
254
+ start_date:
255
+ type: string
256
+ order: 3
257
+ title: Start Date
258
+ description: >-
259
+ Only sync reports created after this date (ISO 8601). Defaults to
260
+ 2026-01-01 if not set.
261
+ default: "2026-01-01T00:00:00Z"
262
+ examples:
263
+ - "2026-01-01T00:00:00Z"
264
+ - "2026-06-01T00:00:00Z"
265
+ additionalProperties: true
266
+
267
+ metadata:
268
+ assist: {}
269
+ testedStreams:
270
+ reports:
271
+ hasRecords: true
272
+ streamHash: null
273
+ hasResponse: true
274
+ primaryKeysAreUnique: true
275
+ primaryKeysArePresent: true
276
+ responsesAreSuccessful: true
277
+ area_scores:
278
+ hasRecords: true
279
+ streamHash: null
280
+ hasResponse: true
281
+ primaryKeysAreUnique: true
282
+ primaryKeysArePresent: true
283
+ responsesAreSuccessful: true
284
+ autoImportSchema:
285
+ reports: false
286
+ area_scores: false
287
+
288
+ # ======================================================================
289
+ # Inline schemas — manually defined to match the designed BigQuery tables.
290
+ # autoImportSchema is OFF so these don't drift with Sanity document changes.
291
+ # ======================================================================
292
+
293
+ schemas:
294
+ # ------------------------------------------------------------------
295
+ # reports schema — flat, matches ailf.reports BigQuery table
296
+ # ------------------------------------------------------------------
297
+ reports:
298
+ type: object
299
+ $schema: http://json-schema.org/schema#
300
+ required:
301
+ - report_id
302
+ properties:
303
+ report_id:
304
+ type: string
305
+ description: UUID v7 report identifier (primary key)
306
+ completed_at:
307
+ type:
308
+ - string
309
+ - "null"
310
+ description: ISO 8601 timestamp when the evaluation completed
311
+ duration_ms:
312
+ type:
313
+ - number
314
+ - "null"
315
+ description: Pipeline execution time in milliseconds
316
+ tag:
317
+ type:
318
+ - string
319
+ - "null"
320
+ description: Optional human-supplied label
321
+ mode:
322
+ type:
323
+ - string
324
+ - "null"
325
+ description: "Evaluation mode: baseline, observed, or agentic"
326
+ source_name:
327
+ type:
328
+ - string
329
+ - "null"
330
+ description: Documentation source name (e.g., "production")
331
+ source_base_url:
332
+ type:
333
+ - string
334
+ - "null"
335
+ description: Documentation source base URL
336
+ source_dataset:
337
+ type:
338
+ - string
339
+ - "null"
340
+ description: Sanity dataset used for evaluation
341
+ source_perspective:
342
+ type:
343
+ - string
344
+ - "null"
345
+ description: Sanity perspective (for content release evaluations)
346
+ grader_model:
347
+ type:
348
+ - string
349
+ - "null"
350
+ description: Model used for LLM grading
351
+ trigger_type:
352
+ type:
353
+ - string
354
+ - "null"
355
+ description:
356
+ "What triggered the evaluation: manual, ci, scheduled, webhook,
357
+ cross-repo"
358
+ trigger_caller_repo:
359
+ type:
360
+ - string
361
+ - "null"
362
+ description: Caller repository for cross-repo triggers
363
+ git_repo:
364
+ type:
365
+ - string
366
+ - "null"
367
+ description: Source repository (when run from CI)
368
+ git_branch:
369
+ type:
370
+ - string
371
+ - "null"
372
+ description: Source branch
373
+ git_sha:
374
+ type:
375
+ - string
376
+ - "null"
377
+ description: Commit SHA
378
+ git_pr_number:
379
+ type:
380
+ - number
381
+ - "null"
382
+ description: Pull request number (if applicable)
383
+ avg_score:
384
+ type:
385
+ - number
386
+ - "null"
387
+ description: Overall average AI literacy score (0–100)
388
+ avg_doc_lift:
389
+ type:
390
+ - number
391
+ - "null"
392
+ description: Overall documentation lift score
393
+ total_cost:
394
+ type:
395
+ - number
396
+ - "null"
397
+ description: Total evaluation cost in USD
398
+ grader_cost:
399
+ type:
400
+ - number
401
+ - "null"
402
+ description: Grader model cost in USD
403
+ area_count:
404
+ type:
405
+ - number
406
+ - "null"
407
+ description: Number of feature areas evaluated
408
+ model_count:
409
+ type:
410
+ - number
411
+ - "null"
412
+ description: Number of models evaluated
413
+ areas:
414
+ type:
415
+ - array
416
+ - "null"
417
+ items:
418
+ type: string
419
+ description: List of evaluated feature area names
420
+ models:
421
+ type:
422
+ - array
423
+ - "null"
424
+ items:
425
+ type: string
426
+ description: List of evaluated model IDs
427
+ avg_actual_score:
428
+ type:
429
+ - number
430
+ - "null"
431
+ description: Average score from agent-retrieved docs (full-mode only)
432
+ avg_retrieval_gap:
433
+ type:
434
+ - number
435
+ - "null"
436
+ description: Average ceiling minus actual across areas (full-mode only)
437
+ avg_infrastructure_efficiency:
438
+ type:
439
+ - number
440
+ - "null"
441
+ description: Average actual/ceiling ratio across areas (full-mode only)
442
+ promptfoo_url:
443
+ type:
444
+ - string
445
+ - "null"
446
+ description: Legacy single Promptfoo share URL
447
+ promptfoo_urls:
448
+ type:
449
+ - array
450
+ - "null"
451
+ description: Per-mode Promptfoo share URLs (one per sub-eval)
452
+ items:
453
+ type: object
454
+ properties:
455
+ mode:
456
+ type: string
457
+ description: "Evaluation mode: baseline, agentic, observed"
458
+ url:
459
+ type: string
460
+ description: Promptfoo share URL for this mode
461
+ _createdAt:
462
+ type:
463
+ - string
464
+ - "null"
465
+ description:
466
+ Sanity document creation timestamp (used as incremental cursor)
467
+ additionalProperties: true
468
+
469
+ # ------------------------------------------------------------------
470
+ # area_scores schema — nested model→area scores for BigQuery UNNEST
471
+ # ------------------------------------------------------------------
472
+ area_scores:
473
+ type: object
474
+ $schema: http://json-schema.org/schema#
475
+ required:
476
+ - report_id
477
+ properties:
478
+ report_id:
479
+ type: string
480
+ description: FK to reports table (UUID v7)
481
+ completed_at:
482
+ type:
483
+ - string
484
+ - "null"
485
+ description: Denormalized timestamp for partitioning
486
+ mode:
487
+ type:
488
+ - string
489
+ - "null"
490
+ description: Denormalized evaluation mode for clustering
491
+ source_name:
492
+ type:
493
+ - string
494
+ - "null"
495
+ description: Denormalized source name for clustering
496
+ model_scores:
497
+ type:
498
+ - array
499
+ - "null"
500
+ description:
501
+ Per-model score breakdowns (UNNEST in BigQuery to get flat rows)
502
+ items:
503
+ type: object
504
+ properties:
505
+ model_id:
506
+ type:
507
+ - string
508
+ - "null"
509
+ areas:
510
+ type:
511
+ - array
512
+ - "null"
513
+ items:
514
+ type: object
515
+ properties:
516
+ area:
517
+ type:
518
+ - string
519
+ - "null"
520
+ total_score:
521
+ type:
522
+ - number
523
+ - "null"
524
+ task_completion:
525
+ type:
526
+ - number
527
+ - "null"
528
+ code_correctness:
529
+ type:
530
+ - number
531
+ - "null"
532
+ doc_coverage:
533
+ type:
534
+ - number
535
+ - "null"
536
+ doc_lift:
537
+ type:
538
+ - number
539
+ - "null"
540
+ ceiling_score:
541
+ type:
542
+ - number
543
+ - "null"
544
+ floor_score:
545
+ type:
546
+ - number
547
+ - "null"
548
+ total_cost:
549
+ type:
550
+ - number
551
+ - "null"
552
+ test_count:
553
+ type:
554
+ - number
555
+ - "null"
556
+ actual_score:
557
+ type:
558
+ - number
559
+ - "null"
560
+ retrieval_gap:
561
+ type:
562
+ - number
563
+ - "null"
564
+ infrastructure_efficiency:
565
+ type:
566
+ - number
567
+ - "null"
568
+ fallback_model_id:
569
+ type:
570
+ - string
571
+ - "null"
572
+ description:
573
+ First model ID from provenance (used when perModel is absent)
574
+ fallback_scores:
575
+ type:
576
+ - array
577
+ - "null"
578
+ description: Aggregate scores without per-model breakdown (fallback)
579
+ items:
580
+ type: object
581
+ properties:
582
+ area:
583
+ type:
584
+ - string
585
+ - "null"
586
+ total_score:
587
+ type:
588
+ - number
589
+ - "null"
590
+ task_completion:
591
+ type:
592
+ - number
593
+ - "null"
594
+ code_correctness:
595
+ type:
596
+ - number
597
+ - "null"
598
+ doc_coverage:
599
+ type:
600
+ - number
601
+ - "null"
602
+ doc_lift:
603
+ type:
604
+ - number
605
+ - "null"
606
+ ceiling_score:
607
+ type:
608
+ - number
609
+ - "null"
610
+ floor_score:
611
+ type:
612
+ - number
613
+ - "null"
614
+ total_cost:
615
+ type:
616
+ - number
617
+ - "null"
618
+ test_count:
619
+ type:
620
+ - number
621
+ - "null"
622
+ actual_score:
623
+ type:
624
+ - number
625
+ - "null"
626
+ retrieval_gap:
627
+ type:
628
+ - number
629
+ - "null"
630
+ infrastructure_efficiency:
631
+ type:
632
+ - number
633
+ - "null"
634
+ _createdAt:
635
+ type:
636
+ - string
637
+ - "null"
638
+ description: Sanity document creation timestamp (incremental cursor)
639
+ additionalProperties: true
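The incremental-sync settings in the connector above can be read as follows. This is a minimal Python sketch, not the Airbyte CDK's own code: `parse_cursor`, `format_cursor`, and `window_filter` are hypothetical helper names, and it assumes that the two `cursor_datetime_formats` are tried in order while `datetime_format` is used to render window bounds into the GROQ filter.

```python
from datetime import datetime, timezone

# Formats copied from cursor_datetime_formats in the connector YAML.
CURSOR_FORMATS = ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"]
# Format copied from datetime_format (second precision).
WINDOW_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_cursor(value: str) -> datetime:
    """Parse a _createdAt value, trying each accepted format in order (assumed behavior)."""
    for fmt in CURSOR_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized cursor value: {value!r}")


def format_cursor(moment: datetime) -> str:
    """Render a window bound the way it would appear inside the GROQ filter."""
    return moment.strftime(WINDOW_FORMAT)


def window_filter(start: datetime, end: datetime) -> str:
    """Build the filter part of the query: exclusive lower bound, inclusive upper bound."""
    return (
        '*[_type=="ailf.report"'
        f' && _createdAt > "{format_cursor(start)}"'
        f' && _createdAt <= "{format_cursor(end)}"]'
    )


if __name__ == "__main__":
    start = parse_cursor("2026-01-01T00:00:00Z")
    end = parse_cursor("2026-01-31T00:00:00.000Z")
    print(window_filter(start, end))
```

Both timestamp shapes parse to the same kind of value, and fractional seconds are dropped when a bound is rendered back into the query. The filter's lower bound is exclusive and its upper bound is inclusive, matching the `>` and `<=` comparisons in the stream definitions.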
@@ -0,0 +1,74 @@
1
+ # BigQuery Schema & Views
2
+
3
+ SQL definitions for the BigQuery analytics layer. These create the flattened
4
+ tables and views that power SQL analytics, BI dashboards (Looker, Sheets), and
5
+ ad-hoc data exploration.
6
+
7
+ ## Architecture
8
+
9
+ ```
10
+ Sanity Content Lake
11
+
12
+
13
+ Airbyte (scheduled sync)
14
+ ├─ Stream: reports → ailf_raw.reports (flat, GROQ-projected)
15
+ └─ Stream: area_scores → ailf_raw.area_scores (nested model→area arrays)
16
+
17
+
18
+ BigQuery views (this directory)
19
+ ├─ ailf.reports → direct passthrough (already flat from GROQ projection)
20
+ └─ ailf.area_scores → UNNEST flattening (one row per area per model per report)
21
+ ```
22
+
23
+ ## Files
24
+
25
+ | File | Purpose |
26
+ | ----------------------- | ------------------------------------------------------------------------------- |
27
+ | `views/area_scores.sql` | Flattens nested `model_scores` array into one row per area per model per report |
28
+ | `views/reports.sql` | Clean passthrough view with correct types and column ordering |
29
+
30
+ ## Setup
31
+
32
+ The Airbyte connection loads data into a raw dataset (e.g., `ailf_raw`). The
33
+ views defined here reference that raw dataset and present the designed schema
34
+ from `docs/design-docs/report-store/bigquery.md`.
35
+
36
+ ### 1. Create the raw dataset (Airbyte writes here)
37
+
38
+ ```bash
39
+ bq mk --dataset data-platform-302218:ailf_raw
40
+ ```
41
+
42
+ ### 2. Create the analytics dataset (views live here)
43
+
44
+ ```bash
45
+ bq mk --dataset data-platform-302218:ailf
46
+ ```
47
+
48
+ ### 3. Create the views
49
+
50
+ ```bash
51
+ bq query --use_legacy_sql=false < views/reports.sql
52
+ bq query --use_legacy_sql=false < views/area_scores.sql
53
+ ```
54
+
55
+ ## Naming conventions
56
+
57
+ - **`ailf_raw.*`** — raw Airbyte-loaded tables (nested JSON, Airbyte metadata
58
+ columns)
59
+ - **`ailf.*`** — analytics views (flat, typed, designed schema from bigquery.md)
60
+
61
+ Airbyte adds metadata columns (`_airbyte_raw_id`, `_airbyte_extracted_at`,
62
+ `_airbyte_meta`) to its output tables. The views strip these so downstream
63
+ consumers see only the designed schema.
64
+
65
+ ## Schema evolution
66
+
67
+ Views are the transformation layer. When the report format evolves:
68
+
69
+ 1. Update the GROQ projections in the Airbyte connector YAML
70
+ 2. Update the view SQL to map new fields
71
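+ 3. No backfill of the views is needed: they are evaluated at query time and always reflect the current raw tables. Rows synced before a projection change gain the new fields only after those reports are re-synced.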
72
+
73
+ @see docs/design-docs/report-store/bigquery.md — canonical schema definition
74
+ @see packages/eval/config/airbyte/ — Airbyte connector configuration
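The flattening that the `ailf.area_scores` view is described as performing (one row per area per model per report) can be illustrated outside SQL. The view's actual SQL is not shown in this section, so the following is only a Python sketch of the described shape: `flatten_area_scores` is a hypothetical helper, the sample model IDs are invented, and the fallback rule (use `fallback_scores` with `fallback_model_id` when `model_scores` is empty) is an assumption drawn from the field descriptions in the connector schema.

```python
from typing import Any, Dict, List

# Report-level columns that are repeated on every flattened row.
CONTEXT_FIELDS = ("report_id", "completed_at", "mode", "source_name")


def flatten_area_scores(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one nested area_scores record into flat per-model, per-area rows."""
    context = {name: record.get(name) for name in CONTEXT_FIELDS}
    rows: List[Dict[str, Any]] = []

    model_scores = record.get("model_scores") or []
    if model_scores:
        for model in model_scores:
            for area in model.get("areas") or []:
                rows.append({**context, "model_id": model.get("model_id"), **area})
        return rows

    # Assumed fallback: no per-model breakdown, so attribute the aggregate
    # scores to the first model ID recorded in provenance.
    for area in record.get("fallback_scores") or []:
        rows.append({**context, "model_id": record.get("fallback_model_id"), **area})
    return rows


if __name__ == "__main__":
    sample = {
        "report_id": "r1",
        "completed_at": "2026-03-12T10:15:30Z",
        "mode": "baseline",
        "source_name": "production",
        "model_scores": [
            {"model_id": "model-a", "areas": [{"area": "groq", "total_score": 80}]},
            {"model_id": "model-b", "areas": [{"area": "groq", "total_score": 60}]},
        ],
    }
    for row in flatten_area_scores(sample):
        print(row)
```

Each output row carries the denormalized report context next to the per-area metrics, which is the shape a BI tool can filter and aggregate without handling nested arrays.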