@ls-stack/agent-eval 0.45.2 → 0.46.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (311)
  1. package/dist/apps/web/dist/assets/abap-BdImnpbu.js +1 -0
  2. package/dist/apps/web/dist/assets/actionscript-3-CoDkCxhg.js +1 -0
  3. package/dist/apps/web/dist/assets/ada-bCR0ucgS.js +1 -0
  4. package/dist/apps/web/dist/assets/andromeeda-C4gqWexZ.js +1 -0
  5. package/dist/apps/web/dist/assets/angular-html-CU67Zn6k.js +1 -0
  6. package/dist/apps/web/dist/assets/angular-ts-BwZT4LLn.js +1 -0
  7. package/dist/apps/web/dist/assets/apache-Pmp26Uib.js +1 -0
  8. package/dist/apps/web/dist/assets/apex-D8_7TLub.js +1 -0
  9. package/dist/apps/web/dist/assets/apl-dKokRX4l.js +1 -0
  10. package/dist/apps/web/dist/assets/applescript-Co6uUVPk.js +1 -0
  11. package/dist/apps/web/dist/assets/ara-BRHolxvo.js +1 -0
  12. package/dist/apps/web/dist/assets/asciidoc-Ve4PFQV2.js +1 -0
  13. package/dist/apps/web/dist/assets/asm-D_Q5rh1f.js +1 -0
  14. package/dist/apps/web/dist/assets/astro-CbQHKStN.js +1 -0
  15. package/dist/apps/web/dist/assets/aurora-x-D-2ljcwZ.js +1 -0
  16. package/dist/apps/web/dist/assets/awk-DMzUqQB5.js +1 -0
  17. package/dist/apps/web/dist/assets/ayu-dark-DYE7WIF3.js +1 -0
  18. package/dist/apps/web/dist/assets/ayu-light-BA47KaF1.js +1 -0
  19. package/dist/apps/web/dist/assets/ayu-mirage-32ctXXKs.js +1 -0
  20. package/dist/apps/web/dist/assets/ballerina-BFfxhgS-.js +1 -0
  21. package/dist/apps/web/dist/assets/bat-BkioyH1T.js +1 -0
  22. package/dist/apps/web/dist/assets/beancount-k_qm7-4y.js +1 -0
  23. package/dist/apps/web/dist/assets/berry-uYugtg8r.js +1 -0
  24. package/dist/apps/web/dist/assets/bibtex-CHM0blh-.js +1 -0
  25. package/dist/apps/web/dist/assets/bicep-Bmn6On1c.js +1 -0
  26. package/dist/apps/web/dist/assets/bird2-DPOp833l.js +1 -0
  27. package/dist/apps/web/dist/assets/blade-D4QpJJKB.js +1 -0
  28. package/dist/apps/web/dist/assets/bsl-BO_Y6i37.js +1 -0
  29. package/dist/apps/web/dist/assets/c-BIGW1oBm.js +1 -0
  30. package/dist/apps/web/dist/assets/c3-eo99z4R2.js +1 -0
  31. package/dist/apps/web/dist/assets/cadence-Bv_4Rxtq.js +1 -0
  32. package/dist/apps/web/dist/assets/cairo-KRGpt6FW.js +1 -0
  33. package/dist/apps/web/dist/assets/catppuccin-frappe-DFWUc33u.js +1 -0
  34. package/dist/apps/web/dist/assets/catppuccin-latte-C9dUb6Cb.js +1 -0
  35. package/dist/apps/web/dist/assets/catppuccin-macchiato-DQyhUUbL.js +1 -0
  36. package/dist/apps/web/dist/assets/catppuccin-mocha-D87Tk5Gz.js +1 -0
  37. package/dist/apps/web/dist/assets/clarity-D53aC0YG.js +1 -0
  38. package/dist/apps/web/dist/assets/clojure-P80f7IUj.js +1 -0
  39. package/dist/apps/web/dist/assets/cmake-D1j8_8rp.js +1 -0
  40. package/dist/apps/web/dist/assets/cobol-nwyudZeR.js +1 -0
  41. package/dist/apps/web/dist/assets/codeowners-Bp6g37R7.js +1 -0
  42. package/dist/apps/web/dist/assets/codeql-DsOJ9woJ.js +1 -0
  43. package/dist/apps/web/dist/assets/coffee-Ch7k5sss.js +1 -0
  44. package/dist/apps/web/dist/assets/common-lisp-Cg-RD9OK.js +1 -0
  45. package/dist/apps/web/dist/assets/coq-DkFqJrB1.js +1 -0
  46. package/dist/apps/web/dist/assets/cpp-CofmeUqb.js +1 -0
  47. package/dist/apps/web/dist/assets/crystal-tKQVLTB8.js +1 -0
  48. package/dist/apps/web/dist/assets/csharp-COcwbKMJ.js +1 -0
  49. package/dist/apps/web/dist/assets/css-DPfMkruS.js +1 -0
  50. package/dist/apps/web/dist/assets/csv-fuZLfV_i.js +1 -0
  51. package/dist/apps/web/dist/assets/cue-D82EKSYY.js +1 -0
  52. package/dist/apps/web/dist/assets/cypher-COkxafJQ.js +1 -0
  53. package/dist/apps/web/dist/assets/d-85-TOEBH.js +1 -0
  54. package/dist/apps/web/dist/assets/dark-plus-C3mMm8J8.js +1 -0
  55. package/dist/apps/web/dist/assets/dart-CF10PKvl.js +1 -0
  56. package/dist/apps/web/dist/assets/dax-CEL-wOlO.js +1 -0
  57. package/dist/apps/web/dist/assets/desktop-BmXAJ9_W.js +1 -0
  58. package/dist/apps/web/dist/assets/diff-D97Zzqfu.js +1 -0
  59. package/dist/apps/web/dist/assets/docker-BcOcwvcX.js +1 -0
  60. package/dist/apps/web/dist/assets/dotenv-Da5cRb03.js +1 -0
  61. package/dist/apps/web/dist/assets/dracula-BzJJZx-M.js +1 -0
  62. package/dist/apps/web/dist/assets/dracula-soft-BXkSAIEj.js +1 -0
  63. package/dist/apps/web/dist/assets/dream-maker-BtqSS_iP.js +1 -0
  64. package/dist/apps/web/dist/assets/edge-BkV0erSs.js +1 -0
  65. package/dist/apps/web/dist/assets/elixir-CDX3lj18.js +1 -0
  66. package/dist/apps/web/dist/assets/elm-DbKCFpqz.js +1 -0
  67. package/dist/apps/web/dist/assets/emacs-lisp-C9XAeP06.js +1 -0
  68. package/dist/apps/web/dist/assets/erb-B12qg9BL.js +1 -0
  69. package/dist/apps/web/dist/assets/erlang-DsQrWhSR.js +1 -0
  70. package/dist/apps/web/dist/assets/everforest-dark-BgDCqdQA.js +1 -0
  71. package/dist/apps/web/dist/assets/everforest-light-C8M2exoo.js +1 -0
  72. package/dist/apps/web/dist/assets/fennel-BYunw83y.js +1 -0
  73. package/dist/apps/web/dist/assets/fish-BvzEVeQv.js +1 -0
  74. package/dist/apps/web/dist/assets/fluent-C4IJs8-o.js +1 -0
  75. package/dist/apps/web/dist/assets/fortran-fixed-form-CkoXwp7k.js +1 -0
  76. package/dist/apps/web/dist/assets/fortran-free-form-BxgE0vQu.js +1 -0
  77. package/dist/apps/web/dist/assets/fsharp-CXgrBDvD.js +1 -0
  78. package/dist/apps/web/dist/assets/gdresource-BOOCDP_w.js +1 -0
  79. package/dist/apps/web/dist/assets/gdscript-C5YyOfLZ.js +1 -0
  80. package/dist/apps/web/dist/assets/gdshader-DkwncUOv.js +1 -0
  81. package/dist/apps/web/dist/assets/genie-D0YGMca9.js +1 -0
  82. package/dist/apps/web/dist/assets/gherkin-DyxjwDmM.js +1 -0
  83. package/dist/apps/web/dist/assets/git-commit-F4YmCXRG.js +1 -0
  84. package/dist/apps/web/dist/assets/git-rebase-r7XF79zn.js +1 -0
  85. package/dist/apps/web/dist/assets/github-dark-DHJKELXO.js +1 -0
  86. package/dist/apps/web/dist/assets/github-dark-default-Cuk6v7N8.js +1 -0
  87. package/dist/apps/web/dist/assets/github-dark-dimmed-DH5Ifo-i.js +1 -0
  88. package/dist/apps/web/dist/assets/github-dark-high-contrast-E3gJ1_iC.js +1 -0
  89. package/dist/apps/web/dist/assets/github-light-DAi9KRSo.js +1 -0
  90. package/dist/apps/web/dist/assets/github-light-default-D7oLnXFd.js +1 -0
  91. package/dist/apps/web/dist/assets/github-light-high-contrast-BfjtVDDH.js +1 -0
  92. package/dist/apps/web/dist/assets/gleam-BspZqrRM.js +1 -0
  93. package/dist/apps/web/dist/assets/glimmer-js-Rg0-pVw9.js +1 -0
  94. package/dist/apps/web/dist/assets/glimmer-ts-U6CK756n.js +1 -0
  95. package/dist/apps/web/dist/assets/glsl-DplSGwfg.js +1 -0
  96. package/dist/apps/web/dist/assets/gn-n2N0HUVH.js +1 -0
  97. package/dist/apps/web/dist/assets/gnuplot-DdkO51Og.js +1 -0
  98. package/dist/apps/web/dist/assets/go-CxLEBnE3.js +1 -0
  99. package/dist/apps/web/dist/assets/graphql-ChdNCCLP.js +1 -0
  100. package/dist/apps/web/dist/assets/groovy-gcz8RCvz.js +1 -0
  101. package/dist/apps/web/dist/assets/gruvbox-dark-hard-CFHQjOhq.js +1 -0
  102. package/dist/apps/web/dist/assets/gruvbox-dark-medium-GsRaNv29.js +1 -0
  103. package/dist/apps/web/dist/assets/gruvbox-dark-soft-CVdnzihN.js +1 -0
  104. package/dist/apps/web/dist/assets/gruvbox-light-hard-CH1njM8p.js +1 -0
  105. package/dist/apps/web/dist/assets/gruvbox-light-medium-DRw_LuNl.js +1 -0
  106. package/dist/apps/web/dist/assets/gruvbox-light-soft-hJgmCMqR.js +1 -0
  107. package/dist/apps/web/dist/assets/hack-CaT9iCJl.js +1 -0
  108. package/dist/apps/web/dist/assets/haml-B8DHNrY2.js +1 -0
  109. package/dist/apps/web/dist/assets/handlebars-BL8al0AC.js +1 -0
  110. package/dist/apps/web/dist/assets/haskell-Df6bDoY_.js +1 -0
  111. package/dist/apps/web/dist/assets/haxe-CzTSHFRz.js +1 -0
  112. package/dist/apps/web/dist/assets/hcl-BWvSN4gD.js +1 -0
  113. package/dist/apps/web/dist/assets/hjson-D5-asLiD.js +1 -0
  114. package/dist/apps/web/dist/assets/hlsl-D3lLCCz7.js +1 -0
  115. package/dist/apps/web/dist/assets/horizon-BUw7H-hv.js +1 -0
  116. package/dist/apps/web/dist/assets/horizon-bright-Cn-bp-IR.js +1 -0
  117. package/dist/apps/web/dist/assets/houston-DnULxvSX.js +1 -0
  118. package/dist/apps/web/dist/assets/html-GMplVEZG.js +1 -0
  119. package/dist/apps/web/dist/assets/html-derivative-BFtXZ54Q.js +1 -0
  120. package/dist/apps/web/dist/assets/http-jrhK8wxY.js +1 -0
  121. package/dist/apps/web/dist/assets/hurl-irOxFIW8.js +1 -0
  122. package/dist/apps/web/dist/assets/hxml-Bvhsp5Yf.js +1 -0
  123. package/dist/apps/web/dist/assets/hy-DFXneXwc.js +1 -0
  124. package/dist/apps/web/dist/assets/imba-DGztddWO.js +1 -0
  125. package/dist/apps/web/dist/assets/index-B5JrV3_C.css +1 -0
  126. package/dist/apps/web/dist/assets/index-CaU84fHq.js +369 -0
  127. package/dist/apps/web/dist/assets/ini-BEwlwnbL.js +1 -0
  128. package/dist/apps/web/dist/assets/java-CylS5w8V.js +1 -0
  129. package/dist/apps/web/dist/assets/javascript-wDzz0qaB.js +1 -0
  130. package/dist/apps/web/dist/assets/jinja-4LBKfQ-Z.js +1 -0
  131. package/dist/apps/web/dist/assets/jison-wvAkD_A8.js +1 -0
  132. package/dist/apps/web/dist/assets/json-Cp-IABpG.js +1 -0
  133. package/dist/apps/web/dist/assets/json5-C9tS-k6U.js +1 -0
  134. package/dist/apps/web/dist/assets/jsonc-Des-eS-w.js +1 -0
  135. package/dist/apps/web/dist/assets/jsonl-DcaNXYhu.js +1 -0
  136. package/dist/apps/web/dist/assets/jsonnet-DFQXde-d.js +1 -0
  137. package/dist/apps/web/dist/assets/jssm-C2t-YnRu.js +1 -0
  138. package/dist/apps/web/dist/assets/jsx-g9-lgVsj.js +1 -0
  139. package/dist/apps/web/dist/assets/julia-CxzCAyBv.js +1 -0
  140. package/dist/apps/web/dist/assets/just-Cw27pwNe.js +1 -0
  141. package/dist/apps/web/dist/assets/kanagawa-dragon-CkXjmgJE.js +1 -0
  142. package/dist/apps/web/dist/assets/kanagawa-lotus-CfQXZHmo.js +1 -0
  143. package/dist/apps/web/dist/assets/kanagawa-wave-DWedfzmr.js +1 -0
  144. package/dist/apps/web/dist/assets/kdl-DV7GczEv.js +1 -0
  145. package/dist/apps/web/dist/assets/kotlin-BdnUsdx6.js +1 -0
  146. package/dist/apps/web/dist/assets/kusto-DZf3V79B.js +1 -0
  147. package/dist/apps/web/dist/assets/laserwave-DUszq2jm.js +1 -0
  148. package/dist/apps/web/dist/assets/latex-CWtU0Tv5.js +1 -0
  149. package/dist/apps/web/dist/assets/lean-BZvkOJ9d.js +1 -0
  150. package/dist/apps/web/dist/assets/less-B1dDrJ26.js +1 -0
  151. package/dist/apps/web/dist/assets/light-plus-B7mTdjB0.js +1 -0
  152. package/dist/apps/web/dist/assets/liquid-DYVedYrR.js +1 -0
  153. package/dist/apps/web/dist/assets/llvm-DjAJT7YJ.js +1 -0
  154. package/dist/apps/web/dist/assets/log-2UxHyX5q.js +1 -0
  155. package/dist/apps/web/dist/assets/logo-BtOb2qkB.js +1 -0
  156. package/dist/apps/web/dist/assets/lua-BaeVxFsk.js +1 -0
  157. package/dist/apps/web/dist/assets/luau-C-HG3fhB.js +1 -0
  158. package/dist/apps/web/dist/assets/make-CHLpvVh8.js +1 -0
  159. package/dist/apps/web/dist/assets/markdown-Cvjx9yec.js +1 -0
  160. package/dist/apps/web/dist/assets/marko-CnJfTvn9.js +1 -0
  161. package/dist/apps/web/dist/assets/material-theme-D5KoaKCx.js +1 -0
  162. package/dist/apps/web/dist/assets/material-theme-darker-BfHTSMKl.js +1 -0
  163. package/dist/apps/web/dist/assets/material-theme-lighter-B0m2ddpp.js +1 -0
  164. package/dist/apps/web/dist/assets/material-theme-ocean-CyktbL80.js +1 -0
  165. package/dist/apps/web/dist/assets/material-theme-palenight-Csfq5Kiy.js +1 -0
  166. package/dist/apps/web/dist/assets/matlab-D7o27uSR.js +1 -0
  167. package/dist/apps/web/dist/assets/mdc-BMNejdWA.js +1 -0
  168. package/dist/apps/web/dist/assets/mdx-Cmh6b_Ma.js +1 -0
  169. package/dist/apps/web/dist/assets/mermaid-mWjccvbQ.js +1 -0
  170. package/dist/apps/web/dist/assets/min-dark-CafNBF8u.js +1 -0
  171. package/dist/apps/web/dist/assets/min-light-CTRr51gU.js +1 -0
  172. package/dist/apps/web/dist/assets/mipsasm-CKIfxQSi.js +1 -0
  173. package/dist/apps/web/dist/assets/mojo-rZm6bMo-.js +1 -0
  174. package/dist/apps/web/dist/assets/monokai-D4h5O-jR.js +1 -0
  175. package/dist/apps/web/dist/assets/moonbit-_H4v1dQx.js +1 -0
  176. package/dist/apps/web/dist/assets/move-IF9eRakj.js +1 -0
  177. package/dist/apps/web/dist/assets/narrat-DRg8JJMk.js +1 -0
  178. package/dist/apps/web/dist/assets/nextflow-Zz6hmt5N.js +1 -0
  179. package/dist/apps/web/dist/assets/nextflow-groovy-BeH2EWoN.js +1 -0
  180. package/dist/apps/web/dist/assets/nginx-BpAMiNFr.js +1 -0
  181. package/dist/apps/web/dist/assets/night-owl-C39BiMTA.js +1 -0
  182. package/dist/apps/web/dist/assets/night-owl-light-CMTm3GFP.js +1 -0
  183. package/dist/apps/web/dist/assets/nim-CVrawwO9.js +1 -0
  184. package/dist/apps/web/dist/assets/nix-CwoSXNpI.js +1 -0
  185. package/dist/apps/web/dist/assets/nord-Ddv68eIx.js +1 -0
  186. package/dist/apps/web/dist/assets/nushell-Cz2AlsmD.js +1 -0
  187. package/dist/apps/web/dist/assets/objective-c-DXmwc3jG.js +1 -0
  188. package/dist/apps/web/dist/assets/objective-cpp-CLxacb5B.js +1 -0
  189. package/dist/apps/web/dist/assets/ocaml-C0hk2d4L.js +1 -0
  190. package/dist/apps/web/dist/assets/odin-BBf5iR-q.js +1 -0
  191. package/dist/apps/web/dist/assets/one-dark-pro-DVMEJ2y_.js +1 -0
  192. package/dist/apps/web/dist/assets/one-light-C3Wv6jpd.js +1 -0
  193. package/dist/apps/web/dist/assets/openscad-C4EeE6gA.js +1 -0
  194. package/dist/apps/web/dist/assets/pascal-D93ZcfNL.js +1 -0
  195. package/dist/apps/web/dist/assets/perl-C0TMdlhV.js +1 -0
  196. package/dist/apps/web/dist/assets/php-Dhbhpdrm.js +1 -0
  197. package/dist/apps/web/dist/assets/pierre-dark-DF2SEV7i.js +1 -0
  198. package/dist/apps/web/dist/assets/pierre-light-DOlZxES8.js +1 -0
  199. package/dist/apps/web/dist/assets/pkl-u5AG7uiY.js +1 -0
  200. package/dist/apps/web/dist/assets/plastic-3e1v2bzS.js +1 -0
  201. package/dist/apps/web/dist/assets/plsql-ChMvpjG-.js +1 -0
  202. package/dist/apps/web/dist/assets/po-BTJTHyun.js +1 -0
  203. package/dist/apps/web/dist/assets/poimandres-CS3Unz2-.js +1 -0
  204. package/dist/apps/web/dist/assets/polar-C0HS_06l.js +1 -0
  205. package/dist/apps/web/dist/assets/postcss-CXtECtnM.js +1 -0
  206. package/dist/apps/web/dist/assets/powerquery-CEu0bR-o.js +1 -0
  207. package/dist/apps/web/dist/assets/powershell-Dpen1YoG.js +1 -0
  208. package/dist/apps/web/dist/assets/prisma-Dd19v3D-.js +1 -0
  209. package/dist/apps/web/dist/assets/prolog-CbFg5uaA.js +1 -0
  210. package/dist/apps/web/dist/assets/proto-C7zT0LnQ.js +1 -0
  211. package/dist/apps/web/dist/assets/pug-CGlum2m_.js +1 -0
  212. package/dist/apps/web/dist/assets/puppet-BMWR74SV.js +1 -0
  213. package/dist/apps/web/dist/assets/purescript-CklMAg4u.js +1 -0
  214. package/dist/apps/web/dist/assets/python-B6aJPvgy.js +1 -0
  215. package/dist/apps/web/dist/assets/qml-3beO22l8.js +1 -0
  216. package/dist/apps/web/dist/assets/qmldir-C8lEn-DE.js +1 -0
  217. package/dist/apps/web/dist/assets/qss-IeuSbFQv.js +1 -0
  218. package/dist/apps/web/dist/assets/r-Dspwwk_N.js +1 -0
  219. package/dist/apps/web/dist/assets/racket-BqYA7rlc.js +1 -0
  220. package/dist/apps/web/dist/assets/raku-DXvB9xmW.js +1 -0
  221. package/dist/apps/web/dist/assets/razor-Uh8Bk_45.js +1 -0
  222. package/dist/apps/web/dist/assets/red-bN70gL4F.js +1 -0
  223. package/dist/apps/web/dist/assets/reg-C-SQnVFl.js +1 -0
  224. package/dist/apps/web/dist/assets/regexp-CDVJQ6XC.js +1 -0
  225. package/dist/apps/web/dist/assets/rel-C3B-1QV4.js +1 -0
  226. package/dist/apps/web/dist/assets/riscv-BM1_JUlF.js +1 -0
  227. package/dist/apps/web/dist/assets/ron-D8l8udqQ.js +1 -0
  228. package/dist/apps/web/dist/assets/rose-pine-dawn-DHQR4-dF.js +1 -0
  229. package/dist/apps/web/dist/assets/rose-pine-moon-D4_iv3hh.js +1 -0
  230. package/dist/apps/web/dist/assets/rose-pine-qdsjHGoJ.js +1 -0
  231. package/dist/apps/web/dist/assets/rosmsg-BJDFO7_C.js +1 -0
  232. package/dist/apps/web/dist/assets/rst-BrH8l1NY.js +1 -0
  233. package/dist/apps/web/dist/assets/ruby-Dw2BHqvy.js +1 -0
  234. package/dist/apps/web/dist/assets/rust-B1yitclQ.js +1 -0
  235. package/dist/apps/web/dist/assets/sas-cz2c8ADy.js +1 -0
  236. package/dist/apps/web/dist/assets/sass-Cj5Yp3dK.js +1 -0
  237. package/dist/apps/web/dist/assets/scala-C151Ov-r.js +1 -0
  238. package/dist/apps/web/dist/assets/scheme-C98Dy4si.js +1 -0
  239. package/dist/apps/web/dist/assets/scss-OYdSNvt2.js +1 -0
  240. package/dist/apps/web/dist/assets/sdbl-DVxCFoDh.js +1 -0
  241. package/dist/apps/web/dist/assets/shaderlab-Dg9Lc6iA.js +1 -0
  242. package/dist/apps/web/dist/assets/shellscript-Yzrsuije.js +1 -0
  243. package/dist/apps/web/dist/assets/shellsession-BADoaaVG.js +1 -0
  244. package/dist/apps/web/dist/assets/slack-dark-BthQWCQV.js +1 -0
  245. package/dist/apps/web/dist/assets/slack-ochin-DqwNpetd.js +1 -0
  246. package/dist/apps/web/dist/assets/smalltalk-BERRCDM3.js +1 -0
  247. package/dist/apps/web/dist/assets/snazzy-light-Bw305WKR.js +1 -0
  248. package/dist/apps/web/dist/assets/solarized-dark-DXbdFlpD.js +1 -0
  249. package/dist/apps/web/dist/assets/solarized-light-L9t79GZl.js +1 -0
  250. package/dist/apps/web/dist/assets/solidity-rGO070M0.js +1 -0
  251. package/dist/apps/web/dist/assets/soy-Brmx7dQM.js +1 -0
  252. package/dist/apps/web/dist/assets/sparql-rVzFXLq3.js +1 -0
  253. package/dist/apps/web/dist/assets/splunk-BtCnVYZw.js +1 -0
  254. package/dist/apps/web/dist/assets/sql-BLtJtn59.js +1 -0
  255. package/dist/apps/web/dist/assets/ssh-config-_ykCGR6B.js +1 -0
  256. package/dist/apps/web/dist/assets/stata-BH5u7GGu.js +1 -0
  257. package/dist/apps/web/dist/assets/stylus-BEDo0Tqx.js +1 -0
  258. package/dist/apps/web/dist/assets/surrealql-Bq5Q-fJD.js +1 -0
  259. package/dist/apps/web/dist/assets/svelte-C_ipcX3V.js +1 -0
  260. package/dist/apps/web/dist/assets/swift-D82vCrfD.js +1 -0
  261. package/dist/apps/web/dist/assets/synthwave-84-CbfX1IO0.js +1 -0
  262. package/dist/apps/web/dist/assets/system-verilog-CnnmHF94.js +1 -0
  263. package/dist/apps/web/dist/assets/systemd-4A_iFExJ.js +1 -0
  264. package/dist/apps/web/dist/assets/talonscript-CkByrt1z.js +1 -0
  265. package/dist/apps/web/dist/assets/tasl-QIJgUcNo.js +1 -0
  266. package/dist/apps/web/dist/assets/tcl-dwOrl1Do.js +1 -0
  267. package/dist/apps/web/dist/assets/templ-P3uqSqPl.js +1 -0
  268. package/dist/apps/web/dist/assets/terraform-BETggiCN.js +1 -0
  269. package/dist/apps/web/dist/assets/tex-idrVyKtj.js +1 -0
  270. package/dist/apps/web/dist/assets/tokyo-night-hegEt444.js +1 -0
  271. package/dist/apps/web/dist/assets/toml-vGWfd6FD.js +1 -0
  272. package/dist/apps/web/dist/assets/ts-tags-zn1MmPIZ.js +1 -0
  273. package/dist/apps/web/dist/assets/tsv-B_m7g4N7.js +1 -0
  274. package/dist/apps/web/dist/assets/tsx-COt5Ahok.js +1 -0
  275. package/dist/apps/web/dist/assets/turtle-BsS91CYL.js +1 -0
  276. package/dist/apps/web/dist/assets/twig-DNn4PbVi.js +1 -0
  277. package/dist/apps/web/dist/assets/typescript-BPQ3VLAy.js +1 -0
  278. package/dist/apps/web/dist/assets/typespec-BGHnOYBU.js +1 -0
  279. package/dist/apps/web/dist/assets/typst-DHCkPAjA.js +1 -0
  280. package/dist/apps/web/dist/assets/v-BcVCzyr7.js +1 -0
  281. package/dist/apps/web/dist/assets/vala-CsfeWuGM.js +1 -0
  282. package/dist/apps/web/dist/assets/vb-D17OF-Vu.js +1 -0
  283. package/dist/apps/web/dist/assets/verilog-BQ8w6xss.js +1 -0
  284. package/dist/apps/web/dist/assets/vesper-DU1UobuO.js +1 -0
  285. package/dist/apps/web/dist/assets/vhdl-CeAyd5Ju.js +1 -0
  286. package/dist/apps/web/dist/assets/viml-CJc9bBzg.js +1 -0
  287. package/dist/apps/web/dist/assets/vitesse-black-Bkuqu6BP.js +1 -0
  288. package/dist/apps/web/dist/assets/vitesse-dark-D0r3Knsf.js +1 -0
  289. package/dist/apps/web/dist/assets/vitesse-light-CVO1_9PV.js +1 -0
  290. package/dist/apps/web/dist/assets/vue-DN_0RTcg.js +1 -0
  291. package/dist/apps/web/dist/assets/vue-html-AaS7Mt5G.js +1 -0
  292. package/dist/apps/web/dist/assets/vue-vine-CQOfvN7w.js +1 -0
  293. package/dist/apps/web/dist/assets/vyper-CDx5xZoG.js +1 -0
  294. package/dist/apps/web/dist/assets/wasm-CG6Dc4jp.js +1 -0
  295. package/dist/apps/web/dist/assets/wasm-MzD3tlZU.js +1 -0
  296. package/dist/apps/web/dist/assets/wenyan-BV7otONQ.js +1 -0
  297. package/dist/apps/web/dist/assets/wgsl-Dx-B1_4e.js +1 -0
  298. package/dist/apps/web/dist/assets/wikitext-BhOHFoWU.js +1 -0
  299. package/dist/apps/web/dist/assets/wit-5i3qLPDT.js +1 -0
  300. package/dist/apps/web/dist/assets/wolfram-lXgVvXCa.js +1 -0
  301. package/dist/apps/web/dist/assets/xml-sdJ4AIDG.js +1 -0
  302. package/dist/apps/web/dist/assets/xsl-CtQFsRM5.js +1 -0
  303. package/dist/apps/web/dist/assets/yaml-Buea-lGh.js +1 -0
  304. package/dist/apps/web/dist/assets/zenscript-DVFEvuxE.js +1 -0
  305. package/dist/apps/web/dist/assets/zig-VOosw3JB.js +1 -0
  306. package/dist/apps/web/dist/index.html +2 -2
  307. package/dist/index.d.mts +72 -72
  308. package/package.json +1 -1
  309. package/skills/agent-eval/SKILL.md +44 -68
  310. package/dist/apps/web/dist/assets/index-BJX1ESNi.js +0 -140
  311. package/dist/apps/web/dist/assets/index-BU3IqUso.css +0 -1
@@ -5,11 +5,11 @@ description: Create, run, and maintain TypeScript evals with @ls-stack/agent-eva
 
  # Agent Eval
 
- Local-first, UI-first eval runner for LLM and agent systems. Evals are strict
- TypeScript modules named `*.eval.ts`, discovered from `agent-evals.config.ts`,
- and executed through the CLI (`agent-evals run`) or the web UI
- (`agent-evals app`). Runs persist to `.agent-evals/` so results, traces, and
- caches survive across processes.
+ Local-first eval runner for LLM and agent systems. Evals are strict TypeScript
+ modules named `*.eval.ts`, discovered from `agent-evals.config.ts`, and
+ executed through the CLI (`agent-evals run`) or local app (`agent-evals app`).
+ Runs persist to `.agent-evals/` so results, traces, and caches survive across
+ processes.
 
  This skill covers the mental model and conventions. For exhaustive field lists
  (config options, eval shape, column formats, score/chart/stats options, trace
@@ -27,18 +27,13 @@ display rules), read the TypeScript declarations shipped with the package:
  - Unfiltered `agent-evals run` is disabled by default; use `--eval` or `--case`
  for targeted CLI runs, or `--tags-filter <expr>` to run cases matching tags.
  Set `allowCliRunAll: true` in
- `agent-evals.config.ts` to opt into run-all CLI behavior. The web UI can
- still run grouped evals and confirms before starting more than five. On a
- single eval page, the Run chevron can open a picker to run specific authored
- case ids; those case-picked runs are temporary by default and can be made
- durable in the modal.
+ `agent-evals.config.ts` to opt into run-all CLI behavior.
  - `agent-evals run --temporary` persists a run like normal history, but deletes
- it before the next run starts. Temporary runs appear in `show-runs` and the UI
- while present; normal runs are never deleted by temporary-run cleanup.
+ it before the next run starts. Temporary runs appear in `show-runs` while
+ present; normal runs are never deleted by temporary-run cleanup.
  - `agent-evals app` watches `agent-evals.config.ts` and reloads config in
- place when the runner is idle. If config changes during an active run, the UI
- shows a pending reload banner and blocks new runs until the current run
- reaches a terminal state and the reload applies.
+ place when the runner is idle. If config changes during an active run, the
+ reload applies after the current run reaches a terminal state.
 
  Assume that enumerated tables in this document may lag behind the types —
  treat the types as source of truth when they disagree.
@@ -67,9 +62,7 @@ a per-case sequence number, and throws outside an eval case scope.
  Use `evalLog(level, ...args)` for intentional per-case logs. The runner also
  captures `console.log`, `console.info`, `console.warn`, and `console.error`
  during case-owned phases by default; log arguments are stored as JSON-safe
- values and rendered with the JSON viewer, collapsed previews include best-effort
- code locations when stack data is available, previews are capped, and logs
- inside cached operations are not replayed from cache hits.
+ values. Logs inside cached operations are not replayed from cache hits.
  Use eval tags to target related coverage without naming every case:
  `AgentEvalsConfig.tags` applies workspace-wide tags, `defineEval({ tags })`
  adds eval tags, `case.tags` adds case-only tags, and `removeTags` disables a
@@ -209,10 +202,8 @@ For libraries or observability exporters that already emit span lifecycle
  events, use `evalTracer.startSpan(...)`, `evalTracer.updateSpan(...)`,
  `evalTracer.endSpan(...)`, or `evalTracer.recordSpan(...)` to translate those
  events into the eval trace tree without wrapping the upstream work in a
- callback. Pass the upstream span id and parent id when available so the UI keeps
- the original hierarchy. The Trace tab can switch between that recorded hierarchy
- and UI-only timeline nesting for flat exported traces; saved trace JSON and
- `deriveFromTracing` continue to use the recorded parent ids.
+ callback. Pass the upstream span id and parent id when available so saved trace
+ JSON and `deriveFromTracing` use the recorded hierarchy.
 
  ### Eval file (thin)
 
@@ -287,20 +278,19 @@ defineEval<z.infer<typeof inputSchema>>({
  });
  ```
 
- The web UI opens a modal driven by the descriptor derived from the schema
- (`z.string` text, `z.enum` select, `z.boolean` checkbox, etc.; nested
- shapes fall back to a JSON textarea). The CLI accepts `--input '<json>'` for a
+ `manualInput` configures the local app form descriptor derived from the schema
+ (`z.string` -> text, `z.enum` -> select, `z.boolean` -> checkbox, etc.; nested
+ shapes fall back to JSON input). The CLI accepts `--input '<json>'` for a
  single targeted eval or `--input-file <path>` mapping eval keys/ids to inputs.
  Each run produces one synthetic case `<evalId>-manual` with the validated
  submission; mixing `manualInput` with `cases` is rejected at discovery time.
 
  For file or image fields, set `{ asFile: true, accept?, maxSizeBytes? }` and
- type the field with `manualInputFileValueSchema`. The widget supports click,
- drag-and-drop, and clipboard paste (so a screenshot capture flows in
- directly). The runtime value carries `{ name, mimeType, sizeBytes, sha256,
- path }`, where `path` is a workspace-relative run artifact. Use
- `readManualInputFile(value)` when bytes, `Blob`, `File`, text, or parsed JSON
- are needed. In CLI runs, provide path objects such as
+ type the field with `manualInputFileValueSchema`. The runtime value carries
+ `{ name, mimeType, sizeBytes, sha256, path }`, where `path` is a
+ workspace-relative run artifact. Use `readManualInputFile(value)` when bytes,
+ `Blob`, `File`, text, or parsed JSON are needed. In CLI runs, provide path
+ objects such as
  `{ "image": { "path": "./screenshot.png" } }`; the CLI stages the file before
  starting the run.
 
@@ -350,12 +340,8 @@ See `EvalScoreDef` / `EvalManualScoreDef` in the types for the full shape
  `ColumnFormat` union and `EvalColumnOverride` in the types. Global
  `columns` in `agent-evals.config.ts` apply to every eval; eval-level
  `columns` override matching global keys. Use `hideIfNoValue: true` to hide a
- column from the runs table when every rendered row is missing the value,
- `null`, or an empty string; `0` and `false` still count as values, and the
- value remains available in case details and raw output data.
- In the case detail Output tab, string outputs that look like Markdown render
- as Markdown even without `format: 'markdown'`, with a Preview/Raw toggle for
- inspecting the original text.
+ column when every row is missing the value, `null`, or an empty string; `0`
+ and `false` still count as values.
  - `deriveFromTracing` can be authored globally in `agent-evals.config.ts` or
  locally on one eval. Prefer the keyed map form for shared metrics:
  `deriveFromTracing: { toolCalls: ({ trace }) => trace.findSpansByKind('tool').length }`.
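Editor's note on the `hideIfNoValue` rule in the hunk above: the sketch below restates it as a standalone predicate. It is a minimal illustration and not the package's implementation. `Row`, `hasValue`, and `shouldHideColumn` are hypothetical names introduced here; only the rule itself (missing, `null`, or empty string hide; `0` and `false` count as values) comes from the text.

```typescript
// Hypothetical helpers, not part of @ls-stack/agent-eval.
type Row = Record<string, unknown>;

// A value is "missing" only when it is undefined, null, or an empty string.
// 0 and false are real values under the rule quoted in the diff.
function hasValue(value: unknown): boolean {
  return value !== undefined && value !== null && value !== '';
}

// A column is hidden only when no row carries a value for its key.
function shouldHideColumn(rows: Row[], key: string): boolean {
  return rows.every((row) => !hasValue(row[key]));
}

const rows: Row[] = [
  { score: 0, note: '', passed: false },
  { score: null, note: '' },
];

const hiddenColumns = ['score', 'note', 'passed', 'latencyMs'].filter((key) =>
  shouldHideColumn(rows, key),
);
console.log(hiddenColumns); // [ 'note', 'latencyMs' ]
```

`score` and `passed` stay visible because one row holds `0` or `false`; `note` and `latencyMs` are hidden because every row is empty or missing for them.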
@@ -382,20 +368,17 @@ See `EvalScoreDef` / `EvalManualScoreDef` in the types for the full shape
  `placements: ['header' | 'body']`). `derivedAttributes` can be a keyed map
  for one-off fields or one callback that returns multiple path/value pairs.
  Derived keys are dot-paths under `span.attributes`; return `undefined` to
- skip one span or one returned key. For saved runs,
- the case drawer more menu can recalculate configured LLM/API derived
- attributes for one case and persist the updated trace artifacts without
- re-running the eval.
+ skip one span or one returned key.
  - Default usage config derives missing eval outputs from matching LLM/API spans
  before `outputsSchema` and scores run: `apiCalls`, `costUsd`, `llmTurns`,
  `inputTokens`, `outputTokens`, `totalTokens`, `cachedInputTokens`,
  `cacheCreationInputTokens`, `reasoningTokens`, and `llmDurationMs`. Authored
  outputs and column overrides win. Default usage columns, stats, and charts
- use `hideIfNoValue: true`, so the UI hides them until matching LLM/API span
- data exists. Default LLM usage charts render cost, input tokens, and output
- tokens separately and use `dedupeConsecutiveValues: true` to skip repeated
- adjacent chart values. `totalTokens` is input + output only; cache read/write
- tokens stay separate and affect `costUsd` at their own rates.
+ use `hideIfNoValue: true`. Default LLM usage charts configure cost, input
+ tokens, and output tokens separately and use `dedupeConsecutiveValues: true`
+ to skip repeated adjacent chart values. `totalTokens` is input + output only;
+ cache read/write tokens stay separate and affect `costUsd` at their own
+ rates.
  Derived base input cost uses `inputTokens - cachedInputTokens -
  cacheCreationInputTokens` so cache details are not double-counted.
  `cacheCreationInputTokens` is the total cache-write count; optional
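Editor's note on the token arithmetic in the hunk above: the sketch below works through the two formulas the text states, `totalTokens` as input plus output and base input as `inputTokens - cachedInputTokens - cacheCreationInputTokens`. It is an illustration under assumptions and not the package's code. The `Usage` and `RatesPerMillion` shapes, the function names, and every rate are hypothetical, as is the reading that `inputTokens` already includes cache reads and writes (inferred from the subtraction).

```typescript
// Hypothetical shapes; only the field names inside Usage come from the text.
interface Usage {
  inputTokens: number; // assumed to include cache reads and cache writes
  outputTokens: number;
  cachedInputTokens: number; // cache reads
  cacheCreationInputTokens: number; // cache writes
}

// Made-up USD rates per one million tokens, for illustration only.
interface RatesPerMillion {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// "totalTokens is input + output only".
function totalTokens(u: Usage): number {
  return u.inputTokens + u.outputTokens;
}

// Input tokens billed at the base rate, after removing cache reads and writes
// so they are not counted twice.
function baseInputTokens(u: Usage): number {
  return u.inputTokens - u.cachedInputTokens - u.cacheCreationInputTokens;
}

// Each bucket is billed at its own rate.
function costUsd(u: Usage, r: RatesPerMillion): number {
  const micros =
    baseInputTokens(u) * r.input +
    u.cachedInputTokens * r.cacheRead +
    u.cacheCreationInputTokens * r.cacheWrite +
    u.outputTokens * r.output;
  return micros / 1_000_000;
}

const usage: Usage = {
  inputTokens: 1000,
  outputTokens: 200,
  cachedInputTokens: 600,
  cacheCreationInputTokens: 100,
};
const rates: RatesPerMillion = { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75 };

console.log(totalTokens(usage)); // 1200
console.log(baseInputTokens(usage)); // 300
console.log(costUsd(usage, rates));
```

With these numbers, 300 of the 1000 input tokens are billed at the base input rate, while the 600 cache reads and 100 cache writes are billed at their own rates.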
@@ -419,16 +402,14 @@ cacheCreationInputTokens` so cache details are not double-counted.
  without persisting console calls to case details. Manual `evalLog(...)` calls
  are still captured.
 
- Stats rows and history charts on the eval card can be authored via `stats` /
- `charts` on the eval definition. Global `stats` in `agent-evals.config.ts`
- render before eval-level stats. Usage stats and LLM usage charts are added by
- default unless removed with `removeDefaultConfig`. Column stats can override
- `format` and `numberFormat`, otherwise they inherit from the matching column.
- Number formats use `maxDecimalPlaces` to cap decimals and `minDecimalPlaces`
- to pad trailing zeroes. Without `maxDecimalPlaces`, they render up to 3 decimal
- places. Stats and charts support `hideIfNoValue: true`; stats hide when they
- would otherwise render an empty value, and charts hide when no plotted metric or
- tooltip extra has a numeric value in the rendered history window. Charts support
+ Stats rows and history charts can be authored via `stats` / `charts` on the eval
+ definition. Global `stats` in `agent-evals.config.ts` combine with eval-level
+ stats. Usage stats and LLM usage charts are added by default unless removed with
+ `removeDefaultConfig`. Column stats can override `format` and `numberFormat`,
+ otherwise they inherit from the matching column. Number formats use
+ `maxDecimalPlaces` to cap decimals and `minDecimalPlaces` to pad trailing
+ zeroes. Without `maxDecimalPlaces`, the default cap is 3 decimal places. Stats
+ and charts support `hideIfNoValue: true`. Charts support
  `dedupeConsecutiveValues: true` to omit consecutive points whose plotted metrics
  and tooltip extras match the previous kept point.
  Their shapes live in the types; no need to memorize the option set.
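Editor's note on the number-format and dedupe rules in the hunk above: the sketch below shows one way the described behavior could work, using the standard `Intl.NumberFormat` API. It is a hedged illustration and not the package's implementation. `formatNumber`, `dedupeConsecutive`, `ChartPoint`, and the option interface are hypothetical; only the option names `maxDecimalPlaces`, `minDecimalPlaces`, and `dedupeConsecutiveValues`, plus the default cap of 3, come from the text. Clamping the minimum to the cap is an assumption made here to keep `Intl.NumberFormat` from throwing.

```typescript
// Hypothetical option shape; only the two field names come from the text.
interface NumberFormatOptions {
  maxDecimalPlaces?: number;
  minDecimalPlaces?: number;
}

function formatNumber(value: number, opts: NumberFormatOptions = {}): string {
  const max = opts.maxDecimalPlaces ?? 3; // default cap stated in the text
  // Assumption: clamp the minimum to the cap, because Intl.NumberFormat throws
  // a RangeError when minimumFractionDigits exceeds maximumFractionDigits.
  const min = Math.min(opts.minDecimalPlaces ?? 0, max);
  return new Intl.NumberFormat('en-US', {
    minimumFractionDigits: min,
    maximumFractionDigits: max,
    useGrouping: false,
  }).format(value);
}

// Hypothetical chart point: one entry per run in the history window.
interface ChartPoint {
  runId: string;
  metrics: Record<string, number | null>;
}

// Drops a point when its metrics equal those of the previous KEPT point.
// JSON.stringify is a simple equality check that assumes stable key order.
function dedupeConsecutive(points: ChartPoint[]): ChartPoint[] {
  const kept: ChartPoint[] = [];
  for (const point of points) {
    const previous = kept[kept.length - 1];
    if (previous && JSON.stringify(previous.metrics) === JSON.stringify(point.metrics)) {
      continue;
    }
    kept.push(point);
  }
  return kept;
}

console.log(formatNumber(1.23456)); // "1.235"
console.log(formatNumber(2, { minDecimalPlaces: 2 })); // "2.00"
console.log(formatNumber(3.14159, { maxDecimalPlaces: 1 })); // "3.1"

const history: ChartPoint[] = [
  { runId: 'a', metrics: { costUsd: 0.01 } },
  { runId: 'b', metrics: { costUsd: 0.01 } },
  { runId: 'c', metrics: { costUsd: 0.02 } },
  { runId: 'd', metrics: { costUsd: 0.01 } },
];
console.log(dedupeConsecutive(history).map((p) => p.runId)); // [ 'a', 'c', 'd' ]
```

Run `d` is kept even though its value matches runs `a` and `b`, because the comparison is only against the previous kept point (`c`), not against every earlier point.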
@@ -522,11 +503,7 @@ Mental model:
  - Authored raw cache keys are stored for debugging under
  `.agent-evals/cache-debug/<sanitizedNamespace>/<keyHash>.json`. This folder
  may include prompts, user inputs, full serialized cache payloads, or other
- sensitive data, should be gitignored, and is not needed for cache reuse. The
- UI Cache tab shows the raw key when it is available and can be filtered to
- hits or new entries added by cache misses/refreshes. Misses/refreshes with
- `cache.store: false` are shown as non-stored activity without fetch/delete
- controls.
+ sensitive data, should be gitignored, and is not needed for cache reuse.
  - Cached payloads use JSON-safe tagged serialization, so return values and
  recorded SDK effects preserve richer built-ins such as `Date`, `Map`, `Set`,
  typed arrays, `URL`, `Headers`, `Blob`, and `File` on hits. Undefined values
@@ -541,11 +518,11 @@ Mental model:
  Run output lives under `.agent-evals/runs/<run-id>/`. Cache metadata lives under
  `.agent-evals/cache/<sanitizedNamespace>/<keyHash>.json.br`. Do not rely on a
  specific cache filename when authoring evals; configure cache namespaces
- manually in eval code, then use `agent-evals cache list` or the UI Cache tab to
- inspect the persisted namespace/key entries. Files in a run directory include
- run metadata, a run summary, per-case results, and per-case trace JSON. Inspect
- run files when debugging persisted output, costs, columns, traces, or failures;
- inspect cache entries when debugging replayed span/value-cache results.
+ manually in eval code, then use `agent-evals cache list` to inspect the
+ persisted namespace/key entries. Files in a run directory include run metadata,
+ a run summary, per-case results, and per-case trace JSON. Inspect run files when
+ debugging persisted output, costs, columns, traces, or failures; inspect cache
+ entries when debugging replayed span/value-cache results.
  Targeted evals in `run.json` are recorded by exact `evalKeys`
  (`filePath + evalId`) rather than authored eval ids, so duplicate eval ids stay
  unambiguous in saved history.
@@ -608,8 +585,7 @@ When adding or changing evals:
  4. Surface reviewable values through execute-context `setOutput` or ambient
  `setEvalOutput` in shared workflow code, and shape them with `columns`
  formats from the `ColumnFormat` type.
- 5. Promote high-signal span attributes with `traceDisplay` so they surface in
- the trace tree and detail pane.
+ 5. Promote high-signal span attributes with `traceDisplay`.
  6. Cache costly pure spans with `cache: { namespace, key }` and pure spanless
  values with `evalTracer.cache(...)`; never cache operations whose external
  side effects you depend on.