@wix/evalforge-evaluator 0.201.0 → 0.203.0
- package/README.md +2 -0
- package/build/index.js +308 -160
- package/build/index.js.map +4 -4
- package/build/index.mjs +308 -160
- package/build/index.mjs.map +4 -4
- package/build/types/run-scenario/agents/opencode/build-trace.d.ts +15 -1
- package/build/types/run-scenario/agents/opencode/execute.d.ts +4 -8
- package/build/types/run-scenario/agents/opencode/gateway-cost-interceptor.d.ts +28 -0
- package/package.json +5 -5
package/README.md
CHANGED
@@ -38,6 +38,8 @@ evaluator <project-id> <eval-run-id>

 For OpenCode runs, the evaluator sets `lsp: false` in `OPENCODE_CONFIG_CONTENT` and `OPENCODE_DISABLE_LSP_DOWNLOAD` / `OPENCODE_DISABLE_FILETIME_CHECK` in the process environment (same as ditto `codegen`) to avoid LSP hangs after edit tools and spurious "file modified since last read" failures in automated evals.

+**OpenCode cost** comes from the gateway, not OpenCode. OpenCode prices the Wix AI Gateway as a free custom provider, so its self-reported `step_finish.cost` is ~$0. Instead, the evaluator runs a localhost pass-through (`gateway-cost-interceptor.ts`) between OpenCode and the gateway: it forwards each request untouched, streams the response straight back, and reads the real `total_cost_usd` the gateway already injects into every response. Those per-request costs map to OpenCode turns in the LLM trace; if a request's cost can't be read, that turn falls back to OpenCode's reported cost (logged). No pricing tables to maintain — the number is whatever the gateway billed.
+
 The evaluator is typically launched by the backend (locally or on a remote Dev Machine) with these variables pre-configured.

 ## Backend API access
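
The cost flow described in the added README paragraph can be illustrated with a minimal TypeScript sketch. It is not the code in `gateway-cost-interceptor.ts`, which this diff does not show. Only the field names `total_cost_usd` and `step_finish.cost` come from the README; the `Turn` and `PricedTurn` shapes, the helper names `findCost`, `extractCost`, and `priceTurns`, the position of `total_cost_usd` inside the response, and the assumption that the i-th intercepted request belongs to the i-th turn are all hypothetical. The sketch works on a buffered copy of the response body and leaves out the HTTP pass-through itself.

```typescript
// Hypothetical shapes: only `total_cost_usd` and `step_finish.cost` are named in the README.
interface Turn {
  reportedCost: number; // OpenCode's self-reported step_finish.cost (about $0 for the gateway)
}

interface PricedTurn extends Turn {
  cost: number;
  costSource: "gateway" | "opencode-fallback";
}

// Depth-first search for a finite numeric `total_cost_usd` in a parsed JSON value.
// Assumption: the README does not say where the gateway puts the field, so look everywhere.
function findCost(value: unknown): number | undefined {
  if (Array.isArray(value)) {
    for (const item of value) {
      const found = findCost(item);
      if (found !== undefined) return found;
    }
    return undefined;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const direct = record["total_cost_usd"];
    if (typeof direct === "number" && Number.isFinite(direct)) return direct;
    for (const child of Object.values(record)) {
      const found = findCost(child);
      if (found !== undefined) return found;
    }
  }
  return undefined;
}

// Read the cost from a copy of a response body, either plain JSON or SSE `data:` lines.
// Assumption: if several chunks carry a cost, the last one is the request total.
function extractCost(body: string): number | undefined {
  const sseChunks = body
    .split(/\r?\n/)
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length));
  let last: number | undefined;
  for (const candidate of [body, ...sseChunks]) {
    try {
      const found = findCost(JSON.parse(candidate));
      if (found !== undefined) last = found;
    } catch {
      // Not JSON (for example an SSE "[DONE]" marker or the raw SSE stream): skip it.
    }
  }
  return last;
}

// Pair per-request costs with turns by position; fall back to the reported cost when unreadable.
function priceTurns(turns: Turn[], requestCosts: Array<number | undefined>): PricedTurn[] {
  return turns.map((turn, i): PricedTurn => {
    const gatewayCost = requestCosts[i];
    if (gatewayCost !== undefined) {
      return { ...turn, cost: gatewayCost, costSource: "gateway" };
    }
    console.warn(`turn ${i}: gateway cost unreadable, using OpenCode's reported cost`);
    return { ...turn, cost: turn.reportedCost, costSource: "opencode-fallback" };
  });
}
```

Keeping extraction separate from forwarding matches the README's description that the response is streamed straight back: a real interceptor would presumably inspect a copy of the bytes rather than delay the stream, though that detail is not visible in this diff.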