@evalgate/sdk 2.2.2 → 2.2.3
- package/CHANGELOG.md +27 -0
- package/README.md +2 -0
- package/dist/assertions.d.ts +9 -5
- package/dist/assertions.js +29 -12
- package/dist/cache.d.ts +1 -1
- package/dist/cache.js +1 -1
- package/dist/cli/upgrade.js +5 -0
- package/dist/client.js +1 -1
- package/dist/errors.js +7 -0
- package/dist/export.js +2 -2
- package/dist/index.d.ts +3 -3
- package/dist/index.js +3 -2
- package/dist/integrations/anthropic.js +6 -6
- package/dist/integrations/openai.js +6 -6
- package/dist/pagination.d.ts +13 -2
- package/dist/pagination.js +28 -2
- package/dist/runtime/adapters/testsuite-to-dsl.js +1 -6
- package/dist/runtime/executor.d.ts +3 -2
- package/dist/runtime/executor.js +3 -2
- package/dist/runtime/registry.d.ts +4 -1
- package/dist/runtime/registry.js +4 -1
- package/dist/snapshot.d.ts +12 -0
- package/dist/snapshot.js +24 -1
- package/dist/version.d.ts +2 -2
- package/dist/version.js +2 -2
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,33 @@ All notable changes to the @evalgate/sdk package will be documented in this file
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [2.2.3] - 2026-03-03
|
|
9
|
+
|
|
10
|
+
### Fixed
|
|
11
|
+
|
|
12
|
+
- **`RequestCache.set` missing default TTL** — entries stored without an explicit TTL were immediately stale on next read. Default is now `CacheTTL.MEDIUM`; callers that omit `ttl` get a live cache entry instead of a cache miss every time.
|
|
13
|
+
- **`EvalGateError` subclass prototype chain** — `ValidationError.name` was silently overwritten by the base class constructor, surfacing as `"EvalGateError"` in stack traces and `instanceof` checks. All four subclasses (`ValidationError`, `RateLimitError`, `AuthenticationError`, `NetworkError`) now call `Object.setPrototypeOf(this, Subclass.prototype)` and set `this.name` after `super()`.
|
|
14
|
+
- **`RateLimitError.retryAfter` not a direct property** — the value was only stored inside `details.retryAfter` and not accessible as `err.retryAfter`. It is now assigned directly on the instance when provided.
|
|
15
|
+
- **`autoPaginate` returned `AsyncGenerator` instead of `Promise<T[]>`** — calling `await autoPaginate(fetcher)` was resolving to an unexhausted generator. It now collects all pages and returns a flat `Promise<T[]>`. The original streaming behaviour is available via the new `autoPaginateGenerator` export.
|
|
16
|
+
- **`createEvalRuntime` string-only overload** — passing `{ name, projectRoot }` config objects was ignored (treated as `process.cwd()`). The function now accepts `string | { name?: string; projectRoot?: string }` and extracts `projectRoot` correctly.
|
|
17
|
+
- **`defaultLocalExecutor` was an instance, not a factory** — importing `defaultLocalExecutor` returned a pre-constructed executor rather than a callable factory. It is now re-exported as `createLocalExecutor` so each import site can call it to get a fresh instance.
|
|
18
|
+
- **`SnapshotManager.save` crash on `undefined`/`null` output** — passing `undefined` or `null` to `snapshot(name, output)` threw `TypeError: Cannot convert undefined to string`. Both values are now serialized to the strings `"undefined"` and `"null"` respectively, matching the existing `null`-safe coercion already present for objects.
|
|
19
|
+
- **`compareSnapshots` loaded raw string instead of disk snapshot** — the old `compareWithSnapshot` alias passed its second argument as literal content rather than a snapshot name, producing meaningless diffs. The new `compareSnapshots(nameA, nameB, dir?)` loads both snapshots from disk before diffing.
|
|
20
|
+
- **`AIEvalClient` default `baseUrl`** — the no-arg constructor defaulted to `http://localhost:3000`, causing silent failures in production environments. Default is now `https://api.evalgate.com`.
|
|
21
|
+
- **`importData` unguarded `client.traces` / `client.evaluations` access** — calling `importData(data)` with a partial or undefined client could throw `TypeError: Cannot read properties of undefined`. Both property accesses now use optional chaining (`client?.traces`, `client?.evaluations`).
|
|
22
|
+
- **`toContainCode` required a fenced code block** — raw function definitions, `const` assignments, class declarations, arrow functions, `import`/`export` statements, and `return` expressions now satisfy the assertion without needing triple-backtick fencing.
|
|
23
|
+
- **`hasReadabilityScore` ignored `{min}` object form** — passing `{ min: 40 }` instead of a plain number was coerced to `NaN` threshold, making every call return `true`. The function now unwraps `{ min?, max? }` objects and applies both bounds.
|
|
24
|
+
|
|
25
|
+
### Added
|
|
26
|
+
|
|
27
|
+
- **`autoPaginateGenerator`** — new export for streaming pagination as an `AsyncGenerator<T[]>` (one chunk per page). Use when you want to process pages incrementally rather than wait for all pages to load.
|
|
28
|
+
- **`compareSnapshots(nameA, nameB, dir?)`** — loads both named snapshots from disk and returns a `SnapshotComparison`. Replaces the incorrectly aliased `compareWithSnapshot`.
|
|
29
|
+
- **141 new regression tests** across 9 test files covering all fixes above: `RequestCache` TTL defaults, error class prototype chains, `autoPaginate` flat-array return, `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager` null/undefined handling, `compareSnapshots` disk-load path, `AIEvalClient` default `baseUrl`, `importData` guards, `toContainCode` raw-code detection, and `hasReadabilityScore` object form.
|
|
30
|
+
- **`upgrade --full` post-upgrade warning** — CLI now prints a reminder to run `npx evalgate baseline update` after a full upgrade to avoid a false regression on the next CI run.
|
|
31
|
+
- **Optional chaining on OpenAI / Anthropic integration `traces.create`** — `evalClient.traces?.create(...)` prevents crashes when the `traces` resource is unavailable on the client (e.g. minimal config or testing without a full API key).
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
8
35
|
## [2.2.2] - 2026-03-03
|
|
9
36
|
|
|
10
37
|
### Fixed
|
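The `createEvalRuntime` entry in the 2.2.3 changelog says the function now accepts either a path string or a `{ name?, projectRoot? }` object. A minimal sketch of that kind of normalization follows. The helper name `resolveProjectRoot` and the explicit `cwd` fallback argument (standing in for `process.cwd()`) are assumptions made for illustration, not the SDK's own code.

```typescript
// Hypothetical config shape, modelled on the changelog wording.
type RuntimeConfig = string | { name?: string; projectRoot?: string };

// Resolve the project root from either form of config.
// `cwd` is passed in explicitly so the sketch stays environment-independent.
function resolveProjectRoot(config: RuntimeConfig | undefined, cwd: string): string {
  if (typeof config === "string") {
    return config; // legacy string-only form: the string is the root
  }
  // Object form: use projectRoot when present, otherwise fall back to cwd.
  return config?.projectRoot ?? cwd;
}
```

Branching on `typeof config` keeps the legacy string call working while letting the object form carry extra fields such as `name`.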
package/README.md
CHANGED
|
@@ -450,6 +450,8 @@ Your local `openAIChatEval` runs continue to work. No account cancellation. No d
|
|
|
450
450
|
|
|
451
451
|
See [CHANGELOG.md](CHANGELOG.md) for the full release history.
|
|
452
452
|
|
|
453
|
+
**v2.2.3** — Bug-fix release. `RequestCache` default TTL, `EvalGateError` subclass prototype chain and `retryAfter` direct property, `autoPaginate` now returns `Promise<T[]>` (new `autoPaginateGenerator` for streaming), `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager.save` null/undefined safety, `compareSnapshots` loads both sides from disk, `AIEvalClient` default baseUrl → `https://api.evalgate.com`, `importData` optional-chaining guards, `toContainCode` raw-code detection, `hasReadabilityScore` `{min,max}` object form. 141 new regression tests.
|
|
454
|
+
|
|
453
455
|
**v2.2.2** — 8 stub assertions replaced with real implementations (`hasSentiment` expanded lexicon, `hasNoToxicity` ~80-term blocklist, `hasValidCodeSyntax` real bracket balance, `containsLanguage` 12 languages + BCP-47, `hasFactualAccuracy`/`hasNoHallucinations` case-insensitive, `hasReadabilityScore` per-word syllable fix, `matchesSchema` JSON Schema support). Added LLM-backed `*Async` variants + `configureAssertions`. Fixed `importData` crash, `compareWithSnapshot` object coercion, `WorkflowTracer` defensive guard. 115 new tests.
|
|
454
456
|
|
|
455
457
|
**v2.2.1** — `snapshot(name, output)` accepts objects; auto-serialized via `JSON.stringify`
|
package/dist/assertions.d.ts
CHANGED
|
@@ -126,10 +126,11 @@ export declare class Expectation {
|
|
|
126
126
|
*/
|
|
127
127
|
toBeBetween(min: number, max: number, message?: string): AssertionResult;
|
|
128
128
|
/**
|
|
129
|
-
* Assert value contains code block
|
|
129
|
+
* Assert value contains code block or raw code
|
|
130
130
|
* @example expect(output).toContainCode()
|
|
131
|
+
* @example expect(output).toContainCode('typescript')
|
|
131
132
|
*/
|
|
132
|
-
toContainCode(message?: string): AssertionResult;
|
|
133
|
+
toContainCode(language?: string, message?: string): AssertionResult;
|
|
133
134
|
/**
|
|
134
135
|
* Assert value is professional tone (no profanity)
|
|
135
136
|
* @example expect(output).toBeProfessional()
|
|
@@ -209,9 +210,12 @@ export declare function isValidURL(url: string): boolean;
|
|
|
209
210
|
* facts but cannot detect paraphrased fabrications. Use
|
|
210
211
|
* {@link hasNoHallucinationsAsync} for semantic accuracy.
|
|
211
212
|
*/
|
|
212
|
-
export declare function hasNoHallucinations(text: string, groundTruth
|
|
213
|
+
export declare function hasNoHallucinations(text: string, groundTruth?: string[]): boolean;
|
|
213
214
|
export declare function matchesSchema(value: unknown, schema: Record<string, unknown>): boolean;
|
|
214
|
-
export declare function hasReadabilityScore(text: string, minScore: number
|
|
215
|
+
export declare function hasReadabilityScore(text: string, minScore: number | {
|
|
216
|
+
min?: number;
|
|
217
|
+
max?: number;
|
|
218
|
+
}): boolean;
|
|
215
219
|
/**
|
|
216
220
|
* Keyword-frequency language detector supporting 12 languages.
|
|
217
221
|
* **Fast and approximate** — detects the most common languages reliably
|
|
@@ -234,7 +238,7 @@ export declare function respondedWithinTime(startTime: number, maxMs: number): b
|
|
|
234
238
|
* with an LLM for context-aware moderation.
|
|
235
239
|
*/
|
|
236
240
|
export declare function hasNoToxicity(text: string): boolean;
|
|
237
|
-
export declare function followsInstructions(text: string, instructions: string[]): boolean;
|
|
241
|
+
export declare function followsInstructions(text: string, instructions: string | string[]): boolean;
|
|
238
242
|
export declare function containsAllRequiredFields(obj: unknown, requiredFields: string[]): boolean;
|
|
239
243
|
export interface AssertionLLMConfig {
|
|
240
244
|
provider: "openai" | "anthropic";
|
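The new `toContainCode(language?, message?)` signature lets callers ask for a fenced block in a specific language. A rough sketch of language-aware fence detection is below. `hasFencedBlock` and `escapeRegExp` are hypothetical names, and escaping the language tag is an extra precaution added here so that tags such as `c++` do not produce an invalid pattern; it is not something the declaration file confirms about the package.

```typescript
// Escape characters that have special meaning inside a RegExp source.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return true when `text` contains a fenced block, optionally
// one whose opening fence is followed by the given language tag.
function hasFencedBlock(text: string, language?: string): boolean {
  const fence = "`".repeat(3); // three backticks, built at runtime
  const lang = language ? escapeRegExp(language) : "";
  const pattern = new RegExp(`${fence}${lang}[\\s\\S]*?${fence}`);
  return pattern.test(text);
}
```

This sketch covers only the fenced-block check; HTML `<code>` blocks and unfenced raw code would need separate tests.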
package/dist/assertions.js
CHANGED
|
@@ -234,9 +234,10 @@ class Expectation {
|
|
|
234
234
|
let parsedJson = null;
|
|
235
235
|
try {
|
|
236
236
|
parsedJson = JSON.parse(String(this.value));
|
|
237
|
-
const
|
|
238
|
-
|
|
239
|
-
|
|
237
|
+
const entries = Object.entries(schema);
|
|
238
|
+
passed = entries.every(([key, expectedValue]) => parsedJson !== null &&
|
|
239
|
+
key in parsedJson &&
|
|
240
|
+
JSON.stringify(parsedJson[key]) === JSON.stringify(expectedValue));
|
|
240
241
|
}
|
|
241
242
|
catch (_e) {
|
|
242
243
|
passed = false;
|
|
@@ -436,19 +437,30 @@ class Expectation {
|
|
|
436
437
|
};
|
|
437
438
|
}
|
|
438
439
|
/**
|
|
439
|
-
* Assert value contains code block
|
|
440
|
+
* Assert value contains code block or raw code
|
|
440
441
|
* @example expect(output).toContainCode()
|
|
442
|
+
* @example expect(output).toContainCode('typescript')
|
|
441
443
|
*/
|
|
442
|
-
toContainCode(message) {
|
|
444
|
+
toContainCode(language, message) {
|
|
443
445
|
const text = String(this.value);
|
|
444
|
-
const
|
|
446
|
+
const hasMarkdownBlock = language
|
|
447
|
+
? new RegExp(`\`\`\`${language}[\\s\\S]*?\`\`\``).test(text)
|
|
448
|
+
: /```[\s\S]*?```/.test(text);
|
|
449
|
+
const hasHtmlBlock = /<code>[\s\S]*?<\/code>/.test(text);
|
|
450
|
+
const hasRawCode = /\bfunction\s+\w+\s*\(/.test(text) ||
|
|
451
|
+
/\b(?:const|let|var)\s+\w+\s*=/.test(text) ||
|
|
452
|
+
/\bclass\s+\w+/.test(text) ||
|
|
453
|
+
/=>\s*[{(]/.test(text) ||
|
|
454
|
+
/\bimport\s+.*\bfrom\b/.test(text) ||
|
|
455
|
+
/\bexport\s+(?:default\s+)?(?:function|class|const)/.test(text) ||
|
|
456
|
+
/\breturn\s+.+;/.test(text);
|
|
457
|
+
const hasCodeBlock = hasMarkdownBlock || hasHtmlBlock || hasRawCode;
|
|
445
458
|
return {
|
|
446
459
|
name: "toContainCode",
|
|
447
460
|
passed: hasCodeBlock,
|
|
448
|
-
expected: "code block",
|
|
461
|
+
expected: language ? `code block (${language})` : "code block",
|
|
449
462
|
actual: text,
|
|
450
|
-
message: message ||
|
|
451
|
-
(hasCodeBlock ? "Contains code block" : "No code block found"),
|
|
463
|
+
message: message || (hasCodeBlock ? "Contains code" : "No code found"),
|
|
452
464
|
};
|
|
453
465
|
}
|
|
454
466
|
/**
|
|
@@ -719,7 +731,7 @@ function isValidURL(url) {
|
|
|
719
731
|
* facts but cannot detect paraphrased fabrications. Use
|
|
720
732
|
* {@link hasNoHallucinationsAsync} for semantic accuracy.
|
|
721
733
|
*/
|
|
722
|
-
function hasNoHallucinations(text, groundTruth) {
|
|
734
|
+
function hasNoHallucinations(text, groundTruth = []) {
|
|
723
735
|
const lower = text.toLowerCase();
|
|
724
736
|
return groundTruth.every((truth) => lower.includes(truth.toLowerCase()));
|
|
725
737
|
}
|
|
@@ -739,12 +751,14 @@ function matchesSchema(value, schema) {
|
|
|
739
751
|
return Object.keys(schema).every((key) => key in obj);
|
|
740
752
|
}
|
|
741
753
|
function hasReadabilityScore(text, minScore) {
|
|
754
|
+
const threshold = typeof minScore === "number" ? minScore : (minScore.min ?? 0);
|
|
755
|
+
const maxThreshold = typeof minScore === "object" ? minScore.max : undefined;
|
|
742
756
|
const wordList = text.trim().split(/\s+/).filter(Boolean);
|
|
743
757
|
const words = wordList.length || 1;
|
|
744
758
|
const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0).length || 1;
|
|
745
759
|
const totalSyllables = wordList.reduce((sum, w) => sum + syllables(w), 0);
|
|
746
760
|
const score = 206.835 - 1.015 * (words / sentences) - 84.6 * (totalSyllables / words);
|
|
747
|
-
return score >=
|
|
761
|
+
return (score >= threshold && (maxThreshold === undefined || score <= maxThreshold));
|
|
748
762
|
}
|
|
749
763
|
function syllables(word) {
|
|
750
764
|
// Simple syllable counter
|
|
@@ -1154,7 +1168,10 @@ function hasNoToxicity(text) {
|
|
|
1154
1168
|
return !toxicTerms.some((term) => lower.includes(term));
|
|
1155
1169
|
}
|
|
1156
1170
|
function followsInstructions(text, instructions) {
|
|
1157
|
-
|
|
1171
|
+
const instructionList = Array.isArray(instructions)
|
|
1172
|
+
? instructions
|
|
1173
|
+
: [instructions];
|
|
1174
|
+
return instructionList.every((instruction) => {
|
|
1158
1175
|
if (instruction.startsWith("!")) {
|
|
1159
1176
|
return !text.includes(instruction.slice(1));
|
|
1160
1177
|
}
|
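The `hasReadabilityScore` hunk above unwraps a threshold that may be a plain number or a `{ min?, max? }` object. The same bound check can be isolated as a small sketch. `withinReadabilityBounds` and `ScoreBound` are illustrative names, and the score is taken as an input rather than computed from text.

```typescript
// Threshold in either of the two accepted forms.
type ScoreBound = number | { min?: number; max?: number };

// Apply the lower bound (default 0 for the object form) and,
// when present, the upper bound.
function withinReadabilityBounds(score: number, bound: ScoreBound): boolean {
  const min = typeof bound === "number" ? bound : bound.min ?? 0;
  const max = typeof bound === "number" ? undefined : bound.max;
  return score >= min && (max === undefined || score <= max);
}
```

Because `min` defaults to 0 in the object form, a negative Flesch score fails even when only `max` is supplied.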
package/dist/cache.d.ts
CHANGED
|
@@ -21,7 +21,7 @@ export declare class RequestCache {
|
|
|
21
21
|
/**
|
|
22
22
|
* Store response in cache
|
|
23
23
|
*/
|
|
24
|
-
set<T>(method: string, url: string, data: T, ttl
|
|
24
|
+
set<T>(method: string, url: string, data: T, ttl?: number, params?: unknown): void;
|
|
25
25
|
/**
|
|
26
26
|
* Invalidate specific cache entry
|
|
27
27
|
*/
|
package/dist/cache.js
CHANGED
|
@@ -43,7 +43,7 @@ class RequestCache {
|
|
|
43
43
|
/**
|
|
44
44
|
* Store response in cache
|
|
45
45
|
*/
|
|
46
|
-
set(method, url, data, ttl, params) {
|
|
46
|
+
set(method, url, data, ttl = exports.CacheTTL.MEDIUM, params) {
|
|
47
47
|
// Enforce cache size limit (LRU-style)
|
|
48
48
|
if (this.cache.size >= this.maxSize) {
|
|
49
49
|
const firstKey = this.cache.keys().next().value;
|
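The `set` change above moves the TTL into a default parameter so that callers who omit it still get a live entry. A self-contained sketch of the idea is below. `TinyCache`, `DEFAULT_TTL_MS`, and the injectable `now` argument are assumptions for illustration; the real value of `CacheTTL.MEDIUM` is not shown in this diff.

```typescript
// Hypothetical default; the SDK's CacheTTL.MEDIUM value is not shown in the diff.
const DEFAULT_TTL_MS = 5 * 60 * 1000;

class TinyCache<T> {
  private store = new Map<string, { data: T; expiresAt: number }>();

  // `ttl` falls back to the default when omitted or passed as undefined.
  // `now` is injectable so expiry can be checked deterministically.
  set(key: string, data: T, ttl: number = DEFAULT_TTL_MS, now: number = Date.now()): void {
    this.store.set(key, { data, expiresAt: now + ttl });
  }

  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.store.delete(key); // drop stale entries on read
      return undefined;
    }
    return entry.data;
  }
}
```

A default parameter also applies when the argument is explicitly `undefined`, which suits call sites that forward an optional `ttl`.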
package/dist/cli/upgrade.js
CHANGED
|
@@ -480,7 +480,12 @@ After upgrading:
|
|
|
480
480
|
console.log(" - package.json eval:regression-gate + eval:baseline-update");
|
|
481
481
|
console.log(" - .github/workflows/ Gate + governance workflows");
|
|
482
482
|
console.log(" - .github/CODEOWNERS Baseline requires approval\n");
|
|
483
|
+
console.log(" ⚠️ IMPORTANT — Reset your baseline before pushing:");
|
|
484
|
+
console.log(" The gate compares against your existing Tier 1 baseline.");
|
|
485
|
+
console.log(" If your test script changed, run this first to avoid an immediate regression:");
|
|
486
|
+
console.log(" npx evalgate baseline update (or: pnpm eval:baseline-update)\n");
|
|
483
487
|
console.log(" Next:");
|
|
488
|
+
console.log(" npx evalgate baseline update");
|
|
484
489
|
console.log(" git add -A");
|
|
485
490
|
console.log(" git commit -m 'chore: upgrade EvalGate gate to Tier 2'");
|
|
486
491
|
console.log(" git push\n");
|
package/dist/client.js
CHANGED
|
@@ -72,7 +72,7 @@ class AIEvalClient {
|
|
|
72
72
|
this.baseUrl =
|
|
73
73
|
config.baseUrl ||
|
|
74
74
|
getEnvVar("EVALGATE_BASE_URL", "EVALAI_BASE_URL") ||
|
|
75
|
-
(isBrowser ? "" : "
|
|
75
|
+
(isBrowser ? "" : "https://api.evalgate.com");
|
|
76
76
|
this.timeout = config.timeout || 30000;
|
|
77
77
|
// Tier 4.17: Debug mode with request logging
|
|
78
78
|
const logLevel = config.logLevel || (config.debug ? "debug" : "info");
|
package/dist/errors.js
CHANGED
|
@@ -271,6 +271,10 @@ class RateLimitError extends EvalGateError {
|
|
|
271
271
|
constructor(message, retryAfter) {
|
|
272
272
|
super(message, "RATE_LIMIT_EXCEEDED", 429, { retryAfter });
|
|
273
273
|
this.name = "RateLimitError";
|
|
274
|
+
if (retryAfter !== undefined) {
|
|
275
|
+
this.retryAfter = retryAfter;
|
|
276
|
+
}
|
|
277
|
+
Object.setPrototypeOf(this, RateLimitError.prototype);
|
|
274
278
|
}
|
|
275
279
|
}
|
|
276
280
|
exports.RateLimitError = RateLimitError;
|
|
@@ -278,6 +282,7 @@ class AuthenticationError extends EvalGateError {
|
|
|
278
282
|
constructor(message = "Authentication failed") {
|
|
279
283
|
super(message, "AUTHENTICATION_ERROR", 401);
|
|
280
284
|
this.name = "AuthenticationError";
|
|
285
|
+
Object.setPrototypeOf(this, AuthenticationError.prototype);
|
|
281
286
|
}
|
|
282
287
|
}
|
|
283
288
|
exports.AuthenticationError = AuthenticationError;
|
|
@@ -285,6 +290,7 @@ class ValidationError extends EvalGateError {
|
|
|
285
290
|
constructor(message = "Validation failed", details) {
|
|
286
291
|
super(message, "VALIDATION_ERROR", 400, details);
|
|
287
292
|
this.name = "ValidationError";
|
|
293
|
+
Object.setPrototypeOf(this, ValidationError.prototype);
|
|
288
294
|
}
|
|
289
295
|
}
|
|
290
296
|
exports.ValidationError = ValidationError;
|
|
@@ -293,6 +299,7 @@ class NetworkError extends EvalGateError {
|
|
|
293
299
|
super(message, "NETWORK_ERROR", 0);
|
|
294
300
|
this.name = "NetworkError";
|
|
295
301
|
this.retryable = true;
|
|
302
|
+
Object.setPrototypeOf(this, NetworkError.prototype);
|
|
296
303
|
}
|
|
297
304
|
}
|
|
298
305
|
exports.NetworkError = NetworkError;
|
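The `errors.js` hunks add `Object.setPrototypeOf(this, Subclass.prototype)` to each subclass and assign `retryAfter` directly on the instance. The pattern can be sketched in isolation as follows. `BaseError` and `RateLimited` are stand-in names, not the SDK classes, and the constructor shape is only modelled on the `super(...)` calls visible in the diff.

```typescript
class BaseError extends Error {
  constructor(
    message: string,
    public code: string,
    public status: number,
    public details?: Record<string, unknown>,
  ) {
    super(message);
    this.name = "BaseError";
    // Restore the prototype chain; needed when compiling to targets
    // where extending Error loses the subclass prototype.
    Object.setPrototypeOf(this, BaseError.prototype);
  }
}

class RateLimited extends BaseError {
  retryAfter?: number;

  constructor(message: string, retryAfter?: number) {
    super(message, "RATE_LIMIT_EXCEEDED", 429, { retryAfter });
    this.name = "RateLimited"; // set after super() so the base name is replaced
    if (retryAfter !== undefined) {
      this.retryAfter = retryAfter; // direct property, not only details.retryAfter
    }
    Object.setPrototypeOf(this, RateLimited.prototype);
  }
}
```

With the prototype restored in each constructor, `instanceof` checks hold for both the subclass and the base class regardless of compile target.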
package/dist/export.js
CHANGED
|
@@ -155,7 +155,7 @@ async function importData(client, data, options = {}) {
|
|
|
155
155
|
return result;
|
|
156
156
|
}
|
|
157
157
|
// Import traces
|
|
158
|
-
if (data.traces) {
|
|
158
|
+
if (data.traces && client?.traces) {
|
|
159
159
|
const traceResults = { imported: 0, skipped: 0, failed: 0 };
|
|
160
160
|
for (const trace of data.traces) {
|
|
161
161
|
try {
|
|
@@ -191,7 +191,7 @@ async function importData(client, data, options = {}) {
|
|
|
191
191
|
result.summary.total += data.traces.length;
|
|
192
192
|
}
|
|
193
193
|
// Import evaluations
|
|
194
|
-
if (data.evaluations) {
|
|
194
|
+
if (data.evaluations && client?.evaluations) {
|
|
195
195
|
const evalResults = { imported: 0, skipped: 0, failed: 0 };
|
|
196
196
|
for (const evaluation of data.evaluations) {
|
|
197
197
|
try {
|
package/dist/index.d.ts
CHANGED
|
@@ -20,8 +20,8 @@ export { createEvalRuntime, disposeActiveRuntime, getActiveRuntime, setActiveRun
|
|
|
20
20
|
export type { CloudExecutor, DefineEvalFunction, EvalContext, EvalExecutor, EvalExecutorInterface, EvalOptions, EvalResult, EvalRuntime, EvalSpec, ExecutorCapabilities, LocalExecutor, SpecConfig, SpecOptions, WorkerExecutor, } from "./runtime/types";
|
|
21
21
|
export { EvalRuntimeError, RuntimeError, SpecExecutionError, SpecRegistrationError, } from "./runtime/types";
|
|
22
22
|
export { createTestSuite, type TestCaseResult, TestSuite, TestSuiteCase, TestSuiteCaseResult, TestSuiteConfig, TestSuiteResult, } from "./testing";
|
|
23
|
-
import { compareWithSnapshot, snapshot } from "./snapshot";
|
|
24
|
-
export { snapshot, compareWithSnapshot, snapshot as saveSnapshot,
|
|
23
|
+
import { compareSnapshots, compareWithSnapshot, snapshot } from "./snapshot";
|
|
24
|
+
export { snapshot, compareWithSnapshot, compareSnapshots, snapshot as saveSnapshot, };
|
|
25
25
|
import type { ExportFormat } from "./export";
|
|
26
26
|
import { exportData, importData } from "./export";
|
|
27
27
|
export { exportData, importData };
|
|
@@ -34,7 +34,7 @@ export { traceOpenAI } from "./integrations/openai";
|
|
|
34
34
|
export { type OpenAIChatEvalCase, type OpenAIChatEvalOptions, type OpenAIChatEvalResult, openAIChatEval, } from "./integrations/openai-eval";
|
|
35
35
|
export { Logger } from "./logger";
|
|
36
36
|
export { extendExpectWithToPassGate } from "./matchers";
|
|
37
|
-
export { autoPaginate, createPaginatedIterator, decodeCursor, encodeCursor, PaginatedIterator, type PaginatedResponse, type PaginationParams, } from "./pagination";
|
|
37
|
+
export { autoPaginate, autoPaginateGenerator, createPaginatedIterator, decodeCursor, encodeCursor, PaginatedIterator, type PaginatedResponse, type PaginationParams, } from "./pagination";
|
|
38
38
|
export { ARTIFACTS, type Baseline, type BaselineTolerance, GATE_CATEGORY, GATE_EXIT, type GateCategory, type GateExitCode, REPORT_SCHEMA_VERSION, type RegressionDelta, type RegressionReport, } from "./regression";
|
|
39
39
|
export { batchProcess, batchRead, RateLimiter, streamEvaluation, } from "./streaming";
|
|
40
40
|
export type { Annotation, AnnotationItem, AnnotationTask, APIKey, APIKeyUsage, APIKeyWithSecret, BatchOptions, ClientConfig as AIEvalConfig, CreateAnnotationItemParams, CreateAnnotationParams, CreateAnnotationTaskParams, CreateAPIKeyParams, CreateLLMJudgeConfigParams, CreateWebhookParams, Evaluation as EvaluationData, EvaluationRun, EvaluationRunDetail, ExportOptions, GenericMetadata as AnnotationData, GetLLMJudgeAlignmentParams, GetUsageParams, ImportOptions, ListAnnotationItemsParams, ListAnnotationsParams, ListAnnotationTasksParams, ListAPIKeysParams, ListLLMJudgeConfigsParams, ListLLMJudgeResultsParams, ListWebhookDeliveriesParams, ListWebhooksParams, LLMJudgeAlignment, LLMJudgeConfig, LLMJudgeEvaluateResult, LLMJudgeResult as LLMJudgeData, Organization, RetryConfig, SnapshotData, Span as SpanData, StreamOptions, TestCase, TestResult, Trace as TraceData, TraceDetail, TracedResponse, UpdateAPIKeyParams, UpdateWebhookParams, UsageStats, UsageSummary, Webhook, WebhookDelivery, } from "./types";
|
package/dist/index.js
CHANGED
|
@@ -9,7 +9,7 @@
|
|
|
9
9
|
*/
|
|
10
10
|
Object.defineProperty(exports, "__esModule", { value: true });
|
|
11
11
|
exports.defaultLocalExecutor = exports.createLocalExecutor = exports.evalai = exports.defineSuite = exports.defineEval = exports.createResult = exports.createEvalContext = exports.validateContext = exports.mergeContexts = exports.cloneContext = exports.ContextManager = exports.withContext = exports.getContext = exports.createContext = exports.withinRange = exports.similarTo = exports.respondedWithinTime = exports.notContainsPII = exports.matchesSchema = exports.matchesPattern = exports.isValidURL = exports.isValidEmail = exports.hasValidCodeSyntaxAsync = exports.hasValidCodeSyntax = exports.hasSentimentAsync = exports.hasSentiment = exports.hasReadabilityScore = exports.hasPII = exports.hasNoToxicityAsync = exports.hasNoToxicity = exports.hasNoHallucinationsAsync = exports.hasNoHallucinations = exports.hasLength = exports.hasFactualAccuracyAsync = exports.hasFactualAccuracy = exports.getAssertionConfig = exports.followsInstructions = exports.expect = exports.containsLanguageAsync = exports.containsLanguage = exports.containsKeywords = exports.containsJSON = exports.containsAllRequiredFields = exports.configureAssertions = exports.NetworkError = exports.ValidationError = exports.AuthenticationError = exports.RateLimitError = exports.EvalGateError = exports.AIEvalClient = void 0;
|
|
12
|
-
exports.WorkflowTracer = exports.traceWorkflowStep = exports.traceLangChainAgent = exports.traceCrewAI = exports.traceAutoGen = exports.createWorkflowTracer = exports.EvaluationTemplates = exports.streamEvaluation = exports.RateLimiter = exports.batchRead = exports.batchProcess = exports.REPORT_SCHEMA_VERSION = exports.GATE_EXIT = exports.GATE_CATEGORY = exports.ARTIFACTS = exports.PaginatedIterator = exports.encodeCursor = exports.decodeCursor = exports.createPaginatedIterator = exports.autoPaginate = exports.extendExpectWithToPassGate = exports.Logger = exports.openAIChatEval = exports.traceOpenAI = exports.traceAnthropic = exports.runCheck = exports.parseArgs = exports.EXIT = exports.RequestCache = exports.CacheTTL = exports.RequestBatcher = exports.importData = exports.exportData = exports.
|
|
12
|
+
exports.WorkflowTracer = exports.traceWorkflowStep = exports.traceLangChainAgent = exports.traceCrewAI = exports.traceAutoGen = exports.createWorkflowTracer = exports.EvaluationTemplates = exports.streamEvaluation = exports.RateLimiter = exports.batchRead = exports.batchProcess = exports.REPORT_SCHEMA_VERSION = exports.GATE_EXIT = exports.GATE_CATEGORY = exports.ARTIFACTS = exports.PaginatedIterator = exports.encodeCursor = exports.decodeCursor = exports.createPaginatedIterator = exports.autoPaginateGenerator = exports.autoPaginate = exports.extendExpectWithToPassGate = exports.Logger = exports.openAIChatEval = exports.traceOpenAI = exports.traceAnthropic = exports.runCheck = exports.parseArgs = exports.EXIT = exports.RequestCache = exports.CacheTTL = exports.RequestBatcher = exports.importData = exports.exportData = exports.saveSnapshot = exports.compareSnapshots = exports.compareWithSnapshot = exports.snapshot = exports.TestSuite = exports.createTestSuite = exports.SpecRegistrationError = exports.SpecExecutionError = exports.RuntimeError = exports.EvalRuntimeError = exports.setActiveRuntime = exports.getActiveRuntime = exports.disposeActiveRuntime = exports.createEvalRuntime = void 0;
|
|
13
13
|
// Main SDK exports
|
|
14
14
|
var client_1 = require("./client");
|
|
15
15
|
Object.defineProperty(exports, "AIEvalClient", { enumerable: true, get: function () { return client_1.AIEvalClient; } });
|
|
@@ -91,8 +91,8 @@ Object.defineProperty(exports, "createTestSuite", { enumerable: true, get: funct
|
|
|
91
91
|
Object.defineProperty(exports, "TestSuite", { enumerable: true, get: function () { return testing_1.TestSuite; } });
|
|
92
92
|
// Snapshot testing (Tier 2.8)
|
|
93
93
|
const snapshot_1 = require("./snapshot");
|
|
94
|
+
Object.defineProperty(exports, "compareSnapshots", { enumerable: true, get: function () { return snapshot_1.compareSnapshots; } });
|
|
94
95
|
Object.defineProperty(exports, "compareWithSnapshot", { enumerable: true, get: function () { return snapshot_1.compareWithSnapshot; } });
|
|
95
|
-
Object.defineProperty(exports, "compareSnapshots", { enumerable: true, get: function () { return snapshot_1.compareWithSnapshot; } });
|
|
96
96
|
Object.defineProperty(exports, "snapshot", { enumerable: true, get: function () { return snapshot_1.snapshot; } });
|
|
97
97
|
Object.defineProperty(exports, "saveSnapshot", { enumerable: true, get: function () { return snapshot_1.snapshot; } });
|
|
98
98
|
// Export/Import utilities (Tier 4.18)
|
|
@@ -130,6 +130,7 @@ var matchers_1 = require("./matchers");
|
|
|
130
130
|
Object.defineProperty(exports, "extendExpectWithToPassGate", { enumerable: true, get: function () { return matchers_1.extendExpectWithToPassGate; } });
|
|
131
131
|
var pagination_1 = require("./pagination");
|
|
132
132
|
Object.defineProperty(exports, "autoPaginate", { enumerable: true, get: function () { return pagination_1.autoPaginate; } });
|
|
133
|
+
Object.defineProperty(exports, "autoPaginateGenerator", { enumerable: true, get: function () { return pagination_1.autoPaginateGenerator; } });
|
|
133
134
|
Object.defineProperty(exports, "createPaginatedIterator", { enumerable: true, get: function () { return pagination_1.createPaginatedIterator; } });
|
|
134
135
|
Object.defineProperty(exports, "decodeCursor", { enumerable: true, get: function () { return pagination_1.decodeCursor; } });
|
|
135
136
|
Object.defineProperty(exports, "encodeCursor", { enumerable: true, get: function () { return pagination_1.encodeCursor; } });
|
package/dist/integrations/anthropic.js
CHANGED
|
@@ -67,7 +67,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
|
|
|
67
67
|
}
|
|
68
68
|
: {}),
|
|
69
69
|
});
|
|
70
|
-
await evalClient.traces
|
|
70
|
+
await evalClient.traces?.create({
|
|
71
71
|
name: `Anthropic: ${params.model}`,
|
|
72
72
|
traceId,
|
|
73
73
|
organizationId: organizationId || evalClient.getOrganizationId(),
|
|
@@ -89,7 +89,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
|
|
|
89
89
|
error: error instanceof Error ? error.message : String(error),
|
|
90
90
|
});
|
|
91
91
|
await evalClient.traces
|
|
92
|
-
|
|
92
|
+
?.create({
|
|
93
93
|
name: `Anthropic: ${params.model}`,
|
|
94
94
|
traceId,
|
|
95
95
|
organizationId: organizationId || evalClient.getOrganizationId(),
|
|
@@ -97,7 +97,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
|
|
|
97
97
|
durationMs,
|
|
98
98
|
metadata: errorMetadata,
|
|
99
99
|
})
|
|
100
|
-
|
|
100
|
+
?.catch(() => {
|
|
101
101
|
// Ignore errors in trace creation to avoid masking the original error
|
|
102
102
|
});
|
|
103
103
|
throw error;
|
|
@@ -127,7 +127,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
|
|
|
127
127
|
const startTime = Date.now();
|
|
128
128
|
const traceId = `anthropic-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
|
|
129
129
|
try {
|
|
130
|
-
await evalClient.traces
|
|
130
|
+
await evalClient.traces?.create({
|
|
131
131
|
name,
|
|
132
132
|
traceId,
|
|
133
133
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
|
@@ -136,7 +136,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
|
|
|
136
136
|
});
|
|
137
137
|
const result = await fn();
|
|
138
138
|
const durationMs = Date.now() - startTime;
|
|
139
|
-
await evalClient.traces
|
|
139
|
+
await evalClient.traces?.create({
|
|
140
140
|
name,
|
|
141
141
|
traceId,
|
|
142
142
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
|
@@ -148,7 +148,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
|
|
|
148
148
|
}
|
|
149
149
|
catch (error) {
|
|
150
150
|
const durationMs = Date.now() - startTime;
|
|
151
|
-
await evalClient.traces
|
|
151
|
+
await evalClient.traces?.create({
|
|
152
152
|
name,
|
|
153
153
|
traceId,
|
|
154
154
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
package/dist/integrations/openai.js
CHANGED
|
@@ -65,7 +65,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
|
|
|
65
65
|
}
|
|
66
66
|
: {}),
|
|
67
67
|
});
|
|
68
|
-
await evalClient.traces
|
|
68
|
+
await evalClient.traces?.create({
|
|
69
69
|
name: `OpenAI: ${params.model}`,
|
|
70
70
|
traceId,
|
|
71
71
|
organizationId: organizationId || evalClient.getOrganizationId(),
|
|
@@ -87,7 +87,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
|
|
|
87
87
|
error: error instanceof Error ? error.message : String(error),
|
|
88
88
|
});
|
|
89
89
|
await evalClient.traces
|
|
90
|
-
|
|
90
|
+
?.create({
|
|
91
91
|
name: `OpenAI: ${params.model}`,
|
|
92
92
|
traceId,
|
|
93
93
|
organizationId: organizationId || evalClient.getOrganizationId(),
|
|
@@ -95,7 +95,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
|
|
|
95
95
|
durationMs,
|
|
96
96
|
metadata: errorMetadata,
|
|
97
97
|
})
|
|
98
|
-
|
|
98
|
+
?.catch(() => {
|
|
99
99
|
// Ignore errors in trace creation to avoid masking the original error
|
|
100
100
|
});
|
|
101
101
|
throw error;
|
|
@@ -124,7 +124,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
|
|
|
124
124
|
const startTime = Date.now();
|
|
125
125
|
const traceId = `openai-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
|
|
126
126
|
try {
|
|
127
|
-
await evalClient.traces
|
|
127
|
+
await evalClient.traces?.create({
|
|
128
128
|
name,
|
|
129
129
|
traceId,
|
|
130
130
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
|
@@ -133,7 +133,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
|
|
|
133
133
|
});
|
|
134
134
|
const result = await fn();
|
|
135
135
|
const durationMs = Date.now() - startTime;
|
|
136
|
-
await evalClient.traces
|
|
136
|
+
await evalClient.traces?.create({
|
|
137
137
|
name,
|
|
138
138
|
traceId,
|
|
139
139
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
|
@@ -145,7 +145,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
|
|
|
145
145
|
}
|
|
146
146
|
catch (error) {
|
|
147
147
|
const durationMs = Date.now() - startTime;
|
|
148
|
-
await evalClient.traces
|
|
148
|
+
await evalClient.traces?.create({
|
|
149
149
|
name,
|
|
150
150
|
traceId,
|
|
151
151
|
organizationId: options.organizationId || evalClient.getOrganizationId(),
|
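The added lines in both integration files call `evalClient.traces?.create(...)`. With optional chaining, the call is skipped and the expression evaluates to `undefined` when `traces` is missing, so the wrapped provider call can still proceed. A minimal sketch of that behavior follows; `TraceApi`, `FakeClient`, and `recordTrace` are hypothetical stand-ins, since the real client shape is not shown in this diff.

```typescript
// Hypothetical stand-in types; the real evalClient shape is not part of this diff.
interface TraceApi {
  create(payload: { name: string; traceId: string }): Promise<{ ok: boolean }>;
}

interface FakeClient {
  traces?: TraceApi;
}

// Mirrors the pattern in the added lines: `await client.traces?.create({...})`.
async function recordTrace(
  client: FakeClient,
  traceId: string,
): Promise<{ ok: boolean } | undefined> {
  // When `traces` is undefined the call is skipped and the result is undefined.
  return await client.traces?.create({ name: "demo", traceId });
}

const withTraces: FakeClient = {
  traces: { create: async () => ({ ok: true }) },
};
const withoutTraces: FakeClient = {};
```

In the error path of `traceOpenAI`, the chain continues with `?.catch(...)`, which additionally tolerates a `create` that returns `undefined` rather than a promise.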
package/dist/pagination.d.ts
CHANGED
|
@@ -50,9 +50,20 @@ export declare function createPaginatedIterator<T>(fetchFn: (offset: number, lim
|
|
|
50
50
|
hasMore: boolean;
|
|
51
51
|
}>, limit?: number): PaginatedIterator<T>;
|
|
52
52
|
/**
|
|
53
|
-
* Auto-paginate helper that fetches all pages
|
|
53
|
+
* Auto-paginate helper that fetches all pages and returns a flat array.
|
|
54
|
+
* @example
|
|
55
|
+
* ```typescript
|
|
56
|
+
* const allItems = await autoPaginate(
|
|
57
|
+
* (offset, limit) => client.traces.list({ offset, limit }),
|
|
58
|
+
* );
|
|
59
|
+
* ```
|
|
54
60
|
*/
|
|
55
|
-
export declare function autoPaginate<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number):
|
|
61
|
+
export declare function autoPaginate<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number): Promise<T[]>;
|
|
62
|
+
/**
|
|
63
|
+
* Streaming auto-paginate generator — yields individual items one at a time.
|
|
64
|
+
* Use this when you want to process items as they arrive rather than waiting for all pages.
|
|
65
|
+
*/
|
|
66
|
+
export declare function autoPaginateGenerator<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number): AsyncGenerator<T, void, unknown>;
|
|
56
67
|
/**
|
|
57
68
|
* Encode cursor for pagination (base64)
|
|
58
69
|
*/
|
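The declarations above split pagination into two styles: `autoPaginate` resolves to a flat array, while `autoPaginateGenerator` yields items one at a time. The sketch below illustrates the difference with local stand-ins (`paginateItems`, `collectAll`, and `fetchDemoPage` are hypothetical names, not SDK exports). It assumes a page shorter than `limit` marks the end, as in the `hasMore = items.length === limit` check in the `pagination.js` hunk.

```typescript
// Local stand-ins with the declared signatures; not imported from @evalgate/sdk.
type FetchPage<T> = (offset: number, limit: number) => Promise<T[]>;

// Streams items page by page; stops after an empty or short page.
async function* paginateItems<T>(
  fetchFn: FetchPage<T>,
  limit = 50,
): AsyncGenerator<T, void, unknown> {
  let offset = 0;
  let hasMore = true;
  while (hasMore) {
    const items = await fetchFn(offset, limit);
    for (const item of items) {
      yield item;
    }
    hasMore = items.length === limit;
    offset += limit;
  }
}

// Collects every item into one flat array before returning.
async function collectAll<T>(fetchFn: FetchPage<T>, limit = 50): Promise<T[]> {
  const result: T[] = [];
  for await (const item of paginateItems(fetchFn, limit)) {
    result.push(item);
  }
  return result;
}

// Hypothetical data source: five items served in pages.
const demoRows = [1, 2, 3, 4, 5];
const fetchDemoPage: FetchPage<number> = async (offset, limit) =>
  demoRows.slice(offset, offset + limit);
```

The generator form lets a caller `break` out of a `for await` loop early, which avoids fetching the remaining pages; the array form always fetches everything first.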
package/dist/pagination.js
CHANGED
|
@@ -6,6 +6,7 @@ Object.defineProperty(exports, "__esModule", { value: true });
|
|
|
6
6
|
exports.PaginatedIterator = void 0;
|
|
7
7
|
exports.createPaginatedIterator = createPaginatedIterator;
|
|
8
8
|
exports.autoPaginate = autoPaginate;
|
|
9
|
+
exports.autoPaginateGenerator = autoPaginateGenerator;
|
|
9
10
|
exports.encodeCursor = encodeCursor;
|
|
10
11
|
exports.decodeCursor = decodeCursor;
|
|
11
12
|
exports.createPaginationMeta = createPaginationMeta;
|
|
@@ -56,9 +57,34 @@ function createPaginatedIterator(fetchFn, limit = 50) {
|
|
|
56
57
|
return new PaginatedIterator(fetchFn, limit);
|
|
57
58
|
}
|
|
58
59
|
/**
|
|
59
|
-
* Auto-paginate helper that fetches all pages
|
|
60
|
+
* Auto-paginate helper that fetches all pages and returns a flat array.
|
|
61
|
+
* @example
|
|
62
|
+
* ```typescript
|
|
63
|
+
* const allItems = await autoPaginate(
|
|
64
|
+
* (offset, limit) => client.traces.list({ offset, limit }),
|
|
65
|
+
* );
|
|
66
|
+
* ```
|
|
60
67
|
*/
|
|
61
|
-
async function
|
|
68
|
+
async function autoPaginate(fetchFn, limit = 50) {
|
|
69
|
+
const result = [];
|
|
70
|
+
let offset = 0;
|
|
71
|
+
let hasMore = true;
|
|
72
|
+
while (hasMore) {
|
|
73
|
+
const items = await fetchFn(offset, limit);
|
|
74
|
+
if (items.length === 0) {
|
|
75
|
+
break;
|
|
76
|
+
}
|
|
77
|
+
result.push(...items);
|
|
78
|
+
hasMore = items.length === limit;
|
|
79
|
+
offset += limit;
|
|
80
|
+
}
|
|
81
|
+
return result;
|
|
82
|
+
}
|
|
83
|
+
/**
|
|
84
|
+
* Streaming auto-paginate generator — yields individual items one at a time.
|
|
85
|
+
* Use this when you want to process items as they arrive rather than waiting for all pages.
|
|
86
|
+
*/
|
|
87
|
+
async function* autoPaginateGenerator(fetchFn, limit = 50) {
|
|
62
88
|
let offset = 0;
|
|
63
89
|
let hasMore = true;
|
|
64
90
|
while (hasMore) {
|
package/dist/runtime/adapters/testsuite-to-dsl.js
CHANGED
|
@@ -208,12 +208,7 @@ function generateDefineEvalCode(suite, options = {}) {
|
|
|
208
208
|
});
|
|
209
209
|
const helperFunctions = generateHelperFunctionsForSuite(specs, options);
|
|
210
210
|
const evaluationFunction = generateEvaluationFunction();
|
|
211
|
-
return [
|
|
212
|
-
...imports,
|
|
213
|
-
...helperFunctions,
|
|
214
|
-
...evaluationFunction,
|
|
215
|
-
...specCode,
|
|
216
|
-
].join("\n");
|
|
211
|
+
return [...imports, helperFunctions, evaluationFunction, ...specCode].join("\n");
|
|
217
212
|
}
|
|
218
213
|
/**
|
|
219
214
|
* Generate helper functions for a specific spec
|
package/dist/runtime/executor.d.ts
CHANGED
|
@@ -10,7 +10,8 @@ import type { LocalExecutor } from "./types";
|
|
|
10
10
|
*/
|
|
11
11
|
export declare function createLocalExecutor(): LocalExecutor;
|
|
12
12
|
/**
|
|
13
|
-
* Default local executor
|
|
13
|
+
* Default local executor factory
|
|
14
|
+
* Call as defaultLocalExecutor() to get a new executor instance.
|
|
14
15
|
* For convenience in simple use cases
|
|
15
16
|
*/
|
|
16
|
-
export declare const defaultLocalExecutor:
|
|
17
|
+
export declare const defaultLocalExecutor: typeof createLocalExecutor;
|
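The new declaration types `defaultLocalExecutor` as `typeof createLocalExecutor`, which makes it a factory function rather than an executor instance. The sketch below shows what that typing implies for callers; `DemoExecutor`, `createDemoExecutor`, and `defaultDemoExecutor` are hypothetical names, because the SDK's `LocalExecutor` interface is not shown in this diff.

```typescript
// Hypothetical minimal executor; stands in for the SDK's LocalExecutor.
let nextExecutorId = 0;

class DemoExecutor {
  readonly id: number = nextExecutorId++;
}

function createDemoExecutor(): DemoExecutor {
  return new DemoExecutor();
}

// Alias typed like the declaration above: a factory function, not an instance.
const defaultDemoExecutor: typeof createDemoExecutor = createDemoExecutor;

// Each call produces a separate executor.
const firstExecutor = defaultDemoExecutor();
const secondExecutor = defaultDemoExecutor();
```

Code that previously treated the export as a ready-made instance would need to call it first under this typing.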
package/dist/runtime/executor.js
CHANGED
|
@@ -146,7 +146,8 @@ function createLocalExecutor() {
|
|
|
146
146
|
return new LocalExecutorImpl();
|
|
147
147
|
}
|
|
148
148
|
/**
|
|
149
|
-
* Default local executor
|
|
149
|
+
* Default local executor factory
|
|
150
|
+
* Call as defaultLocalExecutor() to get a new executor instance.
|
|
150
151
|
* For convenience in simple use cases
|
|
151
152
|
*/
|
|
152
|
-
exports.defaultLocalExecutor = createLocalExecutor
|
|
153
|
+
exports.defaultLocalExecutor = createLocalExecutor;
|
package/dist/runtime/registry.d.ts
CHANGED
|
@@ -61,7 +61,10 @@ export interface SerializedSpec {
|
|
|
61
61
|
* Create a new scoped runtime with lifecycle management
|
|
62
62
|
* Returns a handle for proper resource management
|
|
63
63
|
*/
|
|
64
|
-
export declare function createEvalRuntime(
|
|
64
|
+
export declare function createEvalRuntime(projectRootOrConfig?: string | {
|
|
65
|
+
name?: string;
|
|
66
|
+
projectRoot?: string;
|
|
67
|
+
}): RuntimeHandle;
|
|
65
68
|
/**
|
|
66
69
|
* Helper function for safe runtime execution with automatic cleanup
|
|
67
70
|
* Ensures runtime is disposed even if an exception is thrown
|
package/dist/runtime/registry.js
CHANGED
|
@@ -315,7 +315,10 @@ class EvalRuntimeImpl {
|
|
|
315
315
|
* Create a new scoped runtime with lifecycle management
|
|
316
316
|
* Returns a handle for proper resource management
|
|
317
317
|
*/
|
|
318
|
-
function createEvalRuntime(
|
|
318
|
+
function createEvalRuntime(projectRootOrConfig = process.cwd()) {
|
|
319
|
+
const projectRoot = typeof projectRootOrConfig === "string"
|
|
320
|
+
? projectRootOrConfig
|
|
321
|
+
: (projectRootOrConfig.projectRoot ?? process.cwd());
|
|
319
322
|
const runtime = new EvalRuntimeImpl(projectRoot);
|
|
320
323
|
// Create bound defineEval function
|
|
321
324
|
const boundDefineEval = ((nameOrConfig, executor, options) => {
|
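The added lines let `createEvalRuntime` accept either a path string or a config object and reduce both to a single `projectRoot`. A minimal sketch of that normalization, pulled out into a hypothetical helper named `resolveProjectRoot`; the `fallback` parameter stands in for `process.cwd()` so the example has no Node dependency.

```typescript
// Hypothetical helper isolating the argument handling from the added lines.
type RuntimeConfig = { name?: string; projectRoot?: string };

function resolveProjectRoot(
  projectRootOrConfig: string | RuntimeConfig | undefined,
  fallback: string, // stands in for process.cwd()
): string {
  // No argument at all: use the fallback directory.
  const arg = projectRootOrConfig ?? fallback;
  // A string is taken as the root; an object contributes its projectRoot, if set.
  return typeof arg === "string" ? arg : (arg.projectRoot ?? fallback);
}

const fromString = resolveProjectRoot("/repo", "/cwd");
const fromConfig = resolveProjectRoot({ projectRoot: "/other" }, "/cwd");
const fromNameOnly = resolveProjectRoot({ name: "suite" }, "/cwd");
const fromNothing = resolveProjectRoot(undefined, "/cwd");
```

As in the diff, the `name` field is accepted by the type but plays no part in choosing the root.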
package/dist/snapshot.d.ts
CHANGED
|
@@ -166,6 +166,18 @@ export declare function loadSnapshot(name: string, dir?: string): Promise<Snapsh
|
|
|
166
166
|
* ```
|
|
167
167
|
*/
|
|
168
168
|
export declare function compareWithSnapshot(name: string, currentOutput: unknown, dir?: string): Promise<SnapshotComparison>;
|
|
169
|
+
/**
|
|
170
|
+
* Compare two saved snapshots by name (convenience function)
|
|
171
|
+
*
|
|
172
|
+
* @example
|
|
173
|
+
* ```typescript
|
|
174
|
+
* const comparison = await compareSnapshots('baseline', 'current');
|
|
175
|
+
* if (!comparison.matches) {
|
|
176
|
+
* console.log('Snapshots differ!', comparison.differences);
|
|
177
|
+
* }
|
|
178
|
+
* ```
|
|
179
|
+
*/
|
|
180
|
+
export declare function compareSnapshots(nameA: string, nameB: string, dir?: string): Promise<SnapshotComparison>;
|
|
169
181
|
/**
|
|
170
182
|
* Delete a snapshot (convenience function)
|
|
171
183
|
*/
|
package/dist/snapshot.js
CHANGED
|
@@ -55,6 +55,7 @@ exports.SnapshotManager = void 0;
|
|
|
55
55
|
exports.snapshot = snapshot;
|
|
56
56
|
exports.loadSnapshot = loadSnapshot;
|
|
57
57
|
exports.compareWithSnapshot = compareWithSnapshot;
|
|
58
|
+
exports.compareSnapshots = compareSnapshots;
|
|
58
59
|
exports.deleteSnapshot = deleteSnapshot;
|
|
59
60
|
exports.listSnapshots = listSnapshots;
|
|
60
61
|
// Environment check
|
|
@@ -130,7 +131,13 @@ class SnapshotManager {
|
|
|
130
131
|
if (!options?.overwrite && fs.existsSync(filePath)) {
|
|
131
132
|
throw new Error(`Snapshot '${name}' already exists. Use overwrite: true to update.`);
|
|
132
133
|
}
|
|
133
|
-
const serialized =
|
|
134
|
+
const serialized = output === undefined
|
|
135
|
+
? "undefined"
|
|
136
|
+
: output === null
|
|
137
|
+
? "null"
|
|
138
|
+
: typeof output === "string"
|
|
139
|
+
? output
|
|
140
|
+
: JSON.stringify(output);
|
|
134
141
|
const snapshotData = {
|
|
135
142
|
output: serialized,
|
|
136
143
|
metadata: {
|
|
@@ -310,6 +317,22 @@ async function compareWithSnapshot(name, currentOutput, dir) {
|
|
|
310
317
|
const manager = getSnapshotManager(dir);
|
|
311
318
|
return manager.compare(name, currentOutput);
|
|
312
319
|
}
|
|
320
|
+
/**
|
|
321
|
+
* Compare two saved snapshots by name (convenience function)
|
|
322
|
+
*
|
|
323
|
+
* @example
|
|
324
|
+
* ```typescript
|
|
325
|
+
* const comparison = await compareSnapshots('baseline', 'current');
|
|
326
|
+
* if (!comparison.matches) {
|
|
327
|
+
* console.log('Snapshots differ!', comparison.differences);
|
|
328
|
+
* }
|
|
329
|
+
* ```
|
|
330
|
+
*/
|
|
331
|
+
async function compareSnapshots(nameA, nameB, dir) {
|
|
332
|
+
const manager = getSnapshotManager(dir);
|
|
333
|
+
const snapshotB = await manager.load(nameB);
|
|
334
|
+
return manager.compare(nameA, snapshotB.output);
|
|
335
|
+
}
|
|
313
336
|
/**
|
|
314
337
|
* Delete a snapshot (convenience function)
|
|
315
338
|
*/
|
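The `SnapshotManager` hunk above replaces the old serialization expression with a ternary chain that handles `undefined`, `null`, and strings before falling back to `JSON.stringify`. The same branching can be sketched as a standalone function; `serializeOutput` is a hypothetical name, not an SDK export.

```typescript
// Hypothetical standalone version of the ternary chain in the added lines.
function serializeOutput(output: unknown): string {
  if (output === undefined) return "undefined";
  if (output === null) return "null";
  // Strings are stored as-is, without JSON quoting.
  if (typeof output === "string") return output;
  return JSON.stringify(output);
}

const serializedUndefined = serializeOutput(undefined);
const serializedNull = serializeOutput(null);
const serializedText = serializeOutput("hello");
const serializedObject = serializeOutput({ a: 1 });
```

One consequence of this scheme is that the value `null` and the string `"null"` serialize to the same text, so a comparison on the serialized form cannot tell them apart.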
package/dist/version.d.ts
CHANGED
|
@@ -3,5 +3,5 @@
|
|
|
3
3
|
* X-EvalGate-SDK-Version: SDK package version
|
|
4
4
|
* X-EvalGate-Spec-Version: OpenAPI spec version (docs/openapi.json info.version)
|
|
5
5
|
*/
|
|
6
|
-
export declare const SDK_VERSION = "2.2.
|
|
7
|
-
export declare const SPEC_VERSION = "2.2.
|
|
6
|
+
export declare const SDK_VERSION = "2.2.3";
|
|
7
|
+
export declare const SPEC_VERSION = "2.2.3";
|
package/dist/version.js
CHANGED
|
@@ -6,5 +6,5 @@ exports.SPEC_VERSION = exports.SDK_VERSION = void 0;
|
|
|
6
6
|
* X-EvalGate-SDK-Version: SDK package version
|
|
7
7
|
* X-EvalGate-Spec-Version: OpenAPI spec version (docs/openapi.json info.version)
|
|
8
8
|
*/
|
|
9
|
-
exports.SDK_VERSION = "2.2.
|
|
10
|
-
exports.SPEC_VERSION = "2.2.
|
|
9
|
+
exports.SDK_VERSION = "2.2.3";
|
|
10
|
+
exports.SPEC_VERSION = "2.2.3";
|