npm - @evalgate/sdk - Versions diffs - 2.2.2 → 2.2.3 - Mend

@evalgate/sdk 2.2.2 → 2.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/CHANGELOG.md +27 -0
package/README.md +2 -0
package/dist/assertions.d.ts +9 -5
package/dist/assertions.js +29 -12
package/dist/cache.d.ts +1 -1
package/dist/cache.js +1 -1
package/dist/cli/upgrade.js +5 -0
package/dist/client.js +1 -1
package/dist/errors.js +7 -0
package/dist/export.js +2 -2
package/dist/index.d.ts +3 -3
package/dist/index.js +3 -2
package/dist/integrations/anthropic.js +6 -6
package/dist/integrations/openai.js +6 -6
package/dist/pagination.d.ts +13 -2
package/dist/pagination.js +28 -2
package/dist/runtime/adapters/testsuite-to-dsl.js +1 -6
package/dist/runtime/executor.d.ts +3 -2
package/dist/runtime/executor.js +3 -2
package/dist/runtime/registry.d.ts +4 -1
package/dist/runtime/registry.js +4 -1
package/dist/snapshot.d.ts +12 -0
package/dist/snapshot.js +24 -1
package/dist/version.d.ts +2 -2
package/dist/version.js +2 -2
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -5,6 +5,33 @@ All notable changes to the @evalgate/sdk package will be documented in this file
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.2.3] - 2026-03-03
+### Fixed
+- **`RequestCache.set` missing default TTL** — entries stored without an explicit TTL were immediately stale on next read. Default is now `CacheTTL.MEDIUM`; callers that omit `ttl` get a live cache entry instead of a cache miss every time.
+- **`EvalGateError` subclass prototype chain** — `ValidationError.name` was silently overwritten by the base class constructor, surfacing as `"EvalGateError"` in stack traces and `instanceof` checks. All four subclasses (`ValidationError`, `RateLimitError`, `AuthenticationError`, `NetworkError`) now call `Object.setPrototypeOf(this, Subclass.prototype)` and set `this.name` after `super()`.
+- **`RateLimitError.retryAfter` not a direct property** — the value was only stored inside `details.retryAfter` and not accessible as `err.retryAfter`. It is now assigned directly on the instance when provided.
+- **`autoPaginate` returned `AsyncGenerator` instead of `Promise<T[]>`** — calling `await autoPaginate(fetcher)` was resolving to an unexhausted generator. It now collects all pages and returns a flat `Promise<T[]>`. The original streaming behaviour is available via the new `autoPaginateGenerator` export.
+- **`createEvalRuntime` string-only overload** — passing `{ name, projectRoot }` config objects was ignored (treated as `process.cwd()`). The function now accepts `string | { name?: string; projectRoot?: string }` and extracts `projectRoot` correctly.
+- **`defaultLocalExecutor` was an instance, not a factory** — importing `defaultLocalExecutor` returned a pre-constructed executor rather than a callable factory. It is now re-exported as `createLocalExecutor` so each import site can call it to get a fresh instance.
+- **`SnapshotManager.save` crash on `undefined`/`null` output** — passing `undefined` or `null` to `snapshot(name, output)` threw `TypeError: Cannot convert undefined to string`. Both values are now serialized to the strings `"undefined"` and `"null"` respectively, matching the existing `null`-safe coercion already present for objects.
+- **`compareSnapshots` loaded raw string instead of disk snapshot** — the old `compareWithSnapshot` alias passed its second argument as literal content rather than a snapshot name, producing meaningless diffs. The new `compareSnapshots(nameA, nameB, dir?)` loads both snapshots from disk before diffing.
+- **`AIEvalClient` default `baseUrl`** — the no-arg constructor defaulted to `http://localhost:3000`, causing silent failures in production environments. Default is now `https://api.evalgate.com`.
+- **`importData` unguarded `client.traces` / `client.evaluations` access** — calling `importData(data)` with a partial or undefined client could throw `TypeError: Cannot read properties of undefined`. Both property accesses now use optional chaining (`client?.traces`, `client?.evaluations`).
+- **`toContainCode` required a fenced code block** — raw function definitions, `const` assignments, class declarations, arrow functions, `import`/`export` statements, and `return` expressions now satisfy the assertion without needing triple-backtick fencing.
+- **`hasReadabilityScore` ignored `{min}` object form** — passing `{ min: 40 }` instead of a plain number was coerced to `NaN` threshold, making every call return `true`. The function now unwraps `{ min?, max? }` objects and applies both bounds.
+### Added
+- **`autoPaginateGenerator`** — new export for streaming pagination as an `AsyncGenerator<T[]>` (one chunk per page). Use when you want to process pages incrementally rather than wait for all pages to load.
+- **`compareSnapshots(nameA, nameB, dir?)`** — loads both named snapshots from disk and returns a `SnapshotComparison`. Replaces the incorrectly aliased `compareWithSnapshot`.
+- **141 new regression tests** across 9 test files covering all fixes above: `RequestCache` TTL defaults, error class prototype chains, `autoPaginate` flat-array return, `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager` null/undefined handling, `compareSnapshots` disk-load path, `AIEvalClient` default `baseUrl`, `importData` guards, `toContainCode` raw-code detection, and `hasReadabilityScore` object form.
+- **`upgrade --full` post-upgrade warning** — CLI now prints a reminder to run `npx evalgate baseline update` after a full upgrade to avoid a false regression on the next CI run.
+- **Optional chaining on OpenAI / Anthropic integration `traces.create`** — `evalClient.traces?.create(...)` prevents crashes when the `traces` resource is unavailable on the client (e.g. minimal config or testing without a full API key).
+---
 ## [2.2.2] - 2026-03-03
 ### Fixed
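
Note on the `createEvalRuntime` entry above: the changelog says the function now accepts `string | { name?: string; projectRoot?: string }` and extracts `projectRoot`. The registry code itself is not part of this excerpt, so the following is only a minimal sketch of that argument normalization. The helper name `resolveProjectRoot`, the reading of the string form as a project root, and the explicit `cwd` parameter are assumptions made for illustration.

```typescript
// Hypothetical helper; it mirrors the argument shape named in the changelog,
// not the SDK's actual implementation.
type RuntimeArg = string | { name?: string; projectRoot?: string };

function resolveProjectRoot(arg: RuntimeArg | undefined, cwd: string): string {
  // Legacy form: a bare string is assumed to be the project root.
  if (typeof arg === "string") return arg;
  // Config-object form: use projectRoot when present, else fall back to cwd.
  return arg?.projectRoot ?? cwd;
}

const fromString = resolveProjectRoot("/repo/a", "/cwd");
const fromObject = resolveProjectRoot({ name: "suite", projectRoot: "/repo/b" }, "/cwd");
const fromEmpty = resolveProjectRoot({ name: "suite" }, "/cwd");
```

Under these assumptions a config object without `projectRoot` falls back to the working directory, the same result the changelog describes for the old (ignored) config-object case.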

package/README.md CHANGED

@@ -450,6 +450,8 @@ Your local `openAIChatEval` runs continue to work. No account cancellation. No d
 See [CHANGELOG.md](CHANGELOG.md) for the full release history.
+**v2.2.3** — Bug-fix release. `RequestCache` default TTL, `EvalGateError` subclass prototype chain and `retryAfter` direct property, `autoPaginate` now returns `Promise<T[]>` (new `autoPaginateGenerator` for streaming), `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager.save` null/undefined safety, `compareSnapshots` loads both sides from disk, `AIEvalClient` default baseUrl → `https://api.evalgate.com`, `importData` optional-chaining guards, `toContainCode` raw-code detection, `hasReadabilityScore` `{min,max}` object form. 141 new regression tests.
 **v2.2.2** — 8 stub assertions replaced with real implementations (`hasSentiment` expanded lexicon, `hasNoToxicity` ~80-term blocklist, `hasValidCodeSyntax` real bracket balance, `containsLanguage` 12 languages + BCP-47, `hasFactualAccuracy`/`hasNoHallucinations` case-insensitive, `hasReadabilityScore` per-word syllable fix, `matchesSchema` JSON Schema support). Added LLM-backed `*Async` variants + `configureAssertions`. Fixed `importData` crash, `compareWithSnapshot` object coercion, `WorkflowTracer` defensive guard. 115 new tests.
 **v2.2.1** — `snapshot(name, output)` accepts objects; auto-serialized via `JSON.stringify`

package/dist/assertions.d.ts CHANGED

@@ -126,10 +126,11 @@ export declare class Expectation {
      */
     toBeBetween(min: number, max: number, message?: string): AssertionResult;
     /**
-     * Assert value contains code block
+     * Assert value contains code block or raw code
      * @example expect(output).toContainCode()
+     * @example expect(output).toContainCode('typescript')
      */
-    toContainCode(message?: string): AssertionResult;
+    toContainCode(language?: string, message?: string): AssertionResult;
     /**
      * Assert value is professional tone (no profanity)
      * @example expect(output).toBeProfessional()
@@ -209,9 +210,12 @@ export declare function isValidURL(url: string): boolean;
  * facts but cannot detect paraphrased fabrications. Use
  * {@link hasNoHallucinationsAsync} for semantic accuracy.
  */
-export declare function hasNoHallucinations(text: string, groundTruth: string[]): boolean;
+export declare function hasNoHallucinations(text: string, groundTruth?: string[]): boolean;
 export declare function matchesSchema(value: unknown, schema: Record<string, unknown>): boolean;
-export declare function hasReadabilityScore(text: string, minScore: number): boolean;
+export declare function hasReadabilityScore(text: string, minScore: number | {
+    min?: number;
+    max?: number;
+}): boolean;
 /**
  * Keyword-frequency language detector supporting 12 languages.
  * **Fast and approximate** — detects the most common languages reliably
@@ -234,7 +238,7 @@ export declare function respondedWithinTime(startTime: number, maxMs: number): b
  * with an LLM for context-aware moderation.
  */
 export declare function hasNoToxicity(text: string): boolean;
-export declare function followsInstructions(text: string, instructions: string[]): boolean;
+export declare function followsInstructions(text: string, instructions: string | string[]): boolean;
 export declare function containsAllRequiredFields(obj: unknown, requiredFields: string[]): boolean;
 export interface AssertionLLMConfig {
     provider: "openai" | "anthropic";
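
Note on the widened `hasReadabilityScore` signature: the second parameter is now `number | { min?: number; max?: number }`. The sketch below isolates the bound handling, following the unwrapping shown in the compiled `assertions.js`. The helper name `scoreWithinBound` is hypothetical, and it takes a precomputed score rather than text, so the syllable counting is left out.

```typescript
type ScoreBound = number | { min?: number; max?: number };

// Hypothetical helper: applies the same min/max unwrapping as the diff,
// but to an already-computed readability score.
function scoreWithinBound(score: number, bound: ScoreBound): boolean {
  const min = typeof bound === "number" ? bound : (bound.min ?? 0);
  const max = typeof bound === "object" ? bound.max : undefined;
  return score >= min && (max === undefined || score <= max);
}

const plain = scoreWithinBound(65, 60);                     // true
const ranged = scoreWithinBound(65, { min: 40, max: 60 });  // false, above max
const maxOnly = scoreWithinBound(-5, { max: 60 });          // false, min defaults to 0
```

One consequence worth noticing: when only `max` is given, the lower bound defaults to 0, so a negative score fails even though it is below `max`.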

package/dist/assertions.js CHANGED

@@ -234,9 +234,10 @@ class Expectation {
         let parsedJson = null;
         try {
             parsedJson = JSON.parse(String(this.value));
-            const requiredKeys = Object.keys(schema);
-            const actualKeys = Object.keys(parsedJson);
-            passed = requiredKeys.every((key) => actualKeys.includes(key));
+            const entries = Object.entries(schema);
+            passed = entries.every(([key, expectedValue]) => parsedJson !== null &&
+                key in parsedJson &&
+                JSON.stringify(parsedJson[key]) === JSON.stringify(expectedValue));
         }
         catch (_e) {
             passed = false;
@@ -436,19 +437,30 @@ class Expectation {
         };
     }
     /**
-     * Assert value contains code block
+     * Assert value contains code block or raw code
      * @example expect(output).toContainCode()
+     * @example expect(output).toContainCode('typescript')
      */
-    toContainCode(message) {
+    toContainCode(language, message) {
         const text = String(this.value);
-        const hasCodeBlock = /```[\s\S]*?```/.test(text) || /<code>[\s\S]*?<\/code>/.test(text);
+        const hasMarkdownBlock = language
+            ? new RegExp(`\`\`\`${language}[\\s\\S]*?\`\`\``).test(text)
+            : /```[\s\S]*?```/.test(text);
+        const hasHtmlBlock = /<code>[\s\S]*?<\/code>/.test(text);
+        const hasRawCode = /\bfunction\s+\w+\s*\(/.test(text) ||
+            /\b(?:const|let|var)\s+\w+\s*=/.test(text) ||
+            /\bclass\s+\w+/.test(text) ||
+            /=>\s*[{(]/.test(text) ||
+            /\bimport\s+.*\bfrom\b/.test(text) ||
+            /\bexport\s+(?:default\s+)?(?:function|class|const)/.test(text) ||
+            /\breturn\s+.+;/.test(text);
+        const hasCodeBlock = hasMarkdownBlock || hasHtmlBlock || hasRawCode;
         return {
             name: "toContainCode",
             passed: hasCodeBlock,
-            expected: "code block",
+            expected: language ? `code block (${language})` : "code block",
             actual: text,
-            message: message ||
-                (hasCodeBlock ? "Contains code block" : "No code block found"),
+            message: message || (hasCodeBlock ? "Contains code" : "No code found"),
         };
     }
     /**
@@ -719,7 +731,7 @@ function isValidURL(url) {
  * facts but cannot detect paraphrased fabrications. Use
  * {@link hasNoHallucinationsAsync} for semantic accuracy.
  */
-function hasNoHallucinations(text, groundTruth) {
+function hasNoHallucinations(text, groundTruth = []) {
     const lower = text.toLowerCase();
     return groundTruth.every((truth) => lower.includes(truth.toLowerCase()));
 }
@@ -739,12 +751,14 @@ function matchesSchema(value, schema) {
     return Object.keys(schema).every((key) => key in obj);
 }
 function hasReadabilityScore(text, minScore) {
+    const threshold = typeof minScore === "number" ? minScore : (minScore.min ?? 0);
+    const maxThreshold = typeof minScore === "object" ? minScore.max : undefined;
     const wordList = text.trim().split(/\s+/).filter(Boolean);
     const words = wordList.length || 1;
     const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0).length || 1;
     const totalSyllables = wordList.reduce((sum, w) => sum + syllables(w), 0);
     const score = 206.835 - 1.015 * (words / sentences) - 84.6 * (totalSyllables / words);
-    return score >= minScore;
+    return (score >= threshold && (maxThreshold === undefined || score <= maxThreshold));
 }
 function syllables(word) {
     // Simple syllable counter
@@ -1154,7 +1168,10 @@ function hasNoToxicity(text) {
     return !toxicTerms.some((term) => lower.includes(term));
 }
 function followsInstructions(text, instructions) {
-    return instructions.every((instruction) => {
+    const instructionList = Array.isArray(instructions)
+        ? instructions
+        : [instructions];
+    return instructionList.every((instruction) => {
         if (instruction.startsWith("!")) {
             return !text.includes(instruction.slice(1));
         }
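
Note on the new `toContainCode` detection: the raw-code check is a set of regular-expression heuristics. The sketch below copies those patterns from the hunk into a standalone function so their behaviour can be tried in isolation. The names `looksLikeCode`, `RAW_CODE_PATTERNS`, and `FENCE` are hypothetical, and the fence string is built indirectly only so that this sample can itself sit inside a fenced block.

```typescript
// Patterns copied from the diff; everything else here is illustrative.
const RAW_CODE_PATTERNS: RegExp[] = [
  /\bfunction\s+\w+\s*\(/,
  /\b(?:const|let|var)\s+\w+\s*=/,
  /\bclass\s+\w+/,
  /=>\s*[{(]/,
  /\bimport\s+.*\bfrom\b/,
  /\bexport\s+(?:default\s+)?(?:function|class|const)/,
  /\breturn\s+.+;/,
];

const FENCE = "`".repeat(3); // three backticks

function looksLikeCode(text: string, language?: string): boolean {
  // With a language, only fences tagged with that language count as fenced code.
  const fenced = new RegExp(`${FENCE}${language ?? ""}[\\s\\S]*?${FENCE}`).test(text);
  const html = /<code>[\s\S]*?<\/code>/.test(text);
  const raw = RAW_CODE_PATTERNS.some((re) => re.test(text));
  return fenced || html || raw;
}

const rawConst = looksLikeCode("const total = 1;");
const plainProse = looksLikeCode("The weather is nice today.");
const rawWithLanguage = looksLikeCode("const total = 1;", "python");
const proseWithClass = looksLikeCode("We offer first class service.");
```

Two properties follow from the patterns as written. The `language` argument narrows only the fenced-block check, so raw code still passes when a different language is requested. Ordinary prose containing a phrase such as "class service" also matches the `class` pattern, so the assertion is best treated as a loose heuristic.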

package/dist/cache.d.ts CHANGED

@@ -21,7 +21,7 @@ export declare class RequestCache {
     /**
      * Store response in cache
      */
-    set<T>(method: string, url: string, data: T, ttl: number, params?: unknown): void;
+    set<T>(method: string, url: string, data: T, ttl?: number, params?: unknown): void;
     /**
      * Invalidate specific cache entry
      */

package/dist/cache.js CHANGED

@@ -43,7 +43,7 @@ class RequestCache {
     /**
      * Store response in cache
      */
-    set(method, url, data, ttl, params) {
+    set(method, url, data, ttl = exports.CacheTTL.MEDIUM, params) {
         // Enforce cache size limit (LRU-style)
         if (this.cache.size >= this.maxSize) {
             const firstKey = this.cache.keys().next().value;
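
Note on the `ttl` default: the fix relies on a JavaScript default parameter, which applies both when the argument is omitted and when `undefined` is passed explicitly. The sketch below shows that pattern on a tiny stand-in cache. `TinyCache`, the five-minute value for the default, and the injectable `now` parameter are assumptions for illustration; the actual value of `CacheTTL.MEDIUM` and the expiry logic of `RequestCache` are not shown in this diff.

```typescript
// Assumed value for illustration only.
const ASSUMED_MEDIUM_TTL_MS = 5 * 60 * 1000;

interface Entry {
  data: unknown;
  expiresAt: number;
}

class TinyCache {
  private store = new Map<string, Entry>();

  // `now` is injectable so expiry can be checked without waiting.
  set(key: string, data: unknown, ttl: number = ASSUMED_MEDIUM_TTL_MS, now: number = Date.now()): void {
    this.store.set(key, { data, expiresAt: now + ttl });
  }

  get(key: string, now: number = Date.now()): unknown {
    const entry = this.store.get(key);
    if (!entry || now >= entry.expiresAt) return undefined; // missing or expired
    return entry.data;
  }
}

const cache = new TinyCache();
cache.set("GET:/traces", { ok: true }, undefined, 1000); // ttl omitted, default applies
const fresh = cache.get("GET:/traces", 1001);
const expired = cache.get("GET:/traces", 1000 + ASSUMED_MEDIUM_TTL_MS);
```

Because `ttl` sits before `params` in the real signature, callers that want the default but still pass `params` would pass `undefined` in the `ttl` position.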

package/dist/cli/upgrade.js CHANGED

@@ -480,7 +480,12 @@ After upgrading:
     console.log("    - package.json                  eval:regression-gate + eval:baseline-update");
     console.log("    - .github/workflows/            Gate + governance workflows");
     console.log("    - .github/CODEOWNERS            Baseline requires approval\n");
+    console.log("  ⚠️  IMPORTANT — Reset your baseline before pushing:");
+    console.log("    The gate compares against your existing Tier 1 baseline.");
+    console.log("    If your test script changed, run this first to avoid an immediate regression:");
+    console.log("    npx evalgate baseline update    (or: pnpm eval:baseline-update)\n");
     console.log("  Next:");
+    console.log("    npx evalgate baseline update");
     console.log("    git add -A");
     console.log("    git commit -m 'chore: upgrade EvalGate gate to Tier 2'");
     console.log("    git push\n");

package/dist/client.js CHANGED

@@ -72,7 +72,7 @@ class AIEvalClient {
         this.baseUrl =
             config.baseUrl ||
                 getEnvVar("EVALGATE_BASE_URL", "EVALAI_BASE_URL") ||
-                (isBrowser ? "" : "http://localhost:3000");
+                (isBrowser ? "" : "https://api.evalgate.com");
         this.timeout = config.timeout || 30000;
         // Tier 4.17: Debug mode with request logging
         const logLevel = config.logLevel || (config.debug ? "debug" : "info");

package/dist/errors.js CHANGED

@@ -271,6 +271,10 @@ class RateLimitError extends EvalGateError {
     constructor(message, retryAfter) {
         super(message, "RATE_LIMIT_EXCEEDED", 429, { retryAfter });
         this.name = "RateLimitError";
+        if (retryAfter !== undefined) {
+            this.retryAfter = retryAfter;
+        }
+        Object.setPrototypeOf(this, RateLimitError.prototype);
     }
 }
 exports.RateLimitError = RateLimitError;
@@ -278,6 +282,7 @@ class AuthenticationError extends EvalGateError {
     constructor(message = "Authentication failed") {
         super(message, "AUTHENTICATION_ERROR", 401);
         this.name = "AuthenticationError";
+        Object.setPrototypeOf(this, AuthenticationError.prototype);
     }
 }
 exports.AuthenticationError = AuthenticationError;
@@ -285,6 +290,7 @@ class ValidationError extends EvalGateError {
     constructor(message = "Validation failed", details) {
         super(message, "VALIDATION_ERROR", 400, details);
         this.name = "ValidationError";
+        Object.setPrototypeOf(this, ValidationError.prototype);
     }
 }
 exports.ValidationError = ValidationError;
@@ -293,6 +299,7 @@ class NetworkError extends EvalGateError {
         super(message, "NETWORK_ERROR", 0);
         this.name = "NetworkError";
         this.retryable = true;
+        Object.setPrototypeOf(this, NetworkError.prototype);
     }
 }
 exports.NetworkError = NetworkError;
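
Note on the `Object.setPrototypeOf` calls: each subclass now pins its own prototype after `super()`. The sketch below reproduces the pattern with stand-in classes. `SketchBaseError` and `SketchRateLimitError` are hypothetical names, and the base class pinning its own prototype is an assumption, since the `EvalGateError` constructor is not visible in this diff.

```typescript
class SketchBaseError extends Error {
  code: string;

  constructor(message: string, code: string) {
    super(message);
    this.code = code;
    this.name = "SketchBaseError";
    // Assumption: the base class pins its own prototype, as is common for ES5 targets.
    Object.setPrototypeOf(this, SketchBaseError.prototype);
  }
}

class SketchRateLimitError extends SketchBaseError {
  retryAfter?: number;

  constructor(message: string, retryAfter?: number) {
    super(message, "RATE_LIMIT_EXCEEDED");
    this.name = "SketchRateLimitError";
    if (retryAfter !== undefined) {
      this.retryAfter = retryAfter; // direct property, not only inside details
    }
    // Re-pin the prototype so instanceof checks see the subclass again.
    Object.setPrototypeOf(this, SketchRateLimitError.prototype);
  }
}

const err = new SketchRateLimitError("slow down", 30);
```

In this sketch, removing the subclass's `setPrototypeOf` call would leave `err instanceof SketchRateLimitError` false, because the base constructor has already reset the prototype to its own.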

package/dist/export.js CHANGED

@@ -155,7 +155,7 @@ async function importData(client, data, options = {}) {
         return result;
     }
     // Import traces
-    if (data.traces) {
+    if (data.traces && client?.traces) {
         const traceResults = { imported: 0, skipped: 0, failed: 0 };
         for (const trace of data.traces) {
             try {
@@ -191,7 +191,7 @@ async function importData(client, data, options = {}) {
         result.summary.total += data.traces.length;
     }
     // Import evaluations
-    if (data.evaluations) {
+    if (data.evaluations && client?.evaluations) {
         const evalResults = { imported: 0, skipped: 0, failed: 0 };
         for (const evaluation of data.evaluations) {
             try {
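
Note on the `importData` guards: each section is now entered only when the payload has that section and the client exposes the matching resource. The sketch below isolates the two conditions. The helper `sectionsToImport` and the two interfaces are hypothetical and exist only to make the guard observable.

```typescript
interface PartialClient {
  traces?: unknown;
  evaluations?: unknown;
}

interface ImportPayload {
  traces?: unknown[];
  evaluations?: unknown[];
}

// Hypothetical helper: reports which sections the guarded conditions would enter.
function sectionsToImport(client: PartialClient | undefined, data: ImportPayload): string[] {
  const sections: string[] = [];
  if (data.traces && client?.traces) sections.push("traces");
  if (data.evaluations && client?.evaluations) sections.push("evaluations");
  return sections;
}

const payload: ImportPayload = { traces: [{ id: 1 }], evaluations: [{ id: 2 }] };
const noClient = sectionsToImport(undefined, payload);
const tracesOnly = sectionsToImport({ traces: {} }, payload);
```

With these conditions a missing resource causes its section to be skipped rather than throwing, so a caller that needs to know about skipped data would have to compare the payload with the returned result.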

package/dist/index.d.ts CHANGED

@@ -20,8 +20,8 @@ export { createEvalRuntime, disposeActiveRuntime, getActiveRuntime, setActiveRun
 export type { CloudExecutor, DefineEvalFunction, EvalContext, EvalExecutor, EvalExecutorInterface, EvalOptions, EvalResult, EvalRuntime, EvalSpec, ExecutorCapabilities, LocalExecutor, SpecConfig, SpecOptions, WorkerExecutor, } from "./runtime/types";
 export { EvalRuntimeError, RuntimeError, SpecExecutionError, SpecRegistrationError, } from "./runtime/types";
 export { createTestSuite, type TestCaseResult, TestSuite, TestSuiteCase, TestSuiteCaseResult, TestSuiteConfig, TestSuiteResult, } from "./testing";
-import { compareWithSnapshot, snapshot } from "./snapshot";
-export { snapshot, compareWithSnapshot, snapshot as saveSnapshot, compareWithSnapshot as compareSnapshots, };
+import { compareSnapshots, compareWithSnapshot, snapshot } from "./snapshot";
+export { snapshot, compareWithSnapshot, compareSnapshots, snapshot as saveSnapshot, };
 import type { ExportFormat } from "./export";
 import { exportData, importData } from "./export";
 export { exportData, importData };
@@ -34,7 +34,7 @@ export { traceOpenAI } from "./integrations/openai";
 export { type OpenAIChatEvalCase, type OpenAIChatEvalOptions, type OpenAIChatEvalResult, openAIChatEval, } from "./integrations/openai-eval";
 export { Logger } from "./logger";
 export { extendExpectWithToPassGate } from "./matchers";
-export { autoPaginate, createPaginatedIterator, decodeCursor, encodeCursor, PaginatedIterator, type PaginatedResponse, type PaginationParams, } from "./pagination";
+export { autoPaginate, autoPaginateGenerator, createPaginatedIterator, decodeCursor, encodeCursor, PaginatedIterator, type PaginatedResponse, type PaginationParams, } from "./pagination";
 export { ARTIFACTS, type Baseline, type BaselineTolerance, GATE_CATEGORY, GATE_EXIT, type GateCategory, type GateExitCode, REPORT_SCHEMA_VERSION, type RegressionDelta, type RegressionReport, } from "./regression";
 export { batchProcess, batchRead, RateLimiter, streamEvaluation, } from "./streaming";
 export type { Annotation, AnnotationItem, AnnotationTask, APIKey, APIKeyUsage, APIKeyWithSecret, BatchOptions, ClientConfig as AIEvalConfig, CreateAnnotationItemParams, CreateAnnotationParams, CreateAnnotationTaskParams, CreateAPIKeyParams, CreateLLMJudgeConfigParams, CreateWebhookParams, Evaluation as EvaluationData, EvaluationRun, EvaluationRunDetail, ExportOptions, GenericMetadata as AnnotationData, GetLLMJudgeAlignmentParams, GetUsageParams, ImportOptions, ListAnnotationItemsParams, ListAnnotationsParams, ListAnnotationTasksParams, ListAPIKeysParams, ListLLMJudgeConfigsParams, ListLLMJudgeResultsParams, ListWebhookDeliveriesParams, ListWebhooksParams, LLMJudgeAlignment, LLMJudgeConfig, LLMJudgeEvaluateResult, LLMJudgeResult as LLMJudgeData, Organization, RetryConfig, SnapshotData, Span as SpanData, StreamOptions, TestCase, TestResult, Trace as TraceData, TraceDetail, TracedResponse, UpdateAPIKeyParams, UpdateWebhookParams, UsageStats, UsageSummary, Webhook, WebhookDelivery, } from "./types";

package/dist/index.js CHANGED

@@ -9,7 +9,7 @@
  */
 Object.defineProperty(exports, "__esModule", { value: true });
 exports.defaultLocalExecutor = exports.createLocalExecutor = exports.evalai = exports.defineSuite = exports.defineEval = exports.createResult = exports.createEvalContext = exports.validateContext = exports.mergeContexts = exports.cloneContext = exports.ContextManager = exports.withContext = exports.getContext = exports.createContext = exports.withinRange = exports.similarTo = exports.respondedWithinTime = exports.notContainsPII = exports.matchesSchema = exports.matchesPattern = exports.isValidURL = exports.isValidEmail = exports.hasValidCodeSyntaxAsync = exports.hasValidCodeSyntax = exports.hasSentimentAsync = exports.hasSentiment = exports.hasReadabilityScore = exports.hasPII = exports.hasNoToxicityAsync = exports.hasNoToxicity = exports.hasNoHallucinationsAsync = exports.hasNoHallucinations = exports.hasLength = exports.hasFactualAccuracyAsync = exports.hasFactualAccuracy = exports.getAssertionConfig = exports.followsInstructions = exports.expect = exports.containsLanguageAsync = exports.containsLanguage = exports.containsKeywords = exports.containsJSON = exports.containsAllRequiredFields = exports.configureAssertions = exports.NetworkError = exports.ValidationError = exports.AuthenticationError = exports.RateLimitError = exports.EvalGateError = exports.AIEvalClient = void 0;
-exports.WorkflowTracer = exports.traceWorkflowStep = exports.traceLangChainAgent = exports.traceCrewAI = exports.traceAutoGen = exports.createWorkflowTracer = exports.EvaluationTemplates = exports.streamEvaluation = exports.RateLimiter = exports.batchRead = exports.batchProcess = exports.REPORT_SCHEMA_VERSION = exports.GATE_EXIT = exports.GATE_CATEGORY = exports.ARTIFACTS = exports.PaginatedIterator = exports.encodeCursor = exports.decodeCursor = exports.createPaginatedIterator = exports.autoPaginate = exports.extendExpectWithToPassGate = exports.Logger = exports.openAIChatEval = exports.traceOpenAI = exports.traceAnthropic = exports.runCheck = exports.parseArgs = exports.EXIT = exports.RequestCache = exports.CacheTTL = exports.RequestBatcher = exports.importData = exports.exportData = exports.compareSnapshots = exports.saveSnapshot = exports.compareWithSnapshot = exports.snapshot = exports.TestSuite = exports.createTestSuite = exports.SpecRegistrationError = exports.SpecExecutionError = exports.RuntimeError = exports.EvalRuntimeError = exports.setActiveRuntime = exports.getActiveRuntime = exports.disposeActiveRuntime = exports.createEvalRuntime = void 0;
+exports.WorkflowTracer = exports.traceWorkflowStep = exports.traceLangChainAgent = exports.traceCrewAI = exports.traceAutoGen = exports.createWorkflowTracer = exports.EvaluationTemplates = exports.streamEvaluation = exports.RateLimiter = exports.batchRead = exports.batchProcess = exports.REPORT_SCHEMA_VERSION = exports.GATE_EXIT = exports.GATE_CATEGORY = exports.ARTIFACTS = exports.PaginatedIterator = exports.encodeCursor = exports.decodeCursor = exports.createPaginatedIterator = exports.autoPaginateGenerator = exports.autoPaginate = exports.extendExpectWithToPassGate = exports.Logger = exports.openAIChatEval = exports.traceOpenAI = exports.traceAnthropic = exports.runCheck = exports.parseArgs = exports.EXIT = exports.RequestCache = exports.CacheTTL = exports.RequestBatcher = exports.importData = exports.exportData = exports.saveSnapshot = exports.compareSnapshots = exports.compareWithSnapshot = exports.snapshot = exports.TestSuite = exports.createTestSuite = exports.SpecRegistrationError = exports.SpecExecutionError = exports.RuntimeError = exports.EvalRuntimeError = exports.setActiveRuntime = exports.getActiveRuntime = exports.disposeActiveRuntime = exports.createEvalRuntime = void 0;
 // Main SDK exports
 var client_1 = require("./client");
 Object.defineProperty(exports, "AIEvalClient", { enumerable: true, get: function () { return client_1.AIEvalClient; } });
@@ -91,8 +91,8 @@ Object.defineProperty(exports, "createTestSuite", { enumerable: true, get: funct
 Object.defineProperty(exports, "TestSuite", { enumerable: true, get: function () { return testing_1.TestSuite; } });
 // Snapshot testing (Tier 2.8)
 const snapshot_1 = require("./snapshot");
+Object.defineProperty(exports, "compareSnapshots", { enumerable: true, get: function () { return snapshot_1.compareSnapshots; } });
 Object.defineProperty(exports, "compareWithSnapshot", { enumerable: true, get: function () { return snapshot_1.compareWithSnapshot; } });
-Object.defineProperty(exports, "compareSnapshots", { enumerable: true, get: function () { return snapshot_1.compareWithSnapshot; } });
 Object.defineProperty(exports, "snapshot", { enumerable: true, get: function () { return snapshot_1.snapshot; } });
 Object.defineProperty(exports, "saveSnapshot", { enumerable: true, get: function () { return snapshot_1.snapshot; } });
 // Export/Import utilities (Tier 4.18)
@@ -130,6 +130,7 @@ var matchers_1 = require("./matchers");
 Object.defineProperty(exports, "extendExpectWithToPassGate", { enumerable: true, get: function () { return matchers_1.extendExpectWithToPassGate; } });
 var pagination_1 = require("./pagination");
 Object.defineProperty(exports, "autoPaginate", { enumerable: true, get: function () { return pagination_1.autoPaginate; } });
+Object.defineProperty(exports, "autoPaginateGenerator", { enumerable: true, get: function () { return pagination_1.autoPaginateGenerator; } });
 Object.defineProperty(exports, "createPaginatedIterator", { enumerable: true, get: function () { return pagination_1.createPaginatedIterator; } });
 Object.defineProperty(exports, "decodeCursor", { enumerable: true, get: function () { return pagination_1.decodeCursor; } });
 Object.defineProperty(exports, "encodeCursor", { enumerable: true, get: function () { return pagination_1.encodeCursor; } });

package/dist/integrations/anthropic.js CHANGED

@@ -67,7 +67,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
                     }
                     : {}),
             });
-            await evalClient.traces.create({
+            await evalClient.traces?.create({
                 name: `Anthropic: ${params.model}`,
                 traceId,
                 organizationId: organizationId || evalClient.getOrganizationId(),
@@ -89,7 +89,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
                 error: error instanceof Error ? error.message : String(error),
             });
             await evalClient.traces
-                .create({
+                ?.create({
                 name: `Anthropic: ${params.model}`,
                 traceId,
                 organizationId: organizationId || evalClient.getOrganizationId(),
@@ -97,7 +97,7 @@ function traceAnthropic(anthropic, evalClient, options = {}) {
                 durationMs,
                 metadata: errorMetadata,
             })
-                .catch(() => {
+                ?.catch(() => {
                 // Ignore errors in trace creation to avoid masking the original error
             });
             throw error;
@@ -127,7 +127,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
     const startTime = Date.now();
     const traceId = `anthropic-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
     try {
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),
@@ -136,7 +136,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
         });
         const result = await fn();
         const durationMs = Date.now() - startTime;
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),
@@ -148,7 +148,7 @@ async function traceAnthropicCall(evalClient, name, fn, options = {}) {
     }
     catch (error) {
         const durationMs = Date.now() - startTime;
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),

package/dist/integrations/openai.js CHANGED

@@ -65,7 +65,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
                     }
                     : {}),
             });
-            await evalClient.traces.create({
+            await evalClient.traces?.create({
                 name: `OpenAI: ${params.model}`,
                 traceId,
                 organizationId: organizationId || evalClient.getOrganizationId(),
@@ -87,7 +87,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
                 error: error instanceof Error ? error.message : String(error),
             });
             await evalClient.traces
-                .create({
+                ?.create({
                 name: `OpenAI: ${params.model}`,
                 traceId,
                 organizationId: organizationId || evalClient.getOrganizationId(),
@@ -95,7 +95,7 @@ function traceOpenAI(openai, evalClient, options = {}) {
                 durationMs,
                 metadata: errorMetadata,
             })
-                .catch(() => {
+                ?.catch(() => {
                 // Ignore errors in trace creation to avoid masking the original error
             });
             throw error;
@@ -124,7 +124,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
     const startTime = Date.now();
     const traceId = `openai-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
     try {
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),
@@ -133,7 +133,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
         });
         const result = await fn();
         const durationMs = Date.now() - startTime;
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),
@@ -145,7 +145,7 @@ async function traceOpenAICall(evalClient, name, fn, options = {}) {
     }
     catch (error) {
         const durationMs = Date.now() - startTime;
-        await evalClient.traces.create({
+        await evalClient.traces?.create({
             name,
             traceId,
             organizationId: options.organizationId || evalClient.getOrganizationId(),
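
Note on the hunks above: every `evalClient.traces.create(...)` call becomes `evalClient.traces?.create(...)`, so a client without a `traces` property no longer throws inside the tracing wrapper. A minimal sketch of that behavior follows; `TraceApi`, `FakeClient` and `recordTrace` are hypothetical stand-ins for illustration, not SDK types.

```typescript
// Hypothetical stand-ins for illustration; these are not SDK types.
interface TraceApi {
  create(input: { name: string }): Promise<string>;
}
interface FakeClient {
  traces?: TraceApi;
}

// Mirrors the patched call shape: optional call, then optional catch.
async function recordTrace(
  client: FakeClient,
  name: string,
): Promise<string | undefined> {
  // If `traces` is undefined the whole chain short-circuits to undefined,
  // and awaiting undefined simply resolves to undefined.
  return await client.traces?.create({ name })?.catch(() => undefined);
}

const withTraces: FakeClient = {
  traces: { create: async ({ name }) => `created:${name}` },
};
const withoutTraces: FakeClient = {};
```

Because optional chaining short-circuits the rest of the chain, the extra `?.` before `catch` only matters when `create` itself returns something nullish; it is harmless otherwise.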

package/dist/pagination.d.ts CHANGED

@@ -50,9 +50,20 @@ export declare function createPaginatedIterator<T>(fetchFn: (offset: number, lim
     hasMore: boolean;
 }>, limit?: number): PaginatedIterator<T>;
 /**
- * Auto-paginate helper that fetches all pages automatically
+ * Auto-paginate helper that fetches all pages and returns a flat array.
+ * @example
+ * ```typescript
+ * const allItems = await autoPaginate(
+ *   (offset, limit) => client.traces.list({ offset, limit }),
+ * );
+ * ```
  */
-export declare function autoPaginate<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number): AsyncGenerator<T, void, unknown>;
+export declare function autoPaginate<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number): Promise<T[]>;
+/**
+ * Streaming auto-paginate generator — yields individual items one at a time.
+ * Use this when you want to process items as they arrive rather than waiting for all pages.
+ */
+export declare function autoPaginateGenerator<T>(fetchFn: (offset: number, limit: number) => Promise<T[]>, limit?: number): AsyncGenerator<T, void, unknown>;
 /**
  * Encode cursor for pagination (base64)
  */

package/dist/pagination.js CHANGED

@@ -6,6 +6,7 @@ Object.defineProperty(exports, "__esModule", { value: true });
 exports.PaginatedIterator = void 0;
 exports.createPaginatedIterator = createPaginatedIterator;
 exports.autoPaginate = autoPaginate;
+exports.autoPaginateGenerator = autoPaginateGenerator;
 exports.encodeCursor = encodeCursor;
 exports.decodeCursor = decodeCursor;
 exports.createPaginationMeta = createPaginationMeta;
@@ -56,9 +57,34 @@ function createPaginatedIterator(fetchFn, limit = 50) {
     return new PaginatedIterator(fetchFn, limit);
 }
 /**
- * Auto-paginate helper that fetches all pages automatically
+ * Auto-paginate helper that fetches all pages and returns a flat array.
+ * @example
+ * ```typescript
+ * const allItems = await autoPaginate(
+ *   (offset, limit) => client.traces.list({ offset, limit }),
+ * );
+ * ```
  */
-async function* autoPaginate(fetchFn, limit = 50) {
+async function autoPaginate(fetchFn, limit = 50) {
+    const result = [];
+    let offset = 0;
+    let hasMore = true;
+    while (hasMore) {
+        const items = await fetchFn(offset, limit);
+        if (items.length === 0) {
+            break;
+        }
+        result.push(...items);
+        hasMore = items.length === limit;
+        offset += limit;
+    }
+    return result;
+}
+/**
+ * Streaming auto-paginate generator — yields individual items one at a time.
+ * Use this when you want to process items as they arrive rather than waiting for all pages.
+ */
+async function* autoPaginateGenerator(fetchFn, limit = 50) {
     let offset = 0;
     let hasMore = true;
     while (hasMore) {
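
Note on the hunk above: `autoPaginate` now resolves to a flat array, and the streaming behavior moves to `autoPaginateGenerator`. The sketch below contrasts the two shapes against an in-memory data source. `collectAll` follows the new `autoPaginate` body shown in the diff; `streamAll` is an assumed reconstruction, since the generator body is cut off in this hunk. `rows` and `fetchPage` are hypothetical.

```typescript
// Hypothetical in-memory data source standing in for an API list endpoint.
const rows = Array.from({ length: 5 }, (_, i) => i + 1);
const fetchPage = async (offset: number, limit: number): Promise<number[]> =>
  rows.slice(offset, offset + limit);

// Array-returning form, following the new autoPaginate body in the diff.
async function collectAll<T>(
  fetchFn: (offset: number, limit: number) => Promise<T[]>,
  limit = 50,
): Promise<T[]> {
  const result: T[] = [];
  let offset = 0;
  let hasMore = true;
  while (hasMore) {
    const items = await fetchFn(offset, limit);
    if (items.length === 0) break;
    result.push(...items);
    hasMore = items.length === limit; // a short page means it was the last one
    offset += limit;
  }
  return result;
}

// Streaming form (assumed shape): yields items one at a time as pages arrive.
async function* streamAll<T>(
  fetchFn: (offset: number, limit: number) => Promise<T[]>,
  limit = 50,
): AsyncGenerator<T, void, unknown> {
  let offset = 0;
  while (true) {
    const items = await fetchFn(offset, limit);
    yield* items;
    if (items.length < limit) return;
    offset += limit;
  }
}
```

Code written against 2.2.2 that iterates with `for await (const item of autoPaginate(...))` would need to switch to `autoPaginateGenerator`, because a `Promise<T[]>` is not async-iterable.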

package/dist/runtime/adapters/testsuite-to-dsl.js CHANGED

@@ -208,12 +208,7 @@ function generateDefineEvalCode(suite, options = {}) {
     });
     const helperFunctions = generateHelperFunctionsForSuite(specs, options);
     const evaluationFunction = generateEvaluationFunction();
-    return [
-        ...imports,
-        ...helperFunctions,
-        ...evaluationFunction,
-        ...specCode,
-    ].join("\n");
+    return [...imports, helperFunctions, evaluationFunction, ...specCode].join("\n");
 }
 /**
  * Generate helper functions for a specific spec
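
Note on the hunk above: the return statement stops spreading `helperFunctions` and `evaluationFunction`. One plausible reading, an assumption because the generators' return types are not visible in this diff, is that both return a single string. Spreading a string into an array literal iterates its characters, so the joined output would put each character on its own line. All names in the sketch are hypothetical.

```typescript
// Assumption for illustration: the helper generator returns one string,
// while imports and spec code are arrays of lines.
const importLines: string[] = ['import { defineEval } from "sdk";'];
const helper: string = "function h() {}";
const specLines: string[] = ["defineEval('a');"];

// Spreading a string iterates its characters, so each lands on its own line.
const spreadOutput = [...importLines, ...helper, ...specLines].join("\n");

// Passing the string as a single element keeps it intact.
const fixedOutput = [...importLines, helper, ...specLines].join("\n");
```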

package/dist/runtime/executor.d.ts CHANGED

@@ -10,7 +10,8 @@ import type { LocalExecutor } from "./types";
  */
 export declare function createLocalExecutor(): LocalExecutor;
 /**
- * Default local executor instance
+ * Default local executor factory
+ * Call as defaultLocalExecutor() to get a new executor instance.
  * For convenience in simple use cases
  */
-export declare const defaultLocalExecutor: LocalExecutor;
+export declare const defaultLocalExecutor: typeof createLocalExecutor;

package/dist/runtime/executor.js CHANGED

@@ -146,7 +146,8 @@ function createLocalExecutor() {
     return new LocalExecutorImpl();
 }
 /**
- * Default local executor instance
+ * Default local executor factory
+ * Call as defaultLocalExecutor() to get a new executor instance.
  * For convenience in simple use cases
  */
-exports.defaultLocalExecutor = createLocalExecutor();
+exports.defaultLocalExecutor = createLocalExecutor;
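
Note on the hunk above: `defaultLocalExecutor` changes from an instance created at module load to a reference to the factory itself. A minimal sketch of the difference, using hypothetical stand-ins because the real `LocalExecutor` shape is not shown in this diff:

```typescript
// Hypothetical stand-ins; the real LocalExecutor shape is not shown in this diff.
class LocalExecutorSketch {
  readonly specs: string[] = [];
}
function createLocalExecutorSketch(): LocalExecutorSketch {
  return new LocalExecutorSketch();
}

// 2.2.3 shape: the export is the factory itself, so each call yields a fresh
// executor. (In 2.2.2 the export was the result of calling the factory once.)
const defaultFactory = createLocalExecutorSketch;

const first = defaultFactory();
const second = defaultFactory();
first.specs.push("spec-a"); // state on `first` does not leak into `second`
```

Callers that previously used `defaultLocalExecutor` as an object would now need to call it first, as the updated doc comment says.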

package/dist/runtime/registry.d.ts CHANGED

@@ -61,7 +61,10 @@ export interface SerializedSpec {
  * Create a new scoped runtime with lifecycle management
  * Returns a handle for proper resource management
  */
-export declare function createEvalRuntime(projectRoot?: string): RuntimeHandle;
+export declare function createEvalRuntime(projectRootOrConfig?: string | {
+    name?: string;
+    projectRoot?: string;
+}): RuntimeHandle;
 /**
  * Helper function for safe runtime execution with automatic cleanup
  * Ensures runtime is disposed even if an exception is thrown

package/dist/runtime/registry.js CHANGED

@@ -315,7 +315,10 @@ class EvalRuntimeImpl {
  * Create a new scoped runtime with lifecycle management
  * Returns a handle for proper resource management
  */
-function createEvalRuntime(projectRoot = process.cwd()) {
+function createEvalRuntime(projectRootOrConfig = process.cwd()) {
+    const projectRoot = typeof projectRootOrConfig === "string"
+        ? projectRootOrConfig
+        : (projectRootOrConfig.projectRoot ?? process.cwd());
     const runtime = new EvalRuntimeImpl(projectRoot);
     // Create bound defineEval function
     const boundDefineEval = ((nameOrConfig, executor, options) => {
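
Note on the hunk above: `createEvalRuntime` now accepts either a project root string or a config object. The sketch below isolates the normalization expression from the diff. `RuntimeConfig` and `resolveProjectRoot` are hypothetical names, and a fixed `"/default/cwd"` string stands in for `process.cwd()` to keep the example free of Node typings.

```typescript
// Hypothetical names; only the normalization expression follows the diff.
type RuntimeConfig = { name?: string; projectRoot?: string };

function resolveProjectRoot(
  projectRootOrConfig: string | RuntimeConfig = "/default/cwd",
): string {
  // A string is taken as the root; a config object falls back to the default
  // when projectRoot is omitted.
  return typeof projectRootOrConfig === "string"
    ? projectRootOrConfig
    : (projectRootOrConfig.projectRoot ?? "/default/cwd");
}
```

In the lines shown, the `name` field is accepted by the signature but not read.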

package/dist/snapshot.d.ts CHANGED

@@ -166,6 +166,18 @@ export declare function loadSnapshot(name: string, dir?: string): Promise<Snapsh
  * ```
  */
 export declare function compareWithSnapshot(name: string, currentOutput: unknown, dir?: string): Promise<SnapshotComparison>;
+/**
+ * Compare two saved snapshots by name (convenience function)
+ *
+ * @example
+ * ```typescript
+ * const comparison = await compareSnapshots('baseline', 'current');
+ * if (!comparison.matches) {
+ *   console.log('Snapshots differ!', comparison.differences);
+ * }
+ * ```
+ */
+export declare function compareSnapshots(nameA: string, nameB: string, dir?: string): Promise<SnapshotComparison>;
 /**
  * Delete a snapshot (convenience function)
  */

package/dist/snapshot.js CHANGED

@@ -55,6 +55,7 @@ exports.SnapshotManager = void 0;
 exports.snapshot = snapshot;
 exports.loadSnapshot = loadSnapshot;
 exports.compareWithSnapshot = compareWithSnapshot;
+exports.compareSnapshots = compareSnapshots;
 exports.deleteSnapshot = deleteSnapshot;
 exports.listSnapshots = listSnapshots;
 // Environment check
@@ -130,7 +131,13 @@ class SnapshotManager {
         if (!options?.overwrite && fs.existsSync(filePath)) {
             throw new Error(`Snapshot '${name}' already exists. Use overwrite: true to update.`);
         }
-        const serialized = typeof output === "string" ? output : JSON.stringify(output);
+        const serialized = output === undefined
+            ? "undefined"
+            : output === null
+                ? "null"
+                : typeof output === "string"
+                    ? output
+                    : JSON.stringify(output);
         const snapshotData = {
             output: serialized,
             metadata: {
@@ -310,6 +317,22 @@ async function compareWithSnapshot(name, currentOutput, dir) {
     const manager = getSnapshotManager(dir);
     return manager.compare(name, currentOutput);
 }
+/**
+ * Compare two saved snapshots by name (convenience function)
+ *
+ * @example
+ * ```typescript
+ * const comparison = await compareSnapshots('baseline', 'current');
+ * if (!comparison.matches) {
+ *   console.log('Snapshots differ!', comparison.differences);
+ * }
+ * ```
+ */
+async function compareSnapshots(nameA, nameB, dir) {
+    const manager = getSnapshotManager(dir);
+    const snapshotB = await manager.load(nameB);
+    return manager.compare(nameA, snapshotB.output);
+}
 /**
  * Delete a snapshot (convenience function)
  */
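
Note on the serialization hunk above: `JSON.stringify(undefined)` returns `undefined` rather than a string, which is the case the new `output === undefined` branch guards against. The sketch restates the ternary from the diff as a standalone function; the name `serializeOutput` is hypothetical.

```typescript
// Follows the serialization ternary in the diff; the function name is hypothetical.
function serializeOutput(output: unknown): string {
  if (output === undefined) return "undefined";
  if (output === null) return "null";
  return typeof output === "string" ? output : JSON.stringify(output);
}

// JSON.stringify(undefined) yields undefined rather than a string,
// so without the explicit branch the snapshot would carry no output text.
const rawUndefined: unknown = JSON.stringify(undefined);
```

One consequence of this encoding is that the value `undefined` and the literal string `"undefined"` serialize to the same text.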

package/dist/version.d.ts CHANGED

@@ -3,5 +3,5 @@
  * X-EvalGate-SDK-Version: SDK package version
  * X-EvalGate-Spec-Version: OpenAPI spec version (docs/openapi.json info.version)
  */
-export declare const SDK_VERSION = "2.2.2";
-export declare const SPEC_VERSION = "2.2.2";
+export declare const SDK_VERSION = "2.2.3";
+export declare const SPEC_VERSION = "2.2.3";

package/dist/version.js CHANGED

@@ -6,5 +6,5 @@ exports.SPEC_VERSION = exports.SDK_VERSION = void 0;
  * X-EvalGate-SDK-Version: SDK package version
  * X-EvalGate-Spec-Version: OpenAPI spec version (docs/openapi.json info.version)
  */
-exports.SDK_VERSION = "2.2.2";
-exports.SPEC_VERSION = "2.2.2";
+exports.SDK_VERSION = "2.2.3";
+exports.SPEC_VERSION = "2.2.3";

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "@evalgate/sdk",
-	"version": "2.2.2",
+	"version": "2.2.3",
 	"publishConfig": {
 		"access": "public",
 		"registry": "https://registry.npmjs.org/"