@evalgate/sdk 2.2.1 → 2.2.3
- package/CHANGELOG.md +70 -1
- package/README.md +36 -7
- package/dist/assertions.d.ts +67 -5
- package/dist/assertions.js +733 -45
- package/dist/cache.d.ts +1 -1
- package/dist/cache.js +1 -1
- package/dist/cli/upgrade.js +5 -0
- package/dist/client.js +1 -1
- package/dist/errors.js +7 -0
- package/dist/export.d.ts +1 -1
- package/dist/export.js +3 -3
- package/dist/index.d.ts +4 -4
- package/dist/index.js +14 -3
- package/dist/integrations/anthropic.js +6 -6
- package/dist/integrations/openai.js +6 -6
- package/dist/pagination.d.ts +13 -2
- package/dist/pagination.js +28 -2
- package/dist/runtime/adapters/testsuite-to-dsl.js +1 -6
- package/dist/runtime/executor.d.ts +3 -2
- package/dist/runtime/executor.js +3 -2
- package/dist/runtime/registry.d.ts +4 -1
- package/dist/runtime/registry.js +4 -1
- package/dist/snapshot.d.ts +14 -2
- package/dist/snapshot.js +30 -4
- package/dist/types.d.ts +7 -2
- package/dist/types.js +7 -2
- package/dist/version.d.ts +2 -2
- package/dist/version.js +2 -2
- package/dist/workflows.js +6 -1
- package/package.json +2 -2
package/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,67 @@ All notable changes to the @evalgate/sdk package will be documented in this file
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [2.2.3] - 2026-03-03
|
|
9
|
+
|
|
10
|
+
### Fixed
|
|
11
|
+
|
|
12
|
+
- **`RequestCache.set` missing default TTL** — entries stored without an explicit TTL were immediately stale on next read. Default is now `CacheTTL.MEDIUM`; callers that omit `ttl` get a live cache entry instead of a cache miss every time.
|
|
13
|
+
- **`EvalGateError` subclass prototype chain** — `ValidationError.name` was silently overwritten by the base class constructor, surfacing as `"EvalGateError"` in stack traces and `instanceof` checks. All four subclasses (`ValidationError`, `RateLimitError`, `AuthenticationError`, `NetworkError`) now call `Object.setPrototypeOf(this, Subclass.prototype)` and set `this.name` after `super()`.
|
|
14
|
+
- **`RateLimitError.retryAfter` not a direct property** — the value was only stored inside `details.retryAfter` and not accessible as `err.retryAfter`. It is now assigned directly on the instance when provided.
|
|
15
|
+
- **`autoPaginate` returned `AsyncGenerator` instead of `Promise<T[]>`** — calling `await autoPaginate(fetcher)` was resolving to an unexhausted generator. It now collects all pages and returns a flat `Promise<T[]>`. The original streaming behaviour is available via the new `autoPaginateGenerator` export.
|
|
16
|
+
- **`createEvalRuntime` string-only overload** — passing `{ name, projectRoot }` config objects was ignored (treated as `process.cwd()`). The function now accepts `string | { name?: string; projectRoot?: string }` and extracts `projectRoot` correctly.
|
|
17
|
+
- **`defaultLocalExecutor` was an instance, not a factory** — importing `defaultLocalExecutor` returned a pre-constructed executor rather than a callable factory. It is now re-exported as `createLocalExecutor` so each import site can call it to get a fresh instance.
|
|
18
|
+
- **`SnapshotManager.save` crash on `undefined`/`null` output** — passing `undefined` or `null` to `snapshot(name, output)` threw `TypeError: Cannot convert undefined to string`. Both values are now serialized to the strings `"undefined"` and `"null"` respectively, matching the existing `null`-safe coercion already present for objects.
|
|
19
|
+
- **`compareSnapshots` loaded raw string instead of disk snapshot** — the old `compareWithSnapshot` alias passed its second argument as literal content rather than a snapshot name, producing meaningless diffs. The new `compareSnapshots(nameA, nameB, dir?)` loads both snapshots from disk before diffing.
|
|
20
|
+
- **`AIEvalClient` default `baseUrl`** — the no-arg constructor defaulted to `http://localhost:3000`, causing silent failures in production environments. Default is now `https://api.evalgate.com`.
|
|
21
|
+
- **`importData` unguarded `client.traces` / `client.evaluations` access** — calling `importData(data)` with a partial or undefined client could throw `TypeError: Cannot read properties of undefined`. Both property accesses now use optional chaining (`client?.traces`, `client?.evaluations`).
|
|
22
|
+
- **`toContainCode` required a fenced code block** — raw function definitions, `const` assignments, class declarations, arrow functions, `import`/`export` statements, and `return` expressions now satisfy the assertion without needing triple-backtick fencing.
|
|
23
|
+
- **`hasReadabilityScore` ignored `{min}` object form** — passing `{ min: 40 }` instead of a plain number was coerced to `NaN` threshold, making every call return `true`. The function now unwraps `{ min?, max? }` objects and applies both bounds.
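The `EvalGateError` entries above describe a common pattern for subclassing `Error`: restore the prototype and set `name` after `super()` returns, so the base constructor cannot overwrite them. The following is a minimal sketch of that pattern, not the SDK's source. The class names are borrowed from the changelog for readability, and the constructor signatures are assumptions.

```typescript
// Illustrative only: constructor shapes are assumed, not taken from the SDK.
class EvalGateError extends Error {
  constructor(message: string, public details?: Record<string, unknown>) {
    super(message);
    this.name = "EvalGateError";
    // Restore the prototype chain; when compiling to ES5, subclasses of
    // built-ins such as Error otherwise lose their own prototype.
    Object.setPrototypeOf(this, EvalGateError.prototype);
  }
}

class RateLimitError extends EvalGateError {
  retryAfter?: number;

  constructor(message: string, retryAfter?: number) {
    super(message, retryAfter === undefined ? undefined : { retryAfter });
    // Both assignments run after super(), so the base class cannot undo them.
    Object.setPrototypeOf(this, RateLimitError.prototype);
    this.name = "RateLimitError";
    // Expose the value directly on the instance as well as inside `details`.
    if (retryAfter !== undefined) this.retryAfter = retryAfter;
  }
}

const err = new RateLimitError("slow down", 30);
```

With this shape, `err instanceof RateLimitError`, `err.name`, and `err.retryAfter` all behave as a caller would expect, whichever compile target is used.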
|
|
24
|
+
|
|
25
|
+
### Added
|
|
26
|
+
|
|
27
|
+
- **`autoPaginateGenerator`** — new export for streaming pagination as an `AsyncGenerator<T[]>` (one chunk per page). Use when you want to process pages incrementally rather than wait for all pages to load.
|
|
28
|
+
- **`compareSnapshots(nameA, nameB, dir?)`** — loads both named snapshots from disk and returns a `SnapshotComparison`. Replaces the incorrectly aliased `compareWithSnapshot`.
|
|
29
|
+
- **141 new regression tests** across 9 test files covering all fixes above: `RequestCache` TTL defaults, error class prototype chains, `autoPaginate` flat-array return, `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager` null/undefined handling, `compareSnapshots` disk-load path, `AIEvalClient` default `baseUrl`, `importData` guards, `toContainCode` raw-code detection, and `hasReadabilityScore` object form.
|
|
30
|
+
- **`upgrade --full` post-upgrade warning** — CLI now prints a reminder to run `npx evalgate baseline update` after a full upgrade to avoid a false regression on the next CI run.
|
|
31
|
+
- **Optional chaining on OpenAI / Anthropic integration `traces.create`** — `evalClient.traces?.create(...)` prevents crashes when the `traces` resource is unavailable on the client (e.g. minimal config or testing without a full API key).
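The `autoPaginate` / `autoPaginateGenerator` split described above can be pictured as a streaming generator plus a thin wrapper that drains it into one flat array. This is a sketch under assumptions: the `Page` shape, the cursor-based `Fetcher` signature, and the function names `paginateChunks` / `paginateAll` are hypothetical and are not the SDK's API.

```typescript
// Hypothetical page shape and fetcher signature, chosen for this sketch only.
interface Page<T> {
  items: T[];
  nextCursor?: string;
}
type Fetcher<T> = (cursor?: string) => Promise<Page<T>>;

// Streaming form: yields one chunk (an array of items) per page.
async function* paginateChunks<T>(fetcher: Fetcher<T>): AsyncGenerator<T[]> {
  let cursor: string | undefined;
  do {
    const page = await fetcher(cursor);
    yield page.items;
    cursor = page.nextCursor;
  } while (cursor !== undefined);
}

// Collected form: drains the generator and resolves to a flat array.
async function paginateAll<T>(fetcher: Fetcher<T>): Promise<T[]> {
  const all: T[] = [];
  for await (const chunk of paginateChunks(fetcher)) {
    all.push(...chunk);
  }
  return all;
}

// In-memory stand-in for a paged API: three pages keyed by cursor.
const pages: Record<string, Page<number>> = {
  start: { items: [1, 2], nextCursor: "p2" },
  p2: { items: [3, 4], nextCursor: "p3" },
  p3: { items: [5] },
};
const fakeFetcher: Fetcher<number> = async (cursor) => pages[cursor ?? "start"];
```

Awaiting the collected form gives the caller a plain array, while the generator form lets a caller stop early or process each page as it arrives.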
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## [2.2.2] - 2026-03-03
|
|
36
|
+
|
|
37
|
+
### Fixed
|
|
38
|
+
|
|
39
|
+
- **8 stub assertions replaced with real implementations:**
|
|
40
|
+
- `hasSentiment` — substring matching + expanded 34/31-word positive/negative lexicon (was exact-match, 4 words each)
|
|
41
|
+
- `hasNoHallucinations` — case-insensitive fact matching (was case-sensitive)
|
|
42
|
+
- `hasFactualAccuracy` — case-insensitive fact matching (was case-sensitive)
|
|
43
|
+
- `containsLanguage` — expanded from 3 languages (en/es/fr) to 12 (+ de/it/pt/nl/ru/zh/ja/ko/ar) with BCP-47 subtag support (`zh-CN` → `zh`)
|
|
44
|
+
- `hasValidCodeSyntax` — real bracket/brace/parenthesis balance checker with string literal and comment awareness (handles JS `//`/`/* */`, Python `#`, template literals, single/double quotes); JSON fast-path via `JSON.parse`
|
|
45
|
+
- `hasNoToxicity` — expanded from 4 words to ~80 terms across 9 categories: insults, degradation, violence/threats, self-harm directed at others, dehumanization, hate/rejection, harassment, profanity-as-attacks, bullying/appearance/mental-health weaponization
|
|
46
|
+
- `hasReadabilityScore` — fixed Flesch-Kincaid syllable counting to be per-word (was treating entire text as one word)
|
|
47
|
+
- `matchesSchema` — now dispatches on schema format: JSON Schema `required` array (`{ required: ['name'] }` → checks required keys exist), JSON Schema `properties` object (`{ properties: { name: {} } }` → checks property keys exist), or simple key-presence template (existing behavior preserved for backward compat). Fixes regression: `matchesSchema({ name: 'test', score: 95 }, { type: 'object', required: ['name'] })` was returning `false`
|
|
48
|
+
- **`importData` crash** — `options: ImportOptions` parameter now defaults to `{}` to prevent `Cannot read properties of undefined (reading 'dryRun')` when called as `importData(client, data)`
|
|
49
|
+
- **`compareWithSnapshot` / `SnapshotManager.compare` object coercion** — both now accept `unknown` input and coerce non-string values via `JSON.stringify` before comparison, matching the existing behavior of `SnapshotManager.save()`
|
|
50
|
+
- **`WorkflowTracer` constructor crash** — defensive guard: `typeof client?.getOrganizationId === "function"` before calling it; prevents `TypeError: client.getOrganizationId is not a function` when using partial clients or initializing without an API key
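The `matchesSchema` entry above describes dispatching on the schema's format. A minimal sketch of that dispatch follows. It is not the SDK's implementation: the function name `matchesSchemaSketch` is hypothetical, and checking `required` before `properties` is an assumption, since the changelog does not state a precedence.

```typescript
// Sketch of format dispatch; precedence between `required` and `properties`
// is an assumption of this example.
function matchesSchemaSketch(value: unknown, schema: Record<string, unknown>): boolean {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  const hasAll = (keys: string[]): boolean => keys.every((k) => k in obj);

  // JSON Schema `required` array: every listed key must be present.
  if (Array.isArray(schema.required)) {
    return hasAll(schema.required.filter((k): k is string => typeof k === "string"));
  }
  // JSON Schema `properties` object: every declared property must be present.
  if (typeof schema.properties === "object" && schema.properties !== null) {
    return hasAll(Object.keys(schema.properties));
  }
  // Fallback: treat the schema itself as a key-presence template.
  return hasAll(Object.keys(schema));
}

const value = { name: "test", score: 95 };
const viaRequired = matchesSchemaSketch(value, { type: "object", required: ["name"] });
const viaTemplate = matchesSchemaSketch(value, { name: "", score: 0 });
```

A pure key-presence check would look for `type` and `required` as keys on the value, which is one plausible reading of why the quoted call returned `false` before the fix.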
|
|
51
|
+
|
|
52
|
+
### Added
|
|
53
|
+
|
|
54
|
+
- **LLM-backed async assertion variants** — 6 new exported functions:
|
|
55
|
+
- `hasSentimentAsync(text, expected, config?)` — LLM classifies sentiment with full context awareness
|
|
56
|
+
- `hasNoToxicityAsync(text, config?)` — LLM detects sarcastic, implicit, and culturally specific toxic content that blocklists miss
|
|
57
|
+
- `containsLanguageAsync(text, language, config?)` — LLM language detection for any language
|
|
58
|
+
- `hasValidCodeSyntaxAsync(code, language, config?)` — LLM deep syntax analysis beyond bracket balance
|
|
59
|
+
- `hasFactualAccuracyAsync(text, facts, config?)` — LLM checks facts semantically, catches paraphrased inaccuracies
|
|
60
|
+
- `hasNoHallucinationsAsync(text, groundTruth, config?)` — LLM detects fabricated claims even when paraphrased
|
|
61
|
+
- **`configureAssertions(config: AssertionLLMConfig)`** — set global LLM provider/apiKey/model/baseUrl once; all `*Async` functions use it automatically; per-call `config` overrides it
|
|
62
|
+
- **`getAssertionConfig()`** — retrieve current global assertion LLM config
|
|
63
|
+
- **`AssertionLLMConfig` type** — exported interface: `{ provider: "openai" | "anthropic"; apiKey: string; model?: string; baseUrl?: string }`
|
|
64
|
+
- **JSDoc `**Fast and approximate**` / `**Slow and accurate**` markers** on all sync/async assertion pairs with `{@link xAsync}` cross-references that appear in IDE tooltips
|
|
65
|
+
- **115 new tests** in `assertions.test.ts` covering all improved sync assertions (expanded lexicons, JSON Schema formats, bracket balance edge cases, 12-language detection, BCP-47) and all 6 async variants (OpenAI path, Anthropic path, global config, error cases, HTTP 4xx handling)
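The `configureAssertions` entry says a per-call `config` overrides the global one. One way to picture that resolution is sketched below. The interface is copied from the changelog, but `configureSketch` and `resolveConfig` are hypothetical names, and the choice to replace the global config wholesale (rather than merge field by field) is an assumption.

```typescript
// Mirrors the exported interface quoted in the changelog.
interface AssertionLLMConfig {
  provider: "openai" | "anthropic";
  apiKey: string;
  model?: string;
  baseUrl?: string;
}

// Module-level state standing in for the SDK's global configuration.
let globalConfig: AssertionLLMConfig | null = null;

function configureSketch(config: AssertionLLMConfig): void {
  globalConfig = config;
}

// Hypothetical helper: a per-call config wins over the global one; with
// neither present, fail with an explicit error instead of a silent default.
function resolveConfig(perCall?: AssertionLLMConfig): AssertionLLMConfig {
  const resolved = perCall ?? globalConfig;
  if (resolved === null) {
    throw new Error("No LLM config: call configureSketch() or pass a config");
  }
  return resolved;
}
```

An async assertion built this way would call `resolveConfig(config)` first, so a missing key surfaces as a clear error before any network request is attempted.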
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
8
69
|
## [2.2.1] - 2026-03-03
|
|
9
70
|
|
|
10
71
|
### Fixed
|
|
@@ -50,6 +111,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
50
111
|
- **Low:** Explain no longer shows "unnamed" for builtin gate failures
|
|
51
112
|
- **Docs:** Added missing `discover --manifest` step to local quickstart
|
|
52
113
|
|
|
114
|
+
---
|
|
115
|
+
|
|
53
116
|
## [2.1.2] - 2026-03-02
|
|
54
117
|
|
|
55
118
|
### Fixed
|
|
@@ -57,12 +120,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
57
120
|
- **Type safety** — aligned with platform 2.1.2; zero TypeScript errors across all integration points
|
|
58
121
|
- **CI gate** — all SDK tests, lint, and build checks passing
|
|
59
122
|
|
|
123
|
+
---
|
|
124
|
+
|
|
60
125
|
## [2.1.1] - 2026-03-02
|
|
61
126
|
|
|
62
127
|
### Fixed
|
|
63
128
|
|
|
64
129
|
- Version alignment with platform 2.1.1
|
|
65
130
|
|
|
131
|
+
---
|
|
132
|
+
|
|
66
133
|
## [2.0.0] - 2026-03-01
|
|
67
134
|
|
|
68
135
|
### Breaking — EvalGate Rebrand
|
|
@@ -364,7 +431,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
364
431
|
- error catalog stability + graceful handling of unknown codes
|
|
365
432
|
- exports contract (retention visibility, 410 semantics)
|
|
366
433
|
|
|
367
|
-
|
|
434
|
+
---
|
|
368
435
|
|
|
369
436
|
## [1.5.0] - 2026-02-18
|
|
370
437
|
|
|
@@ -412,6 +479,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
412
479
|
- **Package hardening** — `files`, `module`, `sideEffects: false` for leaner npm publish
|
|
413
480
|
- **CLI** — Passes `baseline` param to quality API for deterministic CI gates
|
|
414
481
|
|
|
482
|
+
---
|
|
483
|
+
|
|
415
484
|
## [1.3.0] - 2025-10-21
|
|
416
485
|
|
|
417
486
|
### ✨ Added
|
package/README.md
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
[](https://www.npmjs.com/package/@evalgate/sdk)
|
|
4
4
|
[](https://www.npmjs.com/package/@evalgate/sdk)
|
|
5
5
|
[](https://www.typescriptlang.org/)
|
|
6
|
-
[](#)
|
|
7
7
|
[](#)
|
|
8
8
|
[](https://opensource.org/licenses/MIT)
|
|
9
9
|
|
|
@@ -366,6 +366,33 @@ import type {
|
|
|
366
366
|
} from "@evalgate/sdk/regression";
|
|
367
367
|
```
|
|
368
368
|
|
|
369
|
+
### Assertions — Sync (fast, heuristic) and Async (slow, LLM-backed)
|
|
370
|
+
|
|
371
|
+
```typescript
|
|
372
|
+
import {
|
|
373
|
+
// Sync — fast and approximate (no API key needed)
|
|
374
|
+
hasSentiment, hasNoToxicity, hasValidCodeSyntax,
|
|
375
|
+
containsLanguage, hasFactualAccuracy, hasNoHallucinations,
|
|
376
|
+
matchesSchema,
|
|
377
|
+
// Async — slow and accurate (requires API key)
|
|
378
|
+
configureAssertions, hasSentimentAsync, hasNoToxicityAsync,
|
|
379
|
+
hasValidCodeSyntaxAsync, containsLanguageAsync,
|
|
380
|
+
hasFactualAccuracyAsync, hasNoHallucinationsAsync,
|
|
381
|
+
} from "@evalgate/sdk";
|
|
382
|
+
|
|
383
|
+
// Configure once (or pass per-call)
|
|
384
|
+
configureAssertions({ provider: "openai", apiKey: process.env.OPENAI_API_KEY });
|
|
385
|
+
|
|
386
|
+
// Sync — fast, no network
|
|
387
|
+
console.log(hasSentiment("I love this!", "positive")); // true
|
|
388
|
+
console.log(hasNoToxicity("Have a great day!")); // true
|
|
389
|
+
console.log(hasValidCodeSyntax("function f() {}", "js")); // true
|
|
390
|
+
|
|
391
|
+
// Async — LLM-backed, context-aware
|
|
392
|
+
console.log(await hasSentimentAsync("subtle irony...", "negative")); // true
|
|
393
|
+
console.log(await hasNoToxicityAsync("sarcastic attack text")); // false
|
|
394
|
+
```
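A note on the `configureAssertions` call in the README snippet above: with Node's typings and `strictNullChecks`, `process.env.OPENAI_API_KEY` has type `string | undefined`, while `AssertionLLMConfig.apiKey` is declared as `string`. A small guard like the one below narrows the value and fails fast when the variable is missing. `requireEnv` is a hypothetical helper, not part of the SDK, and it takes the environment as a parameter so the sketch stays self-contained.

```typescript
// Hypothetical helper: reads a required variable from an env-like map and
// narrows `string | undefined` to `string`, throwing when it is absent or empty.
function requireEnv(env: Record<string, string | undefined>, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// In a Node project this would be requireEnv(process.env, "OPENAI_API_KEY");
// a literal map is used here so the example runs without Node's typings.
const apiKey: string = requireEnv({ OPENAI_API_KEY: "sk-test" }, "OPENAI_API_KEY");
```

The returned `apiKey` can then be passed as `configureAssertions({ provider: "openai", apiKey })` without a type assertion.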
|
|
395
|
+
|
|
369
396
|
### Platform Client
|
|
370
397
|
|
|
371
398
|
```typescript
|
|
@@ -423,17 +450,19 @@ Your local `openAIChatEval` runs continue to work. No account cancellation. No d
|
|
|
423
450
|
|
|
424
451
|
See [CHANGELOG.md](CHANGELOG.md) for the full release history.
|
|
425
452
|
|
|
426
|
-
**
|
|
453
|
+
**v2.2.3** — Bug-fix release. `RequestCache` default TTL, `EvalGateError` subclass prototype chain and `retryAfter` direct property, `autoPaginate` now returns `Promise<T[]>` (new `autoPaginateGenerator` for streaming), `createEvalRuntime` config-object overload, `defaultLocalExecutor` callable factory, `SnapshotManager.save` null/undefined safety, `compareSnapshots` loads both sides from disk, `AIEvalClient` default baseUrl → `https://api.evalgate.com`, `importData` optional-chaining guards, `toContainCode` raw-code detection, `hasReadabilityScore` `{min,max}` object form. 141 new regression tests.
|
|
427
454
|
|
|
428
|
-
**
|
|
455
|
+
**v2.2.2** — 8 stub assertions replaced with real implementations (`hasSentiment` expanded lexicon, `hasNoToxicity` ~80-term blocklist, `hasValidCodeSyntax` real bracket balance, `containsLanguage` 12 languages + BCP-47, `hasFactualAccuracy`/`hasNoHallucinations` case-insensitive, `hasReadabilityScore` per-word syllable fix, `matchesSchema` JSON Schema support). Added LLM-backed `*Async` variants + `configureAssertions`. Fixed `importData` crash, `compareWithSnapshot` object coercion, `WorkflowTracer` defensive guard. 115 new tests.
|
|
429
456
|
|
|
430
|
-
**
|
|
457
|
+
**v2.2.1** — `snapshot(name, output)` accepts objects; auto-serialized via `JSON.stringify`
|
|
431
458
|
|
|
432
|
-
**
|
|
459
|
+
**v2.2.0** — `expect().not` modifier, `hasPII()`, `defineSuite` object form, `snapshot` parameter order fix, `specId` collision fix
|
|
433
460
|
|
|
434
|
-
**v1.
|
|
461
|
+
**v1.8.0** — `evalgate doctor` rewrite (9-check checklist), `evalgate explain` command, guided failure flow, CI template with doctor preflight
|
|
462
|
+
|
|
463
|
+
**v1.7.0** — `evalgate init` scaffolder, `evalgate upgrade --full`, `detectRunner()`, machine-readable gate output, init test matrix
|
|
435
464
|
|
|
436
|
-
**v1.
|
|
465
|
+
**v1.6.0** — `evalgate gate`, `evalgate baseline`, regression gate constants & types
|
|
437
466
|
|
|
438
467
|
## License
|
|
439
468
|
|
package/dist/assertions.d.ts
CHANGED
|
@@ -126,10 +126,11 @@ export declare class Expectation {
|
|
|
126
126
|
*/
|
|
127
127
|
toBeBetween(min: number, max: number, message?: string): AssertionResult;
|
|
128
128
|
/**
|
|
129
|
-
* Assert value contains code block
|
|
129
|
+
* Assert value contains code block or raw code
|
|
130
130
|
* @example expect(output).toContainCode()
|
|
131
|
+
* @example expect(output).toContainCode('typescript')
|
|
131
132
|
*/
|
|
132
|
-
toContainCode(message?: string): AssertionResult;
|
|
133
|
+
toContainCode(language?: string, message?: string): AssertionResult;
|
|
133
134
|
/**
|
|
134
135
|
* Assert value is professional tone (no profanity)
|
|
135
136
|
* @example expect(output).toBeProfessional()
|
|
@@ -193,18 +194,79 @@ export declare function notContainsPII(text: string): boolean;
|
|
|
193
194
|
* if (hasPII(response)) throw new Error("PII leak");
|
|
194
195
|
*/
|
|
195
196
|
export declare function hasPII(text: string): boolean;
|
|
197
|
+
/**
|
|
198
|
+
* Lexicon-based sentiment check. **Fast and approximate** — suitable for
|
|
199
|
+
* low-stakes filtering or CI smoke tests. For production safety gates use
|
|
200
|
+
* {@link hasSentimentAsync} with an LLM provider for context-aware accuracy.
|
|
201
|
+
*/
|
|
196
202
|
export declare function hasSentiment(text: string, expected: "positive" | "negative" | "neutral"): boolean;
|
|
197
203
|
export declare function similarTo(text1: string, text2: string, threshold?: number): boolean;
|
|
198
204
|
export declare function withinRange(value: number, min: number, max: number): boolean;
|
|
199
205
|
export declare function isValidEmail(email: string): boolean;
|
|
200
206
|
export declare function isValidURL(url: string): boolean;
|
|
201
|
-
|
|
207
|
+
/**
|
|
208
|
+
* Substring-based hallucination check — verifies each ground-truth fact
|
|
209
|
+
* appears verbatim in the text. **Fast and approximate**: catches missing
|
|
210
|
+
* facts but cannot detect paraphrased fabrications. Use
|
|
211
|
+
* {@link hasNoHallucinationsAsync} for semantic accuracy.
|
|
212
|
+
*/
|
|
213
|
+
export declare function hasNoHallucinations(text: string, groundTruth?: string[]): boolean;
|
|
202
214
|
export declare function matchesSchema(value: unknown, schema: Record<string, unknown>): boolean;
|
|
203
|
-
export declare function hasReadabilityScore(text: string, minScore: number
|
|
215
|
+
export declare function hasReadabilityScore(text: string, minScore: number | {
|
|
216
|
+
min?: number;
|
|
217
|
+
max?: number;
|
|
218
|
+
}): boolean;
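The widened `minScore` parameter above accepts either a plain number or a `{ min?, max? }` object. A sketch of how such a union might be normalized before comparison is shown below. `ScoreBounds`, `toBounds`, and `withinBounds` are hypothetical names, and treating both bounds as inclusive is an assumption.

```typescript
type ScoreBounds = number | { min?: number; max?: number };

// Normalizes either form into explicit bounds; a bare number is read as `min`.
function toBounds(threshold: ScoreBounds): { min: number; max: number } {
  if (typeof threshold === "number") {
    return { min: threshold, max: Infinity };
  }
  // Missing bounds default to "no limit" on that side.
  return { min: threshold.min ?? -Infinity, max: threshold.max ?? Infinity };
}

// Inclusive on both ends (an assumption of this sketch).
function withinBounds(score: number, threshold: ScoreBounds): boolean {
  const { min, max } = toBounds(threshold);
  return score >= min && score <= max;
}
```

Normalizing first means the object form can never reach a numeric comparison as `NaN`, which is the failure mode the 2.2.3 changelog entry describes.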
|
|
219
|
+
/**
|
|
220
|
+
* Keyword-frequency language detector supporting 12 languages.
|
|
221
|
+
* **Fast and approximate** — detects the most common languages reliably
|
|
222
|
+
* but may struggle with short texts or closely related languages.
|
|
223
|
+
* Use {@link containsLanguageAsync} for reliable detection of any language.
|
|
224
|
+
*/
|
|
204
225
|
export declare function containsLanguage(text: string, language: string): boolean;
|
|
226
|
+
/**
|
|
227
|
+
* Substring-based factual accuracy check. **Fast and approximate** — verifies
|
|
228
|
+
* each fact string appears in the text but cannot reason about meaning or
|
|
229
|
+
* paraphrasing. Use {@link hasFactualAccuracyAsync} for semantic accuracy.
|
|
230
|
+
*/
|
|
205
231
|
export declare function hasFactualAccuracy(text: string, facts: string[]): boolean;
|
|
206
232
|
export declare function respondedWithinTime(startTime: number, maxMs: number): boolean;
|
|
233
|
+
/**
|
|
234
|
+
* Blocklist-based toxicity check (~80 terms across 9 categories).
|
|
235
|
+
* **Fast and approximate** — catches explicit harmful language but has
|
|
236
|
+
* inherent gaps and context-blind false positives. Do NOT rely on this
|
|
237
|
+
* alone for production content safety gates; use {@link hasNoToxicityAsync}
|
|
238
|
+
* with an LLM for context-aware moderation.
|
|
239
|
+
*/
|
|
207
240
|
export declare function hasNoToxicity(text: string): boolean;
|
|
208
|
-
export declare function followsInstructions(text: string, instructions: string[]): boolean;
|
|
241
|
+
export declare function followsInstructions(text: string, instructions: string | string[]): boolean;
|
|
209
242
|
export declare function containsAllRequiredFields(obj: unknown, requiredFields: string[]): boolean;
|
|
243
|
+
export interface AssertionLLMConfig {
|
|
244
|
+
provider: "openai" | "anthropic";
|
|
245
|
+
apiKey: string;
|
|
246
|
+
model?: string;
|
|
247
|
+
baseUrl?: string;
|
|
248
|
+
}
|
|
249
|
+
export declare function configureAssertions(config: AssertionLLMConfig): void;
|
|
250
|
+
export declare function getAssertionConfig(): AssertionLLMConfig | null;
|
|
251
|
+
/**
|
|
252
|
+
* LLM-backed sentiment check. **Slow and accurate** — uses an LLM to
|
|
253
|
+
* classify sentiment with full context awareness. Requires
|
|
254
|
+
* {@link configureAssertions} or an inline `config` argument.
|
|
255
|
+
* Falls back gracefully with a clear error if no API key is configured.
|
|
256
|
+
*/
|
|
257
|
+
export declare function hasSentimentAsync(text: string, expected: "positive" | "negative" | "neutral", config?: AssertionLLMConfig): Promise<boolean>;
|
|
258
|
+
/**
|
|
259
|
+
* LLM-backed toxicity check. **Slow and accurate** — context-aware, handles
|
|
260
|
+
* sarcasm, implicit threats, and culturally specific harmful content that
|
|
261
|
+
* blocklists miss. Recommended for production content safety gates.
|
|
262
|
+
*/
|
|
263
|
+
export declare function hasNoToxicityAsync(text: string, config?: AssertionLLMConfig): Promise<boolean>;
|
|
264
|
+
export declare function containsLanguageAsync(text: string, language: string, config?: AssertionLLMConfig): Promise<boolean>;
|
|
265
|
+
export declare function hasValidCodeSyntaxAsync(code: string, language: string, config?: AssertionLLMConfig): Promise<boolean>;
|
|
266
|
+
export declare function hasFactualAccuracyAsync(text: string, facts: string[], config?: AssertionLLMConfig): Promise<boolean>;
|
|
267
|
+
/**
|
|
268
|
+
* LLM-backed hallucination check. **Slow and accurate** — detects fabricated
|
|
269
|
+
* claims even when they are paraphrased or contradict facts indirectly.
|
|
270
|
+
*/
|
|
271
|
+
export declare function hasNoHallucinationsAsync(text: string, groundTruth: string[], config?: AssertionLLMConfig): Promise<boolean>;
|
|
210
272
|
export declare function hasValidCodeSyntax(code: string, language: string): boolean;
|