json-llm-repair 0.1.0

Files changed (5)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Tiago Gouvêa
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,201 @@
+# json-llm-repair
+Parse and repair JSON from LLM outputs with intelligent repair strategies.
+## Why?
+LLMs frequently return JSON in unexpected formats. Models without `response_format` support often wrap JSON in explanatory text or produce malformed syntax. Even models with structured output support (like OpenAI's JSON mode or Anthropic's tool use) occasionally fail to return the exact schema, omitting wrapper objects or adding extra fields.
+This library handles these issues automatically, with configurable repair strategies.
+## Installation
+```bash
+npm install json-llm-repair
+# or
+yarn add json-llm-repair
+```
+## Quick Start
+```typescript
+import { parseFromLLM } from 'json-llm-repair';
+const llmOutput = 'Sure! Here is the data: {"name": "John", "age": 30} if you need anything else please let me know.';
+const data = parseFromLLM(llmOutput);
+console.log(data); // { name: "John", age: 30 }
+```
+## What It Fixes
+### 1. Extra Text Around JSON
+LLMs often add explanatory text before or after JSON.
+```typescript
+const llmOutput = 'Sure! Here is the data: {"name": "John"} Hope this helps!';
+const data = parseFromLLM(llmOutput);
+// Both modes handle this
+```
+### 2. JSON Inside Markdown Code Blocks
+Common with ChatGPT, Claude, and other assistants.
+```typescript
+const llmOutput = `Here's your data:
+\`\`\`json
+{"name": "John", "age": 30}
+\`\`\``;
+const data = parseFromLLM(llmOutput);
+// Both modes handle this
+```
+### 3. Multiple JSONs Concatenated
+An LLM sometimes outputs multiple JSON objects in sequence; only the first is returned.
+```typescript
+const llmOutput = '{"id": 1}{"id": 2}{"id": 3}';
+const data = parseFromLLM(llmOutput);
+// Returns first valid JSON: {"id": 1}
+```
+### 4. Invalid JSON Syntax
+Missing quotes, trailing commas, unquoted keys (repair mode only).
+```typescript
+const llmOutput = '{name: "John", age: 30,}';
+const data = parseFromLLM(llmOutput, { mode: 'repair' });
+// Fixed to: {"name": "John", "age": 30}
+```
+### 5. Missing Root Key
+The LLM omits the wrapper object expected by your schema (repair mode + schema).
+```typescript
+import { z } from 'zod';
+const UserSchema = z.object({
+  user: z.object({ name: z.string(), age: z.number() })
+});
+const llmOutput = '{"name": "John", "age": 30}';
+const data = parseFromLLM(llmOutput, { mode: 'repair', schema: UserSchema });
+// Wrapped to: { user: { name: "John", age: 30 } }
+```
+### 6. Unescaped Quotes in Strings
+The LLM embeds quotes without proper escaping (repair mode only).
+```typescript
+const llmOutput = '{"message": "She said "hello" to me"}';
+const data = parseFromLLM(llmOutput, { mode: 'repair' });
+// Fixed to: { message: 'She said "hello" to me' }
+```
+> **Note:** May not work reliably with non-ASCII characters (accents, etc).
+### 7. Missing Closing Braces or Quotes
+Incomplete JSON from streaming or interrupted responses (repair mode only).
+```typescript
+const llmOutput = '{"name": "John", "age": 30';
+const data = parseFromLLM(llmOutput, { mode: 'repair' });
+// Fixed to: { name: "John", age: 30 }
+```
+### 8. Duplicate Keys
+The same property appears multiple times. Native `JSON.parse` keeps the last value, so this works in either mode.
+```typescript
+const llmOutput = '{"id": 1, "name": "Alice", "id": 2}';
+const data = parseFromLLM(llmOutput, { mode: 'repair' });
+// Result: { id: 2, name: "Alice" } (last value wins)
+```
+## Mode Comparison
+| Failure Type | Parse Mode | Repair Mode |
+|--------------|------------|-------------|
+| Text before/after JSON | ✅ Extracts | ✅ Extracts |
+| JSON in markdown blocks | ✅ Extracts | ✅ Extracts |
+| Concatenated JSONs | ✅ Returns first | ✅ Returns first |
+| Unquoted keys | ❌ Throws error | ✅ Fixes with jsonrepair |
+| Trailing commas | ❌ Throws error | ✅ Fixes with jsonrepair |
+| Unescaped quotes in values | ❌ Throws error | ✅ Fixes with jsonrepair |
+| Missing closing braces/quotes | ❌ Throws error | ✅ Fixes with jsonrepair |
+| Duplicate keys in object | ✅ Parses (last wins) | ✅ Parses (last wins) |
+| Missing root object | ❌ Returns as-is | ✅ Wraps (with schema) |
+| Completely invalid JSON | ❌ Throws error | ⚠️ Best effort repair |
+## Modes
+| Mode | Behavior |
+|------|----------|
+| `parse` (default) | Extract and parse JSON. Fails on syntax errors. |
+| `repair` | All strategies: jsonrepair, multiple candidates, schema fixes. |
+## Examples
+### Parse Mode (default)
+```typescript
+// Extracts JSON from text, no repair
+const data = parseFromLLM('Here is your data: {"name": "John"}');
+```
+### Repair Mode
+```typescript
+// Handles broken JSON syntax
+const data = parseFromLLM(
+  'Sure! {name: "John", age: 30}', // unquoted keys
+  { mode: 'repair' }
+);
+```
+### Repair Mode + Schema
+```typescript
+import { z } from 'zod';
+const UserSchema = z.object({
+  user: z.object({
+    name: z.string(),
+    age: z.number()
+  })
+});
+// LLM forgot the root "user" key
+const data = parseFromLLM(
+  '{"name": "John", "age": 30}',
+  { mode: 'repair', schema: UserSchema }
+);
+console.log(data); // { user: { name: "John", age: 30 } }
+```
+## API
+### `parseFromLLM<T>(llmOutput: string, options?: ParseOptions): T`
+Parses JSON from LLM output.
+**Parameters:**
+- `llmOutput: string` - Raw string from LLM that may contain JSON
+- `options?: ParseOptions` - Optional configuration
+**Options:**
+- `mode?: 'parse' | 'repair'` - Parsing strategy (default: `'parse'`)
+- `schema?: ZodSchema` - Optional Zod schema used to wrap a missing root key (repair mode only); the result is not validated against it
+### Helper Functions
+- `hasPossibleJson(str: string): boolean` - Check if string contains JSON braces
+- `isJsonString(str: string): boolean` - Validate if string is valid JSON
+## Found an Issue?
+If you encounter a JSON output format that this library doesn't handle, please [open an issue](../../issues) with an example. We'll be happy to help and improve the library!
+## License
+MIT
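
A usage note on the two modes documented in the README above: since `parse` is the stricter path and `repair` the more permissive one, one plausible calling pattern is to try `parse` first and retry in `repair` mode only when it throws. The sketch below is an assumption about how a caller might do that, not something the package provides. `parseWithFallback`, `LlmParser`, and `FallbackResult` are hypothetical names, and the parser is passed in as an argument (any function shaped like `parseFromLLM`) so the snippet does not depend on the package being installed.

```typescript
// Hypothetical helper, not part of json-llm-repair.
type ParseMode = 'parse' | 'repair';

// Any function with the same call shape as parseFromLLM(input, { mode }).
type LlmParser = (input: string, options?: { mode?: ParseMode }) => unknown;

interface FallbackResult {
  value: unknown;
  mode: ParseMode; // which mode produced the value
}

function parseWithFallback(raw: string, parser: LlmParser): FallbackResult {
  try {
    // Strict path: extraction plus native parsing only.
    return { value: parser(raw, { mode: 'parse' }), mode: 'parse' };
  } catch {
    // Strict path failed; if repair also fails, its error propagates.
    return { value: parser(raw, { mode: 'repair' }), mode: 'repair' };
  }
}
```

Recording which mode succeeded makes it possible to log how often a given model needs repair, which is harder to see when `repair` is always used.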

package/dist/index.d.ts ADDED

@@ -0,0 +1,49 @@
+import { z } from 'zod';
+/**
+ * Parsing mode options:
+ * - parse: Only basic JSON extraction and parsing
+ * - repair: All repair strategies including jsonrepair and schema fixes
+ */
+export type ParseMode = 'parse' | 'repair';
+/**
+ * Options for parseFromLLM function
+ */
+export interface ParseOptions {
+    /**
+     * Parsing mode
+     * @default 'parse'
+     */
+    mode?: ParseMode;
+    /**
+     * Optional Zod schema used to wrap a missing root key
+     * Only used in repair mode; the result is not validated against it
+     */
+    schema?: z.ZodTypeAny;
+}
+/**
+ * Parses and extracts JSON from LLM output strings
+ *
+ * @param input - Raw string from LLM that may contain JSON
+ * @param options - Parsing options
+ * @returns Parsed JSON object
+ * @throws Error if no valid JSON is found
+ *
+ * @example
+ * ```ts
+ * // Parse mode (default)
+ * const data = parseFromLLM('Here is the data: {"name": "John"}');
+ *
+ * // Repair mode with schema
+ * const schema = z.object({ user: z.object({ name: z.string() }) });
+ * const data = parseFromLLM('{"name": "John"}', { mode: 'repair', schema });
+ * ```
+ */
+export declare function parseFromLLM<T = any>(input: string, options?: ParseOptions): T;
+/**
+ * Checks whether the string may contain JSON braces
+ */
+export declare function hasPossibleJson(str: string): boolean;
+/**
+ * Validates whether the string is valid JSON
+ */
+export declare function isJsonString(str: string): boolean;
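
A note on the generic in `parseFromLLM<T = any>` declared above: nothing in the declaration checks the parsed value against `T`, so the type argument behaves like an unchecked assertion. The sketch below shows one way a caller might narrow the result at runtime with a hand-written type guard. `User`, `isUser`, and `asUser` are hypothetical names introduced for illustration, and `JSON.parse` stands in for the parsed result so the snippet is self-contained.

```typescript
// Hypothetical shape and guard, not part of json-llm-repair.
interface User {
  name: string;
  age: number;
}

function isUser(value: unknown): value is User {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.name === 'string' && typeof record.age === 'number';
}

// Narrows an untyped parse result or throws with a descriptive message.
function asUser(parsed: unknown): User {
  if (!isUser(parsed)) {
    throw new Error('Parsed JSON does not match the User shape.');
  }
  return parsed;
}

// JSON.parse stands in here for the value returned by a parser.
const user = asUser(JSON.parse('{"name": "John", "age": 30}'));
console.log(user.name); // John
```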

package/dist/index.js ADDED

@@ -0,0 +1,226 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.parseFromLLM = parseFromLLM;
+exports.hasPossibleJson = hasPossibleJson;
+exports.isJsonString = isJsonString;
+const jsonrepair_1 = require("jsonrepair");
+const zod_1 = require("zod");
+/**
+ * Parses and extracts JSON from LLM output strings
+ *
+ * @param input - Raw string from LLM that may contain JSON
+ * @param options - Parsing options
+ * @returns Parsed JSON object
+ * @throws Error if no valid JSON is found
+ *
+ * @example
+ * ```ts
+ * // Parse mode (default)
+ * const data = parseFromLLM('Here is the data: {"name": "John"}');
+ *
+ * // Repair mode with schema
+ * const schema = z.object({ user: z.object({ name: z.string() }) });
+ * const data = parseFromLLM('{"name": "John"}', { mode: 'repair', schema });
+ * ```
+ */
+function parseFromLLM(input, options) {
+    const mode = options?.mode || 'parse';
+    const schema = options?.schema;
+    let result;
+    if (mode === 'parse') {
+        result = parseOnly(input);
+    }
+    else {
+        result = parseWithRepair(input);
+    }
+    // Apply schema fixes only in repair mode
+    if (mode === 'repair' && schema) {
+        result = wrapRootIfMissing(result, schema);
+    }
+    return result;
+}
+/**
+ * Parse mode: extract and parse JSON without repair
+ */
+function parseOnly(input) {
+    // Try to find first complete JSON object
+    const firstJson = findFirstCompleteJson(input);
+    if (!firstJson) {
+        throw new Error('No JSON found in the string.');
+    }
+    try {
+        return JSON.parse(firstJson);
+    }
+    catch (error) {
+        throw new Error('Failed to parse JSON: ' + error.message);
+    }
+}
+/**
+ * Repair mode: multiple strategies with repair
+ */
+function parseWithRepair(input) {
+    const start = input.indexOf('{');
+    if (start === -1) {
+        throw new Error('No JSON found in the string.');
+    }
+    // Truncated output may have no closing brace; in that case keep everything
+    // from the first '{' so jsonrepair can still attempt to complete it
+    const extracted = extractOnlyJson(input);
+    const cleaned = extracted.startsWith('{') ? extracted : input.slice(start);
+    // Strategy 1: Try all possible JSON candidates
+    const possibleJson = findAllPossibleJson(cleaned);
+    for (const jsonCandidate of possibleJson) {
+        // Try native JSON.parse first (fast path)
+        try {
+            return JSON.parse(jsonCandidate);
+        }
+        catch (parseError) {
+            // Try jsonrepair as fallback
+            try {
+                const repaired = (0, jsonrepair_1.jsonrepair)(jsonCandidate);
+                return JSON.parse(repaired);
+            }
+            catch (repairError) {
+                continue;
+            }
+        }
+    }
+    // Strategy 2: Fallback to first complete JSON
+    const firstJson = findFirstCompleteJson(cleaned);
+    if (firstJson) {
+        try {
+            return JSON.parse(firstJson);
+        }
+        catch (parseError) {
+            const repaired = (0, jsonrepair_1.jsonrepair)(firstJson);
+            return JSON.parse(repaired);
+        }
+    }
+    // Strategy 3: Let jsonrepair attempt the whole extracted text
+    // (covers input with no balanced candidate, such as truncated output)
+    try {
+        return JSON.parse((0, jsonrepair_1.jsonrepair)(cleaned));
+    }
+    catch (repairError) {
+        throw new Error('No valid JSON found in the string.');
+    }
+}
+/**
+ * Wraps parsed object with root key if schema expects it but it's missing
+ * Only applies when schema has a single root object key
+ */
+function wrapRootIfMissing(parsed, schema) {
+    if (!(schema instanceof zod_1.z.ZodObject))
+        return parsed;
+    const shape = schema.shape;
+    const rootKeys = Object.keys(shape);
+    if (rootKeys.length !== 1)
+        return parsed;
+    const rootKey = rootKeys[0];
+    const rootSchema = shape[rootKey];
+    if (!(rootSchema instanceof zod_1.z.ZodObject))
+        return parsed;
+    // Already has the root key
+    if (parsed && typeof parsed === 'object' && rootKey in parsed)
+        return parsed;
+    // Check if parsed has all children of the expected root
+    const childShape = rootSchema.shape;
+    const childKeys = Object.keys(childShape);
+    const hasAllChildren = parsed && typeof parsed === 'object' && childKeys.every((k) => k in parsed);
+    if (hasAllChildren) {
+        return { [rootKey]: parsed };
+    }
+    return parsed;
+}
+/**
+ * Extracts the substring between the first and last braces
+ */
+function extractOnlyJson(str) {
+    const start = str.indexOf('{');
+    const end = str.lastIndexOf('}');
+    // A usable span needs a closing brace after the opening one
+    if (start !== -1 && end > start) {
+        return str.slice(start, end + 1);
+    }
+    return 'Invalid input: no braces found.';
+}
+/**
+ * Finds every complete JSON object in the input
+ */
+function findAllPossibleJson(input) {
+    const candidates = [];
+    for (let i = 0; i < input.length; i++) {
+        if (input[i] === '{') {
+            const jsonCandidate = findCompleteJsonStartingAt(input, i);
+            if (jsonCandidate) {
+                candidates.push(jsonCandidate);
+            }
+        }
+    }
+    return candidates;
+}
+/**
+ * Returns a balanced JSON object starting at a given index
+ */
+function findCompleteJsonStartingAt(input, startIndex) {
+    let braceCount = 0;
+    let inString = false;
+    let escapeNext = false;
+    for (let i = startIndex; i < input.length; i++) {
+        const char = input[i];
+        if (escapeNext) {
+            escapeNext = false;
+            continue;
+        }
+        if (char === '\\') {
+            escapeNext = true;
+            continue;
+        }
+        if (char === '"' && !escapeNext) {
+            inString = !inString;
+            continue;
+        }
+        if (!inString) {
+            if (char === '{') {
+                braceCount++;
+            }
+            else if (char === '}') {
+                braceCount--;
+                if (braceCount === 0) {
+                    return input.substring(startIndex, i + 1);
+                }
+            }
+        }
+    }
+    return null;
+}
+/**
+ * Finds the first complete JSON object in the input
+ */
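+function findFirstCompleteJson(input) {
+    let braceCount = 0;
+    let startIndex = -1;
+    let inString = false;
+    let escapeNext = false;
+    for (let i = 0; i < input.length; i++) {
+        const char = input[i];
+        // Inside a candidate object, skip string contents so braces in values are not counted
+        if (startIndex !== -1) {
+            if (escapeNext) {
+                escapeNext = false;
+                continue;
+            }
+            if (inString && char === '\\') {
+                escapeNext = true;
+                continue;
+            }
+            if (char === '"') {
+                inString = !inString;
+                continue;
+            }
+            if (inString)
+                continue;
+        }
+        if (char === '{') {
+            if (startIndex === -1)
+                startIndex = i;
+            braceCount++;
+        }
+        else if (char === '}' && startIndex !== -1) {
+            braceCount--;
+            if (braceCount === 0) {
+                return input.substring(startIndex, i + 1);
+            }
+        }
+    }
+    return null;
+}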
+/**
+ * Checks whether the string may contain JSON braces
+ */
+function hasPossibleJson(str) {
+    const start = str.indexOf('{');
+    const end = str.lastIndexOf('}');
+    // True only when a closing brace appears after the first opening brace
+    return start > -1 && end > start;
+}
+/**
+ * Validates whether the string is valid JSON
+ */
+function isJsonString(str) {
+    try {
+        JSON.parse(str);
+        return true;
+    }
+    catch (e) {
+        return false;
+    }
+}
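
The root-wrapping rule implemented by `wrapRootIfMissing` above can be illustrated without Zod. The sketch below is a simplified restatement under stated assumptions, not the package's code: `wrapRoot` is a hypothetical function that takes the expected root key and the expected child keys as plain arguments, where the original derives them from the schema's `shape`.

```typescript
// Hypothetical restatement of the wrapping rule; keys are passed explicitly.
function wrapRoot(parsed: unknown, rootKey: string, childKeys: string[]): unknown {
  // Only plain object-like values are candidates for wrapping.
  if (typeof parsed !== 'object' || parsed === null) return parsed;
  const obj: object = parsed;

  // Already has the root key: leave it untouched.
  if (rootKey in obj) return obj;

  // Wrap only when every expected child key is present at the top level.
  const hasAllChildren = childKeys.every((key) => key in obj);
  return hasAllChildren ? { [rootKey]: obj } : obj;
}

const wrappedUser = wrapRoot({ name: 'John', age: 30 }, 'user', ['name', 'age']);
console.log(JSON.stringify(wrappedUser)); // {"user":{"name":"John","age":30}}
```

Because the rule requires all child keys, an object that is missing even one expected field is returned unchanged instead of being wrapped.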

package/package.json ADDED

@@ -0,0 +1,64 @@
+{
+  "name": "json-llm-repair",
+  "version": "0.1.0",
+  "description": "Parse and repair JSON from LLM outputs with multiple strategies",
+  "main": "dist/index.js",
+  "types": "dist/index.d.ts",
+  "files": [
+    "dist"
+  ],
+  "scripts": {
+    "build": "tsc",
+    "test": "vitest run",
+    "test:watch": "vitest",
+    "prepublishOnly": "npm run build",
+    "release:patch": "npm version patch -m \"chore(release): %s\" && git push --follow-tags && npm publish",
+    "release:minor": "npm version minor -m \"chore(release): %s\" && git push --follow-tags && npm publish",
+    "release:major": "npm version major -m \"chore(release): %s\" && git push --follow-tags && npm publish",
+    "lint": "eslint src --ext .ts",
+    "lint:fix": "eslint src --ext .ts --fix",
+    "format": "prettier --write \"src/**/*.ts\"",
+    "format:check": "prettier --check \"src/**/*.ts\""
+  },
+  "keywords": [
+    "llm",
+    "json",
+    "parser",
+    "ai",
+    "openai",
+    "anthropic",
+    "repair",
+    "extract",
+    "typescript"
+  ],
+  "author": "Tiago Gouvêa",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/tiagogouvea/json-llm-repair.git"
+  },
+  "homepage": "https://github.com/tiagogouvea/json-llm-repair#readme",
+  "bugs": {
+    "url": "https://github.com/tiagogouvea/json-llm-repair/issues"
+  },
+  "dependencies": {
+    "jsonrepair": "^3.8.0"
+  },
+  "peerDependencies": {
+    "zod": "^3.0.0"
+  },
+  "peerDependenciesMeta": {
+    "zod": {
+      "optional": true
+    }
+  },
+  "devDependencies": {
+    "zod": "^3.22.0",
+    "@typescript-eslint/eslint-plugin": "^6.0.0",
+    "@typescript-eslint/parser": "^6.0.0",
+    "eslint": "^8.0.0",
+    "prettier": "^3.0.0",
+    "typescript": "^5.0.0",
+    "vitest": "^1.0.0"
+  }
+}
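
A note on the dependency metadata above: `zod` is declared as an optional peer dependency, while `dist/index.js` requires `zod` at module load and uses it for its `instanceof z.ZodObject` checks. One way the two could be reconciled (an assumption, not something this release does) is to detect object schemas structurally through the `shape` property that the compiled code already reads. `ShapeLike`, `hasShape`, and `singleRootKey` below are hypothetical names, and the plain objects in the usage lines only imitate a schema for illustration.

```typescript
// Hypothetical structural check; does not import zod.
type ShapeLike = { shape: Record<string, unknown> };

function hasShape(schema: unknown): schema is ShapeLike {
  if (typeof schema !== 'object' || schema === null) return false;
  const shape = (schema as { shape?: unknown }).shape;
  return typeof shape === 'object' && shape !== null;
}

// Returns the single root key when the schema has exactly one key whose
// value is itself object-shaped; otherwise returns null.
function singleRootKey(schema: unknown): string | null {
  if (!hasShape(schema)) return null;
  const keys = Object.keys(schema.shape);
  const key = keys.length === 1 ? keys[0] : undefined;
  if (key === undefined) return null;
  return hasShape(schema.shape[key]) ? key : null;
}

// Plain objects standing in for a schema with one nested object key.
const schemaLike = { shape: { user: { shape: { name: {}, age: {} } } } };
console.log(singleRootKey(schemaLike)); // user
```

A structural check of this kind accepts any object exposing `shape`, so it is looser than `instanceof`; whether that trade-off is acceptable depends on how strictly the schema argument needs to be validated.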