npm - ai-evaluate - Versions diffs - 2.0.2 → 2.1.3 - Mend

ai-evaluate 2.0.2 → 2.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/.turbo/turbo-build.log +4 -5
package/CHANGELOG.md +31 -0
package/LICENSE +21 -0
package/README.md +142 -206
package/package.json +13 -14
package/src/evaluate.js +187 -0
package/src/index.js +10 -0
package/src/types.js +4 -0
package/src/worker-template.js +3627 -0
package/test/evaluate-extended.test.js +429 -0
package/test/evaluate.test.js +235 -0
package/test/index.test.js +77 -0
package/test/worker-template.test.js +365 -0
package/vitest.config.js +15 -0
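
The per-file counts above can be totalled to see where the churn sits. The following is a minimal sketch, not part of the package: `FileStat` and `parseStat` are hypothetical names introduced here, and the input lines are copied from the list above.

```typescript
// Hypothetical helper types and names; only the summary lines come from the diff.
interface FileStat { path: string; added: number; removed: number }

// Parse one "path +added -removed" summary line; returns null if it does not match.
function parseStat(line: string): FileStat | null {
  const m = /^(\S+)\s+\+(\d+)\s+-(\d+)$/.exec(line.trim())
  return m ? { path: m[1], added: Number(m[2]), removed: Number(m[3]) } : null
}

const summaryLines: string[] = [
  'package/.turbo/turbo-build.log +4 -5',
  'package/CHANGELOG.md +31 -0',
  'package/LICENSE +21 -0',
  'package/README.md +142 -206',
  'package/package.json +13 -14',
  'package/src/evaluate.js +187 -0',
  'package/src/index.js +10 -0',
  'package/src/types.js +4 -0',
  'package/src/worker-template.js +3627 -0',
  'package/test/evaluate-extended.test.js +429 -0',
  'package/test/evaluate.test.js +235 -0',
  'package/test/index.test.js +77 -0',
  'package/test/worker-template.test.js +365 -0',
  'package/vitest.config.js +15 -0',
]

const stats: FileStat[] = summaryLines
  .map(parseStat)
  .filter((s): s is FileStat => s !== null)

// Sum additions and deletions across all files.
const totals = stats.reduce(
  (acc, s) => ({ added: acc.added + s.added, removed: acc.removed + s.removed }),
  { added: 0, removed: 0 },
)

console.log(`${stats.length} files, +${totals.added} -${totals.removed}`)
// 14 files, +5160 -225
```

By these counts, `package/src/worker-template.js` alone accounts for roughly 70% of the added lines.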

package/.turbo/turbo-build.log CHANGED

@@ -1,5 +1,4 @@
-> ai-evaluate@2.0.1 build /Users/nathanclevenger/projects/primitives.org.ai/packages/ai-evaluate
-> tsc -p tsconfig.json
+> ai-evaluate@2.1.3 build /Users/nathanclevenger/projects/primitives.org.ai/packages/ai-evaluate
+> tsc -p tsconfig.json

package/CHANGELOG.md CHANGED

@@ -1,5 +1,36 @@
 # ai-evaluate
+## 2.1.3
+### Patch Changes
+- Documentation and testing improvements
+  - Add deterministic AI testing suite with self-validating patterns
+  - Apply StoryBrand narrative to all package READMEs
+  - Update TESTING.md with four principles of deterministic AI testing
+  - Fix duplicate examples package name conflict
+- Updated dependencies
+  - ai-functions@2.1.3
+  - ai-tests@2.1.3
+## 2.1.1
+### Patch Changes
+- Updated dependencies [6beb531]
+  - ai-functions@2.1.1
+  - ai-tests@2.1.1
+## 2.0.3
+### Patch Changes
+- Updated dependencies
+  - rpc.do@0.2.0
+  - ai-functions@2.0.3
+  - ai-tests@2.0.3
 ## 2.0.2
 ### Patch Changes

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 .org.ai
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md CHANGED

@@ -1,92 +1,100 @@
 # ai-evaluate
-Secure code execution in sandboxed environments. Run untrusted code safely using Cloudflare Workers or Miniflare.
+**You need to run user code. But untrusted code is terrifying.**
-## Installation
+One malicious snippet could crash your server, access your file system, or make unauthorized network requests. You've seen the horror stories. You know the risks.
+What if you could run any code with confidence?
+## The Solution
+`ai-evaluate` runs untrusted code in V8 isolates with zero access to your system. No file system. No network (by default). No risk.
+```typescript
+// Before: Dangerous eval
+const result = eval(userCode) // Could do ANYTHING
+// After: Sandboxed execution
+import { evaluate } from 'ai-evaluate'
+const result = await evaluate({
+  script: userCode
+})
+// Runs in isolated V8 context - your system is protected
+```
+## Quick Start
+**1. Install**
 ```bash
 pnpm add ai-evaluate
 ```
-## Quick Start
+**2. Evaluate code safely**
 ```typescript
 import { evaluate } from 'ai-evaluate'
-// Run a simple script
 const result = await evaluate({
   script: '1 + 1'
 })
 // { success: true, value: 2, logs: [], duration: 5 }
+```
+**3. Run tests on user code**
-// With a module and tests
+```typescript
 const result = await evaluate({
   module: `
     export const add = (a, b) => a + b
-    export const multiply = (a, b) => a * b
   `,
   tests: `
-    describe('math', () => {
+    describe('add', () => {
       it('adds numbers', () => {
-        expect(add(2, 3)).toBe(5);
-      })
-      it('multiplies numbers', () => {
-        expect(multiply(2, 3)).toBe(6);
+        expect(add(2, 3)).toBe(5)
       })
     })
-  `,
-  script: 'add(10, 20)'
+  `
 })
+// result.testResults.passed === 1
 ```
-## Features
+## What You Get
-- **Secure isolation** - Code runs in a sandboxed V8 isolate
-- **Vitest-compatible tests** - `describe`, `it`, `expect` in global scope
-- **Module exports** - Define modules and use exports in scripts/tests
-- **Cloudflare Workers** - Uses worker_loaders in production
-- **Miniflare** - Uses Miniflare for local development and Node.js
-- **Network isolation** - External network access blocked by default
+- **Complete isolation** - Code runs in sandboxed V8 isolates
+- **Built-in testing** - Vitest-compatible `describe`, `it`, `expect`
+- **Module support** - Define exports and use them in scripts/tests
+- **Production-ready** - Cloudflare Workers in production, Miniflare locally
+- **Network blocked** - External access disabled by default
-## API
+## API Reference
 ### evaluate(options)
-Execute code in a sandboxed environment.
 ```typescript
 interface EvaluateOptions {
-  /** Module code with exports */
-  module?: string
-  /** Test code using vitest-style API */
-  tests?: string
-  /** Script code to run (module exports in scope) */
-  script?: string
-  /** Timeout in milliseconds (default: 5000) */
-  timeout?: number
-  /** Environment variables */
-  env?: Record<string, string>
+  module?: string              // Module code with exports
+  tests?: string               // Vitest-style test code
+  script?: string              // Script to execute
+  timeout?: number             // Default: 5000ms
+  env?: Record<string, string> // Environment variables
+  sdk?: SDKConfig | boolean    // Enable $, db, ai globals
 }
 interface EvaluateResult {
-  /** Whether execution succeeded */
-  success: boolean
-  /** Return value from script */
-  value?: unknown
-  /** Console output */
-  logs: LogEntry[]
-  /** Test results (if tests provided) */
-  testResults?: TestResults
-  /** Error message if failed */
-  error?: string
-  /** Execution time in ms */
-  duration: number
+  success: boolean             // Execution succeeded
+  value?: unknown              // Script return value
+  logs: LogEntry[]             // Console output
+  testResults?: TestResults    // Test results if tests provided
+  error?: string               // Error message if failed
+  duration: number             // Execution time in ms
 }
 ```
 ### createEvaluator(env)
-Create an evaluate function bound to a specific environment. Useful for Cloudflare Workers.
+Bind to a Cloudflare Workers environment.
 ```typescript
 import { createEvaluator } from 'ai-evaluate'
@@ -94,24 +102,22 @@ import { createEvaluator } from 'ai-evaluate'
 export default {
   async fetch(request, env) {
     const sandbox = createEvaluator(env)
-    const result = await sandbox({
-      script: '1 + 1'
-    })
+    const result = await sandbox({ script: '1 + 1' })
     return Response.json(result)
   }
 }
 ```
-## Usage Patterns
+## Usage Examples
-### Simple Script Execution
+### Simple Script
 ```typescript
 const result = await evaluate({
   script: `
-    const x = 10;
-    const y = 20;
-    return x + y;
+    const x = 10
+    const y = 20
+    return x + y
   `
 })
 // result.value === 30
@@ -122,189 +128,137 @@ const result = await evaluate({
 ```typescript
 const result = await evaluate({
   module: `
-    exports.greet = (name) => \`Hello, \${name}!\`;
-    exports.sum = (...nums) => nums.reduce((a, b) => a + b, 0);
+    exports.greet = (name) => \`Hello, \${name}!\`
+    exports.sum = (...nums) => nums.reduce((a, b) => a + b, 0)
   `,
   script: `
-    console.log(greet('World'));
-    return sum(1, 2, 3, 4, 5);
+    console.log(greet('World'))
+    return sum(1, 2, 3, 4, 5)
   `
 })
 // result.value === 15
 // result.logs[0].message === 'Hello, World!'
 ```
-### Running Tests
+### Testing User Code
 ```typescript
 const result = await evaluate({
   module: `
     exports.isPrime = (n) => {
-      if (n < 2) return false;
+      if (n < 2) return false
       for (let i = 2; i <= Math.sqrt(n); i++) {
-        if (n % i === 0) return false;
+        if (n % i === 0) return false
       }
-      return true;
-    };
+      return true
+    }
   `,
   tests: `
     describe('isPrime', () => {
       it('returns false for numbers less than 2', () => {
-        expect(isPrime(0)).toBe(false);
-        expect(isPrime(1)).toBe(false);
-      });
+        expect(isPrime(0)).toBe(false)
+        expect(isPrime(1)).toBe(false)
+      })
       it('returns true for prime numbers', () => {
-        expect(isPrime(2)).toBe(true);
-        expect(isPrime(3)).toBe(true);
-        expect(isPrime(17)).toBe(true);
-      });
+        expect(isPrime(2)).toBe(true)
+        expect(isPrime(17)).toBe(true)
+      })
       it('returns false for composite numbers', () => {
-        expect(isPrime(4)).toBe(false);
-        expect(isPrime(9)).toBe(false);
-        expect(isPrime(100)).toBe(false);
-      });
-    });
+        expect(isPrime(4)).toBe(false)
+        expect(isPrime(100)).toBe(false)
+      })
+    })
   `
 })
-console.log(result.testResults)
-// {
-//   total: 3,
-//   passed: 3,
-//   failed: 0,
-//   skipped: 0,
-//   tests: [...]
-// }
+// result.testResults = { total: 3, passed: 3, failed: 0, ... }
 ```
 ## Test Framework
-The sandbox provides a vitest-compatible test API with async support.
+Full vitest-compatible API with async support.
-### describe / it / test
+### Test Structure
 ```typescript
-describe('group name', () => {
-  it('test name', () => {
-    // test code
-  });
-  test('another test', () => {
-    // test code
-  });
-  it.skip('skipped test', () => {
-    // won't run
-  });
-  it.only('only this test', () => {
-    // when .only is used, only these tests run
-  });
-});
+describe('group', () => {
+  it('test name', () => { /* ... */ })
+  test('another test', () => { /* ... */ })
+  it.skip('skipped', () => { /* ... */ })
+  it.only('focused', () => { /* ... */ })
+})
 ```
 ### Async Tests
 ```typescript
-describe('async operations', () => {
-  it('supports async/await', async () => {
-    const result = await someAsyncFunction();
-    expect(result).toBe('expected');
-  });
-  it('supports promises', () => {
-    return fetchData().then(data => {
-      expect(data).toBeDefined();
-    });
-  });
-});
+it('async/await', async () => {
+  const result = await someAsyncFunction()
+  expect(result).toBe('expected')
+})
 ```
 ### Hooks
 ```typescript
 describe('with setup', () => {
-  let data;
-  beforeEach(() => {
-    data = { count: 0 };
-  });
+  let data
-  afterEach(() => {
-    data = null;
-  });
+  beforeEach(() => { data = { count: 0 } })
+  afterEach(() => { data = null })
-  it('uses setup data', () => {
-    data.count++;
-    expect(data.count).toBe(1);
-  });
-});
+  it('uses setup', () => {
+    data.count++
+    expect(data.count).toBe(1)
+  })
+})
 ```
-### expect matchers
+### Matchers
 ```typescript
 // Equality
-expect(value).toBe(expected)           // Strict equality (===)
-expect(value).toEqual(expected)        // Deep equality
-expect(value).toStrictEqual(expected)  // Strict deep equality
+expect(value).toBe(expected)
+expect(value).toEqual(expected)
+expect(value).toStrictEqual(expected)
 // Truthiness
-expect(value).toBeTruthy()             // Truthy check
-expect(value).toBeFalsy()              // Falsy check
-expect(value).toBeNull()               // null check
-expect(value).toBeUndefined()          // undefined check
-expect(value).toBeDefined()            // not undefined
-expect(value).toBeNaN()                // NaN check
+expect(value).toBeTruthy()
+expect(value).toBeFalsy()
+expect(value).toBeNull()
+expect(value).toBeUndefined()
+expect(value).toBeDefined()
 // Numbers
-expect(value).toBeGreaterThan(n)       // > comparison
-expect(value).toBeLessThan(n)          // < comparison
-expect(value).toBeGreaterThanOrEqual(n)// >= comparison
-expect(value).toBeLessThanOrEqual(n)   // <= comparison
-expect(value).toBeCloseTo(n, digits)   // Floating point comparison
+expect(value).toBeGreaterThan(n)
+expect(value).toBeLessThan(n)
+expect(value).toBeCloseTo(n, digits)
-// Strings
-expect(value).toMatch(/pattern/)       // Regex match
-expect(value).toMatch('substring')     // Contains substring
-// Arrays & Strings
-expect(value).toContain(item)          // Array/string contains
-expect(value).toContainEqual(item)     // Array contains (deep equality)
-expect(value).toHaveLength(n)          // Length check
+// Strings & Arrays
+expect(value).toMatch(/pattern/)
+expect(value).toContain(item)
+expect(value).toHaveLength(n)
 // Objects
-expect(value).toHaveProperty('path')   // Has property
-expect(value).toHaveProperty('path', v)// Has property with value
-expect(value).toMatchObject(partial)   // Partial object match
-// Types
-expect(value).toBeInstanceOf(Class)    // instanceof check
-expect(value).toBeTypeOf('string')     // typeof check
+expect(value).toHaveProperty('path')
+expect(value).toMatchObject(partial)
 // Errors
-expect(fn).toThrow()                   // Throws any error
-expect(fn).toThrow('message')          // Throws with message
-expect(fn).toThrow(/pattern/)          // Throws matching pattern
-expect(fn).toThrow(ErrorClass)         // Throws specific error type
+expect(fn).toThrow()
+expect(fn).toThrow('message')
-// Negated matchers
+// Negation
 expect(value).not.toBe(expected)
-expect(value).not.toEqual(expected)
-expect(value).not.toContain(item)
-expect(fn).not.toThrow()
-// Promise matchers
+// Promises
 await expect(promise).resolves.toBe(value)
 await expect(promise).rejects.toThrow('error')
 ```
 ## Cloudflare Workers Setup
-To use in Cloudflare Workers with worker_loaders:
 ### wrangler.toml
 ```toml
@@ -315,7 +269,7 @@ main = "src/index.ts"
 binding = "LOADER"
 ```
-### Worker Code
+### Worker
 ```typescript
 import { createEvaluator } from 'ai-evaluate'
@@ -327,7 +281,6 @@ export interface Env {
 export default {
   async fetch(request: Request, env: Env): Promise<Response> {
     const sandbox = createEvaluator(env)
     const { code, tests } = await request.json()
     const result = await sandbox({
@@ -340,58 +293,33 @@ export default {
 }
 ```
-## Node.js / Development
+## Local Development
-In Node.js or during development, the evaluate function automatically uses Miniflare:
+In Node.js, Miniflare is used automatically:
 ```typescript
 import { evaluate } from 'ai-evaluate'
-// Miniflare is used automatically when LOADER binding is not present
 const result = await evaluate({
   script: 'return "Hello from Node!"'
 })
 ```
-Make sure `miniflare` is installed:
+Ensure Miniflare is installed:
 ```bash
 pnpm add miniflare
 ```
-## Security
-The sandbox provides several security features:
-1. **V8 Isolate** - Code runs in an isolated V8 context
-2. **No Network** - External network access is blocked (`globalOutbound: null`)
-3. **No File System** - No access to the file system
-4. **Memory Limits** - Standard Worker memory limits apply
-5. **CPU Limits** - Execution time is limited
+## Security Model
-## Example: Code Evaluation API
-```typescript
-import { evaluate } from 'ai-evaluate'
-import { Hono } from 'hono'
-const app = new Hono()
-app.post('/evaluate', async (c) => {
-  const { module, tests, script } = await c.req.json()
-  const result = await evaluate({
-    module,
-    tests,
-    script,
-    timeout: 5000
-  })
-  return c.json(result)
-})
-export default app
-```
+| Protection | Description |
+|------------|-------------|
+| V8 Isolate | Code runs in isolated V8 context |
+| No Network | External access blocked by default |
+| No File System | Zero filesystem access |
+| Memory Limits | Standard Worker limits apply |
+| CPU Limits | Execution time bounded |
 ## Types
@@ -418,3 +346,11 @@ interface TestResult {
   duration: number
 }
 ```
+---
+**Stop worrying about untrusted code. Start building.**
+```bash
+pnpm add ai-evaluate
+```
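
The new README describes `EvaluateResult` by its fields but does not show a caller branching on them. A minimal sketch of that follows, under these assumptions: the interfaces below are local copies of the shapes visible in this diff (the `tests` array is omitted, and `LogEntry` is reduced to the one field the README uses), `summarize` is a hypothetical helper, and the sample objects are illustrative values rather than real sandbox output.

```typescript
// Local copies of the shapes shown in the README diff, for illustration only.
// The package ships its own types; these are not imported from it.
interface LogEntry { message: string }
interface TestResults { total: number; passed: number; failed: number; skipped: number }
interface EvaluateResult {
  success: boolean
  value?: unknown
  logs: LogEntry[]
  testResults?: TestResults
  error?: string
  duration: number
}

// Hypothetical helper: reduce a result to a one-line status string.
function summarize(result: EvaluateResult): string {
  if (!result.success) {
    return `failed after ${result.duration}ms: ${result.error ?? 'unknown error'}`
  }
  const t = result.testResults
  if (t) {
    return `${t.passed}/${t.total} tests passed (${t.failed} failed, ${t.skipped} skipped)`
  }
  return `ok in ${result.duration}ms, value = ${JSON.stringify(result.value)}`
}

// Sample objects modelled on the values quoted in the README comments.
const scriptResult: EvaluateResult = { success: true, value: 2, logs: [], duration: 5 }
const testRun: EvaluateResult = {
  success: true,
  logs: [],
  duration: 12, // illustrative value
  testResults: { total: 3, passed: 3, failed: 0, skipped: 0 },
}
const failure: EvaluateResult = { success: false, logs: [], error: 'boom', duration: 3 }

console.log(summarize(scriptResult)) // ok in 5ms, value = 2
console.log(summarize(testRun))      // 3/3 tests passed (0 failed, 0 skipped)
console.log(summarize(failure))      // failed after 3ms: boom
```

Checking `success` first keeps the error path separate from the two success paths, so the helper makes no assumption about how the package reports failing tests.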

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ai-evaluate",
-  "version": "2.0.2",
+  "version": "2.1.3",
   "description": "Secure code execution in sandboxed environments",
   "type": "module",
   "main": "dist/index.js",
@@ -11,19 +11,10 @@
       "types": "./dist/index.d.ts"
     }
   },
-  "scripts": {
-    "build": "tsc -p tsconfig.json",
-    "dev": "tsc -p tsconfig.json --watch",
-    "test": "vitest",
-    "typecheck": "tsc --noEmit",
-    "lint": "eslint .",
-    "clean": "rm -rf dist"
-  },
   "dependencies": {
-    "ai-functions": "2.0.2",
-    "ai-tests": "2.0.2",
     "capnweb": "^0.2.0",
-    "rpc.do": "^0.1.0"
+    "ai-functions": "2.1.3",
+    "ai-tests": "2.1.3"
   },
   "devDependencies": {
     "@vitest/coverage-v8": "^2.1.0",
@@ -42,5 +33,13 @@
     "miniflare",
     "primitives"
   ],
-  "license": "MIT"
-}
+  "license": "MIT",
+  "scripts": {
+    "build": "tsc -p tsconfig.json",
+    "dev": "tsc -p tsconfig.json --watch",
+    "test": "vitest",
+    "typecheck": "tsc --noEmit",
+    "lint": "eslint .",
+    "clean": "rm -rf dist"
+  }
+}
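
The `dependencies` hunk above reorders keys, which makes the real change harder to read. A minimal sketch of a key-by-key comparison follows; `diffDeps` and `DepChange` are hypothetical names introduced here, and the two maps are copied from the removed and added lines of the hunk.

```typescript
type Deps = Record<string, string>

// Hypothetical shape: `from`/`to` are undefined when the dependency is absent on that side.
interface DepChange { name: string; from: string | undefined; to: string | undefined }

// Compare two dependency maps by name, ignoring key order.
function diffDeps(before: Deps, after: Deps): DepChange[] {
  const names = Array.from(new Set([...Object.keys(before), ...Object.keys(after)])).sort()
  const changes: DepChange[] = []
  for (const name of names) {
    const from = before[name]
    const to = after[name]
    if (from !== to) changes.push({ name, from, to })
  }
  return changes
}

// Dependency maps as they appear in the 2.0.2 and 2.1.3 package.json.
const deps202: Deps = {
  'ai-functions': '2.0.2',
  'ai-tests': '2.0.2',
  capnweb: '^0.2.0',
  'rpc.do': '^0.1.0',
}
const deps213: Deps = {
  capnweb: '^0.2.0',
  'ai-functions': '2.1.3',
  'ai-tests': '2.1.3',
}

const changes = diffDeps(deps202, deps213)
for (const c of changes) {
  console.log(`${c.name}: ${c.from ?? '(absent)'} -> ${c.to ?? '(removed)'}`)
}
```

Run against the two maps above, this reports the `ai-functions` and `ai-tests` bumps and the removal of `rpc.do`, while `capnweb` is unchanged and omitted.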