npm - @evalview/node - Versions diffs - 0.2.8 → 0.3.0 - Mend

@evalview/node 0.2.8 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/README.md +83 -42
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -1,15 +1,30 @@
-# @evalview/node
+# @evalview/node — Proof that your agent still works.
-Drop-in Node.js/Next.js middleware for [EvalView](https://github.com/hidai25/eval-view) testing framework.
+> You changed a prompt. Swapped a model. Updated a tool.
+> Did anything break? **Run EvalView. Know for sure.**
-## Installation
+Drop-in Node.js/Next.js middleware for [EvalView](https://github.com/hidai25/eval-view) — the regression testing framework for AI agents.
+---
+## 🔍 What EvalView Catches
+| Status | What it means | What you do |
+|--------|--------------|-------------|
+| ✅ **PASSED** | Agent behavior matches baseline | Ship with confidence |
+| ⚠️ **TOOLS_CHANGED** | Agent is calling different tools | Review the diff |
+| ⚠️ **OUTPUT_CHANGED** | Same tools, output quality shifted | Review the diff |
+| ❌ **REGRESSION** | Score dropped significantly | Fix before shipping |
+---
+## 🚀 Quick Start
 ```bash
 npm install @evalview/node
+pip install evalview
 ```
-## Quick Start
 ### Next.js App Router
 ```typescript
@@ -17,7 +32,7 @@ npm install @evalview/node
 import { createEvalViewMiddleware } from '@evalview/node';
 export const POST = createEvalViewMiddleware({
-  targetEndpoint: '/api/unifiedchat',  // Your agent's endpoint
+  targetEndpoint: '/api/your-agent',
 });
 ```
@@ -31,33 +46,72 @@ app.post('/api/evalview', createEvalViewMiddleware({
 }));
 ```
-## Configuration
+Then point EvalView at your endpoint:
+```yaml
+# .evalview/config.yaml
+adapter: http
+endpoint: http://localhost:3000/api/evalview
+```
+Capture baseline and check for regressions:
+```bash
+evalview snapshot   # save current behavior as baseline
+evalview check      # detect regressions on every change
+```
+---
+## 🤖 Claude Code Integration (MCP)
+Test your agent without leaving the conversation:
+```bash
+claude mcp add --transport stdio evalview -- evalview mcp serve
+cp CLAUDE.md.example CLAUDE.md
+```
+Ask Claude Code naturally:
+```
+You: Did my refactor break anything?
+Claude: [run_check] ✨ All clean! No regressions detected.
+You: Add a test for my weather agent
+Claude: [create_test] ✅ Created tests/weather-lookup.yaml
+        [run_snapshot] 📸 Baseline captured.
+```
+No YAML. No terminal switching. No context loss.
+[Full MCP docs →](https://github.com/hidai25/eval-view#-claude-code-integration-mcp)
+---
+## ⚙️ Configuration
 ```typescript
 createEvalViewMiddleware({
-  // Required: Endpoint to forward requests to
-  targetEndpoint: '/api/unifiedchat',
+  // Required: your agent's endpoint
+  targetEndpoint: '/api/your-agent',
-  // Optional: Default user ID for test requests (defaults to 'evalview-test-user')
-  // Use an existing user ID from your database
+  // Optional: default user ID for test requests
   defaultUserId: 'your-dev-user-id',
-  // Optional: Dynamic user ID resolution
-  // Useful for creating users on-the-fly or looking up existing ones
+  // Optional: dynamic user ID resolution
   getUserId: async (req) => {
-    // Example: Look up or create user
     const user = await findOrCreateUser('test@example.com');
     return user.id;
   },
-  // Optional: Transform EvalView request to your API format
+  // Optional: transform EvalView request to your API format
   transformRequest: (req) => ({
     message: req.query,
-    userId: req.context?.userId, // Automatically set by middleware
-    // ... your custom mapping
+    userId: req.context?.userId,
   }),
-  // Optional: Parse your API response to EvalView format
+  // Optional: parse your API response to EvalView format
   parseResponse: (responseText, startTime) => ({
     session_id: `session-${startTime}`,
     output: '...',
@@ -65,41 +119,28 @@ createEvalViewMiddleware({
     cost: 0.05,
     latency: Date.now() - startTime,
   }),
-  // Optional: Base URL for requests
-  baseUrl: process.env.API_BASE_URL,
 });
 ```
-## Default Behavior
-Works out-of-the-box with Tapescope-style APIs that:
-- Accept: `{ message, userId, conversationId, route, history }`
-- Return: NDJSON stream with `tool_call` and `message_complete` events
-## Testing
+---
-Point EvalView CLI to your endpoint:
+## 🔧 Automate It
 ```yaml
-# .evalview/config.yaml
-adapter: http
-endpoint: http://localhost:3000/api/evalview
+# .github/workflows/evalview.yml
+- run: evalview check --fail-on REGRESSION --json
+  env:
+    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
 ```
-Then run tests:
-```bash
-evalview run
-```
+---
-## License
+## 📚 Documentation
-MIT
+[Full docs →](https://github.com/hidai25/eval-view) • [Examples →](https://github.com/hidai25/eval-view/tree/main/examples) • [Issues →](https://github.com/hidai25/eval-view/issues)
 ---
-EvalView just stopped your agent from:
-- hallucinating tools that don't exist
-- tool-calling itself into bankruptcy
+## License
-→ Smash ⭐ if it saved your sanity (and your wallet) today
+Apache-2.0

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@evalview/node",
-  "version": "0.2.8",
+  "version": "0.3.0",
   "description": "Drop-in Node.js middleware for EvalView testing framework",
   "main": "dist/index.js",
   "module": "dist/index.js",