npm - clew-code - Versions diffs - 0.2.21 → 0.2.22 - Mend

clew-code 0.2.21 → 0.2.22

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (66)

package/dist/main.js +1861 -1856
package/docs/architecture.html +91 -148
package/docs/assets/clew-agent-loop.png +0 -0
package/docs/assets/clew-general-architecture.png +0 -0
package/docs/assets/clew-mcp-architecture.png +0 -0
package/docs/assets/clew-p2p-swarm.png +0 -0
package/docs/changelog.html +150 -0
package/docs/cli-reference.html +90 -0
package/docs/commands.html +156 -265
package/docs/configuration.html +85 -147
package/docs/contributing.html +91 -0
package/docs/css/styles.css +425 -425
package/docs/daemon.html +62 -129
package/docs/features/bridge-mode.html +61 -66
package/docs/features/evals.html +57 -149
package/docs/features/searxng-search.html +58 -118
package/docs/features/sentry-setup.html +61 -124
package/docs/index.html +137 -125
package/docs/installation.html +77 -105
package/docs/internals/growthbook-ab-testing.html +69 -91
package/docs/internals/hidden-features.html +81 -143
package/docs/js/main.js +29 -0
package/docs/loop.html +69 -181
package/docs/mcp.html +99 -247
package/docs/models.html +69 -110
package/docs/permission-model.html +86 -102
package/docs/plugins.html +84 -102
package/docs/providers.html +87 -127
package/docs/quick-start.html +81 -93
package/docs/research-memory.html +71 -102
package/docs/security.html +71 -0
package/docs/skills.html +67 -117
package/docs/swarm.html +78 -236
package/docs/tools.html +152 -151
package/docs/troubleshooting.html +86 -106
package/docs/voice-mode.html +79 -0
package/package.json +1 -1
package/docs/architecture.th.html +0 -79
package/docs/clew-code-architecture.html +0 -1126
package/docs/commands.th.html +0 -269
package/docs/configuration.th.html +0 -108
package/docs/daemon.th.html +0 -73
package/docs/features/bridge-mode.th.html +0 -62
package/docs/features/evals.th.html +0 -62
package/docs/features/searxng-search.th.html +0 -67
package/docs/features/sentry-setup.th.html +0 -69
package/docs/features/swarm.html +0 -156
package/docs/generated/providers.html +0 -625
package/docs/generated/tools.html +0 -558
package/docs/index.th.html +0 -292
package/docs/installation.th.html +0 -105
package/docs/internals/growthbook-ab-testing.th.html +0 -60
package/docs/internals/hidden-features.th.html +0 -107
package/docs/loop.th.html +0 -227
package/docs/mcp.th.html +0 -207
package/docs/models.th.html +0 -61
package/docs/permission-model.th.html +0 -67
package/docs/plugins.th.html +0 -79
package/docs/prompts-and-features.html +0 -806
package/docs/providers.th.html +0 -81
package/docs/quick-start.th.html +0 -89
package/docs/research-memory.th.html +0 -72
package/docs/skills.th.html +0 -90
package/docs/swarm.th.html +0 -280
package/docs/tools.th.html +0 -84
package/docs/troubleshooting.th.html +0 -85

package/docs/daemon.html CHANGED

@@ -1,129 +1,62 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-  <meta charset="UTF-8">
-  <meta name="viewport" content="width=device-width, initial-scale=1.0">
-  <title>Autonomous Daemon — Clew</title>
-  <meta name="description" content="24/7 autonomous background execution — task queue, agent loop, supervisor integration, and recurring tasks.">
-  <link rel="preconnect" href="https://fonts.googleapis.com">
-  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-  <link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
-  <link rel="stylesheet" href="css/styles.css">
-  <link rel="icon" type="image/svg+xml" href="./assets/clew.svg">
-</head>
-<body>
-<header class="header"></header>
-<div class="app">
-  <aside class="sidebar" id="sidebar"></aside>
-  <div class="sidebar-overlay" id="sidebarOverlay"></div>
-  <div class="content-wrap">
-    <main class="content">
-      <div class="breadcrumbs"><a href="index.html">Home</a><span class="sep">/</span><span>Daemon Mode</span></div>
-      <h1>Autonomous Daemon Mode</h1>
-      <p class="section-subtitle">Run Clew as a 24/7 background daemon — task queue, agent loop, health checks, and supervisor auto-respawn for unattended autonomous operation.</p>
-      <p>The autonomous system lives in <code>src/services/autonomous/</code> and consists of four main components: the <strong>task queue</strong>, <strong>agent loop</strong>, <strong>daemon entry point</strong>, and <strong>supervisor integration</strong>.</p>
-      <h2>Architecture</h2>
-      <pre><code>  + Task Queue (taskQueue.ts)
-  |    File-backed persistent queue
-  |    Priorities Leases Dead-letter
-  |
-  + Agent Loop (agentLoop.ts)
-  |    Dequeue Spawn worker Monitor Retry
-  |
-  + Daemon Mode (daemonMode.ts)
-  |    Background process entry point
-  |
-  + Supervisor (supervisorIntegration.ts)
-       Health checks Auto-respawn State tracking</code></pre>
-      <h2>Task Queue</h2>
-      <p>The file-backed persistent queue (<code>src/services/autonomous/taskQueue.ts</code>) is the foundation of the autonomous system:</p>
-      <ul>
-        <li><strong>Persistence</strong> — Tasks survive process restarts via on-disk storage</li>
-        <li><strong>Priorities</strong> — Urgent tasks skip ahead in the queue</li>
-        <li><strong>Leases</strong> — Tasks are leased to workers with TTL; expired leases are retried</li>
-        <li><strong>Dead-letter</strong> — Tasks that exhaust retries are moved to dead-letter for inspection</li>
-        <li><strong>Scheduling</strong> — One-shot and recurring (cron) tasks supported</li>
-      </ul>
-      <h2>Agent Loop</h2>
-      <p>The continuous agent loop (<code>src/services/autonomous/agentLoop.ts</code>) runs in the background:</p>
-      <ol>
-        <li><strong>Dequeue</strong> — Pull the highest-priority ready task</li>
-        <li><strong>Spawn worker</strong> — Launch a worker session for the task</li>
-        <li><strong>Monitor</strong> — Track progress, streaming output, and resource usage</li>
-        <li><strong>Retry or complete</strong> — On failure, retry with backoff; on success, record result</li>
-        <li><strong>Repeat</strong> — Check for new tasks and repeat the cycle</li>
-      </ol>
-      <h2>Daemon Entry Point</h2>
-      <p><code>src/services/autonomous/daemonMode.ts</code> provides the background process entry point. When started in daemon mode, Clew:</p>
-      <ul>
-        <li>Detaches from the terminal and runs as a background process</li>
-        <li>Logs output to a configurable log file</li>
-        <li>Responds to signals for graceful shutdown</li>
-        <li>Reports status to the supervisor for health tracking</li>
-      </ul>
-      <h2>Supervisor Integration</h2>
-      <p><code>src/services/autonomous/supervisorIntegration.ts</code> ensures the daemon stays running:</p>
-      <ul>
-        <li><strong>Health checks</strong> — Periodic heartbeat and resource checks</li>
-        <li><strong>Auto-respawn</strong> — Automatic restart on unexpected exit</li>
-        <li><strong>State tracking</strong> — Current status, running tasks, error counts</li>
-        <li><strong>Graceful degradation</strong> — Reduces polling frequency on repeated failures</li>
-      </ul>
-      <h2>Commands</h2>
-      <table>
-        <tr><th>Command</th><th>Description</th></tr>
-        <tr><td><code>/daemon</code></td><td>Open interactive control panel; subcommands: start, stop, status, restart</td></tr>
-        <tr><td><code>/task</code></td><td>Create scheduled or recurring tasks via interactive form</td></tr>
-        <tr><td><code>/task list</code></td><td>List queued, running, and completed tasks</td></tr>
-        <tr><td><code>/loop</code></td><td>Run a prompt or command on a recurring interval (<code>/loop 5m /check-deploy</code>)</td></tr>
-        <tr><td><code>/agents</code></td><td>Manage agent configurations and daemon worker pools</td></tr>
-        <tr><td><code>/tasks</code></td><td>List and manage background agent tasks</td></tr>
-      </table>
-      <h2>Task Scheduling</h2>
-      <p>Scheduled tasks can be created through the interactive <code>/task</code> form or programmatically. Storage modes:</p>
-      <ul>
-        <li><strong>Durable</strong> — Persists to <code>.clew/scheduled_tasks.json</code>, survives restarts</li>
-        <li><strong>Session-only</strong> — Kept in memory for the current session only</li>
-      </ul>
-      <p>Recurring tasks auto-expire after 30 days. One-shot tasks auto-delete after firing. Custom cron expressions are supported (standard 5-field format).</p>
-      <pre><code>/task
-Name: Deploy health check
-Schedule: Daily
-Time: 09:00
-Prompt: Check deployment status and report
-Storage: Durable</code></pre>
-      <h2>Architecture Files</h2>
-      <table>
-        <tr><th>File</th><th>Role</th></tr>
-        <tr><td><code>src/services/autonomous/taskQueue.ts</code></td><td>Persistent task queue with priorities, leases, dead-letter</td></tr>
-        <tr><td><code>src/services/autonomous/agentLoop.ts</code></td><td>Continuous 24/7 agent loop</td></tr>
-        <tr><td><code>src/services/autonomous/daemonMode.ts</code></td><td>Background daemon entry point</td></tr>
-        <tr><td><code>src/services/autonomous/supervisorIntegration.ts</code></td><td>Health checks, auto-respawn, state tracking</td></tr>
-      </table>
-      <footer class="footer">
-        <span>Clew Code 0.2.14 — Open Source</span>
-        <div class="footer-links">
-          <a href="https://github.com/ClewCode/ClewCode">GitHub</a>
-          <a href="https://github.com/ClewCode/ClewCode/issues">Issues</a>
-        </div>
-      </footer>
-    </main>
-    <nav class="toc-sidebar"></nav>
-  </div>
-</div>
-<script src="js/main.js"></script>
-</body>
-</html>
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Daemon Mode — Clew Code</title>
+<meta name="description" content="Run Clew Code as a background daemon for autonomous operations.">
+<link rel="icon" type="image/svg+xml" href="assets/clew.svg">
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
+<link rel="stylesheet" href="css/styles.css">
+</head>
+<body>
+<header class="header"></header>
+<div id="sidebarOverlay" class="sidebar-overlay"></div>
+<aside id="sidebar" class="sidebar"></aside>
+<div class="content-wrap">
+<div class="content">
+<div class="breadcrumbs"><a href="index.html">Home</a><span class="sep">/</span><span class="current">Daemon Mode</span></div>
+<h1>Daemon Mode</h1>
+<p class="sub">Run Clew Code as a persistent background process for autonomous task execution, scheduled jobs, and continuous monitoring.</p>
+<h2 id="overview">Overview</h2>
+<p>Daemon mode keeps Clew Code running in the background, processing tasks from a persistent queue, executing scheduled jobs via cron, and coordinating with mesh peers — all without an active terminal session.</p>
+<h2 id="daemon-commands">Daemon Commands</h2>
+<pre><code class="language-bash">❯ /daemon          # Open daemon dashboard
+❯ /daemon status   # Check daemon status
+</code></pre>
+<h2 id="task-queue">Task Queue</h2>
+<p>The daemon uses a file-backed persistent task queue with:</p>
+<ul>
+  <li><strong>Lease-based concurrency</strong> — max 3 concurrent workers</li>
+  <li><strong>Exponential backoff retry</strong> — failed tasks are retried with increasing delays</li>
+  <li><strong>Dead-letter management</strong> — tasks that exceed retry limits are moved to dead-letter storage</li>
+</ul>
+<h2 id="scheduling">Scheduling</h2>
+<p>Use cron syntax to schedule recurring tasks:</p>
+<pre><code class="language-bash">❯ /task add "0 9 * * *" "daily standup summary"
+❯ /task list          # list scheduled tasks
+❯ /task remove &lt;id&gt;   # remove a task
+</code></pre>
+<h2 id="loop">Agent Loop</h2>
+<p>The daemon integrates with the autonomous agent loop for 24/7 operation:</p>
+<pre><code class="language-bash">❯ /loop start         # start the autonomous loop
+❯ /loop stop          # stop the loop
+❯ /loop status        # check loop status
+</code></pre>
+<p>See <a href="loop.html">Agent Loop</a> for details.</p>
+</div>
+</div>
+<script src="js/main.js"></script>
+</body>
+</html>
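
Note on the daemon.html change: the new page lists exponential backoff retry and dead-letter management for the task queue but does not show how the two interact. The sketch below is one plausible reading, not the package's implementation. The names `backoffDelayMs` and `onFailure` and the constants (1 s base delay, 60 s cap, 5 attempts) are assumptions chosen for illustration; none of them appear in the diff.

```typescript
// Hypothetical sketch: names and constants are illustrative, not taken from clew-code.
interface Task {
  id: string;
  attempts: number; // failed attempts recorded so far
}

type Outcome =
  | { kind: "retry"; delayMs: number }
  | { kind: "dead-letter" };

const BASE_DELAY_MS = 1000; // assumed base delay
const MAX_DELAY_MS = 60000; // assumed upper bound on any single delay
const MAX_ATTEMPTS = 5; // assumed retry limit

// The delay doubles with each failed attempt until it reaches the cap.
function backoffDelayMs(attempts: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (attempts - 1), MAX_DELAY_MS);
}

// Decide what happens to a task after one more failure:
// schedule a retry, or move it to dead-letter once the limit is reached.
function onFailure(task: Task): Outcome {
  const attempts = task.attempts + 1;
  if (attempts >= MAX_ATTEMPTS) {
    return { kind: "dead-letter" };
  }
  return { kind: "retry", delayMs: backoffDelayMs(attempts) };
}
```

Under these assumed values a task is retried after 1 s, 2 s, 4 s, and 8 s, and the fifth failure sends it to dead-letter storage.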

package/docs/features/bridge-mode.html CHANGED

@@ -1,67 +1,62 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-  <meta charset="UTF-8">
-  <meta name="viewport" content="width=device-width, initial-scale=1.0">
-  <title>Bridge Mode — Remote Control & Collaboration — Clew</title>
-  <meta name="description" content="WebSocket remote control and collaboration for Clew.">
-  <link rel="preconnect" href="https://fonts.googleapis.com">
-  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-  <link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
-  <link rel="stylesheet" href="../css/styles.css">
-  <link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
-</head>
-<body>
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Bridge Mode — Clew Code</title>
+<meta name="description" content="Remote control and bridge mode for Clew Code.">
+<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
+<link rel="stylesheet" href="../css/styles.css">
+</head>
+<body>
 <header class="header"></header>
-<div class="app">
-  <aside class="sidebar" id="sidebar"></aside>
-  <div class="sidebar-overlay" id="sidebarOverlay"></div>
-  <div class="content-wrap">
-    <main class="content">
-      <div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><a href="../index.html#features">Features</a><span class="sep">/</span><span>Bridge Mode</span></div>
-      <h1>Bridge Mode</h1>
-      <p class="section-subtitle">Remote Control &amp; Remote Collaboration</p>
-      <div class="callout callout-info">
-        <strong>Bridge Mode</strong> exposes a remote control surface over a WebSocket connection. It is designed to be used by a mobile/web app to send commands and receive information from a running Clew session.
-      </div>
-      <h2>Architecture</h2>
-      <p>Bridge mode creates a WebSocket server that runs alongside the main Clew session. Remote clients connect to this server and can send slash commands, receive responses, and interact with the running session. The bridge also supports session sharing for team collaboration.</p>
-      <h2>Quick Start</h2>
-      <pre><code># Enable bridge mode
-export BRIDGE_MODE=1
-claude --bridge
-# Connect from another terminal
-claude --remote ws://localhost:18790</code></pre>
-      <h2>Features</h2>
-      <ul>
-        <li><strong>Remote Control</strong> — Send commands from mobile/web/CLI clients</li>
-        <li><strong>Session Sharing</strong> — Share your session with team members</li>
-        <li><strong>Team Onboarding</strong> — Invite teammates to collaborate</li>
-        <li><strong>Secure</strong> — OAuth-based authentication for remote connections</li>
-      </ul>
-      <div class="callout callout-warn">
-        <strong>Security Note</strong>
-        Bridge mode is designed for trusted networks. Use appropriate security measures when exposing the WebSocket server to external networks.
-      </div>
-      <footer class="footer">
-        <span>Clew Code 0.2.14</span>
-        <div class="footer-links">
-          <a href="https://github.com/ClewCode/ClewCode">GitHub</a>
-          <a href="https://github.com/ClewCode/ClewCode/issues">Issues</a>
-        </div>
-      </footer>
-    </main>
-    <nav class="toc-sidebar"></nav>
-  </div>
-</div>
-<script src="../js/main.js"></script>
-</body>
-</html>
+<div id="sidebarOverlay" class="sidebar-overlay"></div>
+<aside id="sidebar" class="sidebar"></aside>
+<div class="content-wrap">
+<div class="content">
+<div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><span class="current">Bridge Mode</span></div>
+<h1>Bridge Mode</h1>
+<p class="sub">Remote control Clew Code from anywhere via WebSocket bridge.</p>
+<h2 id="overview">Overview</h2>
+<p>Bridge mode allows you to connect to a running Clew Code instance remotely. There are two systems:</p>
+<ul>
+  <li><strong>Bridge v1 (Legacy CCR)</strong> — The original Claude Code Remote system, tied to claude.ai OAuth</li>
+  <li><strong>Bridge v2 (Provider-Agnostic)</strong> — A standalone WebSocket server that works without claude.ai</li>
+</ul>
+<h2 id="v2">Bridge v2 — Provider-Agnostic Remote Control</h2>
+<p>The new bridge v2 runs a local WebSocket server with:</p>
+<ul>
+  <li>One-time auth tokens (SHA-256 hashed)</li>
+  <li>Session management</li>
+  <li>Optional NAT-traversal relay</li>
+  <li>No dependency on any provider's backend</li>
+</ul>
+<h3>Commands</h3>
+<pre><code class="language-bash">❯ /remote listen         # start the WebSocket server
+❯ /remote connect &lt;url&gt;  # connect to a remote instance
+❯ /remote token          # generate a one-time auth token
+</code></pre>
+<h2 id="relay">Relay Mode</h2>
+<p>For NAT traversal, use the optional relay server:</p>
+<pre><code class="language-bash">bun run relay            # start the relay server
+</code></pre>
+<h2 id="bridge-commands">Bridge v1 Commands</h2>
+<pre><code class="language-bash">❯ /bridge                # configure bridge mode
+</code></pre>
+<p>Note: Bridge v1 requires a claude.ai subscription.</p>
+</div>
+</div>
+<script src="../js/main.js"></script>
+</body>
+</html>
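
Note on the bridge-mode.html change: the new page says Bridge v2 uses one-time auth tokens that are SHA-256 hashed, without further detail. The sketch below shows what such a scheme could look like using Node's built-in `crypto` module. The class `OneTimeTokens`, its methods, and the 32-byte token length are assumptions for illustration and are not taken from the package.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical sketch: class and method names are illustrative, not clew-code APIs.
class OneTimeTokens {
  // Only SHA-256 digests are kept; the plaintext token is never stored.
  private readonly digests = new Set<string>();

  // Create a random token, remember its digest, and hand the plaintext to the caller.
  issue(): string {
    const token = randomBytes(32).toString("hex"); // assumed length: 32 random bytes
    this.digests.add(sha256Hex(token));
    return token;
  }

  // Returns true at most once per issued token, because a successful
  // redemption removes the digest from the set.
  redeem(token: string): boolean {
    return this.digests.delete(sha256Hex(token));
  }
}

function sha256Hex(input: string): string {
  return createHash("sha256").update(input).digest("hex");
}
```

Storing only the digest means a leaked token store does not directly reveal usable tokens, and deleting the digest on redemption is what makes each token single-use.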

package/docs/features/evals.html CHANGED

@@ -1,150 +1,58 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-  <meta charset="UTF-8">
-  <meta name="viewport" content="width=device-width, initial-scale=1.0">
-  <title>Evaluation Harness — Clew</title>
-  <meta name="description" content="Offline-first AI coding agent evaluation and verification framework.">
-  <link rel="preconnect" href="https://fonts.googleapis.com">
-  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-  <link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
-  <link rel="stylesheet" href="../css/styles.css">
-  <link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
-</head>
-<body>
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Evaluation Harness — Clew Code</title>
+<meta name="description" content="Built-in evaluation harness for testing provider and model performance.">
+<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
+<link rel="stylesheet" href="../css/styles.css">
+</head>
+<body>
 <header class="header"></header>
-<div class="app">
-  <aside class="sidebar" id="sidebar"></aside>
-  <div class="sidebar-overlay" id="sidebarOverlay"></div>
-  <div class="content-wrap">
-    <main class="content">
-      <div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><a href="../index.html#features">Features</a><span class="sep">/</span><span>Evaluation Harness</span></div>
-      <h1>Evaluation Harness</h1>
-      <p class="section-subtitle">Offline-first AI coding agent evaluation and verification framework</p>
-      <div class="callout callout-tip">
-        <strong>TL;DR</strong>
-        Run <code>clew eval init</code> to bootstrap the evaluation folders inside your project,
-        then execute <code>clew eval run</code> to run standard coding or research benchmarks locally.
-      </div>
-      <h2>Overview</h2>
-      <p>Clew includes a localized, <strong>offline-first evaluation harness</strong> under the <code>/eval</code> command namespace. This allows developers to systematically grade agent output quality, detect trace trajectory regressions, control boundary escapes, and compare model versions using deterministic rules.</p>
-      <h2>Workspace Directory Layout</h2>
-      <p>When you run <code>clew eval init</code>, it configures the following structures inside <code>.claude/evals/</code>:</p>
-      <table>
-        <tr><th>Folder</th><th>Description</th></tr>
-        <tr><td><code>.claude/evals/tasks/</code></td><td>YAML task definitions (grouped by categories like <code>coding/</code>, <code>research/</code>, <code>memory/</code>, <code>security/</code>)</td></tr>
-        <tr><td><code>.claude/evals/graders/</code></td><td>YAML grader rules and configurations (Command, Trace, Artifact, and Rule graders)</td></tr>
-        <tr><td><code>.claude/evals/runs/</code></td><td>Outcome results, captured events logs, and workspace diffs per run</td></tr>
-        <tr><td><code>.claude/evals/baselines/</code></td><td>Saved scoring baselines (e.g. main branch benchmark records)</td></tr>
-        <tr><td><code>.claude/evals/reports/</code></td><td>Final generated markdown and JSON evaluation reports</td></tr>
-      </table>
-      <h2>Subcommand CLI Usage</h2>
-      <h3>1. Initialize Workspace</h3>
-      <pre><code>claude eval init</code></pre>
-      <h3>2. Run Evaluations</h3>
-      <pre><code># Run all loaded tasks
-claude eval run
-# Run only tasks in the "coding" category
-claude eval run --set coding
-# Run a specific task by ID
-claude eval run --task coding.sample-task
-# Run evaluations and compare against a baseline
-claude eval run --baseline main</code></pre>
-      <h3>3. Drift &amp; Regression Comparison</h3>
-      <pre><code>claude eval compare --baseline main</code></pre>
-      <h3>4. Step Trace Trajectory</h3>
-      <pre><code>claude eval trace coding.sample-task</code></pre>
-      <h3>5. Diagnostics (Doctor)</h3>
-      <pre><code>claude eval doctor</code></pre>
-      <h2>Writing Tasks &amp; Graders</h2>
-      <h3>Eval Task YAML Schema</h3>
-      <pre><code>id: coding.fix-provider-routing
-title: Fix provider routing fallback behavior
-category: coding
-input: |
-  Fix the provider routing fallback so unsupported providers return a clear error.
-workspace_fixture: fixtures/provider-routing
-expected:
-  files_changed:
-    - src/providers/router.ts
-  commands_run:
-    - bun test src/providers
-graders:
-  - test-pass
-  - scope-control
-  - evidence-before-patch
-budgets:
-  max_steps: 12
-  max_tool_calls: 6</code></pre>
-      <h3>Grader Types</h3>
-      <h4>Command Grader</h4>
-      <pre><code>id: test-pass
-type: command
-commands:
-  - bun test
-pass_when:
-  exit_code: 0</code></pre>
-      <h4>Trace Grader</h4>
-      <pre><code>id: evidence-before-patch
-type: trace
-rules:
-  - before: repo.patch
-    require_any:
-      - repo.search
-      - repo.open
-fail_message: Agent patched files before reading evidence.</code></pre>
-      <h4>Artifact Grader</h4>
-      <pre><code>id: scope-control
-type: artifact
-checks:
-  max_changed_files: 5
-  changed_files:
-    allow:
-      - src/providers/**
-      - tests/providers/**
-    deny:
-      - package-lock.json</code></pre>
-      <h4>Rule Grader</h4>
-      <pre><code>id: output-format
-type: rule
-must_include:
-  - "## Summary"
-must_not_include:
-  - "I could not view"</code></pre>
-      <h2>Critical Failure Policies</h2>
-      <p>Clew immediately scores a task as <strong>0.0 (Failed)</strong> if any of these boundaries are breached:</p>
-      <ol>
-        <li><strong>Secret Leakage</strong> — Sensitive tokens (e.g. API keys, secrets) detected in agent output</li>
-        <li><strong>Workspace Escape</strong> — Agent attempts to write or edit files outside workspace boundaries</li>
-        <li><strong>Forbidden Commands</strong> — Destructive actions (e.g., <code>rm -rf</code>) without explicit permission</li>
-      </ol>
-      <footer class="footer">
-        <span>Clew Code 0.2.14</span>
-        <div class="footer-links">
-          <a href="https://github.com/ClewCode/ClewCode">GitHub</a>
-          <a href="https://github.com/ClewCode/ClewCode/issues">Issues</a>
-        </div>
-      </footer>
-    </main>
-    <nav class="toc-sidebar"></nav>
-  </div>
-</div>
-<script src="../js/main.js"></script>
-</body>
-</html>
+<div id="sidebarOverlay" class="sidebar-overlay"></div>
+<aside id="sidebar" class="sidebar"></aside>
+<div class="content-wrap">
+<div class="content">
+<div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><span class="current">Evaluation Harness</span></div>
+<h1>Evaluation Harness</h1>
+<p class="sub">Test and compare provider and model performance with the built-in eval system.</p>
+<h2 id="overview">Overview</h2>
+<p>The evaluation harness allows you to run standardized benchmarks against any configured provider/model combination. Use it to compare performance, measure latency, and validate outputs across providers.</p>
+<h2 id="usage">Usage</h2>
+<pre><code class="language-bash">❯ /evals run                 # run the standard eval suite
+❯ /evals list                # list available eval benchmarks
+❯ /evals results             # show previous eval results
+</code></pre>
+<h2 id="benchmarks">Available Benchmarks</h2>
+<ul>
+  <li><strong>Code generation</strong> — function-level code synthesis</li>
+  <li><strong>Tool calling</strong> — accuracy of tool selection and argument generation</li>
+  <li><strong>Reasoning</strong> — multi-step logical reasoning</li>
+  <li><strong>Context comprehension</strong> — long-context understanding and recall</li>
+</ul>
+<h2 id="comparing">Comparing Providers</h2>
+<p>Switch providers and re-run the same eval to compare:</p>
+<pre><code class="language-bash">❯ /model openai
+❯ /evals run
+❯ /model deepseek-v4-flash
+❯ /evals run
+❯ /evals results        # side-by-side comparison
+</code></pre>
+</div>
+</div>
+<script src="../js/main.js"></script>
+</body>
+</html>
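
Note on the evals.html change: the removed 0.2.21 page defined a trace grader (`evidence-before-patch`) with a `before: repo.patch` rule and a `require_any` list, and this documentation is dropped in 0.2.22. For readers comparing the two versions, the sketch below shows one way that rule could be evaluated against an ordered list of trace events. The function `checkTrace`, the `TraceRule` shape, and the interpretation (at least one required event must occur somewhere before each `before` event) are assumptions; the diff does not show the grader's actual logic.

```typescript
// Hypothetical sketch: the rule shape mirrors the removed YAML, the logic is assumed.
interface TraceRule {
  before: string; // event that needs prior evidence, e.g. "repo.patch"
  requireAny: string[]; // events that count as evidence, e.g. "repo.search"
}

// Passes when every occurrence of `rule.before` is preceded by
// at least one of the events in `rule.requireAny`.
function checkTrace(events: string[], rule: TraceRule): boolean {
  let seenRequired = false;
  for (const ev of events) {
    if (rule.requireAny.includes(ev)) {
      seenRequired = true;
    }
    if (ev === rule.before && !seenRequired) {
      return false;
    }
  }
  return true;
}

const evidenceBeforePatch: TraceRule = {
  before: "repo.patch",
  requireAny: ["repo.search", "repo.open"],
};
```

With this reading, a trace of `repo.open` followed by `repo.patch` passes, while a trace that starts with `repo.patch` fails, which corresponds to the removed page's message "Agent patched files before reading evidence."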