clew-code 0.2.6 → 0.2.8
- package/README.md +292 -264
- package/dist/main.js +2835 -2950
- package/docs/architecture.html +148 -145
- package/docs/architecture.th.html +79 -78
- package/docs/clew-code-architecture.html +1125 -0
- package/docs/commands.html +224 -223
- package/docs/commands.th.html +131 -130
- package/docs/configuration.html +147 -145
- package/docs/configuration.th.html +108 -106
- package/docs/css/styles.css +48 -42
- package/docs/daemon.html +129 -128
- package/docs/daemon.th.html +73 -72
- package/docs/features/bridge-mode.html +99 -98
- package/docs/features/bridge-mode.th.html +90 -89
- package/docs/features/evals.html +182 -181
- package/docs/features/evals.th.html +90 -89
- package/docs/features/peer.html +178 -177
- package/docs/features/searxng-search.html +151 -150
- package/docs/features/searxng-search.th.html +95 -94
- package/docs/features/sentry-setup.html +157 -156
- package/docs/features/sentry-setup.th.html +97 -96
- package/docs/index.html +299 -298
- package/docs/index.th.html +292 -290
- package/docs/installation.html +105 -103
- package/docs/installation.th.html +105 -103
- package/docs/internals/growthbook-ab-testing.html +113 -112
- package/docs/internals/growthbook-ab-testing.th.html +81 -80
- package/docs/internals/hidden-features.html +149 -147
- package/docs/internals/hidden-features.th.html +109 -107
- package/docs/js/main.js +83 -3
- package/docs/loop.html +181 -180
- package/docs/loop.th.html +227 -226
- package/docs/mcp.html +247 -246
- package/docs/mcp.th.html +207 -206
- package/docs/models.html +111 -110
- package/docs/models.th.html +61 -60
- package/docs/peer.html +236 -235
- package/docs/peer.th.html +280 -279
- package/docs/permission-model.html +102 -101
- package/docs/permission-model.th.html +67 -66
- package/docs/plugins.html +102 -101
- package/docs/plugins.th.html +79 -78
- package/docs/providers.html +126 -117
- package/docs/providers.th.html +80 -78
- package/docs/quick-start.html +93 -92
- package/docs/quick-start.th.html +40 -39
- package/docs/research-memory.html +82 -79
- package/docs/research-memory.th.html +72 -71
- package/docs/skills.html +117 -116
- package/docs/skills.th.html +90 -89
- package/docs/tools.html +170 -169
- package/docs/tools.th.html +84 -83
- package/docs/troubleshooting.html +106 -105
- package/docs/troubleshooting.th.html +85 -84
- package/package.json +3 -1
- package/docs/taste.html +0 -436
- package/docs/taste.th.html +0 -236
package/docs/features/evals.html
CHANGED
@@ -1,181 +1,182 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-<meta charset="UTF-8">
-<meta name="viewport" content="width=device-width, initial-scale=1.0">
-<title>Evaluation Harness
-<meta name="description" content="Offline-first AI coding agent evaluation and verification framework.">
-<link rel="preconnect" href="https://fonts.googleapis.com">
-<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
-<link rel="stylesheet" href="../css/styles.css">
-<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
-</head>
-<body>
-<header class="header">
-<div class="header-inner">
-<a href="../index.html" class="logo">
-<span>Clew Code</span>
-</a>
-<nav class="header-nav">
-<a href="../index.html">Home</a>
-<a href="../index.html#features">Features</a>
-<a href="../index.html#commands">Commands</a>
-<a href="../quick-start.html" class="active">Docs</a>
-<a href="https://github.com/
-<div class="lang-wrap">
-<button class="lang-btn"
-<div class="lang-menu">
-<a href="../../readme/README.zh.md"
-<a href="../../readme/README.th.md"
-<a href="../../readme/README.ja.md"
-<a href="../../readme/README.ko.md"
-<a href="../../readme/README.es.md">
-<a href="../../readme/README.fr.md">
-<a href="../../readme/README.de.md">Deutsch</a>
-<a href="../../readme/README.pt.md">
-<a href="../../readme/README.vi.md">
-<a href="../../readme/README.id.md">Bahasa Indonesia</a>
-<a href="../../readme/README.ru.md"
-<a href="../../readme/README.hi.md"
-<a href="../../README.md">English</a>
-</div>
-</div>
-</nav>
-<button class="menu-btn" id="menuToggle" aria-label="Toggle navigation"><span></span><span></span><span></span></button>
-</div>
-</header>
-<div class="app">
-<aside class="sidebar" id="sidebar"></aside>
-<div class="sidebar-overlay" id="sidebarOverlay"></div>
-<div class="content-wrap">
-<main class="content">
-<div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><a href="../index.html#features">Features</a><span class="sep">/</span><span>Evaluation Harness</span></div>
-<h1>Evaluation Harness</h1>
-<p class="section-subtitle">Offline-first AI coding agent evaluation and verification framework</p>
-
-<div class="callout callout-tip">
-<strong>TL;DR</strong>
-Run <code>clew eval init</code> to bootstrap the evaluation folders inside your project,
-then execute <code>clew eval run</code> to run standard coding or research benchmarks locally.
-</div>
-
-<h2>Overview</h2>
-<p>Clew includes a localized, <strong>offline-first evaluation harness</strong> under the <code>/eval</code> command namespace. This allows developers to systematically grade agent output quality, detect trace trajectory regressions, control boundary escapes, and compare model versions using deterministic rules.</p>
-
-<h2>Workspace Directory Layout</h2>
-<p>When you run <code>clew eval init</code>, it configures the following structures inside <code>.claude/evals/</code>:</p>
-<table>
-<tr><th>Folder</th><th>Description</th></tr>
-<tr><td><code>.claude/evals/tasks/</code></td><td>YAML task definitions (grouped by categories like <code>coding/</code>, <code>research/</code>, <code>memory/</code>, <code>security/</code>)</td></tr>
-<tr><td><code>.claude/evals/graders/</code></td><td>YAML grader rules and configurations (Command, Trace, Artifact, and Rule graders)</td></tr>
-<tr><td><code>.claude/evals/runs/</code></td><td>Outcome results, captured events logs, and workspace diffs per run</td></tr>
-<tr><td><code>.claude/evals/baselines/</code></td><td>Saved scoring baselines (e.g. main branch benchmark records)</td></tr>
-<tr><td><code>.claude/evals/reports/</code></td><td>Final generated markdown and JSON evaluation reports</td></tr>
-</table>
-
-<h2>Subcommand CLI Usage</h2>
-<h3>1. Initialize Workspace</h3>
-<pre><code>claude eval init</code></pre>
-
-<h3>2. Run Evaluations</h3>
-<pre><code># Run all loaded tasks
-claude eval run
-# Run only tasks in the "coding" category
-claude eval run --set coding
-# Run a specific task by ID
-claude eval run --task coding.sample-task
-# Run evaluations and compare against a baseline
-claude eval run --baseline main</code></pre>
-
-<h3>3. Drift & Regression Comparison</h3>
-<pre><code>claude eval compare --baseline main</code></pre>
-
-<h3>4. Step Trace Trajectory</h3>
-<pre><code>claude eval trace coding.sample-task</code></pre>
-
-<h3>5. Diagnostics (Doctor)</h3>
-<pre><code>claude eval doctor</code></pre>
-
-<h2>Writing Tasks & Graders</h2>
-<h3>Eval Task YAML Schema</h3>
-<pre><code>id: coding.fix-provider-routing
-title: Fix provider routing fallback behavior
-category: coding
-input: |
-  Fix the provider routing fallback so unsupported providers return a clear error.
-workspace_fixture: fixtures/provider-routing
-expected:
-  files_changed:
-    - src/providers/router.ts
-  commands_run:
-    - bun test src/providers
-graders:
-  - test-pass
-  - scope-control
-  - evidence-before-patch
-budgets:
-  max_steps: 12
-  max_tool_calls: 6</code></pre>
-
-<h3>Grader Types</h3>
-<h4>Command Grader</h4>
-<pre><code>id: test-pass
-type: command
-commands:
-  - bun test
-pass_when:
-  exit_code: 0</code></pre>
-
-<h4>Trace Grader</h4>
-<pre><code>id: evidence-before-patch
-type: trace
-rules:
-  - before: repo.patch
-    require_any:
-      - repo.search
-      - repo.open
-    fail_message: Agent patched files before reading evidence.</code></pre>
-
-<h4>Artifact Grader</h4>
-<pre><code>id: scope-control
-type: artifact
-checks:
-  max_changed_files: 5
-  changed_files:
-    allow:
-      - src/providers/**
-      - tests/providers/**
-    deny:
-      - package-lock.json</code></pre>
-
-<h4>Rule Grader</h4>
-<pre><code>id: output-format
-type: rule
-must_include:
-  - "## Summary"
-must_not_include:
-  - "I could not view"</code></pre>
-
-<h2>Critical Failure Policies</h2>
-<p>Clew immediately scores a task as <strong>0.0 (Failed)</strong> if any of these boundaries are breached:</p>
-<ol>
-<li><strong>Secret Leakage</strong>
-<li><strong>Workspace Escape</strong>
-<li><strong>Forbidden Commands</strong>
-</ol>
-
-<footer class="footer">
-<span>Clew Code v0.2.
-<div class="footer-links">
-<a href="https://github.com/
-<a href="https://github.com/
-</div>
-</footer>
-</main>
-<nav class="toc-sidebar"></nav>
-</div>
-</div>
-<script src="../js/main.js"></script>
-</body>
-</html>
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Evaluation Harness - Clew</title>
+<meta name="description" content="Offline-first AI coding agent evaluation and verification framework.">
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
+<link rel="stylesheet" href="../css/styles.css">
+<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
+</head>
+<body>
+<header class="header">
+<div class="header-inner">
+<a href="../index.html" class="logo">
+<span>Clew Code</span>
+</a>
+<nav class="header-nav">
+<a href="../index.html">Home</a>
+<a href="../index.html#features">Features</a>
+<a href="../index.html#commands">Commands</a>
+<a href="../quick-start.html" class="active">Docs</a>
+<a href="https://github.com/ClewCode/ClewCode" target="_blank">GitHub</a>
+<div class="lang-wrap">
+<button class="lang-btn">🌐</button>
+<div class="lang-menu">
+<a href="../../readme/README.zh.md">中文</a>
+<a href="../../readme/README.th.md">ไทย</a>
+<a href="../../readme/README.ja.md">日本語</a>
+<a href="../../readme/README.ko.md">한국어</a>
+<a href="../../readme/README.es.md">Español</a>
+<a href="../../readme/README.fr.md">Français</a>
+<a href="../../readme/README.de.md">Deutsch</a>
+<a href="../../readme/README.pt.md">Português</a>
+<a href="../../readme/README.vi.md">Tiếng Việt</a>
+<a href="../../readme/README.id.md">Bahasa Indonesia</a>
+<a href="../../readme/README.ru.md">Русский</a>
+<a href="../../readme/README.hi.md">हिन्दी</a>
+<a href="../../README.md">English</a>
+</div>
+</div>
+</nav>
+<button class="menu-btn" id="menuToggle" aria-label="Toggle navigation"><span></span><span></span><span></span></button>
+</div>
+</header>
+<div class="app">
+<aside class="sidebar" id="sidebar"></aside>
+<div class="sidebar-overlay" id="sidebarOverlay"></div>
+<div class="content-wrap">
+<main class="content">
+<div class="breadcrumbs"><a href="../index.html">Home</a><span class="sep">/</span><a href="../index.html#features">Features</a><span class="sep">/</span><span>Evaluation Harness</span></div>
+<h1>Evaluation Harness</h1>
+<p class="section-subtitle">Offline-first AI coding agent evaluation and verification framework</p>
+
+<div class="callout callout-tip">
+<strong>TL;DR</strong>
+Run <code>clew eval init</code> to bootstrap the evaluation folders inside your project,
+then execute <code>clew eval run</code> to run standard coding or research benchmarks locally.
+</div>
+
+<h2>Overview</h2>
+<p>Clew includes a localized, <strong>offline-first evaluation harness</strong> under the <code>/eval</code> command namespace. This allows developers to systematically grade agent output quality, detect trace trajectory regressions, control boundary escapes, and compare model versions using deterministic rules.</p>
+
+<h2>Workspace Directory Layout</h2>
+<p>When you run <code>clew eval init</code>, it configures the following structures inside <code>.claude/evals/</code>:</p>
+<table>
+<tr><th>Folder</th><th>Description</th></tr>
+<tr><td><code>.claude/evals/tasks/</code></td><td>YAML task definitions (grouped by categories like <code>coding/</code>, <code>research/</code>, <code>memory/</code>, <code>security/</code>)</td></tr>
+<tr><td><code>.claude/evals/graders/</code></td><td>YAML grader rules and configurations (Command, Trace, Artifact, and Rule graders)</td></tr>
+<tr><td><code>.claude/evals/runs/</code></td><td>Outcome results, captured events logs, and workspace diffs per run</td></tr>
+<tr><td><code>.claude/evals/baselines/</code></td><td>Saved scoring baselines (e.g. main branch benchmark records)</td></tr>
+<tr><td><code>.claude/evals/reports/</code></td><td>Final generated markdown and JSON evaluation reports</td></tr>
+</table>
+
+<h2>Subcommand CLI Usage</h2>
+<h3>1. Initialize Workspace</h3>
+<pre><code>claude eval init</code></pre>
+
+<h3>2. Run Evaluations</h3>
+<pre><code># Run all loaded tasks
+claude eval run
+# Run only tasks in the "coding" category
+claude eval run --set coding
+# Run a specific task by ID
+claude eval run --task coding.sample-task
+# Run evaluations and compare against a baseline
+claude eval run --baseline main</code></pre>
+
+<h3>3. Drift & Regression Comparison</h3>
+<pre><code>claude eval compare --baseline main</code></pre>
+
+<h3>4. Step Trace Trajectory</h3>
+<pre><code>claude eval trace coding.sample-task</code></pre>
+
+<h3>5. Diagnostics (Doctor)</h3>
+<pre><code>claude eval doctor</code></pre>
+
+<h2>Writing Tasks & Graders</h2>
+<h3>Eval Task YAML Schema</h3>
+<pre><code>id: coding.fix-provider-routing
+title: Fix provider routing fallback behavior
+category: coding
+input: |
+  Fix the provider routing fallback so unsupported providers return a clear error.
+workspace_fixture: fixtures/provider-routing
+expected:
+  files_changed:
+    - src/providers/router.ts
+  commands_run:
+    - bun test src/providers
+graders:
+  - test-pass
+  - scope-control
+  - evidence-before-patch
+budgets:
+  max_steps: 12
+  max_tool_calls: 6</code></pre>
+
+<h3>Grader Types</h3>
+<h4>Command Grader</h4>
+<pre><code>id: test-pass
+type: command
+commands:
+  - bun test
+pass_when:
+  exit_code: 0</code></pre>
+
+<h4>Trace Grader</h4>
+<pre><code>id: evidence-before-patch
+type: trace
+rules:
+  - before: repo.patch
+    require_any:
+      - repo.search
+      - repo.open
+    fail_message: Agent patched files before reading evidence.</code></pre>
+
+<h4>Artifact Grader</h4>
+<pre><code>id: scope-control
+type: artifact
+checks:
+  max_changed_files: 5
+  changed_files:
+    allow:
+      - src/providers/**
+      - tests/providers/**
+    deny:
+      - package-lock.json</code></pre>
+
+<h4>Rule Grader</h4>
+<pre><code>id: output-format
+type: rule
+must_include:
+  - "## Summary"
+must_not_include:
+  - "I could not view"</code></pre>
+
+<h2>Critical Failure Policies</h2>
+<p>Clew immediately scores a task as <strong>0.0 (Failed)</strong> if any of these boundaries are breached:</p>
+<ol>
+<li><strong>Secret Leakage</strong> - Sensitive tokens (e.g. API keys, secrets) detected in agent output</li>
+<li><strong>Workspace Escape</strong> - Agent attempts to write or edit files outside workspace boundaries</li>
+<li><strong>Forbidden Commands</strong> - Destructive actions (e.g., <code>rm -rf</code>) without explicit permission</li>
+</ol>
+
+<footer class="footer">
+<span>Clew Code v0.2.7</span>
+<div class="footer-links">
+<a href="https://github.com/ClewCode/ClewCode">GitHub</a>
+<a href="https://github.com/ClewCode/ClewCode/issues">Issues</a>
+</div>
+</footer>
+</main>
+<nav class="toc-sidebar"></nav>
+</div>
+</div>
+<script src="../js/main.js"></script>
+</body>
+</html>
+
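Note on the Artifact Grader block in package/docs/features/evals.html: the page shows the `scope-control` configuration but not how it is evaluated. The following is a minimal sketch of one plausible reading, not the harness's actual implementation. The function name `grade_artifact`, the use of `fnmatch`-style glob matching, and the rule that any path outside the allow list fails are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def grade_artifact(changed_files, checks):
    """Return (passed, reasons) for a scope-control style check.

    Hypothetical helper: the real grader's semantics are not documented
    in the diff, so the rules below are assumptions.
    """
    reasons = []

    # Assumed rule 1: the number of changed files must not exceed the limit.
    limit = checks.get("max_changed_files")
    if limit is not None and len(changed_files) > limit:
        reasons.append(f"{len(changed_files)} files changed, limit is {limit}")

    patterns = checks.get("changed_files", {})
    allow = patterns.get("allow", [])
    deny = patterns.get("deny", [])

    for path in changed_files:
        # Assumed rule 2: a deny match always fails.
        if any(fnmatchcase(path, pat) for pat in deny):
            reasons.append(f"denied path: {path}")
        # Assumed rule 3: if an allow list exists, every path must match it.
        elif allow and not any(fnmatchcase(path, pat) for pat in allow):
            reasons.append(f"path outside allow list: {path}")

    return (not reasons, reasons)


# The same values as the YAML example in the docs page.
CHECKS = {
    "max_changed_files": 5,
    "changed_files": {
        "allow": ["src/providers/**", "tests/providers/**"],
        "deny": ["package-lock.json"],
    },
}
```

With `fnmatch`, `*` also matches path separators, so `src/providers/**` covers nested files; a real implementation may use stricter glob semantics.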
package/docs/features/evals.th.html
CHANGED
@@ -1,89 +1,90 @@
-<!DOCTYPE html>
-<html lang="th">
-<head>
-<meta charset="UTF-8">
-<meta name="viewport" content="width=device-width, initial-scale=1.0">
-<title
-<meta name="description" content="
-<link rel="preconnect" href="https://fonts.googleapis.com">
-<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&family=Noto+Sans+Thai:wght@400;500;600;700&display=swap" rel="stylesheet">
-<link rel="stylesheet" href="../css/styles.css">
-<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
-</head>
-<body>
-<header class="header">
-<div class="header-inner">
-<a href="../index.th.html" class="logo">
-<span>Clew Code</span>
-</a>
-<nav class="header-nav">
-<a href="../index.th.html"
-<div class="lang-wrap">
-<button class="lang-btn"
-<div class="lang-menu">
-<a href="../../readme/README.zh.md"
-<a href="../../readme/README.th.md"
-<a href="../../readme/README.ja.md"
-<a href="../../readme/README.ko.md"
-<a href="../../readme/README.es.md">
-<a href="../../readme/README.fr.md">
-<a href="../../readme/README.de.md">Deutsch</a>
-<a href="../../readme/README.pt.md">
-<a href="../../readme/README.vi.md">
-<a href="../../readme/README.id.md">Bahasa Indonesia</a>
-<a href="../../readme/README.ru.md"
-<a href="../../readme/README.hi.md"
-<a href="evals.html">English</a>
-</div>
-</div>
-</nav>
-<button class="menu-btn" id="menuToggle" aria-label="
-</div>
-</header>
-<div class="app">
-<aside class="sidebar" id="sidebar"></aside>
-<div class="sidebar-overlay" id="sidebarOverlay"></div>
-<div class="content-wrap">
-<main class="content">
-<div class="breadcrumbs"><a href="../index.th.html"
-<h1
-<p class="section-subtitle"
-
-<div class="callout callout-tip">
-<strong
-
-</div>
-
-<h2
-<p>Clew
-
-<h2
-<h3>1.
-<pre><code>claude eval init</code></pre>
-
-<h3>2.
-<pre><code>#
-claude eval run
-#
-claude eval run --set coding
-#
-claude eval run --task coding.sample-task</code></pre>
-
-<h3>3.
-<pre><code>claude eval compare --baseline main</code></pre>
-
-<footer class="footer">
-<span>Clew Code v0.2.
-<div class="footer-links">
-<a href="https://github.com/
-<a href="https://github.com/
-</div>
-</footer>
-</main>
-<nav class="toc-sidebar"></nav>
-</div>
-</div>
-<script src="../js/main.js"></script>
-</body>
-</html>
+<!DOCTYPE html>
+<html lang="th">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Evaluation - Clew</title>
+<meta name="description" content="Offline evaluation and verification framework for AI coding agents">
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&family=Noto+Sans+Thai:wght@400;500;600;700&display=swap" rel="stylesheet">
+<link rel="stylesheet" href="../css/styles.css">
+<link rel="icon" type="image/svg+xml" href="../assets/clew.svg">
+</head>
+<body>
+<header class="header">
+<div class="header-inner">
+<a href="../index.th.html" class="logo">
+<span>Clew Code</span>
+</a>
+<nav class="header-nav">
+<a href="../index.th.html">Home</a><a href="../index.th.html#features">Features</a><a href="../index.th.html#commands">Commands</a><a href="../quick-start.th.html" class="active">Docs</a><a href="https://github.com/ClewCode/ClewCode" target="_blank">GitHub</a>
+<div class="lang-wrap">
+<button class="lang-btn">🌐</button>
+<div class="lang-menu">
+<a href="../../readme/README.zh.md">中文</a>
+<a href="../../readme/README.th.md">ไทย</a>
+<a href="../../readme/README.ja.md">日本語</a>
+<a href="../../readme/README.ko.md">韓國語</a>
+<a href="../../readme/README.es.md">Español</a>
+<a href="../../readme/README.fr.md">Français</a>
+<a href="../../readme/README.de.md">Deutsch</a>
+<a href="../../readme/README.pt.md">Português</a>
+<a href="../../readme/README.vi.md">Tiếng Việt</a>
+<a href="../../readme/README.id.md">Bahasa Indonesia</a>
+<a href="../../readme/README.ru.md">Русский</a>
+<a href="../../readme/README.hi.md">हिन्दी</a>
+<a href="evals.html">English</a>
+</div>
+</div>
+</nav>
+<button class="menu-btn" id="menuToggle" aria-label="Toggle menu"><span></span><span></span><span></span></button>
+</div>
+</header>
+<div class="app">
+<aside class="sidebar" id="sidebar"></aside>
+<div class="sidebar-overlay" id="sidebarOverlay"></div>
+<div class="content-wrap">
+<main class="content">
+<div class="breadcrumbs"><a href="../index.th.html">Home</a><span class="sep">/</span><a href="../index.th.html#features">Features</a><span class="sep">/</span><span>Evaluation</span></div>
+<h1>Evaluation Framework</h1>
+<p class="section-subtitle">Offline evaluation and verification framework for AI coding agents</p>
+
+<div class="callout callout-tip">
+<strong>Summary</strong>
+Run <code>clew eval init</code> to create the evaluation folders in your project, then run <code>clew eval run</code> to run coding or research benchmarks locally
+</div>
+
+<h2>Overview</h2>
+<p>Clew has a built-in offline evaluation framework under the <code>/eval</code> command namespace. It lets developers score agent output quality, detect regressions in traces, and compare model versions</p>
+
+<h2>CLI Usage</h2>
+<h3>1. Initialize Workspace</h3>
+<pre><code>claude eval init</code></pre>
+
+<h3>2. Run Evaluations</h3>
+<pre><code># Run all tasks
+claude eval run
+# Run only the "coding" category
+claude eval run --set coding
+# Run a specific task
+claude eval run --task coding.sample-task</code></pre>
+
+<h3>3. Compare Drift and Regression</h3>
+<pre><code>claude eval compare --baseline main</code></pre>
+
+<footer class="footer">
+<span>Clew Code v0.2.7</span>
+<div class="footer-links">
+<a href="https://github.com/ClewCode/ClewCode">GitHub</a>
+<a href="https://github.com/ClewCode/ClewCode/issues">Issues</a>
+</div>
+</footer>
+</main>
+<nav class="toc-sidebar"></nav>
+</div>
+</div>
+<script src="../js/main.js"></script>
+</body>
+</html>
+
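Note on the Trace Grader block in package/docs/features/evals.html: the `evidence-before-patch` rule reads as an ordering constraint on trace events, where at least one of `repo.search` or `repo.open` must appear before `repo.patch`. The sketch below illustrates that reading under stated assumptions. `grade_trace` is a hypothetical name, the trace is assumed to be a plain ordered list of event names, and only the first occurrence of the guarded event is checked. None of this is confirmed by the package.

```python
def grade_trace(events, rules):
    """Check ordering rules against an ordered list of trace event names.

    Returns the fail_message of every violated rule. Hypothetical helper;
    the trace format and matching rules are assumptions.
    """
    failures = []
    for rule in rules:
        required = set(rule["require_any"])
        seen = set()
        for name in events:
            if name == rule["before"]:
                # The guarded event happened: at least one required
                # event must already have been observed.
                if not (seen & required):
                    failures.append(rule["fail_message"])
                break  # assumption: only the first guarded event is checked
            seen.add(name)
    return failures


# The same rule as the YAML example in the docs page.
RULES = [
    {
        "before": "repo.patch",
        "require_any": ["repo.search", "repo.open"],
        "fail_message": "Agent patched files before reading evidence.",
    }
]
```

A trace that never contains the guarded event passes under this reading, since there is nothing to order against.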