@hacksmith/doraval 0.2.45 → 0.2.47
- package/README.md +24 -4
- package/bin/doraval.js +1666 -851
- package/bin/ui/index.html +156 -5
- package/package.json +2 -2
package/README.md
CHANGED
@@ -1,10 +1,14 @@
 # doraval
 
-The context engineering toolkit for coding
+The context engineering toolkit for coding agent orchestrators.
 
-If you'
+If you're a senior engineer handing skills to new team members, a company publishing AI resources, or anyone who wants agents (and humans) to succeed on the first attempt instead of after days of debugging — this is for you.
 
-
+**The orchestrator problem:** Give 10 new engineers (or agents) a skill and only 3/10 succeed on the first try. 4/10 take hours. 7/10 take a day. 10/10 take days.
+
+doraval helps you **left-shift success** — validate, scaffold, and manage context so the first attempt works across Claude, Cursor, Codex, Copilot, and whatever comes next.
+
+> **Quick start (left-shift success in < 2 minutes):**
 > ```bash
 > # macOS
 > brew install saif-shines/tap/doraval
@@ -14,7 +18,7 @@ If you've ever shipped a Claude Code skill that stopped firing after a refactor,
 > npx @hacksmith/doraval validate .
 > ```
 
-
+Validate before you hand a skill to a new engineer or publish it. It auto-detects issues across agents and tells you what's broken.
 
 ## Install
 
@@ -130,6 +134,21 @@ doraval skill drift ./skills/my-skill/
 | **Guardrail** | Has explicit `MUST` / `MUST NOT` constraints |
 | **Clarity** | Free of ambiguous words (`maybe`, `perhaps`, `consider`) |
 
+### `eval` — Did the agent follow the skill?
+
+After a real session, evaluate whether the coding agent actually adhered to the skills it invoked.
+
+```bash
+doraval eval # pick from recent sessions interactively
+doraval eval --verbose
+doraval judge ./skills/improve/ # evaluate latest session for one skill
+doraval eval history
+```
+
+`eval` uses an LLM judge (via your configured agent) to produce a per-skill `PASS`/`FAIL` with a dynamic checklist, user familiarity score, and closure information (1-shot vs multi-turn vs incomplete).
+
+Requires `doraval init` first. See the [full docs](https://thehacksmith.dev/commands/eval/).
+
 ### `journal` — Decision memory
 
 Record and sync project principles so future you (and agents) don't accidentally contradict past choices.
@@ -173,6 +192,7 @@ doraval claude new # interactive wizard for skills/plugins (follows off
 doraval validate . --for claude --format json --ci
 doraval skill validate ./my-skill/ --format json --ci
 doraval skill drift ./my-skill/ --format json --ci
+doraval eval --ci --format json
 ```
 
 Exits with code `1` when errors are found. Pipe `--format json` output to `jq` or consume programmatically.
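
Reviewer note on the last hunk: the README says the `--ci` commands exit with code `1` when errors are found and write JSON when `--format json` is passed. A minimal sketch of a pipeline gate built on only those two behaviors might look like the following. The `ci_gate` helper is hypothetical and not part of doraval, and because the JSON schema is not shown in this diff, the sketch passes the report through without parsing it.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of doraval: run a command, pass its report
# through on stdout, and turn any non-zero exit status into a gate failure.
ci_gate() {
  local report
  local status=0
  # Capture stdout; remember the exit status instead of aborting.
  report="$("$@")" || status=$?
  printf '%s\n' "$report"
  if [ "$status" -ne 0 ]; then
    echo "ci_gate: '$*' exited with status $status" >&2
    return 1
  fi
}

# Usage (commented out so the sketch runs without doraval installed):
# ci_gate doraval validate . --for claude --format json --ci > validate.json
# ci_gate doraval eval --ci --format json > eval.json
```

Taking the command as arguments keeps the gate independent of any one subcommand, so the same wrapper could cover `validate`, `skill drift`, and the newly added `eval` line.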