ancoder-skill-cli 0.13.38-beta.0 → 0.13.38-beta.10
- package/README.md +483 -470
- package/bin/skill-cli.js +52 -52
- package/bin/targets/skill-cli-darwin-arm64 +0 -0
- package/bin/targets/skill-cli-darwin-x64 +0 -0
- package/bin/targets/skill-cli-linux-arm64 +0 -0
- package/bin/targets/skill-cli-linux-x64 +0 -0
- package/bin/targets/skill-cli-win32-x64.exe +0 -0
- package/package.json +54 -54
- package/scripts/check-bin.js +176 -176
- package/scripts/install.ps1 +150 -150
- package/scripts/make-install-bundle.js +180 -180
- package/scripts/postinstall.js +36 -36
package/README.md
CHANGED
|
@@ -1,488 +1,501 @@
|
|
|
1
|
-
# skill-cli
|
|
2
|
-
|
|
3
|
-
CLI for managing and testing [Anthropic Agent Skills](https://agentskills.io) (e.g. [anthropics/skills](https://github.com/anthropics/skills)).
|
|
4
|
-
|
|
5
|
-
## Install (npm)
|
|
6
|
-
|
|
7
|
-
```bash
|
|
8
|
-
# Global
|
|
9
|
-
npm install -g ancoder-skill-cli
|
|
10
|
-
|
|
11
|
-
# Or run without installing
|
|
12
|
-
npx ancoder-skill-cli --help
|
|
13
|
-
```
|
|
14
|
-
|
|
15
|
-
The npm package is self-contained and includes prebuilt binaries for:
|
|
16
|
-
|
|
17
|
-
- macOS arm64
|
|
18
|
-
- macOS x64
|
|
19
|
-
- Linux arm64
|
|
20
|
-
- Linux x64
|
|
21
|
-
- Windows x64
|
|
22
|
-
|
|
23
|
-
After install, the wrapper selects the correct bundled binary for the current platform automatically.
|
|
24
|
-
|
|
25
|
-
## One-line install (Gitee or intranet)
|
|
26
|
-
|
|
27
|
-
For users who cannot access GitHub, publish the release-only bundle to Gitee or copy these files to an intranet static file server:
|
|
28
|
-
|
|
29
|
-
- `scripts/install.ps1`
|
|
30
|
-
- `scripts/install.sh`
|
|
31
|
-
- `bin/targets/skill-cli-win32-x64.exe`
|
|
32
|
-
- `bin/targets/skill-cli-linux-x64`
|
|
33
|
-
- `bin/targets/skill-cli-linux-arm64`
|
|
34
|
-
- `bin/targets/skill-cli-darwin-arm64`
|
|
35
|
-
- `bin/targets/skill-cli-darwin-x64`
|
|
36
|
-
|
|
37
|
-
Windows users can install the latest Gitee release with PowerShell:
|
|
38
|
-
|
|
39
|
-
```powershell
|
|
40
|
-
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
41
|
-
```
|
|
42
|
-
|
|
43
|
-
macOS/Linux users can install the latest Gitee release with curl:
|
|
44
|
-
|
|
45
|
-
```bash
|
|
46
|
-
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | bash
|
|
47
|
-
```
|
|
48
|
-
|
|
49
|
-
The installer downloads the matching prebuilt binary, installs it as `skill-cli`, runs `skill-cli install`, then runs a lite `doctor` check. No Go toolchain is required.
|
|
50
|
-
|
|
51
|
-
The `raw/main` URL is the moving latest channel. Each GitHub release updates Gitee `main` with the newest install bundle. If you need a reproducible install, pin a specific release tag instead:
|
|
52
|
-
|
|
53
|
-
```powershell
|
|
1
|
+
# skill-cli
|
|
2
|
+
|
|
3
|
+
CLI for managing and testing [Anthropic Agent Skills](https://agentskills.io) (e.g. [anthropics/skills](https://github.com/anthropics/skills)).
|
|
4
|
+
|
|
5
|
+
## Install (npm)
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
# Global
|
|
9
|
+
npm install -g ancoder-skill-cli
|
|
10
|
+
|
|
11
|
+
# Or run without installing
|
|
12
|
+
npx ancoder-skill-cli --help
|
|
13
|
+
```
|
|
14
|
+
|
|
15
|
+
The npm package is self-contained and includes prebuilt binaries for:
|
|
16
|
+
|
|
17
|
+
- macOS arm64
|
|
18
|
+
- macOS x64
|
|
19
|
+
- Linux arm64
|
|
20
|
+
- Linux x64
|
|
21
|
+
- Windows x64
|
|
22
|
+
|
|
23
|
+
After install, the wrapper selects the correct bundled binary for the current platform automatically.
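
As a rough illustration of that selection step, the sketch below maps a platform/architecture pair to one of the bundled `bin/targets/` file names. It is written in Python for brevity and is not the logic in `bin/skill-cli.js`; the function name `bundled_binary_name` and the choice of Node-style identifiers (`darwin`, `linux`, `win32`, `arm64`, `x64`) as inputs are assumptions inferred from the file names.

```python
# Hypothetical sketch: illustrates the idea only, not the wrapper's real code.
SUPPORTED_TARGETS = {
    ("darwin", "arm64"),
    ("darwin", "x64"),
    ("linux", "arm64"),
    ("linux", "x64"),
    ("win32", "x64"),
}


def bundled_binary_name(platform: str, arch: str) -> str:
    """Return the bin/targets/ file name for a Node-style platform/arch pair."""
    if (platform, arch) not in SUPPORTED_TARGETS:
        raise ValueError(f"no prebuilt binary for {platform}-{arch}")
    # Only the Windows build carries a file extension.
    suffix = ".exe" if platform == "win32" else ""
    return f"skill-cli-{platform}-{arch}{suffix}"
```

Rejecting unknown pairs explicitly, rather than guessing a fallback, keeps an unsupported platform from silently running the wrong binary.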
|
|
24
|
+
|
|
25
|
+
## One-line install (Gitee or intranet)
|
|
26
|
+
|
|
27
|
+
For users who cannot access GitHub, publish the release-only bundle to Gitee or copy these files to an intranet static file server:
|
|
28
|
+
|
|
29
|
+
- `scripts/install.ps1`
|
|
30
|
+
- `scripts/install.sh`
|
|
31
|
+
- `bin/targets/skill-cli-win32-x64.exe`
|
|
32
|
+
- `bin/targets/skill-cli-linux-x64`
|
|
33
|
+
- `bin/targets/skill-cli-linux-arm64`
|
|
34
|
+
- `bin/targets/skill-cli-darwin-arm64`
|
|
35
|
+
- `bin/targets/skill-cli-darwin-x64`
|
|
36
|
+
|
|
37
|
+
Windows users can install the latest Gitee release with PowerShell:
|
|
38
|
+
|
|
39
|
+
```powershell
|
|
40
|
+
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
macOS/Linux users can install the latest Gitee release with curl:
|
|
44
|
+
|
|
45
|
+
```bash
|
|
46
|
+
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | bash
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
The installer downloads the matching prebuilt binary, installs it as `skill-cli`, runs `skill-cli install`, then runs a lite `doctor` check. No Go toolchain is required.
|
|
50
|
+
|
|
51
|
+
The `raw/main` URL is the moving latest channel. Each GitHub release updates Gitee `main` with the newest install bundle. If you need a reproducible install, pin a specific release tag instead:
|
|
52
|
+
|
|
53
|
+
```powershell
|
|
54
54
|
irm https://gitee.com/marvin-dev/skill-cli-release/raw/v0.13.33/scripts/install.ps1 | iex
|
|
55
|
-
```
|
|
56
|
-
|
|
57
|
-
```bash
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
```bash
|
|
58
58
|
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/v0.13.33/scripts/install.sh | bash
|
|
59
|
-
```
|
|
60
|
-
|
|
61
|
-
Before sharing the Gitee command externally, make sure both the Gitee `main` branch and the release tag have been updated. The installer in `main` is generated from the latest release bundle.
|
|
62
|
-
|
|
63
|
-
Gitee release checklist:
|
|
64
|
-
|
|
65
|
-
1. Run `go test ./...`, `bash scripts/build-all.sh`, `node scripts/check-bin.js`, and `npm pack --dry-run`.
|
|
66
|
-
2. Commit `bin/targets/`, `scripts/install.ps1`, `scripts/install.sh`, `scripts/make-install-bundle.js`, `scripts/check-install-host.js`, `scripts/check-bin.js`, `scripts/build-all.sh`, `package.json`, `package-lock.json`, `.gitattributes`, and `README.md`.
|
|
67
|
-
3. Push the release tag to GitHub. The GitHub Actions release workflow builds the binaries and syncs a dist-only bundle to `git@gitee.com:marvin-dev/skill-cli-release.git`.
|
|
68
|
-
4. Verify these URLs return `200` before sending the install command:
|
|
69
|
-
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1`
|
|
70
|
-
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh`
|
|
71
|
-
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/bin/targets/skill-cli-win32-x64.exe`
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
Before sharing the Gitee command externally, make sure both the Gitee `main` branch and the release tag have been updated. The installer in `main` is generated from the latest release bundle.
|
|
62
|
+
|
|
63
|
+
Gitee release checklist:
|
|
64
|
+
|
|
65
|
+
1. Run `go test ./...`, `bash scripts/build-all.sh`, `node scripts/check-bin.js`, and `npm pack --dry-run`.
|
|
66
|
+
2. Commit `bin/targets/`, `scripts/install.ps1`, `scripts/install.sh`, `scripts/make-install-bundle.js`, `scripts/check-install-host.js`, `scripts/check-bin.js`, `scripts/build-all.sh`, `package.json`, `package-lock.json`, `.gitattributes`, and `README.md`.
|
|
67
|
+
3. Push the release tag to GitHub. The GitHub Actions release workflow builds the binaries and syncs a dist-only bundle to `git@gitee.com:marvin-dev/skill-cli-release.git`.
|
|
68
|
+
4. Verify these URLs return `200` before sending the install command:
|
|
69
|
+
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1`
|
|
70
|
+
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh`
|
|
71
|
+
- `https://gitee.com/marvin-dev/skill-cli-release/raw/main/bin/targets/skill-cli-win32-x64.exe`
|
|
72
72
|
- `https://gitee.com/marvin-dev/skill-cli-release/raw/v0.13.33/scripts/install.ps1`
|
|
73
73
|
- `https://gitee.com/marvin-dev/skill-cli-release/raw/v0.13.33/bin/targets/skill-cli-win32-x64.exe`
|
|
74
|
-
Or run:
|
|
75
|
-
|
|
76
|
-
```bash
|
|
77
|
-
node scripts/check-install-host.js \
|
|
78
|
-
--script-root https://gitee.com/marvin-dev/skill-cli-release/raw/main \
|
|
79
|
-
--base-url https://gitee.com/marvin-dev/skill-cli-release/raw/main/bin/targets
|
|
80
|
-
```
|
|
81
|
-
|
|
82
|
-
5. Share the `raw/main` install command only after the check above passes.
|
|
83
|
-
|
|
84
|
-
GitHub Actions Gitee sync setup:
|
|
85
|
-
|
|
86
|
-
1. Create a dedicated SSH key for the Gitee release-only repo:
|
|
87
|
-
|
|
88
|
-
```bash
|
|
89
|
-
ssh-keygen -t ed25519 -C "github-actions skill-cli-release" -f ~/.ssh/skill-cli-release-gitee
|
|
90
|
-
```
|
|
91
|
-
|
|
92
|
-
2. Add the public key `~/.ssh/skill-cli-release-gitee.pub` to Gitee with write access to `git@gitee.com:marvin-dev/skill-cli-release.git`.
|
|
93
|
-
3. Add the private key content from `~/.ssh/skill-cli-release-gitee` to the GitHub repository secret `GITEE_SSH_PRIVATE_KEY`.
|
|
94
|
-
|
|
95
|
-
The workflow only pushes the generated `dist/install/` bundle to Gitee, so the source repository remains GitHub-only.
|
|
96
|
-
|
|
97
|
-
For private intranet hosting, generate a static install bundle and upload the generated `dist/install/` directory as-is:
|
|
98
|
-
|
|
99
|
-
```bash
|
|
100
|
-
npm run install-bundle -- --root https://intranet.example.com/skill-cli
|
|
101
|
-
```
|
|
102
|
-
|
|
103
|
-
That directory contains:
|
|
104
|
-
|
|
105
|
-
- `scripts/install.ps1`
|
|
106
|
-
- `scripts/install.sh`
|
|
107
|
-
- `bin/targets/skill-cli-*`
|
|
108
|
-
- `manifest.json`
|
|
109
|
-
- `README.txt`
|
|
110
|
-
|
|
111
|
-
After uploading `dist/install/` to `https://intranet.example.com/skill-cli/`, users can run:
|
|
112
|
-
|
|
113
|
-
```bash
|
|
114
|
-
node scripts/check-install-host.js --root https://intranet.example.com/skill-cli --all-binaries
|
|
115
|
-
```
|
|
116
|
-
|
|
117
|
-
```powershell
|
|
118
|
-
irm https://intranet.example.com/skill-cli/scripts/install.ps1 | iex
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
```bash
|
|
122
|
-
curl -fsSL https://intranet.example.com/skill-cli/scripts/install.sh | bash
|
|
123
|
-
```
|
|
124
|
-
|
|
125
|
-
If binaries are hosted somewhere else, point the installer at the directory containing the `skill-cli-*` files:
|
|
126
|
-
|
|
127
|
-
```powershell
|
|
128
|
-
$env:SKILL_CLI_BASE_URL = "https://intranet.example.com/skill-cli/bin/targets"
|
|
129
|
-
irm https://intranet.example.com/skill-cli/scripts/install.ps1 | iex
|
|
130
|
-
```
|
|
131
|
-
|
|
132
|
-
```bash
|
|
133
|
-
curl -fsSL https://intranet.example.com/skill-cli/scripts/install.sh | \
|
|
134
|
-
bash -s -- --base-url https://intranet.example.com/skill-cli/bin/targets
|
|
135
|
-
```
|
|
136
|
-
|
|
137
|
-
Useful options:
|
|
138
|
-
|
|
139
|
-
```powershell
|
|
140
|
-
# Install the binary only, without writing ~/.claude components
|
|
141
|
-
$env:SKILL_CLI_NO_INSTALL = "1"
|
|
142
|
-
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
143
|
-
|
|
144
|
-
# For a clean smoke test against a temporary Claude directory
|
|
145
|
-
$env:SKILL_CLI_NO_INSTALL = $null
|
|
146
|
-
$env:SKILL_CLI_CLAUDE_DIR = "$PWD\.skill-cli-test-claude"
|
|
147
|
-
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
148
|
-
```
|
|
149
|
-
|
|
150
|
-
```bash
|
|
151
|
-
# Install the binary only, without writing ~/.claude components
|
|
152
|
-
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | bash -s -- --no-install
|
|
153
|
-
|
|
154
|
-
# For a clean smoke test against a temporary Claude directory
|
|
155
|
-
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | \
|
|
156
|
-
bash -s -- --claude-dir "$PWD/.skill-cli-test-claude"
|
|
157
|
-
```
|
|
158
|
-
|
|
159
|
-
## Build from source (Go)
|
|
160
|
-
|
|
161
|
-
```bash
|
|
162
|
-
cd skill-cli
|
|
163
|
-
go build -o bin/skill-cli .
|
|
164
|
-
# Then run: ./bin/skill-cli --help
|
|
165
|
-
# Or via npm: node bin/skill-cli.js --help
|
|
166
|
-
```
|
|
167
|
-
|
|
168
|
-
## Commands
|
|
169
|
-
|
|
170
|
-
| Command | Description |
|
|
171
|
-
|--------|-------------|
|
|
172
|
-
| `skill-cli validate <path>` | Validate `SKILL.md`, `skill.contract.yaml`, and `evals/*.yaml` |
|
|
173
|
-
| `skill-cli list [--path <dir>]` | List installed skills |
|
|
174
|
-
| `skill-cli create <name> [--path <dir>]` | Create a skill scaffold with contract and smoke eval templates |
|
|
175
|
-
| `skill-cli test <path>` | Check that a skill has trigger docs, contract, and eval coverage |
|
|
176
|
-
| `skill-cli verify <path> [--suite smoke]` | Run a machine-readable verification suite end-to-end |
|
|
177
|
-
| `skill-cli generate <name> --desc "..."` | Generate with
|
|
178
|
-
| `skill-cli
|
|
74
|
+
Or run:
|
|
75
|
+
|
|
76
|
+
```bash
|
|
77
|
+
node scripts/check-install-host.js \
|
|
78
|
+
--script-root https://gitee.com/marvin-dev/skill-cli-release/raw/main \
|
|
79
|
+
--base-url https://gitee.com/marvin-dev/skill-cli-release/raw/main/bin/targets
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
5. Share the `raw/main` install command only after the check above passes.
|
|
83
|
+
|
|
84
|
+
GitHub Actions Gitee sync setup:
|
|
85
|
+
|
|
86
|
+
1. Create a dedicated SSH key for the Gitee release-only repo:
|
|
87
|
+
|
|
88
|
+
```bash
|
|
89
|
+
ssh-keygen -t ed25519 -C "github-actions skill-cli-release" -f ~/.ssh/skill-cli-release-gitee
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
2. Add the public key `~/.ssh/skill-cli-release-gitee.pub` to Gitee with write access to `git@gitee.com:marvin-dev/skill-cli-release.git`.
|
|
93
|
+
3. Add the private key content from `~/.ssh/skill-cli-release-gitee` to the GitHub repository secret `GITEE_SSH_PRIVATE_KEY`.
|
|
94
|
+
|
|
95
|
+
The workflow only pushes the generated `dist/install/` bundle to Gitee, so the source repository remains GitHub-only.
|
|
96
|
+
|
|
97
|
+
For private intranet hosting, generate a static install bundle and upload the generated `dist/install/` directory as-is:
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
npm run install-bundle -- --root https://intranet.example.com/skill-cli
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
That directory contains:
|
|
104
|
+
|
|
105
|
+
- `scripts/install.ps1`
|
|
106
|
+
- `scripts/install.sh`
|
|
107
|
+
- `bin/targets/skill-cli-*`
|
|
108
|
+
- `manifest.json`
|
|
109
|
+
- `README.txt`
|
|
110
|
+
|
|
111
|
+
After uploading `dist/install/` to `https://intranet.example.com/skill-cli/`, users can run:
|
|
112
|
+
|
|
113
|
+
```bash
|
|
114
|
+
node scripts/check-install-host.js --root https://intranet.example.com/skill-cli --all-binaries
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
```powershell
|
|
118
|
+
irm https://intranet.example.com/skill-cli/scripts/install.ps1 | iex
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
curl -fsSL https://intranet.example.com/skill-cli/scripts/install.sh | bash
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
If binaries are hosted somewhere else, point the installer at the directory containing the `skill-cli-*` files:
|
|
126
|
+
|
|
127
|
+
```powershell
|
|
128
|
+
$env:SKILL_CLI_BASE_URL = "https://intranet.example.com/skill-cli/bin/targets"
|
|
129
|
+
irm https://intranet.example.com/skill-cli/scripts/install.ps1 | iex
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
curl -fsSL https://intranet.example.com/skill-cli/scripts/install.sh | \
|
|
134
|
+
bash -s -- --base-url https://intranet.example.com/skill-cli/bin/targets
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
Useful options:
|
|
138
|
+
|
|
139
|
+
```powershell
|
|
140
|
+
# Install the binary only, without writing ~/.claude components
|
|
141
|
+
$env:SKILL_CLI_NO_INSTALL = "1"
|
|
142
|
+
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
143
|
+
|
|
144
|
+
# For a clean smoke test against a temporary Claude directory
|
|
145
|
+
$env:SKILL_CLI_NO_INSTALL = $null
|
|
146
|
+
$env:SKILL_CLI_CLAUDE_DIR = "$PWD\.skill-cli-test-claude"
|
|
147
|
+
irm https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.ps1 | iex
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
```bash
|
|
151
|
+
# Install the binary only, without writing ~/.claude components
|
|
152
|
+
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | bash -s -- --no-install
|
|
153
|
+
|
|
154
|
+
# For a clean smoke test against a temporary Claude directory
|
|
155
|
+
curl -fsSL https://gitee.com/marvin-dev/skill-cli-release/raw/main/scripts/install.sh | \
|
|
156
|
+
bash -s -- --claude-dir "$PWD/.skill-cli-test-claude"
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
## Build from source (Go)
|
|
160
|
+
|
|
161
|
+
```bash
|
|
162
|
+
cd skill-cli
|
|
163
|
+
go build -o bin/skill-cli .
|
|
164
|
+
# Then run: ./bin/skill-cli --help
|
|
165
|
+
# Or via npm: node bin/skill-cli.js --help
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
## Commands
|
|
169
|
+
|
|
170
|
+
| Command | Description |
|
|
171
|
+
|--------|-------------|
|
|
172
|
+
| `skill-cli validate <path>` | Validate `SKILL.md`, `skill.contract.yaml`, and `evals/*.yaml` |
|
|
173
|
+
| `skill-cli list [--path <dir>]` | List installed skills |
|
|
174
|
+
| `skill-cli create <name> [--path <dir>]` | Create a skill scaffold with contract and smoke eval templates |
|
|
175
|
+
| `skill-cli test <path>` | Check that a skill has trigger docs, contract, and eval coverage |
|
|
176
|
+
| `skill-cli verify <path> [--suite smoke]` | Run a machine-readable verification suite end-to-end |
|
|
177
|
+
| `skill-cli generate <name> --desc "..."` | Generate with the default OMC pipeline, real-task feedback, and skill self-loop scaffolding |
|
|
178
|
+
| `skill-cli improve <path-or-name> --logs <dir>` | Improve an existing skill through the default self-iteration loop |
|
|
179
|
+
| `skill-cli loop skill run <path-or-name> --logs <dir>` | Run explicit skill self-iteration cycles with state, budget, and run logs |
|
|
180
|
+
| `skill-cli generate <name> --desc "..." --adversarial` | Also run isolated generator/evaluator contract negotiation and review |
|
|
179
181
|
| `skill-cli install [--no-omc]` | Install ECC components into `~/.claude/` (includes OMC by default) |
|
|
180
182
|
| `skill-cli install --component omc` | Install only the bundled OMC multi-agent orchestration layer |
|
|
181
183
|
|
|
182
|
-
##
|
|
184
|
+
## Skill Self-Iteration Loops
|
|
183
185
|
|
|
184
|
-
|
|
186
|
+
Generated and improved skills now use a loop-engineering operating layer by default. Each skill can carry:
|
|
185
187
|
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
├── evals/
|
|
191
|
-
│ └── smoke.yaml
|
|
192
|
-
├── fixtures/
|
|
193
|
-
└── scripts/
|
|
194
|
-
```
|
|
188
|
+
- `LOOP.md` for cadence, gates, and escalation rules
|
|
189
|
+
- `STATE.md` for durable memory across runs
|
|
190
|
+
- `loop-budget.md` for token budget and kill switch
|
|
191
|
+
- `loop-run-log.md` for append-only cycle history
|
|
195
192
|
|
|
196
|
-
- `skill
|
|
197
|
-
- `evals/*.yaml` defines runnable verification suites with deterministic checks like file existence, required content, and JSON assertions.
|
|
198
|
-
- `skill-cli verify` materializes fixture data into a temp workspace, runs the skill entrypoint, and enforces the declared checks.
|
|
199
|
-
|
|
200
|
-
`skill-cli verify` executes local code declared by the skill contract, so only run it against trusted skills and repositories.
|
|
193
|
+
Use `skill-cli loop skill audit <skill>` to inspect readiness, `skill-cli loop skill init <skill>` to add the files to an older skill, and `skill-cli loop skill run <skill> --logs ./logs` to run evidence-backed self-improvement cycles.
|
|
201
194
|
|
|
195
|
+
## Machine-Readable Skill Layout
|
|
196
|
+
|
|
197
|
+
Task-oriented skills can now include a deterministic verification harness:
|
|
198
|
+
|
|
199
|
+
```text
|
|
200
|
+
my-skill/
|
|
201
|
+
├── SKILL.md
|
|
202
|
+
├── skill.contract.yaml
|
|
203
|
+
├── evals/
|
|
204
|
+
│ └── smoke.yaml
|
|
205
|
+
├── fixtures/
|
|
206
|
+
└── scripts/
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
- `skill.contract.yaml` defines the executable contract: entrypoint, inputs, outputs, invariants, and datasets.
|
|
210
|
+
- `evals/*.yaml` defines runnable verification suites with deterministic checks like file existence, required content, and JSON assertions.
|
|
211
|
+
- `skill-cli verify` materializes fixture data into a temp workspace, runs the skill entrypoint, and enforces the declared checks.
|
|
212
|
+
|
|
213
|
+
`skill-cli verify` executes local code declared by the skill contract, so only run it against trusted skills and repositories.
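
The three kinds of deterministic checks named above (file existence, required content, JSON assertions) could be evaluated along the following lines. This is a minimal sketch under stated assumptions: the check dictionaries and their keys (`type`, `path`, `text`, `key`, `equals`) are hypothetical and are not the schema of `evals/*.yaml`, and `run_check` / `run_suite` are illustrative names, not part of skill-cli.

```python
import json
from pathlib import Path


def run_check(workspace: Path, check: dict) -> bool:
    """Evaluate one deterministic check against files in `workspace`.

    The check shape used here is hypothetical, not skill-cli's eval schema.
    """
    target = workspace / check["path"]
    kind = check["type"]
    if kind == "file_exists":
        return target.is_file()
    if kind == "contains":
        return target.is_file() and check["text"] in target.read_text(encoding="utf-8")
    if kind == "json_equals":
        if not target.is_file():
            return False
        data = json.loads(target.read_text(encoding="utf-8"))
        # Only top-level keys of a JSON object are supported in this sketch.
        return isinstance(data, dict) and data.get(check["key"]) == check["equals"]
    raise ValueError(f"unknown check type: {kind}")


def run_suite(workspace: Path, checks: list[dict]) -> list[str]:
    """Return the ids of failing checks; an empty list means the suite passed."""
    return [c["id"] for c in checks if not run_check(workspace, c)]
```

Because every check depends only on files in the workspace, two runs over the same outputs give the same verdict, which is the property that makes such a suite deterministic.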
|
|
214
|
+
|
|
202
215
|
## Adversarial Skill Generation
|
|
203
216
|
|
|
204
|
-
`skill-cli generate`
|
|
217
|
+
`skill-cli generate` uses the OMC pipeline by default. For stricter experimental review, add `--adversarial` to run an independent evaluator pass after the OMC generation pipeline:
|
|
205
218
|
|
|
206
219
|
```bash
|
|
207
|
-
skill-cli generate pdf-to-md --desc "Convert PDF files to Markdown"
|
|
220
|
+
skill-cli generate pdf-to-md --desc "Convert PDF files to Markdown" --adversarial
|
|
208
221
|
```
|
|
209
|
-
|
|
210
|
-
This mode uses two isolated Claude CLI contexts:
|
|
211
|
-
|
|
212
|
-
- A generator-side `claude -p` process proposes concrete acceptance criteria for the generated skill.
|
|
213
|
-
- An evaluator-side `claude -p` process negotiates the contract, runs `skill-cli validate`, `skill-cli test`, and `skill-cli verify`, and writes `.adversarial/diff-report.json`.
|
|
214
|
-
|
|
222
|
+
|
|
223
|
+
This mode uses two isolated Claude CLI contexts:
|
|
224
|
+
|
|
225
|
+
- A generator-side `claude -p` process proposes concrete acceptance criteria for the generated skill.
|
|
226
|
+
- An evaluator-side `claude -p` process negotiates the contract, runs `skill-cli validate`, `skill-cli test`, and `skill-cli verify`, and writes `.adversarial/diff-report.json`.
|
|
227
|
+
|
|
215
228
|
The evaluator fails the run when critical issues are found or the score is below `0.80`. Deterministic gate failures are always converted into critical diff items and cap the score below `0.60`.
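
The pass/fail rule in the previous paragraph can be sketched as a small function. The thresholds `0.80` and `0.60` come from the text; the function name `evaluate_run`, its parameters, and the exact capped value `0.59` are assumptions made for illustration (the README only says the score is capped below `0.60`).

```python
PASS_THRESHOLD = 0.80    # from the README: runs scoring below this fail
GATE_FAILURE_CAP = 0.59  # assumption: any value strictly below 0.60 would do


def evaluate_run(score: float, critical_issues: int, gate_failures: int) -> tuple[float, bool]:
    """Return (final_score, passed) under the rules described above."""
    if gate_failures > 0:
        # Each deterministic gate failure is counted as a critical diff item ...
        critical_issues += gate_failures
        # ... and the score is capped below 0.60.
        score = min(score, GATE_FAILURE_CAP)
    passed = critical_issues == 0 and score >= PASS_THRESHOLD
    return score, passed
```

Under this reading, a run with a high evaluator score still fails whenever a deterministic gate fails, since the gate failure both adds a critical item and pulls the score under the pass threshold.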
|
|
216
229
|
|
|
217
|
-
Use `--
|
|
218
|
-
|
|
219
|
-
### Generation failure logs
|
|
220
|
-
|
|
221
|
-
Every `skill-cli generate` run writes a local JSONL session log under:
|
|
222
|
-
|
|
223
|
-
```text
|
|
224
|
-
<skills-dir>/.skill-cli/logs/generate/<skill-name>-<timestamp>.jsonl
|
|
225
|
-
```
|
|
226
|
-
|
|
227
|
-
|
|
228
|
-
|
|
229
|
-
## Publish to npm
|
|
230
|
-
|
|
231
|
-
Releases are published by GitHub Actions, not by running `npm publish` from a developer machine. The workflow in `.github/workflows/release.yml` runs when a `v*` tag is pushed; it builds the release binaries, creates the GitHub Release, syncs a release-only install bundle to Gitee, prepares `bin/targets/` for the npm package, and publishes with the repository `NPM_TOKEN` secret.
|
|
232
|
-
|
|
233
|
-
1. Bump `package.json` and `package-lock.json` to the intended version.
|
|
234
|
-
2. Run local verification:
|
|
235
|
-
|
|
236
|
-
```bash
|
|
237
|
-
go test ./...
|
|
238
|
-
bash scripts/build-all.sh
|
|
239
|
-
node scripts/check-bin.js
|
|
240
|
-
npm pack --dry-run
|
|
241
|
-
```
|
|
242
|
-
|
|
243
|
-
3. Commit the source changes, version files, and regenerated binaries under `bin/targets/`.
|
|
244
|
-
4. Create and push the release tag:
|
|
245
|
-
|
|
246
|
-
```bash
|
|
247
|
-
git tag v0.13.10
|
|
248
|
-
git push origin v0.13.10
|
|
249
|
-
```
|
|
250
|
-
|
|
251
|
-
The CI release publishes binaries named:
|
|
252
|
-
|
|
253
|
-
- `skill-cli-darwin-arm64`, `skill-cli-darwin-x64`
|
|
254
|
-
- `skill-cli-linux-x64`, `skill-cli-linux-arm64`
|
|
255
|
-
- `skill-cli-win32-x64.exe`
|
|
256
|
-
|
|
257
|
-
Users who `npm install -g ancoder-skill-cli` get a fully bundled package. No extra binary download is required during install.
|
|
258
|
-
|
|
259
|
-
## Test-Driven Skill Development (100:10:1 Architecture)
|
|
260
|
-
|
|
261
|
-
skill-cli adopts a test-driven approach to skill development, inspired by [oh-my-claudecode](https://github.com/Yeachan-Heo/oh-my-claudecode)'s multi-agent orchestration patterns. The core principle: **invest the majority of compute in building robust test skills, not the skill itself.**
|
|
262
|
-
|
|
263
|
-
### Time Allocation: 100:10:1
|
|
264
|
-
|
|
265
|
-
When creating a skill for a task, the system simultaneously creates a **main skill** and a **test skill**:
|
|
266
|
-
|
|
267
|
-
| Phase | Time Share | Purpose |
|
|
268
|
-
|-------|-----------|---------|
|
|
269
|
-
| Test skill development | 90% (100 units) | Build an automated evaluator that compares expected vs actual output, locating specific differences |
|
|
270
|
-
| Main skill development | 9% (10 units) | Implement the actual skill, guided by test skill feedback |
|
|
271
|
-
| Execution & verification | 1% (1 unit) | Final end-to-end smoke test |
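
The percentages in the table are rounded. A short worked calculation, assuming the shares are simply each phase's units divided by the 111-unit total, shows where they come from (the helper name `shares` is illustrative):

```python
def shares(units: dict[str, int]) -> dict[str, float]:
    """Convert ratio units into percentage shares of the total."""
    total = sum(units.values())
    return {phase: 100 * n / total for phase, n in units.items()}


ratio = {"test skill": 100, "main skill": 10, "verification": 1}
for phase, pct in shares(ratio).items():
    print(f"{phase}: {pct:.1f}%")
# test skill: 90.1%
# main skill: 9.0%
# verification: 0.9%
```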
|
|
272
|
-
|
|
273
|
-
### Architecture
|
|
274
|
-
|
|
275
|
-
```text
|
|
276
|
-
Phase 1: Test Skill Development (90% compute)
|
|
277
|
-
generate structured acceptance criteria
|
|
278
|
-
-> N planners generate test strategies in parallel
|
|
279
|
-
-> critic reviews + eliminates weak strategies
|
|
280
|
-
-> N executors implement test skills in parallel
|
|
281
|
-
-> golden test evaluation (tournament selection)
|
|
282
|
-
-> repeat until precision threshold met
|
|
283
|
-
-> best test skill selected
|
|
284
|
-
|
|
285
|
-
Phase 2: Main Skill Development (9% compute)
|
|
286
|
-
generate main skill
|
|
287
|
-
-> test skill verifies (independent executor)
|
|
288
|
-
-> structured diff feedback injected into next prompt
|
|
289
|
-
-> repeat until test skill passes
|
|
290
|
-
-> main skill complete
|
|
291
|
-
|
|
292
|
-
Phase 3: Final Verification (1% compute)
|
|
293
|
-
end-to-end smoke test
|
|
294
|
-
```
|
|
295
|
-
|
|
296
|
-
### Key Design Principles
|
|
297
|
-
|
|
298
|
-
**1. Separation of Author and Reviewer**
|
|
299
|
-
|
|
300
|
-
The agent that generates the main skill and the agent that runs the test skill operate in separate contexts. This prevents self-approval bias. The verify phase spawns an independent executor to run the test skill, ensuring honest evaluation (borrowed from OMC's verifier lane pattern).
|
|
301
|
-
|
|
302
|
-
**2. Structured Diff Feedback**
|
|
303
|
-
|
|
304
|
-
Test skills output structured diff reports instead of simple pass/fail:
|
|
305
|
-
|
|
306
|
-
```yaml
|
|
307
|
-
diffs:
|
|
308
|
-
- location: "page 3, paragraph 2"
|
|
309
|
-
type: "content_loss"
|
|
310
|
-
severity: "critical"
|
|
311
|
-
expected: "table with 3 columns and 5 rows"
|
|
312
|
-
actual: "table missing entirely"
|
|
313
|
-
- location: "page 5, heading"
|
|
314
|
-
type: "format_drift"
|
|
315
|
-
severity: "warning"
|
|
316
|
-
expected: "## Second-level heading"
|
|
317
|
-
actual: "### Third-level heading"
|
|
318
|
-
```
|
|
319
|
-
|
|
320
|
-
This structured feedback is injected back into the main skill's improvement loop, enabling targeted fixes rather than blind retries.
|
|
321
|
-
|
|
322
|
-
**3. QA Cycling with Early Exit**
|
|
323
|
-
|
|
324
|
-
Borrowed from OMC's UltraQA pattern:
|
|
325
|
-
- Test skill finds issues -> structured diagnosis -> main skill fixes -> retest -> loop
|
|
326
|
-
- Same error appearing 3 times triggers early exit (avoids infinite compute burn)
|
|
327
|
-
- Maximum 5 QA cycles per iteration
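
The cycling rules above can be sketched as a bounded loop. The limits (3 repeats, 5 cycles) come from the list; everything else is an assumption for illustration: `run_test` and `apply_fix` are hypothetical stand-ins for the test skill and the main skill's fix step, and errors are assumed to be comparable by a simple string signature.

```python
from collections import Counter
from typing import Callable, Optional

MAX_CYCLES = 5      # from the README: maximum QA cycles per iteration
MAX_SAME_ERROR = 3  # from the README: same error seen 3 times -> early exit


def qa_cycle(run_test: Callable[[], Optional[str]], apply_fix: Callable[[str], None]) -> str:
    """Run test/fix cycles; return "passed", "early_exit", or "max_cycles".

    `run_test` returns None on success or an error signature string.
    """
    seen: Counter = Counter()
    for _ in range(MAX_CYCLES):
        error = run_test()
        if error is None:
            return "passed"
        seen[error] += 1
        if seen[error] >= MAX_SAME_ERROR:
            # The same failure keeps recurring, so stop spending compute on it.
            return "early_exit"
        apply_fix(error)
    return "max_cycles"
```

Counting repeats per error signature, rather than counting consecutive failures, means the loop also exits early when the same error comes back after an unrelated failure in between.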
|
|
328
|
-
|
|
329
|
-
**4. Tournament Selection for Test Skills**
|
|
330
|
-
|
|
331
|
-
During the 90% test skill development phase, multiple test strategies are generated in parallel and evaluated against golden tests (known-correct input/output pairs). The strategy with the highest detection precision wins, similar to OMC's self-improve tournament selection.
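
A minimal sketch of that selection step follows. It assumes "detection precision" means the fraction of flagged golden cases that really contain a defect, and it models each test strategy as a plain callable; the names `precision`, `select_best`, and `GoldenCase` are hypothetical and do not come from skill-cli or OMC.

```python
from typing import Callable

# (output under test, whether it truly contains a defect)
GoldenCase = tuple[str, bool]


def precision(detector: Callable[[str], bool], golden: list[GoldenCase]) -> float:
    """Fraction of flagged cases that are real defects (0.0 if nothing is flagged)."""
    flagged = [has_defect for output, has_defect in golden if detector(output)]
    return sum(flagged) / len(flagged) if flagged else 0.0


def select_best(detectors: dict[str, Callable[[str], bool]], golden: list[GoldenCase]) -> str:
    """Return the name of the detector with the highest precision on the golden set."""
    return max(detectors, key=lambda name: precision(detectors[name], golden))
```

Precision alone rewards a strategy that flags very little, so a fuller version would probably also weigh recall; that trade-off is outside what the text above specifies.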
|
|
332
|
-
|
|
333
|
-
**5. PRD-Driven Acceptance Criteria**
|
|
334
|
-
|
|
335
|
-
Test skills define concrete, testable acceptance criteria (not vague "implementation is complete"):
|
|
336
|
-
|
|
337
|
-
```text
|
|
338
|
-
Bad: "PDF conversion works correctly"
|
|
339
|
-
Good: "All tables with merged cells are preserved as HTML <table> blocks
|
|
340
|
-
with correct colspan/rowspan attributes"
|
|
341
|
-
```
|
|
342
|
-
|
|
343
|
-
### Example: PDF-to-Markdown Skill
|
|
344
|
-
|
|
345
|
-
For a PDF-to-Markdown conversion skill:
|
|
346
|
-
|
|
347
|
-
- **Test skill** (100 min): Compares original PDF content with generated Markdown, detecting content loss (missing paragraphs, tables, images), format drift (heading levels, list styles), and encoding issues. Outputs structured diffs with page/paragraph-level location info.
|
|
348
|
-
- **Main skill** (10 min): Implements PDF parsing and Markdown generation, iteratively improved by test skill feedback.
|
|
349
|
-
- **Verification** (1 min): End-to-end smoke test on fixture PDFs.
|
|
350
|
-
|
|
351
|
-
### `skill_eval` Check Type
|
|
352
|
-
|
|
353
|
-
The verify system supports a `skill_eval` check type that invokes a test skill as a verification oracle:
|
|
354
|
-
|
|
355
|
-
```yaml
|
|
356
|
-
checks:
|
|
357
|
-
- id: quality-check
|
|
358
|
-
type: skill_eval
|
|
359
|
-
skill: pdf-to-md-test
|
|
360
|
-
config:
|
|
361
|
-
threshold: 0.95
|
|
362
|
-
output_format: structured_diff
|
|
363
|
-
```
|
|
364
|
-
|
|
365
|
-
### Verify Phase: Independent Executor
|
|
366
|
-
|
|
367
|
-
During the loop's verify phase, a separate Claude executor is spawned to run the test skill. This executor:
|
|
368
|
-
- Has no shared context with the main skill's executor
|
|
369
|
-
- Produces an objective evaluation report
|
|
370
|
-
- Returns structured diff feedback that feeds into the next iteration
|
|
371
|
-
|
|
372
|
-
This mirrors OMC's principle: "Keep authoring and review as separate passes."
|
|
373
|
-
|
|
374
|
-
## oh-my-claudecode (OMC) Integration
|
|
375
|
-
|
|
376
|
-
skill-cli embeds the full [oh-my-claudecode](https://github.com/Yeachan-Heo/oh-my-claudecode) multi-agent orchestration bundle (synced from GitHub release **v4.13.6**) and installs it into `~/.claude/omc/` by default. This gives any skill-cli user a single-command path to OMC's agents, skills, hooks, and runtime scripts without needing to clone the OMC repo or configure the plugin marketplace separately.
|
|
377
|
-
|
|
378
|
-
### What gets installed
|
|
379
|
-
|
|
380
|
-
When you run `skill-cli install`, OMC is installed alongside ECC components:
|
|
381
|
-
|
|
382
|
-
| OMC asset | Install target |
|
|
383
|
-
|-----------|---------------|
|
|
384
|
-
| 19 agents (analyst, architect, executor, planner, critic, verifier, …) | `~/.claude/omc/agents/` |
|
|
385
|
-
| 38 skills (autopilot, ralph, ralplan, deep-interview, team, ultrawork, ultraqa, self-improve, …) | `~/.claude/omc/skills/` |
|
|
386
|
-
| Runtime scripts (hook helpers, session lifecycle, skill injector, …) | `~/.claude/omc/scripts/` (executable bit preserved for `.sh`/`.mjs`/`.cjs`/`.js`/`.ts`) |
|
|
387
|
-
| `hooks.json` | Merged into `~/.claude/settings.json` with `$CLAUDE_PLUGIN_ROOT` rewritten to the absolute OMC install path |
|
|
388
|
-
| Templates | `~/.claude/omc/templates/` |
|
|
389
|
-
| `.claude-plugin/` manifest, LICENSE, CHANGELOG, VERSION | `~/.claude/omc/` |
|
|
390
|
-
|
|
391
|
-
### Flags
|
|
392
|
-
|
|
393
|
-
```bash
|
|
394
|
-
# Default — installs ECC + OMC
|
|
395
|
-
skill-cli install
|
|
396
|
-
|
|
397
|
-
# Skip OMC entirely (opt-out)
|
|
398
|
-
skill-cli install --no-omc
|
|
399
|
-
|
|
400
|
-
# Install only the OMC bundle
|
|
401
|
-
skill-cli install --component omc
|
|
402
|
-
|
|
403
|
-
# Preview without writing files
|
|
404
|
-
skill-cli install --dry-run
|
|
405
|
-
```
|
|
406
|
-
|
|
407
|
-
### Browse embedded OMC content
|
|
408
|
-
|
|
409
|
-
```bash
|
|
410
|
-
skill-cli list --type omc # list embedded OMC agents and skills
|
|
411
|
-
skill-cli info autopilot # show the autopilot skill content
|
|
412
|
-
skill-cli doctor # verify OMC install health and version
|
|
413
|
-
```
|
|
414
|
-
|
|
415
|
-
### Why the hook rewrite matters
|
|
416
|
-
|
|
417
|
-
OMC hooks are authored for the Claude Code plugin system and reference scripts via `$CLAUDE_PLUGIN_ROOT/scripts/...`. Because skill-cli installs OMC as a plain directory (not as a marketplace plugin), the installer rewrites `$CLAUDE_PLUGIN_ROOT` → `${claudeDir}/omc` at merge time so hooks resolve correctly without the plugin loader.
|
|
418
|
-
|
|
419
|
-
If you already have OMC installed via the Claude Code plugin marketplace, the skill-cli install places a separate self-contained copy under `~/.claude/omc/` and will not touch the marketplace install. The two copies can coexist; hooks from both sources will simply fire in sequence.
|
|
420
|
-
|
|
421
|
-
### Upgrading OMC
|
|
422
|
-
|
|
423
|
-
The embedded OMC version is pinned to the release tagged in `embedded/omc/VERSION`. To bump it, re-run the sync workflow that downloads a fresh GitHub release tarball into `embedded/omc/` and rebuild.
|
|
424
|
-
|
|
425
|
-
## Meta-Harness (experimental)
|
|
426
|
-
|
|
427
|
-
`meta-harness/` is a Python sub-project that implements the outer-loop harness
|
|
428
|
-
optimizer from [arXiv:2603.28052](https://arxiv.org/abs/2603.28052) (Stanford, 2026).
|
|
429
|
-
|
|
430
|
-
### Architecture
|
|
431
|
-
|
|
432
|
-
```
|
|
433
|
-
meta-harness search ← outer loop (Python, Claude Code proposer)
|
|
434
|
-
│
|
|
435
|
-
└─ skill-cli eval validate / run / ls / diff ← evaluator backend (Go)
|
|
436
|
-
│
|
|
437
|
-
└─ harness.py (user-supplied Python) ← inner execution layer
|
|
438
|
-
```
|
|
439
|
-
|
|
440
|
-
**Two independent binaries — intentionally decoupled:**
|
|
441
|
-
- `skill-cli` knows nothing about `meta-harness`; it only runs harness candidates and emits scores/traces.
|
|
442
|
-
- `meta-harness` knows nothing about OMC internals; it calls `skill-cli` via CLI contract only.
|
|
443
|
-
|
|
444
|
-
### Quick start
|
|
445
|
-
|
|
446
|
-
```bash
|
|
447
|
-
# Build skill-cli
|
|
448
|
-
go build -o bin/skill-cli .
|
|
449
|
-
|
|
450
|
-
# Install meta-harness
|
|
451
|
-
cd meta-harness
|
|
452
|
-
python3 -m venv .venv && source .venv/bin/activate
|
|
453
|
-
pip install -e ".[dev]"
|
|
454
|
-
|
|
455
|
-
# Run smoke test (no API key needed)
|
|
456
|
-
cd ..
|
|
457
|
-
bash scripts/meta-harness-smoke.sh
|
|
458
|
-
|
|
459
|
-
# Real search (requires ANTHROPIC_API_KEY + claude CLI)
|
|
460
|
-
meta-harness search \
|
|
461
|
-
--suite meta-harness/domains/text_classification/suite.yaml \
|
|
462
|
-
--out search-runs/run-01 \
|
|
463
|
-
--max-iter 5 \
|
|
464
|
-
--k 2 \
|
|
465
|
-
--seed meta-harness/domains/text_classification/seeds/zero_shot.py \
|
|
466
|
-
--seed meta-harness/domains/text_classification/seeds/few_shot.py \
|
|
467
|
-
--skill-cli bin/skill-cli \
|
|
468
|
-
--samples 20
|
|
469
|
-
```
|
|
470
|
-
|
|
471
|
-
### CLI contract (skill-cli eval)
|
|
472
|
-
|
|
473
|
-
| Command | Description |
|
|
474
|
-
|---|---|
|
|
475
|
-
| `skill-cli eval validate <dir>` | Cheap structural check (exit 0 = valid) |
|
|
476
|
-
| `skill-cli eval run <dir> --suite <f> --out <d>` | Full eval → scores.json + traces/ |
|
|
477
|
-
| `skill-cli eval ls --store <d> [--pareto]` | List / filter candidates |
|
|
478
|
-
| `skill-cli eval diff <a> <b> --store <d>` | Code + score diff |
|
|
479
|
-
|
|
480
|
-
### Tuning
|
|
481
|
-
|
|
482
|
-
The `meta-harness/src/meta_harness/skill.md` file is the most important lever on search quality.
|
|
483
|
-
Per Appendix D of the paper: run 3–5 short iterations (`--max-iter 3`) specifically to
|
|
484
|
-
debug and refine it before committing to a full run.
|
|
485
|
-
|
|
486
|
-
## License
|
|
487
|
-
|
|
488
|
-
MIT
|
|
230
|
+
Use the default command without `--adversarial` for the OMC-only generation path.
|
|
231
|
+
|
|
232
|
+
### Generation failure logs
|
|
233
|
+
|
|
234
|
+
Every `skill-cli generate` run writes a local JSONL session log under:
|
|
235
|
+
|
|
236
|
+
```text
|
|
237
|
+
<skills-dir>/.skill-cli/logs/generate/<skill-name>-<timestamp>.jsonl
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
Each run also prints the exact log path. The log records the major stages: Claude binary resolution, scaffold creation, brainstorm probe, loop start and ticks, deterministic gates, repair attempts, real-task feedback artifacts, and optional adversarial review failures. Entries include captured command output and error tails for troubleshooting.
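
Because the session log is JSONL (one JSON object per line), it can be filtered with a few lines of Python. The sketch below is illustrative only: the README does not document the log schema, so the `status` field and the `failed` value are assumptions to adjust after inspecting a real log.

```python
import json


def parse_jsonl(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def failures(events, key="status", bad="failed"):
    """Return events whose `key` equals `bad`.

    Both `key` and `bad` are hypothetical; the real field names
    are not specified in this README.
    """
    return [event for event in events if event.get(key) == bad]
```

To use it, read the path printed by `skill-cli generate` (for example with `pathlib.Path(path).read_text()`) and pass the text to `parse_jsonl`.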
|
|
241
|
+
|
|
242
|
+
## Publish to npm
|
|
243
|
+
|
|
244
|
+
Releases are published by GitHub Actions, not by running `npm publish` from a developer machine. The workflow in `.github/workflows/release.yml` runs when a `v*` tag is pushed; it builds the release binaries, creates the GitHub Release, syncs a release-only install bundle to Gitee, prepares `bin/targets/` for the npm package, and publishes with the repository `NPM_TOKEN` secret.
|
|
245
|
+
|
|
246
|
+
1. Bump `package.json` and `package-lock.json` to the intended version.
|
|
247
|
+
2. Run local verification:
|
|
248
|
+
|
|
249
|
+
```bash
|
|
250
|
+
go test ./...
|
|
251
|
+
bash scripts/build-all.sh
|
|
252
|
+
node scripts/check-bin.js
|
|
253
|
+
npm pack --dry-run
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
3. Commit the source changes, version files, and regenerated binaries under `bin/targets/`.
|
|
257
|
+
4. Create and push the release tag:
|
|
258
|
+
|
|
259
|
+
```bash
|
|
260
|
+
git tag v0.13.10
|
|
261
|
+
git push origin v0.13.10
|
|
262
|
+
```
|
|
263
|
+
|
|
264
|
+
The CI release publishes binaries named:
|
|
265
|
+
|
|
266
|
+
- `skill-cli-darwin-arm64`, `skill-cli-darwin-x64`
|
|
267
|
+
- `skill-cli-linux-x64`, `skill-cli-linux-arm64`
|
|
268
|
+
- `skill-cli-win32-x64.exe`
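
The naming scheme above follows a `skill-cli-<platform>-<arch>` pattern, with an `.exe` suffix on Windows. A minimal sketch of how a wrapper could derive the file name is shown below. It assumes Node-style platform and architecture strings (`darwin`, `linux`, `win32`; `arm64`, `x64`) and is not the actual logic in `bin/skill-cli.js`; the `binary_name` helper is hypothetical.

```python
# Supported (platform, arch) pairs, taken from the binary list above.
SUPPORTED = {
    ("darwin", "arm64"),
    ("darwin", "x64"),
    ("linux", "arm64"),
    ("linux", "x64"),
    ("win32", "x64"),
}


def binary_name(platform, arch):
    """Return the bundled binary file name for a platform/arch pair."""
    if (platform, arch) not in SUPPORTED:
        raise ValueError(f"no bundled binary for {platform}-{arch}")
    suffix = ".exe" if platform == "win32" else ""
    return f"skill-cli-{platform}-{arch}{suffix}"
```

Raising on an unsupported pair, instead of falling back to a default, keeps a wrong-architecture binary from being launched by accident.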
|
|
269
|
+
|
|
270
|
+
Users who `npm install -g ancoder-skill-cli` get a fully bundled package. No extra binary download is required during install.
|
|
271
|
+
|
|
272
|
+
## Test-Driven Skill Development (100:10:1 Architecture)
|
|
273
|
+
|
|
274
|
+
skill-cli adopts a test-driven approach to skill development, inspired by [oh-my-claudecode](https://github.com/Yeachan-Heo/oh-my-claudecode)'s multi-agent orchestration patterns. The core principle: **invest the majority of compute in building robust test skills, not the skill itself.**
|
|
275
|
+
|
|
276
|
+
### Time Allocation: 100:10:1
|
|
277
|
+
|
|
278
|
+
When creating a skill for a task, the system simultaneously creates a **main skill** and a **test skill**:
|
|
279
|
+
|
|
280
|
+
| Phase | Time Share | Purpose |
|
|
281
|
+
|-------|-----------|---------|
|
|
282
|
+
| Test skill development | 90% (100 units) | Build an automated evaluator that compares expected vs actual output, locating specific differences |
|
|
283
|
+
| Main skill development | 9% (10 units) | Implement the actual skill, guided by test skill feedback |
|
|
284
|
+
| Execution & verification | 1% (1 unit) | Final end-to-end smoke test |
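
The percentages in the table are rounded. A 100:10:1 ratio has 111 units in total, so the exact shares are roughly 90.1%, 9.0%, and 0.9%. The sketch below shows the arithmetic; `split_budget` is a hypothetical helper, not part of skill-cli.

```python
def split_budget(total, ratio=(100, 10, 1)):
    """Split a time or compute budget across phases in proportion to `ratio`."""
    units = sum(ratio)
    return [total * part / units for part in ratio]


def shares_percent(ratio=(100, 10, 1)):
    """Return each phase's share of the total, in percent, to one decimal."""
    units = sum(ratio)
    return [round(100 * part / units, 1) for part in ratio]
```

For example, a 111-minute budget splits into 100, 10, and 1 minutes, which matches the per-phase times used in the PDF-to-Markdown example in this README.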
|
|
285
|
+
|
|
286
|
+
### Architecture
|
|
287
|
+
|
|
288
|
+
```text
|
|
289
|
+
Phase 1: Test Skill Development (90% compute)
|
|
290
|
+
generate structured acceptance criteria
|
|
291
|
+
-> N planners generate test strategies in parallel
|
|
292
|
+
-> critic reviews + eliminates weak strategies
|
|
293
|
+
-> N executors implement test skills in parallel
|
|
294
|
+
-> golden test evaluation (tournament selection)
|
|
295
|
+
-> repeat until precision threshold met
|
|
296
|
+
-> best test skill selected
|
|
297
|
+
|
|
298
|
+
Phase 2: Main Skill Development (9% compute)
|
|
299
|
+
generate main skill
|
|
300
|
+
-> test skill verifies (independent executor)
|
|
301
|
+
-> structured diff feedback injected into next prompt
|
|
302
|
+
-> repeat until test skill passes
|
|
303
|
+
-> main skill complete
|
|
304
|
+
|
|
305
|
+
Phase 3: Final Verification (1% compute)
|
|
306
|
+
end-to-end smoke test
|
|
307
|
+
```
|
|
308
|
+
|
|
309
|
+
### Key Design Principles
|
|
310
|
+
|
|
311
|
+
**1. Separation of Author and Reviewer**
|
|
312
|
+
|
|
313
|
+
The agent that generates the main skill and the agent that runs the test skill operate in separate contexts. This prevents self-approval bias. The verify phase spawns an independent executor to run the test skill, ensuring honest evaluation (borrowed from OMC's verifier lane pattern).
|
|
314
|
+
|
|
315
|
+
**2. Structured Diff Feedback**
|
|
316
|
+
|
|
317
|
+
Test skills output structured diff reports instead of simple pass/fail:
|
|
318
|
+
|
|
319
|
+
```yaml
|
|
320
|
+
diffs:
|
|
321
|
+
- location: "page 3, paragraph 2"
|
|
322
|
+
type: "content_loss"
|
|
323
|
+
severity: "critical"
|
|
324
|
+
expected: "table with 3 columns and 5 rows"
|
|
325
|
+
actual: "table missing entirely"
|
|
326
|
+
- location: "page 5, heading"
|
|
327
|
+
type: "format_drift"
|
|
328
|
+
severity: "warning"
|
|
329
|
+
expected: "## Second-level heading"
|
|
330
|
+
actual: "### Third-level heading"
|
|
331
|
+
```
|
|
332
|
+
|
|
333
|
+
This structured feedback is injected back into the main skill's improvement loop, enabling targeted fixes rather than blind retries.
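
One way to picture the injection step is a small formatter that turns the diff entries into prompt text, most severe first. This is a sketch under assumptions: it uses the field names from the YAML example above, and the output wording and the `format_feedback` helper are hypothetical rather than skill-cli's actual prompt format.

```python
# Lower number sorts first; unknown severities sort last.
SEVERITY_ORDER = {"critical": 0, "warning": 1}


def format_feedback(diffs):
    """Render diff entries as feedback lines, ordered by severity."""
    ordered = sorted(diffs, key=lambda d: SEVERITY_ORDER.get(d["severity"], 99))
    lines = []
    for d in ordered:
        lines.append(
            f"[{d['severity']}] {d['type']} at {d['location']}: "
            f"expected {d['expected']!r}, got {d['actual']!r}"
        )
    return "\n".join(lines)
```

Sorting by severity means that if the feedback has to be truncated to fit a prompt budget, critical items are the last to be dropped.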
|
|
334
|
+
|
|
335
|
+
**3. QA Cycling with Early Exit**
|
|
336
|
+
|
|
337
|
+
Borrowed from OMC's UltraQA pattern:
|
|
338
|
+
- Test skill finds issues -> structured diagnosis -> main skill fixes -> retest -> loop
|
|
339
|
+
- Same error appearing 3 times triggers early exit (avoids infinite compute burn)
|
|
340
|
+
- Maximum 5 QA cycles per iteration
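
The cycle and its two stop conditions can be sketched as a small loop. This is an illustration, not skill-cli's implementation: `run_test` and `apply_fix` are hypothetical callables, and the sketch assumes "same error" means an identical error signature counted across the whole iteration, not only consecutive repeats.

```python
from collections import Counter


def qa_cycle(run_test, apply_fix, max_cycles=5, repeat_limit=3):
    """Run test -> fix cycles until pass, repeated error, or the cycle cap.

    run_test() returns None on pass, or an error signature string on failure.
    apply_fix(error) attempts a repair before the next cycle.
    Returns (outcome, cycles_used).
    """
    seen = Counter()
    for cycle in range(1, max_cycles + 1):
        error = run_test()
        if error is None:
            return ("passed", cycle)
        seen[error] += 1
        if seen[error] >= repeat_limit:
            # The same error keeps coming back; stop spending compute.
            return ("early_exit", cycle)
        apply_fix(error)
    return ("max_cycles", max_cycles)
```

With the defaults, a test that always fails with the same signature stops after the third cycle, while a test that fails differently each time runs all five cycles.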
|
|
341
|
+
|
|
342
|
+
**4. Tournament Selection for Test Skills**
|
|
343
|
+
|
|
344
|
+
During the 90% test skill development phase, multiple test strategies are generated in parallel and evaluated against golden tests (known-correct input/output pairs). The strategy with the highest detection precision wins, similar to OMC's self-improve tournament selection.
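
A minimal sketch of the selection step follows. It assumes each golden case pairs an input with the set of issues a correct evaluator should report, and it takes "detection precision" to mean true positives divided by all reported issues. The function names and the data shape are hypothetical.

```python
def detection_precision(detect, golden_cases):
    """Fraction of reported issues that are real, over all golden cases."""
    true_pos = 0
    reported = 0
    for case_input, expected_issues in golden_cases:
        found = set(detect(case_input))
        true_pos += len(found & set(expected_issues))
        reported += len(found)
    # A strategy that reports nothing scores 0.0 instead of dividing by zero.
    return true_pos / reported if reported else 0.0


def select_winner(strategies, golden_cases):
    """Return (name, precision) for the highest-precision strategy."""
    scored = {
        name: detection_precision(detect, golden_cases)
        for name, detect in strategies.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]
```

Precision alone rewards strategies that report very little, so a fuller version would likely track recall as well; that trade-off is outside what this README specifies.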
|
|
345
|
+
|
|
346
|
+
**5. PRD-Driven Acceptance Criteria**
|
|
347
|
+
|
|
348
|
+
Test skills define concrete, testable acceptance criteria rather than vague statements such as "implementation is complete":
|
|
349
|
+
|
|
350
|
+
```text
|
|
351
|
+
Bad: "PDF conversion works correctly"
|
|
352
|
+
Good: "All tables with merged cells are preserved as HTML <table> blocks
|
|
353
|
+
with correct colspan/rowspan attributes"
|
|
354
|
+
```
|
|
355
|
+
|
|
356
|
+
### Example: PDF-to-Markdown Skill
|
|
357
|
+
|
|
358
|
+
For a PDF-to-Markdown conversion skill:
|
|
359
|
+
|
|
360
|
+
- **Test skill** (100 min): Compares original PDF content with generated Markdown, detecting content loss (missing paragraphs, tables, images), format drift (heading levels, list styles), and encoding issues. Outputs structured diffs with page/paragraph-level location info.
|
|
361
|
+
- **Main skill** (10 min): Implements PDF parsing and Markdown generation, iteratively improved by test skill feedback.
|
|
362
|
+
- **Verification** (1 min): End-to-end smoke test on fixture PDFs.
|
|
363
|
+
|
|
364
|
+
### `skill_eval` Check Type
|
|
365
|
+
|
|
366
|
+
The verify system supports a `skill_eval` check type that invokes a test skill as a verification oracle:
|
|
367
|
+
|
|
368
|
+
```yaml
|
|
369
|
+
checks:
|
|
370
|
+
- id: quality-check
|
|
371
|
+
type: skill_eval
|
|
372
|
+
skill: pdf-to-md-test
|
|
373
|
+
config:
|
|
374
|
+
threshold: 0.95
|
|
375
|
+
output_format: structured_diff
|
|
376
|
+
```
|
|
377
|
+
|
|
378
|
+
### Verify Phase: Independent Executor
|
|
379
|
+
|
|
380
|
+
During the loop's verify phase, a separate Claude executor is spawned to run the test skill. This executor:
|
|
381
|
+
- Has no shared context with the main skill's executor
|
|
382
|
+
- Produces an objective evaluation report
|
|
383
|
+
- Returns structured diff feedback that feeds into the next iteration
|
|
384
|
+
|
|
385
|
+
This mirrors OMC's principle: "Keep authoring and review as separate passes."
|
|
386
|
+
|
|
387
|
+
## oh-my-claudecode (OMC) Integration
|
|
388
|
+
|
|
389
|
+
skill-cli embeds the full [oh-my-claudecode](https://github.com/Yeachan-Heo/oh-my-claudecode) multi-agent orchestration bundle (synced from GitHub release **v4.13.6**) and installs it into `~/.claude/omc/` by default. This gives any skill-cli user a single-command path to OMC's agents, skills, hooks, and runtime scripts without needing to clone the OMC repo or configure the plugin marketplace separately.
|
|
390
|
+
|
|
391
|
+
### What gets installed
|
|
392
|
+
|
|
393
|
+
When you run `skill-cli install`, OMC is installed alongside ECC components:
|
|
394
|
+
|
|
395
|
+
| OMC asset | Install target |
|
|
396
|
+
|-----------|---------------|
|
|
397
|
+
| 19 agents (analyst, architect, executor, planner, critic, verifier, …) | `~/.claude/omc/agents/` |
|
|
398
|
+
| 38 skills (autopilot, ralph, ralplan, deep-interview, team, ultrawork, ultraqa, self-improve, …) | `~/.claude/omc/skills/` |
|
|
399
|
+
| Runtime scripts (hook helpers, session lifecycle, skill injector, …) | `~/.claude/omc/scripts/` (executable bit preserved for `.sh`/`.mjs`/`.cjs`/`.js`/`.ts`) |
|
|
400
|
+
| `hooks.json` | Merged into `~/.claude/settings.json` with `$CLAUDE_PLUGIN_ROOT` rewritten to the absolute OMC install path |
|
|
401
|
+
| Templates | `~/.claude/omc/templates/` |
|
|
402
|
+
| `.claude-plugin/` manifest, LICENSE, CHANGELOG, VERSION | `~/.claude/omc/` |
|
|
403
|
+
|
|
404
|
+
### Flags
|
|
405
|
+
|
|
406
|
+
```bash
|
|
407
|
+
# Default — installs ECC + OMC
|
|
408
|
+
skill-cli install
|
|
409
|
+
|
|
410
|
+
# Skip OMC entirely (opt-out)
|
|
411
|
+
skill-cli install --no-omc
|
|
412
|
+
|
|
413
|
+
# Install only the OMC bundle
|
|
414
|
+
skill-cli install --component omc
|
|
415
|
+
|
|
416
|
+
# Preview without writing files
|
|
417
|
+
skill-cli install --dry-run
|
|
418
|
+
```
|
|
419
|
+
|
|
420
|
+
### Browse embedded OMC content
|
|
421
|
+
|
|
422
|
+
```bash
|
|
423
|
+
skill-cli list --type omc # list embedded OMC agents and skills
|
|
424
|
+
skill-cli info autopilot # show the autopilot skill content
|
|
425
|
+
skill-cli doctor # verify OMC install health and version
|
|
426
|
+
```
|
|
427
|
+
|
|
428
|
+
### Why the hook rewrite matters
|
|
429
|
+
|
|
430
|
+
OMC hooks are authored for the Claude Code plugin system and reference scripts via `$CLAUDE_PLUGIN_ROOT/scripts/...`. Because skill-cli installs OMC as a plain directory (not as a marketplace plugin), the installer rewrites `$CLAUDE_PLUGIN_ROOT` → `${claudeDir}/omc` at merge time so hooks resolve correctly without the plugin loader.
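
The rewrite amounts to a string substitution applied to every value in the hooks document. The sketch below shows the idea in Python on an already-parsed JSON structure. It is not the installer's code: `rewrite_plugin_root` is a hypothetical helper, and it only handles the `$CLAUDE_PLUGIN_ROOT` spelling named above.

```python
def rewrite_plugin_root(node, install_path, placeholder="$CLAUDE_PLUGIN_ROOT"):
    """Recursively replace `placeholder` in every string of a JSON-like value."""
    if isinstance(node, str):
        return node.replace(placeholder, install_path)
    if isinstance(node, list):
        return [rewrite_plugin_root(item, install_path, placeholder) for item in node]
    if isinstance(node, dict):
        return {
            key: rewrite_plugin_root(value, install_path, placeholder)
            for key, value in node.items()
        }
    # Numbers, booleans, and None pass through unchanged.
    return node
```

Walking the whole structure avoids depending on the exact layout of `hooks.json`, and the function returns a new value instead of mutating its input, which makes a dry run straightforward.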
|
|
431
|
+
|
|
432
|
+
If you already have OMC installed via the Claude Code plugin marketplace, the skill-cli install places a separate self-contained copy under `~/.claude/omc/` and will not touch the marketplace install. The two copies can coexist; hooks from both sources will simply fire in sequence.
|
|
433
|
+
|
|
434
|
+
### Upgrading OMC
|
|
435
|
+
|
|
436
|
+
The embedded OMC version is pinned to the release tagged in `embedded/omc/VERSION`. To bump it, re-run the sync workflow that downloads a fresh GitHub release tarball into `embedded/omc/` and rebuild.
|
|
437
|
+
|
|
438
|
+
## Meta-Harness (experimental)
|
|
439
|
+
|
|
440
|
+
`meta-harness/` is a Python sub-project that implements the outer-loop harness
|
|
441
|
+
optimizer from [arXiv:2603.28052](https://arxiv.org/abs/2603.28052) (Stanford, 2026).
|
|
442
|
+
|
|
443
|
+
### Architecture
|
|
444
|
+
|
|
445
|
+
```
|
|
446
|
+
meta-harness search ← outer loop (Python, Claude Code proposer)
|
|
447
|
+
│
|
|
448
|
+
└─ skill-cli eval validate / run / ls / diff ← evaluator backend (Go)
|
|
449
|
+
│
|
|
450
|
+
└─ harness.py (user-supplied Python) ← inner execution layer
|
|
451
|
+
```
|
|
452
|
+
|
|
453
|
+
**Two independent binaries — intentionally decoupled:**
|
|
454
|
+
- `skill-cli` knows nothing about `meta-harness`; it only runs harness candidates and emits scores/traces.
|
|
455
|
+
- `meta-harness` knows nothing about OMC internals; it calls `skill-cli` via CLI contract only.
|
|
456
|
+
|
|
457
|
+
### Quick start
|
|
458
|
+
|
|
459
|
+
```bash
|
|
460
|
+
# Build skill-cli
|
|
461
|
+
go build -o bin/skill-cli .
|
|
462
|
+
|
|
463
|
+
# Install meta-harness
|
|
464
|
+
cd meta-harness
|
|
465
|
+
python3 -m venv .venv && source .venv/bin/activate
|
|
466
|
+
pip install -e ".[dev]"
|
|
467
|
+
|
|
468
|
+
# Run smoke test (no API key needed)
|
|
469
|
+
cd ..
|
|
470
|
+
bash scripts/meta-harness-smoke.sh
|
|
471
|
+
|
|
472
|
+
# Real search (requires ANTHROPIC_API_KEY + claude CLI)
|
|
473
|
+
meta-harness search \
|
|
474
|
+
--suite meta-harness/domains/text_classification/suite.yaml \
|
|
475
|
+
--out search-runs/run-01 \
|
|
476
|
+
--max-iter 5 \
|
|
477
|
+
--k 2 \
|
|
478
|
+
--seed meta-harness/domains/text_classification/seeds/zero_shot.py \
|
|
479
|
+
--seed meta-harness/domains/text_classification/seeds/few_shot.py \
|
|
480
|
+
--skill-cli bin/skill-cli \
|
|
481
|
+
--samples 20
|
|
482
|
+
```
|
|
483
|
+
|
|
484
|
+
### CLI contract (skill-cli eval)
|
|
485
|
+
|
|
486
|
+
| Command | Description |
|
|
487
|
+
|---|---|
|
|
488
|
+
| `skill-cli eval validate <dir>` | Cheap structural check (exit 0 = valid) |
|
|
489
|
+
| `skill-cli eval run <dir> --suite <f> --out <d>` | Full eval → scores.json + traces/ |
|
|
490
|
+
| `skill-cli eval ls --store <d> [--pareto]` | List / filter candidates |
|
|
491
|
+
| `skill-cli eval diff <a> <b> --store <d>` | Code + score diff |
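
A caller that relies only on this contract needs little more than argument lists and exit codes. The sketch below builds the documented `validate` and `run` invocations and checks the `validate` exit status. The helper names are hypothetical, and the `runner` parameter exists only so the logic can be exercised without the real binary.

```python
import subprocess


def validate_argv(skill_cli, candidate_dir):
    """Argument list for `skill-cli eval validate <dir>`."""
    return [skill_cli, "eval", "validate", candidate_dir]


def run_argv(skill_cli, candidate_dir, suite, out_dir):
    """Argument list for `skill-cli eval run <dir> --suite <f> --out <d>`."""
    return [skill_cli, "eval", "run", candidate_dir, "--suite", suite, "--out", out_dir]


def is_valid(skill_cli, candidate_dir, runner=subprocess.run):
    """True when `eval validate` exits 0, as described in the table above."""
    return runner(validate_argv(skill_cli, candidate_dir)).returncode == 0
```

Running the cheap `validate` check before a full `run` lets an outer loop discard malformed candidates without paying for an evaluation.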
|
|
492
|
+
|
|
493
|
+
### Tuning
|
|
494
|
+
|
|
495
|
+
The `meta-harness/src/meta_harness/skill.md` file is the most important lever on search quality.
|
|
496
|
+
Per Appendix D of the paper: run 3–5 short iterations (`--max-iter 3`) specifically to
|
|
497
|
+
debug and refine it before committing to a full run.
|
|
498
|
+
|
|
499
|
+
## License
|
|
500
|
+
|
|
501
|
+
MIT
|