miii-agent 0.1.8 → 0.1.9

Files changed (3)
  1. package/README.md +45 -3
  2. package/dist/cli.js +1755 -1427
  3. package/package.json +3 -1
package/README.md CHANGED
@@ -133,6 +133,34 @@ Every sensitive operation is gated by a permission system — you approve what t
 
  ---
 
+ ## Checking your setup
+
+ miii is model-agnostic, but not every local model can actually drive an agent. A model that can't emit clean tool calls will chat at you instead of editing files. `miii doctor` tells you which of *your* installed models are up to the job, before you waste time wondering why nothing happens.
+
+ ```bash
+ miii doctor # check every local model (from `ollama list`)
+ miii doctor qwen2.5-coder:7b # check one model
+ miii doctor gemma4:e4b grep # one model, only scenarios matching "grep"
+ ```
+
+ It runs the real agent against a handful of concrete tasks (edit a file, read-and-answer, create a file, locate a definition) and checks the *outcome* (did the file actually change, was the answer right), then prints a verdict per model:
+
+ ```
+ === qwen3-coder ===
+ PASS edit-exact-string ...
+ PASS read-then-answer ...
+ PASS create-new-file ...
+ PASS grep-locate ...
+ → qwen3-coder: 4/4 — ready
+
+ === gemma4:e4b ===
+ → gemma4:e4b: 1/4 — not recommended — weak tool-calling
+ ```
+
+ With more than one model, it also prints a compatibility matrix (`+` pass, `.` fail). Cloud models are skipped by default; name one explicitly to include it. If a model comes back `marginal` or `not recommended`, pull a stronger coding model and try again.
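For illustration only, and not taken from the package source: a minimal sketch of how a `+`/`.` compatibility matrix like the one described above could be rendered. The `renderMatrix` function, the result shape, and the model and scenario names are all hypothetical; the real layout printed by `miii doctor` may differ.

```typescript
// Hypothetical result shape (an assumption, not the package's actual type):
// model name -> scenario name -> whether that scenario passed.
type MatrixResults = Record<string, Record<string, boolean>>;

// Render one row per model and one column per scenario:
// "+" marks a pass, "." marks a fail (or a scenario with no recorded result).
function renderMatrix(results: MatrixResults, scenarios: string[]): string {
  const models = Object.keys(results);
  // Pad model names to a common width so the columns line up.
  const width = Math.max(0, ...models.map((m) => m.length));
  return models
    .map((model) => {
      const cells = scenarios
        .map((s) => (results[model]?.[s] ? "+" : "."))
        .join(" ");
      return `${model.padEnd(width)}  ${cells}`;
    })
    .join("\n");
}

// Made-up sample data, purely to show the shape of the output.
const sampleScenarios = ["edit", "read", "create", "grep"];
const sampleResults: MatrixResults = {
  "model-a": { edit: true, read: true, create: true, grep: true },
  "model-b": { edit: false, read: true, create: false, grep: false },
};

console.log(renderMatrix(sampleResults, sampleScenarios));
// model-a  + + + +
// model-b  . + . .
```

Keeping the scenario order in an explicit array, rather than relying on object key order, means every row uses the same columns even if a model is missing a result.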
+
+ ---
+
  ## Architecture
 
  ```mermaid
@@ -196,11 +224,25 @@ npm run dev
  ```
 
  ```bash
- npm run build # production build
- npm run start # run built output
+ npm run build # production build
+ npm run start # run built output
+ npm run typecheck # type-check src + eval
+ npm run eval # run the eval harness as a CI / regression gate
  ```
 
- ---
+ The eval harness lives in `eval/` and powers `miii doctor`. Run as `npm run eval`, it doubles as a regression gate: it exits non-zero if any model fails any scenario, so a prompt or tool change that regresses a baseline model is caught in CI. Same engine, two doors: `miii doctor` for users checking their setup, `npm run eval` for maintainers gating changes.
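For illustration only, and not taken from the `eval/` directory: a minimal sketch of the gating rule described above, where the process exits non-zero if any model fails any scenario. The `gateExitCode` function and the result shape are assumptions made for this example.

```typescript
// Hypothetical result shape (an assumption, not the harness's actual type):
// model name -> scenario name -> whether that scenario passed.
type GateResults = Record<string, Record<string, boolean>>;

// Return 0 only when every scenario passed for every model, otherwise 1.
// An empty result set counts as passing in this sketch.
function gateExitCode(results: GateResults): number {
  const allPassed = Object.values(results).every((scenarios) =>
    Object.values(scenarios).every(Boolean),
  );
  return allPassed ? 0 : 1;
}

// Made-up sample data: one failing scenario is enough to fail the gate.
const passingRun: GateResults = { "model-a": { edit: true, grep: true } };
const failingRun: GateResults = {
  "model-a": { edit: true, grep: true },
  "model-b": { edit: true, grep: false },
};

console.log(gateExitCode(passingRun)); // 0
console.log(gateExitCode(failingRun)); // 1
```

In a Node script, the returned value could be assigned to `process.exitCode` so that CI treats any failure as a failed job.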
+
+ ### Testing the `miii` command against your local changes
+
+ The global `miii` command points at whatever was last installed with `npm install -g miii-agent`, **not** your working tree. After editing source, the global binary is stale, so `miii` (and `miii doctor`) will run the old code and may appear to ignore your changes (e.g. printing the wrong model). Two ways to run your local build:
+
+ ```bash
+ node dist/cli.js doctor <model> # run the freshly built output directly
+ # or
+ npm run build && npm link # point the global `miii` at this repo
+ ```
+
+ `npm link` symlinks the global `miii` to `dist/cli.js` in this repo, so each `npm run build` is picked up automatically. Restore the published version later with `npm install -g miii-agent`. Note: `npm run dev` / `npm run start` always run the current source and never have this staleness problem.
 
  ## Project Status