@metaharness/weight-eft 0.1.0 → 0.1.1

Files changed (2)
  1. package/README.md +38 -14
  2. package/package.json +26 -10
package/README.md CHANGED
@@ -1,23 +1,47 @@
  # @metaharness/weight-eft
 
- **Evolutionary fine-tuning**: the bridge from Darwin's gradient-FREE policy
- evolution (*freeze the model, evolve the harness*) to **gradient / weight**
- self-learning on the **open cheap tier**.
+ > **Make cheap open-source LLMs solve more coding tasks on their own.** Fine-tune them (LoRA) on your AI agent's *past successful runs*, so your pipeline calls expensive frontier models (GPT, Claude) **less often** — and your cost-per-fix drops.
 
- ## Thesis (honest, bounded)
+ [![npm version](https://img.shields.io/npm/v/@metaharness/weight-eft.svg)](https://www.npmjs.com/package/@metaharness/weight-eft)
+ [![license: MIT](https://img.shields.io/npm/l/@metaharness/weight-eft.svg)](./LICENSE)
+ [![node](https://img.shields.io/node/v/@metaharness/weight-eft.svg)](https://nodejs.org)
 
- We attack the **cost-Pareto axis, not the frontier ceiling.**
+ ```bash
+ npm i @metaharness/weight-eft
+ ```
+
+ ## What is this? (plain language)
+
+ If you run an **AI coding agent**, you probably use a **model cascade**: a cheap
+ model (GLM / Qwen / DeepSeek) tries first, and only the hard problems
+ **escalate** to an expensive frontier model (GPT / Claude). Every escalation
+ costs real money.
+
+ **`weight-eft` makes the cheap model smarter** by fine-tuning it with **LoRA** on
+ the trajectories your agent *already solved* — turning your run history into
+ training data. The cheap model then resolves more issues by itself, so you
+ **escalate less and pay less per solved task.**
+
+ It's a self-improving loop: **your agent's wins become the next model's training set.**
+
+ - **Input:** your agent's run archive (successful + failed trajectories).
+ - **Output:** portable LoRA training data — **SFT + DPO** in standard formats
+ (OpenAI chat JSONL / TRL / axolotl / unsloth) **+ a GPU training plan**.
+ - **Goal:** lower **cost-per-resolved**, not a leaderboard score.
+
+ ## Why it exists (the honest, bounded thesis)
 
- The metaharness cascade runs a cheap open model first (GLM / Qwen / DeepSeek)
- and **escalates to a frontier model** (Opus / GPT) only on the hard tail. Each
- escalation costs ~$0.50. `weight-eft` **distills the harness's archival
- success into the cheap tier via LoRA** so the cheap model resolves more issues
- on its own; **the cascade escalates less often** → **$/resolved drops.**
+ We attack the **cost axis, not the capability ceiling.** A small (7-14B) local
+ fine-tune **will not** out-reason a frontier model on the hardest problems;
+ that's a model-capability ceiling (measured: clean-eval ~37.3%, ADR-198 / §53).
+ The win is **fewer escalations** (lower cost), and the tooling keeps the
+ telemetry honest about exactly that: the eval metric is
+ **escalation-rate-reduction + cost/resolved**, *never* "we beat the frontier."
 
- A 7-14B local-GPU tune **will not crack the hard tail**; that's a frontier
- reasoning ceiling (clean-eval ~37.3%, ADR-177 §53). The win is **fewer
- escalations**, and the telemetry stays honest about that. The eval metric is
- **escalation-rate-reduction + cost/resolved**, *not* hard-tail cracking.
+ Under the hood this is the gradient/weight counterpart to Darwin's gradient-free
+ policy evolution (*freeze the model, evolve the harness*); here we **also**
+ evolve the cheap model's *weights*, on the open tier, from the harness's own
+ archive.
 
  ## The data recipe (on/off-policy)
 
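Reviewer note: the README in both versions names escalation-rate reduction and cost per resolved task as the evaluation metric, but gives no formula. The sketch below is one plausible way to compute the two numbers for a two-tier cascade. It is not part of the `@metaharness/weight-eft` API: the `CascadeStats` shape, both function names, and every input value are hypothetical, and it assumes that each task the cheap model fails to resolve is escalated exactly once. The $0.50 escalation cost is the approximate figure quoted in the 0.1.0 README.

```typescript
// Hypothetical two-tier cascade model. Nothing here is part of the
// @metaharness/weight-eft API; all names and numbers are illustrative.
interface CascadeStats {
  tasks: number;             // total tasks sent to the cascade
  cheapResolved: number;     // tasks the cheap model resolved on its own
  frontierResolved: number;  // escalated tasks the frontier model resolved
  cheapCostPerTask: number;  // USD per cheap-model attempt (assumed)
  costPerEscalation: number; // USD per escalation (0.1.0 README cites ~$0.50)
}

// Assumption: every task the cheap model fails to resolve is escalated.
function escalationRate(s: CascadeStats): number {
  return (s.tasks - s.cheapResolved) / s.tasks;
}

// Total spend divided by the number of tasks resolved by either tier.
function costPerResolved(s: CascadeStats): number {
  const escalations = s.tasks - s.cheapResolved;
  const totalCost =
    s.tasks * s.cheapCostPerTask + escalations * s.costPerEscalation;
  return totalCost / (s.cheapResolved + s.frontierResolved);
}

// Made-up inputs: the fine-tune moves 10 tasks from "escalated" to
// "resolved by the cheap tier". The frontier model had already been
// resolving 5 of those, so total resolved rises from 50 to 55.
const before: CascadeStats = {
  tasks: 100,
  cheapResolved: 30,
  frontierResolved: 20,
  cheapCostPerTask: 0.02,
  costPerEscalation: 0.5,
};
const after: CascadeStats = { ...before, cheapResolved: 40, frontierResolved: 15 };

console.log(escalationRate(before), escalationRate(after));
console.log(costPerResolved(before).toFixed(2), costPerResolved(after).toFixed(2));
```

With these invented inputs the escalation rate falls from 0.7 to 0.6 and cost per resolved task from about $0.74 to about $0.58. The frontier tier's ceiling is untouched; the saving comes only from escalating less, which is the bounded claim the README makes.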
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "@metaharness/weight-eft",
- "version": "0.1.0",
- "description": "Evolutionary fine-tuning: distill the harness's archival success into the open cheap tier (GLM/Qwen) via LoRA so the cost-cascade escalates to a frontier model less often. SFT-distills ALL gold-resolved trajectories; on-policy DPO on GLM-vs-GLM pairs only. Attacks the cost-Pareto axis (fewer escalations), NOT the frontier reasoning ceiling. Strict train/eval instance-ID disjointness (the contamination guard).",
+ "version": "0.1.1",
+ "description": "Fine-tune cheap open-source LLMs (GLM, Qwen, DeepSeek) on your AI coding agent's successful runs with LoRA (SFT + DPO) so your model cascade escalates to expensive frontier models (GPT, Claude) less often, cutting cost-per-resolved. Turns run history into portable training data (OpenAI/TRL/axolotl JSONL) with a built-in contamination guard and reward-hacking filter.",
  "type": "module",
  "main": "./dist/index.js",
  "types": "./dist/index.d.ts",
@@ -32,20 +32,36 @@
  "llm",
  "lora",
  "fine-tuning",
- "evolutionary-fine-tuning",
- "weight-eft",
+ "peft",
  "sft",
  "dpo",
- "distillation",
- "cost-optimization",
+ "rlhf-alternative",
+ "model-distillation",
+ "knowledge-distillation",
+ "ai-agents",
+ "coding-agent",
+ "agentic",
+ "llm-agent",
  "swe-bench",
- "metaharness",
- "darwin-mode",
- "contamination-guard"
+ "llm-routing",
+ "model-cascade",
+ "cost-optimization",
+ "llm-cost",
+ "openrouter",
+ "qwen",
+ "deepseek",
+ "training-data",
+ "jsonl",
+ "trl",
+ "axolotl",
+ "unsloth",
+ "self-improving",
+ "weight-eft",
+ "metaharness"
  ],
  "author": "rUv <ruv@ruv.net>",
  "license": "MIT",
- "homepage": "https://github.com/ruvnet/agent-harness-generator",
+ "homepage": "https://github.com/ruvnet/agent-harness-generator/tree/main/packages/weight-eft#readme",
  "repository": {
  "type": "git",
  "url": "https://github.com/ruvnet/agent-harness-generator",