helixevo 0.2.0
- package/README.md +193 -0
- package/dist/cli.js +12869 -0
- package/package.json +46 -0
package/README.md
ADDED
@@ -0,0 +1,193 @@
# Helix

Self-evolving skill ecosystem for AI agents. Captures failures, evolves skills through multi-judge evaluation, and maintains a Pareto frontier of optimal skill configurations.

## How it works

Helix builds on ideas from [EvoSkill](https://arxiv.org/abs/2603.02766) and [AutoResearch](https://github.com/karpathy/autoResearch) to create a three-directional evolution system:

- **Generalize ↑** — Detect cross-project patterns and promote them to abstract skills
- **Specialize ↓** — Create project-specific skills from domain skills + project failures
- **Lateral ↔** — Merge, split, and resolve conflicts between skills
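
As a rough illustration of the Generalize direction, the sketch below groups failure records by a pattern label and flags any pattern seen in at least two projects as a promotion candidate. The `FailureRecord` shape, the `pattern` label, and the two-project threshold are assumptions made for this example; they are not taken from Helix's data model.

```typescript
// Hypothetical failure record; field names are illustrative, not Helix's schema.
interface FailureRecord {
  project: string;
  pattern: string; // assumed label for the kind of failure
}

// Return patterns observed in at least `minProjects` distinct projects.
function findCrossProjectPatterns(
  failures: FailureRecord[],
  minProjects = 2,
): string[] {
  const projectsByPattern = new Map<string, Set<string>>();
  for (const f of failures) {
    const projects = projectsByPattern.get(f.pattern) ?? new Set<string>();
    projects.add(f.project);
    projectsByPattern.set(f.pattern, projects);
  }
  return Array.from(projectsByPattern.entries())
    .filter(([, projects]) => projects.size >= minProjects)
    .map(([pattern]) => pattern)
    .sort();
}

const candidates = findCrossProjectPatterns([
  { project: "myapp", pattern: "missing-null-check" },
  { project: "api", pattern: "missing-null-check" },
  { project: "myapp", pattern: "wrong-date-format" },
]);
// candidates is ["missing-null-check"]
```

Only the pattern shared by `myapp` and `api` is returned; a pattern confined to one project would stay a candidate for specialization instead.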

Every proposed change goes through:
1. **3 independent LLM judges** (Task Completion, Correction Alignment, Side-Effect Check)
2. **Regression testing** against golden cases
3. **3-day canary deployment** with auto-rollback
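
The three gates above can be sketched as a single decision function. This is a minimal sketch under stated assumptions: the README does not say how judge scores are combined, so the example assumes scores in [0, 1] and requires every judge to clear a hypothetical `minScore` threshold. All type and function names here are invented for illustration.

```typescript
// Hypothetical types; names and thresholds are assumptions for illustration.
type JudgeName = "taskCompletion" | "correctionAlignment" | "sideEffectCheck";
type JudgeScores = Record<JudgeName, number>; // each assumed to be in [0, 1]

interface Evaluation {
  scores: JudgeScores;
  regressionsPassed: number;
  regressionsTotal: number;
}

type Verdict =
  | { accepted: false; reason: string }
  | { accepted: true; canaryEnds: Date };

const CANARY_DAYS = 3; // the 3-day canary window from the list above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function evaluateProposal(e: Evaluation, now: Date, minScore = 0.7): Verdict {
  // Gate 1: every judge must clear the (assumed) threshold.
  for (const name of Object.keys(e.scores) as JudgeName[]) {
    if (e.scores[name] < minScore) {
      return { accepted: false, reason: `judge ${name} below ${minScore}` };
    }
  }
  // Gate 2: no golden case may fail.
  if (e.regressionsPassed < e.regressionsTotal) {
    return { accepted: false, reason: "regression on golden cases" };
  }
  // Gate 3: start the canary; a rollback would be decided during this window.
  const canaryEnds = new Date(now.getTime() + CANARY_DAYS * MS_PER_DAY);
  return { accepted: true, canaryEnds };
}

const verdict = evaluateProposal(
  {
    scores: { taskCompletion: 0.9, correctionAlignment: 0.8, sideEffectCheck: 0.75 },
    regressionsPassed: 12,
    regressionsTotal: 12,
  },
  new Date("2025-01-01T00:00:00Z"),
);
// verdict.accepted is true and the canary window ends on 2025-01-04
```

Ordering the gates from cheapest to most expensive means a proposal that fails a judge never reaches regression testing or the canary.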

## Prerequisites

- **Node.js 18+**
- **[Bun](https://bun.sh)** — used for building (`curl -fsSL https://bun.sh/install | bash`)
- **[Claude CLI](https://docs.anthropic.com/en/docs/claude-code)** — installed and authenticated
  - Requires a **Claude Max plan** subscription
  - Helix uses `claude --print` for all LLM operations (no API key needed)

Verify prerequisites:
```bash
node --version    # v18+
bun --version     # any
claude --version  # any
```
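
If you want to script the Node.js check rather than read the version by eye, a small helper along these lines would do it. It is only a sketch: `nodeMajor` and `meetsMinimumNode` are hypothetical helper names, and the input is assumed to be the text printed by `node --version`.

```typescript
// Parse the major version out of `node --version` output such as "v18.19.0".
function nodeMajor(versionOutput: string): number | null {
  const match = /^v?(\d+)\./.exec(versionOutput.trim());
  return match ? Number(match[1]) : null;
}

// True when the reported major version is at least `minMajor` (18 by default).
function meetsMinimumNode(versionOutput: string, minMajor = 18): boolean {
  const major = nodeMajor(versionOutput);
  return major !== null && major >= minMajor;
}

const ok = meetsMinimumNode("v18.19.0");
const tooOld = meetsMinimumNode("v16.20.2");
// ok is true, tooOld is false
```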

## Install

### From npm (recommended)

```bash
npm install -g helixevo
```

### From GitHub

```bash
npm install -g github:danielchen26/helixevo
```

### From source

```bash
git clone https://github.com/danielchen26/helixevo.git
cd helixevo
npm install
npm run build
npm link
```

## Quick Start

```bash
# 1. Initialize — imports existing skills + generates golden cases
helixevo init

# 2. Capture failures from a session
helixevo capture path/to/session.json --project myapp

# 3. Evolve skills from failures
helixevo evolve --verbose

# 4. View the skill network
helixevo graph

# 5. Open the web dashboard
helixevo dashboard
```

## Commands

| Command | Description |
|---------|-------------|
| `helixevo watch` | Always-on learning: auto-capture + auto-evolve |
| `helixevo metrics` | Correction rates, skill trends, evolution impact |
| `helixevo health` | Network health: cohesion, coverage, balance, transfer |
| `helixevo init` | Import existing skills + generate golden cases |
| `helixevo capture <session>` | Extract failures from a session file |
| `helixevo evolve` | Evolve skills from captured failures |
| `helixevo generalize` | Promote cross-project patterns ↑ |
| `helixevo specialize --project <name>` | Create project-specific skills ↓ |
| `helixevo graph` | View skill network in terminal |
| `helixevo research` | Proactive web research for skill improvement |
| `helixevo dashboard` | Open web dashboard at localhost:3847 |
| `helixevo status` | Show system health |
| `helixevo report` | Generate evolution report |

### Common options

Most commands support:
- `--dry-run` — Preview changes without applying
- `--verbose` — Show detailed LLM interactions

### Graph options

```bash
helixevo graph                      # TUI view (instant, cached)
helixevo graph --mermaid            # Open in browser as Mermaid diagram
helixevo graph --obsidian ~/vault   # Sync to Obsidian vault
helixevo graph --rebuild            # Re-infer relationships (LLM call)
helixevo graph --optimize           # Detect merge/split/conflict opportunities
```

### Research options

```bash
helixevo research --verbose             # Full output
helixevo research --project ./myapp     # Focus research on a project
helixevo research --max-hypotheses 5    # Test more hypotheses
helixevo research --dry-run             # Preview without creating skills
```

## Data

All data is stored in `~/.helix/`:

```
~/.helix/
├── config.json              # Configuration
├── failures.jsonl           # Captured failures
├── frontier.json            # Pareto frontier (top-k configurations)
├── evolution-history.json   # All evolution runs + proposals
├── golden-cases.jsonl       # Regression test cases
├── skill-graph.json         # Cached network (nodes + edges)
├── canary-registry.json     # Active canary deployments
├── knowledge-buffer.json    # Research discoveries + drafts
├── general/                 # Skills (SKILL.md files)
│   ├── my-skill/SKILL.md
│   └── ...
├── backups/                 # Pre-canary skill backups
└── reports/                 # Generated reports
```
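
The `.jsonl` files in the tree above follow the JSON Lines convention of one JSON object per line. The sketch below shows one way such a file's text could be parsed. The `FailureEntry` fields are placeholders, since the README does not document the actual record schema.

```typescript
// Hypothetical record shape: the real fields in failures.jsonl are not documented here.
interface FailureEntry {
  project: string;
  summary: string;
}

// Parse JSON Lines text: one JSON object per non-empty line.
function parseJsonl<T>(text: string): T[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T);
}

const sample = [
  '{"project":"myapp","summary":"forgot to run tests"}',
  "",
  '{"project":"api","summary":"wrong date format"}',
].join("\n");

const entries = parseJsonl<FailureEntry>(sample);
// entries.length is 2; entries[1].project is "api"
```

Because each record sits on its own line, new failures can be appended without rewriting the file, which suits a log that grows over time.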

## Web Dashboard

The dashboard provides an interactive view of your skill ecosystem:

```bash
helixevo dashboard
# Opens http://localhost:3847
```

**Tabs:**
- **Overview** — Stats, frontier programs, recent evolution
- **Skill Network** — Interactive graph + skill cards + project view + co-evolution analysis
- **Evolution** — Timeline of all evolution runs with judge scores
- **Research** — Knowledge buffer: discoveries and drafts
- **Frontier** — Pareto frontier with 4-dimension scores + canary status
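
For readers unfamiliar with the term, a Pareto frontier is the set of configurations that no other configuration beats on every dimension at once. The sketch below computes one over score vectors. It assumes higher scores are better and does not name the four dimensions, because the README does not list them; `Candidate`, `dominates`, and `paretoFrontier` are illustrative names only.

```typescript
// A configuration with one score per dimension; higher is assumed to be better.
interface Candidate {
  id: string;
  scores: number[]; // four dimensions here; their meaning is not specified
}

// a dominates b if a is at least as good everywhere and strictly better somewhere.
function dominates(a: Candidate, b: Candidate): boolean {
  let strictlyBetter = false;
  for (let i = 0; i < a.scores.length; i++) {
    if (a.scores[i] < b.scores[i]) return false;
    if (a.scores[i] > b.scores[i]) strictlyBetter = true;
  }
  return strictlyBetter;
}

// Keep only candidates that no other candidate dominates.
function paretoFrontier(candidates: Candidate[]): Candidate[] {
  return candidates.filter(
    (c) => !candidates.some((other) => other !== c && dominates(other, c)),
  );
}

const frontier = paretoFrontier([
  { id: "a", scores: [0.9, 0.6, 0.8, 0.7] },
  { id: "b", scores: [0.7, 0.9, 0.8, 0.7] },
  { id: "c", scores: [0.6, 0.5, 0.8, 0.7] }, // dominated by both a and b
]);
// frontier contains "a" and "b"
```

Neither `a` nor `b` dominates the other, since each wins on a different dimension, so both stay on the frontier while `c` drops out.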

The dashboard requires Next.js dependencies. On first run:
```bash
cd dashboard && npm install
```

## Craft Agent Integration

Helix includes a Craft Agent skill at `integrations/craft-agent/`:

```bash
# Copy to your skills directory
cp -r integrations/craft-agent/skills/skill-evolver ~/.agents/skills/
```

Then use `[skill:skill-evolver]` in Craft Agent to trigger evolution.

## Architecture

```
Failures → Cluster → Propose → Replay → Multi-Judge → Regression → Canary → Frontier
                        │                    │
                        │               3 independent judges:
                        │               - Task Completion
                        │               - Correction Alignment
                        │               - Side-Effect Check
                        │
                Knowledge Buffer
                (discoveries + drafts from rejected proposals)
```
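
One way to read the diagram is as a chain of gates in which a proposal that fails any gate is parked in the knowledge buffer instead of being discarded. The sketch below models only that routing. The `Stage` type, the `runPipeline` function, and the stub pass/fail checks are all hypothetical, and routing every rejection to the buffer is an assumption drawn from the "drafts from rejected proposals" note.

```typescript
// Hypothetical proposal and stage types; stage names follow the diagram above.
interface Proposal {
  id: string;
  draft: string;
}

type Stage = { name: string; passes: (p: Proposal) => boolean };

interface RunResult {
  frontier: Proposal[]; // proposals that cleared every stage
  knowledgeBuffer: { proposal: Proposal; rejectedAt: string }[];
}

// Run each proposal through the stages in order; park rejects in the buffer.
function runPipeline(proposals: Proposal[], stages: Stage[]): RunResult {
  const result: RunResult = { frontier: [], knowledgeBuffer: [] };
  for (const proposal of proposals) {
    const failed = stages.find((stage) => !stage.passes(proposal));
    if (failed) {
      result.knowledgeBuffer.push({ proposal, rejectedAt: failed.name });
    } else {
      result.frontier.push(proposal);
    }
  }
  return result;
}

// Stub checks standing in for the real judges, regression tests, and canary.
const stages: Stage[] = [
  { name: "Multi-Judge", passes: (p) => p.draft.length > 0 },
  { name: "Regression", passes: (p) => p.draft.indexOf("breaks") === -1 },
  { name: "Canary", passes: () => true },
];

const run = runPipeline(
  [
    { id: "p1", draft: "add null check guidance" },
    { id: "p2", draft: "breaks golden case" },
  ],
  stages,
);
// run.frontier holds p1; run.knowledgeBuffer holds p2, rejected at "Regression"
```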

**Three-layer hierarchy:**
- **System** — Global agent behaviors
- **Domain** — Cross-project patterns (generalized skills)
- **Project** — Project-specific specializations

## License

MIT