kairn-cli 2.1.0 → 2.2.0
- package/README.md +88 -66
- package/dist/cli.js +348 -24
- package/dist/cli.js.map +1 -1
- package/package.json +1 -1
package/README.md CHANGED
@@ -1,16 +1,12 @@
 # Kairn
 
-> Describe what you want done
+> The agent environment compiler. Describe what you want done — get an optimized Claude Code environment. Then evolve it automatically.
 
 Kairn is a CLI that compiles natural language workflow descriptions into minimal, optimal [Claude Code](https://code.claude.com/) agent environments — complete with MCP servers, slash commands, skills, subagents, and security rules.
 
-**
-
-## Why
-
-Every agent needs an environment before it can work. Today, building that environment is manual and generic. The harness repos on GitHub give you 136 skills and hope you figure out which 6 matter for your task.
+**v2.1** adds **Kairn Evolve** — an automated optimization loop that runs your agent on real tasks, diagnoses failures from full execution traces, and mutates the harness until performance plateaus. Inspired by [Meta-Harness](https://yoonholee.com/meta-harness/) (Stanford IRIS Lab, 2026).
 
-
+**No server. No account. Runs locally with your own LLM key.**
 
 ## Install
 
@@ -23,7 +19,7 @@ Requires Node.js 18+. The command is `kairn`.
 ## Quick Start
 
 ```bash
-# 1. Set up your LLM key
+# 1. Set up your LLM key
 kairn init
 
 # 2. Describe your workflow
@@ -33,31 +29,21 @@ kairn describe "Build a Next.js app with Supabase auth"
 claude
 ```
 
-
+Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP servers, slash commands, skills, agents, rules — tailored to your specific workflow.
 
 ## What Gets Generated
 
 ```
 .claude/
-├── CLAUDE.md # Workflow-specific system prompt
-├── settings.json # Permissions and security deny rules
-├── commands/ # Slash commands (
-
-│ ├── tasks.md # /project:tasks — manage TODOs
-│ ├── plan.md # /project:plan — plan before coding
-│ └── ... # workflow-specific commands
-├── rules/ # Auto-loaded instructions
-│ ├── security.md # Security best practices
-│ └── continuity.md # Session memory (DECISIONS.md, LEARNINGS.md)
+├── CLAUDE.md # Workflow-specific system prompt
+├── settings.json # Permissions, hooks, and security deny rules
+├── commands/ # Slash commands (/project:help, /project:plan, etc.)
+├── rules/ # Auto-loaded instructions (security, continuity)
 ├── skills/ # Model-controlled capabilities
-│ └── {skill}/SKILL.md
 ├── agents/ # Specialized subagents
-│ └── {agent}.md
 └── docs/ # Pre-initialized project memory
-├── TODO.md
-├── DECISIONS.md
-└── LEARNINGS.md
 .mcp.json # Project-scoped MCP server config
+.env # API keys (gitignored, masked in output)
 ```
 
 ## Commands
@@ -67,71 +53,103 @@ That's it. Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP se
 Interactive setup. Pick your LLM provider and model, paste your API key. Key stays local at `~/.kairn/config.json`.
 
 Supported providers:
-- **Anthropic** — Claude Sonnet 4, Opus 4, Haiku
-- **OpenAI** — GPT-
-- **Google** — Gemini 2.5 Flash, Gemini 2.5 Pro
+- **Anthropic** — Claude Sonnet 4.6, Opus 4.6, Haiku 4.5
+- **OpenAI** — GPT-4.1, GPT-4.1 mini, o4-mini, GPT-5 mini
+- **Google** — Gemini 2.5 Flash, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 3.1 Pro
+- **xAI** — Grok 4.1 Fast, Grok 4.20 (2M context)
+- **DeepSeek** — V3.2 Chat, V3.2 Reasoner (cheapest)
+- **Mistral** — Large 3, Codestral, Small 4 (open-weight)
+- **Groq** — Llama 4, DeepSeek R1, Qwen 3 (free tier)
+- **Custom** — any OpenAI-compatible endpoint (local Ollama, LM Studio, etc.)
 
 ### `kairn describe [intent]`
 
 The main command. Describe what you want your agent to do, and Kairn compiles an optimal environment.
 
 ```bash
-# Interactive
-kairn describe
-
-# Inline
 kairn describe "Research ML papers on GRPO training and write a summary"
-
-# Skip confirmation
-kairn describe "Draft outreach emails from a CSV" --yes
+kairn describe "Build a REST API with Express and PostgreSQL" --quick
 ```
 
-
+Features:
+- **Interactive clarification** — 3-5 questions to understand your project (skip with `--quick`)
+- **Multi-pass compilation** — skeleton pass (tool selection) + harness pass (content generation) + deterministic settings
+- **Autonomy levels** — choose how autonomous the agent should be (1-4)
+- **Secrets collection** — prompted for API keys after generation, written to `.env`
 
-### `kairn
+### `kairn optimize [--diff]`
 
-
+Scan an existing project and optimize its Claude Code environment. Detects language, framework, dependencies, and generates improvements.
 
+```bash
+kairn optimize # Write optimized environment
+kairn optimize --diff # Preview changes before writing
 ```
-$ kairn list
 
-
-
-
+### `kairn templates`
+
+Browse and activate pre-built environment templates.
 
-
-
-
+```bash
+kairn templates # Browse gallery
+kairn templates --activate nextjs # Apply a template
 ```
 
-
+Available templates: Next.js Full-Stack, API Service, Research Project, Content Writing.
 
-
+### `kairn doctor`
 
-
-
-kairn
-
+Validate the current environment against Claude Code best practices.
+
+### `kairn keys [--show]`
+
+Add or update API keys for MCP servers in the current environment.
+
+### `kairn list` / `kairn activate <env_id>`
+
+Show saved environments and re-deploy them to any directory.
 
-### `kairn
+### `kairn evolve`
 
-
+Automated harness optimization. Run your agent on real tasks, capture traces, and evolve the environment.
 
 ```bash
-
+# 1. Initialize — auto-generates project-specific eval tasks via LLM
+kairn evolve init
+
+# 2. Snapshot current .claude/ as the baseline
+kairn evolve baseline
+
+# 3. Run the evolution loop
+kairn evolve run # 5 iterations (default)
+kairn evolve run --iterations 3 # Custom iteration count
+kairn evolve run --task <id> # Run a single task
 ```
 
+**How it works:**
+
+1. **Define tasks** — `kairn evolve init` reads your CLAUDE.md and project structure, then uses the LLM to generate 3-5 concrete eval tasks from 6 built-in templates (add-feature, fix-bug, refactor, test-writing, config-change, documentation)
+2. **Baseline** — `kairn evolve baseline` snapshots your current `.claude/` directory
+3. **Evaluate** — runs each task by spawning Claude Code in an isolated workspace, capturing full traces (stdout, stderr, tool calls, files changed, timing)
+4. **Diagnose** — a proposer agent (Opus) reads the full traces and performs causal reasoning to identify why tasks fail
+5. **Mutate** — proposes minimal, targeted changes to CLAUDE.md, commands, rules, or agents
+6. **Repeat** — re-evaluates with the mutated harness. Rolls back if scores regress.
+
+Scoring: pass/fail (default), LLM-as-judge, or weighted rubric.
+
 ## Tool Registry
 
-Kairn ships with
+Kairn ships with 28 curated tools across 8 categories:
 
 | Category | Tools |
 |----------|-------|
-| **
-| **Code** | GitHub MCP,
-| **Search** | Exa, Brave Search, Firecrawl, Perplexity |
-| **
-| **
+| **Reasoning** | Context7, Sequential Thinking |
+| **Code & DevTools** | GitHub MCP, Chrome DevTools |
+| **Search & Research** | Exa, Brave Search, Firecrawl, Perplexity |
+| **Browser Automation** | Playwright, Browserbase |
+| **Data & Infrastructure** | PostgreSQL, Supabase, SQLite, Docker, Vercel |
+| **Communication** | Slack, Notion, Linear, AgentMail, Gmail |
+| **Security** | Semgrep, security-guidance |
 | **Design** | Figma, Frontend Design |
 
 Tools are selected based on your workflow description. Fewer tools = less context bloat = better agent performance.
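Reviewer note on the `kairn evolve` section added in this hunk: the evaluate, mutate, and roll-back-on-regression steps amount to a greedy search loop. The TypeScript below is a minimal sketch of that idea under stated assumptions, not code taken from `dist/cli.js`. The names `Harness`, `Evaluate`, `Propose`, and `evolve` are hypothetical, and a plain number stands in for whichever scoring mode (pass/fail, LLM-as-judge, weighted rubric) is in use.

```typescript
// Hypothetical model: a harness is a map of file path -> file content.
type Harness = Record<string, string>;
type Evaluate = (harness: Harness) => number; // higher is better
type Propose = (harness: Harness, iteration: number) => Harness;

interface EvolveResult {
  best: Harness;
  bestScore: number;
  history: { iteration: number; score: number; kept: boolean }[];
}

// Greedy loop: keep a mutation only if the score does not regress.
function evolve(
  baseline: Harness,
  evaluate: Evaluate,
  propose: Propose,
  iterations = 5,
): EvolveResult {
  let best = baseline;
  let bestScore = evaluate(baseline);
  const history: EvolveResult["history"] = [];

  for (let i = 1; i <= iterations; i++) {
    const candidate = propose(best, i);
    const score = evaluate(candidate);
    const kept = score >= bestScore;
    if (kept) {
      best = candidate;
      bestScore = score;
    }
    // When kept is false, `best` is left untouched: that is the rollback.
    history.push({ iteration: i, score, kept });
  }
  return { best, bestScore, history };
}

// Toy usage: the score counts how many required rules CLAUDE.md mentions.
const required = ["run tests", "no force push"];
const toyEvaluate: Evaluate = (h) =>
  required.filter((rule) => (h["CLAUDE.md"] ?? "").includes(rule)).length;
const toyPropose: Propose = (h, i) =>
  i === 2
    ? { ...h, "CLAUDE.md": "" } // a harmful mutation that should be rolled back
    : { ...h, "CLAUDE.md": `${h["CLAUDE.md"] ?? ""}\n- ${required[i === 1 ? 0 : 1]}` };

const result = evolve({ "CLAUDE.md": "# Project" }, toyEvaluate, toyPropose, 3);
```

In the toy run, iteration 2 wipes CLAUDE.md, scores lower than the current best, and is discarded, so iteration 3 builds on the result of iteration 1. A real loop would replace `toyEvaluate` with task runs in an isolated workspace and `toyPropose` with the trace-reading proposer the README describes.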
@@ -139,24 +157,28 @@ Tools are selected based on your workflow description. Fewer tools = less contex
 ## How It Works
 
 1. You describe your workflow in natural language
-2. Kairn
-3.
-4.
-5.
+2. Kairn asks clarifying questions (or skip with `--quick`)
+3. **Pass 1:** LLM selects the minimal tool set and outlines the project
+4. **Pass 2:** LLM generates all harness content (CLAUDE.md, commands, rules, agents)
+5. **Pass 3:** Settings and MCP config generated deterministically from the registry
+6. Kairn writes the `.claude/` directory and `.mcp.json`
+7. API keys are collected and written to `.env`
 
 The LLM call uses your own API key. Nothing is sent to Kairn servers (there are none).
 
 ## Security
 
-- **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted
+- **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted.
 - **Every environment includes security rules.** Deny rules for `rm -rf`, `curl | sh`, reading `.env` and `secrets/`.
-- **Curated registry only.** Every MCP server
-- **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to files.
+- **Curated registry only.** Every MCP server is manually verified.
+- **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to config files.
+- **Path traversal protection.** Evolution mutations are validated against `../` injection.
 
 ## Philosophy
 
 - **Minimal over complete.** 5 well-chosen tools beat 50 generic ones.
 - **Workflow-specific over generic.** Every file generated relates to your actual task.
+- **Self-improving.** Environments should get better with use, not just at generation time.
 - **Local-first.** No accounts, no servers, no telemetry.
 - **Transparent.** You can inspect every generated file. Nothing is hidden.
 
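Reviewer note on the "Path traversal protection" bullet added in this hunk: the README does not say how mutation paths are validated. One plausible approach, shown here as an assumption and not as Kairn's actual code, is to walk the path segments and reject any path that is absolute or climbs above the harness root. The function name `escapesRoot` and the sample paths are hypothetical.

```typescript
// Returns true if a mutation path would land outside the harness root.
// Assumption: mutation paths are expected to be relative to `.claude/`.
function escapesRoot(relativePath: string): boolean {
  // Absolute POSIX paths, leading backslashes, and Windows drive letters are rejected outright.
  if (/^([\\/]|[A-Za-z]:)/.test(relativePath)) return true;

  let depth = 0; // how many directories below the root we currently are
  for (const segment of relativePath.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      depth -= 1;
      if (depth < 0) return true; // climbed above the root
    } else {
      depth += 1;
    }
  }
  return false;
}

// Hypothetical mutation targets proposed during an evolve run.
const proposed = ["rules/security.md", "rules/../../.env", "/etc/passwd", "..notes/todo.md"];
const rejected = proposed.filter(escapesRoot);
```

Here `rejected` holds `rules/../../.env` and `/etc/passwd`; `..notes/todo.md` passes because `..notes` is an ordinary directory name, which is why the check works on whole segments instead of searching for the substring `../`. In Node.js, combining `path.resolve` with `path.relative` is a common alternative to walking segments by hand; neither approach accounts for symlinks.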