kairn-cli 2.0.0 → 2.2.0

package/README.md CHANGED
@@ -1,16 +1,12 @@
  # Kairn

- > Describe what you want done. Get an optimized Claude Code environment.
+ > The agent environment compiler. Describe what you want done. Get an optimized Claude Code environment. Then evolve it automatically.

  Kairn is a CLI that compiles natural language workflow descriptions into minimal, optimal [Claude Code](https://code.claude.com/) agent environments — complete with MCP servers, slash commands, skills, subagents, and security rules.
 
- **No server. No account. Runs locally with your own LLM key.**
-
- ## Why
-
- Every agent needs an environment before it can work. Today, building that environment is manual and generic. The harness repos on GitHub give you 136 skills and hope you figure out which 6 matter for your task.
+ **v2.1** adds **Kairn Evolve**, an automated optimization loop that runs your agent on real tasks, diagnoses failures from full execution traces, and mutates the harness until performance plateaus. Inspired by [Meta-Harness](https://yoonholee.com/meta-harness/) (Stanford IRIS Lab, 2026).

- Kairn selects the right tools, generates workflow-specific instructions, and writes a production-quality `.claude/` directory in one command.
+ **No server. No account. Runs locally with your own LLM key.**

  ## Install
 
@@ -23,7 +19,7 @@ Requires Node.js 18+. The command is `kairn`.
  ## Quick Start

  ```bash
- # 1. Set up your LLM key (Anthropic, OpenAI, or Google)
+ # 1. Set up your LLM key
  kairn init

  # 2. Describe your workflow
@@ -33,31 +29,21 @@ kairn describe "Build a Next.js app with Supabase auth"
  claude
  ```

- That's it. Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP servers, slash commands, skills, agents, rules — tailored to your specific workflow.
+ Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP servers, slash commands, skills, agents, rules — tailored to your specific workflow.
 
  ## What Gets Generated

  ```
  .claude/
- ├── CLAUDE.md # Workflow-specific system prompt (<100 lines)
- ├── settings.json # Permissions and security deny rules
- ├── commands/ # Slash commands (show up when you type /)
- │   ├── help.md # /project:help environment guide
- │   ├── tasks.md # /project:tasks — manage TODOs
- │   ├── plan.md # /project:plan — plan before coding
- │   └── ... # workflow-specific commands
- ├── rules/ # Auto-loaded instructions
- │   ├── security.md # Security best practices
- │   └── continuity.md # Session memory (DECISIONS.md, LEARNINGS.md)
+ ├── CLAUDE.md # Workflow-specific system prompt
+ ├── settings.json # Permissions, hooks, and security deny rules
+ ├── commands/ # Slash commands (/project:help, /project:plan, etc.)
+ ├── rules/ # Auto-loaded instructions (security, continuity)
  ├── skills/ # Model-controlled capabilities
- │   └── {skill}/SKILL.md
  ├── agents/ # Specialized subagents
- │   └── {agent}.md
  └── docs/ # Pre-initialized project memory
-     ├── TODO.md
-     ├── DECISIONS.md
-     └── LEARNINGS.md
  .mcp.json # Project-scoped MCP server config
+ .env # API keys (gitignored, masked in output)
  ```
 
  ## Commands
@@ -67,71 +53,103 @@ That's it. Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP se
  Interactive setup. Pick your LLM provider and model, paste your API key. Key stays local at `~/.kairn/config.json`.

  Supported providers:
- - **Anthropic** — Claude Sonnet 4, Opus 4, Haiku 3.5
- - **OpenAI** — GPT-4o, GPT-4o mini, o3
- - **Google** — Gemini 2.5 Flash, Gemini 2.5 Pro
+ - **Anthropic** — Claude Sonnet 4.6, Opus 4.6, Haiku 4.5
+ - **OpenAI** — GPT-4.1, GPT-4.1 mini, o4-mini, GPT-5 mini
+ - **Google** — Gemini 2.5 Flash, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 3.1 Pro
+ - **xAI** — Grok 4.1 Fast, Grok 4.20 (2M context)
+ - **DeepSeek** — V3.2 Chat, V3.2 Reasoner (cheapest)
+ - **Mistral** — Large 3, Codestral, Small 4 (open-weight)
+ - **Groq** — Llama 4, DeepSeek R1, Qwen 3 (free tier)
+ - **Custom** — any OpenAI-compatible endpoint (local Ollama, LM Studio, etc.)
 
  ### `kairn describe [intent]`

  The main command. Describe what you want your agent to do, and Kairn compiles an optimal environment.

  ```bash
- # Interactive
- kairn describe
-
- # Inline
  kairn describe "Research ML papers on GRPO training and write a summary"
-
- # Skip confirmation
- kairn describe "Draft outreach emails from a CSV" --yes
+ kairn describe "Build a REST API with Express and PostgreSQL" --quick
  ```

- Kairn selects the minimal set of tools from a curated registry of 18 MCP servers, plugins, and hooks — then generates every file Claude Code needs.
+ Features:
+ - **Interactive clarification** — 3-5 questions to understand your project (skip with `--quick`)
+ - **Multi-pass compilation** — skeleton pass (tool selection) + harness pass (content generation) + deterministic settings
+ - **Autonomy levels** — choose how autonomous the agent should be (1-4)
+ - **Secrets collection** — prompted for API keys after generation, written to `.env`
 
- ### `kairn list`
+ ### `kairn optimize [--diff]`

- Show all saved environments.
+ Scan an existing project and optimize its Claude Code environment. Detects language, framework, dependencies, and generates improvements.

+ ```bash
+ kairn optimize # Write optimized environment
+ kairn optimize --diff # Preview changes before writing
  ```
- $ kairn list

- nextjs-supabase-auth
- Next.js app with Supabase authentication
- 3/30/2026 · 7 tools · env_df2c0a23...
+ ### `kairn templates`
+
+ Browse and activate pre-built environment templates.

- grpo-research
- Research ML papers on GRPO training
- 3/30/2026 · 4 tools · env_1638c54e...
+ ```bash
+ kairn templates # Browse gallery
+ kairn templates --activate nextjs # Apply a template
  ```

- ### `kairn activate <env_id>`
+ Available templates: Next.js Full-Stack, API Service, Research Project, Content Writing.

- Re-deploy a saved environment to any directory. Use the ID from `kairn list`.
+ ### `kairn doctor`

- ```bash
- mkdir new-project && cd new-project
- kairn activate env_df2c0a23
- ```
+ Validate the current environment against Claude Code best practices.
+
+ ### `kairn keys [--show]`
+
+ Add or update API keys for MCP servers in the current environment.
+
+ ### `kairn list` / `kairn activate <env_id>`
+
+ Show saved environments and re-deploy them to any directory.
 
- ### `kairn update-registry`
+ ### `kairn evolve`

- Fetch the latest tool catalog from GitHub.
+ Automated harness optimization. Run your agent on real tasks, capture traces, and evolve the environment.

  ```bash
- kairn update-registry
+ # 1. Initialize — auto-generates project-specific eval tasks via LLM
+ kairn evolve init
+
+ # 2. Snapshot current .claude/ as the baseline
+ kairn evolve baseline
+
+ # 3. Run the evolution loop
+ kairn evolve run # 5 iterations (default)
+ kairn evolve run --iterations 3 # Custom iteration count
+ kairn evolve run --task <id> # Run a single task
  ```
 
+ **How it works:**
+
+ 1. **Define tasks** — `kairn evolve init` reads your CLAUDE.md and project structure, then uses the LLM to generate 3-5 concrete eval tasks from 6 built-in templates (add-feature, fix-bug, refactor, test-writing, config-change, documentation)
+ 2. **Baseline** — `kairn evolve baseline` snapshots your current `.claude/` directory
+ 3. **Evaluate** — runs each task by spawning Claude Code in an isolated workspace, capturing full traces (stdout, stderr, tool calls, files changed, timing)
+ 4. **Diagnose** — a proposer agent (Opus) reads the full traces and performs causal reasoning to identify why tasks fail
+ 5. **Mutate** — proposes minimal, targeted changes to CLAUDE.md, commands, rules, or agents
+ 6. **Repeat** — re-evaluates with the mutated harness. Rolls back if scores regress.
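The evaluate, mutate, and roll-back cycle described in the steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not Kairn's actual implementation: the names `Harness`, `LoopDeps`, and `evolveLoop` are hypothetical, the callbacks are synchronous for brevity (a real run would spawn Claude Code asynchronously), and "regress" is assumed to mean a strictly lower aggregate score.

```typescript
// Hypothetical types: none of these names come from Kairn's source.
type Harness = Record<string, string>; // relative file path -> file content

interface LoopDeps {
  // Runs the eval tasks against a harness and returns an aggregate score (higher is better).
  evaluate: (harness: Harness) => number;
  // Diagnoses the last run and returns a mutated copy of the harness.
  propose: (harness: Harness, lastScore: number) => Harness;
}

interface LoopResult {
  best: Harness;
  bestScore: number;
  history: number[]; // baseline score followed by one score per iteration
}

function evolveLoop(baseline: Harness, iterations: number, deps: LoopDeps): LoopResult {
  let best = baseline;
  let bestScore = deps.evaluate(baseline);
  const history: number[] = [bestScore];

  for (let i = 0; i < iterations; i++) {
    const candidate = deps.propose(best, bestScore);
    const score = deps.evaluate(candidate);
    history.push(score);
    if (score < bestScore) {
      // Regression: roll back by discarding the candidate and keeping `best`.
      continue;
    }
    best = candidate;
    bestScore = score;
  }
  return { best, bestScore, history };
}
```

Keeping the best-known harness separate from each candidate makes rollback a no-op: a regressing mutation is simply never promoted.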
+
+ Scoring: pass/fail (default), LLM-as-judge, or weighted rubric.
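As one way to picture the weighted-rubric option, the sketch below combines per-criterion scores into a single number by normalizing over the total weight. The `Criterion` shape and the `weightedRubricScore` name are assumptions made for illustration; the README does not specify how Kairn defines or stores rubrics.

```typescript
// Hypothetical rubric entry: `score` is assumed to lie in [0, 1].
interface Criterion {
  name: string;
  weight: number;
  score: number;
}

// Weighted average of criterion scores, normalized by the total weight.
function weightedRubricScore(criteria: Criterion[]): number {
  const totalWeight = criteria.reduce((sum, c) => sum + c.weight, 0);
  if (totalWeight <= 0) {
    throw new Error("rubric needs a positive total weight");
  }
  const weighted = criteria.reduce((sum, c) => sum + c.weight * c.score, 0);
  return weighted / totalWeight;
}
```

Under this scheme, pass/fail scoring is the special case of a single criterion whose score is either 0 or 1.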
+

  ## Tool Registry

- Kairn ships with 18 curated tools across 6 tiers:
+ Kairn ships with 28 curated tools across 8 categories:

  | Category | Tools |
  |----------|-------|
- | **Universal** | Context7, Sequential Thinking, security-guidance |
- | **Code** | GitHub MCP, Playwright, Semgrep |
- | **Search** | Exa, Brave Search, Firecrawl, Perplexity |
- | **Data** | PostgreSQL (Bytebase), Supabase |
- | **Communication** | Slack, Notion, Linear, AgentMail |
+ | **Reasoning** | Context7, Sequential Thinking |
+ | **Code & DevTools** | GitHub MCP, Chrome DevTools |
+ | **Search & Research** | Exa, Brave Search, Firecrawl, Perplexity |
+ | **Browser Automation** | Playwright, Browserbase |
+ | **Data & Infrastructure** | PostgreSQL, Supabase, SQLite, Docker, Vercel |
+ | **Communication** | Slack, Notion, Linear, AgentMail, Gmail |
+ | **Security** | Semgrep, security-guidance |
  | **Design** | Figma, Frontend Design |
 
  Tools are selected based on your workflow description. Fewer tools = less context bloat = better agent performance.
@@ -139,24 +157,28 @@ Tools are selected based on your workflow description. Fewer tools = less contex
  ## How It Works

  1. You describe your workflow in natural language
- 2. Kairn sends your intent + its tool registry to an LLM
- 3. The LLM selects the minimal tool set and generates a complete `EnvironmentSpec`
- 4. Kairn writes the `.claude/` directory and `.mcp.json` from the spec
- 5. The spec is saved locally so you can re-deploy it anywhere
+ 2. Kairn asks clarifying questions (or skip with `--quick`)
+ 3. **Pass 1:** LLM selects the minimal tool set and outlines the project
+ 4. **Pass 2:** LLM generates all harness content (CLAUDE.md, commands, rules, agents)
+ 5. **Pass 3:** Settings and MCP config generated deterministically from the registry
+ 6. Kairn writes the `.claude/` directory and `.mcp.json`
+ 7. API keys are collected and written to `.env`
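To make Pass 3 concrete, the sketch below derives an `.mcp.json`-shaped object from registry entries without any LLM call, so the same tool selection always yields the same output. The `RegistryEntry` shape, the `buildMcpConfig` name, and the sample entries in the usage note are hypothetical; only the `mcpServers` layout and the `${ENV_VAR}` placeholder style follow the conventions this README describes.

```typescript
// Hypothetical registry entry: field names are assumptions, not Kairn's schema.
interface RegistryEntry {
  id: string;
  command: string;
  args: string[];
  envVars: string[]; // names of the secrets the server needs
}

interface McpServer {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildMcpConfig(
  registry: RegistryEntry[],
  selectedIds: string[],
): { mcpServers: Record<string, McpServer> } {
  const byId = new Map<string, RegistryEntry>();
  for (const entry of registry) byId.set(entry.id, entry);

  const mcpServers: Record<string, McpServer> = {};
  // Sort a copy of the ids so the output does not depend on selection order.
  for (const id of [...selectedIds].sort()) {
    const entry = byId.get(id);
    if (!entry) throw new Error("unknown tool: " + id);
    const env: Record<string, string> = {};
    for (const name of entry.envVars) {
      // Write a reference such as ${EXAMPLE_KEY}, never the secret value itself.
      env[name] = "${" + name + "}";
    }
    mcpServers[id] = { command: entry.command, args: [...entry.args], env };
  }
  return { mcpServers };
}
```

For example, calling `buildMcpConfig(registry, ["search", "db"])` and `buildMcpConfig(registry, ["db", "search"])` on the same registry would serialize to identical JSON, which is the property that makes this pass reproducible.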
 
  The LLM call uses your own API key. Nothing is sent to Kairn servers (there are none).

  ## Security

- - **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted anywhere.
+ - **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted.
  - **Every environment includes security rules.** Deny rules for `rm -rf`, `curl | sh`, reading `.env` and `secrets/`.
- - **Curated registry only.** Every MCP server in the registry is manually verified. No auto-inclusion of unvetted tools.
- - **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to files.
+ - **Curated registry only.** Every MCP server is manually verified.
+ - **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to config files.
+ - **Path traversal protection.** Evolution mutations are validated against `../` injection.
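A path-traversal check of the kind mentioned in the last bullet could look like the following. This is a sketch of one conservative approach, not Kairn's validator: `isSafeRelativePath` is a hypothetical name, and it rejects any path that is absolute or that climbs above the harness root at any point while its segments are walked.

```typescript
// Returns true only for relative paths that stay strictly inside the root directory.
function isSafeRelativePath(candidate: string): boolean {
  // Reject empty paths, POSIX or UNC-style absolute paths, and Windows drive paths.
  if (
    candidate === "" ||
    candidate.startsWith("/") ||
    candidate.startsWith("\\") ||
    /^[A-Za-z]:/.test(candidate)
  ) {
    return false;
  }
  let depth = 0;
  for (const segment of candidate.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      depth -= 1;
      if (depth < 0) return false; // escaped above the root
    } else {
      depth += 1;
    }
  }
  // A depth of 0 means the path resolves to the root itself, which is not a file.
  return depth > 0;
}
```

With this check, `rules/security.md` is accepted while `../secrets.txt` and `rules/../../x` are rejected. Comparing whole segments rather than string prefixes keeps a legitimate file name such as `..notes.md` from being misclassified.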
 
  ## Philosophy

  - **Minimal over complete.** 5 well-chosen tools beat 50 generic ones.
  - **Workflow-specific over generic.** Every file generated relates to your actual task.
+ - **Self-improving.** Environments should get better with use, not just at generation time.
  - **Local-first.** No accounts, no servers, no telemetry.
  - **Transparent.** You can inspect every generated file. Nothing is hidden.