kairn-cli 2.10.0 → 2.14.0
- package/README.md +171 -483
- package/dist/cli.js +9740 -7107
- package/dist/cli.js.map +1 -1
- package/package.json +2 -1
package/README.md
CHANGED
|
@@ -1,326 +1,144 @@
|
|
|
1
1
|
# Kairn — The Agent Environment Compiler
|
|
2
2
|
|
|
3
|
-
>
|
|
3
|
+
> Agent harnesses are programs. They should be compiled from intent and optimized through evolutionary search — not hand-written.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Every Claude Code project ships with a `.claude/` directory: system prompts, slash commands, rules, agents, hooks, MCP configs, security policies. Today, teams hand-write these files, cargo-culting from templates and fixing problems by trial and error. The harness *is* the program that shapes agent behavior, but nobody treats it like one.
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
Kairn treats it like one. You describe your workflow in natural language. Kairn compiles an optimized environment through a multi-agent pipeline — an @orchestrator plans the compilation, 6 specialist agents generate typed intermediate representation nodes in parallel, and a @linker validates cross-references before deterministic assembly. Then, optionally, Kairn *evolves* it: running real tasks against the harness, diagnosing failures via causal reasoning, proposing typed IR mutations, and repeating — with population-based training, Thompson sampling for task selection, and KL regularization to prevent bloat.
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
The result is a harness that's been compiled from intent and stress-tested against real work, not guessed at by a human reading docs.
|
|
10
10
|
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
## Install
|
|
14
|
-
|
|
15
|
-
```bash
|
|
16
|
-
npm install -g kairn-cli
|
|
17
|
-
```
|
|
18
|
-
|
|
19
|
-
Requires Node.js 18+. The command is `kairn`.
|
|
20
|
-
|
|
21
|
-
## Quick Start
|
|
22
|
-
|
|
23
|
-
```bash
|
|
24
|
-
# 1. Set up your LLM provider (Anthropic, OpenAI, Google, xAI, DeepSeek, Mistral, Groq, or custom)
|
|
25
|
-
kairn init
|
|
26
|
-
|
|
27
|
-
# 2. Describe your workflow (or scan an existing repo)
|
|
28
|
-
kairn describe "Build a Next.js app with Supabase auth"
|
|
29
|
-
# or
|
|
30
|
-
kairn optimize # scans existing project at cwd
|
|
31
|
-
|
|
32
|
-
# 3. Start coding
|
|
33
|
-
claude
|
|
34
|
-
```
|
|
35
|
-
|
|
36
|
-
Kairn generates the entire `.claude/` directory — CLAUDE.md, settings.json, commands, rules, agents, hooks, security policies — tailored to your specific workflow. Then, optionally, evolve it:
|
|
37
|
-
|
|
38
|
-
```bash
|
|
39
|
-
# Set up evolution
|
|
40
|
-
kairn evolve init # auto-generate 3-5 eval tasks
|
|
41
|
-
kairn evolve baseline # snapshot current harness
|
|
42
|
-
|
|
43
|
-
# Optimize
|
|
44
|
-
kairn evolve run --iterations 5 # Run evolution loop
|
|
45
|
-
kairn evolve apply # Accept best harness
|
|
46
|
-
```
|
|
11
|
+
**No servers. No accounts. No telemetry. Local-first, runs with your own LLM key.**
|
|
47
12
|
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
## What Gets Generated
|
|
51
|
-
|
|
52
|
-
```
|
|
53
|
-
.claude/
|
|
54
|
-
├── CLAUDE.md # Workflow-specific system prompt (7 sections)
|
|
55
|
-
├── settings.json # Permissions, hooks, security rules, intent routing
|
|
56
|
-
├── commands/ # Slash commands (/project:help, /project:plan, etc.)
|
|
57
|
-
├── rules/ # Auto-loaded instructions (security, continuity, paths)
|
|
58
|
-
├── skills/ # Model-controlled capabilities (code, research, writing)
|
|
59
|
-
├── agents/ # Specialized subagents (@architect, @tester, etc.)
|
|
60
|
-
├── docs/ # Pre-initialized project memory
|
|
61
|
-
├── hooks/ # Intent router (Tier 1 regex + Tier 2 Haiku classifier)
|
|
62
|
-
│ ├── intent-router.mjs # Project-specific regex patterns + fallthrough
|
|
63
|
-
│ ├── intent-learner.mjs # Promotes recurring Tier 2 patterns to Tier 1
|
|
64
|
-
│ └── intent-log.jsonl # Log of routed prompts (for learning)
|
|
65
|
-
└── QUICKSTART.md # Interactive startup guide (Level 2-4)
|
|
66
|
-
.mcp.json # Project-scoped MCP server config
|
|
67
|
-
.env # API keys (gitignored, masked in output)
|
|
68
|
-
```
|
|
13
|
+
Kairn's own development environment was compiled and evolved by Kairn.
|
|
69
14
|
|
|
70
15
|
---
|
|
71
16
|
|
|
72
|
-
##
|
|
73
|
-
|
|
74
|
-
### `kairn init`
|
|
75
|
-
|
|
76
|
-
Interactive setup. Pick your LLM provider, enter credentials. API key stored locally at `~/.kairn/config.json`.
|
|
17
|
+
## What's Under the Hood
|
|
77
18
|
|
|
78
|
-
|
|
79
|
-
- **Anthropic** — Claude Sonnet 4.6, Opus 4.6, Haiku 4.5
|
|
80
|
-
- **OpenAI** — GPT-4.1, GPT-4.1 mini, o4-mini, GPT-5 mini
|
|
81
|
-
- **Google** — Gemini 2.5 Flash, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 3.1 Pro
|
|
82
|
-
- **xAI** — Grok 4.1 Fast, Grok 4.20 (2M context, $0.20/M)
|
|
83
|
-
- **DeepSeek** — V3.2 Chat, V3.2 Reasoner (cheapest at $0.28/M)
|
|
84
|
-
- **Mistral** — Large 3, Codestral, Small 4 (open-weight)
|
|
85
|
-
- **Groq** — Llama 4, DeepSeek R1, Qwen 3 (free tier)
|
|
86
|
-
- **Custom** — any OpenAI-compatible endpoint (local Ollama, LM Studio)
|
|
19
|
+
Most tools in this space either generate prompts or generate code. Kairn generates *full agent environments* — and then optimizes them as a system. Here's what that required building.
|
|
87
20
|
|
|
88
|
-
###
|
|
21
|
+
### Multi-Agent Compilation Pipeline (v2.11)
|
|
89
22
|
|
|
90
|
-
|
|
23
|
+
The monolithic "ask an LLM to produce a giant JSON blob" approach hits a wall at ~16K tokens: truncation, incoherence, format corruption. Kairn decomposes compilation into a DAG of specialist agents, each producing typed output within its own token budget.
|
|
91
24
|
|
|
92
|
-
```bash
|
|
93
|
-
kairn describe "Build a Next.js REST API with PostgreSQL"
|
|
94
|
-
kairn describe "Research ML papers on GRPO training and summarize" --quick
|
|
95
25
|
```
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
- **Secrets collection** — Prompted for API keys after generation, written to `.env`
|
|
106
|
-
- **Intent routing** — Auto-generated `/project:*` command routing (both regex and Haiku-based)
|
|
107
|
-
|
|
108
|
-
### `kairn optimize [options]`
|
|
109
|
-
|
|
110
|
-
Scan an existing project and optimize its Claude Code environment. Detects language, framework, dependencies, and generates improvements.
|
|
111
|
-
|
|
112
|
-
```bash
|
|
113
|
-
kairn optimize # Scan, audit, and overwrite .claude/
|
|
114
|
-
kairn optimize --diff # Preview changes before writing
|
|
115
|
-
kairn optimize --audit-only # Show issues without generating
|
|
26
|
+
Pass 1: Skeleton — LLM selects tools, outlines the project (max_tokens: 2048)
|
|
27
|
+
Pass 2: @orchestrator — reads skeleton + intent, emits a CompilationPlan
|
|
28
|
+
(phased tasks, dependency ordering, per-agent token budgets)
|
|
29
|
+
Pass 3: Specialist agents — parallel fan-out across phases:
|
|
30
|
+
Phase A: @sections-writer → Section[], @rule-writer → RuleNode[]
|
|
31
|
+
Phase B: @command-writer → CommandNode[], @agent-writer → AgentNode[],
|
|
32
|
+
@skill-writer → SkillNode[]
|
|
33
|
+
Phase C: @linker — cross-reference validation + auto-patching
|
|
34
|
+
Pass 4: Assembly — deterministic generation of settings.json, .mcp.json, hooks
|
|
116
35
|
```
|
|
117
36
|
|
|
118
|
-
|
|
119
|
-
- **Full project scan** — language, framework, dependencies, scripts, env keys, CI/CD, existing harness
|
|
120
|
-
- **Harness audit** — checks CLAUDE.md quality, missing commands/rules, MCP bloat, security configurations
|
|
121
|
-
- **Two modes:**
|
|
122
|
-
- No `.claude/` → generate from scratch
|
|
123
|
-
- Has `.claude/` → optimize + overwrite (shows audit issues first, asks for confirmation)
|
|
124
|
-
- **Diff preview** — see what would change before applying (with `--diff`)
|
|
37
|
+
Each specialist produces typed HarnessIR nodes, not strings. The @linker detects broken `@agent` references in commands, missing `/project:command` mentions in agents, and injects mandatory help/security/continuity rules if absent. If an agent's output is truncated (`stop_reason === 'max_tokens'`), the batch engine retries with doubled budget — one agent failing doesn't crash the whole compilation.
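The retry behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Kairn's batch engine: `AgentResult`, `AgentCall`, `runWithBudgetRetry`, `runBatch`, and the `maxRetries` limit are hypothetical names, and only the `stop_reason === 'max_tokens'` check and the budget doubling come from the README.

```typescript
// Hypothetical result shape; only `stop_reason` mirrors the README's wording.
interface AgentResult {
  stop_reason: "end_turn" | "max_tokens";
  output: string;
}

// Hypothetical agent call: takes a token budget, resolves to a result.
type AgentCall = (maxTokens: number) => Promise<AgentResult>;

// Retry a truncated agent with a doubled budget, up to `maxRetries` times.
// Returns null instead of throwing, so one failing agent cannot abort the batch.
async function runWithBudgetRetry(
  call: AgentCall,
  initialBudget: number,
  maxRetries = 2,
): Promise<AgentResult | null> {
  let budget = initialBudget;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const result = await call(budget);
      if (result.stop_reason !== "max_tokens") return result;
      budget *= 2; // truncated: double the budget and try again
    } catch {
      return null; // isolate the failure to this agent
    }
  }
  return null; // still truncated after all retries
}

// Run every agent concurrently; failures surface as null entries.
async function runBatch(
  calls: AgentCall[],
  budget: number,
): Promise<(AgentResult | null)[]> {
  return Promise.all(calls.map((c) => runWithBudgetRetry(c, budget)));
}
```

Returning `null` rather than rejecting keeps `Promise.all` from failing fast, which is one simple way to get the "one agent failing doesn't crash the whole compilation" property.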
|
|
125
38
|
|
|
126
|
-
###
|
|
39
|
+
### Structured Harness IR (v2.7)
|
|
127
40
|
|
|
128
|
-
|
|
41
|
+
Raw Markdown mutation accumulates contradictions, corrupts formatting, and breaks as files grow. Kairn operates on a typed intermediate representation: 14 node types (Section, CommandNode, RuleNode, AgentNode, SkillNode, DocNode, HookNode, SettingsIR, McpServerNode, IntentNode, ...), 17 mutation operations, and a semantic diff engine.
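To make "typed mutations instead of string surgery" concrete, here is a minimal sketch assuming a heavily reduced IR. The field names, the two operation names (`add_node`, `remove_node`), and `applyMutation` are invented for illustration; the README only states that there are 14 node types and 17 mutation operations, not what their shapes are.

```typescript
// Hypothetical, reduced IR: two of the node types the README lists.
interface CommandNode { kind: "command"; name: string; body: string }
interface AgentNode { kind: "agent"; name: string; prompt: string }
type IRNode = CommandNode | AgentNode;

// Two illustrative mutation operations (names are assumptions).
type Mutation =
  | { op: "add_node"; node: IRNode }
  | { op: "remove_node"; kind: IRNode["kind"]; name: string };

type Harness = readonly IRNode[];

const sameId = (n: IRNode, kind: string, name: string): boolean =>
  n.kind === kind && n.name === name;

// Apply a mutation after checking its pre-condition; the input is never modified.
function applyMutation(h: Harness, m: Mutation): Harness {
  switch (m.op) {
    case "add_node": {
      const { node } = m;
      if (h.some((n) => sameId(n, node.kind, node.name))) {
        throw new Error(`duplicate ${node.kind} "${node.name}"`);
      }
      return [...h, node];
    }
    case "remove_node": {
      const { kind, name } = m;
      if (!h.some((n) => sameId(n, kind, name))) {
        throw new Error(`no ${kind} "${name}" to remove`);
      }
      return h.filter((n) => !sameId(n, kind, name));
    }
  }
}
```

Because every operation is a tagged value, an invalid mutation is rejected before anything is rendered to disk, which is the property that free-form text edits cannot offer.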
|
|
129
42
|
|
|
130
|
-
|
|
131
|
-
kairn templates # Browse gallery
|
|
132
|
-
kairn templates --activate nextjs # Apply a template
|
|
133
|
-
```
|
|
43
|
+
The IR is round-trip tested: `parse → render → parse` preserves all content on real `.claude/` directories. The evolution loop mutates IR nodes directly — no regex replacement, no string surgery. The compilation pipeline produces IR, the evolution loop mutates IR, and the renderer writes files. One representation, end to end.
|
|
134
44
|
|
|
135
|
-
|
|
136
|
-
- Next.js Full-Stack (React + Node + PostgreSQL + Supabase)
|
|
137
|
-
- API Service (Express/Fastify + database + testing)
|
|
138
|
-
- Research Project (paper analysis, literature review, synthesis)
|
|
139
|
-
- Content Writing (blog, documentation, marketing)
|
|
45
|
+
### Population-Based Training with Thompson Sampling (v2.6)
|
|
140
46
|
|
|
141
|
-
|
|
47
|
+
A single sequential evolution trajectory wastes wall-clock time on dead ends and overfits to its task sample. `kairn evolve pbt` runs N independent trajectories concurrently (default: 3), each with its own workspace, RNG seed, and Thompson Sampling beliefs.
|
|
142
48
|
|
|
143
|
-
|
|
144
|
-
- CLAUDE.md structure and token count
|
|
145
|
-
- MCP server configuration completeness
|
|
146
|
-
- Security rules and hooks
|
|
147
|
-
- Command and agent definitions
|
|
148
|
-
- Environment variable references
|
|
49
|
+
**Thompson Sampling** maintains a Beta distribution per eval task. Tasks with volatile scores (high uncertainty) get sampled more often; stable tasks less. This is uncertainty-driven exploration — the system automatically focuses evaluation budget where signal is weakest, rather than uniform random sampling.
|
|
149
50
|
|
|
150
|
-
|
|
51
|
+
**KL Regularization** prevents harness bloat. Every mutation pays a complexity cost: `effective_score = raw_score - λ * complexityCost * 100`. The cost measures lines, files, sections, and character-level diff from baseline. The proposer must *earn* every addition. Default λ = 0.1.
|
|
151
52
|
|
|
152
|
-
|
|
53
|
+
After all branches complete, a **Meta-Principal** LLM agent reads all branch results — iteration logs, per-task score matrices, Thompson beliefs, complexity metrics — and synthesizes the optimal harness by cherry-picking the best mutations from each trajectory. The synthesis is evaluated against the full task suite and must beat the best individual branch.
|
|
153
54
|
|
|
154
|
-
|
|
155
|
-
kairn keys # Prompt for missing keys
|
|
156
|
-
kairn keys --show # Show which keys are set vs missing
|
|
157
|
-
```
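The Thompson Sampling idea from the population-based training section can be sketched as below. The README says each eval task has a Beta distribution and that volatile tasks are sampled more often, but it does not say how a draw becomes a task choice. The ranking rule used here (prefer tasks whose sampled pass rate is closest to 0.5) is an assumption, as are the names `TaskBelief`, `sampleBeta`, and `selectTasks` and the pass/fail counting.

```typescript
// Small seeded PRNG (mulberry32) so each branch can be reproduced from its seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Beta(alpha, beta) for positive integers: the alpha-th smallest of
// (alpha + beta - 1) uniform draws follows exactly this distribution.
function sampleBeta(alpha: number, beta: number, rng: () => number): number {
  const draws = Array.from({ length: alpha + beta - 1 }, rng).sort((x, y) => x - y);
  return draws[alpha - 1];
}

// Hypothetical belief state: pass/fail counts observed so far for one task.
interface TaskBelief { id: string; passes: number; fails: number }

// Draw a pass rate per task from Beta(1 + passes, 1 + fails), then rank tasks by
// theta * (1 - theta), the outcome variance implied by that draw (assumed rule).
function selectTasks(beliefs: TaskBelief[], k: number, rng: () => number): string[] {
  return beliefs
    .map((b) => {
      const theta = sampleBeta(1 + b.passes, 1 + b.fails, rng);
      return { id: b.id, priority: theta * (1 - theta) };
    })
    .sort((x, y) => y.priority - x.priority)
    .slice(0, k)
    .map((t) => t.id);
}
```

Under this rule a task with a mixed record is picked far more often than one that always passes, while a task with few observations still gets occasional attention because its posterior is wide.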
|
|
55
|
+
### Hybrid Scoring (v2.8)
|
|
158
56
|
|
|
159
|
-
|
|
57
|
+
Eval quality is the bottleneck of any optimization loop. Kairn blends deterministic rubric criteria (shell command checks: does the harness include a test command? does security block `rm -rf`?) with LLM-as-judge scoring, in a configurable weighted combination. Anthropic prompt caching on system prompts saves ~85% of tokens on repeated proposer/scorer calls. After mutation, targeted re-evaluation re-runs only tasks whose harness files were touched, saving ~40% eval cost per iteration.
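A weighted blend of rubric and judge scores might look like the sketch below. It assumes both scores live on a 0-100 scale and that the "configurable weighted combination" is a single weight between 0 and 1; `Criterion`, `rubricScore`, `hybridScore`, and `rubricWeight` are hypothetical names, and a boolean `passed` field stands in for the result of a shell command check.

```typescript
// A rubric criterion: the outcome of a deterministic check plus its weight.
interface Criterion { name: string; weight: number; passed: boolean }

// Weighted share of rubric criteria that passed, on a 0-100 scale.
function rubricScore(criteria: Criterion[]): number {
  const total = criteria.reduce((s, c) => s + c.weight, 0);
  if (total === 0) return 0;
  const earned = criteria.reduce((s, c) => s + (c.passed ? c.weight : 0), 0);
  return (100 * earned) / total;
}

// Blend the rubric score with an LLM judge score; `rubricWeight` is clamped to [0, 1].
function hybridScore(criteria: Criterion[], judgeScore: number, rubricWeight = 0.5): number {
  const w = Math.min(1, Math.max(0, rubricWeight));
  return w * rubricScore(criteria) + (1 - w) * judgeScore;
}
```

Setting `rubricWeight` to 1 yields a fully deterministic score, which can be useful when judge variance makes small regressions hard to see.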
|
|
160
58
|
|
|
161
|
-
|
|
59
|
+
### Persistent Execution Loops (v2.10)
|
|
162
60
|
|
|
163
|
-
|
|
164
|
-
kairn list # List all saved environments
|
|
165
|
-
kairn activate env_abc123 # Copy that environment to .claude/
|
|
166
|
-
```
|
|
61
|
+
Generated harnesses include `/project:persist` — a loop that reads acceptance criteria from `docs/SPRINT.md`, works criterion-by-criterion with structured progress tracking in `.claude/progress.json`, auto-retries on verification failure (max 3 per criterion), and delegates to a review gate before completion. Progress persists across sessions via `memory.json`.
|
|
167
62
|
|
|
168
|
-
|
|
63
|
+
A `UserPromptSubmit` hook detects complex tasks (multi-step, feature-scope, refactoring, bug-with-repro) via 6 complexity signals and auto-routes them through the persistence loop. Simple tasks pass through normally. Configurable: `auto | manual | off`.
|
|
169
64
|
|
|
170
|
-
|
|
65
|
+
### Anthropic Harness Patterns (v2.9)
|
|
171
66
|
|
|
172
|
-
|
|
67
|
+
Comparative analysis against [Anthropic's harness design guidance](https://www.anthropic.com/engineering/harness-design-long-running-apps), [Everything Claude Code](https://github.com/affaan-m/everything-claude-code) (151 skills, 102 security rules), and [Oh-My-ClaudeCode](https://github.com/yeachan-heo/oh-my-claudecode) (model routing) identified 6 gaps. Kairn now generates:
|
|
173
68
|
|
|
174
|
-
|
|
69
|
+
- **Sprint contracts** — `@architect` outputs numbered acceptance criteria; `/project:develop` validates each one individually
|
|
70
|
+
- **Smart model routing** — agents include tiered routing guidance (Haiku for linting, Sonnet for implementation, Opus for architecture) with a `modelRouting` IR field
|
|
71
|
+
- **Expanded security** — PreToolUse patterns from 5 to 20+ across credential leaks, injection, destructive ops, and network exfiltration
|
|
72
|
+
- **Memory persistence** — SessionStart/End hooks save/load `.claude/memory.json` across sessions
|
|
73
|
+
- **Context reset protocol** — full PostCompact alternative for long sessions (>2 hours or >3 compactions)
|
|
175
74
|
|
|
176
|
-
|
|
177
|
-
kairn evolve init
|
|
178
|
-
```
|
|
75
|
+
### Self-Learning Intent Routing (v2.5)
|
|
179
76
|
|
|
180
|
-
|
|
181
|
-
- "Add a new feature X to the codebase"
|
|
182
|
-
- "Fix this known bug Y"
|
|
183
|
-
- "Refactor the API layer for clarity"
|
|
184
|
-
- "Write comprehensive test coverage"
|
|
185
|
-
- "Update documentation after feature launch"
|
|
77
|
+
Two-tier routing compiles project-specific intent patterns at generation time. Tier 1: regex patterns (<10ms, $0) match keywords and synonyms. Tier 2: Haiku-powered semantic classification (~$0.001) handles ambiguous prompts. A background learner promotes recurring Tier 2 patterns to Tier 1 regexes after 3+ matches. Over time, the harness learns the user's vocabulary: in session 1, about 40% of prompts are routed by regex; by session 10, about 90%.
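The two-tier flow with promotion can be sketched as follows. This is a simplified illustration: the classifier is a plain function standing in for the Haiku call, promotion happens inline rather than in a background learner, and `createRouter`, `Route`, `Routed`, and `promoteAfter` are hypothetical names. Only the regex-first order and the promotion threshold of 3 matches come from the README.

```typescript
interface Route { pattern: RegExp; command: string }
interface Routed { command: string | null; tier: 1 | 2 }

// Hypothetical stand-in for the Tier 2 LLM classifier.
type Classifier = (prompt: string) => string | null;

function createRouter(routes: Route[], classify: Classifier, promoteAfter = 3) {
  const tier1 = [...routes]; // copy so the caller's array is not modified
  const tier2Hits = new Map<string, number>(); // (phrase, command) hit counts

  // Promote a phrase to a Tier 1 regex once Tier 2 has matched it often enough.
  function learn(prompt: string, command: string): void {
    const phrase = prompt.trim().toLowerCase();
    const key = JSON.stringify([phrase, command]);
    const count = (tier2Hits.get(key) ?? 0) + 1;
    tier2Hits.set(key, count);
    if (count >= promoteAfter) {
      const escaped = phrase.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      tier1.push({ pattern: new RegExp(escaped, "i"), command });
      tier2Hits.delete(key);
    }
  }

  function route(prompt: string): Routed {
    // Tier 1: cheap regex match.
    const hit = tier1.find((r) => r.pattern.test(prompt));
    if (hit) return { command: hit.command, tier: 1 };
    // Tier 2: fall through to the classifier and record the outcome.
    const command = classify(prompt);
    if (command !== null) learn(prompt, command);
    return { command, tier: 2 };
  }

  return { route };
}
```

After the third Tier 2 match for the same phrase, later prompts containing it are served by Tier 1 and no longer reach the classifier.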
|
|
186
78
|
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
#### `kairn evolve baseline`
|
|
190
|
-
|
|
191
|
-
Snapshot your current `.claude/` directory as iteration 0 (the baseline to improve against).
|
|
192
|
-
|
|
193
|
-
```bash
|
|
194
|
-
kairn evolve baseline
|
|
195
|
-
```
|
|
196
|
-
|
|
197
|
-
#### `kairn evolve run`
|
|
198
|
-
|
|
199
|
-
Run the full evolution loop. Evaluates all tasks, diagnoses failures, proposes mutations, re-evaluates.
|
|
200
|
-
|
|
201
|
-
```bash
|
|
202
|
-
kairn evolve run # 5 iterations (default)
|
|
203
|
-
kairn evolve run --iterations 3 # Custom iteration count
|
|
204
|
-
kairn evolve run --task <task_id> # Run a single task
|
|
205
|
-
kairn evolve run --parallel 4 # Parallel task evaluation (4 concurrent)
|
|
206
|
-
kairn evolve run --runs 3 # Run each task 3 times, report mean ± stddev
|
|
207
|
-
```
|
|
208
|
-
|
|
209
|
-
**How it works (the loop):**
|
|
210
|
-
|
|
211
|
-
1. **Evaluate** — Run each eval task by spawning Claude Code in an isolated workspace. Capture full traces:
|
|
212
|
-
- stdout, stderr
|
|
213
|
-
- MCP tool calls (which tools, inputs, outputs)
|
|
214
|
-
- Files changed (diffs)
|
|
215
|
-
- Execution time, pass/fail status
|
|
216
|
-
|
|
217
|
-
2. **Diagnose** — A proposer agent (Opus) reads the full trace filesystem and performs causal reasoning:
|
|
218
|
-
- "Task A failed because CLAUDE.md doesn't mention the /api path"
|
|
219
|
-
- "Task B passed on iteration 1 but regressed on iteration 3 — the new security rule broke it"
|
|
220
|
-
- "Tasks A and C both needed /project:fix but there's no /project:fix command"
|
|
79
|
+
---
|
|
221
80
|
|
|
222
|
-
|
|
223
|
-
- `replace`: Update a section in CLAUDE.md, a command, a rule
|
|
224
|
-
- `add_section`: Insert new guidance into CLAUDE.md
|
|
225
|
-
- `create_file`: Add a new command or rule
|
|
226
|
-
- `delete_section`: Remove contradictory or bloat sections
|
|
227
|
-
- `delete_file`: Remove unused commands/rules
|
|
228
|
-
- `add_intent_pattern`: Add a new natural language pattern (v2.5.0)
|
|
229
|
-
- `modify_intent_prompt`: Improve the Tier 2 Haiku classifier (v2.5.0)
|
|
81
|
+
## What Makes Kairn Different
|
|
230
82
|
|
|
231
|
-
|
|
83
|
+
**vs. DSPy** — DSPy optimizes *prompts*. Kairn optimizes *full environments*: system prompts, slash commands, rules, agents, hooks, MCP configs, security policies, intent routing — as a coherent system. DSPy's mutation space is string replacement on prompt templates. Kairn's is 17 typed IR operations on a 14-node-type intermediate representation with cross-reference validation.
|
|
232
84
|
|
|
233
|
-
|
|
85
|
+
**vs. OpenEvolve** — OpenEvolve optimizes *code*. Kairn optimizes the *harness that shapes how agents write code*. Different layer of the stack, different mutation space, different eval methodology (real agent execution traces, not unit tests).
|
|
234
86
|
|
|
235
|
-
**
|
|
236
|
-
- **pass/fail** (default) — task passes or fails
|
|
237
|
-
- **llm-judge** — LLM reads task output and scores (0-100)
|
|
238
|
-
- **rubric** — custom weighted scoring function
|
|
87
|
+
**vs. Oh-My-ClaudeCode / static harness collections** — OMC ships a fixed set of 150 skills and 100+ rules. Kairn generates *project-specific* environments from intent, then evolves them against real tasks. Static harnesses can't adapt; Kairn's improve with use.
|
|
239
88
|
|
|
240
|
-
**
|
|
241
|
-
On middle iterations, skip slow/expensive tasks above a confidence threshold. Re-run all tasks on the first and last iteration for rigor.
|
|
89
|
+
**vs. manual `.claude/` directories** — No memorizing command names (intent routing). No trial-and-error (evolution loop). No format corruption (typed IR). No cargo-culting (compiled from your actual workflow).
|
|
242
90
|
|
|
243
|
-
**
|
|
244
|
-
-
|
|
245
|
-
-
|
|
246
|
-
-
|
|
91
|
+
**The specific technical gaps:**
|
|
92
|
+
- Full-environment optimization (not just prompts, not just code)
|
|
93
|
+
- Typed IR mutations with pre-condition validation (not string replacement)
|
|
94
|
+
- Population-based evolutionary search with uncertainty-driven sampling
|
|
95
|
+
- Cross-component validation via the @linker (commands reference real agents, agents reference real commands)
|
|
96
|
+
- Self-learning intent routing that promotes patterns from expensive LLM classification to free regex
|
|
247
97
|
|
|
248
|
-
|
|
98
|
+
---
|
|
249
99
|
|
|
250
|
-
|
|
100
|
+
## Quick Start
|
|
251
101
|
|
|
252
102
|
```bash
|
|
253
|
-
|
|
254
|
-
kairn evolve report --json # Machine-readable JSON
|
|
255
|
-
```
|
|
256
|
-
|
|
257
|
-
Shows:
|
|
258
|
-
- Evolution leaderboard (iterations × tasks × scores)
|
|
259
|
-
- Per-task trace diffs (what changed between iterations for the same task)
|
|
260
|
-
- Counterfactual diagnosis (which mutations helped/hurt which tasks)
|
|
261
|
-
- Wall time, token cost, iterations completed
|
|
103
|
+
npm install -g kairn-cli # Node.js 18+
|
|
262
104
|
|
|
263
|
-
|
|
264
|
-
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
```bash
|
|
268
|
-
kairn evolve diff 0 3 # Show all mutations from baseline to iteration 3
|
|
105
|
+
kairn init # Set up your LLM provider
|
|
106
|
+
kairn describe "Build a Next.js app with Supabase auth"
|
|
107
|
+
claude # Start Claude Code with the compiled harness
|
|
269
108
|
```
|
|
270
109
|
|
|
271
|
-
|
|
272
|
-
|
|
273
|
-
Copy the best (or specified) evolved harness back to `.claude/`.
|
|
110
|
+
To evolve the harness:
|
|
274
111
|
|
|
275
112
|
```bash
|
|
276
|
-
kairn evolve
|
|
277
|
-
kairn evolve
|
|
113
|
+
kairn evolve init # Auto-generate eval tasks from your project
|
|
114
|
+
kairn evolve baseline # Snapshot current harness
|
|
115
|
+
kairn evolve run # 5 iterations: evaluate → diagnose → mutate → re-evaluate
|
|
116
|
+
kairn evolve apply # Deploy the best harness
|
|
278
117
|
```
|
|
279
118
|
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
## Tool Registry
|
|
283
|
-
|
|
284
|
-
Kairn ships with **28 curated MCP servers** across 8 categories. Tools are auto-selected based on your workflow — fewer tools = less context bloat = better agent performance.
|
|
119
|
+
Kairn generates the entire `.claude/` directory — CLAUDE.md, settings.json, commands, rules, agents, skills, hooks, docs, intent routing, security policies — plus `.mcp.json` and `.env`.
|
|
285
120
|
|
|
286
|
-
|
|
287
|
-
|----------|-------|
|
|
288
|
-
| **Reasoning** | Context7, Sequential Thinking |
|
|
289
|
-
| **Code & DevTools** | GitHub MCP, Chrome DevTools |
|
|
290
|
-
| **Search & Research** | Exa, Brave Search, Firecrawl, Perplexity |
|
|
291
|
-
| **Browser Automation** | Playwright, Browserbase |
|
|
292
|
-
| **Data & Infrastructure** | PostgreSQL, Supabase, SQLite, Docker, Vercel |
|
|
293
|
-
| **Communication** | Slack, Notion, Linear, AgentMail, Gmail |
|
|
294
|
-
| **Security** | Semgrep, security-guidance |
|
|
295
|
-
| **Design** | Figma, Frontend Design |
|
|
121
|
+
Supports 8 LLM providers: Anthropic, OpenAI, Google, xAI, DeepSeek, Mistral, Groq, and any OpenAI-compatible endpoint.
|
|
296
122
|
|
|
297
123
|
---
|
|
298
124
|
|
|
299
|
-
##
|
|
300
|
-
|
|
301
|
-
### Generation (kairn describe / kairn optimize)
|
|
302
|
-
|
|
303
|
-
1. **User input** — intent string or scanned project profile
|
|
304
|
-
2. **Clarification** (optional) — 3-5 yes/no questions to refine workflow
|
|
305
|
-
3. **Pass 1: Skeleton** — LLM selects minimal tool set and outlines the project
|
|
306
|
-
4. **Pass 2: Harness** — LLM generates all content (CLAUDE.md, commands, rules, agents, docs)
|
|
307
|
-
5. **Pass 3: Settings** — Deterministic generation of `settings.json` and `.mcp.json` from registry
|
|
308
|
-
6. **Intent patterns** — Compile project-specific regex patterns from command names + synonyms
|
|
309
|
-
7. **Hook templates** — Generate `intent-router.mjs` (Tier 1) and Tier 2 prompt template
|
|
310
|
-
8. **Write files** — `.claude/` directory + `.mcp.json` + `.env` (with masked keys)
|
|
125
|
+
## The Evolution Engine
|
|
311
126
|
|
|
312
|
-
|
|
127
|
+
The heart of Kairn. Run your agent on real tasks, capture full execution traces, diagnose failures via causal reasoning, and mutate the harness iteratively.
|
|
313
128
|
|
|
314
129
|
```
|
|
315
130
|
Baseline (.claude/ snapshot)
|
|
316
131
|
│
|
|
317
132
|
▼
|
|
318
133
|
Iteration 1
|
|
319
|
-
├─ Evaluate:
|
|
320
|
-
|
|
321
|
-
├─
|
|
322
|
-
|
|
323
|
-
|
|
134
|
+
├─ Evaluate: spawn Claude Code on each task, capture traces
|
|
135
|
+
│ (stdout, MCP tool calls, file diffs, execution time, pass/fail)
|
|
136
|
+
├─ Diagnose: proposer (Sonnet) reads traces worst-first, performs causal reasoning
|
|
137
|
+
│ ("Task A failed because CLAUDE.md doesn't mention the /api path")
|
|
138
|
+
├─ Mutate: propose 1-3 typed IR mutations
|
|
139
|
+
│ (17 operation types: update/add/remove sections, commands, rules, agents, MCP servers, ...)
|
|
140
|
+
├─ Re-evaluate: run all tasks against the mutated harness
|
|
141
|
+
└─ Accept improvement / rollback regression
|
|
324
142
|
│
|
|
325
143
|
▼
|
|
326
144
|
Iteration 2, 3, 4, 5...
|
|
@@ -329,283 +147,153 @@ Baseline (.claude/ snapshot)
|
|
|
329
147
|
Best harness (apply to .claude/)
|
|
330
148
|
```
|
|
331
149
|
|
|
332
|
-
|
|
150
|
+
**Safety controls:** max 3 mutations per iteration, per-task regression guard (>20 point drop = rollback), adaptive eval pruning on middle iterations, loss-weighted proposer focus.
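The accept/rollback decision with the per-task regression guard can be sketched like this. It assumes scores on a 0-100 scale keyed by task id, and it treats an unchanged mean as acceptable; the README does not say how ties are handled. `Scores`, `mean`, and `decide` are hypothetical names; the 20-point threshold is the one quoted above.

```typescript
type Scores = Record<string, number>; // task id -> score on a 0-100 scale

const mean = (s: Scores): number => {
  const values = Object.values(s);
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
};

// Decide whether to keep a mutated harness. Roll back when any single task
// drops by more than `maxDrop` points, or when the mean score decreases.
function decide(before: Scores, after: Scores, maxDrop = 20): "accept" | "rollback" {
  for (const [task, prev] of Object.entries(before)) {
    const next = after[task];
    if (next !== undefined && prev - next > maxDrop) return "rollback";
  }
  return mean(after) >= mean(before) ? "accept" : "rollback";
}
```

The per-task check runs first, so a mutation that raises the average while breaking one task is still rejected.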
|
|
333
151
|
|
|
334
|
-
|
|
152
|
+
**Population-based mode:** `kairn evolve pbt` runs N parallel trajectories with Thompson Sampling + KL regularization, then synthesizes the optimal harness via Meta-Principal.
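The complexity penalty used in this mode follows the README's formula, `effective_score = raw_score - λ * complexityCost * 100`, with a default λ of 0.1. The sketch below implements that formula directly; how `complexityCost` is derived is not spelled out in the README, so the mean-relative-growth definition here is an assumption, and it leaves out the character-level diff term. `Size`, `complexityCost`, and `effectiveScore` are hypothetical names.

```typescript
// Hypothetical size summary of a harness.
interface Size { lines: number; files: number; sections: number }

// Assumed cost: mean relative growth over baseline across the three counts,
// clamped at zero so shrinking the harness is never penalized.
function complexityCost(baseline: Size, candidate: Size): number {
  const growth = (b: number, c: number): number =>
    b === 0 ? 0 : Math.max(0, (c - b) / b);
  return (
    growth(baseline.lines, candidate.lines) +
    growth(baseline.files, candidate.files) +
    growth(baseline.sections, candidate.sections)
  ) / 3;
}

// The README's formula: effective_score = raw_score - lambda * complexityCost * 100.
function effectiveScore(rawScore: number, cost: number, lambda = 0.1): number {
  return rawScore - lambda * cost * 100;
}
```

With these definitions a harness that grows by half in every dimension has a cost of 0.5 and loses about 5 points at the default λ, so an addition has to buy more than that in raw score to be worth keeping.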
|
|
335
153
|
|
|
336
|
-
|
|
337
|
-
Tier 1: regex hook intercepts prompt
|
|
338
|
-
├─ Matches pattern? → route to command + inject context
|
|
339
|
-
└─ No match? → fallthrough to Tier 2
|
|
340
|
-
|
|
341
|
-
Tier 2: Haiku prompt hook
|
|
342
|
-
├─ Classify intent
|
|
343
|
-
├─ Route to command if confident
|
|
344
|
-
└─ Log routing attempt (for learning)
|
|
345
|
-
|
|
346
|
-
SessionStart: intent-learner.mjs
|
|
347
|
-
├─ Read intent-log.jsonl (recent tier 2 routings)
|
|
348
|
-
├─ Promote recurring patterns to regex
|
|
349
|
-
├─ Update intent-router.mjs
|
|
350
|
-
└─ Write audit trail
|
|
351
|
-
```
|
|
352
|
-
|
|
353
|
-
Over time, more patterns become regex (fast, free) instead of Haiku (slow, $0.001).
|
|
354
|
-
|
|
355
|
-
---
|
|
356
|
-
|
|
357
|
-
## Example Workflow
|
|
358
|
-
|
|
359
|
-
### Scenario: Build a Next.js API
|
|
154
|
+
### Example: Evolution in Action
|
|
360
155
|
|
|
361
156
|
```bash
|
|
362
|
-
|
|
363
|
-
git init
|
|
364
|
-
|
|
365
|
-
kairn describe "Next.js REST API with Prisma ORM and PostgreSQL. OAuth login, JWT auth, rate limiting."
|
|
366
|
-
|
|
367
|
-
# Output:
|
|
368
|
-
# ✔ Pass 1: Selected 7 tools (GitHub, PostgreSQL, Vercel, Semgrep, Docker, Context7, Sequential Thinking)
|
|
369
|
-
# ✔ Pass 2: Generated 73 lines in CLAUDE.md, 8 commands, 4 rules, 3 agents, 2 skills
|
|
370
|
-
# ✔ Pass 3: Configured 2 MCP servers (PostgreSQL + GitHub)
|
|
371
|
-
#
|
|
372
|
-
# Commands:
|
|
373
|
-
# /project:help Show available commands
|
|
374
|
-
# /project:plan Draft the API spec
|
|
375
|
-
# /project:develop Full development pipeline
|
|
376
|
-
# /project:test Run test suite
|
|
377
|
-
# /project:fix Issue-driven bug fixing
|
|
378
|
-
# /project:deploy Deploy to Vercel
|
|
379
|
-
# /project:security Audit for vulnerabilities
|
|
380
|
-
# /project:batch Run batches of independent tasks
|
|
381
|
-
#
|
|
382
|
-
# Env keys needed:
|
|
383
|
-
# POSTGRES_URL
|
|
384
|
-
# JWT_SECRET
|
|
385
|
-
# GITHUB_TOKEN
|
|
386
|
-
# VERCEL_TOKEN
|
|
387
|
-
#
|
|
388
|
-
# Paste your secrets (or press enter to skip):
|
|
389
|
-
# POSTGRES_URL: ***
|
|
390
|
-
# JWT_SECRET: ***
|
|
391
|
-
# GITHUB_TOKEN: (skipped)
|
|
392
|
-
# VERCEL_TOKEN: (skipped)
|
|
393
|
-
#
|
|
394
|
-
# Ready! Run: $ claude
|
|
395
|
-
|
|
396
|
-
claude # Start Claude Code with the generated harness
|
|
397
|
-
|
|
398
|
-
# In Claude Code:
|
|
399
|
-
# > /project:plan
|
|
400
|
-
# Drafts the API specification with OAuth flow, database schema, endpoint design
|
|
401
|
-
#
|
|
402
|
-
# > /project:develop feature/auth
|
|
403
|
-
# Full pipeline: specs feature in detail, plans implementation, TDD red→green→refactor,
|
|
404
|
-
# writes tests, runs security audit, updates docs
|
|
405
|
-
#
|
|
406
|
-
# > /project:fix
|
|
407
|
-
# Shows recent issues, user picks one, Claude researches the bug, fixes it, runs tests
|
|
408
|
-
```
|
|
409
|
-
|
|
410
|
-
### Scenario: Optimize an Existing Project
|
|
411
|
-
|
|
412
|
-
```bash
|
|
413
|
-
cd /path/to/existing/next-app
|
|
414
|
-
# It has a manual .claude/ directory
|
|
415
|
-
|
|
416
|
-
kairn optimize
|
|
417
|
-
|
|
418
|
-
# Output:
|
|
419
|
-
# ✔ Scan: TypeScript, Next.js, 47 dependencies, 8 scripts
|
|
420
|
-
#
|
|
421
|
-
# Harness Audit:
|
|
422
|
-
# CLAUDE.md: 187 lines ✓ (good)
|
|
423
|
-
# MCP servers: 4
|
|
424
|
-
# Commands: 5 (/help, /plan, /code, /test, /deploy)
|
|
425
|
-
# Rules: 2 (security, continuity)
|
|
426
|
-
#
|
|
427
|
-
# Issues found:
|
|
428
|
-
# ⚠ Missing /project:develop command (full development pipeline)
|
|
429
|
-
# ⚠ No path-scoped rules (api.md, testing.md for different code domains)
|
|
430
|
-
# ⚠ Hooks not configured (missing destructive command blocking)
|
|
431
|
-
#
|
|
432
|
-
# Generate optimized environment? This will overwrite existing .claude/ files.
|
|
433
|
-
# > Yes
|
|
434
|
-
#
|
|
435
|
-
# ✔ Environment compiled in 12s
|
|
436
|
-
# ✔ Files written: 4 new, 3 modified, 1 unchanged
|
|
437
|
-
#
|
|
438
|
-
# Ready! Run: $ claude
|
|
439
|
-
```
|
|
440
|
-
|
|
441
|
-
### Scenario: Evolve the Harness
|
|
442
|
-
|
|
443
|
-
```bash
|
|
444
|
-
# Harness is generated and working. Set up evolution:
|
|
445
|
-
|
|
446
|
-
kairn evolve init
|
|
447
|
-
|
|
448
|
-
# Auto-generated 5 eval tasks based on CLAUDE.md + project structure:
|
|
449
|
-
# task-1: "Implement user profile page"
|
|
450
|
-
# task-2: "Add password reset flow"
|
|
451
|
-
# task-3: "Refactor authentication middleware"
|
|
452
|
-
# task-4: "Write E2E tests for checkout flow"
|
|
453
|
-
# task-5: "Update API documentation after feature release"
|
|
454
|
-
|
|
455
|
-
kairn evolve baseline # Snapshot current .claude/ as iteration 0
|
|
456
|
-
|
|
157
|
+
kairn evolve init && kairn evolve baseline
|
|
457
158
|
kairn evolve run --iterations 5
|
|
458
159
|
|
|
459
160
|
# Iteration 1/5
|
|
460
|
-
#
|
|
461
|
-
# Score:
|
|
462
|
-
#
|
|
463
|
-
#
|
|
464
|
-
# - Task 2 failed: "password reset" not mentioned in CLAUDE.md. Need /project:email command.
|
|
465
|
-
# - Task 4 failed: E2E tests failed because missing /project:test. Added but not documented.
|
|
466
|
-
#
|
|
467
|
-
# Proposing mutations:
|
|
468
|
-
# - Add /project:email command with SMTP integration guidance
|
|
469
|
-
# - Update CLAUDE.md "Authentication" section with password reset flow
|
|
470
|
-
# - Add e2e.md path-scoped rule with Playwright patterns
|
|
161
|
+
# [task-1] pass [task-2] fail [task-3] pass [task-4] fail [task-5] pass
|
|
162
|
+
# Score: 60%
|
|
163
|
+
# Diagnosis: "password reset" not in CLAUDE.md, E2E tests need Playwright rule
|
|
164
|
+
# Mutations: +/project:email command, +authentication section, +e2e.md rule
|
|
471
165
|
#
|
|
472
166
|
# Iteration 2/5
|
|
473
|
-
#
|
|
474
|
-
#
|
|
475
|
-
# Score: 5/5 (100%) ✔ improvement! Accepting mutations.
|
|
167
|
+
# [task-1] pass [task-2] pass [task-3] pass [task-4] pass [task-5] pass
|
|
168
|
+
# Score: 100% — accepting mutations
|
|
476
169
|
#
|
|
477
170
|
# Iteration 3/5
|
|
478
|
-
#
|
|
479
|
-
# [task-1] pass [task-2] pass [task-3] pass [task-4] pass [task-5] pass
|
|
480
|
-
# Score: 5/5 (100%), no regression but no improvement. Proposing refactorings...
|
|
481
|
-
# - CLAUDE.md got bloated (142 lines). Moving detail to rules/.
|
|
482
|
-
# Iteration 3 score: 5/5. Accepting.
|
|
171
|
+
# Score: 100% — CLAUDE.md bloated (142 lines), moving detail to rules/
|
|
483
172
|
#
|
|
484
|
-
# Iterations 4-5:
|
|
173
|
+
# Iterations 4-5: plateau at 100%. No regressions.
|
|
485
174
|
#
|
|
486
|
-
# Final
|
|
487
|
-
|
|
488
|
-
#
|
|
489
|
-
# Iteration 2: 100% (5/5) ← best
|
|
490
|
-
# Iteration 3: 100% (5/5)
|
|
491
|
-
# Iteration 4: 100% (5/5)
|
|
492
|
-
# Iteration 5: 100% (5/5)
|
|
493
|
-
|
|
494
|
-
kairn evolve report # Detailed markdown summary
|
|
495
|
-
kairn evolve apply # Copy iteration 2 to .claude/
|
|
175
|
+
# Final: baseline 60% → evolved 100%
|
|
176
|
+
|
|
177
|
+
kairn evolve apply # Deploy the winning harness
|
|
496
178
|
```
|
|
497
179
|
|
|
180
|
+
See [docs/walkthroughs/](docs/walkthroughs/) for full examples including generation, optimization, and PBT runs.
|
|
181
|
+
|
|
498
182
|
---
|
|
499
183
|
|
|
500
|
-
##
|
|
184
|
+
## Vision
|
|
501
185
|
|
|
502
|
-
|
|
186
|
+
The architecture — typed IR, population-based training, multi-agent compilation with linker validation — was designed to extend from N=1 (one project, one harness) to N=500 (a fleet of agents with interdependent harnesses). Today Kairn compiles a single `.claude/` directory. The same pipeline generalizes to **swarm manifest compilation**: describe a fleet of agents with roles, contracts, and communication patterns; compile harnesses for each agent with inter-agent contract validation (agent A's output schema matches agent B's input expectations); evolve the fleet as a system, not individual harnesses in isolation.
|
|
503
187
|
|
|
504
|
-
|
|
505
|
-
2. **Workflow-specific over generic.** Every file generated relates to your actual task.
|
|
506
|
-
3. **Self-improving.** Environments get better with use via the evolution loop and self-learning intent router.
|
|
507
|
-
4. **Local-first.** No accounts, no servers, no telemetry. Runs offline with your own LLM key.
|
|
508
|
-
5. **Transparent.** You can inspect every generated file. Nothing is hidden.
|
|
509
|
-
6. **Security by default.** Every environment includes deny rules, hooks, and guidance.
|
|
510
|
-
7. **Prove it.** Evolved harnesses must demonstrably outperform static ones. Claims require measurement.
|
|
188
|
+
The linker already validates cross-references within a single harness (commands ↔ agents ↔ rules). Extending it to validate cross-references *between* harnesses — inter-agent contracts, shared MCP server configurations, compatible security policies — is the path from project-scoped optimization to fleet-scale coordination.
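As an illustration of what a within-harness cross-reference check could look like, here is a small JavaScript sketch. The harness shape (`commands`, `agents`, `rules`, each with a `name`) and the function `findDanglingRefs` are hypothetical and chosen for the example; Kairn's actual IR node types are not shown in this README.

```javascript
// Hypothetical harness shape: field names are illustrative, not Kairn's IR.
function findDanglingRefs(harness) {
  const agents = new Set(harness.agents.map((a) => a.name));
  const rules = new Set(harness.rules.map((r) => r.name));
  const errors = [];
  for (const cmd of harness.commands) {
    // Every agent a command mentions must be defined in the harness.
    for (const agent of cmd.agents ?? []) {
      if (!agents.has(agent)) errors.push(`${cmd.name}: unknown agent @${agent}`);
    }
    // Every rule a command mentions must be defined in the harness.
    for (const rule of cmd.rules ?? []) {
      if (!rules.has(rule)) errors.push(`${cmd.name}: unknown rule ${rule}`);
    }
  }
  return errors;
}

const exampleHarness = {
  commands: [
    { name: "/project:plan", agents: ["architect"], rules: ["security"] },
    { name: "/project:test", agents: ["tester"], rules: ["e2e"] },
  ],
  agents: [{ name: "architect" }],
  rules: [{ name: "security" }, { name: "e2e" }],
};

const linkErrors = findDanglingRefs(exampleHarness);
console.log(linkErrors);
```

Extending the same idea across harnesses would mean resolving names against a shared registry rather than a single harness object.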
|
|
511
189
|
|
|
512
|
-
|
|
190
|
+
---
|
|
513
191
|
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
|
|
517
|
-
|
|
192
|
+
## Command Reference
|
|
193
|
+
|
|
194
|
+
| Command | Description |
|
|
195
|
+
|---------|-------------|
|
|
196
|
+
| `kairn init` | Interactive LLM provider setup (8 providers, API key stored locally) |
|
|
197
|
+
| `kairn describe <intent>` | Compile intent → optimized `.claude/` environment |
|
|
198
|
+
| `kairn optimize` | Scan existing project, audit + regenerate harness (`--diff` to preview) |
|
|
199
|
+
| `kairn templates` | Browse and activate pre-built environments (Next.js, API, Research, Content) |
|
|
200
|
+
| `kairn doctor` | Validate environment against Claude Code best practices |
|
|
201
|
+
| `kairn keys` | Manage API keys for MCP servers (`--show` to audit) |
|
|
202
|
+
| `kairn list` / `kairn activate <id>` | Save, browse, and re-deploy environments |
|
|
203
|
+
| `kairn evolve init` | Scaffold evolution workspace, auto-generate eval tasks |
|
|
204
|
+
| `kairn evolve baseline` | Snapshot current `.claude/` as iteration 0 |
|
|
205
|
+
| `kairn evolve run` | Full evolution loop (`--iterations N`, `--parallel N`, `--runs N`) |
|
|
206
|
+
| `kairn evolve pbt` | Population-based training (N parallel branches + Meta-Principal synthesis) |
|
|
207
|
+
| `kairn evolve report` | Markdown/JSON summary with leaderboard and counterfactual diagnosis |
|
|
208
|
+
| `kairn evolve diff <i1> <i2>` | Harness changes between two iterations |
|
|
209
|
+
| `kairn evolve apply` | Deploy best (or specified) harness to `.claude/` |
|
|
210
|
+
|
|
211
|
+
**Describe options:** `--quick` (skip clarification), `--autonomy 1-4` (guided → full auto), `--runtime hermes` (Hermes adapter)
|
|
212
|
+
|
|
213
|
+
**Evolve options:** `--sampling thompson|uniform`, `--kl-lambda 0.1`, `--pbt-branches 3`, `--task <id>` (single task)
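To make `--sampling thompson` concrete, the sketch below shows Thompson sampling over tasks in JavaScript. It rests on assumptions not stated in this README: each task keeps pass/fail counts, its failure rate gets a Beta posterior, and the task with the highest sampled failure rate is evaluated next. The names `sampleBeta`, `pickTask`, and `taskStats` are hypothetical.

```javascript
// Small seeded PRNG (mulberry32) so the sketch is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Sample Beta(a, b) for positive integers a and b: the a-th smallest of
// (a + b - 1) independent uniform draws is Beta(a, b) distributed.
function sampleBeta(a, b, rng) {
  const draws = Array.from({ length: a + b - 1 }, () => rng());
  draws.sort((x, y) => x - y);
  return draws[a - 1];
}

// Assumption: pick the task whose sampled failure rate is highest.
function pickTask(stats, rng) {
  let bestId = null;
  let bestDraw = -1;
  for (const [id, s] of Object.entries(stats)) {
    const draw = sampleBeta(s.fails + 1, s.passes + 1, rng);
    if (draw > bestDraw) {
      bestDraw = draw;
      bestId = id;
    }
  }
  return bestId;
}

const rng = mulberry32(42);
const taskStats = {
  "task-1": { passes: 20, fails: 0 },
  "task-2": { passes: 0, fails: 20 },
};
const picks = { "task-1": 0, "task-2": 0 };
for (let i = 0; i < 200; i++) picks[pickTask(taskStats, rng)] += 1;
console.log(picks);
```

Under these assumptions a consistently failing task is chosen almost every time, while a task with little history still gets occasional draws because its posterior is wide. Uniform sampling would ignore the counts entirely.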
|
|
518
214
|
|
|
519
|
-
|
|
520
|
-
|
|
521
|
-
|
|
522
|
-
|
|
215
|
+
---
|
|
216
|
+
|
|
217
|
+
## What Gets Generated
|
|
218
|
+
|
|
219
|
+
```
|
|
220
|
+
.claude/
|
|
221
|
+
├── CLAUDE.md # Workflow-specific system prompt (7 sections)
|
|
222
|
+
├── settings.json # Permissions, hooks, security rules, intent routing
|
|
223
|
+
├── commands/ # Slash commands (/project:help, /project:plan, etc.)
|
|
224
|
+
├── rules/ # Auto-loaded instructions (security, continuity, paths)
|
|
225
|
+
├── skills/ # Model-controlled capabilities (code, research, writing)
|
|
226
|
+
├── agents/ # Specialized subagents (@architect, @tester, etc.)
|
|
227
|
+
├── docs/ # Pre-initialized project memory
|
|
228
|
+
├── hooks/ # Intent router (Tier 1 regex + Tier 2 Haiku classifier)
|
|
229
|
+
│ ├── intent-router.mjs # Project-specific regex patterns + fallthrough
|
|
230
|
+
│ ├── intent-learner.mjs # Promotes recurring Tier 2 patterns to Tier 1
|
|
231
|
+
│ └── intent-log.jsonl # Log of routed prompts (for learning)
|
|
232
|
+
└── QUICKSTART.md # Interactive startup guide (Level 2-4)
|
|
233
|
+
.mcp.json # Project-scoped MCP server config
|
|
234
|
+
.env # API keys (gitignored, masked in output)
|
|
235
|
+
```
|
|
523
236
|
|
|
524
|
-
**
|
|
525
|
-
- Kairn is CLI-first and project-scoped (not a framework library)
|
|
526
|
-
- Integrated with Claude Code's native hooks API (not custom inference)
|
|
527
|
-
- Generates MCP configurations alongside harness (full integration)
|
|
237
|
+
**Tool registry:** 28 curated MCP servers across reasoning, code, search, browser automation, data/infrastructure, communication, security, and design. Auto-selected based on workflow — fewer tools = less context bloat = better agent performance.
|
|
528
238
|
|
|
529
239
|
---
|
|
530
240
|
|
|
531
241
|
## Roadmap
|
|
532
242
|
|
|
533
|
-
### v1.x
|
|
534
|
-
Local CLI
|
|
243
|
+
### v1.x (Complete)
|
|
244
|
+
Local CLI: intent compilation, project scanning, templates, secrets management, autonomy levels (1-4), interactive clarification, branded CLI, verification patterns, sprint contracts, multi-agent QA, 8 LLM providers.
|
|
535
245
|
|
|
536
|
-
### v2.x (
|
|
246
|
+
### v2.x (Current — v2.11.0)
|
|
537
247
|
**Kairn Evolve** — automated harness optimization.
|
|
538
248
|
|
|
539
|
-
- **v2.0
|
|
540
|
-
- **v2.1
|
|
541
|
-
- **v2.2
|
|
542
|
-
- **v2.
|
|
543
|
-
- **v2.
|
|
544
|
-
- **v2.
|
|
545
|
-
- **v2.
|
|
546
|
-
- **v2.
|
|
547
|
-
- **v2.
|
|
249
|
+
- **v2.0** ✅ Task definition, trace infrastructure, eval templates
|
|
250
|
+
- **v2.1** ✅ The evolution loop (evaluate → diagnose → mutate → re-evaluate → rollback)
|
|
251
|
+
- **v2.2** ✅ Diagnosis, reporting, parallel evaluation, anti-regression guards
|
|
252
|
+
- **v2.3** ✅ Eval quality, Claude Code subscription auth, prompt caching
|
|
253
|
+
- **v2.5** ✅ Intent-aware harnesses (two-tier routing, self-learning promotion)
|
|
254
|
+
- **v2.6** ✅ Population-based training (Thompson sampling, KL regularization, Meta-Principal synthesis)
|
|
255
|
+
- **v2.7** ✅ Structured Harness IR (14 node types, 17 mutations, semantic diff, round-trip renderer)
|
|
256
|
+
- **v2.8** ✅ Hybrid scoring, prompt caching (~85% savings), targeted re-evaluation (~40% cost reduction)
|
|
257
|
+
- **v2.9** ✅ Anthropic patterns (sprint contracts, model routing, 20+ security rules, memory persistence)
|
|
258
|
+
- **v2.10** ✅ Persistent execution loops (/project:persist, auto-routing, progress tracking)
|
|
259
|
+
- **v2.11** ✅ Multi-agent compilation (orchestrator → specialist agents → linker → HarnessIR)
|
|
260
|
+
- **v2.12** ⏳ Polish: live dashboard, describe→evolve integration, CI/CD, template evolution
|
|
548
261
|
|
|
549
262
|
### v3.x (Aspirational)
|
|
550
|
-
|
|
263
|
+
Fleet-scale harness optimization. Swarm manifest compilation. Inter-agent contract validation. Runtime-agnostic harness IR (Claude Code, Hermes, OpenClaw). Tool marketplace with proposer-initiated discovery.
|
|
551
264
|
|
|
552
265
|
---
|
|
553
266
|
|
|
554
267
|
## Security
|
|
555
268
|
|
|
556
|
-
-
|
|
557
|
-
-
|
|
558
|
-
-
|
|
559
|
-
-
|
|
560
|
-
-
|
|
561
|
-
-
|
|
269
|
+
- API keys stay local (`~/.kairn/config.json`, never transmitted)
|
|
270
|
+
- Every environment includes 20+ PreToolUse deny rules across credential leaks, injection, destructive ops, and network exfiltration
|
|
271
|
+
- Curated MCP registry only — every server manually verified
|
|
272
|
+
- Environment variables use `${ENV_VAR}` syntax — secrets never written to config files
|
|
273
|
+
- Path traversal protection on all evolution mutations
|
|
274
|
+
- Hooks block destructive commands; PostCompact restores context
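The `${ENV_VAR}` convention keeps secrets out of committed config: the file stores only a placeholder, and the value is substituted from the environment when the config is loaded. Which component performs that substitution is not stated here, so the JavaScript below is only a sketch of the idea. The function `expandEnv`, the server name `example-search-server`, and the variable `SEARCH_API_KEY` are hypothetical.

```javascript
// Sketch: recursively expand ${NAME} placeholders from an env object.
function expandEnv(value, env) {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) => {
      // Fail loudly instead of silently writing an empty secret.
      if (!(name in env)) throw new Error(`Missing environment variable: ${name}`);
      return env[name];
    });
  }
  if (Array.isArray(value)) return value.map((v) => expandEnv(v, env));
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, expandEnv(v, env)])
    );
  }
  return value;
}

// Hypothetical config: only the placeholder is stored on disk.
const mcpConfig = {
  mcpServers: {
    search: {
      command: "npx",
      args: ["example-search-server"],
      env: { API_KEY: "${SEARCH_API_KEY}" },
    },
  },
};

const resolved = expandEnv(mcpConfig, { SEARCH_API_KEY: "sk-test" });
console.log(resolved.mcpServers.search.env.API_KEY);
```

In practice the second argument would be `process.env`; a literal object is used here so the example does not depend on the caller's environment.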
|
|
562
275
|
|
|
563
276
|
---
|
|
564
277
|
|
|
565
278
|
## FAQ
|
|
566
279
|
|
|
567
|
-
**
|
|
568
|
-
A: No. Kairn is a local CLI. Your API key for Claude/GPT/Gemini is configured once and stored locally.
|
|
280
|
+
**Do I need an account?** No. Local CLI, your API key, no backend.
|
|
569
281
|
|
|
570
|
-
**
|
|
571
|
-
A: No. All LLM calls use your own API key. Kairn CLI has no backend.
|
|
282
|
+
**Does Kairn send my code anywhere?** Not to Kairn: there is no backend. All LLM calls use your key, so the only data leaving your machine is the API requests to your LLM provider.
|
|
572
283
|
|
|
573
|
-
**
|
|
574
|
-
A: Yes. Generate the harness locally, commit `.claude/` to git. Team members run `claude` and get the same environment. The evolve loop runs locally per person (results don't auto-merge).
|
|
284
|
+
**Team use?** Generate locally, commit `.claude/` to git. Everyone gets the same environment.
|
|
575
285
|
|
|
576
|
-
**
|
|
577
|
-
A: Use `kairn optimize --diff` to preview changes. You can selectively accept or reject them. For full control, don't use `optimize` — use `describe` once and then hand-edit the generated files.
|
|
286
|
+
**Keep manual customizations?** `kairn optimize --diff` previews changes. Accept or reject selectively.
|
|
578
287
|
|
|
579
|
-
**
|
|
580
|
-
A: Depends on your model, iteration count, and task volume. A 5-iteration evolution run with 5 tasks on Anthropic:
|
|
581
|
-
- Evaluation: ~100K tokens per iteration (traces logged)
|
|
582
|
-
- Proposer: ~80K tokens per iteration (diagnosis + mutation)
|
|
583
|
-
- Re-evaluation: ~100K tokens per iteration
|
|
584
|
-
- **Total:** ~1.5M tokens = ~$15-50 (Opus/Claude 3) or ~$2-5 (Haiku)
|
|
288
|
+
**Evolution cost?** 5 iterations, 5 tasks on Anthropic: ~1.5M tokens (~$15-50 Opus, ~$2-5 Haiku). PBT multiplies by branch count but runs concurrently.
|
|
585
289
|
|
|
586
|
-
**
|
|
587
|
-
A: Yes. `kairn evolve run --task <task_id>` runs a single task.
|
|
588
|
-
|
|
589
|
-
**Q: What's the intent router doing on my prompt?**
|
|
590
|
-
A: When you type a prompt like "deploy this", the intent router:
|
|
591
|
-
1. Checks Tier 1 regex patterns (fast, free)
|
|
592
|
-
2. If no match, sends to Tier 2 (Haiku, ~$0.001)
|
|
593
|
-
3. Injects `/project:deploy` into your message context
|
|
594
|
-
4. Claude reads that and executes the command
|
|
595
|
-
|
|
596
|
-
You can disable it with `"enableTier2": false` in settings.json if you find it intrusive.
|
|
290
|
+
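**What's the intent router doing?** It intercepts natural language prompts and matches them to `/project:*` commands, first via Tier 1 regex patterns (free), then via a Tier 2 Haiku classifier (~$0.001 per prompt) if no pattern matches. Disable Tier 2 with `"enableTier2": false` in settings.json.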
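The two-tier routing can be pictured with a short JavaScript sketch. This is not the generated `intent-router.mjs`: the patterns, the `routeIntent` function, and the `classify` callback (standing in for the Haiku call, which would be asynchronous in practice) are all hypothetical.

```javascript
// Hypothetical Tier 1 table: cheap regex patterns mapped to slash commands.
const tier1Patterns = [
  { pattern: /\b(deploy|ship|release)\b/i, command: "/project:deploy" },
  { pattern: /\b(plan|design)\b/i, command: "/project:plan" },
];

// Try Tier 1 first; fall through to the Tier 2 classifier only when enabled.
function routeIntent(prompt, { enableTier2 = true, classify } = {}) {
  for (const { pattern, command } of tier1Patterns) {
    if (pattern.test(prompt)) return { command, tier: 1 };
  }
  if (enableTier2 && classify) {
    const command = classify(prompt); // stand-in for a small-model call
    if (command) return { command, tier: 2 };
  }
  return { command: null, tier: null };
}

// Toy classifier used only for this example.
const fakeClassifier = (p) => (p.includes("tests") ? "/project:test" : null);

console.log(routeIntent("deploy this"));
console.log(routeIntent("run the tests", { classify: fakeClassifier }));
console.log(routeIntent("run the tests", { classify: fakeClassifier, enableTier2: false }));
```

Checking the regex table first keeps common prompts free of any model call, and the `enableTier2` flag leaves the regex tier working when the classifier is turned off.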
|
|
597
291
|
|
|
598
292
|
---
|
|
599
293
|
|
|
600
294
|
## Contributing
|
|
601
295
|
|
|
602
|
-
Kairn is open-source. Contributions welcome:
|
|
603
|
-
- New MCP servers to the registry
|
|
604
|
-
- Eval task templates for new project types
|
|
605
|
-
- Improved proposer prompts
|
|
606
|
-
- Bug reports and UX feedback
|
|
607
|
-
|
|
608
|
-
---
|
|
296
|
+
Kairn is open-source. Contributions welcome: MCP servers to the registry, eval task templates, proposer prompt improvements, bug reports.
|
|
609
297
|
|
|
610
298
|
## License
|
|
611
299
|
|