npm - kairn-cli - Versions diffs - 2.0.0 → 2.2.0 - Mend

kairn-cli 2.0.0 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -1,16 +1,12 @@
 # Kairn
-> Describe what you want done. Get an optimized Claude Code environment.
+> The agent environment compiler. Describe what you want done — get an optimized Claude Code environment. Then evolve it automatically.
 Kairn is a CLI that compiles natural language workflow descriptions into minimal, optimal [Claude Code](https://code.claude.com/) agent environments — complete with MCP servers, slash commands, skills, subagents, and security rules.
-**No server. No account. Runs locally with your own LLM key.**
-## Why
-Every agent needs an environment before it can work. Today, building that environment is manual and generic. The harness repos on GitHub give you 136 skills and hope you figure out which 6 matter for your task.
+**v2.1** adds **Kairn Evolve** — an automated optimization loop that runs your agent on real tasks, diagnoses failures from full execution traces, and mutates the harness until performance plateaus. Inspired by [Meta-Harness](https://yoonholee.com/meta-harness/) (Stanford IRIS Lab, 2026).
-Kairn selects the right tools, generates workflow-specific instructions, and writes a production-quality `.claude/` directory in one command.
+**No server. No account. Runs locally with your own LLM key.**
 ## Install
@@ -23,7 +19,7 @@ Requires Node.js 18+. The command is `kairn`.
 ## Quick Start
 ```bash
-# 1. Set up your LLM key (Anthropic, OpenAI, or Google)
+# 1. Set up your LLM key
 kairn init
 # 2. Describe your workflow
@@ -33,31 +29,21 @@ kairn describe "Build a Next.js app with Supabase auth"
 claude
 ```
-That's it. Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP servers, slash commands, skills, agents, rules — tailored to your specific workflow.
+Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP servers, slash commands, skills, agents, rules — tailored to your specific workflow.
 ## What Gets Generated
 ```
 .claude/
-├── CLAUDE.md              # Workflow-specific system prompt (<100 lines)
-├── settings.json          # Permissions and security deny rules
-├── commands/              # Slash commands (show up when you type /)
-│   ├── help.md            #   /project:help — environment guide
-│   ├── tasks.md           #   /project:tasks — manage TODOs
-│   ├── plan.md            #   /project:plan — plan before coding
-│   └── ...                #   workflow-specific commands
-├── rules/                 # Auto-loaded instructions
-│   ├── security.md        #   Security best practices
-│   └── continuity.md      #   Session memory (DECISIONS.md, LEARNINGS.md)
+├── CLAUDE.md              # Workflow-specific system prompt
+├── settings.json          # Permissions, hooks, and security deny rules
+├── commands/              # Slash commands (/project:help, /project:plan, etc.)
+├── rules/                 # Auto-loaded instructions (security, continuity)
 ├── skills/                # Model-controlled capabilities
-│   └── {skill}/SKILL.md
 ├── agents/                # Specialized subagents
-│   └── {agent}.md
 └── docs/                  # Pre-initialized project memory
-    ├── TODO.md
-    ├── DECISIONS.md
-    └── LEARNINGS.md
 .mcp.json                  # Project-scoped MCP server config
+.env                       # API keys (gitignored, masked in output)
 ```
 ## Commands
@@ -67,71 +53,103 @@ That's it. Kairn generates the entire `.claude/` directory — CLAUDE.md, MCP se
 Interactive setup. Pick your LLM provider and model, paste your API key. Key stays local at `~/.kairn/config.json`.
 Supported providers:
-- **Anthropic** — Claude Sonnet 4, Opus 4, Haiku 3.5
-- **OpenAI** — GPT-4o, GPT-4o mini, o3
-- **Google** — Gemini 2.5 Flash, Gemini 2.5 Pro
+- **Anthropic** — Claude Sonnet 4.6, Opus 4.6, Haiku 4.5
+- **OpenAI** — GPT-4.1, GPT-4.1 mini, o4-mini, GPT-5 mini
+- **Google** — Gemini 2.5 Flash, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 3.1 Pro
+- **xAI** — Grok 4.1 Fast, Grok 4.20 (2M context)
+- **DeepSeek** — V3.2 Chat, V3.2 Reasoner (cheapest)
+- **Mistral** — Large 3, Codestral, Small 4 (open-weight)
+- **Groq** — Llama 4, DeepSeek R1, Qwen 3 (free tier)
+- **Custom** — any OpenAI-compatible endpoint (local Ollama, LM Studio, etc.)
 ### `kairn describe [intent]`
 The main command. Describe what you want your agent to do, and Kairn compiles an optimal environment.
 ```bash
-# Interactive
-kairn describe
-# Inline
 kairn describe "Research ML papers on GRPO training and write a summary"
-# Skip confirmation
-kairn describe "Draft outreach emails from a CSV" --yes
+kairn describe "Build a REST API with Express and PostgreSQL" --quick
 ```
-Kairn selects the minimal set of tools from a curated registry of 18 MCP servers, plugins, and hooks — then generates every file Claude Code needs.
+Features:
+- **Interactive clarification** — 3-5 questions to understand your project (skip with `--quick`)
+- **Multi-pass compilation** — skeleton pass (tool selection) + harness pass (content generation) + deterministic settings
+- **Autonomy levels** — choose how autonomous the agent should be (1-4)
+- **Secrets collection** — prompted for API keys after generation, written to `.env`
-### `kairn list`
+### `kairn optimize [--diff]`
-Show all saved environments.
+Scan an existing project and optimize its Claude Code environment. Detects language, framework, dependencies, and generates improvements.
+```bash
+kairn optimize          # Write optimized environment
+kairn optimize --diff   # Preview changes before writing
 ```
-$ kairn list
-  nextjs-supabase-auth
-    Next.js app with Supabase authentication
-    3/30/2026 · 7 tools · env_df2c0a23...
+### `kairn templates`
+Browse and activate pre-built environment templates.
-  grpo-research
-    Research ML papers on GRPO training
-    3/30/2026 · 4 tools · env_1638c54e...
+```bash
+kairn templates                        # Browse gallery
+kairn templates --activate nextjs      # Apply a template
 ```
-### `kairn activate <env_id>`
+Available templates: Next.js Full-Stack, API Service, Research Project, Content Writing.
-Re-deploy a saved environment to any directory. Use the ID from `kairn list`.
+### `kairn doctor`
-```bash
-mkdir new-project && cd new-project
-kairn activate env_df2c0a23
-```
+Validate the current environment against Claude Code best practices.
+### `kairn keys [--show]`
+Add or update API keys for MCP servers in the current environment.
+### `kairn list` / `kairn activate <env_id>`
+Show saved environments and re-deploy them to any directory.
-### `kairn update-registry`
+### `kairn evolve`
-Fetch the latest tool catalog from GitHub.
+Automated harness optimization. Run your agent on real tasks, capture traces, and evolve the environment.
 ```bash
-kairn update-registry
+# 1. Initialize — auto-generates project-specific eval tasks via LLM
+kairn evolve init
+# 2. Snapshot current .claude/ as the baseline
+kairn evolve baseline
+# 3. Run the evolution loop
+kairn evolve run                    # 5 iterations (default)
+kairn evolve run --iterations 3     # Custom iteration count
+kairn evolve run --task <id>        # Run a single task
 ```
+**How it works:**
+1. **Define tasks** — `kairn evolve init` reads your CLAUDE.md and project structure, then uses the LLM to generate 3-5 concrete eval tasks from 6 built-in templates (add-feature, fix-bug, refactor, test-writing, config-change, documentation)
+2. **Baseline** — `kairn evolve baseline` snapshots your current `.claude/` directory
+3. **Evaluate** — runs each task by spawning Claude Code in an isolated workspace, capturing full traces (stdout, stderr, tool calls, files changed, timing)
+4. **Diagnose** — a proposer agent (Opus) reads the full traces and performs causal reasoning to identify why tasks fail
+5. **Mutate** — proposes minimal, targeted changes to CLAUDE.md, commands, rules, or agents
+6. **Repeat** — re-evaluates with the mutated harness. Rolls back if scores regress.
+Scoring: pass/fail (default), LLM-as-judge, or weighted rubric.
 ## Tool Registry
-Kairn ships with 18 curated tools across 6 tiers:
+Kairn ships with 28 curated tools across 8 categories:
 | Category | Tools |
 |----------|-------|
-| **Universal** | Context7, Sequential Thinking, security-guidance |
-| **Code** | GitHub MCP, Playwright, Semgrep |
-| **Search** | Exa, Brave Search, Firecrawl, Perplexity |
-| **Data** | PostgreSQL (Bytebase), Supabase |
-| **Communication** | Slack, Notion, Linear, AgentMail |
+| **Reasoning** | Context7, Sequential Thinking |
+| **Code & DevTools** | GitHub MCP, Chrome DevTools |
+| **Search & Research** | Exa, Brave Search, Firecrawl, Perplexity |
+| **Browser Automation** | Playwright, Browserbase |
+| **Data & Infrastructure** | PostgreSQL, Supabase, SQLite, Docker, Vercel |
+| **Communication** | Slack, Notion, Linear, AgentMail, Gmail |
+| **Security** | Semgrep, security-guidance |
 | **Design** | Figma, Frontend Design |
 Tools are selected based on your workflow description. Fewer tools = less context bloat = better agent performance.
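
The `kairn evolve` steps added in the hunk above (evaluate, diagnose, mutate, roll back on regression) can be read as a simple hill-climbing loop. The sketch below is only an illustration of that idea, not the package's implementation: `Harness`, `EvolveHooks`, `evolve`, and the toy hooks are hypothetical names, evaluation is synchronous for brevity (the README says real runs spawn Claude Code), and keeping a mutation on a tied score is an assumption.

```typescript
// Hypothetical types; none of these names come from the kairn-cli source.
type Harness = Record<string, string>; // relative file path -> file content

interface EvolveHooks {
  // Runs the eval tasks against a harness and returns a score in [0, 1].
  evaluate: (harness: Harness) => number;
  // Returns a mutated copy of the harness, given the current best score.
  propose: (harness: Harness, score: number) => Harness;
}

interface EvolveResult {
  best: Harness;
  bestScore: number;
  history: { iteration: number; score: number; kept: boolean }[];
}

function evolve(baseline: Harness, hooks: EvolveHooks, iterations = 5): EvolveResult {
  let best = baseline;
  let bestScore = hooks.evaluate(baseline);
  const history: EvolveResult["history"] = [];

  for (let i = 1; i <= iterations; i++) {
    const candidate = hooks.propose(best, bestScore);
    const score = hooks.evaluate(candidate);
    // Keep the mutation only if the score did not regress (assumption: ties
    // are kept); otherwise "roll back" by leaving `best` untouched.
    const kept = score >= bestScore;
    if (kept) {
      best = candidate;
      bestScore = score;
    }
    history.push({ iteration: i, score, kept });
  }
  return { best, bestScore, history };
}

// Toy usage: the score improves once CLAUDE.md mentions running tests.
const toyHooks: EvolveHooks = {
  evaluate: (h) => ((h["CLAUDE.md"] ?? "").includes("run tests") ? 1 : 0.5),
  propose: (h) => ({ ...h, "CLAUDE.md": (h["CLAUDE.md"] ?? "") + "\nAlways run tests." }),
};
const result = evolve({ "CLAUDE.md": "# Project" }, toyHooks, 3);
console.log(result.bestScore, result.history.map((step) => step.kept));
```

Passing the evaluator and proposer in as hooks keeps the loop testable without an LLM or a spawned agent; a real run would swap in asynchronous versions of both.
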
@@ -139,24 +157,28 @@ Tools are selected based on your workflow description. Fewer tools = less contex
 ## How It Works
 1. You describe your workflow in natural language
-2. Kairn sends your intent + its tool registry to an LLM
-3. The LLM selects the minimal tool set and generates a complete `EnvironmentSpec`
-4. Kairn writes the `.claude/` directory and `.mcp.json` from the spec
-5. The spec is saved locally so you can re-deploy it anywhere
+2. Kairn asks clarifying questions (or skip with `--quick`)
+3. **Pass 1:** LLM selects the minimal tool set and outlines the project
+4. **Pass 2:** LLM generates all harness content (CLAUDE.md, commands, rules, agents)
+5. **Pass 3:** Settings and MCP config generated deterministically from the registry
+6. Kairn writes the `.claude/` directory and `.mcp.json`
+7. API keys are collected and written to `.env`
 The LLM call uses your own API key. Nothing is sent to Kairn servers (there are none).
 ## Security
-- **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted anywhere.
+- **API keys stay local.** Stored at `~/.kairn/config.json`, never transmitted.
 - **Every environment includes security rules.** Deny rules for `rm -rf`, `curl | sh`, reading `.env` and `secrets/`.
-- **Curated registry only.** Every MCP server in the registry is manually verified. No auto-inclusion of unvetted tools.
-- **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to files.
+- **Curated registry only.** Every MCP server is manually verified.
+- **Environment variable references.** MCP configs use `${ENV_VAR}` syntax — secrets never written to config files.
+- **Path traversal protection.** Evolution mutations are validated against `../` injection.
 ## Philosophy
 - **Minimal over complete.** 5 well-chosen tools beat 50 generic ones.
 - **Workflow-specific over generic.** Every file generated relates to your actual task.
+- **Self-improving.** Environments should get better with use, not just at generation time.
 - **Local-first.** No accounts, no servers, no telemetry.
 - **Transparent.** You can inspect every generated file. Nothing is hidden.
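
The new Security bullet says evolution mutations are validated against `../` injection but does not show how. One common way to do this in Node.js is to resolve the proposed path against the harness root and reject anything that lands outside it. The helper below is a minimal sketch under that assumption: `isInsideRoot` is a hypothetical name, not a kairn-cli function, and it does not follow symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper, not taken from kairn-cli: returns true when `target`,
// interpreted relative to `root`, stays inside `root` after normalization.
function isInsideRoot(root: string, target: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedTarget = path.resolve(resolvedRoot, target);
  const rel = path.relative(resolvedRoot, resolvedTarget);
  // The target escapes when the relative path climbs out of the root, or is
  // absolute (which can happen on Windows when the drives differ).
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Usage: check a proposed mutation path before writing it.
console.log(isInsideRoot(".claude", "rules/security.md"));
console.log(isInsideRoot(".claude", "../.env"));
```

Comparing the normalized relative path, rather than searching the raw string for `../`, also catches inputs such as `rules/../../.env` and absolute paths, while still allowing file names that merely begin with two dots.
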