npm - cceval - Versions diffs - 0.1.0 → 0.2.0 - Mend

cceval 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md CHANGED Viewed

@@ -7,10 +7,10 @@ Evaluate and benchmark your CLAUDE.md effectiveness with automated testing.
 Your `CLAUDE.md` file guides how Claude Code behaves in your project. But how do you know if your instructions are actually working?
 **cceval** lets you:
-- Test different system prompt variations
-- Compare performance across metrics
+- Auto-generate variations of your CLAUDE.md
+- Test them against realistic prompts
 - Find what actually improves Claude's behavior
-- Share benchmarks with your team
+- Iterate quickly on your instructions
 ## Installation
@@ -23,123 +23,161 @@ bun add -D cceval
 ```
 **Requirements:**
-- [Bun](https://bun.sh) ≥ 1.0
+- [Bun](https://bun.sh) >= 1.0
 - [Claude Code CLI](https://docs.anthropic.com/claude-code) installed and authenticated
-## Quick Start
+**Optional (for faster generation):**
+- `ANTHROPIC_API_KEY` environment variable - if set, uses direct API calls for variation generation instead of Claude CLI (faster and more reliable)
-```bash
-# Run with default prompts and variations
-cceval run
+## Quick Start (Turnkey)
-# Generate report from existing results
-cceval report evaluation-results.json
+Just run `cceval` in any project with a CLAUDE.md:
-# Create custom config
-cceval init
+```bash
+cd your-project
+cceval
 ```
-## Usage
-### Basic Evaluation
+That's it! cceval will:
+1. Find your CLAUDE.md
+2. Use Claude to generate 5 variations (condensed, explicit, prioritized, minimal, structured)
+3. Test each variation against 5 realistic prompts
+4. Show you which variation performs best
-Run with sensible defaults (5 variations × 5 prompts = 25 tests):
+### Example Output
-```bash
-cceval run
+```
+🚀 cceval - Turnkey CLAUDE.md Evaluation
+============================================================
+📄 Source: ./CLAUDE.md
+🤖 Model: haiku
+🔄 Generating 7 variations...
+  ⏳ Generating "condensed"... ✓
+  ⏳ Generating "explicit"... ✓
+  ⏳ Generating "prioritized"... ✓
+  ⏳ Generating "minimal"... ✓
+  ⏳ Generating "structured"... ✓
+📦 Cached variations to .cceval-variations.json
+============================================================
+📊 Starting Evaluation
+============================================================
+Testing 7 variations:
+  • baseline
+  • original
+  • condensed
+  • explicit
+  • prioritized
+  • minimal
+  • structured
+With 5 test prompts each
+Total tests: 35
+============================================================
+  ✓ baseline/exploreBeforeBuild: $0.0012
+  ✓ baseline/bunPreference: $0.0015
+  ...
+============================================================
+📊 CLAUDE.md EVALUATION REPORT
+============================================================
+🏷️  explicit: 72.0% (18/25)
+   ✅ noPermissionSeeking: 5/5
+   ✅ readFilesFirst: 5/5
+   ⚠️ usedBun: 3/5
+   ...
+🏆 WINNER: explicit
+💰 Total cost: $0.42
+============================================================
 ```
-This tests:
-- **baseline**: Minimal "You are a helpful assistant"
-- **gateFocused**: Clear pass/fail criteria
-- **bunFocused**: Bun-specific instructions
-- **rootCauseFocused**: Root cause protocol
-- **antiPermission**: Trust-based prompting
+## How It Works
-Against prompts that test:
-- Reading files before coding
-- Using Bun instead of Node
-- Fixing root cause vs surface symptoms
-- Asking permission appropriately
+### Variation Strategies
-### Custom Configuration
+cceval generates these variations from your CLAUDE.md:
-Create a config file:
+| Strategy | Description |
+|----------|-------------|
+| `baseline` | Minimal "You are a helpful coding assistant" |
+| `original` | Your actual CLAUDE.md as-is |
+| `condensed` | Shorter version keeping only critical rules |
+| `explicit` | More explicit version with clear imperatives (MUST, NEVER, ALWAYS) |
+| `prioritized` | Reordered with most important rules first |
+| `minimal` | Just the 3-5 most critical rules |
+| `structured` | Well-organized with clear sections and headers |
-```bash
-cceval init
-```
+### Test Prompts
-Edit `cceval.config.ts`:
+Default prompts test key behaviors:
-```typescript
-import type { EvalConfig } from "cceval"
-const config: EvalConfig = {
-  prompts: {
-    // Your test scenarios
-    authentication: "Add login functionality to the app.",
-    performance: "The dashboard is slow, optimize it.",
-    testing: "Add tests for the user service.",
-  },
+| Prompt | What it tests |
+|--------|---------------|
+| `exploreBeforeBuild` | Does it read files before coding? |
+| `bunPreference` | Does it use Bun instead of Node? |
+| `rootCause` | Does it fix root cause instead of adding spinners? |
+| `simplicity` | Does it avoid over-engineering? |
+| `permissionSeeking` | Does it execute without asking permission? |
-  variations: {
-    // Your system prompt variations
-    baseline: "You are a helpful coding assistant.",
+## CLI Reference
-    myClaudeMd: `You are evaluated on gates. Fail any = FAIL.
-1. Read files before coding
-2. State plan then proceed immediately
-3. Run tests and show output
-4. Only pause for destructive actions`,
+### Basic Usage
-    strict: `NEVER ask permission. Execute immediately.
-ALWAYS use Bun, never Node.
-ALWAYS fix root cause, never add spinners.`,
-  },
+```bash
+# Turnkey: auto-detect, generate, test
+cceval
-  // Optional settings
-  model: "haiku",        // cheapest, ~$0.08/test
-  delayMs: 1000,         // rate limiting
-  outputFile: "results.json",
-}
+# Same as above, explicit
+cceval auto
-export default config
-```
+# Specify a different CLAUDE.md
+cceval auto -p ./docs/CLAUDE.md
-Run with your config:
+# Use a smarter model for better variations
+cceval auto -m sonnet
-```bash
-cceval run
+# Skip regeneration, use cached variations
+cceval auto --skip-generate
+# See what strategies are available
+cceval auto --strategies-only
 ```
-### CLI Options
+### Output Options
 ```bash
-# Use specific model
-cceval run -m sonnet
 # Custom output file
-cceval run -o my-results.json
+cceval -o my-results.json
-# Also generate markdown report
-cceval run --markdown REPORT.md
+# Generate markdown report too
+cceval --markdown REPORT.md
+```
-# Preview prompts without running
-cceval run --prompts-only
+### Advanced: Custom Config
-# Preview variations without running
-cceval run --variations-only
+For full control, use a config file:
-# Use specific config file
-cceval run -c my-config.ts
-```
+```bash
+# Create starter config
+cceval init
-### Generate Reports
+# Edit cceval.config.ts with your prompts/variations
-From existing results:
+# Run with config
+cceval run
+```
+### Reports
 ```bash
+# Generate report from previous results
 cceval report evaluation-results.json
 ```
@@ -155,61 +193,47 @@ cceval measures:
 | `proposedRootCause` | For "slow" prompts: fixes root cause instead of adding spinners |
 | `ranVerification` | Mentions running tests or showing output |
-### Custom Metrics
-Add your own analyzers:
-```typescript
-const config: EvalConfig = {
-  // ...prompts and variations...
+## Cost Estimates
-  analyzers: {
-    // Custom metric: did it use TypeScript?
-    usedTypeScript: (response) =>
-      /\.ts|interface |type |<.*>/.test(response),
+| Model | Cost per test | 35 tests (auto) | 25 tests (manual) |
+|-------|---------------|-----------------|-------------------|
+| haiku | ~$0.08 | ~$2.80 | ~$2.00 |
+| sonnet | ~$0.30 | ~$10.50 | ~$7.50 |
+| opus | ~$1.50 | ~$52.50 | ~$37.50 |
-    // Custom metric: did it mention security?
-    consideredSecurity: (response) =>
-      /security|auth|permission|sanitize|validate/i.test(response),
-  },
-}
-```
+We recommend **haiku** for iteration (it's fast and cheap), then validate findings with **sonnet**.
 ## Programmatic Usage
-Use as a library:
 ```typescript
-import { runEvaluation, printConsoleReport, generateReport } from "cceval"
+import {
+  generateVariations,
+  variationsToConfig,
+  runEvaluation,
+  printConsoleReport
+} from "cceval"
+// Generate variations from a CLAUDE.md
+const generated = await generateVariations({
+  claudeMdPath: "./CLAUDE.md",
+  model: "haiku",
+})
+// Convert to config
+const variations = variationsToConfig(generated)
+// Run evaluation
 const results = await runEvaluation({
   config: {
     prompts: { test: "Create a hello world server." },
-    variations: {
-      v1: "Use Bun.",
-      v2: "Use Node.",
-    },
+    variations,
     model: "haiku",
   },
-  onProgress: (variation, prompt, result) => {
-    console.log(`${variation}/${prompt}: $${result?.cost.toFixed(4)}`)
-  },
 })
 printConsoleReport(results)
-const markdown = generateReport(results)
 ```
-## Cost Estimates
-| Model | Cost per test | 25 tests |
-|-------|---------------|----------|
-| haiku | ~$0.08 | ~$2.00 |
-| sonnet | ~$0.30 | ~$7.50 |
-| opus | ~$1.50 | ~$37.50 |
-We recommend starting with **haiku** for iteration, then validating findings with **sonnet**.
 ## Key Findings from Our Research
 Based on evaluating 25+ prompt variations:
@@ -231,11 +255,30 @@ File edits and test runs are routine.
 ```
 ### 3. Verification Is the Biggest Win
-Adding "Run tests and show actual output" improved verification from 20% → 100%.
+Adding "Run tests and show actual output" improved verification from 20% to 100%.
 ### 4. Keep It Concise
 The winning prompt was just 4 lines. Long instructions get ignored.
+## Workflow
+Recommended workflow for optimizing your CLAUDE.md:
+1. **Baseline**: Run `cceval` to see how your current CLAUDE.md performs
+2. **Analyze**: Look at which variation scored best and why
+3. **Apply**: Update your CLAUDE.md based on the winning strategy
+4. **Iterate**: Run `cceval` again to verify improvement
+## Files Generated
+| File | Description |
+|------|-------------|
+| `.cceval-variations.json` | Cached generated variations (re-run faster with `--skip-generate`) |
+| `evaluation-results.json` | Full test results (JSON) |
+| `REPORT.md` | Markdown report (if `--markdown` specified) |
+Add `.cceval-variations.json` and `evaluation-results.json` to `.gitignore`.
 ## Contributing
 PRs welcome! Ideas:

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "cceval",
-  "version": "0.1.0",
+  "version": "0.2.0",
   "description": "Evaluate and benchmark your CLAUDE.md effectiveness with automated testing",
   "type": "module",
   "bin": {

package/src/cli.ts CHANGED Viewed

@@ -3,20 +3,36 @@ import { parseArgs } from "util"
 import { runEvaluation } from "./runner.ts"
 import { printConsoleReport, generateReport } from "./scoring.ts"
 import { defaultConfig, defaultPrompts, defaultVariations } from "./defaults.ts"
+import {
+  generateVariations,
+  variationsToConfig,
+  findClaudeMd,
+  defaultStrategies,
+} from "./generator.ts"
 import type { EvalConfig } from "./types.ts"
-const VERSION = "0.1.0"
+const VERSION = "0.2.0"
 const HELP = `
 cceval - Evaluate your CLAUDE.md effectiveness
 Usage:
-  cceval run [options]     Run evaluation with default or custom config
+  cceval                   Auto-detect CLAUDE.md, generate variations, and test
+  cceval auto [options]    Same as above (explicit)
+  cceval run [options]     Run with custom config file
   cceval report <file>     Generate report from existing results
   cceval init              Create a starter config file
   cceval --help            Show this help
   cceval --version         Show version
+Auto Options:
+  -p, --path <file>      Path to CLAUDE.md (default: auto-detect)
+  -m, --model <model>    Model for generation & testing (default: haiku)
+  -o, --output <file>    Output file for results (default: evaluation-results.json)
+  --markdown <file>      Also save markdown report to file
+  --strategies-only      Only show variation strategies, don't run
+  --skip-generate        Skip generation, use cached variations
 Run Options:
   -c, --config <file>    Config file (default: cceval.config.ts)
   -m, --model <model>    Model to use (default: haiku)
@@ -26,17 +42,20 @@ Run Options:
   --variations-only      Only show configured variations, don't run
 Examples:
-  # Run with defaults
-  cceval run
+  # Turnkey: auto-detect CLAUDE.md, generate variations, test
+  cceval
+  # Specify a different CLAUDE.md path
+  cceval auto -p ./docs/CLAUDE.md
-  # Run with custom model
-  cceval run -m sonnet
+  # Use a smarter model for generation
+  cceval auto -m sonnet
+  # Run with custom config file
+  cceval run -c my-config.ts
   # Generate report from previous results
   cceval report evaluation-results.json
-  # Create starter config
-  cceval init
 `
 async function loadConfig(configPath?: string): Promise<EvalConfig> {
@@ -160,6 +179,138 @@ async function reportCommand(args: string[]) {
   console.log("\n" + report)
 }
+async function autoCommand(args: string[]) {
+  const { values } = parseArgs({
+    args,
+    options: {
+      path: { type: "string", short: "p" },
+      model: { type: "string", short: "m" },
+      output: { type: "string", short: "o" },
+      markdown: { type: "string" },
+      "strategies-only": { type: "boolean" },
+      "skip-generate": { type: "boolean" },
+    },
+    allowPositionals: true,
+  })
+  const model = values.model || "haiku"
+  const outputFile = values.output || "evaluation-results.json"
+  const cachedVariationsFile = ".cceval-variations.json"
+  // Show strategies only
+  if (values["strategies-only"]) {
+    console.log("🔀 Variation Strategies:\n")
+    for (const strategy of defaultStrategies) {
+      console.log(`  ${strategy.name}:`)
+      console.log(`    ${strategy.description}\n`)
+    }
+    return
+  }
+  // Check if CLAUDE.md exists
+  const claudeMdPath = values.path || (await findClaudeMd())
+  if (!claudeMdPath) {
+    console.error("❌ No CLAUDE.md found in current directory.")
+    console.error("\nTo use cceval, you need a CLAUDE.md file. Options:")
+    console.error("  1. Create a CLAUDE.md in your project root")
+    console.error("  2. Specify a path: cceval auto -p ./path/to/CLAUDE.md")
+    console.error("  3. Use manual config: cceval init && cceval run")
+    process.exit(1)
+  }
+  let variations: Record<string, string>
+  // Try to use cached variations
+  if (values["skip-generate"] && (await Bun.file(cachedVariationsFile).exists())) {
+    console.log("📦 Using cached variations from .cceval-variations.json")
+    const cached = await Bun.file(cachedVariationsFile).json()
+    variations = cached.variations
+  } else {
+    // Generate variations
+    console.log("🚀 cceval - Turnkey CLAUDE.md Evaluation")
+    console.log("=".repeat(60))
+    console.log(`\n📄 Source: ${claudeMdPath}`)
+    console.log(`🤖 Model: ${model}`)
+    console.log(`\n🔄 Generating ${defaultStrategies.length + 2} variations...\n`)
+    const generated = await generateVariations({
+      claudeMdPath,
+      model,
+      onProgress: (strategy, status) => {
+        if (status === "start") {
+          process.stdout.write(`  ⏳ Generating "${strategy}"...`)
+        } else if (status === "done") {
+          console.log(" ✓")
+        } else {
+          console.log(" ✗")
+        }
+      },
+    })
+    variations = variationsToConfig(generated)
+    // Cache the variations
+    await Bun.write(
+      cachedVariationsFile,
+      JSON.stringify({ sourceFile: claudeMdPath, variations }, null, 2)
+    )
+    console.log(`\n📦 Cached variations to ${cachedVariationsFile}`)
+  }
+  // Build config with generated variations
+  const config: EvalConfig = {
+    prompts: defaultPrompts,
+    variations,
+    model,
+    tools: "Read,Glob,Grep",
+    delayMs: 1000,
+    outputFile,
+  }
+  const totalTests =
+    Object.keys(config.prompts).length * Object.keys(config.variations).length
+  console.log("\n" + "=".repeat(60))
+  console.log("📊 Starting Evaluation")
+  console.log("=".repeat(60))
+  console.log(`Testing ${Object.keys(config.variations).length} variations:`)
+  for (const name of Object.keys(config.variations)) {
+    console.log(`  • ${name}`)
+  }
+  console.log(`\nWith ${Object.keys(config.prompts).length} test prompts each`)
+  console.log(`Total tests: ${totalTests}`)
+  console.log("=".repeat(60) + "\n")
+  const results = await runEvaluation({
+    config,
+    onProgress: (variation, prompt, result, error) => {
+      if (error) {
+        console.log(`  ✗ ${variation}/${prompt}: Error - ${error.message}`)
+      } else if (result) {
+        console.log(`  ✓ ${variation}/${prompt}: $${result.cost.toFixed(4)}`)
+      }
+    },
+  })
+  printConsoleReport(results)
+  // Save JSON results
+  await Bun.write(outputFile, JSON.stringify(results, null, 2))
+  console.log(`\n📁 Results saved to ${outputFile}`)
+  // Save markdown if requested
+  if (values.markdown) {
+    const report = generateReport(results)
+    await Bun.write(values.markdown, report)
+    console.log(`📄 Markdown report saved to ${values.markdown}`)
+  }
+  console.log("\n💡 Tips:")
+  console.log("  • Re-run faster with: cceval auto --skip-generate")
+  console.log("  • Generate new variations: cceval auto (without --skip-generate)")
+  console.log("  • See full report: cceval report " + outputFile)
+}
 async function initCommand() {
   const configFile = "cceval.config.ts"
@@ -227,7 +378,7 @@ export default config
 async function main() {
   const args = process.argv.slice(2)
-  if (args.length === 0 || args.includes("--help") || args.includes("-h")) {
+  if (args.includes("--help") || args.includes("-h")) {
     console.log(HELP)
     return
   }
@@ -237,10 +388,19 @@ async function main() {
     return
   }
+  // No args = turnkey auto mode
+  if (args.length === 0) {
+    await autoCommand([])
+    return
+  }
   const command = args[0]
   const commandArgs = args.slice(1)
   switch (command) {
+    case "auto":
+      await autoCommand(commandArgs)
+      break
     case "run":
       await runCommand(commandArgs)
       break
@@ -251,9 +411,15 @@ async function main() {
       await initCommand()
       break
     default:
-      console.error(`Unknown command: ${command}`)
-      console.log(HELP)
-      process.exit(1)
+      // If not a known command, treat as auto with args
+      // This handles cases like: cceval -p ./CLAUDE.md
+      if (command?.startsWith("-")) {
+        await autoCommand(args)
+      } else {
+        console.error(`Unknown command: ${command}`)
+        console.log(HELP)
+        process.exit(1)
+      }
   }
 }

package/src/generator.ts ADDED Viewed

@@ -0,0 +1,281 @@
+import { $ } from "bun"
+export interface VariationStrategy {
+  name: string
+  description: string
+  prompt: string
+}
+export const defaultStrategies: VariationStrategy[] = [
+  {
+    name: "condensed",
+    description: "Shorter version keeping only critical rules",
+    prompt: `Create a condensed version of this CLAUDE.md that:
+- Keeps only the most critical rules and instructions
+- Removes redundant or overlapping guidance
+- Uses shorter, more direct language
+- Aims for 30-50% of the original length
+- Preserves the most impactful behaviors
+Return ONLY the condensed content, no explanations.`,
+  },
+  {
+    name: "explicit",
+    description: "More explicit version with clear imperatives",
+    prompt: `Create a more explicit version of this CLAUDE.md that:
+- Converts suggestions into direct commands (MUST, NEVER, ALWAYS)
+- Makes implicit rules explicit
+- Uses bullet points and clear formatting
+- Removes ambiguous language
+- Keeps the same rules but makes them unmistakably clear
+Return ONLY the explicit content, no explanations.`,
+  },
+  {
+    name: "prioritized",
+    description: "Reordered with most important rules first",
+    prompt: `Reorganize this CLAUDE.md to:
+- Put the most important/impactful rules at the very top
+- Group related rules together
+- Add a "Critical Rules" section at the start with the top 5 rules
+- Keep all original content but in a better order
+- Rules that affect code quality should come before style rules
+Return ONLY the reorganized content, no explanations.`,
+  },
+  {
+    name: "minimal",
+    description: "Just the 3-5 most critical rules",
+    prompt: `Extract ONLY the 3-5 most critical rules from this CLAUDE.md:
+- Choose rules that have the highest impact on code quality
+- Choose rules that are most unique to this project
+- Skip generic advice that any AI would follow
+- Format as a very short, focused instruction set
+Return ONLY the minimal rules, no explanations.`,
+  },
+  {
+    name: "structured",
+    description: "Well-organized with clear sections and headers",
+    prompt: `Restructure this CLAUDE.md into a well-organized document:
+- Add clear section headers (## Syntax)
+- Group related rules under appropriate headers
+- Use consistent formatting throughout
+- Add brief section summaries where helpful
+- Keep all original rules but organize them logically
+Return ONLY the restructured content, no explanations.`,
+  },
+]
+export interface GenerateOptions {
+  claudeMdPath?: string
+  strategies?: VariationStrategy[]
+  model?: string
+  onProgress?: (strategy: string, status: "start" | "done" | "error", result?: string) => void
+}
+export async function findClaudeMd(startPath?: string): Promise<string | null> {
+  const searchPaths = startPath
+    ? [startPath]
+    : ["./CLAUDE.md", "./.claude/CLAUDE.md", "../CLAUDE.md"]
+  for (const p of searchPaths) {
+    if (await Bun.file(p).exists()) {
+      return p
+    }
+  }
+  return null
+}
+export async function readClaudeMd(path: string): Promise<string> {
+  const file = Bun.file(path)
+  if (!(await file.exists())) {
+    throw new Error(`CLAUDE.md not found at ${path}`)
+  }
+  return await file.text()
+}
+// Map cceval model names to Anthropic model IDs
+function getAnthropicModel(model: string): string {
+  const modelMap: Record<string, string> = {
+    haiku: "claude-haiku-4-5-20251001",
+    sonnet: "claude-sonnet-4-20250514",
+    opus: "claude-opus-4-5-20251101",
+  }
+  return modelMap[model] || model
+}
+async function generateVariationWithAPI(
+  prompt: string,
+  model: string
+): Promise<string> {
+  const apiKey = process.env.ANTHROPIC_API_KEY
+  if (!apiKey) {
+    throw new Error("ANTHROPIC_API_KEY not set")
+  }
+  const anthropicModel = getAnthropicModel(model)
+  const response = await fetch("https://api.anthropic.com/v1/messages", {
+    method: "POST",
+    headers: {
+      "Content-Type": "application/json",
+      "x-api-key": apiKey,
+      "anthropic-version": "2023-06-01",
+    },
+    body: JSON.stringify({
+      model: anthropicModel,
+      max_tokens: 4096,
+      messages: [{ role: "user", content: prompt }],
+    }),
+  })
+  if (!response.ok) {
+    const error = await response.text()
+    throw new Error(`Anthropic API error: ${response.status} - ${error}`)
+  }
+  const data = (await response.json()) as {
+    content?: Array<{ type: string; text?: string }>
+  }
+  let result = ""
+  for (const block of data.content || []) {
+    if (block.type === "text" && block.text) {
+      result += block.text
+    }
+  }
+  return result.trim()
+}
+async function generateVariationWithCLI(
+  prompt: string,
+  model: string
+): Promise<string> {
+  // Write prompt to a temp file to avoid shell escaping issues
+  const tempFile = `/tmp/cceval-prompt-${Date.now()}.txt`
+  await Bun.write(tempFile, prompt)
+  try {
+    // Use --tools "" to disable all tools, preventing file modifications
+    // Use --settings to disable hooks
+    const cmd = `cat "${tempFile}" | claude \
+      --model "${model}" \
+      --no-chrome \
+      --disable-slash-commands \
+      --tools "" \
+      --settings '{"disableAllHooks": true}' \
+      --output-format stream-json \
+      --verbose \
+      -p`
+    const result = await $`${{ raw: cmd }}`.nothrow().quiet()
+    const output = result.stdout.toString()
+    // Parse the stream-json output
+    const lines = output.split("\n").filter((l) => l.trim())
+    let fullResponse = ""
+    for (const line of lines) {
+      try {
+        const parsed = JSON.parse(line)
+        if (parsed.type === "assistant" && parsed.message?.content) {
+          for (const block of parsed.message.content) {
+            if (block.type === "text") {
+              fullResponse += block.text
+            }
+          }
+        }
+      } catch {
+        // Skip non-JSON lines
+      }
+    }
+    return fullResponse.trim()
+  } finally {
+    // Clean up temp file
+    await Bun.file(tempFile).exists() && (await $`rm ${tempFile}`.nothrow())
+  }
+}
+async function generateVariation(
+  originalContent: string,
+  strategy: VariationStrategy,
+  model: string
+): Promise<string> {
+  const prompt = `Here is a CLAUDE.md file that configures an AI coding assistant:
+<claude_md>
+${originalContent}
+</claude_md>
+${strategy.prompt}`
+  // Try direct API first (faster, more reliable), fall back to CLI
+  if (process.env.ANTHROPIC_API_KEY) {
+    return generateVariationWithAPI(prompt, model)
+  } else {
+    return generateVariationWithCLI(prompt, model)
+  }
+}
+export interface GeneratedVariations {
+  original: string
+  baseline: string
+  variations: Record<string, string>
+  sourceFile: string
+}
+export async function generateVariations(
+  options: GenerateOptions = {}
+): Promise<GeneratedVariations> {
+  const { strategies = defaultStrategies, model = "haiku", onProgress } = options
+  // Find and read CLAUDE.md
+  const claudeMdPath = options.claudeMdPath || (await findClaudeMd())
+  if (!claudeMdPath) {
+    throw new Error(
+      "No CLAUDE.md found. Please create one or specify --path <file>"
+    )
+  }
+  const originalContent = await readClaudeMd(claudeMdPath)
+  console.log(`📄 Found CLAUDE.md at ${claudeMdPath} (${originalContent.length} chars)`)
+  const variations: Record<string, string> = {}
+  // Generate each variation
+  for (const strategy of strategies) {
+    onProgress?.(strategy.name, "start")
+    try {
+      const variation = await generateVariation(originalContent, strategy, model)
+      variations[strategy.name] = variation
+      onProgress?.(strategy.name, "done", variation)
+    } catch (error) {
+      onProgress?.(strategy.name, "error")
+      console.error(`  ✗ Failed to generate ${strategy.name}: ${error}`)
+    }
+    // Small delay to avoid rate limiting
+    await Bun.sleep(500)
+  }
+  return {
+    original: originalContent,
+    baseline: "You are a helpful coding assistant.",
+    variations,
+    sourceFile: claudeMdPath,
+  }
+}
+export function variationsToConfig(
+  generated: GeneratedVariations
+): Record<string, string> {
+  return {
+    baseline: generated.baseline,
+    original: generated.original,
+    ...generated.variations,
+  }
+}

package/src/index.ts CHANGED Viewed

@@ -10,3 +10,13 @@ export { scoreResults, generateReport, printConsoleReport } from "./scoring.ts"
 // Defaults
 export { defaultConfig, defaultPrompts, defaultVariations } from "./defaults.ts"
+// Generator (for turnkey variation generation)
+export {
+  generateVariations,
+  variationsToConfig,
+  findClaudeMd,
+  readClaudeMd,
+  defaultStrategies,
+} from "./generator.ts"
+export type { VariationStrategy, GenerateOptions, GeneratedVariations } from "./generator.ts"