npm - homunculus-code - Versions diffs - 0.1.0 → 0.2.0 - Mend

homunculus-code 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/ARCHITECTURE.md ADDED

@@ -0,0 +1,211 @@
+# Architecture: Goal-Tree-Driven Evolution
+This document explains the design philosophy behind Homunculus.
+## The Problem
+AI assistants are statically configured. You write rules, add skills, configure hooks — but the AI never improves on its own. Every improvement requires human effort:
+1. Notice something could be better
+2. Research the solution
+3. Implement it (skill, hook, rule, script)
+4. Test it manually
+5. Maintain it when things change
+This doesn't scale. As your system grows, maintenance becomes a full-time job.
+## The Solution: Goal Trees
+Instead of configuring individual behaviors, you define **what you want** in a tree of goals. The system figures out **how to achieve it** — and keeps improving the "how" over time.
+### Why a Tree?
+Goals are naturally hierarchical:
+```
+"Ship better software"
+├── "Write fewer bugs"
+│   ├── "Every change has tests"
+│   └── "Catch issues before merge"
+└── "Move faster"
+    ├── "Automate repetitive tasks"
+    └── "Debug faster"
+```
+A tree structure gives you:
+- **Decomposition** — Break big goals into measurable sub-goals
+- **Priority** — Unhealthy sub-goals bubble up to their parent
+- **Independence** — Each sub-goal can be implemented differently
+- **Replaceability** — Swap implementations without affecting goals
+### Goal Node Schema
+Every node in the tree follows the same schema:
+```yaml
+goal_name:
+  purpose: "Why this goal exists"           # Never changes
+  realized_by: path/to/implementation       # Changes as system evolves
+  metrics:
+    - name: metric_name
+      source: where to get the data
+      healthy: "what healthy looks like"    # Defines success
+  health_check:
+    command: "shell command"                # Machine-executable
+    expected: "human description"           # For readability
+  goals:
+    sub_goal_1: ...                         # Recursive
+    sub_goal_2: ...
+```
+### The `realized_by` Field
+This is the key innovation. It decouples **what** (the goal) from **how** (the implementation). The value can be:
+- A skill: `skills/tdd-workflow.md`
+- An agent: `agents/code-reviewer.md`
+- A hook: `hooks/pre-commit.sh`
+- A script: `scripts/deploy.sh`
+- A cron job: `launchd/nightly.plist`
+- An MCP server: `mcp-servers/github`
+- A rule: `rules/security.md`
+- A slash command: `commands/quality-gate.md`
+- Or anything else
+The evolution system can **replace** the implementation at any time, as long as the goal's health check still passes. This is what makes the system truly self-improving — it's not locked into any particular approach.
+## Evolution Pipeline
+### Layer 1: Observation → Instincts
+```
+Tool usage → observe.sh (hook) → observations.jsonl
+                                        │
+                         evaluate-session.js (SessionEnd)
+                                        │
+                                        ▼
+                              instincts/personal/*.md
+                              (confidence-scored patterns)
+```
+Instincts are small behavioral patterns with:
+- **Confidence scores** — increase with reinforcement, decay over time
+- **Half-life decay** — unused instincts fade (default: 90 days)
+- **Automatic archival** — `prune-instincts.js` archives low-value ones
+### Layer 2: Convergence → Skills
+```
+Related instincts (3+)  →  /evolve  →  evolved/skills/*.md
+                                              │
+                                    eval spec (scenarios)
+                                              │
+                                    /eval-skill → score
+                                              │
+                                    /improve-skill → iterate
+                                              │
+                                        100% pass rate
+```
+Skills are:
+- **Tested** — every skill has an eval spec with scenario tests
+- **Versioned** — changes tracked with semver
+- **Quality-gated** — must pass eval before adoption
+- **Regression-protected** — rollback if quality drops
+### Layer 3: Goal-Directed Evolution
+```
+architecture.yaml
+       │
+       ▼
+health_check.command → pass/fail per goal
+       │
+       ▼
+Unhealthy goals get priority
+       │
+       ▼
+Evolution engine focuses improvement
+on the weakest areas
+```
+The goal tree adds **direction** to evolution. Without it, the system learns whatever patterns happen to occur. With it, the system asks: "which goals are failing?" and focuses there.
+### Layer 4: Autonomous Operation (Nightly Agent)
+```
+launchd/cron → heartbeat.sh (scheduled)
+                    │
+    ┌───────────────┼───────────────┐
+    ▼               ▼               ▼
+Health Check    Evolve Pipeline   Research
+    │               │               │
+    ▼               ▼               ▼
+Identify weak   eval → improve    Scan for better
+goals            → prune          approaches
+    │               │               │
+    └───────────────┼───────────────┘
+                    ▼
+            Morning Report
+```
+The nightly agent closes the loop — no human intervention needed for routine evolution.
+### Layer 5: Meta-Evolution
+The evolution mechanism itself is measured and adjusted:
+| Metric | What it measures | Action if unhealthy |
+|--------|-----------------|-------------------|
+| `instinct_survival_rate` | % of instincts surviving 14 days | Raise extraction thresholds |
+| `skill_convergence` | Time from first instinct to skill | Adjust aggregation triggers |
+| `eval_discrimination` | % of scenarios that distinguish versions | Add harder boundary scenarios |
+This prevents the system from degenerating — if it's extracting too many low-quality instincts, it automatically becomes more selective.
+## Design Principles
+### 1. Goals Over Implementations
+Never optimize an implementation directly. Always ask: "which goal does this serve?" If there's no goal, the implementation shouldn't exist.
+### 2. Measure Everything
+If you can't measure it, you can't evolve it. Every goal should eventually have a `health_check` or `metrics` entry. Start with approximate metrics and refine them.
+### 3. Replaceable Everything
+No implementation is sacred. A skill can be replaced by a hook. A script can be replaced by an agent. The only constant is the goal tree.
+### 4. Safe Evolution
+- Eval specs prevent regression
+- Experiments run in isolated worktrees
+- Rollback if quality drops
+- Noise tolerance (5pp) prevents chasing statistical artifacts
+### 5. Progressive Complexity
+Start simple:
+1. Just the goal tree + observation hook
+2. Let instincts accumulate naturally
+3. Evolve first skill when you have enough instincts
+4. Add eval spec for quality control
+5. Set up nightly agent when you're ready for autonomy
+Don't try to build the full system on day one. Let it grow.
+## Comparison with Other Approaches
+| Approach | Optimizes | Direction | Quality Control | Autonomous |
+|----------|-----------|-----------|----------------|------------|
+| Manual rules | Individual behaviors | Human-directed | None | No |
+| Claude Memory | Fact recall | Usage-driven | None | No |
+| OpenClaw skills | Skill generation | Human-triggered | None | Partial |
+| **Homunculus** | **Goal achievement** | **Goal-tree-driven** | **Eval specs** | **Yes (nightly)** |
+## Further Reading
+- [README.md](README.md) — Quick start and overview
+- [docs/nightly-agent.md](docs/nightly-agent.md) — Setting up autonomous operation
+- [examples/reference/](examples/reference/) — Real-world system after 15 days
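Reviewer note (not part of the package): the Goal Node Schema and Layer 3 in ARCHITECTURE.md describe running `health_check.command` for each goal and prioritizing the goals that fail. The traversal can be sketched as below, assuming the tree is already parsed into plain objects and that exit status 0 means "healthy". `findUnhealthyGoals`, `shellCheck`, and `exampleTree` are hypothetical names for illustration, not functions or data shipped in homunculus-code.

```javascript
const { execSync } = require('child_process');

// Hypothetical runner: treats exit status 0 as "healthy" (an assumption, not documented behavior).
function shellCheck(command) {
  try {
    execSync(command, { stdio: 'ignore', timeout: 10000 });
    return true;
  } catch {
    return false;
  }
}

// Walk the goal tree depth-first and collect the dotted paths of goals whose check fails.
// Goals without a health_check are skipped rather than counted as failing.
function findUnhealthyGoals(goals, runCheck = shellCheck, prefix = '') {
  const unhealthy = [];
  for (const [name, node] of Object.entries(goals || {})) {
    const goalPath = prefix ? `${prefix}.${name}` : name;
    const check = node.health_check;
    if (check && check.command && !runCheck(check.command)) {
      unhealthy.push(goalPath);
    }
    unhealthy.push(...findUnhealthyGoals(node.goals, runCheck, goalPath));
  }
  return unhealthy;
}

// Invented example tree that mirrors the schema fields (purpose, realized_by, health_check, goals).
const exampleTree = {
  code_quality: {
    purpose: 'Write fewer bugs',
    realized_by: 'skills/tdd-workflow.md',
    health_check: { command: 'exit 0', expected: 'tests pass' },
    goals: {
      review: {
        purpose: 'Catch issues before merge',
        health_check: { command: 'exit 1', expected: 'review hook installed' },
      },
    },
  },
};

console.log(findUnhealthyGoals(exampleTree));
```

Passing the runner as a parameter keeps the traversal testable without a shell. A stub such as `(cmd) => cmd === 'exit 0'` can stand in for `shellCheck`.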

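Reviewer note (not part of the package): ARCHITECTURE.md says instinct confidence decays with a default half-life of 90 days, but it does not give the formula. The sketch below assumes plain exponential decay. `decayedConfidence`, `shouldArchive`, and the 0.2 threshold are hypothetical and are not taken from `prune-instincts.js`.

```javascript
// Default half-life stated in ARCHITECTURE.md.
const DEFAULT_HALF_LIFE_DAYS = 90;

// Assumption: exponential decay, so confidence halves every `halfLifeDays` without reinforcement.
function decayedConfidence(confidence, daysSinceReinforced, halfLifeDays = DEFAULT_HALF_LIFE_DAYS) {
  if (daysSinceReinforced <= 0) return confidence;
  return confidence * Math.pow(0.5, daysSinceReinforced / halfLifeDays);
}

// Hypothetical archival rule: the threshold value is illustrative only.
function shouldArchive(confidence, daysSinceReinforced, threshold = 0.2) {
  return decayedConfidence(confidence, daysSinceReinforced) < threshold;
}

console.log(decayedConfidence(0.8, 90));  // 0.4 (one half-life)
console.log(shouldArchive(0.8, 360));     // true (four half-lives leave 0.05)
```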
package/README.md CHANGED

@@ -1,5 +1,6 @@
 # Homunculus for Claude Code
+[![npm version](https://img.shields.io/npm/v/homunculus-code)](https://www.npmjs.com/package/homunculus-code)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Claude Code](https://img.shields.io/badge/Claude%20Code-v2.1.70+-blue)](https://docs.anthropic.com/en/docs/claude-code)
 [![Node.js](https://img.shields.io/badge/Node.js-18+-green)](https://nodejs.org)
@@ -169,20 +170,50 @@ The wizard asks you a few questions and sets everything up:
 Done! Start using Claude Code. Your assistant will evolve.
 ```
-### 2. Use Claude Code Normally
+### 2. Run Your First Evolution Cycle
-That's it. The system observes, extracts, and evolves in the background. Check progress anytime:
+```bash
+npx homunculus-code night
+```
+Watch the system check your goals, scan for patterns, and generate a report:
+```
+🌙 Homunculus — Evolution Cycle
+[1/5] Health Check
+      code_quality:    ○ no data yet
+      productivity:    ○ no data yet
+[2/5] Scan Instincts
+      ○ No instincts yet — use Claude Code normally, patterns will emerge
+[3/5] Eval Skills
+      ○ No skills yet — instincts converge into skills over time
+[4/5] Research
+      ✓ Claude Code 2.1.81
+      △ 0/7 goals have health checks — add more for better evolution
+[5/5] Report
+  ┌────────────────────────────────────────────┐
+  │  Evolution Report — 2026-03-22             │
+  │  Goals: 7 | Instincts: 0 | Skills: 0      │
+  │  Status: Fresh install — ready to evolve   │
+  └────────────────────────────────────────────┘
+```
+### 3. Use Claude Code Normally
+The observation hook watches your usage automatically. As patterns emerge, run `night` again to see your system evolve. Or set up the [nightly agent](docs/nightly-agent.md) to do it autonomously while you sleep.
 ```bash
-claude "/eval-skill"         # Run skill evaluations
+npx homunculus-code night    # Manual evolution cycle
+claude "/eval-skill"         # Evaluate a specific skill
 claude "/improve-skill"      # Auto-improve a skill
 claude "/evolve"             # Converge instincts into skills
 ```
-### 3. Refine Your Goals (Optional)
-As you use the system, refine `architecture.yaml` — add sub-goals, metrics, and health checks. The more specific your goals, the smarter the evolution.
 ---
 ## Key Concepts

package/bin/cli.js ADDED

@@ -0,0 +1,36 @@
+#!/usr/bin/env node
+// homunculus-code CLI — entry point
+const command = process.argv[2];
+switch (command) {
+  case 'init':
+    require('./init.js');
+    break;
+  case 'night':
+    require('./night.js');
+    break;
+  case 'help':
+  case '--help':
+  case '-h':
+  case undefined:
+    console.log('');
+    console.log('  \x1b[1mHomunculus\x1b[0m — Self-evolving AI Assistant for Claude Code');
+    console.log('');
+    console.log('  Usage:');
+    console.log('    npx homunculus-code <command>');
+    console.log('');
+    console.log('  Commands:');
+    console.log('    init     Set up Homunculus in your project');
+    console.log('    night    Run one evolution cycle (health check → evolve → report)');
+    console.log('    help     Show this help message');
+    console.log('');
+    console.log('  After init, use Claude Code normally. Evolution happens automatically.');
+    console.log('  Run "night" anytime to trigger a manual evolution cycle.');
+    console.log('');
+    break;
+  default:
+    console.error(`  Unknown command: ${command}`);
+    console.error('  Run "npx homunculus-code help" for available commands.');
+    process.exit(1);
+}

package/bin/night.js ADDED

@@ -0,0 +1,292 @@
+#!/usr/bin/env node
+// homunculus night — Run one evolution cycle
+// Simulates what the nightly agent does: health check → evolve → research → report
+const fs = require('fs');
+const path = require('path');
+const { execSync } = require('child_process');
+const projectDir = process.cwd();
+const HOMUNCULUS_DIR = path.join(projectDir, 'homunculus');
+const ARCH_FILE = path.join(projectDir, 'architecture.yaml');
+const INSTINCTS_DIR = path.join(HOMUNCULUS_DIR, 'instincts', 'personal');
+const ARCHIVED_DIR = path.join(HOMUNCULUS_DIR, 'instincts', 'archived');
+const SKILLS_DIR = path.join(HOMUNCULUS_DIR, 'evolved', 'skills');
+const EVALS_DIR = path.join(HOMUNCULUS_DIR, 'evolved', 'evals');
+const OBS_FILE = path.join(HOMUNCULUS_DIR, 'observations.jsonl');
+const SCRIPTS_DIR = path.join(projectDir, 'scripts');
+const dim = (s) => `\x1b[2m${s}\x1b[0m`;
+const bold = (s) => `\x1b[1m${s}\x1b[0m`;
+const green = (s) => `\x1b[32m${s}\x1b[0m`;
+const yellow = (s) => `\x1b[33m${s}\x1b[0m`;
+const red = (s) => `\x1b[31m${s}\x1b[0m`;
+const cyan = (s) => `\x1b[36m${s}\x1b[0m`;
+function countFiles(dir, ext = '.md') {
+  try {
+    return fs.readdirSync(dir).filter(f => f.endsWith(ext)).length;
+  } catch { return 0; }
+}
+function countLines(file) {
+  try {
+    return fs.readFileSync(file, 'utf8').trim().split('\n').filter(Boolean).length;
+  } catch { return 0; }
+}
+function sleep(ms) {
+  return new Promise(resolve => setTimeout(resolve, ms));
+}
+function parseGoals(yamlContent) {
+  const goals = [];
+  const lines = yamlContent.split('\n');
+  let currentGoal = null;
+  let indent = 0;
+  for (const line of lines) {
+    // Match top-level goals (direct children of "goals:")
+    const goalMatch = line.match(/^    (\w+):$/);
+    if (goalMatch && !line.includes('purpose') && !line.includes('metrics') &&
+        !line.includes('health_check') && !line.includes('realized_by') &&
+        !line.includes('goals:') && !line.includes('command') &&
+        !line.includes('expected') && !line.includes('healthy') &&
+        !line.includes('name:') && !line.includes('source')) {
+      currentGoal = goalMatch[1];
+    }
+    const purposeMatch = line.match(/purpose:\s*"(.+)"/);
+    if (purposeMatch && currentGoal) {
+      goals.push({ name: currentGoal, purpose: purposeMatch[1] });
+      currentGoal = null;
+    }
+  }
+  return goals;
+}
+async function main() {
+  console.log('');
+  console.log(`  ${bold('🌙 Homunculus — Evolution Cycle')}`);
+  console.log(`  ${dim(new Date().toISOString().replace('T', ' ').slice(0, 19))}`);
+  console.log('');
+  // Check if initialized
+  if (!fs.existsSync(HOMUNCULUS_DIR)) {
+    console.log(`  ${red('✗')} Not initialized. Run ${bold('npx homunculus-code init')} first.`);
+    process.exit(1);
+  }
+  // ─────────────────────────────────────────
+  // Phase 1: Health Check
+  // ─────────────────────────────────────────
+  console.log(`  ${bold('[1/5] Health Check')}`);
+  await sleep(300);
+  if (fs.existsSync(ARCH_FILE)) {
+    const yaml = fs.readFileSync(ARCH_FILE, 'utf8');
+    const goals = parseGoals(yaml);
+    if (goals.length > 0) {
+      for (const goal of goals) {
+        // Placeholder: no health check is executed yet, so every goal is reported as "no data yet"
+        const status = yellow('○ no data yet');
+        console.log(`        ${goal.name}: ${status}`);
+        console.log(`        ${dim(goal.purpose)}`);
+      }
+    } else {
+      console.log(`        ${yellow('○')} No goals defined in architecture.yaml`);
+    }
+  } else {
+    console.log(`        ${red('✗')} architecture.yaml not found`);
+  }
+  console.log('');
+  await sleep(200);
+  // ─────────────────────────────────────────
+  // Phase 2: Scan Instincts
+  // ─────────────────────────────────────────
+  console.log(`  ${bold('[2/5] Scan Instincts')}`);
+  await sleep(300);
+  const activeInstincts = countFiles(INSTINCTS_DIR);
+  const archivedInstincts = countFiles(ARCHIVED_DIR);
+  const observations = countLines(OBS_FILE);
+  if (activeInstincts > 0) {
+    console.log(`        ${green('✓')} ${activeInstincts} active instincts (${archivedInstincts} archived)`);
+    // Run prune check
+    try {
+      const pruneScript = path.join(SCRIPTS_DIR, 'prune-instincts.js');
+      if (fs.existsSync(pruneScript)) {
+        const result = execSync(`node "${pruneScript}"`, {
+          encoding: 'utf8', timeout: 10000, cwd: projectDir
+        });
+        const candidateMatch = result.match(/Candidates:\s*(\d+)/);
+        if (candidateMatch && parseInt(candidateMatch[1], 10) > 0) {
+          console.log(`        ${yellow('△')} ${candidateMatch[1]} instincts eligible for archival`);
+        }
+      }
+    } catch {}
+  } else if (observations > 0) {
+    console.log(`        ${yellow('○')} ${observations} observations recorded, no instincts extracted yet`);
+    console.log(`        ${dim('Tip: instincts are extracted at session end when enough patterns emerge')}`);
+  } else {
+    console.log(`        ${yellow('○')} No instincts yet — use Claude Code normally, patterns will emerge`);
+    console.log(`        ${dim('The observation hook is watching. Keep using Claude Code.')}`);
+  }
+  console.log('');
+  await sleep(200);
+  // ─────────────────────────────────────────
+  // Phase 3: Eval Skills
+  // ─────────────────────────────────────────
+  console.log(`  ${bold('[3/5] Eval Skills')}`);
+  await sleep(300);
+  const skillCount = countFiles(SKILLS_DIR);
+  const evalCount = countFiles(EVALS_DIR, '.yaml');
+  if (skillCount > 0) {
+    console.log(`        ${green('✓')} ${skillCount} evolved skills found`);
+    // List skills
+    try {
+      const skills = fs.readdirSync(SKILLS_DIR).filter(f => f.endsWith('.md'));
+      for (const skill of skills) {
+        const name = path.basename(skill, '.md');
+        const hasEval = fs.existsSync(path.join(EVALS_DIR, `${name}.eval.yaml`));
+        if (hasEval) {
+          console.log(`          ${green('✓')} ${name} — has eval spec`);
+        } else {
+          console.log(`          ${yellow('○')} ${name} — no eval spec yet`);
+        }
+      }
+    } catch {}
+    if (evalCount > 0) {
+      console.log(`        ${dim(`Run 'claude "/eval-skill"' to evaluate, or let the nightly agent handle it`)}`);
+    }
+  } else {
+    console.log(`        ${yellow('○')} No skills yet — instincts converge into skills over time`);
+    console.log(`        ${dim(`Once you have 3+ related instincts, run 'claude "/evolve"'`)}`);
+  }
+  console.log('');
+  await sleep(200);
+  // ─────────────────────────────────────────
+  // Phase 4: Research
+  // ─────────────────────────────────────────
+  console.log(`  ${bold('[4/5] Research')}`);
+  await sleep(300);
+  // Check Claude Code version
+  try {
+    const version = execSync('claude --version 2>/dev/null', {
+      encoding: 'utf8', timeout: 5000
+    }).trim();
+    console.log(`        ${green('✓')} Claude Code ${version}`);
+  } catch {
+    console.log(`        ${yellow('○')} Claude Code version check skipped`);
+  }
+  // Check Node version
+  const nodeVersion = process.version;
+  console.log(`        ${green('✓')} Node.js ${nodeVersion}`);
+  // Check architecture health
+  if (fs.existsSync(ARCH_FILE)) {
+    const yaml = fs.readFileSync(ARCH_FILE, 'utf8');
+    const goalCount = (yaml.match(/purpose:/g) || []).length;
+    const healthChecks = (yaml.match(/health_check:/g) || []).length;
+    const realizedBy = (yaml.match(/realized_by:/g) || []).length;
+    if (healthChecks < goalCount / 2) {
+      console.log(`        ${yellow('△')} ${healthChecks}/${goalCount} goals have health checks — add more for better evolution`);
+    }
+    if (realizedBy < goalCount / 2) {
+      console.log(`        ${yellow('△')} ${realizedBy}/${goalCount} goals have implementations — some are waiting to evolve`);
+    }
+  }
+  console.log('');
+  await sleep(200);
+  // ─────────────────────────────────────────
+  // Phase 5: Report
+  // ─────────────────────────────────────────
+  console.log(`  ${bold('[5/5] Report')}`);
+  await sleep(300);
+  const reportDate = new Date().toISOString().slice(0, 10);
+  const reportLines = [];
+  console.log('');
+  console.log(`  ┌${'─'.repeat(52)}┐`);
+  console.log(`  │ ${bold(`  Evolution Report — ${reportDate}`)}              │`);
+  console.log(`  ├${'─'.repeat(52)}┤`);
+  // Summary
+  const goalCount = fs.existsSync(ARCH_FILE)
+    ? (fs.readFileSync(ARCH_FILE, 'utf8').match(/purpose:/g) || []).length
+    : 0;
+  console.log(`  │                                                    │`);
+  console.log(`  │  ${cyan('Goals:')}          ${String(goalCount).padStart(3)}                              │`);
+  console.log(`  │  ${cyan('Instincts:')}      ${String(activeInstincts).padStart(3)} active / ${String(archivedInstincts).padStart(3)} archived       │`);
+  console.log(`  │  ${cyan('Skills:')}          ${String(skillCount).padStart(3)} (${evalCount} with eval specs)          │`);
+  console.log(`  │  ${cyan('Observations:')}  ${String(observations).padStart(5)}                              │`);
+  console.log(`  │                                                    │`);
+  if (activeInstincts === 0 && skillCount === 0) {
+    console.log(`  │  ${bold('Status:')} Fresh install — ready to evolve          │`);
+    console.log(`  │                                                    │`);
+    console.log(`  │  ${dim('Next steps:')}                                      │`);
+    console.log(`  │  ${dim('1. Use Claude Code on your project')}                │`);
+    console.log(`  │  ${dim('2. Patterns will be auto-extracted')}                │`);
+    console.log(`  │  ${dim('3. Run this command again to see progress')}         │`);
+  } else if (skillCount === 0) {
+    console.log(`  │  ${bold('Status:')} Collecting patterns...                   │`);
+    console.log(`  │                                                    │`);
+    console.log(`  │  ${dim('You have instincts! When 3+ are related,')}         │`);
+    console.log(`  │  ${dim('run claude "/evolve" to create your first skill.')} │`);
+  } else {
+    console.log(`  │  ${bold('Status:')} ${green('Evolving')}                                  │`);
+    console.log(`  │                                                    │`);
+    console.log(`  │  ${dim('Run claude "/eval-skill" to check quality,')}       │`);
+    console.log(`  │  ${dim('or set up the nightly agent for autonomy.')}        │`);
+  }
+  console.log(`  │                                                    │`);
+  console.log(`  └${'─'.repeat(52)}┘`);
+  console.log('');
+  // Save report to file
+  const reportDir = path.join(HOMUNCULUS_DIR, 'reports');
+  if (!fs.existsSync(reportDir)) fs.mkdirSync(reportDir, { recursive: true });
+  const reportContent = [
+    `# Evolution Report — ${reportDate}`,
+    '',
+    '## Summary',
+    `- Goals: ${goalCount}`,
+    `- Instincts: ${activeInstincts} active / ${archivedInstincts} archived`,
+    `- Skills: ${skillCount} (${evalCount} with eval specs)`,
+    `- Observations: ${observations}`,
+    '',
+    `Generated by \`npx homunculus-code night\``,
+  ].join('\n');
+  fs.writeFileSync(path.join(reportDir, `${reportDate}.md`), reportContent + '\n');
+  console.log(`  ${dim(`Report saved to homunculus/reports/${reportDate}.md`)}`);
+  console.log('');
+}
+main().catch(err => {
+  console.error('Error:', err.message);
+  process.exit(1);
+});
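Reviewer note (not part of the package): `parseGoals` in night.js is a line-based regex scan rather than a YAML parser, so its result depends on exact indentation and quoting. The sketch below is a standalone copy of its matching logic, with the chain of `includes` checks condensed into a `RESERVED` list (a hypothetical name). It runs on an invented `architecture.yaml` fragment.

```javascript
// Substrings that night.js rejects when deciding whether a line names a goal.
const RESERVED = ['purpose', 'metrics', 'health_check', 'realized_by', 'goals:',
                  'command', 'expected', 'healthy', 'name:', 'source'];

function parseGoals(yamlContent) {
  const goals = [];
  let currentGoal = null;
  for (const line of yamlContent.split('\n')) {
    // Only keys indented by exactly four spaces, with nothing after the colon, count as goals.
    const goalMatch = line.match(/^    (\w+):$/);
    if (goalMatch && !RESERVED.some((key) => line.includes(key))) {
      currentGoal = goalMatch[1];
    }
    // The purpose must be double-quoted to be picked up.
    const purposeMatch = line.match(/purpose:\s*"(.+)"/);
    if (purposeMatch && currentGoal) {
      goals.push({ name: currentGoal, purpose: purposeMatch[1] });
      currentGoal = null;
    }
  }
  return goals;
}

// Invented sample: one nested goal and one single-quoted purpose.
const sample = [
  'root:',
  '  purpose: "Ship better software"',
  '  goals:',
  '    code_quality:',
  '      purpose: "Write fewer bugs"',
  '      goals:',
  '        tests:',
  '          purpose: "Every change has tests"',
  '    productivity:',
  "      purpose: 'Move faster'",
].join('\n');

console.log(parseGoals(sample));
// [ { name: 'code_quality', purpose: 'Write fewer bugs' } ]
```

Only `code_quality` is returned. The nested `tests` goal is indented eight spaces and never matches, and `productivity` is dropped because its purpose is single-quoted. A goal whose name contains a reserved substring such as `source` would be skipped as well.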

package/examples/reference/README.md CHANGED

@@ -10,7 +10,19 @@ reference/
 ├── evolved-skills/            # 7 evolved skills (all 100% eval pass)
 │   ├── api-system-diagnosis.md
 │   ├── assistant-system-management.md
-│   ├── claude-code-reference.md
+│   ├── claude-code-reference/    # Split into index + 11 chapters (95% context reduction)
+│   │   ├── index.md              # Routing table — loads chapters on demand
+│   │   ├── ch01-extensions.md
+│   │   ├── ch02-agents.md
+│   │   ├── ch03-hooks.md
+│   │   ├── ch04-mcp.md
+│   │   ├── ch05-ui.md
+│   │   ├── ch06-rules-security.md
+│   │   ├── ch07-model-memory.md
+│   │   ├── ch08-workflows.md
+│   │   ├── ch09-integrations.md
+│   │   ├── ch10-cli-sdk.md
+│   │   └── ch11-output-versions.md
 │   ├── development-verification-patterns.md
 │   ├── multi-agent-design-patterns.md
 │   ├── shell-automation-patterns.md