npm - @cpretzinger/boss-claude - Versions diffs - 1.0.0 → 1.0.2 - Mend

@cpretzinger/boss-claude 1.0.0 → 1.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (87)

package/README.md +304 -1
package/bin/boss-claude.js +1138 -0
package/bin/commands/mode.js +250 -0
package/bin/onyx-guard.js +259 -0
package/bin/onyx-guard.sh +251 -0
package/bin/prompts.js +284 -0
package/bin/rollback.js +85 -0
package/bin/setup-wizard.js +492 -0
package/config/.env.example +17 -0
package/lib/README.md +83 -0
package/lib/agent-logger.js +61 -0
package/lib/agents/memory-engineers/github-memory-engineer.js +251 -0
package/lib/agents/memory-engineers/postgres-memory-engineer.js +633 -0
package/lib/agents/memory-engineers/qdrant-memory-engineer.js +358 -0
package/lib/agents/memory-engineers/redis-memory-engineer.js +383 -0
package/lib/agents/memory-supervisor.js +526 -0
package/lib/agents/registry.js +135 -0
package/lib/auto-monitor.js +131 -0
package/lib/checkpoint-hook.js +112 -0
package/lib/checkpoint.js +319 -0
package/lib/commentator.js +213 -0
package/lib/context-scribe.js +120 -0
package/lib/delegation-strategies.js +326 -0
package/lib/hierarchy-validator.js +643 -0
package/lib/index.js +15 -0
package/lib/init-with-mode.js +261 -0
package/lib/init.js +44 -6
package/lib/memory-result-aggregator.js +252 -0
package/lib/memory.js +35 -7
package/lib/mode-enforcer.js +473 -0
package/lib/onyx-banner.js +169 -0
package/lib/onyx-identity.js +214 -0
package/lib/onyx-monitor.js +381 -0
package/lib/onyx-reminder.js +188 -0
package/lib/onyx-tool-interceptor.js +341 -0
package/lib/onyx-wrapper.js +315 -0
package/lib/orchestrator-gate.js +334 -0
package/lib/output-formatter.js +296 -0
package/lib/postgres.js +1 -1
package/lib/prompt-injector.js +220 -0
package/lib/prompts.js +532 -0
package/lib/session.js +153 -6
package/lib/setup/README.md +187 -0
package/lib/setup/env-manager.js +785 -0
package/lib/setup/error-recovery.js +630 -0
package/lib/setup/explain-scopes.js +385 -0
package/lib/setup/github-instructions.js +333 -0
package/lib/setup/github-repo.js +254 -0
package/lib/setup/import-credentials.js +498 -0
package/lib/setup/index.js +62 -0
package/lib/setup/init-postgres.js +785 -0
package/lib/setup/init-redis.js +456 -0
package/lib/setup/integration-test.js +652 -0
package/lib/setup/progress.js +357 -0
package/lib/setup/rollback.js +670 -0
package/lib/setup/rollback.test.js +452 -0
package/lib/setup/setup-with-rollback.example.js +351 -0
package/lib/setup/summary.js +400 -0
package/lib/setup/test-github-setup.js +10 -0
package/lib/setup/test-postgres-init.js +98 -0
package/lib/setup/verify-setup.js +102 -0
package/lib/task-agent-worker.js +235 -0
package/lib/token-monitor.js +466 -0
package/lib/tool-wrapper-integration.js +369 -0
package/lib/tool-wrapper.js +387 -0
package/lib/validators/README.md +497 -0
package/lib/validators/config.js +583 -0
package/lib/validators/config.test.js +175 -0
package/lib/validators/github.js +310 -0
package/lib/validators/github.test.js +61 -0
package/lib/validators/index.js +15 -0
package/lib/validators/postgres.js +525 -0
package/package.json +98 -13
package/scripts/benchmark-memory.js +433 -0
package/scripts/check-secrets.sh +12 -0
package/scripts/fetch-todos.mjs +148 -0
package/scripts/graceful-shutdown.sh +156 -0
package/scripts/install-onyx-hooks.js +373 -0
package/scripts/install.js +119 -18
package/scripts/redis-monitor.js +284 -0
package/scripts/redis-setup.js +412 -0
package/scripts/test-memory-retrieval.js +201 -0
package/scripts/validate-exports.js +68 -0
package/scripts/validate-package.js +120 -0
package/scripts/verify-onyx-deployment.js +309 -0
package/scripts/verify-redis-deployment.js +354 -0
package/scripts/verify-redis-init.js +219 -0

package/README.md CHANGED

@@ -12,6 +12,10 @@ Boss Claude turns every coding session into an RPG-style experience where Claude
 - 📊 **Career Stats**: Track total sessions, repos managed, token earnings
 - 🔍 **Semantic Search**: Recall past sessions with natural language queries
 - 🏆 **Progression System**: Level-based XP, token banking, achievement tracking
+- 👁️ **Agent Watch**: Real-time monitoring of agent activity in companion window
+- 🏛️ **Hierarchy Enforcement**: Canon rules ensure agents work safely within boundaries
+- 🚨 **Token Monitor**: Real-time delegation enforcement - screams when ONYX burns >100 tokens without using Task tool
+- ⏸️ **Checkpoint System**: Pauses ONYX every 5 messages to ask "Did you delegate or burn tokens?"
 ## Installation
@@ -19,6 +23,21 @@ Boss Claude turns every coding session into an RPG-style experience where Claude
 npm install -g @cpretzinger/boss-claude
 ```
+### What Happens on Install
+When you run `npm install`, the postinstall script automatically:
+1. **Creates `~/.boss-claude/`** - Configuration directory for your credentials
+2. **Auto-detects credentials** - Imports from Railway CLI, environment, or existing configs
+3. **Injects ONYX MODE into `~/.claude/CLAUDE.md`** - The conductor rules that make Claude delegate work
+The CLAUDE.md injection adds the "Conductor" identity to Claude, which enforces:
+- **Forbidden tools**: Read, Write, Edit, Bash, Grep, Glob, NotebookEdit
+- **Allowed tools**: Task (delegation), WebFetch, WebSearch, TodoWrite, Skill
+- **Delegation matrix**: Deterministic routing of requests to appropriate agents
+This means in ANY repository, Claude automatically becomes ONYX - the conductor who waves the baton but never plays the instruments.
 ## Setup
 ### 1. Configure Credentials
@@ -97,6 +116,49 @@ Saves current session to GitHub Issues with:
 - Automatic summary if not provided
 - Optional tags for organization
 - XP and token rewards
+#### Watch Agent Activity
+```bash
+boss-claude watch
+```
+Opens a real-time monitor showing all agent activity. Perfect for:
+- Debugging multi-agent workflows
+- Monitoring Task agent execution
+- Tracking automation progress
+See [Agent Watch Documentation](docs/WATCH-QUICKSTART.md) for integration guide.
+#### Live Agent Commentary
+```bash
+boss-claude commentate
+```
+Real-time play-by-play of what agents are doing - reads, writes, executions.
+#### ONYX Checkpoint System
+```bash
+# Check delegation efficiency
+boss-claude checkpoint:status
+# Record delegation decision
+boss-claude checkpoint:record --delegated --tokens 25000 --specialist "agent-name" --justification "reason"
+# View decision history
+boss-claude checkpoint:history
+```
+Enforces delegation accountability by pausing every 5 messages to ask: "Did you delegate or burn tokens?"
+See [Checkpoint Documentation](CHECKPOINT-SYSTEM.md) for complete guide.
+#### Run Tests
+```bash
+npm test
+```
+Validates all modules load and CLI commands work (10 tests).
 - Searchable history
 #### Recall Past Sessions
@@ -177,16 +239,41 @@ Redis Keys
 - ...and so on (100 XP per level)
 ### Rewards
-- **XP**: 50 XP per session saved
+- **Base XP**: 50 XP per session saved
+- **Efficiency Bonus**: Up to +100 XP based on delegation efficiency
+- **Delegation Bonus**: +2 XP per delegation (up to +20)
 - **Token Bank**: All tokens used during session are banked
 - **Net Worth**: Token bank × $0.000003 per token
+### Efficiency Multiplier System 🎯
+The efficiency bonus rewards you for being a true conductor - delegating work to agents instead of doing it yourself.
+**Formula**: `agent_tokens / onyx_tokens = efficiency_ratio`
+**Example**:
+- ONYX used 20,000 tokens (orchestration overhead)
+- Agents used 600,000 tokens (actual work)
+- Efficiency: 600,000 / 20,000 = **30x**
+- Bonus XP: **+30** (capped at 100)
+The status display shows your current efficiency:
+```
+⚡ EFFICIENCY TRACKER (XP Multiplier)
+   🎺 ONYX Tokens: 20,000 (orchestration)
+   🎻 Agent Tokens: 600,000 (work done)
+   📈 Efficiency Ratio: 30.0x
+   🎯 Delegations: 15
+   💎 Projected Bonus XP: +30 (efficiency) +20 (delegation)
+```
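The reward rules added in this hunk can be sketched as a small calculation. This is an illustration only: `projectSessionXp` and its field names are hypothetical, and rounding the ratio down and treating zero ONYX tokens as a 0x ratio are assumptions that the README does not state.

```javascript
// Hypothetical helper, not part of the package's API.
// Constants come from the README text: 50 base XP, efficiency bonus capped
// at 100, and +2 XP per delegation capped at +20.
const BASE_XP = 50;
const MAX_EFFICIENCY_BONUS = 100;
const XP_PER_DELEGATION = 2;
const MAX_DELEGATION_BONUS = 20;

function projectSessionXp({ onyxTokens, agentTokens, delegations }) {
  // efficiency_ratio = agent_tokens / onyx_tokens (guarding against division by zero)
  const ratio = onyxTokens > 0 ? agentTokens / onyxTokens : 0;
  // Assumption: the bonus is the ratio rounded down, then capped.
  const efficiencyBonus = Math.min(Math.floor(ratio), MAX_EFFICIENCY_BONUS);
  const delegationBonus = Math.min(delegations * XP_PER_DELEGATION, MAX_DELEGATION_BONUS);
  return {
    ratio,
    efficiencyBonus,
    delegationBonus,
    total: BASE_XP + efficiencyBonus + delegationBonus,
  };
}

// The README's own example: 20,000 ONYX tokens, 600,000 agent tokens, 15 delegations.
const projected = projectSessionXp({ onyxTokens: 20000, agentTokens: 600000, delegations: 15 });
console.log(projected);
```

With the README's figures, 15 delegations would earn 30 XP uncapped, so the +20 shown in the status display is the cap taking effect.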
 ### Stats Tracked
 - Total sessions across all repos
 - Repositories managed
 - Token bank size
 - Current level and XP progress
 - Per-repo session counts
+- Efficiency ratio per session
 ## CLI Reference
@@ -203,10 +290,60 @@ boss-claude save [summary] [--tags <tags>]
 # Search past sessions
 boss-claude recall <query> [--limit <number>]
+# Run integration tests
+boss-claude test
 # Show help
 boss-claude --help
 ```
+## Testing
+Boss Claude includes a comprehensive integration test suite that validates:
+- Redis connectivity and operations
+- PostgreSQL database and schema
+- GitHub API integration
+- Full system end-to-end workflow
+```bash
+boss-claude test
+```
+The test suite runs in 3-5 seconds and validates all system components without affecting production data. See [TESTING.md](TESTING.md) for full documentation.
+## Benchmarking
+### Memory System Performance
+Boss Claude includes a comprehensive benchmark to compare the old GitHub-based memory system with the new MemorySupervisor architecture:
+```bash
+# Quick benchmark (3 runs per query)
+npm run benchmark:memory
+# Verbose output with detailed per-test stats
+npm run benchmark:memory:verbose
+# Extended benchmark (10 runs per query)
+npm run benchmark:memory:extended
+```
+**Key Metrics:**
+- **Response Time**: Old system (2-120s) vs New system (<5s cache miss, <1s cache hit)
+- **Cache Hit Rate**: Redis caching provides 30-50x speedup on repeated queries
+- **Startup Impact**: 50%+ faster Claude initialization with token savings
+- **Memory Usage**: Node.js heap tracking per operation
+**Architecture Comparison:**
+| System | Approach | Avg Response | Caching |
+|--------|----------|--------------|---------|
+| OLD | Direct GitHub API | 3-5s | None |
+| NEW (Cache Miss) | 4 Parallel Engineers | <5s | Redis (5min TTL) |
+| NEW (Cache Hit) | Redis Only | <200ms | 30-50x speedup |
+The benchmark measures real-world performance across 8 varied queries and outputs comprehensive JSON results. See [docs/BENCHMARK-MEMORY.md](docs/BENCHMARK-MEMORY.md) for detailed documentation.
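The cache-miss and cache-hit rows of the comparison table describe a cache-aside pattern, which might look roughly like the sketch below. Everything here is hypothetical: an in-memory `Map` stands in for Redis, `queryMemory` is an invented name, and the four stub engineers only borrow their names from the `memory-engineers` files in the changed-files list. It is not the package's `MemorySupervisor`.

```javascript
// Illustrative cache-aside sketch; a Map stands in for Redis.
const CACHE_TTL_MS = 5 * 60 * 1000; // the table's 5-minute TTL
const cache = new Map();

async function queryMemory(query, engineers, now = Date.now()) {
  const entry = cache.get(query);
  if (entry && entry.expiresAt > now) {
    // Cache hit: answer from the cache without calling any engineer.
    return { source: 'cache', results: entry.results };
  }
  // Cache miss: fan out to every engineer in parallel, then store the result.
  const results = await Promise.all(engineers.map((engineer) => engineer(query)));
  cache.set(query, { results, expiresAt: now + CACHE_TTL_MS });
  return { source: 'engineers', results };
}

// Stub engineers (assumed shape): each returns an empty match list.
const engineers = ['redis', 'postgres', 'qdrant', 'github'].map(
  (name) => async (query) => ({ engineer: name, query, matches: [] })
);
```

Passing `now` explicitly keeps the expiry logic testable without waiting for the TTL to elapse.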
 ## Environment Variables
 | Variable | Required | Default | Description |
@@ -216,6 +353,172 @@ boss-claude --help
 | `GITHUB_OWNER` | No | `cpretzinger` | GitHub username |
 | `GITHUB_MEMORY_REPO` | No | `boss-claude-memory` | Repository name for memory storage |
+## Agent Hierarchy and Canon Rules
+Boss Claude implements a multi-tier agent hierarchy system with enforced canon rules to ensure safe, efficient operation across all repositories.
+### The Conductor Model
+ONYX operates as **THE CONDUCTOR** - directing but never playing:
+```
+┌─────────────────────────────────────────────────────────────┐
+│                     🎼 ONYX (Conductor)                     │
+│                                                              │
+│  ❌ FORBIDDEN: Read, Write, Edit, Bash, Grep, Glob          │
+│  ✅ ALLOWED: Task, WebFetch, WebSearch, TodoWrite, Skill    │
+│                                                              │
+│  "The conductor never plays an instrument.                   │
+│   I wave the baton. My musicians make the music."           │
+└────────────────────────┬────────────────────────────────────┘
+                         │ Task Tool (Delegation)
+    ┌────────────────────┼────────────────────┐
+    ▼                    ▼                    ▼
+┌─────────┐        ┌──────────┐        ┌─────────┐
+│ Explore │        │ general- │        │  Bash   │
+│  Agent  │        │ purpose  │        │  Agent  │
+│         │        │  Agent   │        │         │
+│ Search  │        │ Build    │        │ Execute │
+│ Read    │        │ Fix      │        │ Test    │
+│ Analyze │        │ Create   │        │ Deploy  │
+└─────────┘        └──────────┘        └─────────┘
+```
+### Delegation Matrix
+ONYX uses deterministic delegation based on user request keywords:
+| User Request | Agent Type | Task Prompt Example |
+|--------------|------------|---------------------|
+| "find/search/where is" | `Explore` | "Search codebase for..." |
+| "read/show/what's in" | `Explore` | "Read and summarize..." |
+| "build/create/implement" | `general-purpose` | "Implement X feature..." |
+| "fix/debug/error" | `general-purpose` | "Debug and fix..." |
+| "run/execute/npm/git" | `Bash` | "Execute command..." |
+| "test/verify" | `Bash` | "Run tests and verify..." |
+| "plan/design/architect" | `Plan` | "Design approach for..." |
+| Multiple files | Parallel agents | Split into separate tasks |
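A keyword lookup like the one this table describes could be sketched as follows. The keyword lists and agent types are copied from the table. The function name `routeRequest`, the whole-word matching, and the first-matching-row-wins order are assumptions made for illustration and may differ from what `delegation-strategies.js` does.

```javascript
// Hypothetical routing sketch based on the README's delegation matrix.
const DELEGATION_MATRIX = [
  { keywords: ['find', 'search', 'where is'], agent: 'Explore' },
  { keywords: ['read', 'show', "what's in"], agent: 'Explore' },
  { keywords: ['build', 'create', 'implement'], agent: 'general-purpose' },
  { keywords: ['fix', 'debug', 'error'], agent: 'general-purpose' },
  { keywords: ['run', 'execute', 'npm', 'git'], agent: 'Bash' },
  { keywords: ['test', 'verify'], agent: 'Bash' },
  { keywords: ['plan', 'design', 'architect'], agent: 'Plan' },
];

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function routeRequest(request) {
  const text = request.toLowerCase();
  for (const row of DELEGATION_MATRIX) {
    // Match whole words only, so "git" does not fire on "digit".
    const matched = row.keywords.some((keyword) =>
      new RegExp(`\\b${escapeRegExp(keyword)}\\b`).test(text)
    );
    if (matched) return row.agent; // assumption: the first matching row wins
  }
  return null; // no keyword matched; the table does not define a fallback
}

console.log(routeRequest('Where is the session handler?'));
```

The "Multiple files" row is left out because it describes splitting work across agents rather than a keyword match.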
+### What Sub-Agents See
+When ONYX delegates via Task tool, the spawned agent receives:
+1. **The task prompt** - Clear instructions on what to do
+2. **Access to the codebase** - Full Read/Write/Edit capabilities
+3. **No ONYX restrictions** - Agents CAN use forbidden tools
+4. **Repo boundary rules** - Still enforced per canon
+Sub-agents do NOT automatically see:
+- CLAUDE.md conductor rules (only ONYX has these)
+- Previous conversation history (unless in task prompt)
+- Other agents' work (unless coordinated)
+### Repository Boundary Rule (CANON)
+**Core Rule**: Agents ONLY write in current repository. NEVER write to other repos.
+This fundamental boundary rule prevents cross-repository contamination and ensures agents maintain clear operational boundaries. All file write operations are validated through the `HierarchyValidator.checkRepoBoundary()` gate check.
+**Features**:
+- Automated validation of all file write operations
+- Blocks writes outside current repository
+- Requires explicit justification for overrides
+- All violations logged as HIGH severity
+**Documentation**: See [docs/HIERARCHY_CANON.md](docs/HIERARCHY_CANON.md) for full details.
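A repository boundary check can be illustrated with Node's `path` module. The diff does not show how `HierarchyValidator.checkRepoBoundary()` is implemented, so the sketch below is a guess at the general technique; `isInsideRepo` is a hypothetical name and CommonJS `require` is used only to keep the snippet standalone.

```javascript
const path = require('node:path');

// Hypothetical helper: true when targetPath resolves to a location inside repoRoot.
function isInsideRepo(repoRoot, targetPath) {
  const root = path.resolve(repoRoot);
  // Relative targets are resolved against the repository root.
  const resolved = path.resolve(root, targetPath);
  const relative = path.relative(root, resolved);
  if (relative === '') return true; // the root itself
  // A path that climbs out of the root starts with "..", and a path on a
  // different drive (Windows) stays absolute after path.relative.
  const escapesRoot = relative === '..' || relative.startsWith('..' + path.sep);
  return !escapesRoot && !path.isAbsolute(relative);
}

console.log(isInsideRepo('/work/repo', 'src/index.js'));
console.log(isInsideRepo('/work/repo', '../other-repo/config.json'));
```

Comparing via `path.relative` rather than a string prefix avoids treating `/work/repo-other` as if it were inside `/work/repo`. This sketch does not resolve symlinks, which a real gate check would likely need to consider.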
+### Delegation Protocol
+Boss Claude follows a strict delegation protocol outlined in the [Agent Hierarchy Canon](AGENT-HIERARCHY-CANON.md):
+1. **Rule #0**: Repository Boundary (no cross-repo writes)
+2. **Rule #1**: 10,000 Token Rule (delegate tasks over 10k tokens)
+3. **Rule #2**: Specialist Override (agents volunteer for domain tasks)
+4. **Rule #3**: Pre-Task Hook (automated delegation check)
+5. **Rule #4**: Progressive Review (track delegation efficiency)
+6. **Rule #5**: Canon Amendment (protocol improvement through learning)
+### Hierarchy Gate Checks
+All agent work flows through validation gates:
+- Worker agents create code/config
+- Boss agents review for domain security
+- Meta-boss (Boss Claude) performs final approval
+- All violations logged for quarterly review
+**Integration**: See [docs/HIERARCHY-VALIDATOR-INTEGRATION.md](docs/HIERARCHY-VALIDATOR-INTEGRATION.md)
+## Token Monitor - Delegation Enforcement
+Boss Claude includes a real-time token monitor that **screams "DELEGATION VIOLATION"** when ONYX (Boss Claude) burns more than 100 tokens without delegating to the Task tool.
+### Why 100 Tokens?
+- Simple lookups/queries: 20-80 tokens
+- Comprehensive searches: 200-1000+ tokens
+- 100 tokens is the inflection point where delegation becomes efficient
+### The Scream
+When you exceed the threshold without delegation:
+```
+═════════════════════════════════════════════════════════════════════════════════
+║                                                                                ║
+║                         🚨 DELEGATION VIOLATION 🚨                              ║
+║                                                                                ║
+═════════════════════════════════════════════════════════════════════════════════
+⚠️  ONYX BURNED TOKENS WITHOUT DELEGATION
+────────────────────────────────────────────────────────────────────────────────
+Operation: Complex analysis without delegation
+Tokens Used: 130 (Threshold: 100)
+Excess: +30 tokens
+Severity: ⚠️  LOW
+CANONICAL PROTOCOL VIOLATED:
+  Rule: ONYX must delegate operations >100 tokens to Task tool
+  Why: Task tool provides comprehensive search & analysis
+  Fix: Use Task tool for multi-step operations
+────────────────────────────────────────────────────────────────────────────────
+```
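The numbers in the sample output above (130 tokens used, threshold 100, excess +30) follow from a simple threshold comparison, sketched below. `checkDelegation` and its fields are hypothetical names rather than the `token-monitor` API, and severity is omitted because the README does not say how severity levels are assigned.

```javascript
// Hypothetical threshold check; the 100-token limit comes from the README.
const DELEGATION_THRESHOLD = 100;

function checkDelegation({ operation, tokensUsed, delegated }) {
  // "More than 100 tokens" is read as strictly greater than the threshold.
  const overThreshold = tokensUsed > DELEGATION_THRESHOLD;
  const violation = overThreshold && !delegated;
  return {
    operation,
    tokensUsed,
    threshold: DELEGATION_THRESHOLD,
    excess: violation ? tokensUsed - DELEGATION_THRESHOLD : 0,
    violation,
  };
}

// Mirrors the sample output: 130 tokens spent with no delegation recorded.
const report = checkDelegation({
  operation: 'Complex analysis without delegation',
  tokensUsed: 130,
  delegated: false,
});
console.log(report);
```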
+### Quick Start
+```javascript
+import tokenMonitor from '@cpretzinger/boss-claude/token-monitor';
+// Start monitoring an operation
+const opId = tokenMonitor.startOperation('Search codebase', 250);
+// Record delegation (if using Task tool)
+tokenMonitor.recordDelegation(opId, 'Task');
+// Update tokens as work progresses
+tokenMonitor.addTokens(opId, 100);
+// Complete operation
+tokenMonitor.completeOperation(opId);
+// Display session summary
+tokenMonitor.displaySummary();
+```
+### Demo
+```bash
+npm run demo:token-monitor
+```
+**Documentation**:
+- Full Guide: [docs/TOKEN-MONITOR.md](docs/TOKEN-MONITOR.md)
+- Quick Reference: [docs/QUICK-REFERENCE-TOKEN-MONITOR.md](docs/QUICK-REFERENCE-TOKEN-MONITOR.md)
+### Violation Logging
+All violations are logged to:
+- `~/.boss-claude/delegation-violations.log` (text log)
+- `~/.boss-claude/current-session.json` (session state)
 ## Troubleshooting
 ### "REDIS_URL not found"