@cpretzinger/boss-claude 1.0.0 → 1.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (87)
  1. package/README.md +304 -1
  2. package/bin/boss-claude.js +1138 -0
  3. package/bin/commands/mode.js +250 -0
  4. package/bin/onyx-guard.js +259 -0
  5. package/bin/onyx-guard.sh +251 -0
  6. package/bin/prompts.js +284 -0
  7. package/bin/rollback.js +85 -0
  8. package/bin/setup-wizard.js +492 -0
  9. package/config/.env.example +17 -0
  10. package/lib/README.md +83 -0
  11. package/lib/agent-logger.js +61 -0
  12. package/lib/agents/memory-engineers/github-memory-engineer.js +251 -0
  13. package/lib/agents/memory-engineers/postgres-memory-engineer.js +633 -0
  14. package/lib/agents/memory-engineers/qdrant-memory-engineer.js +358 -0
  15. package/lib/agents/memory-engineers/redis-memory-engineer.js +383 -0
  16. package/lib/agents/memory-supervisor.js +526 -0
  17. package/lib/agents/registry.js +135 -0
  18. package/lib/auto-monitor.js +131 -0
  19. package/lib/checkpoint-hook.js +112 -0
  20. package/lib/checkpoint.js +319 -0
  21. package/lib/commentator.js +213 -0
  22. package/lib/context-scribe.js +120 -0
  23. package/lib/delegation-strategies.js +326 -0
  24. package/lib/hierarchy-validator.js +643 -0
  25. package/lib/index.js +15 -0
  26. package/lib/init-with-mode.js +261 -0
  27. package/lib/init.js +44 -6
  28. package/lib/memory-result-aggregator.js +252 -0
  29. package/lib/memory.js +35 -7
  30. package/lib/mode-enforcer.js +473 -0
  31. package/lib/onyx-banner.js +169 -0
  32. package/lib/onyx-identity.js +214 -0
  33. package/lib/onyx-monitor.js +381 -0
  34. package/lib/onyx-reminder.js +188 -0
  35. package/lib/onyx-tool-interceptor.js +341 -0
  36. package/lib/onyx-wrapper.js +315 -0
  37. package/lib/orchestrator-gate.js +334 -0
  38. package/lib/output-formatter.js +296 -0
  39. package/lib/postgres.js +1 -1
  40. package/lib/prompt-injector.js +220 -0
  41. package/lib/prompts.js +532 -0
  42. package/lib/session.js +153 -6
  43. package/lib/setup/README.md +187 -0
  44. package/lib/setup/env-manager.js +785 -0
  45. package/lib/setup/error-recovery.js +630 -0
  46. package/lib/setup/explain-scopes.js +385 -0
  47. package/lib/setup/github-instructions.js +333 -0
  48. package/lib/setup/github-repo.js +254 -0
  49. package/lib/setup/import-credentials.js +498 -0
  50. package/lib/setup/index.js +62 -0
  51. package/lib/setup/init-postgres.js +785 -0
  52. package/lib/setup/init-redis.js +456 -0
  53. package/lib/setup/integration-test.js +652 -0
  54. package/lib/setup/progress.js +357 -0
  55. package/lib/setup/rollback.js +670 -0
  56. package/lib/setup/rollback.test.js +452 -0
  57. package/lib/setup/setup-with-rollback.example.js +351 -0
  58. package/lib/setup/summary.js +400 -0
  59. package/lib/setup/test-github-setup.js +10 -0
  60. package/lib/setup/test-postgres-init.js +98 -0
  61. package/lib/setup/verify-setup.js +102 -0
  62. package/lib/task-agent-worker.js +235 -0
  63. package/lib/token-monitor.js +466 -0
  64. package/lib/tool-wrapper-integration.js +369 -0
  65. package/lib/tool-wrapper.js +387 -0
  66. package/lib/validators/README.md +497 -0
  67. package/lib/validators/config.js +583 -0
  68. package/lib/validators/config.test.js +175 -0
  69. package/lib/validators/github.js +310 -0
  70. package/lib/validators/github.test.js +61 -0
  71. package/lib/validators/index.js +15 -0
  72. package/lib/validators/postgres.js +525 -0
  73. package/package.json +98 -13
  74. package/scripts/benchmark-memory.js +433 -0
  75. package/scripts/check-secrets.sh +12 -0
  76. package/scripts/fetch-todos.mjs +148 -0
  77. package/scripts/graceful-shutdown.sh +156 -0
  78. package/scripts/install-onyx-hooks.js +373 -0
  79. package/scripts/install.js +119 -18
  80. package/scripts/redis-monitor.js +284 -0
  81. package/scripts/redis-setup.js +412 -0
  82. package/scripts/test-memory-retrieval.js +201 -0
  83. package/scripts/validate-exports.js +68 -0
  84. package/scripts/validate-package.js +120 -0
  85. package/scripts/verify-onyx-deployment.js +309 -0
  86. package/scripts/verify-redis-deployment.js +354 -0
  87. package/scripts/verify-redis-init.js +219 -0
package/README.md CHANGED
@@ -12,6 +12,10 @@ Boss Claude turns every coding session into an RPG-style experience where Claude
12
12
  - 📊 **Career Stats**: Track total sessions, repos managed, token earnings
13
13
  - 🔍 **Semantic Search**: Recall past sessions with natural language queries
14
14
  - 🏆 **Progression System**: Level-based XP, token banking, achievement tracking
15
+ - 👁️ **Agent Watch**: Real-time monitoring of agent activity in companion window
16
+ - 🏛️ **Hierarchy Enforcement**: Canon rules ensure agents work safely within boundaries
17
+ - 🚨 **Token Monitor**: Real-time delegation enforcement - screams when ONYX burns >100 tokens without using Task tool
18
+ - ⏸️ **Checkpoint System**: Pauses ONYX every 5 messages to ask "Did you delegate or burn tokens?"
15
19
 
16
20
  ## Installation
17
21
 
@@ -19,6 +23,21 @@ Boss Claude turns every coding session into an RPG-style experience where Claude
19
23
  npm install -g @cpretzinger/boss-claude
20
24
  ```
21
25
 
26
+ ### What Happens on Install
27
+
28
+ When you run `npm install`, the postinstall script automatically:
29
+
30
+ 1. **Creates `~/.boss-claude/`** - Configuration directory for your credentials
31
+ 2. **Auto-detects credentials** - Imports from Railway CLI, environment, or existing configs
32
+ 3. **Injects ONYX MODE into `~/.claude/CLAUDE.md`** - The conductor rules that make Claude delegate work
33
+
34
+ The CLAUDE.md injection adds the "Conductor" identity to Claude, which enforces:
35
+ - **Forbidden tools**: Read, Write, Edit, Bash, Grep, Glob, NotebookEdit
36
+ - **Allowed tools**: Task (delegation), WebFetch, WebSearch, TodoWrite, Skill
37
+ - **Delegation matrix**: Deterministic routing of requests to appropriate agents
38
+
39
+ This means in ANY repository, Claude automatically becomes ONYX - the conductor who waves the baton but never plays the instruments.
40
+
22
41
  ## Setup
23
42
 
24
43
  ### 1. Configure Credentials
@@ -97,6 +116,49 @@ Saves current session to GitHub Issues with:
97
116
  - Automatic summary if not provided
98
117
  - Optional tags for organization
99
118
  - XP and token rewards
119
+
120
+ #### Watch Agent Activity
121
+ ```bash
122
+ boss-claude watch
123
+ ```
124
+
125
+ Opens a real-time monitor showing all agent activity. Perfect for:
126
+ - Debugging multi-agent workflows
127
+ - Monitoring Task agent execution
128
+ - Tracking automation progress
129
+
130
+ See [Agent Watch Documentation](docs/WATCH-QUICKSTART.md) for integration guide.
131
+
132
+ #### Live Agent Commentary
133
+ ```bash
134
+ boss-claude commentate
135
+ ```
136
+
137
+ Real-time play-by-play of what agents are doing - reads, writes, executions.
138
+
139
+ #### ONYX Checkpoint System
140
+ ```bash
+ # Check delegation efficiency
+ boss-claude checkpoint:status
+
+ # Record delegation decision
+ boss-claude checkpoint:record --delegated --tokens 25000 --specialist "agent-name" --justification "reason"
+
+ # View decision history
+ boss-claude checkpoint:history
+ ```
150
+
151
+ Enforces delegation accountability by pausing every 5 messages to ask: "Did you delegate or burn tokens?"
152
+
153
+ See [Checkpoint Documentation](CHECKPOINT-SYSTEM.md) for complete guide.
154
+
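The "every 5 messages" cadence described above can be pictured as a simple counter. This is a minimal sketch under stated assumptions: `CheckpointCounter` and `recordMessage` are hypothetical names for illustration, not identifiers from the package, and only the interval of 5 comes from the README.

```javascript
// Hypothetical counter; class and method names are illustrative only.
class CheckpointCounter {
  constructor(interval = 5) {
    this.interval = interval; // README states a checkpoint every 5 messages
    this.count = 0;
  }

  // Returns true when the message just recorded lands on a checkpoint.
  recordMessage() {
    this.count += 1;
    return this.count % this.interval === 0;
  }
}

// Usage: collect the message numbers at which a checkpoint would fire.
const counter = new CheckpointCounter();
const due = [];
for (let i = 1; i <= 12; i++) {
  if (counter.recordMessage()) due.push(i);
}
console.log(due); // [ 5, 10 ]
```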
155
+ #### Run Tests
156
+ ```bash
157
+ npm test
158
+ ```
159
+
160
+ Validates all modules load and CLI commands work (10 tests).
161
+
100
162
  - Searchable history
101
163
 
102
164
  #### Recall Past Sessions
@@ -177,16 +239,41 @@ Redis Keys
177
239
  - ...and so on (100 XP per level)
178
240
 
179
241
  ### Rewards
180
- - **XP**: 50 XP per session saved
242
+ - **Base XP**: 50 XP per session saved
243
+ - **Efficiency Bonus**: Up to +100 XP based on delegation efficiency
244
+ - **Delegation Bonus**: +2 XP per delegation (up to +20)
181
245
  - **Token Bank**: All tokens used during session are banked
182
246
  - **Net Worth**: Token bank × $0.000003 per token
183
247
 
248
+ ### Efficiency Multiplier System 🎯
249
+
250
+ The efficiency bonus rewards you for being a true conductor - delegating work to agents instead of doing it yourself.
251
+
252
+ **Formula**: `agent_tokens / onyx_tokens = efficiency_ratio`
253
+
254
+ **Example**:
255
+ - ONYX used 20,000 tokens (orchestration overhead)
256
+ - Agents used 600,000 tokens (actual work)
257
+ - Efficiency: 600,000 / 20,000 = **30x**
258
+ - Bonus XP: **+30** (capped at 100)
259
+
260
+ The status display shows your current efficiency:
261
+ ```
+ ⚡ EFFICIENCY TRACKER (XP Multiplier)
+ 🎺 ONYX Tokens: 20,000 (orchestration)
+ 🎻 Agent Tokens: 600,000 (work done)
+ 📈 Efficiency Ratio: 30.0x
+ 🎯 Delegations: 15
+ 💎 Projected Bonus XP: +30 (efficiency) +20 (delegation)
+ ```
269
+
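The reward rules above can be sketched as a small function. This is an illustration, not the package's code: `projectSessionXp` and its field names are hypothetical, and rounding the ratio down to a whole number is an assumption. Only the constants (50 base XP, efficiency bonus capped at 100, +2 XP per delegation up to +20) come from the README.

```javascript
// Hypothetical helper; names and the rounding rule are assumptions.
const BASE_XP = 50;               // per session saved
const MAX_EFFICIENCY_BONUS = 100; // efficiency bonus cap
const XP_PER_DELEGATION = 2;
const MAX_DELEGATION_BONUS = 20;

function projectSessionXp({ onyxTokens, agentTokens, delegations }) {
  // Guard against division by zero before ONYX has used any tokens.
  const ratio = onyxTokens > 0 ? agentTokens / onyxTokens : 0;
  const efficiencyBonus = Math.min(MAX_EFFICIENCY_BONUS, Math.floor(ratio));
  const delegationBonus = Math.min(
    MAX_DELEGATION_BONUS,
    delegations * XP_PER_DELEGATION
  );
  return {
    ratio,
    efficiencyBonus,
    delegationBonus,
    total: BASE_XP + efficiencyBonus + delegationBonus,
  };
}

// The README's worked example: 20,000 ONYX tokens, 600,000 agent tokens.
const result = projectSessionXp({
  onyxTokens: 20000,
  agentTokens: 600000,
  delegations: 15,
});
console.log(result.ratio, result.efficiencyBonus, result.delegationBonus);
// 30 30 20
```

With 15 delegations the raw delegation bonus would be 30, so the +20 cap applies, which matches the projected bonus in the status display.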
184
270
  ### Stats Tracked
185
271
  - Total sessions across all repos
186
272
  - Repositories managed
187
273
  - Token bank size
188
274
  - Current level and XP progress
189
275
  - Per-repo session counts
276
+ - Efficiency ratio per session
190
277
 
191
278
  ## CLI Reference
192
279
 
@@ -203,10 +290,60 @@ boss-claude save [summary] [--tags <tags>]
203
290
  # Search past sessions
204
291
  boss-claude recall <query> [--limit <number>]
205
292
 
293
+ # Run integration tests
294
+ boss-claude test
295
+
206
296
  # Show help
207
297
  boss-claude --help
208
298
  ```
209
299
 
300
+ ## Testing
301
+
302
+ Boss Claude includes a comprehensive integration test suite that validates:
303
+ - Redis connectivity and operations
304
+ - PostgreSQL database and schema
305
+ - GitHub API integration
306
+ - Full system end-to-end workflow
307
+
308
+ ```bash
309
+ boss-claude test
310
+ ```
311
+
312
+ The test suite runs in 3-5 seconds and validates all system components without affecting production data. See [TESTING.md](TESTING.md) for full documentation.
313
+
314
+ ## Benchmarking
315
+
316
+ ### Memory System Performance
317
+
318
+ Boss Claude includes a comprehensive benchmark to compare the old GitHub-based memory system with the new MemorySupervisor architecture:
319
+
320
+ ```bash
+ # Quick benchmark (3 runs per query)
+ npm run benchmark:memory
+
+ # Verbose output with detailed per-test stats
+ npm run benchmark:memory:verbose
+
+ # Extended benchmark (10 runs per query)
+ npm run benchmark:memory:extended
+ ```
330
+
331
+ **Key Metrics:**
332
+ - **Response Time**: Old system (2-120s) vs New system (<5s cache miss, <1s cache hit)
333
+ - **Cache Hit Rate**: Redis caching provides 30-50x speedup on repeated queries
334
+ - **Startup Impact**: 50%+ faster Claude initialization with token savings
335
+ - **Memory Usage**: Node.js heap tracking per operation
336
+
337
+ **Architecture Comparison:**
338
+
339
+ | System | Approach | Avg Response | Caching |
+ |--------|----------|--------------|---------|
+ | OLD | Direct GitHub API | 3-5s | None |
+ | NEW (Cache Miss) | 4 Parallel Engineers | <5s | Redis (5min TTL) |
+ | NEW (Cache Hit) | Redis Only | <200ms | 30-50x speedup |
344
+
345
+ The benchmark measures real-world performance across 8 varied queries and outputs comprehensive JSON results. See [docs/BENCHMARK-MEMORY.md](docs/BENCHMARK-MEMORY.md) for detailed documentation.
346
+
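The cache-miss and cache-hit rows in the table above follow the usual cache-aside pattern. The sketch below is illustrative only: an in-memory `Map` stands in for Redis, the four "engineers" are stub async functions named after the memory-engineer files in this release, and `recall` is a hypothetical function name rather than the MemorySupervisor API.

```javascript
// Illustrative stand-ins: a Map instead of Redis, stubs instead of real engineers.
const TTL_MS = 5 * 60 * 1000; // 5-minute TTL, as listed in the table
const cache = new Map();

const engineers = {
  redis: async (q) => `redis:${q}`,
  postgres: async (q) => `postgres:${q}`,
  qdrant: async (q) => `qdrant:${q}`,
  github: async (q) => `github:${q}`,
};

async function recall(query, now = Date.now()) {
  // Cache hit: return the stored results while the entry is still fresh.
  const hit = cache.get(query);
  if (hit && hit.expiresAt > now) {
    return { source: 'cache', results: hit.results };
  }

  // Cache miss: query all engineers in parallel and keep whichever succeed.
  const settled = await Promise.allSettled(
    Object.values(engineers).map((fn) => fn(query))
  );
  const results = settled
    .filter((s) => s.status === 'fulfilled')
    .map((s) => s.value);

  cache.set(query, { results, expiresAt: now + TTL_MS });
  return { source: 'engineers', results };
}
```

`Promise.allSettled` is used here so that one failing backend does not discard results from the others; whether the package makes the same choice is not stated in the README.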
210
347
  ## Environment Variables
211
348
 
212
349
  | Variable | Required | Default | Description |
@@ -216,6 +353,172 @@ boss-claude --help
216
353
  | `GITHUB_OWNER` | No | `cpretzinger` | GitHub username |
217
354
  | `GITHUB_MEMORY_REPO` | No | `boss-claude-memory` | Repository name for memory storage |
218
355
 
356
+ ## Agent Hierarchy and Canon Rules
357
+
358
+ Boss Claude implements a multi-tier agent hierarchy system with enforced canon rules to ensure safe, efficient operation across all repositories.
359
+
360
+ ### The Conductor Model
361
+
362
+ ONYX operates as **THE CONDUCTOR** - directing but never playing:
363
+
364
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                     🎼 ONYX (Conductor)                     │
+ │                                                             │
+ │  ❌ FORBIDDEN: Read, Write, Edit, Bash, Grep, Glob          │
+ │  ✅ ALLOWED: Task, WebFetch, WebSearch, TodoWrite, Skill    │
+ │                                                             │
+ │  "The conductor never plays an instrument.                  │
+ │   I wave the baton. My musicians make the music."           │
+ └────────────────────────┬────────────────────────────────────┘
+                          │ Task Tool (Delegation)
+     ┌────────────────────┼────────────────────┐
+     ▼                    ▼                    ▼
+ ┌─────────┐         ┌──────────┐         ┌─────────┐
+ │ Explore │         │ general- │         │  Bash   │
+ │  Agent  │         │ purpose  │         │  Agent  │
+ │         │         │  Agent   │         │         │
+ │ Search  │         │  Build   │         │ Execute │
+ │ Read    │         │  Fix     │         │ Test    │
+ │ Analyze │         │  Create  │         │ Deploy  │
+ └─────────┘         └──────────┘         └─────────┘
+ ```
386
+
387
+ ### Delegation Matrix
388
+
389
+ ONYX uses deterministic delegation based on user request keywords:
390
+
391
+ | User Request | Agent Type | Task Prompt Example |
+ |--------------|------------|---------------------|
+ | "find/search/where is" | `Explore` | "Search codebase for..." |
+ | "read/show/what's in" | `Explore` | "Read and summarize..." |
+ | "build/create/implement" | `general-purpose` | "Implement X feature..." |
+ | "fix/debug/error" | `general-purpose` | "Debug and fix..." |
+ | "run/execute/npm/git" | `Bash` | "Execute command..." |
+ | "test/verify" | `Bash` | "Run tests and verify..." |
+ | "plan/design/architect" | `Plan` | "Design approach for..." |
+ | Multiple files | Parallel agents | Split into separate tasks |
401
+
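The keyword table above could be expressed as a lookup like the following. This is a sketch, not the package's router: `routeRequest` is a hypothetical name, and the first-match-wins ordering, whole-word matching, and `null` result for unmatched requests are all assumptions. The "Multiple files" row is not modelled.

```javascript
// Hypothetical router; keyword lists mirror the table, ordering is an assumption.
const ROUTES = [
  { agent: 'Plan', keywords: ['plan', 'design', 'architect'] },
  { agent: 'Explore', keywords: ['find', 'search', 'where is', 'read', 'show', "what's in"] },
  { agent: 'general-purpose', keywords: ['build', 'create', 'implement', 'fix', 'debug', 'error'] },
  { agent: 'Bash', keywords: ['run', 'execute', 'npm', 'git', 'test', 'verify'] },
];

function routeRequest(request) {
  const text = request.toLowerCase();
  for (const { agent, keywords } of ROUTES) {
    // Match keywords as whole words so "run" does not fire on "prune".
    if (keywords.some((kw) => new RegExp(`\\b${kw}\\b`).test(text))) {
      return agent;
    }
  }
  return null; // no keyword matched; the README does not define a fallback
}

console.log(routeRequest('Where is the auth middleware?')); // Explore
console.log(routeRequest('Fix the login error'));           // general-purpose
console.log(routeRequest('run npm test'));                  // Bash
```

Because several keywords can appear in one request, the order of `ROUTES` decides ties; a real implementation would need an explicit priority rule.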
402
+ ### What Sub-Agents See
403
+
404
+ When ONYX delegates via Task tool, the spawned agent receives:
405
+ 1. **The task prompt** - Clear instructions on what to do
406
+ 2. **Access to the codebase** - Full Read/Write/Edit capabilities
407
+ 3. **No ONYX restrictions** - Agents CAN use forbidden tools
408
+ 4. **Repo boundary rules** - Still enforced per canon
409
+
410
+ Sub-agents do NOT automatically see:
411
+ - CLAUDE.md conductor rules (only ONYX has these)
412
+ - Previous conversation history (unless in task prompt)
413
+ - Other agents' work (unless coordinated)
414
+
415
+ ### Repository Boundary Rule (CANON)
416
+
417
+ **Core Rule**: Agents ONLY write in current repository. NEVER write to other repos.
418
+
419
+ This fundamental boundary rule prevents cross-repository contamination and ensures agents maintain clear operational boundaries. All file write operations are validated through the `HierarchyValidator.checkRepoBoundary()` gate check.
420
+
421
+ **Features**:
422
+ - Automated validation of all file write operations
423
+ - Blocks writes outside current repository
424
+ - Requires explicit justification for overrides
425
+ - All violations logged as HIGH severity
426
+
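A repository boundary check of the kind described above usually reduces to a path-containment test. The sketch below is not the `HierarchyValidator.checkRepoBoundary()` implementation, which is not shown in this section; `isInsideRepo` is a hypothetical helper, and it uses CommonJS `require` so it runs as a plain Node.js script.

```javascript
const path = require('node:path');

// Hypothetical containment check, not the package's validator.
function isInsideRepo(repoRoot, targetPath) {
  const root = path.resolve(repoRoot);
  // Relative targets are resolved against the repo root, absolute ones as-is.
  const target = path.resolve(root, targetPath);
  const rel = path.relative(root, target);

  // Inside the repo when the relative path never climbs above the root.
  return (
    rel === '' ||
    (rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel))
  );
}

console.log(isInsideRepo('/work/repo', 'src/index.js'));          // true
console.log(isInsideRepo('/work/repo', '../other/file.js'));      // false
console.log(isInsideRepo('/work/repo', '/work/repo-other/x.js')); // false
```

Comparing via `path.relative` avoids the common prefix bug where `/work/repo-other` is mistaken for a child of `/work/repo`. This sketch does not resolve symlinks, which a stricter check would need to handle.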
427
+ **Documentation**: See [docs/HIERARCHY_CANON.md](docs/HIERARCHY_CANON.md) for full details.
428
+
429
+ ### Delegation Protocol
430
+
431
+ Boss Claude follows a strict delegation protocol outlined in the [Agent Hierarchy Canon](AGENT-HIERARCHY-CANON.md):
432
+
433
+ 1. **Rule #0**: Repository Boundary (no cross-repo writes)
434
+ 2. **Rule #1**: 10,000 Token Rule (delegate tasks over 10k tokens)
435
+ 3. **Rule #2**: Specialist Override (agents volunteer for domain tasks)
436
+ 4. **Rule #3**: Pre-Task Hook (automated delegation check)
437
+ 5. **Rule #4**: Progressive Review (track delegation efficiency)
438
+ 6. **Rule #5**: Canon Amendment (protocol improvement through learning)
439
+
440
+ ### Hierarchy Gate Checks
441
+
442
+ All agent work flows through validation gates:
443
+ - Worker agents create code/config
444
+ - Boss agents review for domain security
445
+ - Meta-boss (Boss Claude) performs final approval
446
+ - All violations logged for quarterly review
447
+
448
+ **Integration**: See [docs/HIERARCHY-VALIDATOR-INTEGRATION.md](docs/HIERARCHY-VALIDATOR-INTEGRATION.md)
449
+
450
+ ## Token Monitor - Delegation Enforcement
451
+
452
+ Boss Claude includes a real-time token monitor that **screams "DELEGATION VIOLATION"** when ONYX (Boss Claude) burns more than 100 tokens without delegating to the Task tool.
453
+
454
+ ### Why 100 Tokens?
455
+
456
+ - Simple lookups/queries: 20-80 tokens
457
+ - Comprehensive searches: 200-1000+ tokens
458
+ - 100 tokens is the inflection point where delegation becomes efficient
459
+
460
+ ### The Scream
461
+
462
+ When you exceed the threshold without delegation:
463
+
464
+ ```
+ ═════════════════════════════════════════════════════════════════════════════════
+ ║ ║
+ ║ 🚨 DELEGATION VIOLATION 🚨 ║
+ ║ ║
+ ═════════════════════════════════════════════════════════════════════════════════
+
+ ⚠️ ONYX BURNED TOKENS WITHOUT DELEGATION
+ ────────────────────────────────────────────────────────────────────────────────
+ Operation: Complex analysis without delegation
+ Tokens Used: 130 (Threshold: 100)
+ Excess: +30 tokens
+ Severity: ⚠️ LOW
+
+ CANONICAL PROTOCOL VIOLATED:
+ Rule: ONYX must delegate operations >100 tokens to Task tool
+ Why: Task tool provides comprehensive search & analysis
+ Fix: Use Task tool for multi-step operations
+ ────────────────────────────────────────────────────────────────────────────────
+ ```
484
+
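The rule behind the report above is a threshold comparison on undelegated work. The following is a minimal sketch, assuming a hypothetical `checkOperation` function and result shape; it is not the token monitor's API, and it leaves out severity levels because the README shows only the LOW case.

```javascript
// Hypothetical threshold check; function and field names are illustrative.
const THRESHOLD = 100; // tokens ONYX may use before delegation is expected

function checkOperation({ name, tokens, delegated }) {
  const excess = Math.max(0, tokens - THRESHOLD);
  // A violation needs both conditions: over the threshold and not delegated.
  const violation = !delegated && excess > 0;
  return { name, tokens, threshold: THRESHOLD, excess, violation };
}

// The numbers from the sample report: 130 tokens used, no delegation.
const report = checkOperation({
  name: 'Complex analysis without delegation',
  tokens: 130,
  delegated: false,
});
console.log(report.excess, report.violation); // 30 true
```

An operation at exactly 100 tokens is not flagged here, following the ">100" wording of the rule.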
485
+ ### Quick Start
486
+
487
+ ```javascript
+ import tokenMonitor from '@cpretzinger/boss-claude/token-monitor';
+
+ // Start monitoring an operation
+ const opId = tokenMonitor.startOperation('Search codebase', 250);
+
+ // Record delegation (if using Task tool)
+ tokenMonitor.recordDelegation(opId, 'Task');
+
+ // Update tokens as work progresses
+ tokenMonitor.addTokens(opId, 100);
+
+ // Complete operation
+ tokenMonitor.completeOperation(opId);
+
+ // Display session summary
+ tokenMonitor.displaySummary();
+ ```
505
+
506
+ ### Demo
507
+
508
+ ```bash
509
+ npm run demo:token-monitor
510
+ ```
511
+
512
+ **Documentation**:
513
+ - Full Guide: [docs/TOKEN-MONITOR.md](docs/TOKEN-MONITOR.md)
514
+ - Quick Reference: [docs/QUICK-REFERENCE-TOKEN-MONITOR.md](docs/QUICK-REFERENCE-TOKEN-MONITOR.md)
515
+
516
+ ### Violation Logging
517
+
518
+ All violations are logged to:
519
+ - `~/.boss-claude/delegation-violations.log` (text log)
520
+ - `~/.boss-claude/current-session.json` (session state)
521
+
219
522
  ## Troubleshooting
220
523
 
221
524
  ### "REDIS_URL not found"