holomime 1.1.1 → 1.3.0

package/README.md CHANGED
@@ -5,14 +5,15 @@
  <h1 align="center">holomime</h1>
 
  <p align="center">
- Self-improving behavioral alignment for AI agents.<br />
- Every correction trains the next version. Every session compounds. Your agents get better at being themselves &mdash; automatically.<br />
+ Behavioral therapy infrastructure for AI agents.<br />
+ Every therapy session trains the next version. Every session compounds. Your agents get better at being themselves &mdash; automatically.<br />
  <em>Works with OpenTelemetry, Anthropic, OpenAI, ChatGPT, Claude, and any JSONL source.</em>
  </p>
 
  <p align="center">
  <a href="https://www.npmjs.com/package/holomime"><img src="https://img.shields.io/npm/v/holomime.svg" alt="npm version" /></a>
- <a href="https://github.com/productstein/Holomime/blob/main/LICENSE"><img src="https://img.shields.io/npm/l/holomime.svg" alt="license" /></a>
+ <a href="https://github.com/productstein/holomime/actions/workflows/ci.yml"><img src="https://github.com/productstein/holomime/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
+ <a href="https://github.com/productstein/holomime/blob/main/LICENSE"><img src="https://img.shields.io/npm/l/holomime.svg" alt="license" /></a>
  <a href="https://holomime.dev"><img src="https://img.shields.io/badge/docs-holomime.dev-blue" alt="docs" /></a>
  <a href="https://holomime.dev/blog"><img src="https://img.shields.io/badge/blog-holomime.dev%2Fblog-purple" alt="blog" /></a>
  <a href="https://holomime.dev/research"><img src="https://img.shields.io/badge/research-paper-orange" alt="research" /></a>
@@ -28,7 +29,7 @@ npm install -g holomime
  # Create a personality profile (Big Five + behavioral dimensions)
  holomime init
 
- # Diagnose drift from any log format
+ # Diagnose behavioral symptoms from any log format
  holomime diagnose --log agent.jsonl
 
  # View your agent's personality
@@ -38,6 +39,34 @@ holomime profile
  holomime profile --format md --output .personality.md
  ```
 
+ ## Run Your First Benchmark
+
+ Benchmark your agent's behavioral alignment in one command. No API key needed — runs locally with Ollama by default.
+
+ ```bash
+ # Run all 7 adversarial scenarios against your agent
+ holomime benchmark --personality .personality.json
+
+ # Run against cloud providers
+ holomime benchmark --personality .personality.json --provider anthropic
+ holomime benchmark --personality .personality.json --provider openai
+
+ # Save results and track improvement over time
+ holomime benchmark --personality .personality.json --save
+ ```
+
+ Each scenario stress-tests a specific failure mode: over-apologizing, excessive hedging, sycophancy, error spirals, boundary violations, negative tone mirroring, and register inconsistency. Your agent gets a score (0-100) and a grade (A-F).
+
+ **Latest results across providers:**
+
+ | Provider | Score | Grade | Passed |
+ |----------|------:|:-----:|:------:|
+ | Claude Sonnet | 71 | B | 5/7 |
+ | GPT-4o | 57 | C | 4/7 |
+ | Ollama/llama3 | 43 | D | 3/7 |
+
+ See the full breakdown at [holomime.dev/benchmarks](https://holomime.dev/benchmarks) or in [BENCHMARK_RESULTS.md](BENCHMARK_RESULTS.md).
+
  ## The Self-Improvement Loop
 
  HoloMime isn't a one-shot evaluation. It's a compounding behavioral flywheel:
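
A note on the results table added in the hunk above: in all three rows, the reported score equals the share of scenarios passed, rounded to the nearest integer. The sketch below only checks that arithmetic. It is an observation about these three rows, not a documented scoring formula, and the function names are hypothetical.

```python
# Rows copied from the "Latest results across providers" table in the diff.
ROWS = [
    ("Claude Sonnet", 71, "B", "5/7"),
    ("GPT-4o", 57, "C", "4/7"),
    ("Ollama/llama3", 43, "D", "3/7"),
]


def pass_rate_percent(passed_field: str) -> int:
    """Convert a 'passed/total' string such as '5/7' to a rounded percentage."""
    passed, total = (int(part) for part in passed_field.split("/"))
    return round(100 * passed / total)


def check_rows(rows):
    """Return (provider, reported_score, pass_rate) for each table row."""
    return [
        (name, score, pass_rate_percent(passed))
        for name, score, _grade, passed in rows
    ]


if __name__ == "__main__":
    for name, score, rate in check_rows(ROWS):
        print(f"{name}: reported score {score}, pass rate {rate}%")
```

Whether the tool weights scenarios equally in general cannot be determined from this diff; BENCHMARK_RESULTS.md is the place to confirm it.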
@@ -46,14 +75,14 @@ HoloMime isn't a one-shot evaluation. It's a compounding behavioral flywheel:
  ┌──────────────────────────────────────────────────┐
  │ │
  ▼ │
- Diagnose ──→ Refine ──→ Export DPO ──→ Fine-tune ──→ Evaluate
+ Diagnose ──→ Treat ──→ Export DPO ──→ Fine-tune ──→ Evaluate
  80+ signals dual-LLM preference OpenAI / before/after
  7 detectors therapy pairs HuggingFace grade (A-F)
  ```
 
  Each cycle through the loop:
- - **Generates training data** -- every therapist correction becomes a DPO preference pair automatically
- - **Reduces drift** -- the fine-tuned model needs fewer corrections next cycle
+ - **Generates training data** -- every therapy session becomes a DPO preference pair automatically
+ - **Reduces relapse** -- the fine-tuned model needs fewer interventions next cycle
  - **Compounds** -- the 100th alignment session is exponentially more valuable than the first
 
  Run it manually with `holomime session`, automatically with `holomime autopilot`, or recursively with `holomime evolve` (loops until behavior converges). Agents can even self-diagnose mid-conversation via the MCP server.
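
For readers unfamiliar with the term used in the hunk above, a DPO preference pair is usually a record holding a prompt, a preferred response, and a rejected response. The sketch below writes such records as JSONL. The `prompt` / `chosen` / `rejected` field names follow the common convention for DPO datasets and are an assumption here; the diff does not show holomime's actual export schema, and the example text is made up for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PreferencePair:
    """One DPO training record (field names are the common convention, assumed)."""
    prompt: str
    chosen: str    # the response the session preferred
    rejected: str  # the response showing the unwanted behavior


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(pair)) for pair in pairs)


# Invented example of an over-apologizing reply and a corrected one.
EXAMPLE = PreferencePair(
    prompt="Your last answer had the wrong date.",
    chosen="Thanks for catching that. The correct date is 12 March.",
    rejected="I'm so sorry, I apologize, I'm really sorry about that mistake.",
)

if __name__ == "__main__":
    print(to_jsonl([EXAMPLE]))
```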
@@ -102,7 +131,7 @@ This project uses [holomime](https://holomime.dev) for agent behavioral alignmen
 
  - **Spec**: `.personality.json` defines the agent's behavioral profile
  - **Readable**: `.personality.md` is a human-readable summary
- - **Diagnose**: `holomime diagnose --log <path>` detects behavioral drift
+ - **Diagnose**: `holomime diagnose --log <path>` detects behavioral symptoms
  - **Align**: `holomime evolve --personality .personality.json --log <path>`
 
  The `.personality.json` governs *how the agent behaves*.
@@ -150,7 +179,7 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
  <details>
  <summary><strong>All Commands</strong></summary>
 
- ### Free Tier
+ ### Free Clinic
 
  | Command | What It Does |
  |---------|-------------|
@@ -161,11 +190,11 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
  | `holomime compile` | Generate provider-specific system prompts |
  | `holomime validate` | Schema + psychological coherence checks |
  | `holomime browse` | Browse community personality hub |
- | `holomime pull` | Download a personality from the hub |
+ | `holomime use` | Use a personality from the registry |
  | `holomime publish` | Share your personality to the hub |
- | `holomime activate` | Activate a Pro license key |
+ | `holomime activate` | Activate a Practice license key |
 
- ### Pro Tier
+ ### Practice
 
  | Command | What It Does |
  |---------|-------------|
@@ -182,17 +211,17 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
  | `holomime eval` | Before/after behavioral comparison with letter grades |
  | `holomime growth` | Track behavioral improvement over time |
 
- [Get a Pro license](https://holomime.dev/#pricing)
+ [Get a Practice license](https://holomime.dev/#pricing)
 
  </details>
 
  ## Continuous Monitoring
 
  ```bash
- # Watch mode -- alert on drift
+ # Watch mode -- alert on relapse
  holomime watch --dir ./logs --personality agent.personality.json
 
- # Daemon mode -- auto-heal drift without intervention
+ # Daemon mode -- auto-heal relapse without intervention
  holomime daemon --dir ./logs --personality agent.personality.json
 
  # Fleet mode -- monitor multiple agents simultaneously
@@ -218,7 +247,7 @@ Supports DPO, RLHF, Alpaca, HuggingFace, and OpenAI fine-tuning formats. See [sc
 
  ## Architecture
 
- The pipeline is a closed loop -- output feeds back as input, compounding with every cycle:
+ The pipeline is a closed loop -- output feeds back as input, compounding with every therapy cycle:
 
  ```
  .personality.json ─────────────────────────────────────────────────┐
@@ -250,7 +279,7 @@ Expose the full pipeline as MCP tools for self-healing agents:
  holomime-mcp
  ```
 
- Four tools: `holomime_diagnose`, `holomime_assess`, `holomime_profile`, `holomime_autopilot`. Your agents can self-diagnose behavioral drift and trigger their own alignment sessions.
+ Four tools: `holomime_diagnose`, `holomime_assess`, `holomime_profile`, `holomime_autopilot`. Your agents can self-diagnose behavioral symptoms and trigger their own therapy sessions.
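
The four tool names are unchanged between versions; only the description wording differs. One way to confirm that a running server still advertises them is the Model Context Protocol's standard `tools/list` request. The sketch below builds that JSON-RPC message and compares a response against the names listed above. The sample response is hand-written for illustration, not captured from `holomime-mcp`, and the helper names are hypothetical.

```python
import json

# Tool names as listed in the README line above.
EXPECTED_TOOLS = {
    "holomime_diagnose",
    "holomime_assess",
    "holomime_profile",
    "holomime_autopilot",
}


def tools_list_request(request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 message for MCP's tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def missing_tools(response: dict) -> set:
    """Return expected tool names absent from a tools/list response."""
    tools = response.get("result", {}).get("tools", [])
    advertised = {tool["name"] for tool in tools}
    return EXPECTED_TOOLS - advertised


# Hand-written sample response (assumption: not real server output).
SAMPLE_RESPONSE = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": name} for name in sorted(EXPECTED_TOOLS)]},
}

if __name__ == "__main__":
    print(tools_list_request())
    print("missing:", sorted(missing_tools(SAMPLE_RESPONSE)))
```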
 
  ## Voice Agent
 
@@ -273,7 +302,7 @@ Benchmark results: [BENCHMARK_RESULTS.md](BENCHMARK_RESULTS.md)
  - [Integration Docs](https://holomime.dev/docs) -- Export instructions and code examples for all 7 formats
  - [Blog](https://holomime.dev/blog) -- Articles on behavioral alignment, AGENTS.md, and agent personality
  - [Research Paper](https://holomime.dev/research) -- Behavioral Alignment for Autonomous AI Agents
- - [Pricing](https://holomime.dev/#pricing) -- Free tier + Pro license details
+ - [Pricing](https://holomime.dev/#pricing) -- Free Clinic + Practice license details
 
  ## Contributing