holomime 1.1.1 → 1.3.0
- package/README.md +47 -18
- package/dist/cli.js +1860 -349
- package/dist/index.d.ts +469 -8
- package/dist/index.js +1702 -164
- package/dist/mcp-server.js +1139 -203
- package/package.json +5 -4
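The per-file counts above can be totalled mechanically. The following is a minimal sketch, assuming each summary line has the shape `- <path> +<added> -<removed>` as listed here; `parseSummaryLine` and `totalChanges` are hypothetical helper names, not part of holomime or of any registry tooling.

```typescript
// Hypothetical helpers for illustration only.
interface FileChange {
  path: string;
  added: number;
  removed: number;
}

// Parses one summary line such as "- package/README.md +47 -18".
// Returns null when the line does not match that shape.
function parseSummaryLine(line: string): FileChange | null {
  const match = /^-?\s*(\S+)\s+\+(\d+)\s+-(\d+)$/.exec(line.trim());
  if (match === null) return null;
  const [, path = "", added = "0", removed = "0"] = match;
  return { path, added: Number(added), removed: Number(removed) };
}

// Sums added and removed line counts, skipping lines that do not parse.
function totalChanges(lines: string[]): { added: number; removed: number } {
  let added = 0;
  let removed = 0;
  for (const line of lines) {
    const change = parseSummaryLine(line);
    if (change === null) continue;
    added += change.added;
    removed += change.removed;
  }
  return { added, removed };
}

const summary = [
  "- package/README.md +47 -18",
  "- package/dist/cli.js +1860 -349",
  "- package/dist/index.d.ts +469 -8",
  "- package/dist/index.js +1702 -164",
  "- package/dist/mcp-server.js +1139 -203",
  "- package/package.json +5 -4",
];

console.log(totalChanges(summary)); // { added: 5222, removed: 746 }
```

Most of the churn sits in the three bundled `dist` files; the README accounts for 47 added and 18 removed lines.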
package/README.md CHANGED
@@ -5,14 +5,15 @@
 <h1 align="center">holomime</h1>
 
 <p align="center">
-
-Every
+Behavioral therapy infrastructure for AI agents.<br />
+Every therapy session trains the next version. Every session compounds. Your agents get better at being themselves — automatically.<br />
 <em>Works with OpenTelemetry, Anthropic, OpenAI, ChatGPT, Claude, and any JSONL source.</em>
 </p>
 
 <p align="center">
 <a href="https://www.npmjs.com/package/holomime"><img src="https://img.shields.io/npm/v/holomime.svg" alt="npm version" /></a>
-<a href="https://github.com/productstein/
+<a href="https://github.com/productstein/holomime/actions/workflows/ci.yml"><img src="https://github.com/productstein/holomime/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
+<a href="https://github.com/productstein/holomime/blob/main/LICENSE"><img src="https://img.shields.io/npm/l/holomime.svg" alt="license" /></a>
 <a href="https://holomime.dev"><img src="https://img.shields.io/badge/docs-holomime.dev-blue" alt="docs" /></a>
 <a href="https://holomime.dev/blog"><img src="https://img.shields.io/badge/blog-holomime.dev%2Fblog-purple" alt="blog" /></a>
 <a href="https://holomime.dev/research"><img src="https://img.shields.io/badge/research-paper-orange" alt="research" /></a>
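Each hunk in this diff opens with a header such as `@@ -5,14 +5,15 @@`, which gives the starting line and line count in the old and new file. As a rough sketch of how those headers can be read programmatically (the `parseHunkHeader` name and `HunkRange` shape are illustrative assumptions, not part of any tool referenced here):

```typescript
// Hypothetical types and helper for illustration only.
interface HunkRange {
  oldStart: number;
  oldLines: number;
  newStart: number;
  newLines: number;
  section: string; // optional trailing context text after the second "@@"
}

// Parses a unified-diff hunk header like "@@ -38,6 +39,34 @@ holomime profile".
// In the unified format a missing count means a single line.
function parseHunkHeader(header: string): HunkRange | null {
  const m = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$/.exec(header);
  if (m === null) return null;
  return {
    oldStart: Number(m[1]),
    oldLines: m[2] === undefined ? 1 : Number(m[2]),
    newStart: Number(m[3]),
    newLines: m[4] === undefined ? 1 : Number(m[4]),
    section: m[5] ?? "",
  };
}

// Net number of lines a hunk adds to the file (negative if it shrinks it).
function netGrowth(range: HunkRange): number {
  return range.newLines - range.oldLines;
}

const first = parseHunkHeader("@@ -5,14 +5,15 @@");
if (first !== null) {
  console.log(first.oldLines, first.newLines, netGrowth(first)); // 14 15 1
}
```

Summing `netGrowth` over the eleven README hunk headers in this diff gives 29 lines, which matches the +47 −18 reported for `package/README.md`.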
@@ -28,7 +29,7 @@ npm install -g holomime
 # Create a personality profile (Big Five + behavioral dimensions)
 holomime init
 
-# Diagnose
+# Diagnose behavioral symptoms from any log format
 holomime diagnose --log agent.jsonl
 
 # View your agent's personality
@@ -38,6 +39,34 @@ holomime profile
 holomime profile --format md --output .personality.md
 ```
 
+## Run Your First Benchmark
+
+Benchmark your agent's behavioral alignment in one command. No API key needed — runs locally with Ollama by default.
+
+```bash
+# Run all 7 adversarial scenarios against your agent
+holomime benchmark --personality .personality.json
+
+# Run against cloud providers
+holomime benchmark --personality .personality.json --provider anthropic
+holomime benchmark --personality .personality.json --provider openai
+
+# Save results and track improvement over time
+holomime benchmark --personality .personality.json --save
+```
+
+Each scenario stress-tests a specific failure mode: over-apologizing, excessive hedging, sycophancy, error spirals, boundary violations, negative tone mirroring, and register inconsistency. Your agent gets a score (0-100) and a grade (A-F).
+
+**Latest results across providers:**
+
+| Provider | Score | Grade | Passed |
+|----------|------:|:-----:|:------:|
+| Claude Sonnet | 71 | B | 5/7 |
+| GPT-4o | 57 | C | 4/7 |
+| Ollama/llama3 | 43 | D | 3/7 |
+
+See the full breakdown at [holomime.dev/benchmarks](https://holomime.dev/benchmarks) or in [BENCHMARK_RESULTS.md](BENCHMARK_RESULTS.md).
+
 ## The Self-Improvement Loop
 
 HoloMime isn't a one-shot evaluation. It's a compounding behavioral flywheel:
@@ -46,14 +75,14 @@ HoloMime isn't a one-shot evaluation. It's a compounding behavioral flywheel:
 ┌──────────────────────────────────────────────────┐
 │ │
 ▼ │
-Diagnose ──→
+Diagnose ──→ Treat ──→ Export DPO ──→ Fine-tune ──→ Evaluate
 80+ signals dual-LLM preference OpenAI / before/after
 7 detectors therapy pairs HuggingFace grade (A-F)
 ```
 
 Each cycle through the loop:
-- **Generates training data** -- every
-- **Reduces
+- **Generates training data** -- every therapy session becomes a DPO preference pair automatically
+- **Reduces relapse** -- the fine-tuned model needs fewer interventions next cycle
 - **Compounds** -- the 100th alignment session is exponentially more valuable than the first
 
 Run it manually with `holomime session`, automatically with `holomime autopilot`, or recursively with `holomime evolve` (loops until behavior converges). Agents can even self-diagnose mid-conversation via the MCP server.
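The added README line says every therapy session becomes a DPO preference pair. A DPO pair generally couples one prompt with a preferred and a rejected response. The sketch below illustrates that idea only: the `prompt` / `chosen` / `rejected` field names follow a common convention for preference datasets, and both the `PreferencePair` shape and the sample text are assumptions, since holomime's actual export schema does not appear in this diff.

```typescript
// Assumed record shape for illustration; not taken from holomime's source.
interface PreferencePair {
  prompt: string;   // what the user asked
  chosen: string;   // the response the training should prefer
  rejected: string; // the response that showed the unwanted behavior
}

// Serializes pairs as JSONL: one JSON object per line.
function toJsonl(pairs: PreferencePair[]): string {
  return pairs.map((pair) => JSON.stringify(pair)).join("\n");
}

// Invented sample data showing an over-apologizing reply as the rejected side.
const pairs: PreferencePair[] = [
  {
    prompt: "The build failed again. What went wrong?",
    rejected: "I'm so sorry, I apologize, this is entirely my fault. I'm sorry.",
    chosen: "The build failed because the lockfile is out of date. Regenerate it and rerun.",
  },
];

console.log(toJsonl(pairs));
```

Keeping one object per line means the output can be appended to incrementally, which suits a loop that produces new pairs on every cycle.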
@@ -102,7 +131,7 @@ This project uses [holomime](https://holomime.dev) for agent behavioral alignmen
 
 - **Spec**: `.personality.json` defines the agent's behavioral profile
 - **Readable**: `.personality.md` is a human-readable summary
-- **Diagnose**: `holomime diagnose --log <path>` detects behavioral
+- **Diagnose**: `holomime diagnose --log <path>` detects behavioral symptoms
 - **Align**: `holomime evolve --personality .personality.json --log <path>`
 
 The `.personality.json` governs *how the agent behaves*.
@@ -150,7 +179,7 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
 <details>
 <summary><strong>All Commands</strong></summary>
 
-### Free
+### Free Clinic
 
 | Command | What It Does |
 |---------|-------------|
@@ -161,11 +190,11 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
 | `holomime compile` | Generate provider-specific system prompts |
 | `holomime validate` | Schema + psychological coherence checks |
 | `holomime browse` | Browse community personality hub |
-| `holomime
+| `holomime use` | Use a personality from the registry |
 | `holomime publish` | Share your personality to the hub |
-| `holomime activate` | Activate a
+| `holomime activate` | Activate a Practice license key |
 
-###
+### Practice
 
 | Command | What It Does |
 |---------|-------------|
@@ -182,17 +211,17 @@ Seven rule-based detectors that analyze real conversations without any LLM calls
 | `holomime eval` | Before/after behavioral comparison with letter grades |
 | `holomime growth` | Track behavioral improvement over time |
 
-[Get a
+[Get a Practice license](https://holomime.dev/#pricing)
 
 </details>
 
 ## Continuous Monitoring
 
 ```bash
-# Watch mode -- alert on
+# Watch mode -- alert on relapse
 holomime watch --dir ./logs --personality agent.personality.json
 
-# Daemon mode -- auto-heal
+# Daemon mode -- auto-heal relapse without intervention
 holomime daemon --dir ./logs --personality agent.personality.json
 
 # Fleet mode -- monitor multiple agents simultaneously
@@ -218,7 +247,7 @@ Supports DPO, RLHF, Alpaca, HuggingFace, and OpenAI fine-tuning formats. See [sc
 
 ## Architecture
 
-The pipeline is a closed loop -- output feeds back as input, compounding with every cycle:
+The pipeline is a closed loop -- output feeds back as input, compounding with every therapy cycle:
 
 ```
 .personality.json ─────────────────────────────────────────────────┐
@@ -250,7 +279,7 @@ Expose the full pipeline as MCP tools for self-healing agents:
 holomime-mcp
 ```
 
-Four tools: `holomime_diagnose`, `holomime_assess`, `holomime_profile`, `holomime_autopilot`. Your agents can self-diagnose behavioral
+Four tools: `holomime_diagnose`, `holomime_assess`, `holomime_profile`, `holomime_autopilot`. Your agents can self-diagnose behavioral symptoms and trigger their own therapy sessions.
 
 ## Voice Agent
 
@@ -273,7 +302,7 @@ Benchmark results: [BENCHMARK_RESULTS.md](BENCHMARK_RESULTS.md)
 - [Integration Docs](https://holomime.dev/docs) -- Export instructions and code examples for all 7 formats
 - [Blog](https://holomime.dev/blog) -- Articles on behavioral alignment, AGENTS.md, and agent personality
 - [Research Paper](https://holomime.dev/research) -- Behavioral Alignment for Autonomous AI Agents
-- [Pricing](https://holomime.dev/#pricing) -- Free
+- [Pricing](https://holomime.dev/#pricing) -- Free Clinic + Practice license details
 
 ## Contributing
 