adversarial-mirror 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Stephen Marullo
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,397 @@
1
+ ![A-MIRROR](./header.svg)
2
+
3
+ <div align="center">
4
+
5
+ **Terminal-first adversarial AI middleware. Classifies each prompt, routes open-ended questions<br>to two models in parallel, and synthesizes a verdict — all in real time.**
6
+
7
+ ![Node](https://img.shields.io/badge/node-%3E%3D20-brightgreen?style=flat-square)
8
+ ![License](https://img.shields.io/badge/license-MIT-blue?style=flat-square)
9
+ ![Tests](https://img.shields.io/badge/tests-122%20passing-brightgreen?style=flat-square)
10
+
11
+ </div>
12
+
13
+ ---
14
+
15
+ ## The problem it solves
16
+
17
+ Every AI gives you the answer it thinks you want to hear. Adversarial Mirror sits between you and your models and routes the right prompts to a second model whose only job is to challenge the first one — not to be contrary, but to surface the assumptions you didn't know you were making, the risks you didn't ask about, and the alternatives you didn't consider.
18
+
19
+ Not every prompt benefits from adversarial pressure. Factual lookups, math, and code have correct answers — mirroring them wastes tokens and adds noise. Adversarial Mirror classifies each prompt first and only applies the challenger where it actually helps.
20
+
21
+ ```
22
+ Your prompt
23
+
24
+
25
+ ┌──────────────────────────────────────────────────────┐
26
+ │ Intent Classifier │
27
+ │ │
28
+ │ factual · math · code · conversational │
29
+ │ └─► direct mode (original only, no challenger) │
30
+ │ │
31
+ │ opinion · analysis · strategy · prediction │
32
+ │ └─► mirror mode (original + challenger) │
33
+ └──────────────────────────────────────────────────────┘
34
+
35
+ │ mirror mode
36
+
37
+ ┌───────────────────────┐
38
+ │ Original + Challenger│ both stream in parallel
39
+ └───────────────────────┘
40
+
41
+
42
+ ┌───────┐
43
+ │ Judge │ agreement score · synthesis · blind spot
44
+ └───────┘ on by default — skip with --no-judge
45
+ ```
46
+
47
+ The output is a persistent terminal session. Each completed exchange stamps itself permanently into the scrollback. Only the live streaming panels update while models are running.
48
+
49
+ ---
50
+
51
+ ## Install
52
+
53
+ ```bash
54
+ npm install -g adversarial-mirror
55
+ ```
56
+
57
+ Or build from source:
58
+
59
+ ```bash
60
+ git clone https://github.com/ArtificialStephen/adversarial-mirror
61
+ cd adversarial-mirror
62
+ npm install && npm run build
63
+ npm link
64
+ ```
65
+
66
+ Run the one-time setup wizard:
67
+
68
+ ```bash
69
+ mirror config init
70
+ ```
71
+
72
+ This walks you through API keys, default brains, intensity, judge, and persona settings. Keys are persisted to environment variables (`setx` on Windows, shell profile export on Unix).
73
+
74
+ ---
75
+
76
+ ## Quick start
77
+
78
+ ```bash
79
+ # Interactive session (default when no subcommand is given)
80
+ mirror
81
+
82
+ # One-shot query — streams and exits
83
+ mirror mirror "Should I rewrite this in Rust?"
84
+
85
+ # Increase adversarial pressure
86
+ mirror --intensity aggressive
87
+
88
+ # Apply a professional lens to the challenger
89
+ mirror --persona security-auditor
90
+
91
+ # Disable the judge synthesis pass
92
+ mirror --no-judge
93
+
94
+ # Load a file as context before the session starts
95
+ mirror chat --file ./architecture.md
96
+
97
+ # One-shot with file context
98
+ mirror mirror --file ./spec.md "What are the risks?"
99
+
100
+ # Pipe anything in
101
+ cat proposal.md | mirror mirror "Challenge every assumption"
102
+
103
+ # Override which brains are used for this run
104
+ mirror --original claude-sonnet-4-6 --challenger o3-mini --judge-brain claude-opus-4-6
105
+ ```
106
+
107
+ ---
108
+
109
+ ## How it works
110
+
111
+ ### Intent classification
112
+
113
+ Before routing, the engine classifies the prompt to decide whether adversarial pressure will actually add value.
114
+
115
+ | Category | Routed to | Example |
116
+ |---|---|---|
117
+ | `factual_lookup` | Direct | "What year was Redis released?" |
118
+ | `math_computation` | Direct | "What is 17% of 4200?" |
119
+ | `code_task` | Direct | "Write a binary search in Go" |
120
+ | `conversational` | Direct | "Thanks, that helps" |
121
+ | `opinion_advice` | Mirror | "Should I use microservices?" |
122
+ | `analysis` | Mirror | "What are the risks of this approach?" |
123
+ | `prediction` | Mirror | "Will this scale to 10M users?" |
124
+ | `interpretation` | Mirror | "What does this contract clause mean?" |
125
+
126
+ Classification uses a small fast model (Claude Haiku by default) with a configurable confidence threshold. Prompts below threshold default to mirroring. Disable classification entirely with `--no-classify` to always mirror.
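The routing rule described above can be sketched as a small pure function. This is an illustration only, not the package's actual code: the category names come from the table, while the `Classification` shape, the `routePrompt` name, and the `0.7` default threshold are assumptions.

```typescript
// Category names are taken from the table; everything else here is hypothetical.
type Category =
  | "factual_lookup"
  | "math_computation"
  | "code_task"
  | "conversational"
  | "opinion_advice"
  | "analysis"
  | "prediction"
  | "interpretation";

type Route = "direct" | "mirror";

interface Classification {
  category: Category;
  confidence: number; // assumed to be in the range 0..1
}

// One entry per category, so adding a category without a route fails to compile.
const ROUTE_BY_CATEGORY: Record<Category, Route> = {
  factual_lookup: "direct",
  math_computation: "direct",
  code_task: "direct",
  conversational: "direct",
  opinion_advice: "mirror",
  analysis: "mirror",
  prediction: "mirror",
  interpretation: "mirror",
};

// Decide the route for one prompt. `threshold` stands in for the configurable
// confidence threshold; 0.7 is an illustrative default, not a documented one.
function routePrompt(
  result: Classification | null,
  options: { classify: boolean; threshold?: number }
): Route {
  // --no-classify (or no classifier result): always mirror.
  if (!options.classify || result === null) return "mirror";
  const threshold = options.threshold ?? 0.7;
  // Low-confidence classifications default to mirroring.
  if (result.confidence < threshold) return "mirror";
  return ROUTE_BY_CATEGORY[result.category];
}
```

Defaulting to `"mirror"` on low confidence means a misclassified open-ended question costs extra tokens rather than a missed challenge.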
127
+
128
+ ---
129
+
130
+ ### Intensity levels
131
+
132
+ Controls how aggressively the challenger pushes back.
133
+
134
+ | Level | Style | Structure |
135
+ |---|---|---|
136
+ | `mild` | Gentle critic | Full answer + 1-2 real gaps + steelman alternative |
137
+ | `moderate` | Devil's advocate | Reframe / challenge the frame / hidden costs / strongest counterposition / verdict |
138
+ | `aggressive` | Full adversarial | Buried assumption / strongest refutation / failure cases / expert dissent / honest synthesis |
139
+
140
+ All levels enforce one rule: **every point must have a specific mechanism. Vague doubt is useless.**
141
+
142
+ ```bash
143
+ mirror --intensity mild # quick sanity check
144
+ mirror --intensity moderate # default — the sweet spot
145
+ mirror --intensity aggressive # when the stakes are high
146
+ ```
147
+
148
+ ---
149
+
150
+ ### Persona lenses
151
+
152
+ Personas give the challenger a professional frame of reference. Instead of generic adversarialism, you get a domain expert's specific skepticism applied to your prompt. Personas stack on top of whatever intensity level is set.
153
+
154
+ | Persona | Lens | Focus areas |
155
+ |---|---|---|
156
+ | `vc-skeptic` | Investor | Market size assumptions, unit economics, moat, defensibility |
157
+ | `security-auditor` | Security & risk | Attack surfaces, trust boundaries, failure modes, blast radius |
158
+ | `end-user` | Real user | Actual behavior vs stated intent, adoption friction, miscomprehension |
159
+ | `regulator` | Compliance & legal | Regulatory exposure, liability, stakeholder harm, unintended consequences |
160
+ | `contrarian` | Pure opposition | Historical failures, second-order effects, inverted premises, consensus traps |
161
+
162
+ Personas and intensities are independent — you get 15 distinct challenger modes in total:
163
+
164
+ ```bash
165
+ mirror --persona vc-skeptic --intensity aggressive
166
+ mirror --persona security-auditor # defaults to moderate
167
+ mirror --persona regulator chat --file ./terms.md
168
+ ```
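As a quick check on the "15 modes" figure, the persona and intensity names from the two tables can be crossed in a few lines. The type and function names below are hypothetical, and this is not a claim about how the package represents modes internally.

```typescript
// Names copied from the persona and intensity tables.
const PERSONAS = [
  "vc-skeptic",
  "security-auditor",
  "end-user",
  "regulator",
  "contrarian",
] as const;
const INTENSITIES = ["mild", "moderate", "aggressive"] as const;

type Persona = (typeof PERSONAS)[number];
type Intensity = (typeof INTENSITIES)[number];

// Hypothetical shape for one challenger mode.
interface ChallengerMode {
  persona: Persona;
  intensity: Intensity;
}

// Every persona paired with every intensity: 5 x 3 combinations.
function challengerModes(): ChallengerMode[] {
  const modes: ChallengerMode[] = [];
  for (const persona of PERSONAS) {
    for (const intensity of INTENSITIES) {
      modes.push({ persona, intensity });
    }
  }
  return modes;
}

// When only a persona is given, intensity falls back to the documented default.
function resolveMode(persona: Persona, intensity?: Intensity): ChallengerMode {
  return { persona, intensity: intensity ?? "moderate" };
}
```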
169
+
170
+ Set a default so you never have to type it:
171
+
172
+ ```bash
173
+ mirror config set session.defaultPersona vc-skeptic
174
+ ```
175
+
176
+ ---
177
+
178
+ ### Judge synthesis
179
+
180
+ After both models complete, a third model synthesizes their responses and produces a structured verdict:
181
+
182
+ ```
183
+ AGREEMENT: 34%
184
+ Both models agree on the technical approach but diverge sharply on timeline and risk.
185
+
186
+ SYNTHESIS
187
+ The monolith wins short-term. The challenger's concern about coupling is real but premature
188
+ at your current scale. Revisit at 50k DAU. The original underestimates the ops cost of
189
+ distributed tracing — budget 2 sprints for observability before you ship anything.
190
+
191
+ BLIND SPOT
192
+ Neither model addressed the team's existing expertise. The "right" architecture is the one
193
+ your engineers can actually debug at 3am.
194
+ ```
195
+
196
+ The agreement score gives you a quick read on how contested the territory is:
197
+
198
+ | Score | Meaning |
199
+ |---|---|
200
+ | 90-100% | Substantively identical — both models see the same thing |
201
+ | 70-89% | Same core answer, meaningful differences in emphasis or caveats |
202
+ | 50-69% | Partial overlap — worth reading both carefully |
203
+ | 30-49% | Different conclusions from shared premises |
204
+ | 0-29% | Fundamentally opposed — the question is genuinely hard |
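For scripts that post-process a verdict, the score bands in the table translate directly into a lookup. The sketch below assumes the verdict text contains a header of the form `AGREEMENT: 34%`, as in the sample output; `parseAgreement` and `describeAgreement` are hypothetical helper names, not part of the CLI.

```typescript
// Pull the percentage out of a verdict header such as "AGREEMENT: 34%".
// Returns null when no header is found or the number is out of range.
function parseAgreement(verdict: string): number | null {
  const match = /AGREEMENT:\s*(\d{1,3})%/.exec(verdict);
  if (match === null) return null;
  const score = Number(match[1]);
  return score <= 100 ? score : null;
}

// Map a 0-100 score to the band descriptions from the table above.
function describeAgreement(score: number): string {
  if (score >= 90) return "Substantively identical";
  if (score >= 70) return "Same core answer, different emphasis or caveats";
  if (score >= 50) return "Partial overlap";
  if (score >= 30) return "Different conclusions from shared premises";
  return "Fundamentally opposed";
}
```

With the sample verdict above, `parseAgreement` yields `34`, which falls in the "Different conclusions from shared premises" band.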
205
+
206
+ ```bash
207
+ mirror --no-judge # skip synthesis, go faster
208
+ mirror --judge-brain claude-opus-4-6 # use a heavier model for synthesis
209
+ mirror config set session.judgeBrainId o3-mini
210
+ ```
211
+
212
+ ---
213
+
214
+ ### File and pipe input
215
+
216
+ Feed any document as context before a question.
217
+
218
+ ```bash
219
+ # Interactive session with a file preloaded as context
220
+ mirror chat --file ./notes.md
221
+ mirror chat --file ./architecture.md
222
+
223
+ # One-shot with file context
224
+ mirror mirror --file ./contract.md "What clauses expose us to liability?"
225
+ mirror mirror --file ./codebase-summary.md "Where are the security risks?"
226
+
227
+ # Pipe from stdin
228
+ cat ./spec.md | mirror mirror "What are the weakest assumptions here?"
229
+ git diff HEAD~1 | mirror mirror "Review this diff for correctness and safety"
230
+ curl -s https://api.example.com/openapi.json | mirror mirror "What could go wrong with this API design?"
231
+ ```
232
+
233
+ ---
234
+
235
+ ## Commands
236
+
237
+ ```
238
+ mirror Open interactive chat (default)
239
+ mirror chat Interactive multi-turn session
240
+ mirror chat --file <path> Preload a file as conversation context
241
+
242
+ mirror mirror "<question>" One-shot query, stream and exit
243
+ mirror mirror --file <path> "<question>" One-shot with file context
244
+
245
+ mirror config init Interactive setup wizard
246
+ mirror config show Print current config as JSON
247
+ mirror config set <key> <value> Set a config value by dot-path
248
+
249
+ mirror brains list List configured brains
250
+ mirror brains test <id> Ping a brain to verify connection
251
+ mirror brains add Add a new brain interactively
252
+
253
+ mirror history list List saved sessions
254
+ mirror history show <id> Print a saved session as JSON
255
+ mirror history export <id> <file> Export a session to a file
256
+ ```
257
+
258
+ ---
259
+
260
+ ## Global flags
261
+
262
+ All flags apply to any command and can be freely combined:
263
+
264
+ ```
265
+ --intensity mild|moderate|aggressive Adversarial pressure level (default: moderate)
266
+ --original <brain-id> Override the original brain
267
+ --challenger <brain-id> Override the challenger brain
268
+ --judge-brain <brain-id> Override the judge brain
269
+ --persona <name> Apply a professional lens to the challenger
270
+ --no-mirror Disable mirroring — original brain only
271
+ --no-classify Skip classification, always mirror
272
+ --no-judge Skip the judge synthesis pass
273
+ --debug Print debug info to stderr
274
+ ```
275
+
276
+ ---
277
+
278
+ ## Configuration
279
+
280
+ Config is stored at:
281
+ - **macOS / Linux:** `~/.config/adversarial-mirror/config.json`
282
+ - **Windows:** `%APPDATA%\adversarial-mirror\config.json`
283
+
284
+ `mirror config init` sets everything up interactively. Set individual values with:
285
+
286
+ ```bash
287
+ mirror config set session.defaultIntensity aggressive
288
+ mirror config set session.defaultPersona security-auditor
289
+ mirror config set session.originalBrainId claude-sonnet-4-6
290
+ mirror config set session.challengerBrainId o3-mini
291
+ mirror config set session.judgeEnabled false
292
+ mirror config set session.judgeBrainId claude-opus-4-6
293
+ mirror config set ui.showTokenCounts true
294
+ mirror config set ui.showLatency true
295
+ mirror config set ui.syntaxHighlighting true
296
+ ```
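To make the dot-path idea concrete, here is one way a `config set <key> <value>` call could be applied to a nested JSON object. This is a sketch under assumptions: the `ConfigObject` type, the `setByPath` helper, and the string-to-boolean/number coercion are illustrative and may differ from what the CLI actually does.

```typescript
type ConfigValue = string | number | boolean | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

// CLI arguments arrive as strings; turn "true"/"false" and numerals into
// typed values (assumed behaviour, chosen so `judgeEnabled false` is a boolean).
function coerce(raw: string): string | number | boolean {
  if (raw === "true") return true;
  if (raw === "false") return false;
  if (raw.trim() !== "" && !isNaN(Number(raw))) return Number(raw);
  return raw;
}

// Return a copy of `config` with the value at `path` replaced.
// Intermediate objects are created as needed; the input is not mutated.
function setByPath(config: ConfigObject, path: string, raw: string): ConfigObject {
  const keys = path.split(".");
  const last = keys.pop();
  if (last === undefined || last === "") {
    throw new Error("config path must not be empty");
  }
  const copy: ConfigObject = { ...config };
  let cursor = copy;
  for (const key of keys) {
    const next = cursor[key];
    // Copy existing nested objects, replace anything else with a fresh object.
    const child: ConfigObject = typeof next === "object" ? { ...next } : {};
    cursor[key] = child;
    cursor = child;
  }
  cursor[last] = coerce(raw);
  return copy;
}
```

For example, `setByPath(config, "session.judgeEnabled", "false")` would produce a config whose `session.judgeEnabled` is the boolean `false` rather than the string `"false"`.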
297
+
298
+ View the full config at any time:
299
+
300
+ ```bash
301
+ mirror config show
302
+ ```
303
+
304
+ ---
305
+
306
+ ## Supported providers
307
+
308
+ Mix and match any provider for original, challenger, and judge independently.
309
+
310
+ | Provider | Example models | Env var |
311
+ |---|---|---|
312
+ | **Anthropic** | `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-haiku-4-5-20251001` | `ANTHROPIC_API_KEY` |
313
+ | **OpenAI** | `gpt-4o`, `o3-mini`, `o3` | `OPENAI_API_KEY` |
314
+ | **Google** | `gemini-2.5-pro`, `gemini-1.5-pro` | `GOOGLE_API_KEY` |
315
+
316
+ Add a brain with `mirror brains add` or add it directly to the config:
317
+
318
+ ```json
319
+ {
320
+ "id": "my-o3",
321
+ "provider": "openai",
322
+ "model": "o3",
323
+ "apiKeyEnvVar": "OPENAI_API_KEY"
324
+ }
325
+ ```
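When editing the config by hand, it can help to sanity-check a brain entry before running a session. The sketch below mirrors the four fields of the JSON example; the `BrainConfig` type, the `checkBrain` helper, and the `"anthropic"` and `"google"` provider strings are assumptions (only `"openai"` appears in the example above).

```typescript
// "openai" is from the example; the other two provider ids are guesses.
type Provider = "anthropic" | "openai" | "google";

interface BrainConfig {
  id: string;
  provider: Provider;
  model: string;
  apiKeyEnvVar: string;
}

// Return a list of problems with a brain entry; an empty list means it looks usable.
// `env` is passed in explicitly (e.g. process.env) so the check stays easy to test.
function checkBrain(
  brain: BrainConfig,
  env: Record<string, string | undefined>
): string[] {
  const problems: string[] = [];
  if (brain.id.trim() === "") problems.push("id must not be empty");
  if (brain.model.trim() === "") problems.push("model must not be empty");
  const key = env[brain.apiKeyEnvVar];
  if (key === undefined || key === "") {
    problems.push(`environment variable ${brain.apiKeyEnvVar} is not set`);
  }
  return problems;
}
```

This only checks the entry's shape and that the key variable is present; `mirror brains test <id>` remains the way to verify the connection itself.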
326
+
327
+ A few combinations worth trying:
328
+
329
+ ```bash
330
+ # Heavyweight — best quality, slower
331
+ mirror --original claude-opus-4-6 --challenger o3 --judge-brain claude-opus-4-6
332
+
333
+ # Fast and lean — no judge pass
334
+ mirror --original claude-sonnet-4-6 --challenger o3-mini --no-judge
335
+
336
+ # Cross-company sanity check
337
+ mirror --original claude-sonnet-4-6 --challenger gemini-2.5-pro
338
+ ```
339
+
340
+ ---
341
+
342
+ ## Terminal UI
343
+
344
+ The interface is built with [Ink](https://github.com/vadimdemedes/ink) (React for terminals).
345
+
346
+ - **Completed exchanges** are stamped permanently to the scrollback via Ink's `Static` — they never flicker or redraw, even while the next query is streaming
347
+ - **Live panels** update as tokens arrive, batched at 60ms
348
+ - **Side-by-side layout** activates automatically at terminal widths >= 80 columns
349
+ - **Synthesis panel** appears below both brain panels with a yellow border when the judge is running
350
+ - **Syntax highlighting** in code blocks
351
+ - **Agreement score** shown in the synthesis panel header and status bar
352
+ - Token counts and latency are shown when enabled with `mirror config set ui.showTokenCounts true` and `mirror config set ui.showLatency true`
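Two of the behaviours listed above, the 80-column layout switch and 60ms token batching, are easy to illustrate outside of Ink. The sketch below is not the package's implementation: `chooseLayout`, `TokenBatcher`, and the `"stacked"` name for the narrow layout are assumptions, and the batcher takes timestamps as arguments instead of using real timers so it can be tested deterministically.

```typescript
type Layout = "side-by-side" | "stacked";

// Side-by-side panels at 80 columns or wider, otherwise one above the other.
function chooseLayout(columns: number): Layout {
  return columns >= 80 ? "side-by-side" : "stacked";
}

// Collects streamed tokens and releases them at most once per interval,
// so the UI re-renders per batch instead of per token.
class TokenBatcher {
  private pending = "";
  private lastFlush: number;
  private readonly intervalMs: number;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastFlush = startMs;
  }

  // Add a token. Returns the batched text once the interval has elapsed,
  // or null while the batch is still filling.
  push(token: string, nowMs: number): string | null {
    this.pending += token;
    if (nowMs - this.lastFlush < this.intervalMs) return null;
    return this.flush(nowMs);
  }

  // Release whatever is pending (call this when the stream ends).
  flush(nowMs: number): string {
    const out = this.pending;
    this.pending = "";
    this.lastFlush = nowMs;
    return out;
  }
}
```

A caller would construct `new TokenBatcher(60, Date.now())`, feed each streamed token through `push`, and render only when it returns a non-null string.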
353
+
354
+ ```
355
+ Ctrl+C while idle → exit
356
+ Ctrl+C while streaming → cancel current request
357
+ ```
358
+
359
+ ---
360
+
361
+ ## Development
362
+
363
+ ```bash
364
+ git clone https://github.com/ArtificialStephen/adversarial-mirror
365
+ cd adversarial-mirror
366
+ npm install
367
+
368
+ npm run build # compile to dist/
369
+ npm run dev # watch mode
370
+ npm test # 122 tests across 9 suites
371
+ npm run test:coverage # coverage report
372
+ npm run test:watch # vitest watch mode
373
+ ```
374
+
375
+ Run without real API keys using the built-in mock adapter:
376
+
377
+ ```bash
378
+ MOCK_BRAINS=true node dist/cli.js chat
379
+ MOCK_BRAINS=true node dist/cli.js --persona vc-skeptic mirror "my startup idea"
380
+ MOCK_BRAINS=true node dist/cli.js mirror --file README.md "What are the risks?"
381
+ echo "test input" | MOCK_BRAINS=true node dist/cli.js mirror "analyze this"
382
+ MOCK_BRAINS=true node dist/cli.js --no-judge --intensity aggressive chat
383
+ ```
384
+
385
+ Build standalone binaries (no Node.js required to run):
386
+
387
+ ```bash
388
+ npm run build && npm run package
389
+ # outputs to dist/pkg/ for:
390
+ # win-x64 · linux-x64 · linux-arm64 · macos-x64 · macos-arm64
391
+ ```
392
+
393
+ ---
394
+
395
+ ## License
396
+
397
+ MIT