dopple-ai 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Aayush Mathur

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,412 @@
# dopple

Understand your users before you build. Dopple connects to your product data, creates psychometrically grounded synthetic users, and tells you what to build next — all from the terminal.

Every prediction gets smarter. Dopple tracks what it predicted, what actually happened, and uses that history to improve the next prediction. The more you use it, the more accurate it gets.

```bash
dopple insights --product "my SaaS" --posthog-key phx_... --pretty

# 5 Insights:
#
# 1. Power users export heavily but you have no export customization
#    Confidence: ████████░░ 80%   Impact: █████████░ 90%
#    → Add CSV column selection and scheduled exports
#
# 2. 28% of users churn within 7 days — they never finish onboarding
#    Confidence: ███████░░░ 70%   Impact: ████████░░ 80%
#    → Simplify onboarding step 3 or add a guided walkthrough
```

## Use Cases

**"What should I build next?"**
Connect PostHog → Dopple discovers your user segments → generates targeted questions from your data → cross-references persona responses with real behavior → surfaces insights you didn't already know.

**"Will users pay for this?"**
Generate a panel of synthetic users grounded in your Stripe + PostHog data. Ask them about pricing, features, positioning. Every answer cites specific data.

**"How will users react to this change?"**
Run a focus group. Inject a stimulus mid-discussion ("competitor just launched a free tier"). Watch personas debate, change their minds, disagree. Get a summary of themes, agreements, and opinion shifts.

**"Does our design work?"**
Have personas review your Figma designs, landing pages, email copy, or pricing pages. Each persona reacts from their specific context — their usage patterns, their frustrations, their personality.

**"Does our product match our positioning?"**
Calibrate personas against your policy docs, brand guidelines, or real survey results. Structured calibration with real math (MAE, Pearson correlation), not LLM vibes.

**"Who are my users, really?"**
Build a knowledge graph from your data. Ask it natural language questions. "Why are users churning?" — get answers grounded in real behavioral patterns.

## Install

```bash
bun add dopple-ai
# or
npm install dopple-ai
```

## Quick Start

### Get insights (the main command)

```bash
dopple insights --product "my SaaS" --csv users.csv --pretty
dopple insights --product "my SaaS" --posthog-key phx_... --graph --pretty
dopple insights --product "my SaaS" --posthog-key phx_... --agent  # JSON for AI agents
```

Predictions are automatically traced. When real outcomes arrive, record them — the next predictions improve:

```bash
# Record what actually happened
dopple record-outcome \
  --product "my SaaS" \
  --trace-id abc123 \
  --actual "22% churned (predicted 28%)" \
  --mae 0.06 --direction-correct

# View prediction history and accuracy
dopple traces --product "my SaaS" --pretty
```

### As a library

```typescript
import { Dopple } from "dopple-ai";

const dopple = new Dopple({
  model: "anthropic/claude-sonnet-4-20250514",
  adapters: [{ type: "posthog", apiKey: "phx_..." }],
});

// Get actionable insights — automatically calibrated from past predictions
const report = await dopple.runInsights({ product: "my SaaS app" });
console.log(report.insights);

// Generate personas from data, documents, or both
const personas = await dopple.generate({
  product: "budget tracking app for freelancers",
  count: 5,
  save: "budget-panel",
});

// Every response includes citations and reasoning
const response = await personas[0].ask("Would you pay $15/mo for this?");
console.log(response.response);   // "Honestly, $15 feels steep..."
console.log(response.reasoning);  // "High price sensitivity (neuroticism=0.72)"
console.log(response.citations);  // [{ source: "trait-model", detail: "...", weight: 0.8 }]

// Record real outcomes to improve future predictions
await dopple.recordOutcome(
  "my SaaS app", traceId, "22% churned", "stripe",
  { mae: 0.06, directionCorrect: true, notes: "Over-estimated by 6pp" }
);
```

### Persona generation (three modes)

```typescript
// From data — clusters real users into segments
const dataPersonas = await dopple.generate({
  product: "my SaaS", // + PostHog/Stripe adapters connected
});

// From documents — extracts stakeholder types
const docPersonas = await dopple.generate({
  product: "Singapore flood risk policy", // + PDF adapter: --doc policy.pdf
});

// Combined — segments from data, enriched with document knowledge
const combinedPersonas = await dopple.generate({
  product: "my SaaS", // + PostHog + --doc product-roadmap.pdf
});
```

Each persona gets a rich narrative backstory, pain points, goals, and relationship to the product — not bullet-point demographics.

### Focus groups

```typescript
const group = dopple.createFocusGroup("pricing", {
  topic: "Should we raise prices from $10 to $15/mo?",
  product: "project management SaaS",
});

group.addPersonas(personas);
group.setPersonaContext("Sarah", ["You cancelled your subscription last month"]);
group.setPersonaContext("Mike", ["You upgraded to pro 2 weeks ago"]);

const round1 = await group.discuss(llm);
group.inject("A competitor just launched a free tier");
const round2 = await group.discuss(llm);

const summary = await group.summarize(llm);
// { themes, agreements, disagreements, insights, opinionShifts }
```

### Design and messaging review

```typescript
import { reviewContent } from "dopple-ai";

// Personas review with full context — their usage, frustrations, personality
const review = await reviewContent(llm, personas, {
  type: "pricing_page",
  content: "https://myapp.com/pricing",
  description: "New pricing page with Pro tier at $15/mo",
  questions: ["Is the pricing clear?", "Would you upgrade?"],
});

console.log(review.consensus.wouldActRate); // 0.4 — only 40% would upgrade
console.log(review.consensus.weaknesses);   // ["Pro tier value not clear"]

// Figma design review
import { reviewFigmaDesign } from "dopple-ai";

const figmaReview = await reviewFigmaDesign(llm, personas, {
  accessToken: "fig_...",
  fileKey: "abc123",
  nodeId: "42:0",
});
```

### Surveys

```typescript
const survey = dopple.createSurvey("pricing-research")
  .freeText("pain", "What's the hardest part about managing expenses?")
  .multipleChoice("switch", "Would you switch to a cheaper alternative?", [
    "Yes, immediately", "Maybe, if same features", "No, I'm loyal",
  ])
  .likert5("satisfaction", "How satisfied are you with your current tool?")
  .numerical("budget", "Max monthly price?", 0, 100);

const results = await dopple.runSurvey(survey, personas);
```

### Calibration

```typescript
// Qualitative — against policy docs, brand guidelines
const report = await dopple.calibrate(personas, [
  { type: "policy", name: "Return Policy", content: policyText },
]);

// Structured — real math against real survey data
const structuredReport = dopple.calibrateStructured(realSurveyData, syntheticResults);
console.log(structuredReport.overallMAE);  // 0.073 (7.3pp average error)
console.log(structuredReport.correlation); // 0.847
```

### Knowledge graph

```typescript
const graph = await dopple.buildGraph();
const answer = await graph.ask(llm, "Why are users churning?");
// { answer: "...", evidence: [...], confidence: 0.82 }
```

## CLI

18 commands. JSON output by default (for AI agents). `--pretty` for humans. `--agent` for strict envelope format.

```bash
# Insights (the main command — predictions auto-traced)
dopple insights --product "my SaaS" --posthog-key phx_... --pretty

# Personas (from data, documents, or both)
dopple generate --product "budget app" --count 5 --save my-panel --pretty
dopple generate --product "flood risk" --doc policy.pdf --pretty
dopple generate --product "my SaaS" --posthog-key phx_... --doc roadmap.pdf --pretty
dopple ask "Would you pay for dark mode?" --persona my-panel --pretty

# Research
dopple survey --product "my SaaS" --persona my-panel --pretty
dopple focus-group --topic "pricing" --product "my SaaS" --rounds 3 --pretty
dopple review "https://myapp.com" --type landing_page --persona my-panel --pretty
dopple review "Ship faster with AI" --type messaging --pretty

# Calibration
dopple calibrate --source policy.txt --source-type policy --persona my-panel --pretty
dopple calibrate-data --real survey.json --synthetic results.json --pretty

# Data & graph
dopple discover --posthog-key phx_... --product "my SaaS" --pretty
dopple graph --posthog-key phx_... --pretty
dopple query "Why are users churning?" --csv users.csv --pretty

# Prediction tracking (the compounding loop)
dopple traces --product "my SaaS" --pretty
dopple record-outcome --product "my SaaS" --trace-id abc123 --actual "22% churned"

# Management
dopple validate --persona panel.jsonl --pretty
dopple status --posthog-key phx_... --pretty
dopple panels
dopple providers
```

## How It Works

### Prediction trace accumulation

Every prediction Dopple makes is traced per product. When real outcomes arrive, they close the loop:

```
Week 1: dopple insights        → "28% will churn" (trace saved)
Week 2: dopple record-outcome  → "22% actually churned" (accuracy: 6pp error)
Week 3: dopple insights        → calibration history injected → better prediction
Week 8: dopple insights        → "For this product, Dopple's predictions have
                                  r=0.84 correlation with real outcomes across
                                  12 verified predictions"
```

The LLM doesn't change. The world model expands. Each verified prediction makes the next one more accurate.
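
The compounding step can be pictured as a rolling accuracy summary over recorded outcomes. A minimal sketch — the `Outcome` shape and `summarize` helper are illustrative, not Dopple's actual internals:

```typescript
// Illustrative only: how verified outcomes could roll up into a
// calibration summary that gets injected into the next prediction.
interface Outcome {
  traceId: string;
  mae: number;              // absolute error of the prediction, 0..1
  directionCorrect: boolean;
}

function summarize(outcomes: Outcome[]) {
  const n = outcomes.length;
  const meanMae = outcomes.reduce((s, o) => s + o.mae, 0) / n;
  const hitRate = outcomes.filter((o) => o.directionCorrect).length / n;
  return { verified: n, meanMae, hitRate };
}

const history: Outcome[] = [
  { traceId: "abc123", mae: 0.06, directionCorrect: true },
  { traceId: "def456", mae: 0.02, directionCorrect: true },
];
console.log(summarize(history));
```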

### Persona grounding

Every persona is a mathematical object, not a prompt template. Built on the [Big Five personality model](https://en.wikipedia.org/wiki/Big_Five_personality_traits) (OCEAN) — the most validated framework in personality psychology.

```
Openness:          0.72 → tries new products early, values innovation
Conscientiousness: 0.45 → moderate planning, some impulse buying
Extraversion:      0.31 → decides independently, researches quietly
Agreeableness:     0.68 → trusts recommendations, loyal, avoids conflict
Neuroticism:       0.55 → somewhat price-sensitive, reads negative reviews
```

Trait vectors compile into deterministic behavioral rules. Same traits = same behavior, every time.
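
One way to picture that compilation step — the thresholds and rule wording below are invented for illustration, not Dopple's actual mapping:

```typescript
// Illustrative sketch: deterministic trait → rule compilation.
// No randomness, no LLM: the same trait vector always compiles
// to the same rule set.
type Traits = { openness: number; neuroticism: number };

function compileRules(t: Traits): string[] {
  const rules: string[] = [];
  if (t.openness > 0.6) rules.push("tries new features within the first session");
  if (t.neuroticism > 0.5) rules.push("compares prices before any upgrade");
  return rules;
}

console.log(compileRules({ openness: 0.72, neuroticism: 0.55 }));
// → ["tries new features within the first session", "compares prices before any upgrade"]
```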

### Three generation modes

| Input | Mode | Confidence |
|-------|------|------------|
| PostHog, Stripe, CSV | **Data** — clusters real users → representative persona per segment | Medium-High |
| PDF, Markdown, TXT | **Documents** — extracts stakeholder types → individual persona per stakeholder | Medium |
| Both | **Combined** — segments from data, enriched with document knowledge | High |

Auto-selected based on what you connect. Each persona gets a rich narrative backstory, pain points, goals, and relationship to the product.

### Sourced responses

Every response traces back to data:

```json
{
  "persona": "Sarah",
  "response": "I'd probably cancel at $15.",
  "confidence": "medium",
  "reasoning": "High neuroticism drives price anxiety. Payment data shows 34% churn at current price.",
  "citations": [
    { "source": "trait-model", "detail": "neuroticism 0.72 → price sensitivity", "weight": 0.6 },
    { "source": "payment-data", "detail": "34% churned after last price increase", "weight": 0.9 }
  ]
}
```

### Knowledge graph

Builds an in-memory graph from your data — users, events, features, complaints, payments, and their relationships. Query it with natural language or use it to ground personas in specific data neighborhoods.

### Calibration (two modes)

**Qualitative** — calibrate against policy docs, brand guidelines, benchmarks. An LLM evaluates alignment.

**Structured** — calibrate against real survey data. Pure math: mean absolute error per question, Pearson correlation across distributions. No LLM judgment. Publishable.
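
The structured metrics are the standard formulas — this sketch is not Dopple's implementation, and it assumes each dataset is an array of per-question answer rates:

```typescript
// Mean absolute error: average gap between real and synthetic rates.
function mae(real: number[], synthetic: number[]): number {
  return real.reduce((s, r, i) => s + Math.abs(r - synthetic[i]), 0) / real.length;
}

// Pearson correlation: how well the synthetic distribution tracks the
// shape of the real one, independent of constant offsets.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - mx) * (y[i] - my);
    vx += (x[i] - mx) ** 2;
    vy += (y[i] - my) ** 2;
  }
  return cov / Math.sqrt(vx * vy);
}

const real = [0.42, 0.3, 0.8];       // e.g. share of real users picking each option
const synthetic = [0.5, 0.25, 0.75]; // same questions, synthetic panel
console.log(mae(real, synthetic));     // ≈ 0.06 (6pp average error)
console.log(pearson(real, synthetic)); // ≈ 0.96
```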

### Memory

Three layers that persist across sessions:

- **Facts** — immutable ground truths from your data. Can't be contradicted.
- **Episodes** — what the persona said before. Prevents flip-flopping.
- **Stances** — distilled positions ("AGAINST price increases"). Compact, always in prompt.
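
The three layers could be modeled roughly like this — field names and shapes are illustrative guesses, not Dopple's actual schema:

```typescript
// Illustrative shapes for the three memory layers described above.
interface Fact { statement: string; source: string }               // immutable ground truth
interface Episode { question: string; answer: string; at: number } // prior responses
interface Stance { topic: string; position: "FOR" | "AGAINST" | "NEUTRAL" }

interface PersonaMemory {
  facts: Fact[];       // injected as non-negotiable context
  episodes: Episode[]; // checked so new answers stay consistent
  stances: Stance[];   // compact summary, always in the prompt
}

const memory: PersonaMemory = {
  facts: [{ statement: "churned after last price increase", source: "stripe" }],
  episodes: [{ question: "Pay $15/mo?", answer: "Feels steep.", at: Date.now() }],
  stances: [{ topic: "price increases", position: "AGAINST" }],
};
console.log(memory.stances[0].position); // "AGAINST"
```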

## Data Sources

### Adapters (read-only — Dopple never writes to your services)

| Adapter | What it provides |
|---------|------------------|
| **PostHog** | User properties, events, behavioral patterns |
| **Stripe** | Subscriptions, churn, MRR, cancel reasons |
| **Amplitude** | User journeys, event export |
| **Mixpanel** | User profiles, event analytics |
| **HubSpot** | CRM contacts, deals, lifecycle stage |
| **Intercom** | Contacts, conversation history, tags |
| **CSV / JSON** | Any structured data from files |
| **Documents** | PDF, Markdown, TXT, HTML files |
| **Context** | Freeform text, URLs |

### Integrations (output channels)

| Integration | What it does |
|-------------|--------------|
| **Figma** | Personas review your Figma designs — layout, messaging, usability, trust |
| **Slack** | Post survey results, panel responses, insights to Slack channels |
| **Review** | Personas review any content — landing pages, emails, ads, pricing pages |

### Roadmap

| Layer | Adapter | What it adds |
|-------|---------|--------------|
| **SAY** | Formbricks | Survey responses, NPS, in-app feedback |
| **SAY** | Chatwoot | Support conversations, tickets |
| **DO** | Shopify | Orders, products, customer segments |
| **DO** | RudderStack | Unified customer profiles |
| **HOW** | OpenReplay | Session replays, heatmaps, rage clicks |
| **HOW** | rrweb | Raw DOM recordings, scroll patterns |

The four layers of user understanding:

- **SAY** — what users tell you (surveys, support, reviews)
- **DO** — what users actually did (events, payments, signups)
- **HOW** — how users interacted (session replays, hesitation, scroll depth)
- **WHY** — why users behave that way (personality, motivation, beliefs) — Dopple

### Custom adapters

Write your own in ~20 lines:

```typescript
import { registerAdapter, type Adapter, type UserRecord } from "dopple-ai";

class MyAdapter implements Adapter {
  readonly type = "my-source";

  constructor(private config: unknown) {}

  async fetch(): Promise<UserRecord[]> {
    return [{ id: "user-1", properties: { plan: "pro" } }];
  }
}

registerAdapter("my-source", (config) => new MyAdapter(config));
```

## LLM Providers

One string to switch models:

```typescript
new Dopple({ model: "anthropic/claude-sonnet-4-20250514" });          // default
new Dopple({ model: "openai/gpt-4o" });
new Dopple({ model: "ollama/llama3" });                               // local, free
new Dopple({ model: "groq/llama-3.3-70b-versatile" });                // fast
new Dopple({ model: "openrouter/anthropic/claude-sonnet-4-20250514" });
```

8 providers: Anthropic, OpenAI, Google, Groq, Together, Fireworks, Ollama, OpenRouter.

## Agent Integration

Dopple is designed to be called by AI agents. Every command returns structured JSON by default.

```bash
# Claude Code / Codex / any agent can run:
dopple insights --product "my SaaS" --csv users.csv --agent

# Returns single-line JSON envelope:
# {"ok":true,"command":"insights","data":{...}}
```

Self-documenting: `dopple --help --agent` returns a JSON schema of all commands.
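
An agent consuming that envelope might guard it like this — a sketch relying only on the `ok`, `command`, and `data` fields documented above; anything else about the envelope is an assumption:

```typescript
// Minimal validation for the --agent JSON envelope shown above.
interface Envelope { ok: boolean; command: string; data: unknown }

function parseEnvelope(line: string): Envelope {
  const parsed = JSON.parse(line);
  if (typeof parsed.ok !== "boolean" || typeof parsed.command !== "string") {
    throw new Error("not a dopple envelope");
  }
  return parsed as Envelope;
}

const env = parseEnvelope('{"ok":true,"command":"insights","data":{"insights":[]}}');
console.log(env.ok, env.command); // true insights
```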

## License

MIT