bridgekit 0.3.0__tar.gz → 0.3.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: bridgekit
3
- Version: 0.3.0
3
+ Version: 0.3.1
4
4
  Summary: AI tools that make you a better data scientist, not a redundant one.
5
5
  License: MIT
6
6
  Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -32,54 +32,7 @@ Dynamic: license-file
32
32
 
33
33
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
34
34
 
35
- No new interface to learn. No data leaving your hands. Just better work.
36
-
37
- ---
38
-
39
- ## Tool #1: Analysis Reviewer
40
-
41
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
42
-
43
- ```python
44
- from bridgekit import evaluate
45
-
46
- text = """
47
- I analyzed 90 days of user behavior data to understand what drives subscription
48
- upgrades. Users who engaged with the reporting feature within their first week
49
- were 3x more likely to upgrade within 30 days. I recommend we prioritize
50
- onboarding users to reporting as a growth lever.
51
- """
52
-
53
- evaluate(text)
54
- ```
55
-
56
- **Output:**
57
-
58
- ```
59
- BRIDGEKIT FEEDBACK
60
- ─────────────────────────────────────────
61
-
62
- ✅ LOGIC
63
- Your conclusion follows from the data. The 3x lift is a meaningful signal
64
- worth acting on.
65
-
66
- ⚠️ WHAT'S MISSING
67
- - Did you control for user intent? Users who explore reporting features may
68
- already be power users likely to upgrade regardless.
69
- - What's the sample size behind the 3x figure?
70
- - Is this correlation or did you establish any causal direction?
71
-
72
- 🎯 WEAKEST POINT
73
- "I recommend we prioritize onboarding to reporting" is a big leap from an
74
- observational finding. A senior DS would push back on this in the meeting.
75
-
76
- 💡 LEVEL UP
77
- Look into selection bias and how to address it — this analysis would be
78
- significantly stronger with a matched cohort or an experiment to validate
79
- the finding.
80
-
81
- ─────────────────────────────────────────
82
- ```
35
+ No new interface to learn. Just better work.
83
36
 
84
37
  ---
85
38
 
@@ -112,34 +65,87 @@ export ANTHROPIC_API_KEY=your_key_here
112
65
 
113
66
  ## Getting Started
114
67
 
115
- **From the terminal:**
68
+ Set your API key before launching Jupyter:
116
69
 
117
70
  ```bash
118
- python example.py
71
+ export ANTHROPIC_API_KEY=your_key_here
72
+ jupyter notebook
119
73
  ```
120
74
 
121
- **From a Jupyter notebook:**
75
+ Then import whichever tool you need:
122
76
 
123
- Set your API key before launching Jupyter:
77
+ ```python
78
+ from bridgekit import evaluate, plan, ask
79
+ ```
124
80
 
125
- ```bash
126
- export ANTHROPIC_API_KEY=your_key_here
127
- jupyter notebook
81
+ **Review a writeup:**
82
+ ```python
83
+ print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
84
+ ```
85
+
86
+ **Plan your analytical approach:**
87
+ ```python
88
+ print(plan("Did our onboarding flow reduce churn?"))
128
89
  ```
129
90
 
130
- Then in a cell:
91
+ **Search past reports:**
92
+ ```python
93
+ print(ask("What drove churn in Q3?", source="reports/"))
94
+ ```
95
+
96
+ ---
97
+
98
+ ## Tool #1: Analysis Reviewer
99
+
100
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
131
101
 
132
102
  ```python
133
103
  from bridgekit import evaluate
134
104
 
135
105
  text = """
136
- Your analysis writeup goes here.
106
+ I analyzed 90 days of user behavior data to understand what drives subscription
107
+ upgrades. Users who engaged with the reporting feature within their first week
108
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
109
+ onboarding users to reporting as a growth lever.
137
110
  """
138
111
 
139
112
  print(evaluate(text))
140
113
  ```
141
114
 
142
- Paste your writeup as a string and call `evaluate()` — that's it.
115
+ **Output:**
116
+
117
+ ```
118
+ BRIDGEKIT ANALYSIS REVIEW
119
+ ─────────────────────────────────────────
120
+
121
+ 1. CLARITY
122
+ ✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
123
+ this and immediately understand the claim and the recommendation.
124
+
125
+ 2. STATISTICAL RIGOR
126
+ ⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
127
+ context is missing. How many users are in each group? What's the base
128
+ upgrade rate? There's no confidence interval or p-value, so we can't
129
+ assess whether this difference is statistically significant or noise.
130
+
131
+ 3. METHODOLOGY
132
+ ❌ MISSING — This reads as a pure correlation finding, but the
133
+ recommendation implies causation. Users who explore reporting in week one
134
+ may simply be more motivated or already closer to upgrading. Without
135
+ addressing the self-selection problem, this recommendation is not
136
+ defensible.
137
+
138
+ 4. BUSINESS IMPACT
139
+ ⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
140
+ the 3x lift into projected revenue or upgrade volume so leadership can
141
+ prioritize this against competing initiatives.
142
+
143
+ ─────────────────────────────────────────
144
+ BOTTOM LINE
145
+ You must address the correlation-vs-causation gap before presenting —
146
+ otherwise you risk recommending an onboarding investment that targets a
147
+ symptom of upgrade intent rather than a cause of it.
148
+ ```
143
149
 
144
150
  ---
145
151
 
@@ -197,7 +203,7 @@ from bridgekit import plan
197
203
 
198
204
  print(plan(
199
205
  question="Does our new onboarding flow increase upgrade rates?",
200
- data_description="1,000 users randomly split 50/50 between old and new onboarding. Variables: upgrade status (binary), time to upgrade (days), acquisition channel, plan tier.",
206
+ data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
201
207
  goal="causal inference"
202
208
  ))
203
209
  ```
@@ -255,13 +261,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
255
261
 
256
262
  ## Why a library and not a chatbot?
257
263
 
258
- Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call at the end of your existing process — consistent, reproducible, and fast.
259
-
260
- ---
261
-
262
- ## Is my data safe?
263
-
264
- Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
264
+ Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
265
265
 
266
266
  ---
267
267
 
@@ -269,9 +269,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
269
269
 
270
270
  Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
271
271
 
272
- - **Statistical approach suggester** — describe your problem in plain English, get the right test and why
273
272
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
274
273
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
274
+ - **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
275
275
 
276
276
  Each tool is small, focused, and built for the way data scientists actually work.
277
277
 
@@ -4,54 +4,7 @@
4
4
 
5
5
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
6
6
 
7
- No new interface to learn. No data leaving your hands. Just better work.
8
-
9
- ---
10
-
11
- ## Tool #1: Analysis Reviewer
12
-
13
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
14
-
15
- ```python
16
- from bridgekit import evaluate
17
-
18
- text = """
19
- I analyzed 90 days of user behavior data to understand what drives subscription
20
- upgrades. Users who engaged with the reporting feature within their first week
21
- were 3x more likely to upgrade within 30 days. I recommend we prioritize
22
- onboarding users to reporting as a growth lever.
23
- """
24
-
25
- evaluate(text)
26
- ```
27
-
28
- **Output:**
29
-
30
- ```
31
- BRIDGEKIT FEEDBACK
32
- ─────────────────────────────────────────
33
-
34
- ✅ LOGIC
35
- Your conclusion follows from the data. The 3x lift is a meaningful signal
36
- worth acting on.
37
-
38
- ⚠️ WHAT'S MISSING
39
- - Did you control for user intent? Users who explore reporting features may
40
- already be power users likely to upgrade regardless.
41
- - What's the sample size behind the 3x figure?
42
- - Is this correlation or did you establish any causal direction?
43
-
44
- 🎯 WEAKEST POINT
45
- "I recommend we prioritize onboarding to reporting" is a big leap from an
46
- observational finding. A senior DS would push back on this in the meeting.
47
-
48
- 💡 LEVEL UP
49
- Look into selection bias and how to address it — this analysis would be
50
- significantly stronger with a matched cohort or an experiment to validate
51
- the finding.
52
-
53
- ─────────────────────────────────────────
54
- ```
7
+ No new interface to learn. Just better work.
55
8
 
56
9
  ---
57
10
 
@@ -84,34 +37,87 @@ export ANTHROPIC_API_KEY=your_key_here
84
37
 
85
38
  ## Getting Started
86
39
 
87
- **From the terminal:**
40
+ Set your API key before launching Jupyter:
88
41
 
89
42
  ```bash
90
- python example.py
43
+ export ANTHROPIC_API_KEY=your_key_here
44
+ jupyter notebook
91
45
  ```
92
46
 
93
- **From a Jupyter notebook:**
47
+ Then import whichever tool you need:
94
48
 
95
- Set your API key before launching Jupyter:
49
+ ```python
50
+ from bridgekit import evaluate, plan, ask
51
+ ```
96
52
 
97
- ```bash
98
- export ANTHROPIC_API_KEY=your_key_here
99
- jupyter notebook
53
+ **Review a writeup:**
54
+ ```python
55
+ print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
56
+ ```
57
+
58
+ **Plan your analytical approach:**
59
+ ```python
60
+ print(plan("Did our onboarding flow reduce churn?"))
100
61
  ```
101
62
 
102
- Then in a cell:
63
+ **Search past reports:**
64
+ ```python
65
+ print(ask("What drove churn in Q3?", source="reports/"))
66
+ ```
67
+
68
+ ---
69
+
70
+ ## Tool #1: Analysis Reviewer
71
+
72
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
103
73
 
104
74
  ```python
105
75
  from bridgekit import evaluate
106
76
 
107
77
  text = """
108
- Your analysis writeup goes here.
78
+ I analyzed 90 days of user behavior data to understand what drives subscription
79
+ upgrades. Users who engaged with the reporting feature within their first week
80
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
81
+ onboarding users to reporting as a growth lever.
109
82
  """
110
83
 
111
84
  print(evaluate(text))
112
85
  ```
113
86
 
114
- Paste your writeup as a string and call `evaluate()` — that's it.
87
+ **Output:**
88
+
89
+ ```
90
+ BRIDGEKIT ANALYSIS REVIEW
91
+ ─────────────────────────────────────────
92
+
93
+ 1. CLARITY
94
+ ✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
95
+ this and immediately understand the claim and the recommendation.
96
+
97
+ 2. STATISTICAL RIGOR
98
+ ⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
99
+ context is missing. How many users are in each group? What's the base
100
+ upgrade rate? There's no confidence interval or p-value, so we can't
101
+ assess whether this difference is statistically significant or noise.
102
+
103
+ 3. METHODOLOGY
104
+ ❌ MISSING — This reads as a pure correlation finding, but the
105
+ recommendation implies causation. Users who explore reporting in week one
106
+ may simply be more motivated or already closer to upgrading. Without
107
+ addressing the self-selection problem, this recommendation is not
108
+ defensible.
109
+
110
+ 4. BUSINESS IMPACT
111
+ ⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
112
+ the 3x lift into projected revenue or upgrade volume so leadership can
113
+ prioritize this against competing initiatives.
114
+
115
+ ─────────────────────────────────────────
116
+ BOTTOM LINE
117
+ You must address the correlation-vs-causation gap before presenting —
118
+ otherwise you risk recommending an onboarding investment that targets a
119
+ symptom of upgrade intent rather than a cause of it.
120
+ ```
115
121
 
116
122
  ---
117
123
 
@@ -169,7 +175,7 @@ from bridgekit import plan
169
175
 
170
176
  print(plan(
171
177
  question="Does our new onboarding flow increase upgrade rates?",
172
- data_description="1,000 users randomly split 50/50 between old and new onboarding. Variables: upgrade status (binary), time to upgrade (days), acquisition channel, plan tier.",
178
+ data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
173
179
  goal="causal inference"
174
180
  ))
175
181
  ```
@@ -227,13 +233,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
227
233
 
228
234
  ## Why a library and not a chatbot?
229
235
 
230
- Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call at the end of your existing process — consistent, reproducible, and fast.
231
-
232
- ---
233
-
234
- ## Is my data safe?
235
-
236
- Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
236
+ Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
237
237
 
238
238
  ---
239
239
 
@@ -241,9 +241,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
241
241
 
242
242
  Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
243
243
 
244
- - **Statistical approach suggester** — describe your problem in plain English, get the right test and why
245
244
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
246
245
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
246
+ - **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
247
247
 
248
248
  Each tool is small, focused, and built for the way data scientists actually work.
249
249
 
@@ -2,5 +2,5 @@ from .reviewer import evaluate
2
2
  from .search import ask
3
3
  from .planner import plan
4
4
 
5
- __version__ = "0.3.0"
5
+ __version__ = "0.3.1"
6
6
  __all__ = ["evaluate", "ask", "plan"]
@@ -0,0 +1 @@
1
+ DEFAULT_MODEL = "claude-opus-4-6"
@@ -1,5 +1,6 @@
1
1
  import os
2
2
  import anthropic
3
+ from .config import DEFAULT_MODEL
3
4
 
4
5
  SYSTEM_PROMPT = """You are a senior statistician and data scientist advising a colleague on the right analytical approach for their problem.
5
6
 
@@ -60,7 +61,7 @@ def plan(question: str, data_description: str = None, goal: str = None) -> str:
60
61
 
61
62
  client = anthropic.Anthropic(api_key=api_key)
62
63
  message = client.messages.create(
63
- model="claude-opus-4-5",
64
+ model=DEFAULT_MODEL,
64
65
  max_tokens=1024,
65
66
  system=SYSTEM_PROMPT,
66
67
  messages=[
@@ -1,16 +1,16 @@
1
1
  import os
2
2
  import anthropic
3
+ from .config import DEFAULT_MODEL
3
4
 
4
5
  SYSTEM_PROMPT = """You are a senior data scientist reviewing a colleague's analysis writeup.
5
6
  You are direct, constructive, and specific. You do not flatter — you help people improve.
6
7
 
7
- Evaluate the writeup across exactly these five dimensions:
8
+ Evaluate the writeup across exactly these four dimensions:
8
9
 
9
- 1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything?
10
- 2. AUDIENCE CLARITY — Is it written for the right reader? Does the level of detail and framing match who will actually read this?
11
- 3. STATISTICAL RIGOR — Is there enough data to support the claim? Are sample sizes mentioned? Are confidence levels or uncertainty acknowledged?
12
- 4. METHODOLOGYIs it clear why this analytical approach was chosen? Are alternatives considered or ruled out?
13
- 5. BUSINESS IMPACT — Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
10
+ 1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything? Is it written for the right reader?
11
+ 2. STATISTICAL RIGOR — Is there enough data to support the claim? Are sample sizes mentioned? Are confidence levels or uncertainty acknowledged?
12
+ 3. METHODOLOGY — Is it clear why this analytical approach was chosen? Are alternatives considered or ruled out?
13
+ 4. BUSINESS IMPACT Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
14
14
 
15
15
  For each dimension, give one of three ratings:
16
16
  ✅ STRONG — this dimension is handled well
@@ -29,16 +29,13 @@ BRIDGEKIT ANALYSIS REVIEW
29
29
  1. CLARITY
30
30
  [rating] [feedback]
31
31
 
32
- 2. AUDIENCE CLARITY
32
+ 2. STATISTICAL RIGOR
33
33
  [rating] [feedback]
34
34
 
35
- 3. STATISTICAL RIGOR
35
+ 3. METHODOLOGY
36
36
  [rating] [feedback]
37
37
 
38
- 4. METHODOLOGY
39
- [rating] [feedback]
40
-
41
- 5. BUSINESS IMPACT
38
+ 4. BUSINESS IMPACT
42
39
  [rating] [feedback]
43
40
 
44
41
  ─────────────────────────────────────────
@@ -54,7 +51,7 @@ def evaluate(text: str) -> str:
54
51
  text: Your analysis writeup as a plain string.
55
52
 
56
53
  Returns:
57
- Structured feedback across five dimensions.
54
+ Structured feedback across four dimensions.
58
55
  """
59
56
  if not text or not text.strip():
60
57
  raise ValueError("Text cannot be empty.")
@@ -68,7 +65,7 @@ def evaluate(text: str) -> str:
68
65
  client = anthropic.Anthropic(api_key=api_key)
69
66
 
70
67
  message = client.messages.create(
71
- model="claude-opus-4-5",
68
+ model=DEFAULT_MODEL,
72
69
  max_tokens=1024,
73
70
  system=SYSTEM_PROMPT,
74
71
  messages=[
@@ -1,6 +1,7 @@
1
1
  import os
2
2
  from pathlib import Path
3
3
  import anthropic
4
+ from .config import DEFAULT_MODEL
4
5
  import chromadb
5
6
  from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
6
7
 
@@ -106,7 +107,7 @@ def ask(question: str, source: str = None, text: str = None) -> str:
106
107
  # Generate answer with Claude
107
108
  anthropic_client = anthropic.Anthropic(api_key=api_key)
108
109
  message = anthropic_client.messages.create(
109
- model="claude-opus-4-5",
110
+ model=DEFAULT_MODEL,
110
111
  max_tokens=1024,
111
112
  system=(
112
113
  "You are a senior data scientist answering questions based on analysis reports. "
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: bridgekit
3
- Version: 0.3.0
3
+ Version: 0.3.1
4
4
  Summary: AI tools that make you a better data scientist, not a redundant one.
5
5
  License: MIT
6
6
  Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -32,54 +32,7 @@ Dynamic: license-file
32
32
 
33
33
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
34
34
 
35
- No new interface to learn. No data leaving your hands. Just better work.
36
-
37
- ---
38
-
39
- ## Tool #1: Analysis Reviewer
40
-
41
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
42
-
43
- ```python
44
- from bridgekit import evaluate
45
-
46
- text = """
47
- I analyzed 90 days of user behavior data to understand what drives subscription
48
- upgrades. Users who engaged with the reporting feature within their first week
49
- were 3x more likely to upgrade within 30 days. I recommend we prioritize
50
- onboarding users to reporting as a growth lever.
51
- """
52
-
53
- evaluate(text)
54
- ```
55
-
56
- **Output:**
57
-
58
- ```
59
- BRIDGEKIT FEEDBACK
60
- ─────────────────────────────────────────
61
-
62
- ✅ LOGIC
63
- Your conclusion follows from the data. The 3x lift is a meaningful signal
64
- worth acting on.
65
-
66
- ⚠️ WHAT'S MISSING
67
- - Did you control for user intent? Users who explore reporting features may
68
- already be power users likely to upgrade regardless.
69
- - What's the sample size behind the 3x figure?
70
- - Is this correlation or did you establish any causal direction?
71
-
72
- 🎯 WEAKEST POINT
73
- "I recommend we prioritize onboarding to reporting" is a big leap from an
74
- observational finding. A senior DS would push back on this in the meeting.
75
-
76
- 💡 LEVEL UP
77
- Look into selection bias and how to address it — this analysis would be
78
- significantly stronger with a matched cohort or an experiment to validate
79
- the finding.
80
-
81
- ─────────────────────────────────────────
82
- ```
35
+ No new interface to learn. Just better work.
83
36
 
84
37
  ---
85
38
 
@@ -112,34 +65,87 @@ export ANTHROPIC_API_KEY=your_key_here
112
65
 
113
66
  ## Getting Started
114
67
 
115
- **From the terminal:**
68
+ Set your API key before launching Jupyter:
116
69
 
117
70
  ```bash
118
- python example.py
71
+ export ANTHROPIC_API_KEY=your_key_here
72
+ jupyter notebook
119
73
  ```
120
74
 
121
- **From a Jupyter notebook:**
75
+ Then import whichever tool you need:
122
76
 
123
- Set your API key before launching Jupyter:
77
+ ```python
78
+ from bridgekit import evaluate, plan, ask
79
+ ```
124
80
 
125
- ```bash
126
- export ANTHROPIC_API_KEY=your_key_here
127
- jupyter notebook
81
+ **Review a writeup:**
82
+ ```python
83
+ print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
84
+ ```
85
+
86
+ **Plan your analytical approach:**
87
+ ```python
88
+ print(plan("Did our onboarding flow reduce churn?"))
128
89
  ```
129
90
 
130
- Then in a cell:
91
+ **Search past reports:**
92
+ ```python
93
+ print(ask("What drove churn in Q3?", source="reports/"))
94
+ ```
95
+
96
+ ---
97
+
98
+ ## Tool #1: Analysis Reviewer
99
+
100
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
131
101
 
132
102
  ```python
133
103
  from bridgekit import evaluate
134
104
 
135
105
  text = """
136
- Your analysis writeup goes here.
106
+ I analyzed 90 days of user behavior data to understand what drives subscription
107
+ upgrades. Users who engaged with the reporting feature within their first week
108
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
109
+ onboarding users to reporting as a growth lever.
137
110
  """
138
111
 
139
112
  print(evaluate(text))
140
113
  ```
141
114
 
142
- Paste your writeup as a string and call `evaluate()` — that's it.
115
+ **Output:**
116
+
117
+ ```
118
+ BRIDGEKIT ANALYSIS REVIEW
119
+ ─────────────────────────────────────────
120
+
121
+ 1. CLARITY
122
+ ✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
123
+ this and immediately understand the claim and the recommendation.
124
+
125
+ 2. STATISTICAL RIGOR
126
+ ⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
127
+ context is missing. How many users are in each group? What's the base
128
+ upgrade rate? There's no confidence interval or p-value, so we can't
129
+ assess whether this difference is statistically significant or noise.
130
+
131
+ 3. METHODOLOGY
132
+ ❌ MISSING — This reads as a pure correlation finding, but the
133
+ recommendation implies causation. Users who explore reporting in week one
134
+ may simply be more motivated or already closer to upgrading. Without
135
+ addressing the self-selection problem, this recommendation is not
136
+ defensible.
137
+
138
+ 4. BUSINESS IMPACT
139
+ ⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
140
+ the 3x lift into projected revenue or upgrade volume so leadership can
141
+ prioritize this against competing initiatives.
142
+
143
+ ─────────────────────────────────────────
144
+ BOTTOM LINE
145
+ You must address the correlation-vs-causation gap before presenting —
146
+ otherwise you risk recommending an onboarding investment that targets a
147
+ symptom of upgrade intent rather than a cause of it.
148
+ ```
143
149
 
144
150
  ---
145
151
 
@@ -197,7 +203,7 @@ from bridgekit import plan
197
203
 
198
204
  print(plan(
199
205
  question="Does our new onboarding flow increase upgrade rates?",
200
- data_description="1,000 users randomly split 50/50 between old and new onboarding. Variables: upgrade status (binary), time to upgrade (days), acquisition channel, plan tier.",
206
+ data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
201
207
  goal="causal inference"
202
208
  ))
203
209
  ```
@@ -255,13 +261,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
255
261
 
256
262
  ## Why a library and not a chatbot?
257
263
 
258
- Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call at the end of your existing process — consistent, reproducible, and fast.
259
-
260
- ---
261
-
262
- ## Is my data safe?
263
-
264
- Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
264
+ Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
265
265
 
266
266
  ---
267
267
 
@@ -269,9 +269,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
269
269
 
270
270
  Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
271
271
 
272
- - **Statistical approach suggester** — describe your problem in plain English, get the right test and why
273
272
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
274
273
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
274
+ - **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
275
275
 
276
276
  Each tool is small, focused, and built for the way data scientists actually work.
277
277
 
@@ -2,6 +2,7 @@ LICENSE
2
2
  README.md
3
3
  pyproject.toml
4
4
  bridgekit/__init__.py
5
+ bridgekit/config.py
5
6
  bridgekit/planner.py
6
7
  bridgekit/reviewer.py
7
8
  bridgekit/search.py
@@ -7,7 +7,7 @@ include = ["bridgekit*"]
7
7
 
8
8
  [project]
9
9
  name = "bridgekit"
10
- version = "0.3.0"
10
+ version = "0.3.1"
11
11
  description = "AI tools that make you a better data scientist, not a redundant one."
12
12
  readme = "README.md"
13
13
  requires-python = ">=3.9"
@@ -21,13 +21,11 @@ FAKE_RESPONSE = (
21
21
  "─────────────────────────────────────────\n\n"
22
22
  "1. CLARITY\n"
23
23
  "✅ STRONG — The writeup is clear and jargon-free.\n\n"
24
- "2. AUDIENCE CLARITY\n"
25
- "✅ STRONG — Written for the right audience.\n\n"
26
- "3. STATISTICAL RIGOR\n"
24
+ "2. STATISTICAL RIGOR\n"
27
25
  "⚠️ NEEDS WORK — Sample size is not mentioned.\n\n"
28
- "4. METHODOLOGY\n"
26
+ "3. METHODOLOGY\n"
29
27
  "✅ STRONG — Approach is well explained.\n\n"
30
- "5. BUSINESS IMPACT\n"
28
+ "4. BUSINESS IMPACT\n"
31
29
  "❌ MISSING — No quantified outcomes.\n\n"
32
30
  "─────────────────────────────────────────\n"
33
31
  "BOTTOM LINE\n"
File without changes
File without changes