bridgekit 0.3.0__tar.gz → 0.3.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {bridgekit-0.3.0 → bridgekit-0.3.2}/PKG-INFO +68 -68
- {bridgekit-0.3.0 → bridgekit-0.3.2}/README.md +67 -67
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit/__init__.py +1 -1
- bridgekit-0.3.2/bridgekit/config.py +1 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit/planner.py +2 -1
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit/reviewer.py +11 -14
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit/search.py +2 -1
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit.egg-info/PKG-INFO +68 -68
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit.egg-info/SOURCES.txt +1 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/pyproject.toml +1 -1
- {bridgekit-0.3.0 → bridgekit-0.3.2}/tests/test_reviewer.py +3 -5
- {bridgekit-0.3.0 → bridgekit-0.3.2}/LICENSE +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit.egg-info/dependency_links.txt +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit.egg-info/requires.txt +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/bridgekit.egg-info/top_level.txt +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/setup.cfg +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/tests/test_planner.py +0 -0
- {bridgekit-0.3.0 → bridgekit-0.3.2}/tests/test_search.py +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: bridgekit
|
|
3
|
-
Version: 0.3.
|
|
3
|
+
Version: 0.3.2
|
|
4
4
|
Summary: AI tools that make you a better data scientist, not a redundant one.
|
|
5
5
|
License: MIT
|
|
6
6
|
Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
|
|
@@ -32,54 +32,7 @@ Dynamic: license-file
|
|
|
32
32
|
|
|
33
33
|
Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
|
|
34
34
|
|
|
35
|
-
No new interface to learn.
|
|
36
|
-
|
|
37
|
-
---
|
|
38
|
-
|
|
39
|
-
## Tool #1: Analysis Reviewer
|
|
40
|
-
|
|
41
|
-
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
42
|
-
|
|
43
|
-
```python
|
|
44
|
-
from bridgekit import evaluate
|
|
45
|
-
|
|
46
|
-
text = """
|
|
47
|
-
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
48
|
-
upgrades. Users who engaged with the reporting feature within their first week
|
|
49
|
-
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
50
|
-
onboarding users to reporting as a growth lever.
|
|
51
|
-
"""
|
|
52
|
-
|
|
53
|
-
evaluate(text)
|
|
54
|
-
```
|
|
55
|
-
|
|
56
|
-
**Output:**
|
|
57
|
-
|
|
58
|
-
```
|
|
59
|
-
BRIDGEKIT FEEDBACK
|
|
60
|
-
─────────────────────────────────────────
|
|
61
|
-
|
|
62
|
-
✅ LOGIC
|
|
63
|
-
Your conclusion follows from the data. The 3x lift is a meaningful signal
|
|
64
|
-
worth acting on.
|
|
65
|
-
|
|
66
|
-
⚠️ WHAT'S MISSING
|
|
67
|
-
- Did you control for user intent? Users who explore reporting features may
|
|
68
|
-
already be power users likely to upgrade regardless.
|
|
69
|
-
- What's the sample size behind the 3x figure?
|
|
70
|
-
- Is this correlation or did you establish any causal direction?
|
|
71
|
-
|
|
72
|
-
🎯 WEAKEST POINT
|
|
73
|
-
"I recommend we prioritize onboarding to reporting" is a big leap from an
|
|
74
|
-
observational finding. A senior DS would push back on this in the meeting.
|
|
75
|
-
|
|
76
|
-
💡 LEVEL UP
|
|
77
|
-
Look into selection bias and how to address it — this analysis would be
|
|
78
|
-
significantly stronger with a matched cohort or an experiment to validate
|
|
79
|
-
the finding.
|
|
80
|
-
|
|
81
|
-
─────────────────────────────────────────
|
|
82
|
-
```
|
|
35
|
+
No new interface to learn. Just better work.
|
|
83
36
|
|
|
84
37
|
---
|
|
85
38
|
|
|
@@ -112,34 +65,87 @@ export ANTHROPIC_API_KEY=your_key_here
|
|
|
112
65
|
|
|
113
66
|
## Getting Started
|
|
114
67
|
|
|
115
|
-
|
|
68
|
+
Set your API key before launching Jupyter:
|
|
116
69
|
|
|
117
70
|
```bash
|
|
118
|
-
|
|
71
|
+
export ANTHROPIC_API_KEY=your_key_here
|
|
72
|
+
jupyter notebook
|
|
119
73
|
```
|
|
120
74
|
|
|
121
|
-
|
|
75
|
+
Then import whichever tool you need:
|
|
122
76
|
|
|
123
|
-
|
|
77
|
+
```python
|
|
78
|
+
from bridgekit import evaluate, plan, ask
|
|
79
|
+
```
|
|
124
80
|
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
81
|
+
**Review a writeup:**
|
|
82
|
+
```python
|
|
83
|
+
print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
**Plan your analytical approach:**
|
|
87
|
+
```python
|
|
88
|
+
print(plan("Did our onboarding flow reduce churn?"))
|
|
128
89
|
```
|
|
129
90
|
|
|
130
|
-
|
|
91
|
+
**Search past reports:**
|
|
92
|
+
```python
|
|
93
|
+
print(ask("What drove churn in Q3?", source="reports/"))
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
---
|
|
97
|
+
|
|
98
|
+
## Tool #1: Analysis Reviewer
|
|
99
|
+
|
|
100
|
+
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
131
101
|
|
|
132
102
|
```python
|
|
133
103
|
from bridgekit import evaluate
|
|
134
104
|
|
|
135
105
|
text = """
|
|
136
|
-
|
|
106
|
+
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
107
|
+
upgrades. Users who engaged with the reporting feature within their first week
|
|
108
|
+
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
109
|
+
onboarding users to reporting as a growth lever.
|
|
137
110
|
"""
|
|
138
111
|
|
|
139
112
|
print(evaluate(text))
|
|
140
113
|
```
|
|
141
114
|
|
|
142
|
-
|
|
115
|
+
**Output:**
|
|
116
|
+
|
|
117
|
+
```
|
|
118
|
+
BRIDGEKIT ANALYSIS REVIEW
|
|
119
|
+
─────────────────────────────────────────
|
|
120
|
+
|
|
121
|
+
1. CLARITY
|
|
122
|
+
✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
|
|
123
|
+
this and immediately understand the claim and the recommendation.
|
|
124
|
+
|
|
125
|
+
2. STATISTICAL RIGOR
|
|
126
|
+
⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
|
|
127
|
+
context is missing. How many users are in each group? What's the base
|
|
128
|
+
upgrade rate? There's no confidence interval or p-value, so we can't
|
|
129
|
+
assess whether this difference is statistically significant or noise.
|
|
130
|
+
|
|
131
|
+
3. METHODOLOGY
|
|
132
|
+
❌ MISSING — This reads as a pure correlation finding, but the
|
|
133
|
+
recommendation implies causation. Users who explore reporting in week one
|
|
134
|
+
may simply be more motivated or already closer to upgrading. Without
|
|
135
|
+
addressing the self-selection problem, this recommendation is not
|
|
136
|
+
defensible.
|
|
137
|
+
|
|
138
|
+
4. BUSINESS IMPACT
|
|
139
|
+
⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
|
|
140
|
+
the 3x lift into projected revenue or upgrade volume so leadership can
|
|
141
|
+
prioritize this against competing initiatives.
|
|
142
|
+
|
|
143
|
+
─────────────────────────────────────────
|
|
144
|
+
BOTTOM LINE
|
|
145
|
+
You must address the correlation-vs-causation gap before presenting —
|
|
146
|
+
otherwise you risk recommending an onboarding investment that targets a
|
|
147
|
+
symptom of upgrade intent rather than a cause of it.
|
|
148
|
+
```
|
|
143
149
|
|
|
144
150
|
---
|
|
145
151
|
|
|
@@ -197,7 +203,7 @@ from bridgekit import plan
|
|
|
197
203
|
|
|
198
204
|
print(plan(
|
|
199
205
|
question="Does our new onboarding flow increase upgrade rates?",
|
|
200
|
-
data_description="1,000 users
|
|
206
|
+
data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
|
|
201
207
|
goal="causal inference"
|
|
202
208
|
))
|
|
203
209
|
```
|
|
@@ -255,13 +261,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
|
|
|
255
261
|
|
|
256
262
|
## Why a library and not a chatbot?
|
|
257
263
|
|
|
258
|
-
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call
|
|
259
|
-
|
|
260
|
-
---
|
|
261
|
-
|
|
262
|
-
## Is my data safe?
|
|
263
|
-
|
|
264
|
-
Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
|
|
264
|
+
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
|
|
265
265
|
|
|
266
266
|
---
|
|
267
267
|
|
|
@@ -269,9 +269,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
|
|
|
269
269
|
|
|
270
270
|
Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
|
|
271
271
|
|
|
272
|
-
- **Statistical approach suggester** — describe your problem in plain English, get the right test and why
|
|
273
272
|
- **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
|
|
274
273
|
- **Assumption checker** — state your analytical assumptions, get the ones you missed
|
|
274
|
+
- **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
|
|
275
275
|
|
|
276
276
|
Each tool is small, focused, and built for the way data scientists actually work.
|
|
277
277
|
|
|
@@ -4,54 +4,7 @@
|
|
|
4
4
|
|
|
5
5
|
Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
|
|
6
6
|
|
|
7
|
-
No new interface to learn.
|
|
8
|
-
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
## Tool #1: Analysis Reviewer
|
|
12
|
-
|
|
13
|
-
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
14
|
-
|
|
15
|
-
```python
|
|
16
|
-
from bridgekit import evaluate
|
|
17
|
-
|
|
18
|
-
text = """
|
|
19
|
-
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
20
|
-
upgrades. Users who engaged with the reporting feature within their first week
|
|
21
|
-
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
22
|
-
onboarding users to reporting as a growth lever.
|
|
23
|
-
"""
|
|
24
|
-
|
|
25
|
-
evaluate(text)
|
|
26
|
-
```
|
|
27
|
-
|
|
28
|
-
**Output:**
|
|
29
|
-
|
|
30
|
-
```
|
|
31
|
-
BRIDGEKIT FEEDBACK
|
|
32
|
-
─────────────────────────────────────────
|
|
33
|
-
|
|
34
|
-
✅ LOGIC
|
|
35
|
-
Your conclusion follows from the data. The 3x lift is a meaningful signal
|
|
36
|
-
worth acting on.
|
|
37
|
-
|
|
38
|
-
⚠️ WHAT'S MISSING
|
|
39
|
-
- Did you control for user intent? Users who explore reporting features may
|
|
40
|
-
already be power users likely to upgrade regardless.
|
|
41
|
-
- What's the sample size behind the 3x figure?
|
|
42
|
-
- Is this correlation or did you establish any causal direction?
|
|
43
|
-
|
|
44
|
-
🎯 WEAKEST POINT
|
|
45
|
-
"I recommend we prioritize onboarding to reporting" is a big leap from an
|
|
46
|
-
observational finding. A senior DS would push back on this in the meeting.
|
|
47
|
-
|
|
48
|
-
💡 LEVEL UP
|
|
49
|
-
Look into selection bias and how to address it — this analysis would be
|
|
50
|
-
significantly stronger with a matched cohort or an experiment to validate
|
|
51
|
-
the finding.
|
|
52
|
-
|
|
53
|
-
─────────────────────────────────────────
|
|
54
|
-
```
|
|
7
|
+
No new interface to learn. Just better work.
|
|
55
8
|
|
|
56
9
|
---
|
|
57
10
|
|
|
@@ -84,34 +37,87 @@ export ANTHROPIC_API_KEY=your_key_here
|
|
|
84
37
|
|
|
85
38
|
## Getting Started
|
|
86
39
|
|
|
87
|
-
|
|
40
|
+
Set your API key before launching Jupyter:
|
|
88
41
|
|
|
89
42
|
```bash
|
|
90
|
-
|
|
43
|
+
export ANTHROPIC_API_KEY=your_key_here
|
|
44
|
+
jupyter notebook
|
|
91
45
|
```
|
|
92
46
|
|
|
93
|
-
|
|
47
|
+
Then import whichever tool you need:
|
|
94
48
|
|
|
95
|
-
|
|
49
|
+
```python
|
|
50
|
+
from bridgekit import evaluate, plan, ask
|
|
51
|
+
```
|
|
96
52
|
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
53
|
+
**Review a writeup:**
|
|
54
|
+
```python
|
|
55
|
+
print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
**Plan your analytical approach:**
|
|
59
|
+
```python
|
|
60
|
+
print(plan("Did our onboarding flow reduce churn?"))
|
|
100
61
|
```
|
|
101
62
|
|
|
102
|
-
|
|
63
|
+
**Search past reports:**
|
|
64
|
+
```python
|
|
65
|
+
print(ask("What drove churn in Q3?", source="reports/"))
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## Tool #1: Analysis Reviewer
|
|
71
|
+
|
|
72
|
+
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
103
73
|
|
|
104
74
|
```python
|
|
105
75
|
from bridgekit import evaluate
|
|
106
76
|
|
|
107
77
|
text = """
|
|
108
|
-
|
|
78
|
+
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
79
|
+
upgrades. Users who engaged with the reporting feature within their first week
|
|
80
|
+
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
81
|
+
onboarding users to reporting as a growth lever.
|
|
109
82
|
"""
|
|
110
83
|
|
|
111
84
|
print(evaluate(text))
|
|
112
85
|
```
|
|
113
86
|
|
|
114
|
-
|
|
87
|
+
**Output:**
|
|
88
|
+
|
|
89
|
+
```
|
|
90
|
+
BRIDGEKIT ANALYSIS REVIEW
|
|
91
|
+
─────────────────────────────────────────
|
|
92
|
+
|
|
93
|
+
1. CLARITY
|
|
94
|
+
✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
|
|
95
|
+
this and immediately understand the claim and the recommendation.
|
|
96
|
+
|
|
97
|
+
2. STATISTICAL RIGOR
|
|
98
|
+
⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
|
|
99
|
+
context is missing. How many users are in each group? What's the base
|
|
100
|
+
upgrade rate? There's no confidence interval or p-value, so we can't
|
|
101
|
+
assess whether this difference is statistically significant or noise.
|
|
102
|
+
|
|
103
|
+
3. METHODOLOGY
|
|
104
|
+
❌ MISSING — This reads as a pure correlation finding, but the
|
|
105
|
+
recommendation implies causation. Users who explore reporting in week one
|
|
106
|
+
may simply be more motivated or already closer to upgrading. Without
|
|
107
|
+
addressing the self-selection problem, this recommendation is not
|
|
108
|
+
defensible.
|
|
109
|
+
|
|
110
|
+
4. BUSINESS IMPACT
|
|
111
|
+
⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
|
|
112
|
+
the 3x lift into projected revenue or upgrade volume so leadership can
|
|
113
|
+
prioritize this against competing initiatives.
|
|
114
|
+
|
|
115
|
+
─────────────────────────────────────────
|
|
116
|
+
BOTTOM LINE
|
|
117
|
+
You must address the correlation-vs-causation gap before presenting —
|
|
118
|
+
otherwise you risk recommending an onboarding investment that targets a
|
|
119
|
+
symptom of upgrade intent rather than a cause of it.
|
|
120
|
+
```
|
|
115
121
|
|
|
116
122
|
---
|
|
117
123
|
|
|
@@ -169,7 +175,7 @@ from bridgekit import plan
|
|
|
169
175
|
|
|
170
176
|
print(plan(
|
|
171
177
|
question="Does our new onboarding flow increase upgrade rates?",
|
|
172
|
-
data_description="1,000 users
|
|
178
|
+
data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
|
|
173
179
|
goal="causal inference"
|
|
174
180
|
))
|
|
175
181
|
```
|
|
@@ -227,13 +233,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
|
|
|
227
233
|
|
|
228
234
|
## Why a library and not a chatbot?
|
|
229
235
|
|
|
230
|
-
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call
|
|
231
|
-
|
|
232
|
-
---
|
|
233
|
-
|
|
234
|
-
## Is my data safe?
|
|
235
|
-
|
|
236
|
-
Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
|
|
236
|
+
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
|
|
237
237
|
|
|
238
238
|
---
|
|
239
239
|
|
|
@@ -241,9 +241,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
|
|
|
241
241
|
|
|
242
242
|
Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
|
|
243
243
|
|
|
244
|
-
- **Statistical approach suggester** — describe your problem in plain English, get the right test and why
|
|
245
244
|
- **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
|
|
246
245
|
- **Assumption checker** — state your analytical assumptions, get the ones you missed
|
|
246
|
+
- **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
|
|
247
247
|
|
|
248
248
|
Each tool is small, focused, and built for the way data scientists actually work.
|
|
249
249
|
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
DEFAULT_MODEL = "claude-opus-4-6"
|
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
import os
|
|
2
2
|
import anthropic
|
|
3
|
+
from .config import DEFAULT_MODEL
|
|
3
4
|
|
|
4
5
|
SYSTEM_PROMPT = """You are a senior statistician and data scientist advising a colleague on the right analytical approach for their problem.
|
|
5
6
|
|
|
@@ -60,7 +61,7 @@ def plan(question: str, data_description: str = None, goal: str = None) -> str:
|
|
|
60
61
|
|
|
61
62
|
client = anthropic.Anthropic(api_key=api_key)
|
|
62
63
|
message = client.messages.create(
|
|
63
|
-
model=
|
|
64
|
+
model=DEFAULT_MODEL,
|
|
64
65
|
max_tokens=1024,
|
|
65
66
|
system=SYSTEM_PROMPT,
|
|
66
67
|
messages=[
|
|
@@ -1,16 +1,16 @@
|
|
|
1
1
|
import os
|
|
2
2
|
import anthropic
|
|
3
|
+
from .config import DEFAULT_MODEL
|
|
3
4
|
|
|
4
5
|
SYSTEM_PROMPT = """You are a senior data scientist reviewing a colleague's analysis writeup.
|
|
5
6
|
You are direct, constructive, and specific. You do not flatter — you help people improve.
|
|
6
7
|
|
|
7
|
-
Evaluate the writeup across exactly these
|
|
8
|
+
Evaluate the writeup across exactly these four dimensions:
|
|
8
9
|
|
|
9
|
-
1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything?
|
|
10
|
-
2.
|
|
11
|
-
3.
|
|
12
|
-
4.
|
|
13
|
-
5. BUSINESS IMPACT — Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
|
|
10
|
+
1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything? Is it written for the right reader?
|
|
11
|
+
2. STATISTICAL RIGOR — Is there enough data to support the claim? Are sample sizes mentioned? Are confidence levels or uncertainty acknowledged?
|
|
12
|
+
3. METHODOLOGY — Is it clear why this analytical approach was chosen? Are alternatives considered or ruled out?
|
|
13
|
+
4. BUSINESS IMPACT — Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
|
|
14
14
|
|
|
15
15
|
For each dimension, give one of three ratings:
|
|
16
16
|
✅ STRONG — this dimension is handled well
|
|
@@ -29,16 +29,13 @@ BRIDGEKIT ANALYSIS REVIEW
|
|
|
29
29
|
1. CLARITY
|
|
30
30
|
[rating] [feedback]
|
|
31
31
|
|
|
32
|
-
2.
|
|
32
|
+
2. STATISTICAL RIGOR
|
|
33
33
|
[rating] [feedback]
|
|
34
34
|
|
|
35
|
-
3.
|
|
35
|
+
3. METHODOLOGY
|
|
36
36
|
[rating] [feedback]
|
|
37
37
|
|
|
38
|
-
4.
|
|
39
|
-
[rating] [feedback]
|
|
40
|
-
|
|
41
|
-
5. BUSINESS IMPACT
|
|
38
|
+
4. BUSINESS IMPACT
|
|
42
39
|
[rating] [feedback]
|
|
43
40
|
|
|
44
41
|
─────────────────────────────────────────
|
|
@@ -54,7 +51,7 @@ def evaluate(text: str) -> str:
|
|
|
54
51
|
text: Your analysis writeup as a plain string.
|
|
55
52
|
|
|
56
53
|
Returns:
|
|
57
|
-
Structured feedback across
|
|
54
|
+
Structured feedback across four dimensions.
|
|
58
55
|
"""
|
|
59
56
|
if not text or not text.strip():
|
|
60
57
|
raise ValueError("Text cannot be empty.")
|
|
@@ -68,7 +65,7 @@ def evaluate(text: str) -> str:
|
|
|
68
65
|
client = anthropic.Anthropic(api_key=api_key)
|
|
69
66
|
|
|
70
67
|
message = client.messages.create(
|
|
71
|
-
model=
|
|
68
|
+
model=DEFAULT_MODEL,
|
|
72
69
|
max_tokens=1024,
|
|
73
70
|
system=SYSTEM_PROMPT,
|
|
74
71
|
messages=[
|
|
@@ -1,6 +1,7 @@
|
|
|
1
1
|
import os
|
|
2
2
|
from pathlib import Path
|
|
3
3
|
import anthropic
|
|
4
|
+
from .config import DEFAULT_MODEL
|
|
4
5
|
import chromadb
|
|
5
6
|
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
|
|
6
7
|
|
|
@@ -106,7 +107,7 @@ def ask(question: str, source: str = None, text: str = None) -> str:
|
|
|
106
107
|
# Generate answer with Claude
|
|
107
108
|
anthropic_client = anthropic.Anthropic(api_key=api_key)
|
|
108
109
|
message = anthropic_client.messages.create(
|
|
109
|
-
model=
|
|
110
|
+
model=DEFAULT_MODEL,
|
|
110
111
|
max_tokens=1024,
|
|
111
112
|
system=(
|
|
112
113
|
"You are a senior data scientist answering questions based on analysis reports. "
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: bridgekit
|
|
3
|
-
Version: 0.3.
|
|
3
|
+
Version: 0.3.2
|
|
4
4
|
Summary: AI tools that make you a better data scientist, not a redundant one.
|
|
5
5
|
License: MIT
|
|
6
6
|
Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
|
|
@@ -32,54 +32,7 @@ Dynamic: license-file
|
|
|
32
32
|
|
|
33
33
|
Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
|
|
34
34
|
|
|
35
|
-
No new interface to learn.
|
|
36
|
-
|
|
37
|
-
---
|
|
38
|
-
|
|
39
|
-
## Tool #1: Analysis Reviewer
|
|
40
|
-
|
|
41
|
-
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
42
|
-
|
|
43
|
-
```python
|
|
44
|
-
from bridgekit import evaluate
|
|
45
|
-
|
|
46
|
-
text = """
|
|
47
|
-
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
48
|
-
upgrades. Users who engaged with the reporting feature within their first week
|
|
49
|
-
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
50
|
-
onboarding users to reporting as a growth lever.
|
|
51
|
-
"""
|
|
52
|
-
|
|
53
|
-
evaluate(text)
|
|
54
|
-
```
|
|
55
|
-
|
|
56
|
-
**Output:**
|
|
57
|
-
|
|
58
|
-
```
|
|
59
|
-
BRIDGEKIT FEEDBACK
|
|
60
|
-
─────────────────────────────────────────
|
|
61
|
-
|
|
62
|
-
✅ LOGIC
|
|
63
|
-
Your conclusion follows from the data. The 3x lift is a meaningful signal
|
|
64
|
-
worth acting on.
|
|
65
|
-
|
|
66
|
-
⚠️ WHAT'S MISSING
|
|
67
|
-
- Did you control for user intent? Users who explore reporting features may
|
|
68
|
-
already be power users likely to upgrade regardless.
|
|
69
|
-
- What's the sample size behind the 3x figure?
|
|
70
|
-
- Is this correlation or did you establish any causal direction?
|
|
71
|
-
|
|
72
|
-
🎯 WEAKEST POINT
|
|
73
|
-
"I recommend we prioritize onboarding to reporting" is a big leap from an
|
|
74
|
-
observational finding. A senior DS would push back on this in the meeting.
|
|
75
|
-
|
|
76
|
-
💡 LEVEL UP
|
|
77
|
-
Look into selection bias and how to address it — this analysis would be
|
|
78
|
-
significantly stronger with a matched cohort or an experiment to validate
|
|
79
|
-
the finding.
|
|
80
|
-
|
|
81
|
-
─────────────────────────────────────────
|
|
82
|
-
```
|
|
35
|
+
No new interface to learn. Just better work.
|
|
83
36
|
|
|
84
37
|
---
|
|
85
38
|
|
|
@@ -112,34 +65,87 @@ export ANTHROPIC_API_KEY=your_key_here
|
|
|
112
65
|
|
|
113
66
|
## Getting Started
|
|
114
67
|
|
|
115
|
-
|
|
68
|
+
Set your API key before launching Jupyter:
|
|
116
69
|
|
|
117
70
|
```bash
|
|
118
|
-
|
|
71
|
+
export ANTHROPIC_API_KEY=your_key_here
|
|
72
|
+
jupyter notebook
|
|
119
73
|
```
|
|
120
74
|
|
|
121
|
-
|
|
75
|
+
Then import whichever tool you need:
|
|
122
76
|
|
|
123
|
-
|
|
77
|
+
```python
|
|
78
|
+
from bridgekit import evaluate, plan, ask
|
|
79
|
+
```
|
|
124
80
|
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
81
|
+
**Review a writeup:**
|
|
82
|
+
```python
|
|
83
|
+
print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
**Plan your analytical approach:**
|
|
87
|
+
```python
|
|
88
|
+
print(plan("Did our onboarding flow reduce churn?"))
|
|
128
89
|
```
|
|
129
90
|
|
|
130
|
-
|
|
91
|
+
**Search past reports:**
|
|
92
|
+
```python
|
|
93
|
+
print(ask("What drove churn in Q3?", source="reports/"))
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
---
|
|
97
|
+
|
|
98
|
+
## Tool #1: Analysis Reviewer
|
|
99
|
+
|
|
100
|
+
Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
|
|
131
101
|
|
|
132
102
|
```python
|
|
133
103
|
from bridgekit import evaluate
|
|
134
104
|
|
|
135
105
|
text = """
|
|
136
|
-
|
|
106
|
+
I analyzed 90 days of user behavior data to understand what drives subscription
|
|
107
|
+
upgrades. Users who engaged with the reporting feature within their first week
|
|
108
|
+
were 3x more likely to upgrade within 30 days. I recommend we prioritize
|
|
109
|
+
onboarding users to reporting as a growth lever.
|
|
137
110
|
"""
|
|
138
111
|
|
|
139
112
|
print(evaluate(text))
|
|
140
113
|
```
|
|
141
114
|
|
|
142
|
-
|
|
115
|
+
**Output:**
|
|
116
|
+
|
|
117
|
+
```
|
|
118
|
+
BRIDGEKIT ANALYSIS REVIEW
|
|
119
|
+
─────────────────────────────────────────
|
|
120
|
+
|
|
121
|
+
1. CLARITY
|
|
122
|
+
✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
|
|
123
|
+
this and immediately understand the claim and the recommendation.
|
|
124
|
+
|
|
125
|
+
2. STATISTICAL RIGOR
|
|
126
|
+
⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
|
|
127
|
+
context is missing. How many users are in each group? What's the base
|
|
128
|
+
upgrade rate? There's no confidence interval or p-value, so we can't
|
|
129
|
+
assess whether this difference is statistically significant or noise.
|
|
130
|
+
|
|
131
|
+
3. METHODOLOGY
|
|
132
|
+
❌ MISSING — This reads as a pure correlation finding, but the
|
|
133
|
+
recommendation implies causation. Users who explore reporting in week one
|
|
134
|
+
may simply be more motivated or already closer to upgrading. Without
|
|
135
|
+
addressing the self-selection problem, this recommendation is not
|
|
136
|
+
defensible.
|
|
137
|
+
|
|
138
|
+
4. BUSINESS IMPACT
|
|
139
|
+
⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
|
|
140
|
+
the 3x lift into projected revenue or upgrade volume so leadership can
|
|
141
|
+
prioritize this against competing initiatives.
|
|
142
|
+
|
|
143
|
+
─────────────────────────────────────────
|
|
144
|
+
BOTTOM LINE
|
|
145
|
+
You must address the correlation-vs-causation gap before presenting —
|
|
146
|
+
otherwise you risk recommending an onboarding investment that targets a
|
|
147
|
+
symptom of upgrade intent rather than a cause of it.
|
|
148
|
+
```
|
|
143
149
|
|
|
144
150
|
---
|
|
145
151
|
|
|
@@ -197,7 +203,7 @@ from bridgekit import plan
|
|
|
197
203
|
|
|
198
204
|
print(plan(
|
|
199
205
|
question="Does our new onboarding flow increase upgrade rates?",
|
|
200
|
-
data_description="1,000 users
|
|
206
|
+
data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
|
|
201
207
|
goal="causal inference"
|
|
202
208
|
))
|
|
203
209
|
```
|
|
@@ -255,13 +261,7 @@ It also lives in your Jupyter notebook, so there's no context switching. You sta
|
|
|
255
261
|
|
|
256
262
|
## Why a library and not a chatbot?
|
|
257
263
|
|
|
258
|
-
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call
|
|
259
|
-
|
|
260
|
-
---
|
|
261
|
-
|
|
262
|
-
## Is my data safe?
|
|
263
|
-
|
|
264
|
-
Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
|
|
264
|
+
Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
|
|
265
265
|
|
|
266
266
|
---
|
|
267
267
|
|
|
@@ -269,9 +269,9 @@ Bridgekit only ever sees text you write yourself — your narrative, your conclu
|
|
|
269
269
|
|
|
270
270
|
Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
|
|
271
271
|
|
|
272
|
-
- **Statistical approach suggester** — describe your problem in plain English, get the right test and why
|
|
273
272
|
- **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
|
|
274
273
|
- **Assumption checker** — state your analytical assumptions, get the ones you missed
|
|
274
|
+
- **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
|
|
275
275
|
|
|
276
276
|
Each tool is small, focused, and built for the way data scientists actually work.
|
|
277
277
|
|
|
@@ -21,13 +21,11 @@ FAKE_RESPONSE = (
|
|
|
21
21
|
"─────────────────────────────────────────\n\n"
|
|
22
22
|
"1. CLARITY\n"
|
|
23
23
|
"✅ STRONG — The writeup is clear and jargon-free.\n\n"
|
|
24
|
-
"2.
|
|
25
|
-
"✅ STRONG — Written for the right audience.\n\n"
|
|
26
|
-
"3. STATISTICAL RIGOR\n"
|
|
24
|
+
"2. STATISTICAL RIGOR\n"
|
|
27
25
|
"⚠️ NEEDS WORK — Sample size is not mentioned.\n\n"
|
|
28
|
-
"
|
|
26
|
+
"3. METHODOLOGY\n"
|
|
29
27
|
"✅ STRONG — Approach is well explained.\n\n"
|
|
30
|
-
"
|
|
28
|
+
"4. BUSINESS IMPACT\n"
|
|
31
29
|
"❌ MISSING — No quantified outcomes.\n\n"
|
|
32
30
|
"─────────────────────────────────────────\n"
|
|
33
31
|
"BOTTOM LINE\n"
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|