bridgekit 0.3.1.tar.gz → 0.3.3.tar.gz
This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
- {bridgekit-0.3.1 → bridgekit-0.3.3}/PKG-INFO +122 -4
- {bridgekit-0.3.1 → bridgekit-0.3.3}/README.md +121 -3
- bridgekit-0.3.3/bridgekit/__init__.py +7 -0
- bridgekit-0.3.3/bridgekit/redteam.py +87 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit.egg-info/PKG-INFO +122 -4
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit.egg-info/SOURCES.txt +1 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/pyproject.toml +1 -1
- bridgekit-0.3.1/bridgekit/__init__.py +0 -6
- {bridgekit-0.3.1 → bridgekit-0.3.3}/LICENSE +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit/config.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit/planner.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit/reviewer.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit/search.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit.egg-info/dependency_links.txt +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit.egg-info/requires.txt +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/bridgekit.egg-info/top_level.txt +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/setup.cfg +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/tests/test_planner.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/tests/test_reviewer.py +0 -0
- {bridgekit-0.3.1 → bridgekit-0.3.3}/tests/test_search.py +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: bridgekit
-Version: 0.3.1
+Version: 0.3.3
 Summary: AI tools that make you a better data scientist, not a redundant one.
 License: MIT
 Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -26,10 +26,18 @@ Requires-Dist: pytest>=7.0.0; extra == "dev"
 Requires-Dist: pytest-mock>=3.0.0; extra == "dev"
 Dynamic: license-file
 
+<p align="center">
+<img src="assets/logo.png" alt="Bridgekit" width="200"/>
+</p>
+
 # Bridgekit
 
 **AI tools that make you a better data scientist, not a redundant one.**
 
+[](https://pypi.org/project/bridgekit/)
+[](https://pypi.org/project/bridgekit/)
+[](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
+
 Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
 
 No new interface to learn. Just better work.
@@ -75,7 +83,7 @@ jupyter notebook
 Then import whichever tool you need:
 
 ```python
-from bridgekit import evaluate, plan, ask
+from bridgekit import evaluate, plan, ask, redteam
 ```
 
 **Review a writeup:**
@@ -97,7 +105,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
 
 ## Tool #1: Analysis Reviewer
 
-Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
+Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
 
 ```python
 from bridgekit import evaluate
@@ -251,6 +259,116 @@ ALTERNATIVES
 
 ---
 
+## Tool #4: Red Team
+
+Simulate a skeptical stakeholder challenging your work to prepare you for the questions you hope no one asks — but now you're ready if they do.
+
+```python
+from bridgekit import redteam
+
+text = """
+I analyzed 90 days of user behavior data to understand what drives subscription
+upgrades. Users who engaged with the reporting feature within their first week
+were 3x more likely to upgrade within 30 days. I recommend we prioritize
+onboarding users to reporting as a growth lever.
+"""
+
+# Default — skeptical senior executive
+print(redteam(text))
+
+# Or specify a stakeholder
+print(redteam(text, stakeholder="VP of Engineering"))
+print(redteam(text, stakeholder="VP of Marketing"))
+```
+
+Same writeup, different attack angles:
+
+**VP of Engineering output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Engineering
+
+CRITIQUE 1: Classic correlation-causation conflation
+❯ "You're telling me to re-architect our onboarding flow based on a correlation?
+Users who dig into reporting in week one are probably already power users.
+You haven't shown me that exposing someone to reporting causes them to upgrade."
+WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
+variable, no matched cohort analysis.
+TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
+during onboarding. At minimum, a propensity-score-matched comparison controlling
+for user segment and acquisition channel.
+
+CRITIQUE 2: 3x on what base rate?
+❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
+re-prioritizing my engineering roadmap for that."
+WHY IT LANDS: Relative lift without base rates inflates significance. No way to
+evaluate whether this justifies the engineering investment.
+TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
+and rough engineering cost to frame ROI.
+
+CRITIQUE 3: No definition of "engaged with reporting"
+❯ "What does engaged actually mean? Clicked once? Built a custom report?
+If someone accidentally opened the tab, are they in your 3x cohort?"
+WHY IT LANDS: The threshold fundamentally changes the interpretation and
+the recommended intervention.
+TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
+across definitions.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"What specific onboarding action would you implement, and what engagement depth
+does it need to produce to replicate the effect — and have you tested whether
+you can actually get general users to that depth?"
+```
+
+**VP of Marketing output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Marketing
+
+CRITIQUE 1: Correlation masquerading as a growth lever
+❯ "You're telling me to restructure onboarding based on a correlation. How do
+you know that users who found reporting weren't already power users or
+higher-intent buyers who would have upgraded regardless?"
+WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
+feature in week one are likely more sophisticated or higher-intent. The 3x lift
+could entirely reflect who they already were, not what the feature did to them.
+TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
+propensity-matched cohort controlling for acquisition source, company size,
+and plan tier.
+
+CRITIQUE 2: Where's the segmentation by channel and campaign?
+❯ "I spent millions driving traffic from different channels last quarter.
+Did you even look at where these reporting-engaged users came from? If they're
+all from our enterprise webinar funnel, this isn't a product insight —
+it's a marketing attribution insight you've mislabeled."
+WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
+users came disproportionately from specific campaigns, the real lever might be
+acquiring more users like them, not changing onboarding.
+TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
+Show the effect holds within channels, not just across the blended population.
+
+CRITIQUE 3: 90 days is a dangerously thin window
+❯ "Was there a product launch, a pricing change, or a big campaign push in
+that window? How do I know this finding isn't an artifact of whatever else
+was happening in those specific 90 days?"
+WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
+analysis carries high risk of temporal bias.
+TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
+major product, pricing, or marketing events and show the effect persists
+after controlling for them.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"If I run an A/B test tomorrow where we force half of new users through a
+reporting-focused onboarding flow, what conversion lift are you personally
+willing to commit to — and what's your confidence interval on that estimate?"
+```
+
+---
+
 ## Why not just use Claude?
 
 You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
@@ -267,7 +385,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
 
 ## What's next?
 
-Bridgekit is a suite, not a one-off.
+Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
 
 - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
 - **Assumption checker** — state your analytical assumptions, get the ones you missed
@@ -1,7 +1,15 @@
+<p align="center">
+<img src="assets/logo.png" alt="Bridgekit" width="200"/>
+</p>
+
 # Bridgekit
 
 **AI tools that make you a better data scientist, not a redundant one.**
 
+[](https://pypi.org/project/bridgekit/)
+[](https://pypi.org/project/bridgekit/)
+[](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
+
 Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
 
 No new interface to learn. Just better work.
@@ -47,7 +55,7 @@ jupyter notebook
 Then import whichever tool you need:
 
 ```python
-from bridgekit import evaluate, plan, ask
+from bridgekit import evaluate, plan, ask, redteam
 ```
 
 **Review a writeup:**
@@ -69,7 +77,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
 
 ## Tool #1: Analysis Reviewer
 
-Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
+Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
 
 ```python
 from bridgekit import evaluate
@@ -223,6 +231,116 @@ ALTERNATIVES
 
 ---
 
+## Tool #4: Red Team
+
+Simulate a skeptical stakeholder challenging your work to prepare you for the questions you hope no one asks — but now you're ready if they do.
+
+```python
+from bridgekit import redteam
+
+text = """
+I analyzed 90 days of user behavior data to understand what drives subscription
+upgrades. Users who engaged with the reporting feature within their first week
+were 3x more likely to upgrade within 30 days. I recommend we prioritize
+onboarding users to reporting as a growth lever.
+"""
+
+# Default — skeptical senior executive
+print(redteam(text))
+
+# Or specify a stakeholder
+print(redteam(text, stakeholder="VP of Engineering"))
+print(redteam(text, stakeholder="VP of Marketing"))
+```
+
+Same writeup, different attack angles:
+
+**VP of Engineering output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Engineering
+
+CRITIQUE 1: Classic correlation-causation conflation
+❯ "You're telling me to re-architect our onboarding flow based on a correlation?
+Users who dig into reporting in week one are probably already power users.
+You haven't shown me that exposing someone to reporting causes them to upgrade."
+WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
+variable, no matched cohort analysis.
+TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
+during onboarding. At minimum, a propensity-score-matched comparison controlling
+for user segment and acquisition channel.
+
+CRITIQUE 2: 3x on what base rate?
+❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
+re-prioritizing my engineering roadmap for that."
+WHY IT LANDS: Relative lift without base rates inflates significance. No way to
+evaluate whether this justifies the engineering investment.
+TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
+and rough engineering cost to frame ROI.
+
+CRITIQUE 3: No definition of "engaged with reporting"
+❯ "What does engaged actually mean? Clicked once? Built a custom report?
+If someone accidentally opened the tab, are they in your 3x cohort?"
+WHY IT LANDS: The threshold fundamentally changes the interpretation and
+the recommended intervention.
+TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
+across definitions.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"What specific onboarding action would you implement, and what engagement depth
+does it need to produce to replicate the effect — and have you tested whether
+you can actually get general users to that depth?"
+```
+
+**VP of Marketing output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Marketing
+
+CRITIQUE 1: Correlation masquerading as a growth lever
+❯ "You're telling me to restructure onboarding based on a correlation. How do
+you know that users who found reporting weren't already power users or
+higher-intent buyers who would have upgraded regardless?"
+WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
+feature in week one are likely more sophisticated or higher-intent. The 3x lift
+could entirely reflect who they already were, not what the feature did to them.
+TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
+propensity-matched cohort controlling for acquisition source, company size,
+and plan tier.
+
+CRITIQUE 2: Where's the segmentation by channel and campaign?
+❯ "I spent millions driving traffic from different channels last quarter.
+Did you even look at where these reporting-engaged users came from? If they're
+all from our enterprise webinar funnel, this isn't a product insight —
+it's a marketing attribution insight you've mislabeled."
+WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
+users came disproportionately from specific campaigns, the real lever might be
+acquiring more users like them, not changing onboarding.
+TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
+Show the effect holds within channels, not just across the blended population.
+
+CRITIQUE 3: 90 days is a dangerously thin window
+❯ "Was there a product launch, a pricing change, or a big campaign push in
+that window? How do I know this finding isn't an artifact of whatever else
+was happening in those specific 90 days?"
+WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
+analysis carries high risk of temporal bias.
+TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
+major product, pricing, or marketing events and show the effect persists
+after controlling for them.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"If I run an A/B test tomorrow where we force half of new users through a
+reporting-focused onboarding flow, what conversion lift are you personally
+willing to commit to — and what's your confidence interval on that estimate?"
+```
+
+---
+
 ## Why not just use Claude?
 
 You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
@@ -239,7 +357,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
 
 ## What's next?
 
-Bridgekit is a suite, not a one-off.
+Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
 
 - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
 - **Assumption checker** — state your analytical assumptions, get the ones you missed
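The "3x on what base rate?" critique in the sample output turns on the gap between relative and absolute lift. A small worked sketch of that arithmetic, using the 0.5% and 1.5% rates quoted in the critique; the cohort size of 10,000 and the helper name `lift_summary` are assumptions made up for illustration.

```python
def lift_summary(base_rate: float, treated_rate: float, cohort_size: int) -> dict:
    """Contrast relative lift with absolute lift for two upgrade rates.

    Rates are fractions (0.005 means 0.5%). cohort_size is the number of
    users the treated rate would apply to (hypothetical here).
    """
    if not 0.0 < base_rate <= 1.0:
        raise ValueError("base_rate must be in (0, 1]")
    if not 0.0 <= treated_rate <= 1.0:
        raise ValueError("treated_rate must be in [0, 1]")
    if cohort_size < 0:
        raise ValueError("cohort_size must be non-negative")

    absolute_lift = treated_rate - base_rate
    return {
        "relative_lift": treated_rate / base_rate,      # the headline "3x"
        "absolute_lift_pp": absolute_lift * 100,        # in percentage points
        "extra_upgrades": absolute_lift * cohort_size,  # additional upgrades
    }


# Rates from the sample critique; the cohort size is an assumption.
summary = lift_summary(base_rate=0.005, treated_rate=0.015, cohort_size=10_000)
print(f"relative lift: {summary['relative_lift']:.1f}x")
print(f"absolute lift: {summary['absolute_lift_pp']:.1f} percentage points")
print(f"extra upgrades per 10,000 users: {summary['extra_upgrades']:.0f}")
```

Under these numbers the same finding reads as a 3x lift or as one percentage point, roughly 100 extra upgrades per 10,000 users, which is why the critique asks for absolute rates and cohort sizes alongside the ratio.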
@@ -0,0 +1,87 @@
+import os
+import anthropic
+from .config import DEFAULT_MODEL
+
+DEFAULT_STAKEHOLDER = "a skeptical senior executive with no tolerance for weak methodology, unsupported claims, or vague business impact"
+
+SYSTEM_PROMPT_TEMPLATE = """You are {stakeholder}.
+
+You have just read a data science analysis writeup. You are not here to be helpful — you are here to find every weakness, challenge every assumption, and ask the questions that will expose flaws in the analysis.
+
+Your job is to identify the 3 hardest critiques a skeptical stakeholder would make in a readout. For each critique:
+- State the challenge directly and bluntly, as the stakeholder would say it
+- Explain why this is a legitimate weak point in the analysis
+- State what would need to be true for this critique to be addressed
+
+Format your response exactly like this:
+
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: {stakeholder_label}
+
+CRITIQUE 1: [short title]
+❯ [direct challenge as the stakeholder would say it, in quotes]
+WHY IT LANDS: [1-2 sentences on why this is a real weakness]
+TO ADDRESS: [what would need to be shown or proven]
+
+CRITIQUE 2: [short title]
+❯ [direct challenge, in quotes]
+WHY IT LANDS: [1-2 sentences]
+TO ADDRESS: [what would need to be shown or proven]
+
+CRITIQUE 3: [short title]
+❯ [direct challenge, in quotes]
+WHY IT LANDS: [1-2 sentences]
+TO ADDRESS: [what would need to be shown or proven]
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+[The single question the stakeholder would ask that would be hardest to answer on the spot]
+"""
+
+
+def redteam(text: str, stakeholder: str = None) -> str:
+    """
+    Red-team a data science analysis writeup from the perspective of a skeptical stakeholder.
+
+    Args:
+        text: Your analysis writeup as a plain string.
+        stakeholder: Optional. The skeptical stakeholder role (e.g. "VP of Finance",
+            "skeptical board member", "Chief Revenue Officer").
+            Defaults to a generic skeptical senior executive.
+
+    Returns:
+        The 3-5 hardest critiques the stakeholder would make, plus the single
+        hardest question to answer on the spot.
+    """
+    if not text or not text.strip():
+        raise ValueError("Text cannot be empty.")
+
+    api_key = os.environ.get("ANTHROPIC_API_KEY")
+    if not api_key:
+        raise EnvironmentError(
+            "ANTHROPIC_API_KEY not found. Set it with: export ANTHROPIC_API_KEY=your_key_here"
+        )
+
+    stakeholder_label = stakeholder if stakeholder else "Skeptical Senior Executive"
+    stakeholder_desc = stakeholder if stakeholder else DEFAULT_STAKEHOLDER
+
+    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(
+        stakeholder=stakeholder_desc,
+        stakeholder_label=stakeholder_label
+    )
+
+    client = anthropic.Anthropic(api_key=api_key)
+    message = client.messages.create(
+        model=DEFAULT_MODEL,
+        max_tokens=1024,
+        system=system_prompt,
+        messages=[
+            {
+                "role": "user",
+                "content": f"Red-team this analysis writeup:\n\n{text}"
+            }
+        ]
+    )
+
+    return message.content[0].text
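The new module resolves two values from the optional `stakeholder` argument: a description that fills the "You are ..." line and a label that fills the `STAKEHOLDER:` line. A minimal sketch of that resolution, runnable without an API key; `build_system_prompt` and the shortened `TEMPLATE` are stand-ins invented here, while the fallback strings are copied from the diff.

```python
from typing import Optional

# Copied from the diff above.
DEFAULT_STAKEHOLDER = (
    "a skeptical senior executive with no tolerance for weak methodology, "
    "unsupported claims, or vague business impact"
)

# Shortened stand-in for SYSTEM_PROMPT_TEMPLATE (assumption: only the two
# placeholders matter for this illustration).
TEMPLATE = "You are {stakeholder}.\n\nSTAKEHOLDER: {stakeholder_label}\n"


def build_system_prompt(stakeholder: Optional[str] = None) -> str:
    """Mirror how redteam() picks the label and description before formatting."""
    label = stakeholder if stakeholder else "Skeptical Senior Executive"
    description = stakeholder if stakeholder else DEFAULT_STAKEHOLDER
    return TEMPLATE.format(stakeholder=description, stakeholder_label=label)


print(build_system_prompt())
print(build_system_prompt("VP of Engineering"))
```

With a custom stakeholder, the same string serves as both description and label, so the prompt opens with "You are VP of Engineering." An empty string is falsy and falls back to the default, just like `None`.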
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: bridgekit
-Version: 0.3.1
+Version: 0.3.3
 Summary: AI tools that make you a better data scientist, not a redundant one.
 License: MIT
 Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -26,10 +26,18 @@ Requires-Dist: pytest>=7.0.0; extra == "dev"
 Requires-Dist: pytest-mock>=3.0.0; extra == "dev"
 Dynamic: license-file
 
+<p align="center">
+<img src="assets/logo.png" alt="Bridgekit" width="200"/>
+</p>
+
 # Bridgekit
 
 **AI tools that make you a better data scientist, not a redundant one.**
 
+[](https://pypi.org/project/bridgekit/)
+[](https://pypi.org/project/bridgekit/)
+[](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
+
 Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
 
 No new interface to learn. Just better work.
@@ -75,7 +83,7 @@ jupyter notebook
 Then import whichever tool you need:
 
 ```python
-from bridgekit import evaluate, plan, ask
+from bridgekit import evaluate, plan, ask, redteam
 ```
 
 **Review a writeup:**
@@ -97,7 +105,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
 
 ## Tool #1: Analysis Reviewer
 
-Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
+Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
 
 ```python
 from bridgekit import evaluate
@@ -251,6 +259,116 @@ ALTERNATIVES
 
 ---
 
+## Tool #4: Red Team
+
+Simulate a skeptical stakeholder challenging your work to prepare you for the questions you hope no one asks — but now you're ready if they do.
+
+```python
+from bridgekit import redteam
+
+text = """
+I analyzed 90 days of user behavior data to understand what drives subscription
+upgrades. Users who engaged with the reporting feature within their first week
+were 3x more likely to upgrade within 30 days. I recommend we prioritize
+onboarding users to reporting as a growth lever.
+"""
+
+# Default — skeptical senior executive
+print(redteam(text))
+
+# Or specify a stakeholder
+print(redteam(text, stakeholder="VP of Engineering"))
+print(redteam(text, stakeholder="VP of Marketing"))
+```
+
+Same writeup, different attack angles:
+
+**VP of Engineering output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Engineering
+
+CRITIQUE 1: Classic correlation-causation conflation
+❯ "You're telling me to re-architect our onboarding flow based on a correlation?
+Users who dig into reporting in week one are probably already power users.
+You haven't shown me that exposing someone to reporting causes them to upgrade."
+WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
+variable, no matched cohort analysis.
+TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
+during onboarding. At minimum, a propensity-score-matched comparison controlling
+for user segment and acquisition channel.
+
+CRITIQUE 2: 3x on what base rate?
+❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
+re-prioritizing my engineering roadmap for that."
+WHY IT LANDS: Relative lift without base rates inflates significance. No way to
+evaluate whether this justifies the engineering investment.
+TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
+and rough engineering cost to frame ROI.
+
+CRITIQUE 3: No definition of "engaged with reporting"
+❯ "What does engaged actually mean? Clicked once? Built a custom report?
+If someone accidentally opened the tab, are they in your 3x cohort?"
+WHY IT LANDS: The threshold fundamentally changes the interpretation and
+the recommended intervention.
+TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
+across definitions.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"What specific onboarding action would you implement, and what engagement depth
+does it need to produce to replicate the effect — and have you tested whether
+you can actually get general users to that depth?"
+```
+
+**VP of Marketing output:**
+```
+BRIDGEKIT RED TEAM
+─────────────────────────────────────────
+STAKEHOLDER: VP of Marketing
+
+CRITIQUE 1: Correlation masquerading as a growth lever
+❯ "You're telling me to restructure onboarding based on a correlation. How do
+you know that users who found reporting weren't already power users or
+higher-intent buyers who would have upgraded regardless?"
+WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
+feature in week one are likely more sophisticated or higher-intent. The 3x lift
+could entirely reflect who they already were, not what the feature did to them.
+TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
+propensity-matched cohort controlling for acquisition source, company size,
+and plan tier.
+
+CRITIQUE 2: Where's the segmentation by channel and campaign?
+❯ "I spent millions driving traffic from different channels last quarter.
+Did you even look at where these reporting-engaged users came from? If they're
+all from our enterprise webinar funnel, this isn't a product insight —
+it's a marketing attribution insight you've mislabeled."
+WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
+users came disproportionately from specific campaigns, the real lever might be
+acquiring more users like them, not changing onboarding.
+TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
+Show the effect holds within channels, not just across the blended population.
+
+CRITIQUE 3: 90 days is a dangerously thin window
+❯ "Was there a product launch, a pricing change, or a big campaign push in
+that window? How do I know this finding isn't an artifact of whatever else
+was happening in those specific 90 days?"
+WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
+analysis carries high risk of temporal bias.
+TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
+major product, pricing, or marketing events and show the effect persists
+after controlling for them.
+
+─────────────────────────────────────────
+HARDEST QUESTION TO ANSWER
+"If I run an A/B test tomorrow where we force half of new users through a
+reporting-focused onboarding flow, what conversion lift are you personally
+willing to commit to — and what's your confidence interval on that estimate?"
+```
+
+---
+
 ## Why not just use Claude?
 
 You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
@@ -267,7 +385,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
 
 ## What's next?
 
-Bridgekit is a suite, not a one-off.
+Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
 
 - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
 - **Assumption checker** — state your analytical assumptions, get the ones you missed
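The README example calls `redteam` once per stakeholder on the same writeup. A small sketch of collecting those reports in one pass; `redteam_many` and its `critic` parameter are hypothetical, added so the loop can be exercised with a stand-in function instead of a live API call.

```python
from typing import Callable, Dict, Iterable, Optional


def redteam_many(
    text: str,
    stakeholders: Iterable[str],
    critic: Optional[Callable[..., str]] = None,
) -> Dict[str, str]:
    """Run the red-team review once per stakeholder and key reports by role.

    critic is any callable with the signature critic(text, stakeholder=...).
    When omitted, the real bridgekit.redteam is imported, which needs the
    package installed and ANTHROPIC_API_KEY set.
    """
    if critic is None:
        from bridgekit import redteam as critic  # real, billable API calls

    reports: Dict[str, str] = {}
    for role in stakeholders:
        reports[role] = critic(text, stakeholder=role)
    return reports


# Offline usage with a stand-in critic (no network access needed).
def fake_critic(text: str, stakeholder: Optional[str] = None) -> str:
    return f"STAKEHOLDER: {stakeholder}"


reports = redteam_many(
    "Users who engaged with reporting were 3x more likely to upgrade.",
    ["VP of Engineering", "VP of Marketing"],
    critic=fake_critic,
)
for role, report in reports.items():
    print(role, "->", report)
```

Keeping the callable injectable is one way to unit-test notebook code around the tool without spending tokens; each real call is a separate request, so cost scales with the number of stakeholders.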
File without changes