bridgekit 0.3.2__tar.gz → 0.3.3__tar.gz

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: bridgekit
3
- Version: 0.3.2
3
+ Version: 0.3.3
4
4
  Summary: AI tools that make you a better data scientist, not a redundant one.
5
5
  License: MIT
6
6
  Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -26,10 +26,18 @@ Requires-Dist: pytest>=7.0.0; extra == "dev"
26
26
  Requires-Dist: pytest-mock>=3.0.0; extra == "dev"
27
27
  Dynamic: license-file
28
28
 
29
+ <p align="center">
30
+ <img src="assets/logo.png" alt="Bridgekit" width="200"/>
31
+ </p>
32
+
29
33
  # Bridgekit
30
34
 
31
35
  **AI tools that make you a better data scientist, not a redundant one.**
32
36
 
37
+ [![PyPI](https://img.shields.io/pypi/v/bridgekit)](https://pypi.org/project/bridgekit/)
38
+ [![Downloads](https://img.shields.io/pypi/dm/bridgekit)](https://pypi.org/project/bridgekit/)
39
+ [![License](https://img.shields.io/pypi/l/bridgekit)](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
40
+
33
41
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
34
42
 
35
43
  No new interface to learn. Just better work.
@@ -75,7 +83,7 @@ jupyter notebook
75
83
  Then import whichever tool you need:
76
84
 
77
85
  ```python
78
- from bridgekit import evaluate, plan, ask
86
+ from bridgekit import evaluate, plan, ask, redteam
79
87
  ```
80
88
 
81
89
  **Review a writeup:**
@@ -97,7 +105,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
97
105
 
98
106
  ## Tool #1: Analysis Reviewer
99
107
 
100
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
108
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
101
109
 
102
110
  ```python
103
111
  from bridgekit import evaluate
@@ -251,6 +259,116 @@ ALTERNATIVES
251
259
 
252
260
  ---
253
261
 
262
+ ## Tool #4: Red Team
263
+
264
+ Simulate a skeptical stakeholder challenging your work, so you are prepared for the questions you hope no one asks.
265
+
266
+ ```python
267
+ from bridgekit import redteam
268
+
269
+ text = """
270
+ I analyzed 90 days of user behavior data to understand what drives subscription
271
+ upgrades. Users who engaged with the reporting feature within their first week
272
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
273
+ onboarding users to reporting as a growth lever.
274
+ """
275
+
276
+ # Default — skeptical senior executive
277
+ print(redteam(text))
278
+
279
+ # Or specify a stakeholder
280
+ print(redteam(text, stakeholder="VP of Engineering"))
281
+ print(redteam(text, stakeholder="VP of Marketing"))
282
+ ```
283
+
284
+ Same writeup, different attack angles:
285
+
286
+ **VP of Engineering output:**
287
+ ```
288
+ BRIDGEKIT RED TEAM
289
+ ─────────────────────────────────────────
290
+ STAKEHOLDER: VP of Engineering
291
+
292
+ CRITIQUE 1: Classic correlation-causation conflation
293
+ ❯ "You're telling me to re-architect our onboarding flow based on a correlation?
294
+ Users who dig into reporting in week one are probably already power users.
295
+ You haven't shown me that exposing someone to reporting causes them to upgrade."
296
+ WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
297
+ variable, no matched cohort analysis.
298
+ TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
299
+ during onboarding. At minimum, a propensity-score-matched comparison controlling
300
+ for user segment and acquisition channel.
301
+
302
+ CRITIQUE 2: 3x on what base rate?
303
+ ❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
304
+ re-prioritizing my engineering roadmap for that."
305
+ WHY IT LANDS: Relative lift without base rates inflates significance. No way to
306
+ evaluate whether this justifies the engineering investment.
307
+ TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
308
+ and rough engineering cost to frame ROI.
309
+
310
+ CRITIQUE 3: No definition of "engaged with reporting"
311
+ ❯ "What does engaged actually mean? Clicked once? Built a custom report?
312
+ If someone accidentally opened the tab, are they in your 3x cohort?"
313
+ WHY IT LANDS: The threshold fundamentally changes the interpretation and
314
+ the recommended intervention.
315
+ TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
316
+ across definitions.
317
+
318
+ ─────────────────────────────────────────
319
+ HARDEST QUESTION TO ANSWER
320
+ "What specific onboarding action would you implement, and what engagement depth
321
+ does it need to produce to replicate the effect — and have you tested whether
322
+ you can actually get general users to that depth?"
323
+ ```
324
+
325
+ **VP of Marketing output:**
326
+ ```
327
+ BRIDGEKIT RED TEAM
328
+ ─────────────────────────────────────────
329
+ STAKEHOLDER: VP of Marketing
330
+
331
+ CRITIQUE 1: Correlation masquerading as a growth lever
332
+ ❯ "You're telling me to restructure onboarding based on a correlation. How do
333
+ you know that users who found reporting weren't already power users or
334
+ higher-intent buyers who would have upgraded regardless?"
335
+ WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
336
+ feature in week one are likely more sophisticated or higher-intent. The 3x lift
337
+ could entirely reflect who they already were, not what the feature did to them.
338
+ TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
339
+ propensity-matched cohort controlling for acquisition source, company size,
340
+ and plan tier.
341
+
342
+ CRITIQUE 2: Where's the segmentation by channel and campaign?
343
+ ❯ "I spent millions driving traffic from different channels last quarter.
344
+ Did you even look at where these reporting-engaged users came from? If they're
345
+ all from our enterprise webinar funnel, this isn't a product insight —
346
+ it's a marketing attribution insight you've mislabeled."
347
+ WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
348
+ users came disproportionately from specific campaigns, the real lever might be
349
+ acquiring more users like them, not changing onboarding.
350
+ TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
351
+ Show the effect holds within channels, not just across the blended population.
352
+
353
+ CRITIQUE 3: 90 days is a dangerously thin window
354
+ ❯ "Was there a product launch, a pricing change, or a big campaign push in
355
+ that window? How do I know this finding isn't an artifact of whatever else
356
+ was happening in those specific 90 days?"
357
+ WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
358
+ analysis carries high risk of temporal bias.
359
+ TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
360
+ major product, pricing, or marketing events and show the effect persists
361
+ after controlling for them.
362
+
363
+ ─────────────────────────────────────────
364
+ HARDEST QUESTION TO ANSWER
365
+ "If I run an A/B test tomorrow where we force half of new users through a
366
+ reporting-focused onboarding flow, what conversion lift are you personally
367
+ willing to commit to — and what's your confidence interval on that estimate?"
368
+ ```
369
+
370
+ ---
371
+
254
372
  ## Why not just use Claude?
255
373
 
256
374
  You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
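Note: the Red Team report added in this hunk is plain text with a fixed layout (`CRITIQUE n: title`, `WHY IT LANDS:`, `TO ADDRESS:`). A minimal sketch of pulling the critique titles out of such a report follows. The `parse_critiques` helper is hypothetical and not part of bridgekit, and it assumes the model followed the requested layout, which free-text output does not guarantee.

```python
import re
from typing import Dict, List


def parse_critiques(report: str) -> List[Dict[str, str]]:
    """Hypothetical helper: extract critique numbers and titles from a report string."""
    # Match header lines such as "CRITIQUE 2: 3x on what base rate?"
    pattern = re.compile(r"^CRITIQUE (\d+): (.+)$", re.MULTILINE)
    return [
        {"number": match.group(1), "title": match.group(2).strip()}
        for match in pattern.finditer(report)
    ]


# A shortened sample in the layout shown in the README (not real model output).
sample = """BRIDGEKIT RED TEAM
STAKEHOLDER: VP of Engineering

CRITIQUE 1: Classic correlation-causation conflation
WHY IT LANDS: No causal identification strategy.

CRITIQUE 2: 3x on what base rate?
WHY IT LANDS: Relative lift without base rates inflates significance.
"""

critiques = parse_critiques(sample)
for item in critiques:
    print(item["number"], item["title"])
```

If a report deviates from the layout, the helper simply returns fewer entries, so a caller may want to check the list length before relying on it.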
@@ -267,7 +385,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
267
385
 
268
386
  ## What's next?
269
387
 
270
- Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
388
+ Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
271
389
 
272
390
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
273
391
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
@@ -1,7 +1,15 @@
1
+ <p align="center">
2
+ <img src="assets/logo.png" alt="Bridgekit" width="200"/>
3
+ </p>
4
+
1
5
  # Bridgekit
2
6
 
3
7
  **AI tools that make you a better data scientist, not a redundant one.**
4
8
 
9
+ [![PyPI](https://img.shields.io/pypi/v/bridgekit)](https://pypi.org/project/bridgekit/)
10
+ [![Downloads](https://img.shields.io/pypi/dm/bridgekit)](https://pypi.org/project/bridgekit/)
11
+ [![License](https://img.shields.io/pypi/l/bridgekit)](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
12
+
5
13
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
6
14
 
7
15
  No new interface to learn. Just better work.
@@ -47,7 +55,7 @@ jupyter notebook
47
55
  Then import whichever tool you need:
48
56
 
49
57
  ```python
50
- from bridgekit import evaluate, plan, ask
58
+ from bridgekit import evaluate, plan, ask, redteam
51
59
  ```
52
60
 
53
61
  **Review a writeup:**
@@ -69,7 +77,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
69
77
 
70
78
  ## Tool #1: Analysis Reviewer
71
79
 
72
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
80
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
73
81
 
74
82
  ```python
75
83
  from bridgekit import evaluate
@@ -223,6 +231,116 @@ ALTERNATIVES
223
231
 
224
232
  ---
225
233
 
234
+ ## Tool #4: Red Team
235
+
236
+ Simulate a skeptical stakeholder challenging your work, so you are prepared for the questions you hope no one asks.
237
+
238
+ ```python
239
+ from bridgekit import redteam
240
+
241
+ text = """
242
+ I analyzed 90 days of user behavior data to understand what drives subscription
243
+ upgrades. Users who engaged with the reporting feature within their first week
244
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
245
+ onboarding users to reporting as a growth lever.
246
+ """
247
+
248
+ # Default — skeptical senior executive
249
+ print(redteam(text))
250
+
251
+ # Or specify a stakeholder
252
+ print(redteam(text, stakeholder="VP of Engineering"))
253
+ print(redteam(text, stakeholder="VP of Marketing"))
254
+ ```
255
+
256
+ Same writeup, different attack angles:
257
+
258
+ **VP of Engineering output:**
259
+ ```
260
+ BRIDGEKIT RED TEAM
261
+ ─────────────────────────────────────────
262
+ STAKEHOLDER: VP of Engineering
263
+
264
+ CRITIQUE 1: Classic correlation-causation conflation
265
+ ❯ "You're telling me to re-architect our onboarding flow based on a correlation?
266
+ Users who dig into reporting in week one are probably already power users.
267
+ You haven't shown me that exposing someone to reporting causes them to upgrade."
268
+ WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
269
+ variable, no matched cohort analysis.
270
+ TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
271
+ during onboarding. At minimum, a propensity-score-matched comparison controlling
272
+ for user segment and acquisition channel.
273
+
274
+ CRITIQUE 2: 3x on what base rate?
275
+ ❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
276
+ re-prioritizing my engineering roadmap for that."
277
+ WHY IT LANDS: Relative lift without base rates inflates significance. No way to
278
+ evaluate whether this justifies the engineering investment.
279
+ TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
280
+ and rough engineering cost to frame ROI.
281
+
282
+ CRITIQUE 3: No definition of "engaged with reporting"
283
+ ❯ "What does engaged actually mean? Clicked once? Built a custom report?
284
+ If someone accidentally opened the tab, are they in your 3x cohort?"
285
+ WHY IT LANDS: The threshold fundamentally changes the interpretation and
286
+ the recommended intervention.
287
+ TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
288
+ across definitions.
289
+
290
+ ─────────────────────────────────────────
291
+ HARDEST QUESTION TO ANSWER
292
+ "What specific onboarding action would you implement, and what engagement depth
293
+ does it need to produce to replicate the effect — and have you tested whether
294
+ you can actually get general users to that depth?"
295
+ ```
296
+
297
+ **VP of Marketing output:**
298
+ ```
299
+ BRIDGEKIT RED TEAM
300
+ ─────────────────────────────────────────
301
+ STAKEHOLDER: VP of Marketing
302
+
303
+ CRITIQUE 1: Correlation masquerading as a growth lever
304
+ ❯ "You're telling me to restructure onboarding based on a correlation. How do
305
+ you know that users who found reporting weren't already power users or
306
+ higher-intent buyers who would have upgraded regardless?"
307
+ WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
308
+ feature in week one are likely more sophisticated or higher-intent. The 3x lift
309
+ could entirely reflect who they already were, not what the feature did to them.
310
+ TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
311
+ propensity-matched cohort controlling for acquisition source, company size,
312
+ and plan tier.
313
+
314
+ CRITIQUE 2: Where's the segmentation by channel and campaign?
315
+ ❯ "I spent millions driving traffic from different channels last quarter.
316
+ Did you even look at where these reporting-engaged users came from? If they're
317
+ all from our enterprise webinar funnel, this isn't a product insight —
318
+ it's a marketing attribution insight you've mislabeled."
319
+ WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
320
+ users came disproportionately from specific campaigns, the real lever might be
321
+ acquiring more users like them, not changing onboarding.
322
+ TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
323
+ Show the effect holds within channels, not just across the blended population.
324
+
325
+ CRITIQUE 3: 90 days is a dangerously thin window
326
+ ❯ "Was there a product launch, a pricing change, or a big campaign push in
327
+ that window? How do I know this finding isn't an artifact of whatever else
328
+ was happening in those specific 90 days?"
329
+ WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
330
+ analysis carries high risk of temporal bias.
331
+ TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
332
+ major product, pricing, or marketing events and show the effect persists
333
+ after controlling for them.
334
+
335
+ ─────────────────────────────────────────
336
+ HARDEST QUESTION TO ANSWER
337
+ "If I run an A/B test tomorrow where we force half of new users through a
338
+ reporting-focused onboarding flow, what conversion lift are you personally
339
+ willing to commit to — and what's your confidence interval on that estimate?"
340
+ ```
341
+
342
+ ---
343
+
226
344
  ## Why not just use Claude?
227
345
 
228
346
  You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
@@ -239,7 +357,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
239
357
 
240
358
  ## What's next?
241
359
 
242
- Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
360
+ Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
243
361
 
244
362
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
245
363
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
@@ -0,0 +1,7 @@
+ from .reviewer import evaluate
+ from .search import ask
+ from .planner import plan
+ from .redteam import redteam
+
+ __version__ = "0.3.3"
+ __all__ = ["evaluate", "ask", "plan", "redteam"]
@@ -0,0 +1,87 @@
+ import os
+ import anthropic
+ from .config import DEFAULT_MODEL
+
+ DEFAULT_STAKEHOLDER = "a skeptical senior executive with no tolerance for weak methodology, unsupported claims, or vague business impact"
+
+ SYSTEM_PROMPT_TEMPLATE = """You are {stakeholder}.
+
+ You have just read a data science analysis writeup. You are not here to be helpful — you are here to find every weakness, challenge every assumption, and ask the questions that will expose flaws in the analysis.
+
+ Your job is to identify the 3 hardest critiques a skeptical stakeholder would make in a readout. For each critique:
+ - State the challenge directly and bluntly, as the stakeholder would say it
+ - Explain why this is a legitimate weak point in the analysis
+ - State what would need to be true for this critique to be addressed
+
+ Format your response exactly like this:
+
+ BRIDGEKIT RED TEAM
+ ─────────────────────────────────────────
+ STAKEHOLDER: {stakeholder_label}
+
+ CRITIQUE 1: [short title]
+ ❯ [direct challenge as the stakeholder would say it, in quotes]
+ WHY IT LANDS: [1-2 sentences on why this is a real weakness]
+ TO ADDRESS: [what would need to be shown or proven]
+
+ CRITIQUE 2: [short title]
+ ❯ [direct challenge, in quotes]
+ WHY IT LANDS: [1-2 sentences]
+ TO ADDRESS: [what would need to be shown or proven]
+
+ CRITIQUE 3: [short title]
+ ❯ [direct challenge, in quotes]
+ WHY IT LANDS: [1-2 sentences]
+ TO ADDRESS: [what would need to be shown or proven]
+
+ ─────────────────────────────────────────
+ HARDEST QUESTION TO ANSWER
+ [The single question the stakeholder would ask that would be hardest to answer on the spot]
+ """
+
+
+ def redteam(text: str, stakeholder: str = None) -> str:
+     """
+     Red-team a data science analysis writeup from the perspective of a skeptical stakeholder.
+
+     Args:
+         text: Your analysis writeup as a plain string.
+         stakeholder: Optional. The skeptical stakeholder role (e.g. "VP of Finance",
+                      "skeptical board member", "Chief Revenue Officer").
+                      Defaults to a generic skeptical senior executive.
+
+     Returns:
+         The 3 hardest critiques the stakeholder would make, plus the single
+         hardest question to answer on the spot.
+     """
+     if not text or not text.strip():
+         raise ValueError("Text cannot be empty.")
+
+     api_key = os.environ.get("ANTHROPIC_API_KEY")
+     if not api_key:
+         raise EnvironmentError(
+             "ANTHROPIC_API_KEY not found. Set it with: export ANTHROPIC_API_KEY=your_key_here"
+         )
+
+     stakeholder_label = stakeholder if stakeholder else "Skeptical Senior Executive"
+     stakeholder_desc = stakeholder if stakeholder else DEFAULT_STAKEHOLDER
+
+     system_prompt = SYSTEM_PROMPT_TEMPLATE.format(
+         stakeholder=stakeholder_desc,
+         stakeholder_label=stakeholder_label
+     )
+
+     client = anthropic.Anthropic(api_key=api_key)
+     message = client.messages.create(
+         model=DEFAULT_MODEL,
+         max_tokens=1024,
+         system=system_prompt,
+         messages=[
+             {
+                 "role": "user",
+                 "content": f"Red-team this analysis writeup:\n\n{text}"
+             }
+         ]
+     )
+
+     return message.content[0].text
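Note: the stakeholder fallback in `redteam()` can be exercised without an API key. The sketch below mirrors that logic in a hypothetical `build_system_prompt` helper (not part of bridgekit) and uses shortened stand-ins for `DEFAULT_STAKEHOLDER` and `SYSTEM_PROMPT_TEMPLATE`, so the strings here are assumptions rather than the package's full prompt.

```python
from typing import Optional

# Shortened stand-ins for the module constants (assumed, not the full text).
DEFAULT_STAKEHOLDER = "a skeptical senior executive"
TEMPLATE = "You are {stakeholder}.\n\nSTAKEHOLDER: {stakeholder_label}\n"


def build_system_prompt(stakeholder: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the label/description fallback in redteam()."""
    # An explicit stakeholder is used for both the persona and the report label;
    # None or an empty string falls back to the generic executive.
    label = stakeholder if stakeholder else "Skeptical Senior Executive"
    desc = stakeholder if stakeholder else DEFAULT_STAKEHOLDER
    return TEMPLATE.format(stakeholder=desc, stakeholder_label=label)


default_prompt = build_system_prompt()
custom_prompt = build_system_prompt("VP of Finance")
print(default_prompt)
print(custom_prompt)
```

Because the check is a truthiness test, an empty string behaves the same as omitting the argument.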
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: bridgekit
3
- Version: 0.3.2
3
+ Version: 0.3.3
4
4
  Summary: AI tools that make you a better data scientist, not a redundant one.
5
5
  License: MIT
6
6
  Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -26,10 +26,18 @@ Requires-Dist: pytest>=7.0.0; extra == "dev"
26
26
  Requires-Dist: pytest-mock>=3.0.0; extra == "dev"
27
27
  Dynamic: license-file
28
28
 
29
+ <p align="center">
30
+ <img src="assets/logo.png" alt="Bridgekit" width="200"/>
31
+ </p>
32
+
29
33
  # Bridgekit
30
34
 
31
35
  **AI tools that make you a better data scientist, not a redundant one.**
32
36
 
37
+ [![PyPI](https://img.shields.io/pypi/v/bridgekit)](https://pypi.org/project/bridgekit/)
38
+ [![Downloads](https://img.shields.io/pypi/dm/bridgekit)](https://pypi.org/project/bridgekit/)
39
+ [![License](https://img.shields.io/pypi/l/bridgekit)](https://github.com/getbridgekit/bridgekit/blob/main/LICENSE)
40
+
33
41
  Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
34
42
 
35
43
  No new interface to learn. Just better work.
@@ -75,7 +83,7 @@ jupyter notebook
75
83
  Then import whichever tool you need:
76
84
 
77
85
  ```python
78
- from bridgekit import evaluate, plan, ask
86
+ from bridgekit import evaluate, plan, ask, redteam
79
87
  ```
80
88
 
81
89
  **Review a writeup:**
@@ -97,7 +105,7 @@ print(ask("What drove churn in Q3?", source="reports/"))
97
105
 
98
106
  ## Tool #1: Analysis Reviewer
99
107
 
100
- Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
108
+ Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — to help you improve your work before you walk into the meeting.
101
109
 
102
110
  ```python
103
111
  from bridgekit import evaluate
@@ -251,6 +259,116 @@ ALTERNATIVES
251
259
 
252
260
  ---
253
261
 
262
+ ## Tool #4: Red Team
263
+
264
+ Simulate a skeptical stakeholder challenging your work, so you are prepared for the questions you hope no one asks.
265
+
266
+ ```python
267
+ from bridgekit import redteam
268
+
269
+ text = """
270
+ I analyzed 90 days of user behavior data to understand what drives subscription
271
+ upgrades. Users who engaged with the reporting feature within their first week
272
+ were 3x more likely to upgrade within 30 days. I recommend we prioritize
273
+ onboarding users to reporting as a growth lever.
274
+ """
275
+
276
+ # Default — skeptical senior executive
277
+ print(redteam(text))
278
+
279
+ # Or specify a stakeholder
280
+ print(redteam(text, stakeholder="VP of Engineering"))
281
+ print(redteam(text, stakeholder="VP of Marketing"))
282
+ ```
283
+
284
+ Same writeup, different attack angles:
285
+
286
+ **VP of Engineering output:**
287
+ ```
288
+ BRIDGEKIT RED TEAM
289
+ ─────────────────────────────────────────
290
+ STAKEHOLDER: VP of Engineering
291
+
292
+ CRITIQUE 1: Classic correlation-causation conflation
293
+ ❯ "You're telling me to re-architect our onboarding flow based on a correlation?
294
+ Users who dig into reporting in week one are probably already power users.
295
+ You haven't shown me that exposing someone to reporting causes them to upgrade."
296
+ WHY IT LANDS: No causal identification strategy — no experiment, no instrumental
297
+ variable, no matched cohort analysis.
298
+ TO ADDRESS: Run an A/B test where randomized new users get guided into reporting
299
+ during onboarding. At minimum, a propensity-score-matched comparison controlling
300
+ for user segment and acquisition channel.
301
+
302
+ CRITIQUE 2: 3x on what base rate?
303
+ ❯ "If the base upgrade rate is 0.5% and reporting users upgrade at 1.5%, I'm not
304
+ re-prioritizing my engineering roadmap for that."
305
+ WHY IT LANDS: Relative lift without base rates inflates significance. No way to
306
+ evaluate whether this justifies the engineering investment.
307
+ TO ADDRESS: Absolute upgrade rates, cohort sizes, estimated incremental revenue,
308
+ and rough engineering cost to frame ROI.
309
+
310
+ CRITIQUE 3: No definition of "engaged with reporting"
311
+ ❯ "What does engaged actually mean? Clicked once? Built a custom report?
312
+ If someone accidentally opened the tab, are they in your 3x cohort?"
313
+ WHY IT LANDS: The threshold fundamentally changes the interpretation and
314
+ the recommended intervention.
315
+ TO ADDRESS: Define exact engagement criteria, show sensitivity analysis
316
+ across definitions.
317
+
318
+ ─────────────────────────────────────────
319
+ HARDEST QUESTION TO ANSWER
320
+ "What specific onboarding action would you implement, and what engagement depth
321
+ does it need to produce to replicate the effect — and have you tested whether
322
+ you can actually get general users to that depth?"
323
+ ```
324
+
325
+ **VP of Marketing output:**
326
+ ```
327
+ BRIDGEKIT RED TEAM
328
+ ─────────────────────────────────────────
329
+ STAKEHOLDER: VP of Marketing
330
+
331
+ CRITIQUE 1: Correlation masquerading as a growth lever
332
+ ❯ "You're telling me to restructure onboarding based on a correlation. How do
333
+ you know that users who found reporting weren't already power users or
334
+ higher-intent buyers who would have upgraded regardless?"
335
+ WHY IT LANDS: Classic selection bias. Users who proactively explore an advanced
336
+ feature in week one are likely more sophisticated or higher-intent. The 3x lift
337
+ could entirely reflect who they already were, not what the feature did to them.
338
+ TO ADDRESS: A/B test forcing reporting exposure in onboarding vs. not, or a
339
+ propensity-matched cohort controlling for acquisition source, company size,
340
+ and plan tier.
341
+
342
+ CRITIQUE 2: Where's the segmentation by channel and campaign?
343
+ ❯ "I spent millions driving traffic from different channels last quarter.
344
+ Did you even look at where these reporting-engaged users came from? If they're
345
+ all from our enterprise webinar funnel, this isn't a product insight —
346
+ it's a marketing attribution insight you've mislabeled."
347
+ WHY IT LANDS: Marketing mix directly shapes user intent. If reporting-engaged
348
+ users came disproportionately from specific campaigns, the real lever might be
349
+ acquiring more users like them, not changing onboarding.
350
+ TO ADDRESS: Break the 3x uplift down by acquisition channel and campaign.
351
+ Show the effect holds within channels, not just across the blended population.
352
+
353
+ CRITIQUE 3: 90 days is a dangerously thin window
354
+ ❯ "Was there a product launch, a pricing change, or a big campaign push in
355
+ that window? How do I know this finding isn't an artifact of whatever else
356
+ was happening in those specific 90 days?"
357
+ WHY IT LANDS: 90 days is susceptible to confounding events. Any single-quarter
358
+ analysis carries high risk of temporal bias.
359
+ TO ADDRESS: Replicate the finding across multiple 90-day windows. Flag any
360
+ major product, pricing, or marketing events and show the effect persists
361
+ after controlling for them.
362
+
363
+ ─────────────────────────────────────────
364
+ HARDEST QUESTION TO ANSWER
365
+ "If I run an A/B test tomorrow where we force half of new users through a
366
+ reporting-focused onboarding flow, what conversion lift are you personally
367
+ willing to commit to — and what's your confidence interval on that estimate?"
368
+ ```
369
+
370
+ ---
371
+
254
372
  ## Why not just use Claude?
255
373
 
256
374
  You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
@@ -267,7 +385,7 @@ Because your analysis already lives in a notebook. Bridgekit meets you there. A
267
385
 
268
386
  ## What's next?
269
387
 
270
- Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
388
+ Bridgekit is a suite, not a one-off. Four tools are live — more are coming:
271
389
 
272
390
  - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
273
391
  - **Assumption checker** — state your analytical assumptions, get the ones you missed
@@ -4,6 +4,7 @@ pyproject.toml
4
4
  bridgekit/__init__.py
5
5
  bridgekit/config.py
6
6
  bridgekit/planner.py
7
+ bridgekit/redteam.py
7
8
  bridgekit/reviewer.py
8
9
  bridgekit/search.py
9
10
  bridgekit.egg-info/PKG-INFO
@@ -7,7 +7,7 @@ include = ["bridgekit*"]
7
7
 
8
8
  [project]
9
9
  name = "bridgekit"
10
- version = "0.3.2"
10
+ version = "0.3.3"
11
11
  description = "AI tools that make you a better data scientist, not a redundant one."
12
12
  readme = "README.md"
13
13
  requires-python = ">=3.9"
@@ -1,6 +0,0 @@
- from .reviewer import evaluate
- from .search import ask
- from .planner import plan
-
- __version__ = "0.3.2"
- __all__ = ["evaluate", "ask", "plan"]
Files without changes: 4