PyPI - bridgekit - Versions diffs - 0.2.2__tar.gz → 0.3.1__tar.gz - Mend

bridgekit 0.2.2tar.gz → 0.3.1tar.gz

Files changed (20)

{bridgekit-0.2.2 → bridgekit-0.3.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: bridgekit
-Version: 0.2.2
+Version: 0.3.1
 Summary: AI tools that make you a better data scientist, not a redundant one.
 License: MIT
 Project-URL: Homepage, https://github.com/getbridgekit/bridgekit
@@ -32,54 +32,7 @@ Dynamic: license-file
 Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
-No new interface to learn. No data leaving your hands. Just better work.
----
-## Tool #1: Analysis Reviewer
-Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
-```python
-from bridgekit import evaluate
-text = """
-I analyzed 90 days of user behavior data to understand what drives subscription
-upgrades. Users who engaged with the reporting feature within their first week
-were 3x more likely to upgrade within 30 days. I recommend we prioritize
-onboarding users to reporting as a growth lever.
-"""
-evaluate(text)
-```
-**Output:**
-```
-BRIDGEKIT FEEDBACK
-─────────────────────────────────────────
-✅ LOGIC
-Your conclusion follows from the data. The 3x lift is a meaningful signal
-worth acting on.
-⚠️  WHAT'S MISSING
-- Did you control for user intent? Users who explore reporting features may
-  already be power users likely to upgrade regardless.
-- What's the sample size behind the 3x figure?
-- Is this correlation or did you establish any causal direction?
-🎯 WEAKEST POINT
-"I recommend we prioritize onboarding to reporting" is a big leap from an
-observational finding. A senior DS would push back on this in the meeting.
-💡 LEVEL UP
-Look into selection bias and how to address it — this analysis would be
-significantly stronger with a matched cohort or an experiment to validate
-the finding.
-─────────────────────────────────────────
-```
+No new interface to learn. Just better work.
 ---
@@ -112,34 +65,87 @@ export ANTHROPIC_API_KEY=your_key_here
 ## Getting Started
-**From the terminal:**
+Set your API key before launching Jupyter:
 ```bash
-python example.py
+export ANTHROPIC_API_KEY=your_key_here
+jupyter notebook
 ```
-**From a Jupyter notebook:**
+Then import whichever tool you need:
-Set your API key before launching Jupyter:
+```python
+from bridgekit import evaluate, plan, ask
+```
-```bash
-export ANTHROPIC_API_KEY=your_key_here
-jupyter notebook
+**Review a writeup:**
+```python
+print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
+```
+**Plan your analytical approach:**
+```python
+print(plan("Did our onboarding flow reduce churn?"))
+```
+**Search past reports:**
+```python
+print(ask("What drove churn in Q3?", source="reports/"))
 ```
-Then in a cell:
+---
+## Tool #1: Analysis Reviewer
+Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
 ```python
 from bridgekit import evaluate
 text = """
-Your analysis writeup goes here.
+I analyzed 90 days of user behavior data to understand what drives subscription
+upgrades. Users who engaged with the reporting feature within their first week
+were 3x more likely to upgrade within 30 days. I recommend we prioritize
+onboarding users to reporting as a growth lever.
 """
 print(evaluate(text))
 ```
-Paste your writeup as a string and call `evaluate()` — that's it.
+**Output:**
+```
+BRIDGEKIT ANALYSIS REVIEW
+─────────────────────────────────────────
+1. CLARITY
+✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
+this and immediately understand the claim and the recommendation.
+2. STATISTICAL RIGOR
+⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
+context is missing. How many users are in each group? What's the base
+upgrade rate? There's no confidence interval or p-value, so we can't
+assess whether this difference is statistically significant or noise.
+3. METHODOLOGY
+❌ MISSING — This reads as a pure correlation finding, but the
+recommendation implies causation. Users who explore reporting in week one
+may simply be more motivated or already closer to upgrading. Without
+addressing the self-selection problem, this recommendation is not
+defensible.
+4. BUSINESS IMPACT
+⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
+the 3x lift into projected revenue or upgrade volume so leadership can
+prioritize this against competing initiatives.
+─────────────────────────────────────────
+BOTTOM LINE
+You must address the correlation-vs-causation gap before presenting —
+otherwise you risk recommending an onboarding investment that targets a
+symptom of upgrade intent rather than a cause of it.
+```
 ---
@@ -186,33 +192,86 @@ churn rate of 4.5%:
 ---
-## Why not just use Claude?
+## Tool #3: Analysis Planner
-You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
+Describe your analytical problem and get a structured plan for the right approach — before you start the analysis.
-It also lives in your Jupyter notebook, so there's no context switching. You stay in your workflow.
+Covers the recommended method, why it fits your problem, key assumptions, common pitfalls, and alternatives.
+```python
+from bridgekit import plan
+print(plan(
+    question="Does our new onboarding flow increase upgrade rates?",
+    data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
+    goal="causal inference"
+))
+```
+`data_description` and `goal` are optional — the more context you provide, the more tailored the recommendation.
+**`goal` examples:** `"causal inference"`, `"prediction"`, `"segmentation"`, `"hypothesis testing"`, `"exploration"`
+**Output:**
+```
+BRIDGEKIT ANALYSIS PLAN
+─────────────────────────────────────────
+RECOMMENDED APPROACH
+Two-sample proportion test (z-test or Fisher's exact) for the primary
+analysis, since you have a randomized experiment with a binary outcome
+and want to estimate the causal effect of the new onboarding flow on
+upgrade rates.
+WHY THIS APPROACH
+Randomization handles confounding, so you don't need regression
+adjustment to get an unbiased causal estimate. With 500 per group,
+you have reasonable power for detecting meaningful differences (~80%
+power for a 7-8 percentage point lift from a 20% baseline).
+KEY ASSUMPTIONS
+- Randomization was correctly implemented (no selection bias)
+- No interference between users
+- SUTVA: each user has a single well-defined treatment version
+- Outcome measurement is complete (watch for differential dropout)
+- Users in both arms had equal opportunity to upgrade
+WATCH OUT FOR
+Peeking and early stopping — if you're checking results repeatedly
+before the experiment concludes, your p-values are invalid. Decide
+your sample size and analysis time upfront.
+ALTERNATIVES
+- Logistic regression with covariates (channel, plan tier): use if you
+  discover post-hoc imbalance or want to tighten confidence intervals
+- Survival analysis (Cox model): use if time-to-upgrade matters as
+  much as whether users upgrade
+─────────────────────────────────────────
+```
 ---
-## Why a library and not a chatbot?
+## Why not just use Claude?
-Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call at the end of your existing process — consistent, reproducible, and fast.
+You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
+It also lives in your Jupyter notebook, so there's no context switching. You stay in your workflow.
 ---
-## Is my data safe?
+## Why a library and not a chatbot?
-Bridgekit only ever sees text you write yourself — your narrative, your conclusions, your writeup. It never touches your raw data, your DataFrames, or your code. You're sending your own words to an API, the same way you'd paste them into a Google Doc to share with a colleague.
+Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
 ---
 ## What's next?
-Bridgekit is a suite, not a one-off. Two tools are live — more are coming:
+Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
-- **Statistical approach suggester** — describe your problem in plain English, get the right test and why
 - **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
 - **Assumption checker** — state your analytical assumptions, get the ones you missed
+- **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
 Each tool is small, focused, and built for the way data scientists actually work.
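
Reviewer note on the sample plan output in this hunk: the quoted figure of roughly 80% power for a 7-8 percentage point lift from a 20% baseline, with 500 users per arm, can be sanity-checked with a normal-approximation power calculation. The sketch below is not part of bridgekit. The function name `two_proportion_power` and the two-sided 5% significance level are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control, p_treatment, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-sample proportion z-test.

    Uses the unpooled standard error and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    std_normal = NormalDist()
    # Critical value for a two-sided test at level alpha (assumed 5%).
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two sample proportions.
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treatment * (1 - p_treatment) / n_per_arm
    )
    effect = abs(p_treatment - p_control)
    return std_normal.cdf(effect / se - z_crit)


# Baseline and arm size taken from the sample output: 20% baseline, 500 per arm.
for lift in (0.07, 0.08):
    power = two_proportion_power(0.20, 0.20 + lift, n_per_arm=500)
    print(f"lift={lift:.2f} power={power:.2f}")
```

Under these assumptions the approximation gives about 0.75 for a 7-point lift and about 0.84 for an 8-point lift, which brackets the ~80% figure in the sample output. An exact test or a pooled standard error would shift the numbers slightly.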

bridgekit-0.3.1/README.md ADDED

@@ -0,0 +1,260 @@
+# Bridgekit
+**AI tools that make you a better data scientist, not a redundant one.**
+Data scientists are not being replaced — they're being asked to do more with less context, less time, and more pressure to be right. Bridgekit is a growing suite of small, focused tools that bring AI into your existing workflow to sharpen your thinking, catch your blind spots, and level up your craft.
+No new interface to learn. Just better work.
+---
+## Installation
+**Standard install:**
+```bash
+pip install bridgekit
+```
+**In a virtual environment (recommended for clean setups):**
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install bridgekit
+```
+**In a Jupyter notebook:**
+```python
+!pip install bridgekit
+```
+Requires an Anthropic API key:
+```bash
+export ANTHROPIC_API_KEY=your_key_here
+```
+---
+## Getting Started
+Set your API key before launching Jupyter:
+```bash
+export ANTHROPIC_API_KEY=your_key_here
+jupyter notebook
+```
+Then import whichever tool you need:
+```python
+from bridgekit import evaluate, plan, ask
+```
+**Review a writeup:**
+```python
+print(evaluate("I analyzed 90 days of user behavior data. Users who engaged with the reporting feature were 3x more likely to upgrade."))
+```
+**Plan your analytical approach:**
+```python
+print(plan("Did our onboarding flow reduce churn?"))
+```
+**Search past reports:**
+```python
+print(ask("What drove churn in Q3?", source="reports/"))
+```
+---
+## Tool #1: Analysis Reviewer
+Write your findings the way you normally would. Bridgekit reads them and gives you the feedback a senior data scientist would — before you walk into the meeting.
+```python
+from bridgekit import evaluate
+text = """
+I analyzed 90 days of user behavior data to understand what drives subscription
+upgrades. Users who engaged with the reporting feature within their first week
+were 3x more likely to upgrade within 30 days. I recommend we prioritize
+onboarding users to reporting as a growth lever.
+"""
+print(evaluate(text))
+```
+**Output:**
+```
+BRIDGEKIT ANALYSIS REVIEW
+─────────────────────────────────────────
+1. CLARITY
+✅ STRONG — Clean, concise, and jargon-free. Any stakeholder could read
+this and immediately understand the claim and the recommendation.
+2. STATISTICAL RIGOR
+⚠️ NEEDS WORK — "3x more likely" is a compelling number, but critical
+context is missing. How many users are in each group? What's the base
+upgrade rate? There's no confidence interval or p-value, so we can't
+assess whether this difference is statistically significant or noise.
+3. METHODOLOGY
+❌ MISSING — This reads as a pure correlation finding, but the
+recommendation implies causation. Users who explore reporting in week one
+may simply be more motivated or already closer to upgrading. Without
+addressing the self-selection problem, this recommendation is not
+defensible.
+4. BUSINESS IMPACT
+⚠️ NEEDS WORK — "Growth lever" is directional, not quantified. Translate
+the 3x lift into projected revenue or upgrade volume so leadership can
+prioritize this against competing initiatives.
+─────────────────────────────────────────
+BOTTOM LINE
+You must address the correlation-vs-causation gap before presenting —
+otherwise you risk recommending an onboarding investment that targets a
+symptom of upgrade intent rather than a cause of it.
+```
+---
+## Tool #2: Analysis Search
+Ask questions across a collection of your past analysis documents. Point it at a folder and get answers grounded in your actual work — no digging through files manually.
+Uses a vector database and semantic similarity to find relevant context across your documents — not keyword matching.
+Supports `.txt`, `.md`, `.pdf`, `.docx`, `.pptx`, and `.ipynb` files.
+> **Note:** The first run will download the MiniLM embedding model (~90MB). This is a one-time download — it gets cached locally for all subsequent calls.
+**From a folder:**
+```python
+from bridgekit import ask
+print(ask("what drove churn in Q3?", source="reports/"))
+```
+**From raw text:**
+```python
+from bridgekit import ask
+text = """
+Q3 churn rose to 4.5%, driven by a product outage in August and a pricing
+change in July that increased SMB costs by 12%.
+"""
+print(ask("what caused the Q3 churn spike?", text=text))
+```
+**Output** *(based on sample data included in the repo)*:
+```
+Based on the Q3 2024 Churn Analysis, two primary factors drove the elevated
+churn rate of 4.5%:
+1. August Product Outage — A 14-hour outage affected 3,800 accounts. Impacted
+   accounts churned at 8.1% vs 3.2% for unaffected accounts.
+2. July Pricing Change — SMB costs increased by an average of 12%, causing SMB
+   churn to spike to 7.2% — the highest single-month figure in the dataset.
+```
+---
+## Tool #3: Analysis Planner
+Describe your analytical problem and get a structured plan for the right approach — before you start the analysis.
+Covers the recommended method, why it fits your problem, key assumptions, common pitfalls, and alternatives.
+```python
+from bridgekit import plan
+print(plan(
+    question="Does our new onboarding flow increase upgrade rates?",
+    data_description="We are running an A/B test with ~1,000 users split between old and new onboarding. Key variables will include upgrade status, time to upgrade, acquisition channel, and plan tier.",
+    goal="causal inference"
+))
+```
+`data_description` and `goal` are optional — the more context you provide, the more tailored the recommendation.
+**`goal` examples:** `"causal inference"`, `"prediction"`, `"segmentation"`, `"hypothesis testing"`, `"exploration"`
+**Output:**
+```
+BRIDGEKIT ANALYSIS PLAN
+─────────────────────────────────────────
+RECOMMENDED APPROACH
+Two-sample proportion test (z-test or Fisher's exact) for the primary
+analysis, since you have a randomized experiment with a binary outcome
+and want to estimate the causal effect of the new onboarding flow on
+upgrade rates.
+WHY THIS APPROACH
+Randomization handles confounding, so you don't need regression
+adjustment to get an unbiased causal estimate. With 500 per group,
+you have reasonable power for detecting meaningful differences (~80%
+power for a 7-8 percentage point lift from a 20% baseline).
+KEY ASSUMPTIONS
+- Randomization was correctly implemented (no selection bias)
+- No interference between users
+- SUTVA: each user has a single well-defined treatment version
+- Outcome measurement is complete (watch for differential dropout)
+- Users in both arms had equal opportunity to upgrade
+WATCH OUT FOR
+Peeking and early stopping — if you're checking results repeatedly
+before the experiment concludes, your p-values are invalid. Decide
+your sample size and analysis time upfront.
+ALTERNATIVES
+- Logistic regression with covariates (channel, plan tier): use if you
+  discover post-hoc imbalance or want to tighten confidence intervals
+- Survival analysis (Cox model): use if time-to-upgrade matters as
+  much as whether users upgrade
+─────────────────────────────────────────
+```
+---
+## Why not just use Claude?
+You could. But you'd need to know what to ask, how to frame it, and what a good answer looks like. Bridgekit has that baked in — it knows you're a data scientist presenting findings, so it asks the right questions automatically. No prompt engineering required. Just paste your work and run it.
+It also lives in your Jupyter notebook, so there's no context switching. You stay in your workflow.
+---
+## Why a library and not a chatbot?
+Because your analysis already lives in a notebook. Bridgekit meets you there. A chatbot asks you to re-explain your work from scratch every time. Bridgekit is one function call — consistent, reproducible, and fast.
+---
+## What's next?
+Bridgekit is a suite, not a one-off. Three tools are live — more are coming:
+- **Stakeholder translator** — turn your technical findings into a narrative a non-technical audience will actually follow
+- **Assumption checker** — state your analytical assumptions, get the ones you missed
+- **Multi-model support** — use any LLM provider (OpenAI, Gemini, open source models via OpenRouter) instead of being tied to Anthropic
+Each tool is small, focused, and built for the way data scientists actually work.
+---
+## Contributing
+Bridgekit is open source and early. If you're a data scientist and something here would genuinely save you time or make you sharper — open an issue, submit a PR, or just tell me what's missing.
+---
+## License
+MIT

bridgekit-0.3.1/bridgekit/__init__.py ADDED

@@ -0,0 +1,6 @@
+from .reviewer import evaluate
+from .search import ask
+from .planner import plan
+__version__ = "0.3.1"
+__all__ = ["evaluate", "ask", "plan"]

bridgekit-0.3.1/bridgekit/config.py ADDED

@@ -0,0 +1 @@
+DEFAULT_MODEL = "claude-opus-4-6"

bridgekit-0.3.1/bridgekit/planner.py ADDED

@@ -0,0 +1,75 @@
+import os
+import anthropic
+from .config import DEFAULT_MODEL
+SYSTEM_PROMPT = """You are a senior statistician and data scientist advising a colleague on the right analytical approach for their problem.
+Given a question, a description of the available data, and the goal of the analysis, recommend the best analytical approach. Be direct and specific — not a textbook, not a list of every possible method.
+Structure your response exactly like this:
+BRIDGEKIT ANALYSIS PLAN
+─────────────────────────────────────────
+RECOMMENDED APPROACH
+[Name of the method and one sentence on why it fits this problem]
+WHY THIS APPROACH
+[2-3 sentences on why this is the right fit given the question, data, and goal]
+KEY ASSUMPTIONS
+[Bullet list of assumptions this approach requires — flag any that may be violated]
+WATCH OUT FOR
+[The most common mistake DS make on this type of problem]
+ALTERNATIVES
+[1-2 alternative approaches and when you'd use them instead]
+─────────────────────────────────────────
+"""
+def plan(question: str, data_description: str = None, goal: str = None) -> str:
+    """
+    Recommend the right analytical approach for your problem.
+    Args:
+        question:         The analytical question you are trying to answer.
+        data_description: Optional. A plain text description of your available data.
+        goal:             Optional. The goal of your analysis (e.g. "causal inference",
+                          "prediction", "segmentation", "hypothesis testing", "exploration").
+    Returns:
+        A structured analytical plan covering the recommended approach, assumptions,
+        common pitfalls, and alternatives.
+    """
+    if not question or not question.strip():
+        raise ValueError("Question cannot be empty.")
+    api_key = os.environ.get("ANTHROPIC_API_KEY")
+    if not api_key:
+        raise EnvironmentError(
+            "ANTHROPIC_API_KEY not found. Set it with: export ANTHROPIC_API_KEY=your_key_here"
+        )
+    user_message = f"Question: {question}"
+    if data_description:
+        user_message += f"\n\nData: {data_description}"
+    if goal:
+        user_message += f"\n\nGoal: {goal}"
+    client = anthropic.Anthropic(api_key=api_key)
+    message = client.messages.create(
+        model=DEFAULT_MODEL,
+        max_tokens=1024,
+        system=SYSTEM_PROMPT,
+        messages=[
+            {
+                "role": "user",
+                "content": user_message
+            }
+        ]
+    )
+    return message.content[0].text
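
Reviewer note: the prompt assembly in `plan()` can be exercised without an API key by isolating it in a pure function. The sketch below mirrors the string-building and validation logic shown in the hunk above. The helper name `build_user_message` is hypothetical and does not exist in bridgekit, and the `Optional[str]` annotations are a suggested tightening of the `str = None` defaults, not what the released code declares.

```python
from typing import Optional


def build_user_message(
    question: str,
    data_description: Optional[str] = None,
    goal: Optional[str] = None,
) -> str:
    """Assemble the user message the same way plan() does in this diff.

    Hypothetical helper for illustration; not part of the package.
    """
    if not question or not question.strip():
        raise ValueError("Question cannot be empty.")
    user_message = f"Question: {question}"
    # Optional sections are appended only when a non-empty value is given.
    if data_description:
        user_message += f"\n\nData: {data_description}"
    if goal:
        user_message += f"\n\nGoal: {goal}"
    return user_message


print(build_user_message(
    "Does our new onboarding flow increase upgrade rates?",
    goal="causal inference",
))
```

Because empty strings are falsy, passing `data_description=""` or `goal=""` behaves the same as omitting the argument.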

{bridgekit-0.2.2 → bridgekit-0.3.1}/bridgekit/reviewer.py RENAMED

@@ -1,16 +1,16 @@
 import os
 import anthropic
+from .config import DEFAULT_MODEL
 SYSTEM_PROMPT = """You are a senior data scientist reviewing a colleague's analysis writeup.
 You are direct, constructive, and specific. You do not flatter — you help people improve.
-Evaluate the writeup across exactly these five dimensions:
+Evaluate the writeup across exactly these four dimensions:
-1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything?
-2. AUDIENCE CLARITY — Is it written for the right reader? Does the level of detail and framing match who will actually read this?
-3. STATISTICAL RIGOR — Is there enough data to support the claim? Are sample sizes mentioned? Are confidence levels or uncertainty acknowledged?
-4. METHODOLOGY — Is it clear why this analytical approach was chosen? Are alternatives considered or ruled out?
-5. BUSINESS IMPACT — Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
+1. CLARITY — Is it free of jargon? Could someone outside data science read this without googling anything? Is it written for the right reader?
+2. STATISTICAL RIGOR — Is there enough data to support the claim? Are sample sizes mentioned? Are confidence levels or uncertainty acknowledged?
+3. METHODOLOGY — Is it clear why this analytical approach was chosen? Are alternatives considered or ruled out?
+4. BUSINESS IMPACT — Are outcomes quantified in % or $ terms? Directional statements like "improved performance" are not enough.
 For each dimension, give one of three ratings:
 ✅ STRONG — this dimension is handled well
@@ -29,16 +29,13 @@ BRIDGEKIT ANALYSIS REVIEW
 1. CLARITY
 [rating] [feedback]
-2. AUDIENCE CLARITY
+2. STATISTICAL RIGOR
 [rating] [feedback]
-3. STATISTICAL RIGOR
+3. METHODOLOGY
 [rating] [feedback]
-4. METHODOLOGY
-[rating] [feedback]
-5. BUSINESS IMPACT
+4. BUSINESS IMPACT
 [rating] [feedback]
 ─────────────────────────────────────────
@@ -54,7 +51,7 @@ def evaluate(text: str) -> str:
         text: Your analysis writeup as a plain string.
     Returns:
-        Structured feedback across five dimensions.
+        Structured feedback across four dimensions.
     """
     if not text or not text.strip():
         raise ValueError("Text cannot be empty.")
@@ -68,7 +65,7 @@ def evaluate(text: str) -> str:
     client = anthropic.Anthropic(api_key=api_key)
     message = client.messages.create(
-        model="claude-opus-4-5",
+        model=DEFAULT_MODEL,
         max_tokens=1024,
         system=SYSTEM_PROMPT,
         messages=[
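
Reviewer note: since `evaluate()` returns plain text in the four-dimension layout requested by the system prompt, callers who want the ratings programmatically would have to parse that text. The sketch below is one possible approach, assuming the model follows the layout shown in the README sample (a numbered heading line followed by a line starting with the rating). `parse_ratings` is a hypothetical helper, not a bridgekit function, and model output is not guaranteed to match the template.

```python
import re

# Numbered dimension heading, e.g. "2. STATISTICAL RIGOR".
HEADING = re.compile(r"^\d+\.\s+([A-Z][A-Z ]+)$")
# Rating label after any leading emoji or whitespace.
RATING = re.compile(r"^\W*(STRONG|NEEDS WORK|MISSING)\b")


def parse_ratings(review: str) -> dict:
    """Map each dimension heading to its rating label (or None if absent)."""
    ratings = {}
    pending = None
    for raw in review.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = HEADING.match(line)
        if heading:
            pending = heading.group(1).strip()
            continue
        if pending is not None:
            # Only the first non-empty line after a heading is inspected.
            rating = RATING.match(line)
            ratings[pending] = rating.group(1) if rating else None
            pending = None
    return ratings


# Abbreviated sample in the layout from the README; \u2014 is the dash separator.
sample = (
    "BRIDGEKIT ANALYSIS REVIEW\n"
    "1. CLARITY\n"
    "\u2705 STRONG \u2014 Clean and concise.\n"
    "2. STATISTICAL RIGOR\n"
    "\u26a0\ufe0f NEEDS WORK \u2014 No sample sizes given.\n"
    "3. METHODOLOGY\n"
    "\u274c MISSING \u2014 Correlation presented as causation.\n"
)
print(parse_ratings(sample))
```

A dimension maps to `None` when the line after its heading does not start with a recognized label, which makes deviations from the template visible instead of silently dropped.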

{bridgekit-0.2.2 → bridgekit-0.3.1}/bridgekit/search.py RENAMED

@@ -1,6 +1,7 @@
 import os
 from pathlib import Path
 import anthropic
+from .config import DEFAULT_MODEL
 import chromadb
 from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
@@ -106,7 +107,7 @@ def ask(question: str, source: str = None, text: str = None) -> str:
     # Generate answer with Claude
     anthropic_client = anthropic.Anthropic(api_key=api_key)
     message = anthropic_client.messages.create(
-        model="claude-opus-4-5",
+        model=DEFAULT_MODEL,
         max_tokens=1024,
         system=(
             "You are a senior data scientist answering questions based on analysis reports. "
