notionhelper 0.3.1__tar.gz → 0.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28) hide show
  1. notionhelper-0.4.0/ML_DEMO_README.md +317 -0
  2. {notionhelper-0.3.1 → notionhelper-0.4.0}/PKG-INFO +6 -2
  3. {notionhelper-0.3.1 → notionhelper-0.4.0}/README.md +5 -1
  4. notionhelper-0.4.0/SEPARATION_SUMMARY.md +70 -0
  5. notionhelper-0.4.0/examples/ml_demo.py +391 -0
  6. {notionhelper-0.3.1 → notionhelper-0.4.0}/pyproject.toml +1 -1
  7. notionhelper-0.4.0/src/notionhelper/__init__.py +4 -0
  8. {notionhelper-0.3.1 → notionhelper-0.4.0}/src/notionhelper/helper.py +0 -174
  9. notionhelper-0.4.0/src/notionhelper/ml_logger.py +206 -0
  10. {notionhelper-0.3.1 → notionhelper-0.4.0}/uv.lock +1 -1
  11. notionhelper-0.3.1/src/notionhelper/__init__.py +0 -3
  12. {notionhelper-0.3.1 → notionhelper-0.4.0}/.coverage +0 -0
  13. {notionhelper-0.3.1 → notionhelper-0.4.0}/.github/workflows/claude-code-review.yml +0 -0
  14. {notionhelper-0.3.1 → notionhelper-0.4.0}/.github/workflows/claude.yml +0 -0
  15. {notionhelper-0.3.1 → notionhelper-0.4.0}/.gitignore +0 -0
  16. {notionhelper-0.3.1 → notionhelper-0.4.0}/GETTING_STARTED.md +0 -0
  17. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/helper_logo.png +0 -0
  18. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/json_builder.png.png +0 -0
  19. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/logo.png +0 -0
  20. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/notionh3.png +0 -0
  21. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/pillio.png +0 -0
  22. {notionhelper-0.3.1 → notionhelper-0.4.0}/images/pillio2.png +0 -0
  23. {notionhelper-0.3.1 → notionhelper-0.4.0}/notionapi_md_info.md +0 -0
  24. {notionhelper-0.3.1 → notionhelper-0.4.0}/pytest.ini +0 -0
  25. {notionhelper-0.3.1 → notionhelper-0.4.0}/tests/README.md +0 -0
  26. {notionhelper-0.3.1 → notionhelper-0.4.0}/tests/__init__.py +0 -0
  27. {notionhelper-0.3.1 → notionhelper-0.4.0}/tests/conftest.py +0 -0
  28. {notionhelper-0.3.1 → notionhelper-0.4.0}/tests/test_helper.py +0 -0
@@ -0,0 +1,317 @@
1
+ # NotionHelper ML Demo Guide
2
+
3
+ ## Overview
4
+
5
+ `ml_demo.py` is a comprehensive demonstration of how to use **MLNotionHelper** (which extends NotionHelper) to track machine learning experiments. It showcases a complete workflow from model training to Notion integration.
6
+
7
+ **Note:** The ML experiment tracking features are available in the `MLNotionHelper` class, which inherits from `NotionHelper` and adds specialized methods for logging ML experiments.
8
+
9
+ ## Features
10
+
11
+ ✨ **Complete ML Pipeline**
12
+ - Logistic Regression on sklearn's breast cancer dataset
13
+ - Train/test split with stratification
14
+ - Feature scaling
15
+ - Comprehensive metrics calculation
16
+
17
+ 📊 **Metrics Tracked**
18
+ - Accuracy
19
+ - Precision
20
+ - Recall
21
+ - F1 Score
22
+ - ROC AUC
23
+ - Training/Test sample sizes
24
+
25
+ 📈 **Visualizations**
26
+ - Confusion Matrix (heatmap)
27
+ - ROC Curve with AUC score
28
+ - Feature Importance (when scaling is disabled)
29
+
30
+ 💾 **Artifacts**
31
+ - Predictions CSV with probabilities
32
+ - Classification report
33
+ - All generated plots
34
+
35
+ ## Quick Start
36
+
37
+ ### 1. Run the Demo (without Notion)
38
+
39
+ ```bash
40
+ python ml_demo.py
41
+ ```
42
+
43
+ This will:
44
+ - Train the model
45
+ - Generate all metrics and plots
46
+ - Save artifacts to disk
47
+ - Show instructions for Notion integration
48
+
49
+ ### 2. Set Up Notion Integration
50
+
51
+ #### A. Get Your Notion API Token
52
+
53
+ 1. Go to [Notion Integrations](https://www.notion.so/my-integrations)
54
+ 2. Create a new integration
55
+ 3. Copy the "Internal Integration Token"
56
+ 4. Set it as an environment variable:
57
+
58
+ ```bash
59
+ export NOTION_TOKEN='secret_your_token_here'
60
+ ```
61
+
62
+ #### B. Create a Parent Page
63
+
64
+ 1. Create a new page in Notion (this will hold your ML experiment databases)
65
+ 2. Share the page with your integration
66
+ 3. Copy the page ID from the URL:
67
+ - URL: `https://www.notion.so/My-ML-Experiments-abc123def456...`
68
+ - Page ID: `abc123def456...`
69
+
70
+ #### C. Create the Database (First Time Only)
71
+
72
+ 1. Open `ml_demo.py`
73
+ 2. Find the "STEP 4A" section
74
+ 3. Uncomment the database creation code
75
+ 4. Set `PARENT_PAGE_ID = "your_page_id_here"`
76
+ 5. Run the script:
77
+
78
+ ```bash
79
+ python ml_demo.py
80
+ ```
81
+
82
+ 6. **IMPORTANT**: Copy the `data_source_id` from the output!
83
+
84
+ Example output:
85
+ ```
86
+ ✓ Database created! Data Source ID: 2d2fdfd6-8a97-80ba-bdd6-000b787993a4
87
+ 💡 Save this ID for future experiment logging!
88
+ ```
89
+
90
+ #### D. Log Experiments
91
+
92
+ 1. Comment out the database creation code (STEP 4A)
93
+ 2. Set `DATA_SOURCE_ID = "your_data_source_id_from_step_C"`
94
+ 3. Run experiments:
95
+
96
+ ```bash
97
+ python ml_demo.py
98
+ ```
99
+
100
+ Each run will:
101
+ - Create a new row in your Notion database
102
+ - Upload confusion matrix and ROC curve plots
103
+ - Attach CSV artifacts
104
+ - Compare metrics with previous runs
105
+ - Show 🏆 if it's a new best score!
106
+
107
+ ## Customization
108
+
109
+ ### Hyperparameters
110
+
111
+ Modify the `config` dictionary in the `main()` function:
112
+
113
+ ```python
114
+ config = {
115
+ "Experiment_Name": "Your Experiment",
116
+ "Model": "Logistic Regression",
117
+ "C_Regularization": 10.0, # Change this
118
+ "Max_Iterations": 2000, # Or this
119
+ "Solver": "saga", # Try different solvers
120
+ "Penalty": "l1", # L1 or L2 regularization
121
+ "Feature_Scaling": True # Enable/disable scaling
122
+ }
123
+ ```
124
+
125
+ ### Target Metric
126
+
127
+ Change which metric to optimize in the `log_ml_experiment()` call:
128
+
129
+ ```python
130
+ page_id = nh.log_ml_experiment(
131
+ ...
132
+ target_metric="Accuracy", # Or "Precision", "Recall", "F1_Score"
133
+ higher_is_better=True, # Higher scores are better
134
+ ...
135
+ )
136
+ ```
137
+
138
+ ## Example Workflow
139
+
140
+ ### Experiment 1: Baseline
141
+ ```python
142
+ config = {
143
+ "C_Regularization": 1.0,
144
+ "Penalty": "l2",
145
+ "Solver": "lbfgs"
146
+ }
147
+ # Results: F1 Score = 98.61%
148
+ ```
149
+
150
+ ### Experiment 2: Stronger Regularization
151
+ ```python
152
+ config = {
153
+ "C_Regularization": 0.1, # Stronger regularization
154
+ "Penalty": "l2",
155
+ "Solver": "lbfgs"
156
+ }
157
+ # Run to see if it improves performance
158
+ ```
159
+
160
+ ### Experiment 3: L1 Regularization
161
+ ```python
162
+ config = {
163
+ "C_Regularization": 1.0,
164
+ "Penalty": "l1", # Switch to L1
165
+ "Solver": "saga" # L1 requires saga or liblinear
166
+ }
167
+ # L1 can perform feature selection
168
+ ```
169
+
170
+ ## Generated Files
171
+
172
+ After running the demo, you'll find:
173
+
174
+ ```
175
+ ├── confusion_matrix.png # Confusion matrix heatmap
176
+ ├── roc_curve.png # ROC curve with AUC
177
+ ├── feature_importance.png # Feature coefficients (if no scaling)
178
+ ├── predictions.csv # Test set predictions
179
+ └── classification_report.csv # Detailed metrics per class
180
+ ```
181
+
182
+ ## Notion Database Schema
183
+
184
+ The created database will have columns for:
185
+
186
+ **Config Fields:**
187
+ - Experiment_Name (Title)
188
+ - Model
189
+ - Dataset
190
+ - Test_Size
191
+ - Random_State
192
+ - C_Regularization (Number)
193
+ - Max_Iterations (Number)
194
+ - Solver
195
+ - Penalty
196
+ - Feature_Scaling (Checkbox) ✅
197
+
198
+ **Metric Fields:**
199
+ - Accuracy (Number)
200
+ - Precision (Number)
201
+ - Recall (Number)
202
+ - F1_Score (Number)
203
+ - ROC_AUC (Number)
204
+ - Train_Samples (Number)
205
+ - Test_Samples (Number)
206
+ - Run Status (shows 🏆 for new best)
207
+
208
+ **Artifacts:**
209
+ - Plots (embedded in page body)
210
+ - Artifacts (attached CSV files)
211
+
212
+ ## Troubleshooting
213
+
214
+ ### Boolean Properties Showing as Numbers
215
+
216
+ If you see boolean values (like `Feature_Scaling`) appearing as numbers in Notion:
217
+
218
+ 1. Check the debug output in the console
219
+ 2. Ensure you're passing Python `bool` types (not 0/1 integers)
220
+ 3. The `dict_to_notion_schema()` includes debug prints to help diagnose
221
+
222
+ ### Notion API Errors
223
+
224
+ Common issues:
225
+ - **401 Unauthorized**: Check your NOTION_TOKEN
226
+ - **404 Not Found**: Verify your PARENT_PAGE_ID or DATA_SOURCE_ID
227
+ - **400 Bad Request**: Make sure the page is shared with your integration
228
+
229
+ ### Missing Plots
230
+
231
+ Ensure matplotlib and seaborn are installed:
232
+ ```bash
233
+ pip install matplotlib seaborn
234
+ ```
235
+
236
+ ## Advanced Usage
237
+
238
+ ### Use Your Own Dataset
239
+
240
+ Replace the data loading section:
241
+
242
+ ```python
243
+ # Replace this:
244
+ data = load_breast_cancer()
245
+ X = pd.DataFrame(data.data, columns=data.feature_names)
246
+ y = pd.Series(data.target, name='target')
247
+
248
+ # With your own data:
249
+ df = pd.read_csv('your_data.csv')
250
+ X = df.drop('target_column', axis=1)
251
+ y = df['target_column']
252
+ ```
253
+
254
+ ### Add More Metrics
255
+
256
+ Calculate additional metrics:
257
+
258
+ ```python
259
+ from sklearn.metrics import matthews_corrcoef, balanced_accuracy_score
260
+
261
+ metrics = {
262
+ ...
263
+ "MCC": round(matthews_corrcoef(y_test, y_pred), 4),
264
+ "Balanced_Accuracy": round(balanced_accuracy_score(y_test, y_pred) * 100, 2)
265
+ }
266
+ ```
267
+
268
+ ### Grid Search Integration
269
+
270
+ Combine with sklearn's GridSearchCV:
271
+
272
+ ```python
273
+ from sklearn.model_selection import GridSearchCV
274
+
275
+ param_grid = {
276
+ 'C': [0.1, 1.0, 10.0],
277
+ 'penalty': ['l1', 'l2']
278
+ }
279
+
280
+ grid = GridSearchCV(LogisticRegression(), param_grid, cv=5)
281
+ grid.fit(X_train, y_train)
282
+
283
+ # Log each configuration
284
+ for params, mean_score in zip(grid.cv_results_['params'],
285
+ grid.cv_results_['mean_test_score']):
286
+ config.update(params)
287
+ metrics['CV_Score'] = mean_score
288
+ nh.log_ml_experiment(...)
289
+ ```
290
+
291
+ ## Benefits of Using NotionHelper
292
+
293
+ ✅ **Centralized Tracking**: All experiments in one place
294
+ ✅ **Visual Comparison**: See which hyperparameters work best
295
+ ✅ **Automatic Leaderboard**: Highlights new best scores
296
+ ✅ **File Attachments**: Keep plots and CSVs with experiments
297
+ ✅ **Team Collaboration**: Share results with your team
298
+ ✅ **Reproducibility**: Track all hyperparameters and seeds
299
+
300
+ ## Next Steps
301
+
302
+ 1. **Run the demo** to familiarize yourself with the workflow
303
+ 2. **Create your Notion database** following the setup guide
304
+ 3. **Customize for your project** - replace with your ML model
305
+ 4. **Run multiple experiments** with different hyperparameters
306
+ 5. **Review results in Notion** - compare and analyze performance
307
+
308
+ ## Support
309
+
310
+ For issues or questions:
311
+ - Check the [NotionHelper documentation](src/notionhelper/helper.py)
312
+ - Review the [Notion API docs](https://developers.notion.com/)
313
+ - Examine the debug output for type checking issues
314
+
315
+ ---
316
+
317
+ **Happy Experimenting! 🚀**
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: notionhelper
3
- Version: 0.3.1
3
+ Version: 0.4.0
4
4
  Summary: NotionHelper is a Python library that simplifies interactions with the Notion API, enabling easy management of databases, pages, and files within Notion workspaces.
5
5
  Author-email: Jan du Plessis <drjanduplessis@icloud.com>
6
6
  Requires-Python: >=3.10
@@ -74,7 +74,7 @@ Here is an example of how to use the library:
74
74
 
75
75
  ```python
76
76
  import os
77
- from notionhelper import NotionHelper
77
+ from notionhelper import NotionHelper, MLNotionHelper
78
78
  ```
79
79
 
80
80
  ### Initialize the NotionHelper class
@@ -82,7 +82,11 @@ from notionhelper import NotionHelper
82
82
  ```python
83
83
  notion_token = os.getenv("NOTION_TOKEN")
84
84
 
85
+ # For core Notion operations
85
86
  helper = NotionHelper(notion_token)
87
+
88
+ # For ML experiment tracking (includes all NotionHelper methods)
89
+ ml_helper = MLNotionHelper(notion_token)
86
90
  ```
87
91
 
88
92
  ### Retrieve a Database (Container)
@@ -47,7 +47,7 @@ Here is an example of how to use the library:
47
47
 
48
48
  ```python
49
49
  import os
50
- from notionhelper import NotionHelper
50
+ from notionhelper import NotionHelper, MLNotionHelper
51
51
  ```
52
52
 
53
53
  ### Initialize the NotionHelper class
@@ -55,7 +55,11 @@ from notionhelper import NotionHelper
55
55
  ```python
56
56
  notion_token = os.getenv("NOTION_TOKEN")
57
57
 
58
+ # For core Notion operations
58
59
  helper = NotionHelper(notion_token)
60
+
61
+ # For ML experiment tracking (includes all NotionHelper methods)
62
+ ml_helper = MLNotionHelper(notion_token)
59
63
  ```
60
64
 
61
65
  ### Retrieve a Database (Container)
@@ -0,0 +1,70 @@
1
+ # ML Functions Separation - Implementation Summary
2
+
3
+ ## What Was Done
4
+
5
+ Successfully separated Machine Learning functions from the core NotionHelper class using **inheritance-based approach**.
6
+
7
+ ### File Changes
8
+
9
+ #### 1. **Created: `src/notionhelper/ml_logger.py`** (NEW)
10
+ - New `MLNotionHelper` class that **inherits from `NotionHelper`**
11
+ - Moved ML-specific methods:
12
+ - `log_ml_experiment()` - Logs experiments with metrics, plots, and artifacts
13
+ - `create_ml_database()` - Creates Notion databases optimized for ML tracking
14
+ - `dict_to_notion_schema()` - Converts dictionaries to Notion schema
15
+ - `dict_to_notion_props()` - Converts dictionaries to Notion properties
16
+
17
+ #### 2. **Modified: `src/notionhelper/helper.py`**
18
+ - Removed the 4 ML-specific methods listed above
19
+ - **Kept all core Notion API methods**:
20
+ - Database/data source operations
21
+ - Page creation and retrieval
22
+ - File upload and embedding
23
+ - Block management
24
+
25
+ #### 3. **Updated: `src/notionhelper/__init__.py`**
26
+ ```python
27
+ from .helper import NotionHelper
28
+ from .ml_logger import MLNotionHelper
29
+
30
+ __all__ = ["NotionHelper", "MLNotionHelper"]
31
+ ```
32
+
33
+ #### 4. **Updated: `examples/ml_demo.py`**
34
+ - Changed import: `from notionhelper import MLNotionHelper`
35
+ - Changed initialization: `nh = MLNotionHelper(NOTION_TOKEN)`
36
+
37
+ ## Usage
38
+
39
+ ### Simple, Single Instantiation:
40
+ ```python
41
+ from notionhelper import MLNotionHelper
42
+
43
+ # One line - that's it!
44
+ ml_tracker = MLNotionHelper(notion_token)
45
+
46
+ # Use ML methods
47
+ ml_tracker.log_ml_experiment(...)
48
+ ml_tracker.create_ml_database(...)
49
+
50
+ # Also available: all NotionHelper methods
51
+ ml_tracker.get_data_source(...)
52
+ ml_tracker.upload_file(...)
53
+ ```
54
+
55
+ ## Architecture Benefits
56
+
57
+ ✅ **Clean Separation** - ML logic isolated in dedicated module
58
+ ✅ **Single Instantiation** - No extra code needed
59
+ ✅ **Minimal Changes** - Just inherit and move methods
60
+ ✅ **Backward Compatible** - `NotionHelper` still available separately
61
+ ✅ **Extensible** - Easy to add other trackers (e.g., `ImageNotionHelper`)
62
+ ✅ **Elegant** - Inheritance makes intent clear
63
+
64
+ ## File Structure
65
+ ```
66
+ src/notionhelper/
67
+ ├── helper.py # Core Notion API methods
68
+ ├── ml_logger.py # ML experiment tracking (NEW)
69
+ └── __init__.py # Exports both classes
70
+ ```
@@ -0,0 +1,391 @@
1
+ """
2
+ NotionHelper ML Demo: Logistic Regression with sklearn
3
+ =======================================================
4
+ This demo showcases how to use NotionHelper to track ML experiments.
5
+
6
+ Features:
7
+ - Logistic regression on sklearn's breast cancer dataset
8
+ - Complete metrics tracking (accuracy, precision, recall, F1)
9
+ - Hyperparameter configuration
10
+ - Automatic Notion database creation
11
+ - Experiment logging with plots and artifacts
12
+ """
13
+
14
+ import os
15
+ import numpy as np
16
+ import pandas as pd
17
+ import matplotlib.pyplot as plt
18
+ import seaborn as sns
19
+ from sklearn.datasets import load_breast_cancer
20
+ from sklearn.model_selection import train_test_split
21
+ from sklearn.linear_model import LogisticRegression
22
+ from sklearn.metrics import (
23
+ accuracy_score,
24
+ precision_score,
25
+ recall_score,
26
+ f1_score,
27
+ confusion_matrix,
28
+ classification_report,
29
+ roc_curve,
30
+ roc_auc_score
31
+ )
32
+ from sklearn.preprocessing import StandardScaler
33
+
34
+ from notionhelper import MLNotionHelper
35
+
36
+
37
+ def train_logistic_regression(
38
+ test_size=0.2,
39
+ random_state=42,
40
+ C=1.0,
41
+ max_iter=1000,
42
+ solver='lbfgs',
43
+ penalty='l2',
44
+ scale_features=True
45
+ ):
46
+ """
47
+ Train a logistic regression model on breast cancer dataset.
48
+
49
+ Parameters:
50
+ -----------
51
+ test_size : float
52
+ Proportion of dataset to use for testing
53
+ random_state : int
54
+ Random seed for reproducibility
55
+ C : float
56
+ Inverse of regularization strength
57
+ max_iter : int
58
+ Maximum iterations for solver
59
+ solver : str
60
+ Algorithm to use in optimization
61
+ penalty : str
62
+ Regularization penalty type
63
+ scale_features : bool
64
+ Whether to standardize features
65
+
66
+ Returns:
67
+ --------
68
+ metrics : dict
69
+ Dictionary containing all evaluation metrics
70
+ plot_paths : list
71
+ List of paths to generated plots
72
+ artifacts : list
73
+ List of paths to saved artifacts
74
+ """
75
+
76
+ print("\n" + "="*60)
77
+ print("🔬 NOTIONHELPER ML DEMO: Logistic Regression")
78
+ print("="*60 + "\n")
79
+
80
+ # 1. Load Dataset
81
+ print("📊 Loading breast cancer dataset...")
82
+ data = load_breast_cancer()
83
+ X = pd.DataFrame(data.data, columns=data.feature_names)
84
+ y = pd.Series(data.target, name='target')
85
+
86
+ print(f" Dataset shape: {X.shape}")
87
+ print(f" Classes: {data.target_names}")
88
+ print(f" Features: {X.shape[1]}")
89
+
90
+ # 2. Split Data
91
+ print("\n🔀 Splitting data...")
92
+ X_train, X_test, y_train, y_test = train_test_split(
93
+ X, y, test_size=test_size, random_state=random_state, stratify=y
94
+ )
95
+ print(f" Training set: {X_train.shape[0]} samples")
96
+ print(f" Test set: {X_test.shape[0]} samples")
97
+
98
+ # 3. Feature Scaling (optional but recommended)
99
+ if scale_features:
100
+ print("\n⚖️ Scaling features...")
101
+ scaler = StandardScaler()
102
+ X_train = scaler.fit_transform(X_train)
103
+ X_test = scaler.transform(X_test)
104
+
105
+ # 4. Train Model
106
+ print("\n🤖 Training Logistic Regression model...")
107
+ model = LogisticRegression(
108
+ C=C,
109
+ max_iter=max_iter,
110
+ solver=solver,
111
+ penalty=penalty,
112
+ random_state=random_state
113
+ )
114
+ model.fit(X_train, y_train)
115
+ print(" ✓ Model trained successfully")
116
+
117
+ # 5. Make Predictions
118
+ print("\n🎯 Making predictions...")
119
+ y_pred = model.predict(X_test)
120
+ y_pred_proba = model.predict_proba(X_test)[:, 1]
121
+
122
+ # 6. Calculate Metrics
123
+ print("\n📈 Calculating metrics...")
124
+ accuracy = accuracy_score(y_test, y_pred)
125
+ precision = precision_score(y_test, y_pred)
126
+ recall = recall_score(y_test, y_pred)
127
+ f1 = f1_score(y_test, y_pred)
128
+ roc_auc = roc_auc_score(y_test, y_pred_proba)
129
+
130
+ metrics = {
131
+ "Accuracy": round(accuracy * 100, 2),
132
+ "Precision": round(precision * 100, 2),
133
+ "Recall": round(recall * 100, 2),
134
+ "F1_Score": round(f1 * 100, 2),
135
+ "ROC_AUC": round(roc_auc * 100, 2),
136
+ "Train_Samples": int(X_train.shape[0]),
137
+ "Test_Samples": int(X_test.shape[0])
138
+ }
139
+
140
+ # Print metrics
141
+ print("\n" + "="*60)
142
+ print("📊 MODEL PERFORMANCE METRICS")
143
+ print("-" * 60)
144
+ print(f"Accuracy : {metrics['Accuracy']:.2f}%")
145
+ print(f"Precision : {metrics['Precision']:.2f}%")
146
+ print(f"Recall : {metrics['Recall']:.2f}%")
147
+ print(f"F1 Score : {metrics['F1_Score']:.2f}%")
148
+ print(f"ROC AUC : {metrics['ROC_AUC']:.2f}%")
149
+ print("="*60 + "\n")
150
+
151
+ # 7. Generate Visualizations
152
+ print("📊 Generating visualizations...")
153
+ plot_paths = []
154
+
155
+ # Confusion Matrix
156
+ cm = confusion_matrix(y_test, y_pred)
157
+ plt.figure(figsize=(8, 6))
158
+ sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
159
+ xticklabels=data.target_names,
160
+ yticklabels=data.target_names)
161
+ plt.title('Confusion Matrix', fontsize=14, fontweight='bold')
162
+ plt.ylabel('True Label')
163
+ plt.xlabel('Predicted Label')
164
+ plt.tight_layout()
165
+ cm_path = 'confusion_matrix.png'
166
+ plt.savefig(cm_path, dpi=150)
167
+ plot_paths.append(cm_path)
168
+ plt.close()
169
+ print(f" ✓ Saved: {cm_path}")
170
+
171
+ # ROC Curve
172
+ fpr, tpr, _ = roc_curve(y_test, y_pred_proba)
173
+ plt.figure(figsize=(8, 6))
174
+ plt.plot(fpr, tpr, color='#1f77b4', lw=2,
175
+ label=f'ROC curve (AUC = {roc_auc:.3f})')
176
+ plt.plot([0, 1], [0, 1], color='gray', lw=1, linestyle='--',
177
+ label='Random Classifier')
178
+ plt.xlim([0.0, 1.0])
179
+ plt.ylim([0.0, 1.05])
180
+ plt.xlabel('False Positive Rate', fontsize=12)
181
+ plt.ylabel('True Positive Rate', fontsize=12)
182
+ plt.title('ROC Curve', fontsize=14, fontweight='bold')
183
+ plt.legend(loc="lower right")
184
+ plt.grid(alpha=0.3)
185
+ plt.tight_layout()
186
+ roc_path = 'roc_curve.png'
187
+ plt.savefig(roc_path, dpi=150)
188
+ plot_paths.append(roc_path)
189
+ plt.close()
190
+ print(f" ✓ Saved: {roc_path}")
191
+
192
+ # Feature Importance (Coefficients)
193
+ if not scale_features:
194
+ feature_importance = pd.DataFrame({
195
+ 'Feature': data.feature_names,
196
+ 'Coefficient': model.coef_[0]
197
+ }).sort_values('Coefficient', key=abs, ascending=False).head(15)
198
+
199
+ plt.figure(figsize=(10, 6))
200
+ colors = ['#d62728' if x < 0 else '#2ca02c' for x in feature_importance['Coefficient']]
201
+ plt.barh(feature_importance['Feature'], feature_importance['Coefficient'], color=colors)
202
+ plt.xlabel('Coefficient Value', fontsize=12)
203
+ plt.title('Top 15 Feature Importance (Logistic Regression Coefficients)',
204
+ fontsize=14, fontweight='bold')
205
+ plt.grid(axis='x', alpha=0.3)
206
+ plt.tight_layout()
207
+ feat_path = 'feature_importance.png'
208
+ plt.savefig(feat_path, dpi=150)
209
+ plot_paths.append(feat_path)
210
+ plt.close()
211
+ print(f" ✓ Saved: {feat_path}")
212
+
213
+ # 8. Save Artifacts
214
+ print("\n💾 Saving artifacts...")
215
+ artifacts = []
216
+
217
+ # Save predictions
218
+ predictions_df = pd.DataFrame({
219
+ 'True_Label': y_test.values,
220
+ 'Predicted_Label': y_pred,
221
+ 'Probability_Malignant': y_pred_proba,
222
+ 'Correct': (y_test.values == y_pred).astype(int)
223
+ })
224
+ pred_path = 'predictions.csv'
225
+ predictions_df.to_csv(pred_path, index=False)
226
+ artifacts.append(pred_path)
227
+ print(f" ✓ Saved: {pred_path}")
228
+
229
+ # Save classification report
230
+ report = classification_report(y_test, y_pred,
231
+ target_names=data.target_names,
232
+ output_dict=True)
233
+ report_df = pd.DataFrame(report).transpose()
234
+ report_path = 'classification_report.csv'
235
+ report_df.to_csv(report_path)
236
+ artifacts.append(report_path)
237
+ print(f" ✓ Saved: {report_path}")
238
+
239
+ # Combine plot paths and artifacts
240
+ all_artifacts = plot_paths + artifacts
241
+
242
+ return metrics, plot_paths, all_artifacts
243
+
244
+
245
+ def main():
246
+ """
247
+ Main function to demonstrate NotionHelper integration.
248
+ """
249
+
250
+ # ============================================================
251
+ # STEP 1: Define Hyperparameters Configuration
252
+ # ============================================================
253
+ config = {
254
+ "Experiment_Name": "Logistic Regression Demo",
255
+ "Model": "Logistic Regression",
256
+ "Dataset": "Breast Cancer (sklearn)",
257
+ "Test_Size": 0.2,
258
+ "Random_State": 42,
259
+ "C_Regularization": 1.0,
260
+ "Max_Iterations": 2,
261
+ "Solver": "lbfgs",
262
+ "Penalty": "l2",
263
+ "Feature_Scaling": True
264
+ }
265
+
266
+ # ============================================================
267
+ # STEP 2: Train Model and Calculate Metrics
268
+ # ============================================================
269
+ metrics, plot_paths, artifacts = train_logistic_regression(
270
+ test_size=config["Test_Size"],
271
+ random_state=config["Random_State"],
272
+ C=config["C_Regularization"],
273
+ max_iter=config["Max_Iterations"],
274
+ solver=config["Solver"],
275
+ penalty=config["Penalty"],
276
+ scale_features=config["Feature_Scaling"]
277
+ )
278
+
279
+ # ============================================================
280
+ # STEP 3: Initialize NotionHelper
281
+ # ============================================================
282
+ print("\n" + "="*60)
283
+ print("📝 NOTION INTEGRATION")
284
+ print("="*60 + "\n")
285
+
286
+ # IMPORTANT: Replace with your Notion API token
287
+ NOTION_TOKEN = os.getenv("NOTION_TOKEN", "your_notion_token_here")
288
+
289
+ if NOTION_TOKEN == "your_notion_token_here":
290
+ print("⚠️ WARNING: Please set your NOTION_TOKEN environment variable")
291
+ print(" Example: export NOTION_TOKEN='secret_...'")
292
+ print("\n✅ Demo completed successfully (without Notion logging)")
293
+ print(f"\n📁 Generated files:")
294
+ for artifact in artifacts:
295
+ print(f" • {artifact}")
296
+ return
297
+
298
+ try:
299
+ nh = MLNotionHelper(NOTION_TOKEN)
300
+ print("✓ MLNotionHelper initialized successfully")
301
+
302
+ # ============================================================
303
+ # STEP 4A: Create New Database (First time only)
304
+ # ============================================================
305
+ # Set CREATE_NEW_DB to True on first run, then set to False
306
+         CREATE_NEW_DB = False  # Set to True on the first run to create the database, then back to False
307
+ PARENT_PAGE_ID = "your page id here"
308
+
309
+ if CREATE_NEW_DB:
310
+ print("\n🗄️ Creating new Notion database...")
311
+ data_source_id = nh.create_ml_database(
312
+ parent_page_id=PARENT_PAGE_ID,
313
+ db_title="ML Experiments - Logistic Regression Demo",
314
+ config=config,
315
+ metrics=metrics,
316
+ file_property_name="Artifacts"
317
+ )
318
+ print(f"\n✅ Database created successfully!")
319
+ print(f"📝 Data Source ID: {data_source_id}")
320
+ print("\n" + "="*60)
321
+ print("⚠️ CRITICAL: Complete these steps NOW!")
322
+ print("="*60)
323
+ print("\n1️⃣ Go to Notion and find the new database:")
324
+ print(" 'ML Experiments - Logistic Regression Demo'")
325
+ print("\n2️⃣ Click '...' (top right) → Add connections")
326
+ print(" → Select your integration")
327
+ print("\n3️⃣ Save this Data Source ID:")
328
+ print(f" DATA_SOURCE_ID = \"{data_source_id}\"")
329
+ print("\n4️⃣ Set CREATE_NEW_DB = False in this script")
330
+ print("\n5️⃣ Run the script again to log experiments")
331
+ print("="*60 + "\n")
332
+
333
+ print("⏸️ Skipping experiment logging for this run.")
334
+ print(" Complete steps above, then run again.")
335
+ return # Exit after database creation
336
+
337
+ # This else block will only be reached if CREATE_NEW_DB was initially False
338
+ # and the user has already provided a DATA_SOURCE_ID.
339
+ else:
340
+ # Replace with your actual data source ID after creating the database
341
+ DATA_SOURCE_ID = "your_data_source_id_here" # This should be updated by the user
342
+
343
+
344
+ if DATA_SOURCE_ID == "your_data_source_id_here":
345
+ print("\n💡 To log experiments:")
346
+ print(" 1. Ensure CREATE_NEW_DB is False and DATA_SOURCE_ID is set.")
347
+ print(" 2. Make sure the database is shared with your integration.")
348
+ else:
349
+ print("\n� Logging experiment to Notion...")
350
+ page_id = nh.log_ml_experiment(
351
+ data_source_id=DATA_SOURCE_ID,
352
+ config=config,
353
+ metrics=metrics,
354
+ plots=plot_paths,
355
+ target_metric="F1_Score",
356
+ higher_is_better=True,
357
+ file_paths=artifacts,
358
+ file_property_name="Artifacts"
359
+ )
360
+
361
+ if page_id:
362
+ print(f"✓ Experiment logged successfully!")
363
+ print(f" Page ID: {page_id}")
364
+ else:
365
+ print("❌ Failed to log experiment")
366
+
367
+ except Exception as e:
368
+ print(f"❌ Notion API Error: {e}")
369
+ print(" Continuing without Notion logging...")
370
+
371
+ # ============================================================
372
+ # FINAL SUMMARY
373
+ # ============================================================
374
+ print("\n" + "="*60)
375
+ print("✅ DEMO COMPLETED SUCCESSFULLY")
376
+ print("="*60)
377
+ print(f"\n📁 Generated files:")
378
+ for artifact in artifacts:
379
+ print(f" • {artifact}")
380
+
381
+ print("\n📊 Key Metrics:")
382
+ print(f" • Accuracy: {metrics['Accuracy']:.2f}%")
383
+ print(f" • F1 Score: {metrics['F1_Score']:.2f}%")
384
+ print(f" • ROC AUC: {metrics['ROC_AUC']:.2f}%")
385
+
386
+ print("\n🎉 Thank you for trying NotionHelper!")
387
+ print("="*60 + "\n")
388
+
389
+
390
+ if __name__ == "__main__":
391
+ main()
@@ -1,6 +1,6 @@
1
1
  [project]
2
2
  name = "notionhelper"
3
- version = "0.3.1"
3
+ version = "0.4.0"
4
4
  description = "NotionHelper is a Python library that simplifies interactions with the Notion API, enabling easy management of databases, pages, and files within Notion workspaces."
5
5
  readme = "README.md"
6
6
  authors = [
@@ -0,0 +1,4 @@
1
+ from .helper import NotionHelper
2
+ from .ml_logger import MLNotionHelper
3
+
4
+ __all__ = ["NotionHelper", "MLNotionHelper"]
@@ -654,177 +654,3 @@ class NotionHelper:
654
654
  }
655
655
  response = requests.patch(update_url, headers=headers, json=data)
656
656
  return response.json()
657
-
658
- def dict_to_notion_schema(self, data: Dict[str, Any], title_key: str) -> Dict[str, Any]:
659
- """Converts a dictionary into a Notion property schema for database creation.
660
-
661
- Parameters:
662
- data (dict): Dictionary containing sample values to infer types from.
663
- title_key (str): The key that should be used as the title property.
664
-
665
- Returns:
666
- dict: A dictionary defining the Notion property schema.
667
- """
668
- properties = {}
669
-
670
- for key, value in data.items():
671
- # Handle NumPy types
672
- if hasattr(value, "item"):
673
- value = value.item()
674
-
675
- # Debug output to help diagnose type issues
676
- print(f"DEBUG: key='{key}', value={value}, type={type(value).__name__}, isinstance(bool)={isinstance(value, bool)}, isinstance(int)={isinstance(value, int)}")
677
-
678
- if key == title_key:
679
- properties[key] = {"title": {}}
680
- # IMPORTANT: Check for bool BEFORE (int, float) because bool is a subclass of int in Python
681
- elif isinstance(value, bool):
682
- properties[key] = {"checkbox": {}}
683
- print(f" → Assigned as CHECKBOX")
684
- elif isinstance(value, (int, float)):
685
- properties[key] = {"number": {"format": "number"}}
686
- print(f" → Assigned as NUMBER")
687
- else:
688
- properties[key] = {"rich_text": {}}
689
- print(f" → Assigned as RICH_TEXT")
690
-
691
- return properties
692
-
693
- def dict_to_notion_props(self, data: Dict[str, Any], title_key: str) -> Dict[str, Any]:
694
- """Converts a dictionary into Notion property values for page creation.
695
-
696
- Parameters:
697
- data (dict): Dictionary containing the values to convert.
698
- title_key (str): The key that should be used as the title property.
699
-
700
- Returns:
701
- dict: A dictionary defining the Notion property values.
702
- """
703
- notion_props = {}
704
- for key, value in data.items():
705
- # Handle NumPy types
706
- if hasattr(value, "item"):
707
- value = value.item()
708
-
709
- if key == title_key:
710
- ts = datetime.now().strftime("%Y-%m-%d %H:%M")
711
- notion_props[key] = {"title": [{"text": {"content": f"{value} ({ts})"}}]}
712
-
713
- # FIX: Handle Booleans
714
- elif isinstance(value, bool):
715
- # Option A: Map to a Checkbox column in Notion
716
- # notion_props[key] = {"checkbox": value}
717
-
718
- # Option B: Map to a Rich Text column as a string (since you added a rich text field)
719
- notion_props[key] = {"rich_text": [{"text": {"content": str(value)}}]}
720
-
721
- elif isinstance(value, (int, float)):
722
- if pd.isna(value) or np.isinf(value): continue
723
- notion_props[key] = {"number": float(value)}
724
- else:
725
- notion_props[key] = {"rich_text": [{"text": {"content": str(value)}}]}
726
- return notion_props
727
-
728
- def log_ml_experiment(
729
- self,
730
- data_source_id: str,
731
- config: Dict,
732
- metrics: Dict,
733
- plots: List[str] = None,
734
- target_metric: str = "sMAPE", # Re-added these
735
- higher_is_better: bool = False, # to fix the error
736
- file_paths: Optional[List[str]] = None, # Changed to list
737
- file_property_name: str = "Output Files"
738
- ):
739
- """Logs ML experiment and compares metrics with multiple file support."""
740
- improvement_tag = "Standard Run"
741
- new_score = metrics.get(target_metric)
742
-
743
- # 1. Leaderboard Logic (Champions)
744
- if new_score is not None:
745
- try:
746
- df = self.get_data_source_pages_as_dataframe(data_source_id, limit=100)
747
- if not df.empty and target_metric in df.columns:
748
- valid_scores = pd.to_numeric(df[target_metric], errors='coerce').dropna()
749
- if not valid_scores.empty:
750
- current_best = valid_scores.max() if higher_is_better else valid_scores.min()
751
- is_improvement = (new_score > current_best) if higher_is_better else (new_score < current_best)
752
- if is_improvement:
753
- improvement_tag = f"🏆 NEW BEST {target_metric} (Prev: {current_best:.2f})"
754
- else:
755
- diff = abs(new_score - current_best)
756
- improvement_tag = f"No Improvement (+{diff:.2f} {target_metric})"
757
- except Exception as e:
758
- print(f"Leaderboard check skipped: {e}")
759
-
760
- # 2. Prepare Notion Properties
761
- data_for_notion = metrics.copy()
762
- data_for_notion["Run Status"] = improvement_tag
763
- combined_payload = {**config, **data_for_notion}
764
- title_key = list(config.keys())[0]
765
- properties = self.dict_to_notion_props(combined_payload, title_key)
766
-
767
- try:
768
- # 3. Create the row
769
- new_page = self.new_page_to_data_source(data_source_id, properties)
770
- page_id = new_page["id"]
771
-
772
- # 4. Handle Plots (Body)
773
- if plots:
774
- for plot_path in plots:
775
- if os.path.exists(plot_path):
776
- self.one_step_image_embed(page_id, plot_path)
777
-
778
- # 5. Handle Multiple File Uploads (Property)
779
- if file_paths:
780
- file_assets = []
781
- for path in file_paths:
782
- if os.path.exists(path):
783
- print(f"Uploading {path}...")
784
- upload_resp = self.upload_file(path)
785
- file_assets.append({
786
- "type": "file_upload",
787
- "file_upload": {"id": upload_resp["id"]},
788
- "name": os.path.basename(path),
789
- })
790
-
791
- if file_assets:
792
- # Attach all files in one request
793
- update_url = f"https://api.notion.com/v1/pages/{page_id}"
794
- file_payload = {"properties": {file_property_name: {"files": file_assets}}}
795
- self._make_request("PATCH", update_url, file_payload)
796
- print(f"✅ {len(file_assets)} files attached to {file_property_name}")
797
-
798
- return page_id
799
- except Exception as e:
800
- print(f"Log error: {e}")
801
- return None
802
-
803
- def create_ml_database(self, parent_page_id: str, db_title: str, config: Dict, metrics: Dict, file_property_name: str = "Output Files") -> str:
804
- """
805
- Analyzes dicts to create a new Notion Database with the correct schema.
806
- Uses dict_to_notion_schema() for universal type conversion.
807
- """
808
- combined = {**config, **metrics}
809
- title_key = list(config.keys())[0]
810
-
811
- # Use the universal dict_to_notion_schema() method
812
- properties = self.dict_to_notion_schema(combined, title_key)
813
-
814
- # Add 'Run Status' if not already present
815
- if "Run Status" not in properties:
816
- properties["Run Status"] = {"rich_text": {}}
817
-
818
- # Add the Multi-file property
819
- properties[file_property_name] = {"files": {}}
820
-
821
- print(f"Creating database '{db_title}' with {len(properties)} columns...")
822
-
823
- response = self.create_database(
824
- parent_page_id=parent_page_id,
825
- database_title=db_title,
826
- initial_data_source_properties=properties
827
- )
828
-
829
- data_source_id = response.get("initial_data_source", {}).get("id")
830
- return data_source_id if data_source_id else response.get("id")
@@ -0,0 +1,206 @@
1
+ from typing import Optional, Dict, List, Any
2
+ import pandas as pd
3
+ import numpy as np
4
+ import os
5
+ from datetime import datetime
6
+
7
+ from .helper import NotionHelper
8
+
9
+
10
+ class MLNotionHelper(NotionHelper):
11
+ """
12
+ ML experiment tracking helper that extends NotionHelper.
13
+
14
+ Provides specialized methods for logging and tracking machine learning experiments,
15
+ automatically comparing metrics against historical runs and logging results to Notion.
16
+
17
+ Methods
18
+ -------
19
+ log_ml_experiment(data_source_id, config, metrics, plots, target_metric,
20
+ higher_is_better, file_paths, file_property_name):
21
+ Logs an ML experiment run with metrics, plots, and artifacts.
22
+
23
+ create_ml_database(parent_page_id, db_title, config, metrics, file_property_name):
24
+ Creates a new Notion database optimized for ML experiment tracking.
25
+
26
+ dict_to_notion_schema(data, title_key):
27
+ Converts a dictionary into a Notion property schema.
28
+
29
+ dict_to_notion_props(data, title_key):
30
+ Converts a dictionary into Notion property values.
31
+ """
32
+
33
+ def dict_to_notion_schema(self, data: Dict[str, Any], title_key: str) -> Dict[str, Any]:
34
+ """Converts a dictionary into a Notion property schema for database creation.
35
+
36
+ Parameters:
37
+ data (dict): Dictionary containing sample values to infer types from.
38
+ title_key (str): The key that should be used as the title property.
39
+
40
+ Returns:
41
+ dict: A dictionary defining the Notion property schema.
42
+ """
43
+ properties = {}
44
+
45
+ for key, value in data.items():
46
+ # Handle NumPy types
47
+ if hasattr(value, "item"):
48
+ value = value.item()
49
+
50
+ # Debug output to help diagnose type issues
51
+ print(f"DEBUG: key='{key}', value={value}, type={type(value).__name__}, isinstance(bool)={isinstance(value, bool)}, isinstance(int)={isinstance(value, int)}")
52
+
53
+ if key == title_key:
54
+ properties[key] = {"title": {}}
55
+ # IMPORTANT: Check for bool BEFORE (int, float) because bool is a subclass of int in Python
56
+ elif isinstance(value, bool):
57
+ properties[key] = {"checkbox": {}}
58
+ print(f" → Assigned as CHECKBOX")
59
+ elif isinstance(value, (int, float)):
60
+ properties[key] = {"number": {"format": "number"}}
61
+ print(f" → Assigned as NUMBER")
62
+ else:
63
+ properties[key] = {"rich_text": {}}
64
+ print(f" → Assigned as RICH_TEXT")
65
+
66
+ return properties
67
+
68
+ def dict_to_notion_props(self, data: Dict[str, Any], title_key: str) -> Dict[str, Any]:
69
+ """Converts a dictionary into Notion property values for page creation.
70
+
71
+ Parameters:
72
+ data (dict): Dictionary containing the values to convert.
73
+ title_key (str): The key that should be used as the title property.
74
+
75
+ Returns:
76
+ dict: A dictionary defining the Notion property values.
77
+ """
78
+ notion_props = {}
79
+ for key, value in data.items():
80
+ # Handle NumPy types
81
+ if hasattr(value, "item"):
82
+ value = value.item()
83
+
84
+ if key == title_key:
85
+ ts = datetime.now().strftime("%Y-%m-%d %H:%M")
86
+ notion_props[key] = {"title": [{"text": {"content": f"{value} ({ts})"}}]}
87
+
88
+ # FIX: Handle Booleans
89
+ elif isinstance(value, bool):
90
+ # Option A: Map to a Checkbox column in Notion
91
+ # notion_props[key] = {"checkbox": value}
92
+
93
+ # Option B: Map to a Rich Text column as a string (since you added a rich text field)
94
+ notion_props[key] = {"rich_text": [{"text": {"content": str(value)}}]}
95
+
96
+ elif isinstance(value, (int, float)):
97
+ if pd.isna(value) or np.isinf(value):
98
+ continue
99
+ notion_props[key] = {"number": float(value)}
100
+ else:
101
+ notion_props[key] = {"rich_text": [{"text": {"content": str(value)}}]}
102
+ return notion_props
103
+
104
+ def log_ml_experiment(
105
+ self,
106
+ data_source_id: str,
107
+ config: Dict,
108
+ metrics: Dict,
109
+ plots: List[str] = None,
110
+ target_metric: str = "sMAPE",
111
+ higher_is_better: bool = False,
112
+ file_paths: Optional[List[str]] = None,
113
+ file_property_name: str = "Output Files"
114
+ ):
115
+ """Logs ML experiment and compares metrics with multiple file support."""
116
+ improvement_tag = "Standard Run"
117
+ new_score = metrics.get(target_metric)
118
+
119
+ # 1. Leaderboard Logic (Champions)
120
+ if new_score is not None:
121
+ try:
122
+ df = self.get_data_source_pages_as_dataframe(data_source_id, limit=100)
123
+ if not df.empty and target_metric in df.columns:
124
+ valid_scores = pd.to_numeric(df[target_metric], errors='coerce').dropna()
125
+ if not valid_scores.empty:
126
+ current_best = valid_scores.max() if higher_is_better else valid_scores.min()
127
+ is_improvement = (new_score > current_best) if higher_is_better else (new_score < current_best)
128
+ if is_improvement:
129
+ improvement_tag = f"🏆 NEW BEST {target_metric} (Prev: {current_best:.2f})"
130
+ else:
131
+ diff = abs(new_score - current_best)
132
+ improvement_tag = f"No Improvement (+{diff:.2f} {target_metric})"
133
+ except Exception as e:
134
+ print(f"Leaderboard check skipped: {e}")
135
+
136
+ # 2. Prepare Notion Properties
137
+ data_for_notion = metrics.copy()
138
+ data_for_notion["Run Status"] = improvement_tag
139
+ combined_payload = {**config, **data_for_notion}
140
+ title_key = list(config.keys())[0]
141
+ properties = self.dict_to_notion_props(combined_payload, title_key)
142
+
143
+ try:
144
+ # 3. Create the row
145
+ new_page = self.new_page_to_data_source(data_source_id, properties)
146
+ page_id = new_page["id"]
147
+
148
+ # 4. Handle Plots (Body)
149
+ if plots:
150
+ for plot_path in plots:
151
+ if os.path.exists(plot_path):
152
+ self.one_step_image_embed(page_id, plot_path)
153
+
154
+ # 5. Handle Multiple File Uploads (Property)
155
+ if file_paths:
156
+ file_assets = []
157
+ for path in file_paths:
158
+ if os.path.exists(path):
159
+ print(f"Uploading {path}...")
160
+ upload_resp = self.upload_file(path)
161
+ file_assets.append({
162
+ "type": "file_upload",
163
+ "file_upload": {"id": upload_resp["id"]},
164
+ "name": os.path.basename(path),
165
+ })
166
+
167
+ if file_assets:
168
+ # Attach all files in one request
169
+ update_url = f"https://api.notion.com/v1/pages/{page_id}"
170
+ file_payload = {"properties": {file_property_name: {"files": file_assets}}}
171
+ self._make_request("PATCH", update_url, file_payload)
172
+ print(f"✅ {len(file_assets)} files attached to {file_property_name}")
173
+
174
+ return page_id
175
+ except Exception as e:
176
+ print(f"Log error: {e}")
177
+ return None
178
+
179
+ def create_ml_database(self, parent_page_id: str, db_title: str, config: Dict, metrics: Dict, file_property_name: str = "Output Files") -> str:
180
+ """
181
+ Analyzes dicts to create a new Notion Database with the correct schema.
182
+ Uses dict_to_notion_schema() for universal type conversion.
183
+ """
184
+ combined = {**config, **metrics}
185
+ title_key = list(config.keys())[0]
186
+
187
+ # Use the universal dict_to_notion_schema() method
188
+ properties = self.dict_to_notion_schema(combined, title_key)
189
+
190
+ # Add 'Run Status' if not already present
191
+ if "Run Status" not in properties:
192
+ properties["Run Status"] = {"rich_text": {}}
193
+
194
+ # Add the Multi-file property
195
+ properties[file_property_name] = {"files": {}}
196
+
197
+ print(f"Creating database '{db_title}' with {len(properties)} columns...")
198
+
199
+ response = self.create_database(
200
+ parent_page_id=parent_page_id,
201
+ database_title=db_title,
202
+ initial_data_source_properties=properties
203
+ )
204
+
205
+ data_source_id = response.get("initial_data_source", {}).get("id")
206
+ return data_source_id if data_source_id else response.get("id")
@@ -1279,7 +1279,7 @@ wheels = [
1279
1279
 
1280
1280
  [[package]]
1281
1281
  name = "notionhelper"
1282
- version = "0.2.1"
1282
+ version = "0.4.0"
1283
1283
  source = { editable = "." }
1284
1284
  dependencies = [
1285
1285
  { name = "mimetype" },
@@ -1,3 +0,0 @@
1
- from .helper import NotionHelper
2
-
3
- __all__ = ["NotionHelper"]
File without changes
File without changes
File without changes