PyPI - jupytertracker - Versions diffs - 0.1.0__tar.gz - Mend

jupytertracker 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

jupytertracker-0.1.0/.claude/CLAUDE.md +24 -0
jupytertracker-0.1.0/.gitignore +13 -0
jupytertracker-0.1.0/PKG-INFO +148 -0
jupytertracker-0.1.0/README.md +136 -0
jupytertracker-0.1.0/pyproject.toml +26 -0
jupytertracker-0.1.0/setup.cfg +7 -0
jupytertracker-0.1.0/src/jupytertracker/__init__.py +68 -0
jupytertracker-0.1.0/src/jupytertracker/exporter.py +61 -0
jupytertracker-0.1.0/src/jupytertracker/tracker.py +92 -0
jupytertracker-0.1.0/tests/conftest.py +24 -0
jupytertracker-0.1.0/tests/test_exporter.py +127 -0
jupytertracker-0.1.0/tests/test_init.py +69 -0
jupytertracker-0.1.0/tests/test_tracker.py +149 -0

jupytertracker-0.1.0/.claude/CLAUDE.md ADDED

@@ -0,0 +1,24 @@
+# gstack
+Use the `/browse` skill from gstack for all web browsing. Never use `mcp__claude-in-chrome__*` tools directly.
+Available gstack skills:
+`/office-hours`, `/plan-ceo-review`, `/plan-eng-review`, `/plan-design-review`, `/design-consultation`, `/design-shotgun`, `/design-html`, `/review`, `/ship`, `/land-and-deploy`, `/canary`, `/benchmark`, `/browse`, `/connect-chrome`, `/qa`, `/qa-only`, `/design-review`, `/setup-browser-cookies`, `/setup-deploy`, `/setup-gbrain`, `/retro`, `/investigate`, `/document-release`, `/document-generate`, `/codex`, `/cso`, `/autoplan`, `/plan-devex-review`, `/devex-review`, `/careful`, `/freeze`, `/guard`, `/unfreeze`, `/gstack-upgrade`, `/learn`
+## Skill routing
+When the user's request matches an available skill, invoke it via the Skill tool. When in doubt, invoke the skill.
+Key routing rules:
+- Product ideas/brainstorming → invoke /office-hours
+- Strategy/scope → invoke /plan-ceo-review
+- Architecture → invoke /plan-eng-review
+- Design system/plan review → invoke /design-consultation or /plan-design-review
+- Full review pipeline → invoke /autoplan
+- Bugs/errors → invoke /investigate
+- QA/testing site behavior → invoke /qa or /qa-only
+- Code review/diff check → invoke /review
+- Visual polish → invoke /design-review
+- Ship/deploy/PR → invoke /ship or /land-and-deploy
+- Save progress → invoke /context-save
+- Resume context → invoke /context-restore

jupytertracker-0.1.0/.gitignore ADDED

@@ -0,0 +1,13 @@
+__pycache__/
+*.py[cod]
+*.egg-info/
+dist/
+build/
+.eggs/
+*.egg
+.venv/
+venv/
+.pytest_cache/
+.mypy_cache/
+*.pyc
+.DS_Store

jupytertracker-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,148 @@
+Metadata-Version: 2.4
+Name: jupytertracker
+Version: 0.1.0
+Summary: Track Jupyter notebook cell execution and export a clean, ordered Python script
+License: MIT
+Requires-Python: >=3.8
+Requires-Dist: ipython>=7.0
+Provides-Extra: dev
+Requires-Dist: nbformat>=5.0; extra == 'dev'
+Requires-Dist: pytest>=7.0; extra == 'dev'
+Description-Content-Type: text/markdown
+# jupytertracker
+Part of an end-to-end ML model management system for replicable machine learning.
+## The problem
+Building a machine learning model in a Jupyter notebook is iterative and messy — cells run out of order, code gets modified and re-run, hyperparameters get tweaked. When a model reviewer asks "how did you build this?", the data scientist has to manually reconstruct the process. When a compliance team asks for documentation, someone has to write it by hand.
+The result: models that can't be independently replicated, and whitepapers that are written after the fact from memory rather than from the actual process.
+## System vision
+This library is Component 1 of a three-part system for making the ML modeling process fully replicable and auditable:
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                  ML Model Management System                     │
+├──────────────────┬──────────────────────┬───────────────────────┤
+│  Component 1     │  Component 2         │  Component 3          │
+│  JupyterTracker  │  MLflow Integration  │  Whitepaper Generator │
+│  (this library)  │                      │                       │
+├──────────────────┼──────────────────────┼───────────────────────┤
+│ Records every    │ Registers models,    │ Generates a structured│
+│ cell execution   │ tracks experiments,  │ report (data, method, │
+│ in order. Exports│ parameters, metrics, │ results, limitations) │
+│ an honest Python │ and serves models.   │ from code annotations │
+│ script of what   │ Uses MLflow as-is.   │ using an LLM.         │
+│ actually ran.    │                      │                       │
+├──────────────────┴──────────────────────┴───────────────────────┤
+│  Together: a non-technical reviewer can verify what was built,  │
+│  how it was built, and reproduce the result independently.      │
+└─────────────────────────────────────────────────────────────────┘
+```
+**Data flow:**
+```
+Notebook session
+  │
+  ├── JupyterTracker records every cell execution (parallel, live)
+  │     └── export_script() → ordered .py file with timing
+  │
+  ├── MLflow tracks experiments, parameters, and metrics (parallel, live)
+  │     └── model registry → reproducible run IDs
+  │
+  └── On demand: Whitepaper generator
+        ├── pulls execution log from JupyterTracker
+        ├── pulls run metadata from MLflow
+        └── uses wpr_-prefixed function outputs as report sections
+              └── LLM assembles → structured whitepaper (PDF/Markdown)
+```
+---
+## Component 1: JupyterTracker
+Track Jupyter notebook cell executions and export a clean, ordered Python script — exactly what ran, in the order it ran.
+### Install
+```bash
+pip install jupytertracker
+```
+### Usage
+Add two lines at the top of your notebook:
+```python
+import jupytertracker
+jupytertracker.start()
+```
+When you're done, export:
+```python
+jupytertracker.export_script("my_analysis.py")
+```
+The output is a `.py` file with every cell execution in order, one block per run:
+```python
+# Generated by jupytertracker (sequential mode)
+# Total execution time: 4m 14.6s
+# Cells recorded: 5
+# execution 1  [340ms]
+x = load_data("train.csv")
+# execution 2  [1m 52.1s]
+model = train(x, lr=0.01)
+# execution 3  [18.40s]
+evaluate(model)
+# execution 4  [1m 48.7s]
+model = train(x, lr=0.1)
+# execution 5  [15.10s]
+evaluate(model)
+### API
+```python
+jupytertracker.start(ip=None)        # start tracking; idempotent
+jupytertracker.stop()                # stop tracking; next start() begins fresh
+jupytertracker.export_script(path)   # write execution log to .py file
+jupytertracker.clear()               # clear the log without stopping
+jupytertracker.get_log()             # return list of ExecutionRecord
+```
+### Notes
+- **Call `start()` in your very first cell**, before any imports or data loading. The tracker only records what runs after `start()` is called. Any state built up before — loaded dataframes, imported libraries, defined variables — is invisible to the tracker and will be missing from the exported script.
+- **The exported script is an execution record, not a guaranteed reproducible script.** If cells depended on state that existed in the kernel but wasn't captured (see above), the script will fail with a `NameError` when run top-to-bottom.
+- **Failed cells are excluded.** Cells that raise an exception, have a syntax error, or are interrupted by the user are not recorded — only successful executions appear in the output.
+- **Kernel restart** resets tracking automatically (Python state is cleared). Call `export_script()` before restarting if you want to preserve the session.
+- Magic commands (`%matplotlib inline`, `!pip install ...`) are included with a comment noting they require a Jupyter environment.
+## Related projects
+- **[ipyflow](https://github.com/ipyflow/ipyflow)** — reactive Python kernel that tracks dataflow between cells and can recover the minimal set of cells needed to reproduce an output. Requires switching kernels; takes a "prevent the mess" approach vs. jupytertracker's "record the mess" approach.
+- **[papermill](https://github.com/nteract/papermill)** — parameterizes and executes notebooks top-to-bottom. Good for batch runs; doesn't handle interactive out-of-order execution.
+- **[reprozip-jupyter](https://pypi.org/project/reprozip-jupyter/)** — packs the full notebook environment (libraries, data) for portability. Solves environment reproducibility, not execution-order reproducibility.
+- **[MLflow](https://mlflow.org)** — experiment tracking, model registry, and model serving. Component 2 of this system.
+## Roadmap
+- **v2:** `mode='dedup'` — deduplicate to the last version of each cell, ordered by last execution. For "clean up my notebook" workflows.
+- **Component 2:** MLflow integration — link JupyterTracker sessions to MLflow run IDs automatically.
+- **Component 3:** Whitepaper generator — `wpr_`-prefixed functions collect outputs for LLM-generated structured reports.
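
A note on the `mode='dedup'` roadmap item: it describes keeping only the last version of each cell, ordered by last execution. The `ExecutionRecord` shipped in 0.1.0 carries no cell identifier, so the sketch below assumes a hypothetical `cell_id` field. The names `Record` and `dedup_last` are illustrative and not part of the package; this is one possible reading of the roadmap item, not the planned implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    # Hypothetical stand-in for ExecutionRecord with an added cell_id field.
    cell_id: str
    exec_count: int
    source: str


def dedup_last(records: List[Record]) -> List[Record]:
    """Keep the last execution of each cell, ordered by last execution."""
    latest: Dict[str, Record] = {}
    for record in records:
        # Drop any earlier entry so re-insertion moves the cell to the end.
        latest.pop(record.cell_id, None)
        latest[record.cell_id] = record
    return list(latest.values())


log = [
    Record("A", 1, "model = train(x, lr=0.01)"),
    Record("B", 2, "evaluate(model)"),
    Record("A", 3, "model = train(x, lr=0.1)"),
]
deduped = dedup_last(log)
print([r.exec_count for r in deduped])  # [2, 3]
```

The sketch relies on `dict` preserving insertion order, which holds for the Python versions this package supports (3.8 and later).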

jupytertracker-0.1.0/README.md ADDED

@@ -0,0 +1,136 @@
+# jupytertracker
+Part of an end-to-end ML model management system for replicable machine learning.
+## The problem
+Building a machine learning model in a Jupyter notebook is iterative and messy — cells run out of order, code gets modified and re-run, hyperparameters get tweaked. When a model reviewer asks "how did you build this?", the data scientist has to manually reconstruct the process. When a compliance team asks for documentation, someone has to write it by hand.
+The result: models that can't be independently replicated, and whitepapers that are written after the fact from memory rather than from the actual process.
+## System vision
+This library is Component 1 of a three-part system for making the ML modeling process fully replicable and auditable:
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                  ML Model Management System                     │
+├──────────────────┬──────────────────────┬───────────────────────┤
+│  Component 1     │  Component 2         │  Component 3          │
+│  JupyterTracker  │  MLflow Integration  │  Whitepaper Generator │
+│  (this library)  │                      │                       │
+├──────────────────┼──────────────────────┼───────────────────────┤
+│ Records every    │ Registers models,    │ Generates a structured│
+│ cell execution   │ tracks experiments,  │ report (data, method, │
+│ in order. Exports│ parameters, metrics, │ results, limitations) │
+│ an honest Python │ and serves models.   │ from code annotations │
+│ script of what   │ Uses MLflow as-is.   │ using an LLM.         │
+│ actually ran.    │                      │                       │
+├──────────────────┴──────────────────────┴───────────────────────┤
+│  Together: a non-technical reviewer can verify what was built,  │
+│  how it was built, and reproduce the result independently.      │
+└─────────────────────────────────────────────────────────────────┘
+```
+**Data flow:**
+```
+Notebook session
+  │
+  ├── JupyterTracker records every cell execution (parallel, live)
+  │     └── export_script() → ordered .py file with timing
+  │
+  ├── MLflow tracks experiments, parameters, and metrics (parallel, live)
+  │     └── model registry → reproducible run IDs
+  │
+  └── On demand: Whitepaper generator
+        ├── pulls execution log from JupyterTracker
+        ├── pulls run metadata from MLflow
+        └── uses wpr_-prefixed function outputs as report sections
+              └── LLM assembles → structured whitepaper (PDF/Markdown)
+```
+---
+## Component 1: JupyterTracker
+Track Jupyter notebook cell executions and export a clean, ordered Python script — exactly what ran, in the order it ran.
+### Install
+```bash
+pip install jupytertracker
+```
+### Usage
+Add two lines at the top of your notebook:
+```python
+import jupytertracker
+jupytertracker.start()
+```
+When you're done, export:
+```python
+jupytertracker.export_script("my_analysis.py")
+```
+The output is a `.py` file with every cell execution in order, one block per run:
+```python
+# Generated by jupytertracker (sequential mode)
+# Total execution time: 4m 14.6s
+# Cells recorded: 5
+# execution 1  [340ms]
+x = load_data("train.csv")
+# execution 2  [1m 52.1s]
+model = train(x, lr=0.01)
+# execution 3  [18.40s]
+evaluate(model)
+# execution 4  [1m 48.7s]
+model = train(x, lr=0.1)
+# execution 5  [15.10s]
+evaluate(model)
+### API
+```python
+jupytertracker.start(ip=None)        # start tracking; idempotent
+jupytertracker.stop()                # stop tracking; next start() begins fresh
+jupytertracker.export_script(path)   # write execution log to .py file
+jupytertracker.clear()               # clear the log without stopping
+jupytertracker.get_log()             # return list of ExecutionRecord
+```
+### Notes
+- **Call `start()` in your very first cell**, before any imports or data loading. The tracker only records what runs after `start()` is called. Any state built up before — loaded dataframes, imported libraries, defined variables — is invisible to the tracker and will be missing from the exported script.
+- **The exported script is an execution record, not a guaranteed reproducible script.** If cells depended on state that existed in the kernel but wasn't captured (see above), the script will fail with a `NameError` when run top-to-bottom.
+- **Failed cells are excluded.** Cells that raise an exception, have a syntax error, or are interrupted by the user are not recorded — only successful executions appear in the output.
+- **Kernel restart** resets tracking automatically (Python state is cleared). Call `export_script()` before restarting if you want to preserve the session.
+- Magic commands (`%matplotlib inline`, `!pip install ...`) are included with a comment noting they require a Jupyter environment.
+## Related projects
+- **[ipyflow](https://github.com/ipyflow/ipyflow)** — reactive Python kernel that tracks dataflow between cells and can recover the minimal set of cells needed to reproduce an output. Requires switching kernels; takes a "prevent the mess" approach vs. jupytertracker's "record the mess" approach.
+- **[papermill](https://github.com/nteract/papermill)** — parameterizes and executes notebooks top-to-bottom. Good for batch runs; doesn't handle interactive out-of-order execution.
+- **[reprozip-jupyter](https://pypi.org/project/reprozip-jupyter/)** — packs the full notebook environment (libraries, data) for portability. Solves environment reproducibility, not execution-order reproducibility.
+- **[MLflow](https://mlflow.org)** — experiment tracking, model registry, and model serving. Component 2 of this system.
+## Roadmap
+- **v2:** `mode='dedup'` — deduplicate to the last version of each cell, ordered by last execution. For "clean up my notebook" workflows.
+- **Component 2:** MLflow integration — link JupyterTracker sessions to MLflow run IDs automatically.
+- **Component 3:** Whitepaper generator — `wpr_`-prefixed functions collect outputs for LLM-generated structured reports.
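
The README notes say that an exported script can fail with a `NameError` when it depends on kernel state created before `start()`, and that magic commands need a Jupyter environment. A minimal sketch of both failure modes follows, using only the standard library. The helper `replay` and the sample script strings are hypothetical and not part of the package.

```python
def replay(script: str) -> str:
    """Run an exported script in an empty namespace and report the outcome."""
    namespace: dict = {}
    try:
        exec(compile(script, "<exported>", "exec"), namespace)
    except NameError as exc:
        return f"NameError: {exc}"
    except SyntaxError:
        return "SyntaxError"
    return "ok"


# `df` was loaded before start() in this hypothetical session,
# so the exported script never defines it.
missing_state = "# execution 1  [12ms]\nclean = df.dropna()\n"

# Every name used here is defined by an earlier recorded execution.
complete = (
    "# execution 1  [3ms]\nvalues = [1, 2, 3]\n"
    "# execution 2  [1ms]\ntotal = sum(values)\n"
)

# A line magic is not valid Python syntax outside IPython.
magic = "# execution 1  [5ms]\n%matplotlib inline\n"

print(replay(missing_state))  # NameError: name 'df' is not defined
print(replay(complete))       # ok
print(replay(magic))          # SyntaxError
```

A check like this could be run on an export before handing it to a reviewer, though it executes the recorded code and so should only be used on scripts that are safe to re-run.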

jupytertracker-0.1.0/pyproject.toml ADDED

@@ -0,0 +1,26 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+[project]
+name = "jupytertracker"
+version = "0.1.0"
+description = "Track Jupyter notebook cell execution and export a clean, ordered Python script"
+readme = "README.md"
+requires-python = ">=3.8"
+license = { text = "MIT" }
+dependencies = [
+    "ipython>=7.0",
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0",
+    "nbformat>=5.0",
+]
+[tool.hatch.build.targets.wheel]
+packages = ["src/jupytertracker"]
+[tool.pytest.ini_options]
+testpaths = ["tests"]

jupytertracker-0.1.0/setup.cfg ADDED

@@ -0,0 +1,7 @@
+[options]
+package_dir =
+    = src
+packages = find:
+[options.packages.find]
+where = src

jupytertracker-0.1.0/src/jupytertracker/__init__.py ADDED

@@ -0,0 +1,68 @@
+"""
+jupytertracker — record Jupyter notebook cell executions and export an ordered script.
+Basic usage:
+    import jupytertracker
+    jupytertracker.start()
+    # ... run cells in your notebook ...
+    jupytertracker.export_script("output.py")
+"""
+from __future__ import annotations
+from pathlib import Path
+from typing import Optional
+from .tracker import Tracker
+from .exporter import export_sequential
+_tracker: Optional[Tracker] = None
+def start(ip=None) -> None:
+    """Start tracking cell executions. Safe to call multiple times (idempotent)."""
+    global _tracker
+    if _tracker is None:
+        _tracker = Tracker()
+    _tracker.start(ip=ip)
+def stop() -> None:
+    """Stop tracking. Does nothing if tracking was not started."""
+    if _tracker is not None:
+        _tracker.stop()
+def export_script(path: str, mode: str = "sequential") -> None:
+    """Export the recorded execution log to a Python script.
+    Args:
+        path: Output file path (e.g. 'output.py').
+        mode: 'sequential' (default) — every execution in order, no deduplication.
+              'dedup' — last version of each cell only (deferred to v2).
+    """
+    if _tracker is None:
+        raise RuntimeError(
+            "Tracking has not been started. Call jupytertracker.start() first."
+        )
+    if mode == "sequential":
+        export_sequential(_tracker.log, path)
+    elif mode == "dedup":
+        raise NotImplementedError(
+            "mode='dedup' is planned for v2. Use mode='sequential' (the default)."
+        )
+    else:
+        raise ValueError(f"Unknown mode '{mode}'. Use 'sequential'.")
+def clear() -> None:
+    """Clear the recorded execution log without stopping tracking."""
+    if _tracker is not None:
+        _tracker.clear()
+def get_log():
+    """Return a copy of the current execution log (list of ExecutionRecord)."""
+    if _tracker is None:
+        return []
+    return _tracker.log
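
A note on `get_log()`: it returns `_tracker.log`, which `Tracker` builds with `list(self._log)`. That is a new list holding the same record objects, so the copy is shallow. The sketch below illustrates the consequence with stand-in classes; `Record` and `MiniTracker` are hypothetical reductions written for this note, not the package's own classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    # Stand-in for ExecutionRecord, reduced to two fields.
    exec_count: int
    source: str


class MiniTracker:
    # Hypothetical reduction of Tracker: only the log and its property.
    def __init__(self) -> None:
        self._log: List[Record] = [Record(1, "x = 1")]

    @property
    def log(self) -> List[Record]:
        return list(self._log)  # new list, same Record objects


tracker = MiniTracker()
snapshot = tracker.log

snapshot.append(Record(2, "y = 2"))  # changes only the caller's list
print(len(tracker.log))              # 1

snapshot[0].source = "edited"        # Record objects are shared
print(tracker.log[0].source)         # edited
```

Callers who need a fully isolated snapshot could apply `copy.deepcopy` to the returned list.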

jupytertracker-0.1.0/src/jupytertracker/exporter.py ADDED

@@ -0,0 +1,61 @@
+from __future__ import annotations
+import textwrap
+from pathlib import Path
+from typing import List
+from .tracker import ExecutionRecord
+_HEADER_TEMPLATE = """\
+# Generated by jupytertracker (sequential mode)
+# Total execution time: {total_time}
+# Cells recorded: {cell_count}
+#
+# Each block below reflects exactly what ran, in the order it ran.
+# A cell that was modified and re-run appears multiple times — once per execution.
+# NOTE: This script may not run top-to-bottom without error if cells relied on
+# intermediate kernel state not captured here. It is an execution record, not a
+# guaranteed reproducible script.
+"""
+def _fmt_duration(seconds: float) -> str:
+    """Human-readable duration: '34ms', '1.23s', '2m 5.1s'."""
+    if seconds < 1:
+        return f"{seconds * 1000:.0f}ms"
+    if seconds < 60:
+        return f"{seconds:.2f}s"
+    mins = int(seconds // 60)
+    secs = seconds % 60
+    return f"{mins}m {secs:.1f}s"
+def export_sequential(log: List[ExecutionRecord], path: str | Path) -> None:
+    """Write the raw execution log to a .py file, one block per execution."""
+    if not log:
+        header = _HEADER_TEMPLATE.format(total_time="0ms", cell_count=0)
+        Path(path).write_text(header + "# No cells were recorded.\n", encoding="utf-8")
+        return
+    total_seconds = sum(r.duration for r in log)
+    header = _HEADER_TEMPLATE.format(
+        total_time=_fmt_duration(total_seconds),
+        cell_count=len(log),
+    )
+    blocks = [header]
+    for record in log:
+        source = record.source.rstrip("\n")
+        comment = _magic_comment(source)
+        timing = _fmt_duration(record.duration)
+        block = f"# execution {record.exec_count}  [{timing}]\n{comment}{source}\n"
+        blocks.append(block)
+    Path(path).write_text("\n".join(blocks) + "\n", encoding="utf-8")
+def _magic_comment(source: str) -> str:
+    first_line = source.lstrip().split("\n")[0]
+    if first_line.startswith("%") or first_line.startswith("!"):
+        return "# magic/shell command — requires Jupyter environment to run as-is\n"
+    return ""
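
To make the duration format concrete, the sketch below copies the branching of `_fmt_duration` into a standalone function (renamed `fmt_duration` here for illustration) and prints a few values, including two boundary cases where rounding pushes the displayed number up to the next unit's threshold.

```python
def fmt_duration(seconds: float) -> str:
    # Same branching as _fmt_duration in exporter.py, copied for illustration.
    if seconds < 1:
        return f"{seconds * 1000:.0f}ms"
    if seconds < 60:
        return f"{seconds:.2f}s"
    mins = int(seconds // 60)
    secs = seconds % 60
    return f"{mins}m {secs:.1f}s"


for value in (0.034, 0.5, 1.25, 18.4, 75.0, 0.9996, 59.999):
    print(value, "->", fmt_duration(value))

# 0.034  -> 34ms
# 0.5    -> 500ms
# 1.25   -> 1.25s
# 18.4   -> 18.40s
# 75.0   -> 1m 15.0s
# 0.9996 -> 1000ms   (rounds up but stays in the millisecond branch)
# 59.999 -> 60.00s   (rounds up but stays in the seconds branch)
```

Values between one second and one minute always carry two decimals, so a cell that ran for 18.4 seconds is reported as `18.40s`.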

jupytertracker-0.1.0/src/jupytertracker/tracker.py ADDED

@@ -0,0 +1,92 @@
+from __future__ import annotations
+import sys
+import time
+from dataclasses import dataclass, field
+from typing import List
+@dataclass
+class ExecutionRecord:
+    exec_count: int
+    source: str
+    timestamp: float
+    duration: float = 0.0  # seconds; set by post_run_cell
+class Tracker:
+    def __init__(self) -> None:
+        self._log: List[ExecutionRecord] = []
+        self._ip = None
+        self._registered = False
+        self._counter = 0        # own counter — ip.execution_count isn't reliable pre-run
+        self._pending = None     # staged record; committed only on successful post_run_cell
+    def start(self, ip=None) -> None:
+        if self._registered:
+            return  # idempotent — already tracking, do nothing
+        if ip is None:
+            try:
+                from IPython import get_ipython
+                ip = get_ipython()
+            except ImportError:
+                pass
+        if ip is None:
+            raise RuntimeError(
+                "No active IPython kernel found. "
+                "Call jupytertracker.start() from inside a Jupyter notebook, "
+                "or pass an IPython instance: jupytertracker.start(ip=get_ipython())"
+            )
+        self._ip = ip
+        self._log.clear()   # fresh session — discard any log from a previous run
+        self._counter = 0
+        self._pending = None
+        ip.events.register("pre_run_cell", self._on_pre_run_cell)
+        ip.events.register("post_run_cell", self._on_post_run_cell)
+        self._registered = True
+    def stop(self) -> None:
+        if not self._registered or self._ip is None:
+            return
+        for event, handler in [
+            ("pre_run_cell", self._on_pre_run_cell),
+            ("post_run_cell", self._on_post_run_cell),
+        ]:
+            try:
+                self._ip.events.unregister(event, handler)
+            except ValueError:
+                pass
+        self._pending = None
+        self._registered = False
+    def _on_pre_run_cell(self, info) -> None:
+        try:
+            self._counter += 1
+            self._pending = ExecutionRecord(
+                exec_count=self._counter,
+                source=info.raw_cell,
+                timestamp=time.time(),
+            )
+        except Exception as exc:
+            print(f"[jupytertracker] hook error (ignored): {exc}", file=sys.stderr)
+    def _on_post_run_cell(self, result) -> None:
+        try:
+            if self._pending is None:
+                return
+            if result.success:
+                self._pending.duration = time.time() - self._pending.timestamp
+                self._log.append(self._pending)
+            else:
+                # Discard: error, exception, or user interruption
+                self._counter -= 1
+            self._pending = None
+        except Exception as exc:
+            print(f"[jupytertracker] hook error (ignored): {exc}", file=sys.stderr)
+    @property
+    def log(self) -> List[ExecutionRecord]:
+        return list(self._log)
+    def clear(self) -> None:
+        self._log.clear()
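
The two hooks in `Tracker` follow a stage-then-commit pattern: `pre_run_cell` stages a record, and `post_run_cell` either appends it to the log or discards it and rolls the counter back. The sketch below reproduces that pairing without IPython so the numbering behavior can be checked in isolation. `StagedLog`, `pre_run`, and `post_run` are hypothetical names, and timing is omitted for brevity.

```python
from typing import List, Optional, Tuple


class StagedLog:
    """Hypothetical reduction of the pre/post hook pairing in Tracker."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, str]] = []
        self._counter = 0
        self._pending: Optional[Tuple[int, str]] = None

    def pre_run(self, source: str) -> None:
        # Stage the record before the cell runs.
        self._counter += 1
        self._pending = (self._counter, source)

    def post_run(self, success: bool) -> None:
        if self._pending is None:
            return
        if success:
            self.entries.append(self._pending)  # commit the staged record
        else:
            self._counter -= 1  # roll back so numbering stays contiguous
        self._pending = None


log = StagedLog()
for source, ok in [("a = 1", True), ("raise RuntimeError()", False), ("b = 2", True)]:
    log.pre_run(source)
    log.post_run(ok)

print(log.entries)  # [(1, 'a = 1'), (2, 'b = 2')]
```

Because the counter is decremented on failure, the exported execution numbers count successful runs only and will not match the `In [n]` numbers shown in the notebook.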

jupytertracker-0.1.0/tests/conftest.py ADDED

@@ -0,0 +1,24 @@
+import pytest
+import jupytertracker
+from IPython.testing.globalipapp import start_ipython
+# Start the global IPython app once and keep a reference to it.
+_IP = start_ipython()
+@pytest.fixture(autouse=True)
+def reset_tracker():
+    """Reset module-level singleton and IPython execution count between tests."""
+    jupytertracker.stop()
+    jupytertracker._tracker = None
+    if _IP is not None:
+        _IP.execution_count = 1
+    yield
+    jupytertracker.stop()
+    jupytertracker._tracker = None
+@pytest.fixture
+def ip():
+    """Return the global IPython instance."""
+    return _IP

jupytertracker-0.1.0/tests/test_exporter.py ADDED

@@ -0,0 +1,127 @@
+import pytest
+from pathlib import Path
+from jupytertracker.tracker import ExecutionRecord
+from jupytertracker.exporter import export_sequential
+def _records(*sources, durations=None):
+    if durations is None:
+        durations = [0.1 * (i + 1) for i in range(len(sources))]
+    return [
+        ExecutionRecord(exec_count=i + 1, source=src, timestamp=float(i), duration=dur)
+        for i, (src, dur) in enumerate(zip(sources, durations))
+    ]
+def test_sequential_preserves_all_executions(tmp_path):
+    log = _records("x = 1", "y = 2", "x = 99", "y = 2")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert content.count("# execution") == 4
+    assert "x = 1" in content
+    assert "x = 99" in content
+def test_sequential_preserves_modified_source_at_each_run(tmp_path):
+    log = _records("model = train(lr=0.01)", "model = train(lr=0.1)")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "lr=0.01" in content
+    assert "lr=0.1" in content
+    # Both versions present — neither deduplicated
+    assert content.index("lr=0.01") < content.index("lr=0.1")
+def test_sequential_execution_order(tmp_path):
+    log = _records("a = 1", "b = 2", "c = 3", "b = 99", "c = 3")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    lines = [l for l in content.splitlines() if l.startswith("# execution")]
+    assert len(lines) == 5
+    for i, line in enumerate(lines, start=1):
+        assert line.startswith(f"# execution {i}  [")
+def test_empty_log_produces_header_only(tmp_path):
+    out = tmp_path / "out.py"
+    export_sequential([], out)
+    content = out.read_text()
+    assert "No cells were recorded" in content
+def test_magic_command_gets_comment(tmp_path):
+    log = _records("%matplotlib inline", "x = 1")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "magic/shell command" in content
+def test_shell_command_gets_comment(tmp_path):
+    log = _records("!pip install pandas", "import pandas")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "magic/shell command" in content
+def test_normal_cell_no_magic_comment(tmp_path):
+    log = _records("x = 1 + 1")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "magic/shell" not in content
+def test_output_file_has_header_warning(tmp_path):
+    log = _records("x = 1")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "jupytertracker" in content
+    assert "sequential mode" in content
+def test_execution_time_shown_per_cell(tmp_path):
+    log = _records("x = 1", "y = 2", durations=[0.5, 1.25])
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "500ms" in content
+    assert "1.25s" in content
+def test_total_execution_time_in_header(tmp_path):
+    log = _records("x = 1", "y = 2", durations=[30.0, 45.0])
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    # 75 seconds total = 1m 15.0s
+    assert "1m 15.0s" in content
+def test_cell_count_in_header(tmp_path):
+    log = _records("a = 1", "b = 2", "c = 3")
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    content = out.read_text()
+    assert "Cells recorded: 3" in content
+def test_fmt_duration_ms(tmp_path):
+    log = _records("x = 1", durations=[0.034])
+    out = tmp_path / "out.py"
+    export_sequential(log, out)
+    assert "34ms" in out.read_text()
+def test_duration_recorded_in_tracker(ip):
+    import jupytertracker
+    jupytertracker.start(ip=ip)
+    ip.run_cell("import time; time.sleep(0.05)")
+    log = jupytertracker.get_log()
+    assert len(log) == 1
+    assert log[0].duration >= 0.05

jupytertracker-0.1.0/tests/test_init.py ADDED

@@ -0,0 +1,69 @@
+import pytest
+from pathlib import Path
+import jupytertracker
+from conftest import _IP as _global_ip
+def test_export_before_start_raises():
+    with pytest.raises(RuntimeError, match="not been started"):
+        jupytertracker.export_script("/tmp/out.py")
+def test_unknown_mode_raises(tmp_path):
+    jupytertracker.start(ip=_global_ip)
+    with pytest.raises(ValueError, match="Unknown mode"):
+        jupytertracker.export_script(str(tmp_path / "out.py"), mode="unknown")
+def test_dedup_mode_raises_not_implemented(tmp_path):
+    jupytertracker.start(ip=_global_ip)
+    with pytest.raises(NotImplementedError):
+        jupytertracker.export_script(str(tmp_path / "out.py"), mode="dedup")
+def test_start_stop_start_clears_log(tmp_path):
+    ip = _global_ip
+    jupytertracker.start(ip=ip)
+    ip.run_cell("a = 1")
+    jupytertracker.stop()
+    ip.run_cell("b = 2")  # not tracked
+    jupytertracker.start(ip=ip)  # fresh session — old log discarded
+    ip.run_cell("c = 3")
+    log = jupytertracker.get_log()
+    sources = [r.source for r in log]
+    assert not any("a = 1" in s for s in sources)  # pre-stop entries gone
+    assert not any("b = 2" in s for s in sources)  # untracked — still absent
+    assert any("c = 3" in s for s in sources)       # post-restart entry present
+def test_full_pipeline(tmp_path):
+    ip = _global_ip
+    jupytertracker.start(ip=ip)
+    ip.run_cell("x = 10")
+    ip.run_cell("y = 20")
+    ip.run_cell("x = 99")  # re-run with new value
+    ip.run_cell("y = 20")  # re-run unchanged
+    out = tmp_path / "output.py"
+    jupytertracker.export_script(str(out))
+    content = out.read_text()
+    assert content.count("# execution") == 4
+    assert "x = 10" in content
+    assert "x = 99" in content
+    assert content.index("x = 10") < content.index("x = 99")
+def test_clear_resets_log():
+    ip = _global_ip
+    jupytertracker.start(ip=ip)
+    ip.run_cell("a = 1")
+    assert len(jupytertracker.get_log()) == 1
+    jupytertracker.clear()
+    assert jupytertracker.get_log() == []
+def test_stop_before_start_does_not_raise():
+    jupytertracker.stop()  # _tracker is None — should not raise
+def test_get_log_before_start_returns_empty():
+    assert jupytertracker.get_log() == []

jupytertracker-0.1.0/tests/test_tracker.py ADDED

@@ -0,0 +1,149 @@
+import pytest
+from IPython.testing.globalipapp import get_ipython
+from jupytertracker.tracker import Tracker
+def _run_cell(ip, source: str):
+    """Simulate a cell execution through the real IPython kernel."""
+    ip.run_cell(source)
+def test_records_single_cell(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "x = 1")
+    log = tracker.log
+    assert len(log) == 1
+    assert "x = 1" in log[0].source
+    tracker.stop()
+def test_records_multiple_cells_in_order(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "a = 1")
+    _run_cell(ip, "b = 2")
+    _run_cell(ip, "c = 3")
+    log = tracker.log
+    assert len(log) == 3
+    assert log[0].exec_count < log[1].exec_count < log[2].exec_count
+    tracker.stop()
+def test_records_rerun_with_modified_source(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "x = 1")
+    _run_cell(ip, "x = 99")  # same "cell", modified source
+    log = tracker.log
+    assert len(log) == 2
+    assert "x = 1" in log[0].source
+    assert "x = 99" in log[1].source
+    tracker.stop()
+def test_start_is_idempotent(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    hook_count_before = len([h for h in ip.events.callbacks.get("pre_run_cell", [])])
+    tracker.start(ip=ip)  # second call — must not double-register
+    hook_count_after = len([h for h in ip.events.callbacks.get("pre_run_cell", [])])
+    assert hook_count_before == hook_count_after
+    tracker.stop()
+def test_stop_unregisters_hooks(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    tracker.stop()
+    _run_cell(ip, "y = 42")
+    assert tracker.log == []
+def test_stop_before_start_does_not_raise(ip):
+    tracker = Tracker()
+    tracker.stop()  # should not raise
+def test_hook_exception_does_not_disrupt_execution(ip, capsys):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    # Corrupt the hook to raise intentionally
+    original = tracker._on_pre_run_cell
+    def bad_hook(info):
+        raise RuntimeError("intentional test error")
+    ip.events.unregister("pre_run_cell", original)
+    ip.events.register("pre_run_cell", bad_hook)
+    # Cell execution must still succeed despite bad hook
+    result = ip.run_cell("z = 7")
+    assert result.success
+    ip.events.unregister("pre_run_cell", bad_hook)
+    tracker.stop()
+def test_start_without_ipython_raises(monkeypatch):
+    import IPython.core.interactiveshell as _shell
+    # Temporarily clear the global singleton so get_ipython() returns None
+    orig = _shell.InteractiveShell._instance
+    _shell.InteractiveShell._instance = None
+    try:
+        tracker = Tracker()
+        with pytest.raises(RuntimeError, match="No active IPython kernel"):
+            tracker.start(ip=None)
+    finally:
+        _shell.InteractiveShell._instance = orig
+def test_failed_cell_not_recorded(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "x = 1")                  # succeeds
+    _run_cell(ip, "raise ValueError('boom')")  # fails
+    _run_cell(ip, "y = 2")                  # succeeds
+    log = tracker.log
+    assert len(log) == 2
+    assert "x = 1" in log[0].source
+    assert "y = 2" in log[1].source
+    assert log[0].exec_count == 1
+    assert log[1].exec_count == 2
+    tracker.stop()
+def test_syntax_error_cell_not_recorded(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "x = 1")
+    _run_cell(ip, "def bad syntax(:")   # syntax error — never executes
+    _run_cell(ip, "y = 2")
+    log = tracker.log
+    sources = [r.source for r in log]
+    assert len(log) == 2
+    assert any("x = 1" in s for s in sources)
+    assert any("y = 2" in s for s in sources)
+    tracker.stop()
+def test_exec_count_stays_contiguous_after_failure(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "a = 1")
+    _run_cell(ip, "raise RuntimeError()")
+    _run_cell(ip, "b = 2")
+    log = tracker.log
+    assert len(log) == 2
+    assert log[0].exec_count == 1
+    assert log[1].exec_count == 2  # counter rolled back on failure, so next success is 2
+    tracker.stop()
+def test_clear_empties_log(ip):
+    tracker = Tracker()
+    tracker.start(ip=ip)
+    _run_cell(ip, "a = 1")
+    assert len(tracker.log) == 1
+    tracker.clear()
+    assert tracker.log == []
+    tracker.stop()