calkit-python 0.25.0__tar.gz → 0.25.1__tar.gz

Files changed (162)
  1. {calkit_python-0.25.0 → calkit_python-0.25.1}/PKG-INFO +1 -1
  2. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/__init__.py +1 -1
  3. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/new.py +10 -0
  4. calkit_python-0.25.1/docs/notebooks.md +236 -0
  5. {calkit_python-0.25.0 → calkit_python-0.25.1}/mkdocs.yml +1 -0
  6. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/FUNDING.yml +0 -0
  7. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/workflows/docs.yml +0 -0
  8. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/workflows/format.yml +0 -0
  9. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/workflows/publish-test.yml +0 -0
  10. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/workflows/publish.yml +0 -0
  11. {calkit_python-0.25.0 → calkit_python-0.25.1}/.github/workflows/test.yml +0 -0
  12. {calkit_python-0.25.0 → calkit_python-0.25.1}/.gitignore +0 -0
  13. {calkit_python-0.25.0 → calkit_python-0.25.1}/.pre-commit-config.yaml +0 -0
  14. {calkit_python-0.25.0 → calkit_python-0.25.1}/.python-version +0 -0
  15. {calkit_python-0.25.0 → calkit_python-0.25.1}/CONTRIBUTING.md +0 -0
  16. {calkit_python-0.25.0 → calkit_python-0.25.1}/LICENSE +0 -0
  17. {calkit_python-0.25.0 → calkit_python-0.25.1}/Makefile +0 -0
  18. {calkit_python-0.25.0 → calkit_python-0.25.1}/README.md +0 -0
  19. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/__main__.py +0 -0
  20. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/calc.py +0 -0
  21. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/check.py +0 -0
  22. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/__init__.py +0 -0
  23. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/check.py +0 -0
  24. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/cloud.py +0 -0
  25. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/config.py +0 -0
  26. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/core.py +0 -0
  27. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/import_.py +0 -0
  28. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/list.py +0 -0
  29. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/main.py +0 -0
  30. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/notebooks.py +0 -0
  31. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/office.py +0 -0
  32. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/overleaf.py +0 -0
  33. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cli/update.py +0 -0
  34. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/cloud.py +0 -0
  35. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/conda.py +0 -0
  36. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/config.py +0 -0
  37. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/core.py +0 -0
  38. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/datasets.py +0 -0
  39. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/docker.py +0 -0
  40. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/dvc.py +0 -0
  41. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/environments.py +0 -0
  42. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/git.py +0 -0
  43. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/github.py +0 -0
  44. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/gui.py +0 -0
  45. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/jupyter.py +0 -0
  46. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/magics.py +0 -0
  47. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/matlab.py +0 -0
  48. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/models/__init__.py +0 -0
  49. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/models/core.py +0 -0
  50. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/models/io.py +0 -0
  51. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/models/iteration.py +0 -0
  52. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/models/pipeline.py +0 -0
  53. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/notebooks.py +0 -0
  54. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/office.py +0 -0
  55. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/ops.py +0 -0
  56. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/pipeline.py +0 -0
  57. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/releases.py +0 -0
  58. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/server.py +0 -0
  59. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/__init__.py +0 -0
  60. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/core.py +0 -0
  61. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/__init__.py +0 -0
  62. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/article/paper.tex +0 -0
  63. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/core.py +0 -0
  64. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/jfm/jfm.bst +0 -0
  65. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/jfm/jfm.cls +0 -0
  66. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/jfm/lineno-FLM.sty +0 -0
  67. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/jfm/paper.tex +0 -0
  68. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/templates/latex/jfm/upmath.sty +0 -0
  69. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/__init__.py +0 -0
  70. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/cli/__init__.py +0 -0
  71. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/cli/test_config.py +0 -0
  72. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/cli/test_list.py +0 -0
  73. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/cli/test_main.py +0 -0
  74. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/cli/test_new.py +0 -0
  75. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/models/__init__.py +0 -0
  76. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/models/test_pipeline.py +0 -0
  77. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_calc.py +0 -0
  78. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_check.py +0 -0
  79. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_conda.py +0 -0
  80. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_core.py +0 -0
  81. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_dvc.py +0 -0
  82. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_jupyter.py +0 -0
  83. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_magics.py +0 -0
  84. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_notebooks.py +0 -0
  85. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_pipeline.py +0 -0
  86. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/tests/test_templates.py +0 -0
  87. {calkit_python-0.25.0 → calkit_python-0.25.1}/calkit/zenodo.py +0 -0
  88. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/CNAME +0 -0
  89. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/apps.md +0 -0
  90. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/calculations.md +0 -0
  91. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/calkit-yaml.md +0 -0
  92. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/cli-reference.md +0 -0
  93. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/cloud-integration.md +0 -0
  94. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/datasets.md +0 -0
  95. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/dependencies.md +0 -0
  96. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/environments.md +0 -0
  97. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/examples.md +0 -0
  98. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/help.md +0 -0
  99. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/img/c-to-the-k-white.svg +0 -0
  100. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/img/calkit-no-bg.png +0 -0
  101. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/img/connect-zenodo.png +0 -0
  102. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/index.md +0 -0
  103. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/installation.md +0 -0
  104. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/local-server.md +0 -0
  105. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/overleaf.md +0 -0
  106. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/pipeline/index.md +0 -0
  107. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/pipeline/manual-steps.md +0 -0
  108. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/references.md +0 -0
  109. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/releases.md +0 -0
  110. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/adding-latex-pub-docker.md +0 -0
  111. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/conda-envs.md +0 -0
  112. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/existing-project.md +0 -0
  113. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/first-project.md +0 -0
  114. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/building-codespace.png +0 -0
  115. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/codespaces-secrets-2.png +0 -0
  116. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/editor-split.png +0 -0
  117. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/go-to-linked-code.png +0 -0
  118. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/issue-from-selection.png +0 -0
  119. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/new-project.png +0 -0
  120. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/new-pub-2.png +0 -0
  121. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/new-token.png +0 -0
  122. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/paper.tex.png +0 -0
  123. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/project-home-3.png +0 -0
  124. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/push.png +0 -0
  125. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/latex-codespaces/stage.png +0 -0
  126. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/anakin-excel.jpg +0 -0
  127. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/chart-more-rows.png +0 -0
  128. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/create-project.png +0 -0
  129. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/elsevier-research-data-guidelines.png +0 -0
  130. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/excel-chart.png +0 -0
  131. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/excel-data.png +0 -0
  132. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/insert-link-to-file.png +0 -0
  133. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/needs-clone.png +0 -0
  134. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/new-stage.png +0 -0
  135. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/phd-comics-version-control.webp +0 -0
  136. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/pipeline-out-of-date.png +0 -0
  137. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/status-more-rows.png +0 -0
  138. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/uncommitted-changes.png +0 -0
  139. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/untracked-data.png +0 -0
  140. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/updated-publication.png +0 -0
  141. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/word-to-pdf-stage-2.png +0 -0
  142. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/office/workflow-page.png +0 -0
  143. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/clone.png +0 -0
  144. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/create-project.png +0 -0
  145. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/datasets-page.png +0 -0
  146. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/figure-on-website-updated.png +0 -0
  147. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/figure-on-website.png +0 -0
  148. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/new-token.png +0 -0
  149. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/reclone.png +0 -0
  150. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/openfoam/status-after-import-dataset.png +0 -0
  151. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/img/run-proc.png +0 -0
  152. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/index.md +0 -0
  153. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/latex-codespaces.md +0 -0
  154. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/matlab.md +0 -0
  155. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/notebook-pipeline.md +0 -0
  156. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/office.md +0 -0
  157. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/openfoam.md +0 -0
  158. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/tutorials/procedures.md +0 -0
  159. {calkit_python-0.25.0 → calkit_python-0.25.1}/docs/version-control.md +0 -0
  160. {calkit_python-0.25.0 → calkit_python-0.25.1}/pyproject.toml +0 -0
  161. {calkit_python-0.25.0 → calkit_python-0.25.1}/test/pipeline.ipynb +0 -0
  162. {calkit_python-0.25.0 → calkit_python-0.25.1}/uv.lock +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: calkit-python
3
- Version: 0.25.0
3
+ Version: 0.25.1
4
4
  Summary: Reproducibility simplified.
5
5
  Project-URL: Homepage, https://calkit.org
6
6
  Project-URL: Issues, https://github.com/calkit/calkit/issues
@@ -1,4 +1,4 @@
1
- __version__ = "0.25.0"
1
+ __version__ = "0.25.1"
2
2
 
3
3
  from .core import * # noqa: F403, I001
4
4
  from . import git # noqa: F401
@@ -30,6 +30,13 @@ from calkit.models.pipeline import LatexStage, StageIteration
30
30
  new_app = typer.Typer(no_args_is_help=True)
31
31
 
32
32
 
33
+ def _check_path_dir(path: str):
34
+ """If path is in a subdirectory, check that it exists."""
35
+ dirname = os.path.dirname(path)
36
+ if dirname:
37
+ os.makedirs(dirname, exist_ok=True)
38
+
39
+
33
40
  @new_app.command(name="project")
34
41
  def new_project(
35
42
  path: Annotated[str, typer.Argument(help="Where to create the project.")],
@@ -1075,6 +1082,7 @@ def new_conda_env(
1075
1082
  project_name = os.path.basename(os.getcwd())
1076
1083
  conda_name = calkit.to_kebab_case(project_name) + "-" + name
1077
1084
  # Write environment to path
1085
+ _check_path_dir(path)
1078
1086
  conda_env = dict(
1079
1087
  name=conda_name, channels=["conda-forge"], dependencies=packages
1080
1088
  )
@@ -1174,6 +1182,7 @@ def new_uv_venv(
1174
1182
  )
1175
1183
  packages_txt = "\n".join(packages)
1176
1184
  # Write environment to path
1185
+ _check_path_dir(path)
1177
1186
  with open(path, "w") as f:
1178
1187
  f.write(packages_txt)
1179
1188
  repo.git.add(path)
@@ -1259,6 +1268,7 @@ def new_venv(
1259
1268
  )
1260
1269
  packages_txt = "\n".join(packages)
1261
1270
  # Write environment to path
1271
+ _check_path_dir(path)
1262
1272
  with open(path, "w") as f:
1263
1273
  f.write(packages_txt)
1264
1274
  repo.git.add(path)
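The three hunks above call the new `_check_path_dir` helper just before the environment spec file is written, presumably so that a `--path` pointing into a subdirectory that does not exist yet no longer fails on `open(path, "w")`. Below is a minimal standalone sketch of that behavior. The function name `ensure_parent_dir` and the `envs/py/requirements.txt` path are illustrative assumptions, not part of the package.

```python
import os
import tempfile


def ensure_parent_dir(path: str) -> None:
    """Create the parent directory of ``path`` if it has one (hypothetical name)."""
    dirname = os.path.dirname(path)
    # A bare filename like "requirements.txt" has an empty dirname,
    # and os.makedirs("") would raise, so that case is skipped
    if dirname:
        os.makedirs(dirname, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative nested path; none of these directories exist yet
    nested = os.path.join(tmp, "envs", "py", "requirements.txt")
    ensure_parent_dir(nested)
    parent_created = os.path.isdir(os.path.dirname(nested))
    # Only the parent directories are created, not the file itself
    file_created = os.path.exists(nested)
    # Writing now succeeds instead of raising FileNotFoundError
    with open(nested, "w") as f:
        f.write("jupyter\n")
    file_written = os.path.isfile(nested)
```

Note that, despite the "check that it exists" wording in the helper's docstring, the code shown in the diff creates the directory rather than only checking for it.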
@@ -0,0 +1,236 @@
1
+ # Working with notebooks
2
+
3
+ While working on a research project,
4
+ Jupyter notebooks can be useful for prototyping and data exploration.
5
+ If while working interactively in a notebook
6
+ you get an output you like, e.g., a figure,
7
+ it can be tempting to simply stop right there
8
+ and copy/paste it into a research article.
9
+ However, in order to keep the project reproducible,
10
+ we need to be able to go from raw data to research article
11
+ [with a single command](https://doi.org/10.1190/1.1822162),
12
+ which of course is not possible in the above scenario.
13
+
14
+ This is the primary notebook use case Calkit is concerned with:
15
+ generating evidence to back up conclusions or answers to research questions.
16
+ There are other use cases that are out of scope like using notebooks to build
17
+ documentation or interactive web apps for exploring results.
18
+ For building [apps](apps.md) (a different concept in a Calkit project),
19
+ there are probably better tools out there, e.g.,
20
+ [marimo](https://marimo.io/),
21
+ [Dash](https://dash.plotly.com/),
22
+ [Voila](https://voila.readthedocs.io/en/stable/),
23
+ or [Gradio](https://www.gradio.app/).
24
+
25
+ Here we'll talk about how to take advantage of the interactive nature
26
+ of Jupyter notebooks while incorporating them into a reproducible workflow,
27
+ avoiding some of the pitfalls that have caused a bit of a
28
+ [notebook reproducibility crisis](https://leomurta.github.io/papers/pimentel2019a.pdf).
29
+ Returning to the "one project, one command" requirement,
30
+ we can focus on three rules:
31
+
32
+ 1. The notebook must be kept in version control.
33
+ This happens naturally since any file included in a Calkit project is
34
+ kept in version control.
35
+ However, it's usually a good idea to exclude notebook output from
36
+ Git commits.
37
+ This can be done by installing `nbstripout` and running
38
+ `nbstripout --install` in the project directory.
39
+ 1. A notebook must run in one of the project's [environments](environments.md).
40
+ 1. Like any other process, notebooks should be incorporated into the project's
41
+ [pipeline](pipeline/index.md).
42
+ It's fine to do some ad hoc work interactively to get the notebook
43
+ working properly, but
44
+ "official" outputs should be generated by calling `calkit run`.
45
+ This means notebooks need to be able to run from top to bottom with no
46
+ manual intervention. We'll see how below.
47
+
48
+ ## Creating an environment for a notebook
49
+
50
+ Assuming you want to run Python in the notebook, you can create an environment
51
+ for it with `uv`, `venv`, `conda`, or `pixi`.
52
+ For example, if we want to create a new `uv-venv` called `py` in our project,
53
+ we can execute:
54
+
55
+ ```sh
56
+ calkit new uv-venv \
57
+ --name py \
58
+ --prefix .venv \
59
+ --python 3.13 \
60
+ --path requirements.txt \
61
+ jupyter \
62
+ "pandas>=2" \
63
+ numpy \
64
+ plotly \
65
+ matplotlib \
66
+ polars
67
+ ```
68
+
69
+ You can then start JupyterLab in this environment with
70
+ `calkit xenv -n py jupyter lab`.
71
+
72
+ Note the environment only needs to be created once per project.
73
+ If the project is cloned onto a new machine,
74
+ the environment does not need to be recreated,
75
+ since that will be done automatically when the project is run.
76
+ Also note that it's totally fine and perhaps even preferable to create
77
+ a new environment for each notebook, so long as they have different
78
+ names, prefixes, and paths---there is no limit to the number
79
+ of environments a project can use, and they can be of any type.
80
+
81
+ ## Adding a notebook to the pipeline
82
+
83
+ A notebook can be added to the pipeline by editing the project's `calkit.yaml`
84
+ file directly, using a `jupyter-notebook` stage.
85
+ For example:
86
+
87
+ ```yaml
88
+ # In calkit.yaml
89
+ environments:
90
+ py:
91
+ kind: uv-venv
92
+ prefix: .venv
93
+ python: "3.13"
94
+ path: requirements.txt
95
+ pipeline:
96
+ stages:
97
+ my-notebook:
98
+ kind: jupyter-notebook
99
+ environment: py
100
+ notebook_path: notebooks/get-data.ipynb
101
+ inputs:
102
+ - config/my-params.json
103
+ outputs:
104
+ - data/raw/data.csv
105
+ html_storage: dvc
106
+ executed_ipynb_storage: null
107
+ cleaned_ipynb_storage: git
108
+ # Optional: Add to project notebooks so they can be viewed on Calkit Cloud
109
+ notebooks:
110
+ - path: notebooks/get-data.ipynb
111
+ title: Get data
112
+ stage: my-notebook
113
+ ```
114
+
115
+ For this example, we're declaring that the notebook
116
+ should use the `py` environment, and that it will read an input
117
+ file `config/my-params.json` and produce an output
118
+ file `data/raw/data.csv`.
119
+ These inputs and outputs will be tracked
120
+ along with the notebook and environment content,
121
+ to automatically determine if and when the notebook needs to be rerun.
122
+ Outputs will also be kept in DVC by default so others can pull them down
123
+ without bloating the Git repo.
124
+ Output storage is configurable, however, e.g., if you'd like to keep
125
+ smaller and/or text-based outputs in Git for simplicity's sake.
126
+
127
+ Copies of the notebook with and without outputs will be generated as the
128
+ notebook is executed, along with an HTML export of the executed version.
129
+ Storage for these outputs can be controlled with the `html_storage`,
130
+ `executed_ipynb_storage`, and `cleaned_ipynb_storage` properties,
131
+ and they will live inside the project's `.calkit` subdirectory.
132
+ The executed `.ipynb` can be rendered on GitHub or
133
+ [nbviewer.org](https://nbviewer.org),
134
+ and the HTML can be viewed on [calkit.io](https://calkit.io),
135
+ the latter of which allows some level of interactivity, e.g., Plotly figures.
136
+ The cleaned `.ipynb` can be useful for diffing with Git in cases where
137
+ `nbstripout` is not activated.
138
+
139
+ It's also possible to add a notebook to the pipeline
140
+ inside a notebook with the `declare_notebook` function,
141
+ which will update `calkit.yaml` automatically.
142
+
143
+ ```python
144
+ import calkit
145
+
146
+ calkit.declare_notebook(
147
+ path="notebooks/get-data.ipynb",
148
+ stage_name="my-notebook",
149
+ environment_name="py",
150
+ inputs=["config/my-params.json"],
151
+ outputs=["data/raw/data.csv"],
152
+ html_storage="dvc",
153
+ executed_ipynb_storage=None,
154
+ cleaned_ipynb_storage="git",
155
+ )
156
+ ```
157
+
158
+ Note that for this to run properly `calkit-python` must be installed in
159
+ the notebook's environment, which in this case is named `py` and whose
160
+ packages are listed in `requirements.txt`.
161
+ If we didn't include it when creating the environment,
162
+ we can simply add `calkit-python` to the `requirements.txt` file and rerun
163
+ `calkit xenv -n py jupyter lab`.
164
+ The environment will be updated before starting JupyterLab.
165
+
166
+ ## Working interactively
167
+
168
+ The main advantage of Jupyter notebooks is the ability to work interactively,
169
+ allowing us to quickly iterate on a smaller chunk of the process
170
+ while the rest remains constant.
171
+ For example, if you need to refine a figure,
172
+ you can keep updating and running the cell that generates the figure,
173
+ without needing to rerun the expensive cell above that generates
174
+ or processes the data for it.
175
+ In this case our notebook might look like this:
176
+
177
+ ```python
178
+ from some_package import run_data_processing
179
+
180
+ result = run_data_processing(param1=55)
181
+ ```
182
+
183
+ ```python
184
+ import matplotlib.pyplot as plt
185
+
186
+ fig, ax = plt.subplots()
187
+ ax.plot(result["x"], result["y"])
188
+ ```
189
+
190
+ ```python
191
+ fig.savefig("figures/my-plot.png")
192
+ ```
193
+
194
+ So, with a fresh Jupyter kernel we'll need to run cell 1 in order to generate
195
+ `result` so we can iterate on cell 2 to get the plot looking the way
196
+ we want it to.
197
+ But what if `run_data_processing`
198
+ takes minutes, hours, or even days, and we don't want to run it
199
+ every time we restart the notebook?
200
+ Well, we can use the Calkit `%%stage` cell magic to automatically cache
201
+ and retrieve the result.
202
+
203
+ After adding a cell with:
204
+
205
+ ```python
206
+ %load_ext calkit.magics
207
+ ```
208
+
209
+ the first cell can be turned into a pipeline stage by changing it to:
210
+
211
+ ```python
212
+ %%stage --name run-nb-proc --environment py --out result
213
+
214
+ from some_package import run_data_processing
215
+
216
+ result = run_data_processing(param1=55)
217
+ ```
218
+
219
+ In the magic command we're giving the cell a unique name,
220
+ declaring which environment it should run in
221
+ (`py` above, but it can be any environment in the project),
222
+ and declaring an output from the cell that we want to be available to
223
+ cells below.
224
+
225
+ Now, the kernel can be restarted and we can use "run all cells above"
226
+ when working on the figure,
227
+ and we'll have `result` nearly instantaneously.
228
+ `result` will also be versioned with DVC and pushed to the cloud by default,
229
+ so our collaborators can also take advantage of the caching
230
+ without bloating the Git repo.
231
+ Execution as part of the project's pipeline will also take advantage of
232
+ the caching and will not rerun data processing unless something
233
+ about that cell's code or environment has changed.
234
+
235
+ For a more in-depth look at using the `%%stage` cell magic,
236
+ see [this tutorial](tutorials/notebook-pipeline.md).
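The caching behavior that the new docs page describes for the `%%stage` magic can be illustrated with a generic result cache. This is only a sketch of the general idea, not how Calkit implements it: the `cached_call` helper, the pickle-on-disk storage, and the cache key are all assumptions made for illustration. In particular, the docs say reruns are triggered by changes to the cell's code or environment, while this sketch keys only on the function name and its arguments.

```python
import hashlib
import os
import pickle
import tempfile


def cached_call(func, cache_dir, **kwargs):
    """Return func(**kwargs), reusing a pickled result when the key matches."""
    # Hypothetical cache key: function name plus sorted keyword arguments
    key_src = repr((func.__name__, sorted(kwargs.items())))
    key = hashlib.sha256(key_src.encode()).hexdigest()
    cache_path = os.path.join(cache_dir, key + ".pkl")
    if os.path.exists(cache_path):
        # Cache hit: load the stored result instead of recomputing
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = func(**kwargs)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # Records each real execution of the expensive function


def run_data_processing(param1):
    """Stand-in for an expensive computation."""
    calls.append(param1)
    return {"x": [0, 1, 2], "y": [param1 * v for v in [0, 1, 2]]}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_call(run_data_processing, tmp, param1=55)
    second = cached_call(run_data_processing, tmp, param1=55)  # served from cache
    third = cached_call(run_data_processing, tmp, param1=56)  # new key, reruns
```

Here the second call returns the stored result without running `run_data_processing` again, which is the effect the docs describe after a kernel restart.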
@@ -45,6 +45,7 @@ nav:
45
45
  - The pipeline:
46
46
  - pipeline/index.md
47
47
  - pipeline/manual-steps.md
48
+ - Notebooks: notebooks.md
48
49
  - Datasets: datasets.md
49
50
  - References: references.md
50
51
  - Calculations: calculations.md
File without changes