calkit-python 0.22.2__tar.gz → 0.22.3__tar.gz

Files changed (152)
  1. {calkit_python-0.22.2 → calkit_python-0.22.3}/.pre-commit-config.yaml +0 -1
  2. {calkit_python-0.22.2 → calkit_python-0.22.3}/PKG-INFO +55 -52
  3. {calkit_python-0.22.2 → calkit_python-0.22.3}/README.md +54 -51
  4. calkit_python-0.22.3/calkit/__init__.py +17 -0
  5. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/check.py +2 -2
  6. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/__init__.py +1 -1
  7. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/import_.py +7 -7
  8. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/main.py +4 -0
  9. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/new.py +4 -3
  10. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/releases.py +2 -0
  11. calkit_python-0.22.3/calkit/templates/__init__.py +1 -0
  12. calkit_python-0.22.3/calkit/templates/latex/__init__.py +1 -0
  13. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_calc.py +1 -1
  14. calkit_python-0.22.3/docs/index.md +60 -0
  15. {calkit_python-0.22.2 → calkit_python-0.22.3}/pyproject.toml +1 -1
  16. calkit_python-0.22.2/calkit/__init__.py +0 -17
  17. calkit_python-0.22.2/calkit/templates/__init__.py +0 -1
  18. calkit_python-0.22.2/calkit/templates/latex/__init__.py +0 -1
  19. calkit_python-0.22.2/docs/index.md +0 -51
  20. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/FUNDING.yml +0 -0
  21. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/workflows/docs.yml +0 -0
  22. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/workflows/format.yml +0 -0
  23. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/workflows/publish-test.yml +0 -0
  24. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/workflows/publish.yml +0 -0
  25. {calkit_python-0.22.2 → calkit_python-0.22.3}/.github/workflows/test.yml +0 -0
  26. {calkit_python-0.22.2 → calkit_python-0.22.3}/.gitignore +0 -0
  27. {calkit_python-0.22.2 → calkit_python-0.22.3}/.python-version +0 -0
  28. {calkit_python-0.22.2 → calkit_python-0.22.3}/CONTRIBUTING.md +0 -0
  29. {calkit_python-0.22.2 → calkit_python-0.22.3}/LICENSE +0 -0
  30. {calkit_python-0.22.2 → calkit_python-0.22.3}/Makefile +0 -0
  31. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/__main__.py +0 -0
  32. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/calc.py +0 -0
  33. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/check.py +0 -0
  34. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/cloud.py +0 -0
  35. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/config.py +0 -0
  36. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/core.py +0 -0
  37. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/list.py +0 -0
  38. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/notebooks.py +0 -0
  39. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/office.py +0 -0
  40. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/overleaf.py +0 -0
  41. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cli/update.py +0 -0
  42. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/cloud.py +0 -0
  43. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/conda.py +0 -0
  44. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/config.py +0 -0
  45. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/core.py +0 -0
  46. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/datasets.py +0 -0
  47. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/docker.py +0 -0
  48. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/dvc.py +0 -0
  49. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/git.py +0 -0
  50. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/github.py +0 -0
  51. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/gui.py +0 -0
  52. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/jupyter.py +0 -0
  53. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/magics.py +0 -0
  54. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/models.py +0 -0
  55. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/office.py +0 -0
  56. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/ops.py +0 -0
  57. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/server.py +0 -0
  58. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/core.py +0 -0
  59. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/article/paper.tex +0 -0
  60. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/core.py +0 -0
  61. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/jfm/jfm.bst +0 -0
  62. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/jfm/jfm.cls +0 -0
  63. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/jfm/lineno-FLM.sty +0 -0
  64. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/jfm/paper.tex +0 -0
  65. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/templates/latex/jfm/upmath.sty +0 -0
  66. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/__init__.py +0 -0
  67. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/cli/__init__.py +0 -0
  68. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/cli/test_config.py +0 -0
  69. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/cli/test_list.py +0 -0
  70. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/cli/test_main.py +0 -0
  71. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/cli/test_new.py +0 -0
  72. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_check.py +0 -0
  73. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_conda.py +0 -0
  74. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_core.py +0 -0
  75. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_dvc.py +0 -0
  76. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_jupyter.py +0 -0
  77. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_magics.py +0 -0
  78. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/tests/test_templates.py +0 -0
  79. {calkit_python-0.22.2 → calkit_python-0.22.3}/calkit/zenodo.py +0 -0
  80. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/CNAME +0 -0
  81. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/apps.md +0 -0
  82. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/calculations.md +0 -0
  83. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/calkit-yaml.md +0 -0
  84. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/cli-reference.md +0 -0
  85. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/cloud-integration.md +0 -0
  86. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/datasets.md +0 -0
  87. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/environments.md +0 -0
  88. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/examples.md +0 -0
  89. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/help.md +0 -0
  90. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/img/c-to-the-k-white.svg +0 -0
  91. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/img/calkit-no-bg.png +0 -0
  92. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/img/connect-zenodo.png +0 -0
  93. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/installation.md +0 -0
  94. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/local-server.md +0 -0
  95. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/overleaf.md +0 -0
  96. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/pipeline/index.md +0 -0
  97. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/pipeline/manual-steps.md +0 -0
  98. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/references.md +0 -0
  99. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/releases.md +0 -0
  100. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/adding-latex-pub-docker.md +0 -0
  101. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/conda-envs.md +0 -0
  102. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/existing-project.md +0 -0
  103. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/first-project.md +0 -0
  104. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/building-codespace.png +0 -0
  105. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/codespaces-secrets-2.png +0 -0
  106. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/editor-split.png +0 -0
  107. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/go-to-linked-code.png +0 -0
  108. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/issue-from-selection.png +0 -0
  109. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/new-project.png +0 -0
  110. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/new-pub-2.png +0 -0
  111. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/new-token.png +0 -0
  112. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/paper.tex.png +0 -0
  113. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/project-home-3.png +0 -0
  114. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/push.png +0 -0
  115. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/latex-codespaces/stage.png +0 -0
  116. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/anakin-excel.jpg +0 -0
  117. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/chart-more-rows.png +0 -0
  118. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/create-project.png +0 -0
  119. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/elsevier-research-data-guidelines.png +0 -0
  120. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/excel-chart.png +0 -0
  121. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/excel-data.png +0 -0
  122. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/insert-link-to-file.png +0 -0
  123. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/needs-clone.png +0 -0
  124. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/new-stage.png +0 -0
  125. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/phd-comics-version-control.webp +0 -0
  126. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/pipeline-out-of-date.png +0 -0
  127. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/status-more-rows.png +0 -0
  128. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/uncommitted-changes.png +0 -0
  129. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/untracked-data.png +0 -0
  130. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/updated-publication.png +0 -0
  131. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/word-to-pdf-stage-2.png +0 -0
  132. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/office/workflow-page.png +0 -0
  133. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/clone.png +0 -0
  134. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/create-project.png +0 -0
  135. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/datasets-page.png +0 -0
  136. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/figure-on-website-updated.png +0 -0
  137. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/figure-on-website.png +0 -0
  138. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/new-token.png +0 -0
  139. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/reclone.png +0 -0
  140. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/openfoam/status-after-import-dataset.png +0 -0
  141. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/img/run-proc.png +0 -0
  142. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/index.md +0 -0
  143. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/latex-codespaces.md +0 -0
  144. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/matlab.md +0 -0
  145. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/notebook-pipeline.md +0 -0
  146. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/office.md +0 -0
  147. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/openfoam.md +0 -0
  148. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/tutorials/procedures.md +0 -0
  149. {calkit_python-0.22.2 → calkit_python-0.22.3}/docs/version-control.md +0 -0
  150. {calkit_python-0.22.2 → calkit_python-0.22.3}/mkdocs.yml +0 -0
  151. {calkit_python-0.22.2 → calkit_python-0.22.3}/test/pipeline.ipynb +0 -0
  152. {calkit_python-0.22.2 → calkit_python-0.22.3}/uv.lock +0 -0
@@ -12,7 +12,6 @@ repos:
  hooks:
  - id: ruff
  args: [--exit-non-zero-on-fix, --config=pyproject.toml]
- exclude: calkit
  - id: ruff-format
  args: [--config=pyproject.toml]

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: calkit-python
- Version: 0.22.2
+ Version: 0.22.3
  Summary: Reproducibility simplified.
  Project-URL: Homepage, https://calkit.org
  Project-URL: Issues, https://github.com/calkit/calkit/issues
@@ -54,16 +54,60 @@ Description-Content-Type: text/markdown
  </a>
  </p>

- Calkit is a framework and toolkit for reproducible research projects.
- It acts as a top-level layer to integrate and simplify the use of enabling
- technologies such as
- [Git](https://git-scm.com/),
- [DVC](https://dvc.org/),
- [Conda](https://docs.conda.io/en/latest/),
- and [Docker](https://docker.com).
- Calkit also adds a domain-specific data model
- such that all aspects of the research process can be fully described in a
- single repository and therefore easily consumed by others.
+ Calkit is a language-agnostic project framework and toolkit
+ to make your research or analytics project
+ reproducible to the highest standard,
+ which means:
+
+ > Inputs and process definitions are provided and sufficiently described
+ > such that anyone can easily verify that they produced the outputs
+ > used to support the conclusions.
+
+ "Easily" means that after obtaining your project files,
+ it should only require executing a single command
+ (like "pressing a single button" in
+ [Claerbout and Karrenbach (1992)](https://doi.org/10.1190/1.1822162)),
+ which should finish in less than 15 minutes
+ (suggested by
+ [Vandewalle et al. (2009)](https://doi.org/10.1109/MSP.2009.932122)).
+
+ If the processes are too expensive to rerun in under 15 minutes,
+ it should be possible to confirm that none of the input data
+ or process definitions (e.g., environment specifications, scripts)
+ have changed since saving the current versions of each output artifact
+ (figure, table, dataset, publication, etc.)
+
+ When your project is reproducible,
+ you'll be able to iterate more quickly and more often,
+ easily onboard collaborators,
+ make fewer mistakes,
+ and feel confident sharing all of your project materials
+ with your research articles,
+ because you'll know the code will actually run!
+ This will allow others to reuse parts of your project in their own research,
+ accelerating the pace of discovery.
+
+ Working at this level of automation, discipline, and rigor may sound like
+ a lot of effort,
+ but Calkit makes it easy!
+
+ ## Features
+
+ - A schema to store structured metadata describing the
+ project's important outputs (in its `calkit.yaml` file)
+ and how they are created
+ (its computational environments and pipeline).
+ - A CLI to run the project's pipeline to verify it's reproducible,
+ regenerating outputs as needed and
+ ensuring all
+ computational environments (e.g., [Conda](https://docs.conda.io/en/latest/), [Docker](https://docker.com)) match their specification.
+ - A command line interface (CLI) to simplify keeping code, text, and larger
+ data files backed up in the same project repo using both
+ [Git](https://git-scm.com/) and [DVC](https://dvc.org/).
+ - A complementary
+ [cloud system](https://github.com/calkit/calkit-cloud)
+ to facilitate backup, collaboration,
+ and sharing throughout the entire research lifecycle.

  ## Installation

@@ -150,47 +194,6 @@ This will commit and push to both GitHub and the Calkit Cloud.
  We welcome all kinds of contributions!
  See [CONTRIBUTING.md](CONTRIBUTING.md) to learn how to get involved.

- ## Why does reproducibility matter?
-
- If your work is reproducible, that means that someone else can "run" it and
- calculate the same results or outputs.
- This is a major step towards addressing
- [the replication crisis](https://en.wikipedia.org/wiki/Replication_crisis)
- and has some major benefits for both you as an individual and the research
- community:
-
- 1. You will avoid mistakes caused by, e.g., running an old version of a script
- and including a figure that wasn't created after fixing a bug in the data
- processing pipeline.
- 2. Since your project is "runnable," it's more likely that someone else will be
- able to reuse part of your work to run it in a different context, thereby
- producing a bigger impact and accelerating the pace of discovery.
- If someone can take what you've done and use it to calculate a
- prediction, you have just produced truly useful knowledge.
-
- ## Why another tool/platform?
-
- Git, GitHub, DVC, Docker et al. are amazing tools/platforms, but their
- use involves multiple fairly difficult learning curves,
- and tying them together might mean developing something new for each project.
- Our goal is to provide a single tool and platform to unify all of these so
- that there is a single, gentle learning curve.
- However, it is not our goal to hide or replace these underlying components.
- Advanced users can use them directly, but new users aren't forced to, which
- helps them get up and running with less effort and training.
- Calkit should help users understand what is going on under the hood without
- forcing them to work at that lower level of abstraction.
-
- ## How it works
-
- Calkit creates a simple human-readable "database" inside the `calkit.yaml`
- file, which serves as a way to store important information about the project,
- e.g., what question(s) it seeks to answer,
- what files should be considered datasets, figures, publications, etc.
- The Calkit cloud reads this database and registers the various entities
- as part of the entire ecosystem such that if a project is made public,
- other researchers can find and reuse your work to accelerate their own.
-
  ## Design/UX principles

  1. Be opinionated. Users should not be forced to make unimportant decisions.
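The fallback described in the new README text (confirming that no inputs or process definitions changed since the outputs were saved, when a full rerun is too expensive) can be sketched with content hashes. This is an illustration only, not how Calkit or DVC implement the check; `file_sha256`, `changed_inputs`, and the layout of the recorded hashes are hypothetical.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_inputs(recorded: dict) -> list:
    """List the paths whose current hash differs from the recorded one."""
    return [path for path, old in recorded.items() if file_sha256(path) != old]


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "plot.py")  # hypothetical process definition
    with open(script, "w", encoding="utf-8") as f:
        f.write("print('figure')\n")
    # Hashes recorded at the time the output artifact was saved.
    recorded = {script: file_sha256(script)}
    before = changed_inputs(recorded)
    # Editing the script afterwards makes the saved output stale.
    with open(script, "a", encoding="utf-8") as f:
        f.write("# edited\n")
    after = changed_inputs(recorded)
```

An empty result means the saved outputs still correspond to their inputs, so a rerun can be skipped.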
@@ -17,16 +17,60 @@
  </a>
  </p>

- Calkit is a framework and toolkit for reproducible research projects.
- It acts as a top-level layer to integrate and simplify the use of enabling
- technologies such as
- [Git](https://git-scm.com/),
- [DVC](https://dvc.org/),
- [Conda](https://docs.conda.io/en/latest/),
- and [Docker](https://docker.com).
- Calkit also adds a domain-specific data model
- such that all aspects of the research process can be fully described in a
- single repository and therefore easily consumed by others.
+ Calkit is a language-agnostic project framework and toolkit
+ to make your research or analytics project
+ reproducible to the highest standard,
+ which means:
+
+ > Inputs and process definitions are provided and sufficiently described
+ > such that anyone can easily verify that they produced the outputs
+ > used to support the conclusions.
+
+ "Easily" means that after obtaining your project files,
+ it should only require executing a single command
+ (like "pressing a single button" in
+ [Claerbout and Karrenbach (1992)](https://doi.org/10.1190/1.1822162)),
+ which should finish in less than 15 minutes
+ (suggested by
+ [Vandewalle et al. (2009)](https://doi.org/10.1109/MSP.2009.932122)).
+
+ If the processes are too expensive to rerun in under 15 minutes,
+ it should be possible to confirm that none of the input data
+ or process definitions (e.g., environment specifications, scripts)
+ have changed since saving the current versions of each output artifact
+ (figure, table, dataset, publication, etc.)
+
+ When your project is reproducible,
+ you'll be able to iterate more quickly and more often,
+ easily onboard collaborators,
+ make fewer mistakes,
+ and feel confident sharing all of your project materials
+ with your research articles,
+ because you'll know the code will actually run!
+ This will allow others to reuse parts of your project in their own research,
+ accelerating the pace of discovery.
+
+ Working at this level of automation, discipline, and rigor may sound like
+ a lot of effort,
+ but Calkit makes it easy!
+
+ ## Features
+
+ - A schema to store structured metadata describing the
+ project's important outputs (in its `calkit.yaml` file)
+ and how they are created
+ (its computational environments and pipeline).
+ - A CLI to run the project's pipeline to verify it's reproducible,
+ regenerating outputs as needed and
+ ensuring all
+ computational environments (e.g., [Conda](https://docs.conda.io/en/latest/), [Docker](https://docker.com)) match their specification.
+ - A command line interface (CLI) to simplify keeping code, text, and larger
+ data files backed up in the same project repo using both
+ [Git](https://git-scm.com/) and [DVC](https://dvc.org/).
+ - A complementary
+ [cloud system](https://github.com/calkit/calkit-cloud)
+ to facilitate backup, collaboration,
+ and sharing throughout the entire research lifecycle.

  ## Installation

@@ -113,47 +157,6 @@ This will commit and push to both GitHub and the Calkit Cloud.
  We welcome all kinds of contributions!
  See [CONTRIBUTING.md](CONTRIBUTING.md) to learn how to get involved.

- ## Why does reproducibility matter?
-
- If your work is reproducible, that means that someone else can "run" it and
- calculate the same results or outputs.
- This is a major step towards addressing
- [the replication crisis](https://en.wikipedia.org/wiki/Replication_crisis)
- and has some major benefits for both you as an individual and the research
- community:
-
- 1. You will avoid mistakes caused by, e.g., running an old version of a script
- and including a figure that wasn't created after fixing a bug in the data
- processing pipeline.
- 2. Since your project is "runnable," it's more likely that someone else will be
- able to reuse part of your work to run it in a different context, thereby
- producing a bigger impact and accelerating the pace of discovery.
- If someone can take what you've done and use it to calculate a
- prediction, you have just produced truly useful knowledge.
-
- ## Why another tool/platform?
-
- Git, GitHub, DVC, Docker et al. are amazing tools/platforms, but their
- use involves multiple fairly difficult learning curves,
- and tying them together might mean developing something new for each project.
- Our goal is to provide a single tool and platform to unify all of these so
- that there is a single, gentle learning curve.
- However, it is not our goal to hide or replace these underlying components.
- Advanced users can use them directly, but new users aren't forced to, which
- helps them get up and running with less effort and training.
- Calkit should help users understand what is going on under the hood without
- forcing them to work at that lower level of abstraction.
-
- ## How it works
-
- Calkit creates a simple human-readable "database" inside the `calkit.yaml`
- file, which serves as a way to store important information about the project,
- e.g., what question(s) it seeks to answer,
- what files should be considered datasets, figures, publications, etc.
- The Calkit cloud reads this database and registers the various entities
- as part of the entire ecosystem such that if a project is made public,
- other researchers can find and reuse your work to accelerate their own.
-
  ## Design/UX principles

  1. Be opinionated. Users should not be forced to make unimportant decisions.
@@ -0,0 +1,17 @@
+ __version__ = "0.22.3"
+
+ from .core import * # noqa: F403, I001
+ from . import git # noqa: F401
+ from . import dvc # noqa: F401
+ from . import cloud # noqa: F401
+ from . import jupyter # noqa: F401
+ from . import config # noqa: F401
+ from . import models # noqa: F401
+ from . import office # noqa: F401
+ from . import templates # noqa: F401
+ from . import conda # noqa: F401
+ from . import calc # noqa: F401
+ from . import check # noqa: F401
+ from . import github # noqa: F401
+ from . import zenodo # noqa: F401
+ from . import releases # noqa: F401
@@ -6,8 +6,8 @@ from typing import Callable
  import git
  from git.exc import InvalidGitRepositoryError
  from pydantic import BaseModel, computed_field
- import calkit

+ import calkit

  INSTRUCTIONS_NOTE = (
  "Note that these could be as simple as telling the user to "
@@ -194,7 +194,7 @@ def check_reproducibility(
  readme_path = os.path.join(wdir, "README.md")
  if os.path.isfile(readme_path):
  res["has_readme"] = True
- with open(readme_path) as f:
+ with open(readme_path, encoding="utf-8") as f:
  readme_txt = f.read().lower()
  res["instructions_in_readme"] = (
  ("getting started" in readme_txt)
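The `encoding="utf-8"` change in this hunk can be illustrated in isolation. Without an explicit encoding, `open()` falls back to a platform-dependent default, so a README containing non-ASCII characters may fail to decode on some systems. The sketch below is not Calkit's code; `readme_mentions` and the sample README text are hypothetical.

```python
import os
import tempfile


def readme_mentions(readme_path: str, phrase: str) -> bool:
    """Return True if ``phrase`` occurs in the file, ignoring case."""
    # An explicit encoding keeps the result independent of the platform default.
    with open(readme_path, encoding="utf-8") as f:
        return phrase.lower() in f.read().lower()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    # The accented heading stands in for any non-ASCII README content.
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Démo\n\n## Getting started\n\nRun the pipeline.\n")
    found = readme_mentions(path, "getting started")
    missing = readme_mentions(path, "no such phrase")
```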
@@ -1,4 +1,4 @@
- from .core import *
+ from .core import * # noqa: F403


  def run() -> None:
@@ -221,11 +221,11 @@ def import_environment(
  ),
  ],
  dest_path: Annotated[
- str,
+ str | None,
  typer.Option("--path", help="Output path at which to save."),
  ] = None,
  dest_name: Annotated[
- str,
+ str | None,
  typer.Option(
  "--name", "-n", help="Name to use in the destination project."
  ),
@@ -254,15 +254,15 @@ def import_environment(
  raise_error("Invalid source environment specification")
  if os.path.isdir(project):
  typer.echo(f"Importing from local project directory: {project}")
- src_ck_info = calkit.load_calkit_info(
- wdir=project, process_includes=True
+ src_ck_info = dict(
+ calkit.load_calkit_info(wdir=project, process_includes=True)
  )
  environments = src_ck_info.get("environments", {})
  if env_name not in environments:
  raise_error(f"Environment {env_name} not found in project")
  src_env = environments[env_name]
  if "path" in src_env:
- env_path = src_env["path"]
+ env_path = src_env["path"] # noqa: F841 TODO: Use this variable
  try:
  src_project_name = calkit.detect_project_name(project)
  except Exception as e:
@@ -270,7 +270,7 @@ def import_environment(
  else:
  typer.echo("Importing from Cloud project")
  try:
- resp = calkit.cloud.get(
+ resp = calkit.cloud.get( # noqa: F841 TODO: Use this variable
  f"/projects/{project}/environments/{env_name}"
  )
  except Exception as e:
@@ -278,7 +278,7 @@ def import_environment(
  src_project_name = project
  # TODO: Parse information we need from the response
  # Write environment into current Calkit info
- ck_info = calkit.load_calkit_info()
+ ck_info = dict(calkit.load_calkit_info())
  environments = ck_info.get("environments", {})
  # Check if an environment with this name already exists
  if dest_name is None:
@@ -10,6 +10,7 @@ import posixpath
  import subprocess
  import sys
  import time
+ from pathlib import PurePath

  import dotenv
  import dvc.repo
@@ -632,6 +633,9 @@ def ignore(
  ):
  """Ignore a file, i.e., keep it out of version control."""
  repo = git.Repo()
+ path = PurePath(
+ path
+ ).as_posix() # gitignore expects / (not \) regardless of OS
  if repo.ignored(path):
  typer.echo(f"{path} is already ignored")
  return
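The conversion added to `ignore` relies on `PurePath` resolving to `PureWindowsPath` on Windows and `PurePosixPath` elsewhere, so backslashes are only rewritten where they act as path separators. A minimal sketch, using the two concrete classes explicitly so the behavior is the same on any OS; the sample path is hypothetical:

```python
from pathlib import PurePosixPath, PureWindowsPath

# A path as a Windows user might type it (hypothetical example).
windows_style = "data\\raw\\results.csv"

# PureWindowsPath treats "\" as a separator on any OS, so as_posix()
# yields the "/" form that .gitignore patterns expect.
as_gitignore = PureWindowsPath(windows_style).as_posix()

# PurePosixPath treats "\" as an ordinary character, so nothing changes.
unchanged = PurePosixPath(windows_style).as_posix()
```

On Linux and macOS the added line is therefore effectively a no-op, which is the intended behavior there.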
@@ -1474,16 +1474,17 @@ def new_stage(
  ] = False,
  ):
  """Create a new pipeline stage."""
- ck_info = calkit.load_calkit_info(process_includes="environments")
+ ck_info = dict(calkit.load_calkit_info(process_includes="environments"))
+ environments = ck_info.get("environments", {})
  if environment is None:
  warn("No environment is specified")
  cmd = ""
  else:
- if environment not in ck_info["environments"] and not no_check:
+ if environment not in environments and not no_check:
  raise_error(f"Environment '{environment}' does not exist")
  cmd = f"calkit xenv -n {environment} -- "
  # Add environment path as a dependency if applicable
- env_path = ck_info["environments"].get(environment, {}).get("path")
+ env_path = environments.get(environment, {}).get("path")
  if env_path is not None and env_path not in deps:
  deps = [env_path] + deps
  if not os.path.exists(target) and not no_check:
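The switch from `ck_info["environments"]` to `ck_info.get("environments", {})` matters when a project defines no environments: direct indexing raises `KeyError`, while `.get()` with a default does not. The following sketch uses plain dictionaries as hypothetical stand-ins for the loaded project info; `environment_path` is not a Calkit function.

```python
from typing import Optional


def environment_path(ck_info: dict, environment: str) -> Optional[str]:
    """Return the environment's path, or None if it cannot be found."""
    # .get() with a default avoids a KeyError when "environments" is absent.
    environments = ck_info.get("environments", {})
    return environments.get(environment, {}).get("path")


# Hypothetical project info: one without environments, one with a single entry.
no_envs = {"name": "demo"}
with_env = {"environments": {"py": {"path": "environment.yml"}}}

missing_section = environment_path(no_envs, "py")
found = environment_path(with_env, "py")
missing_env = environment_path(with_env, "other")
```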
@@ -152,8 +152,10 @@ def populate_dvc_cache():
  if not os.path.isfile(dvc_cache_fpath):
  # TODO: Download file from Zenodo and save in the cache
  release_fpath = f".calkit/releases/{name}/files/{obj['path']}"
+ print(release_fpath)
  zip_file = obj.get("zipfile")
  if zip_file is not None:
  zip_fpath = f".calkit/releases/{name}/files/{zip_file}"
+ print(zip_fpath)
  # Extract out of ZIP file if necessary
  # TODO: Check MD5 before inserting into the cache
@@ -0,0 +1 @@
+ from .core import * # noqa: F403
@@ -0,0 +1 @@
+ from .core import * # noqa: F403
@@ -43,7 +43,7 @@ def test_formula():


  def test_lookuptable():
- calc = calkit.calc.LookupTable(
+ _ = calkit.calc.LookupTable(
  inputs=["x"],
  output="something",
  params=calkit.calc.LookupTableParams(
@@ -0,0 +1,60 @@
+ # Home
+
+ Calkit is a language-agnostic project framework and toolkit
+ to make your research or analytics project
+ reproducible to the highest standard,
+ which means:
+
+ > Inputs and process definitions are provided and sufficiently described
+ > such that anyone can easily verify that they produced the outputs
+ > used to support the conclusions.
+
+ "Easily" means that after obtaining your project files,
+ it should only require executing a single command
+ (like "pressing a single button" in
+ [Claerbout and Karrenbach (1992)](https://doi.org/10.1190/1.1822162)),
+ which should finish in less than 15 minutes
+ (suggested by
+ [Vandewalle et al. (2009)](https://doi.org/10.1109/MSP.2009.932122)).
+
+ If the processes are too expensive to rerun in under 15 minutes,
+ it should be possible to confirm that none of the input data
+ or process definitions (e.g., environment specifications, scripts)
+ have changed since saving the current versions of each output artifact
+ (figure, table, dataset, publication, etc.)
+
+ When your project is reproducible,
+ you'll be able to iterate more quickly and more often,
+ easily onboard collaborators,
+ make fewer mistakes,
+ and feel confident sharing all of your project materials
+ with your research articles,
+ because you'll know the code will actually run!
+ This will allow others to reuse parts of your project in their own research,
+ accelerating the pace of discovery.
+
+ Working at this level of automation, discipline, and rigor may sound like
+ a lot of effort,
+ but Calkit makes it easy!
+
+ ## Features
+
+ - A schema to store structured metadata describing the
+ project's important outputs (in its `calkit.yaml` file)
+ and how they are created
+ (its computational environments and pipeline).
+ - A CLI to run the project's pipeline to verify it's reproducible,
+ regenerating outputs as needed and
+ ensuring all
+ computational environments (e.g., [Conda](https://docs.conda.io/en/latest/), [Docker](https://docker.com)) match their specification.
+ - A command line interface (CLI) to simplify keeping code, text, and larger
+ data files backed up in the same project repo using both
+ [Git](https://git-scm.com/) and [DVC](https://dvc.org/).
+ - A complementary
+ [cloud system](https://github.com/calkit/calkit-cloud)
+ to facilitate backup, collaboration,
+ and sharing throughout the entire research lifecycle.
+
+ ## Installation
+
+ See [installation](installation.md).
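Note on the hunk above: the added `docs/index.md` text says that when a rerun is too expensive, it should still be possible to confirm that no input data or process definitions have changed since the outputs were saved. Calkit's own mechanism is not shown in this diff. The following is a generic Python sketch of the idea, assuming a hypothetical mapping of relative paths to SHA-256 digests that was recorded when the outputs were last generated; `file_digest` and `changed_inputs` are illustrative names only.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_inputs(recorded, root="."):
    """List recorded paths that are missing or whose content differs.

    ``recorded`` maps relative paths to the digests saved alongside the
    outputs (a hypothetical manifest format used only for this sketch).
    """
    changed = []
    for rel_path, digest in sorted(recorded.items()):
        path = Path(root) / rel_path
        if not path.is_file() or file_digest(path) != digest:
            changed.append(rel_path)
    return changed
```

An empty list means every recorded input and process definition still matches, so the saved outputs can be treated as current without rerunning anything.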
@@ -90,7 +90,7 @@ warn_unused_ignores = true
  show_error_codes = true

  [tool.ruff]
- target-version = "py39"
+ target-version = "py310"
  line-length = 79
  fix = true
  extend-select = ["I"]
@@ -1,17 +0,0 @@
- __version__ = "0.22.2"
-
- from .core import *
- from . import git
- from . import dvc
- from . import cloud
- from . import jupyter
- from . import config
- from . import models
- from . import office
- from . import templates
- from . import conda
- from . import calc
- from . import check
- from . import github
- from . import zenodo
- from . import releases
@@ -1 +0,0 @@
- from .core import *
@@ -1 +0,0 @@
- from .core import *
@@ -1,51 +0,0 @@
- # Home
-
- Calkit is an open source
- framework and toolkit for reproducible research projects.
- It acts as a top-level layer to integrate and simplify the use of enabling
- technologies such as
- [Git](https://git-scm.com/),
- [DVC](https://dvc.org/),
- [Conda](https://docs.conda.io/en/latest/),
- and [Docker](https://docker.com).
- Calkit also adds a domain-specific data model
- such that all aspects of the research process can be fully described in a
- single repository and therefore easily consumed by others.
-
- Our goal is to make reproducibility easier so it becomes more common.
- To do this, we try to make it easy for users to follow two simple rules:
-
- 1. **Keep everything in version control.** This includes large files like
- datasets, enabled by DVC.
- The [Calkit Cloud](https://github.com/calkit/calkit-cloud),
- hosted at [calkit.io](https://calkit.io),
- serves as a simple default DVC remote storage location for those who do not
- want to manage their own infrastructure.
- 2. **Generate all important artifacts with a single pipeline.** There should be
- no special instructions required to reproduce a project's artifacts.
- It should be as simple as calling `calkit run`.
- The DVC pipeline (in a project's `dvc.yaml` file) is therefore the main
- thing to "build" throughout a research project.
- Calkit provides helper functionality to build pipeline stages that
- keep computational environments up-to-date and label their outputs for
- convenient reuse.
-
- ## Features
-
- - A [version control interface](version-control.md)
- that unifies and simplifies interaction with Git and DVC.
- - Automated [environment management](environments.md).
- - A [project metadata model](calkit-yaml.md)
- to declare global dependencies, environments,
- and artifacts like datasets, figures, notebooks, and publications
- to facilitate searchability and reuse.
- - A complementary [cloud platform](https://calkit.io) to interact with
- the project and its artifacts, which also serves as a DVC remote.
- - Templates for projects, publications, and more.
- - The ability to declare, execute, and track
- [manual procedures](tutorials/procedures.md) and
- pipeline stages with [manual steps](pipeline/manual-steps.md).
- - A Jupyter cell magic to
- [use notebook cells as pipeline stages](tutorials/notebook-pipeline.md).
- - Tools to help improve the reproducibility of workflows that depend on
- [Microsoft Office](tutorials/office.md).
File without changes
File without changes
File without changes