crunch-convert 0.1.0__tar.gz

@@ -0,0 +1,217 @@
1
+ Metadata-Version: 2.1
2
+ Name: crunch-convert
3
+ Version: 0.1.0
4
+ Summary: crunch-convert - Conversion module for the CrunchDAO Platform
5
+ Home-page: https://github.com/crunchdao/crunch-convert
6
+ Author: Enzo CACERES
7
+ Author-email: enzo.caceres@crunchdao.com
8
+ Keywords: package development template
9
+ Classifier: Intended Audience :: Developers
10
+ Classifier: Programming Language :: Python :: 3.7
11
+ Requires-Python: >=3
12
+ Description-Content-Type: text/markdown
13
+ Requires-Dist: click
14
+ Requires-Dist: libcst
15
+ Requires-Dist: requirements-parser>=0.11.0
16
+ Provides-Extra: test
17
+ Requires-Dist: parameterized; extra == "test"
18
+ Requires-Dist: pytest; extra == "test"
19
+ Requires-Dist: pytest-cov; extra == "test"
20
+
21
+ # Crunch Convert Tool
22
+
23
+ [![PyTest](https://github.com/crunchdao/crunch-convert/actions/workflows/pytest.yml/badge.svg)](https://github.com/crunchdao/crunch-convert/actions/workflows/pytest.yml)
24
+
25
+ This Python library is designed for the [CrunchDAO Platform](https://hub.crunchdao.com/), exposing the conversion tools in a very small CLI.
26
+
27
+ - [Crunch Convert Tool](#crunch-convert-tool)
28
+ - [Features](#features)
29
+ - [Automatic line commenting](#automatic-line-commenting)
30
+ - [Specifying package versions](#specifying-package-versions)
31
+ - [R imports via rpy2](#r-imports-via-rpy2)
32
+ - [Embedded Files](#embedded-files)
33
+ - [Installation](#installation)
34
+ - [Usage](#usage)
35
+ - [Via the CLI](#via-the-cli)
36
+ - [Via the Code](#via-the-code)
37
+ - [Contributing](#contributing)
38
+ - [License](#license)
39
+
40
+ # Features
41
+
42
+ ## Automatic line commenting
43
+
44
+ Only functions, imports, and classes are kept.
45
+
46
+ Everything else is commented out to prevent side effects when your code is loaded into the cloud environment (e.g. when you're exploring the data, debugging your algorithm, or creating visualizations with Matplotlib).
47
+
48
+ You can prevent this behavior by using special comments to tell the system to keep part of your code:
49
+
50
+ - To start a section that you want to keep, write: `@crunch/keep:on`
51
+ - To end the section, write: `@crunch/keep:off`
52
+
53
+ ```python
54
+ # @crunch/keep:on
55
+
56
+ # keep global initialization
57
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
58
+
59
+ # keep constants
60
+ TRAIN_DEPTH = 42
61
+ IMPORTANT_FEATURES = [ "a", "b", "c" ]
62
+
63
+ # @crunch/keep:off
64
+
65
+ # this will be ignored
66
+ x, y = crunch.load_data()
67
+
68
+ def train(...):
69
+     ...
70
+ ```
71
+
72
+ The result will be:
73
+
74
+ ```python
75
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
76
+
77
+ TRAIN_DEPTH = 42
78
+ IMPORTANT_FEATURES = [ "a", "b", "c" ]
79
+
80
+ #x, y = crunch.load_data()
81
+
82
+ def train(...):
83
+     ...
84
+ ```
85
+
86
+ > [!TIP]
87
+ > You can put a `@crunch/keep:on` at the top of the cell and never close it to keep everything.
88
+
89
+ ## Specifying package versions
90
+
91
+ Since submitting a notebook does not include a `requirements.txt`, users can instead specify the version of a package using import-level [requirement specifiers](https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples) in a comment on the same line.
92
+
93
+ ```python
94
+ # Valid statements
95
+ import pandas # == 1.3
96
+ import sklearn # >= 1.2, < 2.0
97
+ import tqdm # [foo, bar]
98
+ import scikit # ~= 1.4.2
99
+ from requests import Session # == 1.5
100
+ ```
101
+
102
+ Specifying a version for the same package multiple times will cause the submission to be rejected if the versions differ.
103
+
104
+ ```python
105
+ # Inconsistent versions will be rejected
106
+ import pandas # == 1.3
107
+ import pandas # == 1.5
108
+ ```
109
+
110
+ Specifying versions on standard library modules has no effect (but the submission will still be rejected if the versions are inconsistent).
111
+
112
+ ```python
113
+ # Will be ignored
114
+ import os # == 1.3
115
+ import sys # == 1.5
116
+ ```
117
+
118
+ If an optional dependency is required for the code to work properly, an import statement must be added, even if the code does not use it directly.
119
+
120
+ ```python
121
+ import castle.algorithms
122
+
123
+ # Keep me, I am needed by castle
124
+ import torch
125
+ ```
126
+
127
+ It is possible for a single import name to resolve to several different libraries on PyPI. If this happens, you must specify which one you want. If you do not want a specific version, you can use `@latest`; without it, we cannot distinguish between commented code and version specifiers.
128
+
129
+ ```python
130
+ # Prefer https://pypi.org/project/EMD-signal/
131
+ import pyemd # EMD-signal @latest
132
+
133
+ # Prefer https://pypi.org/project/pyemd/
134
+ import pyemd # pyemd @latest
135
+ ```
136
+
137
+ ## R imports via rpy2
138
+
139
+ For notebook users, R packages are automatically extracted from `importr("<name>")` calls; the `importr` function is provided by [rpy2](https://rpy2.github.io/).
140
+
141
+ ```python
142
+ # Import the `importr` function
143
+ from rpy2.robjects.packages import importr
144
+
145
+ # Import the "base" R package
146
+ base = importr("base")
147
+ ```
148
+
149
+ The following format must be followed:
150
+ - The import must be declared at the root level.
151
+ - The result must be assigned to a variable; the variable's name will not matter.
152
+ - The function name must be `importr`, and it must be imported as shown in the example above.
153
+ - The first argument must be a string constant; variables or other expressions will be ignored.
154
+ - The other arguments are ignored; this allows for [custom import mapping](https://rpy2.github.io/doc/latest/html/robjects_rpackages.html#importing-r-packages) if necessary.
155
+
156
+ The line will not be commented out; [read more about line commenting here](#automatic-line-commenting).
157
+
158
+ ## Embedded Files
159
+
160
+ Additional files can be embedded in cells to be submitted with the notebook. For the system to recognize a cell as an embedded file, the following syntax must be followed:
161
+
162
+ ```
163
+ ---
164
+ file: <file_name>.md
165
+ ---
166
+
167
+ <!-- File content goes here -->
168
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit.
169
+ Aenean rutrum condimentum ornare.
170
+ ```
171
+
172
+ A submission containing multiple cells with the same file name will be rejected.
173
+
174
+ While the focus is on Markdown files, any text file will be accepted, including but not limited to `.txt`, `.yaml`, and `.json`.
175
+
176
+ # Installation
177
+
178
+ Use [pip](https://pypi.org/project/crunch-convert/) to install `crunch-convert`.
179
+
180
+ ```bash
181
+ pip install --upgrade crunch-convert
182
+ ```
183
+
184
+ # Usage
185
+
186
+ ## Via the CLI
187
+
188
+ ```bash
189
+ crunch-convert notebook my-notebook.ipynb
190
+ ```
191
+
192
+ ## Via the Code
193
+
194
+ ```python
195
+ from crunch_convert.notebook import extract_from_file
196
+
197
+ flatten = extract_from_file("notebook.ipynb")
198
+
199
+ with open("main.py", "w") as fd:
200
+     fd.write(flatten.source_code)
201
+
202
+ with open("requirements.txt", "w") as fd:
203
+     for requirement in flatten.requirements:
204
+         fd.write(str(requirement) + "\n")
205
+
206
+ for embedded_file in flatten.embedded_files:
207
+     with open(embedded_file.normalized_path, "w") as fd:
208
+         fd.write(embedded_file.content)
209
+ ```
210
+
211
+ # Contributing
212
+
213
+ Pull requests are always welcome! If you find any issues or have suggestions for improvements, please feel free to submit a pull request or open an issue in the GitHub repository.
214
+
215
+ # License
216
+
217
+ [MIT](https://choosealicense.com/licenses/mit/)
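The automatic line commenting described in the README above can be illustrated with a small standard-library sketch. This is not the package's implementation (the metadata lists `libcst` as a dependency, which suggests a concrete syntax tree is used instead); the function name `comment_out_side_effects` and the line-based marker detection are assumptions made for illustration. Unlike the README's example output, this sketch leaves comments and the marker lines themselves in place.

```python
import ast

KEEP_ON = "@crunch/keep:on"
KEEP_OFF = "@crunch/keep:off"

# Top-level statement types that are always kept, per the README.
ALWAYS_KEPT = (
    ast.Import,
    ast.ImportFrom,
    ast.FunctionDef,
    ast.AsyncFunctionDef,
    ast.ClassDef,
)


def comment_out_side_effects(source: str) -> str:
    """Hypothetical helper: comment out top-level statements that are not kept."""
    lines = source.splitlines()

    # Collect the 1-based numbers of lines inside a keep:on / keep:off section.
    kept_lines = set()
    keeping = False
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("#") and KEEP_ON in stripped:
            keeping = True
        elif stripped.startswith("#") and KEEP_OFF in stripped:
            keeping = False
        elif keeping:
            kept_lines.add(number)

    # Mark every other top-level statement (all of its lines) for commenting.
    to_comment = set()
    for node in ast.parse(source).body:
        if isinstance(node, ALWAYS_KEPT) or node.lineno in kept_lines:
            continue
        to_comment.update(range(node.lineno, node.end_lineno + 1))

    return "\n".join(
        "#" + line if number in to_comment else line
        for number, line in enumerate(lines, start=1)
    )
```

Working on whole top-level statements rather than single lines means a multi-line call is commented out in full, which keeps the resulting file syntactically valid.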
@@ -0,0 +1,197 @@
1
+ # Crunch Convert Tool
2
+
3
+ [![PyTest](https://github.com/crunchdao/crunch-convert/actions/workflows/pytest.yml/badge.svg)](https://github.com/crunchdao/crunch-convert/actions/workflows/pytest.yml)
4
+
5
+ This Python library is designed for the [CrunchDAO Platform](https://hub.crunchdao.com/), exposing the conversion tools in a very small CLI.
6
+
7
+ - [Crunch Convert Tool](#crunch-convert-tool)
8
+ - [Features](#features)
9
+ - [Automatic line commenting](#automatic-line-commenting)
10
+ - [Specifying package versions](#specifying-package-versions)
11
+ - [R imports via rpy2](#r-imports-via-rpy2)
12
+ - [Embedded Files](#embedded-files)
13
+ - [Installation](#installation)
14
+ - [Usage](#usage)
15
+ - [Via the CLI](#via-the-cli)
16
+ - [Via the Code](#via-the-code)
17
+ - [Contributing](#contributing)
18
+ - [License](#license)
19
+
20
+ # Features
21
+
22
+ ## Automatic line commenting
23
+
24
+ Only functions, imports, and classes are kept.
25
+
26
+ Everything else is commented out to prevent side effects when your code is loaded into the cloud environment (e.g. when you're exploring the data, debugging your algorithm, or creating visualizations with Matplotlib).
27
+
28
+ You can prevent this behavior by using special comments to tell the system to keep part of your code:
29
+
30
+ - To start a section that you want to keep, write: `@crunch/keep:on`
31
+ - To end the section, write: `@crunch/keep:off`
32
+
33
+ ```python
34
+ # @crunch/keep:on
35
+
36
+ # keep global initialization
37
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
38
+
39
+ # keep constants
40
+ TRAIN_DEPTH = 42
41
+ IMPORTANT_FEATURES = [ "a", "b", "c" ]
42
+
43
+ # @crunch/keep:off
44
+
45
+ # this will be ignored
46
+ x, y = crunch.load_data()
47
+
48
+ def train(...):
49
+     ...
50
+ ```
51
+
52
+ The result will be:
53
+
54
+ ```python
55
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
56
+
57
+ TRAIN_DEPTH = 42
58
+ IMPORTANT_FEATURES = [ "a", "b", "c" ]
59
+
60
+ #x, y = crunch.load_data()
61
+
62
+ def train(...):
63
+     ...
64
+ ```
65
+
66
+ > [!TIP]
67
+ > You can put a `@crunch/keep:on` at the top of the cell and never close it to keep everything.
68
+
69
+ ## Specifying package versions
70
+
71
+ Since submitting a notebook does not include a `requirements.txt`, users can instead specify the version of a package using import-level [requirement specifiers](https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples) in a comment on the same line.
72
+
73
+ ```python
74
+ # Valid statements
75
+ import pandas # == 1.3
76
+ import sklearn # >= 1.2, < 2.0
77
+ import tqdm # [foo, bar]
78
+ import scikit # ~= 1.4.2
79
+ from requests import Session # == 1.5
80
+ ```
81
+
82
+ Specifying a version for the same package multiple times will cause the submission to be rejected if the versions differ.
83
+
84
+ ```python
85
+ # Inconsistent versions will be rejected
86
+ import pandas # == 1.3
87
+ import pandas # == 1.5
88
+ ```
89
+
90
+ Specifying versions on standard library modules has no effect (but the submission will still be rejected if the versions are inconsistent).
91
+
92
+ ```python
93
+ # Will be ignored
94
+ import os # == 1.3
95
+ import sys # == 1.5
96
+ ```
97
+
98
+ If an optional dependency is required for the code to work properly, an import statement must be added, even if the code does not use it directly.
99
+
100
+ ```python
101
+ import castle.algorithms
102
+
103
+ # Keep me, I am needed by castle
104
+ import torch
105
+ ```
106
+
107
+ It is possible for a single import name to resolve to several different libraries on PyPI. If this happens, you must specify which one you want. If you do not want a specific version, you can use `@latest`; without it, we cannot distinguish between commented code and version specifiers.
108
+
109
+ ```python
110
+ # Prefer https://pypi.org/project/EMD-signal/
111
+ import pyemd # EMD-signal @latest
112
+
113
+ # Prefer https://pypi.org/project/pyemd/
114
+ import pyemd # pyemd @latest
115
+ ```
116
+
117
+ ## R imports via rpy2
118
+
119
+ For notebook users, R packages are automatically extracted from `importr("<name>")` calls; the `importr` function is provided by [rpy2](https://rpy2.github.io/).
120
+
121
+ ```python
122
+ # Import the `importr` function
123
+ from rpy2.robjects.packages import importr
124
+
125
+ # Import the "base" R package
126
+ base = importr("base")
127
+ ```
128
+
129
+ The following format must be followed:
130
+ - The import must be declared at the root level.
131
+ - The result must be assigned to a variable; the variable's name will not matter.
132
+ - The function name must be `importr`, and it must be imported as shown in the example above.
133
+ - The first argument must be a string constant; variables or other expressions will be ignored.
134
+ - The other arguments are ignored; this allows for [custom import mapping](https://rpy2.github.io/doc/latest/html/robjects_rpackages.html#importing-r-packages) if necessary.
135
+
136
+ The line will not be commented out; [read more about line commenting here](#automatic-line-commenting).
137
+
138
+ ## Embedded Files
139
+
140
+ Additional files can be embedded in cells to be submitted with the notebook. For the system to recognize a cell as an embedded file, the following syntax must be followed:
141
+
142
+ ```
143
+ ---
144
+ file: <file_name>.md
145
+ ---
146
+
147
+ <!-- File content goes here -->
148
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit.
149
+ Aenean rutrum condimentum ornare.
150
+ ```
151
+
152
+ A submission containing multiple cells with the same file name will be rejected.
153
+
154
+ While the focus is on Markdown files, any text file will be accepted, including but not limited to `.txt`, `.yaml`, and `.json`.
155
+
156
+ # Installation
157
+
158
+ Use [pip](https://pypi.org/project/crunch-convert/) to install `crunch-convert`.
159
+
160
+ ```bash
161
+ pip install --upgrade crunch-convert
162
+ ```
163
+
164
+ # Usage
165
+
166
+ ## Via the CLI
167
+
168
+ ```bash
169
+ crunch-convert notebook my-notebook.ipynb
170
+ ```
171
+
172
+ ## Via the Code
173
+
174
+ ```python
175
+ from crunch_convert.notebook import extract_from_file
176
+
177
+ flatten = extract_from_file("notebook.ipynb")
178
+
179
+ with open("main.py", "w") as fd:
180
+     fd.write(flatten.source_code)
181
+
182
+ with open("requirements.txt", "w") as fd:
183
+     for requirement in flatten.requirements:
184
+         fd.write(str(requirement) + "\n")
185
+
186
+ for embedded_file in flatten.embedded_files:
187
+     with open(embedded_file.normalized_path, "w") as fd:
188
+         fd.write(embedded_file.content)
189
+ ```
190
+
191
+ # Contributing
192
+
193
+ Pull requests are always welcome! If you find any issues or have suggestions for improvements, please feel free to submit a pull request or open an issue in the GitHub repository.
194
+
195
+ # License
196
+
197
+ [MIT](https://choosealicense.com/licenses/mit/)
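The import-level version specifiers described in the README above can be sketched with a regular expression. This is only an illustration under stated assumptions: the package itself depends on `requirements-parser`, whereas this sketch uses plain `re`; the name `collect_specifiers`, the use of `ValueError`, and the rule that a comment counts as a specifier only when it starts with a comparison operator or `[` are all hypothetical simplifications. In particular, the `<name> @latest` form is not handled here.

```python
import re
from typing import Dict

# "import pandas # == 1.3" or "from requests import Session # == 1.5"
IMPORT_WITH_COMMENT = re.compile(
    r"^\s*(?:import|from)\s+([A-Za-z_]\w*)[^#]*#\s*(.+?)\s*$"
)

# Assumption: a trailing comment is a specifier only if it starts like one.
LOOKS_LIKE_SPECIFIER = re.compile(r"^(==|~=|!=|<=|>=|<|>|\[)")


def collect_specifiers(source: str) -> Dict[str, str]:
    """Hypothetical helper: map top-level import names to their specifier."""
    found: Dict[str, str] = {}
    for line in source.splitlines():
        match = IMPORT_WITH_COMMENT.match(line)
        if match is None:
            continue  # not an import, or no trailing comment

        name, comment = match.groups()
        if not LOOKS_LIKE_SPECIFIER.match(comment):
            continue  # treated as an ordinary comment

        normalized = comment.replace(" ", "")
        previous = found.setdefault(name, normalized)
        if previous != normalized:
            # Mirrors the README rule: differing versions are rejected.
            raise ValueError(
                f"inconsistent versions for {name}: {previous} vs {normalized}"
            )
    return found
```

For instance, `collect_specifiers("import pandas # == 1.3")` yields a mapping from `pandas` to `==1.3`, while two `pandas` imports with different specifiers raise an error.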
@@ -0,0 +1,7 @@
1
+ """
2
+ crunch-convert
3
+ ~~~~~~
4
+ The crunch module to convert a notebook into multiple components!
5
+ """
6
+
7
+ from crunch_convert import notebook as notebook
@@ -0,0 +1,3 @@
1
+ from crunch_convert.cli import cli
2
+
3
+ cli()
@@ -0,0 +1,6 @@
1
+ __title__ = 'crunch-convert'
2
+ __description__ = 'crunch-convert - Conversion module for the CrunchDAO Platform'
3
+ __version__ = '0.1.0'
4
+ __author__ = 'Enzo CACERES'
5
+ __author_email__ = 'enzo.caceres@crunchdao.com'
6
+ __url__ = 'https://github.com/crunchdao/crunch-convert'
@@ -0,0 +1,69 @@
1
+ import json
2
+ import os
3
+
4
+ import click
5
+
6
+ from crunch_convert.__version__ import __version__
7
+ from crunch_convert.notebook import (ConverterError,
8
+ InconsistantLibraryVersionError,
9
+ NotebookCellParseError, extract_from_file)
10
+ from crunch_convert.notebook._utils import print_indented
11
+
12
+
13
+ @click.group()
14
+ @click.version_option(__version__, package_name="__version__.__title__")
15
+ def cli():
16
+     pass
17
+
18
+
19
+ @cli.command(help="Convert a notebook to a python script.")
20
+ @click.option("--override", is_flag=True, help="Force overwrite of the python file.")
21
+ @click.argument("notebook-file-path", required=True)
22
+ @click.argument("python-file-path", default="main.py")
23
+ def notebook(
24
+     override: bool,
25
+     notebook_file_path: str,
26
+     python_file_path: str,
27
+ ):
28
+     try:
29
+         flatten = extract_from_file(
30
+             notebook_file_path,
31
+             print=print,
32
+             validate=True,
33
+         )
34
+     except IOError as error:
35
+         print(f"{notebook_file_path}: cannot read notebook file: {error}")
36
+         raise click.Abort()
37
+     except json.JSONDecodeError as error:
38
+         print(f"{notebook_file_path}: cannot parse notebook file: {error}")
39
+         raise click.Abort()
40
+     except ConverterError as error:
41
+         print(f"{notebook_file_path}: convert failed: {error}")
42
+
43
+         if isinstance(error, NotebookCellParseError):
44
+             print(f" cell: {error.cell_id} ({error.cell_index})")
45
+             print(f" source:")
46
+             print_indented(error.cell_source)
47
+             print(f" parser error:")
48
+             print_indented(error.parser_error or "None")
49
+
50
+         elif isinstance(error, InconsistantLibraryVersionError):
51
+             print(f" package name: {error.package_name}")
52
+             print(f" first version: {error.old}")
53
+             print(f" other version: {error.new}")
54
+
55
+         raise click.Abort()
56
+
57
+     if not override and os.path.exists(python_file_path):
58
+         override = click.prompt(
59
+             f"file {python_file_path} already exists, override?",
60
+             type=bool,
61
+             default=False,
62
+             prompt_suffix=" "
63
+         )
64
+
65
+         if not override:
66
+             raise click.Abort()
67
+
68
+     with open(python_file_path, "w") as fd:
69
+         fd.write(flatten.source_code)
@@ -0,0 +1,27 @@
1
+ """
2
+ crunch-convert
3
+ ~~~~~~
4
+ The crunch module to convert a notebook into multiple components!
5
+ """
6
+
7
+ from crunch_convert.notebook._embedded import EmbeddedFile as EmbeddedFile
8
+ from crunch_convert.notebook._notebook import ConverterError as ConverterError
9
+ from crunch_convert.notebook._notebook import \
10
+ FlattenNotebook as FlattenNotebook
11
+ from crunch_convert.notebook._notebook import \
12
+ InconsistantLibraryVersionError as InconsistantLibraryVersionError
13
+ from crunch_convert.notebook._notebook import \
14
+ NotebookCellParseError as NotebookCellParseError
15
+ from crunch_convert.notebook._notebook import \
16
+ RequirementVersionParseError as RequirementVersionParseError
17
+ from crunch_convert.notebook._notebook import \
18
+ extract_from_cells as extract_from_cells
19
+ from crunch_convert.notebook._notebook import \
20
+ extract_from_file as extract_from_file
21
+ from crunch_convert.notebook._requirement import \
22
+ ImportedRequirement as ImportedRequirement
23
+ from crunch_convert.notebook._requirement import \
24
+ ImportedRequirementLanguage as ImportedRequirementLanguage
25
+
26
+ # alias for compatibility with previous versions
27
+ EmbedFile = EmbeddedFile
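The README's rules for R imports (root level, result assigned to a variable, function named `importr`, first argument a string constant) can be sketched with the standard `ast` module. This is an illustrative sketch only, not the code behind `extract_from_cells`; the helper name `extract_r_packages` is hypothetical, and the check that `importr` was actually imported from `rpy2.robjects.packages` is omitted for brevity.

```python
import ast
from typing import List


def extract_r_packages(source: str) -> List[str]:
    """Hypothetical helper: list R package names from root-level importr calls."""
    packages: List[str] = []

    # Only iterate over root-level statements; nested calls are never visited.
    for node in ast.parse(source).body:
        # The result must be assigned to a variable.
        if not isinstance(node, ast.Assign):
            continue

        call = node.value
        if not isinstance(call, ast.Call):
            continue

        # The function must be referenced by the plain name `importr`.
        if not (isinstance(call.func, ast.Name) and call.func.id == "importr"):
            continue

        # The first argument must be a string constant; other arguments are ignored.
        if not call.args:
            continue
        first = call.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            packages.append(first.value)

    return packages
```

Because only `call.args[0]` is inspected, extra positional or keyword arguments (such as a custom import mapping) do not affect the result.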
@@ -0,0 +1,8 @@
1
+ from dataclasses import dataclass
2
+
3
+
4
+ @dataclass()
5
+ class EmbeddedFile:
6
+     path: str
7
+     normalized_path: str
8
+     content: str
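To connect the `EmbeddedFile` dataclass with the front-matter syntax shown in the README, here is a minimal parsing sketch. It is not the package's parser: the function `parse_embedded_cell` is hypothetical, and using `posixpath.normpath` for `normalized_path` is an assumption, since the diff does not show how the real normalization works. The dataclass is redeclared locally so the example is self-contained.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass()
class EmbeddedFile:
    # Same fields as the dataclass in the diff above.
    path: str
    normalized_path: str
    content: str


def parse_embedded_cell(cell_source: str) -> Optional[EmbeddedFile]:
    """Hypothetical helper: parse a '---' delimited header followed by content."""
    lines = cell_source.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # not an embedded-file cell

    header = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break  # closing delimiter found at `index`
        key, separator, value = line.partition(":")
        if separator:
            header[key.strip()] = value.strip()
    else:
        return None  # header was never closed

    path = header.get("file")
    if not path:
        return None

    content = "\n".join(lines[index + 1:]).strip()
    # Assumption: normalization collapses "." and ".." segments.
    return EmbeddedFile(path, posixpath.normpath(path), content)
```

Returning `None` for anything that does not match lets a caller fall back to treating the cell as ordinary Markdown.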