scale-nucleus 0.1.10__py3-none-any.whl → 0.1.24__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- nucleus/__init__.py +259 -162
- nucleus/annotation.py +121 -32
- nucleus/autocurate.py +26 -0
- nucleus/constants.py +43 -5
- nucleus/dataset.py +213 -52
- nucleus/dataset_item.py +139 -26
- nucleus/errors.py +21 -3
- nucleus/job.py +27 -6
- nucleus/model.py +23 -2
- nucleus/model_run.py +56 -14
- nucleus/payload_constructor.py +39 -2
- nucleus/prediction.py +75 -14
- nucleus/scene.py +241 -0
- nucleus/slice.py +24 -15
- nucleus/url_utils.py +22 -0
- nucleus/utils.py +26 -5
- {scale_nucleus-0.1.10.dist-info → scale_nucleus-0.1.24.dist-info}/LICENSE +0 -0
- scale_nucleus-0.1.24.dist-info/METADATA +85 -0
- scale_nucleus-0.1.24.dist-info/RECORD +21 -0
- {scale_nucleus-0.1.10.dist-info → scale_nucleus-0.1.24.dist-info}/WHEEL +1 -1
- scale_nucleus-0.1.10.dist-info/METADATA +0 -236
- scale_nucleus-0.1.10.dist-info/RECORD +0 -18
scale_nucleus-0.1.10.dist-info/METADATA
@@ -1,236 +0,0 @@
Metadata-Version: 2.1
Name: scale-nucleus
Version: 0.1.10
Summary: The official Python client library for Nucleus, the Data Platform for AI
Home-page: https://scale.com/nucleus
License: MIT
Author: Scale AI Nucleus Team
Author-email: nucleusapi@scaleapi.com
Requires-Python: >=3.6.2,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: aiohttp (>=3.7.4,<4.0.0)
Requires-Dist: dataclasses (>=0.7,<0.8); python_version >= "3.6.1" and python_version < "3.7"
Requires-Dist: requests (>=2.23.0,<3.0.0)
Requires-Dist: tqdm (>=4.41.0,<5.0.0)
Project-URL: Documentation, https://dashboard.scale.com/nucleus/docs/api
Project-URL: Repository, https://github.com/scaleapi/nucleus-python-client
Description-Content-Type: text/markdown

# Nucleus

https://dashboard.scale.com/nucleus

Aggregate metrics in ML are not good enough. To improve production ML, you need to understand your models' qualitative failure modes, fix them by gathering more data, and curate diverse scenarios.

Scale Nucleus helps you:

- Visualize your data
- Curate interesting slices within your dataset
- Review and manage annotations
- Measure and debug your model performance

Nucleus is a new way—the right way—to develop ML models, helping us move away from the concept of one dataset and towards a paradigm of collections of scenarios.

## Installation

`$ pip install scale-nucleus`

## Usage

The first step to using the Nucleus library is instantiating a client object.
The client abstraction serves to authenticate the user and acts as the gateway
for users to interact with their datasets, models, and model runs.

### Create a client object

```python
import nucleus
client = nucleus.NucleusClient("YOUR_API_KEY_HERE")
```
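If you prefer not to hard-code the key, a common pattern is to read it from an environment variable. This is a minimal sketch, not something the library requires; the variable name `NUCLEUS_API_KEY` is an illustrative choice:

```python
import os

import nucleus

# NUCLEUS_API_KEY is a hypothetical variable name; use whatever your environment defines.
api_key = os.environ.get("NUCLEUS_API_KEY")
if api_key is None:
    raise RuntimeError("Set NUCLEUS_API_KEY before creating the client")
client = nucleus.NucleusClient(api_key)
```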
### Create Dataset

```python
dataset = client.create_dataset("My Dataset")
```

### List Datasets

```python
datasets = client.list_datasets()
```

### Delete a Dataset

Delete a dataset by specifying its dataset id.
A response code of 200 indicates successful deletion.

```python
client.delete_dataset("YOUR_DATASET_ID")
```
### Append Items to a Dataset

You can append both local images and images from the web. Simply specify the location and Nucleus will automatically infer whether it is a remote URL or a local file.

```python
from nucleus import DatasetItem

dataset_item_1 = DatasetItem(image_location="./1.jpeg", reference_id="1", metadata={"key": "value"})
dataset_item_2 = DatasetItem(image_location="s3://srikanth-nucleus/9-1.jpg", reference_id="2", metadata={"key": "value"})
```

The append function expects a list of `DatasetItem` objects to upload, like this:

```python
response = dataset.append([dataset_item_1, dataset_item_2])
```
### Get Dataset Info

Returns the dataset name, number of dataset items, model_runs, and slice_ids.

```python
dataset.info
```

### Access Dataset Items

There are three methods to access individual Dataset Items:

(1) Dataset Items are accessible by reference id

```python
item = dataset.refloc("my_img_001.png")
```

(2) Dataset Items are accessible by index

```python
item = dataset.iloc(0)
```

(3) Dataset Items are accessible by the dataset_item_id assigned internally

```python
item = dataset.loc("dataset_item_id")
```

### Add Annotations

Upload ground-truth annotations for the items in your dataset.
Box2DAnnotation has the same format as described at https://dashboard.scale.com/nucleus/docs/api#add-ground-truth

```python
from nucleus import BoxAnnotation

annotation_1 = BoxAnnotation(reference_id="1", label="label", x=0, y=0, width=10, height=10, annotation_id="ann_1", metadata={})
annotation_2 = BoxAnnotation(reference_id="2", label="label", x=0, y=0, width=10, height=10, annotation_id="ann_2", metadata={})
response = dataset.annotate([annotation_1, annotation_2])
```

For particularly large payloads, please reference the accompanying scripts in **references**.
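If those scripts are not at hand, one simple approach is to upload annotations in smaller batches using the same `dataset.annotate` call shown above. This is only an illustrative sketch, not the official large-payload tooling, and the batch size is an arbitrary choice:

```python
def annotate_in_batches(dataset, annotations, batch_size=1000):
    """Upload a large list of annotations in smaller chunks.

    Hypothetical helper for illustration; response handling is left to the caller.
    """
    responses = []
    for start in range(0, len(annotations), batch_size):
        batch = annotations[start:start + batch_size]
        responses.append(dataset.annotate(batch))
    return responses
```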
### Add Model

The model abstraction is intended to represent a unique architecture.
Models are independent of any dataset.

```python
model = client.add_model(name="My Model", reference_id="newest-cnn-its-new", metadata={"timestamp": "121012401"})
```

### Upload Predictions to ModelRun

This method populates the model_run object with predictions. `ModelRun` objects need to reference a `Dataset` that has been created.
Returns the associated model_id, human-readable name of the run, status, and user-specified metadata.
Takes a list of Box2DPredictions within the payload, where each Box2DPrediction
is formatted as described at https://dashboard.scale.com/nucleus/docs/api#upload-model-outputs

```python
from nucleus import BoxPrediction

prediction_1 = BoxPrediction(reference_id="1", label="label", x=0, y=0, width=10, height=10, annotation_id="pred_1", confidence=0.9)
prediction_2 = BoxPrediction(reference_id="2", label="label", x=0, y=0, width=10, height=10, annotation_id="pred_2", confidence=0.2)

model_run = model.create_run(name="My Model Run", metadata={"timestamp": "121012401"}, dataset=dataset, predictions=[prediction_1, prediction_2])
```
### Commit ModelRun

The commit action indicates that the user is finished uploading predictions associated
with this model run. Committing a model run kicks off Nucleus internal processes
to calculate performance metrics like IoU. After being committed, a ModelRun object becomes immutable.

```python
model_run.commit()
```

### Get ModelRun Info

Returns the associated model_id, human-readable name of the run, status, and user-specified metadata.

```python
model_run.info
```

### Accessing ModelRun Predictions

You can access the ModelRun predictions for an individual dataset_item through three methods:

(1) user-specified reference_id

```python
model_run.refloc("my_img_001.png")
```

(2) Index

```python
model_run.iloc(0)
```

(3) Internally maintained dataset_item_id

```python
model_run.loc("dataset_item_id")
```

### Delete ModelRun

Delete a model run using the target model_run_id.

A response code of 200 indicates successful deletion.

```python
client.delete_model_run("model_run_id")
```
## For Developers

Clone the repository from GitHub and install it as an editable package:

```
git clone git@github.com:scaleapi/nucleus-python-client.git
cd nucleus-python-client
pip3 install poetry
poetry install
```

Please install the pre-commit hooks by running the following command:

```
poetry run pre-commit install
```

**Best practices for testing:**

(1) Run pytest from the root directory of the repo, e.g.

```
poetry run pytest tests/test_dataset.py
```

(2) To skip slow integration tests that have to wait for an async job to start, run:

```
poetry run pytest -m "not integration"
```
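The marker filter and a specific test file can also be combined in a single invocation; this is standard pytest behaviour rather than anything specific to this repo:

```
poetry run pytest -m "not integration" tests/test_dataset.py
```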
scale_nucleus-0.1.10.dist-info/RECORD
@@ -1,18 +0,0 @@
nucleus/__init__.py,sha256=GZAE6HQoGnocPEOBRVLiqIFwVGeULmbEELneXsNJAVc,38550
nucleus/annotation.py,sha256=DcIccmP07Fk8w6xadpJ67YREMzR76so-ksh7YO5mlI0,7595
nucleus/constants.py,sha256=l8Wvr68x0It7JvaVmOwe4KlA_8vrSkU5xbqmWoBa8t0,2078
nucleus/dataset.py,sha256=aGOMncVTQHe8-b8B7VbyoorlNGSBhYlgcateV-42nWs,12263
nucleus/dataset_item.py,sha256=DuzQWPIqQ-u8h0HwOlGW3clQy6DlA4RWbntf3fTj8wc,2479
nucleus/errors.py,sha256=RNuP5tlTIkym-Y_IJTfvrvR7QQwt75QJ1zHsYztIB-8,1597
nucleus/job.py,sha256=a3o04oMEFDJA-mPWcQG_Ml5c3gum7u1fNeoFPNCuCFk,1648
nucleus/model.py,sha256=3ddk-y9K1Enolzrd4ku0BeeMgcBdO7oo5S8W9oFpcrY,1576
nucleus/model_run.py,sha256=qZb7jsONv-NZie18f6VxRsm2J-0Y3M4VDN4M5YPKl4M,6498
nucleus/payload_constructor.py,sha256=WowN3QT8FgIcqexiVM8VrQkwc4gpVUw9-atQNNxUb4g,2738
nucleus/prediction.py,sha256=so07LrCt89qsDTSJxChoJQmZ5z-LbiyJnqjUH3oq0v8,4491
nucleus/slice.py,sha256=q_TF1aMKQszHsXEREVVjCU8bftghQDyv0IbLWYv1_Po,5544
nucleus/upload_response.py,sha256=pwOb3iS6TbpoumC1Mao6Pyli7dXBRDcI0zjNfCMU4_c,2729
nucleus/utils.py,sha256=dSwKo4UlxGJ_Nnl7Ez6FfCXJtb4-cwh_1sGtCNQa1f0,5398
scale_nucleus-0.1.10.dist-info/LICENSE,sha256=jaTGyQSQIZeWMo5iyYqgbAYHR9Bdy7nOzgE-Up3m_-g,1075
scale_nucleus-0.1.10.dist-info/WHEEL,sha256=V7iVckP-GYreevsTDnv1eAinQt_aArwnAxmnP0gygBY,83
scale_nucleus-0.1.10.dist-info/METADATA,sha256=mhy5YffqL0DKMishVUW_YTMdaN0qgOGMHa-fhSQR72Y,6662
scale_nucleus-0.1.10.dist-info/RECORD,,