henge 0.2.2__tar.gz → 0.2.3__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- henge-0.2.3/PKG-INFO +132 -0
- henge-0.2.3/README.md +98 -0
- henge-0.2.3/henge/_version.py +1 -0
- {henge-0.2.2 → henge-0.2.3}/henge/henge.py +35 -26
- henge-0.2.3/henge/scconf.py +359 -0
- henge-0.2.3/henge.egg-info/PKG-INFO +132 -0
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/SOURCES.txt +1 -1
- {henge-0.2.2 → henge-0.2.3}/setup.py +5 -9
- henge-0.2.2/PKG-INFO +0 -28
- henge-0.2.2/README.md +0 -7
- henge-0.2.2/henge/_version.py +0 -1
- henge-0.2.2/henge.egg-info/PKG-INFO +0 -28
- henge-0.2.2/henge.egg-info/entry_points.txt +0 -2
- {henge-0.2.2 → henge-0.2.3}/LICENSE.txt +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge/__init__.py +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge/const.py +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge/deprecated.py +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/dependency_links.txt +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/requires.txt +0 -0
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/top_level.txt +0 -0
- {henge-0.2.2 → henge-0.2.3}/setup.cfg +0 -0
- {henge-0.2.2 → henge-0.2.3}/tests/test_henge.py +0 -0
henge-0.2.3/PKG-INFO ADDED
@@ -0,0 +1,132 @@
+Metadata-Version: 2.4
+Name: henge
+Version: 0.2.3
+Summary: Storage and retrieval of object-derived, decomposable recursive unique identifiers.
+Home-page: https://databio.org
+Author: Nathan Sheffield
+Author-email: nathan@code.databio.org
+License: BSD2
+Classifier: Development Status :: 4 - Beta
+Classifier: License :: OSI Approved :: BSD License
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: System :: Distributed Computing
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE.txt
+Requires-Dist: jsonschema
+Requires-Dist: ubiquerg>=0.5.2
+Requires-Dist: yacman>=0.6.7
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: keywords
+Dynamic: license
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+# Henge
+
+Henge is a Python package for building data storage and retrieval interfaces for arbitrary data. Henge is based on the idea of **decomposable recursive unique identifiers (DRUIDs)**, which are hash-based unique identifiers for data derived from the data itself. For arbitrary data with any structure, Henge can mint unique DRUIDs to identify data, store the data in a key-value database of your choice, and provide lookup functions to retrieve the data in its original structure using its DRUID identifier.
+
+Henge was intended as a building block for [sequence collections](https://github.com/refgenie/seqcol), but is generic enough to use for any data type that needs content-derived identifiers with database lookup capability.
+
+## Install
+
+```
+pip install henge
+```
+
+## Quick Start
+
+Create a Henge object by providing a database and a data schema. The database can be a Python dict or backed by persistent storage. Data schemas are [JSON-schema](https://json-schema.org/) descriptions of data types, and can be hierarchical.
+
+```python
+import henge
+
+schemas = ["path/to/json_schema.yaml"]
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+Insert items into the henge. Upon insert, henge returns the DRUID (digest/checksum/unique identifier) for your object:
+
+```python
+druid = h.insert({"name": "Pat", "age": 38}, item_type="person")
+```
+
+Retrieve the original object using the DRUID:
+
+```python
+h.retrieve(druid)
+# {'age': '38', 'name': 'Pat'}
+```
+
+## Tutorial
+
+For a comprehensive walkthrough covering basic types, arrays, nested objects, and advanced features, see the [tutorial notebook](docs/tutorial.ipynb).
+
+## What are DRUIDs?
+
+DRUIDs are a special type of unique identifiers with two powerful properties:
+
+- **Decomposable**: Identifiers in henge automatically retrieve structured data (tuples, arrays, objects). The structure is defined by a JSON schema, so henge can be used as a back-end for arbitrary data types.
+
+- **Recursive**: Individual elements retrieved by henge can be tagged as recursive, meaning these attributes contain their own DRUIDs. Henge can recurse through these, allowing you to mint unique identifiers for arbitrary nested data structures.
+
+A DRUID is ultimately the result of a digest operation (such as `md5` or `sha256`) on some data. Because DRUIDs are computed deterministically from the item, they represent globally unique identifiers. If you insert the same item repeatedly, it will produce the same DRUID -- this is true across henges as long as they share a data schema.
+
+## Persisting Data
+
+### In-memory (default)
+
+Use a Python `dict` as the database for testing or ephemeral use:
+
+```python
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+### SQLite backend
+
+For persistent storage with SQLite:
+
+```python
+from sqlitedict import SqliteDict
+
+mydict = SqliteDict('./my_db.sqlite', autocommit=True)
+h = henge.Henge(mydict, schemas=schemas)
+```
+
+Requires: `pip install sqlitedict`
+
+### MongoDB backend
+
+For production use with MongoDB:
+
+1. **Start MongoDB with Docker:**
+
+```bash
+docker run --network="host" mongo
+```
+
+For persistent storage, mount a volume to `/data/db`:
+
+```bash
+docker run -it --network="host" -v /path/to/data:/data/db mongo
+```
+
+2. **Connect henge to MongoDB:**
+
+```python
+import henge
+
+h = henge.Henge(henge.connect_mongo(), schemas=schemas)
+```
+
+Requires: `pip install pymongo mongodict`
henge-0.2.3/README.md ADDED
@@ -0,0 +1,98 @@
+# Henge
+
+Henge is a Python package for building data storage and retrieval interfaces for arbitrary data. Henge is based on the idea of **decomposable recursive unique identifiers (DRUIDs)**, which are hash-based unique identifiers for data derived from the data itself. For arbitrary data with any structure, Henge can mint unique DRUIDs to identify data, store the data in a key-value database of your choice, and provide lookup functions to retrieve the data in its original structure using its DRUID identifier.
+
+Henge was intended as a building block for [sequence collections](https://github.com/refgenie/seqcol), but is generic enough to use for any data type that needs content-derived identifiers with database lookup capability.
+
+## Install
+
+```
+pip install henge
+```
+
+## Quick Start
+
+Create a Henge object by providing a database and a data schema. The database can be a Python dict or backed by persistent storage. Data schemas are [JSON-schema](https://json-schema.org/) descriptions of data types, and can be hierarchical.
+
+```python
+import henge
+
+schemas = ["path/to/json_schema.yaml"]
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+Insert items into the henge. Upon insert, henge returns the DRUID (digest/checksum/unique identifier) for your object:
+
+```python
+druid = h.insert({"name": "Pat", "age": 38}, item_type="person")
+```
+
+Retrieve the original object using the DRUID:
+
+```python
+h.retrieve(druid)
+# {'age': '38', 'name': 'Pat'}
+```
+
+## Tutorial
+
+For a comprehensive walkthrough covering basic types, arrays, nested objects, and advanced features, see the [tutorial notebook](docs/tutorial.ipynb).
+
+## What are DRUIDs?
+
+DRUIDs are a special type of unique identifiers with two powerful properties:
+
+- **Decomposable**: Identifiers in henge automatically retrieve structured data (tuples, arrays, objects). The structure is defined by a JSON schema, so henge can be used as a back-end for arbitrary data types.
+
+- **Recursive**: Individual elements retrieved by henge can be tagged as recursive, meaning these attributes contain their own DRUIDs. Henge can recurse through these, allowing you to mint unique identifiers for arbitrary nested data structures.
+
+A DRUID is ultimately the result of a digest operation (such as `md5` or `sha256`) on some data. Because DRUIDs are computed deterministically from the item, they represent globally unique identifiers. If you insert the same item repeatedly, it will produce the same DRUID -- this is true across henges as long as they share a data schema.
+
+## Persisting Data
+
+### In-memory (default)
+
+Use a Python `dict` as the database for testing or ephemeral use:
+
+```python
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+### SQLite backend
+
+For persistent storage with SQLite:
+
+```python
+from sqlitedict import SqliteDict
+
+mydict = SqliteDict('./my_db.sqlite', autocommit=True)
+h = henge.Henge(mydict, schemas=schemas)
+```
+
+Requires: `pip install sqlitedict`
+
+### MongoDB backend
+
+For production use with MongoDB:
+
+1. **Start MongoDB with Docker:**
+
+```bash
+docker run --network="host" mongo
+```
+
+For persistent storage, mount a volume to `/data/db`:
+
+```bash
+docker run -it --network="host" -v /path/to/data:/data/db mongo
+```
+
+2. **Connect henge to MongoDB:**
+
+```python
+import henge
+
+h = henge.Henge(henge.connect_mongo(), schemas=schemas)
+```
+
+Requires: `pip install pymongo mongodict`
henge-0.2.3/henge/_version.py ADDED
@@ -0,0 +1 @@
+__version__ = "0.2.3"
{henge-0.2.2 → henge-0.2.3}/henge/henge.py
@@ -1,4 +1,4 @@
-"""
+"""An interface to a database back-end for DRUIDs"""
 
 import base64
 import copy
@@ -57,14 +57,18 @@ def read_url(url):
         raise e
     data = response.read()  # a `bytes` object
     text = data.decode("utf-8")
-    print(text)
     return yaml.safe_load(text)
 
 
 class Henge(object):
     def __init__(
-        self,
-
+        self,
+        database: dict,
+        schemas: list[str],
+        schemas_str: list[str] = None,
+        henges: dict = None,
+        checksum_function: callable = md5,
+    ) -> None:
         """
         A user interface to insert and retrieve decomposable recursive unique
         identifiers (DRUIDs).
@@ -115,7 +119,7 @@ class Henge(object):
             )
             # populated_schemas.append(yaml.safe_load(schema_value))
 
-        for schema_value in schemas_str:
+        for schema_value in schemas_str or []:
             populated_schemas.append(yaml.safe_load(schema_value))
 
         split_schemas = {}
@@ -140,7 +144,9 @@ class Henge(object):
             self.schemas[item_type] = henge.schemas[item_type]
             self.henges[item_type] = henge
 
-    def retrieve(
+    def retrieve(
+        self, druid: str, reclimit: int = None, raw: bool = False
+    ) -> dict | list:
         """
         Retrieve an item given a digest
 
@@ -202,7 +208,7 @@ class Henge(object):
     def lookup(self, druid, item_type):
         try:
             henge_to_query = self.henges[item_type]
-        except:
+        except KeyError:
            _LOGGER.debug("No henges available for this item type")
            raise NotFoundException(druid)
         try:
@@ -236,7 +242,9 @@ class Henge(object):
                 continue
         return valid_schemas
 
-    def insert(
+    def insert(
+        self, item: dict | list, item_type: str, reclimit: int = None
+    ) -> str | bool:
         """
         Add structured items of a specified type to the database.
 
@@ -251,8 +259,9 @@ class Henge(object):
 
         if item_type not in self.schemas.keys():
             _LOGGER.error(
-                "I don't know about items of type '{}'. "
-
+                "I don't know about items of type '{}'. I know of: '{}'".format(
+                    item_type, list(self.schemas.keys())
+                )
             )
             return False
 
@@ -336,8 +345,9 @@ class Henge(object):
         """
         if item_type not in self.schemas.keys():
             _LOGGER.error(
-                "I don't know about items of type '{}'. "
-
+                "I don't know about items of type '{}'. I know of: '{}'".format(
+                    item_type, list(self.schemas.keys())
+                )
             )
             return False
 
@@ -357,7 +367,7 @@ class Henge(object):
                     item_type, item
                 )
             )
-
+            _LOGGER.error(e)
 
         if isinstance(item, str):
             henge_to_query = self.henges[item_type]
@@ -378,7 +388,6 @@ class Henge(object):
                     return item
 
             raise e
-            return None
 
         _LOGGER.debug(f"item to insert: {item}")
         item_inherent_split = select_inherent_properties(item, valid_schema)
@@ -416,17 +425,14 @@ class Henge(object):
 
         henge_to_query = self.henges[item_type]
         # _LOGGER.debug("henge_to_query: {}".format(henge_to_query))
-
-
-
-
-
-
-
-
-            self.database[druid + "_digest_version"] = digest_version
-        except Exception as e:
-            raise e
+        henge_to_query.database[druid] = string
+        henge_to_query.database[druid + ITEM_TYPE] = item_type
+        henge_to_query.database[druid + "_digest_version"] = digest_version
+        henge_to_query.database[druid + "_external_string"] = external_string
+
+        if henge_to_query != self:
+            self.database[druid + ITEM_TYPE] = item_type
+            self.database[druid + "_digest_version"] = digest_version
 
     def clean(self):
         """
@@ -448,7 +454,10 @@ class Henge(object):
         Show all items in the database.
         """
         for k, v in self.database.items():
-
+            _LOGGER.info(f"{k} {v}")
+
+    def __len__(self):
+        return len(self.database)
 
     def list(self, limit=1000, offset=0):
         """
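The henge.py changes above tighten the public API (typed `insert`/`retrieve` signatures, a `KeyError`-specific `except`, a new `__len__`) without changing the call pattern documented in the new README. A minimal round trip against the 0.2.3 API, as a sketch only (the schema path is a placeholder and the values mirror the README example):

```python
import henge

# Placeholder schema path; any JSON-schema YAML describing a "person" type would do.
schemas = ["path/to/json_schema.yaml"]
h = henge.Henge(database={}, schemas=schemas)

# insert() returns the DRUID (content-derived digest) for the item.
druid = h.insert({"name": "Pat", "age": 38}, item_type="person")

# retrieve() decomposes the DRUID back into the original structure.
print(h.retrieve(druid))  # -> {'age': '38', 'name': 'Pat'}

# New in 0.2.3: __len__ reports the size of the backing key-value store,
# which includes auxiliary keys such as the per-DRUID item-type entries.
print(len(h))
```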
henge-0.2.3/henge/scconf.py ADDED
@@ -0,0 +1,359 @@
+import logging
+import os
+import psycopg2
+
+from collections.abc import Mapping
+from psycopg2 import OperationalError, sql
+from psycopg2.errors import UniqueViolation
+
+_LOGGER = logging.getLogger(__name__)
+
+# Use like:
+# pgdb = RDBDict(...) # Open connection
+# pgdb["key"] = "value" # Insert item
+# pgdb["key"] # Retrieve item
+# pgdb.close() # Close connection
+
+
+# This was originally written in seqcolapi.
+# I am moving it here in 2025, because the whole point was to enable
+# interesting database back-ends to have dict-style key-value pair
+# mechanisms, which was enabling henge to use these various backends
+# to back arbitrary databases.
+# with the move to sqlmodel, I abandoned the henge backend approach,
+# so intermediates are no longer important for seqcol.
+
+# they could become relevant for other henge use cases, so they
+# fit better here now.
+
+
+def getenv(varname):
+    """Simple wrapper to make the Exception more informative for missing env var"""
+    try:
+        return os.environ[varname]
+    except KeyError:
+        raise Exception(f"Environment variable {varname} not set.")
+
+
+import pipestat
+
+
+class PipestatMapping(pipestat.PipestatManager):
+    """A wrapper class to allow using a PipestatManager as a dict-like object."""
+
+    def __getitem__(self, key):
+        # This little hack makes this work with `in`;
+        # e.g.: for x in rdbdict, which is now disabled, instead of infinite.
+        if isinstance(key, int):
+            raise IndexError
+        return self.retrieve(key)
+
+    def __setitem__(self, key, value):
+        return self.insert({key: value})
+
+    def __len__(self):
+        return self.count_records()
+
+    def _next_page(self):
+        self._buf["page_index"] += 1
+        limit = self._buf["page_size"]
+        offset = self._buf["page_index"] * limit
+        self._buf["keys"] = self.get_records(limit, offset)
+        return self._buf["keys"][0]
+
+    def __iter__(self):
+        _LOGGER.debug("Iterating...")
+        self._buf = { # buffered iterator
+            "current_view_index": 0,
+            "len": len(self),
+            "page_size": 100,
+            "page_index": -1,
+            "keys": self._next_page(),
+        }
+        return self
+
+    def __next__(self):
+        if self._buf["current_view_index"] > self._buf["len"]:
+            raise StopIteration
+
+        idx = (
+            self._buf["current_view_index"]
+            - self._buf["page_index"] * self._buf["page_size"]
+        )
+        if idx <= self._buf["page_size"]:
+            self._buf["current_view_index"] += 1
+            return self._buf["keys"][idx - 1]
+        else: # current index is beyond current page, but not beyond total
+            return self._next_page()
+
+
+class RDBDict(Mapping):
+    """
+    A Relational DataBase Dict.
+
+    Simple database connection manager object that allows us to use a
+    PostgresQL database as a simple key-value store to back Python
+    dict-style access to database items.
+    """
+
+    def __init__(
+        self,
+        db_name: str = None,
+        db_user: str = None,
+        db_password: str = None,
+        db_host: str = None,
+        db_port: str = None,
+        db_table: str = None,
+    ):
+        self.connection = None
+        self.db_name = db_name or getenv("POSTGRES_DB")
+        self.db_user = db_user or getenv("POSTGRES_USER")
+        self.db_host = db_host or os.environ.get("POSTGRES_HOST") or "localhost"
+        self.db_port = db_port or os.environ.get("POSTGRES_PORT") or "5432"
+        self.db_table = db_table or os.environ.get("POSTGRES_TABLE") or "seqcol"
+        db_password = db_password or getenv("POSTGRES_PASSWORD")
+
+        try:
+            self.connection = self.create_connection(
+                self.db_name, self.db_user, db_password, self.db_host, self.db_port
+            )
+            if not self.connection:
+                raise Exception("Connection failed")
+        except Exception as e:
+            _LOGGER.info(f"{self}")
+            raise e
+        _LOGGER.info(self.connection)
+        self.connection.autocommit = True
+
+    def __repr__(self):
+        return (
+            "RDBD object\n"
+            + "db_table: {}\n".format(self.db_table)
+            + "db_name: {}\n".format(self.db_name)
+            + "db_user: {}\n".format(self.db_user)
+            + "db_host: {}\n".format(self.db_host)
+            + "db_port: {}\n".format(self.db_port)
+        )
+
+    def init_table(self):
+        # Wrap statements to prevent SQL injection attacks
+        stmt = sql.SQL(
+            """
+            CREATE TABLE IF NOT EXISTS {table}(
+            key TEXT PRIMARY KEY,
+            value TEXT);
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        return self.execute_query(stmt, params=None)
+
+    def insert(self, key, value):
+        stmt = sql.SQL(
+            """
+            INSERT INTO {table}(key, value)
+            VALUES (%(key)s, %(value)s);
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        params = {"key": key, "value": value}
+        return self.execute_query(stmt, params)
+
+    def update(self, key, value):
+        stmt = sql.SQL(
+            """
+            UPDATE {table} SET value=%(value)s WHERE key=%(key)s
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        params = {"key": key, "value": value}
+        return self.execute_query(stmt, params)
+
+    def __getitem__(self, key):
+        # This little hack makes this work with `in`;
+        # e.g.: for x in rdbdict, which is now disabled, instead of infinite.
+        if isinstance(key, int):
+            raise IndexError
+        stmt = sql.SQL(
+            """
+            SELECT value FROM {table} WHERE key=%(key)s
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        params = {"key": key}
+        res = self.execute_read_query(stmt, params)
+        if not res:
+            _LOGGER.info("Not found: {}".format(key))
+        return res
+
+    def __setitem__(self, key, value):
+        try:
+            return self.insert(key, value)
+        except UniqueViolation as e:
+            _LOGGER.info("Updating existing value for {}".format(key))
+            return self.update(key, value)
+
+    def __delitem__(self, key):
+        stmt = sql.SQL(
+            """
+            DELETE FROM {table} WHERE key=%(key)s
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        params = {"key": key}
+        res = self.execute_query(stmt, params)
+        return res
+
+    def create_connection(self, db_name, db_user, db_password, db_host, db_port):
+        connection = None
+        try:
+            connection = psycopg2.connect(
+                database=db_name,
+                user=db_user,
+                password=db_password,
+                host=db_host,
+                port=db_port,
+            )
+            _LOGGER.info("Connection to PostgreSQL DB successful")
+        except OperationalError as e:
+            _LOGGER.info("Error: {e}".format(e=str(e)))
+        return connection
+
+    def execute_read_query(self, query, params=None):
+        cursor = self.connection.cursor()
+        result = None
+        try:
+            cursor.execute(query, params)
+            result = cursor.fetchone()
+            if result:
+                return result[0]
+            else:
+                _LOGGER.debug(f"Query: {query}")
+                _LOGGER.debug(f"Result: {result}")
+                return None
+        except OperationalError as e:
+            _LOGGER.info("Error: {e}".format(e=str(e)))
+            raise
+        except TypeError as e:
+            _LOGGER.info("TypeError: {e}, item: {q}".format(e=str(e), q=query))
+            raise
+
+    def execute_multi_query(self, query, params=None):
+        cursor = self.connection.cursor()
+        result = None
+        try:
+            cursor.execute(query, params)
+            result = cursor.fetchall()
+            return result
+        except OperationalError as e:
+            _LOGGER.info("Error: {e}".format(e=str(e)))
+            raise
+        except TypeError as e:
+            _LOGGER.info("TypeError: {e}, item: {q}".format(e=str(e), q=query))
+            raise
+
+    def execute_query(self, query, params=None):
+        cursor = self.connection.cursor()
+        try:
+            return cursor.execute(query, params)
+            _LOGGER.info("Query executed successfully")
+        except OperationalError as e:
+            _LOGGER.info("Error: {e}".format(e=str(e)))
+
+    def close(self):
+        _LOGGER.info("Closing connection")
+        return self.connection.close()
+
+    def __del__(self):
+        if self.connection:
+            self.close()
+
+    def __len__(self):
+        stmt = sql.SQL(
+            """
+            SELECT COUNT(*) FROM {table}
+            """
+        ).format(table=sql.Identifier(self.db_table))
+        _LOGGER.debug(stmt)
+        res = self.execute_read_query(stmt)
+        return res
+
+    def get_paged_keys(self, limit=None, offset=None):
+        stmt = sql.SQL("SELECT key FROM {table}").format(
+            table=sql.Identifier(self.db_table)
+        )
+        params = {}
+        if limit is not None:
+            stmt = sql.SQL("{} LIMIT %(limit)s").format(stmt)
+            params["limit"] = limit
+        if offset is not None:
+            stmt = sql.SQL("{} OFFSET %(offset)s").format(stmt)
+            params["offset"] = offset
+        res = self.execute_multi_query(stmt, params if params else None)
+        return res
+
+    def _next_page(self):
+        self._buf["page_index"] += 1
+        limit = self._buf["page_size"]
+        offset = self._buf["page_index"] * limit
+        self._buf["keys"] = self.get_paged_keys(limit, offset)
+        return self._buf["keys"][0]
+
+    def __iter__(self):
+        _LOGGER.debug("Iterating...")
+        self._buf = { # buffered iterator
+            "current_view_index": 0,
+            "len": len(self),
+            "page_size": 10,
+            "page_index": 0,
+            "keys": self.get_paged_keys(10, 0),
+        }
+        return self
+
+    def __next__(self):
+        if self._buf["current_view_index"] > self._buf["len"]:
+            raise StopIteration
+
+        idx = (
+            self._buf["current_view_index"]
+            - self._buf["page_index"] * self._buf["page_size"]
+        )
+        if idx <= self._buf["page_size"]:
+            self._buf["current_view_index"] += 1
+            return self._buf["keys"][idx - 1]
+        else: # current index is beyond current page, but not beyond total
+            return self._next_page()
+
+    # Old, non-paged iterator:
+    # def __iter__(self):
+    #     self._current_idx = 0
+    #     return self
+
+    # def __next__(self):
+    #     stmt = sql.SQL(
+    #         """
+    #         SELECT key,value FROM {table} LIMIT 1 OFFSET %(idx)s
+    #         """
+    #     ).format(table=sql.Identifier(self.db_table))
+    #     res = self.execute_read_query(stmt, {"idx": self._current_idx})
+    #     self._current_idx += 1
+    #     if not res:
+    #         _LOGGER.info("Not found: {}".format(self._current_idx))
+    #         raise StopIteration
+    #     return res
+
+
+# We don't need the full SeqColHenge,
+# which also has loading capability, and requires pyfaidx, which requires
+# biopython, which requires numpy, which is huge and can't compile the in
+# default fastapi container.
+# So, I had written the below class which provides retrieve only.
+# HOWEVER, switching from alpine to slim allows install of numpy;
+# This inflates the container size from 262Mb to 350Mb; perhaps that's worth paying.
+# So I can avoid duplicating this and just use the full SeqColHenge from seqcol
+# class SeqColHenge(refget.RefGetClient):
+#     def retrieve(self, druid, reclimit=None, raw=False):
+#         try:
+#             return super(SeqColHenge, self).retrieve(druid, reclimit, raw)
+#         except henge.NotFoundException as e:
+#             _LOGGER.debug(e)
+#             try:
+#                 return self.refget(druid)
+#             except Exception as e:
+#                 _LOGGER.debug(e)
+#                 raise e
+#         return henge.NotFoundException("{} not found in database, or in refget.".format(druid))
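The header comment in scconf.py ("Use like: pgdb = RDBDict(...)") sketches the intended dict-style usage of `RDBDict`. Expanding that comment into a runnable sketch, assuming a reachable PostgreSQL server and that the module is importable as `henge.scconf` (the credentials and table name below are placeholders; omitted arguments fall back to the `POSTGRES_*` environment variables):

```python
from henge.scconf import RDBDict

# Placeholder connection details; any argument left as None falls back to
# the corresponding POSTGRES_* environment variable.
pgdb = RDBDict(
    db_name="henge_demo",
    db_user="postgres",
    db_password="postgres",
    db_host="localhost",
    db_port="5432",
    db_table="druids",
)
pgdb.init_table()      # CREATE TABLE IF NOT EXISTS druids(key TEXT PRIMARY KEY, value TEXT)
pgdb["key"] = "value"  # INSERT; falls back to UPDATE if the key already exists
print(pgdb["key"])     # SELECT value WHERE key = 'key'
print(len(pgdb))       # SELECT COUNT(*) over the table
pgdb.close()           # close the underlying psycopg2 connection
```

Because `RDBDict` exposes the mapping protocol, it could in principle be passed as the `database` argument to `henge.Henge`, which is the motivation stated in the file's own comments.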
henge-0.2.3/henge.egg-info/PKG-INFO ADDED
@@ -0,0 +1,132 @@
+Metadata-Version: 2.4
+Name: henge
+Version: 0.2.3
+Summary: Storage and retrieval of object-derived, decomposable recursive unique identifiers.
+Home-page: https://databio.org
+Author: Nathan Sheffield
+Author-email: nathan@code.databio.org
+License: BSD2
+Classifier: Development Status :: 4 - Beta
+Classifier: License :: OSI Approved :: BSD License
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: System :: Distributed Computing
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE.txt
+Requires-Dist: jsonschema
+Requires-Dist: ubiquerg>=0.5.2
+Requires-Dist: yacman>=0.6.7
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: keywords
+Dynamic: license
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+# Henge
+
+Henge is a Python package for building data storage and retrieval interfaces for arbitrary data. Henge is based on the idea of **decomposable recursive unique identifiers (DRUIDs)**, which are hash-based unique identifiers for data derived from the data itself. For arbitrary data with any structure, Henge can mint unique DRUIDs to identify data, store the data in a key-value database of your choice, and provide lookup functions to retrieve the data in its original structure using its DRUID identifier.
+
+Henge was intended as a building block for [sequence collections](https://github.com/refgenie/seqcol), but is generic enough to use for any data type that needs content-derived identifiers with database lookup capability.
+
+## Install
+
+```
+pip install henge
+```
+
+## Quick Start
+
+Create a Henge object by providing a database and a data schema. The database can be a Python dict or backed by persistent storage. Data schemas are [JSON-schema](https://json-schema.org/) descriptions of data types, and can be hierarchical.
+
+```python
+import henge
+
+schemas = ["path/to/json_schema.yaml"]
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+Insert items into the henge. Upon insert, henge returns the DRUID (digest/checksum/unique identifier) for your object:
+
+```python
+druid = h.insert({"name": "Pat", "age": 38}, item_type="person")
+```
+
+Retrieve the original object using the DRUID:
+
+```python
+h.retrieve(druid)
+# {'age': '38', 'name': 'Pat'}
+```
+
+## Tutorial
+
+For a comprehensive walkthrough covering basic types, arrays, nested objects, and advanced features, see the [tutorial notebook](docs/tutorial.ipynb).
+
+## What are DRUIDs?
+
+DRUIDs are a special type of unique identifiers with two powerful properties:
+
+- **Decomposable**: Identifiers in henge automatically retrieve structured data (tuples, arrays, objects). The structure is defined by a JSON schema, so henge can be used as a back-end for arbitrary data types.
+
+- **Recursive**: Individual elements retrieved by henge can be tagged as recursive, meaning these attributes contain their own DRUIDs. Henge can recurse through these, allowing you to mint unique identifiers for arbitrary nested data structures.
+
+A DRUID is ultimately the result of a digest operation (such as `md5` or `sha256`) on some data. Because DRUIDs are computed deterministically from the item, they represent globally unique identifiers. If you insert the same item repeatedly, it will produce the same DRUID -- this is true across henges as long as they share a data schema.
+
+## Persisting Data
+
+### In-memory (default)
+
+Use a Python `dict` as the database for testing or ephemeral use:
+
+```python
+h = henge.Henge(database={}, schemas=schemas)
+```
+
+### SQLite backend
+
+For persistent storage with SQLite:
+
+```python
+from sqlitedict import SqliteDict
+
+mydict = SqliteDict('./my_db.sqlite', autocommit=True)
+h = henge.Henge(mydict, schemas=schemas)
+```
+
+Requires: `pip install sqlitedict`
+
+### MongoDB backend
+
+For production use with MongoDB:
+
+1. **Start MongoDB with Docker:**
+
+```bash
+docker run --network="host" mongo
+```
+
+For persistent storage, mount a volume to `/data/db`:
+
+```bash
+docker run -it --network="host" -v /path/to/data:/data/db mongo
+```
+
+2. **Connect henge to MongoDB:**
+
+```python
+import henge
+
+h = henge.Henge(henge.connect_mongo(), schemas=schemas)
+```
+
+Requires: `pip install pymongo mongodict`
{henge-0.2.2 → henge-0.2.3}/henge.egg-info/SOURCES.txt
@@ -6,10 +6,10 @@ henge/_version.py
 henge/const.py
 henge/deprecated.py
 henge/henge.py
+henge/scconf.py
 henge.egg-info/PKG-INFO
 henge.egg-info/SOURCES.txt
 henge.egg-info/dependency_links.txt
-henge.egg-info/entry_points.txt
 henge.egg-info/requires.txt
 henge.egg-info/top_level.txt
 tests/test_henge.py
{henge-0.2.2 → henge-0.2.3}/setup.py
@@ -1,6 +1,5 @@
 #! /usr/bin/env python
 
-import os
 from setuptools import setup
 import sys
 
@@ -35,10 +34,10 @@ setup(
     classifiers=[
         "Development Status :: 4 - Beta",
         "License :: OSI Approved :: BSD License",
-        "Programming Language :: Python :: 3.7",
-        "Programming Language :: Python :: 3.8",
-        "Programming Language :: Python :: 3.9",
         "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
+        "Programming Language :: Python :: 3.12",
+        "Programming Language :: Python :: 3.13",
         "Topic :: System :: Distributed Computing",
     ],
     keywords="",
@@ -46,15 +45,12 @@ setup(
     author="Nathan Sheffield",
     author_email="nathan@code.databio.org",
     license="BSD2",
-
-        "console_scripts": ["packagename = packagename.packagename:main"],
-    },
-    package_data={"packagename": [os.path.join("packagename", "*")]},
+    python_requires=">=3.10",
     include_package_data=True,
     test_suite="tests",
     tests_require=(["pytest"]),
     setup_requires=(
         ["pytest-runner"] if {"test", "pytest", "ptr"} & set(sys.argv) else []
     ),
-    **extra
+    **extra,
 )
henge-0.2.2/PKG-INFO DELETED
@@ -1,28 +0,0 @@
-Metadata-Version: 2.1
-Name: henge
-Version: 0.2.2
-Summary: Storage and retrieval of object-derived, decomposable recursive unique identifiers.
-Home-page: https://databio.org
-Author: Nathan Sheffield
-Author-email: nathan@code.databio.org
-License: BSD2
-Classifier: Development Status :: 4 - Beta
-Classifier: License :: OSI Approved :: BSD License
-Classifier: Programming Language :: Python :: 3.7
-Classifier: Programming Language :: Python :: 3.8
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Topic :: System :: Distributed Computing
-Description-Content-Type: text/markdown
-License-File: LICENSE.txt
-Requires-Dist: jsonschema
-Requires-Dist: ubiquerg>=0.5.2
-Requires-Dist: yacman>=0.6.7
-
-[](https://travis-ci.com/databio/henge)
-
-# Henge
-
-Henge is a Python package that builds backends for generic decomposable recursive unique identifiers (or, *DRUIDs*). It is intended to be used as a building block for sequence collections (see the [seqcol package](https://github.com/databio/seqcol)), and also for other data types that need content-derived identifiers.
-
-Documentation at [http://henge.databio.org](http://henge.databio.org).
henge-0.2.2/README.md DELETED
@@ -1,7 +0,0 @@
-[](https://travis-ci.com/databio/henge)
-
-# Henge
-
-Henge is a Python package that builds backends for generic decomposable recursive unique identifiers (or, *DRUIDs*). It is intended to be used as a building block for sequence collections (see the [seqcol package](https://github.com/databio/seqcol)), and also for other data types that need content-derived identifiers.
-
-Documentation at [http://henge.databio.org](http://henge.databio.org).
henge-0.2.2/henge/_version.py DELETED
@@ -1 +0,0 @@
-__version__ = "0.2.2"
henge-0.2.2/henge.egg-info/PKG-INFO DELETED
@@ -1,28 +0,0 @@
-Metadata-Version: 2.1
-Name: henge
-Version: 0.2.2
-Summary: Storage and retrieval of object-derived, decomposable recursive unique identifiers.
-Home-page: https://databio.org
-Author: Nathan Sheffield
-Author-email: nathan@code.databio.org
-License: BSD2
-Classifier: Development Status :: 4 - Beta
-Classifier: License :: OSI Approved :: BSD License
-Classifier: Programming Language :: Python :: 3.7
-Classifier: Programming Language :: Python :: 3.8
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Topic :: System :: Distributed Computing
-Description-Content-Type: text/markdown
-License-File: LICENSE.txt
-Requires-Dist: jsonschema
-Requires-Dist: ubiquerg>=0.5.2
-Requires-Dist: yacman>=0.6.7
-
-[](https://travis-ci.com/databio/henge)
-
-# Henge
-
-Henge is a Python package that builds backends for generic decomposable recursive unique identifiers (or, *DRUIDs*). It is intended to be used as a building block for sequence collections (see the [seqcol package](https://github.com/databio/seqcol)), and also for other data types that need content-derived identifiers.
-
-Documentation at [http://henge.databio.org](http://henge.databio.org).
Files without changes:
- {henge-0.2.2 → henge-0.2.3}/LICENSE.txt
- {henge-0.2.2 → henge-0.2.3}/henge/__init__.py
- {henge-0.2.2 → henge-0.2.3}/henge/const.py
- {henge-0.2.2 → henge-0.2.3}/henge/deprecated.py
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/dependency_links.txt
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/requires.txt
- {henge-0.2.2 → henge-0.2.3}/henge.egg-info/top_level.txt
- {henge-0.2.2 → henge-0.2.3}/setup.cfg
- {henge-0.2.2 → henge-0.2.3}/tests/test_henge.py