tissuebox 2.0.1__tar.gz
- tissuebox-2.0.1/PKG-INFO +438 -0
- tissuebox-2.0.1/README.md +546 -0
- tissuebox-2.0.1/setup.cfg +4 -0
- tissuebox-2.0.1/setup.py +19 -0
- tissuebox-2.0.1/tissuebox/__init__.py +312 -0
- tissuebox-2.0.1/tissuebox/basic.py +548 -0
- tissuebox-2.0.1/tissuebox/helpers.py +33 -0
- tissuebox-2.0.1/tissuebox.egg-info/PKG-INFO +438 -0
- tissuebox-2.0.1/tissuebox.egg-info/SOURCES.txt +9 -0
- tissuebox-2.0.1/tissuebox.egg-info/dependency_links.txt +1 -0
- tissuebox-2.0.1/tissuebox.egg-info/top_level.txt +1 -0
tissuebox-2.0.1/PKG-INFO
ADDED
@@ -0,0 +1,438 @@
Metadata-Version: 2.1
Name: tissuebox
Version: 2.0.1
Summary: Tissuebox :: Pythonic payload validator
Home-page: https://github.com/n3h3m/tissuebox.git
Author: nehemiah
Author-email: nehemiah.jacob@gmail.com
Description-Content-Type: text/x-rst

.. figure:: https://raw.githubusercontent.com/nehemiahjacob/tissuebox/master/tissuebox.png

Tissuebox
---------

Tissuebox is a pure Pythonic schema validator which takes advantage of
Python’s functional-style programming to provide a simple yet powerful
validation framework. The standard usage would be validating incoming
JSON objects on HTTP requests, or validating any Python dict in other
common scenarios.

Installation:
^^^^^^^^^^^^^

Use ``pip`` to install Tissuebox:

``pip install tissuebox``

Requirements:
^^^^^^^^^^^^^

Tissuebox requires Python 3.7; however, we are considering adding support
for earlier versions of Python 3.

Examples:
^^^^^^^^^

Assume an incoming JSON object or Python dict that contains hotel
details; we will build upon this example.

.. code:: python

   payload = {
       "name": "Park Shereton",
       "available": True,
       "price_per_night": 270,
       "email": "contact@shereton.com",
       "web": "www.shereton.com",
   }

1. Validating basic data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can use ``tissuebox`` to define a schema that validates the payload
against basic data types, and check it with the ``validate`` function.

.. code:: python

   from tissuebox import validate
   from tissuebox.basic import boolean, integer, string

   schema = {
       'name': string,
       'available': boolean,
       'price_per_night': integer
   }

   validate(schema, payload)

will return

.. code:: python

   (True, [])

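A side note on basic types: in Python, ``bool`` is a subclass of ``int``, so a naive integer check also accepts ``True``. The sketch below shows one way a type_function could guard against that. ``is_integer_sketch`` is a hypothetical name used only for illustration; it is not Tissuebox's own ``integer``, whose behaviour is not shown here.

```python
def is_integer_sketch(x):
    """Hypothetical type_function: takes one value, returns (ok, message)."""
    # bool is a subclass of int, so it has to be excluded explicitly.
    ok = isinstance(x, int) and not isinstance(x, bool)
    return ok, "must be an integer"


print(is_integer_sketch(270))   # (True, 'must be an integer')
print(is_integer_sketch(True))  # (False, 'must be an integer')
```

Whether booleans should count as integers is a design choice; the point is only that the check has to make that choice explicitly.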
2. Validating common datatypes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``tissuebox`` schema is simply a dict where keys are payload keys and
values are type_functions to which the payload value is passed. A
type_function accepts a single parameter and returns a tuple with
two items ``(boolean, msg)``.

Tissuebox aims to amass a collection of commonly used types in its
library. For now, common data types like ``email``, ``url``,
``rfc_datetime``, ``geolocation`` are part of ``tissuebox``\ ’s standard
collection. You can contribute more via GitHub.

.. code:: python

   from tissuebox import validate
   from tissuebox.basic import email, integer, string, url
   schema = {
       'name': string,
       'price_per_night': integer,
       "email": email,
       "web": url
   }

   validate(schema, payload)

will return

.. code:: python

   (True, [])

One of the ways ``tissuebox`` stands out from other alternatives is that the
type_functions are stored and passed around as Python variables, which is
helpful in identifying schema definition errors ahead of time, as
most IDEs will display squiggly lines if the variables aren’t resolved,
while other frameworks like JsonSchema and Cerberus pass types within
strings, which makes errors in the schema hard for IDEs to detect.

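To make the ``(boolean, msg)`` contract concrete, here is a minimal sketch of two type_functions and a loop that applies a flat schema to a payload. All names (``is_string``, ``is_positive``, ``check_flat``) and the error message format are assumptions made for illustration; this is not Tissuebox's implementation.

```python
def is_string(x):
    # Hypothetical type_function: one argument in, (ok, message) out.
    return isinstance(x, str), "must be a string"


def is_positive(x):
    # Hypothetical type_function for positive numbers (bools excluded).
    ok = isinstance(x, (int, float)) and not isinstance(x, bool) and x > 0
    return ok, "must be positive"


def check_flat(schema, payload):
    """Apply each type_function to the matching payload value.

    Keys that are absent from the payload are skipped.
    Returns (ok, errors), mirroring the tuple shape shown above.
    """
    errors = []
    for key, type_function in schema.items():
        if key not in payload:
            continue
        ok, msg = type_function(payload[key])
        if not ok:
            errors.append('["{}"] {}'.format(key, msg))
    return not errors, errors


schema = {"name": is_string, "price_per_night": is_positive}
print(check_flat(schema, {"name": "Park Shereton", "price_per_night": 270}))
print(check_flat(schema, {"name": 42, "price_per_night": -1}))
```

Because the schema values are ordinary function objects, a misspelled name fails at import time instead of surfacing later as an unknown string.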
3. Validating nested fields
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Method 1:
'''''''''

Defining a schema in a nested fashion is very straightforward, which
makes schemas easy to reuse. Suppose the payload has an
``address`` field. We can define a separate schema named ``address``
and pass it to the main schema as below.

.. code:: python

   from tissuebox import validate
   from tissuebox.basic import email, integer, string, url
   payload = {
       "name": "Park Shereton",
       "available": True,
       "price_per_night": 270,
       "email": "contact@shereton.com",
       "web": "www.shereton.com",
       "address": {
           "street": "128 George St",
           "city": "Sydney",
           "state": "NSW",
           "zip": 2000
       }
   }

   address = {
       "street": string,
       "city": string,
       "state": string,
       "zip": integer
   }

   schema = {
       'name': string,
       'price_per_night': integer,
       "email": email,
       "web": url,
       "address": address
   }

   validate(schema, payload)

would return

.. code:: python

   (True, [])

Method 2:
'''''''''

The preferred method of defining a nested schema is to use the ``.`` dot as
a delimiter to represent nested fields of the payload hierarchy.
The downside of this approach appears when the ``.`` dot itself
is part of a key, which would be an unfortunate scenario. But it can
improve readability tremendously. See for yourself how
elegantly we can express the schema once we introduce the ``address``
field to our payload.

.. code:: python

   schema = {
       'name': string,
       'price_per_night': integer,
       "email": email,
       "web": url,
       "address.street": string,
       "address.city": string,
       "address.state": string,
       "address.zip": integer
   }

The primary reason we suggest the latter method is that we can quickly
define a nested field of any depth without creating unnecessary schema
objects in the middle.

4. Validating enums.
^^^^^^^^^^^^^^^^^^^^

Let us try enforcing that the field ``address.state`` must be one of the 8
Australian states and territories. Tissuebox lets you define an enum using
the ``{}`` i.e. ``set()`` syntax. Look at the example below.

.. code:: python

   schema = {
       'name': string,
       'price_per_night': integer,
       "email": email,
       "web": url,
       "address.state": {'ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA'},
       "address.zip": integer
   }

To get a feel for how Tissuebox responds when we pass something that is
not an Australian state:

.. code:: python

   payload = {
       "name": "Park Shereton",
       "available": True,
       "price_per_night": 270,
       "email": "contact@shereton.com",
       "web": "www.shereton.com",
       "address": {
           "street": "128 George St",
           "city": "Sydney",
           "state": "TX",
           "zip": 2000
       }
   }

   validate(schema, payload)

would return

.. code:: python

   (False, ['["address"]["state"] is failing to be enum of `{\'SA\', \'QLD\', \'NT\', \'TAS\', \'VIC\', \'WA\', \'ACT\', \'NSW\'}`'])

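An enum check boils down to set membership. The sketch below is an illustration only: ``enum_of`` is a hypothetical helper, not a Tissuebox function, and its message format is an assumption. It sorts the allowed values so the message is stable, since the display order of a set of strings can differ between runs.

```python
def enum_of(allowed):
    """Return a hypothetical type_function accepting only members of `allowed`."""
    allowed = frozenset(allowed)

    def check(x):
        # sorted() keeps the message deterministic regardless of set ordering.
        return x in allowed, "must be one of {}".format(sorted(allowed))

    return check


state = enum_of({'ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA'})
print(state("NSW"))  # (True, "must be one of ['ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA']")
print(state("TX"))   # (False, "must be one of ['ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA']")
```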
5. Validating arrays
^^^^^^^^^^^^^^^^^^^^

Let us assume the payload has ``staffs``, which is an array of staff names.

.. code:: python

   payload = {
       "name": "Park Shereton",
       "email": "contact@shereton.com",
       "web": "www.shereton.com",
       "staffs": ["John Doe", "Jane Smith"],
   }

Now the schema simply looks as below

.. code:: python

   schema = {
       'name': string,
       "email": email,
       "web": url,
       "staffs": [string]
   }

To declare an element as an array, simply use the ``[]`` syntax. If
it’s an array of strings, simply say ``[string]``. If it’s an array of cats,
simply say ``[cat]``. The array syntax can be either empty or of length one,
where the single element is a type_function or another nested schema.

There are two scenarios where Tissuebox implicitly handles the array.

1. If the incoming payload is simply a list of dicts, Tissuebox knows
   that the given schema must be validated against all the items in the
   array.
2. When declaring a ``.`` dot-separated nested attribute where any of the
   middle elements is an array, Tissuebox is aware of that and will
   iterate the validation automatically.

These two cases are implemented to make Tissuebox as intuitive as
possible.

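As a rough picture of what a one-element list rule such as ``[string]`` implies, the sketch below applies a single type_function to every item of a list. ``check_array`` and ``is_string`` are hypothetical names, the error format is invented for the example, and treating an empty rule ``[]`` as "any array" is an assumption rather than documented behaviour.

```python
def is_string(x):
    # Hypothetical type_function: (ok, message).
    return isinstance(x, str), "must be a string"


def check_array(key, rule, value):
    """Validate `value` against a list rule such as [is_string]."""
    if not isinstance(value, list):
        return ['["{}"] must be an array'.format(key)]
    if not rule:  # assumption: [] only requires that the value is a list
        return []
    type_function = rule[0]
    errors = []
    for index, item in enumerate(value):
        ok, msg = type_function(item)
        if not ok:
            errors.append('["{}"][{}] {}'.format(key, index, msg))
    return errors


print(check_array("staffs", [is_string], ["John Doe", "Jane Smith"]))  # []
print(check_array("staffs", [is_string], ["John Doe", 7]))
# ['["staffs"][1] must be a string']
```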
6. Writing custom validators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By now you will have observed that a ``tissuebox`` schema is simply a
collection of ``key:value`` pairs where ``value`` contains the data type
to verify against. ``tissuebox`` defines them in the style of
``type_function`` which is simply a boolean function that takes one or
more parameters.

Let us assume you want to validate the zip code as a valid Australian
one. Since ``tissuebox`` doesn’t have a built-in type function for that
purpose, you can write your own type function as below. For
brevity, I’ve removed a few fields from the payload and schema.

.. code:: python

   >>> def australian_zip(x):
   ...     # https://www.etl-tools.com/regular-expressions/is-australian-post-code.html
   ...     x = str(x)
   ...     import re
   ...     return re.match(r'^(?:(0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2}))$', x), "must be a valid Australian zip"
   ...
   >>> hotel = {
   ...     "address": {
   ...         "zip": 200
   ...     }
   ... }
   >>>
   >>> schema = {
   ...     "address.zip": australian_zip
   ... }
   >>>
   >>> validate(schema, hotel)
   (False, ['["address"]["zip"] must be a valid Australian zip'])

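One pitfall worth keeping in mind when writing regex-based type functions: in a pattern of the form ``^A|B$``, the ``^`` binds only to the first branch and the ``$`` only to the last. The sketch below contrasts an ungrouped and a grouped pattern on a shortened, two-branch example. ``LOOSE``, ``STRICT`` and ``four_digit_code`` are hypothetical names, and the ranges are a sample, not a complete postcode rule.

```python
import re

# Ungrouped alternation: "^" applies to the first branch, "$" to the last only.
LOOSE = re.compile(r'^0[289][0-9]{2}|2[0-8][0-9]{2}$')
# Grouping the branches makes both anchors apply to every alternative.
STRICT = re.compile(r'^(?:0[289][0-9]{2}|2[0-8][0-9]{2})$')


def four_digit_code(x):
    """Hypothetical type_function built on the grouped pattern."""
    return bool(STRICT.match(str(x))), "must be a 4-digit code in the sample ranges"


print(bool(LOOSE.match("08001")))   # True: the first branch matches the "0800" prefix
print(bool(STRICT.match("08001")))  # False
print(four_digit_code(2000)[0])     # True
```

Wrapping the result in ``bool`` also keeps the first tuple item a plain boolean instead of a match object.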
7. Validating with type_functions that accept parameters.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In ``tissuebox``, type_functions always accept one argument, which is the
payload value. There are times when it makes sense for a type_function to
accept multiple parameters. To achieve that, they are declared as
Python’s higher order functions.

Let us try validating that ``price_per_night`` must be a multiple of 50.
Also, let us declare that the Yelp review rating of a hotel must be
between 1 and 5.

.. code:: python

   >>> from tissuebox import validate
   >>> from tissuebox.basic import between, divisible, string

   >>> schema = {
   ...     "name": string,
   ...     "rating": between(1, 5),
   ...     "price_per_night": divisible(50)
   ... }
   >>>
   >>> hotel = {
   ...     "name": "Park Shereton",
   ...     "price_per_night": 370,
   ...     "rating": 5.1
   ... }
   >>>
   >>> validate(schema, hotel)
   (False, [
       '["price_per_night"] is failing to be `divisible(50)`',
       '["rating"] is failing to be `between(1, 5)`'
   ])

For the curious, here is the implementation of ``divisible`` from the
Tissuebox library. It is defined as a higher order function which returns
another function that always accepts a single parameter. While writing
custom validators you are encouraged to use the same pattern.

.. code:: python

   def divisible(n):
       def divisible(x):
           return numeric(x) and numeric(n) and x % n == 0, "multiple of {}".format(n)

       return divisible

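The same closure pattern can be applied to a range check. The following is a sketch only: ``between_sketch`` is a hypothetical stand-in written for this example and is not the source of Tissuebox's own ``between``.

```python
def between_sketch(low, high):
    """Hypothetical parameterized type_function (not Tissuebox's own `between`)."""

    def check(x):
        # The inner function takes only the payload value; low and high are
        # captured from the enclosing call.
        is_number = isinstance(x, (int, float)) and not isinstance(x, bool)
        return is_number and low <= x <= high, "between({}, {})".format(low, high)

    return check


rating = between_sketch(1, 5)
print(rating(4.5))  # (True, 'between(1, 5)')
print(rating(5.1))  # (False, 'between(1, 5)')
```

Calling ``between_sketch(1, 5)`` once at schema-definition time yields a one-argument function, which is what lets parameterized rules sit in a schema next to plain ones.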
8. Combining multiple type_functions for same element
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As we have observed, a ``tissuebox`` schema is a dict with ``key:value``
format. In Python, keys in dicts are unique. It’s a terrible idea to
redeclare the same key since the data will be overridden.

Assume that you are attempting to do something like this

.. code:: python

   from tissuebox.basic import divisible, integer, positive, string
   schema = {
       'name': string,
       'price_per_night': integer,
       'price_per_night': positive,
       'price_per_night': divisible(50),
       "address.zip": integer
   }

Here ``price_per_night`` will be overridden by the latest declaration,
which must be avoided. This can be solved with another special syntax
which is still Pythonic.

Simply use ``()`` to chain type_functions.

.. code:: python

   from tissuebox.basic import divisible, integer, positive, string

   schema = {
       'name': string,
       'price_per_night': (integer, positive, divisible(50)),
       "address.zip": integer
   }

Now Tissuebox will check all these conditions against
``price_per_night``.

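Both points can be checked in a few lines of plain Python. The first statement shows that a dict literal keeps only the last value for a repeated key; the rest sketches how a tuple of type_functions might be applied to one value. ``is_integer``, ``is_positive`` and ``check_all`` are hypothetical names used for illustration, not Tissuebox internals.

```python
# A dict literal silently keeps only the last value for a repeated key.
overridden = {"price": "integer", "price": "positive"}
print(overridden)  # {'price': 'positive'}


def is_integer(x):
    return isinstance(x, int) and not isinstance(x, bool), "must be an integer"


def is_positive(x):
    ok = isinstance(x, (int, float)) and not isinstance(x, bool) and x > 0
    return ok, "must be positive"


def check_all(type_functions, value):
    """Run every type_function in the tuple and collect each failure message."""
    return [msg for ok, msg in (tf(value) for tf in type_functions) if not ok]


print(check_all((is_integer, is_positive), 270))   # []
print(check_all((is_integer, is_positive), -2.5))
# ['must be an integer', 'must be positive']
```

Collecting every message rather than stopping at the first failure matches the idea of reporting all errors upfront.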
9. Declaring a field as ``required``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While Tissuebox validates the values with type_functions, it does
so only for the values that are found in the payload. Otherwise they are
simply ignored silently.

In a situation where a specific value is expected in the payload, declare
it with the ``required`` function. It’s a common scenario to combine
it with other type_functions under the ``()`` operator as described above.

.. code:: python

   from tissuebox.basic import integer, required, string
   schema = {
       'name': (required, string),
       "address.city": (required, string),
       "address.zip": integer
   }

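The difference between an optional and a required field can be sketched as below. This is a simplification under stated assumptions: ``check_field`` is a hypothetical helper that takes a ``required`` keyword argument, whereas Tissuebox expresses the same intent by placing ``required`` inside the ``()`` tuple; the error text is also invented for the example.

```python
_MISSING = object()  # sentinel for "key not present in the payload"


def is_string(x):
    return isinstance(x, str), "must be a string"


def check_field(payload, key, type_function, required=False):
    """Hypothetical helper: returns a list of error messages for one field."""
    value = payload.get(key, _MISSING)
    if value is _MISSING:
        # Absent keys are skipped unless the field is marked as required.
        return ['["{}"] is required'.format(key)] if required else []
    ok, msg = type_function(value)
    return [] if ok else ['["{}"] {}'.format(key, msg)]


print(check_field({}, "name", is_string))                 # []
print(check_field({}, "name", is_string, required=True))  # ['["name"] is required']
print(check_field({"name": 7}, "name", is_string, required=True))
# ['["name"] must be a string']
```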
Tissuebox Advantages:
^^^^^^^^^^^^^^^^^^^^^

- Tissuebox has lots of advantages over the current alternatives like
  jsonschema, Cerberus etc.
- Truly Pythonic and heavily relies on short & static methods. The
  schema definition itself takes full advantage of Python’s built-in
  syntax like ``{}`` for enums, ``[]`` for arrays, ``()`` for
  parameterized functions and for chaining multiple rules, etc.
- Highly readable with concise schema definition.
- Highly extensible with ability to insert your own custom methods
  without complicated class inheritance.
- Ability to provide all the error messages upfront upon validation.