tissuebox 2.0.1__tar.gz

@@ -0,0 +1,438 @@
1
+ Metadata-Version: 2.1
2
+ Name: tissuebox
3
+ Version: 2.0.1
4
+ Summary: Tissuebox :: Pythonic payload validator
5
+ Home-page: https://github.com/n3h3m/tissuebox.git
6
+ Author: nehemiah
7
+ Author-email: nehemiah.jacob@gmail.com
8
+ Description-Content-Type: text/x-rst
9
+
10
+ .. figure:: https://raw.githubusercontent.com/nehemiahjacob/tissuebox/master/tissuebox.png
11
+
12
+ Tissuebox
13
+ ---------
14
+
15
+ Tissuebox is a pure Pythonic schema validator that takes advantage of
16
+ Python’s functional programming style to provide a simple yet powerful
17
+ validation framework. Typical uses are validating incoming JSON objects
18
+ on HTTP requests and validating any Python dict in other common
19
+ scenarios.
20
+
21
+ Installation:
22
+ ^^^^^^^^^^^^^
23
+
24
+ Use ``pip`` to install Tissuebox:
25
+
26
+ ``pip install tissuebox``
27
+
28
+ Requirements:
29
+ ^^^^^^^^^^^^^
30
+
31
+ Tissuebox requires Python 3.7. Support for earlier versions of Python 3
32
+ is under consideration.
33
+
34
+ Examples:
35
+ ^^^^^^^^^
36
+
37
+ Assume an incoming JSON object, or a Python dict, that contains hotel
38
+ details. The examples below build on this payload.
39
+
40
+ .. code:: python
41
+
42
+     payload = {
43
+         "name": "Park Shereton",
44
+         "available": True,
45
+         "price_per_night": 270,
46
+         "email": "contact@shereton.com",
47
+         "web": "www.shereton.com",
48
+     }
49
+
50
+ 1. Validating basic data types
51
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
52
+
53
+ You can use ``tissuebox`` to define a schema that checks the payload
54
+ against basic data types, then run it with the ``validate`` function.
55
+
56
+ .. code:: python
57
+
58
+     from tissuebox import validate
59
+     from tissuebox.basic import boolean, integer, string
60
+
61
+     schema = {
62
+         'name': string,
63
+         'available': boolean,
64
+         'price_per_night': integer
65
+     }
66
+
67
+     validate(schema, payload)
68
+
69
+ will return
70
+
71
+ .. code:: python
72
+
73
+     (True, [])
74
+
75
+ 2. Validating common datatypes
76
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
77
+
78
+ A ``tissuebox`` schema is simply a dict where keys are payload keys and
79
+ values are type_functions to which the payload value would be passed. A
80
+ type_function simply accepts a single parameter and returns a tuple with
81
+ two items ``(boolean, msg)``.
82
+
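To make the ``(boolean, msg)`` contract concrete, here is a minimal, self-contained sketch of a type_function. The name ``non_empty_string`` is hypothetical and not part of Tissuebox; it only illustrates the single-argument, tuple-returning shape described above.

```python
def non_empty_string(x):
    # Hypothetical type_function (not from Tissuebox):
    # takes the payload value and returns (passed, message).
    return isinstance(x, str) and len(x) > 0, "must be a non-empty string"


# The first item says whether the value passed, the second describes the rule.
passed, msg = non_empty_string("Park Shereton")
failed, _ = non_empty_string("")
print(passed, failed, msg)
```

Because the function is an ordinary Python object, it can be placed directly as a value in a schema dict.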
83
+ Tissuebox aims to amass a collection of commonly used types in its
84
+ library. For now, common data types like ``email``, ``url``,
85
+ ``rfc_datetime`` and ``geolocation`` are part of ``tissuebox``\ ’s standard
86
+ collection. You can contribute more via GitHub.
87
+
88
+ .. code:: python
89
+
90
+     from tissuebox import validate
91
+     from tissuebox.basic import email, integer, string, url
92
+     schema = {
93
+         'name': string,
94
+         'price_per_night': integer,
95
+         "email": email,
96
+         "web": url
97
+     }
98
+
99
+     validate(schema, payload)
100
+
101
+ will return
102
+
103
+ .. code:: python
104
+
105
+     (True, [])
106
+
107
+ One way ``tissuebox`` stands out from the alternatives is that
108
+ type_functions are stored and passed around as Python variables. This
109
+ helps catch schema definition errors ahead of time, since most IDEs
110
+ display squiggly lines when a variable cannot be resolved. Other
111
+ frameworks like jsonschema and Cerberus pass types as strings, which
112
+ makes schema errors hard for IDEs to detect.
113
+
114
+ 3. Validating nested fields
115
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^
116
+
117
+ Method 1:
118
+ '''''''''
119
+
120
+ Defining a schema in a nested fashion is straightforward and makes
121
+ schemas easy to reuse. Suppose the payload has an ``address`` field.
122
+ We can define a separate schema, here named ``address``, and pass it
123
+ to the main schema as below.
124
+
125
+ .. code:: python
126
+
127
+     from tissuebox import validate
128
+     from tissuebox.basic import email, integer, string, url
129
+     payload = {
130
+         "name": "Park Shereton",
131
+         "available": True,
132
+         "price_per_night": 270,
133
+         "email": "contact@shereton.com",
134
+         "web": "www.shereton.com",
135
+         "address": {
136
+             "street": "128 George St",
137
+             "city": "Sydney",
138
+             "state": "NSW",
139
+             "zip": 2000
140
+         }
141
+     }
142
+
143
+     address = {
144
+         "street": string,
145
+         "city": string,
146
+         "state": string,
147
+         "zip": integer
148
+     }
149
+
150
+     schema = {
151
+         'name': string,
152
+         'price_per_night': integer,
153
+         "email": email,
154
+         "web": url,
155
+         "address": address
156
+     }
157
+
158
+     validate(schema, payload)
159
+
160
+ would return
161
+
162
+ .. code:: python
163
+
164
+     (True, [])
165
+
166
+ Method 2:
167
+ '''''''''
168
+
169
+ The preferred method of defining a nested schema is to use ``.`` (dot)
170
+ as a delimiter to represent nested fields of the payload hierarchy.
171
+ The downside is that this breaks down if ``.`` itself is part of a
172
+ key, which would be an unfortunate scenario. In return, it improves
173
+ readability considerably. See for yourself how elegantly we can
174
+ express the schema once we introduce the ``address`` field to our
175
+ payload.
176
+
177
+ .. code:: python
178
+
179
+     schema = {
180
+         'name': string,
181
+         'price_per_night': integer,
182
+         "email": email,
183
+         "web": url,
184
+         "address.street": string,
185
+         "address.city": string,
186
+         "address.state": string,
187
+         "address.zip": integer
188
+     }
189
+
190
+ The primary reason we suggest the latter method is that we can quickly
191
+ define a nested field of any depth without creating unnecessary schema
192
+ objects in the middle.
193
+
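As a rough illustration of what a ``.`` delimited key means, the sketch below walks a nested dict one segment at a time. The helper ``lookup`` is hypothetical and is not Tissuebox's implementation; it only shows how ``"address.zip"`` maps onto ``payload["address"]["zip"]``.

```python
def lookup(payload, dotted_key):
    # Illustrative helper (not Tissuebox code): follow one level of
    # nesting per "."-separated segment and report whether the path exists.
    current = payload
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return False, None
        current = current[part]
    return True, current


payload = {"address": {"city": "Sydney", "zip": 2000}}
print(lookup(payload, "address.zip"))      # (True, 2000)
print(lookup(payload, "address.country"))  # (False, None)
```

The sketch also makes the stated downside visible: a key that itself contains ``.`` would be split into segments and could not be addressed this way.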
194
+ 4. Validating enums.
195
+ ^^^^^^^^^^^^^^^^^^^^
196
+
197
+ Let us try enforcing that ``address.state`` must be one of the 8
198
+ Australian states and territories. Tissuebox lets you define an enum
199
+ using the ``{}`` (i.e. ``set()``) syntax. Look at the example below.
200
+
201
+ .. code:: python
202
+
203
+     schema = {
204
+         'name': string,
205
+         'price_per_night': integer,
206
+         "email": email,
207
+         "web": url,
208
+         "address.state": {'ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA'},
209
+         "address.zip": integer
210
+     }
211
+
212
+ To get a feel for how Tissuebox responds when we pass something that is
213
+ not an Australian state:
214
+
215
+ .. code:: python
216
+
217
+     payload = {
218
+         "name": "Park Shereton",
219
+         "available": True,
220
+         "price_per_night": 270,
221
+         "email": "contact@shereton.com",
222
+         "web": "www.shereton.com",
223
+         "address": {
224
+             "street": "128 George St",
225
+             "city": "Sydney",
226
+             "state": "TX",
227
+             "zip": 2000
228
+         }
229
+     }
230
+
231
+     validate(schema, payload)
232
+
233
+ would return
234
+
235
+ .. code:: python
236
+
237
+     (False, ['["address"]["state"] is failing to be enum of `{\'SA\', \'QLD\', \'NT\', \'TAS\', \'VIC\', \'WA\', \'ACT\', \'NSW\'}`'])
238
+
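Conceptually, an enum check is a set membership test. The following sketch is an assumption about the idea, not Tissuebox's code: ``enum_check`` is a hypothetical helper that turns a set into a type_function. Python sets are unordered, so the sketch sorts the allowed values to keep the message stable.

```python
def enum_check(allowed):
    # Hypothetical helper (not from Tissuebox): build a type_function
    # that passes only when the value is a member of `allowed`.
    def check(x):
        return x in allowed, "must be one of {}".format(sorted(allowed))

    return check


is_state = enum_check({'ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA'})
print(is_state("NSW"))
print(is_state("TX"))
```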
239
+ 5. Validating arrays
240
+ ^^^^^^^^^^^^^^^^^^^^
241
+
242
+ Let us assume the payload has ``staffs``, which is an array of staff names.
243
+
244
+ .. code:: python
245
+
246
+     payload = {
247
+         "name": "Park Shereton",
248
+         "email": "contact@shereton.com",
249
+         "web": "www.shereton.com",
250
+         "staffs": ["John Doe", "Jane Smith"],
251
+     }
252
+
253
+ Now the schema simply looks as below:
254
+
255
+ .. code:: python
256
+
257
+     schema = {
258
+         'name': string,
259
+         "email": email,
260
+         "web": url,
261
+         "staffs": [string]
262
+     }
263
+
264
+ To declare an element as an array, simply use the ``[]`` syntax. If
265
+ it’s an array of strings, say ``[string]``. If it’s an array of cats,
266
+ say ``[cat]``. The array syntax can be either empty or of length one,
267
+ where the element is a type_function or another nested schema.
268
+
269
+ There are two scenarios where Tissuebox implicitly handles arrays.
270
+
271
+ 1. If the incoming payload is simply a list of dicts, Tissuebox knows
272
+    that the given schema must be validated against every item in the
273
+    array.
274
+ 2. If a ``.`` dot separated nested attribute is declared and any of
275
+    the middle elements is an array, Tissuebox is aware of this and
276
+    iterates the validation automatically.
277
+
278
+ These two cases are implemented to make Tissuebox as intuitive as
279
+ possible.
280
+
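The two implicit array cases can be pictured as a path walk that fans out whenever it meets a list. The sketch below is a hypothetical illustration under that assumption, not Tissuebox's implementation; ``collect`` and the sample payload with dict-valued ``staffs`` are invented for the example.

```python
def collect(node, parts):
    # Hypothetical helper: return every value reachable by following
    # `parts`, visiting each item whenever a list is encountered.
    if isinstance(node, list):
        values = []
        for item in node:
            values.extend(collect(item, parts))
        return values
    if not parts:
        return [node]
    if isinstance(node, dict) and parts[0] in node:
        return collect(node[parts[0]], parts[1:])
    return []


# Case 2: a middle element ("staffs") is an array.
payload = {"staffs": [{"name": "John Doe"}, {"name": "Jane Smith"}]}
print(collect(payload, "staffs.name".split(".")))

# Case 1: the payload itself is a list of dicts.
hotels = [{"name": "Park Shereton"}, {"name": "Harbour View"}]
print(collect(hotels, ["name"]))
```

Each collected value would then be handed to the type_function declared for that key.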
281
+ 6. Writing custom validators
282
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
283
+
284
+ By now you will have observed that a ``tissuebox`` schema is simply a
285
+ collection of ``key:value`` pairs where ``value`` is the data type the
286
+ payload value is verified against. ``tissuebox`` defines these in the
287
+ style of a ``type_function``, which is simply a function that takes the
288
+ payload value and returns a ``(boolean, msg)`` tuple.
289
+
290
+ Let us assume you want to validate the zip code as a valid Australian
291
+ one. Since ``tissuebox`` doesn’t have a built-in type function for that
292
+ purpose, you can come up with your own type function as below. For
293
+ brevity I’ve removed a few fields from the payload & schema.
294
+
295
+ .. code:: python
296
+
297
+     >>> def australian_zip(x):
298
+     ...     # https://www.etl-tools.com/regular-expressions/is-australian-post-code.html
299
+     ...     x = str(x)
300
+     ...     import re
301
+     ...     return bool(re.match(r'^(?:(0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2}))$', x)), "must be a valid Australian zip"
302
+     ...
303
+     >>> hotel = {
304
+     ...     "address": {
305
+     ...         "zip": 200
306
+     ...     }
307
+     ... }
308
+     >>>
309
+     >>> schema = {
310
+     ...     "address.zip": australian_zip
311
+     ... }
312
+     >>>
313
+     >>> validate(schema, hotel)
314
+     (False, ['["address"]["zip"] must be a valid Australian zip'])
315
+
316
+ 7. Validating with type_functions that accept parameters.
317
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
318
+
319
+ In ``tissuebox``, type_functions always accept one argument, which is the
320
+ payload value. There are times when it makes sense for a type_function
321
+ to accept multiple parameters. To achieve that, they are declared as
322
+ Python higher-order functions.
323
+
324
+ Let us try validating that ``price_per_night`` must be a multiple of
325
+ 50. Let us also declare that the Yelp review rating of a hotel must be
326
+ between 1 and 5.
327
+
328
+ .. code:: python
329
+
330
+     >>> from tissuebox import validate
331
+     >>> from tissuebox.basic import between, divisible, string
332
+
333
+     >>> schema = {
334
+     ...     "name": string,
335
+     ...     "rating": between(1, 5),
336
+     ...     "price_per_night": divisible(50)
337
+     ... }
338
+     >>>
339
+     >>> hotel = {
340
+     ...     "name": "Park Shereton",
341
+     ...     "price_per_night": 370,
342
+     ...     "rating": 5.1
343
+     ... }
344
+     >>>
345
+     >>> validate(schema, hotel)
346
+     (False, [
347
+         '["price_per_night"] is failing to be `divisible(50)`',
348
+         '["rating"] is failing to be `between(1, 5)`'
349
+     ])
350
+
351
+ For the curious, here is the implementation of ``divisible`` from the
352
+ Tissuebox library. It is defined as a higher-order function that returns
353
+ another function, which always accepts a single parameter. When writing
354
+ custom validators you are encouraged to use the same pattern.
355
+
356
+ .. code:: python
357
+
358
+     def divisible(n):
359
+         def divisible(x):
360
+             return numeric(x) and numeric(n) and x % n == 0, "multiple of {}".format(n)
361
+
362
+         return divisible
363
+
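Following the same pattern, a custom parameterized validator could look like the sketch below. ``max_length`` is a hypothetical example and does not exist in Tissuebox as far as this document shows; the point is only that the outer function captures the parameter and the inner function is the single-argument type_function.

```python
def max_length(n):
    # Hypothetical validator: the outer function captures the parameter `n`.
    def check(x):
        # The inner function is the actual type_function: one argument,
        # returns (passed, message).
        return isinstance(x, str) and len(x) <= n, "at most {} characters".format(n)

    return check


short_name = max_length(20)
print(short_name("Park Shereton"))
print(max_length(5)("Park Shereton"))
```

In a schema, such a validator would be used by calling the outer function, for example ``'name': max_length(20)``.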
364
+ 8. Combining multiple type_functions for same element
365
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
366
+
367
+ As we have observed, a ``tissuebox`` schema is a dict in ``key:value``
368
+ format. In Python, keys in dicts are unique. It’s a terrible idea to
369
+ redeclare the same key, since the earlier value will be overwritten.
370
+
371
+ Assume that you are attempting to do something like this:
372
+
373
+ .. code:: python
374
+
375
+     from tissuebox.basic import divisible, integer, positive, string
376
+     schema = {
377
+         'name': string,
378
+         'price_per_night': integer,
379
+         'price_per_night': positive,
380
+         'price_per_night': divisible(50),
381
+         "address.zip": integer
382
+     }
383
+
384
+ Here ``price_per_night`` will be overridden by the last declaration,
385
+ which must be avoided. This can be solved with another special syntax
386
+ that is still Pythonic.
387
+
388
+ Simply use ``()`` to chain type_functions.
389
+
390
+ .. code:: python
391
+
392
+     from tissuebox.basic import divisible, integer, positive, string
393
+
394
+     schema = {
395
+         'name': string,
396
+         'price_per_night': (integer, positive, divisible(50)),
397
+         "address.zip": integer
398
+     }
401
+
402
+ Now Tissuebox will iterate all these conditions against
403
+ ``price_per_night``.
404
+
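One way to picture the ``()`` chaining is a loop that applies each type_function to the same value and gathers the messages of those that fail. This is a sketch under that assumption, not Tissuebox's internals: ``run_chain`` is hypothetical, and ``integer``, ``positive`` and ``divisible`` below are simplified stand-ins rather than the library's own functions.

```python
def run_chain(type_functions, value):
    # Hypothetical runner: apply each type_function in order and
    # collect the messages of the ones that fail.
    errors = []
    for fn in type_functions:
        passed, msg = fn(value)
        if not passed:
            errors.append(msg)
    return not errors, errors


# Simplified stand-ins for illustration only.
def integer(x):
    return isinstance(x, int) and not isinstance(x, bool), "must be an integer"


def positive(x):
    return isinstance(x, (int, float)) and x > 0, "must be positive"


def divisible(n):
    def check(x):
        return isinstance(x, int) and x % n == 0, "must be a multiple of {}".format(n)

    return check


rules = (integer, positive, divisible(50))
print(run_chain(rules, 250))
print(run_chain(rules, 370))
```

Collecting every failure instead of stopping at the first one matches the stated goal of reporting all error messages upfront.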
405
+ 9. Declaring a field as ``required``
406
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
407
+
408
+ While Tissuebox validates values with type_functions, it does so only
409
+ for values that are found in the payload. Schema keys missing from the
410
+ payload are silently ignored.
411
+
412
+ When a specific value is expected in the payload, declare it with the
413
+ ``required`` function. It’s common to combine ``required`` with other
414
+ type_functions using the ``()`` syntax described above.
415
+
416
+ .. code:: python
417
+
418
+     from tissuebox.basic import integer, required, string
419
+     schema = {
420
+         'name': (required, string),
421
+         "address.city": (required, string),
422
+         "address.zip": integer
423
+     }
424
+
425
+ Tissuebox Advantages:
426
+ ^^^^^^^^^^^^^^^^^^^^^
427
+
428
+ - Tissuebox has several advantages over current alternatives like
429
+   jsonschema, Cerberus etc.
430
+ - Truly Pythonic, relying heavily on short & static methods. The
431
+   schema definition itself takes full advantage of Python’s built-in
432
+   syntax like ``{}`` for enums, ``[]`` for arrays, ``()`` for
433
+   chaining multiple rules etc.
434
+ - Highly readable with concise schema definition.
435
+ - Highly extensible, with the ability to insert your own custom methods
436
+   without complicated class inheritance.
437
+ - Ability to provide all the error messages upfront upon validation.
438
+