pipescript 0.0.14__tar.gz
- pipescript-0.0.14/MANIFEST.in +4 -0
- pipescript-0.0.14/PKG-INFO +656 -0
- pipescript-0.0.14/README.md +611 -0
- pipescript-0.0.14/docs/HISTORY.rst +33 -0
- pipescript-0.0.14/docs/LICENSE.txt +11 -0
- pipescript-0.0.14/pipescript/__init__.py +97 -0
- pipescript-0.0.14/pipescript/__main__.py +26 -0
- pipescript-0.0.14/pipescript/_version.py +658 -0
- pipescript-0.0.14/pipescript/analysis/__init__.py +0 -0
- pipescript-0.0.14/pipescript/analysis/dynamic_macros.py +162 -0
- pipescript-0.0.14/pipescript/analysis/extract_names.py +64 -0
- pipescript-0.0.14/pipescript/analysis/placeholders.py +231 -0
- pipescript-0.0.14/pipescript/api/__init__.py +7 -0
- pipescript-0.0.14/pipescript/api/static_macros.py +182 -0
- pipescript-0.0.14/pipescript/api/utils.py +95 -0
- pipescript-0.0.14/pipescript/constants.py +3 -0
- pipescript-0.0.14/pipescript/extension.py +163 -0
- pipescript-0.0.14/pipescript/patches/__init__.py +0 -0
- pipescript-0.0.14/pipescript/patches/completion_patch.py +33 -0
- pipescript-0.0.14/pipescript/patches/traceback_patch.py +39 -0
- pipescript-0.0.14/pipescript/tracers/__init__.py +7 -0
- pipescript-0.0.14/pipescript/tracers/macro_tracer.py +553 -0
- pipescript-0.0.14/pipescript/tracers/optional_chaining_tracer.py +330 -0
- pipescript-0.0.14/pipescript/tracers/pipeline_tracer.py +1062 -0
- pipescript-0.0.14/pipescript/utils.py +41 -0
- pipescript-0.0.14/pipescript/version.py +23 -0
- pipescript-0.0.14/pipescript.egg-info/PKG-INFO +656 -0
- pipescript-0.0.14/pipescript.egg-info/SOURCES.txt +36 -0
- pipescript-0.0.14/pipescript.egg-info/dependency_links.txt +1 -0
- pipescript-0.0.14/pipescript.egg-info/entry_points.txt +2 -0
- pipescript-0.0.14/pipescript.egg-info/not-zip-safe +1 -0
- pipescript-0.0.14/pipescript.egg-info/requires.txt +24 -0
- pipescript-0.0.14/pipescript.egg-info/top_level.txt +1 -0
- pipescript-0.0.14/pyproject.toml +43 -0
- pipescript-0.0.14/setup.cfg +76 -0
- pipescript-0.0.14/setup.py +7 -0
- pipescript-0.0.14/versioneer.py +2205 -0
@@ -0,0 +1,656 @@
Metadata-Version: 2.4
Name: pipescript
Version: 0.0.14
Summary: Powerful pipeline syntax for IPython and Jupyter
Home-page: https://github.com/smacke/pipescript
Author: Stephen Macke
Author-email: stephen.macke@gmail.com
License: BSD-3-Clause
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8
License-File: docs/LICENSE.txt
Requires-Dist: pyccolo>=0.0.84
Requires-Dist: typing-extensions
Provides-Extra: test
Requires-Dist: black; extra == "test"
Requires-Dist: hypothesis; extra == "test"
Requires-Dist: isort; extra == "test"
Requires-Dist: mypy; extra == "test"
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: ruff; extra == "test"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: pycln; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: versioneer; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: hypothesis; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

pipescript
==========

Pipescript is an IPython extension that brings a pipe operator `|>` and
powerful placeholder and macro expansion syntax extensions to IPython and Jupyter.

For a quick example, consider the following code snippet, which is not super easy
to read (which function call does the keyword parameter `initial=1.0` go with?):

```python
result = max(
    np.max(np.abs(array[np.isfinite(array)]), initial=1.0)
    for array in arrays
)
```

This mess of nested function calls can be written in pipescript as follows:

```python
result = arrays |> map[$
    |> $array[np.isfinite($array)]
    |> np.abs
    |> np.max($, initial=1.0)
] |> max
```

If you're familiar with the [magrittr](https://magrittr.tidyverse.org/) package
for R, then you'll be right at home with pipescript.


## Getting Started

Run the following in IPython or Jupyter to install pipescript and load
the extension:

```python
%pip install pipescript
%load_ext pipescript
```

The `%load_ext pipescript` invocation is what enables the new pipe syntax
in your current session.

## Features by Example

Let's look at a few examples to give a flavor of what you can do with pipescript:

```python
# Render a sorted version of a tuple
>>> tup = (3, 4, 1, 5, 6)
>>> tup |> sorted |> tuple
(1, 3, 4, 5, 6)
```

The above example showcases the `|>`, or "pipe", operator, which is a much-loved
feature of functional programming that has become increasingly mainstream. Its
primary benefit is that the flow of execution follows the natural left-to-right
reading / writing order of the code. Whether or not such pipeline syntax is
available, it's not uncommon for programmers to execute pipelines like the above
multiple times to verify the computation at each step, particularly in
interactive programming environments like Jupyter. With `|>`, this type of
incremental verification becomes a breeze: first execute `tup |> sorted`, then
append ` |> tuple` to execute the full chain `tup |> sorted |> tuple`, each time
using the last-expression rendering capabilities of the notebook or REPL to
inspect and verify the result.

### Placeholders

The power of the `|>` operator is amplified via placeholder syntax for implicit
function construction: for pipescript, we use `$` to stand in for function arguments
and induce function creation:

```python
# Sort a list in reverse order
>>> lst = [3, 4, 1, 5, 6]
>>> lst |> sorted($, reverse=True)
[6, 5, 4, 3, 1]
```

`$` is analogous to magrittr's `.` placeholder. It can also be used outside
of pipeline contexts:

```python
# Sort a list in reverse order and print the result
lst = [3, 4, 1, 5, 6]
reverse_sorter = sorted($, reverse=True)

# The following are equivalent:
print(reverse_sorter(lst))
lst |> reverse_sorter |> print
```

Each time `$` appears, it represents a new argument, so `sorted($, reverse=$)`
represents a function with two arguments:

```python
import random

# Sort a list in either ascending or descending order with probability 0.5:
lst = [3, 4, 1, 5, 6]
sorter = sorted($, reverse=$)
reverse = random.random() < 0.5

# The following are equivalent:
print(sorter(lst, reverse))
lst |> sorter($, reverse) |> print
```

Placeholders can appear anywhere -- not just as arguments to function calls:

```python
# Sort a list and find the position of element 3:
>>> lst = [3, 4, 1, 5, 6]
>>> lst |> sorted |> $.index(3)
1
```
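
For readers more at home in plain Python, the placeholder expressions in this section behave roughly like the lambdas below. This is only a sketch of the semantics described above, not pipescript's actual expansion. The name `index_of_3` is made up for illustration.

```python
# Rough plain-Python readings of the placeholder examples (assumed semantics).
lst = [3, 4, 1, 5, 6]

# sorted($, reverse=True): one placeholder induces a one-argument function.
reverse_sorter = lambda seq: sorted(seq, reverse=True)

# sorted($, reverse=$): two placeholders induce a two-argument function.
sorter = lambda seq, reverse: sorted(seq, reverse=reverse)

# $.index(3): the placeholder can also be the receiver of a method call.
index_of_3 = lambda seq: seq.index(3)

print(reverse_sorter(lst))      # [6, 5, 4, 3, 1]
print(sorter(lst, False))       # [1, 3, 4, 5, 6]
print(index_of_3(sorted(lst)))  # 1
```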

### Named Placeholders

There are situations that would benefit from referencing the same placeholder multiple times, for which
pipescript permits *named placeholders* by prefixing `$` to an identifier:

```python
# Pair even entries from a range with their adjacent odd entry
>>> range(6) |> list |> zip($v[::2], $v[1::2]) |> list
[(0, 1), (2, 3), (4, 5)]
```

In the above example, we could have used any name in place of `v`; the important
thing is that the same name is used both times -- otherwise pipescript would have
induced a function with two arguments instead of one.
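
To make the one-argument versus two-argument distinction concrete, here is a plain-Python sketch of what the two spellings would correspond to. The lambda names are hypothetical, and the mapping is an assumption based on the description above.

```python
# zip($v[::2], $v[1::2]): one named placeholder used twice -> one argument.
pair_up = lambda v: zip(v[::2], v[1::2])

# zip($[::2], $[1::2]): two anonymous placeholders -> two arguments.
pair_up_two_args = lambda a, b: zip(a[::2], b[1::2])

values = list(range(6))
print(list(pair_up(values)))                   # [(0, 1), (2, 3), (4, 5)]
print(list(pair_up_two_args(values, values)))  # same pairs, but needs two inputs
```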

### Undetermined Pipelines

Similar to magrittr's behavior, if any number of placeholders appear in the first
step of a pipescript pipeline, this *undetermined pipeline* will represent a function:

```python
>>> second_largest_value = $ |> sorted($, reverse=True) |> $[1]
>>> [3, 8, 6, 5, 1] |> second_largest_value
6
```

### Macros and Partial Function Syntax

In some cases, it may be desirable to curry a function with parameters at its start,
akin to the typical usage of `functools.partial`. For example:

```python
>>> add_reducer = reduce(lambda x, y: x + y, $, $)
>>> add_reducer([1, 2, 3], 0)
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]], [])
[1, 2, 3, 4, 5, 6]
```

To avoid writing out a `$` placeholder for each and every tail argument, you can
prefix the call itself with a `$` and omit subsequent arguments, just like in coconut:

```python
>>> add_reducer = reduce$(lambda x, y: x + y)
>>> add_reducer([1, 2, 3], 0)
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]], [])
[1, 2, 3, 4, 5, 6]
```

Or even more simply, since the induced partial function retains all the same
argument defaults as the original `reduce`, we can omit the base case:

```python
>>> add_reducer = reduce$(lambda x, y: x + y)
>>> add_reducer([1, 2, 3])
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]])
[1, 2, 3, 4, 5, 6]
```
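
Since the text compares this syntax to `functools.partial`, the standard-library spelling of the same idea looks like the following. This is a comparison sketch in plain Python, not a claim about how `reduce$(...)` is implemented.

```python
from functools import partial, reduce

# Fix the first positional argument of reduce; the rest stay open.
add_reducer = partial(reduce, lambda x, y: x + y)

print(add_reducer([1, 2, 3], 0))             # 6
print(add_reducer([1, 2, 3]))                # 6 (the initial value stays optional)
print(add_reducer([[1, 2, 3], [4, 5, 6]]))   # [1, 2, 3, 4, 5, 6]
```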

For common functional programming tools like `map`, `reduce`, and `filter`, the above
pattern is so common that pipescript provides corresponding macros, in which the function used
to curry each higher order function is specified between brackets:

```python
>>> add_reducer = reduce[lambda x, y: x + y]
>>> [1, 2, 3] |> add_reducer
6
>>> [[1, 2, 3], [4, 5, 6]] |> add_reducer
[1, 2, 3, 4, 5, 6]
```

We're still writing out `lambda x, y: x + y`, which is kind of tedious -- for these
kinds of simple lambda constructions, pipescript provides a *quick lambda macro*, `f`:

```python
>>> add_reducer = reduce[f[$ + $]]
>>> [1, 2, 3] |> add_reducer
6
>>> [[1, 2, 3], [4, 5, 6]] |> add_reducer
[1, 2, 3, 4, 5, 6]
```

`f` can also be used on its own:

```python
>>> f[$ + $](2, 3)
5

>>> f[$a*$b + $b*$c + $a*$c](2, 3, 4)
26
```

Furthermore, pipescript allows you to omit the `f` from higher order
functional macros, so that you can simply do `add_reducer = reduce[$ + $]` instead.
Here are a couple of nifty constructions utilizing this compact syntax:

```python
# factorial
>>> range(1, 5) |> reduce[$ * $]
24

# compute a number from decimal digits
>>> [2, 3, 4] |> reduce[10*$ + $]
234
```

### Additional Pipe Operators

There are a few other variants of the `|>` operator offered by
pipescript, covered in this section.

#### Assignment Pipe

The *assignment pipe*, `|>>`, writes the left hand side value to the variable
whose name is specified on the right hand side. Furthermore, it evaluates to
the left hand side value. For example:

```python
>>> 2 |> $ + 2 |>> two_plus_two |> $ + 3 |>> two_plus_two_plus_three
7
>>> (two_plus_two, two_plus_two_plus_three)
(4, 7)
```

#### Varargs Pipe

The *varargs pipe*, `*|>`, unpacks the iterable on the left hand side before
passing its values as inputs to the function on the right hand side. For
example:

```python
# Add two numbers:
>>> (2, 3) *|> $ + $
5
```

A common pattern is using `*|>` to expand an undetermined pipeline
appearing inside of a `map[...]`:

```python
# Take the product of consecutive pairs of even-odd integers
>>> consecutive_pairs = range(10) |> list |> ($v[::2], $v[1::2]) *|> zip
>>> consecutive_pairs |> map[$ *|> $ * $] |> list
[0, 6, 20, 42, 72]
```

#### Function Pipe

The other commonly used pipe is the *function pipe*, `.>`, which is used to compose
the functions specified on the left hand side and right hand side together, with the
function on the left hand side being applied first in the composition (note that this
behavior is reversed from normal function composition, but follows the flow of data better).
For example:

```python
>>> reverse = reversed .> list
>>> [1, 2, 3] |> reverse
[3, 2, 1]
```

#### Other Pipes

Besides `|>>`, `*|>`, and `.>`, pipescript offers a few less commonly used operators as well. The below
table describes the complete set of forward pipe operators available:

| Operator           | Pipescript Syntax                                   | Python Syntax                           |
|--------------------|-----------------------------------------------------|-----------------------------------------|
| <code>\|></code>   | <code>y = x \|> f</code>                            | `y = f(x)`                              |
| <code>\|>></code>  | <code>x \|>> y</code>                               | `y = x; y`                              |
| <code>*\|></code>  | <code>y = x *\|> f</code> where `x` is an iterable  | `y = f(*x)`                             |
| <code>**\|></code> | <code>y = x **\|> f</code> where `x` is a dict      | `y = f(**x)`                            |
| `.>`               | `h = g .> f`                                        | `h = lambda *a, **kw: f(g(*a, **kw))`   |
| `*.>`              | `h = g *.> f`                                       | `h = lambda *a, **kw: f(*g(*a, **kw))`  |
| `**.>`             | `h = g **.> f`                                      | `h = lambda *a, **kw: f(**g(*a, **kw))` |
| `?>`               | `y = x ?> f`                                        | `y = None if x is None else f(x)`       |
| `*?>`              | `y = x *?> f` where `x` is an iterable, or `None`   | `y = None if x is None else f(*x)`      |
| `**?>`             | `y = x **?> f` where `x` is a dict, or `None`       | `y = None if x is None else f(**x)`     |
| `$>`               | `g = x $> f`                                        | `g = functools.partial(f, x)`           |
| `*$>`              | `g = x *$> f` where `x` is an iterable              | `g = functools.partial(f, *x)`          |
| `**$>`             | `g = x **$> f` where `x` is a dict                  | `g = functools.partial(f, **x)`         |

Except for `|>>`, each and every operator has a corresponding *backward* variant; e.g. `<|` is the backward variant
of `|>` and is a low-precedence apply. For example:

```python
>>> reversed .> list <| [1, 2, 3]
[3, 2, 1]
```

All pipe operators are applied in order from left to right (including backward pipes).
Furthermore, all pipe operators are left associative and operate at the same precedence
as `|` (bitwise or), meaning that any pipeline steps that include an `|` binary operation
must be wrapped in parentheses.
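
As a reading aid, several of the forward pipes can be approximated by ordinary helper functions. The helper names below are hypothetical and not part of pipescript. Composition follows the prose description of `.>`, with the left-hand function applied first.

```python
from functools import partial

def pipe(x, f):            # x |> f
    return f(x)

def star_pipe(x, f):       # x *|> f
    return f(*x)

def compose(g, f):         # g .> f  (g runs first, then f)
    return lambda *a, **kw: f(g(*a, **kw))

def optional_pipe(x, f):   # x ?> f
    return None if x is None else f(x)

def partial_pipe(x, f):    # x $> f
    return partial(f, x)

reverse = compose(reversed, list)
print(pipe([1, 2, 3], reverse))                 # [3, 2, 1]
print(star_pipe((2, 3), lambda a, b: a + b))    # 5
print(optional_pipe(None, len))                 # None
print(partial_pipe(2, pow)(5))                  # 32, i.e. pow(2, 5)
```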

### Additional Macros and Helper Utilities

#### `do` macro

Similar to [toolz](https://github.com/pytoolz/toolz), pipescript offers a `do` macro
implementing something similar to the following higher order function:

```python
def do(func, obj):
    func(obj)
    return obj
```

In the case of pipescript, the input function `func` is specified inside of brackets,
just as with other functional macros:

```python
>>> 2 |> $ + 2 |> do[print] |> $ + 2 |>> result
4
6
```

While any function expression, including undetermined pipelines, can appear inside `do[...]` brackets,
`do[print]` is so common that pipescript provides a `peek` utility that implements the very same:

```python
>>> 2 |> $ + 2 |> peek |> $ + 2 |>> result
4
6
```

To suppress the automatic expression rendering of a pipeline result, pipescript also offers a `null` utility function
(as in `/dev/null`), which essentially swallows its input:

```python
>>> 2 |> $ + 2 |> peek |> $ + 2 |>> result |> null
4
```
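
In plain Python, the three helpers described above could be approximated as follows. This is a sketch under the assumption that `peek` behaves like `do[print]` and `null` simply returns `None`. The curried shape of `do` here is chosen for illustration and is not taken from pipescript's source.

```python
def do(func):
    """Return a step that calls func for its side effect and passes obj through."""
    def step(obj):
        func(obj)
        return obj
    return step

peek = do(print)          # assumed equivalent of do[print]

def null(_obj):
    """Swallow the input so nothing is rendered as the last expression."""
    return None

result = peek(2 + 2) + 2  # prints 4
print(result)             # 6
print(null(result))       # None
```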

#### `fork` and `parallel` macros

If you wish to move beyond linear chains and apply the same input to multiple pipelines,
pipescript provides `fork` and `parallel` macros, which return the results of each function
as a tuple:

```python
>>> range(10) |> list |> fork[
    map[2 * $] .> filter[$ % 3 == 0],
    map[3 * $] .> filter[$ % 2 == 0],
]
([0, 6, 12, 18], [0, 6, 12, 18, 24])
```

`parallel` does the same thing as `fork` but executes each function passed to it concurrently.
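
The difference between the two macros can be sketched in plain Python. The use of a thread pool for `parallel` is an assumption made for this example; the text only says the functions run concurrently. The two branch functions are hypothetical stand-ins for the `map[...] .> filter[...]` steps.

```python
from concurrent.futures import ThreadPoolExecutor

def fork(*funcs):
    """Apply every function to the same input, one after another."""
    return lambda x: tuple(f(x) for f in funcs)

def parallel(*funcs):
    """Same result shape as fork, but each function runs in its own thread."""
    def run(x):
        with ThreadPoolExecutor(max_workers=max(1, len(funcs))) as pool:
            futures = [pool.submit(f, x) for f in funcs]
            return tuple(fut.result() for fut in futures)
    return run

def doubled_multiples_of_three(xs):
    return [2 * x for x in xs if (2 * x) % 3 == 0]

def tripled_evens(xs):
    return [3 * x for x in xs if (3 * x) % 2 == 0]

nums = list(range(10))
print(fork(doubled_multiples_of_three, tripled_evens)(nums))
print(parallel(doubled_multiples_of_three, tripled_evens)(nums))
# both print ([0, 6, 12, 18], [0, 6, 12, 18, 24])
```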

#### `when`, `unless`, `otherwise`, `repeat`, `until` macros

The `when` macro takes a conditional expression in brackets; applied to a value, it
forwards the value when the condition passes and terminates computation with `None` when it fails. It is particularly powerful
when combined with `fork` and `collapse` (the latter of which extracts the non-null value out of
the tuple that results from the `fork`):

```python
>>> collatz = when[$ != 1] .> fork[
    when[$ % 2 == 0] .> $ // 2,
    when[$ % 2 == 1] .> $ * 3 + 1,
] .> collapse .> peek
```

You can also use `unless`, which is just the opposite of `when`:

```python
>>> collatz = when[$ != 1] .> fork[
    when[$ % 2 == 0] .> $ // 2,
    unless[$ % 2 == 0] .> $ * 3 + 1,
] .> collapse .> peek
```

If you don't want to explicitly write out the negative conditional, `fork` lets you
use the `otherwise` macro as the last expression:

```python
>>> collatz = when[$ != 1] .> fork[
    when[$ % 2 == 0] .> $ // 2,
    otherwise[$ * 3 + 1],
] .> collapse .> peek
```

Of course, this can be written more naturally and succinctly with
a ternary conditional expression:

```python
>>> collatz = when[$ != 1] .> f[$v // 2 if $v % 2 == 0 else $v * 3 + 1] .> peek
```

Regardless of how we write the conditional, pipescript allows you to
exponentiate single-argument functions with the power composition (`.**`)
operator, so that we don't need to write out
`42 |> collatz |> collatz |> ... |> collatz`:

```python
>>> 42 |> collatz .** 20
21
64
32
16
8
4
2
1
```

If you don't want to guess the upper bound of how many steps to run it, you can
use the `repeat` and `until` macros (`until` is just an alias of `unless`):

```python
>>> collatz = f[$v // 2 if $v % 2 == 0 else $v * 3 + 1]
>>> 42 |> repeat[until[$ == 1] .> collatz .> peek] |> null
21
64
32
16
8
4
2
1
```
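
The control-flow macros can be pictured with a few plain-Python helpers. Everything below is a sketch under stated assumptions. In particular, `repeat` is assumed to re-apply its step until the step yields `None` and then return the last non-`None` value, which the text does not spell out. The `seen` list stands in for `peek`.

```python
def when(pred):
    """Forward the value if pred holds, otherwise yield None."""
    return lambda x: x if x is not None and pred(x) else None

def unless(pred):
    """Opposite of when."""
    return when(lambda x: not pred(x))

def collapse(results):
    """Pick the first non-None entry of a fork result, or None."""
    return next((r for r in results if r is not None), None)

def repeat(step):
    """Assumed semantics: apply step until it returns None."""
    def run(x):
        while True:
            nxt = step(x)
            if nxt is None:
                return x
            x = nxt
    return run

collatz = lambda v: v // 2 if v % 2 == 0 else v * 3 + 1
seen = []

def step(v):
    v = unless(lambda n: n == 1)(v)   # until[$ == 1]
    if v is None:
        return None
    nxt = collatz(v)
    seen.append(nxt)                  # stands in for peek
    return nxt

final = repeat(step)(42)
print(seen)   # [21, 64, 32, 16, 8, 4, 2, 1]
```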

#### `future` macro

Finally, to schedule a function to run in another thread and immediately
return a future to the eventual result, pipescript provides a `future` macro:

```python
>>> 2 |> future[$ + 2] |> $.result()
4
>>> [1, 2, 3] |> future[sum] |> $.result()
6
```
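
A comparable effect is available from the standard library. The sketch below assumes the returned object behaves like a `concurrent.futures.Future`, which the `.result()` calls above suggest but do not confirm. The module-level pool is a choice made for this example only.

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

def future(func):
    """Return a step that schedules func(x) on a worker thread."""
    return lambda x: _pool.submit(func, x)

four = future(lambda v: v + 2)(2).result()
six = future(sum)([1, 2, 3]).result()
print(four, six)   # 4 6

_pool.shutdown()
```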

## Placeholder Scope

A natural question is: how does pipescript know what part of the code should
be included in the body of the function induced by placeholder use? The
rules are as follows:

1. If there is a macro or pipeline step enclosing the placeholder, the induced
   function body includes the "smallest" such enclosing macro or pipeline step.
2. Otherwise, the function body expands to include the nearest "chain"
   of function calls, attribute accesses, and / or subscript accesses.

An example of a "chain" would be something like `np.array($).T.astype(int)`,
which induces a lambda that converts its argument to a numpy array,
transposes it, and then converts the result to an integer dtype (`int64` on
most platforms). That is,
the lambda body expands to include not just `np.array($)`, but the entire
"chain" in the expression.

To see a concrete example of where this matters, consider the following
two placeholder expressions:

```python
# The following sorters do different things!
sorter1 = sorted($, key=$[1])
sorter2 = sorted($, key=f[$[1]])
```

`sorter1` is a function that takes two arguments: a sequence to sort, and a list of
functions whose second element (index 1) is used as the sort key for
the first argument.
`sorter2`, on the other hand, is a function that takes a single argument, which
is a sequence that it sorts using the second element of each value in said
sequence as the sort key. In most cases, `sorter2` probably gives the desired
behavior.
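
Written out as ordinary lambdas, the two sorters would look roughly like this. The mapping follows the scoping rules above and is a sketch, not pipescript's literal output. The sample data is invented for the example.

```python
# sorted($, key=$[1]): both placeholders belong to the same call chain,
# so the induced function takes two arguments.
sorter1 = lambda seq, funcs: sorted(seq, key=funcs[1])

# sorted($, key=f[$[1]]): the inner placeholder is scoped to the f[...] macro,
# so only the outer placeholder becomes an argument.
sorter2 = lambda seq: sorted(seq, key=lambda item: item[1])

pairs = [(1, "b"), (2, "a"), (3, "c")]
print(sorter2(pairs))                                  # [(2, 'a'), (1, 'b'), (3, 'c')]
print(sorter1(pairs, [None, lambda item: -item[0]]))   # [(3, 'c'), (2, 'a'), (1, 'b')]
```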

## Optional Chaining, Permissive Attribute Chaining, and Nullish Coalescing

Pipescript also provides TypeScript-style optional chaining and nullish coalescing.
That is, `a?.b.c.d().e` resolves to `None` when `a` is `None`, as does `a?.()`.
Also, `a ?? obj` evaluates to `obj` only when `a` is `None`, but evaluates to `a`
whenever `a` is some other falsy value like `""`, `0`, `False`, or `[]`. Note that,
like normal boolean `or`, the nullish coalescing operator `??` is lazy and will not
evaluate expressions on its right hand side when its left hand side is not `None`.

Unlike JavaScript, Python does not resolve unavailable attribute accesses to
`undefined`, but will rather throw `AttributeError`. In pipescript, if you would
like to perform some kind of permissive attribute access like in JavaScript, you
can use the *permissive chaining operator* `.?` (where the `?` appears after the
`.`) and access `b` as `a.?b`, which is equivalent to `getattr(a, "b", None)`.
Note however that if the aforementioned expression resolves to `None`, something
like `a.?b.c` will still throw an `AttributeError` -- to avoid that, you need to
combine both permissive attribute chaining and optional chaining as `a.?b?.c`.
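
The three behaviours can be mimicked with small helpers in plain Python. The helper names are hypothetical. Note that `optional_get` covers a single attribute step only, whereas `?.` short-circuits the rest of the chain.

```python
from types import SimpleNamespace

def coalesce(a, make_default):
    """a ?? default: the default is computed lazily, only when a is None."""
    return a if a is not None else make_default()

def permissive_get(obj, name):
    """obj.?name: missing attributes resolve to None."""
    return getattr(obj, name, None)

def optional_get(obj, name):
    """obj?.name: None in, None out; otherwise a normal attribute access."""
    return None if obj is None else getattr(obj, name)

a = SimpleNamespace(x=0)
print(coalesce(a.x, lambda: 5))                       # 0 (falsy but not None)
print(coalesce(None, lambda: 5))                      # 5
print(permissive_get(a, "b"))                         # None
print(optional_get(permissive_get(a, "b"), "c"))      # None, like a.?b?.c
```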

## Performance Overhead

Because pipescript is implemented using instrumentation (see [How it works](#how-it-works)),
it does incur overhead. For top-level code written in a Jupyter cell (i.e.,
code that doesn't have any indentation), the additional overhead generally doesn't matter,
as it tends to be insignificant when compared to data-intensive dataframe operations
and SQL queries common in data science workloads. Furthermore, overhead is only incurred
when pipescript syntax is actually used -- there's no penalty for any code written in vanilla
Python, **even when pipescript has been enabled in your current REPL session**.

## More Examples

I developed pipescript while working on
[Advent of Code 2025](https://adventofcode.com/2025) in parallel,
and used it for most of the input processing portions of my solutions.
You can find these solutions at https://github.com/smacke/aoc2025. In particular,
the [solution for day 6](https://github.com/smacke/aoc2025/blob/main/aoc6.ipynb)
showcases the upper limits of what is possible with pipescript. Note however that it is
optimized for pipescript usage and not readability, which I generally wouldn't recommend.

## What pipescript is and is not

For now, pipescript is not a general purpose functional programming language on top of
Python. It is very much not intended for production use cases, and instead
caters to quick-and-dirty one-off / scratchpad type computations in IPython
and Jupyter specifically. In short, pipescript aims to provide simple but powerful
pipeline and placeholder syntax to interactive Python programming environments.

In particular, pipescript is:
- Currently only for interactive Python environments built on top of IPython, such as
  Jupyter, or IPython itself
- Just a library you can install from PyPI, compatible with a wide range of Python 3
  versions -- no fancy installation instructions, no complicated language distribution
  to install
- Fully compatible with all existing Python standard and third-party libraries that
  you already know and love, since it's just Python function calls under the hood

All the different pipeline operators like `|>`, `<|`, `*|>`, etc. essentially
transpile down to an instrumented variant of the bitwise-or (`|`) operator, and
therefore every new operator left-associates at the same level of precedence,
meaning that pipeline steps run from left to right in the order that they
appear. Pipescript aims to optimize for simplicity, readability / writability, and
predictability over feature completeness (though I'd like to think it strikes a
fairly good balance in this regard). Pipescript may be expanded beyond IPython / Jupyter
depending on traction.

## How it works

Pipescript works by transforming syntax in two stages. First, it rewrites token spans
like `|>` and `*|>` that are illegal in Python to legal ones -- for the previous
examples, both spans are rewritten to bitwise or, `|`. After these transformations,
the resulting code is valid (but likely not runnable) Python syntax. Pipescript uses
the [pyccolo](https://github.com/smacke/pyccolo) library to perform these rewrites,
which remembers the positions where the rewrites occurred, so that the eventual
`ast.BinOp` AST node can be associated with the `|>` operator.

Pyccolo is a library I developed during my PhD which provides an event-driven
architecture for declarative AST transformations. Its key selling point is that
it allows you to layer multiple AST transformations on top of each other in a
composable fashion. In short, you specify handlers for different AST nodes such
as `ast.BinOp`, and pyccolo instruments these nodes by emitting events for them,
so that when the code runs, all the handlers for a particular event are run.
Such event handlers are what allow us to change the behavior of `ast.BinOp`
nodes that have been associated with various custom operators like `|>`.
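
To give a feel for the two-stage idea, here is a deliberately naive sketch that uses only the standard `ast` module. It is not how pipescript or pyccolo work internally. It swaps in a plain `ast.NodeTransformer` for pyccolo's event-driven instrumentation, does not track rewrite positions, and therefore treats every `|` in the expression as a pipe. `PipeToCall` and `run_pipeline` are hypothetical names.

```python
import ast

class PipeToCall(ast.NodeTransformer):
    """Turn every `left | right` into the call `right(left)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested pipes first
        if isinstance(node.op, ast.BitOr):
            call = ast.Call(func=node.right, args=[node.left], keywords=[])
            return ast.copy_location(call, node)
        return node

def run_pipeline(source, env=None):
    # Stage 1: make the text valid Python by rewriting the `|>` token.
    rewritten = source.replace("|>", "|")
    # Stage 2: change what the resulting BinOp nodes mean, then evaluate.
    tree = ast.parse(rewritten, mode="eval")
    tree = ast.fix_missing_locations(PipeToCall().visit(tree))
    return eval(compile(tree, "<pipeline>", "eval"), dict(env or {}))

print(run_pipeline("(3, 4, 1, 5, 6) |> sorted |> tuple"))   # (1, 3, 4, 5, 6)
```

Because this version cannot tell a rewritten `|>` from a genuine bitwise or, it breaks ordinary `|` expressions, which is the kind of problem the position tracking described above avoids.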

Because the same event emission transformation can be leveraged by multiple
associated handlers, you generally don't need to worry about said
transformations rewriting the AST in ways that conflict with each other. This
composability lies in stark contrast with the challenges you would face if you
were to just create a bunch of `ast.NodeTransformer` instances to perform
transformations. The strategy employed by pyccolo therefore allows for
incremental and iterative feature development without requiring large rewrites
as new features are introduced.

To summarize, pipescript rewrites its syntax to valid Python, and then runs this Python in
an instrumented fashion using pyccolo. Because everything is just running in
Python, pipescript is effectively a Python superset, and because the transformed
Python that is instrumented is fairly similar visually to pipescript syntax,
various Jupyter ergonomic features like readable stack traces and jedi-based
autocomplete can continue to function as normal (for the most part).

Implementation-wise, thanks to pyccolo's heavy lifting, I was able to write the
initial release of pipescript entirely over the course of time off during the
2025 holiday season. At the time of this writing, pipescript occupies about 2000
lines of code (excluding tests), each of which was produced *without* the help
of any AI agents.

## Inspiration

Pipescript draws inspiration largely from
[magrittr](https://magrittr.tidyverse.org/), but also from efforts like
[coconut](https://coconut-lang.org/) (a functional superset of Python),
as well as from libraries like [Pipe](https://github.com/JulienPalard/Pipe) and [toolz](https://github.com/pytoolz/toolz) which
fill some of Python's pipe and functional programming gaps with elegant APIs.

## Disclaimer

**Warning: use pipescript at your own risk!** It is very much not guaranteed to
be bug-free -- I implemented it in a hurry before it was time to go back to work.

## License

Code in this project is licensed under the [BSD-3-Clause License](https://opensource.org/licenses/BSD-3-Clause).