@chrismo/superkit 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (74)
  1. package/LICENSE.txt +29 -0
  2. package/README.md +26 -0
  3. package/dist/cli/pager.d.ts +6 -0
  4. package/dist/cli/pager.d.ts.map +1 -0
  5. package/dist/cli/pager.js +21 -0
  6. package/dist/cli/pager.js.map +1 -0
  7. package/dist/cli/skdoc.d.ts +3 -0
  8. package/dist/cli/skdoc.d.ts.map +1 -0
  9. package/dist/cli/skdoc.js +42 -0
  10. package/dist/cli/skdoc.js.map +1 -0
  11. package/dist/cli/skgrok.d.ts +3 -0
  12. package/dist/cli/skgrok.d.ts.map +1 -0
  13. package/dist/cli/skgrok.js +21 -0
  14. package/dist/cli/skgrok.js.map +1 -0
  15. package/dist/cli/skops.d.ts +3 -0
  16. package/dist/cli/skops.d.ts.map +1 -0
  17. package/dist/cli/skops.js +32 -0
  18. package/dist/cli/skops.js.map +1 -0
  19. package/dist/index.d.ts +10 -0
  20. package/dist/index.d.ts.map +1 -0
  21. package/dist/index.js +11 -0
  22. package/dist/index.js.map +1 -0
  23. package/dist/lib/docs.d.ts +11 -0
  24. package/dist/lib/docs.d.ts.map +1 -0
  25. package/dist/lib/docs.js +29 -0
  26. package/dist/lib/docs.js.map +1 -0
  27. package/dist/lib/expert-sections.d.ts +32 -0
  28. package/dist/lib/expert-sections.d.ts.map +1 -0
  29. package/dist/lib/expert-sections.js +130 -0
  30. package/dist/lib/expert-sections.js.map +1 -0
  31. package/dist/lib/grok.d.ts +15 -0
  32. package/dist/lib/grok.d.ts.map +1 -0
  33. package/dist/lib/grok.js +57 -0
  34. package/dist/lib/grok.js.map +1 -0
  35. package/dist/lib/help.d.ts +20 -0
  36. package/dist/lib/help.d.ts.map +1 -0
  37. package/dist/lib/help.js +163 -0
  38. package/dist/lib/help.js.map +1 -0
  39. package/dist/lib/recipes.d.ts +29 -0
  40. package/dist/lib/recipes.d.ts.map +1 -0
  41. package/dist/lib/recipes.js +133 -0
  42. package/dist/lib/recipes.js.map +1 -0
  43. package/dist/superkit.tar.gz +0 -0
  44. package/docs/grok-patterns.sup +89 -0
  45. package/docs/recipes/array.md +66 -0
  46. package/docs/recipes/array.spq +31 -0
  47. package/docs/recipes/character.md +110 -0
  48. package/docs/recipes/character.spq +57 -0
  49. package/docs/recipes/escape.md +159 -0
  50. package/docs/recipes/escape.spq +102 -0
  51. package/docs/recipes/format.md +51 -0
  52. package/docs/recipes/format.spq +24 -0
  53. package/docs/recipes/index.md +23 -0
  54. package/docs/recipes/integer.md +101 -0
  55. package/docs/recipes/integer.spq +53 -0
  56. package/docs/recipes/records.md +84 -0
  57. package/docs/recipes/records.spq +61 -0
  58. package/docs/recipes/string.md +177 -0
  59. package/docs/recipes/string.spq +105 -0
  60. package/docs/superdb-expert.md +929 -0
  61. package/docs/tutorials/bash_to_sup.md +123 -0
  62. package/docs/tutorials/chess-tiebreaks.md +233 -0
  63. package/docs/tutorials/debug.md +439 -0
  64. package/docs/tutorials/fork_for_window.md +296 -0
  65. package/docs/tutorials/grok.md +166 -0
  66. package/docs/tutorials/index.md +10 -0
  67. package/docs/tutorials/joins.md +79 -0
  68. package/docs/tutorials/moar_subqueries.md +35 -0
  69. package/docs/tutorials/subqueries.md +236 -0
  70. package/docs/tutorials/sup_to_bash.md +164 -0
  71. package/docs/tutorials/super_db_update.md +34 -0
  72. package/docs/tutorials/unnest.md +113 -0
  73. package/docs/zq-to-super-upgrades.md +549 -0
  74. package/package.json +46 -0
package/docs/tutorials/fork_for_window.md
@@ -0,0 +1,296 @@
---
title: "Fork as a Window Function Workaround"
name: fork-for-window
description: "Using fork as a workaround for window functions to do per-group selection."
layout: default
nav_order: 4
parent: Tutorials
superdb_version: "0.3.0"
last_updated: "2026-02-20"
---

# Fork as a Window Function Workaround

Window functions like `ROW_NUMBER() OVER (PARTITION BY ...)` are not yet
available in SuperDB ([brimdata/super#5921][issue]). This tutorial shows how to
use `fork` to achieve per-group selection — picking the top N items from each
group.

[issue]: https://github.com/brimdata/super/issues/5921

## The Problem

You have a pool of available EC2 instances spread across availability zones.
You need to pick instances while maximizing AZ distribution — taking an equal
number from each zone rather than filling up from one.

```mdtest-input instances.sup
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
{id:"i-003",az:"us-east-1a"}
{id:"i-004",az:"us-east-1b"}
{id:"i-005",az:"us-east-1c"}
{id:"i-006",az:"us-east-1c"}
{id:"i-007",az:"us-east-1c"}
{id:"i-008",az:"us-east-1c"}
```

Distribution: 3 in `us-east-1a`, 1 in `us-east-1b`, 4 in `us-east-1c`.
39
+
40
+ ## What You'd Want (Window Functions)
41
+
42
+ In SQL with window functions, this would be straightforward:
43
+
44
+ ```sql
45
+ SELECT * FROM (
46
+ SELECT *,
47
+ ROW_NUMBER() OVER (PARTITION BY az ORDER BY id) as rn
48
+ FROM instances
49
+ ) WHERE rn <= 2
50
+ ```
51
+
52
+ This assigns a row number within each AZ group, then filters to keep only the
53
+ first 2 per group. But SuperDB doesn't support this yet.
54
+
55
+ ## The Fork Approach
56
+
57
+ `fork` splits the input stream into parallel branches. Each branch receives a
58
+ copy of **all** the input records, processes them independently, and the results
59
+ from every branch are merged back together into a single stream.
60
+
61
+ Here's the full query — we'll break it down step by step after:
62
+
63
+ ```mdtest-command
64
+ super -s -c "
65
+ from instances.sup
66
+ | fork
67
+ ( where az=='us-east-1a' | head 2 )
68
+ ( where az=='us-east-1b' | head 2 )
69
+ ( where az=='us-east-1c' | head 2 )
70
+ | sort az, id
71
+ "
72
+ ```
73
+ ```mdtest-output
74
+ {id:"i-001",az:"us-east-1a"}
75
+ {id:"i-002",az:"us-east-1a"}
76
+ {id:"i-004",az:"us-east-1b"}
77
+ {id:"i-005",az:"us-east-1c"}
78
+ {id:"i-006",az:"us-east-1c"}
79
+ ```
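The copy-filter-merge dataflow can be sketched in a few lines of plain Python. This is purely an illustration of fork's semantics using the example data above, not how SuperDB actually executes the query:

```python
from itertools import islice

instances = [
    {"id": "i-001", "az": "us-east-1a"},
    {"id": "i-002", "az": "us-east-1a"},
    {"id": "i-003", "az": "us-east-1a"},
    {"id": "i-004", "az": "us-east-1b"},
    {"id": "i-005", "az": "us-east-1c"},
    {"id": "i-006", "az": "us-east-1c"},
    {"id": "i-007", "az": "us-east-1c"},
    {"id": "i-008", "az": "us-east-1c"},
]

def branch(records, az, n=2):
    # where az==... | head n: filter the full input, then keep the first n
    return list(islice((r for r in records if r["az"] == az), n))

# fork: every branch sees a copy of the full input; results are merged
merged = []
for az in ["us-east-1a", "us-east-1b", "us-east-1c"]:
    merged += branch(instances, az)

# final sort for deterministic output
merged.sort(key=lambda r: (r["az"], r["id"]))
print([r["id"] for r in merged])  # ['i-001', 'i-002', 'i-004', 'i-005', 'i-006']
```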

### Step by Step

**Step 1: `from instances.sup`** — reads all 8 records into the stream:

```mdtest-command
super -s -c "from instances.sup"
```
```mdtest-output
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
{id:"i-003",az:"us-east-1a"}
{id:"i-004",az:"us-east-1b"}
{id:"i-005",az:"us-east-1c"}
{id:"i-006",az:"us-east-1c"}
{id:"i-007",az:"us-east-1c"}
{id:"i-008",az:"us-east-1c"}
```

**Step 2: `fork`** — sends all 8 records into each of three branches. Each
branch sees the full input and processes it independently.

**Branch 1:** `where az=='us-east-1a'` filters to 3 records, then `head 2`
keeps the first 2:

```mdtest-command
super -s -c "from instances.sup | where az=='us-east-1a' | head 2"
```
```mdtest-output
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
```

(i-003 passed the filter but was dropped by `head 2`.)

**Branch 2:** `where az=='us-east-1b'` filters to 1 record, `head 2` returns
what's available:

```mdtest-command
super -s -c "from instances.sup | where az=='us-east-1b' | head 2"
```
```mdtest-output
{id:"i-004",az:"us-east-1b"}
```

Only 1 instance exists in this AZ. `head 2` doesn't error or pad — it just
returns what's there.

**Branch 3:** `where az=='us-east-1c'` filters to 4 records, `head 2` keeps
the first 2:

```mdtest-command
super -s -c "from instances.sup | where az=='us-east-1c' | head 2"
```
```mdtest-output
{id:"i-005",az:"us-east-1c"}
{id:"i-006",az:"us-east-1c"}
```

(i-007 and i-008 passed the filter but were dropped by `head 2`.)

**Step 3: implicit combine** — after the fork closes, results from all three
branches merge back into a single stream of 5 records. Fork branches run in
parallel and finish in nondeterministic order, so the combined output may be
interleaved differently on each run. This is why the final `sort` matters.

**Step 4: `sort az, id`** — sorts the combined results for clean, predictable
output:

```mdtest-command
super -s -c "
  from instances.sup
  | fork
    ( where az=='us-east-1a' | head 2 )
    ( where az=='us-east-1b' | head 2 )
    ( where az=='us-east-1c' | head 2 )
  | sort az, id
"
```
```mdtest-output
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
{id:"i-004",az:"us-east-1b"}
{id:"i-005",az:"us-east-1c"}
{id:"i-006",az:"us-east-1c"}
```

2 from `us-east-1a`, 1 from `us-east-1b` (all it had), 2 from `us-east-1c` —
as balanced as possible given the available pool.

## Why Not Just Sort and Head?

Without fork, you might try:

```mdtest-command
super -s -c "from instances.sup | sort az, id | head 5"
```
```mdtest-output
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
{id:"i-003",az:"us-east-1a"}
{id:"i-004",az:"us-east-1b"}
{id:"i-005",az:"us-east-1c"}
```

All 3 from `us-east-1a`, the 1 from `us-east-1b`, and only 1 from `us-east-1c`.
That's unbalanced — it fills up from the first AZ alphabetically instead of
distributing evenly.

## Verifying the Distribution

You can check the balance of your selection by piping through an aggregate:

```mdtest-command
super -s -c "
  from instances.sup
  | fork
    ( where az=='us-east-1a' | head 2 )
    ( where az=='us-east-1b' | head 2 )
    ( where az=='us-east-1c' | head 2 )
  | aggregate count:=count() by az
  | sort az
"
```
```mdtest-output
{az:"us-east-1a",count:2}
{az:"us-east-1b",count:1}
{az:"us-east-1c",count:2}
```

## Alternative: Self-Join for Row Numbering

There's a pure SQL approach that doesn't require fork and works dynamically with
any number of groups. The idea: for each record, count how many records in the
same group have an id less than or equal to it. This simulates
`ROW_NUMBER() OVER (PARTITION BY az ORDER BY id)`.

```mdtest-command
super -s -c "
  select a.id, a.az, count(*) as row_num
  from instances.sup a
  join instances.sup b on a.az = b.az and b.id <= a.id
  group by a.id, a.az
  order by a.az, a.id
"
```
```mdtest-output
{id:"i-001",az:"us-east-1a",row_num:1}
{id:"i-002",az:"us-east-1a",row_num:2}
{id:"i-003",az:"us-east-1a",row_num:3}
{id:"i-004",az:"us-east-1b",row_num:1}
{id:"i-005",az:"us-east-1c",row_num:1}
{id:"i-006",az:"us-east-1c",row_num:2}
{id:"i-007",az:"us-east-1c",row_num:3}
{id:"i-008",az:"us-east-1c",row_num:4}
```

Step by step, for record `i-006` in `us-east-1c`:

1. The self-join matches `i-006` against all `us-east-1c` records with
   `id <= 'i-006'`: that's `i-005` and `i-006` itself.
2. `count(*)` = 2, so `row_num` = 2.
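The counting trick is easy to sanity-check outside the database. Here's a Python sketch of the same logic (the data mirrors `instances.sup`; Python's string comparison on ids stands in for the SQL `b.id <= a.id`):

```python
instances = [
    ("i-001", "us-east-1a"), ("i-002", "us-east-1a"), ("i-003", "us-east-1a"),
    ("i-004", "us-east-1b"),
    ("i-005", "us-east-1c"), ("i-006", "us-east-1c"),
    ("i-007", "us-east-1c"), ("i-008", "us-east-1c"),
]

# row_num = how many records in the same az have an id <= this one,
# i.e. a simulated ROW_NUMBER() OVER (PARTITION BY az ORDER BY id)
ranked = [
    (id_, az, sum(1 for pid, paz in instances if paz == az and pid <= id_))
    for id_, az in instances
]
assert ranked[5] == ("i-006", "us-east-1c", 2)  # the worked example above

# "where row_num <= 2" keeps the first two per group
picked = [id_ for id_, _az, rn in ranked if rn <= 2]
print(picked)  # ['i-001', 'i-002', 'i-004', 'i-005', 'i-006']
```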

Now filter to keep only the first 2 per group:

```mdtest-command
super -s -c "
  with ranked as (
    select a.id, a.az, count(*) as row_num
    from instances.sup a
    join instances.sup b on a.az = b.az and b.id <= a.id
    group by a.id, a.az
  )
  select id, az from ranked
  where row_num <= 2
  order by az, id
"
```
```mdtest-output
{id:"i-001",az:"us-east-1a"}
{id:"i-002",az:"us-east-1a"}
{id:"i-004",az:"us-east-1b"}
{id:"i-005",az:"us-east-1c"}
{id:"i-006",az:"us-east-1c"}
```

Same result as fork, but no hardcoded AZ names — it works with any number of
groups dynamically.

## Trade-offs

**Fork** is simple and fast (a linear scan per branch), but requires hardcoding
group values. Best when groups are known and stable (like AZs in a region).

**Self-join** is dynamic and handles any number of groups automatically, but
is O(n^2) per group since every record is joined against all peers with a
smaller key. Fine for small datasets, potentially slow for large ones.

**With window functions** ([brimdata/super#5921][issue]), the query would be
both dynamic and efficient — handling any number of groups with a single linear
pass and supporting sophisticated ranking (e.g., ordering within groups by
launch time, instance type preference, etc.).

| Approach         | Dynamic groups? | Time complexity  | Notes                                       |
|------------------|-----------------|------------------|---------------------------------------------|
| Fork             | No              | O(n) per branch  | Groups must be hardcoded                    |
| Self-join        | Yes             | O(n^2) per group | Every record joined against its group peers |
| Window functions | Yes             | O(n log n)       | Sort + single pass (not yet available)      |

For a refresher on what those mean in practice
([Big O notation](https://en.wikipedia.org/wiki/Big_O_notation)):

| Notation   | Name         | 100 records | 10,000 records | Growth         |
|------------|--------------|-------------|----------------|----------------|
| O(n)       | Linear       | 100         | 10,000         | Scales nicely  |
| O(n log n) | Linearithmic | ~664        | ~132,877       | Typical sort   |
| O(n^2)     | Quadratic    | 10,000      | 100,000,000    | Gets slow fast |
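The middle columns of that table follow directly from the formulas; a quick check, rounding n·log2(n) to the nearest integer:

```python
import math

# reproduce the "100 records" and "10,000 records" columns of the table
for n in (100, 10_000):
    print(f"n={n}: linear={n}, n_log_n={round(n * math.log2(n))}, quadratic={n * n}")
```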
package/docs/tutorials/grok.md
@@ -0,0 +1,166 @@
---
title: "grok"
name: grok
description: "Tutorial on using the grok function for text parsing in SuperDB."
layout: default
nav_order: 5
parent: Tutorials
superdb_version: "0.3.0"
last_updated: "2026-03-28"
---

# grok

The grok function is a great choice for parsing text, but due to some gaps in
its documentation and some vague error messages, it can be difficult to use at
first.

The docs do helpfully encourage building out grok patterns incrementally, but
without knowing some of grok's gotchas, this can be discouraging.

Let's walk through these, starting with an example where we want to extract
the name from this string:

```text
My name is: Muerte!
```

To start incrementally, I know I want to skip everything up to and including
the colon, and then extract the name minus the closing exclamation point.
There's probably not a predefined pattern for this prefix regex, and/or I'm
feeling lazy enough right now to not go looking for one, and the regex is
pretty simple.

So, I'll define my own pattern in the 3rd arg to grok to handle this:

```mdtest-command
super -s -c '
  values "My name is: Muerte!"
  | grok("%{NAME_PREFIX}", this, "NAME_PREFIX .*: ")'
```

Since there aren't any errors, and no field names assigned, it returns an empty
record:

```mdtest-output
{}
```

It's a simple regex, and it seems accurate — so what's wrong?

The regex is fine, in fact. The real reason this returns an empty record is that
the capture pattern is **missing a field name** in which to store the value.
Without a field name, there's nothing to capture into a record field.

We probably made this mistake because we don't really want to capture "My name
is: " in a field of the record. But, no big deal, we can add one and use the
cut operator later to remove it.

```mdtest-command
super -s -c '
  values "My name is: Muerte!"
  | grok("%{NAME_PREFIX:prefix}", this, "NAME_PREFIX .*: ")'
```
```mdtest-output
{prefix:"My name is: "}
```

Success!

For our next incremental step, let's capture the name. That's all that's left.

```mdtest-command
super -s -c '
  values "My name is: Muerte!"
  | grok("%{NAME_PREFIX:prefix}%{WORD:name}", this, "NAME_PREFIX .*: ")'
```
```mdtest-output
{prefix:"My name is: ",name:"Muerte"}
```

Success again! Ok, that wasn't so bad, but it's a little arduous. It doesn't
feel like I'm getting to use the power of regex in a straightforward manner.

There are two undocumented grok "hacks" that can make a simple job like this
even simpler.

First (as seen already with the unnamed capture pattern above), not every
capture pattern needs a field name, as long as _**one**_ of them has a field
name. So we can reduce our last example to this:

```mdtest-command
super -s -c '
  values "My name is: Muerte!"
  | grok("%{NAME_PREFIX}%{WORD:name}", this, "NAME_PREFIX .*: ")'
```
```mdtest-output
{name:"Muerte"}
```

Second, custom regex patterns can be _inlined_ into the pattern string without
being a custom named pattern in the 3rd argument at all!

```mdtest-command
super -s -c '
  values "My name is: Muerte!"
  | grok(".*: %{WORD:name}", this)'
```
```mdtest-output
{name:"Muerte"}
```

Now that feels clean and simple!
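If it helps to see that final one-liner in familiar regex terms, here's a rough Python equivalent using a named capture group. This is an approximation — grok's `WORD` is its own predefined pattern, sketched here as `\w+`:

```python
import re

# ".*: %{WORD:name}" is roughly: a greedy prefix ending in ": ",
# then a word captured into the named group "name"
pattern = re.compile(r".*: (?P<name>\w+)")

match = pattern.match("My name is: Muerte!")
print(match.groupdict())  # {'name': 'Muerte'}
```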

## Pairing grok with infer

Grok extracts fields as strings — even numbers and IPs. The `infer` operator
can automatically detect and cast these to native types.

Here's a log line parsed with grok — note that everything is a string:

```mdtest-command
super -s -c '
  values
    "192.168.1.1 GET /api/users 200 1234",
    "10.0.0.5 POST /api/data 404 567"
  | grok("%{IP:client} %{WORD:method} %{URIPATH:path} %{INT:status} %{INT:bytes}", this)'
```
```mdtest-output
{client:"192.168.1.1",method:"GET",path:"/api/users",status:"200",bytes:"1234"}
{client:"10.0.0.5",method:"POST",path:"/api/data",status:"404",bytes:"567"}
```

Add `| infer` and the types get cleaned up automatically:

```mdtest-command
super -s -c '
  values
    "192.168.1.1 GET /api/users 200 1234",
    "10.0.0.5 POST /api/data 404 567"
  | grok("%{IP:client} %{WORD:method} %{URIPATH:path} %{INT:status} %{INT:bytes}", this)
  | infer'
```
```mdtest-output
{client:192.168.1.1,method:"GET",path:"/api/users",status:200,bytes:1234}
{client:10.0.0.5,method:"POST",path:"/api/data",status:404,bytes:567}
```

`client` became an `ip` type, `status` and `bytes` became `int64`, while
`method` and `path` correctly stayed as strings. This means you can now do
things like `where status >= 400` or `where client in 10.0.0.0/8` without
manual casting.
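The string-to-native cleanup can be pictured as a per-field cast attempt. A toy Python version of the idea (the `infer_value` helper is hypothetical, and SuperDB's actual inference covers many more types than this sketch):

```python
import ipaddress

def infer_value(s):
    # try progressively different casts; fall back to leaving the string alone
    try:
        return int(s)
    except ValueError:
        pass
    try:
        return ipaddress.ip_address(s)
    except ValueError:
        return s

record = {"client": "192.168.1.1", "method": "GET",
          "path": "/api/users", "status": "200", "bytes": "1234"}
inferred = {k: infer_value(v) for k, v in record.items()}

print(inferred["status"], type(inferred["status"]).__name__)  # 200 int
print(inferred["client"], type(inferred["client"]).__name__)  # 192.168.1.1 IPv4Address
```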

## Unit tests in codebase

A value that doesn't match the pattern produces a structured error rather than
silently dropping the record:

```mdtest-command
super -s -c 'values "1", "foo" | grok("%{INT}", this)'
```
```mdtest-output
{}
error({message:"grok: value does not match pattern",on:"foo"})
```

## as of versions

```mdtest-command
super --version
```
```mdtest-output
Version: v0.3.0
```
package/docs/tutorials/index.md
@@ -0,0 +1,10 @@
---
title: Tutorials
layout: default
has_children: true
nav_order: 4
---

# Tutorials

Step-by-step guides covering common SuperDB patterns and techniques.
package/docs/tutorials/joins.md
@@ -0,0 +1,79 @@
---
title: "Joins"
name: joins
description: "Examples of outer joins, anti joins, and full outer joins in SuperDB."
layout: default
nav_order: 6
parent: Tutorials
superdb_version: "0.3.0"
last_updated: "2026-02-15"
---

# Joins

## Outer Joins

```mdtest-input za.sup
{id:1,name:"foo",src:"za"}
{id:3,name:"qux",src:"za"}
```
```mdtest-input zb.sup
{id:1,name:"foo",src:"zb"}
{id:2,name:"bar",src:"zb"}
```

Left Join (Left Only + Inner Joins — where clause required to eliminate inner joins)

`select *` includes columns from both sides. The right table's columns get a
`_1` suffix to avoid name collisions, and unmatched values are
`error("missing")`.
```mdtest-command
super -s -c "select * from za.sup as za
left join zb.sup as zb
on za.id=zb.id
where is_error(zb.name)"
```
```mdtest-output
{id:3,name:"qux",src:"za",id_1:error("missing"),name_1:error("missing"),src_1:error("missing")}
```

Right Join (Right Only + Inner Joins — where clause required to eliminate inner joins)
```mdtest-command
super -s -c "select * from za.sup as za
right join zb.sup as zb
on za.id=zb.id
where is_error(za.name)"
```
```mdtest-output
{id:error("missing"),name:error("missing"),src:error("missing"),id_1:2,name_1:"bar",src_1:"zb"}
```

Anti Join (Left Only exclusively — no where clause required)
```mdtest-command
super -s -c "select * from za.sup as za
anti join zb.sup as zb
on za.id=zb.id"
```
```mdtest-output
{id:3,name:"qux",src:"za",id_1:error("missing"),name_1:error("missing"),src_1:error("missing")}
```
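An anti join keeps only the left-side rows whose key has no match on the right — effectively a set difference on the join key. A minimal Python sketch of those semantics on the `za`/`zb` data above (it keeps only the left columns, unlike the `select *` output, which also carries the missing right-side fields):

```python
za = [{"id": 1, "name": "foo", "src": "za"},
      {"id": 3, "name": "qux", "src": "za"}]
zb = [{"id": 1, "name": "foo", "src": "zb"},
      {"id": 2, "name": "bar", "src": "zb"}]

# anti join: left rows whose id never appears on the right
right_ids = {r["id"] for r in zb}
anti = [r for r in za if r["id"] not in right_ids]
print(anti)  # [{'id': 3, 'name': 'qux', 'src': 'za'}]
```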

Full Outer (Left Only + Right Only + Inner Joins) - _BUG: Still behaves like Left Join — only returns left-side rows_
```mdtest-command
super -s -c "select * from za.sup as za
full outer join zb.sup as zb
on za.id=zb.id
where is_error(za.name) or is_error(zb.name)"
```
```mdtest-output
{id:3,name:"qux",src:"za",id_1:error("missing"),name_1:error("missing"),src_1:error("missing")}
```

## as of versions

```mdtest-command
super --version
```
```mdtest-output
Version: v0.2.0
```
package/docs/tutorials/moar_subqueries.md
@@ -0,0 +1,35 @@
---
title: "Moar Subqueries"
name: moar-subqueries
description: "Additional subquery patterns including fork and full sub-selects."
layout: default
nav_order: 10
parent: Tutorials
superdb_version: "0.2.0"
last_updated: "2026-02-15"
---

# Moar Subqueries

## Fork

One hassle with this approach is the limit of 2 forks. Nesting forks works, but
it makes constructing the query a bit more difficult.

## Full Sub-Selects

As of the 20250815 build, this is much, much slower. I'm guessing it's doing a
full reload of the data file each time.

```
select
  (select count(*)
   from './moar_subqueries.sup'
   where win is not null) as total_games,
...
```

## All Other SQL-Syntax Subqueries

They all take about the same amount of wall time, but CPU usage is much higher
due to re-reading the file each time.