@xcitedbs/client 0.2.14 → 0.2.15
- package/llms-full.txt +2 -0
- package/llms.txt +1 -1
- package/package.json +4 -2
- package/unquery-ai-guide.md +1139 -0
- package/unquery-grammar.md +414 -0
@@ -0,0 +1,1139 @@

# Unquery — AI guide for writing queries

> **Read this entire document before generating any Unquery query.** It is the single, self-contained reference an AI model needs (alongside the user's data schema) to write correct, idiomatic Unquery. For parser-level syntax detail and static-check rules, the companion document is [`docs/unquery-grammar.md`](./unquery-grammar.md) (also exposed as MCP resource `xcitedb://unquery-grammar`).

Unquery is a declarative language for querying and transforming structured documents (JSON and XML). It is the query language of XCiteDB and the `unq` command-line tool. Every query is itself a JSON document. The result is also JSON.

---

## 0. AI preamble — must-read before writing a query

Internalize these eight points first. They are the single largest source of mistakes in machine-generated Unquery.

1. **A query *is* JSON.** Wrap your output in the same JSON shape you want back. Strings inside the JSON are Unquery **expressions**, not literal text.
2. **Result mirrors template.** Objects produce objects, arrays produce arrays, strings produce single values. If the user wants a *list*, you need a *JSON array* in the query. If they want an *object*, use a JSON object. To repeat per-document, wrap in `[...]`.
3. **String constants need single quotes inside the expression.** `"FirstName"` reads the field `FirstName`. To return the literal text `FirstName`, write `"'FirstName'"`. Double quotes are JSON syntax, single quotes are Unquery string delimiters.
4. **Condition OR is single `|`. Context-OR is `||`.** Inside a `?`, `#if`, or constraint, write `a=1 | b=2`. Inside a key (between context modifiers), write `:.||Family[]`. Mixing them is a parse error.
5. **`<-` does not exist as an operator.** It is lexed as `<` then unary `-`. To compare to a negative literal, use `< -5` or parentheses.
6. **`..` is the parent step token; you almost always want `../`.** `../Field` means "go up, then read Field". `..Field` is a parse error.
7. **Directive keys (`#if`, `#var`, `#assign`, `#exists`, `#notexists`, `#return`, `#returnif`, `#func`) get parsed differently from regular keys.** `#if` value is a *condition* (no `?`, no `@sort`); the others are full values. Constraints attach to the value of directives, but to the *object* of regular keys.
8. **Aggregates (`$count`, `$sum`, `$avg`, `$min`, `$max`, `$prev`) can only be compared to literals**, not to other expressions. `"$avg(Salary)>100000"` is OK. `"$avg(Salary)>OtherField"` is a parse error.

### Decision shortcuts

| Want… | Use |
|---|---|
| Read field `X` | `"X"` |
| Constant string `'X'` in result | `"'X'"` |
| One value per document | bare object/string at root |
| One result accumulating across documents | bare object at root |
| Array of one value per document | `["expr"]` |
| Array of objects per document | `[{...}]` |
| Iterate inside a single document | context modifier `:Path[]` or `:{}` |
| Filter rows | predicate `?cond` or constraint `Field op val` |
| Filter the whole record | `#if` directive |
| Group by | dynamic key `$(GroupExpr)` + array value |
| Drop the wrapper object | `#return` |
| Pick one of several shapes | `#returnif` chain |
| Recursive transform of any JSON | `#func` returning `#returnif` per type |

---

## 1. Mental model

A query is a **template**. Evaluation walks the template top-down, possibly many times (once per document, plus extra passes when iterating with context modifiers, dynamic keys, or `||`).

- **Object in template → object in result.** Each key-value pair is evaluated in order. A key may produce zero, one, or many output keys.
- **Array in template → array in result.** Each evaluation pass *appends* a new element. Arrays are how you "broadcast" across documents or iterations.
- **String in template → single value.** A string is parsed as one Unquery expression and returns one JSON value (which can itself be a number, string, bool, object, or array).
- **Numbers / booleans / null in template → constants.** Literal JSON values are passed through unchanged.

### The "evaluate once vs. per pass" rule

Outside an array, a value is set once and **does not update** on later passes. So `{"name":"FirstName"}` over many docs returns just the first `FirstName`. Wrap in `[]` to get one per pass:

```json
[{"name":"FirstName"}]
```

Aggregates are the exception: they update on every visit even outside arrays. `{"total":"$sum(Salary)"}` does compute the sum.
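
The rule above can be modeled in a few lines of plain Python. This is only a sketch of the described behavior, not the Unquery engine; the sample documents and the variable names `bare`, `wrapped`, and `total` are made up for illustration.

```python
# Plain-Python model of "evaluate once vs. per pass" (illustrative only).
docs = [
    {"FirstName": "Ann", "Salary": 100},
    {"FirstName": "Bob", "Salary": 250},
    {"FirstName": "Cy", "Salary": 50},
]

bare = {}              # models {"name": "FirstName"}
wrapped = []           # models [{"name": "FirstName"}]
total = {"total": 0}   # models {"total": "$sum(Salary)"}

for doc in docs:  # one pass per document
    # Outside an array: the value is set on the first pass and then kept.
    bare.setdefault("name", doc["FirstName"])
    # Inside an array: every pass appends a new element.
    wrapped.append({"name": doc["FirstName"]})
    # Aggregates update on every pass, even outside an array.
    total["total"] += doc["Salary"]

print(bare)     # {'name': 'Ann'}
print(wrapped)  # one object per document
print(total)    # {'total': 400}
```

The bare object keeps only the first document's name, the array form yields one element per document, and the aggregate accumulates across all passes.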

### When does iteration happen?

- One pass per document in the input set (always).
- Extra passes from `:[]` (array traversal), `:{}` (object-key traversal), `**` (recursive descent), `||` (context-or), `->$children`/`->$descendants`/`->$ancestors` (XML), and from dynamic keys producing multiple values.

### What changes the *current path*?

- Path expressions in expressions (`Field.Sub`, `[0]`, `../`, `/`, `<<`).
- Context modifiers in keys (`:Path`, `:[]`, `:{}`, `:**`, `->...`).

---

## 2. Path expressions

A path selects a value from the current context.

| Form | Meaning | Example |
|---|---|---|
| `.` | Current value (initially the document root) | `["."]` returns each document |
| `Field` | Read named field | `"FirstName"` |
| `Field.Sub` | Nested field | `"Address.City"` |
| `` `field with spaces` `` | Backtick-quote a non-identifier name | `` "`First Name`" `` |
| `[n]` | Array index (0-based) | `"Dependants[0].FirstName"` |
| `[expr]` | Computed index | `"Field1[$index+1]"` |
| `[]` | Whole array (projection over all elements) | `"Array1[].Field1"` |
| `/Field` | Absolute from document root | `"/employees.$(.)"` |
| `../Field` | Up one path level (skips array indices) | `"../DBInstanceIdentifier"` |
| `<<Field` | Read same field in *previous* document context | `"<<Field1"` after a `->$file(...)` |
| `$(expr)` | Use the value of `expr` as the *next* path segment / dynamic key | `"$var(states).$(Address.State)"` |

Notes:

- `Field.$x` is **not** a continued path. After `.`, if the next token starts with `$`, the path stops and you get the field `Field` followed by an expression `$x`. To use a computed subfield, write `.$(x)` explicitly.
- Subfield works on any expression, not just paths: `"$var(x).Subfield1"`.
- `[]` on a value that is *not* an array silently treats it as a one-element array. This is convenient and almost never wrong.
- The `$()` evaluate function is *redundant* before a function token: write `$index`, not `$($index)`.

---

## 3. Expressions and operators

### 3.1 Arithmetic and string

| Op | Meaning | Notes |
|---|---|---|
| `+` | Add (numbers) or concatenate (strings) | Mixed types: behavior is engine-defined; cast first if unsure |
| `-` | Subtract | Unary minus only on numeric literal: `-5` |
| `*` | Multiply | |
| `/` | Divide | Integer/integer = integer (truncation). Force float: `14/$float(10)` → `1.4` |
| `mod` | Modulo | Word, not `%` |

Precedence: `*`, `/`, `mod` > `+`, `-`. Use parentheses freely: `(Field1+2)*5`.
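
The division rule in the table is easy to get wrong, so here is a small plain-Python model of it. The helper `unquery_div` is hypothetical and only mirrors the behavior described above; truncation toward zero for negative operands is an assumption, since the table only says "truncation".

```python
def unquery_div(a, b):
    """Model of `/` as described above: int/int truncates, otherwise float division."""
    if isinstance(a, int) and isinstance(b, int):
        # Assumption: truncation is toward zero (matters only for negatives).
        return int(a / b)
    return a / b

print(unquery_div(14, 10))         # 1    models "14/10"
print(unquery_div(14, float(10)))  # 1.4  models "14/$float(10)"
print((3 + 2) * 5, 3 + 2 * 5)      # 25 13  parentheses vs. default precedence
```

When a query needs a fractional result, casting one operand with `$float(...)` as shown in the table avoids the integer path entirely.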

### 3.2 String literals

Single quotes inside the expression: `'hello'`. Or escape double quotes: `"\"hello\""`. Backticks (`` `...` ``) are for **field names**, not string constants — never use them for literals.

### 3.3 Type-cast functions

`$string(x)`, `$number(x)`, `$int(x)`, `$float(x)`, `$bool(x)`. Common uses:

```json
"$int('5') + 1" // 6
"14/$float(10)" // 1.4 (force float division)
"$sum($number(value))" // sum of strings that look like numbers
"$number(.)" // smart-cast to int/float depending on content
```

### 3.4 Identifiers and variables

- `$var(name)` reads a variable declared via `#var` or `#assign`.
- `%name` is shorthand for `$var(name)` and parses identically.

---

## 4. Conditions

Conditions appear in five places. The grammar is the same everywhere, except that `#if` cannot have `?`/`@sort`.

| Place | Syntax | Effect |
|---|---|---|
| **Predicate** on a value | `"expr ? cond"` | Skip emitting the value when false. Aggregates skip the update. |
| **Constraint** (object field only) | `"expr op rhs"` (no `?`) | If false, the *whole object* is skipped. |
| **`#if`** directive | `{"#if":"cond", ...}` | If false, the whole object is skipped. Same effect as constraint, but explicit. |
| **`$if(cond, then, else)`** | function call | Inline ternary. |
| **Predicate after a context modifier** | `"key:Path[]?cond"` | Filters which traversal items run the sub-template. |

### 4.1 Comparators

`=`, `!=`, `<`, `>`, `<=`, `>=`. (Single `=`, not `==`.)

### 4.2 Boolean glue

- AND: `&` (higher precedence)
- OR: `|` (lower precedence) — **single pipe**, never `||` inside a condition
- NOT: `!cond`
- Parentheses: `(...)`

```text
x=5 | (y>7 & x=z)
Field1 contains 'Developer' & Field2 matches 'A.*b'
```
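
Because `&` binds tighter than `|`, the parentheses in the first example are optional. The plain-Python sketch below models that precedence (Python's `and`/`or` happen to have the same relative precedence); the function names are hypothetical.

```python
def cond_flat(x, y, z):
    # Models  x=5 | y>7 & x=z  with `&` binding tighter than `|`.
    return x == 5 or y > 7 and x == z

def cond_explicit(x, y, z):
    # Models the parenthesized form  x=5 | (y>7 & x=z).
    return x == 5 or (y > 7 and x == z)

def cond_left_grouped(x, y, z):
    # Models  (x=5 | y>7) & x=z, which is a different condition.
    return (x == 5 or y > 7) and x == z

print(cond_flat(5, 0, 1), cond_explicit(5, 0, 1))  # True True
print(cond_left_grouped(5, 0, 1))                  # False
```

If the left-grouped meaning is intended, the parentheses must be written out.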

### 4.3 Existence

Postfix `!` after a path: `"Field1.Field2!"` is true iff `Field2` exists in `Field1`. Works as a constraint too:

```json
{ "key1":"Field1!", "key2":"value_expr" }
```

### 4.4 Type tests (no RHS)

`is_array`, `is_object`, `is_literal`, `is_string`, `is_number`, `is_int`, `is_float`, `is_bool`. Form: `"path is_int"`. Example — keep only ints from a mixed array:

```json
{ "numbers:[]": [". is_int"] }
```

### 4.5 String tests

`contains`, `starts_with`, `ends_with`, `matches` (regex). Example:

```text
Title contains 'Developer'
folder starts_with 'User/Admin'
Name matches 'A.*b'
```

### 4.6 Set tests

`in` and `not_in`. RHS must be array-shaped:

```text
Color in ['red','green','blue']
Status not_in BadStatuses[]
```

### 4.7 Predicate vs. constraint — subtle but important

Inside an **object field**, these differ:

```json
{ "key1": "value1?x!=1", "key2": "value2" } // predicate: skips key1 only
{ "key1": "value1 != 1", "key2": "value2" } // constraint: skips the WHOLE object
```

The constraint form is equivalent to writing `{"#if":"value1!=1", ...}` with the same fields.
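
The difference between the two forms can be sketched in plain Python. This models only the behavior described above, not the engine; the record fields are made up, and it assumes that a passing constraint still emits the left-hand expression (`value1`) as the key's value.

```python
def with_predicate(rec):
    # Models {"key1": "value1?x!=1", "key2": "value2"}
    out = {}
    if rec["x"] != 1:             # predicate false -> only key1 is skipped
        out["key1"] = rec["value1"]
    out["key2"] = rec["value2"]
    return out

def with_constraint(rec):
    # Models {"key1": "value1 != 1", "key2": "value2"}
    if rec["value1"] == 1:        # constraint false -> whole object is skipped
        return None
    # Assumption: key1 carries the value of the left-hand expression.
    return {"key1": rec["value1"], "key2": rec["value2"]}

failing = {"value1": 1, "value2": "b", "x": 1}
passing = {"value1": 7, "value2": "b", "x": 2}

print(with_predicate(failing))   # {'key2': 'b'}
print(with_constraint(failing))  # None
print(with_predicate(passing))   # {'key1': 7, 'key2': 'b'}
```

A predicate trims one field; a constraint decides whether the object exists at all.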

### 4.8 Aggregate-in-condition rule

You may compare an aggregate to a **literal**:

```text
$count > 1
$avg(Age) > 40
```

You may **not** compare two non-literal sides where one is an aggregate. The parser rejects `"$avg(Age) > Threshold"`.

---

## 5. Object keys

Keys are not just constant strings. The first character (or first token) decides what kind of key it is.

### 5.1 Constant keys

Alphanumeric/underscore identifiers, or backtick-quoted strings. The output key has the same name.

```json
{ "FirstName": "FirstName" }
```

### 5.2 Dynamic keys — `$(expr)`

Evaluate `expr` and use the result as the key name. Each evaluation can add a new key.

```json
{ "$(LastName)": "." } // dictionary by last name
{ "$(LastName+' '+FirstName)": "." } // composite key
```

`$(...)` is **redundant** when the expression is itself a `$function` token. Use `"$index"` not `"$($index)"`. Use `"$key"` not `"$($key)"`.

### 5.3 Group-by pattern

A dynamic key with an **array** value collects all matching values per group:

```json
{ "$(bin)": ["value"] }
```

Equivalent to SQL `GROUP BY bin`. Combine with aggregate values for grouped stats:

```json
{ "$(Title)": "$avg(Salary)" }
```
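
For readers who think in general-purpose code, the two group-by shapes above correspond roughly to the following plain-Python sketch. The sample rows are made up, and this is a model of the pattern rather than the engine's implementation.

```python
from collections import defaultdict

rows = [
    {"Title": "Dev", "Salary": 100},
    {"Title": "Dev", "Salary": 200},
    {"Title": "QA", "Salary": 90},
]

# Models {"$(Title)": ["Salary"]}: one array of values per distinct key.
groups = defaultdict(list)
for row in rows:
    groups[row["Title"]].append(row["Salary"])

# Models {"$(Title)": "$avg(Salary)"}: one aggregate per distinct key.
averages = {title: sum(vals) / len(vals) for title, vals in groups.items()}

print(dict(groups))  # {'Dev': [100, 200], 'QA': [90]}
print(averages)      # {'Dev': 150.0, 'QA': 90.0}
```

The dynamic key plays the role of the dictionary key, and the value template decides whether each group collects items or folds them into an aggregate.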

### 5.4 Copy-all keys — `{}` and `{'regex'}`

`{}` evaluates to *every* key in the queried object. `{'regex'}` evaluates to keys matching the regex.

```json
{ "{}": "value" } // every key gets the constant 'value' (rarely useful)
{ "{}:": "." } // copy every key → its value (very useful)
{ "{'A.*B'}:": "." } // copy only keys matching the regex
{ "{}:": ".?$key!='ID'" } // copy all except ID (predicate filter)
```

The trailing `:` after `{}` is a context modifier ("for each key, switch context to that key"). Without `:`, `{}` is just the empty-key spec.

### 5.5 Key utility functions

- `$key` — name of the current key in the queried object (during traversal).
- `$reskey` — name of the *result* key (useful in `#if` / dynamic situations).
- `$path` — the full current path as a string.
- `$index` — the last numeric index in the path.

---

## 6. Context modifiers

Context modifiers appear **in the key**, after the key name, and scope the value (object/array/string under that key). Two families:

- **Path modifiers** start with `:` — they change the path within the current document.
- **Identifier/document modifiers** start with `->` (or `:->`) — they switch document, XML node, branch, or date.

A modifier may be followed by `?cond` (predicate). Modifiers chain freely.

### 6.1 Path modifiers

| Form | Meaning |
|---|---|
| `:Path` | Switch context to `Path` |
| `:` (nothing after) | Repeat the key name as the path: `"Field1:"` ≡ `"Field1:Field1"` |
| `:[]` | Iterate over each array element |
| `:[n]` | Switch to array index `n` |
| `:{}` | Iterate over each object field |
| `:{'regex'}` | Iterate over fields matching regex |
| `:**` | Recursive descent (every path under current) |
| `:.` | Stay at current path (rarely needed) |

```json
{ "Children:Dependants[]?Relation='Child'": ["FirstName+' '+LastName"] }
{ "result:Customers[]?Balance>100000:Accounts[]": ["accountNumber"] }
{ "result:{}": ["."] }
{ "#return:**": ["$key@unique_ascending"] }
```

### 6.2 The `:Field` shorthand (very common)

`"FirstName:"` is identical to `"FirstName:FirstName"`. Use it whenever the result key matches the source field:

```json
[{ "FirstName:":".", "LastName:":".", "NumOfChildren:Dependants[]":"$count?Relation='Child'" }]
```

### 6.3 Identifier / document modifiers — `->`

Switch the current "document". After a `->`, the path resets to the new document's root.

| Form | Meaning |
|---|---|
| `->$file('name.json')` | Read another JSON file as new context |
| `->$var(name)` | Switch to JSON stored in variable `name` |
| `->$date('YYYY-MM-DD')` | Switch temporal context (XCiteDB only) |
| `->$branch('name')` | Switch branch / workspace (XCiteDB only) |
| `->$self`, `->$parent`, `->$children`, `->$descendants`, `->$descendants_and_self`, `->$all` | **Structural shortcuts** — parser lowers each to an identifier expression AST (`TQIdExpr`) with the same runtime as the equivalent path form listed in the next row (see §6.3.3). |
| Path equivalents | `->$self` → `->./`; `->$parent` → `->../`; `->$children` → `->./[]`; `->$descendants` → `->./[]/**` (descendants only); `->$descendants_and_self` → `->./**`; `->$all` → root children + recursive descent under `/` (every identifier under `/`). |
| `->$ancestors`, `->$ancestors_and_self` | Named-only structural shortcuts (same lowering idea; no single-segment path spellings). |
| `->"/abs/path"` | Switch to the named XML identifier (also an `identifierExpression` with one quoted segment) |
| `->/abs/...`, `->./...`, `->../...`, `->[]/...`, `->**/...`, `->seg/sub/...`, `->$(expr)/...`, `->/path[]`, `->/path/**` | **Identifier expression** — evaluates to **zero or more** identifiers (see §6.3.1, §6.3.3). |
| `->$each(expr)` | Evaluate `expr` as JSON; if string or array of strings, iterate each as an identifier (must not be followed by `/` in the same modifier) |
| `->Field` *(deprecated)* | JSON-field lookup: string or array of identifier strings (runtime deprecation warning; migrate to `->$each(Field)` or an `identifierExpression`) |
| `->$(expr)` *(deprecated)* | Same lookup using `expr`’s string value as the field name (deprecation warning) |

**Decision:** iterate **direct children** of a path → `->/path/to/node[]` or `->$children` from that node (after switching context with `->/path/to/node` if needed).

When chaining identifier switches, the colon between them is optional: `:->$parent->$parent`.

#### 6.3.1 Identifier expressions

Use these when the path after `->` should be composed as an XCiteDB identifier (always normalized to a leading `/` at evaluation when needed), instead of reading a JSON field name. Evaluation can produce **multiple** identifiers; the sub-template runs once per identifier (same as `->$children` / `->$each` iteration).

```json
{ "balance:->/tables/bank/accounts/$(accountID)": "amount" }
{ "holder:->../accounts/$(accountID)": "name" }
{ "msg:->../../audit/$(customerID)/$(date)": "text" }
{ "title:->$each(relatedSectionIds)": "heading" }
```

- **`/`** — absolute from project root.
- **`./`** — continue under the **current** identifier.
- **`../`** — go to the parent identifier. `../` may be repeated only in the anchor (the leading part of the path); each additional `../` climbs one more level.
- **Mid-slash paths** like `->customers/$(id)` — relative to current identifier (first segment is under current id).
- **`[]`** — after the path built so far, expand to every **direct child** identifier (LMDB / `id_hier` when available).
- **`**`** — after the path built so far, expand to that identifier **plus all descendants** (prefix scan).
- **`[N]`** and non-empty `[…]` brackets are not allowed inside these paths (parse error); use `$(expr)` for dynamic segments.

#### 6.3.2 Migrating from deprecated arrow forms

| Before | After (Phase 1) |
|--------|------------------|
| `->relatedIds` where `relatedIds` is `["/a/1","/a/2"]` | `->$each(relatedIds)` |
| `->relatedIds` where `relatedIds` is a single string path | `->$each(relatedIds)` (works for one string too) |
| You meant a **child** path under the current identifier | `->./childName/...` or an absolute `->/prefix/...` |
| `->$(nameField)` with no `/` (double indirection via JSON) | Keep until Phase 2; prefer `->$each(...)` if you only need iteration semantics |

#### 6.3.3 Set-valued identifier expressions

- **`->/registry/foo[]`** — run the remainder of the query once per direct child of `/registry/foo`.
- **`->/registry/foo/**`** — run for `/registry/foo` and every descendant identifier under it.
- **Anchor-less** `->[]/…` and `->**/…` are sugar for `->./[]/…` and `->./**/…` (start at the current identifier).
- **Ordering** of produced ids follows LMDB / index iteration (stable for a given DB snapshot); treat as implementation-defined like JSON-path `[]` iteration.
- **Predicates:** `?cond` after the modifier applies **per produced identifier** (the modifier is wrapped in `TQValueWithCond` as usual).

Dynamic `$(expr)` / `%var` segments in Phase 2 are still evaluated **once** for the whole step (not per frontier element); per-element evaluation is a possible future extension.

#### 6.3.4 Grammar reference

Full EBNF and static rules: **`docs/unquery-grammar.md` §5.1** (identifier expressions) and **§8.2** (arrow dispatch).

### 6.4 Chaining

Write modifiers back-to-back. Each is a predicate-eligible step.

```json
{ "result:[]:{}":["."] } // array → each element → each field
{ "result->$file('employees.json'):Employees[]":["FirstName"] }
```

### 6.5 Context-OR — `||`

Try **multiple** context paths and union the results. Each branch starts from the same point.

```json
{ "names:.||Family[]": ["FirstName"] }
{ "names:.||Dependants[]": ["FirstName+' '+LastName"] }
```

`||` is a **lexer-level merge** of two `|` characters. It is recognized **only** in context-modifier position. Do **not** use `||` in a condition — that's a parse error; use single `|` for boolean OR.
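
As a rough mental model, `:.||Family[]` runs the value template once for the current context and once for each `Family` element, then collects everything into the same array. The plain-Python sketch below illustrates that reading; the sample document is made up, and visiting the branches left to right is an assumption about result order.

```python
doc = {
    "FirstName": "Ann",
    "Family": [{"FirstName": "Bob"}, {"FirstName": "Cy"}],
}

def contexts(current):
    # `.` branch: the current context itself.
    yield current
    # `Family[]` branch: each element of the Family array, from the same start point.
    yield from current.get("Family", [])

# Models {"names:.||Family[]": ["FirstName"]}
names = [ctx["FirstName"] for ctx in contexts(doc) if "FirstName" in ctx]

print({"names": names})  # {'names': ['Ann', 'Bob', 'Cy']}
```

Both branches start from the same document, which is why the top-level `FirstName` and the nested ones end up side by side.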

### 6.6 Context predicates

Add `?cond` after each modifier step (not just at the end):

```json
{ "result:Customers[]?Balance>100000:Accounts[]":["accountNumber"] }
{ "Dependants:Dependants[]?$index<2": ["FirstName+' '+LastName"] }
```

### 6.7 The `<<` previous-document operator

After switching to a different document with `->`, `<<Field` reads `Field` from the *original* document. Useful for joins:

```json
{
"result->$file('another.unq')": {
"key1":"Field1",
"key2":"<<Field1"
}
}
```

---

## 7. Sorting

Append a sort specifier at the very *end* of a value string, after the expression and any predicate; nothing may follow it. Four specifiers:

- `@ascending`
- `@descending`
- `@unique_ascending` (also dedupes)
- `@unique_descending` (also dedupes)

```json
["FirstName@ascending"]
["FirstName@unique_ascending"]
```

### 7.1 Multi-key sort

For arrays of objects, every sortable field carries its own specifier and a priority `(n)` (lower number = higher priority):

```json
[{ "firstname":"FirstName@descending(2)", "lastname":"LastName@ascending(1)" }]
```

Sorts by `lastname` ascending first, then `firstname` descending.
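
The priority rule can be checked against a plain-Python equivalent. The sketch below is an illustration of the described ordering, not the engine's sort; the sample rows are made up. It relies on Python's stable sort: sorting by the lower-priority key first and the higher-priority key last gives the combined order.

```python
people = [
    {"firstname": "Ann", "lastname": "Smith"},
    {"firstname": "Zed", "lastname": "Smith"},
    {"firstname": "Bob", "lastname": "Jones"},
]

# Priority (2): firstname descending, applied first.
by_first_desc = sorted(people, key=lambda p: p["firstname"], reverse=True)
# Priority (1): lastname ascending, applied last so it dominates.
result = sorted(by_first_desc, key=lambda p: p["lastname"])

for person in result:
    print(person["lastname"], person["firstname"])
# Jones Bob
# Smith Zed
# Smith Ann
```

Within the two `Smith` rows, `Zed` precedes `Ann` because the secondary key is descending.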

### 7.2 Pitfall — multiple items in one array

`["FirstName@ascending","LastName"]` has *undefined* sort order. Only one element should drive the sort, or every element must use the same specifier:

```json
["FirstName@ascending","LastName@ascending"]
```

### 7.3 `@sort` placement

`@sort` ends parsing of the value string, so nothing can follow it. Put predicates and constraints *before* the `@`:

```text
"Salary?Title='Dev'@descending" // OK: expr ? cond @sort
"Salary>0 @descending" // OK: expr constraint @sort
```

---

## 8. Directives

Directives are object keys starting with `#`. They are evaluated in the order they appear inside the object.

### 8.1 `#var` — declare a variable

```json
{ "#var x": "1000", "value": "$var(x)" } // 1000
{ "#var x": "'Some string'", "v": "%x" } // %x ≡ $var(x)
```

A variable can hold any JSON value — including the result of running a sub-template:

```json
{
"#var dic:Employees[]": { "$(EmployeeId)": "." },
"Employee1": "$var(dic).1001"
}
```

Variables are **scoped to the enclosing object**. Inner `#var x` shadows outer `#var x` for the duration of that object only.
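
The dictionary variable and the shadowing rule above can both be modeled in plain Python. This is an illustrative sketch only: the sample document is made up, and `ChainMap` is used merely as a convenient stand-in for nested scopes, not as a claim about how the engine stores variables.

```python
from collections import ChainMap

doc = {"Employees": [
    {"EmployeeId": 1001, "FirstName": "Ann"},
    {"EmployeeId": 1002, "FirstName": "Bob"},
]}

# Models "#var dic:Employees[]": {"$(EmployeeId)": "."}
# JSON object keys are strings, so the ids become string keys.
dic = {str(emp["EmployeeId"]): emp for emp in doc["Employees"]}

# Models "Employee1": "$var(dic).1001"
result = {"Employee1": dic["1001"]}
print(result["Employee1"]["FirstName"])  # Ann

# Scoping: an inner "#var x" shadows the outer one only inside its object.
outer_scope = ChainMap({"x": 1000})
inner_scope = outer_scope.new_child({"x": 1})
print(inner_scope["x"], outer_scope["x"])  # 1 1000
```

Building the lookup dictionary once and indexing into it is the same idea as a join table in general-purpose code.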

### 8.2 `#assign` — update an existing variable

```json
{
"#var x": "1",
"obj": { "#assign x": "2" },
"x_value": "$var(x)" // 2
}
```

If the variable was never declared, `#assign` behaves like `#var`.

### 8.3 `#if` — skip the object when condition is false

The value of `#if` is parsed as a **condition** — no `?`, no `@sort`, no leading expression-then-predicate.

```json
[{
"#if": "Title!='CEO'",
"FirstName:": ".",
"LastName:": "."
}]
```

`#if` accepts a context modifier:

```json
[{
"#if:Dependants[]?Relation='Child'": "$count>1",
"FirstName:": ".",
"LastName:": "."
}]
```

### 8.4 `#exists` / `#notexists` — value-non-empty / value-empty test

Used like `#if`, but the *value* is a normal Unquery template; the directive tests whether the template produced any non-empty result.

```json
[{
"#exists:Employees[]": ["Salary>100000"],
"company": "CompanyName"
}]
[{
"#notexists:Employees[]": ["Salary<30000"],
"company": "CompanyName"
}]
```

### 8.5 `#return` — flatten the wrapper object

A `#return` makes the *value* of the directive be the entire object's result, dropping any other fields and the wrapping object itself. This deliberately breaks preamble point 2 ("Result mirrors template"), to escape the "you must wrap in an object to use a context modifier" limitation.

```json
{ "result:Employees[]": ["FirstName"] } // {"result":[ ...names... ]}
{ "#return:Employees[]": ["FirstName"] } // [ ...names... ]
```

Rules:

- The first `#return` in an object wins.
- Other fields in the same object are dropped.

### 8.6 `#returnif` — first non-empty wins

Like `#return`, but only effective when the value is non-null/non-empty. Multiple `#returnif`s form an ordered fallback chain.

```json
{
"#returnif:Tags[]?Key='Name'": "Value",
"#return": "'N/A'"
}
```

This pattern is the canonical "default value" idiom: try a tagged value; if missing, fall back to the literal string.
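
In general-purpose terms, the fallback chain above behaves like a function with early returns. The plain-Python sketch below models it; the function name `name_tag` and the sample resources are hypothetical, and the code is an illustration of the idiom rather than the engine's logic.

```python
def name_tag(resource, default="N/A"):
    # Models "#returnif:Tags[]?Key='Name'": "Value"
    for tag in resource.get("Tags", []):
        if tag.get("Key") == "Name" and tag.get("Value"):
            return tag["Value"]   # first non-empty value wins
    # Models the trailing "#return": "'N/A'"
    return default

tagged = {"Tags": [{"Key": "Env", "Value": "prod"}, {"Key": "Name", "Value": "db-1"}]}
untagged = {"Tags": [{"Key": "Env", "Value": "prod"}]}

print(name_tag(tagged))    # db-1
print(name_tag(untagged))  # N/A
```

Adding more `#returnif` entries before the final `#return` corresponds to adding more early-return checks in order.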

`#returnif` is the engine of polymorphic / recursive transforms (see §10 cookbook).

### 8.7 `#func` — declare a (possibly recursive) function

```json
{ "#func fullname": "FirstName+' '+LastName",
"names:Employees[]": ["$fullname"] }

{ "#func fullname(x,y)": "$var(x)+' '+$var(y)",
"names:Employees[]": ["$fullname(FirstName, LastName)"] }
```

Rules:

- Function name and arity are registered in the symbol table when the `#func` key is parsed. Every call `$name(...)` must match that arity exactly.
- A function body is a full Unquery template subtree (string, object, or array).
- Functions can be recursive — call themselves from inside their own body. This is how generic JSON transforms are written (§10).
- `$call(name)` invokes a zero-arg function by string name (rare; usually just `$name`).

---

## 9. Built-in functions and aggregates

### 9.1 Function reference table

| Function | Arity | Returns |
|---|---|---|
| `$()` (evaluate) | 1 | Value of the inner expression treated as a path/dynamic key |
| `$if(c,a,b)` | 3 | `a` if `c` else `b` |
| `$call(name)` | 1 (ident) | Result of `#func name` (zero-arg) |
| `$var(name)` / `%name` | 1 (ident) | Variable value |
| `$file(expr)` | 1 | Parsed JSON from file at `expr` |
| `$csv(file [,delim] [,hdr])` | 1–3 | Array of objects from a CSV file (delim default `,`, headers default `true`) |
| `$env(expr)` | 1 | Environment variable named by `expr` |
| `$now` | 0 | Current Unix epoch (resolved at parse time) |
| `$path` | 0 | Current path as string |
| `$index` | 0 | Last numeric index in the current path |
| `$key` | 0 | Last key in the current path |
| `$reskey` | 0 | Result key being emitted |
| `$filename` | 0 | Current source filename |
| `$identifier` | 0 | Full XML identifier (XCiteDB) |
| `$identifier(n)` | 1 (int) | Identifier path truncated before the n-th `/` |
| `$lower(expr)` / `$upper(expr)` | 1 | Case-changed string |
| `$length(expr)` | 1 | String length |
| `$size(expr)` | 1 | Array size |
| `$substr(s, start [,len])` | 2–3 | Substring |
| `$find(s, t)` | 2 | Array of indices where `t` occurs in `s` |
| `$ifind(s, t)` | 2 | Like `$find`, case-insensitive |
| `$replace(s, from, to [,all?])` | 3–4 | String with replacements |
| `$split(s, delim)` | 2 | Array of substrings |
| `$join(arr [,delim])` | 1–2 | Joined string (delim default empty) |
| `$to_time(s, fmt)` | 2 | Unix epoch from `strptime`-formatted string |
| `$time_to_str(t, fmt)` | 2 | `strftime`-formatted string |
| `$D"..."` | literal | Date literal → epoch via `stringToTime` |
| `$string` `$number` `$int` `$float` `$bool` | 1 | Type cast |
| `$node_date` | 0 | XML node revision time (epoch) |
| `$data_date [(expr)]` | 0–1 | Metadata field last-change time (epoch) |
| `$xml` / `$xml_no_children` | 0 | Current XML node as string (with/without children) |
| `$node` | 0 | Tag name of the root element of the current XML node |
| `$attr("name")` | 1 (quoted) | XML attribute value |
| `$child("name")` | 1 (quoted) | Direct XML child element as XML |
| `$xpath("expr")` / `$lxpath("expr")` | 1 (quoted) | First XPath match (leaf-style for `lxpath`) |
| `$text(expr)` | 1 | Concatenated text content of an XML node |
| `$in_filter("name")` | 1 (quoted) | True if current XML element belongs to the named filter group |
|
|
633
|
+
|
|
634
|
+
Notes on quoted-string args: `$attr`, `$child`, `$xpath`, `$lxpath`, `$in_filter` require a **string literal in single or double quotes**, not an expression.

### 9.2 Aggregate functions

Aggregates **update on every visit** to their position. They live "outside" the per-pass value rule.

| Aggregate | Arity | Notes |
|---|---|---|
| `$count` | 0 | Number of visits (use `?cond` to count selectively) |
| `$sum(expr)` | 1 | Sum |
| `$avg(expr)` | 1 | Average |
| `$min(expr)` | 1 | Minimum |
| `$max(expr)` | 1 | Maximum |
| `$prev(default)` | 1 | First visit returns `default`; subsequent visits return the value of the enclosing expression on the previous visit |

`$prev` lets you build custom reductions:

```text
"$prev('') + Text"     // concatenate Text from every document
"$prev(0) + 1"         // count via reduction
```
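
One way to read `$prev` is as a left fold: the first visit starts from `default`, and every later visit receives whatever the whole expression produced last time. The Python sketch below models that reading for the two expressions above. `reduce_with_prev` and the sample documents are assumptions made for illustration, not engine code.

```python
from typing import Any, Callable, Iterable


def reduce_with_prev(docs: Iterable[dict], default: Any,
                     step: Callable[[Any, dict], Any]) -> Any:
    """Model an expression that contains $prev(default).

    `prev` is `default` on the first visit; afterwards it is the value the
    whole expression produced on the previous visit.
    """
    prev = default
    for doc in docs:
        prev = step(prev, doc)
    return prev


# Hypothetical corpus of three documents.
docs = [{"Text": "a"}, {"Text": "b"}, {"Text": "c"}]

# "$prev('') + Text": concatenate Text from every document.
concatenated = reduce_with_prev(docs, "", lambda prev, d: prev + d["Text"])

# "$prev(0) + 1": count via reduction.
counted = reduce_with_prev(docs, 0, lambda prev, d: prev + 1)
```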

### 9.3 Aggregates with arrays / context modifiers

- Aggregate at the *top level* gives a single overall value.
- Aggregate *inside an array element with no context modifier* gives one update per pass — usually just an array of single values, not what you want.
- Aggregate inside a context-traversal **does** what you expect — one accumulator per outer pass:

```json
[{
  "name": "FullName",
  "avgFamilyAge:Family[]": "$avg(Age)"
}]
```
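
As a rough model of "one accumulator per outer pass", the Python sketch below computes the result the template above is meant to produce. The sample employees are invented, every `Family` array is assumed to be non-empty, and the code only mirrors the intended output shape rather than how the engine evaluates it.

```python
# Hypothetical input documents (one per employee).
employees = [
    {"FullName": "Ann Lee", "Family": [{"Age": 40}, {"Age": 10}]},
    {"FullName": "Bob Ray", "Family": [{"Age": 30}]},
]


def avg_family_age(employee: dict) -> float:
    # A fresh accumulator is used for each employee (each outer pass),
    # so ages from different employees are never mixed.
    ages = [member["Age"] for member in employee["Family"]]
    return sum(ages) / len(ages)


result = [
    {"name": e["FullName"], "avgFamilyAge": avg_family_age(e)}
    for e in employees
]
```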

### 9.4 Aggregates with predicates

A `?cond` after an aggregate **skips the update** when false:

```text
"$avg(Age) ? Age >= 18"
"$count ? Salary > 200000"
```
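
In plain Python terms, a predicate on an aggregate behaves like a filter applied before the accumulation step. The sketch below uses invented sample records and assumes at least one record passes the age filter.

```python
# Hypothetical records visited by the aggregate.
people = [
    {"Age": 12, "Salary": 0},
    {"Age": 30, "Salary": 250000},
    {"Age": 50, "Salary": 150000},
]

# "$avg(Age) ? Age >= 18": visits with Age < 18 do not update the average.
adult_ages = [p["Age"] for p in people if p["Age"] >= 18]
avg_adult_age = sum(adult_ages) / len(adult_ages)

# "$count ? Salary > 200000": only matching visits are counted.
high_earners = sum(1 for p in people if p["Salary"] > 200000)
```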

### 9.5 Aggregates in conditions

Allowed only against literals (see §0.8 and §4.8):

```json
[{
  "name": "FullName",
  "avgFamilyAge:Family[]": "$avg(Age) > 40"
}]
```

---

## 10. Recipe cookbook

Each recipe shows a small input-shape sketch, the Unquery template, and (where useful) the equivalent jq one-liner for cross-reference.

### 10.1 Pick one field per document → array

```json
["FirstName"]
```

Sorted, unique:

```json
["FirstName@unique_ascending"]
```

### 10.2 Multiple fields per document → array of objects

```json
[{
  "fullname": "FirstName+' '+LastName",
  "title": "Title",
  "city": "Address.City"
}]
```

### 10.3 Filter rows by predicate

```json
["LastName?Salary>200000"]
```

Filter by string contains (constraint form):

```json
[{
  "FirstName": "FirstName",
  "LastName": "LastName",
  "Title": "Title contains 'Developer'"
}]
```

Filter by `#if`:

```json
[{
  "#if": "$size(Dependants)>=3",
  "FirstName": "FirstName",
  "LastName": "LastName"
}]
```

### 10.4 Filter array of objects by inner field — return field

`jq '.[] | select(.location=="Stockholm") | .name'`

```json
{ "#return:{}": ["name?location='Stockholm'"] }
```

Or using `[]` traversal directly (when input is an array of objects):

```json
{ "#return:[]": ["name?location='Stockholm'"] }
```

### 10.5 Recursive descent — collect every value under a key name

`jq '..|.foo? // empty'`

```json
{ "#return:**?$key='foo'": "." }
```

### 10.6 Filter array, return only selected field

`jq '. - map(select(.Names[] | contains("data"))) | .[].Id'`

```json
{
  "#return:[]": [{
    "#notexists:Names[]": ". contains 'data'",
    "#return": "Id"
  }]
}
```

### 10.7 Multi-shape result with `#returnif` chain

```json
[{
  "#returnif:StatusInfos[]?StatusType='read replication'": {
    "RepStatus": "Status",
    "RepStatusType": "StatusType",
    "ReadReplicaSourceDBInstanceIdentifier": "../ReadReplicaSourceDBInstanceIdentifier",
    "DBInstanceIdentifier": "../DBInstanceIdentifier"
  },
  "#return": "."
}]
```

If any `StatusInfos` element matches the predicate, the projection is returned; otherwise the whole object is returned.

### 10.8 Default value via `#returnif` + `#return`

```json
{
  "#returnif:Tags[]?Key='Name'": "Value",
  "#return": "'N/A'"
}
```
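
The `#returnif` plus `#return` pattern reads like an early return followed by a fallback. The Python sketch below models the default-value recipe under two assumptions that the text does not state: the first matching tag wins, and a missing `Tags` array is treated like an empty one. `name_tag` is a hypothetical helper.

```python
def name_tag(resource: dict, default: str = "N/A") -> str:
    """Return the Value of the tag whose Key is 'Name', else a default."""
    for tag in resource.get("Tags", []):
        if tag.get("Key") == "Name":
            # Models "#returnif:Tags[]?Key='Name'": "Value"
            return tag["Value"]
    # Models "#return": "'N/A'"
    return default


# Hypothetical inputs.
tagged = {"Tags": [{"Key": "Env", "Value": "prod"},
                   {"Key": "Name", "Value": "web-1"}]}
untagged = {"Tags": [{"Key": "Env", "Value": "prod"}]}

tagged_name = name_tag(tagged)
untagged_name = name_tag(untagged)
```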

### 10.9 Drop a column from every record

`jq '[.[] | del(.timestamp)]'`

```json
{
  "#return:[]": [{
    "{}:": ".?$key!='timestamp'"
  }]
}
```

### 10.10 Single-element array trick — keep only `[0]` from a sub-array

```json
{
  "blah0:": ".",
  "Array:Array[]?$index=0": ["."]
}
```

### 10.11 Boolean-OR predicate inside array element

`jq '.theList | map(select(.id==2 or .id==4))'`

```json
{ "#return:theList[]": [".?id=2 | id=4"] }
```

### 10.12 Build a lookup table

`jq 'map({(.name):.id}) | add'`

```json
{ "#return:{}": { "$(name)": "id" } }
```
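
For readers more familiar with general-purpose languages, the lookup table is a dictionary keyed by one field. The Python sketch below mirrors the jq one-liner on an invented input and assumes the input is a list of records; as with jq's `add`, a later record overwrites an earlier one with the same name.

```python
# Hypothetical input records.
records = [
    {"name": "alpha", "id": 1},
    {"name": "beta", "id": 2},
]

# Same result shape as: jq 'map({(.name):.id}) | add'
lookup = {record["name"]: record["id"] for record in records}
```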

### 10.13 Group-by + sum across nested arrays

Roughly the jq `reduce .[] as $o (null; . + ($o | f))` style.

```json
{
  "#return:[]:cost[]": {
    "$(../account)": "$sum($float(totalcost))"
  }
}
```
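
The grouping above can be pictured as a nested loop that adds into one accumulator per account. The Python sketch below assumes an input shape inferred from the template (a top-level array whose elements carry an `account` field and a `cost` array with string `totalcost` values); the sample data is invented.

```python
from collections import defaultdict

# Hypothetical input, shaped to match the template's paths.
records = [
    {"account": "A", "cost": [{"totalcost": "1.5"}, {"totalcost": "2.5"}]},
    {"account": "B", "cost": [{"totalcost": "4"}]},
    {"account": "A", "cost": [{"totalcost": "1"}]},
]

totals = defaultdict(float)
for record in records:               # "#return:[]"  : each top-level element
    for item in record["cost"]:      # ":cost[]"     : each nested cost entry
        # "$(../account)": "$sum($float(totalcost))"
        totals[record["account"]] += float(item["totalcost"])

totals = dict(totals)
```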

Plain sum over nested apps:

```json
{ "#return:apps[]": "$sum(memory*instances)" }
```

Stitch fields into a key, sum a numeric one:

```json
{
  "#return:results[]": {
    "#var account:[]?field='AccountId'": "value",
    "#var requests:[]?field='number_of_requests'": "$number(value)",
    "$var(account)": "$sum($var(requests))"
  }
}
```

### 10.14 Bucketize with thresholds (`<<` joins back to original context)

```json
{
  "#var dates": [
    "'2018-08-22'",
    "'2018-09-22'",
    "'2018-10-22'",
    "'2018-11-22'"
  ],
  "#return:data[]": {
    "#var i->%dates:[]?$to_time(<<date)<$to_time(.)": "'up to '+.",
    "%i": "$sum(value)"
  }
}
```

Notes:

- `->%dates` switches context to the array stored in variable `dates`.
- `<<date` reads `date` from the *outer* `data[]` element (before the switch).
- `#var i` captures the bucket label so it can be reused as a *dynamic key* in `%i`.

### 10.15 Generic recursive merge / transform

```json
{
  "#func merge": {
    "#returnif:{}": { "$key": "$merge" },
    "#returnif:[]": ["$merge"],
    "#return": "."
  },
  "#return": "$merge"
}
```

This `merge` function recurses into every object key, every array element, and falls through for leaves. The `#returnif:{}` branch fires only if `.` is an object (yields a non-empty object result); same for `#returnif:[]`.
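
The three-branch shape of `merge` (object, array, leaf) is the usual recursive walk over JSON. The Python sketch below shows the same structure; `walk` and its `leaf` hook are assumptions added for illustration, with the hook marking where a real transform would change leaf values.

```python
from typing import Any, Callable


def walk(node: Any, leaf: Callable[[Any], Any] = lambda value: value) -> Any:
    """Rebuild a JSON-like value, visiting objects, arrays, then leaves."""
    if isinstance(node, dict):
        # Corresponds to the "#returnif:{}" branch: recurse into every key.
        return {key: walk(value, leaf) for key, value in node.items()}
    if isinstance(node, list):
        # Corresponds to the "#returnif:[]" branch: recurse into every element.
        return [walk(item, leaf) for item in node]
    # Corresponds to the final "#return": "." for leaves.
    return leaf(node)


# Hypothetical document.
doc = {"a": [1, {"b": 2}], "c": 3}

copied = walk(doc)
scaled = walk(doc, leaf=lambda value: value * 10)
```

With the default hook the walk yields a structural copy; passing a different `leaf` function changes only the scalar values.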

### 10.16 Aggregate dictionary-style: per-key sum

```json
{
  "#return:[]": {
    "$(A)": {
      "A": "A",
      "{}:": "$sum($number(.))"
    }
  }
}
```

### 10.17 Invert object: `{key: value}` → `{value: [key, ...]}`

```json
{ "#return:{}": { "$(.)": ["$key"] } }
```
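
The inversion can be sketched in Python as grouping keys by their value. `invert` is a hypothetical helper; converting each value with `str()` is an assumption made because JSON object keys must be strings, and the engine's own conversion of non-string values may differ.

```python
from collections import defaultdict


def invert(mapping: dict) -> dict:
    """Turn {key: value} into {value: [key, ...]}, keeping input order."""
    inverted = defaultdict(list)
    for key, value in mapping.items():
        # Assumption: values become string keys, as JSON requires.
        inverted[str(value)].append(key)
    return dict(inverted)


# Hypothetical input object.
colors = {"apple": "red", "banana": "yellow", "cherry": "red"}
by_color = invert(colors)
```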

### 10.18 Wildcard prefix filter

```json
{ "#return:[]?folder starts_with 'User/Admin'": ["."] }
```

### 10.19 Path-aware key construction

Strip a known prefix from `$path`, then group:

```json
{
  "#return:**": {
    "$substr($path, $length('content.'))": {
      "$(.)": "$count"
    }
  }
}
```

### 10.20 Multi-field projection with dynamic key

```json
{
  "msg": "message",
  "status": "status",
  "details:details[]": {
    "$(test)": {
      "amount:": ".",
      "pre:": "."
    }
  }
}
```

### 10.21 Add filename to each record

```json
[{
  "Filename": "$filename",
  "{}:": "."
}]
```

### 10.22 Translate codes via variable-as-dictionary

```json
{
  "#var states": {
    "CA": "'California'",
    "NJ": "'New Jersey'",
    "NY": "'New York'",
    "TX": "'Texas'"
  },
  "employees": [{
    "FirstName:": ".",
    "LastName:": ".",
    "State": "$var(states).$(Address.State)"
  }]
}
```

### 10.23 Group-by + count at multiple levels

Top-level group-by:

```json
{ "$(Title)": "$avg(Salary)" }
```

Nested per-employee count of children:

```json
[{
  "FirstName:": ".",
  "LastName:": ".",
  "NumOfChildren:Dependants[]": "$count?Relation='Child'"
}]
```

### 10.24 Filter by aggregated count (uses `#if` + context-modifier with `$count`)

```json
[{
  "#if:Dependants[]?Relation='Child'": "$count>1",
  "FirstName:": ".",
  "LastName:": "."
}]
```

### 10.25 Convert array-of-objects ↔ dictionary

Array → dictionary (keyed by `ID`):

```json
{ "$(ID)": "." }
```

Dictionary → array of values (drops keys):

```json
{ "employees:{}": ["."] }
```

Add the original key into each value (e.g. promote `$key` → `Filename` field):

```json
{
  "$(ID):{}": {
    "Filename": "$filename",
    "$key": "."
  }
}
```

Drop a known key while reshaping:

```json
{
  "$(ID):{}?$key!='ID'": {
    "$key": "."
  }
}
```

### 10.26 Walk children with stored context

Snapshot the parent name into a variable, then iterate dependants and inject:

```json
{
  "#var employee": "FirstName+' '+LastName",
  "all_dependants:Dependants[]": [{
    "{}:": ".",
    "Employee": "$var(employee)"
  }]
}
```

### 10.27 Sort keys / collect all keys in a corpus

```json
{ "#return:**": ["$key@unique_ascending"] }
```

### 10.28 Aggregate over the entire corpus

```text
"$avg(Salary)"           // overall average
"$count"                 // document count
"$count?Salary>200000"
```

---

## 11. Common idioms — when to reach for which feature

| Symptom | Use |
|---|---|
| "Same key on output as input" | `"Field:": "."` |
| "Need iterative array, not just first value" | wrap in `[...]` |
| "Need a flat array, not `{key:[...]}`" | `#return:` instead of a wrapper key |
| "Value comes from one of N possibilities" | `#returnif` chain ending with `#return` for default |
| "Need group-by on field X" | `{ "$(X)": [...] }` or `{ "$(X)": "$agg(...)" }` |
| "Need dictionary keyed by a field" | `{ "$(field)": "." }` (no array) |
| "Need to iterate fields of an object" | `:{}` or `{}:` |
| "Need to iterate array elements" | `:[]` or `[]` in path |
| "Need to walk every node anywhere" | `:**` recursive descent |
| "Need to filter object keys by regex" | `{'regex'}` or `:{}?$key matches 'regex'` |
| "Need to remove an existing field" | `"{}:": ".?$key!='X'"` |
| "Need to add a field while copying others" | `[{ "X": "...", "{}:": "." }]` |
| "Need to compute new key from value" | `{ "$(expr)": ... }` |
| "Need a default when something is missing" | `#returnif:cond` key, then `"#return": "'default'"` (see §10.8) |
| "Need recursion over arbitrary JSON" | `#func` calling itself via three `#returnif` cases (object/array/leaf) |
| "Need to remember a value while changing context" | `#var` before the context-changing key, then `%name` / `$var(name)` inside |
| "Need a value from the previous (outer) document" | `<<Field` after `->...` switch |

---

## 12. Static-check checklist for AI output

Before returning a query, scan it against this checklist. These rules come straight from the parser and are the most common reasons an Unquery query fails to parse or fails to do what you intended.

1. **Whole string consumed.** Each value string must parse cleanly to EOF — no trailing characters. If you concatenated something, recheck.
2. **Aggregate vs. literal.** An aggregate may be compared only against a literal (§9.5). Reject `"$sum(x) > field"`.
3. **`#if` value is a *condition only*.** No `?`, no `@sort`, no leading expression-then-predicate.
4. **`#func` arity matches.** A `#func name(a,b)` declaration registers arity 2; every `$name(...)` in this template must pass exactly 2 comma-separated argument expressions.
5. **`<-` is not a token.** To compare `< -5`, write it with whitespace or parentheses: `Field < -5`, `Field<(-5)`. Never type `Field<-5` and expect `<-`.
6. **`..` is the parent step; you usually want `../`.** `..Field` is invalid. Write `../Field`.
7. **`@sort` ends value parsing.** Place it last in the value string, after any `?cond` or constraint.
8. **Condition OR is `|`.** `||` in a condition is a parse error. `||` is reserved for context-OR in keys.
9. **`%x` ≡ `$var(x)`.** Use whichever is clearer; behavior is identical.
10. **`#func` definitions go *before* their first use** in the same object.
11. **Quoted args of XML helpers (`$attr`, `$child`, `$xpath`, `$lxpath`, `$in_filter`)** must be quoted string literals, not expressions.
12. **Don't wrap a function token in `$()`.** Write `$index`, not `$($index)`.
13. **Backticks are for field names.** `` `name with spaces` `` is a path segment, not a string literal.
14. **Don't confuse predicate vs constraint inside object fields.** Predicate (`?`) skips just that key. Constraint (no `?`) skips the whole object.
15. **Sort across an array of multiple items is undefined unless every item shares the same specifier.** Prefer single-item arrays for sorting.

---

## 13. Cross-reference

- [`docs/unquery-grammar.md`](./unquery-grammar.md) — Parser-faithful EBNF, AST nodes, lexer tokens, every static rule. Use this for syntax-checking, error attribution, or when extending Unquery tooling. MCP resource: `xcitedb://unquery-grammar`.
- [`web/src/docs/content/unquery-language-reference.html`](../web/src/docs/content/unquery-language-reference.html) — Human-oriented reference manual.
- [`web/src/docs/content/unquery-tutorial.html`](../web/src/docs/content/unquery-tutorial.html) — Hands-on tutorial with the employees dataset.
- XCiteDB-specific bits (workspaces, branches, identifiers, `->$date`, `->$branch`, `->$all`, `$xml*`, `$attr`, `$xpath`, `$node_date`, `$data_date`) only apply when running against XCiteDB. With the `unq` CLI on plain JSON files, ignore them.

---

## Document history

- Synthesized from `docs/unquery-grammar.md`, the language reference and tutorial under `web/src/docs/content/`, and the canonical Stack Overflow recipes under `Xcential/Unquery/tests/stackoverflow/`. Keep in sync when grammar or recipes change.