flowquery 1.0.50 → 1.0.52

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (56)
  1. package/README.md +640 -75
  2. package/dist/compute/runner.d.ts +40 -5
  3. package/dist/compute/runner.d.ts.map +1 -1
  4. package/dist/compute/runner.js +94 -10
  5. package/dist/compute/runner.js.map +1 -1
  6. package/dist/flowquery.min.js +1 -1
  7. package/dist/graph/database.d.ts +5 -0
  8. package/dist/graph/database.d.ts.map +1 -1
  9. package/dist/graph/database.js +19 -3
  10. package/dist/graph/database.js.map +1 -1
  11. package/dist/graph/physical_node.d.ts +1 -1
  12. package/dist/graph/physical_node.d.ts.map +1 -1
  13. package/dist/graph/physical_node.js +2 -2
  14. package/dist/graph/physical_node.js.map +1 -1
  15. package/dist/graph/physical_relationship.d.ts +1 -1
  16. package/dist/graph/physical_relationship.d.ts.map +1 -1
  17. package/dist/graph/physical_relationship.js +2 -2
  18. package/dist/graph/physical_relationship.js.map +1 -1
  19. package/dist/index.browser.d.ts +2 -0
  20. package/dist/index.browser.d.ts.map +1 -1
  21. package/dist/index.browser.js.map +1 -1
  22. package/dist/index.node.d.ts +2 -1
  23. package/dist/index.node.d.ts.map +1 -1
  24. package/dist/index.node.js.map +1 -1
  25. package/dist/parsing/expressions/parameter_reference.d.ts +39 -0
  26. package/dist/parsing/expressions/parameter_reference.d.ts.map +1 -0
  27. package/dist/parsing/expressions/parameter_reference.js +56 -0
  28. package/dist/parsing/expressions/parameter_reference.js.map +1 -0
  29. package/dist/parsing/operations/match.d.ts +28 -0
  30. package/dist/parsing/operations/match.d.ts.map +1 -1
  31. package/dist/parsing/operations/match.js +100 -0
  32. package/dist/parsing/operations/match.js.map +1 -1
  33. package/dist/parsing/parser.d.ts +15 -0
  34. package/dist/parsing/parser.d.ts.map +1 -1
  35. package/dist/parsing/parser.js +86 -1
  36. package/dist/parsing/parser.js.map +1 -1
  37. package/dist/parsing/parser_state.d.ts +3 -0
  38. package/dist/parsing/parser_state.d.ts.map +1 -1
  39. package/dist/parsing/parser_state.js +7 -0
  40. package/dist/parsing/parser_state.js.map +1 -1
  41. package/dist/tokenization/string_walker.d.ts +1 -0
  42. package/dist/tokenization/string_walker.d.ts.map +1 -1
  43. package/dist/tokenization/string_walker.js +7 -0
  44. package/dist/tokenization/string_walker.js.map +1 -1
  45. package/dist/tokenization/symbol.d.ts +1 -0
  46. package/dist/tokenization/symbol.d.ts.map +1 -1
  47. package/dist/tokenization/symbol.js +1 -0
  48. package/dist/tokenization/symbol.js.map +1 -1
  49. package/dist/tokenization/token.d.ts +2 -0
  50. package/dist/tokenization/token.d.ts.map +1 -1
  51. package/dist/tokenization/token.js +6 -0
  52. package/dist/tokenization/token.js.map +1 -1
  53. package/dist/tokenization/tokenizer.d.ts.map +1 -1
  54. package/dist/tokenization/tokenizer.js +3 -1
  55. package/dist/tokenization/tokenizer.js.map +1 -1
  56. package/package.json +2 -2
package/README.md CHANGED
@@ -1,12 +1,38 @@
1
- ![FlowQuery](./FlowQueryLogoIcon.png)
1
+ # FlowQuery
2
2
 
3
- A declarative query language for data processing pipelines.
3
+ **Author:** Niclas Kjäll-Ohlsson
4
4
 
5
- FlowQuery is a declarative query language for defining and executing data processing pipelines involving (but not limited to) API calls over http. The language is very well suited for prototyping of for example LLM chain-of-thought pipelines involving fetching grounding data from APIs, and processing that grounding data in multiple successive LLM calls where the next call builds on previous results. FlowQuery is based on many of the core language constructs in the OpenCypher query language except (currently) concepts related to graphs. Additionally, FlowQuery implements its own language constructs, such as Python-style f-strings, and special predicate functions operating over lists. FlowQuery is not limited to its current capabilities and may evolve beyond this in the future to include language constructs such as variables and/or other language concepts from OpenCypher.
5
+ A declarative OpenCypher-based query language for virtual graphs and data processing pipelines.
6
6
 
7
- The main motivation of FlowQuery is rapid prototyping of fixed step data processing pipelines involving LLMs (for example chain-of-thought) and as such drastically shorten the work needed to create such data processing pipelines. A core business outcome of this is faster product value experimentation loops, which leads to shorter time-to-market for product ideas involving LLMs.
7
+ FlowQuery is a declarative query language aiming to fully support OpenCypher, extended with capabilities such as **virtual graphs**, HTTP data loading, f-strings, and **custom function extensibility**. Virtual nodes and relationships are backed by sub-queries that can fetch data dynamically (e.g., from REST APIs), and FlowQuery's graph engine supports pattern matching, variable-length traversals, optional matches, relationship direction, and filter pass-down, enabling you to model and explore complex data relationships without a traditional graph database.
8
8
 
9
- FlowQuery is written in TypeScript (https://www.typescriptlang.org/) and built/compiled runs both in browser or in Node as a self-contained one-file Javascript library.
9
+ Beyond graphs, FlowQuery provides a full data processing pipeline language with features like `LOAD JSON FROM` for HTTP calls (GET/POST with headers), f-strings, list comprehensions, inline predicate aggregation, temporal functions, and a rich library of scalar and aggregate functions.
10
+
11
+ ### Graph RAG with FlowQuery
12
+
13
+ The combination of graph querying and pipeline processing makes FlowQuery ideal for the retrieval stage of Retrieval Augmented Generation (RAG). A typical graph RAG flow works as follows:
14
+
15
+ 1. **User query** — The user asks a question in natural language.
16
+ 2. **Query generation** — The LLM, with knowledge of the virtual graph schema, generates a precise OpenCypher query to retrieve the grounding data needed to answer the question.
17
+ 3. **Query execution** — The FlowQuery engine executes the generated OpenCypher query against the virtual graph and returns the results as grounding data.
18
+ 4. **Response formulation** — The LLM formulates a final response informed by the grounding data.
19
+
20
+ ```
21
+ ┌──────────┐ ┌───────────────┐ ┌─────────────────┐ ┌───────────────┐
22
+ │ User │────>│ LLM │────>│ FlowQuery │────>│ LLM │
23
+ │ Question │ │ Generate Query│ │ Execute Query │ │ Formulate │
24
+ │ │ │ (OpenCypher) │ │ (Virtual Graph) │ │ Response │
25
+ └──────────┘ └───────────────┘ └─────────────────┘ └───────┬───────┘
26
+
27
+ v
28
+ ┌──────────┐
29
+ │ Answer │
30
+ └──────────┘
31
+ ```
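The four steps above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: `graph_rag_answer` and its three callables are hypothetical names, not part of FlowQuery, and the LLM and query engine are injected so the sketch stays self-contained.

```python
from typing import Any, Callable

Rows = list[dict[str, Any]]


def graph_rag_answer(
    question: str,
    schema: str,
    generate_query: Callable[[str, str], str],       # hypothetical: LLM turns question + schema into a query
    execute_query: Callable[[str], Rows],            # hypothetical: runs the query against the virtual graph
    formulate_response: Callable[[str, Rows], str],  # hypothetical: LLM turns grounding data into an answer
) -> str:
    """Run the retrieval flow: question -> query -> grounding data -> answer."""
    query = generate_query(question, schema)
    grounding = execute_query(query)
    return formulate_response(question, grounding)
```

In practice `execute_query` would wrap the FlowQuery runner and the other two callables would wrap LLM calls; keeping them as parameters makes each stage replaceable and testable with stubs.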
32
+
33
+ See the [Language Reference](#language-reference) and [Quick Cheat Sheet](#quick-cheat-sheet) for full syntax documentation.
34
+
35
+ FlowQuery is written in TypeScript and runs both in the browser and in Node.js as a self-contained single-file JavaScript library. A pure Python implementation of FlowQuery with full functional fidelity is also available in the [flowquery-py](./flowquery-py) sub-folder (`pip install flowquery`).
10
36
 
11
37
  - Test live at <a href="https://microsoft.github.io/FlowQuery/" target="_blank">https://microsoft.github.io/FlowQuery/</a>.
12
38
  - Try as a VSCode plugin from https://marketplace.visualstudio.com/items?itemName=FlowQuery.flowquery-vscode.
@@ -72,100 +98,639 @@ await query.run();
72
98
  console.log(query.results);
73
99
  ```
74
100
 
75
- ## Examples
101
+ ### Python
102
+
103
+ Install FlowQuery from PyPI:
104
+
105
+ ```bash
106
+ pip install flowquery
107
+ ```
108
+
109
+ Then use it in your code:
110
+
111
+ ```python
112
+ import asyncio
113
+ from flowquery import Runner
114
+
115
+ runner = Runner("WITH 1 AS x RETURN x + 1 AS result")
116
+ asyncio.run(runner.run())
117
+ print(runner.results) # [{'result': 2}]
118
+ ```
119
+
120
+ Or start the interactive REPL:
121
+
122
+ ```bash
123
+ flowquery
124
+ ```
125
+
126
+ See [flowquery-py](./flowquery-py) for more details, including custom function extensibility in Python.
127
+
128
+ ## Language Reference
129
+
130
+ ### Clauses
131
+
132
+ #### RETURN
133
+
134
+ Returns results. Expressions can be aliased with `AS`.
135
+
136
+ ```cypher
137
+ RETURN 1 + 2 AS sum, 3 + 4 AS sum2
138
+ // [{ sum: 3, sum2: 7 }]
139
+ ```
140
+
141
+ #### WITH
142
+
143
+ Introduces variables into scope. Works like `RETURN` but continues the pipeline.
144
+
145
+ ```cypher
146
+ WITH 1 AS a RETURN a
147
+ // [{ a: 1 }]
148
+ ```
149
+
150
+ #### UNWIND
151
+
152
+ Expands a list into individual rows.
153
+
154
+ ```cypher
155
+ UNWIND [1, 2, 3] AS num RETURN num
156
+ // [{ num: 1 }, { num: 2 }, { num: 3 }]
157
+ ```
158
+
159
+ Unwinding `null` produces zero rows.
160
+
161
+ #### LOAD JSON FROM
162
+
163
+ Fetches JSON data from a URL. Supports GET (default) and POST with headers.
164
+
165
+ ```cypher
166
+ LOAD JSON FROM "https://api.example.com/data" AS data RETURN data
167
+
168
+ // With POST body and custom headers
169
+ LOAD JSON FROM 'https://api.example.com/endpoint'
170
+ HEADERS { `Content-Type`: 'application/json', Authorization: f'Bearer {token}' }
171
+ POST { key: 'value' } AS response
172
+ RETURN response
173
+ ```
174
+
175
+ #### LIMIT
176
+
177
+ Restricts the number of rows. Can appear mid-pipeline or after `RETURN`.
178
+
179
+ ```cypher
180
+ UNWIND range(1, 100) AS i RETURN i LIMIT 5
181
+ ```
182
+
183
+ #### CALL ... YIELD
184
+
185
+ Invokes an async function and yields named fields into scope.
186
+
187
+ ```cypher
188
+ CALL myAsyncFunction() YIELD result RETURN result
189
+ // If last operation, YIELD is optional
190
+ CALL myAsyncFunction()
191
+ ```
192
+
193
+ #### UNION / UNION ALL
194
+
195
+ Combines results from multiple queries. `UNION` removes duplicates; `UNION ALL` keeps them. Column names must match.
196
+
197
+ ```cypher
198
+ WITH 1 AS x RETURN x UNION WITH 2 AS x RETURN x
199
+ // [{ x: 1 }, { x: 2 }]
200
+
201
+ WITH 1 AS x RETURN x UNION ALL WITH 1 AS x RETURN x
202
+ // [{ x: 1 }, { x: 1 }]
203
+ ```
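The difference between the two forms can be modelled over plain row lists. The following is a minimal Python sketch of the described semantics, not FlowQuery's implementation; the helper names `union_all` and `union` are hypothetical.

```python
import json
from typing import Any

Row = dict[str, Any]


def union_all(left: list[Row], right: list[Row]) -> list[Row]:
    """Concatenate two result sets, keeping duplicates."""
    if left and right and set(left[0]) != set(right[0]):
        raise ValueError("column names must match")
    return left + right


def union(left: list[Row], right: list[Row]) -> list[Row]:
    """Concatenate two result sets and drop duplicate rows, keeping first occurrences."""
    seen: set[str] = set()
    unique: list[Row] = []
    for row in union_all(left, right):
        key = json.dumps(row, sort_keys=True)  # stable, hashable representation of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```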
204
+
205
+ #### Multi-Statement Queries
206
+
207
+ Multiple statements can be separated by semicolons. Only `CREATE VIRTUAL` and `DELETE VIRTUAL` statements may appear before the last statement. The last statement can be any valid query.
208
+
209
+ ```cypher
210
+ CREATE VIRTUAL (:Person) AS {
211
+ UNWIND [{id: 1, name: 'Alice'}, {id: 2, name: 'Bob'}] AS r
212
+ RETURN r.id AS id, r.name AS name
213
+ };
214
+ CREATE VIRTUAL (:Person)-[:KNOWS]-(:Person) AS {
215
+ UNWIND [{left_id: 1, right_id: 2}] AS r
216
+ RETURN r.left_id AS left_id, r.right_id AS right_id
217
+ };
218
+ MATCH (a:Person)-[:KNOWS]->(b:Person)
219
+ RETURN a.name AS from, b.name AS to
220
+ ```
221
+
222
+ The `Runner` also exposes a `metadata` property with counts of virtual nodes/relationships created and deleted:
223
+
224
+ ```javascript
225
+ const runner = new FlowQuery("CREATE VIRTUAL (:X) AS { RETURN 1 AS id }; MATCH (n:X) RETURN n");
226
+ await runner.run();
227
+ console.log(runner.metadata);
228
+ // { virtual_nodes_created: 1, virtual_relationships_created: 0,
229
+ // virtual_nodes_deleted: 0, virtual_relationships_deleted: 0 }
230
+ ```
231
+
232
+ ### WHERE Clause
233
+
234
+ Filters rows based on conditions. Supports the following operators:
235
+
236
+ | Operator | Example |
237
+ | -------------------- | -------------------------------------------------------------- |
238
+ | Comparison | `=`, `<>`, `>`, `>=`, `<`, `<=` |
239
+ | Logical | `AND`, `OR`, `NOT` |
240
+ | Null checks | `IS NULL`, `IS NOT NULL` |
241
+ | List membership | `IN [...]`, `NOT IN [...]` |
242
+ | String matching | `CONTAINS`, `NOT CONTAINS` |
243
+ | String prefix/suffix | `STARTS WITH`, `NOT STARTS WITH`, `ENDS WITH`, `NOT ENDS WITH` |
244
+
245
+ ```cypher
246
+ UNWIND range(1,100) AS n WITH n WHERE n >= 20 AND n <= 30 RETURN n
247
+
248
+ UNWIND ['apple', 'banana', 'grape'] AS fruit
249
+ WITH fruit WHERE fruit CONTAINS 'ap' RETURN fruit
250
+ // [{ fruit: 'apple' }, { fruit: 'grape' }]
251
+
252
+ UNWIND ['apple', 'apricot', 'banana'] AS fruit
253
+ WITH fruit WHERE fruit STARTS WITH 'ap' RETURN fruit
254
+
255
+ WITH fruit WHERE fruit IN ['banana', 'date'] RETURN fruit
256
+
257
+ WHERE age IS NOT NULL
258
+ ```
259
+
260
+ ### ORDER BY
261
+
262
+ Sorts results. Supports `ASC` (default) and `DESC`. Can use aliases, property access, function expressions, or arithmetic.
263
+
264
+ ```cypher
265
+ UNWIND [3, 1, 2] AS x RETURN x ORDER BY x DESC
266
+ // [{ x: 3 }, { x: 2 }, { x: 1 }]
267
+
268
+ // Multiple sort keys
269
+ RETURN person.name AS name, person.age AS age ORDER BY name ASC, age DESC
270
+
271
+ // Sort by expression (expression values are not leaked into results)
272
+ UNWIND ['BANANA', 'apple', 'Cherry'] AS fruit
273
+ RETURN fruit ORDER BY toLower(fruit)
274
+
275
+ // Sort by arithmetic expression
276
+ RETURN item.a AS a, item.b AS b ORDER BY item.a + item.b ASC
277
+ ```
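For readers more familiar with Python, sorting by an expression and by multiple keys corresponds roughly to `sorted` with a key function. This is an analogy only; the `people` data is made up for illustration.

```python
# ORDER BY toLower(fruit): the sort key is computed but not returned.
fruits = ["BANANA", "apple", "Cherry"]
by_lowercase = [{"fruit": f} for f in sorted(fruits, key=str.lower)]

# ORDER BY name ASC, age DESC: negate the numeric key to reverse its direction.
people = [
    {"name": "Bob", "age": 25},
    {"name": "Alice", "age": 30},
    {"name": "Alice", "age": 41},
]
by_name_then_age_desc = sorted(people, key=lambda p: (p["name"], -p["age"]))
```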
278
+
279
+ #### DISTINCT
76
280
 
77
- See also ./misc/queries and ./tests/compute/runner.test.ts for more examples.
281
+ Removes duplicate rows from `RETURN` or `WITH`.
78
282
 
79
283
  ```cypher
80
- /*
81
- Collect 10 random pieces of wisdom and create a letter histogram.
82
- */
83
- unwind range(0,10) as i
84
- load json from "https://api.adviceslip.com/advice" as item
85
- with join(collect(item.slip.advice),"") as wisdom
86
- unwind split(wisdom,"") as letter
87
- return letter, sum(1) as lettercount
284
+ UNWIND [1, 1, 2, 2] AS i RETURN DISTINCT i
285
+ // [{ i: 1 }, { i: 2 }]
88
286
  ```
89
287
 
288
+ ### Expressions
289
+
290
+ #### Arithmetic
291
+
292
+ `+`, `-`, `*`, `/`, `^` (power), `%` (modulo). Standard precedence applies; use parentheses to override.
293
+
90
294
  ```cypher
91
- /*
92
- This query fetches 10 cat facts from the Cat Facts API (https://catfact.ninja/fact)
93
- and then uses the OpenAI API to analyze those cat facts and return a short summary
94
- of the most interesting facts and what they imply about cats as pets.
295
+ RETURN 2 + 3 * 4 AS result // 14
296
+ RETURN (2 + 3) * 4 AS result // 20
297
+ ```
298
+
299
+ #### String Concatenation
95
300
 
96
- To run this query, you need to set the OPENAI_API_KEY variable to your OpenAI API key.
97
- You also need to set the OpenAI-Organization header to your organization ID.
98
- You can find your organization ID in the OpenAI dashboard.
99
- See https://platform.openai.com/docs/guides/chat for more information.
100
- */
101
- // Setup OpenAI API key and organization ID
102
- with
103
- 'YOUR_OPENAI_API_KEY' as OPENAI_API_KEY,
104
- 'YOUR_OPENAI_ORGANIZATION_ID' as OPENAI_ORGANIZATION_ID
301
+ The `+` operator concatenates strings.
105
302
 
106
- // Get 10 cat facts and collect them into a list
107
- unwind range(0,10) as i
108
- load json from "https://catfact.ninja/fact" as item
109
- with collect(item.fact) as catfacts
303
+ ```cypher
304
+ RETURN "hello" + " world" AS result // "hello world"
305
+ ```
306
+
307
+ #### List Concatenation
308
+
309
+ The `+` operator concatenates lists.
310
+
311
+ ```cypher
312
+ RETURN [1, 2] + [3, 4] AS result // [1, 2, 3, 4]
313
+ ```
314
+
315
+ #### Negative Numbers
316
+
317
+ ```cypher
318
+ RETURN -1 AS num // -1
319
+ ```
320
+
321
+ #### Associative Arrays (Maps)
322
+
323
+ Create inline maps. Keys can be reserved keywords.
324
+
325
+ ```cypher
326
+ RETURN {name: "Alice", age: 30} AS person
327
+ RETURN {return: 1}.return AS aa // 1
328
+ ```
329
+
330
+ #### Property Access
331
+
332
+ Dot notation or bracket notation for nested lookups. Bracket notation supports range slicing.
333
+
334
+ ```cypher
335
+ person.name
336
+ person["name"]
337
+ numbers[0:3] // first 3 elements
338
+ numbers[:-2] // all but last 2
339
+ numbers[2:-2] // slice from index 2, excluding last 2
340
+ numbers[:] // full copy
341
+ ```
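The slice forms above read like Python slices. Assuming FlowQuery follows the same conventions (start inclusive, end exclusive, negative indices counted from the end), which the comments above suggest but do not spell out, the equivalent Python is:

```python
numbers = [1, 2, 3, 4, 5, 6]

first_three = numbers[0:3]       # first 3 elements
all_but_last_two = numbers[:-2]  # all but last 2
middle = numbers[2:-2]           # from index 2, excluding last 2
full_copy = numbers[:]           # new list with the same elements
```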
110
342
 
111
- // Create prompt to analyze cat facts
112
- with f"
113
- Analyze the following cat facts and answer with a short summary of the most interesting facts, and what they imply about cats as pets:
114
- {join(catfacts, '\n')}
115
- " as catfacts_analysis_prompt
343
+ #### F-Strings
116
344
 
117
- // Call OpenAI API to analyze cat facts
118
- load json from 'https://api.openai.com/v1/chat/completions'
119
- headers {
120
- `Content-Type`: 'application/json',
121
- Authorization: f'Bearer {OPENAI_API_KEY}',
122
- `OpenAI-Organization`: OPENAI_ORGANIZATION_ID
345
+ Python-style formatted strings with embedded expressions.
346
+
347
+ ```cypher
348
+ WITH "world" AS w RETURN f"hello {w}" AS greeting
349
+ // Escape braces with double braces: {{ and }}
350
+ RETURN f"literal {{braces}}" AS result // "literal {braces}"
351
+ ```
352
+
353
+ #### CASE Expression
354
+
355
+ ```cypher
356
+ RETURN CASE WHEN num > 1 THEN num ELSE null END AS ret
357
+ ```
358
+
359
+ #### Equality as Expression
360
+
361
+ `=` and `<>` return `1` (true) or `0` (false) when used in RETURN.
362
+
363
+ ```cypher
364
+ RETURN i=5 AS isEqual, i<>5 AS isNotEqual
365
+ ```
366
+
367
+ ### List Comprehensions
368
+
369
+ Filter and/or transform lists inline.
370
+
371
+ ```cypher
372
+ // Map: [variable IN list | expression]
373
+ RETURN [n IN [1, 2, 3] | n * 2] AS doubled // [2, 4, 6]
374
+
375
+ // Filter: [variable IN list WHERE condition]
376
+ RETURN [n IN [1, 2, 3, 4, 5] WHERE n > 2] AS filtered // [3, 4, 5]
377
+
378
+ // Filter + Map: [variable IN list WHERE condition | expression]
379
+ RETURN [n IN [1, 2, 3, 4] WHERE n > 1 | n ^ 2] AS result // [4, 9, 16]
380
+
381
+ // Identity (copy): [variable IN list]
382
+ RETURN [n IN [10, 20, 30]] AS result // [10, 20, 30]
383
+ ```
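These forms map closely onto Python list comprehensions, which may help when porting logic between the two. Note that `^` is the power operator in the queries above, whereas Python uses `**` (in Python, `^` is bitwise XOR).

```python
doubled = [n * 2 for n in [1, 2, 3]]                # map
filtered = [n for n in [1, 2, 3, 4, 5] if n > 2]    # filter
squares = [n ** 2 for n in [1, 2, 3, 4] if n > 1]   # filter + map
copied = [n for n in [10, 20, 30]]                  # identity (copy)
```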
384
+
385
+ ### Predicate Functions (Inline Aggregation)
386
+
387
+ Aggregate over a list expression with optional filtering.
388
+
389
+ ```cypher
390
+ // sum(variable IN list | expression WHERE condition)
391
+ RETURN sum(n IN [1, 2, 3] | n WHERE n > 1) AS sum // 5
392
+ RETURN sum(n IN [1, 2, 3] | n) AS sum // 6
393
+ RETURN sum(n IN [1+2+3, 2, 3] | n^2) AS sum // 49
394
+ ```
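As a cross-check of the three results above, the same computation can be written in Python. `predicate_sum` is a hypothetical helper that mirrors the `sum(variable IN list | expression WHERE condition)` shape; it is not a FlowQuery API.

```python
from typing import Callable, Iterable, Optional


def predicate_sum(
    items: Iterable[float],
    expr: Callable[[float], float],
    cond: Optional[Callable[[float], bool]] = None,
) -> float:
    """Sum expr(n) over the items that satisfy cond (all items when cond is None)."""
    return sum(expr(n) for n in items if cond is None or cond(n))
```

For example, `predicate_sum([1 + 2 + 3, 2, 3], lambda n: n ** 2)` squares 6, 2, and 3 and adds 36 + 4 + 9.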
395
+
396
+ ### Aggregate Functions
397
+
398
+ Used in `RETURN` or `WITH` to group and reduce rows. Non-aggregated expressions define grouping keys. Aggregate functions cannot be nested.
399
+
400
+ | Function | Description |
401
+ | ------------------------ | ------------------------------------------------------------------- |
402
+ | `sum(expr)` | Sum of values. Returns `0` for empty input, `null` for null input. |
403
+ | `avg(expr)` | Average. Returns `null` for null input. |
404
+ | `count(expr)` | Count of rows. |
405
+ | `count(DISTINCT expr)` | Count of unique values. |
406
+ | `min(expr)` | Minimum value (numbers or strings). |
407
+ | `max(expr)` | Maximum value (numbers or strings). |
408
+ | `collect(expr)` | Collects values into a list. |
409
+ | `collect(DISTINCT expr)` | Collects unique values. Works with primitives, arrays, and objects. |
410
+
411
+ ```cypher
412
+ UNWIND [1, 1, 2, 2] AS i UNWIND [1, 2, 3, 4] AS j
413
+ RETURN i, sum(j) AS sum, avg(j) AS avg
414
+ // [{ i: 1, sum: 20, avg: 2.5 }, { i: 2, sum: 20, avg: 2.5 }]
415
+
416
+ UNWIND ["a", "b", "a", "c"] AS s RETURN count(DISTINCT s) AS cnt // 3
417
+ ```
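The first result follows from implicit grouping: the two `UNWIND` clauses produce 16 rows, `i` is the only non-aggregated expression, so each distinct `i` collects eight `j` values. A rough Python model of that grouping (the helper name `group_sum_avg` is hypothetical):

```python
from collections import defaultdict
from itertools import product


def group_sum_avg(rows, key, value):
    """Group rows by `key`, then compute sum and average of `value` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return [
        {key: k, "sum": sum(vals), "avg": sum(vals) / len(vals)}
        for k, vals in groups.items()
    ]


# Two chained UNWINDs behave like a cross product of the two lists.
rows = [{"i": i, "j": j} for i, j in product([1, 1, 2, 2], [1, 2, 3, 4])]
result = group_sum_avg(rows, "i", "j")
```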
418
+
419
+ ### Scalar Functions
420
+
421
+ | Function | Description | Example |
422
+ | ------------------------------ | -------------------------------------- | -------------------------------------- |
423
+ | `size(list)` | Length of list or string | `size([1,2,3])` → `3` |
424
+ | `range(start, end)` | Inclusive integer range | `range(1,3)` → `[1,2,3]` |
425
+ | `round(n)` | Round to nearest integer | `round(3.7)` → `4` |
426
+ | `rand()` | Random float 0–1 | `round(rand()*10)` |
427
+ | `split(str, delim)` | Split string into list | `split("a,b",",")` → `["a","b"]` |
428
+ | `join(list, delim)` | Join list into string | `join(["a","b"],",")` → `"a,b"` |
429
+ | `replace(str, from, to)` | Replace all occurrences | `replace("hello","l","x")` → `"hexxo"` |
430
+ | `toLower(str)` | Lowercase | `toLower("Hello")` → `"hello"` |
431
+ | `trim(str)` | Strip whitespace | `trim(" hi ")` → `"hi"` |
432
+ | `substring(str, start[, len])` | Extract substring | `substring("hello",1,3)` → `"ell"` |
433
+ | `toString(val)` | Convert to string | `toString(42)` → `"42"` |
434
+ | `toInteger(val)` | Convert to integer | `toInteger("42")` → `42` |
435
+ | `toFloat(val)` | Convert to float | `toFloat("3.14")` → `3.14` |
436
+ | `tojson(str)` | Parse JSON string to object | `tojson('{"a":1}')` → `{a: 1}` |
437
+ | `stringify(obj)` | Pretty-print object as JSON | `stringify({a:1})` |
438
+ | `string_distance(a, b)` | Normalized Levenshtein distance (0–1) | `string_distance("kitten","sitting")` |
439
+ | `keys(obj)` | Keys of a map | `keys({a:1,b:2})` → `["a","b"]` |
440
+ | `properties(node_or_map)` | Properties of a node or map | `properties(n)` |
441
+ | `type(val)` | Type name string | `type(123)` → `"number"` |
442
+ | `coalesce(val, ...)` | First non-null argument | `coalesce(null, 42)` → `42` |
443
+ | `head(list)` | First element | `head([1,2,3])` → `1` |
444
+ | `tail(list)` | All but first element | `tail([1,2,3])` → `[2,3]` |
445
+ | `last(list)` | Last element | `last([1,2,3])` → `3` |
446
+ | `id(node_or_rel)` | ID of a node or type of a relationship | `id(n)` |
447
+ | `elementId(node)` | String ID of a node | `elementId(n)` → `"1"` |
448
+
449
+ All scalar functions propagate `null`: if the primary input is `null`, the result is `null`.
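The table above describes `string_distance` as a normalized Levenshtein distance in the range 0 to 1 but does not state the normalization. The sketch below assumes the edit distance is divided by the length of the longer string, which is one common choice; FlowQuery's actual formula may differ, so treat the numbers as illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumed normalization: edit distance divided by the longer string's length."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else levenshtein(a, b) / longest
```

Under this assumption, "kitten" and "sitting" have an edit distance of 3 and a longer length of 7, giving roughly 0.43.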
450
+
451
+ ### Temporal Functions
452
+
453
+ | Function | Description |
454
+ | ------------------------------------------------- | ------------------------------------------------------------------ |
455
+ | `datetime()` | Current UTC datetime object |
456
+ | `datetime(str)` | Parse ISO 8601 string (e.g. `'2025-06-15T12:30:45.123Z'`) |
457
+ | `datetime({year, month, day, hour, minute, ...})` | Construct from map |
458
+ | `date()` / `date(str)` / `date({...})` | Date only (no time fields) |
459
+ | `time()` | Current UTC time |
460
+ | `localtime()` | Current local time |
461
+ | `localdatetime()` / `localdatetime(str)` | Current or parsed local datetime |
462
+ | `timestamp()` | Current epoch milliseconds (number) |
463
+ | `duration(str)` | Parse ISO 8601 duration (`'P1Y2M3DT4H5M6S'`, `'P2W'`, `'PT2H30M'`) |
464
+ | `duration({days, hours, ...})` | Construct duration from map |
465
+
466
+ **Datetime properties:** `year`, `month`, `day`, `hour`, `minute`, `second`, `millisecond`, `epochMillis`, `epochSeconds`, `dayOfWeek` (1=Mon, 7=Sun), `dayOfYear`, `quarter`, `formatted`.
467
+
468
+ **Date properties:** `year`, `month`, `day`, `epochMillis`, `dayOfWeek`, `dayOfYear`, `quarter`, `formatted`.
469
+
470
+ **Duration properties:** `years`, `months`, `weeks`, `days`, `hours`, `minutes`, `seconds`, `totalMonths`, `totalDays`, `totalSeconds`, `formatted`.
471
+
472
+ ```cypher
473
+ WITH datetime() AS now RETURN now.year AS year, now.quarter AS q
474
+ RETURN date('2025-06-15').dayOfWeek AS dow // 7 (Sunday)
475
+ RETURN duration('P2W').days AS d // 14
476
+ ```
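The `dayOfWeek` numbering (1=Mon, 7=Sun) matches ISO weekday numbering, so the example values can be cross-checked with Python's standard library. The quarter formula here assumes plain calendar quarters.

```python
from datetime import date, timedelta

day = date(2025, 6, 15)

day_of_week = day.isoweekday()        # ISO numbering: 1 = Monday, 7 = Sunday
day_of_year = day.timetuple().tm_yday
quarter = (day.month - 1) // 3 + 1    # assumes calendar quarters (Jan-Mar = 1, ...)
two_weeks_in_days = timedelta(weeks=2).days
```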
477
+
478
+ ### Graph Operations
479
+
480
+ #### CREATE VIRTUAL (Nodes)
481
+
482
+ Defines a virtual node label backed by a sub-query.
483
+
484
+ ```cypher
485
+ CREATE VIRTUAL (:Person) AS {
486
+ UNWIND [{id: 1, name: 'Alice'}, {id: 2, name: 'Bob'}] AS record
487
+ RETURN record.id AS id, record.name AS name
123
488
  }
124
- post {
125
- model: 'gpt-4o-mini',
126
- messages: [{role: 'user', content: catfacts_analysis_prompt}],
127
- temperature: 0.7
128
- } as openai_response
129
- with openai_response.choices[0].message.content as catfacts_analysis
489
+ ```
490
+
491
+ #### CREATE VIRTUAL (Relationships)
492
+
493
+ Defines a virtual relationship type between two node labels. Must return `left_id` and `right_id`.
130
494
 
131
- // Return the analysis
132
- return catfacts_analysis
495
+ ```cypher
496
+ CREATE VIRTUAL (:Person)-[:KNOWS]-(:Person) AS {
497
+ UNWIND [{left_id: 1, right_id: 2}] AS record
498
+ RETURN record.left_id AS left_id, record.right_id AS right_id
499
+ }
500
+ ```
501
+
502
+ #### DELETE VIRTUAL
503
+
504
+ Removes a virtual node or relationship definition.
505
+
506
+ ```cypher
507
+ DELETE VIRTUAL (:Person)
508
+ DELETE VIRTUAL (:Person)-[:KNOWS]-(:Person)
133
509
  ```
134
510
 
511
+ #### MATCH
512
+
513
+ Queries virtual graph data. Supports property constraints, `WHERE` clauses, and relationship traversal.
514
+
135
515
  ```cypher
136
- // Test completion from Azure OpenAI API
137
- with
138
- 'YOUR_AZURE_OPENAI_API_KEY' as AZURE_OPENAI_API_KEY
139
- load json from 'https://YOUR_DEPLOYMENT_NAME.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-08-01-preview'
140
- headers {
141
- `Content-Type`: 'application/json',
142
- `api-key`: AZURE_OPENAI_API_KEY,
516
+ MATCH (n:Person) RETURN n.name AS name
517
+ MATCH (n:Person {name: 'Alice'}) RETURN n
518
+ MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name
519
+ ```
520
+
521
+ **Leftward direction:** `<-[:TYPE]-` reverses traversal direction.
522
+
523
+ ```cypher
524
+ MATCH (m:Person)<-[:REPORTS_TO]-(e:Person)
525
+ RETURN m.name AS manager, e.name AS employee
526
+ ```
527
+
528
+ **Variable-length relationships:** `*`, `*0..3`, `*1..`, `*2..`
529
+
530
+ ```cypher
531
+ MATCH (a:Person)-[:KNOWS*]->(b:Person) RETURN a.name, b.name // 0+ hops
532
+ MATCH (a:Person)-[:KNOWS*1..]->(b:Person) RETURN a.name, b.name // 1+ hops
533
+ MATCH (a:Person)-[:KNOWS*0..3]->(b:Person) RETURN a.name, b.name // 0–3 hops
534
+ ```
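To make the hop bounds concrete, here is a simplified Python model that computes which nodes are reachable from a start node within a minimum and maximum number of directed hops. It is a sketch only: it returns reachable nodes rather than enumerating paths, and the function name `reachable` is hypothetical, so FlowQuery's actual matching may return more rows than this suggests.

```python
def reachable(edges, start, min_hops=0, max_hops=None):
    """Nodes reachable from `start` using between min_hops and max_hops directed edges."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)

    # With no upper bound, the edge count is enough to reach every reachable node.
    limit = len(edges) if max_hops is None else max_hops

    found = set()
    frontier = {start}  # nodes reached after exactly `hops` steps
    for hops in range(limit + 1):
        if not frontier:
            break
        if hops >= min_hops:
            found |= frontier
        frontier = {nxt for node in frontier for nxt in adjacency.get(node, ())}
    return found
```

With a chain `1 -> 2 -> 3 -> 4`, a lower bound of 0 includes the start node itself, while a lower bound of 1 excludes it.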
535
+
536
+ **ORed relationship types:**
537
+
538
+ ```cypher
539
+ MATCH (a)-[:KNOWS|FOLLOWS]->(b) RETURN a.name, b.name
540
+ ```
541
+
542
+ **Pattern variable:** Capture the full path as a variable.
543
+
544
+ ```cypher
545
+ MATCH p=(:Person)-[:KNOWS]-(:Person) RETURN p AS pattern
546
+ ```
547
+
548
+ **Pattern in WHERE:** Check existence of a relationship in a WHERE clause.
549
+
550
+ ```cypher
551
+ MATCH (a:Person), (b:Person) WHERE (a)-[:KNOWS]->(b) RETURN a.name, b.name
552
+ MATCH (a:Person) WHERE NOT (a)-[:KNOWS]->(:Person) RETURN a.name
553
+ ```
554
+
555
+ **Node reference reuse across MATCH clauses:**
556
+
557
+ ```cypher
558
+ MATCH (a:Person)-[:KNOWS]-(b:Person)
559
+ MATCH (b)-[:KNOWS]-(c:Person)
560
+ RETURN a.name, b.name, c.name
561
+ ```
562
+
563
+ #### OPTIONAL MATCH
564
+
565
+ Like `MATCH` but returns `null` for unmatched nodes instead of dropping the row. Property access on `null` nodes returns `null`.
566
+
567
+ ```cypher
568
+ MATCH (a:Person)
569
+ OPTIONAL MATCH (a)-[:KNOWS]->(b:Person)
570
+ RETURN a.name AS name, b.name AS friend
571
+ // Persons without KNOWS relationships get friend=null
572
+ ```
573
+
574
+ Chained optional matches propagate `null`:
575
+
576
+ ```cypher
577
+ OPTIONAL MATCH (u)-[:REPORTS_TO]->(m1:Employee)
578
+ OPTIONAL MATCH (m1)-[:REPORTS_TO]->(m2:Employee)
579
+ // If m1 is null, m2 is also null
580
+ ```
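`OPTIONAL MATCH` behaves much like a left outer join. The following Python sketch models the first example above with made-up data; `optional_match_friends` is a hypothetical helper, not part of FlowQuery.

```python
def optional_match_friends(people, knows):
    """Emit one row per (person, friend); keep people without friends with friend=None."""
    names = {person["id"]: person["name"] for person in people}
    rows = []
    for person in people:
        friend_ids = [right for left, right in knows if left == person["id"]]
        if not friend_ids:
            # MATCH would drop this person; OPTIONAL MATCH keeps the row with a null friend.
            rows.append({"name": person["name"], "friend": None})
        for friend_id in friend_ids:
            rows.append({"name": person["name"], "friend": names.get(friend_id)})
    return rows
```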
581
+
582
+ #### Graph Utility Functions
583
+
584
+ | Function | Description |
585
+ | ------------------------- | -------------------------------------------------------------------------------- |
586
+ | `nodes(path)` | List of nodes in a path |
587
+ | `relationships(path)` | List of relationships in a path |
588
+ | `properties(node_or_rel)` | Properties map (excludes `id` for nodes, `left_id`/`right_id` for relationships) |
589
+ | `schema()` | Introspect registered virtual node labels and relationship types |
590
+
591
+ ```cypher
592
+ MATCH p=(:City)-[:CONNECTED_TO]-(:City)
593
+ RETURN nodes(p) AS cities, relationships(p) AS rels
594
+
595
+ CALL schema() YIELD kind, label, type, from_label, to_label, properties, sample
596
+ RETURN kind, label, properties
597
+ ```
598
+
599
+ #### Filter Pass-Down (Parameter References)
600
+
601
+ Virtual node/relationship definitions can reference `$paramName` or `$args.paramName` to receive filter values from `MATCH` constraints and `WHERE` equality predicates. This enables dynamic data loading (e.g., API calls parameterized by match constraints).
602
+
603
+ ```cypher
604
+ CREATE VIRTUAL (:Todo) AS {
605
+ LOAD JSON FROM f"https://api.example.com/todos/{coalesce($id, 1)}" AS todo
606
+ RETURN todo.id AS id, todo.title AS title
143
607
  }
144
- post {
145
- messages: [{role: 'user', content: 'Answer with this is a test!'}],
146
- temperature: 0.7
147
- } as data
148
- return data
608
+
609
+ // $id receives the value 3 from the constraint
610
+ MATCH (t:Todo {id: 3}) RETURN t.title
611
+
612
+ // Also extracted from WHERE equality
613
+ MATCH (t:Todo) WHERE t.id = 3 RETURN t.title
149
614
  ```
150
615
 
151
- ## Extending FlowQuery with Custom Functions
616
+ `$`-prefixed identifiers are **only** allowed inside virtual definitions. Non-equality operators in `WHERE` (`>`, `<`, `CONTAINS`, etc.) are **not** extracted as pass-down parameters. `OR` predicates are also not extracted.
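The extraction rules can be illustrated with a small Python model. This is a sketch under stated assumptions, not the engine's code: predicates are represented as hypothetical `(property, operator, value)` tuples that are implicitly ANDed, and `extract_pass_down` and `todo_url` are made-up names.

```python
def extract_pass_down(constraints, where_predicates):
    """Collect parameter values from node constraints and WHERE equality predicates."""
    params = dict(constraints)  # e.g. (t:Todo {id: 3}) -> {"id": 3}
    for prop, operator, value in where_predicates:
        if operator == "=":     # only equality is passed down; >, <, CONTAINS are ignored
            params[prop] = value
    return params


def todo_url(params):
    """Mirror of f".../todos/{coalesce($id, 1)}": fall back to 1 when $id is absent."""
    todo_id = params.get("id")
    return f"https://api.example.com/todos/{1 if todo_id is None else todo_id}"
```

A query without any `id` filter therefore loads the default todo, which is why the definition above wraps `$id` in `coalesce`.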
152
617
 
153
- FlowQuery supports extending its functionality with custom functions using the `@FunctionDef` decorator. You can create scalar functions, aggregate functions, predicate functions, and async data providers.
618
+ ### Reserved Keywords as Identifiers
154
619
 
155
- ### Installing the Extensibility API
620
+ Reserved words (`return`, `with`, `from`, `to`, etc.) can be used as:
156
621
 
157
- Import the necessary classes and decorators from the extensibility module:
622
+ - Variable aliases: `WITH 1 AS return RETURN return`
623
+ - Property keys: `data.from`, `data.to`
624
+ - Map keys: `{return: 1}`
625
+ - Node labels and relationship types: `(:Return)-[:With]->()`
158
626
 
159
- ```typescript
160
- import {
161
- AggregateFunction,
162
- Function,
163
- FunctionDef,
164
- PredicateFunction,
165
- ReducerElement,
166
- } from "flowquery/extensibility";
627
+ ### Introspection
628
+
629
+ Discover all registered functions (built-in and custom):
630
+
631
+ ```cypher
632
+ WITH functions() AS funcs UNWIND funcs AS f
633
+ RETURN f.name, f.description, f.category
634
+ ```
635
+
636
+ ---
637
+
638
+ ## Quick Cheat Sheet
639
+
640
+ ```
641
+ ┌─────────────────────────────────────────────────────────────┐
642
+ │ CLAUSE SYNTAX │
643
+ ├─────────────────────────────────────────────────────────────┤
644
+ │ RETURN expr [AS alias], ... [WHERE cond] │
645
+ │ │ [ORDER BY expr [ASC|DESC], ...] [LIMIT n] │
646
+ │ WITH expr [AS alias], ... [WHERE cond] │
647
+ │ UNWIND list AS var │
648
+ │ LOAD JSON FROM url [HEADERS {...}] [POST {...}] AS alias │
649
+ │ CALL func() [YIELD field, ...] │
650
+ │ query1 UNION [ALL] query2 │
651
+ │ stmt1; stmt2; ... stmtN -- multi-statement │
652
+ │ LIMIT n │
653
+ ├─────────────────────────────────────────────────────────────┤
654
+ │ GRAPH OPERATIONS │
655
+ ├─────────────────────────────────────────────────────────────┤
656
+ │ CREATE VIRTUAL (:Label) AS { subquery } │
657
+ │ CREATE VIRTUAL (:L1)-[:TYPE]-(:L2) AS { subquery } │
658
+ │ DELETE VIRTUAL (:Label) │
659
+ │ DELETE VIRTUAL (:L1)-[:TYPE]-(:L2) │
660
+ │ MATCH (n:Label {prop: val}), ... [WHERE cond] │
661
+ │ MATCH (a)-[:TYPE]->(b) -- rightward │
662
+ │ MATCH (a)<-[:TYPE]-(b) -- leftward │
663
+ │ MATCH (a)-[:TYPE*0..3]->(b) -- variable length │
664
+ │ MATCH (a)-[:T1|T2]->(b) -- ORed types │
665
+ │ MATCH p=(a)-[:TYPE]->(b) -- pattern variable │
666
+ │ OPTIONAL MATCH (a)-[:TYPE]->(b) -- null if no match │
667
+ ├─────────────────────────────────────────────────────────────┤
668
+ │ WHERE OPERATORS │
669
+ ├─────────────────────────────────────────────────────────────┤
670
+ │ = <> > >= < <= │
671
+ │ AND OR NOT │
672
+ │ IS NULL · IS NOT NULL │
673
+ │ IN [...] · NOT IN [...] │
674
+ │ CONTAINS · NOT CONTAINS │
675
+ │ STARTS WITH · NOT STARTS WITH │
676
+ │ ENDS WITH · NOT ENDS WITH │
677
+ ├─────────────────────────────────────────────────────────────┤
678
+ │ EXPRESSIONS │
679
+ ├─────────────────────────────────────────────────────────────┤
680
+ │ + - * / ^ % -- arithmetic │
681
+ │ "str" + "str" -- string concat │
682
+ │ [1,2] + [3,4] -- list concat │
683
+ │ f"hello {expr}" -- f-string │
684
+ │ {key: val, ...} -- map literal │
685
+ │ obj.key · obj["key"] -- property access │
686
+ │ list[0:3] · list[:-2] -- slicing │
687
+ │ CASE WHEN cond THEN v ELSE v END -- conditional │
688
+ │ DISTINCT -- deduplicate │
689
+ ├─────────────────────────────────────────────────────────────┤
690
+ │ LIST COMPREHENSIONS │
691
+ ├─────────────────────────────────────────────────────────────┤
692
+ │ [x IN list | expr] -- map │
693
+ │ [x IN list WHERE cond] -- filter │
694
+ │ [x IN list WHERE cond | expr] -- filter + map │
695
+ ├─────────────────────────────────────────────────────────────┤
696
+ │ AGGREGATE FUNCTIONS │
697
+ ├─────────────────────────────────────────────────────────────┤
698
+ │ sum(x) avg(x) count(x) min(x) max(x) collect(x) │
699
+ │ count(DISTINCT x) · collect(DISTINCT x) │
700
+ │ sum(v IN list | expr [WHERE cond]) -- inline predicate │
701
+ ├─────────────────────────────────────────────────────────────┤
702
+ │ SCALAR FUNCTIONS │
703
+ ├─────────────────────────────────────────────────────────────┤
704
+ │ size range round rand split join replace │
705
+ │ toLower trim substring toString toInteger toFloat │
706
+ │ tojson stringify string_distance keys properties │
707
+ │ type coalesce head tail last id elementId │
708
+ ├─────────────────────────────────────────────────────────────┤
709
+ │ TEMPORAL FUNCTIONS │
710
+ ├─────────────────────────────────────────────────────────────┤
711
+ │ datetime() date() time() localtime() localdatetime() │
712
+ │ timestamp() duration() │
713
+ │ Properties: .year .month .day .hour .minute .second │
714
+ │ .millisecond .epochMillis .dayOfWeek .quarter .formatted │
715
+ ├─────────────────────────────────────────────────────────────┤
716
+ │ GRAPH FUNCTIONS │
717
+ ├─────────────────────────────────────────────────────────────┤
718
+ │ nodes(path) relationships(path) properties(node) │
719
+ │ schema() functions() │
720
+ ├─────────────────────────────────────────────────────────────┤
721
+ │ PARAMETER PASS-DOWN (inside virtual definitions only) │
722
+ ├─────────────────────────────────────────────────────────────┤
723
+ │ $paramName · $args.paramName │
724
+ │ coalesce($id, defaultValue) -- with fallback │
725
+ └─────────────────────────────────────────────────────────────┘
167
726
  ```
168
727
 
728
+ ---
729
+
730
+ ## Extending FlowQuery with Custom Functions
731
+
732
+ FlowQuery supports extending its functionality with custom functions using the `@FunctionDef` decorator. You can create scalar functions, aggregate functions, predicate functions, and async data providers.
733
+
169
734
  ### Creating a Custom Scalar Function
170
735
 
171
736
  Scalar functions operate on individual values and return a result: