@clickhouse/client 1.22.0 → 1.23.0-head.287977a.1
- package/README.md +2 -1
- package/dist/client.d.ts +2 -2
- package/dist/client.js +11 -4
- package/dist/client.js.map +1 -1
- package/dist/common/clickhouse_types.d.ts +98 -0
- package/dist/common/clickhouse_types.js +30 -0
- package/dist/common/clickhouse_types.js.map +1 -0
- package/dist/common/client.d.ts +233 -0
- package/dist/common/client.js +414 -0
- package/dist/common/client.js.map +1 -0
- package/dist/common/config.d.ts +234 -0
- package/dist/common/config.js +364 -0
- package/dist/common/config.js.map +1 -0
- package/dist/common/connection.d.ts +124 -0
- package/dist/common/connection.js +3 -0
- package/dist/common/connection.js.map +1 -0
- package/dist/common/data_formatter/format_query_params.d.ts +11 -0
- package/dist/common/data_formatter/format_query_params.js +128 -0
- package/dist/common/data_formatter/format_query_params.js.map +1 -0
- package/dist/common/data_formatter/format_query_settings.d.ts +2 -0
- package/dist/common/data_formatter/format_query_settings.js +20 -0
- package/dist/common/data_formatter/format_query_settings.js.map +1 -0
- package/dist/common/data_formatter/formatter.d.ts +41 -0
- package/dist/common/data_formatter/formatter.js +78 -0
- package/dist/common/data_formatter/formatter.js.map +1 -0
- package/dist/common/data_formatter/index.d.ts +3 -0
- package/dist/common/data_formatter/index.js +24 -0
- package/dist/common/data_formatter/index.js.map +1 -0
- package/dist/common/error/error.d.ts +20 -0
- package/dist/common/error/error.js +73 -0
- package/dist/common/error/error.js.map +1 -0
- package/dist/common/error/index.d.ts +1 -0
- package/dist/common/error/index.js +18 -0
- package/dist/common/error/index.js.map +1 -0
- package/dist/common/index.d.ts +67 -0
- package/dist/common/index.js +97 -0
- package/dist/common/index.js.map +1 -0
- package/dist/common/logger.d.ts +80 -0
- package/dist/common/logger.js +154 -0
- package/dist/common/logger.js.map +1 -0
- package/dist/common/parse/column_types.d.ts +155 -0
- package/dist/common/parse/column_types.js +594 -0
- package/dist/common/parse/column_types.js.map +1 -0
- package/dist/common/parse/index.d.ts +2 -0
- package/dist/common/parse/index.js +19 -0
- package/dist/common/parse/index.js.map +1 -0
- package/dist/common/parse/json_handling.d.ts +19 -0
- package/dist/common/parse/json_handling.js +8 -0
- package/dist/common/parse/json_handling.js.map +1 -0
- package/dist/common/result.d.ts +90 -0
- package/dist/common/result.js +3 -0
- package/dist/common/result.js.map +1 -0
- package/dist/common/settings.d.ts +2007 -0
- package/dist/common/settings.js +19 -0
- package/dist/common/settings.js.map +1 -0
- package/dist/common/tracing.d.ts +146 -0
- package/dist/common/tracing.js +76 -0
- package/dist/common/tracing.js.map +1 -0
- package/dist/common/ts_utils.d.ts +4 -0
- package/dist/common/ts_utils.js +3 -0
- package/dist/common/ts_utils.js.map +1 -0
- package/dist/common/utils/connection.d.ts +21 -0
- package/dist/common/utils/connection.js +43 -0
- package/dist/common/utils/connection.js.map +1 -0
- package/dist/common/utils/index.d.ts +5 -0
- package/dist/common/utils/index.js +22 -0
- package/dist/common/utils/index.js.map +1 -0
- package/dist/common/utils/multipart.d.ts +34 -0
- package/dist/common/utils/multipart.js +81 -0
- package/dist/common/utils/multipart.js.map +1 -0
- package/dist/common/utils/sleep.d.ts +4 -0
- package/dist/common/utils/sleep.js +12 -0
- package/dist/common/utils/sleep.js.map +1 -0
- package/dist/common/utils/stream.d.ts +15 -0
- package/dist/common/utils/stream.js +50 -0
- package/dist/common/utils/stream.js.map +1 -0
- package/dist/common/utils/url.d.ts +20 -0
- package/dist/common/utils/url.js +67 -0
- package/dist/common/utils/url.js.map +1 -0
- package/dist/common/version.d.ts +2 -0
- package/dist/common/version.js +4 -0
- package/dist/common/version.js.map +1 -0
- package/dist/config.d.ts +22 -2
- package/dist/config.js +2 -2
- package/dist/config.js.map +1 -1
- package/dist/connection/compression.d.ts +2 -2
- package/dist/connection/compression.js +4 -4
- package/dist/connection/compression.js.map +1 -1
- package/dist/connection/create_connection.d.ts +1 -1
- package/dist/connection/node_base_connection.d.ts +3 -3
- package/dist/connection/node_base_connection.js +22 -22
- package/dist/connection/node_base_connection.js.map +1 -1
- package/dist/connection/node_custom_agent_connection.js +2 -2
- package/dist/connection/node_custom_agent_connection.js.map +1 -1
- package/dist/connection/node_http_connection.js +2 -2
- package/dist/connection/node_http_connection.js.map +1 -1
- package/dist/connection/node_https_connection.d.ts +1 -1
- package/dist/connection/node_https_connection.js +3 -3
- package/dist/connection/node_https_connection.js.map +1 -1
- package/dist/connection/socket_pool.d.ts +1 -1
- package/dist/connection/socket_pool.js +30 -30
- package/dist/connection/socket_pool.js.map +1 -1
- package/dist/connection/stream.d.ts +1 -1
- package/dist/connection/stream.js +9 -9
- package/dist/connection/stream.js.map +1 -1
- package/dist/index.d.ts +9 -7
- package/dist/index.js +26 -24
- package/dist/index.js.map +1 -1
- package/dist/result_set.d.ts +1 -1
- package/dist/result_set.js +10 -10
- package/dist/result_set.js.map +1 -1
- package/dist/utils/encoder.d.ts +1 -1
- package/dist/utils/encoder.js +5 -5
- package/dist/utils/encoder.js.map +1 -1
- package/dist/version.d.ts +1 -1
- package/dist/version.js +1 -1
- package/dist/version.js.map +1 -1
- package/package.json +7 -5
- package/skills/clickhouse-js-node-rowbinary-parser/EXAMPLES.md +48 -0
- package/skills/clickhouse-js-node-rowbinary-parser/README.md +255 -0
- package/skills/clickhouse-js-node-rowbinary-parser/SKILL.md +206 -0
- package/skills/clickhouse-js-node-rowbinary-parser/case-studies/iot-rowbinary-vs-json.md +83 -0
- package/skills/clickhouse-js-node-rowbinary-parser/case-studies/ledger-rowbinary-vs-json.md +103 -0
- package/skills/clickhouse-js-node-rowbinary-parser/case-studies/logs-json-wins.md +86 -0
- package/skills/clickhouse-js-node-rowbinary-parser/case-studies/wasm-vs-js.md +172 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/aggregateFunction.ts +34 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/bool.ts +10 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/columnar.ts +125 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/compile.ts +318 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/composite.ts +181 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/core.ts +77 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/datetime.ts +113 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/decimals.ts +57 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/dynamic.ts +328 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/enums.ts +28 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/carts.ts +71 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/events.ts +51 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/iot.ts +158 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/ledger.ts +98 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/logs.ts +73 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/observability.ts +142 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/orders.ts +65 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/profiles.ts +60 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/examples/telemetry.ts +102 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/floats.ts +32 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/geo.ts +109 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/header.ts +29 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/integers.ts +95 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/interval.ts +54 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/ip.ts +93 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/json.ts +33 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/lowCardinality.ts +18 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/nested.ts +23 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/nothing.ts +29 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/reader.ts +68 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/rowBinaryWithNamesAndTypes.ts +155 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/rows.ts +58 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/simpleAggregateFunction.ts +20 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/stream.ts +276 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/strings.ts +55 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/time.ts +61 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/uuid.ts +153 -0
- package/skills/clickhouse-js-node-rowbinary-parser/src/varint.ts +70 -0
@@ -0,0 +1,206 @@
---
name: clickhouse-js-node-rowbinary-parser
description: >
  Generate TypeScript/JavaScript code that reads and decodes ClickHouse
  RowBinary streams from the ClickHouse HTTP server.
  Use this skill whenever a user wants to parse `RowBinary`,
  `RowBinaryWithNames`, or `RowBinaryWithNamesAndTypes`.
  Node.js only, doesn't cover browsers.
---

# ClickHouse JS RowBinary Parser Generator for Node.js

## First: is RowBinary even the right format?

RowBinary exists for throughput, but it is **not automatically the fastest
path** — match the format to the shape of the data before committing to a
bespoke parser.

**Prefer a `JSON*` format (e.g. `JSONEachRow`) when** the result is mostly
strings / JSON-like values that you consume wholesale — randomly accessing
essentially every field, running string/regexp methods on them, treating values
as text. V8's native `JSON.parse` is heavily optimized C++ and builds JS strings
and objects faster than a JS-level RowBinary decoder can; pair it with HTTP
response compression (`gzip` / `zstd`, which crushes JSON's repetitive keys) and
the wire cost shrinks too.

**RowBinary clearly wins when** the result is dominated by:

- **Wide numerics** — `Int128`/`Int256`/`UInt128`/`UInt256`,
  `Decimal128`/`Decimal256`.
- **Binary / fixed-width blobs** — `IPv4`, `IPv6`, `UUID`, `FixedString`.
- **High-volume fixed-width numeric columns** generally, where each value is a
  single `DataView` read.

**Prefer the `Native` format when** columnar load and client-side analytics are
the main goal (fold/scan/filter columns, feed typed arrays to a Worker or WASM).
`Native` is column-major, so it loads straight into one typed array per column
with no transpose.

For help choosing and consuming a `JSON*` format (or CSV / TSV) instead, use the
**`clickhouse-js-node-coding`** skill.

## Second: complete buffer, or incremental stream?

Decide this before writing the reader — it changes the shape of the code and is
a real performance fork.

- **Incremental / streaming (the default here).** You consume the HTTP response
  chunk by chunk as it arrives — low latency to the first row, bounded memory.
  It is generally the best choice for large results, but slower per-row.

- **Whole buffer in memory (faster, when it fits).** If you already hold the
  entire response as one `Buffer`, the bounds check never fires — so you can drop
  `advance()` entirely and read at a running offset in one monolithic loop.
  This is 2–3x faster but introduces latency and unbounded memory use.

The exposed API is streaming by default and requires an optimisation pass.

## Third: row objects, or columnar (typed arrays)?

The default output is one object per row (array-of-structs). For a **numeric,
fixed-width result that the consumer reads column-wise**, decode instead into one
typed array per column (struct-of-arrays) — it is **~4x faster and several times
smaller** because it removes the per-row object, `Date`, and number-boxing
allocations that dominate a numeric decode (the byte reads are already at memory
bandwidth). Measured in `tests/iot.columnar.bench.ts`; rationale in
`case-studies/wasm-vs-js.md`.

- **Use columnar when** columns are numeric/fixed-width and the consumer
  aggregates / filters / scans / plots them, or hands the buffers to a Worker or
  WASM kernel (typed-array `ArrayBuffer`s are transferable — zero-copy).
- **The preallocation trick:** if EVERY column is fixed-width the row stride is
  known, so the exact count is `buf.length / stride` — allocate each column once,
  write at `[i]`, no growth, no per-row bounds check.
- **Streaming columnar is just that arithmetic per chunk.** Fixed width means
  honoring a partial buffer needs no `advance()`/`NeedMoreData`/restart: the
  complete-row count is `(chunk.length / stride) | 0`, and the leftover bytes
  carry to the next chunk. Yield one typed-array batch per chunk, each owning a
  fresh transferable `ArrayBuffer` (see `streamSensorColumns` in
  `src/columnar.ts`).
- **Stay row-oriented when** downstream code is row-shaped, the row is
  string-dominated (columnar's win is numeric — a JS string allocates either
  way), or the schema is nested/heterogeneous (`Array`/`Map`/`Tuple`).
- **Hybrid:** store columnar, expose a lazy `rowAt(i)` accessor that builds an
  object only for rows actually touched (see `iotRowAt` in `src/examples/iot.ts`).
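
The preallocation and per-chunk arithmetic above can be sketched as follows. This is a minimal illustration under assumptions, not the skill's `streamSensorColumns`: the two-column schema (`sensor_id UInt32`, `temperature Float64`, stride 12) and the names `decodeColumns` and `streamColumns` are hypothetical, and a synchronous `Iterable` stands in for the HTTP response stream.

```typescript
// Hypothetical schema: sensor_id UInt32 (4 bytes) + temperature Float64 (8 bytes).
const STRIDE = 12;

interface ColumnBatch {
  sensorId: Uint32Array;
  temperature: Float64Array;
}

// Decode `count` complete rows starting at byte 0 of `bytes`.
// Every column is allocated once at its exact size; no growth, no per-row bounds check.
function decodeColumns(bytes: Uint8Array, count: number): ColumnBatch {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const sensorId = new Uint32Array(count);
  const temperature = new Float64Array(count);
  for (let i = 0, off = 0; i < count; i++, off += STRIDE) {
    // UInt32
    sensorId[i] = view.getUint32(off, true);
    // Float64
    temperature[i] = view.getFloat64(off + 4, true);
  }
  return { sensorId, temperature };
}

// Streaming form: the complete-row count per chunk is plain integer division,
// and the leftover bytes carry over to the next chunk.
function* streamColumns(chunks: Iterable<Uint8Array>): Generator<ColumnBatch> {
  let carry = new Uint8Array(0);
  for (const chunk of chunks) {
    // Join the carried partial row with the new chunk (a copy, kept simple for the sketch).
    const joined = new Uint8Array(carry.byteLength + chunk.byteLength);
    joined.set(carry, 0);
    joined.set(chunk, carry.byteLength);
    const count = (joined.byteLength / STRIDE) | 0;
    const used = count * STRIDE;
    // Each yielded batch owns freshly allocated typed arrays.
    if (count > 0) yield decodeColumns(joined.subarray(0, used), count);
    carry = joined.slice(used);
  }
  if (carry.byteLength !== 0) throw new RangeError("stream ended mid-row");
}
```

A whole-buffer decode is the same `decodeColumns` call with `count = bytes.byteLength / STRIDE`.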

## Fourth: are the column types known ahead of time?

- **Known (the default).** Generate a straight-line reader specialized to those
  types — everything below.
- **Only at runtime** (the schema varies, or you just want to decode an arbitrary
  `RowBinaryWithNamesAndTypes` stream). Call
  `compileRowBinaryWithNamesAndTypes(cursor)` (`src/rowBinaryWithNamesAndTypes.ts`):
  it reads the header, folds each column type's AST into a `Reader`
  (`astToReader`, `src/compile.ts`; type strings parsed by
  `@clickhouse/datatype-parser`), and returns a `readRows` driver for the rest of
  the stream. Generic and unoptimized (no codegen), so prefer the specialized
  path whenever the types are fixed.
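
To make the runtime path less abstract, here is a minimal sketch of the header step only. It relies on the documented layout of `RowBinaryWithNamesAndTypes`: a LEB128 column count, then that many column names, then that many type strings, each a length-prefixed UTF-8 string. It is not the skill's `readHeader`; the names `parseHeader`, `readUVarintAt`, and `readStringAt` are hypothetical, and it assumes the whole header is already in one `Uint8Array`.

```typescript
const utf8 = new TextDecoder();

interface Header {
  names: string[];
  types: string[];
  bodyOffset: number; // index of the first row byte after the header
}

// Unsigned LEB128: 7 payload bits per byte, high bit set means "more follows".
function readUVarintAt(bytes: Uint8Array, pos: number): [number, number] {
  let value = 0;
  let shift = 0;
  for (;;) {
    if (pos >= bytes.byteLength) throw new RangeError("truncated varint");
    const b = bytes[pos++];
    value += (b & 0x7f) * 2 ** shift; // multiply instead of << to avoid 32-bit overflow
    if ((b & 0x80) === 0) return [value, pos];
    shift += 7;
  }
}

// String: LEB128 byte length followed by that many UTF-8 bytes.
function readStringAt(bytes: Uint8Array, pos: number): [string, number] {
  const [len, start] = readUVarintAt(bytes, pos);
  const end = start + len;
  if (end > bytes.byteLength) throw new RangeError("truncated string");
  return [utf8.decode(bytes.subarray(start, end)), end];
}

function parseHeader(bytes: Uint8Array): Header {
  const [count, afterCount] = readUVarintAt(bytes, 0);
  let pos = afterCount;
  const names: string[] = [];
  const types: string[] = [];
  for (let i = 0; i < count; i++) {
    const [name, next] = readStringAt(bytes, pos);
    names.push(name);
    pos = next;
  }
  for (let i = 0; i < count; i++) {
    const [type, next] = readStringAt(bytes, pos);
    types.push(type);
    pos = next;
  }
  return { names, types, bodyOffset: pos };
}
```

The type strings returned here are what a type parser would then turn into per-column readers; that step is left to the skill's own code.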

## Core guidance

When generating a parser, follow these:

- **Little-endian only.** RowBinary is little-endian; target x86/ARM. Read every
  multi-byte number with `DataView` accessors passing a **literal** `true` for
  the `littleEndian` flag.

- **Correct first, then optimize.** First emit a correct reader built from the
  plain per-type API. Only after it's correct (and tested) specialize it. Don't
  bake performance assumptions in before correctness.

- **Monomorphize generic/composite types.** Emit specialized, inlined code per
  type combination instead of passing functions as arguments where the type
  is known ahead of time.

- **Streaming: throw + restart, not generators.** To signal "need more bytes",
  a synchronous reader that throws a sentinel (`NeedMoreData`) and restarts the
  row beats generators for realistic chunk sizes.

- **Keep an eye on chunk sizes.** Partial trailing rows and small chunks are a
  silent throughput killer: `streamRowBatches` warns once when
  rows-per-chunk falls too low, and `coalesceChunks(source, { minSize, timeoutMs })`
  merges small chunks in front of it when the source size isn't yours to raise.

- **Shared scratch is not reentrant.** Some hot methods reuse a module-level
  scratch buffer as a write-then-read pair — correct only because reads are fully
  synchronous. An `async`/`yield` boundary between populating and reading it
  corrupts the value.

- **Hoist the cursor into locals.** Prefer the working buffer and view declared
  once at the top of the generated reader, and keep the read offset in a
  **local variable**, operating on it directly instead of re-reading from an object.

- **Coalesce `advance()` across adjacent fixed-width columns.** A run of
  neighbouring fixed-width columns has a known combined size, so bounds-check it
  ONCE.

- **Inline the leaf reads.** The per-type `readX` functions are the correct,
  composable reference; the generated parser should INLINE their bodies, not call
  them, so the row reader is straight-line with no per-field indirection (and so
  the two points above can fold the offset arithmetic together).

- **Annotate the decoded type per column.** Inlining erases the type structure,
  so put a short comment above each column's decode block naming the ClickHouse
  type it reads.

- **Pre-allocate small result arrays.** RowBinary gives every array/map its
  element count up front (the LEB128 prefix), so the DEFAULT is `new Array(n)`.
  NOTE: for **large** arrays the application will iterate or compute over repeatedly,
  prefer `[]` + `push` (faster to traverse in V8) — or a typed array (`Float64Array`…)
  for numeric elements.

- **TypeScript by default.** Generate TypeScript parsers and helpers unless the
  user explicitly asks for plain JavaScript.
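
The throw-and-restart pattern from the list above can be sketched like this. It is an illustration under assumptions rather than the code in `src/core.ts` or `src/stream.ts`: the row shape (`id UInt32`, `name String`), the `Cursor` layout, and the names `NEED_MORE_DATA`, `readRow`, and `createDecoder` are all hypothetical, and it favours clarity over the inlining and coalescing described above.

```typescript
// Sentinel thrown when the current chunk ends inside a row.
const NEED_MORE_DATA = { name: "NeedMoreData" } as const;

interface Cursor {
  bytes: Uint8Array;
  view: DataView;
  pos: number;
}

// Bounds-check `n` bytes, move the cursor past them, return where they start.
function advance(c: Cursor, n: number): number {
  const start = c.pos;
  if (start + n > c.bytes.byteLength) throw NEED_MORE_DATA;
  c.pos = start + n;
  return start;
}

interface User {
  id: number;
  name: string;
}

const utf8 = new TextDecoder();

// Fully synchronous row reader; any read may throw NEED_MORE_DATA.
function readRow(c: Cursor): User {
  // UInt32
  const id = c.view.getUint32(advance(c, 4), true);
  // String: LEB128 byte length, then UTF-8 bytes
  let len = 0;
  let shift = 0;
  for (;;) {
    const b = c.bytes[advance(c, 1)];
    len += (b & 0x7f) * 2 ** shift;
    if ((b & 0x80) === 0) break;
    shift += 7;
  }
  const start = advance(c, len);
  return { id, name: utf8.decode(c.bytes.subarray(start, start + len)) };
}

function createDecoder() {
  let pending = new Uint8Array(0); // bytes of a row cut off by the previous chunk
  return {
    push(chunk: Uint8Array): User[] {
      const bytes = new Uint8Array(pending.byteLength + chunk.byteLength);
      bytes.set(pending, 0);
      bytes.set(chunk, pending.byteLength);
      const c: Cursor = { bytes, view: new DataView(bytes.buffer), pos: 0 };
      const rows: User[] = [];
      let rowStart = 0;
      try {
        while (c.pos < bytes.byteLength) {
          rowStart = c.pos; // remember where to restart this row
          rows.push(readRow(c));
        }
        rowStart = c.pos;
      } catch (e) {
        if (e !== NEED_MORE_DATA) throw e;
      }
      // Keep the partial row; it is re-read from its start on the next push.
      pending = bytes.slice(rowStart);
      return rows;
    },
    end(): void {
      if (pending.byteLength !== 0) throw new RangeError("stream ended mid-row");
    },
  };
}
```

With small chunks most of the work goes into re-reading partial rows and copying `pending`, which is one way to see why chunk size matters for throughput.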

## Type family references

The readers live as real code under `src/`, split by type family.

| Result contains (trigger) | Open |
| --- | --- |
| **Always** — cursor state, `advance()`, `NeedMoreData`, `Reader<T>` | `src/core.ts` |
| LEB128 length/count prefixes for `String`/`Array`/`Map` (`readUVarint`) | `src/varint.ts` |
| `Int8`–`Int256`, `UInt8`–`UInt256` | `src/integers.ts` |
| `Bool` | `src/bool.ts` |
| `Enum8`, `Enum16` | `src/enums.ts` |
| `Float32`, `Float64`, `BFloat16` | `src/floats.ts` |
| `Decimal32/64/128/256`, `Decimal(P, S)` | `src/decimals.ts` |
| `String`, `FixedString(N)` | `src/strings.ts` |
| `UUID` | `src/uuid.ts` |
| `IPv4`, `IPv6` | `src/ip.ts` |
| `Date`, `Date32`, `DateTime`, `DateTime(tz)`, `DateTime64(P[, tz])` | `src/datetime.ts` |
| `Time`, `Time64(P)` | `src/time.ts` |
| `IntervalNanosecond` … `IntervalYear` | `src/interval.ts` |
| `Array(T)`, `Map(K, V)`, `Tuple(...)`, `Nullable(T)`, `Variant(...)`, `QBit(...)` | `src/composite.ts` |
| `Point`, `Ring`, `LineString`, `MultiLineString`, `Polygon`, `MultiPolygon`, `Geometry` | `src/geo.ts` |
| `Dynamic` (and `Variant`/`Interval`/`Nested`/`Dynamic` nested inside it) | `src/dynamic.ts` |
| `JSON` | `src/json.ts` |
| The whole result — loop rows to EOF (`readRows`) | `src/rows.ts` |
| A chunked HTTP response — `streamRowBatches`, `coalesceChunks` | `src/stream.ts` |
| The `RowBinaryWithNamesAndTypes` header — column names + type strings (`readHeader`) | `src/header.ts` |
| Fold one parsed type AST into a `Reader` (`astToReader`) — AST in, reader out | `src/compile.ts` |
| **Types known only at runtime** — compile a whole header into a row reader (`compileRowBinaryWithNamesAndTypes`, `typeStringToReader`) | `src/rowBinaryWithNamesAndTypes.ts` |
| **Numeric/fixed-width result read column-wise** (aggregate/scan/plot, hand to a Worker/WASM) → decode into typed arrays, not row objects (~4x) | `src/columnar.ts` (`streamSensorColumns` — streaming, yields transferable typed-array batches); `decodeIotColumnar` in `src/examples/iot.ts` is the whole-buffer form |
| `LowCardinality(T)` — transparent, decode as `T` | `src/lowCardinality.ts` |
| `SimpleAggregateFunction(f, T)` — transparent, decode as `T` | `src/simpleAggregateFunction.ts` |
| `Nested(...)` — no wire of its own; `Array(Tuple(...))` | `src/nested.ts` |
| `Nothing` — zero-width, never decoded (only wrapped) | `src/nothing.ts` |
| `AggregateFunction(...)` — opaque state; finalize server-side | `src/aggregateFunction.ts` |

## Worked examples

Six end-to-end examples with real speedup are catalogued in [EXAMPLES.md](EXAMPLES.md).

## Out of scope

- **JSON / CSV / TSV / Parquet parsing** → use `clickhouse-js-node-coding`.
- **Connection errors, hangs, type mismatches** → use
  `clickhouse-js-node-troubleshooting`.
- **Browser / Web Worker / Edge** → `@clickhouse/client-web`.

## Still Stuck?

- [ClickHouse RowBinary format](https://clickhouse.com/docs/interfaces/formats#rowbinary)
- [ClickHouse data types](https://clickhouse.com/docs/sql-reference/data-types)
- [ClickHouse JS client docs](https://clickhouse.com/docs/integrations/javascript)
@@ -0,0 +1,83 @@
# Case study: RowBinary vs JSON on a table of IoT readings

**TL;DR** — On a dense fixed-width numeric row, the skill's optimized RowBinary
reader decodes **3.5x faster than the best JSON format** (`JSONCompactEachRow`)
and **5.4x faster than `JSONEachRow`**, over a wire that is **1.6–3.3x smaller**.
This is the workload shape the [SKILL's format-choice
guidance](../SKILL.md#first-is-rowbinary-even-the-right-format) points at
RowBinary for — and the numbers below are _measured_, not assumed.

Reproduce: `npx vitest bench --run tests/iot.bench.ts` (against a live
ClickHouse server). Source: [`tests/iot.bench.ts`](../tests/iot.bench.ts),
reader: [`src/examples/iot.ts`](../src/examples/iot.ts).

## The data

A table of IoT sensor readings — every column fixed-width, not a string in the
row, so the whole record is a flat 41-byte run:

```sql
sensor_id    UInt32         -- 4 bytes
ts           DateTime64(3)  -- 8 bytes
temperature  Float64        -- 8 bytes
humidity     Float64        -- 8 bytes
pressure     Float64        -- 8 bytes
battery      Float32        -- 4 bytes
status       UInt8          -- 1 byte
```

50,000 rows, fetched from a live server in three formats and decoded into
equivalent JS objects. A cross-format check asserts the RowBinary (binary
float) and JSON (decimal-text → float) decodes agree on every numeric column
before any timing is taken — so this measures the same work three ways, not
three different results.

## What was compared

- **RowBinary — optimized.** The skill's monomorphized reader: the seven column
  bounds checks coalesce into one `advance(s, 41)`, every field read at a
  constant offset off that base.
- **RowBinary — API combinators.** The same logic written with the plain
  per-type readers (`readUInt32`, `readFloat64`, …) — the clear default.
- **JSONCompactEachRow — `JSON.parse`.** Newline-delimited _arrays_ (no repeated
  keys). The strongest JSON contender a knowledgeable user would pick.
- **JSONEachRow — `JSON.parse`.** Newline-delimited _objects_ (keys repeated
  every row) — the naive idiomatic choice.

Both JSON paths use the fastest idiomatic decode: splice the rows into one
`[...]` document and hand it to V8's native `JSON.parse` in a single call.
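
For orientation, the shape of the optimized RowBinary variant can be sketched as below. This is not the code in `src/examples/iot.ts`; the names `Reading` and `decodeReadings` are hypothetical, the sketch assumes the whole response is already in memory (so the single `advance(s, 41)` becomes one length check up front), and it assumes `DateTime64(3)` arrives as an `Int64` count of milliseconds since the Unix epoch.

```typescript
interface Reading {
  sensorId: number;
  ts: Date;
  temperature: number;
  humidity: number;
  pressure: number;
  battery: number;
  status: number;
}

const ROW_BYTES = 41; // 4 + 8 + 8 + 8 + 8 + 4 + 1

function decodeReadings(bytes: Uint8Array): Reading[] {
  if (bytes.byteLength % ROW_BYTES !== 0) {
    throw new RangeError("payload is not a whole number of 41-byte rows");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const count = bytes.byteLength / ROW_BYTES;
  const rows: Reading[] = new Array(count);
  // One running base offset per row; every field sits at a constant offset from it.
  for (let i = 0, base = 0; i < count; i++, base += ROW_BYTES) {
    rows[i] = {
      // UInt32
      sensorId: view.getUint32(base, true),
      // DateTime64(3): Int64 milliseconds since the epoch
      ts: new Date(Number(view.getBigInt64(base + 4, true))),
      // Float64
      temperature: view.getFloat64(base + 12, true),
      // Float64
      humidity: view.getFloat64(base + 20, true),
      // Float64
      pressure: view.getFloat64(base + 28, true),
      // Float32
      battery: view.getFloat32(base + 36, true),
      // UInt8
      status: view.getUint8(base + 40),
    };
  }
  return rows;
}
```

The combinator variant would compute the same values through one reader call per field, each with its own bounds check.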

## Wire size (HTTP response bytes)

| Format             | Size    | B/row | vs RowBinary |
| ------------------ | ------- | ----- | ------------ |
| RowBinary          | 2.05 MB | 41.0  | 1.0x         |
| JSONCompactEachRow | 3.38 MB | 67.6  | 1.6x         |
| JSONEachRow        | 6.68 MB | 133.6 | 3.3x         |

## Decode throughput (full 50k-row decode; higher = faster)

| Decoder                           | ops/s | ms/decode | ≈ rows/s | speedup  |
| --------------------------------- | ----- | --------- | -------- | -------- |
| **RowBinary — optimized**         | 399   | 2.50      | ~20.0 M  | **1.0x** |
| RowBinary — API combinators       | 159   | 6.31      | ~7.9 M   | 0.40x    |
| JSONCompactEachRow — `JSON.parse` | 114   | 8.76      | ~5.7 M   | 0.29x    |
| JSONEachRow — `JSON.parse`        | 74    | 13.47     | ~3.7 M   | 0.19x    |

_Node 24 / V8. Your numbers will vary; run `npm run bench` on your own hardware._

## Takeaways

- **This is the textbook RowBinary win.** High-volume fixed-width numerics where
  each field is one `DataView` read and there is no text to tokenize or numbers
  to parse from decimal strings. The monomorphization win (2.5x over the
  combinator API) is unusually large here because the whole row coalesces into a
  _single_ bounds check with constant-offset reads.
- **Format choice matters more than the optimization.** Even the plain
  combinator-API RowBinary reader (~7.9 M rows/s) beats the best JSON option —
  before any monomorphization.
- **The flip side still holds.** Had this been a string-heavy result (logs, JSON
  blobs, text consumed wholesale), `JSON.parse`'s optimized C++ would likely
  _win_, and the skill would steer you to `JSONEachRow` + compression instead.
  For IoT telemetry, RowBinary is clearly right — match the format to the shape
  of the data.
@@ -0,0 +1,103 @@
# Case study: RowBinary vs JSON on a financial ledger (wide ints & decimals)

**TL;DR** — When every column is wider than a JS `number` can hold (`UInt128`,
`Int64`, `Decimal128(18)`, `UInt256`), RowBinary wins _twice over_. Stock
`JSON.parse` is not merely slow here — it is **silently wrong**, rounding every
value to a float64. The only correct JSON path quotes the values server-side and
re-parses each string into a `bigint`/decimal pair by hand, which is **~5x
slower** than the optimized RowBinary reader over a **2.1–2.6x larger** wire.
RowBinary reads each value exactly, straight off the wire.

This is the workload the [SKILL's format-choice
guidance](../SKILL.md#first-is-rowbinary-even-the-right-format) calls out
explicitly: "RowBinary clearly wins when the result is dominated by **wide
numerics** — `Int128`/`Int256`/`UInt128`/`UInt256`, `Decimal128`/`Decimal256`."

Reproduce: `npx vitest bench --run tests/ledger.bench.ts` (against a live
ClickHouse server). Source: [`tests/ledger.bench.ts`](../tests/ledger.bench.ts),
reader: [`src/examples/ledger.ts`](../src/examples/ledger.ts).

## The data

A financial ledger — every column exceeds IEEE-754 double's 53-bit exact range:

```sql
txn_id   UInt128         -- 16 bytes
account  Int64           -- 8 bytes (values past 2^53)
amount   Decimal128(18)  -- 16 bytes (~32 significant digits)
balance  Decimal128(18)  -- 16 bytes
fee      Decimal64(4)    -- 8 bytes
volume   UInt256         -- 32 bytes
```

50,000 rows, fixed-width (96 bytes/row), fetched from a live server.

## The correctness trap

ClickHouse emits these types as **bare, unquoted JSON numbers**. So stock
`JSON.parse` parses them as float64 and silently corrupts every one — measured
on row 0 of the live result:

| Column    | Exact value (RowBinary)                   | `JSON.parse` of bare JSON                 | Error            |
| --------- | ----------------------------------------- | ----------------------------------------- | ---------------- |
| `txn_id`  | `340282366920938463463374607431768200000` | `340282366920938463463374607431768211456` | ✗ off by 11 456  |
| `account` | `9007199254740993`                        | `9007199254740992`                        | ✗ off by 1       |
| `amount`  | `98765432109876.123456789012345678`       | `98765432109876.12`                       | ✗ lost 16 digits |

No exception, no warning — just wrong numbers. For money and IDs, that is a
correctness bug, not a performance footnote.
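
The `account` and `txn_id` rows of that table are easy to reproduce without a server. The snippet below is a small self-contained check, limited to integer text; the helper names `parseBare` and `isExact` are hypothetical.

```typescript
// Parse a bare (unquoted) JSON integer the way stock JSON.parse does: as a float64.
function parseBare(text: string): number {
  return JSON.parse(text) as number;
}

// True when the float64 result still equals the integer written in the text.
function isExact(text: string): boolean {
  return BigInt(parseBare(text)) === BigInt(text);
}

// 2^53 + 1 has no float64 representation, so it lands on the neighbour 2^53.
console.log(parseBare("9007199254740993") === 2 ** 53); // true
console.log(isExact("9007199254740993")); // false
console.log(isExact("9007199254740991")); // true, still inside the safe range
```

Nothing throws in any of these calls, which is what makes the corruption easy to miss.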

### Making JSON correct costs extra work

The only way to get exact values through JSON is to **quote them server-side** so
they arrive as strings, then re-parse each one:

```sql
... SETTINGS output_format_json_quote_64bit_integers = 1,
             output_format_json_quote_decimals = 1
```

```ts
txn_id: BigInt(r.txn_id), // string -> bigint
amount: parseDecimal(r.amount, 18), // string -> [unscaled, scale]
```

That per-field `BigInt(...)` / decimal parse is work RowBinary doesn't do — it
reads the exact `bigint` directly with two `DataView` reads — and it lands on
top of a larger wire (strings are longer than the binary words).
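
As a rough sketch of the two sides of that comparison: the binary reads compose a 128-bit value from two little-endian 64-bit words (low word first, high word signed for signed types), while the JSON side has to rebuild the unscaled integer from text. The function names below are hypothetical; in particular `parseDecimalText` is only a guess at what a helper like `parseDecimal` might do, not the benchmark's implementation.

```typescript
type Decimal = [unscaled: bigint, scale: number];

// UInt128: both 64-bit words unsigned.
function readUInt128(view: DataView, off: number): bigint {
  const lo = view.getBigUint64(off, true);
  const hi = view.getBigUint64(off + 8, true);
  return (hi << 64n) | lo;
}

// Int128: the high word carries the sign.
function readInt128(view: DataView, off: number): bigint {
  const lo = view.getBigUint64(off, true);
  const hi = view.getBigInt64(off + 8, true);
  return (hi << 64n) | lo;
}

// Decimal128(S): a signed 128-bit unscaled integer plus the column's scale.
function readDecimal128(view: DataView, off: number, scale: number): Decimal {
  return [readInt128(view, off), scale];
}

// JSON side: turn quoted decimal text into the same [unscaled, scale] pair.
function parseDecimalText(text: string, scale: number): Decimal {
  const dot = text.indexOf(".");
  const whole = dot === -1 ? text : text.slice(0, dot);
  const frac = dot === -1 ? "" : text.slice(dot + 1);
  if (frac.length > scale) {
    throw new RangeError("more fractional digits than the declared scale");
  }
  // Pad the fraction to `scale` digits, then parse the digits as one integer.
  return [BigInt(whole + frac.padEnd(scale, "0")), scale];
}
```

Both paths end at the same `[unscaled, scale]` pair; the difference is that the text path allocates and scans a string per field first.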
|
|
68
|
+
|
|
69
|
+
## Wire size (correct paths quote wide values as strings)
|
|
70
|
+
|
|
71
|
+
| Format | Size | vs RowBinary |
|
|
72
|
+
| --------------------------- | -------- | ------------ |
|
|
73
|
+
| RowBinary | 4.80 MB | 1.0x |
|
|
74
|
+
| JSONCompactEachRow (quoted) | 9.88 MB | 2.1x |
|
|
75
|
+
| JSONEachRow (quoted) | 12.28 MB | 2.6x |
|
|
76
|
+
|
|
77
|
+
## Decode throughput (full 50k-row decode; higher = faster)
|
|
78
|
+
|
|
79
|
+
| Decoder                                            | ops/s | ms/decode | ≈ rows/s | speedup  | correct?       |
| -------------------------------------------------- | ----- | --------- | -------- | -------- | -------------- |
| **RowBinary — optimized**                          | 130   | 7.71      | ~6.5 M   | **1.0x** | ✅             |
| RowBinary — API combinators                        | 80    | 12.50     | ~4.0 M   | 0.62x    | ✅             |
| JSONEachRow bare — `JSON.parse` only               | 44    | 22.74     | ~2.2 M   | 0.34x    | ❌ **corrupt** |
| JSONCompactEachRow quoted — parse + BigInt/decimal | 26    | 37.78     | ~1.3 M   | 0.20x    | ✅             |
| JSONEachRow quoted — parse + BigInt/decimal        | 25    | 40.70     | ~1.2 M   | 0.19x    | ✅             |
|
|
86
|
+
|
|
87
|
+
_Node 24 / V8. Your numbers will vary; run `npm run bench` on your own hardware._
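The derived columns in the throughput table follow from `ms/decode` and the 50k-row batch size. A small sketch of that arithmetic, using values copied from the table (the helper name `derive` is illustrative):

```ts
const ROWS = 50000; // rows per decode, from the section heading

// Derives ops/s, rows/s and relative speedup from a measured ms/decode.
function derive(msPerDecode: number, baselineMs: number) {
  return {
    opsPerSec: 1000 / msPerDecode,
    rowsPerSec: ROWS / (msPerDecode / 1000),
    speedup: baselineMs / msPerDecode,
  };
}

const rowBinary = derive(7.71, 7.71);
const jsonQuoted = derive(40.7, 7.71);

console.log(Math.round(rowBinary.opsPerSec)); // 130
console.log((rowBinary.rowsPerSec / 1e6).toFixed(1)); // 6.5
console.log(jsonQuoted.speedup.toFixed(2)); // 0.19
```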
|
|
88
|
+
|
|
89
|
+
## Takeaways
|
|
90
|
+
|
|
91
|
+
- **The fast JSON path is the wrong one.** Bare `JSON.parse` is JSON's quickest
|
|
92
|
+
option and it is still 2.95x slower than RowBinary — _and_ it silently
|
|
93
|
+
corrupts every wide value. There is no "fast and correct" JSON here.
|
|
94
|
+
- **The correct JSON path is ~5x slower.** Quote + per-field `BigInt`/decimal
|
|
95
|
+
parsing is the price of correctness, on top of a 2.1–2.6x larger wire.
|
|
96
|
+
- **RowBinary is correct by construction.** Each value is composed from 64-bit
|
|
97
|
+
words read at constant offsets (high word signed for the signed types),
|
|
98
|
+
yielding an exact `bigint` or `[unscaled, scale]` pair — no rounding, no
|
|
99
|
+
string re-parsing.
|
|
100
|
+
- **Contrast with the [IoT case study](iot-rowbinary-vs-json.md):** there the
|
|
101
|
+
numbers fit a float64 and the win was purely throughput (3.5x). Here the values
|
|
102
|
+
don't fit, so the win is _correctness first_, throughput second. Match the
|
|
103
|
+
format to the shape of the data.
|
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
# Case study: JSON beats RowBinary on a string-heavy log table
|
|
2
|
+
|
|
3
|
+
**TL;DR** — This is the honest counter-case. When the result is mostly **text
|
|
4
|
+
consumed wholesale** (an application log table), `JSONCompactEachRow` +
|
|
5
|
+
`JSON.parse` decodes **1.4x faster** than the optimized RowBinary reader — and
|
|
6
|
+
once you turn on HTTP compression, RowBinary's raw-wire size advantage
|
|
7
|
+
**disappears**: gzip ties the two, and with **zstd the JSON response is actually
|
|
8
|
+
slightly smaller**. For this shape the skill steers you _away_ from RowBinary —
|
|
9
|
+
and proving that is what makes its "use RowBinary here" advice (see the
|
|
10
|
+
[IoT](iot-rowbinary-vs-json.md) and [ledger](ledger-rowbinary-vs-json.md)
|
|
11
|
+
studies) trustworthy.
|
|
12
|
+
|
|
13
|
+
This is exactly what the [SKILL's format-choice
|
|
14
|
+
guidance](../SKILL.md#first-is-rowbinary-even-the-right-format) says: prefer a
|
|
15
|
+
`JSON*` format when the result is "mostly strings / JSON-like values that you
|
|
16
|
+
consume wholesale," because V8's native `JSON.parse` is heavily optimized C++
|
|
17
|
+
and "pair it with HTTP response compression (`gzip` / `zstd`, which crushes
|
|
18
|
+
JSON's repetitive keys)."
|
|
19
|
+
|
|
20
|
+
Reproduce: `npx vitest bench --run tests/logs.bench.ts` (against a live
|
|
21
|
+
ClickHouse server). Source: [`tests/logs.bench.ts`](../tests/logs.bench.ts),
|
|
22
|
+
reader: [`src/examples/logs.ts`](../src/examples/logs.ts).
|
|
23
|
+
|
|
24
|
+
## The data
|
|
25
|
+
|
|
26
|
+
An application log table — four of five columns are text consumed as text:
|
|
27
|
+
|
|
28
|
+
```sql
ts        DateTime
level     LowCardinality(String)  -- transparent in RowBinary -> plain String
service   LowCardinality(String)
message   String                  -- templated log line, varying values
trace_id  String                  -- high-cardinality 32-char hex
```
|
|
35
|
+
|
|
36
|
+
50,000 rows, fetched from a live server. The two `LowCardinality` columns carry
|
|
37
|
+
no dictionary on the RowBinary wire — they decode as plain `String`.
|
|
38
|
+
|
|
39
|
+
## Decode throughput (full 50k-row decode; higher = faster)
|
|
40
|
+
|
|
41
|
+
| Decoder                               | ops/s | ms/decode | ≈ rows/s | speedup  |
| ------------------------------------- | ----- | --------- | -------- | -------- |
| **JSONCompactEachRow — `JSON.parse`** | 93    | 10.73     | ~4.7 M   | **1.0x** |
| JSONEachRow — `JSON.parse`            | 72    | 13.89     | ~3.6 M   | 0.77x    |
| RowBinary — optimized (monomorphized) | 66    | 15.07     | ~3.3 M   | 0.71x    |
| RowBinary — API combinators           | 54    | 18.68     | ~2.7 M   | 0.57x    |
|
|
47
|
+
|
|
48
|
+
`JSONCompactEachRow` (arrays, no repeated keys) is the fastest JSON option and
|
|
49
|
+
beats even the optimized RowBinary reader by ~1.4x. A RowBinary string is a
|
|
50
|
+
varint length + `buf.toString("utf8", …)` decoded one field at a time in JS;
|
|
51
|
+
`JSON.parse` builds the same JS strings in one optimized C++ pass.
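To make the per-field cost concrete, here is a minimal sketch of reading one RowBinary string: an unsigned LEB128 length followed by that many UTF-8 bytes. It uses `TextDecoder` on a `Uint8Array` instead of the `buf.toString("utf8", …)` call mentioned above, and the function names are illustrative rather than the package's own.

```ts
// Reads an unsigned LEB128 varint; returns [value, offset just past the varint].
function readVarUInt(buf: Uint8Array, offset: number): [number, number] {
  let value = 0;
  let shift = 0;
  for (;;) {
    const byte = buf[offset++];
    value += (byte & 0x7f) * 2 ** shift; // multiply instead of <<, so large lengths stay positive
    if ((byte & 0x80) === 0) return [value, offset];
    shift += 7;
  }
}

const utf8 = new TextDecoder();

// Reads one length-prefixed string; returns [string, offset of the next field].
function readString(buf: Uint8Array, offset: number): [string, number] {
  const [length, start] = readVarUInt(buf, offset);
  const end = start + length;
  return [utf8.decode(buf.subarray(start, end)), end];
}

const [level, next] = readString(new Uint8Array([4, 73, 78, 70, 79]), 0);
console.log(level, next); // INFO 5
```

Every string column in every row goes through this pair of calls in JS, whereas `JSON.parse` produces all strings of a row inside one native call.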
|
|
52
|
+
|
|
53
|
+
## Wire size — raw, and compressed (gzip / zstd)
|
|
54
|
+
|
|
55
|
+
| Format             | raw     | gzip    | zstd    |
| ------------------ | ------- | ------- | ------- |
| RowBinary          | 5.04 MB | 1.46 MB | 1.35 MB |
| JSONCompactEachRow | 6.84 MB | 1.51 MB | 1.32 MB |
| JSONEachRow        | 8.84 MB | 1.52 MB | 1.33 MB |
|
|
60
|
+
|
|
61
|
+
RowBinary is 1.4–1.8x smaller **raw**, which is the usual argument for it. But
|
|
62
|
+
that edge is mostly JSON's repeated structure (keys, punctuation) — exactly what
|
|
63
|
+
a compressor removes. With `gzip` the three are within ~4% of each other, and
|
|
64
|
+
with `zstd` the JSON responses are _slightly smaller_ than RowBinary. Any
|
|
65
|
+
production HTTP path should have compression on, so the wire-size case for
|
|
66
|
+
RowBinary on this data effectively vanishes.
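The effect of compression on repeated keys can be checked locally with Node's built-in `zlib`. The sketch below uses synthetic log-like rows made up for illustration (not the benchmark's data set) and compares only the two JSON layouts under gzip.

```ts
import { gzipSync } from "node:zlib";

// Synthetic rows; field names mirror the log table, values are invented.
const rows = Array.from({ length: 2000 }, (_, i) => ({
  ts: 1700000000 + i,
  level: i % 10 === 0 ? "ERROR" : "INFO",
  service: `svc-${i % 7}`,
  message: `request ${i} finished in ${(i * 37) % 500} ms`,
}));

// One JSON object per line vs one JSON array per line (keys dropped).
const eachRow = rows.map((r) => JSON.stringify(r)).join("\n");
const compactEachRow = rows.map((r) => JSON.stringify(Object.values(r))).join("\n");

const sizes = {
  eachRowRaw: Buffer.byteLength(eachRow),
  compactRaw: Buffer.byteLength(compactEachRow),
  eachRowGzip: gzipSync(eachRow).length,
  compactGzip: gzipSync(compactEachRow).length,
};
console.log(sizes);
```

On data like this the raw gap between the two layouts is the per-row key text, and most of it should disappear after gzip, which is the same mechanism the table above shows against RowBinary.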
|
|
67
|
+
|
|
68
|
+
_Node 24 / V8. Your numbers will vary; run `npm run bench` on your own hardware._
|
|
69
|
+
|
|
70
|
+
## Takeaways
|
|
71
|
+
|
|
72
|
+
- **JSON wins both axes here.** Faster to decode (~1.4x) _and_, once compressed,
|
|
73
|
+
no larger on the wire. There is no reason to hand-write a RowBinary parser for
|
|
74
|
+
this shape.
|
|
75
|
+
- **`JSONCompactEachRow` is the one to reach for** — it drops the per-row
|
|
76
|
+
repeated keys, so it parses faster than `JSONEachRow` and compresses about the
|
|
77
|
+
same.
|
|
78
|
+
- **Compression erases RowBinary's raw-size advantage on text.** RowBinary's
|
|
79
|
+
smaller raw wire comes largely from not repeating keys; a compressor already
|
|
80
|
+
does that for JSON. Always compare _compressed_ sizes when the data is
|
|
81
|
+
string-heavy.
|
|
82
|
+
- **This is the boundary of the skill.** RowBinary earns its keep on
|
|
83
|
+
numeric/wide/binary data ([IoT](iot-rowbinary-vs-json.md),
|
|
84
|
+
[ledger](ledger-rowbinary-vs-json.md)); on string-heavy results read as text,
|
|
85
|
+
the right answer is `JSONCompactEachRow` + compression. Match the format to the
|
|
86
|
+
shape of the data — and measure.
|
|
@@ -0,0 +1,172 @@
|
|
|
1
|
+
# Case study: why JS, not WASM, for RowBinary parsing (and the one place WASM wins)
|
|
2
|
+
|
|
3
|
+
**TL;DR** — A JIT-compiled JS RowBinary reader already streams bytes at **memory
|
|
4
|
+
bandwidth** (~16 GB/s), and the dominant cost of decoding is allocating the JS
|
|
5
|
+
values themselves (objects, `Date`, strings, `BigInt`) — which **WASM cannot do
|
|
6
|
+
and therefore cannot remove**. So for the skill's actual job, turning a RowBinary
|
|
7
|
+
response into usable JS data, WASM buys ~nothing (a wash, or a loss after the
|
|
8
|
+
copy-in tax). WASM wins decisively in exactly **one** different problem:
|
|
9
|
+
**in-place aggregation of wide integers / decimals** (and hash group-by), where
|
|
10
|
+
JS is forced onto heap `BigInt`/`Map`. There we measured a hand-written WASM
|
|
11
|
+
kernel at **27–38x** over JS. But that is _compute_, not parsing — and it is
|
|
12
|
+
usually pushable to ClickHouse anyway. And if you genuinely need heavier
|
|
13
|
+
**client-side analytics**, the lever isn't WASM-over-RowBinary at all — it's a
|
|
14
|
+
**columnar wire format (`Native`)**, coming soon to the JS client out of the
|
|
15
|
+
Python-client collaboration; RowBinary is row-major and fights every analytical
|
|
16
|
+
pass.
|
|
17
|
+
|
|
18
|
+
Reproduce:
|
|
19
|
+
|
|
20
|
+
- `npx vitest bench --run tests/iot.wasm-headroom.bench.ts` (the parsing headroom)
|
|
21
|
+
- `node tests/wasm-int128.experiment.mjs` (the hand-emitted WASM kernel)
|
|
22
|
+
|
|
23
|
+
All numbers Node 24 / V8; yours will vary.
|
|
24
|
+
|
|
25
|
+
## The idea under test
|
|
26
|
+
|
|
27
|
+
A tempting architecture: a _dynamic WASM JIT inside the JS runtime_. A type
|
|
28
|
+
builder (`t.Int32()`, `t.Map(t.FixedString, t.Int32())`) plus a query DSL
|
|
29
|
+
(`q.sum(q.column(1))`) compile **on the fly** to a WASM module that parses the
|
|
30
|
+
raw network chunk sitting at address 0 in linear memory, computes the answer,
|
|
31
|
+
writes it to a result region, and returns the offset where the incomplete
|
|
32
|
+
trailing row begins (streaming resume). Elegant. The question is _what it would
|
|
33
|
+
win_ — and the honest answer needs three measurements.
|
|
34
|
+
|
|
35
|
+
## Proof 1 — JIT-compiled JS reads at memory speed
|
|
36
|
+
|
|
37
|
+
V8 compiles `DataView` accessors to native loads. Folding a 32 MB column of
|
|
38
|
+
native-width values (`Float64`) in a plain JS loop:
|
|
39
|
+
|
|
40
|
+
| Read                   | ms / 32 MB | throughput    |
| ---------------------- | ---------- | ------------- |
| JS `DataView` f64 fold | 1.94 ms    | **16.5 GB/s** |
|
|
43
|
+
|
|
44
|
+
That is essentially RAM bandwidth. **There is no headroom for a "faster
|
|
45
|
+
language" to read these bytes** — JS is already at the metal. A WASM parser
|
|
46
|
+
reading the same bytes lands in the same place (see Proof 3, where the WASM
|
|
47
|
+
kernel reads at 28 GB/s doing _integer_ loads — same order, also bandwidth-bound,
|
|
48
|
+
not 10x).
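The fold being measured has roughly the following shape. This is a sketch of such a loop, not the benchmark's source, and it uses a three-value buffer as a stand-in for the 32 MB column.

```ts
// Sums a buffer of little-endian Float64 values, one DataView read per value.
function foldFloat64(view: DataView): number {
  let sum = 0;
  for (let off = 0; off + 8 <= view.byteLength; off += 8) {
    sum += view.getFloat64(off, true);
  }
  return sum;
}

// Tiny stand-in for the column: three values written little-endian.
const column = new DataView(new ArrayBuffer(24));
[1.5, 2.25, 4].forEach((v, i) => column.setFloat64(i * 8, v, true));

const total = foldFloat64(column);
console.log(total); // 7.75
```

There is no allocation inside the loop, which is why it can run at memory speed once the JIT has compiled the accessor.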
|
|
49
|
+
|
|
50
|
+
## Proof 2 — the parsing bottleneck is allocation, which WASM can't touch
|
|
51
|
+
|
|
52
|
+
On the best case for RowBinary (IoT, every column fixed-width numeric), three
|
|
53
|
+
decoders over the same buffer (`tests/iot.wasm-headroom.bench.ts`):
|
|
54
|
+
|
|
55
|
+
| Decode                                               | ms   | vs current | what it isolates     |
| ---------------------------------------------------- | ---- | ---------- | -------------------- |
| **rows** — current fast reader (objects + `Date`)    | 3.48 | 1.0x       | full materialization |
| **columnar** — into typed arrays, no per-row objects | 0.86 | 4.0x       | drop the objects     |
| **parseOnly** — reads only, zero allocation          | 0.61 | 5.8x       | the pure-read floor  |
|
|
60
|
+
|
|
61
|
+
**~83% of decode time is JS-side object/`Date` allocation**, not byte reading.
|
|
62
|
+
A WASM parser still has to produce those JS values across the boundary, so it
|
|
63
|
+
_cannot_ remove that 83%. Even if WASM made the parse slice instantaneous and the
|
|
64
|
+
copy-in free, the row-object decode would drop only `3.48 → 2.88 ms` — a **max
|
|
65
|
+
~1.2x**, and realistically a wash once you add the copy into linear memory.
|
|
66
|
+
|
|
67
|
+
The 4.0x that _is_ on the table comes from the **output contract** (columnar
|
|
68
|
+
typed arrays), and it's available in **plain JS** — no WASM. (That columnar path
|
|
69
|
+
is worth shipping; it's the real win this whole investigation surfaced.)
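The difference between the two output contracts can be sketched on a hypothetical fixed-width row of `UInt32` id plus `Float64` value. The layout and the names `decodeRows` / `decodeColumns` are assumptions for illustration, not the IoT schema or the package API.

```ts
const ROW_BYTES = 12; // hypothetical layout: UInt32 id (4 bytes) + Float64 value (8 bytes)

// Row contract: allocates one object per row.
function decodeRows(view: DataView): { id: number; value: number }[] {
  const n = Math.floor(view.byteLength / ROW_BYTES);
  const out = new Array<{ id: number; value: number }>(n);
  for (let i = 0, off = 0; i < n; i++, off += ROW_BYTES) {
    out[i] = { id: view.getUint32(off, true), value: view.getFloat64(off + 4, true) };
  }
  return out;
}

// Columnar contract: two typed arrays in total, nothing allocated per row.
function decodeColumns(view: DataView): { id: Uint32Array; value: Float64Array } {
  const n = Math.floor(view.byteLength / ROW_BYTES);
  const id = new Uint32Array(n);
  const value = new Float64Array(n);
  for (let i = 0, off = 0; i < n; i++, off += ROW_BYTES) {
    id[i] = view.getUint32(off, true);
    value[i] = view.getFloat64(off + 4, true);
  }
  return { id, value };
}

// Two sample rows.
const sample = new DataView(new ArrayBuffer(2 * ROW_BYTES));
sample.setUint32(0, 7, true);
sample.setFloat64(4, 20.5, true);
sample.setUint32(12, 8, true);
sample.setFloat64(16, 21.25, true);

const asRows = decodeRows(sample);
const asColumns = decodeColumns(sample);
console.log(asRows, asColumns);
```

Both functions perform the same reads; only the number of allocations differs, which is the quantity the table isolates.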
|
|
70
|
+
|
|
71
|
+
## Proof 3 — the one place WASM wins: wide-int / decimal aggregation
|
|
72
|
+
|
|
73
|
+
Summing an `Int128` column forces JS onto heap `BigInt` (one allocation per
|
|
74
|
+
row). A hand-emitted WASM kernel (94 bytes; native `i64` add-with-carry) does it
|
|
75
|
+
in registers. Same 32 MB buffer, result verified equal to the BigInt sum
|
|
76
|
+
(`tests/wasm-int128.experiment.mjs`):
|
|
77
|
+
|
|
78
|
+
| Sum of an `Int128` column               | ms / 32 MB | throughput | notes          |
| --------------------------------------- | ---------- | ---------- | -------------- |
| **JS BigInt-128 sum** (what JS must do) | 42.93 ms   | 0.7 GB/s   | correct        |
| WASM `i64` add-carry — kernel only      | 1.14 ms    | 28.2 GB/s  | correct        |
| WASM + copy-in boundary tax             | 1.62 ms    | 19.7 GB/s  | (copy 0.49 ms) |
|
|
83
|
+
|
|
84
|
+
**WASM is 37.8x faster than JS (26.5x including the copy into linear memory).**
|
|
85
|
+
Note _why_: the win is escaping `BigInt`, not reading bytes faster — the WASM
|
|
86
|
+
kernel (28 GB/s) is the same order as the JS f64 floor (16.5 GB/s). JS pays a
|
|
87
|
+
**22x `BigInt` tax** purely to add 128-bit integers; WASM's native `i64` reclaims
|
|
88
|
+
it. The same logic applies to `Decimal128/256` accumulation and to hash group-by
|
|
89
|
+
(WASM open-addressing table in linear memory vs JS `Map` + GC).
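The arithmetic the kernel performs can be modelled in TypeScript. The sketch below shows the plain `BigInt` baseline next to a two-limb add-with-carry that wraps at 64 bits the way `i64.add` does. It still uses `BigInt`, so it reproduces the logic only and not the speed, and it is not the 94-byte kernel itself.

```ts
const B64 = BigInt(64);
const MASK64 = (BigInt(1) << B64) - BigInt(1);

// Baseline: what plain JS does, one 128-bit BigInt composed and added per row.
function sumBigInt(view: DataView): bigint {
  let sum = BigInt(0);
  for (let off = 0; off + 16 <= view.byteLength; off += 16) {
    sum += (view.getBigInt64(off + 8, true) << B64) + view.getBigUint64(off, true);
  }
  return BigInt.asIntN(128, sum);
}

// Model of the kernel: two 64-bit limbs and an explicit carry.
function sumTwoLimbs(view: DataView): bigint {
  let lo = BigInt(0);
  let hi = BigInt(0);
  for (let off = 0; off + 16 <= view.byteLength; off += 16) {
    const nextLo = (lo + view.getBigUint64(off, true)) & MASK64; // wraps like i64.add
    const carry = nextLo < lo ? BigInt(1) : BigInt(0); // unsigned overflow means carry out
    hi = (hi + view.getBigUint64(off + 8, true) + carry) & MASK64;
    lo = nextLo;
  }
  return BigInt.asIntN(128, (hi << B64) | lo);
}

// Writes a signed 128-bit value as two little-endian 64-bit words.
function writeInt128(view: DataView, off: number, v: bigint): void {
  view.setBigUint64(off, BigInt.asUintN(64, v), true);
  view.setBigInt64(off + 8, v >> B64, true);
}

const inputs = [BigInt(-2), BigInt(5), MASK64, BigInt(1)]; // the last two force a carry
const buf = new DataView(new ArrayBuffer(inputs.length * 16));
inputs.forEach((v, i) => writeInt128(buf, i * 16, v));

console.log(sumBigInt(buf).toString(), sumTwoLimbs(buf).toString());
```

In WASM the two limbs are plain `i64` locals, so the loop body runs without touching the heap.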
|
|
90
|
+
|
|
91
|
+
## Verdict on the dynamic-WASM-JIT
|
|
92
|
+
|
|
93
|
+
The architecture is **sound for the aggregation regime and only that regime**.
|
|
94
|
+
It targets the one quadrant where WASM beats well-written JS: _parse and compute
|
|
95
|
+
in place, return a small result, never cross the boundary per value._ The design
|
|
96
|
+
answers its own open questions well:
|
|
97
|
+
|
|
98
|
+
- **Where does the answer go?** Scalars return directly (`i128` via multi-value
|
|
99
|
+
or two `i64`s); group-by results go to a reserved linear-memory region that JS
|
|
100
|
+
reads as a typed-array view — only the small final result crosses.
|
|
101
|
+
- **Streaming.** Returning the resume offset (vs throwing across the FFI) is
|
|
102
|
+
clean, and accumulator state lives in linear memory across chunks — the module
|
|
103
|
+
_is_ the streaming aggregation state.
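The resume-offset contract can be illustrated without WASM. The sketch below, in plain TypeScript with a hypothetical one-`Float64`-per-row layout, folds every complete row in a chunk, returns the offset of the incomplete trailing row, and lets the caller carry that tail into the next chunk. All names are illustrative.

```ts
const ROW_BYTES = 8; // hypothetical layout: one Float64 per row

// Folds every complete row in `chunk` into `state`; returns the offset of the
// first byte that belongs to an incomplete trailing row.
function foldChunk(chunk: Uint8Array, state: { sum: number }): number {
  const view = new DataView(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  let off = 0;
  for (; off + ROW_BYTES <= chunk.byteLength; off += ROW_BYTES) {
    state.sum += view.getFloat64(off, true);
  }
  return off;
}

// Caller side: prepend the unconsumed tail to the next chunk.
function foldStream(chunks: Uint8Array[]): number {
  const state = { sum: 0 };
  let tail = new Uint8Array(0);
  for (const chunk of chunks) {
    const joined = new Uint8Array(tail.length + chunk.length);
    joined.set(tail);
    joined.set(chunk, tail.length);
    const resume = foldChunk(joined, state);
    tail = joined.subarray(resume);
  }
  return state.sum;
}

// Three rows split across chunk boundaries that do not align with rows.
const payload = new Uint8Array(24);
const payloadView = new DataView(payload.buffer);
[1, 2, 3.5].forEach((v, i) => payloadView.setFloat64(i * 8, v, true));

const streamed = foldStream([payload.subarray(0, 10), payload.subarray(10, 20), payload.subarray(20)]);
console.log(streamed); // 6.5
```

The accumulator (`state` here) plays the role that linear memory plays in the WASM design: it persists across chunks while only an offset is returned per call.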
|
|
104
|
+
|
|
105
|
+
But four caveats bound where it's worth building:
|
|
106
|
+
|
|
107
|
+
1. **For parsing → JS values, use generated JS, not WASM.** Proofs 1–2: JS is
|
|
108
|
+
already at memory speed and the cost is materialization WASM can't remove. A
|
|
109
|
+
`DSL → new Function(generatedJS)` backend captures the parse + native-numeric
|
|
110
|
+
aggregation case with **zero toolchain**, debuggable. This is the skill's
|
|
111
|
+
existing monomorphization thesis.
|
|
112
|
+
2. **Reserve a WASM backend for the wide-int/decimal + group-by kernels only** —
|
|
113
|
+
gate it on the presence of `Int128/256`, `Decimal128/256`, or a `GROUP BY`,
|
|
114
|
+
where Proof 3's 27–38x is real. For `Float64` sums it would tie JS.
|
|
115
|
+
3. **SIMD won't help much** — RowBinary is row-major (AoS); strided columns
|
|
116
|
+
defeat Wasm SIMD (no gather) without a transpose pass. The WASM win here is
|
|
117
|
+
native `i64` + no GC, not vectorization.
|
|
118
|
+
4. **The elephant: push it down.** `q.sum(col)` is `SELECT sum(col)` — ClickHouse
|
|
119
|
+
will beat any client. Client-side aggregation only justifies itself when you
|
|
120
|
+
_can't_ push down: folding a stream you already receive for another reason,
|
|
121
|
+
combining across queries/sources, or compute SQL can't express.
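Caveat 1's `DSL → new Function(generatedJS)` idea can be sketched in a few lines. The column-type table and `compileRowReader` below are hypothetical and cover only two fixed-width types; they show the technique of emitting straight-line source for one row shape, not the skill's actual generator.

```ts
type ColumnType = "UInt32" | "Float64";

// Per-type byte width and the source text of a little-endian read at a given offset.
const READERS: Record<ColumnType, { size: number; read: (off: string) => string }> = {
  UInt32: { size: 4, read: (off) => `view.getUint32(${off}, true)` },
  Float64: { size: 8, read: (off) => `view.getFloat64(${off}, true)` },
};

// Emits source for one fixed row shape and compiles it once with new Function.
function compileRowReader(columns: [string, ColumnType][]) {
  let offset = 0;
  const fields = columns.map(([name, type]) => {
    // JSON.stringify quotes the column name so it cannot inject code.
    const field = `${JSON.stringify(name)}: ${READERS[type].read(`base + ${offset}`)}`;
    offset += READERS[type].size;
    return field;
  });
  const source = `return { ${fields.join(", ")} };`;
  const read = new Function("view", "base", source) as (
    view: DataView,
    base: number,
  ) => Record<string, number>;
  return { read, rowBytes: offset, source };
}

const reader = compileRowReader([["id", "UInt32"], ["value", "Float64"]]);

const rowView = new DataView(new ArrayBuffer(reader.rowBytes));
rowView.setUint32(0, 7, true);
rowView.setFloat64(4, 2.5, true);

const row = reader.read(rowView, 0);
console.log(reader.source);
console.log(row);
```

Because the offsets are baked into the generated source, the compiled function has no per-column dispatch left for the JIT to remove.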
|
|
122
|
+
|
|
123
|
+
## If you need more client-side analytical strength: reach for Native columnar
|
|
124
|
+
|
|
125
|
+
Step back from WASM and look at _why_ the wins above are so narrow. RowBinary is
|
|
126
|
+
**row-major (AoS)**: every row interleaves all columns, so any analytical pass —
|
|
127
|
+
fold a column, vectorize, build a column-at-a-time accumulator — has to stride
|
|
128
|
+
over the bytes it doesn't want and re-materialize a value at a time. That is the
|
|
129
|
+
same row-major tax that defeats SIMD (caveat 3) and that makes the free **4x in
|
|
130
|
+
Proof 2 cost a transpose** today (you decode rows, _then_ pack into typed
|
|
131
|
+
arrays).
|
|
132
|
+
|
|
133
|
+
So the honest answer to _"I need real client-side analytical strength"_ is **not
|
|
134
|
+
a smarter parser over RowBinary, and not WASM** — it is a **columnar wire
|
|
135
|
+
format**. ClickHouse's **`Native`** format is **column-major (SoA)**: each block
|
|
136
|
+
arrives as contiguous per-column runs. That flips every constraint in this study:
|
|
137
|
+
|
|
138
|
+
- The Proof-2 columnar typed-array path stops needing a transpose — the wire
|
|
139
|
+
_is_ already `Float64Array`-shaped, so you `subarray`/`set` a column in one
|
|
140
|
+
move instead of decoding rows first.
|
|
141
|
+
- Vectorization becomes real: a contiguous column is exactly what `v128.load` /
|
|
142
|
+
SIMD (and even auto-vectorized JS) want — the gather problem disappears.
|
|
143
|
+
- The wide-int/decimal aggregation win (Proof 3) keeps applying, now over
|
|
144
|
+
contiguous input, which is the friendliest possible layout for it.
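Why a contiguous column removes the transpose can be shown with a typed-array view. The sketch assumes a buffer that already holds one little-endian `Float64` column at a known byte offset and a little-endian host. It does not parse the real `Native` block framing, and `viewFloat64Column` is a hypothetical name.

```ts
// Returns the column as a Float64Array, without copying when the offset is aligned.
function viewFloat64Column(block: ArrayBuffer, byteOffset: number, rows: number): Float64Array {
  if (byteOffset % 8 !== 0) {
    // Typed-array views require an 8-byte-aligned offset; fall back to one bulk copy.
    return new Float64Array(block.slice(byteOffset, byteOffset + rows * 8));
  }
  return new Float64Array(block, byteOffset, rows); // zero-copy view over the wire bytes
}

// Stand-in block: 8 bytes of unrelated header, then three column values.
const block = new ArrayBuffer(8 + 3 * 8);
const writer = new DataView(block);
[10, 20, 30.5].forEach((v, i) => writer.setFloat64(8 + i * 8, v, true));

const col = viewFloat64Column(block, 8, 3);
const colSum = col.reduce((a, b) => a + b, 0);
console.log(colSum); // 60.5
```

With a row-major buffer no such view exists, because the values of one column are not adjacent in memory.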
|
|
145
|
+
|
|
146
|
+
A columnar reader is **coming to the JS client soon**, out of the **collaboration
|
|
147
|
+
with the Python client** (which already ships a mature `Native`/columnar path —
|
|
148
|
+
the format and lessons port directly). When it lands, the order of preference for
|
|
149
|
+
client-side analytics becomes: **push down to ClickHouse → if you can't, decode
|
|
150
|
+
`Native` columnar → reserve WASM for the wide-int/decimal/group-by kernel on top
|
|
151
|
+
of those columns.** RowBinary stays the right tool for what this skill targets —
|
|
152
|
+
turning a result into JS _rows/values_ — not for analytics over them.
|
|
153
|
+
|
|
154
|
+
## Takeaways
|
|
155
|
+
|
|
156
|
+
- **Generated JS is the right engine for the parser.** It reads at memory
|
|
157
|
+
bandwidth; the remaining cost is JS-value materialization that no language
|
|
158
|
+
swap removes. WASM for parsing is a wash-to-loss.
|
|
159
|
+
- **The free 4x is a columnar (typed-array) output contract — in pure JS.** Worth
|
|
160
|
+
capturing as a first-class option for numeric results.
|
|
161
|
+
- **WASM earns its complexity in one place: in-place wide-int/decimal/group-by
|
|
162
|
+
aggregation** (27–38x measured), where JS is trapped in `BigInt`/`Map`. And
|
|
163
|
+
even then, prefer pushing the aggregation to ClickHouse unless you genuinely
|
|
164
|
+
can't.
|
|
165
|
+
- **For real client-side analytical strength, the answer is columnar, not WASM.**
|
|
166
|
+
RowBinary is row-major and taxes every analytical pass; a `Native` (SoA)
|
|
167
|
+
columnar reader — coming to the JS client soon via the Python-client
|
|
168
|
+
collaboration — removes the transpose, unlocks SIMD, and is the natural
|
|
169
|
+
substrate for the aggregation kernels above.
|
|
170
|
+
- **This matches the through-line of the other studies:** pick the tool for the
  shape of the work, and **measure**. The 94-byte WASM kernel exists precisely so
  this claim isn't hand-waved.
|