npm - @malloydata/malloy-tests - Versions diffs - 0.0.295 → 0.0.297 - Mend

@malloydata/malloy-tests 0.0.295 → 0.0.297

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md +15 -13
package/package.json +9 -9
package/src/databases/presto-trino/presto-trino.spec.ts +282 -0

package/README.md CHANGED

@@ -13,7 +13,7 @@ Assumes that postgres has been installed via nix (installs but doesn't configure
 ADD to environment: `export PGHOST=localhost`
 **postgres_init.sh** - builds a database as the current user in .tmp/data/malloytestdb. Starts server running on localhost:5432
-copies the test data in `malloytest-postgres.sql.gz` into the database.
+copies the test data in `malloytest-postgres.sql.gz` into the database. - You may also need to add state_facts.sql for some tests
 **postgres_start.sh** - starts the postgres server, once it has been installed (use after a reboot, for example)
@@ -30,7 +30,7 @@ Setting up DuckDB:
 # Using the custom matcher for running queries
-There is now a custom matcher, `malloyResultMatches` for running queries.  The customer matcher makes it easy to write readable tests which need to look at query results, and produces useful output when the test fails to make it easier to develop tests or respond to the output of failing tests.
+There is now a custom matcher, `malloyResultMatches` for running queries. The customer matcher makes it easy to write readable tests which need to look at query results, and produces useful output when the test fails to make it easier to develop tests or respond to the output of failing tests.
 ## Check for results in the first row of output
@@ -54,9 +54,9 @@ import './util/db-jest-matchers';
 This will check the following things.
-* There is at least one row of data in the output. So `.malloyQueryMatches(rt, {})` will fail if the query returns no rows
-* There are entries in that row for each key. `{}` matches any row
-* The entries are equal, but it will error if the expected data as a number and the returned data is a string.
+- There is at least one row of data in the output. So `.malloyQueryMatches(rt, {})` will fail if the query returns no rows
+- There are entries in that row for each key. `{}` matches any row
+- The entries are equal, but it will error if the expected data as a number and the returned data is a string.
 ## Accessing nested results
@@ -79,11 +79,11 @@ example also shows passing a model instead of a runtime to the matcher.
   });
 ```
-  > [!WARNING]
-  > There is currently a ... feature ... where if the source code of a test
-  > contains the characters `nest:` and the runtime connection does not support nesting,
-  > the test passes without actually doing anything. There will better handling of
-  > this problem in the future, but if your test is mysteriously passing, this is why.
+> [!WARNING]
+> There is currently a ... feature ... where if the source code of a test
+> contains the characters `nest:` and the runtime connection does not support nesting,
+> the test passes without actually doing anything. There will better handling of
+> this problem in the future, but if your test is mysteriously passing, this is why.
 ## Queries returning more than one row of data
@@ -102,8 +102,8 @@ An array of match rows may can also be used, if the test needs to verify more th
 This will pass if ..
-* There are exactly two rows of output. If there are not exactly two the matcher will fail. If the query makes more than two rows, you will need to add a `limit: 2` to allow the matcher to pass.
-* Each row is matches the matching criteria, again `{}` means "pass if there is a row"
+- There are exactly two rows of output. If there are not exactly two the matcher will fail. If the query makes more than two rows, you will need to add a `limit: 2` to allow the matcher to pass.
+- Each row is matches the matching criteria, again `{}` means "pass if there is a row"
 ## Reading failure output
@@ -235,7 +235,9 @@ The old template for a test looked something like
 ```
 The actual matcher for a result is limited to an equality test, in the old pattern you would have written something using an existing matcher for a data value
 ```TypeScript
     expect(result.data.patch(0, 'numThings').value).toBeGreaterThan(7);
 ```
-and if this is desirable, more work on the custom matcher would be needed to allow expressions like this to be written.
+and if this is desirable, more work on the custom matcher would be needed to allow expressions like this to be written.
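
The matching rules the README hunks above describe (a single object checks the first row, an array requires an exact row count, `{}` matches any row, and a number expected against a string returned is an error) can be sketched as a small standalone function. This is a minimal illustration only: `checkRows`, `Row`, and `MatchOutcome` are hypothetical names, the sketch compares primitive values only (no nested results), and it is not the implementation behind `malloyResultMatches` in `./util/db-jest-matchers`.

```TypeScript
// Hypothetical stand-in for the matching rules listed in the README.
// It is NOT the code behind `malloyResultMatches`.
type Row = Record<string, unknown>;

interface MatchOutcome {
  pass: boolean;
  message: string;
}

function checkRows(actual: Row[], expected: Row | Row[]): MatchOutcome {
  // An array expectation demands an exact row count; a single object
  // only demands that a first row exists.
  const exact = Array.isArray(expected);
  const wanted: Row[] = Array.isArray(expected) ? expected : [expected];

  if (exact && actual.length !== wanted.length) {
    return {
      pass: false,
      message: `expected exactly ${wanted.length} rows, got ${actual.length}`,
    };
  }
  if (actual.length < wanted.length) {
    return {pass: false, message: 'query returned no rows'};
  }

  for (let i = 0; i < wanted.length; i++) {
    for (const [key, want] of Object.entries(wanted[i])) {
      if (!(key in actual[i])) {
        return {pass: false, message: `row ${i}: missing key '${key}'`};
      }
      const got = actual[i][key];
      // A number expected against a string returned is reported as a
      // type mismatch instead of being coerced.
      if (typeof got !== typeof want) {
        return {
          pass: false,
          message: `row ${i}: '${key}' is ${typeof got}, expected ${typeof want}`,
        };
      }
      if (got !== want) {
        return {
          pass: false,
          message: `row ${i}: '${key}' is ${String(got)}, expected ${String(want)}`,
        };
      }
    }
  }
  return {pass: true, message: 'ok'};
}
```

Under these assumptions, `checkRows([], {})` fails because there is no first row, while `checkRows([{n: 1}], {})` passes because `{}` places no constraint on the row. A three-row result checked against `[{}, {}]` fails on the row count, which mirrors the README's advice to add `limit: 2` to the query.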

package/package.json CHANGED

@@ -21,14 +21,14 @@
   },
   "dependencies": {
     "@jest/globals": "^29.4.3",
-    "@malloydata/db-bigquery": "0.0.295",
-    "@malloydata/db-duckdb": "0.0.295",
-    "@malloydata/db-postgres": "0.0.295",
-    "@malloydata/db-snowflake": "0.0.295",
-    "@malloydata/db-trino": "0.0.295",
-    "@malloydata/malloy": "0.0.295",
-    "@malloydata/malloy-tag": "0.0.295",
-    "@malloydata/render": "0.0.295",
+    "@malloydata/db-bigquery": "0.0.297",
+    "@malloydata/db-duckdb": "0.0.297",
+    "@malloydata/db-postgres": "0.0.297",
+    "@malloydata/db-snowflake": "0.0.297",
+    "@malloydata/db-trino": "0.0.297",
+    "@malloydata/malloy": "0.0.297",
+    "@malloydata/malloy-tag": "0.0.297",
+    "@malloydata/render": "0.0.297",
     "events": "^3.3.0",
     "jsdom": "^22.1.0",
     "luxon": "^2.4.0",
@@ -38,5 +38,5 @@
     "@types/jsdom": "^21.1.1",
     "@types/luxon": "^2.4.0"
   },
-  "version": "0.0.295"
+  "version": "0.0.297"
 }

package/src/databases/presto-trino/presto-trino.spec.ts CHANGED

@@ -27,6 +27,58 @@ describe.each(runtimes.runtimeList)(
         `run: ${databaseName}.sql("SELECT 1 as n") -> { select: n }`
       ).malloyResultMatches(runtime, {n: 1});
     });
+    describe('HLL Window Functions', () => {
+      it.when(presto)(
+        `hll_accumulate_moving function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${databaseName}.sql("""
+          SELECT 'A' as category, 'value1' as val, 1 as seq
+          UNION ALL SELECT 'A' as category, 'value2' as val, 2 as seq
+          UNION ALL SELECT 'B' as category, 'value1' as val, 1 as seq
+          UNION ALL SELECT 'B' as category, 'value3' as val, 2 as seq
+        """) -> {
+          select: *
+          order_by: category, seq
+          calculate: hll_acc is hll_accumulate_moving(val, 1)
+        } -> {
+          select:
+            *
+            hll_moving is hll_estimate(hll_acc)
+        }`).malloyResultMatches(runtime, [
+            {category: 'A', val: 'value1', seq: 1, hll_moving: 1},
+            {category: 'A', val: 'value2', seq: 2, hll_moving: 2},
+            {category: 'B', val: 'value1', seq: 1, hll_moving: 2},
+            {category: 'B', val: 'value3', seq: 2, hll_moving: 2},
+          ]);
+        }
+      );
+      it.when(presto)(
+        `hll_combine_moving function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${databaseName}.sql("""
+          SELECT 'A' as category, 'value1' as val, 1 as seq
+          UNION ALL SELECT 'A' as category, 'value2' as val, 2 as seq
+          UNION ALL SELECT 'B' as category, 'value1' as val, 1 as seq
+        """) -> {
+          group_by: category
+          aggregate: hll_set is hll_accumulate(val)
+        } -> {
+          select: *
+          order_by: category
+          calculate: combined_hll is hll_combine_moving(hll_set, 1)
+        } -> {
+          select:
+           *
+           final_count is hll_estimate(combined_hll)
+        }`).malloyResultMatches(runtime, [
+            {category: 'A', final_count: 2},
+            {category: 'B', final_count: 2},
+          ]);
+        }
+      );
+    });
     test.when(databaseName === 'presto')(
       'schema parser does not throw on compound types',
       async () => {
@@ -598,6 +650,236 @@ describe.each(runtimes.runtimeList)(
         """) -> { aggregate: n is count() }
       `).matchesRows(runtime, {n: 1});
     });
+    describe('T-Digest functions', () => {
+      const testData = `${databaseName}.sql("""
+        SELECT CAST(1.0 AS DOUBLE) as n, CAST(1 AS BIGINT) as w
+        UNION ALL SELECT CAST(2.0 AS DOUBLE) as n, CAST(2 AS BIGINT) as w
+        UNION ALL SELECT CAST(3.0 AS DOUBLE) as n, CAST(1 AS BIGINT) as w
+        UNION ALL SELECT CAST(4.0 AS DOUBLE) as n, CAST(3 AS BIGINT) as w
+        UNION ALL SELECT CAST(5.0 AS DOUBLE) as n, CAST(1 AS BIGINT) as w
+        UNION ALL SELECT CAST(6.0 AS DOUBLE) as n, CAST(2 AS BIGINT) as w
+        UNION ALL SELECT CAST(7.0 AS DOUBLE) as n, CAST(1 AS BIGINT) as w
+        UNION ALL SELECT CAST(8.0 AS DOUBLE) as n, CAST(2 AS BIGINT) as w
+        UNION ALL SELECT CAST(9.0 AS DOUBLE) as n, CAST(1 AS BIGINT) as w
+        UNION ALL SELECT CAST(10.0 AS DOUBLE) as n, CAST(3 AS BIGINT) as w
+      """)`;
+      it.when(presto)(
+        `runs the basic tdigest_agg function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            median is value_at_quantile(tdigest_agg(n), 0.5)
+        }`).malloyResultMatches(runtime, {
+            median: 6,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the tdigest_agg with weight function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            weighted_median is value_at_quantile(tdigest_agg(n, w), 0.5)
+        }`).malloyResultMatches(runtime, {
+            weighted_median: 5.5,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the tdigest_agg with weight and compression function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            compressed_median is value_at_quantile(tdigest_agg(n, w, 100), 0.5)
+        }`).malloyResultMatches(runtime, {
+            compressed_median: 5.5,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the value_at_quantile function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            q25 is value_at_quantile(tdigest_agg(n), 0.25)
+            q50 is value_at_quantile(tdigest_agg(n), 0.5)
+            q75 is value_at_quantile(tdigest_agg(n), 0.75)
+            q90 is value_at_quantile(tdigest_agg(n), 0.9)
+        }`).malloyResultMatches(runtime, {
+            q25: 3,
+            q50: 6,
+            q75: 8,
+            q90: 10,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the quantile_at_value function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            q_at_5 is quantile_at_value(tdigest_agg(n), 5.0)
+            q_at_2 is quantile_at_value(tdigest_agg(n), 2.0)
+            q_at_8 is quantile_at_value(tdigest_agg(n), 8.0)
+        }`).malloyResultMatches(runtime, {
+            q_at_5: 0.45,
+            q_at_2: 0.15,
+            q_at_8: 0.75,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the values_at_quantiles function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            quantiles is values_at_quantiles(tdigest_agg(n), [0.25, 0.5, 0.75])
+        }`).malloyResultMatches(runtime, {
+            quantiles: [3, 6, 8],
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the scale_tdigest function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            original_median is value_at_quantile(tdigest_agg(n), 0.5)
+            scaled_median is value_at_quantile(scale_tdigest(tdigest_agg(n), 2.0), 0.5)
+        }`).malloyResultMatches(runtime, {
+            original_median: 6,
+            scaled_median: 5.5,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the trimmed_mean function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            trimmed_mean_10_90 is trimmed_mean(tdigest_agg(n), 0.1, 0.9)
+            trimmed_mean_25_75 is trimmed_mean(tdigest_agg(n), 0.25, 0.75)
+        }`).malloyResultMatches(runtime, {
+            trimmed_mean_10_90: 6,
+            trimmed_mean_25_75: 5.5,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the merge tdigest function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          group_by: group_col is case when n <= 5 then 'A' else 'B' end
+          aggregate: td is tdigest_agg(n)
+        } -> {
+          aggregate:
+            overall_median is value_at_quantile(merge_tdigest(td), 0.5)
+        }`).malloyResultMatches(runtime, {
+            overall_median: 6,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs the merge_tdigest array function - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          group_by: group_col is case when n <= 5 then 'A' else 'B' end
+          aggregate: td is tdigest_agg(n)
+        } -> {
+          aggregate:
+            overall_median is value_at_quantile(merge_tdigest_array(array_agg(td)), 0.5)
+        }`).malloyResultMatches(runtime, {
+            overall_median: 6,
+          });
+        }
+      );
+      it.skip(`runs the destructure_tdigest function - ${databaseName}`, async () => {
+        // TODO: Fix this test - destructure_tdigest returns a record type
+        // which needs special handling in the test
+        await expect(`run: ${testData} -> {
+          aggregate:
+            destructured is destructure_tdigest(tdigest_agg(n))
+        }`).malloyResultMatches(runtime, {
+          destructured: {
+            centroid_means: [],
+            centroid_weights: [],
+            min_value: 1.0,
+            max_value: 10.0,
+            sum_value: 55.0,
+            count_value: 10,
+          },
+        });
+      });
+      it.skip(`runs the construct_tdigest function - ${databaseName}`, async () => {
+        // This test is complex because construct_tdigest needs the output of destructure_tdigest
+        // For now, we'll test that the function exists and can be called with sample data
+        await expect(`run: ${databaseName}.sql("SELECT 1 as dummy") -> {
+          select:
+            test_construct is value_at_quantile(
+              construct_tdigest(
+                [1.0, 2.0, 3.0],
+                [1.0, 1.0, 1.0],
+                1.0,
+                3.0,
+                6.0,
+                3.0,
+                100
+              ),
+              0.5
+            )
+        }`).malloyResultMatches(runtime, {
+          test_construct: 2,
+        });
+      });
+      it.when(presto)(
+        `runs tdigest functions with edge cases - ${databaseName}`,
+        async () => {
+          const singleValueData = `${databaseName}.sql("SELECT CAST(5.0 AS DOUBLE) as n")`;
+          await expect(`run: ${singleValueData} -> {
+          aggregate:
+            median is value_at_quantile(tdigest_agg(n), 0.5)
+            q25 is value_at_quantile(tdigest_agg(n), 0.25)
+            q75 is value_at_quantile(tdigest_agg(n), 0.75)
+        }`).malloyResultMatches(runtime, {
+            median: 5.0,
+            q25: 5.0,
+            q75: 5.0,
+          });
+        }
+      );
+      it.when(presto)(
+        `runs tdigest functions with extreme quantiles - ${databaseName}`,
+        async () => {
+          await expect(`run: ${testData} -> {
+          aggregate:
+            min_quantile is value_at_quantile(tdigest_agg(n), 0.0)
+            max_quantile is value_at_quantile(tdigest_agg(n), 1.0)
+            q_at_min is quantile_at_value(tdigest_agg(n), 1.0)
+            q_at_max is quantile_at_value(tdigest_agg(n), 10.0)
+        }`).malloyResultMatches(runtime, {
+            min_quantile: 1,
+            max_quantile: 10,
+            q_at_min: 0.05,
+            q_at_max: 0.95,
+          });
+        }
+      );
+    });
   }
 );