@hamelin.sh/documentation 0.2.3 → 0.2.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/main.js +1 -1
- package/package.json +1 -1
package/dist/main.js
CHANGED
|
@@ -488,7 +488,7 @@ scoring weights without affecting the overall detection logic.`,
|
|
|
488
488
|
"function-reference/mathematical-functions.md": "# Mathematical Functions\n\nScalar functions for mathematical operations and calculations that can be used in any expression context.\n\n## `abs(x)`\n\nReturns the absolute value of a number.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `abs()` function returns the absolute value (magnitude) of the input number,\nremoving any negative sign. For positive numbers and zero, it returns the value\nunchanged. For negative numbers, it returns the positive equivalent.\n\n## `cbrt(x)`\n\nReturns the cube root of a number.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `cbrt()` function calculates the cube root of the input value. The result\nis always returned as a double-precision floating-point number. Unlike square\nroot, cube root is defined for negative numbers.\n\n## `ceil(x)` / `ceiling(x)`\n\nRounds a number up to the nearest integer.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `ceil()` and `ceiling()` functions round the input value up to the nearest\ninteger. For positive numbers, this means rounding away from zero. For negative\nnumbers, this means rounding toward zero. Both function names are equivalent.\n\n## `degrees(x)`\n\nConverts radians to degrees.\n\n### Parameters\n\n- **x** - Numeric expression representing an angle in radians\n\n### Description\n\nThe `degrees()` function converts an angle from radians to degrees. The result\nis always returned as a double-precision floating-point number. The conversion\nuses the formula: degrees = radians \xD7 (180/\u03C0).\n\n## `e()`\n\nReturns Euler's number (mathematical constant e).\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `e()` function returns the mathematical constant e (approximately 2.71828),\nwhich is the base of natural logarithms. The result is returned as a\ndouble-precision floating-point number.\n\n## `exp(x)`\n\nReturns e raised to the power of x.\n\n### Parameters\n\n- **x** - Numeric expression representing the exponent\n\n### Description\n\nThe `exp()` function calculates e^x, where e is Euler's number. This is the\nexponential function, which is the inverse of the natural logarithm. The result\nis always returned as a double-precision floating-point number.\n\n## `floor(x)`\n\nRounds a number down to the nearest integer.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `floor()` function rounds the input value down to the nearest integer. For\npositive numbers, this means rounding toward zero. For negative numbers, this\nmeans rounding away from zero.\n\n## `ln(x)`\n\nReturns the natural logarithm of a number.\n\n### Parameters\n\n- **x** - Numeric expression (must be positive)\n\n### Description\n\nThe `ln()` function calculates the natural logarithm (base e) of the input\nvalue. The input must be positive; negative values or zero will result in an\nerror. The result is always returned as a double-precision floating-point number.\n\n## `log(b, x)`\n\nReturns the logarithm of x with the specified base.\n\n### Parameters\n\n- **b** - Numeric expression representing the logarithm base\n- **x** - Numeric expression (must be positive)\n\n### Description\n\nThe `log()` function calculates the logarithm of x using the specified base b.\nBoth the base and the value must be positive. The result is always returned as\na double-precision floating-point number.\n\n## `log10(x)`\n\nReturns the base-10 logarithm of a number.\n\n### Parameters\n\n- **x** - Numeric expression (must be positive)\n\n### Description\n\nThe `log10()` function calculates the common logarithm (base 10) of the input\nvalue. The input must be positive; negative values or zero will result in an\nerror. The result is always returned as a double-precision floating-point number.\n\n## `log2(x)`\n\nReturns the base-2 logarithm of a number.\n\n### Parameters\n\n- **x** - Numeric expression (must be positive)\n\n### Description\n\nThe `log2()` function calculates the binary logarithm (base 2) of the input\nvalue. The input must be positive; negative values or zero will result in an\nerror. The result is always returned as a double-precision floating-point number.\n\n## `pi()`\n\nReturns the mathematical constant \u03C0 (pi).\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `pi()` function returns the mathematical constant \u03C0 (approximately 3.14159),\nwhich represents the ratio of a circle's circumference to its diameter. The\nresult is returned as a double-precision floating-point number.\n\n## `pow(x, p)` / `power(x, p)`\n\nRaises a number to the specified power.\n\n### Parameters\n\n- **x** - Numeric expression representing the base\n- **p** - Numeric expression representing the exponent\n\n### Description\n\nThe `pow()` and `power()` functions calculate x raised to the power of p (x^p).\nBoth function names are equivalent. The result is always returned as a\ndouble-precision floating-point number.\n\n## `radians(x)`\n\nConverts degrees to radians.\n\n### Parameters\n\n- **x** - Numeric expression representing an angle in degrees\n\n### Description\n\nThe `radians()` function converts an angle from degrees to radians. The result\nis always returned as a double-precision floating-point number. The conversion\nuses the formula: radians = degrees \xD7 (\u03C0/180).\n\n## `round(x)` / `round(x, d)`\n\nRounds a number to the nearest integer or specified decimal places.\n\n### Parameters\n\n- **x** - Numeric expression to round\n- **d** (optional) - Integer specifying the number of decimal places\n\n### Description\n\nThe `round()` function rounds the input value to the nearest integer when used\nwith one parameter, or to the specified number of decimal places when used with\ntwo parameters. The rounding follows standard mathematical rules (0.5 rounds up).\n\n## `sign(x)`\n\nReturns the sign of a number.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `sign()` function returns -1 for negative numbers, 0 for zero, and 1 for\npositive numbers. This function helps determine the sign of a value without\nregard to its magnitude.\n\n## `sqrt(x)`\n\nReturns the square root of a number.\n\n### Parameters\n\n- **x** - Numeric expression (must be non-negative)\n\n### Description\n\nThe `sqrt()` function calculates the square root of the input value. The input\nmust be non-negative; negative values will result in an error. The result is\nalways returned as a double-precision floating-point number.\n\n## `truncate(x)`\n\nRemoves the fractional part of a number.\n\n### Parameters\n\n- **x** - Numeric expression\n\n### Description\n\nThe `truncate()` function removes the fractional part of a number, effectively\nrounding toward zero. For positive numbers, this is equivalent to `floor()`.\nFor negative numbers, this is equivalent to `ceil()`.\n\n## `width_bucket(x, bound1, bound2, n)`\n\nReturns the bucket number for a value in a histogram with equal-width buckets.\n\n### Parameters\n\n- **x** - Numeric expression representing the value to bucket\n- **bound1** - Numeric expression representing the lower bound\n- **bound2** - Numeric expression representing the upper bound \n- **n** - Integer expression representing the number of buckets\n\n### Description\n\nThe `width_bucket()` function determines which bucket a value falls into when\ndividing the range between bound1 and bound2 into n equal-width buckets. Values\noutside the bounds return 0 (below bound1) or n+1 (above bound2).\n\n## `width_bucket(x, bins)`\n\nReturns the bucket number for a value using explicitly defined bucket boundaries.\n\n### Parameters\n\n- **x** - Numeric expression representing the value to bucket\n- **bins** - Array of numeric values representing bucket boundaries\n\n### Description\n\nThe `width_bucket()` function determines which bucket a value falls into using\nan array of explicitly defined bucket boundaries. The function returns the\nindex of the bucket where the value belongs, with 0 for values below the\nlowest boundary and array length + 1 for values above the highest boundary.",
|
|
489
489
|
"function-reference/regular-expression-functions.md": "# Regular Expression Functions\n\nScalar functions for pattern matching and advanced text processing using regular expressions.\n\n## `regexp_count(string, pattern)`\n\nCounts the number of times a regular expression pattern matches in a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the regular expression pattern\n\n### Description\n\nThe `regexp_count()` function returns the number of non-overlapping matches of\nthe specified regular expression pattern within the input string. If no matches\nare found, it returns 0. The pattern uses standard regular expression syntax.\n\n## `regexp_extract_all(string, pattern)` / `regexp_extract_all(string, pattern, group)`\n\nExtracts all matches of a regular expression pattern from a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the regular expression pattern\n- **group** (optional) - Integer specifying which capture group to extract\n\n### Description\n\nThe `regexp_extract_all()` function returns an array containing all matches of\nthe specified pattern. When used with two parameters, it returns the entire\nmatch. When used with three parameters, it returns the specified capture group\nfrom each match.\n\nIf no matches are found, it returns an empty array. Capture groups are numbered\nstarting from 1, with 0 representing the entire match.\n\n## `regexp_extract(string, pattern)` / `regexp_extract(string, pattern, group)`\n\nExtracts the first match of a regular expression pattern from a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the regular expression pattern\n- **group** (optional) - Integer specifying which capture group to extract\n\n### Description\n\nThe `regexp_extract()` function returns the first match of the specified pattern.\nWhen used with two parameters, it returns the entire match. When used with three\nparameters, it returns the specified capture group from the first match.\n\nIf no match is found, it returns null. Capture groups are numbered starting\nfrom 1, with 0 representing the entire match.\n\n## `regexp_like(string, pattern)`\n\nTests whether a string matches a regular expression pattern.\n\n### Parameters\n\n- **string** - String expression to test\n- **pattern** - String expression representing the regular expression pattern\n\n### Description\n\nThe `regexp_like()` function returns true if the input string contains a match\nfor the specified regular expression pattern, false otherwise. This function\ntests for the presence of a match anywhere within the string, not just at the\nbeginning or end.\n\n## `regexp_position(string, pattern)` / `regexp_position(string, pattern, start)` / `regexp_position(string, pattern, start, occurrence)`\n\nReturns the position of a regular expression match within a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the regular expression pattern\n- **start** (optional) - Integer specifying the starting position for the search\n- **occurrence** (optional) - Integer specifying which occurrence to find\n\n### Description\n\nThe `regexp_position()` function returns the 1-based position of the first\ncharacter of a pattern match within the string. When used with the `start`\nparameter, it begins searching from that position. When used with the\n`occurrence` parameter, it finds the nth occurrence of the pattern.\n\nIf no match is found, it returns 0. The start position is 1-based, and the\noccurrence count begins at 1 for the first match.\n\n## `regexp_replace(string, pattern)` / `regexp_replace(string, pattern, replacement)`\n\nReplaces matches of a regular expression pattern in a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the regular expression pattern\n- **replacement** (optional) - String expression to replace matches with\n\n### Description\n\nThe `regexp_replace()` function replaces all matches of the specified pattern.\nWhen used with two parameters, it removes all matches (replaces with empty\nstring). When used with three parameters, it replaces matches with the\nspecified replacement string.\n\nThe replacement string can include capture group references using standard\nregular expression syntax. If no matches are found, the original string is\nreturned unchanged.\n\n## `regexp_split(string, pattern)`\n\nSplits a string using a regular expression pattern as the delimiter.\n\n### Parameters\n\n- **string** - String expression to split\n- **pattern** - String expression representing the regular expression pattern to use as delimiter\n\n### Description\n\nThe `regexp_split()` function splits the input string at each occurrence of\nthe specified pattern and returns an array of the resulting substrings. The\npattern matches are not included in the result array.\n\nIf the pattern is not found, the function returns an array containing the\noriginal string as a single element. If the pattern matches at the beginning\nor end of the string, empty strings may be included in the result array.",
|
|
490
490
|
"function-reference/string-functions.md": "# String Functions\n\nScalar functions for string processing and manipulation that can be used in any expression context.\n\n## `replace(string, pattern)`\n\nReplaces all occurrences of a pattern in a string.\n\n### Parameters\n\n- **string** - String expression to search within\n- **pattern** - String expression representing the text to replace\n\n### Description\n\nThe `replace()` function removes all occurrences of the specified pattern from\nthe input string. This function performs literal string replacement, not\npattern matching. If the pattern is not found, the original string is returned\nunchanged.\n\n## `starts_with(string, prefix)`\n\nTests whether a string starts with a specified prefix.\n\n### Parameters\n\n- **string** - String expression to test\n- **prefix** - String expression representing the prefix to check for\n\n### Description\n\nThe `starts_with()` function returns true if the input string begins with the\nspecified prefix, false otherwise. The comparison is case-sensitive. An empty\nprefix will always return true for any string.\n\n## `ends_with(string, suffix)`\n\nTests whether a string ends with a specified suffix.\n\n### Parameters\n\n- **string** - String expression to test\n- **suffix** - String expression representing the suffix to check for\n\n### Description\n\nThe `ends_with()` function returns true if the input string ends with the\nspecified suffix, false otherwise. The comparison is case-sensitive. An empty\nsuffix will always return true for any string.\n\n## `contains(string, substring)`\n\nTests whether a string contains a specified substring.\n\n### Parameters\n\n- **string** - String expression to search within\n- **substring** - String expression representing the text to search for\n\n### Description\n\nThe `contains()` function returns true if the input string contains the\nspecified substring anywhere within it, false otherwise. The comparison is\ncase-sensitive. An empty substring will always return true for any string.\n\n## `lower(string)`\n\nConverts a string to lowercase.\n\n### Parameters\n\n- **string** - String expression to convert\n\n### Description\n\nThe `lower()` function converts all uppercase characters in the input string\nto their lowercase equivalents. Characters that are already lowercase or\nnon-alphabetic characters remain unchanged.\n\n## `upper(string)`\n\nConverts a string to uppercase.\n\n### Parameters\n\n- **string** - String expression to convert\n\n### Description\n\nThe `upper()` function converts all lowercase characters in the input string\nto their uppercase equivalents. Characters that are already uppercase or\nnon-alphabetic characters remain unchanged.\n\n## `len(string)`\n\nReturns the length of a string in characters.\n\n### Parameters\n\n- **string** - String expression to measure\n\n### Description\n\nThe `len()` function returns the number of characters in the input string.\nThis counts Unicode characters, not bytes, so multi-byte characters are\ncounted as single characters. An empty string returns 0.",
|
|
491
|
-
"function-reference/time-date-functions.md": '# Time & Date Functions\n\nScalar functions for temporal data processing and manipulation that can be used in any expression context.\n\n## `now()`\n\nReturns the current timestamp.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `now()` function returns the current date and time as a timestamp. The\nexact timestamp represents the moment when the function is evaluated during\nquery execution. All calls to `now()` within the same query execution return\nthe same timestamp value.\n\n## `today()`\n\nReturns today\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `today()` function returns the current date with the time portion set to\nmidnight (00:00:00). This is equivalent to truncating `now()` to the day\nboundary. The result represents the start of the current day.\n\n## `yesterday()`\n\nReturns yesterday\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `yesterday()` function returns yesterday\'s date with the time portion set\nto midnight (00:00:00). This is equivalent to subtracting one day from `today()`.\nThe result represents the start of the previous day.\n\n## `tomorrow()`\n\nReturns tomorrow\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `tomorrow()` function returns tomorrow\'s date with the time portion set to\nmidnight (00:00:00). This is equivalent to adding one day to `today()`. The\nresult represents the start of the next day.\n\n## `ts(timestamp)`\n\nConverts a string to a timestamp.\n\n### Parameters\n\n- **timestamp** - String expression representing a timestamp\n\n### Description\n\nThe `ts()` function parses a string representation of a timestamp and converts\nit to a timestamp type. The function accepts various timestamp formats including\nISO 8601 format. If the string cannot be parsed as a valid timestamp, an error\nis raised.\n\n## `year(timestamp)`\n\nExtracts the year from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `year()` function extracts the year component from a timestamp and returns\nit as an integer. For example, a timestamp of "2023-07-15 14:30:00" would\nreturn 2023.\n\n## `month(timestamp)`\n\nExtracts the month from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `month()` function extracts the month component from a timestamp and returns\nit as an integer from 1 to 12, where 1 represents January and 12 represents\nDecember. For example, a timestamp of "2023-07-15 14:30:00" would return 7.\n\n## `day(timestamp)`\n\nExtracts the day of the month from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `day()` function extracts the day component from a timestamp and returns\nit as an integer from 1 to 31, depending on the month. For example, a timestamp\nof "2023-07-15 14:30:00" would return 15.\n\n## `hour(timestamp)`\n\nExtracts the hour from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `hour()` function extracts the hour component from a timestamp and returns\nit as an integer from 0 to 23, using 24-hour format. For example, a timestamp\nof "2023-07-15 14:30:00" would return 14.\n\n## `minute(timestamp)`\n\nExtracts the minute from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `minute()` function extracts the minute component from a timestamp and\nreturns it as an integer from 0 to 59. For example, a timestamp of\n"2023-07-15 14:30:00" would return 30.\n\n## `second(timestamp)`\n\nExtracts the second from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `second()` function extracts the second component from a timestamp and\nreturns it as an integer from 0 to 59. For example, a timestamp of\n"2023-07-15 14:30:45" would return 45.\n\n## `at_timezone(timestamp, timezone)`\n\nConverts a timestamp to a different timezone.\n\n### Parameters\n\n- **timestamp** - Timestamp expression to convert\n- **timezone** - String expression representing the target timezone\n\n### Description\n\nThe `at_timezone()` function converts a timestamp from its current timezone\nto the specified target timezone. The timezone parameter should be a valid\ntimezone identifier such as "UTC", "America/New_York", or "Europe/London".\nThe function returns a new timestamp representing the same moment in time\nbut expressed in the target timezone.\n\n## `to_millis(interval)`\n\nConverts an interval to milliseconds.\n\n### Parameters\n\n- **interval** - Interval expression to convert\n\n### Description\n\nThe `to_millis()` function converts an interval (duration) to its equivalent\nvalue in milliseconds as an integer. This is useful for calculations that\nrequire numeric representations of time durations. For example, an interval\nof "5 minutes" would return 300000 milliseconds.',
|
|
491
|
+
"function-reference/time-date-functions.md": '# Time & Date Functions\n\nScalar functions for temporal data processing and manipulation that can be used in any expression context.\n\n## `now()`\n\nReturns the current timestamp.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `now()` function returns the current date and time as a timestamp. The\nexact timestamp represents the moment when the function is evaluated during\nquery execution. All calls to `now()` within the same query execution return\nthe same timestamp value.\n\n## `today()`\n\nReturns today\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `today()` function returns the current date with the time portion set to\nmidnight (00:00:00). This is equivalent to truncating `now()` to the day\nboundary. The result represents the start of the current day.\n\n## `yesterday()`\n\nReturns yesterday\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `yesterday()` function returns yesterday\'s date with the time portion set\nto midnight (00:00:00). This is equivalent to subtracting one day from `today()`.\nThe result represents the start of the previous day.\n\n## `tomorrow()`\n\nReturns tomorrow\'s date at midnight.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `tomorrow()` function returns tomorrow\'s date with the time portion set to\nmidnight (00:00:00). This is equivalent to adding one day to `today()`. The\nresult represents the start of the next day.\n\n## `ts(timestamp)`\n\nConverts a string to a timestamp.\n\n### Parameters\n\n- **timestamp** - String expression representing a timestamp\n\n### Description\n\nThe `ts()` function parses a string representation of a timestamp and converts\nit to a timestamp type. The function accepts various timestamp formats including\nISO 8601 format. If the string cannot be parsed as a valid timestamp, an error\nis raised.\n\n## `year(timestamp)`\n\nExtracts the year from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `year()` function extracts the year component from a timestamp and returns\nit as an integer. For example, a timestamp of "2023-07-15 14:30:00" would\nreturn 2023.\n\n## `month(timestamp)`\n\nExtracts the month from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `month()` function extracts the month component from a timestamp and returns\nit as an integer from 1 to 12, where 1 represents January and 12 represents\nDecember. For example, a timestamp of "2023-07-15 14:30:00" would return 7.\n\n## `day(timestamp)`\n\nExtracts the day of the month from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `day()` function extracts the day component from a timestamp and returns\nit as an integer from 1 to 31, depending on the month. For example, a timestamp\nof "2023-07-15 14:30:00" would return 15.\n\n## `day_of_week(timestamp)`\n\nExtracts the day of the week from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `day_of_week()` function extracts the ISO day of the week from a timestamp \nand returns it as an integer from 1 (Monday) to 7 (Sunday).\n\n## `hour(timestamp)`\n\nExtracts the hour from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `hour()` function extracts the hour component from a timestamp and returns\nit as an integer from 0 to 23, using 24-hour format. For example, a timestamp\nof "2023-07-15 14:30:00" would return 14.\n\n## `minute(timestamp)`\n\nExtracts the minute from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `minute()` function extracts the minute component from a timestamp and\nreturns it as an integer from 0 to 59. For example, a timestamp of\n"2023-07-15 14:30:00" would return 30.\n\n## `second(timestamp)`\n\nExtracts the second from a timestamp.\n\n### Parameters\n\n- **timestamp** - Timestamp expression\n\n### Description\n\nThe `second()` function extracts the second component from a timestamp and\nreturns it as an integer from 0 to 59. For example, a timestamp of\n"2023-07-15 14:30:45" would return 45.\n\n## `at_timezone(timestamp, timezone)`\n\nConverts a timestamp to a different timezone.\n\n### Parameters\n\n- **timestamp** - Timestamp expression to convert\n- **timezone** - String expression representing the target timezone\n\n### Description\n\nThe `at_timezone()` function converts a timestamp from its current timezone\nto the specified target timezone. The timezone parameter should be a valid\ntimezone identifier such as "UTC", "America/New_York", or "Europe/London".\nThe function returns a new timestamp representing the same moment in time\nbut expressed in the target timezone.\n\n## `to_millis(interval)`\n\nConverts an interval to milliseconds.\n\n### Parameters\n\n- **interval** - Interval expression to convert\n\n### Description\n\nThe `to_millis()` function converts an interval (duration) to its equivalent\nvalue in milliseconds as an integer. This is useful for calculations that\nrequire numeric representations of time durations. For example, an interval\nof "5 minutes" would return 300000 milliseconds.',
|
|
492
492
|
"function-reference/window-functions.md": "# Window Functions\n\nFunctions for analytical operations over data windows that must be used with the `WINDOW` command.\n\n## `row_number()`\n\nReturns a sequential row number for each row within a window partition.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `row_number()` function assigns a unique sequential integer to each row\nwithin its window partition, starting from 1. The ordering is determined by\nthe `SORT` clause in the `WINDOW` command. Rows with identical sort values\nreceive different row numbers in an arbitrary but consistent order.\n\n## `rank()`\n\nReturns the rank of each row within a window partition with gaps.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `rank()` function assigns a rank to each row within its window partition\nbased on the `SORT` clause ordering. Rows with identical sort values receive\nthe same rank, and subsequent ranks are skipped. For example, if two rows tie\nfor rank 2, the next row receives rank 4 (not rank 3).\n\n## `dense_rank()`\n\nReturns the rank of each row within a window partition without gaps.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `dense_rank()` function assigns a rank to each row within its window\npartition based on the `SORT` clause ordering. Rows with identical sort values\nreceive the same rank, but subsequent ranks are not skipped. For example, if\ntwo rows tie for rank 2, the next row receives rank 3.\n\n## `lag(expression, offset, ignore_nulls)`\n\nReturns the value of an expression from a previous row within the window.\n\n### Parameters\n\n- **expression** - Expression to evaluate from the previous row\n- **offset** - Integer specifying how many rows back to look\n- **ignore_nulls** - Boolean indicating whether to skip null values (default: true)\n\n### Description\n\nThe `lag()` function retrieves the value of the specified expression from a\nrow that is `offset` positions before the current row within the window\npartition. When `ignore_nulls` is true, null values are skipped when counting\nthe offset. If there is no row at the specified offset, the function returns null.\n\n## `lead(expression, offset, ignore_nulls)`\n\nReturns the value of an expression from a subsequent row within the window.\n\n### Parameters\n\n- **expression** - Expression to evaluate from the subsequent row\n- **offset** - Integer specifying how many rows ahead to look\n- **ignore_nulls** - Boolean indicating whether to skip null values (default: true)\n\n### Description\n\nThe `lead()` function retrieves the value of the specified expression from a\nrow that is `offset` positions after the current row within the window\npartition. When `ignore_nulls` is true, null values are skipped when counting\nthe offset. If there is no row at the specified offset, the function returns null.\n\n## `first_value(expression, ignore_nulls)`\n\nReturns the first value of an expression within the window frame.\n\n### Parameters\n\n- **expression** - Expression to evaluate\n- **ignore_nulls** - Boolean indicating whether to skip null values (default: true)\n\n### Description\n\nThe `first_value()` function returns the value of the specified expression from\nthe first row in the current window frame. When `ignore_nulls` is true, it\nreturns the first non-null value. The window frame is determined by the\n`WITHIN` clause in the `WINDOW` command.\n\n## `last_value(expression, ignore_nulls)`\n\nReturns the last value of an expression within the window frame.\n\n### Parameters\n\n- **expression** - Expression to evaluate\n- **ignore_nulls** - Boolean indicating whether to skip null values (default: true)\n\n### Description\n\nThe `last_value()` function returns the value of the specified expression from\nthe last row in the current window frame. When `ignore_nulls` is true, it\nreturns the last non-null value. The window frame is determined by the\n`WITHIN` clause in the `WINDOW` command.\n\n## `nth_value(expression, n, ignore_nulls)`\n\nReturns the nth value of an expression within the window frame.\n\n### Parameters\n\n- **expression** - Expression to evaluate\n- **n** - Integer specifying which value to return (1-based)\n- **ignore_nulls** - Boolean indicating whether to skip null values (default: true)\n\n### Description\n\nThe `nth_value()` function returns the value of the specified expression from\nthe nth row in the current window frame. When `ignore_nulls` is true, null\nvalues are not counted in the position. If there is no nth row, the function\nreturns null. The position is 1-based, where 1 represents the first row.\n\n## `cume_dist()`\n\nReturns the cumulative distribution of each row within the window partition.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `cume_dist()` function calculates the cumulative distribution of each row\nwithin its window partition. The result is the number of rows with values less\nthan or equal to the current row's value, divided by the total number of rows\nin the partition. Values range from 0 to 1.\n\n## `percent_rank()`\n\nReturns the percentile rank of each row within the window partition.\n\n### Parameters\n\nThis function takes no parameters.\n\n### Description\n\nThe `percent_rank()` function calculates the percentile rank of each row within\nits window partition. The result is calculated as (rank - 1) / (total rows - 1),\nwhere rank is determined by the `SORT` clause ordering. Values range from 0 to 1,\nwith 0 representing the lowest value and 1 representing the highest.",
|
|
493
493
|
"introduction.md": "# Introducing Hamelin\n\nHamelin is a **pipe-based query language** for **event analytics** which targets\nthe specific challenges detection engineers face when analyzing security events.\nThe language makes event correlation straightforward, letting you define\npatterns, correlate them across time windows, and match ordered sequences of\nevents.\n\n## Key Features\n\n### \u{1F504} Pipe-Based\n\nYou write queries that read naturally from top to bottom. Each operation\nconnects to the next using the pipe operator `|`. Pipe-based languages let you\nbuild queries incrementally, making them easier to read, write, and test than\napproaches that rely heavily on nested subqueries.\n\n```hamelin\nFROM events\n| WHERE event.action == 'login'\n| WITHIN -1hr\n| SELECT user.email, timestamp\n```\n\n### \u{1F550} Event-Native\n\nHamelin offers shorthand for working with timestamped events. Time intervals are\nwritten as simple expressions that match how you think about time. You can\nreference relative timestamps and truncate them to specific boundaries.\n\n```hamelin\n// Reference relative time\n| WITHIN -15m // events within the last 15 minutes\n| WITHIN -1h // events within the last hour\n| WITHIN -7d // events within the last 7 days\n\n// Truncate timestamps to boundaries\n| SELECT timestamp@h // truncate to hour boundary\n| SELECT timestamp@d // truncate to day boundary\n```\n\n### \u{1FA9F} Sliding Windows\n\nSliding windows move continuously with each event, giving you insights without\ngaps or duplicates. You can aggregate data over these moving time windows to\ndetect patterns as they happen.\n\n```hamelin\nFROM events\n| WHERE event.action == 'login'\n| WINDOW count()\n BY user.id\n WITHIN -15m\n```\n\n### \u{1F3AF} Correlation of Named Subqueries\n\nNamed subqueries let you define specific event patterns and correlate them\nwithin sliding windows. You can drop these patterns into sliding windows and\nwrite correlations around them. Hamelin makes it straightforward to aggregate\nover specific patterns while also aggregating over the entire group of events.\n\n```hamelin\nWITH failed_logins = FROM events\n| WHERE event.action == 'login_failed'\n\nWITH successful_logins = FROM events\n| WHERE event.action == 'login_success'\n\nFROM failed = failed_logins, success = successful_logins\n| WINDOW failures = count(failed),\n successes = count(success),\n total = count(),\n BY user.id\n WITHIN -5m\n| WHERE successes >= 1 && failures / total > 0.2\n```\n\nThis query demonstrates correlating failed and successful login events to detect\nbrute force attacks. Named subqueries define distinct event patterns:\n`failed_logins` filters to login failure events while `successful_logins`\nfilters to login success events. The sliding window aggregates these patterns by\nuser over 5-minute periods, counting failures, successes, and total events. The\nfinal filter identifies users who had at least one successful login where failed\nattempts represent more than 20% of their total login activity within that\nwindow.\n\n### \u{1F50D} Ordered Matching of Named Subqueries\n\nYou can ask Hamelin to match ordered patterns across events. Aggregations over sliding windows work well for many use cases, but others require that you search for specific events followed by other specific events. You can do that in Hamelin using regular expression quantifiers applied to named subqueries.\n\n```hamelin\nWITH failed_logins = FROM events\n| WHERE event.action == 'login_failed'\n\nWITH successful_logins = FROM events\n| WHERE event.action == 'login_success'\n\nMATCH failed_logins{10,} successful_logins+\nWHEN last(successful_logins.timestamp) - first(successful_logins.timestamp) < 10m\n```\n\nThis searches for 10 failed logins followed by at least one successful login in\na ten minute period. The sliding window approach might miss attack patterns\nwhere timing and sequence matter, but ordered matching can detect the exact\nprogression of a brute force attack.\n\n### \u{1F517} Event Type Expansion\n\nYou can query across different event types without worrying about schema\ndifferences. Hamelin automatically sets missing fields to `null` when they don't\nexist in a particular event type.\n\n```hamelin\nFROM login_events, logout_events, error_events\n// Filters by user.email when if this field exists in a row.\n// Drops rows where this field does not exist\n// (because NULL does not equal any string).\n| WHERE user.email == 'john@example.com'\n```\n\n### \u{1F5C2}\uFE0F Structured Types\n\nHamelin supports structured types like structs, arrays, and maps to represent\ncomplex data. These types make data modeling more familiar, and reduce the need\nto rely too much on joins in analytic queries.\n\n```hamelin\n// Create struct literals with nested data\nLET login_metadata = {\n ip_address: '192.168.1.100',\n user_agent: 'Mozilla/5.0',\n location: 'San Francisco'\n}\n\n// Access nested fields using dot notation\n| WHERE login_metadata.ip_address != '192.168.1.100'\n\n// Use arrays to store multiple related values\n| LET failed_attempts = [\n {timestamp: '2024-01-15T14:25:00Z', reason: 'invalid_password'},\n {timestamp: '2024-01-15T14:27:00Z', reason: 'account_locked'}\n ]\n\n// Use maps when key data is high cardinality\n// Using structs for this use case creates too many columns.\n| LET host_metrics = map(\n 'web-server-01': {cpu: 85.2, memory: 72.1},\n 'web-server-02': {cpu: 91.7, memory: 68.9},\n 'db-primary-01': {cpu: 67.3, memory: 89.4}\n )\n\n// Look up map values using index notation\n| WHERE host_metrics['web-server-01'].cpu > 80\n```\n\n### \u{1F4E1} Array Broadcasting\n\nHamelin makes working with arrays simpler by offering broadcasting, which helps\nyou distribute operations over each member of an array. It does this when you\napply an operation to an array that makes more sense to be applied to each of\nits members. Broadcasting lets you work with arrays using simple, familiar\nsyntax without asking you to resort to functional programming or inefficient\nunnesting.\n\n```hamelin\n| WHERE any(failed_attempts.reason == 'invalid_password')\n```\n\nThis example demonstrates how the equality operator `==` broadcasts across the\n`reason` field of each element in the `failed_attempts` array. This example\ndemonstrates *two* broadcasts:\n\n * first, the lookup of the `reason` field changes an array-of-struct into an\n array-of-string\n * second, applying equality to the resulting array applies it to each member\n\nHamelin can do this automatically because it is type-aware. It knows that\ncomparing equality between `array(string)` and `string` makes more sense to\nbroadcast: an array can never be equal to a string, but a member of an\n`array(string)` might be.\n\n### \u{1F500} Semi-Structured Types\n\nHamelin lets you parse json into instances of the `variant` type. This helps you\nhandle semi-structured data that doesn't fit nicely into fixed schemas. You can\nparse JSON strings, access their fields, and convert them to more structured\ntypes. This makes working with JSON feel fairly native.\n\n```hamelin\n// Parse JSON strings into the variant type\nFROM logs\n| LET event_data = parse_json(raw_json)\n\n// Access nested fields using dot notation\n| WHERE event_data.level AS string == 'ERROR'\n\n// Access json array elements with index notation\n| LET first_tag = event_data.tags[0]\n\n// Cast variant data to structured types when you need type safety.\n// Values that do not match will be null.\n| LET user_info = event_data.user AS {id: int, name: string}\n```\n\n### \u{1F6A8} Excellent Error Messages\n\nHamelin provides clear, helpful error messages. Error messages\npoint directly to the problematic Hamelin code and explain exactly what went\nwrong, rather than showing cryptic messages about generated SQL.\n\nThis matters especially when AI assistants write queries. AI tools need precise\ndescriptions of errors to fix queries and complete tasks. Clear error messages\nlet AI assistants debug queries effectively by giving the context needed to\ncorrect mistakes.\n\n```hamelin\nFROM simba.sysmon_events\n| AGG count() BY host.hostname\n| LET hostname = lower(host.hostname)\n```\n\ngenerates the error\n\n```\nError: problem doing translation\n \u256D\u2500[ :3:24 ]\n \u2502\n 3 \u2502 | LET hostname = lower(host.hostname)\n \u2502 \u2500\u2500\u252C\u2500\n \u2502 \u2570\u2500\u2500\u2500 error while translating\n \u2502\n \u2502 Note: unbound column reference: host\n \u2502\n \u2502 the following entries in the environment are close:\n \u2502 - `host.hostname` (you must actually wrap with ``)\n\u2500\u2500\u2500\u256F\n```\n\nHere, the user has forgotten to escape an identifier that contains a dot character.\n\n```hamelin\nFROM simba.sysmon_events\n| WINDOW count(),\n all(winlog.event_data.events)\n BY host.hostname\n```\n\ngenerates the error\n\n```\nError: problem doing translation\n \u256D\u2500[ :3:10 ]\n \u2502\n 3 \u2502 all(winlog.event_data.events)\n \u2502 \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252C\u2500\u252C\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n \u2502 \u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500 could not find a matching function definition\n \u2502 \u2502\n \u2502 \u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500 variant\n \u2502\n \u2502 Note: Attempted all(x=boolean)\n \u2502 - Type mismatch for x: expected boolean, got variant\n \u2502\n \u2502 Attempted all(x=array(boolean))\n \u2502 - Type mismatch for x: expected array(boolean), got variant\n \u2502\n\u2500\u2500\u2500\u256F\n```\n\nHere, the user has forgotten to cast variant to a primitive type so that it can\nbe matched against the function call. (A future version of Hamelin will probably\ncoerce this automatically!)\n",
|
|
494
494
|
"language-basics/aggregation.md": "# AGG: performing ordinary aggregation\n\nThe `AGG` command groups and aggregates datasets to create summary statistics\nand analytical insights. You can analyze user behavior patterns, system\nperformance metrics, or security events by grouping related records together and\napplying mathematical functions to each group.\n\n## AGG syntax\n\nThe `AGG` command follows a simple pattern that groups data and applies aggregation functions to each group:\n\n```hamelin\nAGG result = function(expression), ... BY grouping_expression, ...\n```\n\nWhen you omit the `BY` clause, Hamelin aggregates all records into a single group. This calculates overall dataset statistics and global metrics that span all records, counting all events across the entire dataset without any grouping or partitioning:\n\n```hamelin\nFROM events\n| AGG total_events = count()\n```\n\nWhen you omit explicit column names, Hamelin generates them automatically from\nthe expressions you provide. Learn more about this feature in [Automatic Field\nNames](../smart-features/automatic-field-names.md). This creates columns named\n`count()` and `avg(response_time)` that you can reference using backticks in\nsubsequent commands:\n\n```hamelin\nFROM requests\n| AGG count(), avg(response_time) BY service_name\n```\n\nWhen you omit aggregation functions entirely, you get distinct groups without any calculations. This returns the unique combinations of event_type and user_id without performing any mathematical operations:\n\n```hamelin\nFROM events\n| AGG BY event_type, user_id\n```\n\nYou can also rename columns in the BY clause and use any expression for grouping. This example groups by renamed event_type, truncated timestamp, and extracted email domain, creating clear column names for downstream analysis:\n\n```hamelin\nFROM events\n| AGG\n total_events = count(),\n avg_duration = avg(duration)\n BY event_category = event_type,\n hour_bucket = timestamp@hr,\n user_domain = split(email, '@')[1]\n```\n\n## Simple aggregation examples\n\n### Basic counting\n\nEvent counting groups events by their characteristics and calculates how many events fall into each category. Notice that Hamelin uses `count()` with no arguments, not `count(*)` like SQL. The empty parentheses count all rows in each group, providing a clean syntax for the most common aggregation operation:\n\n```hamelin\nFROM events\n| AGG event_count = count() BY event_type\n```\n\n### Multiple aggregations\n\nCalculating several metrics at once in a single `AGG` command ensures all metrics use consistent grouping logic:\n\n```hamelin\nFROM requests\n| AGG\n total_requests = count(),\n avg_response_time = avg(response_time),\n max_response_time = max(response_time),\n error_count = count_if(status_code >= 400)\n BY service_name\n```\n\n### Conditional aggregation\n\nConditional aggregation functions like `count_if()` let you count only rows that meet specific conditions without pre-filtering the dataset. Conditional aggregation maintains the full context of each group while applying different filters to different calculations:\n\n```hamelin\nFROM auth_logs\n| AGG\n failures = count_if(outcome == 'FAILURE'),\n successes = count_if(outcome == 'SUCCESS')\n BY user_name\n```\n\n## Time series aggregations\n\nTime series aggregations combine time truncation with grouping to create time-based buckets for temporal analysis. Time-based grouping creates time-bucketed summaries for monitoring system performance, tracking business metrics, and understanding user behavior patterns across different time scales.\n\n### Hourly summaries\n\nHourly aggregations provide detailed views of system activity and user behavior throughout the day:\n\n```hamelin\nFROM logs\n| AGG\n hourly_events = count(),\n avg_response = avg(response_time),\n error_rate = count_if(status >= 400) / count()\n BY timestamp@hr\n| SORT timestamp@hr\n```\n\n### Daily trends\n\nDaily aggregations reveal longer-term trends and enable comparison across different time periods:\n\n```hamelin\nFROM events\n| WITHIN -30d..now()\n| AGG\n daily_events = count(),\n unique_users = count_distinct(user_name),\n high_severity = count_if(severity = 'HIGH')\n BY timestamp@d\n| SORT timestamp@d DESC\n```\n",
|