flagsmith-sql-flag-engine 0.1.0__tar.gz

@@ -0,0 +1,153 @@
+ Metadata-Version: 2.4
+ Name: flagsmith-sql-flag-engine
+ Version: 0.1.0
+ Summary: SQL translator for Flagsmith segment predicates.
+ Author: Flagsmith
+ Author-email: Flagsmith <engineering@flagsmith.com>
+ License-Expression: BSD-3-Clause
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Classifier: Programming Language :: SQL
+ Classifier: Topic :: Database
+ Requires-Dist: flagsmith-flag-engine>=10
+ Requires-Dist: jsonpath-rfc9535>=0.2
+ Requires-Python: >=3.10
+ Project-URL: Homepage, https://github.com/Flagsmith/flagsmith-sql-flag-engine
+ Description-Content-Type: text/markdown
+
+ # flagsmith-sql-flag-engine
+
+ SQL translator for Flagsmith segment predicates.
+
+ Where the Python and Rust `flag_engine` implementations evaluate
+ `is_context_in_segment` against an in-memory `EvaluationContext`, this
+ package takes a `SegmentContext` and emits a SQL `WHERE` expression that
+ evaluates the segment against an entire `IDENTITIES` table — one row per
+ identity, with the identity's full trait map held in a single column
+ the translator path-extracts at query time. `PERCENTAGE_SPLIT` and
+ `:semver`-marked comparators compile to inline pure-SQL.
+
+ ## Quickstart
+
+ ```python
+ from flag_engine.context.types import EvaluationContext, SegmentContext
+
+ from flagsmith_sql_flag_engine import TranslateContext, translate_segment
+ from flagsmith_sql_flag_engine.dialects import ClickHouseDialect
+
+ eval_context: EvaluationContext = {
+     "environment": {"key": "n9fbf9...3ngWhb", "name": "Production"},
+ }
+ ctx = TranslateContext(evaluation_context=eval_context, dialect=ClickHouseDialect())
+
+ segment: SegmentContext = {
+     "key": "growth-cohort",
+     "name": "Growth cohort",
+     "rules": [
+         {
+             "type": "ALL",
+             "conditions": [
+                 {"operator": "EQUAL", "property": "plan", "value": "growth"},
+             ],
+         },
+     ],
+ }
+ where_expr = translate_segment(segment, ctx)
+ # where_expr is a SQL string. Drop into:
+ # SELECT COUNT(*) FROM IDENTITIES i
+ # WHERE i.environment_id = 'n9fbf9...3ngWhb' AND ({where_expr})
+ ```
+
+ `environment_id` in the `IDENTITIES` table is a string column holding
+ `EnvironmentContext.key` directly — the same identifier the engine uses,
+ no separate integer PK.
+
+ `translate_segment` returns `None` if the segment uses an operator the
+ translator can't handle — typically a REGEX pattern the active dialect's
+ regex flavour can't compile. Callers should fall back to
+ `flag_engine.is_context_in_segment` for those segments.
+
+ ## Schema
+
+ Each dialect publishes the table layout it expects via a `schema_ddl`
+ constant. For ClickHouse:
+
+ ```sql
+ CREATE TABLE IF NOT EXISTS IDENTITIES (
+     environment_id String,
+     id UInt64,
+     identifier String,
+     identity_key String,
+     traits JSON
+ )
+ ENGINE = MergeTree()
+ ORDER BY (environment_id, id);
+ ```
+
+ Traits live in a single `JSON` column (CH 24+, GA in 25.x). Each key is
+ stored as a typed subcolumn, so trait reads are direct columnar scans
+ rather than per-row JSON parses. Trait keys are *data* — new keys appear
+ without schema changes — and the translator only sees the abstract path
+ extraction.
+
+ ClickHouse Cloud requires `SET allow_experimental_json_type = 1` when
+ creating a `JSON`-column table (the type is GA on OSS 25.x); the test
+ harness applies this setting automatically.
+
+ Programmatic access:
+
+ ```python
+ from flagsmith_sql_flag_engine.dialects.clickhouse import SCHEMA_DDL
+ ```
+
+ ## Engine parity
+
+ Validated against [Flagsmith/engine-test-data](https://github.com/Flagsmith/engine-test-data),
+ the test suite every engine implementation is checked against. The
+ engine-parity suite loads each test case's identity into a per-dialect
+ scratch table, translates the case's segments, runs the generated SQL,
+ and compares to `flag_engine.is_context_in_segment`.
+
+ To run the engine-parity suite locally:
+
+ ```bash
+ git submodule update --init # pull engine-test-data
+ docker compose up --detach --wait clickhouse
+ uv run pytest tests/test_engine.py
+ ```
+
+ Adding a new dialect's parity coverage is one harness module — see
+ `tests/harnesses/` for the shape.
+
+ ## Dialects
+
+ The translator is dialect-aware: a `Dialect` protocol abstracts the
+ SQL fragments that differ across SQL engines — MD5 hex, hex-to-int
+ parsing, prefix-anchored regex, padded-version comparison, type-aware
+ trait predicates, regex flavour. Today `ClickHouseDialect` is the only
+ implementation; adding another engine such as Snowflake, DuckDB or
+ Postgres means writing one class.
+
+ ## Operator coverage
+
+ | Operator | Translatable | Notes |
+ | -------------------------------------------- | :----------: | -------------------------------------------------------------- |
+ | `EQUAL`, `NOT_EQUAL`, `IN` | yes | |
+ | `IS_SET`, `IS_NOT_SET` | yes | trait subcolumn `IS NOT NULL` / `IS NULL` |
+ | `CONTAINS`, `NOT_CONTAINS` | yes | |
+ | `GREATER_THAN`, `LESS_THAN` plus `_INCLUSIVE`| yes | |
+ | `MODULO` | yes | |
+ | `PERCENTAGE_SPLIT` | yes | inlined MD5-mod-9999; ~0.005% diverge on hash==9998 |
+ | `REGEX` | partial | dialect-flavour gated; unsupported patterns → caller fallback |
+ | `:semver`-marked comparators | yes | major.minor.patch only; ignores prerelease |
+
+ ## Development
+
+ ```bash
+ make install # uv sync + pre-commit install
+ make lint # run pre-commit hooks across the tree
+ make typecheck # mypy
+ make test # unit tests
+ ```
+
+ Ruff (lint + format) runs as a pre-commit hook on every commit. Mypy
+ runs as a `make typecheck` hook on staged Python files.
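
Reviewer note on the README above: it states that `translate_segment` returns `None` for segments the translator cannot handle and that callers should fall back to `flag_engine.is_context_in_segment`. The following is a minimal sketch of how a caller might split segments into a SQL path and an in-memory fallback path. `Plan` and `plan_segments` are hypothetical names introduced only for illustration, and the translator output is replaced with stand-in values so the block runs without the package installed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """How one segment will be evaluated (hypothetical helper type)."""

    segment_key: str
    where_expr: Optional[str]  # None means: fall back to in-memory evaluation

    @property
    def uses_sql(self) -> bool:
        return self.where_expr is not None


def plan_segments(translated: dict[str, Optional[str]]) -> tuple[list[Plan], list[Plan]]:
    """Split segments into SQL-evaluable ones and in-memory fallbacks.

    `translated` maps a segment key to whatever `translate_segment` returned
    for it: a SQL string, or None when the translator could not handle it.
    """
    plans = [Plan(key, expr) for key, expr in translated.items()]
    sql_plans = [p for p in plans if p.uses_sql]
    fallback_plans = [p for p in plans if not p.uses_sql]
    return sql_plans, fallback_plans


# Stand-in results; in real use these would come from translate_segment().
translated = {
    "growth-cohort": "<translated predicate>",  # placeholder, not real translator output
    "regex-segment": None,  # e.g. a REGEX pattern the dialect cannot compile
}
sql_plans, fallback_plans = plan_segments(translated)
```

Segments in `sql_plans` can be counted with one query each against `IDENTITIES`; segments in `fallback_plans` would need per-identity evaluation through `flag_engine`.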
@@ -0,0 +1,137 @@
+ # flagsmith-sql-flag-engine
+
+ SQL translator for Flagsmith segment predicates.
+
+ Where the Python and Rust `flag_engine` implementations evaluate
+ `is_context_in_segment` against an in-memory `EvaluationContext`, this
+ package takes a `SegmentContext` and emits a SQL `WHERE` expression that
+ evaluates the segment against an entire `IDENTITIES` table — one row per
+ identity, with the identity's full trait map held in a single column
+ the translator path-extracts at query time. `PERCENTAGE_SPLIT` and
+ `:semver`-marked comparators compile to inline pure-SQL.
+
+ ## Quickstart
+
+ ```python
+ from flag_engine.context.types import EvaluationContext, SegmentContext
+
+ from flagsmith_sql_flag_engine import TranslateContext, translate_segment
+ from flagsmith_sql_flag_engine.dialects import ClickHouseDialect
+
+ eval_context: EvaluationContext = {
+     "environment": {"key": "n9fbf9...3ngWhb", "name": "Production"},
+ }
+ ctx = TranslateContext(evaluation_context=eval_context, dialect=ClickHouseDialect())
+
+ segment: SegmentContext = {
+     "key": "growth-cohort",
+     "name": "Growth cohort",
+     "rules": [
+         {
+             "type": "ALL",
+             "conditions": [
+                 {"operator": "EQUAL", "property": "plan", "value": "growth"},
+             ],
+         },
+     ],
+ }
+ where_expr = translate_segment(segment, ctx)
+ # where_expr is a SQL string. Drop into:
+ # SELECT COUNT(*) FROM IDENTITIES i
+ # WHERE i.environment_id = 'n9fbf9...3ngWhb' AND ({where_expr})
+ ```
+
+ `environment_id` in the `IDENTITIES` table is a string column holding
+ `EnvironmentContext.key` directly — the same identifier the engine uses,
+ no separate integer PK.
+
+ `translate_segment` returns `None` if the segment uses an operator the
+ translator can't handle — typically a REGEX pattern the active dialect's
+ regex flavour can't compile. Callers should fall back to
+ `flag_engine.is_context_in_segment` for those segments.
+
+ ## Schema
+
+ Each dialect publishes the table layout it expects via a `schema_ddl`
+ constant. For ClickHouse:
+
+ ```sql
+ CREATE TABLE IF NOT EXISTS IDENTITIES (
+     environment_id String,
+     id UInt64,
+     identifier String,
+     identity_key String,
+     traits JSON
+ )
+ ENGINE = MergeTree()
+ ORDER BY (environment_id, id);
+ ```
+
+ Traits live in a single `JSON` column (CH 24+, GA in 25.x). Each key is
+ stored as a typed subcolumn, so trait reads are direct columnar scans
+ rather than per-row JSON parses. Trait keys are *data* — new keys appear
+ without schema changes — and the translator only sees the abstract path
+ extraction.
+
+ ClickHouse Cloud requires `SET allow_experimental_json_type = 1` when
+ creating a `JSON`-column table (the type is GA on OSS 25.x); the test
+ harness applies this setting automatically.
+
+ Programmatic access:
+
+ ```python
+ from flagsmith_sql_flag_engine.dialects.clickhouse import SCHEMA_DDL
+ ```
+
+ ## Engine parity
+
+ Validated against [Flagsmith/engine-test-data](https://github.com/Flagsmith/engine-test-data),
+ the test suite every engine implementation is checked against. The
+ engine-parity suite loads each test case's identity into a per-dialect
+ scratch table, translates the case's segments, runs the generated SQL,
+ and compares to `flag_engine.is_context_in_segment`.
+
+ To run the engine-parity suite locally:
+
+ ```bash
+ git submodule update --init # pull engine-test-data
+ docker compose up --detach --wait clickhouse
+ uv run pytest tests/test_engine.py
+ ```
+
+ Adding a new dialect's parity coverage is one harness module — see
+ `tests/harnesses/` for the shape.
+
+ ## Dialects
+
+ The translator is dialect-aware: a `Dialect` protocol abstracts the
+ SQL fragments that differ across SQL engines — MD5 hex, hex-to-int
+ parsing, prefix-anchored regex, padded-version comparison, type-aware
+ trait predicates, regex flavour. Today `ClickHouseDialect` is the only
+ implementation; adding another engine such as Snowflake, DuckDB or
+ Postgres means writing one class.
+
+ ## Operator coverage
+
+ | Operator | Translatable | Notes |
+ | -------------------------------------------- | :----------: | -------------------------------------------------------------- |
+ | `EQUAL`, `NOT_EQUAL`, `IN` | yes | |
+ | `IS_SET`, `IS_NOT_SET` | yes | trait subcolumn `IS NOT NULL` / `IS NULL` |
+ | `CONTAINS`, `NOT_CONTAINS` | yes | |
+ | `GREATER_THAN`, `LESS_THAN` plus `_INCLUSIVE`| yes | |
+ | `MODULO` | yes | |
+ | `PERCENTAGE_SPLIT` | yes | inlined MD5-mod-9999; ~0.005% diverge on hash==9998 |
+ | `REGEX` | partial | dialect-flavour gated; unsupported patterns → caller fallback |
+ | `:semver`-marked comparators | yes | major.minor.patch only; ignores prerelease |
+
+ ## Development
+
+ ```bash
+ make install # uv sync + pre-commit install
+ make lint # run pre-commit hooks across the tree
+ make typecheck # mypy
+ make test # unit tests
+ ```
+
+ Ruff (lint + format) runs as a pre-commit hook on every commit. Mypy
+ runs as a `make typecheck` hook on staged Python files.
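
Reviewer note on the operator table above: the `:semver` row says only major.minor.patch is compared and prerelease tags are ignored, and the Dialects section mentions "padded-version comparison". A plausible reading is that each numeric component is left-padded to a fixed width so that plain string comparison orders versions numerically. The sketch below shows that idea in Python. It is an assumption about the technique, not the package's actual SQL; `PAD_WIDTH` and `padded_version_key` are hypothetical names.

```python
import re
from typing import Optional

PAD_WIDTH = 10  # hypothetical width; the package's actual padding is not shown in this diff


def padded_version_key(version: str) -> Optional[str]:
    """Build a string key whose lexicographic order matches numeric order.

    Only the first three digit runs (major, minor, patch) are used, so any
    prerelease suffix is ignored. Returns None if fewer than three exist.
    """
    runs = re.findall(r"\d+", version)
    if len(runs) < 3:
        return None
    # Zero-pad each component on the left so "10" sorts after "9".
    return ".".join(run.rjust(PAD_WIDTH, "0") for run in runs[:3])


newer = padded_version_key("1.10.0")
older = padded_version_key("1.9.3")
```

Without padding, `"1.10.0" < "1.9.3"` as plain strings, which is the ordering bug the padding avoids.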
@@ -0,0 +1,75 @@
+ [project]
+ name = "flagsmith-sql-flag-engine"
+ version = "0.1.0"
+ description = "SQL translator for Flagsmith segment predicates."
+ readme = "README.md"
+ authors = [{ name = "Flagsmith", email = "engineering@flagsmith.com" }]
+ requires-python = ">=3.10"
+ license = "BSD-3-Clause"
+ classifiers = [
+     "Programming Language :: Python :: 3 :: Only",
+     "Programming Language :: SQL",
+     "Topic :: Database",
+ ]
+ dependencies = ["flagsmith-flag-engine>=10", "jsonpath-rfc9535>=0.2"]
+
+ [project.urls]
+ Homepage = "https://github.com/Flagsmith/flagsmith-sql-flag-engine"
+
+ [dependency-groups]
+ dev = [
+     "pytest>=8",
+     "pytest-xdist>=3",
+     "mypy>=1.10",
+     "prek>=0.3",
+     "clickhouse-connect>=0.7",
+     "json5>=0.14.0",
+     "pytest-cov>=7.1.0",
+ ]
+
+ [build-system]
+ requires = ["uv_build>=0.8.14,<0.9.0"]
+ build-backend = "uv_build"
+
+ [tool.pytest.ini_options]
+ addopts = [
+     "-ra",
+     "--cov",
+     "src",
+     "--cov-report",
+     "term-missing",
+     "--cov-report",
+     "xml",
+ ]
+ testpaths = ["tests"]
+
+ [tool.coverage.run]
+ branch = true
+ source = ["src"]
+
+ [tool.coverage.report]
+ # `match` statements exhaustive over a Literal type record a phantom
+ # fall-through branch from the last case to function exit; coverage.py
+ # can't see the type-system exhaustiveness mypy enforces. Treat any
+ # `case` line as a possibly-partial branch so the gate stays at 100%
+ # without us littering the source with `# pragma: no branch`.
+ partial_branches = [
+     "pragma: no branch",
+     "case .+:",
+ ]
+
+ [tool.ruff]
+ target-version = "py310"
+ line-length = 100
+
+ [tool.ruff.lint]
+ select = ["E", "F", "I", "B", "UP"]
+
+ [tool.mypy]
+ strict = true
+ python_version = "3.10"
+ files = ["src/flagsmith_sql_flag_engine", "tests"]
+
+ [[tool.mypy.overrides]]
+ module = "clickhouse_connect.*"
+ ignore_missing_imports = true
@@ -0,0 +1,28 @@
+ """SQL translator for Flagsmith segment predicates.
+
+ Public API:
+     translate_segment(segment, ctx) -> str | None
+     TranslateContext
+
+ See README.md for usage. The translator is dialect-aware via the `Dialect`
+ protocol; `flagsmith_sql_flag_engine.dialects.clickhouse.ClickHouseDialect`
+ is the only implementation today.
+ """
+
+ from flagsmith_sql_flag_engine.dialect import Dialect
+ from flagsmith_sql_flag_engine.translator import (
+     TRANSLATABLE_OPERATORS,
+     TranslateContext,
+     translate_condition,
+     translate_rule,
+     translate_segment,
+ )
+
+ __all__ = [
+     "TRANSLATABLE_OPERATORS",
+     "Dialect",
+     "TranslateContext",
+     "translate_condition",
+     "translate_rule",
+     "translate_segment",
+ ]
@@ -0,0 +1,125 @@
+ """Per-dialect SQL fragments — MD5 hex, hex-to-int parsing, prefix-anchored
+ regex, padded-version comparison, type-aware trait predicates, regex flavour."""
+
+ from typing import Protocol
+
+
+ class Dialect(Protocol):
+     """Per-dialect SQL fragments.
+
+     Methods return SQL string fragments. Inputs are already-formatted SQL
+     strings (column refs, string literals); the dialect only chooses the
+     right syntax for the operation.
+     """
+
+     name: str  # human-readable, used in test ids and error messages
+
+     # --- IDENTITIES schema access ---
+     #
+     # The dialect owns the canonical IDENTITIES schema, see `schema_ddl`,
+     # so it also owns the SQL expression for each logical column. The
+     # translator just hands over an alias.
+
+     def identifier_expr(self, alias: str) -> str:
+         """SQL expression for `$.identity.identifier`."""
+         ...
+
+     def identity_key_expr(self, alias: str) -> str:
+         """SQL expression for `$.identity.key`."""
+         ...
+
+     def trait_path(self, alias: str, trait_key: str) -> str:
+         """Path-extract a trait value from the IDENTITIES traits container.
+
+         The path syntax varies by SQL engine.
+         """
+         ...
+
+     def trait_eq(self, alias: str, trait_key: str, value: object, negate: bool) -> str:
+         """Type-aware EQUAL / NOT_EQUAL predicate on a trait, mirroring
+         `flag_engine`'s per-type coercion: the segment value is cast to
+         the trait's runtime type before compare, and a cast failure
+         means no match for both ops. Implementation is dialect-specific
+         because trait-type discrimination and runtime type-coercion
+         casts both vary by engine.
+         """
+         ...
+
+     def trait_in(self, alias: str, trait_key: str, items: list[str]) -> str:
+         """Type-aware IN predicate on a trait, mirroring engine semantics:
+         string trait does direct lookup; integer trait stringifies and
+         looks up; other trait types never match. `items` is the parsed
+         candidate list per `flag_engine`'s `_get_in_values`.
+         """
+         ...
+
+     # --- string operations ---
+
+     def position(self, needle_lit: str, haystack_expr: str) -> str:
+         """Boolean: does the string literal `needle_lit` appear in
+         `haystack_expr`? Used for CONTAINS / NOT_CONTAINS."""
+         ...
+
+     def lpad(self, expr: str, width: int, pad_lit: str) -> str:
+         """Left-pad `expr` to `width` using `pad_lit`."""
+         ...
+
+     def coalesce(self, *exprs: str) -> str:
+         """COALESCE/NVL-style: first non-null."""
+         ...
+
+     # --- regex ---
+
+     def regex_supports(self, pattern: str) -> bool:
+         """Return True if this dialect's regex engine can compile
+         `pattern`. The translator falls back to `None` for any REGEX
+         condition where this returns False, letting the caller defer
+         to `flag_engine`."""
+         ...
+
+     def regexp_anchored_match(self, value_expr: str, pattern: str) -> str:
+         """Boolean: equivalent to Python `re.match(pattern, value)` —
+         anchored at position 0, may be a prefix of the value, not a
+         full-match.
+
+         `pattern` is the raw Python regex string; the dialect handles
+         its own escaping into a SQL literal, since regex flavours
+         differ in how backslashes are treated."""
+         ...
+
+     def regexp_nth_digit_run(self, value_expr: str, n: int) -> str:
+         """Extract the n-th sequence of digits from `value_expr`. Returns NULL
+         if there are fewer than n digit runs. Used for semver."""
+         ...
+
+     # --- hashing primitives for PERCENTAGE_SPLIT ---
+
+     def md5_hex(self, expr: str) -> str:
+         """SQL fragment producing the lowercase 32-char hex MD5 digest."""
+         ...
+
+     def parse_hex_chunk(self, hex_expr: str, start: int, length: int = 8) -> str:
+         """Parse `length` hex characters of `hex_expr` starting at 1-indexed
+         `start` into a non-negative integer."""
+         ...
+
+     # --- type casts ---
+
+     def cast_string(self, expr: str) -> str:
+         """Cast `expr` to STRING / VARCHAR."""
+         ...
+
+     def cast_float(self, expr: str) -> str:
+         """Cast `expr` to a 64-bit float / DOUBLE."""
+         ...
+
+     def cast_number(self, expr: str) -> str:
+         """Cast `expr` to a NUMBER / BIGINT — the engine-side numeric
+         type used for modulo arithmetic."""
+         ...
+
+     # --- composition ---
+
+     def mod(self, dividend: str, divisor: str) -> str:
+         """`dividend MOD divisor` returning a numeric value."""
+         ...
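
Reviewer note on the `Dialect` protocol above: several docstrings define their SQL fragments by reference to Python behaviour (`re.match` semantics, a lowercase 32-character MD5 hex digest, 1-indexed hex chunk parsing). When writing a new dialect, it may help to have plain-Python reference functions to compare the generated SQL against. The functions below are such a sketch, derived only from the docstrings. They are not part of the package, and treating `n` in `nth_digit_run` as 1-indexed is an assumption based on the "fewer than n digit runs" wording.

```python
import hashlib
import re
from typing import Optional


def anchored_match(pattern: str, value: str) -> bool:
    """`re.match` semantics: must match at position 0, may stop before the end."""
    return re.match(pattern, value) is not None


def nth_digit_run(value: str, n: int) -> Optional[str]:
    """Return the n-th run of digits (assumed 1-indexed), or None if absent."""
    runs = re.findall(r"\d+", value)
    return runs[n - 1] if 1 <= n <= len(runs) else None


def md5_hex(value: str) -> str:
    """Lowercase 32-character hex MD5 digest of the UTF-8 encoded string."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def parse_hex_chunk(hex_str: str, start: int, length: int = 8) -> int:
    """Parse `length` hex characters beginning at 1-indexed `start`."""
    begin = start - 1  # convert SQL-style 1-indexing to Python 0-indexing
    return int(hex_str[begin : begin + length], 16)


digest = md5_hex("")
first_chunk = parse_hex_chunk(digest, 1)
```

A parity test for a new dialect could run the dialect's fragment in the target database and compare the result with these functions for the same input.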
@@ -0,0 +1,5 @@
+ """Dialect implementations."""
+
+ from flagsmith_sql_flag_engine.dialects.clickhouse import ClickHouseDialect
+
+ __all__ = ["ClickHouseDialect"]