PyPI: duckdb-sqlalchemy 1.4.4.tar.gz → 1.4.4.2.tar.gz

Files changed (72)

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/CHANGELOG.md RENAMED

@@ -6,6 +6,29 @@ preserved from the upstream project for historical context.
 ## Maintained in this fork
+## [1.4.4.2](https://github.com/leonardovida/duckdb-sqlalchemy/compare/v1.4.4...v1.4.4.2) (2026-02-05)
+### Security
+* validate config keys before `SET` statements to block SQL injection payloads
+* validate preload extension names before `LOAD`
+* validate COPY helper table/column/option identifiers and reject SQL fragments
+### Testing
+* gate pandas tests on supported pandas/SQLAlchemy combinations
+* pin `pandas<2.2` in `nox` SQLAlchemy 1.x sessions for stable matrix runs
+### Typing
+* align SQLAlchemy compatibility shims and test typing to satisfy `ty`
+## [1.4.4.1](https://github.com/leonardovida/duckdb-sqlalchemy/compare/v1.4.4...v1.4.4.1) (2026-02-05)
+### Documentation
+* document DuckDB multiprocessing fork-safety caveat and `spawn`/`forkserver` workaround
 ## [1.4.4](https://github.com/leonardovida/duckdb-sqlalchemy/compare/v1.4.3...v1.4.4) (2026-02-03)
 ### Versioning

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: duckdb-sqlalchemy
-Version: 1.4.4
+Version: 1.4.4.2
 Summary: DuckDB SQLAlchemy dialect for DuckDB and MotherDuck
 Project-URL: Bug Tracker, https://github.com/leonardovida/duckdb-sqlalchemy/issues
 Project-URL: Changelog, https://github.com/leonardovida/duckdb-sqlalchemy/releases
@@ -58,14 +58,45 @@ Description-Content-Type: text/markdown
 duckdb-sqlalchemy is a DuckDB SQLAlchemy dialect for DuckDB and MotherDuck. It supports SQLAlchemy Core and ORM APIs for local DuckDB and MotherDuck connections.
+For new projects, this repository is the recommended dialect when you want production-oriented defaults, explicit MotherDuck guidance, and a clear migration path from older package names.
 The dialect handles pooling defaults, bulk inserts, type mappings, and cloud-specific configuration.
-## Why this dialect
+## Why choose duckdb-sqlalchemy today
 - **SQLAlchemy compatibility**: Core, ORM, Alembic, and reflection.
 - **MotherDuck support**: Token handling, attach modes, session hints, and read scaling helpers.
 - **Operational defaults**: Pooling defaults, transient retry for reads, and bulk insert optimization via Arrow/DataFrame registration.
-- **Maintained**: Tracks current DuckDB releases with a long-term support posture.
+- **Active release cadence**: Tracks current DuckDB releases with a long-term support posture.
+| Area | `duckdb-sqlalchemy` (this repo) | `duckdb_engine` |
+| --- | --- | --- |
+| Package/module name | `duckdb-sqlalchemy` / `duckdb_sqlalchemy` | `duckdb-engine` / `duckdb_engine` |
+| SQLAlchemy driver URL | `duckdb://` | `duckdb://` |
+| MotherDuck workflow coverage | Dedicated URL helper (`MotherDuckURL`), connection guidance, and examples | No dedicated MotherDuck usage section in the upstream README |
+| Operational guidance | Documented pooling defaults, read-scaling helpers, and bulk insert patterns | Basic configuration guidance in upstream README |
+| Migration path | Explicit migration guide from older package names | Migration to this package is documented in this repo |
+| Project direction | Release policy, changelog, roadmap, and docs site are maintained here | Upstream README focuses on the core driver usage |
+## Coming from duckdb_engine?
+If you already use `duckdb-engine`, migration is straightforward:
+- keep the SQLAlchemy URL scheme (`duckdb://`)
+- install `duckdb-sqlalchemy`
+- switch imports to `duckdb_sqlalchemy`
+See the full guide: [docs/migration-from-duckdb-engine.md](docs/migration-from-duckdb-engine.md).
+## Project lineage
+This project is a heavily modified fork of `Mause/duckdb_engine` and continues to preserve upstream history in `CHANGELOG.md`.
+Current direction in this repository:
+- package and module rename to `duckdb-sqlalchemy` / `duckdb_sqlalchemy`
+- production-oriented defaults for local DuckDB and MotherDuck deployments
+- docs-first maintenance with versioned release notes and a published docs site
 ## Compatibility

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/README.md RENAMED

@@ -6,14 +6,45 @@
 duckdb-sqlalchemy is a DuckDB SQLAlchemy dialect for DuckDB and MotherDuck. It supports SQLAlchemy Core and ORM APIs for local DuckDB and MotherDuck connections.
+For new projects, this repository is the recommended dialect when you want production-oriented defaults, explicit MotherDuck guidance, and a clear migration path from older package names.
 The dialect handles pooling defaults, bulk inserts, type mappings, and cloud-specific configuration.
-## Why this dialect
+## Why choose duckdb-sqlalchemy today
 - **SQLAlchemy compatibility**: Core, ORM, Alembic, and reflection.
 - **MotherDuck support**: Token handling, attach modes, session hints, and read scaling helpers.
 - **Operational defaults**: Pooling defaults, transient retry for reads, and bulk insert optimization via Arrow/DataFrame registration.
-- **Maintained**: Tracks current DuckDB releases with a long-term support posture.
+- **Active release cadence**: Tracks current DuckDB releases with a long-term support posture.
+| Area | `duckdb-sqlalchemy` (this repo) | `duckdb_engine` |
+| --- | --- | --- |
+| Package/module name | `duckdb-sqlalchemy` / `duckdb_sqlalchemy` | `duckdb-engine` / `duckdb_engine` |
+| SQLAlchemy driver URL | `duckdb://` | `duckdb://` |
+| MotherDuck workflow coverage | Dedicated URL helper (`MotherDuckURL`), connection guidance, and examples | No dedicated MotherDuck usage section in the upstream README |
+| Operational guidance | Documented pooling defaults, read-scaling helpers, and bulk insert patterns | Basic configuration guidance in upstream README |
+| Migration path | Explicit migration guide from older package names | Migration to this package is documented in this repo |
+| Project direction | Release policy, changelog, roadmap, and docs site are maintained here | Upstream README focuses on the core driver usage |
+## Coming from duckdb_engine?
+If you already use `duckdb-engine`, migration is straightforward:
+- keep the SQLAlchemy URL scheme (`duckdb://`)
+- install `duckdb-sqlalchemy`
+- switch imports to `duckdb_sqlalchemy`
+See the full guide: [docs/migration-from-duckdb-engine.md](docs/migration-from-duckdb-engine.md).
+## Project lineage
+This project is a heavily modified fork of `Mause/duckdb_engine` and continues to preserve upstream history in `CHANGELOG.md`.
+Current direction in this repository:
+- package and module rename to `duckdb-sqlalchemy` / `duckdb_sqlalchemy`
+- production-oriented defaults for local DuckDB and MotherDuck deployments
+- docs-first maintenance with versioned release notes and a published docs site
 ## Compatibility

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/configuration.md RENAMED

@@ -57,6 +57,9 @@ engine = create_engine(
 )
 ```
+For safety, extension names must be plain identifiers (`[A-Za-z0-9_]+`).
+Values containing spaces, punctuation, or SQL fragments are rejected.
 ## Register filesystems
 You can register filesystems via `fsspec`:

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/migration-from-duckdb-engine.md RENAMED

@@ -5,7 +5,7 @@ title: Migration from duckdb_engine
 # Migration from duckdb_engine
-This project is the actively maintained DuckDB SQLAlchemy dialect. If you are coming from the older `duckdb_engine` package, migrate as follows:
+`duckdb-sqlalchemy` is the recommended package name for new work in this repository. If you are coming from `duckdb_engine`, migrate as follows:
 ## Package and import rename
@@ -28,3 +28,4 @@ SQLAlchemy URLs use the `duckdb://` driver name in both packages. Existing URLs
 - The package name is now `duckdb-sqlalchemy` and the module is `duckdb_sqlalchemy`.
 - The dialect remains registered as `duckdb` for SQLAlchemy.
 - See `docs/motherduck.md` for MotherDuck-specific behavior.
+- See `README.md` for project lineage, release policy, and roadmap links.

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/motherduck.md RENAMED

@@ -41,6 +41,13 @@ engine = create_engine(
 )
 ```
+## Multiprocessing (fork)
+DuckDB's Python client is not fork-safe, so `multiprocessing` children created with
+`fork` can fail when opening new connections (commonly observed with MotherDuck or
+file-backed databases). Use the `spawn` or `forkserver` start methods and create
+engines/connections inside the child process.
 ## Options
 ### Connection-string parameters (instance cache key)

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/olap.md RENAMED

@@ -113,6 +113,10 @@ with engine.begin() as conn:
     copy_from_csv(conn, "events", "data/events.csv", header=True)
 ```
+For safety, string table names, column names, and COPY option keys must be
+identifiers. Dotted paths like `schema.events` are supported, but SQL
+fragments are rejected.
 For row iterables, you can stream to a temporary CSV in chunks:
 ```python

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/seo-checklist.md RENAMED

@@ -22,3 +22,4 @@ Use this list to validate indexability after each docs update or release.
 - Project name and description are consistent in README, docs, and PyPI metadata.
 - URLs in `pyproject.toml` match the docs site.
+- README and docs clearly differentiate this fork's scope from upstream `duckdb_engine` content.

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/docs/types-and-caveats.md RENAMED

@@ -87,3 +87,11 @@ users = Table(
 ## Pandas chunksize
 Older DuckDB versions (< 0.5.0) may have issues with `pandas.read_sql(..., chunksize=...)`. If you hit errors, use `chunksize=None` or upgrade DuckDB.
+## Multiprocessing (fork)
+DuckDB's Python bindings are not fork-safe. Creating a new connection in a
+`multiprocessing` child process created with `fork` can raise runtime errors
+(for example, `RuntimeError: thread::join failed: No such process`), especially
+with MotherDuck or file-backed connections. Prefer `spawn` or `forkserver`, and
+initialize engines/connections in the child process.
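
The fork-safety guidance above can be sketched as follows. This is a minimal illustration, not code from the package: the function names `query_in_child` and `run_in_children` are hypothetical, and it assumes `duckdb-sqlalchemy` and SQLAlchemy are installed in the environment where the children run. The key points are that the start method is chosen explicitly and that the engine is created inside the child, never inherited from the parent.

```python
import multiprocessing as mp
from typing import Any, List, Sequence


def query_in_child(database: str) -> Any:
    """Runs in the child process (hypothetical helper, not part of the package)."""
    # Import and create the engine inside the child so that no DuckDB
    # state is inherited from the parent process.
    from sqlalchemy import create_engine, text

    engine = create_engine(f"duckdb:///{database}")
    try:
        with engine.connect() as conn:
            return conn.execute(text("SELECT 42")).scalar()
    finally:
        engine.dispose()


def run_in_children(
    databases: Sequence[str], start_method: str = "spawn"
) -> List[Any]:
    """Map query_in_child over databases using an explicit start method."""
    # get_context() scopes the start method to this pool instead of
    # changing the process-wide default.
    ctx = mp.get_context(start_method)
    with ctx.Pool(processes=2) as pool:
        return pool.map(query_in_child, list(databases))
```

With `spawn`, the calling script should invoke `run_in_children([":memory:", ":memory:"])` under an `if __name__ == "__main__":` guard, because each child re-imports the main module.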

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/duckdb_sqlalchemy/__init__.py RENAMED

@@ -15,6 +15,7 @@ from typing import (
     Sequence,
     Tuple,
     Type,
+    cast,
 )
 import duckdb
@@ -38,6 +39,7 @@ from sqlalchemy.sql import bindparam
 from sqlalchemy.sql.selectable import Select
 from ._supports import has_comment_support
+from ._validation import validate_extension_name
 from .bulk import copy_from_csv, copy_from_parquet, copy_from_rows
 from .capabilities import get_capabilities
 from .config import apply_config, get_core_config
@@ -56,11 +58,15 @@ from .olap import read_csv, read_csv_auto, read_parquet, table_function
 from .url import URL, make_url
 try:
-    from sqlalchemy.dialects.postgresql.base import PGExecutionContext
+    from sqlalchemy.dialects.postgresql import base as _pg_base
 except ImportError:  # pragma: no cover - fallback for older SQLAlchemy
-    PGExecutionContext = DefaultExecutionContext
+    _PGExecutionContext = DefaultExecutionContext
+else:
+    _PGExecutionContext = getattr(
+        _pg_base, "PGExecutionContext", DefaultExecutionContext
+    )
-__version__ = "1.4.4"
+__version__ = "1.4.4.2"
 sqlalchemy_version = sqlalchemy.__version__
 SQLALCHEMY_VERSION = Version(sqlalchemy_version)
 SQLALCHEMY_2 = SQLALCHEMY_VERSION >= Version("2.0.0")
@@ -71,7 +77,9 @@ supports_user_agent: bool = _capabilities.supports_user_agent
 if TYPE_CHECKING:
     from sqlalchemy.engine import Connection
-    from sqlalchemy.engine.reflection import ReflectedCheckConstraint, ReflectedIndex
+    ReflectedCheckConstraint = Dict[str, Any]
+    ReflectedIndex = Dict[str, Any]
     from .capabilities import DuckDBCapabilities
@@ -318,7 +326,7 @@ class DuckDBArrowResult:
         return iter(self._result)
-class DuckDBExecutionContext(PGExecutionContext):
+class DuckDBExecutionContext(_PGExecutionContext):
     @classmethod
     def _init_compiled(
         cls,
@@ -369,8 +377,9 @@ class DuckDBExecutionContext(PGExecutionContext):
         arraysize = self.execution_options.get("duckdb_arraysize")
         if arraysize is None:
             arraysize = self.execution_options.get("arraysize")
-        if arraysize is not None and hasattr(self.cursor, "arraysize"):
-            self.cursor.arraysize = arraysize
+        cursor = getattr(self, "cursor", None)
+        if arraysize is not None and hasattr(cursor, "arraysize"):
+            cursor.arraysize = arraysize
         result = super()._setup_result_proxy()
         if self.execution_options.get("duckdb_arrow") and getattr(
             result, "returns_rows", False
@@ -607,7 +616,7 @@ class Dialect(PGDialect_psycopg2):
         conn = duckdb.connect(*cargs, **cparams)
         for extension in preload_extensions:
-            conn.execute(f"LOAD {extension}")
+            conn.execute(f"LOAD {validate_extension_name(extension)}")
         for filesystem in filesystems:
             conn.register_filesystem(filesystem)
@@ -875,7 +884,7 @@ class Dialect(PGDialect_psycopg2):
     @cache  # type: ignore[call-arg]
     def get_columns(  # type: ignore[no-untyped-def]
-        self, connection: "Connection", table_name: str, schema=None, **kw: Any
+        self, connection: "Connection", table_name: str, schema=None, **kw: "Any"
     ):
         try:
             return super().get_columns(connection, table_name, schema=schema, **kw)
@@ -887,7 +896,7 @@ class Dialect(PGDialect_psycopg2):
     @cache  # type: ignore[call-arg]
     def get_foreign_keys(  # type: ignore[no-untyped-def]
-        self, connection: "Connection", table_name: str, schema=None, **kw: Any
+        self, connection: "Connection", table_name: str, schema=None, **kw: "Any"
     ):
         try:
             return super().get_foreign_keys(connection, table_name, schema=schema, **kw)
@@ -898,7 +907,7 @@ class Dialect(PGDialect_psycopg2):
     @cache  # type: ignore[call-arg]
     def get_unique_constraints(  # type: ignore[no-untyped-def]
-        self, connection: "Connection", table_name: str, schema=None, **kw: Any
+        self, connection: "Connection", table_name: str, schema=None, **kw: "Any"
     ):
         try:
             return super().get_unique_constraints(
@@ -911,7 +920,7 @@ class Dialect(PGDialect_psycopg2):
     @cache  # type: ignore[call-arg]
     def get_check_constraints(  # type: ignore[no-untyped-def]
-        self, connection: "Connection", table_name: str, schema=None, **kw: Any
+        self, connection: "Connection", table_name: str, schema=None, **kw: "Any"
     ):
         try:
             return super().get_check_constraints(
@@ -1019,7 +1028,7 @@ class Dialect(PGDialect_psycopg2):
                 import pandas as pd  # type: ignore[import-not-found]
                 rows = parameters if isinstance(parameters, list) else list(parameters)
-                data = pd.DataFrame(rows, columns=column_names)
+                data = pd.DataFrame(rows, columns=cast(Any, column_names))
             except Exception:
                 data = None
             if data is None:
@@ -1119,16 +1128,25 @@ class Dialect(PGDialect_psycopg2):
         self._execute_with_retry(cursor, statement, parameters, context, executor)
-    def do_execute_no_params(
-        self,
-        cursor: Any,
-        statement: str,
-        context: Optional[Any] = None,
-    ) -> None:
+    def do_execute_no_params(self, cursor: Any, statement: str, *args: Any) -> None:
+        parameters: Any = None
+        context: Optional[Any] = None
+        if len(args) == 1:
+            context = cast(Optional[Any], args[0])
+        elif len(args) >= 2:
+            parameters = args[0]
+            context = cast(Optional[Any], args[1])
         def executor() -> Any:
-            return DefaultDialect.do_execute_no_params(self, cursor, statement, context)
+            if parameters is None:
+                return DefaultDialect.do_execute_no_params(
+                    self, cursor, statement, context
+                )
+            return DefaultDialect.do_execute(
+                self, cursor, statement, parameters, context
+            )
-        self._execute_with_retry(cursor, statement, None, context, executor)
+        self._execute_with_retry(cursor, statement, parameters, context, executor)
     def _pg_class_filter_scope_schema(
         self,
@@ -1160,10 +1178,10 @@ class Dialect(PGDialect_psycopg2):
         # reflection to avoid Catalog Errors during SQLAlchemy 2.x reflection.
         from sqlalchemy.dialects.postgresql import base as pg_base
-        pg_catalog = pg_base.pg_catalog
-        REGCLASS = pg_base.REGCLASS
-        TEXT = pg_base.TEXT
-        OID = pg_base.OID
+        pg_catalog = getattr(pg_base, "pg_catalog")
+        REGCLASS = getattr(pg_base, "REGCLASS")
+        TEXT = getattr(pg_base, "TEXT")
+        OID = getattr(pg_base, "OID")
         server_version_info = self.server_version_info or (0,)
@@ -1241,7 +1259,7 @@ class Dialect(PGDialect_psycopg2):
         collate = sql.null().label("collation")
-        relkinds = self._kind_to_relkinds(kind)
+        relkinds = getattr(super(), "_kind_to_relkinds")(kind)
         query = (
             select(
                 pg_catalog.pg_attribute.c.attname.label("name"),
@@ -1275,7 +1293,7 @@ class Dialect(PGDialect_psycopg2):
                     == pg_catalog.pg_attribute.c.attnum,
                 ),
             )
-            .where(self._pg_class_relkind_condition(relkinds))
+            .where(getattr(super(), "_pg_class_relkind_condition")(relkinds))
             .order_by(pg_catalog.pg_class.c.relname, pg_catalog.pg_attribute.c.attnum)
         )
         query = self._pg_class_filter_scope_schema(query, schema, scope=scope)
@@ -1339,15 +1357,20 @@ class Dialect(PGDialect_psycopg2):
         # dictionary with (name, ) if default search path or (schema, name)
         # as keys
+        load_enums = getattr(self, "_load_enums")
+        try:
+            enum_records = load_enums(
+                connection, schema="*", info_cache=kw.get("info_cache")
+            )
+        except TypeError:
+            enum_records = load_enums(connection, schema="*")
         enums = dict(
             (
                 ((rec["name"],), rec)
                 if rec["visible"]
                 else ((rec["schema"], rec["name"]), rec)
             )
-            for rec in self._load_enums(  # type: ignore[attr-defined]
-                connection, schema="*", info_cache=kw.get("info_cache")
-            )
+            for rec in enum_records
         )
         columns = self._get_columns_info(rows, domains, enums, schema)  # type: ignore[attr-defined]
@@ -1361,9 +1384,9 @@ class Dialect(PGDialect_psycopg2):
         self, schema: str, has_filter_names: bool, scope: Any, kind: Any
     ):
         if SQLALCHEMY_VERSION >= Version("2.0.36"):
-            from sqlalchemy.dialects.postgresql import (  # type: ignore[attr-defined]
-                pg_catalog,
-            )
+            from sqlalchemy.dialects.postgresql import base as pg_base
+            pg_catalog = getattr(pg_base, "pg_catalog")
             if (
                 hasattr(super(), "_kind_to_relkinds")

duckdb_sqlalchemy-1.4.4.2/duckdb_sqlalchemy/_validation.py ADDED

@@ -0,0 +1,38 @@
+import re
+from typing import Iterable
+IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
+EXTENSION_RE = re.compile(r"^[A-Za-z0-9_]+$")
+def validate_identifier(value: str, *, kind: str = "identifier") -> str:
+    if not isinstance(value, str):
+        raise ValueError(f"{kind} must be a string")
+    if not IDENTIFIER_RE.fullmatch(value):
+        raise ValueError(f"invalid {kind}: {value!r}")
+    return value
+def validate_dotted_identifier(value: str, *, kind: str = "identifier") -> str:
+    if not isinstance(value, str):
+        raise ValueError(f"{kind} must be a string")
+    parts = value.split(".")
+    if not parts or any(not part for part in parts):
+        raise ValueError(f"invalid {kind}: {value!r}")
+    for part in parts:
+        validate_identifier(part, kind=kind)
+    return value
+def validate_extension_name(value: str) -> str:
+    if not isinstance(value, str):
+        raise ValueError("extension name must be a string")
+    if not EXTENSION_RE.fullmatch(value):
+        raise ValueError(f"invalid extension name: {value!r}")
+    return value
+def validate_identifier_list(
+    values: Iterable[str], *, kind: str = "identifier"
+) -> tuple[str, ...]:
+    return tuple(validate_identifier(value, kind=kind) for value in values)
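
To see how the two patterns in `_validation.py` differ, the sketch below copies both regular expressions from the diff and classifies a few sample strings. The `classify` helper and the sample list are illustrative only and are not part of the package. Note that the extension pattern accepts a leading digit while the identifier pattern does not.

```python
import re
from typing import Dict

# Patterns copied verbatim from the _validation.py hunk above.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
EXTENSION_RE = re.compile(r"^[A-Za-z0-9_]+$")


def classify(value: str) -> Dict[str, bool]:
    """Report which of the two patterns accepts value (illustrative helper)."""
    return {
        "identifier": IDENTIFIER_RE.fullmatch(value) is not None,
        "extension": EXTENSION_RE.fullmatch(value) is not None,
    }


samples = [
    "httpfs",
    "_private",
    "1st",  # leading digit: valid extension name, invalid identifier
    "",
    "my-ext",
    "sqlite; CREATE TABLE pwned_ext(i INTEGER); --",  # payload from the tests
]
report = {sample: classify(sample) for sample in samples}

for sample, verdict in report.items():
    print(f"{sample!r}: {verdict}")
```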

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/duckdb_sqlalchemy/bulk.py RENAMED

@@ -3,6 +3,12 @@ import tempfile
 from pathlib import Path
 from typing import Any, Iterable, Mapping, Optional, Sequence, Tuple, Union
+from ._validation import (
+    validate_dotted_identifier,
+    validate_identifier,
+    validate_identifier_list,
+)
 TableLike = Union[str, Any]
@@ -25,7 +31,7 @@ def _format_copy_options(options: Mapping[str, Any]) -> str:
     for key, value in options.items():
         if value is None:
             continue
-        opt_key = str(key).upper()
+        opt_key = validate_identifier(str(key), kind="COPY option key").upper()
         if isinstance(value, (list, tuple)):
             inner = ", ".join(_quote_literal(v) for v in value)
             parts.append(f"{opt_key} ({inner})")
@@ -46,21 +52,28 @@ def _format_table(connection: Any, table: TableLike) -> str:
         schema = getattr(table, "schema", None)
         name = getattr(table, "name", None)
         if schema:
-            return f"{schema}.{name}"
-        return str(name)
-    return str(table)
+            schema_name = validate_dotted_identifier(
+                str(schema), kind="table schema identifier"
+            )
+            table_name = validate_identifier(str(name), kind="table identifier")
+            return f"{schema_name}.{table_name}"
+        return validate_identifier(str(name), kind="table identifier")
+    table_name = str(table)
+    validate_dotted_identifier(table_name, kind="table identifier")
+    return table_name
 def _format_columns(connection: Any, columns: Optional[Sequence[str]]) -> str:
     if not columns:
         return ""
+    validated_columns = validate_identifier_list(columns, kind="column identifier")
     preparer = getattr(
         getattr(connection, "dialect", None), "identifier_preparer", None
     )
     if preparer is None:
-        cols = ", ".join(columns)
+        cols = ", ".join(validated_columns)
     else:
-        cols = ", ".join(preparer.quote_identifier(col) for col in columns)
+        cols = ", ".join(preparer.quote_identifier(col) for col in validated_columns)
     return f" ({cols})"
@@ -115,6 +128,7 @@ def _copy_from_file(
     columns: Optional[Sequence[str]] = None,
     **options: Any,
 ) -> Any:
+    validate_identifier(format_name, kind="COPY format")
     table_name = _format_table(connection, table)
     column_clause = _format_columns(connection, columns)
     path_literal = _quote_literal(path)
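
As a rough illustration of what the `bulk.py` changes enforce, the sketch below validates a dotted table name and an optional column list before assembling the target clause of a `COPY` statement. `format_copy_target` is a hypothetical, simplified stand-in for `_format_table` and `_format_columns`; it skips identifier quoting and SQLAlchemy `Table` objects entirely.

```python
import re
from typing import Optional, Sequence

# Same identifier pattern as in _validation.py.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(value: str, kind: str) -> str:
    if not isinstance(value, str) or not IDENTIFIER_RE.fullmatch(value):
        raise ValueError(f"invalid {kind}: {value!r}")
    return value


def format_copy_target(table: str, columns: Optional[Sequence[str]] = None) -> str:
    """Return 'schema.table (col, ...)' after validating every identifier."""
    # Every dot-separated part must be a plain identifier; empty parts
    # (for example 'schema..events') fail the pattern and are rejected.
    for part in table.split("."):
        _check(part, "table identifier")
    if not columns:
        return table
    cols = ", ".join(_check(col, "column identifier") for col in columns)
    return f"{table} ({cols})"


target = format_copy_target("schema.events", ["id", "ts"])

try:
    format_copy_target("safe FROM 'x'; CREATE TABLE pwned_bulk(i INTEGER); --")
    rejected = False
except ValueError:
    rejected = True
```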

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/duckdb_sqlalchemy/config.py RENAMED

@@ -1,13 +1,15 @@
 import os
 from decimal import Decimal
 from functools import lru_cache
-from typing import Dict, Set, Type, Union
+from typing import Any, Dict, Set, Type, Union
 import duckdb
 from sqlalchemy import Boolean, Float, Integer, String
 from sqlalchemy.engine import Dialect
 from sqlalchemy.sql.type_api import TypeEngine
+from ._validation import validate_identifier
 TYPES: Dict[Type, TypeEngine] = {
     bool: Boolean(),
     int: Integer(),
@@ -37,7 +39,7 @@ def get_core_config() -> Set[str]:
 def apply_config(
     dialect: Dialect,
-    conn: duckdb.DuckDBPyConnection,
+    conn: Any,
     ext: Dict[str, Union[str, int, bool, float, None]],
 ) -> None:
     # TODO: does sqlalchemy have something that could do this for us?
@@ -48,8 +50,9 @@ def apply_config(
     string_processor = String().literal_processor(dialect=dialect)
     for k, v in ext.items():
+        key = validate_identifier(k, kind="config key")
         if v is None:
-            conn.execute(f"SET {k} = NULL")
+            conn.execute(f"SET {key} = NULL")
             continue
         if isinstance(v, os.PathLike):
             v = os.fspath(v)
@@ -67,4 +70,4 @@ def apply_config(
                 v = str(v)
                 process = string_processor
         assert process, f"Not able to configure {k} with {v}"
-        conn.execute(f"SET {k} = {process(v)}")
+        conn.execute(f"SET {key} = {process(v)}")
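
The effect of validating config keys before interpolating them into `SET` can be sketched without a database. `build_set_statements` and `render_literal` below are hypothetical helpers: the real `apply_config` uses SQLAlchemy literal processors and executes each statement as it goes, whereas this sketch builds every statement first, so a bad key prevents any statement from being produced.

```python
import re
from typing import Dict, List, Union

# Same identifier pattern as in _validation.py.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

ConfigValue = Union[str, int, bool, float, None]


def render_literal(value: ConfigValue) -> str:
    """Simplified literal rendering (assumption, not the package's logic)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_set_statements(config: Dict[str, ConfigValue]) -> List[str]:
    statements = []
    for key, value in config.items():
        if not isinstance(key, str) or not IDENTIFIER_RE.fullmatch(key):
            raise ValueError(f"invalid config key: {key!r}")
        statements.append(f"SET {key} = {render_literal(value)}")
    return statements


statements = build_set_statements({"threads": 4, "memory_limit": "1GB"})

try:
    build_set_statements({"threads = 1; CREATE TABLE pwned_cfg(i INTEGER); --": "x"})
    injection_rejected = False
except ValueError:
    injection_rejected = True
```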

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/duckdb_sqlalchemy/tests/test_basic.py RENAMED

@@ -37,6 +37,7 @@ from sqlalchemy.engine.reflection import Inspector
 from sqlalchemy.exc import DBAPIError
 from sqlalchemy.ext.declarative import declarative_base
 from sqlalchemy.orm import Session, relationship, sessionmaker
+from sqlalchemy.pool import QueuePool
 from .. import Dialect, insert, supports_attach, supports_user_agent
 from .._supports import has_comment_support
@@ -572,7 +573,7 @@ def test_do_ping(tmp_path: Path, caplog: LogCaptureFixture) -> None:
         "duckdb:///" + str(tmp_path / "db"),
         pool_pre_ping=True,
         pool_size=1,
-        poolclass=sqlalchemy.pool.QueuePool,
+        poolclass=QueuePool,
     )
     logger = cast(logging.Logger, engine.pool.logger)  # type: ignore
@@ -615,7 +616,8 @@ def test_361(engine: Engine) -> None:
         metadata = MetaData()
         metadata.reflect(bind=conn)
-        test = metadata.tables["test"]
+        tables = cast(dict[str, Table], metadata.tables)
+        test = tables["test"]
         part = "year"
         date_part = func.date_part(part, test.c.dt)

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/duckdb_sqlalchemy/tests/test_core_units.py RENAMED

@@ -1,4 +1,5 @@
-from typing import cast
+from pathlib import Path
+from typing import Any, cast
 from urllib.parse import parse_qs
 import duckdb
@@ -7,6 +8,7 @@ from sqlalchemy import Integer, String, pool
 from sqlalchemy import exc as sa_exc
 from sqlalchemy.engine import URL as SAURL
+import duckdb_sqlalchemy
 from duckdb_sqlalchemy import (
     URL,
     ConnectionWrapper,
@@ -28,6 +30,7 @@ from duckdb_sqlalchemy import (
 )
 from duckdb_sqlalchemy import datatypes as dt
 from duckdb_sqlalchemy import motherduck as md
+from duckdb_sqlalchemy.bulk import copy_from_csv
 from duckdb_sqlalchemy.config import TYPES, apply_config, get_core_config
@@ -472,15 +475,93 @@ def test_struct_or_union_requires_fields() -> None:
     preparer = dialect.identifier_preparer
     with pytest.raises(sa_exc.CompileError):
-        dt.struct_or_union(dt.Struct(), compiler, preparer)
+        dt.struct_or_union(dt.Struct(), cast(Any, compiler), preparer)
     struct = dt.Struct({"first name": String, "age": Integer})
-    rendered = dt.struct_or_union(struct, compiler, preparer)
+    rendered = dt.struct_or_union(struct, cast(Any, compiler), preparer)
     assert rendered.startswith("(")
     assert rendered.endswith(")")
     assert '"first name"' in rendered
+def test_apply_config_rejects_invalid_key_no_side_effect() -> None:
+    conn = duckdb.connect(":memory:")
+    dialect = Dialect()
+    with pytest.raises(ValueError, match="invalid config key"):
+        apply_config(
+            dialect,
+            conn,
+            {"threads = 1; CREATE TABLE pwned_cfg(i INTEGER); --": "x"},
+        )
+    found = conn.execute(
+        "SELECT COUNT(*) FROM duckdb_tables() WHERE table_name='pwned_cfg'"
+    ).fetchone()
+    assert found is not None
+    assert found[0] == 0
+def test_connect_rejects_invalid_extension_before_execute(
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    get_core_config()
+    class DummyConn:
+        def __init__(self) -> None:
+            self.executed: list[str] = []
+        def execute(self, statement: str) -> None:
+            self.executed.append(statement)
+        def register_filesystem(self, filesystem: object) -> None:
+            return None
+    dummy = DummyConn()
+    monkeypatch.setattr(duckdb_sqlalchemy.duckdb, "connect", lambda *a, **k: dummy)
+    with pytest.raises(ValueError, match="invalid extension name"):
+        Dialect().connect(
+            database=":memory:",
+            preload_extensions=["sqlite; CREATE TABLE pwned_ext(i INTEGER); --"],
+            config={},
+        )
+    assert dummy.executed == []
+def test_copy_from_csv_rejects_invalid_table_and_option_key(
+    tmp_path: Path,
+) -> None:
+    conn = duckdb.connect(":memory:")
+    conn.execute("CREATE TABLE safe(i INTEGER)")
+    csv_path = tmp_path / "rows.csv"
+    csv_path.write_text("1\n")
+    with pytest.raises(ValueError, match="invalid table identifier"):
+        copy_from_csv(
+            conn,
+            "safe FROM 'x'; CREATE TABLE pwned_bulk(i INTEGER); --",
+            csv_path,
+        )
+    with pytest.raises(ValueError, match="invalid COPY option key"):
+        bad_options: dict[str, Any] = {
+            "header); CREATE TABLE pwned_opt(i INTEGER); --": True
+        }
+        copy_from_csv(
+            conn,
+            "safe",
+            csv_path,
+            **bad_options,
+        )
+    found = conn.execute(
+        "SELECT COUNT(*) FROM duckdb_tables() WHERE table_name IN ('pwned_bulk', 'pwned_opt')"
+    ).fetchone()
+    assert found is not None
+    assert found[0] == 0
 def test_parse_register_params_dict_and_tuple() -> None:
     view_name, df = _parse_register_params({"view_name": "v", "df": "data"})
     assert view_name == "v"

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/noxfile.py RENAMED

@@ -3,6 +3,7 @@ from typing import Generator
 import github_action_utils as gha
 import nox
+from packaging.version import Version
 nox.options.default_venv_backend = "uv"
 nox.options.error_on_external_run = True
@@ -61,6 +62,8 @@ def tests_core(session: nox.Session, duckdb: str, sqlalchemy: str) -> None:
         session.install("-e", ".[dev]")
         operator = "==" if sqlalchemy.count(".") == 2 else "~="
         session.install(f"sqlalchemy{operator}{sqlalchemy}")
+        if Version(sqlalchemy) < Version("2.0"):
+            session.install("pandas<2.2")
         if duckdb == "master":
             session.install("duckdb", "--pre", "-U")
         else:

{duckdb_sqlalchemy-1.4.4 → duckdb_sqlalchemy-1.4.4.2}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "duckdb-sqlalchemy"
-version = "1.4.4"
+version = "1.4.4.2"
 description = "DuckDB SQLAlchemy dialect for DuckDB and MotherDuck"
 authors = [
     {name = "Leonardo Vida", email = "lleonardovida@gmail.com"},