dictselect 0.1.0__tar.gz

Files changed (12)

dictselect-0.1.0/PKG-INFO +181 -0
dictselect-0.1.0/README.md +158 -0
dictselect-0.1.0/dictselect/__init__.py +1 -0
dictselect-0.1.0/dictselect/dictselect.py +405 -0
dictselect-0.1.0/dictselect/tests/__init__.py +0 -0
dictselect-0.1.0/dictselect/tests/test_dictselect.py +347 -0
dictselect-0.1.0/dictselect.egg-info/PKG-INFO +181 -0
dictselect-0.1.0/dictselect.egg-info/SOURCES.txt +10 -0
dictselect-0.1.0/dictselect.egg-info/dependency_links.txt +1 -0
dictselect-0.1.0/dictselect.egg-info/top_level.txt +1 -0
dictselect-0.1.0/setup.cfg +4 -0
dictselect-0.1.0/setup.py +25 -0

dictselect-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,181 @@
+Metadata-Version: 2.1
+Name: dictselect
+Version: 0.1.0
+Summary: A lazy selector for nested Python data structures.
+Home-page: UNKNOWN
+Author: alphacena
+Author-email: lukas.makswitis@gmail.com
+License: MIT
+Platform: UNKNOWN
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+# dictselect
+A Python library for extracting data from nested dicts and lists using reusable pipelines.
+```python
+from dictselect import Selector
+pipe = Selector["annotations"][:]["x_min", "x_max"]
+my_data_selection = pipe(data_dict)
+```
+## Installation
+```bash
+pip install dictselect
+```
+Requires Python ≥ 3.9.
+## How it works
+Build a `Selector` by chaining operations, then call it with your data. The pipeline is reusable: build it once and apply it to as many data objects as needed.
+```python
+from dictselect import Selector
+data_dict = {
+    "image_id": "xa001",
+    "annotations": [
+        {"id": 1, "x_min": 10, "x_max": 20, "label": "cat"},
+        {"id": 2, "x_min": 30, "x_max": 50, "label": "dog"},
+    ],
+}
+Selector["image_id"](data_dict)                          # → "xa001"
+Selector["annotations"][0]["label"](data_dict)           # → "cat"
+Selector["annotations"][:]["label"](data_dict)           # → ["cat", "dog"]
+Selector["annotations"][:]["x_min", "x_max"](data_dict)  # → [[10, 20], [30, 50]]
+```
+## Operations
+| Syntax                             | What it does                                                            |
+|------------------------------------|-------------------------------------------------------------------------|
+| `Selector["key"]`                  | Dict key or list index lookup                                           |
+| `Selector[0]`, `Selector[-1]`      | List index                                                              |
+| `Selector[1:3]`                    | Slice                                                                   |
+| `Selector[:]` or `Selector[...]`   | Fan-out — apply the rest of the chain to **every element** at this step |
+| `Selector["a", "b"]`               | Select multiple keys at once; returns a list                            |
+| `Selector.method()`                | Call a method on the current value                                      |
+| `pipe_a + pipe_b`                  | Compose two pipelines into one                                          |
+| `Selector.invoke(*args, **kwargs)` | Call the function if the current value is a function                    |
+### Fan-out `[:]`
+`[:]` maps the remaining steps over every element of the current sequence, so each step after `[:]` runs on one element at a time.
+```python
+data = [{"v": 1}, {"v": 2}, {"v": 3}]
+Selector[:]["v"](data)   # → [1, 2, 3]
+Selector[:][:][0]([[[10, 20]], [[30, 40]]])  # → [[10], [30]]  (nested fan-out)
+```
+### Multi-key input `["a", "b"]`
+Returns a list containing one value per key. All keys must be the same type (all strings or all integers).
+```python
+Selector["x", "y"]({"x": 1, "y": 2, "z": 3})  # → [1, 2]
+```
+### Method calls
+Access an attribute, then call it like a regular Python method.
+```python
+Selector.upper()("hello")               # → "HELLO"
+Selector[:].upper()(["hi", "there"])    # → ["HI", "THERE"]
+```
+### Composition
+Join two pipelines with `+`.
+```python
+head = Selector["data"][:]
+tail = Selector["value"]
+(head + tail)({"data": [{"value": 1}, {"value": 2}]})  # → [1, 2]
+```
+## Including keys in the result
+Pass `include_keys=True` to wrap the result in a dict keyed by the last key in the chain. This works for single-key lookups and multi-key selections.
+```python
+Selector["a"]["b"]({"a": {"b": 12}}, include_keys=True)
+# → {"b": 12}
+Selector[:]["a"]([{"a": 1}, {"a": 2}], include_keys=True)
+# → [{"a": 1}, {"a": 2}]
+Selector[:]["a", "b"]([{"a": 1, "b": 2, "c": 3}, {"a": 4, "c": 6, "b": 5}], include_keys=True)
+# → [{"a": 1, "b": 2}, {"a": 4, "b": 5}]
+```
+Also works on `.apply()`:
+```python
+Selector["x"].apply({"x": 7}, include_keys=True)  # → {"x": 7}
+```
+## Handling missing values
+Pass `include_null=True` to get `None` instead of a `KeyError`/`IndexError` when a key or index doesn't exist. Once a step fails, the rest of the chain is skipped and `None` is returned.
+```python
+Selector["a"]["missing"]({"a": {}}, include_null=True)
+# → None   (instead of KeyError)
+Selector[:]["x"]([{"x": 1}, {"y": 2}, {"x": 3}], include_null=True)
+# → [1, None, 3]
+Selector["a", "b"]({"a": 1}, include_null=True)
+# → [1, None]   (missing keys in multi-select become None individually)
+```
+The two flags can be combined:
+```python
+Selector[:]["x"]([{"x": 1}, {"y": 2}], include_null=True, include_keys=True)
+# → [{"x": 1}, {"x": None}]
+```
+## Calling vs. evaluating
+Normally, calling a selector evaluates it:
+```python
+pipe = Selector["key"]
+pipe({"key": 42})  # → 42
+```
+**Exception 1:** if the last step is an attribute name (e.g. `.upper`), calling it *records* a method call instead of evaluating. Use `.apply(data)` to force evaluation in that case.
+```python
+pipe = Selector["title"].upper()       # records .upper() call
+pipe({"title": "hello"})               # evaluates → "HELLO"
+Selector.upper.apply("hello")          # force evaluation → <method object>
+```
+**Exception 2:** if the last step is a key lookup whose value is a function, a plain call is treated as evaluation rather than as a call to that function. Use `.invoke(*args, **kwargs)` to record the call as a step instead.
+```python
+pipe = Selector["function"]()            # treated as evaluation with no data argument → TypeError
+pipe = Selector["function"].invoke()     # records a call step; the function runs when the pipeline is evaluated
+```
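
For orientation, the selections documented above correspond to ordinary comprehensions. The sketch below does not import dictselect; it restates four of the README examples in plain Python so the expected shapes are easy to check. The names `rows`, `with_nulls`, and `keyed` are illustrative and not part of the package.

```python
# Plain-Python equivalents of four README examples (no dictselect import).
data_dict = {
    "image_id": "xa001",
    "annotations": [
        {"id": 1, "x_min": 10, "x_max": 20, "label": "cat"},
        {"id": 2, "x_min": 30, "x_max": 50, "label": "dog"},
    ],
}

# Selector["annotations"][:]["label"]
labels = [a["label"] for a in data_dict["annotations"]]

# Selector["annotations"][:]["x_min", "x_max"]
boxes = [[a[k] for k in ("x_min", "x_max")] for a in data_dict["annotations"]]

# Selector[:]["x"] with include_null=True: a missing key yields None
rows = [{"x": 1}, {"y": 2}, {"x": 3}]
with_nulls = [row.get("x") for row in rows]

# The same selection with include_keys=True as well
keyed = [{"x": row.get("x")} for row in rows]

print(labels)      # ['cat', 'dog']
print(boxes)       # [[10, 20], [30, 50]]
print(with_nulls)  # [1, None, 3]
print(keyed)       # [{'x': 1}, {'x': None}, {'x': 3}]
```

The comprehension form is fine for a one-off query; the selector form is aimed at cases where the same access path is stored and reused.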

dictselect-0.1.0/README.md ADDED

@@ -0,0 +1,158 @@
+# dictselect
+A Python library for extracting data from nested dicts and lists using reusable pipelines.
+```python
+from dictselect import Selector
+pipe = Selector["annotations"][:]["x_min", "x_max"]
+my_data_selection = pipe(data_dict)
+```
+## Installation
+```bash
+pip install dictselect
+```
+Requires Python ≥ 3.9.
+## How it works
+Build a `Selector` by chaining operations, then call it with your data. The pipeline is reusable: build it once and apply it to as many data objects as needed.
+```python
+from dictselect import Selector
+data_dict = {
+    "image_id": "xa001",
+    "annotations": [
+        {"id": 1, "x_min": 10, "x_max": 20, "label": "cat"},
+        {"id": 2, "x_min": 30, "x_max": 50, "label": "dog"},
+    ],
+}
+Selector["image_id"](data_dict)                          # → "xa001"
+Selector["annotations"][0]["label"](data_dict)           # → "cat"
+Selector["annotations"][:]["label"](data_dict)           # → ["cat", "dog"]
+Selector["annotations"][:]["x_min", "x_max"](data_dict)  # → [[10, 20], [30, 50]]
+```
+## Operations
+| Syntax                             | What it does                                                            |
+|------------------------------------|-------------------------------------------------------------------------|
+| `Selector["key"]`                  | Dict key or list index lookup                                           |
+| `Selector[0]`, `Selector[-1]`      | List index                                                              |
+| `Selector[1:3]`                    | Slice                                                                   |
+| `Selector[:]` or `Selector[...]`   | Fan-out — apply the rest of the chain to **every element** at this step |
+| `Selector["a", "b"]`               | Select multiple keys at once; returns a list                            |
+| `Selector.method()`                | Call a method on the current value                                      |
+| `pipe_a + pipe_b`                  | Compose two pipelines into one                                          |
+| `Selector.invoke(*args, **kwargs)` | Call the function if the current value is a function                    |
+### Fan-out `[:]`
+`[:]` maps the remaining steps over every element of the current sequence, so each step after `[:]` runs on one element at a time.
+```python
+data = [{"v": 1}, {"v": 2}, {"v": 3}]
+Selector[:]["v"](data)   # → [1, 2, 3]
+Selector[:][:][0]([[[10, 20]], [[30, 40]]])  # → [[10], [30]]  (nested fan-out)
+```
+### Multi-key input `["a", "b"]`
+Returns a list containing one value per key. All keys must be the same type (all strings or all integers).
+```python
+Selector["x", "y"]({"x": 1, "y": 2, "z": 3})  # → [1, 2]
+```
+### Method calls
+Access an attribute, then call it like a regular Python method.
+```python
+Selector.upper()("hello")               # → "HELLO"
+Selector[:].upper()(["hi", "there"])    # → ["HI", "THERE"]
+```
+### Composition
+Join two pipelines with `+`.
+```python
+head = Selector["data"][:]
+tail = Selector["value"]
+(head + tail)({"data": [{"value": 1}, {"value": 2}]})  # → [1, 2]
+```
+## Including keys in the result
+Pass `include_keys=True` to wrap the result in a dict keyed by the last key in the chain. This works for single-key lookups and multi-key selections.
+```python
+Selector["a"]["b"]({"a": {"b": 12}}, include_keys=True)
+# → {"b": 12}
+Selector[:]["a"]([{"a": 1}, {"a": 2}], include_keys=True)
+# → [{"a": 1}, {"a": 2}]
+Selector[:]["a", "b"]([{"a": 1, "b": 2, "c": 3}, {"a": 4, "c": 6, "b": 5}], include_keys=True)
+# → [{"a": 1, "b": 2}, {"a": 4, "b": 5}]
+```
+Also works on `.apply()`:
+```python
+Selector["x"].apply({"x": 7}, include_keys=True)  # → {"x": 7}
+```
+## Handling missing values
+Pass `include_null=True` to get `None` instead of a `KeyError`/`IndexError` when a key or index doesn't exist. Once a step fails, the rest of the chain is skipped and `None` is returned.
+```python
+Selector["a"]["missing"]({"a": {}}, include_null=True)
+# → None   (instead of KeyError)
+Selector[:]["x"]([{"x": 1}, {"y": 2}, {"x": 3}], include_null=True)
+# → [1, None, 3]
+Selector["a", "b"]({"a": 1}, include_null=True)
+# → [1, None]   (missing keys in multi-select become None individually)
+```
+The two flags can be combined:
+```python
+Selector[:]["x"]([{"x": 1}, {"y": 2}], include_null=True, include_keys=True)
+# → [{"x": 1}, {"x": None}]
+```
+## Calling vs. evaluating
+Normally, calling a selector evaluates it:
+```python
+pipe = Selector["key"]
+pipe({"key": 42})  # → 42
+```
+**Exception 1:** if the last step is an attribute name (e.g. `.upper`), calling it *records* a method call instead of evaluating. Use `.apply(data)` to force evaluation in that case.
+```python
+pipe = Selector["title"].upper()       # records .upper() call
+pipe({"title": "hello"})               # evaluates → "HELLO"
+Selector.upper.apply("hello")          # force evaluation → <method object>
+```
+**Exception 2:** if the last step is a key lookup whose value is a function, a plain call is treated as evaluation rather than as a call to that function. Use `.invoke(*args, **kwargs)` to record the call as a step instead.
+```python
+pipe = Selector["function"]()            # treated as evaluation with no data argument → TypeError
+pipe = Selector["function"].invoke()     # records a call step; the function runs when the pipeline is evaluated
+```
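
The "calling vs. evaluating" rule can be hard to picture from prose alone. The following is a minimal sketch of that heuristic using a hypothetical `MiniSelector` class; it supports only attribute access, calls, and evaluation, and is not the package's implementation.

```python
class MiniSelector:
    """Illustrative stand-in: records getattr/call steps, then evaluates them."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so `steps` and `apply`
        # are unaffected. Dunder names are refused rather than recorded.
        if name.startswith("__"):
            raise AttributeError(name)
        return MiniSelector(self.steps + (("getattr", name),))

    def __call__(self, *args, **kwargs):
        # Last step is an attribute lookup: record a method call.
        if self.steps and self.steps[-1][0] == "getattr":
            return MiniSelector(self.steps + (("call", args, kwargs),))
        # Otherwise the single positional argument is the data to evaluate.
        (data,) = args
        return self.apply(data)

    def apply(self, data):
        for step in self.steps:
            if step[0] == "getattr":
                data = getattr(data, step[1])
            else:  # "call"
                _, call_args, call_kwargs = step
                data = data(*call_args, **call_kwargs)
        return data


pipe = MiniSelector().upper()                 # records .upper() as two steps
print(pipe.steps)                             # (('getattr', 'upper'), ('call', (), {}))
print(pipe("hello"))                          # HELLO
bound = MiniSelector().upper.apply("hello")   # evaluation stops at the bound method
print(bound())                                # HELLO
```

Because the decision depends only on the kind of the last recorded step, a chain ending in an attribute name can never be evaluated by a plain call, which is why an explicit `.apply(data)` exists.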

dictselect-0.1.0/dictselect/__init__.py ADDED

@@ -0,0 +1 @@
+from .dictselect import Selector

dictselect-0.1.0/dictselect/dictselect.py ADDED

@@ -0,0 +1,405 @@
+"""dictselect — a tiny lazy selector for nested Python data structures.
+Build a reusable pipeline of access operations (key lookup, slicing, attribute
+access, method calls) and apply it to any compatible data object.
+Example::
+    from dictselect import Selector
+    # Build a pipeline once.
+    pipe = Selector["annotations"][:]["x_min", "x_max", "y_min", "y_max"]
+    # Apply it to multiple data objects.
+    result = pipe(record)
+    # → [[x_min, x_max, y_min, y_max], ...] for every "annotation" in record
+Supported operations
+--------------------
+* Selector["key"]                   — dict / sequence key lookup
+* Selector[1:3]                     — slice
+* Selector[:] or Selector[...]      — fan-out over a sequence (map)
+* Selector["a", "b"]                — pluck multiple str (or int) keys at once
+* Selector.attr                     — attribute access
+* Selector.method(args)             — record a method call (chained after attr access)
+* pipe_a + pipe_b                   — compose two pipelines
+* pipe(data) or pipe.apply(data)    — evaluate a pipeline against data
+* pipe(data, include_keys=True)     — wrap leaf result(s) with the last key as a dict
+* pipe(data, include_null=True)     — return None for missing keys instead of raising
+Python ≥ 3.9 is required.
+"""
+from __future__ import annotations
+from typing import Any
+__all__ = ["Selector"]
+__version__ = "0.1.0"
+_MISSING = object()  # sentinel for a failed retrieval when include_null=True
+class _SelectorMeta(type):
+    @property
+    def steps(cls):
+        return ()
+    def __getattr__(cls, name: str):
+        if name.startswith("__"):
+            raise AttributeError(name)
+        return cls((("getattr", name),))
+class Selector(metaclass=_SelectorMeta):
+    """An immutable, composable pipeline of data-access operations.
+    A Selector accumulates a sequence of steps (key lookups, slices,
+    attribute accesses, …) and evaluates them lazily when .apply() is
+    called (or equivalently when the instance is called like a function).
+    Build selectors by subscripting the class or an existing instance:
+        pipe = Selector["annotations"][:]["label"]
+        labels = pipe(record)          # evaluate
+        labels = pipe.apply(record)    # same thing
+    Evaluation rule for pipe(data)
+    Calling an instance normally evaluates the pipeline.  The one exception:
+    if the last recorded step is an attribute lookup (e.g. the chain ends with
+    .upper), the call is interpreted as a method-call step and returns a
+    new Selector—mirroring how Python itself handles obj.method(args):
+        text_pipe = Selector["title"].upper()   # records .upper() call
+        result = text_pipe({"title": "hello"})  # → "HELLO"
+    To always force a call step regardless of the preceding step, use
+    .invoke():
+        fn_pipe = Selector["callback"].invoke(42)
+        fn_pipe({"callback": lambda x: x * 2})  # → 84
+    Composition
+    Two selectors can be joined with +; neither operand is mutated:
+        head = Selector["data"][:]
+        tail = Selector["value"]
+        pipe = head + tail           # equivalent to Selector["data"][:]["value"]
+    Immutability
+    Every builder method (__getitem__, __getattr__, __call__, …)
+    returns a new Selector; the original is never modified.  Steps are
+    stored as a plain tuple so they are cheaply shareable.
+    """
+    __slots__ = ("steps",)
+    def __init__(self, steps: tuple = ()):
+        """Initialize a selector with the given steps.
+        Args:
+            steps:
+                Sequence of step tuples that form the pipeline.  Pass nothing (or
+                an empty tuple) to create the identity/root selector.
+        Example:
+            empty = Selector()           # identity — apply returns data unchanged
+            copy  = Selector(other.steps)  # clone
+        """
+        self.steps = tuple(steps)
+    @classmethod
+    def __class_getitem__(cls, key):
+        """Allow Selector[key] as a shorthand for Selector()[key].
+        This lets you start a chain without an explicit Selector() call:
+            pipe = Selector["annotations"][:]["id"]
+        Args:
+            key:
+                The first access step; forwarded to __getitem__.
+        Returns:
+            Selector: A new selector with the first step recorded.
+        """
+        return cls()[key]
+    def __getitem__(self, key):
+        """Record a key-access, slice, fan-out, or multi-key pluck step.
+        Behavior depends on key:
+        * [:] | [...]
+              Fan-out (map): apply all remaining steps to every element of the
+              current sequence and return a list of results.
+        * [a:b] / [a:b:c]
+              Arbitrary slice; applied directly to the current data.
+        * ["a", "b"] | [["a", "b"]]
+              Multi-key pluck: all keys must be the same type (all str or
+              all int).  Returns a list of the selected values.
+        * Any other value
+              Plain key / index lookup (data[key]).
+        Args:
+            key:
+                The subscript expression.
+        Returns:
+            Selector: A new selector with the step appended.
+        Raises:
+            TypeError: If key is a tuple or list with mixed str/int types, or
+                       with fewer than 2 elements.
+        Example:
+            Selector["name"].apply({"name": "Alice"})          # → "Alice"
+            Selector[0].apply([10, 20, 30])                    # → 10
+            Selector[1:3].apply([0, 1, 2, 3])                  # → [1, 2]
+            Selector["x", "y"].apply({"x": 1, "y": 2, "z": 3}) # → [1, 2]
+            Selector[:]["x"].apply([{"x": 1}, {"x": 2}])       # → [1, 2]
+        """
+        if key is Ellipsis or (isinstance(key, slice) and key == slice(None)):
+            return type(self)(self.steps + (("map",),))
+        if isinstance(key, slice):
+            return type(self)(self.steps + (("slice", key),))
+        if isinstance(key, (tuple, list)):
+            if len(key) < 2:
+                raise TypeError(
+                    "Multi-key selection requires at least 2 keys; "
+                    f"got {len(key)}."
+                )
+            if all(isinstance(k, str) for k in key):
+                pass
+            elif all(isinstance(k, int) for k in key):
+                pass
+            else:
+                raise TypeError(
+                    "Multi-key selection requires homogeneous keys "
+                    "(all str or all int); got mixed types."
+                )
+            return type(self)(self.steps + (("multi", tuple(key)),))
+        return type(self)(self.steps + (("getitem", key),))
+    def __getattr__(self, name: str):
+        """Record an attribute-access step.
+        Args:
+            name:
+                The attribute name.
+        Returns:
+            Selector: A new selector with the step appended.
+        Raises:
+            AttributeError: Immediately for any dunder name (__copy__, __reduce__, …)
+                            so that pickle, copy, and introspection tools behave correctly.
+        Example:
+            Selector.upper.apply("hello")          # → <method 'upper'>
+            Selector.upper().apply("hello")        # → "HELLO"
+        """
+        if name.startswith("__"):
+            raise AttributeError(name)
+        return type(self)(self.steps + (("getattr", name),))
+    def __call__(self, *args, **kwargs):
+        """Evaluate the pipeline or record a method-call step.
+        Evaluation
+        When the last step is not an attribute lookup—or when there are no
+        steps—calling the selector evaluates it against the single positional
+        argument:
+            pipe = Selector["key"]
+            pipe({"key": 42})   # → 42
+        Method-call recording (the special case)
+        When the last step *is* an attribute lookup (e.g. the chain ends with
+        .upper), the call mirrors Python's own obj.attr(args) pattern
+        and records a call step instead of evaluating:
+            pipe = Selector.upper()            # records .upper() call
+            pipe("hello")                      # evaluates → "HELLO"
+        To always record a call step (bypassing the heuristic), use
+        .invoke().
+        Args:
+            *args, **kwargs: For evaluation: exactly one positional argument (the data).
+                             For recording: any arguments forwarded to the method call.
+        Returns:
+            Any | Selector: The result of apply(data) when evaluating, or a new
+                            Selector with the call step appended when recording.
+        Raises:
+            TypeError: If the heuristic selects *evaluation* but the wrong number of
+                       arguments are supplied.
+        """
+        if self.steps and self.steps[-1][0] == "getattr":
+            return type(self)(self.steps + (("call", args, kwargs),))
+        extra = set(kwargs) - {"include_keys", "include_null"}
+        if len(args) != 1 or extra:
+            raise TypeError(
+                "Selector evaluation expects exactly one positional argument "
+                "(the data) and optional include_keys / include_null keywords. "
+                "To record a method call, access the method via attribute first "
+                "(e.g. pipe.method(args)), or use "
+                "pipe.invoke(args) to record a call step explicitly."
+            )
+        return self.apply(args[0], **kwargs)
+    def invoke(self, *args, **kwargs):
+        """Record a call step unconditionally, regardless of the previous step.
+        Use this when you need to invoke a callable obtained via key lookup
+        (not attribute access), where the default __call__ heuristic would
+        interpret the call as an evaluation instead:
+            pipe = Selector["handler"].invoke(event)
+            pipe({"handler": lambda e: e.upper()})  # → "EVENT" (if event="event")
+        Args:
+            *args, **kwargs: Arguments that will be forwarded to the callable at evaluation time.
+        Returns:
+            Selector: A new selector with the call step appended.
+        """
+        return type(self)(self.steps + (("call", args, kwargs),))
+    def __add__(self, other):
+        """Return a new selector that concatenates the steps of both operands.
+        Neither self nor other is mutated:
+            head = Selector["data"][:]
+            tail = Selector["value"]
+            pipe = head + tail      # Selector["data"][:]["value"]
+        Args:
+            other: Another Selector instance.
+        Returns:
+            Selector: A fresh selector whose steps are ``self.steps + other.steps``.
+        Raises:
+            TypeError: If other is not a Selector instance (via NotImplemented
+                       so Python can try other.__radd__).
+        """
+        if other is type(self):
+            return type(self)(self.steps)
+        if not isinstance(other, type(self)):
+            return NotImplemented
+        return type(self)(self.steps + other.steps)
+    def __repr__(self):
+        """Return a developer-friendly representation listing all steps.
+        Example:
+            repr(Selector["a"][:])  # → "Selector([('getitem', 'a'), ('map',)])"
+        """
+        return f"Selector({list(self.steps)!r})"
+    def apply(self, data: Any, include_keys: bool = False, include_null: bool = False) -> Any:
+        """Evaluate the recorded pipeline against data.
+        Steps are executed in order:
+        * getitem — data[key]
+        * slice   — data[slice]
+        * multi   — [data[k] for k in keys]
+        * map     — apply remaining steps to every element; returns a list
+        * getattr — getattr(data, name)
+        * call    — data(*args, **kwargs)
+        A selector with no steps is the identity: it returns *data* unchanged.
+        Args:
+            data: The root data object to query.
+            include_keys: If True, wrap the leaf result with the last key-bearing
+                step's key(s) as a dict.  Only ``getitem`` and ``multi`` steps
+                qualify; all other terminal steps leave the result unchanged.
+                Selector["b"].apply({"b": 7}, include_keys=True)          # → {"b": 7}
+                Selector["a","b"].apply({"a":1,"b":2}, include_keys=True)  # → {"a":1,"b":2}
+                With a fan-out (``[:]``), the wrapping happens per element:
+                Selector[:]["v"].apply([{"v":1},{"v":2}], include_keys=True)
+                # → [{"v": 1}, {"v": 2}]
+            include_null: If True, return None (instead of raising) when a key,
+                index, or attribute is missing.  The None propagates through the
+                rest of the chain so subsequent steps are skipped.
+                For multi select, each missing key becomes None individually:
+                Selector["a","b"].apply({"a": 1}, include_null=True)  # → [1, None]
+                With a fan-out, missing elements become None per item:
+                Selector[:]["x"].apply([{"x":1},{"y":2}], include_null=True)
+                # → [1, None]
+        Returns:
+            Any: The result after all steps have been applied.
+        Example:
+            Selector["a"]["b"].apply({"a": {"b": 7}})  # → 7
+            Selector[:]["v"].apply([{"v": 1}, {"v": 2}])  # → [1, 2]
+        """
+        for i, step in enumerate(self.steps):
+            kind = step[0]
+            # Short-circuit: a previous step already failed (checked before fan-out)
+            if include_null and data is _MISSING:
+                continue
+            if kind == "map":
+                rest_steps = self.steps[i + 1:]
+                if not rest_steps:
+                    return list(data)
+                rest = type(self)(rest_steps)
+                return [
+                    rest.apply(x, include_keys=include_keys, include_null=include_null)
+                    for x in data
+                ]
+            try:
+                if kind in ("getitem", "slice"):
+                    data = data[step[1]]
+                elif kind == "multi":
+                    if include_null:
+                        row = []
+                        for k in step[1]:
+                            try:
+                                row.append(data[k])
+                            except (KeyError, IndexError, TypeError):
+                                row.append(None)
+                        data = row
+                    else:
+                        data = [data[k] for k in step[1]]
+                elif kind == "getattr":
+                    data = getattr(data, step[1])
+                elif kind == "call":
+                    _, call_args, call_kwargs = step
+                    data = data(*call_args, **call_kwargs)
+            except (KeyError, IndexError, AttributeError, TypeError):
+                if include_null:
+                    data = _MISSING
+                else:
+                    raise
+        if include_null and data is _MISSING:
+            data = None
+        if include_keys and self.steps:
+            last = self.steps[-1]
+            if last[0] == "getitem":
+                return {last[1]: data}
+            if last[0] == "multi":
+                return dict(zip(last[1], data))
+        return data
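
To make the evaluation loop easier to follow in isolation, here is a standalone sketch of a step interpreter over the same kind of step tuples. `run_steps` and `_pluck` are hypothetical helpers written for this illustration, and the sketch omits slices and `include_keys`. In this sketch the missing-value check runs before the fan-out branch, so a fan-out that follows a failed lookup yields `None`.

```python
_MISSING = object()  # sentinel: an earlier step failed under include_null


def _pluck(data, key, include_null):
    """Look up one key; with include_null, a missing key becomes None."""
    try:
        return data[key]
    except (KeyError, IndexError, TypeError):
        if include_null:
            return None
        raise


def run_steps(steps, data, include_null=False):
    """Evaluate a tuple of step tuples against data (illustrative only)."""
    for i, step in enumerate(steps):
        if include_null and data is _MISSING:
            break  # nothing left to look up in
        kind = step[0]
        if kind == "map":
            # Fan-out: run the remaining steps on every element.
            rest = steps[i + 1:]
            return [run_steps(rest, item, include_null) for item in data]
        try:
            if kind == "getitem":
                data = data[step[1]]
            elif kind == "multi":
                data = [_pluck(data, k, include_null) for k in step[1]]
            elif kind == "getattr":
                data = getattr(data, step[1])
            elif kind == "call":
                data = data(*step[1], **step[2])
        except (KeyError, IndexError, AttributeError, TypeError):
            if not include_null:
                raise
            data = _MISSING
    return None if data is _MISSING else data


record = {"annotations": [{"x_min": 10, "x_max": 20}, {"x_min": 30, "x_max": 50}]}
steps = (("getitem", "annotations"), ("map",), ("multi", ("x_min", "x_max")))
print(run_steps(steps, record))  # [[10, 20], [30, 50]]
```

The fan-out branch returns immediately because everything after it is delegated to the recursive calls, one per element.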

dictselect-0.1.0/dictselect/tests/__init__.py ADDED

File without changes

dictselect-0.1.0/dictselect/tests/test_dictselect.py ADDED

@@ -0,0 +1,347 @@
+"""Tests for dictselect.Selector."""
+import pickle
+import pytest
+from dictselect import Selector
+RECORD = {
+    "annotations": [
+        {"id": 0, "x_min": 1, "y_min": 1, "x_max": 2, "y_max": 2, "label": "car"},
+        {"id": 0, "x_min": 1, "y_min": 3, "x_max": 2, "y_max": 5, "label": "car"},
+        {"id": 0, "x_min": 1, "y_min": 1, "x_max": 3, "y_max": 2, "label": "other"},
+        {"id": 0, "x_min": 1, "y_min": 1, "x_max": 2, "y_max": 3, "label": "truck"},
+    ],
+    "image_id": "xa001",
+}
+class TestGetitem:
+    def test_dict_str_key(self):
+        assert Selector["image_id"].apply(RECORD) == "xa001"
+    def test_list_int_index(self):
+        assert Selector[0].apply([10, 20, 30]) == 10
+    def test_negative_index(self):
+        assert Selector[-1].apply([10, 20, 30]) == 30
+    def test_chained_dict_access(self):
+        data = {"a": {"b": {"c": 42}}}
+        assert Selector["a"]["b"]["c"].apply(data) == 42
+    def test_chained_list_and_dict(self):
+        data = [{"x": 7}]
+        assert Selector[0]["x"].apply(data) == 7
+    def test_class_subscript_entry_point(self):
+        assert Selector["image_id"].apply(RECORD) == Selector()["image_id"].apply(RECORD)
+    def test_eval_via_call(self):
+        assert Selector["image_id"](RECORD) == "xa001"
+    def test_eval_and_apply_agree(self):
+        pipe = Selector["image_id"]
+        assert pipe(RECORD) == pipe.apply(RECORD)
+class TestSlice:
+    def test_range_slice(self):
+        assert Selector[1:3].apply([0, 1, 2, 3, 4]) == [1, 2]
+    def test_head_slice(self):
+        assert Selector[:2].apply([10, 20, 30]) == [10, 20]
+    def test_tail_slice(self):
+        assert Selector[-2:].apply([10, 20, 30]) == [20, 30]
+    def test_step_slice(self):
+        assert Selector[::2].apply([0, 1, 2, 3, 4]) == [0, 2, 4]
+    def test_partial_slice_on_string(self):
+        assert Selector[1:4].apply("hello") == "ell"
+class TestMap:
+    def test_full_slice_fans_out(self):
+        """Selector[:] on a list returns a copy of that list (identity fan-out)."""
+        assert Selector[:].apply([1, 2, 3]) == [1, 2, 3]
+    def test_ellipsis_fans_out(self):
+        assert Selector[...].apply([1, 2, 3]) == [1, 2, 3]
+    def test_colon_and_ellipsis_equivalent(self):
+        data = [{"x": 1}, {"x": 2}]
+        assert Selector[:]["x"].apply(data) == Selector[...]["x"].apply(data)
+    def test_map_then_getitem(self):
+        data = [{"x": 1}, {"x": 2}, {"x": 3}]
+        assert Selector[:]["x"].apply(data) == [1, 2, 3]
+    def test_map_empty_sequence(self):
+        assert Selector[:]["x"].apply([]) == []
+    # Regression – bug #3: duplicate map steps used to pick the wrong remainder
+    def test_nested_map_no_index_collision(self):
+        data = [[1, 2], [3, 4]]
+        assert Selector[:][:].apply(data) == [[1, 2], [3, 4]]
+    def test_triple_nested_map(self):
+        data = [[[1, 2], [3]], [[4]]]
+        assert Selector[:][:][:].apply(data) == [[[1, 2], [3]], [[4]]]
+    def test_map_then_attr_and_call(self):
+        assert Selector[:].upper().apply(["hello", "world"]) == ["HELLO", "WORLD"]
+class TestMultiKey:
+    # Regression – bug #2: original code only checked for list, not tuple
+    def test_tuple_syntax_str_keys(self):
+        data = {"a": 1, "b": 2, "c": 3}
+        assert Selector["a", "b"].apply(data) == [1, 2]
+    def test_list_syntax_str_keys(self):
+        data = {"a": 1, "b": 2, "c": 3}
+        assert Selector[["a", "b"]].apply(data) == [1, 2]
+    def test_tuple_syntax_int_keys(self):
+        assert Selector[0, 2].apply([10, 20, 30, 40]) == [10, 30]
+    def test_list_syntax_int_keys(self):
+        assert Selector[[0, 2]].apply([10, 20, 30, 40]) == [10, 30]
+    def test_mixed_key_types_raise(self):
+        with pytest.raises(TypeError, match="homogeneous"):
+            Selector["a", 1]
+    def test_single_element_multi_key_raises(self):
+        with pytest.raises(TypeError, match="at least 2"):
+            Selector[["only_one"]]
+    def test_map_then_multi_key_str(self):
+        """Primary use-case: pluck multiple fields from each element."""
+        result = Selector["annotations"][:]["x_min", "x_max", "y_min", "y_max"].apply(RECORD)
+        assert result == [
+            [1, 2, 1, 2],
+            [1, 2, 3, 5],
+            [1, 3, 1, 2],
+            [1, 2, 1, 3],
+        ]
+class TestGetattr:
+    def test_simple_attribute(self):
+        class Obj:
+            value = 99
+        assert Selector.value.apply(Obj()) == 99
+    def test_chained_attributes(self):
+        class Inner:
+            x = 5
+        class Outer:
+            inner = Inner()
+        assert Selector.inner.x.apply(Outer()) == 5
+    def test_dunder_attr_raises_attribute_error(self):
+        with pytest.raises(AttributeError):
+            _ = Selector["x"].__copy__
+    def test_hasattr_dunder_is_false(self):
+        assert not hasattr(Selector["x"], "__copy__")
+class TestMethodChain:
+    def test_call_after_getattr_records_step(self):
+        """Calling a selector whose last step is getattr records a call step."""
+        pipe = Selector.upper()
+        assert pipe.steps == (("getattr", "upper"), ("call", (), {}))
+    def test_str_upper_method_chain(self):
+        assert Selector.upper().apply("hello") == "HELLO"
+    def test_method_with_args(self):
+        assert Selector.replace("l", "y").apply("hello") == "heyyo"
+    def test_method_with_positional_args(self):
+        # str.center takes fillchar as positional-only in CPython
+        result = Selector.center(7, "-").apply("hi")
+        assert len(result) == 7 and result.strip("-") == "hi"
+    def test_chained_method_after_getitem(self):
+        assert Selector["title"].upper().apply({"title": "hello"}) == "HELLO"
+    def test_annotation_conjugate_pipeline(self):
+        pipe = Selector["annotations"][:]["id"].conjugate()
+        assert pipe.apply(RECORD) == [0, 0, 0, 0]
+class TestInvoke:
+    def test_invoke_after_getitem_records_call(self):
+        """invoke() always records, even when the last step is not getattr."""
+        pipe = Selector["fn"].invoke(1, 2)
+        assert pipe.steps[-1] == ("call", (1, 2), {})
+    def test_invoke_calls_function_in_data(self):
+        data = {"fn": lambda x, y: x + y}
+        assert Selector["fn"].invoke(3, 4).apply(data) == 7
+    def test_invoke_with_kwargs(self):
+        def greet(name, greeting="Hello"):
+            return f"{greeting}, {name}!"
+        data = {"fn": greet}
+        assert Selector["fn"].invoke("Alice", greeting="Hi").apply(data) == "Hi, Alice!"
+    def test_invoke_after_getattr_also_records(self):
+        """invoke() always records regardless of context."""
+        pipe = Selector.upper.invoke()
+        assert pipe.steps[-1] == ("call", (), {})
+class TestCompose:
+    def test_add_composes_two_selectors(self):
+        head = Selector["a"]
+        tail = Selector["b"]
+        assert (head + tail).apply({"a": {"b": 42}}) == 42
+    def test_add_does_not_mutate_left_operand(self):
+        a = Selector["a"]
+        b = Selector["b"]
+        steps_before = a.steps
+        _ = a + b
+        assert a.steps == steps_before
+    def test_add_does_not_mutate_right_operand(self):
+        b = Selector["b"]
+        a = Selector["a"]
+        steps_before = b.steps
+        _ = a + b
+        assert b.steps == steps_before
+    def test_add_three_selectors(self):
+        pipe = Selector["a"] + Selector["b"] + Selector["c"]
+        assert pipe.apply({"a": {"b": {"c": 99}}}) == 99
+    def test_add_non_selector_raises_type_error(self):
+        with pytest.raises(TypeError):
+            _ = Selector["x"] + 1
+    def test_add_with_empty_selector(self):
+        pipe = Selector["x"] + Selector
+        assert pipe.apply({"x": 7}) == 7
+class TestMisc:
+    def test_empty_selector_is_identity(self):
+        data = {"x": 42}
+        # Here it does make a difference when using the uninitialized Selector as an empty selector, so instantiate it explicitly.
+        assert Selector().apply(data) is data
+    def test_repr_contains_class_name(self):
+        assert repr(Selector["a"][:]).startswith("Selector(")
+    def test_repr_contains_step_kinds(self):
+        r = repr(Selector["a"][:])
+        assert "getitem" in r
+        assert "map" in r
+    def test_pickle_roundtrip(self):
+        pipe = Selector["annotations"][:]
+        restored = pickle.loads(pickle.dumps(pipe))
+        assert restored.steps == pipe.steps
+    def test_pickle_empty_selector(self):
+        restored = pickle.loads(pickle.dumps(Selector))
+        assert restored.steps == ()
+    def test_wrong_arity_on_evaluation_raises(self):
+        with pytest.raises(TypeError):
+            Selector["x"](1, 2)  # two args; last step is getitem → tries to evaluate
+    def test_call_with_kwargs_on_non_getattr_raises(self):
+        with pytest.raises(TypeError):
+            Selector["x"](data={"x": 1})  # kwargs not allowed for evaluation
+class TestIncludeKeys:
+    def test_single_getitem(self):
+        assert Selector["a"]["b"]({"a": {"b": 12}}, include_keys=True) == {"b": 12}
+    def test_map_then_getitem(self):
+        result = Selector[:]["a"]([{"a": 1}, {"a": 2}], include_keys=True)
+        assert result == [{"a": 1}, {"a": 2}]
+    def test_map_then_multi(self):
+        data = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "c": 6, "b": 5}]
+        result = Selector[:]["a", "b"](data, include_keys=True)
+        assert result == [{"a": 1, "b": 2}, {"a": 4, "b": 5}]
+    def test_int_key_wrapped(self):
+        assert Selector[0]([10, 20, 30], include_keys=True) == {0: 10}
+    def test_flag_false_is_unchanged(self):
+        assert Selector["x"]({"x": 7}, include_keys=False) == 7
+    def test_apply_path(self):
+        assert Selector["x"].apply({"x": 7}, include_keys=True) == {"x": 7}
+    def test_slice_terminal_ignored(self):
+        assert Selector[1:3]([0, 1, 2, 3], include_keys=True) == [1, 2]
+    def test_call_terminal_ignored(self):
+        assert Selector["x"].upper().apply({"x": "hi"}, include_keys=True) == "HI"
+    def test_map_terminal_ignored(self):
+        assert Selector["a"][:]({"a": [1, 2, 3]}, include_keys=True) == [1, 2, 3]
+    def test_empty_selector_ignored(self):
+        data = {"x": 1}
+        assert Selector().apply(data, include_keys=True) is data
+class TestIncludeNull:
+    def test_missing_key_returns_none(self):
+        assert Selector["a"]["missing"]({"a": {}}, include_null=True) is None
+    def test_missing_top_level_key(self):
+        assert Selector["nope"]({"a": 1}, include_null=True) is None
+    def test_present_key_unaffected(self):
+        assert Selector["a"]({"a": 42}, include_null=True) == 42
+    def test_missing_key_propagates_through_chain(self):
+        # Once a step returns None, subsequent steps are skipped
+        assert Selector["x"]["y"]["z"]({"x": {}}, include_null=True) is None
+    def test_missing_index_returns_none(self):
+        assert Selector[5]([1, 2, 3], include_null=True) is None
+    def test_multi_partial_missing(self):
+        result = Selector["a", "b"]({"a": 1}, include_null=True)
+        assert result == [1, None]
+    def test_multi_all_missing(self):
+        result = Selector["a", "b"]({}, include_null=True)
+        assert result == [None, None]
+    def test_map_with_partial_missing(self):
+        data = [{"x": 1}, {"y": 2}, {"x": 3}]
+        assert Selector[:]["x"](data, include_null=True) == [1, None, 3]
+    def test_apply_path(self):
+        assert Selector["missing"].apply({}, include_null=True) is None
+    def test_flag_false_raises(self):
+        with pytest.raises(KeyError):
+            Selector["missing"]({"a": 1}, include_null=False)
+    def test_combined_include_keys_and_null(self):
+        result = Selector["a"]["missing"]({"a": {}}, include_null=True, include_keys=True)
+        assert result == {"missing": None}
+    def test_combined_map_keys_and_null(self):
+        data = [{"x": 1}, {"y": 2}]
+        result = Selector[:]["x"](data, include_null=True, include_keys=True)
+        assert result == [{"x": 1}, {"x": None}]
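
The `include_null` tests above describe a short-circuit rule: the first missing key or index yields `None`, and the remaining steps are skipped, per element when a fan-out is involved. A minimal plain-Python sketch of that rule follows. The `select` function and the `MAP` sentinel are hypothetical stand-ins for illustration and are not taken from the dictselect source.

```python
MAP = object()  # hypothetical sentinel standing in for the [:] fan-out step


def select(data, steps, include_null=False):
    """Walk `steps` over `data`; a sketch of the tested semantics only."""
    for i, step in enumerate(steps):
        if step is MAP:
            # Fan-out: run the remaining steps on every element separately.
            rest = steps[i + 1:]
            return [select(item, rest, include_null) for item in data]
        try:
            data = data[step]
        except (KeyError, IndexError):
            if not include_null:
                raise
            return None  # first miss wins; later steps are skipped
    return data


chain_result = select({"x": {}}, ["x", "y", "z"], include_null=True)
mapped_result = select([{"x": 1}, {"y": 2}, {"x": 3}], [MAP, "x"], include_null=True)
```

Here `chain_result` is `None` and `mapped_result` is `[1, None, 3]`, mirroring `test_missing_key_propagates_through_chain` and `test_map_with_partial_missing`. With the flag left at `False`, the original `KeyError` or `IndexError` propagates unchanged.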

dictselect-0.1.0/dictselect.egg-info/PKG-INFO ADDED

@@ -0,0 +1,181 @@
+Metadata-Version: 2.1
+Name: dictselect
+Version: 0.1.0
+Summary: A lazy selector for nested Python data structures.
+Home-page: UNKNOWN
+Author: alphacena
+Author-email: lukas.makswitis@gmail.com
+License: MIT
+Platform: UNKNOWN
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+# dictselect
+A Python library for extracting data from nested dicts and lists using reusable pipelines.
+```python
+from dictselect import Selector
+pipe = Selector["annotations"][:]["x_min", "x_max"]
+my_data_selection = pipe(data_dict)
+```
+## Installation
+```bash
+pip install dictselect
+```
+Requires Python ≥ 3.9.
+## How it works
+Build a `Selector` by chaining operations, then call it with your data. The pipeline is built to be reusable.
+```python
+from dictselect import Selector
+data_dict = {
+    "image_id": "xa001",
+    "annotations": [
+        {"id": 1, "x_min": 10, "x_max": 20, "label": "cat"},
+        {"id": 2, "x_min": 30, "x_max": 50, "label": "dog"},
+    ],
+}
+Selector["image_id"](data_dict)                          # → "xa001"
+Selector["annotations"][0]["label"](data_dict)           # → "cat"
+Selector["annotations"][:]["label"](data_dict)           # → ["cat", "dog"]
+Selector["annotations"][:]["x_min", "x_max"](data_dict)  # → [[10, 20], [30, 50]]
+```
+## Operations
+| Syntax                             | What it does                                                            |
+|------------------------------------|-------------------------------------------------------------------------|
+| `Selector["key"]`                  | Dict key or list index lookup                                           |
+| `Selector[0]`, `Selector[-1]`      | List index                                                              |
+| `Selector[1:3]`                    | Slice                                                                   |
+| `Selector[:]` or `Selector[...]`   | Fan-out — apply the rest of the chain to **every element** at this step |
+| `Selector["a", "b"]`               | Select multiple keys at once; returns a list of values                  |
+| `Selector.method()`                | Call a method on the current value                                      |
+| `pipe_a + pipe_b`                  | Compose two pipelines into one                                          |
+| `Selector.invoke(*args, **kwargs)` | Call the function if the current value is a function                    |
+### Fan-out `[:]`
+`[:]` maps the remaining steps over every item of the current value, so each step after `[:]` runs on the elements individually.
+```python
+data = [{"v": 1}, {"v": 2}, {"v": 3}]
+Selector[:]["v"](data)   # → [1, 2, 3]
+Selector[:][:][0]([[[10, 20]], [[30, 40]]])  # → [[10], [30]]  (nested fan-out)
+```
+### Multi-key input `["a", "b"]`
+Returns a list with one value per key, in the order the keys were given. All keys must be the same type (all strings or all integers).
+```python
+Selector["x", "y"]({"x": 1, "y": 2, "z": 3})  # → [1, 2]
+```
+### Method calls
+Access an attribute, then call it like a regular Python method.
+```python
+Selector.upper()("hello")               # → "HELLO"
+Selector[:].upper()(["hi", "there"])    # → ["HI", "THERE"]
+```
+### Composition
+Join two pipelines with `+`.
+```python
+head = Selector["data"][:]
+tail = Selector["value"]
+(head + tail)({"data": [{"value": 1}, {"value": 2}]})  # → [1, 2]
+```
+## Including keys in the result
+Pass `include_keys=True` to wrap the result in a dict keyed by the last key. Works for single-key lookups and multi-key inputs.
+```python
+Selector["a"]["b"]({"a": {"b": 12}}, include_keys=True)
+# → {"b": 12}
+Selector[:]["a"]([{"a": 1}, {"a": 2}], include_keys=True)
+# → [{"a": 1}, {"a": 2}]
+Selector[:]["a", "b"]([{"a": 1, "b": 2, "c": 3}, {"a": 4, "c": 6, "b": 5}], include_keys=True)
+# → [{"a": 1, "b": 2}, {"a": 4, "b": 5}]
+```
+Also works on `.apply()`:
+```python
+Selector["x"].apply({"x": 7}, include_keys=True)  # → {"x": 7}
+```
+## Handling missing values
+Pass `include_null=True` to get `None` instead of a `KeyError`/`IndexError` when a key or index doesn't exist. Once a step fails, the rest of the chain is skipped and `None` is returned.
+```python
+Selector["a"]["missing"]({"a": {}}, include_null=True)
+# → None   (instead of KeyError)
+Selector[:]["x"]([{"x": 1}, {"y": 2}, {"x": 3}], include_null=True)
+# → [1, None, 3]
+Selector["a", "b"]({"a": 1}, include_null=True)
+# → [1, None]   (missing keys in multi-select become None individually)
+```
+The two flags can be combined:
+```python
+Selector[:]["x"]([{"x": 1}, {"y": 2}], include_null=True, include_keys=True)
+# → [{"x": 1}, {"x": None}]
+```
+## Calling vs. evaluating
+Normally, calling a selector evaluates it:
+```python
+pipe = Selector["key"]
+pipe({"key": 42})  # → 42
+```
+**Exception 1:** if the last step is an attribute name (e.g. `.upper`), calling it *records* a method call instead of evaluating. Use `.apply(data)` to force evaluation in that case.
+```python
+pipe = Selector["title"].upper()       # records .upper() call
+pipe({"title": "hello"})               # evaluates → "HELLO"
+Selector.upper.apply("hello")          # force evaluation → <method object>
+```
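+**Exception 2:** if the current value is itself a function (so the last step is not an attribute access), calling the selector evaluates it instead of recording a call. Use `.invoke(*args, **kwargs)` to record the call explicitly.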
+```python
+pipe = Selector["function"]()            # last step is a getitem, so () tries to evaluate the selector → error
+pipe = Selector["function"].invoke()     # records a call to the stored function; nothing is evaluated yet
+```
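
The record-versus-evaluate rule described in "Calling vs. evaluating" can be sketched with a small stand-in class. `MiniSelector` below is hypothetical and written only to illustrate the dispatch in `__call__`; it is not the dictselect implementation and omits slices, fan-out, multi-key lookups, and the `include_*` flags.

```python
class MiniSelector:
    """Hypothetical stand-in for Selector; illustrates call dispatch only."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def __getitem__(self, key):
        return MiniSelector(self.steps + (("getitem", key),))

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Dunder names are refused so
        # that probes such as hasattr(sel, "__copy__") are not recorded.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        return MiniSelector(self.steps + (("getattr", name),))

    def invoke(self, *args, **kwargs):
        # Always records a call step, whatever the previous step was.
        return MiniSelector(self.steps + (("call", args, kwargs),))

    def __call__(self, *args, **kwargs):
        # After an attribute access, () means "call that attribute later".
        if self.steps and self.steps[-1][0] == "getattr":
            return self.invoke(*args, **kwargs)
        # Otherwise () means "evaluate now" and needs exactly one argument.
        if kwargs or len(args) != 1:
            raise TypeError("evaluation takes exactly one positional argument")
        return self.apply(args[0])

    def apply(self, data):
        value = data
        for step in self.steps:
            if step[0] == "getitem":
                value = value[step[1]]
            elif step[0] == "getattr":
                value = getattr(value, step[1])
            else:  # "call"
                value = value(*step[1], **step[2])
        return value


S = MiniSelector()
pipe = S["title"].upper()                  # () after .upper records a call
title = pipe({"title": "hello"})           # last step is a call, so this evaluates
total = S["fn"].invoke(3, 4)({"fn": lambda x, y: x + y})
```

In this sketch `title` is `"HELLO"` and `total` is `7`. Writing `S["fn"](3, 4)` instead would raise `TypeError`, because the last step is a `getitem` and the call is treated as an evaluation with the wrong number of arguments.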

dictselect-0.1.0/dictselect.egg-info/SOURCES.txt ADDED

@@ -0,0 +1,10 @@
+README.md
+setup.py
+dictselect/__init__.py
+dictselect/dictselect.py
+dictselect.egg-info/PKG-INFO
+dictselect.egg-info/SOURCES.txt
+dictselect.egg-info/dependency_links.txt
+dictselect.egg-info/top_level.txt
+dictselect/tests/__init__.py
+dictselect/tests/test_dictselect.py

dictselect-0.1.0/dictselect.egg-info/dependency_links.txt ADDED

@@ -0,0 +1 @@
+

dictselect-0.1.0/dictselect.egg-info/top_level.txt ADDED

@@ -0,0 +1 @@
+dictselect

dictselect-0.1.0/setup.cfg ADDED

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

dictselect-0.1.0/setup.py ADDED

@@ -0,0 +1,25 @@
+from setuptools import setup, find_packages
+setup(
+    name="dictselect",
+    version="0.1.0",
+    description="A lazy selector for nested Python data structures.",
+    long_description=open("README.md", encoding="utf-8").read(),  # README contains non-ASCII characters (→, ≥)
+    long_description_content_type="text/markdown",
+    author="alphacena",
+    author_email="lukas.makswitis@gmail.com",
+    license="MIT",
+    python_requires=">=3.9",
+    packages=find_packages(exclude=["tests*"]),
+    classifiers=[
+        "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
+        "Programming Language :: Python :: 3.12",
+        "License :: OSI Approved :: MIT License",
+        "Operating System :: OS Independent",
+        "Intended Audience :: Developers",
+        "Topic :: Software Development :: Libraries :: Python Modules",
+    ],
+)
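
A note on `exclude=["tests*"]`: the tests in this release live in `dictselect/tests`, and `SOURCES.txt` lists them. The sketch below assumes setuptools compares each `exclude` pattern against the full dotted package name using shell-style matching, as `fnmatch.fnmatchcase` does. Under that assumption, `"tests*"` only matches a top-level `tests` package. The `is_excluded` helper is hypothetical and uses only the standard library.

```python
from fnmatch import fnmatchcase


def is_excluded(package_name, patterns):
    """Return True if any shell-style pattern matches the dotted name."""
    return any(fnmatchcase(package_name, pattern) for pattern in patterns)


# Dotted package names as they would be discovered in this source tree.
packages = ["dictselect", "dictselect.tests"]

kept_original = [p for p in packages if not is_excluded(p, ["tests*"])]
kept_nested = [p for p in packages if not is_excluded(p, ["tests*", "*.tests*"])]
```

With these inputs `kept_original` still contains `dictselect.tests`, while `kept_nested` contains only `dictselect`. If the intent was to leave the tests out of the installed package, a pattern such as `"*.tests*"` would be needed in addition.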