PyPI - leancontext - Versions diffs - 2.0.4__tar.gz → 2.0.6__tar.gz - Mend

leancontext 2.0.4 (tar.gz) → 2.0.6 (tar.gz)

Files changed (75)

{leancontext-2.0.4 → leancontext-2.0.6}/.github/workflows/publish.yml RENAMED

@@ -26,3 +26,5 @@ jobs:
           python -m build
       - name: Publish to PyPI
         uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          skip-existing: true   # don't fail if a version was already uploaded

{leancontext-2.0.4 → leancontext-2.0.6}/CHANGELOG.md RENAMED

@@ -5,13 +5,43 @@ All notable changes to this project are documented here. The format is based on
 ## [Unreleased]
+## [2.0.6] - 2026-06-21
+### Fixed
+- JSON reducer is now lossless on every value: rows are emitted as JSON arrays with
+  the field names factored into the header once, so values containing the column
+  delimiter, quotes, or newlines no longer corrupt the columnar layout. The JSON
+  fidelity check matches values in their encoded form, so it sees such corruption.
+- Gateway paths (LiteLLM proxy + SDK patch) now reduce OpenAI Responses requests
+  (`input=`), not just chat (`messages=`).
+- `reduce_messages` dispatches per item, so a list mixing message formats reduces
+  every tool output instead of only those matching the first format detected.
+- OpenAI Responses tool outputs shaped as a list of content parts are now reduced.
+- `__version__` is read from the installed package metadata (was a stale `0.0.1`).
+- `CostTracker` running totals are guarded by a lock for multi-threaded agents.
+### Docs
+- README install commands use the published package (`pip install leancontext`),
+  document the `mcp` extra, note which tokenizer the benchmark uses, and state
+  which integrations are CI-verified vs best-effort.
+## [2.0.5] - 2026-06-21
+### Security
+- Fix a path traversal in the disk-backed paging store: `expand()` and `ContentStore.get()`
+  now accept only content-hash ids, so a crafted reference can no longer read files outside
+  the store (reachable via the MCP `expand` tool). The default in-memory store was unaffected.
 ## [2.0.4] - 2026-06-21
 ### Fixed
 - README uses absolute image and link URLs so the logo and links render on the PyPI
   project page (relative paths only resolve on GitHub).
+- The reduction cache is now thread-safe (guarded by a lock) for multi-threaded agents.
 ### Added
+- OpenAI Responses API support: `reduce_messages` and `wrap_openai` handle `input`
+  with `function_call_output` items.
 - PyPI downloads badge, `SUPPORT.md`, and a CodeQL security-scanning workflow.
 ## [2.0.2] - 2026-06-21
@@ -20,6 +50,11 @@ All notable changes to this project are documented here. The format is based on
 - Lower the minimum Python from 3.14 to 3.10 so the package installs on current
   interpreters (the code already supports 3.10+; CI runs 3.10 through 3.14).
+## [2.0.1] - 2026-06-21
+Intermediate release during the initial PyPI rollout (Python version metadata),
+superseded by 2.0.2. Version 2.0.3 was never published.
 ## [2.0.0] - 2026-06-21
 ### Added
@@ -43,7 +78,10 @@ All notable changes to this project are documented here. The format is based on
 - Targets Python 3.14; ruff, mypy, and coverage run in CI; examples, contributor, and
   security docs included.
-[Unreleased]: https://github.com/pankajniet/LeanContext/compare/v2.0.4...HEAD
+[Unreleased]: https://github.com/pankajniet/LeanContext/compare/v2.0.6...HEAD
+[2.0.6]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.6
+[2.0.5]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.5
 [2.0.4]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.4
 [2.0.2]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.2
+[2.0.1]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.1
 [2.0.0]: https://github.com/pankajniet/LeanContext/releases/tag/v2.0.0

{leancontext-2.0.4 → leancontext-2.0.6}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: leancontext
-Version: 2.0.4
+Version: 2.0.6
 Summary: Deterministic, type-aware reduction of agent tool outputs at the source. Cut LLM token cost without making the agent do less.
 Project-URL: Homepage, https://github.com/pankajniet/LeanContext
 Project-URL: Repository, https://github.com/pankajniet/LeanContext
@@ -84,14 +84,18 @@ $ python bench.py
 sample              kind          before   after  saved  fidelity
 -----------------------------------------------------------------
 log (incident)      log            52642     100   100%      100%
-json (RAG chunks)   json            1862    1390    25%      100%
+json (RAG chunks)   json            1862    1391    25%      100%
 html (web fetch)    html            1672    1093    35%      100%
 diff (patch)        diff             639      81    87%      100%
 stacktrace          stacktrace       896      94    90%      100%
 -----------------------------------------------------------------
-TOTAL                              57711    2758    95%
+TOTAL                              57711    2759    95%
 ```
+Counts above use the built-in heuristic tokenizer (≈4 chars/token). Install the
+`tiktoken` extra for exact model token counts — the ratios are similar (~92% on
+this sample). The reduced text is identical either way.
 A real incident log, before and after:
 ```text
@@ -128,10 +132,11 @@ errors, anomalies, and identifiers, and collapses the rest.
 ## Install
 ```bash
-pip install -e .                  # core, standard library only
-pip install -e ".[integrations]"  # openai, anthropic, litellm, fastapi adapters
-pip install -e ".[otel]"          # OpenTelemetry metrics
-pip install -e ".[tiktoken]"      # exact token counts (used automatically when present)
+pip install leancontext                  # core, standard library only
+pip install "leancontext[integrations]"  # openai, anthropic, litellm, fastapi adapters
+pip install "leancontext[otel]"          # OpenTelemetry metrics
+pip install "leancontext[mcp]"           # MCP server
+pip install "leancontext[tiktoken]"      # exact token counts (used automatically when present)
 ```
 ## Use it
@@ -176,6 +181,11 @@ r.fidelity                        # 0..1 signal preserved
 | Frameworks | LangChain, LangGraph, Agno via `wrap(tools)`; any framework via `@reduce` on tool functions (sync or async) |
 | MCP server | `python -m leancontext.integrations.mcp_server` — reduce / expand / stats over stdio |
+CI exercises OpenAI (chat + Responses), Anthropic, LiteLLM, the standalone proxy, OpenTelemetry,
+and the MCP server against the real packages. Message reduction for all formats (including Gemini)
+is unit-tested directly. The framework adapters (LangChain / LangGraph / Agno) and the SDK-level
+Gemini client wrapper are provided best-effort and are not yet covered in CI against the live SDKs.
 ## Reducers
 | Kind | What it does |
@@ -185,6 +195,7 @@ r.fidelity                        # 0..1 signal preserved
 | `diff` | Keep all change, hunk, and header lines, collapse unchanged context |
 | `stacktrace` | Keep the exception and boundary frames, collapse the deep middle |
 | `html` | Strip tags, scripts, and styles, keep visible text and links |
+| `table` | Collapse whitespace-aligned command-line tables, keep header and data |
 Anything else, or any payload below the size, saving, or fidelity thresholds, passes through unchanged.
@@ -216,8 +227,8 @@ leancontext.use_tiktoken("gpt-4o")            # force a specific model's tokeniz
 ## Roadmap
-Accurate provider tokenizers by default, an MCP server, tested LangChain / LlamaIndex / CrewAI
-adapters, broader Anthropic native interop, and a PyPI release.
+CI-verified LangChain / LlamaIndex / CrewAI / Agno adapters, accurate provider tokenizers by
+default, and broader Anthropic native interop.
 ## Contributing

{leancontext-2.0.4 → leancontext-2.0.6}/README.md RENAMED

@@ -38,14 +38,18 @@ $ python bench.py
 sample              kind          before   after  saved  fidelity
 -----------------------------------------------------------------
 log (incident)      log            52642     100   100%      100%
-json (RAG chunks)   json            1862    1390    25%      100%
+json (RAG chunks)   json            1862    1391    25%      100%
 html (web fetch)    html            1672    1093    35%      100%
 diff (patch)        diff             639      81    87%      100%
 stacktrace          stacktrace       896      94    90%      100%
 -----------------------------------------------------------------
-TOTAL                              57711    2758    95%
+TOTAL                              57711    2759    95%
 ```
+Counts above use the built-in heuristic tokenizer (≈4 chars/token). Install the
+`tiktoken` extra for exact model token counts — the ratios are similar (~92% on
+this sample). The reduced text is identical either way.
 A real incident log, before and after:
 ```text
@@ -82,10 +86,11 @@ errors, anomalies, and identifiers, and collapses the rest.
 ## Install
 ```bash
-pip install -e .                  # core, standard library only
-pip install -e ".[integrations]"  # openai, anthropic, litellm, fastapi adapters
-pip install -e ".[otel]"          # OpenTelemetry metrics
-pip install -e ".[tiktoken]"      # exact token counts (used automatically when present)
+pip install leancontext                  # core, standard library only
+pip install "leancontext[integrations]"  # openai, anthropic, litellm, fastapi adapters
+pip install "leancontext[otel]"          # OpenTelemetry metrics
+pip install "leancontext[mcp]"           # MCP server
+pip install "leancontext[tiktoken]"      # exact token counts (used automatically when present)
 ```
 ## Use it
@@ -130,6 +135,11 @@ r.fidelity                        # 0..1 signal preserved
 | Frameworks | LangChain, LangGraph, Agno via `wrap(tools)`; any framework via `@reduce` on tool functions (sync or async) |
 | MCP server | `python -m leancontext.integrations.mcp_server` — reduce / expand / stats over stdio |
+CI exercises OpenAI (chat + Responses), Anthropic, LiteLLM, the standalone proxy, OpenTelemetry,
+and the MCP server against the real packages. Message reduction for all formats (including Gemini)
+is unit-tested directly. The framework adapters (LangChain / LangGraph / Agno) and the SDK-level
+Gemini client wrapper are provided best-effort and are not yet covered in CI against the live SDKs.
 ## Reducers
 | Kind | What it does |
@@ -139,6 +149,7 @@ r.fidelity                        # 0..1 signal preserved
 | `diff` | Keep all change, hunk, and header lines, collapse unchanged context |
 | `stacktrace` | Keep the exception and boundary frames, collapse the deep middle |
 | `html` | Strip tags, scripts, and styles, keep visible text and links |
+| `table` | Collapse whitespace-aligned command-line tables, keep header and data |
 Anything else, or any payload below the size, saving, or fidelity thresholds, passes through unchanged.
@@ -170,8 +181,8 @@ leancontext.use_tiktoken("gpt-4o")            # force a specific model's tokeniz
 ## Roadmap
-Accurate provider tokenizers by default, an MCP server, tested LangChain / LlamaIndex / CrewAI
-adapters, broader Anthropic native interop, and a PyPI release.
+CI-verified LangChain / LlamaIndex / CrewAI / Agno adapters, accurate provider tokenizers by
+default, and broader Anthropic native interop.
 ## Contributing

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/__init__.py RENAMED

@@ -49,7 +49,18 @@ from .integrations import (
 from .messages import detect_format, reduce_messages
 from .tokens import active_tokenizer, count_tokens, set_token_counter, use_tiktoken
-__version__ = "0.0.1"
+# Single source of truth is the installed package metadata (pyproject version);
+# the literal is only a fallback for running straight from a source tree.
+try:
+    from importlib.metadata import PackageNotFoundError
+    from importlib.metadata import version as _pkg_version
+    try:
+        __version__ = _pkg_version("leancontext")
+    except PackageNotFoundError:
+        __version__ = "2.0.6"
+except ImportError:  # pragma: no cover - importlib.metadata is stdlib on 3.10+
+    __version__ = "2.0.6"
 __all__ = [
     "reduce",

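The `__version__` change above reads the version from installed package metadata and keeps a literal only as a fallback. A minimal sketch of that pattern, assuming a hypothetical helper `resolve_version` (not part of leancontext) and using only the standard library:

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str, fallback: str) -> str:
    """Return the installed distribution's version, or `fallback` if it is not installed.

    Hypothetical helper for illustration; the package inlines this logic in __init__.py.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Running from a source tree without an installed distribution.
        return fallback


# A distribution name that is assumed not to exist, so the fallback is returned.
missing = resolve_version("surely-not-an-installed-distribution-xyz", "0.0.0+source")
```

The fallback literal still has to be bumped by hand on each release, which is presumably why the diff updates it to `2.0.6` alongside `pyproject.toml`.
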
{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/core.py RENAMED

@@ -9,6 +9,7 @@ from __future__ import annotations
 import json
 import os
+import threading
 from collections import OrderedDict
 from collections.abc import Callable
 from dataclasses import dataclass, field
@@ -36,11 +37,13 @@ CONFIG = _Config()
 # A tool output is re-sent on every turn, so we reduce each unique payload once and
 # reuse the result. Keyed by content hash + options; deterministic, so this is safe.
 _CACHE: OrderedDict[tuple, Reduction] = OrderedDict()
+_CACHE_LOCK = threading.Lock()
 def clear_cache() -> None:
     """Drop all cached reductions."""
-    _CACHE.clear()
+    with _CACHE_LOCK:
+        _CACHE.clear()
 def disable() -> None:
@@ -167,15 +170,20 @@ def reduce_text(
     key = (ref, kind, min_saving, min_fidelity, CONFIG.min_tokens, CONFIG.max_input_chars)
     use_cache = CONFIG.cache_size > 0
-    if use_cache and key in _CACHE:
-        result = _CACHE[key]
-        _CACHE.move_to_end(key)
-    else:
+    result = None
+    if use_cache:
+        with _CACHE_LOCK:
+            result = _CACHE.get(key)
+            if result is not None:
+                _CACHE.move_to_end(key)
+    if result is None:
         result = _compute(original, before, ref, kind, min_saving, min_fidelity)
         if use_cache:
-            _CACHE[key] = result
-            if len(_CACHE) > CONFIG.cache_size:
-                _CACHE.popitem(last=False)  # evict least-recently-used
+            with _CACHE_LOCK:
+                _CACHE[key] = result
+                if len(_CACHE) > CONFIG.cache_size:
+                    _CACHE.popitem(last=False)  # evict least-recently-used
     if result.applied:
         _emit(result)
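
The cache change above takes the lock for the lookup and for the insert/evict step, but computes the reduction outside it. A self-contained sketch of the same shape, assuming a hypothetical `LockedLRU` class and `fake_reduce` function (neither is part of leancontext):

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class LockedLRU:
    """Hypothetical LRU cache guarded by a lock; an illustration, not the package's code."""

    def __init__(self, max_size: int) -> None:
        self._data: OrderedDict = OrderedDict()
        self._lock = threading.Lock()
        self._max_size = max_size

    def get_or_compute(self, key, compute):
        # Lookup and recency update happen under the lock.
        with self._lock:
            hit = self._data.get(key)
            if hit is not None:
                self._data.move_to_end(key)
                return hit
        # The (possibly slow) computation runs outside the lock.
        value = compute(key)
        # Insert and eviction happen under the lock again.
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            while len(self._data) > self._max_size:
                self._data.popitem(last=False)  # evict least-recently-used
        return value

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)


def fake_reduce(key: int) -> str:
    """Stand-in for a deterministic reduction."""
    return f"reduced-{key}"


cache = LockedLRU(max_size=5)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda k: cache.get_or_compute(k, fake_reduce), range(60)))
```

Two threads that miss on the same key may both compute it. Because the computation is deterministic, the second write stores the same value, so the duplicate work costs time but not correctness.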

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/cost.py RENAMED

@@ -13,6 +13,7 @@ no known price, token savings are still reported and ``usd_saved`` is ``None``.
 from __future__ import annotations
+import threading
 from collections.abc import Callable
 #: USD per 1M tokens (input, output). Indicative — override via set_price().
@@ -71,14 +72,17 @@ class CostTracker:
         self.usd_saved = 0.0
         self.has_price = _input_price(model, input_price_per_mtok) is not None
         self._hook: Callable | None = None
+        self._lock = threading.Lock()
     def _on(self, r) -> None:
-        self.reductions += 1
-        self.tokens_before += r.tokens_before
-        self.tokens_after += r.tokens_after
-        self.tokens_saved += r.tokens_saved
-        if self.has_price:
-            self.usd_saved += estimate_savings(r, self.model, self.price)["usd_saved"]
+        # The hook fires from every reducing thread, so guard the running totals.
+        usd = estimate_savings(r, self.model, self.price)["usd_saved"] if self.has_price else 0.0
+        with self._lock:
+            self.reductions += 1
+            self.tokens_before += r.tokens_before
+            self.tokens_after += r.tokens_after
+            self.tokens_saved += r.tokens_saved
+            self.usd_saved += usd
     def install(self) -> CostTracker:
         from .core import on_reduction
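
The `CostTracker._on` change above computes the dollar estimate before taking the lock, then updates all running totals together. A rough sketch of why that matters, assuming a hypothetical `RunningTotals` class and a made-up price (not the package's pricing table):

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class RunningTotals:
    """Hypothetical counters updated from many threads; illustration only."""

    reductions: int = 0
    tokens_saved: int = 0
    usd_saved: float = 0.0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def record(self, tokens_saved: int, usd_per_mtok: float) -> None:
        # Price arithmetic needs no shared state, so it stays outside the lock.
        usd = tokens_saved * usd_per_mtok / 1_000_000
        # All counters change together, so a reader never sees a half-applied update.
        with self._lock:
            self.reductions += 1
            self.tokens_saved += tokens_saved
            self.usd_saved += usd


totals = RunningTotals()
with ThreadPoolExecutor(max_workers=8) as pool:
    # 500 simulated reductions of 1000 tokens each at an assumed 2.0 USD per 1M tokens.
    list(pool.map(lambda _: totals.record(1000, 2.0), range(500)))
```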

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/fidelity.py RENAMED

@@ -60,12 +60,22 @@ def _iter_scalars(data: Any):
 def _json_fidelity(original: str, reduced: str) -> float:
-    """Fraction of JSON scalar values (strings and numbers) preserved in the output."""
+    """Fraction of JSON scalar values (strings and numbers) preserved in the output.
+    Values are matched in their JSON-encoded form (the reducer emits them that way),
+    so a value containing a delimiter, quote, or newline only counts as preserved if
+    its exact escaped bytes survive — the check sees structural corruption, not just
+    whether the characters appear somewhere.
+    """
     try:
         data = json.loads(original)
     except Exception:
         return 1.0
-    values = [str(v) for v in _iter_scalars(data) if str(v)]
+    values = [
+        json.dumps(v, ensure_ascii=False).strip('"')
+        for v in _iter_scalars(data)
+    ]
+    values = [v for v in values if v]
     if not values:
         return 1.0
     kept = sum(1 for v in values if v in reduced)
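
The docstring above says that matching values in their JSON-encoded form lets the check see structural corruption. A small sketch of the difference, assuming two hypothetical helpers (`raw_preserved`, `encoded_preserved`) that mirror the old and new matching rules:

```python
import json


def raw_preserved(value, reduced: str) -> bool:
    """Old rule: the value's characters appear somewhere in the output."""
    return str(value) in reduced


def encoded_preserved(value, reduced: str) -> bool:
    """New rule: the value's JSON-escaped form appears in the output."""
    return json.dumps(value, ensure_ascii=False).strip('"') in reduced


record = {"id": 1, "text": "part A\npart B"}

# A plain " | " join lets the embedded newline split the row in two.
pipe_joined = " | ".join(str(v) for v in record.values())

# A JSON array keeps the newline escaped, so the row stays on one line.
json_row = json.dumps(list(record.values()), separators=(",", ":"), ensure_ascii=False)

raw_on_pipe = raw_preserved(record["text"], pipe_joined)
encoded_on_pipe = encoded_preserved(record["text"], pipe_joined)
encoded_on_json = encoded_preserved(record["text"], json_row)
```

Here the raw rule reports the value as preserved even though the row is broken, while the encoded rule rejects the pipe-joined output and accepts the JSON row.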

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/integrations/_common.py RENAMED

@@ -25,19 +25,30 @@ def mark(fn: Callable) -> Callable:
     return fn
-def reduce_messages_in(mapping: Any, fmt: str, opts: dict, key: str = "messages") -> None:
-    """Fail-open, in-place reduction of ``mapping[key]`` (dict-like).
+#: Request keys that can carry a message/tool-output list across providers:
+#: ``messages`` (OpenAI chat / Anthropic), ``input`` (OpenAI Responses API).
+_LIST_KEYS = ("messages", "input")
-    ``key`` is ``messages`` for OpenAI/Anthropic, ``contents`` for Gemini.
+def reduce_messages_in(mapping: Any, fmt: str, opts: dict, key: str | None = "messages") -> None:
+    """Fail-open, in-place reduction of the message list(s) in ``mapping`` (dict-like).
+    ``key`` names the field to reduce (``messages`` for OpenAI/Anthropic). Pass
+    ``key=None`` to reduce whichever known list keys are present — used on gateway
+    paths (LiteLLM) where a request may be chat (``messages``) or Responses (``input``).
     """
-    if isinstance(mapping, dict) and isinstance(mapping.get(key), list):
-        try:
-            mapping[key] = reduce_messages(mapping[key], fmt=fmt, **opts)
-        except Exception:
-            pass  # fail open
+    if not isinstance(mapping, dict):
+        return
+    keys = _LIST_KEYS if key is None else (key,)
+    for k in keys:
+        if isinstance(mapping.get(k), list):
+            try:
+                mapping[k] = reduce_messages(mapping[k], fmt=fmt, **opts)
+            except Exception:
+                pass  # fail open
-def wrap_messages_create(create: Callable, *, fmt: str, opts: dict, key: str = "messages",
+def wrap_messages_create(create: Callable, *, fmt: str, opts: dict, key: str | None = "messages",
                          reduce: bool = True,
                          before: Callable[[dict], None] | None = None) -> Callable:
     """Wrap a ``create(**kwargs)`` callable to reduce its messages before calling through.

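The `key=None` behaviour described in the docstring above can be sketched without the package. This assumes a hypothetical `reduce_lists_in` function and a stand-in `fake_reduce` in place of `reduce_messages`:

```python
from typing import Any, Callable

# Same two keys the diff lists: chat uses "messages", the Responses API uses "input".
LIST_KEYS = ("messages", "input")


def reduce_lists_in(mapping: Any, reducer: Callable[[list], list],
                    key: "str | None" = "messages") -> None:
    """Hypothetical fail-open, in-place reduction of the list(s) found in `mapping`."""
    if not isinstance(mapping, dict):
        return
    keys = LIST_KEYS if key is None else (key,)
    for k in keys:
        if isinstance(mapping.get(k), list):
            try:
                mapping[k] = reducer(mapping[k])
            except Exception:
                pass  # fail open: leave the request untouched


def fake_reduce(items: list) -> list:
    """Stand-in reducer that only tags each item."""
    return [{**m, "reduced": True} if isinstance(m, dict) else m for m in items]


def broken_reduce(items: list) -> list:
    raise ValueError("boom")


chat = {"model": "m", "messages": [{"role": "tool", "content": "..."}]}
responses = {"model": "m", "input": [{"type": "function_call_output", "output": "..."}]}

reduce_lists_in(chat, fake_reduce)           # default key finds "messages"
reduce_lists_in(responses, fake_reduce)      # default key does not look at "input"
missed_with_default_key = "reduced" not in responses["input"][0]
reduce_lists_in(responses, fake_reduce, key=None)  # key=None checks both keys

untouched = {"messages": [{"role": "tool", "content": "x"}]}
reduce_lists_in(untouched, broken_reduce, key=None)  # the error is swallowed
```
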
{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/integrations/clients.py RENAMED

@@ -15,12 +15,17 @@ from ._common import wrap_messages_create
 def wrap_openai(client: Any, **opts) -> Any:
-    """Reduce tool outputs on an OpenAI client's chat.completions.create."""
+    """Reduce tool outputs on an OpenAI client's chat.completions and responses APIs."""
     try:
         comp = client.chat.completions
         comp.create = wrap_messages_create(comp.create, fmt="openai", opts=opts)
     except Exception:
         pass  # fail open
+    try:
+        responses = client.responses
+        responses.create = wrap_messages_create(responses.create, fmt="responses", opts=opts, key="input")
+    except Exception:
+        pass  # fail open
     return client

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/integrations/litellm.py RENAMED

@@ -35,7 +35,8 @@ def make_handler(**opts):
     class LeanContextHandler(CustomLogger):
         async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
             if call_type in _REDUCIBLE_CALLS:
-                reduce_messages_in(data, "auto", opts)  # fail-open in-place
+                # key=None: reduce chat (messages) or Responses (input) payloads alike
+                reduce_messages_in(data, "auto", opts, key=None)  # fail-open in-place
             return data
     return LeanContextHandler()
@@ -48,14 +49,14 @@ def patch(**opts) -> None:
     if getattr(litellm, "_leancontext_patched", False):
         return
-    litellm.completion = wrap_messages_create(litellm.completion, fmt="auto", opts=opts)
+    litellm.completion = wrap_messages_create(litellm.completion, fmt="auto", opts=opts, key=None)
     if hasattr(litellm, "acompletion"):
         _orig_acompletion = litellm.acompletion
         @functools.wraps(_orig_acompletion)
         async def acompletion(*args, **kwargs):
-            reduce_messages_in(kwargs, "auto", opts)
+            reduce_messages_in(kwargs, "auto", opts, key=None)
             return await _orig_acompletion(*args, **kwargs)
         litellm.acompletion = mark(acompletion)

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/messages.py RENAMED

@@ -1,39 +1,27 @@
 """Protocol-aware message reduction — the gateway/wire surface.
 This is how LeanContext plugs into gateways (LiteLLM), SDK client wrappers, and proxies
-*without* the structure-blindness that hurts wire-level compressors: the chat
-protocols already tag tool outputs (OpenAI ``role="tool"``; Anthropic
-``tool_result`` blocks), so we can find and reduce exactly those — and nothing
-else. We never touch system/user/assistant instruction text. Fail-open throughout.
+*without* the structure-blindness that hurts wire-level compressors: the chat protocols
+already tag tool outputs, so we find and reduce exactly those and nothing else. We never
+touch system/user/assistant instruction text. Fail-open throughout.
-Cache-safety: reductions are deterministic and content-addressed, so the same tool
-output always serialises to the same bytes → the provider prompt-cache keeps hitting.
+Each provider format registers a detector and a per-item reducer in ``_FORMATS`` (like the
+typed-reducer registry), so adding a format means adding one entry. Supported: OpenAI
+chat-completions, Anthropic messages, Gemini contents, and the OpenAI Responses API.
+Cache-safety: reductions are deterministic and content-addressed, so the same tool output
+always serialises to the same bytes → the provider prompt-cache keeps hitting.
 """
 from __future__ import annotations
+from collections.abc import Callable
+from dataclasses import dataclass
 from typing import Any
 from .core import reduce_text
-def detect_format(messages: list) -> str:
-    """Best-effort detection of the message protocol."""
-    for m in messages:
-        if not isinstance(m, dict):
-            continue
-        if isinstance(m.get("parts"), list):
-            return "gemini"
-        if m.get("role") in ("tool", "function"):
-            return "openai"
-        content = m.get("content")
-        if isinstance(content, list):
-            for block in content:
-                if isinstance(block, dict) and block.get("type") == "tool_result":
-                    return "anthropic"
-    return "openai"
 def _reduce_str(text: Any, opts: dict) -> Any:
     if not isinstance(text, str):
         return text
@@ -112,8 +100,7 @@ def _reduce_anthropic_textblock(x: Any, opts: dict) -> Any:
 # --- Gemini format -----------------------------------------------------------
 # Gemini uses `contents` -> `parts`, where a tool result is a `functionResponse`
 # part whose `response` is a dict. We reduce the large string values inside that
-# dict, keeping the dict shape Gemini requires. Typed SDK objects (non-dict)
-# pass through untouched.
+# dict, keeping the dict shape Gemini requires. Typed SDK objects pass through.
 def _reduce_gemini_message(content: Any, opts: dict) -> Any:
     if not isinstance(content, dict) or not isinstance(content.get("parts"), list):
@@ -133,20 +120,105 @@ def _reduce_gemini_message(content: Any, opts: dict) -> Any:
     return {**content, "parts": new_parts}
+# --- OpenAI Responses API format ---------------------------------------------
+# The Responses API uses `input` (not `messages`); a tool result is an item with
+# type "function_call_output" whose `output` is a string.
+def _reduce_responses_message(item: Any, opts: dict) -> Any:
+    if not isinstance(item, dict) or item.get("type") != "function_call_output":
+        return item
+    output = item.get("output")
+    if isinstance(output, str):
+        new_item = dict(item)
+        new_item["output"] = _reduce_str(output, opts)
+        return new_item
+    # The Responses API also allows a list of content parts (e.g. output_text);
+    # reduce those the same way as chat parts. Anything else passes through.
+    if isinstance(output, list):
+        new_item = dict(item)
+        new_item["output"] = [_reduce_openai_part(p, opts) for p in output]
+        return new_item
+    return item
+# --- format registry ---------------------------------------------------------
+def _is_responses(m: dict) -> bool:
+    return m.get("type") == "function_call_output"
+def _is_gemini(m: dict) -> bool:
+    return isinstance(m.get("parts"), list)
+def _is_openai(m: dict) -> bool:
+    return m.get("role") in ("tool", "function")
+def _is_anthropic(m: dict) -> bool:
+    content = m.get("content")
+    return isinstance(content, list) and any(
+        isinstance(b, dict) and b.get("type") == "tool_result" for b in content
+    )
+@dataclass(frozen=True)
+class _Format:
+    name: str
+    detect: Callable[[dict], bool]
+    reduce: Callable[[Any, dict], Any]
+    priority: int
+# Detection runs in priority order; the first format any single message matches wins.
+_FORMATS: list[_Format] = sorted(
+    [
+        _Format("responses", _is_responses, _reduce_responses_message, 10),
+        _Format("gemini", _is_gemini, _reduce_gemini_message, 20),
+        _Format("openai", _is_openai, _reduce_openai_message, 30),
+        _Format("anthropic", _is_anthropic, _reduce_anthropic_message, 40),
+    ],
+    key=lambda f: f.priority,
+)
+_REDUCE_BY_NAME = {f.name: f.reduce for f in _FORMATS}
 # --- public ------------------------------------------------------------------
+def _format_for(m: Any) -> str:
+    """The format a single message belongs to (priority order); defaults to ``openai``."""
+    if isinstance(m, dict):
+        for fmt in _FORMATS:
+            if fmt.detect(m):
+                return fmt.name
+    return "openai"
+def detect_format(messages: list) -> str:
+    """Best-effort detection of the message protocol; defaults to ``openai``."""
+    for m in messages:
+        if not isinstance(m, dict):
+            continue
+        for fmt in _FORMATS:
+            if fmt.detect(m):
+                return fmt.name
+    return "openai"
 def reduce_messages(messages: Any, *, fmt: str = "auto", **opts) -> Any:
     """Return a new message list with tool outputs reduced. Input is not mutated.
-    Handles OpenAI (`role:"tool"`), Anthropic (`tool_result` blocks), and Gemini
-    (`functionResponse` parts). Only tool-result content is touched; instructions
-    are never altered. Anything unrecognised passes through unchanged (fail open).
+    Handles OpenAI (chat + Responses), Anthropic, and Gemini formats. Only tool-result
+    content is touched; instructions are never altered. Anything unrecognised passes
+    through unchanged (fail open).
+    With ``fmt="auto"`` each message is dispatched by its own format, so a list mixing
+    shapes (e.g. a chat tool message alongside a Responses ``function_call_output``)
+    reduces every item — not just the ones matching the first format seen.
     """
     if not isinstance(messages, list):
         return messages
-    resolved = detect_format(messages) if fmt == "auto" else fmt
-    if resolved == "anthropic":
-        return [_reduce_anthropic_message(m, opts) for m in messages]
-    if resolved == "gemini":
-        return [_reduce_gemini_message(m, opts) for m in messages]
-    return [_reduce_openai_message(m, opts) for m in messages]
+    if fmt == "auto":
+        return [_REDUCE_BY_NAME.get(_format_for(m), _reduce_openai_message)(m, opts) for m in messages]
+    reducer = _REDUCE_BY_NAME.get(fmt, _reduce_openai_message)
+    return [reducer(m, opts) for m in messages]
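
The move from one format per list to per-item dispatch can be shown with a reduced version of the registry. This sketch covers only two formats and assumes a toy `_shrink` function in place of the real reducers; all names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Any, Callable


def _shrink(text: str) -> str:
    """Toy reduction: keep the first 12 characters."""
    return text if len(text) <= 12 else text[:12] + "..."


def reduce_chat(m: Any) -> Any:
    if isinstance(m, dict) and m.get("role") in ("tool", "function") \
            and isinstance(m.get("content"), str):
        return {**m, "content": _shrink(m["content"])}
    return m


def reduce_responses(m: Any) -> Any:
    if isinstance(m, dict) and m.get("type") == "function_call_output" \
            and isinstance(m.get("output"), str):
        return {**m, "output": _shrink(m["output"])}
    return m


@dataclass(frozen=True)
class Format:
    name: str
    detect: Callable[[dict], bool]
    reduce: Callable[[Any], Any]


FORMATS = [
    Format("responses", lambda m: m.get("type") == "function_call_output", reduce_responses),
    Format("openai", lambda m: m.get("role") in ("tool", "function"), reduce_chat),
]


def reducer_for(m: Any) -> Callable[[Any], Any]:
    """Pick the reducer for a single item; default to the chat reducer."""
    if isinstance(m, dict):
        for fmt in FORMATS:
            if fmt.detect(m):
                return fmt.reduce
    return reduce_chat


def reduce_per_item(messages: list) -> list:
    """New behaviour: every item is dispatched by its own shape."""
    return [reducer_for(m)(m) for m in messages]


def reduce_first_format(messages: list) -> list:
    """Old behaviour: the first recognised item decides the reducer for the whole list."""
    chosen = reduce_chat
    for m in messages:
        if isinstance(m, dict) and any(f.detect(m) for f in FORMATS):
            chosen = reducer_for(m)
            break
    return [chosen(m) for m in messages]


mixed = [
    {"role": "user", "content": "summarise the logs"},
    {"role": "tool", "content": "x" * 40},
    {"type": "function_call_output", "call_id": "c1", "output": "y" * 40},
]
per_item = reduce_per_item(mixed)
first_format = reduce_first_format(mixed)
```

With the mixed list, `reduce_first_format` settles on the chat reducer and leaves the `function_call_output` item at full length, whereas `reduce_per_item` shortens both tool outputs and leaves the user message alone.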

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/paging.py RENAMED

@@ -18,6 +18,7 @@ from .tokens import content_ref, count_tokens
 REF_SCHEME = "lc"
 _REF_RE = re.compile(r"lc://([0-9a-f]{6,40})")
+_HEX_REF = re.compile(r"[0-9a-f]{6,40}")   # a valid content-hash id (no path chars)
 class ContentStore:
@@ -42,6 +43,8 @@ class ContentStore:
         return ref
     def get(self, ref: str) -> str | None:
+        if not _HEX_REF.fullmatch(ref):   # only content-hash ids; blocks path traversal
+            return None
         if self.root:
             try:
                 with open(self._path(ref), encoding="utf-8") as fh:
@@ -54,9 +57,12 @@ class ContentStore:
 _DEFAULT_STORE = ContentStore()
-def _normalize(ref: str) -> str:
+def _normalize(ref: str) -> str | None:
     m = _REF_RE.search(ref)
-    return m.group(1) if m else ref.strip()
+    if m:
+        return m.group(1)
+    ref = ref.strip()
+    return ref if _HEX_REF.fullmatch(ref) else None
 def store(content: str, using: ContentStore | None = None) -> str:
@@ -66,7 +72,10 @@ def store(content: str, using: ContentStore | None = None) -> str:
 def expand(ref: str, using: ContentStore | None = None) -> str | None:
     """Return the original content for a ref (accepts 'lc://<id>' or a bare id)."""
-    return (using or _DEFAULT_STORE).get(_normalize(ref))
+    norm = _normalize(ref)
+    if norm is None:
+        return None
+    return (using or _DEFAULT_STORE).get(norm)
 def reference_line(content: str, summary: str | None = None,
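
The path-traversal fix above comes down to accepting only strings that are entirely a lowercase hex id. A standalone sketch of the normalisation step, reusing the two patterns from the diff under the hypothetical names `REF_RE`, `HEX_REF`, and `normalize`:

```python
import re

REF_RE = re.compile(r"lc://([0-9a-f]{6,40})")   # an id embedded in an lc:// reference
HEX_REF = re.compile(r"[0-9a-f]{6,40}")         # a bare content-hash id, nothing else


def normalize(ref: str) -> "str | None":
    """Return the content-hash id in `ref`, or None if `ref` is not a valid reference."""
    m = REF_RE.search(ref)
    if m:
        return m.group(1)  # only the hex group is kept; trailing path text is dropped
    ref = ref.strip()
    # fullmatch rejects anything with path characters, dots, or uppercase letters.
    return ref if HEX_REF.fullmatch(ref) else None


accepted_uri = normalize("lc://deadbeef")
accepted_bare = normalize("  deadbeef  ")
rejected_traversal = normalize("../../etc/passwd")
rejected_mixed = normalize("deadbeef/../secret")
rejected_short = normalize("abcde")
trimmed = normalize("lc://deadbeef/../../etc/passwd")
```

Because `fullmatch` must consume the whole string, a value such as `deadbeef/../secret` is rejected rather than partially matched. An `lc://` reference with trailing path text resolves to the hex id alone.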

{leancontext-2.0.4 → leancontext-2.0.6}/leancontext/reducers/json_data.py RENAMED

@@ -27,20 +27,22 @@ def _find_records(data: Any) -> list[dict] | None:
     return None
-def _fmt(value: Any) -> str:
-    if isinstance(value, str):
-        return value
-    return json.dumps(value, separators=(",", ":"), ensure_ascii=False)
 def reduce_json(text: str) -> tuple[str, list[str]]:
     data = json.loads(text)
     records = _find_records(data)
     if records is not None and len(records) >= 3:
         keys = list(dict.fromkeys(k for row in records for k in row.keys()))
-        header = "fields: " + " | ".join(keys)
-        rows = [" | ".join(_fmt(row.get(k, "")) for k in keys) for row in records]
+        # Each row is a JSON array of values in `keys` order, with the field names
+        # factored out into the header once (a missing field becomes null, keeping
+        # every row positional). JSON-encoding every cell keeps values that contain
+        # the delimiter, quotes, or newlines unambiguous and lossless — a plain
+        # " | " join would corrupt those, and the fidelity check wouldn't catch it.
+        header = "fields: " + json.dumps(keys, separators=(",", ":"), ensure_ascii=False)
+        rows = [
+            json.dumps([row.get(k) for k in keys], separators=(",", ":"), ensure_ascii=False)
+            for row in records
+        ]
         notes = [f"columnar: {len(records)} records × {len(keys)} fields, keys factored out once"]
         return header + "\n" + "\n".join(rows), notes
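
To see that the new columnar layout is reversible, the encoding step can be paired with a decoder. In this sketch `to_columnar` follows the logic in the hunk above, while `from_columnar` is a hypothetical inverse that the package is not known to provide:

```python
import json


def to_columnar(records: list) -> str:
    """Header with the field names once, then one JSON array per record."""
    keys = list(dict.fromkeys(k for row in records for k in row))
    header = "fields: " + json.dumps(keys, separators=(",", ":"), ensure_ascii=False)
    rows = [
        json.dumps([row.get(k) for k in keys], separators=(",", ":"), ensure_ascii=False)
        for row in records
    ]
    return header + "\n" + "\n".join(rows)


def from_columnar(text: str) -> list:
    """Hypothetical inverse: rebuild the records from the columnar text."""
    # Split on "\n" only: str.splitlines() also breaks on characters such as U+2028,
    # which json.dumps leaves unescaped when ensure_ascii=False.
    header, *lines = text.split("\n")
    keys = json.loads(header[len("fields: "):])
    return [dict(zip(keys, json.loads(line))) for line in lines]


records = [{"id": i, "text": f'row {i} | "part A"\nrow {i} part B'} for i in range(3)]
encoded = to_columnar(records)
decoded = from_columnar(encoded)

# A record with a missing field comes back with an explicit None in that position.
sparse = from_columnar(to_columnar([{"a": 1, "b": 2}, {"a": 3}, {"a": 4, "b": 5}]))
```

Values containing the old delimiter, quotes, or newlines survive the round trip unchanged. The one difference is that an absent field and an explicit `null` become indistinguishable after decoding.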

{leancontext-2.0.4 → leancontext-2.0.6}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "leancontext"
-version = "2.0.4"
+version = "2.0.6"
 description = "Deterministic, type-aware reduction of agent tool outputs at the source. Cut LLM token cost without making the agent do less."
 readme = "README.md"
 requires-python = ">=3.10"

leancontext-2.0.6/tests/test_concurrency.py ADDED

@@ -0,0 +1,27 @@
+import concurrent.futures
+import leancontext
+from leancontext.core import CONFIG, clear_cache
+def _log(n=400):
+    lines = [f"2026-06-21T09:00:{i % 60:02d}Z INFO [worker] job={i} status=ok ms={i % 50}" for i in range(n)]
+    lines.insert(200, "2026-06-21T09:05:00Z FATAL [render] OOM killed worker=7 — root cause")
+    return "\n".join(lines)
+def test_concurrent_reduce_with_eviction_is_safe():
+    # Small cache + many distinct payloads across threads exercises the cache's
+    # insert / move_to_end / evict paths concurrently. With the lock this is safe.
+    clear_cache()
+    old = CONFIG.cache_size
+    CONFIG.cache_size = 5
+    try:
+        payloads = [_log() + f"\nunique-{i} marker line " * 2 for i in range(60)] * 2
+        with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
+            results = list(pool.map(lambda p: leancontext.reduce(p).text, payloads))
+        assert len(results) == len(payloads)
+        assert all("root cause" in r for r in results)
+    finally:
+        CONFIG.cache_size = old
+        clear_cache()
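
The test above relies on the cache's insert, `move_to_end`, and evict paths being serialized by a lock. The cache implementation itself is not part of this hunk, so the following is only a minimal sketch of that pattern, assuming an `OrderedDict`-backed LRU; `LockedLRU` and `get_or_compute` are hypothetical names.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class LockedLRU:
    """Hypothetical bounded LRU cache; not leancontext's actual cache."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()
        self._lock = threading.Lock()

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)   # mark as most recently used
                return self._data[key]
        value = compute(key)                  # computed outside the lock
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            while len(self._data) > self.maxsize:
                self._data.popitem(last=False)  # evict least recently used
        return value

    def __len__(self):
        with self._lock:
            return len(self._data)


cache = LockedLRU(maxsize=5)
keys = [f"payload-{i}" for i in range(60)] * 2
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda k: cache.get_or_compute(k, str.upper), keys))
```

Computing the value outside the lock means two threads may occasionally compute the same key, which is harmless for a deterministic reducer and avoids serializing all work behind the cache.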

{leancontext-2.0.4 → leancontext-2.0.6}/tests/test_core.py RENAMED

@@ -41,6 +41,16 @@ def test_json_columnar_is_lossless_on_values():
     for i in range(20):
         assert f"n{i}" in r.text           # every value preserved
+def test_json_columnar_handles_delimiter_and_newline_values():
+    # Values containing the column delimiter or a newline must not corrupt rows:
+    # each row must parse back to exactly its original fields (regression test).
+    records = [{"id": i, "text": f"row {i} | part A\nrow {i} part B", "n": i} for i in range(10)]
+    r = leancontext.reduce(json.dumps(records))
+    assert r.kind == "json"                 # reduction applied, not reverted
+    rows = [json.loads(line) for line in r.text.splitlines()[1:]]  # skip the fields header
+    assert rows == [[i, f"row {i} | part A\nrow {i} part B", i] for i in range(10)]
 def test_decorator_preserves_contract():
     @leancontext.reduce
     def tool(_: str) -> str:

{leancontext-2.0.4 → leancontext-2.0.6}/tests/test_gateway.py RENAMED

@@ -36,6 +36,46 @@ def test_wrap_openai_client_reduces_messages():
     assert len(sent) < len(_big_log()) and "root cause" in sent
+def test_wrap_openai_responses_api():
+    openai = pytest.importorskip("openai")
+    client = openai.OpenAI(api_key="test-key")
+    if not hasattr(client, "responses"):
+        pytest.skip("Responses API not in this openai version")
+    captured = {}
+    client.responses.create = lambda **kw: captured.update(kw) or "OK"
+    from leancontext import wrap_openai
+    wrap_openai(client)
+    client.responses.create(
+        model="gpt-4o",
+        input=[{"type": "function_call_output", "call_id": "c", "output": _big_log()}],
+    )
+    sent = captured["input"][0]["output"]
+    assert len(sent) < len(_big_log()) and "root cause" in sent
+def test_wrap_async_openai_client_reduces_messages():
+    openai = pytest.importorskip("openai")
+    client = openai.AsyncOpenAI(api_key="test-key")
+    captured = {}
+    async def fake(**kw):
+        captured.update(kw)
+        return "OK"
+    client.chat.completions.create = fake
+    from leancontext import wrap_openai
+    wrap_openai(client)
+    asyncio.run(client.chat.completions.create(model="gpt-4o", messages=[_openai_tool_msg()]))
+    sent = captured["messages"][0]["content"]
+    assert len(sent) < len(_big_log()) and "root cause" in sent
 def test_wrap_anthropic_client_reduces_tool_results():
     anthropic = pytest.importorskip("anthropic")
     client = anthropic.Anthropic(api_key="test-key")
@@ -87,6 +127,20 @@ def test_proxy_reduces_before_forwarding():
     assert len(sent) < len(_big_log()) and "root cause" in sent
+# --- gateway helper: chat (messages) vs Responses (input) --------------------
+def test_reduce_messages_in_handles_responses_input_key():
+    # Gateway paths use key=None so a Responses request (input=) reduces too, not
+    # just chat (messages=). No third-party dependency needed for this logic.
+    from leancontext.integrations._common import reduce_messages_in
+    data = {"model": "gpt-4o",
+            "input": [{"type": "function_call_output", "call_id": "c", "output": _big_log()}]}
+    reduce_messages_in(data, "auto", {}, key=None)
+    sent = data["input"][0]["output"]
+    assert len(sent) < len(_big_log()) and "root cause" in sent
 # --- LiteLLM (real CustomLogger) ---------------------------------------------
 def test_litellm_pre_call_hook_reduces():

{leancontext-2.0.4 → leancontext-2.0.6}/tests/test_messages.py RENAMED

@@ -52,6 +52,39 @@ def test_input_not_mutated():
     assert tool["content"] == before            # original list/dicts untouched
+def test_responses_format_reduced():
+    items = [
+        {"role": "user", "content": "why did it crash?"},
+        {"type": "function_call_output", "call_id": "c1", "output": _log()},
+    ]
+    out = reduce_messages(items)                 # auto-detect -> responses
+    assert out[0] == items[0]                    # the user message is untouched
+    reduced = out[1]["output"]
+    assert len(reduced) < len(_log()) and "root cause" in reduced
+def test_mixed_format_list_reduces_every_item():
+    # A chat tool message AND a Responses function_call_output in one list: auto
+    # dispatch must reduce both, not just the format of the first message seen.
+    items = [
+        {"role": "tool", "tool_call_id": "c1", "content": _log()},
+        {"type": "function_call_output", "call_id": "c2", "output": _log()},
+    ]
+    out = reduce_messages(items)
+    assert len(out[0]["content"]) < len(_log()) and "root cause" in out[0]["content"]
+    assert len(out[1]["output"]) < len(_log()) and "root cause" in out[1]["output"]
+def test_responses_list_shaped_output_reduced():
+    items = [
+        {"type": "function_call_output", "call_id": "c1",
+         "output": [{"type": "output_text", "text": _log()}]},
+    ]
+    out = reduce_messages(items)
+    reduced = out[0]["output"][0]["text"]
+    assert len(reduced) < len(_log()) and "root cause" in reduced
 def test_non_list_passthrough():
     assert reduce_messages("not a list") == "not a list"

{leancontext-2.0.4 → leancontext-2.0.6}/tests/test_paging.py RENAMED

@@ -44,3 +44,14 @@ def test_expand_tool_spec_shape():
     spec = paging.EXPAND_TOOL_SPEC
     assert spec["name"] == "leancontext_expand"
     assert spec["input_schema"]["required"] == ["ref"]
+def test_expand_rejects_path_traversal(tmp_path):
+    store = paging.ContentStore(root=str(tmp_path))
+    secret = tmp_path.parent / "leak.txt"
+    secret.write_text("TOPSECRET")
+    # refs that aren't content hashes must never resolve to a filesystem path
+    for evil in ("../leak", "../../etc/hosts", "/etc/hosts", "..%2Fleak"):
+        assert paging.expand(evil, using=store) is None
+        assert store.get(evil) is None
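
The comment in this test says refs that are not content hashes must never resolve to a filesystem path. How `ContentStore` enforces that is not visible in this hunk; one common approach is to validate the ref against the hash format before building any path. The sketch below assumes SHA-256 hex digests as refs, and `HashStore` is a hypothetical stand-in, not the leancontext class.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# Assumption: a ref is exactly a lowercase SHA-256 hex digest.
_REF_RE = re.compile(r"[0-9a-f]{64}")


class HashStore:
    """Hypothetical content-addressed store used only for illustration."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, text):
        ref = hashlib.sha256(text.encode("utf-8")).hexdigest()
        (self.root / ref).write_text(text, encoding="utf-8")
        return ref

    def get(self, ref):
        # Reject anything that is not a well-formed hash before touching disk,
        # so "../", absolute paths, and encoded separators never reach Path().
        if not isinstance(ref, str) or not _REF_RE.fullmatch(ref):
            return None
        path = self.root / ref
        return path.read_text(encoding="utf-8") if path.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "store"
    root.mkdir()
    (Path(tmp) / "leak.txt").write_text("TOPSECRET")   # sits outside the store root
    store = HashStore(root)
    ref = store.put("hello")
    fetched = store.get(ref)
    rejected = [store.get(evil) for evil in ("../leak.txt", "/etc/hosts", "..%2Fleak.txt")]
```

Allow-listing the ref format is simpler to reason about than normalizing a path and checking that it stays under the root, because no untrusted string is ever joined onto a directory.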