PyPI - semble - Versions diffs - 0.3.2__tar.gz → 0.3.3__tar.gz - Mend

semble 0.3.2.tar.gz → 0.3.3.tar.gz


Files changed (74)

{semble-0.3.2 → semble-0.3.3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: semble
-Version: 0.3.2
+Version: 0.3.3
 Summary: Fast and Accurate Code Search for Agents
 Author-email: Thomas van Dongen <thomasvdongen@proton.me>, Stéphan Tulkens <stephantul@gmail.com>
 License: MIT License
@@ -98,7 +98,7 @@ Dynamic: license-file
 </div>
-Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](#mcp-server) or call it from the shell via [AGENTS.md](#agentsmd) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
+Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Use it as an MCP server, a CLI tool via AGENTS.md, or a dedicated sub-agent, and any coding agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
 ## Quickstart
@@ -210,7 +210,12 @@ semble savings --verbose # also show breakdown by call type
 Savings are calculated as follows: for each call, semble records the total character count of the unique files containing returned chunks and the character count of the snippets returned. Estimated tokens saved is `(file chars − snippet chars) / 4` (4 chars per token). This is a conservative estimate: the baseline is reading matched files in full, which is how coding agents often explore unfamiliar code.
-By default, stats are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. 'd:\caches\storemysemblecachehere'.
+</details>
+<details>
+<summary>Storage</summary>
+By default, your Semble savings statistics and any saved indexes are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. `~/my-folder/my-caches/semble`.
 </details>

{semble-0.3.2 → semble-0.3.3}/README.md RENAMED

@@ -24,7 +24,7 @@
 </div>
-Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](#mcp-server) or call it from the shell via [AGENTS.md](#agentsmd) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
+Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Use it as an MCP server, a CLI tool via AGENTS.md, or a dedicated sub-agent, and any coding agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
 ## Quickstart
@@ -136,7 +136,12 @@ semble savings --verbose # also show breakdown by call type
 Savings are calculated as follows: for each call, semble records the total character count of the unique files containing returned chunks and the character count of the snippets returned. Estimated tokens saved is `(file chars − snippet chars) / 4` (4 chars per token). This is a conservative estimate: the baseline is reading matched files in full, which is how coding agents often explore unfamiliar code.
-By default, stats are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. 'd:\caches\storemysemblecachehere'.
+</details>
+<details>
+<summary>Storage</summary>
+By default, your Semble savings statistics and any saved indexes are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. `~/my-folder/my-caches/semble`.
 </details>
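
Note on the savings formula quoted in the README hunk above: `(file chars − snippet chars) / 4` can be checked with a small worked example. This is a minimal sketch, not semble's implementation; the function name, the integer rounding, and the sample character counts are assumptions made for illustration.

```python
def estimated_tokens_saved(file_chars: int, snippet_chars: int, chars_per_token: int = 4) -> int:
    """Estimate tokens saved by returning snippets instead of whole files.

    Assumption: the result is rounded down; the README only states the ratio.
    """
    return (file_chars - snippet_chars) // chars_per_token


# Hypothetical call: the matched files total 12,000 characters,
# and the returned snippets total 800 characters.
saved = estimated_tokens_saved(12_000, 800)
print(saved)  # 2800
```

Because the baseline is the full content of every matched file, the estimate grows with file size even when the returned snippets stay short.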

{semble-0.3.2 → semble-0.3.3}/docs/installation.md RENAMED

@@ -21,7 +21,9 @@ To undo:
 semble uninstall
 ```
-Supported agents: Claude Code, Cursor, Gemini CLI, Kiro, OpenCode, GitHub Copilot, Codex, VS Code, Windsurf, and Zed.
+Supported agents: Claude Code, Cursor, Gemini CLI, Kiro, OpenCode, GitHub Copilot, Codex, VS Code, Windsurf, Zed, Reasonix, and Pi.
+> **Pi prerequisite:** Pi requires the MCP extension to be installed before semble can connect. Run `pi install npm:pi-mcp-extension` once, then `semble install`.
 ---
@@ -198,6 +200,48 @@ Add to `~/.config/zed/settings.json` (or `.zed/settings.json` in your project):
 </details>
+<details>
+<summary>Reasonix</summary>
+Add to `~/.reasonix/config.json` (the backwards-compatible MCP config path read by all Reasonix versions):
+```json
+{
+  "mcpServers": {
+    "semble": {
+      "command": "uvx",
+      "args": ["--from", "semble[mcp]", "semble"]
+    }
+  }
+}
+```
+</details>
+<details>
+<summary>Pi</summary>
+First install the Pi MCP extension (one-time prerequisite):
+```bash
+pi install npm:pi-mcp-extension
+```
+Then add to `~/.pi/agent/mcp.json`:
+```json
+{
+  "mcpServers": {
+    "semble": {
+      "command": "uvx",
+      "args": ["--from", "semble[mcp]", "semble"]
+    }
+  }
+}
+```
+</details>
 By default the MCP server indexes only code files. To also index documentation, config, or everything, append `--content docs`, `--content config`, or `--content all` to the server command. For example, in Claude Code:
 ```bash
@@ -250,7 +294,9 @@ If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its plac
 ### Sub-agent
-For harnesses that support sub-agents (Claude Code, Cursor, Gemini CLI, Kiro, OpenCode, GitHub Copilot), you can install a dedicated `semble-search` sub-agent. Copy the appropriate file from [`src/semble/agents/`](../src/semble/agents/) to your agent's agents directory:
+For harnesses that support sub-agents (Claude Code, Cursor, Gemini CLI, Kiro, OpenCode, GitHub Copilot, Reasonix, Pi), you can install a dedicated `semble-search` sub-agent. Copy the appropriate file from [`src/semble/agents/`](../src/semble/agents/) to your agent's agents directory:
+> **Pi prerequisite:** Pi sub-agents require the Pi agents extension. Run `pi install npm:pi-agents` once before installing.
 | Agent | File | Destination |
 |---|---|---|
@@ -260,3 +306,5 @@ For harnesses that support sub-agents (Claude Code, Cursor, Gemini CLI, Kiro, Op
 | Kiro | `kiro.md` | `~/.kiro/agents/semble-search.md` |
 | OpenCode | `opencode.md` | `~/.config/opencode/agents/semble-search.md` |
 | GitHub Copilot | `copilot.md` | `~/.copilot/agents/semble-search.agent.md` |
+| Reasonix | `reasonix.md` | `~/.reasonix/skills/semble-search.md` |
+| Pi | `pi.md` | `~/.pi/agents/semble-search.md` |
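
Note on the manual MCP setup for Reasonix and Pi documented above: both snippets add a `semble` entry under `mcpServers` in a JSON file. The sketch below shows one way to add that entry without overwriting other servers already in the file. The helper name `add_mcp_server` is hypothetical and this is not the code `semble install` runs; only the entry contents are taken from the docs diff.

```python
import json
import tempfile
from pathlib import Path

# Entry copied from the Reasonix/Pi JSON snippets in the docs diff.
SEMBLE_ENTRY = {"command": "uvx", "args": ["--from", "semble[mcp]", "semble"]}


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Hypothetical helper: set mcpServers[name] = entry, keeping all other keys."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a file such as ~/.pi/agent/mcp.json with one existing server.
    path = Path(tmp) / "agent" / "mcp.json"
    path.parent.mkdir(parents=True)
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-server"}}}))

    merged = add_mcp_server(path, "semble", SEMBLE_ENTRY)
    on_disk = json.loads(path.read_text())

print(sorted(merged["mcpServers"]))  # ['other', 'semble']
```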

semble-0.3.3/src/semble/agents/pi.md ADDED

@@ -0,0 +1,40 @@
+---
+name: semble-search
+description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over Bash/Read for any semantic or exploratory question.
+---
+Use `semble search` to find code by describing what it does or naming a symbol/identifier, instead of grep:
+```bash
+semble search "authentication flow" ./my-project
+semble search "save_pretrained" ./my-project
+semble search "save model to disk" ./my-project --top-k 10
+```
+Results are cached automatically on first run and invalidated when files change.
+Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config:
+```bash
+semble search "deployment guide" ./my-project --content docs
+semble search "database host port" ./my-project --content config
+semble search "authentication" ./my-project --content all
+```
+Use `semble find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result):
+```bash
+semble find-related src/auth.py 42 ./my-project
+```
+`path` defaults to the current directory when omitted; git URLs are accepted.
+If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its place.
+### Workflow
+1. Start with `semble search` to find relevant chunks. The index is built and cached automatically.
+2. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything.
+3. Inspect full files only when the returned chunk does not give enough context.
+4. Optionally use `semble find-related` with a promising result's `file_path` and `line` to discover related implementations.
+5. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string.

semble-0.3.3/src/semble/agents/reasonix.md ADDED

@@ -0,0 +1,42 @@
+---
+name: semble-search
+description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over bash/grep for any semantic or exploratory question.
+runAs: subagent
+allowed-tools: bash, read_file
+---
+Use `semble search` to find code by describing what it does or naming a symbol/identifier, instead of grep:
+```bash
+semble search "authentication flow" ./my-project
+semble search "save_pretrained" ./my-project
+semble search "save model to disk" ./my-project --top-k 10
+```
+Results are cached automatically on first run and invalidated when files change.
+Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config:
+```bash
+semble search "deployment guide" ./my-project --content docs
+semble search "database host port" ./my-project --content config
+semble search "authentication" ./my-project --content all
+```
+Use `semble find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result):
+```bash
+semble find-related src/auth.py 42 ./my-project
+```
+`path` defaults to the current directory when omitted; git URLs are accepted.
+If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its place.
+### Workflow
+1. Start with `semble search` to find relevant chunks. The index is built and cached automatically.
+2. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything.
+3. Inspect full files only when the returned chunk does not give enough context.
+4. Optionally use `semble find-related` with a promising result's `file_path` and `line` to discover related implementations.
+5. Use bash/grep only when you need exhaustive literal matches or quick confirmation of an exact string.

{semble-0.3.2 → semble-0.3.3}/src/semble/cache.py RENAMED

@@ -1,5 +1,6 @@
 import hashlib
 import json
+import logging
 import os
 import shutil
 import sys
@@ -13,6 +14,8 @@ from semble.index.types import PersistencePath
 from semble.types import ContentType
 from semble.utils import is_git_url, resolve_model_name
+logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
     from semble.index import SembleIndex
@@ -48,11 +51,24 @@ def _linux_cache_dir(name: str) -> Path:
     return base / name
+def _get_valid_user_cache_dir() -> Path | None:
+    """Gets the user cache dir if it is set and is a valid path."""
+    user_cache_location = os.getenv("SEMBLE_CACHE_LOCATION")
+    if user_cache_location is None:
+        return None
+    user_cache_dir = Path(user_cache_location)
+    if not user_cache_dir.is_absolute():
+        logger.warning("SEMBLE_CACHE_LOCATION is not an absolute path: %s", user_cache_location)
+        return None
+    return user_cache_dir
 def resolve_cache_folder() -> Path:
     """Resolves a cache folder, respects SEMBLE_CACHE_LOCATION (highest precedence), XDG_CACHE_HOME."""
     name = "semble"
-    if semble_cache_location := os.getenv("SEMBLE_CACHE_LOCATION"):
-        cache_dir = Path(semble_cache_location)
+    if user_cache_dir := _get_valid_user_cache_dir():
+        cache_dir = user_cache_dir
     elif sys.platform == "win32":
         cache_dir = _windows_cache_dir(name)
     elif sys.platform == "darwin":
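
Note on the new `_get_valid_user_cache_dir` check: the override is now ignored unless `SEMBLE_CACHE_LOCATION` is an absolute path. The sketch below reproduces that rule in isolation. The function name and the dict-based `env` parameter are assumptions made so the example runs without touching the real environment, and the warning log call is omitted. One consequence worth noting is that `pathlib` does not expand `~`, so a literal `~/...` value counts as relative unless the shell has already expanded it.

```python
from __future__ import annotations

import os
from pathlib import Path


def valid_user_cache_dir(env: dict[str, str]) -> Path | None:
    """Return the override only if it is set and absolute (same rule as the diff)."""
    location = env.get("SEMBLE_CACHE_LOCATION")
    if location is None:
        return None
    candidate = Path(location)
    return candidate if candidate.is_absolute() else None


absolute = os.path.abspath("semble-cache")  # absolute on any platform

unset = valid_user_cache_dir({})
relative = valid_user_cache_dir({"SEMBLE_CACHE_LOCATION": "relative/path"})
tilde = valid_user_cache_dir({"SEMBLE_CACHE_LOCATION": "~/my-caches/semble"})
accepted = valid_user_cache_dir({"SEMBLE_CACHE_LOCATION": absolute})

print(unset, relative, tilde)      # None None None
print(accepted == Path(absolute))  # True
```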

{semble-0.3.2 → semble-0.3.3}/src/semble/cli.py RENAMED

@@ -1,19 +1,26 @@
 import argparse
 import asyncio
 import json
+import re
 import sys
 import warnings
 from importlib.util import find_spec
+from shutil import rmtree
+from typing import Literal
 from model2vec.utils import get_package_extras
-from semble.cache import find_index_from_cache_folder
+from semble.cache import find_index_from_cache_folder, resolve_cache_folder
 from semble.index import SembleIndex
+from semble.index.types import PersistencePath
 from semble.stats import format_savings_report
 from semble.types import ContentType
 from semble.utils import format_results, is_git_url, resolve_chunk
-_CLI_DISPATCH_ARGS = frozenset({"search", "find-related", "install", "uninstall", "savings", "-h", "--help"})
+_CLI_DISPATCH_ARGS = frozenset({"search", "find-related", "install", "uninstall", "savings", "-h", "--help", "clear"})
+_CLEAR_CHOICE = Literal["all", "index", "savings"]
+_SHA_256_REGEX = re.compile(r"^[a-f0-9]{64}$")
 def _build_index(path: str, content: list[ContentType]) -> SembleIndex:
@@ -131,6 +138,35 @@ def _run_find_related(path: str, file_path: str, line: int, top_k: int, content:
     _maybe_save_index(index, path)
+def _run_clear(clear_type: _CLEAR_CHOICE) -> None:
+    """Run the `clear` subcommand."""
+    cache_folder = resolve_cache_folder()
+    if clear_type == "index" or clear_type == "all":
+        indexes = []
+        for path in cache_folder.glob("*/index"):
+            if not _SHA_256_REGEX.match(path.parent.name):
+                continue
+            if PersistencePath.from_path(path).non_existing():
+                continue
+            indexes.append(path)
+        if not indexes:
+            print(f"No indexes found to clear in `{cache_folder}`")
+        else:
+            for path in indexes:
+                index_folder = path.parent
+                rmtree(index_folder)
+                print(f"Cleared index at `{index_folder}`")
+    if clear_type == "savings" or clear_type == "all":
+        path = cache_folder / "savings.jsonl"
+        if not path.exists():
+            print(f"No savings file found at `{path}`")
+        else:
+            path.unlink()
+            print(f"Cleared savings at `{path}`")
 def _cli_main() -> None:
     parser = argparse.ArgumentParser(prog="semble")
     sub = parser.add_subparsers(dest="command")
@@ -141,6 +177,9 @@ def _cli_main() -> None:
     search_p.add_argument("-k", "--top-k", type=int, default=5, help="Number of results (default: 5).")
     _add_content_args(search_p)
+    clear_p = sub.add_parser("clear", help="Clear the index cache.")
+    clear_p.add_argument("type", choices=["all", "index", "savings"], help="Type of cache to clear.")
     related_p = sub.add_parser("find-related", help="Find code similar to a specific location.")
     related_p.add_argument("file_path", help="File path as shown in search results.")
     related_p.add_argument("line", type=int, help="Line number (1-indexed).")
@@ -162,6 +201,8 @@ def _cli_main() -> None:
         from semble.installer import run
         run(args.command)
+    elif args.command == "clear":
+        _run_clear(args.type)
     elif args.command == "search":
         _run_search(args.path, args.query, args.top_k, _resolve_content(args.content, args.include_text_files))
     elif args.command == "find-related":
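
Note on `_run_clear`: an index folder is only deleted when its parent directory name looks like a SHA-256 hex digest and the index is complete. The sketch below isolates that selection step so it can be run against a temporary directory. It is an approximation: `REQUIRED_FILES` stands in for semble's `PersistencePath.non_existing()` check and uses the file names from the test fixture in this release, and `find_clearable_indexes` is a hypothetical name.

```python
import re
import tempfile
from pathlib import Path
from shutil import rmtree

SHA_256_REGEX = re.compile(r"^[a-f0-9]{64}$")
# Assumption: these names come from the test fixture, not from PersistencePath itself.
REQUIRED_FILES = ("chunks.json", "bm25_index", "semantic_index", "metadata.json")


def find_clearable_indexes(cache_folder: Path) -> list[Path]:
    """Return the per-repo cache folders that hold a complete index."""
    found = []
    for index_dir in sorted(cache_folder.glob("*/index")):
        if not SHA_256_REGEX.match(index_dir.parent.name):
            continue  # not a folder created for a hashed repo path
        if any(not (index_dir / name).exists() for name in REQUIRED_FILES):
            continue  # incomplete index, left untouched
        found.append(index_dir.parent)
    return found


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    valid = cache / ("a" * 64) / "index"
    valid.mkdir(parents=True)
    for name in REQUIRED_FILES:
        (valid / name).write_text("")
    (cache / "not-a-sha" / "index").mkdir(parents=True)   # skipped: bad name
    (cache / ("c" * 64) / "index").mkdir(parents=True)    # skipped: incomplete

    targets = find_clearable_indexes(cache)
    cleared = [target.name for target in targets]
    for target in targets:
        rmtree(target)
    remaining = sorted(p.name for p in cache.iterdir())

print(cleared == ["a" * 64])  # True
print(len(remaining))         # 2
```

The name filter is what keeps `rmtree` from touching unrelated folders when `SEMBLE_CACHE_LOCATION` points at a directory shared with other tools.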

{semble-0.3.2 → semble-0.3.3}/src/semble/installer/agents.py RENAMED

@@ -5,13 +5,12 @@ import shutil
 import sys
 from dataclasses import dataclass
 from pathlib import Path
-from typing import Callable, Literal
+from typing import Literal
 _HOME = Path.home()
 Action = Literal["created", "updated", "unchanged", "not-found", "removed", "error", "skipped"]
 Mode = Literal["install", "uninstall"]
-PathResolver = Callable[[], Path]
 SEMBLE_START = "<!-- SEMBLE_START -->"
 SEMBLE_END = "<!-- SEMBLE_END -->"
@@ -39,7 +38,7 @@ _ZED_SERVER_CONFIG: dict[str, object] = {  # Zed requires "source": "custom" for
     "args": ["--from", "semble[mcp]", "semble"],
 }
-_INSTRUCTIONS = f"""\
+INSTRUCTIONS = f"""\
 {SEMBLE_START}
 ## Semble Code Search
@@ -78,15 +77,11 @@ The index is built on first run and cached automatically. If `semble` is not on
 class McpConfig:
     """MCP integration config for one agent."""
-    path: Path | PathResolver
+    path: Path
     key: str
     entry: dict[str, object]
     format: Literal["json", "toml"] = "json"
-    def resolved_path(self) -> Path:
-        """Return the resolved config path."""
-        return self.path() if callable(self.path) else self.path
 @dataclass(frozen=True)
 class WriteResult:
@@ -110,7 +105,7 @@ class AgentTarget:
     def resolved_mcp_path(self) -> Path | None:
         """Return the resolved MCP config path, or None if MCP is unsupported."""
-        return self.mcp.resolved_path() if self.mcp else None
+        return self.mcp.path if self.mcp else None
 def _opencode_mcp_path() -> Path:
@@ -175,7 +170,7 @@ AGENTS: list[AgentTarget] = [
         display_name="Opencode",
         binary="opencode",
         config_dir=_HOME / ".config" / "opencode",
-        mcp=McpConfig(_opencode_mcp_path, "mcp", _OPENCODE_SERVER_CONFIG),
+        mcp=McpConfig(_opencode_mcp_path(), "mcp", _OPENCODE_SERVER_CONFIG),
         instructions_path=_HOME / ".config" / "opencode" / "AGENTS.md",
         subagent_path=_HOME / ".config" / "opencode" / "agents" / "semble-search.md",
     ),
@@ -201,7 +196,7 @@ AGENTS: list[AgentTarget] = [
         display_name="VS Code",
         binary="code",
         config_dir=None,
-        mcp=McpConfig(_vscode_mcp_path, "servers", _STDIO_SERVER_CONFIG),
+        mcp=McpConfig(_vscode_mcp_path(), "servers", _STDIO_SERVER_CONFIG),
         instructions_path=None,
     ),
     AgentTarget(
@@ -220,6 +215,27 @@ AGENTS: list[AgentTarget] = [
         mcp=McpConfig(_HOME / ".config" / "zed" / "settings.json", "context_servers", _ZED_SERVER_CONFIG),
         instructions_path=None,
     ),
+    AgentTarget(
+        id="reasonix",
+        display_name="Reasonix",
+        binary="reasonix",
+        config_dir=_HOME / ".config" / "reasonix",
+        # ~/.reasonix/config.json is the legacy v0.x path still read by v1.x for backwards compat.
+        # The v1.x canonical config is ~/.config/reasonix/config.toml ([[plugins]]), but the JSON
+        # path requires no special TOML handling and works for new users who have never had v0.x.
+        mcp=McpConfig(_HOME / ".reasonix" / "config.json", "mcpServers", _BARE_STDIO_SERVER_CONFIG),
+        instructions_path=_HOME / ".config" / "reasonix" / "REASONIX.md",
+        subagent_path=_HOME / ".reasonix" / "skills" / "semble-search.md",
+    ),
+    AgentTarget(
+        id="pi",
+        display_name="Pi",
+        binary="pi",
+        config_dir=_HOME / ".pi",
+        mcp=McpConfig(_HOME / ".pi" / "agent" / "mcp.json", "mcpServers", _BARE_STDIO_SERVER_CONFIG),
+        instructions_path=None,
+        subagent_path=_HOME / ".pi" / "agents" / "semble-search.md",
+    ),
 ]
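
Note on dropping `PathResolver`: `McpConfig.path` is now a plain `Path`, and the OpenCode and VS Code entries call their path helpers while the module-level `AGENTS` list is built. The sketch below illustrates the effect with stand-in names (`resolve_config_path` and `McpConfigSketch` are hypothetical): the helper runs once at construction, and later reads of `path` do not call it again. Whether that matters in practice depends on what the real helpers consult, which this diff does not show.

```python
from dataclasses import dataclass
from pathlib import Path

calls: list[str] = []


def resolve_config_path() -> Path:
    """Hypothetical stand-in for a helper such as _opencode_mcp_path()."""
    calls.append("resolved")
    return Path.home() / ".config" / "example" / "config.json"


@dataclass(frozen=True)
class McpConfigSketch:
    path: Path
    key: str


# Calling the helper in the constructor resolves the path immediately,
# which mirrors McpConfig(_opencode_mcp_path(), ...) in the new AGENTS list.
configs = [McpConfigSketch(resolve_config_path(), "mcp")]
calls_after_build = len(calls)

first_read = configs[0].path
second_read = configs[0].path
calls_after_reads = len(calls)

print(calls_after_build, calls_after_reads)  # 1 1
```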

{semble-0.3.2 → semble-0.3.3}/src/semble/installer/config.py RENAMED

@@ -230,7 +230,7 @@ def _strip_toml_section(text: str, header: str) -> str:
     return "".join(result)
-def _merge_toml_block(path: Path) -> Action:
+def merge_toml_block(path: Path) -> Action:
     """Add (or refresh) the semble [mcp_servers.semble] table in a Codex config.toml as text."""
     path.parent.mkdir(parents=True, exist_ok=True)
     existed = path.exists()
@@ -242,7 +242,7 @@ def _merge_toml_block(path: Path) -> Action:
     return "created" if not existed else "updated"
-def _remove_toml_block(path: Path) -> Action:
+def remove_toml_block(path: Path) -> Action:
     """Remove the semble [mcp_servers.semble] table from a Codex config.toml, leaving the rest."""
     if not path.exists():
         return "not-found"

{semble-0.3.2 → semble-0.3.3}/src/semble/installer/installer.py RENAMED

@@ -9,19 +9,19 @@ from typing import Callable, NoReturn, Sequence, TypeVar
 import questionary
 from semble.installer.agents import (
-    _INSTRUCTIONS,
     AGENTS,
+    INSTRUCTIONS,
     AgentTarget,
     Mode,
     WriteResult,
     is_detected,
 )
 from semble.installer.config import (
-    _merge_toml_block,
-    _remove_toml_block,
     merge_json_member,
+    merge_toml_block,
     remove_json_member,
     remove_marked,
+    remove_toml_block,
     replace_or_append_marked,
 )
@@ -51,14 +51,14 @@ class _Integration:
 def merge_mcp(agent: AgentTarget) -> WriteResult:
     """Add the semble MCP entry to the agent's config."""
     assert agent.mcp is not None
-    path = agent.mcp.resolved_path()
+    path = agent.mcp.path
     return WriteResult(path, merge_json_member(path, agent.mcp.key, "semble", agent.mcp.entry))
 def remove_mcp(agent: AgentTarget) -> WriteResult:
     """Remove the semble MCP entry from the agent's config."""
     assert agent.mcp is not None
-    path = agent.mcp.resolved_path()
+    path = agent.mcp.path
     return WriteResult(path, remove_json_member(path, agent.mcp.key, "semble"))
@@ -66,9 +66,9 @@ def _apply_mcp(agent: AgentTarget, mode: Mode) -> WriteResult | None:
     """Apply or remove the MCP server integration for one agent."""
     if agent.mcp is None:
         return None
-    path = agent.mcp.resolved_path()
+    path = agent.mcp.path
     if agent.mcp.format == "toml":
-        return WriteResult(path, _merge_toml_block(path) if mode == "install" else _remove_toml_block(path))
+        return WriteResult(path, merge_toml_block(path) if mode == "install" else remove_toml_block(path))
     return merge_mcp(agent) if mode == "install" else remove_mcp(agent)
@@ -77,7 +77,7 @@ def _apply_instructions(agent: AgentTarget, mode: Mode) -> WriteResult | None:
     path = agent.instructions_path
     if path is None:
         return None
-    action = replace_or_append_marked(path, _INSTRUCTIONS) if mode == "install" else remove_marked(path)
+    action = replace_or_append_marked(path, INSTRUCTIONS) if mode == "install" else remove_marked(path)
     return WriteResult(path, action)

{semble-0.3.2 → semble-0.3.3}/src/semble/version.py RENAMED

@@ -1,2 +1,2 @@
-__version_triple__ = (0, 3, 2)
+__version_triple__ = (0, 3, 3)
 __version__ = ".".join(map(str, __version_triple__))

{semble-0.3.2 → semble-0.3.3}/src/semble.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: semble
-Version: 0.3.2
+Version: 0.3.3
 Summary: Fast and Accurate Code Search for Agents
 Author-email: Thomas van Dongen <thomasvdongen@proton.me>, Stéphan Tulkens <stephantul@gmail.com>
 License: MIT License
@@ -98,7 +98,7 @@ Dynamic: license-file
 </div>
-Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](#mcp-server) or call it from the shell via [AGENTS.md](#agentsmd) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
+Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Use it as an MCP server, a CLI tool via AGENTS.md, or a dedicated sub-agent, and any coding agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
 ## Quickstart
@@ -210,7 +210,12 @@ semble savings --verbose # also show breakdown by call type
 Savings are calculated as follows: for each call, semble records the total character count of the unique files containing returned chunks and the character count of the snippets returned. Estimated tokens saved is `(file chars − snippet chars) / 4` (4 chars per token). This is a conservative estimate: the baseline is reading matched files in full, which is how coding agents often explore unfamiliar code.
-By default, stats are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. 'd:\caches\storemysemblecachehere'.
+</details>
+<details>
+<summary>Storage</summary>
+By default, your Semble savings statistics and any saved indexes are stored in the OS cache folder (`~/Library/Caches/semble/` on macOS, `~/.cache/semble/` on Linux, `%LOCALAPPDATA%\semble\Cache\` on Windows). To override this location you can supply an environment variable `SEMBLE_CACHE_LOCATION` which should be the full path to the target cache location e.g. `~/my-folder/my-caches/semble`.
 </details>

{semble-0.3.2 → semble-0.3.3}/src/semble.egg-info/SOURCES.txt RENAMED

@@ -34,6 +34,8 @@ src/semble/agents/cursor.md
 src/semble/agents/gemini.md
 src/semble/agents/kiro.md
 src/semble/agents/opencode.md
+src/semble/agents/pi.md
+src/semble/agents/reasonix.md
 src/semble/chunking/__init__.py
 src/semble/chunking/chunking.py
 src/semble/chunking/core.py

{semble-0.3.2 → semble-0.3.3}/tests/test_cache.py RENAMED

@@ -8,6 +8,7 @@ from unittest.mock import MagicMock, patch
 import pytest
 from semble.cache import (
+    _get_valid_user_cache_dir,
     _linux_cache_dir,
     _windows_cache_dir,
     clear_cache,
@@ -81,18 +82,30 @@ def test_save_index_to_cache(tmp_path: Path) -> None:
     [
         ("win32", "semble.cache._windows_cache_dir", Path("/win")),
         ("linux", "semble.cache._linux_cache_dir", Path("/linux")),
+        ("darwin", "semble.cache._macos_cache_dir", Path("/macos")),
     ],
 )
 def test_resolve_cache_folder(platform: str, mock_target: str, expected: Path) -> None:
     """resolve_cache_folder calls the correct platform helper."""
-    with patch.object(sys, "platform", platform):
-        with patch(mock_target, return_value=expected) as mock_fn:
-            with patch("pathlib.Path.mkdir"):
-                result = resolve_cache_folder()
+    with (
+        patch.object(sys, "platform", platform),
+        patch.dict("os.environ", {}, clear=True),
+        patch(mock_target, return_value=expected) as mock_fn,
+        patch("pathlib.Path.mkdir"),
+    ):
+        result = resolve_cache_folder()
     mock_fn.assert_called_once_with("semble")
     assert result == expected
+def test_get_valid_user_cache_dir_relative_path() -> None:
+    """_get_valid_user_cache_dir returns None when SEMBLE_CACHE_LOCATION is a relative path."""
+    with patch.dict("os.environ", {"SEMBLE_CACHE_LOCATION": "relative/path"}):
+        with patch("semble.cache.logger") as mock_logger:
+            assert _get_valid_user_cache_dir() is None
+        mock_logger.warning.assert_called_once()
 def test_resolve_cache_folder_semble_cache_location(tmp_path: Path) -> None:
     """SEMBLE_CACHE_LOCATION takes precedence over all platform-specific helpers."""
     custom = tmp_path / "custom_cache"
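
Note on the added `patch.dict("os.environ", {}, clear=True)` in `test_resolve_cache_folder`: presumably this keeps a `SEMBLE_CACHE_LOCATION` set in the developer's shell from taking precedence over the platform helper under test. The sketch below shows the isolation behaviour of `patch.dict` on its own; the leaked value is a made-up example.

```python
import os
from unittest.mock import patch

# Simulate a variable leaking in from the developer's shell (hypothetical value).
os.environ["SEMBLE_CACHE_LOCATION"] = "/tmp/leaked-from-shell"

with patch.dict("os.environ", {}, clear=True):
    # Inside the block the environment is empty, so the override is invisible.
    inside = os.getenv("SEMBLE_CACHE_LOCATION")

# On exit, patch.dict restores the original environment.
after = os.getenv("SEMBLE_CACHE_LOCATION")

print(inside)  # None
print(after)   # /tmp/leaked-from-shell
```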

{semble-0.3.2 → semble-0.3.3}/tests/test_cli.py RENAMED

@@ -1,11 +1,12 @@
 import sys
+import warnings
 from importlib.resources import files
 from pathlib import Path
 from unittest.mock import MagicMock, patch
 import pytest
-from semble.cli import _cli_main, _maybe_save_index, main
+from semble.cli import _cli_main, _maybe_save_index, _run_clear, main
 from semble.types import ContentType, SearchResult
 from tests.conftest import make_chunk
@@ -172,8 +173,6 @@ def test_include_text_files_cli_deprecated(
     capsys: pytest.CaptureFixture[str],
 ) -> None:
     """--include-text-files on CLI raises DeprecationWarning."""
-    import warnings
     chunk = make_chunk("def foo(): pass", "src/foo.py")
     fake_index = MagicMock()
     fake_index.search.return_value = [SearchResult(chunk=chunk, score=0.9)]
@@ -229,3 +228,144 @@ def test_agent_file_tools_are_bash_only() -> None:
     tools = [t.strip() for t in tools_line.removeprefix("tools:").split(",")]
     assert set(tools) == {"Bash", "Read"}, f"Unexpected tools in agent file: {tools}"
     assert not any("mcp__" in t for t in tools)
+def _make_valid_index_dir(cache_folder: Path, sha: str = "a" * 64) -> Path:
+    """Create a fake valid index directory with the expected structure."""
+    index_dir = cache_folder / sha / "index"
+    index_dir.mkdir(parents=True)
+    # Create the files that PersistencePath.non_existing checks
+    (index_dir / "chunks.json").write_text("[]")
+    (index_dir / "bm25_index").write_text("")
+    (index_dir / "semantic_index").write_text("")
+    (index_dir / "metadata.json").write_text("{}")
+    return index_dir
+@pytest.mark.parametrize(
+    ("scenario", "expected_in_output"),
+    [
+        ("valid", ["Cleared index", "a" * 64, "b" * 64]),
+        ("empty", ["No indexes found"]),
+        ("non_sha", ["No indexes found"]),
+        ("incomplete", ["No indexes found"]),
+    ],
+)
+def test_run_clear_index(
+    scenario: str, expected_in_output: list[str], tmp_path: Path, capsys: pytest.CaptureFixture[str]
+) -> None:
+    """_run_clear('index') finds valid indexes, and skips non-SHA/incomplete/empty dirs."""
+    if scenario == "valid":
+        _make_valid_index_dir(tmp_path, "a" * 64)
+        _make_valid_index_dir(tmp_path, "b" * 64)
+    elif scenario == "non_sha":
+        bad_dir = tmp_path / "not-a-sha" / "index"
+        bad_dir.mkdir(parents=True)
+        (bad_dir / "chunks.json").write_text("[]")
+        (bad_dir / "bm25_index").write_text("")
+        (bad_dir / "semantic_index").write_text("")
+        (bad_dir / "metadata.json").write_text("{}")
+    elif scenario == "incomplete":
+        index_dir = tmp_path / ("c" * 64) / "index"
+        index_dir.mkdir(parents=True)
+    with patch("semble.cli.resolve_cache_folder", return_value=tmp_path):
+        _run_clear("index")
+    out = capsys.readouterr().out
+    for fragment in expected_in_output:
+        assert fragment in out
+    if scenario == "valid":
+        assert not (tmp_path / ("a" * 64)).exists()
+        assert not (tmp_path / ("b" * 64)).exists()
+@pytest.mark.parametrize(
+    ("create_file", "expected"),
+    [
+        (True, "Cleared savings"),
+        (False, "No savings file found"),
+    ],
+)
+def test_run_clear_savings(
+    create_file: bool, expected: str, tmp_path: Path, capsys: pytest.CaptureFixture[str]
+) -> None:
+    """_run_clear('savings') deletes the file when present, reports missing otherwise."""
+    savings_file = tmp_path / "savings.jsonl"
+    if create_file:
+        savings_file.write_text('{"tokens": 100}\n')
+    with patch("semble.cli.resolve_cache_folder", return_value=tmp_path):
+        _run_clear("savings")
+    if create_file:
+        assert not savings_file.exists()
+    out = capsys.readouterr().out
+    assert expected in out
+@pytest.mark.parametrize(
+    ("populate", "expected_fragments"),
+    [
+        (True, ["Cleared index", "d" * 64, "Cleared savings"]),
+        (False, ["No indexes found", "No savings file found"]),
+    ],
+)
+def test_run_clear_all(
+    populate: bool, expected_fragments: list[str], tmp_path: Path, capsys: pytest.CaptureFixture[str]
+) -> None:
+    """_run_clear('all') handles both indexes and savings."""
+    if populate:
+        _make_valid_index_dir(tmp_path, "d" * 64)
+        (tmp_path / "savings.jsonl").write_text('{"tokens": 50}\n')
+    with patch("semble.cli.resolve_cache_folder", return_value=tmp_path):
+        _run_clear("all")
+    out = capsys.readouterr().out
+    for fragment in expected_fragments:
+        assert fragment in out
+    if populate:
+        assert not (tmp_path / ("d" * 64)).exists()
+        assert not (tmp_path / "savings.jsonl").exists()
+@pytest.mark.parametrize(
+    ("subcommand", "setup_index", "setup_savings", "expected_fragments"),
+    [
+        ("index", True, False, ["Cleared index", "e" * 64]),
+        ("savings", False, True, ["Cleared savings"]),
+        ("all", True, True, ["Cleared index", "Cleared savings"]),
+    ],
+)
+def test_cli_clear_command(
+    subcommand: str,
+    setup_index: bool,
+    setup_savings: bool,
+    expected_fragments: list[str],
+    tmp_path: Path,
+    monkeypatch: pytest.MonkeyPatch,
+    capsys: pytest.CaptureFixture[str],
+) -> None:
+    """The `semble clear <subcommand>` CLI dispatches to _run_clear correctly."""
+    sha = "e" * 64
+    if setup_index:
+        _make_valid_index_dir(tmp_path, sha)
+    savings_file = tmp_path / "savings.jsonl"
+    if setup_savings:
+        savings_file.write_text('{"tokens": 200}\n')
+    monkeypatch.setattr(sys, "argv", ["semble", "clear", subcommand])
+    with patch("semble.cli.resolve_cache_folder", return_value=tmp_path):
+        _cli_main()
+    out = capsys.readouterr().out
+    for fragment in expected_fragments:
+        assert fragment in out
+    if setup_index:
+        assert not (tmp_path / sha).exists()
+    if setup_savings:
+        assert not savings_file.exists()

{semble-0.3.2 → semble-0.3.3}/tests/test_installer.py RENAMED

@@ -16,9 +16,9 @@ from semble.installer.agents import (
 )
 from semble.installer.config import (
     _CODEX_MCP_HEADER,
-    _merge_toml_block,
-    _remove_toml_block,
+    merge_toml_block,
     remove_marked,
+    remove_toml_block,
     replace_or_append_marked,
 )
 from semble.installer.installer import (
@@ -125,7 +125,13 @@ def test_merge_mcp_errors(claude_agent, content):
 @pytest.mark.parametrize(
     ("agent_id", "key"),
-    [("zed", "context_servers"), ("windsurf", "mcpServers"), ("copilot", "mcpServers")],
+    [
+        ("zed", "context_servers"),
+        ("windsurf", "mcpServers"),
+        ("copilot", "mcpServers"),
+        ("reasonix", "mcpServers"),
+        ("pi", "mcpServers"),
+    ],
 )
 def test_merge_mcp_writes_under_agent_key(tmp_path, agent_id, key):
     """merge_mcp writes the semble entry under each agent's own top-level MCP key."""
@@ -206,14 +212,14 @@ def test_codex_toml_merge_and_remove(tmp_path):
     """The Codex TOML helpers add/remove [mcp_servers.semble] while preserving other tables and keys."""
     f = tmp_path / "config.toml"
     f.write_text('model = "gpt-5"\n\n[mcp_servers.other]\ncommand = "x"\n')
-    assert _merge_toml_block(f) == "updated"
+    assert merge_toml_block(f) == "updated"
     text = f.read_text()
     assert _CODEX_MCP_HEADER in text
     assert 'model = "gpt-5"' in text
     assert "[mcp_servers.other]" in text
-    assert _merge_toml_block(f) == "unchanged"  # idempotent
+    assert merge_toml_block(f) == "unchanged"  # idempotent
-    assert _remove_toml_block(f) == "removed"
+    assert remove_toml_block(f) == "removed"
     text = f.read_text()
     assert _CODEX_MCP_HEADER not in text
     assert "[mcp_servers.other]" in text  # only the semble table is removed
@@ -223,7 +229,7 @@ def test_codex_toml_merge_replaces_section_with_inline_comment(tmp_path):
     """_merge_toml_block replaces an existing semble table even when the header has a trailing comment."""
     f = tmp_path / "config.toml"
     f.write_text('[mcp_servers.semble] # added manually\ncommand = "old"\n')
-    assert _merge_toml_block(f) == "updated"
+    assert merge_toml_block(f) == "updated"
     text = f.read_text()
     assert text.count("[mcp_servers.semble]") == 1
@@ -237,14 +243,14 @@ def test_remove_toml_not_found(tmp_path, setup, expected):
     f = tmp_path / "config.toml"
     if setup is not None:
         f.write_text(setup)
-    assert _remove_toml_block(f) == expected
+    assert remove_toml_block(f) == expected
 def test_remove_toml_deletes_file_when_only_semble(tmp_path):
     """_remove_toml_block unlinks the file when removing semble leaves it empty."""
     f = tmp_path / "config.toml"
-    _merge_toml_block(f)
-    _remove_toml_block(f)
+    merge_toml_block(f)
+    remove_toml_block(f)
     assert not f.exists()
@@ -265,7 +271,7 @@ def test_remove_toml_strips_sub_tables(tmp_path, content):
     """_remove_toml_block removes sub-tables like [mcp_servers.semble.tools.search], before or after the main header."""
     f = tmp_path / "config.toml"
     f.write_text(content)
-    assert _remove_toml_block(f) == "removed"
+    assert remove_toml_block(f) == "removed"
     text = f.read_text()
     assert "[mcp_servers.semble]" not in text
     assert "[mcp_servers.semble.tools.search]" not in text
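
The removal tests describe three properties: the `[mcp_servers.semble]` table and its sub-tables are dropped wherever they appear, unrelated tables and top-level keys survive, and the file is deleted when nothing else remains. A line-based sketch with those properties is shown below. It is an illustration under stated assumptions, not the code in `semble.installer.config`: the function name `remove_table`, the header regex, and the `"not-found"` return value are all hypothetical (only `"removed"` appears in the tests).

```python
import re
from pathlib import Path

# Hypothetical sketch. Matches a TOML table header such as
# "[mcp_servers.semble] # comment" and captures the dotted table name.
_HEADER = re.compile(r"^\s*\[\[?([^\]]+)\]\]?")


def remove_table(path: Path, table: str = "mcp_servers.semble") -> str:
    """Drop `table` and its sub-tables from a TOML file, line by line."""
    if not path.exists():
        return "not-found"  # assumed status string
    kept = []
    skipping = False
    removed = False
    for line in path.read_text().splitlines(keepends=True):
        match = _HEADER.match(line)
        if match:
            name = match.group(1).strip()
            # A new header decides whether the following lines are dropped.
            skipping = name == table or name.startswith(table + ".")
            removed = removed or skipping
        if not skipping:
            kept.append(line)
    if not removed:
        return "not-found"
    text = "".join(kept)
    if text.strip():
        path.write_text(text)
    else:
        path.unlink()  # nothing but the removed table was in the file
    return "removed"
```

This sketch does not parse TOML, so a multi-line array whose continuation line begins with `[` would be misread as a header; a real implementation may guard against that.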