codespine 1.0.8__tar.gz → 1.0.9__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (74) hide show
  1. {codespine-1.0.8 → codespine-1.0.9}/PKG-INFO +9 -9
  2. {codespine-1.0.8 → codespine-1.0.9}/README.md +8 -8
  3. {codespine-1.0.8 → codespine-1.0.9}/codespine/__init__.py +1 -1
  4. {codespine-1.0.8 → codespine-1.0.9}/codespine/cli.py +186 -28
  5. {codespine-1.0.8 → codespine-1.0.9}/codespine/config.py +4 -2
  6. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/call_resolver.py +6 -0
  7. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/engine.py +66 -38
  8. {codespine-1.0.8 → codespine-1.0.9}/codespine/mcp/server.py +3 -2
  9. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/PKG-INFO +9 -9
  10. {codespine-1.0.8 → codespine-1.0.9}/pyproject.toml +1 -1
  11. {codespine-1.0.8 → codespine-1.0.9}/tests/test_call_resolver.py +34 -0
  12. {codespine-1.0.8 → codespine-1.0.9}/LICENSE +0 -0
  13. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/__init__.py +0 -0
  14. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/community.py +0 -0
  15. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/context.py +0 -0
  16. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/coupling.py +0 -0
  17. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/crossmodule.py +0 -0
  18. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/deadcode.py +0 -0
  19. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/flow.py +0 -0
  20. {codespine-1.0.8 → codespine-1.0.9}/codespine/analysis/impact.py +0 -0
  21. {codespine-1.0.8 → codespine-1.0.9}/codespine/cache/__init__.py +0 -0
  22. {codespine-1.0.8 → codespine-1.0.9}/codespine/cache/result_cache.py +0 -0
  23. {codespine-1.0.8 → codespine-1.0.9}/codespine/db/__init__.py +0 -0
  24. {codespine-1.0.8 → codespine-1.0.9}/codespine/db/_cypher_compat.py +0 -0
  25. {codespine-1.0.8 → codespine-1.0.9}/codespine/db/duckdb_store.py +0 -0
  26. {codespine-1.0.8 → codespine-1.0.9}/codespine/db/schema.py +0 -0
  27. {codespine-1.0.8 → codespine-1.0.9}/codespine/db/store.py +0 -0
  28. {codespine-1.0.8 → codespine-1.0.9}/codespine/diff/__init__.py +0 -0
  29. {codespine-1.0.8 → codespine-1.0.9}/codespine/diff/branch_diff.py +0 -0
  30. {codespine-1.0.8 → codespine-1.0.9}/codespine/guide.py +0 -0
  31. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/__init__.py +0 -0
  32. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/di_resolver.py +0 -0
  33. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/java_parser.py +0 -0
  34. {codespine-1.0.8 → codespine-1.0.9}/codespine/indexer/symbol_builder.py +0 -0
  35. {codespine-1.0.8 → codespine-1.0.9}/codespine/mcp/__init__.py +0 -0
  36. {codespine-1.0.8 → codespine-1.0.9}/codespine/noise/__init__.py +0 -0
  37. {codespine-1.0.8 → codespine-1.0.9}/codespine/noise/blocklist.py +0 -0
  38. {codespine-1.0.8 → codespine-1.0.9}/codespine/overlay/__init__.py +0 -0
  39. {codespine-1.0.8 → codespine-1.0.9}/codespine/overlay/git_state.py +0 -0
  40. {codespine-1.0.8 → codespine-1.0.9}/codespine/overlay/merge.py +0 -0
  41. {codespine-1.0.8 → codespine-1.0.9}/codespine/overlay/store.py +0 -0
  42. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/__init__.py +0 -0
  43. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/bm25.py +0 -0
  44. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/fuzzy.py +0 -0
  45. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/hybrid.py +0 -0
  46. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/rrf.py +0 -0
  47. {codespine-1.0.8 → codespine-1.0.9}/codespine/search/vector.py +0 -0
  48. {codespine-1.0.8 → codespine-1.0.9}/codespine/sharding/__init__.py +0 -0
  49. {codespine-1.0.8 → codespine-1.0.9}/codespine/sharding/router.py +0 -0
  50. {codespine-1.0.8 → codespine-1.0.9}/codespine/sharding/store.py +0 -0
  51. {codespine-1.0.8 → codespine-1.0.9}/codespine/watch/__init__.py +0 -0
  52. {codespine-1.0.8 → codespine-1.0.9}/codespine/watch/git_hook.py +0 -0
  53. {codespine-1.0.8 → codespine-1.0.9}/codespine/watch/watcher.py +0 -0
  54. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/SOURCES.txt +0 -0
  55. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/dependency_links.txt +0 -0
  56. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/entry_points.txt +0 -0
  57. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/requires.txt +0 -0
  58. {codespine-1.0.8 → codespine-1.0.9}/codespine.egg-info/top_level.txt +0 -0
  59. {codespine-1.0.8 → codespine-1.0.9}/gindex.py +0 -0
  60. {codespine-1.0.8 → codespine-1.0.9}/setup.cfg +0 -0
  61. {codespine-1.0.8 → codespine-1.0.9}/tests/test_branch_diff_normalize.py +0 -0
  62. {codespine-1.0.8 → codespine-1.0.9}/tests/test_community_detection.py +0 -0
  63. {codespine-1.0.8 → codespine-1.0.9}/tests/test_cypher_compat.py +0 -0
  64. {codespine-1.0.8 → codespine-1.0.9}/tests/test_deadcode.py +0 -0
  65. {codespine-1.0.8 → codespine-1.0.9}/tests/test_duckdb_store.py +0 -0
  66. {codespine-1.0.8 → codespine-1.0.9}/tests/test_index_and_hybrid.py +0 -0
  67. {codespine-1.0.8 → codespine-1.0.9}/tests/test_java_parser.py +0 -0
  68. {codespine-1.0.8 → codespine-1.0.9}/tests/test_multimodule_index.py +0 -0
  69. {codespine-1.0.8 → codespine-1.0.9}/tests/test_overlay.py +0 -0
  70. {codespine-1.0.8 → codespine-1.0.9}/tests/test_parse_resilience.py +0 -0
  71. {codespine-1.0.8 → codespine-1.0.9}/tests/test_result_cache.py +0 -0
  72. {codespine-1.0.8 → codespine-1.0.9}/tests/test_search_ranking.py +0 -0
  73. {codespine-1.0.8 → codespine-1.0.9}/tests/test_sharding.py +0 -0
  74. {codespine-1.0.8 → codespine-1.0.9}/tests/test_store_recovery.py +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: codespine
3
- Version: 1.0.8
3
+ Version: 1.0.9
4
4
  Summary: Local Java code intelligence indexer backed by a graph database
5
5
  Author: CodeSpine contributors
6
6
  License: MIT License
@@ -124,8 +124,7 @@ Downloads and caches the embedding model. Only needed once. After this, `--embed
124
124
  codespine analyse /path/to/java-project
125
125
 
126
126
  # 2. (Optional) Run the expensive deep passes: communities, flows, dead code, coupling
127
- # Auto-enabled for repos with ≤ 3,000 files; use --deep to force on larger repos.
128
- codespine analyse /path/to/java-project --deep
127
+ codespine analyse /path/to/java-project --complete --deep
129
128
 
130
129
  # 3. (Optional) Add semantic embeddings for concept-level search
131
130
  codespine analyse /path/to/java-project --embed
@@ -313,8 +312,9 @@ Higher-level tools designed to answer full agent questions in a single call, wit
313
312
  # Indexing
314
313
  codespine analyse <path> # incremental index (default)
315
314
  codespine analyse <path> --full # full re-index from scratch
316
- codespine analyse <path> --deep # + communities, flows, dead code, coupling
317
- codespine analyse <path> --incremental-deep # incremental index + force deep passes
315
+ codespine analyse <path> --budget 90 # fast index with a resolver deadline
316
+ codespine analyse <path> --complete --deep # + communities, flows, dead code, coupling
317
+ codespine analyse <path> --complete --incremental-deep
318
318
  codespine analyse <path> --embed # + vector embeddings
319
319
 
320
320
  # Live watch
@@ -360,7 +360,7 @@ codespine force-reset # emergency: delete all data files
360
360
 
361
361
  `analyse` defaults to incremental mode. Repeat runs only process changed files and are fast.
362
362
 
363
- Deep analysis (`--deep`) now runs automatically for repos with ≤ 3,000 files. For larger repos, pass `--deep` explicitly. Use `--incremental-deep` when you want a fast file-only update but still want communities, flows, dead code, and coupling refreshed.
363
+ `analyse` runs in fast mode by default: it indexes the core graph, publishes that read replica from a detached process, then continues communities, flows, dead code, coupling, and cross-module enrichment in the background. Use `--complete --deep` when you want those passes refreshed before the command returns.
364
364
 
365
365
  ---
366
366
 
@@ -546,12 +546,12 @@ The deep analysis phase covers four passes that are expensive but optional:
546
546
  | Dead code | Finds methods with no callers (Java-aware exemptions) | Cleanup audits |
547
547
  | Change coupling | Analyses git history for co-changed file pairs | `get_change_coupling`, `related` |
548
548
 
549
- **Auto-threshold:** deep analysis runs automatically when the project has ≤ 3,000 Java files. Larger repos get lightweight flow/dead-code passes; full deep analysis requires `--deep`.
549
+ **Fast default:** `codespine analyse` prioritizes a queryable core index. Communities, flows, dead-code, git coupling, and cross-module links are queued in a detached background enrichment job unless you use `--complete`.
550
550
 
551
- **Incremental deep:** `--incremental-deep` combines incremental file indexing with a forced full deep pass — useful after large refactors where you want the call graph refreshed quickly but also want updated communities and coupling.
551
+ **Complete deep:** `--complete --deep` runs the expensive enrichment passes before returning. `--complete --incremental-deep` combines incremental file indexing with a forced full deep pass.
552
552
 
553
553
  ```bash
554
- codespine analyse . --incremental-deep
554
+ codespine analyse . --complete --incremental-deep
555
555
  ```
556
556
 
557
557
  **Embeddings** (`--embed`) are independent of deep analysis. Without them, BM25 + fuzzy search still works. Add embeddings when you need concept-level retrieval ("find retry logic", "find payment processing").
@@ -59,8 +59,7 @@ Downloads and caches the embedding model. Only needed once. After this, `--embed
59
59
  codespine analyse /path/to/java-project
60
60
 
61
61
  # 2. (Optional) Run the expensive deep passes: communities, flows, dead code, coupling
62
- # Auto-enabled for repos with ≤ 3,000 files; use --deep to force on larger repos.
63
- codespine analyse /path/to/java-project --deep
62
+ codespine analyse /path/to/java-project --complete --deep
64
63
 
65
64
  # 3. (Optional) Add semantic embeddings for concept-level search
66
65
  codespine analyse /path/to/java-project --embed
@@ -248,8 +247,9 @@ Higher-level tools designed to answer full agent questions in a single call, wit
248
247
  # Indexing
249
248
  codespine analyse <path> # incremental index (default)
250
249
  codespine analyse <path> --full # full re-index from scratch
251
- codespine analyse <path> --deep # + communities, flows, dead code, coupling
252
- codespine analyse <path> --incremental-deep # incremental index + force deep passes
250
+ codespine analyse <path> --budget 90 # fast index with a resolver deadline
251
+ codespine analyse <path> --complete --deep # + communities, flows, dead code, coupling
252
+ codespine analyse <path> --complete --incremental-deep
253
253
  codespine analyse <path> --embed # + vector embeddings
254
254
 
255
255
  # Live watch
@@ -295,7 +295,7 @@ codespine force-reset # emergency: delete all data files
295
295
 
296
296
  `analyse` defaults to incremental mode. Repeat runs only process changed files and are fast.
297
297
 
298
- Deep analysis (`--deep`) now runs automatically for repos with ≤ 3,000 files. For larger repos, pass `--deep` explicitly. Use `--incremental-deep` when you want a fast file-only update but still want communities, flows, dead code, and coupling refreshed.
298
+ `analyse` runs in fast mode by default: it indexes the core graph, publishes that read replica from a detached process, then continues communities, flows, dead code, coupling, and cross-module enrichment in the background. Use `--complete --deep` when you want those passes refreshed before the command returns.
299
299
 
300
300
  ---
301
301
 
@@ -481,12 +481,12 @@ The deep analysis phase covers four passes that are expensive but optional:
481
481
  | Dead code | Finds methods with no callers (Java-aware exemptions) | Cleanup audits |
482
482
  | Change coupling | Analyses git history for co-changed file pairs | `get_change_coupling`, `related` |
483
483
 
484
- **Auto-threshold:** deep analysis runs automatically when the project has ≤ 3,000 Java files. Larger repos get lightweight flow/dead-code passes; full deep analysis requires `--deep`.
484
+ **Fast default:** `codespine analyse` prioritizes a queryable core index. Communities, flows, dead-code, git coupling, and cross-module links are queued in a detached background enrichment job unless you use `--complete`.
485
485
 
486
- **Incremental deep:** `--incremental-deep` combines incremental file indexing with a forced full deep pass — useful after large refactors where you want the call graph refreshed quickly but also want updated communities and coupling.
486
+ **Complete deep:** `--complete --deep` runs the expensive enrichment passes before returning. `--complete --incremental-deep` combines incremental file indexing with a forced full deep pass.
487
487
 
488
488
  ```bash
489
- codespine analyse . --incremental-deep
489
+ codespine analyse . --complete --incremental-deep
490
490
  ```
491
491
 
492
492
  **Embeddings** (`--embed`) are independent of deep analysis. Without them, BM25 + fuzzy search still works. Add embeddings when you need concept-level retrieval ("find retry logic", "find payment processing").
@@ -1,4 +1,4 @@
1
1
  """CodeSpine package."""
2
2
 
3
3
  __all__ = ["__version__"]
4
- __version__ = "1.0.8"
4
+ __version__ = "1.0.9"
@@ -66,6 +66,24 @@ def _open_store(read_only: bool = True) -> ShardedGraphStore:
66
66
  return ShardedGraphStore(read_only=read_only)
67
67
 
68
68
 
69
+ def _spawn_background_enrichment(path: str) -> bool:
70
+ """Publish the fast index, then enrich it in a detached process."""
71
+ try:
72
+ subprocess.Popen(
73
+ [sys.executable, "-m", "codespine.cli", "enrich-background", path],
74
+ stdin=subprocess.DEVNULL,
75
+ stdout=subprocess.DEVNULL,
76
+ stderr=subprocess.DEVNULL,
77
+ start_new_session=True,
78
+ cwd=os.getcwd(),
79
+ env=os.environ.copy(),
80
+ )
81
+ return True
82
+ except Exception as exc: # noqa: BLE001
83
+ LOGGER.warning("Unable to spawn background enrichment: %s", exc)
84
+ return False
85
+
86
+
69
87
  def _db_size_bytes(path: str) -> int:
70
88
  if os.path.isfile(path):
71
89
  return os.path.getsize(path)
@@ -110,6 +128,7 @@ def _index_shard_group(
110
128
  sg,
111
129
  full: bool,
112
130
  embed: bool,
131
+ deadline: float | None,
113
132
  output_lock: threading.Lock,
114
133
  parallel: bool,
115
134
  ) -> tuple[int, list, int]:
@@ -381,8 +400,9 @@ def _index_shard_group(
381
400
  call_state["shown"] = False
382
401
  elapsed_s = (now - call_state["started_at"]) if call_state["started_at"] else 0.0
383
402
  n = int(payload.get("calls_resolved", 0))
403
+ suffix = " partial" if payload.get("partial") else ""
384
404
  with output_lock:
385
- _phase(f"{prefix}Tracing calls...", f"{n} calls resolved ({elapsed_s:.1f}s)")
405
+ _phase(f"{prefix}Tracing calls...", f"{n} calls resolved{suffix} ({elapsed_s:.1f}s)")
386
406
  return
387
407
  if event == "resolve_types_start":
388
408
  with output_lock:
@@ -390,14 +410,20 @@ def _index_shard_group(
390
410
  return
391
411
  if event == "resolve_types_done":
392
412
  n = int(payload.get("type_relationships", 0))
413
+ suffix = " partial" if payload.get("partial") else ""
393
414
  with output_lock:
394
- _phase(f"{prefix}Analyzing types...", f"{n} type relationships")
415
+ _phase(f"{prefix}Analyzing types...", f"{n} type relationships{suffix}")
395
416
  return
396
417
 
397
418
  shard_store = sg.shard(project_id)
398
419
  indexer = JavaIndexer(shard_store)
399
420
  result = indexer.index_project(
400
- mod_path, full=full, progress=_progress, project_id=project_id, embed=embed
421
+ mod_path,
422
+ full=full,
423
+ progress=_progress,
424
+ project_id=project_id,
425
+ embed=embed,
426
+ deadline=deadline,
401
427
  )
402
428
  results.append(result)
403
429
  total_files += result.files_found
@@ -466,7 +492,21 @@ def main() -> None:
466
492
  @main.command()
467
493
  @click.argument("path", type=click.Path(exists=True))
468
494
  @click.option("--full/--incremental", default=False, show_default=True)
469
- @click.option("--deep/--no-deep", default=False, show_default=True, help="Run expensive global analyses (auto-on for repos ≤3 k files).")
495
+ @click.option("--deep/--no-deep", default=False, show_default=True, help="Run expensive global analyses when used with --complete.")
496
+ @click.option(
497
+ "--fast/--complete",
498
+ default=True,
499
+ show_default=True,
500
+ help="Fast mode returns after the core index is queryable; complete mode runs enrichment in the foreground.",
501
+ )
502
+ @click.option(
503
+ "--budget",
504
+ "budget_seconds",
505
+ default=90.0,
506
+ show_default=True,
507
+ type=float,
508
+ help="Foreground time budget in seconds for fast mode; use 0 to disable the resolver deadline.",
509
+ )
470
510
  @click.option(
471
511
  "--incremental-deep",
472
512
  is_flag=True,
@@ -475,17 +515,25 @@ def main() -> None:
475
515
  )
476
516
  @click.option(
477
517
  "--embed/--no-embed",
478
- default=True,
518
+ default=False,
479
519
  show_default=True,
480
- help="Generate vector embeddings. Uses sentence-transformers if installed (pip install codespine[ml]), otherwise falls back to hash-based vectors.",
520
+ help="Generate vector embeddings. Off by default so analyse stays fast; rerun with --embed when semantic vectors are needed.",
481
521
  )
482
522
  @click.option("--allow-running", is_flag=True, hidden=True, help="Skip MCP running check (used by MCP analyse_project tool).")
483
- def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bool, allow_running: bool) -> None:
523
+ def analyse(
524
+ path: str,
525
+ full: bool,
526
+ deep: bool,
527
+ fast: bool,
528
+ budget_seconds: float,
529
+ incremental_deep: bool,
530
+ embed: bool,
531
+ allow_running: bool,
532
+ ) -> None:
484
533
  """Index a local Java project (auto-detects workspace / Maven / Gradle layout).
485
534
 
486
- Embeddings are generated by default. If sentence-transformers is installed
487
- (pip install codespine[ml]), high-quality semantic vectors are used; otherwise
488
- a fast hash-based fallback provides basic vector search.
535
+ Fast mode indexes the core Java graph and returns quickly. Use --complete
536
+ for foreground communities, flows, dead-code, and git-coupling enrichment.
489
537
  """
490
538
  if not allow_running and _is_running():
491
539
  click.secho("Stop MCP first ('codespine stop') to index.", fg="yellow")
@@ -494,6 +542,18 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
494
542
  started = time.perf_counter()
495
543
  abs_path = os.path.abspath(path)
496
544
 
545
+ if fast and (deep or incremental_deep):
546
+ click.secho(
547
+ "Fast mode runs deep analysis in the background. Use --complete --deep to wait for it.",
548
+ fg="yellow",
549
+ )
550
+
551
+ budget_deadline = (
552
+ started + budget_seconds
553
+ if fast and budget_seconds and budget_seconds > 0
554
+ else None
555
+ )
556
+
497
557
  # Warn about hash fallback early so users know to install [ml]
498
558
  if embed:
499
559
  from codespine.search.vector import _load_model
@@ -610,7 +670,7 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
610
670
  for s_idx, group in shard_groups.items():
611
671
  f = ex.submit(
612
672
  _index_shard_group,
613
- s_idx, group, sg, full, embed, output_lock, True,
673
+ s_idx, group, sg, full, embed, budget_deadline, output_lock, True,
614
674
  )
615
675
  futures_map[f] = s_idx
616
676
 
@@ -632,7 +692,7 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
632
692
  only_shard_idx = next(iter(shard_groups))
633
693
  only_group = shard_groups[only_shard_idx]
634
694
  _, all_results, total_files_found = _index_shard_group(
635
- only_shard_idx, only_group, sg, full, embed, output_lock, False,
695
+ only_shard_idx, only_group, sg, full, embed, budget_deadline, output_lock, False,
636
696
  )
637
697
  if all_results:
638
698
  last_result = all_results[-1]
@@ -652,7 +712,9 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
652
712
  root_shard_store = sg.shard(root_project_id)
653
713
 
654
714
  # ── Cross-module call linking ──────────────────────────────────────
655
- if is_multi and len(modules_with_ids) > 1:
715
+ if fast and is_multi and len(modules_with_ids) > 1:
716
+ _phase("Cross-module linking...", "skipped (fast mode; use --complete)")
717
+ elif is_multi and len(modules_with_ids) > 1:
656
718
  xmod_label = "Cross-module linking..."
657
719
  _live_phase(xmod_label, "running")
658
720
  xmod_pids = [pid for _, pid in modules_with_ids]
@@ -669,7 +731,7 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
669
731
  dead: list[dict] = []
670
732
  coupling_pairs: list[dict] = []
671
733
 
672
- should_run_deep = deep or incremental_deep or total_files_found <= 3000
734
+ should_run_deep = (not fast) and (deep or incremental_deep or total_files_found <= 3000)
673
735
  if should_run_deep:
674
736
  comm_label = "Detecting communities..."
675
737
  _live_phase(comm_label, "running")
@@ -707,6 +769,11 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
707
769
  progress=lambda s: _live_phase(coup_label, s),
708
770
  )
709
771
  _finish_phase(coup_label, f"{len(coupling_pairs)} coupled file pairs")
772
+ elif fast:
773
+ _phase("Detecting communities...", "queued in background")
774
+ _phase("Detecting execution flows...", "queued in background")
775
+ _phase("Finding dead code...", "queued in background")
776
+ _phase("Analyzing git history...", "queued in background")
710
777
  else:
711
778
  # Run lightweight versions of flow tracing and dead code from the call
712
779
  # graph already built — no community detection or coupling (those are
@@ -768,30 +835,121 @@ def analyse(path: str, full: bool, deep: bool, incremental_deep: bool, embed: bo
768
835
  fg="green",
769
836
  )
770
837
 
771
- # Detect unresolved imports → hint about unindexed sibling projects
772
- try:
773
- unresolved = JavaIndexer.detect_unresolved_imports(root_shard_store)
774
- if unresolved:
775
- click.echo()
776
- click.secho("⚠ Unresolved imports — consider indexing these projects:", fg="yellow")
777
- for pkg, samples in sorted(unresolved.items())[:8]:
778
- click.echo(f" {pkg} (e.g. {samples[0]})")
779
- except Exception:
780
- pass # best-effort
838
+ # Detect unresolved imports → hint about unindexed sibling projects.
839
+ # This is useful, but it is still another global query, so fast mode leaves
840
+ # it out of the foreground path.
841
+ if not fast:
842
+ try:
843
+ unresolved = JavaIndexer.detect_unresolved_imports(root_shard_store)
844
+ if unresolved:
845
+ click.echo()
846
+ click.secho("⚠ Unresolved imports — consider indexing these projects:", fg="yellow")
847
+ for pkg, samples in sorted(unresolved.items())[:8]:
848
+ click.echo(f" {pkg} (e.g. {samples[0]})")
849
+ except Exception:
850
+ pass # best-effort
781
851
 
782
852
  # Publish a read replica so MCP and read-only CLI commands (search, stats…)
783
853
  # run against an isolated snapshot rather than competing with the write
784
854
  # process's buffer pool. Snapshot all open shards concurrently.
785
855
  snap_label = "Publishing read replica..."
786
- _live_phase(snap_label, "copying")
787
- root_shard_store._recycle_conn()
788
- sg.snapshot_all(background=False)
789
- _finish_phase(snap_label, "MCP will reload automatically")
856
+ for store in sg.open_shards():
857
+ recycle = getattr(store, "_recycle_conn", None)
858
+ if callable(recycle):
859
+ recycle()
860
+ if fast and _spawn_background_enrichment(abs_path):
861
+ _phase(snap_label, "core snapshot now; enrichment continues in background")
862
+ else:
863
+ _live_phase(snap_label, "copying")
864
+ sg.snapshot_all(background=False)
865
+ _finish_phase(snap_label, "MCP will reload automatically")
790
866
 
791
867
  # Restore original SIGINT handler now that we've finished cleanly.
792
868
  signal.signal(signal.SIGINT, _old_sigint_handler)
793
869
 
794
870
 
871
+ @main.command("publish-snapshot", hidden=True)
872
+ def publish_snapshot() -> None:
873
+ """Publish sharded read replicas for a recently completed analyse run."""
874
+ sg = ShardedGraphStore(read_only=False)
875
+ sg.snapshot_all(background=False)
876
+
877
+
878
+ @main.command("enrich-background", hidden=True)
879
+ @click.argument("path", type=click.Path(exists=True))
880
+ def enrich_background(path: str) -> None:
881
+ """Run expensive post-index graph enrichment outside the analyse foreground."""
882
+ abs_path = os.path.abspath(path)
883
+ LOGGER.info("Background enrichment starting for %s", abs_path)
884
+
885
+ project_roots = JavaIndexer.detect_projects_in_workspace(abs_path)
886
+ modules_with_ids: list[tuple[str, str]] = []
887
+ for proj_root in project_roots:
888
+ proj_name = os.path.basename(proj_root)
889
+ module_dirs = JavaIndexer.detect_modules(proj_root)
890
+ is_multi_module = not (len(module_dirs) == 1 and module_dirs[0] == proj_root)
891
+ if is_multi_module:
892
+ for m in module_dirs:
893
+ modules_with_ids.append((m, f"{proj_name}::{os.path.basename(m)}"))
894
+ else:
895
+ modules_with_ids.append((proj_root, proj_name))
896
+
897
+ root_basename = os.path.basename(abs_path)
898
+ root_project_id = modules_with_ids[-1][1] if modules_with_ids else root_basename
899
+ is_multi = len(modules_with_ids) > 1
900
+ xmod_pids = [pid for _, pid in modules_with_ids]
901
+
902
+ sg = ShardedGraphStore(read_only=False)
903
+ root_shard_store = sg.shard(root_project_id)
904
+
905
+ try:
906
+ # Publish the fast core graph first so MCP/search can use it while the
907
+ # more expensive enrichment keeps working.
908
+ sg.snapshot_all(background=False)
909
+
910
+ if is_multi and len(xmod_pids) > 1:
911
+ xmod_edges = link_cross_module_calls(
912
+ root_shard_store,
913
+ project_ids=xmod_pids,
914
+ progress=lambda s: LOGGER.info("Cross-module linking: %s", s),
915
+ )
916
+ LOGGER.info("Background cross-module linking wrote %d edges", xmod_edges)
917
+
918
+ communities = detect_communities(
919
+ root_shard_store,
920
+ progress=lambda s: LOGGER.info("Community detection: %s", s),
921
+ )
922
+ LOGGER.info("Background community detection found %d clusters", len(communities))
923
+
924
+ flows = trace_execution_flows(
925
+ root_shard_store,
926
+ progress=lambda s: LOGGER.info("Execution flow tracing: %s", s),
927
+ )
928
+ LOGGER.info("Background flow tracing found %d flows", len(flows))
929
+
930
+ dead = detect_dead_code(root_shard_store, limit=500)
931
+ LOGGER.info("Background dead-code scan found %d candidates", _dead_result_count(dead))
932
+
933
+ root_shard_store.clear_coupling()
934
+ coupling_project = root_basename if is_multi else root_project_id
935
+ coupling_pairs = compute_coupling(
936
+ root_shard_store,
937
+ abs_path,
938
+ coupling_project,
939
+ days=SETTINGS.default_coupling_days,
940
+ min_strength=SETTINGS.default_min_coupling_strength,
941
+ min_cochanges=SETTINGS.default_min_cochanges,
942
+ progress=lambda s: LOGGER.info("Git coupling: %s", s),
943
+ )
944
+ LOGGER.info("Background coupling analysis found %d pairs", len(coupling_pairs))
945
+
946
+ sg.snapshot_all(background=False)
947
+ LOGGER.info("Background enrichment finished for %s", abs_path)
948
+ except Exception as exc: # noqa: BLE001
949
+ LOGGER.exception("Background enrichment failed for %s: %s", abs_path, exc)
950
+ raise
951
+
952
+
795
953
  @main.command()
796
954
  @click.argument("query")
797
955
  @click.option("--k", default=20, show_default=True, type=int)
@@ -29,8 +29,10 @@ class Settings:
29
29
  rrf_k: int = 60
30
30
  semantic_candidate_pool: int = 2000
31
31
  write_batch_size: int = 500
32
- index_file_batch_size: int = 20
33
- edge_write_batch_size: int = 500
32
+ index_file_batch_size: int = 200
33
+ index_method_batch_size: int = 2000
34
+ index_symbol_batch_size: int = 2000
35
+ edge_write_batch_size: int = 5000
34
36
  default_coupling_days: int = 5
35
37
  default_min_coupling_strength: float = 0.3
36
38
  default_min_cochanges: int = 3
@@ -1,5 +1,6 @@
1
1
  from __future__ import annotations
2
2
 
3
+ import time
3
4
  from collections import defaultdict
4
5
  from typing import Iterator
5
6
 
@@ -58,6 +59,7 @@ def resolve_calls(
58
59
  class_catalog: dict[str, list[str]],
59
60
  *,
60
61
  scan_counter: list[int] | None = None,
62
+ deadline: float | None = None,
61
63
  ) -> Iterator[tuple[str, str, float, str]]:
62
64
  """Resolve call names to known method ids.
63
65
 
@@ -84,6 +86,8 @@ def resolve_calls(
84
86
  class_method_index_by_fqcn[class_fqcn][key].append(method_id)
85
87
 
86
88
  for source_id, call_sites in calls.items():
89
+ if deadline is not None and time.perf_counter() >= deadline:
90
+ return
87
91
  if scan_counter is not None:
88
92
  scan_counter[0] += 1
89
93
  src_meta = method_catalog.get(source_id, {})
@@ -94,6 +98,8 @@ def resolve_calls(
94
98
  field_types = src_ctx.get("field_types", {}) or {}
95
99
 
96
100
  for call in call_sites:
101
+ if deadline is not None and time.perf_counter() >= deadline:
102
+ return
97
103
  call_name = call.name
98
104
 
99
105
  key = (call_name, int(call.arg_count))
@@ -190,6 +190,7 @@ class JavaIndexer:
190
190
  progress: Callable[[str, dict], None] | None = None,
191
191
  project_id: str | None = None,
192
192
  embed: bool = True,
193
+ deadline: float | None = None,
193
194
  ) -> IndexResult:
194
195
  root_path = os.path.abspath(root_path)
195
196
  if project_id is None:
@@ -651,13 +652,13 @@ class JavaIndexer:
651
652
  with self.store.transaction():
652
653
  self.store.upsert_classes_batch(class_rows)
653
654
  self.store._recycle_conn()
654
- _METHOD_SUB_BATCH = 200
655
+ _METHOD_SUB_BATCH = max(1, int(getattr(SETTINGS, "index_method_batch_size", 2000)))
655
656
  _db_phase_holder[0] = "writing methods"
656
657
  for method_sub in self._chunked(method_rows, _METHOD_SUB_BATCH):
657
658
  with self.store.transaction():
658
659
  self.store.upsert_methods_batch(method_sub)
659
660
  self.store._recycle_conn()
660
- _SYMBOL_SUB_BATCH = 200
661
+ _SYMBOL_SUB_BATCH = max(1, int(getattr(SETTINGS, "index_symbol_batch_size", 2000)))
661
662
  _db_phase_holder[0] = "writing symbols"
662
663
  for symbol_sub in self._chunked(symbol_rows, _SYMBOL_SUB_BATCH):
663
664
  with self.store.transaction():
@@ -686,6 +687,9 @@ class JavaIndexer:
686
687
  elapsed=time.perf_counter() - _db_start,
687
688
  )
688
689
 
690
+ def _deadline_expired() -> bool:
691
+ return deadline is not None and time.perf_counter() >= deadline
692
+
689
693
  self._emit(progress, "resolve_calls_start")
690
694
 
691
695
  # ── Heartbeat thread ──────────────────────────────────────────────
@@ -693,43 +697,47 @@ class JavaIndexer:
693
697
  # many seconds on large repos with common method names. A daemon
694
698
  # heartbeat thread fires every 2 s so the CLI progress spinner keeps
695
699
  # ticking even during those silent stretches.
696
- _scan_counter: list[int] = [0]
697
- _edges_counter: list[int] = [0]
698
- _hb_stop = threading.Event()
699
- _resolve_start = time.perf_counter()
700
-
701
- def _heartbeat_worker() -> None:
702
- while not _hb_stop.wait(2.0):
703
- self._emit(
704
- progress,
705
- "resolve_calls_heartbeat",
706
- scanned=_scan_counter[0],
707
- edges=_edges_counter[0],
708
- elapsed=time.perf_counter() - _resolve_start,
709
- )
700
+ best_calls: dict[tuple[str, str], tuple[float, str]] = {}
701
+ partial_calls = _deadline_expired()
702
+ if not partial_calls:
703
+ _scan_counter: list[int] = [0]
704
+ _edges_counter: list[int] = [0]
705
+ _hb_stop = threading.Event()
706
+ _resolve_start = time.perf_counter()
707
+
708
+ def _heartbeat_worker() -> None:
709
+ while not _hb_stop.wait(2.0):
710
+ self._emit(
711
+ progress,
712
+ "resolve_calls_heartbeat",
713
+ scanned=_scan_counter[0],
714
+ edges=_edges_counter[0],
715
+ elapsed=time.perf_counter() - _resolve_start,
716
+ )
710
717
 
711
- _hb_thread = threading.Thread(
712
- target=_heartbeat_worker,
713
- daemon=True,
714
- name="codespine-resolver-heartbeat",
715
- )
716
- _hb_thread.start()
718
+ _hb_thread = threading.Thread(
719
+ target=_heartbeat_worker,
720
+ daemon=True,
721
+ name="codespine-resolver-heartbeat",
722
+ )
723
+ _hb_thread.start()
717
724
 
718
- # Deduplicate (src, dst) pairs — the same pair can appear many times
719
- # when a method calls another method at multiple call sites.
720
- # Keep the highest-confidence resolution to avoid N writes per pair.
721
- best_calls: dict[tuple[str, str], tuple[float, str]] = {}
722
- try:
723
- for src, dst, confidence, reason in resolve_calls(
724
- method_catalog, method_calls, method_context, class_catalog,
725
- scan_counter=_scan_counter,
726
- ):
727
- key = (src, dst)
728
- if key not in best_calls or confidence > best_calls[key][0]:
729
- best_calls[key] = (confidence, reason)
730
- finally:
731
- _hb_stop.set()
732
- _hb_thread.join(timeout=3.0)
725
+ # Deduplicate (src, dst) pairs — the same pair can appear many times
726
+ # when a method calls another method at multiple call sites.
727
+ # Keep the highest-confidence resolution to avoid N writes per pair.
728
+ try:
729
+ for src, dst, confidence, reason in resolve_calls(
730
+ method_catalog, method_calls, method_context, class_catalog,
731
+ scan_counter=_scan_counter,
732
+ deadline=deadline,
733
+ ):
734
+ key = (src, dst)
735
+ if key not in best_calls or confidence > best_calls[key][0]:
736
+ best_calls[key] = (confidence, reason)
737
+ partial_calls = _deadline_expired()
738
+ finally:
739
+ _hb_stop.set()
740
+ _hb_thread.join(timeout=3.0)
733
741
 
734
742
  # Stream writes in batches — never hold the full set in RAM.
735
743
  call_buf: list[dict] = []
@@ -751,9 +759,29 @@ class JavaIndexer:
751
759
  self.store.add_calls_batch(call_buf)
752
760
  calls_resolved += len(call_buf)
753
761
  self.store._recycle_conn()
754
- self._emit(progress, "resolve_calls_done", calls_resolved=calls_resolved)
762
+ self._emit(
763
+ progress,
764
+ "resolve_calls_done",
765
+ calls_resolved=calls_resolved,
766
+ partial=partial_calls,
767
+ )
755
768
 
756
769
  self._emit(progress, "resolve_types_start")
770
+ if _deadline_expired():
771
+ self._emit(progress, "resolve_types_done", type_relationships=0, partial=True)
772
+ self._emit(progress, "di_done", injections=0, interface_bindings=0, partial=True)
773
+ self._prune_meta_cache(meta_cache, current_file_ids)
774
+ self._save_file_meta_cache(project_id, meta_cache)
775
+ return IndexResult(
776
+ project_id=project_id,
777
+ files_found=len(current_files),
778
+ files_indexed=files_indexed,
779
+ classes_indexed=classes_indexed,
780
+ methods_indexed=methods_indexed,
781
+ calls_resolved=calls_resolved,
782
+ type_relationships=0,
783
+ embeddings_generated=classes_indexed + methods_indexed if embed else 0,
784
+ )
757
785
  type_rows = self._build_inheritance_edges(
758
786
  class_meta,
759
787
  class_catalog,
@@ -1394,7 +1394,8 @@ def build_mcp_server(store, repo_path_provider):
1394
1394
  Parameters:
1395
1395
  path – Absolute or relative path to the project/workspace to index.
1396
1396
  full – If True, re-index every file even if unchanged (default: incremental).
1397
- deep – If True, also run community detection, flows, and coupling (slower).
1397
+ deep – If True, run complete foreground community, flow, dead-code,
1398
+ and coupling enrichment (slower).
1398
1399
  embed – If True, generate vector embeddings for semantic search (slow when
1399
1400
  sentence-transformers is installed; BM25/fuzzy search works without them).
1400
1401
 
@@ -1422,7 +1423,7 @@ def build_mcp_server(store, repo_path_provider):
1422
1423
  else:
1423
1424
  cmd.append("--incremental")
1424
1425
  if deep:
1425
- cmd.append("--deep")
1426
+ cmd.extend(["--complete", "--deep"])
1426
1427
  if embed:
1427
1428
  cmd.append("--embed")
1428
1429
  else:
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: codespine
3
- Version: 1.0.8
3
+ Version: 1.0.9
4
4
  Summary: Local Java code intelligence indexer backed by a graph database
5
5
  Author: CodeSpine contributors
6
6
  License: MIT License
@@ -124,8 +124,7 @@ Downloads and caches the embedding model. Only needed once. After this, `--embed
124
124
  codespine analyse /path/to/java-project
125
125
 
126
126
  # 2. (Optional) Run the expensive deep passes: communities, flows, dead code, coupling
127
- # Auto-enabled for repos with ≤ 3,000 files; use --deep to force on larger repos.
128
- codespine analyse /path/to/java-project --deep
127
+ codespine analyse /path/to/java-project --complete --deep
129
128
 
130
129
  # 3. (Optional) Add semantic embeddings for concept-level search
131
130
  codespine analyse /path/to/java-project --embed
@@ -313,8 +312,9 @@ Higher-level tools designed to answer full agent questions in a single call, wit
313
312
  # Indexing
314
313
  codespine analyse <path> # incremental index (default)
315
314
  codespine analyse <path> --full # full re-index from scratch
316
- codespine analyse <path> --deep # + communities, flows, dead code, coupling
317
- codespine analyse <path> --incremental-deep # incremental index + force deep passes
315
+ codespine analyse <path> --budget 90 # fast index with a resolver deadline
316
+ codespine analyse <path> --complete --deep # + communities, flows, dead code, coupling
317
+ codespine analyse <path> --complete --incremental-deep
318
318
  codespine analyse <path> --embed # + vector embeddings
319
319
 
320
320
  # Live watch
@@ -360,7 +360,7 @@ codespine force-reset # emergency: delete all data files
360
360
 
361
361
  `analyse` defaults to incremental mode. Repeat runs only process changed files and are fast.
362
362
 
363
- Deep analysis (`--deep`) now runs automatically for repos with 3,000 files. For larger repos, pass `--deep` explicitly. Use `--incremental-deep` when you want a fast file-only update but still want communities, flows, dead code, and coupling refreshed.
363
+ `analyse` runs in fast mode by default: it indexes the core graph, publishes that read replica from a detached process, then continues communities, flows, dead code, coupling, and cross-module enrichment in the background. Use `--complete --deep` when you want those passes refreshed before the command returns.
364
364
 
365
365
  ---
366
366
 
@@ -546,12 +546,12 @@ The deep analysis phase covers four passes that are expensive but optional:
546
546
  | Dead code | Finds methods with no callers (Java-aware exemptions) | Cleanup audits |
547
547
  | Change coupling | Analyses git history for co-changed file pairs | `get_change_coupling`, `related` |
548
548
 
549
- **Auto-threshold:** deep analysis runs automatically when the project has 3,000 Java files. Larger repos get lightweight flow/dead-code passes; full deep analysis requires `--deep`.
549
+ **Fast default:** `codespine analyse` prioritizes a queryable core index. Communities, flows, dead-code, git coupling, and cross-module links are queued in a detached background enrichment job unless you use `--complete`.
550
550
 
551
- **Incremental deep:** `--incremental-deep` combines incremental file indexing with a forced full deep pass — useful after large refactors where you want the call graph refreshed quickly but also want updated communities and coupling.
551
+ **Complete deep:** `--complete --deep` runs the expensive enrichment passes before returning. `--complete --incremental-deep` combines incremental file indexing with a forced full deep pass.
552
552
 
553
553
  ```bash
554
- codespine analyse . --incremental-deep
554
+ codespine analyse . --complete --incremental-deep
555
555
  ```
556
556
 
557
557
  **Embeddings** (`--embed`) are independent of deep analysis. Without them, BM25 + fuzzy search still works. Add embeddings when you need concept-level retrieval ("find retry logic", "find payment processing").
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "codespine"
7
- version = "1.0.8"
7
+ version = "1.0.9"
8
8
  description = "Local Java code intelligence indexer backed by a graph database"
9
9
  readme = "README.md"
10
10
  requires-python = ">=3.10"
@@ -1,3 +1,4 @@
1
+ import time
1
2
  from types import SimpleNamespace
2
3
 
3
4
  from codespine.indexer.call_resolver import resolve_calls
@@ -41,3 +42,36 @@ def test_resolver_prefers_receiver_type_and_arity():
41
42
  out = list(resolve_calls(method_catalog, calls, method_context, class_catalog))
42
43
  assert ("src", "m1", 1.0, "receiver_this_exact") in out
43
44
  assert ("src", "m3", 0.8, "receiver_method_match") in out
45
+
46
+
47
+ def test_resolver_stops_at_deadline():
48
+ method_catalog = {
49
+ "src": {
50
+ "name": "entry",
51
+ "param_count": 0,
52
+ "class_id": "c_service",
53
+ "class_fqcn": "com.example.Service",
54
+ "signature": "entry()",
55
+ },
56
+ "m1": {
57
+ "name": "run",
58
+ "param_count": 0,
59
+ "class_id": "c_service",
60
+ "class_fqcn": "com.example.Service",
61
+ "signature": "run()",
62
+ },
63
+ }
64
+ calls = {"src": [SimpleNamespace(name="run", receiver="this", arg_count=0)]}
65
+ method_context = {"src": {"class_id": "c_service", "class_fqcn": "com.example.Service"}}
66
+
67
+ out = list(
68
+ resolve_calls(
69
+ method_catalog,
70
+ calls,
71
+ method_context,
72
+ {"Service": ["com.example.Service"]},
73
+ deadline=time.perf_counter() - 1,
74
+ )
75
+ )
76
+
77
+ assert out == []
File without changes
File without changes
File without changes
File without changes