PyPI - code-context-engine - Versions diffs - 0.4.1__py3-none-any.whl → 0.4.2__py3-none-any.whl - Mend

code-context-engine 0.4.1py3-none-any.whl → 0.4.2py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8) hide show

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/METADATA RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: code-context-engine
-Version: 0.4.1
+Version: 0.4.2
 Summary: Index your codebase. AI searches instead of re-reading files. Save 70%+ on tokens. Works with Claude Code, Cursor, VS Code, Gemini CLI, and Codex.
 Author-email: Fazle Elahee <felahee@gmail.com>, Raj <rajkumar.sakti@gmail.com>
 License-Expression: MIT
@@ -53,7 +53,7 @@ Dynamic: license-file
 <h1 align="center">Code Context Engine</h1>
 <p align="center">
-  <strong>Index your codebase. AI searches instead of re-reading files. Save 70%+ on tokens.</strong>
+  <strong>Index your codebase. AI searches instead of re-reading files. 93% token savings, benchmarked.</strong>
 </p>
 <p align="center">
@@ -65,16 +65,20 @@ Dynamic: license-file
 </p>
 <p align="center">
-  <img src="https://img.shields.io/badge/Claude_Code-black?style=for-the-badge&logo=anthropic&logoColor=white" alt="Claude Code">
-  <img src="https://img.shields.io/badge/VS_Code-007ACC?style=for-the-badge&logo=visual-studio-code&logoColor=white" alt="VS Code">
-  <img src="https://img.shields.io/badge/Cursor-000000?style=for-the-badge&logo=cursor&logoColor=white" alt="Cursor">
-  <img src="https://img.shields.io/badge/Gemini_CLI-4285F4?style=for-the-badge&logo=google&logoColor=white" alt="Gemini CLI">
-  <img src="https://img.shields.io/badge/Codex_CLI-412991?style=for-the-badge&logo=openai&logoColor=white" alt="Codex CLI">
+  <strong>Works with your editor</strong>
+</p>
+<p align="center">
+  <a href="#install-and-see-savings-in-60-seconds"><img src="https://img.shields.io/badge/Claude_Code-D4A27F?style=for-the-badge&logo=anthropic&logoColor=black" alt="Claude Code" height="36"></a>&nbsp;&nbsp;
+  <a href="#install-and-see-savings-in-60-seconds"><img src="https://img.shields.io/badge/VS_Code-007ACC?style=for-the-badge&logo=visual-studio-code&logoColor=white" alt="VS Code" height="36"></a>&nbsp;&nbsp;
+  <a href="#install-and-see-savings-in-60-seconds"><img src="https://img.shields.io/badge/Cursor-000000?style=for-the-badge&logo=cursor&logoColor=white" alt="Cursor" height="36"></a>&nbsp;&nbsp;
+  <a href="#install-and-see-savings-in-60-seconds"><img src="https://img.shields.io/badge/Gemini_CLI-4285F4?style=for-the-badge&logo=google&logoColor=white" alt="Gemini CLI" height="36"></a>&nbsp;&nbsp;
+  <a href="#install-and-see-savings-in-60-seconds"><img src="https://img.shields.io/badge/Codex_CLI-412991?style=for-the-badge&logo=openai&logoColor=white" alt="Codex CLI" height="36"></a>
 </p>
 <p align="center">
   One command. Index your codebase. Your AI coding agent searches instead of reading entire files.<br>
-  Works with Claude Code, Cursor, VS Code, Gemini CLI, and OpenAI Codex. Local, zero-cloud.
+  Zero-cloud, zero-config. <code>cce init</code> auto-detects your editor.
 </p>
 <p align="center">
@@ -108,12 +112,12 @@ Multiple editors in the same project? All get configured in one command.
 ```
   my-project · 38 queries
-  ⛁ ⛁ ⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶  70% tokens saved
+  ⛁ ⛁ ⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶  93% tokens saved
   Without CCE   48.0k  tokens   $0.24
-  With CCE      14.2k  tokens   $0.07
+  With CCE       3.4k  tokens   $0.02
   ──────────────────────────────────────────
-  Saved         33.8k  tokens   $0.17
+  Saved         44.6k  tokens   $0.22
   Cost estimate based on Opus input pricing ($5/1M tokens)
 ```
@@ -122,7 +126,7 @@ Multiple editors in the same project? All get configured in one command.
 ## Why this matters
-Input tokens are 85-95% of your Claude Code bill. CCE cuts them by 70-98%.
+Input tokens are 85-95% of your Claude Code bill. CCE cuts them by 93% ([benchmarked on FastAPI](#benchmark-fastapi-independently-verified)).
 ```
 Without CCE:    Claude reads payments.py + shipping.py   = 45,000 tokens
@@ -138,6 +142,42 @@ With CCE:       context_search "payment flow"            =    800 tokens
 ---
+## Benchmark: FastAPI (independently verified)
+We benchmarked CCE against [FastAPI](https://github.com/fastapi/fastapi) (48 source files, 19K lines of Python) with 20 real coding questions. No cherry-picking, no synthetic queries.
+**Methodology:** For each query, "without CCE" means reading the full content of every file the query touches. "With CCE" means the relevant chunks after compression. This is conservative (agents often read more files than needed).
+| Metric | Result |
+|--------|--------|
+| **Retrieval** | **93%** savings (75,355 → 5,381 tokens/query) |
+| **+ Compression** | **90%** additional (5,381 → 541 tokens/query) |
+| **Combined** | **99.3%** (75,355 → 541 tokens/query) |
+| Recall@10 (found the right files) | 0.80 |
+| Precision@10 | 0.30 |
+| Latency p50 | 0.4ms |
+| Queries tested | 20 |
+### Per-Layer Savings (each measured independently)
+| Layer | What it does | Savings | Method |
+|-------|-------------|---------|--------|
+| **Retrieval** | Full files → relevant code chunks | 93% | measured |
+| **Chunk Compression** | Raw chunks → signatures + docstrings | 90% | measured |
+| **Output Compression** | Reduces Claude's reply length | 65% | estimated |
+| **Grammar** | Drops articles/fillers from memory text | 13% | measured |
+**Reproduce it yourself:**
+```bash
+pip install code-context-engine
+python benchmarks/run_benchmark.py --repo https://github.com/fastapi/fastapi.git --source-dir fastapi
+```
+Full results in [`benchmarks/results/fastapi.md`](benchmarks/results/fastapi.md). Queries and methodology in [`benchmarks/`](benchmarks/).
+---
 ## What you get
 **9 MCP tools** that Claude uses automatically:
@@ -188,7 +228,7 @@ Re-indexing after edits takes under 1 second (96% embedding cache hit rate). Git
 Output compression tools (like Caveman) save 20-75% on output tokens. Output is 5-15% of your bill. Net savings: ~11%.
-CCE saves 70-98% on **input** tokens. Input is 85-95% of your bill. Net savings: ~77%.
+CCE saves on **input** tokens (93% retrieval + 90% compression on FastAPI, [independently benchmarked](#benchmark-fastapi-independently-verified)). Input is 85-95% of your bill.
 ### It actually understands your code
@@ -200,7 +240,7 @@ Not a text search. Tree-sitter AST parsing creates semantic chunks. Hybrid retri
 ### It tracks real savings
-Not estimates. Actual tokens served vs full-file baseline, broken down by 7 buckets (retrieval, compression, output, memory, grammar, summarization, progressive disclosure). Dollar costs fetched from Anthropic's pricing page.
+Not estimates. Actual tokens served vs full-file baseline, broken down by buckets (retrieval, compression, output, memory, grammar). Dollar costs fetched from Anthropic's pricing page. Savings summary shown at every session start.
 ### It is secure by default
@@ -358,6 +398,8 @@ No GPU required. Embedding model runs on CPU via ONNX Runtime.
 - [x] Clean uninstall (removes all CCE artifacts)
 - [x] AST-aware chunking for PHP, Go, Rust, Java (tree-sitter)
 - [x] Multi-editor support (Cursor, VS Code/Copilot, Gemini CLI)
+- [x] Reproducible benchmark suite (93% savings on FastAPI, per-layer breakdown)
+- [x] Session savings visibility (shown at every session start)
 - [ ] Tree-sitter support for C, C++, Ruby, Swift, Kotlin
 - [ ] Docker support for remote mode

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/RECORD RENAMED Viewed

@@ -1,4 +1,4 @@
-code_context_engine-0.4.1.dist-info/licenses/LICENSE,sha256=vLbw0GGCVJSIRppMus7Oq0PyMDhDXz-dfvz2rPpWtjQ,1069
+code_context_engine-0.4.2.dist-info/licenses/LICENSE,sha256=vLbw0GGCVJSIRppMus7Oq0PyMDhDXz-dfvz2rPpWtjQ,1069
 context_engine/__init__.py,sha256=HU6q9Ni12P7RHgw4VNQDHoOcqdcHRfyjcmRqNq9c0Fw,129
 context_engine/cli.py,sha256=ZJ2qg8srayPouQ6Cg_kk-jf5zQbM4khjGetDkWPSyOE,113122
 context_engine/cli_style.py,sha256=a3l3Smq1gIN2asbNalFUz0i_5x7Tmkp_wEhyGMoo8a4,2460
@@ -34,7 +34,7 @@ context_engine/indexer/watcher.py,sha256=uILU_29M7J4VNv94nEQL7KKVyWSNee6JCA7lzXt
 context_engine/integration/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 context_engine/integration/bootstrap.py,sha256=nD7brJXAgEgO6MTtIu6rui-DlewS9hzpVga55_bRhZs,3272
 context_engine/integration/git_context.py,sha256=nAUOh1w7IU2Ph6MWbnfhSEnZ99-hRP424X0Ziy3aW3Q,4252
-context_engine/integration/mcp_server.py,sha256=XGr4HK0cS8PkaWABI7hvLeZAJx2ooDHUhzCBuDOhn8U,79580
+context_engine/integration/mcp_server.py,sha256=rcBwwn_UOTzRggjzEMVsW9XptALFkyV9XS9Z6gNeyjU,80116
 context_engine/integration/session_capture.py,sha256=azc0I2PoQQ-0gsmTFy254na_Ez3ADHJ5IdOKU5oFIEU,12440
 context_engine/memory/__init__.py,sha256=-mzH2HLbjF6mlyzlt0IZoezDPLHBTJmIXFlsn8cjeQA,299
 context_engine/memory/compressor.py,sha256=qSXPE67dTNnhf7M5lKpIetpa-s0rpoa22AQljQRv52Q,13257
@@ -43,7 +43,7 @@ context_engine/memory/extractive.py,sha256=VJFBG8P6Wku0OaKBQmOr3eTk5XRS2ed3q-TYb
 context_engine/memory/grammar.py,sha256=1yrMky1MlmT9m4-_XW3Rq8ZAEE6fBp4miFiWNEcH8ao,16776
 context_engine/memory/hook_installer.py,sha256=1rtPuw9KKveCGiPTojH4CNLK8gwh7Um3l1h0QN0dWv0,9273
 context_engine/memory/hook_server.py,sha256=y62r7TGxXIDIAMiAcebIyqHE0fU5u-1dq3qGHspM4PQ,2692
-context_engine/memory/hooks.py,sha256=CuwDUOdvRrD9SZe9GOtI5m1-e_j_EGYU-3QPZh5uv90,12302
+context_engine/memory/hooks.py,sha256=q3UAOOsRJAiYh_OFj6AsemUVy95tsZAp-S8ok3WRTUg,14129
 context_engine/memory/migrate.py,sha256=X5w6t96mWXz5p_CtEHYNQEpmCvcnwJ0uVsXeM9k3Xro,9447
 context_engine/retrieval/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 context_engine/retrieval/confidence.py,sha256=gOjL_T-_s1OSg0rKIzCh0ishmZ0Vw6LMB9qq7pr6WsE,1621
@@ -56,8 +56,8 @@ context_engine/storage/graph_store.py,sha256=mftuJFvlkFeBlzMsQorY5YS4l5wsDUxCMw5
 context_engine/storage/local_backend.py,sha256=5MVoAn6Jkiltho-9BjClisLkyXMkSZZc2Z_h3N7Vfcg,4200
 context_engine/storage/remote_backend.py,sha256=u77lnGIvqrL3PwInjT6nfRgyNn6oVxW92KUK66oWrvI,5504
 context_engine/storage/vector_store.py,sha256=tA0ol_v5B2KRNMt2hE2kI4qnYe_AoYP_HSp1MvzcsFU,14704
-code_context_engine-0.4.1.dist-info/METADATA,sha256=w1SE7xEw7cvqAuSeMn0zQdJaml8KWk05zvaXDK1qwIQ,15256
-code_context_engine-0.4.1.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
-code_context_engine-0.4.1.dist-info/entry_points.txt,sha256=DQuRWUuVFM7nPcXtDmJzlem7QA0IboD_4N8AnTtDD9Q,144
-code_context_engine-0.4.1.dist-info/top_level.txt,sha256=X1-RUqb61WXBjy3JjsW2oXwfvqk2ydXKDNidxmw4CZ4,15
-code_context_engine-0.4.1.dist-info/RECORD,,
+code_context_engine-0.4.2.dist-info/METADATA,sha256=Jha4lJakEzKAJr-gGNK_T8Zm-qnvjrcy1UKcGPKPgZw,17592
+code_context_engine-0.4.2.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
+code_context_engine-0.4.2.dist-info/entry_points.txt,sha256=DQuRWUuVFM7nPcXtDmJzlem7QA0IboD_4N8AnTtDD9Q,144
+code_context_engine-0.4.2.dist-info/top_level.txt,sha256=X1-RUqb61WXBjy3JjsW2oXwfvqk2ydXKDNidxmw4CZ4,15
+code_context_engine-0.4.2.dist-info/RECORD,,

context_engine/integration/mcp_server.py CHANGED Viewed

@@ -1330,6 +1330,7 @@ class ContextEngineMCP:
         queries = self._stats["queries"]
         raw = self._stats["raw_tokens"]
         served = self._stats["served_tokens"]
+        full_file = self._stats.get("full_file_tokens", 0)
         saved = raw - served
         pct = int(saved / raw * 100) if raw > 0 else 0
@@ -1339,10 +1340,20 @@ class ContextEngineMCP:
             f"{get_level_description(self._output_level)}",
         ]
         if queries > 0:
-            status_parts.append(
-                f"Token savings ({queries} queries): {raw:,} raw → {served:,} served "
-                f"({saved:,} saved, {pct}%)"
-            )
+            # Show full-file baseline savings (the headline number)
+            if full_file > 0:
+                full_saved = full_file - served
+                full_pct = int(full_saved / full_file * 100)
+                status_parts.append(
+                    f"Token savings ({queries} queries): "
+                    f"{full_file:,} full-file baseline → {served:,} served "
+                    f"({full_pct}% saved)"
+                )
+            else:
+                status_parts.append(
+                    f"Token savings ({queries} queries): {raw:,} raw → {served:,} served "
+                    f"({saved:,} saved, {pct}%)"
+                )
         else:
             status_parts.append("Token savings: no queries recorded yet")
         return [TextContent(type="text", text="\n".join(status_parts))]

context_engine/memory/hooks.py CHANGED Viewed

@@ -48,6 +48,50 @@ _RESUME_RECENT_DECISIONS = 5
 _RESUME_DECISION_REASON_CHARS = 200
+def _build_savings_line(conn: sqlite3.Connection) -> str:
+    """One-line savings summary from the savings_log table.
+    Returns something like:
+      "CCE saved 95% of input tokens across 14 queries (48.0k baseline, 2.4k served)"
+    or "" if no savings data exists.
+    """
+    from context_engine.memory.db import aggregate_savings
+    try:
+        buckets = aggregate_savings(conn)
+    except Exception:
+        return ""
+    # Use retrieval bucket for the true baseline (full-file tokens) and query
+    # count.  For served tokens, prefer chunk_compression (the final pipeline
+    # stage) when available, otherwise fall back to retrieval served.  This
+    # avoids double-counting that would occur if we summed baselines across
+    # all buckets (retrieval baseline feeds into chunk_compression baseline).
+    retrieval = buckets.get("retrieval", {"baseline": 0, "served": 0, "calls": 0})
+    compression = buckets.get("chunk_compression", {"baseline": 0, "served": 0, "calls": 0})
+    total_baseline = retrieval["baseline"]
+    total_served = compression["served"] if compression["calls"] > 0 else retrieval["served"]
+    total_queries = retrieval["calls"]
+    if total_baseline <= 0 or total_queries <= 0:
+        return ""
+    saved_pct = (1 - total_served / total_baseline) * 100
+    def _fmt_k(n: int) -> str:
+        if n >= 1_000_000:
+            return f"{n / 1_000_000:.1f}M"
+        if n >= 1_000:
+            return f"{n / 1_000:.1f}k"
+        return str(n)
+    return (
+        f"CCE saved {saved_pct:.0f}% of input tokens across {total_queries} queries "
+        f"({_fmt_k(total_baseline)} baseline, {_fmt_k(total_served)} served)"
+    )
 def build_session_resume(conn: sqlite3.Connection, project: str) -> str:
     """Compose a short text block summarising recent state for the model.
@@ -73,7 +117,9 @@ def build_session_resume(conn: sqlite3.Connection, project: str) -> str:
         (_RESUME_RECENT_DECISIONS,),
     ))
-    if not last_rollup and not decisions:
+    savings_line = _build_savings_line(conn)
+    if not last_rollup and not decisions and not savings_line:
         return ""
     parts.append(f"## CCE memory · resuming {project}")
@@ -81,6 +127,10 @@ def build_session_resume(conn: sqlite3.Connection, project: str) -> str:
     # before display so the resume reads as natural prose.
     from context_engine.memory.grammar import expand as _grammar_expand
+    if savings_line:
+        parts.append("")
+        parts.append(f"**{savings_line}**")
     if last_rollup:
         when = last_rollup["ended_at"] or "in progress"
         parts.append("")

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/WHEEL RENAMED Viewed

File without changes

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/entry_points.txt RENAMED Viewed

File without changes

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/licenses/LICENSE RENAMED Viewed

File without changes

{code_context_engine-0.4.1.dist-info → code_context_engine-0.4.2.dist-info}/top_level.txt RENAMED Viewed

File without changes

code-context-engine 0.4.1__py3-none-any.whl → 0.4.2__py3-none-any.whl

code-context-engine 0.4.1py3-none-any.whl → 0.4.2py3-none-any.whl