PyPI - code-context-engine - Versions diffs - 0.4.20__py3-none-any.whl → 0.4.22__py3-none-any.whl - Mend

code-context-engine 0.4.20__py3-none-any.whl → 0.4.22__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/METADATA RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: code-context-engine
-Version: 0.4.20
+Version: 0.4.22
 Summary: Save 94% on Claude Code tokens. Index your codebase locally, AI agents search instead of reading files. Reduce Claude API costs, save tokens on Cursor, VS Code, Gemini CLI. Free, open source MCP server.
 Author-email: Fazle Elahee <felahee@gmail.com>, Raj <rajkumar.sakti@gmail.com>
 License-Expression: MIT
@@ -115,15 +115,17 @@ Dynamic: license-file
 ---
-## Quick start (3 lines)
+## Quick start
 ```bash
-uv tool install code-context-engine
+uv tool install "code-context-engine[local]"    # or: pipx install "code-context-engine[local]"
 cd /path/to/your/project
-cce init
+cce init                                        # or: cce init --agent all
 ```
-That's it. Claude now searches your index instead of reading entire files. No config needed.
+That's it. Your AI coding agent now searches your index instead of reading entire files.
+> **Already have Ollama?** You can skip `[local]` and use `uv tool install code-context-engine` instead. CCE auto-detects Ollama at localhost:11434 and uses `nomic-embed-text`.
 ---
@@ -143,35 +145,42 @@ Tested on all three platforms in CI (macOS, Linux, Windows × Python 3.11/3.12/3
 ## Install and see savings in 60 seconds
-```bash
-uv tool install code-context-engine   # or: pipx install code-context-engine
-cd /path/to/your/project
-cce init                              # index, install hooks, register MCP server
-```
+You need an embedding backend to index code. Pick one:
+| Option | Install command | Size | Requires |
+|--------|----------------|------|----------|
+| **Local (recommended)** | `uv tool install "code-context-engine[local]"` | +60 MB | Nothing else |
+| **Ollama** | `uv tool install code-context-engine` | Core only | Ollama running + `nomic-embed-text` pulled |
-**Embedding backends:** CCE auto-detects the best available backend. If you have Ollama running, it uses `nomic-embed-text` with zero extra dependencies. For offline/local embedding without Ollama, install the `[local]` extra:
+Then:
 ```bash
-uv tool install "code-context-engine[local]"   # includes fastembed + ONNX Runtime
+cd /path/to/your/project
+cce init                              # index, install hooks, register MCP server
 ```
 Restart your editor. Done. Every question now hits the index instead of re-reading files.
-`cce init` auto-detects your editor and writes the right config:
+`cce init` auto-detects your editor and writes the right config. To target a
+specific agent, use `--agent claude`, `--agent codex`, `--agent copilot`, or
+`--agent all`.
 | Editor | Config written | Instructions |
 |--------|---------------|--------------|
 | Claude Code | `.mcp.json` | `CLAUDE.md` |
-| VS Code / Copilot | `.vscode/mcp.json` | |
+| VS Code / Copilot | `.vscode/mcp.json` | `.github/copilot-instructions.md` |
 | Cursor | `.cursor/mcp.json` | `.cursorrules` |
 | Gemini CLI | `.gemini/settings.json` | `GEMINI.md` |
-| OpenAI Codex | `~/.codex/config.toml` (user-global, per-project section) | |
+| OpenAI Codex | `~/.codex/config.toml` (user-global, per-project section) | `AGENTS.md` |
 | OpenCode | `opencode.json` | |
 | Tabnine | `.tabnine/agent/settings.json` | `TABNINE.md` |
 Multiple editors in the same project? All get configured in one command.
-**Codex note:** Codex CLI reads MCP servers from `~/.codex/config.toml` only — it has no per-project config. `cce init` adds one `[mcp_servers.cce-<project>-<hash>]` section per project so multiple projects coexist; `cce uninstall` removes only the section for the current project.
+**Codex note:** Codex CLI reads MCP servers from `~/.codex/config.toml` only —
+it has no per-project config. `cce init` adds one `[mcp_servers.cce-<project>-<hash>]`
+section per project so multiple projects coexist; `cce uninstall` removes only
+the section for the current project.
 ```
   my-project · 38 queries
@@ -487,6 +496,57 @@ All other text files are chunked by line range. Binary files are skipped.
 ---
+## FAQ
+### Does CCE affect response quality?
+No. Quality stays the same or slightly improves.
+CCE replaces "dump the entire file" with "search for the relevant function." The model still gets the code it needs (0.90 Recall@10 in benchmarks). Less irrelevant context means less noise competing for attention, which can improve the model's focus on your actual question.
+### How does output token savings work?
+CCE writes output compression rules directly into your agent's instruction files (`CLAUDE.md`, `AGENTS.md`, `.cursorrules`, etc.) during `cce init`. These rules apply to the **entire session**, not just CCE tool responses, so every reply from the agent follows them.
+Set the level in `cce.yaml`:
+```yaml
+compression:
+  output: max       # off | lite | standard | max
+```
+Then re-run `cce init` to update instruction files. Or change at runtime:
+```
+set_output_level output_level=max
+```
+| Level | Savings | What it does |
+|-------|---------|--------------|
+| `off` | 0% | No compression |
+| `lite` | ~25% | Removes filler/hedging/pleasantries + diff-only for code changes |
+| `standard` | ~70% | Drops articles, fragments, short synonyms + diff-only for code |
+| `max` | ~80% | Telegraphic style + diff-only for code |
+Default is `standard`. All levels include **code output rules** that tell the model to show only changed lines (not full file rewrites), which is where most output tokens go in coding sessions. The `max` level produces very terse prose (similar to "caveman mode"). Code blocks, paths, and commands are never compressed regardless of level.
+### Where do the savings come from?
+Most savings are **input tokens** (what goes into the model):
+| Layer | Type | Typical savings |
+|-------|------|-----------------|
+| Retrieval | Input | 94% (full files → relevant chunks) |
+| Chunk compression | Input | 89% (chunks → signatures) |
+| Grammar compression | Input | 13% (article/filler removal) |
+| Turn summarization | Input | varies (session history) |
+| Progressive disclosure | Input | varies (tool payloads) |
+| Output compression | Output | 25-80% (depends on level) |
+Output tokens cost 5x more per token (e.g. Opus: $15/1M input vs $75/1M output), so even a small output reduction has outsized cost impact.
+---
 ## Roadmap
 - [x] Multi-repo benchmarks (FastAPI, chi, fiber)
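
The FAQ's pricing argument in the METADATA diff above can be checked with a little arithmetic. The sketch below is illustrative only: the rates are the Opus figures quoted in the FAQ ($15 per 1M input tokens, $75 per 1M output tokens), while `cost_saved` and the token counts are hypothetical and not part of CCE.

```python
# Rates quoted in the FAQ (USD per 1M tokens); treat them as example values.
INPUT_PRICE_PER_M = 15.0
OUTPUT_PRICE_PER_M = 75.0


def cost_saved(input_tokens: int, output_tokens: int) -> float:
    """Dollar value of saved tokens, pricing input and output separately."""
    return (
        input_tokens * INPUT_PRICE_PER_M / 1_000_000
        + output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    )


# Hypothetical session: 1M input tokens saved vs. 200k output tokens saved.
input_only = cost_saved(1_000_000, 0)
output_only = cost_saved(0, 200_000)
print(f"input: ${input_only:.2f}  output: ${output_only:.2f}")
```

Under these assumed rates, saving 200k output tokens is worth as much as saving 1M input tokens, which is the 5x ratio the FAQ refers to.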

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/RECORD RENAMED

@@ -1,12 +1,12 @@
-code_context_engine-0.4.20.dist-info/licenses/LICENSE,sha256=vLbw0GGCVJSIRppMus7Oq0PyMDhDXz-dfvz2rPpWtjQ,1069
+code_context_engine-0.4.22.dist-info/licenses/LICENSE,sha256=vLbw0GGCVJSIRppMus7Oq0PyMDhDXz-dfvz2rPpWtjQ,1069
 context_engine/__init__.py,sha256=qThGxB7xfZi5M9jDpUno0MKBp7KKrEOdH1hG4wHMuLc,193
-context_engine/cli.py,sha256=2UeFLCpe9t8gLDZLZaUiT9Qc6Ma4yn2Qeb_WvvRaO3I,121060
+context_engine/cli.py,sha256=iZbxwA0O4zFD_WRVgPnh1WdhsmZpu6Me-9lJTeT28DE,130226
 context_engine/cli_style.py,sha256=a3l3Smq1gIN2asbNalFUz0i_5x7Tmkp_wEhyGMoo8a4,2460
 context_engine/config.py,sha256=UGbVuc8_wTMflzGh80AotMZXZHzzUpLI3QjMnCxTzRo,8370
-context_engine/editors.py,sha256=F9QZXukvCdvjQ2Dw0pScRmHnWWG8HrvrXNhflHygKmY,23065
+context_engine/editors.py,sha256=k9jrqzU5gvYkR5kMu3VcVKHdjxEODZNmxBIEhQUOszE,23986
 context_engine/event_bus.py,sha256=7Jgw_2YvGQFrnYewXk6T6FJcvRHz0LVEMDgZym9YBCE,760
 context_engine/models.py,sha256=XBbM0CUqNDQ5MOp6F3STST2qLqy2Zk0m050ZtWdXkrk,2048
-context_engine/pricing.py,sha256=aA7iMsIC4ET6-Pqqp5PSji1XYwIjrTg64lGcPeHdom8,3173
+context_engine/pricing.py,sha256=aT1bsQuZXPlCdTgtwesJLwlKc2tzh8rxL67sZlMbz4E,4684
 context_engine/project_commands.py,sha256=ZePtRU48F1MS0LsVE-32kUA7kjy7yeSh0swL0L6irLA,10741
 context_engine/serve_http.py,sha256=bWG4yyeSusz19qM3SzDINO7oYd6SpWKsVD7c_VniZi4,9563
 context_engine/services.py,sha256=8WSVGS7jtqArIihIHKW4fN2ZgfBex9GSEBnjMWecUQM,9827
@@ -14,7 +14,7 @@ context_engine/utils.py,sha256=rytymcEY0tjG4uknJU3DXKz1_ZGjUjJRV3PhkjXoC8A,3192
 context_engine/compression/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 context_engine/compression/compressor.py,sha256=JlNxZeM6-tXISWVOGiJAcLoixqAxwfEGcYtE0dj8FPw,6680
 context_engine/compression/ollama_client.py,sha256=MKF1gii2BXMU-wxBRPyMCjo8t72v3dZ06Kv2JNfILgQ,1265
-context_engine/compression/output_rules.py,sha256=M68k2BnQ64mPBjj0nX2EjQl0vizQf3yrwYv5-ITC4jY,3526
+context_engine/compression/output_rules.py,sha256=kpLZ6r6Ng6PyAvA22wed5ecm8YTxHwwKI57PgsnX6ls,6655
 context_engine/compression/prompts.py,sha256=jZnpqhr77uI9R3S0vm3Dj17JYy03AXq24E6HQTPXy-A,711
 context_engine/compression/quality.py,sha256=F6fyxDdWjq-Hgtw4xFIaE4BqPoJw1W1EQSn3RXDgdHc,1676
 context_engine/dashboard/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
@@ -22,7 +22,7 @@ context_engine/dashboard/_page.py,sha256=2LOz6GxVFHdNyd6iGV-u6sbwCnTrw2p_cVUY-Ly
 context_engine/dashboard/server.py,sha256=N-QVaDCUL1h70QUgKrIy6QhQIedasf0KYHcV5LACZ0U,17437
 context_engine/indexer/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 context_engine/indexer/chunker.py,sha256=f0n7gJughdHP1fmMd1sbHAxLmVlCnIq6scHOeGFmBS8,6503
-context_engine/indexer/embedder.py,sha256=QSrep2Si6RgddikJMyBlO-K2p58yc2VgANKEsv5rf3g,20646
+context_engine/indexer/embedder.py,sha256=xznLoW8A9KfDRZWO2MYzCk6o_Kj5YLIMuQ2J-MIbo3g,22717
 context_engine/indexer/embedding_cache.py,sha256=yp7zvjjbhDei1tEczdo25GB_a5SJt3XfO4TVGujjSA0,6454
 context_engine/indexer/git_hooks.py,sha256=GjncsmFu2TZx_3TNQNSBSp15uDwOJ3AtUJxuePQCP24,3258
 context_engine/indexer/git_indexer.py,sha256=3IbAHYKa-XzpEX4zUfdvU0EHj-qjyn8muK6yPuxy9kw,4154
@@ -38,7 +38,7 @@ context_engine/integration/mcp_server.py,sha256=hIvap8fnpbeAOjJ0oy0GZdgjnUln6b-D
 context_engine/integration/session_capture.py,sha256=azc0I2PoQQ-0gsmTFy254na_Ez3ADHJ5IdOKU5oFIEU,12440
 context_engine/memory/__init__.py,sha256=-mzH2HLbjF6mlyzlt0IZoezDPLHBTJmIXFlsn8cjeQA,299
 context_engine/memory/compressor.py,sha256=TiHxFHRPS3TQxo2_YnnXv8QaQXwxehmH2iwe-azuxpw,15763
-context_engine/memory/db.py,sha256=x0NaR5aKcOcqrl-GKCFIW7DPuwQ1pYreqDc0dpg9O14,34579
+context_engine/memory/db.py,sha256=C700MhsdzT8NhpTz_8q-XV4kO6i-Rp4h4GTRoDa8OC4,34936
 context_engine/memory/decision_extractor.py,sha256=tAFcKVaX5Y1qax71MAR03eq6uyCBIfiEDlbsgiodHUw,3508
 context_engine/memory/extractive.py,sha256=VJFBG8P6Wku0OaKBQmOr3eTk5XRS2ed3q-TYb432GLc,3227
 context_engine/memory/grammar.py,sha256=1yrMky1MlmT9m4-_XW3Rq8ZAEE6fBp4miFiWNEcH8ao,16776
@@ -56,9 +56,9 @@ context_engine/storage/fts_store.py,sha256=GzsF-xUPInqovcK72ULgpYAtMAymx4BRrYmps
 context_engine/storage/graph_store.py,sha256=EAJaDK1OzSabm6HY4h7ZdZcykzlqtdFosNTypW5VNpc,8991
 context_engine/storage/local_backend.py,sha256=5MVoAn6Jkiltho-9BjClisLkyXMkSZZc2Z_h3N7Vfcg,4200
 context_engine/storage/remote_backend.py,sha256=6AwEI9YQnmP1w0a7S0ei3YrU2h3z7wbrwv34k7g5YOU,5483
-context_engine/storage/vector_store.py,sha256=FOp1fqneIQ4LQQh3f6sfZcn2jswj2SoEazW5BySGBVw,15025
-code_context_engine-0.4.20.dist-info/METADATA,sha256=4HEiEnteppLc8aY913oAZ0oBvIxSqZB0EDN5ljVPW3A,23088
-code_context_engine-0.4.20.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
-code_context_engine-0.4.20.dist-info/entry_points.txt,sha256=DQuRWUuVFM7nPcXtDmJzlem7QA0IboD_4N8AnTtDD9Q,144
-code_context_engine-0.4.20.dist-info/top_level.txt,sha256=X1-RUqb61WXBjy3JjsW2oXwfvqk2ydXKDNidxmw4CZ4,15
-code_context_engine-0.4.20.dist-info/RECORD,,
+context_engine/storage/vector_store.py,sha256=GyXSTlcKpByjr2C9JUF_cUCvMbGAc1UVV8Apx5X82kw,15772
+code_context_engine-0.4.22.dist-info/METADATA,sha256=UUastWJFLBpuSBE0fr-bWL857Jp06tyCq_5V1bj00CI,25756
+code_context_engine-0.4.22.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
+code_context_engine-0.4.22.dist-info/entry_points.txt,sha256=DQuRWUuVFM7nPcXtDmJzlem7QA0IboD_4N8AnTtDD9Q,144
+code_context_engine-0.4.22.dist-info/top_level.txt,sha256=X1-RUqb61WXBjy3JjsW2oXwfvqk2ydXKDNidxmw4CZ4,15
+code_context_engine-0.4.22.dist-info/RECORD,,
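
Each RECORD row above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. A minimal sketch of how such a row can be reproduced for a local file's bytes; `record_row` is a hypothetical helper, not something shipped in the package.

```python
import base64
import hashlib


def record_row(path: str, data: bytes) -> str:
    """Build a RECORD-style row: path, urlsafe-base64 SHA-256 (no padding), size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# An empty file reproduces the hash listed for the empty __init__.py files above.
print(record_row("context_engine/compression/__init__.py", b""))
```

Comparing a recomputed row against the listed one is a quick way to confirm that a file is byte-identical between the two versions; rows whose hash and size changed are exactly the files the diff reports as modified.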

context_engine/cli.py CHANGED

@@ -56,6 +56,88 @@ def _safe_cwd() -> Path:
         ) from exc
+# ── Update check ─────────────────────────────────────────────────────
+_CCE_HOME = Path.home() / ".cce"
+_UPDATE_CACHE = _CCE_HOME / "update_check.json"
+_UPDATE_CHECK_TTL = 24 * 3600  # 1 day
+def _version_tuple(v: str) -> tuple[int, ...]:
+    """Parse '0.4.21' into (0, 4, 21) for comparison."""
+    return tuple(int(x) for x in v.split(".") if x.isdigit())
+def _check_for_update() -> str | None:
+    """Return the latest PyPI version if newer than installed, else None.
+    Checks at most once per day. Best-effort: swallows all errors.
+    """
+    import time
+    from importlib.metadata import version as pkg_version
+    try:
+        current = pkg_version("code-context-engine")
+    except Exception:
+        return None
+    # Read cache
+    try:
+        if _UPDATE_CACHE.exists():
+            data = json.loads(_UPDATE_CACHE.read_text())
+            if time.time() - data.get("ts", 0) < _UPDATE_CHECK_TTL:
+                latest = data.get("latest", "")
+                if latest and _version_tuple(latest) > _version_tuple(current):
+                    return latest
+                return None
+    except Exception:
+        pass
+    # Fetch from PyPI
+    latest = None
+    try:
+        import httpx
+        resp = httpx.get(
+            "https://pypi.org/pypi/code-context-engine/json",
+            timeout=3.0,
+            follow_redirects=True,
+        )
+        if resp.status_code == 200:
+            latest = resp.json()["info"]["version"]
+    except Exception:
+        pass
+    # Cache result
+    try:
+        _CCE_HOME.mkdir(parents=True, exist_ok=True)
+        _UPDATE_CACHE.write_text(json.dumps({"ts": time.time(), "latest": latest or ""}))
+    except Exception:
+        pass
+    if latest and _version_tuple(latest) > _version_tuple(current):
+        return latest
+    return None
+def _show_update_notice() -> None:
+    """Print a one-line update notice if a newer version is available."""
+    from importlib.metadata import version as pkg_version
+    try:
+        latest = _check_for_update()
+        if latest:
+            current = pkg_version("code-context-engine")
+            click.echo(
+                f"\n  {click.style('Update available', fg='yellow', bold=True)} "
+                f"{click.style(current, dim=True)} → "
+                f"{click.style(latest, fg='green', bold=True)}  "
+                f"{click.style('Run', dim=True)} "
+                f"{click.style('cce upgrade', fg='cyan')} "
+                f"{click.style('to update', dim=True)}"
+            )
+    except Exception:
+        pass
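
The update check above decides "newer" by comparing tuples of integers. The sketch below reproduces that parsing and shows how it treats a version string with a non-numeric segment; `leading_int_tuple` is a hypothetical variant added only for comparison. Whether pre-release versions are ever published for this package is not stated in the diff.

```python
import re


def version_tuple(v: str) -> tuple[int, ...]:
    """Same parsing as `_version_tuple` in the diff: keep purely numeric segments."""
    return tuple(int(x) for x in v.split(".") if x.isdigit())


def leading_int_tuple(v: str) -> tuple[int, ...]:
    """Hypothetical variant: keep the leading digits of each segment."""
    parts = []
    for segment in v.split("."):
        match = re.match(r"\d+", segment)
        if match is None:
            break  # stop at the first segment with no leading number
        parts.append(int(match.group()))
    return tuple(parts)


print(version_tuple("0.4.22") > version_tuple("0.4.20"))  # True
print(version_tuple("0.4.22rc1"))      # (0, 4): "22rc1" is not all digits
print(leading_int_tuple("0.4.22rc1"))  # (0, 4, 22)
```

With the original parsing, a string such as `0.4.22rc1` compares as older than `0.4.20`, so no notice would be shown for it. The variant treats a release candidate as equal to its final release, which is also imperfect; full PEP 440 ordering would need a dedicated version parser.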
 def _configure_mcp(project_dir: Path) -> bool:
     """Write MCP server config to .mcp.json in the project directory.
@@ -100,12 +182,12 @@ _CCE_CLAUDE_MD_MARKER = "## Context Engine (CCE)"
 # Version stamp embedded as an HTML comment so it doesn't render in the final
 # Markdown but lets `_ensure_claude_md` detect when the installed block is
 # stale and needs replacing. Bump whenever _CCE_CLAUDE_MD_BLOCK changes.
-_CCE_CLAUDE_MD_VERSION = "3"
+_CCE_CLAUDE_MD_VERSION = "4"
 _CCE_CLAUDE_MD_VERSION_TAG = f"<!-- cce-block-version: {_CCE_CLAUDE_MD_VERSION} -->"
 _CCE_CLAUDE_MD_VERSION_PREFIX = "<!-- cce-block-version: "
 _CCE_CLAUDE_MD_END_MARKER = "<!-- /cce-block -->"
-_CCE_CLAUDE_MD_BLOCK = f"""\
+_CCE_CLAUDE_MD_BLOCK_TEMPLATE = f"""\
 {_CCE_CLAUDE_MD_VERSION_TAG}
 ## Context Engine (CCE)
@@ -186,18 +268,22 @@ the goal is durable signal, not an event log.
 Both are read-only and cheap. Prefer them over re-running tool calls or
 asking the user to re-paste context.
-## Output Style
-Be concise. Lead with the answer or action, not reasoning. Skip filler words,
-preamble, and phrases like "I'll help you with that" or "Certainly!". Prefer
-fragments over full sentences in explanations. No trailing summaries of what
-you just did. One sentence if it fits.
-Code blocks, file paths, commands, and error messages are always written in full.
+{{output_style}}
 {_CCE_CLAUDE_MD_END_MARKER}
 """
+def _build_claude_md_block(output_level: str = "standard") -> str:
+    """Generate the CLAUDE.md CCE block with the configured output style."""
+    from context_engine.compression.output_rules import get_instruction_output_block
+    block = get_instruction_output_block(output_level)
+    return _CCE_CLAUDE_MD_BLOCK_TEMPLATE.replace("{output_style}", block)
+# Default block for backward compat
+_CCE_CLAUDE_MD_BLOCK = _build_claude_md_block("standard")
 def _resolve_cce_cmd() -> str:
     """Find the globally installed cce binary path."""
     from context_engine.utils import resolve_cce_binary
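
The hunk above turns the CLAUDE.md block into a template: inside an f-string, `{{output_style}}` is an escaped brace pair that survives as the literal text `{output_style}`, which `_build_claude_md_block` later fills with `str.replace`. A minimal sketch of that pattern, with a shortened template and hypothetical names (`END_MARKER`, `TEMPLATE`, `build_block`):

```python
END_MARKER = "<!-- /cce-block -->"  # stand-in for _CCE_CLAUDE_MD_END_MARKER

# In an f-string, {{name}} renders as the literal text {name}, while
# {END_MARKER} is interpolated immediately when the string is created.
TEMPLATE = f"""\
## Context Engine (CCE)

{{output_style}}
{END_MARKER}
"""


def build_block(output_style: str) -> str:
    """Fill the placeholder later, once the output level is known."""
    return TEMPLATE.replace("{output_style}", output_style)


block = build_block("## Output Style\nBe concise.")
print(block)
```

One plausible reason to use `.replace` rather than `.format` is that the surrounding block text may contain other literal braces, which `.format` would try to interpret; the diff itself does not state the motivation.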
@@ -541,6 +627,22 @@ def _preflight_check(config) -> None:
     one was picked, and surfaces Ollama status for the separate compression
     path so users know what compression level they will get.
     """
+    # --- SQLite extension support ---
+    import sqlite3 as _sqlite3
+    _test_conn = _sqlite3.connect(":memory:")
+    if not hasattr(_test_conn, "enable_load_extension"):
+        _test_conn.close()
+        raise click.ClickException(
+            "Your Python was compiled without SQLite extension support "
+            "(enable_load_extension is missing).\n"
+            "This is common with python.org installers on macOS.\n\n"
+            "Fix: reinstall CCE under a Python that has extension support:\n\n"
+            "  brew install python3\n"
+            "  uv tool install --python /opt/homebrew/bin/python3 "
+            "--force code-context-engine\n"
+        )
+    _test_conn.close()
     # --- Embedding backend ---
     click.echo(_dim("  Detecting embedding backend") + "...", nl=False)
     from context_engine.config import resolve_ollama_url
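
The new preflight step above fails early when the interpreter's `sqlite3` module cannot load extensions. The same condition can be checked by hand before installing; a minimal sketch, where `sqlite_extensions_supported` is a hypothetical helper that mirrors the `hasattr` test in the diff:

```python
import sqlite3


def sqlite_extensions_supported() -> bool:
    """True if this interpreter's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        return hasattr(conn, "enable_load_extension")
    finally:
        conn.close()


print("extension support:", sqlite_extensions_supported())
```

Running this with the Python that `uv tool install` or `pipx` would use should indicate whether `cce init` will pass this check on that interpreter.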
@@ -564,13 +666,15 @@ def _preflight_check(config) -> None:
                 fg="green",
             )
         )
-    except Exception as exc:
+    except Exception:
         click.echo("")
-        _warn(f"No embedding backend available: {exc}")
-        _warn(
-            "Install fastembed (`pip install code-context-engine[local]`) "
-            f"or start an Ollama server at {ollama_url} and pull "
-            f"{ollama_model}."
+        raise click.ClickException(
+            "No embedding backend available.\n\n"
+            "Fix (pick one):\n"
+            "  1. Install local embeddings:\n"
+            "     uv tool install 'code-context-engine[local]'\n\n"
+            f"  2. Start Ollama and pull the embedding model:\n"
+            f"     ollama pull {ollama_model}\n"
         )
     # --- Ollama for LLM compression (independent of the embedding path) ---
@@ -596,7 +700,7 @@ def _preflight_check(config) -> None:
         click.echo(_dim("  Tip: ollama pull phi3:mini for LLM summarization"))
-def _ensure_claude_md(project_dir: Path) -> None:
+def _ensure_claude_md(project_dir: Path, output_level: str = "standard") -> None:
     """Add or upgrade the CCE instructions block in CLAUDE.md.
     Three states the file can be in:
@@ -611,9 +715,10 @@ def _ensure_claude_md(project_dir: Path) -> None:
     """
     from context_engine.utils import atomic_write_text
+    block = _build_claude_md_block(output_level)
     claude_md = project_dir / "CLAUDE.md"
     if not claude_md.exists():
-        atomic_write_text(claude_md, _CCE_CLAUDE_MD_BLOCK)
+        atomic_write_text(claude_md, block)
         _ok("CLAUDE.md created with CCE instructions")
         return
@@ -628,13 +733,13 @@ def _ensure_claude_md(project_dir: Path) -> None:
     # survives the upgrade.
     old_block = _extract_existing_cce_block(existing)
     if old_block is not None:
-        new_content = existing.replace(old_block, _CCE_CLAUDE_MD_BLOCK.rstrip(), 1)
+        new_content = existing.replace(old_block, block.rstrip(), 1)
         atomic_write_text(claude_md, new_content)
         _ok("CLAUDE.md upgraded to current CCE instructions")
         return
     # No CCE block detected — append.
-    new_content = existing.rstrip() + "\n\n" + _CCE_CLAUDE_MD_BLOCK
+    new_content = existing.rstrip() + "\n\n" + block
     atomic_write_text(claude_md, new_content)
     _ok("CLAUDE.md updated with CCE instructions")
@@ -681,10 +786,72 @@ def main(ctx: click.Context, verbose: bool) -> None:
         _show_welcome_banner(ctx.obj["config"])
+@main.result_callback()
+@click.pass_context
+def _after_command(ctx: click.Context, *_args, **_kwargs) -> None:
+    """Run after every command. Shows update notice if available."""
+    # Skip for serve (long-running MCP server) and upgrade (already handles it)
+    if ctx.invoked_subcommand in ("serve", "upgrade"):
+        return
+    _show_update_notice()
+_INIT_AGENT_CHOICES = ("auto", "claude", "codex", "copilot", "all")
+_INIT_AGENT_TO_EDITORS = {
+    "claude": {"claude"},
+    "codex": {"codex"},
+    "copilot": {"vscode"},
+}
+# Editor key → instruction-file key. `claude` is omitted because CLAUDE.md is
+# written by `_ensure_claude_md`, not via the generic instruction-file path.
+# `opencode` has no instruction file.
+_INIT_EDITOR_TO_INSTRUCTIONS = {
+    "codex": "agents",
+    "vscode": "copilot",
+    "cursor": "cursorrules",
+    "gemini": "gemini",
+    "tabnine": "tabnine",
+}
+def _init_editor_targets(project_dir: Path, agent: str) -> set[str]:
+    """Return editor keys to configure for `cce init --agent`.
+    - `all`: every editor in EDITORS (computed at call time so the set never
+      drifts when new editors are added).
+    - `auto`: Claude plus any editor whose project/home markers exist.
+    - explicit (`claude`/`codex`/`copilot`): exactly the editors that flag
+      maps to.
+    """
+    from context_engine.editors import EDITORS, detect_editors
+    if agent == "all":
+        return set(EDITORS.keys())
+    if agent != "auto":
+        return set(_INIT_AGENT_TO_EDITORS[agent])
+    return {"claude", *detect_editors(project_dir)}
+def _init_instruction_targets(editor_targets: set[str]) -> set[str]:
+    """Instruction-file keys derived from the selected editors."""
+    return {
+        file_key
+        for editor_key, file_key in _INIT_EDITOR_TO_INSTRUCTIONS.items()
+        if editor_key in editor_targets
+    }
 @main.command()
+@click.option(
+    "--agent",
+    type=click.Choice(_INIT_AGENT_CHOICES),
+    default="auto",
+    show_default=True,
+    help="Agent/editor target: auto, claude, codex, copilot, or all.",
+)
 @click.pass_context
-def init(ctx: click.Context) -> None:
-    """Initialize context engine and connect it to Claude Code."""
+def init(ctx: click.Context, agent: str) -> None:
+    """Initialize context engine and connect it to AI coding agents."""
     from context_engine.indexer.git_hooks import install_hooks
     from context_engine.project_commands import ensure_gitignore
     config = ctx.obj["config"]
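
The `--agent` resolution added in this hunk has three branches: `all`, an explicit agent, and `auto`. The sketch below mirrors that logic outside the CLI so the resulting sets are easy to inspect. The `EDITORS` registry and the `detected` argument are hypothetical stand-ins, since the real `EDITORS` and `detect_editors()` live in `context_engine.editors` and are not shown in this diff.

```python
# Hypothetical stand-in for the editor registry in context_engine.editors.
EDITORS = {"claude": {}, "codex": {}, "vscode": {}, "cursor": {}, "gemini": {}}

# Mappings copied from the diff (tabnine omitted to match the stand-in registry).
AGENT_TO_EDITORS = {"claude": {"claude"}, "codex": {"codex"}, "copilot": {"vscode"}}
EDITOR_TO_INSTRUCTIONS = {
    "codex": "agents",
    "vscode": "copilot",
    "cursor": "cursorrules",
    "gemini": "gemini",
}


def editor_targets(agent: str, detected: set[str]) -> set[str]:
    """Mirror the three branches: all, explicit agent, auto."""
    if agent == "all":
        return set(EDITORS)
    if agent != "auto":
        return set(AGENT_TO_EDITORS[agent])
    return {"claude", *detected}


def instruction_targets(editors: set[str]) -> set[str]:
    """Instruction-file keys for the selected editors."""
    return {f for e, f in EDITOR_TO_INSTRUCTIONS.items() if e in editors}


for agent in ("auto", "copilot", "all"):
    editors = editor_targets(agent, detected={"cursor"})
    print(agent, sorted(editors), sorted(instruction_targets(editors)))
```

Note that `claude` never appears among the instruction targets: as the comment in the diff explains, CLAUDE.md is written by `_ensure_claude_md` rather than through the generic instruction-file path.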
@@ -719,23 +886,24 @@ def init(ctx: click.Context) -> None:
         _warn("Not a git repository — git hook skipped")
         click.echo(_dim("    Run `cce index` manually after making changes."))
-    # 4. MCP config — Claude Code + any detected editors
+    # 4. MCP config — selected agents/editors
     from context_engine.editors import (
         EDITORS, INSTRUCTION_FILES,
-        detect_editors, configure_mcp, write_instruction_file,
+        configure_mcp, write_instruction_file,
     )
-    configured = _configure_mcp(project_dir)
-    if configured:
-        _ok("MCP server registered in " + click.style(".mcp.json", fg="cyan"))
-    else:
-        _ok("MCP server already configured in " + click.style(".mcp.json", fg="cyan"))
-    # Configure MCP for other detected editors (Cursor, VS Code, Gemini, Codex, Tabnine)
     from context_engine.editors import _editor_section  # noqa: SLF001
-    detected = detect_editors(project_dir)
-    for editor_key in detected:
+    editor_targets = _init_editor_targets(project_dir, agent)
+    if "claude" in editor_targets:
+        configured = _configure_mcp(project_dir)
+        if configured:
+            _ok("MCP server registered in " + click.style(".mcp.json", fg="cyan"))
+        else:
+            _ok("MCP server already configured in " + click.style(".mcp.json", fg="cyan"))
+    for editor_key in sorted(editor_targets):
         if editor_key == "claude":
-            continue  # already handled above
+            continue
         editor = EDITORS[editor_key]
         changed = configure_mcp(project_dir, editor_key)
         if changed is None:
@@ -751,19 +919,27 @@ def init(ctx: click.Context) -> None:
             section = _editor_section(editor, project_dir)
             click.echo(_dim(f"    ~/{editor['config_path']}  →  [{section}]"))
-    # Write instruction files for detected editors
-    for file_key, info in INSTRUCTION_FILES.items():
-        for marker in info["detect"]:
-            if (project_dir / marker).exists():
-                if write_instruction_file(project_dir, file_key):
-                    _ok(f"CCE instructions added to {info['name']}")
-                break
+    # Write instruction files for the selected editors. In `auto` mode, also
+    # pick up instruction files whose marker exists even if the editor itself
+    # wasn't detected (e.g. an `AGENTS.md` checked in without a `~/.codex/`).
+    # Explicit `--agent X` writes only what X covers — no surprise edits.
+    instruction_targets = _init_instruction_targets(editor_targets)
+    if agent == "auto":
+        for file_key, info in INSTRUCTION_FILES.items():
+            if any((project_dir / marker).exists() for marker in info["detect"]):
+                instruction_targets.add(file_key)
+    output_level = getattr(config, "output_compression", "standard")
+    for file_key in sorted(instruction_targets):
+        info = INSTRUCTION_FILES[file_key]
+        if write_instruction_file(project_dir, file_key, output_level=output_level):
+            _ok(f"CCE instructions added to {info['name']}")
     # 5. CLAUDE.md + session hook + memory lifecycle hooks
-    _ensure_claude_md(project_dir)
-    _ensure_session_hook(project_dir)
-    _install_memory_hooks(project_dir)
-    _check_memory_capture_reachable(config, project_dir)
+    if "claude" in editor_targets:
+        _ensure_claude_md(project_dir, output_level=output_level)
+        _ensure_session_hook(project_dir)
+        _install_memory_hooks(project_dir)
+        _check_memory_capture_reachable(config, project_dir)
     # 6. .gitignore — add CCE per-machine entries
     ensure_gitignore(str(project_dir))
@@ -777,7 +953,7 @@ def init(ctx: click.Context) -> None:
     click.echo("")
     click.echo(
         click.style("  Done!", fg="green", bold=True) +
-        click.style("  Restart Claude Code to activate CCE.", fg="white")
+        click.style("  Restart your AI coding agent to activate CCE.", fg="white")
     )
     click.echo("")
@@ -962,7 +1138,7 @@ def list_commands() -> None:
     groups = [
         ("Setup", [
-            ("cce init", "Index project, install git hooks, write .mcp.json"),
+            ("cce init [--agent auto|all|...]", "Index project and register MCP config"),
             ("cce index", "Re-index changed files"),
             ("cce index --full", "Force full re-index of every file"),
             ("cce index --path <file>", "Index one file or directory"),
@@ -1257,13 +1433,20 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
     _all_pricing = get_model_pricing()
     _pricing_model = config.pricing_model.lower()
-    _price_per_m = _all_pricing.get(_pricing_model, _all_pricing.get("opus", 5.0))
-    _COST_PER_TOKEN = _price_per_m / 1_000_000
+    _default = _all_pricing.get("opus", {"input": 15.0, "output": 75.0})
+    _model_pricing = _all_pricing.get(_pricing_model, _default)
+    _input_price_per_m = _model_pricing["input"]
+    _output_price_per_m = _model_pricing["output"]
+    _INPUT_COST = _input_price_per_m / 1_000_000
+    _OUTPUT_COST = _output_price_per_m / 1_000_000
     _model_label = _pricing_model.capitalize()
     _GRID_COLS = 10
     _FILLED = "⛁"
     _EMPTY = "⛶"
+    # The output_compression bucket is the only one saving output tokens.
+    _OUTPUT_BUCKETS = {"output_compression"}
     def _fmt_tokens(n: int) -> str:
         if n >= 1_000_000:
             return f"{n / 1_000_000:.1f}M"
@@ -1271,12 +1454,27 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             return f"{n / 1000:.1f}k"
         return str(n)
-    def _fmt_cost(n: int) -> str:
-        cost = n * _COST_PER_TOKEN
+    def _fmt_cost_input(n: int) -> str:
+        cost = n * _INPUT_COST
         if cost < 0.01:
             return "<$0.01"
         return f"${cost:.2f}"
+    def _fmt_cost_output(n: int) -> str:
+        cost = n * _OUTPUT_COST
+        if cost < 0.01:
+            return "<$0.01"
+        return f"${cost:.2f}"
+    def _bucket_cost(bucket: str, tokens: int) -> float:
+        rate = _OUTPUT_COST if bucket in _OUTPUT_BUCKETS else _INPUT_COST
+        return tokens * rate
+    def _fmt_cost_raw(amount: float) -> str:
+        if amount < 0.01:
+            return "<$0.01"
+        return f"${amount:.2f}"
     def _bar(saved_pct: int) -> str:
         """Render ⛁ ⛁ ⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ grid where filled = tokens used."""
         used_pct = 100 - saved_pct
@@ -1307,6 +1505,20 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
         s = sum(int(v.get("served", 0)) for v in buckets.values())
         return b, s
+    def _split_io(buckets: dict) -> tuple[int, int, int, int]:
+        """Split buckets into (input_baseline, input_served, output_baseline, output_served)."""
+        ib = is_ = ob = os_ = 0
+        for key, v in buckets.items():
+            base = int(v.get("baseline", 0))
+            srv = int(v.get("served", 0))
+            if key in _OUTPUT_BUCKETS:
+                ob += base
+                os_ += srv
+            else:
+                ib += base
+                is_ += srv
+        return ib, is_, ob, os_
     def _print_project(name: str, stats: dict, buckets: dict, levels: dict) -> None:
         queries = stats.get("queries", 0)
@@ -1326,6 +1538,16 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
         tokens_saved = max(0, baseline - served) if queries > 0 else 0
         saved_pct = int(tokens_saved / baseline * 100) if baseline > 0 and queries > 0 else 0
+        # Split into input / output savings
+        in_base, in_srv, out_base, out_srv = _split_io(buckets)
+        in_saved = max(0, in_base - in_srv)
+        out_saved = max(0, out_base - out_srv)
+        # Legacy projects have no bucket data; treat all savings as input.
+        if bucket_baseline == 0 and tokens_saved > 0:
+            in_saved = tokens_saved
+            out_saved = 0
+        total_cost_saved = in_saved * _INPUT_COST + out_saved * _OUTPUT_COST
         q_label = "query" if queries == 1 else "queries"
         click.echo()
@@ -1350,29 +1572,28 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
         )
         click.echo()
-        # Before / after / saved
+        # Input / output / total saved
         click.echo(
-            f"  {dim('Without CCE')}   "
-            f"{value(_fmt_tokens(baseline)):>10}  {dim('tokens')}   "
-            f"{dim(_fmt_cost(baseline))}"
-        )
-        click.echo(
-            f"  {success('With CCE')}      "
-            f"{value(_fmt_tokens(served)):>10}  {dim('tokens')}   "
-            f"{dim(_fmt_cost(served))}"
+            f"  {dim('Input savings')}   "
+            f"{value(_fmt_tokens(in_saved)):>10}  {dim('tokens')}   "
+            f"{dim(_fmt_cost_input(in_saved))}"
         )
+        if out_saved > 0:
+            click.echo(
+                f"  {dim('Output savings')}  "
+                f"{value(_fmt_tokens(out_saved)):>10}  {dim('tokens')}   "
+                f"{dim(_fmt_cost_output(out_saved))}"
+            )
         click.echo(f"  {dim('─' * 42)}")
         click.echo(
-            f"  {success('Saved')}         "
+            f"  {success('Total saved')}   "
             f"{click.style(_fmt_tokens(tokens_saved), fg='green', bold=True):>10}  {dim('tokens')}   "
-            f"{click.style(_fmt_cost(tokens_saved), fg='green', bold=True)}"
+            f"{click.style(_fmt_cost_raw(total_cost_saved), fg='green', bold=True)}"
         )
-        # Per-query average — the number a user actually grounds "is this
-        # worth my time?" on. Skipped when there are no queries or no
-        # savings (avoids dividing by zero and showing $0.00/query noise).
+        # Per-query average
         if queries > 0 and tokens_saved > 0:
             avg_tokens = tokens_saved // max(1, queries)
-            avg_cost = _fmt_cost(avg_tokens)
+            avg_cost = _fmt_cost_raw(total_cost_saved / max(1, queries))
             click.echo(
                 f"  {dim(f'~{_fmt_tokens(avg_tokens)} tokens / query')}  "
                 f"{dim(f'~{avg_cost} / query')}"
@@ -1390,9 +1611,9 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             if saved <= 0:
                 continue
             pct = int(saved / baseline * 100) if baseline > 0 else 0
-            rows.append((display, pct, saved, int(b.get("calls", 0)), is_est, idx))
+            rows.append((key, display, pct, saved, int(b.get("calls", 0)), is_est, idx))
         # Polish 2: sort by saved tokens descending. Biggest wins first.
-        rows.sort(key=lambda r: (-r[2], r[5]))
+        rows.sort(key=lambda r: (-r[3], r[6]))
         if rows:
             click.echo(f"  {dim('Breakdown:')}")
@@ -1401,16 +1622,16 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             # form so estimate buckets don't blow out the alignment.
             displayed_labels = [
                 f"{display}*" if is_est else display
-                for display, _, _, _, is_est, _ in rows
+                for _, display, _, _, _, is_est, _ in rows
             ]
             label_width = max(len(s) for s in displayed_labels) + 1
             # Polish 3: normalize bar fill against the largest bucket's saved
             # tokens, not the total. Otherwise a dominant bucket squashes all
             # others to 0–1 cells and the visualisation goes blind.
-            max_saved = max(r[2] for r in rows)
+            max_saved = max(r[3] for r in rows)
             any_estimate = False
-            for display, pct, saved, calls, is_est in [
-                (d, p, s, c, e) for d, p, s, c, e, _ in rows
+            for key, display, pct, saved, calls, is_est in [
+                (k, d, p, s, c, e) for k, d, p, s, c, e, _ in rows
             ]:
                 if is_est:
                     any_estimate = True
@@ -1430,11 +1651,12 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
                 call_text = "1 call" if calls == 1 else f"{calls} calls"
                 # Polish 5: asterisk glued to label, no separate marker column.
                 label_text = f"{display}*" if is_est else display
+                cost_str = _fmt_cost_raw(_bucket_cost(key, saved))
                 click.echo(
                     f"    {label(label_text.ljust(label_width))}  "
                     f"{value(pct_text)}  {mini_bar}  "
                     f"{dim(_fmt_tokens(saved).rjust(6))} "
-                    f"{dim(_fmt_cost(saved).rjust(8))} "
+                    f"{dim(cost_str.rjust(8))} "
                     f"{dim(f'· {call_text}')}"
                 )
             click.echo()
@@ -1477,9 +1699,11 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
                 f"{label('compression')} {value(f'{max(0, compression_pct)}%')}"
             )
-        click.echo(
-            f"  {dim(f'Cost estimate based on {_model_label} input pricing (${_price_per_m:.0f}/1M tokens)')}"
+        pricing_note = (
+            f"Cost estimate based on {_model_label} pricing "
+            f"(input ${_input_price_per_m}/1M, output ${_output_price_per_m}/1M)"
         )
+        click.echo(f"  {dim(pricing_note)}")
     def _json_entry(name: str, stats: dict, buckets: dict, levels: dict) -> dict:
         full_file = stats.get("full_file_tokens", 0)
@@ -1493,6 +1717,9 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             baseline = max(full_file, raw) if full_file > 0 else raw
             served_total = served
         saved = max(0, baseline - served_total)
+        in_base, in_srv, out_base, out_srv = _split_io(buckets)
+        in_saved = max(0, in_base - in_srv)
+        out_saved = max(0, out_base - out_srv)
         retrieval_pct = (
             int(round((1 - raw / full_file) * 100))
             if full_file > 0 and raw <= full_file
@@ -1510,6 +1737,8 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             "raw_tokens": raw,
             "served_tokens": served,
             "tokens_saved": saved,
+            "input_tokens_saved": in_saved,
+            "output_tokens_saved": out_saved,
             # Kept for backward compat with anything scraping this JSON:
             "savings_pct": int(saved / baseline * 100) if baseline > 0 else 0,
             "retrieval_savings_pct": max(0, retrieval_pct),
@@ -1602,6 +1831,17 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
         total_queries = sum(s.get("queries", 0) for _, s, _, _ in reports)
         total_saved = max(0, total_baseline - total_served)
         total_pct = int(total_saved / total_baseline * 100) if total_baseline > 0 else 0
+        # Aggregate input/output across all projects
+        all_in_saved = all_out_saved = 0
+        for _, stats, bkts, _ in reports:
+            ib, is_, ob, os_ = _split_io(bkts)
+            all_in_saved += max(0, ib - is_)
+            all_out_saved += max(0, ob - os_)
+        # Legacy projects with no bucket data: attribute remaining to input
+        bucket_total_saved = all_in_saved + all_out_saved
+        if bucket_total_saved < total_saved:
+            all_in_saved += total_saved - bucket_total_saved
+        agg_cost = all_in_saved * _INPUT_COST + all_out_saved * _OUTPUT_COST
         click.echo()
         click.echo(
             f"  {bold('Total')} {dim('across')} {value(str(len(reports)))} "
@@ -1613,7 +1853,7 @@ def _run_savings_report(config, *, as_json: bool = False, all_projects: bool = F
             f"{dim('saved ·')} "
             f"{click.style(_fmt_tokens(total_saved), fg='green', bold=True)} "
             f"{dim('tokens ·')} "
-            f"{click.style(_fmt_cost(total_saved), fg='green', bold=True)}"
+            f"{click.style(_fmt_cost_raw(agg_cost), fg='green', bold=True)}"
         )
     click.echo()
@@ -2094,10 +2334,15 @@ def upgrade(ctx: click.Context, check: bool) -> None:
     new_version = current  # fallback
     try:
-        # Re-read version from the just-upgraded package
-        importlib.metadata.invalidate_caches()
-        dist = importlib.metadata.distribution("code-context-engine")
-        new_version = dist.metadata["Version"]
+        # The running process still sees the old venv metadata, so shell out
+        # to the upgraded executable to get the real post-upgrade version.
+        ver_result = subprocess.run(
+            [str(cce_bin), "--version"],
+            capture_output=True, text=True, timeout=10,
+        )
+        if ver_result.returncode == 0:
+            # Output format: "cce, version X.Y.Z"
+            new_version = ver_result.stdout.strip().rsplit(None, 1)[-1]
     except Exception:
         pass
@@ -2128,7 +2373,7 @@ def upgrade(ctx: click.Context, check: bool) -> None:
     click.echo("")
     click.echo(
         click.style("  Done!", fg="green", bold=True) +
-        click.style("  Restart Claude Code to pick up changes.", fg="white")
+        click.style("  Restart your AI coding agent to pick up changes.", fg="white")
     )
     click.echo("")
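
The input/output split that `_run_savings_report` gains in this file can be sketched on its own. This is a minimal sketch, not the package's code: the values of `_OUTPUT_BUCKETS`, `_INPUT_COST`, and `_OUTPUT_COST` are defined outside the hunks shown, so the bucket name `output_compression` and the per-token prices below are assumptions chosen for illustration.

```python
# Hypothetical stand-ins: the real _OUTPUT_BUCKETS, _INPUT_COST and
# _OUTPUT_COST are defined outside the hunks shown in this diff.
OUTPUT_BUCKETS = {"output_compression"}
INPUT_COST = 3.0 / 1_000_000    # assumed $ per input token
OUTPUT_COST = 15.0 / 1_000_000  # assumed $ per output token


def split_io(buckets: dict) -> tuple[int, int, int, int]:
    """Return (input_baseline, input_served, output_baseline, output_served)."""
    in_base = in_srv = out_base = out_srv = 0
    for key, bucket in buckets.items():
        base = int(bucket.get("baseline", 0))
        srv = int(bucket.get("served", 0))
        if key in OUTPUT_BUCKETS:
            out_base += base
            out_srv += srv
        else:
            in_base += base
            in_srv += srv
    return in_base, in_srv, out_base, out_srv


def cost_saved(buckets: dict) -> float:
    """Price input and output savings separately, then add them."""
    in_base, in_srv, out_base, out_srv = split_io(buckets)
    in_saved = max(0, in_base - in_srv)
    out_saved = max(0, out_base - out_srv)
    return in_saved * INPUT_COST + out_saved * OUTPUT_COST


example = {
    "retrieval": {"baseline": 1_000_000, "served": 200_000},
    "output_compression": {"baseline": 100_000, "served": 30_000},
}
print(split_io(example))
print(round(cost_saved(example), 2))
```

Pricing the two sides separately matters because output tokens are typically several times more expensive than input tokens, so a single flat rate would understate the value of output savings.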

context_engine/compression/output_rules.py CHANGED

@@ -11,15 +11,27 @@ ESTIMATED_AVG_REPLY_TOKENS = 500
 # Advertised output-token reduction per level. Sourced from the level
 # descriptions ("~65% savings", "~75% savings"). `lite` has no advertised
-# number; we use a conservative 20% based on how much filler/hedging
-# typically lives in default-mode replies.
+# number; we use a conservative 25% based on filler removal + code diff rules.
+# The code output rules (show diffs, not full files) add ~5-10% on top of
+# prose compression since code responses are a large share of output tokens.
 ADVERTISED_PCT = {
     "off": 0.0,
-    "lite": 0.20,
-    "standard": 0.65,
-    "max": 0.75,
+    "lite": 0.25,
+    "standard": 0.70,
+    "max": 0.80,
 }
+# Code output rules — appended to all non-off levels to reduce code token waste.
+_CODE_RULES = (
+    "\n\n## Code Output Rules\n"
+    "When suggesting code changes:\n"
+    "- Show ONLY the changed lines with minimal surrounding context (3 lines above/below)\n"
+    "- Use edit format: file path, then the specific change. Never rewrite entire files.\n"
+    "- If multiple changes in one file, show each change separately, not the whole file\n"
+    "- Never echo back unchanged code the user already has\n"
+    "- For new files, show the full file. For edits, show only what changes."
+)
 _RULES = {
     "lite": (
         "## Output Compression: Lite\n"
@@ -30,6 +42,7 @@ _RULES = {
         "- No trailing summaries — the diff/output speaks for itself\n"
         "- Keep full grammar and articles\n"
         "- Code blocks, paths, commands, URLs: NEVER compress"
+        + _CODE_RULES
     ),
     "standard": (
         "## Output Compression: Standard\n"
@@ -43,6 +56,7 @@ _RULES = {
         "- One-line explanations unless detail is asked for\n"
         "- Code blocks, paths, commands, URLs, errors: NEVER compress\n"
         "- Security warnings and destructive action confirmations: use full clarity"
+        + _CODE_RULES
     ),
     "max": (
         "## Output Compression: Max\n"
@@ -55,6 +69,7 @@ _RULES = {
         "- Pattern: [thing] → [action]. [reason].\n"
         "- Code blocks, paths, commands, URLs, errors: NEVER compress\n"
         "- Security warnings and destructive action confirmations: use full clarity"
+        + _CODE_RULES
     ),
 }
@@ -70,8 +85,62 @@ def get_level_description(level: str) -> str:
     """Return a human-readable description of the compression level."""
     descriptions = {
         "off": "No output compression — Claude responds normally",
-        "lite": "Removes filler, hedging, and pleasantries. Keeps full grammar.",
-        "standard": "Drops articles, uses fragments, short synonyms. ~65% output token savings.",
-        "max": "Telegraphic style with abbreviations and symbols. ~75% output token savings.",
+        "lite": "Removes filler, hedging, and pleasantries. Diff-only for code. ~25% savings.",
+        "standard": "Drops articles, uses fragments, short synonyms. Diff-only for code. ~70% savings.",
+        "max": "Telegraphic style with abbreviations and symbols. Diff-only for code. ~80% savings.",
     }
     return descriptions.get(level, "Unknown level")
+# ── Instruction-file blocks ──────────────────────────────────────────
+# These go into CLAUDE.md, AGENTS.md, .cursorrules, etc. so they apply
+# to the entire session, not just CCE tool responses.
+_INSTRUCTION_OUTPUT_STYLES = {
+    "lite": """\
+### Output style
+Respond concisely. Remove filler words (just, really, basically, actually,
+simply), hedging (I think, it seems, perhaps), and pleasantries (Sure!,
+Happy to help, Great question). No trailing summaries. Keep full grammar.
+When suggesting code changes, show only the changed lines with 3 lines of
+context. Never rewrite entire files. For new files, show the full file.
+For edits, show only what changes.""",
+    "standard": """\
+### Output style
+Respond in compressed style. Drop articles (a, an, the) in prose. Use
+sentence fragments over full sentences. Use short synonyms (fix not resolve,
+check not investigate). Pattern: [thing] [action] [reason]. [next step].
+No filler, hedging, pleasantries, trailing summaries, or restating what
+the user said. One sentence if one sentence is enough.
+When suggesting code changes, show only the changed lines with 3 lines of
+context. Never rewrite entire files. Multiple changes in one file: show each
+change separately. Never echo back unchanged code the user already has.
+Code blocks, file paths, commands, error messages: always written in full.
+Security warnings and destructive action confirmations: use full clarity.""",
+    "max": """\
+### Output style
+Respond in telegraphic style. Drop articles, pronouns, conjunctions where
+meaning survives. Abbreviate common terms: DB, auth, config, fn, dep, impl,
+req, resp, init. Use arrows for causality: X → Y. Use symbols: + (add),
+- (remove), ~ (change), ! (warning). Max 1-2 sentences per explanation.
+Pattern: [thing] → [action]. [reason].
+When suggesting code changes, show only changed lines. Never rewrite files.
+Never echo back unchanged code.
+Code blocks, paths, commands, errors: always full.
+Security warnings and destructive actions: full clarity, drop compression.""",
+}
+def get_instruction_output_block(level: str) -> str:
+    """Return the output style block for instruction files, or empty if off."""
+    return _INSTRUCTION_OUTPUT_STYLES.get(level, "")
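
One way the advertised percentages could feed an output-savings estimate is sketched below. How the package actually combines `ESTIMATED_AVG_REPLY_TOKENS` with `ADVERTISED_PCT` is not shown in this diff, so `estimated_output_savings` is a hypothetical helper and the multiplication is an assumption.

```python
ESTIMATED_AVG_REPLY_TOKENS = 500  # value visible in the hunk header above
ADVERTISED_PCT = {"off": 0.0, "lite": 0.25, "standard": 0.70, "max": 0.80}


def estimated_output_savings(level: str, replies: int) -> int:
    """Hypothetical helper: estimated output tokens saved over `replies` replies.

    Unknown levels are treated like "off" (no savings).
    """
    pct = ADVERTISED_PCT.get(level, 0.0)
    # round() rather than int() so float noise cannot drop a token.
    return round(ESTIMATED_AVG_REPLY_TOKENS * pct * replies)


for level in ("off", "lite", "standard", "max"):
    print(level, estimated_output_savings(level, replies=10))
```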

context_engine/editors.py CHANGED

@@ -92,7 +92,7 @@ EDITORS: dict[str, dict] = {
 # ── Instruction file definitions ──────────────────────────────────────
 # Editor-agnostic CCE instructions (no "Claude Code" references)
-_CCE_INSTRUCTIONS = """\
+_CCE_INSTRUCTIONS_BASE = """\
 ## Context Engine (CCE)
 This project uses Code Context Engine for intelligent code retrieval and
@@ -122,7 +122,30 @@ Call `record_decision(decision="...", reason="...")` after making choices.
 Call `record_code_area(file_path="...", description="...")` after meaningful work.
 """
+def _build_instructions(output_level: str = "standard") -> str:
+    """Build CCE instructions with the configured output style."""
+    from context_engine.compression.output_rules import get_instruction_output_block
+    block = get_instruction_output_block(output_level)
+    if block:
+        return _CCE_INSTRUCTIONS_BASE + "\n" + block + "\n"
+    return _CCE_INSTRUCTIONS_BASE
+# Default instructions (standard output compression)
+_CCE_INSTRUCTIONS = _build_instructions("standard")
 INSTRUCTION_FILES: dict[str, dict] = {
+    "agents": {
+        "name": "AGENTS.md",
+        "path": "AGENTS.md",
+        "detect": ["AGENTS.md"],
+    },
+    "copilot": {
+        "name": ".github/copilot-instructions.md",
+        "path": ".github/copilot-instructions.md",
+        "detect": [".github/copilot-instructions.md"],
+    },
     "cursorrules": {
         "name": ".cursorrules",
         "path": ".cursorrules",
@@ -558,20 +581,24 @@ def _remove_toml(config_path: Path, display_path: str, *, section: str) -> str |
         return None
-def write_instruction_file(project_dir: Path, file_key: str) -> bool:
+def write_instruction_file(
+    project_dir: Path, file_key: str, output_level: str = "standard",
+) -> bool:
     """Write CCE instructions to an editor's instruction file. Returns True if written."""
     info = INSTRUCTION_FILES[file_key]
     path = project_dir / info["path"]
     marker = "## Context Engine (CCE)"
+    path.parent.mkdir(parents=True, exist_ok=True)
+    instructions = _build_instructions(output_level)
     if path.exists():
         content = path.read_text()
         if marker in content:
             return False  # already has CCE block
         # Append
-        path.write_text(content.rstrip() + "\n\n" + _CCE_INSTRUCTIONS)
+        path.write_text(content.rstrip() + "\n\n" + instructions)
     else:
-        path.write_text(_CCE_INSTRUCTIONS)
+        path.write_text(instructions)
     return True
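
The marker check in `write_instruction_file` makes repeated runs idempotent, and the new `mkdir` call is what allows nested targets such as `.github/copilot-instructions.md`. A minimal standalone sketch of that behaviour follows; `write_block` and the placeholder `BLOCK` text are illustrative names, not part of the package.

```python
import tempfile
from pathlib import Path

MARKER = "## Context Engine (CCE)"
BLOCK = MARKER + "\nPlaceholder instruction text.\n"  # stand-in content


def write_block(path: Path, block: str = BLOCK) -> bool:
    """Write or append the block once; return False if the marker is present."""
    # Create parent directories first so nested paths work on a fresh repo.
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        content = path.read_text()
        if MARKER in content:
            return False  # already has the block
        path.write_text(content.rstrip() + "\n\n" + block)
    else:
        path.write_text(block)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".github" / "copilot-instructions.md"
    first = write_block(target)   # creates .github/ and the file
    second = write_block(target)  # marker found, nothing written
    print(first, second)
```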

context_engine/indexer/embedder.py CHANGED

@@ -319,16 +319,66 @@ class OllamaBackend:
             for _ in resp.iter_lines():
                 pass
+    # nomic-embed-text has an 8192-token context. Dense-tokenizing content
+    # (YAML with ${{ }}, Python separator comments) can hit ~1 char/token,
+    # so 3000 chars is a safe ceiling that works for all content types.
+    _MAX_EMBED_CHARS = 3000
     def _embed_batch(self, texts: list[str]) -> list[list[float]]:
         import httpx
-        resp = httpx.post(
-            f"{self.base_url}/api/embed",
-            json={"model": self.model_name, "input": texts},
-            timeout=self._timeout,
-        )
-        resp.raise_for_status()
-        data = resp.json()
-        return data.get("embeddings", [])
+        # Truncate oversized texts and skip empty ones
+        safe_texts = []
+        original_indices = []
+        for i, t in enumerate(texts):
+            if not t or not t.strip():
+                continue
+            safe_texts.append(t[:self._MAX_EMBED_CHARS])
+            original_indices.append(i)
+        if not safe_texts:
+            return [[] for _ in texts]
+        try:
+            resp = httpx.post(
+                f"{self.base_url}/api/embed",
+                json={"model": self.model_name, "input": safe_texts},
+                timeout=self._timeout,
+            )
+            resp.raise_for_status()
+            embeddings = resp.json().get("embeddings", [])
+        except httpx.HTTPStatusError as exc:
+            if exc.response.status_code != 400:
+                raise
+            # Batch failed (possibly one text still too large after truncation).
+            # Fall back to one-at-a-time with halving retry.
+            log.warning("Ollama batch embed failed, retrying one-at-a-time")
+            embeddings = []
+            for text in safe_texts:
+                vec = self._embed_single_with_retry(text)
+                embeddings.append(vec)
+        # Map embeddings back to original positions (empty texts get empty vecs)
+        result: list[list[float]] = [[] for _ in texts]
+        for idx, emb in zip(original_indices, embeddings):
+            result[idx] = emb
+        return result
+    def _embed_single_with_retry(self, text: str) -> list[float]:
+        """Embed a single text, halving on context-length errors."""
+        import httpx
+        while text:
+            resp = httpx.post(
+                f"{self.base_url}/api/embed",
+                json={"model": self.model_name, "input": [text]},
+                timeout=self._timeout,
+            )
+            if resp.status_code == 400 and "context length" in resp.text:
+                text = text[:len(text) // 2]
+                continue
+            resp.raise_for_status()
+            vecs = resp.json().get("embeddings", [[]])
+            return vecs[0] if vecs else []
+        return []
     def embed_texts(self, texts: list[str], batch_size: int = 64) -> list[list[float]]:
         out: list[list[float]] = []
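
The index bookkeeping in the new `_embed_batch` and the halving loop in `_embed_single_with_retry` can be exercised without a running Ollama server. The sketch below replaces the HTTP call with `fake_embed`, a hypothetical stand-in that returns the text length as a one-element vector, and models the "context length" rejection with a plain length check.

```python
MAX_EMBED_CHARS = 3000


def fake_embed(texts: list[str]) -> list[list[float]]:
    """Stand-in for the Ollama call: one 1-d vector holding the text length."""
    return [[float(len(t))] for t in texts]


def embed_batch(texts: list[str]) -> list[list[float]]:
    """Skip blank texts, truncate long ones, keep output aligned with input."""
    safe_texts: list[str] = []
    original_indices: list[int] = []
    for i, text in enumerate(texts):
        if not text or not text.strip():
            continue
        safe_texts.append(text[:MAX_EMBED_CHARS])
        original_indices.append(i)
    # Blank inputs keep an empty vector at their original position.
    result: list[list[float]] = [[] for _ in texts]
    for idx, emb in zip(original_indices, fake_embed(safe_texts)):
        result[idx] = emb
    return result


def embed_with_halving(text: str, limit: int) -> list[float]:
    """Halve the text until the (simulated) backend accepts it."""
    while text:
        if len(text) > limit:  # simulates a 400 "context length" response
            text = text[: len(text) // 2]
            continue
        return fake_embed([text])[0]
    return []


vectors = embed_batch(["def f(): pass", "   ", "x" * 5000])
halved = embed_with_halving("x" * 3000, limit=1000)
print(vectors)
print(halved)
```

Returning one slot per input, including empty vectors for skipped texts, keeps callers that zip embeddings against their chunks from silently misaligning.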

context_engine/memory/db.py CHANGED

@@ -281,6 +281,14 @@ def _try_load_vec(conn: sqlite3.Connection) -> bool:
         sqlite_vec.load(conn)
         conn.enable_load_extension(False)
         return True
+    except AttributeError:
+        log.warning(
+            "sqlite-vec load failed; semantic recall disabled. "
+            "Python was compiled without SQLite extension support. "
+            "Reinstall CCE with Homebrew Python: "
+            "uv tool install --python /opt/homebrew/bin/python3 --force code-context-engine"
+        )
+        return False
     except Exception as exc:
         log.warning("sqlite-vec load failed; semantic recall disabled: %s", exc)
         return False

context_engine/pricing.py CHANGED

@@ -3,23 +3,38 @@ import json
 import re
 import time
 from pathlib import Path
+from typing import TypedDict
 _CCE_HOME = Path.home() / ".cce"
 _CACHE_PATH = _CCE_HOME / "pricing_cache.json"
 _CACHE_TTL = 7 * 24 * 3600  # 7 days
 _DOCS_URL = "https://docs.anthropic.com/en/docs/about-claude/models"
+class ModelPricing(TypedDict):
+    input: float   # $/1M input tokens
+    output: float  # $/1M output tokens
 # Used only when fetch fails and no cache exists
-_FALLBACK: dict[str, float] = {
-    "opus": 5.0,
+_FALLBACK: dict[str, ModelPricing] = {
+    "opus": {"input": 15.0, "output": 75.0},
+    "sonnet": {"input": 3.0, "output": 15.0},
+    "haiku": {"input": 0.80, "output": 4.0},
+}
+# Flat input-only fallback kept for backward compat with existing cache files
+_FALLBACK_INPUT: dict[str, float] = {
+    "opus": 15.0,
     "sonnet": 3.0,
-    "haiku": 1.0,
+    "haiku": 0.80,
 }
-def _parse_html(html: str) -> dict[str, float] | None:
-    """Parse per-family input pricing from Anthropic docs HTML table."""
-    pricing: dict[str, float] = {}
+def _parse_html(html: str) -> dict[str, ModelPricing] | None:
+    """Parse per-family input + output pricing from Anthropic docs HTML table."""
+    input_pricing: dict[str, float] = {}
+    output_pricing: dict[str, float] = {}
     rows = re.findall(r"<tr[^>]*>(.*?)</tr>", html, re.DOTALL | re.IGNORECASE)
     col_families: list[str | None] = []
@@ -44,23 +59,42 @@ def _parse_html(html: str) -> dict[str, float] | None:
             col_families = families_in_row
             continue
-        # Pricing row: extract $ amounts per column
-        if col_families and any(
-            "input" in c.lower() and "tok" in c.lower() for c in cells
-        ):
+        if not col_families:
+            continue
+        # Detect whether this is an input or output pricing row
+        is_input = any("input" in c.lower() and "tok" in c.lower() for c in cells)
+        is_output = any("output" in c.lower() and "tok" in c.lower() for c in cells)
+        target = None
+        if is_input and not is_output:
+            target = input_pricing
+        elif is_output and not is_input:
+            target = output_pricing
+        if target is not None:
             for i, cell in enumerate(cells):
                 if i < len(col_families) and col_families[i]:
                     m = re.search(r"\$(\d+(?:\.\d+)?)", cell)
                     if m:
                         family = col_families[i]
-                        if family not in pricing:
-                            pricing[family] = float(m.group(1))
-            col_families = []
+                        if family not in target:
+                            target[family] = float(m.group(1))
+            if target is output_pricing:
+                col_families = []
+    if not input_pricing:
+        return None
-    return pricing if pricing else None
+    result: dict[str, ModelPricing] = {}
+    for family in input_pricing:
+        result[family] = {
+            "input": input_pricing[family],
+            "output": output_pricing.get(family, input_pricing[family] * 5),
+        }
+    return result
-def _fetch() -> dict[str, float] | None:
+def _fetch() -> dict[str, ModelPricing] | None:
     try:
         import httpx
@@ -72,19 +106,29 @@ def _fetch() -> dict[str, float] | None:
         return None
-def _load_cache() -> dict[str, float] | None:
+def _load_cache() -> dict[str, ModelPricing] | None:
     try:
         if not _CACHE_PATH.exists():
             return None
         data = json.loads(_CACHE_PATH.read_text())
         if time.time() - data.get("ts", 0) < _CACHE_TTL:
-            return data.get("pricing")
+            raw = data.get("pricing")
+            if not raw:
+                return None
+            # Migrate flat input-only cache to ModelPricing format
+            first = next(iter(raw.values()), None)
+            if isinstance(first, (int, float)):
+                return {
+                    k: {"input": v, "output": v * 5}
+                    for k, v in raw.items()
+                }
+            return raw
     except Exception:
         pass
     return None
-def _save_cache(pricing: dict[str, float]) -> None:
+def _save_cache(pricing: dict[str, ModelPricing]) -> None:
     try:
         _CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
         _CACHE_PATH.write_text(json.dumps({"ts": time.time(), "pricing": pricing}))
@@ -92,8 +136,8 @@ def _save_cache(pricing: dict[str, float]) -> None:
         pass
-def get_model_pricing() -> dict[str, float]:
-    """Return {family: input_price_per_1M_tokens}. Cached 7 days."""
+def get_model_pricing() -> dict[str, ModelPricing]:
+    """Return {family: {input, output}} pricing per 1M tokens. Cached 7 days."""
     cached = _load_cache()
     if cached:
         return cached
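
The cache migration in `_load_cache` can be illustrated in isolation. This sketch copies the shape of the check (a flat `{family: price}` cache is upgraded by assuming output costs five times input) into a hypothetical `migrate` function; the sample cache contents are made up for the example.

```python
from typing import TypedDict


class ModelPricing(TypedDict):
    input: float   # $/1M input tokens
    output: float  # $/1M output tokens


def migrate(raw: dict) -> dict[str, ModelPricing]:
    """Upgrade a flat input-only cache to the {input, output} format."""
    first = next(iter(raw.values()), None)
    if isinstance(first, (int, float)):
        # Old format: a bare number per family. Output is estimated at 5x input.
        return {k: {"input": v, "output": v * 5} for k, v in raw.items()}
    return raw  # already in the new format (or empty)


old_cache = {"opus": 15.0, "sonnet": 3.0}
new_cache = {"haiku": {"input": 0.80, "output": 4.0}}
print(migrate(old_cache))
print(migrate(new_cache))
```

Inspecting only the first value keeps the check cheap, at the cost of assuming a cache file never mixes the two formats.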

context_engine/storage/vector_store.py CHANGED

@@ -46,9 +46,23 @@ class VectorStore:
     def _connect(self) -> sqlite3.Connection:
         import sqlite_vec
         conn = sqlite3.connect(self._db_file, check_same_thread=False)
-        conn.enable_load_extension(True)
-        sqlite_vec.load(conn)
-        conn.enable_load_extension(False)
+        try:
+            conn.enable_load_extension(True)
+            sqlite_vec.load(conn)
+            conn.enable_load_extension(False)
+        except AttributeError:
+            raise RuntimeError(
+                "Your Python was compiled without SQLite extension support "
+                "(enable_load_extension is missing). This is common with "
+                "python.org installers on macOS.\n\n"
+                "Fix: reinstall CCE under a Python that has extension support:\n"
+                "  uv tool install --python $(brew --prefix python3)/bin/python3 "
+                "--force code-context-engine\n\n"
+                "Or use Homebrew Python directly:\n"
+                "  brew install python3\n"
+                "  uv tool install --python /opt/homebrew/bin/python3 "
+                "--force code-context-engine"
+            ) from None
         conn.execute("PRAGMA journal_mode=WAL")
         conn.execute("PRAGMA synchronous=NORMAL")
         return conn
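
The `AttributeError` handlers added here and in `context_engine/memory/db.py` rely on the fact that a Python built without loadable-extension support omits `enable_load_extension` from `sqlite3.Connection` entirely. A small sketch of how that condition could be probed up front; `supports_extensions` is an illustrative helper, not something the package ships.

```python
import sqlite3


def supports_extensions() -> bool:
    """True if this interpreter's sqlite3 module can load extensions.

    Builds compiled without loadable-extension support do not define
    Connection.enable_load_extension at all, so calling it raises
    AttributeError rather than a sqlite3 error.
    """
    return hasattr(sqlite3.Connection, "enable_load_extension")


if supports_extensions():
    print("sqlite3 extension loading is available")
else:
    print("sqlite3 was built without extension support")
```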

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/WHEEL RENAMED

File without changes

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/entry_points.txt RENAMED

File without changes

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/licenses/LICENSE RENAMED

File without changes

{code_context_engine-0.4.20.dist-info → code_context_engine-0.4.22.dist-info}/top_level.txt RENAMED

File without changes
