npm - context-mcp-server - Versions diffs - 1.0.1 - Mend

context-mcp-server 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (58)

package/README.md +464 -0
package/codegraph/__init__.py +0 -0
package/codegraph/__main__.py +24 -0
package/codegraph/__pycache__/__init__.cpython-313.pyc +0 -0
package/codegraph/__pycache__/__main__.cpython-313.pyc +0 -0
package/codegraph/__pycache__/cache.cpython-313.pyc +0 -0
package/codegraph/__pycache__/config.cpython-313.pyc +0 -0
package/codegraph/__pycache__/report.cpython-313.pyc +0 -0
package/codegraph/__pycache__/scanner.cpython-313.pyc +0 -0
package/codegraph/__pycache__/server.cpython-313.pyc +0 -0
package/codegraph/cache.py +137 -0
package/codegraph/config.py +31 -0
package/codegraph/extractors/__init__.py +0 -0
package/codegraph/extractors/__pycache__/__init__.cpython-313.pyc +0 -0
package/codegraph/extractors/__pycache__/ast_extractor.cpython-313.pyc +0 -0
package/codegraph/extractors/__pycache__/audio_extractor.cpython-313.pyc +0 -0
package/codegraph/extractors/__pycache__/doc_extractor.cpython-313.pyc +0 -0
package/codegraph/extractors/__pycache__/image_extractor.cpython-313.pyc +0 -0
package/codegraph/extractors/ast_extractor.py +222 -0
package/codegraph/extractors/audio_extractor.py +8 -0
package/codegraph/extractors/doc_extractor.py +34 -0
package/codegraph/extractors/image_extractor.py +26 -0
package/codegraph/graph/__init__.py +0 -0
package/codegraph/graph/__pycache__/__init__.cpython-313.pyc +0 -0
package/codegraph/graph/__pycache__/builder.cpython-313.pyc +0 -0
package/codegraph/graph/__pycache__/clustering.cpython-313.pyc +0 -0
package/codegraph/graph/__pycache__/query.cpython-313.pyc +0 -0
package/codegraph/graph/builder.py +145 -0
package/codegraph/graph/clustering.py +40 -0
package/codegraph/graph/query.py +283 -0
package/codegraph/report.py +115 -0
package/codegraph/scanner.py +92 -0
package/codegraph/server.py +514 -0
package/package.json +62 -0
package/src/cli.js +1010 -0
package/src/config.js +89 -0
package/src/db.js +786 -0
package/src/guard.js +20 -0
package/src/hooks/autoContext.js +17 -0
package/src/hooks/autoLink.js +7 -0
package/src/http.js +765 -0
package/src/index.js +47 -0
package/src/search.js +50 -0
package/src/server.js +80 -0
package/src/summarizer.js +124 -0
package/src/templates/AGENTS.md +76 -0
package/src/templates/CLAUDE.md +94 -0
package/src/templates/GEMINI.md +76 -0
package/src/templates/cursor-rules.mdc +41 -0
package/src/templates/windsurf-rules.md +35 -0
package/src/tools/codegraph.js +215 -0
package/src/tools/context.js +188 -0
package/src/tools/discussion.js +123 -0
package/src/tools/errorCheck.js +65 -0
package/src/tools/fileTools.js +185 -0
package/src/tools/gitTools.js +259 -0
package/src/tools/search.js +55 -0
package/src/vector.js +153 -0

package/README.md ADDED

@@ -0,0 +1,464 @@
+# context-mcp
+Persistent memory and codebase knowledge graph for AI coding assistants — delivered as a single MCP server.
+One shared context store. Works across Claude Code, Cursor, Gemini CLI, Codex, Windsurf, VS Code Copilot, Claude.ai, and ChatGPT. Save context from one AI, pick it up in another. Your memory follows the project, not the tool.
+---
+## The Problem
+Every conversation with an AI assistant starts from zero. The AI re-reads files it already read yesterday, re-discovers architecture it already understood, re-derives decisions that were already made. You repeat context. You paste the same background. You explain the same things.
+This gets worse as projects grow. A codebase with 50 files means the AI either reads all of them every time (burning thousands of tokens) or misses context and gives wrong answers.
+---
+## What context-mcp Solves
+**1. You lose context between conversations.**
+AI assistants have no memory. Every new chat is a blank slate. context-mcp gives the AI a persistent store of decisions, bugs, notes, and architecture — loaded automatically at conversation start.
+**2. Context is siloed to one tool.**
+You fix a bug with Claude Code, then open Cursor and it knows nothing about it. context-mcp stores everything in `~/.context-mcp/` — a single shared store on your machine. Any AI that connects reads and writes the same store.
+**3. Structural understanding costs too many tokens.**
+Reading 20 files to answer "what calls this function?" is wasteful. context-mcp builds a knowledge graph of your codebase once, then answers structural questions in ~500 tokens instead of ~50,000.
+**4. Repeated enrichment is expensive.**
+AI-written descriptions of your code nodes are computed once and stored permanently. They survive file changes, rebuilds, and new conversations — never paid for twice.
+---
+## Installation
+```bash
+npm install -g context-mcp-server
+```
+That's it. One command installs everything — the MCP server, HTTP server, and `ctx` CLI.
+Then run from your project root:
+```bash
+ctx install --all
+```
+This writes MCP config + AI instruction files for every platform **and** automatically sets up the Python codegraph environment if [uv](https://docs.astral.sh/uv/) is installed.
+> **CodeGraph requires uv.** Install it first if you want graph features:
+> ```bash
+> curl -Ls https://astral.sh/uv/install.sh | sh   # macOS / Linux
+> winget install astral-sh.uv                      # Windows
+> ```
+> Memory tools work with npm alone — uv is only needed for `codegraph_build` and graph queries.
+Requires Node.js ≥ 18.
+Installs three commands:
+| Command | What it runs |
+|---------|-------------|
+| `context-mcp` | Stdio MCP server (for local AI clients) |
+| `context-mcp-http` | HTTP MCP server with OAuth 2.0 (for web clients) |
+| `ctx` | Interactive CLI — browse, search, manage context |
+---
+## Platform Setup
+```bash
+ctx install --claude      # Claude Code
+ctx install --cursor      # Cursor
+ctx install --vscode      # VS Code Copilot
+ctx install --gemini      # Gemini CLI
+ctx install --codex       # Codex CLI
+ctx install --windsurf    # Windsurf
+ctx install --all         # all platforms + Python setup at once
+```
+Run from your project root. Each command writes the MCP config file and AI instruction file for that platform, then checks for uv and sets up the Python codegraph environment.
+---
+### Claude Code
+`ctx install --claude` writes:
+- `.claude/mcp.json` — MCP server config
+- `CLAUDE.md` — instructions Claude reads automatically at conversation start
+Manual config — add to `.claude/mcp.json`:
+```json
+{
+  "mcpServers": {
+    "context-mcp": {
+      "command": "npx",
+      "args": ["-y", "context-mcp-server@latest"]
+    }
+  }
+}
+```
+---
+### Cursor
+`ctx install --cursor` writes:
+- `.cursor/mcp.json` — MCP server config
+- `.cursor/rules/context-mcp.mdc` — Cursor rules file
+Manual config — add to `.cursor/mcp.json`:
+```json
+{
+  "mcpServers": {
+    "context-mcp": {
+      "command": "npx",
+      "args": ["-y", "context-mcp-server@latest"]
+    }
+  }
+}
+```
+---
+### VS Code Copilot
+`ctx install --vscode` writes:
+- `.vscode/mcp.json` — MCP server config
+- `CLAUDE.md` — instruction file
+Manual config — add to `.vscode/mcp.json`:
+```json
+{
+  "servers": {
+    "context-mcp": {
+      "type": "stdio",
+      "command": "npx",
+      "args": ["-y", "context-mcp-server@latest"]
+    }
+  }
+}
+```
+---
+### Gemini CLI
+`ctx install --gemini` writes:
+- `.gemini/settings.json` — MCP server config
+- `GEMINI.md` — instructions Gemini reads automatically
+Manual config — add to `.gemini/settings.json`:
+```json
+{
+  "mcpServers": {
+    "context-mcp": {
+      "command": "npx",
+      "args": ["-y", "context-mcp-server@latest"]
+    }
+  }
+}
+```
+---
+### Codex CLI
+`ctx install --codex` writes:
+- `.codex/config.toml` — MCP server config
+- `AGENTS.md` — instructions Codex reads automatically
+Manual config — add to `.codex/config.toml`:
+```toml
+[[mcp_servers]]
+name    = "context-mcp"
+command = "npx"
+args    = ["-y", "context-mcp-server@latest"]
+```
+---
+### Windsurf
+`ctx install --windsurf` writes:
+- `.windsurf/rules/context-mcp.md` — local rules file (project scope)
+- `~/.codeium/windsurf/mcp_config.json` — global MCP config (merged, not overwritten)
+Manual config — add to `~/.codeium/windsurf/mcp_config.json`:
+```json
+{
+  "mcpServers": {
+    "context-mcp": {
+      "command": "npx",
+      "args": ["-y", "context-mcp-server@latest"]
+    }
+  }
+}
+```
+---
+### Claude.ai / ChatGPT (HTTP mode)
+Web-based clients connect over HTTP with OAuth 2.0. Use `ctx online` to start the HTTP server.
+**Step 1 — Start the server:**
+```bash
+ctx online
+```
+Starts the server in the background, shows your OAuth credentials, and prints the endpoint URL. Safe to re-run — won't start a second copy.
+```bash
+ctx online --restart   # force restart
+ctx online --port 3200 # use a different port
+```
+Or start directly:
+```bash
+context-mcp-http --port 3100 --host localhost --access-git
+```
+**Step 2 — Add as a remote MCP connector:**
+1. Go to Claude.ai → Settings → Integrations → Add MCP Connector
+2. Enter your server URL (e.g. `http://localhost:3100`)
+3. Use the **Client ID** and **Client Secret** from `~/.context-mcp/contextconfig.json`
+**View or edit config:**
+```bash
+ctx settings
+```
+---
+## Path Sandboxing (Security)
+File and git tools are sandboxed to your project root. Pass `rootPath` when calling `context.resume` to register it:
+```json
+{ "action": "resume", "project": "my-app", "rootPath": "/home/user/my-app" }
+```
+The root is stored permanently with the project. Any file or git operation outside that directory is rejected. This applies to all HTTP-connected clients (Claude.ai, ChatGPT) — they can only access files within the registered project root.
+---
+## CLI Reference
+```bash
+ctx                                  # open interactive mode
+# Context
+ctx list [project]                   # list entries, discussions, graphs
+ctx projects                         # all projects with IDs, graph status, recent entries
+ctx search "query"                   # keyword → semantic fallback search
+ctx add                              # add entry interactively
+ctx summary [project]                # summarize recent entries
+# Delete
+ctx delete <id-prefix>               # delete one entry by ID prefix
+ctx delete project <name|id>         # delete all entries for a project
+# Server
+ctx online                           # start HTTP server (idempotent)
+ctx online --restart                 # force stop + restart
+ctx online --port 3200               # use a different port
+ctx settings                         # view and edit config interactively
+# Setup
+ctx install --claude                 # write MCP config for Claude Code
+ctx install --cursor                 # write MCP config for Cursor
+ctx install --vscode                 # write MCP config for VS Code
+ctx install --gemini                 # write MCP config for Gemini CLI
+ctx install --codex                  # write MCP config for Codex CLI
+ctx install --windsurf               # write MCP config for Windsurf
+ctx install --all                    # all platforms + Python setup
+# Tools
+ctx benchmark                        # real token savings report (memory + graph)
+ctx discuss [project]                # view discussions
+```
+---
+## Server Flags
+### `context-mcp` (stdio)
+```
+context-mcp [options]
+Options:
+  --data-dir <path>   Override storage directory (default: ~/.context-mcp)
+                      Also via env: CONTEXT_MCP_DIR=<path>
+  --help, -h          Show help
+```
+### `context-mcp-http` (HTTP + OAuth)
+```
+context-mcp-http [options]
+Options:
+  --port <number>     HTTP listen port (default: 3100)
+  --host <string>     Bind address (default: localhost)
+  --access-git        Enable git tools for connected clients
+  --data-dir <path>   Override storage directory (default: ~/.context-mcp)
+                      Also via env: CONTEXT_MCP_DIR=<path>
+  --help, -h          Show help
+```
+---
+## Config Reference
+Config lives at `~/.context-mcp/contextconfig.json` — auto-created on first run:
+```json
+{
+  "client_id": "context-mcp",
+  "client_secret": "<auto-generated>",
+  "port": 3100,
+  "host": "localhost",
+  "access_git": false,
+  "public_url": null,
+  "allowed_redirect_uris": ["https://claude.ai"],
+  "allowed_origins": []
+}
+```
+| Field | Default | Description |
+|-------|---------|-------------|
+| `client_id` | `"context-mcp"` | OAuth client ID |
+| `client_secret` | auto-generated | OAuth signing secret — keep private |
+| `port` | `3100` | HTTP server port |
+| `host` | `"localhost"` | HTTP bind host |
+| `access_git` | `false` | Enable git tools for HTTP clients |
+| `public_url` | `null` | Public URL shown in `ctx online` output |
+| `allowed_redirect_uris` | `["https://claude.ai"]` | OAuth redirect URI whitelist |
+| `allowed_origins` | `[]` | Extra CORS origins beyond `claude.ai` and `localhost` |
+Edit any field interactively with `ctx settings`.
+---
+## Features
+### Memory
+- `context.resume` — loads recent entries, active discussions, and graph status. Pass `rootPath` to sandbox file/git tools to your project directory.
+- `context.save` — store decisions, bugs, notes, code snippets, architecture with type tags
+- `context.get` / `context.update` / `context.delete` — full CRUD
+- `search` — keyword-first, semantic fallback, searches all past context
+- `discussion` — threaded plans with steps, status tracking, cross-session continuity
+- Auto-deduplication on save
+- Auto-compact at 50 entries (oldest entries summarized into a digest)
+- Per-project isolation with stable UUIDs
+### File & Git Tools (HTTP mode)
+Available to web clients (Claude.ai, ChatGPT) only — local AI clients use their native IDE tools directly.
+- `read_file`, `write_file`, `patch_file`, `create_dir`, `list_dir`, `delete_file`
+- `git_status`, `git_diff`, `git_log`, `git_add`, `git_commit`, `git_push`, `git_pull`, `git_branch`, `git_stash`, `git_reset`, `git_show`
+All file and git operations are sandboxed to the registered project root. Enable git tools with `--access-git` or `access_git: true` in config.
+### CodeGraph
+- `codegraph_build` — AST scan: functions, classes, imports, edges. Runs locally, no API cost.
+- `codegraph_extract` — returns changed files with node lists for AI enrichment
+- `codegraph_add_nodes` — stores AI-written descriptions in permanent semantic cache
+- `codegraph_query` — natural language structural question → NODE/EDGE subgraph with `token_budget` control
+- `codegraph_explain` — single node: description, dependencies, usages
+- `codegraph_path` — shortest path between two concepts
+- `codegraph_nodes` — list all nodes of a given type
+- `codegraph_report` — full graph analysis: god nodes, clusters, surprising connections
+### Multi-AI Support
+| AI | Config File | Instruction File |
+|----|------------|-----------------|
+| Claude Code | `.claude/mcp.json` | `CLAUDE.md` |
+| VS Code Copilot | `.vscode/mcp.json` | `CLAUDE.md` |
+| Cursor | `.cursor/mcp.json` | `.cursor/rules/context-mcp.mdc` |
+| Gemini CLI | `.gemini/settings.json` | `GEMINI.md` |
+| Codex CLI | `.codex/config.toml` | `AGENTS.md` |
+| Windsurf | `~/.codeium/windsurf/mcp_config.json` | `.windsurf/rules/context-mcp.md` |
+| Claude.ai / ChatGPT | HTTP (`ctx online`) | — |
+> The context store lives at `~/.context-mcp/` — not inside any tool, IDE, or session. A decision saved in Claude Code is visible in Cursor. A bug logged from Gemini CLI shows up when you resume in Codex.
+---
+## Token Reduction
+| Scenario | Without context-mcp | With context-mcp |
+|----------|-------------------|--------------------|
+| Start of conversation | Paste background, re-explain project | `context.resume` → 15 entries, ~750 tokens |
+| "What calls function X?" | Read 10 files to trace callers | `codegraph_query` → subgraph, ~400 tokens |
+| "What does module Y depend on?" | Read module + all imports | `codegraph_explain` → node + edges, ~200 tokens |
+| Understand architecture | Read 20+ files | Graph built once, queried forever |
+| Remember last session's decision | Ask user or re-derive | `context.resume` loads it automatically |
+Real measured reduction on this project: **162× fewer tokens**, **99.38% reduction** per conversation.
+---
+## Architecture
+```
+context-mcp/
+├── src/
+│   ├── index.js           Stdio MCP server entrypoint
+│   ├── server.js          MCP server — registers all tools
+│   ├── db.js              JSON store — in-memory cache, debounced writes, project registry
+│   ├── guard.js           Path sandboxing — enforces project root on all file/git ops
+│   ├── search.js          Keyword + semantic search
+│   ├── summarizer.js      Auto-compact summarization
+│   ├── cli.js             Interactive CLI (ctx)
+│   ├── http.js            HTTP server — OAuth 2.0 + Streamable HTTP transport
+│   ├── config.js          Config loader — contextconfig.json + keytar
+│   ├── vector.js          Embedding helpers
+│   └── tools/
+│       ├── context.js     Memory tool (resume/save/get/update/delete)
+│       ├── discussion.js  Discussion tool (threaded plans + steps)
+│       ├── codegraph.js   CodeGraph tool — bridge to Python subprocess
+│       ├── search.js      Search tool
+│       ├── fileTools.js   File read/write (HTTP mode, sandboxed to project root)
+│       ├── gitTools.js    Git integration (HTTP mode, sandboxed to project root)
+│       └── errorCheck.js  Error checking tool
+├── codegraph/             Python package — AST extraction + graph queries
+│   ├── server.py          Dispatcher — reads JSON from stdin, routes to tools
+│   ├── scanner.py         File walker + classifier
+│   ├── cache.py           Two-layer cache (ast.json + semantic.json)
+│   ├── report.py          Graph report generator
+│   ├── extractors/
+│   │   ├── ast_extractor.py
+│   │   ├── doc_extractor.py
+│   │   ├── image_extractor.py
+│   │   └── audio_extractor.py
+│   └── graph/
+│       ├── builder.py     NetworkX graph construction
+│       ├── query.py       Natural language → subgraph traversal
+│       └── clustering.py  Community detection
+└── ~/.context-mcp/        Data directory (outside repo, never committed)
+    ├── contexts.json
+    ├── discussions.json
+    ├── projects.json      Project registry — includes rootPath per project
+    ├── graphs.json
+    └── contextconfig.json OAuth config + server settings
+```
+---
+## License
+MIT
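
The README's "Path Sandboxing" section says any file or git operation outside the registered `rootPath` is rejected. The check itself lives in `src/guard.js`, which is not shown in this part of the diff, so the following is only a minimal Python sketch of the idea under stated assumptions. The function name `is_within_root` is hypothetical and is not part of the package.

```python
from pathlib import Path


def is_within_root(root: str, candidate: str) -> bool:
    """Return True if `candidate` resolves to `root` or to a path beneath it.

    Hypothetical helper: it illustrates a project-root sandbox check and is
    not the package's actual guard implementation.
    """
    root_path = Path(root).resolve()
    # Relative candidates are interpreted against the project root;
    # absolute candidates replace it (standard pathlib join behaviour).
    target = (root_path / candidate).resolve()
    # resolve() collapses ".." segments, so traversal attempts are compared
    # by their final location rather than by their spelling.
    return target == root_path or root_path in target.parents


if __name__ == "__main__":
    root = "/home/user/my-app"
    for path in ("src/db.js", "../other/secret.txt", "/etc/passwd"):
        print(path, "allowed" if is_within_root(root, path) else "rejected")
```

Comparing resolved path components rather than raw string prefixes means a sibling directory such as `/home/user/my-app-evil` is not mistaken for something inside `/home/user/my-app`.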

package/codegraph/__init__.py ADDED

File without changes

package/codegraph/__main__.py ADDED

@@ -0,0 +1,24 @@
+"""
+codegraph/__main__.py — stdin/stdout dispatcher for Node.js integration.
+Reads {"tool": "codegraph_build", "args": {...}} from stdin, writes result JSON to stdout.
+"""
+import json
+import sys
+import asyncio
+from .server import _dispatch
+def main():
+    try:
+        payload = json.loads(sys.stdin.read())
+        result = asyncio.run(_dispatch(payload["tool"], payload["args"]))
+        print(json.dumps(result))
+    except Exception as e:
+        print(json.dumps({"error": str(e)}))
+        sys.exit(1)
+if __name__ == "__main__":
+    main()
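
The dispatcher above reads one JSON object from stdin and writes one JSON object to stdout, using `{"error": "..."}` plus exit code 1 on failure. A caller on the other side of that pipe might look roughly like the sketch below. The helper names (`build_request`, `parse_response`, `call_codegraph`) and the `project_root` argument key are assumptions made for illustration; the real argument names are defined in `server.py`, and the package's actual caller is `src/tools/codegraph.js`.

```python
import json
import subprocess
import sys


def build_request(tool: str, args: dict) -> str:
    """Serialize a request in the {"tool": ..., "args": ...} shape read from stdin."""
    return json.dumps({"tool": tool, "args": args})


def parse_response(stdout_text: str) -> dict:
    """Decode the JSON printed by the dispatcher; raise on the error envelope."""
    result = json.loads(stdout_text)
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(result["error"])
    return result


def call_codegraph(tool: str, args: dict) -> dict:
    """Run `python -m codegraph` once and return its decoded result.

    Assumes the `codegraph` package is importable by the current interpreter.
    """
    proc = subprocess.run(
        [sys.executable, "-m", "codegraph"],
        input=build_request(tool, args),
        capture_output=True,
        text=True,
    )
    # The dispatcher prints the error envelope before exiting with status 1,
    # so stdout is parsed in both the success and the failure case.
    return parse_response(proc.stdout)


if __name__ == "__main__":
    # "project_root" is a hypothetical argument key used only for this demo.
    print(build_request("codegraph_build", {"project_root": "."}))
```

Because each call is a single request and a single response, the Python process can stay stateless and the caller only needs to handle one JSON document per invocation.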

package/codegraph/__pycache__/__init__.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/__main__.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/cache.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/config.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/report.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/scanner.cpython-313.pyc ADDED

Binary file

package/codegraph/__pycache__/server.cpython-313.pyc ADDED

Binary file

package/codegraph/cache.py ADDED

@@ -0,0 +1,137 @@
+"""
+cache.py — SHA-256 file hash cache for codegraph.
+Two separate caches:
+  codegraph-cache/ast.json      — AST-extracted nodes (overwritten on rebuild)
+  codegraph-cache/semantic.json — AI-written descriptions (never overwritten by rebuild)
+Format: { "rel/path": { "hash": "...", "nodes": [...], "extracted_at": "..." } }
+"""
+import hashlib
+import json
+from datetime import datetime, timezone
+from pathlib import Path
+def _ast_path(project_root: str) -> Path:
+    return Path(project_root) / "codegraph-cache" / "ast.json"
+def _semantic_path(project_root: str) -> Path:
+    return Path(project_root) / "codegraph-cache" / "semantic.json"
+def _read(p: Path) -> dict:
+    if not p.exists():
+        return {}
+    try:
+        return json.loads(p.read_text(encoding="utf-8"))
+    except Exception:
+        return {}
+def _write(p: Path, data: dict) -> None:
+    p.parent.mkdir(parents=True, exist_ok=True)
+    tmp = p.with_suffix(".tmp")
+    tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
+    tmp.replace(p)
+def load_cache(project_root: str) -> dict:
+    """Load merged view: AST base + semantic descriptions overlaid."""
+    ast = _read(_ast_path(project_root))
+    sem = _read(_semantic_path(project_root))
+    merged = {}
+    all_keys = set(ast) | set(sem)
+    for key in all_keys:
+        ast_entry = ast.get(key, {})
+        sem_entry = sem.get(key, {})
+        # Use AST hash for change detection (source of truth)
+        merged[key] = {
+            "hash":         ast_entry.get("hash", sem_entry.get("hash", "")),
+            "nodes":        _merge_nodes(ast_entry.get("nodes", []), sem_entry.get("nodes", [])),
+            "extracted_at": ast_entry.get("extracted_at", sem_entry.get("extracted_at", "")),
+        }
+    return merged
+def _merge_nodes(ast_nodes: list, sem_nodes: list) -> list:
+    """Overlay semantic descriptions onto AST nodes by name."""
+    sem_by_name = {n.get("name"): n for n in sem_nodes if n.get("name")}
+    result = []
+    for n in ast_nodes:
+        name = n.get("name")
+        if name and name in sem_by_name:
+            merged = dict(n)
+            sem_desc = sem_by_name[name].get("description", "")
+            if sem_desc:
+                merged["description"] = sem_desc
+            result.append(merged)
+        else:
+            result.append(n)
+    # Append semantic-only nodes (from doc files) not in AST
+    ast_names = {n.get("name") for n in ast_nodes}
+    for n in sem_nodes:
+        if n.get("name") not in ast_names:
+            result.append(n)
+    return result
+def save_cache(project_root: str, cache: dict) -> None:
+    """Write back to AST cache only (used by build pipeline)."""
+    _write(_ast_path(project_root), cache)
+def save_semantic_cache(project_root: str, updates: dict[str, list]) -> None:
+    """
+    Persist AI-written descriptions into semantic cache.
+    updates: { rel_path: [nodes_with_descriptions] }
+    Never touched by rebuild — descriptions survive file changes.
+    """
+    sem = _read(_semantic_path(project_root))
+    for rel_path, nodes in updates.items():
+        existing = {n.get("name"): n for n in sem.get(rel_path, {}).get("nodes", [])}
+        for n in nodes:
+            name = n.get("name")
+            if name:
+                existing[name] = {**existing.get(name, {}), **{k: v for k, v in n.items() if v}}
+        sem[rel_path] = {
+            "nodes":        list(existing.values()),
+            "extracted_at": datetime.now(timezone.utc).isoformat(),
+        }
+    _write(_semantic_path(project_root), sem)
+def file_hash(path: str) -> str:
+    h = hashlib.sha256()
+    with open(path, "rb") as f:
+        for chunk in iter(lambda: f.read(65536), b""):
+            h.update(chunk)
+    return h.hexdigest()
+def get_cached_nodes(cache: dict, rel_path: str, current_hash: str) -> list | None:
+    """Return cached nodes if hash matches, else None."""
+    entry = cache.get(rel_path)
+    if entry and entry.get("hash") == current_hash:
+        return entry.get("nodes", [])
+    return None
+def set_cached_nodes(cache: dict, rel_path: str, file_hash_val: str, nodes: list) -> None:
+    cache[rel_path] = {
+        "hash":         file_hash_val,
+        "nodes":        nodes,
+        "extracted_at": datetime.now(timezone.utc).isoformat(),
+    }
+def remove_deleted(cache: dict, existing_rel_paths: set) -> list:
+    """Remove cache entries for files that no longer exist. Returns removed keys."""
+    removed = [k for k in list(cache.keys()) if k not in existing_rel_paths]
+    for k in removed:
+        del cache[k]
+    return removed
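
The two-layer design in `cache.py` is easiest to see on a small input: AST nodes are the base, semantic descriptions are overlaid by `name`, and semantic-only nodes are appended at the end. The sketch below is a compact restatement of the rule implemented by `_merge_nodes`, not a replacement for it. The function name `overlay` and the `type` field in the sample data are assumptions; only `name` and `description` appear in the code above.

```python
def overlay(ast_nodes: list, sem_nodes: list) -> list:
    """Overlay non-empty semantic descriptions onto AST nodes, matched by name."""
    sem_by_name = {n["name"]: n for n in sem_nodes if n.get("name")}
    merged = []
    for node in ast_nodes:
        desc = sem_by_name.get(node.get("name"), {}).get("description")
        # Copy the node only when there is a description to add.
        merged.append({**node, "description": desc} if desc else node)
    # Nodes that exist only in the semantic layer are kept, after the AST nodes.
    ast_names = {n.get("name") for n in ast_nodes}
    merged.extend(n for n in sem_nodes if n.get("name") not in ast_names)
    return merged


# Sample data; the "type" values are illustrative only.
ast_nodes = [
    {"name": "load_cache", "type": "function"},
    {"name": "file_hash", "type": "function"},
]
sem_nodes = [
    {"name": "load_cache", "description": "Merges the AST and semantic caches."},
    {"name": "Caching overview", "type": "doc", "description": "Design notes."},
]

if __name__ == "__main__":
    for node in overlay(ast_nodes, sem_nodes):
        print(node)
```

Here `load_cache` gains its description, `file_hash` passes through unchanged, and `Caching overview` is appended because it has no AST counterpart. Since a rebuild rewrites only `ast.json`, a description stays attached for as long as a node with the same name exists in the same file.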

package/codegraph/config.py ADDED

@@ -0,0 +1,31 @@
+"""
+config.py — codegraph settings.
+"""
+import os
+# Files/dirs to ignore during scanning
+DEFAULT_IGNORE = {
+    "node_modules", ".git", "dist", "build", ".next", "__pycache__",
+    ".venv", "venv", "env", ".env", "coverage", ".DS_Store",
+    "codegraph-cache", ".pytest_cache", ".mypy_cache",
+}
+# Extensions handled by each extractor
+CODE_EXTENSIONS = {
+    ".py", ".js", ".ts", ".jsx", ".tsx", ".mjs", ".cjs",
+    ".go", ".rs", ".java", ".c", ".cpp", ".h", ".hpp", ".rb",
+}
+SQL_EXTENSIONS   = {".sql"}
+CONFIG_EXTENSIONS = {".yaml", ".yml", ".toml", ".env", ".ini", ".cfg"}
+DOC_EXTENSIONS   = {".md", ".txt", ".rst", ".mdx"}
+PDF_EXTENSIONS   = {".pdf"}
+IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".svg", ".gif", ".webp"}
+AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}
+VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}
+# Max file size to process (bytes) — skip huge generated files
+MAX_FILE_BYTES = 500_000
+# Max characters of doc text returned to the AI per file
+DOC_MAX_CHARS = 8_000
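
These extension sets and the size limit are presumably consumed by `scanner.py` (described in the README as a "File walker + classifier"), which is not shown in this part of the diff. As a rough sketch of how such settings could drive classification, the example below copies a subset of the sets; the `classify` function, the `CATEGORIES` mapping, and the category labels are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Subset of the sets defined in config.py, grouped under illustrative labels.
CATEGORIES = {
    "code": {".py", ".js", ".ts"},
    "config": {".yaml", ".yml", ".toml", ".env", ".ini", ".cfg"},
    "doc": {".md", ".txt", ".rst", ".mdx"},
    "image": {".png", ".jpg", ".jpeg", ".svg", ".gif", ".webp"},
}
MAX_FILE_BYTES = 500_000


def classify(path: str, size_bytes: int) -> Optional[str]:
    """Return a category label for `path`, or None if it should be skipped."""
    if size_bytes > MAX_FILE_BYTES:
        return None  # too large, likely a generated file
    suffix = Path(path).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return None


if __name__ == "__main__":
    for name, size in (("src/cli.js", 40_000), ("README.md", 12_000), ("bundle.js", 600_000)):
        print(name, classify(name, size))
```

One detail worth knowing if matching is done with `Path.suffix` as in this sketch: a file named exactly `.env` has an empty suffix, so it would not match the `.env` entry in `CONFIG_EXTENSIONS`, while a name like `prod.env` would.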

package/codegraph/extractors/__init__.py ADDED

File without changes

package/codegraph/extractors/__pycache__/__init__.cpython-313.pyc ADDED

Binary file

package/codegraph/extractors/__pycache__/ast_extractor.cpython-313.pyc ADDED

Binary file

package/codegraph/extractors/__pycache__/audio_extractor.cpython-313.pyc ADDED

Binary file

package/codegraph/extractors/__pycache__/doc_extractor.cpython-313.pyc ADDED

Binary file

package/codegraph/extractors/__pycache__/image_extractor.cpython-313.pyc ADDED

Binary file