codegraphy-0.1.0.tar.gz
- codegraphy-0.1.0/LICENSE +21 -0
- codegraphy-0.1.0/PKG-INFO +310 -0
- codegraphy-0.1.0/README.md +270 -0
- codegraphy-0.1.0/codegraphy.egg-info/PKG-INFO +310 -0
- codegraphy-0.1.0/codegraphy.egg-info/SOURCES.txt +24 -0
- codegraphy-0.1.0/codegraphy.egg-info/dependency_links.txt +1 -0
- codegraphy-0.1.0/codegraphy.egg-info/entry_points.txt +2 -0
- codegraphy-0.1.0/codegraphy.egg-info/requires.txt +25 -0
- codegraphy-0.1.0/codegraphy.egg-info/top_level.txt +1 -0
- codegraphy-0.1.0/pyproject.toml +46 -0
- codegraphy-0.1.0/repolens/__init__.py +5 -0
- codegraphy-0.1.0/repolens/cli.py +141 -0
- codegraphy-0.1.0/repolens/config.py +13 -0
- codegraphy-0.1.0/repolens/db/__init__.py +3 -0
- codegraphy-0.1.0/repolens/db/schema.py +84 -0
- codegraphy-0.1.0/repolens/db/store.py +162 -0
- codegraphy-0.1.0/repolens/indexer/__init__.py +5 -0
- codegraphy-0.1.0/repolens/indexer/base.py +27 -0
- codegraphy-0.1.0/repolens/indexer/python.py +177 -0
- codegraphy-0.1.0/repolens/indexer/walker.py +77 -0
- codegraphy-0.1.0/repolens/mcp/__init__.py +3 -0
- codegraphy-0.1.0/repolens/mcp/server.py +306 -0
- codegraphy-0.1.0/repolens/plugins/__init__.py +3 -0
- codegraphy-0.1.0/repolens/plugins/base.py +10 -0
- codegraphy-0.1.0/repolens/plugins/django.py +24 -0
- codegraphy-0.1.0/setup.cfg +4 -0
codegraphy-0.1.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Charan Kulal
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
codegraphy-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,310 @@
+Metadata-Version: 2.4
+Name: codegraphy
+Version: 0.1.0
+Summary: SQLite/PostgreSQL codebase knowledge graph and MCP server for Claude Code
+Author: Charan Kulal
+License-Expression: MIT
+Keywords: mcp,code-analysis,knowledge-graph,claude-code,sqlite,postgresql
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Software Development :: Documentation
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: click>=8.0
+Requires-Dist: mcp>=1.0
+Provides-Extra: postgres
+Requires-Dist: psycopg2-binary; extra == "postgres"
+Provides-Extra: pgvector
+Requires-Dist: pgvector; extra == "pgvector"
+Provides-Extra: js
+Requires-Dist: tree-sitter; extra == "js"
+Requires-Dist: tree-sitter-javascript; extra == "js"
+Requires-Dist: tree-sitter-typescript; extra == "js"
+Provides-Extra: html
+Requires-Dist: tree-sitter; extra == "html"
+Requires-Dist: tree-sitter-html; extra == "html"
+Provides-Extra: all
+Requires-Dist: psycopg2-binary; extra == "all"
+Requires-Dist: pgvector; extra == "all"
+Requires-Dist: tree-sitter; extra == "all"
+Requires-Dist: tree-sitter-javascript; extra == "all"
+Requires-Dist: tree-sitter-typescript; extra == "all"
+Requires-Dist: tree-sitter-html; extra == "all"
+Dynamic: license-file
+
+# codegraphy
+
+Standalone Python package that parses a codebase into a knowledge graph (PostgreSQL or SQLite) and exposes it as an [MCP](https://modelcontextprotocol.io/) server for Claude Code. Claude calls graph tools instead of `Read` + `Bash(grep)` — cuts exploration token cost by 5–10×.
+
+**Python:** 3.10+
+**License:** MIT
+
+---
+
+## Why
+
+Claude exploring an unfamiliar codebase today:
+
+| Task | Without codegraphy | With codegraphy |
+|------|-------------------|----------------|
+| Find where `Something` is defined | Read 10 files (~15k tokens) | `search_symbol("Something")` (~200 tokens) |
+| Understand a file's structure | Read full file (~3k tokens) | `get_file_summary("views.py")` (~300 tokens) |
+
+---
+
+## Installation
+
+```bash
+# SQLite-only install (default, zero config):
+pip install codegraphy
+
+# For PostgreSQL support:
+pip install codegraphy[postgres]
+
+# For JS/TS parsing (planned):
+pip install codegraphy[js]
+
+# Everything:
+pip install codegraphy[all]
+```
+
+The base PyPI package keeps SQLite support in the standard library path, so PostgreSQL stays opt-in.
+
+---
+
+## Quickstart
+
+```bash
+# 1. Initialize the database (SQLite by default)
+codegraphy init
+
+# 2. Index your project
+codegraphy index .
+
+# 3. Start the MCP server (stdio, for Claude Code)
+codegraphy serve
+```
+
+That's it. Claude can now query your codebase graph instead of reading files.
+
+---
+
+## CLI Reference
+
+```bash
+codegraphy init [--db URL]          # Create tables (SQLite default, or pass Postgres URL)
+codegraphy index PATH [--exclude]   # Full index of a directory
+codegraphy update                   # Incremental re-index via git diff
+codegraphy serve                    # Start MCP server over stdio
+codegraphy search NAME              # Search symbols (debug, not MCP)
+codegraphy usages QUALIFIED_NAME    # Find usages (debug, not MCP)
+codegraphy stats                    # Show graph statistics
+```
+
+---
+
+## MCP Tools
+
+When running as an MCP server, codegraphy exposes these tools to Claude:
+
+| Tool | Description |
+|------|-------------|
+| `search_symbol(name, kind?, limit?, fallback_grep?)` | Find symbols by name — exact, then substring, then grep fallback |
+| `get_file_summary(file_path)` | Classes, functions, imports in a file without reading it |
+| `find_usages(qualified_name, limit?, fallback_grep?)` | Who imports/calls/references this symbol |
+| `get_context(file_path, line, radius?)` | Read N lines around a line number |
+| `path_between(from_qualified, to_qualified, max_depth?)` | BFS shortest path between two symbols |
+| `grep_search(pattern, include?, exclude?, limit?)` | Direct grep — bypass the graph |
+| `graph_stats()` | File/symbol/edge counts, backend type |
+| `what_touches_model(model_name)` | Django: views, admin, signals referencing a model |
+| `search_semantic(query, limit?)` | pgvector semantic search (Postgres only, planned) |
+
+All tools return a `source` field (`"graph"` or `"grep"`) so Claude can gauge confidence.
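
The tool table describes `path_between` as a BFS shortest path between two symbols. The sketch below illustrates that idea only; it is not the code in `repolens/mcp/server.py`. It assumes the graph is already loaded as an in-memory adjacency mapping, and the function name `shortest_path`, the `edges` mapping, the `max_depth` default, and the sample qualified names are all hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal, max_depth=6):
    """Return the shortest list of nodes from start to goal, or None.

    `edges` maps a qualified name to the names it points at.
    `max_depth` caps the number of edges followed (hypothetical default).
    """
    if start == goal:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        # A path with k nodes uses k - 1 edges; stop expanding at the cap.
        if len(path) - 1 >= max_depth:
            continue
        for neighbor in edges.get(path[-1], ()):
            if neighbor in seen:
                continue
            if neighbor == goal:
                return path + [neighbor]
            seen.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Hypothetical edges: a view calls a service, which references a model.
EDGES = {
    "app.views.order_detail": ["app.services.load_order"],
    "app.services.load_order": ["app.models.Order"],
}

print(shortest_path(EDGES, "app.views.order_detail", "app.models.Order"))
```

Because BFS visits nodes in order of distance, the first time the goal is reached the path is a shortest one; the `seen` set keeps cycles in the graph from looping forever.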
+
+---
+
+## Configuration
+
+Priority: CLI flag → environment variable → `codegraphy.toml` → defaults.
+
+### Environment Variables
+
+```bash
+DATABASE_URL=sqlite:///codegraphy.db   # or postgresql://localhost/codegraphy
+REPOLENS_ROOT=.                        # project root for grep fallback
+REPOLENS_PLUGINS=repolens.plugins.django
+```
+
+### Config File (optional)
+
+```toml
+# codegraphy.toml (place at project root)
+database_url = "postgresql://localhost/codegraphy"
+root = "."
+exclude = ["migrations", "node_modules", ".venv", "__pycache__"]
+plugins = ["repolens.plugins.django"]
+```
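
The stated priority (CLI flag, then environment variable, then `codegraphy.toml`, then defaults) can be sketched as a single lookup function. This is an illustration under assumptions, not the logic in `repolens/config.py`: the helper name `resolve_setting` and its parameters are hypothetical, the config file is assumed to be already parsed into a dict, and the SQLite default is taken from the example URL shown above.

```python
def resolve_setting(key, env_var, cli_value=None, env=None, file_config=None, default=None):
    """Pick a setting: CLI flag, then environment, then config file, then default."""
    env = env or {}
    file_config = file_config or {}
    if cli_value is not None:
        return cli_value           # 1. explicit CLI flag wins
    if env.get(env_var):
        return env[env_var]        # 2. non-empty environment variable
    if key in file_config:
        return file_config[key]    # 3. value from the parsed config file
    return default                 # 4. built-in default


# Hypothetical inputs for the database URL.
FILE_CONFIG = {"database_url": "postgresql://localhost/codegraphy"}
DEFAULT_URL = "sqlite:///codegraphy.db"

print(resolve_setting("database_url", "DATABASE_URL",
                      env={"DATABASE_URL": "sqlite:///env.db"},
                      file_config=FILE_CONFIG, default=DEFAULT_URL))
```

In real use the `env` argument would typically be `os.environ`; passing it in explicitly keeps the function easy to test.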
+
+---
+
+## Claude Code Integration
+
+### Register the MCP server
+
+```json
+// .claude/settings.json
+{
+  "mcpServers": {
+    "codegraphy": {
+      "command": "codegraphy",
+      "args": ["serve"],
+      "env": {
+        "DATABASE_URL": "sqlite:///codegraphy.db"
+      }
+    }
+  }
+}
+```
+
+### Auto-update on session end (optional)
+
+```json
+// .claude/settings.json
+{
+  "hooks": {
+    "Stop": [{
+      "type": "command",
+      "command": "codegraphy update"
+    }]
+  }
+}
+```
+
+---
+
+## Architecture
+
+```
+repolens/
+├── cli.py              # Click CLI entry points
+├── config.py           # DATABASE_URL, REPOLENS_ROOT, plugin list
+├── db/
+│   ├── schema.py       # CREATE TABLE statements (PG + SQLite)
+│   └── store.py        # upsert_symbol, upsert_edge, query helpers
+├── indexer/
+│   ├── base.py         # BaseIndexer ABC, Symbol/Edge dataclasses
+│   ├── python.py       # ast-based Python indexer
+│   └── walker.py       # Filesystem walk + git-diff incremental
+├── mcp/
+│   └── server.py       # FastMCP server + all tool definitions
+├── plugins/
+│   ├── base.py         # BasePlugin ABC
+│   └── django.py       # Django-aware: models, views, signals
+└── session/            # (planned) git-diff hook + memory write
+```
+
+### Database Schema
+
+Three tables power the graph:
+
+- **`cg_files`** — indexed files with git hash for deduplication
+- **`cg_symbols`** — every class, function, method, import with location + summary
+- **`cg_edges`** — relationships: `imports`, `calls`, `inherits`, `references`, `registers`, `handles_signal`
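
To make the three-table layout concrete, here is a minimal SQLite sketch. Only the table names and the `inherits` edge kind come from the text; every column name is an assumption, and the real statements in `repolens/db/schema.py` may look quite different. `ON DELETE CASCADE` is shown as one plausible way to get clean re-indexing when a file row is removed.

```python
import sqlite3

# Hypothetical columns; only the table names are taken from the README.
SCHEMA = """
CREATE TABLE cg_files (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    content_hash TEXT NOT NULL
);
CREATE TABLE cg_symbols (
    id INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES cg_files(id) ON DELETE CASCADE,
    qualified_name TEXT NOT NULL,
    kind TEXT NOT NULL,
    line INTEGER,
    summary TEXT
);
CREATE TABLE cg_edges (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES cg_symbols(id) ON DELETE CASCADE,
    target_name TEXT NOT NULL,
    kind TEXT NOT NULL
);
"""


def open_graph(path=":memory:"):
    """Open a SQLite database and create the sketch schema."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = open_graph()
conn.execute("INSERT INTO cg_files (id, path, content_hash) VALUES (1, 'models.py', 'abc')")
conn.execute(
    "INSERT INTO cg_symbols (id, file_id, qualified_name, kind) "
    "VALUES (1, 1, 'app.models.Order', 'class')"
)
conn.execute(
    "INSERT INTO cg_edges (source_id, target_name, kind) "
    "VALUES (1, 'django.db.models.Model', 'inherits')"
)
# Removing the file row also removes its symbols and their edges.
conn.execute("DELETE FROM cg_files WHERE id = 1")
```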
+
+### Indexing Strategy
+
+1. Walk files via `git ls-files` (falls back to `os.walk`)
+2. SHA-256 content hash skips unchanged files
+3. AST parsing extracts symbols and edges
+4. Plugins post-process symbols (e.g., Django re-tags `class` → `model`)
+5. Upsert into database with cascade delete for clean re-indexing
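
Step 1 of the list above (prefer `git ls-files`, fall back to `os.walk`) might look roughly like the following. This is a sketch under assumptions rather than the contents of `repolens/indexer/walker.py`: the function name `list_files`, the `use_git` switch, and the default exclude list are hypothetical, and the exclude list is applied only on the `os.walk` path.

```python
import os
import subprocess


def list_files(root, exclude=(".git", "node_modules", ".venv", "__pycache__"), use_git=True):
    """List files under root as relative POSIX-style paths, preferring git."""
    if use_git:
        try:
            result = subprocess.run(
                ["git", "ls-files"],
                cwd=root, capture_output=True, text=True, check=True,
            )
            return sorted(line for line in result.stdout.splitlines() if line)
        except (OSError, subprocess.CalledProcessError):
            pass  # git is missing or root is not a repository: use os.walk
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(found)
```

Using `git ls-files` first means ignored and untracked files never reach the indexer, while the fallback keeps the tool usable outside a repository.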
+
+---
+
+## Plugin System
+
+Plugins implement two hooks:
+
+```python
+class BasePlugin:
+    def on_symbol(self, symbol: Symbol) -> Symbol:
+        """Mutate or re-tag a symbol after parsing."""
+        return symbol
+
+    def extra_edges(self, symbols: list[Symbol]) -> list[Edge]:
+        """Derive additional edges from the symbol list."""
+        return []
+```
+
+### Built-in: Django Plugin
+
+Detects Django patterns by file naming convention:
+- Classes in `models.py` → `kind = "model"`
+- Classes/functions in `views.py` → `kind = "view"`
+
+Enable via environment variable:
+```bash
+REPOLENS_PLUGINS=repolens.plugins.django
+```
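
As an illustration of the file-naming convention described for the Django plugin, the sketch below re-tags symbols in an `on_symbol` hook. It is not the code in `repolens/plugins/django.py`: the `Symbol` fields, the trimmed `BasePlugin`, and the class name `DjangoLikePlugin` are stand-ins invented so the example runs on its own.

```python
from dataclasses import dataclass, replace
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Symbol:
    # Hypothetical fields; the package's real Symbol dataclass may differ.
    name: str
    kind: str
    file_path: str


class BasePlugin:
    """Trimmed stand-in for the hook interface shown above."""

    def on_symbol(self, symbol):
        return symbol


class DjangoLikePlugin(BasePlugin):
    """Re-tag symbols based on the name of the file they live in."""

    def on_symbol(self, symbol):
        filename = PurePosixPath(symbol.file_path).name
        if filename == "models.py" and symbol.kind == "class":
            return replace(symbol, kind="model")
        if filename == "views.py" and symbol.kind in ("class", "function"):
            return replace(symbol, kind="view")
        return symbol


plugin = DjangoLikePlugin()
print(plugin.on_symbol(Symbol("Order", "class", "shop/models.py")))
```

Returning a new `Symbol` through `dataclasses.replace` instead of mutating in place is a design choice of this sketch; the hook's docstring allows either.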
+
+---
+
+## Current Status
+
+| Milestone | Status |
+|-----------|--------|
+| M1 — Schema + Python indexer + `codegraphy index` | ✅ Complete |
+| M2 — `search_symbol` + `get_file_summary` + MCP serve | ✅ Complete |
+| M3 — `find_usages` + `path_between` + `get_context` + grep fallback | ✅ Complete |
+| M4 — `codegraphy update` (incremental) | ✅ Complete |
+| M5 — Django plugin | 🔶 Partial (symbol re-tagging, no admin/signal edges) |
+| M6 — Semantic search (pgvector) | ⬜ Stub only |
+| M7 — JS/TS indexer (tree-sitter) | ⬜ Planned |
+| M8 — HTML/Template indexer | ⬜ Planned |
+| M9 — `grep_search` tool + cross-language edges | 🔶 grep_search done, cross-lang edges planned |
+
+---
+
+## Development
+
+```bash
+# Clone and install in editable mode
+git clone <repo-url> && cd codegraphy
+python -m venv .venv && source .venv/bin/activate
+pip install -e .
+
+# Initialize local DB and index this project
+codegraphy init
+codegraphy index .
+
+# Check stats
+codegraphy stats
+```
+
+## Publishing
+
+`codegraphy` is configured to build as a standard PyPI distribution from `pyproject.toml`.
+
+For PyPI trusted publishing, use **`publish.yml`** as the workflow name. The workflow file lives at `.github/workflows/publish.yml`.
+
+```bash
+python -m pip install --upgrade build twine
+python -m build
+python -m twine check dist/*
+python -m twine upload dist/*
+```
+
+---
+
+## What It Is NOT
+
+- Not a code execution sandbox
+- Not a test runner or linter
+- Not a replacement for LSP/IDE features
+- Not AI-generated summaries by default (uses docstrings; AI summaries are opt-in future)
codegraphy-0.1.0/README.md
ADDED
@@ -0,0 +1,270 @@
+# codegraphy
+
+Standalone Python package that parses a codebase into a knowledge graph (PostgreSQL or SQLite) and exposes it as an [MCP](https://modelcontextprotocol.io/) server for Claude Code. Claude calls graph tools instead of `Read` + `Bash(grep)` — cuts exploration token cost by 5–10×.
+
+**Python:** 3.10+
+**License:** MIT
+
+---
+
+## Why
+
+Claude exploring an unfamiliar codebase today:
+
+| Task | Without codegraphy | With codegraphy |
+|------|-------------------|----------------|
+| Find where `Something` is defined | Read 10 files (~15k tokens) | `search_symbol("Something")` (~200 tokens) |
+| Understand a file's structure | Read full file (~3k tokens) | `get_file_summary("views.py")` (~300 tokens) |
+
+---
+
+## Installation
+
+```bash
+# SQLite-only install (default, zero config):
+pip install codegraphy
+
+# For PostgreSQL support:
+pip install codegraphy[postgres]
+
+# For JS/TS parsing (planned):
+pip install codegraphy[js]
+
+# Everything:
+pip install codegraphy[all]
+```
+
+The base PyPI package keeps SQLite support in the standard library path, so PostgreSQL stays opt-in.
+
+---
+
+## Quickstart
+
+```bash
+# 1. Initialize the database (SQLite by default)
+codegraphy init
+
+# 2. Index your project
+codegraphy index .
+
+# 3. Start the MCP server (stdio, for Claude Code)
+codegraphy serve
+```
+
+That's it. Claude can now query your codebase graph instead of reading files.
+
+---
+
+## CLI Reference
+
+```bash
+codegraphy init [--db URL]          # Create tables (SQLite default, or pass Postgres URL)
+codegraphy index PATH [--exclude]   # Full index of a directory
+codegraphy update                   # Incremental re-index via git diff
+codegraphy serve                    # Start MCP server over stdio
+codegraphy search NAME              # Search symbols (debug, not MCP)
+codegraphy usages QUALIFIED_NAME    # Find usages (debug, not MCP)
+codegraphy stats                    # Show graph statistics
+```
+
+---
+
+## MCP Tools
+
+When running as an MCP server, codegraphy exposes these tools to Claude:
+
+| Tool | Description |
+|------|-------------|
+| `search_symbol(name, kind?, limit?, fallback_grep?)` | Find symbols by name — exact, then substring, then grep fallback |
+| `get_file_summary(file_path)` | Classes, functions, imports in a file without reading it |
+| `find_usages(qualified_name, limit?, fallback_grep?)` | Who imports/calls/references this symbol |
+| `get_context(file_path, line, radius?)` | Read N lines around a line number |
+| `path_between(from_qualified, to_qualified, max_depth?)` | BFS shortest path between two symbols |
+| `grep_search(pattern, include?, exclude?, limit?)` | Direct grep — bypass the graph |
+| `graph_stats()` | File/symbol/edge counts, backend type |
+| `what_touches_model(model_name)` | Django: views, admin, signals referencing a model |
+| `search_semantic(query, limit?)` | pgvector semantic search (Postgres only, planned) |
+
+All tools return a `source` field (`"graph"` or `"grep"`) so Claude can gauge confidence.
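
The lookup order described for `search_symbol` (exact, then substring, then grep fallback) together with the `source` field can be sketched as below. This is a guess at the shape of the behavior, not the server's implementation: the function name `tiered_search`, the `match` key, the `grep` callback, and the sample symbol names are hypothetical, and the graph is reduced to a plain list of qualified names.

```python
def tiered_search(name, symbols, grep=None, limit=20):
    """Look up a name: exact match, then substring, then an optional grep callback."""
    # Tier 1: the full qualified name or its last dotted component matches exactly.
    exact = [s for s in symbols if s == name or s.rsplit(".", 1)[-1] == name]
    if exact:
        return {"source": "graph", "match": "exact", "results": exact[:limit]}
    # Tier 2: case-insensitive substring match anywhere in the qualified name.
    partial = [s for s in symbols if name.lower() in s.lower()]
    if partial:
        return {"source": "graph", "match": "substring", "results": partial[:limit]}
    # Tier 3: defer to a text search outside the graph, if one was supplied.
    if grep is not None:
        return {"source": "grep", "match": "grep", "results": grep(name)[:limit]}
    return {"source": "graph", "match": "none", "results": []}


# Hypothetical indexed symbols.
SYMBOLS = ["app.models.Order", "app.models.OrderItem", "app.views.order_detail"]

print(tiered_search("Order", SYMBOLS))
```

Tagging each response with where it came from lets the caller treat graph hits as structured facts and grep hits as weaker textual evidence.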
+
+---
+
+## Configuration
+
+Priority: CLI flag → environment variable → `codegraphy.toml` → defaults.
+
+### Environment Variables
+
+```bash
+DATABASE_URL=sqlite:///codegraphy.db   # or postgresql://localhost/codegraphy
+REPOLENS_ROOT=.                        # project root for grep fallback
+REPOLENS_PLUGINS=repolens.plugins.django
+```
+
+### Config File (optional)
+
+```toml
+# codegraphy.toml (place at project root)
+database_url = "postgresql://localhost/codegraphy"
+root = "."
+exclude = ["migrations", "node_modules", ".venv", "__pycache__"]
+plugins = ["repolens.plugins.django"]
+```
+
+---
+
+## Claude Code Integration
+
+### Register the MCP server
+
+```json
+// .claude/settings.json
+{
+  "mcpServers": {
+    "codegraphy": {
+      "command": "codegraphy",
+      "args": ["serve"],
+      "env": {
+        "DATABASE_URL": "sqlite:///codegraphy.db"
+      }
+    }
+  }
+}
+```
+
+### Auto-update on session end (optional)
+
+```json
+// .claude/settings.json
+{
+  "hooks": {
+    "Stop": [{
+      "type": "command",
+      "command": "codegraphy update"
+    }]
+  }
+}
+```
+
+---
+
+## Architecture
+
+```
+repolens/
+├── cli.py              # Click CLI entry points
+├── config.py           # DATABASE_URL, REPOLENS_ROOT, plugin list
+├── db/
+│   ├── schema.py       # CREATE TABLE statements (PG + SQLite)
+│   └── store.py        # upsert_symbol, upsert_edge, query helpers
+├── indexer/
+│   ├── base.py         # BaseIndexer ABC, Symbol/Edge dataclasses
+│   ├── python.py       # ast-based Python indexer
+│   └── walker.py       # Filesystem walk + git-diff incremental
+├── mcp/
+│   └── server.py       # FastMCP server + all tool definitions
+├── plugins/
+│   ├── base.py         # BasePlugin ABC
+│   └── django.py       # Django-aware: models, views, signals
+└── session/            # (planned) git-diff hook + memory write
+```
+
+### Database Schema
+
+Three tables power the graph:
+
+- **`cg_files`** — indexed files with git hash for deduplication
+- **`cg_symbols`** — every class, function, method, import with location + summary
+- **`cg_edges`** — relationships: `imports`, `calls`, `inherits`, `references`, `registers`, `handles_signal`
+
+### Indexing Strategy
+
+1. Walk files via `git ls-files` (falls back to `os.walk`)
+2. SHA-256 content hash skips unchanged files
+3. AST parsing extracts symbols and edges
+4. Plugins post-process symbols (e.g., Django re-tags `class` → `model`)
+5. Upsert into database with cascade delete for clean re-indexing
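
Step 2 of the list above, skipping files whose SHA-256 content hash has not changed, can be illustrated in a few lines. This is a sketch, not the package's walker: the function names `content_hash` and `files_to_reindex` are hypothetical, and file contents and stored hashes are passed in as plain dicts instead of being read from disk and the database.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(current, known_hashes):
    """Return paths whose content hash differs from the stored one.

    `current` maps path -> file bytes.
    `known_hashes` maps path -> hex digest recorded at the last index run.
    """
    changed = []
    for path, data in sorted(current.items()):
        # New files have no stored hash, so they always count as changed.
        if known_hashes.get(path) != content_hash(data):
            changed.append(path)
    return changed


# Hypothetical state: a.py was indexed before, b.py is new.
KNOWN = {"a.py": content_hash(b"x = 1\n")}
CURRENT = {"a.py": b"x = 1\n", "b.py": b"y = 2\n"}

print(files_to_reindex(CURRENT, KNOWN))
```

Comparing content hashes instead of modification times means a file that was touched but not edited is still skipped.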
+
+---
+
+## Plugin System
+
+Plugins implement two hooks:
+
+```python
+class BasePlugin:
+    def on_symbol(self, symbol: Symbol) -> Symbol:
+        """Mutate or re-tag a symbol after parsing."""
+        return symbol
+
+    def extra_edges(self, symbols: list[Symbol]) -> list[Edge]:
+        """Derive additional edges from the symbol list."""
+        return []
+```
+
+### Built-in: Django Plugin
+
+Detects Django patterns by file naming convention:
+- Classes in `models.py` → `kind = "model"`
+- Classes/functions in `views.py` → `kind = "view"`
+
+Enable via environment variable:
+```bash
+REPOLENS_PLUGINS=repolens.plugins.django
+```
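
Since plugins are named by dotted module path in `REPOLENS_PLUGINS`, one plausible way to load them is `importlib.import_module`. The sketch below rests on assumptions: the function name `load_plugin_modules` is hypothetical, the README only shows a single module in the variable, so treating the value as a comma-separated list is a guess, and how the package locates the plugin class inside each module is not covered here.

```python
import importlib


def load_plugin_modules(spec):
    """Import each dotted module path in a comma-separated string."""
    modules = []
    for name in spec.split(","):
        name = name.strip()
        if name:  # ignore empty entries such as a trailing comma
            modules.append(importlib.import_module(name))
    return modules


# Standard-library modules stand in for real plugin modules here.
loaded = load_plugin_modules("json, collections.abc")
print([module.__name__ for module in loaded])
```

In real use the argument would come from something like `os.environ.get("REPOLENS_PLUGINS", "")`, so an unset variable simply loads nothing.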
+
+---
+
+## Current Status
+
+| Milestone | Status |
+|-----------|--------|
+| M1 — Schema + Python indexer + `codegraphy index` | ✅ Complete |
+| M2 — `search_symbol` + `get_file_summary` + MCP serve | ✅ Complete |
+| M3 — `find_usages` + `path_between` + `get_context` + grep fallback | ✅ Complete |
+| M4 — `codegraphy update` (incremental) | ✅ Complete |
+| M5 — Django plugin | 🔶 Partial (symbol re-tagging, no admin/signal edges) |
+| M6 — Semantic search (pgvector) | ⬜ Stub only |
+| M7 — JS/TS indexer (tree-sitter) | ⬜ Planned |
+| M8 — HTML/Template indexer | ⬜ Planned |
+| M9 — `grep_search` tool + cross-language edges | 🔶 grep_search done, cross-lang edges planned |
+
+---
+
+## Development
+
+```bash
+# Clone and install in editable mode
+git clone <repo-url> && cd codegraphy
+python -m venv .venv && source .venv/bin/activate
+pip install -e .
+
+# Initialize local DB and index this project
+codegraphy init
+codegraphy index .
+
+# Check stats
+codegraphy stats
+```
+
+## Publishing
+
+`codegraphy` is configured to build as a standard PyPI distribution from `pyproject.toml`.
+
+For PyPI trusted publishing, use **`publish.yml`** as the workflow name. The workflow file lives at `.github/workflows/publish.yml`.
+
+```bash
+python -m pip install --upgrade build twine
+python -m build
+python -m twine check dist/*
+python -m twine upload dist/*
+```
+
+---
+
+## What It Is NOT
+
+- Not a code execution sandbox
+- Not a test runner or linter
+- Not a replacement for LSP/IDE features
+- Not AI-generated summaries by default (uses docstrings; AI summaries are opt-in future)