java-codebase-rag 0.2.0__py3-none-any.whl → 0.2.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/METADATA +27 -6
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/RECORD +6 -6
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/WHEEL +0 -0
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/entry_points.txt +0 -0
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/licenses/LICENSE +0 -0
- {java_codebase_rag-0.2.0.dist-info → java_codebase_rag-0.2.2.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: java-codebase-rag
-Version: 0.2.
+Version: 0.2.2
 Summary: MCP server for semantic + structural search over Java codebases
 Author: HumanBean17
 License-Expression: MIT
@@ -18,6 +18,7 @@ Classifier: Topic :: Software Development :: Libraries
 Requires-Python: >=3.11
 Description-Content-Type: text/markdown
 License-File: LICENSE
+Requires-Dist: cocoindex[lancedb]<2,>=1.0.0a43
 Requires-Dist: kuzu<0.12,>=0.11.3
 Requires-Dist: lancedb<0.31,>=0.25.3
 Requires-Dist: mcp<2,>=1.27.0
@@ -26,6 +27,7 @@ Requires-Dist: pathspec<2,>=1.0.4
 Requires-Dist: pyarrow<24,>=23.0.1
 Requires-Dist: PyYAML<7,>=6.0.3
 Requires-Dist: sentence-transformers<6,>=5.4.0
+Requires-Dist: transformers<=5.5.3,>=4.48.3
 Requires-Dist: tree-sitter<0.26,>=0.25.2
 Requires-Dist: tree-sitter-java<0.24,>=0.23.5
 Requires-Dist: unidiff<1,>=0.7.3
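The two hunks above each add one `Requires-Dist` entry. Wheel `METADATA` files use RFC 822-style headers, so the standard-library `email.parser` can read them. The following is a minimal sketch of how the added dependencies could be recovered programmatically. The two metadata strings are abbreviated stand-ins that contain only a few of the lines visible in this diff, not the full files, and the helper names `requirements` and `added_requirements` are hypothetical.

```python
from email.parser import Parser

# Abbreviated stand-ins: only some of the headers visible in the hunks above.
# The real METADATA files contain further entries that this diff does not show.
OLD_METADATA = """\
Metadata-Version: 2.4
Name: java-codebase-rag
Requires-Dist: kuzu<0.12,>=0.11.3
Requires-Dist: lancedb<0.31,>=0.25.3
Requires-Dist: sentence-transformers<6,>=5.4.0
"""

NEW_METADATA = """\
Metadata-Version: 2.4
Name: java-codebase-rag
Version: 0.2.2
Requires-Dist: cocoindex[lancedb]<2,>=1.0.0a43
Requires-Dist: kuzu<0.12,>=0.11.3
Requires-Dist: lancedb<0.31,>=0.25.3
Requires-Dist: sentence-transformers<6,>=5.4.0
Requires-Dist: transformers<=5.5.3,>=4.48.3
"""


def requirements(metadata_text: str) -> set[str]:
    """Return the Requires-Dist values found in a core-metadata document."""
    message = Parser().parsestr(metadata_text)
    return set(message.get_all("Requires-Dist") or [])


def added_requirements(old_text: str, new_text: str) -> list[str]:
    """Return requirement strings present in new_text but not in old_text."""
    return sorted(requirements(new_text) - requirements(old_text))


if __name__ == "__main__":
    for requirement in added_requirements(OLD_METADATA, NEW_METADATA):
        print("+", requirement)
```

Comparing raw requirement strings is enough here because the unchanged entries are byte-identical in both versions; a comparison that normalizes specifiers would need a dedicated requirements parser.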
@@ -50,6 +52,24 @@ For the design rationale, the GPS metaphor, and the full ontology, see [`docs/pa
 
 ---
 
+## Why this exists
+
+Generic code-search tools (grep, ctags, vector-only RAG) hit a ceiling on real Java microservice estates: they find files but lose the structure that makes a Spring/JAX-RS system navigable. This project is built around five choices that target that gap.
+
+- **Hybrid RAG + GraphRAG, not either-or.** Semantic recall (LanceDB chunk vectors) and structural navigation (Kuzu property graph) are composed in one surface. `search` finds candidate nodes by meaning; `neighbors` walks the exact edge you care about (`CALLS`, `IMPLEMENTS`, `INJECTS`, `DECLARES_ROUTE`, …). The agent picks the right primitive per step instead of being forced into pure-vector or pure-symbol search.
+
+- **A Java-tuned role model.** Symbols are labelled with stereotypes inferred from Spring and JAX-RS conventions — `CONTROLLER`, `SERVICE`, `REPOSITORY`, `CLIENT`, `PRODUCER`, `MAPPER`, `DTO`. Agents can ask "list controllers" or "who injects this repository" directly, instead of grep-ing for `@RestController` and hoping for the best. Roles drive both filtering (`find` with a `NodeFilter`) and ranking.
+
+- **Ranking specialized for Java codebases.** The composite ranker is aware of role, microservice, and FQN structure — not a generic BM25. A search for `"chat ingress"` surfaces controllers before utility classes; a search scoped to one microservice doesn't drown in matches from the other 19. Defaults are tuned on the bank-chat fixture and exposed in `docs/CONFIGURATION.md` for per-repo overrides.
+
+- **Cross-service resolution + system-level navigation.** `HTTP_CALLS` and `ASYNC_CALLS` edges connect Clients and Producers in one microservice to Routes and Handlers in another, resolved at index time from URL/topic strings + Spring `@FeignClient` / `RestTemplate` conventions. `/who-hits-route`, `/trace-request-flow`, and `/impact-of` use these to answer questions a single-service tool fundamentally can't — "who calls this REST endpoint from outside this service", "trace this Kafka message end-to-end", "if I change this DTO, which services break".
+
+- **Brownfield annotations as a first-class override.** Real Java estates have hand-rolled HTTP clients, dynamic topic names, reflection-heavy routing. `@CodebaseHttpRoute`, `@CodebaseAsyncRoute`, `@CodebaseHttpClient`, and `@CodebaseProducer` let you pin the truth in source. They have **exclusive priority** — when a symbol is annotated, framework-convention inference is skipped entirely. You get a correct graph on legacy code without rewriting it.
+
+The rest of this README is the install, walkthrough, and tool cheat sheet for putting that to work.
+
+---
+
 ## Install
 
 ```bash
@@ -57,6 +77,7 @@ pip install java-codebase-rag
 ```
 
 Python **3.11+** required. After install, `java-codebase-rag --help` should print the CLI groups.
+The package includes the CocoIndex lifecycle dependency used by `init`, `increment`, `reprocess`, and `erase`.
 
 > **Stability disclaimer.** This package does **not** promise backward compatibility. MCP tool contracts, env vars, Lance/Kuzu schemas, config files, and Python APIs may change without a deprecation period. Track `main` and rebuild indexes when ontology or embedding settings change.
 
@@ -132,9 +153,9 @@ See [`mcp.json.example`](./mcp.json.example) for the same shape in `.mcp.json` (
 
 Pick **one** of two options (not both — they cover the same navigation intents):
 
-1. **[`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md)** (recommended for most) — standalone MCP operating manual. Copy-paste the `BEGIN`/`END` block into your project's `QWEN.md`, `CLAUDE.md`, or `AGENTS.md`. Contains: five-tool reference, `NodeFilter` / edge taxonomy, ontology glossary, recovery playbook, and
+1. **[`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md)** (recommended for most) — standalone MCP operating manual. Copy-paste the `BEGIN`/`END` block into your project's `QWEN.md`, `CLAUDE.md`, or `AGENTS.md`. Contains: five-tool reference, `NodeFilter` / edge taxonomy, ontology glossary, recovery playbook, and navigation patterns. Self-contained — no external file dependencies.
 
-2. **[`
+2. **[`/explore-codebase`](./skills/explore-codebase/SKILL.md)** (for hosts with skill discovery) — single self-contained skill with the complete operating manual. If your MCP host supports skill discovery (Claude Code, Qwen Code, Cursor), load `/explore-codebase` to get the full tool reference, edge taxonomy, decision tree, and recovery playbook in one shot.
 
 Also: **[`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md)** — 7-phase agent-driven verification you run after indexing your real project.
 
@@ -154,7 +175,7 @@ Full schemas, `NodeFilter` / `EdgeFilter` semantics, and the hints contract live
 
 ### Three-layer architecture
 
-Layer 1 (storage) → Layer 2 (5 MCP tools) → Layer 3 (
+Layer 1 (storage) → Layer 2 (5 MCP tools) → Layer 3 (skill). The [`/explore-codebase`](./skills/explore-codebase/SKILL.md) skill provides the full operating manual for Layer 2. See the [architecture diagram in `skills/README.md`](./skills/README.md#three-layer-architecture).
 
 ---
 
@@ -197,7 +218,7 @@ Run `java-codebase-rag --help` to list grouped subcommands. Operator playbook wi
 | [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) | Environment variables, project YAML, graph ontology, brownfield overrides, ignore patterns. |
 | [`docs/JAVA-CODEBASE-RAG-CLI.md`](./docs/JAVA-CODEBASE-RAG-CLI.md) | CLI operator playbook: workflows, exit codes, env alignment. |
 | [`docs/EDGE-NAVIGATION.md`](./docs/EDGE-NAVIGATION.md) | MCP-traversable edges, directions, dot-key composition. |
-| [`skills/`](./skills/) |
+| [`skills/`](./skills/) | Single `/explore-codebase` skill — complete MCP operating manual for hosts with skill discovery (alternative to copy-pasting AGENT-GUIDE). See [`skills/README.md`](./skills/README.md). |
 | [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) | 7-phase agent-driven verification after indexing your project. |
 | [`docs/CODEBASE_REQUIREMENTS.md`](./docs/CODEBASE_REQUIREMENTS.md) | Assumptions about your Java repo + per-file edit map for non-conforming codebases. |
 | [`automation/cursor_propose_only/README.md`](./automation/cursor_propose_only/README.md) | Optional proposal orchestration workflow (single-command autopilot, planning bundles, automated execution/review loops). |
@@ -214,7 +235,7 @@ python3 -m venv .venv
 .venv/bin/pip install -r requirements.txt
 ```
 
-The `cocoindex` package
+The `cocoindex` package powers lifecycle commands that run the indexer (`init`, `increment`, `reprocess`, `erase`). Search and MCP navigation do not invoke it directly.
 
 The default embedding model is `sentence-transformers/all-MiniLM-L6-v2` (downloaded on first `init`). Override via the `EMBEDDING_MODEL` env var — see [`docs/CONFIGURATION.md` §1](./docs/CONFIGURATION.md#1-environment-variables).
 
@@ -19,9 +19,9 @@ java_codebase_rag/cli.py,sha256=hCjlmAXkS80noTX_bxm6BMiLIYEz_P5xfrw9C7LvkBE,2767
 java_codebase_rag/cli_progress.py,sha256=Vtio3RqJ3LkRoNpxrv8iGbEiX4klkTlJX-mR4l6oeBM,1586
 java_codebase_rag/config.py,sha256=h07zJrV8QoLv9hIhJZ2JgUI0Rh6uPBZUiPkGDEmTg_w,11687
 java_codebase_rag/pipeline.py,sha256=QyKNCrBsjdFU71N9Xygti-DdtMQQsrZ8aySisux46lI,5311
-java_codebase_rag-0.2.
-java_codebase_rag-0.2.
-java_codebase_rag-0.2.
-java_codebase_rag-0.2.
-java_codebase_rag-0.2.
-java_codebase_rag-0.2.
+java_codebase_rag-0.2.2.dist-info/licenses/LICENSE,sha256=gxvtiHtuviR_q8ZAjWw-QTcF3DyPzg6ZY-lQrr8OPpw,1068
+java_codebase_rag-0.2.2.dist-info/METADATA,sha256=VWpfMNxxjvuY2x-rJviWa4pv-OlkX7R93ew-IkFyzjM,15112
+java_codebase_rag-0.2.2.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
+java_codebase_rag-0.2.2.dist-info/entry_points.txt,sha256=mVVQJa0n73OWfhHXYCDoPRrWin_LJhH2Rn0CkJ2iax4,101
+java_codebase_rag-0.2.2.dist-info/top_level.txt,sha256=5aIYoMkvJvvfXvf4iHn2OeSIM7PZXP-0j94eNESnwMw,242
+java_codebase_rag-0.2.2.dist-info/RECORD,,
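Each `RECORD` row above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed; the `RECORD` row for the file itself carries no hash or size. The sketch below shows one way such a row could be checked against file contents using only the standard library. The function names `record_digest` and `matches_record` are hypothetical and not part of this package.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Return the 'sha256=...' value a RECORD row would carry for data."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def matches_record(row: str, data: bytes) -> bool:
    """Check one RECORD row (path,hash,size) against the given file bytes."""
    path, digest, size = next(csv.reader([row]))
    if not digest:
        # Rows such as 'RECORD,,' are unhashed, so there is nothing to verify.
        return True
    return digest == record_digest(data) and int(size) == len(data)


if __name__ == "__main__":
    # Hypothetical file contents, used only to build a matching row.
    payload = b"example payload\n"
    row = f"pkg/example.txt,{record_digest(payload)},{len(payload)}"
    print(matches_record(row, payload))
    print(matches_record(row, b"modified payload\n"))
```

Because the `METADATA` file changed between versions, its `RECORD` row necessarily changes too, which is why every `dist-info` row appears on the `+` side of this hunk.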
File without changes
File without changes
File without changes
File without changes