java-codebase-rag 0.3.1__py3-none-any.whl → 0.5.0__py3-none-any.whl

@@ -0,0 +1,204 @@
1
+ ---
2
+ name: explore-codebase
3
+ description: "MUST BE USED PROACTIVELY. Universal read-only codebase exploration. Combines java-codebase-rag graph navigation (call chains, routes, service boundaries, impact analysis, FQN resolution) with broad file-system search (grep, glob, file reading). Use for any exploration: locating code, tracing dependencies, finding patterns, 'where is X', 'who calls Y', 'find all controllers', 'trace the flow from A to B'. Do NOT use when the answer is already in open context or for a single known file — read that file directly."
4
+ ---
5
+
6
+ # /explore-codebase — Universal codebase exploration
7
+
8
+ Read-only exploration combining **java-codebase-rag graph navigation** with **broad file-system search**.
9
+
10
+ ## When to use
11
+
12
+ Any time you need to search, locate, navigate, or explore the codebase. **Do NOT use when** the answer is already in open context or for a single known file — read that file directly.
13
+
14
+ ## Core Principles
15
+
16
+ 1. **Read-only.** Never edit, write, or modify any file.
17
+ 2. **Smallest sufficient tool.** Pick the lightest tool that answers the question.
18
+ 3. **Stop when answered.** Don't prefetch unrelated subgraphs or directories.
19
+
20
+ ## Tool Inventory
21
+
22
+ ### Graph tools (java-codebase-rag MCP)
23
+
24
+ `search`, `find`, `describe`, `neighbors`, `resolve`.
25
+
26
+ **Node kinds:** `Symbol` (types/methods), `Route` (HTTP/messaging entry points), `Client` (outbound HTTP), `Producer` (outbound async).
27
+ **Indexed content:** Java sources + SQL + YAML (`table`: `java`, `sql`, `yaml`, or `all`).
28
+
29
+ ### File-system tools
30
+
31
+ - **Grep** — content search by pattern/regex
32
+ - **Glob** — find files by name/path pattern (`**/*.java`, `**/*Controller*.java`, `**/application*.yml`)
33
+ - **Read** — read files (`offset`/`limit` for large files)
34
+
35
+ ### Other tools
+
+ - **Bash** (read-only: `git log`, `git blame`, `ls`, `find`)
+ - **WebSearch** / **WebFetch** (external lookups)
36
+
37
+ ---
38
+
39
+ ## Decision Framework
40
+
41
+ | User asks… | First step | Follow-up |
42
+ | ---------- | ---------- | --------- |
43
+ | Identifier-shaped string | `resolve` (+ optional `hint_kind`) | `describe` → `neighbors` |
44
+ | Fuzzy / NL "where is X" | `search` | `describe` → `neighbors` |
45
+ | All controllers in service S | `find(kind="symbol", filter={"microservice":"S","role":"CONTROLLER"})` | `neighbors` `CALLS`/`EXPOSES` |
46
+ | Interfaces in service S | `find(..., filter={"microservice":"S","symbol_kind":"interface"})` | `neighbors`/`describe` |
47
+ | HTTP / messaging entry points | `find(kind="route", filter={…})` | `describe` |
48
+ | Outbound HTTP clients | `find(kind="client", filter={…})` | `neighbors(..., "out", ["HTTP_CALLS"])` |
49
+ | Outbound async producers | `find(kind="producer", filter={…})` | `neighbors(..., "out", ["ASYNC_CALLS"])` |
50
+ | Who calls method M? | id via `resolve`/`find`/`search` | `neighbors(ids, "in", ["CALLS"])` |
51
+ | What does M call? | same | `neighbors(ids, "out", ["CALLS"])` |
52
+ | Who hits this route? | route id | `neighbors(ids, "in", ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"])` |
53
+ | Handler for route | route id | `neighbors(ids, "in", ["EXPOSES"])` |
54
+ | Who implements/injects T? | type symbol id | `neighbors(ids, "in", ["IMPLEMENTS"])` or `neighbors(ids, "in", ["INJECTS"])` |
55
+ | Impact of changing X? | bounded `neighbors` `in` loop with `CALLS`, `INJECTS`, … | `Grep` fallback |
56
+ | Find files matching pattern | `Glob` | `Read` |
57
+ | Search for text in files | `Grep` | `Read` |
58
+ | Who changed X and when? | Bash: `git log`/`git blame` | — |
59
+ | "How is this configured?" | `Glob` + `Grep` for config keys; `search(query=…, table="yaml")` | `Read` sections |
60
+
61
+ **Escalation:** ① Most targeted tool first → ② Fall back gracefully (graph empty → `Grep`/`Glob`) → ③ Cross-validate (graph vs file disagree → **trust the file**).
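The escalation order above can be sketched as a simple fallback chain. This is an illustrative sketch only: `escalate`, `graph_find`, and `grep` are hypothetical stand-ins written for this example, not part of the java-codebase-rag MCP or of any real tool API.

```python
from typing import Callable, List, Optional, Tuple

# A step is a (tool name, callable) pair; the callable returns a list of hits.
Step = Tuple[str, Callable[[str], List[str]]]


def escalate(question: str, steps: List[Step]) -> Tuple[Optional[str], List[str]]:
    """Try each tool in order and stop at the first one that returns hits."""
    for name, tool in steps:
        hits = tool(question)
        if hits:
            return name, hits
    return None, []


# Hypothetical stand-ins: the graph index is empty, the file search is not.
def graph_find(question: str) -> List[str]:
    return []


def grep(question: str) -> List[str]:
    return ["src/main/java/Foo.java:12"]


used, hits = escalate("OrderService", [("find", graph_find), ("Grep", grep)])
```

Ordering the list from most targeted to broadest keeps the "smallest sufficient tool" principle in one place, and the early return implements "stop when answered".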
62
+
63
+ **Rules of thumb:** Structure beats vector for exact questions (`resolve`/`find`+`neighbors`); vector beats structure for fuzzy discovery (`search`); file-system beats stale index.
64
+
65
+ ---
66
+
67
+ ## Graph Navigation Reference (java-codebase-rag MCP)
68
+
69
+ **Ontology: 17** — if results look structurally wrong or empty across tools, the index may be missing or stale; ask the operator to rebuild.
70
+ Responses may include `hints_structured` (suggested next calls) and `advisories`. Treat both as advisory only, and ignore them when `success` is false.
71
+
72
+ ### Forced reasoning preamble (every MCP call)
73
+
74
+ ```
75
+ Q-class: <semantic | structured | inspect | walk>
76
+ Pick: <search|find|describe|neighbors|resolve> Why: <≤8 words>
77
+ ```
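As a rough illustration, the preamble template above could be generated and checked by a small helper. The function name `preamble` and the choice to raise `ValueError` are assumptions made for this sketch; the skill itself only prescribes the two-line text format.

```python
Q_CLASSES = {"semantic", "structured", "inspect", "walk"}
TOOLS = {"search", "find", "describe", "neighbors", "resolve"}


def preamble(q_class: str, pick: str, why: str) -> str:
    """Render the two-line preamble, rejecting values outside the template."""
    if q_class not in Q_CLASSES:
        raise ValueError(f"unknown Q-class: {q_class}")
    if pick not in TOOLS:
        raise ValueError(f"unknown tool: {pick}")
    if len(why.split()) > 8:
        raise ValueError("Why must be at most 8 words")
    return f"Q-class: {q_class}\nPick: {pick} Why: {why}"


text = preamble("walk", "neighbors", "need callers of method")
```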
78
+
79
+ ### Workflow: locate → inspect → walk
80
+
81
+ 1. **Locate** — `resolve` for identifier-shaped; `search` for NL/code fragments; `find` for structured `NodeFilter`.
82
+ 2. **Inspect** — `describe(id)` for full record + `edge_summary`.
83
+ 3. **Walk** — `neighbors` in a loop with explicit `direction` and `edge_types`.
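The walk step can be sketched as a bounded breadth-first loop. This is a minimal sketch under stated assumptions: `neighbors_fn` is a hypothetical callable that takes `(ids, direction, edge_types)` and returns neighbor ids, and the in-memory `CALLERS` map stands in for the real graph. The actual `neighbors` response shape is not shown in this document.

```python
from typing import Callable, Dict, List

NeighborsFn = Callable[[List[str], str, List[str]], List[str]]


def bounded_walk(start_id: str, neighbors_fn: NeighborsFn, direction: str,
                 edge_types: List[str], max_hops: int = 2) -> List[str]:
    """Expand the frontier hop by hop and stop after max_hops."""
    seen = {start_id}
    frontier = [start_id]
    for _ in range(max_hops):
        if not frontier:
            break
        found = neighbors_fn(frontier, direction, edge_types)
        # Keep only ids not visited yet, preserving first-seen order.
        frontier = [n for n in dict.fromkeys(found) if n not in seen]
        seen.update(frontier)
    seen.discard(start_id)
    return sorted(seen)


# Hypothetical caller graph: callee id -> caller ids.
CALLERS: Dict[str, List[str]] = {
    "sym:a.B#run()": ["sym:a.C#go()"],
    "sym:a.C#go()": ["sym:a.D#main()"],
    "sym:a.D#main()": ["sym:a.E#x()"],
}


def fake_neighbors(ids: List[str], direction: str, edge_types: List[str]) -> List[str]:
    return [caller for i in ids for caller in CALLERS.get(i, [])]


reached = bounded_walk("sym:a.B#run()", fake_neighbors, "in", ["CALLS"], max_hops=2)
```

The hop limit is what keeps an impact analysis from prefetching an unrelated subgraph; `sym:a.E#x()` is three hops away and is deliberately not reached here.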
84
+
85
+ ### Edge taxonomy
86
+
87
+ Use these strings **verbatim** in `neighbors(..., edge_types=[...])`.
88
+
89
+ **Stored edges (one hop):**
90
+
91
+ | Edge type | Semantics |
92
+ | --------- | --------- |
93
+ | `EXTENDS`, `IMPLEMENTS`, `INJECTS` | Type wiring. `in`=dependents, `out`=dependencies |
94
+ | `DECLARES`, `DECLARES_CLIENT`, `DECLARES_PRODUCER` | Containment. `in`=owner, `out`=owned member/client/producer |
95
+ | `OVERRIDES` | Subtype method → supertype declaration |
96
+ | `CALLS` | Method→method. `in`=callers, `out`=callees. Source-ordered (`call_site_line`) |
97
+ | `EXPOSES` | Method Symbol → Route (handler exposes route) |
98
+ | `HTTP_CALLS`, `ASYNC_CALLS` | Cross-service: Client/Producer → Route |
99
+
100
+ **Composed edges — type Symbol origin (`direction="out"` only):**
101
+
102
+ `DECLARES.DECLARES_CLIENT` — members' HTTP clients | `DECLARES.DECLARES_PRODUCER` — members' async producers | `DECLARES.EXPOSES` — members' exposed routes
103
+
104
+ **Composed edges — non-static method Symbol origin (`direction="out"` only):**
105
+
106
+ `OVERRIDDEN_BY` — concrete overrider methods | `OVERRIDDEN_BY.DECLARES_CLIENT` | `OVERRIDDEN_BY.DECLARES_PRODUCER` | `OVERRIDDEN_BY.EXPOSES`
107
+
108
+ > Do not mix `DECLARES.*` and `OVERRIDDEN_BY.*` in one `edge_types` list. When `edge_summary` shows large composed counts, raise `limit` or issue separate calls per key.
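One way to honor the rule above is to split a requested `edge_types` list before calling `neighbors`. The helper below is a hypothetical sketch; placing stored edge types in a call of their own when both composed families are present is a conservative assumption of this example, not something the document requires.

```python
from typing import List


def plan_calls(edge_types: List[str]) -> List[List[str]]:
    """Return one edge_types list per neighbors call, never mixing families."""
    declares = [e for e in edge_types if e.startswith("DECLARES.")]
    overridden = [e for e in edge_types
                  if e == "OVERRIDDEN_BY" or e.startswith("OVERRIDDEN_BY.")]
    if declares and overridden:
        stored = [e for e in edge_types if e not in declares and e not in overridden]
        calls = [declares, overridden]
        if stored:
            calls.append(stored)
        return calls
    return [list(edge_types)]


mixed = plan_calls(["DECLARES.EXPOSES", "OVERRIDDEN_BY.EXPOSES"])
plain = plan_calls(["CALLS"])
```

Note that stored types such as `DECLARES_CLIENT` use an underscore and are not matched by the `DECLARES.` prefix. The split calls also need different origins: a type Symbol id for `DECLARES.*` and a method Symbol id for `OVERRIDDEN_BY*`.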
109
+
110
+ ### Argument shapes
111
+
112
+ **JSON, not stringified JSON:** pass `edge_types=["CALLS"]`, not `"CALLS"`; `filter={"role":"CONTROLLER"}`, not a JSON-encoded string; `ids=["sym:…","sym:…"]`, not a comma-joined string. Omit keys you don't need. An empty string `""` is a real filter value that matches nothing.
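A quick pre-flight check for these shapes might look like the following. It is a sketch only: `shape_errors` and the two key sets are assumptions drawn from the argument names mentioned in this document, not a validator shipped with the tool.

```python
from typing import Any, Dict, List

LIST_ARGS = {"ids", "edge_types"}        # must be real lists
OBJECT_ARGS = {"filter", "edge_filter"}  # must be real objects (dicts)


def shape_errors(args: Dict[str, Any]) -> List[str]:
    """Report arguments passed as strings where a list or object is expected."""
    errors = []
    for key in sorted(args):
        value = args[key]
        if key in LIST_ARGS and not isinstance(value, list):
            errors.append(f"{key}: expected a list")
        if key in OBJECT_ARGS and not isinstance(value, dict):
            errors.append(f"{key}: expected an object")
    return errors


bad = shape_errors({
    "ids": "sym:a,sym:b",
    "edge_types": "CALLS",
    "filter": '{"role":"CONTROLLER"}',
})
good = shape_errors({"ids": ["sym:a"], "edge_types": ["CALLS"], "filter": {}})
```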
113
+
114
+ **Node id prefixes:** Symbol `sym:`, Route `route:`/`r:`, Client `client:`/`c:`, Producer `producer:`/`p:`. Use exact ids from previous calls.
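The prefix table above maps directly onto a small lookup. `kind_of` is a hypothetical helper for illustration; it only classifies an id string and does not check that the id exists in the index.

```python
from typing import Optional

ID_PREFIXES = {
    "sym:": "symbol",
    "route:": "route", "r:": "route",
    "client:": "client", "c:": "client",
    "producer:": "producer", "p:": "producer",
}


def kind_of(node_id: str) -> Optional[str]:
    """Return the node kind implied by the id prefix, or None if unprefixed."""
    for prefix, kind in ID_PREFIXES.items():
        if node_id.startswith(prefix):
            return kind
    return None


kinds = [kind_of("sym:com.acme.Foo"), kind_of("r:GET /orders"), kind_of("com.acme.Foo")]
```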
115
+
116
+ **Symbol FQNs:** `<package>.<Type>[.<NestedType>]#<methodName>(<SimpleType1>,<SimpleType2>,…)`. Generics erased, no spaces after commas. No-arg: `()`. Constructor: `#<init>(…)`.
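The FQN rules above can be illustrated with a small formatter. This is a sketch under assumptions: `method_fqn` is a hypothetical helper, and reducing a qualified parameter type to its simple name is one reading of `<SimpleType1>`. How arrays and varargs are rendered is not stated in this document and is not handled here.

```python
import re
from typing import List


def erase_generics(type_name: str) -> str:
    """Strip <...> sections repeatedly so nested generics are removed too."""
    previous = None
    while previous != type_name:
        previous = type_name
        type_name = re.sub(r"<[^<>]*>", "", type_name)
    return type_name.strip()


def simple_name(type_name: str) -> str:
    """Erase generics, then keep only the last dotted segment."""
    return erase_generics(type_name).rsplit(".", 1)[-1]


def method_fqn(package: str, type_path: str, method: str, param_types: List[str]) -> str:
    """Build <package>.<Type>#<method>(<params>) with no spaces after commas."""
    params = ",".join(simple_name(p) for p in param_types)
    return f"{package}.{type_path}#{method}({params})"


place = method_fqn("com.acme.orders", "OrderService", "place",
                   ["java.util.List<java.lang.String>", "int"])
ctor = method_fqn("com.acme", "Outer.Inner", "<init>", [])
```

The package and type names are made up for the example.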
117
+
118
+ ### `neighbors` — required every time
119
+
120
+ - **`direction`**: `"in"` or `"out"` (no default). **`edge_types`**: non-empty list.
121
+ - **Batching:** multiple `ids` expand first; `limit`/`offset` slice the **merged** edge list — raise `limit` when batching.
122
+ - **`CALLS` edges:** `attrs.resolved=false` = external (JDK/Spring), not missing. **`include_unresolved=True`** (`out` only) interleaves unresolved call sites; mutually exclusive with `edge_filter`. **`dedup_calls=True`** collapses identical (origin, callee) pairs.
123
+ - **`edge_filter`** (only with `edge_types=['CALLS']`): `min_confidence`; `include_strategies`/`exclude_strategies`; `callee_declaring_role`/`callee_declaring_roles`/`exclude_callee_declaring_roles`. Note: use `edge_filter.callee_declaring_role` for callee stereotype filtering, not `filter.role` which filters the neighbor node.
124
+ - **Cross-service edges:** read `attrs.confidence` and `attrs.match` — low confidence or `unresolved`/`phantom`/`ambiguous` = resolver signal, not ground truth.
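The constraints in the list above can be collected into one pre-flight check. This is a hedged sketch: `validate_neighbors` is a hypothetical client-side helper that mirrors only the rules stated here, and the server's own validation may differ.

```python
from typing import Any, Dict, List


def validate_neighbors(args: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with a neighbors argument dict."""
    errors = []
    direction = args.get("direction")
    edge_types = args.get("edge_types")
    if direction not in ("in", "out"):
        errors.append('direction must be "in" or "out"')
    if not isinstance(edge_types, list) or not edge_types:
        errors.append("edge_types must be a non-empty list")
    if args.get("include_unresolved"):
        if direction != "out":
            errors.append("include_unresolved requires direction=out")
        if "edge_filter" in args:
            errors.append("include_unresolved and edge_filter are mutually exclusive")
    if "edge_filter" in args and edge_types != ["CALLS"]:
        errors.append("edge_filter requires edge_types=['CALLS']")
    return errors


ok = validate_neighbors({"ids": ["sym:x"], "direction": "in", "edge_types": ["CALLS"]})
missing = validate_neighbors({"ids": ["sym:x"]})
```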
125
+
126
+ ### NodeFilter (`find`, `search.filter`, `neighbors.filter`)
127
+
128
+ For `find`, `filter` is required — `{}` means no predicates. **Strict frame:** unknown keys or inapplicable populated fields → `success=false`.
129
+
130
+ | Applicable to | Keys |
131
+ | ------------- | ---- |
132
+ | All kinds | `microservice`, `module` |
133
+ | **symbol** only | `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` |
134
+ | **route** only | `http_method`, `path_prefix`, `framework` |
135
+ | **client** only | `client_kind`, `target_service`, `target_path_prefix`, `http_method` |
136
+ | **producer** only | `producer_kind`, `topic_prefix` |
137
+
138
+ No wildcards in prefix fields — use `search(query=…)` for ranked text.
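The applicability table above can be mirrored client-side to catch strict-frame failures before a call is made. `rejected_keys` is a hypothetical helper. It treats any present key as populated, which is slightly stricter than the wording above.

```python
from typing import Any, Dict, List

COMMON_KEYS = {"microservice", "module"}
KIND_KEYS = {
    "symbol": {"role", "exclude_roles", "annotation", "capability",
               "fqn_prefix", "symbol_kind", "symbol_kinds"},
    "route": {"http_method", "path_prefix", "framework"},
    "client": {"client_kind", "target_service", "target_path_prefix", "http_method"},
    "producer": {"producer_kind", "topic_prefix"},
}


def rejected_keys(kind: str, node_filter: Dict[str, Any]) -> List[str]:
    """List filter keys that are unknown or do not apply to the given kind."""
    allowed = COMMON_KEYS | KIND_KEYS[kind]
    return sorted(key for key in node_filter if key not in allowed)


wrong = rejected_keys("route", {"microservice": "S", "role": "CONTROLLER"})
fine = rejected_keys("symbol", {"microservice": "S", "role": "CONTROLLER"})
```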
139
+
140
+ ### `resolve` — identifier lookup
141
+
142
+ **Input:** FQN/suffix, `sym:`/`route:`/`client:`/`producer:` id, `METHOD /path`, route path, client target_service, producer topic.
143
+ **`hint_kind`:** optional `symbol`|`route`|`client`|`producer` (narrows generators).
144
+
145
+ | `status` | Action |
146
+ | -------- | ------ |
147
+ | `one` | `describe(id=node.id)` |
148
+ | `many` | pick from `candidates`, then `describe` |
149
+ | `none` | fall back to `search(query=…)` or `Grep` |
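The status table above amounts to a three-way dispatch. In this sketch, `next_step` is a hypothetical helper and the response is modeled as a plain dict with the `status`, `node.id`, and `candidates` fields named in this section. Any other field of the real response is unknown here.

```python
from typing import Any, Dict, Tuple


def next_step(identifier: str, response: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Map a resolve response to the follow-up action and its arguments."""
    status = response.get("status")
    if status == "one":
        return "describe", {"id": response["node"]["id"]}
    if status == "many":
        # The caller picks one candidate, then calls describe on it.
        return "choose", {"candidates": response.get("candidates", [])}
    if status == "none":
        return "search", {"query": identifier}
    raise ValueError(f"unexpected status: {status!r}")


hit = next_step("OrderService", {"status": "one", "node": {"id": "sym:com.acme.OrderService"}})
miss = next_step("OrderService", {"status": "none"})
```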
150
+
151
+ Prefer `resolve` → `describe(id=…)` over `describe(fqn=…)` when the FQN may collide.
152
+
153
+ ### Tool signatures summary
154
+
155
+ - **`search`** — `query`, `table` (`java`|`sql`|`yaml`|`all`), `hybrid` (bool), `limit` (default 5), `offset`, `path_contains`, optional `filter` (symbol-applicable only).
156
+ - **`find`** — `kind` (`symbol`|`route`|`client`|`producer`), **`filter`** (required object), `limit` (default 25), `offset`.
157
+ - **`describe`** — `id` (any kind) or `fqn` (symbol only; `id` wins). Returns node + `edge_summary` (stored + composed keys).
158
+ - **`resolve`** — `identifier`, optional `hint_kind`.
159
+
160
+ ### Ontology glossary
161
+
162
+ **Roles:** `CONTROLLER` | `SERVICE` | `REPOSITORY` | `COMPONENT` | `CONFIG` | `ENTITY` | `CLIENT` | `MAPPER` | `DTO` | `OTHER`.
163
+ Exclude `DTO`, `OTHER`, `MAPPER` with `exclude_roles` when tracing business logic. On `CALLS` out: `edge_filter={"exclude_callee_declaring_roles":["OTHER"]}` drops framework calls.
164
+
165
+ **Capabilities:** `MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, `EXCEPTION_HANDLER`.
166
+
167
+ **Symbol kinds:** `class`, `interface`, `enum`, `record`, `annotation`, `method`, `constructor`.
168
+
169
+ **Route frameworks:** `spring_mvc`, `webflux`, `kafka`, `rabbitmq`, `jms`, `stream`, `codebase_async_route`, …
170
+ **Client kinds:** `feign_method`, `rest_template`, `web_client`. **Producer kinds:** `kafka_send`, `stream_bridge_send`.
171
+ **Match types:** `cross_service`, `intra_service`, `ambiguous`, `phantom`, `unresolved`.
172
+
173
+ ---
174
+
175
+ ## Recovery Playbook
176
+
177
+ **After two failed attempts on the same intent, stop and report tool name, args, and response snippet.**
178
+
179
+ | Symptom | Fix |
180
+ | ------- | --- |
181
+ | `neighbors` validation error | Add both `direction` and `edge_types` explicitly |
182
+ | Empty `neighbors` | Read `describe.edge_summary`; check edge type and direction |
183
+ | Cannot find symbol | `resolve`/`search`; `find` with `fqn_prefix`; fallback `Grep` |
184
+ | `find` returns too much | Add `microservice`, `fqn_prefix`, `path_prefix`, `topic_prefix` |
185
+ | Empty `search` | Try `table="all"`; `find` with `fqn_prefix`; `Grep` directly |
186
+ | Empty results across tools | Index missing/stale → `Grep`/`Glob`/`Read`; ask operator to rebuild |
187
+ | Graph vs file disagree | **Trust the file**; report stale index |
188
+ | Mixed composed families on one id | Split calls — type keys need type id; override keys need method id |
189
+ | `Glob`/`Grep` too many results | Narrow pattern; add directory prefix or `path_filter` |
190
+ | `Grep` no results | Broaden pattern; check working directory; try alternate terms |
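The two-attempt rule at the top of this playbook can be tracked with a small per-intent log. `AttemptLog` is a hypothetical class written for this sketch, and the intent string, tool arguments, and response snippets below are made-up examples.

```python
from collections import defaultdict
from typing import Any, Dict, List


class AttemptLog:
    """Count failed attempts per intent and keep what is needed for the report."""

    def __init__(self, max_failures: int = 2) -> None:
        self.max_failures = max_failures
        self.failures: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def record_failure(self, intent: str, tool: str, args: Dict[str, Any], snippet: str) -> bool:
        """Store the failure; return True when it is time to stop and report."""
        self.failures[intent].append({"tool": tool, "args": args, "response": snippet})
        return len(self.failures[intent]) >= self.max_failures

    def report(self, intent: str) -> List[Dict[str, Any]]:
        return list(self.failures[intent])


log = AttemptLog()
first = log.record_failure("callers of M", "neighbors", {"ids": ["sym:x"]}, "validation error")
second = log.record_failure("callers of M", "neighbors",
                            {"ids": ["sym:x"], "direction": "in", "edge_types": ["CALLS"]},
                            "empty result")
```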
191
+
192
+ ---
193
+
194
+ ## Workflow Patterns
195
+
196
+ **"Explain feature X":** `search` → pick 1–3 hits → `describe` → `neighbors` with targeted edges → stop when answered.
197
+
198
+ **"Where is X used?":** `resolve`/`search` → `neighbors(ids, "in", ["CALLS","INJECTS","IMPLEMENTS"])` → `Grep` fallback → report all sites with file:line.
199
+
200
+ **"Find all Y":** Structural → `find(kind=…, filter={…})`. Textual → `Grep`. Broad → `Glob` + `Grep`. Summarize, don't dump.
201
+
202
+ **"Trace flow from A to B":** Resolve both → walk `CALLS`/`EXPOSES`/`HTTP_CALLS` from A → `Grep` gaps → report with file:line.
203
+
204
+ **"How is this configured?":** `Glob` for `**/application*.yml` → `Grep` for key → `Read` sections → `search(query=…, table="yaml")` supplement.
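Outside the agent, the Glob → Grep → Read sequence for configuration can be approximated with the Python standard library. This is an illustrative sketch, not the skill's own tooling: `find_config_key` is a hypothetical helper, and the temporary directory and its YAML content are made up for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def find_config_key(root: str, key: str) -> List[Tuple[str, int, str]]:
    """Find lines containing key in every application*.yml under root."""
    hits = []
    for path in sorted(Path(root).rglob("application*.yml")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if key in line:
                hits.append((path.name, number, line.strip()))
    return hits


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    resources = Path(tmp) / "svc" / "src" / "main" / "resources"
    resources.mkdir(parents=True)
    (resources / "application-dev.yml").write_text("server:\n  port: 8081\n", encoding="utf-8")
    demo_hits = find_config_key(tmp, "port")
```

Returning the file name with a line number matches the "report with file:line" convention used in the patterns above.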