codemap-core 0.2.2__tar.gz → 0.3.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {codemap_core-0.2.2 → codemap_core-0.3.0}/CHANGELOG.md +100 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/PKG-INFO +1 -1
- codemap_core-0.3.0/docs/adr/0013-java-engine-tree-sitter-over-scip-java.md +115 -0
- codemap_core-0.3.0/docs/spikes/2026-06-23-scip-java-findings.md +142 -0
- codemap_core-0.3.0/docs/spikes/2026-06-24-codemap-refactor-execution-readiness.md +168 -0
- codemap_core-0.3.0/docs/spikes/2026-06-24-l1-refactor-next-steps.md +118 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/pyproject.toml +21 -1
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/index.py +96 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/main.py +30 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/config/schema.py +2 -0
- codemap_core-0.3.0/src/codemap/core/git_hotspots.py +43 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/models.py +3 -0
- codemap_core-0.3.0/src/codemap/emitters/__init__.py +3 -0
- codemap_core-0.3.0/src/codemap/emitters/base.py +43 -0
- codemap_core-0.3.0/src/codemap/emitters/registry.py +80 -0
- codemap_core-0.3.0/src/codemap/indexers/project_base.py +32 -0
- codemap_core-0.3.0/src/codemap/indexers/project_registry.py +81 -0
- codemap_core-0.3.0/tests/e2e/test_golden_precision.py +133 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/.gitignore +6 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/pom.xml +51 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/settings.xml +35 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/src/main/java/com/example/hellospring/controller/OrderController.java +25 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/src/main/java/com/example/hellospring/mapper/CouponMapper.java +9 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/src/main/java/com/example/hellospring/service/OrderService.java +27 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/src/main/resources/mapper/CouponMapper.xml +8 -0
- codemap_core-0.3.0/tests/fixtures/scip-samples/HelloSpring/web/OrderList.vue +14 -0
- codemap_core-0.3.0/tests/unit/test_emitter_registry.py +32 -0
- codemap_core-0.3.0/tests/unit/test_git_hotspots.py +32 -0
- codemap_core-0.3.0/tests/unit/test_models_kinds.py +26 -0
- codemap_core-0.3.0/tests/unit/test_orchestration_phases.py +126 -0
- codemap_core-0.3.0/tests/unit/test_project_indexer_registry.py +41 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/.gitignore +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/LICENSE +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/README.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0000-template.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0001-symbol-id-format.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0002-storage-backend.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0003-module-boundaries.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0004-indexer-bridge-plugin.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0005-exit-codes.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0006-schema-version.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0007-diagnostic-isolation.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0008-atomic-write.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0009-quality-gates.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0010-benchmark-regression.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0011-language-neutrality.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/adr/0012-first-language-cohort.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/bridges/http_route.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/cli.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/configuration.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/indexers/python.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/performance.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/docs/plugin-guide.md +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/_common.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/callees.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/callers.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/config.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/diagnostics.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/doctor.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/get.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/routes.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/search.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/commands/trace.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/renderers/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/renderers/json.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/cli/renderers/text.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/config/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/config/loader.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/bridge/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/bridge/base.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/bridge/http_route.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/bridge/python_cross_module.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/bridge/registry.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/graph.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/store.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/core/symbol.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/diagnostics/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/diagnostics/exit_codes.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/diagnostics/logging.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/diagnostics/progress.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/indexers/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/indexers/_example_lang.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/indexers/base.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/indexers/python.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/indexers/registry.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/atomic.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/base.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/json_store.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/lock.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/io/manifest.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/src/codemap/mcp/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/bench/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/bench/conftest.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/bench/test_index_perf.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/bench/test_query_perf.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_cli.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_config_integration.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_cross_module_callers.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_diagnostics.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_error_experience.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_http_pipeline.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_incremental.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/e2e/test_query_commands.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/basics/expected/symbol_ids.txt +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/basics/input/pkg/mod.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/imports/expected/symbol_ids.txt +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/imports/input/users.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/inheritance/expected/symbol_ids.txt +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/indexers/python/inheritance/input/shapes.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/smoke/a.example +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/fixtures/smoke/b.example +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/integration/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/integration/test_http_route_bridge_e2e.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/__init__.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_config.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_graph.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_http_route_bridge.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_indexer_registry.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_io.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_models.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_python_cross_module_bridge.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_python_indexer.py +0 -0
- {codemap_core-0.2.2 → codemap_core-0.3.0}/tests/unit/test_symbol.py +0 -0
@@ -8,6 +8,106 @@ During `0.x`, MINOR may introduce breaking changes — they will be marked `BREA

## [Unreleased]

## [0.3.0] - 2026-06-25

The four-layer-memory-model L1 release. The plugin family grows from
**18 to 20** distributions; every package bumps in lockstep to `0.3.0`,
and every plugin's `codemap-core` dependency widens to `>=0.3.0,<0.4`.

### New plugins (opt-in)

* **`codemap-mybatis`**: MyBatis Mapper XML indexer. Per-file XML
  parsing yields `sql_mapping` symbols + `table` symbols + DML
  `accesses_table` edges (confidence graded by SQL complexity:
  static / dynamic-tag / `${}` substitution). A new `MyBatisLinkBridge`
  emits `maps_to` edges from Java Mapper interface methods to their
  backing XML statements (requires `codemap-java` installed).
* **`codemap-aimemory`**: emits the four-layer memory model's L1
  layout (`.ai-memory/entities/*.yml` + `.ai-memory/relations/*.yml`)
  so AI agents can consume the index directly with stable
  `entity_id` slugs (fn-* / cls-* / tbl-*). Atomic per-file writes
  (tmp + rename). Includes an optional LLM enrichment overlay; the
  core index itself remains LLM-free, and enrichment writes to a
  separate `enrichment/` directory keyed by `symbol_id` and is
  merged only at emit time.
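
The "atomic per-file writes (tmp + rename)" item above can be illustrated with a minimal sketch. This is not the plugin's actual code: the function name `atomic_write_text` is hypothetical, and only the general technique (write to a temporary file in the same directory, then rename over the target) is taken from the changelog.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so that readers see either the old file or the new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the final rename
    # never crosses a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # rename over the target (atomic on POSIX)
    except BaseException:
        # Best-effort cleanup of the temp file if anything failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A caller would use it once per output file, for example `atomic_write_text(Path(".ai-memory/entities/functions.yml"), yaml_text)`, where `yaml_text` is whatever the emitter has already serialized.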

### New core capability

* **Project-level indexer protocol** (`codemap.indexers.project_base`).
  Mirrors the per-file `Indexer` but consumes the entire project in
  one pass, for engines whose output is whole-project (Java semantic
  resolver, future SCIP-backed importers). Lazy-discovered through
  the new `codemap.project_indexers` entry-point group.
* **Emitter protocol** (`codemap.emitters`): third plugin layer
  alongside indexers / bridges. Registered through
  `codemap.emitters` entry-point group with the same Protocol +
  Registry pattern. The orchestrator runs emitters as the last
  phase of `codemap index` (after bridges + hotspots).
* **CLI subcommand registration via entry-points**
  (`codemap.cli_commands`). Plugins can ship typer subcommands;
  `codemap-aimemory` uses this to register `codemap enrich`.
* **Git change-hotspot analyzer** (`codemap.core.git_hotspots`).
  Language-neutral; surfaces `change_count_90d` on every symbol's
  `extra`. Graceful skip on non-git / unavailable git.
* `EdgeKind` adds `overrides`, `accesses_table`; `SymbolKind` adds
  `table`.
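
As an illustration of the change-hotspot item in the list above, here is a minimal sketch of how a per-file `change_count_90d` could be derived from `git log`. The function names are hypothetical and the real `codemap.core.git_hotspots` module may work differently; the sketch only assumes that counting file paths in `git log --name-only` output is an acceptable proxy for "number of commits touching the file".

```python
import subprocess
from collections import Counter
from pathlib import Path


def count_changes(log_output: str) -> Counter[str]:
    """Count how often each path appears in `git log --name-only` output."""
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def change_counts_90d(repo_root: Path) -> dict[str, int]:
    """Return per-file commit counts for the last 90 days, or {} without git."""
    cmd = ["git", "log", "--since=90 days ago", "--name-only", "--pretty=format:"]
    try:
        proc = subprocess.run(
            cmd, cwd=repo_root, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Not a git checkout, or git is not installed: skip gracefully.
        return {}
    return dict(count_changes(proc.stdout))
```

Each symbol could then pick up the count of the file that declares it, which keeps the analysis language-neutral because it never looks inside the files.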

### Java engine rewrite (ADR-0013)

* **`codemap-java`** moves from declaration-only to a full call
  graph. The per-file indexer now captures `imports`, `supertypes`,
  `pending_calls` (raw invocation records), method `params` /
  `return_type`, and field `type` on `Symbol.extra`. A new
  `JavaCallResolverBridge` (registered as `java_calls`) does
  project-wide FQN resolution to emit `calls` / `extends` /
  `implements` edges at `confidence=medium`. ADR-0013 documents
  the deliberate trade-off (precision ceiling drops from
  full-semantic `high` to FQN-resolved `medium`, in exchange for
  zero external toolchain: no scip-java, no JVM build needed).
* Spring annotation extraction: type / method `@Annotation` nodes
  land on `Symbol.annotations`; the indexer combines class-level
  `@RequestMapping` prefix with method verb mappings
  (`@GetMapping` / `@PostMapping` / …) and writes `http_route`
  metadata so the existing `http_route` bridge auto-mints route
  intermediates.

### `codemap-vue` extends

* Captures `axios.<verb>(...)` / `this.$axios.<verb>(...)` /
  `fetch(...)` invocations inside script blocks, attaching
  `{method, url, confidence}` records to the enclosing
  function/method symbol's `extra["http_calls"]`. The existing
  `http_route` bridge now connects Vue clients to Spring routes
  automatically.

### `codemap enrich` CLI (new)

```bash
codemap enrich --backend openai --model gpt-4o-mini
codemap enrich --backend anthropic --model claude-sonnet-4-5
codemap enrich --backend ollama --model llama3
codemap enrich --base-url http://my-proxy/v1 --api-key sk-…
```

Reads `.codemap/`, calls the configured LLM for each
function/method symbol, writes overlay YAML files under
`.ai-memory/enrichment/`. Env-var fallback chain:
`CODEMAP_LLM_API_KEY` → `ANTHROPIC_API_KEY` → `OPENAI_API_KEY`;
`CODEMAP_LLM_BASE_URL` → `OPENAI_BASE_URL` → `ANTHROPIC_BASE_URL`.
The next `codemap index` merges the overlay into
`entities/functions.yml`. `--dry-run` reports without calling.

### Default prune dirs

`DEFAULT_PRUNE_DIRS` now includes `target` (Maven) and `out`
(Gradle IDE default) so Java/Kotlin/Scala projects don't
double-index build output trees.
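
To show why pruning matters, here is a minimal sketch of a directory walk that never descends into pruned directories. `PRUNE_DIRS` below is an illustrative subset, not the real contents of `DEFAULT_PRUNE_DIRS`, and `iter_source_files` is a hypothetical name.

```python
import os
from pathlib import Path

# Illustrative subset; the full DEFAULT_PRUNE_DIRS is not shown in the changelog.
PRUNE_DIRS = frozenset({"target", "out", ".git"})


def iter_source_files(root: Path, prune: frozenset[str] = PRUNE_DIRS):
    """Yield files under root, never descending into pruned directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place tells os.walk not to descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in prune)
        for name in sorted(filenames):
            yield Path(dirpath) / name
```

Without the pruning step, a Maven project would expose both `src/.../OrderService.java` and whatever copies or generated sources sit under `target/`, which is the double-indexing the entry refers to.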

### New import-linter contract

`emitters may not import cli/mcp/io` keeps the new emitter layer
honest to the same dependency rules as indexers and bridges.

## [0.2.2] - 2026-06-05

Lockstep version-only bump across all **18 packages** to keep the
@@ -0,0 +1,115 @@
# ADR-0013: Java engine (tree-sitter over scip-java)

* **Status**: Accepted
* **Date**: 2026-06-24
* **Related**: design `2026-06-23-codemap-l1-知识图谱重构-design.md` §A1/§B1 · spike `2026-06-23-scip-java-findings.md` · ADR-0004 (indexer/bridge plugin) · ADR-0007 (diagnostic isolation) · ADR-L001 (language neutrality)

## Context

The original L1 knowledge-graph design (2026-06-23) picked scip-java as Java's
semantic backend, citing maximum precision (interface→impl, overloads,
generics, cross-file resolution). Implementing it required:

* ~100 MB scip-java + coursier + JVM toolchain on the user's machine
* The target project must be `mvn`/`gradle` buildable (private-repo deps, JDK
  pinning, multi-module)
* protoc / grpcio-tools to vendor `scip_pb2.py`
* A spike (Plan 0) to confirm the real SCIP symbol string format before
  Plan 2's `symbol_map` / `extract_edges` could be written
* `mvn compile` on every full index: minute-scale, doesn't fit watch mode

The user explicitly rejected this weight tradeoff in favor of a lighter
default engine that still meets the spec's "high precision call graph"
ambition for the common path.

## Decision

**Use tree-sitter-java as the sole Java engine.** Permanently drop scip-java.

The architecture stays the same: declarations come from a per-file
`JavaIndexer`, and cross-file edges (calls/extends/implements) come from a
separate resolver. The resolver, however, runs in Python on the tree-sitter
AST plus an import + FQN graph it builds itself, not on a `.scip` file.

Implementation shape:

1. **`JavaIndexer` (per-file, existing, extended)**: in addition to today's
   declarations, capture `import` statements and method invocations into
   `Symbol.extra["pending_calls"]` (raw records: receiver name, method name,
   argument arity, location).
2. **`JavaCallResolverBridge` (new, runs in the bridge phase)**: read all
   Java symbols + their pending_calls, build a project-wide FQN table from
   the import statements, same-package declarations, and explicit
   `extends`/`implements` relations, then emit `calls` / `extends` /
   `implements` / `overrides` edges with `confidence: medium`.
3. **Spring annotations + http_route metadata** stay in the indexer (Plan 3
   Task 1/2) using tree-sitter; no scip-java is needed for that anyway.
4. **MyBatis** (Plan 3 Task 3) rebuilds the Java method `SymbolID` using the
   codemap-java scheme (consistent with the new indexer), not the scip-java
   symbol format. This removes the spike-Plan-0 dependency entirely.

### Alternatives considered

* **B. Hybrid (tree-sitter default + scip-java optional)**: tempting but
  doubles the implementation surface and forces every downstream consumer
  to handle two backends with different fidelity.
* **C. Bytecode parsing (javap / asm)**: still requires `mvn compile`, so it
  inherits scip-java's "your project must build" weight. The Python
  bytecode-parsing ecosystem is also thinner than tree-sitter's.

## Consequences

What becomes easier:

* Zero external Java toolchain. `pip install codemap-core codemap-java`
  works out of the box on any machine.
* No spike unblock required. Plan 2 can start TDD immediately.
* Watch mode and incremental indexing become realistic (per-file tree-sitter
  parse is millisecond-scale).
* Plan 4's Golden test fixture no longer needs scip-java in CI.

What becomes harder / what we accept:

* **Precision ceiling drops from "high" to "medium" for `calls` edges.**
  Realistic targets on Spring/MyBatis projects:
  * `calls`: 70–80% precision (overload disambiguation by arity only; dynamic
    dispatch through interface-typed fields resolved heuristically by
    DI-known impls; see follow-up below)
  * `implements`/`extends`: ~95% (explicit `implements X, Y` syntax)
  * Annotation extraction: 100% (purely syntactic)
* Reflection, dynamic proxies, and Spring AOP-style indirection produce
  edges that look correct syntactically but miss runtime targets. These are
  the same blind spots tree-sitter has anywhere; we mark them with `medium`
  confidence rather than pretend.
* The precision gate (Plan 4 Task 5) lowers its `high`-tier threshold,
  effectively becoming a `medium`-tier `≥ 0.70` gate on `calls` and a
  `≥ 0.95` gate on `implements`/`extends`. Regression-detection rationale
  unchanged.
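
A precision gate of the kind described in the last bullet could look like the sketch below. The two thresholds come from the text above; the edge representation, the function names, and the idea of comparing against a hand-labelled golden edge set are assumptions for illustration only.

```python
# Thresholds taken from the ADR text; everything else here is hypothetical.
THRESHOLDS = {"calls": 0.70, "implements": 0.95, "extends": 0.95}

Edge = tuple[str, str, str]  # (kind, source, target)


def precision_by_kind(emitted: set[Edge], golden: set[Edge]) -> dict[str, float]:
    """Precision per edge kind: correct emitted edges / all emitted edges."""
    result: dict[str, float] = {}
    for kind in {edge[0] for edge in emitted}:
        of_kind = {edge for edge in emitted if edge[0] == kind}
        result[kind] = len(of_kind & golden) / len(of_kind)
    return result


def gate_failures(emitted: set[Edge], golden: set[Edge]) -> list[str]:
    """Return the edge kinds whose precision falls below their threshold."""
    scores = precision_by_kind(emitted, golden)
    return sorted(k for k, p in scores.items() if p < THRESHOLDS.get(k, 0.0))
```

Because the gate measures precision only, it flags wrong edges but says nothing about missed ones; recall would need a separate check.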

What we keep paying for:

* The FQN resolver is real code we own: when Java's grammar changes (new
  language features, pattern-matching) tree-sitter-java updates flow
  through, but our resolver has to keep up too.
* The `ProjectIndexer` protocol added in Plan 1 Task 2 is retained as a
  generic extension point for future heavier engines (e.g. someone wants to
  ship a scip-java backend later). Java itself no longer uses it, but
  emptying the slot would break the symmetry with `Indexer`/`Emitter`.

Follow-up ADRs / decisions:

* If/when a project demonstrates that medium precision is insufficient (e.g.
  precision-gate alarms repeatedly), revisit and consider adding scip-java
  as an *opt-in* second backend that publishes the same SymbolID shape; at
  that point Hybrid (alternative B) becomes a focused upgrade rather than
  default complexity.

## References

* spec `2026-06-23-codemap-l1-知识图谱重构-design.md` (Obsidian vault) §A1/§B1
* spike `docs/spikes/2026-06-23-scip-java-findings.md` §0 (toolchain blocker
  that triggered the rethink)
* `docs/adr/0004-indexer-bridge-plugin.md` (resolver lives as a bridge, not a
  new plugin layer)
* `docs/adr/0007-diagnostic-isolation.md` (unresolved calls become
  diagnostics, never crashes)
@@ -0,0 +1,142 @@
# scip-java spike findings (Plan 0)

> Created: 2026-06-24
> Status: **Partially complete.** §0/§3/§4 are settled; §1/§2 are blocked until the scip-java toolchain is available on this machine
> Related: spec `2026-06-23-codemap-l1-知识图谱重构-design.md`, `plan-0-验证spike`, handover document `2026-06-24-codemap-refactor-execution-readiness.md`

---

## §0. Environment survey and toolchain status

**This machine (xueqiang's MacBook Pro, Darwin 25.5.0, arm64):**

| Tool | Status | Notes |
|---|---|---|
| Java 17 | ✅ `17.0.17 LTS` (`/usr/bin/java`) | Used to build the Maven project |
| Python 3.13 | ✅ `.venv/bin/python` | The project's own venv |
| `mvn` | ✅ `/opt/homebrew/bin/mvn` (Homebrew) | Verified to compile the fixture |
| `scip-java` | ❌ **Not installed** | Blocked this round by proxy speed (see below) |
| `cs` (coursier) | ❌ Not installed | Same as above |
| `protoc` | ❌ Not installed | Needed by Plan 2 to generate `scip_pb2.py`; `grpcio-tools` is an alternative |
| `grpcio-tools` (PyPI) | ⏳ To be installed in `.venv` | PEP 668 blocks installing into the system Python, so the venv is used instead |

**Evidence of the install blocker:**

- `brew install coursier/formulas/coursier` ran for 5+ minutes without producing any output, so it was killed.
- A direct `curl -fLo cs.gz https://github.com/coursier/coursier/releases/latest/download/cs-aarch64-apple-darwin.gz` (21 MB) fetched only 2.3 MB in 90 seconds (about 30 KB/s on average) and then timed out.
- HTTPS over SOCKS5 (`127.0.0.1:7890`) works in itself (`curl https://repo.maven.apache.org/maven2/` → 200 OK, `curl https://github.com` → 200 OK), so the proxy service is functional, **but bandwidth is extremely low** and the scip-java jars (tens of MB) cannot be downloaded in practice.
- DNS: the system nameserver points to `198.18.0.2` (the fake-IP range used by Clash/Surge-style proxies). `nslookup` fails when it sends UDP 53 directly; this is expected, because such proxy software proxies only TCP by default, and it does not affect HTTPS applications.

**Two workable paths for resuming §1/§2 next time:**

1. Improve proxy speed (switch nodes / connect directly / use the company proxy), then rerun `cs install scip-java`.
2. Download scip-java in an environment with good bandwidth, `scp` it to this machine, and `mv` it straight to `~/.local/bin/scip-java`.

---

## §1. Real SCIP symbol string format (HARD GATE for Plan 2/3)

> Status: **PENDING** (blocked by the §0 toolchain)

Once scip-java is installed, run the following in the `tests/fixtures/scip-samples/HelloSpring/` directory:

```bash
cd tests/fixtures/scip-samples/HelloSpring
scip-java index --output index.scip -- compile
scip print --json index.scip > index.scip.json
grep -o '"symbol":[^,]*' index.scip.json | sort -u | head -40
```

Questions this section must answer (from plan-0 Task 1 Step 5):

- [ ] The real, **character-for-character** SCIP symbol string for `com.example.hellospring.service.OrderService#calculateOrderPrice(long)`
- [ ] What the package descriptor actually uses: Maven `group/artifact version`, or the `.` placeholder
- [ ] The method disambiguator (for overloads; this fixture has none yet, so one can be added)
- [ ] How call relationships are expressed: the `Occurrence.symbol_roles` bitmask? `SymbolInformation.relationships`? Or inferred from non-definition references to other symbols inside a method's range?
- [ ] Whether the definition range lives in `occurrence.range` or in some field of `SymbolInformation`, and whether `enclosing_range` is populated

**Direct consumers:**

- The measured round-trip test cases for `to_symbolid()` in Plan 2 `plugins/codemap-java/src/codemap_java/symbol_map.py`
- The choice of implementation strategy for `_enclosing_method()` in Plan 2 `extract_edges`
- The `_java_method_id(namespace, method)` reconstruction in Plan 3 `plugins/codemap-mybatis`, which must match **character for character** the symbol scip-java gives the same method; otherwise the source of the MyBatis edge dangles

---

## §2. scip-java incremental indexing capability

> Status: **PENDING** (blocked by the §0 toolchain)

Once scip-java is installed, run:

```bash
cd tests/fixtures/scip-samples/HelloSpring
time scip-java index --output index_full.scip -- compile
# Edit OrderService.java in one place: add a harmless log line
time scip-java index --output index_inc.scip -- compile
```

Questions this section must answer:

- [ ] (a) scip-java supports incremental/partial indexing → Plan 2 can re-index only the affected units
- [ ] (b) Full indexing only → spec §5.3② degrades to "rerun scip-java in full every time + sha256 diff on the codemap side"; correctness is unchanged and the cost is put on record

**Direct consumers:** the incremental strategy of Plan 2 `JavaScipIndexer.index_project()`; how often the orchestrator's `_do_incremental` path calls project-level indexers (the current Plan 1 Task 5 implementation does a full rerun on every incremental pass, which matches assumption (b)).
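
---

## §3. Buildability of the target enterprise project

> Status: **N/A** (no target enterprise project on this machine)

The purpose of Plan 0 Task 3 is to verify whether a real business project can reliably run `mvn compile` + `scip-java index`. This machine has only `tests/fixtures/scip-samples/HelloSpring/` (a minimal development-time fixture) and holds no target enterprise project.

**Assessment and follow-up actions:**

- This section's conclusion can only be re-evaluated in an environment with access to a real business project (a development machine / CI runner).
- Once a target project is onboarded, this section must add: JDK version, whether it is multi-module (aggregator pom), whether private-repository/offline dependencies can be fetched, total build time, and whether scip-java successfully produces `index.scip` for that project.
- This does not block Plan 1 (done), the Plan 2 algorithm itself (verified with the HelloSpring fixture), or the Plan 3 framework bridge algorithm (likewise). However, for the **precision gate** (Plan 4 Task 5) to be meaningful in the end, it needs one calibration run on a real project.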

---

## §4. Current state of http_calls in codemap-vue / codemap-typescript

> Status: **Settled**

### 4.1 Measurement

```bash
# Repository root
grep -rn "http_calls\|http_route\|axios\|fetch" \
  plugins/codemap-vue plugins/codemap-typescript \
  src/codemap/core/bridge/http_route.py
```

Measured hits (2026-06-24, branch `feat/l1-knowledge-graph`):

- `plugins/codemap-vue/README.md:68`: only a single example symbol line in the documentation (no implementation)
- `src/codemap/core/bridge/http_route.py`: 5 hits in the bridge itself (`extra["http_route"]` on the server side, `extra["http_calls"]` on the client side, plus type annotations)

The measurement shows **zero hits for `axios` / `fetch` / `http_calls` / `http_route` in the implementation code under `plugins/codemap-vue/src/` and `plugins/codemap-typescript/src/`**.

### 4.2 Code review

- `plugins/codemap-vue/src/codemap_vue/indexer.py`: the Vue SFC indexer. tree-sitter-vue is not available, so it slices the SFC and runs the embedded JS/TS through tree-sitter-javascript / tree-sitter-typescript. The visitor collects only top-level functions, classes (including methods), and module-level `const/let/var` declarations. It **does not recognize** `axios.<verb>(...)` or `fetch(...)` literals in call expressions.
- `plugins/codemap-typescript/src/codemap_typescript/indexer.py`: same pattern (goes straight through tree-sitter-typescript, with no SFC slicing).

### 4.3 Conclusion and Plan 4 starting point

- **Write it from scratch** (not "enhance what exists"). The Plan 4 Task 1 document already says "implement http_calls extraction"; this section confirms the facts that precede it.
- Implementation site: add call-expression visiting to the AST traversal of the vue/typescript indexers; recognize `axios.<verb>` / `this.$axios.<verb>` / `fetch` / `useFetch` (extend per team convention), and append `{method, url, confidence}` to `extra["http_calls"]` of the enclosing method/component symbol.
- Confidence: pure literal url → `medium`; contains template concatenation / variables → `low`; dynamically computed → no edge is created (better to omit than to guess, consistent with spec §5.1).
- Downstream: the existing `HttpRouteBridge` already reads `extra["http_calls"]`, so as long as the Vue side produces the correct format, the bridge connects edges across languages automatically once the Java side produces `extra["http_route"]` in Plan 3 Task 2.

---

## Handover checklist (next round)

- [ ] Resolve the §0 proxy-speed problem, then run `cs install scip-java` (or manually place the scip-java binary in `~/.local/bin/`).
- [ ] Run `scip-java index` to produce `tests/fixtures/scip-samples/HelloSpring/index.scip`, and fill in all 5 questions in §1 with measured evidence.
- [ ] Run the two `time scip-java index` passes from §2 (full + incremental) and pick one of the two outcomes.
- [ ] Commit the fixture project + `index.scip` + this findings document together (`spike: capture real scip-java output format on HelloSpring sample`).
- [ ] Using the `REAL` symbol strings from §1, backfill the Plan 2 `symbol_map.py` tests and the Plan 3 `_java_method_id()` reconstruction logic.
- [ ] Continue with Plan 2 → 3 → 4.
@@ -0,0 +1,168 @@
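# CodeMap knowledge-graph refactor: execution readiness assessment and environment handover

> Created: 2026-06-24
> Author: Claude (a critical review + environment survey done before starting work, following the user's request to "read the refactor documents → implement in order")
> Status: **Implementation not started.** This machine lacks the scip-java toolchain, and the user decided to continue in a different environment. This document is the handover checklist.
> Related plan documents (in the Obsidian vault):
> under `07-Ideas/CodeMap/重构(适配知识图谱)/`:
> `…-design.md` / `plan-0-验证spike` / `plan-1-地基` / `plan-2-java语义核心` /
> `plan-3-框架bridge` / `plan-4-贯通与输出`
> Existing codebase: `/Users/xueqiang/Git/codemap` (git; travels with the repository to the new environment)

---

## 0. One-sentence conclusion

The 5 plans are of high quality and can be executed step by step with TDD; **Plan 1 (foundation) does not depend on scip-java at all and can start immediately, now and in any environment**. However, **Plan 0 (the verification spike), the first step in the documents' literal order, is blocked by this machine's toolchain**: `scip-java / maven / gradle / coursier / protoc` are all missing. Every part of Plan 2/3/4 that needs a **real `index.scip`** is blocked along with it. Recommendation: in the new environment, install the toolchain and run Plan 0 first; alternatively, finish Plan 1 plus the parts that do not depend on a real `.scip` first, in any environment.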
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## 1. 本机环境勘察(2026-06-24,xueqiang的MacBook Pro)
|
|
21
|
+
|
|
22
|
+
| 工具 | 状态 | 说明 |
|
|
23
|
+
|---|---|---|
|
|
24
|
+
| Java | ✅ `17.0.17 LTS` (`/usr/bin/java`) | 满足目标 JDK 17 |
|
|
25
|
+
| Python | ✅ `3.14.5` | 满足 `requires-python >=3.11` |
|
|
26
|
+
| scip-java | ❌ 未安装 | **Plan 0/2 的硬前置** |
|
|
27
|
+
| maven (`mvn`) | ❌ 未安装 | scip-java 索引需要可构建的 mvn/gradle |
|
|
28
|
+
| gradle | ❌ 未安装 | 同上(二选一) |
|
|
29
|
+
| coursier (`cs`) | ❌ Not installed | Officially recommended way to install scip-java |
|
|
30
|
+
| protoc | ❌ Not installed | Needed by Plan 2 to generate `scip_pb2.py` (or use `pip install grpcio-tools`, which bundles protoc) |
|
|
31
|
+
|
|
32
|
+
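**To install before starting in the new environment**: scip-java, maven (or gradle), and protoc (or grpcio-tools). Also needed: a **target Java project that passes `mvn compile`** (the source of truth for Plan 0 Task 3 and the Plan 4 golden fixture).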
|
|
33
|
+
|
|
34
|
+
Suggested install commands for the new environment (macOS / Homebrew; for reference only, not run on this machine):
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
brew install coursier/formulas/coursier && cs setup # install the JVM tooling base
|
|
38
|
+
cs install scip-java # install scip-java
|
|
39
|
+
brew install maven # or gradle
|
|
40
|
+
pip install grpcio-tools # bundles protoc; used by Plan 2 to generate scip_pb2.py
|
|
41
|
+
scip-java --help # verify it works
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## 2. Plan Review: Hard-Coded APIs vs. Real Code (Checked Item by Item)
|
|
47
|
+
|
|
48
|
+
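Plan 1 hard-codes many codemap internal APIs, so each one was checked against the code. Conclusion: **Plan 1 can be executed exactly as written**. The check also covers the cross-layer APIs used by Plans 2–4.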
|
|
49
|
+
|
|
50
|
+
### 2.1 Consistent with the Plan's Assumptions (Safe to Write as Documented)
|
|
51
|
+
|
|
52
|
+
- `src/codemap/core/models.py`
|
|
53
|
+
- The `SymbolKind` Literal currently includes `… "asset", "unknown"`; the `EdgeKind` Literal currently includes `… "maps_to", "imports"`. Plan 1 Task 1 adds `"table"` (SymbolKind) and `"overrides"`/`"accesses_table"` (EdgeKind); the assumed insertion points hold.
|
|
54
|
+
- `Confidence = Literal["high","medium","low"]` (note: the core Confidence does **not** include `"llm"`; `"llm"` appears only in the enrichment YAML, physically isolated, as designed).
|
|
55
|
+
- The fields of `Symbol`/`Edge`/`Range`/`Annotation`/`Diagnostic`/`IndexResult` match how the plans use them; `Symbol.extra: dict[str, Any]` and `Symbol.annotations: list[Annotation]` exist.
|
|
56
|
+
- `src/codemap/core/symbol.py`
|
|
57
|
+
- `SymbolID.parse(s) -> SymbolID` (classmethod) ✅; `to_string()` + `__str__` ✅; constructor parameters `scheme` (required), `manager="."`, `package_name="."`, `package_version="."`, `descriptors: tuple[Descriptor,...]=()` (`@dataclass(frozen, slots)`).
|
|
58
|
+
- `SymbolParseError(ValueError)` ✅.
|
|
59
|
+
- `Descriptor(name, kind, disambiguator="")` ✅; `DescriptorKind` (StrEnum) members `NAMESPACE/TYPE/TERM/METHOD/TYPE_PARAMETER/PARAMETER/META` ✅ (the TYPE/METHOD/TERM/NAMESPACE members the plans use are all present).
|
|
60
|
+
- Elements of `SymbolID.descriptors` have `.name`/`.kind` ✅.
|
|
61
|
+
- `src/codemap/core/store.py`
|
|
62
|
+
- `ReadOnlyStore` is a `@runtime_checkable Protocol` with methods `get / iter_symbols / iter_edges / callers / callees / search / manifest`. Emitters (Plan 1 Task 3 / Plan 4) use only `iter_symbols`/`iter_edges`, so it is sufficient.
|
|
63
|
+
- There is also a read-write protocol, `SymbolStore(ReadOnlyStore, Protocol)`.
|
|
64
|
+
- `src/codemap/io/json_store.py`
|
|
65
|
+
- `JsonStore.open(root, *, mode="rw")` (classmethod; returns an instance that supports `with`) ✅; `iter_symbols/iter_edges/upsert_symbols/upsert_edges/upsert_routes/upsert_diagnostics/commit/get` all exist, with signatures that match the plans' usage; it additionally provides `upsert_aliases/iter_routes/iter_aliases/iter_diagnostics/delete_by_file/clear_bridge_outputs`. It structurally satisfies `ReadOnlyStore`/`SymbolStore`.
|
|
66
|
+
- `src/codemap/indexers/base.py` + `registry.py`
|
|
67
|
+
- `Indexer` Protocol: `index_file(path, source: bytes, ctx) -> IndexResult`; `supports(path)`; ClassVars `name/version/file_patterns/languages`.
|
|
68
|
+
- `IndexContext(project_root: Path, relative_path: PurePosixPath, language: str, config={})` (**`relative_path` is a `PurePosixPath`, not a str**).
|
|
69
|
+
- Entry-point group `codemap.indexers`; discovery works as `entry_points(group=…)→ep.load()()→isinstance check→register by name`. Plan 1's project_indexers / emitters registries can copy this pattern as is.
|
|
70
|
+
- `src/codemap/cli/commands/index.py`
|
|
71
|
+
- `_run_bridges(store, stats, config)` ✅, `_index_one(file_path, project_root, store, registry, stats, bar, config)` ✅, `_do_incremental(...)` ✅, `_index_one_prefetched(...)` ✅.
|
|
72
|
+
- **The real code of the full-build `else:` branch (`index.py:140-144`) matches the Plan 1 Task 5 assumption**:
|
|
73
|
+
```python
|
|
74
|
+
else:
|
|
75
|
+
with progress_bar("Indexing", total=len(files), enabled=not no_progress) as bar:
|
|
76
|
+
for file_path in files:
|
|
77
|
+
_index_one(file_path, path, store, registry, stats, bar, config)
|
|
78
|
+
_run_bridges(store, stats, config)
|
|
79
|
+
```
|
|
80
|
+
- `_short_exception_message(producer: str, exc: BaseException)` ✅ (two parameters; the plan's call `_short_exception_message(ix.name, exc)` matches).
|
|
81
|
+
- Top-of-file imports are complete: `Diagnostic`, `Config`, `PurePosixPath`, `logger`, and `progress_bar` all exist.
|
|
82
|
+
- `src/codemap/core/bridge/http_route.py`
|
|
83
|
+
- `HttpRouteBridge.resolve(store) -> BridgeResult`; ClassVar `name="http_route"`.
|
|
84
|
+
- The server side reads `Symbol.extra["http_route"]` and the client side reads `Symbol.extra["http_calls"]`; the bridge produces `routes_to` (handler→route) and `calls` (caller→route) edges plus `Route`/`Alias`/`Diagnostic`.
|
|
85
|
+
- `plugins/codemap-sql/`: **DDL only** (CREATE TABLE/VIEW/INDEX); it explicitly ignores SELECT/INSERT/UPDATE/DELETE. → This confirms the Plan 3 correction: MyBatis must bring its own DML table-name extraction.
|
|
86
|
+
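- `plugins/codemap-java/`: the current `JavaIndexer` is indeed **declaration-level**: `_Visitor.edges` is always `[]`, and `edges.append` appears nowhere in the file. → This confirms design §0 "current gap ①"; Plan 2 replaces it with `JavaScipIndexer`.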
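
The entry-point discovery flow noted above (`entry_points(group=…)` → `ep.load()()` → isinstance check → register by name) can be sketched as follows. This is an illustration under assumptions, not codemap's registry: the `Emitter` protocol members and the `discover` / `discover_installed` helper names are hypothetical; only `importlib.metadata.entry_points(group=...)` is the real standard-library call.

```python
from importlib.metadata import entry_points
from typing import Any, Iterable, Protocol, runtime_checkable


@runtime_checkable
class Emitter(Protocol):
    """Hypothetical plugin protocol; the real one may have other members."""

    name: str

    def emit(self, store: Any) -> None: ...


def discover(eps: Iterable[Any]) -> dict[str, Emitter]:
    """Load each entry point, instantiate it, validate it, and key it by name."""
    table: dict[str, Emitter] = {}
    for ep in eps:
        plugin = ep.load()()  # ep.load() returns the plugin class; call it
        if not isinstance(plugin, Emitter):
            continue  # skip objects that do not satisfy the protocol
        table[plugin.name] = plugin
    return table


def discover_installed(group: str = "codemap.emitters") -> dict[str, Emitter]:
    """Discover plugins registered under the given entry-point group."""
    return discover(entry_points(group=group))
```

Splitting `discover` from `discover_installed` keeps the validation logic testable with fake entry points, without installing any package.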
|
|
87
|
+
|
|
88
|
+
### 2.2 Items **Inconsistent** with the Plan's Wording/Assumptions (Correct as Follows During Execution; Algorithms and Assertions Unchanged)
|
|
89
|
+
|
|
90
|
+
1. **The root `pyproject.toml` has no entry-point registration lines for any "language plugin".** The root pyproject registers only `_example_lang`+`python` (indexers) and `http_route`+`python_cross_module` (bridges). Java/Vue/TS/SQL and the rest register in their own `plugins/codemap-*/pyproject.toml`.
|
|
91
|
+
- Impact on **Plan 1 Task 5 Step 5**: adding **empty placeholder groups** `codemap.project_indexers`/`codemap.emitters` to the root pyproject is fine; but Plan 2 Task 6, "remove the old java registration from the old `codemap.indexers` group", must edit `plugins/codemap-java/pyproject.toml`, **not** the root pyproject.
|
|
92
|
+
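2. The `ReadOnlyStore` Protocol does **not** include `iter_routes/iter_diagnostics/iter_aliases` (these exist only on the concrete `JsonStore` class). Emitters that read only symbols/edges are unaffected; if an emitter later needs to read routes, it must accept a `JsonStore` or the Protocol must be extended.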
|
|
93
|
+
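3. `index()` is a **nested command function** inside `register(app)`, so `import index` does not work directly. However, the Plan 1 Task 5 tests import only the **module-level** helpers `_run_project_indexers`/`_apply_hotspots` (the newly added ones are module-level), so they are unaffected.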
|
|
94
|
+
4. `_IndexStats` is a **plain class, not a dataclass**, but its `.diagnostics` field exists (an int counter), so the plan's `stats.diagnostics += len(...)` holds.
|
|
95
|
+
5. `JsonStore.open` is a **classmethod that returns a context-manager instance**, not a `@contextmanager` generator (a wording issue only).
|
|
96
|
+
6. `HttpRouteBridge` uses two keys at once, `http_route` (server) and `http_calls` (client), not a single key (Plan 3/4 already write `http_route` on the Java side and `http_calls` on the Vue side respectively, which is the right direction).
|
|
97
|
+
7. `importlinter` currently has 3 `forbidden` contracts (one each for core / io / indexers) and **no bridge-specific contract**. The new emitters contract added in Plan 1 Task 5 fits the existing pattern.
|
|
98
|
+
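8. **The Vue / TypeScript plugins currently have no `http_calls`/axios/fetch extraction logic at all** (grep finds zero hits). → Plan 4 Task 1 is **written from scratch** (not "enhancing existing code"), and the Plan 0 §4 audit conclusion can be set directly to "write from scratch".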
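
Item 2 above (routes are readable only on the concrete store, not through the read-only protocol) suggests one possible way forward: a narrower extension protocol. The sketch below is a guess at the shape, not codemap's code; `ReadOnlyStoreLike`, `RouteReadableStore`, and `count_routes` are hypothetical names, and the real `ReadOnlyStore` has more methods than shown.

```python
from typing import Any, Iterator, Protocol, runtime_checkable


@runtime_checkable
class ReadOnlyStoreLike(Protocol):
    """Reduced stand-in for the read-only protocol (only two methods shown)."""

    def iter_symbols(self) -> Iterator[Any]: ...
    def iter_edges(self) -> Iterator[Any]: ...


@runtime_checkable
class RouteReadableStore(ReadOnlyStoreLike, Protocol):
    """Hypothetical extension for emitters that also need routes."""

    def iter_routes(self) -> Iterator[Any]: ...


def count_routes(store: ReadOnlyStoreLike) -> int:
    """Read routes when the store supports them; otherwise degrade to zero."""
    if isinstance(store, RouteReadableStore):
        return sum(1 for _ in store.iter_routes())
    return 0
```

An emitter written this way keeps working against any store that satisfies the smaller protocol, and picks up routes only where they are available.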
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## 3. Blocking Analysis: What Is Blocked by scip-java and What Is Not
|
|
103
|
+
|
|
104
|
+
### 3.1 **Independent of scip-java; Doable Immediately in Any Environment**
|
|
105
|
+
|
|
106
|
+
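- **All of Plan 1** (Tasks 1–5): model enums, project-level indexer protocol + registry, emitter protocol + registry, git hotspots, orchestrator wiring + import-linter contract. Pure Python TDD, self-contained. **This is the foundation for everything after it; doing it first is strongly recommended.**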
|
|
107
|
+
- **Plan 3 Task 1** (Java annotation extraction with tree-sitter-java, already a dependency) and **Task 3** (unit tests for the MyBatis Mapper XML indexer: `table_refs` + `MyBatisIndexer.index_file` with hand-written XML, self-contained).
|
|
108
|
+
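- ⚠️ However, the `_java_method_id` reconstruction in Plan 3 must equal, **character for character**, the symbol that scip-java assigns to the same method. That can only be locked down once the real symbol strings from Plan 0 §1 are available; the implementation and unit tests can be written first and verified later during the Plan 4 golden integration.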
|
|
109
|
+
- **Plan 4 Task 1** (Vue `http_calls`, written from scratch), **Task 2** (`ids` entity_id derivation), **Task 3** (`AiMemoryEmitter`, tested with hand-seeded symbols), and **Task 4** (`enrich` LLM enrichment, with the LLM mocked). None of these tests needs a real `.scip`.
|
|
110
|
+
|
|
111
|
+
### 3.2 **Blocked by scip-java / a Real `index.scip`**
|
|
112
|
+
|
|
113
|
+
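- **All of Plan 0** (spike: install scip-java, build the sample project, produce a real `index.scip`, and record §1 symbol-string format / §2 incremental capability / §3 buildability of the target project / §4 current state of Vue).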
|
|
114
|
+
- **Plan 2 Task 2/4/5/6**: the tests for `scip_reader`/`extract_symbols`/`extract_edges`/`JavaScipIndexer` e2e all use `tests/fixtures/scip-samples/HelloSpring/index.scip` as the source of truth; Plan 0 produces that fixture from a real scip-java run.
|
|
115
|
+
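- (Plan 2 Task 1 runner and Task 3 symbol_map **can be done first**: the runner tests mock subprocess, and symbol_map is a pure string→SymbolID mapping; but the `REAL` strings in Task 3 must be replaced with the values measured in Plan 0 §1.)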
|
|
116
|
+
- **Plan 3 Task 2 Step 5** (integration test that Spring routes become graph edges through the http_route bridge; depends on the fixture `index.scip`).
|
|
117
|
+
- **Plan 4 Task 5** (golden full-stack fixture + precision gate; depends on a pre-generated `index.scip`). This is where all the ⚠️Plan-0 assumptions in Plan 2/3 get their final acceptance check.
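
The point that the Plan 2 runner can be tested without scip-java installed comes down to mocking `subprocess`. A minimal sketch, assuming a hypothetical `run_scip_java` helper and assuming that `scip-java index` writes `index.scip` into its working directory (which matches the fixture path used in this document but should be confirmed by Plan 0):

```python
import subprocess
from pathlib import Path
from unittest import mock


def run_scip_java(project_root: Path) -> Path:
    """Hypothetical runner: invoke scip-java and return the expected index path."""
    subprocess.run(["scip-java", "index"], cwd=project_root, check=True)
    return project_root / "index.scip"  # assumed default output location


def test_runner_invokes_scip_java() -> None:
    # Patch subprocess.run so no real binary is needed.
    with mock.patch("subprocess.run") as fake_run:
        out = run_scip_java(Path("/tmp/demo"))
    fake_run.assert_called_once_with(
        ["scip-java", "index"], cwd=Path("/tmp/demo"), check=True
    )
    assert out == Path("/tmp/demo/index.scip")
```

Because the mock replaces the process call entirely, this test says nothing about the content of a real `index.scip`; that remains blocked on Plan 0.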
|
|
118
|
+
|
|
119
|
+
---
|
|
120
|
+
|
|
121
|
+
## 4. Recommended Execution Order (Roadmap)
|
|
122
|
+
|
|
123
|
+
**Phase A (any environment, doable immediately, zero rework risk)**
|
|
124
|
+
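1. **All of Plan 1** (foundation). Once done, codemap core has: a project-level indexer hook, the emitter plugin protocol, git hotspots, the new enum members, orchestrator wiring, and the emitters import-linter contract.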
|
|
125
|
+
|
|
126
|
+
**Phase B (needs the scip-java toolchain + a buildable Java project)**
|
|
127
|
+
2. **Plan 0** spike: install the toolchain → build the minimal Maven project `tests/fixtures/scip-samples/HelloSpring/` → produce a real `index.scip` → write §1–§4 of `docs/spikes/2026-06-23-scip-java-findings.md`.
|
|
128
|
+
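- **§1 is the hard exit criterion**: the exact, character-for-character symbol string for `OrderService#calculateOrderPrice()`, the shape of the package descriptor (Maven coordinates vs. the `.` placeholder), the overload disambiguator, and **how call relationships are expressed** (occurrence symbol_roles bitmask / SymbolInformation.relationships / enclosing_range). These directly determine `to_symbolid` and `extract_edges` in Plan 2 and the `_java_method_id` reconstruction in Plan 3.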
|
|
129
|
+
3. **Plan 2**: using the real Plan 0 fixture and the §1 conclusions, land runner → scip_pb2/reader → symbol_map → extract_symbols/edges → JavaScipIndexer, and replace the old declaration-level java plugin.
|
|
130
|
+
4. **Plan 3**: annotations → Spring `http_route` metadata (written at index time, reusing the http_route bridge) → MyBatis plugin. Calibrate `_java_method_id` against the real fixture.
|
|
131
|
+
5. **Plan 4**: Vue `http_calls` (from scratch) → ids → AiMemoryEmitter → enrich → golden full-stack fixture + precision gate (precision ≥ 0.98 for the high tier, wired into CI).
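
One of the open §1 questions in step 2 is whether call relationships can be read from the occurrence `symbol_roles` bitmask. As a starting point, the sketch below decodes that bitmask. The flag values follow the `SymbolRole` enum in the SCIP protobuf schema as I understand it and should be checked against the `scip.proto` actually used to generate `scip_pb2.py`; the helper names are hypothetical, and treating every non-definition occurrence as a reference is an assumption that Plan 0 still has to confirm.

```python
from enum import IntFlag


class SymbolRole(IntFlag):
    """Bit flags mirroring SCIP's SymbolRole enum (verify against scip.proto)."""

    DEFINITION = 0x1
    IMPORT = 0x2
    WRITE_ACCESS = 0x4
    READ_ACCESS = 0x8
    GENERATED = 0x10
    TEST = 0x20
    FORWARD_DEFINITION = 0x40


def is_definition(symbol_roles: int) -> bool:
    """True when the Definition bit is set on an occurrence."""
    return bool(symbol_roles & SymbolRole.DEFINITION)


def is_reference(symbol_roles: int) -> bool:
    """Assumption: any occurrence without the Definition bit is a reference."""
    return not is_definition(symbol_roles)
```

If the real index shows call sites as plain references inside a method's `enclosing_range`, `extract_edges` would pair each reference with its enclosing definition; if not, `SymbolInformation.relationships` is the other candidate named above.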
|
|
132
|
+
|
|
133
|
+
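> The "scip-independent Plan 3/4 subtasks" listed in §3.1 can also be done along the way in Phase A, but note that the points **containing ⚠️Plan-0 constants**, such as `_java_method_id` in Plan 3 and the `REAL` strings for `symbol_map` in Plan 2, must be backfilled after the Plan 0 §1 measurements; otherwise the result is "dangling MyBatis edge sources / failing round-trip assertions". To be safe, leave the ⚠️ constants for Phase B.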
|
|
134
|
+
|
|
135
|
+
---
|
|
136
|
+
|
|
137
|
+
## 5. Execution Rules Reminder (for the Session/Agent Taking Over)
|
|
138
|
+
|
|
139
|
+
- The top of every plan is marked **REQUIRED SUB-SKILL: superpowers:subagent-driven-development (recommended) or superpowers:executing-plans**. Follow it task by task and step by step, tracking steps with `- [ ]`.
|
|
140
|
+
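- **TDD** throughout: write a failing test → run it and confirm FAIL → write the implementation → run it and confirm PASS → commit. Every step in the plans gives the concrete command and the Expected result.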
|
|
141
|
+
- **Do not lower the quality gates**: `ruff check`, `mypy --strict`, `pytest --cov-fail-under=80`, `lint-imports` (import-linter), and bench. New code falls under all of them.
|
|
142
|
+
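- **Do not implement directly on main**: open a worktree or feature branch first (superpowers:using-git-worktrees). The user chose "skip" this time because this round only writes conclusions and implements nothing.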
|
|
143
|
+
- **Commit attribution**: per the global rule, append two `Co-Authored-By` trailers (Claude first, then the user, read from `git config user.name/email` at commit time).
|
|
144
|
+
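- **Degradation rule**: apart from critical failures (a failed scip-java build), no single-layer failure may fail the overall L1 output; keep the last good index, record a diagnostic, and continue.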
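
The degradation rule in the last bullet can be illustrated with a small orchestration loop. This is a sketch only: `CriticalLayerError`, `RunReport`, and `run_layers` are hypothetical names, diagnostics are plain strings here instead of codemap `Diagnostic` objects, and preserving the previous good index is left out.

```python
from dataclasses import dataclass, field
from typing import Callable


class CriticalLayerError(Exception):
    """Stands in for a failure that must abort the run (e.g. a failed build)."""


@dataclass
class RunReport:
    completed: list[str] = field(default_factory=list)
    diagnostics: list[str] = field(default_factory=list)


def run_layers(layers: list[tuple[str, Callable[[], None]]]) -> RunReport:
    """Run each layer; non-critical failures become diagnostics, not aborts."""
    report = RunReport()
    for name, run in layers:
        try:
            run()
        except CriticalLayerError:
            raise  # critical failures propagate and stop the whole run
        except Exception as exc:
            report.diagnostics.append(f"{name}: {exc}")
            continue  # keep going with the remaining layers
        report.completed.append(name)
    return report
```

The ordering of the `except` clauses matters: the critical case is listed first so the broad handler cannot swallow it.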
|
|
145
|
+
|
|
146
|
+
---
|
|
147
|
+
|
|
148
|
+
## 6. Handover Checklist (Start Here After Switching Environments)
|
|
149
|
+
|
|
150
|
+
- [ ] Pull/sync the codemap repository into the new environment and confirm this file is in `docs/spikes/`.
|
|
151
|
+
- [ ] Install scip-java + maven (or gradle) + protoc/grpcio-tools (see the commands in §1).
|
|
152
|
+
- [ ] Prepare a target Java project that passes `mvn -DskipTests compile` (or use the minimal HelloSpring sample from Plan 0 Task 2).
|
|
153
|
+
- [ ] Open a worktree / feature branch (do not write on main).
|
|
154
|
+
- [ ] To avoid waiting: do **all of Plan 1** first (Phase A; none of the tools above are needed).
|
|
155
|
+
- [ ] Once the toolchain is ready, do **Plan 0** and fill in §1–§4 of `2026-06-23-scip-java-findings.md` with measured evidence.
|
|
156
|
+
- [ ] Backfill the ⚠️ constants in Plan 2/3 from the §1 findings, then proceed through Plan 2 → 3 → 4 in order.
|
|
157
|
+
- [ ] Once the whole chain runs, validate with the Plan 4 golden precision gate and wire it into CI.
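
For the final checklist item, a precision gate over high-confidence edges might look roughly like this. The edge representation (source, kind, target tuples), the function names, and the empty-prediction convention are assumptions for illustration; only the 0.98 threshold comes from this document.

```python
Edge = tuple[str, str, str]  # hypothetical (source, kind, target) triple


def precision(predicted: set[Edge], golden: set[Edge]) -> float:
    """Fraction of predicted edges that also appear in the golden set."""
    if not predicted:
        return 1.0  # convention chosen here: no predictions means no false positives
    return len(predicted & golden) / len(predicted)


def passes_gate(predicted: set[Edge], golden: set[Edge], threshold: float = 0.98) -> bool:
    """True when precision meets the threshold for the high-confidence tier."""
    return precision(predicted, golden) >= threshold
```

Precision alone does not penalize missing edges, so a gate like this is usually paired with a minimum edge count or a recall check to keep an empty index from passing.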
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Appendix: Index of Key Facts Confirmed This Round (So the Next Person Need Not Repeat the Survey)
|
|
162
|
+
|
|
163
|
+
- The old `JavaIndexer` is declaration-level with zero Edges: `plugins/codemap-java/src/codemap_java/indexer.py` (`_Visitor.edges=[]`, no append).
|
|
164
|
+
- Full-build orchestration hook point: `src/codemap/cli/commands/index.py:140-144` (the `else:` block; right after `_run_bridges` is where Plan 1 appends hotspots/emitters).
|
|
165
|
+
- The http_route bridge reads `extra["http_route"]` (server) / `extra["http_calls"]` (client): `src/codemap/core/bridge/http_route.py`.
|
|
166
|
+
- DDL-only SQL plugin: `plugins/codemap-sql/src/codemap_sql/indexer.py` (the docstring states that DML is ignored).
|
|
167
|
+
- No http_calls in Vue/TS: `plugins/codemap-vue` and `plugins/codemap-typescript` return zero grep hits.
|
|
168
|
+
- import-linter contracts: `pyproject.toml` (three forbidden contracts for core/io/indexers, no bridge contract).
|