axm-echo 0.0.1.dev0__tar.gz → 0.1.0__tar.gz

Files changed (49)
  1. axm_echo-0.1.0/.gitignore +48 -0
  2. axm_echo-0.1.0/CONTRIBUTING.md +16 -0
  3. axm_echo-0.1.0/PKG-INFO +84 -0
  4. axm_echo-0.1.0/README.md +58 -0
  5. axm_echo-0.1.0/docs/explanation/architecture.md +58 -0
  6. axm_echo-0.1.0/docs/howto/index.md +9 -0
  7. axm_echo-0.1.0/docs/howto/reuse-check-in-planning.md +78 -0
  8. axm_echo-0.1.0/docs/index.md +98 -0
  9. axm_echo-0.1.0/docs/reference/cli.md +136 -0
  10. axm_echo-0.1.0/docs/tutorials/getting-started.md +92 -0
  11. axm_echo-0.1.0/mkdocs.yml +14 -0
  12. axm_echo-0.1.0/pyproject.toml +262 -0
  13. axm_echo-0.1.0/src/axm_echo/__init__.py +42 -0
  14. axm_echo-0.1.0/src/axm_echo/_version.py +24 -0
  15. axm_echo-0.1.0/src/axm_echo/cluster.py +320 -0
  16. axm_echo-0.1.0/src/axm_echo/corpus.py +318 -0
  17. axm_echo-0.1.0/src/axm_echo/embedding.py +198 -0
  18. axm_echo-0.1.0/src/axm_echo/scope.py +77 -0
  19. axm_echo-0.1.0/src/axm_echo/structural.py +86 -0
  20. axm_echo-0.1.0/src/axm_echo/tools.py +606 -0
  21. axm_echo-0.1.0/src/axm_echo/waiver.py +168 -0
  22. axm_echo-0.1.0/tests/__init__.py +1 -0
  23. axm_echo-0.1.0/tests/conftest.py +3 -0
  24. axm_echo-0.1.0/tests/e2e/__init__.py +1 -0
  25. axm_echo-0.1.0/tests/e2e/conftest.py +3 -0
  26. axm_echo-0.1.0/tests/e2e/test_echo_check.py +78 -0
  27. axm_echo-0.1.0/tests/e2e/test_echo_code.py +78 -0
  28. axm_echo-0.1.0/tests/integration/__init__.py +1 -0
  29. axm_echo-0.1.0/tests/integration/conftest.py +3 -0
  30. axm_echo-0.1.0/tests/integration/test_boilerplate_calibration.py +240 -0
  31. axm_echo-0.1.0/tests/integration/test_discover_package_roots__extract_monorepo.py +164 -0
  32. axm_echo-0.1.0/tests/integration/test_echo_check.py +277 -0
  33. axm_echo-0.1.0/tests/integration/test_echo_code.py +581 -0
  34. axm_echo-0.1.0/tests/integration/test_embed__extract_package.py +240 -0
  35. axm_echo-0.1.0/tests/integration/test_load_scope.py +83 -0
  36. axm_echo-0.1.0/tests/unit/__init__.py +1 -0
  37. axm_echo-0.1.0/tests/unit/conftest.py +3 -0
  38. axm_echo-0.1.0/tests/unit/test_cluster.py +177 -0
  39. axm_echo-0.1.0/tests/unit/test_corpus.py +89 -0
  40. axm_echo-0.1.0/tests/unit/test_echo_code_guards.py +30 -0
  41. axm_echo-0.1.0/tests/unit/test_embedding.py +91 -0
  42. axm_echo-0.1.0/tests/unit/test_structural.py +65 -0
  43. axm_echo-0.1.0/tests/unit/test_version.py +11 -0
  44. axm_echo-0.1.0/tests/unit/test_waiver.py +94 -0
  45. axm_echo-0.0.1.dev0/PKG-INFO +0 -14
  46. axm_echo-0.0.1.dev0/README.md +0 -3
  47. axm_echo-0.0.1.dev0/pyproject.toml +0 -16
  48. axm_echo-0.0.1.dev0/src/axm_echo/__init__.py +0 -1
  49. {axm_echo-0.0.1.dev0 → axm_echo-0.1.0}/src/axm_echo/py.typed +0 -0
@@ -0,0 +1,48 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+ *.so
6
+ # Generated version file
7
+ _version.py
8
+ .Python
9
+ build/
10
+ dist/
11
+ *.egg-info/
12
+ *.egg
13
+
14
+ # Virtual environments
15
+ .venv/
16
+ .uv/
17
+ venv/
18
+ ENV/
19
+
20
+ # Testing & Coverage
21
+ .pytest_cache/
22
+ .coverage
23
+ coverage.xml
24
+ coverage.json
25
+ htmlcov/
26
+ coverage_html/
27
+ .tox/
28
+
29
+ # Type checking & Linting
30
+ .mypy_cache/
31
+ .ruff_cache/
32
+
33
+ # IDE
34
+ .idea/
35
+ .vscode/
36
+ *.swp
37
+ *.swo
38
+
39
+ # Environment
40
+ .env
41
+ .envrc
42
+
43
+ # Documentation
44
+ site/
45
+
46
+ # OS
47
+ .DS_Store
48
+ Thumbs.db
@@ -0,0 +1,16 @@
1
+ # Contributing to axm-echo
2
+
3
+ Thanks for your interest in contributing!
4
+
5
+ ## Development setup
6
+
7
+ ```bash
8
+ uv sync --all-groups
9
+ uv run pytest
10
+ ```
11
+
12
+ ## Pull requests
13
+
14
+ - Follow Conventional Commits for commit messages.
15
+ - Run `make lint` and `make test` before opening a PR.
16
+ - Add tests for new features and bug fixes.
@@ -0,0 +1,84 @@
1
+ Metadata-Version: 2.4
2
+ Name: axm-echo
3
+ Version: 0.1.0
4
+ Summary: Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).
5
+ Project-URL: Homepage, https://github.com/axm-protocols/axm-forge-workspace
6
+ Project-URL: Documentation, https://axm-protocols.github.io/axm-forge-workspace/
7
+ Project-URL: Repository, https://github.com/axm-protocols/axm-forge-workspace.git
8
+ Project-URL: Issues, https://github.com/axm-protocols/axm-forge-workspace/issues
9
+ Author-email: Gabriel Jarry <gabriel@axm-protocols.io>
10
+ License-Expression: MIT
11
+ Classifier: Development Status :: 3 - Alpha
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: License :: OSI Approved :: MIT License
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Programming Language :: Python :: 3.13
16
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
17
+ Classifier: Typing :: Typed
18
+ Requires-Python: >=3.12
19
+ Requires-Dist: axm
20
+ Requires-Dist: axm-ast
21
+ Requires-Dist: numpy>=2.5.0
22
+ Requires-Dist: scikit-learn>=1.9.0
23
+ Requires-Dist: sentence-transformers>=5.6.0
24
+ Requires-Dist: torch>=2.12.1
25
+ Description-Content-Type: text/markdown
26
+
27
+ # axm-echo
28
+
29
+ Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).
30
+
31
+ <p align="center">
32
+ <a href="https://forge.axm-protocols.io/audit/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-audit.json" alt="axm-audit"></a>
33
+ <a href="https://forge.axm-protocols.io/init/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-init.json" alt="axm-init"></a>
34
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/axm-quality.yml"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/coverage.json" alt="Coverage"></a>
35
+ <img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+">
36
+ </p>
37
+
38
+ ---
39
+
40
+ ## Overview
41
+
42
+ Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).
43
+
44
+ ## Features
45
+
46
+ - **Neural by default** — the `st` (MiniLM) backend ships in the base install
47
+ (`torch` + `sentence-transformers`) and runs in-process; no extra to enable.
48
+ - **`tfidf` opt-out** — the pure-CPU `numpy` + `scikit-learn` backend stays
49
+ available (`--backend tfidf`) for callers that want to avoid loading torch.
50
+ - Built on `axm-ast` for code-corpus extraction — the corpus feeding both
51
+ `echo_code` (cross-package dedup) and `echo_check` (reuse retrieval).
52
+
53
+ ## Installation
54
+
55
+ ```bash
56
+ # echo is neural by default — the install ships torch + sentence-transformers.
57
+ uv add axm-echo
58
+ ```
59
+
60
+ Or as a workspace dependency in `pyproject.toml`:
61
+
62
+ ```toml
63
+ [project]
64
+ dependencies = ["axm-echo"]
65
+
66
+ [tool.uv.sources]
67
+ axm-echo = { workspace = true }
68
+ ```
69
+
70
+ ## Development
71
+
72
+ This package is part of the **axm-forge-workspace** uv workspace.
73
+
74
+ ```bash
75
+ # Run tests for this package
76
+ uv run pytest --package axm-echo
77
+
78
+ # From workspace root
79
+ make test-axm-echo
80
+ ```
81
+
82
+ ## License
83
+
84
+ MIT — © 2026 Gabriel Jarry
@@ -0,0 +1,58 @@
1
+ # axm-echo
2
+
3
+ Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).
4
+
5
+ <p align="center">
6
+ <a href="https://forge.axm-protocols.io/audit/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-audit.json" alt="axm-audit"></a>
7
+ <a href="https://forge.axm-protocols.io/init/"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-init.json" alt="axm-init"></a>
8
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/axm-quality.yml"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/coverage.json" alt="Coverage"></a>
9
+ <img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="Python 3.12+">
10
+ </p>
11
+
12
+ ---
13
+
14
+ ## Overview
15
+
16
+ Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).
17
+
18
+ ## Features
19
+
20
+ - **Neural by default** — the `st` (MiniLM) backend ships in the base install
21
+ (`torch` + `sentence-transformers`) and runs in-process; no extra to enable.
22
+ - **`tfidf` opt-out** — the pure-CPU `numpy` + `scikit-learn` backend stays
23
+ available (`--backend tfidf`) for callers that want to avoid loading torch.
24
+ - Built on `axm-ast` for code-corpus extraction — the corpus feeding both
25
+ `echo_code` (cross-package dedup) and `echo_check` (reuse retrieval).
26
+
27
+ ## Installation
28
+
29
+ ```bash
30
+ # echo is neural by default — the install ships torch + sentence-transformers.
31
+ uv add axm-echo
32
+ ```
33
+
34
+ Or as a workspace dependency in `pyproject.toml`:
35
+
36
+ ```toml
37
+ [project]
38
+ dependencies = ["axm-echo"]
39
+
40
+ [tool.uv.sources]
41
+ axm-echo = { workspace = true }
42
+ ```
43
+
44
+ ## Development
45
+
46
+ This package is part of the **axm-forge-workspace** uv workspace.
47
+
48
+ ```bash
49
+ # Run tests for this package
50
+ uv run pytest --package axm-echo
51
+
52
+ # From workspace root
53
+ make test-axm-echo
54
+ ```
55
+
56
+ ## License
57
+
58
+ MIT — © 2026 Gabriel Jarry
@@ -0,0 +1,58 @@
1
+ # Architecture
2
+
3
+ `axm-echo` is a flat set of single-responsibility modules — no `core/` /
4
+ `adapters/` split, no hexagonal layering. The two `axm.tools` entry points
5
+ (`echo_code`, `echo_check`) orchestrate a shared
6
+ **corpus → embed → compare** pipeline; everything else is a leaf the tools
7
+ compose.
8
+
9
+ ```mermaid
10
+ graph TD
11
+ subgraph "Tools (axm.tools entry points)"
12
+ EchoCode["EchoCodeTool · echo_code"]
13
+ EchoCheck["EchoCheckTool · echo_check"]
14
+ end
15
+
16
+ subgraph "Pipeline"
17
+ Corpus["corpus · extract_package / extract_monorepo"]
18
+ Embedding["embedding · embed / neighbors (tfidf | st)"]
19
+ Cluster["cluster · cross_pairs / split_pairs / cluster_pairs"]
20
+ Waiver["waiver · cluster_hash / acknowledged"]
21
+ end
22
+
23
+ subgraph "Leaves"
24
+ Scope["scope · load_scope (~/axm/echo.toml)"]
25
+ Structural["structural · jaccard_similarity (stdlib, no torch)"]
26
+ end
27
+
28
+ EchoCode --> Corpus
29
+ EchoCode --> Embedding
30
+ EchoCode --> Cluster
31
+ EchoCode --> Waiver
32
+ EchoCheck --> Corpus
33
+ EchoCheck --> Embedding
34
+ Corpus --> Scope
35
+ Corpus -->|axm-ast| Embedding
36
+ ```
37
+
38
+ ## Modules
39
+
40
+ | Module | Role |
41
+ |---|---|
42
+ | `tools` | The `echo_code` / `echo_check` `AXMTool`s (MCP + CLI + DAG node). They run the pipeline and shape the `ToolResult`. |
43
+ | `corpus` | Extract public symbols from a package (`extract_package`) or the whole scope (`extract_monorepo`) via `axm-ast`; each `Symbol` carries an `embed_text`. |
44
+ | `embedding` | The two backends behind `embed()` — `tfidf` (scikit-learn, pure CPU) and `st` (MiniLM, neural). `neighbors()` does exact cosine top-k. |
45
+ | `cluster` | Cross-package candidate pairs (`cross_pairs`), the v7 anti-signal split (`split_pairs`: dupes / parallel-API / boilerplate), and union-find clustering. |
46
+ | `waiver` | The acknowledged-cluster mechanism: a stable `cluster_hash` and the `[[tool.axm-echo.acknowledged]]` waiver lifecycle (mark / stale). |
47
+ | `scope` | Resolve the workspace roots to scan from `~/axm/echo.toml`, degrading to the current directory when absent. |
48
+ | `structural` | 100%-structural similarity over `ast.FunctionDef` bodies (`statement_set` + `jaccard_similarity`); pure stdlib, never loads torch. The primitive `duplicate_tests` reuses. |
49
+
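The `structural` row above describes Jaccard similarity over normalized statement-sets. The following is a minimal stdlib sketch of that idea, not the package's code: the names `statement_set_sketch` and `jaccard_sketch`, the placeholder values, and the convention that two empty bodies score 1.0 are all assumptions made for illustration.

```python
import ast


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers and constants with fixed placeholders (illustrative)."""

    def visit_Name(self, node: ast.Name) -> ast.AST:
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_Constant(self, node: ast.Constant) -> ast.AST:
        return ast.copy_location(ast.Constant(value=0), node)


def statement_set_sketch(func: ast.FunctionDef) -> frozenset[str]:
    """Dump each top-level statement of the body after normalization."""
    return frozenset(ast.dump(_Normalizer().visit(stmt)) for stmt in func.body)


def jaccard_sketch(a: frozenset[str], b: frozenset[str]) -> float:
    """|a & b| / |a | b|; two empty sets are treated as identical (assumption)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


SRC = '''
def f(x):
    y = x + 1
    return y

def g(a):
    b = a + 2
    return b

def h(a):
    print(a)
    return a
'''

funcs = {n.name: n for n in ast.parse(SRC).body if isinstance(n, ast.FunctionDef)}
sets = {name: statement_set_sketch(fn) for name, fn in funcs.items()}

same_shape = jaccard_sketch(sets["f"], sets["g"])   # identical after normalization
partly_same = jaccard_sketch(sets["f"], sets["h"])  # only the return statement is shared
print(same_shape, partly_same)
```

Because only `ast` is imported, a comparison like this never touches torch, which is the property the table attributes to the real module.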
50
+ ## Design decisions
51
+
52
+ | Decision | Rationale |
53
+ |---|---|
54
+ | Neural by default (`st`/MiniLM) | Docstring similarity wants semantics; `torch` + `sentence-transformers` ship in the base install. |
55
+ | `tfidf` backend kept | A pure-CPU opt-out for callers that must avoid loading torch — `embed(texts, backend="tfidf")` and `--backend tfidf`. |
56
+ | Lazy torch import | `torch` is imported only inside the `st` backend, so the `tfidf` path stays light at runtime even though torch is installed. |
57
+ | Flat modules, no hexagonal split | Each module is one concern with a small public surface; the tools compose them. No abstract ports to swap. |
58
+ | Exact cosine (no ANN) | Corpora are monorepo-sized; brute-force matmul is exact and fast enough, with no index to maintain. |
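To make the "exact cosine, no ANN" decision concrete, here is a small pure-Python sketch of brute-force top-k retrieval. It is an illustration only: `top_k_cosine` is a hypothetical name, and the real `neighbors()` presumably does the same work as one matrix product over normalized rows rather than a Python loop.

```python
import math


def _unit(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k_cosine(
    query: list[float], matrix: list[list[float]], k: int = 5
) -> list[tuple[int, float]]:
    """Score the query against every row and return the k best (index, score) pairs."""
    q = _unit(query)
    scored = [
        (idx, sum(a * b for a, b in zip(q, _unit(row))))
        for idx, row in enumerate(matrix)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k_cosine([1.0, 0.0], matrix, k=2)
for idx, score in hits:
    print(f"{score:.3f} row {idx}")
```

Every row is scored on every query, so there is no index to build or invalidate; the cost is linear in corpus size, which the table argues is acceptable for monorepo-sized corpora.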
@@ -0,0 +1,9 @@
1
+ # How-To Guides
2
+
3
+ Task-oriented guides for common workflows.
4
+
5
+ ## Available Guides
6
+
7
+ - [Reuse check during planning with `echo_check`](reuse-check-in-planning.md) —
8
+ retrieve existing monorepo symbols for a ticket's intention and decide
9
+ reuse / extend / develop before drafting it.
@@ -0,0 +1,78 @@
1
+ # Reuse check during planning with `echo_check`
2
+
3
+ When a plan is decomposed into tickets (the `/plan-tickets` workflow), every
4
+ ticket whose scope is *"develop a helper / function / class that does X"*
5
+ risks minting a duplicate of something the monorepo already provides — a
6
+ fifth retry helper, a third CSV reader, another `slugify`. `echo_check`
7
+ turns that risk into a deliberate decision.
8
+
9
+ This guide shows how to wire `echo_check` into the planning step that
10
+ gathers codebase intelligence, and how to act on its result.
11
+
12
+ ## When to run it
13
+
14
+ Run the reuse check **only** for tickets that introduce *reusable
15
+ behaviour* — a new unit worth deduplicating. Skip it for pure glue, config
16
+ edits, dependency bumps, or scopes that are already "wire / refactor
17
+ existing code": there is no new helper to deduplicate there.
18
+
19
+ ## 1. Retrieve the closest existing symbols
20
+
21
+ Call `echo_check` on the **intention** — a free-form description of the
22
+ behaviour the ticket would build:
23
+
24
+ ```python
25
+ from axm_echo.tools import EchoCheckTool
26
+
27
+ result = EchoCheckTool().execute(
28
+ intention="resilient HTTP call with retry on transient 5xx errors",
29
+ )
30
+ candidates = result.data["candidates"]
31
+ ```
32
+
33
+ Via MCP / CLI the same call is `axm echo_check` or
34
+ `axm_call(name="echo_check", arguments={"intention": "..."})`.
35
+
36
+ Each candidate carries its `qualname`, `package`, `score`, full docstring
37
+ (`doc_full`), a location `verdict`, and a `promotable` flag:
38
+
39
+ | Field | Meaning |
40
+ |---|---|
41
+ | `verdict = "reuse_canonical"` | The hit lives in the canonical commons (`axm-ingot`) — reuse the canonical symbol directly. |
42
+ | `verdict = "reuse_in_place"` | A real helper exists in some package but has not been canonicalised — reuse it **in place** from `<package>`; do not mint a duplicate just because it is not in the ingot yet. |
43
+ | `promotable = True` | A well-documented non-ingot candidate worth canonicalising later. |
44
+
45
+ An **empty** candidate list means nothing scored above the retrieval
46
+ threshold — the intention is genuinely novel.
47
+
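Since the next step is to read docstrings rather than scores, it can help to print the candidates as a review sheet. The sketch below only uses the candidate fields listed in the table above; the helper name `review_sheet` and the sample candidate (its score and docstring text) are made up for illustration.

```python
def review_sheet(candidates: list[dict]) -> str:
    """Format echo_check-style candidates for a human or agent to read."""
    if not candidates:
        return "no candidate above threshold: intention looks novel"
    lines = []
    for rank, cand in enumerate(candidates, start=1):
        flag = " (promotable)" if cand.get("promotable") else ""
        lines.append(
            f"{rank}. {cand['qualname']} [{cand['package']}] "
            f"score={cand['score']:.3f} {cand['verdict']}{flag}"
        )
        # The full docstring is what the reuse decision should rest on.
        lines.append(f"   {cand['doc_full']}")
    return "\n".join(lines)


# Hypothetical sample shaped like result.data["candidates"].
sample = [
    {
        "qualname": "request_with_retry",
        "package": "axm-commons",
        "score": 0.71,
        "doc_full": "Send an HTTP request, retrying on transient 5xx errors.",
        "verdict": "reuse_in_place",
        "promotable": True,
    }
]
sheet = review_sheet(sample)
print(sheet)
```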
48
+ ## 2. Decide reuse / extend / develop — read the docstrings, not the score
49
+
50
+ `echo_check` *retrieves and ranks*; it deliberately does **not** decide.
51
+ A `PARTIAL` match (similar docstring, different contract) can outrank a
52
+ perfect one, so never branch on the score or the verdict tag alone. Read
53
+ each candidate's `doc_full` and signature, compare its real contract
54
+ against the intention, and pick one branch:
55
+
56
+ | Decision | When | Ticket effect |
57
+ |---|---|---|
58
+ | **reuse** | A candidate already does *exactly* what the intention needs. | Rewrite the ticket to *"import / reuse `<qualname>` from `<package>` + wire it in"*; drop the implementation tasks, keep only wiring + tests. |
59
+ | **extend** | A candidate is the right *canonical* home but misses a parameter / mode / edge case. | Emit an **extension ticket** on `<qualname>` in `<package>`, and make the consumer ticket `blocks`-depend on it (extension lands first). |
60
+ | **develop** | No candidate covers the intention (empty list, or all near-misses with a different contract). | Write the "develop a helper" ticket as normal. |
61
+
62
+ ## 3. Worked example
63
+
64
+ Spec line: *"the screener needs a resilient HTTP call (retry on 5xx)."*
65
+
66
+ ```python
67
+ EchoCheckTool().execute(
68
+ intention="resilient HTTP call with retry on transient errors",
69
+ )
70
+ ```
71
+
72
+ If a candidate like `request_with_retry [axm-commons]` comes back with a
73
+ docstring matching the contract, **do not** emit "develop a retry helper".
74
+ Emit a **reuse** ticket — *"reuse `request_with_retry` from `axm-commons`
75
+ in the screener fetch path"* — with its implementation tasks dropped.
76
+
77
+ If nothing matches (empty candidate list), the helper genuinely does not
78
+ exist yet: emit the develop ticket.
@@ -0,0 +1,98 @@
1
+ ---
2
+ hide:
3
+ - navigation
4
+ - toc
5
+ ---
6
+
7
+ # axm-echo
8
+
9
+ <p align="center">
10
+ <strong>Neural similarity & echo detection over code corpora (MiniLM + scikit-learn).</strong>
11
+ </p>
12
+
13
+ <p align="center">
14
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/ci.yml">
15
+ <img src="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/ci.yml/badge.svg" alt="CI" />
16
+ </a>
17
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/axm-quality.yml">
18
+ <img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-init.json" alt="axm-init" />
19
+ </a>
20
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/axm-quality.yml">
21
+ <img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/axm-audit.json" alt="axm-audit" />
22
+ </a>
23
+ <a href="https://github.com/axm-protocols/axm-forge-workspace/actions/workflows/axm-quality.yml">
24
+ <img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/axm-protocols/axm-forge-workspace/gh-pages/badges/axm-echo/coverage.json" alt="Coverage" />
25
+ </a>
26
+ <img src="https://img.shields.io/badge/python-3.12+-blue.svg" alt="Python 3.12+" />
27
+ </p>
28
+
29
+ ---
30
+
31
+ ## Installation
32
+
33
+ ```bash
34
+ # echo is neural by default — the install ships torch + sentence-transformers
35
+ # (MiniLM) alongside numpy + scikit-learn.
36
+ uv add axm-echo
37
+ ```
38
+
39
+ The neural `st` backend is the in-process default. The `tfidf` backend stays
40
+ pure-CPU and never loads torch, for callers that want to skip the model.
41
+
42
+ ## Quick Start
43
+
44
+ ```python
45
+ from axm_echo import embed, extract_monorepo, neighbors
46
+
47
+ # 1. Build a corpus of public symbols across the configured workspaces
48
+ # (driven by ~/axm/echo.toml, falling back to the current dir).
49
+ symbols = extract_monorepo()
50
+ texts = [s["embed_text"] for s in symbols]
51
+
52
+ # 2. Embed it. "st" (MiniLM) is the neural default; "tfidf" stays pure-CPU.
53
+ matrix = embed(texts, backend="tfidf")
54
+
55
+ # 3. Find the nearest neighbours of a symbol (exact cosine top-k).
56
+ for idx, score in neighbors(matrix[0], matrix, k=5):
57
+ print(f"{score:.3f} {symbols[idx]['qualname']}")
58
+ ```
59
+
60
+ ## Features
61
+
62
+ - ✅ **`echo_code` cross-package echo detection** — the `axm echo_code` tool
63
+ (MCP + CLI + DAG node) clusters intent-equivalent duplicate symbols across
64
+ packages, with the v7 anti-signals (trivial-accessor filter, parallel-API
65
+ demotion, boilerplate-frequency demotion) applied
66
+ - ✅ **Liveable `echo_code` report** — bounded `--top-n` display (the neural
67
+ pass still finds them all, only the output is capped; the total stays
68
+ visible), `--max-cluster-size` rejection of union-find over-merges, and an
69
+ acknowledged-cluster **waiver** workflow (`[[tool.axm-echo.acknowledged]]`
70
+ in the scan-root `pyproject.toml`) that excludes intended echoes and reports
71
+ stale waivers to retire
72
+ - ✅ **`echo_check` intent retrieval** — the `axm echo_check` tool
73
+ (MCP + CLI + DAG node) embeds a free-form intention and returns the top-k
74
+ nearest monorepo symbols with their docstrings, each tagged with a location
75
+ verdict (reuse canonical / reuse in place / promotable); it does the
76
+ retrieval, leaving the use / extend / nothing decision to the caller
77
+ - ✅ **Structural similarity** — `statement_set` / `jaccard_similarity`
78
+ (with `flatten_body` / `normalize_dump`) compare two `ast.FunctionDef`
79
+ bodies by Jaccard over constant/identifier-normalized statement-sets;
80
+ 100% structural, pure stdlib, never loads torch
81
+ - ✅ **Two embedding backends** — `st` (MiniLM `all-MiniLM-L6-v2`, the neural
82
+ default) and `tfidf` (code, scikit-learn), selected by a registry
83
+ - ✅ **Exact neighbour search** — brute-force cosine matmul, no ANN
84
+ - ✅ **Lazy torch import** — `torch` + `sentence-transformers` ship in the
85
+ base install (neural-by-default), but torch is imported only inside the
86
+ `st` backend, so the `tfidf` path never loads it at runtime
87
+ - ✅ **axm-ast corpus extractor** — public symbols with signature +
88
+ docstring, `embed_text` falling back to code when undocumented
89
+ - ✅ **Scope loader** — `~/axm/echo.toml`, graceful degradation to the
90
+ current workspace
91
+ - ✅ **Modern Python** — 3.12+ with strict typing
92
+
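The registry and lazy-import bullets above can be pictured with a small sketch. This is not the package's `embed()`: `embed_sketch`, `BACKENDS`, and the toy `bow` backend (a bag-of-words stand-in for a light, torch-free path) are hypothetical names. The `st` branch uses the public `SentenceTransformer(...).encode(...)` call with the model name quoted above, and is never executed in the example.

```python
import sys
from collections import Counter
from typing import Callable

Vectors = list[list[float]]


def _embed_bow(texts: list[str]) -> Vectors:
    """Toy bag-of-words counts; stands in for a light, torch-free backend."""
    vocab = sorted({tok for text in texts for tok in text.lower().split()})
    rows = []
    for text in texts:
        counts = Counter(text.lower().split())
        rows.append([float(counts[tok]) for tok in vocab])
    return rows


def _embed_st(texts: list[str]) -> Vectors:
    """Neural backend: the heavy import happens only when this function runs."""
    from sentence_transformers import SentenceTransformer  # deferred import

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts).tolist()


# Registry: backend name -> embedding function.
BACKENDS: dict[str, Callable[[list[str]], Vectors]] = {
    "bow": _embed_bow,
    "st": _embed_st,
}


def embed_sketch(texts: list[str], backend: str = "st") -> Vectors:
    """Dispatch through the registry; unknown names fail with a clear message."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; choose from {sorted(BACKENDS)}")
    return BACKENDS[backend](texts)


rows = embed_sketch(["retry http call", "parse csv file"], backend="bow")
neural_loaded = "sentence_transformers" in sys.modules
print(len(rows), len(rows[0]), neural_loaded)
```

Keeping the import inside `_embed_st` means that merely importing the module, or calling the light backend, does not pay the model-loading cost even though the dependency is installed.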
93
+ ---
94
+
95
+ <div style="text-align: center; margin: 2rem 0;">
96
+ <a href="tutorials/getting-started/" class="md-button md-button--primary">Get Started →</a>
97
+ <a href="reference/cli/" class="md-button">Reference</a>
98
+ </div>
@@ -0,0 +1,136 @@
1
+ # CLI Reference
2
+
3
+ ## Commands
4
+
5
+ ### `axm echo_code`
6
+
7
+ Detect cross-package code **echoes** — intent-equivalent duplicate symbols
8
+ across the configured monorepo. The tool walks the scope, embeds every public
9
+ documented symbol, finds cross-package pairs whose docstrings are semantically
10
+ close, applies the v7 anti-signals, and prints the surviving **clusters** plus
11
+ the demoted parallel-API / boilerplate buckets.
12
+
13
+ The command is auto-registered from the `axm.tools` entry point, so the same
14
+ implementation is reachable as an MCP tool and a DAG `tool_node` too.
15
+
16
+ ```bash
17
+ # Cluster echoes across the corpus (~/axm/echo.toml scope, or the cwd).
18
+ axm echo_code
19
+
20
+ # The default backend is neural "st" (MiniLM, in-process). Opt into the
21
+ # pure-CPU tfidf backend to avoid loading torch.
22
+ axm echo_code --backend tfidf
23
+
24
+ # Raise the cosine floor for a candidate pair (default 0.55).
25
+ axm echo_code --threshold 0.7
26
+
27
+ # Show only the 10 nearest actionable clusters (the total stays in the header).
28
+ axm echo_code --top-n 10
29
+
30
+ # Tighten the over-merge guard (default 50): drop any component above 20.
31
+ axm echo_code --max-cluster-size 20
32
+ ```
33
+
34
+ | Option | Default | Description |
35
+ | -- | -- | -- |
36
+ | `--backend` | `st` | Embedding backend: `st` (neural MiniLM, the in-process default) or `tfidf` (pure CPU, no torch). |
37
+ | `--threshold` | `0.55` | Minimum cosine similarity for a candidate pair. |
38
+ | `--top-n` | `30` | Bound the report to the N nearest *non-acknowledged* clusters. The neural pass still finds them all — only the display is bounded; the total count stays in the header. |
39
+ | `--max-cluster-size` | `50` | Reject any connected component larger than this as a union-find over-merge (a structural-conformity signal, not a duplicate echo — a genuine duplicate is 2-5 members). |
40
+
41
+ Output names the tool, the live/shown/actionable cluster counts, the corpus
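The `--max-cluster-size` guard is easier to see with a sketch of union-find clustering over candidate pairs. The function name `cluster_pairs_sketch`, its return shape, and the sample symbol names are assumptions for illustration; only the "reject components larger than the limit" rule comes from the table above.

```python
from collections import defaultdict


def cluster_pairs_sketch(
    pairs: list[tuple[str, str]], max_cluster_size: int = 50
) -> tuple[list[list[str]], list[list[str]]]:
    """Merge pairs into connected components, then split kept vs over-merged."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: dict[str, list[str]] = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)

    kept, rejected = [], []
    for members in groups.values():
        target = rejected if len(members) > max_cluster_size else kept
        target.append(sorted(members))
    return sorted(kept), sorted(rejected)


pairs = [("a.f", "b.f"), ("b.f", "c.f"), ("x.g", "y.g")]
kept_default, rejected_default = cluster_pairs_sketch(pairs)
kept_tight, rejected_tight = cluster_pairs_sketch(pairs, max_cluster_size=2)
print(kept_tight, rejected_tight)
```

The chain `a.f`, `b.f`, `c.f` shows why the guard exists: union-find is transitive, so loosely related pairs can snowball into one component that no longer represents a single duplicated intent.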
42
+ size, and the demoted buckets, then lists each shown cluster's members with
43
+ their package and docstring first line:
44
+
45
+ ```text
46
+ echo_code | 8 clusters, 3 shown (8 actionable) | corpus 16 symbols | 0 parallel-API · 0 boilerplate (demoted)
47
+
48
+ cluster 1 sim=1.000 (2 symbols)
49
+ axm_commons.errors.RateLimitError [axm-commons] “Raised when the upstream API rate limit has been exceeded.”
50
+ axm_bib.errors.RateLimitError [axm-bib] “Raised when the upstream API rate limit has been exceeded.”
51
+ ```
52
+
53
+ #### Acknowledging a cluster (waiver)
54
+
55
+ A genuine cross-package echo that is *intended* (a parallel API, a deliberate
56
+ wrapper) is noise on every run. Acknowledge it in the **scan-root** `pyproject.toml`
57
+ (the first workspace root in `~/axm/echo.toml`) so it drops out of the
58
+ actionable top-N. Each entry is a 12-hex `cluster_hash` (printed in the tool's
59
+ `data.clusters[*].cluster_hash`) plus a non-empty `reason`:
60
+
61
+ ```toml
62
+ [[tool.axm-echo.acknowledged]]
63
+ hash = "ca29d81fb73c"
64
+ reason = "parallel API, intended cross-package duplication"
65
+ ```
66
+
67
+ An acknowledged *live* cluster is marked `acknowledged` and excluded from the
68
+ top-N and the `actionable_count`. The mechanism is self-cleaning: a waiver whose
69
+ hash no longer matches any live cluster is reported under
70
+ `data.stale_acknowledged` ("this waiver no longer serves a purpose, retire it")
71
+ — informative, never blocking. A malformed entry (bad hash, empty reason) is
72
+ rejected gracefully into `data.acknowledged_errors`; the run never crashes.
73
+
74
+ ### `axm echo_check`
75
+
76
+ Retrieve the public symbols closest to a free-form **intention**, ranked by
77
+ semantic similarity across the whole monorepo. Before writing a new helper,
78
+ ask `echo_check` what already exists: it embeds the intention, returns the
79
+ top-k nearest documented symbols with their docstrings, and tags each with a
80
+ location **verdict** so you know whether to reuse the canonical symbol, reuse
81
+ one in place, or promote it.
82
+
83
+ The verdict is a *location* tag, not a decision: a high score means "this is
84
+ the closest existing promise", never "use this". The use / extend / nothing
85
+ call is left to the calling agent — a partial match may legitimately score
86
+ above an exact one.
87
+
88
+ Like `echo_code`, the command is auto-registered from the `axm.tools` entry
89
+ point, so the same implementation is reachable as an MCP tool and a DAG
90
+ `tool_node` too.
91
+
92
+ ```bash
93
+ # Retrieve the closest existing symbols for an intention.
94
+ axm echo_check --intention "HTTP request with retry and backoff"
95
+
96
+ # The default backend is neural "st" (MiniLM, in-process). Opt into the
97
+ # pure-CPU tfidf backend to avoid loading torch.
98
+ axm echo_check --intention "slugify a string" --backend tfidf
99
+
100
+ # Raise the retrieval floor / cap the number of candidates.
101
+ axm echo_check --intention "parse a CSV file" --threshold 0.5 --k 3
102
+ ```
103
+
104
+ | Option | Default | Description |
105
+ | -- | -- | -- |
106
+ | `--intention` | `""` | Free-form description of the behaviour to implement. |
107
+ | `--backend` | `st` | Embedding backend: `st` (neural MiniLM, the in-process default) or `tfidf` (pure CPU, no torch). |
108
+ | `--k` | `10` | Maximum number of candidates to return. |
109
+ | `--threshold` | `0.30` | Minimum cosine similarity for a candidate to be retrieved. Below it the candidate is dropped, so a novel intention returns an empty list rather than a spurious match. |
110
+
111
+ The verdict is set by the candidate's package: a hit in `axm-ingot` is
112
+ `reuse_canonical`; anything else is `reuse_in_place` (with a `promotable→ingot`
113
+ hint when the symbol is documented well enough to be worth canonicalising).
114
+
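The threshold, top-k, and verdict rules above fit in a few lines. The sketch below is illustrative only: `rank_candidates` is a hypothetical helper, the "documented well enough" test is a placeholder word-count predicate (the real criterion is not stated here), and the two `axm-bib` rows are invented sample data.

```python
from typing import Callable

CANONICAL_PACKAGE = "axm-ingot"


def rank_candidates(
    scored: list[dict],
    k: int = 10,
    threshold: float = 0.30,
    is_well_documented: Callable[[str], bool] = lambda doc: len(doc.split()) >= 8,
) -> list[dict]:
    """Drop hits below the floor, keep the k best, and tag each by location."""
    survivors = [c for c in scored if c["score"] >= threshold]
    survivors.sort(key=lambda c: c["score"], reverse=True)
    ranked = []
    for cand in survivors[:k]:
        canonical = cand["package"] == CANONICAL_PACKAGE
        ranked.append(
            {
                **cand,
                "verdict": "reuse_canonical" if canonical else "reuse_in_place",
                # Placeholder rule: only non-canonical hits can be promotable.
                "promotable": (not canonical) and is_well_documented(cand["doc_full"]),
            }
        )
    return ranked


scored = [
    {
        "qualname": "axm_ingot.net.fetch_url",
        "package": "axm-ingot",
        "score": 0.762,
        "doc_full": "Perform an HTTP request, retrying with backoff on transient errors.",
    },
    {
        "qualname": "axm_bib.http.get_json",  # invented sample row
        "package": "axm-bib",
        "score": 0.41,
        "doc_full": "Fetch a URL and decode the JSON body, raising on non-2xx responses.",
    },
    {
        "qualname": "axm_bib.text.slugify",  # invented sample row
        "package": "axm-bib",
        "score": 0.12,
        "doc_full": "Slugify.",
    },
]

ranked = rank_candidates(scored)
for cand in ranked:
    print(cand["qualname"], cand["verdict"], cand["promotable"])
```

Note that the tag depends only on the package, never on the score, which matches the point made earlier that the verdict is a location hint and not a decision.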
115
+ Output names the tool, the intention, the candidate count, and the corpus
116
+ size, then lists each ranked candidate with its package, similarity, verdict
117
+ and docstring first line:
118
+
119
+ ```text
120
+ echo_check | “HTTP request with retry and backoff” | 1 candidates | corpus 2 symbols
121
+
122
+ 1. axm_ingot.net.fetch_url [axm-ingot] sim=0.762 reuse_canonical
123
+ "Perform an HTTP request, retrying with backoff on transient errors."
124
+ ```
125
+
126
+ When nothing crosses the threshold the report says so explicitly, rather than
127
+ surfacing a weak false match:
128
+
129
+ ```text
130
+ echo_check | “render a mermaid sequence diagram” | 0 candidates | corpus 2 symbols
131
+ (no candidate above threshold — likely novel)
132
+ ```
133
+
134
+ ## Python API
135
+
136
+ Auto-generated API reference is available under [Python API](api/).