@x-quantum-tech/repolens 0.1.0
- package/LICENSE +21 -0
- package/README.md +118 -0
- package/examples/repolens.config.example.json +38 -0
- package/package.json +43 -0
- package/repolens.mjs +1421 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Maurizio Tarricone — X Quantum Tech (xquantumtech.com)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

package/README.md
ADDED
@@ -0,0 +1,118 @@
# ◳ repolens

> **Messy or gigantic repo? Don't waste tokens to map it.**
> Run one command, get the full map — and feed it to your AI.

Every time an LLM agent opens a mid-size repository, it burns **100k–500k tokens** re-discovering facts that never change between commits: where the routes are, what tables exist, which env vars matter, which file is the god-file. **repolens extracts all of it deterministically, in seconds, with zero LLM calls** — one self-contained script, no dependencies, Node 18+.

```bash
# one-shot, no install (after npm publish)
npx @x-quantum-tech/repolens . --config repolens.config.json --out docs/repo-map

# or just grab the single file - it has zero dependencies
node repolens.mjs . --config repolens.config.json --out docs/repo-map
```

Then point your agent's `CLAUDE.md` / `AGENTS.md` at `docs/repo-map.md` and every session starts with the map already in context instead of exploring from scratch.

## What you get

| Output | What it is |
|---|---|
| `repo-map.md` | Compact index for humans **and LLMs** — every entry is a clickable `file#L<line>` link (works on GitHub and VS Code) |
| `repo-map.json` | Pretty-printed structured data: file tree, import graph, every catalog row with `file`, `line`, `link` provenance |
| `repo-map.html` | Self-contained interactive dashboard (no CDN, works offline): stat cards, area/language bars, **zoomable treemap**, dependency highlighting, sortable + searchable catalog tables. UI in English or Italian (`"lang"` in config) |

## What it extracts — out of the box, framework-agnostic

- **File tree + LOC + languages**, rendered as a squarified treemap
- **Import graph** (JS/TS/Python): fan-in/fan-out, dependency hubs, god files (≥1500 LOC)
- **HTTP routes** — Express/Hono/Koa (`app.get("/x")`), raw matching (`path === "/api/x"` + nearby method detection, Cloudflare Workers style), prefix routes, Python decorators — each with an auth heuristic and `file:line`
- **Database tables** — `CREATE TABLE` (with columns) + `ALTER TABLE`, merged per table, from `.sql` and inline SQL
- **Environment variables** with reference counts and locations
- **HTML pages → API calls** — which endpoints each page's JS actually hits
- **npm scripts**, **Cloudflare wrangler bindings** (D1/R2/KV/queues/crons)

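As a rough illustration of the import-graph bullet above, fan-in and fan-out can be derived from a plain list of import edges. This is a minimal sketch, not repolens' actual code: the `graphStats` name and the `edges` / `loc` input shapes are assumptions made for the example. Only the 1500-LOC god-file threshold comes from the list above.

```javascript
// Hypothetical sketch: derive fan-in/fan-out and god files from import edges.
// `edges` is a list of [importer, imported] pairs; `loc` maps file -> line count.
// Both shapes are assumptions for this example, not repolens' data model.
function graphStats(edges, loc, godFileLoc = 1500) {
  const stats = new Map();
  const entry = (file) => {
    if (!stats.has(file)) stats.set(file, { file, fanIn: 0, fanOut: 0 });
    return stats.get(file);
  };
  for (const [from, to] of edges) {
    entry(from).fanOut += 1; // `from` imports one more file
    entry(to).fanIn += 1;    // `to` is imported by one more file
  }
  // Dependency hubs: most-imported files first.
  const hubs = [...stats.values()].sort((a, b) => b.fanIn - a.fanIn);
  // God files: anything at or above the LOC threshold.
  const godFiles = Object.keys(loc).filter((f) => loc[f] >= godFileLoc);
  return { hubs, godFiles };
}

const result = graphStats(
  [
    ["src/a.js", "src/db.js"],
    ["src/b.js", "src/db.js"],
    ["src/a.js", "src/b.js"],
  ],
  { "src/a.js": 120, "src/b.js": 1800, "src/db.js": 300 }
);

console.log(result.hubs[0]);  // { file: 'src/db.js', fanIn: 2, fanOut: 0 }
console.log(result.godFiles); // [ 'src/b.js' ]
```

A file with high fan-in is risky to change, and a file that is both a hub and a god file is usually the first refactoring candidate.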
## Not just code — it maps *any* folder of *any* files

repolens is not tied to source code. It walks **any directory** and handles every file in one of three tiers:

| Tier | Which files | What you get |
|---|---|---|
| **1 · Inventory** (always) | **everything**, including binaries | tree, sizes, treemap, per-type counts. Images, video, PDFs, fonts, archives are *located and sized* — not opened |
| **2 · Text** (automatic) | any text file | line counts, language, size-weighted treemap |
| **3 · Deep extraction** | text matching your extractors | structured catalogs (see below) |

Because tier 3 is **regex-over-text**, it works on **any textual format** — Markdown, YAML, TOML, CSV, `.env`, JSON, HTML, SQL, OpenAPI specs, logs, prose — not only code. A few non-code uses, each just one extractor in the config:

- **Knowledge base / docs** — pull front-matter, headings, tags and `[[wiki-links]]` out of a folder of Markdown notes (e.g. an Obsidian vault) → a navigable catalog + a treemap of your knowledge
- **Glossaries / datasets** — terms from a CSV, keys from a YAML, entries from a data dictionary
- **API specs** — endpoints straight from an `openapi.yaml`
- **Content audits** — every `H1/H2` across your docs, or every `TODO`/`FIXME`
- **Config sprawl** — every feature flag, event name, or secret-name across the repo

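To make "regex-over-text" concrete, the sketch below runs two patterns over a Markdown note and records the line each hit came from. It only illustrates the idea and is not taken from repolens: the `scan` helper, the sample note and both patterns are assumptions. Scanning line by line is a simplification that cannot handle patterns spanning several lines.

```javascript
// Hypothetical sketch: regex extraction over non-code text (a Markdown note).
// `scan` is an illustrative helper, not part of repolens.
function scan(text, pattern, field) {
  const rows = [];
  text.split("\n").forEach((lineText, i) => {
    // matchAll requires a global regex; both patterns below carry the `g` flag.
    for (const m of lineText.matchAll(pattern)) {
      rows.push({ [field]: m[1], line: i + 1 }); // 1-based line number
    }
  });
  return rows;
}

const note = [
  "# Billing",
  "See [[Invoices]] and [[Refund policy]].",
  "TODO: document chargebacks",
].join("\n");

const links = scan(note, /\[\[([^\]]+)\]\]/g, "link");
const todos = scan(note, /\b(?:TODO|FIXME):?\s*(.+)/g, "todo");

console.log(links); // two rows, both on line 2: "Invoices" and "Refund policy"
console.log(todos); // one row on line 3: "document chargebacks"
```

The same two patterns could be expressed as extractors in the config, with the glob limiting them to `*.md` files.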
The `repo-map.json` is then **structured data ready to feed an LLM** (or a RAG pipeline, a dashboard, a script).

**What it does *not* do (yet):** it inventories binaries but doesn't read inside them (no PDF/`.docx`/`.xlsx` text extraction — that would need dependencies, breaking the zero-dep promise); the import graph covers JS/TS/Python (other languages are ~15 lines of regex each); and it extracts *structure and facts*, not *meaning* — semantic chunking/embeddings stay downstream with your LLM.

## What people use it for

- **LLM agent onboarding** — the map replaces exploration; sessions start informed, tokens go to the actual task
- **Structured data extraction** — any repeated pattern in your codebase becomes a queryable JSON catalog (tool definitions, feature flags, CLI commands, event names…) ready to feed into RAG pipelines, dashboards or scripts
- **Refactor planning** — god files and dependency hubs tell you exactly where the pain is
- **API inventory & audit** — every route, with method, auth hint and source location
- **DB schema overview** — reconstructed from migrations, no DB connection needed
- **Human onboarding** — hand a new dev the HTML dashboard instead of a tour
- **Docs that never rot** — regenerate on demand or pre-commit; the map is code, not prose
- **CI drift checks** — diff `repo-map.json` between commits to catch new routes/tables/env vars in review

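The CI drift check in the last bullet could look roughly like the sketch below, which compares two parsed maps and reports the catalog rows that were added or removed. The layout it assumes, `{ catalogName: [row, ...] }`, is a guess made for illustration; the real `repo-map.json` structure may differ, and `catalogDrift` and `keyOf` are hypothetical names.

```javascript
// Hypothetical sketch: report catalog rows added or removed between two maps.
// Assumes each map looks like { catalogName: [row, ...] }; the real
// repo-map.json layout may differ.
function catalogDrift(before, after, keyOf) {
  const drift = {};
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const name of names) {
    const oldKeys = new Set((before[name] || []).map(keyOf));
    const newKeys = new Set((after[name] || []).map(keyOf));
    const added = [...newKeys].filter((k) => !oldKeys.has(k));
    const removed = [...oldKeys].filter((k) => !newKeys.has(k));
    if (added.length || removed.length) drift[name] = { added, removed };
  }
  return drift;
}

// Key rows by their content and ignore provenance, so code that merely
// moved to another line is not reported as drift.
const keyOf = ({ file, line, link, ...fields }) => JSON.stringify(fields);

const drift = catalogDrift(
  { routes: [{ method: "GET", path: "/api/users", file: "src/api.js", line: 10 }] },
  {
    routes: [
      { method: "GET", path: "/api/users", file: "src/api.js", line: 14 },
      { method: "POST", path: "/api/users", file: "src/api.js", line: 30 },
    ],
    env: [{ name: "STRIPE_KEY", file: "src/pay.js", line: 3 }],
  },
  keyOf
);

console.log(drift.routes.added); // [ '{"method":"POST","path":"/api/users"}' ]
console.log(drift.env.added);    // [ '{"name":"STRIPE_KEY"}' ]
```

In a CI job the two inputs would come from `JSON.parse` over the map generated at the base commit and at the head commit, and a non-empty result could be posted as a review comment.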
## Repo-specific knowledge: custom extractors

The core stays agnostic; your domain plugs in via config. Each extractor is a glob + regex + field names — every match becomes a catalog row with automatic `file:line` provenance, a tab in the HTML and a section in the MD:

```jsonc
{
  "name": "my-repo",
  "lang": "en",
  "ignore": ["**/*.generated.*", "fixtures/**"],
  "extractors": [
    {
      "name": "mcp_tools",
      "title": "MCP tools",
      "glob": "server/agent.js",
      "pattern": "tool\\(\\s*\"([\\w-]+)\",\\s*\"(.*?)\"",
      "flags": "g",
      "fields": ["tool", "description"],
      "maxLen": 260, // truncate long captures
      "unique": true, // dedup identical rows
      "postSplit": { // explode a capture into a list + count
        "field": "tools",
        "pattern": "\"(mcp__\\w+)\""
      }
    },
    {
      // Non-code example: index a Markdown knowledge base by front-matter.
      "name": "notes",
      "title": "Knowledge notes",
      "glob": "docs/**/*.md",
      "pattern": "title:\\s*([^\\n]+)[\\s\\S]*?tags:\\s*([^\\n]+)",
      "flags": "",
      "fields": ["title", "tags"]
    }
  ]
}
```

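The sketch below shows one plausible way such an extractor could be applied to a single file: each capture group is mapped to a field name, long captures are truncated, duplicates are dropped, and the line number is computed from the match offset. It is an illustration under stated assumptions, not repolens' implementation: `runExtractor` is a hypothetical name, `postSplit` is left out, and treating a missing `g` flag as "first match only" is an assumption.

```javascript
// Hypothetical sketch of applying one extractor to one file's text.
// Mirrors only the config fields shown above; not repolens' own code.
function runExtractor(extractor, file, text) {
  const { pattern, flags = "", fields, maxLen = Infinity, unique = false } = extractor;
  const re = new RegExp(pattern, flags);
  // Assumption: without the `g` flag only the first match is taken.
  const matches = re.global ? [...text.matchAll(re)] : [re.exec(text)].filter(Boolean);
  const seen = new Set();
  const rows = [];
  for (const m of matches) {
    const row = {};
    fields.forEach((name, i) => {
      // Capture group i + 1 fills field i, truncated to maxLen characters.
      row[name] = (m[i + 1] ?? "").slice(0, maxLen);
    });
    const key = JSON.stringify(row);
    if (unique && seen.has(key)) continue; // drop identical rows
    seen.add(key);
    // Provenance: file plus the 1-based line where the match starts.
    row.file = file;
    row.line = text.slice(0, m.index).split("\n").length;
    rows.push(row);
  }
  return rows;
}

const source = [
  'tool("search-docs", "Full-text search over the docs"),',
  'tool("get-user", "Fetch one user by id"),',
  'tool("get-user", "Fetch one user by id"),',
].join("\n");

const rows = runExtractor(
  {
    pattern: 'tool\\(\\s*"([\\w-]+)",\\s*"(.*?)"',
    flags: "g",
    fields: ["tool", "description"],
    maxLen: 260,
    unique: true,
  },
  "server/agent.js",
  source
);

console.log(rows.length); // 2 (the duplicate on line 3 is dropped)
console.log(rows[1]);     // tool "get-user", found at server/agent.js line 2
```

Deduplicating before attaching `file` and `line` means two identical captures in different places collapse into one row, which is one reading of "dedup identical rows"; keeping the provenance in the key would preserve every location instead.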
## Philosophy

Deterministic extraction beats model exploration for everything that is *structural*. Architectural sensors like [sentrux](https://github.com/sentrux/sentrux) tell you **how healthy** your structure is; repolens tells you **what's in it and where**. Use both.

## Credits

Built by **Maurizio Tarricone** · [X Quantum Tech](https://xquantumtech.com) — AI innovation for real-world businesses.

## License

MIT

package/examples/repolens.config.example.json
ADDED
@@ -0,0 +1,38 @@
{
  "name": "my-project",
  "lang": "en",
  "ignore": [
    "**/*.generated.*",
    "fixtures/**",
    "docs/repo-map.*"
  ],
  "extractors": [
    {
      "name": "feature_flags",
      "title": "Feature flags",
      "glob": "src/**/*.{ts,js}",
      "pattern": "isEnabled\\(\\s*[\"']([\\w.-]+)[\"']",
      "flags": "g",
      "fields": ["flag"],
      "unique": true
    },
    {
      "name": "cli_commands",
      "title": "CLI commands",
      "glob": "src/cli/**/*.ts",
      "pattern": "command\\(\\s*[\"']([\\w:-]+)[\"']\\s*,\\s*[\"']([^\"']{0,200})",
      "flags": "g",
      "fields": ["command", "description"],
      "maxLen": 200
    },
    {
      "name": "events",
      "title": "Emitted events",
      "glob": "**/*.{ts,js,mjs}",
      "pattern": "emit\\(\\s*[\"']([\\w:.-]+)[\"']",
      "flags": "g",
      "fields": ["event"],
      "unique": true
    }
  ]
}

package/package.json
ADDED
@@ -0,0 +1,43 @@
{
  "name": "@x-quantum-tech/repolens",
  "version": "0.1.0",
  "description": "Messy or gigantic repo? Don't waste tokens to map it. Zero-dependency repository mapper for humans and LLM agents: MD index + JSON graph + interactive treemap dashboard.",
  "type": "module",
  "bin": {
    "repolens": "./repolens.mjs"
  },
  "publishConfig": {
    "access": "public"
  },
  "files": [
    "repolens.mjs",
    "README.md",
    "LICENSE",
    "examples"
  ],
  "engines": {
    "node": ">=18"
  },
  "keywords": [
    "llm",
    "ai-agents",
    "repository-map",
    "codebase-map",
    "treemap",
    "code-visualization",
    "static-analysis",
    "claude",
    "context",
    "zero-dependency"
  ],
  "author": "Maurizio Tarricone (https://xquantumtech.com)",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/xQUANTUMTECH/repolens.git"
  },
  "bugs": {
    "url": "https://github.com/xQUANTUMTECH/repolens/issues"
  },
  "homepage": "https://github.com/xQUANTUMTECH/repolens#readme"
}