@lon-ask/dockit 0.1.1 → 0.1.4

Files changed (140)
  1. package/LICENSE +21 -674
  2. package/README.md +618 -328
  3. package/SKILL.md +61 -110
  4. package/apps/client/dist/assets/{index-CqOXxsEZ.js → index-DzadxeQH.js} +2 -2
  5. package/apps/client/dist/index.html +1 -1
  6. package/apps/client/index.html +12 -0
  7. package/apps/client/package.json +26 -0
  8. package/apps/client/src/App.tsx +18 -0
  9. package/apps/client/src/api/client.ts +54 -0
  10. package/apps/client/src/components/BuildPanel.tsx +77 -0
  11. package/apps/client/src/components/DocViewer.tsx +76 -0
  12. package/apps/client/src/components/EntryDetail.tsx +322 -0
  13. package/apps/client/src/components/EntryForm.tsx +117 -0
  14. package/apps/client/src/components/EntryList.tsx +165 -0
  15. package/apps/client/src/components/GlobalSearchBar.tsx +166 -0
  16. package/apps/client/src/components/Layout.tsx +57 -0
  17. package/apps/client/src/components/SearchBar.tsx +103 -0
  18. package/apps/client/src/components/SourceForm.tsx +497 -0
  19. package/apps/client/src/hooks/useTheme.ts +19 -0
  20. package/apps/client/src/index.css +77 -0
  21. package/apps/client/src/main.tsx +13 -0
  22. package/apps/client/src/types.ts +105 -0
  23. package/apps/client/vite.config.ts +13 -0
  24. package/apps/server/dist/core/domain/entry.js +20 -0
  25. package/apps/server/dist/core/domain/entry.js.map +1 -0
  26. package/apps/server/dist/core/domain/errors.js +33 -0
  27. package/apps/server/dist/core/domain/errors.js.map +1 -0
  28. package/apps/server/dist/core/domain/knowledge-graph.js +2 -0
  29. package/apps/server/dist/core/domain/knowledge-graph.js.map +1 -0
  30. package/apps/server/dist/core/domain/types.js +2 -0
  31. package/apps/server/dist/core/domain/types.js.map +1 -0
  32. package/apps/server/dist/core/ports/IBuildRepository.js +2 -0
  33. package/apps/server/dist/core/ports/IBuildRepository.js.map +1 -0
  34. package/apps/server/dist/core/ports/IDocumentNormalizer.js +2 -0
  35. package/apps/server/dist/core/ports/IDocumentNormalizer.js.map +1 -0
  36. package/apps/server/dist/core/ports/IDocumentStore.js +2 -0
  37. package/apps/server/dist/core/ports/IDocumentStore.js.map +1 -0
  38. package/apps/server/dist/core/ports/IEntryReadModel.js +2 -0
  39. package/apps/server/dist/core/ports/IEntryReadModel.js.map +1 -0
  40. package/apps/server/dist/core/ports/IEntryRepository.js +2 -0
  41. package/apps/server/dist/core/ports/IEntryRepository.js.map +1 -0
  42. package/apps/server/dist/core/ports/IKnowledgeGraph.js +2 -0
  43. package/apps/server/dist/core/ports/IKnowledgeGraph.js.map +1 -0
  44. package/apps/server/dist/core/ports/IPathResolver.js +2 -0
  45. package/apps/server/dist/core/ports/IPathResolver.js.map +1 -0
  46. package/apps/server/dist/core/ports/ISearchEngine.js +2 -0
  47. package/apps/server/dist/core/ports/ISearchEngine.js.map +1 -0
  48. package/apps/server/dist/core/ports/ISourceProcessor.js +2 -0
  49. package/apps/server/dist/core/ports/ISourceProcessor.js.map +1 -0
  50. package/apps/server/dist/core/ports/ISourceRepository.js +2 -0
  51. package/apps/server/dist/core/ports/ISourceRepository.js.map +1 -0
  52. package/apps/server/dist/core/usecases/BuildUseCase.js +76 -0
  53. package/apps/server/dist/core/usecases/BuildUseCase.js.map +1 -0
  54. package/apps/server/dist/core/usecases/ConfigUseCase.js +62 -0
  55. package/apps/server/dist/core/usecases/ConfigUseCase.js.map +1 -0
  56. package/apps/server/dist/core/usecases/SearchUseCase.js +17 -0
  57. package/apps/server/dist/core/usecases/SearchUseCase.js.map +1 -0
  58. package/apps/server/dist/index.js +86 -0
  59. package/apps/server/dist/index.js.map +1 -0
  60. package/apps/server/dist/infrastructure/filesystem/FileSystemDocumentStore.js +25 -0
  61. package/apps/server/dist/infrastructure/filesystem/FileSystemDocumentStore.js.map +1 -0
  62. package/apps/server/dist/infrastructure/graph/GraphSearchDecorator.js +42 -0
  63. package/apps/server/dist/infrastructure/graph/GraphSearchDecorator.js.map +1 -0
  64. package/apps/server/dist/infrastructure/graph/GraphifyKnowledgeGraph.js +145 -0
  65. package/apps/server/dist/infrastructure/graph/GraphifyKnowledgeGraph.js.map +1 -0
  66. package/apps/server/dist/infrastructure/graph/index.js +3 -0
  67. package/apps/server/dist/infrastructure/graph/index.js.map +1 -0
  68. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteBuildRepository.js +21 -0
  69. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteBuildRepository.js.map +1 -0
  70. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteEntryReadModel.js +11 -0
  71. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteEntryReadModel.js.map +1 -0
  72. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteEntryRepository.js +59 -0
  73. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteEntryRepository.js.map +1 -0
  74. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteSourceRepository.js +47 -0
  75. package/apps/server/dist/infrastructure/persistence/sqlite/SqliteSourceRepository.js.map +1 -0
  76. package/apps/server/dist/infrastructure/persistence/sqlite/connection.js +50 -0
  77. package/apps/server/dist/infrastructure/persistence/sqlite/connection.js.map +1 -0
  78. package/apps/server/dist/infrastructure/search/SearchEngineFactory.js +32 -0
  79. package/apps/server/dist/infrastructure/search/SearchEngineFactory.js.map +1 -0
  80. package/apps/server/dist/infrastructure/search/json/JsonSearchEngine.js +147 -0
  81. package/apps/server/dist/infrastructure/search/json/JsonSearchEngine.js.map +1 -0
  82. package/apps/server/dist/infrastructure/search/vector/EmbeddingService.js +23 -0
  83. package/apps/server/dist/infrastructure/search/vector/EmbeddingService.js.map +1 -0
  84. package/apps/server/dist/infrastructure/search/vector/VectorSearchEngine.js +378 -0
  85. package/apps/server/dist/infrastructure/search/vector/VectorSearchEngine.js.map +1 -0
  86. package/apps/server/dist/infrastructure/source-processors/AntoraSourceProcessor.js +11 -0
  87. package/apps/server/dist/infrastructure/source-processors/AntoraSourceProcessor.js.map +1 -0
  88. package/apps/server/dist/infrastructure/source-processors/AsciidocSourceProcessor.js +9 -0
  89. package/apps/server/dist/infrastructure/source-processors/AsciidocSourceProcessor.js.map +1 -0
  90. package/apps/server/dist/infrastructure/source-processors/DocumentNormalizer.js +11 -0
  91. package/apps/server/dist/infrastructure/source-processors/DocumentNormalizer.js.map +1 -0
  92. package/apps/server/dist/infrastructure/source-processors/GithubMarkdownSourceProcessor.js +9 -0
  93. package/apps/server/dist/infrastructure/source-processors/GithubMarkdownSourceProcessor.js.map +1 -0
  94. package/apps/server/dist/infrastructure/source-processors/MavenSourceProcessor.js +9 -0
  95. package/apps/server/dist/infrastructure/source-processors/MavenSourceProcessor.js.map +1 -0
  96. package/apps/server/dist/infrastructure/source-processors/PathResolver.js +5 -0
  97. package/apps/server/dist/infrastructure/source-processors/PathResolver.js.map +1 -0
  98. package/apps/server/dist/infrastructure/source-processors/SourceCodeSourceProcessor.js +269 -0
  99. package/apps/server/dist/infrastructure/source-processors/SourceCodeSourceProcessor.js.map +1 -0
  100. package/apps/server/dist/infrastructure/source-processors/ZipSourceProcessor.js +9 -0
  101. package/apps/server/dist/infrastructure/source-processors/ZipSourceProcessor.js.map +1 -0
  102. package/apps/server/dist/mcp-http.js +93 -0
  103. package/apps/server/dist/mcp-http.js.map +1 -0
  104. package/apps/server/dist/mcp.js +339 -0
  105. package/apps/server/dist/mcp.js.map +1 -0
  106. package/apps/server/dist/routes/build.js +89 -0
  107. package/apps/server/dist/routes/build.js.map +1 -0
  108. package/apps/server/dist/routes/entries.js +52 -0
  109. package/apps/server/dist/routes/entries.js.map +1 -0
  110. package/apps/server/dist/routes/graph.js +58 -0
  111. package/apps/server/dist/routes/graph.js.map +1 -0
  112. package/apps/server/dist/routes/search.js +24 -0
  113. package/apps/server/dist/routes/search.js.map +1 -0
  114. package/apps/server/dist/routes/sources.js +100 -0
  115. package/apps/server/dist/routes/sources.js.map +1 -0
  116. package/apps/server/dist/routes/viewer.js +22 -0
  117. package/apps/server/dist/routes/viewer.js.map +1 -0
  118. package/apps/server/dist/services/antora.js +222 -0
  119. package/apps/server/dist/services/antora.js.map +1 -0
  120. package/apps/server/dist/services/asciidoc.js +206 -0
  121. package/apps/server/dist/services/asciidoc.js.map +1 -0
  122. package/apps/server/dist/services/configLoader.js +150 -0
  123. package/apps/server/dist/services/configLoader.js.map +1 -0
  124. package/apps/server/dist/services/githubMarkdown.js +221 -0
  125. package/apps/server/dist/services/githubMarkdown.js.map +1 -0
  126. package/apps/server/dist/services/maven.js +148 -0
  127. package/apps/server/dist/services/maven.js.map +1 -0
  128. package/apps/server/dist/services/normalizer.js +42 -0
  129. package/apps/server/dist/services/normalizer.js.map +1 -0
  130. package/apps/server/dist/services/paths.js +5 -0
  131. package/apps/server/dist/services/paths.js.map +1 -0
  132. package/apps/server/dist/services/textExtractor.js +46 -0
  133. package/apps/server/dist/services/textExtractor.js.map +1 -0
  134. package/apps/server/dist/services/zip.js +63 -0
  135. package/apps/server/dist/services/zip.js.map +1 -0
  136. package/apps/server/package.json +38 -0
  137. package/apps/server/src/infrastructure/source-processors/SourceCodeSourceProcessor.ts +9 -2
  138. package/bin/commands/dev.ts +2 -2
  139. package/bin/commands/serve.ts +2 -2
  140. package/package.json +22 -4
package/README.md CHANGED
@@ -1,496 +1,786 @@
1
1
  # Dockit
2
2
 
3
- Local documentation hub that aggregates multiple documentation source types (ZIP, Maven, Antora, AsciiDoc, GitHub Markdown) into a unified, searchable HTML bundle — useful as LLM context. Also supports **source code knowledge graphs** powered by Graphify (Tree-sitter AST), producing structural dependency graphs for 15+ languages.
3
+ > Built with [OpenCode](https://opencode.ai) and [DeepSeek](https://deepseek.com)
4
4
 
5
- Ships with two search engines: a lightweight **TF-IDF engine** and a **hybrid semantic+keyword engine** (LanceDB + all-MiniLM-L6-v2 embeddings) configurable via a single toggle.
5
+ > **Local documentation hub**: aggregate, index, and search your project's documentation and source code.
6
+ > Runs entirely offline. Ships as a single CLI binary via npm.
6
7
 
7
- All operational data (SQLite DB, build outputs, search indexes, knowledge graphs, embeddings model cache) is stored in `~/.dockit/` by default. Override with the `DOCKIT_DATA_DIR` environment variable.
8
+ ## Why Dockit
8
9
 
9
- ## Quick Start
10
+ Modern software teams juggle multiple documentation sources: auto-generated API docs, hand-written Markdown guides, AsciiDoc references, Antora sites, Maven Javadoc JARs, ZIP archives. Each source lives in its own silo with its own search bar. When an LLM coding agent needs to answer a framework question, it either hallucinates from training data or scrolls through GitHub.
11
+
12
+ Dockit solves this by ingesting **five documentation source types** and **source code** into a single, offline searchable index. It runs entirely on your machine: no cloud, no API keys, and no internet access required after the initial build.
13
+
14
+ ### What it does
15
+
16
+ - **Indexes documentation** from ZIP bundles, Maven Javadoc, Antora sites, AsciiDoc repos, GitHub Markdown repos
17
+ - **Builds source code knowledge graphs** via Graphify (Tree-sitter AST) — traces imports, calls, and inheritance across 15+ languages
18
+ - **Searches with hybrid TF-IDF + vector semantic engine** — keyword precision + conceptual understanding
19
+ - **Exposes an MCP server** so AI coding agents (Claude, Cline, OpenCode) can query docs on-demand
20
+ - **Works completely offline** — LanceDB embedded vector DB, ONNX embeddings model, local SQLite
21
+
22
+ ### Who it's for
23
+
24
+ | Role | Use case |
25
+ |------|----------|
26
+ | **LLM coding agents** | Query up-to-date framework docs instead of relying on stale training data |
27
+ | **Developers** | Search your project's docs + code structure from the terminal |
28
+ | **Teams** | Pre-build doc indexes once, share across the team |
29
+ | **Air-gapped environments** | Full offline operation with pre-seeded models and indexes |
30
+
31
+ ---
32
+
33
+ ## Installation
34
+
35
+ ### Method 1: npm registry (recommended)
36
+
37
+ ```bash
38
+ npm install -g @lon-ask/dockit
39
+ ```
40
+
41
+ After global install, the `dockit` command is available in your PATH:
10
42
 
11
43
  ```bash
12
- # 1. Clone and install
13
- git clone https://github.com/your-org/dockit.git
44
+ dockit --help
45
+ dockit list
46
+ ```
47
+
48
+ ### Method 2: npx (zero-install)
49
+
50
+ Run dockit on-demand without installing anything:
51
+
52
+ ```bash
53
+ npx @lon-ask/dockit list
54
+ npx @lon-ask/dockit init --path ./my-project --code-path src
55
+ npx @lon-ask/dockit search my-project "authentication"
56
+ ```
57
+
58
+ `npx` downloads the package to a temp cache and executes it. Perfect for one-off usage or CI pipelines. Set `DOCKIT_DATA_DIR` to persist data across invocations.
59
+
60
+ ### Method 3: Build from source
61
+
62
+ ```bash
63
+ git clone https://github.com/karthik20/dockit.git
14
64
  cd dockit
15
65
  npm install
16
- pip3 install graphify openai # optional — for source code graphs
66
+ npm run build
67
+ npm link # makes 'dockit' available globally
17
68
 
18
- # 2. Make CLI available globally
19
- npm link
69
+ pip3 install graphify # optional: only needed for source code knowledge graphs
70
+ ```
20
71
 
21
- # 3a. Build pre-configured docs (one-time per entry)
22
- dockit build quarkus
72
+ ### Prerequisites
23
73
 
24
- # 3b. Or init a project with source code + markdown scanning
25
- dockit init --path /path/to/project --code-path src
74
+ | Requirement | Needed for |
75
+ |-------------|-----------|
76
+ | Node.js 18+ | Runtime |
77
+ | Python 3.8+ & pip | Graphify source code graphs (optional) |
78
+ | Graphify (`pip install graphify`) | `source-code` source type, `graphifyEnabled` on doc sources |
79
+ | Maven (`mvn`) | Maven Javadoc source type |
80
+ | Antora CLI | Antora source type (auto-installed via npm dep) |
81
+ | Git | Cloning repos for AsciiDoc, Markdown, Source Code sources |
26
82
 
27
- # 4. Start searching
28
- dockit search quarkus "configure cache"
29
- dockit graph query dockit "BuildUseCase" # if source-code built
83
+ ### Data storage
84
+
85
+ All operational data lives in `~/.dockit/` by default. Override with `DOCKIT_DATA_DIR`:
86
+
87
+ ```bash
88
+ export DOCKIT_DATA_DIR=/path/to/custom/data
30
89
  ```
31
90
 
32
- ## Pre-configured Documentation
91
+ | Path | Contents |
92
+ |------|----------|
93
+ | `~/.dockit/dockit.db` | SQLite database (entries, sources, builds) |
94
+ | `~/.dockit/dockit.yaml` | Your config (auto-created by `dockit init`) |
95
+ | `~/.dockit/.lancedb/` | Vector search index (LanceDB) |
96
+ | `~/.dockit/models/` | ONNX embedding model cache |
97
+ | `~/.dockit/{entryId}/bundle/` | Built HTML docs per entry |
98
+ | `~/.dockit/{entryId}/graph.json` | Knowledge graph (source-code entries) |
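
The storage layout in the table above can be summarised as a small path-resolution helper. This is a minimal sketch and not Dockit's actual code: `resolveDataDir` and `entryPaths` are hypothetical names, and treating an empty `DOCKIT_DATA_DIR` as unset is an assumption.

```typescript
// Hypothetical helper names; only the paths themselves come from the table above.
type Env = Record<string, string | undefined>;

// Pick the data directory: DOCKIT_DATA_DIR wins, otherwise <home>/.dockit.
// Assumption: an empty variable is treated the same as an unset one.
function resolveDataDir(env: Env, homeDir: string): string {
  const override = env["DOCKIT_DATA_DIR"];
  return override !== undefined && override !== "" ? override : `${homeDir}/.dockit`;
}

// Build the shared and per-entry paths listed in the storage table.
function entryPaths(dataDir: string, entryId: string) {
  return {
    database: `${dataDir}/dockit.db`,
    config: `${dataDir}/dockit.yaml`,
    vectorIndex: `${dataDir}/.lancedb/`,
    modelCache: `${dataDir}/models/`,
    bundle: `${dataDir}/${entryId}/bundle/`,
    graph: `${dataDir}/${entryId}/graph.json`,
  };
}
```

In a Node.js program the arguments would typically be `process.env` and `os.homedir()`; they are passed in explicitly here so the sketch has no runtime dependencies.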
33
99
 
34
- | Entry | Version | Source Type | Description |
35
- |-------|---------|-------------|-------------|
36
- | **Quarkus** | 3.35 | AsciiDoc | Quarkus framework documentation |
37
- | **Quarkus Core** | 3.35.2 | Maven Javadoc | Quarkus Core API reference |
38
- | **React** | 19 | GitHub Markdown | React library documentation |
39
- | **Spring Boot** | 3.5.x | Antora | Production-ready Spring applications |
40
- | **Spring Framework** | 7.x | Antora | Core Spring ecosystem reference |
41
- | **Quarkus Source Code** | 3.35 | Source Code | Quarkus framework source — knowledge graph |
42
- | **Quarkus (Docs + Code)** | 3.35 | AsciiDoc + Source Code | Docs + code graph combined |
43
- | **Dockit** | 1.0 | Source Code + Markdown | Self-hosted dockit project entry |
100
+ ---
44
101
 
45
- Add your own entries by editing `dockit.yaml` or using `dockit init` — see [Supported Sources](#supported-documentation-sources) below.
102
+ ## Quick Start
46
103
 
47
- ## Search Engine
104
+ ### Index your own project (30 seconds)
48
105
 
49
- Dockit ships two search engines toggleable via `dockit.yaml`:
106
+ ```bash
107
+ # From your project directory
108
+ npx @lon-ask/dockit init --code-path src
109
+
110
+ # This:
111
+ # 1. Scans all .md files → searchable docs
112
+ # 2. Runs Graphify on src/ → knowledge graph
113
+ # 3. Builds vector search index
114
+ # 4. Saves config to ~/.dockit/dockit.yaml
115
+ ```
50
116
 
51
- ```yaml
52
- search:
53
- engine: vector # 'vector' (default) | 'json' (TF-IDF fallback)
117
+ Now search:
118
+
119
+ ```bash
120
+ npx @lon-ask/dockit search my-project "authentication"
121
+ npx @lon-ask/dockit graph gods my-project
122
+ npx @lon-ask/dockit graph path my-project "createApp" "startServer"
54
123
  ```
55
124
 
56
- | | JSON (TF-IDF) | Vector (Hybrid) |
57
- |---|---|---|
58
- | **Storage per entry** | ~300 KB | ~32 MB (LanceDB) |
59
- | **Runtime memory** | Minimal | ~200 MB |
60
- | **Build time** | Fast | Slower (embeds docs via ONNX) |
61
- | **Search method** | Term-frequency scoring (title 10x, headings 3x, body 1x) | Hybrid: parallel cosine ANN + BM25 FTS → RRF fusion |
62
- | **Keyword precision** | High (exact term matches) | Very high (FTS component recovers keywords) |
63
- | **Semantic matching** | None | Yes (finds conceptually related docs even without exact terms) |
64
- | **Model** | None | `all-MiniLM-L6-v2` via `@huggingface/transformers` (~88 MB ONNX) |
65
- | **Works offline** | Yes — zero dependencies | Yes — model bundles in npm package |
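
The field weighting quoted for the JSON engine (title 10x, headings 3x, body 1x) can be illustrated with a short scoring function. This is a sketch under stated assumptions rather than the code in `JsonSearchEngine.js`: the `Doc` shape, the tokenizer, and the function names are hypothetical, and the inverse-document-frequency part of TF-IDF is left out to keep the example small.

```typescript
// Hypothetical document shape; the real index format is not shown in the README.
interface Doc {
  title: string;
  headings: string[];
  body: string;
}

// Field weights as stated in the comparison table.
const FIELD_WEIGHTS = { title: 10, headings: 3, body: 1 };

// Lowercase and split on anything that is not a letter or digit (an assumed tokenizer).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Count exact occurrences of a term in a piece of text.
function countTerm(text: string, term: string): number {
  return tokenize(text).filter((t) => t === term).length;
}

// Sum weighted term frequencies over every query term.
function scoreDoc(doc: Doc, query: string): number {
  let total = 0;
  for (const term of tokenize(query)) {
    total += FIELD_WEIGHTS.title * countTerm(doc.title, term);
    total += FIELD_WEIGHTS.headings * countTerm(doc.headings.join(" "), term);
    total += FIELD_WEIGHTS.body * countTerm(doc.body, term);
  }
  return total;
}
```

A single title hit outweighs several body hits under this weighting, which is consistent with the table's claim that the JSON engine favours exact keyword matches.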
125
+ ### Index framework docs
66
126
 
67
- ### How hybrid search works
127
+ ```bash
128
+ # Build pre-configured entries from dockit.yaml
129
+ dockit build quarkus # 30 min, 3500+ AsciiDoc pages → ~800 MB vector index
130
+ dockit build react # 2 min, 200+ Markdown pages
131
+ dockit build spring-boot # 15 min, Antora site
68
132
 
69
- ```
70
- query vector (cosine ANN) + BM25 (FTS) in parallel
71
- → deduplicate per-path → RRF fusion → top N
133
+ # Search across all built entries
134
+ dockit search "configure cache"
72
135
  ```
73
136
 
74
- - Vector search finds semantically related pages (e.g., "Ahead-of-Time Caching" for a cache query)
75
- - FTS recovers exact keyword matches (e.g., "caffeine" in the Caching guide)
76
- - Reciprocal Rank Fusion combines both with dynamic weighting: confident FTS gets 2x weight, uncertain FTS gets 0.7x
77
- - Title matches in FTS results get an additional 1.5x boost
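
The fusion step described in the list above can be sketched as weighted Reciprocal Rank Fusion. This is an illustration only, not the logic in `VectorSearchEngine.js`: the `Hit` shape, the constant `RRF_K = 60`, the `ftsConfident` flag, and the substring test for a title match are all assumptions. Only the 2x, 0.7x, and 1.5x factors come from the README text.

```typescript
// Hypothetical result shape for both the vector and the FTS result lists.
interface Hit {
  path: string;
  title: string;
}

const RRF_K = 60; // commonly used RRF constant; assumed, not taken from Dockit

function rrfFuse(
  vectorHits: Hit[],
  ftsHits: Hit[],
  query: string,
  ftsConfident: boolean, // how confidence is decided is not described; passed in here
  topN: number,
): string[] {
  const scores = new Map<string, number>();
  const needle = query.toLowerCase();

  const addList = (hits: Hit[], weight: number, boostTitles: boolean): void => {
    const seen = new Set<string>();
    let rank = 0;
    for (const hit of hits) {
      if (seen.has(hit.path)) continue; // deduplicate per path, keeping the best rank
      seen.add(hit.path);
      rank += 1;
      let contribution = weight / (RRF_K + rank);
      if (boostTitles && hit.title.toLowerCase().includes(needle)) {
        contribution *= 1.5; // title-match boost for FTS results
      }
      scores.set(hit.path, (scores.get(hit.path) ?? 0) + contribution);
    }
  };

  addList(vectorHits, 1.0, false);
  addList(ftsHits, ftsConfident ? 2.0 : 0.7, true);

  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN)
    .map(([path]) => path);
}
```

Because each list contributes `weight / (k + rank)`, a page that appears in both lists accumulates score from each, which is how a keyword hit and a semantic hit on the same page reinforce one another.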
137
+ ### Use with an AI agent
78
138
 
79
- ### When to use each
139
+ Add dockit as an MCP server in your AI tool's config:
80
140
 
81
- - **`json`**: Low-resource environments, fast builds, or when exact keyword matching is sufficient
82
- - **`vector`** (default): Better discovery of non-obvious matches, section-level chunking with heading context in results, hybrid search that matches both keywords and concepts
141
+ ```json
142
+ // ~/.config/opencode/opencode.json
143
+ {
144
+ "mcp": {
145
+ "dockit": {
146
+ "type": "local",
147
+ "command": ["npx", "@lon-ask/dockit", "mcp"],
148
+ "enabled": true
149
+ }
150
+ }
151
+ }
152
+ ```
83
153
 
84
- ## CLI Usage (Recommended)
154
+ The agent can then call `dockit_search`, `dockit_graph_query`, etc. automatically.
85
155
 
86
- The CLI is the primary way to interact with Dockit. Works from any directory, requires no server process, and is ideal for LLM agents that can execute shell commands.
156
+ ---
87
157
 
88
- ### Commands
158
+ ## CLI Commands
89
159
 
90
160
  | Command | Description |
91
161
  |---------|-------------|
162
+ | `dockit init --path <dir> [--code-path <sub>]` | Index a local project (markdown + source code) |
92
163
  | `dockit search [<entry>] <query>` | Search documentation |
93
- | `dockit search <query>` | Global search: top result per entry |
94
- | `dockit search [<entry>] <query> --get-top [N]` | Fetch full content for top N results (default 3) |
164
+ | `dockit search [<entry>] <query> --get-top [N]` | Search + fetch full content for top N results |
95
165
  | `dockit list` | List all entries |
96
- | `dockit build <entry>` | Build documentation for an entry |
166
+ | `dockit build <entry>` | Build/rebuild documentation for an entry |
97
167
  | `dockit status <entry>` | Check build status |
98
- | `dockit get <entry> <path>` | Fetch full document content |
99
- | `dockit graph query <entry> <query>` | Search knowledge graph nodes by name, file, or type |
100
- | `dockit graph path <entry> <from> <to>` | Find shortest dependency path between two nodes |
101
- | `dockit graph gods <entry>` | List most connected (god) nodes |
102
- | `dockit graph explain <entry> <node>` | Show node details and connections |
103
- | `dockit init --path <dir> [--code-path <subdir>]` | Initialize a project as a dockit source |
104
- | `dockit dev` | Start dev servers (web UI) |
105
- | `dockit serve` | Start production server |
106
- | `dockit mcp` | Start MCP server |
168
+ | `dockit get <entry> <path>` | Fetch full document by path |
169
+ | `dockit graph query <entry> <query>` | Search knowledge graph nodes |
170
+ | `dockit graph path <entry> <from> <to>` | Find shortest dependency path |
171
+ | `dockit graph gods <entry>` | List most-connected nodes |
172
+ | `dockit graph explain <entry> <node>` | Show node details + connections |
173
+ | `dockit dev` | Start dev servers (Web UI on :5173 + API on :3001) |
174
+ | `dockit serve [--port <p>]` | Start production REST server |
175
+ | `dockit mcp` | Start MCP server for AI agents |
107
176
 
108
- ### Search Workflow
177
+ ### Flags
178
+
179
+ | Flag | Applies to | Description |
180
+ |------|-----------|-------------|
181
+ | `--json` | search, list, status, graph | Output as JSON |
182
+ | `--limit <n>` | search, graph query, graph gods | Max results |
183
+ | `--get-top [N]` | search | Fetch full content for top N (default 3) |
184
+ | `--name <n>` | init | Entry display name |
185
+ | `--version <v>` | init | Entry version string |
186
+ | `--code-path <p>` | init | Subdirectory for source code scanning |
187
+ | `--port <p>` | serve, mcp --http | Custom port |
109
188
 
110
- **Step 1: Global search** — discover which entries are relevant
189
+ ### Search workflow
111
190
 
112
191
  ```bash
192
+ # Step 1: Discover relevant entries
113
193
  dockit search "cache"
114
- # Returns top result per entry:
115
- # [React] cache
116
- # [Quarkus] caching-guide
117
- # [Quarkus Core] Cache API
194
+ # [React] cache [Quarkus] caching-guide [Quarkus Core] Cache API
195
+
196
+ # Step 2: Deep-dive into one entry with full content
197
+ dockit search quarkus "cache" --get-top 3
198
+ # → Returns plain text of top 3 matching documents
199
+
200
+ # Step 3: JSON output for scripts
201
+ dockit search react "useState" --get-top 3 --json
118
202
  ```
119
203
 
120
- **Step 2: Scoped search** — dive deeper into the chosen entry
204
+ ### Knowledge graph workflow
205
+
206
+ When you run `dockit init --code-path src` on a project, Graphify scans the source code with Tree-sitter (AST parser) and produces a dependency graph. Every file, class, function, and import becomes a node with edges tracking imports, calls, and inheritance.
207
+
208
+ The examples below use the **dockit source code itself** (built via `dockit init --path . --code-path apps/server/src`).
209
+
210
+ #### 1. Impact analysis — "What breaks if I change types.ts?"
121
211
 
122
212
  ```bash
123
- dockit search quarkus "cache" --get-top 3
124
- # Returns full content for top 3 Quarkus cache documents
213
+ npx @lon-ask/dockit graph gods dockit --limit 5
125
214
  ```
126
215
 
127
- ### Knowledge Graph Queries
216
+ Output:
217
+ ```
218
+ Name Degree File
219
+ ─────────── ────── ───────────────────────────────
220
+ types.ts 59 server/src/core/domain/types.ts
221
+ index.ts 54 server/src/index.ts
222
+ mcp.ts 49 server/src/mcp.ts
223
+ ```
128
224
 
129
- For `source-code` entries built with Graphify:
225
+ `types.ts` has degree 59 — it's imported by nearly every file in the codebase. Changing it means touching more than half the project. To see exactly which files are affected:
130
226
 
131
227
  ```bash
132
- # Search nodes
133
- dockit graph query dockit-code "BuildUseCase"
228
+ npx @lon-ask/dockit graph explain dockit "types.ts"
229
+ ```
134
230
 
135
- # List most connected nodes
136
- dockit graph gods dockit-code --limit 5
231
+ This reveals all 59 connections — every use case, repository, search engine, route handler, and source processor that depends on domain types.
137
232
 
138
- # Find shortest path between two nodes
139
- dockit graph path dockit-code "BuildUseCase" "SourceCodeSourceProcessor"
233
+ #### 2. Architecture discovery: "How does the build pipeline work?"
140
234
 
141
- # Show node details and connections
142
- dockit graph explain dockit-code "BuildUseCase"
235
+ ```bash
236
+ npx @lon-ask/dockit graph query dockit "Build"
143
237
  ```
144
238
 
145
- ### Init a Project
239
+ Finds `BuildUseCase.ts`, `BuildResult`, `.build()` method, constructor — every node related to builds.
146
240
 
147
241
  ```bash
148
- # From project root (auto-detects name)
149
- dockit init --code-path apps
242
+ npx @lon-ask/dockit graph explain dockit "BuildUseCase.ts"
243
+ ```
150
244
 
151
- # With explicit path and version
152
- dockit init --path /path/to/project --name "MyApp" --version "2.0" --code-path src
245
+ Shows all 24 connections: the 7 port interfaces it depends on (`ISearchEngine`, `ISourceProcessor`, `IEntryRepository`, etc.), the 4 repositories it calls, and the search engines it updates. An LLM can understand the full build architecture in one command instead of reading through hundreds of lines of imports.
153
246
 
154
- # This creates:
155
- # - source-code source (graphify on --code-path)
156
- # - github-markdown source (scans all .md files)
157
- # - Builds both immediately
247
+ #### 3. Architecture validation — "Does UI code ever import server code?"
248
+
249
+ ```bash
250
+ npx @lon-ask/dockit graph path dockit "EntryDetail.tsx" "BuildUseCase.ts"
251
+ # → No path found
158
252
  ```
159
253
 
160
- ### Flags
254
+ The graph confirms clean separation: client React components never import server core modules directly. The only bridge is the API client layer:
161
255
 
162
- | Flag | Description |
163
- |------|-------------|
164
- | `--json` | Output as JSON (for search, list, status, graph) |
165
- | `--limit <n>` | Max results (default 20) |
166
- | `--get-top [N]` | Fetch full content for top N results (default 3) |
167
- | `--port <port>` | Custom port (for serve, mcp --http) |
256
+ ```bash
257
+ npx @lon-ask/dockit graph query dockit "client.ts"
258
+ # Finds the HTTP API client, the sole communication channel between UI and server
259
+ ```
168
260
 
169
- ### Examples
261
+ #### 4. Finding all code that touches a feature — "Where is source-code processing handled?"
170
262
 
171
263
  ```bash
172
- # Global search: see which entries match
173
- dockit search "hooks"
264
+ npx @lon-ask/dockit graph query dockit "SourceCodeSourceProcessor"
265
+ # Shows the processor class plus everything that references it
266
+
267
+ npx @lon-ask/dockit graph path dockit "SourceCodeSourceProcessor" "graph.json"
268
+ # → Traces the full path from processor to graph output file
269
+ ```
174
270
 
175
- # Scoped search with full content
176
- dockit search react "how to create a hook" --get-top
271
+ #### 5. Entry points and god classes — "What are the most critical modules?"
177
272
 
178
- # JSON output for scripts/agents
179
- dockit search react "useState" --get-top 3 --json
273
+ ```bash
274
+ npx @lon-ask/dockit graph gods dockit --limit 10 --json
275
+ ```
180
276
 
181
- # Build documentation
182
- dockit build quarkus
183
- dockit status quarkus
277
+ Returns nodes ranked by degree (total connections). The top nodes are the ones to be most careful with because they are the architectural keystones: `types.ts` (59), `index.ts` (54), `mcp.ts` (49), `BuildUseCase.ts` (24), `configLoader.ts` (22), and `entries.ts` (18).
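
Ranking by degree amounts to counting how many edges touch each node. The sketch below shows that idea on a plain edge list; the `Edge` shape and the function name `rankByDegree` are hypothetical, and the actual format of `graph.json` is not documented in this README.

```typescript
// Hypothetical edge shape: one record per import, call, or inheritance link.
interface Edge {
  source: string;
  target: string;
}

// Count every edge once for each endpoint, then sort descending by that count.
function rankByDegree(edges: Edge[], limit: number): Array<{ name: string; degree: number }> {
  const degree = new Map<string, number>();
  for (const { source, target } of edges) {
    degree.set(source, (degree.get(source) ?? 0) + 1);
    degree.set(target, (degree.get(target) ?? 0) + 1);
  }
  return Array.from(degree, ([name, d]) => ({ name, degree: d }))
    .sort((a, b) => b.degree - a.degree || a.name.localeCompare(b.name)) // ties broken by name
    .slice(0, limit);
}
```

Counting both endpoints means a heavily imported file and a file that imports many others both rank highly, which matches the "total connections" wording.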
184
278
 
185
- # Fetch a specific document
186
- dockit get react react-docs-markdown/reference/react/hooks.html
279
+ #### 6. Cross-boundary tracing — "How does the MCP server reach the database?"
187
280
 
188
- # Graph queries
189
- dockit graph query dockit-code "SourceCodeSourceProcessor"
190
- dockit graph gods dockit-code
281
+ ```bash
282
+ npx @lon-ask/dockit graph path dockit "mcp.ts" "connection.ts"
191
283
  ```
192
284
 
193
- ## MCP Server (Optional)
285
+ Traces: `mcp.ts` → `getDb()` → imports from `connection.ts`. Shows exactly which function calls form the chain.
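
A shortest-path query of this kind is typically a breadth-first search over the dependency graph. The following is a minimal sketch, assuming a directed adjacency list; the `Graph` type and `shortestPath` function are hypothetical and do not describe how Graphify or `dockit graph path` is implemented.

```typescript
// Hypothetical adjacency list: each node maps to the nodes it points at.
type Graph = Record<string, string[]>;

// Breadth-first search; returns the node chain, or null when no path exists.
function shortestPath(graph: Graph, from: string, to: string): string[] | null {
  const previous = new Map<string, string | null>([[from, null]]);
  const queue: string[] = [from];

  for (let head = 0; head < queue.length; head++) {
    const node = queue[head];
    if (node === to) {
      // Walk the predecessor links back to the start to rebuild the path.
      const path: string[] = [];
      for (let cur: string | null = node; cur !== null; cur = previous.get(cur) ?? null) {
        path.unshift(cur);
      }
      return path;
    }
    for (const next of graph[node] ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // corresponds to the CLI's "No path found" case
}
```

With a graph containing `mcp.ts → getDb() → connection.ts`, the function returns those three nodes in order; for two nodes in disconnected parts of the graph, such as a client component and a server use case, it returns `null`.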
194
286
 
195
- Dockit exposes an MCP (Model Context Protocol) server for AI tools like Claude Desktop, Cline, and OpenCode.
287
+ #### Why this matters for LLMs
196
288
 
197
- ### OpenCode
289
+ Traditional code search (grep) finds strings but not structure. Graphify's AST-based graph lets an LLM:
198
290
 
199
- ```json
200
- // ~/.config/opencode/opencode.json
201
- {
202
- "$schema": "https://opencode.ai/config.json",
203
- "mcp": {
204
- "dockit": {
205
- "type": "local",
206
- "command": ["bash", "/path/to/dockit/scripts/mcp-wrapper.sh"],
207
- "enabled": true
208
- }
209
- }
210
- }
291
+ | Question | Grep approach | Graph approach |
292
+ |----------|--------------|----------------|
293
+ | "What impacts does changing types.ts have?" | Search 59 files manually | `graph explain` shows all 59 connections instantly |
294
+ | "Does the UI import server code?" | Read every import line | `graph path` confirms no path exists |
295
+ | "What's the entry point of the build system?" | Guess based on naming conventions | `graph gods` ranks by degree — `BuildUseCase.ts` at #4 |
296
+ | "How does data flow from MCP to SQLite?" | Trace imports across 12 files | `graph path` shows exact chain in 1 command |
297
+
298
+ ### Real LLM use case: adding a new source type to dockit
299
+
300
+ Here's how an LLM uses graph search to understand the codebase before implementing a feature — end to end, step by step.
301
+
302
+ **Task**: "Add support for a new documentation source type called 'wiki'."
303
+
304
+ **Step 1: Find existing source type references** — understand the pattern:
305
+ ```bash
306
+ npx @lon-ask/dockit graph query dockit "SourceCodeSourceProcessor"
211
307
  ```
308
+ Returns `SourceCodeSourceProcessor.ts` — this is the template to follow for a new source processor.
212
309
 
213
- ### Claude Desktop / Cline
310
+ **Step 2: Trace the dependency chain** — where does the processor fit?
311
+ ```bash
312
+ npx @lon-ask/dockit graph explain dockit "SourceCodeSourceProcessor.ts"
313
+ ```
314
+ Shows the processor is used by: `BuildUseCase.ts`, `mcp.ts`, `index.ts`. The LLM now knows to update these 3 files when adding a new processor.
214
315
 
215
- ```json
216
- // ~/.claude/claude_desktop_config.json
217
- {
218
- "mcpServers": {
219
- "dockit": {
220
- "command": "bash",
221
- "args": ["/path/to/dockit/scripts/mcp-wrapper.sh"]
222
- }
223
- }
224
- }
316
+ **Step 3: Find the registration point** — where are processors registered?
317
+ ```bash
318
+ npx @lon-ask/dockit graph gods dockit
225
319
  ```
320
+ `types.ts` is the top god node (degree 59). The LLM knows to check here next.
226
321
 
227
- ### HTTP Transport
322
+ ```bash
323
+ npx @lon-ask/dockit graph query dockit "types"
324
+ ```
325
+ Returns `types.ts` in `core/domain/` — this is where `SourceType` is defined. The LLM finds the union type that needs a new `'wiki'` variant.
228
326
 
327
+ **Step 4: Trace the full modification path** — from entry point to database:
229
328
  ```bash
230
- # Start HTTP bridge on port 3456
231
- DOCKIT_MCP_HTTP_PORT=3456 ./scripts/mcp-wrapper.sh
329
+ npx @lon-ask/dockit graph path dockit "mcp.ts" "connection.ts"
330
+ ```
331
+ Shows: `mcp.ts` → `getDb()` → `connection.ts`. The LLM now knows how the system starts up and where the DB gets initialized.
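Conceptually, a path query like this can be answered with a breadth-first search over the dependency graph. The following is a minimal sketch under that assumption; `Graph`, `shortestPath`, and the sample adjacency list are hypothetical and only illustrate the idea, not how `graph path` is actually implemented.

```typescript
// Hypothetical adjacency list: each node maps to the nodes it depends on.
type Graph = Record<string, string[]>;

// Breadth-first search returns the shortest chain from `from` to `to`,
// or null when no path exists.
function shortestPath(graph: Graph, from: string, to: string): string[] | null {
  const previous: Record<string, string | null> = Object.create(null);
  previous[from] = null;
  const queue: string[] = [from];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (current === to) {
      // Walk the predecessor links back to the start node.
      const path: string[] = [];
      let step: string | null = current;
      while (step !== null) {
        path.unshift(step);
        step = previous[step] ?? null;
      }
      return path;
    }
    for (const next of graph[current] || []) {
      if (!(next in previous)) {
        previous[next] = current;
        queue.push(next);
      }
    }
  }
  return null;
}

// Sample data modeled on the chain described above (illustrative only).
const sampleGraph: Graph = {
  "mcp.ts": ["getDb()"],
  "getDb()": ["connection.ts"],
  "connection.ts": [],
  "App.tsx": ["client.ts"],
};
const chain = shortestPath(sampleGraph, "mcp.ts", "connection.ts");
const uiToServer = shortestPath(sampleGraph, "App.tsx", "connection.ts");
```

A `null` result corresponds to the "no path exists" answer that the comparison table uses to confirm the UI does not import server code.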
232
332
 
233
- # Then curl:
234
- curl -X POST http://localhost:3456 \
235
- -H "Content-Type: application/json" \
236
- -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
333
+ **Step 5: Verify no duplicates** — is "wiki" already handled?
334
+ ```bash
335
+ npx @lon-ask/dockit graph query dockit "wiki"
237
336
  ```
337
+ Returns empty — confirmed no existing wiki handling. Safe to proceed.
238
338
 
239
- ### MCP Tools
339
+ **Step 6: Before coding, understand the build pipeline**:
340
+ ```bash
341
+ npx @lon-ask/dockit graph explain dockit "BuildUseCase.ts"
342
+ ```
343
+ Shows 24 connections — the LLM now understands which interfaces (`ISourceProcessor`) and repositories get called during a build. It knows exactly which files to read and which interfaces to implement.
240
344
 
241
- | Tool | Description |
242
- |------|-------------|
243
- | `dockit_list_entries` | List all configured entries |
244
- | `dockit_find_entry` | Find entries by name/description |
245
- | `dockit_search` | Search within a specific entry |
246
- | `dockit_global_search` | Search across all entries |
247
- | `dockit_get_doc` | Fetch full document content |
248
- | `dockit_build` / `dockit_build_status` | Build / check status |
249
- | `dockit_graph_query` | Search knowledge graph (MCP only) |
250
- | `dockit_graph_path` | Find shortest path between nodes (MCP only) |
251
- | `dockit_graph_explain` | Node details with edges (MCP only) |
252
- | `dockit_graph_gods` | Most connected nodes (MCP only) |
345
+ **Result**: The LLM has a complete mental model before writing a single line of code:
346
+ - Files to modify: `types.ts` (add type), `SourceCodeSourceProcessor.ts` (use as template), `BuildUseCase.ts`, `mcp.ts`, `index.ts` (register)
347
+ - Interfaces to implement: `ISourceProcessor`
348
+ - Pattern to follow: `SourceCodeSourceProcessor.ts`
349
+ - Build pipeline behavior: understands how processors get invoked
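
To make the result concrete, here is a rough sketch of what the change could look like. Everything in it is an assumption for illustration: the README does not show the real `SourceType` union or the `ISourceProcessor` signature, so the members, method names, and the `pickProcessor` registry below are hypothetical.

```typescript
// Hypothetical shapes; the real definitions live in dockit's core/domain and core/ports.
type SourceType = "source-code" | "github-markdown" | "asciidoc" | "wiki";

interface SourceConfig {
  type: SourceType;
  label: string;
  localPath?: string;
}

interface ISourceProcessor {
  supports(type: SourceType): boolean;
  process(source: SourceConfig): string[];
}

// New processor modeled on the "use an existing processor as template" step.
class WikiSourceProcessor implements ISourceProcessor {
  supports(type: SourceType): boolean {
    return type === "wiki";
  }

  process(source: SourceConfig): string[] {
    // Placeholder: a real processor would fetch and convert wiki pages.
    return ["processed " + source.label];
  }
}

// Registration point: the build step picks the first processor that supports the type.
const processors: ISourceProcessor[] = [new WikiSourceProcessor()];

function pickProcessor(type: SourceType): ISourceProcessor | null {
  for (const candidate of processors) {
    if (candidate.supports(type)) return candidate;
  }
  return null;
}
```

Adding `"wiki"` to the union first means the compiler flags any exhaustive `switch` that has not handled the new variant yet.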
253
350
 
254
- ## How LLMs Use Dockit
351
+ This turns what would be 20+ minutes of grepping and reading imports into 7 graph commands across 6 steps, executed in under 30 seconds.
255
352
 
256
- Dockit includes `SKILL.md` — a skill file that instructs LLMs how to use Dockit effectively. When an LLM has access to the `dockit` CLI or MCP tools, it follows this workflow:
353
+ ### What Graphify supports
257
354
 
258
- 1. **`dockit list`** / **`dockit_list_entries`** — discover available documentation
259
- 2. **`dockit search "query"`** — global search to find relevant entries
260
- 3. **`dockit search <entry> "query" --get-top`**: scoped search with full content
261
- 4. **For source-code entries**: use `dockit graph query <entry> "node"` for structural queries
262
- 5. **Answer the user's question** using the retrieved documentation as context
355
+ | Language | Status |
356
+ |----------|--------|
357
+ | TypeScript / JavaScript | ✅ Full (imports, calls, classes, functions) |
358
+ | Python | ✅ Full |
359
+ | Java | ✅ Full |
360
+ | Go | ✅ Full |
361
+ | Rust | ✅ Full |
362
+ | C / C++ | ✅ Full |
363
+ | Ruby, PHP, C#, Swift, Kotlin, Scala, Lua, Elixir | ✅ AST parsing (import resolution varies) |
263
364
 
264
- The LLM strips conversational filler from queries, scopes searches to the right entry, and prefers Dockit documentation over training data.
365
+ ### Enabling graphs on doc sources
265
366
 
266
- ## Supported Documentation Sources
367
+ Add `graphifyEnabled: true` to any doc source (AsciiDoc, Markdown, Antora) that lives in a repo with source code:
267
368
 
268
- | Type | Description | Remote Fields | Local/Offline Fields |
269
- |------|-------------|---------------|---------------------|
270
- | **ZIP Bundle** | Download or extract a ZIP of HTML documentation | `url` | `localPath` — path to pre-downloaded .zip |
271
- | **Maven Artifact** | Download a documentation JAR (javadoc) from Maven Central | *(none extra)* | `useMavenCommand: true` — uses local Maven + settings.xml; `localJar` — path to pre-downloaded .jar |
272
- | **Antora** | Build a multi-page HTML site with Antora | `repoUrl` | `localPath` — path to pre-cloned repo |
273
- | **AsciiDoc** | Convert `.adoc` files to HTML | `repoUrl`, `sourcePath` (optional) | `localPath` — path to pre-cloned repo |
274
- | **GitHub Markdown** | Clone a GitHub repo and convert `.md` files to HTML | `repoUrl`, `sourcePath` (optional), `branch` (optional) | `localPath` — path to pre-cloned repo |
275
- | **Source Code** | Build a knowledge graph via Graphify (Tree-sitter AST) | `repoUrl`, `sourcePath` (optional), `branch` (optional) | `localPath` — path to local repo |
276
- | **Combined** | Add `graphifyEnabled: true` on doc sources (AsciiDoc, Markdown, Antora) to also generate a graph | *(inherits from parent type)* | *(inherits from parent type)* |
369
+ ```yaml
370
+ sources:
371
+ - type: github-markdown
372
+ label: "API Docs"
373
+ repoUrl: "https://github.com/myorg/myrepo.git"
374
+ sourcePath: "docs" # where .md files live
375
+ graphifyEnabled: true
376
+ graphifySourcePath: "src" # where source code lives
377
+ ```
277
378
 
278
- ### Source code knowledge graphs
379
+ This generates a graph alongside the document index during build. The search engine then boosts results that match graph node names (e.g., searching "BuildUseCase" ranks it higher because it's a known node).
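
As a rough illustration of that boost, the sketch below multiplies the score of any hit whose title or path contains a graph node name that also appears in the query. The `SearchHit` shape, the `boostByGraphNodes` helper, and the `GRAPH_BOOST` factor are assumptions made for this example; the README does not state the real boost value or matching rule.

```typescript
interface SearchHit {
  path: string;
  title: string;
  score: number;
}

// Assumed boost factor; the README does not state the real value.
const GRAPH_BOOST = 1.5;

function boostByGraphNodes(hits: SearchHit[], query: string, nodeNames: string[]): SearchHit[] {
  const loweredQuery = query.toLowerCase();
  // Keep only node names that actually occur in the query.
  const matched = nodeNames
    .map((nodeName) => nodeName.toLowerCase())
    .filter((nodeName) => nodeName.length > 0 && loweredQuery.indexOf(nodeName) !== -1);

  return hits
    .map((hit) => {
      const haystack = (hit.title + " " + hit.path).toLowerCase();
      const isNodeHit = matched.some((nodeName) => haystack.indexOf(nodeName) !== -1);
      return {
        path: hit.path,
        title: hit.title,
        score: isNodeHit ? hit.score * GRAPH_BOOST : hit.score,
      };
    })
    .sort((a, b) => b.score - a.score);
}

const boosted = boostByGraphNodes(
  [
    { path: "docs/build.md", title: "Build overview", score: 0.6 },
    { path: "src/BuildUseCase.ts", title: "BuildUseCase", score: 0.5 },
  ],
  "BuildUseCase",
  ["BuildUseCase", "types"]
);
```

Here the `BuildUseCase` hit overtakes the generic build page even though its raw score was lower, which is the ranking effect described above.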
279
380
 
280
- The `source-code` source type runs [Graphify](https://github.com/anomalyco/graphify) (Tree-sitter AST parser) on the source directory, producing a `graph.json` with nodes (classes, functions, files) and edges (imports, calls, inherits). Supports 15+ languages including TypeScript, JavaScript, Python, Java, Go, Rust, and C++.
381
+ ---
281
382
 
282
- **Configuration fields:**
383
+ ## Supported Documentation Sources
384
+
385
+ | Type | What it indexes | Remote | Local |
386
+ |------|----------------|--------|-------|
387
+ | **GitHub Markdown** | All `.md` files in a repo | `repoUrl`, `sourcePath`, `branch` | `localPath` |
388
+ | **AsciiDoc** | `.adoc` files via Asciidoctor | `repoUrl`, `sourcePath` | `localPath`, `zipPath` |
389
+ | **Antora** | Multi-page Antora documentation sites | `repoUrl` | `localPath`, `zipPath` |
390
+ | **ZIP Bundle** | Pre-built HTML in a ZIP archive | `url` | `localPath` |
391
+ | **Maven Javadoc** | Javadoc JAR from Maven Central | — | `localJar`, `useMavenCommand` |
392
+ | **Source Code** | Knowledge graph via Graphify Tree-sitter AST | `repoUrl`, `sourcePath`, `branch` | `localPath` |
283
393
 
284
- | Field | Description |
285
- |-------|-------------|
286
- | `repoUrl` / `localPath` / `zipPath` | Source acquisition (same pattern as other types) |
287
- | `sourcePath` | Subdirectory to scan for code files |
288
- | `graphifySourcePath` | Separate subdirectory for graphify (when different from sourcePath e.g. docs vs code) |
394
+ ---
395
+
396
+ ## Search Engine
289
397
 
290
- For existing doc sources (AsciiDoc, GitHub Markdown, Antora) that point to a repo containing source code, toggle `graphifyEnabled: true` and set `graphifySourcePath` to scan the code during build:
398
+ Dockit ships two engines, toggled via `dockit.yaml`:
291
399
 
292
400
  ```yaml
293
- sources:
294
- - type: asciidoc
295
- label: "Docs"
296
- repoUrl: "https://github.com/myorg/myrepo.git"
297
- sourcePath: "docs" # doc files
298
- graphifyEnabled: true
299
- graphifySourcePath: "src" # code files for graph
401
+ search:
402
+ engine: vector # 'vector' (default) | 'json' (TF-IDF)
300
403
  ```
301
404
 
302
- ## Offline / Proxy Mode
405
+ | | JSON (TF-IDF) | Vector (Hybrid) |
406
+ |---|---|---|
407
+ | **Storage** | ~300 KB per entry | ~32 MB per entry |
408
+ | **Memory** | Minimal | ~200 MB |
409
+ | **Build speed** | Fast | Slower (embeds all documents) |
410
+ | **Keyword match** | Exact term frequency | BM25 FTS (very high precision) |
411
+ | **Semantic match** | None | Yes (cosine ANN via all-MiniLM-L6-v2) |
412
+ | **Model** | None | 88 MB ONNX, bundled in package |
413
+ | **Offline** | Yes | Yes |
414
+
415
+ ### How hybrid search works
416
+
417
+ ```
418
+ query → [vector cosine ANN] + [BM25 full-text search] in parallel
419
+ → deduplicate per document path
420
+ → Reciprocal Rank Fusion combining both
421
+ → dynamic FTS weighting: 2x for confident matches, 0.7x for uncertain
422
+ → title match bonus: 1.5x when query terms appear in headings
423
+ ```
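
The pipeline above can be sketched in a few lines. This is a simplified illustration, not dockit's implementation: the `RRF_K` constant of 60 is a commonly used default rather than a value taken from the package, and how a match is judged "confident" is not described in the README, so it is passed in as a boolean. The `Ranked` shape and the `fuse` helper are hypothetical.

```typescript
interface Ranked {
  path: string;
  title: string;
}

// Commonly used RRF constant; assumed, not taken from dockit.
const RRF_K = 60;

// Keep only the first (best-ranked) occurrence of each document path.
function dedupeByPath(list: Ranked[]): Ranked[] {
  const seen: Record<string, boolean> = Object.create(null);
  return list.filter((item) => {
    if (seen[item.path]) return false;
    seen[item.path] = true;
    return true;
  });
}

function fuse(
  vectorHits: Ranked[],
  ftsHits: Ranked[],
  query: string,
  ftsConfident: boolean
): { path: string; score: number }[] {
  const ftsWeight = ftsConfident ? 2 : 0.7; // dynamic FTS weighting
  const terms = query.toLowerCase().split(/\s+/).filter((term) => term.length > 0);
  const scores: Record<string, number> = Object.create(null);
  const titles: Record<string, string> = Object.create(null);

  const add = (list: Ranked[], weight: number): void => {
    dedupeByPath(list).forEach((item, index) => {
      // RRF contribution: weight / (k + rank), with rank starting at 1.
      scores[item.path] = (scores[item.path] || 0) + weight / (RRF_K + index + 1);
      titles[item.path] = item.title;
    });
  };
  add(vectorHits, 1);
  add(ftsHits, ftsWeight);

  return Object.keys(scores)
    .map((path) => {
      const title = (titles[path] || "").toLowerCase();
      const titleMatch = terms.some((term) => title.indexOf(term) !== -1);
      // Title match bonus: 1.5x when a query term appears in the title.
      return { path: path, score: (scores[path] || 0) * (titleMatch ? 1.5 : 1) };
    })
    .sort((a, b) => b.score - a.score);
}

const fused = fuse(
  [
    { path: "a.md", title: "Caching guide" },
    { path: "b.md", title: "Reactive routes" },
  ],
  [
    { path: "b.md", title: "Reactive routes" },
    { path: "b.md", title: "Reactive routes" },
    { path: "c.md", title: "Cache config" },
  ],
  "cache",
  true
);
```

Documents found by both retrievers accumulate two contributions, which is why `b.md` ends up ahead of documents that only one retriever returned.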
303
424
 
304
- For environments behind corporate proxies or without internet access, Dockit supports multiple fallback mechanisms:
425
+ ---
305
426
 
306
- ### Source Repositories (local clones)
427
+ ## Config File (`dockit.yaml`)
307
428
 
308
- Each source type supports local paths that take precedence over remote URLs:
429
+ After running `dockit init`, your config is at `~/.dockit/dockit.yaml`:
309
430
 
310
431
  ```yaml
311
- # dockit.yaml — local mode entries
312
432
  entries:
313
- - id: quarkus-local
314
- name: Quarkus (Local)
315
- version: "3.35"
433
+ - id: my-project
434
+ name: My Project
435
+ version: "1.0"
436
+ description: My project source code and documentation
316
437
  sources:
317
- - type: asciidoc
318
- label: "Quarkus Docs"
319
- localPath: "/home/user/repos/quarkus"
320
- sourcePath: "docs/src/main/asciidoc"
438
+ - type: source-code
439
+ label: "my-project Code"
440
+ localPath: /home/user/projects/my-project
441
+ sourcePath: src
442
+ - type: github-markdown
443
+ label: "my-project Markdown"
444
+ localPath: /home/user/projects/my-project
445
+
446
+ search:
447
+ engine: vector
321
448
  ```
322
449
 
323
- ### Embedding Model (air-gapped vector search)
450
+ Config resolution order:
451
+ 1. `~/.dockit/dockit.yaml` — user home (created by `dockit init`)
452
+ 2. `./dockit.yaml` — project root (development/backward compatibility)
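
A minimal sketch of that lookup order follows. The `resolveConfigPath` helper is hypothetical and takes the home directory, project root, and an existence check as parameters so it stays independent of any particular runtime; it is not dockit's actual loader.

```typescript
// Returns the first existing config path, checking the user home before the project root.
function resolveConfigPath(
  homeDir: string,
  projectRoot: string,
  exists: (path: string) => boolean
): string | null {
  const candidates = [
    homeDir + "/.dockit/dockit.yaml", // 1. user home (created by `dockit init`)
    projectRoot + "/dockit.yaml", // 2. project root (development fallback)
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return null;
}

// Example with a fake file system where only the project-root config exists.
const onlyProjectConfig = (path: string): boolean => path === "/work/app/dockit.yaml";
const resolved = resolveConfigPath("/home/user", "/work/app", onlyProjectConfig);
```

In Node.js the three arguments would typically be `os.homedir()`, `process.cwd()`, and `fs.existsSync`.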
324
453
 
325
- The embedding model downloads on first `embed()` call by default into `~/.dockit/models/`. Override with `DOCKIT_DATA_DIR` or `configure({ cacheDir: '...' })`. For air-gapped environments:
454
+ ---
326
455
 
327
- **Option A — Pre-seed on connected machine, then copy:**
328
- ```bash
329
- # On connected machine
330
- npm run download-model -w packages/embeddings
456
+ ## LLM Integration
457
+
458
+ Dockit is designed to be an **on-demand knowledge source for AI coding agents**. Instead of relying on stale training data or hallucinated API references, LLMs can query dockit at runtime for up-to-date, project-specific documentation and source code structure.
331
459
 
332
- # Copy ~/.dockit/models/ to the target machine
460
+ ### How it works
461
+
462
+ Dockit ships with a **skill file** (`SKILL.md`) that teaches LLMs how to use the tool. When an LLM coding agent has access to dockit (via CLI, MCP, or shell commands), it follows this workflow:
463
+
464
+ ```
465
+ User question → dockit search "query" → discover relevant entries
466
+ → dockit search <entry> "query" --get-top → retrieve full docs
467
+ → dockit graph query <entry> "node" → trace code structure
468
+ → Answer user with retrieved content as context
333
469
  ```
334
470
 
335
- **Option B (install offline via npm):**
336
- The model ONNX bundle ships inside the `@dockit/embeddings` npm package under `packages/embeddings/model/`. If `~/.dockit/models/` is empty at first `embed()` call, it will attempt to download — set `DOCKIT_DATA_DIR` to point to a pre-seeded directory or use `configure({ cacheDir: '/path/to/model' })`.
471
+ The skill file instructs the LLM to:
472
+ - Strip conversational filler from queries (keep only technical terms)
473
+ - Always scope searches to the right entry once identified
474
+ - Prefer dockit documentation over training data
475
+ - Use knowledge graph queries for source-code entries
476
+ - Show attribution (source type, repo, version) with answers
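
The first rule, stripping conversational filler, can be approximated with a small stop-word filter. The sketch below is only an illustration: the `FILLER` list and the `refineQuery` helper are assumptions, and the actual rules in `SKILL.md` are written as instructions for the LLM rather than as code.

```typescript
// Illustrative filler list; SKILL.md's actual rules are not reproduced here.
const FILLER = [
  "how", "do", "i", "can", "you", "please", "the", "a", "an",
  "in", "to", "what", "is", "me", "tell", "about",
];

// Drop punctuation and filler words, keeping technical terms in their original case.
function refineQuery(question: string): string {
  return question
    .replace(/[^A-Za-z0-9\s.#_-]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0 && FILLER.indexOf(word.toLowerCase()) === -1)
    .join(" ");
}

const refined = refineQuery("How do I configure cache in Quarkus?");
```

Case is preserved on purpose so that identifiers such as `useState` survive the cleanup unchanged.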
337
477
 
338
- ### Pre-built Index Bundling
478
+ ### OpenCode
479
+
480
+ OpenCode supports multiple integration modes:
339
481
 
340
- For environments where even building is impractical, LanceDB and JSON indexes can be pre-built and bundled:
482
+ **Skill mode** (recommended): OpenCode reads `SKILL.md` automatically from the skill registry:
483
+
484
+ ```bash
485
+ # When dockit is configured as a skill in ~/.config/opencode/skills/dockit/
486
+ # OpenCode loads SKILL.md instructions and invokes dockit CLI commands directly
487
+ opencode> "How do I configure cache in Quarkus?"
488
+ # OpenCode runs: dockit search quarkus "configure cache" --get-top
489
+ ```
341
490
 
342
- 1. Build indexes on a connected machine:
343
- ```bash
344
- dockit build quarkus
345
- dockit build spring-boot
346
- # ... all desired entries
347
- ```
348
- 2. Package the `~/.dockit/` directory (or specific `.lancedb/` + `index.json` files)
349
- 3. Deploy to target machines via the same `~/.dockit/` path
491
+ **MCP mode**: dockit runs as an MCP server for structured tool calls:
350
492
 
351
- ### Proxy Configuration
493
+ ```json
494
+ {
495
+ "$schema": "https://opencode.ai/config.json",
496
+ "mcp": {
497
+ "dockit": {
498
+ "type": "local",
499
+ "command": ["npx", "@lon-ask/dockit", "mcp"],
500
+ "enabled": true
501
+ }
502
+ }
503
+ }
504
+ ```
352
505
 
353
- Standard proxy environment variables:
506
+ **CLI mode**: dockit commands are shell commands the agent can execute:
354
507
 
355
508
  ```bash
356
- export HTTP_PROXY=http://proxy.corp:8080
357
- export HTTPS_PROXY=http://proxy.corp:8080
509
+ dockit search react "useState" --get-top 3 --json
358
510
  ```
359
511
 
360
- Or override the HuggingFace CDN host via code:
512
+ ### Claude Code
361
513
 
362
- ```ts
363
- import { env } from '@huggingface/transformers';
364
- env.remoteHost = 'https://internal-mirror.corp'; // point to internal mirror
514
+ Add dockit as an MCP server in Claude Code's config:
515
+
516
+ ```json
517
+ {
518
+ "mcpServers": {
519
+ "dockit": {
520
+ "command": "npx",
521
+ "args": ["@lon-ask/dockit", "mcp"]
522
+ }
523
+ }
524
+ }
365
525
  ```
366
526
 
367
- ### Native Binaries
527
+ Claude can then call `dockit_search`, `dockit_get_doc`, `dockit_graph_query`, and all other MCP tools directly. The skill instructions in `SKILL.md` guide it to use the right tool for each query type.
368
528
 
369
- `@lancedb/lancedb` ships prebuilt binaries for Linux x64/arm64, macOS x64/arm64, and Windows x64/arm64, so no compilation is needed.
529
+ **Claude Code with CLI fallback**: if MCP is unavailable, Claude can run dockit as a shell command:
370
530
 
371
- ### Additional offline fields
531
+ ```bash
532
+ npx @lon-ask/dockit search quarkus "reactive routes" --get-top 3 --json
533
+ ```
534
+
535
+ ### Cline (VS Code)
536
+
537
+ ```json
538
+ {
539
+ "mcpServers": {
540
+ "dockit": {
541
+ "command": "npx",
542
+ "args": ["@lon-ask/dockit", "mcp"]
543
+ }
544
+ }
545
+ }
546
+ ```
372
547
 
373
- | Source Type | Fields (all take precedence over remote) |
374
- |-------------|------------------------------------------|
375
- | **ZIP** | `localPath` — path to pre-downloaded .zip |
376
- | **Maven** | `useMavenCommand: true` — uses local `~/.m2/settings.xml` (proxies, mirrors); `localJar` — pre-downloaded .jar |
377
- | **Antora** | `localPath` — pre-cloned repo |
378
- | **AsciiDoc** | `localPath` — pre-cloned repo (kept, not cleaned up) |
379
- | **GitHub Markdown** | `localPath` — pre-cloned repo |
380
- | **Source Code** | `localPath` — local repo directory |
548
+ ### General LLM Integration
381
549
 
382
- ## Web UI (Optional)
550
+ Any LLM that can execute shell commands or make HTTP requests can use dockit:
383
551
 
384
- Dockit includes a web interface for managing entries, configuring sources, and browsing documentation.
552
+ **Via CLI (shell access)**:
553
+ ```bash
554
+ # Build an entry
555
+ npx @lon-ask/dockit build react
556
+ # Search with full content
557
+ npx @lon-ask/dockit search react "hooks" --get-top 3 --json
558
+ # Query knowledge graph
559
+ npx @lon-ask/dockit graph gods my-project --json
560
+ ```
385
561
 
562
+ **Via REST API** (when server is running):
386
563
  ```bash
387
- # Start dev servers
388
- dockit dev
389
- # Or: npm run dev
564
+ dockit serve --port 3001 &
565
+ curl "http://localhost:3001/api/entries/react/search?q=hooks"
566
+ curl "http://localhost:3001/api/graph/my-project/query?q=database"
567
+ ```
390
568
 
391
- # Frontend → http://localhost:5173
392
- # Backend → http://localhost:3001
569
+ **Via HTTP MCP bridge**:
570
+ ```bash
571
+ DOCKIT_MCP_HTTP_PORT=3456 npx @lon-ask/dockit mcp --http &
572
+ curl -X POST http://localhost:3456 \
573
+ -H "Content-Type: application/json" \
574
+ -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"dockit_search","arguments":{"entry":"react","query":"hooks"}}}'
393
575
  ```
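
For agents that talk to the bridge from code rather than `curl`, the same JSON-RPC payload can be built programmatically. This is a small sketch: `ToolCallRequest` and `buildToolCall` are hypothetical helper names, while the payload fields mirror the `curl` example above.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string> };
}

// Build a JSON-RPC 2.0 request that invokes one MCP tool.
function buildToolCall(id: number, tool: string, args: Record<string, string>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const toolCall = buildToolCall(1, "dockit_search", { entry: "react", query: "hooks" });
const toolCallBody = JSON.stringify(toolCall);
```

The resulting string can be sent as the body of a `POST` with `Content-Type: application/json` to the bridge port chosen via `DOCKIT_MCP_HTTP_PORT`.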
394
576
 
395
- 1. Open http://localhost:5173 in your browser
396
- 2. Click **New Entry** in the sidebar
397
- 3. Add sources (including **Source Code** type) and click **Build Now**
398
- 4. For doc sources with repos, toggle **Generate source code knowledge graph** and set the **Source Code Path**
399
- 5. Use the embedded viewer to browse, or search across indexed content
577
+ ### Skills Registry
578
+
579
+ Dockit's `SKILL.md` also works as a standalone skill file. When placed in an LLM agent's skill directory, it provides:
400
580
 
401
- ### Build Modes
581
+ 1. **Tool instructions** — which commands to use and when
582
+ 2. **Query refinement rules** — stripping filler, keeping technical terms
583
+ 3. **Workflow patterns** — discover → search → retrieve → graph
584
+ 4. **Attribution rules** — always cite source type, repo, version
402
585
 
403
- - **Build Now**: server-side processing with live log output
404
- - **Download Script** — exports a self-contained `.sh` script with all curl commands
586
+ To register dockit as a skill:
587
+ ```bash
588
+ # For OpenCode
589
+ mkdir -p ~/.config/opencode/skills/dockit && cp SKILL.md ~/.config/opencode/skills/dockit/SKILL.md
405
590
 
406
- ## Data Storage
591
+ # For other agents that support skill files, place SKILL.md in their skills directory
592
+ ```
407
593
 
408
- All runtime data is stored in `~/.dockit/` by default:
594
+ ### MCP Tools Reference
409
595
 
410
- | Path | Description |
596
+ | Tool | Description |
411
597
  |------|-------------|
412
- | `~/.dockit/dockit.db` | SQLite database (entries, sources, builds) |
413
- | `~/.dockit/dockit.yaml` | User configuration (auto-created by `dockit init`) |
414
- | `~/.dockit/.lancedb/` | Vector search index (LanceDB) |
415
- | `~/.dockit/models/` | HuggingFace ONNX embedding model cache |
416
- | `~/.dockit/{entryId}/bundle/` | Built HTML documentation per entry |
417
- | `~/.dockit/{entryId}/sources/` | Raw source processing artifacts |
418
- | `~/.dockit/{entryId}/graph.json` | Knowledge graph (source-code entries) |
598
+ | `dockit_list_entries` | List all configured entries |
599
+ | `dockit_find_entry` | Find entries by name/description |
600
+ | `dockit_search` | Search within a specific entry |
601
+ | `dockit_global_search` | Search across all entries |
602
+ | `dockit_get_doc` | Fetch full document content |
603
+ | `dockit_build` | Build documentation for an entry |
604
+ | `dockit_build_status` | Check build status |
605
+ | `dockit_graph_query` | Search knowledge graph nodes |
606
+ | `dockit_graph_path` | Find dependency path between two nodes |
607
+ | `dockit_graph_explain` | Show node details and connections |
608
+ | `dockit_graph_gods` | List most-connected (god) nodes |
609
+
610
+ ---
611
+
612
+ ## Web UI
419
613
 
420
- Override with the environment variable:
614
+ Dockit includes a React-based graphical interface for managing entries, configuring sources, and browsing documentation. It runs alongside the API server.
615
+
616
+ ### Starting the UI
421
617
 
422
618
  ```bash
423
- export DOCKIT_DATA_DIR=/custom/path # all data goes here instead of ~/.dockit/
619
+ # Development mode: starts both the API server and the Vite dev UI concurrently
620
+ npx @lon-ask/dockit dev
621
+ # API → http://localhost:3001
622
+ # UI → http://localhost:5173
623
+
624
+ # Production mode — API server only (UI not served yet)
625
+ npx @lon-ask/dockit serve --port 3001
626
+ ```
627
+
628
+ > **Note**: The first `npx` run downloads `tsx` and `vite` (if not locally cached). Subsequent runs use the cached versions and start within seconds.
629
+
630
+ ### How it works under the hood
631
+
632
+ `dockit dev` spawns two processes in parallel:
633
+ - **API server** — `npx tsx watch apps/server/src/index.ts` (Express + TypeScript, hot reload)
634
+ - **Web UI** — `npx vite apps/client` (React dev server with HMR, port 5173)
635
+
636
+ The UI proxies `/api/*` requests to the API server at `localhost:3001`. Both processes terminate on Ctrl+C.
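
A proxy like this is usually declared through Vite's `server.proxy` option. The object below is a sketch of what such a setting could look like; the contents of the package's actual `apps/client/vite.config.ts` are not shown in this README, so the exact options are an assumption.

```typescript
// Sketch of a Vite dev-server config; in vite.config.ts this object would be the default export.
const viteConfig = {
  server: {
    port: 5173,
    proxy: {
      // Forward /api/* from the UI dev server to the API server.
      "/api": {
        target: "http://localhost:3001",
        changeOrigin: true,
      },
    },
  },
};
```

With a rule like this the browser only ever talks to port 5173, so the UI code can use relative `/api/...` URLs in development.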
637
+
638
+ ### What the UI provides
639
+
640
+ | Feature | Description |
641
+ |---------|-------------|
642
+ | **Entry management** | Create, edit, and delete documentation entries via a form |
643
+ | **Source configuration** | Add/remove/reorder sources per entry — supports all 6 source types (ZIP, Maven, Antora, AsciiDoc, GitHub Markdown, Source Code) |
644
+ | **Source form** | Mode selector (Git Repo / Local Dir / ZIP File), Graphify toggle with source path field |
645
+ | **Build triggering** | One-click build with live streaming logs |
646
+ | **Download script** | Export build as a self-contained `.sh` script (for CI/reproducible builds) |
647
+ | **Document viewer** | Browse built HTML docs in the browser |
648
+ | **Entry detail** | Shows all sources, graph status badge (Network icon when graphify is enabled), build history |
649
+ | **Status badges** | Quick visual indicators for entry status (pending/building/ready/error) per source |
650
+
651
+ ### What it shows
652
+
653
+ The UI surfaces the same data as the CLI, but visually:
654
+
655
+ 1. **Sidebar** — list of all entries with status badges
656
+ 2. **Entry page** — entry metadata (name, version, description) + sources list + build controls
657
+ 3. **Source editor** — configure type, URL/path, source path, graphify toggle
658
+ 4. **Build log** — real-time output stream during builds
659
+ 5. **Graph status** — which entries have knowledge graphs built
660
+
661
+ ### Architecture
662
+
424
663
  ```
664
+ Browser (port 5173) ←→ API Server (port 3001)
665
+
666
+ SQLite DB + LanceDB + ~/.dockit/
667
+ ```
668
+
669
+ The UI communicates with the Express API via REST endpoints (`/api/entries`, `/api/sources`, `/api/build`, etc.). All data CRUD, search, and build operations are done through the same API that the CLI and MCP server use.
670
+
671
+ ---
425
672
 
426
- Configuration (`dockit.yaml`) is resolved in order:
427
- 1. `~/.dockit/dockit.yaml` (user home, persisted by `dockit init`)
428
- 2. Project root `dockit.yaml` (backward compatibility for development)
673
+ ## Offline / Air-Gapped Mode
674
+
675
+ Dockit is designed for full offline operation:
676
+
677
+ | Concern | Solution |
678
+ |---------|----------|
679
+ | **No internet** | All models bundled in npm package, LanceDB is embedded (Rust native) |
680
+ | **Corporate proxy** | Set `HTTP_PROXY`/`HTTPS_PROXY` env vars |
681
+ | **Pre-built indexes** | Build on connected machine, copy `~/.dockit/` to target |
682
+ | **Embedding model** | Ships as ONNX (~88 MB). Caches to `~/.dockit/models/` |
683
+ | **Source repos** | Clone once locally, reference via `localPath` in config |
684
+ | **Maven Javadoc** | Download JAR once, reference via `localJar` or use local Maven settings |
685
+
686
+ ---
429
687
 
430
688
  ## Architecture
431
689
 
432
690
  ```
433
- dockit/
691
+ dockit/ # npm package @lon-ask/dockit
692
+ ├── bin/
693
+ │ ├── dockit.js # CLI entry point (shebang node)
694
+ │ ├── dockit-cli.ts # Command router
695
+ │ ├── commands/ # search, build, graph, init, get, list, dev, mcp
696
+ │ └── utils.ts # Shared CLI helpers
434
697
  ├── apps/
435
- │ ├── server/ Express + TypeScript backend (port 3001)
698
+ │ ├── server/ # Express backend (port 3001)
436
699
  │ │ └── src/
437
- │ │ ├── core/domain/ Domain types & knowledge-graph types
438
- │ │ ├── core/ports/ IKnowledgeGraph, ISourceProcessor + ports
439
- │ │ ├── core/usecases/ BuildUseCase, ConfigUseCase, SearchUseCase
440
- │ │ ├── infrastructure/graph/ GraphifyKnowledgeGraph, GraphSearchDecorator
441
- │ ├── infrastructure/source-processors/ SourceCodeSourceProcessor + others
442
- │ │ └── routes/graph.ts Graph REST endpoints
443
- │ └── client/ React + Vite + Tailwind CSS frontend (port 5173)
444
- ├── bin/ CLI entry point, graph commands, init command
700
+ │ │ ├── core/ # Domain types, ports, use cases
701
+ │ │ ├── infrastructure/ # SQLite, LanceDB, Graphify, processors
702
+ │ │ ├── routes/ # REST API, graph endpoints, viewer
703
+ │ │ └── services/ # Config loader, text extractor, normalizer
704
+ │ └── client/ # React + Vite web UI (port 5173)
445
705
  ├── packages/
446
- │ └── embeddings/ @dockit/embeddings — ONNX model wrapper (@huggingface/transformers)
447
- ├── ~/.dockit/ Runtime data (created automatically on first run)
448
- ├── dockit.db SQLite database
449
- ├── dockit.yaml Entries/sources config (auto-created by `dockit init`)
450
- ├── .lancedb/ Vector search index
451
- │ ├── models/ HuggingFace ONNX embedding model cache
452
- │ ├── {entryId}/bundle/ Build outputs per entry
453
- │ ├── {entryId}/sources/ Raw source processing artifacts
454
- │ └── {entryId}/graph.json Knowledge graph (source-code entries)
455
- ├── dockit.yaml Entries/sources config
456
- ├── SKILL.md LLM skill instructions
457
- ├── GRAPHIFY_SOURCE_PLAN.md Graphify feature plan
458
- └── package.json npm workspace root
459
- ```
460
-
461
- ## API Overview
706
+ │ └── embeddings/ # @lon-ask/dockit-embeddings
707
+ │ └── model/ # all-MiniLM-L6-v2 ONNX (88 MB)
708
+ ├── scripts/
709
+ │ └── mcp-wrapper.sh # MCP server launcher
710
+ ├── dockit.yaml # Example config
711
+ └── SKILL.md # LLM agent instructions
712
+ ```
713
+
714
+ Runtime data (auto-created):
715
+ ```
716
+ ~/.dockit/
717
+ ├── dockit.db # SQLite (entries, sources, builds)
718
+ ├── dockit.yaml # Your config
719
+ ├── .lancedb/ # Vector search index
720
+ ├── models/ # Embedding model cache
721
+ └── {entryId}/
722
+ ├── bundle/ # Normalized HTML docs
723
+ ├── sources/ # Raw processing artifacts
724
+ └── graph.json # Knowledge graph
725
+ ```
726
+
727
+ ---
728
+
729
+ ## API Reference
462
730
 
463
731
  | Method | Path | Purpose |
464
732
  |--------|------|---------|
465
- | `GET` | `/api/entries` | List entries |
466
- | `POST` | `/api/entries` | Create entry |
467
- | `GET` | `/api/entries/:id` | Get entry detail + sources |
468
- | `PUT` | `/api/entries/:id` | Update entry |
469
- | `DELETE` | `/api/entries/:id` | Delete entry + all data |
470
- | `POST` | `/api/entries/:id/sources` | Add source to entry |
471
- | `PUT` | `/api/sources/:id` | Update source |
472
- | `DELETE` | `/api/sources/:id` | Remove source |
473
- | `POST` | `/api/entries/:id/build` | Trigger build |
474
- | `GET` | `/api/entries/:id/build-status` | Poll build progress |
475
- | `GET` | `/api/entries/:id/cli-script` | Download CLI script |
476
- | `GET` | `/api/graph/:entry/query?q=...` | Search knowledge graph nodes |
477
- | `GET` | `/api/graph/:entry/path?from=...&to=...` | Find path between nodes |
478
- | `GET` | `/api/graph/:entry/gods` | List most connected nodes |
479
- | `GET` | `/api/entries/:id/search?q=term` | Search built docs |
480
- | `GET` | `/api/bundle/:entryId/*` | Serve bundled HTML |
733
+ | `GET` | `/api/entries` | List entries |
734
+ | `POST` | `/api/entries` | Create entry |
735
+ | `GET` | `/api/entries/:id` | Get entry detail |
736
+ | `PUT` | `/api/entries/:id` | Update entry |
737
+ | `DELETE` | `/api/entries/:id` | Delete entry |
738
+ | `POST` | `/api/entries/:id/sources` | Add source |
739
+ | `PUT` | `/api/sources/:id` | Update source |
740
+ | `DELETE` | `/api/sources/:id` | Remove source |
741
+ | `POST` | `/api/entries/:id/build` | Trigger build |
742
+ | `GET` | `/api/entries/:id/build-status` | Poll build |
743
+ | `GET` | `/api/entries/:id/cli-script` | Download CLI script |
744
+ | `GET` | `/api/graph/:entry/query?q=...` | Graph node search |
745
+ | `GET` | `/api/graph/:entry/path?from=...&to=...` | Graph path find |
746
+ | `GET` | `/api/graph/:entry/gods` | Graph god nodes |
747
+ | `GET` | `/api/entries/:id/search?q=term` | Search docs |
748
+ | `GET` | `/api/bundle/:entryId/*` | Serve HTML |
749
+
750
+ ---
481
751
 
482
752
  ## Tech Stack
483
753
 
484
754
  | Layer | Technology |
485
755
  |-------|-----------|
486
- | Frontend | React 19, TypeScript, Vite 6, Tailwind CSS 4, React Router 7 |
487
- | Backend | Express 4, TypeScript, tsx |
756
+ | CLI | Node.js, tsx (TypeScript runtime) |
757
+ | Backend | Express 4, TypeScript |
488
758
  | Database | SQLite via better-sqlite3 |
489
- | MCP | @modelcontextprotocol/server 2.0.0-alpha.2 |
490
- | HTML Parsing | node-html-parser |
759
+ | Vector Search | LanceDB (embedded Rust) |
760
+ | Embeddings | all-MiniLM-L6-v2 ONNX via @huggingface/transformers |
761
+ | Frontend | React 19, Vite 6, Tailwind CSS 4 |
762
+ | MCP | @modelcontextprotocol/server v2 |
763
+ | HTML/MD Parse | node-html-parser, marked |
491
764
  | AsciiDoc | @asciidoctor/core |
765
+ | Antora | @antora/cli + @antora/site-generator |
492
766
  | Archives | unzipper |
493
- | Build Pipeline | Antora CLI, Git, Maven dependency plugin |
494
- | Markdown | marked |
495
- | Vector Search | LanceDB embedded (Rust native), all-MiniLM-L6-v2 via @huggingface/transformers |
496
767
  | Knowledge Graph | Graphify (Tree-sitter AST, 15+ languages) |
768
+
769
+ ---
770
+
771
+ ## Credits
772
+
773
+ Dockit was built with the assistance of the following LLMs and tools:
774
+
775
+ | Contributor | Role |
776
+ |------------|------|
777
+ | **[OpenCode](https://opencode.ai)** | Primary development agent — architecture, code generation, code review, CLI tooling, MCP server, graph features, npm publishing pipeline |
778
+ | **[DeepSeek](https://deepseek.com)** | Strategic architecture planning, feature design, documentation writing, test planning |
779
+
780
+ Special thanks to:
781
+
782
+ | Tool | Used for |
783
+ |------|---------|
784
+ | **[Graphify](https://github.com/safishamsi/graphify)** | Tree-sitter AST source code knowledge graphs |
785
+ | **[LanceDB](https://lancedb.com)** | Embedded vector search |
786
+ | **[OpenCode](https://opencode.ai)** | Interactive CLI agent framework that orchestrated the entire build pipeline |