docs-kit 0.1.4__tar.gz → 0.1.5__tar.gz

Files changed (68)
  1. {docs_kit-0.1.4 → docs_kit-0.1.5}/AGENTS.md +10 -0
  2. docs_kit-0.1.5/CHANGELOG.md +92 -0
  3. {docs_kit-0.1.4 → docs_kit-0.1.5}/CLAUDE.md +10 -0
  4. {docs_kit-0.1.4 → docs_kit-0.1.5}/PKG-INFO +7 -2
  5. {docs_kit-0.1.4 → docs_kit-0.1.5}/README.md +6 -1
  6. docs_kit-0.1.5/docs_kit/_version.py +1 -0
  7. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/cli/commands.py +29 -12
  8. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/core/chunking.py +167 -2
  9. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/templates/claude_code_skill.md +10 -2
  10. docs_kit-0.1.5/docs_kit/templates/claude_desktop_skill.md +36 -0
  11. docs_kit-0.1.5/docs_kit/templates/cursor_agents_md.md +33 -0
  12. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/templates/skill.md +10 -2
  13. docs_kit-0.1.5/tests/test_chunking.py +290 -0
  14. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_cli.py +2 -1
  15. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_install_cmd.py +49 -4
  16. docs_kit-0.1.4/CHANGELOG.md +0 -20
  17. docs_kit-0.1.4/docs_kit/_version.py +0 -1
  18. docs_kit-0.1.4/docs_kit/templates/claude_desktop_skill.md +0 -31
  19. docs_kit-0.1.4/docs_kit/templates/cursor_agents_md.md +0 -27
  20. docs_kit-0.1.4/tests/test_chunking.py +0 -41
  21. {docs_kit-0.1.4 → docs_kit-0.1.5}/.github/workflows/ci.yml +0 -0
  22. {docs_kit-0.1.4 → docs_kit-0.1.5}/.github/workflows/publish.yml +0 -0
  23. {docs_kit-0.1.4 → docs_kit-0.1.5}/.gitignore +0 -0
  24. {docs_kit-0.1.4 → docs_kit-0.1.5}/CONTRIBUTING.md +0 -0
  25. {docs_kit-0.1.4 → docs_kit-0.1.5}/LICENSE +0 -0
  26. {docs_kit-0.1.4 → docs_kit-0.1.5}/data/sample_docs/claude-code-changelog.md +0 -0
  27. {docs_kit-0.1.4 → docs_kit-0.1.5}/data/sample_docs/the-adventure-of-the-speckled-band.md +0 -0
  28. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs-kit.yaml +0 -0
  29. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/__init__.py +0 -0
  30. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/__main__.py +0 -0
  31. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/agent.py +0 -0
  32. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/cli/__init__.py +0 -0
  33. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/cli/__main__.py +0 -0
  34. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/cli/help.py +0 -0
  35. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/__init__.py +0 -0
  36. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/embeddings/__init__.py +0 -0
  37. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/embeddings/base.py +0 -0
  38. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/embeddings/fastembed.py +0 -0
  39. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/fetchers/__init__.py +0 -0
  40. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/fetchers/base.py +0 -0
  41. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/fetchers/gitbook.py +0 -0
  42. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/fetchers/llms_txt.py +0 -0
  43. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/fetchers/mintlify.py +0 -0
  44. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/parsers/__init__.py +0 -0
  45. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/parsers/base.py +0 -0
  46. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/parsers/markdown.py +0 -0
  47. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/parsers/text.py +0 -0
  48. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/vector_stores/__init__.py +0 -0
  49. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/vector_stores/base.py +0 -0
  50. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/connectors/vector_stores/qdrant.py +0 -0
  51. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/core/__init__.py +0 -0
  52. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/core/config.py +0 -0
  53. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/core/html_utils.py +0 -0
  54. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/core/models.py +0 -0
  55. {docs_kit-0.1.4 → docs_kit-0.1.5}/docs_kit/templates/__init__.py +0 -0
  56. {docs_kit-0.1.4 → docs_kit-0.1.5}/npx-wrapper/bin/docs-kit.js +0 -0
  57. {docs_kit-0.1.4 → docs_kit-0.1.5}/npx-wrapper/package.json +0 -0
  58. {docs_kit-0.1.4 → docs_kit-0.1.5}/pyproject.toml +0 -0
  59. {docs_kit-0.1.4 → docs_kit-0.1.5}/scripts/smoke_test.sh +0 -0
  60. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/__init__.py +0 -0
  61. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_agent.py +0 -0
  62. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_config.py +0 -0
  63. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_embeddings_fastembed.py +0 -0
  64. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_fetcher_gitbook.py +0 -0
  65. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_fetcher_mintlify.py +0 -0
  66. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_models.py +0 -0
  67. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_parsers.py +0 -0
  68. {docs_kit-0.1.4 → docs_kit-0.1.5}/tests/test_vector_store_qdrant.py +0 -0
@@ -37,6 +37,16 @@ pytest tests/ -v
37
37
 
38
38
  ---
39
39
 
40
+ ## Skill templates (`docs_kit/templates/`)
41
+
42
+ Skills installed by `docs-kit install` come from these Markdown templates.
43
+
44
+ - **MUST** review `docs_kit/templates/` after **major** CLI or install-flow changes (new or removed commands, renamed options, changed defaults, new `INSTALL_TARGETS`, different recommended workflows).
45
+ - **MUST** edit the relevant templates when skill text or examples would be wrong or misleading; **need not** update templates when the change does not affect what installed skills describe.
46
+ - If you are unsure whether agents would see outdated instructions, check the templates and align them with `docs_kit/cli/commands.py` and install behavior.
47
+
48
+ ---
49
+
40
50
  ## Git Commit Workflow
41
51
 
42
52
  When the user asks to commit, stage, or write a commit message:
@@ -0,0 +1,92 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
+
7
+ ## [0.1.5] - 2026-03-28
8
+
9
+ ### Added
10
+
11
+ - `docs-kit query --full` to print full chunk text; default preview truncation increased to 1500 characters (from 300).
12
+ - Markdown chunking that keeps tables and fenced code blocks intact when possible: large tables split on row boundaries with header repeated; large code blocks split on lines with opening/closing fences preserved in each part.
13
+ - Tests covering table/code chunking, list-heavy sections, and headers inside fences.
14
+
15
+ ### Changed
16
+
17
+ - `docs-kit install codex` (and `codex-app` / `codex-desktop`) writes the skill to `~/.codex/skills/docs-kit/SKILL.md` and mirrors the same content to `~/.agents/skills/docs-kit/SKILL.md` for compatibility with shared agent layouts.
18
+ - `docs-kit doctor` lists installed skills with install-target labels (`claude-code`, `cursor`, `codex`, `codex-shared`, `opencode`, `claude-desktop`) and suggests the matching `docs-kit install <target>` command when missing.
19
+
20
+ ### Fixed
21
+
22
+ - Chunking no longer merges or normalizes markdown tables and code fences in ways that broke pipe layout or list structure.
23
+
24
+ ## [0.1.4] - 2026-03-28
25
+
26
+ ### Changed (breaking)
27
+
28
+ - Removed the MCP server and MCP dependencies; agents use installed skill files that invoke the `docs-kit` CLI directly (`query`, `list`, `ingest`, `remove`, `inspect`, `doctor`, etc.).
29
+ - CLI: added `list` and `remove`; removed `serve` and `stop`.
30
+ - `docs-kit install` writes skills for Claude Code, Cursor, Codex, OpenCode, Claude Desktop, and related targets under `docs_kit/templates/`.
31
+ - `doctor` checks PATH, config, Qdrant, and installed skill paths.
32
+ - Local Qdrant access is serialized with `filelock` for concurrent CLI use.
33
+ - Python support remains `>=3.11,<3.14`.
34
+
35
+ ## [0.1.3] - 2026-03-28
36
+
37
+ ### Added
38
+
39
+ - `docs-kit serve` as a background daemon (PID + logs) and `docs-kit stop`.
40
+ - Default MCP transport: SSE on `localhost:45139` to reduce Qdrant lock contention.
41
+ - Install flows write URL entries pointing at `/sse` for Claude, Cursor, and Codex.
42
+ - `doctor` checks MCP process and port reachability.
43
+
44
+ ### Changed
45
+
46
+ - Project install without a local `docs-kit.yaml` uses in-memory defaults (no user bootstrap in that path).
47
+
48
+ ### Fixed
49
+
50
+ - Serve daemon test log path mock (`MagicMock` parent breaking `mkdir`).
51
+
52
+ ## [0.1.2] - 2026-03-27
53
+
54
+ ### Added
55
+
56
+ - Bootstrap `~/.docs-kit/docs-kit.yaml` and default Qdrant path when running a global `install` with no project config.
57
+ - `DocsKitConfig.from_yaml` loading from the user config after `./docs-kit.yaml`.
58
+
59
+ ### Changed
60
+
61
+ - Documented pipx-first install, config precedence, and Python 3.11–3.13 cap; `requires-python` set to `<3.14` (onnxruntime wheels).
62
+
63
+ ## [0.1.1] - 2026-03-27
64
+
65
+ ### Added
66
+
67
+ - Mintlify fetcher (`llms-full.txt`, `llms.txt`, sitemap fallback) and shared `llms_txt` helpers.
68
+ - `docs-kit ingest --provider auto|gitbook|mintlify`.
69
+ - Formatted CLI help and banner (`DocsKitGroup` / `DocsKitCommand`, examples in epilogs).
70
+
71
+ ### Changed
72
+
73
+ - GitBook fetcher refactor; HTML stripped before chunking via `html_utils`.
74
+ - `install` resolves absolute `docs-kit` command and config paths; warns if YAML is missing.
75
+
76
+ ### Chore
77
+
78
+ - Ignore local `docs/` and add release helper script to `.gitignore`.
79
+
80
+ ## [0.1.0] - 2026-03-27
81
+
82
+ ### Added
83
+
84
+ - `DocsKitAgent` API: `ingest()`, `query()`.
85
+ - CLI: `docs-kit init`, `fetch`, `ingest`, `serve`, `install`, `query`, `inspect`, `doctor`.
86
+ - GitBook fetch via `/llms-full.txt` / `/llms.txt` and linked pages.
87
+ - Local embeddings with FastEmbed (dense + sparse/BM25).
88
+ - Vector store: Qdrant (local path or remote URL).
89
+ - Document parsers for `.txt` and `.md`.
90
+ - `DocsKitConfig` with YAML and environment variable support.
91
+ - MCP server tools: `search_docs`, `list_sources`, `get_collection_info`, `get_full_document`.
92
+ - Annotated `docs-kit.yaml` example, npx wrapper, sample data, CI and publish workflows.
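The 0.1.5 entry above says large code blocks are split on line boundaries with the fences preserved in each part. A minimal sketch of that idea follows, assuming a well-formed block whose first and last lines are the fences. `split_fenced_block` and `max_chars` are hypothetical names used for illustration, not the package's API.

```python
def split_fenced_block(block: str, max_chars: int) -> list[str]:
    """Split a fenced code block into parts that each carry their own fences."""
    lines = block.strip().splitlines()
    # Assumption: the first and last lines are the opening and closing fences.
    opening, closing = lines[0], lines[-1]
    body = lines[1:-1]
    overhead = len(opening) + 1 + len(closing)  # both fences plus one newline

    parts: list[str] = []
    current: list[str] = []
    size = overhead
    for line in body:
        cost = len(line) + 1  # +1 for the newline that joins this line
        if current and size + cost > max_chars:
            parts.append("\n".join([opening, *current, closing]))
            current, size = [], overhead
        current.append(line)
        size += cost
    if current or not parts:
        parts.append("\n".join([opening, *current, closing]))
    return parts


# The fence is built at runtime so this example nests cleanly inside Markdown.
fence = "`" * 3
block = "\n".join([fence + "python", "a = 1", "b = 2", "c = 3", fence])
parts = split_fenced_block(block, max_chars=25)
```

Each part reuses the original opening line, so a language tag such as `python` survives the split and every part renders as a standalone fenced block.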
@@ -69,6 +69,16 @@ After substantive changes, run the full test suite before claiming completion.
69
69
 
70
70
  ---
71
71
 
72
+ ## Skill templates (`docs_kit/templates/`)
73
+
74
+ Installed agent skills are generated from these templates. They must stay aligned with the real CLI and install flow.
75
+
76
+ - **MUST** open and review `docs_kit/templates/` after **major** changes to commands, flags, defaults, install targets, or agent integration (anything that changes what users or agents actually run).
77
+ - **MUST** update the affected template files when those changes alter skill instructions; **need not** touch templates when the change is clearly unrelated (e.g. internal refactors with no CLI or install behavior change).
78
+ - Stale templates mislead agents; treat template drift as a product bug on par with wrong CLI help text.
79
+
80
+ ---
81
+
72
82
  ## Git Commits
73
83
 
74
84
  - **MUST** write descriptive commit messages using [Conventional Commits](https://www.conventionalcommits.org/): `feat`, `fix`, `chore`, `docs`, `style`, `refactor`, `test`, `perf`.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.3
2
2
  Name: docs-kit
3
- Version: 0.1.4
3
+ Version: 0.1.5
4
4
  Summary: Fetch docs, embed locally, expose to AI agents via skills.
5
5
  License: MIT License
6
6
 
@@ -107,7 +107,7 @@ For more architecture detail (config resolution, layout, security posture), see
107
107
  | Command | Where the skill lands |
108
108
  |---------|------------------------|
109
109
  | `docs-kit install claude-code` | `~/.claude/skills/docs-kit/SKILL.md` |
110
- | `docs-kit install codex` | `~/.agents/skills/docs-kit/SKILL.md` |
110
+ | `docs-kit install codex` | `~/.codex/skills/docs-kit/SKILL.md` (also mirrors to `~/.agents/skills/docs-kit/SKILL.md` for compatibility) |
111
111
  | `docs-kit install cursor` | `~/.cursor/skills/docs-kit/SKILL.md` + marked block in `~/.cursor/AGENTS.md` when needed |
112
112
  | `docs-kit install opencode` | `~/.config/opencode/skills/docs-kit/SKILL.md` (and may mirror under `~/.agents/skills/` if absent) |
113
113
  | `docs-kit install claude-desktop` | `~/.docs-kit/exports/claude-desktop/docs-kit.zip` — upload via Claude Desktop **Customize → Skills** |
@@ -154,9 +154,12 @@ Run hybrid retrieval from the CLI.
154
154
  ```bash
155
155
  docs-kit query "How do I authenticate?"
156
156
  docs-kit query "getting started" --limit 3
157
+ docs-kit query "season 5 end date" --full
157
158
  docs-kit query "..." --config ./docs-kit.yaml
158
159
  ```
159
160
 
161
+ Use `--full` when you want the entire chunk body instead of the default preview.
162
+
160
163
  ### `docs-kit list`
161
164
 
162
165
  List ingested sources with ingestion timestamps.
@@ -197,6 +200,8 @@ docs-kit inspect --config ./docs-kit.yaml
197
200
 
198
201
  Check `docs-kit` on PATH, effective config, Qdrant path / connectivity, and which skills are installed.
199
202
 
203
+ For Codex installs, `doctor` reports both the native `~/.codex/skills/...` install and the shared compatibility mirror under `~/.agents/skills/...` when present.
204
+
200
205
  ```bash
201
206
  docs-kit doctor
202
207
  docs-kit doctor --config ./docs-kit.yaml
@@ -68,7 +68,7 @@ For more architecture detail (config resolution, layout, security posture), see
68
68
  | Command | Where the skill lands |
69
69
  |---------|------------------------|
70
70
  | `docs-kit install claude-code` | `~/.claude/skills/docs-kit/SKILL.md` |
71
- | `docs-kit install codex` | `~/.agents/skills/docs-kit/SKILL.md` |
71
+ | `docs-kit install codex` | `~/.codex/skills/docs-kit/SKILL.md` (also mirrors to `~/.agents/skills/docs-kit/SKILL.md` for compatibility) |
72
72
  | `docs-kit install cursor` | `~/.cursor/skills/docs-kit/SKILL.md` + marked block in `~/.cursor/AGENTS.md` when needed |
73
73
  | `docs-kit install opencode` | `~/.config/opencode/skills/docs-kit/SKILL.md` (and may mirror under `~/.agents/skills/` if absent) |
74
74
  | `docs-kit install claude-desktop` | `~/.docs-kit/exports/claude-desktop/docs-kit.zip` — upload via Claude Desktop **Customize → Skills** |
@@ -115,9 +115,12 @@ Run hybrid retrieval from the CLI.
115
115
  ```bash
116
116
  docs-kit query "How do I authenticate?"
117
117
  docs-kit query "getting started" --limit 3
118
+ docs-kit query "season 5 end date" --full
118
119
  docs-kit query "..." --config ./docs-kit.yaml
119
120
  ```
120
121
 
122
+ Use `--full` when you want the entire chunk body instead of the default preview.
123
+
121
124
  ### `docs-kit list`
122
125
 
123
126
  List ingested sources with ingestion timestamps.
@@ -158,6 +161,8 @@ docs-kit inspect --config ./docs-kit.yaml
158
161
 
159
162
  Check `docs-kit` on PATH, effective config, Qdrant path / connectivity, and which skills are installed.
160
163
 
164
+ For Codex installs, `doctor` reports both the native `~/.codex/skills/...` install and the shared compatibility mirror under `~/.agents/skills/...` when present.
165
+
161
166
  ```bash
162
167
  docs-kit doctor
163
168
  docs-kit doctor --config ./docs-kit.yaml
@@ -0,0 +1 @@
1
+ __version__ = "0.1.5"
@@ -48,6 +48,14 @@ def _get_user_config_path() -> Path:
      return _get_user_config_dir() / "docs-kit.yaml"


+ def _get_codex_skill_path() -> Path:
+     return Path.home() / ".codex" / "skills" / "docs-kit" / "SKILL.md"
+
+
+ def _get_shared_agent_skill_path() -> Path:
+     return Path.home() / ".agents" / "skills" / "docs-kit" / "SKILL.md"
+
+
  def _build_user_bootstrap_config():
      DocsKitConfig = _get_config_class()
      config = DocsKitConfig()
@@ -350,17 +358,18 @@ def doctor_cmd(config_path: str | None):
      # Skill install status
      click.echo(" Installed skills:")
      skill_locations = [
-         ("Claude Code", home / ".claude" / "skills" / "docs-kit" / "SKILL.md"),
-         ("Cursor", home / ".cursor" / "skills" / "docs-kit" / "SKILL.md"),
-         ("Codex", home / ".agents" / "skills" / "docs-kit" / "SKILL.md"),
-         ("OpenCode", home / ".config" / "opencode" / "skills" / "docs-kit" / "SKILL.md"),
-         ("Claude Desktop", _get_user_config_dir() / "exports" / "claude-desktop" / "docs-kit.zip"),
+         ("claude-code", "claude-code", home / ".claude" / "skills" / "docs-kit" / "SKILL.md"),
+         ("cursor", "cursor", home / ".cursor" / "skills" / "docs-kit" / "SKILL.md"),
+         ("codex", "codex", _get_codex_skill_path()),
+         ("codex-shared", "codex", _get_shared_agent_skill_path()),
+         ("opencode", "opencode", home / ".config" / "opencode" / "skills" / "docs-kit" / "SKILL.md"),
+         ("claude-desktop", "claude-desktop", _get_user_config_dir() / "exports" / "claude-desktop" / "docs-kit.zip"),
      ]
-     for label, path in skill_locations:
+     for label, install_target, path in skill_locations:
          if path.exists():
              click.echo(f" [OK] {label}: {path}")
          else:
-             click.echo(f" [--] {label}: not installed (run: docs-kit install {label.lower().replace(' ', '-')})")
+             click.echo(f" [--] {label}: not installed (run: docs-kit install {install_target})")


  @click.command(
@@ -370,12 +379,14 @@ def doctor_cmd(config_path: str | None):
      epilog=format_examples(
          'docs-kit query "How do I authenticate?"',
          'docs-kit query "getting started" --limit 3',
+         'docs-kit query "season 5 end date" --full',
      ),
  )
  @click.argument("text")
  @click.option("--config", "config_path", default=None, help="Path to docs-kit.yaml.")
  @click.option("--limit", default=None, type=int, help="Maximum chunks to return.")
- def query_cmd(text: str, config_path: str | None, limit: int | None):
+ @click.option("--full", is_flag=True, default=False, help="Show full chunk text without truncation.")
+ def query_cmd(text: str, config_path: str | None, limit: int | None, full: bool):
      """Run a retrieval query against the vector store (no server required)."""
      config = _load_config(config_path)
      agent = _get_agent_class()(config=config)
@@ -386,9 +397,12 @@ def query_cmd(text: str, config_path: str | None, limit: int | None):
          sys.exit(1)

      for i, chunk in enumerate(chunks, start=1):
-         preview = chunk.text[:300] + "..." if len(chunk.text) > 300 else chunk.text
+         if full:
+             body = chunk.text
+         else:
+             body = chunk.text[:1500] + "..." if len(chunk.text) > 1500 else chunk.text
          click.echo(f"[{i}] score={chunk.score:.2f} source={chunk.source}")
-         click.echo(preview)
+         click.echo(body)
          click.echo("---")

@@ -506,10 +520,13 @@ def install_cmd(agent: str, project: bool, config_path: str | None):
          return

      if normalized in {"codex", "codex-app", "codex-desktop"}:
-         dest = home / ".agents" / "skills" / "docs-kit" / "SKILL.md"
+         dest = _get_codex_skill_path()
+         shared_dest = _get_shared_agent_skill_path()
          content = _build_skill_content("skill.md", effective_config_path)
          _install_skill_file(content, dest)
+         _install_skill_file(content, shared_dest)
          click.echo(f"Installed docs-kit skill at {dest}")
+         click.echo(f" Also installed shared compatibility skill at {shared_dest}")
          if created_user_config:
              click.echo(f" Created user config at {_get_user_config_path()}")
          elif effective_config_path is not None:
@@ -544,7 +561,7 @@ def install_cmd(agent: str, project: bool, config_path: str | None):
          click.echo(f"Installed docs-kit skill at {dest}")

          # Also write to shared ~/.agents/skills/ path if not already installed by codex
-         shared_dest = home / ".agents" / "skills" / "docs-kit" / "SKILL.md"
+         shared_dest = _get_shared_agent_skill_path()
          if not shared_dest.exists():
              _install_skill_file(content, shared_dest)
              click.echo(f" Also installed at shared path {shared_dest}")
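The two install paths in this file treat the shared `~/.agents/skills/` mirror differently: the Codex branch writes it unconditionally, while the branch above writes it only when it is absent. The sketch below isolates that policy. `write_skill` and `install_with_mirror` are hypothetical helpers, and it assumes that `_install_skill_file` overwrites an existing file, which the diff does not show.

```python
from pathlib import Path


def write_skill(content: str, dest: Path) -> None:
    """Write skill text, creating parent directories as needed (assumed behavior)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(content, encoding="utf-8")


def install_with_mirror(
    content: str, dest: Path, mirror: Path, overwrite_mirror: bool
) -> list[Path]:
    """Write the primary skill file, then the shared mirror per the given policy.

    overwrite_mirror=True  models the Codex branch (always refresh the mirror).
    overwrite_mirror=False models the branch that writes the mirror only if absent.
    """
    write_skill(content, dest)
    written = [dest]
    if overwrite_mirror or not mirror.exists():
        write_skill(content, mirror)
        written.append(mirror)
    return written
```

One consequence of the "only if absent" policy is that a mirror written by an earlier install keeps its older content until a Codex install refreshes it.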
@@ -32,6 +32,7 @@ def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 120) -> li

  _HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)", re.MULTILINE)
  _FENCE_OPEN_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
+ _TABLE_LINE_RE = re.compile(r"^\s*\|")


  def _fence_ranges(text: str) -> list[tuple[int, int]]:
@@ -192,6 +193,159 @@ def _chunk_prose_section(body: str, prefix: str, chunk_size: int, chunk_overlap:
      return [f"{prefix}{w}" for w in windows]


+ # ---------------------------------------------------------------------------
+ # Table and code block preservation
+ # ---------------------------------------------------------------------------
+
+ def _split_tables_and_code(body: str) -> list[tuple[str, str]]:
+     """Split a section body into typed segments: ('prose'|'table'|'code', text).
+
+     Table blocks: contiguous runs of pipe-prefixed lines (markdown table rows).
+     Code blocks: fenced code blocks (``` or ~~~).
+     Everything else: prose.
+
+     Original newlines are preserved within table and code segments.
+     """
+     lines = body.splitlines(keepends=True)
+     blocks: list[tuple[str, str]] = []
+     current_type: str = "prose"
+     current_lines: list[str] = []
+     fence_char: str | None = None
+     fence_len: int = 0
+
+     def _flush():
+         nonlocal current_lines
+         if current_lines:
+             text = "".join(current_lines)
+             if text.strip():
+                 blocks.append((current_type, text))
+             current_lines = []
+
+     for line in lines:
+         fence_match = _FENCE_OPEN_RE.match(line)
+         is_table_line = bool(_TABLE_LINE_RE.match(line))
+
+         if fence_char is not None:
+             # Inside a fenced code block — accumulate until closing fence
+             current_lines.append(line)
+             if fence_match and fence_match.group(1)[0] == fence_char and len(fence_match.group(1)) >= fence_len:
+                 fence_char = None
+             continue
+
+         if fence_match and current_type != "table":
+             # Opening fence — flush current block, start code block
+             _flush()
+             current_type = "code"
+             current_lines.append(line)
+             fence_char = fence_match.group(1)[0]
+             fence_len = len(fence_match.group(1))
+             continue
+
+         if is_table_line:
+             if current_type != "table":
+                 _flush()
+                 current_type = "table"
+             current_lines.append(line)
+         else:
+             if current_type in ("table", "code"):
+                 _flush()
+                 current_type = "prose"
+             current_lines.append(line)
+
+     _flush()
+     return blocks
+
+
+ def _chunk_table_block(table_text: str, prefix: str, chunk_size: int) -> list[str]:
+     """Chunk a markdown table, keeping it atomic when possible.
+
+     If the table exceeds chunk_size, split at row boundaries. The header row
+     and separator row are repeated at the top of every split chunk.
+     """
+     full = f"{prefix}{table_text.strip()}"
+     if len(full) <= chunk_size:
+         return [full]
+
+     rows = [r for r in table_text.splitlines() if r.strip()]
+     if len(rows) < 3:
+         # No data rows to split — keep as-is even if oversized
+         return [full]
+
+     header_row = rows[0]
+     separator_row = rows[1]
+     data_rows = rows[2:]
+
+     header_block = f"{header_row}\n{separator_row}\n"
+     chunks: list[str] = []
+     group_rows: list[str] = []
+     group_len = len(prefix) + len(header_block)
+
+     for row in data_rows:
+         row_len = len(row) + 1  # +1 for newline
+         if group_rows and group_len + row_len > chunk_size:
+             chunk_text_val = prefix + header_block + "\n".join(group_rows)
+             chunks.append(chunk_text_val)
+             group_rows = [row]
+             group_len = len(prefix) + len(header_block) + row_len
+         else:
+             group_rows.append(row)
+             group_len += row_len
+
+     if group_rows:
+         chunks.append(prefix + header_block + "\n".join(group_rows))
+
+     return chunks if chunks else [full]
+
+
+ def _chunk_code_block(code_text: str, prefix: str, chunk_size: int) -> list[str]:
+     """Chunk a fenced code block, keeping it atomic when possible.
+
+     If the block exceeds chunk_size, split at line boundaries preserving
+     the opening fence line at the top of each split chunk.
+     """
+     full = f"{prefix}{code_text.strip()}"
+     if len(full) <= chunk_size:
+         return [full]
+
+     lines = code_text.splitlines()
+     if not lines:
+         return []
+
+     opening_fence = lines[0]
+     closing_fence = None
+     content_lines = lines[1:]
+     if len(lines) > 1:
+         fence_match = _FENCE_OPEN_RE.match(lines[-1])
+         if fence_match:
+             closing_fence = lines[-1]
+             content_lines = lines[1:-1]
+
+     # Keep each emitted chunk as valid fenced markdown.
+     closing_fence = closing_fence or opening_fence
+     wrapper_len = len(prefix) + len(opening_fence) + 1 + len(closing_fence)
+
+     chunks: list[str] = []
+     current: list[str] = []
+     current_len = wrapper_len
+
+     for line in content_lines:
+         line_len = len(line) + 1
+         if current and current_len + line_len > chunk_size:
+             chunks.append(prefix + "\n".join([opening_fence, *current, closing_fence]))
+             current = [line]
+             current_len = wrapper_len + line_len
+         else:
+             current.append(line)
+             current_len += line_len
+
+     if current:
+         chunks.append(prefix + "\n".join([opening_fence, *current, closing_fence]))
+     elif not chunks:
+         chunks.append(prefix + "\n".join([opening_fence, closing_fence]))
+
+     return chunks if chunks else [full]
+
+
  def _merge_small_chunks(chunks: list[str], chunk_size: int) -> list[str]:
      """Merge adjacent chunks that are both smaller than chunk_size/2."""
      if not chunks:
@@ -208,7 +362,12 @@ def _merge_small_chunks(chunks: list[str], chunk_size: int) -> list[str]:


  def chunk_markdown(text: str, chunk_size: int = 800, chunk_overlap: int = 120) -> list[str]:
-     """Chunk markdown text with structure-awareness."""
+     """Chunk markdown text with structure-awareness.
+
+     Tables and fenced code blocks are preserved with their original newlines.
+     Prose is whitespace-normalized and split with a sliding window.
+     List-heavy sections use bullet-item grouping.
+     """
      if not text.strip():
          return []
      if chunk_overlap >= chunk_size:
@@ -222,6 +381,12 @@ def chunk_markdown(text: str, chunk_size: int = 800, chunk_overlap: int = 120) -
          if _is_list_heavy(body):
              raw_chunks.extend(_chunk_list_section(body, prefix, chunk_size))
          else:
-             raw_chunks.extend(_chunk_prose_section(body, prefix, chunk_size, chunk_overlap))
+             for block_type, block_text in _split_tables_and_code(body):
+                 if block_type == "table":
+                     raw_chunks.extend(_chunk_table_block(block_text, prefix, chunk_size))
+                 elif block_type == "code":
+                     raw_chunks.extend(_chunk_code_block(block_text, prefix, chunk_size))
+                 else:
+                     raw_chunks.extend(_chunk_prose_section(block_text, prefix, chunk_size, chunk_overlap))

      return _merge_small_chunks([c for c in raw_chunks if c.strip()], chunk_size)
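The row-boundary table split added in this file can be tried in isolation. The following is a simplified sketch, not the package's function: `split_table` and `max_chars` are hypothetical names, and the section-header `prefix` that the real helper prepends is left out.

```python
def split_table(table: str, max_chars: int) -> list[str]:
    """Split a Markdown table on row boundaries, repeating the header in each part."""
    rows = [r for r in table.splitlines() if r.strip()]
    # Small tables, or tables with no data rows, are returned whole.
    if len(table.strip()) <= max_chars or len(rows) < 3:
        return [table.strip()]

    header = rows[0] + "\n" + rows[1] + "\n"  # header row plus separator row
    parts: list[str] = []
    group: list[str] = []
    size = len(header)
    for row in rows[2:]:
        cost = len(row) + 1  # +1 for the newline that joins this row
        if group and size + cost > max_chars:
            parts.append(header + "\n".join(group))
            group, size = [], len(header)
        group.append(row)
        size += cost
    if group:
        parts.append(header + "\n".join(group))
    return parts


table = "\n".join(
    ["| name | value |", "|------|-------|"]
    + [f"| key{i} | {i} |" for i in range(6)]
)
parts = split_table(table, max_chars=80)
```

Repeating the header and separator in every part keeps each chunk a valid table on its own, so a retrieved chunk still shows which column each cell belongs to.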
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: docs-kit
3
- description: Search and manage documentation knowledge bases using docs-kit CLI. Use when the user asks about third-party library docs, API references, or wants to ingest/manage documentation sources.
3
+ description: Search and manage documentation knowledge bases using docs-kit CLI. Use when the user asks about third-party library docs, API references, vendor documentation, version-specific API behavior, GitBook or Mintlify public docs, offline or local doc search, or wants to ingest a doc URL before answering a question.
4
4
  allowed-tools:
5
5
  - Bash(docs-kit *)
6
6
  - Bash({{DOCS_KIT_CMD}} *)
@@ -25,7 +25,7 @@ Use docs-kit when the user:
25
25
 
26
26
  1. Run `{{DOCS_KIT_CMD}} list` to check what documentation is already available.
27
27
  2. If relevant docs are present, run `{{DOCS_KIT_CMD}} query "your question"` to find relevant chunks.
28
- 3. If docs are not yet ingested, suggest: `{{DOCS_KIT_CMD}} ingest <url-or-path>`
28
+ 3. If docs are not yet ingested, run `{{DOCS_KIT_CMD}} ingest <url-or-path>` (confirm with user if the source is unfamiliar).
29
29
  4. Use retrieved chunks to inform your answer, citing sources.
30
30
 
31
31
  ## Commands
@@ -34,7 +34,9 @@ Use docs-kit when the user:
34
34
  ```bash
35
35
  {{DOCS_KIT_CMD}} query "your search query"
36
36
  {{DOCS_KIT_CMD}} query "how to authenticate" --limit 10
37
+ {{DOCS_KIT_CMD}} query "how to authenticate" --limit 10 --full
37
38
  ```
39
+ Use `--full` to return untruncated passage text.
38
40
  Returns relevant chunks with source attribution and relevance scores.
39
41
 
40
42
  ### List ingested sources
@@ -70,6 +72,12 @@ Returns relevant chunks with source attribution and relevance scores.
70
72
  {{DOCS_KIT_CMD}} inspect
71
73
  ```
72
74
 
75
+ ### Initialize project config
76
+ ```bash
77
+ {{DOCS_KIT_CMD}} init
78
+ ```
79
+ Creates a project-local `docs-kit.yaml` config file.
80
+
73
81
  ### Diagnose issues
74
82
  ```bash
75
83
  {{DOCS_KIT_CMD}} doctor
@@ -0,0 +1,36 @@
1
+ ---
2
+ name: docs-kit
3
+ description: Search and manage documentation knowledge bases using docs-kit CLI. Use when the user asks about third-party library docs, API references, vendor documentation, version-specific API behavior, GitBook or Mintlify public docs, offline or local doc search, or wants to ingest a doc URL before answering a question.
4
+ ---
5
+
6
+ # docs-kit — Documentation Knowledge Base
7
+
8
+ docs-kit is a globally installed CLI tool that fetches, embeds, and searches documentation locally.
9
+ All configuration and data are stored under `~/.docs-kit/` — no extra setup required.
10
+
11
+ Executable: `{{DOCS_KIT_CMD}}`
12
+
13
+ ## When to use
14
+
15
+ Use docs-kit when the user:
16
+ - Asks a question about a library, framework, or API whose docs may be ingested
17
+ - Wants to search documentation for code examples, API references, or guides
18
+ - Wants to ingest new documentation from a URL or local files
19
+ - Wants to manage (list, remove, inspect) ingested documentation sources
20
+
21
+ ## Workflow
22
+
23
+ 1. Run `{{DOCS_KIT_CMD}} list` to check what documentation is available.
24
+ 2. Run `{{DOCS_KIT_CMD}} query "your question"` to search ingested docs.
25
+ 3. If docs are not yet ingested, propose: `{{DOCS_KIT_CMD}} ingest <url-or-path>`
26
+
27
+ ## Commands
28
+
29
+ - `{{DOCS_KIT_CMD}} query "search terms" --limit 10` — search documentation (add `--full` for untruncated text)
30
+ - `{{DOCS_KIT_CMD}} list` — list all ingested sources
31
+ - `{{DOCS_KIT_CMD}} ingest <url-or-path>` — ingest new docs (add `--recreate` to re-ingest)
32
+ - `{{DOCS_KIT_CMD}} fetch <url> --output <dir>` — download docs to local files
33
+ - `{{DOCS_KIT_CMD}} remove <source>` — remove a source
34
+ - `{{DOCS_KIT_CMD}} inspect` — collection stats
35
+ - `{{DOCS_KIT_CMD}} init` — create project-local config
36
+ - `{{DOCS_KIT_CMD}} doctor` — diagnose issues
@@ -0,0 +1,33 @@
1
+ > Prefer `~/.cursor/skills/docs-kit/SKILL.md` when present; this block is a fallback.
2
+
3
+ # docs-kit — Documentation Knowledge Base
4
+
5
+ docs-kit is a globally installed CLI tool for searching and managing local documentation embeddings.
6
+ All configuration and data are stored under `~/.docs-kit/` — no extra setup required.
7
+
8
+ Executable: `{{DOCS_KIT_CMD}}`
9
+
10
+ ## When to use
11
+
12
+ Use docs-kit when the user:
13
+ - Asks a question about a library, framework, or API whose docs may be ingested
14
+ - Wants to search documentation for code examples, API references, or guides
15
+ - Wants to ingest new documentation from a URL or local files
16
+ - Wants to manage (list, remove, inspect) ingested documentation sources
17
+
18
+ ## Workflow
19
+
20
+ 1. Run `{{DOCS_KIT_CMD}} list` to check what documentation is available.
21
+ 2. Run `{{DOCS_KIT_CMD}} query "your question"` to search.
22
+ 3. If docs are not yet ingested, run `{{DOCS_KIT_CMD}} ingest <url-or-path>` (confirm with user if the source is unfamiliar).
23
+
24
+ ## Commands
25
+
26
+ - `{{DOCS_KIT_CMD}} query "search terms" --limit 10` — search ingested documentation (add `--full` for untruncated text)
27
+ - `{{DOCS_KIT_CMD}} list` — list all ingested sources with dates
28
+ - `{{DOCS_KIT_CMD}} ingest <url-or-path>` — ingest docs from a URL or local path (add `--recreate` to re-ingest)
29
+ - `{{DOCS_KIT_CMD}} fetch <url> --output <dir>` — download docs to local Markdown files
30
+ - `{{DOCS_KIT_CMD}} remove <source>` — remove a previously ingested source
31
+ - `{{DOCS_KIT_CMD}} inspect` — show collection stats
32
+ - `{{DOCS_KIT_CMD}} init` — create project-local config
33
+ - `{{DOCS_KIT_CMD}} doctor` — diagnose issues
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: docs-kit
3
- description: Search and manage documentation knowledge bases using docs-kit CLI. Use when the user asks about third-party library docs, API references, or wants to ingest/manage documentation sources.
3
+ description: Search and manage documentation knowledge bases using docs-kit CLI. Use when the user asks about third-party library docs, API references, vendor documentation, version-specific API behavior, GitBook or Mintlify public docs, offline or local doc search, or wants to ingest a doc URL before answering a question.
4
4
  ---
5
5
 
6
6
  # docs-kit — Documentation Knowledge Base
@@ -22,7 +22,7 @@ Use docs-kit when the user:
22
22
 
23
23
  1. Run `{{DOCS_KIT_CMD}} list` to check what documentation is already available.
24
24
  2. If relevant docs are present, run `{{DOCS_KIT_CMD}} query "your question"` to find relevant chunks.
25
- 3. If docs are not yet ingested, suggest: `{{DOCS_KIT_CMD}} ingest <url-or-path>`
25
+ 3. If docs are not yet ingested, run `{{DOCS_KIT_CMD}} ingest <url-or-path>` (confirm with user if the source is unfamiliar).
26
26
  4. Use retrieved chunks to inform your answer, citing sources.
27
27
 
28
28
  ## Commands
@@ -31,7 +31,9 @@ Use docs-kit when the user:
31
31
  ```bash
32
32
  {{DOCS_KIT_CMD}} query "your search query"
33
33
  {{DOCS_KIT_CMD}} query "how to authenticate" --limit 10
34
+ {{DOCS_KIT_CMD}} query "how to authenticate" --limit 10 --full
34
35
  ```
36
+ Use `--full` to return untruncated passage text.
35
37
 
36
38
  ### List ingested sources
37
39
  ```bash
@@ -60,6 +62,12 @@ Use docs-kit when the user:
60
62
  {{DOCS_KIT_CMD}} inspect
61
63
  ```
62
64
 
65
+ ### Initialize project config
66
+ ```bash
67
+ {{DOCS_KIT_CMD}} init
68
+ ```
69
+ Creates a project-local `docs-kit.yaml` config file.
70
+
63
71
  ### Diagnose issues
64
72
  ```bash
65
73
  {{DOCS_KIT_CMD}} doctor