docmk 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. package/.claude/skills/pdf/SKILL.md +89 -0
  2. package/.claude/skills/web-scraping/SKILL.md +78 -0
  3. package/CLAUDE.md +90 -0
  4. package/bin/docmk.js +3 -0
  5. package/dist/index.d.ts +1 -0
  6. package/dist/index.js +636 -0
  7. package/dist/index.js.map +1 -0
  8. package/final-site/assets/main-B4orIFxK.css +1 -0
  9. package/final-site/assets/main-CSoKXua6.js +25 -0
  10. package/final-site/favicon.svg +4 -0
  11. package/final-site/index.html +26 -0
  12. package/final-site/robots.txt +4 -0
  13. package/final-site/sitemap.xml +14 -0
  14. package/my-docs/api/README.md +152 -0
  15. package/my-docs/api/advanced.md +260 -0
  16. package/my-docs/getting-started/README.md +24 -0
  17. package/my-docs/tutorials/README.md +272 -0
  18. package/my-docs/tutorials/customization.md +492 -0
  19. package/package.json +59 -0
  20. package/postcss.config.js +6 -0
  21. package/site/assets/main-BZUsYUCF.css +1 -0
  22. package/site/assets/main-q6laQtCD.js +114 -0
  23. package/site/favicon.svg +4 -0
  24. package/site/index.html +23 -0
  25. package/site/robots.txt +4 -0
  26. package/site/sitemap.xml +34 -0
  27. package/site-output/assets/main-B4orIFxK.css +1 -0
  28. package/site-output/assets/main-CSoKXua6.js +25 -0
  29. package/site-output/favicon.svg +4 -0
  30. package/site-output/index.html +26 -0
  31. package/site-output/robots.txt +4 -0
  32. package/site-output/sitemap.xml +14 -0
  33. package/src/builder/index.ts +189 -0
  34. package/src/builder/vite-dev.ts +117 -0
  35. package/src/cli/commands/build.ts +48 -0
  36. package/src/cli/commands/dev.ts +53 -0
  37. package/src/cli/commands/preview.ts +57 -0
  38. package/src/cli/index.ts +42 -0
  39. package/src/client/App.vue +15 -0
  40. package/src/client/components/SearchBox.vue +204 -0
  41. package/src/client/components/Sidebar.vue +18 -0
  42. package/src/client/components/SidebarItem.vue +108 -0
  43. package/src/client/index.html +21 -0
  44. package/src/client/layouts/AppLayout.vue +99 -0
  45. package/src/client/lib/utils.ts +6 -0
  46. package/src/client/main.ts +42 -0
  47. package/src/client/pages/Home.vue +279 -0
  48. package/src/client/pages/SkillPage.vue +565 -0
  49. package/src/client/router.ts +16 -0
  50. package/src/client/styles/global.css +92 -0
  51. package/src/client/utils/routes.ts +69 -0
  52. package/src/parser/index.ts +253 -0
  53. package/src/scanner/index.ts +127 -0
  54. package/src/types/index.ts +45 -0
  55. package/tailwind.config.js +65 -0
  56. package/test-build/assets/main-C2ARPC0e.css +1 -0
  57. package/test-build/assets/main-CHIQpV3B.js +25 -0
  58. package/test-build/favicon.svg +4 -0
  59. package/test-build/index.html +47 -0
  60. package/test-build/robots.txt +4 -0
  61. package/test-build/sitemap.xml +19 -0
  62. package/test-dist/assets/main-B4orIFxK.css +1 -0
  63. package/test-dist/assets/main-CSoKXua6.js +25 -0
  64. package/test-dist/favicon.svg +4 -0
  65. package/test-dist/index.html +26 -0
  66. package/test-dist/robots.txt +4 -0
  67. package/test-dist/sitemap.xml +14 -0
  68. package/tsconfig.json +30 -0
  69. package/tsup.config.ts +13 -0
  70. package/vite.config.ts +21 -0
package/.claude/skills/pdf/SKILL.md ADDED
@@ -0,0 +1,89 @@
1
+ ---
2
+ title: PDF Processing
3
+ description: Comprehensive guide to working with PDF files including extraction, manipulation, and generation
4
+ tags: ["pdf", "documents", "data-extraction"]
5
+ ---
6
+
7
+ # PDF Processing
8
+
9
+ This skill covers various aspects of working with PDF documents programmatically.
10
+
11
+ ## Overview
12
+
13
+ PDF (Portable Document Format) is a widely used file format for documents. This skill includes:
14
+
15
+ - Text extraction from PDFs
16
+ - PDF generation and manipulation
17
+ - Form handling
18
+ - Metadata processing
19
+ - Document conversion
20
+
21
+ ## Key Libraries and Tools
22
+
23
+ ### Python
24
+ - **PyPDF2/PyPDF4** - Basic PDF operations
25
+ - **pdfplumber** - Text extraction with layout preservation
26
+ - **ReportLab** - PDF generation
27
+ - **pdftk** - Command-line PDF toolkit (a standalone tool, not a Python library)
28
+
29
+ ### JavaScript/Node.js
30
+ - **pdf-lib** - Create and modify PDF documents
31
+ - **pdf-parse** - Simple PDF parsing
32
+ - **puppeteer** - Generate PDFs from web content
33
+
34
+ ## Common Use Cases
35
+
36
+ ### Text Extraction
37
+ Extract text content from PDF files while preserving formatting and structure.
38
+
39
+ ### Document Generation
40
+ Create PDFs programmatically from data, templates, or web content.
41
+
42
+ ### Form Processing
43
+ Handle PDF forms, extract form data, and fill forms programmatically.
44
+
45
+ ## Best Practices
46
+
47
+ 1. **Memory Management** - Large PDFs can consume significant memory
48
+ 2. **Error Handling** - PDFs can be corrupted or password-protected
49
+ 3. **Performance** - Consider streaming for large documents
50
+ 4. **Security** - Be cautious with user-uploaded PDFs
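Review note: the error-handling and security points above could be backed by a cheap pre-check that runs before any PDF library touches an upload. The sketch below is only a heuristic under stated assumptions: `precheck_pdf` is a hypothetical helper, it looks for the `%PDF-` header near the start of the data and for an `/Encrypt` key anywhere in the raw bytes, and it can report false positives. It does not replace the error handling of a real parser.

```python
def precheck_pdf(data: bytes) -> str:
    """Classify raw bytes as 'not-pdf', 'encrypted', or 'ok' (heuristic only)."""
    # PDF files start with a "%PDF-" header; tolerate a little leading junk.
    if b"%PDF-" not in data[:1024]:
        return "not-pdf"
    # Encrypted PDFs reference an /Encrypt dictionary. Scanning the raw bytes
    # is crude: the same byte sequence could also appear inside page content.
    if b"/Encrypt" in data:
        return "encrypted"
    return "ok"
```

A caller might reject `not-pdf` inputs outright and ask for a password when the result is `encrypted`, and still wrap the actual parsing step in its own exception handling.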
51
+
52
+ ## Examples
53
+
54
+ ### Basic Text Extraction (Python)
55
+ ```python
56
+ import PyPDF2
57
+
58
+ with open('document.pdf', 'rb') as file:
59
+     reader = PyPDF2.PdfReader(file)
60
+     text = ""
61
+     for page in reader.pages:
62
+         text += page.extract_text() or ""
63
+ print(text)
64
+ ```
65
+
66
+ ### PDF Generation (JavaScript)
67
+ ```javascript
68
+ import { PDFDocument, rgb } from 'pdf-lib'
69
+ import fs from 'fs'
70
+
71
+ const pdfDoc = await PDFDocument.create()
72
+ const page = pdfDoc.addPage()
73
+
74
+ page.drawText('Hello, PDF!', {
75
+   x: 50,
76
+   y: 750,
77
+   size: 30,
78
+   color: rgb(0, 0, 0),
79
+ })
80
+
81
+ const pdfBytes = await pdfDoc.save()
82
+ fs.writeFileSync('output.pdf', pdfBytes)
83
+ ```
84
+
85
+ ## Troubleshooting
86
+
87
+ - **Encoding Issues** - Some PDFs may have encoding problems
88
+ - **Layout Preservation** - Complex layouts may not extract cleanly
89
+ - **Performance** - Large PDFs may require optimization
package/.claude/skills/web-scraping/SKILL.md ADDED
@@ -0,0 +1,78 @@
1
+ ---
2
+ title: Web Scraping
3
+ description: Techniques and tools for extracting data from websites
4
+ tags: ["scraping", "data-extraction", "automation"]
5
+ ---
6
+
7
+ # Web Scraping
8
+
9
+ Web scraping is the process of extracting data from websites programmatically.
10
+
11
+ ## Overview
12
+
13
+ Web scraping involves:
14
+
15
+ - Fetching web pages
16
+ - Parsing HTML content
17
+ - Extracting specific data
18
+ - Handling dynamic content
19
+ - Managing rate limits and ethics
20
+
21
+ ## Tools and Libraries
22
+
23
+ ### Python
24
+ - **BeautifulSoup** - HTML parsing
25
+ - **Scrapy** - Full-featured scraping framework
26
+ - **Selenium** - Browser automation
27
+ - **requests** - HTTP library
28
+
29
+ ### JavaScript/Node.js
30
+ - **Puppeteer** - Chrome automation
31
+ - **Playwright** - Multi-browser automation
32
+ - **Cheerio** - Server-side jQuery
33
+ - **axios** - HTTP client
34
+
35
+ ## Key Concepts
36
+
37
+ ### Respect robots.txt
38
+ Always check the robots.txt file of websites before scraping.
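Review note: the robots.txt check can be done with Python's standard library alone. The sketch below assumes the robots.txt body has already been downloaded as text; `is_allowed` and the `my-bot` user agent used in the usage note are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With a robots.txt containing `User-agent: *` followed by `Disallow: /private/`, a call for `https://example.com/private/data` as `my-bot` returns `False`, while `https://example.com/public` returns `True`.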
39
+
40
+ ### Rate Limiting
41
+ Implement delays between requests to avoid overwhelming servers.
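Review note: a fixed pause between requests is the simplest form of rate limiting. The sketch below is one possible shape, not a prescribed implementation: `fetch_politely` is a hypothetical helper, the `fetch` callable stands in for whatever HTTP client is used, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in order, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # No pause before the first request, one pause before each later one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

A production scraper would likely add jitter, per-host limits, and backoff on HTTP 429 responses; those are left out to keep the sketch short.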
42
+
43
+ ### User Agents
44
+ Rotate user agents to appear more like regular browsers.
45
+
46
+ ## Examples
47
+
48
+ ### Basic Scraping (Python)
49
+ ```python
50
+ import requests
51
+ from bs4 import BeautifulSoup
52
+
53
+ response = requests.get('https://example.com')
54
+ soup = BeautifulSoup(response.content, 'html.parser')
55
+
56
+ titles = soup.find_all('h2', class_='title')
57
+ for title in titles:
58
+     print(title.text)
59
+ ```
60
+
61
+ ### Dynamic Content (JavaScript)
62
+ ```javascript
63
+ const puppeteer = require('puppeteer');
64
+
65
+ (async () => {
66
+   const browser = await puppeteer.launch();
67
+   const page = await browser.newPage();
68
+
69
+   await page.goto('https://example.com');
70
+   await page.waitForSelector('.dynamic-content');
71
+
72
+   const content = await page.$eval('.dynamic-content',
73
+     el => el.textContent);
74
+
75
+   console.log(content);
76
+   await browser.close();
77
+ })();
78
+ ```
package/CLAUDE.md ADDED
@@ -0,0 +1,90 @@
1
+ # CLAUDE.md
2
+
3
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
+
5
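+ ## Project Overview
6
+
7
+ DocGen is a documentation-generation CLI tool that scans any directory and automatically generates a static documentation site. It supports Markdown rendering, full-text search, code highlighting, and more.
8
+
9
+ ## Common Commands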
10
+
11
+ ```bash
12
+ # Install dependencies
13
+ npm install
14
+
15
+ # Build the CLI (outputs to dist/)
16
+ npm run build
17
+
18
+ # Type check
19
+ npm run typecheck
20
+
21
+ # Lint
22
+ npm run lint
23
+
24
+ # Run the CLI in development mode (executed directly with tsx)
25
+ npm run dev
26
+
27
+ # Start the documentation dev server with the CLI
28
+ node dist/index.js dev --dir ./my-docs --port 3000
29
+
30
+ # Build the static site
31
+ node dist/index.js build --dir ./my-docs --output ./dist
32
+
33
+ # Preview the build output
34
+ node dist/index.js preview --output ./dist --port 4173
35
+ ```
36
+
37
+ ## Architecture
38
+
39
+ ```
40
+ src/
41
+ ├── cli/ # CLI entry point and command implementations
42
+ │ ├── index.ts # Main entry point; defines commands with Commander.js
43
+ │ └── commands/ # dev/build/preview commands
44
+ ├── scanner/ # Directory scanner; recursively scans for .md files
45
+ ├── parser/ # Markdown parsing with gray-matter + markdown-it + shiki
46
+ ├── builder/ # Vite build logic
47
+ │ ├── index.ts # Production build; generates sitemap/robots.txt
48
+ │ └── vite-dev.ts # Dev server with file watching and hot updates
49
+ ├── client/ # Vue 3 frontend SPA
50
+ │ ├── pages/ # Home (landing page with stats) and SkillPage (document rendering)
51
+ │ ├── components/ # Sidebar, SearchBox, etc.
52
+ │ └── router.ts # Route configuration; catch-all matches document paths
53
+ └── types/ # TypeScript type definitions
54
+ ```
55
+
56
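+ ## Core Data Flow
57
+
58
+ 1. **Scanner** (`src/scanner/index.ts`): scans the source directory, recursively collects `.md` files, and builds a `SkillDirectory[]` tree
59
+ 2. **Parser** (`src/parser/index.ts`): parses frontmatter, renders Markdown to HTML, extracts the TOC, and applies shiki code highlighting
60
+ 3. **Builder**:
61
+    - Development mode: injects the configuration via the `/api/config` endpoint and uses chokidar to watch for file changes and trigger hot updates
62
+    - Production mode: base64-encodes the configuration and injects it into the HTML `<head>`, then generates sitemap.xml and robots.txt
63
+ 4. **Client**: the Vue 3 SPA reads the configuration from `globalThis.__DOCGEN_CONFIG__` or `/api/config` and renders the documentation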
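Review note: the production-mode step in the data flow above (base64-encode the configuration, inject it into `<head>`) can be illustrated with a language-neutral sketch. Everything here is an assumption made for illustration: the helper names, the `__CONFIG_B64__` global, and the exact markup are hypothetical and are not taken from `src/builder/index.ts`. One plausible reason for base64 is that the encoded payload cannot contain `<` and so cannot close the script tag early.

```python
import base64
import json


def encode_config(config: dict) -> str:
    """Serialize config to JSON and base64-encode it as ASCII text."""
    raw = json.dumps(config, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_config(encoded: str) -> dict:
    """Reverse encode_config (what a client would do at startup)."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def inject_into_head(html: str, encoded: str) -> str:
    """Insert a script tag carrying the payload just before </head>."""
    # __CONFIG_B64__ is a hypothetical global name used only in this sketch.
    tag = f'<script>globalThis.__CONFIG_B64__ = "{encoded}"</script>'
    return html.replace("</head>", tag + "</head>", 1)
```

On the client side, the equivalent of `decode_config` would run once at startup and expose the result to the application.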
64
+
65
+ ## Key Types
66
+
67
+ - `SkillDirectory`: directory node containing `children` and an optional `skillFile` (SKILL.md)
68
+ - `SkillFile`: file node containing `content`, `frontmatter`, and `lastModified`
69
+ - `DocGenConfig`: complete site configuration, including `navigation`, `files`, and `directories`
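Review note: to make the scanner step and these types concrete, here is a rough Python analogue. It is a sketch under assumptions, not a port of `src/scanner/index.ts`: the field layout (a separate `files` list, snake_case names) is hypothetical, and `frontmatter` is omitted because parsing it belongs to the parser step.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SkillFile:
    path: str
    content: str
    last_modified: float


@dataclass
class SkillDirectory:
    name: str
    children: List["SkillDirectory"] = field(default_factory=list)
    files: List[SkillFile] = field(default_factory=list)
    skill_file: Optional[SkillFile] = None  # the directory's SKILL.md, if any


def scan(root: Path) -> SkillDirectory:
    """Recursively collect .md files under root into a directory tree."""
    node = SkillDirectory(name=root.name)
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            node.children.append(scan(entry))
        elif entry.suffix == ".md":
            doc = SkillFile(
                path=str(entry),
                content=entry.read_text(encoding="utf-8"),
                last_modified=entry.stat().st_mtime,
            )
            if entry.name == "SKILL.md":
                node.skill_file = doc
            else:
                node.files.append(doc)
    return node
```

Files with other extensions are skipped, and empty directories still appear as nodes; whether the real scanner prunes them is not stated in the source.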
70
+
71
+ ## Tech Stack
72
+
73
+ - **CLI**: Commander.js, bundled with tsup
74
+ - **Frontend**: Vue 3 + Vue Router + Vite + Tailwind CSS
75
+ - **Parsing**: gray-matter (frontmatter) + markdown-it (rendering) + shiki (code highlighting)
76
+ - **File watching**: chokidar
77
+
78
+ ## Documentation Directory Conventions
79
+
80
+ ```
81
+ docs/
82
+ ├── category/
83
+ │ ├── SKILL.md # Main document (optional; serves as the directory's entry point)
84
+ │ ├── advanced.md # Other documents
85
+ │ └── sub-category/ # Subdirectory
86
+ └── another/
87
+ └── SKILL.md
88
+ ```
89
+
90
+ Routing rules: `SKILL.md` files map to their directory path (e.g. `/api/SKILL.md` → `/api`); other `.md` files keep their file name (e.g. `/api/advanced.md` → `/api/advanced`).
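Review note: the routing rule stated at the end of CLAUDE.md is small enough to express as one function. The sketch below is an illustration of the rule as written, assuming POSIX-style document paths; `route_for` is a hypothetical name and the package's own implementation in `src/client/utils/routes.ts` may differ.

```python
from pathlib import PurePosixPath


def route_for(md_path: str) -> str:
    """Map a document path to its route under the stated convention."""
    path = PurePosixPath(md_path)
    if path.name == "SKILL.md":
        # SKILL.md acts as the index of its directory.
        return str(path.parent)
    # Any other .md file keeps its name, minus the extension.
    return str(path.with_suffix(""))
```

Under this sketch, `/api/SKILL.md` maps to `/api`, `/api/advanced.md` maps to `/api/advanced`, and a top-level `/SKILL.md` maps to `/`.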
package/bin/docmk.js ADDED
@@ -0,0 +1,3 @@
1
+ #!/usr/bin/env node
2
+
3
+ import('../dist/index.js').catch(console.error)
package/dist/index.d.ts ADDED
@@ -0,0 +1 @@
1
+ #!/usr/bin/env node