lifeos 1.0.0

Files changed (216)
  1. package/LICENSE +21 -0
  2. package/README.en.md +202 -0
  3. package/README.md +202 -0
  4. package/assets/lifeos-rules.en.md +162 -0
  5. package/assets/lifeos-rules.zh.md +162 -0
  6. package/assets/lifeos.yaml +56 -0
  7. package/assets/prompts/AI_LLMResearch_Prompt.en.md +120 -0
  8. package/assets/prompts/AI_LLMResearch_Prompt.zh.md +120 -0
  9. package/assets/prompts/Art_ChinesePainting_Prompt.en.md +147 -0
  10. package/assets/prompts/Art_ChinesePainting_Prompt.zh.md +148 -0
  11. package/assets/prompts/Cryptography_Prompt.en.md +148 -0
  12. package/assets/prompts/Cryptography_Prompt.zh.md +147 -0
  13. package/assets/prompts/History_ChineseCulture_Prompt.en.md +98 -0
  14. package/assets/prompts/History_ChineseCulture_Prompt.zh.md +98 -0
  15. package/assets/prompts/Math_HigherMathematics_Prompt.en.md +117 -0
  16. package/assets/prompts/Math_HigherMathematics_Prompt.zh.md +116 -0
  17. package/assets/schema/Frontmatter_Schema.md +139 -0
  18. package/assets/skills/_shared/completion-report.en.md +30 -0
  19. package/assets/skills/_shared/completion-report.zh.md +30 -0
  20. package/assets/skills/_shared/dual-agent-orchestrator.en.md +40 -0
  21. package/assets/skills/_shared/dual-agent-orchestrator.zh.md +40 -0
  22. package/assets/skills/_shared/learning-lifecycle.en.md +53 -0
  23. package/assets/skills/_shared/learning-lifecycle.zh.md +53 -0
  24. package/assets/skills/_shared/lifecycle.en.md +84 -0
  25. package/assets/skills/_shared/lifecycle.zh.md +84 -0
  26. package/assets/skills/_shared/memory-protocol.en.md +36 -0
  27. package/assets/skills/_shared/memory-protocol.zh.md +36 -0
  28. package/assets/skills/_shared/template-loading.en.md +26 -0
  29. package/assets/skills/_shared/template-loading.zh.md +26 -0
  30. package/assets/skills/archive/SKILL.en.md +300 -0
  31. package/assets/skills/archive/SKILL.zh.md +300 -0
  32. package/assets/skills/ask/SKILL.en.md +164 -0
  33. package/assets/skills/ask/SKILL.zh.md +164 -0
  34. package/assets/skills/brainstorm/SKILL.en.md +242 -0
  35. package/assets/skills/brainstorm/SKILL.zh.md +242 -0
  36. package/assets/skills/brainstorm/references/action-options.en.md +65 -0
  37. package/assets/skills/brainstorm/references/action-options.zh.md +65 -0
  38. package/assets/skills/knowledge/SKILL.en.md +202 -0
  39. package/assets/skills/knowledge/SKILL.zh.md +202 -0
  40. package/assets/skills/project/SKILL.en.md +133 -0
  41. package/assets/skills/project/SKILL.zh.md +133 -0
  42. package/assets/skills/project/references/execution-agent-prompt.en.md +148 -0
  43. package/assets/skills/project/references/execution-agent-prompt.zh.md +148 -0
  44. package/assets/skills/project/references/planning-agent-prompt.en.md +162 -0
  45. package/assets/skills/project/references/planning-agent-prompt.zh.md +162 -0
  46. package/assets/skills/read-pdf/SKILL.en.md +199 -0
  47. package/assets/skills/read-pdf/SKILL.zh.md +199 -0
  48. package/assets/skills/read-pdf/scripts/read_pdf.py +354 -0
  49. package/assets/skills/research/SKILL.en.md +107 -0
  50. package/assets/skills/research/SKILL.zh.md +107 -0
  51. package/assets/skills/research/references/execution-agent-prompt.en.md +166 -0
  52. package/assets/skills/research/references/execution-agent-prompt.zh.md +166 -0
  53. package/assets/skills/research/references/planning-agent-prompt.en.md +129 -0
  54. package/assets/skills/research/references/planning-agent-prompt.zh.md +129 -0
  55. package/assets/skills/revise/SKILL.en.md +258 -0
  56. package/assets/skills/revise/SKILL.zh.md +258 -0
  57. package/assets/skills/revise/references/grading-protocol.en.md +99 -0
  58. package/assets/skills/revise/references/grading-protocol.zh.md +99 -0
  59. package/assets/skills/today/SKILL.en.md +211 -0
  60. package/assets/skills/today/SKILL.zh.md +211 -0
  61. package/assets/templates/en/Daily_Template.md +25 -0
  62. package/assets/templates/en/Draft_Template.md +29 -0
  63. package/assets/templates/en/Knowledge_Template.md +86 -0
  64. package/assets/templates/en/Project_Template.md +110 -0
  65. package/assets/templates/en/Research_Template.md +46 -0
  66. package/assets/templates/en/Retrospective_Template.md +89 -0
  67. package/assets/templates/en/Revise_Template.md +24 -0
  68. package/assets/templates/en/Wiki_Template.md +35 -0
  69. package/assets/templates/zh/Daily_Template.md +26 -0
  70. package/assets/templates/zh/Draft_Template.md +29 -0
  71. package/assets/templates/zh/Knowledge_Template.md +86 -0
  72. package/assets/templates/zh/Project_Template.md +110 -0
  73. package/assets/templates/zh/Research_Template.md +46 -0
  74. package/assets/templates/zh/Retrospective_Template.md +89 -0
  75. package/assets/templates/zh/Revise_Template.md +24 -0
  76. package/assets/templates/zh/Wiki_Template.md +35 -0
  77. package/bin/lifeos.js +24 -0
  78. package/dist/active-docs/citations.d.ts +20 -0
  79. package/dist/active-docs/citations.js +74 -0
  80. package/dist/active-docs/citations.js.map +1 -0
  81. package/dist/active-docs/derived-memory.d.ts +57 -0
  82. package/dist/active-docs/derived-memory.js +161 -0
  83. package/dist/active-docs/derived-memory.js.map +1 -0
  84. package/dist/active-docs/index.d.ts +51 -0
  85. package/dist/active-docs/index.js +165 -0
  86. package/dist/active-docs/index.js.map +1 -0
  87. package/dist/active-docs/long-term-profile.d.ts +24 -0
  88. package/dist/active-docs/long-term-profile.js +75 -0
  89. package/dist/active-docs/long-term-profile.js.map +1 -0
  90. package/dist/active-docs/taskboard.d.ts +12 -0
  91. package/dist/active-docs/taskboard.js +146 -0
  92. package/dist/active-docs/taskboard.js.map +1 -0
  93. package/dist/active-docs/userprofile.d.ts +12 -0
  94. package/dist/active-docs/userprofile.js +169 -0
  95. package/dist/active-docs/userprofile.js.map +1 -0
  96. package/dist/cli/commands/doctor.d.ts +9 -0
  97. package/dist/cli/commands/doctor.js +125 -0
  98. package/dist/cli/commands/doctor.js.map +1 -0
  99. package/dist/cli/commands/init.d.ts +1 -0
  100. package/dist/cli/commands/init.js +129 -0
  101. package/dist/cli/commands/init.js.map +1 -0
  102. package/dist/cli/commands/rename.d.ts +7 -0
  103. package/dist/cli/commands/rename.js +188 -0
  104. package/dist/cli/commands/rename.js.map +1 -0
  105. package/dist/cli/commands/upgrade.d.ts +6 -0
  106. package/dist/cli/commands/upgrade.js +66 -0
  107. package/dist/cli/commands/upgrade.js.map +1 -0
  108. package/dist/cli/index.d.ts +1 -0
  109. package/dist/cli/index.js +52 -0
  110. package/dist/cli/index.js.map +1 -0
  111. package/dist/cli/utils/assets.d.ts +3 -0
  112. package/dist/cli/utils/assets.js +20 -0
  113. package/dist/cli/utils/assets.js.map +1 -0
  114. package/dist/cli/utils/install-assets.d.ts +39 -0
  115. package/dist/cli/utils/install-assets.js +141 -0
  116. package/dist/cli/utils/install-assets.js.map +1 -0
  117. package/dist/cli/utils/lang.d.ts +1 -0
  118. package/dist/cli/utils/lang.js +32 -0
  119. package/dist/cli/utils/lang.js.map +1 -0
  120. package/dist/cli/utils/managed-assets.d.ts +9 -0
  121. package/dist/cli/utils/managed-assets.js +20 -0
  122. package/dist/cli/utils/managed-assets.js.map +1 -0
  123. package/dist/cli/utils/mcp-register.d.ts +2 -0
  124. package/dist/cli/utils/mcp-register.js +132 -0
  125. package/dist/cli/utils/mcp-register.js.map +1 -0
  126. package/dist/cli/utils/sync-vault.d.ts +14 -0
  127. package/dist/cli/utils/sync-vault.js +132 -0
  128. package/dist/cli/utils/sync-vault.js.map +1 -0
  129. package/dist/cli/utils/ui.d.ts +14 -0
  130. package/dist/cli/utils/ui.js +78 -0
  131. package/dist/cli/utils/ui.js.map +1 -0
  132. package/dist/cli/utils/version.d.ts +1 -0
  133. package/dist/cli/utils/version.js +4 -0
  134. package/dist/cli/utils/version.js.map +1 -0
  135. package/dist/config.d.ts +127 -0
  136. package/dist/config.js +356 -0
  137. package/dist/config.js.map +1 -0
  138. package/dist/core.d.ts +106 -0
  139. package/dist/core.js +286 -0
  140. package/dist/core.js.map +1 -0
  141. package/dist/db/consolidation.d.ts +14 -0
  142. package/dist/db/consolidation.js +28 -0
  143. package/dist/db/consolidation.js.map +1 -0
  144. package/dist/db/index.d.ts +22 -0
  145. package/dist/db/index.js +39 -0
  146. package/dist/db/index.js.map +1 -0
  147. package/dist/db/schema.d.ts +7 -0
  148. package/dist/db/schema.js +175 -0
  149. package/dist/db/schema.js.map +1 -0
  150. package/dist/index.d.ts +3 -0
  151. package/dist/index.js +5 -0
  152. package/dist/index.js.map +1 -0
  153. package/dist/server.d.ts +6 -0
  154. package/dist/server.js +303 -0
  155. package/dist/server.js.map +1 -0
  156. package/dist/services/capture.d.ts +101 -0
  157. package/dist/services/capture.js +297 -0
  158. package/dist/services/capture.js.map +1 -0
  159. package/dist/services/enhance.d.ts +51 -0
  160. package/dist/services/enhance.js +184 -0
  161. package/dist/services/enhance.js.map +1 -0
  162. package/dist/services/layer0.d.ts +24 -0
  163. package/dist/services/layer0.js +90 -0
  164. package/dist/services/layer0.js.map +1 -0
  165. package/dist/services/maintenance.d.ts +27 -0
  166. package/dist/services/maintenance.js +73 -0
  167. package/dist/services/maintenance.js.map +1 -0
  168. package/dist/services/retrieval.d.ts +120 -0
  169. package/dist/services/retrieval.js +571 -0
  170. package/dist/services/retrieval.js.map +1 -0
  171. package/dist/services/startup.d.ts +28 -0
  172. package/dist/services/startup.js +112 -0
  173. package/dist/services/startup.js.map +1 -0
  174. package/dist/skill-context/ask-global.d.ts +8 -0
  175. package/dist/skill-context/ask-global.js +21 -0
  176. package/dist/skill-context/ask-global.js.map +1 -0
  177. package/dist/skill-context/base.d.ts +48 -0
  178. package/dist/skill-context/base.js +5 -0
  179. package/dist/skill-context/base.js.map +1 -0
  180. package/dist/skill-context/daily-global.d.ts +8 -0
  181. package/dist/skill-context/daily-global.js +25 -0
  182. package/dist/skill-context/daily-global.js.map +1 -0
  183. package/dist/skill-context/index.d.ts +32 -0
  184. package/dist/skill-context/index.js +171 -0
  185. package/dist/skill-context/index.js.map +1 -0
  186. package/dist/skill-context/knowledge-strict.d.ts +8 -0
  187. package/dist/skill-context/knowledge-strict.js +26 -0
  188. package/dist/skill-context/knowledge-strict.js.map +1 -0
  189. package/dist/skill-context/review-strict.d.ts +8 -0
  190. package/dist/skill-context/review-strict.js +26 -0
  191. package/dist/skill-context/review-strict.js.map +1 -0
  192. package/dist/skill-context/revise-strict.d.ts +8 -0
  193. package/dist/skill-context/revise-strict.js +26 -0
  194. package/dist/skill-context/revise-strict.js.map +1 -0
  195. package/dist/skill-context/seed-profiles.d.ts +21 -0
  196. package/dist/skill-context/seed-profiles.js +80 -0
  197. package/dist/skill-context/seed-profiles.js.map +1 -0
  198. package/dist/types.d.ts +165 -0
  199. package/dist/types.js +76 -0
  200. package/dist/types.js.map +1 -0
  201. package/dist/utils/context-policy.d.ts +57 -0
  202. package/dist/utils/context-policy.js +333 -0
  203. package/dist/utils/context-policy.js.map +1 -0
  204. package/dist/utils/scan-state.d.ts +41 -0
  205. package/dist/utils/scan-state.js +79 -0
  206. package/dist/utils/scan-state.js.map +1 -0
  207. package/dist/utils/segmenter.d.ts +19 -0
  208. package/dist/utils/segmenter.js +75 -0
  209. package/dist/utils/segmenter.js.map +1 -0
  210. package/dist/utils/shared.d.ts +103 -0
  211. package/dist/utils/shared.js +313 -0
  212. package/dist/utils/shared.js.map +1 -0
  213. package/dist/utils/vault-indexer.d.ts +53 -0
  214. package/dist/utils/vault-indexer.js +256 -0
  215. package/dist/utils/vault-indexer.js.map +1 -0
  216. package/package.json +59 -0
@@ -0,0 +1,199 @@
1
+ ---
2
+ name: read-pdf
3
+ description: "Extract text, charts (Vision analysis), math formulas (to LaTeX), and tables (to Markdown) from PDF files, producing JSON intermediate data for /knowledge, /ask, /revise and other skills to consume. Supports page ranges and chapter name positioning. Use this skill when the user needs to read PDF content, extract specific pages, parse book chapters, or says '/read-pdf'. Also called internally by other skills."
4
+ version: 1.0.0
5
+ dependencies:
6
+   templates: []
7
+   prompts: []
8
+   schemas: []
9
+   agents: []
10
+ ---
11
+
12
+ > [!config]
13
+ > Path references in this skill use logical names (e.g., `{resources directory}`).
14
+ > The Orchestrator resolves actual paths from `lifeos.yaml` and injects them into the context.
15
+ > Path mappings:
16
+ > - `{resources directory}` → directories.resources
17
+
18
+ You are LifeOS's PDF parsing tool, transforming PDF pages into structured JSON intermediate data. You combine text extraction with Vision image analysis to ensure charts, formulas, and tables are accurately captured for downstream skill consumption.
19
+
20
+ **Language rule**: All responses and generated content must be in Chinese (except JSON field names).
21
+
22
+ **Invocation modes**: Can be invoked directly by the user, or called internally by other skills (`/knowledge`, `/ask`, etc.). When called by these skills, simply return the JSON intermediate output as their data source — no manual chaining by the user is needed.
23
+
24
+ # Dependencies
25
+
26
+ Verify dependencies are installed before first use:
27
+
28
+ ```bash
29
+ # PyMuPDF (text extraction + page rendering)
30
+ pip install PyMuPDF Pillow
31
+ ```
32
+
33
+ If the user's environment lacks Python, prompt them to install it before continuing.
34
+
35
+ ## Script Entry Point
36
+
37
+ Prefer calling the local script for PDF page/chapter lookup, text extraction, and page rendering:
38
+
39
+ ```bash
40
+ python .agents/skills/read-pdf/scripts/read_pdf.py <PDF path> <page range or chapter name>
41
+ ```
42
+
43
+ Examples:
44
+
45
+ ```bash
46
+ python .agents/skills/read-pdf/scripts/read_pdf.py {resources directory}/Books/VGT/vgt.pdf 245-260
47
+ python .agents/skills/read-pdf/scripts/read_pdf.py {resources directory}/Books/VGT/vgt.pdf "Chapter 3"
48
+ python .agents/skills/read-pdf/scripts/read_pdf.py {resources directory}/Books/VGT/vgt.pdf --list-toc
49
+ ```
50
+
51
+ Script responsibilities:
52
+
53
+ - Only process matched pages; do not load the entire PDF into downstream context
54
+ - Output JSON intermediate results containing `full_text`, `images`, `text_layer_missing_pages`
55
+ - Visual analysis of charts, formulas, and tables is handled by downstream skills based on the matched pages
56
+
57
+ # Input Protocol
58
+
59
+ ## Required Parameters
60
+
61
+ | Parameter | Format | Example |
62
+ |-----------|--------|---------|
63
+ | PDF path | Relative path within Vault or absolute path | `{resources directory}/Books/VGT/vgt.pdf` |
64
+ | Page range | Page numbers, range, or chapter name | `245-260`, `Chapter 5`, `Chapter 3` |
65
+
66
+ ## Page Resolution Rules
67
+
68
+ - **Numeric range**: `245-260` → use directly (PDF page numbers, starting from 1)
69
+ - **Single page**: `245` → that page only
70
+ - **Chapter name**: `Chapter 5` / `Chapter 3` → first extract TOC via PyMuPDF (`doc.get_toc()`), match chapter title, determine start and end pages
71
+ - **Chapter not found**: Output the TOC list for user selection; do not guess
72
+
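The resolution rules above can be sketched in a few lines of Python. This is a minimal illustration, not the logic of the bundled `read_pdf.py`: the function name `resolve_pages`, the case-insensitive substring match, and the way a chapter's end page is inferred from the next TOC entry are all assumptions. The `toc` argument uses the `[level, title, page]` shape that PyMuPDF's `doc.get_toc()` returns.

```python
import re


def resolve_pages(spec, total_pages, toc):
    """Return a 1-based inclusive (start, end) page range for `spec`.

    `spec` is "245", "245-260", or a chapter name. `toc` is a list of
    [level, title, page] entries. Raises ValueError when the range is out
    of bounds or no chapter matches, so the caller can show the TOC
    instead of guessing.
    """
    spec = spec.strip()
    match = re.fullmatch(r"(\d+)(?:\s*-\s*(\d+))?", spec)
    if match:
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        if not (1 <= start <= end <= total_pages):
            raise ValueError(
                f"pages {spec!r} out of range; PDF has {total_pages} pages"
            )
        return start, end

    # Chapter name: take the first TOC title containing the spec (assumed rule).
    for i, (level, title, page) in enumerate(toc):
        if spec.lower() in title.lower():
            end = total_pages
            # The chapter ends just before the next entry at the same or a
            # higher level; nested subsections are skipped.
            for next_level, _, next_page in toc[i + 1:]:
                if next_level <= level:
                    end = max(page, next_page - 1)
                    break
            return page, end
    raise ValueError(f"chapter {spec!r} not found in TOC")
```

Because the match is a plain substring test, `Chapter 1` would also match a title such as `Chapter 10`; a stricter matcher may be needed for long books.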
73
+ # Processing Flow
74
+
75
+ ```dot
76
+ digraph read_pdf {
77
+   rankdir=TB;
78
+   "Receive PDF + pages" -> "Check dependencies";
79
+   "Check dependencies" -> "Resolve page range";
80
+   "Resolve page range" -> "Chapter name?" [label="is chapter name"];
81
+   "Resolve page range" -> "Extract text" [label="is numeric"];
82
+   "Chapter name?" -> "Extract TOC & match" -> "Extract text";
83
+   "Extract text" -> "Per-page PyMuPDF get_text()";
84
+   "Per-page PyMuPDF get_text()" -> "Render each page as 300DPI PNG";
85
+   "Render each page as 300DPI PNG" -> "Claude Vision analyzes each page image";
86
+   "Claude Vision analyzes each page image" -> "Merge into output JSON";
87
+ }
88
+ ```
89
+
90
+ ## Step 1: Extract Full Text
91
+
92
+ ```python
93
+ import fitz  # PyMuPDF
94
+
95
+ doc = fitz.open(pdf_path)
96
+ pages_text = {}
97
+ for page_num in range(start - 1, end):  # 0-indexed
98
+     page = doc[page_num]
99
+     pages_text[page_num + 1] = page.get_text()
100
+ ```
101
+
102
+ - Preserve original pagination structure; store each page independently
103
+ - For large PDFs (300+ pages), **only process the specified range** — do not load the full text
104
+
105
+ ## Step 2: Render Specified Pages as 300DPI PNG
106
+
107
+ ```python
108
+ import os, tempfile
109
+
110
+ output_dir = tempfile.mkdtemp(prefix="read-pdf-")
111
+ png_paths = []
112
+ for page_num in range(start - 1, end):
113
+     page = doc[page_num]
114
+     pix = page.get_pixmap(dpi=300)
115
+     png_path = os.path.join(output_dir, f"page_{page_num + 1}.png")
116
+     pix.save(png_path)
117
+     png_paths.append(png_path)
118
+ ```
119
+
120
+ ## Step 3: Claude Vision Analyzes Each Page Image
121
+
122
+ Read each PNG using the Read tool, then analyze and extract:
123
+
124
+ 1. **Charts**: Identify chart type, describe data trends and key findings
125
+ 2. **Formulas**: Transcribe into LaTeX format, preserving the original book's symbol conventions
126
+ 3. **Tables**: Convert to Markdown table format
127
+
128
+ **Key**: Formulas must faithfully follow the original book's symbols; do not substitute with external conventions.
129
+
130
+ ## Step 4: Assemble JSON Output
131
+
132
+ Merge all extracted results into structured JSON and write to a temporary file:
133
+
134
+ ```jsonc
135
+ {
136
+   "source": "{resources directory}/Books/VGT/vgt.pdf",
137
+   "pages": [245, 246, 247],
138
+   "full_text": {
139
+     "245": "Full text of page 245...",
140
+     "246": "Full text of page 246..."
141
+   },
142
+   "charts": [
143
+     {
144
+       "page": 245,
145
+       "description": "Bar chart: order distribution of various groups",
146
+       "data_summary": "D4 order 8, S3 order 6, V4 order 4"
147
+     }
148
+   ],
149
+   "formulas": [
150
+     {
151
+       "page": 246,
152
+       "latex": "$|G| = |H| \\cdot [G:H]$",
153
+       "context": "Statement of Lagrange's theorem"
154
+     }
155
+   ],
156
+   "tables": [
157
+     {
158
+       "page": 247,
159
+       "markdown": "| Group | Order | Type |\n|---|---|---|\n| $D_4$ | 8 | Dihedral group |",
160
+       "caption": "Classification of common finite groups"
161
+     }
162
+   ]
163
+ }
164
+ ```
165
+
166
+ Output path: `/tmp/read-pdf-<timestamp>.json`
167
+
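As a rough sketch of this assembly step, the helpers below merge per-page text and Vision findings into the structure shown above and write it to a timestamped file. The names `assemble_output`, `write_output`, and `count_summary` are hypothetical, and using `tempfile.gettempdir()` rather than a hard-coded `/tmp` is an assumption made for portability; the bundled script may do this differently.

```python
import json
import os
import tempfile
import time


def assemble_output(source, pages_text, charts=(), formulas=(), tables=()):
    """Merge per-page text and Vision findings into one JSON-ready dict."""
    page_numbers = sorted(pages_text)
    return {
        "source": source,
        "pages": page_numbers,
        # JSON object keys must be strings, so page numbers are stringified.
        "full_text": {str(p): pages_text[p] for p in page_numbers},
        # Empty inputs stay empty lists; nothing is invented for them.
        "charts": list(charts),
        "formulas": list(formulas),
        "tables": list(tables),
    }


def write_output(result, out_dir=None):
    """Write the result to read-pdf-<timestamp>.json and return its path."""
    out_dir = out_dir or tempfile.gettempdir()
    path = os.path.join(out_dir, f"read-pdf-{int(time.time())}.json")
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII text readable in the file.
        json.dump(result, f, ensure_ascii=False, indent=2)
    return path


def count_summary(result):
    """Counts for the conversation summary (pages, charts, formulas, tables)."""
    return {key: len(result[key]) for key in ("pages", "charts", "formulas", "tables")}
```

Returning counts rather than a formatted sentence leaves the wording of the summary to the caller.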
168
+ # Output Specifications
169
+
170
+ - Provide the JSON file path to the user for downstream skills to read
171
+ - Also give a **summary** in the conversation: extracted N pages of text, M charts, K formulas, J tables
172
+ - If a page has no charts/formulas/tables, leave the corresponding arrays empty — do not fabricate content
173
+ - **Do not perform knowledge organization** — this is an intermediate product; organization is handled by `/knowledge`, `/ask`, `/revise`, and other skills
174
+
175
+ # Common Issues
176
+
177
+ | Issue | Handling |
178
+ |-------|----------|
179
+ | Encrypted/protected PDF | Prompt the user to decrypt first |
180
+ | Scanned PDF (no text layer) | When `page.get_text()` returns an empty string, rely entirely on Vision analysis of the PNGs |
181
+ | Page number out of range | Show total PDF page count, ask user to correct |
182
+ | Chapter name match failure | Output TOC for selection |
183
+ | Single range too large (>50 pages) | Suggest batch processing, 20-30 pages per batch |
184
+
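Two of the cases in the table lend themselves to small helpers. The sketch below is illustrative only: `split_into_batches` and its default of 25 pages (inside the suggested 20-30 range) are assumptions, and `text_layer_missing_pages` simply treats a page whose extracted text is empty or whitespace as lacking a text layer.

```python
def split_into_batches(start, end, batch_size=25):
    """Split a 1-based inclusive page range into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    first = start
    while first <= end:
        last = min(first + batch_size - 1, end)
        batches.append((first, last))
        first = last + 1
    return batches


def text_layer_missing_pages(pages_text):
    """Return page numbers whose extracted text is empty or only whitespace."""
    return sorted(p for p, text in pages_text.items() if not text.strip())
```

Pages reported by the second helper are the ones for which Vision analysis of the rendered PNG is the only source of content.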
185
+ # Memory System Integration
186
+
187
+ > read-pdf is a tool skill, typically called internally by other skills, and does not need full memory integration.
188
+ > It records skill completion only when invoked directly by the user.
189
+
190
+ ### Skill Completion (direct invocation only)
191
+
192
+ ```
193
+ memory_skill_complete(
194
+   skill_name="read-pdf",
195
+   summary="Extracted PDF <filename> pages X-Y",
196
+   scope="read-pdf",
197
+   refresh_targets=[]
198
+ )
199
+ ```
@@ -0,0 +1,199 @@
1
+ ---
2
+ name: read-pdf
3
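+ description: "Extracts text, charts (Vision analysis), math formulas (to LaTeX), and tables (to Markdown) from PDF files, producing JSON intermediate data for /knowledge, /ask, /revise, and other skills to consume. Supports locating content by page range or chapter name. Use this skill when the user needs to read PDF content, extract specific pages, parse book chapters, or says '/read-pdf'. Also called automatically by other skills internally."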
4
+ version: 1.0.0
5
+ dependencies:
6
+   templates: []
7
+   prompts: []
8
+   schemas: []
9
+   agents: []
10
+ ---
11
+
12
+ > [!config]
13
+ > Path references in this skill use logical names (e.g., `{资源目录}`).
14
+ > The Orchestrator resolves actual paths from `lifeos.yaml` and injects them into the context.
15
+ > Path mappings:
16
+ > - `{资源目录}` → directories.resources
17
+
18
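+ You are LifeOS's PDF parsing tool, converting PDF pages into structured JSON intermediate data. You combine text extraction with Vision image analysis so that charts, formulas, and tables are all captured accurately for downstream skills to consume.
19
+
20
+ **Language rule**: All responses and generated content must be in Chinese (except JSON field names).
21
+
22
+ **Invocation modes**: Can be invoked directly by the user or called internally by other skills (`/knowledge`, `/ask`, etc.). When called by these skills, simply return the JSON intermediate output as their data source; the user does not need to chain the steps manually.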
23
+
24
+ # Dependencies
25
+
26
+ Verify that dependencies are installed before first use:
27
+
28
+ ```bash
29
+ # PyMuPDF (text extraction + page rendering)
30
+ pip install PyMuPDF Pillow
31
+ ```
32
+
33
+ If the user's environment lacks Python, prompt them to install it before continuing.
34
+
35
+ ## Script Entry Point
36
+
37
+ Prefer calling the local script to handle PDF page/chapter lookup, text extraction, and page rendering:
38
+
39
+ ```bash
40
+ python .agents/skills/read-pdf/scripts/read_pdf.py <PDF path> <page range or chapter name>
41
+ ```
42
+
43
+ Examples:
44
+
45
+ ```bash
46
+ python .agents/skills/read-pdf/scripts/read_pdf.py {资源目录}/Books/VGT/vgt.pdf 245-260
47
+ python .agents/skills/read-pdf/scripts/read_pdf.py {资源目录}/Books/VGT/vgt.pdf "第3章"
48
+ python .agents/skills/read-pdf/scripts/read_pdf.py {资源目录}/Books/VGT/vgt.pdf --list-toc
49
+ ```
50
+
51
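+ Script responsibilities:
52
+
53
+ - Process only the matched pages; do not load the entire PDF into downstream context
54
+ - Output JSON intermediate results containing `full_text`, `images`, and `text_layer_missing_pages`
55
+ - Visual analysis of charts, formulas, and tables is continued by downstream skills based on these matched pages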
56
+
57
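+ # Input Protocol
58
+
59
+ ## Required Parameters
60
+
61
+ | Parameter | Format | Example |
62
+ |-----------|--------|---------|
63
+ | PDF path | Relative path within the Vault, or an absolute path | `{资源目录}/Books/VGT/vgt.pdf` |
64
+ | Page range | Page number, range, or chapter name | `245-260`, `Chapter 5`, `第3章` |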
65
+
66
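+ ## Page Resolution Rules
67
+
68
+ - **Numeric range**: `245-260` → use directly (PDF page numbers, starting from 1)
69
+ - **Single page**: `245` → that page only
70
+ - **Chapter name**: `Chapter 5` / `第3章` → first extract the TOC with PyMuPDF (`doc.get_toc()`), match the chapter title, then determine the start and end pages
71
+ - **Chapter not found**: output the TOC list for the user to choose from; do not guess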
72
+
73
+ # Processing Flow
74
+
75
+ ```dot
76
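+ digraph read_pdf {
77
+   rankdir=TB;
78
+   "Receive PDF + pages" -> "Check dependencies";
79
+   "Check dependencies" -> "Resolve page range";
80
+   "Resolve page range" -> "Chapter name?" [label="is chapter name"];
81
+   "Resolve page range" -> "Extract text" [label="is numeric"];
82
+   "Chapter name?" -> "Extract TOC & match" -> "Extract text";
83
+   "Extract text" -> "Per-page PyMuPDF get_text()";
84
+   "Per-page PyMuPDF get_text()" -> "Render each page as 300DPI PNG";
85
+   "Render each page as 300DPI PNG" -> "Claude Vision analyzes each page image";
86
+   "Claude Vision analyzes each page image" -> "Merge into output JSON";
87
+ }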
88
+ ```
89
+
90
+ ## Step 1: Extract Full Text
91
+
92
+ ```python
93
+ import fitz  # PyMuPDF
94
+
95
+ doc = fitz.open(pdf_path)
96
+ pages_text = {}
97
+ for page_num in range(start - 1, end):  # 0-indexed
98
+     page = doc[page_num]
99
+     pages_text[page_num + 1] = page.get_text()
100
+ ```
101
+
102
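+ - Preserve the original pagination structure; store each page independently
103
+ - For large PDFs (300+ pages), **process only the specified range**; do not load the full text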
104
+
105
+ ## Step 2: Render Specified Pages as 300DPI PNG
106
+
107
+ ```python
108
+ import os, tempfile
109
+
110
+ output_dir = tempfile.mkdtemp(prefix="read-pdf-")
111
+ png_paths = []
112
+ for page_num in range(start - 1, end):
113
+     page = doc[page_num]
114
+     pix = page.get_pixmap(dpi=300)
115
+     png_path = os.path.join(output_dir, f"page_{page_num + 1}.png")
116
+     pix.save(png_path)
117
+     png_paths.append(png_path)
118
+ ```
119
+
120
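+ ## Step 3: Claude Vision Analyzes Each Page Image
121
+
122
+ Read each PNG with the Read tool, then analyze and extract:
123
+
124
+ 1. **Charts**: identify the chart type; describe data trends and key findings
125
+ 2. **Formulas**: transcribe into LaTeX, preserving the original book's symbol conventions
126
+ 3. **Tables**: convert to Markdown table format
127
+
128
+ **Key**: Formulas must stay faithful to the original book's symbols; do not substitute external conventions.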
129
+
130
+ ## Step 4: Assemble JSON Output
131
+
132
+ Merge all extracted results into structured JSON and write it to a temporary file:
133
+
134
+ ```jsonc
135
+ {
136
+   "source": "{资源目录}/Books/VGT/vgt.pdf",
137
+   "pages": [245, 246, 247],
138
+   "full_text": {
139
+     "245": "Full text of page 245...",
140
+     "246": "Full text of page 246..."
141
+   },
142
+   "charts": [
143
+     {
144
+       "page": 245,
145
+       "description": "Bar chart: distribution of the orders of various groups",
146
+       "data_summary": "D4 order 8, S3 order 6, V4 order 4"
147
+     }
148
+   ],
149
+   "formulas": [
150
+     {
151
+       "page": 246,
152
+       "latex": "$|G| = |H| \\cdot [G:H]$",
153
+       "context": "Statement of Lagrange's theorem"
154
+     }
155
+   ],
156
+   "tables": [
157
+     {
158
+       "page": 247,
159
+       "markdown": "| Group | Order | Type |\n|---|---|---|\n| $D_4$ | 8 | Dihedral group |",
160
+       "caption": "Classification of common finite groups"
161
+     }
162
+   ]
163
+ }
164
+ ```
165
+
166
+ Output path: `/tmp/read-pdf-<timestamp>.json`
167
+
168
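+ # Output Specifications
169
+
170
+ - Give the user the JSON file path so that downstream skills can read it
171
+ - Also give a **summary** in the conversation: N pages of text, M charts, K formulas, and J tables extracted
172
+ - If a page has no charts/formulas/tables, leave the corresponding arrays empty; do not fabricate content
173
+ - **Do not perform knowledge organization**: this is an intermediate product; organization is left to `/knowledge`, `/ask`, `/revise`, and other skills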
174
+
175
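+ # Common Issues
176
+
177
+ | Issue | Handling |
178
+ |-------|----------|
179
+ | Encrypted/protected PDF | Prompt the user to decrypt it first |
180
+ | Scanned PDF (no text layer) | When `page.get_text()` returns an empty string, rely entirely on Vision analysis of the PNGs |
181
+ | Page number out of range | Report the PDF's total page count and ask the user to correct it |
182
+ | Chapter name match failure | Output the TOC for selection |
183
+ | Single range too large (>50 pages) | Suggest batch processing, 20-30 pages per batch |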
184
+
185
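+ # Memory System Integration
186
+
187
+ > read-pdf is a tool skill, usually called internally by other skills, and does not need full memory integration.
188
+ > It records a skill-completion event only when invoked directly by the user.
189
+
190
+ ### Skill Completion (direct invocation only)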
191
+
192
+ ```
193
+ memory_skill_complete(
194
+   skill_name="read-pdf",
195
+   summary="Extracted PDF <filename> pages X-Y",
196
+   scope="read-pdf",
197
+   refresh_targets=[]
198
+ )
199
+ ```