@biaoo/tiangong-wiki 0.2.0 → 0.2.2

Files changed (37)
  1. package/README.md +39 -50
  2. package/README.zh-CN.md +39 -50
  3. package/SKILL.md +75 -107
  4. package/assets/templates/achievement.md +8 -8
  5. package/assets/templates/bridge.md +8 -8
  6. package/assets/templates/concept.md +14 -18
  7. package/assets/templates/faq.md +8 -10
  8. package/assets/templates/lesson.md +8 -8
  9. package/assets/templates/method.md +16 -8
  10. package/assets/templates/misconception.md +10 -10
  11. package/assets/templates/person.md +8 -8
  12. package/assets/templates/research-note.md +10 -10
  13. package/assets/templates/resume.md +11 -10
  14. package/assets/templates/source-summary.md +8 -12
  15. package/assets/tiangong-wiki-framework.png +0 -0
  16. package/assets/wiki.config.default.json +6 -3
  17. package/dist/commands/asset.js +21 -0
  18. package/dist/commands/skill.js +78 -0
  19. package/dist/commands/template.js +30 -0
  20. package/dist/core/cli-env.js +34 -5
  21. package/dist/core/global-config.js +61 -0
  22. package/dist/core/onboarding.js +252 -102
  23. package/dist/core/workflow-context.js +58 -21
  24. package/dist/core/workspace-skills.js +496 -60
  25. package/dist/daemon/server.js +8 -0
  26. package/dist/index.js +36 -1
  27. package/dist/operations/asset.js +81 -0
  28. package/dist/operations/query.js +25 -1
  29. package/dist/operations/template-lint.js +160 -0
  30. package/dist/utils/asset.js +75 -0
  31. package/dist/utils/errors.js +6 -0
  32. package/package.json +2 -1
  33. package/references/cli-interface.md +32 -1
  34. package/references/template-design-guide.md +125 -113
  35. package/references/{env.md → troubleshooting.md} +64 -33
  36. package/references/vault-to-wiki-instruction.md +109 -51
  37. package/references/wiki-maintenance-instruction.md +15 -15
@@ -1,6 +1,6 @@
- # Vault To Wiki Instruction
+ # Vault-to-Wiki Instruction
 
- Use this instruction when a vault queue item must be processed into durable wiki knowledge.
+ Use this instruction when vault files need to be processed into durable wiki knowledge.
 
  ## Goal
 
@@ -14,86 +14,144 @@ The output may be:
 
  Do not assume any page type is the default destination.
 
- ## Core Rules
+ ---
+
+ ## Phase 1: Read the File
+
+ ### Skill Discovery
+
+ Parser skills are installed under `<workspace-root>/.agents/skills/`. Do not assume any parser skill is present — check what is actually available.
+
+ | Skill | Purpose |
+ |---|---|
+ | `pdf` | Extract text and structure from PDF files |
+ | `docx` | Extract text and structure from DOCX files |
+ | `pptx` | Extract text, slide structure, and speaker notes from PPTX files |
+ | `xlsx` | Extract tables and data from XLSX/CSV files |
+
+ When a parser skill is available and the vault file matches its type, use the skill. Read the skill's SKILL.md for interface details before invoking.
+
+ If a parser skill fails due to missing runtime dependencies, attempt to install (e.g., `pip install`, `npm install`) and retry. If resolution fails, fall back to direct reading and note the failure in the result manifest.
+
+ ### File Type Strategies
+
+ **Markdown / Plain Text (md, txt)**
+ Read directly. For large files (>5000 lines), read in sections. Parse YAML frontmatter separately if present.
+
+ **PDF**
+ Prefer the `pdf` parser skill. Without it: attempt direct read; if unreadable, skip. Use PDF metadata (title, author, date, subject) to inform decisions.
+
+ **Word Documents (docx)**
+ Prefer the `docx` parser skill. Without it: skip (DOCX is a ZIP/XML archive, unreliable to read directly). Use document properties when available.
+
+ **Presentations (pptx)**
+ Prefer the `pptx` parser skill. Speaker notes are often more valuable than slide text. Without the skill: skip.
+
+ **Spreadsheets (xlsx, csv)**
+ Prefer the `xlsx` skill for xlsx. CSV can be read directly (check encoding — Chinese content may use GBK/GB2312). Not all tabular data is knowledge — look for definitions, rules, or structured descriptions rather than raw dumps.
+
+ **Structured Data (json, yaml, yml)**
+ Read and parse directly. Evaluate whether the structure itself is the knowledge (e.g., a schema) or merely a container.
+
+ **Image Files (png, jpg, jpeg, webp)**
+ Use vision to read and evaluate. If the image has extractable value (diagrams, flowcharts, visualizations), save with `tiangong-wiki asset save` and reference with `tiangong-wiki asset ref`. Every image in a wiki page MUST have a textual description — images cannot be indexed.
+
+ **Images Embedded in Documents**
+ Use vision to understand each image in context. Extract only high-value images via the relevant parser skill. Use `tiangong-wiki asset save/ref` to manage extracted files.
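
Reviewer note: the skill discovery and file type strategies added above can be read as a small dispatch table. The Python sketch below is only an illustration and is not part of tiangong-wiki. `discover_skills` and `plan_read` are hypothetical helpers, the layout "one directory per skill containing a `SKILL.md`" is inferred from the text rather than confirmed, and returning `skip` for xlsx without its skill or for unknown extensions is an assumption the instruction does not spell out.

```python
from pathlib import Path

# Extension -> parser skill named in the skill table above.
SKILL_FOR_EXT = {"pdf": "pdf", "docx": "docx", "pptx": "pptx", "xlsx": "xlsx"}
# Formats the instruction says can be read without a parser skill.
DIRECT_READ = {"md", "txt", "csv", "json", "yaml", "yml"}
IMAGE_EXTS = {"png", "jpg", "jpeg", "webp"}


def discover_skills(workspace_root: str) -> set[str]:
    """List installed skills (assumed layout: <root>/.agents/skills/<name>/SKILL.md)."""
    skills_dir = Path(workspace_root) / ".agents" / "skills"
    if not skills_dir.is_dir():
        return set()
    return {p.name for p in skills_dir.iterdir() if (p / "SKILL.md").is_file()}


def plan_read(file_name: str, available_skills: set[str]) -> str:
    """Return 'skill:<name>', 'direct', 'vision', or 'skip' for a vault file."""
    ext = Path(file_name).suffix.lower().lstrip(".")
    skill = SKILL_FOR_EXT.get(ext)
    if skill and skill in available_skills:
        return f"skill:{skill}"
    if ext in DIRECT_READ:
        return "direct"
    if ext in IMAGE_EXTS:
        return "vision"
    if ext == "pdf":
        # Without the pdf skill: attempt a direct read; the caller skips if unreadable.
        return "direct"
    # docx/pptx without their skill are skipped per the text; xlsx and
    # unknown extensions are skipped here by assumption.
    return "skip"
```

A caller would typically compute `discover_skills(root)` once per run and pass the result to `plan_read` for each vault file, so the check for what is actually installed is never skipped.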
+
+ ### Large and Complex Files
+
+ - Read incrementally — do not load the entire file at once.
+ - Summarize structure first (TOC, slide titles, sheet names) to identify high-value sections.
+ - Not every section is worth extracting.
+
+ ### Encoding and Edge Cases
+
+ - Chinese content may use GBK, GB2312, or Big5. Try alternative encodings if garbled.
+ - Corrupted files: skip with a clear reason.
+ - Password-protected or empty files: skip immediately.
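
Reviewer note: the "try alternative encodings if garbled" guidance above could be implemented along these lines. This is a minimal sketch using Python's standard codecs, not code from the package. The helper name `decode_with_fallback` and the candidate order are assumptions.

```python
from typing import Optional, Sequence, Tuple

# Candidate encodings named in the instruction, tried after UTF-8.
CANDIDATE_ENCODINGS = ("utf-8", "gbk", "gb2312", "big5")


def decode_with_fallback(
    raw: bytes, encodings: Sequence[str] = CANDIDATE_ENCODINGS
) -> Optional[Tuple[str, str]]:
    """Return (text, encoding) for the first codec that decodes without error.

    Returns None when every candidate fails, which the caller can treat as
    an unreadable file and skip with a clear reason.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    return None
```

A decode that raises no error is not proof of the right encoding: many Big5 byte sequences also decode under GBK into different characters, so the result should still be checked for readable text before it is used.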
+
+ ---
+
+ ## Phase 2: Decide
+
+ ### Core Rules
 
  1. Discover the current ontology through the wiki CLI before deciding what to write.
  2. Search for relevant existing pages before creating new ones.
- 3. Treat all page types equally. Choose the best fit from the current wiki, not a hardcoded fallback.
+ 3. Treat all page types equally. Choose the best fit, not a hardcoded fallback.
  4. Skip transient, duplicate, or low-value files.
- 5. Preserve provenance with `sourceRefs` and any type-specific source fields already defined by the chosen template.
- 6. `sourceRefs` may only contain existing wiki page ids. Do not put raw vault file paths there; keep raw file provenance in the page body or a dedicated source field such as `vaultPath` when the chosen type supports it.
- 7. If the existing type system cannot represent the knowledge cleanly, prefer `propose_only` unless template evolution is explicitly allowed.
-
- ## Runtime Discovery
+ 5. Preserve provenance with `sourceRefs` and type-specific source fields defined by the chosen template.
+ 6. `sourceRefs` may only contain existing wiki page ids. Raw file provenance belongs in the page body or a field like `vaultPath`.
+ 7. Only write frontmatter fields declared by the chosen type (`tiangong-wiki type show <type>`). Do not invent ad-hoc fields.
+ 8. If the type system cannot represent the knowledge cleanly, prefer `propose_only` unless template evolution is explicitly allowed.
 
- Use the CLI as the source of truth for the current ontology and page set.
+ ### Runtime Discovery
 
- Prefer:
+ Use the CLI as source of truth:
 
  - `tiangong-wiki type list --format json`
  - `tiangong-wiki type show <type> --format json`
  - `tiangong-wiki type recommend --text "<summary>" --keywords "a,b,c" --limit 5 --format json`
- - `tiangong-wiki find`
- - `tiangong-wiki fts`
- - `tiangong-wiki page-info`
+ - `tiangong-wiki find` / `tiangong-wiki fts` / `tiangong-wiki page-info`
 
  Notes:
-
  - Do not use guessed subcommands such as `tiangong-wiki page find`.
- - `tiangong-wiki find` and `tiangong-wiki list` already emit JSON; do not append `--format json`.
-
- Do not rely on static prompt snapshots of types, templates, or pages.
+ - `find` and `list` already emit JSON; do not append `--format json`.
 
- ## Decision Model
+ ### Decision Model
 
- Choose exactly one decision:
+ Choose exactly one:
 
- - `skip`
- The file is noise, duplicate, transient, unreadable, or already fully represented.
+ - **`skip`** — the file is noise, duplicate, transient, unreadable, or already fully represented.
+ - **`apply`** — the file adds durable knowledge expressible with existing page types.
+ - **`propose_only`** — the file has durable value but the current type system is not a clean fit.
 
- - `apply`
- The file adds durable knowledge and you can express it with existing page types.
-
- - `propose_only`
- The file has durable value, but the current type system is not a clean fit and automatic template evolution is not allowed.
-
- ## Value Heuristics
+ ### Value Heuristics
 
  Favors `apply`:
-
- - durable documents with reusable knowledge
- - files that materially strengthen or revise an existing page
- - sources that introduce a clearly distinct knowledge object worth revisiting
+ - Durable documents with reusable knowledge
+ - Files that materially strengthen or revise an existing page
+ - Sources that introduce a clearly distinct knowledge object
 
  Favors `skip`:
-
- - temporary exports, dumps, or duplicates
- - opaque binaries with no useful extractable content
- - screenshots or images with no standalone evidence value
- - files whose substance is already covered by current pages
+ - Temporary exports, dumps, or duplicates
+ - Opaque binaries with no extractable content
+ - Screenshots with no standalone evidence value
+ - Files whose substance is already covered
 
  Favors `propose_only`:
+ - The file is valuable but existing types are clearly awkward or lossy
+ - A new type would materially improve ontology quality
 
- - the file is valuable
- - existing types are clearly awkward or lossy
- - creating a new type would materially improve ontology quality
+ ### Metadata Utilization
 
- ## Page Update Rules
+ | Metadata | Where Found | How to Use |
+ |---|---|---|
+ | Title | PDF, DOCX, PPTX properties | Inform page title and nodeId |
+ | Author | PDF, DOCX, PPTX properties | May indicate relevant `person` pages, inform provenance |
+ | Creation / modification date | Most formats | Inform `createdAt`, assess recency |
+ | Subject / keywords | PDF, DOCX properties | Inform tags and search during discovery |
+ | Slide / page count | PDF, PPTX | Gauge complexity, anticipate splitting needs |
 
- When applying changes:
+ Do not blindly copy metadata into wiki fields — use it as input alongside actual content.
 
- 1. Prefer updating an existing page if the source is a revision, appendix, or direct reinforcement of that page.
+ ---
+
+ ## Phase 3: Execute
+
+ ### Page Update Rules
+
+ 1. Prefer updating an existing page if the source is a revision, appendix, or reinforcement.
  2. Create a new page only when the knowledge object is distinct and deserves its own identity.
  3. Keep edits minimal, specific, and provenance-preserving.
- 4. For every changed page, run:
- - `tiangong-wiki sync --path <page>`
- - `tiangong-wiki lint --path <page> --format json`
-
- ## Manifest Contract
+ 4. After every change:
+ - `tiangong-wiki sync --path <page-id>`
+ - `tiangong-wiki lint --path <page-id> --format json`
 
- The workflow must write a valid `result.json` manifest.
+ ### Manifest Contract
 
- Minimum expectations:
+ The workflow must write a valid `result.json` manifest with these fields:
 
  - `status`
  - `decision`
@@ -22,20 +22,17 @@ The judgment layer is the agent: deciding whether knowledge is durable, which pa
  8. Run `tiangong-wiki sync --path <page-id>`.
  9. Re-run `tiangong-wiki lint --path <page-id>` when the change is localized.
 
- ## PageType Selection Table
- | pageType | Use When | Typical Trigger | Notes |
- | --- | --- | --- | --- |
- | `concept` | A stable concept, model, or principle should be reusable later | New synthesis from multiple sources | Best for durable understanding |
- | `misconception` | A wrong mental model was corrected | User or agent had a clear before/after correction | Record the failure mode and prevention cues |
- | `bridge` | Knowledge transfers from one domain to another | Cross-project or cross-discipline analogy | Use when the value is the transfer itself |
- | `source-summary` | The source itself deserves a reusable digest page | A source document should remain a first-class knowledge object | Preserve `vaultPath` and `sourceType`; this is not the default destination for every vault file |
- | `lesson` | A specific incident produced a durable lesson | Failure, success, or surprise with actionable aftermath | Keep the event and future action linked |
- | `method` | A repeatable process or recipe proved useful | Same process worked more than once | Capture applicability and evidence |
- | `person` | Someone's role, preferences, or influence matters again later | Recurring collaborator or decision-maker | Keep factual and context-specific |
- | `achievement` | A milestone, credential, or verifiable result matters | Award, publication, certification, milestone | Useful for profile and reporting reuse |
- | `resume` | A reusable positioning page is needed for different audiences | Tailored summary for applications or intros | Keep it current and audience-aware |
- | `research-note` | Investigation is ongoing and incomplete | Exploratory work with open questions | Prefer this over `concept` when not settled yet |
- | `faq` | The same question recurs often | Third repeat or clear repeating pattern | Optimize for rapid reuse |
+ ## PageType Selection
+
+ Do not assume a fixed set of page types. Always discover the current ontology at runtime:
+
+ ```bash
+ tiangong-wiki type list --format json
+ tiangong-wiki type show <type> --format json
+ tiangong-wiki type recommend --text "<summary>" --keywords "a,b,c" --limit 5 --format json
+ ```
+
+ Use `type recommend` to find the best fit for new knowledge. If no existing type fits cleanly, see `references/template-design-guide.md` for when and how to create a new type.
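
Reviewer note: for agents that script the runtime discovery step, the `type recommend` call above can be assembled as an argument list instead of a shell string. The sketch below uses only the flags shown in this diff. The helper names `type_recommend_args` and `run_json` are hypothetical, and the shape of the JSON output is deliberately not assumed.

```python
import json
import subprocess


def type_recommend_args(summary: str, keywords: list[str], limit: int = 5) -> list[str]:
    """Build the argv for `tiangong-wiki type recommend` from the documented flags."""
    return [
        "tiangong-wiki", "type", "recommend",
        "--text", summary,
        "--keywords", ",".join(keywords),
        "--limit", str(limit),
        "--format", "json",
    ]


def run_json(args: list[str]):
    """Run a CLI command and parse its stdout as JSON (requires the CLI on PATH)."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

Passing a list to `subprocess.run` avoids shell quoting problems when the summary contains quotes or spaces. With the CLI installed, a call would look like `run_json(type_recommend_args("notes on retry backoff", ["retry", "backoff"]))`.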
 
  ## Update Vs Create Decision Flow
  1. Start with a retrieval pass.
@@ -55,7 +52,7 @@ The judgment layer is the agent: deciding whether knowledge is durable, which pa
 
  ## Create Workflow
  1. Confirm no suitable active page exists.
- 2. Choose the page type from the table above.
+ 2. Choose the page type using `tiangong-wiki type recommend` or `tiangong-wiki type list`.
  3. Run `tiangong-wiki create --type <pageType> --title "<title>" [--node-id <nodeId>]`.
  4. Open the created Markdown file and fill the frontmatter-specific fields.
  5. Fill every body section in the template with concrete content, not placeholders.
@@ -113,6 +110,9 @@ Example flow:
  ```
 
  ## Page-Type Specific Guidance
+
+ > The canonical type list comes from `tiangong-wiki type list`. The tips below apply to commonly used types but may not cover all registered types.
+
  ### concept
  - Use for durable understanding, definitions, formulas, and reusable intuition.
  - Fill prerequisites, examples, confusions, and open questions.