@creativeaitools/agent-wiki 2.0.0

package/package.json ADDED
@@ -0,0 +1,54 @@
+ {
+   "name": "@creativeaitools/agent-wiki",
+   "version": "2.0.0",
+   "description": "Obsidian-compatible evidence-aware wiki tooling for agents.",
+   "type": "module",
+   "bin": {
+     "agent-wiki": "dist/src/cli.js"
+   },
+   "files": [
+     "dist/src",
+     "AGENTS.md",
+     "WIKI.md",
+     "README.md",
+     "ONBOARD.md",
+     "INBOX.md",
+     "AGENT-WIKI-SPEC-v2.md",
+     "_system/config.example.json",
+     "skills"
+   ],
+   "scripts": {
+     "build": "tsc",
+     "test": "npm run build && node --test dist/tests/*.test.js",
+     "check": "npm run test"
+   },
+   "devDependencies": {
+     "@types/node": "^24.0.0",
+     "typescript": "^5.8.0"
+   },
+   "engines": {
+     "node": ">=20"
+   },
+   "dependencies": {
+     "yaml": "^2.9.0"
+   },
+   "publishConfig": {
+     "access": "public"
+   },
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/jesse-lane-ai/agent-wiki.git"
+   },
+   "bugs": {
+     "url": "https://github.com/jesse-lane-ai/agent-wiki/issues"
+   },
+   "homepage": "https://github.com/jesse-lane-ai/agent-wiki#readme",
+   "license": "MIT",
+   "keywords": [
+     "agent",
+     "wiki",
+     "obsidian",
+     "knowledge-management",
+     "cli"
+   ]
+ }
@@ -0,0 +1,140 @@
1
+ ---
2
+ name: compile-wiki
3
+ description: "Run the compile pipeline whenever the underlying vault data changes. Trigger this skill whenever the user says \"compile the wiki\", \"regenerate the cache\", \"compile\", \"run compile\", or similar phrases indicating they want to update the machine-facing artifacts."
4
+ ---
5
+ # Compile the Wiki Cache
6
+
7
+ Run the compile pipeline to regenerate all machine-facing cache artifacts, the deterministic root page catalog, and maintenance reports from vault page frontmatter.
8
+
9
+ ---
10
+
11
+ ## When to run
12
+
13
+ Run the compile pipeline after:
14
+ - Adding or editing pages with structured frontmatter (claims, relations, timeline entries)
15
+ - Adding new question pages
16
+ - Adding source pages
17
+ - Adding source parent or source part pages for large documents
18
+ - Extracting knowledge primitives from source pages
19
+ - Creating or refreshing synthesis pages
20
+ - Resolving questions (status changes)
21
+ - Any bulk edit to page metadata
22
+
23
+ The compile pipeline is **safe to run at any time**. It is fully regenerative. It does not modify canonical knowledge pages, but it does rewrite the deterministic root `index.md` page catalog from `_system/cache/pages.json`.
24
+
25
+ ---
26
+
27
+ ## How to run
28
+
29
+ From the wiki root:
30
+
31
+ ```bash
32
+ agent-wiki compile
33
+ ```
34
+
35
+ With verbose output:
36
+
37
+ ```bash
38
+ agent-wiki compile --verbose
39
+ ```
40
+
41
+ Run this command from the repository root. The compile pipeline does not accept an alternate vault or wiki root.
42
+
43
+ ---
44
+
45
+ ## What it produces
46
+
47
+ ### Required cache files (`_system/cache/`)
48
+
49
+ | File | Purpose |
50
+ |---|---|
51
+ | `pages.json` | Normalized index of all parsed pages |
52
+ | `claims.jsonl` | All extracted claims with owning page info |
53
+ | `relations.jsonl` | All extracted relations |
54
+ | `agent-digest.json` | High-signal agent context pack |
55
+ | `contradictions.json` | Detected contradiction registry |
56
+ | `questions.json` | Question registry |
57
+ | `timeline-events.json` | Chronological event index |
58
+ | `source-index.json` | Source metadata registry |
59
+
60
+ ### Indexes (`_system/indexes/`)
61
+
62
+ | File | Purpose |
63
+ |---|---|
64
+ | `alias-index.json` | Alias → page ID map |
65
+ | `tag-index.json` | Tag → page IDs map |
66
+ | `id-to-path.json` | Page ID → path map |
67
+ | `path-to-id.json` | Path → page ID map |
68
+ | `pagetype-index.json` | Page type → page IDs map |
69
+
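The five index files listed above are straightforward projections of the compiled page list. The following is a minimal sketch of that projection, not the package's implementation: it assumes each page record exposes `id`, `path`, `pageType`, `tags`, and `aliases`, and the names `PageRecord`, `WikiIndexes`, and `buildIndexes` are hypothetical.

```typescript
// Hypothetical page record; real pages.json entries may carry more fields.
interface PageRecord {
  id: string;
  path: string;
  pageType: string;
  tags: string[];
  aliases: string[];
}

interface WikiIndexes {
  aliasIndex: Record<string, string>;      // alias -> page ID
  tagIndex: Record<string, string[]>;      // tag -> page IDs
  idToPath: Record<string, string>;        // page ID -> path
  pathToId: Record<string, string>;        // path -> page ID
  pageTypeIndex: Record<string, string[]>; // page type -> page IDs
}

// Append a page ID to a multi-valued entry, creating the entry on first use.
function addToList(index: Record<string, string[]>, key: string, pageId: string): void {
  const existing = index[key];
  if (existing) {
    existing.push(pageId);
  } else {
    index[key] = [pageId];
  }
}

function buildIndexes(pages: PageRecord[]): WikiIndexes {
  // Prototype-free objects so arbitrary tag or alias strings are safe keys.
  const out: WikiIndexes = {
    aliasIndex: Object.create(null),
    tagIndex: Object.create(null),
    idToPath: Object.create(null),
    pathToId: Object.create(null),
    pageTypeIndex: Object.create(null),
  };
  for (const page of pages) {
    out.idToPath[page.id] = page.path;
    out.pathToId[page.path] = page.id;
    addToList(out.pageTypeIndex, page.pageType, page.id);
    for (const tag of page.tags) {
      addToList(out.tagIndex, tag, page.id);
    }
    for (const alias of page.aliases) {
      // Assumption: when two pages share an alias, the later page wins.
      out.aliasIndex[alias] = page.id;
    }
  }
  return out;
}
```

Because every index is derived from the same page list, regenerating them on each compile keeps them mutually consistent, which is why hand edits to these files are discouraged.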
70
+ ### Root page catalog (`index.md`)
71
+
72
+ The compile pipeline runs:
73
+
74
+ ```bash
75
+ agent-wiki index --write --no-log
76
+ ```
77
+
78
+ This regenerates the full root `index.md` page catalog from `_system/cache/pages.json`.
79
+
80
+ `agent-wiki index` also supports:
81
+
82
+ ```bash
83
+ agent-wiki index --check
84
+ ```
85
+
86
+ Use `--check` when you need to verify that `index.md` matches the compiled page metadata without rewriting it.
87
+
88
+ ### Reports (`reports/`)
89
+
90
+ | Report | Purpose |
91
+ |---|---|
92
+ | `open-questions.md` | All open/active questions |
93
+ | `contradictions.md` | Tracked claim conflicts |
94
+ | `low-confidence.md` | Claims below confidence threshold |
95
+ | `claim-health.md` | Evidence gap and staleness overview |
96
+ | `stale-pages.md` | Pages not updated recently |
97
+ | `orphaned-claims.md` | Claims whose owning page is missing |
98
+ | `evidence-gaps.md` | Claims with no direct evidence |
99
+
100
+ ### Logs (`_system/logs/`)
101
+
102
+ On each run, the compile pipeline writes one operational log entry to `_system/logs/log.md` through `agent-wiki log`.
103
+
104
+ ---
105
+
106
+ ## Validation Responsibility
107
+
108
+ This skill owns validation, cache regeneration, root catalog generation, report generation, and compile logs.
109
+
110
+ When running compile:
111
+
112
+ - validate page frontmatter and generated records
113
+ - warn when authored knowledge pages are missing required Markdown body prose
114
+ - validate synthesis scope and warn when active synthesis pages have no listed source or claim basis
115
+ - validate large-source parent and source-part structure
116
+ - detect duplicate IDs and malformed records
117
+ - regenerate cache files, `_system/indexes/`, root `index.md`, and report artifacts
118
+ - write a concise operational log entry through `agent-wiki log`
119
+ - report validation issues clearly
120
+
121
+ If validation errors occur, fix the affected canonical page or structured frontmatter, then re-run this skill. Do not repair generated cache, index, report, or log files by hand.
122
+
123
+ ---
124
+
125
+ ## Requirements
126
+
127
+ - Node.js 20 or later (the package declares `"node": ">=20"` in `engines`).
128
+ - The only runtime dependency is `yaml`, which is installed with the package.
129
+
130
+ ---
131
+
132
+ ## Important rules
133
+
134
+ - Do NOT hand-edit files in `_system/cache/` or `_system/indexes/`. They are regenerated on each compile.
135
+ - Do NOT hand-edit `index.md` for durable prose. It is regenerated as the deterministic root page catalog.
136
+ - Reports in `reports/` are views — do not treat them as primary data.
137
+ - The compile pipeline reads `pageType`, `id`, `claims`, `relations`, and `timeline` from frontmatter.
138
+ - For source pages, the compile pipeline also preserves `sourceRole`, `parentSourceId`, `sourceParts`, `partIndex`, `partCount`, and `locator` in `pages.json` and `source-index.json`.
139
+ - Pages without frontmatter, or without `id` and `pageType`, are skipped.
140
+ - The compile pipeline only modifies generated artifacts and the deterministic root `index.md` catalog.
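The skip rule for pages without usable frontmatter can be pictured as a small predicate. This is a sketch under stated assumptions rather than the pipeline's actual check: it treats a page as skipped when either `id` or `pageType` is missing, and it also treats empty strings as missing. The names `ParsedFrontmatter` and `isCompilable` are hypothetical.

```typescript
// Hypothetical view of parsed frontmatter; null means the page had none.
type ParsedFrontmatter = { id?: unknown; pageType?: unknown } | null;

function isNonEmptyString(value: unknown): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// True when the page carries both a usable `id` and a usable `pageType`.
// Assumption: a page missing either field is skipped, not only one missing both.
function isCompilable(frontmatter: ParsedFrontmatter): boolean {
  if (frontmatter === null) {
    return false;
  }
  return isNonEmptyString(frontmatter.id) && isNonEmptyString(frontmatter.pageType);
}
```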
@@ -0,0 +1,350 @@
1
+ ---
2
+ name: extract-knowledge-primitives
3
+ description: "Extract knowledge primitives (entities, concepts, claims, questions, and relations) from source pages. Use this skill when the user says 'extract primitives', 'extract knowledge', 'process sources for structured data', or 'analyze sources'. This skill reads source pages with status: unprocessed and creates appropriate wiki pages."
4
+ ---
5
+
6
+ # Extract Knowledge Primitives
7
+
8
+ This skill defines the extraction workflow. It does not own the vault schema or synthesis prose workflow.
9
+
10
+ Runtime schema and common examples live in `WIKI.md` Section 4.1. Status enums live in `WIKI.md` Sections 5 and 6. Evidence rules live in `WIKI.md` Section 7. Relationship predicates live in `WIKI.md` Section 8. Entity and concept type enums live in `WIKI.md` Section 12.1. The vault behavior contract lives in `AGENTS.md`. The full project/development contract lives in `AGENT-WIKI-SPEC-v2.md`.
11
+
12
+ Use `AGENT-WIKI-SPEC-v2.md` only when changing project behavior, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 6, 7, 8, or 12.1 do not contain enough detail. If this skill or those `WIKI.md` sections conflict with `AGENT-WIKI-SPEC-v2.md`, follow `AGENT-WIKI-SPEC-v2.md`.
13
+
14
+ Authored body requirements for newly created `entity`, `concept`, `claim`, `question`, and `synthesis` pages live in `AGENT-WIKI-SPEC-v2.md` Section 7.10.
15
+ Deterministic page scaffolding lives in `AGENT-WIKI-SPEC-v2.md` Section 6.10. Use `agent-wiki create-page` when creating new primitive page files.
16
+
17
+ ## Core Principles
18
+
19
+ - Preserve source content. Add structure without rewriting human-authored prose.
20
+ - Use stable IDs. Reuse existing primitives when they already exist.
21
+ - Do not invent certainty. New source-extracted claims start `status: unverified` with `confidence: 0.60` unless the canonical spec says otherwise.
22
+ - Keep claims atomic. One proposition per claim.
23
+ - Treat evidence honestly. An excerpt can show that a source made a statement without proving the statement true.
24
+ - Use Obsidian wikilinks for internal vault references. Cross-vault Obsidian references are the exception: write them as standard markdown links with `obsidian://` URIs per `AGENT-WIKI-SPEC-v2.md` Section 8.6.
25
+
26
+ ## Step 1: Read the Contract and Runtime Reference
27
+
28
+ Before extracting anything, read:
29
+
30
+ 1. `AGENTS.md` for behavior rules.
31
+ 2. `WIKI.md` Sections 4.1, 5, 6, 7, 8, and 12.1 for runtime schema, field requirements, ID formats, enums, and common examples.
32
+
33
+ Read `AGENT-WIKI-SPEC-v2.md` only when changing the project itself, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 6, 7, 8, or 12.1 are insufficient.
34
+
35
+ Do not copy schemas from this skill when creating pages. Use `WIKI.md` Section 4.1 as the routine source of truth for ordinary vault schemas.
36
+ Use `agent-wiki create-page` for new page files so IDs, filenames, required frontmatter, and body requirements stay deterministic.
37
+ Do not launch `obsidian://` URIs. Resolve them only through local `_system/config.json` `knownVaults`; if no mapping exists, treat the URI as an opaque external reference.
38
+
39
+ ## Step 2: Find Source Pages Needing Extraction
40
+
41
+ Scan `sources/` for source pages that have not yet been processed for extraction.
42
+
43
+ A source page needs extraction when:
44
+
45
+ - `status: unprocessed`.
46
+ - `sourceRole` is absent, `whole`, or `part`.
47
+
48
+ A source page has already been extracted when:
49
+
50
+ - `status: processed`.
51
+
52
+ Large source parent pages are manifests and metadata records. Do not extract primitives from pages with `sourceRole: parent`; extract from their child source part pages instead.
53
+
54
+ Read frontmatter first. Do not reprocess already extracted source pages unless the user explicitly asks for re-extraction.
55
+
56
+ If no unprocessed source pages are found, report that and stop.
57
+
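The selection rule in this step reduces to a check on two frontmatter fields. A minimal sketch, assuming the frontmatter has already been parsed into an object (`SourceFrontmatter` and `needsExtraction` are illustrative names, not part of the CLI):

```typescript
// Hypothetical subset of source page frontmatter used for selection.
interface SourceFrontmatter {
  status?: string;
  sourceRole?: string;
}

// A source needs extraction when it is unprocessed and is not a parent manifest.
function needsExtraction(frontmatter: SourceFrontmatter): boolean {
  if (frontmatter.status !== "unprocessed") {
    return false;
  }
  const role = frontmatter.sourceRole;
  // An absent role is treated like an ordinary whole source.
  return role === undefined || role === "whole" || role === "part";
}
```

Parent pages fall out of the check by role alone, so their child parts are the only route by which a large source reaches extraction.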
58
+ ## Step 3: Analyze Each Source
59
+
60
+ Read each selected source page in full and identify durable primitives worth adding to the vault.
61
+
62
+ For source parts, preserve the parent source context. Evidence should cite the source part page and include the part's `locator` when available.
63
+
64
+ ### Entities
65
+
66
+ Extract entities when the source names durable things worth tracking across the vault:
67
+
68
+ - people
69
+ - organizations
70
+ - projects
71
+ - products
72
+ - systems
73
+ - places
74
+ - events
75
+ - artifacts
76
+ - documents
77
+
78
+ Prefer entities that are main subjects, repeated, or needed for links, claims, or relations. Do not create entities for generic nouns or passing mentions.
79
+
80
+ Example judgment:
81
+
82
+ ```text
83
+ Source: "Acme Corp was founded in 2010 by John Doe."
84
+
85
+ Extract:
86
+ - entity.organization.acme-corp
87
+ - entity.person.john-doe
88
+ ```
89
+
90
+ ### Concepts
91
+
92
+ Extract concepts when the source defines or explains reusable abstractions or reusable instructions:
93
+
94
+ - definitions
95
+ - methods
96
+ - principles
97
+ - frameworks
98
+ - policies
99
+ - standards
100
+ - patterns
101
+ - theories
102
+ - taxonomies
103
+ - workflows
104
+ - runbooks
105
+ - checklists
106
+ - playbooks
107
+
108
+ Do not create concept pages for terms that are only mentioned in passing.
109
+
110
+ Example judgment:
111
+
112
+ ```text
113
+ Source: "Adaptive reuse converts existing buildings to new uses while preserving much of the original structure."
114
+
115
+ Extract:
116
+ - concept.method.adaptive-reuse
117
+ ```
118
+
119
+ ### Claims
120
+
121
+ Extract claims when the source makes an atomic proposition that can be evaluated for support, confidence, freshness, and conflict.
122
+
123
+ Split compound statements:
124
+
125
+ ```text
126
+ Source: "Acme Corp was founded in 2010 and is based in San Francisco."
127
+
128
+ Extract:
129
+ - Acme Corp was founded in 2010.
130
+ - Acme Corp is based in San Francisco.
131
+ ```
132
+
133
+ Choose `claimType` by meaning:
134
+
135
+ - `historical`: dated or temporal event
136
+ - `descriptive`: what something is, has, or does
137
+ - `causal`: cause or mechanism
138
+ - `interpretive`: meaning, implication, or judgment
139
+ - `normative`: recommendation or what should happen
140
+ - `forecast`: expected future outcome
141
+
142
+ New source-extracted claims should normally be `unverified` with `confidence: 0.60`. The evidence excerpt documents where the claim came from; it does not automatically make the claim supported.
143
+
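The defaults for new source-extracted claims can be made explicit in a small helper. This is only an illustration of the rule above: the field names on `ClaimDraft` are hypothetical and do not represent the claim schema, which lives in `WIKI.md` Section 4.1.

```typescript
// The six claim types listed in this section.
type ClaimType =
  | "historical"
  | "descriptive"
  | "causal"
  | "interpretive"
  | "normative"
  | "forecast";

// Hypothetical in-memory draft; not the on-disk claim schema.
interface ClaimDraft {
  text: string;
  claimType: ClaimType;
  status: "unverified";
  confidence: number;
}

// New source-extracted claims start unverified at confidence 0.60.
function draftClaim(text: string, claimType: ClaimType): ClaimDraft {
  return { text: text, claimType: claimType, status: "unverified", confidence: 0.6 };
}

// The compound Acme sentence from the example above becomes two atomic drafts.
const acmeDrafts: ClaimDraft[] = [
  draftClaim("Acme Corp was founded in 2010.", "historical"),
  draftClaim("Acme Corp is based in San Francisco.", "descriptive"),
];
```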
144
+ Extract workflow-style concepts when the source contains reusable actionable steps:
145
+
146
+ - workflows
147
+ - runbooks
148
+ - checklists
149
+ - playbooks
150
+ - setup or operating instructions
151
+
152
+ Represent these as `pageType: concept` in `concepts/`, using the concept schema in `WIKI.md` Section 4.1 and an appropriate workflow-oriented `conceptType` from `WIKI.md` Section 12.1. Preserve the operational sequence. Keep the body concise and source-grounded.
153
+
154
+ ### Questions
155
+
156
+ Extract questions when the source exposes unresolved uncertainty:
157
+
158
+ - explicit questions
159
+ - research gaps
160
+ - known unknowns
161
+ - unresolved decisions
162
+ - TODOs or future work that should remain visible
163
+
164
+ Questions should be specific enough to answer. Resolved questions remain in the vault with an updated status; do not delete them.
165
+
166
+ ### Relations
167
+
168
+ Extract relations when the source establishes a typed connection between primitives:
169
+
170
+ - type membership
171
+ - ownership or authorship
172
+ - organizational hierarchy
173
+ - dependency or usage
174
+ - production or derivation
175
+ - location
176
+ - logical support or contradiction
177
+ - general association
178
+
179
+ Use predicates from `WIKI.md` Section 8. Relations are directional; record the direction actually supported by the source.
180
+
181
+ ## Step 4: Create or Update Pages
182
+
183
+ For each primitive, check for an existing page before creating a new one. Search the relevant folder and the compiled cache when available.
184
+
185
+ Create new page files with `agent-wiki create-page --no-log`. This skill writes one extraction batch log entry after all primitive creation and source metadata updates succeed.
186
+
187
+ Create pages in the folder required by `AGENTS.md`:
188
+
189
+ - `entities/` for `pageType: entity`
190
+ - `concepts/` for `pageType: concept`
191
+ - `claims/` for `pageType: claim`
192
+ - `questions/` for `pageType: question`
193
+
194
+ Use the runtime page schemas and examples from `WIKI.md` Section 4.1. Do not use local schema templates or copied frontmatter examples.
195
+
196
+ When creating a new `entity`, `concept`, `claim`, or `question` page, write a substantive Markdown body after the frontmatter. The body must be human-facing prose, not only frontmatter, a placeholder, or a one-line title restatement.
197
+
198
+ Do not create `synthesis` pages as part of routine primitive extraction. If the extraction reveals a need for durable cross-source interpretation, comparison, brief, or timeline narrative, report that a synthesis may be useful and use the `write-synthesis` skill when the operator asks for it.
199
+
200
+ Prepare the body prose in a temporary Markdown file outside the vault, then call the scaffolder. Examples:
201
+
202
+ ```bash
203
+ agent-wiki create-page \
204
+ --type entity \
205
+ --subtype organization \
206
+ --slug acme-corp \
207
+ --title "Acme Corp" \
208
+ --body-file <prepared-body.md> \
209
+ --source-page <sourceId> \
210
+ --no-log
211
+ ```
212
+
213
+ ```bash
214
+ agent-wiki create-page \
215
+ --type concept \
216
+ --subtype workflow \
217
+ --slug adaptive-reuse-review \
218
+ --title "Adaptive Reuse Review" \
219
+ --body-file <prepared-body.md> \
220
+ --source-page <sourceId> \
221
+ --no-log
222
+ ```
223
+
224
+ ```bash
225
+ agent-wiki create-page \
226
+ --type claim \
227
+ --subtype historical \
228
+ --slug acme-founded-2010 \
229
+ --title "Acme Corp was founded in 2010" \
230
+ --claim-text "Acme Corp was founded in 2010." \
231
+ --confidence 0.60 \
232
+ --source-id <sourceId> \
233
+ --evidence "id=evidence.quote.supports.acme-founded-2010;sourceId=<sourceId>;path=<sourcePath>;kind=quote;relation=context_only;weight=0.60;excerpt=<short-excerpt>;updatedAt=<yyyy-mm-dd>;locatorText=<locator>" \
234
+ --body-file <prepared-body.md> \
235
+ --no-log
236
+ ```
237
+
238
+ ```bash
239
+ agent-wiki create-page \
240
+ --type question \
241
+ --subtype acquisition \
242
+ --slug acme-founder-identity \
243
+ --title "Who founded Acme Corp?" \
244
+ --body-file <prepared-body.md> \
245
+ --related-page <pageId> \
246
+ --no-log
247
+ ```
248
+
249
+ For claim pages, pass source-grounded evidence records to the scaffolder with repeatable `--evidence` flags so the page is created with block YAML evidence frontmatter. Use `relation: supports` only when the source directly supports the claim; otherwise prefer `context_only`, `weakens`, or `contradicts`.
250
+
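The `--evidence` value in the claim example is a semicolon-separated list of `key=value` pairs. The sketch below builds such a string from a record. The key order and the two-decimal weight are inferred from that single example, and how the CLI handles a semicolon inside a value is not documented here, so this helper simply rejects such values. `EvidenceRecord` and `formatEvidenceFlag` are hypothetical names.

```typescript
// Hypothetical record mirroring the keys in the --evidence example.
interface EvidenceRecord {
  id: string;
  sourceId: string;
  path: string;
  kind: string;
  relation: "supports" | "context_only" | "weakens" | "contradicts";
  weight: number;
  excerpt: string;
  updatedAt: string;
  locatorText?: string;
}

function formatEvidenceFlag(evidence: EvidenceRecord): string {
  const pairs: Array<[string, string]> = [
    ["id", evidence.id],
    ["sourceId", evidence.sourceId],
    ["path", evidence.path],
    ["kind", evidence.kind],
    ["relation", evidence.relation],
    ["weight", evidence.weight.toFixed(2)],
    ["excerpt", evidence.excerpt],
    ["updatedAt", evidence.updatedAt],
  ];
  if (evidence.locatorText !== undefined) {
    pairs.push(["locatorText", evidence.locatorText]);
  }
  const parts: string[] = [];
  for (const pair of pairs) {
    const key = pair[0];
    const value = pair[1];
    // Assumption: no escaping rule is known, so refuse values containing ';'.
    if (value.indexOf(";") !== -1) {
      throw new Error("evidence field '" + key + "' contains ';'");
    }
    parts.push(key + "=" + value);
  }
  return parts.join(";");
}
```

Pass the result as one quoted argument per `--evidence` flag, so the shell does not split an excerpt that contains spaces.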
251
+ After creating a page with the scaffolder, immediately add any extraction-specific structured fields the scaffolder does not own, such as embedded relations, extracted primitive lists, or richer source references. Preserve the body prose written by the scaffolder and update `updatedAt` when structured content changes.
252
+
253
+ Use the body to explain the primitive in source-grounded context:
254
+
255
+ - `entity` pages: describe the entity, why it matters in the vault, important aliases or identifiers, and known uncertainty.
256
+ - `concept` pages: explain the concept, its boundaries, source-grounded examples or steps, and any important distinctions.
257
+ - `claim` pages: restate the atomic proposition in prose, summarize the evidence posture, and note caveats or uncertainty.
258
+ - `question` pages: explain why the question exists, what is already known, what remains unresolved, and what would count as resolution.
259
+
260
+ When updating an existing page:
261
+
262
+ - preserve human-authored prose
263
+ - update only relevant structured fields
264
+ - update `updatedAt` when structured content changes
265
+ - avoid duplicate claims, evidence entries, relations, aliases, and links
266
+
267
+ ## Step 5: Add Evidence
268
+
269
+ Every extracted claim should point back to the source page when possible.
270
+
271
+ Evidence should include:
272
+
273
+ - source page ID
274
+ - source path
275
+ - source locator when available, especially for source parts
276
+ - evidence kind
277
+ - evidence relation
278
+ - concise note
279
+ - exact excerpt when available
280
+ - retrieval/update dates required by the spec
281
+
282
+ Use `relation: supports` only when the evidence directly supports the claim. Use `context_only`, `weakens`, or `contradicts` when that is more accurate.
283
+
284
+ ## Step 6: Update Source Extraction Metadata
285
+
286
+ After extracting primitives from a source page, update that source page's frontmatter using the fields defined in `WIKI.md` Section 4.1 and source statuses from `WIKI.md` Section 5.
287
+
288
+ At minimum, record:
289
+
290
+ - `status: processed`
291
+ - `updatedAt`
292
+ - wikilinked IDs of extracted entities, concepts, claims, and questions where applicable
293
+
294
+ Extracted primitive lists (`extractedEntities`, `extractedConcepts`, `extractedClaims`, `extractedQuestions`) are navigation/display fields. Write each value as an aliased Obsidian wikilink that targets the page filename stem and displays the dotted ID, for example `"[[concept-principle-agentic-workflows|concept.principle.agentic-workflows]]"`.
295
+
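The aliased wikilink format can be produced mechanically from a dotted ID. This sketch assumes the filename stem is the ID with dots replaced by hyphens, which is inferred only from the one example above; confirm the stem against the file the scaffolder actually wrote. `toAliasedWikilink` is a hypothetical helper.

```typescript
// Build "[[<filename-stem>|<dotted.id>]]" from a dotted page ID.
// Assumption: the stem is the ID with every "." replaced by "-".
function toAliasedWikilink(pageId: string): string {
  const stem = pageId.split(".").join("-");
  return "[[" + stem + "|" + pageId + "]]";
}
```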
296
+ When processing source parts, update each processed part to `status: processed`. After all parts for a parent source are processed, update the parent source from `status: partitioned` to `status: processed` and update its `updatedAt`.
297
+
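The parent status rule depends only on the statuses of the child parts. A minimal sketch, assuming the part statuses have already been read from frontmatter; treating a parent with no parts as still `partitioned` is an assumption of this example, and `parentStatusAfterParts` is a hypothetical name.

```typescript
// Decide the parent source status from the statuses of its child parts.
function parentStatusAfterParts(partStatuses: string[]): "partitioned" | "processed" {
  // Assumption: a parent with no recorded parts stays partitioned.
  if (partStatuses.length === 0) {
    return "partitioned";
  }
  for (const status of partStatuses) {
    if (status !== "processed") {
      return "partitioned";
    }
  }
  return "processed";
}
```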
298
+ Do not modify the source body unless the user explicitly asks for prose changes.
299
+
300
+ ## Step 7: Log the Extraction Batch
301
+
302
+ After successfully extracting primitives and updating source metadata, write one operational log entry for the batch:
303
+
304
+ ```bash
305
+ agent-wiki log --message "extract-knowledge-primitives: processed <sourceCount> sources; entities=<count> concepts=<count> claims=<count> questions=<count> relations=<count>"
306
+ ```
307
+
308
+ Do not write a log entry when no source pages were processed.
309
+
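The batch log message follows a fixed template, so it can be assembled from the run's counters. The sketch below returns `null` when nothing was processed, mirroring the rule that no log entry is written in that case. `ExtractionCounts` and `extractionLogMessage` are hypothetical names.

```typescript
// Hypothetical counters collected during one extraction batch.
interface ExtractionCounts {
  sources: number;
  entities: number;
  concepts: number;
  claims: number;
  questions: number;
  relations: number;
}

// Returns the message for `agent-wiki log --message`, or null when no log is due.
function extractionLogMessage(counts: ExtractionCounts): string | null {
  if (counts.sources === 0) {
    return null;
  }
  return (
    "extract-knowledge-primitives: processed " + counts.sources + " sources;" +
    " entities=" + counts.entities +
    " concepts=" + counts.concepts +
    " claims=" + counts.claims +
    " questions=" + counts.questions +
    " relations=" + counts.relations
  );
}
```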
310
+ ## Step 8: Report Results
311
+
312
+ Report a concise summary:
313
+
314
+ ```text
315
+ Extraction complete.
316
+
317
+ Processed:
318
+ - source.<id>
319
+
320
+ Created or updated:
321
+ - Entities: ...
322
+ - Concepts: ...
323
+ - Claims: ...
324
+ - Questions: ...
325
+ - Relations: ...
326
+ ```
327
+
328
+ ## Checklist
329
+
330
+ - [ ] Read `AGENTS.md` and `WIKI.md` Sections 4.1, 5, 6, 7, 8, and 12.1
331
+ - [ ] Find unprocessed source pages and source parts
332
+ - [ ] Skip `sourceRole: parent` pages during extraction
333
+ - [ ] Read each selected source in full
334
+ - [ ] Identify entities, concepts, claims, questions, and relations
335
+ - [ ] Defer durable cross-source interpretation to the `write-synthesis` skill
336
+ - [ ] Check for duplicates before creating pages
337
+ - [ ] Use `agent-wiki create-page --no-log` for newly created primitive page files
338
+ - [ ] Use runtime schemas and examples from `WIKI.md` Section 4.1
339
+ - [ ] Write substantive Markdown body prose for every new entity, concept, claim, and question page
340
+ - [ ] Mark source-extracted claims `unverified` with `confidence: 0.60`
341
+ - [ ] Add source-grounded evidence without overstating support
342
+ - [ ] Include source part locators in evidence when available
343
+ - [ ] Preserve human-authored prose
344
+ - [ ] Update source extraction metadata
345
+ - [ ] Write one operational log entry for the extraction batch
346
+ - [ ] Report results
347
+
348
+ ## Schema Authority
349
+
350
+ This skill owns extraction workflow guidance only. Runtime page schemas, ID formats, and common examples live in `WIKI.md` Section 4.1. Allowed enum values live in `WIKI.md` Sections 5, 6, 7, 8, 12, and 12.1. Use `AGENT-WIKI-SPEC-v2.md` only for project changes, ambiguity, or missing detail.
@@ -0,0 +1,101 @@
1
+ ---
2
+ name: import-link
3
+ description: Import a URL, link-derived capture, transcript, or pasted source directly into a canonical source page. Use when the user asks to import a link, capture an external source, ingest a URL, or save source material into the vault.
4
+ ---
5
+
6
+ # Import Link
7
+
8
+ ## Configuration
9
+ - Before first use, read `ONBOARD.md` and `skills/import-link/config.json`.
10
+ - If local setup is uncertain, run `agent-wiki onboard --check` and use the read-only probe output to guide setup questions.
11
+ - For first-run setup, prefer `agent-wiki onboard --check --questions` so the user can answer with compact letter choices.
12
+ - If the user approves persisting local Python or conversion policy, use `agent-wiki onboard --write-config` with the approved flags. The writer creates local `_system/config.json` from `_system/config.example.json`.
13
+ - Confirm `configured` is `true` before importing.
14
+ - Do not assume a default model, browser profile, or external retrieval tool.
15
+ - This checkout is the only wiki root. Write imports under this repository root using the relative directories in `skills/import-link/config.json`.
16
+ - If retrieval modes or attachment policy are unknown, stop and ask the user to configure `skills/import-link/config.json`.
17
+ - The default `manual_paste` retrieval mode requires no external tools. Other retrieval modes only apply when configured and available.
18
+ - Do not create a virtual environment, install packages, write `_system/config.json`, or change `skills/import-link/config.json` unless the user explicitly asks for setup changes. Do not hand-edit `_system/config.json`; use `agent-wiki onboard --write-config` after approval.
19
+
20
+ ## Wiki Root
21
+ - Run this skill from the repository root.
22
+ - Do not accept a vault name, alternate root, or external destination for this workflow.
23
+ - Use repository-relative paths for all writes.
24
+ - Source pages are written under `sources/`.
25
+ - Attachments are written under `_attachments/`.
26
+ - Users who want multiple independent wikis should clone this repository into multiple folders and onboard each checkout separately.
27
+ - If the incoming link is an `obsidian://` URI, follow `AGENT-WIKI-SPEC-v2.md` Section 8.6: do not launch the URI; resolve it only through `_system/config.json` `knownVaults`, and treat it as an opaque external reference when the target vault is not configured.
28
+
29
+ ## UUID Generation
30
+ - Use `agent-wiki uuid` to generate a new UUID for each source attachment.
31
+
32
+ ## Source Slug
33
+ - For every incoming URL or source, generate a four-word slug for the source note.
34
+ - Derive the four words by summarizing the captured source content.
35
+
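Once the four summary words are chosen, turning them into a slug is mechanical. The sketch below assumes a lowercase, hyphen-separated, ASCII-only slug, which matches the slugs used in other examples in this package but is not stated as a rule here. `fourWordSlug` is a hypothetical helper.

```typescript
// Turn a four-word summary into a slug, rejecting any other word count.
// Assumption: slugs are lowercase ASCII words joined by hyphens.
function fourWordSlug(summary: string): string {
  const words = summary
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(function (word: string): boolean {
      return word.length > 0;
    });
  if (words.length !== 4) {
    throw new Error("expected exactly four words, got " + words.length);
  }
  return words.join("-");
}
```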
36
+ ## Source Schema (required, strictly enforced)
37
+
38
+ Create source files in `sources/` using `agent-wiki create-page`. The scaffolder writes schema-compliant source pages, validates source parent/part requirements, and prevents duplicate IDs or target path overwrites.
39
+
40
+ Use `WIKI.md` Section 4.1 as the routine source of truth for page-type schemas, ID formats, and examples. Use `WIKI.md` Sections 5 and 12 for status and source type enums. This skill owns the import workflow, not the source frontmatter schema.
41
+
42
+ Use `WIKI.md` Section 13 for large-source parent and part handling. Consult `AGENT-WIKI-SPEC-v2.md` only when changing project behavior, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 12, or 13 are insufficient.
43
+
44
+ Use `AGENT-WIKI-SPEC-v2.md` Section 6.10 only when you need the page scaffolding contract or `agent-wiki create-page` option semantics.
45
+
46
+ Newly imported ordinary source pages MUST use `status: unprocessed` and `sourceRole: whole`. The extraction workflow changes source pages to `status: processed` after knowledge primitives have been extracted.
47
+
48
+ Large source parent pages MUST use `sourceRole: parent`. They SHOULD use `status: partitioned` while one or more child source parts remain unprocessed. Child source part pages MUST use `sourceRole: part` and `status: unprocessed`.
49
+
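The role and status rules for newly imported pages can be checked before calling the scaffolder. This sketch maps the MUST rules to errors and the SHOULD rule to a warning; it is an illustration of the rules above, not the scaffolder's own validation, and `checkNewSource` and `Finding` are hypothetical names.

```typescript
type SourceRole = "whole" | "parent" | "part";

interface Finding {
  level: "error" | "warning";
  message: string;
}

// Check the initial status of a newly imported source page against its role.
function checkNewSource(role: SourceRole, status: string): Finding[] {
  const findings: Finding[] = [];
  if (role === "parent") {
    // SHOULD rule: parents start partitioned while child parts are unprocessed.
    if (status !== "partitioned") {
      findings.push({ level: "warning", message: "parent sources should start as status: partitioned" });
    }
  } else if (status !== "unprocessed") {
    // MUST rule: whole and part pages start unprocessed.
    findings.push({ level: "error", message: role + " sources must start as status: unprocessed" });
  }
  return findings;
}
```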
50
+ ## Deterministic Workflow
51
+ 1. **Deduplication Check:** Before capturing content, check existing source pages in `sources/` for a matching `originUrl`.
52
+ - If a source with a matching URL already exists, stop and tell the user it has already been imported. Update the existing file only if the user asks for that.
53
+ 2. Capture source content using the retrieval modes configured in `skills/import-link/config.json`.
54
+ - If direct fetch is available, try it first.
55
+ - If a transcript tool is configured and the source is a video, capture one English transcript when available and use it as the primary source body.
56
+ - If browser automation is configured and direct retrieval is blocked or incomplete, use the configured browser automation.
57
+ - If no configured retrieval mode works, ask the user to paste the source content or configure another retrieval method.
58
+ 3. Ensure wiki folders exist:
59
+ - `sources/`
60
+ - `sources/parts/` when the capture needs partitioning
61
+ - `_attachments/`
62
+ 4. Build a deterministic ID:
63
+ - Create the four-word source slug by summarizing the raw captured content, after the content has been captured in step 2.
64
+ 5. Decide whether to partition:
65
+ - If captured text is larger than roughly 25,000 words, or if an agent cannot reliably process the full source in one extraction pass, create a large source.
66
+ - Prefer semantic boundaries such as chapters, headings, appendices, transcript topics, or slide boundaries.
67
+ - Fall back to page ranges, timestamps, or other stable locators.
68
+ - Target 8,000-15,000 words per source part.
69
+ - Do not exceed 20,000 words per part unless preserving an indivisible structure requires it.
70
+ - Avoid splitting inside tables, code blocks, quoted blocks, or list structures when possible.
71
+ 6. Save the prepared source body or source-part bodies to temporary Markdown files outside the vault, then call `agent-wiki create-page` with `--no-log` for each canonical source page. The skill writes one batch log entry after the import succeeds.
72
+ 7. For an ordinary source, call the scaffolder with:
73
+ ```bash
74
+ agent-wiki create-page \
75
+ --type source \
76
+ --subtype <sourceType> \
77
+ --slug <sourceSlug> \
78
+ --title "<title>" \
79
+ --source-date <yyyy-mm-dd> \
80
+ --retrieved-at <yyyy-mm-dd> \
81
+ --source-url "<originUrl>" \
82
+ --source-role whole \
83
+ --body-file <prepared-source-body.md> \
84
+ --no-log
85
+ ```
86
+ Use `--origin-path` instead of `--source-url` only when the imported source is local. Add `--attachment <attachment>` once for each saved attachment. The body file must contain the full captured verbatim body with inline images and source URLs below the frontmatter. Images save to `_attachments/` using filename `yyyy-mm-dd-<sourceSlug>-<UUID>-<index>.<ext>`, where `<index>` starts at 1 and increments for each attachment. If a video thumbnail is captured, place it at the top of the transcript. Inline images use Obsidian image syntax `![[filename]]`.
87
+ 8. For a large source, call the scaffolder once for each child source part, then once for the parent manifest:
88
+ - part pages use `--source-role part`, `--parent-source-id <parentSourceId>`, `--part-index <n>`, `--part-count <count>`, and `--locator "<locator>"`
89
+ - the parent page uses `--source-role parent`, `--source-part <partPath>` once for each ordered child part path, and `--part-count <count>`
90
+ - the parent body should stay short and should not contain the full long-form source text
91
+ - each part body file must contain the verbatim text for its segment
92
+ - use `--no-log` on each scaffolder call and log the import once after all pages and attachments are written
93
+ 9. Write one operational log entry for the import:
94
+ ```bash
95
+ agent-wiki log --message "import-link: imported source <sourceId> to sources/<filename>; attachments=<count>"
96
+ ```
97
+ Log only after the source page and any attachments have been written successfully.
98
+ 10. Confirm in chat with:
99
+ - source path
100
+ - source part paths when a large source was partitioned
101
+ - number of attachments saved
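Two rules in the workflow above are easy to express as pure functions: the word-count trigger for partitioning and the attachment filename pattern. The sketch below covers only the word-count trigger, since the source text calls the 25,000-word limit approximate and also allows partitioning whenever one extraction pass is not reliable. `shouldPartition` and `attachmentFilename` are hypothetical helpers, and whitespace-based word counting is an assumption.

```typescript
// Approximate threshold from step 5; the workflow calls it "roughly" 25,000 words.
const PARTITION_THRESHOLD_WORDS = 25000;

// Count whitespace-separated words (an assumed definition of "word").
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed.length === 0 ? 0 : trimmed.split(/\s+/).length;
}

// True when the captured text is large enough to become a parent with parts.
function shouldPartition(text: string): boolean {
  return countWords(text) > PARTITION_THRESHOLD_WORDS;
}

// Build "yyyy-mm-dd-<sourceSlug>-<UUID>-<index>.<ext>" for a saved attachment.
function attachmentFilename(
  date: string,
  sourceSlug: string,
  uuid: string,
  index: number,
  ext: string
): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error("date must be yyyy-mm-dd");
  }
  // The index starts at 1 and increments for each attachment.
  if (index < 1 || Math.floor(index) !== index) {
    throw new Error("index must be a positive integer");
  }
  return date + "-" + sourceSlug + "-" + uuid + "-" + index + "." + ext;
}
```

The UUID passed in would come from `agent-wiki uuid`, as described under UUID Generation.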
@@ -0,0 +1,12 @@
+ {
+   "schemaVersion": 1,
+   "configured": false,
+   "retrievalModes": [
+     "manual_paste"
+   ],
+   "browserProfile": null,
+   "youtubeTranscriptTool": null,
+   "attachmentPolicy": "copy",
+   "sourceDirectory": "sources",
+   "attachmentDirectory": "_attachments"
+ }
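The import-link skill asks the agent to confirm that `configured` is `true` and to stop when retrieval modes or attachment policy are unknown. A minimal sketch of that gate against the config shape shown above follows. The `string | null` types for `browserProfile`, `youtubeTranscriptTool`, and `attachmentPolicy` are assumptions, since the shipped file only shows one value for each, and `importBlockers` is a hypothetical helper.

```typescript
// Shape of skills/import-link/config.json as shipped; nullable types are assumed.
interface ImportLinkConfig {
  schemaVersion: number;
  configured: boolean;
  retrievalModes: string[];
  browserProfile: string | null;
  youtubeTranscriptTool: string | null;
  attachmentPolicy: string | null;
  sourceDirectory: string;
  attachmentDirectory: string;
}

// List the reasons an import should not start; an empty list means go ahead.
function importBlockers(config: ImportLinkConfig): string[] {
  const blockers: string[] = [];
  if (config.configured !== true) {
    blockers.push("configured is not true");
  }
  if (config.retrievalModes.length === 0) {
    blockers.push("no retrieval modes configured");
  }
  if (!config.attachmentPolicy) {
    blockers.push("attachment policy is unknown");
  }
  return blockers;
}

// The default config shipped in this package version.
const shippedDefault: ImportLinkConfig = {
  schemaVersion: 1,
  configured: false,
  retrievalModes: ["manual_paste"],
  browserProfile: null,
  youtubeTranscriptTool: null,
  attachmentPolicy: "copy",
  sourceDirectory: "sources",
  attachmentDirectory: "_attachments",
};
```

With the shipped default, the only blocker is the `configured` flag, which matches the instruction to complete onboarding before the first import.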