npm - @creativeaitools/agent-wiki - Versions diffs - 2.0.0 - Mend

@creativeaitools/agent-wiki 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

package/AGENT-WIKI-SPEC-v2.md +2584 -0
package/AGENTS.md +314 -0
package/INBOX.md +19 -0
package/LICENSE +21 -0
package/ONBOARD.md +373 -0
package/README.md +429 -0
package/WIKI.md +706 -0
package/_system/config.example.json +105 -0
package/dist/src/catalog.js +66 -0
package/dist/src/cli.js +330 -0
package/dist/src/compile.js +104 -0
package/dist/src/config.js +84 -0
package/dist/src/lifecycle.js +171 -0
package/dist/src/migrate.js +26 -0
package/dist/src/onboard.js +159 -0
package/dist/src/page.js +188 -0
package/dist/src/registry.js +74 -0
package/dist/src/schedule-prompts.js +74 -0
package/dist/src/upgrade.js +215 -0
package/dist/src/wiki-utils.js +112 -0
package/dist/src/workspace.js +198 -0
package/package.json +54 -0
package/skills/compile-wiki/SKILL.md +140 -0
package/skills/extract-knowledge-primitives/SKILL.md +350 -0
package/skills/import-link/SKILL.md +101 -0
package/skills/import-link/config.json +12 -0
package/skills/process-inbox/SKILL.md +255 -0
package/skills/process-workspace-sources/SKILL.md +127 -0
package/skills/update-overview/SKILL.md +140 -0
package/skills/write-synthesis/SKILL.md +154 -0

package/package.json ADDED

@@ -0,0 +1,54 @@
+{
+  "name": "@creativeaitools/agent-wiki",
+  "version": "2.0.0",
+  "description": "Obsidian-compatible evidence-aware wiki tooling for agents.",
+  "type": "module",
+  "bin": {
+    "agent-wiki": "dist/src/cli.js"
+  },
+  "files": [
+    "dist/src",
+    "AGENTS.md",
+    "WIKI.md",
+    "README.md",
+    "ONBOARD.md",
+    "INBOX.md",
+    "AGENT-WIKI-SPEC-v2.md",
+    "_system/config.example.json",
+    "skills"
+  ],
+  "scripts": {
+    "build": "tsc",
+    "test": "npm run build && node --test dist/tests/*.test.js",
+    "check": "npm run test"
+  },
+  "devDependencies": {
+    "@types/node": "^24.0.0",
+    "typescript": "^5.8.0"
+  },
+  "engines": {
+    "node": ">=20"
+  },
+  "dependencies": {
+    "yaml": "^2.9.0"
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/jesse-lane-ai/agent-wiki.git"
+  },
+  "bugs": {
+    "url": "https://github.com/jesse-lane-ai/agent-wiki/issues"
+  },
+  "homepage": "https://github.com/jesse-lane-ai/agent-wiki#readme",
+  "license": "MIT",
+  "keywords": [
+    "agent",
+    "wiki",
+    "obsidian",
+    "knowledge-management",
+    "cli"
+  ]
+}

package/skills/compile-wiki/SKILL.md ADDED

@@ -0,0 +1,140 @@
+---
+name: compile-wiki
+description: "Run the compile pipeline whenever the underlying vault data changes. Trigger this skill whenever the user says \"compile the wiki\", \"regenerate the cache\", \"compile\", \"run compile\", or similar phrases indicating they want to update the machine-facing artifacts."
+---
+# Compile the Wiki Cache
+Run the compile pipeline to regenerate all machine-facing cache artifacts, the deterministic root page catalog, and maintenance reports from vault page frontmatter.
+---
+## When to run
+Run the compile pipeline after:
+- Adding or editing pages with structured frontmatter (claims, relations, timeline entries)
+- Adding new question pages
+- Adding source pages
+- Adding source parent or source part pages for large documents
+- Extracting knowledge primitives from source pages
+- Creating or refreshing synthesis pages
+- Resolving questions (status changes)
+- Any bulk edit to page metadata
+The compile pipeline is **safe to run at any time**. It is fully regenerative. It does not modify canonical knowledge pages, but it does rewrite the deterministic root `index.md` page catalog from `_system/cache/pages.json`.
+---
+## How to run
+From the wiki root:
+```bash
+agent-wiki compile
+```
+With verbose output:
+```bash
+agent-wiki compile --verbose
+```
+Run this command from the repository root. The compile pipeline does not accept an alternate vault or wiki root.
+---
+## What it produces
+### Required cache files (`_system/cache/`)
+| File | Purpose |
+|---|---|
+| `pages.json` | Normalized index of all parsed pages |
+| `claims.jsonl` | All extracted claims with owning page info |
+| `relations.jsonl` | All extracted relations |
+| `agent-digest.json` | High-signal agent context pack |
+| `contradictions.json` | Detected contradiction registry |
+| `questions.json` | Question registry |
+| `timeline-events.json` | Chronological event index |
+| `source-index.json` | Source metadata registry |
+### Indexes (`_system/indexes/`)
+| File | Purpose |
+|---|---|
+| `alias-index.json` | Alias → page ID map |
+| `tag-index.json` | Tag → page IDs map |
+| `id-to-path.json` | Page ID → path map |
+| `path-to-id.json` | Path → page ID map |
+| `pagetype-index.json` | Page type → page IDs map |
+### Root page catalog (`index.md`)
+The compile pipeline runs:
+```bash
+agent-wiki index --write --no-log
+```
+This regenerates the full root `index.md` page catalog from `_system/cache/pages.json`.
+`agent-wiki index` also supports:
+```bash
+agent-wiki index --check
+```
+Use `--check` when you need to verify that `index.md` matches the compiled page metadata without rewriting it.
+### Reports (`reports/`)
+| Report | Purpose |
+|---|---|
+| `open-questions.md` | All open/active questions |
+| `contradictions.md` | Tracked claim conflicts |
+| `low-confidence.md` | Claims below confidence threshold |
+| `claim-health.md` | Evidence gap and staleness overview |
+| `stale-pages.md` | Pages not updated recently |
+| `orphaned-claims.md` | Claims whose owning page is missing |
+| `evidence-gaps.md` | Claims with no direct evidence |
+### Logs (`_system/logs/`)
+The compile pipeline writes one operational log entry to `_system/logs/log.md` on each run through `agent-wiki log`.
+---
+## Validation Responsibility
+This skill owns validation, cache regeneration, root catalog generation, report generation, and compile logs.
+When running compile:
+- validate page frontmatter and generated records
+- warn when authored knowledge pages are missing required Markdown body prose
+- validate synthesis scope and warn when active synthesis pages have no listed source or claim basis
+- validate large-source parent and source-part structure
+- detect duplicate IDs and malformed records
+- regenerate cache files, `_system/indexes/`, root `index.md`, and report artifacts
+- write a concise operational log entry through `agent-wiki log`
+- report validation issues clearly
+If validation errors occur, fix the affected canonical page or structured frontmatter, then re-run this skill. Do not repair generated cache, index, report, or log files by hand.
+---
+## Requirements
+- Python 3.8+
+- No third-party Python packages are required.
+---
+## Important rules
+- Do NOT hand-edit files in `_system/cache/` or `_system/indexes/`. They are regenerated on each compile.
+- Do NOT hand-edit `index.md` for durable prose. It is regenerated as the deterministic root page catalog.
+- Reports in `reports/` are views — do not treat them as primary data.
+- The compile pipeline reads `pageType`, `id`, `claims`, `relations`, `timeline` from frontmatter.
+- For source pages, the compile pipeline also preserves `sourceRole`, `parentSourceId`, `sourceParts`, `partIndex`, `partCount`, and `locator` in `pages.json` and `source-index.json`.
+- Pages without frontmatter, or without `id` and `pageType`, are skipped.
+- The compile pipeline only modifies generated artifacts and the deterministic root `index.md` catalog.
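
Reviewer sketch (not part of the package): the skip rule and the `id-to-path.json` / `path-to-id.json` indexes described in the compile-wiki skill could be modeled roughly as follows. The `ParsedPage` shape, the `buildPathIndexes` name, and the example paths are hypothetical; the real logic in `dist/src/compile.js` is not shown in this excerpt. The sketch assumes a page is skipped when either `id` or `pageType` is missing, and that the first page seen wins when an ID is duplicated.

```typescript
// Hypothetical shape of a parsed page; the compiler's real types are not shown in this diff.
interface ParsedPage {
  path: string;
  frontmatter?: { id?: string; pageType?: string };
}

interface PathIndexes {
  idToPath: Map<string, string>;
  pathToId: Map<string, string>;
  skippedPaths: string[];
  duplicateIds: string[];
}

// Assumption: a page is skipped when frontmatter is absent, or when either
// `id` or `pageType` is missing or empty.
function buildPathIndexes(pages: ParsedPage[]): PathIndexes {
  const idToPath = new Map<string, string>();
  const pathToId = new Map<string, string>();
  const skippedPaths: string[] = [];
  const duplicateIds: string[] = [];

  for (const page of pages) {
    const id = page.frontmatter?.id;
    const pageType = page.frontmatter?.pageType;
    if (!id || !pageType) {
      skippedPaths.push(page.path);
      continue;
    }
    if (idToPath.has(id)) {
      // Record the duplicate and keep the first page seen for that ID.
      duplicateIds.push(id);
      continue;
    }
    idToPath.set(id, page.path);
    pathToId.set(page.path, id);
  }
  return { idToPath, pathToId, skippedPaths, duplicateIds };
}

// Illustrative input only; these paths and IDs are made up for the sketch.
const examplePages: ParsedPage[] = [
  {
    path: "entities/entity-organization-acme-corp.md",
    frontmatter: { id: "entity.organization.acme-corp", pageType: "entity" },
  },
  { path: "scratch/untitled.md" }, // no frontmatter, so it is skipped
];
const exampleIndexes = buildPathIndexes(examplePages);
console.log(exampleIndexes.idToPath.size, exampleIndexes.skippedPaths);
```

Returning duplicates and skipped paths alongside the maps mirrors the skill's instruction to report validation issues rather than silently dropping pages.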

package/skills/extract-knowledge-primitives/SKILL.md ADDED

@@ -0,0 +1,350 @@
+---
+name: extract-knowledge-primitives
+description: "Extract knowledge primitives (entities, concepts, claims, questions, and relations) from source pages. Use this skill when the user says 'extract primitives', 'extract knowledge', 'process sources for structured data', or 'analyze sources'. This skill reads source pages with status: unprocessed and creates appropriate wiki pages."
+---
+# Extract Knowledge Primitives
+This skill defines the extraction workflow. It does not own the vault schema or synthesis prose workflow.
+Runtime schema and common examples live in `WIKI.md` Section 4.1. Status enums live in `WIKI.md` Sections 5 and 6. Evidence rules live in `WIKI.md` Section 7. Relationship predicates live in `WIKI.md` Section 8. Entity and concept type enums live in `WIKI.md` Section 12.1. The vault behavior contract lives in `AGENTS.md`. The full project/development contract lives in `AGENT-WIKI-SPEC-v2.md`.
+Use `AGENT-WIKI-SPEC-v2.md` only when changing project behavior, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 6, 7, 8, or 12.1 do not contain enough detail. If this skill or those `WIKI.md` sections conflict with `AGENT-WIKI-SPEC-v2.md`, follow `AGENT-WIKI-SPEC-v2.md`.
+Authored body requirements for newly created `entity`, `concept`, `claim`, `question`, and `synthesis` pages live in `AGENT-WIKI-SPEC-v2.md` Section 7.10.
+Deterministic page scaffolding lives in `AGENT-WIKI-SPEC-v2.md` Section 6.10. Use `agent-wiki create-page` when creating new primitive page files.
+## Core Principles
+- Preserve source content. Add structure without rewriting human-authored prose.
+- Use stable IDs. Reuse existing primitives when they already exist.
+- Do not invent certainty. New source-extracted claims start `status: unverified` with `confidence: 0.60` unless the canonical spec says otherwise.
+- Keep claims atomic. One proposition per claim.
+- Treat evidence honestly. An excerpt can show that a source made a statement without proving the statement true.
+- Use Obsidian wikilinks for internal vault references. Cross-vault Obsidian references are the exception: write them as standard markdown links with `obsidian://` URIs per `AGENT-WIKI-SPEC-v2.md` Section 8.6.
+## Step 1: Read the Contract and Runtime Reference
+Before extracting anything, read:
+1. `AGENTS.md` for behavior rules.
+2. `WIKI.md` Sections 4.1, 5, 6, 7, 8, and 12.1 for runtime schema, field requirements, ID formats, enums, and common examples.
+Read `AGENT-WIKI-SPEC-v2.md` only when changing the project itself, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 6, 7, 8, or 12.1 are insufficient.
+Do not copy schemas from this skill when creating pages. Use `WIKI.md` Section 4.1 as the routine source of truth for ordinary vault schemas.
+Use `agent-wiki create-page` for new page files so IDs, filenames, required frontmatter, and body requirements stay deterministic.
+Do not launch `obsidian://` URIs. Resolve them only through local `_system/config.json` `knownVaults`; if no mapping exists, treat the URI as an opaque external reference.
+## Step 2: Find Source Pages Needing Extraction
+Scan `sources/` for source pages that have not yet been processed for extraction.
+A source page needs extraction when:
+- `status: unprocessed`.
+- `sourceRole` is absent, `whole`, or `part`.
+A source page has already been extracted when:
+- `status: processed`.
+Large source parent pages are manifests and metadata records. Do not extract primitives from pages with `sourceRole: parent`; extract from their child source part pages instead.
+Read frontmatter first. Do not reprocess already extracted source pages unless the user explicitly asks for re-extraction.
+If no unprocessed source pages are found, report that and stop.
+## Step 3: Analyze Each Source
+Read each selected source page in full and identify durable primitives worth adding to the vault.
+For source parts, preserve the parent source context. Evidence should cite the source part page and include the part's `locator` when available.
+### Entities
+Extract entities when the source names durable things worth tracking across the vault:
+- people
+- organizations
+- projects
+- products
+- systems
+- places
+- events
+- artifacts
+- documents
+Prefer entities that are main subjects, repeated, or needed for links, claims, or relations. Do not create entities for generic nouns or passing mentions.
+Example judgment:
+```text
+Source: "Acme Corp was founded in 2010 by John Doe."
+Extract:
+- entity.organization.acme-corp
+- entity.person.john-doe
+```
+### Concepts
+Extract concepts when the source defines or explains reusable abstractions or reusable instructions:
+- definitions
+- methods
+- principles
+- frameworks
+- policies
+- standards
+- patterns
+- theories
+- taxonomies
+- workflows
+- runbooks
+- checklists
+- playbooks
+Do not create concept pages for terms that are only mentioned in passing.
+Example judgment:
+```text
+Source: "Adaptive reuse converts existing buildings to new uses while preserving much of the original structure."
+Extract:
+- concept.method.adaptive-reuse
+```
+### Claims
+Extract claims when the source makes an atomic proposition that can be evaluated for support, confidence, freshness, and conflict.
+Split compound statements:
+```text
+Source: "Acme Corp was founded in 2010 and is based in San Francisco."
+Extract:
+- Acme Corp was founded in 2010.
+- Acme Corp is based in San Francisco.
+```
+Choose `claimType` by meaning:
+- `historical`: dated or temporal event
+- `descriptive`: what something is, has, or does
+- `causal`: cause or mechanism
+- `interpretive`: meaning, implication, or judgment
+- `normative`: recommendation or what should happen
+- `forecast`: expected future outcome
+New source-extracted claims should normally be `unverified` with `confidence: 0.60`. The evidence excerpt documents where the claim came from; it does not automatically make the claim supported.
+Extract workflow-style concepts when the source contains reusable actionable steps:
+- workflows
+- runbooks
+- checklists
+- playbooks
+- setup or operating instructions
+Represent these as `pageType: concept` in `concepts/`, using the concept schema in `WIKI.md` Section 4.1 and an appropriate workflow-oriented `conceptType` from `WIKI.md` Section 12.1. Preserve the operational sequence. Keep the body concise and source-grounded.
+### Questions
+Extract questions when the source exposes unresolved uncertainty:
+- explicit questions
+- research gaps
+- known unknowns
+- unresolved decisions
+- TODOs or future work that should remain visible
+Questions should be specific enough to answer. Resolved questions remain in the vault with an updated status; do not delete them.
+### Relations
+Extract relations when the source establishes a typed connection between primitives:
+- type membership
+- ownership or authorship
+- organizational hierarchy
+- dependency or usage
+- production or derivation
+- location
+- logical support or contradiction
+- general association
+Use predicates from `WIKI.md` Section 8. Relations are directional; record the direction actually supported by the source.
+## Step 4: Create or Update Pages
+For each primitive, check for an existing page before creating a new one. Search the relevant folder and the compiled cache when available.
+Create new page files with `agent-wiki create-page --no-log`. This skill writes one extraction batch log entry after all primitive creation and source metadata updates succeed.
+Create pages in the folder required by `AGENTS.md`:
+- `entities/` for `pageType: entity`
+- `concepts/` for `pageType: concept`
+- `claims/` for `pageType: claim`
+- `questions/` for `pageType: question`
+Use the runtime page schemas and examples from `WIKI.md` Section 4.1. Do not use local schema templates or copied frontmatter examples.
+When creating a new `entity`, `concept`, `claim`, or `question` page, write a substantive Markdown body after the frontmatter. The body must be human-facing prose, not only frontmatter, a placeholder, or a one-line title restatement.
+Do not create `synthesis` pages as part of routine primitive extraction. If the extraction reveals a need for durable cross-source interpretation, comparison, brief, or timeline narrative, report that a synthesis may be useful and use the `write-synthesis` skill when the operator asks for it.
+Prepare the body prose in a temporary Markdown file outside the vault, then call the scaffolder. Examples:
+```bash
+agent-wiki create-page \
+  --type entity \
+  --subtype organization \
+  --slug acme-corp \
+  --title "Acme Corp" \
+  --body-file <prepared-body.md> \
+  --source-page <sourceId> \
+  --no-log
+```
+```bash
+agent-wiki create-page \
+  --type concept \
+  --subtype workflow \
+  --slug adaptive-reuse-review \
+  --title "Adaptive Reuse Review" \
+  --body-file <prepared-body.md> \
+  --source-page <sourceId> \
+  --no-log
+```
+```bash
+agent-wiki create-page \
+  --type claim \
+  --subtype historical \
+  --slug acme-founded-2010 \
+  --title "Acme Corp was founded in 2010" \
+  --claim-text "Acme Corp was founded in 2010." \
+  --confidence 0.60 \
+  --source-id <sourceId> \
+  --evidence "id=evidence.quote.supports.acme-founded-2010;sourceId=<sourceId>;path=<sourcePath>;kind=quote;relation=context_only;weight=0.60;excerpt=<short-excerpt>;updatedAt=<yyyy-mm-dd>;locatorText=<locator>" \
+  --body-file <prepared-body.md> \
+  --no-log
+```
+```bash
+agent-wiki create-page \
+  --type question \
+  --subtype acquisition \
+  --slug acme-founder-identity \
+  --title "Who founded Acme Corp?" \
+  --body-file <prepared-body.md> \
+  --related-page <pageId> \
+  --no-log
+```
+For claim pages, pass source-grounded evidence records to the scaffolder with repeatable `--evidence` flags so the page is created with block YAML evidence frontmatter. Use `relation: supports` only when the source directly supports the claim; otherwise prefer `context_only`, `weakens`, or `contradicts`.
+After creating a page with the scaffolder, immediately add any extraction-specific structured fields the scaffolder does not own, such as embedded relations, extracted primitive lists, or richer source references. Preserve the body prose written by the scaffolder and update `updatedAt` when structured content changes.
+Use the body to explain the primitive in source-grounded context:
+- `entity` pages: describe the entity, why it matters in the vault, important aliases or identifiers, and known uncertainty.
+- `concept` pages: explain the concept, its boundaries, source-grounded examples or steps, and any important distinctions.
+- `claim` pages: restate the atomic proposition in prose, summarize the evidence posture, and note caveats or uncertainty.
+- `question` pages: explain why the question exists, what is already known, what remains unresolved, and what would count as resolution.
+When updating an existing page:
+- preserve human-authored prose
+- update only relevant structured fields
+- update `updatedAt` when structured content changes
+- avoid duplicate claims, evidence entries, relations, aliases, and links
+## Step 5: Add Evidence
+Every extracted claim should point back to the source page when possible.
+Evidence should include:
+- source page ID
+- source path
+- source locator when available, especially for source parts
+- evidence kind
+- evidence relation
+- concise note
+- exact excerpt when available
+- retrieval/update dates required by the spec
+Use `relation: supports` only when the evidence directly supports the claim. Use `context_only`, `weakens`, or `contradicts` when that is more accurate.
+## Step 6: Update Source Extraction Metadata
+After extracting primitives from a source page, update that source page's frontmatter using the fields defined in `WIKI.md` Section 4.1 and source statuses from `WIKI.md` Section 5.
+At minimum, record:
+- `status: processed`
+- `updatedAt`
+- wikilinked IDs of extracted entities, concepts, claims, and questions where applicable
+Extracted primitive lists (`extractedEntities`, `extractedConcepts`, `extractedClaims`, `extractedQuestions`) are navigation/display fields. Write each value as an aliased Obsidian wikilink that targets the page filename stem and displays the dotted ID, for example `"[[concept-principle-agentic-workflows|concept.principle.agentic-workflows]]"`.
+When processing source parts, update each processed part to `status: processed`. After all parts for a parent source are processed, update the parent source from `status: partitioned` to `status: processed` and update its `updatedAt`.
+Do not modify the source body unless the user explicitly asks for prose changes.
+## Step 7: Log the Extraction Batch
+After successfully extracting primitives and updating source metadata, write one operational log entry for the batch:
+```bash
+agent-wiki log --message "extract-knowledge-primitives: processed <sourceCount> sources; entities=<count> concepts=<count> claims=<count> questions=<count> relations=<count>"
+```
+Do not write a log entry when no source pages were processed.
+## Step 8: Report Results
+Report a concise summary:
+```text
+Extraction complete.
+Processed:
+- source.<id>
+Created or updated:
+- Entities: ...
+- Concepts: ...
+- Claims: ...
+- Questions: ...
+- Relations: ...
+```
+## Checklist
+- [ ] Read `AGENTS.md` and `WIKI.md` Sections 4.1, 5, 6, 7, 8, and 12.1
+- [ ] Find unprocessed source pages and source parts
+- [ ] Skip `sourceRole: parent` pages during extraction
+- [ ] Read each selected source in full
+- [ ] Identify entities, concepts, claims, questions, and relations
+- [ ] Defer durable cross-source interpretation to the `write-synthesis` skill
+- [ ] Check for duplicates before creating pages
+- [ ] Use `agent-wiki create-page --no-log` for newly created primitive page files
+- [ ] Use runtime schemas and examples from `WIKI.md` Section 4.1
+- [ ] Write substantive Markdown body prose for every new entity, concept, claim, and question page
+- [ ] Mark source-extracted claims `unverified` with `confidence: 0.60`
+- [ ] Add source-grounded evidence without overstating support
+- [ ] Include source part locators in evidence when available
+- [ ] Preserve human-authored prose
+- [ ] Update source extraction metadata
+- [ ] Write one operational log entry for the extraction batch
+- [ ] Report results
+## Schema Authority
+This skill owns extraction workflow guidance only. Runtime page schemas, ID formats, and common examples live in `WIKI.md` Section 4.1. Allowed enum values live in `WIKI.md` Sections 5, 6, 7, 8, 12, and 12.1. Use `AGENT-WIKI-SPEC-v2.md` only for project changes, ambiguity, or missing detail.
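
Reviewer sketch (not part of the package): the source-selection rule from Step 2, the default claim posture, and the aliased wikilink format from Step 6 can be expressed compactly. All names here (`SourceFrontmatter`, `needsExtraction`, `NEW_CLAIM_DEFAULTS`, `toAliasedWikilink`) are hypothetical. The stem rule, which replaces dots in the ID with hyphens, is inferred from the single example in the skill and may not hold for every page type.

```typescript
// Hypothetical subset of source frontmatter relevant to extraction selection.
interface SourceFrontmatter {
  status?: string;
  sourceRole?: string;
}

// Step 2: extract only from unprocessed pages whose role is absent, whole, or part.
// Parent pages are manifests and are never extracted directly.
function needsExtraction(fm: SourceFrontmatter): boolean {
  if (fm.status !== "unprocessed") return false;
  return (
    fm.sourceRole === undefined ||
    fm.sourceRole === "whole" ||
    fm.sourceRole === "part"
  );
}

// Defaults the skill states for newly extracted claims.
const NEW_CLAIM_DEFAULTS = { status: "unverified", confidence: 0.6 };

// Assumption: the filename stem is the dotted ID with dots replaced by hyphens,
// inferred from the one example given in Step 6.
function toAliasedWikilink(pageId: string): string {
  const stem = pageId.split(".").join("-");
  return `[[${stem}|${pageId}]]`;
}

console.log(needsExtraction({ status: "unprocessed", sourceRole: "parent" }));
console.log(toAliasedWikilink("concept.principle.agentic-workflows"));
console.log(NEW_CLAIM_DEFAULTS.status);
```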

package/skills/import-link/SKILL.md ADDED

@@ -0,0 +1,101 @@
+---
+name: import-link
+description: Import a URL, link-derived capture, transcript, or pasted source directly into a canonical source page. Use when the user asks to import a link, capture an external source, ingest a URL, or save source material into the vault.
+---
+# Import Link
+## Configuration
+- Before first use, read `ONBOARD.md` and `skills/import-link/config.json`.
+- If local setup is uncertain, run `agent-wiki onboard --check` and use the read-only probe output to guide setup questions.
+- For first-run setup, prefer `agent-wiki onboard --check --questions` so the user can answer with compact letter choices.
+- If the user approves persisting local Python or conversion policy, use `agent-wiki onboard --write-config` with the approved flags. The writer creates local `_system/config.json` from `_system/config.example.json`.
+- Confirm `configured` is `true` before importing.
+- Do not assume a default model, browser profile, or external retrieval tool.
+- This checkout is the only wiki root. Write imports under this repository root using the relative directories in `skills/import-link/config.json`.
+- If retrieval modes or attachment policy are unknown, stop and ask the user to configure `skills/import-link/config.json`.
+- The default `manual_paste` retrieval mode requires no external tools. Other retrieval modes only apply when configured and available.
+- Do not create a virtual environment, install packages, write `_system/config.json`, or change `skills/import-link/config.json` unless the user explicitly asks for setup changes. Do not hand-edit `_system/config.json`; use `agent-wiki onboard --write-config` after approval.
+## Wiki Root
+- Run this skill from the repository root.
+- Do not accept a vault name, alternate root, or external destination for this workflow.
+- Use repository-relative paths for all writes.
+- Source pages are written under `sources/`.
+- Attachments are written under `_attachments/`.
+- Users who want multiple independent wikis should clone this repository into multiple folders and onboard each checkout separately.
+- If the incoming link is an `obsidian://` URI, follow `AGENT-WIKI-SPEC-v2.md` Section 8.6: do not launch the URI; resolve it only through `_system/config.json` `knownVaults`, and treat it as an opaque external reference when the target vault is not configured.
+## UUID Generation
+- Use `agent-wiki uuid` to generate a new UUID for each source attachment.
+## Source Slug
+- For any incoming URL or source, always generate a four-word slug for the source note.
+- Derive the four words by summarizing the content of the source note.
+## Source Schema (required, strictly enforced)
+Create source files in `sources/` using `agent-wiki create-page`. The scaffolder writes schema-compliant source pages, validates source parent/part requirements, and prevents duplicate IDs or target path overwrites.
+Use `WIKI.md` Section 4.1 as the routine source of truth for page-type schemas, ID formats, and examples. Use `WIKI.md` Sections 5 and 12 for status and source type enums. This skill owns the import workflow, not the source frontmatter schema.
+Use `WIKI.md` Section 13 for large-source parent and part handling. Consult `AGENT-WIKI-SPEC-v2.md` only when changing project behavior, resolving ambiguity, or when `WIKI.md` Sections 4.1, 5, 12, or 13 are insufficient.
+Use `AGENT-WIKI-SPEC-v2.md` Section 6.10 only when you need the page scaffolding contract or `agent-wiki create-page` option semantics.
+Newly imported ordinary source pages MUST use `status: unprocessed` and `sourceRole: whole`. The extraction workflow changes source pages to `status: processed` after knowledge primitives have been extracted.
+Large source parent pages MUST use `sourceRole: parent`. They SHOULD use `status: partitioned` while one or more child source parts remain unprocessed. Child source part pages MUST use `sourceRole: part` and `status: unprocessed`.
+## Deterministic Workflow
+1. **Deduplication Check:** Before capturing content, check existing source pages in `sources/` for a matching `originUrl`.
+   - If a source with the matching URL already exists, stop and inform the user that it has already been imported. Update the existing file only if the user requests it.
+2. Capture source content using the retrieval modes configured in `skills/import-link/config.json`.
+   - If direct fetch is available, try it first.
+   - If a transcript tool is configured and the source is a video, capture one English transcript when available and use it as the primary source body.
+   - If browser automation is configured and direct retrieval is blocked or incomplete, use the configured browser automation.
+   - If no configured retrieval mode works, ask the user to paste the source content or configure another retrieval method.
+3. Ensure wiki folders exist:
+   - `sources/`
+   - `sources/parts/` when the capture needs partitioning
+   - `_attachments/`
+4. Build a deterministic ID:
+   - Create the four-word source slug by summarizing the content of the source note. Derive it from the raw source note after the content has been captured in Steps 2 and 3.
+5. Decide whether to partition:
+   - If captured text is larger than roughly 25,000 words, or if an agent cannot reliably process the full source in one extraction pass, create a large source.
+   - Prefer semantic boundaries such as chapters, headings, appendices, transcript topics, or slide boundaries.
+   - Fall back to page ranges, timestamps, or other stable locators.
+   - Target 8,000-15,000 words per source part.
+   - Do not exceed 20,000 words per part unless preserving an indivisible structure requires it.
+   - Avoid splitting inside tables, code blocks, quoted blocks, or list structures when possible.
+6. Save the prepared source body or source-part bodies to temporary Markdown files outside the vault, then call `agent-wiki create-page` with `--no-log` for each canonical source page. The skill writes one batch log entry after the import succeeds.
+7. For an ordinary source, call the scaffolder with:
+   ```bash
+   agent-wiki create-page \
+     --type source \
+     --subtype <sourceType> \
+     --slug <sourceSlug> \
+     --title "<title>" \
+     --source-date <yyyy-mm-dd> \
+     --retrieved-at <yyyy-mm-dd> \
+     --source-url "<originUrl>" \
+     --source-role whole \
+     --body-file <prepared-source-body.md> \
+     --no-log
+   ```
+   Use `--origin-path` instead of `--source-url` only when the imported source is local. Add `--attachment <attachment>` once for each saved attachment. The body file must contain the full captured verbatim body with inline images and source URLs below the frontmatter. Images are saved to `_attachments/` using the filename `yyyy-mm-dd-<sourceSlug>-<UUID>-<index>.<ext>`, where `<index>` starts at 1 and increments for each attachment. If a video thumbnail is captured, place it at the top of the transcript. Inline images use Obsidian image syntax `![[filename]]`.
+8. For a large source, call the scaffolder once for each child source part, then once for the parent manifest:
+   - part pages use `--source-role part`, `--parent-source-id <parentSourceId>`, `--part-index <n>`, `--part-count <count>`, and `--locator "<locator>"`
+   - the parent page uses `--source-role parent`, `--source-part <partPath>` once for each ordered child part path, and `--part-count <count>`
+   - the parent body should stay short and should not contain the full long-form source text
+   - each part body file must contain the verbatim text for its segment
+   - use `--no-log` on each scaffolder call and log the import once after all pages and attachments are written
+9. Write one operational log entry for the import:
+   ```bash
+   agent-wiki log --message "import-link: imported source <sourceId> to sources/<filename>; attachments=<count>"
+   ```
+   Log only after the source page and any attachments have been written successfully.
+10. Confirm in chat with:
+    - source path
+    - source part paths when a large source was partitioned
+    - number of attachments saved
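
Reviewer sketch (not part of the package): the partition thresholds in Step 5 and the attachment naming pattern in Step 7 lend themselves to two small helpers. `estimatePartCount` and `attachmentFilename` are hypothetical names. The even split by word count is only a rough planning estimate, since the skill prefers semantic boundaries, and the second trigger in Step 5 (an agent being unable to process the source in one pass) is not modeled here.

```typescript
// Thresholds taken from Step 5 of the import-link skill.
const PARTITION_THRESHOLD_WORDS = 25000; // "larger than roughly 25,000 words"
const TARGET_MAX_WORDS_PER_PART = 15000; // upper end of the 8,000-15,000 target

// Rough estimate only: the skill prefers semantic boundaries (chapters, headings),
// so real part sizes will vary around this even split.
function estimatePartCount(wordCount: number): number {
  if (wordCount <= PARTITION_THRESHOLD_WORDS) return 1; // ordinary whole source
  return Math.ceil(wordCount / TARGET_MAX_WORDS_PER_PART);
}

// Builds `yyyy-mm-dd-<sourceSlug>-<UUID>-<index>.<ext>`; the index starts at 1.
function attachmentFilename(
  date: string,
  sourceSlug: string,
  uuid: string,
  index: number,
  ext: string,
): string {
  if (!Number.isInteger(index) || index < 1) {
    throw new RangeError("attachment index starts at 1");
  }
  return `${date}-${sourceSlug}-${uuid}-${index}.${ext}`;
}

console.log(estimatePartCount(40000));
console.log(
  attachmentFilename(
    "2025-01-15",
    "acme-corp-founding-history",
    "123e4567-e89b-12d3-a456-426614174000",
    1,
    "png",
  ),
);
```

Dividing by the 15,000-word upper target keeps every evenly split part inside the 8,000-15,000 range for any source above the 25,000-word threshold.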

package/skills/import-link/config.json ADDED

@@ -0,0 +1,12 @@
+{
+  "schemaVersion": 1,
+  "configured": false,
+  "retrievalModes": [
+    "manual_paste"
+  ],
+  "browserProfile": null,
+  "youtubeTranscriptTool": null,
+  "attachmentPolicy": "copy",
+  "sourceDirectory": "sources",
+  "attachmentDirectory": "_attachments"
+}
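
Reviewer sketch (not part of the package): the import-link skill says to confirm `configured` is `true` and to stop when retrieval modes or attachment policy are unknown. A pre-import gate over this config file might look like the following. The `ImportLinkConfig` interface mirrors the fields shown above, while `importBlockers` and its messages are hypothetical. Treating an empty `retrievalModes` array or an empty `attachmentPolicy` as "unknown" is an assumption.

```typescript
// Mirrors the fields in skills/import-link/config.json as shipped in 2.0.0.
interface ImportLinkConfig {
  schemaVersion: number;
  configured: boolean;
  retrievalModes: string[];
  browserProfile: string | null;
  youtubeTranscriptTool: string | null;
  attachmentPolicy: string | null;
  sourceDirectory: string;
  attachmentDirectory: string;
}

// Returns the reasons an import should not proceed; an empty array means go ahead.
function importBlockers(config: ImportLinkConfig): string[] {
  const blockers: string[] = [];
  if (config.configured !== true) blockers.push("configured is not true");
  // Assumption: an empty list counts as "retrieval modes unknown".
  if (config.retrievalModes.length === 0) blockers.push("no retrieval modes configured");
  // Assumption: null or empty counts as "attachment policy unknown".
  if (!config.attachmentPolicy) blockers.push("attachment policy is unknown");
  return blockers;
}

// The shipped defaults, copied from the diff above.
const shippedDefaults: ImportLinkConfig = {
  schemaVersion: 1,
  configured: false,
  retrievalModes: ["manual_paste"],
  browserProfile: null,
  youtubeTranscriptTool: null,
  attachmentPolicy: "copy",
  sourceDirectory: "sources",
  attachmentDirectory: "_attachments",
};

console.log(importBlockers(shippedDefaults));
```

With the shipped defaults the only blocker is the `configured` flag, which matches the skill's expectation that onboarding runs before the first import.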