npm - pi-vault-mind - Versions diffs - 0.7.0 - Mend

pi-vault-mind 0.7.0

Files changed (46)

package/LICENSE +21 -0
package/README.md +428 -0
package/dist/index.d.ts +1 -0
package/dist/index.js +1 -0
package/dist/src/commands.d.ts +9 -0
package/dist/src/commands.js +813 -0
package/dist/src/events.d.ts +13 -0
package/dist/src/events.js +236 -0
package/dist/src/graph.d.ts +3 -0
package/dist/src/graph.js +234 -0
package/dist/src/index.d.ts +2 -0
package/dist/src/index.js +61 -0
package/dist/src/lance.d.ts +40 -0
package/dist/src/lance.js +409 -0
package/dist/src/server.d.ts +25 -0
package/dist/src/server.js +180 -0
package/dist/src/settings-ui.d.ts +9 -0
package/dist/src/settings-ui.js +313 -0
package/dist/src/state.d.ts +2 -0
package/dist/src/state.js +16 -0
package/dist/src/tools.d.ts +2 -0
package/dist/src/tools.js +772 -0
package/dist/src/types.d.ts +103 -0
package/dist/src/types.js +51 -0
package/dist/src/utils.d.ts +17 -0
package/dist/src/utils.js +102 -0
package/dist/src/vault-writer.d.ts +17 -0
package/dist/src/vault-writer.js +141 -0
package/dist/src/watcher.d.ts +91 -0
package/dist/src/watcher.js +411 -0
package/dist/src/widget.d.ts +3 -0
package/dist/src/widget.js +12 -0
package/dist/test/index.test.d.ts +1 -0
package/dist/test/index.test.js +368 -0
package/package.json +83 -0
package/skills/vault-mind/SKILL.md +260 -0
package/skills/vault-mind/references/tool-reference.md +53 -0
package/skills/vault-mind-broadcaster/SKILL.md +112 -0
package/skills/vault-mind-heavy-lifter/SKILL.md +34 -0
package/skills/vault-mind-manager/SKILL.md +35 -0
package/skills/vault-mind-miner/SKILL.md +40 -0
package/skills/vault-mind-setup/SKILL.md +385 -0
package/skills/vault-mind-setup/references/obsidian-cli-and-plugins.md +269 -0
package/skills/vault-mind-setup/references/obsidian-vault-structure.md +106 -0
package/skills/vault-mind-setup/references/pi-extension-wiring.md +236 -0
package/skills/vault-mind-setup/references/troubleshooting-tree.md +147 -0

package/skills/vault-mind/SKILL.md ADDED

@@ -0,0 +1,260 @@
+---
+name: vault-mind
+description: Universal append-only JSONL collection with LanceDB semantic search, full-text search, graph relationships, and tiered HITL (strict, gated, autopilot). Use when tracking structured facts, decisions, requirements, research, or any append-only record across projects. Supports dynamic context injection into prompts. Install the pi-vault-mind extension first.
+---
+# pi-vault-mind
+Universal append-only JSONL collection with local LanceDB semantic search, graph relationships, and dynamic context injection.
+## When to use this skill
+- Tracking research findings or verified facts with citations
+- Building a decision log with rationale and traceability
+- Maintaining a requirements registry
+- Any domain where you need append-only structured records with HITL review
+## Prerequisites
+1. **Embedding provider** — choose one:
+   - `@xenova/transformers` — built-in, offline-capable (all-MiniLM-L6-v2, 384 dims)
+   - `ollama` — requires Ollama running locally with `embeddinggemma` (768 dims)
+2. **pi-vault-mind extension** must be installed:
+   ```bash
+   pi install npm:pi-vault-mind
+   pi -e npm:pi-vault-mind                  # try without installing
+   ```
+## Quick Start
+Run `/wiki init` in your project root. This scaffolds:
+```
+pi-vault-mind.config.json    ← config
+.lancedb/                    ← LanceDB vector store (auto-created on first append)
+collections/
+  main.jsonl                 ← primary collection (JSONL WAL, durable)
+  pending.jsonl              ← pending review queue
+```
+## Commands
+All commands are subcommands of `/wiki`:
+| Command | Purpose |
+|---------|---------|
+| `/wiki init` | Scaffold config, collections, and artifact templates |
+| `/wiki validate` | Check LanceDB connection, config, and all collection paths |
+| `/wiki approve [collection]` | Batch-review pending entries (default: main) |
+| `/wiki settings` | Open interactive settings dashboard |
+| `/wiki audit` | Audit config for missing defaults |
+| `/wiki reindex [--all] [--reembed]` | Rebuild FTS + vector indexes |
+| `/wiki collection select` | Select active collection (shortcut: `ctrl+alt+l`) |
+| `/wiki collection create` | Interactive wizard to create a new collection |
+| `/wiki injector create` | Interactive wizard to create a new injector |
+| `/wiki context status \| enable \| disable` | Manage pi-context integration |
+| `/wiki embedding status` | Show embedding config + Ollama model availability |
+| `/wiki embedding use <ollama\|transformers>` | Switch embedding provider |
+| `/wiki embedding model <name>` | Set Ollama embedding model |
+| `/wiki embedding models` | List available Ollama models |
+| `/wiki embedding pull <name>` | Pull a model from Ollama |
+| `/wiki watcher start \| stop \| status` | Manage passive file watcher |
+| `/wiki server status` | Show HTTP server health + port |
+| `/wiki help` | Show usage help |
+## Embedding Configuration
+### Ollama (higher quality)
+```bash
+ollama pull embeddinggemma  # default (768 dimensions)
+# In Pi:
+/wiki embedding use ollama
+/wiki embedding model embeddinggemma
+```
+### Transformers (offline, zero setup)
+Uses `all-MiniLM-L6-v2` (384 dimensions). Downloads the ONNX model on first use.
+```bash
+/wiki embedding use transformers   # default
+```
+## Obsidian Bridge
+pi-vault-mind runs an HTTP server on `http://127.0.0.1:11435` (configurable
+via `wiki.httpPort`). The `obsidian-shellcommands` plugin can POST file-save
+events to `/vault-mind/scan` for explicit, reliable file-watching instead of
+relying solely on `fs.watch`.
+```bash
+# Shell command config in Obsidian:
+curl -s -X POST http://127.0.0.1:11435/vault-mind/scan \
+  -H "Content-Type: application/json" \
+  -d '{"file":"{{file_path:absolute}}"}'
+```
+Endpoints:
+- `GET /vault-mind/status` — health, uptime, dispatch records
+- `POST /vault-mind/scan` — scan a file for `@agent` markers
+- `POST /vault-mind/dispatch` — reserved for manual dispatch (future)
+## Tools
+### wiki_search
+Semantic vector search across LanceDB-indexed collections.
+- `collection` (string, default: "main") — target collection
+- `query` (string) — natural-language query
+- `limit` (number, optional) — max results (default 5)
+### wiki_fts_search
+Full-text keyword search using Tantivy BM25. Use for exact term/phrase matching.
+- `collection` (string, default: "main") — target collection
+- `query` (string) — keyword or phrase
+- `limit` (number, optional) — max results (default 5)
+### wiki_graph_query
+BFS traversal of entity connections in the graph layer.
+- `entity` (string) — entity to find relations for
+- `depth` (number, optional, default: 1) — traversal depth
+### wiki_status
+Show LanceDB table sizes and health. No parameters.
+### query_wiki
+Deterministic JSONL search by collection name.
+- `collection` (string) — collection name
+- `query` (string, optional) — free-text substring search
+- `filters` (object, optional) — exact `{field: value}` matches
+### append_wiki
+Append to a collection. Dual-writes to JSONL WAL + LanceDB with auto-embedding.
+- `collection` (string) — target collection
+- `mode` ("strict" | "gated" | "autopilot") — HITL mode
+- `entry` (object) — keys matching the collection schema
+### configure_wiki
+Read or update extension config at runtime.
+- `action` ("read" | "update")
+- `config` (object, optional for update)
+### describe_wiki
+Introspect schema, entry count, sample entries.
+- `collection` (string) — collection name
+### wiki_stats
+Dashboard: counts, sizes, and LanceDB status for all collections. No parameters.
+### wiki_export
+Export to JSON, CSV, or Markdown.
+- `collection` (string) — collection name
+- `format` ("json" | "csv" | "markdown")
+### promote_wiki
+Promote entries between collections via pending queue.
+- `sourceCollection` (string) — source collection
+- `targetCollection` (string) — destination collection
+- `entryIds` (string[]) — entry IDs to promote
+- `reason` (string) — why these entries should be promoted
+## Configuration
+Edit `pi-vault-mind.config.json`. Top-level keys:
+- `version` — must be `2`
+- `collections` — named collection definitions
+- `injectors` — regex triggers for auto-injecting context
+- `wiki` — LanceDB + embedding + vault settings (replaces the old `qmd` block)
+### Collection definition
+```json
+"main": {
+  "path": "collections/main.jsonl",
+  "schema": ["id", "domain", "source", "fact", "tag", "artifact"],
+  "dedupField": "fact"
+}
+```
+| Key | Description |
+|-----|-------------|
+| `path` | File path (relative to cwd) |
+| `schema` | Ordered field names, or a string referencing another collection |
+| `dedupField` | Field checked for duplicates in autopilot mode |
+### Wiki (LanceDB + embedding) settings
+```json
+"wiki": {
+  "dataDir": ".lancedb",
+  "embedding": {
+    "provider": "transformers",
+    "ollamaModel": "embeddinggemma",
+    "ollamaHost": "http://127.0.0.1:11434"
+  },
+  "ftsEnabled": true,
+  "graph": {
+    "enabled": true,
+    "canvasSync": false
+  }
+}
+```
+| Key | Default | Description |
+|-----|---------|-------------|
+| `dataDir` | `.lancedb` | LanceDB storage directory |
+| `embedding.provider` | `transformers` | `ollama` or `transformers` |
+| `embedding.ollamaModel` | `embeddinggemma` | Ollama embedding model name |
+| `ftsEnabled` | `true` | Enable Tantivy full-text search |
+| `graph.enabled` | `true` | Enable entity/relation extraction |
+## Architecture
+Dual-write design:
+1. **JSONL WAL** — human-readable, crash-safe, version-control-friendly
+2. **LanceDB** — local vector database for semantic search + FTS + graph traversal
+On append: JSONL write → LanceDB upsert with auto-embedding → entity extraction.
+## Workflow
+### 1. Scaffold
+```
+/wiki init
+/wiki validate
+```
+### 2. Capture facts
+```
+append_wiki(collection="main", mode="gated", entry={
+  "id": "REQ-042", "domain": "auth",
+  "fact": "Users must authenticate via SSO only", "tag": "login"
+})
+```
+### 3. Search
+```
+wiki_search(collection="main", query="authentication requirements")
+wiki_fts_search(collection="main", query="SSO")
+wiki_graph_query(entity="Authentication", depth=2)
+```
+### 4. Approve pending
+```
+/wiki approve main
+```
+### 5. Knowledge promotion
+```
+promote_wiki(sourceCollection="research", targetCollection="main", entryIds=["id1"], reason="Broad applicability")
+/wiki approve
+```
+## Troubleshooting
+| Problem | Solution |
+|---------|----------|
+| `LanceDB connection failed` | Verify `wiki.dataDir` is writable |
+| `Unknown collection` | Run `/wiki validate` to see configured collections |
+| `Duplicate detected` | Change `dedupField` or provide a unique value |
+| Injector not firing | Test the injector's regex against the prompt text |
+| Ollama not reachable | Run `/wiki embedding status` to diagnose |
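
The collection definition in the skill above (`path`, `schema`, `dedupField`) implies an append path that never rewrites existing lines and rejects repeated values of the dedup field. The following is a minimal Python sketch of that idea only. The helper name `append_entry`, the linear scan, and the exact duplicate rule are assumptions for illustration and are not taken from the package, which also writes to LanceDB.

```python
import json
import os


def append_entry(path, schema, dedup_field, entry):
    """Append one entry to a JSONL file unless its dedup value already exists.

    Hypothetical helper: it only illustrates append-only storage plus a
    dedupField check. Returns True if written, False if skipped as duplicate.
    """
    unknown = set(entry) - set(schema)
    if unknown:
        raise ValueError(f"fields not in schema: {sorted(unknown)}")

    value = entry.get(dedup_field)
    # Linear scan of existing records; acceptable for small collections.
    if value is not None and os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line and json.loads(line).get(dedup_field) == value:
                    return False

    # Append-only: earlier lines are never modified.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return True


# Field names copied from the "main" collection definition in the skill.
MAIN_SCHEMA = ["id", "domain", "source", "fact", "tag", "artifact"]
```

Calling `append_entry("collections/main.jsonl", MAIN_SCHEMA, "fact", {...})` twice with the same `fact` text would write the first entry and skip the second, even if the `id` differs.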

package/skills/vault-mind/references/tool-reference.md ADDED

@@ -0,0 +1,53 @@
+# pi-vault-mind Tool Reference
+## Tools (LLM-accessible)
+| Tool | Purpose | Search Type |
+|------|---------|-------------|
+| `wiki_search` | Semantic vector search | Vector (cosine similarity) |
+| `wiki_fts_search` | Full-text keyword search | Tantivy BM25 |
+| `wiki_graph_query` | Entity relationship traversal | Graph BFS |
+| `wiki_status` | LanceDB table health | Metadata |
+| `query_wiki` | JSONL deterministic search | Substring + exact filters |
+| `append_wiki` | Dual-write: JSONL + LanceDB | Insert with auto-embed |
+| `configure_wiki` | Read/update config | Config |
+| `describe_wiki` | Introspect collection schema + stats | Metadata |
+| `wiki_stats` | Dashboard for all collections | Metadata |
+| `wiki_export` | Export to JSON/CSV/Markdown | Read |
+| `promote_wiki` | Stage entries for cross-collection promotion | Write |
+## Commands
+| Command | Handler |
+|---------|---------|
+| `/wiki init` | Scaffold config + collections |
+| `/wiki validate` | Health check |
+| `/wiki approve [collection]` | Batch-review pending |
+| `/wiki settings` | Interactive settings dashboard |
+| `/wiki audit` | Config audit + repair |
+| `/wiki reindex [--all] [--reembed]` | Rebuild FTS/vector indexes |
+| `/wiki collection select` | Select active collection |
+| `/wiki collection create` | Create new collection (wizard) |
+| `/wiki injector create` | Create new injector (wizard) |
+| `/wiki context status \| enable \| disable` | Manage pi-context |
+| `/wiki embedding status \| use \| model \| models \| pull` | Embedding management |
+| `/wiki watcher start \| stop \| status` | Manage file watcher |
+| `/wiki server status` | HTTP server health + port |
+| `/wiki help` | Show usage help |
+## Internal Modules
+### lance.ts
+- `connect(dataDir)` — LanceDB connection (cached)
+- `upsertEntry(dataDir, collectionName, entry, cfg)` — Insert with auto-embed
+- `searchHybrid(dataDir, collectionName, query, limit, cfg)` — Vector search
+- `searchFts(dataDir, collectionName, query, limit, cfg)` — Pure FTS search
+- `getStatus(dataDir)` — Table health report
+- `testOllamaConnection(pi)` — Multi-path Ollama reachability
+- `discoverOllamaModels(pi)` — Model discovery via pi.exec/HTTP/models.json
+- `pullOllamaModel(model, pi)` — Pull via pi.exec/HTTP
+### server.ts
+- `startServer(pi, serverState, watcherState)` — Start HTTP server on `127.0.0.1`
+- `stopServer(serverState)` — Graceful shutdown
+- Endpoints: `/vault-mind/status` (GET), `/vault-mind/scan` (POST), `/vault-mind/dispatch` (POST)
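
The `/vault-mind/scan` endpoint listed above can also be called from a script instead of `curl`. The sketch below uses only the Python standard library and reuses the default port (`11435`) and the `{"file": ...}` payload shown in the skill's Obsidian Bridge example. The function names are hypothetical, and the response body format is not documented in this diff, so it is returned as raw text.

```python
import json
import urllib.request


def build_scan_request(file_path, host="127.0.0.1", port=11435):
    """Build (but do not send) the POST request for /vault-mind/scan."""
    body = json.dumps({"file": file_path}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/vault-mind/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and return (HTTP status code, raw response text).

    The response schema is an unknown here, so nothing is parsed.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Separating request construction from sending keeps the payload easy to inspect without a running server; `send(build_scan_request("/path/to/note.md"))` would perform the actual call.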

package/skills/vault-mind-broadcaster/SKILL.md ADDED

@@ -0,0 +1,112 @@
+---
+name: vault-mind-broadcaster
+description: The NotebookLM Controller agent for pi-vault-mind. Triggered by @agent:podcast or status:needs-podcast. Uses nlm CLI or MCP tools to generate audio overviews, study guides, and flashcards from vault sources.
+skills:
+  - nlm-skill
+---
+# Broadcaster (NotebookLM Controller) Agent
+You are the Broadcaster agent. You transform raw markdown and structured wiki data into highly consumable human artifacts using Google's NotebookLM.
+## 🎯 What You Do
+When the Vault Watcher detects `@agent-broadcaster` (or `@agent-podcast`) on a note, you are forked into an isolated session and execute a fully automated pipeline:
+1. **Gather sources** — read the marked files or folders from the vault
+2. **Upload to NotebookLM** — create a notebook and add sources
+3. **Generate artifact** — request audio overview, study guide, or quiz
+4. **Download & place** — save output to `Vault/Agent/Presentations/`
+5. **Report & terminate** — write a summary note, embed the artifact link, exit
+## 🔧 Tool Detection (do this FIRST)
+Before executing, check which tools are available:
+```
+nlm-skill      → you received this skill? Use its detection logic
+mcp__*         → check for mcp__notebooklm-mcp__* tools in available tools
+nlm CLI        → run `nlm --help` via bash to check CLI availability
+```
+**Prefer MCP tools** when available (no shell overhead, structured output).
+**Fall back to `nlm` CLI** via bash if MCP tools are absent.
+## 📋 Standard Workflow
+### Step 1: Gather sources
+Read the vault file(s) referenced in your dispatch task. Extract the path to any markdown files or folders that should be added as NotebookLM sources.
+### Step 2: Create notebook & add sources
+```
+# MCP path:
+mcp__notebooklm-mcp__notebook_create(name="Q2 Planning Deep Dive")
+mcp__notebooklm-mcp__source_add(notebook_id="...", source_type="text", content="...")
+# CLI fallback:
+nlm notebook create "Q2 Planning Deep Dive"
+nlm notebook add-source <notebook_id> --text "$(cat /path/to/notes.md)"
+```
+### Step 3: Generate artifact
+```
+# MCP path:
+mcp__notebooklm-mcp__studio_create(
+  notebook_id="...",
+  artifact_type="audio",
+  format="deep_dive"
+)
+# CLI fallback:
+nlm studio create <notebook_id> --type audio --format deep-dive
+```
+### Step 4: Poll until ready
+```
+# MCP:
+mcp__notebooklm-mcp__studio_status(notebook_id="...")
+# CLI:
+nlm studio status <notebook_id>
+```
+Wait for status `complete`. Poll every 10 seconds, for at most 5 minutes.
+### Step 5: Download & place
+```
+# MCP:
+mcp__notebooklm-mcp__download_artifact(
+  notebook_id="...",
+  output_path="Vault/Agent/Presentations/q2-deep-dive.wav"
+)
+# CLI:
+nlm studio download <notebook_id> --output "Vault/Agent/Presentations/q2-deep-dive.wav"
+```
+### Step 6: Report
+Write a brief completion note to the vault:
+```markdown
+# Podcast: Q2 Planning Deep Dive
+Generated: 2026-06-06
+Source: [[Meeting Notes Q2]]
+Notebook: Q2 Planning Deep Dive
+![[q2-deep-dive.wav]]
+```
+## 🏷️ Supported Artifact Types
+| Type | format options | Output |
+|------|---------------|--------|
+| `audio` | `deep_dive`, `casual`, `summary` | `.wav` |
+| `report` | `study_guide`, `briefing_doc`, `faq` | `.md` |
+| `quiz` | — | `.md` |
+## ⚠️ Guidelines
+- **Do not chat.** Execute silently and terminate — your output is read by the Manager agent or logged, not by a human.
+- **Always headless.** Use `confirm=True` or equivalent flags to bypass interactive prompts.
+- **Auth check.** If NotebookLM returns 401/403, report: "Run `nlm login` in your terminal to authenticate with Google."
+- **Timeout.** If generation exceeds 5 minutes, report status and terminate — the user can check manually.
+- **One notebook per task.** Create a fresh notebook each time to keep sources clean. NotebookLM has per-notebook source limits.
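
The polling rule in Step 4 of the Broadcaster skill (check every 10 seconds, give up after 5 minutes, stop on `complete`) can be sketched as a small loop. This is an illustrative Python sketch, not the agent's implementation: `get_status` is a hypothetical callable standing in for whichever status check is available (MCP tool or `nlm` CLI), and the injectable `sleep` and `clock` parameters exist only to make the loop testable.

```python
import time


def poll_until_complete(get_status, interval=10.0, timeout=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns "complete" or the timeout elapses.

    Returns the last status seen, so the caller can report a timeout
    instead of waiting indefinitely.
    """
    deadline = clock() + timeout
    status = get_status()
    # Only sleep again if the next check still falls within the deadline.
    while status != "complete" and clock() + interval <= deadline:
        sleep(interval)
        status = get_status()
    return status
```

If the returned value is anything other than `complete`, the guideline above applies: report the status and terminate rather than continuing to wait.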

package/skills/vault-mind-heavy-lifter/SKILL.md ADDED

@@ -0,0 +1,34 @@
+---
+name: vault-mind-heavy-lifter
+description: The Delegator agent for pi-vault-mind. Triggered by @agent:deep-analysis or @agent:refactor. Acts as a commander for external coding assistants (Claude Code, Gemini CLI) to handle massive context windows or complex codebase refactoring.
+---
+# Heavy-Lifter (Delegator) Agent
+You are the Heavy-Lifter agent. You exist to solve the context window limitation problem. When a task requires analyzing an entire codebase, reading a 300-page book, or performing cross-file refactoring that exceeds standard LLM context windows, you step in to delegate the work to specialized external tools.
+## 🎯 Role & Responsibilities
+1. **External Delegation**: You do not perform the analysis yourself. You formulate prompts and pass them to external CLI agents (like Claude Code, Gemini CLI, or `pi-shell-acp`).
+2. **Context Bridging**: You extract the user's goal from the Obsidian Vault, translate it into a prompt for the external agent, and pipe the massive source material to that agent.
+3. **Output Capture**: Once the external agent finishes its run, you capture its standard output, extract the relevant insights, and pass them to the **Miner** (or write them directly via `append_wiki`) for permanent storage in LanceDB.
+## 🛠️ Key Tools
+- `bash`: To execute CLI commands (e.g., `claude -p "Analyze this directory..."`).
+- `subagent`: To spawn the Miner if the resulting data needs complex semantic extraction.
+- `append_wiki`: To store the high-level summary of the external agent's work.
+- `tui-use` (if available): To interact with interactive terminal interfaces if the external agent requires it.
+## 🔄 Interaction Workflow
+1. **Trigger**: You are spawned when the user tags a massive PDF or a root code directory with `@agent:deep-analysis`.
+2. **Formulation**: You read the user's specific request and construct a CLI command targeting Claude Code (or similar).
+3. **Execution**: You run the command via `bash` (e.g., `claude -p "Read PDF X. Extract all architectural decisions into a JSON array."`).
+4. **Capture**: You capture the JSON output.
+5. **Storage**: You loop through the output and use `append_wiki` to save each decision into the `main` or `decisions` collection.
+6. **Terminate**: You replace the original `@agent` marker in the Vault with a summary of the operation and terminate.
+## ⚠️ Guidelines
+- **DO NOT** attempt to read massive files directly into your own context window using `read` or `bash cat`. Doing so will crash the session or truncate the input. Always delegate to the external CLI.
+- Ensure the external CLI commands are executed non-interactively or wrapped appropriately so they do not hang waiting for human input.
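
The Capture and Storage steps of the Heavy-Lifter workflow (take the external agent's output, pull out the JSON array, then append one entry per decision) can be sketched as follows. This Python sketch rests on several assumptions: the external CLI may wrap the array in extra text, each decision is a plain string, and the `DEC-NNN` id scheme is invented for illustration. The resulting dictionaries use field names from the `main` collection schema and would be passed to `append_wiki` one at a time.

```python
import json


def extract_json_array(stdout):
    """Return the first parseable JSON array in stdout, ignoring other text.

    Limitation: bracketed text such as "[1]" earlier in the output would be
    picked up first, so real output may need a stricter marker.
    """
    decoder = json.JSONDecoder()
    start = stdout.find("[")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(stdout, start)
            return value
        except json.JSONDecodeError:
            start = stdout.find("[", start + 1)
    raise ValueError("no JSON array found in output")


def decisions_to_entries(decisions, domain, source, id_prefix="DEC"):
    """Map each decision to an entry dict; the id scheme is hypothetical."""
    return [
        {
            "id": f"{id_prefix}-{number:03d}",
            "domain": domain,
            "source": source,
            "fact": str(decision),
        }
        for number, decision in enumerate(decisions, start=1)
    ]
```

Parsing defensively matters here because the agent never sees the source material itself; the external tool's stdout is the only input it has.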

package/skills/vault-mind-manager/SKILL.md ADDED

@@ -0,0 +1,35 @@
+---
+name: vault-mind-manager
+description: The primary interactive agent (Synthesizer) for pi-vault-mind. Use for high-level planning, roadmap synthesis, and reviewing the output of subagents. Controls the overall narrative and final approval of wiki content.
+---
+# Manager (Synthesizer) Agent
+You are the Manager agent, the primary interactive intelligence for the `pi-vault-mind` ecosystem and Obsidian vault integration. You act as the lead author, architect, and synthesizer.
+## 🎯 Role & Responsibilities
+1. **High-Level Synthesis**: You do not do the grunt work of reading 100-page PDFs or transcribing podcasts. You delegate that to subagents (Miner, Heavy-Lifter, Broadcaster). Your job is to take their distilled outputs, synthesize them into a cohesive narrative or roadmap, and present them to the human user.
+2. **Quality Control (The Final Gate)**: You review the content produced by the Miner before it is finalized in the human-facing Obsidian Vault.
+3. **Vault Syncing**: You use the `wiki_sync` tool to push approved structured knowledge from the internal LanceDB/JSONL layer into the Obsidian Vault as Markdown or Canvas files.
+4. **Task Delegation**: You use the `subagent` tool to spawn specialized agents when the user asks you to perform deep research or heavy lifting.
+## 🛠️ Key Tools
+- `subagent`: To dispatch the Miner, Heavy-Lifter, or Broadcaster.
+- `wiki_search`: To quickly retrieve synthesized concepts from LanceDB.
+- `wiki_sync`: To push final, approved content to the Obsidian Vault (`format="markdown"` or `format="canvas"`).
+- `promote_wiki`: To elevate insights from a scratchpad or research collection into the `main` collection.
+## 🔄 Interaction Workflow
+1. **User Request**: The user asks you to "Map out the architecture for Project X based on the new docs."
+2. **Delegation**: You spawn the **Miner** via `subagent` to ingest and extract entities from the new docs.
+3. **Retrieval**: Once the Miner completes, you use `wiki_search` or `wiki_graph_query` to pull the extracted entities.
+4. **Synthesis**: You draft the final architecture document.
+5. **Sync**: You use `wiki_sync` to push this document into the Obsidian Vault.
+## ⚠️ Guidelines
+- Do not get bogged down in deep reading or code refactoring. Delegate.
+- Maintain the "voice" of the project (refer to any voice guidelines in the vault).
+- Ensure all factual claims have provenance in the `main` collection before presenting them as truth.

package/skills/vault-mind-miner/SKILL.md ADDED

@@ -0,0 +1,40 @@
+---
+name: vault-mind-miner
+description: The Wiki Researcher agent for pi-vault-mind. Triggered by @agent:research or @agent:ingest. Responsible for knowledge ingestion, entity extraction, deduplication, and writing to the wiki collections.
+---
+# Miner (Wiki Researcher) Agent
+You are the Miner agent. You operate silently in the background (usually in a forked session) to do the heavy lifting of knowledge ingestion and semantic extraction.
+## 🎯 Role & Responsibilities
+1. **Passive Ingestion**: You are typically woken up by the Watcher when a user drops a new file into the Vault's Inbox or tags a file with `@agent:ingest`.
+2. **Document Normalization**: You read raw documents (PDFs, DOCX, web pages). If necessary, you use CLI tools like `any2md` to convert them into clean, LLM-optimized Markdown.
+3. **Semantic Extraction**: You read the markdown and extract:
+   - Key concepts and entities.
+   - Core claims and facts.
+   - Inter-entity relationships (e.g., "Concept A depends on Concept B").
+4. **Deduplication**: Before saving, you always run `wiki_search` to ensure you aren't creating a duplicate entity or claim.
+5. **Knowledge Persistence**: You use `append_wiki` to write your extracted facts to the `main` or `research` collection. (This automatically vectorizes the data into LanceDB.)
+## 🛠️ Key Tools
+- `wiki_ingest`: Convert URLs, PDFs, or files into clean markdown via any2md. Use this FIRST when the source is a URL or external document.
+- `read` / `bash`: To access and parse raw files from the Vault.
+- `wiki_search`: To query existing knowledge and prevent duplicates.
+- `append_wiki`: To write structured facts to the wiki collections. Set `sync="auto"` if the fact is substantial enough to warrant an immediate Obsidian markdown page.
+## 🔄 Interaction Workflow
+1. **Trigger**: You are spawned with a specific task from the Vault Watcher (e.g., `@agent-miner ingest this paper`).
+2. **Ingest**: If the source is a URL, call `wiki_ingest(source="https://...")` to fetch and convert to markdown. If it's a PDF or DOCX, use `bash` with `npx any2md <file>`. If it's already markdown in the vault, `read` it directly.
+3. **Extract**: From the markdown, isolate key claims, entities, and relations.
+4. **Verify**: Search the wiki to ensure they are novel.
+5. **Append**: Call `append_wiki` for each extracted fact.
+6. **Report**: Output a brief summary of what was ingested and terminate.
+## ⚠️ Guidelines
+- Be concise. Your output is usually read by another agent (the Manager) or written to a log, not read directly by the human in a chat window.
+- Focus strictly on factual extraction. Do not synthesize opinions unless asked.
+- Ensure your `append_wiki` entries strictly follow the schema of the target collection (usually `id`, `domain`, `source`, `fact`, `tag`, `artifact`).
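
The Ingest step of the Miner workflow is a routing decision: URLs go to `wiki_ingest`, PDF or DOCX files go through `npx any2md`, and Markdown already in the vault is read directly. A minimal Python sketch of that decision follows. The function name, the returned labels, and the `unsupported` fallback for other file types are assumptions, since the skill does not say what happens in that case.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def choose_ingest_route(source):
    """Pick an ingestion route for a source string.

    Returns one of "wiki_ingest", "any2md", "read", or "unsupported"
    (the last label is an assumption, not part of the skill).
    """
    if urlparse(source).scheme in ("http", "https"):
        return "wiki_ingest"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in (".pdf", ".docx"):
        return "any2md"
    if suffix in (".md", ".markdown"):
        return "read"
    return "unsupported"
```

Deciding the route before touching the file keeps the agent from reading a binary document into its context by mistake.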