npm - @syndash/research-vault-mcp - Versions diffs - 0.2.0 - Mend

@syndash/research-vault-mcp 0.2.0


Files changed (5)

package/README.md ADDED

@@ -0,0 +1,143 @@
+# research-vault MCP server
+MCP server exposing a local research-vault checkout to any MCP client over **stdio** or **Streamable HTTP**.
+Wave 2D — see `docs/superpowers/plans/2026-04-19-wave2d-research-vault-mcp.md`.
+## Architecture
+MCP client → this server → local `VAULT_ROOT` checkout → optional BGE-M3-compatible embedding endpoint.
+Phase 1: substring search over `.meta/registry.jsonl`, exact-ID read via `vault_get`, plus `vault_status` and `vault_taxonomy`. Phase 2: real BGE-M3 cosine over `.meta/embeddings.jsonl`.
+## Tools (v0)
+| Tool             | Args                                       | Returns                                                |
+|------------------|--------------------------------------------|--------------------------------------------------------|
+| `vault_search`   | `query` (str), `top_k?` (1-50), `mode?`    | Ranked hits with separated readability/index/analysis verdicts |
+| `vault_get`      | `id` (exact str), `include_content?`       | Authoritative exact-ID metadata/content read           |
+| `vault_status`   | —                                          | Registry counts, decay summary, last maintenance run   |
+| `vault_taxonomy` | —                                          | `knowledge/_taxonomy.md` verbatim                      |
+`mode` defaults to `substring`; `embedding` is wired in Phase 2.
+`vault_search` reports `verdict`, `readability_verdict`, `index_verdict`, and `analysis_verdict` separately. A readable raw note that has no `lastAnalyzedAt` returns `verdict: "PASS"` with `analysis_verdict: "NOT_ANALYZED"`; that means content is readable but has not been analyzed yet.
+## Search And Read Semantics
+`vault_search` is an index/search surface. It answers "did this query match registry metadata, and is the matched content readable?"
+Matching covers:
+- exact entry IDs, with exact ID hits marked as `matched: ["id_exact"]`
+- titles, categories, tags, and source URLs
+- punctuation-normalized text, so `Music/Rhythm` can be found with `music rhythm`, and `software-engineering/game-design` can be found with `software engineering game design`
+The top-level `verdict` is about search/read usability, not analysis freshness:
+```json
+{
+  "verdict": "PASS",
+  "readability_verdict": "PASS",
+  "index_verdict": "PASS",
+  "analysis_verdict": "NOT_ANALYZED"
+}
+```
+That shape means the result matched and the file is readable. It does not mean the entry has been analyzed. Missing `lastAnalyzedAt` is reported as `NOT_ANALYZED`, not as missing or broken content.
+`vault_get` is the authoritative exact-ID read path. It requires an exact `id`, reads `knowledgePath` first when present, and falls back to `rawPath` when the knowledge file is absent. Use `include_content: true` when the client needs the markdown body:
+```json
+{
+  "id": "20260608-readable-raw-note",
+  "include_content": true
+}
+```
+## Install From npm
+```bash
+npm install -g @syndash/research-vault-mcp
+```
+This package runs on Bun. Keep `VAULT_ROOT` pointed at your local vault checkout.
+## Install From Source
+```bash
+cd mcp
+bun install
+```
+## Run — stdio
+```bash
+VAULT_ROOT=/path/to/research-vault research-vault-mcp
+```
+From a source checkout:
+```bash
+VAULT_ROOT=/path/to/research-vault bun run dev
+```
+## Run — HTTP
+```bash
+VAULT_ROOT=/path/to/research-vault MCP_MODE=http MCP_HOST=127.0.0.1 MCP_PORT=8765 research-vault-mcp
+curl http://127.0.0.1:8765/health
+```
+Bind `MCP_HOST=0.0.0.0` only behind private network/auth controls.
+## Inspect (verify handshake + tool listing)
+```bash
+bun run inspect
+# = bunx @modelcontextprotocol/inspector bun run server.ts
+```
+Then in the Inspector UI: connect → list tools → call each one.
+## Environment
+| Var               | Default                          | Notes                                                  |
+|-------------------|----------------------------------|--------------------------------------------------------|
+| `MCP_MODE`        | `stdio`                          | `stdio` \| `http`                                      |
+| `MCP_HOST`        | `127.0.0.1`                      | Use `0.0.0.0` only behind private access controls      |
+| `MCP_PORT`        | `8765`                           | HTTP mode port                                         |
+| `VAULT_ROOT`      | parent dir of `mcp/`             | Absolute path to the vault checkout                    |
+| `EMBED_ENDPOINT`  | `http://127.0.0.1:8080`          | Optional BGE-M3-compatible embedding endpoint          |
+## Client Configuration
+Use your actual private MCP URL in your local client config:
+```yaml
+mcp_servers:
+  research_vault:
+    url: http://127.0.0.1:8765/mcp
+    timeout: 60
+    connect_timeout: 30
+```
+Tool names will appear as `mcp_research_vault_vault_search` etc.
+## Publishing OPSEC
+The npm package is intentionally allowlisted to runtime files only: `README.md`, `server.ts`, `types.ts`, and `tsconfig.json`.
+Before publishing, run:
+```bash
+npm pack --dry-run
+```
+Do not publish vault content, `.meta`, private hostnames, private IPs, service-token material, local absolute paths, or operator-specific deployment notes.
+## Non-goals (explicit)
+- No public internet exposure by default
+- No bundled vault content or registry data
+- No multi-tenant / multi-vault — one server, one `VAULT_ROOT`
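
Note on the README above: the punctuation-normalized matching described under "Search And Read Semantics" can be illustrated with a small standalone sketch. It mirrors the `normalizeSearchText` and `matchesQuery` helpers that appear in `server.ts` later in this diff, and the sample values are the ones the README uses. Treat it as an illustration under those assumptions, not as an exported API of the package.

```typescript
// Lowercase, collapse every run of non-alphanumeric characters to one space, then trim.
function normalizeSearchText(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
}

// True when the query is a raw substring of the value, or a substring after normalization.
function matchesQuery(value: string | undefined, query: string): boolean {
  if (!value) return false
  if (value.toLowerCase().includes(query.toLowerCase().trim())) return true
  const normalizedQuery = normalizeSearchText(query)
  return normalizedQuery.length > 0 && normalizeSearchText(value).includes(normalizedQuery)
}

// Sample values taken from the README's matching examples.
const slashHit = matchesQuery('Music/Rhythm', 'music rhythm')
const hyphenHit = matchesQuery('software-engineering/game-design', 'software engineering game design')
const miss = matchesQuery('Music/Rhythm', 'harmony')
console.log({ slashHit, hyphenHit, miss })
```

Both README examples match because `/` and `-` collapse to spaces on each side of the comparison; an unrelated query such as `harmony` does not match.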

package/package.json ADDED

@@ -0,0 +1,46 @@
+{
+  "name": "@syndash/research-vault-mcp",
+  "version": "0.2.0",
+  "description": "MCP server exposing local research-vault search and exact-ID reads",
+  "type": "module",
+  "bin": {
+    "research-vault-mcp": "server.ts"
+  },
+  "files": [
+    "README.md",
+    "server.ts",
+    "types.ts",
+    "tsconfig.json"
+  ],
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/Fearvox/ds-research-vault.git",
+    "directory": "mcp"
+  },
+  "bugs": {
+    "url": "https://github.com/Fearvox/ds-research-vault/issues"
+  },
+  "homepage": "https://github.com/Fearvox/ds-research-vault/tree/main/mcp#readme",
+  "license": "UNLICENSED",
+  "engines": {
+    "bun": ">=1.3.0"
+  },
+  "scripts": {
+    "start": "bun ./server.ts",
+    "dev": "MCP_MODE=stdio bun run server.ts",
+    "http": "MCP_MODE=http MCP_HOST=127.0.0.1 MCP_PORT=8765 bun run server.ts",
+    "inspect": "bunx @modelcontextprotocol/inspector bun run server.ts",
+    "test": "bun test server.test.ts",
+    "typecheck": "bunx tsc --noEmit -p tsconfig.json",
+    "pack:dry": "npm pack --dry-run",
+    "prepublishOnly": "bun run typecheck && bun test server.test.ts && npm pack --dry-run"
+  },
+  "dependencies": {
+    "@modelcontextprotocol/sdk": "^1.29.0",
+    "zod": "^3.23.8"
+  },
+  "devDependencies": {
+    "@types/bun": "latest",
+    "typescript": "^6.0.3"
+  }
+}
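
Note on the scripts above: `dev` and `http` configure the server purely through environment variables. The following is a minimal sketch of how those variables could resolve to settings, using the defaults documented in the README's Environment table and the constants in `server.ts`. `ServerConfig` and `resolveConfig` are hypothetical names introduced here for illustration. Unlike the package's module-level constants, this version takes the environment as an argument so it can be exercised in isolation.

```typescript
// Hypothetical shape for the resolved settings (not a type exported by the package).
interface ServerConfig {
  mode: string
  host: string
  port: number
  readOnly: boolean
}

// Hypothetical helper: maps an environment object to settings using the documented defaults.
function resolveConfig(env: Record<string, string | undefined>): ServerConfig {
  return {
    mode: (env.MCP_MODE ?? 'stdio').toLowerCase(),
    host: env.MCP_HOST ?? '127.0.0.1',
    port: Number(env.MCP_PORT ?? 8765),
    // Write tools are skipped only for the exact values "1" or "true".
    readOnly: env.MCP_READONLY === '1' || env.MCP_READONLY === 'true',
  }
}

// An empty environment, then the variables set by the "http" script.
const defaults = resolveConfig({})
const httpScript = resolveConfig({ MCP_MODE: 'http', MCP_HOST: '127.0.0.1', MCP_PORT: '8765' })
console.log(defaults, httpScript)
```

With an empty environment this resolves to stdio mode on `127.0.0.1:8765` with write tools enabled, which matches the documented defaults.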

package/server.ts ADDED

@@ -0,0 +1,913 @@
+#!/usr/bin/env bun
+/**
+ * research-vault MCP server
+ *
+ * Exposes semantic + metadata access to the research-vault knowledge repo
+ * over MCP (stdio for local clients, Streamable HTTP for private deployments).
+ *
+ * v0 tools (Phase 1):
+ *   - vault_search     → substring match over registry.jsonl (id, title, category, tags, source URL)
+ *                        Phase 2 upgrade: BGE-M3 query embedding + cosine over .meta/embeddings.jsonl
+ *   - vault_status     → registry counts + decay summary + last maintenance run
+ *   - vault_taxonomy   → returns knowledge/_taxonomy.md verbatim
+ *
+ * v1 write tools (DAS-808 — write upgrade):
+ *   - vault_store      → create a new vault entry (registry + knowledge file)
+ *   - vault_write      → update an existing vault entry's metadata or content
+ *
+ * Env:
+ *   MCP_MODE         stdio | http   (default: stdio)
+ *   MCP_HOST         host for http mode (default: 127.0.0.1; use 0.0.0.0 only behind private access controls)
+ *   MCP_PORT         port for http mode (default: 8765)
+ *   VAULT_ROOT       absolute path to research-vault root
+ *                    (default: parent of this file — works when running `bun run server.ts` from mcp/)
+ *   EMBED_ENDPOINT   BGE-M3 OpenAI-compatible embeddings URL
+ *                    (default: http://127.0.0.1:8080)
+ *   MCP_READONLY     if "1" / "true", do not register vault_store / vault_write
+ *                    (default: off; set on remote-agent hosts that consume but must not mutate)
+ */
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
+import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
+import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js'
+import { z } from 'zod'
+import { readFileSync, existsSync, writeFileSync, mkdirSync, appendFileSync, statSync } from 'node:fs'
+import { join, resolve, dirname, basename } from 'node:path'
+import { fileURLToPath } from 'node:url'
+import { createServer, type IncomingMessage, type ServerResponse } from 'node:http'
+import { createHash, randomUUID } from 'node:crypto'
+import type { RawItem, DecayScore } from './types.ts'
+// --------------------------------------------------------------------------
+// Paths + config
+// --------------------------------------------------------------------------
+const HERE = dirname(fileURLToPath(import.meta.url))
+const VAULT_ROOT = resolve(process.env.VAULT_ROOT ?? join(HERE, '..'))
+const META_DIR = join(VAULT_ROOT, '.meta')
+const REGISTRY_PATH = join(META_DIR, 'registry.jsonl')
+const DECAY_PATH = join(META_DIR, 'decay-scores.json')
+const MAINT_PATH = join(META_DIR, 'last-maintenance.json')
+const TAXONOMY_PATH = join(VAULT_ROOT, 'knowledge', '_taxonomy.md')
+const MODE = (process.env.MCP_MODE ?? 'stdio').toLowerCase()
+const HOST = process.env.MCP_HOST ?? '127.0.0.1'
+const PORT = Number(process.env.MCP_PORT ?? 8765)
+const EMBED_ENDPOINT = process.env.EMBED_ENDPOINT ?? 'http://127.0.0.1:8080'
+const READONLY = process.env.MCP_READONLY === '1' || process.env.MCP_READONLY === 'true'
+const PUBLIC_BASE_URL =
+  (process.env.MCP_PUBLIC_BASE_URL ?? process.env.PUBLIC_BASE_URL ?? `http://${HOST}:${PORT}`).replace(/\/+$/, '')
+const MCP_ENDPOINT_URL = `${PUBLIC_BASE_URL}/mcp`
+// --------------------------------------------------------------------------
+// Vault readers (cheap — O(registry size), called per tool invocation; no cache)
+// --------------------------------------------------------------------------
+function readRegistry(): RawItem[] {
+  if (!existsSync(REGISTRY_PATH)) return []
+  const raw = readFileSync(REGISTRY_PATH, 'utf-8').trim()
+  if (!raw) return []
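+  // Skip blank lines so a stray empty line in the JSONL file does not make JSON.parse throw.
+  return raw.split('\n').filter((line) => line.trim().length > 0).map((line) => JSON.parse(line) as RawItem)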
+}
+function readDecayScores(): DecayScore[] {
+  if (!existsSync(DECAY_PATH)) return []
+  return JSON.parse(readFileSync(DECAY_PATH, 'utf-8')) as DecayScore[]
+}
+function readLastMaintenance(): { last_run: string | null; items_processed: number } {
+  if (!existsSync(MAINT_PATH)) return { last_run: null, items_processed: 0 }
+  return JSON.parse(readFileSync(MAINT_PATH, 'utf-8'))
+}
+function readTaxonomy(): string {
+  if (!existsSync(TAXONOMY_PATH)) return '(taxonomy file missing)'
+  return readFileSync(TAXONOMY_PATH, 'utf-8')
+}
+// --------------------------------------------------------------------------
+// Vault writers (v1 — DAS-808 write upgrade)
+// --------------------------------------------------------------------------
+function slugify(text: string): string {
+  return text
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, '-')
+    .replace(/^-+|-+$/g, '')
+    .slice(0, 80)
+}
+function generateEntryId(title: string): string {
+  const date = new Date().toISOString().slice(0, 10).replace(/-/g, '')
+  const slug = slugify(title)
+  return `${date}-${slug}`
+}
+interface WriteStoreResult {
+  entry: RawItem
+  knowledgePath: string
+  rawPath: string
+}
+function writeStoreEntry(args: {
+  title: string
+  content: string
+  category?: string
+  tags?: string[]
+  sourceUrl?: string
+}): WriteStoreResult {
+  const id = generateEntryId(args.title)
+  const now = new Date().toISOString()
+  const yearMonth = now.slice(0, 7) // YYYY-MM
+  // Determine paths
+  const catDir = args.category ? args.category.replace(/\/+/g, '/').replace(/^\/|\/$/g, '') : ''
+  const knowledgeDir = catDir ? join(VAULT_ROOT, 'knowledge', catDir) : join(VAULT_ROOT, 'knowledge')
+  const rawDir = join(VAULT_ROOT, 'raw', yearMonth)
+  const knowledgeFile = join(knowledgeDir, `${id}.md`)
+  const rawFile = join(rawDir, `${id}.md`)
+  // Create directories
+  mkdirSync(knowledgeDir, { recursive: true })
+  mkdirSync(rawDir, { recursive: true })
+  // Write content files
+  const header = `# ${args.title}\n\n`
+  const sourceLine = args.sourceUrl ? `> Source: ${args.sourceUrl}\n\n` : ''
+  const body = header + sourceLine + args.content
+  writeFileSync(knowledgeFile, body, 'utf-8')
+  writeFileSync(rawFile, body, 'utf-8')
+  // Build registry entry
+  const entry: RawItem = {
+    id,
+    title: args.title,
+    source: args.sourceUrl ? 'url' : 'local',
+    sourceUrl: args.sourceUrl,
+    rawPath: `raw/${yearMonth}/${id}.md`,
+    ingestedAt: now,
+    status: 'raw',
+    tags: args.tags ?? [],
+    category: args.category,
+    knowledgePath: catDir ? `knowledge/${catDir}/${id}.md` : `knowledge/${id}.md`,
+  }
+  // Append to registry
+  appendFileSync(REGISTRY_PATH, JSON.stringify(entry) + '\n', 'utf-8')
+  return { entry, knowledgePath: knowledgeFile, rawPath: rawFile }
+}
+interface WriteUpdateResult {
+  entry: RawItem
+  previous: RawItem
+  knowledgePath?: string
+}
+function writeUpdateEntry(args: {
+  id: string
+  title?: string
+  content?: string
+  category?: string
+  tags?: string[]
+  status?: RawItem['status']
+  sourceUrl?: string
+}): WriteUpdateResult {
+  const registry = readRegistry()
+  const idx = registry.findIndex((item) => item.id === args.id)
+  if (idx === -1) {
+    throw new Error(`Entry not found: ${args.id}`)
+  }
+  const previous = { ...registry[idx], tags: [...registry[idx].tags] }
+  const updated = registry[idx]
+  // Update scalar fields
+  if (args.title !== undefined) updated.title = args.title
+  if (args.category !== undefined) updated.category = args.category
+  if (args.tags !== undefined) updated.tags = args.tags
+  if (args.status !== undefined) updated.status = args.status
+  if (args.sourceUrl !== undefined) {
+    updated.sourceUrl = args.sourceUrl
+    updated.source = 'url'
+  }
+  // Update content if provided
+  let knowledgePath: string | undefined
+  if (args.content !== undefined && updated.knowledgePath) {
+    const kp = join(VAULT_ROOT, updated.knowledgePath)
+    const header = `# ${updated.title}\n\n`
+    const sourceLine = updated.sourceUrl ? `> Source: ${updated.sourceUrl}\n\n` : ''
+    writeFileSync(kp, header + sourceLine + args.content, 'utf-8')
+    knowledgePath = kp
+  }
+  // Handle category change — move knowledge file
+  if (args.category !== undefined && args.category !== previous.category) {
+    const oldPath = previous.knowledgePath ? join(VAULT_ROOT, previous.knowledgePath) : null
+    const newCatDir = args.category.replace(/\/+/g, '/').replace(/^\/|\/$/g, '')
+    const newDir = join(VAULT_ROOT, 'knowledge', newCatDir)
+    const newPath = join(newDir, `${args.id}.md`)
+    const newRelPath = `knowledge/${newCatDir}/${args.id}.md`
+    if (oldPath && existsSync(oldPath)) {
+      mkdirSync(newDir, { recursive: true })
+      writeFileSync(newPath, readFileSync(oldPath, 'utf-8'), 'utf-8')
+      // Note: old file is left in place to avoid data loss; manual cleanup expected
+    }
+    updated.knowledgePath = newRelPath
+  }
+  // Rewrite registry
+  registry[idx] = updated
+  writeFileSync(REGISTRY_PATH, registry.map((item) => JSON.stringify(item)).join('\n') + '\n', 'utf-8')
+  return { entry: updated, previous, knowledgePath }
+}
+// --------------------------------------------------------------------------
+// Search (Phase 1: substring match; Phase 2: BGE-M3 cosine)
+// --------------------------------------------------------------------------
+interface SearchHit {
+  id: string
+  title: string
+  category?: string
+  tags: string[]
+  status: RawItem['status']
+  knowledgePath?: string
+  rawPath?: string
+  score: number
+  matched: string[]  // which fields matched
+}
+type GlobalVerdict = 'PASS' | 'FLAG'
+type ReadabilityVerdict = 'PASS' | 'MISSING'
+type IndexVerdict = 'PASS' | 'NO_MATCH'
+type AnalysisVerdict = 'PASS' | 'NOT_ANALYZED'
+interface ReadableContent {
+  readability_verdict: ReadabilityVerdict
+  read_path?: string
+  read_source?: 'knowledge' | 'raw'
+  source_mtime?: string
+  content_hash?: string
+  content?: string
+}
+interface SearchVerdicts {
+  verdict: GlobalVerdict
+  readability_verdict: ReadabilityVerdict
+  index_verdict: IndexVerdict
+  analysis_verdict: AnalysisVerdict
+  analysis_caveat?: string
+}
+interface DecoratedSearchHit extends SearchHit, SearchVerdicts {
+  sourceMtime?: string
+  contentHash?: string
+  lastIndexedAt?: string
+  lastAnalyzedAt?: string
+  analysisVersion?: string
+  read_path?: string
+  read_source?: 'knowledge' | 'raw'
+}
+interface SearchResponse extends SearchVerdicts {
+  query: string
+  backend: 'substring' | 'embedding'
+  count: number
+  hits: DecoratedSearchHit[]
+}
+function normalizeSearchText(text: string): string {
+  return text
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, ' ')
+    .replace(/\s+/g, ' ')
+    .trim()
+}
+function matchesQuery(value: string | undefined, query: string): boolean {
+  if (!value) return false
+  const rawValue = value.toLowerCase()
+  const rawQuery = query.toLowerCase().trim()
+  if (rawValue.includes(rawQuery)) return true
+  const normalizedValue = normalizeSearchText(value)
+  const normalizedQuery = normalizeSearchText(query)
+  return normalizedQuery.length > 0 && normalizedValue.includes(normalizedQuery)
+}
+export function substringSearch(items: RawItem[], query: string, topK: number): SearchHit[] {
+  const q = query.toLowerCase().trim()
+  if (!q) return []
+  const hits: SearchHit[] = []
+  for (const item of items) {
+    const matched: string[] = []
+    let score = 0
+    if (item.id.toLowerCase() === q) { matched.push('id_exact'); score += 10 }
+    else if (matchesQuery(item.id, query)) { matched.push('id'); score += 4 }
+    if (matchesQuery(item.title, query)) { matched.push('title'); score += 3 }
+    if (matchesQuery(item.category, query)) { matched.push('category'); score += 2 }
+    if (item.tags.some((t) => matchesQuery(t, query))) { matched.push('tag'); score += 2 }
+    if (matchesQuery(item.sourceUrl, query)) { matched.push('source_url'); score += 1 }
+    if (score > 0) {
+      hits.push({
+        id: item.id,
+        title: item.title,
+        category: item.category,
+        tags: item.tags,
+        status: item.status,
+        knowledgePath: item.knowledgePath,
+        rawPath: item.rawPath,
+        score,
+        matched,
+      })
+    }
+  }
+  hits.sort((a, b) => b.score - a.score)
+  return hits.slice(0, topK)
+}
+function candidateReadPaths(item: RawItem): Array<{ source: 'knowledge' | 'raw'; path: string }> {
+  const candidates: Array<{ source: 'knowledge' | 'raw'; path: string }> = []
+  if (item.knowledgePath) candidates.push({ source: 'knowledge', path: item.knowledgePath })
+  candidates.push({ source: 'raw', path: item.rawPath })
+  return candidates
+}
+export function resolveReadableContent(item: RawItem, vaultRoot = VAULT_ROOT, includeContent = false): ReadableContent {
+  for (const candidate of candidateReadPaths(item)) {
+    const absPath = join(vaultRoot, candidate.path)
+    if (!existsSync(absPath)) continue
+    const content = readFileSync(absPath, 'utf-8')
+    const stat = statSync(absPath)
+    return {
+      readability_verdict: 'PASS',
+      read_path: candidate.path,
+      read_source: candidate.source,
+      source_mtime: stat.mtime.toISOString(),
+      content_hash: createHash('sha256').update(content).digest('hex'),
+      ...(includeContent ? { content } : {}),
+    }
+  }
+  return { readability_verdict: 'MISSING' }
+}
+function analysisVerdict(item: RawItem): Pick<SearchVerdicts, 'analysis_verdict' | 'analysis_caveat'> {
+  if (item.lastAnalyzedAt) return { analysis_verdict: 'PASS' }
+  return {
+    analysis_verdict: 'NOT_ANALYZED',
+    analysis_caveat: 'Readable content exists, but registry has no lastAnalyzedAt. Treat as not analyzed, not missing.',
+  }
+}
+function itemVerdicts(item: RawItem, vaultRoot = VAULT_ROOT): SearchVerdicts & ReadableContent {
+  const readable = resolveReadableContent(item, vaultRoot, false)
+  const analysis = analysisVerdict(item)
+  return {
+    verdict: readable.readability_verdict === 'PASS' ? 'PASS' : 'FLAG',
+    readability_verdict: readable.readability_verdict,
+    index_verdict: 'PASS',
+    ...analysis,
+    read_path: readable.read_path,
+    read_source: readable.read_source,
+    source_mtime: item.sourceMtime ?? readable.source_mtime,
+    content_hash: item.contentHash ?? readable.content_hash,
+  }
+}
+function decorateSearchHit(hit: SearchHit, item: RawItem, vaultRoot = VAULT_ROOT): DecoratedSearchHit {
+  const verdicts = itemVerdicts(item, vaultRoot)
+  return {
+    ...hit,
+    ...verdicts,
+    sourceMtime: item.sourceMtime ?? verdicts.source_mtime,
+    contentHash: item.contentHash ?? verdicts.content_hash,
+    lastIndexedAt: item.lastIndexedAt,
+    lastAnalyzedAt: item.lastAnalyzedAt,
+    analysisVersion: item.analysisVersion,
+  }
+}
+export function buildSearchResponse(
+  query: string,
+  backend: 'substring' | 'embedding',
+  hits: SearchHit[],
+  items: RawItem[],
+  vaultRoot = VAULT_ROOT
+): SearchResponse {
+  const byId = new Map(items.map((item) => [item.id, item]))
+  const decorated = hits.map((hit) => {
+    const item = byId.get(hit.id)
+    if (!item) {
+      return {
+        ...hit,
+        verdict: 'FLAG' as const,
+        readability_verdict: 'MISSING' as const,
+        index_verdict: 'PASS' as const,
+        analysis_verdict: 'NOT_ANALYZED' as const,
+        analysis_caveat: 'Search hit has no matching registry item.',
+      }
+    }
+    return decorateSearchHit(hit, item, vaultRoot)
+  })
+  const hasHits = decorated.length > 0
+  const readable = hasHits && decorated.every((hit) => hit.readability_verdict === 'PASS')
+  const allAnalyzed = hasHits && decorated.every((hit) => hit.analysis_verdict === 'PASS')
+  return {
+    query,
+    backend,
+    count: decorated.length,
+    verdict: hasHits && readable ? 'PASS' : 'FLAG',
+    readability_verdict: readable ? 'PASS' : 'MISSING',
+    index_verdict: hasHits ? 'PASS' : 'NO_MATCH',
+    analysis_verdict: allAnalyzed ? 'PASS' : 'NOT_ANALYZED',
+    ...(allAnalyzed ? {} : { analysis_caveat: 'One or more readable hits have no lastAnalyzedAt.' }),
+    hits: decorated,
+  }
+}
+export function getVaultItem(
+  registry: RawItem[],
+  id: string,
+  vaultRoot = VAULT_ROOT,
+  includeContent = false
+) {
+  const item = registry.find((entry) => entry.id === id)
+  if (!item) {
+    return {
+      found: false as const,
+      id,
+      verdict: 'FLAG' as const,
+      readability_verdict: 'MISSING' as const,
+      index_verdict: 'NO_MATCH' as const,
+      analysis_verdict: 'NOT_ANALYZED' as const,
+      error: `Entry not found: ${id}`,
+    }
+  }
+  const readable = resolveReadableContent(item, vaultRoot, includeContent)
+  const analysis = analysisVerdict(item)
+  return {
+    found: true as const,
+    item: {
+      ...item,
+      sourceMtime: item.sourceMtime ?? readable.source_mtime,
+      contentHash: item.contentHash ?? readable.content_hash,
+      read_path: readable.read_path,
+      read_source: readable.read_source,
+    },
+    verdict: readable.readability_verdict === 'PASS' ? 'PASS' as const : 'FLAG' as const,
+    readability_verdict: readable.readability_verdict,
+    index_verdict: 'PASS' as const,
+    ...analysis,
+    ...(includeContent ? { content: readable.content ?? null } : {}),
+  }
+}
+// Phase 2 hook — embed query via BGE-M3-compatible endpoint, cosine against .meta/embeddings.jsonl
+// Left as a TODO stub; substringSearch ships tonight so Hermes can talk to the vault.
+async function embeddingSearch(_query: string, _topK: number): Promise<SearchHit[]> {
+  throw new Error(
+    'embeddingSearch not yet wired — Phase 2. ' +
+    `Will call ${EMBED_ENDPOINT}/v1/embeddings and cosine against .meta/embeddings.jsonl.`
+  )
+}
+// --------------------------------------------------------------------------
+// MCP server + tools
+// --------------------------------------------------------------------------
+function buildServer(): McpServer {
+  const server = new McpServer(
+    { name: 'research-vault', version: '0.2.0' },
+    { capabilities: { tools: {} } }
+  )
+  server.registerTool(
+    'vault_search',
+    {
+      title: 'Search research vault',
+      description:
+        'Search the research-vault knowledge base. v0 uses substring match over registry (title/category/tags/source URL). ' +
+        'Phase 2 will swap in BGE-M3 semantic search via the configured embedding endpoint.',
+      inputSchema: {
+        query: z.string().min(1).describe('Search query — keyword, phrase, tag, or URL fragment'),
+        top_k: z.number().int().min(1).max(50).optional().describe('Max results (default 10)'),
+        mode: z.enum(['substring', 'embedding']).optional().describe('Search backend (default substring; embedding = Phase 2)'),
+      },
+    },
+    async ({ query, top_k, mode }) => {
+      const k = top_k ?? 10
+      const backend = mode ?? 'substring'
+      const registry = readRegistry()
+      let hits: SearchHit[]
+      if (backend === 'embedding') {
+        hits = await embeddingSearch(query, k)
+      } else {
+        hits = substringSearch(registry, query, k)
+      }
+      const response = buildSearchResponse(query, backend, hits, registry)
+      return {
+        content: [
+          {
+            type: 'text',
+            text: JSON.stringify(
+              response,
+              null,
+              2
+            ),
+          },
+        ],
+      }
+    }
+  )
+  server.registerTool(
+    'vault_get',
+    {
+      title: 'Get research vault entry by exact ID',
+      description:
+        'Authoritative exact-ID read path for a research-vault entry. ' +
+        'Reads the knowledge file when present, otherwise falls back to the raw file. ' +
+        'Missing lastAnalyzedAt is reported as NOT_ANALYZED, not missing content.',
+      inputSchema: {
+        id: z.string().min(1).describe('Exact registry entry ID'),
+        include_content: z.boolean().optional().describe('Include markdown content in the response (default false)'),
+      },
+    },
+    async ({ id, include_content }) => {
+      const result = getVaultItem(readRegistry(), id, VAULT_ROOT, include_content ?? false)
+      return {
+        content: [{ type: 'text', text: JSON.stringify(result, null, 2) }],
+        ...(result.found ? {} : { isError: true }),
+      }
+    }
+  )
+  server.registerTool(
+    'vault_status',
+    {
+      title: 'Vault status',
+      description:
+        'Registry counts + decay summary + last maintenance run for the research vault. ' +
+        'Use to check vault health before ingest/analyze workflows.',
+      inputSchema: {},
+    },
+    async () => {
+      const reg = readRegistry()
+      const decay = readDecayScores()
+      const maint = readLastMaintenance()
+      const byStatus: Record<string, number> = {}
+      const byCategory: Record<string, number> = {}
+      for (const item of reg) {
+        byStatus[item.status] = (byStatus[item.status] ?? 0) + 1
+        if (item.category) byCategory[item.category] = (byCategory[item.category] ?? 0) + 1
+      }
+      const bySummary: Record<string, number> = { deep: 0, shallow: 0, none: 0 }
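+      // Guard against summary levels outside the preset keys, which would otherwise produce NaN.
+      for (const d of decay) bySummary[d.summaryLevel] = (bySummary[d.summaryLevel] ?? 0) + 1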
+      const status = {
+        vault_root: VAULT_ROOT,
+        registry: { total: reg.length, by_status: byStatus, by_category: byCategory },
+        decay: { tracked: decay.length, by_summary_level: bySummary },
+        last_maintenance: maint,
+      }
+      return { content: [{ type: 'text', text: JSON.stringify(status, null, 2) }] }
+    }
+  )
+  server.registerTool(
+    'vault_taxonomy',
+    {
+      title: 'Vault taxonomy',
+      description: 'Returns the canonical category taxonomy (knowledge/_taxonomy.md). Use before /analyze to pick the right category.',
+      inputSchema: {},
+    },
+    async () => {
+      return { content: [{ type: 'text', text: readTaxonomy() }] }
+    }
+  )
+  // --- v1 write tools (DAS-808 write upgrade) ---
+  // Skipped entirely when MCP_READONLY=1 — remote-agent hosts get a strictly
+  // read-only surface; write attempts return "tool not found" instead of touching disk.
+  if (!READONLY) {
+  server.registerTool(
+    'vault_store',
+    {
+      title: 'Store new entry in research vault',
+      description:
+        'Create a new entry in the research vault with title, content, and optional category/tags/source URL. ' +
+        'Appends to registry.jsonl and creates knowledge + raw files. ' +
+        'Requires explicit write approval (DAS-808). ' +
+        'Do not store secrets, OAuth material, tokens, raw payloads, or private infrastructure details.',
+      inputSchema: {
+        title: z.string().min(1).max(500).describe('Entry title (used to generate the entry ID)'),
+        content: z.string().min(1).describe('Markdown content body — must be sanitized (no secrets, tokens, or private payloads)'),
+        category: z.string().optional().describe('Taxonomy category path (e.g. software-engineering/security). Check vault_taxonomy first.'),
+        tags: z.array(z.string()).max(20).optional().describe('Tags for search indexing (max 20)'),
+        source_url: z.string().url().optional().describe('Source URL if this entry references an external resource'),
+      },
+    },
+    async ({ title, content, category, tags, source_url }) => {
+      try {
+        const result = writeStoreEntry({ title, content, category, tags, sourceUrl: source_url })
+        return {
+          content: [
+            {
+              type: 'text',
+              text: JSON.stringify(
+                {
+                  stored: true,
+                  entry: result.entry,
+                  paths: { knowledge: result.knowledgePath, raw: result.rawPath },
+                },
+                null,
+                2
+              ),
+            },
+          ],
+        }
+      } catch (err) {
+        return {
+          content: [{ type: 'text', text: JSON.stringify({ stored: false, error: String(err) }, null, 2) }],
+          isError: true,
+        }
+      }
+    }
+  )
+  server.registerTool(
+    'vault_write',
+    {
+      title: 'Update existing vault entry',
+      description:
+        'Update metadata or content of an existing research vault entry identified by its id. ' +
+        'Rewrites the registry and optionally updates the knowledge file. ' +
+        'Requires explicit write approval (DAS-808). ' +
+        'Do not store secrets, OAuth material, tokens, raw payloads, or private infrastructure details.',
+      inputSchema: {
+        id: z.string().min(1).describe('Entry ID to update (e.g. 20260503-my-entry-slug)'),
+        title: z.string().min(1).max(500).optional().describe('New title'),
+        content: z.string().min(1).optional().describe('New markdown content — must be sanitized'),
+        category: z.string().optional().describe('New taxonomy category path'),
+        tags: z.array(z.string()).max(20).optional().describe('New tags (replaces existing tags)'),
+        status: z.enum(['raw', 'analyzed', 'archived']).optional().describe('New entry status'),
+        source_url: z.string().url().optional().describe('New source URL'),
+      },
+    },
+    async ({ id, title, content, category, tags, status, source_url }) => {
+      try {
+        const result = writeUpdateEntry({ id, title, content, category, tags, status, sourceUrl: source_url })
+        return {
+          content: [
+            {
+              type: 'text',
+              text: JSON.stringify(
+                {
+                  updated: true,
+                  entry: result.entry,
+                  previous: { title: result.previous.title, category: result.previous.category, status: result.previous.status, tags: result.previous.tags },
+                  knowledgePath: result.knowledgePath,
+                },
+                null,
+                2
+              ),
+            },
+          ],
+        }
+      } catch (err) {
+        return {
+          content: [{ type: 'text', text: JSON.stringify({ updated: false, error: String(err) }, null, 2) }],
+          isError: true,
+        }
+      }
+    }
+  )
+  } // end if (!READONLY)
+  return server
+}
+// --------------------------------------------------------------------------
+// Transports
+// --------------------------------------------------------------------------
+async function runStdio() {
+  const server = buildServer()
+  const transport = new StdioServerTransport()
+  await server.connect(transport)
+  // stdio holds the process open via the transport's stdin reader.
+  process.stderr.write(`[research-vault-mcp] stdio ready — vault=${VAULT_ROOT}\n`)
+}
+function writeJson(res: ServerResponse, status: number, body: unknown) {
+  res.writeHead(status, {
+    'content-type': 'application/json',
+    'cache-control': 'no-store',
+  })
+  res.end(JSON.stringify(body))
+}
+function writeOAuthMetadata(res: ServerResponse) {
+  writeJson(res, 200, {
+    issuer: PUBLIC_BASE_URL,
+    authorization_endpoint: `${PUBLIC_BASE_URL}/oauth/authorize`,
+    token_endpoint: `${PUBLIC_BASE_URL}/oauth/token`,
+    registration_endpoint: `${PUBLIC_BASE_URL}/oauth/register`,
+    response_types_supported: ['code'],
+    grant_types_supported: ['authorization_code', 'refresh_token', 'client_credentials'],
+    token_endpoint_auth_methods_supported: ['none', 'client_secret_post', 'client_secret_basic'],
+    code_challenge_methods_supported: ['S256', 'plain'],
+    scopes_supported: ['vault.read', 'vault.write'],
+  })
+}
+function writeProtectedResourceMetadata(res: ServerResponse) {
+  writeJson(res, 200, {
+    resource: MCP_ENDPOINT_URL,
+    authorization_servers: [PUBLIC_BASE_URL],
+    bearer_methods_supported: ['header'],
+    scopes_supported: ['vault.read', 'vault.write'],
+  })
+}
+async function readRequestBody(req: IncomingMessage): Promise<string> {
+  const chunks: Buffer[] = []
+  for await (const chunk of req) {
+    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
+  }
+  return Buffer.concat(chunks).toString('utf8')
+}
+async function readRequestParams(req: IncomingMessage): Promise<Record<string, string>> {
+  const raw = await readRequestBody(req)
+  if (!raw.trim()) return {}
+  const contentType = req.headers['content-type'] ?? ''
+  if (contentType.includes('application/json')) {
+    const parsed = JSON.parse(raw) as Record<string, unknown>
+    return Object.fromEntries(
+      Object.entries(parsed).map(([key, value]) => [key, typeof value === 'string' ? value : String(value)])
+    )
+  }
+  return Object.fromEntries(new URLSearchParams(raw).entries())
+}
+function handleOAuthAuthorize(req: IncomingMessage, res: ServerResponse) {
+  const url = new URL(req.url ?? '/', PUBLIC_BASE_URL)
+  const redirectUri = url.searchParams.get('redirect_uri')
+  if (!redirectUri) {
+    writeJson(res, 400, { error: 'invalid_request', error_description: 'redirect_uri is required' })
+    return
+  }
+  const redirect = new URL(redirectUri)
+  redirect.searchParams.set('code', `rv-${randomUUID()}`)
+  const state = url.searchParams.get('state')
+  if (state) redirect.searchParams.set('state', state)
+  res.writeHead(302, {
+    location: redirect.toString(),
+    'cache-control': 'no-store',
+  })
+  res.end()
+}
+async function handleOAuthToken(req: IncomingMessage, res: ServerResponse) {
+  let params: Record<string, string>
+  try {
+    params = await readRequestParams(req)
+  } catch {
+    writeJson(res, 400, { error: 'invalid_request' })
+    return
+  }
+  const grantType = params.grant_type ?? 'authorization_code'
+  if (!['authorization_code', 'refresh_token', 'client_credentials'].includes(grantType)) {
+    writeJson(res, 400, { error: 'unsupported_grant_type' })
+    return
+  }
+  writeJson(res, 200, {
+    access_token: `rv-${randomUUID()}`,
+    token_type: 'Bearer',
+    expires_in: 3600,
+    scope: params.scope ?? 'vault.read vault.write',
+  })
+}
+async function handleOAuthRegister(req: IncomingMessage, res: ServerResponse) {
+  // Dynamic registration compatibility: deployments should enforce real
+  // authorization at their private access layer.
+  try {
+    await readRequestBody(req)
+  } catch {
+    // Ignore malformed registration bodies; this endpoint is compatibility-only.
+  }
+  writeJson(res, 201, {
+    client_id: `rv-client-${randomUUID()}`,
+    client_id_issued_at: Math.floor(Date.now() / 1000),
+    token_endpoint_auth_method: 'none',
+    grant_types: ['authorization_code', 'refresh_token', 'client_credentials'],
+    response_types: ['code'],
+    redirect_uris: [],
+    scope: 'vault.read vault.write',
+  })
+}
+async function handleMcpRequest(req: IncomingMessage, res: ServerResponse) {
+  // Stateless per-request transport keeps Capy refresh/probe/idempotent retries
+  // from poisoning a singleton transport session.
+  const server = buildServer()
+  const transport = new StreamableHTTPServerTransport({
+    sessionIdGenerator: undefined,
+  })
+  await server.connect(transport)
+  res.on('close', () => {
+    void transport.close().catch(() => undefined)
+    void server.close().catch(() => undefined)
+  })
+  await transport.handleRequest(req, res)
+}
+async function runHttp() {
+  const http = createServer(async (req, res) => {
+    try {
+      const url = new URL(req.url ?? '/', PUBLIC_BASE_URL)
+      // Health probe — matches embedding endpoint's /v1/models convention.
+      if (req.method === 'GET' && url.pathname === '/health') {
+        writeJson(res, 200, { status: 'ok', server: 'research-vault-mcp', version: '0.2.0', vault: VAULT_ROOT })
+        return
+      }
+      if (
+        req.method === 'GET' &&
+        (url.pathname === '/.well-known/oauth-protected-resource' ||
+          url.pathname === '/.well-known/oauth-protected-resource/mcp')
+      ) {
+        writeProtectedResourceMetadata(res)
+        return
+      }
+      if (
+        req.method === 'GET' &&
+        (url.pathname === '/.well-known/oauth-authorization-server' ||
+          url.pathname === '/.well-known/oauth-authorization-server/mcp')
+      ) {
+        writeOAuthMetadata(res)
+        return
+      }
+      if (req.method === 'GET' && url.pathname === '/oauth/authorize') {
+        handleOAuthAuthorize(req, res)
+        return
+      }
+      if (req.method === 'POST' && url.pathname === '/oauth/token') {
+        await handleOAuthToken(req, res)
+        return
+      }
+      if (req.method === 'POST' && url.pathname === '/oauth/register') {
+        await handleOAuthRegister(req, res)
+        return
+      }
+      // MCP Streamable HTTP endpoint (default path /mcp; accept / too for convenience).
+      if (url.pathname === '/mcp' || url.pathname === '/') {
+        await handleMcpRequest(req, res)
+        return
+      }
+      res.writeHead(404, { 'content-type': 'text/plain' })
+      res.end('not found')
+    } catch (err) {
+      process.stderr.write(`[research-vault-mcp] http error: ${String(err)}\n`)
+      if (!res.headersSent) {
+        res.writeHead(500, { 'content-type': 'text/plain' })
+        res.end('internal error')
+      }
+    }
+  })
+  http.listen(PORT, HOST, () => {
+    process.stderr.write(`[research-vault-mcp] http ready — http://${HOST}:${PORT}/mcp  vault=${VAULT_ROOT}\n`)
+  })
+}
+// --------------------------------------------------------------------------
+// Entry
+// --------------------------------------------------------------------------
+const isMain = process.argv[1] ? import.meta.url === new URL(process.argv[1], 'file:').href : false
+if (isMain) {
+  if (MODE === 'http') {
+    runHttp().catch((err) => {
+      process.stderr.write(`[research-vault-mcp] fatal: ${String(err)}\n`)
+      process.exit(1)
+    })
+  } else {
+    runStdio().catch((err) => {
+      process.stderr.write(`[research-vault-mcp] fatal: ${String(err)}\n`)
+      process.exit(1)
+    })
+  }
+}
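
The `handleOAuthAuthorize` handler in the diff above parses `redirect_uri` with `new URL(redirectUri)`. The `URL` constructor throws a `TypeError` for a relative or malformed value, and in this server that would reach the outer `catch` in `runHttp` and produce a 500 rather than a 400. The following is a minimal sketch of the same redirect-building step as a pure function that reports a malformed `redirect_uri` as `invalid_request`. The names `buildAuthorizeRedirect` and `AuthorizeResult`, the second error description, and the example base URL are hypothetical and not part of the package; the authorization code is passed in as a parameter so the result is deterministic.

```typescript
// Hypothetical result type: either a redirect or an OAuth-style error body.
type AuthorizeResult =
  | { status: 302; location: string }
  | { status: 400; error: string; error_description: string }

// Sketch only: mirrors the redirect logic of handleOAuthAuthorize, but
// returns a 400 result when redirect_uri cannot be parsed as an absolute URL.
function buildAuthorizeRedirect(requestUrl: string, baseUrl: string, code: string): AuthorizeResult {
  const url = new URL(requestUrl, baseUrl)
  const redirectUri = url.searchParams.get('redirect_uri')
  if (!redirectUri) {
    return { status: 400, error: 'invalid_request', error_description: 'redirect_uri is required' }
  }
  let redirect: URL
  try {
    // new URL() with no base throws on relative or malformed input.
    redirect = new URL(redirectUri)
  } catch {
    return { status: 400, error: 'invalid_request', error_description: 'redirect_uri must be an absolute URL' }
  }
  redirect.searchParams.set('code', code)
  const state = url.searchParams.get('state')
  if (state) redirect.searchParams.set('state', state)
  return { status: 302, location: redirect.toString() }
}

// Usage with an assumed local base URL and a fixed code value.
const example = buildAuthorizeRedirect(
  '/oauth/authorize?redirect_uri=https%3A%2F%2Fclient.example%2Fcb&state=xyz',
  'http://localhost:3000',
  'rv-test'
)
console.log(example)
```

Returning a plain result object keeps the validation separate from the `ServerResponse` plumbing, so a handler could map it onto `writeJson` or `res.writeHead(302, ...)` as it sees fit. This sketch checks only that the value parses; it does not check the URI against any registered client, which the package's own comments leave to the deployment's access layer.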

package/tsconfig.json ADDED

@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ESNext",
+    "module": "ESNext",
+    "moduleResolution": "bundler",
+    "strict": true,
+    "esModuleInterop": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "types": ["bun-types"]
+  },
+  "include": ["**/*.ts"]
+}

package/types.ts ADDED

@@ -0,0 +1,27 @@
+export interface RawItem {
+  id: string
+  title: string
+  source: 'url' | 'local'
+  sourceUrl?: string
+  rawPath: string
+  ingestedAt: string
+  status: 'raw' | 'analyzed' | 'archived'
+  tags: string[]
+  category?: string
+  knowledgePath?: string
+  sourceMtime?: string
+  contentHash?: string
+  lastIndexedAt?: string
+  lastAnalyzedAt?: string
+  analysisVersion?: string
+}
+export interface DecayScore {
+  itemId: string
+  score: number
+  lastAccess: string
+  accessCount: number
+  summaryLevel: 'deep' | 'shallow' | 'none'
+  nextReviewAt: string
+  difficulty: number
+}
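
To illustrate how the `RawItem` shape above might be consumed, here is a minimal sketch of a runtime type guard plus a JSONL reader. It assumes, without confirmation from this diff, that each non-empty line of `.meta/registry.jsonl` holds one JSON object shaped like `RawItem`. The names `isRawItem` and `parseRegistryJsonl` are hypothetical, and the interface is a trimmed copy included only so the block is self-contained.

```typescript
// Trimmed copy of RawItem from types.ts: all required fields, two optional ones.
interface RawItem {
  id: string
  title: string
  source: 'url' | 'local'
  rawPath: string
  ingestedAt: string
  status: 'raw' | 'analyzed' | 'archived'
  tags: string[]
  category?: string
  lastAnalyzedAt?: string
}

// Hypothetical guard: checks the required fields only.
function isRawItem(value: unknown): value is RawItem {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  const tags = v.tags
  return (
    typeof v.id === 'string' &&
    typeof v.title === 'string' &&
    (v.source === 'url' || v.source === 'local') &&
    typeof v.rawPath === 'string' &&
    typeof v.ingestedAt === 'string' &&
    (v.status === 'raw' || v.status === 'analyzed' || v.status === 'archived') &&
    Array.isArray(tags) &&
    tags.every((t: unknown) => typeof t === 'string')
  )
}

// Hypothetical reader: parses JSONL text, skipping blank lines and counting
// lines that are not valid JSON or do not match the RawItem shape.
function parseRegistryJsonl(text: string): { items: RawItem[]; skipped: number } {
  const items: RawItem[] = []
  let skipped = 0
  for (const line of text.split('\n')) {
    if (!line.trim()) continue
    try {
      const parsed: unknown = JSON.parse(line)
      if (isRawItem(parsed)) items.push(parsed)
      else skipped++
    } catch {
      skipped++
    }
  }
  return { items, skipped }
}
```

Counting skipped lines instead of throwing is one possible design choice: it lets a caller report a partially damaged registry without losing the valid entries.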