@skill-map/spec 0.18.0 → 0.19.0

package/architecture.md CHANGED
@@ -72,9 +72,9 @@ The loader enforces two id-uniqueness rules during discovery (see [`plugin-autho
72
72
  1. **Directory name == manifest id.** A plugin lives at `<root>/<id>/plugin.json`. A mismatch surfaces as status `invalid-manifest`. This rule eliminates same-root collisions by construction.
73
73
  2. **Cross-root id collision blocks both sides.** Two plugins reachable from different roots (project + global, or any `--plugin-dir` combination) that declare the same `id` BOTH receive status `id-collision`. No precedence rule applies — coherent with §Boot invariant ("no extension is privileged"). The user resolves by renaming one of them.
74
74
 
75
- In addition, the loader **qualifies every extension** with its owning plugin id before registering it. The registry stores extensions under the qualified id `<plugin-id>/<extension-id>` (e.g. `claude/slash`, `core/broken-ref`, `hello-world/greet`). Authors continue to declare the short `id` in each extension manifest; the loader composes the qualified form from `manifest.id` at load time. Built-in extensions bundled with the reference impl declare their `pluginId` directly in `built-ins.ts` — `core/` for kernel-internal primitives (rules, the formatter, the external-url-counter extractor) and `claude/` for the Claude provider bundle (the Provider and its kind-aware extractors). If a plugin author injects a `pluginId` field on an extension that disagrees with `plugin.json`'s `id`, the loader emits `invalid-manifest` with a directed reason.
75
+ In addition, the loader **qualifies every extension** with its owning plugin id before registering it. The registry stores extensions under the qualified id `<plugin-id>/<extension-id>` (e.g. `core/slash`, `core/broken-ref`, `hello-world/greet`). Authors continue to declare the short `id` in each extension manifest; the loader composes the qualified form from `manifest.id` at load time. Built-in extensions bundled with the reference impl declare their `pluginId` directly in `built-ins.ts` — `core/` for kernel-internal primitives (every rule, the formatter, the cross-vendor extractors `annotations` / `slash` / `at-directive` / `markdown-link` / `external-url-counter`) and vendor-specific bundles such as `claude/` (the Claude provider) for Provider integrations whose territory is platform-bound. If a plugin author injects a `pluginId` field on an extension that disagrees with `plugin.json`'s `id`, the loader emits `invalid-manifest` with a directed reason.
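A minimal sketch of the qualification step described above, assuming a simplified extension manifest. The `ExtensionManifest` and `QualifyResult` shapes and the `qualifyExtension` name are hypothetical; the spec only fixes the qualified form `<plugin-id>/<extension-id>` and the `invalid-manifest` status on a `pluginId` mismatch.

```typescript
// Hypothetical shapes for illustration; not taken from the reference impl.
interface ExtensionManifest {
  id: string;        // short id declared by the author
  pluginId?: string; // normally absent; an injected value must agree with plugin.json
}

type QualifyResult =
  | { status: "ok"; qualifiedId: string }
  | { status: "invalid-manifest"; reason: string };

// Compose `<plugin-id>/<extension-id>` from the owning plugin's manifest id.
function qualifyExtension(pluginId: string, ext: ExtensionManifest): QualifyResult {
  if (ext.pluginId !== undefined && ext.pluginId !== pluginId) {
    return {
      status: "invalid-manifest",
      reason: `extension "${ext.id}" declares pluginId "${ext.pluginId}" but plugin.json id is "${pluginId}"`,
    };
  }
  return { status: "ok", qualifiedId: `${pluginId}/${ext.id}` };
}

const greet = qualifyExtension("hello-world", { id: "greet" });
const clash = qualifyExtension("hello-world", { id: "greet", pluginId: "other" });
```

Here `greet` carries the qualified id `hello-world/greet`, while `clash` is rejected because the injected `pluginId` disagrees with the plugin's own id.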
76
76
 
77
- Each plugin (and each built-in bundle) declares a **granularity** that controls how its extensions are toggled. `granularity: 'bundle'` (the default) means the plugin id is the only enable/disable key; `granularity: 'extension'` means each extension is independently toggle-able under its qualified id. The loader's pre-import `resolveEnabled(pluginId)` short-circuit is always coarse (bundle level) — when a granularity=`extension` bundle is partially enabled, the import work proceeds and the runtime composer (the CLI's `composeScanExtensions` / `composeFormatters` in `src/cli/util/plugin-runtime.ts`) drops the disabled extensions before they reach the orchestrator. The two built-in bundles split deliberately: `claude` is granularity=`bundle` (provider-level toggle), `core` is granularity=`extension` (every kernel built-in is removable, satisfying §Boot invariant: "no extension is privileged"). See [`plugin-author-guide.md` §Granularity — bundle vs extension](./plugin-author-guide.md#granularity--bundle-vs-extension) for the author-facing summary.
77
+ Each plugin (and each built-in bundle) declares a **granularity** that controls how its extensions are toggled. `granularity: 'bundle'` (the default) means the plugin id is the only enable/disable key; `granularity: 'extension'` means each extension is independently toggle-able under its qualified id. The loader's pre-import `resolveEnabled(pluginId)` short-circuit is always coarse (bundle level) — when a granularity=`extension` bundle is partially enabled, the import work proceeds and the runtime composer (the CLI's `composeScanExtensions` / `composeFormatters` in `src/cli/util/plugin-runtime.ts`) drops the disabled extensions before they reach the orchestrator. Vendor Provider bundles (`claude`, `gemini`, `agent-skills`) ship as granularity=`bundle` (the platform integration is on or off as a whole); the `core` bundle is granularity=`extension` (every kernel built-in is removable, satisfying §Boot invariant: "no extension is privileged"). See [`plugin-author-guide.md` §Granularity — bundle vs extension](./plugin-author-guide.md#granularity--bundle-vs-extension) for the author-facing summary.
78
78
 
79
79
  ### `RunnerPort`
80
80
 
@@ -141,14 +141,14 @@ Mode is a property of the extension as a whole, not of an individual call. **An
141
141
 
142
142
  | Kind | Modes | How mode is set |
143
143
  |---|---|---|
144
- | **Extractor** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
144
+ | **Extractor** | deterministic-only | implicit; `mode` field MUST NOT appear |
145
145
  | **Rule** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
146
146
  | **Action** | deterministic / probabilistic | declared in manifest (`mode` field, **required** — no default) |
147
147
  | **Hook** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
148
148
  | **Provider** | deterministic-only | implicit; `mode` field MUST NOT appear |
149
149
  | **Formatter** | deterministic-only | implicit; `mode` field MUST NOT appear |
150
150
 
151
- Provider and Formatter are locked to deterministic because they sit at the **boundaries** of the system. A Provider resolves `path → kind` during boot; probabilistic classification would make the boot phase slow, costly, and non-reproducible. A formatter must produce diffable output (`sm scan` snapshots round-trip in CI). Probabilistic narrators of the graph are a valid product but they live in jobs and emit Findings, not in formatters.
151
+ Provider, Extractor, and Formatter are locked to deterministic because they sit on the **deterministic scan path**. A Provider resolves `path → kind` during boot; probabilistic classification would make the boot phase slow, costly, and non-reproducible. An Extractor consumes a parsed node body inside `sm scan`'s synchronous loop; LLM-driven enrichment of a node is an Action concern (queued as a job and observed via the enrichment layer or sidecar writes), not an Extractor concern — the distinction matters because `sm scan` MUST be fast, free, and reproducible. A Formatter must produce diffable output (`sm scan` snapshots round-trip in CI). Probabilistic narrators of the graph are a valid product but they live in jobs and emit Findings or write to the enrichment layer through Actions, not through Extractors or Formatters.
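The mode table can be read as a small resolution function. The sketch below is illustrative only: `resolveMode` is a hypothetical name, and throwing an `Error` stands in for whatever status the loader actually reports, which this section does not specify.

```typescript
type Kind = "extractor" | "rule" | "action" | "hook" | "provider" | "formatter";
type Mode = "deterministic" | "probabilistic";

// Resolve the effective mode for an extension from its kind and the
// optional `mode` field in its manifest, following the table above.
function resolveMode(kind: Kind, declared?: Mode): Mode {
  switch (kind) {
    case "extractor":
    case "provider":
    case "formatter":
      // Deterministic-only kinds: the `mode` field MUST NOT appear.
      if (declared !== undefined) {
        throw new Error(`${kind}: "mode" must not appear in the manifest`);
      }
      return "deterministic";
    case "action":
      // Actions have no default: `mode` is required.
      if (declared === undefined) {
        throw new Error(`action: "mode" is required`);
      }
      return declared;
    case "rule":
    case "hook":
      // Optional field, defaulting to deterministic.
      return declared ?? "deterministic";
  }
}
```

Under this reading, an Extractor manifest that still carries `mode: 'probabilistic'` from 0.18.0 is rejected rather than silently downgraded.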
152
152
 
153
153
  > **Naming note — `Provider` vs hexagonal `adapter`.** A `Provider` is an **extension** authored by plugins (it recognises a platform and declares its kind catalog). The hexagonal-architecture term `adapter` refers to **port implementations** internal to the kernel package — `RunnerPort.adapter`, `StoragePort.adapter`, `FilesystemPort.adapter`, `PluginLoaderPort.adapter` — and lives under `kernel/adapters/`. The two concepts share an architectural lineage (both bridge two worlds) but live in deliberately disjoint namespaces so plugin authors and impl maintainers never confuse them.
154
154
 
@@ -163,7 +163,7 @@ This separation is normative: a probabilistic extension cannot register a hook t
163
163
 
164
164
  The kernel exposes the LLM through the `RunnerPort` (see §Ports above). Reference impl: `ClaudeCliRunner`. Tests: `MockRunner`. Other adapters (OpenAI, local Ollama, etc.) implement the same port without spec changes.
165
165
 
166
- A probabilistic extension receives the runner in its invocation context alongside `ctx.store`. The extension never imports a specific LLM SDK — the runner contract is what the spec normalizes; wire format and model selection are adapter concerns.
166
+ A probabilistic Action, Rule, or Hook receives the runner in its invocation context alongside `ctx.store` (Extractors are deterministic-only and never see the runner). The extension never imports a specific LLM SDK — the runner contract is what the spec normalizes; wire format and model selection are adapter concerns.
167
167
 
168
168
  ---
169
169
 
@@ -174,7 +174,7 @@ Six kinds, all first-class, all loaded through the same registry. Each kind has
174
174
  | Kind | Role | Input | Output |
175
175
  |---|---|---|---|
176
176
  | **Provider** | Recognizes a platform. Declares the catalog of node `kind`s it emits via the `kinds` map; each map entry pairs the kind's frontmatter schema (path relative to the Provider's package directory) with its `defaultRefreshAction` (a qualified action id that drives the probabilistic-refresh surface). Also declares the filesystem `explorationDir` where its content lives. Deterministic-only. | Filesystem walk results, candidate path. | `{ kind, provider } \| null`. |
177
- | **Extractor** | Extracts signals from a node body. Dual-mode: `deterministic` runs in scan, `probabilistic` runs in jobs. Output flows through three context callbacks (no return value): `ctx.emitLink(link)` for the kernel's `links` table, `ctx.enrichNode(partial)` for the kernel's enrichment layer (separate from the author's frontmatter), `ctx.store` for the plugin's own KV / dedicated tables. | Parsed node (frontmatter + body) + callbacks. | `void` (output via callbacks). |
177
+ | **Extractor** | Extracts signals from a node body. Deterministic-only: runs synchronously inside `sm scan`. Output flows through three context callbacks (no return value): `ctx.emitLink(link)` for the kernel's `links` table, `ctx.enrichNode(partial)` for the kernel's enrichment layer (separate from the author's frontmatter), `ctx.store` for the plugin's own KV / dedicated tables. | Parsed node (frontmatter + body) + callbacks. | `void` (output via callbacks). |
178
178
  | **Rule** | Evaluates the graph. Dual-mode: `deterministic` runs in `sm check`, `probabilistic` runs in jobs. | Full graph (nodes + links). | `Issue[]`. |
179
179
  | **Action** | Operates on one or more nodes. Dual-mode: `deterministic` (in-process code) or `probabilistic` (rendered prompt the runner executes). | Node(s), optional args. | Deterministic: report JSON. Probabilistic: rendered prompt that a runner executes. |
180
180
  | **Formatter** | Serializes the graph. Deterministic-only. | Graph + optional filter. | String (ASCII / Mermaid / DOT / JSON / user-defined). |
@@ -208,23 +208,36 @@ The kernel ships every Provider's `ui` block to the BFF at boot; the BFF aggrega
208
208
 
209
209
  Every `Provider` extension MUST declare an `explorationDir: string` naming the filesystem directory (relative to user home or project root) where its content lives. Examples: `'~/.claude'` for the Claude Provider, `'~/.cursor'` for a hypothetical Cursor Provider. The kernel walks this directory during boot/scan to discover nodes; the Provider's `globs` (if declared) refines what to match inside. `sm doctor` (and `sm plugins doctor`) validates the directory exists; missing directory yields a non-blocking warning so the user sees the gap without the load failing — the Provider may legitimately precede installation of its platform.
210
210
 
211
+ ### Provider · dispatch order and the universal markdown fallback
212
+
213
+ `sm scan` iterates Providers in **registration order** — vendor-specific Providers first (built-in: `claude` → `gemini` → `agent-skills`; user-installed plugins follow in load order), then the built-in `core/markdown` Provider LAST. Each Provider's walker enumerates the full project tree for its declared `read.extensions`; for every emitted file, the orchestrator calls `provider.classify(path, frontmatter)`. The kernel maintains a per-scan `Set<path>` of already-classified files so each path is offered to AT MOST one Provider's `classify`: the first Provider whose `classify` returns non-null claims the file, and subsequent Providers see the path as taken and skip.
214
+
215
+ The dispatch contract has two consequences implementations MUST honour:
216
+
217
+ 1. **First-claim-wins**. A vendor Provider that classifies a file inside its territory (e.g. claude's `.claude/agents/foo.md` → `agent`) is authoritative; later Providers cannot reclassify it to a different kind. This locks vendor ownership of vendor paths and removes the historical `provider-ambiguous` failure mode for non-overlapping territories.
218
+ 2. **`core/markdown` is the universal fallback for unclaimed `.md` files**. The built-in `core/markdown` Provider's `classify` returns `'markdown'` unconditionally (it does NOT inspect the path). Combined with the dedup guarantee above and its terminal position in the iteration order, it picks up exactly the `.md` files no vendor Provider claimed — a `.md` at the project root, under `.claude/hooks/`, `notes/`, `CLAUDE.md`, `GEMINI.md`, or anywhere else outside a known vendor territory. The fallback is **not privileged kernel code**: it ships as a regular built-in Provider under the `core` bundle (`granularity: 'extension'`), so a user who explicitly does not want it can disable it via `sm plugins disable core/markdown` and the scan reverts to "vendor-only" — orphan `.md` files become silently invisible, matching pre-spec-0.9.0 behaviour.
219
+
220
+ The fallback exists because the format-named generic kind `markdown` is provider-agnostic: no vendor owns the universal markdown format. Keeping the fallback as a Provider (rather than a kernel-level special case) preserves the boot invariant that no extension is privileged — when a future vendor Provider (Codex, Cursor, Roo) lands, it slots into the iteration order before `core/markdown` and the fallback semantics stay invariant.
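The dispatch contract above can be sketched as follows. This is a simplified illustration under stated assumptions: every Provider is offered the same pre-walked path list (the per-Provider walker and `read.extensions` filtering are omitted), `classify` takes only the path, and the `Provider` / `Claim` shapes and the `dispatch` name are hypothetical.

```typescript
interface Provider {
  id: string;
  classify(path: string): string | null; // frontmatter argument omitted for brevity
}

interface Claim {
  kind: string;
  provider: string;
}

// First-claim-wins: Providers are visited in registration order and a path
// already claimed by an earlier Provider is never offered to a later one.
function dispatch(paths: string[], providers: Provider[]): Map<string, Claim> {
  const claimed = new Map<string, Claim>();
  for (const provider of providers) {
    for (const path of paths) {
      if (claimed.has(path)) continue; // taken by an earlier Provider
      const kind = provider.classify(path);
      if (kind !== null) claimed.set(path, { kind, provider: provider.id });
    }
  }
  return claimed;
}

// Illustrative Providers: a vendor Provider with a path-bound territory,
// then the unconditional markdown fallback registered last.
const claude: Provider = {
  id: "claude",
  classify: (p) => (p.startsWith(".claude/agents/") ? "agent" : null),
};
const markdown: Provider = { id: "core/markdown", classify: () => "markdown" };

const claims = dispatch(
  [".claude/agents/foo.md", "notes/todo.md", "CLAUDE.md"],
  [claude, markdown],
);
```

Removing `markdown` from the provider list models `sm plugins disable core/markdown`: the two unclaimed files simply do not appear in the result.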
221
+
211
222
  ### Extractor · output callbacks
212
223
 
213
224
  The `Extractor` runtime contract is `extract(ctx) → void`. The extractor emits its work through three callbacks the kernel binds onto `ctx`:
214
225
 
215
226
  - `ctx.emitLink(link)` — append a `Link` to the kernel's `links` table. The kernel validates the link against the extractor's declared `emitsLinkKinds` before persistence; off-contract links are dropped and surface as `extension.error` events. URL-shaped targets (`http(s)://…`) are partitioned out into `node.externalRefsCount` and never persisted.
216
- - `ctx.enrichNode(partial)` — merge canonical, kernel-curated properties onto the current node's enrichment layer (persisted into [`node_enrichments`](./db-schema.md#node_enrichments)). **Strictly separate from the author-supplied frontmatter** (the latter remains immutable across scans). The enrichment layer is the right home for kernel-derived facts (e.g. computed titles, summaries, signals from probabilistic extractors) without polluting what the user wrote on disk. See §Enrichment layer below for the full lifecycle (per-extractor attribution, stale tracking, refresh verbs).
227
+ - `ctx.enrichNode(partial)` — merge canonical, kernel-curated properties onto the current node's enrichment layer (persisted into [`node_enrichments`](./db-schema.md#node_enrichments)). **Strictly separate from the author-supplied frontmatter** (the latter remains immutable across scans). The enrichment layer is the right home for kernel-derived facts (computed titles, summaries, signals an Extractor inferred from the body) without polluting what the user wrote on disk. See §Enrichment layer below for the full lifecycle (per-extractor attribution, refresh verbs).
217
228
  - `ctx.store` — plugin-scoped persistence. Optional, present only when the plugin declares `storage.mode` in `plugin.json`. Shape depends on the mode (`KvStore` for mode A, scoped `Database` for mode B). See [`plugin-kv-api.md`](./plugin-kv-api.md). The plugin author MAY opt into shape validation for their own writes by declaring `storage.schema` (Mode A) or `storage.schemas` (Mode B) in the manifest — JSON Schemas the kernel AJV-compiles at load time and runs against every `ctx.store.set(key, value)` / `ctx.store.write(table, row)` call. Absent = permissive (status quo). `emitLink` and `enrichNode` keep their universal validation against `link.schema.json` / `node.schema.json` regardless of this opt-in. See [`plugin-author-guide.md` §`outputSchema`](./plugin-author-guide.md#outputschema--opt-in-correctness-for-custom-storage-writes).
218
229
 
219
- Probabilistic extractors additionally receive `ctx.runner` (the `RunnerPort`) for LLM dispatch.
230
+ Extractors are deterministic-only; `ctx.runner` is NOT exposed on the Extractor context. LLM-driven enrichment of a node is an Action concern (queued as a job), not an Extractor concern.
220
231
 
221
232
  ### Extractor · enrichment layer
222
233
 
223
- `ctx.enrichNode(partial)` is the only writable surface the Extractor pipeline has on a node. The author's frontmatter on `scan_nodes.frontmatter_json` is read-only from any Extractor — that contract holds for both deterministic and probabilistic extractors. Implementations MUST:
234
+ `ctx.enrichNode(partial)` is the only writable surface the Extractor pipeline has on a node. The author's frontmatter on `scan_nodes.frontmatter_json` is read-only from any Extractor. Implementations MUST:
224
235
 
225
236
  - Persist enrichments into a per-`(node, extractor)` table (the reference impl uses [`node_enrichments`](./db-schema.md#node_enrichments)) so attribution survives across scans.
226
237
  - Preserve the author frontmatter byte-for-byte through every scan and refresh; the enrichment overlay is a SEPARATE store.
227
- - Track stale state for probabilistic rows: when the scan loop detects `body_hash_at_enrichment != node.body_hash` for a probabilistic enrichment, mark the row stale (NOT delete it; the LLM cost is preserved). Deterministic enrichments do not need stale tracking; they regenerate via the §Extractor · fine-grained scan cache contract.
238
+ - Regenerate enrichments through the §Extractor · fine-grained scan cache contract: an unchanged body hash + same registered Extractor reuses the prior row; a changed body re-runs `extract()` and overwrites the row via the PRIMARY KEY conflict. Extractors are deterministic, so a stale-flag is unnecessary: re-running is free and reproducible.
239
+
240
+ > **Reserved columns** — `node_enrichments.is_probabilistic`, `body_hash_at_enrichment`, and `stale` are persisted but inert in this revision: every Extractor write sets `is_probabilistic = 0` and `stale = 0`, with `body_hash_at_enrichment` always equal to the current body hash. The columns are reserved for a future revision where Action-issued enrichments (queued probabilistic jobs writing back through the enrichment layer) will need stale tracking to preserve LLM cost across body changes. Until that revision lands, readers MAY assume `stale = 0` and the merge helper's `includeStale: true` flag is a no-op.
228
241
 
229
242
  Read-side merge (`mergeNodeWithEnrichments` in the reference impl):
230
243
 
@@ -232,13 +245,13 @@ Read-side merge (`mergeNodeWithEnrichments` in the reference impl):
232
245
  2. Sort by `enriched_at` ASC.
233
246
  3. Spread-merge each `value` over the author frontmatter (last-write-wins per field).
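Steps 2 and 3 of the read-side merge can be illustrated with a short sketch. It is not the reference `mergeNodeWithEnrichments`: the `EnrichmentRow` shape, the numeric `enrichedAt`, and the `mergeSketch` name are assumptions, and the row selection that precedes the sort falls outside this hunk and is not modelled.

```typescript
interface EnrichmentRow {
  extractorId: string;
  enrichedAt: number;             // assumed numeric timestamp for easy sorting
  value: Record<string, unknown>; // the partial passed to ctx.enrichNode
}

// Sort rows by enrichedAt ascending, then spread each value over a copy of
// the author frontmatter so the most recent write wins per field.
function mergeSketch(
  frontmatter: Record<string, unknown>,
  rows: EnrichmentRow[],
): Record<string, unknown> {
  const ordered = [...rows].sort((a, b) => a.enrichedAt - b.enrichedAt);
  return ordered.reduce<Record<string, unknown>>(
    (acc, row) => ({ ...acc, ...row.value }),
    { ...frontmatter },
  );
}

const authorFrontmatter = { title: "A", tags: ["x"] };
const merged = mergeSketch(authorFrontmatter, [
  { extractorId: "demo/late", enrichedAt: 2, value: { title: "C" } },
  { extractorId: "demo/early", enrichedAt: 1, value: { title: "B", summary: "s" } },
]);
```

The merge works on copies, so `authorFrontmatter` is left untouched, in line with the requirement that author frontmatter stays separate from the enrichment overlay.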
234
247
 
235
- Rules / `sm check` / `sm export` consume `node.frontmatter` directly (deterministic CI-safe baseline); enrichment consumption is opt-in by the caller. Stale visibility is also opt-in (`includeStale: true` in the merge helper) so the UI can render a "stale (last value: …)" marker without polluting the deterministic merge.
248
+ Rules / `sm check` / `sm export` consume `node.frontmatter` directly (deterministic CI-safe baseline); enrichment consumption is opt-in by the caller.
236
249
 
237
- Refresh verbs (`sm refresh <node>` and `sm refresh --stale`) re-run the Extractor pipeline against a node or the stale set and upsert fresh enrichment rows — see [`cli-contract.md` §Scan](./cli-contract.md#scan).
250
+ Refresh verbs (`sm refresh <node>` and `sm refresh --stale`) re-run the Extractor pipeline against a node or the stale set and upsert fresh enrichment rows — see [`cli-contract.md` §Scan](./cli-contract.md#scan). With Extractors deterministic-only, `--stale` is a no-op today (no rows are stale-flagged); it remains in the contract for the future Action-prob enrichment revision noted above.
238
251
 
239
252
  ### Extractor · `applicableKinds` filter
240
253
 
241
- Extractors MAY declare an optional `applicableKinds: string[]` on their manifest. When declared, the kernel filters fail-fast: `extract()` is invoked **only** for nodes whose `kind` appears in the list. The skip happens BEFORE the extractor context is built so a probabilistic extractor wastes zero LLM cost — and a deterministic extractor zero CPU on inapplicable nodes. Absent (`undefined`) is the default and means "applies to every kind"; there is no wildcard syntax. An empty array (`[]`) is invalid (`minItems: 1` in the schema). Unknown kinds (no installed Provider declares them in its `kinds` catalog) are non-blocking: the extractor keeps `loaded` status and `sm plugins doctor` surfaces an informational warning so the author sees typos and missing-Provider cases, but the doctor's exit code is NOT promoted by this warning. See [`plugin-author-guide.md` §Extractor `applicableKinds`](./plugin-author-guide.md#extractor-applicablekinds--narrow-the-pipeline) for the full author-side contract.
254
+ Extractors MAY declare an optional `applicableKinds: string[]` on their manifest. When declared, the kernel filters fail-fast: `extract()` is invoked **only** for nodes whose `kind` appears in the list. The skip happens BEFORE the extractor context is built so the extractor wastes zero CPU on inapplicable nodes. Absent (`undefined`) is the default and means "applies to every kind"; there is no wildcard syntax. An empty array (`[]`) is invalid (`minItems: 1` in the schema). Unknown kinds (no installed Provider declares them in its `kinds` catalog) are non-blocking: the extractor keeps `loaded` status and `sm plugins doctor` surfaces an informational warning so the author sees typos and missing-Provider cases, but the doctor's exit code is NOT promoted by this warning. See [`plugin-author-guide.md` §Extractor `applicableKinds`](./plugin-author-guide.md#extractor-applicablekinds--narrow-the-pipeline) for the full author-side contract.
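The filter and the non-blocking unknown-kind check can be sketched like this. The `ExtractorManifest` shape and both function names are hypothetical; the empty-array case is left to schema validation (`minItems: 1`) and is not re-checked here.

```typescript
interface ExtractorManifest {
  id: string;
  applicableKinds?: string[]; // absent means "applies to every kind"
}

// Decide, before any extractor context is built, whether extract() should
// run for a node of the given kind.
function appliesTo(extractor: ExtractorManifest, nodeKind: string): boolean {
  if (extractor.applicableKinds === undefined) return true;
  return extractor.applicableKinds.some((kind) => kind === nodeKind);
}

// Kinds the extractor names that no installed Provider declares. A non-empty
// result would feed an informational doctor warning, never a load failure.
function unknownKinds(extractor: ExtractorManifest, knownKinds: Set<string>): string[] {
  return (extractor.applicableKinds ?? []).filter((kind) => !knownKinds.has(kind));
}

const everyKind: ExtractorManifest = { id: "links" };
const skillsOnly: ExtractorManifest = { id: "steps", applicableKinds: ["skill", "skil"] };
const catalog = new Set(["agent", "skill", "markdown"]);
```

With these values, `appliesTo(skillsOnly, "agent")` is false, and `unknownKinds(skillsOnly, catalog)` reports the typo `skil` without affecting the extractor's `loaded` status.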
242
255
 
243
256
  ### Extractor · fine-grained scan cache
244
257
 
@@ -249,9 +262,9 @@ The contract the cache MUST satisfy (engine-agnostic):
249
262
  - A node-level cache hit (body+frontmatter unchanged) is upgraded to a full skip ONLY when every currently-registered Extractor that applies to the node's kind has a recorded run against the prior body hash.
250
263
  - A new Extractor registered between scans MUST run on the cached node — its absence from the cache is the canonical signal. The rest of the cache (existing Extractors against the same body) is preserved.
251
264
  - An Extractor uninstalled between scans MUST have its cache rows removed and its sole-source links dropped. Links whose `sources` mix the uninstalled Extractor's short id with a still-cached Extractor's short id MUST be reshaped: the obsolete short id is stripped from the array and the link survives with the cached attribution intact. The persisted audit trail therefore never references a removed contributor.
252
- - The cache is transparent to plugin authors. An Extractor cannot opt out and cannot inspect the cache; its only obligation is to be deterministic for a given body input (probabilistic Extractors run as jobs, never in scan).
265
+ - The cache is transparent to plugin authors. An Extractor cannot opt out and cannot inspect the cache; its only obligation is to be deterministic for a given body input (this is structural: every Extractor is deterministic-only, by spec).
253
266
 
254
- This invariant is the difference between a free and a paid scan for the probabilistic Extractor model: re-running an LLM Extractor against an unchanged body would be both expensive and non-reproducible.
267
+ The invariant exists to keep `sm scan --changed` cheap on real corpora: re-parsing a body that has not changed for an Extractor that has not changed is wasted work; the cache turns it into a one-row reuse. The same machinery is what will let a future Action-prob enrichment revision (see §Extractor · enrichment layer) reuse paid LLM output across unchanged bodies.
255
268
 
256
269
  ### Extractor · trigger normalization
257
270
 
@@ -400,25 +413,25 @@ This is what makes "CLI-first" a coherent rule: every CLI verb is a kernel funct
400
413
 
401
414
  ## Annotation system
402
415
 
403
- Skill-map's own metadata layer (versioning, supersession, provenance, taxonomy, display, docs) lives in **co-located YAML sidecars** with extension `.sm`, in the same directory as the markdown node they annotate. Vendor files (`.claude/agents/foo.md`, `.cursor/rules/bar.mdc`, …) stay untouched; the sidecar (`foo.sm` / `bar.sm`) carries the annotations.
416
+ Skill-map's own metadata layer (versioning, supersession, provenance, taxonomy, docs) lives in **co-located YAML sidecars** with extension `.sm`, in the same directory as the markdown node they annotate. Vendor files (`.claude/agents/foo.md`, `.cursor/rules/bar.mdc`, …) stay untouched; the sidecar (`foo.sm` / `bar.sm`) IS skill-map's "annotations file" for that node — every key under it is, conceptually, an annotation. The YAML root organizes those annotations into structural blocks (identity, the curated annotations catalog, audit timestamps, settings, plugin namespaces); the file as a whole is the annotation surface.
404
417
 
405
418
  Two schemas describe the wire shape:
406
419
 
407
- - [`schemas/sidecar.schema.json`](./schemas/sidecar.schema.json) — root shape with reserved blocks `for` (identity link), `annotations` (the conventional catalog), `settings` (reserved), `audit` (write trail), plus opt-in `<plugin-id>:` namespacing.
408
- - [`schemas/annotations.schema.json`](./schemas/annotations.schema.json) — curated 14-field catalog: versioning + supersession (`version`, `stability`, `supersedes`, `supersededBy`, `requires`, `conflictsWith`, `related`), provenance (`authors`, `license`, `source`, `sourceVersion`), taxonomy (`tags`), display (`hidden`), docs (`docsUrl`). The activity timestamp lives in the reserved `audit:` block (`audit.lastBumpedAt`), not in `annotations:`. `additionalProperties: true` so plugins or users add custom keys without coordination; the built-in `unknown-field` rule warns on truly unrecognized keys (typo guard).
420
+ - [`schemas/sidecar.schema.json`](./schemas/sidecar.schema.json) — root shape with reserved blocks `identity` (anchor + drift hashes), `annotations` (the conventional catalog), `settings` (reserved), `audit` (write trail), plus opt-in `<plugin-id>:` namespacing.
421
+ - [`schemas/annotations.schema.json`](./schemas/annotations.schema.json) — curated 13-field catalog: versioning + supersession (`version`, `stability`, `supersedes`, `supersededBy`, `requires`, `conflictsWith`, `related`), provenance (`authors`, `license`, `source`, `sourceVersion`), taxonomy (`tags`), docs (`docsUrl`). The activity timestamp lives in the reserved `audit:` block (`audit.lastBumpedAt`), not in `annotations:`. `additionalProperties: true` so plugins or users add custom keys without coordination; the built-in `unknown-field` rule warns on truly unrecognized keys (typo guard).
409
422
 
410
423
  ### Identity and drift
411
424
 
412
- `for` carries `path` (scope-root-relative, matches the canonical Node identifier in [`schemas/node.schema.json`](./schemas/node.schema.json)) plus `bodyHash` and `frontmatterHash`. Both hashes are sha256 over the kernel's canonical form of the markdown body (post-frontmatter bytes) and frontmatter (YAML re-emitted via `js-yaml dump` with `sortKeys: true`, `lineWidth: -1`, `noRefs: true`, `noCompatMode: true`); each sidecar captures the values the kernel saw at the moment it was last written.
425
+ `identity` carries `path` (scope-root-relative, matches the canonical Node identifier in [`schemas/node.schema.json`](./schemas/node.schema.json)) plus `bodyHash` and `frontmatterHash`. Both hashes are sha256 over the kernel's canonical form of the markdown body (post-frontmatter bytes) and frontmatter (YAML re-emitted via `js-yaml dump` with `sortKeys: true`, `lineWidth: -1`, `noRefs: true`, `noCompatMode: true`); each sidecar captures the values the kernel saw at the moment it was last written.
413
426
 
414
- At scan time the kernel re-computes the live hashes and compares against the stored ones. Mismatch in either is **drift**, surfaced via the built-in `annotation-stale` rule (severity `warning`, never blocking — soft mode by design). A `.sm` whose `for.path` no longer points at an existing `.md` is **orphan**, surfaced via the built-in `annotation-orphan` rule (also `warning`). Drift state is **derived**, never stored — pure function over existing data, no flag to drift between flag and reality.
427
+ At scan time the kernel re-computes the live hashes and compares against the stored ones. Mismatch in either is **drift**, surfaced via the built-in `annotation-stale` rule (severity `warning`, never blocking — soft mode by design). A `.sm` whose `identity.path` no longer points at an existing `.md` is **orphan**, surfaced via the built-in `annotation-orphan` rule (also `warning`). Drift state is **derived**, never stored — pure function over existing data, no flag to drift between flag and reality.
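Because drift is derived rather than stored, it can be expressed as a pure function over the stored and live hashes. In the sketch below the status labels `fresh` / `stale` / `orphan` and the `deriveStatus` name are assumptions, and the hashes are taken as already computed (the canonicalization and sha256 step is not shown).

```typescript
type SidecarStatus = "fresh" | "stale" | "orphan"; // hypothetical labels

interface Identity {
  path: string;
  bodyHash: string;
  frontmatterHash: string;
}

interface LiveHashes {
  bodyHash: string;
  frontmatterHash: string;
}

// `live` is undefined when identity.path no longer points at an existing file.
function deriveStatus(identity: Identity, live: LiveHashes | undefined): SidecarStatus {
  if (live === undefined) return "orphan";
  const drifted =
    live.bodyHash !== identity.bodyHash ||
    live.frontmatterHash !== identity.frontmatterHash;
  return drifted ? "stale" : "fresh";
}

const stored: Identity = { path: "notes/a.md", bodyHash: "b1", frontmatterHash: "f1" };
```

A mismatch in either hash is enough to report drift; nothing is written back, so the status can never disagree with the data it is computed from.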
415
428
 
416
429
  ### Bump model
417
430
 
418
431
  The deterministic built-in `core/bump` Action produces a sidecar patch:
419
432
 
420
433
  - Increments `annotations.version` by 1 (or sets to `1` if missing — single integer monotonic, orthogonal to `stability`; major bumps are not a concept, the convention for breaking changes is "create a new node, supersede the old").
421
- - Refreshes `for.bodyHash` and `for.frontmatterHash` to the live values.
434
+ - Refreshes `identity.bodyHash` and `identity.frontmatterHash` to the live values.
422
435
  - Stamps `audit.lastBumpedAt` (ISO 8601 datetime) and `audit.lastBumpedBy` (`'cli'`, `'ui'`, or `'plugin:<id>'`).
423
436
  - On first-time creation also stamps `audit.createdAt` and `audit.createdBy` (set once, stable thereafter).
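The bump steps above can be sketched as a pure function. This is not the `core/bump` implementation: the `Sidecar` shape and the `bump` signature are hypothetical, the function returns a whole updated sidecar rather than a patch, and "first-time creation" is approximated by a missing `audit.createdAt`.

```typescript
interface Sidecar {
  identity: { path: string; bodyHash: string; frontmatterHash: string };
  annotations?: { version?: number; [key: string]: unknown };
  audit?: {
    createdAt?: string;
    createdBy?: string;
    lastBumpedAt?: string;
    lastBumpedBy?: string;
  };
}

// Pure: no IO, no clock access. The caller supplies live hashes, actor and time.
function bump(
  sidecar: Sidecar,
  live: { bodyHash: string; frontmatterHash: string },
  by: string,  // 'cli', 'ui', or 'plugin:<id>'
  now: string, // ISO 8601 datetime
): Sidecar {
  const previous = sidecar.annotations?.version;
  const firstTime = sidecar.audit?.createdAt === undefined; // assumption, see lead-in
  return {
    identity: {
      ...sidecar.identity,
      bodyHash: live.bodyHash,
      frontmatterHash: live.frontmatterHash,
    },
    annotations: {
      ...sidecar.annotations,
      version: previous === undefined ? 1 : previous + 1,
    },
    audit: {
      ...sidecar.audit,
      ...(firstTime ? { createdAt: now, createdBy: by } : {}),
      lastBumpedAt: now,
      lastBumpedBy: by,
    },
  };
}

const initial: Sidecar = {
  identity: { path: "notes/a.md", bodyHash: "old", frontmatterHash: "old" },
};
const first = bump(initial, { bodyHash: "b1", frontmatterHash: "f1" }, "cli", "2025-01-01T00:00:00Z");
const second = bump(first, { bodyHash: "b2", frontmatterHash: "f1" }, "ui", "2025-02-01T00:00:00Z");
```

Passing the timestamp and actor in as arguments keeps the function free of IO, which matches the Action staying pure while the kernel materializes the result.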
424
437
 
@@ -436,7 +449,7 @@ The Action stays pure (no IO). The kernel materializes the patch through the `Si
436
449
  Plugins extend the annotation surface via the `annotationContributions` manifest field — a map of contributed key → `{ schema, ownership, location }`. Inline JSON Schema (no `$ref` to external files). Two location modes:
437
450
 
438
451
  - `location: 'namespaced'` (default) — writes go to the plugin's `<plugin-id>:` block at the sidecar root. Default `ownership: 'shared'`. Plugins write to their own namespace without coordination; AJV validates contributed keys against the plugin's declared schema.
439
- - `location: 'root'` — writes go to a top-level key of the sidecar (alongside `for` / `annotations` / `settings` / `audit`). Requires `ownership: 'exclusive'` (claiming a root key is elevated trust). Two plugins claiming the same root key with `exclusive` is a **hard fatal** at orchestrator startup — the kernel refuses to boot rather than route writes ambiguously.
452
+ - `location: 'root'` — writes go to a top-level key of the sidecar (alongside `identity` / `annotations` / `settings` / `audit`). Requires `ownership: 'exclusive'` (claiming a root key is elevated trust). Two plugins claiming the same root key with `exclusive` is a **hard fatal** at orchestrator startup — the kernel refuses to boot rather than route writes ambiguously.
440
453
 
441
454
  The kernel exposes a runtime catalog (`Kernel.getRegisteredAnnotationKeys()`) listing every plugin-contributed key with its `pluginId`, `location`, `ownership`, and `schema` — consumed by the BFF (`GET /api/annotations/registered`) for UI autocomplete.
442
455
 
@@ -449,6 +462,23 @@ Two columns on `scan_nodes` source from the sidecar's `annotations:` block when
449
462
 
450
463
  A `scan_nodes.annotations_json` column carries the full parsed `annotations:` block; `sidecar_present` and `sidecar_status` carry the drift-detection state. The full sidecar overlay (parsed `annotations`, `status`, `present`) is exposed on `Node.sidecar` so REST and UI consumers see it as part of the canonical wire shape.
451
464
 
465
+ ### Tags · dual-source
466
+
467
+ Skill-map's tag system is **dual-source** by design:
468
+
469
+ - **Author tags** live in `frontmatter.tags` (in the `.md`). Universal optional field declared on [`schemas/frontmatter/base.schema.json`](./schemas/frontmatter/base.schema.json) so every Provider's per-kind schema accepts it without each having to redeclare. These represent intrinsic categories the file's author wrote into the frontmatter (vendor-supplied or your own writing).
470
+ - **User tags** live in `sidecar.annotations.tags` (in the `.sm`). Curated annotation field declared on [`schemas/annotations.schema.json`](./schemas/annotations.schema.json). These represent the post-hoc tags whoever curates the project assigned to the node from their sidecar.
471
+
472
+ The two surfaces are **not aliases**. They capture different intent layers and both are first-class:
473
+
474
+ - Search and listings (`sm list --tag <name>`, UI faceted search) match the **union**: a hit on either source returns the node.
475
+ - The optional `--tag-source author|user` flag filters one source.
476
+ - The UI distinguishes them visually so the attribution stays explicit (different chip style; author chips render first, user chips after).
477
+
478
+ The persistence layer projects rows into a normalized [`scan_node_tags`](./db-schema.md#scan_node_tags) table at write time, one row per `(node_path, tag, source)` triple, so SQL queries can index on `(tag)` for `O(log n)` lookup. Replace-all per scan keeps the table in sync with the live frontmatter + sidecar state; deleting a tag from either source removes its row on the next scan.
479
+
480
+ The wire shape (`/api/nodes` and `/api/nodes/:pathB64`) projects `node.tags = { byAuthor: string[], byUser: string[] }` so consumers see the split with attribution. The kernel `Node` interface (TypeScript) does NOT carry `tags` — consumers that walk the canonical sources read `node.frontmatter.tags` and `node.sidecar.annotations.tags` directly (consistent with the post-decision-#2 posture of "no Node-level denormalisations").
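A rough sketch of how the two tag sources could be projected into rows, matched as a union, and folded back into the wire shape. The helper names (`projectTagRows`, `matchesTag`, `toWireTags`) and the `NodeLike` type are hypothetical, and de-duplicating within a source is an assumption made so that each `(node_path, tag, source)` triple stays unique.

```typescript
type TagSource = 'author' | 'user';

interface TagRow {
  nodePath: string;
  tag: string;
  source: TagSource;
}

// Hypothetical minimal node shape exposing the two canonical tag sources.
interface NodeLike {
  path: string;
  frontmatter: { tags?: string[] };
  sidecar?: { annotations?: { tags?: string[] } };
}

// One row per (nodePath, tag, source) triple; duplicates within a source collapse.
function projectTagRows(node: NodeLike): TagRow[] {
  const author = Array.from(new Set(node.frontmatter.tags ?? []));
  const user = Array.from(new Set(node.sidecar?.annotations?.tags ?? []));
  return [
    ...author.map((tag): TagRow => ({ nodePath: node.path, tag, source: 'author' })),
    ...user.map((tag): TagRow => ({ nodePath: node.path, tag, source: 'user' })),
  ];
}

// Union match by default; passing `source` narrows to one side,
// in the spirit of `--tag-source author|user`.
function matchesTag(rows: TagRow[], tag: string, source?: TagSource): boolean {
  return rows.some((r) => r.tag === tag && (source === undefined || r.source === source));
}

// Wire projection with attribution kept explicit.
function toWireTags(rows: TagRow[]): { byAuthor: string[]; byUser: string[] } {
  return {
    byAuthor: rows.filter((r) => r.source === 'author').map((r) => r.tag),
    byUser: rows.filter((r) => r.source === 'user').map((r) => r.tag),
  };
}
```

The same tag may legitimately appear under both sources; the rows stay distinct because `source` is part of the triple.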
481
+
452
482
  ### Stability
453
483
 
454
484
  The **layout decision** (co-located `.sm`, not mirror tree under `.skill-map/`) is stable as of spec v1.0.0. Moving the home is a major bump.
@@ -457,7 +487,7 @@ The **format** (YAML, extension `.sm`, not `.md.sm`) is stable as of spec v1.0.0
457
487
 
458
488
  The **reserved block names** (`for`, `annotations`, `settings`, `audit`) are stable as of spec v1.0.0. Adding a new reserved block is a minor bump; renaming or removing one is a major bump.
459
489
 
460
- The **identity contract** (`for.path` + `for.bodyHash` + `for.frontmatterHash`, with `resolvedAs` optional) is stable as of spec v1.0.0. Changing the hash algorithm or canonicalization rule is a major bump.
490
+ The **identity contract** (`identity.path` + `identity.bodyHash` + `identity.frontmatterHash`, with `resolvedAs` optional) is stable as of spec v1.0.0. Changing the hash algorithm or canonicalization rule is a major bump.
461
491
 
462
492
  The **bump field set** (the four `audit` fields `lastBumpedAt` / `lastBumpedBy` / `createdAt` / `createdBy`) is stable as of spec v1.0.0. Adding new audit fields is a minor bump; removing or renaming is a major bump. The audit block is `additionalProperties: true` so plugins or future Actions MAY ride additional keys opaquely.
463
493
 
@@ -467,6 +497,172 @@ The **`null`-as-delete sentinel** in `SidecarStore.applyPatch` is an internal co
467
497
 
468
498
  ---
469
499
 
500
+ ## View contribution system
501
+
502
+ Sibling system to the annotation contributions above. Both let plugins extend the surface the kernel exposes; the difference is **where the data lives and what it drives**.
503
+
504
+ | | Annotation contributions | View contributions |
505
+ |---|---|---|
506
+ | **Data lives in** | the user-facing sidecar `.sm` file | the kernel-managed `scan_contributions` table |
507
+ | **Author intent** | extend the metadata catalog | surface per-node data in the UI |
508
+ | **Plugin author writes** | inline JSON Schema for the value | `contract` name from a closed catalog |
509
+ | **Validation** | AJV at sidecar-write time | AJV at `ctx.emitContribution(...)` time |
510
+ | **Lifecycle** | persists across scans (file-on-disk) | re-emitted on every scan (table cleared per node) |
511
+ | **Surfaces in** | sidecar consumers + `<sm-plugin-contributions>` panel | renderer per contract, mounted in slots by the UI |
512
+
513
+ Two schemas describe the wire shape:
514
+
515
+ - [`schemas/view-contracts.schema.json`](./schemas/view-contracts.schema.json) — closed catalog: 10 contract names + the `IViewContribution` manifest declaration shape + per-contract payload schemas (in `$defs/payloads`) the kernel uses to validate emit-time payloads.
516
+ - [`schemas/input-types.schema.json`](./schemas/input-types.schema.json) — closed catalog: 10 input-type names + the `ISettingDeclaration` manifest declaration shape (discriminated by `type`).
517
+
518
+ ### Identity
519
+
520
+ Each view contribution is identified by the qualified id `<pluginId>/<extensionId>/<contributionId>`. The plugin author declares contributions in the extension manifest under `viewContributions: Record<string, IViewContribution>`; the loader composes the qualified id from the plugin id, the extension id, and the Record key.
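As a small illustration of the composition rule, the helper below derives qualified ids from the Record keys. `qualifyContributionIds` is a hypothetical name, and the `hello-world` / `greet` pairing is borrowed from the extension-id example earlier in this document purely for demonstration.

```typescript
// Composes `<pluginId>/<extensionId>/<contributionId>` for every Record key.
function qualifyContributionIds(
  pluginId: string,
  extensionId: string,
  viewContributions: Record<string, unknown>,
): string[] {
  return Object.keys(viewContributions).map(
    (contributionId) => `${pluginId}/${extensionId}/${contributionId}`,
  );
}

const ids = qualifyContributionIds('hello-world', 'greet', {
  breakdown: { contract: 'node-breakdown' },
  total: { contract: 'node-counter' },
});
// ids: ['hello-world/greet/breakdown', 'hello-world/greet/total']
```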
521
+
522
+ ### Manifest
523
+
524
+ Each entry picks a `contract` name from the closed catalog and supplies presentation tuning:
525
+
526
+ ```jsonc
527
+ {
528
+ "viewContributions": {
529
+ "breakdown": {
530
+ "contract": "node-breakdown",
531
+ "label": "Keyword hits",
532
+ "emptyText": "No matches."
533
+ },
534
+ "total": {
535
+ "contract": "node-counter",
536
+ "icon": "🔍",
537
+ "label": "kw",
538
+ "emitWhenEmpty": false
539
+ }
540
+ }
541
+ }
542
+ ```
543
+
544
+ The plugin author NEVER picks a slot, NEVER writes JSON Schema, NEVER ships UI components. Six manifest fields per contribution + the contract catalog page is the entire mental model. See [`plugin-author-guide.md`](./plugin-author-guide.md) §View contributions for worked examples.
545
+
546
+ ### Settings
547
+
548
+ Plugin user-configurable settings live at the manifest root in `settings: Record<string, ISettingDeclaration>` (see [`schemas/plugins-registry.schema.json`](./schemas/plugins-registry.schema.json)). Each setting picks an input-type from the closed catalog at [`schemas/input-types.schema.json`](./schemas/input-types.schema.json) (`string-list`, `single-string`, `boolean-flag`, `integer`, `enum-pick`, `enum-multipick`, `path-glob`, `regex`, `secret`, `key-value-list`). The kernel exposes resolved settings to extractors via `ctx.settings.<settingId>`; the UI generates a form per declaration; the CLI's `sm plugins config <id>` exposes the same surface.
549
+
550
+ Settings are read once at extractor invocation; changing a setting requires `sm scan` to re-emit affected contributions. The UI surfaces a "settings changed, rescan needed" indicator when the manifest detects mismatch; live re-emission is explicitly out of scope (rescan-required is a stability decision per `ROADMAP.md` §UI contribution system D4).
551
+
552
+ ### Runtime catalog
553
+
554
+ The kernel exposes a runtime catalog (`Kernel.getRegisteredViewContributions()`) listing every plugin-contributed view contribution with its `pluginId`, `extensionId`, `contributionId`, `contract`, and the manifest-declared `label` / `tooltip` / `icon` / `emptyText` / `emitWhenEmpty`. The catalog is built once at boot from every loaded extension's `viewContributions` map, AJV-validated, and frozen — same lifecycle as `getRegisteredAnnotationKeys()`.
555
+
556
+ Rules see the catalog through `IRuleContext.viewContributions` so cross-cutting checks (`core/unknown-contract`, `core/contribution-orphan`) can reason about emissions.
557
+
558
+ ### Emit path
559
+
560
+ Extensions emit per-node payloads via context callbacks:
561
+
562
+ ```ts
563
+ // Extractors (per-node walk)
564
+ ctx.emitContribution(contributionId, payload);
565
+
566
+ // Rules (post-merge graph) — same payload contract, explicit nodePath
567
+ // because the rule sees every node at once
568
+ ctx.emitContribution(nodePath, contributionId, payload);
569
+ ```
570
+
571
+ Parallel to `ctx.emitLink(link)`. The kernel buffers the emission, validates the payload against the contract's payload schema in `$defs/payloads/<contract>` (AJV-compiled at boot), and persists the row to `scan_contributions` during `persistScanResult`. Off-contract payloads emit an `extension.error` event and drop silently — same posture as `emitLink` rejecting off-`emitsLinkKinds` links. Both Extractor and Rule emissions land in the same `scan_contributions` rows; the row's `extension_id` records which kind of extension produced it.
572
+
573
+ The Extractor-emit signature binds `nodePath` implicitly (the extractor runs per-node, with `ctx.node.path` available as the only sensible target). The Rule-emit signature requires the rule to declare the target node explicitly because Rules see the full graph at once and may emit for any subset of nodes — the canonical use case is a rule that derives per-node values from cross-graph aggregations (`core/link-counts` projects `linksOutCount` / `linksInCount` this way).
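The two signatures can be pictured as thin wrappers over one buffered, validating emit. This is a sketch under assumptions: `createEmissionBuffer`, `forExtractor`, `forRule`, and the `errors` array (standing in for `extension.error` events) are hypothetical, and the plain `validate` predicate stands in for the AJV-compiled payload validator.

```typescript
interface ContributionRow {
  pluginId: string;
  extensionId: string;
  nodePath: string;
  contributionId: string;
  contract: string;
  payload: unknown;
}

// Stand-in for a manifest entry plus its compiled payload validator.
interface DeclaredContribution {
  contract: string;
  validate: (payload: unknown) => boolean;
}

function createEmissionBuffer(
  pluginId: string,
  extensionId: string,
  declared: Map<string, DeclaredContribution>,
) {
  const rows: ContributionRow[] = [];
  const errors: string[] = []; // stand-in for `extension.error` events

  function emit(nodePath: string, contributionId: string, payload: unknown): void {
    const entry = declared.get(contributionId);
    if (entry === undefined || !entry.validate(payload)) {
      // Off-contract (or undeclared) emission: report and drop, never buffer.
      errors.push(`${pluginId}/${extensionId}/${contributionId}`);
      return;
    }
    rows.push({ pluginId, extensionId, nodePath, contributionId, contract: entry.contract, payload });
  }

  return {
    rows,
    errors,
    // Extractor form: nodePath is bound once, per node.
    forExtractor: (nodePath: string) =>
      (contributionId: string, payload: unknown): void => emit(nodePath, contributionId, payload),
    // Rule form: the rule names the target node on every call.
    forRule: () => emit,
  };
}
```

Both forms push into the same `rows` array, which mirrors the statement that Extractor and Rule emissions land in the same table.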
574
+
575
+ Rules MAY also emit scope-level contributions via `IRuleContext.emitScopeContribution(contributionId, payload)` (only contracts whose schema permits scope-level emission, today only `scope-stat`). That signature is reserved in the spec; the runtime callback lands when the first scope-stat adopter arrives.
576
+
577
+ ### Persistence
578
+
579
+ A new table `scan_contributions` (see [`db-schema.md`](./db-schema.md) §scan_contributions when shipped) carries per-node emissions:
580
+
581
+ | Column | Type | Notes |
582
+ |---|---|---|
583
+ | `plugin_id` | TEXT | qualified plugin id |
584
+ | `extension_id` | TEXT | extension id within the plugin |
585
+ | `node_path` | TEXT | scope-relative path |
586
+ | `contribution_id` | TEXT | manifest Record key |
587
+ | `contract` | TEXT | denormalized contract name (`view-contracts.schema.json#/$defs/ContractName`) |
588
+ | `payload_json` | TEXT | JSON-serialized payload (already validated against contract schema) |
589
+ | `emitted_at` | INTEGER | unix epoch ms |
590
+
591
+ PK `(plugin_id, extension_id, node_path, contribution_id)` so re-emission upserts. Index on `node_path` (inspector lazy-fetch + orphan sweep) and on `plugin_id` (catalog sweep + `purgeByPlugin`).
592
+
593
+ **NOT pure replace-all** (the way `scan_links` / `scan_issues` are). The watcher's cached pass leaves the contributions buffer empty for cached nodes — the orchestrator skips `extract()` when the per-(node, extractor) cache hits, so no `emitContribution` fires. A naive wipe-all would silently drop the prior valid rows on every watcher boot. The persist runs three passes inside the same transaction:
594
+
595
+ 1. **Orphan sweep** — drops every row whose `node_path` is NOT in the current live node set. Disappeared nodes lose their contributions.
596
+ 2. **Catalog sweep** — drops every row whose qualified id `(pluginId, extensionId, contributionId)` is NOT in the registered runtime catalog (uninstalled plugins, disabled bundles, removed contributions).
597
+ 3. **Upsert** — `INSERT ... ON CONFLICT DO UPDATE SET payload_json = excluded.payload_json` for every row in the buffer. PK conflict refreshes payload + `emitted_at`.
598
+
599
+ Cached nodes' rows survive untouched (still in the live set, still in the catalog, no buffer hit). The next time the body changes, the orchestrator re-runs the extractor, fresh contributions land in the buffer, and the upsert refreshes them.
600
+
601
+ Empty buffer + non-empty live set = the cached-pass case (no-op). Empty buffer + empty live set = legacy fallback to wipe-all (cold start). The `IPersistOptions` field `registeredContributionKeys?: ReadonlySet<string>` controls whether the catalog sweep activates — absent set = sweep skipped (legacy callers).
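The three passes can be modelled in memory to see why cached rows survive. The sketch below is illustrative only: a `Map` stands in for the `scan_contributions` table, `persistContributions` is a hypothetical name, and the `pluginId/extensionId/contributionId` string format for `registeredContributionKeys` is an assumption (the text above does not specify the key encoding).

```typescript
interface ContributionRow {
  pluginId: string;
  extensionId: string;
  nodePath: string;
  contributionId: string;
  payloadJson: string;
  emittedAt: number;
}

// Mirrors the PK (plugin_id, extension_id, node_path, contribution_id).
const primaryKey = (r: ContributionRow): string =>
  JSON.stringify([r.pluginId, r.extensionId, r.nodePath, r.contributionId]);

// Assumed encoding of a catalog key.
const catalogKey = (r: ContributionRow): string =>
  `${r.pluginId}/${r.extensionId}/${r.contributionId}`;

function persistContributions(
  table: Map<string, ContributionRow>,
  buffer: ContributionRow[],
  liveNodePaths: ReadonlySet<string>,
  registeredContributionKeys?: ReadonlySet<string>,
): void {
  // Legacy fallback: empty buffer + empty live set wipes everything.
  if (buffer.length === 0 && liveNodePaths.size === 0) {
    table.clear();
    return;
  }
  // 1. Orphan sweep: rows whose node disappeared.
  table.forEach((row, key) => {
    if (!liveNodePaths.has(row.nodePath)) table.delete(key);
  });
  // 2. Catalog sweep: only when the caller supplies the registered set.
  const registered = registeredContributionKeys;
  if (registered !== undefined) {
    table.forEach((row, key) => {
      if (!registered.has(catalogKey(row))) table.delete(key);
    });
  }
  // 3. Upsert: a PK conflict replaces payload and emittedAt.
  for (const row of buffer) table.set(primaryKey(row), row);
}
```

With an empty buffer and a non-empty live set, passes 1 and 2 find nothing to drop for cached nodes and pass 3 does nothing, which is the no-op case described above.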
602
+
603
+ Cold-start posture: the BFF endpoints below return empty arrays when the table is missing (mirror of the `tryWithSqlite` graceful-null pattern used by `routes/nodes.ts`); never a 500.
604
+
605
+ ### BFF surface
606
+
607
+ Endpoints under `/api/contributions/*`:
608
+
609
+ - `GET /api/contributions/registered` — runtime catalog. Mirror of `/api/annotations/registered`. Envelope variant `kind: 'contributions.registered'` (see [`schemas/api/rest-envelope.schema.json`](./schemas/api/rest-envelope.schema.json)).
610
+ - `GET /api/contributions/:pluginId/:extensionId/:contributionId?path=...` — lazy per-node fetch for inspector slots. **Three URL segments** mirror the qualified id `<pluginId>/<extensionId>/<contributionId>`. Filters by qualified id + node path; the BFF enforces `pluginId` ↔ namespace at the route level — no cross-plugin reads via this endpoint.
611
+
612
+ Plus catalog embedding into every payload-bearing envelope:
613
+
614
+ - `kindRegistry` and `contributionsRegistry` are siblings on the envelope (see schema). Built once per server boot, embedded into list (`nodes` / `links` / `issues` / `plugins`), single (`node`), and value (`config`) envelopes. Sentinel envelopes (`health` / `scan` / `graph`) and action-result envelopes (`sidecar.bumped`) and the catalog envelopes themselves (`annotations.registered` / `contributions.registered`) carry neither.
615
+
616
+ Plus per-node embedding on node responses:
617
+
618
+ - `GET /api/nodes/:pathB64` — single-node `item.contributions[]` carries every emission for that node, regardless of `bff.maxBulkContributions`.
619
+ - `GET /api/nodes` (bulk list) — `items[].contributions[]` carries emissions for the page slice **only when** `limit ≤ bff.maxBulkContributions` (default and hard upper bound 200). When the page exceeds the cap, `items[].contributions` is omitted and `meta.contributionsOmitted: true` is set so the UI can lazy-fetch per node. The cap is documented but not promoted; tuning above 200 is unsupported.
620
+ - `GET /api/scan` — the SPA's `CollectionLoaderService` hydrates from this endpoint on F5 / cold boot (single-fetch ScanResult); it MUST embed `contributions[]` per node alongside the standard fields, otherwise the inspector / card slot hosts have nothing to render until the next per-node fetch. Decoration is a single bulk `port.contributions.listForPaths(...)` round-trip after `scans.load()` — sibling of the per-node `isFavorite` decoration on the same route.
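The bulk-list rule for `GET /api/nodes` can be summarised as a single decision on `limit`. The sketch below is a simplified illustration: `embedContributions`, `NodeItem`, and the `listForPaths` callback signature are hypothetical stand-ins, and clamping the configured value to 200 reflects the "hard upper bound" wording above.

```typescript
const HARD_CAP = 200; // default and hard upper bound per the text above

interface NodeItem {
  path: string;
  contributions?: unknown[];
}

interface NodeListResult {
  items: NodeItem[];
  meta: { contributionsOmitted?: true };
}

function embedContributions(
  items: NodeItem[],
  limit: number,
  maxBulkContributions: number,
  listForPaths: (paths: string[]) => Map<string, unknown[]>, // hypothetical bulk lookup
): NodeListResult {
  const cap = Math.min(maxBulkContributions, HARD_CAP);
  if (limit > cap) {
    // Page too large: omit contributions and flag it so the UI can lazy-fetch.
    return { items, meta: { contributionsOmitted: true } };
  }
  // One bulk round-trip for the whole page slice.
  const byPath = listForPaths(items.map((item) => item.path));
  return {
    items: items.map((item) => ({ ...item, contributions: byPath.get(item.path) ?? [] })),
    meta: {},
  };
}
```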
621
+
622
+ ### Isolation
623
+
624
+ View contributions extend the existing plugin-isolation model (see [`plugin-kv-api.md`](./plugin-kv-api.md) §Honest note on isolation) with six rules specific to UI rendering:
625
+
626
+ 1. **No raw DOM from plugin** — contributions are typed data only; the UI renders them via a closed catalog of Angular components mapped from contract id.
627
+ 2. **CSS scoping by Angular view encapsulation** — plugin does not write CSS; per-plugin tinting is sourced from a kernel-managed palette derived from `pluginId`.
628
+ 3. **Data path namespaced and BFF-enforced** — `GET /api/contributions/:pluginId/:extensionId/:contributionId?path=...` rejects cross-plugin reads at the route level (the qualified id triple is the URL shape).
629
+ 4. **Click actions are typed kernel verb dispatches** — a button rendered from a contribution invokes a kernel verb by qualified id; no arbitrary URLs / effects.
630
+ 5. **AJV at three layers** — manifest at load (rejects unknown `contract` names with `invalid-manifest`), payload at emit (rejects off-contract payloads with `extension.error`), envelope at BFF response.
631
+ 6. **Renderer attr-sanitization** — the UI's renderer components MUST NOT bind contribution data to `[innerHTML]`, `[style]`, `[src]`, `[href]`, or any DomSanitizer DANGEROUS_ATTR. Lint-enforced in the UI workspace; documented in [`context/view-contributions.md`](../context/view-contributions.md).
632
+
633
+ Same honest-note posture as [`plugin-kv-api.md`](./plugin-kv-api.md): isolated against accidents, not hostile code, until worker-thread / iframe sandbox post-v1.0.
634
+
635
+ ### Soft-warning rules
636
+
637
+ Two built-ins ship with the system to cover catalog evolution and rename edge cases:
638
+
639
+ - **`core/unknown-contract`** — walks every loaded plugin's `viewContributions[*].contract`; emits an `Issue` of severity `warn` for any contract not in the current kernel catalog. Parallel to `core/unknown-field` for annotations. Note: AJV at manifest load already rejects unknown contracts as `invalid-manifest`; this rule covers the soft-warning path when a plugin remains loaded across a catalog version bump.
640
+ - **`core/contribution-orphan`** — joins `scan_contributions` against the live `scan_nodes` set; emits an `Issue` of severity `warn` for emissions whose `node_path` no longer exists (post-rename heuristic miss).
641
+
642
+ ### Catalog versioning
643
+
644
+ The catalog of contracts and input-types evolves on its own cadence, independent of the spec version. Plugin manifests carry an optional `catalogCompat: string` (semver range) field at the root, parallel to `specCompat`. The kernel checks `semver.satisfies(catalogVersion, plugin.catalogCompat)` at load. Mismatch surfaces as `incompatible-catalog` plugin status (new entry in the load-status enum). Resolution: `sm plugins upgrade <id>` runs registered migrations from a closed kernel-side registry of `{ from, to, transform }` triples; auto-migration impossible → CLI exit ≠ 0 + UI dialog naming the offending contract / input-type.
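One plausible reading of the `{ from, to, transform }` registry is a chain walked until the target catalog version is reached. The sketch below is an assumption, not the kernel's algorithm: `upgradeManifest` is a hypothetical name, `from` / `to` are treated as exact version strings, and the thrown error stands in for the non-zero CLI exit when auto-migration is impossible.

```typescript
type Manifest = Record<string, unknown>;

interface Migration {
  from: string;
  to: string;
  transform: (manifest: Manifest) => Manifest;
}

function upgradeManifest(
  manifest: Manifest,
  fromVersion: string,
  targetVersion: string,
  registry: Migration[],
): Manifest {
  let current = manifest;
  let version = fromVersion;
  const visited = new Set<string>();
  while (version !== targetVersion) {
    if (visited.has(version)) throw new Error(`migration cycle at ${version}`);
    visited.add(version);
    const step = registry.find((m) => m.from === version);
    // No registered step means auto-migration is impossible from here.
    if (step === undefined) throw new Error(`no migration registered from ${version}`);
    current = step.transform(current);
    version = step.to;
  }
  return current;
}
```

Keeping the registry closed and kernel-side, as the text describes, means plugin authors never supply their own `transform` functions.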
645
+
646
+ Pre-1.0 versioning rule (per [`AGENTS.md`](../AGENTS.md)): catalog breaking changes ship as minor bumps while in `0.y.z`; the first `1.0.0` is a deliberate stabilization moment, not a side-effect.
647
+
648
+ ### Stability
649
+
650
+ The **closed catalog of view contracts** is stable as of the v1 of this system: adding a new contract is a minor bump; renaming or removing one is a catalog-major bump and triggers `sm plugins upgrade` migration of every dependent plugin.
651
+
652
+ The **`IViewContribution` manifest shape** (six fields: `contract`, `label?`, `tooltip?`, `icon?`, `emptyText?`, `emitWhenEmpty?`) is stable. Adding a new optional field is a minor bump; making a field required or removing one is a catalog-major bump.
653
+
654
+ The **closed catalog of input-types** is stable on the same model: adding minor, renaming/removing major.
655
+
656
+ The **`ctx.emitContribution(id, payload)` signature** is stable. Adding new context callbacks (e.g. `ctx.emitScopeContribution`) is additive and minor.
657
+
658
+ The **persistence shape** (`scan_contributions` columns) is stable; column additions are minor bumps. Renames or removals trigger a kernel migration.
659
+
660
+ The **slot catalog ownership** (UI-only, kernel does not know about slots) is a permanent architectural decision; it is NOT versioned because the kernel does not expose it. Different driving adapters (UI, future TUI, `sm show --json`) MAY publish their own slot catalogs over the same contributions data without spec coordination.
661
+
662
+ The **isolation honest-note** (accidents, not hostile code) is the same posture as [`plugin-kv-api.md`](./plugin-kv-api.md) and migrates together when worker-thread / iframe sandbox lands post-v1.0.
663
+
664
+ ---
665
+
470
666
  ## See also
471
667
 
472
668
  - [`cli-contract.md`](./cli-contract.md) — verb surface of the CLI driving adapter.
@@ -488,7 +684,7 @@ The **extension kind list** (6 kinds: Provider, Extractor, Rule, Action, Formatt
488
684
 
489
685
  The **Hook curated trigger set** (eight events: `scan.started`, `scan.completed`, `extractor.completed`, `rule.completed`, `action.completed`, `job.spawning`, `job.completed`, `job.failed`) is stable as of spec v1.0.0. Adding a ninth trigger is a minor bump; removing or renaming any of the eight is a major bump.
490
686
 
491
- The **execution modes** (`deterministic` / `probabilistic`) and the per-kind mode capability matrix above are stable as of spec v1.0.0. Adding a third mode or changing which kinds are dual-mode is a major bump. Renaming or repurposing the mode enum values is a major bump.
687
+ The **execution modes** (`deterministic` / `probabilistic`) and the per-kind mode capability matrix above are stable as of spec v1.0.0. Adding a third mode is a major bump. Renaming or repurposing the mode enum values is a major bump. Pre-1.0, narrowing a kind from dual-mode to single-mode is permitted as a minor bump (Extractor went from `deterministic / probabilistic` to `deterministic-only` in 0.X.0); post-1.0 the same change would be major.
492
688
 
493
689
  The **dependency rules** above are stable as of spec v1.0.0. Relaxing any is a major bump; tightening (forbidding an allowed import) is a minor bump.
494
690
 
package/cli-contract.md CHANGED
@@ -198,8 +198,8 @@ Keys are dot-paths (`jobs.minimumTtlSeconds`, `scan.tokenize`). Unknown keys →
198
198
  | `sm scan --watch` | Long-running: watch the roots and trigger an incremental scan after each debounced batch of filesystem events. Alias of `sm watch`. |
199
199
  | `sm scan compare-with <dump> [roots...]` | Delta report: run a fresh scan in memory and compare against the saved `ScanResult` dump at `<dump>`. Read-only — does not modify the DB. Exit `0` on empty delta, `1` on any drift, `2` on operational error (missing or malformed dump, schema violation). |
200
200
  | `sm watch [roots...]` | Long-running watcher. Same semantics as `sm scan --watch`, exposed as a top-level verb because the watcher is a loop, not a one-shot scan. |
201
- | `sm refresh <node.path>` | Re-run Extractors against a single node and upsert their outputs into the universal enrichment layer (`node_enrichments`, see [`db-schema.md`](./db-schema.md#node_enrichments)). Stub state until the job subsystem ships at Step 10: deterministic Extractors run for real and persist; probabilistic Extractors emit a stderr advisory and skip without touching their stale rows. Exit `0` on success (with possible stub advisory), `2` on failure, `5` if the node is not in the persisted scan. |
202
- | `sm refresh --stale` | Batch form of `sm refresh <node>` — refreshes every node carrying at least one stale probabilistic enrichment row. Same stub caveat: deterministic Extractors persist; probabilistic Extractors skip with a stderr advisory. Exit `0` (including when the stale set is empty prints a "nothing to do" advisory). |
201
+ | `sm refresh <node.path>` | Re-run Extractors against a single node and upsert their outputs into the universal enrichment layer (`node_enrichments`, see [`db-schema.md`](./db-schema.md#node_enrichments)). Extractors are deterministic-only: they run synchronously and persist. Exit `0` on success, `2` on failure, `5` if the node is not in the persisted scan. |
202
+ | `sm refresh --stale` | Batch form of `sm refresh <node>` — refreshes every node carrying at least one stale enrichment row. With Extractors deterministic-only, the stale set is empty in this revision (Extractor writes never set `stale = 1`) so `--stale` always exits `0` with a "nothing to do" advisory. The verb is preserved for the future Action-prob enrichment revision (see [`architecture.md` §Extractor · enrichment layer](./architecture.md#extractor--enrichment-layer)) where queued LLM jobs will populate stale rows. |
203
203
 
204
204
  `--json` output conforms to `schemas/scan-result.schema.json`. `sm watch` (and `sm scan --watch`) emit one ScanResult per batch — under `--json` this is an `ndjson` stream of ScanResult documents.
205
205
 
@@ -244,7 +244,7 @@ The built-in deterministic `core/bump` Action is the canonical write channel for
244
244
  | `sm bump --pending [--staged] [--force]` | Batch bump. Walks every node whose sidecar overlay reports drift in `node.path` ASC order and bumps each. `--staged` runs `git add <sidecar-path>` after each successful bump so the new content lands in the same commit; `git add` failure degrades to a stderr warning, the batch keeps running. Empty stale set → exit `0` with a "nothing to do" advisory. `--json` envelope: `{ bumped, refused, skipped, errors[], elapsedMs }`. Exit `0` on a clean run; `1` when at least one per-node error landed in `errors[]`. **Git error matrix for `--staged`**: not inside a git repo (no `.git/` parent of `cwd`) → exit `5`; `git` binary not on PATH (spawn ENOENT) → exit `2`. Both checks run BEFORE any sidecar write so a misconfigured environment never produces partial state. |
245
245
  | `sm sidecar refresh <node.path>` | Hash-only update on the sidecar. Refreshes `for.{bodyHash, frontmatterHash}` to match the live node WITHOUT bumping `annotations.version` and WITHOUT touching the audit block. Useful when the user knows a body change is editorial-only and doesn't want to spend a version increment. Distinct from the top-level `sm refresh` (which targets the enrichment layer at Step A.8) — different storage, different concept; the sub-namespace prefix prevents the collision. Exit `5` if the node has no sidecar or is not in the persisted scan. No-op on a fresh node (informational stderr, exit `0`). |
246
246
  | `sm sidecar prune [--dry-run] [--yes]` | Delete orphan `.sm` files (sidecars whose accompanying `<basename>.md` does not exist on disk). Destructive — without `--dry-run` prompts for interactive confirmation listing every file to be deleted (per the §Dry-run rule for destructive verbs). `--yes` (alias `--force`) bypasses the prompt for non-interactive callers (CI, the pre-commit hook, scripts). With `--dry-run` reports what would be deleted without touching disk and never prompts. Different domain from `sm orphans` — that verb operates on the node graph (rename heuristic); this one operates on the filesystem layer. `--json` envelope: `{ deleted, wouldDelete, errors, items[], elapsedMs }`. Exit `1` when delete failures landed in `errors`. |
247
- | `sm sidecar annotate <node.path> [--force]` | Pure scaffolding. Writes a minimal `.sm` next to the `.md` with the identity (`for:`) block populated and an empty `annotations: {}` block, ready for editing. Refuses if the file exists; `--force` overwrites. The optional legacy-frontmatter migration helper (`--from-frontmatter`) is deferred — no released consumer demands it. |
247
+ | `sm sidecar annotate <node.path> [--force]` | Pure scaffolding. Writes a minimal `.sm` next to the `.md` with the `identity:` block populated and an empty `annotations: {}` block, ready for editing. Refuses if the file exists; `--force` overwrites. The optional legacy-frontmatter migration helper (`--from-frontmatter`) is deferred — no released consumer demands it. |
248
248
  | `sm hooks install pre-commit-bump [--dry-run]` | Install (or chain into) a git pre-commit hook that runs `sm bump --pending --staged` so any staged drift in `.sm` sidecars auto-bumps before the commit lands. Idempotent: re-running detects the skill-map marker and no-ops. When the repo already has a custom `pre-commit`, the verb appends the skill-map block to the existing file rather than replacing it. `--dry-run` prints the planned content with `--- target: <path> ---` markers and writes nothing. Exit `5` if no `.git/` parent is found at or above `cwd`; exit `2` on write failures or unknown hook flavours. |
249
249
 
250
250
  **`.sm` round-trip contract.** The `bump` verb, `sm sidecar refresh`, and `sm sidecar annotate` write through `FilesystemSidecarStore`, which re-serialises the merged result via `js-yaml` `dump` with `sortKeys: true`. **`.sm` files are managed artifacts; comments and key order are not preserved on round-trip.** Author commentary belongs in the markdown body or in a separate documentation file, not inside `.sm`. The integrity guarantee is that the merged YAML always validates against `sidecar.schema.json` + `annotations.schema.json` and that the file is written atomically (`.tmp + rename`).
@@ -252,7 +252,7 @@ The built-in deterministic `core/bump` Action is the canonical write channel for
252
252
  Concretely, a hand-edited sidecar like this:
253
253
 
254
254
  ```yaml
255
- for:
255
+ identity:
256
256
  path: agents/reviewer.md
257
257
  bodyHash: 3dd7d0...
258
258
  frontmatterHash: 271d1e...
@@ -283,7 +283,7 @@ audit:
283
283
  createdBy: cli
284
284
  lastBumpedAt: '2026-05-07T10:00:00.000Z'
285
285
  lastBumpedBy: cli
286
- for:
286
+ identity:
287
287
  bodyHash: 3dd7d0...
288
288
  frontmatterHash: 271d1e...
289
289
  path: agents/reviewer.md
@@ -460,16 +460,19 @@ The reference implementation ships a Hono BFF rooted at `src/server/`. One Node
460
460
 
461
461
  | Path | Status | Shape |
462
462
  |---|---|---|
463
- | `GET /api/health` | implemented | `{ ok: true, schemaVersion, specVersion, implVersion, scope: 'project'\|'global', db: 'present'\|'missing' }` |
463
+ | `GET /api/health` | implemented | `{ ok: true, schemaVersion, specVersion, implVersion, scope: 'project'\|'global', db: 'present'\|'missing', cwd: string, dbPath: string }`. `cwd` is the absolute project root the BFF resolves against (`runtimeContext.cwd`); `dbPath` is the absolute project DB path (`IServerOptions.dbPath`). Both are surfaced so the SPA's About panel can show "you are looking at <project>" + the DB location without a second endpoint. |
464
464
  | `GET /api/scan` | implemented | latest persisted `ScanResult` (1:1 with `scan-result.schema.json`; byte-equal to `sm scan --json` modulo whitespace). DB absent → empty `ScanResult` shape (zero `nodes` / `links` / `issues`). |
465
465
  | `GET /api/scan?fresh=1` | implemented | runs an in-memory scan and returns the produced `ScanResult` without persistence. Rejects with `bad-query` (400) when the server was started with `--no-built-ins` or `--no-plugins` (would yield empty / partial results). |
466
+ | `POST /api/scan` | implemented | Run a fresh scan **and persist it** through the same `runScanWithRenames` + `persistScanResult` pipeline the watcher uses. Body is empty (`{}` or no body). Response: the persisted `ScanResult` inline (same shape as `GET /api/scan`). Side effects: broadcasts `scan.started` then `scan.completed` over `/ws` so other connected clients can refresh — the per-batch sequence is identical to a watcher-driven batch. **Concurrency**: only one scan may run at a time across the whole BFF process. A POST that arrives while a watcher batch is in flight (or while another POST is in flight) is rejected with `409 scan-busy` so the caller can decide whether to retry. **Pipeline gate**: rejected with `400 bad-query` when the server was started with `--no-built-ins` or `--no-plugins` (a partial pipeline would persist a misleading DB the next watcher boot would have to reconcile). **DB gate**: rejected with `500 db-missing` when the project DB file is absent — the read-side `/api/scan` degrades to the empty shape, but a write path cannot, so it fails fast. |
  | `GET /api/nodes?kind=&hasIssues=&path=&limit=&offset=` | implemented | `RestEnvelope` (`kind: 'nodes'`) — paginated, filtered list. Filters share the `kind=` / `has=issues` / `path=<glob>` grammar with `sm export`. `hasIssues=false` is a server-side post-filter (not representable in the kernel grammar). Pagination defaults `offset=0`, `limit=100`; max `limit=1000`. |
  | `GET /api/nodes/:pathB64[?include=body]` | implemented | Single-node detail envelope: `{ schemaVersion, kind: 'node', item: Node, links: { incoming: Link[], outgoing: Link[] }, issues: Issue[] }`. `:pathB64` is base64url (RFC 4648 §5, no padding) of `node.path`. Missing node or malformed `pathB64` → 404 `not-found`. **`?include=body`** (Step 14.5.a) — opt-in flag that adds `item.body: string \| null` to the response. The body is read from disk on demand at request time (the kernel persists `bodyHash` only). `null` indicates the source file was missing / unreadable when the request landed (the watcher will re-emit `scan.completed` when it catches up). Without the flag, `item.body` is `undefined` and the handler does not touch the filesystem. |
  | `GET /api/links?kind=&from=&to=` | implemented | `RestEnvelope` (`kind: 'links'`) — list of links. Filters: `kind` (CSV whitelist of `link.kind`), `from` (exact match on `link.source`), `to` (exact match on `link.target`). No pagination at v14.2. |
  | `GET /api/issues?severity=&ruleId=&node=` | implemented | `RestEnvelope` (`kind: 'issues'`) — list of issues. Filters: `severity` (CSV from `error\|warn\|info`), `ruleId` (CSV; qualified or short suffix per `sm check --rules`), `node` (filter to issues whose `nodeIds` includes the path). No pagination at v14.2. |
  | `GET /api/graph?format=ascii\|json\|md` | implemented | formatter-rendered graph. `Content-Type` per format: `text/plain` (ascii), `application/json` (json), `text/markdown` (md / mermaid). Default `format=ascii`. Unknown format → 400 `bad-query`. |
  | `GET /api/config` | implemented | `RestEnvelope` (`kind: 'config'`) — merged effective config (defaults → user → user-local → project → project-local → override). |
- | `GET /api/plugins` | implemented | `RestEnvelope` (`kind: 'plugins'`) — list of installed plugins (built-in + drop-in) with status. Item shape: `{ id, version, kinds, status, reason, source: 'built-in'\|'project'\|'global' }`. |
+ | `GET /api/plugins` | implemented | `RestEnvelope` (`kind: 'plugins'`) — list of installed plugins (built-in + drop-in) with status. Item shape: `{ id, version, kinds, status, reason, source: 'built-in'\|'project'\|'global', granularity: 'bundle'\|'extension', description?: string, extensions?: Array<{ id, kind, version, enabled, description?: string }> }`. The `granularity` field reflects the manifest declaration (built-ins: hardcoded per `built-in-plugins/built-ins.ts`; drop-ins: from `plugin.json#/granularity`, default `'bundle'`). The `description` field on the bundle item carries the manifest-declared description (built-ins: hardcoded on `IBuiltInBundle`; drop-ins: `plugin.json#/description`); each `extensions[]` entry carries its extension manifest's `description` per `IExtensionBase` (`extensions/base.schema.json#/properties/description`). The SPA's Settings list renders the descriptions as muted secondary text and includes them in its substring-search index alongside the ids. The `extensions` array is present **only** when `granularity === 'extension'` AND the plugin loaded successfully; each entry's `enabled` reflects the per-extension override resolution (DB > settings.json > installed default). For `granularity: 'bundle'` plugins the array is omitted (the bundle is the only toggle-able key). |
+ | `PATCH /api/plugins/:id` | implemented | Toggle one plugin's user override. `:id` MUST be a top-level bundle id; qualified-id form (`bundle/extension`) is the sibling route below. Body `{ enabled: boolean }` (JSON). Persists to `config_plugins` via `IConfigPluginsPort.set` — same path the CLI's `sm plugins enable\|disable` uses. Response is the canonical `{ id, version, kinds, status, reason, source }` row for the affected plugin (post-write `status` reflects the new override resolution). **Granularity** — rejected with 400 `bad-query` when the target bundle declares `granularity: 'extension'` (only the qualified-id form is toggle-able for those). **Restart required** — the loaded plugin runtime is boot-cached; the new value applies on the next `sm scan` or `sm serve` restart. The endpoint does NOT broadcast a WS event today. |
+ | `PATCH /api/plugins/:bundleId/extensions/:extensionId` | implemented | Qualified-id form for `granularity: 'extension'` bundles (today: `core` + any user plugin that opts in). Body `{ enabled: boolean }`. Both segments are URL-path-segment-encoded (no slash inside `:bundleId` or `:extensionId`). Rejected with 400 `bad-query` when the target bundle declares `granularity: 'bundle'` (use the sibling route above). Same persistence + restart-required semantics as the bundle form. |
  | `ALL /api/*` (other) | reserved | structured 404 envelope (see below); future endpoints land in subsequent sub-steps. |
  | `GET /ws` | implemented (v14.4.a) | accepts WebSocket upgrade and registers the client with the BFF broadcaster. Server-push only — the server fans `scan.*` (and forthcoming `issue.*`) events to every connected client. See **WebSocket protocol** below. |
  | `GET *` | implemented | static asset from the resolved UI bundle, falling back to `index.html` for SPA deep links. |
@@ -491,14 +494,18 @@ List endpoints conform to [`schemas/api/rest-envelope.schema.json`](schemas/api/
  }
  ```
 
- HTTP status mapping: `400` → `bad-query`, `404` → `not-found`, `500` → `internal` / `db-missing`.
+ HTTP status mapping: `400` → `bad-query`, `404` → `not-found`, `409` → `sidecar-fresh` (`POST /api/sidecar/bump`) or `scan-busy` (`POST /api/scan`), `500` → `internal` / `db-missing`.
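The status mapping can be written down as a lookup table, together with a small decision helper for `POST /api/scan` callers. This is a sketch under stated assumptions: `STATUS_BY_CODE`, `ScanPostAction`, and `actionForScanPost` are hypothetical names, and the helper takes the error code as an argument because the field that carries it in the error envelope is not restated here.

```typescript
// Sketch only: these names are illustrative, not part of the spec.
type ErrorCode =
  | "bad-query"
  | "not-found"
  | "sidecar-fresh"
  | "scan-busy"
  | "internal"
  | "db-missing";

// One HTTP status per error code, as listed in the mapping line.
const STATUS_BY_CODE: Record<ErrorCode, number> = {
  "bad-query": 400,
  "not-found": 404,
  "sidecar-fresh": 409,
  "scan-busy": 409,
  "internal": 500,
  "db-missing": 500,
};

type ScanPostAction = "use-result" | "wait-for-scan-completed" | "give-up";

// Decide what a caller of `POST /api/scan` might do with a response.
function actionForScanPost(status: number, code?: ErrorCode): ScanPostAction {
  if (status >= 200 && status < 300) return "use-result";
  // Another scan is in flight: retry after the WS `scan.completed` event.
  if (status === STATUS_BY_CODE["scan-busy"] && code === "scan-busy") {
    return "wait-for-scan-completed";
  }
  // `bad-query` (server flags) and `db-missing` will not change on retry.
  return "give-up";
}
```

Two codes share 409, so a client has to branch on the code rather than on the status alone.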
 
  Error code sources at v14.2:
 
  - `not-found` (404) — unknown `/api/*` path; missing node on `/api/nodes/:pathB64`; malformed `pathB64` (treated as "no such node" so the client UX is uniform).
  - `bad-query` (400) — `ExportQueryError` from `parseExportQuery`; `limit` above the maximum of 1000; non-integer or negative `limit` / `offset`; unknown formatter on `/api/graph`; `?fresh=1` when the server started with `--no-built-ins` or `--no-plugins`.
  - `internal` (500) — uncaught error during a request (e.g. config-load failure, DB corruption surfacing through `loadScanResult`).
- - `db-missing` (500) — reserved for endpoints that cannot degrade to an empty result. The v14.2 routes uniformly degrade (`/api/scan` returns the empty shape; list endpoints return zero items), so this code is not currently emitted by any handler; it is documented for future endpoints (post-v0.6.0 mutations) where degradation is not safe.
+ - `db-missing` (500) — emitted by mutation endpoints (`POST /api/scan`, `PATCH /api/plugins/:id`, `PATCH /api/plugins/:bundleId/extensions/:extensionId`) when the project DB is absent. Read-side routes uniformly degrade to the empty shape (`/api/scan`) or zero items (list endpoints), so they do not emit this code; mutation endpoints cannot persist without a DB, so they fail fast instead of silently dropping the write.
+ - `not-found` (404) on `PATCH /api/plugins/:id` — unknown plugin id (no built-in bundle, no discovered drop-in matches). The qualified-id form returns the same code when either segment misses.
+ - `bad-query` (400) on `PATCH /api/plugins/:id` — granularity mismatch (bundle-level call against a `granularity: 'extension'` bundle, or qualified-id call against a `granularity: 'bundle'` bundle), malformed body (missing `enabled`, wrong type), unknown extension id under a known bundle.
+ - `bad-query` (400) on `POST /api/scan` — the server was started with `--no-built-ins` or `--no-plugins` (partial pipeline would persist a misleading DB).
+ - `scan-busy` (409) on `POST /api/scan` — another scan (a watcher batch or another POST) is already in flight. Retry once the in-flight scan resolves; the WS `scan.completed` envelope is the unambiguous "now safe" signal.
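The granularity rule for the two `PATCH` routes can be illustrated from the client side: pick the route that matches the bundle's declared `granularity`, so the server never has to answer with a granularity-mismatch `bad-query`. This is a minimal sketch; `ToggleTarget` and `pluginToggleRoute` are hypothetical names, and the ids in the usage note are examples only.

```typescript
// Sketch only: `ToggleTarget` and `pluginToggleRoute` are hypothetical names.
type Granularity = "bundle" | "extension";

interface ToggleTarget {
  bundleId: string;
  granularity: Granularity;
  // Required for `granularity: 'extension'`, forbidden for `'bundle'`.
  extensionId?: string;
}

// Return the PATCH route for a toggle, or throw on a combination the
// server would reject as a granularity mismatch.
function pluginToggleRoute(target: ToggleTarget): string {
  const bundle = encodeURIComponent(target.bundleId);
  if (target.granularity === "bundle") {
    if (target.extensionId !== undefined) {
      throw new Error("bundle-granularity plugins are toggled as a whole");
    }
    return "/api/plugins/" + bundle;
  }
  if (target.extensionId === undefined) {
    throw new Error("extension-granularity plugins need an extension id");
  }
  return "/api/plugins/" + bundle + "/extensions/" + encodeURIComponent(target.extensionId);
}
```

For example, a target of `{ bundleId: "core", granularity: "extension", extensionId: "broken-ref" }` resolves to `/api/plugins/core/extensions/broken-ref`. Either route takes the body `{ enabled: boolean }`, and the new value applies only after the next `sm scan` or `sm serve` restart.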
 
  **Flag surface**:
 
@@ -0,0 +1,22 @@
+ {
+ "$schema": "https://skill-map.dev/spec/v0/conformance-case.schema.json",
+ "id": "orphan-markdown-fallback",
+ "description": "spec 0.18.0 universal markdown fallback. A `.md` file no vendor-specific Provider classifies (e.g. `ARCHITECTURE.md` at the project root) MUST be picked up by the built-in `core/markdown` Provider, classified as kind `markdown`, and attributed to the `markdown` provider id. The orchestrator's path-dedup ensures vendor Providers retain priority on files inside their territory (`.claude/agents/reviewer.md` here stays with `claude` as `agent`). Locks the contract that markdown is provider-agnostic and the kernel emits no privileged kinds.",
+ "fixture": "orphan-markdown",
+ "invoke": {
+ "verb": "scan",
+ "flags": ["--json"]
+ },
+ "assertions": [
+ { "type": "exit-code", "value": 0 },
+ { "type": "json-path", "path": "$.schemaVersion", "equals": 1 },
+ { "type": "json-path", "path": "$.stats.nodesCount", "equals": 2 },
+ { "type": "json-path", "path": "$.stats.issuesCount", "equals": 0 },
+ { "type": "json-path", "path": "$.nodes[0].path", "equals": ".claude/agents/reviewer.md" },
+ { "type": "json-path", "path": "$.nodes[0].kind", "equals": "agent" },
+ { "type": "json-path", "path": "$.nodes[0].provider", "equals": "claude" },
+ { "type": "json-path", "path": "$.nodes[1].path", "equals": "ARCHITECTURE.md" },
+ { "type": "json-path", "path": "$.nodes[1].kind", "equals": "markdown" },
+ { "type": "json-path", "path": "$.nodes[1].provider", "equals": "markdown" }
+ ]
+ }
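The fallback behaviour this case locks in can be sketched as a first-claim-wins pass over an ordered provider list. This is an illustration only, not the orchestrator's implementation: `ProviderSketch`, `classifyPaths`, and the two provider objects are hypothetical, and the assumption that priority comes from running vendor Providers before the markdown fallback is ours. The case description says only that path-dedup keeps vendor Providers ahead inside their territory.

```typescript
// Sketch only: all names here are hypothetical stand-ins.
interface Classification {
  kind: string;
  provider: string;
}

interface ProviderSketch {
  id: string;
  // Return a kind when the provider claims the path, else null.
  classify(path: string): string | null;
}

// Stand-in for a vendor Provider that owns `.claude/agents/*.md`.
const claudeSketch: ProviderSketch = {
  id: "claude",
  classify: (path) =>
    path.indexOf(".claude/agents/") === 0 && /\.md$/.test(path) ? "agent" : null,
};

// Stand-in for the universal fallback: any `.md` file is kind `markdown`.
const markdownSketch: ProviderSketch = {
  id: "markdown",
  classify: (path) => (/\.md$/.test(path) ? "markdown" : null),
};

// First claim wins: a path already claimed by an earlier provider is skipped.
function classifyPaths(
  paths: string[],
  providers: ProviderSketch[],
): Record<string, Classification> {
  const claimed: Record<string, Classification> = {};
  for (const provider of providers) {
    for (const path of paths) {
      if (Object.prototype.hasOwnProperty.call(claimed, path)) continue;
      const kind = provider.classify(path);
      if (kind !== null) claimed[path] = { kind: kind, provider: provider.id };
    }
  }
  return claimed;
}

const result = classifyPaths(
  [".claude/agents/reviewer.md", "ARCHITECTURE.md"],
  [claudeSketch, markdownSketch],
);
```

With the fixture's two files, `result` attributes `.claude/agents/reviewer.md` to `claude` as `agent` and `ARCHITECTURE.md` to `markdown` as `markdown`, matching the assertions in the case.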
@@ -10,10 +10,11 @@
  "assertions": [
  { "type": "exit-code", "value": 0 },
  { "type": "stderr-matches", "pattern": "plugin bad-provider:.*invalid.*must have required property 'ui'" },
- { "type": "json-path", "path": "$.providers.length", "equals": 3 },
+ { "type": "json-path", "path": "$.providers.length", "equals": 4 },
  { "type": "json-path", "path": "$.providers[0]", "equals": "claude" },
  { "type": "json-path", "path": "$.providers[1]", "equals": "gemini" },
  { "type": "json-path", "path": "$.providers[2]", "equals": "agent-skills" },
+ { "type": "json-path", "path": "$.providers[3]", "equals": "markdown" },
  { "type": "json-path", "path": "$.nodes.length", "equals": 1 }
  ]
  }
@@ -1,7 +1,7 @@
  {
  "$schema": "https://skill-map.dev/spec/v0/conformance-case.schema.json",
  "id": "sidecar-end-to-end",
- "description": "Step 9.6.6 — co-located `.sm` sidecar end-to-end. Scanning a fixture that carries a stale sidecar (wrong `for.{bodyHash,frontmatterHash}`) plus an orphan sidecar (no sibling `.md`) MUST surface `sidecar_status` queryable on the node, denormalise `annotations.version` into the node row, and emit both `annotation-stale` (per stale node) and `annotation-orphan` (per orphan `.sm`) issues from the built-in core rules.",
+ "description": "Step 9.6.6 — co-located `.sm` sidecar end-to-end. Scanning a fixture that carries a stale sidecar (wrong `identity.{bodyHash,frontmatterHash}`) plus an orphan sidecar (no sibling `.md`) MUST surface `sidecar.status` on the node and emit both `annotation-stale` (per stale node) and `annotation-orphan` (per orphan `.sm`) issues from the built-in core rules.",
  "fixture": "sidecar-end-to-end",
  "invoke": {
  "verb": "scan",
@@ -12,7 +12,6 @@
  { "type": "json-path", "path": "$.schemaVersion", "equals": 1 },
  { "type": "json-path", "path": "$.stats.nodesCount", "equals": 1 },
  { "type": "json-path", "path": "$.nodes[0].path", "equals": ".claude/agents/stale.md" },
- { "type": "json-path", "path": "$.nodes[0].version", "equals": 7 },
  { "type": "json-path", "path": "$.nodes[0].sidecar.present", "equals": true },
  { "type": "json-path", "path": "$.nodes[0].sidecar.status", "matches": "^stale-(body|frontmatter|both)$" },
  { "type": "json-path", "path": "$.nodes[0].sidecar.annotations.version", "equals": 7 },