@skill-map/spec 0.18.0 → 0.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47) hide show
  1. package/CHANGELOG.md +680 -1
  2. package/README.md +6 -6
  3. package/architecture.md +244 -41
  4. package/cli-contract.md +48 -20
  5. package/conformance/README.md +2 -2
  6. package/conformance/cases/kernel-empty-boot.json +2 -2
  7. package/conformance/cases/orphan-markdown-fallback.json +22 -0
  8. package/conformance/cases/plugin-missing-ui-rejected.json +2 -1
  9. package/conformance/cases/sidecar-end-to-end.json +3 -4
  10. package/conformance/coverage.md +8 -6
  11. package/conformance/fixtures/orphan-markdown/.claude/agents/reviewer.md +6 -0
  12. package/conformance/fixtures/orphan-markdown/ARCHITECTURE.md +10 -0
  13. package/conformance/fixtures/sidecar-end-to-end/.claude/agents/orphan.sm +2 -2
  14. package/conformance/fixtures/sidecar-end-to-end/.claude/agents/stale.sm +1 -1
  15. package/conformance/fixtures/sidecar-example/agent-example.md +1 -1
  16. package/conformance/fixtures/sidecar-example/agent-example.sm +1 -1
  17. package/db-schema.md +68 -23
  18. package/index.json +47 -42
  19. package/interfaces/security-scanner.md +2 -2
  20. package/job-events.md +12 -12
  21. package/job-lifecycle.md +1 -1
  22. package/package.json +1 -1
  23. package/plugin-author-guide.md +374 -69
  24. package/plugin-kv-api.md +5 -5
  25. package/prompt-preamble.md +1 -1
  26. package/schemas/annotations.schema.json +5 -9
  27. package/schemas/api/rest-envelope.schema.json +55 -11
  28. package/schemas/conformance-case.schema.json +2 -2
  29. package/schemas/extensions/analyzer.schema.json +43 -0
  30. package/schemas/extensions/base.schema.json +14 -4
  31. package/schemas/extensions/extractor.schema.json +3 -10
  32. package/schemas/extensions/hook.schema.json +6 -4
  33. package/schemas/extensions/provider.schema.json +1 -1
  34. package/schemas/frontmatter/base.schema.json +6 -1
  35. package/schemas/input-types.schema.json +260 -0
  36. package/schemas/issue.schema.json +6 -6
  37. package/schemas/link.schema.json +2 -2
  38. package/schemas/node.schema.json +1 -19
  39. package/schemas/plugins-registry.schema.json +14 -2
  40. package/schemas/project-config.schema.json +25 -0
  41. package/schemas/sidecar.schema.json +6 -6
  42. package/schemas/summaries/agent.schema.json +1 -1
  43. package/schemas/summaries/command.schema.json +1 -1
  44. package/schemas/summaries/hook.schema.json +1 -1
  45. package/schemas/summaries/markdown.schema.json +1 -1
  46. package/schemas/view-slots.schema.json +335 -0
  47. package/schemas/extensions/rule.schema.json +0 -43
package/README.md CHANGED
@@ -7,7 +7,7 @@ This document is the **source of truth**. The reference implementation under `..
7
7
  ## What this spec defines
8
8
 
9
9
  - The **domain model**: nodes, links, issues, scan results.
10
- - The **extension contract**: six extension kinds (provider, extractor, rule, action, formatter, hook) with their input/output shapes.
10
+ - The **extension contract**: six extension kinds (provider, extractor, analyzer, action, formatter, hook) with their input/output shapes.
11
11
  - The **CLI contract**: verb set, flags, exit codes, JSON introspection.
12
12
  - The **persistence contract**: table catalog owned by the kernel, plugin key-value API.
13
13
  - The **job contract**: lifecycle states, event stream, prompt preamble, submit/claim/record semantics.
@@ -36,10 +36,10 @@ These are implementation decisions. The reference impl picks them (see [`../AGEN
36
36
 
37
37
  ## Naming conventions
38
38
 
39
- Two rules govern every identifier in the spec. They are **normative**.
39
+ Two analyzers govern every identifier in the spec. They are **normative**.
40
40
 
41
- - **Filesystem artefacts use kebab-case.** Every file and directory in `spec/` (and in any conforming implementation) — `scan-result.schema.json`, `job-lifecycle.md`, `report-base.schema.json`, `auto-rename-medium` (as an `issue.ruleId` value), `direct-override` (as a `safety.injectionType` enum value), and so on — is kebab-case lowercase. Enum values and issue rule ids follow the same convention so they can be echoed back into URLs, filenames, and log keys without escaping.
42
- - **JSON content uses camelCase.** Every key inside a JSON Schema, frontmatter block, config file, plugin manifest, action manifest, job record, report, event payload, or API response is camelCase: `whatItDoes`, `injectionDetected`, `expectedTools`, `conflictsWith`, `docsUrl`, `examplesUrl`, `ttlSeconds`, `runId`, `jobId`. This matches the JS/TS ecosystem the reference impl ships in and the Kysely `CamelCasePlugin` that bridges to the `snake_case` SQL layer — but the rule is spec-level, not implementation-level: an alternative implementation in any language still exposes camelCase JSON keys.
41
+ - **Filesystem artefacts use kebab-case.** Every file and directory in `spec/` (and in any conforming implementation) — `scan-result.schema.json`, `job-lifecycle.md`, `report-base.schema.json`, `auto-rename-medium` (as an `issue.analyzerId` value), `direct-override` (as a `safety.injectionType` enum value), and so on — is kebab-case lowercase. Enum values and issue analyzer ids follow the same convention so they can be echoed back into URLs, filenames, and log keys without escaping.
42
+ - **JSON content uses camelCase.** Every key inside a JSON Schema, frontmatter block, config file, plugin manifest, action manifest, job record, report, event payload, or API response is camelCase: `whatItDoes`, `injectionDetected`, `expectedTools`, `conflictsWith`, `docsUrl`, `examplesUrl`, `ttlSeconds`, `runId`, `jobId`. This matches the JS/TS ecosystem the reference impl ships in and the Kysely `CamelCasePlugin` that bridges to the `snake_case` SQL layer — but the analyzer is spec-level, not implementation-level: an alternative implementation in any language still exposes camelCase JSON keys.
43
43
 
44
44
  The SQL persistence layer is the sole exception: tables, columns, and migration filenames use `snake_case` (see `db-schema.md`). That boundary is crossed only inside a storage adapter; nothing that leaves the kernel should ever be `snake_case`.
45
45
 
@@ -78,7 +78,7 @@ spec/ ← published as @skill-map/spec
78
78
  │ │ ├── base.schema.json ┐
79
79
  │ │ ├── provider.schema.json │
80
80
  │ │ ├── extractor.schema.json │ 6 extension schemas
81
- │ │ ├── rule.schema.json │ (base + 5 kinds)
81
+ │ │ ├── analyzer.schema.json │ (base + 5 kinds)
82
82
  │ │ ├── action.schema.json │
83
83
  │ │ └── formatter.schema.json ┘
84
84
  │ │
@@ -108,7 +108,7 @@ spec/ ← published as @skill-map/spec
108
108
  ## How to read this spec
109
109
 
110
110
  - **Building a tool or plugin that consumes skill-map output?** Start with [`schemas/scan-result.schema.json`](./schemas/scan-result.schema.json) and [`schemas/node.schema.json`](./schemas/node.schema.json).
111
- - **Building a custom extractor, rule, or formatter?** Read [`architecture.md`](./architecture.md), then the relevant schema under [`schemas/extensions/`](./schemas/extensions/).
111
+ - **Building a custom extractor, analyzer, or formatter?** Read [`architecture.md`](./architecture.md), then the relevant schema under [`schemas/extensions/`](./schemas/extensions/).
112
112
  - **Building an alternative CLI implementation?** Read [`cli-contract.md`](./cli-contract.md) and run [`conformance/`](./conformance/README.md).
113
113
  - **Integrating a new platform (adapter)?** Read [`architecture.md`](./architecture.md) §adapters, then the Claude adapter source in `../src/extensions/adapters/claude/` as a worked example.
114
114
  - **Shipping a job-running runner?** Read [`job-events.md`](./job-events.md), [`job-lifecycle.md`](./job-lifecycle.md), [`prompt-preamble.md`](./prompt-preamble.md).
package/architecture.md CHANGED
@@ -67,14 +67,14 @@ Discovers plugin directories, reads `plugin.json`, checks `specCompat`, dynamica
67
67
 
68
68
  Operations: `discover(scopes)`, `load(pluginPath)`, `validateManifest(json)`.
69
69
 
70
- The loader enforces two id-uniqueness rules during discovery (see [`plugin-author-guide.md` §Plugin id uniqueness](./plugin-author-guide.md#plugin-id-uniqueness) for the author-facing summary):
70
+ The loader enforces two id-uniqueness analyzers during discovery (see [`plugin-author-guide.md` §Plugin id uniqueness](./plugin-author-guide.md#plugin-id-uniqueness) for the author-facing summary):
71
71
 
72
- 1. **Directory name == manifest id.** A plugin lives at `<root>/<id>/plugin.json`. A mismatch surfaces as status `invalid-manifest`. This rule eliminates same-root collisions by construction.
73
- 2. **Cross-root id collision blocks both sides.** Two plugins reachable from different roots (project + global, or any `--plugin-dir` combination) that declare the same `id` BOTH receive status `id-collision`. No precedence rule applies — coherent with §Boot invariant ("no extension is privileged"). The user resolves by renaming one of them.
72
+ 1. **Directory name == manifest id.** A plugin lives at `<root>/<id>/plugin.json`. A mismatch surfaces as status `invalid-manifest`. This analyzer eliminates same-root collisions by construction.
73
+ 2. **Cross-root id collision blocks both sides.** Two plugins reachable from different roots (project + global, or any `--plugin-dir` combination) that declare the same `id` BOTH receive status `id-collision`. No precedence analyzer applies — coherent with §Boot invariant ("no extension is privileged"). The user resolves by renaming one of them.
74
74
 
75
- In addition, the loader **qualifies every extension** with its owning plugin id before registering it. The registry stores extensions under the qualified id `<plugin-id>/<extension-id>` (e.g. `claude/slash`, `core/broken-ref`, `hello-world/greet`). Authors continue to declare the short `id` in each extension manifest; the loader composes the qualified form from `manifest.id` at load time. Built-in extensions bundled with the reference impl declare their `pluginId` directly in `built-ins.ts` — `core/` for kernel-internal primitives (rules, the formatter, the external-url-counter extractor) and `claude/` for the Claude provider bundle (the Provider and its kind-aware extractors). If a plugin author injects a `pluginId` field on an extension that disagrees with `plugin.json`'s `id`, the loader emits `invalid-manifest` with a directed reason.
75
+ In addition, the loader **qualifies every extension** with its owning plugin id before registering it. The registry stores extensions under the qualified id `<plugin-id>/<extension-id>` (e.g. `core/slash`, `core/broken-ref`, `hello-world/greet`). Authors continue to declare the short `id` in each extension manifest; the loader composes the qualified form from `manifest.id` at load time. Built-in extensions bundled with the reference impl declare their `pluginId` directly in `built-ins.ts` — `core/` for kernel-internal primitives (every analyzer, the formatter, the cross-vendor extractors `annotations` / `slash` / `at-directive` / `markdown-link` / `external-url-counter`) and vendor-specific bundles such as `claude/` (the Claude provider) for Provider integrations whose territory is platform-bound. If a plugin author injects a `pluginId` field on an extension that disagrees with `plugin.json`'s `id`, the loader emits `invalid-manifest` with a directed reason.
76
76
 
77
- Each plugin (and each built-in bundle) declares a **granularity** that controls how its extensions are toggled. `granularity: 'bundle'` (the default) means the plugin id is the only enable/disable key; `granularity: 'extension'` means each extension is independently toggle-able under its qualified id. The loader's pre-import `resolveEnabled(pluginId)` short-circuit is always coarse (bundle level) — when a granularity=`extension` bundle is partially enabled, the import work proceeds and the runtime composer (the CLI's `composeScanExtensions` / `composeFormatters` in `src/cli/util/plugin-runtime.ts`) drops the disabled extensions before they reach the orchestrator. The two built-in bundles split deliberately: `claude` is granularity=`bundle` (provider-level toggle), `core` is granularity=`extension` (every kernel built-in is removable, satisfying §Boot invariant: "no extension is privileged"). See [`plugin-author-guide.md` §Granularity — bundle vs extension](./plugin-author-guide.md#granularity--bundle-vs-extension) for the author-facing summary.
77
+ Each plugin (and each built-in bundle) declares a **granularity** that controls how its extensions are toggled. `granularity: 'bundle'` (the default) means the plugin id is the only enable/disable key; `granularity: 'extension'` means each extension is independently toggle-able under its qualified id. The loader's pre-import `resolveEnabled(pluginId)` short-circuit is always coarse (bundle level) — when a granularity=`extension` bundle is partially enabled, the import work proceeds and the runtime composer (the CLI's `composeScanExtensions` / `composeFormatters` in `src/cli/util/plugin-runtime.ts`) drops the disabled extensions before they reach the orchestrator. Vendor Provider bundles (`claude`, `gemini`, `agent-skills`) ship as granularity=`bundle` (the platform integration is on or off as a whole); the `core` bundle is granularity=`extension` (every kernel built-in is removable, satisfying §Boot invariant: "no extension is privileged"). See [`plugin-author-guide.md` §Granularity — bundle vs extension](./plugin-author-guide.md#granularity--bundle-vs-extension) for the author-facing summary.
78
78
 
79
79
  ### `RunnerPort`
80
80
 
@@ -141,14 +141,14 @@ Mode is a property of the extension as a whole, not of an individual call. **An
141
141
 
142
142
  | Kind | Modes | How mode is set |
143
143
  |---|---|---|
144
- | **Extractor** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
145
- | **Rule** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
144
+ | **Extractor** | deterministic-only | implicit; `mode` field MUST NOT appear |
145
+ | **Analyzer** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
146
146
  | **Action** | deterministic / probabilistic | declared in manifest (`mode` field, **required** — no default) |
147
147
  | **Hook** | deterministic / probabilistic | declared in manifest (`mode` field, optional; defaults to `deterministic`) |
148
148
  | **Provider** | deterministic-only | implicit; `mode` field MUST NOT appear |
149
149
  | **Formatter** | deterministic-only | implicit; `mode` field MUST NOT appear |
150
150
 
151
- Provider and Formatter are locked to deterministic because they sit at the **boundaries** of the system. A Provider resolves `path → kind` during boot; probabilistic classification would make the boot phase slow, costly, and non-reproducible. A formatter must produce diffable output (`sm scan` snapshots round-trip in CI). Probabilistic narrators of the graph are a valid product but they live in jobs and emit Findings, not in formatters.
151
+ Provider, Extractor, and Formatter are locked to deterministic because they sit on the **deterministic scan path**. A Provider resolves `path → kind` during boot; probabilistic classification would make the boot phase slow, costly, and non-reproducible. An Extractor consumes a parsed node body inside `sm scan`'s synchronous loop; LLM-driven enrichment of a node is an Action concern (queued as a job and observed via the enrichment layer or sidecar writes), not an Extractor concern — the distinction matters because `sm scan` MUST be fast, free, and reproducible. A Formatter must produce diffable output (`sm scan` snapshots round-trip in CI). Probabilistic narrators of the graph are a valid product but they live in jobs and emit Findings or write to the enrichment layer through Actions, not through Extractors or Formatters.
152
152
 
153
153
  > **Naming note — `Provider` vs hexagonal `adapter`.** A `Provider` is an **extension** authored by plugins (it recognises a platform and declares its kind catalog). The hexagonal-architecture term `adapter` refers to **port implementations** internal to the kernel package — `RunnerPort.adapter`, `StoragePort.adapter`, `FilesystemPort.adapter`, `PluginLoaderPort.adapter` — and lives under `kernel/adapters/`. The two concepts share an architectural lineage (both bridge two worlds) but live in deliberately disjoint namespaces so plugin authors and impl maintainers never confuse them.
154
154
 
@@ -163,7 +163,7 @@ This separation is normative: a probabilistic extension cannot register a hook t
163
163
 
164
164
  The kernel exposes the LLM through the `RunnerPort` (see §Ports above). Reference impl: `ClaudeCliRunner`. Tests: `MockRunner`. Other adapters (OpenAI, local Ollama, etc.) implement the same port without spec changes.
165
165
 
166
- A probabilistic extension receives the runner in its invocation context alongside `ctx.store`. The extension never imports a specific LLM SDK — the runner contract is what the spec normalizes; wire format and model selection are adapter concerns.
166
+ A probabilistic Action, Analyzer, or Hook receives the runner in its invocation context alongside `ctx.store` (Extractors are deterministic-only and never see the runner). The extension never imports a specific LLM SDK — the runner contract is what the spec normalizes; wire format and model selection are adapter concerns.
167
167
 
168
168
  ---
169
169
 
@@ -174,11 +174,11 @@ Six kinds, all first-class, all loaded through the same registry. Each kind has
174
174
  | Kind | Role | Input | Output |
175
175
  |---|---|---|---|
176
176
  | **Provider** | Recognizes a platform. Declares the catalog of node `kind`s it emits via the `kinds` map; each map entry pairs the kind's frontmatter schema (path relative to the Provider's package directory) with its `defaultRefreshAction` (a qualified action id that drives the probabilistic-refresh surface). Also declares the filesystem `explorationDir` where its content lives. Deterministic-only. | Filesystem walk results, candidate path. | `{ kind, provider } \| null`. |
177
- | **Extractor** | Extracts signals from a node body. Dual-mode: `deterministic` runs in scan, `probabilistic` runs in jobs. Output flows through three context callbacks (no return value): `ctx.emitLink(link)` for the kernel's `links` table, `ctx.enrichNode(partial)` for the kernel's enrichment layer (separate from the author's frontmatter), `ctx.store` for the plugin's own KV / dedicated tables. | Parsed node (frontmatter + body) + callbacks. | `void` (output via callbacks). |
178
- | **Rule** | Evaluates the graph. Dual-mode: `deterministic` runs in `sm check`, `probabilistic` runs in jobs. | Full graph (nodes + links). | `Issue[]`. |
177
+ | **Extractor** | Extracts signals from a node body. Deterministic-only: runs synchronously inside `sm scan`. Output flows through three context callbacks (no return value): `ctx.emitLink(link)` for the kernel's `links` table, `ctx.enrichNode(partial)` for the kernel's enrichment layer (separate from the author's frontmatter), `ctx.store` for the plugin's own KV / dedicated tables. | Parsed node (frontmatter + body) + callbacks. | `void` (output via callbacks). |
178
+ | **Analyzer** | Evaluates the graph. Dual-mode: `deterministic` runs in `sm check`, `probabilistic` runs in jobs. | Full graph (nodes + links). | `Issue[]`. |
179
179
  | **Action** | Operates on one or more nodes. Dual-mode: `deterministic` (in-process code) or `probabilistic` (rendered prompt the runner executes). | Node(s), optional args. | Deterministic: report JSON. Probabilistic: rendered prompt that a runner executes. |
180
180
  | **Formatter** | Serializes the graph. Deterministic-only. | Graph + optional filter. | String (ASCII / Mermaid / DOT / JSON / user-defined). |
181
- | **Hook** | Reacts declaratively to one of eight curated lifecycle events (`scan.started`, `scan.completed`, `extractor.completed`, `rule.completed`, `action.completed`, `job.spawning`, `job.completed`, `job.failed`). Dual-mode: `deterministic` runs in-process during the dispatch, `probabilistic` is enqueued as a job. Hooks REACT to events; they cannot block, mutate, or steer the pipeline. | A curated event payload (run-scoped, scan-scoped, or job-scoped) plus an optional declarative `filter` map. | `void` (reactions are side effects). |
181
+ | **Hook** | Reacts declaratively to one of ten curated lifecycle events — eight pipeline-driven (`scan.started`, `scan.completed`, `extractor.completed`, `analyzer.completed`, `action.completed`, `job.spawning`, `job.completed`, `job.failed`) plus two CLI-process-driven (`boot` before verb routing, `shutdown` after the verb's exit code resolves). Dual-mode: `deterministic` runs in-process during the dispatch, `probabilistic` is enqueued as a job. Hooks REACT to events; they cannot block, mutate, or steer the pipeline. | A curated event payload (run-scoped, scan-scoped, job-scoped, or process-scoped) plus an optional declarative `filter` map. | `void` (reactions are side effects). |
182
182
 
183
183
  ### Provider · `kinds` catalog
184
184
 
@@ -208,23 +208,36 @@ The kernel ships every Provider's `ui` block to the BFF at boot; the BFF aggrega
208
208
 
209
209
  Every `Provider` extension MUST declare an `explorationDir: string` naming the filesystem directory (relative to user home or project root) where its content lives. Examples: `'~/.claude'` for the Claude Provider, `'~/.cursor'` for a hypothetical Cursor Provider. The kernel walks this directory during boot/scan to discover nodes; the Provider's `globs` (if declared) refines what to match inside. `sm doctor` (and `sm plugins doctor`) validates the directory exists; missing directory yields a non-blocking warning so the user sees the gap without the load failing — the Provider may legitimately precede installation of its platform.
210
210
 
211
+ ### Provider · dispatch order and the universal markdown fallback
212
+
213
+ `sm scan` iterates Providers in **registration order** — vendor-specific Providers first (built-in: `claude` → `gemini` → `agent-skills`; user-installed plugins follow in load order), then the built-in `core/markdown` Provider LAST. Each Provider's walker enumerates the full project tree for its declared `read.extensions`; for every emitted file, the orchestrator calls `provider.classify(path, frontmatter)`. The kernel maintains a per-scan `Set<path>` of already-classified files so each path is offered to AT MOST one Provider's `classify`: the first Provider whose `classify` returns non-null claims the file, and subsequent Providers see the path as taken and skip.
214
+
215
+ The dispatch contract has two consequences implementations MUST honour:
216
+
217
+ 1. **First-claim-wins**. A vendor Provider that classifies a file inside its territory (e.g. claude's `.claude/agents/foo.md` → `agent`) is authoritative; later Providers cannot reclassify it to a different kind. This locks vendor ownership of vendor paths and removes the historical `provider-ambiguous` failure mode for non-overlapping territories.
218
+ 2. **`core/markdown` is the universal fallback for unclaimed `.md` files**. The built-in `core/markdown` Provider's `classify` returns `'markdown'` unconditionally (it does NOT inspect the path). Combined with the dedup guarantee above and its terminal position in the iteration order, it picks up exactly the `.md` files no vendor Provider claimed — a `.md` at the project root, under `.claude/hooks/`, `notes/`, `CLAUDE.md`, `GEMINI.md`, or anywhere else outside a known vendor territory. The fallback is **not privileged kernel code**: it ships as a regular built-in Provider under the `core` bundle (`granularity: 'extension'`), so a user who explicitly does not want it can disable it via `sm plugins disable core/markdown` and the scan reverts to "vendor-only" — orphan `.md` files become silently invisible, matching pre-spec-0.9.0 behaviour.
219
+
220
+ The fallback exists because the format-named generic kind `markdown` is provider-agnostic: no vendor owns the universal markdown format. Keeping the fallback as a Provider (rather than a kernel-level special case) preserves the boot invariant that no extension is privileged — when a future vendor Provider (Codex, Cursor, Roo) lands, it slots into the iteration order before `core/markdown` and the fallback semantics stay invariant.
221
+
211
222
  ### Extractor · output callbacks
212
223
 
213
224
  The `Extractor` runtime contract is `extract(ctx) → void`. The extractor emits its work through three callbacks the kernel binds onto `ctx`:
214
225
 
215
226
  - `ctx.emitLink(link)` — append a `Link` to the kernel's `links` table. The kernel validates the link against the extractor's declared `emitsLinkKinds` before persistence; off-contract links are dropped and surface as `extension.error` events. URL-shaped targets (`http(s)://…`) are partitioned out into `node.externalRefsCount` and never persisted.
216
- - `ctx.enrichNode(partial)` — merge canonical, kernel-curated properties onto the current node's enrichment layer (persisted into [`node_enrichments`](./db-schema.md#node_enrichments)). **Strictly separate from the author-supplied frontmatter** (the latter remains immutable across scans). The enrichment layer is the right home for kernel-derived facts (e.g. computed titles, summaries, signals from probabilistic extractors) without polluting what the user wrote on disk. See §Enrichment layer below for the full lifecycle (per-extractor attribution, stale tracking, refresh verbs).
227
+ - `ctx.enrichNode(partial)` — merge canonical, kernel-curated properties onto the current node's enrichment layer (persisted into [`node_enrichments`](./db-schema.md#node_enrichments)). **Strictly separate from the author-supplied frontmatter** (the latter remains immutable across scans). The enrichment layer is the right home for kernel-derived facts (computed titles, summaries, signals an Extractor inferred from the body) without polluting what the user wrote on disk. See §Enrichment layer below for the full lifecycle (per-extractor attribution, refresh verbs).
217
228
  - `ctx.store` — plugin-scoped persistence. Optional, present only when the plugin declares `storage.mode` in `plugin.json`. Shape depends on the mode (`KvStore` for mode A, scoped `Database` for mode B). See [`plugin-kv-api.md`](./plugin-kv-api.md). The plugin author MAY opt into shape validation for their own writes by declaring `storage.schema` (Mode A) or `storage.schemas` (Mode B) in the manifest — JSON Schemas the kernel AJV-compiles at load time and runs against every `ctx.store.set(key, value)` / `ctx.store.write(table, row)` call. Absent = permissive (status quo). `emitLink` and `enrichNode` keep their universal validation against `link.schema.json` / `node.schema.json` regardless of this opt-in. See [`plugin-author-guide.md` §`outputSchema`](./plugin-author-guide.md#outputschema--opt-in-correctness-for-custom-storage-writes).
218
229
 
219
- Probabilistic extractors additionally receive `ctx.runner` (the `RunnerPort`) for LLM dispatch.
230
+ Extractors are deterministic-only; `ctx.runner` is NOT exposed on the Extractor context. LLM-driven enrichment of a node is an Action concern (queued as a job), not an Extractor concern.
220
231
 
221
232
  ### Extractor · enrichment layer
222
233
 
223
- `ctx.enrichNode(partial)` is the only writable surface the Extractor pipeline has on a node. The author's frontmatter on `scan_nodes.frontmatter_json` is read-only from any Extractor — that contract holds for both deterministic and probabilistic extractors. Implementations MUST:
234
+ `ctx.enrichNode(partial)` is the only writable surface the Extractor pipeline has on a node. The author's frontmatter on `scan_nodes.frontmatter_json` is read-only from any Extractor. Implementations MUST:
224
235
 
225
236
  - Persist enrichments into a per-`(node, extractor)` table (the reference impl uses [`node_enrichments`](./db-schema.md#node_enrichments)) so attribution survives across scans.
226
237
  - Preserve the author frontmatter byte-for-byte through every scan and refresh; the enrichment overlay is a SEPARATE store.
227
- - Track stale state for probabilistic rows: when the scan loop detects `body_hash_at_enrichment != node.body_hash` for a probabilistic enrichment, mark the row stale (NOT delete it the LLM cost is preserved). Deterministic enrichments do not need stale tracking they regenerate via the §Extractor · fine-grained scan cache contract.
238
+ - Regenerate enrichments through the §Extractor · fine-grained scan cache contract: an unchanged body hash + same registered Extractor reuses the prior row; a changed body re-runs `extract()` and overwrites the row via the PRIMARY KEY conflict. Extractors are deterministic, so a stale-flag is unnecessary re-running is free and reproducible.
239
+
240
+ > **Reserved columns** — `node_enrichments.is_probabilistic`, `body_hash_at_enrichment`, and `stale` are persisted but inert in this revision: every Extractor write sets `is_probabilistic = 0` and `stale = 0`, with `body_hash_at_enrichment` always equal to the current body hash. The columns are reserved for a future revision where Action-issued enrichments (queued probabilistic jobs writing back through the enrichment layer) will need stale tracking to preserve LLM cost across body changes. Until that revision lands, readers MAY assume `stale = 0` and the merge helper's `includeStale: true` flag is a no-op.
228
241
 
229
242
  Read-side merge (`mergeNodeWithEnrichments` in the reference impl):
230
243
 
@@ -232,13 +245,13 @@ Read-side merge (`mergeNodeWithEnrichments` in the reference impl):
232
245
  2. Sort by `enriched_at` ASC.
233
246
  3. Spread-merge each `value` over the author frontmatter (last-write-wins per field).
234
247
 
235
- Rules / `sm check` / `sm export` consume `node.frontmatter` directly (deterministic CI-safe baseline); enrichment consumption is opt-in by the caller. Stale visibility is also opt-in (`includeStale: true` in the merge helper) so the UI can render a "stale (last value: …)" marker without polluting the deterministic merge.
248
+ Analyzers / `sm check` / `sm export` consume `node.frontmatter` directly (deterministic CI-safe baseline); enrichment consumption is opt-in by the caller.
236
249
 
237
- Refresh verbs (`sm refresh <node>` and `sm refresh --stale`) re-run the Extractor pipeline against a node or the stale set and upsert fresh enrichment rows — see [`cli-contract.md` §Scan](./cli-contract.md#scan).
250
+ Refresh verbs (`sm refresh <node>` and `sm refresh --stale`) re-run the Extractor pipeline against a node or the stale set and upsert fresh enrichment rows — see [`cli-contract.md` §Scan](./cli-contract.md#scan). With Extractors deterministic-only, `--stale` is a no-op today (no rows are stale-flagged); it remains in the contract for the future Action-prob enrichment revision noted above.
238
251
 
239
252
  ### Extractor · `applicableKinds` filter
240
253
 
241
- Extractors MAY declare an optional `applicableKinds: string[]` on their manifest. When declared, the kernel filters fail-fast: `extract()` is invoked **only** for nodes whose `kind` appears in the list. The skip happens BEFORE the extractor context is built so a probabilistic extractor wastes zero LLM cost — and a deterministic extractor zero CPU on inapplicable nodes. Absent (`undefined`) is the default and means "applies to every kind"; there is no wildcard syntax. An empty array (`[]`) is invalid (`minItems: 1` in the schema). Unknown kinds (no installed Provider declares them in its `kinds` catalog) are non-blocking: the extractor keeps `loaded` status and `sm plugins doctor` surfaces an informational warning so the author sees typos and missing-Provider cases, but the doctor's exit code is NOT promoted by this warning. See [`plugin-author-guide.md` §Extractor `applicableKinds`](./plugin-author-guide.md#extractor-applicablekinds--narrow-the-pipeline) for the full author-side contract.
254
+ Extractors MAY declare an optional `applicableKinds: string[]` on their manifest. When declared, the kernel filters fail-fast: `extract()` is invoked **only** for nodes whose `kind` appears in the list. The skip happens BEFORE the extractor context is built so the extractor wastes zero CPU on inapplicable nodes. Absent (`undefined`) is the default and means "applies to every kind"; there is no wildcard syntax. An empty array (`[]`) is invalid (`minItems: 1` in the schema). Unknown kinds (no installed Provider declares them in its `kinds` catalog) are non-blocking: the extractor keeps `loaded` status and `sm plugins doctor` surfaces an informational warning so the author sees typos and missing-Provider cases, but the doctor's exit code is NOT promoted by this warning. See [`plugin-author-guide.md` §Extractor `applicableKinds`](./plugin-author-guide.md#extractor-applicablekinds--narrow-the-pipeline) for the full author-side contract.
242
255
 
243
256
  ### Extractor · fine-grained scan cache
244
257
 
@@ -249,16 +262,16 @@ The contract the cache MUST satisfy (engine-agnostic):
249
262
  - A node-level cache hit (body+frontmatter unchanged) is upgraded to a full skip ONLY when every currently-registered Extractor that applies to the node's kind has a recorded run against the prior body hash.
250
263
  - A new Extractor registered between scans MUST run on the cached node — its absence from the cache is the canonical signal. The rest of the cache (existing Extractors against the same body) is preserved.
251
264
  - An Extractor uninstalled between scans MUST have its cache rows removed and its sole-source links dropped. Links whose `sources` mix the uninstalled Extractor's short id with a still-cached Extractor's short id MUST be reshaped: the obsolete short id is stripped from the array and the link survives with the cached attribution intact. The persisted audit trail therefore never references a removed contributor.
252
- - The cache is transparent to plugin authors. An Extractor cannot opt out and cannot inspect the cache; its only obligation is to be deterministic for a given body input (probabilistic Extractors run as jobs, never in scan).
265
+ - The cache is transparent to plugin authors. An Extractor cannot opt out and cannot inspect the cache; its only obligation is to be deterministic for a given body input (this is structural: every Extractor is deterministic-only, by spec).
253
266
 
254
- This invariant is the difference between a free and a paid scan for the probabilistic Extractor model: re-running an LLM Extractor against an unchanged body would be both expensive and non-reproducible.
267
+ The invariant exists to keep `sm scan --changed` cheap on real corpora: re-parsing a body that has not changed for an Extractor that has not changed is wasted work; the cache turns it into a one-row reuse. The same machinery is what will let a future Action-prob enrichment revision (see §Extractor · enrichment layer) reuse paid LLM output across unchanged bodies.
255
268
 
256
269
  ### Extractor · trigger normalization
257
270
 
258
271
  Extractors that emit invocation-style links (slashes, at-directives, command names) populate the `link.trigger` block defined in [`schemas/link.schema.json`](./schemas/link.schema.json):
259
272
 
260
273
  - `originalTrigger` — the exact source text the extractor saw, byte-for-byte. Used only for display.
261
- - `normalizedTrigger` — the output of the pipeline below. Used for equality and collision detection — the built-in `trigger-collision` rule keys on this field.
274
+ - `normalizedTrigger` — the output of the pipeline below. Used for equality and collision detection — the built-in `trigger-collision` analyzer keys on this field.
262
275
 
263
276
  Both fields MUST be present whenever `link.trigger` is non-null. Implementations MUST produce byte-identical `normalizedTrigger` output for byte-identical input across platforms and locales.
264
277
 
@@ -290,17 +303,19 @@ Characters outside the separator set that are not letters or digits (e.g. `/`, `
290
303
 
291
304
  ### Hook · curated trigger set
292
305
 
293
- Hooks subscribe declaratively to a curated set of kernel lifecycle events and react to them. Reaction-only by design: a hook cannot mutate the pipeline, block emission, or alter outputs. The hookable trigger set is intentionally small — eight events out of the full [`job-events.md`](./job-events.md) catalog. Other events (per-node `scan.progress`, `model.delta`, `run.*`, `job.claimed`, `job.callback.received`) are deliberately NOT hookable: too verbose for a reactive surface, internal to the runner, or covered elsewhere. Declaring a trigger outside the curated set yields `invalid-manifest` at load time.
306
+ Hooks subscribe declaratively to a curated set of kernel lifecycle events and react to them. Reaction-only by design: a hook cannot mutate the pipeline, block emission, or alter outputs. The hookable trigger set is intentionally small — ten events out of the full [`job-events.md`](./job-events.md) catalog. Eight are pipeline-driven (emitted from inside `runScan`); two (`boot`, `shutdown`) are CLI-process-driven (emitted by the driving binary before / after the verb runs, fire-and-forget so `process.exit` is never blocked). Other events (per-node `scan.progress`, `model.delta`, `run.*`, `job.claimed`, `job.callback.received`) are deliberately NOT hookable: too verbose for a reactive surface, internal to the runner, or covered elsewhere. Declaring a trigger outside the curated set yields `invalid-manifest` at load time.
294
307
 
295
308
  | Trigger | When it fires | Payload (key fields) | Hook scope |
296
309
  |---|---|---|---|
310
+ | `boot` | Once per CLI process invocation, BEFORE the verb routes. The dispatcher AWAITS subscribed hooks so anything they print lands above the verb's output (the `core/update-check` banner relies on this); a slow hook therefore delays the first verb paint. The dispatcher catches every hook error so a buggy hook never prevents the verb from running — it can only delay it. Use sparingly. | `argv: string[]` (the routed argv slice the CLI is about to parse). | Boot-time output that must appear above the verb (the `core/update-check` banner), pre-flight checks, telemetry warm-up. |
297
311
  | `scan.started` | Once at the start of every `sm scan` invocation. | `roots: string[]`. | Pre-scan setup (cache warm-up, telemetry init). |
298
312
  | `scan.completed` | Once at the end of every `sm scan` invocation. | `stats: { filesWalked, nodesCount, linksCount, issuesCount, durationMs }`. | Post-scan reaction (Slack notification, CI gate, summary). |
299
313
  | `extractor.completed` | Once per registered Extractor, after the full walk completes. Aggregated, NOT per-node. | `extractorId: string` (qualified). | Per-Extractor metrics, audit. |
300
- | `rule.completed` | Once per Rule, after every issue has been validated. | `ruleId: string` (qualified). | Per-Rule alerting, downstream tooling. |
314
+ | `analyzer.completed` | Once per Analyzer, after every issue has been validated. | `analyzerId: string` (qualified). | Per-Analyzer alerting, downstream tooling. |
301
315
  | `action.completed` | Once per Action invocation, after the report has been recorded. | `actionId: string` (qualified), `node`, `jobResult`. | Per-Action notification, integration glue. |
302
316
  | `job.spawning` | Pre-spawn of a runner subprocess (job subsystem; Step 10). | `jobId`, `actionId`, spawn metadata. | Pre-flight checks, audit logging. |
303
317
  | `job.spawning`, `job.completed`, `job.failed` | The three job-lifecycle hookables; same payload shapes as the [`job-events.md`](./job-events.md) entries of the same name. | See [`job-events.md` §Event catalog](./job-events.md#event-catalog). | Most common Hook surface (notifications, retries, billing). |
318
+ | `shutdown` | Once per CLI process invocation, AFTER the verb returns its exit code and BEFORE `process.exit`. The dispatcher awaits subscribed hooks so they finish before the process terminates, but every hook MUST be fast (the user already saw the verb's output and is waiting for the prompt back). The dispatcher catches every hook error so a buggy hook never alters the verb's exit code; it can only delay the exit. | `exitCode: number` (the verb's resolved exit code, `0..5`). | Cleanup, post-run telemetry, the `core/update-check` banner. |
304
319
 
305
320
  A hook MAY narrow further with an optional declarative `filter` map: keys are payload field paths (top-level only in v0.x); values are the literal expected match. The dispatcher walks `event.data` for each declared key and short-circuits the invocation when any value disagrees. Examples:
306
321
 
@@ -317,7 +332,7 @@ A hook MAY narrow further with an optional declarative `filter` map: keys are pa
317
332
 
318
333
  Hooks introduce no new persisted state and do NOT participate in the deterministic scan cache (A.9). A scan that re-runs against an unchanged corpus dispatches `scan.started` / `scan.completed` exactly as before; subscribed hooks fire on every scan regardless of cache hit / miss. Hooks that need cache-aware behaviour MUST inspect their own state via `ctx.store` (declared in their plugin's manifest).
319
334
 
320
- ### Contract rules
335
+ ### Contract analyzers
321
336
 
322
337
  1. An extension declares its kind in its module export and its manifest. Kind mismatch → load-error.
323
338
  2. An extension MAY declare `preconditions` — predicates that must be satisfied for the extension to be offered (e.g., `action.requires: ["kind=skill"]`).
@@ -328,11 +343,11 @@ Hooks introduce no new persisted state and do NOT participate in the determinist
328
343
  ### Locality
329
344
 
330
345
  - **Drop-in**: extensions live inside plugins, discovered at boot from `.skill-map/plugins/<id>/` and `~/.skill-map/plugins/<id>/`.
331
- - **Built-in**: the reference impl bundles a default extension set (one Provider, four extractors, five rules, one formatter, zero hooks). The fifth rule, `core/validate-all`, replays every scanned node and link through the authoritative spec schemas via AJV — the kernel-side guard against persisting non-conforming graph rows. The Hook kind has no built-ins at this bump; the kind exists so plugins can subscribe (concrete built-in hooks land separately when demand surfaces). These are loaded from `src/extensions/` and are indistinguishable from plugin-supplied extensions from the kernel's point of view.
346
+ - **Built-in**: the reference impl bundles a default extension set (one Provider, four extractors, five analyzers, one formatter, one hook). The fifth analyzer, `core/validate-all`, replays every scanned node and link through the authoritative spec schemas via AJV — the kernel-side guard against persisting non-conforming graph rows. The first built-in Hook is `core/update-check`, which subscribes to `shutdown` and runs the once-per-day "update available" probe + banner that lived on the CLI entry path before the Hook kind had concrete consumers. These are loaded from `src/extensions/` and are indistinguishable from plugin-supplied extensions from the kernel's point of view.
332
347
 
333
348
  ---
334
349
 
335
- ## Dependency rules
350
+ ## Dependency analyzers
336
351
 
337
352
  The following imports are NORMATIVELY FORBIDDEN:
338
353
 
@@ -384,7 +399,7 @@ Alternative implementations MAY use workspaces, separate packages, or a compiled
384
399
 
385
400
  ---
386
401
 
387
- ## Driving-adapter peer rule
402
+ ## Driving-adapter peer analyzer
388
403
 
389
404
  The CLI, Server, and Skill driving adapters are **peers**. None depends on another.
390
405
 
@@ -394,31 +409,31 @@ The CLI, Server, and Skill driving adapters are **peers**. None depends on anoth
394
409
 
395
410
  All three consume the same kernel API. Any use case a driving adapter needs MUST be available as a kernel function — if it isn't, the gap is a kernel bug, not a driving-adapter workaround.
396
411
 
397
- This is what makes "CLI-first" a coherent rule: every CLI verb is a kernel function call. The UI does not reimplement business logic; it calls the same functions.
412
+ This is what makes "CLI-first" a coherent analyzer: every CLI verb is a kernel function call. The UI does not reimplement business logic; it calls the same functions.
398
413
 
399
414
  ---
400
415
 
401
416
  ## Annotation system
402
417
 
403
- Skill-map's own metadata layer (versioning, supersession, provenance, taxonomy, display, docs) lives in **co-located YAML sidecars** with extension `.sm`, in the same directory as the markdown node they annotate. Vendor files (`.claude/agents/foo.md`, `.cursor/rules/bar.mdc`, …) stay untouched; the sidecar (`foo.sm` / `bar.sm`) carries the annotations.
418
+ Skill-map's own metadata layer (versioning, supersession, provenance, taxonomy, docs) lives in **co-located YAML sidecars** with extension `.sm`, in the same directory as the markdown node they annotate. Vendor files (`.claude/agents/foo.md`, `.cursor/analyzers/bar.mdc`, …) stay untouched; the sidecar (`foo.sm` / `bar.sm`) IS skill-map's "annotations file" for that node — every key under it is, conceptually, an annotation. The YAML root organizes those annotations into structural blocks (identity, the curated annotations catalog, audit timestamps, settings, plugin namespaces); the file as a whole is the annotation surface.
404
419
 
405
420
  Two schemas describe the wire shape:
406
421
 
407
- - [`schemas/sidecar.schema.json`](./schemas/sidecar.schema.json) — root shape with reserved blocks `for` (identity link), `annotations` (the conventional catalog), `settings` (reserved), `audit` (write trail), plus opt-in `<plugin-id>:` namespacing.
408
- - [`schemas/annotations.schema.json`](./schemas/annotations.schema.json) — curated 14-field catalog: versioning + supersession (`version`, `stability`, `supersedes`, `supersededBy`, `requires`, `conflictsWith`, `related`), provenance (`authors`, `license`, `source`, `sourceVersion`), taxonomy (`tags`), display (`hidden`), docs (`docsUrl`). The activity timestamp lives in the reserved `audit:` block (`audit.lastBumpedAt`), not in `annotations:`. `additionalProperties: true` so plugins or users add custom keys without coordination; the built-in `unknown-field` rule warns on truly unrecognized keys (typo guard).
422
+ - [`schemas/sidecar.schema.json`](./schemas/sidecar.schema.json) — root shape with reserved blocks `identity` (anchor + drift hashes), `annotations` (the conventional catalog), `settings` (reserved), `audit` (write trail), plus opt-in `<plugin-id>:` namespacing.
423
+ - [`schemas/annotations.schema.json`](./schemas/annotations.schema.json) — curated 13-field catalog: versioning + supersession (`version`, `stability`, `supersedes`, `supersededBy`, `requires`, `conflictsWith`, `related`), provenance (`authors`, `license`, `source`, `sourceVersion`), taxonomy (`tags`), docs (`docsUrl`). The activity timestamp lives in the reserved `audit:` block (`audit.lastBumpedAt`), not in `annotations:`. `additionalProperties: true` so plugins or users add custom keys without coordination; the built-in `unknown-field` analyzer warns on truly unrecognized keys (typo guard).
409
424
 
410
425
  ### Identity and drift
411
426
 
412
- `for` carries `path` (scope-root-relative, matches the canonical Node identifier in [`schemas/node.schema.json`](./schemas/node.schema.json)) plus `bodyHash` and `frontmatterHash`. Both hashes are sha256 over the kernel's canonical form of the markdown body (post-frontmatter bytes) and frontmatter (YAML re-emitted via `js-yaml dump` with `sortKeys: true`, `lineWidth: -1`, `noRefs: true`, `noCompatMode: true`); each sidecar captures the values the kernel saw at the moment it was last written.
427
+ `identity` carries `path` (scope-root-relative, matches the canonical Node identifier in [`schemas/node.schema.json`](./schemas/node.schema.json)) plus `bodyHash` and `frontmatterHash`. Both hashes are sha256 over the kernel's canonical form of the markdown body (post-frontmatter bytes) and frontmatter (YAML re-emitted via `js-yaml dump` with `sortKeys: true`, `lineWidth: -1`, `noRefs: true`, `noCompatMode: true`); each sidecar captures the values the kernel saw at the moment it was last written.
413
428
 
414
- At scan time the kernel re-computes the live hashes and compares against the stored ones. Mismatch in either is **drift**, surfaced via the built-in `annotation-stale` rule (severity `warning`, never blocking — soft mode by design). A `.sm` whose `for.path` no longer points at an existing `.md` is **orphan**, surfaced via the built-in `annotation-orphan` rule (also `warning`). Drift state is **derived**, never stored — pure function over existing data, no flag to drift between flag and reality.
429
+ At scan time the kernel re-computes the live hashes and compares against the stored ones. Mismatch in either is **drift**, surfaced via the built-in `annotation-stale` analyzer (severity `warning`, never blocking — soft mode by design). A `.sm` whose `identity.path` no longer points at an existing `.md` is **orphan**, surfaced via the built-in `annotation-orphan` analyzer (also `warning`). Drift state is **derived**, never stored — pure function over existing data, no flag to drift between flag and reality.
415
430
 
416
431
  ### Bump model
417
432
 
418
433
  The deterministic built-in `core/bump` Action produces a sidecar patch:
419
434
 
420
435
  - Increments `annotations.version` by 1 (or sets to `1` if missing — single integer monotonic, orthogonal to `stability`; major bumps are not a concept, the convention for breaking changes is "create a new node, supersede the old").
421
- - Refreshes `for.bodyHash` and `for.frontmatterHash` to the live values.
436
+ - Refreshes `identity.bodyHash` and `identity.frontmatterHash` to the live values.
422
437
  - Stamps `audit.lastBumpedAt` (ISO 8601 datetime) and `audit.lastBumpedBy` (`'cli'`, `'ui'`, or `'plugin:<id>'`).
423
438
  - On first-time creation also stamps `audit.createdAt` and `audit.createdBy` (set once, stable thereafter).
424
439
 
@@ -436,7 +451,7 @@ The Action stays pure (no IO). The kernel materializes the patch through the `Si
436
451
  Plugins extend the annotation surface via the `annotationContributions` manifest field — a map of contributed key → `{ schema, ownership, location }`. Inline JSON Schema (no `$ref` to external files). Two location modes:
437
452
 
438
453
  - `location: 'namespaced'` (default) — writes go to the plugin's `<plugin-id>:` block at the sidecar root. Default `ownership: 'shared'`. Plugins write to their own namespace without coordination; AJV validates contributed keys against the plugin's declared schema.
439
- - `location: 'root'` — writes go to a top-level key of the sidecar (alongside `for` / `annotations` / `settings` / `audit`). Requires `ownership: 'exclusive'` (claiming a root key is elevated trust). Two plugins claiming the same root key with `exclusive` is a **hard fatal** at orchestrator startup — the kernel refuses to boot rather than route writes ambiguously.
454
+ - `location: 'root'` — writes go to a top-level key of the sidecar (alongside `identity` / `annotations` / `settings` / `audit`). Requires `ownership: 'exclusive'` (claiming a root key is elevated trust). Two plugins claiming the same root key with `exclusive` is a **hard fatal** at orchestrator startup — the kernel refuses to boot rather than route writes ambiguously.
440
455
 
441
456
  The kernel exposes a runtime catalog (`Kernel.getRegisteredAnnotationKeys()`) listing every plugin-contributed key with its `pluginId`, `location`, `ownership`, and `schema` — consumed by the BFF (`GET /api/annotations/registered`) for UI autocomplete.
442
457
 
@@ -449,6 +464,23 @@ Two columns on `scan_nodes` source from the sidecar's `annotations:` block when
449
464
 
450
465
  A `scan_nodes.annotations_json` column carries the full parsed `annotations:` block; `sidecar_present` and `sidecar_status` carry the drift-detection state. The full sidecar overlay (parsed `annotations`, `status`, `present`) is exposed on `Node.sidecar` so REST and UI consumers see it as part of the canonical wire shape.
451
466
 
467
+ ### Tags · dual-source
468
+
469
+ Skill-map's tag system is **dual-source** by design:
470
+
471
+ - **Author tags** live in `frontmatter.tags` (in the `.md`). Universal optional field declared on [`schemas/frontmatter/base.schema.json`](./schemas/frontmatter/base.schema.json) so every Provider's per-kind schema accepts it without each having to redeclare. These represent intrinsic categories the file's author wrote into the frontmatter (vendor-supplied or your own writing).
472
+ - **User tags** live in `sidecar.annotations.tags` (in the `.sm`). Curated annotation field declared on [`schemas/annotations.schema.json`](./schemas/annotations.schema.json). These represent the post-hoc tags whoever curates the project assigned to the node from their sidecar.
473
+
474
+ The two surfaces are **not aliases**. They capture different intent layers and both are first-class:
475
+
476
+ - Search and listings (`sm list --tag <name>`, UI faceted search) match the **union**: a hit on either source returns the node.
477
+ - The optional `--tag-source author|user` flag filters one source.
478
+ - The UI distinguishes them visually so the attribution stays explicit (different chip style; author chips render first, user chips after).
479
+
480
+ Persistence layer projects rows into a normalized [`scan_node_tags`](./db-schema.md#scan_node_tags) table at write time — one row per `(node_path, tag, source)` triple — so SQL queries can index on `(tag)` for `O(log n)` lookup. Replace-all per scan keeps the table in sync with the live frontmatter + sidecar state; deleting a tag from either source removes its row on the next scan.
481
+
482
+ The wire shape (`/api/nodes` and `/api/nodes/:pathB64`) projects `node.tags = { byAuthor: string[], byUser: string[] }` so consumers see the split with attribution. The kernel `Node` interface (TypeScript) does NOT carry `tags` — consumers that walk the canonical sources read `node.frontmatter.tags` and `node.sidecar.annotations.tags` directly (consistent with the post-decision-#2 posture of "no Node-level denormalisations").
483
+
452
484
  ### Stability
453
485
 
454
486
  The **layout decision** (co-located `.sm`, not mirror tree under `.skill-map/`) is stable as of spec v1.0.0. Moving the home is a major bump.
@@ -457,7 +489,7 @@ The **format** (YAML, extension `.sm`, not `.md.sm`) is stable as of spec v1.0.0
457
489
 
458
490
  The **reserved block names** (`for`, `annotations`, `settings`, `audit`) are stable as of spec v1.0.0. Adding a new reserved block is a minor bump; renaming or removing one is a major bump.
459
491
 
460
- The **identity contract** (`for.path` + `for.bodyHash` + `for.frontmatterHash`, with `resolvedAs` optional) is stable as of spec v1.0.0. Changing the hash algorithm or canonicalization rule is a major bump.
492
+ The **identity contract** (`identity.path` + `identity.bodyHash` + `identity.frontmatterHash`, with `resolvedAs` optional) is stable as of spec v1.0.0. Changing the hash algorithm or canonicalization analyzer is a major bump.
461
493
 
462
494
  The **bump field set** (the four `audit` fields `lastBumpedAt` / `lastBumpedBy` / `createdAt` / `createdBy`) is stable as of spec v1.0.0. Adding new audit fields is a minor bump; removing or renaming is a major bump. The audit block is `additionalProperties: true` so plugins or future Actions MAY ride additional keys opaquely.
463
495
 
@@ -467,6 +499,177 @@ The **`null`-as-delete sentinel** in `SidecarStore.applyPatch` is an internal co
467
499
 
468
500
  ---
469
501
 
502
+ ## View contribution system
503
+
504
+ Sibling system to the annotation contributions above. Both let plugins extend the surface the kernel exposes; the difference is **where the data lives and what it drives**.
505
+
506
+ | | Annotation contributions | View contributions |
507
+ |---|---|---|
508
+ | **Data lives in** | the user-facing sidecar `.sm` file | the kernel-managed `scan_contributions` table |
509
+ | **Author intent** | extend the metadata catalog | surface per-node data in the UI |
510
+ | **Plugin author writes** | inline JSON Schema for the value | `slot` name from a closed catalog |
511
+ | **Validation** | AJV at sidecar-write time | AJV at `ctx.emitContribution(...)` time |
512
+ | **Lifecycle** | persists across scans (file-on-disk) | re-emitted on every scan (table cleared per node) |
513
+ | **Surfaces in** | sidecar consumers + `<sm-plugin-contributions>` panel | fixed renderer per slot, mounted at exactly the slot the author declared |
514
+
515
+ Two schemas describe the wire shape:
516
+
517
+ - [`schemas/view-slots.schema.json`](./schemas/view-slots.schema.json) — closed catalog: 15 slot names + the `IViewContribution` manifest declaration shape + per-slot payload schemas (in `$defs/payloads`) the kernel uses to validate emit-time payloads.
518
+ - [`schemas/input-types.schema.json`](./schemas/input-types.schema.json) — closed catalog: 10 input-type names + the `ISettingDeclaration` manifest declaration shape (discriminated by `type`).
519
+
520
+ ### Identity
521
+
522
+ Each view contribution is identified by the qualified id `<pluginId>/<extensionId>/<contributionId>`. The plugin author declares contributions in the extension manifest under `viewContributions: Record<string, IViewContribution>`; the loader composes the qualified id from the plugin id, the extension id, and the Record key.
523
+
524
+ ### Manifest
525
+
526
+ Each entry picks a `slot` name from the closed catalog and supplies presentation tuning. The slot fixes both the renderer and the payload shape — there is no separate "contract" abstraction:
527
+
528
+ ```jsonc
529
+ {
530
+ "viewContributions": {
531
+ "breakdown": {
532
+ "slot": "inspector.body.panel.breakdown",
533
+ "label": "Keyword hits",
534
+ "emptyText": "No matches."
535
+ },
536
+ "total": {
537
+ "slot": "card.footer.left.counter",
538
+ "icon": "🔍",
539
+ "label": "kw",
540
+ "emitWhenEmpty": false
541
+ }
542
+ }
543
+ }
544
+ ```
545
+
546
+ The plugin author picks ONE slot per contribution; that single decision determines where the data renders, what payload shape `ctx.emitContribution(...)` must produce, and which Angular component draws it. Six manifest fields per contribution + the slot catalog page is the entire mental model. See [`plugin-author-guide.md`](./plugin-author-guide.md) §View contributions for worked examples.
547
+
548
+ ### Settings
549
+
550
+ Plugin user-configurable settings live at the manifest root in `settings: Record<string, ISettingDeclaration>` (see [`schemas/plugins-registry.schema.json`](./schemas/plugins-registry.schema.json)). Each setting picks an input-type from the closed catalog at [`schemas/input-types.schema.json`](./schemas/input-types.schema.json) (`string-list`, `single-string`, `boolean-flag`, `integer`, `enum-pick`, `enum-multipick`, `path-glob`, `regex`, `secret`, `key-value-list`). The kernel exposes resolved settings to extractors via `ctx.settings.<settingId>`; the UI generates a form per declaration; the CLI's `sm plugins config <id>` exposes the same surface.
551
+
552
+ Settings are read once at extractor invocation; changing a setting requires `sm scan` to re-emit affected contributions. The UI surfaces a "settings changed, rescan needed" indicator when the manifest detects mismatch; live re-emission is explicitly out of scope (rescan-required is a stability decision per `ROADMAP.md` §UI contribution system D4).
553
+
554
+ ### Runtime catalog
555
+
556
+ The kernel exposes a runtime catalog (`Kernel.getRegisteredViewContributions()`) listing every plugin-contributed view contribution with its `pluginId`, `extensionId`, `contributionId`, `slot`, and the manifest-declared `label` / `tooltip` / `icon` / `emptyText` / `emitWhenEmpty`. The catalog is built once at boot from every loaded extension's `viewContributions` map, AJV-validated, and frozen — same lifecycle as `getRegisteredAnnotationKeys()`.
557
+
558
+ Analyzers see the catalog through `IAnalyzerContext.viewContributions` so cross-cutting checks (`core/unknown-slot`, `core/contribution-orphan`) can reason about emissions.
559
+
560
+ ### Emit path
561
+
562
+ Extensions emit per-node payloads via context callbacks:
563
+
564
+ ```ts
565
+ // Extractors (per-node walk)
566
+ ctx.emitContribution(contributionId, payload);
567
+
568
+ // Analyzers (post-merge graph) — same payload contract, explicit nodePath
569
+ // because the analyzer sees every node at once
570
+ ctx.emitContribution(nodePath, contributionId, payload);
571
+ ```
572
+
573
+ Parallel to `ctx.emitLink(link)`. The kernel buffers the emission, validates the payload against the slot's payload schema in `$defs/payloads/<slot>` (AJV-compiled at boot), and persists the row to `scan_contributions` during `persistScanResult`. Off-shape payloads emit an `extension.error` event and drop silently — same posture as `emitLink` rejecting off-`emitsLinkKinds` links. Both Extractor and Analyzer emissions land in the same `scan_contributions` rows; the row's `extension_id` records which kind of extension produced it.
574
+
575
+ The Extractor-emit signature binds `nodePath` implicitly (the extractor runs per-node, with `ctx.node.path` available as the only sensible target). The Analyzer-emit signature requires the analyzer to declare the target node explicitly because Analyzers see the full graph at once and may emit for any subset of nodes — the canonical use case is a analyzer that derives per-node values from cross-graph aggregations (`core/link-counts` projects `linksOutCount` / `linksInCount` this way).
576
+
577
+ Analyzers MAY also emit scope-level contributions via `IAnalyzerContext.emitScopeContribution(contributionId, payload)` (only slots whose schema permits scope-level emission, today only `topbar.actions.indicator`). That signature is reserved in the spec; the runtime callback lands when the first scope-level adopter arrives.
578
+
579
+ ### Persistence
580
+
581
+ A new table `scan_contributions` (see [`db-schema.md`](./db-schema.md) §scan_contributions when shipped) carries per-node emissions:
582
+
583
+ | Column | Type | Notes |
584
+ |---|---|---|
585
+ | `plugin_id` | TEXT | qualified plugin id |
586
+ | `extension_id` | TEXT | extension id within the plugin |
587
+ | `node_path` | TEXT | scope-relative path |
588
+ | `contribution_id` | TEXT | manifest Record key |
589
+ | `slot` | TEXT | denormalized slot name (`view-slots.schema.json#/$defs/SlotName`) |
590
+ | `payload_json` | TEXT | JSON-serialized payload (already validated against the slot's payload schema) |
591
+ | `emitted_at` | INTEGER | unix epoch ms |
592
+
593
+ PK `(plugin_id, extension_id, node_path, contribution_id)` so re-emission upserts. Index on `node_path` (inspector lazy-fetch + orphan sweep) and on `plugin_id` (catalog sweep + `purgeByPlugin`).
594
+
595
+ **NOT pure replace-all** (the way `scan_links` / `scan_issues` are). The watcher's cached pass leaves the contributions buffer empty for cached nodes — the orchestrator skips `extract()` when the per-(node, extractor) cache hits, so no `emitContribution` fires. A naive wipe-all would silently drop the prior valid rows on every watcher boot. The persist runs four passes inside the same transaction:
596
+
597
+ 1. **Orphan sweep** — drops every row whose `node_path` is NOT in the current live node set. Disappeared nodes lose their contributions.
598
+ 2. **Catalog sweep** — drops every row whose qualified id `(pluginId, extensionId, contributionId)` is NOT in the registered runtime catalog (uninstalled plugins, disabled bundles, removed contributions).
599
+ 3. **Per-tuple sweep** — for every `(pluginId, extensionId, nodePath)` tuple where the extension actually RAN against that node in this scan (extractor cache miss, OR analyzer — analyzers always run), drop any row carrying that triple whose `contribution_id` is NOT present in the buffer for that triple. This catches the "extractor used to emit, now does not" case (e.g. a node body change that removes the trigger). Cached-extractor tuples are NOT in the set, so their rows survive untouched.
600
+ 4. **Upsert** — `INSERT ... ON CONFLICT DO UPDATE SET payload_json = excluded.payload_json, slot = excluded.slot` for every row in the buffer. PK conflict refreshes payload + `slot` + `emitted_at`.
601
+
602
+ Cached nodes' rows survive untouched (still in the live set, still in the catalog, the (plugin, extension, node) tuple is not in the freshly-run set, no buffer hit). The next time the body changes, the orchestrator re-runs the extractor, the tuple lands in the freshly-run set, and either the upsert refreshes the row OR the per-tuple sweep drops it (when the extractor no longer emits for that node).
603
+
604
+ Empty buffer + non-empty live set = the cached-pass case (no-op). Empty buffer + empty live set = legacy fallback to wipe-all (cold start). Three `IPersistOptions` fields control which sweeps activate — absent values fall back to legacy behaviour (sweep skipped) so older callers keep working:
605
+
606
+ - `livePaths?: ReadonlySet<string>` — gates the orphan sweep (1).
607
+ - `registeredContributionKeys?: ReadonlySet<string>` — gates the catalog sweep (2). Element format: qualified id `<pluginId>/<extensionId>/<contributionId>`.
608
+ - `freshlyRunTuples?: ReadonlySet<string>` — gates the per-tuple sweep (3). Element format: `<pluginId>/<extensionId>/<nodePath>` (no contribution-id segment — the sweep operates at the (plugin, extension, node) level and inspects the buffer to decide which contribution-ids survive).
609
+
610
+ Cold-start posture: the BFF endpoints below return empty arrays when the table is missing (mirror of the `tryWithSqlite` graceful-null pattern used by `routes/nodes.ts`); never a 500.
611
+
612
+ ### BFF surface
613
+
614
+ Endpoints under `/api/contributions/*`:
615
+
616
+ - `GET /api/contributions/registered` — runtime catalog. Mirror of `/api/annotations/registered`. Envelope variant `kind: 'contributions.registered'` (see [`schemas/api/rest-envelope.schema.json`](./schemas/api/rest-envelope.schema.json)).
617
+ - `GET /api/contributions/:pluginId/:extensionId/:contributionId?path=...` — lazy per-node fetch for inspector slots. **Three URL segments** mirror the qualified id `<pluginId>/<extensionId>/<contributionId>`. Filters by qualified id + node path; the BFF enforces `pluginId` ↔ namespace at the route level — no cross-plugin reads via this endpoint.
618
+
619
+ Plus catalog embedding into every payload-bearing envelope:
620
+
621
+ - `kindRegistry` and `contributionsRegistry` are siblings on the envelope (see schema). Built once per server boot, embedded into list (`nodes` / `links` / `issues` / `plugins`), single (`node`), and value (`config`) envelopes. Sentinel envelopes (`health` / `scan` / `graph`) and action-result envelopes (`sidecar.bumped`) and the catalog envelopes themselves (`annotations.registered` / `contributions.registered`) carry neither.
622
+
623
+ Plus per-node embedding on node responses:
624
+
625
+ - `GET /api/nodes/:pathB64` — single-node `item.contributions[]` carries every emission for that node, regardless of `bff.maxBulkContributions`.
626
+ - `GET /api/nodes` (bulk list) — `items[].contributions[]` carries emissions for the page slice **only when** `limit ≤ bff.maxBulkContributions` (default and hard upper bound 200). When the page exceeds the cap, `items[].contributions` is omitted and `meta.contributionsOmitted: true` is set so the UI can lazy-fetch per node. The cap is documented but not promoted; tuning above 200 is unsupported.
627
+ - `GET /api/scan` — the SPA's `CollectionLoaderService` hydrates from this endpoint on F5 / cold boot (single-fetch ScanResult); it MUST embed `contributions[]` per node alongside the standard fields, otherwise the inspector / card slot hosts have nothing to render until the next per-node fetch. Decoration is a single bulk `port.contributions.listForPaths(...)` round-trip after `scans.load()` — sibling of the per-node `isFavorite` decoration on the same route.
628
+
629
+ ### Isolation
630
+
631
+ View contributions extend the existing plugin-isolation model (see [`plugin-kv-api.md`](./plugin-kv-api.md) §Honest note on isolation) with six analyzers specific to UI rendering:
632
+
633
+ 1. **No raw DOM from plugin** — contributions are typed data only; the UI renders them via a closed catalog of Angular components mapped from slot id.
634
+ 2. **CSS scoping by Angular view encapsulation** — plugin does not write CSS; per-plugin tinting is sourced from a kernel-managed palette derived from `pluginId`.
635
+ 3. **Data path namespaced and BFF-enforced** — `GET /api/contributions/:pluginId/:extensionId/:contributionId?path=...` rejects cross-plugin reads at the route level (the qualified id triple is the URL shape).
636
+ 4. **Click actions are typed kernel verb dispatches** — a button rendered from a contribution invokes a kernel verb by qualified id; no arbitrary URLs / effects.
637
+ 5. **AJV at three layers** — manifest at load (rejects unknown `slot` names with `invalid-manifest`), payload at emit (rejects off-shape payloads with `extension.error`), envelope at BFF response.
638
+ 6. **Renderer attr-sanitization** — the UI's renderer components MUST NOT bind contribution data to `[innerHTML]`, `[style]`, `[src]`, `[href]`, or any DomSanitizer DANGEROUS_ATTR. Lint-enforced in the UI workspace; documented in [`context/view-slots.md`](../context/view-slots.md).
639
+
640
+ Same honest-note posture as [`plugin-kv-api.md`](./plugin-kv-api.md): isolated against accidents, not hostile code, until worker-thread / iframe sandbox post-v1.0.
641
+
642
+ ### Soft-warning analyzers
643
+
644
+ Two built-ins ship with the system to cover catalog evolution and rename edge cases:
645
+
646
+ - **`core/unknown-slot`** — walks every loaded plugin's `viewContributions[*].slot`; emits an `Issue` of severity `warn` for any slot not in the current kernel catalog. Parallel to `core/unknown-field` for annotations. Note: AJV at manifest load already rejects unknown slots as `invalid-manifest`; this analyzer covers the soft-warning path when a plugin remains loaded across a catalog version bump.
647
+ - **`core/contribution-orphan`** — joins `scan_contributions` against the live `scan_nodes` set; emits an `Issue` of severity `warn` for emissions whose `node_path` no longer exists (post-rename heuristic miss).
648
+
649
+ ### Catalog versioning
650
+
651
+ The catalog of slots and input-types evolves on its own cadence, independent of the spec version. Plugin manifests carry an optional `catalogCompat: string` (semver range) field at the root, parallel to `specCompat`. The kernel checks `semver.satisfies(catalogVersion, plugin.catalogCompat)` at load. Mismatch surfaces as `incompatible-catalog` plugin status (new entry in the load-status enum). Resolution: `sm plugins upgrade <id>` runs registered migrations from a closed kernel-side registry of `{ from, to, transform }` triples; auto-migration impossible → CLI exit ≠ 0 + UI dialog naming the offending slot / input-type.
652
+
653
+ Pre-1.0 versioning analyzer (per [`AGENTS.md`](../AGENTS.md)): catalog breaking changes ship as minor bumps while in `0.y.z`; the first `1.0.0` is a deliberate stabilization moment, not a side-effect.
654
+
655
+ ### Stability
656
+
657
+ The **closed catalog of view slots** is stable as of the v1 of this system: adding a new slot is a minor bump; renaming or removing one is a catalog-major bump and triggers `sm plugins upgrade` migration of every dependent plugin.
658
+
659
+ The **`IViewContribution` manifest shape** (six fields: `slot`, `label?`, `tooltip?`, `icon?`, `emptyText?`, `emitWhenEmpty?`) is stable. Adding a new optional field is a minor bump; making a field required or removing one is a catalog-major bump.
660
+
661
+ The **closed catalog of input-types** is stable on the same model: adding minor, renaming/removing major.
662
+
663
+ The **`ctx.emitContribution(id, payload)` signature** is stable. Adding new context callbacks (e.g. `ctx.emitScopeContribution`) is additive and minor.
664
+
665
+ The **persistence shape** (`scan_contributions` columns) is stable; column additions are minor bumps. Renames or removals trigger a kernel migration.
666
+
667
+ The **slot catalog ownership** is now spec-level (kernel + spec own the catalog jointly); the UI implementation may rearrange visual placement WITHOUT renaming a slot — the slot id is the public handle, the visual surface beneath it can evolve. Different driving adapters (UI, future TUI, `sm show --json`) MUST honour the same slot vocabulary; surface-level rendering policy stays adapter-specific (e.g. a TUI may render `card.title.right` as a prefix glyph instead of a right-side marker).
668
+
669
+ The **isolation honest-note** (accidents, not hostile code) is the same posture as [`plugin-kv-api.md`](./plugin-kv-api.md) and migrates together when worker-thread / iframe sandbox lands post-v1.0.
670
+
671
+ ---
672
+
470
673
  ## See also
471
674
 
472
675
  - [`cli-contract.md`](./cli-contract.md) — verb surface of the CLI driving adapter.
@@ -484,12 +687,12 @@ The **`null`-as-delete sentinel** in `SidecarStore.applyPatch` is an internal co
484
687
 
485
688
  The **port list** is stable as of spec v1.0.0. Adding a sixth port is a major bump.
486
689
 
487
- The **extension kind list** (6 kinds: Provider, Extractor, Rule, Action, Formatter, Hook) is stable as of spec v1.0.0. Adding a seventh kind is a major bump. Removing or renaming a kind is a major bump.
690
+ The **extension kind list** (6 kinds: Provider, Extractor, Analyzer, Action, Formatter, Hook) is stable as of spec v1.0.0. Adding a seventh kind is a major bump. Removing or renaming a kind is a major bump.
488
691
 
489
- The **Hook curated trigger set** (eight events: `scan.started`, `scan.completed`, `extractor.completed`, `rule.completed`, `action.completed`, `job.spawning`, `job.completed`, `job.failed`) is stable as of spec v1.0.0. Adding a ninth trigger is a minor bump; removing or renaming any of the eight is a major bump.
692
+ The **Hook curated trigger set** (ten events: `boot`, `scan.started`, `scan.completed`, `extractor.completed`, `analyzer.completed`, `action.completed`, `job.spawning`, `job.completed`, `job.failed`, `shutdown`) is stable as of spec v1.0.0. Adding an eleventh trigger is a minor bump; removing or renaming any of the ten is a major bump.
490
693
 
491
- The **execution modes** (`deterministic` / `probabilistic`) and the per-kind mode capability matrix above are stable as of spec v1.0.0. Adding a third mode or changing which kinds are dual-mode is a major bump. Renaming or repurposing the mode enum values is a major bump.
694
+ The **execution modes** (`deterministic` / `probabilistic`) and the per-kind mode capability matrix above are stable as of spec v1.0.0. Adding a third mode is a major bump. Renaming or repurposing the mode enum values is a major bump. Pre-1.0, narrowing a kind from dual-mode to single-mode is permitted as a minor bump (Extractor went from `deterministic / probabilistic` to `deterministic-only` in 0.X.0); post-1.0 the same change would be major.
492
695
 
493
- The **dependency rules** above are stable as of spec v1.0.0. Relaxing any is a major bump; tightening (forbidding an allowed import) is a minor bump.
696
+ The **dependency analyzers** above are stable as of spec v1.0.0. Relaxing any is a major bump; tightening (forbidding an allowed import) is a minor bump.
494
697
 
495
698
  The **Extractor · trigger normalization** pipeline (six steps, in order) is stable from the next spec release. Adding a new step at the end is a minor bump; reordering, removing, or changing any existing step (including the character classes in step 4) is a major bump. Implementations that produce different `normalizedTrigger` output for equivalent input are non-conforming.