@skill-map/spec 0.53.0 → 0.55.0
- package/CHANGELOG.md +32 -0
- package/README.md +12 -10
- package/architecture.md +154 -150
- package/cli-contract.md +139 -143
- package/conformance/README.md +9 -9
- package/conformance/coverage.md +5 -5
- package/db-schema.md +72 -72
- package/index.json +19 -18
- package/interfaces/security-scanner.md +25 -25
- package/job-events.md +43 -43
- package/job-lifecycle.md +32 -36
- package/package.json +2 -1
- package/plugin-author-guide.md +97 -125
- package/plugin-kv-api.md +22 -23
- package/plugin-quickstart.md +96 -0
- package/prompt-preamble.md +6 -6
- package/schemas/extensions/action.schema.json +6 -0
- package/schemas/project-config.schema.json +4 -0
- package/telemetry.md +120 -136
- package/versioning.md +12 -12
package/db-schema.md
CHANGED
|
@@ -12,13 +12,13 @@ The spec assumes a relational, SQL-like store but is **engine-agnostic**. The re
|
|
|
12
12
|
|
|
13
13
|
## Scope and location
|
|
14
14
|
|
|
15
|
-
One scope. Skill-map operates on the project scope only (`<cwd>/.skill-map/`).
|
|
15
|
+
One scope. Skill-map operates on the project scope only (`<cwd>/.skill-map/`). No global / user-level DB; the CLI never reads `$HOME` by default (see `cli-contract.md` §Scope is always project-local). To extend the scan beyond the current repository the user adds explicit paths via `scan.extraFolders` in the project-local config; the scan walks those against the same project DB.
|
|
16
16
|
|
|
17
17
|
| Scope | Default DB location | Scan roots |
|
|
18
18
|
|---|---|---|
|
|
19
19
|
| `project` | `<cwd>/.skill-map/skill-map.db` | The current repository, plus any paths the user added to `scan.extraFolders`. |
|
|
20
20
|
|
|
21
|
-
The project DB is gitignored by default (`sm init` adds the entry). Teams MAY
|
|
21
|
+
The project DB is gitignored by default (`sm init` adds the entry). Teams MAY share it by removing that `.gitignore` entry; the file is then committed and the execution log becomes a team artifact.
|
|
22
22
|
|
|
23
23
|
The `--db <path>` CLI flag overrides the DB location as an escape hatch (debugging, custom layouts).
|
|
24
24
|
|
|
@@ -26,7 +26,7 @@ The `--db <path>` CLI flag overrides the DB location as an escape hatch (debuggi
|
|
|
26
26
|
|
|
27
27
|
## Zones
|
|
28
28
|
|
|
29
|
-
Every kernel table belongs to exactly one zone, identified by a mandatory
|
|
29
|
+
Every kernel table belongs to exactly one zone, identified by a mandatory prefix.
|
|
30
30
|
|
|
31
31
|
| Zone | Prefix | Nature | Regenerable | Backed up | Example |
|
|
32
32
|
|---|---|---|---|---|---|
|
|
@@ -34,9 +34,9 @@ Every kernel table belongs to exactly one zone, identified by a mandatory name p
|
|
|
34
34
|
| State | `state_` | Persistent operational data: jobs, executions, summaries, enrichment, plugin KV. | No | Yes | `state_jobs` |
|
|
35
35
|
| Config | `config_` | User-owned configuration: plugin enable/disable, preferences, migration ledger. | No | Yes | `config_plugins` |
|
|
36
36
|
|
|
37
|
-
`sm db reset` drops `scan_*` only (non-destructive, equivalent to forcing the next scan from a clean slate). `sm db reset --state` also drops `state_*` (destructive to operational history). `sm db reset --hard` deletes the DB file entirely. `sm db backup` preserves `state_*` + `config_*`; `scan_*` is
|
|
37
|
+
`sm db reset` drops `scan_*` only (non-destructive, equivalent to forcing the next scan from a clean slate). `sm db reset --state` also drops `state_*` (destructive to operational history). `sm db reset --hard` deletes the DB file entirely. `sm db backup` preserves `state_*` + `config_*`; `scan_*` is regenerated on demand and never included in backups.
|
|
38
38
|
|
|
39
|
-
**Active-provider lens change**: switching the `activeProvider` setting (see [`cli-contract.md` §Active provider lens](./cli-contract.md#active-provider-lens) and [`architecture.md` §Active Provider Lens](./architecture.md#active-provider-lens)) drops the `scan_*` zone atomically and triggers a fresh scan under the new lens.
|
|
39
|
+
**Active-provider lens change**: switching the `activeProvider` setting (see [`cli-contract.md` §Active provider lens](./cli-contract.md#active-provider-lens) and [`architecture.md` §Active Provider Lens](./architecture.md#active-provider-lens)) drops the `scan_*` zone atomically and triggers a fresh scan under the new lens. Same effect as `sm db reset` then `sm scan`, but one transaction so the user never sees an empty graph between the two. `state_*` and `config_*` are preserved; `config_plugins` and `config_preferences` rows survive (including the new `activeProvider` value itself).
|
|
40
40
|
|
|
41
41
|
---
|
|
42
42
|
|
|
@@ -73,7 +73,7 @@ One row per detected node, matching [`schemas/node.schema.json`](./schemas/node.
|
|
|
73
73
|
| Column | Type | Constraint | Notes |
|
|
74
74
|
|---|---|---|---|
|
|
75
75
|
| `path` | TEXT | PRIMARY KEY | Relative path from scope root. Canonical node identifier. |
|
|
76
|
-
| `kind` | TEXT | NOT NULL | Open-by-design (`node.schema.json#/properties/kind`):
|
|
76
|
+
| `kind` | TEXT | NOT NULL | Open-by-design (`node.schema.json#/properties/kind`): whatever the classifying Provider declares. Built-in catalogs: `claude` ships `skill` / `agent` / `command` / `mcp`; `openai` ships `agent`; `agent-skills` ships `skill`; `core/markdown` ships the format-named generic fallback `markdown` (universal, picks up any `.md` no vendor Provider claims, see `architecture.md` §Provider · dispatch order). The metadata-only `antigravity` Provider ships no kinds; Antigravity skills route through `agent-skills`. External Providers MAY emit their own. |
|
|
77
77
|
| `provider` | TEXT | NOT NULL | Provider extension id. |
|
|
78
78
|
| `title` | TEXT | NULL | |
|
|
79
79
|
| `description` | TEXT | NULL | |
|
|
@@ -92,7 +92,7 @@ One row per detected node, matching [`schemas/node.schema.json`](./schemas/node.
|
|
|
92
92
|
| `links_in_count` | INTEGER | NOT NULL DEFAULT 0 | |
|
|
93
93
|
| `external_refs_count` | INTEGER | NOT NULL DEFAULT 0 | |
|
|
94
94
|
| `scanned_at` | INTEGER | NOT NULL | Unix ms. |
|
|
95
|
-
| `modified_at_ms` | INTEGER | NULL | File `mtime` in Unix ms, captured at scan time from `lstat`. NULL for virtual / derived nodes (no backing file). Drives the UI "last modified" sortable column; never
|
|
95
|
+
| `modified_at_ms` | INTEGER | NULL | File `mtime` in Unix ms, captured at scan time from `lstat`. NULL for virtual / derived nodes (no backing file). Drives the UI "last modified" sortable column; never hashed. |
|
|
96
96
|
|
|
97
97
|
Indexes: `ix_scan_nodes_kind`, `ix_scan_nodes_provider`, `ix_scan_nodes_body_hash` (rename heuristic).
|
|
98
98
|
|
|
@@ -106,7 +106,7 @@ One row per detected link, matching [`schemas/link.schema.json`](./schemas/link.
|
|
|
106
106
|
| `source_path` | TEXT | NOT NULL | FK semantically; MAY be unenforced for performance. |
|
|
107
107
|
| `target_path` | TEXT | NOT NULL | MAY point to a missing node (broken ref). |
|
|
108
108
|
| `kind` | TEXT | NOT NULL, CHECK in (`invokes`, `references`, `mentions`, `points`) | |
|
|
109
|
-
| `confidence` | REAL | NOT NULL, CHECK `>= 0.0 AND <= 1.0` | Numeric `[0,1]` (`link.schema.json#/properties/confidence`). The kernel's 1.0 baseline
|
|
109
|
+
| `confidence` | REAL | NOT NULL, CHECK `>= 0.0 AND <= 1.0` | Numeric `[0,1]` (`link.schema.json#/properties/confidence`). The kernel's 1.0 baseline folded with every `score`-phase `ctx.adjustConfidence` op (the built-in detectors `core/name-reserved`, `core/reference-broken`, plus any third-party scorer); per-op attribution lives in `scan_link_scores`. Migrated from the legacy `high`/`medium`/`low` TEXT enum. |
|
|
110
110
|
| `sources_json` | TEXT | NOT NULL | JSON array of extractor ids. |
|
|
111
111
|
| `original_trigger` | TEXT | NULL | |
|
|
112
112
|
| `normalized_trigger` | TEXT | NULL | |
|
|
@@ -137,7 +137,7 @@ Indexes: `ix_scan_issues_analyzer_id`, `ix_scan_issues_severity`.
|
|
|
137
137
|
|
|
138
138
|
### `scan_meta`
|
|
139
139
|
|
|
140
|
-
Single-row table holding the
|
|
140
|
+
Single-row table holding the last persisted scan's metadata. Lets `loadScanResult` return the real `roots` / `scannedAt` / `scannedBy` / `providers` / `stats.filesWalked|filesSkipped|durationMs` instead of synthesising them. Replaced atomically with the rest of `scan_*` on every `sm scan`.
|
|
141
141
|
|
|
142
142
|
`nodesCount` / `linksCount` / `issuesCount` are not stored here, they derive from `COUNT(*)` of the sibling tables.
|
|
143
143
|
|
|
@@ -153,55 +153,55 @@ Single-row table holding the metadata of the last persisted scan. Lets `loadScan
|
|
|
153
153
|
| `stats_files_walked` | INTEGER | NOT NULL |
|
|
154
154
|
| `stats_files_skipped` | INTEGER | NOT NULL |
|
|
155
155
|
| `stats_duration_ms` | INTEGER | NOT NULL |
|
|
156
|
-
| `tokenizer` | TEXT | NULL | Resolved offline encoder that produced this scan's per-node token counts (closed enum `cl100k_base` / `o200k_base`, see `project-config.md` / `project-config.schema.json` §tokenizer). Carried on the `ScanResult.tokenizer` wire field. NULL on a pre-feature DB or a scan
|
|
156
|
+
| `tokenizer` | TEXT | NULL | Resolved offline encoder that produced this scan's per-node token counts (closed enum `cl100k_base` / `o200k_base`, see `project-config.md` / `project-config.schema.json` §tokenizer). Carried on the `ScanResult.tokenizer` wire field. NULL on a pre-feature DB or a scan with tokenization disabled. On `sm scan --changed` the orchestrator compares this against the freshly-resolved encoder and, when they differ (or the stored value is NULL), bypasses cached per-node token reuse so `buildNode` recomputes counts with the current encoder; changing the tokenizer thus invalidates prior counts. |
|
|
157
157
|
| `schema_fingerprint` | TEXT | NULL | sha256 (hex) of the migration DDL the schema was built from, written at persist time. NULL on a DB created by a pre-fingerprint CLI; a NULL (or mismatching) value is read as schema drift (see §Schema drift). Internal DB metadata, NOT carried on the `ScanResult` wire shape. |
|
|
158
158
|
|
|
159
|
-
The `scope` column was removed pre-1.0 along with the `-g/--global` flag (see `cli-contract.md` §Scope is always project-local); every persisted scan is project-scoped so the column
|
|
159
|
+
The `scope` column was removed pre-1.0 along with the `-g/--global` flag (see `cli-contract.md` §Scope is always project-local); every persisted scan is project-scoped so the column carried nothing worth round-tripping. Older DBs are not migrated; the drop is a greenfield change and a fresh `sm init` regenerates the schema.
|
|
160
160
|
|
|
161
161
|
No indexes (single row).
|
|
162
162
|
|
|
163
163
|
### `scan_extractor_runs`
|
|
164
164
|
|
|
165
|
-
Fine-grained cache breadcrumbs for the incremental scan path. One row per `(node_path, extractor_id)` recording the body hash the Extractor saw the last time it ran against that node. Replace-all on every `sm scan` so rows for
|
|
165
|
+
Fine-grained cache breadcrumbs for the incremental scan path. One row per `(node_path, extractor_id)` recording the body hash the Extractor saw the last time it ran against that node. Replace-all on every `sm scan` so rows for uninstalled Extractors disappear automatically.
|
|
166
166
|
|
|
167
|
-
The orchestrator consults this table on `sm scan --changed`: a node-level cache hit (body+frontmatter unchanged)
|
|
167
|
+
The orchestrator consults this table on `sm scan --changed`: a node-level cache hit (body+frontmatter unchanged) upgrades to a full skip ONLY when every currently-registered Extractor (filtered by its `precondition`) has a row matching the prior body hash. A new Extractor registered between scans is detected by the absence of its row and runs over the cached node WITHOUT a full cache invalidation; without this table its emissions would go missing on the next `--changed` pass. The same machinery lets a future Action-issued probabilistic enrichment reuse paid LLM output across unchanged bodies.
|
|
168
168
|
|
|
169
169
|
| Column | Type | Constraint | Notes |
|
|
170
170
|
|---|---|---|---|
|
|
171
171
|
| `node_path` | TEXT | NOT NULL | FK semantically to `scan_nodes.path`; MAY be unenforced (the row is deleted in the same tx as the parent node when the file disappears). |
|
|
172
172
|
| `extractor_id` | TEXT | NOT NULL | Qualified id `<plugin_id>/<id>` per spec § A.6. |
|
|
173
173
|
| `body_hash_at_run` | TEXT | NOT NULL | The `node.body_hash` the Extractor processed; sha256, hex. |
|
|
174
|
-
| `sidecar_annotations_hash_at_run` | TEXT | NOT NULL | sha256 of the canonical-form `node.sidecar.annotations` block the Extractor saw on its run. Always populated
|
|
175
|
-
| `ran_at` | INTEGER | NOT NULL | Unix milliseconds, wall-clock when the Extractor finished or was last carried forward via cache reuse.
|
|
174
|
+
| `sidecar_annotations_hash_at_run` | TEXT | NOT NULL | sha256 of the canonical-form `node.sidecar.annotations` block the Extractor saw on its run. Always populated; an absent sidecar or one without annotations canonicalises to `{}` so the hash stays stable across "no sidecar" → "empty annotations" transitions. Participates in the cache hit condition for every Extractor: a `.sm`-only edit invalidates the cached run, no opt-in flag required. The author-facing flag alternative was rejected (forgetting it yielded silent stale-data bugs); universal invalidation costs one re-run on sidecar edits (negligible: sidecars change rarely, Extractors are pure-CPU). |
|
|
175
|
+
| `ran_at` | INTEGER | NOT NULL | Unix milliseconds, wall-clock when the Extractor finished or was last carried forward via cache reuse. For diagnostics + future GC of stale rows. |
|
|
176
176
|
|
|
177
177
|
Primary key: `(node_path, extractor_id)`. Indexes: `ix_scan_extractor_runs_node`, `ix_scan_extractor_runs_extractor`.
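
The full-skip rule for `sm scan --changed` described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the orchestrator's code: `ExtractorRun`, `RegisteredExtractor`, `CachedNode`, `extractorsToRun`, and `canFullySkip` are hypothetical names, and the `applies` callback merely stands in for the Extractor's `precondition`.

```typescript
// Hypothetical shapes; field names mirror the `scan_extractor_runs` columns.
interface ExtractorRun {
  nodePath: string;
  extractorId: string; // qualified `<plugin_id>/<id>`
  bodyHashAtRun: string;
  sidecarAnnotationsHashAtRun: string;
}

interface RegisteredExtractor {
  id: string;
  // Stand-in for the Extractor's `precondition`: true when it applies to the node.
  applies: (nodePath: string) => boolean;
}

interface CachedNode {
  path: string;
  bodyHash: string;
  sidecarAnnotationsHash: string;
}

// Ids of the Extractors that still have to run against a node-level cache hit.
function extractorsToRun(
  node: CachedNode,
  registered: RegisteredExtractor[],
  runs: ExtractorRun[],
): string[] {
  const priorRuns = new Map<string, ExtractorRun>();
  for (const run of runs) {
    if (run.nodePath === node.path) priorRuns.set(run.extractorId, run);
  }
  return registered
    .filter((ext) => ext.applies(node.path))
    .filter((ext) => {
      const run = priorRuns.get(ext.id);
      // A missing row means the Extractor was registered after the last scan.
      if (run === undefined) return true;
      return (
        run.bodyHashAtRun !== node.bodyHash ||
        run.sidecarAnnotationsHashAtRun !== node.sidecarAnnotationsHash
      );
    })
    .map((ext) => ext.id);
}

// The node-level hit upgrades to a full skip only when nothing is left to run.
function canFullySkip(
  node: CachedNode,
  registered: RegisteredExtractor[],
  runs: ExtractorRun[],
): boolean {
  return extractorsToRun(node, registered, runs).length === 0;
}
```

Under these assumptions a newly registered Extractor shows up as the only entry returned by `extractorsToRun`, so it runs over the cached node while the others keep their cached output.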
|
|
178
178
|
|
|
179
|
-
**Source-attribution interaction.** `scan_links.sources_json` carries the *short* extractor id the author wrote (e.g. `'slash'`); this table keys on the *qualified* form (`'core/slash-command'`). When a cached link is reshaped on reuse the orchestrator strips short ids whose owning Extractor is no longer registered (
|
|
179
|
+
**Source-attribution interaction.** `scan_links.sources_json` carries the *short* extractor id the author wrote (e.g. `'slash'`); this table keys on the *qualified* form (`'core/slash-command'`). When a cached link is reshaped on reuse the orchestrator strips short ids whose owning Extractor is no longer registered (a removed extractor must not stay attributed); links whose sole source is an uninstalled Extractor disappear; links whose sources include a missing-but-still-registered Extractor are dropped so it can re-emit fresh.
|
|
180
180
|
|
|
181
181
|
### `node_enrichments`
|
|
182
182
|
|
|
183
183
|
Universal enrichment layer (A.8). Stores `ctx.enrichNode(partial)` outputs separately from the author-supplied frontmatter on `scan_nodes.frontmatter_json`, which the Extractor pipeline NEVER mutates.
|
|
184
184
|
|
|
185
|
-
One row per `(node_path, extractor_id)` pair an Extractor enriched. Extractors are deterministic-only; rows are
|
|
185
|
+
One row per `(node_path, extractor_id)` pair an Extractor enriched. Extractors are deterministic-only; rows are overwritten via PRIMARY KEY conflict on the next re-extract through the A.9 cache.
|
|
186
186
|
|
|
187
187
|
| Column | Type | Constraint | Notes |
|
|
188
188
|
|---|---|---|---|
|
|
189
189
|
| `node_path` | TEXT | NOT NULL | FK semantically to `scan_nodes.path`; replaced when a rename heuristic fires (mirrors the `state_*` FK migration). |
|
|
190
190
|
| `extractor_id` | TEXT | NOT NULL | Qualified id `<plugin_id>/<id>` per spec § A.6. |
|
|
191
|
-
| `body_hash_at_enrichment` | TEXT | NOT NULL | The `node.body_hash` the Extractor saw when it produced this enrichment. Always equal to the live body hash for Extractor writes; reserved for future Action-issued probabilistic enrichments where stale tracking
|
|
191
|
+
| `body_hash_at_enrichment` | TEXT | NOT NULL | The `node.body_hash` the Extractor saw when it produced this enrichment. Always equal to the live body hash for Extractor writes; reserved for future Action-issued probabilistic enrichments where stale tracking matters. |
|
|
192
192
|
| `value_json` | TEXT | NOT NULL | JSON-serialised `Partial<Node>`, the cumulative merge of every `enrichNode(...)` call the Extractor made for this node within its `extract()` invocation. |
|
|
193
|
-
| `stale` | INTEGER | NOT NULL DEFAULT 0, CHECK in (0, 1) | Reserved. Always `0` in this revision (Extractors are deterministic; re-running is free).
|
|
194
|
-
| `enriched_at` | INTEGER | NOT NULL | Unix milliseconds, when the Extractor produced this enrichment. Drives
|
|
195
|
-
| `is_probabilistic` | INTEGER | NOT NULL DEFAULT 0, CHECK in (0, 1) | Reserved. Always `0` for Extractor writes (
|
|
193
|
+
| `stale` | INTEGER | NOT NULL DEFAULT 0, CHECK in (0, 1) | Reserved. Always `0` in this revision (Extractors are deterministic; re-running is free). Flag and index kept for the future Action-prob enrichment revision where queued LLM jobs must preserve paid output across body changes. |
|
|
194
|
+
| `enriched_at` | INTEGER | NOT NULL | Unix milliseconds, when the Extractor produced this enrichment. Drives read-time merge order (`ASC` → last-write-wins per field) inside `mergeNodeWithEnrichments`. |
|
|
195
|
+
| `is_probabilistic` | INTEGER | NOT NULL DEFAULT 0, CHECK in (0, 1) | Reserved. Always `0` for Extractor writes (deterministic-only). For the future Action-prob revision where the writer's mode is denormalised onto the row so the stale-flag query stays single-table. |
|
|
196
196
|
|
|
197
|
-
Primary key: `(node_path, extractor_id)`. Indexes: `ix_node_enrichments_node`, `ix_node_enrichments_stale`. The `_stale` index is dormant in this revision (every row has `stale = 0`);
|
|
197
|
+
Primary key: `(node_path, extractor_id)`. Indexes: `ix_node_enrichments_node`, `ix_node_enrichments_stale`. The `_stale` index is dormant in this revision (every row has `stale = 0`); preserved so the future Action-prob revision ships without a schema migration.
|
|
198
198
|
|
|
199
199
|
**Persistence flow** (per `sm scan`):
|
|
200
200
|
|
|
201
201
|
1. **Rename migration**, for every `RenameOp` from the rename heuristic, update `node_enrichments.node_path` from `op.from` to `op.to` so the audit trail tracks the file like `state_*` rows do.
|
|
202
202
|
2. **Drop-on-disappear**, delete every row whose `node_path` is no longer in the live node set.
|
|
203
203
|
3. **Upsert**, for every `(node_path, extractor_id)` pair the orchestrator emitted in this scan, upsert with `stale = 0`, `is_probabilistic = 0`, and the current `body_hash`. The PRIMARY KEY conflict refreshes `body_hash_at_enrichment` / `value_json` / `enriched_at` on every re-run.
|
|
204
|
-
4. **Stale flagging**, no-op in this revision (Extractors are deterministic-only; the sweep finds nothing to flag).
|
|
204
|
+
4. **Stale flagging**, no-op in this revision (Extractors are deterministic-only; the sweep finds nothing to flag). Preserved so the future Action-prob revision slots in without reshaping the contract.
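
The four steps above can be sketched against an in-memory map instead of a real table. This is only an illustration: `EnrichmentRow`, `RenameOpSketch`, `EmittedEnrichment`, and `persistEnrichments` are hypothetical names, and the map key stands in for the `(node_path, extractor_id)` primary key.

```typescript
// Hypothetical in-memory mirror of a `node_enrichments` row.
interface EnrichmentRow {
  nodePath: string;
  extractorId: string;
  bodyHashAtEnrichment: string;
  valueJson: string;
  stale: 0 | 1;
  enrichedAt: number;
  isProbabilistic: 0 | 1;
}

interface RenameOpSketch {
  from: string;
  to: string;
}

interface EmittedEnrichment {
  nodePath: string;
  extractorId: string;
  bodyHash: string;
  value: Record<string, unknown>;
}

// Stand-in for the composite primary key.
const enrichmentKey = (nodePath: string, extractorId: string): string =>
  `${nodePath}\u0000${extractorId}`;

function persistEnrichments(
  rows: Map<string, EnrichmentRow>,
  renames: RenameOpSketch[],
  livePaths: Set<string>,
  emitted: EmittedEnrichment[],
  now: number,
): Map<string, EnrichmentRow> {
  const renameTo = new Map<string, string>();
  for (const op of renames) renameTo.set(op.from, op.to);

  const next = new Map<string, EnrichmentRow>();
  rows.forEach((row) => {
    // 1. Rename migration: follow the file to its new path.
    const nodePath = renameTo.get(row.nodePath) ?? row.nodePath;
    // 2. Drop-on-disappear: skip rows whose node left the live set.
    if (!livePaths.has(nodePath)) return;
    next.set(enrichmentKey(nodePath, row.extractorId), { ...row, nodePath });
  });

  // 3. Upsert: a key conflict refreshes hash, value, and timestamp.
  for (const e of emitted) {
    next.set(enrichmentKey(e.nodePath, e.extractorId), {
      nodePath: e.nodePath,
      extractorId: e.extractorId,
      bodyHashAtEnrichment: e.bodyHash,
      valueJson: JSON.stringify(e.value),
      stale: 0,
      enrichedAt: now,
      isProbabilistic: 0,
    });
  }

  // 4. Stale flagging: nothing to do while Extractors are deterministic-only.
  return next;
}
```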
|
|
205
205
|
|
|
206
206
|
**Read-side `node.merged` view.** Analyzers / `sm check` / `sm export` consume `node.frontmatter` directly (deterministic CI-safe baseline). UI / future opt-in consumers call `mergeNodeWithEnrichments(node, enrichments)` which:
|
|
207
207
|
|
|
@@ -209,16 +209,16 @@ Primary key: `(node_path, extractor_id)`. Indexes: `ix_node_enrichments_node`, `
|
|
|
209
209
|
2. Sorts by `enriched_at` ASC.
|
|
210
210
|
3. Spread-merges each `value` over the author frontmatter (last-write-wins per field).
|
|
211
211
|
|
|
212
|
-
Stale row visibility is opt-in via `mergeNodeWithEnrichments(node, enrichments, { includeStale: true })
|
|
212
|
+
Stale row visibility is opt-in via `mergeNodeWithEnrichments(node, enrichments, { includeStale: true })`, a no-op today (no rows are stale-flagged); preserved for the future Action-prob revision noted above.
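
A rough sketch of the read-side merge, for illustration only. `mergeSketch` and `EnrichmentRecord` are hypothetical names rather than the real `mergeNodeWithEnrichments` signature, and the stale filter in the first step is an assumption inferred from the `includeStale` option.

```typescript
// Hypothetical decoded form of a `node_enrichments` row.
interface EnrichmentRecord {
  value: Record<string, unknown>;
  enrichedAt: number; // Unix ms
  stale: boolean;
}

function mergeSketch(
  frontmatter: Record<string, unknown>,
  enrichments: EnrichmentRecord[],
  options: { includeStale?: boolean } = {},
): Record<string, unknown> {
  // Assumed first step: hide stale rows unless the caller opts in.
  const visible = enrichments.filter(
    (e) => options.includeStale === true || !e.stale,
  );
  // Sort by `enriched_at` ASC so later writes are applied last.
  const ordered = visible.slice().sort((a, b) => a.enrichedAt - b.enrichedAt);
  // Spread-merge over the author frontmatter: last write wins per field.
  let merged: Record<string, unknown> = { ...frontmatter };
  for (const e of ordered) {
    merged = { ...merged, ...e.value };
  }
  return merged;
}
```

The author frontmatter object is never mutated in this sketch, which matches the rule that the Extractor pipeline leaves `scan_nodes.frontmatter_json` alone.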
|
|
213
213
|
|
|
214
214
|
**Refresh verbs** (see [`cli-contract.md` §Scan](./cli-contract.md#scan)):
|
|
215
215
|
|
|
216
|
-
- `sm refresh <node.path>` re-runs Extractors against a single node and upserts their enrichment rows.
|
|
216
|
+
- `sm refresh <node.path>` re-runs Extractors against a single node and upserts their enrichment rows. Deterministic-only, they always run for real and persist.
|
|
217
217
|
- `sm refresh --stale` batches the granular form across every node carrying at least one stale row; in this revision the stale set is always empty so the verb prints a "nothing to do" advisory and exits `0`.
|
|
218
218
|
|
|
219
219
|
### `scan_contributions`
|
|
220
220
|
|
|
221
|
-
|
|
221
|
+
View contribution system (Phase 3). Per-node typed payloads emitted by extractors via `ctx.emitContribution(id, payload)` (and by analyzers via `ctx.emitScopeContribution(id, payload)` for scope-level slots). One row per `(plugin_id, extension_id, node_path, contribution_id)` tuple.
|
|
222
222
|
|
|
223
223
|
| Column | Type | Constraint | Notes |
|
|
224
224
|
|---|---|---|---|
|
|
@@ -226,30 +226,30 @@ Phase 3 / View contribution system. Per-node typed payloads emitted by extractor
|
|
|
226
226
|
| `extension_id` | TEXT | NOT NULL | Extension id within the plugin. |
|
|
227
227
|
| `node_path` | TEXT | NOT NULL | FK semantically to `scan_nodes.path`; orphan-swept on persist when the parent node disappears. |
|
|
228
228
|
| `contribution_id` | TEXT | NOT NULL | Manifest Record key under `extension.ui[<contributionId>]` (the runtime catalog keeps the historical name `viewContributions`). |
|
|
229
|
-
| `slot` | TEXT | NOT NULL | Closed-enum-by-spec slot name; mirror of `view-slots.schema.json#/$defs/SlotName`. Kept open at the SQL layer (no CHECK) so catalog evolution
|
|
229
|
+
| `slot` | TEXT | NOT NULL | Closed-enum-by-spec slot name; mirror of `view-slots.schema.json#/$defs/SlotName`. Kept open at the SQL layer (no CHECK) so catalog evolution needs no DDL migration; `sm plugins upgrade` handles renames at the manifest layer. |
|
|
230
230
|
| `payload_json` | TEXT | NOT NULL | JSON-serialised payload, already validated against the slot's payload schema (`view-slots.schema.json#/$defs/payloads/<slot>`) at emit time. Off-shape payloads emit `extension.error` and drop silently. |
|
|
231
231
|
| `emitted_at` | INTEGER | NOT NULL | Unix milliseconds. |
|
|
232
232
|
|
|
233
233
|
Primary key: `(plugin_id, extension_id, node_path, contribution_id)`. Indexes: `ix_scan_contributions_node_path` (inspector lazy-fetch + orphan sweep), `ix_scan_contributions_plugin_id` (catalog sweep + `purgeByPlugin`).
|
|
234
234
|
|
|
235
|
-
**Persistence, orphan + catalog + per-tuple sweep + upsert (NOT pure replace-all).** The watcher's cached pass leaves the
|
|
235
|
+
**Persistence, orphan + catalog + per-tuple sweep + upsert (NOT pure replace-all).** The watcher's cached pass leaves the buffer empty for cached nodes (the orchestrator skips `extract()` on a per-(node, extractor) cache hit, so no `emitContribution` fires), and a naive wipe-all would drop the prior valid rows on every watcher boot. The persist runs four passes inside the same tx as the rest of the scan zone:
|
|
236
236
|
|
|
237
237
|
1. **Orphan sweep**, drops every row whose `node_path` is NOT in the current live node set (`livePaths` derived from `result.nodes`). Disappeared nodes lose their contributions automatically.
|
|
238
|
-
2. **Catalog sweep**, drops every row whose qualified id `(pluginId, extensionId, contributionId)` is NOT in the registered runtime catalog (`registeredContributionKeys` collected via `collectRegisteredContributionKeys(composed)`). Uninstalled-on-disk plugins and removed contributions lose their rows on the next scan. Disabled plugins are normally purged eagerly by `sm plugins disable` (see `purgeByPlugin` below)
|
|
238
|
+
2. **Catalog sweep**, drops every row whose qualified id `(pluginId, extensionId, contributionId)` is NOT in the registered runtime catalog (`registeredContributionKeys` collected via `collectRegisteredContributionKeys(composed)`). Uninstalled-on-disk plugins and removed contributions lose their rows on the next scan. Disabled plugins are normally purged eagerly by `sm plugins disable` (see `purgeByPlugin` below); this is the fallback for the rare "config flipped between scans without going through the CLI" case.
|
|
239
239
|
3. **Per-tuple sweep**, for every `(pluginId, extensionId, node_path)` tuple in `freshlyRunTuples` (extension actually ran against that node this scan: extractor cache miss, OR analyzer), drop any row carrying that triple whose `contribution_id` is NOT refreshed by the buffer. Catches the "extractor used to emit, now does not" case without touching cached-extractor rows. Tuple format: `<pluginId>/<extensionId>/<nodePath>`.
|
|
240
240
|
4. **Upsert**, `INSERT ... ON CONFLICT DO UPDATE SET payload_json = excluded.payload_json, slot = excluded.slot` for every row in the buffer. PK conflict refreshes `payload_json` + `slot` + `emitted_at`.
|
|
241
241
|
|
|
242
|
-
Cached nodes' rows survive untouched
|
|
242
|
+
Cached nodes' rows survive untouched: neither orphaned (still in the live set) nor uninstalled (still in the catalog) nor in `freshlyRunTuples` (extractor short-circuited via cache) nor in the buffer (no re-emit). When the body next changes, the extractor re-runs, the tuple lands in the freshly-run set, and either the upsert refreshes the row or the per-tuple sweep drops it.
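
The four passes can be sketched over plain arrays and sets. This is a simplified illustration, not the storage adapter: `ContributionRow`, `PersistInputs`, and `persistContributions` are hypothetical names, and the slash-joined string used for catalog keys is an assumption (only the tuple format `<pluginId>/<extensionId>/<nodePath>` is stated above).

```typescript
// Hypothetical in-memory mirror of a `scan_contributions` row.
interface ContributionRow {
  pluginId: string;
  extensionId: string;
  nodePath: string;
  contributionId: string;
  slot: string;
  payloadJson: string;
  emittedAt: number;
}

interface PersistInputs {
  livePaths: Set<string>;
  registeredContributionKeys: Set<string>; // assumed `<pluginId>/<extensionId>/<contributionId>`
  freshlyRunTuples: Set<string>; // `<pluginId>/<extensionId>/<nodePath>`
  buffer: ContributionRow[];
}

// Stand-in for the four-column primary key.
const contributionPk = (r: ContributionRow): string =>
  [r.pluginId, r.extensionId, r.nodePath, r.contributionId].join("\u0000");

function persistContributions(
  existing: ContributionRow[],
  input: PersistInputs,
): ContributionRow[] {
  const refreshed = new Set(input.buffer.map(contributionPk));
  const kept = new Map<string, ContributionRow>();

  for (const row of existing) {
    // 1. Orphan sweep: the node is gone.
    if (!input.livePaths.has(row.nodePath)) continue;
    // 2. Catalog sweep: the contribution is no longer registered.
    const catalogKey = `${row.pluginId}/${row.extensionId}/${row.contributionId}`;
    if (!input.registeredContributionKeys.has(catalogKey)) continue;
    // 3. Per-tuple sweep: the extension ran this scan but did not re-emit.
    const tuple = `${row.pluginId}/${row.extensionId}/${row.nodePath}`;
    if (input.freshlyRunTuples.has(tuple) && !refreshed.has(contributionPk(row))) continue;
    kept.set(contributionPk(row), row);
  }

  // 4. Upsert: buffered rows replace any surviving row with the same key.
  for (const row of input.buffer) kept.set(contributionPk(row), row);

  return Array.from(kept.values());
}
```

A row for a cached node passes all three sweeps and is absent from the buffer, so it comes out unchanged, which is the behaviour a wipe-all persist would lose.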
|
|
243
243
|
|
|
244
|
-
**Backwards-compat fallbacks.** `IPersistOptions.livePaths`, `IPersistOptions.registeredContributionKeys`, `IPersistOptions.freshlyRunTuples` are all optional. Absent / empty `livePaths` falls back to wipe-all (legacy behaviour)
|
|
244
|
+
**Backwards-compat fallbacks.** `IPersistOptions.livePaths`, `IPersistOptions.registeredContributionKeys`, `IPersistOptions.freshlyRunTuples` are all optional, so older callers preserve the pre-fix behaviour. Absent / empty `livePaths` falls back to wipe-all (legacy behaviour); `registeredContributionKeys` skips the catalog sweep (rows for disabled plugins linger until next purge); `freshlyRunTuples` skips the per-tuple sweep (rows that should have been dropped because an extractor stopped emitting linger until the node body, the extractor registration, or the node existence changes again).
|
|
245
245
|
|
|
246
|
-
NOT analogous to `state_plugin_kvs` (which is plugin-managed). Belongs to the `scan_*` family
|
|
246
|
+
NOT analogous to `state_plugin_kvs` (which is plugin-managed). Belongs to the `scan_*` family; sweep semantics replace pure replace-all but the data is still scan-derived.
|
|
247
247
|
|
|
248
|
-
**Eager purge on disable.** `sm plugins disable <id>` calls `StoragePort.contributions.purgeByPlugin(pluginId, extensionId)` immediately after persisting `config_plugins[<id>].enabled = false`. Every persisted toggle key is the qualified `<plugin>/<ext>` shape (the CLI's bundle macro form and the BFF's cascade endpoint expand bare plugin ids before persistence), so the purge always receives both segments.
|
|
248
|
+
**Eager purge on disable.** `sm plugins disable <id>` calls `StoragePort.contributions.purgeByPlugin(pluginId, extensionId)` immediately after persisting `config_plugins[<id>].enabled = false`. Every persisted toggle key is the qualified `<plugin>/<ext>` shape (the CLI's bundle macro form and the BFF's cascade endpoint expand bare plugin ids before persistence), so the purge always receives both segments. Avoids the "I disabled the extension but its chips still render until I re-scan" gap. Re-enabling (`sm plugins enable <id>`) does NOT restore the rows; the next scan re-emits them, same as a cold start. Contributions are scan-derived, so this is cheap; for plugin-managed state (`state_plugin_kvs`, dedicated tables) the opposite policy holds, see `plugin-kv-api.md` § "disable does not drop data".
|
|
249
249
|
|
|
250
250
|
### `scan_link_scores`
|
|
251
251
|
|
|
252
|
-
Per-op confidence-attribution audit trail. One row per attributed `ctx.adjustConfidence(link, op)` call buffered by a `score`-phase analyzer during the scan (the
|
|
252
|
+
Per-op confidence-attribution audit trail. One row per attributed `ctx.adjustConfidence(link, op)` call buffered by a `score`-phase analyzer during the scan (the built-in detectors `core/name-reserved`, `core/reference-broken` apply penalty deltas on top of the kernel's 1.0 baseline; third-party scorers add their own). Lets an operator answer "why is this link at `0.3`?" by listing the plugin / extension / op that moved it, with the FOLDED final value denormalised onto every row.
|
|
253
253
|
|
|
254
254
|
| Column | Type | Constraint | Notes |
|
|
255
255
|
|---|---|---|---|
|
|
@@ -261,16 +261,16 @@ Per-op confidence-attribution audit trail. One row per attributed `ctx.adjustCon
|
|
|
261
261
|
| `normalized_trigger` | TEXT | NULL | The link's `trigger.normalizedTrigger`; NULL for path-style links that carry no trigger. Completes the structural identity key. |
|
|
262
262
|
| `op_kind` | TEXT | NOT NULL | Confidence-algebra bucket: `set` / `delta` / `ceil` / `floor`. Kept open at the SQL layer (no CHECK) so the op catalog can evolve as a kernel + spec change without a DDL migration. |
|
|
263
263
|
| `op_value` | REAL | NOT NULL | The op's operand. |
|
|
264
|
-
| `result_confidence` | REAL | NOT NULL | Denormalised FOLDED final `link.confidence` after every op for this link
|
|
264
|
+
| `result_confidence` | REAL | NOT NULL | Denormalised FOLDED final `link.confidence` after every op for this link applied. Equal across all rows for one link; mirrors `scan_links.confidence` for the same structural edge so the audit read needs no join. |
|
|
265
265
|
| `emitted_at` | INTEGER | NOT NULL | Unix milliseconds. |
|
|
266
266
|
|
|
267
267
|
No primary key (multiple ops MAY land on one link). Index: `ix_scan_link_scores_source_path` (per-node "why this link?" lookup).
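
One way to picture the fold behind `result_confidence` is the sketch below. The spec text here only names the four `op_kind` buckets, so the semantics are assumptions: `set` replaces the value, `delta` adds to it, `ceil` caps it from above, `floor` raises it from below, ops apply in order, and the result is clamped to `[0, 1]`. `ConfidenceOp` and `foldConfidence` are hypothetical names.

```typescript
type OpKind = "set" | "delta" | "ceil" | "floor";

interface ConfidenceOp {
  kind: OpKind;
  value: number; // the `op_value` operand
}

// Keep the running value inside the CHECK range of `scan_links.confidence`.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Folds every op for one link on top of the kernel's 1.0 baseline.
// The per-kind behaviour is an assumed reading of the bucket names.
function foldConfidence(ops: ConfidenceOp[], baseline: number = 1.0): number {
  let current = baseline;
  for (const op of ops) {
    switch (op.kind) {
      case "set":
        current = op.value;
        break;
      case "delta":
        current += op.value;
        break;
      case "ceil":
        current = Math.min(current, op.value);
        break;
      case "floor":
        current = Math.max(current, op.value);
        break;
    }
    current = clamp01(current);
  }
  return current;
}
```

Under these assumptions, every `scan_link_scores` row for a link would carry the same folded value in `result_confidence`, regardless of which op the row describes.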
|
|
268
268
|
|
|
269
|
-
**Persistence, plain replace-all per scan** (delete every row, then insert), the same posture as `scan_issues` / `scan_contribution_errors`, NOT the orphan/catalog/per-tuple sweep `scan_contributions` uses. A score adjustment is a transient scan finding re-derived in full on every analyzer pass, so
|
|
269
|
+
**Persistence, plain replace-all per scan** (delete every row, then insert), the same posture as `scan_issues` / `scan_contribution_errors`, NOT the orphan/catalog/per-tuple sweep `scan_contributions` uses. A score adjustment is a transient scan finding re-derived in full on every analyzer pass, so no cached-node row to preserve. An empty buffer (scorers touched nothing) wipes the table, clearing stale rows from a prior scan.
|
|
270
270
|
|
|
271
271
|
### `scan_node_tags`
|
|
272
272
|
|
|
273
|
-
Tags. One row per `(node_path, tag)` pair, projected at persist time from `sidecar.annotations.tags`. Tags are a skill-map concept (no vendor carries `tags` in frontmatter), so the sidecar is the single source. Drives `sm list --tag <name>` and the UI's tag-faceted search; the `(tag)` index keeps "find all nodes with tag X" `O(log n)`.
|
|
273
|
+
Tags. One row per `(node_path, tag)` pair, projected at persist time from `sidecar.annotations.tags`. Tags are a skill-map concept (no vendor carries `tags` in frontmatter), so the sidecar is the single source. Drives `sm list --tag <name>` and the UI's tag-faceted search; the `(tag)` index keeps "find all nodes with tag X" at `O(log n)`.
|
|
274
274
|
|
|
275
275
|
| Column | Type | Constraint |
|
|
276
276
|
|---|---|---|
|
|
@@ -279,9 +279,9 @@ Tags. One row per `(node_path, tag)` pair, projected at persist time from `sidec
|
|
|
279
279
|
|
|
280
280
|
Primary key: `(node_path, tag)`. Indexes: `ix_scan_node_tags_tag` (search by tag), `ix_scan_node_tags_node_path` (per-node lookup, e.g. inspector projection).
|
|
281
281
|
|
|
282
|
-
**Persistence, replace-all per scan.** Every persisted scan rebuilds the table for the live node set: rows whose `node_path` is NOT in `livePaths` are dropped (orphan sweep, same as the contributions table); rows for nodes in the live set are wiped and re-inserted from the projected sidecar state. Cached nodes' tag rows
|
|
282
|
+
**Persistence, replace-all per scan.** Every persisted scan rebuilds the table for the live node set: rows whose `node_path` is NOT in `livePaths` are dropped (orphan sweep, same as the contributions table); rows for nodes in the live set are wiped and re-inserted from the projected sidecar state. Cached nodes' tag rows project from the cached `node.sidecar.annotations.tags` (already in memory), so the rebuild is cheap regardless of cache hit / miss. Storage is small: a 50-node project with avg 3 tags/node is ~150 rows ≈ 7.5 KB.
|
|
283
283
|
|
|
284
|
-
The wire shape on `/api/nodes` joins this table to project `node.tags = string[]`. The kernel `Node` interface (TypeScript) does NOT carry `tags
|
|
284
|
+
The wire shape on `/api/nodes` joins this table to project `node.tags = string[]`. The kernel `Node` interface (TypeScript) does NOT carry `tags`; consumers walking the canonical source read `node.sidecar.annotations.tags` directly (consistent with the post-decision-#2 posture).
|
|
285
285
|
|
|
286
286
|
---
|
|
287
287
|
|
|
@@ -312,11 +312,11 @@ Matching [`schemas/job.schema.json`](./schemas/job.schema.json). See [`job-lifec
|
|
|
312
312
|
|
|
313
313
|
Indexes: `ix_state_jobs_status`, `ix_state_jobs_action_node_hash` (unique partial index WHERE `status IN ('queued','running')` for duplicate detection).
|
|
314
314
|
|
|
315
|
-
The rendered job content is NOT stored on this table. It lives in `state_job_contents` keyed by `content_hash` so multiple jobs with identical action + node + template pairs share
|
|
315
|
+
The rendered job content is NOT stored on this table. It lives in `state_job_contents` keyed by `content_hash` so multiple jobs with identical action + node + template pairs share one physical blob. See `state_job_contents` below for the storage shape and GC contract.
|
|
316
316
|
|
|
317
317
|
### `state_job_contents`
|
|
318
318
|
|
|
319
|
-
Content-addressed store for the rendered MD content of every queued or completed job. Decouples
|
|
319
|
+
Content-addressed store for the rendered MD content of every queued or completed job. Decouples content from the lifecycle row in `state_jobs` so retries / `--force` reruns / cross-node fan-out emissions of the same prompt all reference one blob.
|
|
320
320
|
|
|
321
321
|
| Column | Type | Constraint |
|
|
322
322
|
|---|---|---|
|
|
@@ -324,15 +324,15 @@ Content-addressed store for the rendered MD content of every queued or completed
|
|
|
324
324
|
| `content` | TEXT | NOT NULL |
|
|
325
325
|
| `created_at` | INTEGER | NOT NULL |
|
|
326
326
|
|
|
327
|
-
No indexes (PK
|
|
327
|
+
No indexes (PK covers lookup by hash; the table is keyed-by-hash exclusively).
|
|
328
328
|
|
|
329
329
|
**Insertion semantics**: `INSERT OR IGNORE INTO state_job_contents(content_hash, content, created_at) VALUES (?, ?, ?)`. Inserting a hash that already has a row is a no-op (the prior insert already paid the storage cost).
|
|
330
330
|
|
|
331
331
|
**GC contract**: `sm job prune` MUST delete every row whose `content_hash` is no longer referenced by any `state_jobs` row, in the same transaction that prunes the job rows. Implementations MUST NOT delete `state_job_contents` rows on `sm job cancel` (a cancelled job's content is recoverable via `sm job submit --force` of the same content_hash and dedup is desirable).
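The insertion and GC semantics above can be sketched with Python's built-in `sqlite3`. This is a minimal illustration, not the reference implementation: the two-column `state_jobs` table is a simplified stand-in for the real one, and the helper names `open_db`, `submit_job` and `prune_jobs` are hypothetical.

```python
import sqlite3
import time


def open_db() -> sqlite3.Connection:
    """Create an in-memory DB with a simplified version of the two tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE state_job_contents (
            content_hash TEXT PRIMARY KEY,
            content      TEXT NOT NULL,
            created_at   INTEGER NOT NULL
        );
        -- ASSUMPTION: simplified stand-in for the real state_jobs table.
        CREATE TABLE state_jobs (
            id           TEXT PRIMARY KEY,
            content_hash TEXT NOT NULL
        );
        """
    )
    return db


def submit_job(db: sqlite3.Connection, job_id: str, content_hash: str, content: str) -> None:
    """Insert the content row (deduplicated by hash), then the job row, in one transaction."""
    now_ms = int(time.time() * 1000)
    with db:
        db.execute(
            "INSERT OR IGNORE INTO state_job_contents(content_hash, content, created_at) "
            "VALUES (?, ?, ?)",
            (content_hash, content, now_ms),
        )
        db.execute(
            "INSERT INTO state_jobs(id, content_hash) VALUES (?, ?)",
            (job_id, content_hash),
        )


def prune_jobs(db: sqlite3.Connection, job_ids: list) -> None:
    """Delete the given jobs and, in the same transaction, any content no job references."""
    with db:
        db.executemany("DELETE FROM state_jobs WHERE id = ?", [(j,) for j in job_ids])
        db.execute(
            "DELETE FROM state_job_contents "
            "WHERE content_hash NOT IN (SELECT content_hash FROM state_jobs)"
        )
```

Two jobs submitted with the same hash share one content row, and that row survives pruning until the last referencing job is gone.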
|
|
332
332
|
|
|
333
|
-
`content_hash` is the same hash
|
|
333
|
+
`content_hash` is the same hash `state_jobs.content_hash` carries, computed at submit time as `sha256(actionId + actionVersion + bodyHash + frontmatterHash + promptTemplateHash)`. Two jobs with identical `content_hash` MUST render to identical content (the formula is deterministic over all rendering inputs); the table relies on this to dedup.
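As a rough illustration of the formula above, the hash could be computed as follows. The source gives the inputs but not the byte encoding or any separator between them, so plain concatenation of UTF-8 strings is an assumption here, and the function name `content_hash` is hypothetical.

```python
import hashlib


def content_hash(
    action_id: str,
    action_version: str,
    body_hash: str,
    frontmatter_hash: str,
    prompt_template_hash: str,
) -> str:
    """sha256 over the five rendering inputs.

    ASSUMPTION: inputs are concatenated as-is and encoded as UTF-8; the
    real implementation may use a different encoding or a separator.
    """
    joined = action_id + action_version + body_hash + frontmatter_hash + prompt_template_hash
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Because the function is deterministic over all rendering inputs, identical inputs always map to the same key, which is what lets `state_job_contents` deduplicate.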
|
|
334
334
|
|
|
335
|
-
|
|
335
|
+
FK enforcement: SQLite foreign keys are off by default and the kernel does not currently turn them on (per `dialect.ts`). The `state_jobs.content_hash → state_job_contents.content_hash` relationship is enforced procedurally by the storage adapter (insert content row before job row in the same transaction; never delete content while jobs reference it). A future foreign-key push may upgrade this to a true FK without breaking the contract.
|
|
336
336
|
|
|
337
337
|
### `state_executions`
|
|
338
338
|
|
|
@@ -360,7 +360,7 @@ Matching [`schemas/execution-record.schema.json`](./schemas/execution-record.sch
|
|
|
360
360
|
|
|
361
361
|
Indexes: `ix_state_executions_extension_id`, `ix_state_executions_started_at`, `ix_state_executions_job_id`.
|
|
362
362
|
|
|
363
|
-
The full report payload (the JSON the model returned, validated against the action's `reportSchemaRef`) is stored inline in `report_json`.
|
|
363
|
+
The full report payload (the JSON the model returned, validated against the action's `reportSchemaRef`) is stored inline in `report_json`. No on-disk report file. `sm job show <id>` and `sm history --json` read the column directly.
|
|
364
364
|
|
|
365
365
|
### `state_summaries`
|
|
366
366
|
|
|
@@ -409,18 +409,18 @@ Primary key: `(plugin_id, node_id, key)` with `node_id` using a sentinel empty s
|
|
|
409
409
|
|
|
410
410
|
### `state_node_favorites`
|
|
411
411
|
|
|
412
|
-
Per-node "favorite" flag set by the local user from the UI.
|
|
412
|
+
Per-node "favorite" flag set by the local user from the UI. One row per favorited node; absence of a row means "not favorited". Exists in zone `state_` because it is user-authored preference, not regenerable scan output: it must survive `sm scan` truncation and `sm db reset` (which drops only `scan_*`).
|
|
413
413
|
|
|
414
414
|
| Column | Type | Constraint |
|
|
415
415
|
|---|---|---|
|
|
416
416
|
| `node_path` | TEXT | PRIMARY KEY |
|
|
417
417
|
| `favorited_at` | INTEGER | NOT NULL | Unix milliseconds when the user marked the node. |
|
|
418
418
|
|
|
419
|
-
No indexes (PK
|
|
419
|
+
No indexes (PK covers lookup by path; the table is keyed-by-path exclusively).
|
|
420
420
|
|
|
421
|
-
`node_path` is FK-semantic to `scan_nodes.path`. Per `§ Rename detection` below, the rename heuristic MUST migrate rows
|
|
421
|
+
`node_path` is FK-semantic to `scan_nodes.path`. Per `§ Rename detection` below, the rename heuristic MUST migrate rows here when a path is renamed (same protocol as `state_jobs` / `state_summaries` / `state_enrichments` / `state_plugin_kvs`). A simple PK update suffices. The key is not composite, so the only possible collision is a destination path that already has a row; in that case the migrating row is dropped to preserve the live one.
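The migration rule for this table can be sketched as below, using Python's `sqlite3`. It is an illustration under assumptions: the helper name `migrate_favorite` is hypothetical, and the real storage adapter runs this inside the scan transaction rather than its own.

```python
import sqlite3


def migrate_favorite(db: sqlite3.Connection, old_path: str, new_path: str) -> None:
    """Move a favorite row from old_path to new_path on rename.

    If new_path already has a row, the migrating row is dropped so the
    live row is preserved; otherwise the primary key is updated in place.
    """
    with db:
        dest_exists = db.execute(
            "SELECT 1 FROM state_node_favorites WHERE node_path = ?", (new_path,)
        ).fetchone()
        if dest_exists:
            db.execute(
                "DELETE FROM state_node_favorites WHERE node_path = ?", (old_path,)
            )
        else:
            db.execute(
                "UPDATE state_node_favorites SET node_path = ? WHERE node_path = ?",
                (new_path, old_path),
            )
```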
|
|
422
422
|
|
|
423
|
-
The BFF's `/api/nodes` route loads the full set of favorited paths once per request (`SELECT node_path FROM state_node_favorites`) and decorates each emitted `Node` with a derived `isFavorite` boolean by Set membership
|
|
423
|
+
The BFF's `/api/nodes` route loads the full set of favorited paths once per request (`SELECT node_path FROM state_node_favorites`) and decorates each emitted `Node` with a derived `isFavorite` boolean by Set membership: no SQL JOIN against `scan_nodes`, zero per-scan persistence transactions.
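The decoration step might look like the following sketch. Nodes are modelled as plain dicts with a `path` key, which is a simplification of the real `Node` shape, and the function name `decorate_with_favorites` is hypothetical.

```python
import sqlite3


def decorate_with_favorites(db: sqlite3.Connection, nodes: list) -> list:
    """Load all favorited paths once, then tag each node by set membership."""
    favorites = {
        row[0] for row in db.execute("SELECT node_path FROM state_node_favorites")
    }
    # No JOIN against scan_nodes: membership is checked in memory per node.
    return [{**node, "isFavorite": node["path"] in favorites} for node in nodes]
```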
|
|
424
424
|
|
|
425
425
|
---
|
|
426
426
|
|
|
@@ -443,7 +443,7 @@ Persists user-toggled enable/disable overrides. Discovery is still filesystem-ba
|
|
|
443
443
|
2. `.skill-map/settings.json#/plugins/<id>/enabled`, committed team-shared baseline.
|
|
444
444
|
3. Installed default, every discovered plugin is enabled until told otherwise.
|
|
445
445
|
|
|
446
|
-
The DB
|
|
446
|
+
The DB takes precedence over `settings.json` so a developer can locally disable a misbehaving plugin without committing the toggle to the team's config. Conversely, a team baseline that explicitly enables a plugin is overridable per-machine, no agreement required to experiment.
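The precedence described above reduces to a short fallback chain. In this sketch `None` stands for "no value recorded at that layer"; the function name and parameter names are hypothetical.

```python
from typing import Optional


def resolve_plugin_enabled(
    db_override: Optional[bool],
    settings_enabled: Optional[bool],
) -> bool:
    """Resolve the effective enabled state for one plugin.

    Order: local DB override, then the committed settings.json baseline,
    then the installed default (enabled).
    """
    if db_override is not None:
        return db_override
    if settings_enabled is not None:
        return settings_enabled
    return True
```

For example, a team baseline of `True` combined with a local DB override of `False` resolves to disabled on that machine only.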
|
|
447
447
|
|
|
448
448
|
### `config_preferences`
|
|
449
449
|
|
|
@@ -469,7 +469,7 @@ Migration ledger. One row per successfully applied migration, per scope.
|
|
|
469
469
|
|
|
470
470
|
Primary key: `(scope, owner_id, version)`.
|
|
471
471
|
|
|
472
|
-
The kernel ALSO maintains `PRAGMA user_version` (or the engine equivalent) as a fast pre-check for kernel migrations. A mismatch between `user_version` and `config_schema_versions` is
|
|
472
|
+
The kernel ALSO maintains `PRAGMA user_version` (or the engine equivalent) as a fast pre-check for kernel migrations. A mismatch between `user_version` and `config_schema_versions` is flagged by `sm doctor`.
|
|
473
473
|
|
|
474
474
|
---
|
|
475
475
|
|
|
@@ -479,9 +479,9 @@ The kernel ALSO maintains `PRAGMA user_version` (or the engine equivalent) as a
|
|
|
479
479
|
- **Naming**: `NNN_snake_case.sql` where `NNN` is 3-digit sequential, zero-padded. Example: `001_initial.sql`, `042_add_provenance.sql`.
|
|
480
480
|
- **Location**: kernel migrations in `src/migrations/` (reference impl); plugin migrations in `<plugin-dir>/migrations/`.
|
|
481
481
|
- **Wrapping**: the kernel wraps each file in `BEGIN; ... ; COMMIT;`. Files contain DDL only.
|
|
482
|
-
- **Strict versioning**: no idempotency
|
|
482
|
+
- **Strict versioning**: no idempotency required. `CREATE TABLE IF NOT EXISTS` is DISCOURAGED in kernel migrations (permitted in plugin migrations, at the author's discretion).
|
|
483
483
|
- **Auto-apply**: on startup. A backup is written to `.skill-map/backups/skill-map-pre-migrate-v<N>.db` before applying. The `sm db migrate` / `sm db backup` verbs open the DB with auto-apply suppressed so the operator drives migrations manually.
|
|
484
|
-
- **Plugin migration order**: plugins are migrated after kernel migrations
|
|
484
|
+
- **Plugin migration order**: plugins are migrated after kernel migrations, in stable alphabetical order by plugin id. A failing plugin migration disables only that plugin; other plugins and the kernel continue.
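The naming rule in the list above can be checked and ordered with a small helper. This is a sketch: the exact character set allowed in the `snake_case` part is not stated in the source, so lowercase letters and digits are an assumption, and `ordered_migrations` is a hypothetical name.

```python
import re

# ASSUMPTION: snake_case means lowercase letters/digits joined by single underscores.
MIGRATION_NAME = re.compile(r"^(\d{3})_[a-z0-9]+(?:_[a-z0-9]+)*\.sql$")


def ordered_migrations(filenames: list) -> list:
    """Validate NNN_snake_case.sql names and return them in sequence order."""
    numbered = []
    for name in filenames:
        match = MIGRATION_NAME.match(name)
        if match is None:
            raise ValueError(f"invalid migration file name: {name}")
        numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]
```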
|
|
485
485
|
|
|
486
486
|
`sm db migrate` controls migration flow manually: `--dry-run`, `--status`, `--to <n>`, `--kernel-only`, `--plugin <id>`, `--no-backup`.
|
|
487
487
|
|
|
@@ -489,26 +489,26 @@ The kernel ALSO maintains `PRAGMA user_version` (or the engine equivalent) as a
|
|
|
489
489
|
|
|
490
490
|
## Schema drift (pre-1.0)
|
|
491
491
|
|
|
492
|
-
The project DB is a derived cache: every `scan_*` row is regenerable, and the operator's authored data lives in `.sm` sidecars, not in the DB. While the kernel stays in `0.Y.Z` (see [`versioning.md` §Pre-1.0](./versioning.md#pre-10)) it does NOT ship incremental migrations to carry an existing DB across a schema change. Drift is detected on two independent axes; either
|
|
492
|
+
The project DB is a derived cache: every `scan_*` row is regenerable, and the operator's authored data lives in `.sm` sidecars, not in the DB. While the kernel stays in `0.Y.Z` (see [`versioning.md` §Pre-1.0](./versioning.md#pre-10)) it does NOT ship incremental migrations to carry an existing DB across a schema change. Drift is detected on two independent axes; either trips a rebuild.
|
|
493
493
|
|
|
494
494
|
**Axis 1, version.** A write-side open compares `scan_meta.scanned_by_version` against the running CLI version:
|
|
495
495
|
|
|
496
496
|
- **Same `major.minor`** (patch differences ignored): compatible.
|
|
497
497
|
- **Any minor or major difference**: drifted.
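The version axis amounts to comparing the `major.minor` pair and ignoring the patch component. A minimal sketch, assuming plain `X.Y.Z` version strings; the function name `version_drifted` is hypothetical.

```python
def version_drifted(stored_version: str, running_version: str) -> bool:
    """Return True when major or minor differ; patch differences are ignored."""

    def major_minor(version: str) -> tuple:
        major, minor = version.split(".")[:2]
        return int(major), int(minor)

    return major_minor(stored_version) != major_minor(running_version)
```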
|
|
498
498
|
|
|
499
|
-
**Axis 2, schema fingerprint.** Pre-1.0 the greenfield posture adds columns INLINE to `001_initial.sql` WITHOUT bumping a version (see [`versioning.md` §Pre-1.0](./versioning.md#pre-10)). A DB
|
|
499
|
+
**Axis 2, schema fingerprint.** Pre-1.0 the greenfield posture adds columns INLINE to `001_initial.sql` WITHOUT bumping a version (see [`versioning.md` §Pre-1.0](./versioning.md#pre-10)). A DB within the same `major.minor` but with an older inline schema would otherwise pass the version axis and then fail as a runtime "no such column" error. To close that gap, the implementation computes a **schema fingerprint** = sha256 over the concatenated migration DDL (`NNN_*.sql` files, in sorted order) and persists it to `scan_meta.schema_fingerprint` at persist time. A write-side open recomputes the fingerprint from the bundled migrations and compares:
|
|
500
500
|
|
|
501
501
|
- **Stored fingerprint equals the recomputed one**: compatible.
|
|
502
502
|
- **Stored fingerprint differs from the recomputed one**: drifted. Any inline edit to a migration file changes the fingerprint and trips this axis independently of the version axis.
|
|
503
|
-
- **Stored fingerprint is NULL** (a DB written by a pre-fingerprint CLI, or whose `schema_fingerprint` column does not exist): drifted.
|
|
503
|
+
- **Stored fingerprint is NULL** (a DB written by a pre-fingerprint CLI, or whose `schema_fingerprint` column does not exist): drifted. Forces a one-time rebuild on upgrade so the very column that detects drift gets provisioned.
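The fingerprint axis can be illustrated as follows. The migrations are passed as a mapping from file name to DDL text, which is a simplification; the exact bytes the real implementation hashes (encoding, separators between files) are not given in the source, so UTF-8 with no separator is an assumption, and both function names are hypothetical.

```python
import hashlib
from typing import Mapping, Optional


def schema_fingerprint(migrations: Mapping[str, str]) -> str:
    """sha256 over the migration DDL, concatenated in sorted file-name order.

    ASSUMPTION: files are hashed as UTF-8 text with no separator between them.
    """
    digest = hashlib.sha256()
    for name in sorted(migrations):
        digest.update(migrations[name].encode("utf-8"))
    return digest.hexdigest()


def fingerprint_drifted(stored: Optional[str], migrations: Mapping[str, str]) -> bool:
    """Compare the stored fingerprint with one recomputed from bundled migrations.

    A NULL stored fingerprint (None here) counts as drift. The caller is
    expected to have already checked that a scan_meta row exists.
    """
    if stored is None:
        return True
    return stored != schema_fingerprint(migrations)
```

Any inline edit to a migration file changes the digest, so the comparison trips even when the CLI version is unchanged.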
|
|
504
504
|
|
|
505
|
-
When **either axis** reports drift, the entire DB file (plus its `-wal` / `-shm` sidecars) is deleted and recreated from the current migrations; the scan then repopulates it. No backup is written (the cache is derived). `state_*` and `config_*` are wiped along with `scan_*`; pre-1.0 they are
|
|
505
|
+
When **either axis** reports drift, the entire DB file (plus its `-wal` / `-shm` sidecars) is deleted and recreated from the current migrations; the scan then repopulates it. No backup is written (the cache is derived). `state_*` and `config_*` are wiped along with `scan_*`; pre-1.0 they are transient. `.sm` sidecars are never touched. The drift message names the reason (version skew vs schema fingerprint).
|
|
506
506
|
|
|
507
|
-
A DB that was never scanned (no `scan_meta` row) is **not** drift:
|
|
507
|
+
A DB that was never scanned (no `scan_meta` row) is **not** drift: no recorded version, no recorded fingerprint, no signal. The open proceeds untouched (the next scan writes both fields). Reading the stored fingerprint is defensive: a missing `scan_meta` table and a missing `schema_fingerprint` column are both tolerated (column-absent maps to NULL, i.e. drift; row-absent maps to no-signal).
|
|
508
508
|
|
|
509
509
|
The rebuild is confirmed interactively on a TTY (`sm scan`, and `sm serve` before it starts listening) unless `--yes` is passed; non-interactive callers (piped stdin, CI, the BFF scan route, the watcher) rebuild without prompting. Declining the prompt aborts (exit `2`) without deleting anything.
|
|
510
510
|
|
|
511
|
-
Read-side verbs (`sm check`, `sm list`, `sm show`, `GET /api/*`) do NOT rebuild. They surface a prominent advisory (warn on an older DB or a fingerprint mismatch, refuse on a newer or different-major DB) so a read never silently discards the cache
|
|
511
|
+
Read-side verbs (`sm check`, `sm list`, `sm show`, `GET /api/*`) do NOT rebuild. They surface a prominent advisory (warn on an older DB or a fingerprint mismatch, refuse on a newer or different-major DB) so a read never silently discards the cache nor crashes cryptically on a missing column. The advisory points at `sm scan` (rebuild on the next write) or `sm db reset`.
|
|
512
512
|
|
|
513
513
|
This is a pre-1.0 affordance. The first `1.0.0` replaces it with real up-only migrations (see §Migrations): drift detection by version / fingerprint becomes drift repair by migration, and `state_*` / `config_*` stop being disposable.
|
|
514
514
|
|
|
@@ -544,13 +544,13 @@ The kernel MUST enforce all three layers **in this exact order** for every plugi
|
|
|
544
544
|
- `DROP` / `ALTER` / `TRUNCATE` against anything outside the plugin's own logical table names.
|
|
545
545
|
- `ATTACH DATABASE` statements.
|
|
546
546
|
- Global `PRAGMA` statements (anything not scoped to a plugin-owned table).
|
|
547
|
-
|
|
548
|
-
3. **Prefix injection (rewrite)**, the kernel rewrites the AST so every table name the plugin authored becomes `plugin_<normalizedId>_<originalName>` if
|
|
549
|
-
4. **Scoped connection (runtime)**, at runtime
|
|
547
|
+
Validation runs **before** prefix injection so kernel tables are named as the plugin wrote them, making the reject test straightforward.
|
|
548
|
+
3. **Prefix injection (rewrite)**, the kernel rewrites the AST so every table name the plugin authored becomes `plugin_<normalizedId>_<originalName>` if not already prefixed. Index and constraint names get the same treatment. A plugin CANNOT create un-prefixed tables.
|
|
549
|
+
4. **Scoped connection (runtime)**, at runtime the plugin receives a `Database` wrapper (not a raw handle). The wrapper rejects any query touching tables whose name doesn't start with this plugin's prefix. Last-line defense: even if a migration-time layer were bypassed, runtime queries still cannot reach out-of-namespace data.
|
|
550
550
|
|
|
551
551
|
Step 4 is separate from 1–3 because it applies at query time, not migration time. Together the four steps form the "triple protection" referenced across the spec (the name predates the explicit parse step).
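The naming side of steps 3 and 4 can be sketched as below. This only shows the prefix rule and the runtime namespace check on table names; the kernel itself rewrites a parsed AST, which is not reproduced here. The normalisation rule for plugin ids is not shown in this section, so `normalize_plugin_id` is a guess, and all three function names are hypothetical.

```python
import re


def normalize_plugin_id(plugin_id: str) -> str:
    # ASSUMPTION: lowercase, with runs of non-alphanumerics collapsed to "_".
    return re.sub(r"[^a-z0-9]+", "_", plugin_id.lower()).strip("_")


def prefixed_table(plugin_id: str, table: str) -> str:
    """Apply plugin_<normalizedId>_<originalName> unless already prefixed."""
    prefix = f"plugin_{normalize_plugin_id(plugin_id)}_"
    return table if table.startswith(prefix) else prefix + table


def assert_in_namespace(plugin_id: str, tables: list) -> None:
    """Runtime guard: reject any query touching a table outside the plugin's prefix."""
    prefix = f"plugin_{normalize_plugin_id(plugin_id)}_"
    for table in tables:
        if not table.startswith(prefix):
            raise PermissionError(f"table {table!r} is outside namespace {prefix!r}")
```

Applying `prefixed_table` twice yields the same name, which matches the "if not already prefixed" wording above.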
|
|
552
552
|
|
|
553
|
-
|
|
553
|
+
Note: plugins are user-placed code. Protection guards against accidents (a plugin that mistakenly names a table `state_jobs`), not hostile plugins. A malicious plugin in the same process can bypass any JS-level guard. Post-v1.0 evaluates sandboxing (worker threads, VM contexts) and/or signing.
|
|
554
554
|
|
|
555
555
|
---
|
|
556
556
|
|
|
@@ -561,13 +561,13 @@ Honest note: plugins are user-placed code. Protection guards against accidents (
|
|
|
561
561
|
- Auto-backup before migrations: `.skill-map/backups/skill-map-pre-migrate-v<N>.db`.
|
|
562
562
|
- `sm db restore <path>` swaps the current DB with the supplied file. Interactive confirmation required unless `--force`.
|
|
563
563
|
|
|
564
|
-
Backups include `state_*` + `config_*` only; `scan_*` is regenerated after restore
|
|
564
|
+
Backups include `state_*` + `config_*` only; `scan_*` is regenerated after restore via `sm scan`.
|
|
565
565
|
|
|
566
566
|
---
|
|
567
567
|
|
|
568
568
|
## Rename detection (automatic)
|
|
569
569
|
|
|
570
|
-
`scan_nodes.path` is the canonical node identifier in v0. Moving a file
|
|
570
|
+
`scan_nodes.path` is the canonical node identifier in v0. Moving a file rewrites the primary key, which would orphan every `state_*` row referencing the old path (`state_executions.node_ids_json`, `state_jobs.node_id`, `state_summaries.node_id`, `state_enrichments.node_id`, `state_plugin_kvs.node_id`, `state_node_favorites.node_path`).
|
|
571
571
|
|
|
572
572
|
Implementations MUST apply a rename heuristic at scan time **before** committing the new scan transaction:
|
|
573
573
|
|
|
@@ -575,20 +575,20 @@ Implementations MUST apply a rename heuristic at scan time **before** committing
|
|
|
575
575
|
2. For each pair `(deletedPath, newPath)` where `newPath.bodyHash == deletedPath.bodyHash` → classify as **high-confidence rename**. The kernel MUST:
|
|
576
576
|
- Update every `state_*` row whose `node_id` equals `deletedPath` to reference `newPath`.
|
|
577
577
|
- Emit no issue. Log at `info` level.
|
|
578
|
-
3. Remaining pairs where `newPath.frontmatterHash == deletedPath.frontmatterHash` (body differs, frontmatter
|
|
578
|
+
3. Remaining pairs where `newPath.frontmatterHash == deletedPath.frontmatterHash` (body differs, frontmatter a perfect match) → classify as **medium-confidence rename**. The kernel MUST:
|
|
579
579
|
- Apply the same FK migration.
|
|
580
580
|
- Emit an issue with `analyzerId: auto-rename-medium` (severity `warn`) pointing to both paths. The issue's `data` MUST include `{ from: <old.path>, to: <new.path>, confidence: "medium" }` so `sm orphans undo-rename <new.path>` can read the prior path without user input.
|
|
581
|
-
4. Any `deletedPath` left without a match after steps 2–3 becomes an **orphan**: the kernel emits an issue with `analyzerId: orphan` (severity `info`) and keeps the `state_*` rows referencing the dead path
|
|
582
|
-
- **Silenced exception**: the kernel skips the `orphan` issue when the `deletedPath` is currently filtered out of the scan by the active ignore-source (e.g. the user added
|
|
581
|
+
4. Any `deletedPath` left without a match after steps 2–3 becomes an **orphan**: the kernel emits an issue with `analyzerId: orphan` (severity `info`) and keeps the `state_*` rows referencing the dead path until the user runs `sm orphans reconcile <dead.path> --to <new.path>` or accepts the orphan.
|
|
582
|
+
- **Silenced exception**: the kernel skips the `orphan` issue when the `deletedPath` is currently filtered out of the scan by the active ignore-source (e.g. the user added a `.skillmapignore` entry between scans and the file still exists on disk). The intent is "hide from the graph", not "lost without a rename"; an `orphan` info would pollute `sm check` with noise the user asked for. The reference impl threads a `silenced(path): boolean` predicate from the orchestrator into the rename heuristic; callers that do not supply one preserve the previous "always emit" behaviour. The `state_*` rows are still kept; removing the ignore entry re-enters the path as a live node, transparent to history.
|
|
583
583
|
|
|
584
|
-
Matching is 1-to-1: once a `newPath` is claimed as the rename target of some `deletedPath`, no other deletion can match it in the same scan. Ambiguity (two deletions share a body hash with the same new path) → fall back to the orphan path for all candidates, with issue `auto-rename-ambiguous` listing every conflict. `auto-rename-ambiguous` issues MUST populate `data` with `{ to: <new.path>, candidates: [<old.path.a>, <old.path.b>, ...] }`;
|
|
584
|
+
Matching is 1-to-1: once a `newPath` is claimed as the rename target of some `deletedPath`, no other deletion can match it in the same scan. Ambiguity (two deletions share a body hash with the same new path) → fall back to the orphan path for all candidates, with issue `auto-rename-ambiguous` listing every conflict. `auto-rename-ambiguous` issues MUST populate `data` with `{ to: <new.path>, candidates: [<old.path.a>, <old.path.b>, ...] }`; here `sm orphans undo-rename` requires the user to pass `--from <old.path>` to disambiguate.
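The classification steps and the 1-to-1 rule can be sketched as a pure function over the deleted and newly seen nodes. This is one reading of the text, not the reference implementation: applying the ambiguity rule to the frontmatter pass as well as the body pass, emitting no separate `orphan` issue for ambiguous candidates, and the `data` shape of the `orphan` issue are all assumptions. `NodeHashes` and `classify_renames` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class NodeHashes:
    path: str
    body_hash: str
    frontmatter_hash: str


def classify_renames(
    deleted: List[NodeHashes],
    added: List[NodeHashes],
    silenced: Callable[[str], bool] = lambda path: False,
) -> Tuple[List[Tuple[str, str, str]], List[dict]]:
    """Return (renames, issues); each rename is (old_path, new_path, confidence)."""
    renames: List[Tuple[str, str, str]] = []
    issues: List[dict] = []
    open_deleted: Dict[str, NodeHashes] = {d.path: d for d in deleted}
    open_added: Dict[str, NodeHashes] = {a.path: a for a in added}

    # Body-hash matches first (high), then frontmatter-hash matches (medium).
    for field, confidence in (("body_hash", "high"), ("frontmatter_hash", "medium")):
        for new in sorted(open_added.values(), key=lambda n: n.path):
            matches = sorted(
                d.path
                for d in open_deleted.values()
                if getattr(d, field) == getattr(new, field)
            )
            if not matches:
                continue
            del open_added[new.path]  # 1-to-1: this target cannot be matched again
            for old_path in matches:
                del open_deleted[old_path]
            if len(matches) > 1:
                # Ambiguous: no rename; candidates keep their state rows.
                issues.append({
                    "analyzerId": "auto-rename-ambiguous",
                    "data": {"to": new.path, "candidates": matches},
                })
                continue
            renames.append((matches[0], new.path, confidence))
            if confidence == "medium":
                issues.append({
                    "analyzerId": "auto-rename-medium",
                    "severity": "warn",
                    "data": {"from": matches[0], "to": new.path, "confidence": "medium"},
                })

    # Whatever is left is an orphan, unless the active ignore-source silences it.
    for old_path in sorted(open_deleted):
        if not silenced(old_path):
            issues.append({
                "analyzerId": "orphan",
                "severity": "info",
                "data": {"path": old_path},  # ASSUMPTION: hypothetical data shape
            })
    return renames, issues
```

The caller would then apply each returned rename to the `state_*` rows and persist the issues inside the same scan transaction.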
|
|
585
585
|
|
|
586
|
-
|
|
586
|
+
Casing: `bodyHash` / `frontmatterHash` / `analyzerId` / `data` are the domain-object field names (per `node.schema.json` and `issue.schema.json`); the SQLite reference impl stores the same values in `body_hash` / `frontmatter_hash` / `analyzer_id` / `data_json` columns, and the storage adapter bridges the two (see §Naming conventions above). The heuristic is specified against the domain types, not the columns.
|
|
587
587
|
|
|
588
588
|
The heuristic runs inside the scan transaction, so either all renames land or none do. `sm scan` is the only surface that triggers automatic rename detection. Two manual verbs exist for cases the heuristic missed or got wrong:
|
|
589
589
|
|
|
590
590
|
- `sm orphans reconcile <orphan.path> --to <new.path>`, forward direction. Attaches FKs of an orphan to a live node. Use when the heuristic could not match (semantic rename, body rewrite).
|
|
591
|
-
- `sm orphans undo-rename <new.path>`, reverse direction. Reads `issue.data.from` from the active `auto-rename-medium` (or `--from`-disambiguated `auto-rename-ambiguous`) issue on `<new.path>`, migrates `state_*` FKs back, and resolves the issue. The prior path becomes an `orphan`. Use when the heuristic matched two unrelated files
|
|
591
|
+
- `sm orphans undo-rename <new.path>`, reverse direction. Reads `issue.data.from` from the active `auto-rename-medium` (or `--from`-disambiguated `auto-rename-ambiguous`) issue on `<new.path>`, migrates `state_*` FKs back, and resolves the issue. The prior path becomes an `orphan`. Use when the heuristic matched two unrelated files sharing a frontmatter hash.
|
|
592
592
|
|
|
593
593
|
Both verbs operate on FK ownership only; neither edits files on disk.
|
|
594
594
|
|
|
@@ -602,7 +602,7 @@ Both verbs operate on FK ownership only; neither edits files on disk.
|
|
|
602
602
|
- `PRAGMA quick_check` (or equivalent) returns OK.
|
|
603
603
|
- Applied migration version matches code-bundled migrations.
|
|
604
604
|
- No `state_jobs` rows whose `content_hash` is missing from `state_job_contents` (corrupt state, the content row was deleted out from under a live job).
|
|
605
|
-
- No `state_job_contents` rows whose `content_hash` is referenced by zero `state_jobs` rows (GC stragglers
|
|
605
|
+
- No `state_job_contents` rows whose `content_hash` is referenced by zero `state_jobs` rows (GC stragglers `sm job prune` should have collected).
|
|
606
606
|
- No plugin in `load-error` or `incompatible-spec` status.
|
|
607
607
|
|
|
608
608
|
Failures are reported with suggested remediation (e.g., "run `sm db migrate`", "run `sm job prune`").
|
package/index.json
CHANGED
|
@@ -174,15 +174,15 @@
|
|
|
174
174
|
}
|
|
175
175
|
]
|
|
176
176
|
},
|
|
177
|
-
"specPackageVersion": "0.
|
|
177
|
+
"specPackageVersion": "0.55.0",
|
|
178
178
|
"integrity": {
|
|
179
179
|
"algorithm": "sha256",
|
|
180
180
|
"files": {
|
|
181
|
-
"CHANGELOG.md": "
|
|
182
|
-
"README.md": "
|
|
183
|
-
"architecture.md": "
|
|
184
|
-
"cli-contract.md": "
|
|
185
|
-
"conformance/README.md": "
|
|
181
|
+
"CHANGELOG.md": "cfaa5ab3c07175903f3334274ad87b257b1df1e7a565468e1bc47fb4583a9b6b",
|
|
182
|
+
"README.md": "a790cd010b46d47883d1f37e3893cea9d7aa69ec4750c0202e6a0c99991e7980",
|
|
183
|
+
"architecture.md": "062127380199b20c918359212a2b696195d3f142b6184297900db953be73b308",
|
|
184
|
+
"cli-contract.md": "bfcc200fb085270cb425bb8692be51f4adf48dce4eb483e0632a05a658470b5a",
|
|
185
|
+
"conformance/README.md": "dcbef7249f161acf597552a05dcadc813cd0ced430dcd3f813fcf5e1c876335d",
|
|
186
186
|
"conformance/cases/backtick-path-extraction.json": "4620e7f8bc161fc57cb44001e9d99879c7e22b4865a0c27a20dc28969cd936d9",
|
|
187
187
|
"conformance/cases/extractor-collision-detection.json": "179a02c61892f0d26492de0c4e2c327fa6b4986d1265a8f119e871df6afe4658",
|
|
188
188
|
"conformance/cases/extractor-emits-signal.json": "0115c7bb62a7a705f72e9d8048b3f0396e5caaeb3d04dea204415e279e58479d",
|
|
@@ -195,7 +195,7 @@
|
|
|
195
195
|
"conformance/cases/view-action-button.json": "51331f725be1c3655351f8fca6fc9d3d301ae68ea1741ff6c79998332ba2dfeb",
|
|
196
196
|
"conformance/cases/view-contribution-payloads.json": "e8f54ed62e64a2a0f86729866e507abb1f4246683f0e60d538280536f7cd3ecc",
|
|
197
197
|
"conformance/cases/view-slots-all.json": "05284e0324dd2da72b6b21d397c11b355802229a68053e9dddc323f69b3a1eba",
|
|
198
|
-
"conformance/coverage.md": "
|
|
198
|
+
"conformance/coverage.md": "695fb4082eda222bfe746ff6dc1db4634773b086a6eef6706994cb878794fdbc",
|
|
199
199
|
"conformance/fixtures/backtick-path/docs/target.md": "a09ae2cb4c96358a2e0692215f172b0f8c48028b6b123e4e83424b28302e644c",
|
|
200
200
|
"conformance/fixtures/backtick-path/source.md": "217f78b12b3ff47a938a5cc9c1ff7d6989d6a1db82bd1ddf3656787f31efb902",
|
|
201
201
|
"conformance/fixtures/orphan-markdown/.claude/agents/reviewer.md": "7f062731106f2d9811e4fffcf6ab44b8dfff4cfb16536a469514cc0664e832bf",
|
|
@@ -229,20 +229,21 @@
|
|
|
229
229
|
"conformance/fixtures/view-contribution-payloads/notes/example.md": "312b1919cd7fd0f233648b053acfb2975662ede3c65dd391cc508204b67ad6fb",
|
|
230
230
|
"conformance/fixtures/view-slots-all/.skill-map/plugins/all-slots/analyzers/everything/index.js": "ea0022fec7f0fd5a26ba12db1310335f434f2f820682206a3a9542d98db0d346",
|
|
231
231
|
"conformance/fixtures/view-slots-all/.skill-map/plugins/all-slots/plugin.json": "c48e8a0574947ade0b4eb189d6bc27a48e24f92f616aacdc177f2d22d472a599",
|
|
232
|
-
"db-schema.md": "
|
|
233
|
-
"interfaces/security-scanner.md": "
|
|
234
|
-
"job-events.md": "
|
|
235
|
-
"job-lifecycle.md": "
|
|
236
|
-
"plugin-author-guide.md": "
|
|
237
|
-
"plugin-kv-api.md": "
|
|
238
|
-
"
|
|
232
|
+
"db-schema.md": "b350582ffb6c0f6612697e467873c9199731c8000c590759578dc0c16c724831",
|
|
233
|
+
"interfaces/security-scanner.md": "0996dd782e2d39d4791f2e290da4bb1a68a5b30c1f79187977188ec8e3fe6ef2",
|
|
234
|
+
"job-events.md": "2c7017f5f0003b19653424111a07043487173cbe88b51e961598bb1693987059",
|
|
235
|
+
"job-lifecycle.md": "ce33bc8bb5090ea183f860e495bfccc2a4a0ac2e23f6ebad83b9c28aad59124e",
|
|
236
|
+
"plugin-author-guide.md": "1eb52e9eadb1c196b0c2d749b9e5a74d32a048530f794f6419f6d2407622a43c",
|
|
237
|
+
"plugin-kv-api.md": "5e095581020043af73ff028e272f56d42ca9eb6e506dd777d45703f9db796a5b",
|
|
238
|
+
"plugin-quickstart.md": "19092b278d80df357ea623dc3bd9f833d059582ee1356f317621913d91e50512",
|
|
239
|
+
"prompt-preamble.md": "5d0f836688aa23eafc32104c3174132340b268361f6060326eec84da17c6ad6d",
|
|
239
240
|
"schemas/annotations.schema.json": "09fcebc86e3b793bf9f03a35b38e5ca2a08d79ac3504f6f03895ac2ae1c2aded",
|
|
240
241
|
"schemas/api/rest-envelope.schema.json": "8eeb1c2d79fb69eaef23737a2231d48d67e59b8b19aad816239ab4680e2c4752",
|
|
241
242
|
"schemas/bump-report.schema.json": "c763e1f89f2665c479d6a4985c1d324c65e5278331ebab82220287a07e4c4429",
|
|
242
243
|
"schemas/conformance-case.schema.json": "958b316d646d0c64a715a7a28cee66d2c2d2498a60dbfc5ae8970687c2a96954",
|
|
243
244
|
"schemas/conformance-result.schema.json": "14f983a8f4e62cd4ff964688c9b2b026a3bee3a0b762b64091c8c34db5b75777",
|
|
244
245
|
"schemas/execution-record.schema.json": "db0eb16153493ad9f13ea0ecede44191e4a8536979adffd17ca278ddf8786c77",
|
|
245
|
-
"schemas/extensions/action.schema.json": "
|
|
246
|
+
"schemas/extensions/action.schema.json": "b0f3b2cf49b4c62615d128293699f4666c0e638b7e522afac00a8110dc43c947",
|
|
246
247
|
"schemas/extensions/analyzer.schema.json": "5ab80cba46f4ca6ca78bd9484cc1b46d949d77142b5a8864dc09af5e406908e0",
|
|
247
248
|
"schemas/extensions/base.schema.json": "ec4cef21bc5d493c4d60ae3208c5e15364b02176f5f32bb00bbd62e9578befdc",
|
|
248
249
|
"schemas/extensions/extractor.schema.json": "ee44bf562b19318c93116c574a811857cdef1f4119326a9a604fa408889dd230",
|
|
@@ -259,7 +260,7 @@
|
|
|
259
260
|
"schemas/node.schema.json": "1ebba38e0c0ae022fccbc0cdf7c298da1720a68d4cb375f0baf9f0847998a0d8",
|
|
260
261
|
"schemas/plugins-doctor.schema.json": "03e2dc51c052a09bf0198c80e2c26e6129734ada4a748e483245de3dd8576c42",
|
|
261
262
|
"schemas/plugins-registry.schema.json": "211d081691fc83526e1593c79ed9741ad8a5dbd4db1a756f72141b0cced2ea15",
|
|
262
|
-
"schemas/project-config.schema.json": "
|
|
263
|
+
"schemas/project-config.schema.json": "2241e7a542bc036ef82c5816cbd217b1a1e0a2ee5e8054e069325e89980c6e25",
|
|
263
264
|
"schemas/refresh-report.schema.json": "47184d4f6b15e9b7671dc178b3b3886a64422da198898508ecdb2cb27876db04",
|
|
264
265
|
"schemas/report-base-deterministic.schema.json": "59785fe6f3ceb34814bbbd03d10fa7336a32835ce598946f2923d469b32aa32a",
|
|
265
266
|
"schemas/report-base.schema.json": "e4d25f055e24f18ae0f77c24661c1bddc87ff2e43b001b6a827fcb14f9753f44",
|
|
@@ -273,8 +274,8 @@
|
|
|
273
274
|
"schemas/summaries/skill.schema.json": "85d68056054bade62391948cc038fcfa70cdcf465e2b295f69cd520bbdba0134",
|
|
274
275
|
"schemas/user-settings.schema.json": "d155552ffca9c7dd4c6e31398aff4950dd9721d2a1f4b308cf0fe33000ca31b5",
|
|
275
276
|
"schemas/view-slots.schema.json": "886487a1f38fd7e4270fa6213653664c0cf906043e8aa9e832017149932bf6a2",
|
|
276
|
-
"telemetry.md": "
|
|
277
|
-
"versioning.md": "
|
|
277
|
+
"telemetry.md": "9c80bd679d7f4e3d1813d54be8547d0494a369d25619123367c0cfca2ccc75fd",
|
|
278
|
+
"versioning.md": "142c9cd4521e001216114d477b635d8dff6e38a9a33105cef814b6ca2d2eecf2"
|
|
278
279
|
}
|
|
279
280
|
}
|
|
280
281
|
}
|