@skill-map/spec 0.34.0 → 0.35.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,121 @@
1
1
  # Spec changelog
2
2
 
3
+ ## 0.35.0
4
+
5
+ ### Minor Changes
6
+
7
+ - de68f09: Soft-warn drift detection for the active provider lens. When `activeProvider` is set (whether by auto-detect on first scan, the interactive prompt for ambiguous markers, or `sm config set activeProvider <id>`), the runtime now persists the set of provider markers that existed on disk at the moment of the choice as `activeProviderMarkers` in `.skill-map/settings.json`. On every subsequent scan the bootstrap re-detects markers and diffs against this snapshot; when the diff is non-empty (new markers appeared, recorded markers disappeared), it emits ONE soft warn before the scan and continues with the cached lens.
8
+
9
+ **Motivation.** Today `activeProvider` wins silently forever, even when the project grows a new provider directory (e.g. adds `.codex/` after the choice was made under `claude`) or loses one (`.cursor/` deleted in a cleanup). The operator should at least notice. The friction of a soft warn is right: it surfaces the drift, points at the fix (`sm config set activeProvider <id>`), and gets out of the way. The warn is informational and never blocks the scan.
10
+
11
+ **Spec.** `spec/schemas/project-config.schema.json` declares the new optional `activeProviderMarkers` string array as internal-state, NOT normally hand-edited. `spec/architecture.md` §"Active-lens drift detection" documents the snapshot + diff + soft-warn contract.
12
+
13
+ **Backfill.** Legacy projects (existing `activeProvider` without a snapshot) lazily backfill on the next scan: the runtime writes the current detected set as the snapshot and stays silent (there is nothing to compare against the first time), so the warn only fires when markers actually drift relative to a known-good snapshot.
14
+
15
+ **Atomicity.** The two writes (`activeProvider` + `activeProviderMarkers`) go through the same `writeConfigValue` helper as every other config mutation; each is atomic on its own, but the pair is not transactional. A failure between the two writes leaves the file in the legacy "lens but no snapshot" shape, which the lazy backfill handles cleanly on the next scan.
16
+
17
+ **Tests.** `src/core/runtime/__tests__/active-provider-bootstrap-drift.spec.ts` covers snapshot persistence on auto-detect (single + ambiguous picks), drift detection from config (no-drift, added marker, removed marker, both-direction drift), legacy backfill, snapshot stickiness on repeat drift, and the `style.warnGlyph` / `style.dim` plumbing. `src/cli/commands/__tests__/config-cli.spec.ts` adds two cases for `sm config set activeProvider`: snapshot refresh on set, and full-set capture (not just the picked id).
18
+
19
+ Pre-1.0 minor per `spec/versioning.md`: additive optional field on the project-config schema (`@skill-map/spec`) plus an additive runtime behaviour on `@skill-map/cli`. No removed surface.
20
+
21
+ ## User-facing
22
+
23
+ `sm scan` now warns once when provider markers on disk have drifted since `activeProvider` was set (e.g. you added `.codex/` after picking the `claude` lens). Run `sm config set activeProvider <id>` to switch the lens, or ignore the warn and keep going; it never blocks the scan.
24
+
25
+ - a58989f: Lens-gated classification for vendor providers. Vendor Providers (`claude`, `openai`, `antigravity`) now opt into being gated by the active lens via a new `gatedByActiveLens: true` field on their manifest. The walker (`src/kernel/orchestrator/walk.ts`) pre-filters `opts.providers` before the walk loop: a gated Provider runs only when `provider.id === opts.activeProvider`, so vendor providers no longer attempt to classify files outside their lens. Universal providers (`core/markdown`, future `agent-skills` open standard) leave the flag absent / `false` and run unconditionally.
26
+
27
+ **Motivation.** The real runtimes never cross-read each other's on-disk formats: Claude Code does not consume `.codex/`, Codex CLI does not consume `.claude/`, Antigravity has no on-disk kind beyond the open standard yet. Offering every file to every provider during classification fabricated cross-vendor graph edges the runtimes themselves reject: the operator saw `openai/agent` nodes for `.codex/agents/*.toml` in a `claude`-lensed project even though Claude Code would never resolve them. The pre-filter in the walker is the cheap path: a gated-off Provider does NOT walk its territory at all, so there is no per-file cost.
28
+
29
+ **Spec.** `spec/schemas/extensions/provider.schema.json` mirrors the new optional boolean field with the full normative description (vendor MUST opt in, universal SHOULD omit, `null` lens bypasses the gate). The matching prose lives in `spec/architecture.md` §"Active-lens scope for providers (classification gate)" (landing alongside drift-detection in a follow-up commit; the changeset for that commit owns the architecture.md prose bump).
30
+
31
+ **`null` lens semantics.** When `activeProvider === null` (a project with no provider markers, no setting), the walker bypasses the gate entirely and every Provider runs. This matches the extractor-side fallback for unlensed projects: a plain-markdown repo keeps classifying with every Provider, no gates fire.
32
+
33
+ **Backward compatibility.** Providers without the field default to `gatedByActiveLens === undefined ≡ false`, the universal behaviour. Existing third-party providers keep working unchanged; only providers that explicitly opt in change classification semantics.
34
+
35
+ **Tests.** `src/kernel/orchestrator/__tests__/walk-lens-gate.spec.ts` covers the walker filter at the unit level (3 cases: claude lens excludes openai territory, openai lens excludes claude territory, `null` lens admits both). `src/__tests__/integration/lens-gated-classification.spec.ts` covers the end-to-end shape across a 4-file fixture per lens (2 cases).
36
+
37
+ Pre-1.0 minor per `spec/versioning.md`: additive optional field on the Provider manifest schema (`@skill-map/spec`) plus an additive walker behaviour change on `@skill-map/cli`. No removed surface, no breaking change for universal providers.
38
+
39
+ ## User-facing
40
+
41
+ Cross-provider files (e.g. a `.codex/agents/*.toml` while the lens is `claude`) are no longer claimed by the foreign provider. They surface as plain markdown / unclassified instead, matching how the agent itself would see them at runtime.
42
+
43
+ - d207cfa: Observable link analysis. The link-matrix walkthrough surfaced a recurring complaint, "the inspector tells me there is an edge but not where, why, or whether it overlaps with another", and a small cluster of detection bugs that were hiding real problems and inventing fake ones. This changeset is the drain pass.
44
+
45
+ **Kernel domain shape, additive.** Three new fields on `Link` / `Node`:
46
+
47
+ - `Link.occurrences[]` (`LinkOccurrence` = `{ extractor, originalTrigger, location? }`) accumulates every syntactic site in the source body that contributed to an edge. Populated by extractors at emit time, concatenated by `dedupeLinks` across extractor merges (with `(extractor, originalTrigger, line)` dedup inside the array to defend against double-emit). Frontmatter / sidecar-derived synthetic links carry it empty.
48
+ - `Link.resolvedTarget` is the node path the post-walk `liftResolvedLinkConfidence` transform bound the link to. Equal to `target` for path-style links; differs for trigger-style links (`@foo`, `/cmd`) where `target` keeps the authored trigger and `resolvedTarget` carries the resolved node path. `null` when unresolved (broken).
49
+ - `Node.externalRefs[]` (`IExternalRef` = `{ url, line?, originalTrigger? }`) is the list of distinct http(s) URLs the body references, in extractor-order, deduped by normalised URL. Populated by `recomputeExternalRefsCount` (renamed in role from "count-only" to "count + list"); the denormalised `externalRefsCount` rides alongside and must equal the array length when both are present.
50
+
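The occurrence merge described for `Link.occurrences[]` can be sketched roughly as follows. This is a minimal illustration, not the actual `dedupeLinks` code: the `mergeOccurrences` helper and its signature are hypothetical, and only the `LinkOccurrence` field names and the `(extractor, originalTrigger, line)` dedup key come from the text above.

```typescript
// Shape taken from the changelog text; the real type may carry more fields.
interface LinkOccurrence {
  extractor: string;
  originalTrigger: string;
  location?: { line: number };
}

// Hypothetical helper: concatenates occurrence lists in order and drops
// entries whose (extractor, originalTrigger, line) triple was already seen.
function mergeOccurrences(lists: LinkOccurrence[][]): LinkOccurrence[] {
  const seen: Record<string, boolean> = {};
  const merged: LinkOccurrence[] = [];
  for (const list of lists) {
    for (const occ of list) {
      const key = JSON.stringify([
        occ.extractor,
        occ.originalTrigger,
        occ.location?.line ?? null,
      ]);
      if (seen[key]) continue; // defends against double-emit
      seen[key] = true;
      merged.push(occ);
    }
  }
  return merged;
}

const mergedOccurrences = mergeOccurrences([
  [{ extractor: "at-directive", originalTrigger: "@foo", location: { line: 3 } }],
  [
    { extractor: "at-directive", originalTrigger: "@foo", location: { line: 3 } },
    { extractor: "markdown-link", originalTrigger: "[foo](foo.md)", location: { line: 7 } },
  ],
]);
```

In this sketch the duplicate `@foo` occurrence on line 3 is kept once, while the markdown-link occurrence on line 7 is appended as a separate site.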
51
+ All three exported from `src/kernel/index.ts`; matching JSON-Schema additions in `spec/schemas/link.schema.json` and `spec/schemas/node.schema.json` (additive, `additionalProperties: false` preserved); `spec/index.json` regenerated.
52
+
53
+ **SQL, edited in place (greenfield rule).** `src/migrations/001_initial.sql` gains three columns: `scan_links.occurrences_json`, `scan_links.resolved_target`, `scan_nodes.external_refs_json`; one new index `ix_scan_links_resolved_target`. Matching types in `src/kernel/adapters/sqlite/schema.ts`; `linkToRow` / `rowToLink` / `nodeToRow` / `rowToNode` round-trip the new columns (round-trip tests already cover the shape).
54
+
55
+ **Two new analyzers.**
56
+
57
+ - `core/redundant-target-reference` flags `(source, resolved-target)` pairs reached via two or more syntactic surfaces, whether cross-extractor (same kind, multiple authored triggers) or cross-kind (multi-edge to one target). Walks `Link.occurrences[]` plus `Link.resolvedTarget` to detect the redundancy. Severity `warn`. Tests at `src/plugins/core/analyzers/redundant-target-reference/__tests__/redundant-target-reference.spec.ts`.
58
+ - `core/self-loop` flags links whose source is its own resolved target (a body heading like `# /deploy` inside the file that defines `/deploy`). Severity `warn`. The UI hides self-loops by default; this analyzer is the authoritative detector so the count is still visible in `sm scan` output and SARIF exports. Tests at `src/plugins/core/analyzers/self-loop/__tests__/self-loop.spec.ts`.
59
+
60
+ **Existing analyzer extended.** `core/reserved-name` now emits both target-side (the file shadowing a built-in, behaviour preserved) and source-side (one `warn` per link the lift downgraded to `RESERVED_TARGET_CONFIDENCE`). Source-side issues carry `data.target` matching the link so UIs can correlate per-row instead of bleeding "any issue on source" onto every outgoing edge.
61
+
62
+ **Extractor fixes.**
63
+
64
+ - `core/markdown-link` and `core/external-url-counter` now run their regex over `stripCodeBlocks(ctx.body)` instead of raw body, matching the guard `claude/at-directive` and `claude/slash` already had. Author-written examples like `[label](path)` or `https://example.com` inside backticks or fenced blocks stop emitting spurious `references` edges (which were feeding `core/broken-ref` false positives) and stop inflating the external-URL count. Three new test cases per extractor (inline-code, fenced, mixed).
65
+ - `claude/at-directive` and `claude/slash` extractors now track line numbers per occurrence (the `core/redundant-target-reference` analyzer needs every occurrence to know its line). Both compute `lineStarts` once per body via the new shared util `src/kernel/util/line-tracking.ts` (extracted from `markdown-link`'s previously-local helper) and attach `location: { line }` to every emit.
66
+
67
+ **BFF.** `/api/links?to=X` now matches via `target` OR `resolvedTarget`; the storage-layer companion in `getNodeBundle` does the same. Without this, a Claude `@real-agent` mention stayed invisible in the incoming list of `.claude/agents/real-agent.md` because the row's `target_path` carried the trigger, not the resolved path.
68
+
69
+ **UI overhaul, `LinkedNodesPanel`.**
70
+
71
+ - Numeric confidence value shown in the tag (previously qualitative `high` / `medium` / `low`). The tier survives as the tag's tooltip and severity colour, so `0.85` and `1.00` are now visually distinguishable on the same row.
72
+ - New "Findings" section at the top of the panel, lists every issue whose `nodeIds[]` includes the focused path.
73
+ - Inline issue chip per outgoing / incoming row. Correlation rules tightened: source-side issue with `data.target` matching the link's `target` / `resolvedTarget` / current path (the original "any issue on source" fallback bled unrelated `broken-ref` findings onto every row).
74
+ - Per-row "Occurs at:" sub-list when `link.occurrences.length > 0`, shows each line + original trigger + extractor id.
75
+ - New "External references" section above Findings when `node.externalRefs` is populated, with clickable URLs that open in a new tab.
76
+ - Self-loops hidden by default from outgoing + incoming via a client-side `isSelfLoop` filter. The `core/self-loop` analyzer remains the authoritative detector; the panel just respects it.
77
+ - Texts catalog (`linked-nodes-panel.texts.ts`) and CSS updated.
78
+ - `ui/src/models/api.ts` gained `ILinkOccurrenceApi`, `IExternalRefApi`, `Link.occurrences`, `Link.resolvedTarget`, `Node.externalRefs` shapes mirroring the kernel domain types.
79
+
80
+ **Plus an out-of-band AGENTS.md operating rule.** The new rule queues mid-execution user messages: do not abort an in-flight tool sequence to handle an interrupt unless the interrupt is an unambiguous abort verb. Lands in this commit because it surfaced during the same walkthrough.
81
+
82
+ Pre-1.0 minor on both workspaces per `spec/versioning.md` (additive shape changes, no breakage).
83
+
84
+ ## User-facing
85
+
86
+ **Inspector overhaul.** Links show numeric confidence, a Findings list, per-row issue chips, and per-site "Occurs at" lines. New "External references" section. Self-loops hidden by default. Two new analyzers flag redundant multi-form references and self-loops.
87
+
88
+ - 5a12e5c: Phase 2.D of the Signal IR migration: new `core/signal-collision` built-in analyzer surfaces resolver rejections as operator-visible `warn` issues. The analyzer reads `IAnalyzerContext.signals`, finds every Signal whose `resolution.outcome === 'rejected'`, and emits one issue per rejection naming the loser extractor + matched text + byte range, the winner extractor + range, and the tiebreak reason (`kind-priority` / `higher-confidence` / `longer-range` / `earlier-declaration`). Phase 4+ stubs (`extractorDisabled`, `belowFloor`) are handled with their own message templates so the surface stays forward-compatible.
89
+
90
+ Closes spec conformance coverage row 37 (`signal.schema.json`) with the two required cases:
91
+
92
+ - `extractor-emits-signal`: a body with a single `[text](path)` markdown link materialises as one Link via the Signal IR resolver path; `sources[0] === 'markdown-link'`.
93
+ - `signal-collision-detection`: a body with `[@./api.md](./api.md)` triggers a cross-extractor range overlap (markdown-link's range contains at-directive's range); markdown-link wins on confidence; the loser surfaces as exactly one `core/signal-collision` warn issue.
94
+
95
+ ## User-facing
96
+
97
+ `sm scan` now warns when two extractors detect overlapping byte ranges. The graph keeps the winner; the issue panel explains which detection lost and why, so a markdown link wrapping an `@`-directive no longer looks like intent silently disappearing.
98
+
99
+ - 3ca095b: Wire the Signal IR resolver end-to-end (Phase 2.A of the active-lens migration). The kernel's `resolveSignals` runs after extraction and before analysis: filters disabled extractors (Phase 4+ stub), ranks intra-Signal candidates via `IProvider.resolverRules.kindPriority` (when declared) + confidence + extractor declaration order, builds overlap clusters from body-scoped Signals sharing a source, picks a cluster winner per the four-step tiebreak chain (`kind-priority` -> `higher-confidence` -> `longer-range` -> `earlier-declaration`), materialises winners as Links indistinguishable from `emitLink`-emitted ones, and annotates each Signal's new `resolution` field with the outcome + reason. Rejected (losing) Signals remain accessible to analyzers via `IAnalyzerContext.signals` so a future `core/signal-collision` analyzer can surface them as `warn` issues naming WHO won and WHY.
100
+
101
+ Spec changes: `signal.schema.json` gains the `resolution` object property (outcome / winnerIndex / rejectedBy / phase 4+ stubs); `extensions/provider.schema.json` gains `resolverRules.kindPriority`; `architecture.md` §Resolver phase rewritten to reflect the wired contract; `conformance/coverage.md` row 37 flipped to in-progress.
102
+
103
+ Kernel changes: extend `Signal` type with `resolution?: ISignalResolution`; add `IResolverRules` + `IProvider.resolverRules`; rewrite `resolveSignals` (87-line first-candidate scaffold -> full algorithm); thread `signals` through `walkAndExtract` accumulators -> `runAnalyzers` -> per-analyzer context; export `isExternalUrlLink` for the caller's routing of materialised Links between internal / external arrays.
104
+
105
+ No extractor uses `emitSignal` yet (Phases 2.B and 2.C migrate them). With zero Signals emitted today the wiring is a no-op pass-through that returns empty arrays; 18 new resolver unit tests cover intra-Signal ranking, cross-Signal overlap, the four tiebreak reasons, kindPriority interaction, external-URL cluster skip, frontmatter / sidecar scope pass-through, and materialised Link shape parity.
106
+
107
+ ### Patch Changes
108
+
109
+ - 1362de9: Phase 2.B of the Signal IR migration: `claude/at-directive` extractor now routes through `ctx.emitSignal` instead of `ctx.emitLink`. Each `@<token>` match emits a single-candidate Signal carrying the byte range, scope (`body`), and a candidate with the same kind / target / confidence / trigger / rationale shape the extractor used to embed directly into a Link. The resolver phase materialises the winning candidate as a Link indistinguishable from the prior direct-emit shape, including `occurrences[]` round-tripping; full `pnpm validate` stays green with 1734 tests passing and zero behaviour change.
110
+
111
+ Why through Signals: byte ranges now flow into the kernel resolver, which unlocks cross-extractor range-overlap collision detection (a future `core/signal-collision` analyzer will surface losers as `warn` issues). The single-candidate shape keeps the migration narrow; multi-candidate emissions for cases of genuine intra-Signal ambiguity stay deferred until a real case demands it.
112
+
113
+ Spec: `signal.schema.json` gains an optional `range.line` field so extractors that already compute line tracking (via `computeLineStarts` / `lineFor`) thread the line number through to the materialised `Link.location.line` without the resolver re-walking the body.
114
+
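Line tracking of the kind mentioned above can be sketched like this. The names `computeLineStarts` and `lineFor` appear in the changelog, but the signatures below are assumptions, as are the 1-based line numbers and the use of string indices rather than raw byte offsets.

```typescript
// Assumed signature: returns the string index at which each line begins.
function computeLineStarts(body: string): number[] {
  const starts = [0];
  for (let i = 0; i < body.length; i++) {
    if (body.charCodeAt(i) === 10) starts.push(i + 1); // 10 is "\n"
  }
  return starts;
}

// Assumed signature: binary search for the last line start <= offset.
// Returns a 1-based line number (an assumption of this sketch).
function lineFor(lineStarts: readonly number[], offset: number): number {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (lineStarts[mid]! <= offset) lo = mid;
    else hi = mid - 1;
  }
  return lo + 1;
}

const sampleBody = "intro\n@foo is mentioned here\nlast line";
const sampleStarts = computeLineStarts(sampleBody);
const atDirectiveLine = lineFor(sampleStarts, sampleBody.indexOf("@foo"));
```

Computing the line starts once per body keeps each per-match lookup logarithmic, which is presumably why the extractors share the helper instead of rescanning the body for every match.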
115
+ Kernel: resolver's `materialise()` synthesises a one-entry `occurrences[]` from the winning candidate's trigger + range so multi-extractor `dedupeLinks` merges accumulate occurrences through the same code path as direct emissions. `extractorOrder` and `link.sources` now both use short extractor ids (e.g. `'at-directive'`) to match the cache layer's lookup contract.
116
+
117
+ Test harness: `src/plugins/core/extractors/__tests__/extractors.spec.ts` `extract()` helper auto-flushes Signals via the resolver so tests that assert on the resulting `links` array see identical shape regardless of whether the extractor went through `emitLink` directly or routed through `emitSignal`.
118
+
3
119
  ## 0.34.0
4
120
 
5
121
  ### Minor Changes
package/architecture.md CHANGED
@@ -65,6 +65,34 @@ The lens does NOT gate the universal extractors that ship under `core/` (markdow
65
65
 
66
66
  The gate is the active lens, not the node's provider. A `@handle` token in `CLAUDE.md` or `notes/todo.md` (files the `claude` provider disclaims to `core/markdown`) still gets parsed by `claude/at-directive` under the `claude` lens, because the runtime grammar is what the lens represents and the runtime reads markdown across the whole project, not only the files it owns. The earlier double-check ("node's provider matches AND the lens") silently dropped that surface; dropping the node side restores it. Cross-lens isolation is preserved by the lens half alone: under `openai`, claude extractors are silent on every node (including `.claude/*`), because the lens authorisation is missing. When `activeProvider` is `null` (no setting, no filesystem marker), provider-gated extractors are skipped uniformly.
67
67
 
68
+ ### Active-lens scope for providers (classification gate)
69
+
70
+ The active lens also gates **classification**. Each Provider declares `gatedByActiveLens` on its manifest (`extensions/provider.schema.json#/properties/gatedByActiveLens`, mirrored at `IProvider.gatedByActiveLens`). Vendor providers (`claude`, `openai`, `antigravity`) set this to `true`; their `classify()` only runs (and the walker only iterates their territory) when `provider.id === activeProvider`. Universal providers (the open-standard `agent-skills`, the markdown fallback `core/markdown`, any future format-based fallback) leave the flag `false` (the default) and run on every scan.
71
+
72
+ Filtering happens in `walkAndExtract` (kernel, `src/kernel/orchestrator/walk.ts`) at the provider-iteration level: a gated-off Provider does NOT walk its territory at all, the cheap path. The predicate is: include the Provider when `!gatedByActiveLens || activeProvider === null || provider.id === activeProvider`. The `null` branch is intentional: an unlensed project (no marker, no setting) keeps the walker permissive so every Provider participates, mirroring the matching extractor-side fallback.
73
+
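The predicate quoted above can be read as the following sketch. `ProviderLike` and `filterProvidersByLens` are hypothetical names for illustration; only the `gatedByActiveLens` field, the provider ids, and the boolean expression itself come from the text.

```typescript
// Hypothetical trimmed-down shape; the real IProvider carries more fields.
interface ProviderLike {
  id: string;
  gatedByActiveLens?: boolean; // absent is treated as false (universal)
}

// Mirrors the documented predicate:
// !gatedByActiveLens || activeProvider === null || provider.id === activeProvider
function isProviderInScope(provider: ProviderLike, activeProvider: string | null): boolean {
  return (
    !provider.gatedByActiveLens ||
    activeProvider === null ||
    provider.id === activeProvider
  );
}

// Hypothetical pre-filter applied before the walk loop.
function filterProvidersByLens<T extends ProviderLike>(
  providers: readonly T[],
  activeProvider: string | null,
): T[] {
  return providers.filter((p) => isProviderInScope(p, activeProvider));
}

const lensDemoProviders: ProviderLike[] = [
  { id: "claude", gatedByActiveLens: true },
  { id: "openai", gatedByActiveLens: true },
  { id: "core/markdown" }, // universal: flag omitted
];

const idsUnderClaude = filterProvidersByLens(lensDemoProviders, "claude").map((p) => p.id);
const idsUnlensed = filterProvidersByLens(lensDemoProviders, null).map((p) => p.id);
```

Under the `claude` lens this keeps `claude` and `core/markdown`; with a `null` lens all three providers pass, matching the permissive fallback described above.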
74
+ Consequence: under `activeProvider = 'claude'`, a `.codex/agents/foo.toml` file is not classified by the `openai` Provider (gated off); whether the file becomes a node depends on whether a universal Provider claims its extension. Today no universal claims `.toml`, so the file is silently absent from the graph, which matches the runtime reality (Claude Code never consumes `.codex/`). The same path under `activeProvider = 'openai'` becomes `openai/agent`. A `core/markdown` fallback continues to claim every unclaimed `.md` regardless of lens, so a `.claude/agents/foo.md` under `openai` lens reverts to `markdown` (no claude territory under that lens).
75
+
76
+ This gate affects **classification only**. Extractors keep filtering through their own `precondition.provider` allowlist (described in the previous section); a gated-off vendor Provider does not contribute classified nodes, but its bundled extractors still skip uniformly under the wrong lens via the extractor-side rule. The two gates are independent and complementary.
77
+
78
+ ### Active-lens drift detection
79
+
80
+ The lens is sticky once set: the operator chose `activeProvider` deliberately, and the runtime keeps using it until the operator explicitly runs `sm config set activeProvider <id>`. But projects grow: a repo that started under `claude` may later add `.codex/`, or a `.cursor/` directory disappears in a cleanup. Without a hint, the operator would silently keep scanning under the original lens long after the on-disk reality moved.
81
+
82
+ To surface this drift without being noisy, the runtime persists a snapshot of provider markers alongside `activeProvider`:
83
+
84
+ - **`activeProviderMarkers`** (`project-config.schema.json#/properties/activeProviderMarkers`): the set of provider ids whose filesystem markers were present on disk at the moment `activeProvider` was set. Written by the runtime in three places: (1) auto-detect on first scan when exactly one marker is found, (2) interactive prompt when multiple markers are found and the operator picks one, (3) `sm config set activeProvider <id>` (a manual switch refreshes the snapshot to match current reality).
85
+
86
+ At every subsequent scan entry, the bootstrap re-detects markers, diffs against the snapshot, and emits ONE soft warning when the diff is non-empty:
87
+
88
+ - **New markers in current but not in snapshot** → "New: <added>" (e.g. the operator added `.codex/` after the choice was made).
89
+ - **Markers in snapshot but no longer on disk** → "Removed: <removed>".
90
+ - **Both** → both lines, still ONE warn per scan.
91
+
92
+ The warn is informational and never blocks the scan; the run continues with the cached lens. The snapshot is NOT refreshed automatically when drift fires; the operator chooses whether to switch the lens (`sm config set activeProvider <id>` refreshes the snapshot and atomically drops `scan_*`) or accept the drift (re-running the auto-detect by deleting the `activeProvider` key resets the snapshot).
93
+
94
+ Legacy projects (an existing `activeProvider` without a snapshot) lazily backfill: the first scan after the project upgrades writes the current detected set as the snapshot and stays silent (there is nothing to compare against the first time), so the warn only fires when markers actually drift relative to a known-good snapshot. The bookkeeping is internal-state, not normally hand-edited.
95
+
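The snapshot, diff, and lazy-backfill behaviour described in this section can be sketched as below. All function and type names are hypothetical, the `cursor` id in the usage lines is purely illustrative, and the exact warning layout is an assumption; only the "New:" / "Removed:" labels and the silent backfill come from the text.

```typescript
interface MarkerDrift {
  added: string[];
  removed: string[];
}

type DriftCheck =
  | { kind: "backfill"; snapshot: string[] } // legacy project: write snapshot, stay silent
  | { kind: "no-drift" }
  | { kind: "drift"; drift: MarkerDrift; warning: string };

// Both inputs are treated as sets of provider ids.
function diffMarkers(snapshot: readonly string[], current: readonly string[]): MarkerDrift {
  return {
    added: current.filter((m) => snapshot.indexOf(m) === -1).sort(),
    removed: snapshot.filter((m) => current.indexOf(m) === -1).sort(),
  };
}

function checkMarkerDrift(
  snapshot: readonly string[] | undefined,
  current: readonly string[],
): DriftCheck {
  if (snapshot === undefined) {
    // Nothing to compare against the first time.
    return { kind: "backfill", snapshot: current.slice().sort() };
  }
  const drift = diffMarkers(snapshot, current);
  if (drift.added.length === 0 && drift.removed.length === 0) {
    return { kind: "no-drift" };
  }
  // ONE warning per scan, even when both directions drifted.
  const lines: string[] = [];
  if (drift.added.length > 0) lines.push("New: " + drift.added.join(", "));
  if (drift.removed.length > 0) lines.push("Removed: " + drift.removed.join(", "));
  return { kind: "drift", drift, warning: lines.join("\n") };
}

const legacyCheck = checkMarkerDrift(undefined, ["claude"]);
const addedCheck = checkMarkerDrift(["claude"], ["claude", "openai"]);
const bothCheck = checkMarkerDrift(["claude", "cursor"], ["claude", "openai"]);
```

Note that the sketch never rewrites the snapshot on drift; per the text, only an explicit `sm config set activeProvider <id>` or a fresh auto-detect does that.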
68
96
  ---
69
97
 
70
98
  ## Ports
@@ -320,12 +348,15 @@ In addition to the `emitLink` path, Extractors MAY emit **Signals** via `ctx.emi
320
348
 
321
349
  The kernel's **resolver phase** runs after extraction completes and before analysis starts. For each Signal, the resolver:
322
350
 
323
- 1. Filters candidates whose `extractorId` is not enabled (per `plugins.<id>.extensions.<extId>.enabled` overrides).
324
- 2. Applies the active Provider's resolution rules (declared on `IProvider.resolverRules`) to rank surviving candidates: priority order, tie-break by confidence, then by longest range, then by `extractorId` declaration order.
325
- 3. Materialises the winning candidate as a Link (indistinguishable from a Link emitted directly via `emitLink`). The rejected candidates remain accessible to analyzers via `IAnalyzerContext.signals` for collision-detection and conflict-visualization use cases.
326
- 4. Rejects all candidates and emits no Link if every interpretation has confidence below the configured floor.
351
+ 1. (Phase 4+, not yet wired) Filters candidates whose `extractorId` is disabled via `plugins.<id>.extensions.<extId>.enabled`. When that filter empties every candidate, the Signal carries `resolution.outcome = 'rejected'` with `extractorDisabled = { extractorId }`.
352
+ 2. Ranks the surviving candidates inside the Signal by the active Provider's `resolverRules.kindPriority` (when declared), then `confidence` DESC, then `range` length (`end - start`) DESC, then `extractorId` declaration order. The chosen index is recorded as `resolution.winnerIndex` and (provisionally) `resolution.outcome = 'materialised'`.
353
+ 3. For body-scoped Signals with a `range`, the resolver builds overlap clusters per source (transitive closure of range intersection). Clusters of size 1 keep their winner. For clusters of size 2+, the resolver re-applies the same four-step tiebreak to each Signal's winning candidate to pick a cluster winner. Losers flip to `resolution.outcome = 'rejected'` with `rejectedBy = { source, range, extractorId, reason }`, where `reason` names the tiebreak step that decided it: `kind-priority`, `higher-confidence`, `longer-range`, or `earlier-declaration`. External pseudo-link clusters (every member targets `http://` / `https://`) skip cross-cluster ranking and every member materialises (URL-targeted Signals can never conflict with internal-target Signals or with each other because they leave the local graph).
354
+ 4. Materialises every Signal whose final `outcome === 'materialised'` as a Link, identical in shape to a Link emitted directly via `emitLink`. The materialised Link's `sources[]` carries the winning candidate's `extractorId` so attribution survives the resolver.
355
+ 5. (Phase 4+, not yet wired) Rejects a whole Signal when every candidate's `confidence` falls below the configured floor: `resolution.outcome = 'rejected'` with `belowFloor = { threshold }`. Today the resolver materialises every Signal that survives overlap regardless of confidence.
356
+
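The four-step tiebreak named in the steps above can be sketched as a pairwise comparison. This is an illustration under stated assumptions, not the kernel's `resolveSignals`: the `Contender` shape, the `pickWinner` helper, the `mentions` kind, and the rule that unlisted kinds or extractors sort last are all hypothetical. Only the four reason names and their order come from the text.

```typescript
type TiebreakReason =
  | "kind-priority"
  | "higher-confidence"
  | "longer-range"
  | "earlier-declaration";

// Hypothetical flattened view of a Signal's winning candidate.
interface Contender {
  extractorId: string;
  kind: string;
  confidence: number;
  range: { start: number; end: number };
}

interface Verdict {
  winner: Contender;
  loser: Contender;
  reason: TiebreakReason;
}

function pickWinner(
  a: Contender,
  b: Contender,
  kindPriority: readonly string[],
  extractorOrder: readonly string[],
): Verdict {
  // Assumption: values missing from a list rank after every listed value.
  const rank = (list: readonly string[], value: string): number => {
    const i = list.indexOf(value);
    return i === -1 ? list.length : i;
  };
  // Each delta is negative when `a` should win that step.
  const steps: Array<[TiebreakReason, number]> = [
    ["kind-priority", rank(kindPriority, a.kind) - rank(kindPriority, b.kind)],
    ["higher-confidence", b.confidence - a.confidence],
    ["longer-range", (b.range.end - b.range.start) - (a.range.end - a.range.start)],
    ["earlier-declaration", rank(extractorOrder, a.extractorId) - rank(extractorOrder, b.extractorId)],
  ];
  for (const [reason, delta] of steps) {
    if (delta < 0) return { winner: a, loser: b, reason };
    if (delta > 0) return { winner: b, loser: a, reason };
  }
  // Fully tied: keep the first argument (assumption of this sketch).
  return { winner: a, loser: b, reason: "earlier-declaration" };
}

// Ranges for the body "[@./api.md](./api.md)": the link spans 0..21,
// the at-directive token sits inside the bracket text at 1..10.
const collisionVerdict = pickWinner(
  { extractorId: "markdown-link", kind: "references", confidence: 1.0, range: { start: 0, end: 21 } },
  { extractorId: "at-directive", kind: "mentions", confidence: 0.85, range: { start: 1, end: 10 } },
  [], // no kindPriority declared
  ["at-directive", "markdown-link"],
);
```

With no `kindPriority` declared the first step ties, so the decision falls to confidence and the sketch reports `higher-confidence` with `markdown-link` as the winner, in line with the `signal-collision-detection` case.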
357
+ Both materialised and rejected Signals remain on `IAnalyzerContext.signals` post-resolver. The built-in `core/signal-collision` analyzer reads this buffer and emits one `warn` issue per rejected Signal so the operator sees WHICH extractor lost, against WHO, and WHY. Rejected Signals never enter the graph as Links, but their existence is visible end-to-end through the issue surface.
327
358
 
328
- The Signal's `range` field (byte offsets in the source) powers two cross-extractor analyses no Link can support today: collision detection (two extractors emitting Signals with overlapping ranges) and fragmentation detection (an authored intent split across several adjacent Signals). Both surface as analyzer issues, not silent merges.
359
+ The Signal's `range` field (byte offsets in the source) powers two cross-extractor analyses no Link can support today: collision detection (two extractors emitting Signals with overlapping ranges, contract above) and fragmentation detection (an authored intent split across several adjacent Signals, deferred to Phase 5+). Both surface as analyzer issues, not silent merges.
329
360
 
330
361
  ### Extractor · enrichment layer
331
362
 
@@ -0,0 +1,20 @@
1
+ {
2
+ "$schema": "https://skill-map.dev/spec/v0/conformance-case.schema.json",
3
+ "id": "extractor-emits-signal",
4
+ "description": "Signal IR resolver phase, end-to-end. A body that contains a single `[text](path)` markdown link MUST flow through the Signal IR resolver (Phase 2 of the active-lens migration): `core/markdown-link` emits a single-candidate Signal, the resolver materialises the winning candidate as a Link, and the result lands in `scan.links` with the same shape a direct `emitLink` call would have produced. Locks the contract that the Signal IR path coexists with the direct-emit path and produces indistinguishable Link rows.",
5
+ "fixture": "signal-ir-single-signal",
6
+ "invoke": {
7
+ "verb": "scan",
8
+ "flags": ["--json"]
9
+ },
10
+ "assertions": [
11
+ { "type": "exit-code", "value": 0 },
12
+ { "type": "json-path", "path": "$.schemaVersion", "equals": 1 },
13
+ { "type": "json-path", "path": "$.stats.linksCount", "equals": 1 },
14
+ { "type": "json-path", "path": "$.links[0].source", "equals": "source.md" },
15
+ { "type": "json-path", "path": "$.links[0].target", "equals": "target.md" },
16
+ { "type": "json-path", "path": "$.links[0].kind", "equals": "references" },
17
+ { "type": "json-path", "path": "$.links[0].confidence", "equals": 1.0 },
18
+ { "type": "json-path", "path": "$.links[0].sources[0]", "equals": "markdown-link" }
19
+ ]
20
+ }
@@ -0,0 +1,20 @@
1
+ {
2
+ "$schema": "https://skill-map.dev/spec/v0/conformance-case.schema.json",
3
+ "id": "signal-collision-detection",
4
+ "description": "Signal IR resolver phase, range-overlap collision. A body that contains `[@./api.md](./api.md)` triggers a cross-extractor range overlap: `core/markdown-link` matches the whole bracketed-and-parenthesised span; `claude/at-directive` matches the `@./api.md` token INSIDE the bracket text. The two byte ranges overlap (the at-directive range is a strict subset of the markdown-link range). The kernel resolver picks ONE winner per the four-step tiebreak (`kind-priority` -> `higher-confidence` -> `longer-range` -> `earlier-declaration`); markdown-link wins on confidence (1.0 vs 0.85). The resolver materialises the winner as a Link, marks the loser's `resolution.outcome === 'rejected'` with `rejectedBy` naming the winner, and the built-in `core/signal-collision` analyzer surfaces the rejection as ONE `warn` issue attached to the source node. Locks the contract that range-overlap collisions surface to the operator instead of being silently merged.",
5
+ "fixture": "signal-ir-collision",
6
+ "invoke": {
7
+ "verb": "scan",
8
+ "flags": ["--json"]
9
+ },
10
+ "assertions": [
11
+ { "type": "exit-code", "value": 0 },
12
+ { "type": "json-path", "path": "$.schemaVersion", "equals": 1 },
13
+ { "type": "json-path", "path": "$.stats.linksCount", "equals": 1 },
14
+ { "type": "json-path", "path": "$.links[0].target", "equals": ".claude/agents/api.md" },
15
+ { "type": "json-path", "path": "$.links[0].sources[0]", "equals": "markdown-link" },
16
+ { "type": "json-path", "path": "$.stats.issuesCount", "equals": 1 },
17
+ { "type": "json-path", "path": "$.issues[0].analyzerId", "equals": "signal-collision" },
18
+ { "type": "json-path", "path": "$.issues[0].severity", "equals": "warn" }
19
+ ]
20
+ }
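The two case files above pair each `json-path` with an `equals` value. As a rough illustration of how a runner might evaluate them, here is a minimal TypeScript sketch. The helper names (`resolvePath`, `checkAssertion`), the restricted path grammar (dot keys and `[n]` indices only), and the hand-written `sampleOutput` object are assumptions for illustration, not the reference runner's implementation or real `sm scan --json` output.

```typescript
// Hypothetical shapes mirroring the two assertion types used in the cases above.
type Assertion =
  | { type: "exit-code"; value: number }
  | { type: "json-path"; path: string; equals: unknown };

// Resolve a restricted JSONPath: `$`, `.key`, and `[index]` segments only.
function resolvePath(root: unknown, path: string): unknown {
  if (!path.startsWith("$")) throw new Error(`unsupported path: ${path}`);
  const rest = path.slice(1);
  const segments: string[] = rest.match(/\.[A-Za-z_][\w-]*|\[\d+\]/g) ?? [];
  // Reject anything the restricted grammar did not fully consume.
  if (segments.join("") !== rest) throw new Error(`unsupported path: ${path}`);
  let current: unknown = root;
  for (const seg of segments) {
    if (current === null || typeof current !== "object") return undefined;
    const key = seg.startsWith(".") ? seg.slice(1) : Number(seg.slice(1, -1));
    current = (current as Record<string | number, unknown>)[key];
  }
  return current;
}

// Strict equality is enough for the scalar values the cases compare against.
function checkAssertion(a: Assertion, exitCode: number, output: unknown): boolean {
  if (a.type === "exit-code") return exitCode === a.value;
  return resolvePath(output, a.path) === a.equals;
}

// Hand-written stand-in shaped like the fields the collision case asserts on.
const sampleOutput = {
  schemaVersion: 1,
  stats: { linksCount: 1, issuesCount: 1 },
  links: [{ target: ".claude/agents/api.md", sources: ["markdown-link"] }],
  issues: [{ analyzerId: "signal-collision", severity: "warn" }],
};

const passed = checkAssertion(
  { type: "json-path", path: "$.links[0].sources[0]", equals: "markdown-link" },
  0,
  sampleOutput,
);
console.log(passed);
```

A real runner would likely need a fuller JSONPath implementation and deep equality for non-scalar `equals` values; the sketch only covers the path shapes that appear in these two cases.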
@@ -44,7 +44,7 @@ This file is hand-maintained. A CI check before spec release compares the schema
44
44
  | 33 | `plugins-doctor.schema.json` | - | 🔴 missing | Machine-readable output of `sm plugins doctor --json`. Aggregates per-status counts plus structured issue / warning lists. Direct conformance case pending: prime a scope with one healthy + one invalid-manifest drop-in plugin, run `sm plugins doctor --json`, assert the envelope validates and the invalid plugin appears under `issues[]`. Implementation tests at `src/test/plugins-cli.test.ts` cover the runtime behaviour. |
45
45
  | 34 | `conformance-result.schema.json` | - | 🔴 missing | Machine-readable output of `sm conformance run --json`. Self-referential by design (a conformance case would invoke the verb against itself); a direct case is deferred until the runner gains a meta-loopback mode. Implementation tests at `src/test/conformance-cli.test.ts` cover the envelope shape today. |
46
46
  | 35 | `user-settings.schema.json` | (indirect via `no-global-scope`) | 🟡 partial | Per-user / per-machine settings file at `~/.skill-map/settings.json` (the narrow `$HOME` exception; see `cli-contract.md` §User-settings file). A direct case is not added because alt-impls MAY choose not to ship an update-check feature; requiring them to produce this file would over-prescribe. The implementation-side AJV round-trip is covered by `src/test/user-settings-store.test.ts` (15 cases: defaults, malformed JSON, schemaVersion mismatch, wrong-type fields, unknown top-level keys, deep-merge writes, off-shape rejection). The behavioral counterpart (no global / user scope) lives at `no-global-scope` in the non-schema table below. |
47
- | 37 | `signal.schema.json` | - | 🔴 missing | Intermediate Representation (IR) emitted by extractors via `ctx.emitSignal()`; the kernel resolver phase consumes Signals and materialises Links. Opt-in: the existing `ctx.emitLink()` path coexists. Cases required (2): (a) `extractor-emits-signal`: an extractor emits a multi-candidate Signal and the resolver picks the highest-confidence candidate per the active Provider's `resolverRules`; (b) `signal-collision-detection`: two extractors emit Signals with overlapping `range` and the resolver surfaces the collision to analyzers via `IAnalyzerContext.signals`. Blocked by the kernel resolver phase landing in Phase 2 of the active-lens migration. |
47
+ | 37 | `signal.schema.json` | `extractor-emits-signal`, `signal-collision-detection` | covered | Intermediate Representation (IR) emitted by extractors via `ctx.emitSignal()`; the kernel resolver phase consumes Signals and materialises Links. Opt-in: the existing `ctx.emitLink()` path coexists. Phase 2.A wired the resolver end-to-end (filter -> rank -> overlap -> materialise + annotate); Phase 2.B + 2.C migrated all six link-emitter extractors (`claude/at-directive`, `claude/slash`, `core/markdown-link`, `core/annotations`, `core/mcp-tools`, `core/external-url-counter`); Phase 2.D added the `core/signal-collision` analyzer + the two cases. The cases cover (a) `extractor-emits-signal`: a markdown body with one `[text](path)` link materialises one Link via the Signal IR path; (b) `signal-collision-detection`: a body with both `[label](./api.md)` AND `@./api.md` at overlapping byte ranges triggers a cross-extractor collision; the resolver materialises ONE Link (markdown-link wins on confidence) and the loser's `resolution.rejectedBy` reaches the `core/signal-collision` analyzer, which emits a `warn` issue naming WHO won, WHO lost, and WHY. |
48
48
 
49
49
  > **Note on Provider-owned schemas.** Per-kind frontmatter schemas (`skill`, `agent`, `command`, `note` for the built-in Claude Provider; other Providers MAY declare different kinds) live with the Provider that emits them; for the built-in Claude Provider, that is under `src/extensions/providers/claude/schemas/`. Those schemas are NOT counted in the spec's coverage matrix above; they belong to the Provider's own conformance suite at `src/extensions/providers/claude/conformance/coverage.md`. The same split applies to the cases that exercise Provider-specific kinds (`basic-scan`, `rename-high`, `orphan-detection`): they live in the Provider's `cases/` directory.
50
50
 
@@ -0,0 +1,6 @@
1
+ ---
2
+ name: api
3
+ description: Target of the architect's reference. Body content is irrelevant for the conformance assertion.
4
+ ---
5
+
6
+ API documentation.
@@ -0,0 +1,6 @@
1
+ ---
2
+ name: architect
3
+ description: Fixture for the Signal IR `signal-collision-detection` conformance case. Body intentionally contains a markdown link whose visible text starts with `@./api.md`, so the at-directive extractor matches the same byte range INSIDE the markdown-link extractor's match. Cross-extractor range overlap; the resolver picks ONE winner (markdown-link, higher confidence) and the loser surfaces as a signal-collision warn.
4
+ ---
5
+
6
+ Consult [@./api.md](./api.md) before deploying.
@@ -0,0 +1,6 @@
1
+ ---
2
+ name: source
3
+ description: Fixture for the Signal IR `extractor-emits-signal` conformance case. The single markdown link below must reach the graph as ONE Link row via the Signal IR resolver path.
4
+ ---
5
+
6
+ Read [the target file](./target.md) for more context.
@@ -0,0 +1,6 @@
1
+ ---
2
+ name: target
3
+ description: Target of the Signal IR conformance fixture. Body content is irrelevant; the assertion only checks that the link emitted by source.md materialises.
4
+ ---
5
+
6
+ Body.
package/index.json CHANGED
@@ -174,21 +174,23 @@
174
174
  }
175
175
  ]
176
176
  },
177
- "specPackageVersion": "0.34.0",
177
+ "specPackageVersion": "0.35.0",
178
178
  "integrity": {
179
179
  "algorithm": "sha256",
180
180
  "files": {
181
- "CHANGELOG.md": "50d7896784f0b209617e956bfe0dcbcc5af35d2dd6d76f5a0afdd96475fc4b27",
181
+ "CHANGELOG.md": "a6cf7d366dcfe0a04fcb438beb98fd6bc393df78c7ba7cdb396607622b6a3959",
182
182
  "README.md": "1c4b0ea58c4324f301043e9f5c36976a382d0bd2bc405a2e4e18463b0c50d946",
183
- "architecture.md": "b38c5281acaaab57fbe4869780fb0a09712dd6927ec9bdaee961061d34f3525a",
183
+ "architecture.md": "e87b916c0f3e166c79667d35472efcc27fd2dacf213907518b2ec9345aae603c",
184
184
  "cli-contract.md": "2e20c2ac77c300b3f12759c3f36d56f4624862ff0abb34d100f1c3f00861bccc",
185
185
  "conformance/README.md": "6871dde25b5770ed945284c9e0f749e0768ec3f5ba4966bdb215985789e43887",
186
+ "conformance/cases/extractor-emits-signal.json": "34b4808c232d66a0eea0f5db7632a746681432b4f0995b6bf39e8d675538451c",
186
187
  "conformance/cases/kernel-empty-boot.json": "2a5be9c93143d07a16d998df09dcc8fa4ea2d2f9a0bff6417573ed5a770352c1",
187
188
  "conformance/cases/no-global-scope.json": "1284763988026d924c0bd78ba8a9f417dc88f5b7e9f4c2b642ae0c447758bfd4",
188
189
  "conformance/cases/orphan-markdown-fallback.json": "8ef6e49b7e6532bd845d9f54974a16e537cf98d355f0c5e4f4fb06abac3adcc5",
189
190
  "conformance/cases/plugin-missing-ui-rejected.json": "59a571a2e80c2bac2050eacbe740f4f3f125849dd242954508f011304cc3e036",
190
191
  "conformance/cases/sidecar-end-to-end.json": "dbb3640f95769a36b881855a261f918481edadea13a7eb0765c6090f2417a142",
191
- "conformance/coverage.md": "26aebe674304dca0d0173b1ed4af80be1f9384c18eb4d87574fc6021eee2e746",
192
+ "conformance/cases/signal-collision-detection.json": "38c6d553c6f82c1b624fb8a8e9b4fc72034fc47bc70f7f011b3b9136817e7388",
193
+ "conformance/coverage.md": "cb0e4fb73f58c28d9ec15f733c08a6ad70fedf9eb1d1b5220adb7fa52a364343",
192
194
  "conformance/fixtures/orphan-markdown/.claude/agents/reviewer.md": "7f062731106f2d9811e4fffcf6ab44b8dfff4cfb16536a469514cc0664e832bf",
193
195
  "conformance/fixtures/orphan-markdown/ARCHITECTURE.md": "ec903666440bae65da3796b1158c92cfcdce22e0e09c3b20bb690176881a6ac4",
194
196
  "conformance/fixtures/plugin-missing-ui/.skill-map/plugins/bad-provider/kinds/markdown/kind.json": "6676a89bae5197e23cf50f1c11d596db558ac80f7334a7208fe57d8b92422251",
@@ -202,6 +204,10 @@
202
204
  "conformance/fixtures/sidecar-end-to-end/.claude/agents/stale.sm": "cb04f7f3103b4218b09fd4da92f7ea429588b04c1dac6a9547ce362263b11224",
203
205
  "conformance/fixtures/sidecar-example/agent-example.md": "741131403e8c9580d0b7a8c2446cb4502d01f80053b7a2092663de92431aaa82",
204
206
  "conformance/fixtures/sidecar-example/agent-example.sm": "8329950d49c69a1199bbe6c06e32b8513973e64207b0db8756b67301e6a1f1e2",
207
+ "conformance/fixtures/signal-ir-collision/.claude/agents/api.md": "7bdd260d82c2bf1ffc3324820e1b806684674981f9234f7c9f4f6aa61dd1cec5",
208
+ "conformance/fixtures/signal-ir-collision/.claude/agents/architect.md": "acc46b5b2dff73d98a354e4d53b5041164595deae466a4e2ce41d7c5a72f28fb",
209
+ "conformance/fixtures/signal-ir-single-signal/source.md": "1eda417b4c6eed372b66870e385c8d8cd631372b77cab7e996bb711e22218f89",
210
+ "conformance/fixtures/signal-ir-single-signal/target.md": "527137f2b4f46c0034b0edc8932cf8613d2bf22ffaaf78f01085c82a3baaebe3",
205
211
  "db-schema.md": "e56dab70f0469e8e6bd2440e8758c0436e710bc45c2ee812ac40a10b0c29ae77",
206
212
  "interfaces/security-scanner.md": "e8049712b9cf7a07c786bf19f8f775f8ef9638f063f7fba5c7a8b1431b92f38e",
207
213
  "job-events.md": "84206168ac12b536d34470d62f8c8cba95dab181fee66d23203c2cf5dfbee716",
@@ -222,23 +228,23 @@
222
228
  "schemas/extensions/formatter.schema.json": "d6d417df20260e5ddfe71f104b11a45873869706f86372c3c3c78c583e06e8d5",
223
229
  "schemas/extensions/hook.schema.json": "76bf2c07f9e689b3fd1c67cbad4516a4df10604f07103759e82670e5213ddcdf",
224
230
  "schemas/extensions/provider-kind.schema.json": "add3c5648721e67887eb971a76b39319628effac6315cffd51f7dcf679810740",
225
- "schemas/extensions/provider.schema.json": "5fd8f0db17b3d4d23930cbba6f7dbc61feea0ea856fb720ccb9f07a544d18495",
231
+ "schemas/extensions/provider.schema.json": "ebf137271d46f7100c8c520b6aa1851b131a3192a2dea43a17fe82b790d263fb",
226
232
  "schemas/frontmatter/base.schema.json": "df0056a9478514a0db7a705e59868fa4f67673ac1cc9c9da979de4237cdd62a1",
227
233
  "schemas/history-stats.schema.json": "5170dec0299f3d04382a38079a27b1f26300a6b95fdb1ea0fae11050ad9f0574",
228
234
  "schemas/input-types.schema.json": "c713b768d0b0e3d0c764afb401189f7fb624a82b4e988b73aab015cf9c67c01f",
229
235
  "schemas/issue.schema.json": "fa3344e75f1c3a5304291ca355bb973046552a68871ad6eb4edafca1cd9e1be8",
230
236
  "schemas/job.schema.json": "e43e1761c99920beffe1de12ef8f32fe29f97838bd8686742b637c19c4dbb395",
231
- "schemas/link.schema.json": "2450732829652ece58c853ca97711a8bbb64ac65e52e89e3b51024c073dddc9a",
232
- "schemas/node.schema.json": "8d0635a80c8e6f22be7fa04071654e857fc052869de15839f4b29593aa4527a3",
237
+ "schemas/link.schema.json": "336ce710250184ffa40b5d1c3ec52a275529d969d5b400177f2e2adebc643e39",
238
+ "schemas/node.schema.json": "4d7c107ed9cd2f1b7cc4d716c547c06a00ed776bd6092d3979cac634cb5326a5",
233
239
  "schemas/plugins-doctor.schema.json": "c1d92f30fdb0080e8cd8f7dc5d43e01aae02a16640bc5eb04811c337a275de58",
234
240
  "schemas/plugins-registry.schema.json": "cca7ae65f0c22510ea27ea5ae34e0074f5beb5871a57b005b6b831e6ceaff5c0",
235
- "schemas/project-config.schema.json": "e613d302a763e8815863ae58ebfe997b18a3aa0c329f85d9554dfa8d97fdd326",
241
+ "schemas/project-config.schema.json": "18f2f599023d3d567576e3ac5e722430d3f076ca3b66e412fbeaee8caf6e110f",
236
242
  "schemas/refresh-report.schema.json": "54519b8caf86ba84c182f9565be9b5084bc1631ae05e9217ee18f34c0039fff3",
237
243
  "schemas/report-base-deterministic.schema.json": "9d318d0181d121097c906ef3af1c52d71c782740bd04cf23418d7627ce2c3ed5",
238
244
  "schemas/report-base.schema.json": "a1021e9a59b4df9f99cd92454d797e88469766e7d49f52d231c4645ffdfdad8f",
239
245
  "schemas/scan-result.schema.json": "214bc12fbb9946642cbba3b23513dade60e7d6a5b6a9ed3dd0818f135b450185",
240
246
  "schemas/sidecar.schema.json": "8856c387477340efbdd0a585d74bfb07a99ba15b9ce593cc67d9efebc67c6bfc",
241
- "schemas/signal.schema.json": "2540c0014a78ebc902eb71b6815c35fa006c714b57d07dcb7415bd3c3da185b5",
247
+ "schemas/signal.schema.json": "7a9d36f13ee6fa269da7ab97e45d9831d10e0570e3f61005617128b423a4d4d8",
242
248
  "schemas/summaries/agent.schema.json": "bf540f9a804f2b43756ab33b7deb0462620d26e88cc9379c75a5f87d3b1b47d8",
243
249
  "schemas/summaries/command.schema.json": "c26f6965f77c5058608feb5e7b9f807395de8e015b0dea5efcdb44cb1820551a",
244
250
  "schemas/summaries/hook.schema.json": "58420ec485e152fdd21fa3d87337ad74b0d81a48d3b83dd072d4a2d196f78573",
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@skill-map/spec",
3
- "version": "0.34.0",
3
+ "version": "0.35.0",
4
4
  "description": "JSON Schemas, prose contracts, and conformance suite for the skill-map specification.",
5
5
  "license": "MIT",
6
6
  "type": "module",
@@ -15,6 +15,25 @@
15
15
  "description": "Path globs (relative to scope root) that this Provider claims. **Enforcement-grade since structure-as-truth refactor**: a Provider declaring `roots` only receives files that match at least one entry of the array; a Provider without `roots` acts as a fallback and receives files unmatched by every other Provider's roots. Two Providers whose `roots` both match the same file produce a `provider-ambiguous` issue and the file stays unclassified. `sm plugins doctor` warns when no file matched a specific Provider's roots in the latest scan.",
16
16
  "items": { "type": "string" }
17
17
  },
18
+ "gatedByActiveLens": {
19
+ "type": "boolean",
20
+ "description": "Lens gating flag for vendor providers. When `true`, this Provider's `classify()` only runs (and the walker only iterates its territory) if `provider.id === activeProvider` (the project's active lens). When `false` or omitted (default), the Provider is universal and classifies unconditionally. Vendor providers (`claude`, `openai`, `antigravity`) MUST set this to `true`: the actual runtimes never read each other's on-disk formats (Claude Code does not consume `.codex/`; Codex CLI does not consume `.claude/`), and offering every file to every provider fabricates cross-vendor graph edges the runtimes themselves reject. Universal providers (open-standard `agent-skills`, markdown fallback `core/markdown`, any future format-based fallback) keep this `false` so their territory is consumed by every vendor and they run on every scan. When `activeProvider === null` (no lens resolved), the walker bypasses the gate entirely and every gated Provider runs, mirroring the permissive extractor-side fallback for unlensed projects. Affects classification ONLY; extractors continue to filter via their own `precondition.provider` allowlist."
21
+ },
22
+ "resolverRules": {
23
+ "type": "object",
24
+ "description": "Per-provider ranking hints consumed by the Signal IR resolver phase. Drives intra-Signal candidate ranking AND cross-Signal range-overlap tiebreaks. Optional; absent means the resolver uses the default tiebreak chain (confidence DESC -> range length DESC -> extractor registration order). Distinct from the post-walk `resolution` confidence-lift matrix on Link (which runs on already-emitted edges, not Signal candidates): `resolverRules` decides which candidate becomes a Link in the first place; `resolution` lifts confidence on links that survived. The two surfaces share no mechanism and intentionally do not compose.",
25
+ "additionalProperties": false,
26
+ "properties": {
27
+ "kindPriority": {
28
+ "type": "array",
29
+ "description": "When present, the resolver ranks candidates whose `kind` appears earlier in this array ABOVE candidates whose `kind` appears later. Candidates whose `kind` is absent from the array drop to the end (after every listed kind). Example: a Provider that wants `invokes` edges to win against `mentions` and `references` of the same range declares `['invokes', 'references', 'mentions']`. Ties inside the same `kindPriority` bucket fall through to the confidence -> range length -> declaration order tiebreaks.",
30
+ "items": {
31
+ "type": "string",
32
+ "enum": ["invokes", "references", "mentions", "supersedes"]
33
+ }
34
+ }
35
+ }
36
+ },
18
37
  "read": {
19
38
  "type": "object",
20
39
  "required": ["extensions", "parser"],
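The `gatedByActiveLens` description in the hunk above reduces to a three-way decision per Provider. A minimal TypeScript sketch of that decision follows; the `ProviderManifest` interface and the `shouldClassify` function are hypothetical names introduced here, and the kernel's actual walker may be structured quite differently.

```typescript
// Hypothetical slice of a Provider manifest: only the fields the gate reads.
interface ProviderManifest {
  id: string;
  gatedByActiveLens?: boolean; // omitted is treated as false (universal)
}

// Decide whether a Provider's classify() should run under the given lens.
function shouldClassify(provider: ProviderManifest, activeProvider: string | null): boolean {
  if (!provider.gatedByActiveLens) return true; // universal Providers always run
  if (activeProvider === null) return true;     // no lens resolved: the gate is bypassed
  return provider.id === activeProvider;        // gated Providers run only under their own lens
}

// Provider ids taken from the schema description; the manifests are illustrative.
const providers: ProviderManifest[] = [
  { id: "claude", gatedByActiveLens: true },
  { id: "openai", gatedByActiveLens: true },
  { id: "core/markdown" },
];

const underClaude = providers.filter((p) => shouldClassify(p, "claude")).map((p) => p.id);
console.log(underClaude);
```

Per the schema text, this gate affects classification only; extractors keep filtering through their own `precondition.provider` allowlist, which the sketch does not model.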
@@ -62,6 +62,40 @@
62
62
  "raw": {
63
63
  "type": ["string", "null"],
64
64
  "description": "Verbatim matched substring from the source body. Optional, for debugging and UI display."
65
+ },
66
+ "occurrences": {
67
+ "type": "array",
68
+ "description": "Every syntactic site in the source body that contributed to this edge. One entry per detection. Accumulated by the post-walk dedup when two extractors converge on the same `(source, target, kind, normalizedTrigger)` key. Empty / absent for synthetic links (frontmatter / sidecar-derived). The `core/redundant-target-reference` analyzer walks this array to flag multi-form references to the same target.",
69
+ "items": {
70
+ "type": "object",
71
+ "required": ["extractor", "originalTrigger"],
72
+ "additionalProperties": false,
73
+ "properties": {
74
+ "extractor": {
75
+ "type": "string",
76
+ "description": "Extractor id that observed this occurrence. Matches an entry in the parent link's `sources[]`."
77
+ },
78
+ "originalTrigger": {
79
+ "type": "string",
80
+ "description": "Verbatim author substring (sigil included)."
81
+ },
82
+ "location": {
83
+ "type": ["object", "null"],
84
+ "description": "Position of the occurrence in the body, when the extractor records it.",
85
+ "required": ["line"],
86
+ "additionalProperties": false,
87
+ "properties": {
88
+ "line": { "type": "integer", "minimum": 1 },
89
+ "column": { "type": "integer", "minimum": 1 },
90
+ "offset": { "type": "integer", "minimum": 0 }
91
+ }
92
+ }
93
+ }
94
+ }
95
+ },
96
+ "resolvedTarget": {
97
+ "type": ["string", "null"],
98
+ "description": "Node path the link resolved to, per the post-walk `liftResolvedLinkConfidence` transform. Equal to `target` for path-style links; differs for trigger-style links (`@foo`, `/cmd`) where `target` keeps the authored trigger and `resolvedTarget` carries the resolved node path. Absent / null when the link is unresolved (broken)."
65
99
  }
66
100
  }
67
101
  }
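The `occurrences` description above says the post-walk dedup accumulates one entry per detection when extractors converge on the same `(source, target, kind, normalizedTrigger)` key. One way that accumulation could look is sketched below in TypeScript. The `Detection` input shape and the `mergeDetections` function are assumptions made for illustration; only the `Occurrence` fields come from the schema.

```typescript
// Occurrence fields follow the schema hunk above (location reduced to `line`).
interface Occurrence {
  extractor: string;
  originalTrigger: string;
  location?: { line: number } | null;
}

// Hypothetical pre-dedup record: one per syntactic site an extractor reported.
interface Detection {
  source: string;
  target: string;
  kind: string;
  normalizedTrigger: string;
  extractor: string;
  originalTrigger: string;
  line?: number;
}

interface MergedLink {
  source: string;
  target: string;
  kind: string;
  sources: string[];
  occurrences: Occurrence[];
}

// Group detections by the four-part key; keep every site in `occurrences`.
function mergeDetections(detections: Detection[]): MergedLink[] {
  const byKey = new Map<string, MergedLink>();
  for (const d of detections) {
    const key = JSON.stringify([d.source, d.target, d.kind, d.normalizedTrigger]);
    let link = byKey.get(key);
    if (!link) {
      link = { source: d.source, target: d.target, kind: d.kind, sources: [], occurrences: [] };
      byKey.set(key, link);
    }
    if (!link.sources.includes(d.extractor)) link.sources.push(d.extractor);
    link.occurrences.push({
      extractor: d.extractor,
      originalTrigger: d.originalTrigger,
      location: d.line === undefined ? null : { line: d.line },
    });
  }
  return [...byKey.values()];
}
```

Because every occurrence carries its `extractor`, an analyzer such as `core/redundant-target-reference` can walk the array and see which syntactic forms pointed at the same target.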
@@ -59,6 +59,30 @@
59
59
  "minimum": 0,
60
60
  "description": "http/https URLs in the body after normalization and exact-match dedup."
61
61
  },
62
+ "externalRefs": {
63
+ "type": "array",
64
+ "description": "Distinct external URLs (http/https) the body references, in extractor-order (first-seen wins, dedup is by normalised URL). The denormalised `externalRefsCount` rides alongside and MUST equal `externalRefs.length` when both are present. Surfaced via `/api/nodes` so the inspector can list every external URL without a second round-trip.",
65
+ "items": {
66
+ "type": "object",
67
+ "required": ["url"],
68
+ "additionalProperties": false,
69
+ "properties": {
70
+ "url": {
71
+ "type": "string",
72
+ "description": "Normalised URL (lowercased host, fragment stripped)."
73
+ },
74
+ "line": {
75
+ "type": "integer",
76
+ "minimum": 1,
77
+ "description": "1-indexed line of the occurrence in the source body, when known."
78
+ },
79
+ "originalTrigger": {
80
+ "type": "string",
81
+ "description": "Author substring (almost always equals `url`)."
82
+ }
83
+ }
84
+ }
85
+ },
62
86
  "sidecar": {
63
87
  "$ref": "#/$defs/sidecarOverlay",
64
88
  "description": "Step 9.6.2, co-located `.sm` sidecar overlay. Carries presence flag, drift status (null when no sidecar), and the parsed `annotations:` block (null when absent or empty). The kernel re-derives `status` on every scan from the live hashes; clients should treat it as authoritative for the snapshot but never persist it across scans."
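The `externalRefs` description above combines three rules: normalise the URL (lowercased host, fragment stripped), dedup by the normalised form with first-seen winning, and keep `externalRefsCount` equal to `externalRefs.length`. A minimal TypeScript sketch of those rules follows. It leans on the standard WHATWG `URL` class, which may normalise more than the spec requires (for example, adding a trailing slash to an empty path); `normaliseUrl`, `collectExternalRefs`, and the `UrlHit` input shape are hypothetical names, not the extractor's real API.

```typescript
// Item shape from the schema hunk above.
interface ExternalRef {
  url: string;
  line?: number;
  originalTrigger?: string;
}

// Hypothetical raw hit reported while scanning the body.
interface UrlHit {
  raw: string;
  line: number;
}

// Returns the normalised URL, or null for anything that is not http/https.
function normaliseUrl(raw: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;
  parsed.hash = ""; // strip the fragment
  return parsed.href; // WHATWG parsing already lowercases the host
}

// First-seen wins: later hits with the same normalised URL are dropped.
function collectExternalRefs(hits: UrlHit[]): { externalRefs: ExternalRef[]; externalRefsCount: number } {
  const seen = new Set<string>();
  const externalRefs: ExternalRef[] = [];
  for (const hit of hits) {
    const url = normaliseUrl(hit.raw);
    if (url === null || seen.has(url)) continue;
    seen.add(url);
    externalRefs.push({ url, line: hit.line, originalTrigger: hit.raw });
  }
  return { externalRefs, externalRefsCount: externalRefs.length };
}
```

Deriving the count from the array in one place is a simple way to honour the "MUST equal `externalRefs.length`" constraint.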
@@ -28,6 +28,11 @@
28
28
  "type": "string",
29
29
  "description": "The active provider lens for this project. Exactly one provider id (from the enabled `providers` list) sees the project at any time. All extractors, classifiers, and resolution rules belonging to other providers are skipped during scan. Changing this triggers an atomic drop of the `scan_*` DB zone followed by a fresh scan under the new lens; `state_*` and `config_*` zones survive the switch. When absent on a fresh project, the kernel auto-detects from filesystem (presence of `.claude/`, `.codex/`, AGENTS.md, `.cursor/`, etc.) and prompts via the CLI / UI if the heuristic is ambiguous. Google's Antigravity CLI has no vendor-specific marker and is selected manually. Stability: experimental."
30
30
  },
31
+ "activeProviderMarkers": {
32
+ "type": "array",
33
+ "items": { "type": "string" },
34
+ "description": "Internal-state snapshot, NOT normally hand-edited. The set of provider ids whose filesystem markers were present on disk at the moment `activeProvider` was set (whether by auto-detect, the interactive prompt, or `sm config set activeProvider <id>`). On every subsequent scan the runtime re-detects markers and compares against this snapshot; when the diff is non-empty (new markers appeared, or recorded ones disappeared) it emits ONE soft warning before the scan and continues with the cached lens. The warn is informational and never blocks the scan. Absent on legacy projects; the runtime backfills the snapshot lazily on the next scan without warning. Stability: experimental."
35
+ },
31
36
  "roots": {
32
37
  "type": "array",
33
38
  "description": "Directories (relative to the config file) to scan. Defaults to the scope root.",
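The snapshot-and-diff contract described for `activeProviderMarkers` can be sketched as a small pure function. The TypeScript below is an illustration under stated assumptions: `checkMarkerDrift` and the `DriftResult` union are hypothetical names, and the provider ids in the example are placeholders rather than the runtime's real marker ids.

```typescript
// Three outcomes: silent backfill, no drift, or drift that warrants ONE soft warn.
type DriftResult =
  | { kind: "backfill"; snapshot: string[] }
  | { kind: "clean" }
  | { kind: "drift"; appeared: string[]; disappeared: string[] };

function checkMarkerDrift(snapshot: string[] | undefined, detected: string[]): DriftResult {
  // Legacy project: nothing to compare against, so record the current set and stay silent.
  if (snapshot === undefined) return { kind: "backfill", snapshot: [...detected].sort() };
  const before = new Set(snapshot);
  const now = new Set(detected);
  const appeared = detected.filter((id) => !before.has(id));
  const disappeared = snapshot.filter((id) => !now.has(id));
  if (appeared.length === 0 && disappeared.length === 0) return { kind: "clean" };
  return { kind: "drift", appeared, disappeared };
}

// Example: `.codex/`-style marker appeared, `.cursor/`-style marker disappeared.
const result = checkMarkerDrift(["claude", "cursor"], ["claude", "openai"]);
if (result.kind === "drift") {
  // The scan would still proceed with the cached lens after this warning.
  console.warn(`provider markers drifted: +${result.appeared.join(",")} -${result.disappeared.join(",")}`);
}
```

Returning a value instead of logging inside the function keeps the "warn once, never block" policy in the caller, which matches the informational nature of the warning.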
@@ -23,7 +23,8 @@
23
23
  "additionalProperties": false,
24
24
  "properties": {
25
25
  "start": { "type": "integer", "minimum": 0, "description": "Inclusive byte offset of the first character." },
26
- "end": { "type": "integer", "minimum": 0, "description": "Exclusive byte offset one past the last character." }
26
+ "end": { "type": "integer", "minimum": 0, "description": "Exclusive byte offset one past the last character." },
27
+ "line": { "type": "integer", "minimum": 1, "description": "Optional 1-indexed line number containing `start`. Extractors that already compute line tracking (via `computeLineStarts` / `lineFor`) populate this so the resolver's materialised Link can preserve `link.location.line` without re-walking the body. Absent when the extractor does not track lines; in that case the resolver falls back to `1`." }
27
28
  }
28
29
  },
29
30
  "fieldPath": {
@@ -84,6 +85,77 @@
84
85
  }
85
86
  }
86
87
  }
88
+ },
89
+ "resolution": {
90
+ "type": "object",
91
+ "description": "Resolver outcome annotation, populated by the kernel resolver phase after `resolveSignals` runs. Absent before the resolver fires (raw extractor output). When `outcome` is `materialised`, `winnerIndex` points into `candidates[]` and a corresponding `Link` was emitted. When `outcome` is `rejected`, one of `rejectedBy` / `extractorDisabled` / `belowFloor` carries the reason. Both materialised and rejected Signals remain accessible to analyzers via `IAnalyzerContext.signals` so the `core/signal-collision` analyzer can surface losers as `warn` issues. Phase 4+ adds the `extractorDisabled` and `belowFloor` paths; today only the `rejectedBy` / range-overlap path populates rejection state.",
92
+ "required": ["outcome"],
93
+ "additionalProperties": false,
94
+ "properties": {
95
+ "outcome": {
96
+ "type": "string",
97
+ "enum": ["materialised", "rejected"],
98
+ "description": "Whether the resolver materialised this Signal's winning candidate as a `Link` (`materialised`) or rejected the whole Signal (`rejected`)."
99
+ },
100
+ "winnerIndex": {
101
+ "type": "integer",
102
+ "minimum": 0,
103
+ "description": "Index into `candidates[]` of the winning candidate when `outcome === 'materialised'`. Absent on rejection."
104
+ },
105
+ "rejectedBy": {
106
+ "type": "object",
107
+ "description": "Set when the Signal lost a cross-extractor range-overlap collision against another Signal at the same source. Names the winner so an analyzer (or the operator drilling into the sidecar) can see WHO won and WHY.",
108
+ "required": ["source", "range", "extractorId", "reason"],
109
+ "additionalProperties": false,
110
+ "properties": {
111
+ "source": {
112
+ "type": "string",
113
+ "description": "`node.path` of the winning Signal. Always equal to this Signal's `source` today; the field is explicit so future cross-node collision detection can populate it without a schema migration."
114
+ },
115
+ "range": {
116
+ "type": "object",
117
+ "description": "Byte-range of the winning Signal. Mirrors the shape of `Signal.range`.",
118
+ "required": ["start", "end"],
119
+ "additionalProperties": false,
120
+ "properties": {
121
+ "start": { "type": "integer", "minimum": 0 },
122
+ "end": { "type": "integer", "minimum": 0 }
123
+ }
124
+ },
125
+ "extractorId": {
126
+ "type": "string",
127
+ "description": "Qualified id (`<plugin>/<extractor>`) of the winning candidate's extractor."
128
+ },
129
+ "reason": {
130
+ "type": "string",
131
+ "enum": ["kind-priority", "higher-confidence", "longer-range", "earlier-declaration"],
132
+ "description": "Which tiebreak rule decided the winner. The four rules apply in this order: 1) `kind-priority` (provider `resolverRules.kindPriority`), 2) `higher-confidence` (numeric confidence DESC), 3) `longer-range` (`end - start` DESC), 4) `earlier-declaration` (extractor registration order)."
133
+ }
134
+ }
135
+ },
136
+ "extractorDisabled": {
137
+ "type": "object",
138
+ "description": "Reserved for Phase 4+: populated when every candidate of this Signal came from an extractor that the operator has disabled via `plugins.<id>.extensions.<extId>.enabled`. Today the resolver never sets this; the field is documented so the analyzer / UI surface can be built once the filter lands.",
139
+ "required": ["extractorId"],
140
+ "additionalProperties": false,
141
+ "properties": {
142
+ "extractorId": { "type": "string" }
143
+ }
144
+ },
145
+ "belowFloor": {
146
+ "type": "object",
147
+ "description": "Reserved for Phase 4+: populated when every candidate's `confidence` fell below the configured floor. Today the resolver materialises every Signal that survives overlap, regardless of confidence; the field is documented so the analyzer / UI surface can be built once the floor lands.",
148
+ "required": ["threshold"],
149
+ "additionalProperties": false,
150
+ "properties": {
151
+ "threshold": {
152
+ "type": "number",
153
+ "minimum": 0,
154
+ "maximum": 1
155
+ }
156
+ }
157
+ }
158
+ }
87
159
  }
88
160
  }
89
161
  }
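The `rejectedBy.reason` enum above names four tiebreak rules applied in order. The TypeScript sketch below shows one way to implement that chain for two overlapping Signals. `Contender`, `overlaps`, and `pickWinner` are hypothetical names; the byte offsets and the `mentions` kind assigned to the at-directive are illustrative assumptions, while the confidences (1.0 vs 0.85) come from the `signal-collision-detection` case description.

```typescript
type Reason = "kind-priority" | "higher-confidence" | "longer-range" | "earlier-declaration";

// Hypothetical flattened view of a Signal's winning candidate.
interface Contender {
  extractorId: string;
  kind: string;
  confidence: number;
  range: { start: number; end: number }; // start inclusive, end exclusive
  declarationOrder: number; // extractor registration order, lower is earlier
}

function overlaps(a: Contender, b: Contender): boolean {
  return a.range.start < b.range.end && b.range.start < a.range.end;
}

function pickWinner(a: Contender, b: Contender, kindPriority: string[] = []) {
  // Kinds absent from kindPriority drop to the end, after every listed kind.
  const rank = (kind: string) => {
    const i = kindPriority.indexOf(kind);
    return i === -1 ? kindPriority.length : i;
  };
  // Each step scores a contender; the higher score wins that step.
  const steps: Array<[Reason, (c: Contender) => number]> = [
    ["kind-priority", (c) => -rank(c.kind)],
    ["higher-confidence", (c) => c.confidence],
    ["longer-range", (c) => c.range.end - c.range.start],
    ["earlier-declaration", (c) => -c.declarationOrder],
  ];
  for (const [reason, score] of steps) {
    const delta = score(a) - score(b);
    if (delta !== 0) {
      return delta > 0 ? { winner: a, loser: b, reason } : { winner: b, loser: a, reason };
    }
  }
  // Identical on every axis: keep the first argument.
  const reason: Reason = "earlier-declaration";
  return { winner: a, loser: b, reason };
}

const markdownLink: Contender = {
  extractorId: "core/markdown-link", kind: "references", confidence: 1.0,
  range: { start: 8, end: 29 }, declarationOrder: 0,
};
const atDirective: Contender = {
  extractorId: "claude/at-directive", kind: "mentions", confidence: 0.85,
  range: { start: 9, end: 18 }, declarationOrder: 1,
};

if (overlaps(markdownLink, atDirective)) {
  const outcome = pickWinner(markdownLink, atDirective);
  console.log(outcome.winner.extractorId, outcome.reason);
}
```

With an empty `kindPriority` every kind ranks equally, so the chain degrades to the default described for `resolverRules` (confidence, then range length, then registration order). Passing a `kindPriority` that lists the loser's kind first would flip the outcome at the first step.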