@zigrivers/scaffold 3.29.0 → 3.31.0

Files changed (86)
  1. package/content/guides/AUTHORING.md +146 -0
  2. package/content/guides/cli/index.html +1855 -0
  3. package/content/guides/cli/index.md +206 -0
  4. package/content/guides/concepts/index.html +1970 -0
  5. package/content/guides/concepts/index.md +347 -0
  6. package/content/guides/dashboard/index.html +1913 -0
  7. package/content/guides/dashboard/index.md +264 -0
  8. package/content/guides/index.html +368 -15
  9. package/content/guides/install/.diagrams/diagram-0.svg +1 -0
  10. package/content/guides/install/.diagrams/manifest.json +3 -0
  11. package/content/guides/install/index.html +1653 -0
  12. package/content/guides/install/index.md +186 -0
  13. package/content/guides/knowledge/.diagrams/diagram-0.svg +1 -0
  14. package/content/guides/knowledge/.diagrams/manifest.json +3 -0
  15. package/content/guides/knowledge/index.html +1765 -0
  16. package/content/guides/knowledge/index.md +209 -0
  17. package/content/guides/knowledge-freshness/.diagrams/diagram-0.svg +1 -0
  18. package/content/guides/knowledge-freshness/.diagrams/manifest.json +3 -0
  19. package/content/guides/knowledge-freshness/index.html +2795 -0
  20. package/content/guides/knowledge-freshness/index.md +893 -0
  21. package/content/guides/mmr/index.html +407 -36
  22. package/content/guides/mmr/index.md +39 -16
  23. package/content/guides/multi-agent/.diagrams/diagram-0.svg +1 -0
  24. package/content/guides/multi-agent/.diagrams/manifest.json +3 -0
  25. package/content/guides/multi-agent/index.html +1715 -0
  26. package/content/guides/multi-agent/index.md +243 -0
  27. package/content/guides/observability/.diagrams/diagram-0.svg +1 -0
  28. package/content/guides/observability/.diagrams/diagram-1.svg +1 -0
  29. package/content/guides/observability/.diagrams/diagram-2.svg +1 -0
  30. package/content/guides/observability/.diagrams/diagram-3.svg +1 -0
  31. package/content/guides/observability/.diagrams/manifest.json +6 -0
  32. package/content/guides/observability/index.html +3257 -0
  33. package/content/guides/observability/index.md +1097 -0
  34. package/content/guides/pipeline/.diagrams/diagram-0.svg +1 -0
  35. package/content/guides/pipeline/.diagrams/diagram-1.svg +1 -0
  36. package/content/guides/pipeline/.diagrams/manifest.json +4 -0
  37. package/content/guides/pipeline/index.html +1973 -0
  38. package/content/guides/pipeline/index.md +387 -0
  39. package/content/guides/review-workflow/.diagrams/diagram-0.svg +1 -0
  40. package/content/guides/review-workflow/.diagrams/diagram-1.svg +1 -0
  41. package/content/guides/review-workflow/.diagrams/manifest.json +4 -0
  42. package/content/guides/review-workflow/index.html +1790 -0
  43. package/content/guides/review-workflow/index.md +248 -0
  44. package/dist/guides/build.d.ts +1 -1
  45. package/dist/guides/build.d.ts.map +1 -1
  46. package/dist/guides/build.js +21 -9
  47. package/dist/guides/build.js.map +1 -1
  48. package/dist/guides/build.test.js +47 -0
  49. package/dist/guides/build.test.js.map +1 -1
  50. package/dist/guides/chrome.d.ts.map +1 -1
  51. package/dist/guides/chrome.js +83 -12
  52. package/dist/guides/chrome.js.map +1 -1
  53. package/dist/guides/dashboard-theme.css +8 -0
  54. package/dist/guides/directives-cite.test.d.ts +2 -0
  55. package/dist/guides/directives-cite.test.d.ts.map +1 -0
  56. package/dist/guides/directives-cite.test.js +26 -0
  57. package/dist/guides/directives-cite.test.js.map +1 -0
  58. package/dist/guides/directives-tabs.test.js +47 -0
  59. package/dist/guides/directives-tabs.test.js.map +1 -1
  60. package/dist/guides/directives.d.ts +1 -0
  61. package/dist/guides/directives.d.ts.map +1 -1
  62. package/dist/guides/directives.js +38 -0
  63. package/dist/guides/directives.js.map +1 -1
  64. package/dist/guides/guides.css +268 -0
  65. package/dist/guides/index-page.d.ts.map +1 -1
  66. package/dist/guides/index-page.js +41 -8
  67. package/dist/guides/index-page.js.map +1 -1
  68. package/dist/guides/links.d.ts +14 -0
  69. package/dist/guides/links.d.ts.map +1 -0
  70. package/dist/guides/links.js +56 -0
  71. package/dist/guides/links.js.map +1 -0
  72. package/dist/guides/links.test.d.ts +2 -0
  73. package/dist/guides/links.test.d.ts.map +1 -0
  74. package/dist/guides/links.test.js +72 -0
  75. package/dist/guides/links.test.js.map +1 -0
  76. package/dist/guides/render.d.ts +1 -0
  77. package/dist/guides/render.d.ts.map +1 -1
  78. package/dist/guides/render.js +1 -1
  79. package/dist/guides/render.js.map +1 -1
  80. package/dist/guides/sanitize.d.ts.map +1 -1
  81. package/dist/guides/sanitize.js +5 -0
  82. package/dist/guides/sanitize.js.map +1 -1
  83. package/dist/guides/template.d.ts.map +1 -1
  84. package/dist/guides/template.js +7 -2
  85. package/dist/guides/template.js.map +1 -1
  86. package/package.json +2 -2
@@ -0,0 +1,893 @@
1
+ ---
2
+ title: Knowledge Freshness
3
+ topic: knowledge-freshness
4
+ description: How knowledge entries stay current, how coverage gaps surface as Lens-I findings, and how the daily cron, five PR gates, and source allowlist keep the KB grounded
5
+ category: tools
6
+ order: 42
7
+ ---
8
+
9
+ ## What this system does
10
+
11
+ Knowledge entries under `content/knowledge/` declare a `volatility` tier and a
12
+ list of `sources`. A daily cron prefilters at most ten entries that are *due* —
13
+ by cadence or by a changed source hash — runs a grounded LLM audit against the
14
+ prefetched source bodies, opens one PR per drifted entry, and gates that PR on
15
+ five checks. In parallel, downstream agents emit `knowledge_gap_signal` events
16
+ when they hit a topic the KB does not cover; **Lens I** aggregates those signals
17
+ into P1/P2 audit findings, suppressing any topic an entry already covers.
18
+
19
+ Two arms, two outcomes:
20
+
21
+ - The **refresh arm** chases *known* sources for drift. It ends in a PR that
22
+ *updates* an entry.
23
+ - The **gap arm** surfaces *unknown* topics. It ends in a PR that *creates* an
24
+ entry.
25
+
26
+ Both terminate in a human-merged PR.
27
+
28
+ | Surface | Value | Notes |
29
+ | --- | --- | --- |
30
+ | Volatility tiers | 3 | `fast-moving` / `evolving` / `stable` |
31
+ | Audit verdicts | 4 | `current` / `minor-drift` / `major-drift` / `superseded` |
32
+ | Daily audit ceiling | 10 | set by `--max=10` in the cron workflow; not a yaml knob |
33
+ | PR gates | 5 | 4 blocking + 1 advisory |
34
+ | Signal window | 90 days | rolling; drives Lens I aggregation |
35
+
36
+ :::callout{type=note}
37
+ **Two subsystems, one config file.** Knowledge Freshness and the separate
38
+ [Build Observability](../observability/index.md) system both read
39
+ `.scaffold/observability.yaml`. This guide documents Knowledge Freshness;
40
+ Lens I is the seam where the two meet (it lives in the observability audit but
41
+ reasons about the KB).
42
+ :::
43
+
44
+ ### How a gap closes
45
+
46
+ The full lifecycle, end to end:
47
+
48
+ 1. Downstream agents emit signals; they accumulate in the rolling 90-day window.
49
+ 2. A topic's signal count and distinct-project count cross the threshold.
50
+ 3. Lens I emits a P1/P2 finding.
51
+ 4. An operator adds `content/knowledge/<category>/<slug>.md`.
52
+ 5. The next audit's knowledge index covers the slug and Lens I **suppresses** the
53
+ bucket — the finding disappears.
54
+
55
+ Signals are *not* purged when the entry is added. The window is rolling, so
56
+ yesterday's signals still aggregate tomorrow; suppression filters the *emit*
57
+ step, not the aggregation step (:cite[src/observability/checks/lens-i-knowledge-gaps.ts:155]).
58
+ Signals only fade as they age out of the 90-day window naturally.
59
+
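The split between aggregation and emission described above can be sketched as follows. This is an illustrative reading, not the code in `lens-i-knowledge-gaps.ts`; the `Bucket` type and `emitFindings` helper are hypothetical names.

```typescript
// Hypothetical shape: one bucket per normalized topic, already aggregated
// over the rolling window.
interface Bucket {
  topic: string
  signalCount: number
}

/**
 * Hypothetical emit step. Aggregation has already happened; buckets whose
 * topic appears in the knowledge index are dropped here, and the underlying
 * signals are left untouched.
 */
function emitFindings(buckets: Bucket[], knowledgeIndex: Set<string>): Bucket[] {
  return buckets.filter((bucket) => !knowledgeIndex.has(bucket.topic))
}

const buckets: Bucket[] = [
  { topic: 'agent-eval-harnesses', signalCount: 4 },
  { topic: 'retry-with-jitter', signalCount: 6 },
]
// Once an entry named retry-with-jitter exists, only the other bucket is emitted.
const emitted = emitFindings(buckets, new Set(['retry-with-jitter']))
```

Because the filter runs after aggregation, removing the entry again would make the finding reappear for as long as the signals remain inside the window.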
60
+ ## System map
61
+
62
+ ```mermaid
63
+ flowchart TB
64
+ subgraph refresh[Refresh arm]
65
+ CRON["cron
66
+ 09:00 UTC daily"] --> PF["audit-prefilter
67
+ --max=10"]
68
+ PF --> RUN["audit-run-entry
69
+ grounded LLM"]
70
+ RUN --> APPLY["audit-apply
71
+ --open-pr"]
72
+ APPLY --> GATES["5 PR gates"]
73
+ GATES --> MERGE["human merge
74
+ → VERSION bump"]
75
+ end
76
+ subgraph gap[Gap arm]
77
+ TAIL["gap-signal-tail
78
+ 89 pipeline steps"] --> EVENT["scaffold observe event
79
+ knowledge_gap_signal"]
80
+ EVENT --> LEDGER["ledger
81
+ activity.jsonl"]
82
+ LEDGER --> LENSI["Lens I
83
+ 90-day window"]
84
+ LENSI --> FINDING["finding
85
+ P1 / P2"]
86
+ RESOLVER["3-tier --knowledge-root resolver
87
+ suppresses covered topics"] --> LENSI
88
+ end
89
+ FINDING -.->|operator adds an entry whose name: matches the bucket → PR| GATES
90
+ ```
91
+
92
+ Three real hooks sit beside the two arms: the **phase-audit hook** (runs Lens H
93
+ only, never Lens I), the **doc-conformance MMR channel** (routes Lens I findings
94
+ into MMR), and the **`--fix` flow** (initial + verifier + postfix audit). They
95
+ are covered below.
96
+
97
+ :::callout{type=warning}
98
+ **Doc drift on MMR-in-cron.** Three docs frame MMR-in-cron differently. The
99
+ parent spec's locked decision #3 is authoritative: a native
100
+ `knowledge-freshness` MMR channel is deferred to Phase 5. The cron today runs
101
+ only inline gates. Two interim paths give reviewers MMR signal on a freshness
102
+ PR: (1) the built-in `doc-conformance` MMR channel (disabled by default; enable
103
+ with `mmr review --channels=doc-conformance`); (2) the manual `mmr review
104
+ --diff -` command in [From candidate to merged PR](#from-candidate-to-merged-pr).
105
+ :::
106
+
107
+ ## Frontmatter, signals, and resolution
108
+
109
+ ### Frontmatter schema
110
+
111
+ Every knowledge entry's frontmatter is a Zod-validated object with four
112
+ freshness-relevant fields. The schema is the source of truth and runs as Gate 1
113
+ of the PR CI (:cite[src/validation/knowledge-frontmatter-validator.ts:42-50]);
114
+ runtime readers tolerate missing optional fields.
115
+
116
+ | Field | Type | Default | Validation | Read by |
117
+ | --- | --- | --- | --- | --- |
118
+ | `name` | string | required | regex `/^[a-z][a-z0-9-]*$/` | assembly-loader, Lens I suppression |
119
+ | `description` | string | required | warns if > 200 chars | assembly-loader (TOC), audit prompt |
120
+ | `topics` | string[] | `[]` | any string | assembly-loader (auto-selection) |
121
+ | `volatility` | enum | `evolving` | `stable|evolving|fast-moving` | prefilter cadence |
122
+ | `last-reviewed` | ISO date | `null` | `YYYY-MM-DD` & real calendar date | prefilter cadence |
123
+ | `version-pin` | string | `null` | any string (e.g. `"OWASP Top 10 2021"`) | audit prompt; `superseded` verdict signals it must advance manually |
124
+ | `sources[]` | object[] | `[]` | each: `url` (SSRF-checked at fetch), `anchor` (optional, starts with `#`), `retrieved` (ISO date), `hash` (sha256) | prefilter (hash + cadence), audit runner (prefetch) |
125
+
126
+ :::callout{type=warning}
127
+ **`name` vs. gap-topic regex.** An entry `name` must start with a letter
128
+ (`/^[a-z][a-z0-9-]*$/`), but Lens I gap *topics* allow a leading digit
129
+ (`/^[a-z0-9]+(-[a-z0-9]+)*$/`). So a gap signalled for a topic like `3d-rendering`
130
+ cannot be suppressed by an entry of the same name — pick a letter-leading `name`
131
+ (and list the numeric form under `topics`) when closing such a gap.
132
+ :::
133
+
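The mismatch in the callout above can be checked directly. A minimal sketch using the two patterns quoted in this guide; the helper name `canSuppressByName` is hypothetical.

```typescript
// Both patterns are quoted from this guide.
const ENTRY_NAME_RE = /^[a-z][a-z0-9-]*$/
const GAP_TOPIC_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/

/** Hypothetical helper: could an entry `name` identical to this topic be valid? */
function canSuppressByName(topic: string): boolean {
  return GAP_TOPIC_RE.test(topic) && ENTRY_NAME_RE.test(topic)
}

const report = ['3d-rendering', 'retry-with-jitter'].map(
  (topic) => `${topic}: ${canSuppressByName(topic)}`,
)
```

A topic that fails this check needs a letter-leading `name`, as the callout recommends.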
134
+ :::callout{type=note}
135
+ **Anchor semantics.** Put fragments in `anchor`, never inside `url`. The audit
136
+ fetches `url + (anchor ?? '')` and hashes that body; the coverage check
137
+ (:cite[src/knowledge-freshness/audit-apply.ts:82-101]) matches the same combined
138
+ string. Splitting prevents hash drift from spurious URL re-encodings and lets two
139
+ sources at the same base URL with different `#anchor`s be tracked independently.
140
+ :::
141
+
142
+ ### Cadence model
143
+
144
+ Three tiers, three windows — **14 / 60 / 180** days for `fast-moving` /
145
+ `evolving` / `stable` (:cite[src/knowledge-freshness/audit-prefilter.ts:5-7]).
146
+ An entry with no `last-reviewed` always counts as due. An entry whose source
147
+ hash changed becomes a candidate even before its window elapses; the hash check
148
+ runs only for entries still *inside* their window, because overdue ones are already due.
149
+
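As a rough illustration of the cadence rule, the sketch below encodes the three windows and the "no `last-reviewed` means due" case. The helper name `isDueByCadence` and the whole-day rounding are assumptions; the shipped logic lives in `audit-prefilter.ts`.

```typescript
type Volatility = 'fast-moving' | 'evolving' | 'stable'

// Cadence windows in days, as stated in this section.
const CADENCE_DAYS: Record<Volatility, number> = {
  'fast-moving': 14,
  evolving: 60,
  stable: 180,
}

const MS_PER_DAY = 24 * 60 * 60 * 1000

/** Hypothetical helper: true when the entry is due by cadence alone. */
function isDueByCadence(
  volatility: Volatility,
  lastReviewed: string | null, // ISO date, YYYY-MM-DD
  today: Date,
): boolean {
  if (lastReviewed === null) return true // never reviewed always counts as due
  const ageMs = today.getTime() - new Date(lastReviewed).getTime()
  const ageDays = Math.floor(ageMs / MS_PER_DAY) // assumption: whole days
  return ageDays > CADENCE_DAYS[volatility]
}
```

With a review dated `2026-01-01`, a `fast-moving` entry becomes due on `2026-01-16` (15 days), while an `evolving` entry with the same date is still inside its window.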
150
+ #### Which tier does an entry belong in?
151
+
152
+ | Provenance | Change frequency | Recommended tier |
153
+ | --- | --- | --- |
154
+ | vendor SDK / API docs | quarterly or faster | `fast-moving` |
155
+ | standards / RFCs, vendor docs | yearly-ish | `evolving` |
156
+ | canonical pattern reference | multi-year | `stable` |
157
+
158
+ Rule of thumb: if a version bump *often breaks* downstream guidance, lean
159
+ `fast-moving`; if drift is *extremely rare*, `stable`; otherwise `evolving`
160
+ (the default).
161
+
162
+ ### Adding a new entry to the KB
163
+
164
+ 1. **Choose a category directory** under `content/knowledge/<category>/`. Many
165
+ categories exist today (`backend`, `core`, `cli`, `research`, `web-app`,
166
+ `web3`, …); prefer placing into an existing one. Creating a new category is a
167
+ separate PR.
168
+ 2. **File name = entry slug + `.md`.** The basename must match the `name:` field
169
+ (e.g. `retry-with-jitter.md` ↔ `name: retry-with-jitter`). Lens I's
170
+ suppression match reads `name:` only, not the filename — a mismatch silently
171
+ breaks suppression.
172
+ 3. **Required frontmatter:** `name`, `description`. Add `volatility` + `sources[]`
173
+ if you want the cron to audit it — an entry with no `sources[]` is skipped by
174
+ the prefilter (:cite[src/knowledge-freshness/audit-prefilter.ts:17]).
175
+ 4. **Validate locally:** `make validate-knowledge`.
176
+ 5. **Confirm the prefilter will pick it up.** A fresh entry has no
177
+ `last-reviewed`, so it should appear at priority 100:
178
+ ```bash
179
+ node dist/index.js knowledge-freshness audit-prefilter --max=10 \
180
+ | jq '.[] | select(.name=="<your-new-slug>")'
181
+ ```
182
+ The daily ceiling is 10, so a flood of new entries may queue past the first day.
183
+
184
+ ### Gap-signal payload
185
+
186
+ A gap signal is a ledger event validated by
187
+ :cite[src/observability/engine/event-schemas.ts:191-220] (payload allow-list at
188
+ :cite[src/observability/engine/event-schemas.ts:12]):
189
+
190
+ ```json
191
+ {
192
+ "event_id": "<uuid>",
193
+ "worktree_id": "<sha>",
194
+ "actor_label": "agent | bot | …",
195
+ "branch": "<branch>",
196
+ "task_id": null,
197
+ "type": "knowledge_gap_signal",
198
+ "ts": "<ISO-8601>",
199
+ "payload": {
200
+ "topic": "<kebab-slug>",
201
+ "source": "agent_search",
202
+ "project_id": "<sha256-hex>",
203
+ "step_name": "tech-stack",
204
+ "agent_excerpt": "…"
205
+ }
206
+ }
207
+ ```
208
+
209
+ `topic` is ≤80 chars matching `/^[a-z0-9]+(-[a-z0-9]+)*$/`; `source` ∈
210
+ {`agent_search`, `lessons`, `manual`}; `project_id` is 64-char sha256 hex (or the
211
+ literal `lessons` when `source=lessons`); `step_name` and `agent_excerpt` (≤200
212
+ chars) are optional.
213
+
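The payload constraints above can be expressed as a small checker. This is a sketch only: the real validation lives in `event-schemas.ts`, the function name `checkGapSignalPayload` is hypothetical, and lowercase hex for `project_id` is an assumption.

```typescript
interface GapSignalPayload {
  topic: string
  source: string
  project_id: string
  step_name?: string
  agent_excerpt?: string
}

const TOPIC_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/
const SHA256_HEX_RE = /^[0-9a-f]{64}$/ // assumption: lowercase hex
const SOURCES = ['agent_search', 'lessons', 'manual']

/** Hypothetical checker: returns the names of failing fields, empty when valid. */
function checkGapSignalPayload(p: GapSignalPayload): string[] {
  const problems: string[] = []
  if (p.topic.length > 80 || !TOPIC_RE.test(p.topic)) problems.push('topic')
  if (!SOURCES.includes(p.source)) problems.push('source')
  // The literal "lessons" is only accepted together with source=lessons.
  const lessonsLiteral = p.source === 'lessons' && p.project_id === 'lessons'
  if (!lessonsLiteral && !SHA256_HEX_RE.test(p.project_id)) problems.push('project_id')
  if (p.agent_excerpt !== undefined && p.agent_excerpt.length > 200) {
    problems.push('agent_excerpt')
  }
  return problems
}
```

Returning field names instead of throwing keeps the sketch easy to test; a schema library would report richer error paths.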
214
+ :::callout{type=tip}
215
+ **Suppressing emission in tests/CI.** Set `SCAFFOLD_GAP_SIGNAL_QUIET=1`. The
216
+ assembly-time tail (`src/core/assembly/gap-signal-tail.ts`) then renders no
217
+ emission template into the pipeline step. Default is always-on (locked decision
218
+ #9) — catch gaps everywhere they occur.
219
+ :::
220
+
221
+ ### KnowledgeRootResolution shape
222
+
223
+ The resolver returns a three-field record that threads through the audit run
224
+ (:cite[src/observability/knowledge-index.ts:275-291]):
225
+
226
+ ```ts
227
+ export interface KnowledgeRootResolution {
228
+ /** Validated absolute path to a knowledge directory, or null. */
229
+ root: string | null
230
+ /** Pre-loaded index Set, populated by the validator. Null when root is null.
231
+ Lens I reads this directly — no re-walk. */
232
+ index: Set<string> | null
233
+ /** Audit trail of what was tried. Lens I uses this to compose a precise
234
+ warn-once message when root is null. */
235
+ attempts: KnowledgeRootAttempt[]
236
+ }
237
+ ```
238
+
239
+ ## From candidate to merged PR
240
+
241
+ The cron is a thin bash loop — the brains live in three CLI subcommands and a
242
+ meta-prompt that runs a grounded LLM against pre-fetched source bodies.
243
+
244
+ ### Prefilter
245
+
246
+ An entry becomes a candidate when (1) it has at least one source, AND (2) either
247
+ its `last-reviewed` is older than the cadence window, OR a source's prefetched
248
+ hash differs from the stored one. Candidates sort by descending score: unreviewed
248
+ entries score 100, overdue entries score `50 + ageDays` (so the oldest rank highest, and
249
+ per the excerpt below any overdue entry older than 50 days outranks an unreviewed one), and
250
+ in-window hash changes score 75; the top `--max` win
251
+ (:cite[src/knowledge-freshness/audit-prefilter.ts:14-72]):
252
+
253
+ ```ts
254
+ for (const e of entries) {
255
+ if (e.sources.length === 0) continue // no sources = no audit
256
+ if (!e.lastReviewed) { select = true; priority = 100 }
257
+ else if (ageDays > window) { select = true; priority = 50 + ageDays }
258
+ else {
259
+ // hash check — Promise.all over a small per-entry list (1-3 sources)
260
+ if (anyHashChanged) { select = true; priority = 75 }
261
+ }
262
+ }
263
+ candidates.sort((a, b) => b.priority - a.priority)
264
+ return candidates.slice(0, max)
265
+ ```
266
+
267
+ The hash check is a tiebreaker, not a baseline. Entries already *past* their
268
+ window are selected immediately at priority `50 + ageDays` — no network cost.
269
+ The hash check only runs in the `else` branch (still *inside* the window), runs
270
+ `Promise.all` over the entry's 1–3 sources, and swallows fetch errors so a slow
271
+ upstream doesn't crash the cron.
272
+
273
+ ### Audit verdicts
274
+
275
+ The meta-prompt at `content/tools/knowledge-audit-entry.md` instructs the LLM to
276
+ read pre-fetched source bodies (no web tool available) and emit one of four
277
+ verdicts. **Every verdict opens a PR** — the dry-run apply runs first so gates
278
+ can inspect the proposed diff, then `--open-pr` creates the branch.
279
+
280
+ | Verdict | What the PR contains |
281
+ | --- | --- |
282
+ | `current` | Frontmatter-only: bumps `last-reviewed`, `sources[*].hash`, `sources[*].retrieved` so the entry exits the queue. |
283
+ | `minor-drift` | Frontmatter persistence + findings table as commentary. `applyVerdictToEntry` refuses any `proposed_changes` on this verdict (:cite[src/knowledge-freshness/audit-apply.ts:54-58]); no body edits. |
284
+ | `major-drift` | Body edits land via `proposed_changes` (H2-heading-anchored splices). Gate 4 blocks if a stable entry's diff deletes more than 20% of lines without the `override:anti-over-rewrite` label. |
285
+ | `superseded` | A new edition shipped; `version-pin` must advance. `last-reviewed` does **not** advance (:cite[src/knowledge-freshness/audit-apply.ts:103-118]) — only `hash`/`retrieved` update, so the entry stays due until a human re-audits. Prevents a known-stale entry from looking fresh. |
286
+
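The frontmatter side of the verdict table can be sketched as below. The types and the name `applyFrontmatter` are hypothetical, and the sketch assumes that `minor-drift` and `major-drift` advance `last-reviewed` the same way `current` does; only the `superseded` exception and the `minor-drift` refusal are stated in this guide.

```typescript
type Verdict = 'current' | 'minor-drift' | 'major-drift' | 'superseded'

interface SourceRef {
  url: string
  hash: string
  retrieved: string
}

interface Freshness {
  lastReviewed: string | null
  sources: SourceRef[]
}

/**
 * Hypothetical sketch. `freshHashes` maps a source URL to the sha256 of the
 * body fetched during this audit; `today` is YYYY-MM-DD.
 */
function applyFrontmatter(
  fm: Freshness,
  verdict: Verdict,
  today: string,
  freshHashes: Map<string, string>,
  proposedChangeCount: number,
): Freshness {
  if (verdict === 'minor-drift' && proposedChangeCount > 0) {
    throw new Error('minor-drift must not carry proposed_changes')
  }
  const sources = fm.sources.map((s) => ({
    ...s,
    hash: freshHashes.get(s.url) ?? s.hash,
    retrieved: today,
  }))
  // superseded keeps the old review date so the entry stays due.
  const lastReviewed = verdict === 'superseded' ? fm.lastReviewed : today
  return { lastReviewed, sources }
}
```

Returning a new object rather than mutating the input mirrors the dry-run-then-apply flow, where the same entry is evaluated twice.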
287
+ ### PR generation
288
+
289
+ Branch: `knowledge-freshness/<entry>-<YYYY-MM-DD>`. `renderPrBody` renders a
290
+ summary, the verdict fields, a findings table, the sources, and any preserve
291
+ warnings (it does not embed the raw verdict JSON). Each candidate gets
292
+ its own PR off `origin/main`; the cron runs `git checkout main` between iterations
293
+ and restores the entry between the dry-run apply (for gates) and the final
294
+ `--open-pr` call. PRs do not stack; failures isolate per-candidate.
295
+
296
+ #### VERSION bump on merge
297
+
298
+ A dedicated workflow
299
+ (:cite[.github/workflows/knowledge-freshness-version-bump.yml:16]) fires on PR
300
+ `closed` (merged-only) when the source branch starts with `knowledge-freshness/`
301
+ *or* the PR carries the `knowledge-freshness` label. It computes the next SemVer
302
+ from the PR title and body, writes `content/knowledge/VERSION`, commits with the
303
+ prefix `chore(knowledge):` (deliberately not `knowledge-freshness/*`) so the
304
+ commit doesn't re-trigger itself, then `git pull --rebase` before pushing. Bump
305
+ rules (:cite[src/knowledge-freshness/bump-version.ts:26-45]):
306
+
307
+ | Match | Bump | Notes |
308
+ | --- | --- | --- |
309
+ | `BREAKING CHANGE:` anywhere in title, or start-of-line in body | major | Wins over every other prefix |
310
+ | `feat(knowledge):` / `feat(knowledge-freshness):` title prefix | minor | Case-sensitive |
311
+ | `chore(knowledge):` / `chore(knowledge-freshness):` title prefix | patch | Used by the bump commit itself |
312
+ | Anything else (including `fix(knowledge):`) | patch | Logs a `::notice::` for unrecognized prefixes |
313
+
314
+ The start-of-line anchor on the BREAKING CHANGE body match (`/^BREAKING
315
+ CHANGE:/m`) is deliberate — a freshness PR's body embeds an LLM-generated
316
+ findings table whose evidence excerpts could otherwise mention "BREAKING CHANGE:"
317
+ and trigger an accidental major bump.
318
+
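The bump table translates into a short decision function. This is a sketch of the rules as listed, not the contents of `bump-version.ts`; `pickBump` and `applyBump` are hypothetical names, and the plain `MAJOR.MINOR.PATCH` format for `VERSION` is an assumption.

```typescript
type Bump = 'major' | 'minor' | 'patch'

/** Hypothetical: choose the bump level from PR title and body. */
function pickBump(title: string, body: string): Bump {
  // Anywhere in the title, but only at the start of a line in the body.
  if (title.includes('BREAKING CHANGE:') || /^BREAKING CHANGE:/m.test(body)) {
    return 'major'
  }
  if (/^feat\((knowledge|knowledge-freshness)\):/.test(title)) return 'minor'
  return 'patch' // chore(...) and everything else, including fix(knowledge):
}

/** Hypothetical: apply a bump to a MAJOR.MINOR.PATCH string. */
function applyBump(version: string, bump: Bump): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version.trim())
  if (match === null) throw new Error(`not a plain SemVer string: ${version}`)
  const major = Number(match[1])
  const minor = Number(match[2])
  const patch = Number(match[3])
  if (bump === 'major') return `${major + 1}.0.0`
  if (bump === 'minor') return `${major}.${minor + 1}.0`
  return `${major}.${minor}.${patch + 1}`
}
```

The `m` flag on the body pattern is what lets a table cell that merely mentions "BREAKING CHANGE:" mid-line fall through to a patch bump.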
319
+ ### MMR corroboration (manual)
320
+
321
+ The cron does *not* dispatch MMR today — the workflow only runs inline gates. To
322
+ corroborate a freshness PR locally:
323
+
324
+ ```bash
325
+ git diff origin/main...HEAD -- 'content/knowledge/**/*.md' \
326
+ | mmr review --diff - --focus knowledge-freshness --sync --format json
327
+ ```
328
+
329
+ A native `knowledge-freshness` MMR channel is the Phase 5 plan. See the
330
+ [MMR guide](../mmr/index.md) for the channel architecture.
331
+
332
+ ## The five PR gates
333
+
334
+ The cron's `GITHUB_TOKEN`-opened PRs don't fire downstream workflows, so the
335
+ cron also runs the gate code inline (same CLI surface). Human-opened freshness
336
+ PRs get gated by the workflow at
337
+ :cite[.github/workflows/knowledge-freshness-gates.yml:17].
338
+
339
+ :::filter-table
340
+ | # | Gate | What it checks | Mode | Source |
341
+ | --- | --- | --- | --- | --- |
342
+ | 1 | Frontmatter validator | Zod schema parse over every entry (excludes README). Strict calendar-date refinement; SSRF guard on source URLs. | :sev[blocking]{level=p0} | :cite[src/validation/knowledge-frontmatter-validator.ts:42-50] |
343
+ | 2 | Source link-check | Every `sources[*].url` returns 2xx. Operates on the changed-files list via `--files-from`. | :sev[blocking]{level=p0} | :cite[.github/workflows/knowledge-freshness-gates.yml:117-123] |
344
+ | 3 | Unsourced-claims lint | New normative claims must have a `sources[]` entry. Runs even when Gate 1 or Gate 2 failed. | :sev[advisory]{level=p3} | :cite[.github/workflows/knowledge-freshness-gates.yml:126-135] |
345
+ | 4 | Anti-over-rewrite | Stable entries reject diffs deleting >20% of lines unless the `override:anti-over-rewrite` label is applied. Cron-opened `knowledge-freshness/*` branches only. | :sev[blocking]{level=p1} | :cite[.github/workflows/knowledge-freshness-gates.yml:137-152] |
346
+ | 5 | Deep Guidance preserved | Literal `## Deep Guidance` heading must survive — the assembly engine pulls just that section. | :sev[blocking]{level=p0} | :cite[.github/workflows/knowledge-freshness-gates.yml:154-160] |
347
+ :::
348
+
349
+ :::callout{type=warning}
350
+ **Spec drift on the Gate 4 override.** The parent spec describes the override as
351
+ a marker in the PR *description*; the shipped mechanism
352
+ (:cite[.github/workflows/knowledge-freshness-gates.yml:148-152]) reads a
353
+ maintainer-applied PR *label* (`override:anti-over-rewrite`) via `--pr-labels`.
354
+ The shipped behavior is authoritative; the spec text is stale.
355
+ :::
356
+
357
+ :::callout{type=note}
358
+ **Anti-tamper checkout (known gap).** The gate workflow builds the gate code from
359
+ HEAD, not from `origin/main`
360
+ (:cite[.github/workflows/knowledge-freshness-gates.yml:42-53]). The hardening —
361
+ build from base, overlay only PR HEAD's `content/knowledge/` — is deferred
362
+ because the bootstrap PR introduced the gate code itself. Risk is mitigated by
363
+ mandatory PR review until a follow-up flips the checkout strategy.
364
+ :::
365
+
366
+ ## Lens I — gap detection + suppression
367
+
368
+ Lens I runs under `--scope=docs` and `--scope=all`
369
+ (:cite[src/observability/checks/lens-i-knowledge-gaps.ts:43]). It collects
370
+ signals from the ledger (rolling 90-day window,
371
+ :cite[src/observability/checks/lens-i-knowledge-gaps.ts:52]) plus synthetic
372
+ signals from `tasks/lessons.md`, buckets them by normalized topic, applies the
373
+ threshold matrix, and suppresses buckets whose topic an entry already covers.
374
+
375
+ :::callout{type=note}
376
+ **Where Lens I sits in the taxonomy.** "Lens" is scaffold's name for an audit
377
+ check function inside `scaffold observe audit`. The full set is A–I; Lens I
378
+ (`I-knowledge-gaps`) is this one. The other seven plus Lens H are documented in
379
+ the [Build Observability guide](../observability/index.md).
380
+ :::
381
+
382
+ ### Threshold matrix
383
+
384
+ The rules (:cite[src/observability/checks/lens-i-knowledge-gaps.ts:148-149]):
385
+
386
+ | signal_count | distinct_projects | Severity |
387
+ | --- | --- | --- |
388
+ | ≥ 5 | ≥ 3 | :sev[P1]{level=p1} |
389
+ | ≥ 3 | ≥ 2 | :sev[P2]{level=p2} |
390
+ | below both | — | no finding |
391
+
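Read top to bottom, the matrix is a two-step check. The sketch below is illustrative; `classify` and `severityForBucket` are hypothetical names. It assumes synthetic `lessons` signals still count toward `signal_count` while being excluded from the distinct-project count, as described under the lessons scanner section.

```typescript
type Severity = 'P1' | 'P2' | null

interface Signal {
  topic: string
  projectId: string
}

/** Threshold matrix from the table above; P1 is checked first. */
function classify(signalCount: number, distinctProjects: number): Severity {
  if (signalCount >= 5 && distinctProjects >= 3) return 'P1'
  if (signalCount >= 3 && distinctProjects >= 2) return 'P2'
  return null
}

/** Hypothetical aggregation for one topic bucket. */
function severityForBucket(signals: Signal[]): Severity {
  const projects = new Set(signals.map((s) => s.projectId))
  projects.delete('lessons') // synthetic signals are not a distinct project
  return classify(signals.length, projects.size)
}
```

Under these rules a bucket with 4 signals across 2 projects lands on P2, and a bucket with many signals from a single project produces no finding.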
392
+ ### Topic normalization
393
+
394
+ Lens I normalizes the raw topic before bucketing, then validates the result.
395
+ Two distinct steps: `normalizeTopic`
396
+ (:cite[src/observability/checks/lens-i-lessons-scanner.ts:32-38]) always
397
+ produces a (possibly empty) string; `isValidTopic`
398
+ (:cite[src/observability/checks/lens-i-lessons-scanner.ts:114-116]) decides
399
+ whether to accept it. Normalization lowercases, strips apostrophes, replaces
400
+ every other non-slug run with a single hyphen, collapses repeats, and trims. The
401
+ validator additionally enforces ≤ 80 chars and `/^[a-z0-9]+(-[a-z0-9]+)*$/`.
402
+ So `Agent Eval Harnesses!` → `agent-eval-harnesses` (valid), but `!!!` → an empty string (rejected).
403
+
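A minimal sketch of the two steps, written from the prose description above and not copied from `lens-i-lessons-scanner.ts`. The `Sketch` suffix marks both functions as illustrative, and stripping the curly apostrophe as well as the straight one is an assumption.

```typescript
const TOPIC_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/

/** Always returns a string, possibly empty. */
function normalizeTopicSketch(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/['\u2019]/g, '') // strip apostrophes (curly form is an assumption)
    .replace(/[^a-z0-9]+/g, '-') // every other non-slug run becomes one hyphen
    .replace(/-{2,}/g, '-') // collapse repeats
    .replace(/^-+|-+$/g, '') // trim leading and trailing hyphens
}

/** Decides whether a normalized topic is accepted. */
function isValidTopicSketch(topic: string): boolean {
  return topic.length <= 80 && TOPIC_RE.test(topic)
}

const accepted = ['Agent Eval Harnesses!', '!!!']
  .map(normalizeTopicSketch)
  .filter(isValidTopicSketch)
```

Keeping the two functions separate means callers can log the normalized form even when the validator rejects it.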
404
+ ### What the lessons.md scanner sees
405
+
406
+ Lens I synthesizes signals from `tasks/lessons.md` at audit time (read inline, no
407
+ ledger writes) via two passes per non-fenced line — code-fenced blocks are
408
+ skipped (:cite[src/observability/checks/lens-i-lessons-scanner.ts:4]):
409
+
410
+ 1. **Explicit marker** — `<!-- gap-topic: <slug> -->` (slug must already be
411
+ kebab-case; the marker regex enforces it).
412
+ 2. **Heuristic phrases** (case-insensitive): *"would have helped to have a guide
413
+ on X"*, *"missing knowledge entry for X"*, *"no knowledge entry for X"* /
414
+ *"no kb entry for X"*, *"missing knowledge: X"*.
415
+
416
+ Captured topics run through the same `normalizeTopic` / `isValidTopic`.
417
+ Synthetic signals carry `project_id: "lessons"` and are **excluded** from the
418
+ distinct-projects count by the aggregator's `delete('lessons')` rule (decision
419
+ #6) — they corroborate but don't independently satisfy the threshold.
420
+
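The two passes can be sketched as a line scanner. The regexes below are reconstructed from the phrases listed above and are assumptions, not the shipped patterns; `scanLessonsSketch` is a hypothetical name, and it returns raw captures that would still need normalization and validation.

```typescript
// Built at runtime so this sample nests safely inside Markdown.
const FENCE = '`'.repeat(3)

const MARKER_RE = /<!--\s*gap-topic:\s*([a-z0-9]+(?:-[a-z0-9]+)*)\s*-->/
const PHRASE_RES: RegExp[] = [
  /would have helped to have a guide on\s+(.+)/i,
  /missing knowledge entry for\s+(.+)/i,
  /no (?:knowledge|kb) entry for\s+(.+)/i,
  /missing knowledge:\s*(.+)/i,
]

/** Hypothetical scanner: returns raw topic captures from non-fenced lines. */
function scanLessonsSketch(markdown: string): string[] {
  const topics: string[] = []
  let inFence = false
  for (const line of markdown.split('\n')) {
    if (line.trimStart().startsWith(FENCE)) {
      inFence = !inFence // toggle on every fence line and skip it
      continue
    }
    if (inFence) continue
    const marker = MARKER_RE.exec(line)
    if (marker !== null) topics.push(marker[1])
    for (const re of PHRASE_RES) {
      const hit = re.exec(line)
      if (hit !== null) {
        topics.push(hit[1].trim())
        break // one heuristic capture per line
      }
    }
  }
  return topics
}
```

A capture such as `retry budgets` only becomes the bucket key `retry-budgets` after the normalization step.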
421
+ ### 3-tier `--knowledge-root` resolution
422
+
423
+ Lens I must know where the KB lives to skip already-covered topics. The resolver
424
+ (`resolveKnowledgeRoot` at :cite[src/observability/knowledge-index.ts:326-379])
425
+ tries three tiers in order:
426
+
427
+ | Tier | Source | On failure |
428
+ | --- | --- | --- |
429
+ | 1 | `--knowledge-root` CLI flag (resolved against `process.cwd()`) | **hard error** before any lens runs (`KnowledgeRootCliInvalidError`) |
430
+ | 2 | `lenses.I-knowledge-gaps.knowledge_root` in yaml (resolved against cwd) | soft-fail; records `{outcome: 'invalid', reason}` in the attempts trail |
431
+ | 3 | auto-detect — `findScaffoldKnowledgeRoot` walks parents for `package.json#name === '@zigrivers/scaffold'` (:cite[src/observability/knowledge-index.ts:164-178]) | returns `null` if no install is found |
432
+
433
+ The sharp asymmetry is intentional: an operator who *typed* a `--knowledge-root`
434
+ gets a hard error on a bad path; yaml and auto-detect soft-fail so suppression
435
+ degrades gracefully. The most instructive case: yaml invalid + auto-detect found
436
+ — the trail records the yaml failure *and* the auto-detect success, root is the
437
+ auto-detect path, and a one-line stderr note points at the stale yaml. This is
438
+ what an operator sees when `npm update -g @zigrivers/scaffold` moved the install
439
+ out from under a pinned yaml path.
440
+
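The tier order and the hard-versus-soft failure split can be sketched with injected dependencies so the example needs no filesystem. Everything here is illustrative: the types, the `'not-found'` outcome, and `resolveRootSketch` are hypothetical, and the real `resolveKnowledgeRoot` also returns a pre-loaded `index`.

```typescript
interface Attempt {
  tier: 'cli' | 'yaml' | 'auto-detect'
  path: string | null
  outcome: 'ok' | 'invalid' | 'not-found'
  reason?: string
}

interface Resolution {
  root: string | null
  attempts: Attempt[]
}

interface ResolveInput {
  cliPath?: string
  yamlPath?: string
  /** Returns a reason when the path is unusable, or null when it is valid. */
  validate: (path: string) => string | null
  /** Returns the detected root, or null when no install is found. */
  autoDetect: () => string | null
}

function resolveRootSketch(input: ResolveInput): Resolution {
  const attempts: Attempt[] = []

  if (input.cliPath !== undefined) {
    const reason = input.validate(input.cliPath)
    // Tier 1: a typed flag that fails validation is a hard error.
    if (reason !== null) throw new Error(`--knowledge-root invalid: ${reason}`)
    attempts.push({ tier: 'cli', path: input.cliPath, outcome: 'ok' })
    return { root: input.cliPath, attempts }
  }

  if (input.yamlPath !== undefined) {
    const reason = input.validate(input.yamlPath)
    if (reason === null) {
      attempts.push({ tier: 'yaml', path: input.yamlPath, outcome: 'ok' })
      return { root: input.yamlPath, attempts }
    }
    // Tier 2: soft-fail, record why, and fall through.
    attempts.push({ tier: 'yaml', path: input.yamlPath, outcome: 'invalid', reason })
  }

  // Tier 3: auto-detect, which may legitimately find nothing.
  const detected = input.autoDetect()
  attempts.push({
    tier: 'auto-detect',
    path: detected,
    outcome: detected === null ? 'not-found' : 'ok',
  })
  return { root: detected, attempts }
}
```

In the stale-yaml case the trail holds two attempts, an `invalid` yaml entry followed by a successful auto-detect, which is enough to compose the one-line note.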
441
+ ### Warning policy
442
+
443
+ | Key | Status | When emitted |
444
+ | --- | --- | --- |
445
+ | `lens-i:no-root` | active | Lens I runs, no root resolved, lens enabled. Per-audit deduped via `warnedKeys: Set<string>`. If yaml failed validation, the message gains a clause quoting the bad path + reason. |
446
+ | `lens-i:index-load-failed` | reserved | Never emitted today — `validateKnowledgeRoot` exercises the loader at resolution time, foreclosing this path. |
447
+ | (none) | no-warn | Lens I disabled — resolver runs but no warning surfaces (decisions #4 / #11). |
448
+
449
+ `emitOnceForAudit` (:cite[src/observability/knowledge-index.ts:251-259]) reads a
450
+ caller-provided `Set` created fresh in each `runAudit`
451
+ (:cite[src/observability/engine/api.ts:114]), so the `--fix` flow's three
452
+ internal audits each get their own dedup scope.
453
+
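The per-audit dedup scope comes down to who owns the `Set`. A minimal sketch, assuming a signature that is not taken from `knowledge-index.ts`; `emitOnceSketch` and the `sink` callback are hypothetical.

```typescript
/**
 * Hypothetical: emit `message` through `sink` unless `key` was already
 * emitted for this audit. Returns true when the message was emitted.
 */
function emitOnceSketch(
  warnedKeys: Set<string>,
  key: string,
  message: string,
  sink: (message: string) => void,
): boolean {
  if (warnedKeys.has(key)) return false
  warnedKeys.add(key)
  sink(message)
  return true
}

const seen: string[] = []
const collect = (message: string): void => {
  seen.push(message)
}

// One audit: the second call with the same key is swallowed.
const firstAudit = new Set<string>()
emitOnceSketch(firstAudit, 'lens-i:no-root', 'no knowledge root resolved', collect)
emitOnceSketch(firstAudit, 'lens-i:no-root', 'no knowledge root resolved', collect)

// A later audit owns a fresh Set, so the warning surfaces again.
const secondAudit = new Set<string>()
emitOnceSketch(secondAudit, 'lens-i:no-root', 'no knowledge root resolved', collect)
```

Passing the `Set` in, instead of keeping it in module state, is what gives each of the `--fix` flow's audits an independent scope.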
454
+ ### What a Lens I finding looks like
455
+
456
+ A single finding excerpt from the audit sidecar (`docs/audits/<id>.json`):
457
+
458
+ ```json
459
+ {
460
+ "id": "a3f2c1d4...",
461
+ "lens_id": "I-knowledge-gaps",
462
+ "severity": "P2",
463
+ "title": "Knowledge base lacks coverage for \"agent-eval-harnesses\" — 4 signals across 2 projects",
464
+ "source_doc": "",
465
+ "evidence": {
466
+ "kind": "knowledge_gap",
467
+ "topic": "agent-eval-harnesses",
468
+ "signal_count": 4,
469
+ "distinct_project_count": 2,
470
+ "distinct_projects": ["a3f2...", "1c4e..."],
471
+ "first_seen": "2026-04-12T09:00:00Z",
472
+ "last_seen": "2026-05-21T14:30:00Z",
473
+ "example_excerpts": ["No knowledge entry for agent eval harnesses"]
474
+ },
475
+ "confidence": "medium",
476
+ "fix_hint": {
477
+ "kind": "edit_doc",
478
+ "target": "content/knowledge/<category>/agent-eval-harnesses.md",
479
+ "prompt": "Propose a new knowledge entry for \"agent-eval-harnesses\". Evidence: 4 signals from 2 projects in the last 90 days."
480
+ }
481
+ }
482
+ ```
483
+
484
+ :::callout{type=warning}
485
+ **Phase audits don't trigger Lens I.** The phase-boundary hook
486
+ (`StateManager.markCompleted` → `runPhaseAudit` at
487
+ :cite[src/observability/engine/phase-audit.ts:63]) fires only Lens H-cross-doc
488
+ (`lensIds: ['H-cross-doc']` at :cite[src/observability/engine/phase-audit.ts:77]).
489
+ Lens I never runs at phase boundaries. A phase-audit run that surfaces zero
490
+ findings does **not** mean Lens I is happy — it means Lens I never ran. To see
491
+ Lens I findings, invoke `scaffold observe audit --scope=docs` (or `--scope=all`)
492
+ explicitly, or run it through `--fix`.
493
+ :::
494
+
495
+ ## The allowlist
496
+
497
+ Out-of-allowlist sources warn but don't block (decision #4). Bare hostnames
498
+ match subdomains; `host/path` entries additionally require the URL path to start
499
+ with the prefix; `github_repos` is locked to specific `owner/repo`.
500
+
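The host-matching rules above can be sketched as follows. This is an interpretation, not the shipped matcher: `isAllowlisted` is a hypothetical name, `example.com/docs` is a made-up entry, the sketch assumes `host/path` entries also match subdomains, and `github_repos` matching is left out.

```typescript
/** Hypothetical matcher for bare-host and host/path allowlist entries. */
function isAllowlisted(rawUrl: string, entries: string[]): boolean {
  const url = new URL(rawUrl)
  return entries.some((entry) => {
    const slash = entry.indexOf('/')
    const host = slash === -1 ? entry : entry.slice(0, slash)
    const pathPrefix = slash === -1 ? '' : entry.slice(slash)
    // A bare hostname matches itself and any subdomain.
    const hostOk = url.hostname === host || url.hostname.endsWith('.' + host)
    // A host/path entry additionally requires the path prefix.
    return hostOk && url.pathname.startsWith(pathPrefix)
  })
}

const entries = ['owasp.org', 'example.com/docs']
const results = [
  'https://cheatsheetseries.owasp.org/index.html',
  'https://example.com/blog/post',
].map((u) => isAllowlisted(u, entries))
```

Matching on `'.' + host` instead of a plain suffix keeps a lookalike such as `notowasp.org` from passing as a subdomain of `owasp.org`.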
501
+ The off-allowlist warning is **advisory** and is surfaced by the
502
+ frontmatter-validation path (`validateKnowledgeFile`), not by a gate. Gate 3
503
+ (`lint-unsourced`) is a separate advisory check that flags nearby links not
504
+ covered by the entry's declared `sources[]` domains. Off-allowlist sources still
505
+ get fetched, hashed, and audited — they just warn.
506
+ The allowlist is **not** a security boundary: the SSRF guard
507
+ (`src/knowledge-freshness/source-url-validator.ts`) runs independently, so a new
508
+ host never unlocks private-IP fetches. The editorial bar is: "would the
509
+ maintainers want this URL to be the verbatim grounding for a P0/P1 finding?"
510
+
511
+ ### Most-cited hosts
512
+
513
+ Counted live from every entry's `sources[*].url` at build time.
514
+
515
+ <!-- gen:host-citations -->
516
+ :::chart{type=bar}
517
+
518
+ | Host | Citations |
519
+ | --- | --- |
520
+ | martinfowler.com | 37 |
521
+ | developer.mozilla.org | 24 |
522
+ | owasp.org | 17 |
523
+ | developer.android.com | 15 |
524
+ | the-turing-way.netlify.app | 15 |
525
+ | developer.apple.com | 14 |
526
+ | developer.chrome.com | 14 |
527
+ | sre.google | 12 |
528
+ | w3.org | 12 |
529
+ | microservices.io | 11 |
530
+ | ethereum.org | 10 |
531
+ | rfc-editor.org | 10 |
532
+ | consensys.github.io | 9 |
533
+ | docs.openzeppelin.com | 9 |
534
+ | opentelemetry.io | 9 |
535
+ :::
536
+ <!-- /gen:host-citations -->
537
+
538
+ ### The full allowlist
539
+
540
+ Every host plus its category, and the pinned GitHub repos.
541
+
542
+ <!-- gen:allowlist -->
543
+ 47 allowlisted hosts and 3 GitHub repos. Out-of-list sources warn (they do not block).
544
+
545
+ :::filter-table
546
+ | Host | Category |
547
+ | --- | --- |
548
+ | `ai.google.dev` | ai-ml |
549
+ | `anthropic.com` | ai-ml |
550
+ | `docs.wandb.ai` | ai-ml |
551
+ | `mlflow.org` | ai-ml |
552
+ | `modelcontextprotocol.io` | ai-ml |
553
+ | `platform.openai.com` | ai-ml |
554
+ | `spec.graphql.org` | api |
555
+ | `spec.openapis.org` | api |
556
+ | `developer.chrome.com` | browser-ext |
557
+ | `docs.aws.amazon.com` | cloud-ops |
558
+ | `opentelemetry.io` | cloud-ops |
559
+ | `sre.google` | cloud-ops |
560
+ | `aicpa-cima.com` | compliance |
561
+ | `aicpa.org` | compliance |
562
+ | `eur-lex.europa.eu` | compliance |
563
+ | `pcisecuritystandards.org` | compliance |
564
+ | `www.finra.org` | compliance |
565
+ | `www.sec.gov` | compliance |
566
+ | `developer.android.com` | mobile |
567
+ | `developer.apple.com` | mobile |
568
+ | `adr.github.io` | patterns |
569
+ | `agilealliance.org` | patterns |
570
+ | `conventionalcommits.org` | patterns |
571
+ | `google.github.io` | patterns |
572
+ | `martinfowler.com` | patterns |
573
+ | `microservices.io` | patterns |
574
+ | `thoughtworks.com` | patterns |
575
+ | `the-turing-way.netlify.app` | research |
576
+ | `nist.gov` | security |
577
+ | `openid.net` | security |
578
+ | `owasp.org` | security |
579
+ | `consensys.github.io` | smart-contracts |
580
+ | `docs.openzeppelin.com` | smart-contracts |
581
+ | `docs.safe.global` | smart-contracts |
582
+ | `ethereum.org` | smart-contracts |
583
+ | `swcregistry.io` | smart-contracts |
584
+ | `ietf.org/rfc` | standards |
585
+ | `www.iso.org` | standards |
586
+ | `www.rfc-editor.org` | standards |
587
+ | `docs.pact.io` | testing |
588
+ | `docs.astral.sh` | tooling |
589
+ | `git-scm.com` | tooling |
590
+ | `peps.python.org` | tooling |
591
+ | `www.postgresql.org` | tooling |
592
+ | `developer.mozilla.org` | web-standards |
593
+ | `tr.designtokens.org` | web-standards |
594
+ | `www.w3.org` | web-standards |
595
+ :::
596
+
597
+ **GitHub repos:** `modelcontextprotocol/specification`, `steveyegge/beads`, `joelparkerhenderson/architecture-decision-record`
598
+ <!-- /gen:allowlist -->
599
+
600
+ ### KB inventory
601
+
602
+ Totals over `content/knowledge/`, broken down per category.
603
+
604
+ <!-- gen:kb-inventory -->
605
+ **266 entries** across 19 categories:
606
+
607
+ | Category | Entries |
608
+ | --- | --- |
609
+ | core | 35 |
610
+ | game | 25 |
611
+ | research | 25 |
612
+ | backend | 22 |
613
+ | review | 20 |
614
+ | web-app | 17 |
615
+ | web3 | 14 |
616
+ | data-science | 13 |
617
+ | browser-extension | 12 |
618
+ | data-pipeline | 12 |
619
+ | library | 12 |
620
+ | ml | 12 |
621
+ | mobile-app | 12 |
622
+ | cli | 10 |
623
+ | validation | 7 |
624
+ | product | 6 |
625
+ | execution | 5 |
626
+ | tools | 4 |
627
+ | finalization | 3 |
628
+ <!-- /gen:kb-inventory -->
629
+
630
+ ### How to expand the allowlist
631
+
632
+ Adding a host is a one-line PR to
633
+ `docs/knowledge-freshness/authoritative-sources.yaml`:
634
+
635
+ ```diff
636
+ hosts:
637
+ - owasp.org
638
+ + - developers.cloudflare.com
639
+ - nist.gov
640
+ ```
641
+
642
+ 1. **Pick the form.** Bare hostname for vendor docs whose path layout changes;
643
+ `host/path` prefix for shared-tenancy hosts where you only trust a sub-path;
644
+ `owner/repo` under `github_repos:` for specific GitHub repos. Skip `www.`
645
+ (bare entries auto-match subdomains).
646
+ 2. **Verify the host is live** — `curl -sI https://<host>/<path>` should return
647
+ 2xx (or a 3xx that ultimately resolves).
648
+ 3. **Mirror the category** in `CATEGORY_MAP` in
649
+ `scripts/build-freshness-reference.mjs` — otherwise the regenerated allowlist
650
+ table shows the new host as `other`.
651
+ 4. **Open a normal PR.** Allowlist additions are not a separate trust delegation;
652
+ any maintainer can review.
653
+
654
+ ## Anthropic vs DeepSeek (cron uses DeepSeek)
655
+
656
+ The cron switched to DeepSeek HTTP to remove the local `claude` CLI dependency
657
+ from CI. Local audits keep using whichever provider is configured. Precedence is
658
+ resolved by `resolveProvider` (:cite[src/knowledge-freshness/providers/index.ts:36]):
659
+
660
+ 1. `--provider <name>` — explicit flag, operator override
661
+ 2. `KNOWLEDGE_FRESHNESS_PROVIDER` env var
662
+ 3. Exactly one provider API key in env → inferred from that key
663
+ 4. Both API keys present → error (ambiguous)
664
+ 5. No env, `claude` on PATH → anthropic (subprocess uses keychain)
665
+ 6. Nothing → error (no provider configured)
666
+
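The precedence list can be read as a short decision function. The sketch below is a hedged illustration, not the body of `resolveProvider`: the `ResolveInput` shape and both function names are hypothetical, and the final check reflects the requirement, stated elsewhere in this section, that anthropic needs the `claude` CLI however it was chosen.

```typescript
type Provider = "anthropic" | "deepseek";

// Hypothetical input shape: the real resolver reads argv, env, and PATH itself.
interface ResolveInput {
  flag?: Provider;          // --provider <name>
  envProvider?: Provider;   // KNOWLEDGE_FRESHNESS_PROVIDER
  hasAnthropicKey: boolean; // ANTHROPIC_API_KEY is set
  hasDeepseekKey: boolean;  // DEEPSEEK_API_KEY is set
  claudeOnPath: boolean;    // `claude` CLI found on PATH
}

function pickProvider(input: ResolveInput): Provider {
  if (input.flag) return input.flag;                   // 1. explicit flag
  if (input.envProvider) return input.envProvider;     // 2. env var
  if (input.hasAnthropicKey && input.hasDeepseekKey) { // 4. ambiguous
    throw new Error("ambiguous: both API keys present");
  }
  if (input.hasAnthropicKey) return "anthropic";       // 3. single key, inferred
  if (input.hasDeepseekKey) return "deepseek";         // 3. single key, inferred
  if (input.claudeOnPath) return "anthropic";          // 5. CLI fallback
  throw new Error("no provider configured");           // 6. nothing usable
}

function resolveProviderSketch(input: ResolveInput): Provider {
  const picked = pickProvider(input);
  // anthropic always needs the CLI, whether picked by flag, env, or key inference.
  if (picked === "anthropic" && !input.claudeOnPath) {
    throw new Error("anthropic requires the `claude` CLI on PATH");
  }
  return picked;
}
```

Checking for both keys before inferring from a single key lets rules 3 and 4 share one code path without changing the outcome.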
667
+ ::::tabs
668
+ :::tab{title="Anthropic"}
669
+ Subprocess: `claude -p --tools ""` (empty-tools disables WebFetch so the model
670
+ can only read the prefetched bodies). **Requires the `claude` CLI on PATH
671
+ regardless of how the provider was chosen** — the resolver throws
672
+ (:cite[src/knowledge-freshness/providers/index.ts:44-56]) if anthropic is
673
+ picked via flag, env, or API-key inference and `claude` isn't installed.
674
+ `ANTHROPIC_API_KEY` alone is *not* sufficient. Source:
675
+ `src/knowledge-freshness/providers/anthropic.ts`.
676
+ :::
677
+ :::tab{title="DeepSeek"}
678
+ HTTP. No subprocess; works in CI without the Claude CLI.
679
+
680
+ - **Auth:** requires `DEEPSEEK_API_KEY`.
681
+ - **Default model:** `deepseek-v4-flash`.
682
+ - **Override:** set `KNOWLEDGE_FRESHNESS_DEEPSEEK_MODEL` to `deepseek-v4-pro`.
683
+ Other values throw at dispatcher-build time
684
+ (:cite[src/knowledge-freshness/providers/deepseek.ts:54-58]).
685
+ - **Thinking mode:** hardcoded `thinking: { type: 'disabled' }`.
686
+ - **URL:** hardcoded to `https://api.deepseek.com/chat/completions`;
687
+ project-local config cannot redirect (decision #7 invariant).
688
+ :::
689
+ ::::
690
+
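The DeepSeek model override described in the tab amounts to a two-value check. A minimal sketch follows, with a hypothetical function name. It takes the strict reading of "other values throw"; whether explicitly setting the default model is accepted is not stated in this guide, so that case is an assumption here.

```typescript
const DEFAULT_DEEPSEEK_MODEL = "deepseek-v4-flash";
const OVERRIDE_DEEPSEEK_MODEL = "deepseek-v4-pro";

// `override` stands in for the value of KNOWLEDGE_FRESHNESS_DEEPSEEK_MODEL.
function resolveDeepseekModel(override: string | undefined): string {
  if (override === undefined) return DEFAULT_DEEPSEEK_MODEL;
  if (override === OVERRIDE_DEEPSEEK_MODEL) return override;
  // Assumption: anything else, including the default spelled out, is rejected.
  throw new Error("unsupported DeepSeek model: " + override);
}
```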
691
+ :::callout{type=danger}
692
+ **Why the DeepSeek URL is hardcoded.** An untrusted project's
693
+ `.scaffold/observability.yaml` could otherwise redirect the LLM dispatcher to an
694
+ attacker-controlled host that captures `DEEPSEEK_API_KEY` from request headers.
695
+ Hardcoding closes that exfiltration path — the same threat model that hardcodes
696
+ Lens H's `claude -p` command in the Build Observability audit.
697
+ :::
698
+
699
+ The cron wires DeepSeek explicitly
700
+ (:cite[.github/workflows/knowledge-freshness-audit.yml:70]):
701
+
702
+ ```yaml
703
+ env:
704
+ DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
705
+ KNOWLEDGE_FRESHNESS_PROVIDER: deepseek
706
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
707
+ ```
708
+
709
+ A missing `DEEPSEEK_API_KEY` fails the run loudly at preflight rather than
710
+ silently exiting 0 with zero PRs.
711
+
712
+ ## Every command that touches the system
713
+
714
+ All commands ship in the published CLI.
715
+
716
+ ### Refresh-arm commands
717
+
718
+ | Command | Purpose |
719
+ | --- | --- |
720
+ | `scaffold knowledge-freshness audit-prefilter [--max=N]` | Walk `content/knowledge/`, apply cadence + hash check, print a JSON candidate array. `--max` default 10 (:cite[src/cli/commands/knowledge-freshness-audit-prefilter.ts:18]); the CLI emits only `{ name, path }` per candidate (:cite[src/cli/commands/knowledge-freshness-audit-prefilter.ts:43]). |
721
+ | `scaffold knowledge-freshness audit-run-entry <path>` | Pre-fetch each source through SSRF guards, dispatch the grounded audit, print verdict JSON. `--provider anthropic\|deepseek` overrides env precedence. |
722
+ | `scaffold knowledge-freshness audit-apply <path> <verdict.json> [--open-pr]` | Patch frontmatter + apply `proposed_changes` by H2 heading. The wrapper re-fetches every checked URL and computes its own sha256 (:cite[src/knowledge-freshness/audit-apply.ts:82-101]), so persisted hashes are deterministic, not the LLM's claim. Refuses to advance `last-reviewed` unless every declared source is covered. |
723
+ | `make validate-knowledge` | Gate 1 — runs the Zod validator over every entry (README excluded). |
724
+
725
+ ### Gap-arm commands
726
+
727
+ | Command | Purpose |
728
+ | --- | --- |
729
+ | `scaffold observe audit --lens I-knowledge-gaps [--knowledge-root <path>] [--fix]` | Run the gap-detection lens against the local ledger + `tasks/lessons.md`. `--knowledge-root` overrides yaml + auto-detect for suppression. `--fix` dispatches the fix flow; the override threads through all three audits. |
730
+ | `scaffold observe event knowledge_gap_signal --topic=<slug> --source=<…> --project-id=<sha> …` | Write one validated gap signal to the ledger. Used by the assembly-time tail and by operators backfilling synthetic signals. |
731
+ | `scaffold observe ack <prefix-or-id>` | Acknowledge (or reopen) a finding so it stops surfacing. Use when a Lens I topic is deliberately out of scope. |
732
+
733
+ The `--fix` flow (`runFixFlow` at :cite[src/observability/engine/fix-flow.ts:71])
734
+ runs a three-audit loop: (1) the initial audit produces a fix plan; (2) for each
735
+ blocking finding, dispatch a fix agent then re-audit just that finding (the
736
+ verifier); (3) one postfix audit runs everything for the final report. The
737
+ `--knowledge-root` override threads into all three (decision #20) so suppression
738
+ is consistent throughout.
739
+
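The three-audit loop can be sketched with the audit and the fix agent injected as functions. This is a simplified, synchronous illustration and not `runFixFlow` itself: the `Finding` and `FixDeps` shapes are hypothetical, and the retry bound is an assumption modeled on the `per_finding_max_attempts` setting in the config reference.

```typescript
interface Finding { id: string; blocking: boolean; }

// Hypothetical dependencies; the real flow dispatches subprocesses and is async.
interface FixDeps {
  audit(onlyFindingId?: string): Finding[]; // full audit, or re-check one finding
  dispatchFix(finding: Finding): void;      // hand one finding to a fix agent
}

function runFixFlowSketch(deps: FixDeps, maxAttempts: number): Finding[] {
  // (1) The initial audit produces the fix plan: every blocking finding.
  const plan = deps.audit().filter((f) => f.blocking);

  // (2) Per blocking finding: dispatch a fix, then re-audit just that finding.
  for (const finding of plan) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      deps.dispatchFix(finding);
      const stillPresent = deps.audit(finding.id).some((f) => f.id === finding.id);
      if (!stillPresent) break; // verifier passed, move on
    }
  }

  // (3) One postfix audit over everything for the final report.
  return deps.audit();
}
```

With two blocking findings that each verify on the first attempt, the sketch calls `audit` four times: one initial, two verifiers, one postfix.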
740
+ ### Gate-side subcommands (also runnable locally for triage)
741
+
742
+ | Command | Gate | Purpose |
743
+ | --- | --- | --- |
744
+ | `knowledge-freshness link-check [<path>] [--files-from <json>]` | 2 | HTTP-HEAD every `sources[*].url`; 2xx passes, else exit 1. |
745
+ | `knowledge-freshness lint-unsourced [<path>] [--files-from <json>] [--diff <patch>]` | 3 | Heuristic scan for normative language in new lines without a `sources[]` reference. Advisory: prints findings but always exits 0. |
746
+ | `knowledge-freshness anti-over-rewrite [--files-from <json>] [--diff <patch>] [--pr-labels <csv>]` | 4 | For each changed `stable` entry, compare deleted-line count to 20% of the body; exit 1 if crossed without `override:anti-over-rewrite`. The cron passes `--pr-labels ""` (it can't self-apply labels). |
747
+ | `knowledge-freshness deep-guidance-check [<path>] [--files-from <json>]` | 5 | Assert each changed entry still contains a `## Deep Guidance` heading (case-sensitive). |
748
+ | `knowledge-freshness bump-version --title <str> --body <str>` | — | Pure-function dry-run of `deriveBumpKind` + `bumpSemver`; prints `bump:` and `next:` lines parsed by the version-bump workflow. |
749
+
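Gate 4's decision reduces to a threshold check plus a label lookup. The sketch below is illustrative only: both function names are hypothetical, it assumes the body is measured in lines, and it assumes "crossed" means strictly greater than 20%.

```typescript
// Parse the --pr-labels CSV; an empty string (what the cron passes) yields no labels.
function parseLabels(csv: string): string[] {
  return csv.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
}

// Returns the process exit code for one changed `stable` entry.
function antiOverRewriteExitCode(deletedLines: number, bodyLines: number, prLabels: string[]): number {
  const threshold = 0.2 * bodyLines; // assumption: 20% of body lines
  const overridden = prLabels.includes("override:anti-over-rewrite");
  return deletedLines > threshold && !overridden ? 1 : 0;
}
```

Because the cron passes an empty label list, a cron-authored rewrite over the threshold always fails this gate until a human applies the override label.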
750
+ ## Operations cheat sheet
751
+
752
+ ### An entry's audit failed in the cron
753
+
754
+ The cron logs `audit failed for <name> — moving on` and continues; the entry
755
+ stays in tomorrow's queue. Causes: provider auth (key rotated), source URL now
756
+ 404s, a fetch/HTTP error, a body over the 5 MiB fetch-and-hash cap, a dispatcher error, or
757
+ LLM timeout. (A source body over the 96 KiB embed cap is **truncated** and
758
+ flagged `truncated: true` — it does not fail the audit.) Reproduce locally:
759
+
760
+ ```bash
761
+ DEEPSEEK_API_KEY=sk-… node dist/index.js knowledge-freshness \
762
+ audit-run-entry content/knowledge/<cat>/<name>.md
763
+ # read stderr to see if it's a URL issue or a provider issue
764
+ ```
765
+
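The difference between the two size caps can be made concrete. The following is a sketch under stated assumptions, not the project's fetch code: the function name and `PreparedBody` shape are hypothetical, and it covers only what gets embedded in the prompt, not how the hash is computed.

```typescript
const FETCH_CAP_BYTES = 5 * 1024 * 1024; // 5 MiB: exceeding this fails the audit
const EMBED_CAP_BYTES = 96 * 1024;       // 96 KiB: exceeding this only truncates

interface PreparedBody { bytes: Uint8Array; truncated: boolean; }

function prepareBodyForEmbed(body: Uint8Array): PreparedBody {
  if (body.byteLength > FETCH_CAP_BYTES) {
    throw new Error("source exceeds the 5 MiB fetch-and-hash cap");
  }
  if (body.byteLength > EMBED_CAP_BYTES) {
    // Keep the first 96 KiB and flag the result rather than failing.
    return { bytes: body.subarray(0, EMBED_CAP_BYTES), truncated: true };
  }
  return { bytes: body, truncated: false };
}
```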
766
+ ### Lens I keeps surfacing a topic the KB already covers
767
+
768
+ Suppression didn't match. Either the resolver returned `root: null` (look for
769
+ `[Lens I] knowledge-root not located` in stderr) or the entry's `name:` doesn't
770
+ normalize to the same slug as the bucket topic — the match is exact and
771
+ post-normalize.
772
+
773
+ ```bash
774
+ scaffold observe audit --lens I-knowledge-gaps --json \
775
+ --knowledge-root /path/to/content/knowledge \
776
+ | jq '.findings[] | select(.lens_id=="I-knowledge-gaps")'
777
+ # print the name: key from the first frontmatter block, wherever it sits in the block
+ awk '/^---$/{n++; next} n==1 && /^name:/' content/knowledge/<cat>/<slug>.md
778
+ ```
779
+
780
+ ### Downstream auto-detect can't find the KB
781
+
782
+ `findScaffoldKnowledgeRoot` walks parents from the CLI install's module location
783
+ looking for `package.json#name === '@zigrivers/scaffold'`. Symlinked or
784
+ repackaged installs can defeat the walk. Pin the path via the tier-2 yaml:
785
+
786
+ ```yaml
787
+ lenses:
788
+ I-knowledge-gaps:
789
+ knowledge_root: /opt/homebrew/lib/node_modules/@zigrivers/scaffold/content/knowledge
790
+ ```
791
+
792
+ ### Yaml `knowledge_root` stops working after an upgrade
793
+
794
+ The yaml tier soft-fails and records the reason in the attempts trail; Lens I
795
+ appends it to the warning. Validation requires all four: the path exists, is a
796
+ directory, contains a `<path>/VERSION` marker, and `loadKnowledgeIndex` runs
797
+ without throwing (an empty index is OK). The usual cause after an upgrade is a
798
+ moved install path:
799
+
800
+ ```bash
801
+ find / -name VERSION -path '*content/knowledge*' 2>/dev/null
802
+ ```
803
+
804
+ Then update `lenses.I-knowledge-gaps.knowledge_root` to the new path.
805
+
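The four validation conditions can be sketched as an ordered check that returns the first failure reason. This is an illustration with hypothetical names: `RootProbe` stands in for filesystem access and for `loadKnowledgeIndex`, and the reason strings are invented for the example rather than taken from the real attempts trail.

```typescript
// Hypothetical probe so the sketch stays free of filesystem imports.
interface RootProbe {
  exists(path: string): boolean;
  isDirectory(path: string): boolean;
  loadIndex(root: string): void; // stand-in for loadKnowledgeIndex; throws on failure
}

// Returns null when the root is valid, else a reason to record in the trail.
function validateKnowledgeRoot(root: string, probe: RootProbe): string | null {
  if (!probe.exists(root)) return "path does not exist";
  if (!probe.isDirectory(root)) return "path is not a directory";
  if (!probe.exists(root + "/VERSION")) return "missing VERSION marker";
  try {
    probe.loadIndex(root); // an empty index is fine; only a throw fails
  } catch {
    return "loadKnowledgeIndex threw";
  }
  return null;
}
```

A moved install path trips the first check, which matches the usual post-upgrade failure described above.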
806
+ ### A source URL fetches in `curl` but the cron rejects it
807
+
808
+ The SSRF guard re-resolves the hostname at fetch time and rejects any IP in a
809
+ non-globally-routable range (RFC1918, link-local, loopback, CGNAT, ULA,
810
+ IPv4-mapped IPv6, …). Common cause: an internal DNS view returning a private IP
811
+ for an outwardly-public hostname.
812
+
813
+ ```bash
814
+ node -e 'require("node:dns").promises.lookup("<host>", { all: true }).then(console.log)'
815
+ ```
816
+
817
+ Fix: move the source to a globally-routable host, or remove it. Allowlisting does
818
+ **not** bypass the SSRF guard.
819
+
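To make the rejected ranges concrete, here is a minimal IPv4-only sketch of the kind of check such a guard performs. It is not the code in `source-url-validator.ts`: the function name is hypothetical, it covers only the IPv4 ranges named above, and the real guard also handles IPv6 cases such as ULA and IPv4-mapped addresses.

```typescript
// Returns true when a dotted-quad address falls in one of the named private ranges.
function isBlockedIPv4(ip: string): boolean {
  const octets = ip.split(".");
  if (octets.length !== 4 || octets.some((o) => !/^\d{1,3}$/.test(o) || Number(o) > 255)) {
    return true; // not a dotted quad: fail closed
  }
  const a = Number(octets[0]);
  const b = Number(octets[1]);
  if (a === 10) return true;                         // RFC1918 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true;  // RFC1918 172.16.0.0/12
  if (a === 192 && b === 168) return true;           // RFC1918 192.168.0.0/16
  if (a === 169 && b === 254) return true;           // link-local 169.254.0.0/16
  if (a === 127) return true;                        // loopback 127.0.0.0/8
  if (a === 100 && b >= 64 && b <= 127) return true; // CGNAT 100.64.0.0/10
  return false;
}
```

Running each address printed by the `dns.lookup` command above through a check like this shows which resolved IP caused the rejection.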
820
+ ### `--knowledge-root` resolves to a path you didn't expect
821
+
822
+ Auto-detect may pick a stale npm-global install. The successful-resolution path
823
+ doesn't log its `attempts` trail today (only the failure path warns), so pin and
824
+ compare:
825
+
826
+ ```bash
827
+ scaffold observe audit --lens I-knowledge-gaps --json \
828
+ --knowledge-root /path/you/expected/content/knowledge \
829
+ | jq '.findings[] | select(.lens_id=="I-knowledge-gaps") | .evidence.topic'
830
+ # compare against a run without --knowledge-root; if the lists differ, auto-detect picked a different KB
831
+ ```
832
+
833
+ Fix: pin `lenses.I-knowledge-gaps.knowledge_root` in
834
+ `.scaffold/observability.yaml`. A pinned yaml path takes precedence over
835
+ auto-detect.
836
+
837
+ ## Config reference
838
+
839
+ Everything operator-tunable lives in `.scaffold/observability.yaml`. Anything
840
+ outside this list is hardcoded (decision #7 invariant) so an untrusted project
841
+ can't redirect dispatch commands or LLM URLs.
842
+
843
+ ```yaml
844
+ lenses:
845
+ I-knowledge-gaps:
846
+ knowledge_root: /path/to/content/knowledge # tier-2 resolver override
847
+
848
+ disabled_lenses: [I-knowledge-gaps] # opt-out
849
+
850
+ phase_audit:
851
+ enabled: true # default
852
+ timeout_s: 60
853
+ detached: false # fire-and-forget when true
854
+
855
+ fix:
856
+ dispatcher_command: "claude -p" # default
857
+ timeout_s: 300
858
+ per_finding_max_attempts: 3
859
+ ```
860
+
861
+ :::callout{type=warning}
862
+ **The daily audit ceiling is NOT in yaml.** The parent spec's decision #8 reads
863
+ "10 grounded audits per day; configurable via `.scaffold/observability.yaml`",
864
+ but the yaml knob was never implemented. The ceiling is the `--max=10` flag in
865
+ :cite[.github/workflows/knowledge-freshness-audit.yml:67]; the CLI default at
866
+ :cite[src/cli/commands/knowledge-freshness-audit-prefilter.ts:18] is the only
867
+ fallback. To lower it for your fork, edit the workflow — nothing in yaml will help.
868
+ :::
869
+
870
+ ## Roadmap and known divergences
871
+
872
+ ### Phase 5 (planned)
873
+
874
+ - **Native MMR `knowledge-freshness` channel** — runs automatically on freshness
875
+ PRs (today the cron dispatches no MMR).
876
+ - **Frontier scan** — augment cadence/hash triggers with a model-driven check:
877
+ "has the underlying technology meaningfully changed since `last-reviewed`?"
878
+ - **Taxonomy cross-reference** — detect when two entries assert contradictory
879
+ facts and route to a reviewer.
880
+
881
+ ### Known divergences
882
+
883
+ The reference page's own audits surfaced these doc-vs-code mismatches; the code
884
+ is ground truth:
885
+
886
+ - **MMR-in-cron framing** — three docs describe it differently; decision #3
887
+ (Phase 5 deferral) is authoritative.
888
+ - **Gate 4 override** — spec says PR-description marker; code reads a PR *label*.
889
+ - **Daily ceiling** — spec implies a yaml knob; only the `--max` flag exists.
890
+ - **`operations.md` lags** — labels the native MMR channel "Phase 4" and
891
+ describes suppression in future tense; the channel is deferred to Phase 5 and suppression has shipped.
892
+ - **`www.` prefix inconsistency** (P3) — mixed `www.` use in the allowlist; bare
893
+ entries already auto-match subdomains, so the prefix is redundant.