kushi-agents 5.5.1 → 5.6.1

package/README.md CHANGED
@@ -7,6 +7,10 @@
7
7
  [![host: VS Code](https://img.shields.io/badge/host-VS%20Code-007acc)](https://gim-home.github.io/kushi/)
8
8
  [![spec: agentskills.io](https://img.shields.io/badge/spec-agentskills.io-22c55e)](https://agentskills.io/skill-creation/best-practices)
9
9
 
10
+ > **v5.6.0 — Learning candidates.** Runners now auto-capture novel field errors into `<project>/Evidence/_learnings-candidates/` for human review and upstream promotion. Local-only (no telemetry, no auto-PR), 7-day dedup, classifier filters out user-side and transient errors. Closes the field-bug loop — published kushi installs can now contribute doctrine back. New probe `D48`, new concept doc [Learning candidates](https://gim-home.github.io/kushi/concepts/learning-candidates/), new how-to [Promote a learning candidate](https://gim-home.github.io/kushi/how-to/promote-learning-candidate/). `kushi share-learnings` (opt-in redacted upstream submission) lands in v5.7.0.
11
+
12
+ > **v5.5.0 — Deterministic runners.** Nine pull/orchestrator skills are now thin pointers to Node runners under `plugin/runners/`. The LLM picks scope; the runner does HTTP, file IO, week math, and writes evidence. New probes D44–D47 enforce the contract.
13
+
10
14
  > **v5.2.0 — Hooks + parallel pulls + OTel + teach + schema-evolve.** Pipeline events trigger configurable hooks (`.kushi/hooks/`); pull dispatch is parallel by default (4 workers); OpenTelemetry export is opt-in via `KUSHI_OTEL_ENDPOINT`; `kushi explain <topic>` teaches concepts; `kushi remember <rule>` persists conventions.
11
15
 
12
16
  > **v5.1.0 — Living wiki.** Build-state is now incremental: human edits outside `<!-- kushi:auto -->` fences are preserved, contradictions are flagged with Obsidian-compatible callouts (`> [!warning]`), and a new `lint-state` skill monitors wiki health. State/ is a valid [Obsidian](https://obsidian.md) vault — callout syntax, Dataview-compatible frontmatter, and `[[wikilinks]]` all work natively.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "kushi-agents",
3
- "version": "5.5.1",
3
+ "version": "5.6.1",
4
4
  "description": "Install Kushi — multi-source project evidence agent with Comprehensive Structured Capture (CSC) into weekly-only files across Email, Teams, OneNote, Loop, SharePoint, Meetings, CRM, ADO. Meetings retain a sibling verbatim/ audit folder. WorkIQ-only for M365 sources (Graph / m365_* FORBIDDEN as fallbacks; user-paste is first-class). Host-agnostic.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -0,0 +1,91 @@
1
+ ---
2
+ applyTo: "**/plugin/runners/**"
3
+ description: "Doctrine for v5.6.0 learning candidates — when runners write local markdown files capturing novel errors for later human promotion to plugin/learnings/<source>.md."
4
+ ---
5
+
6
+ # Learning candidates (v5.6.0)
7
+
8
+ When a runner hits an error worth remembering, it writes a **learning candidate** markdown file to the project's local Evidence dir. No telemetry. No auto-PR. A maintainer reviews candidates later and promotes the real ones to upstream `plugin/learnings/<source>.md`.
9
+
10
+ ## Where
11
+
12
+ ```
13
+ <engagement-root>/<project>/Evidence/_learnings-candidates/
14
+ YYYY-MM-DD-HHmm_<alias>_<source>_<short-sig>.md
15
+ _seen.json (hidden dedup ledger — do not edit)
16
+ ```
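As an illustration of the naming pattern above, the sketch below assembles a candidate file name from its parts. The two helpers mirror `safeSlug` and `timestampPrefix` from `plugin/runners/lib/learnings.mjs` as they appear later in this diff. `candidateFileName` is a hypothetical wrapper added here for illustration only; the package does not export it.

```javascript
// Mirrors safeSlug() in lib/learnings.mjs: lowercase, collapse every run of
// non-alphanumerics to '-', trim leading/trailing '-', cap the length.
function safeSlug(s, max = 40) {
  return String(s || '')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
    .slice(0, max) || 'unknown';
}

// Mirrors timestampPrefix(): local-time YYYY-MM-DD-HHmm.
function timestampPrefix(d = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}` +
    `-${pad(d.getHours())}${pad(d.getMinutes())}`;
}

// Hypothetical wrapper (not part of the package): joins the parts in the
// order the doctrine shows, <timestamp>_<alias>_<source>_<short-sig>.md.
function candidateFileName({ alias, source, signature, now = new Date() }) {
  const sigSlug = safeSlug(signature || 'unclassified', 30);
  return `${timestampPrefix(now)}_${safeSlug(alias, 20)}_${source}_${sigSlug}.md`;
}

const name = candidateFileName({
  alias: 'ushak',
  source: 'email',
  signature: 'fetch-failed',
  now: new Date(2026, 4, 25, 9, 7), // local time: 25 May 2026, 09:07
});
console.log(name); // 2026-05-25-0907_ushak_email_fetch-failed.md
```

Note that the prefix is built from local-time getters, while `captured_at` in the footer comes from `toISOString()` (UTC), so the two dates can differ for runs close to midnight.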
17
+
18
+ ## When to emit (runner responsibility)
19
+
20
+ The runner calls `emitLearningCandidate({ projectRoot, alias, source, entity, week, error, context })` from `plugin/runners/lib/learnings.mjs` in its catch path. The lib enforces the policy filter; callers always call, lib decides whether to write.
21
+
22
+ EMIT for:
23
+ - **Novel signatures** — anything not in the known taxonomy (user-side errors + transient HTTP). Most commonly: Graph/Dataverse returned an unexpected shape, a `$select` field came back null where it never had before, an entity-set name pluralization changed.
24
+ - **`body-unavailable` on 2nd+ sighting** for the same `(source, entity)` across runs. The first sighting is noise (could be a moved page, racing index). The second sighting is a quirk worth capturing.
25
+
26
+ DO NOT EMIT for:
27
+ | Signature | Why not |
28
+ |---|---|
29
+ | `bad-args`, `config-missing`, `config-invalid` | User-side — fix the config, not the runner. |
30
+ | `token-expired`, `auth-required`, `auth-failed` | User-side — re-auth. |
31
+ | `folder-not-found`, `entity-not-found` | User-side — typo in `boundaries.yml`. |
32
+ | `cross-tenant-blocked`, `permission-denied` | Tenant policy, not a kushi bug. |
33
+ | `fetch-failed` + HTTP 429/502/503/504/408 | Transient — runner already retried. |
34
+
35
+ ## Dedup
36
+
37
+ Same `<source>:<signature>:<fingerprint-8>` is not re-emitted within 7 days per project. Fingerprint is sha256 over `(source, signature, normalized-message)` where the message has hex blobs and long digit runs redacted. This means the same Graph 500 with a different correlation-id collapses to one candidate, but two genuinely different Graph 500s stay distinct.
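A small worked example may make the collapse behaviour concrete. The sketch below repeats only the message normalization step used by `fingerprint()` in `plugin/runners/lib/learnings.mjs` (the sha256 hashing is left out), plus the 7-day window comparison. The names `normalizeMessage` and `isWithinDedupWindow` and the sample messages are assumptions made for illustration; they are not part of the package.

```javascript
// Same normalization chain as fingerprint() in lib/learnings.mjs, without the hash.
function normalizeMessage(message) {
  return String(message || '')
    .replace(/[a-f0-9]{8,}/gi, '<hex>') // runs of 8+ hex characters
    .replace(/\d{4,}/g, '<n>')          // runs of 4+ digits
    .replace(/\s+/g, ' ')
    .trim()
    .toLowerCase()
    .slice(0, 200);
}

// Hypothetical helper: same comparison emitLearningCandidate() makes against _seen.json.
const DEDUP_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;
function isWithinDedupWindow(prevIso, now) {
  return (now.getTime() - new Date(prevIso).getTime()) < DEDUP_WINDOW_MS;
}

// Two sightings that differ only in a contiguous hex correlation id collapse.
const a = normalizeMessage('Graph 500: request 9f8e7d6c5b4a3f2e failed');
const b = normalizeMessage('Graph 500: request 0a1b2c3d4e5f6a7b failed');
// A genuinely different Graph 500 stays distinct.
const c = normalizeMessage('Graph 500: unexpected null for receivedDateTime');
console.log(a === b); // true
console.log(a === c); // false

// A hyphenated GUID is only partly redacted: the 4-character groups survive.
const g = normalizeMessage('request 3f2a9c1e-7b4d-4e8a-9c1f-0a1b2c3d4e5f failed');
console.log(g); // request <hex>-7b4d-4e8a-9c1f-<hex> failed

console.log(isWithinDedupWindow('2026-05-25T00:00:00.000Z', new Date('2026-05-31T23:59:59.000Z'))); // true
console.log(isWithinDedupWindow('2026-05-25T00:00:00.000Z', new Date('2026-06-01T00:00:00.000Z'))); // false
```

One consequence worth noting: with these two regexes as written, correlation ids in hyphenated GUID form keep their middle groups, so two sightings that differ only in such an id would produce different fingerprints.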
38
+
39
+ ## Candidate file format
40
+
41
+ The lib writes a markdown file matching the upstream register template, so promotion is copy-paste:
42
+
43
+ ```markdown
44
+ ### YYYY-MM-DD — <source>: <signature> (<fpr>)
45
+
46
+ **Symptom**: <error message> (HTTP <status>) — entity `<entity>` — week <week>
47
+
48
+ **Root cause**: _TO INVESTIGATE_
49
+
50
+ **Fix / workaround**: _TO INVESTIGATE_
51
+
52
+ **Doctrine impact**: register-only — TODO promote on next sighting
53
+
54
+ **Discovered during**: alias `<alias>` running pull-<source>
55
+
56
+ ---
57
+
58
+ <!-- machine-readable footer -->
59
+ ```yaml
60
+ source: ...
61
+ fingerprint: ...
62
+ captured_at: ...
63
+ ```
64
+ ```
65
+
66
+ A maintainer fills in Root cause + Fix + Doctrine impact before promoting.
67
+
68
+ ## Orchestrator reporting
69
+
70
+ `refresh.mjs` and `bootstrap.mjs` count candidate files at the end of a run and include `learning_candidates_written: N` in the stdout JSON. The run-report (`Evidence/<alias>/refresh-reports/...md`) gets a "Learning candidates this run" section pointing at the dir when N > 0.
71
+
72
+ ## Promotion (manual, v5.6.0)
73
+
74
+ 1. Open `<project>/Evidence/_learnings-candidates/`.
75
+ 2. Pick a candidate. Investigate. Fill in Root cause + Fix.
76
+ 3. Copy the body (without the machine-readable footer) into the matching `<KUSHI_ROOT>/plugin/learnings/<source>.md` (newest on top).
77
+ 4. Open a PR against `gim-home/kushi`.
78
+ 5. Delete the candidate file locally once merged upstream.
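For step 3, the copy-without-footer part can be sketched as a small string helper. This is only an illustration under stated assumptions: `stripFooter` is a hypothetical name, and the helper assumes the candidate ends with the `---` rule followed by the `<!-- machine-readable footer -->` marker, as in the template above. v5.6.0 ships no such helper; promotion stays manual.

```javascript
// Hypothetical helper (not part of the package): returns the candidate body
// without the trailing '---' rule and the machine-readable footer.
function stripFooter(md) {
  const marker = '<!-- machine-readable footer -->';
  const idx = md.indexOf(marker);
  // If the marker is missing, leave the text untouched apart from trailing whitespace.
  const body = idx === -1 ? md : md.slice(0, idx);
  return body.replace(/\s*---\s*$/, '').trimEnd() + '\n';
}

// Build a tiny sample candidate; the fence string is assembled at runtime so
// this example contains no literal code fences of its own.
const fence = '`'.repeat(3);
const candidate = [
  '**Symptom**: example message (HTTP 500)',
  '',
  '**Root cause**: _TO INVESTIGATE_',
  '',
  '---',
  '',
  '<!-- machine-readable footer -->',
  fence + 'yaml',
  'source: email',
  'fingerprint: 0a1b2c3d',
  fence,
  '',
].join('\n');

const promotable = stripFooter(candidate);
console.log(promotable);
```

The result still needs Root cause and Fix filled in by hand before it goes into `plugin/learnings/<source>.md`.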
79
+
80
+ The v5.7.0 `kushi share-learnings` command will automate steps 3–5 with redaction + user confirmation. v5.6.0 ships emission only.
81
+
82
+ ## Privacy
83
+
84
+ Candidates are written **locally** in your own project folder under OneDrive/SharePoint. They never leave your machine until you (or a future opt-in `share-learnings` command) explicitly send them upstream. The fingerprinting + redaction step in v5.7.0 will strip tenant ids, GUIDs, contributor aliases, and project names before any upstream submission.
85
+
86
+ ## Anti-patterns
87
+
88
+ - ❌ **Calling `emitLearningCandidate` from the LLM/chat.** Only runners emit.
89
+ - ❌ **Writing user data into the candidate body.** Symptom/Root cause/Fix should describe the *shape* of the bug, not the project content.
90
+ - ❌ **Promoting after one sighting.** Wait for the second — the dedup window is 7 days specifically to prevent solo-noise promotion.
91
+ - ❌ **Editing `_seen.json` by hand.** Delete the candidate file to force re-emission.
@@ -44,6 +44,10 @@ node plugin/runners/<source>.mjs --project <P> --alias <A> --entity <E> [--week
44
44
  - `deferred` — retry enqueued; runner will retry after `RETRY_MIN_AGE_MIN.<source>` minutes.
45
45
  - `failed` — non-retryable failure for this cell.
46
46
 
47
+ ## Runner side-effect: learning candidates (v5.6.0)
48
+
49
+ Every `pull-*` runner imports `emitLearningCandidate` from `plugin/runners/lib/learnings.mjs` and calls it from its non-retryable error paths. The lib filters out user-side / transient errors and writes a markdown file to `<project>/Evidence/_learnings-candidates/` only when the signature is genuinely novel (or `body-unavailable` is on its 2nd+ sighting). No telemetry, no auto-PR — purely local capture for later human review. See `learning-candidates.instructions.md` and self-check probe D48.
50
+
47
51
  ## What the LLM still owns
48
52
 
49
53
  - Asking the user for `request_id`, `engagement_id`, folder names, chat ids, joinUrls, section URLs, site URLs, when missing.
@@ -0,0 +1,203 @@
1
+ // plugin/runners/lib/learnings.mjs
2
+ // v5.6.0 — local-only "learning candidates" emission.
3
+ //
4
+ // When a runner hits a truly novel error, or sees the same body-unavailable
5
+ // twice for one entity, write one markdown candidate under
6
+ // <project>/Evidence/_learnings-candidates/YYYY-MM-DD-HHmm_<alias>_<source>_<sig>.md
7
+ //
8
+ // Doctrine: plugin/instructions/learning-candidates.instructions.md
9
+ // No telemetry. No auto-PR. Reviewed by humans before being promoted to
10
+ // plugin/learnings/<source>.md upstream (deferred to v5.7.0 share-learnings).
11
+
12
+ import path from 'node:path';
13
+ import { promises as fs } from 'node:fs';
14
+ import crypto from 'node:crypto';
15
+ import { evidenceRoot } from './layout.mjs';
16
+
17
+ const DIR_NAME = '_learnings-candidates';
18
+ const SEEN_FILE = '_seen.json';
19
+ const DEDUP_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;
20
+
21
+ // Signatures considered "user-side" or "already-handled" — never emit.
22
+ // Keep this list narrow; anything NOT here is a candidate.
23
+ const USER_SIDE_SIGNATURES = new Set([
24
+ 'bad-args', 'config-missing', 'config-invalid',
25
+ 'token-expired', 'auth-required', 'auth-failed',
26
+ 'folder-not-found', 'entity-not-found',
27
+ 'cross-tenant-blocked', 'permission-denied',
28
+ // NOTE: 'fetch-failed' is intentionally NOT user-side. The runner uses it
29
+ // for both retryable (transient HTTP, filtered below) and non-retryable
30
+ // (unexpected response shape — the novel case we want to capture).
31
+ ]);
32
+
33
+ const TRANSIENT_HTTP_STATUSES = new Set([429, 502, 503, 504, 408]);
34
+
35
+ /**
36
+ * Decide whether an error is worth capturing as a learning candidate.
37
+ * Returns { capture: boolean, reason: string }.
38
+ *
39
+ * @param {object} error shape: { signature?, message?, status?, occurrences? }
40
+ * - signature: short kebab-case id (runner-assigned)
41
+ * - message: human-readable error message
42
+ * - status: HTTP status code if applicable
43
+ * - occurrences: cross-run count for repeat-only signatures (e.g. body-unavailable)
44
+ */
45
+ export function shouldCapture(error) {
46
+ if (!error || typeof error !== 'object') return { capture: false, reason: 'no-error' };
47
+ const sig = (error.signature || '').toLowerCase();
48
+
49
+ if (sig && USER_SIDE_SIGNATURES.has(sig)) return { capture: false, reason: 'user-side' };
50
+ if (error.status && TRANSIENT_HTTP_STATUSES.has(error.status)) return { capture: false, reason: 'transient-http' };
51
+
52
+ // body-unavailable: only emit on 2nd+ sighting for the same entity.
53
+ if (sig === 'body-unavailable') {
54
+ const n = Number(error.occurrences || 0);
55
+ if (n < 2) return { capture: false, reason: 'body-unavailable-first-sighting' };
56
+ return { capture: true, reason: 'body-unavailable-repeat' };
57
+ }
58
+
59
+ if (!sig) return { capture: true, reason: 'unclassified' };
60
+ return { capture: true, reason: 'novel-signature' };
61
+ }
62
+
63
+ /** Stable 8-char fingerprint over (source, signature, redacted-message). */
64
+ export function fingerprint(source, signature, message) {
65
+ const norm = String(message || '')
66
+ .replace(/[a-f0-9]{8,}/gi, '<hex>')
67
+ .replace(/\d{4,}/g, '<n>')
68
+ .replace(/\s+/g, ' ')
69
+ .trim()
70
+ .toLowerCase()
71
+ .slice(0, 200);
72
+ const h = crypto.createHash('sha256').update(`${source}|${signature || ''}|${norm}`).digest('hex');
73
+ return h.slice(0, 8);
74
+ }
75
+
76
+ function safeSlug(s, max = 40) {
77
+ return String(s || '')
78
+ .toLowerCase()
79
+ .replace(/[^a-z0-9]+/g, '-')
80
+ .replace(/^-+|-+$/g, '')
81
+ .slice(0, max) || 'unknown';
82
+ }
83
+
84
+ function timestampPrefix(d = new Date()) {
85
+ const y = d.getFullYear();
86
+ const mo = String(d.getMonth() + 1).padStart(2, '0');
87
+ const da = String(d.getDate()).padStart(2, '0');
88
+ const hh = String(d.getHours()).padStart(2, '0');
89
+ const mm = String(d.getMinutes()).padStart(2, '0');
90
+ return `${y}-${mo}-${da}-${hh}${mm}`;
91
+ }
92
+
93
+ async function loadSeen(dir) {
94
+ try {
95
+ const raw = await fs.readFile(path.join(dir, SEEN_FILE), 'utf8');
96
+ return JSON.parse(raw);
97
+ } catch {
98
+ return {};
99
+ }
100
+ }
101
+
102
+ async function saveSeen(dir, seen) {
103
+ await fs.writeFile(path.join(dir, SEEN_FILE), JSON.stringify(seen, null, 2));
104
+ }
105
+
106
+ function renderMarkdown({ source, alias, entity, week, error, context, fpr, capturedAt }) {
107
+ const lines = [
108
+ `### ${capturedAt.slice(0, 10)} — ${source}: ${error.signature || 'unclassified'} (${fpr})`,
109
+ '',
110
+ `**Symptom**: ${oneLine(error.message) || '(no message)'} ` +
111
+ (error.status ? `(HTTP ${error.status})` : '') +
112
+ (entity ? ` — entity \`${entity}\`` : '') +
113
+ (week ? ` — week ${week}` : ''),
114
+ '',
115
+ `**Root cause**: _TO INVESTIGATE_`,
116
+ '',
117
+ `**Fix / workaround**: _TO INVESTIGATE_`,
118
+ '',
119
+ `**Doctrine impact**: register-only — TODO promote on next sighting`,
120
+ '',
121
+ `**Discovered during**: alias \`${alias}\` running pull-${source}` +
122
+ (context && context.runner ? ` (runner ${context.runner})` : ''),
123
+ '',
124
+ '---',
125
+ '',
126
+ '<!-- machine-readable footer -->',
127
+ '```yaml',
128
+ `source: ${source}`,
129
+ `alias: ${alias}`,
130
+ `entity: ${entity || ''}`,
131
+ `week: ${week || ''}`,
132
+ `signature: ${error.signature || 'unclassified'}`,
133
+ `fingerprint: ${fpr}`,
134
+ `captured_at: ${capturedAt}`,
135
+ `error_status: ${error.status || ''}`,
136
+ `occurrences: ${error.occurrences || 1}`,
137
+ '```',
138
+ '',
139
+ ];
140
+ return lines.join('\n');
141
+ }
142
+
143
+ function oneLine(s) {
144
+ return String(s || '').replace(/\s+/g, ' ').trim();
145
+ }
146
+
147
+ /**
148
+ * Emit a learning candidate file under <project>/Evidence/_learnings-candidates/.
149
+ * Idempotent — same fingerprint within DEDUP_WINDOW_MS is skipped.
150
+ *
151
+ * @returns {Promise<{written: boolean, path?: string, reason?: string, fingerprint: string}>}
152
+ */
153
+ export async function emitLearningCandidate({
154
+ projectRoot, alias, source, entity, week, error, context = {}, now = new Date(),
155
+ }) {
156
+ if (!projectRoot || !alias || !source) {
157
+ return { written: false, reason: 'missing-required', fingerprint: '' };
158
+ }
159
+ const decision = shouldCapture(error);
160
+ if (!decision.capture) {
161
+ return { written: false, reason: decision.reason, fingerprint: '' };
162
+ }
163
+
164
+ const fpr = fingerprint(source, error.signature, error.message);
165
+ const dir = path.join(evidenceRoot(projectRoot), DIR_NAME);
166
+ await fs.mkdir(dir, { recursive: true });
167
+
168
+ const seen = await loadSeen(dir);
169
+ const prev = seen[fpr];
170
+ const nowMs = now.getTime();
171
+ if (prev && (nowMs - new Date(prev.at).getTime()) < DEDUP_WINDOW_MS) {
172
+ return { written: false, reason: 'deduped', fingerprint: fpr };
173
+ }
174
+
175
+ const ts = timestampPrefix(now);
176
+ const sigSlug = safeSlug(error.signature || 'unclassified', 30);
177
+ const fileName = `${ts}_${safeSlug(alias, 20)}_${source}_${sigSlug}.md`;
178
+ const target = path.join(dir, fileName);
179
+
180
+ const capturedAt = now.toISOString();
181
+ const md = renderMarkdown({ source, alias, entity, week, error, context, fpr, capturedAt });
182
+ await fs.writeFile(target, md);
183
+
184
+ seen[fpr] = { at: capturedAt, file: fileName, source, signature: error.signature || 'unclassified' };
185
+ await saveSeen(dir, seen);
186
+
187
+ return { written: true, path: target, reason: decision.reason, fingerprint: fpr };
188
+ }
189
+
190
+ /**
191
+ * Count the candidate .md files currently in the candidates dir (a running total, not only files written this run); used by orchestrators for reporting.
192
+ */
193
+ export async function readCandidateCount(projectRoot) {
194
+ try {
195
+ const dir = path.join(evidenceRoot(projectRoot), DIR_NAME);
196
+ const files = await fs.readdir(dir).catch(() => []);
197
+ return files.filter(f => f.endsWith('.md')).length;
198
+ } catch {
199
+ return 0;
200
+ }
201
+ }
202
+
203
+ export const __test__ = { USER_SIDE_SIGNATURES, TRANSIENT_HTTP_STATUSES, DEDUP_WINDOW_MS };
@@ -22,6 +22,7 @@ import { updateCell } from './lib/ledger.mjs';
22
22
  import { appendRunLog } from './lib/runlog.mjs';
23
23
  import { enqueue, clear } from './lib/deferred.mjs';
24
24
  import { currentIsoMonday, ymd } from './lib/weeks.mjs';
25
+ import { emitLearningCandidate } from './lib/learnings.mjs';
25
26
 
26
27
  const SOURCE = 'ado';
27
28
 
@@ -174,6 +175,7 @@ async function main() {
174
175
  if (retryable && !args.dryRun) {
175
176
  await enqueue(projectRoot, args.alias, { source: SOURCE, entity: args.entity, weekStart, signature: 'fetch-failed', reason: e.message });
176
177
  }
178
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-ado' } });
177
179
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
178
180
  return retryable ? 1 : 0;
179
181
  }
@@ -21,6 +21,7 @@ import { updateCell } from './lib/ledger.mjs';
21
21
  import { appendRunLog } from './lib/runlog.mjs';
22
22
  import { enqueue, clear } from './lib/deferred.mjs';
23
23
  import { isoMondayString, currentIsoMonday, ymd } from './lib/weeks.mjs';
24
+ import { emitLearningCandidate } from './lib/learnings.mjs';
24
25
 
25
26
  const SOURCE = 'crm';
26
27
 
@@ -155,6 +156,7 @@ async function main() {
155
156
  signature: 'fetch-failed', reason: e.message,
156
157
  });
157
158
  }
159
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-crm' } });
158
160
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
159
161
  return retryable ? 1 : 0;
160
162
  }
@@ -19,6 +19,7 @@ import { updateCell } from './lib/ledger.mjs';
19
19
  import { appendRunLog } from './lib/runlog.mjs';
20
20
  import { enqueue, clear } from './lib/deferred.mjs';
21
21
  import { currentIsoMonday, ymd, parseYmd } from './lib/weeks.mjs';
22
+ import { emitLearningCandidate } from './lib/learnings.mjs';
22
23
 
23
24
  const SOURCE = 'email';
24
25
 
@@ -81,6 +82,12 @@ function makeFixtureClient(data) {
81
82
  return {
82
83
  async findFolder(name) { return foldersByName.get(name) || null; },
83
84
  async listMessages(folderId, fromIso, toIso) {
85
+ if (data.throwOnListMessages) {
86
+ const t = data.throwOnListMessages;
87
+ const e = new Error(t.message || 'fixture-throw');
88
+ if (t.status) e.status = t.status;
89
+ throw e;
90
+ }
84
91
  const all = (data.messagesByFolder && data.messagesByFolder[folderId]) || [];
85
92
  return all.filter(m => m.receivedDateTime >= fromIso && m.receivedDateTime < toIso);
86
93
  },
@@ -126,6 +133,7 @@ async function main() {
126
133
  const retryable = !e.status || [429, 502, 503, 504].includes(e.status);
127
134
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { last_status: retryable ? 'deferred' : 'failed', last_error: e.message });
128
135
  if (retryable && !args.dryRun) await enqueue(projectRoot, args.alias, { source: SOURCE, entity: args.entity, weekStart, signature: 'fetch-failed', reason: e.message });
136
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-email' } });
129
137
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
130
138
  return retryable ? 1 : 0;
131
139
  }
@@ -21,6 +21,7 @@ import { appendRunLog } from './lib/runlog.mjs';
21
21
  import { enqueue, clear } from './lib/deferred.mjs';
22
22
  import { shortHash } from './lib/dedup.mjs';
23
23
  import { currentIsoMonday, ymd } from './lib/weeks.mjs';
24
+ import { emitLearningCandidate } from './lib/learnings.mjs';
24
25
 
25
26
  const SOURCE = 'meetings';
26
27
 
@@ -117,6 +118,7 @@ async function main() {
117
118
  const retryable = !e.status || [429, 502, 503, 504].includes(e.status);
118
119
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { last_status: retryable ? 'deferred' : 'failed', last_error: e.message });
119
120
  if (retryable && !args.dryRun) await enqueue(projectRoot, args.alias, { source: SOURCE, entity: args.entity, weekStart, signature: 'fetch-failed', reason: e.message });
121
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-meetings' } });
120
122
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
121
123
  return retryable ? 1 : 0;
122
124
  }
@@ -20,6 +20,8 @@ import { updateCell } from './lib/ledger.mjs';
20
20
  import { appendRunLog } from './lib/runlog.mjs';
21
21
  import { clear } from './lib/deferred.mjs';
22
22
  import { currentIsoMonday, ymd, parseYmd } from './lib/weeks.mjs';
23
+ import { emitLearningCandidate } from './lib/learnings.mjs';
24
+ import { readLedger, cellKey } from './lib/ledger.mjs';
23
25
 
24
26
  const SOURCE = 'onenote';
25
27
 
@@ -132,6 +134,7 @@ async function main() {
132
134
  try { section = await client.getSection(args.entity); }
133
135
  catch (e) {
134
136
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { last_status: 'failed', last_error: e.message });
137
+ if (!args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'section-fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-onenote' } });
135
138
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: 'failed', errors: [{ message: e.message }] });
136
139
  return 0;
137
140
  }
@@ -177,6 +180,22 @@ async function main() {
177
180
  }
178
181
 
179
182
  const status = bodyUnavailable.length === 0 ? 'captured' : (captures.length === 0 ? 'body-unavailable' : 'partial');
183
+
184
+ if (status === 'body-unavailable' && !args.dryRun) {
185
+ const prior = (await readLedger(projectRoot, args.alias).catch(() => ({ cells: {} })))
186
+ .cells?.[cellKey(SOURCE, args.entity, weekStart)];
187
+ const priorOccurrences = Number(prior?.body_unavailable_runs || 0);
188
+ const occurrences = priorOccurrences + 1;
189
+ if (occurrences >= 2) {
190
+ await emitLearningCandidate({
191
+ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart,
192
+ error: { signature: 'body-unavailable', message: `OneNote section ${section.id}: ${bodyUnavailable.length}/${pages.length} pages had no body across ${occurrences} runs`, occurrences },
193
+ context: { runner: 'pull-onenote' },
194
+ });
195
+ }
196
+ await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { body_unavailable_runs: occurrences });
197
+ }
198
+
180
199
  await clear(projectRoot, args.alias, SOURCE, args.entity).catch(() => {});
181
200
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, {
182
201
  last_status: status,
@@ -22,6 +22,7 @@ import { appendRunLog } from './lib/runlog.mjs';
22
22
  import { enqueue, clear } from './lib/deferred.mjs';
23
23
  import { shortHash } from './lib/dedup.mjs';
24
24
  import { currentIsoMonday, ymd, parseYmd } from './lib/weeks.mjs';
25
+ import { emitLearningCandidate } from './lib/learnings.mjs';
25
26
 
26
27
  const SOURCE = 'sharepoint';
27
28
 
@@ -129,6 +130,7 @@ async function main() {
129
130
  const retryable = !e.status || [429, 502, 503, 504].includes(e.status);
130
131
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { last_status: retryable ? 'deferred' : 'failed', last_error: e.message });
131
132
  if (retryable && !args.dryRun) await enqueue(projectRoot, args.alias, { source: SOURCE, entity: args.entity, weekStart, signature: 'fetch-failed', reason: e.message });
133
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-sharepoint' } });
132
134
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
133
135
  return retryable ? 1 : 0;
134
136
  }
@@ -20,6 +20,7 @@ import { appendRunLog } from './lib/runlog.mjs';
20
20
  import { enqueue, clear } from './lib/deferred.mjs';
21
21
  import { shortHash } from './lib/dedup.mjs';
22
22
  import { currentIsoMonday, ymd, parseYmd } from './lib/weeks.mjs';
23
+ import { emitLearningCandidate } from './lib/learnings.mjs';
23
24
 
24
25
  const SOURCE = 'teams';
25
26
 
@@ -105,6 +106,7 @@ async function main() {
105
106
  const retryable = !e.status || [429, 502, 503, 504].includes(e.status);
106
107
  await updateCell(projectRoot, args.alias, SOURCE, args.entity, weekStart, { last_status: retryable ? 'deferred' : 'failed', last_error: e.message });
107
108
  if (retryable && !args.dryRun) await enqueue(projectRoot, args.alias, { source: SOURCE, entity: args.entity, weekStart, signature: 'fetch-failed', reason: e.message });
109
+ if (!retryable && !args.dryRun) await emitLearningCandidate({ projectRoot, alias: args.alias, source: SOURCE, entity: args.entity, week: weekStart, error: { signature: 'fetch-failed', message: e.message, status: e.status }, context: { runner: 'pull-teams' } });
108
110
  emit({ source: SOURCE, entity: args.entity, week: weekStart, status: retryable ? 'deferred' : 'failed', errors: [{ message: e.message, status: e.status }] });
109
111
  return retryable ? 1 : 0;
110
112
  }
@@ -23,6 +23,7 @@ import { fileURLToPath } from 'node:url';
23
23
  import { loadConfig, assertProject } from './lib/config.mjs';
24
24
  import { readLedger, needsPull } from './lib/ledger.mjs';
25
25
  import { currentIsoMonday, ymd } from './lib/weeks.mjs';
26
+ import { readCandidateCount } from './lib/learnings.mjs';
26
27
 
27
28
  const HERE = path.dirname(fileURLToPath(import.meta.url));
28
29
 
@@ -223,6 +224,8 @@ async function main() {
223
224
  ? planned.map(t => ({ source: t.source, entity: t.entity, week: weekStart, dry_run: true, reason: t.reason }))
224
225
  : await pMap(planned, args.maxParallel, t => runOne(t, weekStart, args));
225
226
 
227
+ const learning_candidates_total = args.dryRun ? 0 : await readCandidateCount(args.project);
228
+
226
229
  emit({
227
230
  status: 'ok',
228
231
  project: args.project,
@@ -234,6 +237,7 @@ async function main() {
234
237
  skipped: skipped.length,
235
238
  results,
236
239
  skipped_targets: skipped,
240
+ learning_candidates_total,
237
241
  });
238
242
  return 0;
239
243
  }
@@ -0,0 +1,9 @@
1
+ {
2
+ "folders": [
3
+ { "id": "AAMkAGI=", "displayName": "23. ABN AMRO" }
4
+ ],
5
+ "throwOnListMessages": {
6
+ "status": 500,
7
+ "message": "Graph returned unexpected null for $select(receivedDateTime)"
8
+ }
9
+ }
@@ -95,3 +95,55 @@ test('missing --entity exits 2', () => {
95
95
  ], { encoding: 'utf8' });
96
96
  assert.equal(res.status, 2);
97
97
  });
98
+
99
+ test('v5.6.0: non-retryable error emits a learning candidate file', async () => {
100
+ const projectRoot3 = await fs.mkdtemp(path.join(os.tmpdir(), 'kushi-email-novel-'));
101
+ await fs.mkdir(path.join(projectRoot3, 'Evidence', 'ushak'), { recursive: true });
102
+ await fs.writeFile(path.join(projectRoot3, 'integrations.yml'), YAML.stringify({}));
103
+ const NOVEL_FIXTURE = path.join(HERE, '..', 'fixtures', 'email-novel-error.json');
104
+ try {
105
+ const res = spawnSync(process.execPath, [RUNNER,
106
+ '--project', projectRoot3, '--alias', 'ushak',
107
+ '--entity', '23. ABN AMRO', '--week', '2026-05-25', '--fixture', NOVEL_FIXTURE,
108
+ ], { encoding: 'utf8' });
109
+ assert.equal(res.status, 0, `stderr: ${res.stderr}`);
110
+ const r = JSON.parse(res.stdout.trim().split('\n').pop());
111
+ assert.equal(r.status, 'failed');
112
+ const candDir = path.join(projectRoot3, 'Evidence', '_learnings-candidates');
113
+ const entries = await fs.readdir(candDir);
114
+ const mdFiles = entries.filter(f => f.endsWith('.md'));
115
+ assert.equal(mdFiles.length, 1, `expected exactly one candidate, got: ${entries.join(', ')}`);
116
+ const body = await fs.readFile(path.join(candDir, mdFiles[0]), 'utf8');
117
+ assert.match(body, /fetch-failed/);
118
+ assert.match(body, /unexpected null/);
119
+ // Re-run within window — should not duplicate.
120
+ spawnSync(process.execPath, [RUNNER,
121
+ '--project', projectRoot3, '--alias', 'ushak',
122
+ '--entity', '23. ABN AMRO', '--week', '2026-05-25', '--fixture', NOVEL_FIXTURE,
123
+ ], { encoding: 'utf8' });
124
+ const entries2 = await fs.readdir(candDir);
125
+ const mdFiles2 = entries2.filter(f => f.endsWith('.md'));
126
+ assert.equal(mdFiles2.length, 1, 'dedup should prevent a 2nd file within 7d');
127
+ } finally {
128
+ await fs.rm(projectRoot3, { recursive: true, force: true });
129
+ }
130
+ });
131
+
132
+ test('v5.6.0: user-side error does NOT emit a learning candidate', async () => {
133
+ const projectRoot4 = await fs.mkdtemp(path.join(os.tmpdir(), 'kushi-email-userside-'));
134
+ await fs.mkdir(path.join(projectRoot4, 'Evidence', 'ushak'), { recursive: true });
135
+ await fs.writeFile(path.join(projectRoot4, 'integrations.yml'), YAML.stringify({}));
136
+ try {
137
+ const res = spawnSync(process.execPath, [RUNNER,
138
+ '--project', projectRoot4, '--alias', 'ushak',
139
+ '--entity', 'NoSuchFolder', '--week', '2026-05-25', '--fixture', FIXTURE,
140
+ ], { encoding: 'utf8' });
141
+ assert.equal(res.status, 0);
142
+ let exists = true;
143
+ try { await fs.access(path.join(projectRoot4, 'Evidence', '_learnings-candidates')); }
144
+ catch { exists = false; }
145
+ assert.equal(exists, false, 'folder-not-found should never trigger candidate emission');
146
+ } finally {
147
+ await fs.rm(projectRoot4, { recursive: true, force: true });
148
+ }
149
+ });
@@ -0,0 +1,124 @@
1
+ // plugin/runners/test/unit/learnings.test.mjs
2
+ import { test } from 'node:test';
3
+ import assert from 'node:assert/strict';
4
+ import { promises as fs } from 'node:fs';
5
+ import path from 'node:path';
6
+ import os from 'node:os';
7
+ import {
8
+ shouldCapture,
9
+ fingerprint,
10
+ emitLearningCandidate,
11
+ readCandidateCount,
12
+ } from '../../lib/learnings.mjs';
13
+
14
+ async function tmpProject() {
15
+ const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'kushi-learnings-'));
16
+ await fs.mkdir(path.join(dir, 'Evidence', 'usha'), { recursive: true });
17
+ return dir;
18
+ }
19
+
20
+ test('shouldCapture: user-side signatures are skipped', () => {
21
+ for (const sig of ['bad-args','config-missing','token-expired','auth-required','folder-not-found','cross-tenant-blocked','permission-denied']) {
22
+ const d = shouldCapture({ signature: sig, message: 'x' });
23
+ assert.equal(d.capture, false, `${sig} should not capture`);
24
+ assert.equal(d.reason, 'user-side');
25
+ }
26
+ });
27
+
28
+ test('shouldCapture: transient HTTP is skipped', () => {
29
+ for (const status of [429, 502, 503, 504, 408]) {
30
+ const d = shouldCapture({ signature: 'fetch-failed', message: 'x', status });
31
+ assert.equal(d.capture, false);
32
+ assert.equal(d.reason, 'transient-http');
33
+ }
34
+ });
35
+
36
+ test('shouldCapture: fetch-failed with non-transient status captures', () => {
37
+ const d = shouldCapture({ signature: 'fetch-failed', message: 'weird shape', status: 500 });
38
+ assert.equal(d.capture, true);
39
+ });
40
+
41
+ test('shouldCapture: body-unavailable first sighting skipped, 2nd captures', () => {
42
+ const first = shouldCapture({ signature: 'body-unavailable', occurrences: 1 });
43
+ assert.equal(first.capture, false);
44
+ assert.equal(first.reason, 'body-unavailable-first-sighting');
45
+ const second = shouldCapture({ signature: 'body-unavailable', occurrences: 2 });
46
+ assert.equal(second.capture, true);
47
+ assert.equal(second.reason, 'body-unavailable-repeat');
48
+ });
49
+
50
+ test('shouldCapture: unclassified (no signature) captures', () => {
51
+ const d = shouldCapture({ message: 'something weird' });
52
+ assert.equal(d.capture, true);
53
+ assert.equal(d.reason, 'unclassified');
54
+ });
55
+
56
+ test('fingerprint normalizes hex blobs and digit runs', () => {
57
+ const a = fingerprint('email', 'fetch-failed', 'error abc12345def at row 9999');
58
+ const b = fingerprint('email', 'fetch-failed', 'error 5678ffaa9999 at row 1234');
59
+ assert.equal(a, b, 'redacted hex/digits should fingerprint the same');
60
+ assert.equal(a.length, 8);
61
+ });
62
+
63
+ test('fingerprint differs for different signatures', () => {
64
+ const a = fingerprint('email', 'fetch-failed', 'bad shape');
65
+ const b = fingerprint('email', 'other-sig', 'bad shape');
66
+ assert.notEqual(a, b);
67
+ });
68
+
69
+ test('emitLearningCandidate writes a file for a novel error', async () => {
70
+ const root = await tmpProject();
71
+ const res = await emitLearningCandidate({
72
+ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25',
73
+ error: { signature: 'fetch-failed', message: 'unexpected null in $select', status: 500 },
74
+ });
75
+ assert.equal(res.written, true);
76
+ const md = await fs.readFile(res.path, 'utf8');
77
+ assert.match(md, /Symptom/);
78
+ assert.match(md, /unexpected null in/);
79
+ assert.match(md, /captured_at:/);
80
+ });
81
+
82
+ test('emitLearningCandidate is silent for user-side errors', async () => {
83
+ const root = await tmpProject();
84
+ const res = await emitLearningCandidate({
85
+ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25',
86
+ error: { signature: 'folder-not-found', message: 'Inbox' },
87
+ });
88
+ assert.equal(res.written, false);
89
+ assert.equal(res.reason, 'user-side');
90
+ assert.equal(await readCandidateCount(root), 0);
91
+ });
92
+
93
+ test('emitLearningCandidate dedups within 7-day window', async () => {
94
+ const root = await tmpProject();
95
+ const err = { signature: 'fetch-failed', message: 'unexpected null', status: 500 };
96
+ const r1 = await emitLearningCandidate({ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25', error: err });
97
+ assert.equal(r1.written, true);
98
+ const r2 = await emitLearningCandidate({ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25', error: err });
99
+ assert.equal(r2.written, false);
100
+ assert.equal(r2.reason, 'deduped');
101
+ assert.equal(await readCandidateCount(root), 1);
102
+ });
103
+
104
+ test('emitLearningCandidate re-emits after 7-day window', async () => {
105
+ const root = await tmpProject();
106
+ const err = { signature: 'fetch-failed', message: 'unexpected null', status: 500 };
107
+ const eightDaysAgo = new Date(Date.now() - 8 * 24 * 60 * 60 * 1000);
108
+ const r1 = await emitLearningCandidate({ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25', error: err, now: eightDaysAgo });
109
+ assert.equal(r1.written, true);
110
+ const r2 = await emitLearningCandidate({ projectRoot: root, alias: 'usha', source: 'email', entity: 'Inbox', week: '2026-05-25', error: err });
111
+ assert.equal(r2.written, true);
112
+ assert.equal(await readCandidateCount(root), 2);
113
+ });
114
+
115
+ test('emitLearningCandidate rejects missing required fields', async () => {
116
+ const r = await emitLearningCandidate({ alias: 'usha', source: 'email', error: { message: 'x' } });
117
+ assert.equal(r.written, false);
118
+ assert.equal(r.reason, 'missing-required');
119
+ });
120
+
121
+ test('readCandidateCount returns 0 when dir missing', async () => {
122
+ const root = await tmpProject();
123
+ assert.equal(await readCandidateCount(root), 0);
124
+ });
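The unit tests above pin down the observable behaviour of `shouldCapture` and `fingerprint` without showing their implementation. The sketch below is one way to satisfy those assertions, under stated assumptions: the `'novel'` reason string, the 8-character hex-blob threshold, the `<hex>` and `<n>` placeholders, and the FNV-1a hash (`fnv1a32`) are all hypothetical choices. The real `lib/learnings.mjs` may classify and hash differently, for example with a cryptographic digest.

```javascript
// Hypothetical sketch consistent with the tests above; not the actual lib/learnings.mjs.
const USER_SIDE = new Set([
  'bad-args', 'config-missing', 'token-expired', 'auth-required',
  'folder-not-found', 'cross-tenant-blocked', 'permission-denied',
]);
const TRANSIENT_HTTP = new Set([408, 429, 502, 503, 504]);

function shouldCapture({ signature, status, occurrences = 1 } = {}) {
  if (!signature) return { capture: true, reason: 'unclassified' };
  if (USER_SIDE.has(signature)) return { capture: false, reason: 'user-side' };
  if (TRANSIENT_HTTP.has(status)) return { capture: false, reason: 'transient-http' };
  if (signature === 'body-unavailable') {
    return occurrences >= 2
      ? { capture: true, reason: 'body-unavailable-repeat' }
      : { capture: false, reason: 'body-unavailable-first-sighting' };
  }
  // Assumption: the tests do not assert a reason string for this branch.
  return { capture: true, reason: 'novel' };
}

// 32-bit FNV-1a over UTF-16 code units, rendered as 8 hex characters.
function fnv1a32(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

function fingerprint(source, signature, message) {
  const normalized = String(message)
    .replace(/\b[0-9a-f]{8,}\b/gi, '<hex>') // collapse long hex blobs (ids, hashes)
    .replace(/\d+/g, '<n>');                // collapse remaining digit runs
  return fnv1a32([source, signature, normalized].join('|'));
}

// Usage: two messages that differ only in ids and row numbers share a fingerprint.
const a = fingerprint('email', 'fetch-failed', 'error abc12345def at row 9999');
const b = fingerprint('email', 'fetch-failed', 'error 5678ffaa9999 at row 1234');
console.log(a === b, a.length);
console.log(shouldCapture({ signature: 'fetch-failed', status: 503 }).reason);
```

Normalizing before hashing is what lets the dedup window treat repeated occurrences of the same failure as one candidate even when request ids or row numbers change between runs.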
@@ -2485,7 +2485,8 @@ process.stdout.write(JSON.stringify(out));
2485
2485
  $v550Doctrines = @(
2486
2486
  'llm-vs-runner.instructions.md',
2487
2487
  'csc-rendering.instructions.md',
2488
- 'discovery-prompts.instructions.md'
2488
+ 'discovery-prompts.instructions.md',
2489
+ 'learning-candidates.instructions.md'
2489
2490
  )
2490
2491
  foreach ($d in $v550Doctrines) {
2491
2492
  $df = Join-Path $instructionsDir $d
@@ -2494,6 +2495,23 @@ process.stdout.write(JSON.stringify(out));
2494
2495
  }
2495
2496
  }
2496
2497
 
2498
+ # === D48.learning-candidates (v5.6.0): every pull-* runner must import and call emitLearningCandidate from lib/learnings.mjs ===
2499
+ $learningsLib = Join-Path $Root 'plugin/runners/lib/learnings.mjs'
2500
+ if (-not (Test-Path $learningsLib)) {
2501
+ Add-Finding 'D48.learning-candidates' 'V560Learnings' 'error' "plugin/runners/lib/learnings.mjs is missing" "Create lib/learnings.mjs per v5.6.0 learning-candidates spec." $learningsLib 0
2502
+ }
2503
+ $pullRunners = @('pull-email.mjs','pull-teams.mjs','pull-meetings.mjs','pull-onenote.mjs','pull-sharepoint.mjs','pull-crm.mjs','pull-ado.mjs')
2504
+ foreach ($r in $pullRunners) {
2505
+ $rf = Join-Path $Root "plugin/runners/$r"
2506
+ if (-not (Test-Path $rf)) { continue }
2507
+ $rt = Get-Content -Raw $rf
2508
+ if ($rt -notmatch "from\s+'\./lib/learnings\.mjs'") {
2509
+ Add-Finding 'D48.learning-candidates' 'V560Learnings' 'warning' "Runner $r does not import emitLearningCandidate from lib/learnings.mjs" "Add: import { emitLearningCandidate } from './lib/learnings.mjs'; and call it from non-retryable error paths per learning-candidates.instructions.md." $rf 0
2510
+ } elseif ($rt -notmatch 'emitLearningCandidate\s*\(') {
2511
+ Add-Finding 'D48.learning-candidates' 'V560Learnings' 'warning' "Runner $r imports emitLearningCandidate but never calls it" "Wire it into the non-retryable catch path so novel errors are captured as learning candidates." $rf 0
2512
+ }
2513
+ }
2514
+
2497
2515
  # === Output ===
2498
2516
  if ($Targeted) {
2499
2517
  # Filter findings to those whose code, surface, file path, or message contain the substring.