kushi-agents 5.0.0 → 5.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. package/README.md +30 -7
  2. package/bin/cli.mjs +73 -45
  3. package/package.json +51 -51
  4. package/plugin/instructions/agentskills-compliance.instructions.md +144 -0
  5. package/plugin/instructions/multi-host-install.instructions.md +125 -0
  6. package/plugin/instructions/plan-validate-execute.instructions.md +75 -0
  7. package/plugin/instructions/release-genealogy.instructions.md +52 -0
  8. package/plugin/skills/aggregate-project/SKILL.md +11 -2
  9. package/plugin/skills/apply-ado-update/SKILL.md +11 -2
  10. package/plugin/skills/ask-project/SKILL.md +1 -1
  11. package/plugin/skills/bootstrap-project/SKILL.md +39 -127
  12. package/plugin/skills/bootstrap-project/references/discovery-sweep.md +40 -0
  13. package/plugin/skills/bootstrap-project/references/pull-dispatch.md +50 -0
  14. package/plugin/skills/bootstrap-project/references/registry-persistence.md +55 -0
  15. package/plugin/skills/build-state/SKILL.md +50 -2
  16. package/plugin/skills/consolidate-evidence/SKILL.md +11 -2
  17. package/plugin/skills/dashboard/SKILL.md +20 -1
  18. package/plugin/skills/emit-vertex/SKILL.md +10 -1
  19. package/plugin/skills/fde-intake/SKILL.md +10 -1
  20. package/plugin/skills/fde-report/SKILL.md +10 -1
  21. package/plugin/skills/fde-triage/SKILL.md +10 -1
  22. package/plugin/skills/intro/SKILL.md +1 -1
  23. package/plugin/skills/link-entities/SKILL.md +43 -1
  24. package/plugin/skills/project-status/SKILL.md +1 -1
  25. package/plugin/skills/propose-ado-update/SKILL.md +11 -2
  26. package/plugin/skills/pull-ado/SKILL.md +26 -9
  27. package/plugin/skills/pull-crm/SKILL.md +39 -125
  28. package/plugin/skills/pull-crm/references/dataverse-doctrine.md +108 -0
  29. package/plugin/skills/pull-crm/references/legacy-shape.md +28 -0
  30. package/plugin/skills/pull-email/SKILL.md +33 -79
  31. package/plugin/skills/pull-email/references/retrieval-order.md +43 -0
  32. package/plugin/skills/pull-email/references/two-pass-pull.md +41 -0
  33. package/plugin/skills/pull-loop/SKILL.md +194 -177
  34. package/plugin/skills/pull-meetings/SKILL.md +35 -72
  35. package/plugin/skills/pull-meetings/references/legacy-stream.md +15 -0
  36. package/plugin/skills/pull-meetings/references/verbatim-capture.md +61 -0
  37. package/plugin/skills/pull-misc/SKILL.md +24 -7
  38. package/plugin/skills/pull-onenote/SKILL.md +207 -555
  39. package/plugin/skills/pull-onenote/references/playwright-fallback.md +111 -0
  40. package/plugin/skills/pull-onenote/references/preflight.md +85 -0
  41. package/plugin/skills/pull-onenote/references/runtime-contract.md +118 -0
  42. package/plugin/skills/pull-sharepoint/SKILL.md +26 -9
  43. package/plugin/skills/pull-teams/SKILL.md +26 -9
  44. package/plugin/skills/refresh-project/SKILL.md +24 -2
  45. package/plugin/skills/self-check/SKILL.md +9 -1
  46. package/plugin/skills/self-check/run.ps1 +216 -4
  47. package/plugin/skills/setup/SKILL.md +14 -120
  48. package/plugin/skills/setup/references/onedrive-pin-sync.md +60 -0
  49. package/plugin/skills/setup/references/recovery-prompts.md +81 -0
  50. package/plugin/skills/tour/SKILL.md +18 -1
  51. package/plugin/skills/vertex-link/SKILL.md +1 -1
  52. package/src/constants.mjs +39 -1
  53. package/src/multi-host-install.test.mjs +170 -0
  54. package/src/multi-host.mjs +277 -0
@@ -0,0 +1,75 @@
1
+ ---
2
+ name: "plan-validate-execute"
3
+ description: "Three-phase write doctrine for cross-source synthesisers. A skill writes a plan JSON to <project>/Evidence/_plan/<skill>-plan.json, validates it against a source of truth (entities.yml / project-graph.json), and only then executes by writing the real artifact. Applies to link-entities (graph) and build-state (State/). Prevents silent corruption from bad inputs."
4
+ applies_to: "link-entities, build-state"
5
+ since: "kushi v5.0.1"
6
+ ---
7
+
8
+ # plan-validate-execute — doctrine
9
+
10
+ Cross-source synthesisers can silently corrupt the project tree if a malformed input survives a single run (e.g. an orphan reference in the graph propagates into State/ pages). The fix is a three-phase write:
11
+
12
+ 1. **Plan** — emit a structured JSON proposal.
13
+ 2. **Validate** — check it against a source of truth.
14
+ 3. **Execute** — write the real artifact only if validation passes.
15
+
16
+ This separates failures from corruption: a bad plan is a recoverable artifact; a bad execute is a stale tree.
17
+
18
+ ## File convention
19
+
20
+ ```
21
+ <engagement-root>/<project>/Evidence/_plan/
22
+ link-entities-plan.json
23
+ state-plan.json
24
+ ```
25
+
26
+ The `_plan/` folder is per-project, single-writer-per-skill, overwritten each run. Self-check does not gate on its presence; it is an intermediate artifact, not a contract.
27
+
28
+ ## Schema (common envelope)
29
+
30
+ ```json
31
+ {
32
+ "schema": "kushi.plan/v1",
33
+ "skill": "link-entities | build-state",
34
+ "project": "<project-name>",
35
+ "generated_at": "<ISO-8601>",
36
+ "kushi_version": "<x.y.z>",
37
+ "summary": { "<counter>": <int>, ... },
38
+ "items": [ { ... } ],
39
+ "validation": {
40
+ "checks_run": [ "<check-id>", ... ],
41
+ "passed": true | false,
42
+ "failures": [ { "code": "...", "item_ref": "...", "fix": "..." } ]
43
+ }
44
+ }
45
+ ```
46
+
47
+ ## Per-skill contracts
48
+
49
+ ### link-entities
50
+
51
+ - **Plan**: `Evidence/_plan/link-entities-plan.json` lists proposed nodes (one per resolved entity id) and edges (one per `(from, to, kind)` triple) with their source citations.
52
+ - **Validate against**: every per-source `Evidence/<alias>/<source>/_index/entities.yml`. Every plan node MUST resolve to a row in some contributor's entities.yml. Every edge endpoint MUST appear in `nodes[]`.
53
+ - **Execute**: write `Evidence/_graph/project-graph.json`.
54
+
55
+ If validation fails, the skill MUST NOT touch `project-graph.json`. It surfaces the failures to the orchestrator and exits non-zero (or returns `{ ok: false }` in `-Json` mode).
56
+
57
+ ### build-state
58
+
59
+ - **Plan**: `Evidence/_plan/state-plan.json` lists proposed State pages (one per category × entity) and `log.md` entries with their CSC citations.
60
+ - **Validate against**: `Evidence/_graph/project-graph.json` (every page MUST cite a node id; no orphan references) AND `_index/entities.yml` for every cited weekly file.
61
+ - **Execute**: write `State/index.md`, `State/log.md`, and per-category pages.
62
+
63
+ If validation fails, the skill MUST NOT touch `State/`. Same failure semantics as link-entities.
64
+
65
+ ## Self-check coupling
66
+
67
+ - `D20.graph` validates the executed `project-graph.json` (post-execute).
68
+ - `D21.state` validates the executed Karpathy State layout (post-execute).
69
+ - `D30.validation-loop` ensures the SKILL.md documents the plan/validate/execute loop with a `## Validation loop` section.
70
+
71
+ ## References
72
+
73
+ - `instructions/agentskills-compliance.instructions.md`
74
+ - `instructions/entity-graph.instructions.md`
75
+ - `instructions/karpathy-state-layout.instructions.md`
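The three phases described in this doctrine can be sketched for `link-entities` roughly as follows. This is a minimal illustration, not the shipped implementation. The function names, the failure codes (`unresolved-node`, `dangling-edge`), the check ids, and the shape of the written graph are all assumptions made for the example. Only the envelope keys and the file paths come from the doctrine text.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_plan(project, nodes, edges, kushi_version="0.0.0"):
    """Phase 1 (Plan): assemble a proposal in the common envelope."""
    items = [{"type": "node", "id": n} for n in nodes]
    items += [{"type": "edge", "from": f, "to": t, "kind": k} for f, t, k in edges]
    return {
        "schema": "kushi.plan/v1",
        "skill": "link-entities",
        "project": project,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "kushi_version": kushi_version,
        "summary": {"nodes": len(nodes), "edges": len(edges)},
        "items": items,
        "validation": {"checks_run": [], "passed": False, "failures": []},
    }


def validate_plan(plan, known_entity_ids):
    """Phase 2 (Validate): nodes must be known entities; edge ends must be plan nodes."""
    node_ids = {i["id"] for i in plan["items"] if i["type"] == "node"}
    failures = []
    for item in plan["items"]:
        if item["type"] == "node" and item["id"] not in known_entity_ids:
            # Hypothetical failure code: the doctrine does not enumerate codes.
            failures.append({"code": "unresolved-node", "item_ref": item["id"],
                             "fix": "add the entity to an entities.yml or drop the node"})
        elif item["type"] == "edge":
            for end in (item["from"], item["to"]):
                if end not in node_ids:
                    failures.append({"code": "dangling-edge", "item_ref": end,
                                     "fix": "add the endpoint to nodes[] or drop the edge"})
    plan["validation"] = {
        "checks_run": ["nodes-resolve", "edge-endpoints"],  # hypothetical check ids
        "passed": not failures,
        "failures": failures,
    }
    return plan


def execute_plan(plan, project_root):
    """Phase 3 (Execute): always persist the plan; write the graph only on a pass."""
    root = Path(project_root)
    plan_path = root / "Evidence" / "_plan" / "link-entities-plan.json"
    plan_path.parent.mkdir(parents=True, exist_ok=True)
    plan_path.write_text(json.dumps(plan, indent=2), encoding="utf-8")
    if not plan["validation"]["passed"]:
        # The real artifact is left untouched when validation fails.
        return {"ok": False, "failures": plan["validation"]["failures"]}
    graph_path = root / "Evidence" / "_graph" / "project-graph.json"
    graph_path.parent.mkdir(parents=True, exist_ok=True)
    graph = {  # assumed graph shape, for illustration only
        "nodes": [i for i in plan["items"] if i["type"] == "node"],
        "edges": [i for i in plan["items"] if i["type"] == "edge"],
    }
    graph_path.write_text(json.dumps(graph, indent=2), encoding="utf-8")
    return {"ok": True}
```

In this sketch a failed plan still lands in `_plan/` so it can be inspected, while `project-graph.json` is only created or replaced after validation passes.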
@@ -0,0 +1,52 @@
1
+ ---
2
+ title: Release genealogy
3
+ applies_to: every kushi release before tagging
4
+ status: stable
5
+ since: v5.0.1
6
+ ---
7
+
8
+ # Release genealogy doctrine
9
+
10
+ Every shipped tag MUST have a corresponding entry in `docs/genealogy.md`. The genealogy file is the canonical *why* for the project — `CHANGELOG.md` records *what changed*; genealogy records *what each release built on, what it enabled, and what trade-offs were accepted*.
11
+
12
+ ## When to write the entry
13
+
14
+ Before `git tag`. The entry is part of the ship checklist, not a follow-up.
15
+
16
+ ## Required fields
17
+
18
+ Every entry MUST contain:
19
+
20
+ 1. **Built on** — which prior release (or external work) this depends on. State the *causal* dependency, not just the version number ("Built on v4.9.0 CSC. Without per-contributor `_index/entities.yml` and stable entity IDs, cross-source linking would be guesswork.").
21
+ 2. **Why this release** — the problem this release solves, in 2–4 sentences.
22
+ 3. **What it enabled** — concrete capabilities now available that weren't before. Bullets OK.
23
+ 4. **Next ancestor** — name the next release in the lineage (forward link). Update the prior release's "Next ancestor" line when shipping.
24
+
25
+ ## Recommended fields
26
+
27
+ 5. **Inspired by** — external work, papers, gists, repos. Link them. This is the project's reading list.
28
+ 6. **Trade-offs accepted** — what was knowingly sacrificed. Includes locked decisions from design time (e.g. "Q1 LLM edges opt-in").
29
+ 7. **Patch lineage** — if patch releases under this entry fix bugs introduced here, list them inline rather than as separate sections.
30
+
31
+ ## Grouping rule
32
+
33
+ One section per release that **materially advanced** the lineage. Patch releases that only fix bugs in the parent (e.g. v4.2.1–v4.2.3 fixing v4.2.0's WorkIQ probe on Windows) MUST be named under the parent's "Patch lineage" line, not given their own section. This keeps the file readable while ensuring every tag is discoverable.
34
+
35
+ ## Self-check enforcement
36
+
37
+ `D31.genealogy` (in `plugin/skills/self-check/run.ps1`):
38
+
39
+ - Every `git tag` matching `v\d+\.\d+\.\d+` MUST appear in `docs/genealogy.md` either as a `## v<x.y.z>` heading or as an explicit named patch under a parent's "Patch lineage" line.
40
+ - Missing tags fail the check with the actionable fix: "Add an entry to docs/genealogy.md before re-tagging, or list this tag under a parent's Patch lineage line."
41
+
42
+ ## Cross-references
43
+
44
+ - README links to `docs/genealogy.md` under "Project lineage"
45
+ - CHANGELOG.md header notes: "For the *why* behind each release, see [docs/genealogy.md](docs/genealogy.md)"
46
+ - Major GitHub tag descriptions link to the genealogy entry for that release
47
+
48
+ ## Style
49
+
50
+ - Genealogy entries are short — 6–15 lines per release. If a release needs more, split the entry into "summary" + "deep-dive references" rather than letting the genealogy bloat.
51
+ - Write in present tense for what the release *does*; past tense for what it *replaced*.
52
+ - Don't list every file changed (that's CHANGELOG's job). Name only the structural shifts that matter to future readers.
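The `D31.genealogy` rule can be pictured with a small sketch. The actual check lives in `run.ps1` and is not shown in this diff, so the Python below is only an assumed equivalent: the function name `missing_genealogy_tags` is hypothetical, and the tag list and file text are passed in rather than read from `git tag` and `docs/genealogy.md`.

```python
import re

# Same tag pattern the doctrine quotes: v<major>.<minor>.<patch>
TAG_RE = re.compile(r"v\d+\.\d+\.\d+")
HEADING_RE = re.compile(r"##\s+(v\d+\.\d+\.\d+)\b.*")


def missing_genealogy_tags(tags, genealogy_text):
    """Return release tags with no '## v<x.y.z>' heading and no Patch lineage mention."""
    covered = set()
    for line in genealogy_text.splitlines():
        stripped = line.strip()
        heading = HEADING_RE.fullmatch(stripped)
        if heading:
            covered.add(heading.group(1))
        elif "Patch lineage" in stripped:
            # Only explicitly named versions count; ranges are not expanded.
            covered.update(TAG_RE.findall(stripped))
    # Tags that do not look like releases (e.g. "nightly") are ignored.
    return [t for t in tags if TAG_RE.fullmatch(t) and t not in covered]
```

Because the doctrine asks for explicitly named patches, this sketch treats a range such as "v4.2.1 through v4.2.3" as covering only the versions actually written out on the Patch lineage line.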
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: "aggregate-project"
3
3
  version: "1.0.0"
4
- description: "Pull-only multi-source aggregation for a project. Runs every enabled pull-* skill + consolidate-evidence, but DOES NOT run build-state. Produces a complete Evidence/ folder (the public contract) without Kushi's State/ overlay, intended for standalone use and for integration with external rollup systems."
4
+ description: "USE WHEN the user asks to \"pull everything for <project>\" or \"refresh evidence without rebuilding State\" for an already-bootstrapped kushi project. DO NOT USE for first-time setup (use bootstrap-project) or to answer questions (use ask-project). Capability: runs every enabled pull-* skill + consolidate-evidence; SKIPS build-state. Profile-aware. Verbatim-by-default: every enabled source dispatched. Writes per-user refresh report."
5
5
  ---
6
6
 
7
7
  # Skill: aggregate-project
@@ -91,4 +91,13 @@ See `docs/reference/evidence-contract.md` for the full schema (filename rules, f
91
91
 
92
92
  ## Issue Recovery
93
93
 
94
- When this skill exposes a reusable defect (auth pattern, doctrine gap, layout mismatch), apply the [Issue Recovery Rule](../../instructions/issue-recovery.instructions.md): fix the smallest correct repo-owned artifact first, prefer durable fixes over per-run workarounds, then re-run the narrowest failed check. Do NOT use memory as a substitute for correcting the workflow surface.
94
+ When this skill exposes a reusable defect (auth pattern, doctrine gap, layout mismatch), apply the [Issue Recovery Rule](../../instructions/issue-recovery.instructions.md): fix the smallest correct repo-owned artifact first, prefer durable fixes over per-run workarounds, then re-run the narrowest failed check. Do NOT use memory as a substitute for correcting the workflow surface.
95
+
96
+ ## Validation loop
97
+
98
+ After writing outputs:
99
+
100
+ 1. Run self-check targeted at this skill: `pwsh plugin/skills/self-check/run.ps1 -Targeted aggregate-project`
101
+ 2. If failures: fix and re-run the affected step (not the whole skill).
102
+ 3. Repeat until self-check exits 0.
103
+ 4. Only then update `run-log.yml` with success status.
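The validation loop added to each skill can be sketched as a bounded retry. The names here are hypothetical: `run_self_check` stands in for invoking the targeted `run.ps1` command and returning its exit code plus the failed steps, and `fix_step` stands in for re-running one affected step. The `max_attempts` bound is an assumption of this sketch, since the skill text only says to repeat until self-check exits 0.

```python
def run_validation_loop(run_self_check, fix_step, max_attempts=3):
    """Re-run the targeted self-check until it exits 0 or attempts run out.

    run_self_check: callable returning (exit_code, failed_steps)
    fix_step: callable that repairs and re-runs a single failed step
    """
    for attempt in range(1, max_attempts + 1):
        exit_code, failed_steps = run_self_check()
        if exit_code == 0:
            # Only at this point should the caller record success in run-log.yml.
            return {"status": "success", "attempts": attempt}
        for step in failed_steps:
            # Fix only the affected steps, not the whole skill.
            fix_step(step)
    return {"status": "failed", "attempts": max_attempts}
```

Keeping the two callables injected keeps the loop testable without PowerShell; a caller would write the `run-log.yml` success entry only when the returned status is `success`.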
@@ -2,7 +2,7 @@
2
2
  name: "apply-ado-update"
3
3
  version: "0.1.0-preview"
4
4
  status: "preview-stub"
5
- description: "Gated write skill: reads <engagement-root>/<project>/ado-updates/<date>/proposed.md, presents the diff for approval, applies approved items to ADO (PATCH field + POST comment), and appends to the per-project update ledger. v0.1.0-preview is a STUB: runs in dry-mode only and writes a planned-writes log; no real ADO PATCH/POST yet."
5
+ description: "USE WHEN the user has reviewed <project>/ado-updates/<date>/proposed.md and says \"apply the ADO updates\" / \"approve and push to ADO\" / \"commit the proposed work-item edits\". DO NOT USE without an existing proposed.md (run propose-ado-update first). Capability: gated write skill: presents diff for approval, applies approved items to Azure DevOps via WorkIQ, writes applied.md ledger with success/failure per work item."
6
6
  ---
7
7
 
8
8
  # Skill: apply-ado-update
@@ -127,4 +127,13 @@ After step 5 (approval), instead of step 6:
127
127
 
128
128
  ## Issue Recovery
129
129
 
130
- When this skill exposes a reusable defect (auth pattern, doctrine gap, layout mismatch), apply the [Issue Recovery Rule](../../instructions/issue-recovery.instructions.md): fix the smallest correct repo-owned artifact first, prefer durable fixes over per-run workarounds, then re-run the narrowest failed check. Do NOT use memory as a substitute for correcting the workflow surface.
130
+ When this skill exposes a reusable defect (auth pattern, doctrine gap, layout mismatch), apply the [Issue Recovery Rule](../../instructions/issue-recovery.instructions.md): fix the smallest correct repo-owned artifact first, prefer durable fixes over per-run workarounds, then re-run the narrowest failed check. Do NOT use memory as a substitute for correcting the workflow surface.
131
+
132
+ ## Validation loop
133
+
134
+ After writing outputs:
135
+
136
+ 1. Run self-check targeted at this skill: `pwsh plugin/skills/self-check/run.ps1 -Targeted apply-ado-update`
137
+ 2. If failures: fix and re-run the affected step (not the whole skill).
138
+ 3. Repeat until self-check exits 0.
139
+ 4. Only then update `run-log.yml` with success status.
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: "ask-project"
3
3
  version: "4.0.0"
4
- description: "Read-only Q&A over an already-bootstrapped project (v5.0.0). Answers natural-language questions using State/, Evidence/, and weekly CSC files. v5 cross-source questions go GRAPH-FIRST: consult Evidence/_graph/project-graph.json before walking weekly files. 3-step reader fallback (_index/entities.yml → weekly/*.md → legacy snapshot/ + stream/). Cited; no source pulls; no outbound."
4
+ description: "USE WHEN the user asks a natural-language question about a known kushi project (\"what was decided about MACC?\", \"who is the EM on Acme?\", \"list open follow-ups for <project>\"). Read-only; auto-routes when a message names a known project. DO NOT USE for cross-project search or for questions about unbootstrapped projects. Capability: graph-first for cross-source questions (v5.0.0), walks _index/entities.yml → weekly CSC → legacy snapshot/+stream/ fallback chain. Every assertion cited."
5
5
  ---
6
6
 
7
7
  # Skill: ask-project
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: "bootstrap-project"
3
- version: "4.0.0"
4
- description: "First-time setup for a project (kushi v4.9.0+): machine preflight, side-by-side config, engagement-root + project resolution, customer-hint discovery sweep across all WorkIQ-driven sources, initial full-window CSC pull across all enabled sources writing weekly/ + _index/ per source. All pull-* skills write CSC blocks per comprehensive-structured-capture.instructions.md. Verbatim-by-default: every enabled source dispatched, no silent skips. Discovery sweep MANDATORY before declaring blocked-config. CRM bootstrap discovery REQUIRED per `crm-bootstrap-discovery.instructions.md`. Writes per-user refresh report per `run-reports.instructions.md`. Cleans stale no-match notes on resolution per `cleanup-on-resolution.instructions.md`. Builds State/ only on `full` profile. Idempotent."
3
+ version: "4.0.1"
4
+ description: "USE WHEN the user says \"bootstrap <X>\", \"set up project evidence for <X>\", \"add me to project <X>\", \"I'm new to <X>\", or \"do it all for <X>\" AND the project has no Evidence/ folder yet. DO NOT USE for incremental refresh of an existing project (use refresh-project) or for tenant-wide M365 setup (use setup). Capability: machine preflight, side-by-side config, engagement-root + project resolution, customer-hint discovery sweep, initial full-window CSC pull across all enabled sources writing weekly/ + _index/. Builds State/ only on `full` profile."
5
5
  ---
6
6
 
7
7
  # Skill: bootstrap-project
@@ -49,6 +49,22 @@ After every run (success or coverage-gaps), write `<project>/bootstrap-status.md
49
49
 
50
50
  ## Steps
51
51
 
52
+
53
+ ## Step checklist
54
+
55
+ Progress-trackable view of the steps below. Each `### Step` block expands the corresponding checkbox.
56
+
57
+ - [ ] Step 0 — Delegate ALL onboarding questions to the `setup` skill (REQUIRED, never asks here)
58
+ - [ ] Step 1 — Machine preflight (SETUP)
59
+ - [ ] Step 2 — Side-by-side config bootstrap
60
+ - [ ] Step 3 — Project folder scaffold
61
+ - [ ] Step 3.5 — Customer-hint discovery sweep (REQUIRED, kushi v4.8.0+)
62
+ - [ ] Step 4 — Initial pull (last 30 days)
63
+ - [ ] Step 5 — Consolidate (single contributor = no-op)
64
+ - [ ] Step 6 — Build state (FULL PROFILE ONLY)
65
+ - [ ] Step 6b — Cross-source graph + dashboard + tour (kushi v5.0.0)
66
+ - [ ] Step 7 — Verify + summary
67
+
52
68
  ### Step 0 — Delegate ALL onboarding questions to the `setup` skill (REQUIRED, never asks here)
53
69
 
54
70
  Per `identity-resolution.instructions.md` (v4.4.5 extension). The bootstrap orchestrator **MUST NOT** itself prompt for identity, OneNote, mailbox folders, or `projects_root`. The `setup` skill owns those questions.
@@ -122,137 +138,19 @@ The `State/` subtree is created **only on `full` profile**. On `standard`, only
122
138
 
123
139
  ### Step 3.5 — Customer-hint discovery sweep (REQUIRED, kushi v4.8.0+)
124
140
 
125
- Per `customer-hint-discovery.instructions.md` — **before the boundaries gate runs in Step 4**, bootstrap MUST attempt a WorkIQ-driven discovery sweep for every source whose boundary is currently empty AND whose WorkIQ-driven discovery doctrine exists. This prevents the silent-skip defect where bootstrap scaffolds empty boundaries and immediately writes every source as `blocked-config` without ever asking WorkIQ what mentions the customer.
126
-
127
- For each source in the table below, if `<engagement-root>/<project>/integrations.yml#boundaries.<source>.<required-key>` is empty (or contains only `needs_review: true` rows from a prior incomplete sweep), dispatch the sweep using the customer hint from Step 1 + `<discoveryLookbackDays>` (default 90, configurable via `m365-mutable.json#bootstrap.discoveryLookbackDays`):
128
-
129
- | Source | Sweep doctrine | Populates | Required-key gate |
130
- |---|---|---|---|
131
- | email | `email-bootstrap-discovery.instructions.md` | `boundaries.email.mailboxes[]` | `mailboxes` empty |
132
- | teams | `teams-bootstrap-discovery.instructions.md` | `boundaries.teams.chat_ids[]` + `channel_ids[]` | BOTH empty |
133
- | meetings | `meetings-bootstrap-discovery.instructions.md` | `boundaries.meetings.series_join_urls[]` | `series_join_urls` empty |
134
- | sharepoint | `sharepoint-bootstrap-discovery.instructions.md` | `boundaries.sharepoint.site_urls[]` | `site_urls` empty AND `local_folders` empty |
135
- | onenote | existing Step 4a (display-name driven, v4.7.x) | `boundaries.onenote.section_file_ids[]` + `section_group_ids[]` | both empty |
136
- | loop | `loop-bootstrap-discovery.instructions.md` (v4.6.0) | `boundaries.loop.workspace_ids[]` | empty |
137
- | crm | `crm-bootstrap-discovery.instructions.md` (v3.11.0) | `boundaries.crm.record_ids[]` + `request_ids[]` | both empty AND `crm:` shared block populated |
138
- | ado | `ado-bootstrap-discovery.instructions.md` | `boundaries.ado.area_paths[]` | empty AND `ado:` shared block populated |
139
-
140
- For each sweep:
141
+ Per `customer-hint-discovery.instructions.md` — before the boundaries gate runs in Step 4, bootstrap MUST attempt a WorkIQ-driven discovery sweep for every source whose boundary is currently empty.
141
142
 
142
- 1. Read the per-source doctrine for the exact approved WorkIQ query shape.
143
- 2. Issue ONE WorkIQ ask per source per project (chats + channels are TWO asks under the same teams doctrine).
144
- 3. Parse the response per the doctrine's parsing rules.
145
- 4. Cap at 10 candidates by recency (or per-doctrine ordering).
146
- 5. Append discovered values to `<project>/integrations.yml#boundaries.<source>.<key>` as plain strings (deduplicate by exact-string equality).
147
- 6. Write sidecar `<project>/Evidence/<alias>/_discovery/<YYYY-MM-DD>_<source>_discovery.yml` with per-row metadata (`discovered_by`, `discovered_at`, `needs_review: true`, `confidence`, `query`, `workiq_request_id`). The sidecar lives INSIDE the contributor's alias folder (discovery results are per-identity); filename is NOT suffixed with `-<alias>` since the folder already namespaces the owner.
148
- 7. If `> 10` candidates: append the remainder to `<project>/OPEN-QUESTIONS-DRAFT.md` under `## Discovery sweep — candidates over cap`.
149
- 8. If `0` candidates: write `last_status: unresolved` (NOT `blocked-config`) and append a one-line widen-hint suggestion to Open Questions.
150
- 9. If WorkIQ errors: write a `deferred-retry` marker per `deferred-retry-on-workiq-fail.instructions.md` and set `last_status: deferred`. Do NOT skip ahead to `blocked-config`.
151
-
152
- Run sweeps **in parallel** where possible (they're independent WorkIQ asks). Total wall time should be ≤ 5× single-ask latency.
153
-
154
- **Rerun rule** — if the boundary already has at least one row that is NOT `needs_review: true` in the sidecar (user manually confirmed), the sweep is **skipped** for that source. Boundary is gospel. Pass `--force-rediscover` to override.
155
-
156
- **Forbidden:** declaring `last_status: blocked-config` for email/teams/meetings/sharepoint without first running this sweep. That is a defect per `customer-hint-discovery.instructions.md` § The rule. `blocked-config` is only legitimate when a prerequisite is genuinely missing (CRM/ADO shared config, SharePoint when both `site_urls` AND `local_folders` are empty AND the sweep returned 0).
157
-
158
- After all sweeps complete, write the `## Discovery Sweep Results` table to `<project>/bootstrap-status.md` per `customer-hint-discovery.instructions.md` § Required outputs (4).
143
+ > **Load `references/discovery-sweep.md`** for the full per-source sweep table, sidecar format, parallelism rules, rerun doctrine, and cap-at-10 behavior. Load when any source boundary is empty before Step 4.
159
144
 
160
145
  ### Step 4 — Initial pull (last 30 days)
161
146
 
162
- **Boundaries gate** (kushi v3.7.0+, per `scope-boundaries.instructions.md`): before dispatching any `pull-*`, read `<engagement-root>/<project>/integrations.yml#boundaries` and verify each enabled source has its required boundary key populated. After Step 3.5, many of these should now contain discovered rows (annotated `needs_review: true` in their sidecars). For sources where the sweep ran but returned 0 candidates, the status is `unresolved` (not `blocked-config`) — add a one-liner to `<project>/OPEN-QUESTIONS-DRAFT.md` asking the user to widen the hint or manually seed the boundary. For sources where the sweep COULD NOT run because a prerequisite is genuinely missing (CRM/ADO shared connection block empty; SharePoint local-folder discovery), `blocked-config` is correct and the `next_step` MUST cite the specific missing field.
163
-
164
- For CRM and ADO additionally verify the shared connection block exists in `<workspace>/.kushi/config/shared/integrations.yml` (`crm:` block with `environment_url` + `tenant_id`, OR `ado:` block with `organization` + `project`) with non-placeholder values. **As of kushi v4.8.4, these blocks ship PRE-FILLED with the canonical Microsoft Industry Solutions defaults** (iscrm.crm.dynamics.com / Microsoft tenant / IndustrySolutions ADO org / IS Engagements project), so on a fresh install this preflight passes by default. Only if a contributor has hand-edited the file to empty values or works in a non-Microsoft tenant will this preflight fail — in that case, prompt the user to fix the affected field directly and park in Open Questions with the path. **Do NOT auto-improvise** by inferring a tenant/org or by narrating CRM evidence from email — both are explicit anti-patterns in v3.7.0.
165
-
166
- **CRM discovery is REQUIRED before declaring `disabled` (kushi v3.11.0+, per `crm-bootstrap-discovery.instructions.md`).** If `<workspace>/.kushi/config/shared/integrations.yml` exists and `az` auth succeeds, bootstrap MUST run the full 4-step Dataverse REST resolution sequence from `pull-crm/SKILL.md#resolution-order-when-crmrecordid-is-unset` (title-first → all matching accounts → wide-text → recent-slice → ask user) against the live endpoint before writing `boundaries.crm.disabled: true`. Any other path that sets `disabled: true` is a defect. If steps 1–4 all return 0, present the top 5 candidates from step 4 to the user before final disposition. Log the full attempt trail (queries + counts + outcome) to the bootstrap refresh-report under `## CRM resolution attempts`. If auth fails or Dataverse is unreachable, leave the boundary empty with `reason: 'crm-auth-unavailable-<date>'` — NOT `disabled: true` — so the next refresh retries.
167
-
168
- #### Step 4a — Discovery & registry persistence (REQUIRED, kushi v3.7.8+, per `m365-id-registry.instructions.md`)
169
-
170
- **Doctrine: bootstrap discovers, refresh consumes.** Bootstrap is the ONLY phase that probes WorkIQ to resolve canonical M365 identifiers. Refresh runs MUST read these from `m365-mutable.json#knownSections.<project>` and pass them into the index extractor verbatim. Refresh must NEVER re-discover — that is the source of "OneNote works for me sometimes" non-determinism.
171
-
172
- For each enabled source, resolve and persist the canonical lookup keys into `<workspace>/.kushi/config/user/m365-mutable.json#knownSections.<projectKey>`. The schema is fixed — populate every key the source supports:
173
-
174
- ```jsonc
175
- "knownSections": {
176
- "<projectKey>": {
177
- // OneNote (pull-onenote uses these as Step A inputs)
178
- "one_sectionName": "<displayName>.one",
179
- "one_sectionFileId": "<wdsectionfileid GUID>", // PRIMARY — consumed verbatim by pull-onenote
180
- "one_sectionGroupId": "<wdsectiongroupid GUID>", // when boundary is a section group
181
- "one_sectionOneNoteGuid": "<wdsectiononenoteguid GUID>", // alternate identifier (older notebooks)
182
- "one_sectionPath": "/<group>/<section>.one", // human-readable path for run-reports
183
- "one_notebookSourceDoc": "<notebook sourceDoc GUID>", // parentReferenceId fallback
184
- // Email
185
- "emailContext": "Inbox/<folder-path>",
186
- // Teams
187
- "teamsChatContext": {
188
- "chatHints": ["..."], // exact chat topics
189
- "channelHints": ["..."], // exact channel display names
190
- "participantHints": ["..."] // exact display names
191
- },
192
- // SharePoint (when boundary is a SP site/library/folder)
193
- "sp_siteId": "<siteId>",
194
- "sp_webId": "<webId>",
195
- "sp_listId": "<listId>",
196
- "sp_path": "/<site>/<library>/<folder>"
197
- }
198
- }
199
- ```
200
-
201
- **OneNote discovery procedure (deterministic, follow exactly).**
147
+ #### Step 4a — Discovery & registry persistence (REQUIRED, kushi v3.7.8+)
202
148
 
203
- Per `workiq-onenote-query-shape.instructions.md`, the **only** WorkIQ query shape that returns OneNote data is **natural-language naming by display name**, scoped to one section in one notebook. The doctrine file lists every empirically-broken phrasing (enumeration verbs, structured-field requests, filter-syntax expressions, ID-lookup questions) in its "Forbidden phrasings" table; see that file for the canonical anti-pattern list. Drivers MUST NOT emit any of those phrasings; they fail empirically (WorkIQ punts to Graph or routes to summary mode).
204
-
205
- 1. For each entry in `boundaries.onenote.section_names[]` (or the user-provided section name, e.g. `HCA`), run **one** narrow WorkIQ query per section using display names only:
206
- ```
207
- workiq ask -q "In the OneNote notebook '<NOTEBOOK DISPLAY NAME>', show me the pages in the section named '<SECTION DISPLAY NAME>'. Return a flat table with: page title, last modified, web URL. No commentary. Do not truncate."
208
- ```
209
- The notebook display name comes from `m365Auth.oneNote.defaultNotebookName` in `m365-auth.json` (persisted during `setup`). The section display name comes from `boundaries.onenote.section_names[]` in `integrations.yml`.
210
- 2. **Parse GUIDs out of the URL fragments** in the response — the `web URL` column contains `Doc.aspx` URLs of the form `...?...&wd=target(<section>|<wdsectionfileid>/...)&...&wdpartid={GUID}{1}&wdsectionfileid={GUID}`. Extract `wdsectionfileid`, `wdpartid`, and `sourcedoc` GUIDs. Do NOT interpret prose summaries.
211
- 3. If the table is empty or the query returns a Graph-Explorer-style punt (classified per `fallback-status-reporting.instructions.md`), mark the section `disabled = true, reason = "workiq-discovery-failed"` in `integrations.yml`, write a one-line entry in `bootstrap-status.md`, and continue. Do NOT escalate to Playwright at bootstrap time — escalation is per `workiq-onenote-query-shape.instructions.md` "When (and only when) to use Playwright" (refresh-time, opted-in, threshold-driven).
212
- 4. Persist resolved IDs to `m365-mutable.json#knownSections.<projectKey>` per `m365-id-registry.instructions.md` and mirror into `boundaries.onenote.section_file_ids[]` / `section_group_ids[]` in `integrations.yml`.
213
- 5. Record one line in `bootstrap-status.md`: `OneNote: resolved wdsectionfileid=<id> via natural-language query for section "<name>" in notebook "<notebook>" (N pages enumerated)`.
214
- 6. **Browser-URL fields are OPTIONAL at bootstrap (kushi v4.7.3+).** The Playwright-required fields (`one_notebookSourceDoc`, `one_notebookSpoBaseUrl`, `one_sectionWebUrl`, `one_sectionName`) are only needed if the user has opted into the Playwright recovery fallback (`m365Auth.oneNote.playwrightFallback: true`). When opted in, run `recapture-section-url.mjs --check`; if exit 1, prompt for the address-bar URL paste. When NOT opted in, skip this gate entirely — WorkIQ-only operation does not need these fields.
215
-
216
- **SharePoint, Teams, Email, CRM, ADO** follow the same shape: bootstrap discovers and persists; refresh consumes. Per-source discovery procedures live in each `pull-*/SKILL.md`'s "Bootstrap discovery" section. Bootstrap MUST invoke each pull-*'s discovery probe with the user-supplied seed (folder name, channel name, request id, work item id), persist the resolved IDs, and only then dispatch the pull.
149
+ > **Load `references/registry-persistence.md`** for the `knownSections` JSON schema, OneNote deterministic discovery procedure (GUID extraction from URL fragments), and per-source ID-persistence rules, all per `m365-id-registry.instructions.md`. Load when resolving canonical M365 identifiers for any source.
217
150
 
218
151
  #### Step 4b — Dispatch (with verification gate after each pull)
219
152
 
220
- Then dispatch to each enabled per-source skill with `--window last 30 days` (each skill self-refuses if its boundary is still empty). **After every dispatch**, run the per-source verification gate per `..\..\instructions\per-source-verification-gate.instructions.md`:
221
-
222
- ```
223
- for source in [pull-email, pull-teams, pull-meetings, pull-onenote, pull-loop,
224
- pull-sharepoint, pull-crm, pull-ado, pull-misc]:
225
- if not enabled(source): continue
226
- result = dispatch(source, --window "last 30 days")
227
- gate = run_verification_gate(source, result)
228
- if gate.status == "fail":
229
- retry = dispatch(source, --window "last 30 days", --retry)
230
- gate2 = run_verification_gate(source, retry)
231
- if gate2.status == "fail":
232
- append_to_followups(<project>/FOLLOW-UPS.md, gate2)
233
- append_to_runlog(<project>/Evidence/run-log.yml, source, "failed-gate")
234
- append_to_tracking(gate2.process_audit_failures)
235
- # do NOT abort the whole bootstrap — continue to next source
236
- # next source
237
- ```
238
-
239
- Source order (deterministic, do not reshuffle):
240
-
241
- 1. `pull-email`
242
- 2. `pull-teams`
243
- 3. `pull-meetings`
244
- 4. `pull-onenote`
245
- 5. `pull-loop` (if enabled — boundary `boundaries.loop.workspace_ids[]` populated)
246
- 6. `pull-sharepoint`
247
- 7. `pull-crm` (if enabled)
248
- 8. `pull-ado` (if enabled)
249
- 9. `pull-misc` (if `<project>/external-links.txt` exists with ≥ 1 non-placeholder, non-delegated link)
250
-
251
- > **v4.9.0 SUPERSEDED.** Bootstrap walks the full window now; pull-* write `weekly/<YYYY-MM-DD>_<source>-csc.md` + `_index/entities.yml` per source (not snapshot/ + stream/). See `weekly-csc.instructions.md` and `comprehensive-structured-capture.instructions.md`.
252
-
253
- Each produces CSC weekly output per `weekly-csc.instructions.md` + `comprehensive-structured-capture.instructions.md`. The gate enforces: shape (weekly/ + _index/ + verbatim/ for meetings) → per-source extras (transcript-class file for meetings, sectionId for onenote, siteId for sharepoint, chatId for teams, folders for email, engagementRecordId for crm, areaPath for ado) → process-compliance (workiq-first / verbatim-by-default / capture-learnings / run-report mention / fuzzy-disambiguation cited / cleanup-on-resolution / citation-ledger).
254
-
255
- **pull-misc bootstrap note:** if `<project>/external-links.txt` does NOT exist, scaffold it from `templates/init/external-links.template.txt` so the user has a place to paste links. Mark the source as `enabled: true, links: 0` in `integrations.yml#boundaries.misc` and skip the dispatch (nothing to fetch yet).
153
+ > **Load `references/pull-dispatch.md`** for the boundaries gate logic, CRM discovery requirement, verification-gate pseudocode, per-source dispatch order, and pull-misc scaffolding rules. Load when dispatching per-source pulls after discovery is complete.
256
154
 
257
155
  ### Step 5 — Consolidate (single contributor = no-op)
258
156
 
@@ -311,6 +209,20 @@ When this skill exposes a reusable defect (auth pattern, doctrine gap, layout mi
311
209
 
312
210
  ## Changelog
313
211
 
212
+
213
+ - **v4.0.1 (kushi v5.0.1, 2026-05-26)**: agentskills.io spec-compliance pass. Extracted customer-hint
214
+ discovery sweep → `references/discovery-sweep.md`; registry persistence (knownSections schema +
215
+ OneNote discovery procedure) → `references/registry-persistence.md`; pull-dispatch loop (boundaries
216
+ gate + CRM discovery + verification-gate pseudocode) → `references/pull-dispatch.md`. SKILL.md
217
+ trimmed from 341 to ~190 lines. Behaviour unchanged; load-on-trigger pointers added.
218
+ - **v4.0.0 (kushi v5.0.0, 2026-05-26)**: After build-state, dispatch the v5 enrichment chain — `link-entities` → `dashboard` → `tour`. Each step's success/failure is recorded in the run summary; a failure of one step does not block the next.
219
+ - **v3.0.0 (kushi v4.9.0, 2026-05-26)**: Drops "1 verbatim per source per bootstrap" carve-out. Walks the full boundary window per source. Preflight ensures `weekly/` + `_index/` exist per source. All pull-* invoked write CSC blocks per comprehensive-structured-capture.instructions.md.
220
+
221
+ ## Validation loop
222
+
223
+ After writing outputs:
314
224
 
315
- - **v4.0.0 (kushi v5.0.0, 2026-05-26)**: After build-state, dispatch the v5 enrichment chain — `link-entities` → `dashboard` → `tour`. Each step's success/failure is recorded in the run summary; a failure of one step does not block the next.
316
- - **v3.0.0 (kushi v4.9.0, 2026-05-26)**: Drops "1 verbatim per source per bootstrap" carve-out. Walks the full boundary window per source. Preflight ensures `weekly/` + `_index/` exist per source. All pull-* invoked write CSC blocks per comprehensive-structured-capture.instructions.md.
225
+ 1. Run self-check targeted at this skill: `pwsh plugin/skills/self-check/run.ps1 -Targeted bootstrap`
226
+ 2. If failures: fix and re-run the affected step (not the whole skill).
227
+ 3. Repeat until self-check exits 0.
228
+ 4. Only then update `run-log.yml` with success status.
@@ -0,0 +1,40 @@
1
+ # references/discovery-sweep.md (bootstrap-project)
2
+
3
+ > **Load this file when** executing Step 3.5 (customer-hint discovery sweep) — i.e., when at least one source boundary is empty and you need the per-source WorkIQ discovery procedure, table of sweep targets, sidecar format, and rerun/cap rules.
4
+
5
+ ## Customer-hint discovery sweep (REQUIRED, kushi v4.8.0+)
6
+
7
+ Per `customer-hint-discovery.instructions.md` — **before the boundaries gate runs in Step 4**, bootstrap MUST attempt a WorkIQ-driven discovery sweep for every source whose boundary is currently empty AND whose WorkIQ-driven discovery doctrine exists. This prevents the silent-skip defect where bootstrap scaffolds empty boundaries and immediately writes every source as `blocked-config` without ever asking WorkIQ what mentions the customer.
8
+
9
+ For each source in the table below, if `<engagement-root>/<project>/integrations.yml#boundaries.<source>.<required-key>` is empty (or contains only `needs_review: true` rows from a prior incomplete sweep), dispatch the sweep using the customer hint from Step 1 + `<discoveryLookbackDays>` (default 90, configurable via `m365-mutable.json#bootstrap.discoveryLookbackDays`):
10
+
11
+ | Source | Sweep doctrine | Populates | Required-key gate |
12
+ |---|---|---|---|
13
+ | email | `email-bootstrap-discovery.instructions.md` | `boundaries.email.mailboxes[]` | `mailboxes` empty |
14
+ | teams | `teams-bootstrap-discovery.instructions.md` | `boundaries.teams.chat_ids[]` + `channel_ids[]` | BOTH empty |
15
+ | meetings | `meetings-bootstrap-discovery.instructions.md` | `boundaries.meetings.series_join_urls[]` | `series_join_urls` empty |
16
+ | sharepoint | `sharepoint-bootstrap-discovery.instructions.md` | `boundaries.sharepoint.site_urls[]` | `site_urls` empty AND `local_folders` empty |
17
+ | onenote | existing Step 4a (display-name driven, v4.7.x) | `boundaries.onenote.section_file_ids[]` + `section_group_ids[]` | both empty |
18
+ | loop | `loop-bootstrap-discovery.instructions.md` (v4.6.0) | `boundaries.loop.workspace_ids[]` | empty |
19
+ | crm | `crm-bootstrap-discovery.instructions.md` (v3.11.0) | `boundaries.crm.record_ids[]` + `request_ids[]` | both empty AND `crm:` shared block populated |
20
+ | ado | `ado-bootstrap-discovery.instructions.md` | `boundaries.ado.area_paths[]` | empty AND `ado:` shared block populated |
21
+
22
+ For each sweep:
23
+
24
+ 1. Read the per-source doctrine for the exact approved WorkIQ query shape.
25
+ 2. Issue ONE WorkIQ ask per source per project (chats + channels are TWO asks under the same teams doctrine).
26
+ 3. Parse the response per the doctrine's parsing rules.
27
+ 4. Cap at 10 candidates by recency (or per-doctrine ordering).
28
+ 5. Append discovered values to `<project>/integrations.yml#boundaries.<source>.<key>` as plain strings (deduplicate by exact-string equality).
29
+ 6. Write sidecar `<project>/Evidence/<alias>/_discovery/<YYYY-MM-DD>_<source>_discovery.yml` with per-row metadata (`discovered_by`, `discovered_at`, `needs_review: true`, `confidence`, `query`, `workiq_request_id`). The sidecar lives INSIDE the contributor's alias folder (discovery results are per-identity); filename is NOT suffixed with `-<alias>` since the folder already namespaces the owner.
30
+ 7. If `> 10` candidates: append the remainder to `<project>/OPEN-QUESTIONS-DRAFT.md` under `## Discovery sweep — candidates over cap`.
31
+ 8. If `0` candidates: write `last_status: unresolved` (NOT `blocked-config`) and append a one-line widen-hint suggestion to Open Questions.
32
+ 9. If WorkIQ errors: write a `deferred-retry` marker per `deferred-retry-on-workiq-fail.instructions.md` and set `last_status: deferred`. Do NOT skip ahead to `blocked-config`.
33
+
34
+ Run sweeps **in parallel** where possible (they're independent WorkIQ asks). Total wall time should be ≤ 5× single-ask latency.
35
+
36
+ **Rerun rule** — if the boundary already has at least one row that is NOT `needs_review: true` in the sidecar (user manually confirmed), the sweep is **skipped** for that source. Boundary is gospel. Pass `--force-rediscover` to override.
37
+
38
+ **Forbidden:** declaring `last_status: blocked-config` for email/teams/meetings/sharepoint without first running this sweep. That is a defect per `customer-hint-discovery.instructions.md` § The rule. `blocked-config` is only legitimate when a prerequisite is genuinely missing (CRM/ADO shared config, SharePoint when both `site_urls` AND `local_folders` are empty AND the sweep returned 0).
39
+
40
+ After all sweeps complete, write the `## Discovery Sweep Results` table to `<project>/bootstrap-status.md` per `customer-hint-discovery.instructions.md` § Required outputs (4).
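The cap, de-duplication, and status rules from the sweep steps above (4, 5, 7, 8, and 9) can be sketched in Python as follows. This is an illustration under stated assumptions, not kushi's implementation: `apply_sweep_result` and `SweepOutcome` are hypothetical names, candidates are assumed to arrive already ordered by recency, and the source does not name a success status, so `last_status` is left as `None` in that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CAP = 10  # step 4: keep at most 10 candidates


@dataclass
class SweepOutcome:
    boundary: List[str]                                 # values for integrations.yml
    overflow: List[str] = field(default_factory=list)   # remainder for OPEN-QUESTIONS-DRAFT.md
    last_status: Optional[str] = None                   # None = no failure status to record


def apply_sweep_result(existing, candidates, workiq_error=False):
    """Hypothetical helper: merge one sweep's candidates into a boundary list."""
    if workiq_error:
        # Step 9: mark deferred; never fall through to blocked-config.
        return SweepOutcome(boundary=list(existing), last_status="deferred")

    # Exact-string de-duplication that preserves the incoming order.
    unique = list(dict.fromkeys(candidates))
    if not unique:
        # Step 8: zero candidates is "unresolved", not "blocked-config".
        return SweepOutcome(boundary=list(existing), last_status="unresolved")

    kept, overflow = unique[:CAP], unique[CAP:]  # steps 4 and 7
    merged = list(existing)
    for value in kept:
        if value not in merged:  # step 5: skip values already in the boundary
            merged.append(value)
    return SweepOutcome(boundary=merged, overflow=overflow)
```

Returning the overflow separately keeps the file-writing side effects (sidecar, Open Questions) out of the merge logic, which makes the rules easy to check in isolation.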
@@ -0,0 +1,50 @@
1
+ # references/pull-dispatch.md (bootstrap-project)
2
+
3
+ > **Load this file when** executing Step 4b (per-source dispatch loop) — i.e., when you need the boundaries gate logic, per-source dispatch order, verification-gate pseudocode, CRM discovery requirement, and pull-misc scaffolding rules.
4
+
5
+ ## Boundaries gate (kushi v3.7.0+)
6
+
7
+ Per `scope-boundaries.instructions.md`: before dispatching any `pull-*`, read `<engagement-root>/<project>/integrations.yml#boundaries` and verify each enabled source has its required boundary key populated. After Step 3.5, many of these should now contain discovered rows (annotated `needs_review: true` in their sidecars). For sources where the sweep ran but returned 0 candidates, the status is `unresolved` (not `blocked-config`) — add a one-liner to `<project>/OPEN-QUESTIONS-DRAFT.md` asking the user to widen the hint or manually seed the boundary. For sources where the sweep COULD NOT run because a prerequisite is genuinely missing (CRM/ADO shared connection block empty; SharePoint local-folder discovery), `blocked-config` is correct and the `next_step` MUST cite the specific missing field.
8
+
9
+ For CRM and ADO additionally verify the shared connection block exists in `<workspace>/.kushi/config/shared/integrations.yml` (`crm:` block with `environment_url` + `tenant_id`, OR `ado:` block with `organization` + `project`) with non-placeholder values. **As of kushi v4.8.4, these blocks ship PRE-FILLED with the canonical Microsoft Industry Solutions defaults** (iscrm.crm.dynamics.com / Microsoft tenant / IndustrySolutions ADO org / IS Engagements project), so on a fresh install this preflight passes by default. Only if a contributor has hand-edited the file to empty values or works in a non-Microsoft tenant will this preflight fail — in that case, prompt the user to fix the affected field directly and park in Open Questions with the path. **Do NOT auto-improvise** by inferring a tenant/org or by narrating CRM evidence from email — both are explicit anti-patterns in v3.7.0.
10
+
11
+ **CRM discovery is REQUIRED before declaring `disabled` (kushi v3.11.0+, per `crm-bootstrap-discovery.instructions.md`).** If `<workspace>/.kushi/config/shared/integrations.yml` exists and `az` auth succeeds, bootstrap MUST run the full 4-step Dataverse REST resolution sequence from `pull-crm/SKILL.md#resolution-order-when-crmrecordid-is-unset` (title-first → all matching accounts → wide-text → recent-slice → ask user) against the live endpoint before writing `boundaries.crm.disabled: true`. Any other path that sets `disabled: true` is a defect. If steps 1–4 all return 0, present the top 5 candidates from step 4 to the user before final disposition. Log the full attempt trail (queries + counts + outcome) to the bootstrap refresh-report under `## CRM resolution attempts`. If auth fails or Dataverse is unreachable, leave the boundary empty with `reason: 'crm-auth-unavailable-<date>'` — NOT `disabled: true` — so the next refresh retries.
12
+
13
+ ## Step 4b — Dispatch (with verification gate after each pull)
14
+
15
+ Dispatch to each enabled per-source skill with `--window last 30 days` (each skill self-refuses if its boundary is still empty). **After every dispatch**, run the per-source verification gate per `..\..\instructions\per-source-verification-gate.instructions.md`:
16
+
17
+ ```
18
+ for source in [pull-email, pull-teams, pull-meetings, pull-onenote, pull-loop,
19
+ pull-sharepoint, pull-crm, pull-ado, pull-misc]:
20
+ if not enabled(source): continue
21
+ result = dispatch(source, --window "last 30 days")
22
+ gate = run_verification_gate(source, result)
23
+ if gate.status == "fail":
24
+ retry = dispatch(source, --window "last 30 days", --retry)
25
+ gate2 = run_verification_gate(source, retry)
26
+ if gate2.status == "fail":
27
+ append_to_followups(<project>/FOLLOW-UPS.md, gate2)
28
+ append_to_runlog(<project>/Evidence/run-log.yml, source, "failed-gate")
29
+ append_to_tracking(gate2.process_audit_failures)
30
+ # do NOT abort the whole bootstrap — continue to next source
31
+ # next source
32
+ ```
33
+
34
+ Source order (deterministic, do not reshuffle):
35
+
36
+ 1. `pull-email`
37
+ 2. `pull-teams`
38
+ 3. `pull-meetings`
39
+ 4. `pull-onenote`
40
+ 5. `pull-loop` (if enabled — boundary `boundaries.loop.workspace_ids[]` populated)
41
+ 6. `pull-sharepoint`
42
+ 7. `pull-crm` (if enabled)
43
+ 8. `pull-ado` (if enabled)
44
+ 9. `pull-misc` (if `<project>/external-links.txt` exists with ≥ 1 non-placeholder, non-delegated link)
45
+
46
+ > **v4.9.0 SUPERSEDED.** Bootstrap walks the full window now; pull-* write `weekly/<YYYY-MM-DD>_<source>-csc.md` + `_index/entities.yml` per source (not snapshot/ + stream/). See `weekly-csc.instructions.md` and `comprehensive-structured-capture.instructions.md`.
47
+
48
+ Each produces CSC weekly output per `weekly-csc.instructions.md` + `comprehensive-structured-capture.instructions.md`. The gate enforces: shape (weekly/ + _index/ + verbatim/ for meetings) → per-source extras (transcript-class file for meetings, sectionId for onenote, siteId for sharepoint, chatId for teams, folders for email, engagementRecordId for crm, areaPath for ado) → process-compliance (workiq-first / verbatim-by-default / capture-learnings / run-report mention / fuzzy-disambiguation cited / cleanup-on-resolution / citation-ledger).
49
+
50
+ **pull-misc bootstrap note:** if `<project>/external-links.txt` does NOT exist, scaffold it from `templates/init/external-links.template.txt` so the user has a place to paste links. Mark the source as `enabled: true, links: 0` in `integrations.yml#boundaries.misc` and skip the dispatch (nothing to fetch yet).
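The dispatch pseudocode above can be made runnable as a small Python sketch. The three callables (`enabled`, `dispatch`, `run_gate`) are hypothetical stand-ins injected by the caller, and the outcome labels `passed` and `passed-on-retry` are assumptions for illustration; only `failed-gate` comes from the pseudocode. Writing to `FOLLOW-UPS.md`, `run-log.yml`, and the tracking log is left to the caller.

```python
from typing import Callable, Dict

# Deterministic source order, as listed above.
SOURCE_ORDER = [
    "pull-email", "pull-teams", "pull-meetings", "pull-onenote", "pull-loop",
    "pull-sharepoint", "pull-crm", "pull-ado", "pull-misc",
]


def run_dispatch(
    enabled: Callable[[str], bool],
    dispatch: Callable[[str, bool], dict],   # (source, is_retry) -> result
    run_gate: Callable[[str, dict], str],    # (source, result) -> gate status
) -> Dict[str, str]:
    """Dispatch each enabled source, retry once on a failed gate,
    and continue to the next source even if the retry also fails."""
    outcomes: Dict[str, str] = {}
    for source in SOURCE_ORDER:
        if not enabled(source):
            continue
        result = dispatch(source, False)
        if run_gate(source, result) != "fail":
            outcomes[source] = "passed"
            continue
        retry = dispatch(source, True)
        if run_gate(source, retry) == "fail":
            # Caller records follow-ups; the bootstrap is not aborted.
            outcomes[source] = "failed-gate"
        else:
            outcomes[source] = "passed-on-retry"
    return outcomes
```

Because a Python dict preserves insertion order, the returned mapping also reflects the order in which sources were processed.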
@@ -0,0 +1,55 @@
1
+ # references/registry-persistence.md (bootstrap-project)
2
+
3
+ > **Load this file when** executing Step 4a (discovery and registry persistence) — i.e., when you need the `m365-mutable.json#knownSections` schema, the OneNote deterministic discovery procedure, or the per-source ID-persistence contract.
4
+
5
+ ## Step 4a — Discovery & registry persistence (REQUIRED, kushi v3.7.8+, per `m365-id-registry.instructions.md`)
6
+
7
+ **Doctrine: bootstrap discovers, refresh consumes.** Bootstrap is the ONLY phase that probes WorkIQ to resolve canonical M365 identifiers. Refresh runs MUST read these from `m365-mutable.json#knownSections.<project>` and pass them into the index extractor verbatim. Refresh must NEVER re-discover — that is the source of "OneNote works for me sometimes" non-determinism.
8
+
9
+ For each enabled source, resolve and persist the canonical lookup keys into `<workspace>/.kushi/config/user/m365-mutable.json#knownSections.<projectKey>`. The schema is fixed — populate every key the source supports:
10
+
11
+ ```jsonc
12
+ "knownSections": {
13
+ "<projectKey>": {
14
+ // OneNote (pull-onenote uses these as Step A inputs)
15
+ "one_sectionName": "<displayName>.one",
16
+ "one_sectionFileId": "<wdsectionfileid GUID>", // PRIMARY — consumed verbatim by pull-onenote
17
+ "one_sectionGroupId": "<wdsectiongroupid GUID>", // when boundary is a section group
18
+ "one_sectionOneNoteGuid": "<wdsectiononenoteguid GUID>", // alternate identifier (older notebooks)
19
+ "one_sectionPath": "/<group>/<section>.one", // human-readable path for run-reports
20
+ "one_notebookSourceDoc": "<notebook sourceDoc GUID>", // parentReferenceId fallback
21
+ // Email
22
+ "emailContext": "Inbox/<folder-path>",
23
+ // Teams
24
+ "teamsChatContext": {
25
+ "chatHints": ["..."], // exact chat topics
26
+ "channelHints": ["..."], // exact channel display names
27
+ "participantHints": ["..."] // exact display names
28
+ },
29
+ // SharePoint (when boundary is a SP site/library/folder)
30
+ "sp_siteId": "<siteId>",
31
+ "sp_webId": "<webId>",
32
+ "sp_listId": "<listId>",
33
+ "sp_path": "/<site>/<library>/<folder>"
34
+ }
35
+ }
36
+ ```
37
+
38
+ ## OneNote discovery procedure (deterministic, follow exactly)
39
+
40
+ Per `workiq-onenote-query-shape.instructions.md` — the **only** WorkIQ query shape that returns OneNote data is **natural-language naming by display name**, scoped to one section in one notebook. The doctrine file lists every empirically-broken phrasing (enumeration verbs, structured-field requests, filter-syntax expressions, ID-lookup questions) in its "Forbidden phrasings" table — see that file for the canonical anti-pattern list. Drivers MUST NOT emit any of those phrasings; they fail empirically (WorkIQ punts to Graph or routes to summary mode).
41
+
42
+ 1. For each entry in `boundaries.onenote.section_names[]` (or the user-provided section name, e.g. `HCA`), run **one** narrow WorkIQ query per section using display names only:
43
+ ```
44
+ workiq ask -q "In the OneNote notebook '<NOTEBOOK DISPLAY NAME>', show me the pages in the section named '<SECTION DISPLAY NAME>'. Return a flat table with: page title, last modified, web URL. No commentary. Do not truncate."
45
+ ```
46
+ The notebook display name comes from `m365Auth.oneNote.defaultNotebookName` in `m365-auth.json` (persisted during `setup`). The section display name comes from `boundaries.onenote.section_names[]` in `integrations.yml`.
47
+ 2. **Parse GUIDs out of the URL fragments** in the response — the `web URL` column contains `Doc.aspx` URLs of the form `...?...&wd=target(<section>|<wdsectionfileid>/...)&...&wdpartid={GUID}{1}&wdsectionfileid={GUID}`. Extract `wdsectionfileid`, `wdpartid`, and `sourcedoc` GUIDs. Do NOT interpret prose summaries.
48
+ 3. If the table is empty or the query returns a Graph-Explorer-style punt (classified per `fallback-status-reporting.instructions.md`), mark the section `disabled = true, reason = "workiq-discovery-failed"` in `integrations.yml`, write a one-line entry in `bootstrap-status.md`, and continue. Do NOT escalate to Playwright at bootstrap time — escalation is per `workiq-onenote-query-shape.instructions.md` "When (and only when) to use Playwright" (refresh-time, opted-in, threshold-driven).
49
+ 4. Persist resolved IDs to `m365-mutable.json#knownSections.<projectKey>` per `m365-id-registry.instructions.md` and mirror into `boundaries.onenote.section_file_ids[]` / `section_group_ids[]` in `integrations.yml`.
50
+ 5. Record one line in `bootstrap-status.md`: `OneNote: resolved wdsectionfileid=<id> via natural-language query for section "<name>" in notebook "<notebook>" (N pages enumerated)`.
51
+ 6. **Browser-URL fields are OPTIONAL at bootstrap (kushi v4.7.3+).** The Playwright-required fields (`one_notebookSourceDoc`, `one_notebookSpoBaseUrl`, `one_sectionWebUrl`, `one_sectionName`) are only needed if the user has opted into the Playwright recovery fallback (`m365Auth.oneNote.playwrightFallback: true`). When opted in, run `recapture-section-url.mjs --check`; if exit 1, prompt for the address-bar URL paste. When NOT opted in, skip this gate entirely — WorkIQ-only operation does not need these fields.
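The GUID extraction in step 2 of the procedure above could look roughly like this in Python. It is a sketch that assumes the three identifiers appear as ordinary query-string parameters of the `Doc.aspx` URL; `extract_onenote_guids` is a hypothetical helper name, and matching parameter names case-insensitively is a precaution rather than documented behaviour.

```python
import re
from urllib.parse import parse_qs, urlsplit

# A canonical 8-4-4-4-12 hex GUID, with or without surrounding braces.
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
WANTED = ("wdsectionfileid", "wdpartid", "sourcedoc")


def extract_onenote_guids(web_url: str) -> dict:
    """Return the GUIDs found in the wanted query parameters of a web URL."""
    query = parse_qs(urlsplit(web_url).query)
    lowered = {key.lower(): values for key, values in query.items()}
    found = {}
    for key in WANTED:
        for value in lowered.get(key, []):
            match = GUID_RE.search(value)  # ignores braces and the "{1}" suffix
            if match:
                found[key] = match.group(0)  # kept verbatim, no case folding
                break
    return found
```

Parameters that are absent from a URL are omitted from the result, so a caller can treat a missing `wdsectionfileid` as a discovery failure instead of guessing a value.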
52
+
53
+ ## SharePoint, Teams, Email, CRM, ADO
54
+
55
+ These follow the same shape: bootstrap discovers and persists; refresh consumes. Per-source discovery procedures live in each `pull-*/SKILL.md`'s "Bootstrap discovery" section. Bootstrap MUST invoke each pull-*'s discovery probe with the user-supplied seed (folder name, channel name, request id, work item id), persist the resolved IDs, and only then dispatch the pull.