npm - @vectros-ai/blueprints - Versions diffs - 0.6.3 → 0.6.4 - Mend

@vectros-ai/blueprints 0.6.3 → 0.6.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8) hide show

package/CHANGELOG.md +31 -0
package/dist/index.js +62 -9
package/dist/index.js.map +1 -1
package/dist/index.mjs +62 -9
package/dist/index.mjs.map +1 -1
package/guides/agentic-sdlc.md +89 -3
package/package.json +1 -1
package/prompts/agentic-sdlc-agent.md +85 -10

package/guides/agentic-sdlc.md CHANGED Viewed

@@ -130,7 +130,8 @@ document_ingest:
 ```
 Scope later searches to one type with `contentTypes: ["documents"], typeName:
-"decision"` (the document-type facet), plus `filters` for `area`/`tags`.
+"decision"` — `typeName` narrows documents and records alike (equivalently
+`filters: { recordType: "decision" }`), plus more `filters` for `area`/`tags`.
 ### Records — the structured artifacts (controls, conventions, gotchas, terms)
@@ -166,7 +167,17 @@ pauses or restarts simply converges — it never double-writes.
 ## 4. Query it
-Documents are queried via search (scoped by `typeName`); records via `record_query`.
+Reach for the most precise tool first: `record_query` for anything *enumerable*
+(exact, cheap, compact), `hybrid_search` to recall by meaning when you don't know
+the exact filter, `rag_ask` for a grounded *answer* over document bodies. Scope by
+type per tool: `hybrid_search` uses `typeName` (which narrows documents and records
+alike), `record_query` uses its `type` argument.
+Two `hybrid_search` gotchas worth knowing up front: hits carry the surrounding
+passage, so searches are *heavy* — start `limit:3` + `uniqueDocuments:true` and
+escalate only if needed; and the default `mode:HYBRID` uses `textMode:PHRASE`, so a
+long natural-language query can contribute nothing on the keyword leg (a `textScore`
+of `0` on every hit is the tell) — use a short keyword phrase or `textMode:"OR"`.
 | You want… | Call |
 |---|---|
@@ -175,7 +186,7 @@ Documents are queried via search (scoped by `typeName`); records via `record_que
 | "What's the active rule for area X?" | `record_query convention { area:"<area>", status:"active" }` |
 | "Have we hit this failure before?" | `hybrid_search "<symptom>" contentTypes:["documents"], typeName:"postmortem"`; plus `record_query gotcha { area:"deploy", status:"active" }` |
 | "Define X" | `record_query term { term:"X" }` (unique lookup) |
-| "Latest decisions / search the designs" | `hybrid_search "<topic>" contentTypes:["documents"], typeName:"decision"` (or `"design"`), `filters:{ area:"<area>" }` |
+| "Latest decisions / search the designs" | `hybrid_search "<topic>" contentTypes:["documents"], typeName:"decision"` (or `"design"`, plus `filters:{ area:"<area>" }`) |
 | "What supersedes a given decision?" | document lookup on `decision` by `supersedes:"<externalId>"` |
 ## 5. Customize
@@ -236,3 +247,78 @@ change — every type already has `tags`.)
 - **Re-ingest is keyed on `externalId`** — a backfill never double-writes (an
   unchanged item returns as-is), re-ingesting edited source with `upsert: true` keeps
   the KB in sync, and the KB can be rebuilt from source at any time.
+### Keep it in sync with your source — the self-describing marker
+If your knowledge lives in a repo (docs, decision records, runbooks) and the KB mirrors
+it, the two drift the moment someone edits a file. A durable pattern keeps them together
+without a fragile side-index.
+**Stamp each mirrored file with its KB id, in the file itself** — a one-line HTML comment
+at the top:
+```markdown
+<!-- vectros-kb-id: ref-data-model -->
+# Data model reference
+```
+- **It's invisible.** An HTML comment renders to nothing in Markdown — on your docs site
+  and in the repository view alike — so it never shows on the page, even for files that
+  are also documentation-site source.
+- **The id travels with the file.** Move or rename the file and the binding is unchanged;
+  because references resolve by `externalId`, your cross-links never break.
+- **No side ledger to maintain.** The file is self-describing — there is no separate
+  path-to-id map to keep in lockstep with every add, move, and rename.
+**Sync is then a one-liner:** on change, re-ingest the file under the id in its marker with
+`upsert: true` — the body re-indexes while the id, typed fields, and edges stay put. A
+merge hook that re-ingests every changed marked file keeps the KB fresh automatically.
+**The marker is also your membership signal:** a file belongs in the KB exactly when it
+carries a marker, so "add this to the KB" is a one-line edit and unmarked files are simply
+never mirrored — the KB stays a curated subset, not a copy of the whole repo.
+### The same, for records extracted from a file
+Records are *distilled* from a source — a glossary becomes many `term` records, a
+conventions doc becomes many `convention` records — so one file maps to many records, and
+there's no single line to stamp with one id. Two small conventions keep them in sync
+anyway:
+**1. Each record carries a `sourceRef`** — the identifier of the file it was distilled from.
+So "which records came from this file?" is one query:
+```text
+record_query  type:term  field:sourceRef  value:<ref>
+```
+**2. The source file declares itself** with a companion marker that names the record
+type(s) it feeds and pins that `ref`:
+```markdown
+<!-- vectros-kb-records: term ref=glossary.md -->
+# Glossary
+```
+Sync is then symmetric with documents: on a change to a marked source file, read its
+marker, query its existing records by `sourceRef`, re-distill, and `upsert` each by its
+stable `externalId` — refreshing changed ones, adding new headings, superseding any that
+disappeared. Because the `ref` is pinned in the marker (not read from the path), moving the
+file never breaks the link.
+**Two markers, one model:** `vectros-kb-id` marks a file that *is* a KB document;
+`vectros-kb-records` marks a file that *feeds* KB records. A file can carry either. With
+both, nothing about your KB lives in a separate index — every synced file says what it is,
+right in the file.
+### Promote durable knowledge into your repo, then the KB
+If your agent keeps a private memory or working-notes file, treat it as a **staging area**,
+not a second home for knowledge. When a note matures into something durable and shareable,
+**promote it one-way** into the right repo doc (a conventions file, a troubleshooting
+reference, a post-mortem) — the broadest type that fits — then let the marker + `sourceRef`
+sync above carry it into the KB. Once the repo doc is committed and the ingest is confirmed,
+collapse the memory note to a one-line pointer at the repo path (and only then — working
+notes usually aren't version-controlled). The result: one golden copy per lesson (the repo),
+one queryable projection (the KB), and a breadcrumb in memory — never the same prose in three
+places.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@vectros-ai/blueprints",
-  "version": "0.6.3",
+  "version": "0.6.4",
   "description": "Curated Vectros use-case blueprints (schemas + least-privilege AccessProfile + seed) and the Blueprint format + structural validation. Enforcement (the scope gate) lives in @vectros-ai/cli, not here.",
   "license": "Apache-2.0",
   "repository": {

package/prompts/agentic-sdlc-agent.md CHANGED Viewed

@@ -20,16 +20,23 @@ truth for "why is it shaped this way?" and "how do we do X?".
 ## The loop: recall before you act, capture after
-1. **RECALL first.** Before you propose a change, design something, or debug a
-   failure, query the knowledge base. A cold start is exactly when you most need
-   the decision/convention/gotcha you can't re-derive from the code. Don't
-   re-litigate a settled decision or re-discover a known trap.
+1. **RECALL first — the KB outranks your own re-derivation.** Before you propose a
+   change, design something, or debug a failure, query the knowledge base and treat
+   a hit as authoritative over what you'd reconstruct from the code alone. A cold
+   start is exactly when you most need the decision/convention/gotcha you can't
+   re-derive. Don't re-litigate a settled decision, re-invent an existing
+   convention, or re-discover a known trap — a two-second `rag_ask` is cheaper than
+   repeating a mistake the team already wrote down. If recall turns up nothing,
+   *then* proceed from first principles (and capture what you learn, per step 3).
 2. **ACT** using what you recalled — follow the conventions and controls, reuse the
    runbook, respect the supersede chain.
 3. **CAPTURE after.** When you make a durable decision, learn a convention, hit a
    gotcha, write or revise a runbook, or run a post-mortem, write it back (a
    document or a record, per below) so the next session inherits it. **Record the
-   *why*, not just the *what*** — the reasoning is the most-recalled content.
+   *why*, not just the *what*** — the reasoning is the most-recalled content. And
+   when the durable source of a KB item is a repo file you just edited, re-ingest it
+   (see *Keep the KB in sync* below) — an edit that doesn't propagate silently forks
+   the KB from the repo, which is worse than no KB at all.
 Knowledge is **superseded / retired / resolved** via a status flip, never deleted —
 the trail of how the team's thinking evolved is part of the value.
@@ -68,17 +75,51 @@ documents). Follow these to navigate provenance.
 ## How to query (MCP tools)
+**Reach for the most precise tool first.** If the ask is *enumerable* — "which
+critical controls are active?", "the convention for area X", "the definition of
+term Y" — use **`record_query`**: it is exact, cheap, and compact. Fall back to
+**`hybrid_search`** (recall by meaning) only when you don't know the exact filter,
+and to **`rag_ask`** when you want a grounded *answer* over document bodies rather
+than the raw hits. Ordering matters — a `record_query` that returns three tight
+rows beats a `hybrid_search` that spends thousands of tokens to surface the same
+fact.
+**Query compactly by default.** `hybrid_search` hits carry the surrounding passage
+(`contextText`), so a wide search is *heavy* — a handful of hits can be tens of KB.
+Start with **`limit: 3` + `uniqueDocuments: true`** and escalate only if recall is
+insufficient. Prefer a `record_query` or a tighter filter over a bigger `limit`.
 - **Recall by meaning / grounded answer** — `rag_ask` for a cited answer over the
   document bodies: *"why did we choose X?"*, *"have we hit this before?"*.
-- **Search documents** — `hybrid_search` with `contentTypes: ["documents"]` and
-  `typeName` to scope to one document type (e.g. `typeName: "decision"` or
-  `"runbook"` or `"postmortem"`), plus `filters` (`{ area: "search" }`,
-  `{ tags: "tenant-isolation" }`).
-- **Query records** — `record_query` for exact enumeration:
+- **Search documents** — `hybrid_search` with `contentTypes: ["documents"]`. Scope
+  to one document *type* with **`typeName: "decision"`** (or `"runbook"`,
+  `"postmortem"`, …) — `typeName` narrows documents and records alike. Add `filters`
+  to narrow further (`{ area: "search" }`, `{ tags: "tenant-isolation" }`).
+- **Query records** — `record_query` for exact enumeration; the record type is the
+  tool's `type` argument, plus field filters:
   `record_query control { kind: "control", criticality: "critical", status: "active" }`;
   `record_query term { term: "AccessProfile" }` (unique lookup);
   `record_query convention { area: "auth", status: "active" }`.
   Range/sort on a record's date field (`order: "desc"`) for "latest" / "since".
+  *(Type facet by tool: `hybrid_search` uses `typeName`; `record_query` uses `type`.)*
+**Mind the keyword leg.** `hybrid_search` defaults to `mode: HYBRID` with
+`textMode: PHRASE` (slop 3), so a long natural-language query often matches nothing
+on the BM25 (keyword) leg and you silently get a semantic-only ranking. Use a short
+**keyword phrase** for the text leg, or pass **`textMode: "OR"`** for a
+natural-language query. Tell-tale: if `textScore` is `0` across every hit, the
+keyword leg contributed nothing — re-shape the query or switch `textMode`.
+**Recall cheat-sheet** (map the question to the tightest query):
+| You want… | Query |
+|---|---|
+| The active rule/standard for area X | `record_query convention { area: "X", status: "active" }` · `record_query control { area: "X", status: "active" }` |
+| "Have we hit this failure before?" | `hybrid_search { contentTypes: ["documents"], typeName: "postmortem" }` + `record_query gotcha { area: "X", status: "active" }` |
+| The definition of a term | `record_query term { term: "…" }` (unique) |
+| The *why* behind a decision | `rag_ask "why did we …?"` or `hybrid_search { contentTypes: ["documents"], typeName: "decision" }` |
+| Latest N of a dated type | `record_query <type> { … , order: "desc" }` (range/sort on the date field) |
+| Everything tagged to an issue | `record_query`/`hybrid_search` with `filters: { tags: "issue:<id>" }` |
 ## How to capture (MCP tools)
@@ -92,6 +133,9 @@ documents). Follow these to navigate provenance.
   stable `externalId` + the typed fields, e.g.:
   - gotcha: `{ externalId: "gotcha-<slug>", symptom, cause, fix, area: "[area]", status: "active", discoveredOn: "YYYY-MM-DD" }`.
   - convention: `{ externalId: "<slug>", title, rule, why, howToApply, area: "[area]", status: "active", establishedBy: "<decision-externalId>", updatedOn: "YYYY-MM-DD" }`.
+  - If the record is **extracted from a repo file**, add `sourceRef: "<that file>"` so a
+    later edit to the source can find and re-extract exactly its records (see the sync
+    convention below).
 - **To retire** — re-write with `status` flipped (a decision to `superseded`, a
   gotcha to `resolved`). Don't delete.
@@ -101,6 +145,16 @@ documents). Follow these to navigate provenance.
   plain re-create returns the existing record **unchanged** (`created: false`); to apply
   edits, send the change with `upsert: true`. Pick stable slugs/numbers.
 - **Write a reference target before the record that points at it.**
+- **Keep the KB in sync with its source (self-describing, no side index).** If a document
+  mirrors a repo file, stamp the file with a top-of-file `<!-- vectros-kb-id: <externalId> -->`
+  comment; if a file is *extracted* into records, stamp it with
+  `<!-- vectros-kb-records: <type> ref=<path> -->` and give each record a `sourceRef` equal to
+  that `ref`. On a source edit, re-ingest the document or re-extract the records by
+  `externalId` with `upsert: true`. The markers are invisible HTML comments (they never
+  render), and they mean the KB needs no separate map of what came from where. On a
+  re-extract, if a record's source heading has disappeared, flip that orphaned record to
+  `resolved`/`superseded` rather than leaving it active — a re-sync that only refreshes
+  and never retires still serves stale answers.
 - **Record the why.** A statement without rationale is a log entry, not knowledge.
 - When a decision changes, write the new `decision` and set its `supersedes` — don't
   edit the old one's meaning away.
@@ -122,6 +176,27 @@ knowledge with `filters:{ tags:"issue:147" }`; the tag is also your jump-link to
 status. Be selective — most issues promote nothing; only the durable why/how/lesson
 belongs here. Never store status (open/closed/assignee) in the KB.
+## Promoting what you learn (memory → repo → KB)
+If you keep private working notes or an always-loaded memory file, treat it as a
+**staging area, not a second home**. A lesson lives in exactly one tier:
+- **Working memory** — where a lesson lands *first*, while it's still fresh or
+  agent-personal.
+- **Your repo docs** — the **golden**, shared, reviewable copy. When a memory note
+  matures into durable, shareable knowledge, **promote it one-way** into the right
+  doc (a conventions file, a troubleshooting reference, a post-mortem) under the
+  broadest type that fits — don't fragment one nugget into its own file when a
+  broader home exists.
+- **This KB** — the queryable projection, fed *from* the repo doc (stamp it with a
+  marker and ingest, per *Keep the KB in sync*).
+Promotion is terminal: once the repo doc is committed **and** the KB ingest is
+confirmed, collapse the memory note to a one-line pointer at the repo path. Order
+matters — remove the working copy **only after** the two durable copies exist, since
+working notes usually aren't version-controlled. Keep in memory, in full, only what
+has no repo/KB home by design (personal credentials, in-flight status).
 [Customize: your `area` vocabulary, which schemas your team uses, naming
 conventions for `externalId`, your tracker tag prefix, and any house rules — e.g.
 "every control names the test that enforces it in `evidence`."]