la-machina-engine 0.6.0 → 0.7.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/README.md CHANGED

@@ -22,7 +22,7 @@ npm install la-machina-engine
 **v0.3.0 — published on npm; production-ready core, evolving feature surface.**
-- **1214** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
+- **1553** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
 - Zero top-level `node:` imports — runs on Node.js AND Cloudflare Workers
 - 14 live workflow tests (W1–W14) verified against OpenRouter, real R2, real MCP servers
 - Pause/resume + async runs + webhooks + state.json + R2 binding storage adapter
@@ -1142,6 +1142,199 @@ on for multi-call research / browsing / repeated-API flows. Benchmark
 on your workload — the live test at
 `scripts/workflows/w17-offload-live.mjs` shows how.
+### Knowledge base — `SearchKnowledge` + `ReadKnowledge`
+When the model needs to look things up in your tenant's docs without
+loading whole files into context, opt into the knowledge base. Two
+built-in tools — `SearchKnowledge` (token-overlap-ranked snippets)
+and `ReadKnowledge` (one section or a whole file) — let an agent
+walk a per-tenant vault on demand.
+**Layout.** Each tenant gets a folder at
+`workspaces/{workspaceId}/knowledge/`, sibling to `.claude/`. Top-
+level subfolders are *bases* — independent corpora each with their
+own pre-built `_index.json`:
+```
+workspaces/acme-corp/
+├── .claude/                      # engine state — transcripts, memory, …
+└── knowledge/
+    ├── hr-policies/              # base: "hr-policies"
+    │   ├── _index.json           # built by writeKnowledgeIndex()
+    │   ├── handbook.md
+    │   └── remote-work.md
+    └── sales-playbook/           # base: "sales-playbook"
+        ├── _index.json
+        └── q1/
+            └── pricing.md
+```
+**Build the index** when the corpus changes — for each base, the
+indexer walks its subtree, splits markdown at heading boundaries,
+tokenises section bodies, and writes one `_index.json` per base:
+```ts
+import { writeKnowledgeIndex, R2StorageAdapter } from 'la-machina-engine'
+const k = new R2StorageAdapter(r2Config, 'workspaces/acme-corp/knowledge')
+await writeKnowledgeIndex({ adapter: k, base: 'hr-policies' })
+await writeKnowledgeIndex({ adapter: k, base: 'sales-playbook' })
+```
+**Forgot to build the index?** Both tools fall back to an in-memory
+build on first call when `_index.json` is missing or corrupted. The
+fallback caches for the rest of the run, so subsequent searches are
+free. This makes the index a performance optimisation (skip the walk
+on every fresh run), not a setup requirement — drop files into the
+folder and the agent can discover them immediately. Pre-build with
+`writeKnowledgeIndex()` for production-scale corpora where the
+first-call cost matters.
+**Configure the engine** to enable the tools (off by default):
+```ts
+const engine = initEngine({
+  storage: { provider: 'r2', /* … */ },
+  knowledge: {
+    enabled: true,           // engine-level capability flag
+    maxSearchResults: 5,     // top-K per SearchKnowledge call
+    maxReadBytes: 10_000,    // ReadKnowledge truncation cap
+  },
+})
+```
+**Per-run scoping.** Folders are runtime-only — pass them via
+`RunOptions.knowledge.folders`. Sub-paths inside a base work too
+(e.g., `'sales-playbook/q1'` only sees Q1 content):
+```ts
+await engine.run({
+  task: 'What is our 401k match rate?',
+  knowledge: {
+    folders: ['hr-policies', 'sales-playbook/q1'],
+    external: [
+      // External file links — fetched on demand, never indexed.
+      // `headers` are runtime-only and NEVER persist anywhere.
+      {
+        name: 'product-catalog',
+        description: 'Product catalog CSV with unit pricing',
+        url: 'https://api.acme.example/catalog.csv',
+        format: 'csv',
+        headers: { Authorization: 'Bearer sk_real_token' },
+      },
+    ],
+  },
+})
+```
+The model then calls:
+- `SearchKnowledge({ query: '401k matching' })` → top-K ranked
+  snippets, each with a `ref` like `hr-policies/handbook.md#benefits`
+- `ReadKnowledge({ ref: 'hr-policies/handbook.md#benefits' })` →
+  full body of that section
+- `ReadKnowledge({ ref: 'ext:product-catalog' })` → fetches the
+  registered URL with its headers, runs the `csv` extractor, returns
+  text
+**Format support.** Native: `md`, `txt`, `json`, `csv`, `html` (script/
+style stripped, entities decoded, whitespace collapsed). Optional:
+`pdf` (via `pdf-parse`) and `docx` (via `mammoth`). Both have
+`requiresNode: true` — on Workers without those packages installed,
+they return a structured `ERR_KNOWLEDGE_FORMAT_UNSUPPORTED` error.
+**Path safety.** All folder + ref strings flow through one validator
+in `src/knowledge/scope.ts` that rejects absolute paths, traversal
+(`..`), unsafe characters, and out-of-base file refs. A dedicated
+test (`scope.test.ts`) pins the behaviour — every weakening would
+open a tenant-boundary hole.
+**External link headers — non-persistence guarantee.** External
+`headers` live entirely inside the tool factory closure and on the
+`init.headers` of one `fetch` call per request. They never reach the
+LLM, the transcript, `state.json`, snapshots, or any storage write.
+A sentinel-based test suite (`externalLinkSecrets.test.ts`) and the
+live R2 test (`scripts/workflows/w20-knowledge-r2.mjs`) verify this:
+the live test seeds a known sentinel into the `Authorization` header,
+runs against real R2, and reads every transcript shard back from the
+bucket asserting zero leaks.
+**Composing with offload.** If a `ReadKnowledge` result exceeds your
+`compaction.toolResultOffload.thresholdBytes`, the offload pipeline
+takes over — the body is written under `toolResults/`, the model
+sees a summary + ref, and `FetchData` rehydrates on demand. The two
+features compose without any extra wiring.
+**Disabling.** `tools.disabled: ['SearchKnowledge', 'ReadKnowledge']`
+turns the tools off even when knowledge is enabled. Absent
+`config.knowledge.enabled` → no adapter built, no tools registered,
+no prompt mention.
+#### Codebase layout
+The knowledge subsystem is small and self-contained. If you need to
+extend it (new format extractor, custom scorer, alternative index
+schema), these are the files involved:
+```
+src/
+├── knowledge/                          # subsystem (self-contained)
+│   ├── types.ts                        # V1-suffixed public types (KnowledgeFolderRefV1, KnowledgeExternalLinkV1, KnowledgeFormatV1, ResolvedKnowledgeConfigV1, RunKnowledgeOptionsV1, SectionEntryV1, KnowledgeIndexV1, …)
+│   ├── scope.ts                        # parseFolderRef / parseKnowledgeRef / refInScope — load-bearing path safety
+│   ├── tokenize.ts                     # tokenize() + scoreOverlap() — deterministic, no LLM
+│   ├── indexer.ts                      # buildKnowledgeIndex() / writeKnowledgeIndex() — section split + wiki-link extraction
+│   └── extractors.ts                   # getExtractor(format) — md/txt/json/csv/html native; pdf/docx lazy-import
+│
+├── tools/
+│   ├── searchKnowledge.ts              # createSearchKnowledgeTool() — token-overlap ranked snippets
+│   └── readKnowledge.ts                # createReadKnowledgeTool() — section / file / ext: ref dispatch
+│
+├── storage/
+│   ├── interface.ts                    # adds optional `EngineStorage.knowledge?`
+│   └── factory.ts                      # builds the knowledge adapter at workspaces/{ws}/knowledge/ when enabled
+│
+├── config/
+│   ├── types.ts                        # ResolvedConfig.knowledge?: ResolvedKnowledgeConfigV1
+│   ├── schema.ts                       # KnowledgeConfigResolved zod schema (scalars only — no folders/headers)
+│   └── merge.ts                        # KNOWLEDGE_DEFAULTS + fillKnowledgeDefaults()
+│
+├── engine/
+│   ├── engine.ts                       # resolveKnowledgeRuntime() + buildToolRegistry knowledge wire-up
+│   └── types.ts                        # adds `knowledge?: RunKnowledgeOptionsV1` to RunOptions/ResumeOptions
+│
+└── index.ts                            # public exports: writeKnowledgeIndex, buildKnowledgeIndex,
+                                        # createSearchKnowledgeTool, createReadKnowledgeTool, getExtractor,
+                                        # KnowledgeFormatV1, KnowledgeIndexV1, RunKnowledgeOptionsV1, …
+test/
+├── unit/
+│   ├── knowledge/
+│   │   ├── tokenize.test.ts            # 15 tests — stop-words, dedup, scoring
+│   │   ├── indexer.test.ts             # 16 tests — section split, wiki-links, recursion
+│   │   ├── scope.test.ts               # path-safety pin (every weakening = tenant-boundary hole)
+│   │   ├── extractors.test.ts          # 17 tests — all formats including pdf/docx fallbacks
+│   │   └── externalLinkSecrets.test.ts # 7 sentinel-based non-persistence tests
+│   ├── tools/
+│   │   ├── searchKnowledge.test.ts     # caching, multi-base, sub-path, cap, factory rejection
+│   │   └── readKnowledge.test.ts       # 18 tests — all three ref kinds + error paths
+│   └── config/
+│       └── knowledgeSchema.test.ts     # 13 tests — defaults, partials, header rejection
+│
+└── integration/engine/
+    ├── knowledgeE2E.test.ts            # 6 scenarios — registration, round-trip, disabled, sub-path, subagent inheritance
+    ├── knowledgeWithOffload.test.ts    # large ReadKnowledge → offload blob + clean transcript
+    └── knowledgeMultiBase.test.ts      # multi-base ranking, base-prefixed refs, indexed + external mix
+scripts/workflows/
+├── w20-knowledge-r2.mjs                # live R2 — vault search/read + external link
+└── w21-external-files-knowledge.mjs    # live R2 — md/json/csv/html external file round-trip
+```
+`src/knowledge/` is the only directory you need to touch to add a
+new format. Append a new `KnowledgeExtractorV1` to `extractors.ts`
+and add the type to `KnowledgeFormatV1` in `types.ts` — everything
+else is dispatched off `getExtractor(format)`.
 ### Sync vs. async — when to use which
 | Scenario | Use |
@@ -1375,6 +1568,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
 - [x] 22 built-in tools
 - [x] Custom tool registration via `defineTool()`
 - [x] Device path blocking (/dev/zero, /dev/random, /proc/kcore)
+- [x] Knowledge base (`SearchKnowledge` + `ReadKnowledge`) — opt-in, per-tenant vault under `workspaces/{ws}/knowledge/`, section-level indexing, format extractors (md/txt/json/csv/html native; pdf/docx via optional deps), external link headers with non-persistence guarantee
 ### Agent Hierarchy
 - [x] Subagent spawning with depth tracking (SubagentRegistry)
@@ -1401,7 +1595,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
 - [x] Workers compatibility (zero top-level node: imports)
 ### Testing
-- [x] 960 tests across 86 files
+- [x] 1553 tests across 142 files
 - [x] 10 live workflow tests (W1-W10) against OpenRouter + R2
 - [x] Coverage: 81% lines, 85% branches, 91% functions
 - [x] CI pipeline (lint + typecheck + test + coverage gates)
@@ -1485,7 +1679,7 @@ Features intentionally not ported — either Anthropic-only, CLI-specific, or de
 ```bash
 npm install
 npm run build          # tsup → dist/ (ESM + CJS + .d.ts)
-npm test               # 1214 tests (~12s with bun)
+npm test               # 1553 tests (~30s with vitest)
 npm run test:watch     # watch mode
 npm run test:coverage  # with coverage gates
 npm run typecheck      # TypeScript strict
@@ -1524,11 +1718,12 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
 | Category | Files | Tests |
 |----------|-------|-------|
-| Unit | 70+ | ~870 |
-| Integration | 15+ | ~130 |
+| Unit | 113+ | ~1200 |
+| Integration | 15+ | ~150 |
 | E2E | 5 | ~30 |
 | Coverage additions | 20+ | ~130 |
-| **Total** | **115+** | **1214** (current `bun test` count; 8 pre-existing Bun timer failures unrelated) |
+| Knowledge (Plan 023) | 11 | ~85 |
+| **Total** | **142** | **1553** (current `vitest run`; 8 skipped pre-existing) |
 ### Live Workflow Tests
@@ -1548,6 +1743,10 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
 | W12 | Multi-agent + MCP + skills + HITL (parent gates child's publish) | 4 |
 | W13 | Per-run skill override (inline body + URL + fetch cache) | 4 |
 | W14 | MCP auth refresh + sampling round-trip (stdio + http) | n/a (integration) |
+| W17 | Tool-result offload (FetchData rehydrate) | 4 |
+| W19 | Kitchen-sink R2 (subagent + HITL + ApiCall + offload + skills + memory + hooks + webhook) | 12 |
+| W20 | Knowledge base on R2 (vault search/read + external link + bearer non-persistence) | 8 |
+| W21 | External-file knowledge on R2 — md/json/csv/html round-trip + 401 bounds | 5 |
 ---
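
The "Path safety" paragraph added in this README diff says folder and ref strings pass through one validator that rejects absolute paths, `..` traversal, unsafe characters, and out-of-base file refs. The diff does not include `src/knowledge/scope.ts` itself, so the following is only a minimal sketch of how such a check could look. Every name in it (`SAFE_SEGMENT`, `parseFolderRefSketch`, `refInScopeSketch`) and the exact character allow-list are assumptions made for illustration, not the package's API.

```typescript
// Hypothetical sketch only: NOT the contents of src/knowledge/scope.ts.
// The allowed character set is an assumption chosen for this example.
const SAFE_SEGMENT = /^[A-Za-z0-9][A-Za-z0-9._-]*$/

/** Split a folder or file path into validated segments; throw on anything unsafe. */
function parseFolderRefSketch(ref: string): string[] {
  if (ref.length === 0) throw new Error('empty ref')
  if (ref.startsWith('/') || ref.includes('\\')) {
    throw new Error(`absolute or backslash path: ${ref}`)
  }
  const segments = ref.split('/')
  for (const seg of segments) {
    // Explicit traversal check, so the error message names the real problem.
    if (seg === '.' || seg === '..') throw new Error(`traversal segment in: ${ref}`)
    // Also rejects empty segments produced by "a//b" or a trailing slash.
    if (!SAFE_SEGMENT.test(seg)) throw new Error(`unsafe segment "${seg}" in: ${ref}`)
  }
  return segments
}

/** Same as above, but returns null instead of throwing. */
function tryParseSketch(ref: string): string[] | null {
  try {
    return parseFolderRefSketch(ref)
  } catch {
    return null
  }
}

/**
 * True when a ref such as "hr-policies/handbook.md#benefits" points at a file
 * strictly inside one of the folders allowed for this run.
 */
function refInScopeSketch(fileRef: string, allowedFolders: string[]): boolean {
  const [path = ''] = fileRef.split('#') // drop the "#section" anchor
  const fileSegs = tryParseSketch(path)
  if (fileSegs === null) return false
  return allowedFolders.some((folder) => {
    const folderSegs = tryParseSketch(folder)
    if (folderSegs === null) return false
    // Compare whole segments, so "hr" never matches "hr-policies".
    return (
      folderSegs.length < fileSegs.length &&
      folderSegs.every((seg, i) => seg === fileSegs[i])
    )
  })
}

const folders = ['hr-policies', 'sales-playbook/q1']
console.log(refInScopeSketch('hr-policies/handbook.md#benefits', folders))
console.log(refInScopeSketch('sales-playbook/q2/pricing.md', folders))
console.log(refInScopeSketch('hr-policies/../secrets.md', folders))
```

Comparing whole path segments rather than string prefixes is the design choice worth noting here: a plain `startsWith` check would let a folder named `hr` grant access to `hr-policies`, which is the kind of tenant-boundary weakening the README warns about.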