la-machina-engine 0.6.0 → 0.7.1

package/README.md CHANGED
@@ -22,7 +22,7 @@ npm install la-machina-engine
 
  **v0.3.0 — published on npm; production-ready core, evolving feature surface.**
 
- - **1214** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
+ - **1553** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
  - Zero top-level `node:` imports — runs on Node.js AND Cloudflare Workers
  - 14 live workflow tests (W1–W14) verified against OpenRouter, real R2, real MCP servers
  - Pause/resume + async runs + webhooks + state.json + R2 binding storage adapter
@@ -1142,6 +1142,199 @@ on for multi-call research / browsing / repeated-API flows. Benchmark
  on your workload — the live test at
  `scripts/workflows/w17-offload-live.mjs` shows how.
 
+ ### Knowledge base — `SearchKnowledge` + `ReadKnowledge`
+
+ When the model needs to look things up in your tenant's docs without
+ loading whole files into context, opt into the knowledge base. Two
+ built-in tools — `SearchKnowledge` (token-overlap-ranked snippets)
+ and `ReadKnowledge` (one section or a whole file) — let an agent
+ walk a per-tenant vault on demand.
+
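To make "token-overlap-ranked" concrete, here is a minimal sketch of how such a ranking could work. It is not the engine's code: the stop-word list, the scoring rule (count of distinct shared tokens), and the names `tokenizeSketch`, `scoreOverlapSketch`, and `searchSketch` are all assumptions made for illustration.

```typescript
// Hypothetical sketch of token-overlap ranking; the real tokenize.ts may differ.
const STOP_WORDS = new Set(['a', 'an', 'the', 'is', 'of', 'to', 'and', 'our', 'what'])

// Lowercase, split on non-alphanumerics, drop stop-words, de-duplicate.
function tokenizeSketch(text: string): string[] {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(w => w.length > 0 && !STOP_WORDS.has(w))
  return Array.from(new Set(words))
}

// Score = number of distinct query tokens that also occur in the section.
function scoreOverlapSketch(queryTokens: string[], sectionTokens: string[]): number {
  const section = new Set(sectionTokens)
  return queryTokens.filter(t => section.has(t)).length
}

interface SectionSketch {
  ref: string
  body: string
}

// Rank sections by overlap, drop zero-score sections, keep the top K.
function searchSketch(query: string, sections: SectionSketch[], topK: number): SectionSketch[] {
  const q = tokenizeSketch(query)
  return sections
    .map(s => ({ s, score: scoreOverlapSketch(q, tokenizeSketch(s.body)) }))
    .filter(x => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(x => x.s)
}
```

Because the scoring is plain set intersection, the ranking is deterministic and needs no model call. Note that this sketch does no stemming, so `matching` and `match` count as different tokens.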
+ **Layout.** Each tenant gets a folder at
+ `workspaces/{workspaceId}/knowledge/`, sibling to `.claude/`. Top-
+ level subfolders are *bases* — independent corpora each with their
+ own pre-built `_index.json`:
+
+ ```
+ workspaces/acme-corp/
+ ├── .claude/                  # engine state — transcripts, memory, …
+ └── knowledge/
+     ├── hr-policies/          # base: "hr-policies"
+     │   ├── _index.json       # built by writeKnowledgeIndex()
+     │   ├── handbook.md
+     │   └── remote-work.md
+     └── sales-playbook/       # base: "sales-playbook"
+         ├── _index.json
+         └── q1/
+             └── pricing.md
+ ```
+
+ **Build the index** when the corpus changes — for each base, the
+ indexer walks its subtree, splits markdown at heading boundaries,
+ tokenises section bodies, and writes one `_index.json` per base:
+
+ ```ts
+ import { writeKnowledgeIndex, R2StorageAdapter } from 'la-machina-engine'
+
+ const k = new R2StorageAdapter(r2Config, 'workspaces/acme-corp/knowledge')
+ await writeKnowledgeIndex({ adapter: k, base: 'hr-policies' })
+ await writeKnowledgeIndex({ adapter: k, base: 'sales-playbook' })
+ ```
+
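The "splits markdown at heading boundaries" step can be pictured with a short sketch. This is an illustration under assumptions, not the package's indexer: `splitSectionsSketch`, `slugifySketch`, and the slug rule are hypothetical, and only the `file#anchor` ref shape is taken from the README's own examples.

```typescript
// Hypothetical section splitter; the real indexer.ts may handle more cases
// (fenced code containing '#', wiki-links, text before the first heading).
interface IndexedSectionSketch {
  ref: string
  heading: string
  body: string
}

// Turn a heading into an anchor, e.g. "Remote Work!" becomes "remote-work".
function slugifySketch(heading: string): string {
  return heading.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '')
}

function finishSection(path: string, s: { heading: string; lines: string[] }): IndexedSectionSketch {
  return {
    ref: `${path}#${slugifySketch(s.heading)}`,
    heading: s.heading,
    body: s.lines.join('\n').trim(),
  }
}

// Start a new section at every ATX heading (# through ######).
// Lines before the first heading are dropped in this sketch.
function splitSectionsSketch(path: string, markdown: string): IndexedSectionSketch[] {
  const sections: IndexedSectionSketch[] = []
  let current: { heading: string; lines: string[] } | null = null
  for (const line of markdown.split('\n')) {
    const m = /^(#{1,6})\s+(.*)$/.exec(line)
    if (m) {
      if (current) sections.push(finishSection(path, current))
      current = { heading: m[2].trim(), lines: [] }
    } else if (current) {
      current.lines.push(line)
    }
  }
  if (current) sections.push(finishSection(path, current))
  return sections
}
```

Each resulting section body would then be tokenised and stored in the per-base index, so a search can return a section-level `ref` rather than a whole file.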
+ **Forgot to build the index?** Both tools fall back to an in-memory
+ build on first call when `_index.json` is missing or corrupted. The
+ fallback caches for the rest of the run, so subsequent searches are
+ free. This makes the index a performance optimisation (skip the walk
+ on every fresh run), not a setup requirement — drop files into the
+ folder and the agent can discover them immediately. Pre-build with
+ `writeKnowledgeIndex()` for production-scale corpora where the
+ first-call cost matters.
+
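The fallback described above is a load-or-build pattern with a per-run cache. A minimal sketch follows, assuming two hypothetical callbacks: `load` returns the stored index or `null` when it is missing or unreadable, and `build` walks the folder in memory. Neither name nor the `IndexSketch` shape comes from the package.

```typescript
// Hypothetical shapes for illustration only.
type IndexSketch = { sections: string[] }
type LoadIndex = (base: string) => Promise<IndexSketch | null>
type BuildIndex = (base: string) => Promise<IndexSketch>

// Returns a resolver that loads the stored index once per base and falls
// back to an in-memory build when nothing usable is stored.
function createIndexResolver(load: LoadIndex, build: BuildIndex) {
  // One cache per resolver, so creating a resolver per run gives per-run caching.
  const cache = new Map<string, Promise<IndexSketch>>()
  return (base: string): Promise<IndexSketch> => {
    let pending = cache.get(base)
    if (!pending) {
      pending = load(base).then(stored => stored ?? build(base))
      cache.set(base, pending)
    }
    return pending
  }
}
```

Caching the promise rather than the resolved value means two concurrent first calls share one build. A production version would likely also evict the entry when the promise rejects, which this sketch does not do.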
+ **Configure the engine** to enable the tools (off by default):
+
+ ```ts
+ const engine = initEngine({
+   storage: { provider: 'r2', /* … */ },
+   knowledge: {
+     enabled: true,          // engine-level capability flag
+     maxSearchResults: 5,    // top-K per SearchKnowledge call
+     maxReadBytes: 10_000,   // ReadKnowledge truncation cap
+   },
+ })
+ ```
+
+ **Per-run scoping.** Folders are runtime-only — pass them via
+ `RunOptions.knowledge.folders`. Sub-paths inside a base work too
+ (e.g., `'sales-playbook/q1'` only sees Q1 content):
+
+ ```ts
+ await engine.run({
+   task: 'What is our 401k match rate?',
+   knowledge: {
+     folders: ['hr-policies', 'sales-playbook/q1'],
+     external: [
+       // External file links — fetched on demand, never indexed.
+       // `headers` are runtime-only and NEVER persist anywhere.
+       {
+         name: 'product-catalog',
+         description: 'Product catalog CSV with unit pricing',
+         url: 'https://api.acme.example/catalog.csv',
+         format: 'csv',
+         headers: { Authorization: 'Bearer sk_real_token' },
+       },
+     ],
+   },
+ })
+ ```
+
+ The model then calls:
+
+ - `SearchKnowledge({ query: '401k matching' })` → top-K ranked
+   snippets, each with a `ref` like `hr-policies/handbook.md#benefits`
+ - `ReadKnowledge({ ref: 'hr-policies/handbook.md#benefits' })` →
+   full body of that section
+ - `ReadKnowledge({ ref: 'ext:product-catalog' })` → fetches the
+   registered URL with its headers, runs the `csv` extractor, returns
+   text
+
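The three ref shapes in the list above (section, whole file, `ext:` link) suggest a small dispatch step. The sketch below shows one plausible way to classify a ref string; `parseRefSketch` and the `ParsedRefSketch` type are hypothetical and are not the package's `parseKnowledgeRef`.

```typescript
// Hypothetical classification of the three ref kinds shown in the README.
type ParsedRefSketch =
  | { kind: 'external'; name: string }
  | { kind: 'section'; file: string; anchor: string }
  | { kind: 'file'; file: string }

function parseRefSketch(ref: string): ParsedRefSketch {
  // "ext:<name>" points at a registered external link.
  if (ref.startsWith('ext:')) return { kind: 'external', name: ref.slice(4) }
  // "<file>#<anchor>" points at one section; no '#' means the whole file.
  const hash = ref.indexOf('#')
  if (hash === -1) return { kind: 'file', file: ref }
  return { kind: 'section', file: ref.slice(0, hash), anchor: ref.slice(hash + 1) }
}
```

A real implementation would validate the file part before touching storage; this sketch only shows the dispatch.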
+ **Format support.** Native: `md`, `txt`, `json`, `csv`, `html` (script/
+ style stripped, entities decoded, whitespace collapsed). Optional:
+ `pdf` (via `pdf-parse`) and `docx` (via `mammoth`). Both have
+ `requiresNode: true` — on Workers without those packages installed,
+ they return a structured `ERR_KNOWLEDGE_FORMAT_UNSUPPORTED` error.
+
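As an illustration of the three HTML steps named above (strip script/style, decode entities, collapse whitespace), here is a regex-based sketch. It is an assumption about how such an extractor might look, not the package's code; `htmlToTextSketch` and the small entity table are hypothetical, and regex stripping is suitable for text extraction only, not for sanitising untrusted HTML.

```typescript
// Small, deliberately incomplete named-entity table (assumption for the sketch).
const NAMED_ENTITIES: Record<string, string> = {
  amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: ' ',
}

function htmlToTextSketch(html: string): string {
  return html
    // 1. Drop <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // 2. Drop every remaining tag, keeping the text between tags.
    .replace(/<[^>]+>/g, ' ')
    // 3. Decode numeric and a few named entities; unknown ones are left as-is.
    .replace(/&(#\d+|#x[0-9a-f]+|[a-z]+);/gi, (match: string, code: string) => {
      if (code[0] === '#') {
        const hex = code[1].toLowerCase() === 'x'
        const n = hex ? parseInt(code.slice(2), 16) : parseInt(code.slice(1), 10)
        return n <= 0x10ffff ? String.fromCodePoint(n) : match
      }
      return NAMED_ENTITIES[code.toLowerCase()] ?? match
    })
    // 4. Collapse runs of whitespace and trim the ends.
    .replace(/\s+/g, ' ')
    .trim()
}
```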
+ **Path safety.** All folder + ref strings flow through one validator
+ in `src/knowledge/scope.ts` that rejects absolute paths, traversal
+ (`..`), unsafe characters, and out-of-base file refs. A dedicated
+ test (`scope.test.ts`) pins the behaviour — every weakening would
+ open a tenant-boundary hole.
+
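The four rejection rules listed above can be sketched as a pair of small predicates. This is an assumption about the shape of such a validator, not the contents of `scope.ts`: the allowed character set and the names `isSafeRelativePathSketch` and `refInScopeSketch` are hypothetical.

```typescript
// Accept only relative paths built from a conservative character allow-list,
// with no empty, "." or ".." segments.
function isSafeRelativePathSketch(p: string): boolean {
  if (p.length === 0 || p.startsWith('/')) return false   // absolute or empty
  if (!/^[A-Za-z0-9._\/-]+$/.test(p)) return false         // unsafe characters
  return p.split('/').every(seg => seg !== '' && seg !== '.' && seg !== '..')
}

// A file ref is in scope when it equals, or sits below, one of the run's folders.
// The trailing '/' in the prefix check stops "hr-policies-private/x.md" from
// matching the folder "hr-policies".
function refInScopeSketch(fileRef: string, folders: string[]): boolean {
  if (!isSafeRelativePathSketch(fileRef)) return false
  return folders.some(
    f => isSafeRelativePathSketch(f) && (fileRef === f || fileRef.startsWith(f + '/')),
  )
}
```

Using an allow-list rather than a deny-list means backslashes, drive letters, and control characters are rejected without having to enumerate them.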
+ **External link headers — non-persistence guarantee.** External
+ `headers` live entirely inside the tool factory closure and on the
+ `init.headers` of one `fetch` call per request. They never reach the
+ LLM, the transcript, `state.json`, snapshots, or any storage write.
+ A sentinel-based test suite (`externalLinkSecrets.test.ts`) and the
+ live R2 test (`scripts/workflows/w20-knowledge-r2.mjs`) verify this:
+ the live test seeds a known sentinel into the `Authorization` header,
+ runs against real R2, and reads every transcript shard back from the
+ bucket asserting zero leaks.
+
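One way to picture "headers live inside the tool factory closure" is the sketch below. It is not the package's implementation: `createExternalReaderSketch`, `ExternalLinkSketch`, and the minimal `FetchLike` type are hypothetical, chosen so the example runs without DOM typings.

```typescript
interface ExternalLinkSketch {
  name: string
  url: string
  headers?: Record<string, string>
}

// Minimal fetch shape (assumption) so the sketch does not depend on lib.dom.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>

function createExternalReaderSketch(links: ExternalLinkSketch[], fetchImpl: FetchLike) {
  // The only place the headers are held: a map captured by the closures below.
  const byName = new Map<string, ExternalLinkSketch>()
  for (const link of links) byName.set(link.name, link)

  return {
    // Safe to show to the model or persist: names only, never headers.
    listNames(): string[] {
      return Array.from(byName.keys())
    },
    // Headers are passed to exactly one fetch call and are not returned.
    async read(name: string): Promise<string> {
      const link = byName.get(name)
      if (!link) throw new Error(`unknown external link: ${name}`)
      const res = await fetchImpl(link.url, { headers: link.headers ?? {} })
      if (!res.ok) throw new Error(`fetch failed with status ${res.status}`)
      return res.text()
    },
  }
}
```

A sentinel-style check in the spirit of the test suite mentioned above would serialise everything the reader exposes and assert that the header value never appears in it.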
+ **Composing with offload.** If a `ReadKnowledge` result exceeds your
+ `compaction.toolResultOffload.thresholdBytes`, the offload pipeline
+ takes over — the body is written under `toolResults/`, the model
+ sees a summary + ref, and `FetchData` rehydrates on demand. The two
+ features compose without any extra wiring.
+
+ **Disabling.** `tools.disabled: ['SearchKnowledge', 'ReadKnowledge']`
+ turns the tools off even when knowledge is enabled. Absent
+ `config.knowledge.enabled` → no adapter built, no tools registered,
+ no prompt mention.
+
+ #### Codebase layout
+
+ The knowledge subsystem is small and self-contained. If you need to
+ extend it (new format extractor, custom scorer, alternative index
+ schema), these are the files involved:
+
+ ```
+ src/
+ ├── knowledge/  # subsystem (self-contained)
+ │   ├── types.ts  # V1-suffixed public types (KnowledgeFolderRefV1, KnowledgeExternalLinkV1, KnowledgeFormatV1, ResolvedKnowledgeConfigV1, RunKnowledgeOptionsV1, SectionEntryV1, KnowledgeIndexV1, …)
+ │   ├── scope.ts  # parseFolderRef / parseKnowledgeRef / refInScope — load-bearing path safety
+ │   ├── tokenize.ts  # tokenize() + scoreOverlap() — deterministic, no LLM
+ │   ├── indexer.ts  # buildKnowledgeIndex() / writeKnowledgeIndex() — section split + wiki-link extraction
+ │   └── extractors.ts  # getExtractor(format) — md/txt/json/csv/html native; pdf/docx lazy-import
+
+ ├── tools/
+ │   ├── searchKnowledge.ts  # createSearchKnowledgeTool() — token-overlap ranked snippets
+ │   └── readKnowledge.ts  # createReadKnowledgeTool() — section / file / ext: ref dispatch
+
+ ├── storage/
+ │   ├── interface.ts  # adds optional `EngineStorage.knowledge?`
+ │   └── factory.ts  # builds the knowledge adapter at workspaces/{ws}/knowledge/ when enabled
+
+ ├── config/
+ │   ├── types.ts  # ResolvedConfig.knowledge?: ResolvedKnowledgeConfigV1
+ │   ├── schema.ts  # KnowledgeConfigResolved zod schema (scalars only — no folders/headers)
+ │   └── merge.ts  # KNOWLEDGE_DEFAULTS + fillKnowledgeDefaults()
+
+ ├── engine/
+ │   ├── engine.ts  # resolveKnowledgeRuntime() + buildToolRegistry knowledge wire-up
+ │   └── types.ts  # adds `knowledge?: RunKnowledgeOptionsV1` to RunOptions/ResumeOptions
+
+ └── index.ts  # public exports: writeKnowledgeIndex, buildKnowledgeIndex,
+               # createSearchKnowledgeTool, createReadKnowledgeTool, getExtractor,
+               # KnowledgeFormatV1, KnowledgeIndexV1, RunKnowledgeOptionsV1, …
+
+ test/
+ ├── unit/
+ │   ├── knowledge/
+ │   │   ├── tokenize.test.ts  # 15 tests — stop-words, dedup, scoring
+ │   │   ├── indexer.test.ts  # 16 tests — section split, wiki-links, recursion
+ │   │   ├── scope.test.ts  # path-safety pin (every weakening = tenant-boundary hole)
+ │   │   ├── extractors.test.ts  # 17 tests — all formats including pdf/docx fallbacks
+ │   │   └── externalLinkSecrets.test.ts  # 7 sentinel-based non-persistence tests
+ │   ├── tools/
+ │   │   ├── searchKnowledge.test.ts  # caching, multi-base, sub-path, cap, factory rejection
+ │   │   └── readKnowledge.test.ts  # 18 tests — all three ref kinds + error paths
+ │   └── config/
+ │       └── knowledgeSchema.test.ts  # 13 tests — defaults, partials, header rejection
+
+ └── integration/engine/
+     ├── knowledgeE2E.test.ts  # 6 scenarios — registration, round-trip, disabled, sub-path, subagent inheritance
+     ├── knowledgeWithOffload.test.ts  # large ReadKnowledge → offload blob + clean transcript
+     └── knowledgeMultiBase.test.ts  # multi-base ranking, base-prefixed refs, indexed + external mix
+
+ scripts/workflows/
+ ├── w20-knowledge-r2.mjs  # live R2 — vault search/read + external link
+ └── w21-external-files-knowledge.mjs  # live R2 — md/json/csv/html external file round-trip
+ ```
+
+ `src/knowledge/` is the only directory you need to touch to add a
+ new format. Append a new `KnowledgeExtractorV1` to `extractors.ts`
+ and add the type to `KnowledgeFormatV1` in `types.ts` — everything
+ else is dispatched off `getExtractor(format)`.
+
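The "append an extractor, widen the format type" recipe above can be illustrated with a toy registry. The real `KnowledgeExtractorV1` shape is not shown in this diff, so everything here is an assumption: the `ExtractorSketch` interface, the `extract(raw)` signature, and the `yaml` format used as the hypothetical new entry. Only the `requiresNode` flag is taken from the README text.

```typescript
// Step 1 (hypothetical): widen the format union with the new format.
type FormatSketch = 'txt' | 'csv' | 'yaml'

// Assumed extractor shape; the package's KnowledgeExtractorV1 may differ.
interface ExtractorSketch {
  format: FormatSketch
  requiresNode: boolean
  extract(raw: string): string
}

// Step 2 (hypothetical): append one entry to the list of extractors.
const EXTRACTORS_SKETCH: ExtractorSketch[] = [
  { format: 'txt', requiresNode: false, extract: raw => raw.trim() },
  // Render each CSV row with " | " between cells (no quoted-field handling).
  {
    format: 'csv',
    requiresNode: false,
    extract: raw => raw.trim().split('\n').map(row => row.split(',').join(' | ')).join('\n'),
  },
  // The new format: YAML is already readable text, so pass it through.
  { format: 'yaml', requiresNode: false, extract: raw => raw.trim() },
]

// Everything else looks extractors up by format, so no other code changes.
function getExtractorSketch(format: FormatSketch): ExtractorSketch {
  const found = EXTRACTORS_SKETCH.find(e => e.format === format)
  if (!found) throw new Error(`unsupported format: ${format}`)
  return found
}
```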
  ### Sync vs. async — when to use which
 
  | Scenario | Use |
@@ -1375,6 +1568,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
  - [x] 22 built-in tools
  - [x] Custom tool registration via `defineTool()`
  - [x] Device path blocking (/dev/zero, /dev/random, /proc/kcore)
+ - [x] Knowledge base (`SearchKnowledge` + `ReadKnowledge`) — opt-in, per-tenant vault under `workspaces/{ws}/knowledge/`, section-level indexing, format extractors (md/txt/json/csv/html native; pdf/docx via optional deps), external link headers with non-persistence guarantee
 
  ### Agent Hierarchy
  - [x] Subagent spawning with depth tracking (SubagentRegistry)
@@ -1401,7 +1595,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
  - [x] Workers compatibility (zero top-level node: imports)
 
  ### Testing
- - [x] 960 tests across 86 files
+ - [x] 1553 tests across 142 files
  - [x] 10 live workflow tests (W1-W10) against OpenRouter + R2
  - [x] Coverage: 81% lines, 85% branches, 91% functions
  - [x] CI pipeline (lint + typecheck + test + coverage gates)
@@ -1485,7 +1679,7 @@ Features intentionally not ported — either Anthropic-only, CLI-specific, or de
  ```bash
  npm install
  npm run build # tsup → dist/ (ESM + CJS + .d.ts)
- npm test # 1214 tests (~12s with bun)
+ npm test # 1553 tests (~30s with vitest)
  npm run test:watch # watch mode
  npm run test:coverage # with coverage gates
  npm run typecheck # TypeScript strict
@@ -1524,11 +1718,12 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
 
  | Category | Files | Tests |
  |----------|-------|-------|
- | Unit | 70+ | ~870 |
- | Integration | 15+ | ~130 |
+ | Unit | 113+ | ~1200 |
+ | Integration | 15+ | ~150 |
  | E2E | 5 | ~30 |
  | Coverage additions | 20+ | ~130 |
- | **Total** | **115+** | **1214** (current `bun test` count; 8 pre-existing Bun timer failures unrelated) |
+ | Knowledge (Plan 023) | 11 | ~85 |
+ | **Total** | **142** | **1553** (current `vitest run`; 8 skipped pre-existing) |
 
  ### Live Workflow Tests
 
@@ -1548,6 +1743,10 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
  | W12 | Multi-agent + MCP + skills + HITL (parent gates child's publish) | 4 |
  | W13 | Per-run skill override (inline body + URL + fetch cache) | 4 |
  | W14 | MCP auth refresh + sampling round-trip (stdio + http) | n/a (integration) |
+ | W17 | Tool-result offload (FetchData rehydrate) | 4 |
+ | W19 | Kitchen-sink R2 (subagent + HITL + ApiCall + offload + skills + memory + hooks + webhook) | 12 |
+ | W20 | Knowledge base on R2 (vault search/read + external link + bearer non-persistence) | 8 |
+ | W21 | External-file knowledge on R2 — md/json/csv/html round-trip + 401 bounds | 5 |
 
  ---