la-machina-engine 0.6.0 → 0.7.0

package/README.md CHANGED
@@ -22,7 +22,7 @@ npm install la-machina-engine
22
22
 
23
23
  **v0.3.0 — published on npm; production-ready core, evolving feature surface.**
24
24
 
25
- - **1214** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
25
+ - **1553** unit + integration tests pass (8 pre-existing Bun-timer failures unrelated)
26
26
  - Zero top-level `node:` imports — runs on Node.js AND Cloudflare Workers
27
27
  - 14 live workflow tests (W1–W14) verified against OpenRouter, real R2, real MCP servers
28
28
  - Pause/resume + async runs + webhooks + state.json + R2 binding storage adapter
@@ -1142,6 +1142,190 @@ on for multi-call research / browsing / repeated-API flows. Benchmark
1142
1142
  on your workload — the live test at
1143
1143
  `scripts/workflows/w17-offload-live.mjs` shows how.
1144
1144
 
1145
+ ### Knowledge base — `SearchKnowledge` + `ReadKnowledge`
1146
+
1147
+ When the model needs to look things up in your tenant's docs without
1148
+ loading whole files into context, opt into the knowledge base. Two
1149
+ built-in tools — `SearchKnowledge` (token-overlap-ranked snippets)
1150
+ and `ReadKnowledge` (one section or a whole file) — let an agent
1151
+ walk a per-tenant vault on demand.
1152
+
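To make "token-overlap-ranked" concrete, here is a minimal sketch of how such a ranking could work. It is not the engine's implementation: the stop-word list, the names `tokenizeSketch`, `overlapScore`, and `rankSections`, and the sample section bodies are all assumptions made for illustration. The real `tokenize()` / `scoreOverlap()` may behave differently.

```typescript
// Hypothetical sketch of token-overlap ranking; not the engine's code.
const STOP_WORDS = new Set(['a', 'an', 'and', 'for', 'is', 'of', 'our', 'the', 'to', 'what'])

interface RankableSection {
  ref: string
  body: string
}

// Lowercase, split on non-alphanumerics, drop stop-words and duplicates.
function tokenizeSketch(text: string): string[] {
  const seen = new Set<string>()
  const tokens: string[] = []
  for (const raw of text.toLowerCase().split(/[^a-z0-9]+/)) {
    if (raw.length === 0 || STOP_WORDS.has(raw) || seen.has(raw)) continue
    seen.add(raw)
    tokens.push(raw)
  }
  return tokens
}

// Score = number of distinct query tokens that also occur in the section.
function overlapScore(queryTokens: string[], sectionTokens: string[]): number {
  const section = new Set(sectionTokens)
  let hits = 0
  for (const token of queryTokens) {
    if (section.has(token)) hits++
  }
  return hits
}

// Rank sections by score, drop non-matches, keep the top K.
function rankSections(
  query: string,
  sections: RankableSection[],
  topK: number,
): { ref: string; score: number }[] {
  const queryTokens = tokenizeSketch(query)
  return sections
    .map((s) => ({ ref: s.ref, score: overlapScore(queryTokens, tokenizeSketch(s.body)) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
}

// Made-up sample data, for illustration only.
const sampleSections: RankableSection[] = [
  { ref: 'hr-policies/handbook.md#benefits', body: 'Details of the 401k match and other benefits.' },
  { ref: 'sales-playbook/q1/pricing.md#tiers', body: 'Pricing tiers for the first quarter.' },
]
const ranked = rankSections('401k match', sampleSections, 5)
console.log(ranked)
```

Because the score is a plain count with no stemming, a query for `matching` would not hit a section that only says `match`. Whether the real scorer normalises word forms is not stated here.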
1153
+ **Layout.** Each tenant gets a folder at
1154
+ `workspaces/{workspaceId}/knowledge/`, sibling to `.claude/`.
1155
+ Top-level subfolders are *bases* — independent corpora each with their
1156
+ own pre-built `_index.json`:
1157
+
1158
+ ```
1159
+ workspaces/acme-corp/
1160
+ ├── .claude/                     # engine state — transcripts, memory, …
1161
+ └── knowledge/
1162
+     ├── hr-policies/             # base: "hr-policies"
1163
+     │   ├── _index.json          # built by writeKnowledgeIndex()
1164
+     │   ├── handbook.md
1165
+     │   └── remote-work.md
1166
+     └── sales-playbook/          # base: "sales-playbook"
1167
+         ├── _index.json
1168
+         └── q1/
1169
+             └── pricing.md
1170
+ ```
1171
+
1172
+ **Build the index** when the corpus changes — for each base, the
1173
+ indexer walks its subtree, splits markdown at heading boundaries,
1174
+ tokenises section bodies, and writes one `_index.json` per base:
1175
+
1176
+ ```ts
1177
+ import { writeKnowledgeIndex, R2StorageAdapter } from 'la-machina-engine'
1178
+
1179
+ const k = new R2StorageAdapter(r2Config, 'workspaces/acme-corp/knowledge')
1180
+ await writeKnowledgeIndex({ adapter: k, base: 'hr-policies' })
1181
+ await writeKnowledgeIndex({ adapter: k, base: 'sales-playbook' })
1182
+ ```
1183
+
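As a rough picture of "splits markdown at heading boundaries", the sketch below cuts a document into sections keyed by a slug of each heading. It assumes the `#benefits` part of a ref is a lowercased, hyphenated heading slug, which is a guess rather than something documented here. `splitAtHeadings` and `SplitSection` are hypothetical names, and the real indexer may handle fenced code, wiki-links, and pre-heading content differently.

```typescript
// Hypothetical sketch of heading-boundary splitting; not the engine's indexer.
interface SplitSection {
  slug: string
  heading: string
  body: string
}

function splitAtHeadings(markdown: string): SplitSection[] {
  const sections: SplitSection[] = []
  let current: SplitSection | null = null

  for (const line of markdown.split('\n')) {
    // ATX headings only: one to six '#' characters followed by whitespace.
    const match = /^(#{1,6})\s+(.*)$/.exec(line)
    if (match) {
      const heading = (match[2] ?? '').trim()
      const slug = heading
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-+|-+$/g, '')
      current = { slug, heading, body: '' }
      sections.push(current)
    } else if (current) {
      // Lines before the first heading are dropped in this sketch.
      current.body += line + '\n'
    }
  }

  return sections.map((s) => ({ ...s, body: s.body.trim() }))
}

const demoSections = splitAtHeadings('# Handbook\nintro\n## Benefits\n401k details\n')
console.log(demoSections.map((s) => s.slug))
```

A known limitation of this sketch: a `#` line inside a fenced code block would be treated as a heading.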
1184
+ **Configure the engine** to enable the tools (off by default):
1185
+
1186
+ ```ts
1187
+ const engine = initEngine({
1188
+   storage: { provider: 'r2', /* … */ },
1189
+   knowledge: {
1190
+     enabled: true,        // engine-level capability flag
1191
+     maxSearchResults: 5,  // top-K per SearchKnowledge call
1192
+     maxReadBytes: 10_000, // ReadKnowledge truncation cap
1193
+   },
1194
+ })
1195
+ ```
1196
+
1197
+ **Per-run scoping.** Folders are runtime-only — pass them via
1198
+ `RunOptions.knowledge.folders`. Sub-paths inside a base work too
1199
+ (e.g., `'sales-playbook/q1'` only sees Q1 content):
1200
+
1201
+ ```ts
1202
+ await engine.run({
1203
+   task: 'What is our 401k match rate?',
1204
+   knowledge: {
1205
+     folders: ['hr-policies', 'sales-playbook/q1'],
1206
+     external: [
1207
+       // External file links — fetched on demand, never indexed.
1208
+       // `headers` are runtime-only and NEVER persist anywhere.
1209
+       {
1210
+         name: 'product-catalog',
1211
+         description: 'Product catalog CSV with unit pricing',
1212
+         url: 'https://api.acme.example/catalog.csv',
1213
+         format: 'csv',
1214
+         headers: { Authorization: 'Bearer sk_real_token' },
1215
+       },
1216
+     ],
1217
+   },
1218
+ })
1219
+ ```
1220
+
1221
+ The model then calls:
1222
+
1223
+ - `SearchKnowledge({ query: '401k matching' })` → top-K ranked
1224
+ snippets, each with a `ref` like `hr-policies/handbook.md#benefits`
1225
+ - `ReadKnowledge({ ref: 'hr-policies/handbook.md#benefits' })` →
1226
+ full body of that section
1227
+ - `ReadKnowledge({ ref: 'ext:product-catalog' })` → fetches the
1228
+ registered URL with its headers, runs the `csv` extractor, returns
1229
+ text
1230
+
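The three ref shapes in the list above (section, whole file, `ext:` link) can be told apart with a small classifier. This is only a sketch of the dispatch idea: `classifyRef` and `ClassifiedRef` are hypothetical names, and the engine's own ref parser may validate far more than this does.

```typescript
// Hypothetical sketch of ref-kind dispatch; not the engine's parser.
type ClassifiedRef =
  | { kind: 'external'; name: string }
  | { kind: 'section'; path: string; anchor: string }
  | { kind: 'file'; path: string }

function classifyRef(ref: string): ClassifiedRef {
  // `ext:<name>` points at a registered external link.
  if (ref.startsWith('ext:')) {
    return { kind: 'external', name: ref.slice('ext:'.length) }
  }
  // `<path>#<anchor>` points at one section; no '#' means the whole file.
  const hash = ref.indexOf('#')
  if (hash === -1) {
    return { kind: 'file', path: ref }
  }
  return { kind: 'section', path: ref.slice(0, hash), anchor: ref.slice(hash + 1) }
}

const sectionRef = classifyRef('hr-policies/handbook.md#benefits')
const fileRef = classifyRef('hr-policies/handbook.md')
const externalRef = classifyRef('ext:product-catalog')
console.log(sectionRef, fileRef, externalRef)
```

No safety checks are performed here. In a real dispatcher the path would be validated before any storage access.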
1231
+ **Format support.** Native: `md`, `txt`, `json`, `csv`, `html`
1232
+ (script/style stripped, entities decoded, whitespace collapsed). Optional:
1233
+ `pdf` (via `pdf-parse`) and `docx` (via `mammoth`). Both have
1234
+ `requiresNode: true` — on Workers without those packages installed,
1235
+ they return a structured `ERR_KNOWLEDGE_FORMAT_UNSUPPORTED` error.
1236
+
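For a sense of what "script/style stripped, entities decoded, whitespace collapsed" can look like, here is a minimal regex-based sketch. It is an assumption about the general approach, not the engine's `html` extractor: `htmlToText` is a hypothetical name, only a handful of named entities are handled, and a regex pass like this is not a full HTML parser.

```typescript
// Hypothetical sketch of an HTML-to-text pass; not the engine's extractor.
const NAMED_ENTITIES: Record<string, string> = {
  amp: '&',
  lt: '<',
  gt: '>',
  quot: '"',
  apos: "'",
  nbsp: ' ',
}

function decodeEntity(whole: string, name: string): string {
  const lower = name.toLowerCase()
  if (lower.startsWith('#')) {
    const isHex = lower.startsWith('#x')
    const code = parseInt(lower.slice(isHex ? 2 : 1), isHex ? 16 : 10)
    // Leave out-of-range or unparsable references untouched.
    return Number.isNaN(code) || code > 0x10ffff ? whole : String.fromCodePoint(code)
  }
  return NAMED_ENTITIES[lower] ?? whole
}

function htmlToText(html: string): string {
  return html
    // 1. Drop <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // 2. Drop every remaining tag, keeping the text between tags.
    .replace(/<[^>]+>/g, ' ')
    // 3. Decode numeric and a few named entities in a single pass.
    .replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, decodeEntity)
    // 4. Collapse runs of whitespace.
    .replace(/\s+/g, ' ')
    .trim()
}

const demoText = htmlToText(
  '<style>p{color:red}</style><p>Tom &amp; Jerry</p><script>alert(1)</script>  <b>&#65;&#x42;</b>',
)
console.log(demoText)
```

Decoding happens after tag removal and in one pass, so `&amp;lt;` becomes the literal text `&lt;` rather than being decoded twice.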
1237
+ **Path safety.** All folder + ref strings flow through one validator
1238
+ in `src/knowledge/scope.ts` that rejects absolute paths, traversal
1239
+ (`..`), unsafe characters, and out-of-base file refs. A dedicated
1240
+ test (`scope.test.ts`) pins the behaviour — every weakening would
1241
+ open a tenant-boundary hole.
1242
+
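The checks described above can be pictured with the sketch below. It is a simplified illustration under stated assumptions, not the validator in `src/knowledge/scope.ts`: the function names `isSafeRelativePath` and `isWithinFolders` and the allowed-character set are invented here, and the real rules may be stricter.

```typescript
// Hypothetical sketch of path-safety checks; not the engine's validator.
// Assumed allow-list: letters, digits, dot, underscore, hyphen per segment.
const SAFE_SEGMENT = /^[A-Za-z0-9._-]+$/

function isSafeRelativePath(path: string): boolean {
  if (path.length === 0) return false
  // Reject absolute paths, backslashes, and Windows drive prefixes.
  if (path.startsWith('/') || path.includes('\\') || /^[A-Za-z]:/.test(path)) return false
  // Reject traversal, empty segments, and unsafe characters.
  return path
    .split('/')
    .every((segment) => segment !== '.' && segment !== '..' && SAFE_SEGMENT.test(segment))
}

// A ref is in scope only if its file path sits under one of the run's folders.
function isWithinFolders(ref: string, folders: string[]): boolean {
  const hash = ref.indexOf('#')
  const filePath = hash === -1 ? ref : ref.slice(0, hash)
  if (!isSafeRelativePath(filePath)) return false
  return folders.some(
    (folder) =>
      isSafeRelativePath(folder) &&
      (filePath === folder || filePath.startsWith(folder + '/')),
  )
}

const runFolders = ['hr-policies', 'sales-playbook/q1']
console.log(isWithinFolders('hr-policies/handbook.md#benefits', runFolders))
console.log(isWithinFolders('sales-playbook/q2/pricing.md', runFolders))
```

The `folder + '/'` comparison matters: a bare prefix check would let a folder named `hr-policies-private` match the `hr-policies` scope.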
1243
+ **External link headers — non-persistence guarantee.** External
1244
+ `headers` live entirely inside the tool factory closure and on the
1245
+ `init.headers` of one `fetch` call per request. They never reach the
1246
+ LLM, the transcript, `state.json`, snapshots, or any storage write.
1247
+ A sentinel-based test suite (`externalLinkSecrets.test.ts`) and the
1248
+ live R2 test (`scripts/workflows/w20-knowledge-r2.mjs`) verify this:
1249
+ the live test seeds a known sentinel into the `Authorization` header,
1250
+ runs against real R2, and reads every transcript shard back from the
1251
+ bucket asserting zero leaks.
1252
+
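The closure idea behind this guarantee can be sketched as follows. This is an illustration only, with hypothetical names (`createExternalLinkRegistry`, `describe`, `buildRequest`) that are not the engine's API: headers are captured once in a private map, every descriptor that could be serialised is built without them, and they resurface only in the `init` object destined for a single `fetch` call. Whether the real tool also exposes the URL to the model is not stated here, so that part is an assumption.

```typescript
// Hypothetical sketch of closure-held secrets; not the engine's tool factory.
interface ExternalLinkSketch {
  name: string
  description: string
  url: string
  format: string
  headers?: Record<string, string>
}

function createExternalLinkRegistry(links: ExternalLinkSketch[]) {
  // Secrets live only in this closure-private map.
  const secrets = new Map<string, Record<string, string>>()
  const publicLinks = links.map(({ headers, ...rest }) => {
    secrets.set(rest.name, { ...(headers ?? {}) })
    return rest // no `headers` property on the public descriptor
  })

  return {
    // What a prompt, transcript, or state snapshot would be allowed to see.
    describe(): Omit<ExternalLinkSketch, 'headers'>[] {
      return publicLinks.map((link) => ({ ...link }))
    },
    // The only path that exposes headers: arguments for one fetch call.
    buildRequest(ref: string): { url: string; init: { headers: Record<string, string> } } {
      const name = ref.replace(/^ext:/, '')
      const link = publicLinks.find((l) => l.name === name)
      if (!link) throw new Error(`unknown external link: ${name}`)
      return { url: link.url, init: { headers: { ...(secrets.get(name) ?? {}) } } }
    },
  }
}

// A sentinel value makes leaks easy to detect by string search.
const SENTINEL = 'Bearer SENTINEL_TOKEN_123'
const registry = createExternalLinkRegistry([
  {
    name: 'product-catalog',
    description: 'Product catalog CSV with unit pricing',
    url: 'https://api.acme.example/catalog.csv',
    format: 'csv',
    headers: { Authorization: SENTINEL },
  },
])
const request = registry.buildRequest('ext:product-catalog')
console.log(JSON.stringify(registry.describe()).includes(SENTINEL))
```

A sentinel-style check then reduces to serialising everything that might be persisted and searching it for the sentinel string.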
1253
+ **Composing with offload.** If a `ReadKnowledge` result exceeds your
1254
+ `compaction.toolResultOffload.thresholdBytes`, the offload pipeline
1255
+ takes over — the body is written under `toolResults/`, the model
1256
+ sees a summary + ref, and `FetchData` rehydrates on demand. The two
1257
+ features compose without any extra wiring.
1258
+
1259
+ **Disabling.** `tools.disabled: ['SearchKnowledge', 'ReadKnowledge']`
1260
+ turns the tools off even when knowledge is enabled. Absent
1261
+ `config.knowledge.enabled` → no adapter built, no tools registered,
1262
+ no prompt mention.
1263
+
1264
+ #### Codebase layout
1265
+
1266
+ The knowledge subsystem is small and self-contained. If you need to
1267
+ extend it (new format extractor, custom scorer, alternative index
1268
+ schema), these are the files involved:
1269
+
1270
+ ```
1271
+ src/
1272
+ ├── knowledge/ # subsystem (self-contained)
1273
+ │ ├── types.ts # V1-suffixed public types (KnowledgeFolderRefV1, KnowledgeExternalLinkV1, KnowledgeFormatV1, ResolvedKnowledgeConfigV1, RunKnowledgeOptionsV1, SectionEntryV1, KnowledgeIndexV1, …)
1274
+ │ ├── scope.ts # parseFolderRef / parseKnowledgeRef / refInScope — load-bearing path safety
1275
+ │ ├── tokenize.ts # tokenize() + scoreOverlap() — deterministic, no LLM
1276
+ │ ├── indexer.ts # buildKnowledgeIndex() / writeKnowledgeIndex() — section split + wiki-link extraction
1277
+ │ └── extractors.ts # getExtractor(format) — md/txt/json/csv/html native; pdf/docx lazy-import
1278
+
1279
+ ├── tools/
1280
+ │ ├── searchKnowledge.ts # createSearchKnowledgeTool() — token-overlap ranked snippets
1281
+ │ └── readKnowledge.ts # createReadKnowledgeTool() — section / file / ext: ref dispatch
1282
+
1283
+ ├── storage/
1284
+ │ ├── interface.ts # adds optional `EngineStorage.knowledge?`
1285
+ │ └── factory.ts # builds the knowledge adapter at workspaces/{ws}/knowledge/ when enabled
1286
+
1287
+ ├── config/
1288
+ │ ├── types.ts # ResolvedConfig.knowledge?: ResolvedKnowledgeConfigV1
1289
+ │ ├── schema.ts # KnowledgeConfigResolved zod schema (scalars only — no folders/headers)
1290
+ │ └── merge.ts # KNOWLEDGE_DEFAULTS + fillKnowledgeDefaults()
1291
+
1292
+ ├── engine/
1293
+ │ ├── engine.ts # resolveKnowledgeRuntime() + buildToolRegistry knowledge wire-up
1294
+ │ └── types.ts # adds `knowledge?: RunKnowledgeOptionsV1` to RunOptions/ResumeOptions
1295
+
1296
+ └── index.ts # public exports: writeKnowledgeIndex, buildKnowledgeIndex,
1297
+ # createSearchKnowledgeTool, createReadKnowledgeTool, getExtractor,
1298
+ # KnowledgeFormatV1, KnowledgeIndexV1, RunKnowledgeOptionsV1, …
1299
+
1300
+ test/
1301
+ ├── unit/
1302
+ │ ├── knowledge/
1303
+ │ │ ├── tokenize.test.ts # 15 tests — stop-words, dedup, scoring
1304
+ │ │ ├── indexer.test.ts # 16 tests — section split, wiki-links, recursion
1305
+ │ │ ├── scope.test.ts # path-safety pin (every weakening = tenant-boundary hole)
1306
+ │ │ ├── extractors.test.ts # 17 tests — all formats including pdf/docx fallbacks
1307
+ │ │ └── externalLinkSecrets.test.ts # 7 sentinel-based non-persistence tests
1308
+ │ ├── tools/
1309
+ │ │ ├── searchKnowledge.test.ts # caching, multi-base, sub-path, cap, factory rejection
1310
+ │ │ └── readKnowledge.test.ts # 18 tests — all three ref kinds + error paths
1311
+ │ └── config/
1312
+ │ └── knowledgeSchema.test.ts # 13 tests — defaults, partials, header rejection
1313
+
1314
+ └── integration/engine/
1315
+     ├── knowledgeE2E.test.ts # 6 scenarios — registration, round-trip, disabled, sub-path, subagent inheritance
1316
+     ├── knowledgeWithOffload.test.ts # large ReadKnowledge → offload blob + clean transcript
1317
+     └── knowledgeMultiBase.test.ts # multi-base ranking, base-prefixed refs, indexed + external mix
1318
+
1319
+ scripts/workflows/
1320
+ ├── w20-knowledge-r2.mjs # live R2 — vault search/read + external link
1321
+ └── w21-external-files-knowledge.mjs # live R2 — md/json/csv/html external file round-trip
1322
+ ```
1323
+
1324
+ `src/knowledge/` is the only directory you need to touch to add a
1325
+ new format. Append a new `KnowledgeExtractorV1` to `extractors.ts`
1326
+ and add the type to `KnowledgeFormatV1` in `types.ts` — everything
1327
+ else is dispatched off `getExtractor(format)`.
1328
+
1145
1329
  ### Sync vs. async — when to use which
1146
1330
 
1147
1331
  | Scenario | Use |
@@ -1375,6 +1559,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
1375
1559
  - [x] 22 built-in tools
1376
1560
  - [x] Custom tool registration via `defineTool()`
1377
1561
  - [x] Device path blocking (/dev/zero, /dev/random, /proc/kcore)
1562
+ - [x] Knowledge base (`SearchKnowledge` + `ReadKnowledge`) — opt-in, per-tenant vault under `workspaces/{ws}/knowledge/`, section-level indexing, format extractors (md/txt/json/csv/html native; pdf/docx via optional deps), external link headers with non-persistence guarantee
1378
1563
 
1379
1564
  ### Agent Hierarchy
1380
1565
  - [x] Subagent spawning with depth tracking (SubagentRegistry)
@@ -1401,7 +1586,7 @@ All features ported 1:1 from La-Machina's production runtime. Pure JS, Workers-c
1401
1586
  - [x] Workers compatibility (zero top-level node: imports)
1402
1587
 
1403
1588
  ### Testing
1404
- - [x] 960 tests across 86 files
1589
+ - [x] 1553 tests across 142 files
1405
1590
  - [x] 10 live workflow tests (W1-W10) against OpenRouter + R2
1406
1591
  - [x] Coverage: 81% lines, 85% branches, 91% functions
1407
1592
  - [x] CI pipeline (lint + typecheck + test + coverage gates)
@@ -1485,7 +1670,7 @@ Features intentionally not ported — either Anthropic-only, CLI-specific, or de
1485
1670
  ```bash
1486
1671
  npm install
1487
1672
  npm run build # tsup → dist/ (ESM + CJS + .d.ts)
1488
- npm test # 1214 tests (~12s with bun)
1673
+ npm test # 1553 tests (~30s with vitest)
1489
1674
  npm run test:watch # watch mode
1490
1675
  npm run test:coverage # with coverage gates
1491
1676
  npm run typecheck # TypeScript strict
@@ -1524,11 +1709,12 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
1524
1709
 
1525
1710
  | Category | Files | Tests |
1526
1711
  |----------|-------|-------|
1527
- | Unit | 70+ | ~870 |
1528
- | Integration | 15+ | ~130 |
1712
+ | Unit | 113+ | ~1200 |
1713
+ | Integration | 15+ | ~150 |
1529
1714
  | E2E | 5 | ~30 |
1530
1715
  | Coverage additions | 20+ | ~130 |
1531
- | **Total** | **115+** | **1214** (current `bun test` count; 8 pre-existing Bun timer failures unrelated) |
1716
+ | Knowledge (Plan 023) | 11 | ~85 |
1717
+ | **Total** | **142** | **1553** (current `vitest run`; 8 skipped pre-existing) |
1532
1718
 
1533
1719
  ### Live Workflow Tests
1534
1720
 
@@ -1548,6 +1734,10 @@ publish permission on `la-machina-engine` and "Bypass 2FA" enabled.
1548
1734
  | W12 | Multi-agent + MCP + skills + HITL (parent gates child's publish) | 4 |
1549
1735
  | W13 | Per-run skill override (inline body + URL + fetch cache) | 4 |
1550
1736
  | W14 | MCP auth refresh + sampling round-trip (stdio + http) | n/a (integration) |
1737
+ | W17 | Tool-result offload (FetchData rehydrate) | 4 |
1738
+ | W19 | Kitchen-sink R2 (subagent + HITL + ApiCall + offload + skills + memory + hooks + webhook) | 12 |
1739
+ | W20 | Knowledge base on R2 (vault search/read + external link + bearer non-persistence) | 8 |
1740
+ | W21 | External-file knowledge on R2 — md/json/csv/html round-trip + 401 bounds | 5 |
1551
1741
 
1552
1742
  ---
1553
1743