la-machina-engine 0.7.0 → 0.7.2

package/README.md CHANGED
@@ -694,7 +694,9 @@ Shape:
694
694
 
695
695
  ### Webhooks
696
696
 
697
- Pass a `webhook` object to `start()` / `resumeAsync()` and the engine will POST the final `EngineResponse` to your URL on the configured events.
697
+ Async runs deliver status changes to a URL you own. Pass a `webhook`
698
+ object to `start()` / `resumeAsync()` and the engine POSTs the final
699
+ `EngineResponse` whenever the run reaches a terminal or pause state.
698
700
 
699
701
  ```ts
700
702
  await engine.start({
@@ -705,46 +707,285 @@ await engine.start({
705
707
  url: 'https://your-app.com/hooks/la-machina',
706
708
  secret: 'shared-hmac-secret', // optional — enables X-LaMachina-Signature
707
709
  events: ['paused', 'done', 'failed'], // default: all three
708
- headers: { 'X-Tenant': 'acme' }, // optional — passed through
710
+ headers: { 'X-Tenant': 'acme' }, // optional — passed through per request
709
711
  },
710
712
  })
711
713
  ```
712
714
 
713
- **Request headers:**
715
+ #### Events — what fires and when
714
716
 
715
- | Header | Value |
716
- |---|---|
717
- | `Content-Type` | `application/json` |
718
- | `X-LaMachina-Event` | `status.paused` \| `status.done` \| `status.failed` |
719
- | `X-LaMachina-RunId` | Run ID from your `start()` call |
720
- | `X-LaMachina-Delivery` | Unique UUID per delivery attempt |
721
- | `X-LaMachina-Timestamp` | Unix ms (used in HMAC input) |
722
- | `X-LaMachina-Signature` | `sha256=<hex>` — HMAC over `${timestamp}.${body}` (only if `secret` set) |
717
+ Three events, mapped 1:1 from `EngineResponse.status`:
718
+
719
+ | Event | Fires when | `data` field | `meta.pauseReason` |
720
+ |---|---|---|---|
721
+ | `done` | Model reached `end_turn` cleanly | Output string (or parsed JSON in structured-output mode) | |
722
+ | `paused` | Gate callback returned `{ allow: false }`, OR run needs runner handoff | `null` | `gate_required` \| `handoff_to_runner` |
723
+ | `failed` | Anything threw (API 5xx after retries, max turns, timeout, cancel, runner unreachable) | `null` | — (`errors[0]` has the cause) |
724
+
725
+ `queued`, `running`, and `not_found` **never fire webhooks** — they're
726
+ only observable via `getStatus()` polling. Webhooks fire only on
727
+ terminal or pause states.
728
+
729
+ #### When webhooks do vs don't fire
730
+
731
+ | API call | Webhooks? | Why |
732
+ |---|---|---|
733
+ | `engine.start({webhook})` | ✓ fires on terminal/pause | |
734
+ | `engine.resumeAsync({webhook})` | ✓ fires on terminal/pause | |
735
+ | `engine.run()` | **never** | Caller already has the response in hand |
736
+ | `engine.resume()` | **never** | Same — synchronous, caller holds the result |
737
+ | `engine.cancelRun(runId)` | in-flight run aborts and fires `failed` | Cancellation is a normal failure path |
738
+
739
+ Webhooks are for the async surface exclusively. Anything running
740
+ synchronously returns its response directly.
741
+
742
+ #### Request shape
743
+
744
+ `POST {webhook.url}` with body = `JSON.stringify(EngineResponse)` and:
745
+
746
+ | Header | Value | Notes |
747
+ |---|---|---|
748
+ | `Content-Type` | `application/json` | |
749
+ | `X-LaMachina-Event` | `status.done` \| `status.paused` \| `status.failed` | Event-type routing on the receiver |
750
+ | `X-LaMachina-RunId` | Run ID from your `start()` call | Correlate with client-side state |
751
+ | `X-LaMachina-Delivery` | Fresh UUID per attempt | **Use this for idempotency** — same delivery ID = retry of same logical event |
752
+ | `X-LaMachina-Timestamp` | Unix ms | Covered by the HMAC — lets receivers reject replays |
753
+ | `X-LaMachina-Signature` | `sha256=<hex>` | Only when `secret` is set; see "Verifying the signature" below |
754
+ | _(user headers)_ | whatever you passed in `webhook.headers` | Merged last, cannot override engine headers |
755
+
756
+ Request timeout is 30 s by default. The engine aborts slower receivers
757
+ and treats them as a retryable network failure.
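
A receiver might branch on the event and pause reason roughly as follows. This is a minimal sketch, not part of the engine: `WebhookPayload` is a trimmed-down stand-in for `EngineResponse`, `describeEvent` is a hypothetical helper, and the header/body consistency check is an assumption based on the 1:1 mapping described above.

```ts
// Trimmed-down stand-in for EngineResponse (assumption: only the fields used here).
type WebhookStatus = 'done' | 'paused' | 'failed'

interface WebhookPayload {
  runId: string
  status: WebhookStatus
  data: unknown
  meta: { pauseReason?: 'gate_required' | 'handoff_to_runner' }
  errors: { code: string; message: string }[]
}

// Hypothetical helper: describes what the receiver should do with a delivery.
function describeEvent(eventHeader: string, payload: WebhookPayload): string {
  // X-LaMachina-Event is `status.<event>`; compare it with the body's status.
  const event = eventHeader.replace(/^status\./, '')
  if (event !== payload.status) return 'reject: header/body mismatch'

  switch (payload.status) {
    case 'done':
      return `store result for ${payload.runId}`
    case 'paused':
      return payload.meta.pauseReason === 'gate_required'
        ? `ask a human to approve ${payload.runId}`
        : `wait for runner handoff on ${payload.runId}`
    case 'failed':
      return `report ${payload.errors[0]?.code ?? 'UNKNOWN'} for ${payload.runId}`
  }
}
```

Returning a description instead of performing side effects keeps the routing logic easy to unit-test; a real handler would call its own storage or notification code in each branch.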
758
+
759
+ #### Payload — one schema for every event
760
+
761
+ The body is always an `EngineResponse` (the same shape `engine.run()`
762
+ returns). The event type determines which fields are meaningful:
763
+
764
+ **`done` payload:**
765
+
766
+ ```jsonc
767
+ {
768
+ "runId": "run_abc",
769
+ "status": "done",
770
+ "data": "The analysis is complete. Revenue grew 15% YoY.",
771
+ "meta": {
772
+ "nodeId": "analyze",
773
+ "turns": 5,
774
+ "tokensUsed": { "input": 12500, "output": 3200, "cacheReadInput": 8000 },
775
+ "durationMs": 8500,
776
+ "output": "The analysis is complete. Revenue grew 15% YoY.",
777
+ "transcript": { "path": "projects/run_abc/nodes/analyze", "lastShardIndex": 2 }
778
+ },
779
+ "errors": [],
780
+ "timestamp": 1712966400000
781
+ }
782
+ ```
783
+
784
+ **`paused` payload:**
785
+
786
+ ```jsonc
787
+ {
788
+ "runId": "run_abc",
789
+ "status": "paused",
790
+ "data": null,
791
+ "meta": {
792
+ "nodeId": "publish",
793
+ "pauseReason": "gate_required",
794
+ "turns": 3,
795
+ "tokensUsed": { "input": 8200, "output": 1900 },
796
+ "pendingToolCall": {
797
+ "toolName": "Publish",
798
+ "toolUseId": "toolu_01abc",
799
+ "input": { "post": { "title": "...", "body": "..." } }
800
+ },
801
+ "transcript": { "path": "projects/run_abc/nodes/publish", "lastShardIndex": 1 }
802
+ },
803
+ "errors": [],
804
+ "timestamp": 1712966400000
805
+ }
806
+ ```
807
+
808
+ Use `pendingToolCall.input` to render an approval UI, then call
809
+ `engine.resumeAsync({ runId, gateAnswer: 'approve', webhook: {...} })`
810
+ to continue. A separate `done` (or `failed`) webhook will fire for the
811
+ resumed run.
812
+
813
+ **`failed` payload:**
814
+
815
+ ```jsonc
816
+ {
817
+ "runId": "run_abc",
818
+ "status": "failed",
819
+ "data": null,
820
+ "meta": {
821
+ "nodeId": "n1",
822
+ "cancelled": true // present when the failure was engine.cancelRun()
823
+ },
824
+ "errors": [
825
+ { "code": "CANCELLED", "message": "Run was cancelled by client" }
826
+ // Other codes: RUN_FAILED, RESUME_FAILED, ERR_RUNNER_UNREACHABLE, ERR_MAX_TURNS, ORPHANED, …
827
+ ],
828
+ "timestamp": 1712966400000
829
+ }
830
+ ```
831
+
832
+ The `errors[]` array holds `{code, message}` pairs — use `errors[0].code`
833
+ for programmatic routing, `message` for display.
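
Routing on `errors[0].code` could look like the sketch below. The codes are the ones listed in the `failed` example; the mapping from code to action, the `FailureAction` type, and `actionForFailure` are illustrative assumptions, not engine behavior.

```ts
type FailureAction = 'ignore' | 'retry-later' | 'alert'

// Codes come from the failed-payload example; the chosen actions are assumptions.
const ACTION_BY_CODE = new Map<string, FailureAction>([
  ['CANCELLED', 'ignore'],
  ['ERR_RUNNER_UNREACHABLE', 'retry-later'],
  ['ERR_MAX_TURNS', 'alert'],
  ['RUN_FAILED', 'alert'],
  ['RESUME_FAILED', 'alert'],
  ['ORPHANED', 'alert'],
])

function actionForFailure(errors: { code: string; message: string }[]): FailureAction {
  const code = errors[0]?.code
  // Unknown or missing codes fall back to alerting so nothing is silently dropped.
  if (code === undefined) return 'alert'
  return ACTION_BY_CODE.get(code) ?? 'alert'
}
```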
834
+
835
+ #### Verifying the signature
836
+
837
+ When `webhook.secret` is set, the engine signs
838
+ `${X-LaMachina-Timestamp}.${body}` with HMAC-SHA256 and sets
839
+ `X-LaMachina-Signature: sha256=<hex>`. Verify in Node:
840
+
841
+ ```ts
842
+ import { createHmac, timingSafeEqual } from 'node:crypto'
843
+
844
+ function verifyLaMachinaWebhook(req: Request, rawBody: string, secret: string): boolean {
845
+ const ts = req.headers.get('x-lamachina-timestamp')
846
+ const sig = req.headers.get('x-lamachina-signature')
847
+ if (!ts || !sig) return false
848
+
849
+ // Reject replays older than 5 minutes
850
+ if (Math.abs(Date.now() - Number(ts)) > 5 * 60_000) return false
851
+
852
+ const expected =
853
+ 'sha256=' +
854
+ createHmac('sha256', secret).update(`${ts}.${rawBody}`).digest('hex')
855
+
856
+ // Constant-time comparison
857
+ const a = Buffer.from(sig)
858
+ const b = Buffer.from(expected)
859
+ return a.length === b.length && timingSafeEqual(a, b)
860
+ }
861
+ ```
862
+
863
+ On Cloudflare Workers (Web Crypto, no `node:crypto`):
864
+
865
+ ```ts
866
+ async function verifyLaMachinaWebhook(req: Request, rawBody: string, secret: string) {
867
+ const ts = req.headers.get('x-lamachina-timestamp')
868
+ const sig = req.headers.get('x-lamachina-signature')
869
+ if (!ts || !sig) return false
870
+ if (Math.abs(Date.now() - Number(ts)) > 5 * 60_000) return false
871
+
872
+ const key = await crypto.subtle.importKey(
873
+ 'raw',
874
+ new TextEncoder().encode(secret),
875
+ { name: 'HMAC', hash: 'SHA-256' },
876
+ false,
877
+ ['sign'],
878
+ )
879
+ const buf = await crypto.subtle.sign('HMAC', key, new TextEncoder().encode(`${ts}.${rawBody}`))
880
+ const expected =
881
+ 'sha256=' +
882
+ Array.from(new Uint8Array(buf))
883
+ .map((b) => b.toString(16).padStart(2, '0'))
884
+ .join('')
885
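+ // Constant-time comparison (a plain === can leak timing information)
+ if (expected.length !== sig.length) return false
+ let diff = 0
+ for (let i = 0; i < expected.length; i++) diff |= expected.charCodeAt(i) ^ sig.charCodeAt(i)
+ return diff === 0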
886
+ }
887
+ ```
888
+
889
+ **Always verify against the raw bytes** you read from the request.
890
+ Re-serializing the parsed JSON will produce different bytes and the
891
+ signature won't match.
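
To exercise a verifier locally, a test can produce a signature with the same scheme and confirm that re-serialized JSON no longer matches. This is a sketch for test code only: `signForTest` and the sample values are hypothetical, and it assumes the signing input is exactly `${timestamp}.${body}` as described above.

```ts
import { createHmac } from 'node:crypto'

// Builds the header value described above: sha256=<hex HMAC-SHA256 of `${timestamp}.${body}`>.
function signForTest(secret: string, timestampMs: number, rawBody: string): string {
  return 'sha256=' + createHmac('sha256', secret).update(`${timestampMs}.${rawBody}`).digest('hex')
}

const secret = 'shared-hmac-secret'
const ts = 1712966400000
const rawBody = '{"runId":"run_abc","status":"done"}'

const signature = signForTest(secret, ts, rawBody)

// Re-serializing with different whitespace changes the bytes,
// so a signature computed over the new string differs from the original.
const reserialized = JSON.stringify(JSON.parse(rawBody), null, 2)
const mismatch = signForTest(secret, ts, reserialized) !== signature
```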
892
+
893
+ #### Idempotency — receivers MUST handle duplicates
894
+
895
+ `X-LaMachina-Delivery` is unique per attempt, but retries of the same
896
+ logical event may send the same payload to your endpoint multiple
897
+ times (network flaps, receiver returns 5xx, etc.). De-duplicate on:
898
+
899
+ - `X-LaMachina-Delivery` — reject second delivery with the same ID
900
+ - OR `runId + status + timestamp` — simpler, event-level dedup
901
+
902
+ Pattern: insert the delivery ID into a short-TTL cache (Redis, R2, DB
903
+ unique constraint); on collision return 200 without reprocessing.
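
The pattern can be sketched with an in-memory map standing in for the short-TTL store. `SeenEvents` and `eventKey` are hypothetical names, and a single-process map is only suitable for illustration; a shared store is needed once more than one receiver instance is running.

```ts
// In-memory stand-in for the short-TTL store (Redis, R2, DB unique constraint).
class SeenEvents {
  private readonly expiresAt = new Map<string, number>()
  private readonly ttlMs: number

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs
  }

  // Returns true the first time a key is seen within the TTL, false for duplicates.
  markIfNew(key: string, now: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.expiresAt.forEach((expiry, k) => {
      if (expiry <= now) this.expiresAt.delete(k)
    })
    if (this.expiresAt.has(key)) return false
    this.expiresAt.set(key, now + this.ttlMs)
    return true
  }
}

// Event-level key built from the payload fields suggested above.
function eventKey(p: { runId: string; status: string; timestamp: number }): string {
  return `${p.runId}:${p.status}:${p.timestamp}`
}
```

A handler would call `markIfNew(eventKey(payload))` before processing and return 200 without further work when it yields `false`.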
723
904
 
724
- **Retry schedule** (exponential-ish):
905
+ #### Retry schedule
906
+
907
+ Fixed schedule per delivery attempt:
725
908
 
726
909
  ```
727
910
  attempt 1: immediate
728
- attempt 2: +10s
729
- attempt 3: +60s
730
- attempt 4: +5min
731
- attempt 5: +30min
732
- then give up
911
+ attempt 2: +10 s (after the previous attempt's failure)
912
+ attempt 3: +60 s
913
+ attempt 4: +5 min
914
+ attempt 5: +30 min
915
+ give up
733
916
  ```
734
917
 
735
918
  Retry decisions:
736
919
 
737
- | HTTP | Retry? |
920
+ | Receiver response | Retry? |
738
921
  |---|---|
739
- | 2xx | No (delivered) |
922
+ | 2xx | No (delivered) |
740
923
  | 408 Request Timeout | Yes |
741
924
  | 429 Rate Limited | Yes |
742
- | 5xx | Yes |
743
- | 410 Gone | **No** (permanent; resource removed) |
744
- | Other 4xx | No (client bug; don't retry) |
925
+ | 5xx (500–599) | Yes |
926
+ | **410 Gone** | **No — give up immediately** (resource intentionally removed) |
927
+ | Other 4xx (400/401/403/404/…) | No (payload/auth bug; retrying won't help) |
745
928
  | Network error / timeout | Yes |
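
For testing a receiver against these rules, the schedule and table can be restated as code. This sketch mirrors the documentation, not the engine's source: `shouldRetry`, `nextDelayMs`, and `RETRY_DELAYS_MS` are hypothetical names, and treating statuses the table does not list (for example 3xx) as non-retryable is an assumption.

```ts
// Delays before attempts 2..5, taken from the schedule above (attempt 1 is immediate).
const RETRY_DELAYS_MS = [10_000, 60_000, 5 * 60_000, 30 * 60_000]

// `status` is the receiver's HTTP status, or null for a network error / timeout.
function shouldRetry(status: number | null): boolean {
  if (status === null) return true // network error / timeout
  if (status >= 200 && status < 300) return false // delivered
  if (status === 408 || status === 429) return true
  if (status >= 500 && status <= 599) return true
  // 410, every other 4xx, and anything not listed in the table (assumption): stop.
  return false
}

// Delay before the next attempt after `attemptsMade` attempts (>= 1),
// or null once all five attempts have been used.
function nextDelayMs(attemptsMade: number): number | null {
  return RETRY_DELAYS_MS[attemptsMade - 1] ?? null
}
```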
746
929
 
747
- Every attempt is appended to `state.webhook.deliveries[]` for audit.
930
+ Every attempt — success or failure — is appended to
931
+ `state.webhook.deliveries[]` in `state.json`, including the HTTP status,
932
+ error message, delivery ID, timestamps, and attempt number. Inspect
933
+ via `engine.getStatus(runId)` or read `state.json` directly from R2.
934
+
935
+ #### Manual replay
936
+
937
+ If the receiver was down and the engine has already given up (5
938
+ attempts exhausted, or 4xx stopped retries), replay any past delivery:
939
+
940
+ ```ts
941
+ const status = await engine.getStatus(runId)
942
+ const missed = status.meta.webhook?.deliveries.find((d) => d.status === 'failed')
943
+ if (missed) {
944
+ await engine.retryWebhook(runId, missed.id)
945
+ }
946
+ ```
947
+
948
+ `retryWebhook` fires a fresh POST with a **new** delivery ID (so
949
+ receivers that already processed the original ID won't reject it as a
950
+ dup — this is a deliberate re-issuance, not a network retry) and
951
+ restarts the retry schedule at attempt 1.
952
+
953
+ #### Correlated pause → resume
954
+
955
+ When a run emits `paused`, the client typically gathers a decision and
956
+ calls `resumeAsync({runId, gateAnswer, webhook})`. The resumed run
957
+ will emit **another** webhook on completion — usually `done`, sometimes
958
+ `paused` again if the model hits a second gate, or `failed` if resume
959
+ fails. Receivers should track `runId` state across events:
960
+
961
+ | State sequence | Meaning |
962
+ |---|---|
963
+ | `paused` → `done` | Happy-path HITL — approved and completed |
964
+ | `paused` → `paused` → `done` | Multi-step approval — each gate wake fires its own event |
965
+ | `paused` → `failed` (`CANCELLED`) | User rejected at the gate and cancelled the run |
966
+ | `paused` → (no follow-up) | Orphaned — caller never called `resumeAsync` |
967
+
968
+ Use `runId` as your correlation key across all events for a run.
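
One way to track these sequences on the receiver is a small per-run history. `RunTracker` is a hypothetical, in-memory sketch; a production receiver would persist this state alongside its own run records.

```ts
type RunEvent = 'paused' | 'done' | 'failed'

// Hypothetical receiver-side tracker: records the event sequence per runId.
class RunTracker {
  private readonly history = new Map<string, RunEvent[]>()

  // Appends an event for the run and returns the sequence so far.
  record(runId: string, event: RunEvent): RunEvent[] {
    const seq = this.history.get(runId) ?? []
    seq.push(event)
    this.history.set(runId, seq)
    return seq
  }

  // A run is awaiting a decision when its latest event is `paused`.
  awaitingResume(runId: string): boolean {
    const seq = this.history.get(runId)
    return seq !== undefined && seq[seq.length - 1] === 'paused'
  }
}
```

Runs for which `awaitingResume` stays true for a long time correspond to the orphaned row in the table above.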
969
+
970
+ #### What's NOT a webhook event (deliberate omissions)
971
+
972
+ These are intentionally out of scope:
973
+
974
+ - **Per-turn progress** — too chatty. Poll `getStatus(runId)` for
975
+ live turn / token / activity updates (the heartbeat writes
976
+ `state.json` every ~500 ms when activity changes).
977
+ - **Per-tool dispatch** — that's what `preToolCall` / `postToolCall`
978
+ hooks are for (in-process, synchronous).
979
+ - **Subagent lifecycle** — the parent's terminal/pause state is what
980
+ fires; child runs are opaque to external receivers.
981
+ - **Resume started / resume completed** — `resumeAsync()` returns
982
+ immediately with `{runId, nodeId, status: 'running'}`; the next
983
+ webhook you'll see is the resumed run's terminal state.
984
+
985
+ If you need finer-grained updates, use `getStatus()` polling — it
986
+ reads the heartbeat-updated `state.json` and gives you
987
+ `turns / tokensUsed / currentActivity / lastTool` in real time
988
+ without any webhook-driven traffic.
748
989
 
749
990
  ### Node.js example — sync HITL and async HITL together
750
991
 
@@ -1181,6 +1422,15 @@ await writeKnowledgeIndex({ adapter: k, base: 'hr-policies' })
1181
1422
  await writeKnowledgeIndex({ adapter: k, base: 'sales-playbook' })
1182
1423
  ```
1183
1424
 
1425
+ **Forgot to build the index?** Both tools fall back to an in-memory
1426
+ build on first call when `_index.json` is missing or corrupted. The
1427
+ fallback caches for the rest of the run, so subsequent searches are
1428
+ free. This makes the index a performance optimisation (skip the walk
1429
+ on every fresh run), not a setup requirement — drop files into the
1430
+ folder and the agent can discover them immediately. Pre-build with
1431
+ `writeKnowledgeIndex()` for production-scale corpora where the
1432
+ first-call cost matters.
1433
+
1184
1434
  **Configure the engine** to enable the tools (off by default):
1185
1435
 
1186
1436
  ```ts