darwin-langgraph 0.1.0-alpha.1 → 0.3.0-alpha.1

package/CHANGELOG.md CHANGED
@@ -3,6 +3,212 @@
3
3
  All notable changes to `darwin-langgraph` are documented here.
4
4
  The project adheres to [Semantic Versioning](https://semver.org/).
5
5
 
6
+ ## [0.3.0-alpha.1] — 2026-05-25
7
+
8
+ V0.3 closes the three deferred items from V0.2's R2 review (parent-run
9
+ propagation, double-wrap detection, hung-invoke guard) and lifts the
10
+ repository to the StudioMeyer open-source-standard documentation layout
11
+ (CONTRIBUTING, CODE_OF_CONDUCT, SECURITY, ECOSYSTEM, issue + PR
12
+ templates, GitHub Actions CI matrix on Node 20 + 22).
13
+
14
+ ### Added — three new behaviours + one new option
15
+
16
+ - **`DarwinTrajectoryEvent.runId` + `.parentRunId`** — propagated by
17
+ `DarwinCallbackHandler` from the LangChain callback contract. Lets
18
+ downstream consumers (OTEL exporters, Langfuse, LangSmith, custom
19
+ span-tree loggers) rebuild the chain hierarchy from the event payload
20
+ alone without a separate runId-tracking side-channel. `runId` is
21
+ always present on V0.2+ handler events; `parentRunId` is present
22
+ only when the chain has a parent (omitted for top-level invokes).
23
+ - **`withDarwinEvolution` double-wrap warning** — a `Symbol.for(...)`
24
+ sentinel is stamped on each wrapped graph. A second wrap of the same
25
+ graph instance now emits a one-shot `console.warn` flagging the
26
+ duplicate-hook footgun. The wrap still succeeds (some users legitimately
27
+ layer multiple `nodeMap` slices), but the silent-double-fire case
28
+ is now visible in logs.
29
+ - **`DarwinCallbackHandler` hung-invoke guard** — new option
30
+ `maxInFlightRuns?: number` on `DarwinCallbackHandlerOptions`. Caps the
31
+ internal `runId → InFlightRun` map. When the cap is exceeded the
32
+ oldest entry is evicted (FIFO via Map insertion order) and a
33
+ one-shot warning fires. Defaults to 1024 — small enough to surface
34
+ real leaks within minutes, large enough for typical fan-out. Set to
35
+ `Infinity` to opt out.
36
+
37
+ ### Changed
38
+
39
+ - **`DarwinCallbackHandlerOptions` type exported** alongside
40
+ `DarwinCallbackHandler`. Was internal in V0.2; now a public type so
41
+ consumers can build wrapper handlers without re-declaring the shape.
42
+ - **`InFlightRun` shape** — internal change in `DarwinCallbackHandler`:
43
+ the `runIdToName` map now stores `{ nodeName, parentRunId }` rather
44
+ than just the name string, to carry parentRunId through to
45
+ `handleChainEnd`. Non-breaking — only the internal field type changed.
46
+ - **VERSION constant bumped** to `0.3.0-alpha.1` in both `package.json`
47
+ and `src/index.ts` (verified by `prepublishOnly` script).
48
+
49
+ ### Added — open-source documentation standard
50
+
51
+ The repo now matches the
52
+ [`temporal-memory-workflows`](https://github.com/studiomeyer-io/temporal-memory-workflows)
53
+ OS-standard layout. New top-level files:
54
+
55
+ - **`CONTRIBUTING.md`** — folder layout, code-review expectations, PR
56
+ format, Conventional Commits convention, deprecation contract on
57
+ `withDarwinEvolution`, adapter-vs-upstream-bugs separation.
58
+ - **`CODE_OF_CONDUCT.md`** — adapted from Contributor Covenant 2.1.
59
+ - **`SECURITY.md`** — disclosure policy, supported versions
60
+ (`0.3.x` + `0.2.x` security-only), supply-chain stance (no
61
+ postinstall scripts), defense-in-depth layer breakdown.
62
+ - **`ECOSYSTEM.md`** — where the adapter sits in the StudioMeyer
63
+ toolkit, pairing notes with `darwin-agents` /
64
+ `@langchain/langgraph` / `temporal-memory-workflows`, when-to-use-what
65
+ decision table, sibling-repo links.
66
+ - **`.github/ISSUE_TEMPLATE/{bug_report,feature_request,config}.yml`**
67
+ — surface-aware bug template, problem-first feature template, contact
68
+ links routing to upstream where appropriate.
69
+ - **`.github/PULL_REQUEST_TEMPLATE.md`** — surface checklist,
70
+ zero-hard-dep check, deprecation-contract check, R1/R2 code-review
71
+ trail field for maintainers.
72
+ - **`.github/workflows/ci.yml`** — Node 20 + 22 matrix, runs
73
+ `npm ci` → `verify-version-sync` → `typecheck` → `examples:check` →
74
+ `test` → `build` on every push to `main` and every PR.
75
+
76
+ ### Test coverage
77
+
78
+ - **132/132 vitest tests green** (was 116 in V0.2, **+16 V0.3 tests**).
79
+ - New test file: `tests/v03-features.test.ts` (16 tests across 4 groups:
80
+ parent-run propagation, double-wrap warning, hung-invoke timeout
81
+ guard, `vi.resetModules()` pattern for module-level flag tests).
82
+ - tsc strict + examples typecheck + version-sync + build all clean.
83
+ - No regressions: all V0.1 + V0.2 surface tests + R1 + R2 fixes still
84
+ green.
85
+
86
+ ### Migration notes
87
+
88
+ - **From V0.2.x → V0.3.x:** No code changes required. The
89
+ `runId`/`parentRunId` fields on `DarwinTrajectoryEvent` are optional
90
+ additions; existing callbacks ignore them. The double-wrap warning
91
+ only fires if you actually double-wrap (most users do not). The
92
+ default `maxInFlightRuns: 1024` cap is invisible in normal use.
93
+ - **From V0.1.x → V0.3.x:** Still recommended to migrate from
94
+ `withDarwinEvolution` to `DarwinCallbackHandler` — see V0.2
95
+ migration notes in this changelog.
96
+
97
+ ### V0.4 Roadmap (deferred)
98
+
99
+ - **`reflectionRunPrompt` override on `darwin-agents` (paper-fidelity).**
100
+ The GEPA paper uses a stronger reflection LM than the task LM. Currently
101
+ the adapter inherits `darwin-agents@0.5.0-alpha.2`'s single-`runPrompt`
102
+ implementation. Defer to a paired bump.
103
+ - **OTEL exporter binding helper** — turn the pure `toOtelAttributes`
104
+ mapping into a one-liner that registers a span per trajectory with
105
+ a configured tracer.
106
+ - **Langfuse handler subclass** — `DarwinLangfuseHandler` that extends
107
+ `DarwinCallbackHandler` and emits Langfuse traces directly.
108
+ - **LangSmith trace correlation** — propagate `runId`/`parentRunId`
109
+ into a LangSmith-compatible attribute set.
110
+
111
+ ## [0.2.0-alpha.1] — 2026-05-25
112
+
113
+ ### Added — three new surfaces (V0.1 roadmap → LIVE)
114
+
115
+ - **Surface 4: `DarwinCallbackHandler`** — LangChain-native replacement
116
+ for `withDarwinEvolution`. Subclass of `BaseCallbackHandler` from
117
+ `@langchain/core/callbacks/base`. Pass via
118
+ `graph.invoke(input, { callbacks: [new DarwinCallbackHandler({ nodeMap,
119
+ onTrajectory }) ] })`. No monkey-patching of `invoke` / `stream`,
120
+ no `Set<symbol>` race-fix needed, no `streamMode` warn — works
121
+ identically across `invoke`, `stream` (any `streamMode`), and
122
+ `streamEvents`. Uses `metadata.langgraph_node` as the primary
123
+ node-name source (live-verified against `@langchain/langgraph@1.3.x`)
124
+ with the `runName` parameter as fallback for non-LangGraph chains.
125
+ - **Surface 5: `toOtelAttributes(trajectory, opts?)` +
126
+ `toolCallToOtelAttributes(call, opts?)`** — pure mappers from Darwin's
127
+ `ExecutionTrace` to flat `Record<string, string|number|boolean>` keyed
128
+ by [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/).
129
+ Spec-compliant cache attribute names
130
+ (`gen_ai.usage.cache_read.input_tokens` /
131
+ `gen_ai.usage.cache_creation.input_tokens`). MCP tools correctly map
132
+ to `gen_ai.tool.type = "extension"` (server-side, calls external APIs)
133
+ vs `"function"` for builtins. Sensitive `arguments` / `result` fields
134
+ are opt-in per OTEL spec. NaN/Infinity values dropped from output
135
+ (OTEL exporter compliance).
136
+ - **Surface 6: `darwinMessagesAnnotation(extra?)`** — variant of
137
+ `darwinAnnotation` that also includes LangGraph's canonical `messages`
138
+ channel (`messagesStateReducer`). Use it when your graph mixes Darwin
139
+ agents with `createReactAgent` / `MessagesAnnotation`-based prebuilt
140
+ agents. Power-user escape hatch `getMessagesChannelSpec()` exposed
141
+ for manual `Annotation.Root` composition.
142
+
143
+ ### Changed
144
+
145
+ - **`withDarwinEvolution` is `@deprecated` since v0.2.0** — JSDoc tag
146
+ plus a one-shot `console.warn` on first call (process-level, never
147
+ spam). Will be **removed in v1.0.0**. Migration is two lines (see
148
+ README "Migration from v0.1.x to v0.2.x").
149
+ - **VERSION constant bumped** to `0.2.0-alpha.1` in both `package.json`
150
+ and `src/index.ts` (verified by `prepublishOnly` script).
151
+
152
+ ### Fixed (R1 + R2 V0.2 code-review findings, all in-place pre-publish)
153
+
154
+ The 3-Agent code-review loop ran twice on V0.2. R1 surfaced 10 findings,
155
+ R2 caught 1 HIGH that R1 missed. All addressed before this release.
156
+
157
+ **R1 — 6 MUST-FIX (S1185):**
158
+
159
+ 1. **CRITICAL (Critic 1):** `firedRuns: Set<string>` in
160
+ `DarwinCallbackHandler` was an unbounded memory leak for long-lived
161
+ handlers (e.g. server singletons). Removed — `runIdToName.delete` is
162
+ the sole dedup and LangGraph guarantees one `handleChainEnd` per
163
+ `runId`.
164
+ 2. **HIGH (Critic 2):** `isExecutionTrace` only checked `version === 1`
165
+ — a malformed trajectory could pass and crash downstream
166
+ `toOtelAttributes` (which reads `trajectory.toolCalls.length`). Guard now also
167
+ requires `Array.isArray(toolCalls)` + `Array.isArray(errors)`.
168
+ `toOtelAttributes` got defensive fallbacks too.
169
+ 3. **HIGH (Critic 3 + 7):** `typeof === "number"` passed `NaN` and
170
+ `Infinity` through to OTEL exporters (silent span drop). All numeric
171
+ attributes (token usage, durationMs, turn, textBlockCount, turnCount,
172
+ mcpInvocations) now use `Number.isFinite`.
173
+ 4. **HIGH (Research 1):** OTEL spec uses
174
+ `gen_ai.usage.cache_read.input_tokens` /
175
+ `cache_creation.input_tokens` per the official attribute registry,
176
+ not the short `cache_*_tokens` form we initially emitted. Fixed
177
+ pre-release (zero-cost rename in alpha).
178
+ 5. **HIGH (Research 3):** MCP tools execute server-side and call
179
+ external APIs — `gen_ai.tool.type` must be `"extension"`, not
180
+ `"function"`. Adapter now routes on the existing `is_mcp` heuristic.
181
+ 6. **MED (Critic 5):** `swallow(err)` used `if (warned || !err)` which
182
+ silently dropped falsy throws (`throw null` etc). Fixed in
183
+ `DarwinCallbackHandler.swallow`.
184
+
185
+ **R2 — 1 HIGH (caught what R1 missed, S1185):**
186
+
187
+ 7. **HIGH (R2 Critic R2-1):** R1's fix #6 was applied to
188
+ `DarwinCallbackHandler.swallow` but NOT to the parallel `swallow`
189
+ inside `withDarwinEvolution`. Both now consistent.
190
+
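The `Number.isFinite` fix in item 3 can be illustrated with a small sketch. It is not the adapter's actual code: `pickFiniteAttributes` and the `darwin.duration_ms` key are hypothetical names chosen for the example, and only the `gen_ai.*` keys come from the changelog above.

```typescript
type AttrValue = string | number | boolean;

// Hypothetical helper: copy only exporter-safe values into a flat record.
function pickFiniteAttributes(
  candidates: Record<string, unknown>,
): Record<string, AttrValue> {
  const out: Record<string, AttrValue> = {};
  for (const [key, value] of Object.entries(candidates)) {
    if (typeof value === "number") {
      // typeof NaN and typeof Infinity are both "number", so a typeof
      // check alone lets them through; Number.isFinite rejects both.
      if (Number.isFinite(value)) out[key] = value;
    } else if (typeof value === "string" || typeof value === "boolean") {
      out[key] = value;
    }
    // Anything else (null, undefined, objects) is dropped.
  }
  return out;
}

const attrs = pickFiniteAttributes({
  "gen_ai.usage.input_tokens": 120,
  "gen_ai.usage.output_tokens": Number.NaN, // dropped
  "darwin.duration_ms": Number.POSITIVE_INFINITY, // dropped (hypothetical key)
  "gen_ai.tool.type": "extension",
});
console.log(Object.keys(attrs));
```

Dropping the key entirely, rather than substituting `0`, keeps a missing measurement distinguishable from a real zero.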
191
+ ### Known limitations carried to V0.3
192
+
193
+ - **`withDarwinEvolution` module-level deprecation flag** is not reset
194
+ across tests in the same process (R2 Critic R2-2). Acceptable for
195
+ one-shot warn semantics; testing the contract requires
196
+ `vi.resetModules()`.
197
+ - **Double-wrapping `withDarwinEvolution` on the same graph instance**
198
+ is unsupported and produces no clear error (R2 Critic R2-3). Use
199
+ `DarwinCallbackHandler` instead — composable by default.
200
+ - **Parent-run propagation** on `DarwinTrajectoryEvent` not exposed
201
+ yet (R1 Research 5). Planned for V0.3 — needs use-case validation.
202
+
203
+ ### Test coverage
204
+
205
+ - **116/116 vitest tests green** (was 63 in V0.1).
206
+ - New test files: `tests/darwin-callback-handler.test.ts` (14),
207
+ `tests/to-otel-attributes.test.ts` (18),
208
+ `tests/darwin-messages-annotation.test.ts` (7),
209
+ `tests/r2-v02-fixes.test.ts` (10 R1+R2 regression).
210
+ - tsc strict + examples typecheck + version-sync + build all clean.
211
+
6
212
  ## [0.1.0-alpha.1] — 2026-05-24
7
213
 
8
214
  ### Added
@@ -37,9 +243,12 @@ controls the installed versions and the adapter never pins them.
37
243
 
38
244
  - Released under the `alpha` npm dist-tag in parallel with
39
245
  `darwin-agents@0.5.0-alpha.1` (the first release that ships
40
- `ExecutionTrace` capture). Default `npm install darwin-langgraph`
41
- refuses to resolve until `0.1.0` final ships; explicit opt-in via
42
- `npm install darwin-langgraph@alpha`.
246
+ `ExecutionTrace` capture). Because `0.1.0-alpha.1` is the very
247
+ first publish of this package, npm assigns BOTH `alpha` and
248
+ `latest` to it (npm rule: `latest` always exists), so
249
+ `npm install darwin-langgraph` resolves to the alpha version
250
+ until `0.1.0` final ships. Prefer the explicit
251
+ `npm install darwin-langgraph@alpha` form for clarity.
43
252
  - The adapter never touches `ANTHROPIC_API_KEY`. If you run Darwin on a
44
253
  Claude Max subscription via the Claude Code CLI, set `delete
45
254
  process.env.ANTHROPIC_API_KEY` in your own bootstrap.
package/README.md CHANGED
@@ -202,26 +202,147 @@ The [`examples/`](./examples/) directory ships three runnable scripts:
202
202
 
203
203
  | `darwin-langgraph` | `darwin-agents` | `@langchain/langgraph` | Status |
204
204
  |---|---|---|---|
205
- | `0.1.0-alpha.x` | `^0.5.0-alpha.1` | `^1.3.0` | alpha (this release) |
205
+ | `0.3.0-alpha.x` | `^0.5.0-alpha.1` | `^1.3.0` | alpha (this release) |
206
+ | `0.2.0-alpha.x` | `^0.5.0-alpha.1` | `^1.3.0` | superseded |
207
+ | `0.1.0-alpha.x` | `^0.5.0-alpha.1` | `^1.3.0` | superseded |
206
208
 
207
209
  The peer-dep range `darwin-agents: "^0.5.0-alpha.1"` follows npm's
208
210
  prerelease semver rules — `0.5.0-alpha.N` and `0.5.0` final satisfy it,
209
211
  but `0.5.1-alpha.0` does NOT. A patch release of this adapter will be
210
212
  required when `darwin-agents` bumps past `0.5.x`.
211
213
 
212
- ## Known limitations (will be addressed in v0.2)
214
+ ## V0.3 observability + safety (LIVE this release)
213
215
 
214
- - **`withDarwinEvolution` monkey-patches `invoke`/`stream`**: V0.2
215
- will migrate to a `DarwinCallbackHandler` you pass via
216
- `graph.invoke(input, { callbacks: [...] })`, which is the canonical
217
- LangChain pattern (matches Langfuse, Braintrust, LangSmith handlers).
218
- - **No `gen_ai.*` OTEL attribute helper** — V0.2 will ship
219
- `toOtelAttributes(trajectory)` so traces forward to Langfuse /
220
- Braintrust / Datadog with OpenTelemetry GenAI Semantic Conventions.
221
- - **No bundled `messages` channel** — V0.2 will add
222
- `darwinMessagesAnnotation()` for graphs that mix `createReactAgent`
223
- with `createDarwinNode`. Until then, you can pass
224
- `darwinAnnotation({ ...MessagesAnnotation.spec })` manually.
216
+ V0.3 closes the three deferred items from V0.2's R2 review. Backwards
217
+ compatible; no consumer code changes required.
218
+
219
+ ### Parent-run propagation on `DarwinTrajectoryEvent`
220
+
221
+ `DarwinCallbackHandler` now populates two new fields on every emitted
222
+ event:
223
+
224
+ - **`runId: string`**: the LangChain runId of the node-chain that
225
+ produced the trajectory. Stable identifier suitable for correlation
226
+ with OTEL spans, Langfuse traces, LangSmith runs.
227
+ - **`parentRunId?: string`** — the runId of the chain that invoked
228
+ this node-chain. Omitted for top-level invokes. Use it to build the
229
+ full span hierarchy in OTEL exporters.
230
+
231
+ ```ts
232
+ const handler = new DarwinCallbackHandler({
233
+ nodeMap: { research: "researcher" },
234
+ onTrajectory: (event) => {
235
+ const attrs = toOtelAttributes(event.trajectory);
236
+ const span = tracer.startSpan(event.nodeName, {
237
+ attributes: { ...attrs, "darwin.run.id": event.runId },
238
+ });
239
+ // event.parentRunId can be used as the parent context for span linking
240
+ span.end();
241
+ },
242
+ });
243
+ ```
244
+
245
+ `withDarwinEvolution` legacy events do NOT carry the runId fields (no
246
+ runId exists in the monkey-patch path). Migrate to
247
+ `DarwinCallbackHandler` if you need them.
248
+
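As a rough illustration of rebuilding the hierarchy from the event payload alone, the sketch below groups events into a tree by `runId` / `parentRunId`. The `RunEvent` shape is a hypothetical subset of `DarwinTrajectoryEvent` and `buildRunTree` is not part of the package. It assumes that an event whose parent never emitted an event of its own should be treated as a root.

```typescript
// Hypothetical subset of the event payload: just the fields used here.
interface RunEvent {
  nodeName: string;
  runId: string;
  parentRunId?: string;
}

interface RunTreeNode {
  event: RunEvent;
  children: RunTreeNode[];
}

function buildRunTree(events: RunEvent[]): RunTreeNode[] {
  const byId = new Map<string, RunTreeNode>();
  for (const event of events) {
    byId.set(event.runId, { event, children: [] });
  }
  const roots: RunTreeNode[] = [];
  for (const node of byId.values()) {
    const parentId = node.event.parentRunId;
    const parent = parentId === undefined ? undefined : byId.get(parentId);
    if (parent) {
      parent.children.push(node);
    } else {
      // Top-level run, or the parent never produced an event (assumption).
      roots.push(node);
    }
  }
  return roots;
}

const forest = buildRunTree([
  { nodeName: "plan", runId: "r1" },
  { nodeName: "research", runId: "r2", parentRunId: "r1" },
  { nodeName: "summarize", runId: "r3", parentRunId: "unseen-parent" },
]);
console.log(forest.map((n) => `${n.event.nodeName}:${n.children.length}`));
```

Because the tree is keyed on `runId`, events can arrive in any order; the second pass links children only once every node is registered.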
249
+ ### Double-wrap warning on `withDarwinEvolution`
250
+
251
+ A `Symbol.for("darwin-langgraph.evolution.wrapped")` sentinel is now
252
+ stamped on each wrapped graph. A second `withDarwinEvolution(graph, …)`
253
+ call on the same graph instance emits a one-shot `console.warn`:
254
+
255
+ ```
256
+ [darwin-langgraph] withDarwinEvolution(): graph appears to be wrapped twice
257
+ — both hooks will fire per run, producing duplicate trajectories.
258
+ ```
259
+
260
+ Some users legitimately layer multiple `nodeMap` slices on one graph,
261
+ so the wrap still succeeds — but the silent-double-fire footgun is now
262
+ visible in logs. To layer cleanly, prefer multiple
263
+ `DarwinCallbackHandler` instances via `callbacks: [h1, h2]`.
264
+
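The sentinel technique described above can be sketched roughly as follows. This is a standalone illustration, not the package's source: the symbol key `example.double-wrap.sentinel` and the `createWrapMarker` helper are hypothetical, and the one-shot flag is scoped to the marker here rather than to the module.

```typescript
// Symbol.for() returns the same symbol for the same key across modules,
// so independently loaded copies of the code agree on the sentinel.
const WRAPPED: symbol = Symbol.for("example.double-wrap.sentinel");

function createWrapMarker(
  warn: (message: string) => void = console.warn,
): (graph: object) => boolean {
  let warned = false; // one-shot flag: warn at most once per marker

  return function markWrapped(graph: object): boolean {
    const alreadyWrapped = Object.prototype.hasOwnProperty.call(graph, WRAPPED);
    if (alreadyWrapped) {
      if (!warned) {
        warned = true;
        warn("graph appears to be wrapped twice");
      }
    } else {
      // Non-enumerable, so the stamp stays out of Object.keys and JSON output.
      Object.defineProperty(graph, WRAPPED, { value: true, enumerable: false });
    }
    return alreadyWrapped;
  };
}

const markWrapped = createWrapMarker();
const demoGraph = { name: "demo" };
markWrapped(demoGraph); // first wrap: stamps the sentinel silently
markWrapped(demoGraph); // second wrap: emits the one-shot warning
```

Stamping the object itself, instead of tracking wrapped graphs in a module-level set, means the marker holds no reference that could keep a graph alive.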
265
+ ### Hung-invoke guard on `DarwinCallbackHandler`
266
+
267
+ New option `maxInFlightRuns?: number` on
268
+ `DarwinCallbackHandlerOptions`. Caps the internal `runId → InFlightRun`
269
+ map at the configured size. When the cap is exceeded, the oldest
270
+ entry is evicted (Map insertion order) and a one-shot warning fires.
271
+ Defaults to **1024** — large enough for typical fan-out, small enough
272
+ to surface a real leak within minutes of an incident.
273
+
274
+ ```ts
275
+ const handler = new DarwinCallbackHandler({
276
+ nodeMap: { research: "researcher" },
277
+ onTrajectory: (event) => { /* ... */ },
278
+ maxInFlightRuns: 500, // tighter for memory-constrained workers
279
+ });
280
+ ```
281
+
282
+ Set `maxInFlightRuns: Infinity` to opt out (discouraged in production).
283
+
284
+ This defends against the failure mode where `handleChainEnd` /
285
+ `handleChainError` never fires (LangGraph internal bug, OS-level kill
286
+ of the worker mid-invoke, parent invoke aborted mid-flight). Without
287
+ the cap, the map grows without bound and leaks memory in long-running
288
+ processes (server singletons, schedulers, Temporal Workers).
289
+
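A minimal sketch of the capped-map idea, assuming eviction simply removes the first key in `Map` insertion order. `BoundedRunMap` is a hypothetical class written for this example; the handler's real internals are not shown in this diff.

```typescript
class BoundedRunMap<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxSize: number;
  private readonly warn: (message: string) => void;
  private evictionWarned = false;

  constructor(maxSize: number, warn: (message: string) => void = console.warn) {
    this.maxSize = maxSize;
    this.warn = warn;
  }

  // Register an in-flight run; evict the oldest entries if over the cap.
  set(runId: string, value: V): void {
    this.entries.set(runId, value);
    while (this.entries.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
      if (!this.evictionWarned) {
        this.evictionWarned = true;
        this.warn(`in-flight cap of ${this.maxSize} exceeded; evicting oldest run`);
      }
    }
  }

  // Remove and return the entry when the run finishes or errors.
  take(runId: string): V | undefined {
    const value = this.entries.get(runId);
    this.entries.delete(runId);
    return value;
  }

  get size(): number {
    return this.entries.size;
  }
}

const inFlight = new BoundedRunMap<string>(2);
inFlight.set("run-1", "research");
inFlight.set("run-2", "plan");
inFlight.set("run-3", "summarize"); // evicts run-1 and warns once
console.log(inFlight.size);
```

With `maxSize` set to `Infinity` the size comparison is never true, so the same code path doubles as the opt-out.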
290
+ ### V0.2 — new surfaces (still LIVE)
291
+
292
+ V0.2 ships the three items from V0.1's roadmap for V0.2, plus runtime
293
+ deprecation warnings on the legacy wrapper:
294
+
295
+ - **`DarwinCallbackHandler`** — LangChain-native replacement for
296
+ `withDarwinEvolution`. Pass it via `graph.invoke(input, { callbacks:
297
+ [new DarwinCallbackHandler({ nodeMap, onTrajectory }) ] })`. No more
298
+ monkey-patching `invoke`/`stream`, no more `Set<symbol>` race-fix,
299
+ no more `streamMode` gymnastics. Works identically with `invoke`,
300
+ `stream` (any `streamMode`), and `streamEvents` because LangChain's
301
+ callback mechanism fires regardless of how the consumer iterates.
302
+ - **`toOtelAttributes(trajectory, opts?)` + `toolCallToOtelAttributes(call, opts?)`** —
303
+ pure mappers from Darwin's `ExecutionTrace` to flat
304
+ `Record<string, string|number|boolean>` keyed by
305
+ [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/)
306
+ (`gen_ai.operation.name`, `gen_ai.usage.input_tokens`, etc.) plus
307
+ Darwin-namespaced custom attrs. Drop straight into Langfuse,
308
+ Braintrust, Honeycomb, Datadog. Sensitive `arguments` / `result`
309
+ fields are opt-in per OTEL spec.
310
+ - **`darwinMessagesAnnotation(extra?)`** — variant of `darwinAnnotation`
311
+ that also includes LangGraph's canonical `messages` channel
312
+ (`messagesStateReducer`). Use it when your graph mixes Darwin agents
313
+ with `createReactAgent` / `MessagesAnnotation`-based prebuilt agents.
314
+ - **Runtime deprecation warning on `withDarwinEvolution`** — fires once
315
+ per process on first call. The function still works identically for
316
+ back-compat; it will be removed in v1.0. Migrate to
317
+ `DarwinCallbackHandler`.
318
+
319
+ ## Migration from v0.1.x to v0.2.x
320
+
321
+ Two lines:
322
+
323
+ ```diff
324
+ - import { withDarwinEvolution } from "darwin-langgraph";
325
+ + import { DarwinCallbackHandler } from "darwin-langgraph";
326
+ - const graph = withDarwinEvolution(compiledGraph, { nodeMap, onTrajectory });
327
+ - const result = await graph.invoke(input);
328
+ + const handler = new DarwinCallbackHandler({ nodeMap, onTrajectory });
329
+ + const result = await compiledGraph.invoke(input, { callbacks: [handler] });
330
+ ```
331
+
332
+ `DarwinEvolutionOptions`, `DarwinNodeMapEntry`, and the
333
+ `DarwinTrajectoryEvent` shape passed to `onTrajectory` are 100% identical
334
+ between both APIs — no other code changes required.
335
+
336
+ ## Migration from v0.2.x to v0.3.x
337
+
338
+ **No code changes required.** V0.3 is purely additive:
339
+
340
+ - `event.runId` and `event.parentRunId` are new optional fields — existing
341
+ callbacks that don't read them keep working untouched.
342
+ - The double-wrap warning only fires if you actually double-wrap (most
343
+ users do not).
344
+ - The default `maxInFlightRuns: 1024` is invisible in normal use. Override
345
+ only if you have memory pressure or want to opt out via `Infinity`.
225
346
 
226
347
  The adapter releases follow `darwin-agents` major bumps. When
227
348
  `@langchain/langgraph` 2.x lands, the adapter ships a new major within
@@ -267,9 +388,20 @@ delete process.env.ANTHROPIC_API_KEY; // enforce Max-Plan subscription
267
388
  ## Versioning
268
389
 
269
390
  Released under the `alpha` npm dist-tag in parallel with
270
- `darwin-agents@0.5.0-alpha.1`. Default `npm install darwin-langgraph`
271
- will NOT resolve until `0.1.0` final ships. Explicit opt-in via
272
- `npm install darwin-langgraph@alpha`.
391
+ `darwin-agents@0.5.0-alpha.1`. Because `0.1.0-alpha.1` is the FIRST
392
+ publish, npm assigns BOTH `alpha` and `latest` to it (npm rule: a
393
+ package must always have a `latest` tag) — so `npm install
394
+ darwin-langgraph` and `npm install darwin-langgraph@alpha` resolve to
395
+ the same version right now. Once `0.1.0` final ships, `latest` will
396
+ point to the stable release and `alpha` will continue to track
397
+ pre-releases.
398
+
399
+ For maximum clarity in alpha-stage projects, prefer the explicit
400
+ opt-in form:
401
+
402
+ ```bash
403
+ npm install darwin-langgraph@alpha @langchain/langgraph darwin-agents@alpha
404
+ ```
273
405
 
274
406
  ## License
275
407
 
@@ -0,0 +1,134 @@
1
+ /**
2
+ * Surface 4 (V0.2) — `DarwinCallbackHandler`.
3
+ *
4
+ * Drop-in replacement for {@link withDarwinEvolution} that uses
5
+ * LangChain's canonical `BaseCallbackHandler` mechanism instead of
6
+ * monkey-patching `graph.invoke` / `graph.stream`. The same trajectory
7
+ * hook fires, the same `nodeMap` routing applies, but the integration
8
+ * is now LangGraph-native — no method overwrites, no concurrent-invoke
9
+ * Set<symbol> dance, no streamMode gymnastics.
10
+ *
11
+ * Usage:
12
+ * ```ts
13
+ * import { DarwinCallbackHandler } from "darwin-langgraph";
14
+ *
15
+ * const handler = new DarwinCallbackHandler({
16
+ * nodeMap: { research: "researcher" },
17
+ * onTrajectory: (event) => {
18
+ * console.log(event.nodeName, event.trajectory.toolCalls.length);
19
+ * },
20
+ * });
21
+ *
22
+ * const result = await graph.invoke(
23
+ * { task: "What is GEPA?" },
24
+ * { callbacks: [handler] },
25
+ * );
26
+ * ```
27
+ *
28
+ * Design notes (S1185 V0.2):
29
+ * - **runId → runName mapping.** LangChain invokes `handleChainStart`
30
+ * with the `runName` arg (the node name in LangGraph's case) and
31
+ * a unique `runId`. We cache that mapping. On `handleChainEnd` we
32
+ * look up the name, find the matching `nodeMap` entry, and dispatch
33
+ * `onTrajectory`.
34
+ * - **Fire-and-forget.** Like the v0.1 monkey-patch wrapper, hook
35
+ * errors are swallowed with one warn-once per handler instance.
36
+ * - **No mutation of LangGraph internals.** This handler never touches
37
+ * `graph.invoke` or `graph.stream` — registering it via the standard
38
+ * `{ callbacks: [...] }` option is the only side effect, which is
39
+ * LangChain's documented integration point (matches Langfuse,
40
+ * Braintrust, LangSmith handler patterns).
41
+ * - **Stream-mode-agnostic.** Works identically with `invoke`,
42
+ * `stream` (any streamMode), and `streamEvents` because the chain
43
+ * callbacks fire regardless of how the consumer iterates.
44
+ * - **Concurrent runs work natively.** LangChain's runId is unique
45
+ * per call — no shared-counter race condition is possible.
46
+ * - **Backwards compat.** `withDarwinEvolution` from v0.1 still works
47
+ * (marked `@deprecated`) and produces the same `DarwinTrajectoryEvent`
48
+ * payload shape. Migration is one line: replace
49
+ * `withDarwinEvolution(graph, opts)` with
50
+ * `graph.invoke(input, { callbacks: [new DarwinCallbackHandler(opts)] })`.
51
+ */
52
+ import { BaseCallbackHandler } from "@langchain/core/callbacks/base";
53
+ import type { ChainValues } from "@langchain/core/utils/types";
54
+ import type { DarwinEvolutionOptions } from "./with-darwin-evolution.js";
55
+ /**
56
+ * V0.3 — extra options on top of `DarwinEvolutionOptions`. Pass to
57
+ * `new DarwinCallbackHandler({ ...opts, maxInFlightRuns })`.
58
+ *
59
+ * NEW V0.3 (S1187).
60
+ */
61
+ export interface DarwinCallbackHandlerOptions extends DarwinEvolutionOptions {
62
+ /**
63
+ * Maximum number of in-flight `runId → nodeName` mappings the handler
64
+ * holds at once. If `handleChainEnd` / `handleChainError` never fires
65
+ * (LangGraph internal bug, OS-level kill of the worker mid-invoke,
66
+ * etc.) the map would otherwise grow without bound and leak memory.
67
+ *
68
+ * When the limit is exceeded, the OLDEST entry is evicted and a
69
+ * one-shot warning is logged. Default: 1024 (enough for typical
70
+ * fan-out patterns with safety margin, small enough to surface real
71
+ * leaks within minutes of an incident).
72
+ *
73
+ * Set to `Infinity` to opt out — discouraged in production.
74
+ */
75
+ maxInFlightRuns?: number;
76
+ }
77
+ /**
78
+ * LangChain `BaseCallbackHandler` that listens for LangGraph node-chain
79
+ * events and dispatches Darwin trajectory hooks. Pass it via the
80
+ * standard `{ callbacks: [...] }` option to any `invoke`/`stream`/`streamEvents`
81
+ * call on a compiled `StateGraph`.
82
+ */
83
+ export declare class DarwinCallbackHandler extends BaseCallbackHandler {
84
+ readonly name = "DarwinCallbackHandler";
85
+ readonly awaitHandlers = false;
86
+ private readonly resolved;
87
+ private readonly onTrajectory;
88
+ /**
89
+ * runId → in-flight run state. Map preserves insertion order so the
90
+ * oldest entry is always at `.keys().next().value` for oldest-first eviction
91
+ * when the cap is exceeded (V0.3 hung-invoke guard).
92
+ */
93
+ private readonly runIdToName;
94
+ private readonly maxInFlightRuns;
95
+ private warned;
96
+ private evictionWarned;
97
+ constructor(opts: DarwinCallbackHandlerOptions);
98
+ /**
99
+ * Capture the run-id → node-name mapping.
100
+ *
101
+ * Implementation note (S1185 V0.2 — live-debug against @langchain/langgraph@1.3.x):
102
+ * `metadata.langgraph_node` is the stable, reliable source for the
103
+ * StateGraph node name. LangGraph populates it on every node-chain
104
+ * invocation. The `runName` parameter slot in `@langchain/core`'s
105
+ * `BaseCallbackHandler` d.ts is undefined for LangGraph chains at
106
+ * runtime — we keep it as a fallback for non-LangGraph chains.
107
+ */
108
+ handleChainStart(_chain: unknown, _inputs: ChainValues, runId: string, _runType?: string, _tags?: string[], metadata?: Record<string, unknown>, runName?: string, parentRunId?: string, _extra?: Record<string, unknown>): void;
109
+ /**
110
+ * When a node-chain finishes, look up its name, locate the matching
111
+ * trajectory in `outputs`, and dispatch onTrajectory.
112
+ *
113
+ * `outputs` carries the state update the node returned — which may be
114
+ * a partial state (just the new keys). When the node was wrapped via
115
+ * {@link createDarwinNode}, the trajectoryKey lives directly on that
116
+ * partial state under its configured name.
117
+ */
118
+ handleChainEnd(outputs: ChainValues, runId: string, _parentRunId?: string, _tags?: string[], _kwargs?: {
119
+ inputs?: ChainValues;
120
+ }): void;
121
+ /**
122
+ * Forget the in-flight runId when a chain errors out. We don't fire
123
+ * the hook on error — the trajectory is by definition incomplete.
124
+ */
125
+ handleChainError(_err: Error, runId: string, _parentRunId?: string): void;
126
+ private swallow;
127
+ /**
128
+ * Helper for tests + debug introspection — returns how many in-flight
129
+ * chain runs we are currently tracking. Should be 0 between
130
+ * top-level invocations.
131
+ */
132
+ getInFlightCount(): number;
133
+ }
134
+ //# sourceMappingURL=darwin-callback-handler.d.ts.map
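The declaration file only shows signatures, so the node-name lookup described in the `handleChainStart` comment is not visible here. A minimal sketch of that fallback order, assuming `metadata.langgraph_node` is a non-empty string when present, might look like this (`resolveNodeName` is a hypothetical helper, not an export of the package):

```typescript
// Prefer metadata.langgraph_node; fall back to runName for chains
// that do not come from LangGraph. Returns undefined if neither is usable.
function resolveNodeName(
  metadata?: Record<string, unknown>,
  runName?: string,
): string | undefined {
  const fromMetadata = metadata?.["langgraph_node"];
  if (typeof fromMetadata === "string" && fromMetadata.length > 0) {
    return fromMetadata;
  }
  return runName;
}

console.log(resolveNodeName({ langgraph_node: "research" }, undefined));
console.log(resolveNodeName(undefined, "MyChain"));
```

Checking the type at runtime matters because `metadata` is typed as `Record<string, unknown>`, so the value cannot be assumed to be a string.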
@@ -0,0 +1 @@
1
+ {"version":3,"file":"darwin-callback-handler.d.ts","sourceRoot":"","sources":["../src/darwin-callback-handler.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAkDG;AAEH,OAAO,EAAE,mBAAmB,EAAE,MAAM,gCAAgC,CAAC;AACrE,OAAO,KAAK,EAAE,WAAW,EAAE,MAAM,6BAA6B,CAAC;AAI/D,OAAO,KAAK,EACV,sBAAsB,EAGvB,MAAM,4BAA4B,CAAC;AAEpC;;;;;GAKG;AACH,MAAM,WAAW,4BAA6B,SAAQ,sBAAsB;IAC1E;;;;;;;;;;;;OAYG;IACH,eAAe,CAAC,EAAE,MAAM,CAAC;CAC1B;AAoED;;;;;GAKG;AACH,qBAAa,qBAAsB,SAAQ,mBAAmB;IAC5D,SAAyB,IAAI,2BAA2B;IACxD,SAAyB,aAAa,SAAS;IAE/C,OAAO,CAAC,QAAQ,CAAC,QAAQ,CAAoC;IAC7D,OAAO,CAAC,QAAQ,CAAC,YAAY,CAAyC;IACtE;;;;OAIG;IACH,OAAO,CAAC,QAAQ,CAAC,WAAW,CAAuC;IACnE,OAAO,CAAC,QAAQ,CAAC,eAAe,CAAS;IACzC,OAAO,CAAC,MAAM,CAAS;IACvB,OAAO,CAAC,cAAc,CAAS;gBAEnB,IAAI,EAAE,4BAA4B;IAsB9C;;;;;;;;;OASG;IACM,gBAAgB,CACvB,MAAM,EAAE,OAAO,EACf,OAAO,EAAE,WAAW,EACpB,KAAK,EAAE,MAAM,EACb,QAAQ,CAAC,EAAE,MAAM,EACjB,KAAK,CAAC,EAAE,MAAM,EAAE,EAChB,QAAQ,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,EAClC,OAAO,CAAC,EAAE,MAAM,EAChB,WAAW,CAAC,EAAE,MAAM,EACpB,MAAM,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,GAC/B,IAAI;IAkDP;;;;;;;;OAQG;IACM,cAAc,CACrB,OAAO,EAAE,WAAW,EACpB,KAAK,EAAE,MAAM,EACb,YAAY,CAAC,EAAE,MAAM,EACrB,KAAK,CAAC,EAAE,MAAM,EAAE,EAChB,OAAO,CAAC,EAAE;QAAE,MAAM,CAAC,EAAE,WAAW,CAAA;KAAE,GACjC,IAAI;IAwCP;;;OAGG;IACM,gBAAgB,CACvB,IAAI,EAAE,KAAK,EACX,KAAK,EAAE,MAAM,EACb,YAAY,CAAC,EAAE,MAAM,GACpB,IAAI;IAIP,OAAO,CAAC,OAAO;IAaf;;;;OAIG;IACI,gBAAgB,IAAI,MAAM;CAGlC"}