@dogpile/sdk 0.3.1 → 0.5.0
- package/CHANGELOG.md +201 -0
- package/README.md +1 -0
- package/dist/browser/index.js +2328 -237
- package/dist/browser/index.js.map +1 -1
- package/dist/index.d.ts +3 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +1 -0
- package/dist/index.js.map +1 -1
- package/dist/providers/openai-compatible.d.ts +11 -0
- package/dist/providers/openai-compatible.d.ts.map +1 -1
- package/dist/providers/openai-compatible.js +88 -2
- package/dist/providers/openai-compatible.js.map +1 -1
- package/dist/runtime/audit.d.ts +42 -0
- package/dist/runtime/audit.d.ts.map +1 -0
- package/dist/runtime/audit.js +73 -0
- package/dist/runtime/audit.js.map +1 -0
- package/dist/runtime/broadcast.d.ts.map +1 -1
- package/dist/runtime/broadcast.js +39 -36
- package/dist/runtime/broadcast.js.map +1 -1
- package/dist/runtime/cancellation.d.ts +26 -0
- package/dist/runtime/cancellation.d.ts.map +1 -1
- package/dist/runtime/cancellation.js +38 -1
- package/dist/runtime/cancellation.js.map +1 -1
- package/dist/runtime/coordinator.d.ts +79 -1
- package/dist/runtime/coordinator.d.ts.map +1 -1
- package/dist/runtime/coordinator.js +979 -61
- package/dist/runtime/coordinator.js.map +1 -1
- package/dist/runtime/decisions.d.ts +25 -3
- package/dist/runtime/decisions.d.ts.map +1 -1
- package/dist/runtime/decisions.js +241 -3
- package/dist/runtime/decisions.js.map +1 -1
- package/dist/runtime/defaults.d.ts +37 -1
- package/dist/runtime/defaults.d.ts.map +1 -1
- package/dist/runtime/defaults.js +359 -4
- package/dist/runtime/defaults.js.map +1 -1
- package/dist/runtime/engine.d.ts +17 -4
- package/dist/runtime/engine.d.ts.map +1 -1
- package/dist/runtime/engine.js +770 -35
- package/dist/runtime/engine.js.map +1 -1
- package/dist/runtime/health.d.ts +51 -0
- package/dist/runtime/health.d.ts.map +1 -0
- package/dist/runtime/health.js +85 -0
- package/dist/runtime/health.js.map +1 -0
- package/dist/runtime/introspection.d.ts +96 -0
- package/dist/runtime/introspection.d.ts.map +1 -0
- package/dist/runtime/introspection.js +31 -0
- package/dist/runtime/introspection.js.map +1 -0
- package/dist/runtime/metrics.d.ts +44 -0
- package/dist/runtime/metrics.d.ts.map +1 -0
- package/dist/runtime/metrics.js +12 -0
- package/dist/runtime/metrics.js.map +1 -0
- package/dist/runtime/model.d.ts.map +1 -1
- package/dist/runtime/model.js +34 -7
- package/dist/runtime/model.js.map +1 -1
- package/dist/runtime/provenance.d.ts +25 -0
- package/dist/runtime/provenance.d.ts.map +1 -0
- package/dist/runtime/provenance.js +13 -0
- package/dist/runtime/provenance.js.map +1 -0
- package/dist/runtime/sequential.d.ts.map +1 -1
- package/dist/runtime/sequential.js +47 -37
- package/dist/runtime/sequential.js.map +1 -1
- package/dist/runtime/shared.d.ts.map +1 -1
- package/dist/runtime/shared.js +39 -36
- package/dist/runtime/shared.js.map +1 -1
- package/dist/runtime/tracing.d.ts +31 -0
- package/dist/runtime/tracing.d.ts.map +1 -0
- package/dist/runtime/tracing.js +18 -0
- package/dist/runtime/tracing.js.map +1 -0
- package/dist/runtime/validation.d.ts +10 -0
- package/dist/runtime/validation.d.ts.map +1 -1
- package/dist/runtime/validation.js +73 -0
- package/dist/runtime/validation.js.map +1 -1
- package/dist/types/events.d.ts +339 -12
- package/dist/types/events.d.ts.map +1 -1
- package/dist/types/replay.d.ts +7 -1
- package/dist/types/replay.d.ts.map +1 -1
- package/dist/types.d.ts +255 -6
- package/dist/types.d.ts.map +1 -1
- package/dist/types.js.map +1 -1
- package/package.json +39 -1
- package/src/index.ts +15 -0
- package/src/providers/openai-compatible.ts +83 -3
- package/src/runtime/audit.ts +121 -0
- package/src/runtime/broadcast.ts +40 -37
- package/src/runtime/cancellation.ts +59 -1
- package/src/runtime/coordinator.ts +1221 -61
- package/src/runtime/decisions.ts +307 -4
- package/src/runtime/defaults.ts +389 -4
- package/src/runtime/engine.ts +1004 -35
- package/src/runtime/health.ts +136 -0
- package/src/runtime/introspection.ts +122 -0
- package/src/runtime/metrics.ts +45 -0
- package/src/runtime/model.ts +38 -6
- package/src/runtime/provenance.ts +43 -0
- package/src/runtime/sequential.ts +49 -38
- package/src/runtime/shared.ts +40 -37
- package/src/runtime/tracing.ts +35 -0
- package/src/runtime/validation.ts +81 -0
- package/src/types/events.ts +369 -12
- package/src/types/replay.ts +14 -1
- package/src/types.ts +279 -4
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,206 @@

# Changelog

## [0.5.0] — 2026-05-01

v0.5.0 Observability and Auditability starts with provenance annotations: model provider calls now produce real request/response events, replay can synthesize those events from provider-call anchors, and callers get a small runtime helper for normalized provenance fields.

Prepared the release identity for `@dogpile/sdk@0.5.0` and `dogpile-sdk-0.5.0.tgz`.

### Breaking

- **`ModelRequestEvent` shape changed.** The `at` field is removed. The event now carries `startedAt: string` (ISO-8601 timestamp immediately before the provider call) and `modelId: string` (resolved model identifier). Update any code that reads `event.at` on a `ModelRequestEvent`.
- **`ModelResponseEvent` shape changed.** The `at` field is removed. The event now carries `startedAt: string`, `completedAt: string` (ISO-8601 timestamp after the provider call), and `modelId: string`. Callers can compute call duration from a single event. Update any code that reads `event.at` on a `ModelResponseEvent`.
- **`model-request` and `model-response` events are now emitted.** These event types were previously typed but never produced at runtime. They are now emitted on every provider call across all four protocols (`sequential`, `broadcast`, `coordinator`, `shared`). Callers with exhaustive switches over `RunEvent["type"]` that lack a `default` branch may encounter unhandled cases — add `case "model-request":` and `case "model-response":` branches or a fallback `default`.
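
The breaking changes above could be absorbed along the following lines. This is a minimal sketch, not SDK code: the event interfaces are local stand-ins that model only the fields named in this changelog, and `callDurationMs` and `describeEvent` are hypothetical helper names.

```typescript
// Local stand-ins for the 0.5.0 event shapes. Only the fields named in the
// changelog are modeled, so treat these as assumptions, not SDK types.
interface ModelRequestEvent {
  type: "model-request";
  startedAt: string; // ISO-8601, taken immediately before the provider call
  modelId: string;
}

interface ModelResponseEvent {
  type: "model-response";
  startedAt: string;
  completedAt: string; // ISO-8601, taken after the provider call
  modelId: string;
}

// Placeholder standing in for every other RunEvent variant.
interface OtherEvent {
  type: "agent-turn";
}

type SketchRunEvent = ModelRequestEvent | ModelResponseEvent | OtherEvent;

// Duration is derived from a single response event; `event.at` is gone.
function callDurationMs(event: ModelResponseEvent): number {
  return Date.parse(event.completedAt) - Date.parse(event.startedAt);
}

function describeEvent(event: SketchRunEvent): string {
  switch (event.type) {
    case "model-request":
      return `request to ${event.modelId} at ${event.startedAt}`;
    case "model-response":
      return `response from ${event.modelId} in ${callDurationMs(event)} ms`;
    default:
      // Fallback branch so newly emitted event types never go unhandled.
      return `unhandled event: ${event.type}`;
  }
}
```

The `default` branch is the cheaper migration; the explicit `case` branches are preferable when the consumer actually wants the provenance data.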

### Added — Provenance annotations (Phase 6)

- **`ConfiguredModelProvider.modelId?` optional field.** Provider adapters can now declare the specific model identifier, such as `"gpt-4o"`. When absent, the SDK uses `provider.id` as the fallback. `createOpenAICompatibleProvider` and the internal Vercel AI provider populate this field automatically from the configured model.
- **`ReplayTraceProviderCall.modelId` required field.** The model identifier is now recorded in every provider call entry in `trace.providerCalls`. This is a shape change on the replay type — if you have hand-crafted `ReplayTraceProviderCall` objects, such as in tests, add the `modelId` field.
- **New subpath: `@dogpile/sdk/runtime/provenance`.** Exports `getProvenance(event)`, `ProvenanceRecord`, and `PartialProvenanceRecord`. `getProvenance()` extracts normalized provenance fields from any `ModelRequestEvent` or `ModelResponseEvent`; the overloaded signature returns `ProvenanceRecord` with `completedAt` for response events and `PartialProvenanceRecord` without `completedAt` for request events.
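
The overloaded signature described for `getProvenance()` can be illustrated with a local sketch. Everything below is an assumption made for illustration: the record fields beyond `completedAt` are guesses, the event interfaces are trimmed stand-ins, and `getProvenanceSketch` is a hypothetical name rather than the SDK export.

```typescript
// Hypothetical record shapes. The changelog only states that the response
// variant includes `completedAt` and the request variant does not.
interface PartialProvenanceRecord {
  modelId: string;
  startedAt: string;
}

interface ProvenanceRecord extends PartialProvenanceRecord {
  completedAt: string;
}

interface RequestEvent {
  type: "model-request";
  modelId: string;
  startedAt: string;
}

interface ResponseEvent {
  type: "model-response";
  modelId: string;
  startedAt: string;
  completedAt: string;
}

// Overloads: the return type follows the event type, so callers need no cast.
function getProvenanceSketch(event: ResponseEvent): ProvenanceRecord;
function getProvenanceSketch(event: RequestEvent): PartialProvenanceRecord;
function getProvenanceSketch(
  event: RequestEvent | ResponseEvent,
): PartialProvenanceRecord | ProvenanceRecord {
  const base = { modelId: event.modelId, startedAt: event.startedAt };
  // Only response events have a completion timestamp to copy over.
  return event.type === "model-response"
    ? { ...base, completedAt: event.completedAt }
    : base;
}
```

With overloads of this kind, `getProvenanceSketch(responseEvent).completedAt` type-checks, while the same property access on the result for a request event is a compile-time error.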

### Added — Structured event introspection + health diagnostics (Phase 7)

- **New subpath: `@dogpile/sdk/runtime/introspection`.** Exports `queryEvents(events, filter)` and `EventQueryFilter`. `queryEvents()` filters a `readonly RunEvent[]` by event type, agent id, global turn range, and/or cost range with AND semantics, returning a narrowed subtype such as `TurnEvent[]` when `filter.type === "agent-turn"` without caller casts.
- **New subpath: `@dogpile/sdk/runtime/health`.** Exports `computeHealth(trace, thresholds?)`, `HealthThresholds`, `DEFAULT_HEALTH_THRESHOLDS`, `RunHealthSummary`, and `HealthAnomaly`. `computeHealth()` derives anomaly records and stats from a trace without I/O or runtime state.
- **`result.health: RunHealthSummary` required field.** Every `RunResult` now includes an always-present machine-readable health summary computed from trace events at result time and recomputed identically by `replay()`. The summary exposes `health.anomalies: readonly HealthAnomaly[]` and `health.stats.totalTurns`, `health.stats.agentCount`, and `health.stats.budgetUtilizationPct`.
- **New root-exported health types.** `AnomalyCode`, `HealthAnomaly`, and `RunHealthSummary` are exported from `@dogpile/sdk`.
- **Frozen health anomaly fixture.** `src/tests/fixtures/anomaly-record-v1.json` records one sample `HealthAnomaly` per anomaly code. `provider-error-recovered` is present in the `AnomalyCode` union and fixture but is not emitted by `computeHealth()` in Phase 7 because current traces have no provider-recovery signal without an event-shape change.
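
To make the "AND semantics with a narrowed return type" idea behind `queryEvents()` concrete, here is a self-contained sketch. The event variants, the filter field names (`minCostUsd`, `maxCostUsd`), and `queryEventsSketch` itself are hypothetical; the real `EventQueryFilter` may be shaped differently.

```typescript
// Hypothetical, trimmed-down event variants for illustration only.
interface TurnEvent {
  type: "agent-turn";
  agentId: string;
  turn: number;
  costUsd: number;
}

interface FinalEvent {
  type: "final";
  costUsd: number;
}

type SketchEvent = TurnEvent | FinalEvent;

// Assumed filter fields; every field that is set must match (AND semantics).
interface SketchFilter<T extends SketchEvent["type"]> {
  type?: T;
  agentId?: string;
  minCostUsd?: number;
  maxCostUsd?: number;
}

function queryEventsSketch<T extends SketchEvent["type"] = SketchEvent["type"]>(
  events: readonly SketchEvent[],
  filter: SketchFilter<T>,
): Extract<SketchEvent, { type: T }>[] {
  const matches = events.filter((event) => {
    if (filter.type !== undefined && event.type !== filter.type) return false;
    if (
      filter.agentId !== undefined &&
      (!("agentId" in event) || event.agentId !== filter.agentId)
    ) {
      return false;
    }
    if (filter.minCostUsd !== undefined && event.costUsd < filter.minCostUsd) return false;
    if (filter.maxCostUsd !== undefined && event.costUsd > filter.maxCostUsd) return false;
    return true;
  });
  // The runtime check on `type` above justifies this narrowing cast.
  return matches as unknown as Extract<SketchEvent, { type: T }>[];
}
```

Calling `queryEventsSketch(events, { type: "agent-turn" })` yields `TurnEvent[]`, so `.turn` is accessible without a cast; omitting `type` yields the full union.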

### Added — Audit Event Schema (Phase 8)

- **New subpath: `@dogpile/sdk/runtime/audit`.** Exports `createAuditRecord(trace)`, `AuditRecord`, `AuditOutcome`, `AuditCost`, `AuditAgentRecord`, and `AuditOutcomeStatus`.
- **`createAuditRecord(trace: Trace): AuditRecord`.** Pure function that derives a versioned, schema-stable audit record from any completed trace. It works on live `RunResult.trace` values and stored/replayed traces without I/O, storage, or provider access.
- **`AuditRecord` standalone type.** The audit schema is independent of `RunEvent` variants and contains `auditSchemaVersion`, `runId`, `intent`, `startedAt`, `completedAt`, `protocol`, `tier`, `modelProviderId`, `agentCount`, `turnCount`, `outcome`, `cost`, `agents`, and optional `childRunIds`.
- **Budget-stop audit outcome.** `AuditOutcome` uses `{ status: "completed" | "budget-stopped" | "aborted"; terminationCode?: string }`; `terminationCode` carries the normalized budget stop reason (`"cost"`, `"tokens"`, `"iterations"`, or `"timeout"`) for budget-stopped runs.
- **Frozen audit record fixture.** `src/tests/fixtures/audit-record-v1.json` records the canonical AuditRecord v1 field order and shallow type shape. Intentional AuditRecord schema changes must update the JSON fixture, companion `audit-record-v1.type-check.ts`, and shape test together.

**Note:** Audit records are not auto-attached to `RunResult`. Callers explicitly invoke `createAuditRecord(result.trace)`.

### Added — OTEL tracing bridge (Phase 9)

- **New subpath: `@dogpile/sdk/runtime/tracing`.** Exports `DogpileTracer`, `DogpileSpan`, `DogpileSpanOptions`, and `DOGPILE_SPAN_NAMES`. Pure-TS, zero runtime dependencies; `@opentelemetry/*` is not imported anywhere in `src/runtime/`, `src/browser/`, or `src/providers/`. The `src/tests/no-otel-imports.test.ts` grep test enforces this boundary.
- **`tracer?: DogpileTracer` on `EngineOptions` and `DogpileOptions`.** When a duck-typed tracer is provided, the SDK emits spans on every run; when absent the run completes with zero span overhead.
- **Four span names emitted under the `dogpile.*` namespace:** `dogpile.run`, `dogpile.sub-run`, `dogpile.agent-turn`, `dogpile.model-call`. Hierarchy: `dogpile.run` → `dogpile.sub-run` → `dogpile.agent-turn` → `dogpile.model-call`.
- **`dogpile.run` span attributes:** `dogpile.run.id`, `dogpile.run.protocol`, `dogpile.run.tier`, `dogpile.run.intent` (truncated to 200 chars), `dogpile.run.outcome` (`completed` / `budget-stopped` / `aborted`), `dogpile.run.cost_usd`, `dogpile.run.turn_count`, `dogpile.run.input_tokens`, `dogpile.run.output_tokens`, and `dogpile.run.termination_reason` for budget-stopped runs.
- **`dogpile.agent-turn` span attributes:** `dogpile.agent.id`, `dogpile.turn.number`, `dogpile.agent.role`, `dogpile.model.id`, `dogpile.turn.cost_usd`, `dogpile.turn.input_tokens`, `dogpile.turn.output_tokens`. `dogpile.turn.number` is derived from a per-agentId counter inside the engine because `TurnEvent` itself has no `turnNumber` field.
- **`dogpile.model-call` span attributes:** `dogpile.model.id`, `dogpile.call.id`, `dogpile.provider.id`, `dogpile.model.input_tokens`, `dogpile.model.output_tokens`, and `dogpile.model.cost_usd` when the provider reports it.
- **Sub-run spans are correctly nested.** Children dispatched by the coordinator protocol appear as descendants of the parent run span via internal `parentSpan` threading on `RunProtocolOptions`; they do not appear as disconnected root traces in OTEL backends.
- **Span status semantics.** `dogpile.run` spans get `setStatus("ok")` for completed runs, including budget-stopped runs with the termination reason captured as an attribute, and `setStatus("error", message)` for aborted or thrown runs. `dogpile.sub-run` spans on `sub-run-failed` events get `setStatus("error", event.error.message)`.
- **Streaming parity.** `stream()` produces the same four span types with the same nesting and attributes as `run()`.
- **Root re-exports.** `DogpileTracer`, `DogpileSpan`, `DogpileSpanOptions` are re-exported as types from `@dogpile/sdk`; `DOGPILE_SPAN_NAMES` is a value-level root re-export.
- **`replay()` and `replayStream()` are tracing-free.** Even when an engine has been configured with a `tracer`, calling `replay()` or `replayStream()` emits no spans; historical timestamps would confuse OTEL backends. See `docs/developer-usage.md` for the recommended user-side bridge pattern.
- **No runtime dependency added.** `@opentelemetry/api` and `@opentelemetry/sdk-trace-base` are devDependencies used only by `src/tests/otel-tracing-contract.test.ts`.

### Added — Metrics / Counters hook (Phase 10)

- **New subpath: `@dogpile/sdk/runtime/metrics`.** Exports `MetricsHook` and `RunMetricsSnapshot`. Pure-TS, zero runtime dependencies. No root re-exports.
- **`metricsHook?: MetricsHook` on `EngineOptions` and `DogpileOptions`.** When provided, `onRunComplete` fires at every terminal state (completed, budget-stopped, aborted) with a `RunMetricsSnapshot`; `onSubRunComplete` fires for each coordinator-dispatched child run. When absent, zero overhead — no allocations.
- **`RunMetricsSnapshot` fields:** `outcome`, `inputTokens`, `outputTokens`, `costUsd`, `totalInputTokens`, `totalOutputTokens`, `totalCostUsd`, `turns`, `durationMs`. Own-only counters exclude nested sub-run tokens; total counters include the full subtree.
- **`logger?: Logger` on `EngineOptions` and `DogpileOptions`.** Routes hook errors to a caller-supplied structured logger; falls back to `console.error` when absent. Uses the existing `Logger` interface from `@dogpile/sdk/runtime/logger`. Enables future engine-level diagnostic logging without another surface change.
- **Async fire-and-forget.** Hook callbacks are `(snapshot) => void | Promise<void>`. Async returns attach `.catch(err => logger.error(...))`. Hook latency never delays run completion.
- **`replay()` and `replayStream()` ignore `metricsHook` entirely.** Consistent with the Phase 9 replay-is-tracing-free invariant.
- **Frozen fixture.** `src/tests/fixtures/metrics-snapshot-v1.json` records the canonical `RunMetricsSnapshot` v1 field order. Companion `metrics-snapshot-v1.type-check.ts` enforces compile-time type fidelity.
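
The fire-and-forget contract for metrics hooks can be sketched as follows. This is an illustration of the pattern, not the engine's code: `SnapshotSketch`, `HookSketch`, `LoggerSketch`, and `fireHook` are hypothetical names, and the snapshot models only two of the listed fields.

```typescript
// Assumed shapes: only `outcome` and `durationMs` are modeled here.
interface SnapshotSketch {
  outcome: "completed" | "budget-stopped" | "aborted";
  durationMs: number;
}

type HookSketch = (snapshot: SnapshotSketch) => void | Promise<void>;

interface LoggerSketch {
  error(message: string, err: unknown): void;
}

// Invoke the hook without awaiting it. Synchronous throws and asynchronous
// rejections are both routed to the logger, and the caller continues at once.
function fireHook(
  hook: HookSketch | undefined,
  snapshot: SnapshotSketch,
  logger: LoggerSketch,
): void {
  if (hook === undefined) return; // no hook configured: nothing to do
  try {
    // Promise.resolve() normalizes both `void` and `Promise<void>` returns.
    void Promise.resolve(hook(snapshot)).catch((err: unknown) =>
      logger.error("metrics hook rejected", err),
    );
  } catch (err) {
    logger.error("metrics hook threw", err);
  }
}
```

Because `fireHook` returns `void` and never awaits, a slow or failing hook cannot delay or fail the run that triggered it.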

### Replay

- **`replay()` synthesizes `model-request` / `model-response` events from `trace.providerCalls`.** The augmented event log returned by `replay()` includes provenance events derived from the canonical `providerCalls` anchor. This ensures provenance fields in replayed results are identical to those in live runs (PROV-02). Older traces without these events in `trace.events` gain them on replay.

## [0.4.0] — 2026-05-01

Recursive coordination — coordinators can now dispatch whole sub-missions via a `delegate` decision, with embedded child traces, propagated budgets/aborts/costs, bounded concurrency with locality clamping, live child-event bubbling on streams, and structured child-failure escalation. See [`docs/recursive-coordination.md`](docs/recursive-coordination.md) for the full surface and a worked example.

### Breaking

- `AgentDecision` is now a discriminated union with required `type: "participate" | "delegate"`. Existing paper-style fields (`selectedRole`, `participation`, `rationale`, `contribution`) are preserved under the `participate` branch. Consumers must narrow on `decision.type === "participate"` before reading paper-style fields. (Phase 1)

### Migration — AgentDecision narrowing (v0.3.x → v0.4.0)

```ts
// v0.3.x
const decision: AgentDecision = await coordinator.run(...);
console.log(decision.selectedRole, decision.contribution);

// v0.4.0
const decision = await coordinator.run(...);
if (decision.type === "participate") {
  console.log(decision.selectedRole, decision.contribution);
} else if (decision.type === "delegate") {
  // new: handle delegated sub-mission
}
```

See [`docs/recursive-coordination.md#agentdecision-narrowing`](docs/recursive-coordination.md#agentdecision-narrowing) for the full discriminator and `delegate`-branch shape.

### Added — `delegate` decision and sub-run traces (Phase 1)

- Coordinator agents may emit `{ type: "delegate", protocol, intent, model?, budget? }` to dispatch a sub-mission as part of the plan turn. Phase 1 of v0.4.0 enables delegation from the coordinator's plan turn only; worker delegation and final-synthesis-turn delegation are rejected with `invalid-configuration`. (Phase 1)
- New `RunEvent` variants: `sub-run-started`, `sub-run-completed`, `sub-run-failed`. `sub-run-completed` carries the full child `RunResult` (including embedded `Trace`); `sub-run-failed` carries `error` and `partialTrace`. `sub-run-started` carries `{ childRunId, parentRunId, parentDecisionId, protocol, intent, depth }` plus `recursive: true` when the dispatching protocol and the delegated protocol are both `coordinator`. (Phase 1)
- Synthetic transcript entries record sub-run results with `agentId: "sub-run:<childRunId>"` and `role: "delegate-result"`. The next coordinator plan prompt receives a tagged `[sub-run <childRunId>]: <output>\n[sub-run <childRunId> stats]: turns=<N> costUsd=<X> durationMs=<Y>` block (D-17). (Phase 1)
- `maxDepth` option on `DogpileOptions` and `EngineOptions` (default `4`); `Engine.run` and `Engine.stream` accept an optional second-argument `RunCallOptions` that can only LOWER the engine ceiling — `effectiveMaxDepth = Math.min(engineMaxDepth, runOptions.maxDepth ?? Infinity)`. Depth overflow is enforced at both the parser (`parseDelegateDecision`) and the dispatcher (`dispatchDelegate`); both throw `invalid-configuration` with `detail.reason: "depth-overflow"` and `detail.path: "decision.protocol"`. (Phase 1)
- New public type `RunCallOptions` is re-exported through `@dogpile/sdk` and `@dogpile/sdk/types`. (Phase 1)
- Fenced-JSON delegate parsing convention added to `parseAgentDecision` (no new tool surface — delegate is a parser-level concern). Coordinator runs accept a `delegate:` prefix followed by a fenced ```json block. (Phase 1)
- `Dogpile.replay()` rehydrates embedded sub-run traces without provider invocation; the new `recomputeAccountingFromTrace` helper verifies recorded child `RunAccounting` against a per-child recompute and throws `invalid-configuration` with `detail.reason: "trace-accounting-mismatch"` and `detail.field` identifying the offending numeric field on tamper. The eight enumerated comparable numeric fields are `cost.usd`, `cost.inputTokens`, `cost.outputTokens`, `cost.totalTokens`, `usage.usd`, `usage.inputTokens`, `usage.outputTokens`, `usage.totalTokens`. Top-level parent drift is reported with `eventIndex: -1`; child drift is reported with the offending event's index plus `childRunId`. (Phase 1)
- New `ReplayTraceProtocolDecisionType` literals: `start-sub-run`, `complete-sub-run`, `fail-sub-run`. (Phase 1)
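
Two of the Phase 1 rules above are easy to state as code: the `maxDepth` ceiling and the tagged transcript block handed to the next plan prompt. The sketch below restates them with hypothetical helper names (`effectiveMaxDepth`, `formatSubRunBlock`); how the SDK formats numbers inside the stats line is an assumption.

```typescript
const DEFAULT_MAX_DEPTH = 4; // default stated in the changelog

// Per-run options can only lower the engine ceiling, never raise it.
function effectiveMaxDepth(
  engineMaxDepth: number = DEFAULT_MAX_DEPTH,
  runMaxDepth?: number,
): number {
  return Math.min(engineMaxDepth, runMaxDepth ?? Infinity);
}

interface SubRunStats {
  turns: number;
  costUsd: number;
  durationMs: number;
}

// Builds the two-line tagged block described for the coordinator plan prompt.
// Plain template-literal number formatting is assumed here.
function formatSubRunBlock(childRunId: string, output: string, stats: SubRunStats): string {
  return (
    `[sub-run ${childRunId}]: ${output}\n` +
    `[sub-run ${childRunId} stats]: turns=${stats.turns} costUsd=${stats.costUsd} durationMs=${stats.durationMs}`
  );
}
```

For example, `effectiveMaxDepth(4, 2)` is `2`, while `effectiveMaxDepth(4, 10)` stays at `4` because the per-run value cannot raise the ceiling.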

### Added — Budget, cancellation, cost roll-up (Phase 2)

The four BUDGET-* requirements ship together as a single coherent surface for safely
running recursive coordinator delegations under shared deadlines, abortable cancellation,
and reconciled cost accounting.

#### Cancellation propagation (BUDGET-01)

- Parent abort propagates to all in-flight sub-runs via a per-child derived `AbortController`. Aborted children carry `detail.reason: "parent-aborted"` on `code: "aborted"` errors.
- New trace event `sub-run-parent-aborted` (exported as TS type `SubRunParentAbortedEvent`) marks parent aborts that land after a sub-run completes; observable on `Dogpile.stream()` subscribers when stream teardown timing permits. New `ReplayTraceProtocolDecisionType` literal `mark-sub-run-parent-aborted`.

#### Timeout / deadline propagation (BUDGET-02)

- Parent `budget.timeoutMs` is now a true tree-wide deadline. Children inherit `parentDeadline − now` as their default timeout.
- Per-decision `budget.timeoutMs` exceeding the parent's remaining is **clamped** (no longer throws), and the parent trace gains a `sub-run-budget-clamped` event (exported as TS type `SubRunBudgetClampedEvent`) recording the requested vs clamped values. Parent timeouts surface on the child as `code: "aborted"` with `detail.reason: "timeout"`. New `ReplayTraceProtocolDecisionType` literal `mark-sub-run-budget-clamped`.
- New `defaultSubRunTimeoutMs` engine option on `createEngine`, `Dogpile.pile`, `run`, and `stream` — fallback ceiling applied only when neither parent nor decision specifies a timeout. Precedence: `decision.budget.timeoutMs` > parent's remaining deadline > `defaultSubRunTimeoutMs` > undefined.
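
The timeout precedence and clamping rules in BUDGET-02 can be read as one small resolution function. The sketch below is one interpretation of how the two rules combine, not the engine's implementation; `TimeoutInputs`, `ResolvedTimeout`, and `resolveChildTimeout` are hypothetical names.

```typescript
interface TimeoutInputs {
  decisionTimeoutMs?: number; // decision.budget.timeoutMs
  parentRemainingMs?: number; // parentDeadline - now, when the parent has a deadline
  defaultSubRunTimeoutMs?: number; // engine-level fallback
}

interface ResolvedTimeout {
  timeoutMs: number | undefined;
  clamped: boolean; // true when a sub-run-budget-clamped event would be warranted
}

function resolveChildTimeout(input: TimeoutInputs): ResolvedTimeout {
  const { decisionTimeoutMs, parentRemainingMs, defaultSubRunTimeoutMs } = input;

  if (decisionTimeoutMs !== undefined) {
    // A per-decision timeout wins, but never beyond the parent's remaining time.
    if (parentRemainingMs !== undefined && decisionTimeoutMs > parentRemainingMs) {
      return { timeoutMs: parentRemainingMs, clamped: true };
    }
    return { timeoutMs: decisionTimeoutMs, clamped: false };
  }

  if (parentRemainingMs !== undefined) {
    // No decision timeout: inherit whatever the parent has left.
    return { timeoutMs: parentRemainingMs, clamped: false };
  }

  // Neither parent nor decision set a timeout: use the fallback, if any.
  return { timeoutMs: defaultSubRunTimeoutMs, clamped: false };
}
```

Under this reading, a decision asking for 5000 ms when the parent has 2000 ms left resolves to 2000 ms with `clamped: true`, and the fallback only applies when both other inputs are absent.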

#### Cost & token roll-up + replay parity (BUDGET-03)

- `sub-run-failed` events carry `partialCost: CostSummary` reflecting real provider spend before the failure. The parent's `accounting.cost` and token totals include failed-child partial costs recursively.
- Parent rolls up child cost (`subResult.cost` for completed, `partialCost` for failed) into its own totals **before** the corresponding `sub-run-completed` / `sub-run-failed` event is emitted, preserving the existing "last cost-bearing event === final.cost" invariant.
- `Dogpile.replay()` now detects parent-rollup drift — if a saved trace's child `subResult.cost` disagrees with `subResult.accounting.cost`, or a `sub-run-failed.partialCost` disagrees with the cost implied by its `partialTrace`, or Σ children exceeds the parent's recorded total, replay throws `DogpileError({ code: "invalid-configuration", detail: { reason: "trace-accounting-mismatch", subReason: "parent-rollup-drift" } })` with `detail.field` identifying the offending numeric field.

#### Termination floors (BUDGET-04)

- Internal contract guarantee (no public-surface delta): parent termination policies (`budget`, `convergence`, `judge`, `firstOf`) operate over parent-level events / iterations only — child agent-turn events bubbled into the parent stream do not count toward parent iteration limits, and `minTurns`/`minRounds` floors are per-protocol-instance (parent and child read their own protocol config independently). One `sub-run-completed` counts as exactly one parent iteration via the synthetic `delegate-result` transcript entry. Locked by contract tests in `src/tests/budget-first-stop.test.ts` and `src/runtime/coordinator.test.ts`.

### Added — Provider locality and bounded concurrency (Phase 3)

The PROVIDER-* and CONCURRENCY-* requirements ship together so recursive
coordinator runs can safely fan out work while protecting local model providers
from accidental self-inflicted overload.

#### Provider locality (PROVIDER-01..03)

- `ConfiguredModelProvider.metadata?.locality?: "local" | "remote"` is an optional readonly hint used by coordinator concurrency clamping. Omitted metadata is treated as remote for clamping.
- `createOpenAICompatibleProvider` auto-detects `metadata.locality` from `baseURL`: loopback (`localhost`, `127/8`, `::1`), RFC1918 (`10/8`, `172.16/12`, `192.168/16`), IPv4 link-local (`169.254/16`), IPv6 ULA (`fc00::/7`), IPv6 link-local (`fe80::/10`), and `*.local` mDNS hostnames classify as `"local"`.
- Caller `locality: "local"` always wins. Caller `locality: "remote"` on a detected-local OpenAI-compatible host now throws `DogpileError({ code: "invalid-configuration", detail: { reason: "remote-override-on-local-host" } })` so a localhost Ollama-style endpoint cannot silently bypass the local clamp.
- `classifyHostLocality(host)` is exported from the OpenAI-compatible provider module for advanced callers and tests.
- Provider locality is validated at adapter construction time and at engine run start, including custom provider objects that bypass TypeScript.
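
The locality ranges listed above translate into a short classifier. The following is an independent sketch of those rules, not the source of `classifyHostLocality`: the names `classifyHostSketch` and `resolveLocality` are hypothetical, the IPv6 handling only inspects the first hextet, and a plain `Error` stands in for `DogpileError`.

```typescript
type Locality = "local" | "remote";

function classifyHostSketch(rawHost: string): Locality {
  // Normalize case and strip the brackets used around IPv6 literals in URLs.
  const host = rawHost.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".local")) return "local";

  const v4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (v4) {
    const a = Number(v4[1]);
    const b = Number(v4[2]);
    if (a === 127) return "local"; // loopback 127/8
    if (a === 10) return "local"; // RFC1918 10/8
    if (a === 172 && b >= 16 && b <= 31) return "local"; // RFC1918 172.16/12
    if (a === 192 && b === 168) return "local"; // RFC1918 192.168/16
    if (a === 169 && b === 254) return "local"; // link-local 169.254/16
    return "remote";
  }

  if (host.includes(":")) {
    if (host === "::1") return "local"; // IPv6 loopback
    const first = parseInt(host.split(":")[0] ?? "", 16);
    if ((first & 0xfe00) === 0xfc00) return "local"; // ULA fc00::/7
    if ((first & 0xffc0) === 0xfe80) return "local"; // link-local fe80::/10
  }
  return "remote";
}

// Caller "local" always wins; caller "remote" on a detected-local host is rejected.
function resolveLocality(detected: Locality, callerOverride?: Locality): Locality {
  if (callerOverride === "remote" && detected === "local") {
    throw new Error("invalid-configuration: remote-override-on-local-host");
  }
  return callerOverride ?? detected;
}
```

Rejecting the `remote` override instead of honoring it is what keeps a localhost endpoint from sidestepping the concurrency clamp described in the next section.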

#### Bounded child concurrency (CONCURRENCY-01)

- `maxConcurrentChildren` config is available on `createEngine`, `Dogpile.pile` / `run` / `stream`, and per coordinator `delegate` decision. Default is `4`; effective concurrency is `min(engine, run ?? Infinity, decision ?? Infinity)`, so per-run and per-decision values can only lower the engine ceiling. Values must be positive integers.
- Coordinator agents can fan out up to 8 delegates in one plan turn by returning a fenced JSON array of delegate decisions. Mixed `participate` plus `delegate` remains invalid.
- New `RunEvent` variant `sub-run-queued` records delegates that waited for a concurrency slot. No-pressure runs do not emit queued events; pressure runs follow `sub-run-queued` → `sub-run-started` → `sub-run-completed` / `sub-run-failed`.
- `parentDecisionArrayIndex: number` was added to `sub-run-queued`, `sub-run-started`, `sub-run-completed`, and `sub-run-failed` so a delegate is uniquely identified by `parentDecisionId` plus its array index without changing the existing `parentDecisionId` format.
- When one fan-out delegate fails, in-flight siblings continue and queued siblings are drained with synthetic `sub-run-failed` events using `error.code: "aborted"`, `error.detail.reason: "sibling-failed"`, and zero `partialCost`.
- Delegate result transcript entries are appended in completion order under fan-out; replay determinism is preserved by stable `parentDecisionId` plus `parentDecisionArrayIndex`.
- New `ReplayTraceProtocolDecisionType` literal `queue-sub-run` pairs with `sub-run-queued` events.

#### Local-provider clamp (CONCURRENCY-02)

- New `RunEvent` variant `sub-run-concurrency-clamped` is emitted once per coordinator run when any active provider declares `metadata.locality === "local"`. Payload is `{ requestedMax, effectiveMax: 1, reason: "local-provider-detected", providerId }`.
- Local-provider detection walks the active tree at each delegate fan-out: the parent `options.model` first, then future-compatible `agent.model` entries if present. The first local provider id is recorded on the clamp event.
- Effective child concurrency is silently clamped to 1 for local providers regardless of caller config, including explicit `maxConcurrentChildren: 8`. The clamp does not throw and does not write to console; the event is the warning surface.
- The clamp-emitted flag is scoped to the individual run, so concurrent runs do not suppress each other's `sub-run-concurrency-clamped` event.
- New `ReplayTraceProtocolDecisionType` literal `mark-sub-run-concurrency-clamped` pairs with `sub-run-concurrency-clamped` events.
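
Taken together, CONCURRENCY-01 and CONCURRENCY-02 describe a two-step computation: take the minimum of the configured ceilings, then force the result to 1 if any active provider is local. A minimal sketch under those stated rules follows; `ConcurrencyInputs`, `ConcurrencyResult`, and `effectiveChildConcurrency` are hypothetical names.

```typescript
type ProviderLocality = "local" | "remote" | undefined;

interface ConcurrencyInputs {
  engineMax?: number; // engine-level maxConcurrentChildren (default 4)
  runMax?: number; // per-run override, can only lower the ceiling
  decisionMax?: number; // per-decision override, can only lower the ceiling
  providerLocalities: readonly ProviderLocality[]; // undefined is treated as remote
}

interface ConcurrencyResult {
  requestedMax: number;
  effectiveMax: number;
  localProviderDetected: boolean;
}

function effectiveChildConcurrency(input: ConcurrencyInputs): ConcurrencyResult {
  const requestedMax = Math.min(
    input.engineMax ?? 4,
    input.runMax ?? Infinity,
    input.decisionMax ?? Infinity,
  );
  const localProviderDetected = input.providerLocalities.some((l) => l === "local");
  return {
    requestedMax,
    // A local provider forces serial child execution regardless of config.
    effectiveMax: localProviderDetected ? 1 : requestedMax,
    localProviderDetected,
  };
}
```

The `requestedMax` and `effectiveMax` pair mirrors the payload of the `sub-run-concurrency-clamped` event, which is the only signal that the clamp took effect.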
|
|
163
|
+
|
|
164
|
+
#### Public-surface tests
|
|
165
|
+
|
|
166
|
+
- `src/tests/event-schema.test.ts` now locks 17 run event variants, including `sub-run-queued` and `sub-run-concurrency-clamped`.
|
|
167
|
+
- `src/tests/result-contract.test.ts` verifies the new public event types are reachable from the root `@dogpile/sdk` type re-exports.
|
|
168
|
+
- `src/tests/config-validation.test.ts` locks invalid locality and `maxConcurrentChildren` validation.
|
|
169
|
+
- `src/tests/cancellation-contract.test.ts` locks public detail/reason strings including `sibling-failed`, `local-provider-detected`, and `remote-override-on-local-host`.
|
|
170
|
+
- `src/runtime/coordinator.test.ts` covers fan-out queuing, completion-order transcript behavior, sibling-failed queue drain, local-provider clamp-once behavior, remote-only no-op behavior, explicit override clamping, and per-run clamp isolation.
|
|
171
|
+
|
|
172
|
+
### Added — Streaming and child error escalation (Phase 4)
|
|
173
|
+
|
|
174
|
+
The STREAM-* and ERROR-* requirements ship together so live consumers can demux
|
|
175
|
+
child activity, parent cancellation closes delegated work, and terminal child
|
|
176
|
+
failures surface as stable public `DogpileError` instances.
|
|
177
|
+
|
|
178
|
+
- **`parentRunIds: readonly string[]` on stream events.** Every `StreamLifecycleEvent` and `StreamOutputEvent` variant accepts an optional root-to-immediate-parent ancestry chain for live bubbled child events. The chain is not persisted into parent `RunResult.events`; `replayStream()` reconstructs it from embedded child traces.
|
|
179
|
+
- **New aborted lifecycle event.** Parent streams emit `{ type: "aborted", runId, at, reason: "parent-aborted" | "timeout", detail? }` before terminal `error` events on abort paths, including parent-aborted-after-completion cases with no synthetic child failure to drain.
|
|
180
|
+
- **`onChildFailure?: "continue" | "abort"` config option.** Engine-level and per-run surfaces accept the option; default `"continue"` preserves coordinator retry/redirect behavior. `"abort"` skips the follow-up plan turn after the first real child failure and re-throws the snapshotted triggering failure.
|
|
181
|
+
- **Optional `detail.source?: "provider" | "engine"` on `provider-timeout` errors.** OpenAI-compatible HTTP timeout responses set `"provider"`; child engine deadlines set `"engine"`. Backwards-compat: absent `detail.source` means `"provider"`. Parent-budget propagation remains `code: "aborted"` with `detail.reason: "timeout"`.
|
|
182
|
+
- **Coordinator prompt structured failure roster.** The next coordinator plan prompt includes `## Sub-run failures since last decision` with a JSON array of real child failures from the latest dispatch wave: `{ childRunId, intent, error: { code, message, detail.reason? }, partialCost: { usd } }`. Synthetic `sibling-failed` / `parent-aborted` bookkeeping failures and `partialTrace` are intentionally excluded.
|
|
183
|
+
- **Cancel-during-fan-out drain.** `StreamHandle.cancel()` drains active children before terminal stream error: in-flight children emit synthetic `sub-run-failed` with `error.detail.reason: "parent-aborted"` and queued children retain `sibling-failed`; late events from drained children are suppressed at the parent stream boundary.
- **Terminate-without-final throw rule clarified.** "Original DogpileError unwrapped" means the child's own thrown `DogpileError`, not a wrapper, and not the first failure chronologically. Budget and abort-mode terminal paths re-throw the last real child failure by event order, excluding synthetic `sibling-failed` and `parent-aborted` entries. Explicit cancel/abort wins and throws the cancel error verbatim.
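
The failure-handling rules in the entries above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SDK's implementation: the `ChildError` and `ChildFailure` types and the helpers `isSynthetic`, `timeoutSource`, `lastRealFailure`, and `buildFailureRoster` are hypothetical names defined locally. The sketch also assumes that synthetic bookkeeping failures can be recognized by `error.detail.reason` alone, and that the roster nests `reason` under `detail`.

```typescript
// Hypothetical local types mirroring the shapes named in the changelog entries.
// They are illustrative only and are not imported from @dogpile/sdk.
interface ChildError {
  code: string;
  message: string;
  detail?: { reason?: string; source?: "provider" | "engine" };
}

interface ChildFailure {
  childRunId: string;
  intent: string;
  error: ChildError;
  partialCost: { usd: number };
  partialTrace?: unknown;
}

// Assumption: synthetic bookkeeping failures are recognizable by detail.reason.
function isSynthetic(failure: ChildFailure): boolean {
  const reason = failure.error.detail?.reason;
  return reason === "sibling-failed" || reason === "parent-aborted";
}

// An absent detail.source on a provider-timeout error is read as "provider".
function timeoutSource(error: ChildError): "provider" | "engine" | undefined {
  if (error.code !== "provider-timeout") return undefined;
  return error.detail?.source ?? "provider";
}

// Last real failure by event order; synthetic entries are skipped.
function lastRealFailure(failures: readonly ChildFailure[]): ChildFailure | undefined {
  const real = failures.filter((f) => !isSynthetic(f));
  return real[real.length - 1];
}

// Roster entries keep only the fields listed in the changelog; partialTrace is dropped.
function buildFailureRoster(failures: readonly ChildFailure[]) {
  return failures
    .filter((f) => !isSynthetic(f))
    .map(({ childRunId, intent, error, partialCost }) => {
      const reason = error.detail?.reason;
      return {
        childRunId,
        intent,
        error: {
          code: error.code,
          message: error.message,
          ...(reason !== undefined ? { detail: { reason } } : {}),
        },
        partialCost: { usd: partialCost.usd },
      };
    });
}

// Made-up dispatch wave, listed in event order.
const waveFailures: ChildFailure[] = [
  {
    childRunId: "child-1",
    intent: "fetch",
    error: { code: "provider-timeout", message: "HTTP timeout" },
    partialCost: { usd: 0.01 },
    partialTrace: { events: [] },
  },
  {
    childRunId: "child-2",
    intent: "summarize",
    error: { code: "provider-timeout", message: "deadline hit", detail: { source: "engine" } },
    partialCost: { usd: 0.02 },
  },
  {
    childRunId: "child-3",
    intent: "upload",
    error: { code: "aborted", message: "never started", detail: { reason: "sibling-failed" } },
    partialCost: { usd: 0 },
  },
];

const terminal = lastRealFailure(waveFailures);
const roster = buildFailureRoster(waveFailures);
console.log(terminal?.childRunId, JSON.stringify(roster, null, 2));
```

In this sketch the queued `child-3` entry is treated as bookkeeping, so it is neither a candidate for the terminal re-throw nor part of the roster that would be serialized into the next coordinator prompt.
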
### Added — Documentation and runnable example (Phase 5)
- **`docs/recursive-coordination.md`** — new dedicated docs page: concepts, propagation rules, `parentRunIds` chain, structured failures, replay parity, "Not in v0.4.0" deferrals, canonical worked example. (Phase 5)
- **`docs/recursive-coordination-reference.md`** — new exhaustive reference page: every `sub-run-*` event payload, every `detail.reason` value, every `RunCallOptions` field, every `DogpileError` `code`/`detail.reason` combo from v0.4.0, replay-drift error matrix, provider locality classification table. (Phase 5)
- **`docs/developer-usage.md`** — new "Recursive coordination" section with maintenance comment cross-linking the dedicated pages. (Phase 5)
- **`docs/reference.md`** — augmented with v0.4.0 exports (`RunCallOptions`, the seven `SubRun*Event` types, `classifyHostLocality`, `recomputeAccountingFromTrace`, new `ReplayTraceProtocolDecisionType` literals) and cross-links to the dedicated reference page. (Phase 5)
- **`README.md` "Choose Your Path"** — new row pointing at `delegate` and `docs/recursive-coordination.md`. (Phase 5)
- **`examples/recursive-coordination/`** — new runnable example using the deterministic provider by default and `createOpenAICompatibleProvider` in live mode. Reuses the Hugging Face upload GUI mission verbatim and wraps it in a coordinator-with-delegate. Demonstrates all v0.4.0 surfaces: parentRunIds chain, intentionally-failing child with `partialCost`, structured failures in the next coordinator turn, locality-driven concurrency clamp. (Phase 5)
- **`examples/README.md`** — index entry mirroring the huggingface-upload-gui section format. (Phase 5)
- **`AGENTS.md` + `CLAUDE.md`** — cross-cutting-invariants list mirrors a recursive-coordination public-surface entry. (Phase 5)
- Prepared the release identity for `@dogpile/sdk@0.4.0` and `dogpile-sdk-0.4.0.tgz`. (Phase 5)
### Notes
- No package `exports` / `files` change. All new public types ship through the existing `@dogpile/sdk` root entry. `recomputeAccountingFromTrace` and the depth-gate helpers (`assertDepthWithinLimit`, `depthOverflowError`) remain runtime-internal.
- Phase 1 does not include cost-cap propagation, parent-timeout propagation to children with no caller-set timeout, child-event bubbling into the parent stream, or worker-side delegation; those land in v0.4.0 Phases 2–4. Phase 1 keeps event ordering schema-stable for the future Phase 4 child-event-bubbling addition.
- Documentation pages (`docs/recursive-coordination*.md`) and example artifacts (`examples/recursive-coordination/`) are repository-only — neither is added to `package.json` `files`. Released tarball payload is unchanged. (Phase 5)
## 0.3.1
- Prepared the patch release identity for `@dogpile/sdk@0.3.1` and `dogpile-sdk-0.3.1.tgz`.
package/README.md
CHANGED
@@ -73,6 +73,7 @@ it.
| Reuse fixed settings across many runs | `Dogpile.createEngine({ protocol, tier, model })` |
| Save and reload a completed run | Persist `result.trace`, then call `Dogpile.replay(trace)` |
| Use direct HTTP with OpenAI-compatible servers | `createOpenAICompatibleProvider(options)` |
| Run a coordinator that fans out into other Dogpile runs | `delegate` decision; [`docs/recursive-coordination.md`](docs/recursive-coordination.md) |
## Install