@loops-adk/core 0.1.1 → 0.3.0

package/README.md CHANGED
@@ -1,6 +1,18 @@
- # loops
-
- **Stop prompting agents. Write the loop that prompts them. Make "done" mean _converged_, not _claimed_.**
+ <p align="center">
+ <img src="assets/logo.png" alt="loops" width="320">
+ </p>
+
+ <p align="center">
+ <strong>Stop prompting agents. Write the loop that prompts them. Make "done" mean <em>converged</em>, not <em>claimed</em>.</strong>
+ </p>
+
+ <p align="center">
+ <a href="https://www.npmjs.com/package/@loops-adk/core"><img src="https://img.shields.io/npm/v/@loops-adk/core" alt="npm"></a>
+ <img src="https://img.shields.io/badge/status-alpha-orange" alt="status: alpha">
+ <img src="https://img.shields.io/badge/TypeScript-strict-3178c6" alt="TypeScript">
+ <img src="https://img.shields.io/badge/node-%3E%3D20-3c873a" alt="node &gt;=20">
+ <img src="https://img.shields.io/badge/license-MIT-blue" alt="license: MIT">
+ </p>
 
  `loops` is a small, nestable library for running an agent in a convergence loop. The loop finds the work, hands it to an agent, checks the result, records what it learned, and goes again until a gate _you_ define says the work is finished. You write the loop once and it drives the agent, rather than prompting the agent by hand. Compose loops and DAGs both ways, run them against any model behind a one-method `Engine`, and watch a run in a live terminal UI.
 
@@ -8,10 +20,31 @@ Every iteration runs with a **fresh context**, so a long run never rots. Progres
 
  Where most "agent memory" recalls a _conversation_, this keeps your _decisions_ consistent across long work. No vector database, no embeddings, no index to sync or let go stale. **Git is the memory.**
 
- ![status: alpha](https://img.shields.io/badge/status-alpha-orange)
- ![TypeScript](https://img.shields.io/badge/TypeScript-strict-3178c6)
- ![node >=20](https://img.shields.io/badge/node-%3E%3D20-3c873a)
- ![license MIT](https://img.shields.io/badge/license-MIT-blue)
+ ## The fastest proof
+
+ A downstream agent had to preserve one upstream decision: snapshots must start
+ with the exact wire tag `SSv1|`. That decision lived only in a git commit body,
+ not in the source files or the downstream task prompt. The commit was not just a
+ fact store; it was the thread back through the journey, what was decided, why it
+ was decided, and what downstream work had to honour.
+
+ | Runner | What it could read | Result |
+ | --- | --- | --- |
+ | Memoryless graph | files plus task prompt | 0/10 preserved the contract |
+ | Loops Ledger | gated commit bodies plus grounding | 9/10 preserved the contract |
+ | Raw git dump | full git log pasted into every prompt | 10/10 on a toy log, not a real-repo operating mode |
+
+ That is the honest shape of the claim. Loops is not just `git log`: it is the
+ deterministic enforcement layer that makes agents write useful commit bodies when
+ work converges, then the grounding layer that reads those verified reasons back
+ into later fresh contexts. The value is not bare recall. A fresh agent can pull
+ on one thread and reconstruct how and why the repository got here. Full-log dump
+ is a useful sanity check on tiny histories, but on a repo with significant
+ history it is context rot and cost.
+
+ ```bash
+ npm run bench:compare
+ ```
 
  ```ts
  import { loop, agentJob, commandSucceeds, agentCheck } from '@loops-adk/core';
@@ -110,7 +143,7 @@ A loop is easy to start and hard to keep honest. Four parts decide whether it ea
  | **The gate.** Knowing the work is actually done, not just that the agent stopped. | A deterministic check (`commandSucceeds`) and a separate judge (`agentCheck`) in its own context, hardened with a k-of-n `quorum` and a geometric-mean rubric so one weak dimension sinks the verdict. The model that did the work never grades it. |
  | **Memory.** Carrying what was learned across a run without dragging a transcript along. | The git commit log is the memory: a structured handoff per milestone, read back before the next turn. No `STATE.md` the model is trusted to keep tidy, no vector store to sync. |
  | **Parallelism.** Running several agents without collisions on the same files. | `isolation: 'worktree'` gives each writer its own branch and worktree, landed back on pass with a `--no-ff` merge. |
- | **Hard stops.** Bounding a loop so it cannot run forever or empty your account. | `max` caps iterations and `budget` caps tokens, a non-retryable stop the engine calls refuse to cross. |
+ | **Hard stops.** Bounding a loop so it cannot run forever or empty your account. | `max` caps iterations, `budget` caps tokens (a non-retryable stop the engine calls refuse to cross), and `noProgress` stalls out a loop whose iterations reach no new state, with the evidence on the outcome. |
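
The gate row in this table mentions a geometric-mean rubric "so one weak dimension sinks the verdict". As a rough illustration of why that choice matters, the sketch below compares an arithmetic and a geometric mean over a made-up rubric. The `Rubric` type, the dimension names, and the 0.6 pass bar are assumptions for illustration only; none of them are taken from the `@loops-adk/core` API.

```typescript
// Hypothetical rubric: dimension name -> score in [0, 1]. Not the library's type.
type Rubric = Record<string, number>;

// Plain average: a strong dimension can hide a weak one.
function arithmeticMean(scores: number[]): number {
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}

// Geometric mean computed through logarithms; any zero score forces the result to 0.
function geometricMean(scores: number[]): number {
  if (scores.length === 0) throw new Error('rubric needs at least one dimension');
  if (scores.some((s) => s <= 0)) return 0;
  const logSum = scores.reduce((sum, s) => sum + Math.log(s), 0);
  return Math.exp(logSum / scores.length);
}

// Two strong dimensions and one weak one (illustrative numbers).
const rubric: Rubric = { correctness: 0.95, tests: 0.9, security: 0.2 };
const scores = Object.values(rubric);
const passBar = 0.6; // assumed threshold, for the comparison only

console.log('arithmetic passes:', arithmeticMean(scores) >= passBar);
console.log('geometric passes:', geometricMean(scores) >= passBar);
```

With these numbers the arithmetic mean clears the assumed bar while the geometric mean falls well below it, which is the behaviour the README describes: the weak `security` score drags the whole verdict down instead of being averaged away.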
 
  Three things `loops` does that most loop libraries do not:
 
@@ -151,6 +184,7 @@ loops run \
  ```bash
  loops validate examples/confidence-gate.loop.ts # offline pre-flight: load + print the shape, no model calls
  loops describe examples/confidence-gate.loop.ts # print the loop's shape (gate, body, nodes) without running
+ loops describe examples/confidence-gate.loop.ts --json # machine-readable shape for agents
  loops run examples/confidence-gate.loop.ts # live Ink TUI
  loops run examples/confidence-gate.loop.ts --no-tui # plain streamed logs
  loops run examples/confidence-gate.loop.ts --json # NDJSON event stream
@@ -160,6 +194,19 @@ loops run examples/confidence-gate.loop.ts --json # NDJSON event stream
 
  **Authoring is agent-native.** Both commands work from any repo, including one that consumes `loops` as a submodule or dependency (the recipe's folder just needs an ES module scope, which such repos already have). `loops validate <file>` is the cheap, no-model pre-flight an agent runs before `loops run`: it loads the loop, reports a fix-oriented error if anything is wrong, and prints the loop's shape (its gate, body, and dag nodes), all without spending a single agent turn. `loops describe <file>` prints that same shape on its own, so an agent can see exactly what it just authored. The authoring guide an agent reads to compose a loop is [`skills/author-loop/SKILL.md`](skills/author-loop/SKILL.md).
 
+ The end-to-end agent workflow, from authoring through reading a supervised run's decisions back as structured records rather than a raw event stream:
+
+ ```bash
+ loops validate feature.loop.ts --json # pre-flight: loads, no spend
+ loops describe feature.loop.ts --json # the shape, incl. each agent node's contract
+ loops run feature.loop.ts --no-tui --supervise # run it, registered for observation
+ loops list # find the runId
+ loops tail <runId> # follow live events
+ loops records <runId> --kind revision --path ship/implementation --json # the semantic decision stream, filtered
+ ```
+
+ Two supervision skills go deeper: [`skills/supervise-loop-run/SKILL.md`](skills/supervise-loop-run/SKILL.md) (monitor a run) and [`skills/design-agent-team/SKILL.md`](skills/design-agent-team/SKILL.md) (compose a specialist team).
+
  **Offline demo** (no network, no key; uses the mock engine):
 
  ```bash
@@ -194,6 +241,7 @@ loop({
  stopOn, // hard early-exit each iteration; met ⇒ aborted
  review, // runs when until is met; non-pass re-enters the loop (folds back as ctx.lastReview)
  max, // iteration cap; reached without passing ⇒ exhausted
+ noProgress, // stall out after n consecutive iterations with no observable progress
  maxReviewRestarts, // cap the worker/reviewer standoff independently of max
  delayMs, // delay between iterations (polling); interruptible by abort
  retry, // { onError: 'continue' | 'fail', maxConsecutive?, backoffMs? }
@@ -232,6 +280,30 @@ agentCheck({
 
  **Builders:** `predicate`, `bodyPassed`, `minConfidence`, `commandSucceeds` (a shell command exits 0), `all`, `any`, `not`, `quorum` (k-of-n), `agentCheck` (small-model judge), `always`, `never`, and `gateJob` (lift a condition into a `Job`, e.g. a reviewer).
 
+ ## No progress: the third hard stop
+
+ The gate detects success; nothing above detects a loop that is failing to converge. `max` bounds the attempt count and `budget` bounds the cost, but both fire only after the waste, and neither can tell slow-but-real convergence from the same failure five turns running. `noProgress` is that sensor: the loop ends `exhausted` once `n` consecutive iterations reach no state the run has not already seen.
+
+ ```ts
+ loop({
+ name: 'build',
+ body: agentJob({ prompt: '…', ground: true }),
+ until: commandSucceeds('npm', ['test']),
+ max: 50, // generous runway for hard work…
+ noProgress: 3, // …because the doomed case exits after 3 flat iterations
+ });
+ ```
+
+ Progress means **novelty**, not change. An iteration counts as progress when any evidence channel reaches something new:
+
+ - **the workspace fingerprint** (HEAD, pending diff, untracked content) is a state this run has never visited, so an agent oscillating A→B→A gets no credit for the return trip;
+ - **the gate confidence** beats its previous best by `minConfidenceDelta` (default 0.02), a high-water mark, so judge jitter is not progress but slow steady improvement accumulates until it clears the bar;
+ - **a custom `signal`** returns a value not already seen, the escape hatch for progress the worktree cannot show (a queue length, a passing-test count): `noProgress: { window: 3, signal: (ctx) => queueDepth() }`.
+
+ The default is conservative: one channel showing novelty keeps the loop alive, so real-but-slow work is never cut short. And the exit is a diagnosis, not just a stop: the outcome carries `Outcome.stall` (the flat iterations, the repeated gate reason, the per-channel evidence) and a `loop:stall` event fires for supervisors, so "stalled since iteration 5 on the same scope error" replaces "reached max iterations" and a fleet watcher can re-brief the loop instead of shrugging at it. This is also what makes a generous `max` safe to grant: the safety net and the runway stop being the same number.
+
+ Off by default, like `commit`: a polling loop legitimately makes no progress until the outside world changes. Flags mode: `--stall-after <n>`. Offline demo: `npm run example:stall`.
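
One way to read the novelty rule in this section is as a small state machine over three evidence channels. The sketch below is a minimal model of that rule, not the library's implementation: `createStallDetector`, `IterationEvidence`, and every field name are hypothetical, and the fingerprint is just an opaque string standing in for whatever the real workspace hash is.

```typescript
// Hypothetical per-iteration evidence; names are illustrative, not the library's types.
interface IterationEvidence {
  fingerprint: string; // stands in for a hash of HEAD, pending diff, untracked content
  confidence?: number; // gate confidence for this iteration, if the gate reports one
  signal?: string; // value of a custom progress signal, if one is configured
}

interface StallDetector {
  // Returns true once `windowSize` consecutive iterations showed no novelty.
  observe(evidence: IterationEvidence): boolean;
}

function createStallDetector(windowSize: number, minConfidenceDelta = 0.02): StallDetector {
  const seenFingerprints = new Set<string>();
  const seenSignals = new Set<string>();
  let bestConfidence = Number.NEGATIVE_INFINITY; // high-water mark
  let flatIterations = 0;

  return {
    observe(evidence: IterationEvidence): boolean {
      // Channel 1: a workspace state this run has never visited.
      const newWorkspace = !seenFingerprints.has(evidence.fingerprint);
      seenFingerprints.add(evidence.fingerprint);

      // Channel 2: confidence must beat the previous best by the delta.
      // The mark only moves when the bar is cleared, so small gains accumulate.
      let betterConfidence = false;
      if (
        evidence.confidence !== undefined &&
        evidence.confidence >= bestConfidence + minConfidenceDelta
      ) {
        bestConfidence = evidence.confidence;
        betterConfidence = true;
      }

      // Channel 3: a custom signal value not already seen.
      let newSignal = false;
      if (evidence.signal !== undefined) {
        newSignal = !seenSignals.has(evidence.signal);
        seenSignals.add(evidence.signal);
      }

      // Conservative default: novelty on any one channel resets the counter.
      const progressed = newWorkspace || betterConfidence || newSignal;
      flatIterations = progressed ? 0 : flatIterations + 1;
      return flatIterations >= windowSize;
    },
  };
}

// An agent oscillating A→B→A→B→A with jittering confidence.
const detector = createStallDetector(3);
const run: IterationEvidence[] = [
  { fingerprint: 'A', confidence: 0.4 },
  { fingerprint: 'B', confidence: 0.41 },
  { fingerprint: 'A', confidence: 0.41 },
  { fingerprint: 'B', confidence: 0.4 },
  { fingerprint: 'A', confidence: 0.41 },
];
const stalledAt = run.findIndex((evidence) => detector.observe(evidence));
console.log('stalled at iteration index', stalledAt);
```

Tracking visited states in a set, rather than comparing each iteration with the previous one, is what denies credit for the return trip in an A→B→A oscillation.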
+
  ## Ledger: memory built on git
 
  Fresh context kills _rot_; on its own it would cause _amnesia_. **Ledger** is the core that closes the gap: the loop writes its reasoning to git as it works and reads it back before the next turn. No parallel database, no vector store; git _is_ the index: nothing to build, embed, sync, or let go stale (the commit log can't drift out of sync with the code; it _is_ the code's history). (`Ledger` is the engine; the **commit log** is the durable memory it reads and writes; `.loops/ledger.md` and `.loops/prompt.md` are the live scratch files for work in flight.)
@@ -276,10 +348,10 @@ The agent launch only ever touches the `Engine` interface, so the loop knows not
 
  | name | backend | notes |
  | --------------- | -------------------------------- | ----------------------------------------------------------- |
- | `claude-cli` | `claude` subprocess (`execa`) | fresh process per call; uses host Claude auth, no key |
- | `agent-sdk` | `@anthropic-ai/claude-agent-sdk` | fresh `query()` per call; host Claude auth |
- | `anthropic-api` | `@anthropic-ai/sdk` | token-level streaming; cheapest for judges; needs a key |
- | `codex` | `codex exec` subprocess (GPT-5) | a genuinely different model for a second-model reviewer; read-only |
+ | `codex` | `codex exec` subprocess (`execa`) | fresh process per call; read-only unless `bypassPermissions` |
+ | `claude-cli` | `claude` subprocess (`execa`) | fresh process per call; uses host Claude auth, no key |
+ | `agent-sdk` | `@anthropic-ai/claude-agent-sdk` | fresh `query()` per call; host Claude auth |
+ | `anthropic-api` | `@anthropic-ai/sdk` | token-level streaming; cheapest for judges; needs a key |
  | `mock` | scripted, offline | for tests and examples |
 
  Select per-run (`--engine`, `RunOptions.engine`) or per-job/condition (`engine:` takes a name **or** a ready-made `Engine`). Bring your own in ~10 lines:
@@ -314,15 +386,23 @@ const storeEngineer = defineAgent({
  system: fromFile(new URL('./agents/store-engineer.md', import.meta.url)), // the persona, as markdown
  model: 'sonnet',
  tools: ['edit', 'bash'],
+ tier: 'worker',
  capabilities: ['storage engine', 'id stability'],
+ outputs: [{ name: 'patch' }, { name: 'test-report' }],
+ requiresSkills: ['contract-first'],
  skills: [tdd], // methodologies fold into the system
+ usesSkills: ['small-diff'],
+ humanGates: [{ name: 'prod-approval', when: 'deploying production changes' }],
  failureModes: [{ mode: 'tests-flaky', recovery: 'isolate the flake, retry once' }],
  });
 
  agentJob({ agent: storeEngineer, prompt: 'Build the store to its tests.', ground: true });
  ```
 
- `agentJob` resolves the def into the engine request (`system` = persona + skills, plus `model`/`tools`); inline `system`/`model`/`tools` still override it. A **skill** is a methodology (how to work: TDD, writing-plans), not a worker. This is what turns a `dag` into a named **team** (`storeEngineer`, `apiEngineer`, `securityReviewer` as small files) orchestrated by the DAG and gated by `quorum(...)`.
+ For a small runnable contract plus feedback example, see
+ [`examples/contracted-agent.loop.ts`](examples/contracted-agent.loop.ts).
+
+ `agentJob` resolves the def into the engine request (`system` = persona + skills, plus `model`/`tools`); inline `system`/`model`/`tools` still override it. A **skill** is a methodology (how to work: TDD, writing-plans), not a worker. The extra contract fields are optional metadata for validation, `loops describe`, docs, and future discovery. They do not give an agent dispatch authority. This is what turns a `dag` into a named **team** (`storeEngineer`, `apiEngineer`, `securityReviewer` as small files) orchestrated by the DAG and gated by `quorum(...)`.
 
  ## Environments: test the running thing
 
@@ -381,6 +461,43 @@ dag({
 
  `needs` = dependencies; a non-`pass` required dependency blocks its dependents; `optional` nodes never block or fail the DAG; an unmet `when` skips a node (counts green); cycles are detected before any work runs. `sequence(name, ...jobs)` and `parallel(name, jobs, concurrency?)` are sugar over `dag`.
 
+ ### Feedback between nodes
+
+ Review feedback is a structured revision request. In a loop, a failing `review`
+ outcome is threaded into the next body turn as `ctx.lastReview`; with
+ `consumeFeedback: true`, `agentJob` appends it to the implementation prompt in a
+ standard block.
+
+ ```ts
+ const implement = agentJob({
+ label: 'implementation',
+ prompt: brief,
+ consumeFeedback: true,
+ });
+ ```
+
+ For several reviewers, use `reviewPanel` to aggregate their verdicts into one
+ outcome. Every reviewer is a gate: the panel passes when all of them clear (or
+ `pass: N` of them, k-of-n), and each failing reviewer's concern is surfaced as a
+ blocking finding threaded into the next pass. An empty panel is a construction
+ error, not a vacuous pass.
+
+ ```ts
+ const review = reviewPanel({
+ // pass: 2, // optional: k-of-n instead of all
+ reviewers: [
+ { name: 'security', review: agentCheck({ question: 'Is it safe?', context: reviewContext({ diff: true, ledger: true }) }) },
+ { name: 'correctness', review: agentCheck({ question: 'Is it correct?' }) },
+ { name: 'simplicity', review: agentCheck({ question: 'Is it simple?', context: reviewContext({ files: ['src/**'] }) }) },
+ ],
+ });
+ ```
+
+ In a DAG, a targeted `revisionRequest({ target, findings })` reruns the target
+ node and its dependents when `maxKickbacks` allows it. `kickback(to, reason)` is
+ the terse compatibility helper for the same routed feedback. Agents can opt into
+ a small graph-position prompt block with `graphContext: true`.
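
The panel semantics described in this section (all reviewers by default, k-of-n with `pass`, an empty panel rejected at construction) can be sketched as a pure aggregation function. This is an illustrative model only: `aggregatePanel`, `ReviewerVerdict`, and `PanelOutcome` are hypothetical names, and returning findings even when the panel passes is an assumption made here, not something the README states.

```typescript
// Hypothetical shapes for illustration; not the library's types.
interface ReviewerVerdict {
  name: string;
  passed: boolean;
  concern?: string; // why the reviewer failed the work, if it did
}

interface PanelOutcome {
  passed: boolean;
  findings: string[]; // one entry per failing reviewer
}

function aggregatePanel(verdicts: ReviewerVerdict[], pass?: number): PanelOutcome {
  // An empty panel is a construction error, not a vacuous pass.
  if (verdicts.length === 0) {
    throw new Error('review panel needs at least one reviewer');
  }
  // Default: every reviewer must clear. With `pass`, k-of-n is enough.
  const required = pass ?? verdicts.length;
  if (!Number.isInteger(required) || required < 1 || required > verdicts.length) {
    throw new RangeError(`pass must be an integer between 1 and ${verdicts.length}`);
  }
  const cleared = verdicts.filter((v) => v.passed).length;
  const findings = verdicts
    .filter((v) => !v.passed)
    .map((v) => `${v.name}: ${v.concern ?? 'no reason given'}`);
  return { passed: cleared >= required, findings };
}

const verdicts: ReviewerVerdict[] = [
  { name: 'security', passed: true },
  { name: 'correctness', passed: true },
  { name: 'simplicity', passed: false, concern: 'three layers of indirection for one call' },
];

console.log(aggregatePanel(verdicts)); // strict panel: all must clear
console.log(aggregatePanel(verdicts, 2)); // k-of-n panel: two of three is enough
```

Throwing on an empty list rather than returning a pass mirrors the README's point that a panel with no reviewers should fail loudly when the loop is built, not silently approve everything at run time.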
+
  **Worktree isolation: branches as teams.** A concurrent node can run in its own git worktree on a fork branch (`isolation: 'worktree'` on the DAG, or `isolate: true` per node), so parallel writers never collide on files or the index. On pass, its committed work lands back into the line with a `--no-ff` merge; a conflict fails the node honestly (loops does not auto-resolve; that's a separate layer). Each team gets its own branch, its own scratch files, and (with `DagConfig.environment`) its own stage, all born and torn down together.
 
  For **dynamic** dispatch (a loop that discovers each unit at runtime and routes it to its own isolated sub-loop), `isolated(job)` is the same boundary as a composable wrapper rather than a predeclared node (fork, run, land back on pass):
@@ -458,7 +575,7 @@ loops status <runId> # its shape plus where it is now: iterati
  loops tail <runId> # stream its events live
  ```
 
- `list` marks a run dead if its process is gone. The read side is also on the public surface (`listRuns`, `readRunStatus`, `runEventsPath`), so an agent supervising a fleet of loops, killing the ones that drift and kicking work back into the ones that hit a problem, reads the same files. Out-of-process control (pause, abort, and kickback from outside) is the next step.
+ Each run keeps the raw event stream in `events.jsonl` and a smaller semantic stream in `semantic.jsonl` with dispatch, completion, surfacing, `revision-emitted`, and `revision-routed` records. Use `loops records <runId>` to inspect those records without knowing the registry path; add `--kind revision-routed`, `--kind revision` (both revision kinds), `--path ship/implementation`, `--since <time>`, `--last <n>`, or `--json` when an agent needs a filtered machine-readable stream. `list` marks a run dead if its process is gone. The read side is also on the public surface (`listRuns`, `readRunStatus`, `runEventsPath`, `runSemanticRecordsPath`), so an agent supervising a fleet of loops, killing the ones that drift and kicking work back into the ones that hit a problem, reads the same files. Out-of-process control (pause, abort, and kickback from outside) is the next step.
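
For a supervisor that reads the semantic stream directly instead of shelling out to `loops records`, the filtering described here might look like the sketch below. The record shape is an assumption: the README names the record kinds but not their fields, so `kind` and `path` as top-level string properties, exact-match path filtering, and the helper names `parseNdjson` and `filterRecords` are all hypothetical.

```typescript
// Assumed record shape; the actual fields in semantic.jsonl may differ.
interface SemanticRecord {
  kind: string;
  path?: string;
  [key: string]: unknown;
}

interface RecordFilter {
  kind?: string; // 'revision' is treated as shorthand for both revision kinds
  path?: string; // assumed to be an exact match
  last?: number; // keep only the final n matches
}

// NDJSON: one JSON document per non-empty line.
function parseNdjson(text: string): SemanticRecord[] {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as SemanticRecord);
}

function matchesKind(record: SemanticRecord, kind: string): boolean {
  if (record.kind === kind) return true;
  // Mirrors the README's "--kind revision (both revision kinds)".
  return kind === 'revision' && record.kind.startsWith('revision-');
}

function filterRecords(records: SemanticRecord[], filter: RecordFilter): SemanticRecord[] {
  const matched = records.filter(
    (record) =>
      (filter.kind === undefined || matchesKind(record, filter.kind)) &&
      (filter.path === undefined || record.path === filter.path),
  );
  if (filter.last === undefined) return matched;
  return matched.slice(Math.max(matched.length - filter.last, 0));
}

// Made-up sample stream, standing in for the contents of a run's semantic.jsonl.
const sample = [
  '{"kind":"dispatch","path":"ship/implementation"}',
  '{"kind":"revision-emitted","path":"ship/review"}',
  '{"kind":"revision-routed","path":"ship/implementation"}',
  '{"kind":"completion","path":"ship/implementation"}',
].join('\n');

const revisions = filterRecords(parseNdjson(sample), {
  kind: 'revision',
  path: 'ship/implementation',
});
console.log(revisions.map((record) => record.kind));
```

In practice the text would come from the file that `runSemanticRecordsPath` points at; the sample string keeps the sketch free of filesystem access.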
 
  ## What `loops` is (and isn't)
 
@@ -481,7 +598,7 @@ It deliberately does **not** do durable mid-run replay (re-running a half-finish
  - [x] Supervision: a file-based run registry with `loops list` / `status` / `tail`
  - [ ] Out-of-process control: `pause` / `abort` / `kickback` a running loop from outside
  - [ ] Optional `wip:` autosave tier (per-iteration recovery, squashed on convergence)
- - [ ] No-progress / stall detection: the third hard stop, alongside `max` and `budget`
+ - [x] No-progress / stall detection (`noProgress`): the third hard stop, alongside `max` and `budget`
  - [ ] `cost per accepted change` as a first-class reported metric
  - [ ] Calibration helpers for agent judges
  - [ ] More engine adapters (OpenAI, local models)
Binary file
@@ -1,5 +1,5 @@
  import { newAccumulator, mapMessage } from './chunk-CXEPZHSR.js';
- import { SUBAGENT_TOOLS } from './chunk-XC46B4FD.js';
+ import { SUBAGENT_TOOLS } from './chunk-MA6NDQMO.js';
  import { LoopError } from './chunk-I3STY7U6.js';
  import pTimeout from 'p-timeout';
 
@@ -91,5 +91,5 @@ var AgentSdkEngine = class {
  };
 
  export { AgentSdkEngine };
- //# sourceMappingURL=agent-sdk-RF5VJZAT.js.map
- //# sourceMappingURL=agent-sdk-RF5VJZAT.js.map
+ //# sourceMappingURL=agent-sdk-4QJDWM7N.js.map
+ //# sourceMappingURL=agent-sdk-4QJDWM7N.js.map
@@ -1 +1 @@
- {"version":3,"sources":["../src/engines/agent-sdk.ts"],"names":[],"mappings":";;;;;AA8BA,SAAS,iBAAiB,KAAA,EAAuC;AAC/D,EAAA,MAAM,GAAA,GAAO,SAAS,EAAC;AACvB,EAAA,MAAM,MAAM,OAAO,GAAA,CAAI,KAAA,KAAU,QAAA,GAAW,IAAI,KAAA,GAAQ,EAAA;AACxD,EAAA,MAAM,UAAU,KAAA,YAAiB,KAAA,GAAQ,KAAA,CAAM,OAAA,GAAU,OAAO,KAAK,CAAA;AACrE,EAAA,MAAM,WAAW,CAAA,EAAG,GAAG,CAAA,CAAA,EAAI,OAAO,GAAG,WAAA,EAAY;AAEjD,EAAA,MAAM,IAAA,GAAQ,GAAA,CAAI,eAAA,IAAmB,EAAC;AACtC,EAAA,MAAM,OAAA,GACJ,OAAO,IAAA,CAAK,QAAA,KAAa,QAAA,GACrB,IAAA,CAAK,QAAA,GACL,OAAO,IAAA,CAAK,eAAA,KAAoB,QAAA,GAC9B,IAAA,CAAK,eAAA,GACL,MAAA;AAER,EAAA,MAAM,OAAA,GACJ,QAAQ,eAAA,IACR,IAAA,CAAK,cAAc,kBAAA,IACnB,kCAAA,CAAmC,KAAK,QAAQ,CAAA;AAClD,EAAA,IAAI,OAAA,EAAS;AACX,IAAA,OAAO,IAAI,SAAA,CAAU;AAAA,MACnB,IAAA,EAAM,OAAA;AAAA,MACN,KAAA,EAAO,QAAA;AAAA,MACP,OAAA,EAAS,kCAAkC,OAAO,CAAA,CAAA;AAAA,MAClD,KAAA,EAAO,KAAA;AAAA,MACP;AAAA,KACD,CAAA;AAAA,EACH;AACA,EAAA,MAAM,SACJ,GAAA,KAAQ,YAAA,IACR,QAAQ,YAAA,IACR,oDAAA,CAAqD,KAAK,QAAQ,CAAA;AACpE,EAAA,IAAI,MAAA,EAAQ;AACV,IAAA,OAAO,IAAI,SAAA,CAAU;AAAA,MACnB,IAAA,EAAM,YAAA;AAAA,MACN,KAAA,EAAO,QAAA;AAAA,MACP,OAAA,EAAS,2BAA2B,OAAO,CAAA,CAAA;AAAA,MAC3C,KAAA,EAAO,KAAA;AAAA,MACP;AAAA,KACD,CAAA;AAAA,EACH;AACA,EAAA,OAAO,MAAA;AACT;AAEO,IAAM,iBAAN,MAAuC;AAAA,EAE5C,WAAA,CAA6B,IAAA,GAAsB,EAAC,EAAG;AAA1B,IAAA,IAAA,CAAA,IAAA,GAAA,IAAA;AAAA,EAA2B;AAAA,EAA3B,IAAA;AAAA,EADpB,IAAA,GAAO,WAAA;AAAA,EAGhB,MAAM,GAAA,CACJ,GAAA,EACA,OAAA,EACA,MAAA,EACsB;AAEtB,IAAA,MAAM,EAAE,KAAA,EAAM,GAAI,MAAM,OAAO,gCAAgC,CAAA;AAE/D,IAAA,MAAM,GAAA,GAAM,cAAA;AAAA,MACV,GAAA,CAAI,KAAA,IAAS,IAAA,CAAK,IAAA,CAAK,YAAA,IAAgB;AAAA,KACzC;AACA,IAAA,MAAM,KAAA,GAAQ,IAAI,eAAA,EAAgB;AAClC,IAAA,MAAM,OAAA,GAAU,MAAM,KAAA,CAAM,KAAA,EAAM;AAClC,IAAA,IAAI,MAAA,CAAO,OAAA,EAAS,KAAA,CAAM,KAAA,EAAM;AAAA,gBACpB,gBAAA,CAAiB,OAAA,EAAS,SAAS,EAAE,IAAA,EAAM,MAAM,CAAA;AAG7D,IAAA,MAAM,OAAA,GAAU;AAAA,MACd,KAAA,EAAO,GAAA,CAAI,KAAA,IAAS,IAAA,CAAK,IAAA,CAAK,YAAA;AAAA,MAC9B,cAAc,GAAA,CAAI,MAAA;AAAA,MAClB,KAAK,GAAA,CAAI,GAAA;AAAA,MACT,cAAc,GAAA,CAAI,YAAA;AAAA;AAAA,MAElB,eAAA,EAAiB,GAAA,CAAI,IAAA,
GAAO,cAAA,GAAiB,MAAA;AAAA,MAC7C,cAAA,EAAgB,KAAK,IAAA,CAAK,cAAA;AAAA,MAC1B,sBAAA,EAAwB,IAAA;AAAA,MACxB,eAAA,EAAiB;AAAA,KACnB;AAEA,IAAA,IAAI;AACF,MAAA,MAAM,WAAW,KAAA,CAAM;AAAA,QACrB,QAAQ,GAAA,CAAI,MAAA;AAAA,QACZ;AAAA,OACQ,CAAA;AACV,MAAA,MAAM,WAAW,YAAY;AAC3B,QAAA,WAAA,MAAiB,OAAA,IAAW,QAAA,EAAU,UAAA,CAAW,OAAA,EAAS,KAAK,OAAO,CAAA;AAAA,MACxE,CAAA,GAAG;AACH,MAAA,OAAO,GAAA,CAAI,YACP,QAAA,CAAS,OAAA,EAAS,EAAE,YAAA,EAAc,GAAA,CAAI,SAAA,EAAW,CAAA,GACjD,OAAA,CAAA;AAAA,IACN,SAAS,CAAA,EAAG;AACV,MAAA,IAAI,MAAA,CAAO,OAAA;AACT,QAAA,MAAM,IAAI,SAAA,CAAU;AAAA,UAClB,IAAA,EAAM,SAAA;AAAA,UACN,KAAA,EAAO,QAAA;AAAA,UACP,OAAA,EAAS;AAAA,SACV,CAAA;AACH,MAAA,MAAM,KAAA,GAAQ,iBAAiB,CAAC,CAAA;AAChC,MAAA,IAAI,OAAO,MAAM,KAAA;AACjB,MAAA,MAAM,SAAA,CAAU,KAAK,CAAA,EAAG,EAAE,MAAM,QAAA,EAAU,KAAA,EAAO,UAAU,CAAA;AAAA,IAC7D,CAAA,SAAE;AACA,MAAA,MAAA,CAAO,mBAAA,CAAoB,SAAS,OAAO,CAAA;AAAA,IAC7C;AAEA,IAAA,OAAA,CAAQ,EAAE,MAAM,OAAA,EAAS,KAAA,EAAO,IAAI,KAAA,EAAO,KAAA,EAAO,GAAA,CAAI,KAAA,EAAO,CAAA;AAC7D,IAAA,OAAO;AAAA,MACL,MAAM,GAAA,CAAI,IAAA;AAAA,MACV,OAAO,GAAA,CAAI,KAAA;AAAA,MACX,OAAO,GAAA,CAAI,KAAA;AAAA,MACX,YAAY,GAAA,CAAI;AAAA,KAClB;AAAA,EACF;AACF","file":"agent-sdk-RF5VJZAT.js","sourcesContent":["/**\n * Engine adapter: the Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`).\n * Each `run` is a fresh `query()` — a clean context per loop iteration, which\n * is the whole point. 
Uses the host's Claude Code auth, so it needs no API key.\n */\n\nimport pTimeout from 'p-timeout';\n\nimport {\n SUBAGENT_TOOLS,\n type AgentRequest,\n type AgentResult,\n type Engine,\n type EngineEventSink,\n type EngineOptions,\n} from './engine.ts';\nimport { mapMessage, newAccumulator } from './message-map.ts';\nimport { LoopError } from '../core/errors.ts';\n\n/**\n * Best-effort classification of an Agent SDK error into a provider-limit\n * `LoopError`, or `undefined` to fall through to the generic ENGINE mapping.\n * The SDK exposes limit state in a few shapes (a thrown error message, an\n * `error` field carrying an `SDKAssistantMessageError` string, and a\n * `rate_limit_info.resetsAt` epoch). We read defensively rather than depend on\n * an exact internal shape:\n * - a rate-limit / overloaded signal → RATE_LIMIT (resets on its own).\n * - a billing / usage / credits signal → QUOTA. A `resetsAt` (when present)\n * makes it auto-waitable; otherwise QUOTA has no reset.\n */\nfunction classifySdkLimit(error: unknown): LoopError | undefined {\n const err = (error ?? {}) as Record<string, unknown>;\n const tag = typeof err.error === 'string' ? err.error : '';\n const message = error instanceof Error ? error.message : String(error);\n const haystack = `${tag} ${message}`.toLowerCase();\n\n const info = (err.rate_limit_info ?? {}) as Record<string, unknown>;\n const resetAt =\n typeof info.resetsAt === 'number'\n ? info.resetsAt\n : typeof info.overageResetsAt === 'number'\n ? 
info.overageResetsAt\n : undefined;\n\n const isUsage =\n tag === 'billing_error' ||\n info.errorCode === 'credits_required' ||\n /billing|credit|usage limit|quota/.test(haystack);\n if (isUsage) {\n return new LoopError({\n code: 'QUOTA',\n phase: 'engine',\n message: `agent-sdk usage/billing limit: ${message}`,\n cause: error,\n resetAt,\n });\n }\n const isRate =\n tag === 'rate_limit' ||\n tag === 'overloaded' ||\n /rate limit|rate-limit|too many requests|overloaded/.test(haystack);\n if (isRate) {\n return new LoopError({\n code: 'RATE_LIMIT',\n phase: 'engine',\n message: `agent-sdk rate limited: ${message}`,\n cause: error,\n resetAt,\n });\n }\n return undefined;\n}\n\nexport class AgentSdkEngine implements Engine {\n readonly name = 'agent-sdk';\n constructor(private readonly opts: EngineOptions = {}) {}\n\n async run(\n req: AgentRequest,\n onEvent: EngineEventSink,\n signal: AbortSignal,\n ): Promise<AgentResult> {\n // Lazy import so installs/runs that never touch this engine don't pay for it.\n const { query } = await import('@anthropic-ai/claude-agent-sdk');\n\n const acc = newAccumulator(\n req.model ?? this.opts.defaultModel ?? 'unknown',\n );\n const abort = new AbortController();\n const onAbort = () => abort.abort();\n if (signal.aborted) abort.abort();\n else signal.addEventListener('abort', onAbort, { once: true });\n\n // The SDK option surface drifts across versions; cast at this boundary.\n const options = {\n model: req.model ?? this.opts.defaultModel,\n systemPrompt: req.system,\n cwd: req.cwd,\n allowedTools: req.allowedTools,\n // A leaf agent may not spawn sub-agents — disallow the spawn tool.\n disallowedTools: req.leaf ? 
SUBAGENT_TOOLS : undefined,\n permissionMode: this.opts.permissionMode,\n includePartialMessages: true,\n abortController: abort,\n } as Record<string, unknown>;\n\n try {\n const response = query({\n prompt: req.prompt,\n options,\n } as never) as AsyncIterable<unknown>;\n const consume = (async () => {\n for await (const message of response) mapMessage(message, acc, onEvent);\n })();\n await (req.timeoutMs\n ? pTimeout(consume, { milliseconds: req.timeoutMs })\n : consume);\n } catch (e) {\n if (signal.aborted)\n throw new LoopError({\n code: 'ABORTED',\n phase: 'engine',\n message: 'agent-sdk run aborted',\n });\n const limit = classifySdkLimit(e);\n if (limit) throw limit;\n throw LoopError.from(e, { code: 'ENGINE', phase: 'engine' });\n } finally {\n signal.removeEventListener('abort', onAbort);\n }\n\n onEvent({ type: 'usage', usage: acc.usage, model: acc.model });\n return {\n text: acc.text,\n usage: acc.usage,\n model: acc.model,\n stopReason: acc.stopReason,\n };\n }\n}\n"]}
+ {"version":3,"sources":["../src/engines/agent-sdk.ts"],"names":[],"mappings":";;;;;AA8BA,SAAS,iBAAiB,KAAA,EAAuC;AAC/D,EAAA,MAAM,GAAA,GAAO,SAAS,EAAC;AACvB,EAAA,MAAM,MAAM,OAAO,GAAA,CAAI,KAAA,KAAU,QAAA,GAAW,IAAI,KAAA,GAAQ,EAAA;AACxD,EAAA,MAAM,UAAU,KAAA,YAAiB,KAAA,GAAQ,KAAA,CAAM,OAAA,GAAU,OAAO,KAAK,CAAA;AACrE,EAAA,MAAM,WAAW,CAAA,EAAG,GAAG,CAAA,CAAA,EAAI,OAAO,GAAG,WAAA,EAAY;AAEjD,EAAA,MAAM,IAAA,GAAQ,GAAA,CAAI,eAAA,IAAmB,EAAC;AACtC,EAAA,MAAM,OAAA,GACJ,OAAO,IAAA,CAAK,QAAA,KAAa,QAAA,GACrB,IAAA,CAAK,QAAA,GACL,OAAO,IAAA,CAAK,eAAA,KAAoB,QAAA,GAC9B,IAAA,CAAK,eAAA,GACL,MAAA;AAER,EAAA,MAAM,OAAA,GACJ,QAAQ,eAAA,IACR,IAAA,CAAK,cAAc,kBAAA,IACnB,kCAAA,CAAmC,KAAK,QAAQ,CAAA;AAClD,EAAA,IAAI,OAAA,EAAS;AACX,IAAA,OAAO,IAAI,SAAA,CAAU;AAAA,MACnB,IAAA,EAAM,OAAA;AAAA,MACN,KAAA,EAAO,QAAA;AAAA,MACP,OAAA,EAAS,kCAAkC,OAAO,CAAA,CAAA;AAAA,MAClD,KAAA,EAAO,KAAA;AAAA,MACP;AAAA,KACD,CAAA;AAAA,EACH;AACA,EAAA,MAAM,SACJ,GAAA,KAAQ,YAAA,IACR,QAAQ,YAAA,IACR,oDAAA,CAAqD,KAAK,QAAQ,CAAA;AACpE,EAAA,IAAI,MAAA,EAAQ;AACV,IAAA,OAAO,IAAI,SAAA,CAAU;AAAA,MACnB,IAAA,EAAM,YAAA;AAAA,MACN,KAAA,EAAO,QAAA;AAAA,MACP,OAAA,EAAS,2BAA2B,OAAO,CAAA,CAAA;AAAA,MAC3C,KAAA,EAAO,KAAA;AAAA,MACP;AAAA,KACD,CAAA;AAAA,EACH;AACA,EAAA,OAAO,MAAA;AACT;AAEO,IAAM,iBAAN,MAAuC;AAAA,EAE5C,WAAA,CAA6B,IAAA,GAAsB,EAAC,EAAG;AAA1B,IAAA,IAAA,CAAA,IAAA,GAAA,IAAA;AAAA,EAA2B;AAAA,EAA3B,IAAA;AAAA,EADpB,IAAA,GAAO,WAAA;AAAA,EAGhB,MAAM,GAAA,CACJ,GAAA,EACA,OAAA,EACA,MAAA,EACsB;AAEtB,IAAA,MAAM,EAAE,KAAA,EAAM,GAAI,MAAM,OAAO,gCAAgC,CAAA;AAE/D,IAAA,MAAM,GAAA,GAAM,cAAA;AAAA,MACV,GAAA,CAAI,KAAA,IAAS,IAAA,CAAK,IAAA,CAAK,YAAA,IAAgB;AAAA,KACzC;AACA,IAAA,MAAM,KAAA,GAAQ,IAAI,eAAA,EAAgB;AAClC,IAAA,MAAM,OAAA,GAAU,MAAM,KAAA,CAAM,KAAA,EAAM;AAClC,IAAA,IAAI,MAAA,CAAO,OAAA,EAAS,KAAA,CAAM,KAAA,EAAM;AAAA,gBACpB,gBAAA,CAAiB,OAAA,EAAS,SAAS,EAAE,IAAA,EAAM,MAAM,CAAA;AAG7D,IAAA,MAAM,OAAA,GAAU;AAAA,MACd,KAAA,EAAO,GAAA,CAAI,KAAA,IAAS,IAAA,CAAK,IAAA,CAAK,YAAA;AAAA,MAC9B,cAAc,GAAA,CAAI,MAAA;AAAA,MAClB,KAAK,GAAA,CAAI,GAAA;AAAA,MACT,cAAc,GAAA,CAAI,YAAA;AAAA;AAAA,MAElB,eAAA,EAAiB,GAAA,CAAI,IAAA,
GAAO,cAAA,GAAiB,MAAA;AAAA,MAC7C,cAAA,EAAgB,KAAK,IAAA,CAAK,cAAA;AAAA,MAC1B,sBAAA,EAAwB,IAAA;AAAA,MACxB,eAAA,EAAiB;AAAA,KACnB;AAEA,IAAA,IAAI;AACF,MAAA,MAAM,WAAW,KAAA,CAAM;AAAA,QACrB,QAAQ,GAAA,CAAI,MAAA;AAAA,QACZ;AAAA,OACQ,CAAA;AACV,MAAA,MAAM,WAAW,YAAY;AAC3B,QAAA,WAAA,MAAiB,OAAA,IAAW,QAAA,EAAU,UAAA,CAAW,OAAA,EAAS,KAAK,OAAO,CAAA;AAAA,MACxE,CAAA,GAAG;AACH,MAAA,OAAO,GAAA,CAAI,YACP,QAAA,CAAS,OAAA,EAAS,EAAE,YAAA,EAAc,GAAA,CAAI,SAAA,EAAW,CAAA,GACjD,OAAA,CAAA;AAAA,IACN,SAAS,CAAA,EAAG;AACV,MAAA,IAAI,MAAA,CAAO,OAAA;AACT,QAAA,MAAM,IAAI,SAAA,CAAU;AAAA,UAClB,IAAA,EAAM,SAAA;AAAA,UACN,KAAA,EAAO,QAAA;AAAA,UACP,OAAA,EAAS;AAAA,SACV,CAAA;AACH,MAAA,MAAM,KAAA,GAAQ,iBAAiB,CAAC,CAAA;AAChC,MAAA,IAAI,OAAO,MAAM,KAAA;AACjB,MAAA,MAAM,SAAA,CAAU,KAAK,CAAA,EAAG,EAAE,MAAM,QAAA,EAAU,KAAA,EAAO,UAAU,CAAA;AAAA,IAC7D,CAAA,SAAE;AACA,MAAA,MAAA,CAAO,mBAAA,CAAoB,SAAS,OAAO,CAAA;AAAA,IAC7C;AAEA,IAAA,OAAA,CAAQ,EAAE,MAAM,OAAA,EAAS,KAAA,EAAO,IAAI,KAAA,EAAO,KAAA,EAAO,GAAA,CAAI,KAAA,EAAO,CAAA;AAC7D,IAAA,OAAO;AAAA,MACL,MAAM,GAAA,CAAI,IAAA;AAAA,MACV,OAAO,GAAA,CAAI,KAAA;AAAA,MACX,OAAO,GAAA,CAAI,KAAA;AAAA,MACX,YAAY,GAAA,CAAI;AAAA,KAClB;AAAA,EACF;AACF","file":"agent-sdk-4QJDWM7N.js","sourcesContent":["/**\n * Engine adapter: the Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`).\n * Each `run` is a fresh `query()` — a clean context per loop iteration, which\n * is the whole point. 
Uses the host's Claude Code auth, so it needs no API key.\n */\n\nimport pTimeout from 'p-timeout';\n\nimport {\n SUBAGENT_TOOLS,\n type AgentRequest,\n type AgentResult,\n type Engine,\n type EngineEventSink,\n type EngineOptions,\n} from './engine.ts';\nimport { mapMessage, newAccumulator } from './message-map.ts';\nimport { LoopError } from '../core/errors.ts';\n\n/**\n * Best-effort classification of an Agent SDK error into a provider-limit\n * `LoopError`, or `undefined` to fall through to the generic ENGINE mapping.\n * The SDK exposes limit state in a few shapes (a thrown error message, an\n * `error` field carrying an `SDKAssistantMessageError` string, and a\n * `rate_limit_info.resetsAt` epoch). We read defensively rather than depend on\n * an exact internal shape:\n * - a rate-limit / overloaded signal → RATE_LIMIT (resets on its own).\n * - a billing / usage / credits signal → QUOTA. A `resetsAt` (when present)\n * makes it auto-waitable; otherwise QUOTA has no reset.\n */\nfunction classifySdkLimit(error: unknown): LoopError | undefined {\n const err = (error ?? {}) as Record<string, unknown>;\n const tag = typeof err.error === 'string' ? err.error : '';\n const message = error instanceof Error ? error.message : String(error);\n const haystack = `${tag} ${message}`.toLowerCase();\n\n const info = (err.rate_limit_info ?? {}) as Record<string, unknown>;\n const resetAt =\n typeof info.resetsAt === 'number'\n ? info.resetsAt\n : typeof info.overageResetsAt === 'number'\n ? 
info.overageResetsAt\n : undefined;\n\n const isUsage =\n tag === 'billing_error' ||\n info.errorCode === 'credits_required' ||\n /billing|credit|usage limit|quota/.test(haystack);\n if (isUsage) {\n return new LoopError({\n code: 'QUOTA',\n phase: 'engine',\n message: `agent-sdk usage/billing limit: ${message}`,\n cause: error,\n resetAt,\n });\n }\n const isRate =\n tag === 'rate_limit' ||\n tag === 'overloaded' ||\n /rate limit|rate-limit|too many requests|overloaded/.test(haystack);\n if (isRate) {\n return new LoopError({\n code: 'RATE_LIMIT',\n phase: 'engine',\n message: `agent-sdk rate limited: ${message}`,\n cause: error,\n resetAt,\n });\n }\n return undefined;\n}\n\nexport class AgentSdkEngine implements Engine {\n readonly name = 'agent-sdk';\n constructor(private readonly opts: EngineOptions = {}) {}\n\n async run(\n req: AgentRequest,\n onEvent: EngineEventSink,\n signal: AbortSignal,\n ): Promise<AgentResult> {\n // Lazy import so installs/runs that never touch this engine don't pay for it.\n const { query } = await import('@anthropic-ai/claude-agent-sdk');\n\n const acc = newAccumulator(\n req.model ?? this.opts.defaultModel ?? 'unknown',\n );\n const abort = new AbortController();\n const onAbort = () => abort.abort();\n if (signal.aborted) abort.abort();\n else signal.addEventListener('abort', onAbort, { once: true });\n\n // The SDK option surface drifts across versions; cast at this boundary.\n const options = {\n model: req.model ?? this.opts.defaultModel,\n systemPrompt: req.system,\n cwd: req.cwd,\n allowedTools: req.allowedTools,\n // A leaf agent may not spawn sub-agents — disallow the spawn tool.\n disallowedTools: req.leaf ? 
SUBAGENT_TOOLS : undefined,\n permissionMode: this.opts.permissionMode,\n includePartialMessages: true,\n abortController: abort,\n } as Record<string, unknown>;\n\n try {\n const response = query({\n prompt: req.prompt,\n options,\n } as never) as AsyncIterable<unknown>;\n const consume = (async () => {\n for await (const message of response) mapMessage(message, acc, onEvent);\n })();\n await (req.timeoutMs\n ? pTimeout(consume, { milliseconds: req.timeoutMs })\n : consume);\n } catch (e) {\n if (signal.aborted)\n throw new LoopError({\n code: 'ABORTED',\n phase: 'engine',\n message: 'agent-sdk run aborted',\n });\n const limit = classifySdkLimit(e);\n if (limit) throw limit;\n throw LoopError.from(e, { code: 'ENGINE', phase: 'engine' });\n } finally {\n signal.removeEventListener('abort', onAbort);\n }\n\n onEvent({ type: 'usage', usage: acc.usage, model: acc.model });\n return {\n text: acc.text,\n usage: acc.usage,\n model: acc.model,\n stopReason: acc.stopReason,\n };\n }\n}\n"]}
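Note on the classification order in the embedded source above: the usage/billing test runs before the rate-limit test, so a message that matches both patterns is reported as `QUOTA`. The sketch below is a simplified, hypothetical stand-in (`classifyLimitMessage` and `LimitCode` are illustration-only names, not part of the package). It keeps only the tag and message checks and leaves out the `rate_limit_info` handling.

```typescript
// Hypothetical, trimmed-down illustration of the precedence in classifySdkLimit.
// It returns a bare code instead of a LoopError and ignores rate_limit_info.
type LimitCode = 'QUOTA' | 'RATE_LIMIT';

function classifyLimitMessage(tag: string, message: string): LimitCode | undefined {
  const haystack = `${tag} ${message}`.toLowerCase();

  // Usage/billing signals are tested first, so they win when both kinds match.
  if (tag === 'billing_error' || /billing|credit|usage limit|quota/.test(haystack)) {
    return 'QUOTA';
  }

  if (
    tag === 'rate_limit' ||
    tag === 'overloaded' ||
    /rate limit|rate-limit|too many requests|overloaded/.test(haystack)
  ) {
    return 'RATE_LIMIT';
  }

  // No limit signal: the caller falls through to its generic error mapping.
  return undefined;
}
```

In the real adapter a `resetsAt` value, when present, is carried along as `resetAt`. This sketch drops that detail to keep the ordering visible.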
package/dist/api.d.ts CHANGED
@@ -1,5 +1,56 @@
1
- import { L as LoopConfig, J as Job, D as DagConfig, O as Outcome, a as JobContext, C as ConditionInput, b as JobMeta, c as EngineRef, W as Workspace, A as AgentDef, d as Condition, e as EngineOptions, f as Engine, g as EngineName, h as AgentRequest, U as Usage, i as EngineEventSink, j as AgentResult, E as Environment, k as EnvHandle, l as LoopEvent, F as Forge, B as BudgetConfig, m as LimitPolicy } from './types-B4wGVpqo.js';
2
- export { n as AgentJobConfig, o as Budget, p as CommitJobConfig, q as ConditionResult, r as DagNode, s as EngineStreamEvent, t as ForgeOpts, G as GhForge, u as GroundConfig, v as LogLevel, w as LoopError, x as LoopErrorCode, M as MergeOptions, y as MockForge, z as MockForgeOptions, H as OutcomeStatus, P as PrInput, I as PrPatch, K as PrRef, R as RawPredicate, N as RetryPolicy, S as SUBAGENT_TOOLS, Q as Skill, T as agentJob, V as buildChecksArgs, X as buildCreateArgs, Y as buildEditArgs, Z as buildMergeArgs, _ as buildViewArgs, $ as commitJob, a0 as defineAgent, a1 as defineSkill, a2 as fnJob, a3 as fromFile, a4 as isEngine, a5 as isEnvironment, a6 as isForge, a7 as kickback, a8 as resolveSystem } from './types-B4wGVpqo.js';
1
+ import { C as ConditionInput, J as Job, R as RevisionRerun, F as FeedbackFinding, a as FeedbackDecision, O as Outcome, G as GraphPosition, b as FeedbackSeverity, c as FeedbackActionSeverity, d as JobContext, e as RevisionRequest, L as LoopConfig, D as DagConfig, f as JobMeta, g as EngineRef, W as Workspace, A as AgentDef, h as Condition, i as EngineOptions, j as Engine, k as EngineName, l as AgentRequest, U as Usage, m as EngineEventSink, n as AgentResult, E as Environment, o as EnvHandle, p as LoopEvent, q as Forge, B as BudgetConfig, r as LimitPolicy } from './types-CpB03Jj4.js';
2
+ export { s as AgentContractSummary, t as AgentFailureMode, u as AgentHumanGate, v as AgentJobConfig, w as AgentOutputContract, x as AgentSkillRef, y as AgentTier, z as Budget, H as CommitJobConfig, I as ConditionResult, K as DagNode, M as EngineStreamEvent, N as ForgeOpts, P as GhForge, Q as GroundConfig, S as LogLevel, T as LoopError, V as LoopErrorCode, X as MergeOptions, Y as MockForge, Z as MockForgeOptions, _ as NoProgressConfig, $ as NoProgressInput, a0 as OutcomeStatus, a1 as PrInput, a2 as PrPatch, a3 as PrRef, a4 as ProgressSample, a5 as ProgressTracker, a6 as RawPredicate, a7 as RetryPolicy, a8 as SUBAGENT_TOOLS, a9 as Skill, aa as StallReport, ab as agentContract, ac as agentJob, ad as buildChecksArgs, ae as buildCreateArgs, af as buildEditArgs, ag as buildMergeArgs, ah as buildViewArgs, ai as commitJob, aj as defineAgent, ak as defineSkill, al as fnJob, am as fromFile, an as isEngine, ao as isEnvironment, ap as isForge, aq as resolveNoProgress, ar as resolveSystem } from './types-CpB03Jj4.js';
3
+
4
+ interface RevisionRequestInput {
5
+ target?: string;
6
+ reason?: string;
7
+ findings?: FeedbackFinding[];
8
+ rerun?: RevisionRerun;
9
+ source?: string;
10
+ decision?: FeedbackDecision;
11
+ }
12
+ declare function normalizeFeedbackSeverity(severity: FeedbackSeverity | undefined): FeedbackActionSeverity;
13
+ declare function isRequiredFeedbackSeverity(severity: FeedbackSeverity | undefined): boolean;
14
+ declare function revisionRequest(input: RevisionRequestInput, over?: Partial<Outcome>): Outcome;
15
+ declare function kickback(to: string, reason: string, over?: Partial<Outcome>): Outcome;
16
+ /**
17
+ * The single accessor for an outcome's revision request. `Outcome.revision` is
18
+ * the one channel a producer sets (`revisionRequest`, `kickback`, `reviewPanel`,
19
+ * dag routing), so there is exactly one place to read it — no parallel `kickback`
20
+ * field or `data` copy to keep in sync.
21
+ */
22
+ declare function revisionFromOutcome(outcome: Outcome): RevisionRequest | undefined;
23
+ declare function feedbackBlock(outcome: Outcome): string;
24
+ declare function graphPositionBlock(graph: GraphPosition): string;
25
+ type ReviewTarget = {
26
+ name?: string;
27
+ review: ConditionInput;
28
+ } | {
29
+ name?: string;
30
+ job: Job;
31
+ };
32
+ interface ReviewPanelConfig {
33
+ label?: string;
34
+ reviewers: ReviewTarget[];
35
+ /** Default `all`: every reviewer must pass. A number means k-of-n over all reviewers. */
36
+ pass?: 'all' | number;
37
+ /** When set, a failing panel emits a targeted revision request for dag routing. */
38
+ target?: string;
39
+ rerun?: RevisionRerun;
40
+ }
41
+ declare function reviewPanel(config: ReviewPanelConfig): Job;
42
+ interface ReviewContextConfig {
43
+ diff?: boolean;
44
+ files?: string[];
45
+ ledger?: boolean;
46
+ tests?: boolean | {
47
+ command: string;
48
+ args?: string[];
49
+ cwd?: string;
50
+ };
51
+ maxChars?: number;
52
+ }
53
+ declare function reviewContext(config: ReviewContextConfig): (ctx: JobContext, last: Outcome | undefined) => Promise<string>;
3
54
 
4
55
  /**
5
56
  * The loop primitive. `loop(config)` returns a `Job`, so loops nest by simply
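Note on `ReviewPanelConfig.pass` in the hunk above: the doc comment says the default `all` requires every reviewer to pass and that a number means k-of-n over all reviewers. A minimal sketch of that rule follows. It assumes "k-of-n" means "at least k passing verdicts", which the declaration file does not spell out, and `panelPasses` and `PassRule` are hypothetical names used only here.

```typescript
// Hypothetical stand-in for the `pass` option of ReviewPanelConfig.
type PassRule = 'all' | number;

function panelPasses(verdicts: boolean[], pass: PassRule = 'all'): boolean {
  const passed = verdicts.filter(Boolean).length;
  // 'all' requires every reviewer; a number k is read here as "at least k of n".
  const required = pass === 'all' ? verdicts.length : pass;
  return passed >= required;
}
```

Under this reading an empty reviewer list passes trivially with `all`. Whether the library treats that case the same way is not visible in this diff.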
@@ -134,6 +185,17 @@ declare function stageAll(opts: GitOpts): Promise<void>;
134
185
  declare function hasStagedChanges(opts: GitOpts): Promise<boolean>;
135
186
  /** True when the work tree (staged or unstaged) has any change. */
136
187
  declare function isDirty(opts: GitOpts): Promise<boolean>;
188
+ /**
189
+ * A content hash of the workspace's observable state: HEAD, every pending
190
+ * tracked change (staged + unstaged, with content), the porcelain status, and
191
+ * the CONTENT of untracked non-ignored files (hashed by git itself, so a
192
+ * revisit to a byte-identical tree fingerprints identically). This is the
193
+ * deterministic evidence channel behind `noProgress`: two iterations with the
194
+ * same fingerprint left the workspace in the same state. Returns undefined
195
+ * outside a git work tree — the caller treats that channel as absent, never as
196
+ * "unchanged". Never throws.
197
+ */
198
+ declare function workspaceFingerprint(opts: GitOpts): Promise<string | undefined>;
137
199
  interface CommitInput {
138
200
  subject: string;
139
201
  /** The structured body — the "way". Joined to the subject with a blank line. */
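Note on `workspaceFingerprint` in the hunk above: its doc comment says two iterations with the same fingerprint left the workspace in the same state, and that `undefined` means the channel is absent, never "unchanged". The sketch below shows one way a caller could apply that rule to a history of fingerprints. `unchangedStreak` is a hypothetical helper and does not represent how `ProgressTracker` or `noProgress` are implemented.

```typescript
// Hypothetical helper: how many trailing iterations left the workspace unchanged.
// Input is one fingerprint per iteration, oldest first.
function unchangedStreak(fingerprints: Array<string | undefined>): number {
  let streak = 0;
  for (let i = fingerprints.length - 1; i > 0; i--) {
    const current = fingerprints[i];
    const previous = fingerprints[i - 1];
    // An undefined fingerprint is an absent channel, so it never counts as "unchanged".
    if (current === undefined || previous === undefined || current !== previous) break;
    streak++;
  }
  return streak;
}
```

A caller might compare the streak against a threshold of its own choosing before reporting a stall.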
@@ -1001,9 +1063,63 @@ declare function readRunStatus(runId: string): RunStatus | undefined;
1001
1063
  declare function listRuns(): RunStatus[];
1002
1064
  /** Path to a run's appended event stream (for tailing). */
1003
1065
  declare function runEventsPath(runId: string): string;
1066
+ /** Path to a run's semantic record stream. */
1067
+ declare function runSemanticRecordsPath(runId: string): string;
1004
1068
  /** A compact one-line rendering of an event, for `loops tail`. */
1005
1069
  declare function formatEvent(event: LoopEvent): string;
1006
1070
 
1071
+ type SemanticDecision = FeedbackDecision;
1072
+ type SemanticRunRecord = {
1073
+ kind: 'dispatch';
1074
+ ts: number;
1075
+ path: string[];
1076
+ unit: 'job' | 'dag-node';
1077
+ label?: string;
1078
+ node?: string;
1079
+ /** Present for a dag-node: which run this is (1-based; +1 per kickback re-run). */
1080
+ attempt?: number;
1081
+ } | {
1082
+ kind: 'completion';
1083
+ ts: number;
1084
+ path: string[];
1085
+ unit: 'job' | 'loop' | 'dag' | 'dag-node';
1086
+ label?: string;
1087
+ outcome: SemanticOutcome;
1088
+ iterations?: number;
1089
+ /** Present for a dag-node: which run this completion is for. */
1090
+ attempt?: number;
1091
+ } | {
1092
+ kind: 'surfacing';
1093
+ ts: number;
1094
+ path: string[];
1095
+ source: 'loop-review' | 'dag-kickback';
1096
+ decision: SemanticDecision;
1097
+ severity?: FeedbackActionSeverity;
1098
+ from?: string;
1099
+ to?: string;
1100
+ reason: string;
1101
+ note?: string;
1102
+ } | {
1103
+ kind: 'revision-emitted';
1104
+ ts: number;
1105
+ path: string[];
1106
+ sourceEvent: 'job:end';
1107
+ revision: RevisionRequest;
1108
+ } | {
1109
+ kind: 'revision-routed';
1110
+ ts: number;
1111
+ path: string[];
1112
+ sourceEvent: 'loop:review' | 'dag:kickback';
1113
+ decision: SemanticDecision;
1114
+ revision: RevisionRequest;
1115
+ };
1116
+ interface SemanticOutcome {
1117
+ status: Outcome['status'];
1118
+ summary?: string;
1119
+ confidence?: number;
1120
+ }
1121
+ declare function semanticRecordsFromEvent(event: LoopEvent): SemanticRunRecord[];
1122
+
1007
1123
  /**
1008
1124
  * Public API. A loop-definition file imports from here and `export default`s a
1009
1125
  * `Job` (usually a `loop(...)` or `dag(...)`). The CLI runs that default export.
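Note on `SemanticRunRecord` in the hunk above: `dispatch` records for a `dag-node` carry a 1-based `attempt` that grows by one per kickback re-run. The sketch below shows how a consumer of the record stream could find the highest attempt per node. The types are local, trimmed copies written for this example (`DispatchRecord`, `OtherRecord`, `RecordLike`), and `maxAttempts` is a hypothetical helper, not a package export.

```typescript
// Local, simplified copies of the record shapes; only fields used below are kept.
type DispatchRecord = {
  kind: 'dispatch';
  ts: number;
  path: string[];
  unit: 'job' | 'dag-node';
  node?: string;
  attempt?: number;
};
type OtherRecord = {
  kind: 'completion' | 'surfacing' | 'revision-emitted' | 'revision-routed';
  ts: number;
  path: string[];
};
type RecordLike = DispatchRecord | OtherRecord;

// Highest attempt number seen per dag node, taken from dispatch records only.
function maxAttempts(records: RecordLike[]): Map<string, number> {
  const attempts = new Map<string, number>();
  for (const record of records) {
    if (record.kind !== 'dispatch' || record.unit !== 'dag-node') continue;
    if (record.node === undefined || record.attempt === undefined) continue;
    attempts.set(record.node, Math.max(attempts.get(record.node) ?? 0, record.attempt));
  }
  return attempts;
}
```

A value above 1 for a node indicates it was re-run at least once.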
@@ -1015,4 +1131,4 @@ declare function formatEvent(event: LoopEvent): string;
1015
1131
  /** Identity helper that pins the type of a default export to `Job`. */
1016
1132
  declare function defineJob(job: Job): Job;
1017
1133
 
1018
- export { type AgentCheckConfig, AgentDef, AgentRequest, AgentResult, BudgetConfig, type CommitInput, type CommitRecord, type CompactOptions, Condition, ConditionInput, type ConsolidateJobConfig, type ConsolidateOptions, DagConfig, EXIT_PAUSED, Engine, type EngineFactory, EngineName, EngineOptions, EngineRef, EngineRegistry, EnvHandle, Environment, Forge, type GroundOptions, type IsolatedOptions, Job, JobContext, JobMeta, type LedgerEntry, LimitPolicy, type LogQuery, LoopConfig, LoopEvent, type MergeJobConfig, type MergeResult, type MergeSynthesisConfig, type MergeSynthesisResult, MockEngine, type MockEnvOptions, MockEnvironment, type MockResponder, Outcome, type PromptNote, type PullRequestJobConfig, type PushJobConfig, type PushOptions, type PushResult, type RetrieveOptions, type RunLive, type RunOptions, type RunResult, type RunStatus, Stats, type StatsSnapshot, type TournamentConfig, Usage, Workspace, type WorktreeHandle, addWorktree, agentCheck, all, always, any, appendLedger, appendPrompt, bodyPassed, commandSucceeds, commit, compactLedger, composeCommitBody, conflictedFiles, consolidate, consolidateJob, currentBranch, dag, defineJob, deleteBranch, describeConditions, ensureIgnored, exitCodeFor, forgeChecks, formatEvent, gateJob, groundingText, hasStagedChanges, headSha, isDirty, isRepo, isolated, jobMeta, ledgerPath, listRuns, log, loop, mergeAbort, mergeBranch, mergeJob, mergeNoCommit, mergeSynthesis, minConfidence, mockVerdict, never, not, parallel, predicate, promptPath, pullRequestJob, push, pushJob, quorum, readLedger, readPrompt, readRunStatus, removeWorktree, renderPlan, resetLedger, resetPrompt, retrieveLedger, run, runEventsPath, runsHome, sequence, stageAll, toCondition, tournament };
1134
+ export { type AgentCheckConfig, AgentDef, AgentRequest, AgentResult, BudgetConfig, type CommitInput, type CommitRecord, type CompactOptions, Condition, ConditionInput, type ConsolidateJobConfig, type ConsolidateOptions, DagConfig, EXIT_PAUSED, Engine, type EngineFactory, EngineName, EngineOptions, EngineRef, EngineRegistry, EnvHandle, Environment, FeedbackActionSeverity, FeedbackDecision, FeedbackFinding, FeedbackSeverity, Forge, GraphPosition, type GroundOptions, type IsolatedOptions, Job, JobContext, JobMeta, type LedgerEntry, LimitPolicy, type LogQuery, LoopConfig, LoopEvent, type MergeJobConfig, type MergeResult, type MergeSynthesisConfig, type MergeSynthesisResult, MockEngine, type MockEnvOptions, MockEnvironment, type MockResponder, Outcome, type PromptNote, type PullRequestJobConfig, type PushJobConfig, type PushOptions, type PushResult, type RetrieveOptions, type ReviewContextConfig, type ReviewPanelConfig, RevisionRequest, type RevisionRequestInput, RevisionRerun, type RunLive, type RunOptions, type RunResult, type RunStatus, type SemanticDecision, type SemanticOutcome, type SemanticRunRecord, Stats, type StatsSnapshot, type TournamentConfig, Usage, Workspace, type WorktreeHandle, addWorktree, agentCheck, all, always, any, appendLedger, appendPrompt, bodyPassed, commandSucceeds, commit, compactLedger, composeCommitBody, conflictedFiles, consolidate, consolidateJob, currentBranch, dag, defineJob, deleteBranch, describeConditions, ensureIgnored, exitCodeFor, feedbackBlock, forgeChecks, formatEvent, gateJob, graphPositionBlock, groundingText, hasStagedChanges, headSha, isDirty, isRepo, isRequiredFeedbackSeverity, isolated, jobMeta, kickback, ledgerPath, listRuns, log, loop, mergeAbort, mergeBranch, mergeJob, mergeNoCommit, mergeSynthesis, minConfidence, mockVerdict, never, normalizeFeedbackSeverity, not, parallel, predicate, promptPath, pullRequestJob, push, pushJob, quorum, readLedger, readPrompt, readRunStatus, removeWorktree, renderPlan, resetLedger, 
resetPrompt, retrieveLedger, reviewContext, reviewPanel, revisionFromOutcome, revisionRequest, run, runEventsPath, runSemanticRecordsPath, runsHome, semanticRecordsFromEvent, sequence, stageAll, toCondition, tournament, workspaceFingerprint };
package/dist/api.js CHANGED
@@ -1,7 +1,7 @@
1
- import { mergeNoCommit, stageAll, commit, mergeAbort, log, setMeta, jobMeta, isRepo, addWorktree, childContext, composeCommitBody, mergeBranch, removeWorktree, deleteBranch, push, consolidate, toCondition, GhForge } from './chunk-6BDWTFOS.js';
2
- export { Budget, EXIT_PAUSED, EngineRegistry, GhForge, MockForge, Stats, addWorktree, agentCheck, agentJob, all, always, any, appendLedger, appendPrompt, bodyPassed, buildChecksArgs, buildCreateArgs, buildEditArgs, buildMergeArgs, buildViewArgs, commandSucceeds, commit, commitJob, compactLedger, composeCommitBody, conflictedFiles, consolidate, consolidateJob, currentBranch, defineAgent, defineSkill, deleteBranch, describeConditions, ensureIgnored, exitCodeFor, fnJob, forgeChecks, formatEvent, fromFile, gateJob, groundingText, hasStagedChanges, headSha, isDirty, isForge, isRepo, jobMeta, kickback, ledgerPath, listRuns, log, loop, mergeAbort, mergeBranch, mergeNoCommit, minConfidence, never, not, predicate, promptPath, push, quorum, readLedger, readPrompt, readRunStatus, removeWorktree, renderPlan, resetLedger, resetPrompt, resolveSystem, retrieveLedger, run, runEventsPath, runsHome, stageAll, toCondition } from './chunk-6BDWTFOS.js';
1
+ import { mergeNoCommit, stageAll, commit, mergeAbort, log, setMeta, jobMeta, isRepo, addWorktree, childContext, composeCommitBody, mergeBranch, removeWorktree, deleteBranch, push, consolidate, toCondition, revisionFromOutcome, GhForge } from './chunk-3PMVII43.js';
2
+ export { Budget, EXIT_PAUSED, EngineRegistry, GhForge, MockForge, ProgressTracker, Stats, addWorktree, agentCheck, agentContract, agentJob, all, always, any, appendLedger, appendPrompt, bodyPassed, buildChecksArgs, buildCreateArgs, buildEditArgs, buildMergeArgs, buildViewArgs, commandSucceeds, commit, commitJob, compactLedger, composeCommitBody, conflictedFiles, consolidate, consolidateJob, currentBranch, defineAgent, defineSkill, deleteBranch, describeConditions, ensureIgnored, exitCodeFor, feedbackBlock, fnJob, forgeChecks, formatEvent, fromFile, gateJob, graphPositionBlock, groundingText, hasStagedChanges, headSha, isDirty, isForge, isRepo, isRequiredFeedbackSeverity, jobMeta, kickback, ledgerPath, listRuns, log, loop, mergeAbort, mergeBranch, mergeNoCommit, minConfidence, never, normalizeFeedbackSeverity, not, predicate, promptPath, push, quorum, readLedger, readPrompt, readRunStatus, removeWorktree, renderPlan, resetLedger, resetPrompt, resolveNoProgress, resolveSystem, retrieveLedger, reviewContext, reviewPanel, revisionFromOutcome, revisionRequest, run, runEventsPath, runSemanticRecordsPath, runsHome, semanticRecordsFromEvent, stageAll, toCondition, workspaceFingerprint } from './chunk-3PMVII43.js';
3
3
  import './chunk-JFTXJ7I2.js';
4
- export { SUBAGENT_TOOLS, isEngine } from './chunk-XC46B4FD.js';
4
+ export { SUBAGENT_TOOLS, isEngine } from './chunk-MA6NDQMO.js';
5
5
  import './chunk-Y2SD7GBL.js';
6
6
  import { LoopError } from './chunk-I3STY7U6.js';
7
7
  export { LoopError } from './chunk-I3STY7U6.js';
@@ -160,6 +160,7 @@ function dag(config) {
160
160
  const limit = pLimit(limitN);
161
161
  const results = /* @__PURE__ */ new Map();
162
162
  const memo = /* @__PURE__ */ new Map();
163
+ const attempts = /* @__PURE__ */ new Map();
163
164
  let stopped = false;
164
165
  const pendingKickback = /* @__PURE__ */ new Map();
165
166
  const nodeCtx = (name, workspace, environment) => childContext(parent, {
@@ -167,7 +168,14 @@ function dag(config) {
167
168
  path: [...path, name],
168
169
  workspace,
169
170
  environment,
170
- lastReview: pendingKickback.get(name)
171
+ lastReview: pendingKickback.get(name),
172
+ graph: {
173
+ dag: config.name,
174
+ node: name,
175
+ path: [...path, name],
176
+ needs: nodes.get(name).needs ?? [],
177
+ dependents: dependents.get(name) ?? []
178
+ }
171
179
  });
172
180
  const mergeLimit = pLimit(1);
173
181
  let forkSeq2 = 0;
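Note on the new `graph` field in the hunk above: each node context now receives its own `needs` plus a `dependents` list looked up from a precomputed map. How that map is built is outside this hunk. The sketch below shows one plausible way to derive it by inverting `needs`. `NodeMap` and `invertNeeds` are hypothetical names, and the node shape is reduced to the single field that matters here.

```typescript
// Hypothetical node shape: only `needs` is relevant for this illustration.
type NodeMap = Map<string, { needs?: string[] }>;

// For every edge "name needs dep", record name as a dependent of dep.
function invertNeeds(nodes: NodeMap): Map<string, string[]> {
  const dependents = new Map<string, string[]>();
  nodes.forEach((node, name) => {
    for (const dep of node.needs ?? []) {
      const list = dependents.get(dep) ?? [];
      list.push(name);
      dependents.set(dep, list);
    }
  });
  return dependents;
}
```

Nodes that nothing depends on have no entry, which matches the `dependents.get(name) ?? []` fallback used in the diff.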
@@ -264,11 +272,12 @@ function dag(config) {
264
272
  path,
265
273
  node: name,
266
274
  phase,
267
- outcome: outcome2
275
+ outcome: outcome2,
276
+ attempt: attempts.get(name)
268
277
  });
269
278
  if (phase === "done" && outcome2.status !== "pass" && nodes.get(name).optional !== true && stopOnError && // A node requesting a kickback is going to be re-run — don't let its
270
279
  // (provisional) non-pass abort siblings before the feedback is resolved.
271
- !(maxKickbacks > 0 && outcome2.kickback)) {
280
+ !(maxKickbacks > 0 && revisionFromOutcome(outcome2)?.target)) {
272
281
  stopped = true;
273
282
  }
274
283
  return outcome2;
@@ -278,6 +287,7 @@ function dag(config) {
278
287
  if (existing) return existing;
279
288
  const node = nodes.get(name);
280
289
  const promise = (async () => {
290
+ attempts.set(name, (attempts.get(name) ?? 0) + 1);
281
291
  try {
282
292
  const needs = node.needs ?? [];
283
293
  const deps = await Promise.all(needs.map(run2));
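Note on the `attempts` map in the hunk above: the counter is bumped once at the top of each node run, so the first run reports attempt 1 and every re-run adds one. The same increment can be isolated as a small sketch. `nextAttempt` is a hypothetical wrapper for illustration only, since the diff performs the increment inline.

```typescript
// Per-node run counter, keyed by node name.
const attempts = new Map<string, number>();

// Same arithmetic as the diff: a missing entry counts as 0, then add one.
function nextAttempt(name: string): number {
  const attempt = (attempts.get(name) ?? 0) + 1;
  attempts.set(name, attempt);
  return attempt;
}
```

Because the value is read back with `attempts.get(name)` when events are emitted, the `start` and `done` events of one run carry the same attempt number.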
@@ -324,7 +334,8 @@ function dag(config) {
324
334
  ts: ts(),
325
335
  path,
326
336
  node: name,
327
- phase: "start"
337
+ phase: "start",
338
+ attempt: attempts.get(name)
328
339
  });
329
340
  return { outcome: await runNodeJob(name, node), phase: "done" };
330
341
  }
@@ -369,10 +380,15 @@ function dag(config) {
369
380
  });
370
381
  for (; ; ) {
371
382
  const from = order.find(
372
- (n) => results.get(n)?.kickback && !rejected.has(n)
383
+ (n) => {
384
+ const result = results.get(n);
385
+ return result !== void 0 && revisionFromOutcome(result)?.target !== void 0 && !rejected.has(n);
386
+ }
373
387
  );
374
388
  if (!from) break;
375
- const { to, reason } = results.get(from).kickback;
389
+ const request = revisionFromOutcome(results.get(from));
390
+ const to = request.target;
391
+ const { reason } = request;
376
392
  const allow = nodes.get(from).acceptsKickbackTo;
377
393
  const note = !nodes.has(to) ? `unknown node "${to}"` : !ancestorsOf(from).has(to) ? `"${to}" is not an ancestor of "${from}"` : allow && !allow.includes(to) ? `"${from}" does not accept kickback to "${to}"` : void 0;
378
394
  if (note) {
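Note on the `note` expression in the hunk above: a revision request is rejected when its target is unknown, is not an ancestor of the requesting node, or is missing from the requester's `acceptsKickbackTo` list, and the checks run in that order. The chained ternary can be unrolled as below. `kickbackRejection` and its parameters are hypothetical. In the real code the ancestor set comes from `ancestorsOf(from)` and the known names from the `nodes` map.

```typescript
// Returns the rejection message, or undefined when the kickback may be routed.
// `known` holds every node name; `ancestors` holds the upstream nodes of `from`.
function kickbackRejection(
  from: string,
  to: string,
  known: Set<string>,
  ancestors: Set<string>,
  allow?: string[],
): string | undefined {
  if (!known.has(to)) return `unknown node "${to}"`;
  if (!ancestors.has(to)) return `"${to}" is not an ancestor of "${from}"`;
  // An absent allow-list accepts any ancestor; a present one must name the target.
  if (allow && !allow.includes(to)) return `"${from}" does not accept kickback to "${to}"`;
  return undefined;
}
```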
@@ -401,7 +417,7 @@ function dag(config) {
401
417
  pendingKickback.set(to, {
402
418
  status: "fail",
403
419
  summary: `Kicked back from "${from}": ${reason}`,
404
- data: { kickback: true, from }
420
+ revision: { ...request, source: request.source ?? from }
405
421
  });
406
422
  stopped = false;
407
423
  await Promise.all(names.map(run2));