@botbotgo/agent-harness 0.0.297 → 0.0.299

Files changed (166)
  1. package/README.md +77 -37
  2. package/README.zh.md +79 -30
  3. package/dist/acp.d.ts +3 -0
  4. package/dist/acp.js +10 -2
  5. package/dist/api.d.ts +14 -2
  6. package/dist/api.js +19 -3
  7. package/dist/cli.d.ts +18 -1
  8. package/dist/cli.js +1408 -319
  9. package/dist/client/acp.d.ts +9 -3
  10. package/dist/client/acp.js +55 -1
  11. package/dist/client/in-process.d.ts +5 -2
  12. package/dist/client/in-process.js +4 -6
  13. package/dist/client/index.d.ts +1 -1
  14. package/dist/client/types.d.ts +6 -5
  15. package/dist/config/agents/direct.yaml +7 -17
  16. package/dist/config/agents/orchestra.yaml +9 -65
  17. package/dist/config/catalogs/embedding-models.yaml +1 -1
  18. package/dist/config/catalogs/stores.yaml +1 -1
  19. package/dist/config/knowledge/knowledge-runtime.yaml +36 -2
  20. package/dist/config/knowledge/procedural-memory-runtime.yaml +78 -0
  21. package/dist/config/{catalogs/models.yaml → models.yaml} +2 -2
  22. package/dist/config/prompts/direct-system.md +16 -0
  23. package/dist/config/prompts/orchestra-system.md +62 -0
  24. package/dist/config/prompts/routing-system.md +14 -0
  25. package/dist/config/runtime/runtime-memory.yaml +39 -5
  26. package/dist/config/runtime/workspace.yaml +7 -16
  27. package/dist/contracts/runtime.d.ts +242 -1
  28. package/dist/contracts/workspace.d.ts +2 -0
  29. package/dist/index.d.ts +5 -3
  30. package/dist/index.js +2 -1
  31. package/dist/init-project.js +178 -33
  32. package/dist/knowledge/contracts.d.ts +5 -0
  33. package/dist/knowledge/module.d.ts +5 -0
  34. package/dist/knowledge/module.js +340 -18
  35. package/dist/package-version.d.ts +1 -1
  36. package/dist/package-version.js +1 -1
  37. package/dist/persistence/file-store.d.ts +5 -1
  38. package/dist/persistence/file-store.js +16 -0
  39. package/dist/persistence/sqlite-store.d.ts +4 -1
  40. package/dist/persistence/sqlite-store.js +88 -14
  41. package/dist/persistence/types.d.ts +4 -1
  42. package/dist/procedural/config.d.ts +63 -0
  43. package/dist/procedural/config.js +125 -0
  44. package/dist/procedural/index.d.ts +2 -0
  45. package/dist/procedural/index.js +1 -0
  46. package/dist/protocol/ag-ui/http.d.ts +3 -0
  47. package/dist/protocol/ag-ui/http.js +10 -0
  48. package/dist/request-events.d.ts +63 -0
  49. package/dist/request-events.js +400 -0
  50. package/dist/resource/isolation.js +11 -0
  51. package/dist/resource/resource-impl.d.ts +1 -0
  52. package/dist/resource/resource-impl.js +103 -12
  53. package/dist/resources/init-templates/agent-context/deep-research.md +5 -0
  54. package/dist/resources/init-templates/prompts/research-analyst-basic.md +1 -0
  55. package/dist/resources/init-templates/prompts/research-analyst-web-search.md +1 -0
  56. package/dist/resources/init-templates/prompts/research-host-deep-research-basic.md +1 -0
  57. package/dist/resources/init-templates/prompts/research-host-deep-research-web-search.md +1 -0
  58. package/dist/resources/init-templates/prompts/research-host-single-agent-basic.md +1 -0
  59. package/dist/resources/init-templates/prompts/research-host-single-agent-web-search.md +1 -0
  60. package/dist/resources/prompts/runtime/browser-capability-disclaimer-recovery.md +1 -0
  61. package/dist/resources/prompts/runtime/default-subagent.md +2 -0
  62. package/dist/resources/prompts/runtime/durable-memory-context.md +7 -0
  63. package/dist/resources/prompts/runtime/execution-with-tool-evidence-retry.md +1 -0
  64. package/dist/resources/prompts/runtime/execution-with-tool-evidence.md +1 -0
  65. package/dist/resources/prompts/runtime/invalid-tool-selection-recovery.md +1 -0
  66. package/dist/resources/prompts/runtime/memory-manager.md +31 -0
  67. package/dist/resources/prompts/runtime/memory-mutation-reconciliation.md +22 -0
  68. package/dist/resources/prompts/runtime/slash-command-skill.md +6 -0
  69. package/dist/resources/prompts/runtime/strict-tool-json.md +1 -0
  70. package/dist/resources/prompts/runtime/workspace-boundary-guidance.md +3 -0
  71. package/dist/resources/prompts/runtime/workspace-relative-path.md +1 -0
  72. package/dist/resources/prompts/runtime/write-todos-descriptive-content.md +1 -0
  73. package/dist/resources/prompts/runtime/write-todos-full-entry.md +1 -0
  74. package/dist/resources/prompts/runtime/write-todos-non-empty-initial-list.md +1 -0
  75. package/dist/resources/tools/_runtime_tool_helpers.mjs +152 -0
  76. package/dist/resources/tools/cancel_request.mjs +21 -0
  77. package/dist/resources/tools/fetch_url.mjs +23 -0
  78. package/dist/resources/tools/http_request.mjs +30 -0
  79. package/dist/resources/tools/inspect_approvals.mjs +27 -0
  80. package/dist/resources/tools/inspect_artifacts.mjs +21 -0
  81. package/dist/resources/tools/inspect_events.mjs +21 -0
  82. package/dist/resources/tools/inspect_requests.mjs +27 -0
  83. package/dist/resources/tools/inspect_sessions.mjs +21 -0
  84. package/dist/resources/tools/list_files.mjs +27 -0
  85. package/dist/resources/tools/read_artifact.mjs +22 -0
  86. package/dist/resources/tools/request_approval.mjs +27 -0
  87. package/dist/resources/tools/run_command.mjs +21 -0
  88. package/dist/resources/tools/schedule_task.mjs +76 -0
  89. package/dist/resources/tools/search_files.mjs +47 -0
  90. package/dist/resources/tools/send_message.mjs +23 -0
  91. package/dist/runtime/adapter/direct-builtin-utility.d.ts +1 -0
  92. package/dist/runtime/adapter/direct-builtin-utility.js +90 -0
  93. package/dist/runtime/adapter/flow/execution-context.d.ts +1 -1
  94. package/dist/runtime/adapter/flow/execution-context.js +1 -1
  95. package/dist/runtime/adapter/flow/invocation-flow.d.ts +1 -0
  96. package/dist/runtime/adapter/flow/invocation-flow.js +9 -1
  97. package/dist/runtime/adapter/flow/invoke-runtime.d.ts +1 -1
  98. package/dist/runtime/adapter/flow/stream-runtime.d.ts +5 -1
  99. package/dist/runtime/adapter/flow/stream-runtime.js +556 -35
  100. package/dist/runtime/adapter/invocation-result.js +3 -2
  101. package/dist/runtime/adapter/local-tool-invocation.d.ts +1 -1
  102. package/dist/runtime/adapter/local-tool-invocation.js +28 -4
  103. package/dist/runtime/adapter/middleware-assembly.js +3 -1
  104. package/dist/runtime/adapter/model/invocation-request.d.ts +4 -1
  105. package/dist/runtime/adapter/model/invocation-request.js +138 -16
  106. package/dist/runtime/adapter/model/message-assembly.js +2 -6
  107. package/dist/runtime/adapter/model/model-providers.js +103 -5
  108. package/dist/runtime/adapter/resilience.js +17 -2
  109. package/dist/runtime/adapter/runtime-adapter-support.d.ts +11 -7
  110. package/dist/runtime/adapter/runtime-adapter-support.js +39 -5
  111. package/dist/runtime/adapter/tool/builtin-middleware-tools.d.ts +63 -1
  112. package/dist/runtime/adapter/tool/builtin-middleware-tools.js +193 -21
  113. package/dist/runtime/adapter/tool/tool-arguments.d.ts +3 -1
  114. package/dist/runtime/adapter/tool/tool-arguments.js +52 -17
  115. package/dist/runtime/adapter/tool-resolution.d.ts +1 -0
  116. package/dist/runtime/adapter/tool-resolution.js +4 -2
  117. package/dist/runtime/agent-runtime-adapter.d.ts +27 -0
  118. package/dist/runtime/agent-runtime-adapter.js +163 -11
  119. package/dist/runtime/harness/events/event-bus.d.ts +1 -0
  120. package/dist/runtime/harness/events/event-bus.js +3 -0
  121. package/dist/runtime/harness/events/event-sink.d.ts +3 -0
  122. package/dist/runtime/harness/events/event-sink.js +16 -7
  123. package/dist/runtime/harness/events/streaming.d.ts +18 -1
  124. package/dist/runtime/harness/events/streaming.js +23 -10
  125. package/dist/runtime/harness/run/inspection.js +26 -5
  126. package/dist/runtime/harness/run/stream-run.d.ts +13 -4
  127. package/dist/runtime/harness/run/stream-run.js +448 -4
  128. package/dist/runtime/harness/run/surface-semantics.js +7 -34
  129. package/dist/runtime/harness/system/runtime-memory-manager.d.ts +3 -0
  130. package/dist/runtime/harness/system/runtime-memory-manager.js +384 -69
  131. package/dist/runtime/harness/system/runtime-memory-policy.d.ts +20 -1
  132. package/dist/runtime/harness/system/runtime-memory-policy.js +65 -17
  133. package/dist/runtime/harness/system/runtime-memory-records.js +100 -0
  134. package/dist/runtime/harness/system/runtime-memory-sync.js +2 -2
  135. package/dist/runtime/harness/system/store.d.ts +4 -0
  136. package/dist/runtime/harness/system/store.js +153 -0
  137. package/dist/runtime/harness.d.ts +9 -1
  138. package/dist/runtime/harness.js +141 -7
  139. package/dist/runtime/maintenance/sqlite-checkpoint-saver.d.ts +8 -3
  140. package/dist/runtime/maintenance/sqlite-checkpoint-saver.js +152 -53
  141. package/dist/runtime/parsing/output-parsing.d.ts +10 -2
  142. package/dist/runtime/parsing/output-parsing.js +223 -16
  143. package/dist/runtime/parsing/stream-event-parsing.d.ts +7 -0
  144. package/dist/runtime/parsing/stream-event-parsing.js +51 -1
  145. package/dist/runtime/scheduling/system-schedule-manager.d.ts +41 -0
  146. package/dist/runtime/scheduling/system-schedule-manager.js +532 -0
  147. package/dist/runtime/support/embedding-models.d.ts +1 -1
  148. package/dist/runtime/support/embedding-models.js +5 -2
  149. package/dist/runtime/support/runtime-factories.js +1 -1
  150. package/dist/runtime/support/runtime-layout.d.ts +3 -0
  151. package/dist/runtime/support/runtime-layout.js +10 -1
  152. package/dist/runtime/support/runtime-prompts.d.ts +30 -0
  153. package/dist/runtime/support/runtime-prompts.js +55 -0
  154. package/dist/runtime/support/vector-stores.d.ts +1 -1
  155. package/dist/runtime/support/vector-stores.js +5 -2
  156. package/dist/upstream-events.js +8 -7
  157. package/dist/utils/bundled-text.d.ts +3 -0
  158. package/dist/utils/bundled-text.js +25 -0
  159. package/dist/utils/id.js +3 -2
  160. package/dist/workspace/agent-binding-compiler.js +53 -13
  161. package/dist/workspace/object-loader.js +64 -2
  162. package/dist/workspace/support/workspace-ref-utils.d.ts +2 -1
  163. package/dist/workspace/support/workspace-ref-utils.js +24 -5
  164. package/dist/workspace/yaml-object-reader.d.ts +1 -0
  165. package/dist/workspace/yaml-object-reader.js +95 -17
  166. package/package.json +13 -6
@@ -1,14 +1,16 @@
1
1
  import { type AcpHttpClientOptions, type AcpStdioClient, type AcpStdioClientOptions } from "../acp.js";
2
- import type { Approval, OperatorOverview, RequestEvent, RequestTraceItem } from "../api.js";
2
+ import type { Approval, OperatorOverview, RequestEvent, RequestPlanState, RequestTraceItem } from "../api.js";
3
3
  import type { CancelOptions, RequestSummary, RuntimeHealthSnapshot, SessionListSummary, SessionRecord, SessionSummary } from "../contracts/types.js";
4
- import type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult, HarnessClientRequestStartOptions, HarnessClientStreamItem } from "./types.js";
4
+ import type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult } from "./types.js";
5
5
  export type AcpHarnessTransport = Pick<AcpStdioClient, "request" | "subscribe" | "close">;
6
6
  export declare class AcpHarnessClient implements HarnessClient {
7
7
  private readonly transport;
8
8
  private streamSequence;
9
9
  constructor(transport: AcpHarnessTransport);
10
10
  request(options: HarnessClientRequestOptions): Promise<HarnessClientRequestResult>;
11
- streamRequest(options: HarnessClientRequestStartOptions): AsyncGenerator<HarnessClientStreamItem>;
11
+ private hasStreamingListeners;
12
+ private streamRequestInternal;
13
+ private requestWithStreamingListeners;
12
14
  resolveApproval(options: Parameters<HarnessClient["resolveApproval"]>[0]): Promise<HarnessClientRequestResult>;
13
15
  cancelRequest(options: CancelOptions): Promise<HarnessClientRequestResult>;
14
16
  subscribe(listener: (event: RequestEvent) => void | Promise<void>): () => void;
@@ -25,6 +27,10 @@ export declare class AcpHarnessClient implements HarnessClient {
25
27
  getRequest(requestId: string): Promise<RequestSummary | null>;
26
28
  listApprovals(filter?: HarnessClientApprovalFilter): Promise<Approval[]>;
27
29
  getApproval(approvalId: string): Promise<Approval | null>;
30
+ getRequestPlanState(input: {
31
+ sessionId: string;
32
+ requestId: string;
33
+ }): Promise<RequestPlanState | null>;
28
34
  listRequestEvents(input: {
29
35
  sessionId: string;
30
36
  requestId: string;
@@ -1,4 +1,5 @@
1
1
  import { createAcpHttpClient, createAcpStdioClient, } from "../acp.js";
2
+ import { applyRequestStreamItemToSnapshot, createInitialRequestEventSnapshot, toRequestDataEvent, } from "../request-events.js";
2
3
  function toEvent(notification) {
3
4
  return notification.params.event;
4
5
  }
@@ -15,9 +16,17 @@ export class AcpHarnessClient {
15
16
  this.transport = transport;
16
17
  }
17
18
  request(options) {
19
+ if (this.hasStreamingListeners(options)) {
20
+ return this.requestWithStreamingListeners(options);
21
+ }
18
22
  return this.transport.request("requests.submit", options);
19
23
  }
20
- async *streamRequest(options) {
24
+ hasStreamingListeners(options) {
25
+ return Boolean(("eventListener" in options && options.eventListener)
26
+ || ("dataListener" in options && options.dataListener)
27
+ || options.listeners);
28
+ }
29
+ async *streamRequestInternal(options) {
21
30
  const streamId = `harness-stream-${++this.streamSequence}`;
22
31
  const queued = [];
23
32
  let notify;
@@ -61,6 +70,9 @@ export class AcpHarnessClient {
61
70
  const resultPromise = this.transport.request("requests.submit", {
62
71
  ...options,
63
72
  streamId,
73
+ listeners: undefined,
74
+ eventListener: undefined,
75
+ dataListener: undefined,
64
76
  });
65
77
  resultPromise
66
78
  .then((result) => {
@@ -104,6 +116,45 @@ export class AcpHarnessClient {
104
116
  unsubscribe();
105
117
  }
106
118
  }
119
+ async requestWithStreamingListeners(options) {
120
+ const legacyListeners = options.listeners;
121
+ const eventListener = "eventListener" in options ? options.eventListener : undefined;
122
+ const dataListener = "dataListener" in options ? options.dataListener : undefined;
123
+ let snapshot = createInitialRequestEventSnapshot();
124
+ let finalResult;
125
+ for await (const item of this.streamRequestInternal(options)) {
126
+ snapshot = applyRequestStreamItemToSnapshot(snapshot, item);
127
+ if (item.type === "event") {
128
+ await legacyListeners?.onEvent?.(item.event);
129
+ }
130
+ else if (item.type === "upstream-event") {
131
+ await legacyListeners?.onUpstreamEvent?.(item.event);
132
+ if (item.surfaceItem) {
133
+ await legacyListeners?.onTraceItem?.({
134
+ sessionId: item.sessionId,
135
+ requestId: item.requestId,
136
+ surfaceItem: item.surfaceItem,
137
+ event: item.event,
138
+ });
139
+ }
140
+ }
141
+ else if (item.type === "plan-state") {
142
+ await legacyListeners?.onPlanState?.(item.planState);
143
+ }
144
+ else if (item.type === "result") {
145
+ finalResult = item.result;
146
+ }
147
+ const dataEvent = toRequestDataEvent(item);
148
+ if (dataEvent) {
149
+ await dataListener?.(dataEvent);
150
+ }
151
+ await eventListener?.(snapshot);
152
+ }
153
+ if (!finalResult) {
154
+ throw new Error("ACP streaming request completed without a terminal result.");
155
+ }
156
+ return finalResult;
157
+ }
107
158
  resolveApproval(options) {
108
159
  return this.transport.request("approvals.resolve", options);
109
160
  }
@@ -138,6 +189,9 @@ export class AcpHarnessClient {
138
189
  getApproval(approvalId) {
139
190
  return this.transport.request("approvals.get", { approvalId });
140
191
  }
192
+ getRequestPlanState(input) {
193
+ return this.transport.request("requests.plan.get", input);
194
+ }
141
195
  listRequestEvents(input) {
142
196
  return this.transport.request("events.list", input);
143
197
  }
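The `request()` change in this file sends any call that carries listener callbacks down the streaming path, and the submit payload sets `listeners`, `eventListener`, and `dataListener` to `undefined` before it reaches the transport. A minimal sketch of why that keeps callbacks off the wire, assuming the transport serializes the payload as JSON (this diff does not show the transport encoding). `SketchRequestOptions` and `toWirePayload` are hypothetical names used only for illustration:

```typescript
// Hypothetical, simplified stand-in for the request options seen in the diff.
type SketchRequestOptions = {
  input: string;
  listeners?: { onEvent?: (event: unknown) => void };
  eventListener?: (snapshot: unknown) => void;
  dataListener?: (event: unknown) => void;
};

// Mirrors the guard in the diff: any listener present selects the streaming path.
function hasStreamingListeners(options: SketchRequestOptions): boolean {
  return Boolean(options.eventListener || options.dataListener || options.listeners);
}

// Builds the payload handed to the transport (assumed here to be JSON).
// Callbacks stay on the client: keys set to `undefined` are omitted by JSON.stringify.
function toWirePayload(options: SketchRequestOptions, streamId: string): string {
  return JSON.stringify({
    ...options,
    streamId,
    listeners: undefined,
    eventListener: undefined,
    dataListener: undefined,
  });
}

const sketchOptions: SketchRequestOptions = {
  input: "hello",
  dataListener: () => {},
};
const wirePayload = toWirePayload(sketchOptions, "harness-stream-1");
console.log(hasStreamingListeners(sketchOptions), wirePayload);
```

The listeners are then invoked locally as stream items arrive, which is what `requestWithStreamingListeners` does in the hunk above.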
@@ -1,11 +1,10 @@
1
1
  import { cancelRequest, listSessionSummaries, listSessions, resolveApproval, subscribe, type CreateAgentHarnessOptions } from "../api.js";
2
2
  import type { AgentHarnessRuntime } from "../runtime/harness.js";
3
- import type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult, HarnessClientRequestStartOptions, HarnessClientStreamItem } from "./types.js";
3
+ import type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult } from "./types.js";
4
4
  export declare class InProcessHarnessClient implements HarnessClient {
5
5
  readonly runtime: AgentHarnessRuntime;
6
6
  constructor(runtime: AgentHarnessRuntime);
7
7
  request(options: HarnessClientRequestOptions): Promise<HarnessClientRequestResult>;
8
- streamRequest(options: HarnessClientRequestStartOptions): AsyncGenerator<HarnessClientStreamItem>;
9
8
  resolveApproval(options: Parameters<typeof resolveApproval>[1]): Promise<HarnessClientRequestResult>;
10
9
  cancelRequest(options: Parameters<typeof cancelRequest>[1]): Promise<HarnessClientRequestResult>;
11
10
  subscribe(listener: Parameters<typeof subscribe>[1]): () => void;
@@ -26,6 +25,10 @@ export declare class InProcessHarnessClient implements HarnessClient {
26
25
  getRequest(requestId: string): Promise<import("../contracts/runtime.js").RequestRecord | null>;
27
26
  listApprovals(filter?: HarnessClientApprovalFilter): Promise<import("../api.js").Approval[]>;
28
27
  getApproval(approvalId: string): Promise<import("../api.js").Approval | null>;
28
+ getRequestPlanState(input: {
29
+ sessionId: string;
30
+ requestId: string;
31
+ }): Promise<import("../api.js").RequestPlanState | null>;
29
32
  listRequestEvents(input: {
30
33
  sessionId: string;
31
34
  requestId: string;
@@ -1,4 +1,4 @@
1
- import { cancelRequest, createAgentHarness, getApproval, getHealth, getOperatorOverview, getRequest, getSession, listApprovals, listRequestEvents, listRequests, listRequestTraceItems, listSessionSummaries, listSessions, request, resolveApproval, subscribe, stop, } from "../api.js";
1
+ import { cancelRequest, createAgentHarness, getApproval, getHealth, getOperatorOverview, getRequestPlanState, getRequest, getSession, listApprovals, listRequestEvents, listRequests, listRequestTraceItems, listSessionSummaries, listSessions, request, resolveApproval, subscribe, stop, } from "../api.js";
2
2
  export class InProcessHarnessClient {
3
3
  runtime;
4
4
  constructor(runtime) {
@@ -7,11 +7,6 @@ export class InProcessHarnessClient {
7
7
  request(options) {
8
8
  return request(this.runtime, options);
9
9
  }
10
- async *streamRequest(options) {
11
- for await (const item of this.runtime.streamEvents(options)) {
12
- yield item;
13
- }
14
- }
15
10
  resolveApproval(options) {
16
11
  return resolveApproval(this.runtime, options);
17
12
  }
@@ -42,6 +37,9 @@ export class InProcessHarnessClient {
42
37
  getApproval(approvalId) {
43
38
  return getApproval(this.runtime, approvalId);
44
39
  }
40
+ getRequestPlanState(input) {
41
+ return getRequestPlanState(this.runtime, input);
42
+ }
45
43
  listRequestEvents(input) {
46
44
  return listRequestEvents(this.runtime, input);
47
45
  }
@@ -1,4 +1,4 @@
1
1
  export { AcpHarnessClient, createAcpHarnessClient, createAcpHttpHarnessClient, createAcpStdioHarnessClient } from "./acp.js";
2
2
  export { InProcessHarnessClient, createAgentHarnessClient, createInProcessHarnessClient } from "./in-process.js";
3
- export type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult, HarnessClientRequestStartOptions, HarnessClientStreamItem, } from "./types.js";
3
+ export type { HarnessClient, HarnessClientApprovalFilter, HarnessClientRequestFilter, HarnessClientRequestOptions, HarnessClientRequestResult, HarnessClientRequestStartOptions, } from "./types.js";
4
4
  export type { AcpHarnessTransport } from "./acp.js";
@@ -1,16 +1,14 @@
1
- import type { Approval, OperatorOverview, PublicRequestListeners, PublicRequestOptions, PublicRequestResult, RequestEvent, RequestTraceItem } from "../api.js";
2
- import type { CancelOptions, HarnessStreamItem, InvocationEnvelope, MessageContent, RequestSummary, ResumeOptions, RuntimeHealthSnapshot, SessionListSummary, SessionRecord, SessionSummary } from "../contracts/types.js";
1
+ import type { Approval, OperatorOverview, PublicRequestOptions, PublicRequestResult, RequestEvent, RequestTraceItem } from "../api.js";
2
+ import type { CancelOptions, InvocationEnvelope, MessageContent, RequestPlanState, RequestSummary, ResumeOptions, RuntimeHealthSnapshot, SessionListSummary, SessionRecord, SessionSummary } from "../contracts/types.js";
3
3
  export type HarnessClientRequestStartOptions = {
4
4
  agentId?: string;
5
5
  input: MessageContent;
6
6
  sessionId?: string;
7
7
  priority?: number;
8
8
  invocation?: InvocationEnvelope;
9
- listeners?: PublicRequestListeners;
10
9
  };
11
10
  export type HarnessClientRequestOptions = PublicRequestOptions;
12
11
  export type HarnessClientRequestResult = PublicRequestResult;
13
- export type HarnessClientStreamItem = HarnessStreamItem;
14
12
  export type HarnessClientRequestFilter = {
15
13
  agentId?: string;
16
14
  sessionId?: string;
@@ -23,7 +21,6 @@ export type HarnessClientApprovalFilter = {
23
21
  };
24
22
  export interface HarnessClient {
25
23
  request(options: HarnessClientRequestOptions): Promise<HarnessClientRequestResult>;
26
- streamRequest(options: HarnessClientRequestStartOptions): AsyncGenerator<HarnessClientStreamItem>;
27
24
  resolveApproval(options: ResumeOptions): Promise<HarnessClientRequestResult>;
28
25
  cancelRequest(options: CancelOptions): Promise<HarnessClientRequestResult>;
29
26
  subscribe(listener: (event: RequestEvent) => void | Promise<void>): () => void;
@@ -40,6 +37,10 @@ export interface HarnessClient {
40
37
  getRequest(requestId: string): Promise<RequestSummary | null>;
41
38
  listApprovals(filter?: HarnessClientApprovalFilter): Promise<Approval[]>;
42
39
  getApproval(approvalId: string): Promise<Approval | null>;
40
+ getRequestPlanState(input: {
41
+ sessionId: string;
42
+ requestId: string;
43
+ }): Promise<RequestPlanState | null>;
43
44
  listRequestEvents(input: {
44
45
  sessionId: string;
45
46
  requestId: string;
@@ -12,6 +12,8 @@ spec:
12
12
  runtime:
13
13
  # agent-harness feature: workspace-level durable long-term memory defaults for this host profile.
14
14
  runtimeMemory: default
15
+ # agent-harness feature: optional background-only procedural memory defaults for this host profile.
16
+ proceduralMemory: default
15
17
  # =====================
16
18
  # Runtime Agent Features
17
19
  # =====================
@@ -31,14 +33,9 @@ spec:
31
33
  subagents: []
32
34
  # Upstream execution feature: direct host does not attach MCP servers by default.
33
35
  mcpServers: []
34
- # Runtime execution feature: checkpointer config passed into the selected backend adapter.
35
- # Even the lightweight direct path can benefit from resumable state during interactive use.
36
- # Available `kind` options in this harness: `SqliteSaver`, `FileCheckpointer`, `MemorySaver`.
37
- # The repository default uses the sqlite-backed preset so durable checkpoint state stays inside `runtime/checkpoints.sqlite`.
38
- checkpointer: default
39
- # Upstream execution feature: LangGraph store available to middleware and runtime context hooks.
40
- # The default direct host keeps this enabled so middleware can use the same durable store surface as other hosts.
41
- store: default
36
+ # Upstream execution feature: leave graph checkpointers and stores unset in the repository default.
37
+ # `direct` is the low-latency path; add `checkpointer:` or `store:` only when the host really needs resumable
38
+ # graph state or middleware-owned store access.
42
39
  # Upstream execution feature: no declarative HITL tool routing by default.
43
40
  interruptOn: {}
44
41
  # Upstream execution feature: filesystem middleware settings for LangChain v1 agents.
@@ -74,12 +71,5 @@ spec:
74
71
  # Keep this prompt biased toward concise, self-contained answers. If richer routing policy is
75
72
  # needed for choosing between host agents, configure that separately via `Runtime.spec.routing`
76
73
  # rather than overloading the direct host prompt with classifier behavior.
77
- systemPrompt: |-
78
- You are the direct agent.
79
-
80
- This is a manual low-latency host.
81
- Answer simple requests directly.
82
- Keep the path lightweight.
83
- Do not delegate.
84
- Do not perform broad multi-step execution.
85
- Do not behave like the default execution host.
74
+ systemPrompt:
75
+ path: ../prompts/direct-system.md
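The `systemPrompt` field now references a Markdown file instead of inlining the text. A minimal sketch of how such a reference could be resolved, assuming the path is relative to the YAML file that declares it. That assumption matches `config/agents/direct.yaml` and `config/prompts/direct-system.md` in the file list, but the loader's actual rule is not shown in this hunk. `SystemPromptSpec` and `resolvePromptLocation` are hypothetical names:

```typescript
// Hypothetical shape: either inline text or a reference to a prompt file.
type SystemPromptSpec = string | { path: string };

type PromptLocation =
  | { kind: "inline"; text: string }
  | { kind: "file"; path: string };

// Returns inline text unchanged, or the absolute path a loader would read.
// Assumes POSIX-style absolute paths and resolution against the directory
// of the declaring YAML file.
function resolvePromptLocation(spec: SystemPromptSpec, yamlFilePath: string): PromptLocation {
  if (typeof spec === "string") {
    return { kind: "inline", text: spec };
  }
  // WHATWG URL resolution drops the last segment of the base (the YAML file name).
  const resolved = new URL(spec.path, `file://${yamlFilePath}`);
  return { kind: "file", path: resolved.pathname };
}

const promptLocation = resolvePromptLocation(
  { path: "../prompts/direct-system.md" },
  "/pkg/dist/config/agents/direct.yaml",
);
console.log(promptLocation);
```

Accepting both the string and the `{ path }` form in one type keeps older inline prompts valid while allowing prompts to live in separate files.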
@@ -12,6 +12,8 @@ spec:
12
12
  runtime:
13
13
  # agent-harness feature: workspace-level durable long-term memory defaults for this host profile.
14
14
  runtimeMemory: default
15
+ # agent-harness feature: optional background-only procedural memory defaults for this host profile.
16
+ proceduralMemory: default
15
17
  # =====================
16
18
  # Runtime Agent Features
17
19
  # =====================
@@ -19,41 +21,19 @@ spec:
19
21
  backend: deepagent
20
22
  # Upstream execution feature: model ref for the underlying LLM used by this execution host.
21
23
  modelRef: model/default
22
- memory:
23
- # Upstream execution feature: bootstrap memory sources supplied to the selected backend at construction time.
24
- # These paths resolve relative to the workspace root unless they are already absolute.
25
- # Treat this as agent-owned startup context, not as a dynamic long-term memory sink:
26
- # - keep `systemPrompt` for stable role, boundaries, and hard behavioral rules
27
- # - use `memory:` for stable project knowledge, operating conventions, and shared or agent-specific context files
28
- # - use `/memories/*` via the backend/store below for durable knowledge learned from prior requests
29
- # - use the harness checkpointer for resumable graph state for an in-flight request
30
- # Updating these files changes future agent constructions, but they are still bootstrap inputs rather than
31
- # self-updating runtime memory.
32
- - path: config/agent-context.md
24
+ memory: []
33
25
  # Upstream execution feature: top-level host starts with no extra direct tool refs beyond discovered workspace tools.
34
26
  tools: []
35
27
  # Upstream execution feature: the starter runtime ships one host plus a small set of behavior skills so the
36
28
  # first request already feels like real work instead of an empty shell.
37
- skills:
38
- - resource://skills/workspace-inspection
39
- - resource://skills/safe-editing
40
- - resource://skills/delegation-discipline
41
- - resource://skills/approval-execution-policy
42
- - resource://skills/completion-discipline
29
+ skills: []
43
30
  # Upstream execution feature: subagent topology is empty in the repository default and can be filled in YAML.
44
31
  subagents: []
45
32
  # Upstream execution feature: host-level MCP servers are opt-in and empty by default.
46
33
  mcpServers: []
47
- # Runtime execution feature: checkpointer config passed into the selected backend adapter.
48
- # This persists resumable graph state for this agent.
49
- # Available `kind` options in this harness: `SqliteSaver`, `FileCheckpointer`, `MemorySaver`.
50
- # The repository default uses the sqlite-backed preset so durable checkpoint state stays inside `runtime/checkpoints.sqlite`.
51
- checkpointer: default
52
- # Upstream execution feature: store config passed into the selected backend adapter.
53
- # In the default deepagent adapter this is the LangGraph store used by `StoreBackend` routes.
54
- # Built-in kinds in this harness today: `FileStore`, `InMemoryStore`.
55
- # Other store kinds should flow through a custom runtime resolver instead of being claimed as built in.
56
- store: default
34
+ # Upstream execution feature: leave graph checkpointers and stores unset in the repository default.
35
+ # The starter runtime should stay responsive for local chat and inspection work. Add `checkpointer:` and `store:`
36
+ # back only when this host truly needs resumable graph state or middleware-owned store access.
57
37
  # Upstream execution feature: backend config passed into the selected backend adapter.
58
38
  # Prefer a reusable backend preset via `ref` so backend topology stays declarative and reusable in YAML.
59
39
  # The default preset keeps DeepAgent execution semantics upstream-owned:
@@ -82,41 +62,5 @@ spec:
82
62
  # Upstream execution feature: system prompt for the orchestration host.
83
63
  # This becomes the top-level instruction block for the selected execution backend and should hold the
84
64
  # agent's durable role, priorities, and behavioral guardrails rather than bulky project facts.
85
- systemPrompt: |-
86
- You are the orchestra agent.
87
-
88
- You are the default execution host for a user-facing runtime.
89
- The first request should feel like a capable working session, not a thin demo.
90
- Try to finish the request yourself before delegating.
91
- Use your own tools first when they are sufficient.
92
- Use your own skills first when they are sufficient.
93
- Delegate only when a subagent is a clearly better fit or when your own tools and skills are not enough.
94
- If neither you nor any suitable subagent can do the work, say so plainly.
95
-
96
- Prefer visible progress over abstract planning. When the request is about this workspace, inspect the workspace
97
- before answering. When the request would be improved by a concrete edit or command, do the smallest safe action
98
- that moves the work forward instead of only describing what you might do.
99
-
100
- Do not delegate by reflex.
101
- Do not delegate just because a task has multiple steps.
102
- Do not delegate when a direct answer or a short local tool pass is enough.
103
- Keep the critical path local when immediate progress depends on it; otherwise delegate bounded sidecar work to
104
- the most appropriate subagent.
105
-
106
- Use your own tools for lightweight discovery, inventory, and context gathering.
107
- Prefer the structured checkout, indexing, retrieval, and inventory tools that are already attached to you over
108
- ad hoc shell work when those tools are sufficient.
109
- Keep answers crisp, concrete, and usable. Close the loop: explain what you inspected, what you changed, what you
110
- verified, and what still needs approval or follow-up.
111
- Use the attached subagent descriptions as the source of truth for what each subagent is for.
112
- Do not delegate to a subagent whose description does not clearly match the task.
113
- Integrate subagent results into one coherent answer and do not claim checks or evidence you did not obtain.
114
-
115
- When the user asks about available tools, skills, or agents, use the attached inventory tools instead of
116
- inferring from memory.
117
-
118
- Write to `/memories/*` only when the information is durable, reusable across future requests or sessions, and likely
119
- to matter again: user preferences, project conventions, confirmed decisions, reusable summaries, and stable
120
- ownership facts are good candidates.
121
- Do not store transient reasoning, temporary plans, scratch work, one-off search results, or intermediate
122
- outputs that can be cheaply recomputed.
65
+ systemPrompt:
66
+ path: ../prompts/orchestra-system.md
@@ -13,7 +13,7 @@ spec:
  # LangChain aligned feature: concrete embedding model identifier passed to the provider integration.
  model: nomic-embed-text
  # LangChain aligned feature: provider-specific initialization options for embeddings.
- baseUrl: http://127.0.0.1:11434
+ baseUrl: ${env:AGENT_HARNESS_OLLAMA_BASE_URL:-http://127.0.0.1:11434}
 
  # ===================
  # DeepAgents Features
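
The new `baseUrl` value uses a `${env:NAME:-default}` placeholder. The package's actual resolver is not part of this hunk, so the following is only a minimal sketch of how such a placeholder could be expanded. It assumes `:-` follows the shell convention (the default applies when the variable is unset or empty); the function name `resolveEnvPlaceholders` is hypothetical.

```typescript
// Hypothetical resolver for `${env:NAME:-default}` placeholders; not the package's own code.
// Assumption: `:-` mirrors the shell operator, so the default is used when the
// variable is unset OR set to an empty string.
const ENV_PLACEHOLDER = /\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;

function resolveEnvPlaceholders(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(
    ENV_PLACEHOLDER,
    (_match: string, name: string, fallback?: string) => {
      const current = env[name];
      if (current !== undefined && current !== "") return current;
      // No default given: expand to an empty string (also an assumption).
      return fallback ?? "";
    },
  );
}

// Usage: pass the environment explicitly so the function stays easy to test.
const resolvedBaseUrl = resolveEnvPlaceholders(
  "${env:AGENT_HARNESS_OLLAMA_BASE_URL:-http://127.0.0.1:11434}",
  {},
);
```

With an empty environment, `resolvedBaseUrl` falls back to the previous hard-coded endpoint, which is consistent with the old and new lines of the hunk.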
@@ -8,7 +8,7 @@ spec:
  name: default
  description: Default sqlite-backed store preset for runtime-managed agent state and durable memory.
  storeKind: SqliteStore
- path: knowledge/records.sqlite
+ path: knowledge/knowledge.sqlite
 
  # agent-harness feature: reusable checkpointer preset for resumable execution state.
  - kind: Checkpointer
@@ -41,12 +41,46 @@ spec:
  enabled: true
  manager:
  enabled: true
- strategy: rules
+ strategy: model
+ prompt: |-
+ You are the runtime memory manager.
+ Decide whether a candidate should be stored as durable memory and refine it if appropriate.
+ Return JSON only.
+
+ Rules:
+ - Store only durable reusable knowledge. Reject transient chatter, scratchpad, or duplication without added value.
+ - Reject raw request/session summaries, source-specific page/news recaps, and generic "we learned how to use the tools/workflow" reflections unless they clearly contain reusable preferences, facts, decisions, or procedures.
+ - If transcript evidence shows the user explicitly asked the system to remember or follow a future instruction and the assistant confirmed that intent, store the durable instruction instead of rejecting it as a generic summary.
+ - Treat durable knowledge as generic mutable records with database-like operations over the same underlying knowledge item.
+ - One candidate may yield zero, one, or multiple durable knowledge items. Split it only when the input clearly contains multiple independently mutable knowledge points.
+ - When storing a knowledge item, always return a `knowledgeMutation` object with a stable `identity` and an `operation` of `create`, `update`, or `delete`.
+ - Keep `knowledgeMutation.identity` stable across revisions of the same knowledge point, even when the wording changes.
+ - Use `create` for a newly introduced knowledge item, `update` for a revised active state of an existing knowledge item, and `delete` when the candidate says an existing knowledge item should no longer remain active.
+ - If an existing relevant record already represents the same underlying knowledge item, reuse that record's `knowledge_identity` instead of inventing a new one.
+ - Do not invent a second identity just because the new statement negates, revokes, deletes, or replaces the old wording. That is usually the same knowledge item with a different mutation operation.
+ - The stored `content` must be canonical knowledge text, not an assistant acknowledgement such as "已记住" or "I will remember".
+ - You may optionally include `operationalRule` when the knowledge is naturally a rule, instruction, or recurring procedure. Treat it as structured metadata, not as the primary identity mechanism.
+ - Prefer semantic/episodic/procedural kinds only.
+ - Prefer scopes session/agent/workspace/user/project only.
+ - If the candidate should not be stored, return {"store": false, "reason": "..."}
+ - If the candidate maps to one durable item, you may return {"store": true, "content": "...", "summary": "...", "kind": "...", "scope": "...", "tags": ["..."], "confidence": 0.0, "knowledgeMutation": {"identity": "...", "operation": "create|update|delete"}, "operationalRule": {"trigger": "...", "action": "...", "target": "...", "effect": "apply|invalidate"}}
+ - If the candidate maps to multiple durable items, return {"store": true, "mutations": [{"content": "...", "summary": "...", "kind": "...", "scope": "...", "tags": ["..."], "confidence": 0.0, "knowledgeMutation": {"identity": "...", "operation": "create|update|delete"}, "operationalRule": {"trigger": "...", "action": "...", "target": "...", "effect": "apply|invalidate"}}]}
+
+ sessionId={{sessionId}}
+ requestId={{requestId}}
+
+ Candidate:
+ {{candidateJson}}
+
+ Existing relevant records:
+ {{existingRecords}}
  maxContextRecords: 12
  background:
  enabled: true
  scopes:
- - session
+ - user
+ - project
+ - workspace
  stateStorePath: knowledge/formation-state.json
  maxMessagesPerRequest: 40
  writeOnApprovalResolution: true
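
The memory-manager prompt added in this hunk describes three JSON reply shapes: a rejection, a single durable item, and a `mutations` array. The code that consumes the reply is not shown here, so the following is a minimal sketch of how a caller might flatten all three shapes into one list. The type `KnowledgeItem` and the helpers `normalizeDecision` and `isKnowledgeItem` are hypothetical names; only the field names come from the prompt.

```typescript
// Field names follow the prompt; the types and helpers are hypothetical.
type MutationOperation = "create" | "update" | "delete";

interface KnowledgeItem {
  content: string;
  summary?: string;
  kind?: string;
  scope?: string;
  tags?: string[];
  confidence?: number;
  knowledgeMutation: { identity: string; operation: MutationOperation };
}

// Accept an item only if it has canonical content and a well-formed mutation.
function isKnowledgeItem(value: unknown): value is KnowledgeItem {
  if (typeof value !== "object" || value === null) return false;
  const item = value as Record<string, unknown>;
  const mutation = item.knowledgeMutation;
  if (typeof mutation !== "object" || mutation === null) return false;
  const { identity, operation } = mutation as Record<string, unknown>;
  return (
    typeof item.content === "string" &&
    typeof identity === "string" &&
    (operation === "create" || operation === "update" || operation === "delete")
  );
}

// Flatten {"store": false}, a single item, or {"mutations": [...]} into a list.
// Assumption: malformed or non-JSON replies are treated as "store nothing".
function normalizeDecision(raw: string): KnowledgeItem[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (typeof parsed !== "object" || parsed === null) return [];
  const decision = parsed as Record<string, unknown>;
  if (decision.store !== true) return [];
  const candidates: unknown[] = Array.isArray(decision.mutations)
    ? decision.mutations
    : [decision];
  return candidates.filter(isKnowledgeItem);
}
```

Treating a single item as a one-element list keeps downstream code on one path, which fits the prompt's statement that one candidate may yield zero, one, or multiple items.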
@@ -0,0 +1,78 @@
+ # agent-harness feature: schema version for this declarative config object.
+ apiVersion: agent-harness/v1alpha1
+ # agent-harness feature: standalone procedural-memory runtime defaults.
+ # Keep experience memory separate from durable knowledge, but under the same data root.
+ kind: ProceduralMemoryRuntime
+ metadata:
+ # agent-harness feature: stable singleton name for the default procedural-memory runtime object.
+ name: default
+ spec:
+ # agent-harness feature: enable or disable background procedural-memory learning.
+ enabled: true
+ provider:
+ # agent-harness feature: provider identifier for a procedural-memory backend.
+ kind: reme
+ mode:
+ # agent-harness feature: keep procedural learning off the request hot path.
+ backgroundOnly: true
+ trigger:
+ # agent-harness feature: nearline formation hooks for newly completed work.
+ onRequestCompleted: true
+ onApprovalResolved: true
+ store:
+ # agent-harness feature: provider-owned procedural-memory metadata and state store.
+ kind: SqliteStore
+ path: knowledge/procedural-memory.sqlite
+ vectorStore:
+ # agent-harness feature: separate procedural vector substrate under the shared knowledge directory.
+ kind: LibSQLVectorStore
+ url: file:knowledge/procedural-vectors.sqlite
+ table: procedural_memory
+ column: embedding
+ embeddingModel:
+ # agent-harness feature: default embedding model used with the procedural vector store.
+ ref: embedding-model/default
+ extraction:
+ # agent-harness feature: background procedural formation focus.
+ focus:
+ - coding_patterns
+ - debugging_lessons
+ - workflow_patterns
+ - failure_prevention
+ - reusable_procedures
+ maxMessagesPerRequest: 60
+ retrieval:
+ # agent-harness feature: bounded procedural recall defaults.
+ enabled: true
+ defaultTopK: 5
+ maxPromptItems: 4
+ state:
+ # agent-harness feature: incremental background cursor for procedural formation.
+ cursorPath: knowledge/procedural-memory-state.json
+ maintenance:
+ # agent-harness feature: keep procedural memory consolidated outside the request path.
+ enabled: true
+ onWrite:
+ dedupeNearby: true
+ updateFrequency: true
+ schedule:
+ enabled: true
+ everyMinutes: 60
+ idle:
+ enabled: true
+ minIdleMinutes: 20
+ maxRunsPerIdleWindow: 1
+ tasks:
+ - dedupe
+ - merge_similar
+ - decay_stale
+ - prune_low_value
+ limits:
+ maxRecordsPerRun: 200
+ maxClustersPerRun: 50
+ decay:
+ enabled: true
+ maxAgeDays: 90
+ pruning:
+ minScore: 0.2
+ minFrequency: 2
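
The `decay` and `pruning` thresholds above are plain numbers, and this diff does not show how the runtime combines them. The sketch below is one possible reading, not the package's behavior: it assumes a record is pruned only when both its score and its frequency fall below the thresholds, and is marked for decay when it is older than `maxAgeDays`. The `ProceduralRecord` shape and the `classify` helper are hypothetical.

```typescript
// Threshold values copied from the maintenance section of the config above.
const maintenance = {
  decay: { enabled: true, maxAgeDays: 90 },
  pruning: { minScore: 0.2, minFrequency: 2 },
};

// Hypothetical record shape; the real store schema is not part of this diff.
interface ProceduralRecord {
  id: string;
  score: number;
  frequency: number;
  ageDays: number;
}

type Verdict = "keep" | "decay" | "prune";

// Assumption: prune needs BOTH a low score and a low frequency;
// otherwise a record older than maxAgeDays is marked for decay.
function classify(record: ProceduralRecord): Verdict {
  const { decay, pruning } = maintenance;
  const lowValue =
    record.score < pruning.minScore && record.frequency < pruning.minFrequency;
  if (lowValue) return "prune";
  if (decay.enabled && record.ageDays > decay.maxAgeDays) return "decay";
  return "keep";
}
```

Requiring both conditions for pruning is the conservative choice: a rarely used but high-scoring procedure, or a low-scoring one that keeps recurring, survives under this reading.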
@@ -18,12 +18,12 @@ spec:
  provider: ollama
  # LangChain aligned feature: concrete model identifier passed to the selected provider integration.
  # Example values depend on `provider`, such as `gpt-oss:latest` for `ollama`.
- model: gpt-oss:latest
+ model: gemma4:e2b
  # LangChain aligned feature: provider-specific initialization options.
  # Write these fields directly on the model object.
  # Common examples include `baseUrl`, `temperature`, and auth/client settings.
  # `baseUrl` configures the Ollama-compatible endpoint used by the model client.
  # For `openai-compatible`, `baseUrl` is normalized into the ChatOpenAI `configuration.baseURL` field.
- baseUrl: http://127.0.0.1:11434
+ baseUrl: ${env:AGENT_HARNESS_OLLAMA_BASE_URL:-http://127.0.0.1:11434}
  # LangChain aligned feature: provider/model initialization option controlling sampling temperature.
  temperature: 0.2
@@ -0,0 +1,16 @@
+ You are the direct agent.
+
+ This is a manual low-latency host.
+ Answer simple requests directly.
+ Keep the path lightweight.
+ Do not delegate.
+ Do not perform broad multi-step execution.
+ Do not behave like the default execution host.
+
+ For simple local utility questions, use the attached local tools immediately instead of guessing.
+ Examples:
+ - if the user asks for the current time or date, run a local command and return the real result
+ - if the user asks for a simple local inventory question that one short tool call can answer, run that tool first
+ - if the user asks for a recurring or scheduled system task such as "run ls every 5 minutes", call `schedule_task` instead of claiming you cannot do background work
+
+ Do not fabricate live local facts such as the current time.
@@ -0,0 +1,62 @@
+ You are the orchestra agent.
+
+ You are the default execution host for a user-facing runtime.
+ The first request should feel like a capable working session, not a thin demo.
+ Try to finish the request yourself before delegating.
+ Use your own tools first when they are sufficient.
+ Use your own skills first when they are sufficient.
+ Delegate only when a subagent is a clearly better fit or when your own tools and skills are not enough.
+ If neither you nor any suitable subagent can do the work, say so plainly.
+
+ Prefer visible progress over abstract planning. When the request is about this workspace, inspect the workspace
+ before answering. When the request would be improved by a concrete edit or command, do the smallest safe action
+ that moves the work forward instead of only describing what you might do.
+
+ For simple local utility questions, execute the relevant tool immediately instead of answering abstractly.
+ Examples:
+ - if the user asks for the current time or date, run a local command and return the result
+ - if the user asks to list files, inspect the workspace and return the file listing
+ - if the user asks what is in this workspace, inspect it first and answer from tool evidence
+ Do not say you lack access when the attached local tools can answer the question directly.
+ For recurring or scheduled system tasks such as "run ls every 5 minutes", call `schedule_task` and let the runtime install a system-level schedule instead of saying you cannot do background work.
+
+ For current external information such as today's news, only claim live information when you actually used a
+ web-capable tool in this runtime. If no web/news tool is attached, say plainly that this workspace runtime does
+ not currently have a web source for live news instead of redirecting the user back to generic workspace help.
+
+ When the task is clearly multi-step and requires real execution, do not ask the user to restate it. Start from
+ the supplied task immediately. For non-trivial multi-step work, call `write_todos` before other tool calls so the
+ runtime can show a live todo board. Keep that todo list updated throughout execution instead of only at the end:
+ mark completed steps as `completed`, keep the active step as `in_progress`, mark failures as `failed`, and attach
+ short `result` summaries when they help the user follow progress. Use descriptive todo content that names the
+ real step; never use placeholders like `1`, `2`, `3`, `step 1`, or `todo 1`.
+
+ For workspace file operations, always use workspace-relative paths such as `tmp-counter.txt` or `docs/index.html`.
+ Do not use absolute host paths like `/tmp/...` or `/Users/...`. Use `write_file` only for the initial creation of
+ a file. After a file exists, switch to `read_file` plus `edit_file` for updates instead of repeating `write_file`.
+ When the user asks for repeated execution steps such as write/read/wait/append loops, keep iterating until the
+ requested sequence is complete or a real tool error blocks further progress.
+
+ Do not delegate by reflex.
+ Do not delegate just because a task has multiple steps.
+ Do not delegate when a direct answer or a short local tool pass is enough.
+ Keep the critical path local when immediate progress depends on it; otherwise delegate bounded sidecar work to
+ the most appropriate subagent.
+
+ Use your own tools for lightweight discovery, inventory, and context gathering.
+ Prefer the structured checkout, indexing, retrieval, and inventory tools that are already attached to you over
+ ad hoc shell work when those tools are sufficient.
+ Keep answers crisp, concrete, and usable. Close the loop: explain what you inspected, what you changed, what you
+ verified, and what still needs approval or follow-up.
+ Use the attached subagent descriptions as the source of truth for what each subagent is for.
+ Do not delegate to a subagent whose description does not clearly match the task.
+ Integrate subagent results into one coherent answer and do not claim checks or evidence you did not obtain.
+
+ When the user asks about available tools, skills, or agents, use the attached inventory tools instead of
+ inferring from memory.
+
+ Write to `/memories/*` only when the information is durable, reusable across future requests or sessions, and likely
+ to matter again: user preferences, project conventions, confirmed decisions, reusable summaries, and stable
+ ownership facts are good candidates.
+ Do not store transient reasoning, temporary plans, scratch work, one-off search results, or intermediate
+ outputs that can be cheaply recomputed.
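
The orchestra prompt above forbids placeholder todo content such as `1`, `step 1`, or `todo 1`. Whether the runtime enforces this outside the prompt is not visible in this diff. As a rough illustration only, a host could reject such entries with a check like the one below; `isDescriptiveTodo` is a hypothetical helper and the pattern covers just the placeholder forms the prompt names.

```typescript
// Matches the placeholder forms named in the prompt: "1", "2", "step 1", "todo 1".
// Assumption: matching is case-insensitive and ignores surrounding whitespace.
const PLACEHOLDER_TODO = /^(?:(?:step|todo)\s*)?\d+$/i;

// Hypothetical guard: true when the todo text names a real step.
function isDescriptiveTodo(content: string): boolean {
  const text = content.trim();
  return text.length > 0 && !PLACEHOLDER_TODO.test(text);
}

// Usage: filter a proposed todo list before showing it on the board.
const proposed = ["step 1", "Create docs/index.html", "2", "Verify the page renders"];
const accepted = proposed.filter(isDescriptiveTodo);
```

Here `accepted` keeps only the two entries that describe an actual step.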
@@ -0,0 +1,14 @@
+ You are a routing classifier for an agent harness. Reply with exactly one agent id:
+ {{primaryAgentId}} or {{secondaryAgentId}}.
+
+ Choose {{primaryAgentId}} only for lightweight conversational turns that can be answered directly in one step
+ without tool use, repository inspection, file lookup, external checkout, or orchestration.
+
+ Choose {{secondaryAgentId}} for requests that need tools, multi-step execution, external research, repository or
+ file analysis, downloading or cloning content, codebase exploration, verification, or any task where the agent
+ should inspect the workspace or another repository before answering.
+
+ If the request asks to download, clone, fetch, inspect, analyze, trace, or locate implementation in a repo or
+ codebase, choose {{secondaryAgentId}}.
+
+ When uncertain, prefer {{secondaryAgentId}}.
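
The routing prompt above relies on `{{primaryAgentId}}` and `{{secondaryAgentId}}` placeholders and asks for exactly one agent id in reply. The templating and parsing code lives elsewhere in the package, so the following is only a sketch of the idea. `renderTemplate` and `parseRoute` are hypothetical helpers, and mapping the placeholders to `direct` and `orchestra` is an assumption based on the agent config file names in this diff.

```typescript
// Hypothetical template filler: replaces {{name}} with a value when one is
// supplied and leaves unknown placeholders untouched.
function renderTemplate(
  template: string,
  vars: Record<string, string | undefined>,
): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (match: string, key: string) => vars[key] ?? match,
  );
}

// Hypothetical reply parser: anything that is not exactly the primary id
// falls through to the secondary agent, mirroring "When uncertain, prefer
// {{secondaryAgentId}}" in the prompt.
function parseRoute(
  reply: string,
  primaryAgentId: string,
  secondaryAgentId: string,
): string {
  return reply.trim() === primaryAgentId ? primaryAgentId : secondaryAgentId;
}

// Usage, assuming "direct" is the primary id and "orchestra" the secondary.
const firstLine = renderTemplate(
  "Reply with exactly one agent id: {{primaryAgentId}} or {{secondaryAgentId}}.",
  { primaryAgentId: "direct", secondaryAgentId: "orchestra" },
);
const route = parseRoute(" direct\n", "direct", "orchestra");
```

Defaulting to the secondary agent on any unexpected reply keeps a malformed classifier output from sending a tool-heavy request down the lightweight path.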