@agent-native/core 0.52.0 → 0.54.0

Files changed (267)
  1. package/README.md +41 -95
  2. package/blueprints/action/crud.md +98 -0
  3. package/blueprints/channel/discord.md +74 -0
  4. package/blueprints/provider/stripe.md +87 -0
  5. package/blueprints/sandbox/docker.md +78 -0
  6. package/dist/action.d.ts +64 -1
  7. package/dist/action.d.ts.map +1 -1
  8. package/dist/action.js +73 -2
  9. package/dist/action.js.map +1 -1
  10. package/dist/agent/index.d.ts +1 -0
  11. package/dist/agent/index.d.ts.map +1 -1
  12. package/dist/agent/index.js +1 -0
  13. package/dist/agent/index.js.map +1 -1
  14. package/dist/agent/observational-memory/compactor.d.ts +43 -0
  15. package/dist/agent/observational-memory/compactor.d.ts.map +1 -0
  16. package/dist/agent/observational-memory/compactor.js +50 -0
  17. package/dist/agent/observational-memory/compactor.js.map +1 -0
  18. package/dist/agent/observational-memory/config.d.ts +37 -0
  19. package/dist/agent/observational-memory/config.d.ts.map +1 -0
  20. package/dist/agent/observational-memory/config.js +48 -0
  21. package/dist/agent/observational-memory/config.js.map +1 -0
  22. package/dist/agent/observational-memory/index.d.ts +26 -0
  23. package/dist/agent/observational-memory/index.d.ts.map +1 -0
  24. package/dist/agent/observational-memory/index.js +25 -0
  25. package/dist/agent/observational-memory/index.js.map +1 -0
  26. package/dist/agent/observational-memory/internal-run.d.ts +37 -0
  27. package/dist/agent/observational-memory/internal-run.d.ts.map +1 -0
  28. package/dist/agent/observational-memory/internal-run.js +59 -0
  29. package/dist/agent/observational-memory/internal-run.js.map +1 -0
  30. package/dist/agent/observational-memory/message-text.d.ts +13 -0
  31. package/dist/agent/observational-memory/message-text.d.ts.map +1 -0
  32. package/dist/agent/observational-memory/message-text.js +46 -0
  33. package/dist/agent/observational-memory/message-text.js.map +1 -0
  34. package/dist/agent/observational-memory/migrations.d.ts +13 -0
  35. package/dist/agent/observational-memory/migrations.d.ts.map +1 -0
  36. package/dist/agent/observational-memory/migrations.js +43 -0
  37. package/dist/agent/observational-memory/migrations.js.map +1 -0
  38. package/dist/agent/observational-memory/observer.d.ts +37 -0
  39. package/dist/agent/observational-memory/observer.d.ts.map +1 -0
  40. package/dist/agent/observational-memory/observer.js +82 -0
  41. package/dist/agent/observational-memory/observer.js.map +1 -0
  42. package/dist/agent/observational-memory/plugin.d.ts +16 -0
  43. package/dist/agent/observational-memory/plugin.d.ts.map +1 -0
  44. package/dist/agent/observational-memory/plugin.js +26 -0
  45. package/dist/agent/observational-memory/plugin.js.map +1 -0
  46. package/dist/agent/observational-memory/prompts.d.ts +27 -0
  47. package/dist/agent/observational-memory/prompts.d.ts.map +1 -0
  48. package/dist/agent/observational-memory/prompts.js +42 -0
  49. package/dist/agent/observational-memory/prompts.js.map +1 -0
  50. package/dist/agent/observational-memory/read.d.ts +45 -0
  51. package/dist/agent/observational-memory/read.d.ts.map +1 -0
  52. package/dist/agent/observational-memory/read.js +97 -0
  53. package/dist/agent/observational-memory/read.js.map +1 -0
  54. package/dist/agent/observational-memory/reflector.d.ts +31 -0
  55. package/dist/agent/observational-memory/reflector.d.ts.map +1 -0
  56. package/dist/agent/observational-memory/reflector.js +76 -0
  57. package/dist/agent/observational-memory/reflector.js.map +1 -0
  58. package/dist/agent/observational-memory/schema.d.ts +267 -0
  59. package/dist/agent/observational-memory/schema.d.ts.map +1 -0
  60. package/dist/agent/observational-memory/schema.js +48 -0
  61. package/dist/agent/observational-memory/schema.js.map +1 -0
  62. package/dist/agent/observational-memory/store.d.ts +52 -0
  63. package/dist/agent/observational-memory/store.d.ts.map +1 -0
  64. package/dist/agent/observational-memory/store.js +197 -0
  65. package/dist/agent/observational-memory/store.js.map +1 -0
  66. package/dist/agent/observational-memory/types.d.ts +61 -0
  67. package/dist/agent/observational-memory/types.d.ts.map +1 -0
  68. package/dist/agent/observational-memory/types.js +9 -0
  69. package/dist/agent/observational-memory/types.js.map +1 -0
  70. package/dist/agent/processors.d.ts +146 -0
  71. package/dist/agent/processors.d.ts.map +1 -0
  72. package/dist/agent/processors.js +122 -0
  73. package/dist/agent/processors.js.map +1 -0
  74. package/dist/agent/production-agent.d.ts +25 -0
  75. package/dist/agent/production-agent.d.ts.map +1 -1
  76. package/dist/agent/production-agent.js +341 -1
  77. package/dist/agent/production-agent.js.map +1 -1
  78. package/dist/agent/run-loop-with-resume.d.ts.map +1 -1
  79. package/dist/agent/run-loop-with-resume.js +48 -0
  80. package/dist/agent/run-loop-with-resume.js.map +1 -1
  81. package/dist/agent/run-store.d.ts +17 -0
  82. package/dist/agent/run-store.d.ts.map +1 -1
  83. package/dist/agent/run-store.js +55 -0
  84. package/dist/agent/run-store.js.map +1 -1
  85. package/dist/agent/runtime-context.d.ts +30 -0
  86. package/dist/agent/runtime-context.d.ts.map +1 -1
  87. package/dist/agent/runtime-context.js +54 -1
  88. package/dist/agent/runtime-context.js.map +1 -1
  89. package/dist/agent/tool-call-journal.d.ts +99 -0
  90. package/dist/agent/tool-call-journal.d.ts.map +1 -0
  91. package/dist/agent/tool-call-journal.js +212 -0
  92. package/dist/agent/tool-call-journal.js.map +1 -0
  93. package/dist/agent/types.d.ts +35 -0
  94. package/dist/agent/types.d.ts.map +1 -1
  95. package/dist/agent/types.js.map +1 -1
  96. package/dist/cli/add.d.ts +109 -0
  97. package/dist/cli/add.d.ts.map +1 -0
  98. package/dist/cli/add.js +352 -0
  99. package/dist/cli/add.js.map +1 -0
  100. package/dist/cli/connect.d.ts +2 -2
  101. package/dist/cli/connect.d.ts.map +1 -1
  102. package/dist/cli/connect.js +92 -24
  103. package/dist/cli/connect.js.map +1 -1
  104. package/dist/cli/eval.d.ts +17 -0
  105. package/dist/cli/eval.d.ts.map +1 -0
  106. package/dist/cli/eval.js +121 -0
  107. package/dist/cli/eval.js.map +1 -0
  108. package/dist/cli/index.js +44 -3
  109. package/dist/cli/index.js.map +1 -1
  110. package/dist/cli/mcp.d.ts.map +1 -1
  111. package/dist/cli/mcp.js +11 -5
  112. package/dist/cli/mcp.js.map +1 -1
  113. package/dist/cli/plan-local.d.ts +66 -5
  114. package/dist/cli/plan-local.d.ts.map +1 -1
  115. package/dist/cli/plan-local.js +622 -21
  116. package/dist/cli/plan-local.js.map +1 -1
  117. package/dist/cli/skills.d.ts +2 -2
  118. package/dist/cli/skills.d.ts.map +1 -1
  119. package/dist/cli/skills.js +108 -62
  120. package/dist/cli/skills.js.map +1 -1
  121. package/dist/client/AssistantChat.d.ts.map +1 -1
  122. package/dist/client/AssistantChat.js +118 -92
  123. package/dist/client/AssistantChat.js.map +1 -1
  124. package/dist/client/agent-chat-adapter.d.ts.map +1 -1
  125. package/dist/client/agent-chat-adapter.js +16 -0
  126. package/dist/client/agent-chat-adapter.js.map +1 -1
  127. package/dist/client/chat/tool-call-display.d.ts +20 -1
  128. package/dist/client/chat/tool-call-display.d.ts.map +1 -1
  129. package/dist/client/chat/tool-call-display.js +32 -7
  130. package/dist/client/chat/tool-call-display.js.map +1 -1
  131. package/dist/client/sse-event-processor.d.ts +13 -0
  132. package/dist/client/sse-event-processor.d.ts.map +1 -1
  133. package/dist/client/sse-event-processor.js +21 -0
  134. package/dist/client/sse-event-processor.js.map +1 -1
  135. package/dist/coding-tools/run-code.d.ts.map +1 -1
  136. package/dist/coding-tools/run-code.js +18 -2
  137. package/dist/coding-tools/run-code.js.map +1 -1
  138. package/dist/db/client.d.ts +4 -2
  139. package/dist/db/client.d.ts.map +1 -1
  140. package/dist/db/client.js +6 -4
  141. package/dist/db/client.js.map +1 -1
  142. package/dist/deploy/route-discovery.d.ts.map +1 -1
  143. package/dist/deploy/route-discovery.js +1 -0
  144. package/dist/deploy/route-discovery.js.map +1 -1
  145. package/dist/eval/agent-runner.d.ts +63 -0
  146. package/dist/eval/agent-runner.d.ts.map +1 -0
  147. package/dist/eval/agent-runner.js +142 -0
  148. package/dist/eval/agent-runner.js.map +1 -0
  149. package/dist/eval/define-eval.d.ts +29 -0
  150. package/dist/eval/define-eval.d.ts.map +1 -0
  151. package/dist/eval/define-eval.js +43 -0
  152. package/dist/eval/define-eval.js.map +1 -0
  153. package/dist/eval/index.d.ts +18 -0
  154. package/dist/eval/index.d.ts.map +1 -0
  155. package/dist/eval/index.js +17 -0
  156. package/dist/eval/index.js.map +1 -0
  157. package/dist/eval/report.d.ts +8 -0
  158. package/dist/eval/report.d.ts.map +1 -0
  159. package/dist/eval/report.js +44 -0
  160. package/dist/eval/report.js.map +1 -0
  161. package/dist/eval/runner.d.ts +67 -0
  162. package/dist/eval/runner.d.ts.map +1 -0
  163. package/dist/eval/runner.js +256 -0
  164. package/dist/eval/runner.js.map +1 -0
  165. package/dist/eval/scorer.d.ts +83 -0
  166. package/dist/eval/scorer.d.ts.map +1 -0
  167. package/dist/eval/scorer.js +195 -0
  168. package/dist/eval/scorer.js.map +1 -0
  169. package/dist/eval/types.d.ts +162 -0
  170. package/dist/eval/types.d.ts.map +1 -0
  171. package/dist/eval/types.js +20 -0
  172. package/dist/eval/types.js.map +1 -0
  173. package/dist/extensions/fetch-tool.d.ts.map +1 -1
  174. package/dist/extensions/fetch-tool.js +80 -15
  175. package/dist/extensions/fetch-tool.js.map +1 -1
  176. package/dist/extensions/web-content.d.ts +61 -0
  177. package/dist/extensions/web-content.d.ts.map +1 -0
  178. package/dist/extensions/web-content.js +468 -0
  179. package/dist/extensions/web-content.js.map +1 -0
  180. package/dist/extensions/web-search-tool.js +3 -3
  181. package/dist/extensions/web-search-tool.js.map +1 -1
  182. package/dist/mcp/build-server.d.ts.map +1 -1
  183. package/dist/mcp/build-server.js +4 -1
  184. package/dist/mcp/build-server.js.map +1 -1
  185. package/dist/observability/traces.d.ts.map +1 -1
  186. package/dist/observability/traces.js +100 -1
  187. package/dist/observability/traces.js.map +1 -1
  188. package/dist/observability/tracing.d.ts +73 -0
  189. package/dist/observability/tracing.d.ts.map +1 -0
  190. package/dist/observability/tracing.js +126 -0
  191. package/dist/observability/tracing.js.map +1 -0
  192. package/dist/onboarding/default-steps.d.ts.map +1 -1
  193. package/dist/onboarding/default-steps.js +4 -1
  194. package/dist/onboarding/default-steps.js.map +1 -1
  195. package/dist/provider-api/actions/query-staged-dataset.d.ts +1 -1
  196. package/dist/provider-api/corpus-jobs.d.ts +80 -0
  197. package/dist/provider-api/corpus-jobs.d.ts.map +1 -1
  198. package/dist/provider-api/corpus-jobs.js +219 -22
  199. package/dist/provider-api/corpus-jobs.js.map +1 -1
  200. package/dist/provider-api/index.d.ts +24 -32
  201. package/dist/provider-api/index.d.ts.map +1 -1
  202. package/dist/provider-api/index.js +28 -1
  203. package/dist/provider-api/index.js.map +1 -1
  204. package/dist/scripts/agent-engines/list-agent-engines.d.ts.map +1 -1
  205. package/dist/scripts/agent-engines/list-agent-engines.js +10 -3
  206. package/dist/scripts/agent-engines/list-agent-engines.js.map +1 -1
  207. package/dist/server/action-discovery.d.ts.map +1 -1
  208. package/dist/server/action-discovery.js +4 -0
  209. package/dist/server/action-discovery.js.map +1 -1
  210. package/dist/server/agent-chat-plugin.d.ts +9 -0
  211. package/dist/server/agent-chat-plugin.d.ts.map +1 -1
  212. package/dist/server/agent-chat-plugin.js +119 -111
  213. package/dist/server/agent-chat-plugin.js.map +1 -1
  214. package/dist/server/agent-teams.d.ts +62 -0
  215. package/dist/server/agent-teams.d.ts.map +1 -1
  216. package/dist/server/agent-teams.js +99 -2
  217. package/dist/server/agent-teams.js.map +1 -1
  218. package/dist/server/better-auth-instance.d.ts +7 -0
  219. package/dist/server/better-auth-instance.d.ts.map +1 -1
  220. package/dist/server/better-auth-instance.js +90 -0
  221. package/dist/server/better-auth-instance.js.map +1 -1
  222. package/dist/server/core-routes-plugin.d.ts.map +1 -1
  223. package/dist/server/core-routes-plugin.js +7 -4
  224. package/dist/server/core-routes-plugin.js.map +1 -1
  225. package/dist/server/credential-provider.d.ts.map +1 -1
  226. package/dist/server/credential-provider.js +2 -0
  227. package/dist/server/credential-provider.js.map +1 -1
  228. package/dist/server/deep-link.d.ts +7 -0
  229. package/dist/server/deep-link.d.ts.map +1 -1
  230. package/dist/server/deep-link.js +13 -2
  231. package/dist/server/deep-link.js.map +1 -1
  232. package/dist/server/framework-request-handler.d.ts.map +1 -1
  233. package/dist/server/framework-request-handler.js +33 -1
  234. package/dist/server/framework-request-handler.js.map +1 -1
  235. package/dist/server/index.d.ts +2 -1
  236. package/dist/server/index.d.ts.map +1 -1
  237. package/dist/server/index.js +2 -1
  238. package/dist/server/index.js.map +1 -1
  239. package/dist/templates/default/.agents/skills/actions/SKILL.md +52 -1
  240. package/dist/templates/default/.agents/skills/security/SKILL.md +22 -0
  241. package/dist/templates/workspace-core/.agents/skills/actions/SKILL.md +52 -1
  242. package/dist/templates/workspace-core/.agents/skills/external-agents/SKILL.md +16 -4
  243. package/dist/templates/workspace-core/.agents/skills/harness-agents/SKILL.md +20 -0
  244. package/dist/templates/workspace-core/.agents/skills/observability/SKILL.md +31 -0
  245. package/dist/templates/workspace-core/.agents/skills/security/SKILL.md +22 -0
  246. package/docs/content/actions.md +50 -0
  247. package/docs/content/agent-teams.md +32 -0
  248. package/docs/content/blueprint-installer.md +73 -0
  249. package/docs/content/durable-resume.md +49 -0
  250. package/docs/content/evals.md +141 -0
  251. package/docs/content/external-agents.md +2 -2
  252. package/docs/content/human-approval.md +101 -0
  253. package/docs/content/observability.md +21 -0
  254. package/docs/content/observational-memory.md +63 -0
  255. package/docs/content/plan-plugin.md +5 -0
  256. package/docs/content/pr-visual-recap.md +9 -5
  257. package/docs/content/processors.md +99 -0
  258. package/docs/content/sandbox-adapters.md +134 -0
  259. package/docs/content/template-plan.md +97 -21
  260. package/package.json +10 -1
  261. package/src/templates/default/.agents/skills/actions/SKILL.md +52 -1
  262. package/src/templates/default/.agents/skills/security/SKILL.md +22 -0
  263. package/src/templates/workspace-core/.agents/skills/actions/SKILL.md +52 -1
  264. package/src/templates/workspace-core/.agents/skills/external-agents/SKILL.md +16 -4
  265. package/src/templates/workspace-core/.agents/skills/harness-agents/SKILL.md +20 -0
  266. package/src/templates/workspace-core/.agents/skills/observability/SKILL.md +31 -0
  267. package/src/templates/workspace-core/.agents/skills/security/SKILL.md +22 -0
@@ -112,7 +112,10 @@ action trio instead:
  docs/spec URLs, placeholders, and examples without exposing secrets.
  - `provider-api-docs`: fetches public provider docs/spec/changelog URLs when
  the exact endpoint, filter operator, payload shape, or pagination contract is
- uncertain. Registered docs URLs are curated starting points.
+ uncertain. Registered docs URLs are curated starting points. Use
+ `responseMode: "markdown"` for clean readable docs, or
+ `responseMode: "matches"` with `search: { query | terms | regex }` for
+ compact snippets instead of flooding context with raw HTML.
  - `provider-api-request`: makes a constrained authenticated HTTP request to the
  provider host, injects configured credentials, blocks private/internal URLs,
  and redacts secrets.
@@ -151,6 +154,12 @@ pagination status, truncation, failed pages, and uncovered gaps. They must not
  turn default limits, sampled rows, truncated excerpts, or aborted calls into a
  confident "none found", "all records", or exhaustive conclusion.
 
+ For public web pages and docs, prefer the token-efficient path: `web-search`
+ to find likely URLs, `web-request` or `provider-api-docs` with clean
+ `responseMode` output to read a page, and `run-code` with `webRead()` /
+ `webFetch()` when you need to grep, aggregate, or compare many pages before
+ returning a small result.
+
  ### The `http` Option
 
  Controls how the action is exposed as an HTTP endpoint:
@@ -195,6 +204,48 @@ run: async (args) => {
  }
  ```
 
+ ### Validating Return Values (`outputSchema`)
+
+ `schema` validates inputs; `outputSchema` validates what the action **returns**. Pass any Standard Schema-compatible schema (Zod, Valibot, ArkType) and the framework validates the result _after_ `run()` resolves — input validated before `run`, output after.
+
+ ```ts
+ export default defineAction({
+   description: "Summarize a thread.",
+   schema: z.object({ threadId: z.string() }),
+   outputSchema: z.object({ summary: z.string(), messageCount: z.number() }),
+   outputErrorStrategy: "warn", // default; "strict" | "fallback"
+   // outputFallback: { summary: "", messageCount: 0 }, // used only by "fallback"
+   run: async ({ threadId }) => {
+     /* ... */
+   },
+ });
+ ```
+
+ - `"warn"` (default) — `console.warn` the issues and return the **original** result unchanged. Non-breaking.
+ - `"strict"` — throw a clear error so a buggy action surfaces loudly.
+ - `"fallback"` — return `outputFallback` in place of the invalid result.
+
+ On success the validated value is returned, so coercion/defaults on `outputSchema` apply. Omit `outputSchema` and behavior is byte-for-byte unchanged (no wrapping).
+
+ ### Human-in-the-Loop Approval (`needsApproval`)
+
+ For high-consequence, outward-facing, hard-to-undo actions (sending an email, charging a card, deleting an account), set `needsApproval` so the agent **cannot** run the action without a human approving the specific call:
+
+ ```ts
+ export default defineAction({
+   description: "Send an email via Gmail.",
+   schema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
+   needsApproval: true, // boolean, or (args, ctx) => boolean | Promise<boolean>
+   run: async (args) => {
+     /* ...actually send... */
+   },
+ });
+ ```
+
+ When the gate is truthy and the call isn't yet approved, the loop emits an `approval_required` event and **stops the turn — `run()` never executes**. A predicate gates conditionally (e.g. only external recipients) and **fails closed**: a throw is treated as "approval required". The human approves via the chat UI's Approve affordance, which re-issues the turn with the call's `approvalKey`, and only then does the action run.
+
+ **Keep approvals rare** — the default is off and almost every action should leave it off. The canonical example is Mail's `send-email` (`needsApproval: true`). See the `security` skill and the Human Approval doc.
+
  ## Frontend Hooks
 
  The frontend calls actions using React Query hooks from `@agent-native/core/client`. Components should not hand-write `fetch("/_agent-native/actions/...")`; add or reuse a client hook/helper instead. Use `callAction` from the same package for imperative cases that do not fit a hook, such as debounced search, prefetching, or non-React event handlers.
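Reviewer note on the `outputSchema` hunk: the three `outputErrorStrategy` modes can be illustrated with a small standalone sketch. Everything here is hypothetical and is not the framework's implementation: `applyOutputStrategy`, the `Validator` type, and the toy `summaryValidator` (standing in for a Standard Schema-compatible schema) are names invented for illustration only.

```typescript
// Hypothetical sketch of the warn / strict / fallback behaviour described in the diff.
// None of these names come from @agent-native/core.
type Issue = { message: string };
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; issues: Issue[] };
type Validator<T> = (value: unknown) => ValidationResult<T>;
type OutputErrorStrategy = "warn" | "strict" | "fallback";

function applyOutputStrategy<T>(
  result: unknown,
  validate: Validator<T>,
  strategy: OutputErrorStrategy = "warn",
  fallback?: T,
): unknown {
  const outcome = validate(result);
  // On success, return the validated value so coercion/defaults would apply.
  if (outcome.ok) return outcome.value;

  const summary = outcome.issues.map((issue) => issue.message).join("; ");
  switch (strategy) {
    case "strict":
      // Surface the bug loudly.
      throw new Error(`Action output failed validation: ${summary}`);
    case "fallback":
      // Substitute the configured fallback for the invalid result.
      return fallback;
    case "warn":
    default:
      // Log the issues and hand back the original result unchanged.
      console.warn(`Action output failed validation: ${summary}`);
      return result;
  }
}

// Toy validator standing in for a real schema (assumption for this sketch).
const summaryValidator: Validator<{ summary: string; messageCount: number }> = (
  value,
) => {
  const candidate = value as { summary?: unknown; messageCount?: unknown } | null;
  const summary = candidate?.summary;
  const messageCount = candidate?.messageCount;
  if (typeof summary === "string" && typeof messageCount === "number") {
    return { ok: true, value: { summary, messageCount } };
  }
  return {
    ok: false,
    issues: [{ message: "expected { summary: string, messageCount: number }" }],
  };
};
```

With this sketch, an invalid result passes through untouched under `"warn"`, is replaced under `"fallback"`, and throws under `"strict"`, which mirrors the three bullets in the hunk.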
@@ -139,6 +139,28 @@ export default defineEventHandler(async (event) => {
 
  - Never create unprotected routes that modify data.
 
+ ## Human-in-the-Loop Approval for High-Consequence Actions
+
+ For a small set of outward-facing, hard-to-undo operations — sending an email, charging a card, deleting an account, posting publicly — auth and access control are necessary but not sufficient: you also do not want the **agent** to perform them autonomously. Set `needsApproval` on the `defineAction` so the agent cannot run the action without a human approving the specific call.
+
+ ```ts
+ export default defineAction({
+   description: "Send an email via Gmail.",
+   schema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
+   needsApproval: true, // or (args, ctx) => boolean | Promise<boolean>
+   run: async (args) => {
+     /* ...actually send... */
+   },
+ });
+ ```
+
+ When the gate is truthy and the call is not yet approved, the loop emits an `approval_required` event and **stops the turn — `run()` never executes**. The human approves via the chat UI's Approve affordance, which re-issues the turn with the call's stable `approvalKey`; only then does the action run. A predicate gates conditionally (e.g. only external recipients) and **fails closed** — a throw is treated as "approval required".
+
+ Rules:
+
+ - Reach for `needsApproval` only for genuinely high-consequence operations. The default is off, and the framework intentionally keeps approvals rare — over-gating turns the agent into a click-through wizard. The canonical (and intentionally lone) framework example is Mail's `send-email`.
+ - `needsApproval` is **not** a substitute for `accessFilter` / `assertAccess` or for hiding sensitive operations from the model with `agentTool: false` / `toolCallable: false`. It is the layer for "a human must explicitly bless this specific outward-facing call," not for scoping data. See the `actions` skill for the full surface.
+
  ## Custom HTTP Routes Must Apply Access Control Themselves
 
  This is the single most-failed rule in the codebase. Auto-mounted action routes (`/_agent-native/actions/...`) get a request context wired up automatically. **Hand-written `/api/*` Nitro routes do not.** If your handler queries an ownable resource (any table with `...ownableColumns()`), you MUST:
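Reviewer note on the `needsApproval` hunk: the "fails closed" rule for predicate gates may be easier to see in code. The sketch below is an assumption-laden illustration, not the framework's loop: `requiresApproval` and `externalOnly` are hypothetical names, the `ctx` argument from the documented signature is omitted for brevity, and `@example.com` is a placeholder internal domain.

```typescript
// Hypothetical sketch of a fail-closed approval gate.
// The real signature in the diff is (args, ctx) => boolean | Promise<boolean>;
// ctx is left out here to keep the example self-contained.
type ApprovalGate<A> = boolean | ((args: A) => boolean | Promise<boolean>);

async function requiresApproval<A>(
  gate: ApprovalGate<A> | undefined,
  args: A,
): Promise<boolean> {
  if (gate === undefined) return false; // the default is off
  if (typeof gate === "boolean") return gate; // static gate
  try {
    return Boolean(await gate(args)); // conditional gate
  } catch {
    return true; // fail closed: a throwing predicate means "approval required"
  }
}

// Example predicate: gate only recipients outside a placeholder internal domain.
const externalOnly = ({ to }: { to: string }): boolean =>
  !to.endsWith("@example.com");
```

The design point is the `catch` branch: an error while deciding whether approval is needed is treated as "yes", so a buggy predicate can never silently let a high-consequence call through.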
@@ -112,7 +112,10 @@ action trio instead:
  docs/spec URLs, placeholders, and examples without exposing secrets.
  - `provider-api-docs`: fetches public provider docs/spec/changelog URLs when
  the exact endpoint, filter operator, payload shape, or pagination contract is
- uncertain. Registered docs URLs are curated starting points.
+ uncertain. Registered docs URLs are curated starting points. Use
+ `responseMode: "markdown"` for clean readable docs, or
+ `responseMode: "matches"` with `search: { query | terms | regex }` for
+ compact snippets instead of flooding context with raw HTML.
  - `provider-api-request`: makes a constrained authenticated HTTP request to the
  provider host, injects configured credentials, blocks private/internal URLs,
  and redacts secrets.
@@ -151,6 +154,12 @@ pagination status, truncation, failed pages, and uncovered gaps. They must not
  turn default limits, sampled rows, truncated excerpts, or aborted calls into a
  confident "none found", "all records", or exhaustive conclusion.
 
+ For public web pages and docs, prefer the token-efficient path: `web-search`
+ to find likely URLs, `web-request` or `provider-api-docs` with clean
+ `responseMode` output to read a page, and `run-code` with `webRead()` /
+ `webFetch()` when you need to grep, aggregate, or compare many pages before
+ returning a small result.
+
  ### The `http` Option
 
  Controls how the action is exposed as an HTTP endpoint:
@@ -195,6 +204,48 @@ run: async (args) => {
  }
  ```
 
+ ### Validating Return Values (`outputSchema`)
+
+ `schema` validates inputs; `outputSchema` validates what the action **returns**. Pass any Standard Schema-compatible schema (Zod, Valibot, ArkType) and the framework validates the result _after_ `run()` resolves — input validated before `run`, output after.
+
+ ```ts
+ export default defineAction({
+   description: "Summarize a thread.",
+   schema: z.object({ threadId: z.string() }),
+   outputSchema: z.object({ summary: z.string(), messageCount: z.number() }),
+   outputErrorStrategy: "warn", // default; "strict" | "fallback"
+   // outputFallback: { summary: "", messageCount: 0 }, // used only by "fallback"
+   run: async ({ threadId }) => {
+     /* ... */
+   },
+ });
+ ```
+
+ - `"warn"` (default) — `console.warn` the issues and return the **original** result unchanged. Non-breaking.
+ - `"strict"` — throw a clear error so a buggy action surfaces loudly.
+ - `"fallback"` — return `outputFallback` in place of the invalid result.
+
+ On success the validated value is returned, so coercion/defaults on `outputSchema` apply. Omit `outputSchema` and behavior is byte-for-byte unchanged (no wrapping).
+
+ ### Human-in-the-Loop Approval (`needsApproval`)
+
+ For high-consequence, outward-facing, hard-to-undo actions (sending an email, charging a card, deleting an account), set `needsApproval` so the agent **cannot** run the action without a human approving the specific call:
+
+ ```ts
+ export default defineAction({
+   description: "Send an email via Gmail.",
+   schema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
+   needsApproval: true, // boolean, or (args, ctx) => boolean | Promise<boolean>
+   run: async (args) => {
+     /* ...actually send... */
+   },
+ });
+ ```
+
+ When the gate is truthy and the call isn't yet approved, the loop emits an `approval_required` event and **stops the turn — `run()` never executes**. A predicate gates conditionally (e.g. only external recipients) and **fails closed**: a throw is treated as "approval required". The human approves via the chat UI's Approve affordance, which re-issues the turn with the call's `approvalKey`, and only then does the action run.
+
+ **Keep approvals rare** — the default is off and almost every action should leave it off. The canonical example is Mail's `send-email` (`needsApproval: true`). See the `security` skill and the Human Approval doc.
+
  ## Frontend Hooks
 
  The frontend calls actions using React Query hooks from `@agent-native/core/client`. Components should not hand-write `fetch("/_agent-native/actions/...")`; add or reuse a client hook/helper instead. Use `callAction` from the same package for imperative cases that do not fit a hook, such as debounced search, prefetching, or non-React event handlers.
@@ -197,7 +197,7 @@ path is obvious.
  `defineAction` accepts an optional `link` builder. When set, every MCP/A2A
  result for that tool auto-appends a markdown `[label →](absoluteUrl)` block and
  a structured `_meta["agent-native/openLink"] = { label, view, webUrl,
- desktopUrl }`; `tools/list` adds
+ desktopUrl, vscodeUrl }`; `tools/list` adds
  `annotations["agent-native/producesOpenLink"]` plus a description suffix so the
  external agent knows the tool yields an openable link.
 
@@ -285,9 +285,11 @@ ngrok/prod testing caveats are documented in
 
  `buildDeepLink(...)` returns the app-relative path
  `/_agent-native/open?app=…&view=…&<recordId>=…`. The MCP layer turns that into
- an absolute web URL (`toAbsoluteOpenUrl`, using the request origin) and a
- desktop `agentnative://open?…` URL (`toDesktopOpenUrl`). When the user clicks
- it in any browser or inline webview, `GET /_agent-native/open`
+ an absolute web URL (`toAbsoluteOpenUrl`, using the request origin), a
+ desktop `agentnative://open?…` URL (`toDesktopOpenUrl`), and a VS Code
+ extension URL (`toVsCodeOpenUrl`) for
+ `vscode://builderio.agent-native/open?url=…`. When the user clicks the web
+ link in any browser or inline webview, `GET /_agent-native/open`
  (`createOpenRouteHandler`, mounted by the core routes plugin, gated by
  `disableOpenRoute`, customizable via `resolveOpenPath`):
 
@@ -416,3 +418,13 @@ before telling the user they are unauthenticated.
  - **a2a-protocol** — the `ask-agent` meta-tool and JSON-RPC peer calls
  - **adding-a-feature** — the four-area checklist (add a `link` builder when a
  feature produces a navigable resource)
+
+ ## Blueprint installer
+
+ To add a whole new integration the agent-native way, `agent-native add <kind>
+ <name|url>` prints a curated Markdown blueprint to stdout — pipe it into the
+ external coding agent you connected (`agent-native add provider stripe |
+ claude`) and it applies the changes against the live repo. A URL emits a
+ generic research-and-integrate blueprint instead. Seeded kinds:
+ `provider` / `channel` / `sandbox` / `action`. Add your own by dropping a
+ `.md` in `packages/core/blueprints/<kind>/`. See the Blueprint Installer doc.
@@ -80,6 +80,26 @@ existing run routes as `goalId=agent-harness`.
  Preserve `defineAction` auth, request context, timeouts, truncation, and
  read-only metadata.
 
+ ## Code Execution Sandbox
+
+ - The `run-code` tool executes through a pluggable `SandboxAdapter`
+ (`packages/core/src/coding-tools/sandbox/`). The default
+ `LocalChildProcessAdapter` spawns a locked-down local Node child process;
+ swap it via `AGENT_NATIVE_SANDBOX` or `registerSandboxAdapter()` for a
+ Docker/remote/durable backend (the lever to exceed the hosted ~40s code-exec
+ ceiling). An adapter only runs the already-prepared, non-secret module source
+ — it never sees app secrets. See the Sandbox Adapters doc; `agent-native add
+ sandbox docker` emits a full Docker-adapter recipe.
+
+ ## Sub-Agent Delegation Depth
+
+ - Sub-agent spawning is capped server-side (default depth `2`) so delegation
+ chains can't fan out indefinitely. Override at deploy time with
+ `AGENT_NATIVE_MAX_SUBAGENT_DEPTH` (`0` disables sub-agents; clamped to `16`).
+ Enforcement is ambient via `evaluateSubagentDepth` in
+ `packages/core/src/server/agent-teams.ts` — independent of any tool-level
+ guard. See the Agent Teams doc for the depth model.
+
  ## Don't
 
  - Don't add Claude Code, Codex, Cursor, Mastra, or Pi as an `AgentEngine`.
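Reviewer note on the "Sub-Agent Delegation Depth" hunk: the default, disable, and clamp rules for `AGENT_NATIVE_MAX_SUBAGENT_DEPTH` could be read roughly as follows. This is a sketch under stated assumptions, not the code in `agent-teams.ts`: `resolveMaxSubagentDepth` and `canSpawnSubagent` are hypothetical names, falling back to the default on unparsable input is an assumption, and so is the depth model (top-level agent at depth `0`, spawning allowed while the current depth is below the cap).

```typescript
// Hypothetical sketch of the depth cap described in the diff.
const DEFAULT_MAX_DEPTH = 2; // documented default
const HARD_CEILING = 16; // documented clamp

// Turn the raw environment value into an effective cap.
function resolveMaxSubagentDepth(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_MAX_DEPTH;
  const parsed = Number(raw);
  // Assumption: invalid or negative values fall back to the default.
  if (!Number.isInteger(parsed) || parsed < 0) return DEFAULT_MAX_DEPTH;
  return Math.min(parsed, HARD_CEILING); // 0 stays 0, which disables sub-agents
}

// Assumption: an agent at `currentDepth` may spawn only while below the cap.
function canSpawnSubagent(currentDepth: number, maxDepth: number): boolean {
  return currentDepth < maxDepth;
}
```

A caller would pass the environment value in, for example `resolveMaxSubagentDepth(process.env.AGENT_NATIVE_MAX_SUBAGENT_DEPTH)`, and check `canSpawnSubagent` before delegating.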
@@ -75,6 +75,26 @@ const criteria: EvalCriteria = {
75
75
  };
76
76
  ```
77
77
 
+ #### Evals (CI gate)
+
+ The three layers above score *real production runs* after the fact. For an active, deterministic gate, use the first-class `*.eval.ts` primitive from `@agent-native/core/eval` (source: `packages/core/src/eval/*`). It runs the actual agent loop against fixed inputs and exits non-zero below threshold, so it gates CI/deploys.
+
+ ```ts
+ // evals/faq.eval.ts
+ import { defineEval, contains, llmJudge } from "@agent-native/core/eval";
+
+ export default defineEval({
+   name: "answers the FAQ",
+   input: { prompt: "What is your return policy?" },
+   threshold: 0.7,
+   scorers: [contains("30 days"), llmJudge({ criteria: "accuracy" })],
+ });
+ ```
+
+ - Built-in scorers: `exactMatch` / `contains` / `usesTool` (pure JS) and `llmJudge` (provider-agnostic judge).
+ - Custom scorers: `createScorer` with the 4-step `preprocess → analyze → generateScore → generateReason` pipeline (only `generateScore` is required).
+ - Run as a gate: `agent-native eval [pattern] [--json] [--threshold N]`. The command discovers `**/*.eval.ts` and `evals/*.ts`, runs the agent, and exits non-zero if any eval is below its threshold. An app with no eval files exits `0`. Complements (does not replace) the post-hoc scoring in `evals.ts`. See the Evals doc.
+
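To make the four-step scorer pipeline and the gate's exit behavior concrete, here is a self-contained sketch. It deliberately does not import `@agent-native/core/eval`: the `ScorerSteps` shape, `runScorer`, and `gateExitCode` are hypothetical stand-ins, and the real `createScorer` signature may differ. Clamping the score to the range 0 to 1 is also an assumption.

```typescript
// Hypothetical step shapes; only generateScore is required, as in the doc.
interface ScorerSteps {
  preprocess?: (output: string) => string;
  analyze?: (text: string) => Record<string, unknown>;
  generateScore: (ctx: { text: string; analysis: Record<string, unknown> }) => number;
  generateReason?: (ctx: { score: number; analysis: Record<string, unknown> }) => string;
}

interface ScoreResult {
  score: number;
  reason?: string;
}

// Run preprocess -> analyze -> generateScore -> generateReason, skipping absent steps.
function runScorer(steps: ScorerSteps, output: string): ScoreResult {
  const text = steps.preprocess ? steps.preprocess(output) : output;
  const analysis = steps.analyze ? steps.analyze(text) : {};
  const raw = steps.generateScore({ text, analysis });
  const score = Math.min(1, Math.max(0, raw)); // assumption: scores live in [0, 1]
  const reason = steps.generateReason?.({ score, analysis });
  return reason === undefined ? { score } : { score, reason };
}

// Gate rule from the doc: non-zero if any eval is below its threshold; no evals exits 0.
function gateExitCode(evals: { score: number; threshold: number }[]): number {
  return evals.some((e) => e.score < e.threshold) ? 1 : 0;
}

// Example: a scorer that checks for a phrase after lowercasing the output.
const mentionsThirtyDays: ScorerSteps = {
  preprocess: (output) => output.toLowerCase(),
  generateScore: ({ text }) => (text.includes("30 days") ? 1 : 0),
};
```

Keeping `generateScore` as the only mandatory step means a pure-JS check needs a single function, while an LLM-backed scorer can use the optional steps to prepare input and explain its verdict.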
  ### 4. Experiments
 
  A/B testing with sticky user-level assignment:
@@ -200,3 +220,14 @@ await putSetting("observability-config", {
  ```
 
  The framework emits `gen_ai.*` semantic convention spans compatible with Langfuse, Datadog, Grafana, New Relic, and any OTel-compatible backend.
+
+ ## Live OpenTelemetry Spans (Optional)
+
+ Separate from the `exporters` config above (which ships the in-house traces to an OTLP endpoint), the agent loop can also emit **live OpenTelemetry spans** for every run, model call, and tool call, so a host that already runs an OTel collector sees agent activity alongside its other distributed traces.
+
+ This layer is optional and **no-op by default**:
+
+ - `@opentelemetry/api` is an **optional dependency**. If it isn't installed, the span helpers degrade to silent no-ops; they never throw into the agent loop.
+ - Even when the API package is installed, it ships only a default no-op tracer. Spans become real only once the **host registers a `TracerProvider`** (via `@opentelemetry/sdk-node` or similar). The framework deliberately does not depend on the heavy SDK/exporter packages and never registers a provider itself; instrumentation is opt-in by the embedding app.
+
+ The loop emits `agent.run` (with `agent.run_id`, `agent.thread_id`, `agent.user_id`, `agent.model`), `tool.call` (`tool.name` + status), and `llm.call` spans, each finished with OK/ERROR status. This is purely additive to the in-house `agent_trace_spans` / `agent_trace_summaries` tables. Source: `packages/core/src/observability/tracing.ts` + `traces.ts`. See the Observability doc for the full table.
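The "optional dependency, silent no-op" behavior described in this section can be approximated with a guarded dynamic import. This is a sketch of the general pattern, not the contents of `tracing.ts`: `startSpanSafe`, `SpanLike`, and the tracer name `agent-native-sketch` are hypothetical. The only real API used is `trace.getTracer(...).startSpan(...)` from `@opentelemetry/api`.

```typescript
// Minimal span surface used by this sketch (a subset of the OTel Span interface).
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
  end(): void;
}

// Fallback used when @opentelemetry/api cannot be loaded.
const noopSpan: SpanLike = {
  setAttribute() {
    return undefined;
  },
  end() {},
};

// Try to start a real span; degrade to a no-op instead of throwing into the caller.
async function startSpanSafe(name: string): Promise<SpanLike> {
  try {
    // Typed as a plain string so the compiler does not require the package at build time.
    const moduleName: string = "@opentelemetry/api";
    const api = await import(moduleName);
    return api.trace.getTracer("agent-native-sketch").startSpan(name);
  } catch {
    return noopSpan;
  }
}
```

Even when the import succeeds, the returned span stays a no-op until the host has registered a `TracerProvider`, so callers can use `startSpanSafe` unconditionally.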
@@ -139,6 +139,28 @@ export default defineEventHandler(async (event) => {
 
  - Never create unprotected routes that modify data.
 
+ ## Human-in-the-Loop Approval for High-Consequence Actions
+
+ For a small set of outward-facing, hard-to-undo operations (sending an email, charging a card, deleting an account, posting publicly), auth and access control are necessary but not sufficient: you also do not want the **agent** to perform them autonomously. Set `needsApproval` on the `defineAction` so the agent cannot run the action without a human approving the specific call.
+
+ ```ts
+ export default defineAction({
+   description: "Send an email via Gmail.",
+   schema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
+   needsApproval: true, // or (args, ctx) => boolean | Promise<boolean>
+   run: async (args) => {
+     /* ...actually send... */
+   },
+ });
+ ```
+
+ When the gate is truthy and the call is not yet approved, the loop emits an `approval_required` event and **stops the turn, so `run()` never executes**. The human approves via the chat UI's Approve affordance, which re-issues the turn with the call's stable `approvalKey`; only then does the action run. A predicate gates conditionally (e.g. only external recipients) and **fails closed**: a throw is treated as "approval required".
+
+ Rules:
+
+ - Reach for `needsApproval` only for genuinely high-consequence operations. The default is off, and the framework intentionally keeps approvals rare, because over-gating turns the agent into a click-through wizard. The canonical (and intentionally lone) framework example is Mail's `send-email`.
+ - `needsApproval` is **not** a substitute for `accessFilter` / `assertAccess` or for hiding sensitive operations from the model with `agentTool: false` / `toolCallable: false`. It is the layer for "a human must explicitly bless this specific outward-facing call," not for scoping data. See the `actions` skill for the full surface.
+
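The gate semantics above (off by default, skipped once the specific call is approved, fail-closed on a throwing predicate) can be sketched as a small pure function. This is an illustration under assumptions, not the framework's loop: `requiresApproval`, the `approvedKeys` set, and the `example.com` domain check are hypothetical, and the real predicate also receives a `ctx` argument that this sketch omits.

```typescript
// A gate is either a fixed boolean or a predicate over the call's arguments.
type ApprovalGate<A> = boolean | ((args: A) => boolean | Promise<boolean>);

// Decide whether the loop should stop and ask a human before calling run().
async function requiresApproval<A>(
  gate: ApprovalGate<A> | undefined,
  args: A,
  approvalKey: string,
  approvedKeys: Set<string>,
): Promise<boolean> {
  if (!gate) return false; // default is off
  if (approvedKeys.has(approvalKey)) return false; // this specific call was approved
  if (typeof gate !== "function") return true; // needsApproval: true
  try {
    return Boolean(await gate(args));
  } catch {
    return true; // fail closed: a throwing predicate means "approval required"
  }
}

// Example predicate: only recipients outside a hypothetical internal domain need approval.
const externalOnly = (args: { to: string }): boolean => !args.to.endsWith("@example.com");
```

Treating a predicate error as "approval required" keeps a buggy gate from silently letting a high-consequence call through.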
  ## Custom HTTP Routes Must Apply Access Control Themselves
 
  This is the single most-failed rule in the codebase. Auto-mounted action routes (`/_agent-native/actions/...`) get a request context wired up automatically. **Hand-written `/api/*` Nitro routes do not.** If your handler queries an ownable resource (any table with `...ownableColumns()`), you MUST: