@cuylabs/agent-physical-capx 5.0.2

@@ -0,0 +1,387 @@
+ # Examples
+
+ Runnable examples for `@cuylabs/agent-physical-capx`.
+
+ These examples are the `agents-ts` client path. `capx-agent-runtime` remains
+ the harness-neutral Python service; any other agent can use the same CaP-X
+ runtime by calling its HTTP API directly.
+
+ ## Main Examples
+
+ `01-capx-runtime-solver.ts` is the default single-turn bring-your-own-agent
+ example. It creates an `agent-core` agent, wires in the CaP-X physical tools,
+ and gives the agent one user turn. Inside that turn, `agent-core` may still run
+ multiple model/tool steps, but the example prompt asks for one useful
+ Code-as-Policy action and a summary:
+
+ 1. observe the CaP-X task and rendered simulator state,
+ 2. inspect runtime turn history and the CaP-X skill library when available,
+ 3. write one Python Code-as-Policy step,
+ 4. execute it through `capx-agent-runtime`,
+ 5. observe again and summarize reward, stdout/stderr, and task completion.
+
+ `02-capx-runtime-autosolve.ts` keeps the same agent session open across several
+ user turns. After each turn, the script observes the runtime and stops when
+ CaP-X reports task completion or when `CAPX_MAX_SOLVER_TURNS` is reached. Use
+ this when you want the harness to continue attempting the task instead of
+ exiting after one scripted turn.
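
The outer loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in `02-capx-runtime-autosolve.ts`: `SolverHooks`, `runTurn`, `observeCompletion`, and `autosolve` are hypothetical names, and the real example drives an `agent-core` session rather than plain callbacks.

```typescript
// Hypothetical hooks standing in for the agent session and the runtime client.
interface SolverHooks {
  // Sends one user turn to the agent; the agent may run several model/tool steps.
  runTurn(turn: number): Promise<void>;
  // Observes the runtime and reports whether CaP-X considers the task complete.
  observeCompletion(): Promise<boolean>;
}

interface AutosolveResult {
  turnsUsed: number;
  taskCompleted: boolean;
}

// Runs at most `maxTurns` user turns and checks completion after each one.
async function autosolve(
  hooks: SolverHooks,
  maxTurns: number,
): Promise<AutosolveResult> {
  for (let turn = 1; turn <= maxTurns; turn += 1) {
    await hooks.runTurn(turn);
    if (await hooks.observeCompletion()) {
      return { turnsUsed: turn, taskCompleted: true };
    }
  }
  // Turn budget exhausted without a completion report.
  return { turnsUsed: Math.max(0, maxTurns), taskCompleted: false };
}
```

In this sketch the turn budget plays the role of `CAPX_MAX_SOLVER_TURNS`; the completion check runs after every turn so the loop can exit as soon as the runtime reports success.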
+
+ The example enables the packaged `capx-code-as-policy` agent-core skill by
+ default. That skill teaches the model how to use the CaP-X tools. It is not the
+ same as CaP-X's runtime-side Python skill library, which is exposed dynamically
+ through observation `codeContext` and programmatic runtime APIs.
+
+ ## Service-First Setup
+
+ The normal path is to start the runtime service first, usually on a Linux GPU
+ workstation, then run the TypeScript agent from your local machine or another
+ client.
+
+ Follow the runtime workstation setup first:
+
+ [capx-agent-runtime workstation setup](../../../repos/capx-agent-runtime/docs/workstation-setup.md)
+
+ The runtime server is typically started from the CaP-X checkout like this:
+
+ ```bash
+ cd /path/to/cap-x
+ uv run --no-sync --active capx-agent-runtime serve \
+   --repo-path "$(pwd)" \
+   --config-path env_configs/cube_stack/franka_robosuite_cube_stack.yaml \
+   --host 127.0.0.1 \
+   --port 8210
+ ```
+
+ That command starts the CaP-X runtime around the selected YAML config. For the
+ cube-stack config, the runtime is a Robosuite simulation. The TypeScript example
+ connects to this service and acts as the external solver agent.
+
+ You can point the runtime service at another compatible CaP-X config, for
+ example:
+
+ ```bash
+ --config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn.yaml
+ --config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn_vf.yaml
+ --config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn_vdm.yaml
+ ```
+
+ The `02-capx-runtime-autosolve.ts` example can run multiple external-agent
+ turns against any of those configs. The `multiturn` configs add CaP-X
+ continuation prompt text. The `vf` and `vdm` configs expose visual-feedback or
+ visual-differencing intent; in this bring-your-own-agent path, the agent should
+ call `capx_observe` with `includeImages=true` and do the comparison in the host
+ model/harness.
+
+ If the runtime is remote, open an SSH tunnel so your local machine can reach
+ the service:
+
+ ```bash
+ ssh -L 8210:127.0.0.1:8210 <user>@<gpu-host>
+ ```
+
+ ## Local Workspace Setup
+
+ These examples are meant to run from a local `agents-ts` checkout. The
+ `@cuylabs/agent-physical` and `@cuylabs/agent-physical-capx` packages are
+ workspace-linked here, so source changes can be tested before publishing.
+
+ From the repo root, install dependencies and build this package plus its local
+ workspace dependencies:
+
+ ```bash
+ cd /path/to/agents-ts
+ pnpm install
+ pnpm --filter @cuylabs/agent-physical-capx... build
+ ```
+
+ Use the `pnpm` already available on your machine. If `pnpm` is missing and your
+ Node install includes Corepack, you can enable it with `corepack enable`; if
+ `corepack` is not available, install `pnpm` directly with your normal Node
+ package-manager setup.
+
+ The trailing `...` is intentional. It includes the dependency chain, so
+ `@cuylabs/agent-core`, `@cuylabs/agent-physical`, and
+ `@cuylabs/agent-physical-capx` are built together.
+
+ Then configure the example environment:
+
+ ```bash
+ cd packages/agent-physical-capx
+ cp examples/.env.example examples/.env
+ ```
+
+ Both examples import `examples/_setup.ts`. You do not run `_setup.ts`
+ directly; it loads `examples/.env` and creates the OpenAI-compatible provider
+ used by `agent-core`.
+
+ Set the required values in `examples/.env`:
+
+ ```bash
+ OPENAI_API_KEY=...
+ OPENAI_MODEL=gpt-4o-mini
+
+ CAPX_RUNTIME_SERVER_URL=http://127.0.0.1:8210
+ ```
+
+ `OPENAI_BASE_URL` is optional. Leave it unset for the default OpenAI endpoint.
+ Set it only when using an OpenAI-compatible provider, for example a local
+ gateway or hosted inference endpoint.
+
+ ## Run Modes
+
+ By default, the examples are observe-only. The agent can inspect the runtime
+ state, propose policy code, and summarize what it would do, but it cannot call
+ `capx_run_policy_code`:
+
+ ```bash
+ pnpm exec tsx examples/01-capx-runtime-solver.ts
+ ```
+
+ Allow the single-turn solver to execute one Python Code-as-Policy action in
+ simulation:
+
+ ```bash
+ CAPX_ALLOW_DESTRUCTIVE=1 \
+ pnpm exec tsx examples/01-capx-runtime-solver.ts
+ ```
+
+ The startup line should include `approval=policy-code-enabled`. If it still
+ shows `approval=observe-only`, the environment variable did not reach the Node
+ process. In that case, use a single-line command:
+
+ ```bash
+ env CAPX_ALLOW_DESTRUCTIVE=1 pnpm exec tsx examples/01-capx-runtime-solver.ts
+ ```
+
+ When execution is enabled, the log should also show an approval request for
+ `capx_run_policy_code` followed by an approval resolution. If the startup line
+ shows `approval=policy-code-enabled` but the tool result still says
+ `Approval denied for capx_run_policy_code`, rebuild and rerun the local
+ workspace; older example code used a hard default deny policy before the
+ example callback could approve the tool.
+
+ Allow execution and force video recording for that policy-code turn:
+
+ ```bash
+ CAPX_ALLOW_DESTRUCTIVE=1 \
+ CAPX_POLICY_EXECUTION_RECORD_VIDEO=1 \
+ pnpm exec tsx examples/01-capx-runtime-solver.ts
+ ```
+
+ Run the multi-turn solver in observe-only mode:
+
+ ```bash
+ CAPX_MAX_SOLVER_TURNS=6 pnpm exec tsx examples/02-capx-runtime-autosolve.ts
+ ```
+
+ Allow the multi-turn solver to execute policy code:
+
+ ```bash
+ CAPX_ALLOW_DESTRUCTIVE=1 \
+ CAPX_MAX_SOLVER_TURNS=6 \
+ pnpm exec tsx examples/02-capx-runtime-autosolve.ts
+ ```
+
+ Allow multi-turn execution, force video recording for each policy-code turn,
+ allow one runtime recovery reset, and stop the runtime session on exit:
+
+ ```bash
+ CAPX_ALLOW_DESTRUCTIVE=1 \
+ CAPX_POLICY_EXECUTION_RECORD_VIDEO=1 \
+ CAPX_MAX_SOLVER_TURNS=6 \
+ CAPX_RECOVER_ON_RUNTIME_ERROR=reset \
+ CAPX_MAX_RUNTIME_RESETS=1 \
+ CAPX_STOP_ON_EXIT=1 \
+ pnpm exec tsx examples/02-capx-runtime-autosolve.ts
+ ```
+
+ For the video mode, the startup line should include
+ `approval=policy-code-enabled` and `recordVideo=1`.
+ `CAPX_STOP_ON_EXIT=1` stops the runtime session at the end of the example so
+ `capx-agent-runtime` can flush the combined session video artifact. Stopped
+ sessions still keep their artifacts available through the console and HTTP API;
+ the session may remain listed there, but the live simulator environment has
+ been stopped. This does not shut down the top-level `capx-agent-runtime serve`
+ process.
+
+ For the default Franka cube-stack config, a healthy run usually finishes after
+ one useful policy-code turn. The exact sampled poses and artifact URLs vary by
+ trial, but the important terminal lines look like this:
+
+ ```text
+ executionOk=true, taskCompleted=true, reward=1
+ terminated=true, truncated=false
+ sandboxRc=0
+ CaP-X reported completion state: taskCompleted=true terminated=true truncated=false sandboxRc=0 reward=1
+ ```
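
If you want to check the final completion line from a wrapper script, something like the following sketch may help. It assumes the line keeps the space-separated `key=value` shape shown in the sample above; `CompletionState` and `parseCompletionLine` are hypothetical names, not part of this package.

```typescript
// Fields taken from the sample completion line shown above.
interface CompletionState {
  taskCompleted: boolean;
  terminated: boolean;
  truncated: boolean;
  sandboxRc: number;
  reward: number;
}

// Collects key=value tokens from one log line; returns undefined when any
// expected field is missing or a numeric field does not parse.
function parseCompletionLine(line: string): CompletionState | undefined {
  const fields = new Map<string, string>();
  for (const rawToken of line.split(/\s+/)) {
    const token = rawToken.replace(/,$/, ""); // tolerate "key=value," tokens
    const eq = token.indexOf("=");
    if (eq > 0) {
      fields.set(token.slice(0, eq), token.slice(eq + 1));
    }
  }
  const required = ["taskCompleted", "terminated", "truncated", "sandboxRc", "reward"];
  if (!required.every((key) => fields.has(key))) {
    return undefined;
  }
  const sandboxRc = Number(fields.get("sandboxRc"));
  const reward = Number(fields.get("reward"));
  if (Number.isNaN(sandboxRc) || Number.isNaN(reward)) {
    return undefined;
  }
  return {
    taskCompleted: fields.get("taskCompleted") === "true",
    terminated: fields.get("terminated") === "true",
    truncated: fields.get("truncated") === "true",
    sandboxRc,
    reward,
  };
}
```

Lines that do not carry all five fields, such as the shorter `executionOk=...` line, yield `undefined` in this sketch instead of a partial result.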
+
+ The server log should show the same lifecycle:
+
+ ```text
+ POST /sessions ... 200 OK
+ POST /sessions/<id>/execute-code ... 200 OK
+ Saved interaction video to .../video_1.000_turn_00.mp4
+ Saved interaction video to .../video_session_combined.mp4
+ POST /sessions/<id>/stop ... 200 OK
+ ```
+
+ The `video_..._turn_00.mp4` file is the per-policy-turn recording. The
+ `video_session_combined.mp4` file is written when the session stops, which is
+ why `CAPX_STOP_ON_EXIT=1` is recommended for video examples. In the console,
+ the combined session video is shown at the top of the artifact list; per-turn
+ videos remain linked as individual artifact files.
+
+ By default, each example run writes to a unique remote CaP-X output directory
+ under `outputs/capx-agent-runtime/<agent-session-id>`. That keeps artifacts
+ from separate runs from being mixed together in the console. Set
+ `CAPX_OUTPUT_DIR` only when you intentionally want a specific remote output
+ directory.
+
+ The autosolver also stops early if CaP-X reports a persistent observation or
+ depth-rendering failure. This is different from ordinary Python policy-code
+ failure. CaP-X can ask an agent to regenerate code when `env.step(code)`
+ returns a normal `info_step` with `stderr`. But if the Robosuite observation
+ pipeline raises while collecting camera/depth observations, `env.step(code)`
+ does not return a normal result. In that state even `pass` can fail before user
+ policy code runs, so continuing to submit more code in the same session is not
+ useful.
+
+ When `CAPX_RECOVER_ON_RUNTIME_ERROR=reset` is set, the autosolver resets only
+ for runtime-level `env.step(...)` failures where CaP-X cannot return a normal
+ step result. It does not reset for ordinary policy-code `stderr`; those are
+ left to the next agent turn so the model can inspect the error and try a better
+ policy. Runtime resets use the next CaP-X trial/seed. The default reset budget
+ is one reset; set `CAPX_MAX_RUNTIME_RESETS` to change that. If
+ `CAPX_POLICY_EXECUTION_TRIAL` is unset, the first session uses trial `1`, the
+ first recovery reset uses trial `2`, and so on. `CAPX_STOP_ON_EXIT` is separate
+ from this recovery behavior: recovery reset happens during the solver loop,
+ while stop-on-exit runs once the example is done, fails, or exhausts its reset
+ budget.
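
The recovery rule above can be modeled as a small decision function. This is a sketch under stated assumptions rather than the example's actual code: `StepOutcome`, `RecoveryState`, `RecoveryOptions`, and `decideRecovery` are hypothetical names, and how the real autosolver classifies a failure is not shown here.

```typescript
// Hypothetical classification of what one policy-code turn produced.
type StepOutcome =
  | { kind: "ok" }
  | { kind: "policy-stderr"; stderr: string } // normal step result with stderr
  | { kind: "runtime-error"; message: string }; // no normal step result at all

interface RecoveryState {
  trial: number; // current CaP-X trial/seed
  resetsUsed: number;
}

interface RecoveryOptions {
  recoverOnRuntimeError: boolean; // CAPX_RECOVER_ON_RUNTIME_ERROR=reset
  maxRuntimeResets: number; // CAPX_MAX_RUNTIME_RESETS
}

interface RecoveryDecision {
  action: "continue" | "reset" | "stop";
  state: RecoveryState;
}

function decideRecovery(
  outcome: StepOutcome,
  state: RecoveryState,
  options: RecoveryOptions,
): RecoveryDecision {
  // Ordinary policy-code stderr is left to the next agent turn.
  if (outcome.kind !== "runtime-error") {
    return { action: "continue", state };
  }
  // Runtime-level failure: reset to the next trial while budget remains.
  if (options.recoverOnRuntimeError && state.resetsUsed < options.maxRuntimeResets) {
    return {
      action: "reset",
      state: { trial: state.trial + 1, resetsUsed: state.resetsUsed + 1 },
    };
  }
  // No recovery configured, or the reset budget is exhausted.
  return { action: "stop", state };
}
```

Starting from trial `1` with a budget of one reset, the first runtime-level failure moves to trial `2`, and a second one stops the loop.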
+
+ If the reset budget is exhausted, or if you are running without automatic
+ recovery, do the cleanup first, then retry with a fresh session.
+
+ If you used `CAPX_STOP_ON_EXIT=1`, the example asks the server to stop the
+ runtime session before exiting and flushes the combined session video. You can
+ then rerun the example directly.
+
+ If the session is still running, find the runtime session id and stop it:
+
+ ```bash
+ curl -sS http://127.0.0.1:8210/sessions
+ curl -X POST http://127.0.0.1:8210/sessions/<session-id>/stop
+ ```
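
The same two calls can be made from TypeScript. The sketch below mirrors the `curl` commands only; the helper names and the `FetchLike` type are hypothetical, and the response body of `GET /sessions` is returned as `unknown` because its shape is not documented here.

```typescript
// Minimal fetch-compatible signature so the helpers can be exercised without
// a live server. Node's global `fetch` can be passed as `fetchImpl`.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function sessionsUrl(baseUrl: string): string {
  // Drop trailing slashes so the path is not doubled.
  return `${baseUrl.replace(/\/+$/, "")}/sessions`;
}

function stopSessionUrl(baseUrl: string, sessionId: string): string {
  return `${sessionsUrl(baseUrl)}/${encodeURIComponent(sessionId)}/stop`;
}

// Equivalent of: curl -sS <base>/sessions
async function listSessions(baseUrl: string, fetchImpl: FetchLike): Promise<unknown> {
  const response = await fetchImpl(sessionsUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`GET /sessions failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Equivalent of: curl -X POST <base>/sessions/<session-id>/stop
async function stopSession(
  baseUrl: string,
  sessionId: string,
  fetchImpl: FetchLike,
): Promise<void> {
  const response = await fetchImpl(stopSessionUrl(baseUrl, sessionId), { method: "POST" });
  if (!response.ok) {
    throw new Error(`POST stop failed with HTTP ${response.status}`);
  }
}
```

Passing the fetch implementation as a parameter is a design choice for testability; in a script you would typically call `stopSession("http://127.0.0.1:8210", id, fetch)`.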
+
+ You can also reset an existing session, but for observation/depth assertion
+ failures a fresh session is usually clearer:
+
+ ```bash
+ curl -X POST \
+   -H 'content-type: application/json' \
+   -d '{}' \
+   http://127.0.0.1:8210/sessions/<session-id>/reset
+ ```
+
+ Then run the autosolver again:
+
+ ```bash
+ CAPX_ALLOW_DESTRUCTIVE=1 \
+ CAPX_POLICY_EXECUTION_RECORD_VIDEO=1 \
+ CAPX_MAX_SOLVER_TURNS=6 \
+ CAPX_RECOVER_ON_RUNTIME_ERROR=reset \
+ CAPX_MAX_RUNTIME_RESETS=1 \
+ CAPX_STOP_ON_EXIT=1 \
+ pnpm exec tsx examples/02-capx-runtime-autosolve.ts
+ ```
+
+ If the depth assertion repeats immediately on a clean session, restart the
+ `capx-agent-runtime serve` process too. That recreates the Python environment
+ and the child API services instead of reusing the same process state.
+
+ If you want to isolate the TypeScript adapter and `agent-core` loop from the
+ vision/depth stack, start `capx-agent-runtime` with a privileged cube-stack
+ config when available:
+
+ ```bash
+ uv run --no-sync --active capx-agent-runtime serve \
+   --repo-path "$(pwd)" \
+   --config-path env_configs/cube_stack/franka_robosuite_cube_stack_privileged.yaml \
+   --host 127.0.0.1 \
+   --port 8210
+ ```
+
+ That path avoids some vision-derived object-pose calls and is useful for
+ checking that HTTP tools, approvals, artifacts, videos, and the external agent
+ loop are wired correctly before debugging the Robosuite camera/depth pipeline.
+
+ `CAPX_ALLOW_DESTRUCTIVE=1` means "allow side-effecting CaP-X tools" in this
+ example harness. It is required for `capx_run_policy_code` because that tool
+ executes model-authored Python inside the live CaP-X runtime. For hardware
+ configs, policy execution is still blocked unless
+ `CAPX_ALLOW_HARDWARE_POLICY_EXECUTION=1` is also set.
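
The two-flag gating described above can be written as a small predicate. This is an illustrative sketch: `isPolicyExecutionAllowed` and `approvalLabel` are hypothetical names, treating only the literal value `1` as "enabled" is an assumption, and where the real harness enforces the hardware check is not shown here.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

// Returns true only when policy-code execution should be approved.
function isPolicyExecutionAllowed(env: Env, isHardwareConfig: boolean): boolean {
  // Assumption: only the literal "1" enables a flag.
  const destructiveAllowed = env["CAPX_ALLOW_DESTRUCTIVE"] === "1";
  if (!destructiveAllowed) {
    return false;
  }
  if (isHardwareConfig) {
    // Hardware configs need the second, explicit opt-in as well.
    return env["CAPX_ALLOW_HARDWARE_POLICY_EXECUTION"] === "1";
  }
  return true;
}

// Mirrors the startup-line labels quoted in this README.
function approvalLabel(env: Env): "policy-code-enabled" | "observe-only" {
  return env["CAPX_ALLOW_DESTRUCTIVE"] === "1" ? "policy-code-enabled" : "observe-only";
}
```

A script could call `isPolicyExecutionAllowed(process.env, false)` for a simulation config.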
+
+ The example always uses the live runtime path: `mode: "runtime"`,
+ `startSession: true`, `enablePolicyCodeExecution: true`,
+ and `policyExecutionMode: "live-runtime"`. The adapter does not accept
+ `repoPath` or `configPath`; those belong to the runtime server startup command.
+ That keeps the example aligned with the production architecture: the Python
+ runtime service owns the CaP-X repo/config/simulator setup, and `agent-core`
+ owns the external agent loop.
+
+ The adapter defaults to `toolExecutionMode: "plan"`. In `agent-core`, "plan"
+ means framework-owned tool dispatch, not "only produce a written plan." The
+ model still reasons normally and emits tool calls; `agent-core` executes those
+ tool calls after applying approval and scheduling policy, then records the tool
+ results before the next model step. The console output lines such as
+ `capx_status({ ... })` and `capx_run_policy_code({ ... })` are the visible
+ planned tool calls.
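
To make "framework-owned tool dispatch" concrete, here is a deliberately simplified sketch of the idea. It is not `agent-core`'s implementation or API: `ToolCall`, `ToolResult`, and `dispatchToolCalls` are hypothetical, and scheduling policy is omitted.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  name: string;
  ok: boolean;
  output: string;
}

// The model only emits tool calls; the framework decides whether each one
// runs and records a result for the next model step either way.
function dispatchToolCalls(
  calls: ToolCall[],
  approve: (call: ToolCall) => boolean,
  execute: (call: ToolCall) => string,
): ToolResult[] {
  return calls.map((call) => {
    if (!approve(call)) {
      // A denied call still produces a result the model can read.
      return { name: call.name, ok: false, output: `Approval denied for ${call.name}` };
    }
    return { name: call.name, ok: true, output: execute(call) };
  });
}
```

In observe-only mode, an `approve` callback along these lines would accept `capx_status` and `capx_observe` but reject `capx_run_policy_code`.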
+
+ Both examples use `agent-core`'s `createEventPrinter` to render progress:
+ steps, tool calls, tool results, approval events, text output, and completion.
+ For CaP-X, those logs are the easiest way to see the external agent loop:
+ status, observe, optional policy-code execution, observe again, then final
+ summary. Set `CAPX_TOOL_RESULT_MAX_CHARS` if you want the terminal to print
+ longer tool-result previews while debugging.
+
+ ## Environment Model
+
+ `OPENAI_API_KEY` configures the `agent-core` model provider. `OPENAI_MODEL`
+ defaults to `gpt-4o-mini` through `examples/_setup.ts` if you omit it.
+ `OPENAI_BASE_URL` is only needed for non-default OpenAI-compatible endpoints.
+
+ `CAPX_RUNTIME_SERVER_URL` points to the `capx-agent-runtime` service. When
+ the example creates a session, it lets the server's startup arguments define
+ the CaP-X repo, YAML config, output directory, and simulator context. This
+ matches the workstation setup command above.
+
+ `CAPX_ALLOW_DESTRUCTIVE=1` lets the example approval policy allow
+ `capx_run_policy_code`. Without it, the agent can observe and propose code but
+ will not execute policy code.
+
+ `CAPX_MAX_SOLVER_TURNS` controls the outer loop in
+ `02-capx-runtime-autosolve.ts`. The same `agent-core` session id is reused for
+ each turn so the agent keeps conversation and tool history.
+
+ `CAPX_RECOVER_ON_RUNTIME_ERROR=reset` lets
+ `02-capx-runtime-autosolve.ts` reset the live CaP-X runtime session to the next
+ trial/seed and continue when CaP-X reports an observation/depth failure. This
+ is session-level recovery for failures where `env.step(code)` cannot return a
+ normal multi-turn result. `CAPX_MAX_RUNTIME_RESETS` controls the reset budget
+ and defaults to `1` when recovery is enabled.
+
+ `CAPX_POLICY_EXECUTION_RECORD_VIDEO` is optional. Leave it unset to use the
+ selected CaP-X YAML's `record_video` setting. Set it to `1` or `0` only when
+ you want the TypeScript example to override the runtime server/YAML value.
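
As a rough illustration of how these variables could be read, consider the sketch below. It is not the examples' actual parsing code: `ExampleSettings`, `readSettings`, and `parseNonNegativeInt` are hypothetical, the turn-budget fallback is a caller-supplied assumption because no default is stated here, and using a reset budget of `0` when recovery is disabled is also an assumption.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

interface ExampleSettings {
  maxSolverTurns: number;
  recoverOnRuntimeError: boolean;
  maxRuntimeResets: number;
  // undefined means "defer to the YAML record_video setting".
  recordVideo: boolean | undefined;
}

function parseNonNegativeInt(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
}

function readSettings(env: Env, fallbackMaxTurns: number): ExampleSettings {
  const recoverOnRuntimeError = env["CAPX_RECOVER_ON_RUNTIME_ERROR"] === "reset";
  const recordVideoRaw = env["CAPX_POLICY_EXECUTION_RECORD_VIDEO"];
  return {
    maxSolverTurns: parseNonNegativeInt(env["CAPX_MAX_SOLVER_TURNS"], fallbackMaxTurns),
    recoverOnRuntimeError,
    // Budget defaults to 1 when recovery is enabled; 0 otherwise (assumption).
    maxRuntimeResets: recoverOnRuntimeError
      ? parseNonNegativeInt(env["CAPX_MAX_RUNTIME_RESETS"], 1)
      : 0,
    // Tri-state: "1" forces on, "0" forces off, anything else defers.
    recordVideo:
      recordVideoRaw === "1" ? true : recordVideoRaw === "0" ? false : undefined,
  };
}
```

Keeping `recordVideo` as a tri-state value preserves the difference between "force off" and "use whatever the YAML config says".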
+
+ ## Prompt Context
+
+ This package does not copy CaP-X prompt templates into TypeScript. In runtime
+ mode, `capx-agent-runtime` loads the selected CaP-X YAML config and trial. Then
+ `capx_observe` returns the task prompt, full prompt, observations, API
+ descriptions, rendered frame when available, and last-step result to the
+ `agent-core` agent.
+
+ The external agent reads that CaP-X-provided context and acts by calling
+ `capx_run_policy_code`.
+
+ This is the clean bring-your-own-agent reference: start the runtime service
+ first, then connect an external `agent-core` agent to it.
@@ -0,0 +1,61 @@
+ /**
+  * Shared example setup — loads `.env` from the examples directory.
+  */
+
+ import { createOpenAI } from "@ai-sdk/openai";
+ import { config } from "dotenv";
+ import { dirname, join } from "node:path";
+ import { fileURLToPath } from "node:url";
+
+ export const examplesDir = dirname(fileURLToPath(import.meta.url));
+
+ config({ path: join(examplesDir, ".env"), quiet: true });
+
+ const DEFAULT_OPENAI_MODEL = "gpt-4o-mini";
+
+ function firstEnv(names: string[]): string | undefined {
+   for (const name of names) {
+     const value = process.env[name];
+     if (value && value.trim()) {
+       return value.trim();
+     }
+   }
+   return undefined;
+ }
+
+ export function getExampleOpenAIModelId(
+   fallback = DEFAULT_OPENAI_MODEL,
+ ): string {
+   return (
+     firstEnv([
+       "OPENAI_MODEL",
+       "OPENAI_MODEL_ID",
+       "openai_model",
+       "openai_model_id",
+     ]) ?? fallback
+   );
+ }
+
+ export function getExampleOpenAIBaseURL(): string | undefined {
+   return firstEnv([
+     "OPENAI_BASE_URL",
+     "OPENAI_API_BASE_URL",
+     "OPENAI_BASEURL",
+     "openai_base_url",
+     "openai_api_base_url",
+   ]);
+ }
+
+ export function createExampleOpenAIProvider() {
+   const apiKey = firstEnv(["OPENAI_API_KEY", "openai_api_key"]);
+   const baseURL = getExampleOpenAIBaseURL();
+
+   return createOpenAI({
+     ...(apiKey ? { apiKey } : {}),
+     ...(baseURL ? { baseURL } : {}),
+   });
+ }
+
+ export function exampleOpenAIModel(modelId = getExampleOpenAIModelId()) {
+   return createExampleOpenAIProvider()(modelId);
+ }
package/package.json ADDED
@@ -0,0 +1,76 @@
+ {
+   "name": "@cuylabs/agent-physical-capx",
+   "version": "5.0.2",
+   "description": "CaP-X adapter for @cuylabs/agent-physical sessions and tools",
+   "type": "module",
+   "main": "./dist/index.js",
+   "types": "./dist/index.d.ts",
+   "exports": {
+     ".": {
+       "types": "./dist/index.d.ts",
+       "import": "./dist/index.js",
+       "default": "./dist/index.js"
+     },
+     "./agent": {
+       "types": "./dist/agent.d.ts",
+       "import": "./dist/agent.js",
+       "default": "./dist/agent.js"
+     },
+     "./session": {
+       "types": "./dist/session.d.ts",
+       "import": "./dist/session.js",
+       "default": "./dist/session.js"
+     }
+   },
+   "files": [
+     "dist",
+     "docs",
+     "examples/*.ts",
+     "examples/.env.example",
+     "examples/README.md",
+     "skills",
+     "README.md"
+   ],
+   "dependencies": {
+     "zod": "^3.25.76 || ^4.1.8",
+     "@cuylabs/agent-core": "^5.0.2",
+     "@cuylabs/agent-physical": "^5.0.2"
+   },
+   "devDependencies": {
+     "@ai-sdk/openai": "4.0.0-beta.38",
+     "@types/node": "^22.0.0",
+     "dotenv": "^17.2.3",
+     "tsup": "^8.0.0",
+     "tsx": "^4.21.0",
+     "typescript": "^5.7.0",
+     "vitest": "^4.0.18"
+   },
+   "keywords": [
+     "agent",
+     "physical-ai",
+     "robotics",
+     "capx",
+     "code-as-policy"
+   ],
+   "author": "cuylabs",
+   "license": "Apache-2.0",
+   "repository": {
+     "type": "git",
+     "url": "https://github.com/cuylabs-ai/agents-ts.git",
+     "directory": "packages/agent-physical-capx"
+   },
+   "engines": {
+     "node": ">=20"
+   },
+   "publishConfig": {
+     "access": "public"
+   },
+   "scripts": {
+     "build": "tsup --config tsup.config.ts",
+     "dev": "tsup --config tsup.config.ts --watch",
+     "typecheck": "tsc --noEmit",
+     "test": "vitest run",
+     "test:watch": "vitest",
+     "clean": "rm -rf dist"
+   }
+ }
@@ -0,0 +1,22 @@
+ ---
+ name: capx-code-as-policy
+ description: Use this when controlling a CaP-X Code-as-Policy runtime through capx_* tools. It explains the observe, render, policy-code, turn-history, and artifact loop for an external agent harness.
+ version: 1.0.0
+ tags: [capx, robotics, physical-ai, code-as-policy]
+ ---
+
+ # CaP-X Code-as-Policy Agent Skill
+
+ Use CaP-X as the Python robotics runtime. The external agent owns the reasoning loop. First inspect the session with `capx_status` and `capx_observe`. Treat the CaP-X task prompt, full prompt, `codeContext`, policy-code context, reset metadata, rendered frames, and last-step result as the source of truth.
+
+ One CaP-X runtime session represents one live environment. Keep using that same session across observe/run/observe turns. Do not ask for a new session, reset, or stop unless the user asks, the current trial needs a deliberate retry, or continuing would be unsafe.
+
+ When `capx_run_policy_code` is available, write concise Python policy code using APIs exposed by the observed CaP-X prompt and `codeContext`. Prefer one purposeful code step at a time, then observe again.
+
+ Use `capx_observe` with image observations when visual state matters. Compare the new observation, frame, stdout, stderr, reward, and task-completion status against the previous turn before deciding whether another policy-code step is needed.
+
+ Use `capx_turn_history` to inspect prior submitted code and results. Use `capx_artifacts` when you need saved code, logs, summaries, images, or videos.
+
+ Use policy-code context returned by `capx_observe` for reusable CaP-X Python helper hints. These are runtime-side Python functions, not agent-core skills or separate robot tools. Submit policy code through `capx_run_policy_code`.
+
+ For ensemble reasoning, generate and compare candidate plans in the agent harness, then submit only the selected Python code through `capx_run_policy_code`. Keep execution approval and hardware safety policy in the host application.