npm - @cuylabs/physical-capx-agent-core - Versions diffs - 0.1.1 - Mend

@cuylabs/physical-capx-agent-core 0.1.1

Files changed (27)

package/LICENSE +201 -0
package/README.md +132 -0
package/dist/agent.d.ts +59 -0
package/dist/agent.js +14 -0
package/dist/agent.js.map +1 -0
package/dist/chunk-57TF3E2Q.js +61 -0
package/dist/chunk-57TF3E2Q.js.map +1 -0
package/dist/chunk-U6REPGPH.js +109 -0
package/dist/chunk-U6REPGPH.js.map +1 -0
package/dist/index.d.ts +7 -0
package/dist/index.js +29 -0
package/dist/index.js.map +1 -0
package/dist/session.d.ts +5 -0
package/dist/session.js +12 -0
package/dist/session.js.map +1 -0
package/dist/tools.d.ts +8 -0
package/dist/tools.js +7 -0
package/dist/tools.js.map +1 -0
package/docs/README.md +23 -0
package/docs/agent-core-integration.md +76 -0
package/examples/.env.example +37 -0
package/examples/01-capx-runtime-solver.ts +165 -0
package/examples/02-capx-runtime-autosolve.ts +314 -0
package/examples/README.md +344 -0
package/examples/_setup.ts +61 -0
package/package.json +86 -0
package/skills/capx-code-as-policy/SKILL.md +22 -0

package/examples/README.md ADDED

@@ -0,0 +1,344 @@
+# Examples
+Runnable examples for `@cuylabs/physical-capx-agent-core`.
+These examples cover the `@cuylabs/agent-core` client path.
+`capx-agent-runtime` remains the harness-neutral Python service; any other
+agent or workflow can use the same CaP-X runtime by calling its HTTP API
+directly.
+## Main Examples
+There are two solver examples. Both connect an `agent-core` agent to an
+already-running `capx-agent-runtime` service and expose the CaP-X session
+through the `capx_*` tools.
+`01-capx-runtime-solver.ts` is the default single-turn example. It creates one
+agent, starts one runtime session, and gives the model one user turn. Inside
+that turn, `agent-core` may still run multiple model/tool steps, but the prompt
+asks for one useful Code-as-Policy action and a short result summary.
+That flow is:
+1. observe the CaP-X task, simulator state, and rendered frame,
+2. inspect runtime turn history and available policy-code context,
+3. write one Python Code-as-Policy step,
+4. execute that Python through `capx-agent-runtime`,
+5. observe again and summarize reward, stdout/stderr, artifacts, and task
+   completion.
+`02-capx-runtime-autosolve.ts` is the multi-turn example. It keeps the same
+agent and runtime session open across several user turns. After each turn, the
+script observes the runtime result and stops when CaP-X reports task completion
+or when `CAPX_MAX_SOLVER_TURNS` is reached. Use it when the harness should keep
+trying the task instead of exiting after one solver turn.
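Illustrative sketch, not taken from the package: the stop conditions described for `02-capx-runtime-autosolve.ts` could be expressed as a bounded loop like the one below. `runSolverTurn` and `TurnOutcome` are hypothetical names. In the real example each turn would go through `agent-core` and the runtime's observation result, and `maxTurns` would presumably come from `CAPX_MAX_SOLVER_TURNS`.

```typescript
// Hypothetical shape of what one solver turn reports back; not the package's type.
interface TurnOutcome {
  taskCompleted: boolean;
}

// Runs solver turns until the task completes or maxTurns is reached.
async function autosolve(
  runSolverTurn: (turn: number) => Promise<TurnOutcome>,
  maxTurns: number,
): Promise<{ turnsUsed: number; taskCompleted: boolean }> {
  let turnsUsed = 0;
  while (turnsUsed < maxTurns) {
    turnsUsed += 1;
    const outcome = await runSolverTurn(turnsUsed);
    if (outcome.taskCompleted) {
      // Stop early: the runtime reported task completion.
      return { turnsUsed, taskCompleted: true };
    }
  }
  // Turn budget exhausted without completion.
  return { turnsUsed, taskCompleted: false };
}
```

Keeping the turn function as a parameter means the same loop can be exercised with a stub before pointing it at a live runtime session.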
+Both examples enable the packaged `capx-code-as-policy` agent-core skill by
+default. That skill teaches the model how to use the `capx_*` tools and how to
+write policy code for the runtime. It is separate from CaP-X's runtime-side
+Python skill library, which appears dynamically in observation `codeContext`
+and deliberate runtime APIs.
+## Service-First Setup
+The normal path is to start the runtime service first, usually on a Linux GPU
+workstation, then run the TypeScript agent from your local machine or another
+client.
+Follow the runtime project docs first:
+1. Prepare the GPU workstation with
+   [Workstation Setup](https://github.com/cuylabs-ai/capx-agent-runtime/blob/main/docs/workstation-setup.md).
+2. Start and validate the runtime server with
+   [Runtime Server](https://github.com/cuylabs-ai/capx-agent-runtime/blob/main/docs/runtime-server.md).
+The runtime server is typically started from the CaP-X checkout like this:
+```bash
+cd /path/to/cap-x
+uv run --no-sync --active capx-agent-runtime serve \
+  --repo-path "$(pwd)" \
+  --config-path env_configs/cube_stack/franka_robosuite_cube_stack.yaml \
+  --host 127.0.0.1 \
+  --port 8210
+```
+That command starts the CaP-X runtime with the selected YAML config. For the
+cube-stack config, the runtime is a Robosuite simulation. The TypeScript example
+connects to this service and acts as the external solver agent.
+You can point the runtime service at another compatible CaP-X config by
+passing one of these values instead (they are alternatives; choose one):
+```bash
+--config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn.yaml
+--config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn_vf.yaml
+--config-path env_configs/cube_stack/franka_robosuite_cube_stack_multiturn_vdm.yaml
+```
+The `02-capx-runtime-autosolve.ts` example can run multiple external-agent
+turns against any of those configs. The `multiturn` configs add CaP-X
+continuation prompt text. The `vf` and `vdm` configs expose visual-feedback or
+visual-differencing intent; in this bring-your-own-agent path, the agent should
+call `capx_observe` with `includeImages=true` and do the comparison in the host
+model/harness.
+If the runtime is remote, open an SSH tunnel so your local machine can reach
+the service:
+```bash
+ssh -L 8210:127.0.0.1:8210 <user>@<gpu-host>
+```
+## Client Setup
+In an application that consumes the released packages, install the TypeScript
+client packages and the example runner dependencies:
+```bash
+npm install @cuylabs/agent-core @cuylabs/physical-core @cuylabs/physical-capx-agent-core
+npm install --save-dev @ai-sdk/openai dotenv tsx
+```
+The released package already includes its built `dist/` files, so there is no
+workspace build step in the normal install path.
+If you are running the examples from a local `physical-ai-ts` monorepo checkout
+while changing package source, install workspace dependencies first:
+```bash
+cd /path/to/physical-ai-ts
+pnpm install
+```
+Use the `pnpm` already available on your machine. If `pnpm` is missing and your
+Node install includes Corepack, you can enable it with `corepack enable`; if
+`corepack` is not available, install `pnpm` directly with your normal Node
+package-manager setup.
+For the checked-in examples, configure the local example environment from this
+package directory:
+```bash
+cd packages/physical-capx-agent-core
+cp examples/.env.example examples/.env
+```
+Both examples import `examples/_setup.ts`. You do not run `_setup.ts`
+directly; it loads `examples/.env` and creates the OpenAI-compatible provider
+used by `agent-core`.
+Set the required values in `examples/.env`:
+```bash
+OPENAI_API_KEY=...
+OPENAI_MODEL=gpt-4o-mini
+CAPX_RUNTIME_SERVER_URL=http://127.0.0.1:8210
+```
+`OPENAI_BASE_URL` is optional. Leave it unset for the default OpenAI endpoint.
+Set it only when using an OpenAI-compatible provider, for example a local
+gateway or hosted inference endpoint.
+## Run Modes
+The examples default to observe-only mode. In that mode, the agent can inspect
+the task, frame, runtime state, and policy-code context, but it cannot call
+`capx_run_policy_code`.
+### Observe Only
+```bash
+npx tsx examples/01-capx-runtime-solver.ts
+```
+Use this first to confirm that the runtime URL, model provider, session
+creation, observation, and tool wiring are working.
+### Single Policy Step
+Allow the single-turn example to execute one Python Code-as-Policy action in
+simulation.
+```bash
+CAPX_ALLOW_DESTRUCTIVE=1 \
+npx tsx examples/01-capx-runtime-solver.ts
+```
+The startup line should show `approval=policy-code-enabled`. If it still shows
+`approval=observe-only`, the environment variable did not reach the Node
+process. Use a single-line command to verify:
+```bash
+env CAPX_ALLOW_DESTRUCTIVE=1 npx tsx examples/01-capx-runtime-solver.ts
+```
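Illustrative sketch, not taken from the package: one way the `approval=` label in the startup line could be derived from the environment. The function name and the rule that only the exact value `1` enables execution are assumptions; the shipped example may accept other values.

```typescript
type ApprovalMode = "observe-only" | "policy-code-enabled";

// Assumption: the gate is on only when CAPX_ALLOW_DESTRUCTIVE is exactly "1".
function approvalModeFromEnv(
  env: Record<string, string | undefined>,
): ApprovalMode {
  return env.CAPX_ALLOW_DESTRUCTIVE?.trim() === "1"
    ? "policy-code-enabled"
    : "observe-only";
}

// prints "approval=policy-code-enabled"
console.log(`approval=${approvalModeFromEnv({ CAPX_ALLOW_DESTRUCTIVE: "1" })}`);
```

In a Node script, passing `process.env` as the argument would read the real environment, which makes it easy to confirm whether the variable reached the process.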
+### Single Policy Step With Video
+```bash
+CAPX_ALLOW_DESTRUCTIVE=1 \
+CAPX_POLICY_EXECUTION_RECORD_VIDEO=1 \
+npx tsx examples/01-capx-runtime-solver.ts
+```
+### Multi-Turn Autosolve
+Run the autosolver in observe-only mode.
+```bash
+CAPX_MAX_SOLVER_TURNS=6 npx tsx examples/02-capx-runtime-autosolve.ts
+```
+Allow policy-code execution across the autosolver loop.
+```bash
+CAPX_ALLOW_DESTRUCTIVE=1 \
+CAPX_MAX_SOLVER_TURNS=6 \
+npx tsx examples/02-capx-runtime-autosolve.ts
+```
+For the most complete demo, enable execution, video recording, one runtime
+recovery reset, and stop-on-exit so the combined video artifact is flushed.
+```bash
+CAPX_ALLOW_DESTRUCTIVE=1 \
+CAPX_POLICY_EXECUTION_RECORD_VIDEO=1 \
+CAPX_MAX_SOLVER_TURNS=6 \
+CAPX_RECOVER_ON_RUNTIME_ERROR=reset \
+CAPX_MAX_RUNTIME_RESETS=1 \
+CAPX_STOP_ON_EXIT=1 \
+npx tsx examples/02-capx-runtime-autosolve.ts
+```
+## Expected Output
+For the default Franka cube-stack config, a healthy run usually finishes after
+one useful policy-code turn. Exact sampled poses and artifact paths vary, but
+the important terminal lines look like this:
+```text
+executionOk=true, taskCompleted=true, reward=1
+terminated=true, truncated=false
+sandboxRc=0
+CaP-X reported completion state: taskCompleted=true terminated=true truncated=false sandboxRc=0 reward=1
+```
+The server log should show the same lifecycle:
+```text
+POST /sessions ... 200 OK
+POST /sessions/<id>/execute-code ... 200 OK
+Saved interaction video to .../video_1.000_turn_00.mp4
+Saved interaction video to .../video_session_combined.mp4
+POST /sessions/<id>/stop ... 200 OK
+```
+The `video_..._turn_00.mp4` file is the per-policy-turn recording.
+`video_session_combined.mp4` is written when the session stops, so
+`CAPX_STOP_ON_EXIT=1` is recommended for video examples. The runtime console
+shows the combined session video first and links the per-turn videos as
+individual artifact files.
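Illustrative sketch, not taken from the package: a script that wraps the example could pull the completion fields out of the `CaP-X reported completion state:` line shown above. The function and interface names are hypothetical, and the parser assumes the `key=value` layout stays as printed in the sample output.

```typescript
interface CompletionState {
  taskCompleted: boolean;
  terminated: boolean;
  truncated: boolean;
  sandboxRc: number;
  reward: number;
}

// Parses space- or comma-separated key=value pairs from one log line.
// Returns undefined when any expected field is missing.
function parseCompletionLine(line: string): CompletionState | undefined {
  const fields: Record<string, string> = {};
  const pattern = /(\w+)=([^\s,]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    fields[match[1]] = match[2];
  }
  const required = ["taskCompleted", "terminated", "truncated", "sandboxRc", "reward"];
  if (required.some((key) => !(key in fields))) {
    return undefined;
  }
  return {
    taskCompleted: fields.taskCompleted === "true",
    terminated: fields.terminated === "true",
    truncated: fields.truncated === "true",
    sandboxRc: Number(fields.sandboxRc),
    reward: Number(fields.reward),
  };
}
```

Returning `undefined` for lines that lack a field lets a caller scan the whole terminal output and keep only the line that carries the full state.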
+## Recovery And Cleanup
+The autosolver distinguishes ordinary policy-code failures from runtime-level
+CaP-X failures.
+| Case                                                                | What Happens                                                                   |
+| ------------------------------------------------------------------- | ------------------------------------------------------------------------------ |
+| Python policy returns stderr                                        | The next agent turn can inspect the error and write better code.               |
+| Observation or depth pipeline fails before `env.step(code)` returns | The autosolver stops or uses `CAPX_RECOVER_ON_RUNTIME_ERROR=reset` if enabled. |
+| Recovery reset is enabled                                           | The session resets to the next trial/seed. The default reset budget is `1`.    |
+| `CAPX_STOP_ON_EXIT=1` is set                                        | The example stops the runtime session at exit and flushes the combined video.  |
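Illustrative sketch, not taken from the package: the first three rows of the table can be read as a small decision function. All names here are hypothetical; the two options stand in for `CAPX_RECOVER_ON_RUNTIME_ERROR=reset` and `CAPX_MAX_RUNTIME_RESETS`.

```typescript
type FailureKind = "policy-stderr" | "runtime-error";
type NextAction = "continue" | "reset" | "stop";

interface RecoveryOptions {
  recoverOnRuntimeError: boolean; // stands in for CAPX_RECOVER_ON_RUNTIME_ERROR=reset
  maxRuntimeResets: number; // stands in for CAPX_MAX_RUNTIME_RESETS
}

// Decides what the outer loop does after a failed turn.
function nextActionAfterFailure(
  kind: FailureKind,
  resetsUsed: number,
  options: RecoveryOptions,
): NextAction {
  if (kind === "policy-stderr") {
    // Ordinary policy-code failure: the next agent turn can inspect stderr.
    return "continue";
  }
  if (options.recoverOnRuntimeError && resetsUsed < options.maxRuntimeResets) {
    // Runtime-level failure with reset budget remaining.
    return "reset";
  }
  // Runtime-level failure with recovery disabled or budget exhausted.
  return "stop";
}
```

With the documented default budget of `1`, a second runtime-level failure in the same run would fall through to `"stop"`.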
+If the reset budget is exhausted, clean up first and retry with a fresh
+session. When a session is still running, find its id and stop it:
+```bash
+curl -sS http://127.0.0.1:8210/sessions
+curl -X POST http://127.0.0.1:8210/sessions/<session-id>/stop
+```
+You can also reset an existing session, but for observation/depth assertion
+failures a fresh session is usually clearer:
+```bash
+curl -X POST \
+  -H 'content-type: application/json' \
+  -d '{}' \
+  http://127.0.0.1:8210/sessions/<session-id>/reset
+```
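Illustrative sketch, not taken from the package: the same stop and reset calls could be prepared from TypeScript. Only the endpoint paths, the `POST` method, and the `{}` reset body come from the `curl` commands above; `prepareSessionRequest` and its types are hypothetical, and the response format is not assumed.

```typescript
type SessionAction = "stop" | "reset";

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers?: Record<string, string>; body?: string };
}

// Builds the URL and request options for a session stop or reset call.
function prepareSessionRequest(
  baseUrl: string,
  sessionId: string,
  action: SessionAction,
): PreparedRequest {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const url = `${root}/sessions/${encodeURIComponent(sessionId)}/${action}`;
  if (action === "reset") {
    // Mirrors the curl reset call: JSON content type with an empty object body.
    return {
      url,
      init: {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: "{}",
      },
    };
  }
  // Mirrors the curl stop call: a bare POST.
  return { url, init: { method: "POST" } };
}
```

The result can be passed straight to the global `fetch` available in Node 20, for example `fetch(request.url, request.init)`, with the session id taken from the `GET /sessions` listing.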
+If the depth assertion repeats immediately on a clean session, restart the
+`capx-agent-runtime serve` process too. That recreates the Python environment
+and the child API services instead of reusing the same process state.
+To isolate the TypeScript adapter and `agent-core` loop from the vision/depth
+stack, start `capx-agent-runtime` with a privileged cube-stack config when that
+config is available:
+```bash
+uv run --no-sync --active capx-agent-runtime serve \
+  --repo-path "$(pwd)" \
+  --config-path env_configs/cube_stack/franka_robosuite_cube_stack_privileged.yaml \
+  --host 127.0.0.1 \
+  --port 8210
+```
+That path avoids some vision-derived object-pose calls and is useful when you
+want to validate HTTP tools, approvals, artifacts, videos, and the external
+agent loop before debugging the Robosuite camera/depth pipeline.
+## Environment Variables
+| Variable                                 | Purpose                                                                                       |
+| ---------------------------------------- | --------------------------------------------------------------------------------------------- |
+| `OPENAI_API_KEY`                         | API key for the `agent-core` model provider.                                                  |
+| `OPENAI_MODEL`                           | Model id. Defaults to `gpt-4o-mini` in `examples/_setup.ts`.                                  |
+| `OPENAI_BASE_URL`                        | Optional OpenAI-compatible provider endpoint.                                                 |
+| `CAPX_RUNTIME_SERVER_URL`                | URL for the running `capx-agent-runtime` service.                                             |
+| `CAPX_ALLOW_DESTRUCTIVE=1`               | Lets the example approval policy execute `capx_run_policy_code`.                              |
+| `CAPX_ALLOW_HARDWARE_POLICY_EXECUTION=1` | Extra gate required before policy execution against hardware configs.                         |
+| `CAPX_MAX_SOLVER_TURNS`                  | Outer loop limit for `02-capx-runtime-autosolve.ts`.                                          |
+| `CAPX_RECOVER_ON_RUNTIME_ERROR=reset`    | Reset the live runtime session after runtime-level observation/depth failures.                |
+| `CAPX_MAX_RUNTIME_RESETS`                | Recovery reset budget. Defaults to `1` when recovery is enabled.                              |
+| `CAPX_POLICY_EXECUTION_RECORD_VIDEO`     | Optional `1` or `0` override for the selected YAML's video setting.                           |
+| `CAPX_STOP_ON_EXIT=1`                    | Stop the runtime session when the example exits and flush combined video artifacts.           |
+| `CAPX_SESSION_OUTPUT_DIR`                | Privileged per-session output override. Leave unset for normal server-owned paths.            |
+| `CAPX_SESSION_SKILL_LIBRARY_PATH`        | Privileged per-session skill-library override. Leave unset unless path overrides are enabled. |
+| `CAPX_TOOL_RESULT_MAX_CHARS`             | Increase printed tool-result previews while debugging.                                        |
+By default, each example run uses the runtime server's configured output
+directory and skill-library path. Set `CAPX_SESSION_OUTPUT_DIR` or
+`CAPX_SESSION_SKILL_LIBRARY_PATH` only when the runtime server was started with
+`--allow-client-path-overrides` and allowed roots for those paths.
+## Runtime Contract
+The examples always use the live runtime path: `mode: "runtime"`,
+`startSession: true`, `enablePolicyCodeExecution: true`, and
+`policyExecutionMode: "live-runtime"`.
+The adapter does not accept `repoPath` or `configPath`, and it omits
+`outputDir` and `skillLibraryPath` by default. Those path choices belong to the
+runtime server startup command. That keeps the architecture clean: the Python
+runtime service owns the CaP-X repo/config/output/simulator setup, and
+`agent-core` owns the external agent loop.
+The adapter defaults to `toolExecutionMode: "plan"`. In `agent-core`, "plan"
+means framework-owned tool dispatch, not "only write a textual plan." The model
+can still emit tool calls; `agent-core` applies approval and scheduling policy,
+executes approved tools, then records tool results before the next model step.
+Both examples use `agent-core`'s `createEventPrinter` to render steps, tool
+calls, tool results, approval events, text output, and completion. For CaP-X,
+those logs are the easiest way to see the external agent loop: status, observe,
+optional policy-code execution, observe again, then final summary.
+This package does not copy CaP-X prompt templates into TypeScript. In runtime
+mode, `capx-agent-runtime` loads the selected CaP-X YAML config and trial.
+`capx_observe` returns the CaP-X task prompt, full prompt, observations, API
+descriptions, rendered frame when available, and last-step result. The external
+agent reads that CaP-X-provided context and acts by calling
+`capx_run_policy_code`.

package/examples/_setup.ts ADDED

@@ -0,0 +1,61 @@
+/**
+ * Shared example setup — loads `.env` from the examples directory.
+ */
+import { createOpenAI } from "@ai-sdk/openai";
+import { config } from "dotenv";
+import { dirname, join } from "node:path";
+import { fileURLToPath } from "node:url";
+export const examplesDir = dirname(fileURLToPath(import.meta.url));
+config({ path: join(examplesDir, ".env"), quiet: true });
+const DEFAULT_OPENAI_MODEL = "gpt-4o-mini";
+/** Returns the first non-blank, trimmed value among the named environment variables. */
+function firstEnv(names: string[]): string | undefined {
+  for (const name of names) {
+    const value = process.env[name];
+    if (value && value.trim()) {
+      return value.trim();
+    }
+  }
+  return undefined;
+}
+export function getExampleOpenAIModelId(
+  fallback = DEFAULT_OPENAI_MODEL,
+): string {
+  return (
+    firstEnv([
+      "OPENAI_MODEL",
+      "OPENAI_MODEL_ID",
+      "openai_model",
+      "openai_model_id",
+    ]) ?? fallback
+  );
+}
+export function getExampleOpenAIBaseURL(): string | undefined {
+  return firstEnv([
+    "OPENAI_BASE_URL",
+    "OPENAI_API_BASE_URL",
+    "OPENAI_BASEURL",
+    "openai_base_url",
+    "openai_api_base_url",
+  ]);
+}
+export function createExampleOpenAIProvider() {
+  const apiKey = firstEnv(["OPENAI_API_KEY", "openai_api_key"]);
+  const baseURL = getExampleOpenAIBaseURL();
+  return createOpenAI({
+    ...(apiKey ? { apiKey } : {}),
+    ...(baseURL ? { baseURL } : {}),
+  });
+}
+export function exampleOpenAIModel(modelId = getExampleOpenAIModelId()) {
+  return createExampleOpenAIProvider()(modelId);
+}

package/package.json ADDED

@@ -0,0 +1,86 @@
+{
+  "name": "@cuylabs/physical-capx-agent-core",
+  "version": "0.1.1",
+  "description": "Agent-core CaP-X agent and physical tool adapter",
+  "type": "module",
+  "main": "./dist/index.js",
+  "types": "./dist/index.d.ts",
+  "exports": {
+    ".": {
+      "types": "./dist/index.d.ts",
+      "import": "./dist/index.js",
+      "default": "./dist/index.js"
+    },
+    "./agent": {
+      "types": "./dist/agent.d.ts",
+      "import": "./dist/agent.js",
+      "default": "./dist/agent.js"
+    },
+    "./tools": {
+      "types": "./dist/tools.d.ts",
+      "import": "./dist/tools.js",
+      "default": "./dist/tools.js"
+    },
+    "./session": {
+      "types": "./dist/session.d.ts",
+      "import": "./dist/session.js",
+      "default": "./dist/session.js"
+    }
+  },
+  "files": [
+    "dist",
+    "docs",
+    "examples/*.ts",
+    "examples/.env.example",
+    "examples/README.md",
+    "skills",
+    "README.md"
+  ],
+  "dependencies": {
+    "zod": "^3.25.76 || ^4.1.8",
+    "@cuylabs/physical-agent-core": "^0.1.1",
+    "@cuylabs/physical-capx": "^0.1.1",
+    "@cuylabs/physical-core": "^0.1.1"
+  },
+  "devDependencies": {
+    "@ai-sdk/openai": "4.0.0-beta.38",
+    "@cuylabs/agent-core": "^7.2.1",
+    "@types/node": "^22.0.0",
+    "dotenv": "^17.2.3",
+    "tsup": "^8.0.0",
+    "tsx": "^4.21.0",
+    "typescript": "^5.7.0",
+    "vitest": "^4.0.18"
+  },
+  "peerDependencies": {
+    "@cuylabs/agent-core": "^7.0.0"
+  },
+  "keywords": [
+    "agent",
+    "physical-ai",
+    "robotics",
+    "capx",
+    "code-as-policy"
+  ],
+  "author": "cuylabs",
+  "license": "Apache-2.0",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/cuylabs-ai/physical-ai-ts.git",
+    "directory": "packages/physical-capx-agent-core"
+  },
+  "engines": {
+    "node": ">=20"
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "scripts": {
+    "build": "tsup --config tsup.config.ts",
+    "dev": "tsup --config tsup.config.ts --watch",
+    "typecheck": "tsc --noEmit",
+    "test": "vitest run",
+    "test:watch": "vitest",
+    "clean": "rm -rf dist"
+  }
+}

package/skills/capx-code-as-policy/SKILL.md ADDED

@@ -0,0 +1,22 @@
+---
+name: capx-code-as-policy
+description: Use this when controlling a CaP-X Code-as-Policy runtime through capx_* tools. It explains the observe, render, policy-code, turn-history, and artifact loop for an external agent harness.
+version: 1.0.0
+tags: [capx, robotics, physical-ai, code-as-policy]
+---
+# CaP-X Code-as-Policy Agent Skill
+Use CaP-X as the Python robotics runtime. The external agent owns the reasoning loop. First inspect the session with `capx_status` and `capx_observe`. Treat the CaP-X task prompt, full prompt, `codeContext`, policy-code context, reset metadata, rendered frames, and last-step result as the source of truth.
+One CaP-X runtime session represents one live environment. Keep using that same session across observe/run/observe turns. Do not ask for a new session, reset, or stop unless the user asks, the current trial needs a deliberate retry, or continuing would be unsafe.
+When `capx_run_policy_code` is available, write concise Python policy code using APIs exposed by the observed CaP-X prompt and `codeContext`. Prefer one purposeful code step at a time, then observe again.
+Use `capx_observe` with image observations when visual state matters. Compare the new observation, frame, stdout, stderr, reward, and task-completion status against the previous turn before deciding whether another policy-code step is needed.
+Use `capx_turn_history` to inspect prior submitted code and results. Use `capx_artifacts` when you need saved code, logs, summaries, images, or videos.
+Use policy-code context returned by `capx_observe` for reusable CaP-X Python helper hints. These are runtime-side Python functions, not agent-core skills or separate robot tools. Submit policy code through `capx_run_policy_code`.
+For ensemble reasoning, generate and compare candidate plans in the agent harness, then submit only the selected Python code through `capx_run_policy_code`. Keep execution approval and hardware safety policy in the host application.