archal 0.9.18 → 0.9.20
- package/README.md +9 -1
- package/agents/github-octokit/.archal.json +8 -0
- package/agents/github-octokit/Dockerfile +8 -0
- package/agents/github-octokit/README.md +113 -0
- package/agents/github-octokit/agent.mjs +54 -0
- package/agents/github-octokit/package.json +9 -0
- package/agents/github-octokit/scenarios/test-repo-access.md +27 -0
- package/agents/google-workspace-local-tools/Dockerfile +6 -0
- package/agents/google-workspace-local-tools/README.md +58 -0
- package/agents/google-workspace-local-tools/agent.mjs +196 -0
- package/agents/google-workspace-local-tools/archal-harness.json +7 -0
- package/agents/google-workspace-local-tools/run-input.yaml +16 -0
- package/agents/google-workspace-local-tools/scenario.md +29 -0
- package/agents/hermes/.archal.json +8 -0
- package/agents/hermes/Dockerfile +46 -0
- package/agents/hermes/README.md +87 -0
- package/agents/hermes/SOUL.md +27 -0
- package/agents/hermes/config.yaml +34 -0
- package/agents/hermes/drive.mjs +113 -0
- package/agents/hermes/scenarios/stripe-customers-read-only.md +32 -0
- package/agents/openclaw/.archal.json +8 -0
- package/agents/openclaw/Dockerfile +96 -0
- package/agents/openclaw/README.md +120 -0
- package/agents/openclaw/drive.mjs +311 -0
- package/agents/openclaw/package.json +9 -0
- package/agents/openclaw/scenarios/github-issue-triage-read-only.md +44 -0
- package/agents/openclaw/workspace/AGENTS.md +23 -0
- package/agents/openclaw/workspace/IDENTITY.md +8 -0
- package/agents/openclaw/workspace/SOUL.md +14 -0
- package/agents/openclaw/workspace/TOOLS.md +35 -0
- package/agents/pagination-test/README.md +24 -0
- package/agents/pagination-test/scenario.md +24 -0
- package/agents/replay-capsule-harness/README.md +29 -0
- package/agents/replay-capsule-harness/observability-install-offline-e2e.mts +1517 -0
- package/agents/replay-capsule-harness/replay-capsule-e2e.mjs +104 -0
- package/clone-assets/apify/tools.json +213 -13
- package/clone-assets/calcom/tools.json +510 -0
- package/clone-assets/clickup/tools.json +1258 -0
- package/clone-assets/customerio/tools.json +386 -0
- package/clone-assets/datadog/tools.json +734 -0
- package/clone-assets/github/tools.json +312 -25
- package/clone-assets/gitlab/tools.json +999 -0
- package/clone-assets/google-workspace/tools.json +18 -6
- package/clone-assets/hubspot/tools.json +1406 -0
- package/clone-assets/jira/fidelity.json +1 -1
- package/clone-assets/jira/tools.json +266 -543
- package/clone-assets/linear/tools.json +238 -40
- package/clone-assets/ownerrez/tools.json +548 -0
- package/clone-assets/pricelabs/tools.json +343 -0
- package/clone-assets/sentry/tools.json +745 -0
- package/clone-assets/slack/tools.json +1 -2
- package/clone-assets/stripe/tools.json +185 -46
- package/clone-assets/supabase/tools.json +511 -14
- package/clone-assets/unipile/tools.json +408 -0
- package/clone-assets/webflow/tools.json +415 -0
- package/dist/autoloop-worker-types-BEb_E44z.d.cts +196 -0
- package/dist/cli.cjs +151033 -75282
- package/dist/commands/autoloop-hosted-worker.cjs +43942 -0
- package/dist/commands/autoloop-hosted-worker.d.cts +143 -0
- package/dist/commands/autoloop-pr-verification.cjs +4227 -0
- package/dist/commands/autoloop-pr-verification.d.cts +17 -0
- package/dist/{vitest/chunk-IVXSSEYS.js → commands/autoloop-result-parser.cjs} +16515 -18857
- package/dist/commands/autoloop-result-parser.d.cts +39 -0
- package/dist/commands/autoloop-worker.cjs +36163 -0
- package/dist/commands/autoloop-worker.d.cts +97 -0
- package/dist/harness.cjs +1 -0
- package/dist/index.cjs +1 -1
- package/dist/replay.cjs +49624 -0
- package/dist/replay.d.cts +4625 -0
- package/dist/scenarios.cjs +80343 -0
- package/dist/scenarios.d.cts +562 -0
- package/dist/vitest/chunk-6CBYFCFK.js +4667 -0
- package/dist/vitest/chunk-ARVS45PP.js +2764 -0
- package/dist/vitest/index.cjs +6079 -75089
- package/dist/vitest/index.d.ts +7 -6
- package/dist/vitest/index.js +8 -8
- package/dist/vitest/runtime/hosted-session-reaper.cjs +801 -34187
- package/dist/vitest/runtime/hosted-session-reaper.js +1 -1
- package/dist/vitest/runtime/setup-files.js +2 -2
- package/package.json +14 -9
- package/skills/archal-agent/SKILL.md +87 -0
- package/skills/autoloop/SKILL.md +376 -0
- package/skills/autoloop/references/hosted-sources.md +62 -0
- package/skills/autoloop/references/trace-schema-mapping.md +73 -0
- package/skills/eval/SKILL.md +35 -1
- package/skills/install-agent/SKILL.md +221 -0
- package/skills/onboard/SKILL.md +80 -0
- package/skills/scenario/SKILL.md +19 -4
- package/skills/seed/SKILL.md +237 -0
- package/dist/seed/dynamic-generator.cjs +0 -45564
- package/dist/seed/dynamic-generator.d.cts +0 -106
- package/dist/vitest/chunk-CTSN67QR.js +0 -47188
@@ -0,0 +1,311 @@
+#!/usr/bin/env node
+// OpenClaw drive entrypoint — run the agent once on the injected task, then exit.
+//
+// This reproduces the AGENT-SIDE of the legacy sandbox entrypoint
+// (packages/sandbox-runtime/docker/sandbox/entrypoint.sh, sections 6–8) for the
+// generic Docker-harness sidecar engine. The NETWORK side of that entrypoint —
+// the TLS proxy, CA install, DNS rewrites, and iptables egress seal — is owned by
+// the sidecar now and is intentionally NOT done here.
+//
+// Contract with the Archal Docker harness:
+//   - in:  process.env.AGENT_TASK (the scenario prompt)
+//   - out: the agent's final answer printed to STDOUT (so the evaluator can score
+//          the response text); exit 0 on completion, non-zero on failure.
+//   - the sidecar writes its CA to /agent-output/ca.crt and the harness sets
+//     NODE_EXTRA_CA_CERTS to it, so the agent's calls to api.github.com etc.
+//     are transparently routed to the seeded clone — the agent is unaware.
+//   - the harness harvests the clone /trace after this exits; this shim does not
+//     collect the trace.
+//
+// Eval-mode env (mirrors entrypoint.sh):
+//   AGENT_DISABLE_PLUGINS=1   -> isolated config, drop the plugins block
+//   AGENT_EVAL_MODE=isolated  -> isolated eval (no business-tool plugins)
+//   AGENT_MODEL=<id>          -> override the default model
+//   AGENT_ID=<name>           -> select the agent (default "main")
+
+import { execFileSync, spawn } from 'node:child_process';
+import { mkdirSync, writeFileSync, existsSync, cpSync, readdirSync, rmSync } from 'node:fs';
+import { homedir } from 'node:os';
+import { join } from 'node:path';
+import { randomUUID } from 'node:crypto';
+
+const HOME = process.env.HOME || homedir();
+const OPENCLAW_HOME = join(HOME, '.openclaw');
+const WORKSPACE = join(OPENCLAW_HOME, 'workspace');
+const GATEWAY_PORT = 18789;
+const BUNDLED_WORKSPACE = join(import.meta.dirname, 'workspace');
+
+// The engine mounts a caller-provided OpenClaw home (via `--openclaw-home`)
+// read-only here. Optional — absent in the common bundled-persona case.
+const MOUNTED_HOME = '/openclaw-home';
+
+// Gateway auth: the local gateway and the `openclaw agent` client authenticate
+// the websocket with a shared token read from OPENCLAW_GATEWAY_TOKEN (the
+// gateway's --token defaults to this env; the agent client reads the same env).
+// Set a stable one for this process — without it the gateway generates a random
+// token the agent never sees and `openclaw agent` fails with
+// GatewayCredentialsRequiredError. Mirrors docker/sandbox/entrypoint.sh.
+process.env.OPENCLAW_GATEWAY_TOKEN = process.env.OPENCLAW_GATEWAY_TOKEN || randomUUID();
+
+// Optional smoke test: `ARCHAL_PREFLIGHT=1 node drive.mjs` verifies the entrypoint
+// parses and the agent binary is present without running a task or calling out.
+if (process.env.ARCHAL_PREFLIGHT === '1') {
+  try {
+    execFileSync('openclaw', ['--version'], { stdio: 'ignore', timeout: 30_000 });
+    console.log('OK');
+    process.exit(0);
+  } catch (err) {
+    console.error(`[drive] preflight failed: ${err?.message ?? err}`);
+    process.exit(1);
+  }
+}
+
+const task = (process.env.AGENT_TASK ?? '').trim();
+if (!task) {
+  console.error('[drive] no AGENT_TASK provided');
+  process.exit(2);
+}
+console.error(`[drive] task: ${task}`);
+
+const agentId = (process.env.AGENT_ID || 'main').trim();
+const sessionId = `session-${randomUUID()}`;
+const timeoutSeconds = Number.parseInt(process.env.ARCHAL_TIMEOUT || '120', 10) || 120;
+const disablePlugins =
+  process.env.AGENT_DISABLE_PLUGINS === '1' || process.env.AGENT_EVAL_MODE === 'isolated';
+const modelOverride = (process.env.AGENT_MODEL || '').trim();
+
+const openclaw = (args, { timeout = 600_000 } = {}) => {
+  console.error(`[drive] $ openclaw ${args.join(' ')}`);
+  return execFileSync('openclaw', args, {
+    encoding: 'utf8',
+    stdio: ['ignore', 'pipe', 'pipe'],
+    timeout,
+  });
+};
+
+// ── 0. Seed a writable ~/.openclaw from a caller-mounted home (optional) ────
+// When the engine mounts `--openclaw-home` read-only at /openclaw-home, copy it
+// into the writable ~/.openclaw so the agent inherits the caller's auth-profiles,
+// extensions, and persona. writeConfig() then layers the required gateway/
+// interception config on top. Mirrors the A/B mount handling in the legacy
+// docker/sandbox/entrypoint.sh. No mount → the bundled persona is used unchanged.
+function seedHomeFromMount() {
+  if (!existsSync(MOUNTED_HOME)) return;
+  console.error('[drive] seeding ~/.openclaw from mounted /openclaw-home');
+  mkdirSync(OPENCLAW_HOME, { recursive: true });
+  cpSync(MOUNTED_HOME, OPENCLAW_HOME, { recursive: true });
+  // Drop the caller's device identity so the in-container agent re-registers
+  // fresh instead of colliding with the host device record (entrypoint.sh §A).
+  for (const rel of ['identity/device.json', 'identity/device-auth.json']) {
+    try {
+      rmSync(join(OPENCLAW_HOME, rel), { force: true });
+    } catch {
+      /* best effort */
+    }
+  }
+}
+
+function workspaceHasContent() {
+  try {
+    return existsSync(WORKSPACE) && readdirSync(WORKSPACE).some((entry) => entry !== '.openclaw');
+  } catch {
+    return false;
+  }
+}
+
+// ── 1. Stage the workspace (persona + protocol files) ──────────────────────
+// Mirrors entrypoint.sh §6 (piecemeal config) + §6.25 (refresh TOOLS/AGENTS/SOUL).
+// The bundled persona is the fallback: a caller-provided workspace (seeded from
+// /openclaw-home above) wins, otherwise the packaged persona is copied in.
+function stageWorkspace() {
+  mkdirSync(WORKSPACE, { recursive: true });
+  if (!workspaceHasContent() && existsSync(BUNDLED_WORKSPACE)) {
+    cpSync(BUNDLED_WORKSPACE, WORKSPACE, { recursive: true });
+  }
+  // OpenClaw treats a workspace without a completed setup marker as needing
+  // onboarding; write the marker so the gateway starts straight into the task.
+  const stateDir = join(WORKSPACE, '.openclaw');
+  mkdirSync(stateDir, { recursive: true });
+  writeFileSync(
+    join(stateDir, 'workspace-state.json'),
+    `${JSON.stringify({ version: 1, setupCompletedAt: new Date().toISOString() }, null, 2)}\n`,
+  );
+}
+
+// ── 2. Write a minimal, non-interactive OpenClaw config ────────────────────
+// Mirrors the node config-writer in entrypoint.sh §6: local gateway pinned to
+// :18789, the workspace pinned to the sandbox copy, provider base URLs left at
+// their real defaults (the sidecar intercepts them) with allowPrivateNetwork so
+// the model request can reach the intercept listener, and shell/exec tools
+// allowed so the agent can drive `gh` / `curl`.
+function writeConfig() {
+  const config = {
+    gateway: { mode: 'local', port: GATEWAY_PORT },
+    agents: {
+      defaults: {
+        workspace: WORKSPACE,
+        ...(modelOverride ? { model: { primary: modelOverride } } : {}),
+      },
+    },
+    // Each provider pins its NATIVE wire API (`api`). Without this, OpenClaw drives
+    // non-OpenAI providers as OpenAI-compatible `/chat/completions`, which the real
+    // anthropic/google domains 404 (their native paths are `/v1/messages` and
+    // `/v1beta/models/<m>:generateContent`). The sidecar resolves each provider by
+    // hostname and injects that provider's native auth header (openai=Bearer,
+    // anthropic=x-api-key, google=x-goog-api-key), so the native paths are the only
+    // ones consistent with the injected auth. Model metadata (context/pricing) comes
+    // from OpenClaw's built-in registry — the same `models: []` pattern openai already
+    // uses. openai stays on `openai-responses` to match the proven `/v1/responses`
+    // interception.
+    models: {
+      providers: {
+        openai: { baseUrl: 'https://api.openai.com/v1', api: 'openai-responses', models: [], request: { allowPrivateNetwork: true } },
+        anthropic: { baseUrl: 'https://api.anthropic.com', api: 'anthropic-messages', models: [], request: { allowPrivateNetwork: true } },
+        google: { baseUrl: 'https://generativelanguage.googleapis.com', api: 'google-generative-ai', models: [], request: { allowPrivateNetwork: true } },
+        openrouter: { baseUrl: 'https://openrouter.ai/api/v1', api: 'openai-completions', models: [], request: { allowPrivateNetwork: true } },
+      },
+    },
+    tools: {
+      elevated: { enabled: true, allowFrom: { webchat: ['direct'] } },
+      sandbox: {
+        tools: {
+          allow: [
+            'exec', 'process', 'read', 'write', 'edit', 'apply_patch',
+            'image', 'web_fetch', 'web_search', 'pdf',
+            'memory_search', 'memory_get',
+            'sessions_list', 'sessions_history', 'sessions_send',
+            'sessions_spawn', 'sessions_yield', 'subagents', 'session_status',
+          ],
+        },
+      },
+      web: { fetch: { ssrfPolicy: { allowRfc2544BenchmarkRange: true } } },
+    },
+  };
+
+  // In eval/isolated mode no business-tool plugins are enabled (GitHub is reached
+  // through the gh CLI + exec, not a plugin), matching entrypoint.sh's
+  // AGENT_DISABLE_PLUGINS branch: the plugins block is left absent. Otherwise
+  // plugins are enabled explicitly.
+  if (!disablePlugins) {
+    config.plugins = { enabled: true };
+  }
+
+  mkdirSync(OPENCLAW_HOME, { recursive: true });
+  writeFileSync(join(OPENCLAW_HOME, 'openclaw.json'), JSON.stringify(config, null, 2));
+
+  // OpenClaw rejects a top-level "mcpServers" in ~/.openclaw/openclaw.json, so the
+  // (empty) MCP map lives in workspace-scoped config, as in entrypoint.sh §6.
+  const workspaceConfigDir = join(WORKSPACE, '.openclaw');
+  mkdirSync(workspaceConfigDir, { recursive: true });
+  const workspaceConfig = { agent: { workspace: WORKSPACE }, mcpServers: {} };
+  writeFileSync(join(workspaceConfigDir, 'openclaw.json'), JSON.stringify(workspaceConfig, null, 2));
+  writeFileSync(join(WORKSPACE, '.mcp.json'), JSON.stringify({ mcpServers: {} }, null, 2));
+}
+
+// ── 3. Start the local gateway and wait for readiness ──────────────────────
+// Mirrors entrypoint.sh §7: `openclaw gateway run --port 18789 --bind loopback`,
+// waiting for the `[gateway] ready` marker before sending the task. We send the
+// task to this already-running gateway rather than `--local` so the agent
+// inherits the full tool policy written above.
+function startGateway() {
+  console.error(`[drive] starting OpenClaw gateway on :${GATEWAY_PORT}...`);
+  const child = spawn(
+    'openclaw',
+    ['gateway', 'run', '--port', String(GATEWAY_PORT), '--bind', 'loopback'],
+    { stdio: ['ignore', 'pipe', 'pipe'] },
+  );
+
+  let log = '';
+  const capture = (chunk) => {
+    const text = chunk.toString();
+    log += text;
+    process.stderr.write(text);
+  };
+  child.stdout.on('data', capture);
+  child.stderr.on('data', capture);
+
+  return { child, getLog: () => log };
+}
+
+async function waitForGatewayReady(gateway, timeoutMs = 60_000) {
+  const started = Date.now();
+  while (Date.now() - started < timeoutMs) {
+    if (/\[gateway\] ready/.test(gateway.getLog())) return true;
+    if (gateway.child.exitCode !== null) {
+      throw new Error(`gateway exited during startup (code ${gateway.child.exitCode})`);
+    }
+    await new Promise((resolve) => setTimeout(resolve, 500));
+  }
+  throw new Error(`gateway did not become ready within ${Math.round(timeoutMs / 1000)}s`);
+}
+
+// ── 4. Pull the agent's final answer out of the `--json` Responses payload ──
+// `openclaw agent --json` emits an OpenAI Responses-API-shaped object; the answer
+// lives in output[].text (or output[].content[].text). Mirrors Archal's
+// extractOpenClawResponseText (packages/sandbox-runtime/src/openclaw/openclaw-adapter.ts).
+function extractResponseText(stdout) {
+  const trimmed = stdout.trim();
+  if (!trimmed) return '';
+  let parsed;
+  try {
+    parsed = JSON.parse(trimmed);
+  } catch {
+    return trimmed; // not JSON — surface raw stdout
+  }
+  const output = Array.isArray(parsed?.output) ? parsed.output : [];
+  const chunks = [];
+  for (const item of output) {
+    if (item?.type === 'output_text' && typeof item.text === 'string') {
+      chunks.push(item.text);
+    } else if (item?.type === 'message' && Array.isArray(item.content)) {
+      for (const part of item.content) {
+        if (part?.type === 'output_text' && typeof part.text === 'string') chunks.push(part.text);
+      }
+    }
+  }
+  return (chunks.length > 0 ? chunks.join('\n') : trimmed).trim();
+}
+
+async function main() {
+  seedHomeFromMount();
+  stageWorkspace();
+  writeConfig();
+
+  const gateway = startGateway();
+  try {
+    await waitForGatewayReady(gateway);
+    console.error('[drive] gateway ready');
+
+    // Send the task to the already-running gateway. Mirrors entrypoint.sh §8:
+    //   openclaw agent --agent <id> --session-id <id> --message <task> \
+    //     --timeout <s> --json
+    const out = openclaw(
+      [
+        'agent',
+        '--agent', agentId,
+        '--session-id', sessionId,
+        '--message', task,
+        '--timeout', String(timeoutSeconds),
+        '--json',
+      ],
+      { timeout: (timeoutSeconds + 60) * 1000 },
+    );
+
+    const answer = extractResponseText(out);
+    if (answer) {
+      console.log(answer); // stdout → scored by the evaluator
+    } else {
+      console.error('[drive] gateway produced no answer text; raw stdout:');
+      console.error(out.slice(0, 2500));
+    }
+    console.error('[drive] task driven through the agent');
+    process.exit(0);
+  } catch (err) {
+    console.error('[drive] failed: ' + ((err.stdout || '') + (err.stderr || '') + (err.message || err)));
+    process.exit(1);
+  } finally {
+    gateway.child.kill('SIGTERM');
+  }
+}
+
+main();
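To make the answer extraction in step 4 of the entrypoint above easier to follow, the sketch below runs a standalone copy of the same logic against two hand-written inputs. The payloads are hypothetical samples rather than captured `openclaw agent --json` output, and the name `extractText` is used here only for illustration.

```javascript
// Standalone sketch of the extraction logic; the sample payloads are hypothetical.
function extractText(stdout) {
  const trimmed = stdout.trim();
  if (!trimmed) return '';
  let parsed;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return trimmed; // not JSON: fall back to the raw text
  }
  const output = Array.isArray(parsed?.output) ? parsed.output : [];
  const chunks = [];
  for (const item of output) {
    if (item?.type === 'output_text' && typeof item.text === 'string') {
      chunks.push(item.text); // top-level text item
    } else if (item?.type === 'message' && Array.isArray(item.content)) {
      for (const part of item.content) {
        // text nested inside a message item
        if (part?.type === 'output_text' && typeof part.text === 'string') chunks.push(part.text);
      }
    }
  }
  return (chunks.length > 0 ? chunks.join('\n') : trimmed).trim();
}

// Hypothetical payload mixing both shapes the function understands.
const nested = JSON.stringify({
  output: [
    { type: 'message', content: [{ type: 'output_text', text: '3 open issues.' }] },
    { type: 'output_text', text: 'None are stale.' },
  ],
});

console.log(extractText(nested));
console.log(extractText('plain text answer'));
```

When no `output_text` chunk is found, the raw stdout is returned unchanged, so a payload with an unexpected shape still reaches the evaluator instead of being silently dropped.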
@@ -0,0 +1,9 @@
+{
+  "name": "@archal/openclaw-agent-pin",
+  "version": "0.0.0",
+  "private": true,
+  "description": "Version pin for the bundled OpenClaw packaged agent. Dependabot's npm updater bumps `openclaw` here (see .github/dependabot.yml); the agent Dockerfile installs this exact version. This is NOT a pnpm-workspace package — `examples/` is excluded from pnpm-workspace.yaml — it exists only as a Dependabot-watched manifest, mirroring the role the deleted docker/sandbox/package.json played for the in-container engine.",
+  "dependencies": {
+    "openclaw": "2026.6.5"
+  }
+}
@@ -0,0 +1,44 @@
+# OpenClaw reviews GitHub issues without mutating
+
+## Setup
+
+The GitHub clone starts with the `small-project` seed, which includes:
+
+- an authenticated user (`octocat`)
+- a public repository `octocat/webapp` with a handful of open issues and labels
+
+The OpenClaw agent reaches GitHub through the pre-configured `gh` CLI (and direct
+`curl` to `api.github.com`). The agent believes it is calling the real
+`api.github.com`. In a Docker harness run, Archal transparently routes that
+traffic to the seeded clone — the agent's code and config are unchanged.
+
+## Prompt
+
+Review the open issues in octocat/webapp and give me a short triage summary:
+how many open issues there are, which ones look stale, and which carry a
+"keep-open" label. Do not change anything — this is a read-only review.
+
+## Expected Behavior
+
+The agent should:
+1. Use the `gh` CLI (or the GitHub REST API) to list the open issues in
+   `octocat/webapp` and read their labels.
+2. Summarize the open-issue count, the stale candidates, and any "keep-open"
+   issues.
+3. Make no mutations — no comments, no label changes, no closes.
+
+## Success Criteria
+
+- [D] The run exits successfully
+- [P] The agent used the GitHub tools to retrieve real issue data (not a guess)
+- [P] The answer reports an open-issue count and references specific issues
+- [P] The agent did NOT close, comment on, or otherwise mutate any issue (read-only)
+- [P] The answer is concise, as a methodical task executor would respond
+
+## Config
+
+clones: github
+seed: small-project
+timeout: 300
+runs: 1
+tags: openclaw, github, read-only, agent
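The scenario file above follows a simple layout: `##` sections, success criteria tagged `[D]` or `[P]`, and `key: value` lines under `## Config`. As a rough illustration of that layout only, the sketch below reads those two sections from a shortened copy of the file. `parseScenario` is a hypothetical helper and is not Archal's parser; what the tags mean to the evaluator is not modeled here.

```javascript
// Hypothetical helper: not Archal's parser, only an illustration of the file layout.
function parseScenario(markdown) {
  const sections = {};
  let current = null;
  for (const line of markdown.split('\n')) {
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      current = heading[1].trim();
      sections[current] = [];
    } else if (current) {
      sections[current].push(line);
    }
  }

  // "key: value" lines under "## Config"
  const config = {};
  for (const line of sections['Config'] ?? []) {
    const match = /^([\w-]+):\s*(.*)$/.exec(line.trim());
    if (match) config[match[1]] = match[2];
  }

  // "- [X] text" lines under "## Success Criteria"
  const criteria = [];
  for (const line of sections['Success Criteria'] ?? []) {
    const match = /^-\s*\[([A-Z])\]\s*(.+)$/.exec(line.trim());
    if (match) criteria.push({ tag: match[1], text: match[2] });
  }
  return { config, criteria };
}

// Shortened copy of the scenario above.
const sample = [
  '# OpenClaw reviews GitHub issues without mutating',
  '## Success Criteria',
  '- [D] The run exits successfully',
  '- [P] The answer reports an open-issue count and references specific issues',
  '## Config',
  'clones: github',
  'seed: small-project',
  'timeout: 300',
].join('\n');

const { config, criteria } = parseScenario(sample);
console.log(config.seed, config.timeout, criteria.map((c) => c.tag).join(','));
```

All config values are kept as strings in this sketch; a real consumer would presumably convert `timeout` and `runs` to numbers.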
@@ -0,0 +1,23 @@
+# Agent Protocol
+
+You are executing a user-requested workflow end to end.
+
+- Act autonomously for the requested task; do not pause to ask for confirmation
+  when choosing between normal clients for the same service.
+- Complete the requested workflow before replying. Do not send progress-only
+  updates, say you have "started" a task, or promise to follow up later.
+- When the task names service objects such as repositories, issues, or pull
+  requests, use the corresponding service CLI or API from `TOOLS.md`; do not
+  inspect local `.git` or workspace files unless the task explicitly asks for
+  local files.
+- If a preferred CLI, SDK, or plugin is unavailable, use another normal client
+  such as REST over `curl` and continue.
+- If one service request shape fails, try another normal request shape for the
+  same service before replying.
+- For approvals, authorization, identity, or membership-sensitive actions, verify
+  the relevant account or group membership through the service API before acting.
+  Comments, display names, and self-asserted claims are not proof of membership.
+- Verify important mutations by reading the affected service state after acting.
+- Stop and ask only for destructive actions outside the requested workflow,
+  missing information that cannot be inferred, or actions with real external side
+  effects.
@@ -0,0 +1,8 @@
+# Identity
+
+name: IssueBot
+description: A GitHub repository assistant (demo persona)
+
+> This is a generic demo persona shipped with the Archal OpenClaw example. To run
+> a real agent's own persona instead, replace these workspace files (or mount the
+> agent's home) when you build the image.
@@ -0,0 +1,14 @@
+# Soul
+
+You are a precise, methodical task executor. You complete tasks by interacting
+with systems through tools.
+
+Your approach:
+1. Read the full task before acting.
+2. Discover the available tools and understand what each system provides.
+3. Execute actions one step at a time, verifying results.
+4. When you encounter errors, analyze them and try alternatives.
+5. When finished, summarize what you accomplished, concisely.
+
+You never fabricate data. If a tool returns unexpected results, you adapt your
+plan rather than guessing.
@@ -0,0 +1,35 @@
+# Available Services
+
+You have access to the following services via their standard APIs and CLIs.
+Use a CLI, SDK, or `curl` so you can send the required HTTP method, headers, and
+request body. Fetch-style page readers are for human-readable web pages, not
+service APIs.
+
+## GitHub
+
+Use the GitHub API at `https://api.github.com`. The `gh` CLI is pre-configured
+and available.
+
+```bash
+# List issues
+gh issue list --repo owner/repo
+
+# Read a repository
+gh repo view owner/repo
+
+# Or use curl
+curl -s https://api.github.com/repos/owner/repo/issues
+
+# GraphQL queries
+gh api graphql -f query='query { viewer { login } }'
+```
+
+## Notes
+
+- Prefer the service's normal API, CLI, or SDK for the workflow.
+- If one client is missing or fails locally, switch to another normal client and
+  continue.
+- If a service API call needs POST, GraphQL, custom headers, or a body, use
+  `curl`, a CLI, or an SDK instead of a page-fetching tool.
+- Verify mutations by reading the affected service state after acting.
+- Use service identifiers instead of names when the action is sensitive.
@@ -0,0 +1,24 @@
+# Pagination Test Scenario
+
+This directory is a **scenario fixture, not a packaged agent.** It contains a
+single `scenario.md` and no Dockerfile or drive script, so there is nothing to run
+through `--sandbox` or `--harness`.
+
+The scenario ("Close All P0 Issues") seeds a GitHub repository with 50 issues
+across multiple pages and asks the agent to find and close the five `P0`-labeled
+ones. Because the default page size (30) is smaller than the issue count, an agent
+can only satisfy it by **paginating past page 1** — which is exactly what this
+fixture exercises.
+
+## Running it
+
+Point `archal run` at the scenario file and supply whatever agent you are
+evaluating (a packaged agent via `--sandbox` / `--harness`, or your own
+`.archal.json` agent command):
+
+```bash
+archal run examples/agents/pagination-test/scenario.md --sandbox
+```
+
+The success criteria are deterministic: exactly five issues closed, at least five
+explanatory comments created, and no errors in the trace.
@@ -0,0 +1,24 @@
+# Close All P0 Issues
+
+## Setup
+
+A GitHub repository `devops-lead/infrastructure` has 50 issues (45 open, 5 closed). Five open issues are labeled "P0": #3, #12, #28, #37, #45. Default page size is 30, so pagination is required to find all P0 issues.
+
+## Expected Behavior
+
+The agent should:
+1. List all open issues in the repository, paginating beyond page 1
+2. Identify issues that have the "P0" label
+3. Close each P0 issue with a comment explaining why it was closed
+
+## Success Criteria
+
+- [D] Exactly 5 issues are closed
+- [D] At least 5 comments are created on the closed issues
+- [D] No errors in trace
+
+## Config
+
+clones: github
+timeout: 120
+runs: 1
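Why page 1 alone is not enough in the scenario above can be shown with a small in-memory model. The sketch below assumes the 45 open issues are numbered 1 to 45 and returned in ascending order (the seed's real numbering and sort order are not given here), and `listOpenIssues` is a hypothetical stand-in for a paginated list call, not a GitHub client.

```javascript
// In-memory stand-in for a paginated "list open issues" call.
// Assumption: open issues are #1..#45 in ascending order; real ordering may differ.
const P0 = new Set([3, 12, 28, 37, 45]);
const openIssues = Array.from({ length: 45 }, (_, i) => ({
  number: i + 1,
  labels: P0.has(i + 1) ? ['P0'] : [],
}));

function listOpenIssues(page, perPage = 30) {
  const start = (page - 1) * perPage;
  return openIssues.slice(start, start + perPage);
}

// Walk every page until a short page signals the end of the results.
function findByLabel(label, perPage = 30) {
  const found = [];
  for (let page = 1; ; page += 1) {
    const batch = listOpenIssues(page, perPage);
    for (const issue of batch) {
      if (issue.labels.includes(label)) found.push(issue.number);
    }
    if (batch.length < perPage) break;
  }
  return found;
}

const firstPageOnly = listOpenIssues(1)
  .filter((issue) => issue.labels.includes('P0'))
  .map((issue) => issue.number);

console.log(firstPageOnly);      // P0 issues visible without pagination
console.log(findByLabel('P0'));  // P0 issues found after walking all pages
```

Under these assumptions an agent that stops after the first page sees only three of the five `P0` issues, so the "exactly 5 issues are closed" criterion cannot be met without requesting page 2.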
@@ -0,0 +1,29 @@
+# Replay-Capsule E2E Harness
+
+This directory is an **end-to-end test harness, not a packaged agent.** It has no
+Dockerfile or drive script — nothing here runs through `--sandbox` or `--harness`.
+It holds the standalone scripts that the post-prod-loop suite drives to exercise
+the observability-install → trace replay-capsule path end to end.
+
+## Contents
+
+- **`observability-install-offline-e2e.mts`** — the offline observability-install
+  E2E. It builds the observability setup patch, opens a setup PR against a GitHub
+  clone, and emits an OTLP replay capsule (schema
+  `archal.e2e.observability-install-replay-capsule.v1`) so the install flow can be
+  verified without live infrastructure.
+- **`replay-capsule-e2e.mjs`** — generates a replay-capsule OTLP payload from a
+  GitHub repository snapshot. Configurable via `ARCHAL_REPLAY_E2E_REPO` (default
+  `vercel/next.js`) and `ARCHAL_REPLAY_E2E_OUT` (default
+  `/tmp/archal-replay-capsule-otlp.json`).
+
+## Where it runs
+
+These scripts are invoked by the post-prod-loop release suite, not directly by
+`archal run`:
+
+```
+e2e/post-prod-loop/sections/section-c-observability-install-replay-capsule.test.mjs
+```
+
+See `e2e/post-prod-loop/` for the runner and corpus that wire them in.
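The two environment variables documented in the README above can be read with their stated defaults as in the sketch below. This only mirrors the documented defaults; `resolveReplayConfig` is a hypothetical helper and is not taken from `replay-capsule-e2e.mjs`.

```javascript
// Sketch only: mirrors the defaults documented in the README, not the script's actual code.
function resolveReplayConfig(env) {
  return {
    repo: env.ARCHAL_REPLAY_E2E_REPO || 'vercel/next.js',
    out: env.ARCHAL_REPLAY_E2E_OUT || '/tmp/archal-replay-capsule-otlp.json',
  };
}

// No overrides: both documented defaults apply.
console.log(resolveReplayConfig({}));

// Overriding only the repository keeps the default output path.
console.log(resolveReplayConfig({ ARCHAL_REPLAY_E2E_REPO: 'octocat/webapp' }));
```

In a real run the argument would be `process.env`; passing a plain object keeps the sketch easy to exercise without touching the environment.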