openclaw-workflowskill 0.4.0 → 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +38 -11
- package/index.ts +78 -0
- package/lib/adapters.ts +10 -3
- package/lib/openclaw-context.md +41 -0
- package/package.json +6 -3
- package/tools/fetch_raw.ts +89 -0
- package/tools/scrape.ts +69 -0
package/README.md
CHANGED
|
@@ -42,7 +42,7 @@ openclaw plugins install openclaw-workflowskill
|
|
|
42
42
|
|
|
43
43
|
```bash
|
|
44
44
|
openclaw config set plugins.allow '["openclaw-workflowskill"]'
|
|
45
|
-
openclaw config set tools.alsoAllow '["openclaw-workflowskill"]'
|
|
45
|
+
openclaw config set tools.alsoAllow '["openclaw-workflowskill", "web_fetch"]'
|
|
46
46
|
```
|
|
47
47
|
|
|
48
48
|
The first command allowlists the plugin so the gateway loads it. The second makes its tools available in every session — including cron jobs — so agents can invoke `workflowskill_run` to execute workflows autonomously.
|
|
@@ -59,7 +59,7 @@ The gateway loads plugins and config at startup. A restart is required to pick u
|
|
|
59
59
|
|
|
60
60
|
```bash
|
|
61
61
|
openclaw plugins list
|
|
62
|
-
# → workflowskill:
|
|
62
|
+
# → workflowskill: 6 tools registered
|
|
63
63
|
|
|
64
64
|
openclaw skills list
|
|
65
65
|
# → workflowskill-author (user-invocable)
|
|
@@ -105,14 +105,16 @@ Your agent will use `"model": "haiku"` for cron jobs, ensuring execution is chea
|
|
|
105
105
|
|
|
106
106
|
## Tools
|
|
107
107
|
|
|
108
|
-
Registers
|
|
108
|
+
Registers six tools with the OpenClaw agent:
|
|
109
109
|
|
|
110
|
-
| Tool
|
|
111
|
-
|
|
|
112
|
-
| `workflowskill_validate`
|
|
113
|
-
| `workflowskill_run`
|
|
114
|
-
| `workflowskill_runs`
|
|
115
|
-
| `workflowskill_llm`
|
|
110
|
+
| Tool | Description |
|
|
111
|
+
| -------------------------- | ------------------------------------------------------------------ |
|
|
112
|
+
| `workflowskill_validate` | Parse and validate a SKILL.md or raw YAML workflow |
|
|
113
|
+
| `workflowskill_run` | Execute a workflow and return a compact run summary |
|
|
114
|
+
| `workflowskill_runs` | List and inspect past run logs |
|
|
115
|
+
| `workflowskill_llm` | Call Anthropic directly for inline LLM reasoning in workflows |
|
|
116
|
+
| `workflowskill_fetch_raw` | HTTP fetch returning `{ status, headers, body }` with JSON parsed |
|
|
117
|
+
| `workflowskill_scrape` | Fetch a web page and extract text via CSS selectors |
|
|
116
118
|
|
|
117
119
|
Also ships the `/workflowskill-author` skill — just say "I want to automate X" and the agent handles the rest: researching, writing, validating, and test-running the workflow in chat.
|
|
118
120
|
|
|
@@ -135,9 +137,9 @@ Review past runs via `workflowskill_runs`.
|
|
|
135
137
|
|
|
136
138
|
Workflow `tool` steps are forwarded to the **OpenClaw gateway** via `POST /tools/invoke`. Any tool registered with the gateway is available to a workflow — the plugin sends the tool name and args as JSON and returns the result. Gateway auth (`config.gateway.auth.token`) must be configured or the plugin will refuse to start.
|
|
137
139
|
|
|
138
|
-
The `workflowskill_llm`
|
|
140
|
+
The `workflowskill_llm`, `workflowskill_fetch_raw`, and `workflowskill_scrape` tools are built-in: they call external services directly without going through the gateway adapter.
|
|
139
141
|
|
|
140
|
-
Only `workflowskill_run` is blocked from gateway forwarding — it is self-referencing and would create infinite recursion. The other
|
|
142
|
+
Only `workflowskill_run` is blocked from gateway forwarding — it is self-referencing and would create infinite recursion. The other plugin tools (`workflowskill_validate`, `workflowskill_runs`, `workflowskill_llm`, `workflowskill_fetch_raw`, `workflowskill_scrape`) are leaf operations and are forwarded normally.
|
|
141
143
|
|
|
142
144
|
## Tool Reference
|
|
143
145
|
|
|
@@ -186,6 +188,31 @@ Call Anthropic directly and return the text response. Uses the API key from Open
|
|
|
186
188
|
|
|
187
189
|
Returns the LLM response as a plain text string.
|
|
188
190
|
|
|
191
|
+
### `workflowskill_fetch_raw`
|
|
192
|
+
|
|
193
|
+
Make an HTTP request and return the raw response with JSON auto-parsed. Use this instead of `web_fetch` when you need structured data from a JSON API — `web_fetch` converts responses to markdown, destroying JSON structure.
|
|
194
|
+
|
|
195
|
+
| Param | Type | Required | Description |
|
|
196
|
+
| --------- | ------ | -------- | -------------------------------------------------------------- |
|
|
197
|
+
| `url` | string | yes | The URL to fetch (http or https) |
|
|
198
|
+
| `method` | string | no | HTTP method — GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS |
|
|
199
|
+
| `headers` | object | no | Request headers as key-value pairs |
|
|
200
|
+
| `body` | string | no | Request body string (e.g. `JSON.stringify(...)` for POST/PUT) |
|
|
201
|
+
|
|
202
|
+
Returns `{ status, headers, body }`. `body` is a parsed object for `application/json` responses, or a raw string otherwise. Network errors return `{ status: 0, headers: {}, body: "<error message>" }` so workflows can branch on failure.
|
|
203
|
+
|
|
204
|
+
### `workflowskill_scrape`
|
|
205
|
+
|
|
206
|
+
Fetch a web page and extract structured data using CSS selectors.
|
|
207
|
+
|
|
208
|
+
| Param | Type | Required | Description |
|
|
209
|
+
| ----------- | ------ | -------- | -------------------------------------------------------------------- |
|
|
210
|
+
| `url` | string | yes | The page URL to fetch (http or https) |
|
|
211
|
+
| `selectors` | object | yes | Named CSS selectors, e.g. `{ "title": "h1", "prices": "span.price" }` |
|
|
212
|
+
| `headers` | object | no | Custom request headers as key-value pairs |
|
|
213
|
+
|
|
214
|
+
Returns `{ status, results }` where `results` maps each selector name to an array of matching text values. Errors return `{ status: 0, error: "<message>" }`.
|
|
215
|
+
|
|
189
216
|
## Development
|
|
190
217
|
|
|
191
218
|
The plugin imports from `workflowskill`, installed from npm. No build step is required for type checking:
|
package/index.ts
CHANGED
|
@@ -10,6 +10,8 @@ import { validateHandler } from './tools/validate.js';
|
|
|
10
10
|
import { runHandler } from './tools/run.js';
|
|
11
11
|
import { runsHandler } from './tools/runs.js';
|
|
12
12
|
import { llmHandler } from './tools/llm.js';
|
|
13
|
+
import { fetchRawHandler } from './tools/fetch_raw.js';
|
|
14
|
+
import { scrapeHandler } from './tools/scrape.js';
|
|
13
15
|
import { createToolAdapter, type GatewayConfig } from './lib/adapters.js';
|
|
14
16
|
|
|
15
17
|
// ─── OpenClaw plugin API types ─────────────────────────────────────────────
|
|
@@ -231,5 +233,81 @@ export default {
|
|
|
231
233
|
},
|
|
232
234
|
});
|
|
233
235
|
|
|
236
|
+
// ── workflowskill_scrape ──────────────────────────────────────────────
|
|
237
|
+
registerTool({
|
|
238
|
+
name: 'workflowskill_scrape',
|
|
239
|
+
description:
|
|
240
|
+
'Fetch a web page and extract structured data using CSS selectors. ' +
|
|
241
|
+
'Returns { status, results } where results maps each selector name to an array of matching text values. ' +
|
|
242
|
+
'Use when you need to extract specific content from HTML pages — ' +
|
|
243
|
+
'supply named selectors like { "title": "h1", "prices": "span.price" }. ' +
|
|
244
|
+
'Errors return { status: 0, error: "<message>" }.',
|
|
245
|
+
parameters: {
|
|
246
|
+
type: 'object',
|
|
247
|
+
properties: {
|
|
248
|
+
url: {
|
|
249
|
+
type: 'string',
|
|
250
|
+
description: 'The page URL to fetch (http or https).',
|
|
251
|
+
},
|
|
252
|
+
selectors: {
|
|
253
|
+
type: 'object',
|
|
254
|
+
description: 'Map of named CSS selectors, e.g. { "title": "h1", "prices": "span.price" }.',
|
|
255
|
+
},
|
|
256
|
+
headers: {
|
|
257
|
+
type: 'object',
|
|
258
|
+
description: 'Custom request headers as key-value pairs. Optional.',
|
|
259
|
+
},
|
|
260
|
+
},
|
|
261
|
+
required: ['url', 'selectors'],
|
|
262
|
+
},
|
|
263
|
+
execute: async (_id, params) => {
|
|
264
|
+
return toContent(
|
|
265
|
+
await scrapeHandler(
|
|
266
|
+
params as { url: string; selectors: Record<string, string>; headers?: Record<string, string> },
|
|
267
|
+
),
|
|
268
|
+
);
|
|
269
|
+
},
|
|
270
|
+
});
|
|
271
|
+
|
|
272
|
+
// ── workflowskill_fetch_raw ───────────────────────────────────────────
|
|
273
|
+
registerTool({
|
|
274
|
+
name: 'workflowskill_fetch_raw',
|
|
275
|
+
description:
|
|
276
|
+
'Make an HTTP request and return { status, headers, body } with JSON auto-parsed. ' +
|
|
277
|
+
'Use this instead of web_fetch when you need structured JSON from an API — ' +
|
|
278
|
+
'web_fetch converts responses to markdown, destroying JSON structure. ' +
|
|
279
|
+
'body is a parsed object for application/json responses, or a raw string otherwise. ' +
|
|
280
|
+
'Network errors return { status: 0, body: "<error message>" } so workflows can branch on failure.',
|
|
281
|
+
parameters: {
|
|
282
|
+
type: 'object',
|
|
283
|
+
properties: {
|
|
284
|
+
url: {
|
|
285
|
+
type: 'string',
|
|
286
|
+
description: 'The URL to fetch (http or https).',
|
|
287
|
+
},
|
|
288
|
+
method: {
|
|
289
|
+
type: 'string',
|
|
290
|
+
description: 'HTTP method (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS). Defaults to GET.',
|
|
291
|
+
},
|
|
292
|
+
headers: {
|
|
293
|
+
type: 'object',
|
|
294
|
+
description: 'Request headers as key-value pairs. Optional.',
|
|
295
|
+
},
|
|
296
|
+
body: {
|
|
297
|
+
type: 'string',
|
|
298
|
+
description: 'Request body as a string (e.g. JSON.stringify output). Optional.',
|
|
299
|
+
},
|
|
300
|
+
},
|
|
301
|
+
required: ['url'],
|
|
302
|
+
},
|
|
303
|
+
execute: async (_id, params) => {
|
|
304
|
+
return toContent(
|
|
305
|
+
await fetchRawHandler(
|
|
306
|
+
params as { url: string; method?: string; headers?: Record<string, string>; body?: string },
|
|
307
|
+
),
|
|
308
|
+
);
|
|
309
|
+
},
|
|
310
|
+
});
|
|
311
|
+
|
|
234
312
|
},
|
|
235
313
|
};
|
package/lib/adapters.ts
CHANGED
|
@@ -6,15 +6,22 @@ import type { ToolAdapter, ToolDescriptor, ToolResult } from 'workflowskill';
|
|
|
6
6
|
|
|
7
7
|
/** Unwrap MCP text-content envelope so workflows see actual tool output. */
|
|
8
8
|
function unwrapMcpContent(body: unknown): unknown {
|
|
9
|
-
|
|
10
|
-
|
|
9
|
+
// Peel outer { ok, result } gateway envelope if present.
|
|
10
|
+
let inner: unknown = body;
|
|
11
|
+
if (inner !== null && typeof inner === 'object' && 'ok' in inner && 'result' in inner) {
|
|
12
|
+
inner = (inner as { ok: unknown; result: unknown }).result;
|
|
13
|
+
}
|
|
14
|
+
|
|
15
|
+
// Peel MCP { content: [{ type: 'text', text: '...' }] } envelope.
|
|
16
|
+
if (inner !== null && typeof inner === 'object' && 'content' in inner) {
|
|
17
|
+
const { content } = inner as { content: unknown };
|
|
11
18
|
if (Array.isArray(content) && content.length === 1
|
|
12
19
|
&& content[0]?.type === 'text' && typeof content[0]?.text === 'string') {
|
|
13
20
|
try { return JSON.parse(content[0].text); }
|
|
14
21
|
catch { return content[0].text; }
|
|
15
22
|
}
|
|
16
23
|
}
|
|
17
|
-
return
|
|
24
|
+
return inner;
|
|
18
25
|
}
|
|
19
26
|
|
|
20
27
|
export interface GatewayConfig {
|
package/lib/openclaw-context.md
CHANGED
|
@@ -34,3 +34,44 @@ Always set `"model": "haiku"` on cron payloads — cron runs are lightweight orc
|
|
|
34
34
|
> ```
|
|
35
35
|
>
|
|
36
36
|
> Without this, cron sessions cannot invoke `workflowskill_run` and will fail silently.
|
|
37
|
+
|
|
38
|
+
### Fetching Raw API Data
|
|
39
|
+
|
|
40
|
+
Use `workflowskill_fetch_raw` when a workflow step needs structured data from an HTTP API. Unlike `web_fetch`, which converts responses to markdown (destroying JSON structure), `workflowskill_fetch_raw` returns a parsed object for `application/json` responses.
|
|
41
|
+
|
|
42
|
+
**Return shape:**
|
|
43
|
+
```json
|
|
44
|
+
{ "status": 200, "headers": { "content-type": "application/json" }, "body": { ... } }
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
Access response data via `$result.body.<field>`. Network errors return `status: 0` and a string `body` describing the error, so workflows can branch on failure.
|
|
48
|
+
|
|
49
|
+
**GET request (JSON API):**
|
|
50
|
+
```yaml
|
|
51
|
+
steps:
|
|
52
|
+
- id: fetch_jobs
|
|
53
|
+
type: tool
|
|
54
|
+
tool: workflowskill_fetch_raw
|
|
55
|
+
params:
|
|
56
|
+
url: "https://boards-api.greenhouse.io/v1/boards/intrinsic/jobs"
|
|
57
|
+
- id: count_jobs
|
|
58
|
+
type: tool
|
|
59
|
+
tool: workflowskill_llm
|
|
60
|
+
params:
|
|
61
|
+
prompt: "There are {{ steps.fetch_jobs.result.body.jobs.length }} jobs."
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
**POST request with JSON body:**
|
|
65
|
+
```yaml
|
|
66
|
+
steps:
|
|
67
|
+
- id: create_item
|
|
68
|
+
type: tool
|
|
69
|
+
tool: workflowskill_fetch_raw
|
|
70
|
+
params:
|
|
71
|
+
url: "https://api.example.com/items"
|
|
72
|
+
method: POST
|
|
73
|
+
headers:
|
|
74
|
+
Content-Type: application/json
|
|
75
|
+
Authorization: "Bearer {{ inputs.token }}"
|
|
76
|
+
body: '{"name": "{{ inputs.name }}"}'
|
|
77
|
+
```
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "openclaw-workflowskill",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.5.0",
|
|
4
4
|
"description": "WorkflowSkill plugin for OpenClaw — author, validate, run, and review YAML workflows",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "index.ts",
|
|
@@ -35,7 +35,8 @@
|
|
|
35
35
|
]
|
|
36
36
|
},
|
|
37
37
|
"dependencies": {
|
|
38
|
-
"
|
|
38
|
+
"cheerio": "^1.0.0",
|
|
39
|
+
"workflowskill": "^0.5.0"
|
|
39
40
|
},
|
|
40
41
|
"devDependencies": {
|
|
41
42
|
"@types/node": "^25.3.0",
|
|
@@ -44,6 +45,8 @@
|
|
|
44
45
|
},
|
|
45
46
|
"scripts": {
|
|
46
47
|
"typecheck": "tsc --noEmit --project tsconfig.json",
|
|
47
|
-
"prepublishOnly": "tsc --noEmit --project tsconfig.json"
|
|
48
|
+
"prepublishOnly": "tsc --noEmit --project tsconfig.json",
|
|
49
|
+
"dev:link": "./scripts/dev-link.sh",
|
|
50
|
+
"dev:unlink": "./scripts/dev-unlink.sh"
|
|
48
51
|
}
|
|
49
52
|
}
|
|
package/tools/fetch_raw.ts
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
1
|
+
// workflowskill_fetch_raw — HTTP fetch that preserves JSON structure.
|
|
2
|
+
|
|
3
|
+
const ALLOWED_METHODS = new Set(['GET', 'POST', 'PUT', 'PATCH', 'DELETE', 'HEAD', 'OPTIONS']);
|
|
4
|
+
const TIMEOUT_MS = 30_000;
|
|
5
|
+
const MAX_BYTES = 10 * 1024 * 1024; // 10 MB
|
|
6
|
+
|
|
7
|
+
export interface FetchRawParams {
|
|
8
|
+
url: string;
|
|
9
|
+
method?: string;
|
|
10
|
+
headers?: Record<string, string>;
|
|
11
|
+
body?: string;
|
|
12
|
+
}
|
|
13
|
+
|
|
14
|
+
export interface FetchRawResult {
|
|
15
|
+
status: number;
|
|
16
|
+
headers: Record<string, string>;
|
|
17
|
+
body: unknown;
|
|
18
|
+
}
|
|
19
|
+
|
|
20
|
+
export async function fetchRawHandler(params: FetchRawParams): Promise<FetchRawResult> {
|
|
21
|
+
const { url, method = 'GET', headers = {}, body } = params;
|
|
22
|
+
|
|
23
|
+
// Protocol validation
|
|
24
|
+
let parsed: URL;
|
|
25
|
+
try {
|
|
26
|
+
parsed = new URL(url);
|
|
27
|
+
} catch {
|
|
28
|
+
return { status: 0, headers: {}, body: `Invalid URL: ${url}` };
|
|
29
|
+
}
|
|
30
|
+
if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
|
|
31
|
+
return { status: 0, headers: {}, body: `Unsupported protocol: ${parsed.protocol}` };
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
// Method validation
|
|
35
|
+
const upperMethod = method.toUpperCase();
|
|
36
|
+
if (!ALLOWED_METHODS.has(upperMethod)) {
|
|
37
|
+
return { status: 0, headers: {}, body: `Unsupported method: ${method}` };
|
|
38
|
+
}
|
|
39
|
+
|
|
40
|
+
let response: Response;
|
|
41
|
+
try {
|
|
42
|
+
response = await fetch(url, {
|
|
43
|
+
method: upperMethod,
|
|
44
|
+
headers,
|
|
45
|
+
body: body !== undefined ? body : undefined,
|
|
46
|
+
signal: AbortSignal.timeout(TIMEOUT_MS),
|
|
47
|
+
});
|
|
48
|
+
} catch (err) {
|
|
49
|
+
return { status: 0, headers: {}, body: err instanceof Error ? err.message : String(err) };
|
|
50
|
+
}
|
|
51
|
+
|
|
52
|
+
// Collect response headers
|
|
53
|
+
const responseHeaders: Record<string, string> = {};
|
|
54
|
+
response.headers.forEach((value, key) => {
|
|
55
|
+
responseHeaders[key] = value;
|
|
56
|
+
});
|
|
57
|
+
|
|
58
|
+
// Read with size guard
|
|
59
|
+
let rawBody: string;
|
|
60
|
+
try {
|
|
61
|
+
const buffer = await response.arrayBuffer();
|
|
62
|
+
if (buffer.byteLength > MAX_BYTES) {
|
|
63
|
+
return {
|
|
64
|
+
status: response.status,
|
|
65
|
+
headers: responseHeaders,
|
|
66
|
+
body: `Response too large: ${buffer.byteLength} bytes (max ${MAX_BYTES})`,
|
|
67
|
+
};
|
|
68
|
+
}
|
|
69
|
+
rawBody = new TextDecoder().decode(buffer);
|
|
70
|
+
} catch (err) {
|
|
71
|
+
return {
|
|
72
|
+
status: response.status,
|
|
73
|
+
headers: responseHeaders,
|
|
74
|
+
body: err instanceof Error ? err.message : String(err),
|
|
75
|
+
};
|
|
76
|
+
}
|
|
77
|
+
|
|
78
|
+
// Auto-parse JSON
|
|
79
|
+
const contentType = response.headers.get('content-type') ?? '';
|
|
80
|
+
if (contentType.includes('application/json')) {
|
|
81
|
+
try {
|
|
82
|
+
return { status: response.status, headers: responseHeaders, body: JSON.parse(rawBody) as unknown };
|
|
83
|
+
} catch {
|
|
84
|
+
// Fall through to raw string if JSON parsing fails
|
|
85
|
+
}
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
return { status: response.status, headers: responseHeaders, body: rawBody };
|
|
89
|
+
}
|
package/tools/scrape.ts
ADDED
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
// workflowskill_scrape — Fetch a web page and extract data via CSS selectors.
|
|
2
|
+
|
|
3
|
+
import { load } from 'cheerio';
|
|
4
|
+
|
|
5
|
+
const TIMEOUT_MS = 30_000;
|
|
6
|
+
const MAX_BYTES = 10 * 1024 * 1024; // 10 MB
|
|
7
|
+
|
|
8
|
+
export interface ScrapeParams {
|
|
9
|
+
url: string;
|
|
10
|
+
selectors: Record<string, string>;
|
|
11
|
+
headers?: Record<string, string>;
|
|
12
|
+
}
|
|
13
|
+
|
|
14
|
+
export interface ScrapeResult {
|
|
15
|
+
status: number;
|
|
16
|
+
results?: Record<string, string[]>;
|
|
17
|
+
error?: string;
|
|
18
|
+
}
|
|
19
|
+
|
|
20
|
+
export async function scrapeHandler(params: ScrapeParams): Promise<ScrapeResult> {
|
|
21
|
+
const { url, selectors, headers = {} } = params;
|
|
22
|
+
|
|
23
|
+
// Protocol validation
|
|
24
|
+
let parsed: URL;
|
|
25
|
+
try {
|
|
26
|
+
parsed = new URL(url);
|
|
27
|
+
} catch {
|
|
28
|
+
return { status: 0, error: `Invalid URL: ${url}` };
|
|
29
|
+
}
|
|
30
|
+
if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
|
|
31
|
+
return { status: 0, error: `Unsupported protocol: ${parsed.protocol}` };
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
let response: Response;
|
|
35
|
+
try {
|
|
36
|
+
response = await fetch(url, {
|
|
37
|
+
headers,
|
|
38
|
+
signal: AbortSignal.timeout(TIMEOUT_MS),
|
|
39
|
+
});
|
|
40
|
+
} catch (err) {
|
|
41
|
+
return { status: 0, error: err instanceof Error ? err.message : String(err) };
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
// Read with size guard
|
|
45
|
+
let html: string;
|
|
46
|
+
try {
|
|
47
|
+
const buffer = await response.arrayBuffer();
|
|
48
|
+
if (buffer.byteLength > MAX_BYTES) {
|
|
49
|
+
return { status: response.status, error: `Response too large: ${buffer.byteLength} bytes (max ${MAX_BYTES})` };
|
|
50
|
+
}
|
|
51
|
+
html = new TextDecoder().decode(buffer);
|
|
52
|
+
} catch (err) {
|
|
53
|
+
return { status: response.status, error: err instanceof Error ? err.message : String(err) };
|
|
54
|
+
}
|
|
55
|
+
|
|
56
|
+
const $ = load(html);
|
|
57
|
+
const results: Record<string, string[]> = {};
|
|
58
|
+
|
|
59
|
+
for (const [key, selector] of Object.entries(selectors)) {
|
|
60
|
+
const texts: string[] = [];
|
|
61
|
+
$(selector).each((_i, el) => {
|
|
62
|
+
const text = $(el).text().trim();
|
|
63
|
+
if (text.length > 0) texts.push(text);
|
|
64
|
+
});
|
|
65
|
+
results[key] = texts;
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
return { status: response.status, results };
|
|
69
|
+
}
|