npm - simmer-autoresearch - Versions diffs - 0.1.0 - Mend

simmer-autoresearch 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/SKILL.md ADDED

@@ -0,0 +1,128 @@
+---
+name: simmer-autoresearch
+description: Set up and run autonomous experiment loops to optimize Simmer trading skills. Mutates skill code + config, measures P&L, keeps what works. Use when asked to "optimize a skill", "run autoresearch", or "improve my trading".
+---
+# Simmer Autoresearch
+Autonomous experiment loop for trading skill optimization: try ideas, keep what works, discard what doesn't, never stop.
+Fork of [pi-autoresearch](https://github.com/davebcn87/pi-autoresearch) adapted for prediction market trading.
+## Tools
+- **`init_experiment`** — configure session (name, metric, unit, direction). Call again to re-initialize with a new baseline.
+- **`run_experiment`** — runs skill command, times it, captures output.
+- **`log_experiment`** — records result. `keep` auto-commits. On `discard`/`crash` nothing is committed; revert the working tree yourself with `git checkout -- .`. Always include secondary `metrics` dict.
+## Setup
+1. Ask (or infer): **Skill** (which skill to optimize), **Goal** (maximize P&L? find more trades? reduce drawdown?), **Constraints** (budget, venues, markets).
+2. `git checkout -b autoresearch/<skill>-<date>`
+3. Read the skill source files deeply. Understand the strategy before changing anything.
+4. Write `autoresearch.md` and `autoresearch.sh` (see below). Commit both.
+5. `init_experiment` → run baseline → `log_experiment` → start looping immediately.
+### `autoresearch.md`
+This is the heart of the session. A fresh agent with no context should be able to read this file and run the loop effectively.
+```markdown
+# Autoresearch: Optimizing <skill-name> for <goal>
+## Objective
+<What we're optimizing. E.g. "Maximize P&L on polymarket-fast-loop by tuning
+entry thresholds, momentum signals, and position sizing.">
+## Metrics
+- **Primary**: pnl ($, higher is better)
+- **Secondary**: trades, win_rate, max_drawdown
+## How to Run
+`./autoresearch.sh` — runs the skill once, outputs METRIC lines.
+## Skill Overview
+<Brief description of what the skill does, its strategy, signal sources.>
+## Files in Scope
+<Every file the agent may modify, with a brief note on what it does.>
+- `fast_loop.py` — main strategy logic, config schema, entry/exit rules
+- `SKILL.md` — skill metadata (update version on significant changes)
+## Off Limits
+- `simmer-sdk/` core — don't modify the SDK itself
+- Other skills — only optimize the target skill
+- API keys, secrets, wallet addresses
+## Constraints
+- Skill must still pass: `python3 <skill>.py --live` exits 0
+- Don't remove safety guards (max position size, daily budget caps)
+- Config env vars must stay compatible with clawhub.json tunables
+- Don't break the `load_config()` / `get_client()` patterns from simmer-sdk
+## Market Conditions
+<Current market state that affects strategy. E.g. "Low volatility week,
+few new markets, existing markets trading near consensus.">
+## What's Been Tried
+<Update as experiments accumulate. Key wins, dead ends, insights.>
+```
+### `autoresearch.sh`
+```bash
+#!/bin/bash
+set -euo pipefail
+# Run the skill and capture output
+SKILL_DIR="skills/polymarket-fast-loop"
+OUTPUT=$(python3 "$SKILL_DIR/fast_loop.py" --live 2>&1) || {
+  echo "METRIC pnl=0"
+  echo "METRIC trades=0"
+  echo "STATUS crash"
+  exit 1
+}
+# Extract metrics from skill output
+# Skills print trade summaries — parse them
+echo "$OUTPUT"
+# TODO: Query Simmer API for actual P&L since experiment start
+# For now, extract from skill stdout
+echo "METRIC pnl=0"
+echo "METRIC trades=0"
+```
+## Loop Rules
+**LOOP FOREVER.** Never ask "should I continue?" — the user expects autonomous work.
+- **Primary metric is king.** P&L improved → `keep`. Worse/equal → `discard` (one exception: a simplification at equal P&L is a `keep`, see "Simpler is better"). Crash → `crash`.
+- **Trades > 0 is the gate.** A config that produces 0 trades is `discard` regardless of everything else. The whole point is to find opportunities.
+- **Quality gates (all must pass to `keep`):**
+  - `trades >= 3` — enough data to be meaningful
+  - `profit_factor >= 1.0` — not losing money overall
+  - P&L improved vs baseline
+  - If any gate fails, `discard` even if P&L looks better (could be one lucky trade)
+- **One parameter at a time.** Change one thing, measure, keep or revert. Don't change 3 things and guess which one helped. Reference clawhub.json tunables for parameter bounds (min/max/step).
+- **Simpler is better.** Removing code for equal P&L = keep. Complex hack for tiny gain = discard.
+- **Don't thrash.** Repeatedly reverting the same idea? Try something structurally different.
+- **Crashes:** fix if trivial (typo, missing import), otherwise log and move on.
+- **Think longer when stuck.** Re-read the skill source, study market data, understand what the strategy is actually doing. The best improvements come from understanding, not random parameter sweeps.
+- **Resuming:** if `autoresearch.md` exists, read it + git log, continue looping.
+## Trading-Specific Guidelines
+- **Start with config tuning** (thresholds, filters, sizing). Low risk, fast iterations.
+- **Graduate to strategy changes** once config space is explored. Signal logic, market selection, timing.
+- **Watch for overfitting.** A change that works on today's markets may not generalize. Prefer robust improvements.
+- **Respect position limits.** Never remove max_position_size or daily_budget caps — these are safety rails.
+- **Market conditions change.** What works in high-volatility weeks may fail in quiet periods. Note conditions in autoresearch.md.
+## Ideas Backlog
+When you discover promising optimizations you won't pursue right now, **append them to `autoresearch.ideas.md`**. Don't let good ideas get lost.
+On resume, check the ideas file — prune stale entries, experiment with promising ones.
+**NEVER STOP.** The user may be away for hours. Keep going until interrupted.
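
The quality gates in the Loop Rules above can be read as a small decision function. The sketch below is one possible reading, not code shipped in the package: `decideStatus`, the `run` object, and its field names are hypothetical, and `profitFactor` is assumed to be supplied by the caller because SKILL.md does not say how it is computed. The "Simpler is better" exception for equal P&L is deliberately left out.

```javascript
// Hypothetical helper (not part of the package): maps one run's numbers to a
// log_experiment status using the gates listed in SKILL.md.
function decideStatus(run, baselinePnl) {
    // A run that failed to execute is always logged as "crash".
    if (run.crashed) return "crash";
    // Gate 1: at least 3 trades, so one lucky trade cannot carry the result.
    if (run.trades < 3) return "discard";
    // Gate 2: profit factor of at least 1.0 (not losing money overall).
    if (run.profitFactor < 1.0) return "discard";
    // Gate 3: P&L must strictly beat the baseline; equal counts as no gain.
    return run.pnl > baselinePnl ? "keep" : "discard";
}

const baselinePnl = 10; // illustrative baseline P&L in dollars
const runs = [
    { crashed: false, trades: 5, profitFactor: 1.4, pnl: 12 }, // keep
    { crashed: false, trades: 1, profitFactor: 9.0, pnl: 50 }, // discard: too few trades
    { crashed: false, trades: 4, profitFactor: 0.8, pnl: 11 }, // discard: profit factor below 1
    { crashed: true, trades: 0, profitFactor: 0, pnl: 0 },     // crash
];
for (const run of runs) {
    console.log(decideStatus(run, baselinePnl));
}
```

Checking the trade count before the P&L comparison mirrors the rule that a failed gate means `discard` even when P&L looks better.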

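The `autoresearch.sh` template in SKILL.md prints `METRIC name=value` lines and, on failure, a `STATUS crash` line, but the package does not show how those lines are consumed. A minimal sketch of a parser follows, assuming plain decimal values and one metric per line; `parseMetrics` is a hypothetical name and is not part of the plugin.

```javascript
// Hypothetical parser for the "METRIC name=value" and "STATUS word" lines
// printed by the autoresearch.sh template. All other lines are ignored.
function parseMetrics(output) {
    const metrics = {};
    let status = null;
    for (const line of output.split("\n")) {
        // Assumed value format: optional minus sign, digits, optional fraction.
        const metricMatch = /^METRIC\s+([A-Za-z_]\w*)=(-?\d+(?:\.\d+)?)\s*$/.exec(line);
        if (metricMatch) {
            metrics[metricMatch[1]] = Number(metricMatch[2]);
            continue;
        }
        const statusMatch = /^STATUS\s+(\w+)\s*$/.exec(line);
        if (statusMatch) status = statusMatch[1];
    }
    return { metrics, status };
}

const sample = [
    "placed 3 orders",
    "METRIC pnl=-2.25",
    "METRIC trades=3",
].join("\n");
console.log(parseMetrics(sample));
console.log(parseMetrics("METRIC pnl=0\nMETRIC trades=0\nSTATUS crash"));
```

The first metric could be passed to `log_experiment` as `metric` and the rest as the secondary `metrics` dict; that mapping is a suggestion, since the template currently prints zeros as placeholders.
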
package/dist/index.d.ts ADDED

@@ -0,0 +1,65 @@
+/**
+ * Simmer Autoresearch — OpenClaw Plugin
+ *
+ * Fork of pi-autoresearch adapted for trading skill optimization.
+ * Agent mutates skill code + config, runs experiments, measures P&L,
+ * keeps what works, discards what doesn't. Never stops.
+ *
+ * Original: https://github.com/davebcn87/pi-autoresearch
+ * License: MIT (Tobi Lutke + David Cortés)
+ */
+interface SpawnResult {
+    stdout: string;
+    stderr: string;
+    code: number | null;
+    signal: NodeJS.Signals | null;
+    killed: boolean;
+    termination: "exit" | "timeout" | "no-output-timeout" | "signal";
+}
+interface PluginRuntime {
+    system: {
+        runCommandWithTimeout: (argv: string[], opts: {
+            timeoutMs: number;
+            cwd?: string;
+            env?: NodeJS.ProcessEnv;
+        }) => Promise<SpawnResult>;
+    };
+}
+interface PluginApi {
+    pluginConfig?: Record<string, unknown>;
+    logger: {
+        info: (msg: string) => void;
+        warn: (msg: string) => void;
+        error: (msg: string) => void;
+    };
+    runtime: PluginRuntime;
+    on: (hook: string, handler: (...args: unknown[]) => unknown, opts?: Record<string, unknown>) => void;
+    registerService: (service: {
+        id: string;
+        start: (ctx: ServiceCtx) => Promise<void>;
+        stop?: (ctx: ServiceCtx) => Promise<void>;
+    }) => void;
+    registerCommand: (cmd: {
+        name: string;
+        description: string;
+        acceptsArgs?: boolean;
+        handler: (ctx: CommandCtx) => Promise<{
+            text: string;
+        }>;
+    }) => void;
+    registerTool: (tool: Record<string, unknown>, opts?: Record<string, unknown>) => void;
+}
+interface ServiceCtx {
+    stateDir: string;
+    workspaceDir?: string;
+    logger: {
+        info: (msg: string) => void;
+        warn: (msg: string) => void;
+        error: (msg: string) => void;
+    };
+}
+interface CommandCtx {
+    args?: string;
+}
+export default function simmerAutoresearch(pluginApi: PluginApi): void;
+export {};
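
The `SpawnResult` interface above distinguishes four `termination` values. The sketch below shows one way a caller might fold them into a pass/fail/timeout outcome. It rests on assumptions: `classifyRun` is a hypothetical helper, and the runtime semantics of each `termination` value are inferred from their names, since the declaration file does not document them.

```javascript
// Hypothetical helper: reduces a SpawnResult-shaped object to one outcome.
function classifyRun(result) {
    // Both timeout flavours are treated alike (assumption).
    if (result.termination === "timeout" || result.termination === "no-output-timeout") {
        return "timeout";
    }
    // A normal exit passes only with exit code 0.
    if (result.termination === "exit" && result.code === 0) {
        return "passed";
    }
    // Non-zero exit codes and signal terminations count as failures.
    return "failed";
}

const examples = [
    { stdout: "ok", stderr: "", code: 0, signal: null, killed: false, termination: "exit" },
    { stdout: "", stderr: "boom", code: 1, signal: null, killed: false, termination: "exit" },
    { stdout: "", stderr: "", code: null, signal: "SIGTERM", killed: true, termination: "timeout" },
    { stdout: "", stderr: "", code: null, signal: "SIGKILL", killed: true, termination: "signal" },
];
for (const example of examples) {
    console.log(example.termination, "->", classifyRun(example));
}
```

Branching on `termination` rather than on `killed` keeps a signal sent from outside the runtime from being reported as a timeout.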

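Because `PluginApi` is a plain object of callbacks, a plugin written against it can be exercised without a host. The sketch below builds a hypothetical in-memory stand-in; `createFakePluginApi` and `toyPlugin` are illustrative names, and keying tools by `tool.name` is an assumption, since the declaration types a tool only as `Record<string, unknown>`.

```javascript
// Hypothetical test double for the PluginApi shape declared in index.d.ts.
function createFakePluginApi(cannedResult) {
    const tools = new Map();
    const commands = new Map();
    const services = [];
    const hooks = [];
    const logs = [];
    const logger = {
        info: (msg) => logs.push("info: " + msg),
        warn: (msg) => logs.push("warn: " + msg),
        error: (msg) => logs.push("error: " + msg),
    };
    const api = {
        pluginConfig: {},
        logger,
        runtime: {
            system: {
                // Never spawns a process; always resolves to the canned result.
                runCommandWithTimeout: async (_argv, _opts) => cannedResult,
            },
        },
        on: (hook, handler) => hooks.push({ hook, handler }),
        registerService: (service) => services.push(service),
        registerCommand: (cmd) => commands.set(cmd.name, cmd),
        registerTool: (tool) => tools.set(tool.name, tool),
    };
    return { api, tools, commands, services, hooks, logs };
}

// Toy plugin used only to exercise the fake; it is not the real entry point.
function toyPlugin(pluginApi) {
    pluginApi.registerTool({ name: "ping", description: "Replies with pong" }, { name: "ping" });
    pluginApi.registerCommand({
        name: "hello",
        description: "Says hello",
        handler: async () => ({ text: "hello" }),
    });
    pluginApi.logger.info("toy plugin registered");
}

const fake = createFakePluginApi({
    stdout: "", stderr: "", code: 0, signal: null, killed: false, termination: "exit",
});
toyPlugin(fake.api);
console.log([...fake.tools.keys()], [...fake.commands.keys()], fake.logs);
```

The same fake could be handed to the package's default export to inspect which tools, services, and hooks it registers, without running any shell command.
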
package/dist/index.js ADDED

@@ -0,0 +1,609 @@
+/**
+ * Simmer Autoresearch — OpenClaw Plugin
+ *
+ * Fork of pi-autoresearch adapted for trading skill optimization.
+ * Agent mutates skill code + config, runs experiments, measures P&L,
+ * keeps what works, discards what doesn't. Never stops.
+ *
+ * Original: https://github.com/davebcn87/pi-autoresearch
+ * License: MIT (Tobi Lutke + David Cortés)
+ */
+import * as fs from "node:fs";
+import * as path from "node:path";
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+function formatNum(value, unit) {
+    if (value === null)
+        return "—";
+    const u = unit || "";
+    if (value === Math.round(value))
+        return String(value) + u;
+    return value.toFixed(2) + u;
+}
+function isBetter(current, best, direction) {
+    return direction === "lower" ? current < best : current > best;
+}
+function currentResults(results, segment) {
+    return results.filter((r) => r.segment === segment);
+}
+function findBaselineMetric(results, segment) {
+    const cur = currentResults(results, segment);
+    return cur.length > 0 ? cur[0].metric : null;
+}
+function toolResult(text) {
+    return { content: [{ type: "text", text }] };
+}
+class SimmerApi {
+    apiKey;
+    apiUrl;
+    constructor(apiKey, apiUrl) {
+        this.apiKey = apiKey;
+        this.apiUrl = apiUrl;
+    }
+    async getOutcomes(skillSlug, since) {
+        try {
+            const url = `${this.apiUrl}/api/sdk/outcomes?skill=${encodeURIComponent(skillSlug)}&since=${encodeURIComponent(since)}`;
+            const resp = await fetch(url, {
+                headers: {
+                    Authorization: `Bearer ${this.apiKey}`,
+                    "Content-Type": "application/json",
+                },
+            });
+            if (!resp.ok)
+                return null;
+            const data = (await resp.json());
+            return {
+                trades: data.trades ?? 0,
+                pnl: data.pnl ?? 0,
+                wins: data.wins ?? 0,
+                losses: data.losses ?? 0,
+            };
+        }
+        catch {
+            return null;
+        }
+    }
+}
+// ---------------------------------------------------------------------------
+// State Reconstruction (from pi-autoresearch JSONL pattern)
+// ---------------------------------------------------------------------------
+function reconstructState(workspaceDir) {
+    const state = {
+        results: [],
+        bestMetric: null,
+        bestDirection: "higher",
+        metricName: "pnl",
+        metricUnit: "$",
+        secondaryMetrics: [],
+        name: null,
+        currentSegment: 0,
+    };
+    const jsonlPath = path.join(workspaceDir, "autoresearch.jsonl");
+    try {
+        if (fs.existsSync(jsonlPath)) {
+            let segment = 0;
+            const lines = fs
+                .readFileSync(jsonlPath, "utf-8")
+                .trim()
+                .split("\n")
+                .filter(Boolean);
+            for (const line of lines) {
+                try {
+                    const entry = JSON.parse(line);
+                    if (entry.type === "config") {
+                        if (entry.name)
+                            state.name = entry.name;
+                        if (entry.metricName)
+                            state.metricName = entry.metricName;
+                        if (entry.metricUnit !== undefined)
+                            state.metricUnit = entry.metricUnit;
+                        if (entry.bestDirection)
+                            state.bestDirection = entry.bestDirection;
+                        if (state.results.length > 0)
+                            segment++;
+                        state.currentSegment = segment;
+                        continue;
+                    }
+                    state.results.push({
+                        commit: entry.commit ?? "",
+                        metric: entry.metric ?? 0,
+                        metrics: entry.metrics ?? {},
+                        status: entry.status ?? "keep",
+                        description: entry.description ?? "",
+                        timestamp: entry.timestamp ?? 0,
+                        segment,
+                    });
+                    for (const name of Object.keys(entry.metrics ?? {})) {
+                        if (!state.secondaryMetrics.find((m) => m.name === name)) {
+                            let unit = "";
+                            if (name.includes("pnl") || name.includes("budget"))
+                                unit = "$";
+                            else if (name.includes("rate") || name.includes("pct"))
+                                unit = "%";
+                            state.secondaryMetrics.push({ name, unit });
+                        }
+                    }
+                }
+                catch {
+                    // Skip malformed lines
+                }
+            }
+            if (state.results.length > 0) {
+                state.bestMetric = findBaselineMetric(state.results, state.currentSegment);
+            }
+        }
+    }
+    catch {
+        // Fresh state
+    }
+    return state;
+}
+// ---------------------------------------------------------------------------
+// Plugin Entry Point
+// ---------------------------------------------------------------------------
+export default function simmerAutoresearch(pluginApi) {
+    const pluginConfig = pluginApi.pluginConfig ?? {};
+    const apiKey = pluginConfig.apiKey || process.env.SIMMER_API_KEY || "";
+    const apiUrl = pluginConfig.apiUrl ||
+        process.env.SIMMER_API_URL ||
+        "https://api.simmer.markets";
+    const simmer = new SimmerApi(apiKey, apiUrl);
+    let state = {
+        results: [],
+        bestMetric: null,
+        bestDirection: "higher",
+        metricName: "pnl",
+        metricUnit: "$",
+        secondaryMetrics: [],
+        name: null,
+        currentSegment: 0,
+    };
+    let resolvedWorkspaceDir = "";
+    // --- Service: reconstruct state on start ---
+    pluginApi.registerService({
+        id: "simmer-autoresearch",
+        async start(ctx) {
+            resolvedWorkspaceDir = ctx.workspaceDir || process.cwd();
+            state = reconstructState(resolvedWorkspaceDir);
+            if (state.results.length > 0) {
+                ctx.logger.info(`[autoresearch] Restored ${state.results.length} experiments from JSONL (segment ${state.currentSegment})`);
+            }
+            ctx.logger.info("[autoresearch] Ready. Use /autoresearch <skill> to begin.");
+        },
+    });
+    // --- Inject context into LLM prompt ---
+    pluginApi.on("before_prompt_build", async () => {
+        const dir = resolvedWorkspaceDir;
+        if (!dir)
+            return { prependContext: "" };
+        const mdPath = path.join(dir, "autoresearch.md");
+        const ideasPath = path.join(dir, "autoresearch.ideas.md");
+        if (!fs.existsSync(mdPath))
+            return { prependContext: "" };
+        let context = "\n\n## Autoresearch Mode (ACTIVE)\n" +
+            "You are in autoresearch mode. Optimize trading skill performance through an autonomous experiment loop.\n" +
+            "Use init_experiment, run_experiment, and log_experiment tools. NEVER STOP until interrupted.\n" +
+            `Experiment rules: ${mdPath} — read this file at the start of every session and after compaction.\n` +
+            "Write promising but deferred optimizations as bullet points to autoresearch.ideas.md.\n" +
+            "If the user sends a follow-on message while an experiment is running, finish the current run_experiment + log_experiment cycle first.\n";
+        if (fs.existsSync(ideasPath)) {
+            context += `\n💡 Ideas backlog exists at ${ideasPath} — check it for promising experiment paths. Prune stale entries.\n`;
+        }
+        if (state.results.length > 0) {
+            const cur = currentResults(state.results, state.currentSegment);
+            const kept = cur.filter((r) => r.status === "keep").length;
+            const crashed = cur.filter((r) => r.status === "crash").length;
+            const discarded = cur.filter((r) => r.status === "discard").length;
+            let bestPrimary = null;
+            for (const r of cur) {
+                if (r.status === "keep" &&
+                    r.metric !== 0 &&
+                    (bestPrimary === null ||
+                        isBetter(r.metric, bestPrimary, state.bestDirection))) {
+                    bestPrimary = r.metric;
+                }
+            }
+            context += `\n### Experiment Progress\n`;
+            context += `- ${cur.length} experiments: ${kept} kept, ${discarded} discarded, ${crashed} crashed\n`;
+            context += `- Baseline ${state.metricName}: ${formatNum(state.bestMetric, state.metricUnit)}\n`;
+            if (bestPrimary !== null) {
+                context += `- Best ${state.metricName}: ${formatNum(bestPrimary, state.metricUnit)}\n`;
+                if (state.bestMetric !== null && state.bestMetric !== 0) {
+                    const pct = ((bestPrimary - state.bestMetric) / Math.abs(state.bestMetric)) *
+                        100;
+                    context += `- Improvement: ${pct > 0 ? "+" : ""}${pct.toFixed(1)}%\n`;
+                }
+            }
+            const recent = cur.slice(-5);
+            if (recent.length > 0) {
+                context += `\nRecent experiments:\n`;
+                for (const r of recent) {
+                    const icon = r.status === "keep" ? "✓" : r.status === "crash" ? "✗" : "–";
+                    context += `  ${icon} ${r.description} → ${state.metricName}: ${formatNum(r.metric, state.metricUnit)} (${r.status})\n`;
+                }
+            }
+        }
+        return { prependContext: context };
+    });
+    // -----------------------------------------------------------------------
+    // init_experiment tool
+    // -----------------------------------------------------------------------
+    pluginApi.registerTool({
+        name: "init_experiment",
+        description: "Initialize the experiment session. Call once before the first run_experiment to set the name, primary metric, unit, and direction. Writes config to autoresearch.jsonl.",
+        parameters: {
+            type: "object",
+            properties: {
+                name: {
+                    type: "string",
+                    description: 'Human-readable name (e.g. "Optimizing polymarket-ai-divergence for P&L")',
+                },
+                metric_name: {
+                    type: "string",
+                    description: 'Primary metric name (e.g. "pnl", "trades", "sharpe")',
+                },
+                metric_unit: {
+                    type: "string",
+                    description: 'Unit (e.g. "$", "%", "")',
+                },
+                direction: {
+                    type: "string",
+                    enum: ["lower", "higher"],
+                    description: 'Whether "lower" or "higher" is better. Default: "higher"',
+                },
+            },
+            required: ["name", "metric_name"],
+        },
+        async execute(_toolCallId, params) {
+            const dir = resolvedWorkspaceDir;
+            if (!dir)
+                return toolResult("❌ No workspace directory. Start the service first.");
+            const isReinit = state.results.length > 0;
+            state.name = params.name;
+            state.metricName = params.metric_name;
+            state.metricUnit = params.metric_unit ?? "$";
+            if (params.direction === "lower" || params.direction === "higher") {
+                state.bestDirection = params.direction;
+            }
+            if (isReinit) {
+                state.currentSegment++;
+            }
+            state.bestMetric = null;
+            state.secondaryMetrics = [];
+            try {
+                const jsonlPath = path.join(dir, "autoresearch.jsonl");
+                const configLine = JSON.stringify({
+                    type: "config",
+                    name: state.name,
+                    metricName: state.metricName,
+                    metricUnit: state.metricUnit,
+                    bestDirection: state.bestDirection,
+                }) + "\n";
+                if (isReinit) {
+                    fs.appendFileSync(jsonlPath, configLine);
+                }
+                else {
+                    fs.writeFileSync(jsonlPath, configLine);
+                }
+            }
+            catch (e) {
+                return toolResult(`⚠️ Failed to write autoresearch.jsonl: ${e instanceof Error ? e.message : String(e)}`);
+            }
+            const reinitNote = isReinit
+                ? " (re-initialized — previous results archived, new baseline needed)"
+                : "";
+            return toolResult(`✅ Experiment initialized: "${state.name}"${reinitNote}\n` +
+                `Metric: ${state.metricName} (${state.metricUnit || "unitless"}, ${state.bestDirection} is better)\n` +
+                `Config written to autoresearch.jsonl. Now run the baseline with run_experiment.`);
+        },
+    }, { name: "init_experiment" });
+    // -----------------------------------------------------------------------
+    // run_experiment tool
+    // -----------------------------------------------------------------------
+    pluginApi.registerTool({
+        name: "run_experiment",
+        description: "Run a shell command as an experiment. Times execution, captures output, detects pass/fail. Use for running skill scripts, tests, or benchmarks.",
+        parameters: {
+            type: "object",
+            properties: {
+                command: {
+                    type: "string",
+                    description: "Shell command to run (e.g. 'python3 skills/polymarket-ai-divergence/ai_divergence.py --live')",
+                },
+                timeout_seconds: {
+                    type: "number",
+                    description: "Kill after this many seconds (default: 600)",
+                },
+            },
+            required: ["command"],
+        },
+        async execute(_toolCallId, params) {
+            const dir = resolvedWorkspaceDir;
+            if (!dir)
+                return toolResult("❌ No workspace directory.");
+            const timeout = (params.timeout_seconds ?? 600) * 1000;
+            const t0 = Date.now();
+            let result;
+            try {
+                result = await pluginApi.runtime.system.runCommandWithTimeout(["bash", "-c", params.command], { timeoutMs: timeout, cwd: dir });
+            }
+            catch (e) {
+                return toolResult(`💥 FAILED to execute: ${e instanceof Error ? e.message : String(e)}`);
+            }
+            const durationSeconds = (Date.now() - t0) / 1000;
+            const output = (result.stdout + "\n" + result.stderr).trim();
+            const passed = result.code === 0 && !result.killed;
+            const timedOut = result.killed || result.termination === "timeout";
+            let text = "";
+            if (timedOut) {
+                text += `⏰ TIMEOUT after ${durationSeconds.toFixed(1)}s\n`;
+            }
+            else if (!passed) {
+                text += `💥 FAILED (exit code ${result.code}) in ${durationSeconds.toFixed(1)}s\n`;
+            }
+            else {
+                text += `✅ PASSED in ${durationSeconds.toFixed(1)}s\n`;
+            }
+            if (state.bestMetric !== null) {
+                text += `📊 Baseline ${state.metricName}: ${formatNum(state.bestMetric, state.metricUnit)}\n`;
+            }
+            const tail = output.split("\n").slice(-80).join("\n");
+            text += `\nLast 80 lines of output:\n${tail}`;
+            return toolResult(text);
+        },
+    }, { name: "run_experiment" });
+    // -----------------------------------------------------------------------
+    // log_experiment tool
+    // -----------------------------------------------------------------------
+    pluginApi.registerTool({
+        name: "log_experiment",
+        description: 'Record an experiment result. "keep" auto-commits via git. "discard"/"crash" → revert with git checkout. Call after every run_experiment.',
+        parameters: {
+            type: "object",
+            properties: {
+                commit: {
+                    type: "string",
+                    description: "Git commit hash (short, 7 chars)",
+                },
+                metric: {
+                    type: "number",
+                    description: "Primary metric value (e.g. P&L in dollars). 0 for crashes.",
+                },
+                status: {
+                    type: "string",
+                    enum: ["keep", "discard", "crash"],
+                    description: "keep if improved, discard if worse, crash if failed",
+                },
+                description: {
+                    type: "string",
+                    description: "Short description of what this experiment tried",
+                },
+                metrics: {
+                    type: "object",
+                    description: 'Secondary metrics as { name: value } (e.g. { "trades": 5, "win_rate": 0.6 })',
+                },
+                force: {
+                    type: "boolean",
+                    description: "Set true to allow adding a new secondary metric not previously tracked",
+                },
+            },
+            required: ["commit", "metric", "status", "description"],
+        },
+        async execute(_toolCallId, params) {
+            const dir = resolvedWorkspaceDir;
+            if (!dir)
+                return toolResult("❌ No workspace directory.");
+            const secondaryMetrics = params.metrics ?? {};
+            const force = params.force ?? false;
+            // Validate secondary metrics consistency
+            if (state.secondaryMetrics.length > 0) {
+                const knownNames = new Set(state.secondaryMetrics.map((m) => m.name));
+                const providedNames = new Set(Object.keys(secondaryMetrics));
+                const missing = [...knownNames].filter((n) => !providedNames.has(n));
+                if (missing.length > 0) {
+                    return toolResult(`❌ Missing secondary metrics: ${missing.join(", ")}\n` +
+                        `Expected: ${[...knownNames].join(", ")}\n` +
+                        `Got: ${[...providedNames].join(", ") || "(none)"}\n` +
+                        `Fix: include ${missing.map((m) => `"${m}": <value>`).join(", ")} in metrics.`);
+                }
+                const newMetrics = [...providedNames].filter((n) => !knownNames.has(n));
+                if (newMetrics.length > 0 && !force) {
+                    return toolResult(`❌ New secondary metric(s) not previously tracked: ${newMetrics.join(", ")}\n` +
+                        `Existing: ${[...knownNames].join(", ")}\n` +
+                        `Call again with force: true to add, or remove from metrics.`);
+                }
+            }
+            const experiment = {
+                commit: params.commit.slice(0, 7),
+                metric: params.metric,
+                metrics: secondaryMetrics,
+                status: params.status,
+                description: params.description,
+                timestamp: Date.now(),
+                segment: state.currentSegment,
+            };
+            state.results.push(experiment);
+            // Register new secondary metrics
+            for (const name of Object.keys(secondaryMetrics)) {
+                if (!state.secondaryMetrics.find((m) => m.name === name)) {
+                    let unit = "";
+                    if (name.includes("pnl") || name.includes("budget"))
+                        unit = "$";
+                    else if (name.includes("rate") || name.includes("pct"))
+                        unit = "%";
+                    state.secondaryMetrics.push({ name, unit });
+                }
+            }
+            state.bestMetric = findBaselineMetric(state.results, state.currentSegment);
+            const curCount = currentResults(state.results, state.currentSegment).length;
+            let text = `Logged #${state.results.length}: ${experiment.status} — ${experiment.description}`;
+            if (state.bestMetric !== null) {
+                text += `\nBaseline ${state.metricName}: ${formatNum(state.bestMetric, state.metricUnit)}`;
+                if (curCount > 1 && params.status === "keep" && params.metric !== 0) {
+                    const delta = params.metric - state.bestMetric;
+                    const pct = state.bestMetric !== 0
+                        ? ((delta / Math.abs(state.bestMetric)) * 100).toFixed(1)
+                        : "∞";
+                    const sign = delta > 0 ? "+" : "";
+                    text += ` | this: ${formatNum(params.metric, state.metricUnit)} (${sign}${pct}%)`;
+                }
+            }
+            if (Object.keys(secondaryMetrics).length > 0) {
+                const parts = [];
+                for (const [name, value] of Object.entries(secondaryMetrics)) {
+                    const def = state.secondaryMetrics.find((m) => m.name === name);
+                    parts.push(`${name}: ${formatNum(value, def?.unit ?? "")}`);
+                }
+                text += `\nSecondary: ${parts.join("  ")}`;
+            }
+            text += `\n(${state.results.length} experiments total)`;
+            // Auto-commit on keep
+            if (params.status === "keep") {
+                try {
+                    const resultData = {
+                        status: params.status,
+                        [state.metricName || "metric"]: params.metric,
+                        ...secondaryMetrics,
+                    };
+                    const trailerJson = JSON.stringify(resultData);
+                    const commitMsg = `${params.description}\n\nResult: ${trailerJson}`;
+                    const gitResult = await pluginApi.runtime.system.runCommandWithTimeout([
+                        "bash",
+                        "-c",
+                        `git add -A && git diff --cached --quiet && echo "NOTHING_TO_COMMIT" || git commit -m '${commitMsg.replace(/'/g, "'\\''")}'`, // single-quoted for bash so newlines, "$", and backticks in the message stay literal
+                    ], { timeoutMs: 10000, cwd: dir });
+                    const gitOutput = (gitResult.stdout + gitResult.stderr).trim();
+                    if (gitOutput.includes("NOTHING_TO_COMMIT")) {
+                        text += `\n📝 Git: nothing to commit`;
+                    }
+                    else if (gitResult.code === 0) {
+                        const firstLine = gitOutput.split("\n")[0] || "";
+                        text += `\n📝 Git: committed — ${firstLine}`;
+                        try {
+                            const shaResult = await pluginApi.runtime.system.runCommandWithTimeout(["git", "rev-parse", "--short=7", "HEAD"], { timeoutMs: 5000, cwd: dir });
+                            const newSha = shaResult.stdout.trim();
+                            if (newSha && newSha.length >= 7) {
+                                experiment.commit = newSha;
+                            }
+                        }
+                        catch {
+                            // Keep original
+                        }
+                    }
+                    else {
+                        text += `\n⚠️ Git commit failed: ${gitOutput.slice(0, 200)}`;
+                    }
+                }
+                catch (e) {
+                    text += `\n⚠️ Git error: ${e instanceof Error ? e.message : String(e)}`;
+                }
+            }
+            else {
+                text += `\n📝 Git: skipped commit (${params.status}) — revert with git checkout -- .`;
+            }
+            // Persist to JSONL after git (so commit hash is correct)
+            try {
+                const jsonlPath = path.join(dir, "autoresearch.jsonl");
+                fs.appendFileSync(jsonlPath, JSON.stringify({ run: state.results.length, ...experiment }) +
+                    "\n");
+            }
+            catch {
+                // Don't fail if write fails
+            }
+            return toolResult(text);
+        },
+    }, { name: "log_experiment" });
+    // -----------------------------------------------------------------------
+    // /autoresearch command
+    // -----------------------------------------------------------------------
+    pluginApi.registerCommand({
+        name: "autoresearch",
+        description: "Start or resume autoresearch mode for a skill",
+        acceptsArgs: true,
+        async handler(ctx) {
+            const dir = resolvedWorkspaceDir;
+            if (!dir)
+                return { text: "❌ No workspace directory." };
+            const args = ctx.args?.trim() ?? "";
+            const mdPath = path.join(dir, "autoresearch.md");
+            const hasRules = fs.existsSync(mdPath);
+            if (args === "off") {
+                return { text: "Autoresearch mode OFF." };
+            }
+            if (args === "status") {
+                if (state.results.length === 0) {
+                    return {
+                        text: "No experiments yet. Run /autoresearch <skill> to start.",
+                    };
+                }
+                const cur = currentResults(state.results, state.currentSegment);
+                const kept = cur.filter((r) => r.status === "keep").length;
+                const crashed = cur.filter((r) => r.status === "crash").length;
+                const discarded = cur.filter((r) => r.status === "discard").length;
+                let bestPrimary = null;
+                for (const r of cur) {
+                    if (r.status === "keep" &&
+                        r.metric !== 0 &&
+                        (bestPrimary === null ||
+                            isBetter(r.metric, bestPrimary, state.bestDirection))) {
+                        bestPrimary = r.metric;
+                    }
+                }
+                let text = `🔬 Autoresearch: ${state.name ?? "unnamed"}\n`;
+                text += `${cur.length} experiments: ${kept} kept, ${discarded} discarded, ${crashed} crashed\n`;
+                text += `Baseline ${state.metricName}: ${formatNum(state.bestMetric, state.metricUnit)}\n`;
+                if (bestPrimary !== null) {
+                    text += `Best ${state.metricName}: ${formatNum(bestPrimary, state.metricUnit)}`;
+                    if (state.bestMetric !== null && state.bestMetric !== 0) {
+                        const pct = ((bestPrimary - state.bestMetric) /
+                            Math.abs(state.bestMetric)) *
+                            100;
+                        text += ` (${pct > 0 ? "+" : ""}${pct.toFixed(1)}%)`;
+                    }
+                    text += "\n";
+                }
+                const recent = cur.slice(-5);
+                text += "\nRecent:\n";
+                for (const r of recent) {
+                    const icon = r.status === "keep" ? "✓" : r.status === "crash" ? "✗" : "–";
+                    text += `  ${icon} ${r.description} → ${state.metricName}: ${formatNum(r.metric, state.metricUnit)} (${r.status})\n`;
+                }
+                return { text };
+            }
+            if (args === "reset") {
+                state = {
+                    results: [],
+                    bestMetric: null,
+                    bestDirection: "higher",
+                    metricName: "pnl",
+                    metricUnit: "$",
+                    secondaryMetrics: [],
+                    name: null,
+                    currentSegment: 0,
+                };
+                // Clear JSONL file
+                try {
+                    const jsonlPath = path.join(dir, "autoresearch.jsonl");
+                    if (fs.existsSync(jsonlPath))
+                        fs.unlinkSync(jsonlPath);
+                }
+                catch { /* ignore */ }
+                return { text: "Experiment history cleared. Ready to start fresh." };
+            }
+            if (hasRules) {
+                return {
+                    text: args
+                        ? `Autoresearch mode active. ${args}\nRead autoresearch.md for experiment rules, then resume the loop.`
+                        : "Autoresearch mode active. Read autoresearch.md and autoresearch.sh, then resume the experiment loop.",
+                };
+            }
+            return {
+                text: args
+                    ? `Start autoresearch: ${args}\nNo autoresearch.md found — read the skill source code, set up autoresearch.md and autoresearch.sh, then start the experiment loop.`
+                    : "Start autoresearch. No autoresearch.md found — specify a skill slug (e.g. /autoresearch polymarket-ai-divergence).",
+            };
+        },
+    });
+    pluginApi.logger.info("[simmer-autoresearch] Plugin loaded");
+}
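
The `status` branch above counts kept, discarded, and crashed runs and picks the best kept metric. The same summary can be rebuilt offline from `autoresearch.jsonl`. What follows is a minimal sketch, not the plugin's code: `summarizeJsonl` is a hypothetical name, `isBetter` is a guess at what the plugin's helper of the same name does (its definition is outside this diff), and the sample records are made up for illustration. Only the `run`, `status`, `metric`, and `description` fields are taken from the code above.

```javascript
// Guess at the plugin's `isBetter` helper (its real definition is not shown here).
function isBetter(candidate, incumbent, direction) {
  return direction === "lower" ? candidate < incumbent : candidate > incumbent;
}

// Hypothetical helper: summarize the text of an autoresearch.jsonl file.
function summarizeJsonl(jsonlText, direction = "higher") {
  const records = [];
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    try {
      const parsed = JSON.parse(line);
      if (parsed !== null && typeof parsed === "object") records.push(parsed);
    } catch {
      // Skip malformed lines instead of failing the whole summary.
    }
  }
  const counts = { keep: 0, discard: 0, crash: 0 };
  let best = null;
  for (const r of records) {
    if (Object.hasOwn(counts, r.status)) counts[r.status] += 1;
    // Same rule as the status command: only kept, non-zero metrics compete.
    if (r.status === "keep" && r.metric !== 0 &&
        (best === null || isBetter(r.metric, best, direction))) {
      best = r.metric;
    }
  }
  return { total: records.length, ...counts, best };
}

// Made-up sample records, one JSON object per line.
const sample = [
  JSON.stringify({ run: 1, status: "keep", metric: 12.5, description: "baseline" }),
  JSON.stringify({ run: 2, status: "discard", metric: 9, description: "wider spread filter" }),
  JSON.stringify({ run: 3, status: "crash", metric: 0, description: "bad config" }),
  JSON.stringify({ run: 4, status: "keep", metric: 15.25, description: "tighter entry threshold" }),
].join("\n");

console.log(summarizeJsonl(sample));
```

Unlike the command, this sketch ignores segments (`currentResults` and `state.currentSegment`), so it summarizes every record in the file.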

package/openclaw.plugin.json ADDED

@@ -0,0 +1,29 @@
+{
+  "id": "simmer-autoresearch",
+  "name": "Simmer Autoresearch",
+  "description": "Autonomous skill optimization — agent mutates skill code + config, measures P&L, keeps what works",
+  "version": "0.1.0",
+  "configSchema": {
+    "type": "object",
+    "additionalProperties": false,
+    "properties": {
+      "apiKey": {
+        "type": "string",
+        "description": "Simmer API key (sk_live_...)"
+      },
+      "apiUrl": {
+        "type": "string",
+        "default": "https://api.simmer.markets"
+      }
+    },
+    "required": ["apiKey"]
+  },
+  "uiHints": {
+    "apiKey": {
+      "label": "Simmer API Key",
+      "sensitive": true,
+      "placeholder": "sk_live_...",
+      "help": "Get from simmer.markets/dashboard"
+    }
+  }
+}
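
The `configSchema` above requires a string `apiKey`, allows an optional string `apiUrl` with a default, and rejects any other property. A hand-rolled check with the same rules might look like the sketch below. It is an illustration only: `resolveConfig` is a hypothetical name, and how the host actually validates the config or applies the `default` is not shown in this diff.

```javascript
// Default taken from the schema's "apiUrl" property.
const DEFAULT_API_URL = "https://api.simmer.markets";

// Hypothetical helper mirroring the schema rules; not the host's own validator.
function resolveConfig(raw) {
  if (raw === null || typeof raw !== "object" || Array.isArray(raw)) {
    throw new TypeError("config must be an object");
  }
  // "additionalProperties": false -> reject unknown keys.
  const allowed = new Set(["apiKey", "apiUrl"]);
  for (const key of Object.keys(raw)) {
    if (!allowed.has(key)) throw new Error(`unknown config property: ${key}`);
  }
  // "required": ["apiKey"], "type": "string"
  if (typeof raw.apiKey !== "string") {
    throw new Error("apiKey is required and must be a string");
  }
  if (raw.apiUrl !== undefined && typeof raw.apiUrl !== "string") {
    throw new Error("apiUrl must be a string");
  }
  // Fall back to the schema default when apiUrl is omitted.
  return { apiKey: raw.apiKey, apiUrl: raw.apiUrl ?? DEFAULT_API_URL };
}

// Placeholder key for illustration, not a real credential.
console.log(resolveConfig({ apiKey: "sk_live_example" }));
```

In JSON Schema, `default` is an annotation and validators are not required to apply it, so the sketch fills it in explicitly.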

package/package.json ADDED

@@ -0,0 +1,34 @@
+{
+  "name": "simmer-autoresearch",
+  "version": "0.1.0",
+  "description": "Autonomous skill optimization for Simmer — fork of pi-autoresearch adapted for trading",
+  "main": "dist/index.js",
+  "types": "dist/index.d.ts",
+  "files": [
+    "dist/",
+    "openclaw.plugin.json",
+    "SKILL.md"
+  ],
+  "scripts": {
+    "build": "tsc",
+    "dev": "tsc --watch"
+  },
+  "openclaw": {
+    "extensions": [
+      "./dist/index.js"
+    ]
+  },
+  "keywords": [
+    "openclaw",
+    "plugin",
+    "simmer",
+    "autoresearch",
+    "prediction-markets",
+    "trading"
+  ],
+  "license": "MIT",
+  "devDependencies": {
+    "@types/node": "^25.5.0",
+    "typescript": "^5.4.0"
+  }
+}