npm - @tiens.nguyen/gonext-local-worker - Versions diffs - 1.0.50 → 1.0.51 - Mend

@tiens.nguyen/gonext-local-worker 1.0.50 → 1.0.51


Files changed (4)

package/README.md CHANGED

@@ -2,6 +2,32 @@
 Run:
   GONEXT_API_BASE=... GONEXT_WORKER_KEY=... npx -y --package @tiens.nguyen/gonext-local-worker gonext-local-worker
+## Agent chat mode
+Select **Agent** (instead of Chat) under the composer in the web app. Your
+free-form prompt is handed to a [smolagents](https://github.com/huggingface/smolagents)
+agent running on your local MLX/Ollama model. The agent can call tools (v1:
+`http_request`) and streams its thinking steps + final answer directly into the
+chat thread.
+Requires smolagents in the worker's Python environment:
+```sh
+pip install smolagents certifi
+```
+- `certifi` supplies a trusted CA bundle so the agent's `http_request` tool can
+  verify HTTPS certificates on macOS (where Python's default bundle may be
+  missing). The worker falls back to the system bundle if certifi is absent.
+- Agent mode is blocked for cloud models (the API returns 400). Select a local
+  MLX or Ollama model first.
+- Tool steps appear in the collapsible reasoning (`<think>`) area; the final
+  answer is the message body — no new UI needed.
+The agent script is `gonext_agent_chat.py` (reads `{messages, agentBaseURL,
+agentApiKey, agentModelId, tools, maxSteps}` on stdin; emits NDJSON
+`{"type":"step"/"final","text":"..."}` lines on stdout).
 ## API Check / HTTP probe (Tools & Agents modes)
 The worker can run Postman-style HTTP probes queued from the web app
@@ -12,23 +38,15 @@ network_error).
 - **Tools (`tool_only`)** — no extra setup. The selected local model writes a
   one-line health summary of the measured result.
-- **Agents (`agentic`)** — a [smolagents](https://github.com/huggingface/smolagents)
-  agent (running on the selected local model) produces the summary. Install it
-  in the worker's Python environment:
-  ```sh
-  pip install smolagents
-  ```
+- **Agents (`agentic`)** — a smolagents agent (running on the selected local
+  model) produces the summary. Requires `pip install smolagents`.
   The agent talks to your local MLX OpenAI-compatible server (no cloud calls).
   The agent only summarizes; the worker's measurement stays the source of truth,
   so if smolagents or the model is unavailable the probe still returns the
   measured result with a note.
-### Probe-related env
+### Env vars
-  GONEXT_PROBE_PYTHON   Python executable for the smolagents agent
+  GONEXT_PROBE_PYTHON   Python executable for smolagents scripts
                         (default: GONEXT_MLX_LM_PYTHON or python3)
-The agent script lives next to the worker as `gonext_probe_agent.py` (reads a
-JSON probe config on stdin, writes a JSON summary on stdout).
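
Review note on the `certifi` bullet in the new "Agent chat mode" section: the described behavior (prefer certifi's CA bundle, fall back to the system bundle when certifi is not installed) can be sketched as below. This is only an illustration of that fallback, not the worker's actual code; the function name `build_ssl_context` is hypothetical.

```python
import ssl


def build_ssl_context() -> ssl.SSLContext:
    """Return a verifying TLS context, preferring certifi's CA bundle.

    Hypothetical helper: the real http_request tool may wire this differently.
    """
    try:
        import certifi  # optional dependency, installed via `pip install certifi`
    except ImportError:
        # certifi is absent: use whatever trust store the system provides.
        return ssl.create_default_context()
    # certifi is present: point the context at its bundled CA file.
    return ssl.create_default_context(cafile=certifi.where())


ctx = build_ssl_context()
```

Either branch returns a context that still verifies certificates and hostnames, so the fallback changes only where the trusted roots come from.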

package/gonext-local-worker.mjs CHANGED

@@ -1216,6 +1216,16 @@ async function runAgentChatJob(job) {
     }
   };
+  console.log(
+    `[gonext-worker] agent_chat ${jobId} baseURL=${payload?.agentBaseURL ?? "(none)"} modelId=${payload?.agentModelId ?? "(none)"}`
+  );
+  // Send an immediate heartbeat so the web 60-180s no-progress timer doesn't
+  // fire while the local model is loading/generating its first reasoning step.
+  enqueueText("<think>Agent starting…\n");
+  flushTail = flushTail.then(() => flushChunks()).catch((err) => {
+    console.error("[gonext-worker] agent_chat heartbeat flush error:", err);
+  });
   try {
     const python =
       (process.env.GONEXT_PROBE_PYTHON ?? process.env.GONEXT_MLX_LM_PYTHON ?? "")
@@ -1231,7 +1241,7 @@ async function runAgentChatJob(job) {
     });
     const timeoutMs = 300_000; // 5 min max for an agent run
-    let inThink = false;
+    let inThink = true; // already opened <think> above
     let finalText = "";
     await runProcessWithStreamingStdout(python, [scriptPath], input, timeoutMs, (event) => {
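
Review note on the `inThink` change: the streaming callback itself is not part of this hunk, so the following is a guess at why the initial value had to flip. The sketch assumes that `step` events are appended inside a `<think>` block (opening one only if none is open) and that the `final` event closes the block before the answer text. It is written in Python for illustration, although the worker is JavaScript, and the class name `ThinkStream` is hypothetical.

```python
class ThinkStream:
    """Toy model of routing agent events into a <think> block plus final answer."""

    def __init__(self):
        # The heartbeat has already opened <think>, so the state starts as True.
        self.chunks = ["<think>Agent starting…\n"]
        self.in_think = True
        self.final_text = ""

    def on_event(self, event):
        kind = event.get("type")
        text = event.get("text", "")
        if kind == "step":
            if not self.in_think:
                # Only open a new block when none is currently open.
                self.chunks.append("<think>")
                self.in_think = True
            self.chunks.append(text + "\n")
        elif kind == "final":
            if self.in_think:
                self.chunks.append("</think>")
                self.in_think = False
            self.final_text = text
            self.chunks.append(text)

    def rendered(self):
        return "".join(self.chunks)


stream = ThinkStream()
stream.on_event({"type": "step", "text": "Calling http_request"})
stream.on_event({"type": "final", "text": "Done."})
rendered = stream.rendered()
```

Under these assumptions, starting with the flag set to false after the heartbeat would make the first `step` event emit a second opening tag, which is the mismatch the one-line change avoids.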

package/gonext_agent_chat.py CHANGED

@@ -148,6 +148,9 @@ def run_agent_chat(cfg):
             max_steps=max_steps,
             step_callbacks=[step_callback],
         )
+        # Emit before agent.run() so the web no-progress timer resets while the
+        # model loads its weights and generates its first reasoning step.
+        _emit({"type": "step", "text": f"Sending task to {agent_model_id}…"})
         with contextlib.redirect_stdout(sys.stderr):
             result = agent.run(task_text)
         final_text = str(result).strip()
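
Review note on the early `_emit` call: the body of `_emit` is not shown in this diff. A minimal sketch of an NDJSON emitter compatible with the README's `{"type":"step"/"final","text":"..."}` description might look like the following. The name `emit`, the `stream` parameter, the module-level `_PROTOCOL_OUT` and the model id `demo-model` are all assumptions made for the example.

```python
import contextlib
import io
import json
import sys

# Captured once at import time, before any redirect_stdout is active.
_PROTOCOL_OUT = sys.stdout


def emit(event, stream=None):
    """Write one event as a single JSON line and flush immediately."""
    out = stream if stream is not None else _PROTOCOL_OUT
    out.write(json.dumps(event, ensure_ascii=False) + "\n")
    out.flush()  # flush so the parent process sees the line without delay


# Usage: collect events in a buffer instead of real stdout.
buf = io.StringIO()
emit({"type": "step", "text": "Sending task to demo-model"}, stream=buf)
with contextlib.redirect_stdout(sys.stderr):
    print("library noise")  # goes to stderr, not into the protocol stream
    emit({"type": "final", "text": "done"}, stream=buf)

events = [json.loads(line) for line in buf.getvalue().splitlines()]
```

Flushing after every line matters for the heartbeat: without it, the first `step` event could sit in a pipe buffer until the process exits, and the web no-progress timer would not reset.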

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@tiens.nguyen/gonext-local-worker",
-  "version": "1.0.50",
+  "version": "1.0.51",
   "description": "Polls GoNext cloud API for async local LLM jobs and runs them against Ollama/OpenAI-compatible servers on this Mac",
   "type": "module",
   "license": "MIT",