@sesamespace/hivemind 0.5.17 → 0.6.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/TOOL-USE-DESIGN.md +173 -0
- package/dist/{chunk-OPOXV53N.js → chunk-7YHRRM5B.js} +622 -218
- package/dist/chunk-7YHRRM5B.js.map +1 -0
- package/dist/{chunk-KTOAREXT.js → chunk-D3P3TJX4.js} +2 -2
- package/dist/{chunk-OTEMHDRU.js → chunk-EL4FSJBK.js} +2 -2
- package/dist/{chunk-Z2FXPFKE.js → chunk-F4C7TIEX.js} +2 -2
- package/dist/{chunk-2PCF2ADI.js → chunk-HCA2NYWH.js} +3 -3
- package/dist/commands/fleet.js +3 -3
- package/dist/commands/start.js +3 -3
- package/dist/commands/watchdog.js +3 -3
- package/dist/index.js +2 -2
- package/dist/main.js +5 -5
- package/dist/start.js +1 -1
- package/package.json +1 -1
- package/dist/chunk-OPOXV53N.js.map +0 -1
- /package/dist/{chunk-KTOAREXT.js.map → chunk-D3P3TJX4.js.map} +0 -0
- /package/dist/{chunk-OTEMHDRU.js.map → chunk-EL4FSJBK.js.map} +0 -0
- /package/dist/{chunk-Z2FXPFKE.js.map → chunk-F4C7TIEX.js.map} +0 -0
- /package/dist/{chunk-2PCF2ADI.js.map → chunk-HCA2NYWH.js.map} +0 -0
@@ -0,0 +1,173 @@
# Hivemind Tool Use — Architecture Design

## Current State

The LLM client does simple chat completions: `messages[] → response.content`. No tool/function calling.

## Goal

Full agentic tool-use loop matching OpenClaw capabilities, with Hivemind's memory system as a differentiator.

## Architecture

### 1. Tool Calling Protocol (OpenAI-compatible, works with OpenRouter)

The OpenAI chat completions API supports `tools` (function definitions) and `tool_choice`. When the model wants to use a tool, it returns a `tool_calls` array instead of (or alongside) content. We then execute the tool, append the result as a `tool` role message, and call the model again.
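
For illustration, here is roughly what the wire format looks like. The model id, prompt, and tool schema below are placeholders, not part of this design:

```typescript
// Request body for an OpenAI-compatible chat completion with tools.
const request = {
  model: "openai/gpt-4o", // any OpenRouter model id (placeholder)
  messages: [{ role: "user", content: "What changed in this repo?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "shell",
        description: "Run a shell command and return stdout/stderr",
        parameters: {
          type: "object",
          properties: { command: { type: "string" } },
          required: ["command"],
        },
      },
    },
  ],
  tool_choice: "auto",
};

// When the model decides to call a tool, the assistant message looks like:
const assistantTurn = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_1",
      type: "function",
      // Note: arguments arrive as a JSON *string*, not an object.
      function: { name: "shell", arguments: '{"command":"git status"}' },
    },
  ],
};
```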

```
User message
    ↓
LLM (with tools defined)
    ↓
If tool_calls   → execute tools → append results → call LLM again (loop)
If content only → return response
```

This is a **while loop**, not a single call. The model may chain multiple tool calls before producing a final text response.

### 2. Key Data Structures

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  parameters: JSONSchema; // JSON Schema for function params
}

interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolResult {
  tool_call_id: string;
  role: "tool";
  content: string; // result as string
}

// Extended message types
interface AssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls?: ToolCall[];
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}
```

### 3. Tool Registry

A simple registry where tools are registered with:

- Name
- Description (for the LLM)
- JSON Schema for parameters
- Executor function: `(params: any) => Promise<string>`

```typescript
class ToolRegistry {
  private tools: Map<string, { def: ToolDefinition; exec: (params: any) => Promise<string> }>;

  register(name, description, schema, executor): void;
  getDefinitions(): ToolDefinition[]; // for the LLM API call
  execute(name: string, params: any): Promise<string>; // run a tool
}
```
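
One way to flesh out that interface; the method bodies and error handling below are illustrative choices, since the sketch above only fixes the signatures. Returning errors as strings lets the model see tool failures and react to them:

```typescript
// Loose local types so the sketch is self-contained.
type JSONSchema = Record<string, unknown>;
type Executor = (params: any) => Promise<string>;

interface ToolDefinition {
  name: string;
  description: string;
  parameters: JSONSchema;
}

class ToolRegistry {
  private tools = new Map<string, { def: ToolDefinition; exec: Executor }>();

  register(name: string, description: string, schema: JSONSchema, executor: Executor): void {
    this.tools.set(name, { def: { name, description, parameters: schema }, exec: executor });
  }

  // Shape definitions the way the OpenAI-compatible API expects them.
  getDefinitions(): { type: "function"; function: ToolDefinition }[] {
    return Array.from(this.tools.values()).map((t) => ({ type: "function", function: t.def }));
  }

  async execute(name: string, params: any): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) return `Error: unknown tool "${name}"`; // surfaced to the model as text
    try {
      return await tool.exec(params);
    } catch (err) {
      return `Error: ${err instanceof Error ? err.message : String(err)}`;
    }
  }
}
```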

### 4. The Agentic Loop (in Agent.processMessage)

```
1. Build messages (system + history + user)
2. Call LLM with tools
3. While response has tool_calls:
   a. For each tool_call: execute, collect result
   b. Append assistant message (with tool_calls) to messages
   c. Append tool result messages
   d. Call LLM again with updated messages
4. Return final text content
5. Store in memory (include tool usage summary)
```

**Safety limits:**

- Max iterations per turn (e.g., 25)
- Max total tokens per turn
- Tool execution timeout (per tool)
- Dangerous command confirmation (optional)
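
The numbered steps and the iteration cap might combine like this. `callLLM` and the registry shape are assumptions for the sketch; only the loop structure comes from this design:

```typescript
// Agentic while loop with a max-iteration guard.
const MAX_ITERATIONS = 25;

async function runTurn(
  messages: any[],
  callLLM: (msgs: any[], tools: any[]) => Promise<{ content: string | null; tool_calls?: any[] }>,
  registry: { getDefinitions(): any[]; execute(name: string, params: any): Promise<string> },
): Promise<string> {
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await callLLM(messages, registry.getDefinitions());
    if (!response.tool_calls?.length) {
      return response.content ?? ""; // final text answer, loop ends
    }
    // Record the assistant turn that requested tools, then each result.
    messages.push({ role: "assistant", content: response.content, tool_calls: response.tool_calls });
    for (const call of response.tool_calls) {
      const result = await registry.execute(call.function.name, JSON.parse(call.function.arguments));
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
  }
  return "Stopped: max tool iterations reached."; // safety limit hit
}
```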

### 5. Phase 1 Tools

#### `shell` (exec)
- Run a shell command, return stdout/stderr
- Working directory: `~/hivemind/workspace`
- Timeout: 30s default, configurable
- Safety: no `rm -rf /` etc.

#### `read_file`
- Read file contents (with optional offset/limit for large files)
- Returns text content or error

#### `write_file`
- Write content to a file (creates dirs if needed)
- Returns success/failure

#### `edit_file`
- Find and replace exact text in a file
- oldText → newText pattern (surgical edits)

#### `web_search`
- Search via Brave API
- Returns titles, URLs, snippets

#### `web_fetch`
- Fetch URL, extract markdown
- Returns readable content
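
As a sketch of the `shell` tool's executor: the deny-list, default workspace path, and error formatting below are illustrative choices, not specified above. It uses Node's `child_process.exec`, whose `timeout` option kills the process when exceeded:

```typescript
import { exec } from "node:child_process";

// Crude deny-list; a real policy would be broader (placeholder).
const DENY = [/\brm\s+-rf\s+\/(\s|$)/];

function runShell(
  command: string,
  timeoutMs = 30_000, // matches the 30s default above
  cwd = `${process.env.HOME}/hivemind/workspace`, // assumed workspace; may not exist on all systems
): Promise<string> {
  if (DENY.some((re) => re.test(command))) {
    return Promise.resolve("Error: command blocked by safety policy");
  }
  return new Promise((resolve) => {
    exec(command, { cwd, timeout: timeoutMs }, (err, stdout, stderr) => {
      // Always resolve: failures are returned as text for the model to see.
      if (err) return resolve(`Error: ${err.message}\n${stderr}`);
      resolve(stdout + (stderr ? `\n[stderr] ${stderr}` : ""));
    });
  });
}
```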

### 6. Memory Integration

Tool calls and results should be stored in memory, but summarized:

- Don't store full file contents in L2 episodes
- Store: "Used shell to run `git status`, found 3 modified files"
- L3 promotion can learn patterns: "For git operations, agent uses shell tool"
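
A minimal sketch of that summarization step, keeping the tool name and arguments but truncating bulky results; the 80-character cap is an arbitrary choice:

```typescript
// One-line summary of a tool call for an L2 episode.
function summarizeToolCall(name: string, args: Record<string, unknown>, result: string): string {
  const argText = JSON.stringify(args);
  const preview = result.length > 80 ? result.slice(0, 80) + "…" : result;
  return `Used ${name} with ${argText} → ${preview}`;
}
```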

### 7. Config

```toml
[tools]
enabled = true
max_iterations = 25
shell_timeout_s = 30
workspace = "workspace"

[tools.web_search]
api_key = "" # or from vault
```

### 8. Implementation Order

1. **ToolRegistry class** — registration, definitions, execution
2. **LLMClient.chatWithTools()** — extended chat that handles tool_calls
3. **Agentic loop in Agent** — the while loop with safety limits
4. **shell tool** — most impactful, enables everything
5. **File tools** — read/write/edit
6. **Web tools** — search/fetch
7. **Memory integration** — summarize tool usage in episodes

### 9. OpenRouter Compatibility

OpenRouter passes through tool definitions to the underlying model. Most models support tools:

- Claude: native tool_use
- GPT-4: native function calling
- Gemini: native function declarations

The OpenAI-compatible format works for all of them through OpenRouter.

### 10. Safety Considerations

- **Sandbox**: Tools run on the agent's machine. File access should be scoped to the workspace.
- **Confirmation**: Optionally require human approval for destructive operations.
- **Logging**: All tool calls logged to the request logger for debugging.
- **Rate limiting**: Prevent runaway tool loops.