npm - open-agents-ai - Versions diffs - 0.5.3 → 0.7.0 - Mend

open-agents-ai 0.5.3 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -91,7 +91,7 @@ pnpm -r test   # 911 tests across 77 files
 ## Tools
-The agent has access to 18 tools that it calls autonomously:
+The agent has access to 26 tools that it calls autonomously:
 | Tool | Description |
 |------|-------------|
@@ -113,6 +113,70 @@ The agent has access to 18 tools that it calls autonomously:
 | `codebase_map` | High-level project structure overview |
 | `diagnostic` | Run lint/typecheck/test/build validation pipeline |
 | `git_info` | Structured git status, log, diff, and branch info |
+| `background_run` | Run a shell command in the background (returns task ID) |
+| `task_status` | Check status of background tasks |
+| `task_output` | Read output from a background task |
+| `task_stop` | Stop a running background task |
+| `sub_agent` | Delegate a sub-task to an independent agent |
+| `image_read` | Read image files (base64 + dimensions + OCR text) |
+| `screenshot` | Capture screen or window to file |
+| `ocr` | Extract text from images (supports region cropping/zoom) |
+### Parallel Execution & Sub-Agents
+The agent can run multiple operations in parallel:
+```
+You: oa "run the test suite and lint checks in parallel, then fix any issues"
+Agent: [Turn 1] background_run(command="npm test")        → task-1
+       [Turn 2] background_run(command="npm run lint")     → task-2
+       [Turn 3] task_status()                              → task-1: running, task-2: completed
+       [Turn 4] task_output(task_id="task-2")              → 3 lint errors
+       [Turn 5] file_edit(...)                             → fix lint errors
+       [Turn 6] task_output(task_id="task-1")              → all tests pass
+       [Turn 7] task_complete(summary="Fixed lint, tests pass")
+```
+Sub-agents can be delegated independent tasks:
+```
+Agent: [Turn 1] sub_agent(task="refactor auth module", background=true)  → task-3
+       [Turn 2] sub_agent(task="add pagination to users API")            → completed
+       [Turn 3] task_output(task_id="task-3")                            → auth refactored
+```
+### Image & Visual Context
+Drag-and-drop image files onto the terminal to provide visual context:
+```bash
+# Drop an image file path while agent is working → injected as context
+# Drop an image file path at idle prompt → agent describes and analyzes it
+```
+The agent can also take screenshots and extract text via OCR:
+```
+Agent: [Turn 1] screenshot(region="active")     → captured window
+       [Turn 2] ocr(path="/tmp/screenshot.png")  → extracted text
+       [Turn 3] image_read(path="mockup.png")    → base64 + OCR text
+```
+### Mid-Task Steering
+While the agent is working (shown by the `+` prompt), you can type to add context:
+```
+> fix the auth bug
+  ⎿  📄 Read: src/auth.ts
++ also check the session handling        ← typed while agent works
+  ↪ Context added: also check the session handling
+  ⎿  🔍 Search: session
+  ⎿  ✏️  Edit: src/auth.ts
+```
+Press `Ctrl+C` to abort the current task. Slash commands (`/model`, `/help`) work during active tasks.
 ### Self-Learning