open-agents-ai 0.15.3 → 0.15.5

Files changed (3)
  1. package/README.md +180 -253
  2. package/dist/index.js +449 -274
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -1,8 +1,42 @@
1
+ <p align="center">
2
+ <img src="https://img.shields.io/npm/v/open-agents-ai?color=7C3AED&style=flat-square" alt="npm version" />
3
+ <img src="https://img.shields.io/npm/dm/open-agents-ai?color=06B6D4&style=flat-square" alt="npm downloads" />
4
+ <img src="https://img.shields.io/badge/license-MIT-10B981?style=flat-square" alt="license" />
5
+ <img src="https://img.shields.io/badge/node-%3E%3D20-F59E0B?style=flat-square" alt="node version" />
6
+ <img src="https://img.shields.io/badge/models-open--weight-EC4899?style=flat-square" alt="open-weight models" />
7
+ </p>
8
+
9
+ <p align="center">
10
+ <code style="color:#5fafff">freedom of information</code> · <code style="color:#5fd7ff">freedom of patterns</code> · <code style="color:#5fffff">creating freely</code> · <code style="color:#5fffaf">open-weights</code><br>
11
+ <code style="color:#ffaf00">libertad de informacion</code> · <code style="color:#ff8700">crear libremente</code> · <code style="color:#d7afff">creer librement</code> · <code style="color:#d7d7ff">liberte d'expression</code><br>
12
+ <code style="color:#5fd75f">Freiheit der Muster</code> · <code style="color:#ff5f87">jiyuu ni souzou suru</code> · <code style="color:#8787ff">jayuroun changjak</code> · <code style="color:#5fafaf">svoboda tvorchestva</code><br>
13
+ <code style="color:#d7af5f">liberdade de criar</code> · <code style="color:#afaf87">creare liberamente</code> · <code style="color:#afff87">ozgurce yarat</code> · <code style="color:#87d7d7">skapa fritt</code><br>
14
+ <code style="color:#afd787">vrij creeren</code> · <code style="color:#d7d7af">tworz swobodnie</code> · <code style="color:#5fafff">dimiourgia elefthera</code> · <code style="color:#ff5f87">khuli soch</code><br>
15
+ <code style="color:#ffd787">hurriyat al-ibdaa</code> · <code style="color:#87ffaf">code is poetry</code> · <code style="color:#ff87d7">democratize AI</code> · <code style="color:#d7afff">imagine freely</code>
16
+ </p>
17
+
18
+ ---
19
+
1
20
  # Open Agents
2
21
 
3
- **AI coding agent framework powered by open-weight models via Ollama.**
22
+ **AI coding agent powered entirely by open-weight models via Ollama and OpenAI-compatible APIs.**
23
+
24
+ An autonomous multi-turn tool-calling agent that reads your code, makes changes, runs tests, and fixes failures iteratively until the task is complete — running 100% locally on your hardware with open-weight models. No API keys required. No cloud dependencies. Your code never leaves your machine.
25
+
26
+ ## Features
4
27
 
5
- A multi-turn agentic tool-calling loop that iteratively reads code, makes changes, runs tests, and fixes failures until the task is complete — modeled after how Claude Code operates, but running entirely on local open-weight models.
28
+ - **26 autonomous tools**: file I/O, shell, grep, web search/fetch, memory, sub-agents, background tasks, image/OCR, git, diagnostics
29
+ - **Parallel tool execution** — read-only tools run concurrently via `Promise.allSettled` for faster feedback loops
30
+ - **Sub-agent delegation** — spawn independent agents for parallel workstreams with `background=true`
31
+ - **Auto-expanding context window** — detects your RAM/VRAM and creates an optimized model variant on first run
32
+ - **Neural TTS voice feedback** — hear what the agent is doing via GLaDOS or Overwatch ONNX voices
33
+ - **Mid-task steering** — type while the agent works to add context without interrupting
34
+ - **Smart context compaction** — long conversations compressed preserving files, commands, errors, and decisions
35
+ - **Persistent memory** — learned patterns stored in `.oa/memory/` across sessions
36
+ - **Self-learning** — auto-fetches docs from the web when encountering unfamiliar APIs
37
+ - **Multilingual TUI** — inspirational messages in 15+ languages on startup
38
+ - **Seamless `/update`** — in-place update and reload without losing context
39
+ - **Dynamic code rendering** — syntax-aware tool output with terminal-width cropping
6
40
 
7
41
  ## How It Works
8
42
 
@@ -16,7 +50,7 @@ Agent: [Turn 1] file_read(src/auth.ts)
16
50
  [Turn 5] task_complete(summary="Fixed null check — all tests pass")
17
51
  ```
18
52
 
19
- The agent has **18 tools** (including 3 AIWG SDLC tools and 4 advanced analysis tools) and uses them autonomously in a loop, reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.
53
+ The agent uses tools autonomously in a loop, reading errors, fixing code, and re-running validation until the task succeeds or the turn limit is reached.
20
54
 
21
55
  ## Quick Start
22
56
 
@@ -26,11 +60,11 @@ The agent has **18 tools** (including 3 AIWG SDLC tools and 4 advanced analysis
26
60
  # Install globally — provides `open-agents` and `oa` commands
27
61
  npm i -g open-agents-ai
28
62
 
29
- # Run it — first launch auto-detects your system and pulls the best model
63
+ # Run it — first launch auto-detects your system and configures the optimal model
30
64
  oa "fix the null check in auth.ts"
31
65
  ```
32
66
 
33
- On first run, the setup wizard detects your RAM/VRAM and recommends the optimal qwen3.5 variant.
67
+ On first run, the setup wizard detects your RAM/VRAM and creates an expanded-context model variant automatically.
34
68
 
35
69
  ### Install from source
36
70
 
@@ -47,88 +81,88 @@ git clone https://github.com/robit-man/open-agents.git && cd open-agents
47
81
 
48
82
  # 4. Use it
49
83
  oa "add pagination to the users endpoint"
50
- open-agents "refactor the auth module into separate files"
51
84
  ```
52
85
 
53
- ## Installation
86
+ ## Interactive TUI
54
87
 
55
- ### Prerequisites
56
-
57
- - **Node.js** >= 20
58
- - **pnpm** (`npm install -g pnpm`)
59
- - **Ollama** ([ollama.com](https://ollama.com)) with a model that supports tool calling
60
-
61
- ### Install System-Wide
88
+ Launch without arguments to enter the interactive REPL with a rich terminal interface:
62
89
 
63
90
  ```bash
64
- # Install to ~/.local/bin (no sudo needed)
65
- ./scripts/install.sh
91
+ oa
92
+ ```
66
93
 
67
- # Install to /usr/local/bin
68
- sudo ./scripts/install.sh --global
94
+ The TUI features:
95
+ - **Animated multilingual phrase carousel** — creativity and open-source messages scrolling in 15+ languages
96
+ - **Live metrics bar** — token in/out counts, context window usage with pastel-colored labels
97
+ - **Rotating tips** — helpful hints cycling every 10 seconds
98
+ - **Syntax-highlighted output** — tool results rendered with language-aware formatting
99
+ - **Dynamic terminal cropping** — output adapts to terminal width on resize
69
100
 
70
- # Custom prefix
71
- ./scripts/install.sh --prefix ~/bin
101
+ ### Slash Commands
72
102
 
73
- # Uninstall
74
- ./scripts/install.sh --uninstall
75
- ```
103
+ | Command | Description |
104
+ |---------|-------------|
105
+ | `/help` | Show all available commands |
106
+ | `/model <name>` | Switch to a different Ollama model |
107
+ | `/endpoint <url>` | Connect to a remote vLLM or OpenAI-compatible API |
108
+ | `/voice [model]` | Toggle TTS voice feedback (GLaDOS, Overwatch) |
109
+ | `/stream` | Toggle streaming token display |
110
+ | `/update` | Check for and install updates (seamless reload) |
111
+ | `/config` | Show current configuration |
112
+ | `/clear` | Clear the screen |
113
+ | `/exit` | Quit |
76
114
 
77
- The installer will:
78
- 1. Check Node.js and pnpm versions
79
- 2. Install workspace dependencies
80
- 3. Build all packages
81
- 4. Create `open-agents` and `oa` symlinks
82
- 5. Configure an optimized Ollama model (auto-detects RAM for context window sizing)
115
+ ### Mid-Task Steering
83
116
 
84
- ### Manual Build
117
+ While the agent is working (shown by the `+` prompt), type to add context:
85
118
 
86
- ```bash
87
- pnpm install
88
- pnpm -r build
89
- pnpm -r test # 911 tests across 77 files
119
+ ```
120
+ > fix the auth bug
121
+ ⎿ Read: src/auth.ts
122
+ + also check the session handling ← typed while agent works
123
+ ↪ Context added: also check the session handling
124
+ ⎿ Search: session
125
+ ⎿ Edit: src/auth.ts
90
126
  ```
91
127
 
92
- ## Tools
128
+ Press `Ctrl+C` to abort the current task.
93
129
 
94
- The agent has access to 26 tools that it calls autonomously:
130
+ ## Tools (26)
95
131
 
96
132
  | Tool | Description |
97
133
  |------|-------------|
98
134
  | `file_read` | Read file contents with line numbers (supports offset/limit) |
99
135
  | `file_write` | Create or overwrite files |
100
- | `file_edit` | Precise string replacement in files (preferred over full rewrites) |
101
- | `shell` | Execute any shell command (tests, builds, git, etc.) |
102
- | `grep_search` | Search file contents with regex (uses ripgrep when available) |
136
+ | `file_edit` | Precise string replacement in files |
137
+ | `shell` | Execute any shell command |
138
+ | `grep_search` | Search file contents with regex (ripgrep when available) |
103
139
  | `find_files` | Find files by glob pattern |
104
140
  | `list_directory` | List directory contents with types and sizes |
105
141
  | `web_search` | Search the web via DuckDuckGo |
106
- | `web_fetch` | Fetch and extract text from web pages (docs, MDN, w3schools) |
142
+ | `web_fetch` | Fetch and extract text from web pages |
107
143
  | `memory_read` | Read from persistent memory store |
108
144
  | `memory_write` | Store patterns and solutions for future tasks |
109
- | `aiwg_setup` | Deploy AIWG SDLC framework in the project |
110
- | `aiwg_health` | Analyze project SDLC health and readiness |
111
- | `aiwg_workflow` | Execute AIWG commands and workflows |
112
145
  | `batch_edit` | Multiple precise edits across files in one call |
113
146
  | `codebase_map` | High-level project structure overview |
114
147
  | `diagnostic` | Run lint/typecheck/test/build validation pipeline |
115
148
  | `git_info` | Structured git status, log, diff, and branch info |
116
- | `background_run` | Run a shell command in the background (returns task ID) |
149
+ | `background_run` | Run a shell command in the background |
117
150
  | `task_status` | Check status of background tasks |
118
151
  | `task_output` | Read output from a background task |
119
152
  | `task_stop` | Stop a running background task |
120
153
  | `sub_agent` | Delegate a sub-task to an independent agent |
121
- | `image_read` | Read image files (base64 + dimensions + OCR text) |
154
+ | `image_read` | Read image files (base64 + dimensions + OCR) |
122
155
  | `screenshot` | Capture screen or window to file |
123
- | `ocr` | Extract text from images (supports region cropping/zoom) |
156
+ | `ocr` | Extract text from images |
157
+ | `aiwg_setup` | Deploy AIWG SDLC framework |
158
+ | `aiwg_health` | Analyze project SDLC health and readiness |
159
+ | `aiwg_workflow` | Execute AIWG commands and workflows |
124
160
 
125
161
  ### Parallel Execution & Sub-Agents
126
162
 
127
- The agent can run multiple operations in parallel:
163
+ Read-only tools (`file_read`, `grep_search`, `find_files`, `list_directory`, `web_fetch`, `web_search`, `memory_read`) execute concurrently when called in the same turn. Mutating tools run sequentially to ensure safety.
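The scheduling rule described above can be sketched as follows. This is a minimal illustration, not the package's actual code: the `ToolCall`, `ToolResult`, and `ToolFn` types and the `runTurn` helper are hypothetical, and running the read-only batch before the mutating calls is an assumption. Only the list of read-only tool names and the use of `Promise.allSettled` come from the README text.

```typescript
// Hypothetical types: the package's real tool interfaces are not shown in this diff.
type ToolCall = { name: string; args: Record<string, unknown> };
type ToolResult = { name: string; ok: boolean; output: string };
type ToolFn = (args: Record<string, unknown>) => Promise<string>;

// Read-only tool names as listed in the README text.
const READ_ONLY = new Set([
  "file_read", "grep_search", "find_files", "list_directory",
  "web_fetch", "web_search", "memory_read",
]);

// Looks up and invokes a single tool; an unknown name becomes a rejected promise.
async function runOne(call: ToolCall, tools: Record<string, ToolFn>): Promise<string> {
  const fn = tools[call.name];
  if (!fn) throw new Error(`unknown tool: ${call.name}`);
  return fn(call.args);
}

// Runs one turn's tool calls: read-only calls concurrently, then mutating calls
// one at a time (assumed order). Results keep the original call order.
async function runTurn(calls: ToolCall[], tools: Record<string, ToolFn>): Promise<ToolResult[]> {
  const results: ToolResult[] = new Array(calls.length);
  const readIdx = calls.map((c, i) => (READ_ONLY.has(c.name) ? i : -1)).filter((i) => i >= 0);

  // allSettled never rejects, so one failing read does not discard the others.
  const settled = await Promise.allSettled(readIdx.map((i) => runOne(calls[i], tools)));
  settled.forEach((s, k) => {
    const i = readIdx[k];
    results[i] = s.status === "fulfilled"
      ? { name: calls[i].name, ok: true, output: s.value }
      : { name: calls[i].name, ok: false, output: String(s.reason) };
  });

  for (let i = 0; i < calls.length; i++) {
    if (READ_ONLY.has(calls[i].name)) continue;
    try {
      results[i] = { name: calls[i].name, ok: true, output: await runOne(calls[i], tools) };
    } catch (err) {
      results[i] = { name: calls[i].name, ok: false, output: String(err) };
    }
  }
  return results;
}
```

Keeping failures as `ok: false` entries, rather than throwing, lets the model see the error text on the next turn.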
128
164
 
129
165
  ```
130
- You: oa "run the test suite and lint checks in parallel, then fix any issues"
131
-
132
166
  Agent: [Turn 1] background_run(command="npm test") → task-1
133
167
  [Turn 2] background_run(command="npm run lint") → task-2
134
168
  [Turn 3] task_status() → task-1: running, task-2: completed
@@ -138,7 +172,7 @@ Agent: [Turn 1] background_run(command="npm test") → task-1
138
172
  [Turn 7] task_complete(summary="Fixed lint, tests pass")
139
173
  ```
140
174
 
141
- Sub-agents can be delegated independent tasks:
175
+ Sub-agents can run independent tasks in parallel:
142
176
 
143
177
  ```
144
178
  Agent: [Turn 1] sub_agent(task="refactor auth module", background=true) → task-3
@@ -148,14 +182,7 @@ Agent: [Turn 1] sub_agent(task="refactor auth module", background=true) → tas
148
182
 
149
183
  ### Image & Visual Context
150
184
 
151
- Drag-and-drop image files onto the terminal to provide visual context:
152
-
153
- ```bash
154
- # Drop an image file path while agent is working → injected as context
155
- # Drop an image file path at idle prompt → agent describes and analyzes it
156
- ```
157
-
158
- The agent can also take screenshots and extract text via OCR:
185
+ Drag-and-drop image files onto the terminal to provide visual context. The agent can also take screenshots and extract text via OCR:
159
186
 
160
187
  ```
161
188
  Agent: [Turn 1] screenshot(region="active") → captured window
@@ -163,96 +190,119 @@ Agent: [Turn 1] screenshot(region="active") → captured window
163
190
  [Turn 3] image_read(path="mockup.png") → base64 + OCR text
164
191
  ```
165
192
 
166
- ### Mid-Task Steering
193
+ ## Auto-Expanding Context Window
194
+
195
+ On startup (and when switching models with `/model`), Open Agents automatically:
196
+
197
+ 1. Detects available system RAM and GPU VRAM
198
+ 2. Checks if an expanded-context variant of your model exists
199
+ 3. Creates one via Ollama Modelfile if needed, with optimal `num_ctx`:
200
+
201
+ | Available Memory | Context Window |
202
+ |-----------------|---------------|
203
+ | 200GB+ | 128K tokens |
204
+ | 100GB+ | 64K tokens |
205
+ | 50GB+ | 32K tokens |
206
+ | 20GB+ | 16K tokens |
207
+ | 8GB+ | 8K tokens |
208
+ | < 8GB | 4K tokens |
209
+
210
+ The expanded model is named `open-agents-{model}` and reused across sessions.
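The memory-to-context mapping in the table above could be expressed roughly like this. The function names are hypothetical and "K" is assumed to mean 1024 tokens. The `FROM` and `PARAMETER num_ctx` lines are standard Ollama Modelfile syntax, while the variant name follows the `open-agents-{model}` pattern stated in the README.

```typescript
// Tiers copied from the README table; "K" is assumed to mean 1024 tokens.
const TIERS: Array<{ minGB: number; numCtx: number }> = [
  { minGB: 200, numCtx: 128 * 1024 },
  { minGB: 100, numCtx: 64 * 1024 },
  { minGB: 50, numCtx: 32 * 1024 },
  { minGB: 20, numCtx: 16 * 1024 },
  { minGB: 8, numCtx: 8 * 1024 },
];

// Returns the context window for the first tier whose threshold is met,
// falling back to 4K tokens below 8 GB.
function pickNumCtx(availableGB: number): number {
  for (const tier of TIERS) {
    if (availableGB >= tier.minGB) return tier.numCtx;
  }
  return 4 * 1024;
}

// Renders a Modelfile that derives a variant of `baseModel` with a larger context window.
function renderModelfile(baseModel: string, numCtx: number): string {
  return `FROM ${baseModel}\nPARAMETER num_ctx ${numCtx}\n`;
}

// Variant name following the README's `open-agents-{model}` pattern.
function variantName(baseModel: string): string {
  return `open-agents-${baseModel}`;
}
```

A Modelfile produced this way can be registered with `ollama create <name> -f Modelfile`.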
167
211
 
168
- While the agent is working (shown by the `+` prompt), you can type to add context:
212
+ ## Voice Feedback (TTS)
169
213
 
214
+ Neural TTS voices speak what the agent is doing in real-time:
215
+
216
+ ```bash
217
+ /voice # Toggle voice on/off (default: GLaDOS)
218
+ /voice glados # GLaDOS voice
219
+ /voice overwatch # Overwatch voice
170
220
  ```
171
- > fix the auth bug
172
- ⎿ 📄 Read: src/auth.ts
173
- + also check the session handling ← typed while agent works
174
- ↪ Context added: also check the session handling
175
- ⎿ 🔍 Search: session
176
- ⎿ ✏️ Edit: src/auth.ts
221
+
222
+ On first enable, auto-downloads the ONNX voice model (~50MB). For best quality:
223
+
224
+ ```bash
225
+ # Ubuntu/Debian
226
+ sudo apt install espeak-ng
227
+
228
+ # macOS
229
+ brew install espeak-ng
177
230
  ```
178
231
 
179
- Press `Ctrl+C` to abort the current task. Slash commands (`/model`, `/help`) work during active tasks.
232
+ ## Self-Learning & Error Recovery
180
233
 
181
- ### Self-Learning
234
+ **Self-learning**: When encountering an unfamiliar API, the agent automatically searches the web, fetches documentation, stores the pattern in persistent memory, and applies it.
182
235
 
183
- When the agent encounters an unfamiliar API or language feature, it automatically:
184
- 1. Searches the web for documentation
185
- 2. Fetches the relevant page (w3schools.com, MDN, official docs)
186
- 3. Stores the learned pattern in persistent memory
187
- 4. Applies the knowledge to the current task
236
+ **Error recovery**: The agent follows an iterative fix loop: run validation, read errors, identify the exact file and line, fix with `file_edit`, re-run until passing.
188
237
 
189
- ### Error Recovery
238
+ ## Configuration
190
239
 
191
- The agent follows an iterative fix loop:
192
- 1. Run validation (tests/build/lint)
193
- 2. Read the full error output
194
- 3. Identify the exact file, line, and failure
195
- 4. Fix with `file_edit`
196
- 5. Re-run validation
197
- 6. Repeat until passing
240
+ Config priority: CLI flags > environment variables > `~/.open-agents/config.json` > defaults.
198
241
 
199
- ### Dynamic System Prompt
242
+ ```bash
243
+ # Set defaults
244
+ open-agents config set model qwen3.5:122b
245
+ open-agents config set backendUrl http://localhost:11434
246
+ open-agents config set backendType ollama
200
247
 
201
- The agent's system prompt is dynamically enriched at task start with:
248
+ # Environment variables
249
+ export OPEN_AGENTS_MODEL=qwen3.5:122b
250
+ export OPEN_AGENTS_BACKEND_URL=http://localhost:11434
251
+ export OPEN_AGENTS_BACKEND_TYPE=ollama
252
+ ```
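The stated priority order (CLI flags > environment variables > config file > defaults) can be sketched as a layered merge. The `Config` shape, the `DEFAULTS` values, and the `resolveConfig` helper are illustrative assumptions. Only the three keys and the `OPEN_AGENTS_*` variable names appear in the README.

```typescript
// Hypothetical config shape using the three keys shown in the README.
interface Config {
  model: string;
  backendUrl: string;
  backendType: string;
}

// Placeholder defaults for illustration only; the package's real defaults may differ.
const DEFAULTS: Config = {
  model: "qwen3.5:122b",
  backendUrl: "http://localhost:11434",
  backendType: "ollama",
};

// Picks up only the environment variables that are actually set and non-empty.
function fromEnv(env: Record<string, string | undefined>): Partial<Config> {
  const out: Partial<Config> = {};
  if (env.OPEN_AGENTS_MODEL) out.model = env.OPEN_AGENTS_MODEL;
  if (env.OPEN_AGENTS_BACKEND_URL) out.backendUrl = env.OPEN_AGENTS_BACKEND_URL;
  if (env.OPEN_AGENTS_BACKEND_TYPE) out.backendType = env.OPEN_AGENTS_BACKEND_TYPE;
  return out;
}

// Later spreads win, so sources are listed from lowest to highest priority.
// Callers should omit unset keys rather than pass them as undefined.
function resolveConfig(
  cli: Partial<Config>,
  env: Record<string, string | undefined>,
  file: Partial<Config>,
): Config {
  return { ...DEFAULTS, ...file, ...fromEnv(env), ...cli };
}
```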
202
253
 
203
- | Source | Description |
204
- |--------|-------------|
205
- | **Project context files** | `.open-agents.md`, `AGENTS.md`, or `.open-agents/context.md` — loaded from project root and parent directories |
206
- | **Git state** | Current branch, working tree status, recent commits |
207
- | **Persistent memory** | Learned patterns from previous sessions (project-local and global) |
208
- | **Environment** | Working directory, Node version, OS, date |
254
+ ### Project Context Files
209
255
 
210
- Create a `.open-agents.md` file in your project root to give the agent project-specific instructions:
256
+ Create `AGENTS.md`, `OA.md`, or `.open-agents.md` in your project root to give the agent project-specific instructions:
211
257
 
212
258
  ```markdown
213
259
  # Project Context
214
260
 
215
261
  - This is a TypeScript monorepo using pnpm workspaces
216
262
  - Run tests with: pnpm -r test
217
- - Build with: pnpm -r build
218
263
  - Always use file_edit over file_write for existing files
219
- - Database migrations are in src/db/migrations/
220
264
  ```
221
265
 
222
- Context files are merged from parent → child directories, so you can set global defaults at `~/.open-agents.md` and override per-project.
266
+ Context files merge from parent to child directories; set global defaults at `~/.open-agents.md` and override per-project.
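One way to read the parent-to-child merge is as a walk from the filesystem root down to the working directory, collecting a context file at each level. The sketch below assumes POSIX-style paths and assumes that "override" means deeper files are appended last. Both helper names are hypothetical, and `readFile` is injected so the example stays free of filesystem access.

```typescript
// Lists directories from the filesystem root down to `cwd` (POSIX-style paths assumed).
function ancestorsRootFirst(cwd: string): string[] {
  const parts = cwd.split("/").filter((p) => p.length > 0);
  const dirs = ["/"];
  let current = "";
  for (const part of parts) {
    current += "/" + part;
    dirs.push(current);
  }
  return dirs;
}

// Concatenates context files parent-first, so deeper (more specific) files come last.
// `readFile` returns undefined when a file does not exist.
function mergeContext(
  cwd: string,
  fileName: string,
  readFile: (path: string) => string | undefined,
): string {
  const chunks: string[] = [];
  for (const dir of ancestorsRootFirst(cwd)) {
    const path = dir === "/" ? "/" + fileName : dir + "/" + fileName;
    const text = readFile(path);
    if (text !== undefined) chunks.push(text);
  }
  return chunks.join("\n\n");
}
```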
223
267
 
224
268
  ### `.oa/` Project Directory
225
269
 
226
- Each project gets a `.oa/` directory (similar to `.claude/` for Claude Code) that persists artifacts across sessions:
270
+ Each project gets a `.oa/` directory that persists state across sessions:
227
271
 
228
272
  ```
229
273
  .oa/
230
274
  ├── config.json # Per-project configuration overrides
275
+ ├── settings.json # TUI settings (voice, streaming, etc.)
231
276
  ├── memory/ # Persistent memory store
232
277
  │ └── {topic}.json # Topic-based key-value memories
233
278
  ├── index/ # Cached codebase index
234
279
  │ ├── repo-profile.json # Repository metadata
235
- │ ├── file-summaries.json # Per-file purpose, exports, domain, risk
280
+ │ ├── file-summaries.json # Per-file purpose, exports, domain
236
281
  │ ├── symbols.json # Symbol table cache
237
282
  │ ├── graph.json # Import/dependency graph
238
- │ └── meta.json # Index metadata (timestamp, hash)
283
+ │ └── meta.json # Index metadata
239
284
  ├── context/ # Auto-generated project context
240
285
  │ └── project-map.md # Generated overview for system prompt
241
286
  └── history/ # Session history
242
287
  └── {session-id}.json # Per-session task log
243
288
  ```
244
289
 
245
- The agent auto-discovers `AGENTS.md`, `OA.md`, `CLAUDE.md`, and `README.md` from the project root and parent directories, injecting them into the system prompt for project-specific awareness.
290
+ ## Model Support
291
+
292
+ **Primary target**: Qwen3.5-122B-A10B via Ollama (MoE architecture, runs on 48GB+ VRAM)
246
293
 
247
- ### Smart Context Compaction
294
+ Any model that supports tool calling via Ollama or an OpenAI-compatible API works:
248
295
 
249
- When conversations exceed the context window, the agent compacts older messages while preserving:
250
- - Files that were read and modified
251
- - Shell commands that were run and their outcomes
252
- - Errors that were encountered
253
- - Key decisions that were made
296
+ ```bash
297
+ # Different Ollama model
298
+ oa --model qwen2.5-coder:32b "fix the bug"
254
299
 
255
- This structured summary prevents the agent from repeating work or losing track of what's been done.
300
+ # vLLM backend
301
+ oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
302
+
303
+ # Any OpenAI-compatible API
304
+ oa --backend-url http://10.0.0.5:11434 "refactor auth"
305
+ ```
256
306
 
257
307
  ## Commands
258
308
 
@@ -282,104 +332,22 @@ This structured summary prevents the agent from repeating work or losing track o
282
332
  -V, --version Show version
283
333
  ```
284
334
 
285
- ### Voice Feedback (TTS)
286
-
287
- The agent can speak what it's doing using neural TTS voices. Enable it in the interactive REPL:
288
-
289
- ```bash
290
- /voice # Toggle voice on/off (default: GLaDOS)
291
- /voice glados # Switch to GLaDOS voice
292
- /voice overwatch # Switch to Overwatch voice
293
- ```
294
-
295
- On first enable, the agent auto-downloads the ONNX voice model (~50MB) and installs `onnxruntime-node` in `~/.open-agents/voice/`. For best quality, install `espeak-ng`:
296
-
297
- ```bash
298
- # Ubuntu/Debian
299
- sudo apt install espeak-ng
300
-
301
- # macOS
302
- brew install espeak-ng
303
- ```
304
-
305
- When enabled, the agent speaks brief descriptions of each tool call ("Reading auth.ts", "Running tests", "Editing config.js") through your system speakers.
306
-
307
- ### Configuration
308
-
309
- Config priority: CLI flags > environment variables > `~/.open-agents/config.json` > defaults.
310
-
311
- ```bash
312
- # Set defaults
313
- open-agents config set model qwen3.5:122b
314
- open-agents config set backendUrl http://localhost:11434
315
- open-agents config set backendType ollama
316
-
317
- # Environment variables
318
- export OPEN_AGENTS_MODEL=qwen3.5:122b
319
- export OPEN_AGENTS_BACKEND_URL=http://localhost:11434
320
- export OPEN_AGENTS_BACKEND_TYPE=ollama
321
- ```
322
-
323
- ## Model Support
324
-
325
- **Primary target**: Qwen3.5-122B-A10B via Ollama (MoE, runs on 48GB+ VRAM)
326
-
327
- The `setup-model.sh` script auto-configures the context window based on available RAM:
328
-
329
- | RAM | Context Window |
330
- |-----|---------------|
331
- | 300GB+ | 128K tokens |
332
- | 128GB+ | 64K tokens |
333
- | 64GB+ | 32K tokens |
334
- | < 64GB | 16K tokens |
335
-
336
- ### Other Models
337
-
338
- Any model that supports tool calling via Ollama or an OpenAI-compatible API works:
339
-
340
- ```bash
341
- # Use a different Ollama model
342
- oa --model qwen2.5-coder:32b "fix the bug"
343
-
344
- # Use vLLM backend
345
- oa --backend vllm --backend-url http://localhost:8000/v1 "add tests"
346
-
347
- # Use any OpenAI-compatible API
348
- oa --backend-url http://10.0.0.5:11434 "refactor auth"
349
- ```
350
-
351
335
  ## AIWG Integration
352
336
 
353
- Open Agents integrates with [AIWG](https://www.npmjs.com/package/aiwg) (AI Writing Guide) — a cognitive architecture for AI-augmented software development. When AIWG is installed, the agent gains SDLC superpowers:
337
+ Open Agents integrates with [AIWG](https://www.npmjs.com/package/aiwg) (AI Writing Guide) — a cognitive architecture for AI-augmented software development:
354
338
 
355
339
  ```bash
356
- # Install AIWG globally
357
340
  npm i -g aiwg
358
-
359
- # The agent can now use AIWG tools automatically:
360
341
  oa "analyze this project's SDLC health and set up proper documentation"
361
- oa "create requirements and architecture docs for this codebase"
362
342
  ```
363
343
 
364
- ### What AIWG Adds
365
-
366
344
  | Capability | Description |
367
345
  |-----------|-------------|
368
- | **Structured Memory** | `.aiwg/` directory persists project knowledge across sessions |
346
+ | **Structured Memory** | `.aiwg/` directory persists project knowledge |
369
347
  | **SDLC Artifacts** | Requirements, architecture, test strategy, deployment docs |
370
- | **Health Analysis** | Score your project's SDLC maturity (testing, CI/CD, docs, etc.) |
348
+ | **Health Analysis** | Score your project's SDLC maturity |
371
349
  | **85+ Agents** | Specialized AI personas (Test Engineer, Security Auditor, API Designer) |
372
- | **Traceability** | @-mention system links requirements → code → tests |
373
-
374
- ### AIWG Tools
375
-
376
- The 3 AIWG tools are available when `aiwg` is installed globally:
377
-
378
- - **`aiwg_setup`** — Deploy an AIWG framework (`sdlc`, `marketing`, `forensics`, `research`)
379
- - **`aiwg_health`** — Analyze project SDLC readiness (works even without AIWG installed)
380
- - **`aiwg_workflow`** — Run any AIWG CLI command (`runtime-info`, `list`, `mcp info`)
381
-
382
- If AIWG is not installed, the tools return helpful install instructions. The `aiwg_health` tool provides native analysis without requiring AIWG.
350
+ | **Traceability** | @-mention system links requirements to code to tests |
383
351
 
384
352
  ## Architecture
385
353
 
@@ -392,53 +360,49 @@ User task
392
360
 
393
361
  System prompt + tools → LLM
394
362
 
395
- LLM returns tool_calls → Execute tools → Feed results back → LLM
363
+ LLM returns tool_calls → Execute tools (parallel/sequential) → Feed results → LLM
396
364
  ↓ (repeat until task_complete or max turns)
397
365
  Result: completed/incomplete, turns, tool calls, duration
398
366
  ```
399
367
 
400
- Key design decisions:
368
+ Key design:
401
369
  - **Tool-first**: The model explores via tools rather than pre-stuffed context
402
- - **Iterative**: Tests, sees failures, fixes them — no need for perfect one-shot output
403
- - **Context compaction**: Long conversations are compressed, preserving only recent context
370
+ - **Iterative**: Tests, sees failures, fixes them — no one-shot guessing
371
+ - **Parallel-safe**: Read-only tools execute concurrently; mutating tools run sequentially
372
+ - **Context compaction**: Long conversations compressed, preserving recent context
404
373
  - **Bounded**: Maximum turns, timeout, and output limits prevent runaway loops
405
- - **Observable**: Every tool call and result is emitted as a real-time event
374
+ - **Observable**: Every tool call and result emitted as a real-time event
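The bounded tool-calling loop described in this section might look like the sketch below. It is deliberately synchronous and uses a scripted stand-in for the model, so it only illustrates the control flow (repeat until `task_complete` or the turn limit). All type names and the `runAgent` helper are hypothetical and not taken from the package.

```typescript
// Hypothetical shapes; the package's real message and tool types are not shown here.
type Call = { tool: string; args: Record<string, string> };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (history: Message[]) => Call[];
type Tools = Record<string, (args: Record<string, string>) => string>;

interface RunResult {
  status: "completed" | "incomplete";
  turns: number;
  toolCalls: number;
}

// Runs the loop until the model calls `task_complete` or `maxTurns` is exhausted.
function runAgent(task: string, model: Model, tools: Tools, maxTurns = 10): RunResult {
  const history: Message[] = [{ role: "user", content: task }];
  let toolCalls = 0;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const calls = model(history);
    for (const call of calls) {
      toolCalls++;
      if (call.tool === "task_complete") {
        return { status: "completed", turns: turn, toolCalls };
      }
      const fn = tools[call.tool];
      const output = fn ? fn(call.args) : `unknown tool: ${call.tool}`;
      // Feed each result back so the next model call can react to it.
      history.push({ role: "tool", content: `${call.tool}: ${output}` });
    }
  }
  // The turn limit bounds the loop even if the model never signals completion.
  return { status: "incomplete", turns: maxTurns, toolCalls };
}
```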
406
375
 
407
376
  ### Package Structure
408
377
 
409
378
  ```
410
379
  packages/
411
- orchestrator/ - AgenticRunner, OllamaAgenticBackend, RALPH loop
412
- execution/ - 11 tools (file, shell, grep, web, memory), validation pipeline
380
+ orchestrator/ - AgenticRunner, backend integration, parallel execution
381
+ execution/ - 26 tools (file, shell, grep, web, memory, image, AIWG)
413
382
  schemas/ - Zod schemas and TypeScript types
414
383
  backend-vllm/ - Ollama + vLLM backend clients (OpenAI-compatible)
415
384
  memory/ - SQLite-backed persistent memory stores
416
385
  indexer/ - Codebase scanning and symbol extraction
417
386
  retrieval/ - Multi-stage retrieval (lexical + semantic + graph)
418
387
  prompts/ - Prompt contracts for each agent role
419
- cli/ - CLI entry point, commands, config, UI
388
+ cli/ - CLI entry point, TUI, status bar, carousel, config
420
389
 
421
390
  apps/
422
391
  api/ - Express API server
423
392
  worker/ - Background task processor
424
393
 
425
- eval/ - 8 evaluation tasks with agentic runner
426
- scripts/ - install.sh, setup-model.sh, bootstrap.sh
394
+ eval/ - 17 evaluation tasks with agentic runner
395
+ scripts/ - install.sh, setup-model.sh, build-publish.mjs
427
396
  ```
428
397
 
429
398
  ## Evaluation
430
399
 
431
- The framework includes 17 evaluation tasks that test the agent's ability to autonomously resolve coding problems:
400
+ 17 evaluation tasks test the agent's autonomous coding ability:
432
401
 
433
402
  ```bash
434
- # Run all 8 tasks with agentic tool-calling loop
435
- node eval/run-agentic.mjs
436
-
437
- # Single task
438
- node eval/run-agentic.mjs 04-add-test
439
-
440
- # Different model
441
- node eval/run-agentic.mjs --model qwen2.5-coder:32b
403
+ node eval/run-agentic.mjs # Run all tasks
404
+ node eval/run-agentic.mjs 04-add-test # Single task
405
+ node eval/run-agentic.mjs --model qwen2.5-coder:32b # Different model
442
406
  ```
443
407
 
444
408
  ### Results (Qwen3.5-122B)
@@ -455,49 +419,6 @@ TASK RESULT TIME TURNS TOOLS
455
419
  08-multi-file PASS 75.5s 8 13
456
420
 
457
421
  Pass rate: 100% (8/8)
458
- Total: 39 turns, 55 tool calls, ~10 minutes
459
- ```
460
-
461
- ### Task Descriptions
462
-
463
- | ID | Task | Difficulty |
464
- |----|------|-----------|
465
- | 01 | Fix typo in function name | Easy |
466
- | 02 | Add isPrime function | Easy |
467
- | 03 | Fix off-by-one bug | Easy |
468
- | 04 | Write comprehensive tests for untested functions | Medium |
469
- | 05 | Extract functions from long method (refactor) | Medium |
470
- | 06 | Fix TypeScript type errors | Medium |
471
- | 07 | Add REST API endpoint | Medium |
472
- | 08 | Add pagination across multiple files | Hard |
473
- | 09 | CSS named color lookup (148 colors, web search) | Medium |
474
- | 10 | HTTP status code lookup (32+ codes, web search) | Medium |
475
- | 11 | MIME type lookup (30+ types, web search) | Medium |
476
- | 12 | SDLC health analyzer (AIWG-style scoring) | Medium |
477
- | 13 | SDLC artifact generator (requirements, arch, tests) | Hard |
478
- | 14 | Batch refactor variable names across files | Medium |
479
- | 15 | Codebase overview generator from structure analysis | Medium |
480
- | 16 | Diagnostic fix loop (find and fix buggy code) | Medium |
481
- | 17 | Git repository analyzer | Medium |
482
-
483
- ## Test Suite
484
-
485
- ```
486
- Package Tests
487
- ─────────────────────────
488
- schemas 216
489
- backend-vllm 162
490
- execution 136
491
- indexer 94
492
- cli 72
493
- orchestrator 70
494
- retrieval 66
495
- memory 58
496
- prompts 34
497
- apps/api 1
498
- apps/worker 2
499
- ─────────────────────────
500
- Total 911 passing
501
422
  ```
502
423
 
503
424
  ## Development
@@ -505,10 +426,16 @@ Total 911 passing
505
426
  ```bash
506
427
  pnpm install # Install dependencies
507
428
  pnpm -r build # Build all packages
508
- pnpm -r test # Run all 911 tests
429
+ pnpm -r test # Run all tests
509
430
  pnpm -r dev # Watch mode
510
431
  ```
511
432
 
433
+ ## Prerequisites
434
+
435
+ - **Node.js** >= 20
436
+ - **pnpm** (`npm install -g pnpm`)
437
+ - **Ollama** ([ollama.com](https://ollama.com)) with a model that supports tool calling
438
+
512
439
  ## License
513
440
 
514
441
  MIT