zwarm 1.3.9__tar.gz

Files changed (37)
  1. zwarm-1.3.9/.gitignore +23 -0
  2. zwarm-1.3.9/PKG-INFO +525 -0
  3. zwarm-1.3.9/README.md +513 -0
  4. zwarm-1.3.9/pyproject.toml +42 -0
  5. zwarm-1.3.9/src/zwarm/__init__.py +38 -0
  6. zwarm-1.3.9/src/zwarm/adapters/__init__.py +21 -0
  7. zwarm-1.3.9/src/zwarm/adapters/base.py +109 -0
  8. zwarm-1.3.9/src/zwarm/adapters/claude_code.py +357 -0
  9. zwarm-1.3.9/src/zwarm/adapters/codex_mcp.py +968 -0
  10. zwarm-1.3.9/src/zwarm/adapters/registry.py +69 -0
  11. zwarm-1.3.9/src/zwarm/adapters/test_codex_mcp.py +274 -0
  12. zwarm-1.3.9/src/zwarm/adapters/test_registry.py +68 -0
  13. zwarm-1.3.9/src/zwarm/cli/__init__.py +0 -0
  14. zwarm-1.3.9/src/zwarm/cli/main.py +2043 -0
  15. zwarm-1.3.9/src/zwarm/core/__init__.py +0 -0
  16. zwarm-1.3.9/src/zwarm/core/compact.py +329 -0
  17. zwarm-1.3.9/src/zwarm/core/config.py +342 -0
  18. zwarm-1.3.9/src/zwarm/core/environment.py +154 -0
  19. zwarm-1.3.9/src/zwarm/core/models.py +315 -0
  20. zwarm-1.3.9/src/zwarm/core/state.py +355 -0
  21. zwarm-1.3.9/src/zwarm/core/test_compact.py +312 -0
  22. zwarm-1.3.9/src/zwarm/core/test_config.py +160 -0
  23. zwarm-1.3.9/src/zwarm/core/test_models.py +265 -0
  24. zwarm-1.3.9/src/zwarm/orchestrator.py +623 -0
  25. zwarm-1.3.9/src/zwarm/prompts/__init__.py +10 -0
  26. zwarm-1.3.9/src/zwarm/prompts/orchestrator.py +214 -0
  27. zwarm-1.3.9/src/zwarm/sessions/__init__.py +24 -0
  28. zwarm-1.3.9/src/zwarm/sessions/manager.py +589 -0
  29. zwarm-1.3.9/src/zwarm/test_orchestrator_watchers.py +23 -0
  30. zwarm-1.3.9/src/zwarm/tools/__init__.py +17 -0
  31. zwarm-1.3.9/src/zwarm/tools/delegation.py +630 -0
  32. zwarm-1.3.9/src/zwarm/watchers/__init__.py +26 -0
  33. zwarm-1.3.9/src/zwarm/watchers/base.py +131 -0
  34. zwarm-1.3.9/src/zwarm/watchers/builtin.py +424 -0
  35. zwarm-1.3.9/src/zwarm/watchers/manager.py +181 -0
  36. zwarm-1.3.9/src/zwarm/watchers/registry.py +57 -0
  37. zwarm-1.3.9/src/zwarm/watchers/test_watchers.py +237 -0
zwarm-1.3.9/.gitignore ADDED
@@ -0,0 +1,23 @@
1
+ .DS_Store
2
+ .ipynb_checkpoints
3
+ .python-version
4
+ __pycache__/
5
+ .env
6
+ *.egg-info/
7
+ *.qzl
8
+ *.zip
9
+ *.7z
10
+ *.csv
11
+ runs/
12
+ _tmp_
13
+ .f1_state/
14
+
15
+ # Node.js / Frontend
16
+ node_modules/
17
+ dist/
18
+ dist-ssr/
19
+ *.local
20
+
21
+ jobs/
22
+
23
+ .zwarm/
zwarm-1.3.9/PKG-INFO ADDED
@@ -0,0 +1,525 @@
1
+ Metadata-Version: 2.4
2
+ Name: zwarm
3
+ Version: 1.3.9
4
+ Summary: Multi-Agent CLI Orchestration Research Platform
5
+ Requires-Python: <3.14,>=3.13
6
+ Requires-Dist: python-dotenv>=1.0.0
7
+ Requires-Dist: pyyaml>=6.0
8
+ Requires-Dist: rich>=13.0.0
9
+ Requires-Dist: typer>=0.9.0
10
+ Requires-Dist: wbal>=0.4.0
11
+ Description-Content-Type: text/markdown
12
+
13
+ # zwarm
14
+
15
+ Multi-agent CLI orchestration research platform. Coordinate multiple coding agents (Codex, Claude Code) with delegation, conversation, trajectory alignment, and automatic context management.
16
+
17
+ ## Key Features
18
+
19
+ - **Multi-adapter support**: Codex MCP, Claude Code adapters with unified interface
20
+ - **Sync & async modes**: Conversational (iterative refinement) or fire-and-forget
21
+ - **Token tracking**: Per-session token usage tracked and persisted for cost analysis
22
+ - **Context compaction**: Automatic LRU-style pruning when approaching context limits
23
+ - **Trajectory watchers**: Composable guardrails (progress, budget, scope, pattern, delegation)
24
+ - **State persistence**: Resume sessions, track history, replay events
25
+ - **Weave integration**: Full tracing and observability
26
+
27
+ ## Installation
28
+
29
+ ```bash
30
+ # From the workspace (recommended during development)
31
+ cd /path/to/labs
32
+ uv sync
33
+
34
+ # Or install directly
35
+ uv pip install -e ./zwarm
36
+ ```
37
+
38
+ **Requirements:**
39
+ - Python 3.13+
40
+ - `codex` CLI installed (for Codex adapter)
41
+ - `claude` CLI installed (for Claude Code adapter)
42
+
43
+ ## Quick Start
44
+
45
+ ```bash
46
+ # 1. Initialize zwarm in your project
47
+ zwarm init
48
+
49
+ # 2. Test an executor directly
50
+ zwarm exec --task "What is 2+2?"
51
+
52
+ # 3. Run the orchestrator with a task
53
+ zwarm orchestrate --task "Create a hello world Python function"
54
+
55
+ # 4. Check state after running
56
+ zwarm status
57
+
58
+ # 5. View event history
59
+ zwarm history
60
+ ```
61
+
62
+ ### Task Input Options
63
+
64
+ ```bash
65
+ # Direct task
66
+ zwarm orchestrate --task "Build a REST API"
67
+
68
+ # From file
69
+ zwarm orchestrate --task-file task.md
70
+
71
+ # From stdin
72
+ echo "Fix the bug in auth.py" | zwarm orchestrate
73
+ ```
74
+
75
+ ## Configuration
76
+
77
+ zwarm looks for configuration in this order:
78
+ 1. `--config` flag (YAML file)
79
+ 2. `.zwarm/config.toml` (created by `zwarm init`)
80
+ 3. `config.toml` in working directory (legacy, for backwards compat)
81
+ 4. Default settings
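
The lookup order above can be sketched as a small resolver. This is only an illustration under stated assumptions: `resolve_config_path` and its parameters are hypothetical names, not zwarm's actual API, and the sketch only decides which file would be read, not how it is parsed.

```python
from pathlib import Path
from typing import Optional


def resolve_config_path(working_dir: Path, cli_config: Optional[Path] = None) -> Optional[Path]:
    """Return the config file to use, following the documented lookup order.

    Returns None when no file is found, meaning default settings apply.
    (Hypothetical helper for illustration only.)
    """
    # 1. An explicit --config value always wins.
    if cli_config is not None:
        return cli_config
    # 2. .zwarm/config.toml, then 3. the legacy top-level config.toml.
    candidates = (
        working_dir / ".zwarm" / "config.toml",
        working_dir / "config.toml",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # 4. Nothing found: the caller falls back to defaults.
    return None
```

Returning `None` rather than a default object keeps the "which file" decision separate from the "what settings" decision, which makes the order easy to test in isolation.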
82
+
83
+ ### Minimal .zwarm/config.toml
84
+
85
+ ```toml
86
+ [weave]
87
+ enabled = true
88
+ project = "your-wandb-entity/zwarm"
89
+
90
+ [executor]
91
+ adapter = "codex_mcp" # or "claude_code"
92
+ ```
93
+
94
+ ### Environment Variables
95
+
96
+ ```bash
97
+ # Weave tracing (optional but recommended)
98
+ export WEAVE_PROJECT="your-entity/zwarm"
99
+
100
+ # Executor authentication (required - set based on which adapter you use)
101
+ export OPENAI_API_KEY="sk-..." # Required for codex_mcp adapter
102
+ export ANTHROPIC_API_KEY="sk-ant-..." # Required for claude_code adapter
103
+ ```
104
+
105
+ **Important:** The orchestrator agent runs with your credentials, but the executor adapters (Codex, Claude Code) need their own authentication. If executors fail with auth errors, check that the appropriate API key is set in your environment.
106
+
107
+ You can also put these in a `.env` file in your project root - zwarm will load it automatically.
108
+
109
+ ### Full Configuration Reference
110
+
111
+ ```yaml
112
+ # config.yaml
113
+ orchestrator:
114
+   lm: gpt-5-mini # Model for the orchestrator itself
115
+   max_steps: 100 # Maximum orchestrator steps
116
+   compaction: # Context window management
117
+     enabled: true
118
+     max_tokens: 100000 # Trigger compaction above this
119
+     threshold_pct: 0.85 # Compact at 85% of max
120
+     target_pct: 0.7 # Target 70% after compaction
121
+     keep_first_n: 2 # Always keep system + task
122
+     keep_last_n: 10 # Always keep recent context
123
+
124
+ executor:
125
+   adapter: codex_mcp # Default adapter: codex_mcp | claude_code
126
+   model: null # Model override (null = use adapter default)
127
+   # codex_mcp default: gpt-5.1-codex-mini
128
+   # claude_code default: claude-sonnet-4-5-20250514
129
+   sandbox: workspace-write # Codex sandbox mode
130
+
131
+ weave:
132
+   enabled: true
133
+   project: your-entity/zwarm
134
+
135
+ state_dir: .zwarm # State directory for sessions/events
136
+
137
+ watchers:
138
+   enabled: true
139
+   message_role: user # Role for nudge messages: user | assistant | system
140
+   watchers:
141
+     - name: progress
142
+     - name: budget
143
+       config:
144
+         max_steps: 50
145
+         max_sessions: 10
146
+     - name: delegation_reminder
147
+       config:
148
+         threshold: 10 # Nudge after N consecutive non-delegation calls
149
+         lookback: 30 # How many messages to check
150
+     - name: scope
151
+       config:
152
+         keywords: []
153
+ ```
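
The compaction settings in this reference (`max_tokens`, `threshold_pct`, `target_pct`, `keep_first_n`, `keep_last_n`) suggest a trigger-then-prune loop. The sketch below is one plausible reading, not the code in `core/compact.py`: it assumes per-message token counts are supplied by the caller and drops the oldest unprotected messages first as a stand-in for the README's "LRU-style" pruning. Both function names are hypothetical.

```python
from typing import List, Sequence


def should_compact(used_tokens: int, max_tokens: int = 100_000,
                   threshold_pct: float = 0.85) -> bool:
    """True once usage reaches threshold_pct of the token budget."""
    return used_tokens >= max_tokens * threshold_pct


def compact(messages: Sequence[str], token_counts: Sequence[int],
            max_tokens: int = 100_000, target_pct: float = 0.7,
            keep_first_n: int = 2, keep_last_n: int = 10) -> List[str]:
    """Drop the oldest unprotected messages until usage fits the target.

    The first keep_first_n and last keep_last_n messages are never dropped,
    so the result can stay above the target if only protected messages remain.
    """
    target = max_tokens * target_pct
    total = sum(token_counts)
    n = len(messages)
    # Index where the protected tail begins (never earlier than the head).
    tail_start = max(keep_first_n, n - keep_last_n)
    dropped = set()
    for i in range(keep_first_n, tail_start):
        if total <= target:
            break
        dropped.add(i)
        total -= token_counts[i]
    return [m for i, m in enumerate(messages) if i not in dropped]
```

With the documented defaults, compaction would start at 85,000 tokens and aim for 70,000, always retaining the first two and the last ten messages.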
154
+
155
+ ## Adapters
156
+
157
+ zwarm supports multiple CLI coding agents through adapters. Each adapter wraps a different coding CLI and handles the mechanics of starting sessions, sending messages, and capturing responses.
158
+
159
+ ### Codex MCP (default)
160
+
161
+ Uses Codex via MCP server for true conversational sessions. This is the recommended adapter for iterative work where you need back-and-forth refinement.
162
+
163
+ ```bash
164
+ # Sync mode (conversational)
165
+ zwarm exec --adapter codex_mcp --task "Add a login function"
166
+
167
+ # The orchestrator can have back-and-forth conversations
168
+ # using delegate() and converse() tools
169
+ ```
170
+
171
+ | Setting | Value |
172
+ |---------|-------|
173
+ | Default model | `gpt-5.1-codex-mini` |
174
+ | Requires | `codex` CLI installed |
175
+ | Auth | `OPENAI_API_KEY` environment variable |
176
+
177
+ ### Claude Code
178
+
179
+ Uses Claude Code CLI for execution. Good alternative when you want Claude's capabilities.
180
+
181
+ ```bash
182
+ zwarm exec --adapter claude_code --task "Fix the type errors"
183
+ ```
184
+
185
+ | Setting | Value |
186
+ |---------|-------|
187
+ | Default model | `claude-sonnet-4-5-20250514` |
188
+ | Requires | `claude` CLI installed and authenticated |
189
+ | Auth | `ANTHROPIC_API_KEY` or `claude` CLI auth |
190
+
191
+ ### Model Selection
192
+
193
+ Models are selected with this precedence (highest to lowest):
194
+
195
+ 1. **Per-delegation override**: `delegate(task="...", model="o3")`
196
+ 2. **Config file**: `executor.model` in config.toml or zwarm.yaml
197
+ 3. **Adapter default**: Each adapter has a sensible default
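
The precedence above reduces to a first-non-empty lookup. A minimal sketch follows: the default model names are taken from this README, while `resolve_model` and `ADAPTER_DEFAULTS` are hypothetical names used only for illustration.

```python
from typing import Optional

# Adapter defaults as documented in this README.
ADAPTER_DEFAULTS = {
    "codex_mcp": "gpt-5.1-codex-mini",
    "claude_code": "claude-sonnet-4-5-20250514",
}


def resolve_model(adapter: str, config_model: Optional[str] = None,
                  override: Optional[str] = None) -> str:
    """Pick a model: per-delegation override, then config, then adapter default."""
    if override:
        return override
    if config_model:
        return config_model
    return ADAPTER_DEFAULTS[adapter]
```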
198
+
199
+ ```yaml
200
+ # config.toml - override the default model
201
+ [executor]
202
+ adapter = "codex_mcp"
203
+ model = "gpt-5.1-codex-max" # Use the more capable model
204
+ ```
205
+
206
+ ```bash
207
+ # Or override per-execution
208
+ zwarm exec --model gpt-5.1-codex-max --task "Complex refactoring"
209
+ ```
210
+
211
+ ## Watchers (Trajectory Alignment)
212
+
213
+ Watchers are composable guardrails that monitor agent behavior and can intervene when things go wrong.
214
+
215
+ ### Available Watchers
216
+
217
+ | Watcher | Description |
218
+ |---------|-------------|
219
+ | `progress` | Detects stuck/spinning agents |
220
+ | `budget` | Monitors step/session limits (counts only active sessions) |
221
+ | `scope` | Detects scope creep from original task |
222
+ | `pattern` | Custom regex pattern matching |
223
+ | `quality` | Code quality checks |
224
+ | `delegation` | Ensures orchestrator delegates instead of writing code directly |
225
+ | `delegation_reminder` | Nudges after many consecutive non-delegation tool calls (default: 10) |
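
As an illustration of the `delegation_reminder` row, the check can be thought of as counting the trailing run of non-delegation tool calls inside a lookback window. This is a sketch under assumptions: the set of tool names that count as delegation, and the function name `needs_reminder`, are hypothetical and not taken from `watchers/builtin.py`.

```python
from typing import Sequence

# Assumption: these tool names count as delegation for the purpose of the check.
DELEGATION_TOOLS = {"delegate", "converse"}


def needs_reminder(tool_calls: Sequence[str], threshold: int = 10,
                   lookback: int = 30) -> bool:
    """True when the most recent calls contain `threshold` or more
    consecutive non-delegation tool calls (within the last `lookback` calls)."""
    recent = tool_calls[-lookback:]
    streak = 0
    # Walk backwards from the newest call until a delegation call is seen.
    for name in reversed(recent):
        if name in DELEGATION_TOOLS:
            break
        streak += 1
    return streak >= threshold
```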
226
+
227
+ ### Enabling Watchers
228
+
229
+ ```yaml
230
+ # config.yaml
231
+ watchers:
232
+   enabled: true
233
+   message_role: user # How nudges appear: user | assistant | system
234
+   watchers:
235
+     - name: progress
236
+       config:
237
+         max_same_calls: 3 # Flag after 3 identical tool calls
238
+     - name: budget
239
+       config:
240
+         max_steps: 50
241
+         max_sessions: 10
242
+     - name: delegation_reminder
243
+       config:
244
+         threshold: 10 # Nudge after 10 non-delegation calls
245
+     - name: scope
246
+       config:
247
+         avoid_keywords:
248
+           - "refactor everything"
249
+           - "rewrite"
250
+ ```
251
+
252
+ The `message_role` setting controls how watcher nudges are injected:
253
+ - `user` (default): Appears as a user message - strong nudge, agent must respond
254
+ - `assistant`: Appears as a previous assistant thought - softer, agent can continue
255
+ - `system`: Appears as system instruction - authoritative guidance
256
+
257
+ ### Watcher Actions
258
+
259
+ Watchers can return different actions:
260
+ - `continue` - Keep going
261
+ - `warn` - Log warning but continue
262
+ - `pause` - Pause for human review
263
+ - `stop` - Stop the orchestrator
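
When several watchers run on the same step, their results have to be merged somehow. The sketch below assumes the most severe action wins, ordered as in the list above. That merge rule is an assumption for illustration, not documented zwarm behavior, and `Action` and `combine` are hypothetical names.

```python
from enum import IntEnum
from typing import Iterable


class Action(IntEnum):
    """Watcher actions, ordered from least to most severe."""
    CONTINUE = 0
    WARN = 1
    PAUSE = 2
    STOP = 3


def combine(actions: Iterable[Action]) -> Action:
    """Merge results from several watchers; the most severe action wins.

    With no watchers reporting, the orchestrator simply continues.
    """
    return max(actions, default=Action.CONTINUE)
```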
264
+
265
+ ## Weave Integration
266
+
267
+ zwarm integrates with [Weave](https://wandb.ai/site/weave) for tracing and observability.
268
+
269
+ ### Enabling Weave
270
+
271
+ ```bash
272
+ # Via environment variable
273
+ export WEAVE_PROJECT="your-entity/zwarm"
274
+
275
+ # Or via config.toml (TOML shown as comments, since this block is shell):
275
+ #   [weave]
276
+ #   enabled = true
277
+ #   project = "your-entity/zwarm"
279
+ ```
280
+
281
+ ### What Gets Traced
282
+
283
+ - Orchestrator `step()` calls with tool inputs/outputs
284
+ - Individual adapter calls (`_call_codex`, `_call_claude`)
285
+ - Delegation tools (`delegate`, `converse`, `end_session`)
286
+ - All tool executions
287
+
288
+ View traces at: `https://wandb.ai/your-entity/zwarm/weave`
289
+
290
+ ## CLI Reference
291
+
292
+ ### init
293
+
294
+ Initialize zwarm in a project directory.
295
+
296
+ ```bash
297
+ zwarm init [OPTIONS]
298
+
299
+ Options:
300
+ -w, --working-dir PATH Working directory [default: .]
301
+ -y, --yes Accept defaults, no prompts
302
+ --with-project Also create zwarm.yaml project config
303
+ ```
304
+
305
+ **What it creates:**
306
+
307
+ 1. `.zwarm/config.toml` - User settings (Weave project, adapter preferences, watchers)
308
+ 2. `.zwarm/` - State directory for sessions and events
309
+ 3. `zwarm.yaml` (optional) - Project-specific task configuration
310
+
311
+ **Examples:**
312
+
313
+ ```bash
314
+ # Interactive setup with prompts
315
+ zwarm init
316
+
317
+ # Non-interactive with defaults
318
+ zwarm init --yes
319
+
320
+ # Create project config too
321
+ zwarm init --with-project
322
+
323
+ # Initialize in a different directory
324
+ zwarm init --working-dir /path/to/project
325
+ ```
326
+
327
+ ### orchestrate
328
+
329
+ Start an orchestrator session to delegate tasks.
330
+
331
+ ```bash
332
+ zwarm orchestrate [OPTIONS]
333
+
334
+ Options:
335
+ -t, --task TEXT Task description
336
+ -f, --task-file PATH Read task from file
337
+ -c, --config PATH Config file (YAML)
338
+ --adapter TEXT Executor adapter override
339
+ --resume Resume from previous state
340
+ --set KEY=VALUE Override config values
341
+ ```
342
+
343
+ ### exec
344
+
345
+ Run a single executor directly (for testing). This bypasses the orchestrator entirely and hits the adapter (Codex/Claude) immediately with your task - useful for verifying adapters work before running full orchestration.
346
+
347
+ ```bash
348
+ zwarm exec [OPTIONS]
349
+
350
+ Options:
351
+ -t, --task TEXT Task to execute
352
+ --adapter TEXT Adapter to use [default: codex_mcp]
353
+ --model TEXT Model override
354
+ --mode [sync|async] Execution mode [default: sync]
355
+ ```
356
+
357
+ **Note:** Unlike `orchestrate`, this does NOT use watchers, compaction, state persistence, or multi-step planning. It's a single direct call to the executor.
358
+
359
+ ### status
360
+
361
+ Show current orchestrator state.
362
+
363
+ ```bash
364
+ zwarm status [OPTIONS]
365
+
366
+ Options:
367
+ --sessions Show session details
368
+ --tasks Show task details
369
+ --json Output as JSON
370
+ ```
371
+
372
+ ### history
373
+
374
+ Show event history.
375
+
376
+ ```bash
377
+ zwarm history [OPTIONS]
378
+
379
+ Options:
380
+ -n, --limit INTEGER Number of events [default: 20]
381
+ --session TEXT Filter by session ID
382
+ --json Output as JSON
383
+ ```
384
+
385
+ ### configs
386
+
387
+ Manage configuration files.
388
+
389
+ ```bash
390
+ zwarm configs list # List available configs
391
+ zwarm configs show NAME # Show config contents
392
+ ```
393
+
394
+ ### clean
395
+
396
+ Clean up zwarm state (useful for starting fresh).
397
+
398
+ ```bash
399
+ zwarm clean [OPTIONS]
400
+
401
+ Options:
402
+ --all Remove everything (events, sessions, state)
403
+ --events Remove only events
404
+ --sessions Remove only sessions
405
+ -y, --yes Skip confirmation prompt
406
+ ```
407
+
408
+ **Examples:**
409
+
410
+ ```bash
411
+ # Clean everything and start fresh
412
+ zwarm clean --all --yes
413
+
414
+ # Clean only events log
415
+ zwarm clean --events
416
+ ```
417
+
418
+ ## Architecture
419
+
420
+ ```
421
+ ┌─────────────────────────────────────────────────────────┐
422
+ │                      Orchestrator                       │
423
+ │  (Plans, delegates, supervises - does NOT write code)   │
424
+ ├─────────────────────────────────────────────────────────┤
425
+ │                    Delegation Tools                     │
426
+ │  delegate() | converse() | check_session() | bash()     │
427
+ └───────────────┬─────────────────────┬───────────────────┘
428
+                 │                     │
429
+         ┌───────▼───────┐     ┌───────▼───────┐
430
+         │   Codex MCP   │     │  Claude Code  │
431
+         │    Adapter    │     │    Adapter    │
432
+         └───────┬───────┘     └───────┬───────┘
433
+                 │                     │
434
+         ┌───────▼───────┐     ┌───────▼───────┐
435
+         │     codex     │     │    claude     │
436
+         │  mcp-server   │     │      CLI      │
437
+         └───────────────┘     └───────────────┘
438
+ ```
439
+
440
+ ### Key Concepts
441
+
442
+ - **Orchestrator**: Plans and delegates but never writes code directly
443
+ - **Executors**: CLI agents (Codex, Claude) that do the actual coding
444
+ - **Sessions**: Conversations with executors (sync or async)
445
+ - **Watchers**: Trajectory aligners that monitor and intervene
446
+
447
+ ### State Management
448
+
449
+ All state and config are stored in flat files under `.zwarm/`:
450
+
451
+ ```
452
+ .zwarm/
453
+ ├── config.toml # Runtime settings (weave, adapter, watchers)
454
+ ├── state.json # Current state
455
+ ├── events.jsonl # Append-only event log
456
+ ├── sessions/
457
+ │ └── <session-id>/
458
+ │ ├── messages.json # Conversation history
459
+ │ └── metadata.json # Session info
460
+ └── orchestrator/
461
+ └── messages.json # Orchestrator history (for resume)
462
+ ```
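
The `events.jsonl` entry in the layout above is described as an append-only log. A minimal sketch of that pattern, one JSON object per line, is shown here. The event fields used in the usage example and the helper names `append_event` and `read_events` are hypothetical, not the schema or API in `core/state.py`.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def append_event(log_path: Path, event: Dict[str, Any]) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def read_events(log_path: Path, limit: int = 20) -> List[Dict[str, Any]]:
    """Return the most recent `limit` events, oldest first."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        events = [json.loads(line) for line in f if line.strip()]
    return events[-limit:]
```

For example, `append_event(Path(".zwarm/events.jsonl"), {"kind": "session_started"})` followed by `read_events(Path(".zwarm/events.jsonl"))` would return that event as the last item. The default `limit` mirrors the documented `zwarm history` default of 20.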
463
+
464
+ ## Development
465
+
466
+ ### Running Tests
467
+
468
+ ```bash
469
+ # Run all zwarm tests (68 tests)
470
+ uv run pytest src/zwarm/ -v
471
+
472
+ # Run specific test modules
473
+ uv run pytest src/zwarm/core/test_compact.py -v # Context compaction
474
+ uv run pytest src/zwarm/watchers/test_watchers.py -v # Watchers
475
+ uv run pytest src/zwarm/adapters/test_codex_mcp.py -v # Codex adapter
476
+
477
+ # Run integration tests (requires codex CLI)
478
+ uv run pytest -m integration
479
+ ```
480
+
481
+ ### Project Structure
482
+
483
+ ```
484
+ zwarm/
485
+ ├── src/zwarm/
486
+ │ ├── adapters/ # Executor adapters
487
+ │ │ ├── base.py # ExecutorAdapter protocol
488
+ │ │ ├── codex_mcp.py # Codex MCP adapter (with token tracking)
489
+ │ │ └── claude_code.py # Claude Code adapter (with token tracking)
490
+ │ ├── cli/
491
+ │ │ └── main.py # Typer CLI
492
+ │ ├── core/
493
+ │ │ ├── compact.py # Context window compaction (LRU pruning)
494
+ │ │ ├── config.py # Configuration loading
495
+ │ │ ├── environment.py # OrchestratorEnv (progress display)
496
+ │ │ ├── models.py # ConversationSession, Message, Event, etc.
497
+ │ │ └── state.py # Flat-file state management
498
+ │ ├── tools/
499
+ │ │ └── delegation.py # delegate, converse, check_session, etc.
500
+ │ ├── watchers/
501
+ │ │ ├── base.py # Watcher protocol
502
+ │ │ ├── builtin.py # Built-in watchers (progress, budget, scope, etc.)
503
+ │ │ ├── registry.py # Watcher registration
504
+ │ │ └── manager.py # WatcherManager
505
+ │ ├── prompts/
506
+ │ │ └── orchestrator.py # Orchestrator system prompt
507
+ │ └── orchestrator.py # Main Orchestrator class
508
+ ├── configs/ # Example configurations
509
+ ├── README.md
510
+ └── pyproject.toml
511
+ ```
512
+
513
+ ## Research Context
514
+
515
+ zwarm is a research platform exploring:
516
+
517
+ 1. **Agent reliability** - Can orchestrators reliably delegate and verify work?
518
+ 2. **Agent meta-capability** - Can agents effectively use other agents?
519
+ 3. **Long-running agents** - Can agents run for days, not hours?
520
+
521
+ See [ZWARM_PLAN.md](ZWARM_PLAN.md) for detailed design documentation.
522
+
523
+ ## License
524
+
525
+ Research project - see repository license.