nightshift-mcp 1.1.0 → 1.1.2

package/README.md CHANGED
@@ -1,971 +1,971 @@
1
- # NightShift MCP
2
-
3
- **The responsible kind of multi-agent chaos.**
4
-
5
- Explicit delegation, review, and handoffs between AI models.
6
-
7
- ---
8
-
9
- An MCP (Model Context Protocol) server for agent teams and multi-agent orchestration across different AI models. Coordinate Claude, Codex, Gemini, Goose, and local Ollama models as an agentic team — with structured delegation, shared task management, and autonomous workflows. Works with any MCP-compatible client.
10
-
11
- ## Features
12
-
13
- - **Multi-agent chat**: Structured inter-agent messaging with agent name, timestamp, type, and content
14
- - **Failover handling**: Seamless handoffs when an agent hits limits or context windows fill up
15
- - **PRD-driven task management**: Work through user stories in prd.json with dependencies (`dependsOn`) and tags for smart routing
16
- - **Progress tracking**: Shared learnings via progress.txt
17
- - **Selective context retrieval**: Topic-based context store lets agents query relevant context instead of prompt-stuffing
18
- - **Execution tracing**: Structured trace of agent spawns, completions, and failures with parent-child tree visualization
19
- - **Agent spawning & orchestration**: Spawn Claude, Codex, Gemini, Goose, or local Ollama models as subprocesses with full lifecycle tracking
20
- - **Agent availability checks**: Pre-flight validation before spawning — never fails silently on missing agents
21
- - **Local model support**: Use Goose + Ollama for free, private, offline agent work with local models
22
- - **Autonomous orchestration**: Single `orchestrate` tool runs a claim→implement→complete loop until all stories pass
23
- - **Agent status tracking**: Monitor spawned agents by PID, check exit codes, and tail output in real-time
24
- - **Smart retry**: Automatically suggests or uses a different agent when one fails
25
- - **Workflow management**: Phases, strategic decisions, and agent assignments
26
- - **Watch/polling**: Monitor for new messages with cursor-based polling
27
- - **Auto-archiving**: Archive old messages to keep the chat file manageable
28
- - **Cross-platform**: Works on Windows, Linux, and macOS (uses cross-spawn and platform-safe process management)
29
- - **Heterogeneous agent teams**: Mix different AI models — use each for what it's best at
30
- - **Universal compatibility**: Works with any MCP-supporting tool (49 tools across 10 categories)
31
- - **Flexible savepoints**: Git-based checkpoints when available, JSON state snapshots when not
32
- - **Simple file-based storage**: No external services required
33
-
34
- ## What's New in 1.1.0
35
-
36
- - **Local model support**: New `ollama` agent type + Goose/Ollama integration for local model tool use
37
- - **Agent availability checks**: All spawn/delegate tools validate agents before attempting — clear errors with available alternatives
38
- - **PRD dependencies**: `dependsOn` field prevents stories from running before their prerequisites complete
39
- - **PRD tags**: `tags` field enables smart agent routing (e.g., `["code"]` routes to codex, `["research"]` routes to gemini)
40
- - **Optional savepoints**: Works without git — falls back to JSON state snapshots
41
- - **Persistent agent tracking**: Agent registry survives restarts (`.robot-chat/agent-registry.json`)
42
- - **Flexible workflow phases**: Custom phase names beyond the built-in set
43
- - **Benchmark suite**: Test local model tool-use reliability (`benchmarks/run-experiment.mjs`)
44
-
45
- ## Installation
46
-
47
- **Via npm (recommended):**
48
- ```bash
49
- npm install -g nightshift-mcp
50
- ```
51
-
52
- **Updating:**
53
- ```bash
54
- npm update -g nightshift-mcp
55
- ```
56
-
57
- **Or build from source:**
58
- ```bash
59
- git clone <repo-url>
60
- cd nightshift-mcp
61
- npm install
62
- npm run build
63
- npm link # makes 'nightshift-mcp' available globally
64
- ```
65
-
66
- ## Configuration
67
-
68
- ### Claude Code (`~/.claude.json`)
69
-
70
- ```json
71
- {
72
-   "mcpServers": {
73
-     "nightshift": {
74
-       "command": "nightshift-mcp",
75
-       "args": []
76
-     }
77
-   }
78
- }
79
- ```
80
-
81
- ### Codex (`~/.codex/config.toml`)
82
-
83
- ```toml
84
- [mcp_servers.nightshift]
85
- command = "nightshift-mcp"
86
- args = []
87
- ```
88
-
89
- The server automatically uses the current working directory for the `.robot-chat/` folder. You can override this with the `ROBOT_CHAT_PROJECT_PATH` environment variable if needed.
90
-
91
- ## Usage
92
-
93
- For agents to communicate, they must be running in the **same project directory**. The chat file is created at `<project>/.robot-chat/chat.txt` based on where each CLI is started.
94
-
95
- **Example - two agents working on the same project:**
96
-
97
- ```bash
98
- # Terminal 1
99
- cd ~/myproject
100
- claude
101
-
102
- # Terminal 2
103
- cd ~/myproject
104
- codex
105
- ```
106
-
107
- Both agents now share the same chat file and can coordinate via the nightshift tools.
108
-
109
- **Note:** If agents are started in different directories, they will have separate chat files and won't be able to communicate.
110
-
111
- ## Tools
112
-
113
- ### `read_robot_chat`
114
-
115
- Read recent messages from the chat file.
116
-
117
- **Parameters:**
118
- - `limit` (optional): Maximum messages to return (default: 20)
119
- - `agent` (optional): Filter by agent name
120
- - `type` (optional): Filter by message type
121
-
122
- **Example:**
123
- ```
124
- Read the last 10 messages from Claude
125
- ```
126
-
127
- ### `write_robot_chat`
128
-
129
- Write a message to the chat file.
130
-
131
- **Parameters:**
132
- - `agent` (required): Your agent name (e.g., "Claude", "Codex")
133
- - `type` (required): Message type
134
- - `content` (required): Message content
135
-
136
- **Message Types:**
137
- - `FAILOVER_NEEDED` - Request another agent to take over
138
- - `FAILOVER_CLAIMED` - Acknowledge taking over a task
139
- - `TASK_COMPLETE` - Mark a task as finished
140
- - `STATUS_UPDATE` - Share progress update
141
- - `HANDOFF` - Pass work to a specific agent
142
- - `INFO` - General information
143
- - `ERROR` - Error report
144
- - `QUESTION` - Ask other agents a question
145
- - `ANSWER` - Answer a question
146
-
147
- **Example:**
148
- ```
149
- Post a STATUS_UPDATE from Claude about completing the login form
150
- ```
151
-
152
- ### `check_failovers`
153
-
154
- Find unclaimed FAILOVER_NEEDED messages.
155
-
156
- **Example:**
157
- ```
158
- Check if any agent needs help with their task
159
- ```
160
-
161
- ### `claim_failover`
162
-
163
- Claim a failover request from another agent.
164
-
165
- **Parameters:**
166
- - `agent` (required): Your agent name
167
- - `originalAgent` (required): Agent who requested failover
168
- - `task` (optional): Task description
169
-
170
- **Example:**
171
- ```
172
- Claim the failover from Codex and continue working on the authentication feature
173
- ```
174
-
175
- ### `get_chat_path`
176
-
177
- Get the full path to the chat file.
178
-
179
- ### `list_agents`
180
-
181
- List all agents who have posted to the chat, with their activity stats.
182
-
183
- **Returns:**
184
- - Agent name
185
- - Last seen timestamp
186
- - Last message type
187
- - Total message count
188
-
189
- **Example:**
190
- ```
191
- Show me which agents have been active in the chat
192
- ```
193
-
194
- ### `watch_chat`
195
-
196
- Poll for new messages since a cursor position. Useful for monitoring the chat for updates.
197
-
198
- **Parameters:**
199
- - `cursor` (optional): Line number from previous watch call. Omit to get current cursor.
200
-
201
- **Returns:**
202
- - `cursor`: Updated cursor for next call
203
- - `messageCount`: Number of new messages
204
- - `messages`: Array of new messages
205
-
206
- **Example workflow:**
207
- ```
208
- 1. Call watch_chat without cursor to get initial position
209
- 2. Store the returned cursor value
210
- 3. Call watch_chat with that cursor to get new messages
211
- 4. Update your cursor with the returned value
212
- 5. Repeat steps 3-4 to poll for updates
213
- ```
214
-
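The polling workflow above can be sketched as follows. This is a minimal illustration and not NightShift's implementation: `ChatLog` is a hypothetical in-memory stand-in for `chat.txt`, the cursor is a message index rather than a file line number, and returning an empty message list when no cursor is passed is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatLog:
    """In-memory stand-in for chat.txt (hypothetical; one list item per message)."""

    messages: list = field(default_factory=list)

    def watch(self, cursor: Optional[int] = None) -> dict:
        """Mimic the documented return shape: cursor, messageCount, messages."""
        if cursor is None:
            # No cursor supplied: report the current position only (assumption).
            return {"cursor": len(self.messages), "messageCount": 0, "messages": []}
        new_messages = self.messages[cursor:]
        return {
            "cursor": len(self.messages),
            "messageCount": len(new_messages),
            "messages": new_messages,
        }


log = ChatLog(["[Claude @ 14:32] STATUS_UPDATE"])
cursor = log.watch()["cursor"]  # steps 1-2: fetch and store the initial cursor
log.messages.append("[Codex @ 14:45] FAILOVER_NEEDED")
result = log.watch(cursor)  # step 3: ask for messages after the cursor
cursor = result["cursor"]  # step 4: advance the cursor for the next poll
print(result["messageCount"], result["messages"])
```

Because the caller keeps the cursor, the log itself needs no per-agent state, so any number of agents can poll it independently.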
215
- ### `archive_chat`
216
-
217
- Archive old messages to a date-stamped file, keeping recent messages in main chat.
218
-
219
- **Parameters:**
220
- - `keepRecent` (optional): Number of messages to keep (default: 50)
221
-
222
- **Example:**
223
- ```
224
- Archive old messages, keeping the last 20
225
- ```
226
-
227
- ## Chat File Format
228
-
229
- Messages are stored in a human-readable format:
230
-
231
- ```
232
- # Robot Chat - Multi-Agent Communication
233
- # Format: [AgentName @ HH:MM] MESSAGE_TYPE
234
- # ========================================
235
-
236
- [Claude @ 14:32] STATUS_UPDATE
237
- Working on implementing the login form.
238
- Files modified: src/components/LoginForm.tsx
239
-
240
- [Codex @ 14:45] FAILOVER_NEEDED
241
- Status: Hit rate limit
242
- Current Task: Implementing user authentication
243
- Progress: 60% - login form done, need logout and session handling
244
- Files Modified: src/auth/login.tsx, src/api/auth.ts
245
-
246
- Requesting another agent continue this work.
247
-
248
- [Claude @ 14:47] FAILOVER_CLAIMED
249
- Claiming failover from Codex.
250
- Continuing task: Implementing user authentication
251
- ```
252
-
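A minimal sketch of reading this format back into structured messages, assuming every message starts with a `[AgentName @ HH:MM] MESSAGE_TYPE` header line and runs until the next header. The `HEADER` regex and `parse_chat` helper are illustrative names and are not part of the package.

```python
import re

# Header line such as "[Claude @ 14:32] STATUS_UPDATE".
HEADER = re.compile(
    r"^\[(?P<agent>[^@\]]+?) @ (?P<time>\d{2}:\d{2})\] (?P<type>[A-Z_]+)$"
)


def parse_chat(text: str) -> list:
    """Split chat text into dicts with agent, time, type, and content."""
    messages = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {**match.groupdict(), "content": []}
            messages.append(current)
        elif current is not None:
            # Body line of the current message; lines before the first
            # header (the "#" banner) are ignored.
            current["content"].append(line)
    for message in messages:
        message["content"] = "\n".join(message["content"]).strip()
    return messages


SAMPLE = """# Robot Chat - Multi-Agent Communication
# Format: [AgentName @ HH:MM] MESSAGE_TYPE

[Claude @ 14:32] STATUS_UPDATE
Working on implementing the login form.
Files modified: src/components/LoginForm.tsx

[Codex @ 14:45] FAILOVER_NEEDED
Status: Hit rate limit
"""

parsed = parse_chat(SAMPLE)
print([(m["agent"], m["type"]) for m in parsed])
```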
253
- ## Testing
254
-
255
- ### With MCP Inspector
256
-
257
- ```bash
258
- npx @modelcontextprotocol/inspector node /path/to/nightshift-mcp/dist/index.js /tmp/test-project
259
- ```
260
-
261
- ### Manual Testing
262
-
263
- ```bash
264
- # Set project path and run
265
- ROBOT_CHAT_PROJECT_PATH=/tmp/test-project node dist/index.js
266
- ```
267
-
268
- ## Development
269
-
270
- ```bash
271
- # Watch mode for development
272
- npm run dev
273
-
274
- # Build
275
- npm run build
276
- ```
277
-
278
- ## Ralph-Style Task Management
279
-
280
- NightShift includes Ralph-compatible PRD and progress management, enabling structured autonomous development.
281
-
282
- ### Setup
283
-
284
- 1. Create a `prd.json` in your project root:
285
-
286
- ```json
287
- {
288
-   "project": "MyApp",
289
-   "description": "Feature description",
290
-   "userStories": [
291
-     {
292
-       "id": "US-001",
293
-       "title": "Set up project structure",
294
-       "description": "Initialize the project with routing and base components",
295
-       "acceptanceCriteria": ["Add routes", "Create base components", "Typecheck passes"],
296
-       "priority": 1,
297
-       "passes": false,
298
-       "notes": "",
299
-       "tags": ["infrastructure"]
300
-     },
301
-     {
302
-       "id": "US-002",
303
-       "title": "Add database field",
304
-       "description": "As a developer, I need to store the new field",
305
-       "acceptanceCriteria": ["Add column to table", "Run migration", "Typecheck passes"],
306
-       "priority": 2,
307
-       "passes": false,
308
-       "notes": "",
309
-       "dependsOn": ["US-001"],
310
-       "tags": ["code", "infrastructure"]
311
-     }
312
-   ]
313
- }
314
- ```
315
-
316
- ### PRD Schema
317
-
318
- | Field | Type | Required | Default | Description |
319
- |-------|------|----------|---------|-------------|
320
- | `project` | string | no | — | Project name |
321
- | `description` | string | no | "" | Project description |
322
- | **`userStories`** | array | **yes** | — | Array of user story objects |
323
-
324
- **User Story fields:**
325
-
326
- | Field | Type | Required | Default | Description |
327
- |-------|------|----------|---------|-------------|
328
- | **`id`** | string | **yes** | — | Unique ID (e.g., "US-001") |
329
- | **`title`** | string | **yes** | — | Short title |
330
- | `description` | string | no | "" | Detailed description |
331
- | `acceptanceCriteria` | string[] | no | [] | Criteria for completion |
332
- | `priority` | number | no | 999 | Lower = higher priority |
333
- | `passes` | boolean | no | false | Whether the story is complete |
334
- | `notes` | string | no | "" | Implementation notes |
335
- | `dependsOn` | string[] | no | — | Story IDs that must complete first |
336
- | `tags` | string[] | no | — | Labels for routing (e.g., `["code", "security"]`) |
337
-
338
- **Tags and routing:** When the orchestrator uses `adaptive` strategy, tags influence which agent gets assigned:
339
- - `research`, `planning`, `documentation` → prefers gemini/claude
340
- - `code`, `implementation`, `feature` → prefers codex/claude
341
- - `security`, `architecture`, `infrastructure` → prefers claude
342
-
343
- ### PRD Validation
344
-
345
- NightShift validates your `prd.json` with Zod schemas and provides helpful error messages when common mistakes are detected:
346
-
347
- - Using `stories` instead of `userStories` → suggests the correct field name
348
- - Using `acceptance_criteria` instead of `acceptanceCriteria` → suggests the correct field name
349
- - Missing required fields (`id`, `title`) → identifies which story has the issue
350
- - Optional fields default gracefully (`passes` → false, `notes` → "", `acceptanceCriteria` → [])
351
-
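The checks listed above can be approximated in a few lines. The README says NightShift uses Zod schemas; the sketch below is plain Python, not the package's code, and only illustrates the documented behavior (field-name hints, required `id` and `title`, defaults for optional fields). `validate_prd`, `STORY_ALIASES`, and the error wording are hypothetical.

```python
import copy

# Common misspellings mapped to the field names the schema expects.
STORY_ALIASES = {"acceptance_criteria": "acceptanceCriteria"}
# Defaults taken from the User Story table above.
STORY_DEFAULTS = {
    "description": "",
    "acceptanceCriteria": [],
    "priority": 999,
    "passes": False,
    "notes": "",
}


def validate_prd(prd: dict) -> tuple:
    """Return (stories with defaults applied, list of error messages)."""
    if "userStories" not in prd:
        hint = ' (found "stories"; did you mean "userStories"?)' if "stories" in prd else ""
        return [], [f'missing required field "userStories"{hint}']
    stories, errors = [], []
    for index, raw in enumerate(prd["userStories"]):
        label = raw.get("id", f"story #{index + 1}")
        for wrong, right in STORY_ALIASES.items():
            if wrong in raw:
                errors.append(f'{label}: use "{right}" instead of "{wrong}"')
        for required in ("id", "title"):
            if required not in raw:
                errors.append(f'{label}: missing required field "{required}"')
        # deepcopy so stories never share the same default list object.
        stories.append({**copy.deepcopy(STORY_DEFAULTS), **raw})
    return stories, errors


stories, errors = validate_prd(
    {"userStories": [{"id": "US-001", "title": "A"}, {"title": "B", "acceptance_criteria": []}]}
)
print(errors)
```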
352
- Use `nightshift_setup(showExamples: true)` for the full schema reference and examples.
353
-
354
- 2. Agents use these tools to work through stories:
355
-
356
- ### PRD Tools
357
-
358
- #### `read_prd`
359
- Read the full PRD with completion summary.
360
-
361
- #### `get_next_story`
362
- Get the highest priority incomplete story.
363
-
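One plausible reading of "highest priority incomplete story", combined with the `dependsOn` rule from the schema, is sketched below. It assumes a story is eligible only when every ID in `dependsOn` already has `passes: true`. How NightShift breaks priority ties or treats unknown dependency IDs is not stated in the README, so the choices here (first match wins, unknown IDs block the story) are assumptions, and `next_story` is a hypothetical helper.

```python
from typing import Optional


def next_story(stories: list) -> Optional[dict]:
    """Pick the incomplete story with the lowest priority number whose
    dependencies have all passed; return None when nothing is eligible."""
    done = {s["id"] for s in stories if s.get("passes", False)}
    ready = [
        s
        for s in stories
        if not s.get("passes", False)
        and all(dep in done for dep in s.get("dependsOn", []))
    ]
    # Lower number = higher priority; 999 is the documented default.
    return min(ready, key=lambda s: s.get("priority", 999), default=None)


backlog = [
    {"id": "US-001", "priority": 2, "passes": False},
    {"id": "US-002", "priority": 1, "passes": False, "dependsOn": ["US-001"]},
]
first = next_story(backlog)  # US-002 has higher priority but is blocked
backlog[0]["passes"] = True
second = next_story(backlog)  # US-001 is done, so US-002 is now eligible
print(first["id"], second["id"])
```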
364
- #### `get_incomplete_stories`
365
- List all remaining work.
366
-
367
- #### `claim_story`
368
- Claim a story and notify other agents via chat.
369
-
370
- **Parameters:**
371
- - `agent` (required): Your agent name
372
- - `storyId` (optional): Specific story to claim
373
-
374
- #### `complete_story`
375
- Mark story complete, log progress, and notify via chat.
376
-
377
- **Parameters:**
378
- - `agent` (required): Your agent name
379
- - `storyId` (required): Story ID
380
- - `summary` (required): What was implemented
381
- - `filesModified` (optional): List of changed files
382
- - `learnings` (optional): Gotchas/patterns for future iterations
383
-
384
- #### `mark_story_complete`
385
- Mark a story as complete without posting a chat notification.
386
-
387
- **Parameters:**
388
- - `storyId` (required): Story ID
389
- - `notes` (optional): Implementation notes
390
-
391
- ### Progress Tools
392
-
393
- #### `read_progress`
394
- Read progress.txt containing learnings from all iterations.
395
-
396
- #### `append_progress`
397
- Add a timestamped progress entry.
398
-
399
- **Parameters:**
400
- - `content` (required): What was done, files changed, learnings
401
-
402
- #### `add_codebase_pattern`
403
- Add a reusable pattern to the Codebase Patterns section.
404
-
405
- **Parameters:**
406
- - `pattern` (required): The pattern (e.g., "Use sql<number> for aggregations")
407
-
408
- ### Context Store
409
-
410
- NightShift includes a selective context retrieval system that replaces prompt-stuffing with topic-based queries. Instead of truncating progress.txt to fit the context window, agents can store and retrieve relevant context on demand.
411
-
412
- Context entries are stored as individual JSON files in `.robot-chat/context/` for concurrent-safe multi-agent access.
413
-
414
- #### `store_context`
415
- Store a context entry for other agents to query later.
416
-
417
- **Parameters:**
418
- - `topic` (required): Topic/category (e.g., "authentication", "database-schema")
419
- - `content` (required): The context to store (learnings, decisions, findings)
420
- - `agent` (required): Your agent name
421
- - `tags` (optional): Tags for better searchability (e.g., ["auth", "jwt"])
422
-
423
- **Example:**
424
- ```
425
- store_context(topic: "authentication", content: "Using JWT with RS256. Refresh tokens stored in httpOnly cookies.", agent: "Claude", tags: ["jwt", "cookies"])
426
- ```
427
-
428
- #### `query_context`
429
- Search stored context entries by topic.
430
-
431
- **Parameters:**
432
- - `topic` (required): Search term (case-insensitive match on topic and tags)
433
- - `limit` (optional): Max entries to return (default: 10)
434
-
435
- **Example:**
436
- ```
437
- query_context(topic: "auth")
438
- # Returns all entries matching "auth" in topic or tags, sorted by recency
439
- ```
440
-
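A rough sketch of the lookup described above, assuming a case-insensitive substring match on topic and tags, newest entries first. The `timestamp` field and the in-memory `entries` list are assumptions; the README only says entries are stored as individual JSON files under `.robot-chat/context/`.

```python
def query_context(entries: list, topic: str, limit: int = 10) -> list:
    """Return up to `limit` entries whose topic or tags contain `topic`."""
    needle = topic.lower()
    matches = [
        entry
        for entry in entries
        if needle in entry["topic"].lower()
        or any(needle in tag.lower() for tag in entry.get("tags", []))
    ]
    # Newest first; "timestamp" is a hypothetical field name.
    matches.sort(key=lambda entry: entry["timestamp"], reverse=True)
    return matches[:limit]


entries = [
    {"topic": "authentication", "content": "Using JWT with RS256.", "tags": ["jwt"], "timestamp": 1},
    {"topic": "database-schema", "content": "Users table has a role column.", "tags": [], "timestamp": 2},
    {"topic": "sessions", "content": "Refresh tokens live in httpOnly cookies.", "tags": ["Auth"], "timestamp": 3},
]
hits = query_context(entries, "auth")
print([hit["topic"] for hit in hits])
```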
441
- #### `list_context`
442
- List all topics in the context store with entry counts.
443
-
444
- **How delegation uses context:**
445
-
446
- When `delegate_story` or `delegate_research` spawns an agent, it queries the context store for entries relevant to the task and includes them in the prompt — instead of blindly truncating progress.txt. Agents are also instructed to use `store_context` to save their learnings, creating a self-enriching context loop.
447
-
448
- ### Execution Tracing
449
-
450
- NightShift automatically traces all agent spawns, completions, and failures into a structured execution log at `.robot-chat/trace.json`. Each trace event has parent-child relationships that can be reconstructed as a tree for debugging multi-agent runs.
451
-
452
- #### `get_trace`
453
- View the execution trace as a flat list or tree.
454
-
455
- **Parameters:**
456
- - `tree` (optional): Return as tree with parent-child relationships (default: false)
457
- - `taskId` (optional): Filter by story/task ID
458
-
459
- **Example:**
460
- ```
461
- get_trace(tree: true)
462
- # Returns tree showing: orchestrator → spawned claude (US-001) → completed
463
- # orchestrator → spawned codex (US-002) → failed → retried with gemini → completed
464
- ```
465
-
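Rebuilding a tree from flat events with parent links might look like the sketch below. The field names `id` and `parentId` are assumptions, since the layout of `trace.json` is not shown in the README. Events whose parent is missing are simply dropped here.

```python
from collections import defaultdict


def build_tree(events: list) -> list:
    """Nest flat trace events under their parents; roots have no parentId."""
    children = defaultdict(list)
    for event in events:
        children[event.get("parentId")].append(event)

    def attach(event: dict) -> dict:
        # Copy the event and recurse into its direct children.
        return {**event, "children": [attach(child) for child in children[event["id"]]]}

    return [attach(root) for root in children[None]]


events = [
    {"id": "e1", "parentId": None, "type": "orchestrate"},
    {"id": "e2", "parentId": "e1", "type": "spawn", "agent": "claude", "taskId": "US-001"},
    {"id": "e3", "parentId": "e2", "type": "complete", "exitCode": 0},
]
tree = build_tree(events)
print(tree[0]["children"][0]["agent"])
```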
466
- #### `clear_trace`
467
- Reset the trace for a fresh orchestration run.
468
-
469
- **What gets traced automatically:**
470
- - `spawn_agent` and `spawn_agent_background` calls
471
- - `delegate_story` and `delegate_research` delegations
472
- - `orchestrate` decisions (inline mode)
473
- - Agent completions with exit codes
474
- - Agent failures with error details
475
-
476
- Each trace event includes metadata: agent type, story ID, prompt length, exit code, and timing.
477
-
478
- ### Autonomous Workflow
479
-
480
- With multiple agents working together:
481
-
482
- ```
483
- ┌──────────────────────────────────────────────────────────────────┐
484
- │ NightShift Workflow │
485
- ├──────────────────────────────────────────────────────────────────┤
486
- │ │
487
- │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
488
- │ │ Claude │ │ Codex │ │ Gemini │ │ Vibe │ │ Goose │ │
489
- │ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ │
490
- │ │ │ │ │ │ │
491
- │ └───────────┴─────┬─────┴───────────┴───────────┘ │
492
- │ │ │
493
- │ ▼ │
494
- │ ┌──────────────────┐ │
495
- │ │ .robot-chat/ │ ◄── Agent coordination │
496
- │ │ chat.txt │ │
497
- │ └──────────────────┘ │
498
- │ │ │
499
- │ ┌──────────┬────────┼────────┬──────────┐ │
500
- │ │ │ │ │ │ │
501
- │ ▼ ▼ ▼ ▼ ▼ │
502
- │ ┌────────┐ ┌────────┐ ┌────┐ ┌──────────┐ ┌──────────┐ │
503
- │ │prd.json│ │progress│ │Code│ │ context/ │ │trace.json│ │
504
- │ │(tasks) │ │ .txt │ │ │ │(per-topic│ │(execution│ │
505
- │ │ │ │ │ │ │ │ queries) │ │ tree) │ │
506
- │ └────────┘ └────────┘ └────┘ └──────────┘ └──────────┘ │
507
- │ │
508
- └──────────────────────────────────────────────────────────────────┘
509
- ```
510
-
511
- Each agent:
512
- 1. Checks for failovers (helps stuck agents first)
513
- 2. Reads progress.txt for codebase patterns
514
- 3. Claims the next story via chat
515
- 4. Implements the story
516
- 5. Runs quality checks
517
- 6. Commits changes
518
- 7. Marks complete and logs learnings
519
- 8. Repeats until all stories pass
520
-
521
- When an agent hits limits, it posts `FAILOVER_NEEDED` and another agent claims the work.
522
-
523
- ### Completion Signal
524
-
525
- When all stories in prd.json have `passes: true` AND all bugs in bugs.json have `fixed: true`, the tools:
526
-
527
- 1. Post a `READY_TO_TEST` message to the chat
528
- 2. Return `<promise>COMPLETE</promise>`
529
-
530
- This signals to humans that work is ready for review.
531
-
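The completion condition above reduces to a small predicate. This sketch treats a missing `bugs.json` as "no open bugs", which is an assumption the README does not state, and `all_work_done` is a hypothetical name.

```python
from typing import Optional


def all_work_done(prd: dict, bugs: Optional[dict] = None) -> bool:
    """True when every story passes and every known bug is fixed."""
    stories_pass = all(s.get("passes", False) for s in prd.get("userStories", []))
    bugs_fixed = all(b.get("fixed", False) for b in (bugs or {}).get("bugs", []))
    return stories_pass and bugs_fixed


prd = {"userStories": [{"id": "US-001", "passes": True}]}
bugs = {"bugs": [{"id": "BUG-001", "fixed": False}]}
print(all_work_done(prd, bugs))
```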
532
- ## Bug Tracking
533
-
534
- When testing reveals issues, add a `bugs.json` file:
535
-
536
- ```json
537
- {
538
-   "project": "MyApp",
539
-   "bugs": [
540
-     {
541
-       "id": "BUG-001",
542
-       "title": "Login fails on mobile",
543
-       "description": "Login button unresponsive on iOS Safari",
544
-       "stepsToReproduce": [
545
-         "Open app on iOS Safari",
546
-         "Enter credentials",
547
-         "Tap login button",
548
-         "Nothing happens"
549
-       ],
550
-       "priority": 1,
551
-       "fixed": false
552
-     }
553
-   ]
554
- }
555
- ```
556
-
557
- ### Bug Tools
558
-
559
- #### `read_bugs`
560
- Read bugs.json with completion summary.
561
-
562
- #### `get_next_bug`
563
- Get highest priority unfixed bug.
564
-
565
- #### `claim_bug`
566
- Claim a bug and notify via chat.
567
-
568
- **Parameters:**
569
- - `agent` (required): Your agent name
570
- - `bugId` (optional): Specific bug to claim
571
-
572
- #### `mark_bug_fixed`
573
- Mark bug fixed, create savepoint, and notify.
574
-
575
- **Parameters:**
576
- - `agent` (required): Your agent name
577
- - `bugId` (required): Bug ID
578
- - `summary` (required): What was fixed
579
- - `filesModified` (optional): Files changed
580
-
581
- ## Savepoints (Recovery)
582
-
583
- Every completed story and fixed bug automatically creates a savepoint: a git commit plus tag when git is available, or a JSON state snapshot when it is not. Use savepoints to roll back if needed.
584
-
585
- ### Savepoint Tools
586
-
587
- #### `create_savepoint`
588
- Create a manual checkpoint.
589
-
590
- **Parameters:**
591
- - `label` (required): Savepoint name (e.g., "pre-refactor", "auth-working")
592
- - `message` (optional): Commit message
593
-
594
- #### `list_savepoints`
595
- List all savepoints (git tags with `savepoint/` prefix).
596
-
597
- #### `rollback_savepoint`
598
- Reset to a previous savepoint. **Warning:** Discards all changes after that point.
599
-
600
- **Parameters:**
601
- - `label` (required): Savepoint to roll back to
602
-
603
- ### Example Recovery
604
-
605
- ```bash
606
- # Something went wrong after US-003
607
- # List available savepoints
608
- list_savepoints
609
- # → savepoint/US-001, savepoint/US-002, savepoint/US-003
610
-
611
- # Roll back to the savepoint created after US-002
612
- rollback_savepoint(label: "US-002")
613
- # → All changes after US-002 discarded
614
- ```
615
-
616
- ## Workflow Management
617
-
618
- NightShift includes workflow tools for tracking project phases, recording strategic decisions, and managing agent assignments.
619
-
620
- ### Workflow Tools
621
-
622
- #### `init_workflow`
623
- Initialize a new workflow with a project goal and optional custom phases.
624
-
625
- **Parameters:**
626
- - `projectGoal` (required): High-level goal of the project
627
- - `phases` (optional): Custom phases (default: research, decisions, planning, build, test, report)
628
-
629
- #### `get_workflow_state`
630
- Get the current workflow state including phase, assignments, and decisions.
631
-
632
- #### `advance_phase`
633
- Advance to the next workflow phase when the current phase's exit criteria are met.
634
-
635
- #### `set_phase`
636
- Manually set the workflow to a specific phase.
637
-
638
- **Parameters:**
639
- - `phase` (required): Target phase (research, decisions, planning, build, test, report, complete)
640
-
641
- #### `record_decision`
642
- Record a strategic decision with rationale for future reference.
643
-
644
- **Parameters:**
645
- - `topic` (required): What the decision is about
646
- - `options` (required): Options that were considered
647
- - `chosen` (required): The chosen option
648
- - `rationale` (required): Why this option was chosen
649
- - `decidedBy` (required): Agent or person who decided
650
-
651
- #### `get_decisions`
652
- Get all recorded decisions, optionally filtered by topic.
653
-
654
- #### `get_active_assignments`
655
- Get all stories currently being worked on by agents.
656
-
657
- #### `clear_assignment`
658
- Clear a story assignment (for abandonment/failover scenarios).
659
-
660
- ## Setup & Debugging
661
-
662
- NightShift includes self-service tools for setup and troubleshooting.
663
-
664
- ### `nightshift_setup`
665
-
666
- Get configuration instructions and verify project setup.
667
-
668
- **Parameters:**
669
- - `showExamples` (optional): Include prd.json and bugs.json templates
670
-
671
- **Returns:**
672
- - Project status checks (prd.json, bugs.json, git, .gitignore)
673
- - Agent configuration examples for Claude and Codex
674
- - Setup suggestions for any issues found
675
- - Example templates (if requested)
676
-
677
- **Example:**
678
- ```
679
- nightshift_setup(showExamples: true)
680
- ```
681
-
682
- ### `nightshift_debug`
683
-
684
- Diagnose issues and get troubleshooting guidance.
685
-
686
- **Checks:**
687
- - File system permissions
688
- - JSON file validation (prd.json, bugs.json)
689
- - Daemon lock status
690
- - Recent chat errors and unclaimed failovers
691
- - Agent availability
692
- - Git repository status
693
-
694
- **Example:**
695
- ```
696
- nightshift_debug
697
- # Returns detailed diagnostic report with suggested fixes
698
- ```
699
-
700
- ## Agent Spawning & Orchestration
701
-
702
- One agent can spawn others as subprocesses, enabling fully autonomous multi-agent workflows with minimal user intervention.
703
-
704
- ### Spawning Tools
705
-
706
- #### `list_available_agents`
707
- Check which agent CLIs (claude, codex, gemini, vibe, goose) are installed and ready to run.
708
-
709
- #### `spawn_agent`
710
- Spawn another agent as a subprocess and wait for completion.
711
-
712
- **Parameters:**
713
- - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
714
- - `prompt` (required): Task/prompt to send
715
- - `timeout` (optional): Seconds before timeout (default: 300)
716
-
717
- **Example:**
718
- ```
719
- spawn_agent(agent: "codex", prompt: "Fix the type errors in src/utils.ts")
720
- ```
721
-
722
- #### `spawn_agent_background`
723
- Spawn an agent in the background (non-blocking). Returns immediately with PID and output file path.
724
-
725
- **Parameters:**
726
- - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
727
- - `prompt` (required): Task/prompt to send
728
-
729
- #### `delegate_story`
730
- Delegate a PRD user story to another agent with full context. On failure, returns a `retryHint` suggesting alternative available agents.
731
-
732
- **Parameters:**
733
- - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
734
- - `storyId` (optional): Story ID to delegate (defaults to next available)
735
- - `background` (optional): Run in background (default: false)
736
-
737
- **Example:**
738
- ```
739
- delegate_story(agent: "gemini", storyId: "US-003", background: true)
740
- ```
741
-
742
- The spawned agent receives:
743
- - Full story description and acceptance criteria
744
- - Relevant context from the context store (or progress.txt as fallback)
745
- - Recent chat messages for context
746
- - Instructions to use nightshift tools for coordination (including `store_context` and `query_context`)
747
-
748
- #### `delegate_research`
749
- Delegate a research or planning task to an agent (default: Gemini). Ideal for read-only tasks like codebase analysis, architecture planning, code review, and documentation. Queries the context store for relevant prior findings.
750
-
751
- **Parameters:**
752
- - `task` (required): The research/planning task description
753
- - `agent` (optional): Which agent to use (default: gemini)
754
- - `context` (optional): Additional context to provide
755
- - `background` (optional): Run in background (default: false)
756
-
757
- ### Monitoring Tools
758
-
759
- #### `get_agent_status`
760
- Check the status of a spawned background agent by PID.
761
-
762
- **Parameters:**
763
- - `pid` (required): Process ID of the spawned agent
764
-
765
- **Returns:**
766
- - Whether the agent is still running or has exited
767
- - Exit code (if finished)
768
- - Last 30 lines of output
769
- - Story assignment (if delegated via `delegate_story`)
770
-
771
- #### `list_running_agents`
772
- List all agents spawned in the current session with their status.
773
-
774
- **Returns:** Array of agents with PID, agent type, running/exited status, elapsed time, and story assignment.
775
-
776
- ### Orchestration
777
-
778
- #### `orchestrate`
779
- Run an autonomous orchestration loop that claims stories, implements them, and marks them complete until all work is done. This is the highest-level automation tool.
780
-
781
- **Parameters:**
782
- - `agent` (optional): Your agent name (default: "NightShift")
783
- - `maxIterations` (optional): Maximum stories to process (default: 50)
784
- - `mode` (optional): "stories", "bugs", or "all" (default: "all")
785
-
786
- ### Orchestration Patterns
787
-
788
- **Fully autonomous (recommended):**
789
- ```
790
- orchestrate(agent: "Claude", mode: "all")
791
- # Runs until all stories and bugs are complete
792
- ```
793
-
794
- **Sequential delegation:**
795
- ```
796
- delegate_story(agent: "codex") # Wait for completion
797
- delegate_story(agent: "gemini") # Then delegate next
798
- ```
799
-
800
- **Parallel execution:**
801
- ```
802
- delegate_story(agent: "codex", storyId: "US-001", background: true)
803
- delegate_story(agent: "goose", storyId: "US-002", background: true)
804
- # Work on US-003 yourself while they run in parallel
805
- # Monitor with get_agent_status or list_running_agents
806
- ```
807
-
808
- **Research then implement:**
809
- ```
810
- delegate_research(task: "Analyze auth patterns and recommend approach")
811
- # Use findings to inform implementation
812
- delegate_story(agent: "codex", storyId: "US-001")
813
- ```
814
-
815
- ## NightShift Daemon (Continuous Orchestration)
816
-
817
- For fully automated, event-driven orchestration, run the NightShift daemon:
818
-
819
- ```bash
820
- # Start the daemon
821
- nightshift-daemon
822
-
823
- # With options
824
- nightshift-daemon --verbose --max-agents 4 --health-check 1m
825
-
826
- # Preview mode (see what would happen)
827
- nightshift-daemon --dry-run --verbose
828
- ```
829
-
830
- ### How It Works
831
-
832
- The daemon provides hands-off multi-agent orchestration:
833
-
834
- 1. **Event-Driven**: Watches `prd.json` and `chat.txt` for changes
835
- 2. **Auto-Spawning**: Spawns agents for orphaned stories (up to concurrency limit)
836
- 3. **Failover Handling**: Automatically claims and reassigns failover requests
837
- 4. **Smart Retry**: Tracks failed agents per story and tries a different agent on retry
838
- 5. **Health Checks**: Periodic reconciliation as a fallback (default: every 2 min)
839
- 6. **Poison Pill Protection**: Quarantines stories that fail repeatedly
840
- 7. **Stuck Detection**: Kills agents that haven't reported activity
841
-
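Items 4 and 6 above (smart retry and poison pill protection) could be modeled roughly as follows. The failure threshold of 3 and both function names are hypothetical. The README does not say how many failures trigger quarantine or in what order alternative agents are tried.

```python
from typing import Optional


def should_quarantine(failed_agents: list, max_failures: int = 3) -> bool:
    """Treat a story as a poison pill once it has failed `max_failures` times."""
    return len(failed_agents) >= max_failures


def pick_retry_agent(available: list, failed_agents: list) -> Optional[str]:
    """Return the first available agent that has not already failed this story."""
    for agent in available:
        if agent not in failed_agents:
            return agent
    return None


available = ["codex", "gemini", "claude"]
retry_with = pick_retry_agent(available, failed_agents=["codex"])
print(retry_with)
```

Keeping the failed-agent list per story lets the same record drive both decisions: who to try next, and when to stop trying.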
842
- ### Options
843
-
844
- | Flag | Description | Default |
845
- |------|-------------|---------|
846
- | `--verbose, -v` | Enable debug logging | false |
847
- | `--dry-run` | Show actions without spawning | false |
848
- | `--health-check <N>` | Health check interval (e.g., "2m", "30s") | 2m |
849
- | `--max-agents <N>` | Max concurrent agents | 3 |
850
-
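Interval values such as `2m` and `30s` can be converted to seconds as in the sketch below. It is not the daemon's parser: `parse_interval` is a hypothetical helper, and accepting `h` for hours is an assumption, since the README only shows seconds and minutes.

```python
import re

# Seconds per unit; "h" is an assumption not shown in the README.
UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_interval(value: str) -> int:
    """Convert strings like "2m" or "30s" into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized interval: {value!r}")
    amount, unit = match.groups()
    return int(amount) * UNITS[unit]


print(parse_interval("2m"), parse_interval("30s"))
```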
851
- ### Environment
852
-
853
- - `ROBOT_CHAT_PROJECT_PATH` - Project directory (default: current directory)
854
-
855
- ### Architecture
856
-
857
- ```
858
- ┌─────────────────────────────────────────────────────────────┐
859
- │ NightShift Daemon │
860
- ├─────────────────────────────────────────────────────────────┤
861
- │ │
862
- │ ┌──────────────────────────────────────────────────┐ │
863
- │ │ File Watchers (Primary) │ │
864
- │ │ • prd.json changes → reconcile │ │
865
- │ │ • chat.txt changes → check failovers │ │
866
- │ └──────────────────────────────────────────────────┘ │
867
- │ │ │
868
- │ ▼ │
869
- │ ┌──────────────────────────────────────────────────┐ │
870
- │ │ Reconciliation Engine │ │
871
- │ │ • Find orphaned stories │ │
872
- │ │ • Spawn agents (up to max concurrency) │ │
873
- │ │ • Handle failovers │ │
874
- │ │ • Quarantine poison pills │ │
875
- │ └──────────────────────────────────────────────────┘ │
876
- │ │ │
877
- │ ▼ │
878
- │ ┌──────────────────────────────────────────────────┐ │
879
- │ │ Health Check (Fallback) │ │
880
- │ │ • Runs every 2 minutes │ │
881
- │ │ • Detects stuck agents │ │
882
- │ │ • Restarts watchers if needed │ │
883
- │ │ • Reconciles state │ │
884
- │ └──────────────────────────────────────────────────┘ │
885
- │ │
886
- └─────────────────────────────────────────────────────────────┘
887
- ```
888
-
889
- ## Local Models via Ollama
890
-
891
- NightShift supports local Ollama models through two harnesses:
892
-
893
- ### Goose + Ollama (Recommended for tool use)
894
-
895
- [Goose](https://github.com/block/goose) has its own tool-calling implementation that works reliably with local models. This is the recommended path for local agent work.
896
-
897
- ```bash
898
- # Install Goose CLI
899
- curl -fsSL https://github.com/block/goose/releases/latest/download/install.sh | bash
900
-
901
- # Install Ollama and pull a model
902
- ollama pull qwen3.5:4b
903
-
904
- # Configure nightshift to use Goose with Ollama
905
- export NIGHTSHIFT_GOOSE_PROVIDER=ollama
906
- export NIGHTSHIFT_GOOSE_MODEL=qwen3.5:4b
907
- ```
908
-
909
- Then use `goose` as your agent in nightshift:
910
- ```
911
- spawn_agent(agent: "goose", prompt: "Fix the pagination bug in src/api.ts")
912
- delegate_research(agent: "goose", task: "Analyze error handling patterns")
913
- ```
914
-
915
- **Recommended models** (by hardware):
916
-
917
- | GPU VRAM | Model | Size | Notes |
918
- |----------|-------|------|-------|
919
- | 4GB+ | `qwen3.5:4b` | 3.4 GB | Fast, good tool use |
920
- | 6GB+ | `qwen3.5:4b-q8_0` | 5.3 GB | Better accuracy, same speed |
921
- | 8GB+ | `qwen3.5:9b` | 6.6 GB | Best quality, slower on consumer GPUs |
922
-
923
- ### Claude Code + Ollama (Text generation only)
924
-
925
- For tasks that don't require tool use (summarization, code review, planning):
926
-
927
- ```bash
928
- export NIGHTSHIFT_OLLAMA_MODEL=qwen3.5:4b # or any Ollama model
929
- ```
930
-
931
- Then use `ollama` as your agent:
932
- ```
933
- spawn_agent(agent: "ollama", prompt: "Review this PR for security issues")
934
- delegate_research(agent: "ollama", task: "Summarize the authentication patterns")
935
- ```
936
-
937
- This uses Claude Code's harness with Ollama's Anthropic-compatible API. Text generation works well, but local models don't reliably trigger Claude Code's structured tool calls.
938
-
939
- ### Benchmarking Local Models
940
-
941
- A benchmark suite is included to test which models work on your hardware:
942
-
943
- ```bash
944
- # Test all tasks with goose + a specific model
945
- node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:4b
946
-
947
- # Test only text-level tasks (fast sanity check)
948
- node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:4b --level text
949
-
950
- # Compare models
951
- node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:9b
952
- ```
953
-
954
- Results are saved to `benchmarks/results/` for comparison across runs.
955
-
956
- ## Multi-Agent Tips
957
-
958
- 1. **Same directory**: All agents must run in the same project directory to share chat
959
- 2. **Claim before working**: Always claim stories to prevent duplicate work
960
- 3. **Post status updates**: Keep other agents informed of progress
961
- 4. **Store context, not just progress**: Use `store_context` to share learnings by topic — other agents can query for exactly what they need instead of reading a giant progress file
962
- 5. **Handle failovers**: Check for and claim failovers at the start of each session
963
- 6. **Use delegation**: One orchestrating agent can spawn others for parallel work
964
- 7. **Monitor background agents**: Use `get_agent_status` and `list_running_agents` to track spawned agents
965
- 8. **Use `orchestrate` for full autonomy**: The `orchestrate` tool handles the entire claim→implement→complete loop
966
- 9. **Review traces after runs**: Use `get_trace(tree: true)` to understand what happened during orchestration
967
- 10. **Add `.robot-chat/` to your project's `.gitignore`**: Chat logs, context, and traces are ephemeral and shouldn't be committed
968
-
969
- ## License
970
-
971
- MIT
1
+ # NightShift MCP
2
+
3
+ **The responsible kind of multi-agent chaos.**
4
+
5
+ Explicit delegation, review, and handoffs between AI models.
6
+
7
+ ---
8
+
9
+ An MCP (Model Context Protocol) server for agent teams and multi-agent orchestration across different AI models. Coordinate Claude, Codex, Gemini, Goose, and local Ollama models as an agentic team — with structured delegation, shared task management, and autonomous workflows. Works with any MCP-compatible client.
10
+
11
+ ## Features
12
+
13
+ - **Multi-agent chat**: Structured inter-agent messaging with agent name, timestamp, type, and content
14
+ - **Failover handling**: Seamless handoffs when an agent hits limits or context windows fill up
15
+ - **PRD-driven task management**: Work through user stories in prd.json with dependencies (`dependsOn`) and tags for smart routing
16
+ - **Progress tracking**: Shared learnings via progress.txt
17
+ - **Selective context retrieval**: Topic-based context store lets agents query relevant context instead of prompt-stuffing
18
+ - **Execution tracing**: Structured trace of agent spawns, completions, and failures with parent-child tree visualization
19
+ - **Agent spawning & orchestration**: Spawn Claude, Codex, Gemini, Goose, or local Ollama models as subprocesses with full lifecycle tracking
20
+ - **Agent availability checks**: Pre-flight validation before spawning — never fails silently on missing agents
21
+ - **Local model support**: Use Goose + Ollama for free, private, offline agent work with local models
22
+ - **Autonomous orchestration**: Single `orchestrate` tool runs a claim→implement→complete loop until all stories pass
23
+ - **Agent status tracking**: Monitor spawned agents by PID, check exit codes, and tail output in real-time
24
+ - **Smart retry**: Automatically suggests or uses a different agent when one fails
25
+ - **Workflow management**: Phases, strategic decisions, and agent assignments
26
+ - **Watch/polling**: Monitor for new messages with cursor-based polling
27
+ - **Auto-archiving**: Archive old messages to keep the chat file manageable
28
+ - **Cross-platform**: Works on Windows, Linux, and macOS (uses cross-spawn and platform-safe process management)
29
+ - **Heterogeneous agent teams**: Mix different AI models — use each for what it's best at
30
+ - **Universal compatibility**: Works with any MCP-compatible client (exposes 49 tools across 10 categories)
31
+ - **Flexible savepoints**: Git-based checkpoints when available, JSON state snapshots when not
32
+ - **Simple file-based storage**: No external services required
33
+
34
+ ## What's New in 1.1.0
35
+
36
+ - **Local model support**: New `ollama` agent type + Goose/Ollama integration for local model tool use
37
+ - **Agent availability checks**: All spawn/delegate tools validate agents before attempting — clear errors with available alternatives
38
+ - **PRD dependencies**: `dependsOn` field prevents stories from running before their prerequisites complete
39
+ - **PRD tags**: `tags` field enables smart agent routing (e.g., `["code"]` routes to codex, `["research"]` routes to gemini)
40
+ - **Optional savepoints**: Works without git — falls back to JSON state snapshots
41
+ - **Persistent agent tracking**: Agent registry survives restarts (`.robot-chat/agent-registry.json`)
42
+ - **Flexible workflow phases**: Custom phase names beyond the built-in set
43
+ - **Benchmark suite**: Test local model tool-use reliability (`benchmarks/run-experiment.mjs`)
44
+
45
+ ## Installation
46
+
47
+ **Via npm (recommended):**
48
+ ```bash
49
+ npm install -g nightshift-mcp
50
+ ```
51
+
52
+ **Updating:**
53
+ ```bash
54
+ npm update -g nightshift-mcp
55
+ ```
56
+
57
+ **Or build from source:**
58
+ ```bash
59
+ git clone <repo-url>
60
+ cd nightshift-mcp
61
+ npm install
62
+ npm run build
63
+ npm link # makes 'nightshift-mcp' available globally
64
+ ```
65
+
66
+ ## Configuration
67
+
68
+ ### Claude Code (`~/.claude.json`)
69
+
70
+ ```json
71
+ {
72
+ "mcpServers": {
73
+ "nightshift": {
74
+ "command": "nightshift-mcp",
75
+ "args": []
76
+ }
77
+ }
78
+ }
79
+ ```
80
+
81
+ ### Codex (`~/.codex/config.toml`)
82
+
83
+ ```toml
84
+ [mcp_servers.nightshift]
85
+ command = "nightshift-mcp"
86
+ args = []
87
+ ```
88
+
89
+ The server automatically uses the current working directory for the `.robot-chat/` folder. You can override this with the `ROBOT_CHAT_PROJECT_PATH` environment variable if needed.
90
+
91
+ ## Usage
92
+
93
+ For agents to communicate, they must be running in the **same project directory**. The chat file is created at `<project>/.robot-chat/chat.txt` based on where each CLI is started.
94
+
95
+ **Example - two agents working on the same project:**
96
+
97
+ ```bash
98
+ # Terminal 1
99
+ cd ~/myproject
100
+ claude
101
+
102
+ # Terminal 2
103
+ cd ~/myproject
104
+ codex
105
+ ```
106
+
107
+ Both agents now share the same chat file and can coordinate via the nightshift tools.
108
+
109
+ **Note:** If agents are started in different directories, they will have separate chat files and won't be able to communicate.
110
+
111
+ ## Tools
112
+
113
+ ### `read_robot_chat`
114
+
115
+ Read recent messages from the chat file.
116
+
117
+ **Parameters:**
118
+ - `limit` (optional): Maximum messages to return (default: 20)
119
+ - `agent` (optional): Filter by agent name
120
+ - `type` (optional): Filter by message type
121
+
122
+ **Example:**
123
+ ```
124
+ Read the last 10 messages from Claude
125
+ ```
126
+
127
+ ### `write_robot_chat`
128
+
129
+ Write a message to the chat file.
130
+
131
+ **Parameters:**
132
+ - `agent` (required): Your agent name (e.g., "Claude", "Codex")
133
+ - `type` (required): Message type
134
+ - `content` (required): Message content
135
+
136
+ **Message Types:**
137
+ - `FAILOVER_NEEDED` - Request another agent to take over
138
+ - `FAILOVER_CLAIMED` - Acknowledge taking over a task
139
+ - `TASK_COMPLETE` - Mark a task as finished
140
+ - `STATUS_UPDATE` - Share progress update
141
+ - `HANDOFF` - Pass work to a specific agent
142
+ - `INFO` - General information
143
+ - `ERROR` - Error report
144
+ - `QUESTION` - Ask other agents a question
145
+ - `ANSWER` - Answer a question
146
+
147
+ **Example:**
148
+ ```
149
+ Post a STATUS_UPDATE from Claude about completing the login form
150
+ ```
151
+
152
+ ### `check_failovers`
153
+
154
+ Find unclaimed FAILOVER_NEEDED messages.
155
+
156
+ **Example:**
157
+ ```
158
+ Check if any agent needs help with their task
159
+ ```
160
+
161
+ ### `claim_failover`
162
+
163
+ Claim a failover request from another agent.
164
+
165
+ **Parameters:**
166
+ - `agent` (required): Your agent name
167
+ - `originalAgent` (required): Agent who requested failover
168
+ - `task` (optional): Task description
169
+
170
+ **Example:**
171
+ ```
172
+ Claim the failover from Codex and continue working on the authentication feature
173
+ ```
174
+
175
+ ### `get_chat_path`
176
+
177
+ Get the full path to the chat file.
178
+
179
+ ### `list_agents`
180
+
181
+ List all agents who have posted to the chat, with their activity stats.
182
+
183
+ **Returns:**
184
+ - Agent name
185
+ - Last seen timestamp
186
+ - Last message type
187
+ - Total message count
188
+
189
+ **Example:**
190
+ ```
191
+ Show me which agents have been active in the chat
192
+ ```
193
+
194
+ ### `watch_chat`
195
+
196
+ Poll for new messages since a cursor position. Useful for monitoring the chat for updates.
197
+
198
+ **Parameters:**
199
+ - `cursor` (optional): Line number from previous watch call. Omit to get current cursor.
200
+
201
+ **Returns:**
202
+ - `cursor`: Updated cursor for next call
203
+ - `messageCount`: Number of new messages
204
+ - `messages`: Array of new messages
205
+
206
+ **Example workflow:**
207
+ ```
208
+ 1. Call watch_chat without cursor to get initial position
209
+ 2. Store the returned cursor value
210
+ 3. Call watch_chat with that cursor to get new messages
211
+ 4. Update your cursor with the returned value
212
+ 5. Repeat steps 3-4 to poll for updates
213
+ ```
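The cursor workflow above can be sketched in a few lines. This is only an illustration of cursor-based polling over a list of chat lines, not the server's implementation: the helper name `read_since` is hypothetical, it returns raw lines rather than parsed messages, and clamping a stale cursor is an assumption about how a shortened (for example, archived) file might be handled.

```python
def read_since(lines, cursor=None):
    """Return (new_cursor, new_lines) for a list of chat lines.

    Hypothetical helper: with no cursor it only reports the current
    position, mirroring step 1 of the workflow.
    """
    end = len(lines)
    if cursor is None:
        return end, []
    # Assumption: clamp the cursor if the file is now shorter than it was.
    start = min(max(cursor, 0), end)
    return end, lines[start:]


chat = ["[Claude @ 14:32] STATUS_UPDATE", "Working on the login form."]
cursor, _ = read_since(chat)                  # steps 1-2: get and store the cursor
chat += ["[Codex @ 14:45] INFO", "Tests pass."]
cursor, new_lines = read_since(chat, cursor)  # steps 3-4: fetch and update
print(cursor, new_lines)
```

Storing only an integer cursor keeps the caller stateless between polls; the caller never needs to remember which messages it has already seen.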
214
+
215
+ ### `archive_chat`
216
+
217
+ Archive old messages to a date-stamped file, keeping recent messages in main chat.
218
+
219
+ **Parameters:**
220
+ - `keepRecent` (optional): Number of messages to keep (default: 50)
221
+
222
+ **Example:**
223
+ ```
224
+ Archive old messages, keeping the last 20
225
+ ```
226
+
227
+ ## Chat File Format
228
+
229
+ Messages are stored in a human-readable format:
230
+
231
+ ```
232
+ # Robot Chat - Multi-Agent Communication
233
+ # Format: [AgentName @ HH:MM] MESSAGE_TYPE
234
+ # ========================================
235
+
236
+ [Claude @ 14:32] STATUS_UPDATE
237
+ Working on implementing the login form.
238
+ Files modified: src/components/LoginForm.tsx
239
+
240
+ [Codex @ 14:45] FAILOVER_NEEDED
241
+ Status: Hit rate limit
242
+ Current Task: Implementing user authentication
243
+ Progress: 60% - login form done, need logout and session handling
244
+ Files Modified: src/auth/login.tsx, src/api/auth.ts
245
+
246
+ Requesting another agent continue this work.
247
+
248
+ [Claude @ 14:47] FAILOVER_CLAIMED
249
+ Claiming failover from Codex.
250
+ Continuing task: Implementing user authentication
251
+ ```
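Because each message starts with a `[AgentName @ HH:MM] MESSAGE_TYPE` header, the file is easy to split with a small script. The following is a minimal sketch, assuming that every header sits on its own line and that everything before the first header (the `#` banner) can be ignored. The function name `parse_chat` and the dictionary keys are illustrative and are not part of NightShift's API.

```python
import re

# Matches a header such as "[Claude @ 14:32] STATUS_UPDATE".
HEADER = re.compile(r"^\[(?P<agent>[^\]@]+?) @ (?P<time>\d{2}:\d{2})\] (?P<type>[A-Z_]+)$")


def parse_chat(text):
    """Split chat text into messages (hypothetical helper, not NightShift code)."""
    messages = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = {
                "agent": match.group("agent"),
                "time": match.group("time"),
                "type": match.group("type"),
                "body": [],
            }
            messages.append(current)
        elif current is None:
            # Skip the banner and blank lines that precede the first message.
            continue
        else:
            current["body"].append(line)
    for message in messages:
        message["content"] = "\n".join(message.pop("body")).strip()
    return messages


sample = """# Robot Chat - Multi-Agent Communication
# Format: [AgentName @ HH:MM] MESSAGE_TYPE

[Claude @ 14:32] STATUS_UPDATE
Working on implementing the login form.

[Codex @ 14:45] FAILOVER_NEEDED
Status: Hit rate limit
"""

for message in parse_chat(sample):
    print(message["agent"], message["type"])
```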
252
+
253
+ ## Testing
254
+
255
+ ### With MCP Inspector
256
+
257
+ ```bash
258
+ npx @modelcontextprotocol/inspector node /path/to/nightshift-mcp/dist/index.js /tmp/test-project
259
+ ```
260
+
261
+ ### Manual Testing
262
+
263
+ ```bash
264
+ # Set project path and run
265
+ ROBOT_CHAT_PROJECT_PATH=/tmp/test-project node dist/index.js
266
+ ```
267
+
268
+ ## Development
269
+
270
+ ```bash
271
+ # Watch mode for development
272
+ npm run dev
273
+
274
+ # Build
275
+ npm run build
276
+ ```
277
+
278
+ ## Ralph-Style Task Management
279
+
280
+ NightShift includes Ralph-compatible PRD and progress management, enabling structured autonomous development.
281
+
282
+ ### Setup
283
+
284
+ 1. Create a `prd.json` in your project root:
285
+
286
+ ```json
287
+ {
288
+ "project": "MyApp",
289
+ "description": "Feature description",
290
+ "userStories": [
291
+ {
292
+ "id": "US-001",
293
+ "title": "Set up project structure",
294
+ "description": "Initialize the project with routing and base components",
295
+ "acceptanceCriteria": ["Add routes", "Create base components", "Typecheck passes"],
296
+ "priority": 1,
297
+ "passes": false,
298
+ "notes": "",
299
+ "tags": ["infrastructure"]
300
+ },
301
+ {
302
+ "id": "US-002",
303
+ "title": "Add database field",
304
+ "description": "As a developer, I need to store the new field",
305
+ "acceptanceCriteria": ["Add column to table", "Run migration", "Typecheck passes"],
306
+ "priority": 2,
307
+ "passes": false,
308
+ "notes": "",
309
+ "dependsOn": ["US-001"],
310
+ "tags": ["code", "infrastructure"]
311
+ }
312
+ ]
313
+ }
314
+ ```
315
+
316
+ ### PRD Schema
317
+
318
+ | Field | Type | Required | Default | Description |
319
+ |-------|------|----------|---------|-------------|
320
+ | `project` | string | no | — | Project name |
321
+ | `description` | string | no | "" | Project description |
322
+ | **`userStories`** | array | **yes** | — | Array of user story objects |
323
+
324
+ **User Story fields:**
325
+
326
+ | Field | Type | Required | Default | Description |
327
+ |-------|------|----------|---------|-------------|
328
+ | **`id`** | string | **yes** | — | Unique ID (e.g., "US-001") |
329
+ | **`title`** | string | **yes** | — | Short title |
330
+ | `description` | string | no | "" | Detailed description |
331
+ | `acceptanceCriteria` | string[] | no | [] | Criteria for completion |
332
+ | `priority` | number | no | 999 | Lower = higher priority |
333
+ | `passes` | boolean | no | false | Whether the story is complete |
334
+ | `notes` | string | no | "" | Implementation notes |
335
+ | `dependsOn` | string[] | no | — | Story IDs that must complete first |
336
+ | `tags` | string[] | no | — | Labels for routing (e.g., `["code", "security"]`) |
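To make the interaction between `priority`, `passes`, and `dependsOn` concrete, here is a rough sketch of how a "next story" could be chosen from the fields in the table above. It is an assumption about reasonable behavior, not the code behind `get_next_story`; the function name `next_story` is hypothetical.

```python
def next_story(prd):
    """Pick the highest-priority incomplete story whose dependencies all pass."""
    stories = prd["userStories"]
    done = {s["id"] for s in stories if s.get("passes", False)}
    ready = [
        s for s in stories
        if not s.get("passes", False)
        and all(dep in done for dep in s.get("dependsOn", []))
    ]
    # Lower numbers mean higher priority; 999 is the documented default.
    return min(ready, key=lambda s: s.get("priority", 999), default=None)


prd = {
    "userStories": [
        {"id": "US-001", "title": "Set up project structure", "priority": 1, "passes": False},
        {"id": "US-002", "title": "Add database field", "priority": 2, "passes": False,
         "dependsOn": ["US-001"]},
    ]
}
print(next_story(prd)["id"])
```

A story with unmet dependencies is skipped even if its priority number is lower, which is the point of declaring `dependsOn` in the first place.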
337
+
338
+ **Tags and routing:** When the orchestrator uses `adaptive` strategy, tags influence which agent gets assigned:
339
+ - `research`, `planning`, `documentation` → prefers gemini/claude
340
+ - `code`, `implementation`, `feature` → prefers codex/claude
341
+ - `security`, `architecture`, `infrastructure` → prefers claude
342
+
343
+ ### PRD Validation
344
+
345
+ NightShift validates your `prd.json` with Zod schemas and provides helpful error messages when common mistakes are detected:
346
+
347
+ - Using `stories` instead of `userStories` → suggests the correct field name
348
+ - Using `acceptance_criteria` instead of `acceptanceCriteria` → suggests the correct field name
349
+ - Missing required fields (`id`, `title`) → identifies which story has the issue
350
+ - Optional fields default gracefully (`passes` → false, `notes` → "", `acceptanceCriteria` → [])
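The checks listed above can be approximated in a few lines. NightShift itself uses Zod schemas; the Python sketch below only re-expresses the documented rules (required fields, common misspellings, defaults) for illustration. The function name `validate_prd` and the error message wording are hypothetical.

```python
def validate_prd(prd):
    """Return (normalized_prd, errors); a sketch, not NightShift's Zod schema."""
    errors = []
    if "userStories" not in prd:
        hint = ' (found "stories"; did you mean "userStories"?)' if "stories" in prd else ""
        errors.append("missing required field userStories" + hint)
        return None, errors
    defaults = {"description": "", "acceptanceCriteria": [], "priority": 999,
                "passes": False, "notes": ""}
    stories = []
    for index, story in enumerate(prd["userStories"]):
        label = story.get("id", f"story #{index + 1}")
        for field in ("id", "title"):
            if field not in story:
                errors.append(f"{label}: missing required field {field}")
        if "acceptance_criteria" in story:
            errors.append(f'{label}: use "acceptanceCriteria" instead of "acceptance_criteria"')
        # Documented defaults are applied first, then overridden by the story.
        stories.append({**defaults, **story})
    normalized = {"description": "", **prd, "userStories": stories}
    return (normalized if not errors else None), errors


ok, errs = validate_prd({"userStories": [{"id": "US-001", "title": "Set up project structure"}]})
print(errs, ok["userStories"][0]["passes"])
```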
351
+
352
+ Use `nightshift_setup(showExamples: true)` for the full schema reference and examples.
353
+
354
+ 2. Agents use these tools to work through stories:
355
+
356
+ ### PRD Tools
357
+
358
+ #### `read_prd`
359
+ Read the full PRD with completion summary.
360
+
361
+ #### `get_next_story`
362
+ Get the highest priority incomplete story.
363
+
364
+ #### `get_incomplete_stories`
365
+ List all remaining work.
366
+
367
+ #### `claim_story`
368
+ Claim a story and notify other agents via chat.
369
+
370
+ **Parameters:**
371
+ - `agent` (required): Your agent name
372
+ - `storyId` (optional): Specific story to claim
373
+
374
+ #### `complete_story`
375
+ Mark story complete, log progress, and notify via chat.
376
+
377
+ **Parameters:**
378
+ - `agent` (required): Your agent name
379
+ - `storyId` (required): Story ID
380
+ - `summary` (required): What was implemented
381
+ - `filesModified` (optional): List of changed files
382
+ - `learnings` (optional): Gotchas/patterns for future iterations
383
+
384
+ #### `mark_story_complete`
385
+ Mark a story as complete without posting a chat notification.
386
+
387
+ **Parameters:**
388
+ - `storyId` (required): Story ID
389
+ - `notes` (optional): Implementation notes
390
+
391
+ ### Progress Tools
392
+
393
+ #### `read_progress`
394
+ Read progress.txt containing learnings from all iterations.
395
+
396
+ #### `append_progress`
397
+ Add a timestamped progress entry.
398
+
399
+ **Parameters:**
400
+ - `content` (required): What was done, files changed, learnings
401
+
402
+ #### `add_codebase_pattern`
403
+ Add a reusable pattern to the Codebase Patterns section.
404
+
405
+ **Parameters:**
406
+ - `pattern` (required): The pattern (e.g., "Use sql<number> for aggregations")
407
+
408
+ ### Context Store
409
+
410
+ NightShift includes a selective context retrieval system that replaces prompt-stuffing with topic-based queries. Instead of truncating progress.txt to fit the context window, agents can store and retrieve relevant context on demand.
411
+
412
+ Context entries are stored as individual JSON files in `.robot-chat/context/` for concurrent-safe multi-agent access.
413
+
414
+ #### `store_context`
415
+ Store a context entry for other agents to query later.
416
+
417
+ **Parameters:**
418
+ - `topic` (required): Topic/category (e.g., "authentication", "database-schema")
419
+ - `content` (required): The context to store (learnings, decisions, findings)
420
+ - `agent` (required): Your agent name
421
+ - `tags` (optional): Tags for better searchability (e.g., ["auth", "jwt"])
422
+
423
+ **Example:**
424
+ ```
425
+ store_context(topic: "authentication", content: "Using JWT with RS256. Refresh tokens stored in httpOnly cookies.", agent: "Claude", tags: ["jwt", "cookies"])
426
+ ```
427
+
428
+ #### `query_context`
429
+ Search stored context entries by topic.
430
+
431
+ **Parameters:**
432
+ - `topic` (required): Search term (case-insensitive match on topic and tags)
433
+ - `limit` (optional): Max entries to return (default: 10)
434
+
435
+ **Example:**
436
+ ```
437
+ query_context(topic: "auth")
438
+ # Returns all entries matching "auth" in topic or tags, sorted by recency
439
+ ```
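The matching rule described for `query_context` (case-insensitive match on topic and tags, newest first, capped by `limit`) might look roughly like the sketch below. It assumes substring matching, which is consistent with `"auth"` finding an `"authentication"` entry, and it assumes each entry carries an ISO 8601 `timestamp` field. The helper name `search_entries`, the field names, and the sample entries are illustrative, not taken from NightShift.

```python
def search_entries(entries, topic, limit=10):
    """Return the newest entries whose topic or tags contain the search term."""
    needle = topic.lower()
    hits = [
        entry for entry in entries
        if needle in entry["topic"].lower()
        or any(needle in tag.lower() for tag in entry.get("tags", []))
    ]
    # ISO 8601 timestamps in the same timezone sort correctly as plain strings.
    hits.sort(key=lambda entry: entry["timestamp"], reverse=True)
    return hits[:limit]


# Made-up sample data for the demonstration.
entries = [
    {"topic": "authentication", "tags": ["jwt", "cookies"],
     "timestamp": "2025-01-01T10:00:00Z", "content": "Using JWT with RS256."},
    {"topic": "database-schema", "tags": ["migrations"],
     "timestamp": "2025-01-02T09:00:00Z", "content": "Example schema note."},
    {"topic": "sessions", "tags": ["Auth"],
     "timestamp": "2025-01-03T08:00:00Z", "content": "Example session note."},
]
print([entry["topic"] for entry in search_entries(entries, "auth")])
```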
440
+
441
+ #### `list_context`
442
+ List all topics in the context store with entry counts.
443
+
444
+ **How delegation uses context:**
445
+
446
+ When `delegate_story` or `delegate_research` spawns an agent, it queries the context store for entries relevant to the task and includes them in the prompt — instead of blindly truncating progress.txt. Agents are also instructed to use `store_context` to save their learnings, creating a self-enriching context loop.
447
+
448
+ ### Execution Tracing
449
+
450
+ NightShift automatically traces all agent spawns, completions, and failures into a structured execution log at `.robot-chat/trace.json`. Each trace event has parent-child relationships that can be reconstructed as a tree for debugging multi-agent runs.
451
+
452
+ #### `get_trace`
453
+ View the execution trace as a flat list or tree.
454
+
455
+ **Parameters:**
456
+ - `tree` (optional): Return as tree with parent-child relationships (default: false)
457
+ - `taskId` (optional): Filter by story/task ID
458
+
459
+ **Example:**
460
+ ```
461
+ get_trace(tree: true)
462
+ # Returns tree showing: orchestrator → spawned claude (US-001) → completed
463
+ # orchestrator → spawned codex (US-002) → failed → retried with gemini → completed
464
+ ```
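Reconstructing a tree from a flat list of events is a standard two-pass operation. The sketch below shows the idea under stated assumptions: the field names `id`, `parentId`, `event`, `agent`, and `taskId` are hypothetical, and the actual layout of `.robot-chat/trace.json` may differ.

```python
def build_tree(events):
    """Group flat trace events into a list of root nodes with nested children."""
    nodes = {event["id"]: {**event, "children": []} for event in events}
    roots = []
    for event in events:
        node = nodes[event["id"]]
        parent = nodes.get(event.get("parentId"))
        if parent is None:
            roots.append(node)  # no known parent: treat as a top-level event
        else:
            parent["children"].append(node)
    return roots


def render(nodes, depth=0):
    """Return indented text lines, one per event, children below parents."""
    lines = []
    for node in nodes:
        label = " ".join(str(node[key]) for key in ("event", "agent", "taskId") if key in node)
        lines.append("  " * depth + label)
        lines.extend(render(node["children"], depth + 1))
    return lines


events = [
    {"id": 1, "parentId": None, "event": "orchestrate"},
    {"id": 2, "parentId": 1, "event": "spawn", "agent": "claude", "taskId": "US-001"},
    {"id": 3, "parentId": 1, "event": "spawn", "agent": "codex", "taskId": "US-002"},
    {"id": 4, "parentId": 3, "event": "failure", "agent": "codex", "taskId": "US-002"},
]
print("\n".join(render(build_tree(events))))
```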
465
+
466
+ #### `clear_trace`
467
+ Reset the trace for a fresh orchestration run.
468
+
469
+ **What gets traced automatically:**
470
+ - `spawn_agent` and `spawn_agent_background` calls
471
+ - `delegate_story` and `delegate_research` delegations
472
+ - `orchestrate` decisions (inline mode)
473
+ - Agent completions with exit codes
474
+ - Agent failures with error details
475
+
476
+ Each trace event includes metadata: agent type, story ID, prompt length, exit code, and timing.
477
+
478
+ ### Autonomous Workflow
479
+
480
+ With multiple agents working together:
481
+
482
+ ```
483
+ ┌──────────────────────────────────────────────────────────────────┐
484
+ │ NightShift Workflow │
485
+ ├──────────────────────────────────────────────────────────────────┤
486
+ │ │
487
+ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
488
+ │ │ Claude │ │ Codex │ │ Gemini │ │ Vibe │ │ Goose │ │
489
+ │ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ │
490
+ │ │ │ │ │ │ │
491
+ │ └───────────┴─────┬─────┴───────────┴───────────┘ │
492
+ │ │ │
493
+ │ ▼ │
494
+ │ ┌──────────────────┐ │
495
+ │ │ .robot-chat/ │ ◄── Agent coordination │
496
+ │ │ chat.txt │ │
497
+ │ └──────────────────┘ │
498
+ │ │ │
499
+ │ ┌──────────┬────────┼────────┬──────────┐ │
500
+ │ │ │ │ │ │ │
501
+ │ ▼ ▼ ▼ ▼ ▼ │
502
+ │ ┌────────┐ ┌────────┐ ┌────┐ ┌──────────┐ ┌──────────┐ │
503
+ │ │prd.json│ │progress│ │Code│ │ context/ │ │trace.json│ │
504
+ │ │(tasks) │ │ .txt │ │ │ │(per-topic│ │(execution│ │
505
+ │ │ │ │ │ │ │ │ queries) │ │ tree) │ │
506
+ │ └────────┘ └────────┘ └────┘ └──────────┘ └──────────┘ │
507
+ │ │
508
+ └──────────────────────────────────────────────────────────────────┘
509
+ ```
510
+
511
+ Each agent:
512
+ 1. Checks for failovers (helps stuck agents first)
513
+ 2. Reads progress.txt for codebase patterns
514
+ 3. Claims the next story via chat
515
+ 4. Implements the story
516
+ 5. Runs quality checks
517
+ 6. Commits changes
518
+ 7. Marks complete and logs learnings
519
+ 8. Repeats until all stories pass
520
+
521
+ When an agent hits limits, it posts `FAILOVER_NEEDED` and another agent claims the work.
522
+
523
+ ### Completion Signal
524
+
525
+ When all stories in prd.json have `passes: true` AND all bugs in bugs.json have `fixed: true`, the tools:
526
+
527
+ 1. Post a `READY_TO_TEST` message to the chat
528
+ 2. Return `<promise>COMPLETE</promise>`
529
+
530
+ This signals to humans that work is ready for review.
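The completion condition is a simple conjunction over both files. As a small sketch (the function name `is_complete` is hypothetical, and treating a missing `bugs.json` as "no open bugs" is an assumption rather than documented behavior):

```python
def is_complete(prd, bugs=None):
    """True when every story passes and every bug is fixed."""
    stories_done = all(s.get("passes", False) for s in prd.get("userStories", []))
    # Assumption: no bugs.json (bugs is None) means there is nothing left to fix.
    bugs_done = all(b.get("fixed", False) for b in (bugs or {}).get("bugs", []))
    return stories_done and bugs_done


prd = {"userStories": [{"id": "US-001", "passes": True}]}
bugs = {"bugs": [{"id": "BUG-001", "fixed": False}]}
print(is_complete(prd, bugs), is_complete(prd))
```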
531
+
532
+ ## Bug Tracking
533
+
534
+ When testing reveals issues, add a `bugs.json` file:
535
+
536
+ ```json
537
+ {
538
+ "project": "MyApp",
539
+ "bugs": [
540
+ {
541
+ "id": "BUG-001",
542
+ "title": "Login fails on mobile",
543
+ "description": "Login button unresponsive on iOS Safari",
544
+ "stepsToReproduce": [
545
+ "Open app on iOS Safari",
546
+ "Enter credentials",
547
+ "Tap login button",
548
+ "Nothing happens"
549
+ ],
550
+ "priority": 1,
551
+ "fixed": false
552
+ }
553
+ ]
554
+ }
555
+ ```
556
+
557
+ ### Bug Tools
558
+
559
+ #### `read_bugs`
560
+ Read bugs.json with completion summary.
561
+
562
+ #### `get_next_bug`
563
+ Get highest priority unfixed bug.
564
+
565
+ #### `claim_bug`
566
+ Claim a bug and notify via chat.
567
+
568
+ **Parameters:**
569
+ - `agent` (required): Your agent name
570
+ - `bugId` (optional): Specific bug to claim
571
+
572
+ #### `mark_bug_fixed`
573
+ Mark bug fixed, create savepoint, and notify.
574
+
575
+ **Parameters:**
576
+ - `agent` (required): Your agent name
577
+ - `bugId` (required): Bug ID
578
+ - `summary` (required): What was fixed
579
+ - `filesModified` (optional): Files changed
580
+
581
+ ## Savepoints (Recovery)
582
+
583
+ Every completed story and fixed bug automatically creates a savepoint (a git commit and tag when git is available, or a JSON state snapshot otherwise). Use these for easy rollback if needed.
584
+
585
+ ### Savepoint Tools
586
+
587
+ #### `create_savepoint`
588
+ Create a manual checkpoint.
589
+
590
+ **Parameters:**
591
+ - `label` (required): Savepoint name (e.g., "pre-refactor", "auth-working")
592
+ - `message` (optional): Commit message
593
+
594
+ #### `list_savepoints`
595
+ List all savepoints (git tags with `savepoint/` prefix).
596
+
597
+ #### `rollback_savepoint`
598
+ Reset to a previous savepoint. **Warning:** Discards all changes after that point.
599
+
600
+ **Parameters:**
601
+ - `label` (required): Savepoint to rollback to
602
+
603
+ ### Example Recovery
604
+
605
+ ```
606
+ # Something went wrong after US-003
607
+ # List available savepoints
608
+ list_savepoints
609
+ # → savepoint/US-001, savepoint/US-002, savepoint/US-003
610
+
611
+ # Rollback to after US-002
612
+ rollback_savepoint("US-002")
613
+ # → All changes after US-002 discarded
614
+ ```
615
+
616
+ ## Workflow Management
617
+
618
+ NightShift includes workflow tools for tracking project phases, recording strategic decisions, and managing agent assignments.
619
+
620
+ ### Workflow Tools
621
+
622
+ #### `init_workflow`
623
+ Initialize a new workflow with a project goal and optional custom phases.
624
+
625
+ **Parameters:**
626
+ - `projectGoal` (required): High-level goal of the project
627
+ - `phases` (optional): Custom phases (default: research, decisions, planning, build, test, report)
628
+
629
+ #### `get_workflow_state`
630
+ Get the current workflow state including phase, assignments, and decisions.
631
+
632
+ #### `advance_phase`
633
+ Advance to the next workflow phase when the current phase's exit criteria are met.
634
+
635
+ #### `set_phase`
636
+ Manually set the workflow to a specific phase.
637
+
638
+ **Parameters:**
639
+ - `phase` (required): Target phase (research, decisions, planning, build, test, report, complete)
640
+
641
+ #### `record_decision`
642
+ Record a strategic decision with rationale for future reference.
643
+
644
+ **Parameters:**
645
+ - `topic` (required): What the decision is about
646
+ - `options` (required): Options that were considered
647
+ - `chosen` (required): The chosen option
648
+ - `rationale` (required): Why this option was chosen
649
+ - `decidedBy` (required): Agent or person who decided
650
+
651
+ #### `get_decisions`
652
+ Get all recorded decisions, optionally filtered by topic.
653
+
654
+ #### `get_active_assignments`
655
+ Get all stories currently being worked on by agents.
656
+
657
+ #### `clear_assignment`
658
+ Clear a story assignment (for abandonment/failover scenarios).
659
+
660
+ ## Setup & Debugging
661
+
662
+ NightShift includes self-service tools for setup and troubleshooting.
663
+
664
+ ### `nightshift_setup`
665
+
666
+ Get configuration instructions and verify project setup.
667
+
668
+ **Parameters:**
669
+ - `showExamples` (optional): Include prd.json and bugs.json templates
670
+
671
+ **Returns:**
672
+ - Project status checks (prd.json, bugs.json, git, .gitignore)
673
+ - Agent configuration examples for Claude and Codex
674
+ - Setup suggestions for any issues found
675
+ - Example templates (if requested)
676
+
677
+ **Example:**
678
+ ```
679
+ nightshift_setup(showExamples: true)
680
+ ```
681
+
682
+ ### `nightshift_debug`
683
+
684
+ Diagnose issues and get troubleshooting guidance.
685
+
686
+ **Checks:**
687
+ - File system permissions
688
+ - JSON file validation (prd.json, bugs.json)
689
+ - Daemon lock status
690
+ - Recent chat errors and unclaimed failovers
691
+ - Agent availability
692
+ - Git repository status
693
+
694
+ **Example:**
695
+ ```
696
+ nightshift_debug
697
+ # Returns detailed diagnostic report with suggested fixes
698
+ ```
699
+
700
+ ## Agent Spawning & Orchestration
701
+
702
+ One agent can spawn others as subprocesses, enabling fully autonomous multi-agent workflows with minimal user intervention.
703
+
704
+ ### Spawning Tools
705
+
706
+ #### `list_available_agents`
707
+ Check which agent CLIs (claude, codex, gemini, vibe, goose) are installed and ready to run.
708
+
709
+ #### `spawn_agent`
710
+ Spawn another agent as a subprocess and wait for completion.
711
+
712
+ **Parameters:**
713
+ - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
714
+ - `prompt` (required): Task/prompt to send
715
+ - `timeout` (optional): Seconds before timeout (default: 300)
716
+
717
+ **Example:**
718
+ ```
719
+ spawn_agent(agent: "codex", prompt: "Fix the type errors in src/utils.ts")
720
+ ```
721
+
722
+ #### `spawn_agent_background`
723
+ Spawn an agent in the background (non-blocking). Returns immediately with PID and output file path.
724
+
725
+ **Parameters:**
726
+ - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
727
+ - `prompt` (required): Task/prompt to send
728
+
729
+ #### `delegate_story`
730
+ Delegate a PRD user story to another agent with full context. On failure, returns a `retryHint` suggesting alternative available agents.
731
+
732
+ **Parameters:**
733
+ - `agent` (required): "claude", "codex", "gemini", "vibe", or "goose"
734
+ - `storyId` (optional): Story ID to delegate (defaults to next available)
735
+ - `background` (optional): Run in background (default: false)
736
+
737
+ **Example:**
738
+ ```
739
+ delegate_story(agent: "gemini", storyId: "US-003", background: true)
740
+ ```
741
+
742
+ The spawned agent receives:
743
+ - Full story description and acceptance criteria
744
+ - Relevant context from the context store (or progress.txt as fallback)
745
+ - Recent chat messages for context
746
+ - Instructions to use nightshift tools for coordination (including `store_context` and `query_context`)
747
+
748
+ #### `delegate_research`
749
+ Delegate a research or planning task to an agent (default: Gemini). Ideal for read-only tasks like codebase analysis, architecture planning, code review, and documentation. Queries the context store for relevant prior findings.
750
+
751
+ **Parameters:**
752
+ - `task` (required): The research/planning task description
753
+ - `agent` (optional): Which agent to use (default: gemini)
754
+ - `context` (optional): Additional context to provide
755
+ - `background` (optional): Run in background (default: false)
756
+
757
+ ### Monitoring Tools
758
+
759
+ #### `get_agent_status`
760
+ Check the status of a spawned background agent by PID.
761
+
762
+ **Parameters:**
763
+ - `pid` (required): Process ID of the spawned agent
764
+
765
+ **Returns:**
766
+ - Whether the agent is still running or has exited
767
+ - Exit code (if finished)
768
+ - Last 30 lines of output
769
+ - Story assignment (if delegated via `delegate_story`)
770
+
771
+ #### `list_running_agents`
772
+ List all agents spawned in the current session with their status.
773
+
774
+ **Returns:** Array of agents with PID, agent type, running/exited status, elapsed time, and story assignment.
775
+
776
+ ### Orchestration
777
+
778
+ #### `orchestrate`
779
+ Run an autonomous orchestration loop that claims stories, implements them, and marks them complete until all work is done. This is the highest-level automation tool.
780
+
781
+ **Parameters:**
782
+ - `agent` (optional): Your agent name (default: "NightShift")
783
+ - `maxIterations` (optional): Maximum stories to process (default: 50)
784
+ - `mode` (optional): "stories", "bugs", or "all" (default: "all")
785
+
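The claim→implement→complete loop described above can be pictured with a minimal sketch. This is not the server's implementation: `run_loop` and its three callbacks (`claim_next`, `implement`, `complete`) are hypothetical names standing in for whatever the tool does internally. Only the iteration cap mirrors the documented `maxIterations` default.

```python
from typing import Callable, List, Optional


def run_loop(
    claim_next: Callable[[], Optional[str]],   # hypothetical: returns a story ID or None
    implement: Callable[[str], bool],          # hypothetical: True when the work passed
    complete: Callable[[str], None],           # hypothetical: marks the story done
    max_iterations: int = 50,                  # mirrors the documented default
) -> List[str]:
    """Claim, implement, and complete stories until none remain or the cap is hit."""
    finished: List[str] = []
    for _ in range(max_iterations):
        story_id = claim_next()
        if story_id is None:
            break  # nothing left to claim
        if implement(story_id):
            complete(story_id)
            finished.append(story_id)
    return finished


# Usage with in-memory stand-ins for the real tools.
queue = ["US-001", "US-002", "US-003"]
finished = run_loop(
    claim_next=lambda: queue.pop(0) if queue else None,
    implement=lambda story_id: story_id != "US-002",  # pretend US-002 fails
    complete=lambda story_id: None,
)
print(finished)
```

The cap is what keeps a misbehaving story from looping forever, which is presumably why `maxIterations` exists as a parameter.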
786
+ ### Orchestration Patterns
787
+
788
+ **Fully autonomous (recommended):**
789
+ ```
790
+ orchestrate(agent: "Claude", mode: "all")
791
+ # Runs until all stories and bugs are complete
792
+ ```
793
+
794
+ **Sequential delegation:**
795
+ ```
796
+ delegate_story(agent: "codex") # Blocks until the story finishes (background defaults to false)
797
+ delegate_story(agent: "gemini") # Then delegate the next available story
798
+ ```
799
+
800
+ **Parallel execution:**
801
+ ```
802
+ delegate_story(agent: "codex", storyId: "US-001", background: true)
803
+ delegate_story(agent: "goose", storyId: "US-002", background: true)
804
+ # Work on US-003 yourself while they run in parallel
805
+ # Monitor with get_agent_status or list_running_agents
806
+ ```
807
+
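Monitoring background agents usually comes down to polling until every PID has exited. The sketch below assumes a `get_status` callable that wraps `get_agent_status` and returns a dict with `running` and `exitCode` keys. That shape is an assumption made for illustration, not the tool's documented return format.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_agents(
    pids: Iterable[int],
    get_status: Callable[[int], dict],   # hypothetical wrapper around get_agent_status
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, int]:
    """Poll each PID until all agents have exited; return their exit codes."""
    pending = set(pids)
    exit_codes: Dict[int, int] = {}
    while pending:
        for pid in sorted(pending):
            status = get_status(pid)
            if not status["running"]:          # assumed field name
                exit_codes[pid] = status["exitCode"]  # assumed field name
        pending.difference_update(exit_codes)
        if pending:
            sleep(poll_seconds)
    return exit_codes
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.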
808
+ **Research then implement:**
809
+ ```
810
+ delegate_research(task: "Analyze auth patterns and recommend approach")
811
+ # Use findings to inform implementation
812
+ delegate_story(agent: "codex", storyId: "US-001")
813
+ ```
814
+
815
+ ## NightShift Daemon (Continuous Orchestration)
816
+
817
+ For fully automated, event-driven orchestration, run the NightShift daemon:
818
+
819
+ ```bash
820
+ # Start the daemon
821
+ nightshift-daemon
822
+
823
+ # With options
824
+ nightshift-daemon --verbose --max-agents 4 --health-check 1m
825
+
826
+ # Preview mode (see what would happen)
827
+ nightshift-daemon --dry-run --verbose
828
+ ```
829
+
830
+ ### How It Works
831
+
832
+ The daemon provides hands-off multi-agent orchestration:
833
+
834
+ 1. **Event-Driven**: Watches `prd.json` and `chat.txt` for changes
835
+ 2. **Auto-Spawning**: Spawns agents for orphaned stories (up to concurrency limit)
836
+ 3. **Failover Handling**: Automatically claims and reassigns failover requests
837
+ 4. **Smart Retry**: Tracks failed agents per story and tries a different agent on retry
838
+ 5. **Health Checks**: Periodic reconciliation as a fallback (default: every 2 min)
839
+ 6. **Poison Pill Protection**: Quarantines stories that fail repeatedly
840
+ 7. **Stuck Detection**: Kills agents that haven't reported activity
841
+
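Smart retry and poison-pill protection (items 4 and 6) can be sketched as a single selection function. Everything here is illustrative: `pick_agent`, the `failed` mapping, and the `MAX_FAILURES` threshold of 3 are assumptions, since the source does not state how many failures trigger quarantine.

```python
from typing import Dict, List, Optional

MAX_FAILURES = 3  # hypothetical quarantine threshold; the real value is not documented here


def pick_agent(
    story_id: str,
    agents: List[str],
    failed: Dict[str, List[str]],  # story ID -> agents that already failed on it
) -> Optional[str]:
    """Return an agent that has not failed this story, or None if it is quarantined."""
    attempts = failed.get(story_id, [])
    if len(attempts) >= MAX_FAILURES:
        return None  # poison pill: stop retrying this story
    tried = set(attempts)
    for agent in agents:
        if agent not in tried:
            return agent
    # Every agent has failed at least once but the story is under the threshold.
    return agents[0] if agents else None
```

For example, with `failed = {"US-001": ["codex"]}` and agents `["codex", "gemini", "goose"]`, the function skips `codex` and selects `gemini`.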
842
+ ### Options
843
+
844
+ | Flag | Description | Default |
845
+ |------|-------------|---------|
846
+ | `--verbose, -v` | Enable debug logging | false |
847
+ | `--dry-run` | Show actions without spawning | false |
848
+ | `--health-check <N>` | Health check interval (e.g., "2m", "30s") | 2m |
849
+ | `--max-agents <N>` | Max concurrent agents | 3 |
850
+
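The `--health-check` values shown above ("2m", "30s") suggest a number followed by a unit suffix. A minimal parser for that shape might look like the following. It assumes only the `s` and `m` suffixes that appear in this README. The daemon may accept other forms.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60}  # only the suffixes shown in the README


def parse_interval(text: str) -> int:
    """Convert a string such as "2m" or "30s" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([sm])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
```

Rejecting anything outside the known pattern avoids silently treating a typo as a zero-second interval.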
851
+ ### Environment
852
+
853
+ - `ROBOT_CHAT_PROJECT_PATH` - Project directory (default: current directory)
854
+
855
+ ### Architecture
856
+
857
+ ```
858
+ ┌─────────────────────────────────────────────────────────────┐
859
+ │ NightShift Daemon │
860
+ ├─────────────────────────────────────────────────────────────┤
861
+ │ │
862
+ │ ┌──────────────────────────────────────────────────┐ │
863
+ │ │ File Watchers (Primary) │ │
864
+ │ │ • prd.json changes → reconcile │ │
865
+ │ │ • chat.txt changes → check failovers │ │
866
+ │ └──────────────────────────────────────────────────┘ │
867
+ │ │ │
868
+ │ ▼ │
869
+ │ ┌──────────────────────────────────────────────────┐ │
870
+ │ │ Reconciliation Engine │ │
871
+ │ │ • Find orphaned stories │ │
872
+ │ │ • Spawn agents (up to max concurrency) │ │
873
+ │ │ • Handle failovers │ │
874
+ │ │ • Quarantine poison pills │ │
875
+ │ └──────────────────────────────────────────────────┘ │
876
+ │ │ │
877
+ │ ▼ │
878
+ │ ┌──────────────────────────────────────────────────┐ │
879
+ │ │ Health Check (Fallback) │ │
880
+ │ │ • Runs every 2 minutes │ │
881
+ │ │ • Detects stuck agents │ │
882
+ │ │ • Restarts watchers if needed │ │
883
+ │ │ • Reconciles state │ │
884
+ │ └──────────────────────────────────────────────────┘ │
885
+ │ │
886
+ └─────────────────────────────────────────────────────────────┘
887
+ ```
888
+
889
+ ## Local Models via Ollama
890
+
891
+ NightShift supports local Ollama models through two harnesses:
892
+
893
+ ### Goose + Ollama (Recommended for tool use)
894
+
895
+ [Goose](https://github.com/block/goose) has its own tool-calling implementation that works reliably with local models. This is the recommended path for local agent work.
896
+
897
+ ```bash
898
+ # Install Goose CLI
899
+ curl -fsSL https://github.com/block/goose/releases/latest/download/install.sh | bash
900
+
901
+ # Pull a model (assumes Ollama is already installed)
902
+ ollama pull qwen3.5:4b
903
+
904
+ # Configure nightshift to use Goose with Ollama
905
+ export NIGHTSHIFT_GOOSE_PROVIDER=ollama
906
+ export NIGHTSHIFT_GOOSE_MODEL=qwen3.5:4b
907
+ ```
908
+
909
+ Then use `goose` as your agent in nightshift:
910
+ ```
911
+ spawn_agent(agent: "goose", prompt: "Fix the pagination bug in src/api.ts")
912
+ delegate_research(agent: "goose", task: "Analyze error handling patterns")
913
+ ```
914
+
915
+ **Recommended models** (by hardware):
916
+
917
+ | GPU VRAM | Model | Size | Notes |
918
+ |----------|-------|------|-------|
919
+ | 4GB+ | `qwen3.5:4b` | 3.4 GB | Fast, good tool use |
920
+ | 6GB+ | `qwen3.5:4b-q8_0` | 5.3 GB | Better accuracy, same speed |
921
+ | 8GB+ | `qwen3.5:9b` | 6.6 GB | Best quality, slower on consumer GPUs |
922
+
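The table above can be read as a lookup from available VRAM to a model tag. The helper below is a hypothetical convenience, not part of NightShift. It encodes the three rows and returns the largest model whose minimum VRAM fits.

```python
from typing import Optional

# (minimum VRAM in GB, model tag), taken from the table above, largest first.
MODELS = [
    (8, "qwen3.5:9b"),
    (6, "qwen3.5:4b-q8_0"),
    (4, "qwen3.5:4b"),
]


def recommend_model(vram_gb: float) -> Optional[str]:
    """Return the largest listed model that fits, or None below 4 GB."""
    for min_vram, model in MODELS:
        if vram_gb >= min_vram:
            return model
    return None
```

Picking the largest fit favors quality over speed. Since the table notes that the 9b model is slower on consumer GPUs, a smaller tag may still be the better choice for latency-sensitive work.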
923
+ ### Claude Code + Ollama (Text generation only)
924
+
925
+ For tasks that don't require tool use (summarization, code review, planning):
926
+
927
+ ```bash
928
+ export NIGHTSHIFT_OLLAMA_MODEL=qwen3.5:4b # or any Ollama model
929
+ ```
930
+
931
+ Then use `ollama` as your agent:
932
+ ```
933
+ spawn_agent(agent: "ollama", prompt: "Review this PR for security issues")
934
+ delegate_research(agent: "ollama", task: "Summarize the authentication patterns")
935
+ ```
936
+
937
+ This uses Claude Code's harness with Ollama's Anthropic-compatible API. Text generation works well, but local models don't reliably trigger Claude Code's structured tool calls.
938
+
939
+ ### Benchmarking Local Models
940
+
941
+ A benchmark suite is included to test which models work on your hardware:
942
+
943
+ ```bash
944
+ # Test all tasks with goose + a specific model
945
+ node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:4b
946
+
947
+ # Test only text-level tasks (fast sanity check)
948
+ node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:4b --level text
949
+
950
+ # Compare models
951
+ node benchmarks/run-experiment.mjs --agent goose --model qwen3.5:9b
952
+ ```
953
+
954
+ Results are saved to `benchmarks/results/` for comparison across runs.
955
+
956
+ ## Multi-Agent Tips
957
+
958
+ 1. **Same directory**: All agents must run in the same project directory to share chat
959
+ 2. **Claim before working**: Always claim stories to prevent duplicate work
960
+ 3. **Post status updates**: Keep other agents informed of progress
961
+ 4. **Store context, not just progress**: Use `store_context` to share learnings by topic — other agents can query for exactly what they need instead of reading a giant progress file
962
+ 5. **Handle failovers**: Check for and claim failovers at the start of each session
963
+ 6. **Use delegation**: One orchestrating agent can spawn others for parallel work
964
+ 7. **Monitor background agents**: Use `get_agent_status` and `list_running_agents` to track spawned agents
965
+ 8. **Use `orchestrate` for full autonomy**: The `orchestrate` tool handles the entire claim→implement→complete loop
966
+ 9. **Review traces after runs**: Use `get_trace(tree: true)` to understand what happened during orchestration
967
+ 10. **Add `.robot-chat/` to your project's `.gitignore`**: Chat logs, context, and traces are ephemeral and shouldn't be committed
968
+
969
+ ## License
970
+
971
+ MIT