sisyphi 1.2.1 → 1.2.11

Files changed (87)
  1. package/README.md +20 -20
  2. package/dist/cli.js +12461 -11237
  3. package/dist/cli.js.map +1 -1
  4. package/dist/daemon.js +1112 -564
  5. package/dist/daemon.js.map +1 -1
  6. package/dist/templates/agent-plugin/agents/CLAUDE.md +2 -2
  7. package/dist/templates/agent-plugin/agents/implementor.md +3 -2
  8. package/dist/templates/agent-plugin/agents/operator.md +3 -4
  9. package/dist/templates/agent-plugin/agents/plan.md +1 -1
  10. package/dist/templates/agent-plugin/agents/problem.md +20 -20
  11. package/dist/templates/agent-plugin/agents/research-lead.md +1 -1
  12. package/dist/templates/agent-plugin/agents/spec/engineer.md +9 -7
  13. package/dist/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  14. package/dist/templates/agent-plugin/agents/spec.md +31 -25
  15. package/dist/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  16. package/dist/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  17. package/dist/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  18. package/dist/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  19. package/dist/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  20. package/dist/templates/agent-plugin/hooks/require-submit.sh +1 -1
  21. package/dist/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  22. package/dist/templates/agent-suffix.md +4 -18
  23. package/dist/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  24. package/dist/templates/dashboard-claude.md +15 -13
  25. package/dist/templates/orchestrator-base.md +44 -78
  26. package/dist/templates/orchestrator-completion.md +9 -11
  27. package/dist/templates/orchestrator-discovery.md +8 -8
  28. package/dist/templates/orchestrator-impl.md +6 -7
  29. package/dist/templates/orchestrator-planning.md +2 -2
  30. package/dist/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  31. package/dist/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  32. package/dist/templates/orchestrator-validation.md +1 -3
  33. package/dist/templates/termrender-haiku-system.md +5 -3
  34. package/dist/tui.js +1817 -1400
  35. package/dist/tui.js.map +1 -1
  36. package/native/build-notify.sh +2 -2
  37. package/package.json +3 -3
  38. package/templates/agent-plugin/agents/CLAUDE.md +2 -2
  39. package/templates/agent-plugin/agents/implementor.md +3 -2
  40. package/templates/agent-plugin/agents/operator.md +3 -4
  41. package/templates/agent-plugin/agents/plan.md +1 -1
  42. package/templates/agent-plugin/agents/problem.md +20 -20
  43. package/templates/agent-plugin/agents/research-lead.md +1 -1
  44. package/templates/agent-plugin/agents/spec/engineer.md +9 -7
  45. package/templates/agent-plugin/agents/spec/requirements-writer.md +1 -1
  46. package/templates/agent-plugin/agents/spec.md +31 -25
  47. package/templates/agent-plugin/hooks/CLAUDE.md +0 -1
  48. package/templates/agent-plugin/hooks/ask-background-guard.sh +11 -11
  49. package/templates/agent-plugin/hooks/intercept-send-message.sh +1 -1
  50. package/templates/agent-plugin/hooks/operator-user-prompt.sh +2 -2
  51. package/templates/agent-plugin/hooks/plan-validate.sh +3 -3
  52. package/templates/agent-plugin/hooks/require-submit.sh +1 -1
  53. package/templates/agent-plugin/skills/operator/SKILL.md +1 -1
  54. package/templates/agent-suffix.md +4 -18
  55. package/templates/companion-plugin/hooks/user-prompt-context.sh +1 -1
  56. package/templates/dashboard-claude.md +15 -13
  57. package/templates/orchestrator-base.md +44 -78
  58. package/templates/orchestrator-completion.md +9 -11
  59. package/templates/orchestrator-discovery.md +8 -8
  60. package/templates/orchestrator-impl.md +6 -7
  61. package/templates/orchestrator-planning.md +2 -2
  62. package/templates/orchestrator-plugin/commands/sisyphus/scratch.md +1 -1
  63. package/templates/orchestrator-plugin/commands/sisyphus/strategize.md +2 -2
  64. package/templates/orchestrator-validation.md +1 -3
  65. package/templates/termrender-haiku-system.md +5 -3
  66. package/dist/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  67. package/dist/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  68. package/dist/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  69. package/dist/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  70. package/dist/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  71. package/dist/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  72. package/dist/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  73. package/dist/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  74. package/dist/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  75. package/dist/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  76. package/dist/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
  77. package/templates/agent-plugin/skills/humanloop/SKILL.md +0 -148
  78. package/templates/agent-plugin/skills/operator-memory/SKILL.md +0 -64
  79. package/templates/agent-plugin/skills/perspective-fanout/SKILL.md +0 -115
  80. package/templates/agent-plugin/skills/problem-document/SKILL.md +0 -105
  81. package/templates/agent-plugin/skills/problem-plateau-breakers/SKILL.md +0 -83
  82. package/templates/orchestrator-plugin/skills/humanloop/SKILL.md +0 -150
  83. package/templates/orchestrator-plugin/skills/orchestration/CLAUDE.md +0 -1
  84. package/templates/orchestrator-plugin/skills/orchestration/SKILL.md +0 -29
  85. package/templates/orchestrator-plugin/skills/orchestration/strategy.md +0 -160
  86. package/templates/orchestrator-plugin/skills/orchestration/task-patterns.md +0 -266
  87. package/templates/orchestrator-plugin/skills/orchestration/workflow-examples.md +0 -428
@@ -1,428 +0,0 @@
1
- # Workflow Examples
2
-
3
- End-to-end examples showing how the orchestrator structures cycles for real scenarios.
4
-
5
- ### Path conventions in these examples
6
-
7
- Plan files live under per-plan-lead subdirectories: `context/{plan-lead-agent-id}/plan-*.md`. These examples elide the subdir (showing `context/plan-rate-limiting.md`) for readability. In a real cycle, the orchestrator reads the exact path from the plan lead's submission report and carries it verbatim into downstream implement, review-plan, and validate agent prompts.
8
-
9
- ---
10
-
11
- ## Example 4: Wrapper-Shaped Config Migration (LOW effort — 5 files, mechanical)
12
-
13
- **Starting task**: "All config access goes through `process.env` directly — migrate to a `getConfig()` wrapper already defined in `src/config.ts`"
14
-
15
- **Effort tier**: LOW. Every change is a call-site swap onto an existing handler. No new behavior.
16
-
17
- ### Cycle 1 — Plan
18
- ```
19
- roadmap.md:
20
- ## Refactor: Migrate env access to getConfig()
21
-
22
- - [ ] Plan migration — enumerate all process.env call sites
23
- - [ ] Update call sites to use getConfig()
24
- - [ ] Validate — no direct process.env access remains; tests pass
25
-
26
- Agents spawned:
27
- plan agent → "Enumerate every direct process.env access in src/. Map each call site
28
- to the matching getConfig() key. Output a migration checklist. Files expected:
29
- src/api/server.ts, src/db/connection.ts, src/queue/worker.ts,
30
- src/cli/commands/start.ts, src/config.ts (source of truth — do not modify)."
31
- ```
32
-
33
- ### Cycle 2 — Implement
34
- ```
35
- Plan complete. 23 call sites across 4 files.
36
-
37
- Agents spawned:
38
- implement agent → "Execute migration plan at context/{plan-agent-id}/plan-config-migration.md.
39
- Replace every process.env.X access with getConfig('X'). Do not modify src/config.ts.
40
- Do not add error handling — getConfig() already throws on missing keys."
41
- ```
42
-
43
- ### Cycle 3 — Validate + complete
44
- ```
45
- Implementation complete.
46
-
47
- Agents spawned:
48
- validate agent → "Verify migration: grep for remaining process.env access in src/ (excluding
49
- src/config.ts). Run existing tests. Confirm zero direct env reads outside config.ts."
50
-
51
- Validation: PASS. Complete — "All env access routed through getConfig()."
52
- ```
53
-
54
- **Pipeline shape**: `plan → implement → validate`. 3 cycles. No `sisyphus:spec`, no `sisyphus:test-spec`, no `sisyphus:review-plan`.
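
The validate step in Cycle 3 boils down to a search for leftover `process.env` reads outside the config wrapper. A minimal Python sketch of that check follows, assuming a tree of `.ts` files under the scanned root. The function name `find_direct_env_reads`, the regex, and the default exclude list are illustrative assumptions, not part of sisyphus.

```python
import re
from pathlib import Path

# Matches direct reads such as process.env.PORT or process.env["PORT"].
ENV_READ = re.compile(r"\bprocess\.env\b")


def find_direct_env_reads(root, exclude=("config.ts",)):
    """Return (relative path, line number, line) for each direct process.env read.

    `exclude` lists paths relative to `root` that may read the environment
    (hypothetical default: the config wrapper itself).
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*.ts")):
        rel = path.relative_to(root).as_posix()
        if rel in exclude:
            continue
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), 1):
            if ENV_READ.search(line):
                hits.append((rel, lineno, line.strip()))
    return hits
```

An empty result corresponds to the "zero direct env reads outside config.ts" condition; the existing test suite still has to be run separately.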
55
-
56
- ---
57
-
58
- ## Example 5: New Subsystem — Distributed Task Queue (HIGH effort)
59
-
60
- **Starting task**: "Add a persistent task queue so long-running jobs survive server restarts. Include test coverage of the survival, retry, and concurrency invariants."
61
-
62
- **Effort tier**: HIGH. New subsystem, new protocol (worker ↔ queue contract), cross-domain orchestration (API + storage + worker process). The prompt explicitly asks for test coverage — `sisyphus:test-spec` is justified at Cycle 2.
63
-
64
- ### Cycle 0 — Problem exploration
65
- ```
66
- roadmap.md:
67
- ## Feature: Persistent Task Queue
68
-
69
- - [ ] Explore current job execution patterns and constraints
70
- - [ ] Spec — requirements + architecture
71
- - [ ] Plan implementation (staged outline)
72
- - [ ] Spec behavioral properties (test-spec) — user asked for tests in the prompt
73
- ...
74
-
75
- Agents spawned:
76
- explore agent → "Map current job execution in src/jobs/. Identify what needs to survive
77
- restarts, current storage backends, worker process lifecycle."
78
- problem agent → "Explore design space for persistent task queue. Questions: push vs pull
79
- worker model, at-least-once vs exactly-once semantics, failure/retry policy, storage
80
- backend options (Redis, Postgres, SQLite)."
81
- ```
82
-
83
- ### Cycle 1 — Spec (human iterates)
84
- ```
85
- Agents spawned:
86
- sisyphus:spec → "Run spec session for persistent task queue.
87
- Context in context/problem-task-queue.md and context/explore-task-queue.md."
88
-
89
- Human iterates. Spec outputs:
90
- context/requirements-task-queue.md — acceptance criteria, failure semantics
91
- context/design-task-queue.md — Redis-backed queue, pull workers, at-least-once delivery
92
- ```
93
-
94
- ### Cycle 2 — High-level plan + test-spec (parallel)
95
- ```
96
- Agents spawned (parallel):
97
- plan agent → "Create high-level stage outline from context/requirements-task-queue.md
98
- and context/design-task-queue.md. Stages: (1) queue storage layer, (2) producer API,
99
- (3) worker consumer, (4) integration + retry logic. Cycle estimates per stage."
100
- test-spec agent → "Define behavioral properties: job survives server restart, failed
101
- jobs retry up to N times, concurrent workers don't double-execute the same job."
102
- ```
103
-
104
- If the original prompt had been silent on tests, the test-spec spawn would be omitted and Cycle 2 would be plan-only — Cycle 3 would then proceed straight to detail-planning stage 1.
105
-
106
- ### Cycles 3–9 — Staged implementation with critique + validation checkpoints
107
- ```
108
- Follows Feature Build Large pattern:
109
- Cycle 3: detail-plan stage 1 + implement stage 1
110
- Cycle 4: implement stage 2; detail-plan stage 3 in parallel
111
- Cycle 5: critique stages 1-2 (foundation review before worker builds on it)
112
- Cycle 6: address critique + implement stage 3
113
- Cycle 7: implement stage 4 (integration + retry); validate stages 3-4
114
- Cycle 8: sis orch yield --mode validation — e2e: enqueue job, kill server, restart,
115
- confirm job ran exactly once
116
- Cycle 9: final review agent; complete
117
- ```
118
-
119
- **Pipeline shape**: Full HIGH pipeline — `problem → spec → plan (+ test-spec because the prompt asked for tests) → staged implement → critique → validate → review`. 9+ cycles. Without an explicit test request in the prompt, the parallel `test-spec` would be omitted and Cycle 2 would be plan-only.
120
-
121
- ---
122
-
123
- ## Example 1: Fix a Race Condition in WebSocket Reconnection
124
-
125
- **Starting task**: "WebSocket connections sometimes drop messages during reconnection"
126
-
127
- ### Cycle 1 — Diagnosis
128
- ```
129
- roadmap.md:
130
- ## Bug Fix: WebSocket message loss during reconnection
131
-
132
- - [ ] Diagnose message loss during WebSocket reconnection
133
- - [ ] Fix root cause
134
- - [ ] Validate fix under concurrent load
135
- - [ ] Review fix
136
-
137
- Agents spawned:
138
- debug agent → "Diagnose why WebSocket messages are lost during reconnection.
139
- Focus on src/ws/client.ts and src/ws/reconnect.ts.
140
- Check for race conditions between disconnect handler and message queue.
141
- Look at git blame for recent changes to reconnection logic."
142
- ```
143
-
144
- ### Cycle 2 — Read diagnosis, plan fix
145
- ```
146
- Agent report: "Root cause: reconnect() clears the message queue before the new socket
147
- is ready. Messages sent between disconnect and reconnect are dropped.
148
- Line 47 of src/ws/reconnect.ts — queue.clear() is called in onDisconnect
149
- but should be deferred until onReconnect confirms the new socket is live.
150
- Confidence: High."
151
-
152
- roadmap.md updated:
153
- - [x] ~~Diagnose message loss during WebSocket reconnection~~
154
- - [ ] Fix root cause — queue.clear() at L47 of reconnect.ts runs too early
155
- - [ ] Validate fix under concurrent load
156
- - [ ] Review fix
157
-
158
- Agents spawned:
159
- implement agent → "Fix race condition in src/ws/reconnect.ts.
160
- Root cause: queue.clear() at line 47 runs on disconnect, dropping pending messages.
161
- Fix: defer queue drain until reconnect confirms new socket. Buffer messages during
162
- reconnection window. See existing buffer pattern in src/ws/buffer.ts."
163
- ```
164
-
165
- ### Cycle 3 — Validate
166
- ```
167
- Agent report: "Fixed. Moved queue.clear() to onReconnect callback after socket.readyState
168
- confirms OPEN. Added message buffer during reconnection window. 2 files changed."
169
-
170
- roadmap.md updated:
171
- - [x] ~~Diagnose message loss~~
172
- - [x] ~~Fix root cause~~
173
- - [ ] Validate fix under concurrent load
174
- - [ ] Review fix
175
-
176
- Agents spawned:
177
- validate agent → "Verify WebSocket reconnection fix. Run existing tests in
178
- src/__tests__/ws/. Manually verify: disconnect simulation should not lose queued
179
- messages. Check that buffer drains correctly on reconnect."
180
- ```
181
-
182
- ### Cycle 4 — Review and complete
183
- ```
184
- Validation: PASS. Existing tests pass, reconnection preserves messages.
185
-
186
- roadmap.md updated:
187
- - [x] ~~Diagnose~~, [x] ~~Fix~~, [x] ~~Validate~~
188
- - [ ] Review fix
189
-
190
- Agents spawned:
191
- review agent → "Review changes in src/ws/reconnect.ts and src/ws/buffer.ts.
192
- Focus on: thread safety of the new buffer, memory leak potential if reconnect
193
- never succeeds, edge case of rapid disconnect/reconnect cycles."
194
- ```
195
-
196
- ### Cycle 5 — Complete
197
- ```
198
- Review: Clean. One medium finding — buffer should have a max size to prevent
199
- memory issues if reconnect fails indefinitely. Not blocking.
200
-
201
- All roadmap.md items complete.
202
-
203
- Action: complete — "Fixed WebSocket message loss during reconnection. Messages now
204
- buffered during reconnection window and drained on successful reconnect."
205
- ```
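
The fix described in Cycles 2 and 3 (keep queued messages across a disconnect, drain them only once the new socket is confirmed open) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the code from `src/ws/`: the class name `BufferedSender`, the callable-based transport, and the drop-oldest policy for the bounded buffer (prompted by the review finding in Cycle 5) are all hypothetical.

```python
from collections import deque


class BufferedSender:
    """Buffers outgoing messages while disconnected and drains them on reconnect."""

    def __init__(self, max_buffered=1000):
        self._send = None  # callable for the live socket, None while disconnected
        # Bounded so a reconnect that never succeeds cannot grow memory forever;
        # when full, the oldest message is dropped (an assumed policy).
        self._pending = deque(maxlen=max_buffered)

    def on_disconnect(self):
        # Keep pending messages: clearing them here is the bug described above.
        self._send = None

    def on_reconnect(self, send):
        # Call only after the new socket is confirmed open.
        self._send = send
        while self._pending:
            self._send(self._pending.popleft())

    def send(self, message):
        if self._send is None:
            self._pending.append(message)
        else:
            self._send(message)
```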
206
-
207
- ---
208
-
209
- ## Example 2: Add API Rate Limiting
210
-
211
- **Starting task**: "Add rate limiting to the REST API — per-user, configurable limits, with tests for the limit-enforcement and 429 response behavior"
212
-
213
- ### Cycle 1 — Problem exploration
214
- ```
215
- roadmap.md:
216
- ## Feature: API Rate Limiting
217
-
218
- ### Requirements & Design
219
- - [ ] Problem exploration — understand rate limiting needs
220
- - [ ] Requirements — define acceptance criteria
221
- - [ ] Design — architecture for rate limiting
222
- - [ ] Plan implementation
223
- - [ ] Review plan
224
-
225
- ### Implementation
226
- - [ ] Implement rate limiting middleware
227
- - [ ] Implement rate limit configuration
228
- - [ ] Implement rate limit headers and error responses
229
-
230
- ### Validation
231
- - [ ] Validate implementation
232
- - [ ] Review implementation
233
-
234
- Agents spawned:
235
- problem agent → "Explore the codebase and understand the API rate limiting landscape.
236
- Check existing middleware patterns in src/api/middleware/.
237
- Questions to explore: current request handling, existing auth/middleware chain,
238
- what storage backends are available (Redis?), user identification mechanisms."
239
- ```
240
-
241
- ### Cycle 2 — Spec (after human iterates on problem)
242
- ```
243
- Agent report: "Problem document saved to context/problem-rate-limiting.md.
244
- Current middleware chain uses Express middleware pattern. Redis is already in stack.
245
- Users are identified by JWT sub claim. No existing rate limiting."
246
-
247
- roadmap.md updated:
248
- - [x] ~~Problem exploration~~
249
- - [ ] Spec — define acceptance criteria and architecture
250
- ...
251
-
252
- Agents spawned:
253
- sisyphus:spec → "Run a spec session for per-user API rate limiting. Read context/problem-rate-limiting.md for context."
254
-
255
- Later report: "Spec completed.
256
- Requirements saved to context/requirements-rate-limiting.md.
257
- Design saved to context/design-rate-limiting.md.
258
- Covers: per-user limits, endpoint-specific overrides, 429 response format,
259
- Retry-After headers, and a Redis-backed sliding window approach."
260
- ```
261
-
262
- ### Cycle 3 — Plan (after human reviews spec)
263
- ```
264
- Agent report: "Spec outputs approved.
265
- Approach: Redis-backed sliding window middleware. Per-user with endpoint-specific
266
- overrides. Standard 429 response with Retry-After header. Config via environment variables."
267
-
268
- roadmap.md updated:
269
- - [x] ~~Problem exploration~~, [x] ~~Spec~~
270
- - [ ] Plan implementation
271
- ...
272
-
273
- Agents spawned:
274
- plan agent → "Create implementation plan from context/requirements-rate-limiting.md
275
- and context/design-rate-limiting.md"
276
- test-spec agent → "Define behavioral properties for rate limiting from
277
- context/requirements-rate-limiting.md"
278
- ```
279
-
280
- ### Cycle 4 — Review plan
281
- ```
282
- Both agents complete. Plan at context/plan-rate-limiting.md.
283
- Plan has 3 phases: middleware, config, response format.
284
-
285
- Agents spawned:
286
- review-plan agent → "Validate plan at context/plan-rate-limiting.md
287
- against context/requirements-rate-limiting.md and context/design-rate-limiting.md"
288
- ```
289
-
290
- ### Cycle 5 — Implement phases 1+2 (parallel, low-risk foundation)
291
- ```
292
- Plan review: PASS.
293
-
294
- roadmap.md updated (plan review done, starting implementation):
295
- - [x] ~~Spec~~, [x] ~~Plan~~, [x] ~~Review plan~~
296
- - [ ] Implement rate limiting middleware
297
- - [ ] Implement rate limit configuration
298
- - [ ] Critique phases 1-2 — review before integration phase
299
- - [ ] Implement rate limit headers and error responses
300
- - [ ] Validate — smoketest rate limiting end-to-end
301
- - [ ] Final review
302
-
303
- Agents spawned (parallel — phases touch different files):
304
- implement agent → "Implement Phase 1 from context/plan-rate-limiting.md —
305
- rate limiting middleware in src/api/middleware/rate-limit.ts"
306
- implement agent → "Implement Phase 2 from context/plan-rate-limiting.md —
307
- rate limit configuration in src/config/rate-limits.ts"
308
- ```
309
-
310
- ### Cycle 6 — Critique before integration builds on top
311
- ```
312
- Both implementation agents complete.
313
-
314
- Why critique now: Phase 3 (headers/error responses) integrates the middleware and
315
- config — if the foundation has issues, they'll cascade. Cheaper to catch now.
316
-
317
- roadmap.md updated:
318
- - [x] ~~Implement middleware~~, [x] ~~Implement config~~
319
- - [ ] Critique phases 1-2
320
- ...
321
-
322
- Agents spawned:
323
- review agent → "Review rate limiting middleware and config implementation.
324
- Focus on: Redis connection handling, sliding window correctness,
325
- config schema matches what middleware expects."
326
- ```
327
-
328
- ### Cycle 7 — Implement phase 3 + address critique
329
- ```
330
- Review: 2 findings — middleware doesn't handle Redis connection failure gracefully,
331
- config schema allows negative rate limits.
332
-
333
- Agents spawned (parallel):
334
- implement agent → "Fix review findings in reports/agent-008-final.md for
335
- rate limiting middleware and config."
336
- implement agent → "Implement Phase 3 from context/plan-rate-limiting.md —
337
- rate limit headers and 429 error responses in src/api/middleware/rate-limit.ts"
338
- ```
339
-
340
- ### Cycle 8 — Validate end-to-end
341
- ```
342
- Phase 3 and fixes complete.
343
-
344
- Why validate now: all three phases are done and integrated. This is the checkpoint
345
- before calling it complete — verify it actually works, not just compiles.
346
-
347
- Agents spawned:
348
- validate agent → "Verify rate limiting end-to-end: start server, send requests
349
- exceeding limits, confirm 429 responses with correct Retry-After headers.
350
- Test per-user isolation, endpoint-specific overrides, Redis failover behavior."
351
- ```
352
-
353
- ### Cycle 10 — Complete
354
- ```
355
- Validation: PASS. Final review agent confirms no issues.
356
- Complete — "Added per-user API rate limiting with Redis-backed sliding window,
357
- configurable per-endpoint limits, and graceful Redis failover."
358
- ```
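
The spec in this example settles on a per-user sliding window with a 429 response and a `Retry-After` value. The sketch below shows that window logic in Python, with an in-process dict standing in for Redis (a deliberate simplification, not the design in the example). The class name `SlidingWindowLimiter` and its method are hypothetical, and the clock is passed in so the behavior is deterministic.

```python
import math
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds):
        # Rejecting non-positive values mirrors the review finding about
        # negative rate limits slipping through the config schema.
        if limit <= 0 or window_seconds <= 0:
            raise ValueError("limit and window_seconds must be positive")
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # user id -> timestamps of accepted requests

    def check(self, user_id, now):
        """Return (status, retry_after_seconds) for a request at time `now`."""
        hits = self._hits[user_id]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return 200, 0
        # The oldest accepted request leaves the window at hits[0] + window.
        retry_after = math.ceil(hits[0] + self.window - now)
        return 429, retry_after
```

Each user gets an independent window, which is the per-user isolation the validate agent is asked to check.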
359
-
360
- ---
361
-
362
- ## Example 3: Refactor Authentication Module
363
-
364
- **Starting task**: "Refactor auth — extract token logic from route handlers into dedicated service"
365
-
366
- ### Cycle 1 — Plan + baseline
367
- ```
368
- roadmap.md:
369
- ## Refactor: Extract Token Service
370
-
371
- - [ ] Plan auth refactor — extract token service
372
- - [ ] Capture behavioral baseline (run all auth tests)
373
- - [ ] Create TokenService class with extracted logic
374
- - [ ] Update route handlers to use TokenService
375
- - [ ] Update tests to use new service interface
376
- - [ ] Validate all auth tests still pass
377
- - [ ] Review for dead code and missed references
378
-
379
- Agents spawned (parallel):
380
- plan agent → "Plan refactor: extract token creation, validation, and refresh
381
- logic from src/api/routes/auth.ts into a new src/services/token-service.ts.
382
- Map all token-related functions, their callers, and the extraction plan."
383
- validate agent → "Run all tests in src/__tests__/auth/ and record results.
384
- This is the behavioral baseline — these must all pass after refactor."
385
- ```
386
-
387
- ### Cycle 2 — Extract (serial — must happen before consumer updates)
388
- ```
389
- Plan complete, baseline captured (47 tests passing).
390
-
391
- roadmap.md updated:
392
- - [x] ~~Plan auth refactor~~
393
- - [x] ~~Capture behavioral baseline~~ (47 tests passing)
394
- - [ ] Create TokenService class with extracted logic
395
- ...
396
-
397
- Agents spawned:
398
- implement agent → "Execute Phase 1 of refactor plan: create TokenService class
399
- at src/services/token-service.ts. Extract validateToken, createToken, refreshToken
400
- from src/api/routes/auth.ts. Export the class. Do NOT modify route handlers yet."
401
- ```
402
-
403
- ### Cycle 3 — Update consumers (parallel where possible)
404
- ```
405
- TokenService created.
406
-
407
- Agents spawned:
408
- implement agent → "Update route handlers in src/api/routes/auth.ts to import
409
- and use TokenService instead of inline token logic. Remove extracted functions."
410
- implement agent → "Update tests in src/__tests__/auth/ to use TokenService
411
- where they directly tested extracted functions."
412
- ```
413
-
414
- ### Cycle 4 — Validate + review
415
- ```
416
- Agents spawned (parallel):
417
- validate agent → "Run all auth tests. Compare against baseline of 47 passing.
418
- Every test must still pass."
419
- review agent → "Review src/api/routes/auth.ts and src/services/token-service.ts.
420
- Check for: dead code left behind, missed references to old functions, broken imports."
421
- ```
422
-
423
- ### Cycle 5 — Complete
424
- ```
425
- All 47 tests passing. Review clean.
426
- All roadmap.md items complete.
427
- Complete — "Extracted token logic into TokenService. All existing tests pass."
428
- ```
@@ -1,148 +0,0 @@
1
- ---
2
- name: humanloop
3
- description: >
4
- Read before calling `sis ask`. Triggers when surfacing multiple questions or decisions to the user, presenting work for review/sign-off, or proposing concrete alternatives. Covers when a deck beats chat, how to design options as real forks the user can pick between, how to bundle related questions into one deck, and how to submit via the Bash tool's `run_in_background` so you can end your turn while the user takes their time answering.
5
- ---
6
-
7
- # Talking to the user via decks
8
-
9
- `sis ask` posts a structured deck of questions to the user's dashboard inbox. They walk through it on their own time and you read structured JSON back. Use it instead of dumping a wall of questions into chat.
10
-
11
- This skill covers **what to put in a deck** and **how to invoke it**. Run `sis ask -h` for the CLI shape (file path, `--session`, the `poll` and `peek` subcommands).
12
-
13
- ## Reach for a deck when
14
-
15
- - You have **2+ questions** to surface in one beat (bundle them into one deck).
16
- - You're presenting **work for review or sign-off** (a design, a plan, a completion summary).
17
- - You're choosing between **concrete alternatives** the user must pick.
18
- - The work will sit while the user thinks. Decks survive across cycles; chat does not.
19
-
20
- ## Skip the deck when
21
-
22
- - It's a single, low-stakes question whose answer barely changes downstream work — just ask in chat.
23
- - You can settle the question yourself by reading code or running a tool. **Default to investigating before asking.**
24
- - The user is actively conversing with you — converting a live exchange into a deck adds friction.
25
-
26
- ## How to invoke
27
-
28
- The CLI **always blocks** until the user resolves the deck (potentially 10+ minutes). Submit through the Bash tool with `run_in_background: true` and **end your turn**. Do not peek, poll, or output filler chat between submit and answer — the bash completion notification is the only signal you need; it will wake you with stdout ready to parse. Same pattern for orchestrator, sub-agents, and one-off Claude Code sessions.
29
-
30
- ```
31
- Bash tool call:
32
- command: sis ask "$deck"
33
- run_in_background: true
34
- ```
35
-
36
- Stdout on completion is one line of JSON: `{responses: [{id, selectedOptionId?, freetext?}, ...], completedAt}`. Branch on each response by its interaction `id`.
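
As a rough illustration of branching on responses by `id`, the Python sketch below indexes the single JSON line by interaction id. It relies only on the shape quoted above (`responses`, `id`, optional `selectedOptionId` and `freetext`); the helper names `index_responses` and `describe` are made up for this example.

```python
import json


def index_responses(stdout_line):
    """Parse the single JSON line and return {interaction id: response dict}."""
    payload = json.loads(stdout_line)
    return {resp["id"]: resp for resp in payload["responses"]}


def describe(resp):
    """Summarize one response; either optional field may be absent."""
    if "selectedOptionId" in resp:
        return "option:" + resp["selectedOptionId"]
    if "freetext" in resp:
        return "freetext:" + resp["freetext"]
    return "no answer"
```

A caller would look up each interaction it submitted, for example `index_responses(line)["approve-phase-2"]`, and route on the selected option or the freetext.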
37
-
38
- If you already hold an `askId` from a prior cycle (e.g. respawned mid-wait), `sis ask poll <askId>` blocks on it and `sis ask peek <askId>` returns status without blocking. Use these only for respawn-recovery — **never to monitor a deck you just submitted in the current turn**. See `sis ask -h`.
39
-
40
- ## Designing interactions
41
-
42
- ### Each option is a concrete path forward
43
-
44
- The user picks an option to commit to a direction. Each option should name a real path with its tradeoffs spelled out, grounded in *this* codebase. Sign-off decks branch differently per option ("looks good", "minor fixes", "moderate fixes", "scope rework" each route the orchestrator somewhere different). Decision decks present mutually exclusive directions with named consequences.
45
-
46
- <example type="good">
47
- ```
48
- title: "Session store backend?"
49
- subtitle: "Auth needs persistent sessions across restarts"
50
- kind: decision
51
- options:
52
- in-memory: "In-memory map — simplest. Loses sessions on restart; single-process only."
53
- redis: "Redis — survives restart, supports horizontal scale. New ops dependency."
54
- postgres: "Reuse existing Postgres — no new infra; ~10ms read latency vs Redis ~1ms."
55
- defer: "Ship in-memory now, migrate later if scale becomes real."
56
- allowFreetext: true
57
- freetextLabel: "Different framing — describe it"
58
- ```
59
- </example>
60
-
61
- <example type="bad">
62
- ```
63
- title: "Happy with this design?"
64
- options:
65
- 1. Yes
66
- 2. No, start over
67
- 3. Maybe, with comments
68
- 4. (no option, just freetext)
69
- ```
70
- "Happy?" names a feeling, not a fork. Options 3 and 4 both collapse to freetext, forcing the user to invent the actual decision. Rewrite as specific decisions about specific elements of the design.
71
- </example>
72
-
73
- ### Use `allowFreetext: true` as a safety valve, not the primary input
74
-
75
- Freetext catches "anything else?" — opinions or context the options didn't anticipate. When freetext IS the answer you want, write a chat message instead.
76
-
77
- <example type="bad">
78
- ```
79
- title: "Approve?"
80
- options:
81
- 1. Approve
82
- 2. Reject
83
- 3. Comment
84
- allowFreetext: true
85
- ```
86
- A freetext form wearing option clothing. Either name what "reject" actually routes to (back to design? abandon? try a different framing?), or drop the deck and ask in chat.
87
- </example>
88
-
89
- ### Bound option count to 2–4
90
-
91
- Above four, options become too granular for the user to weigh; below two, you've collapsed into a yes/no that's faster to ask in chat.
92
-
93
- ### Ground options in what you've already gathered
94
-
95
- Each option label should reference specifics from the codebase, plan, or exploration you just did — file names, framework constraints, prior decisions. When you can't fill in specifics, investigate before asking.
96
-
97
- ### One concern per interaction
98
-
99
- When two questions interact, give them separate `id` / `title` / `options` inside the same deck (see Bundling below). One interaction asks one thing.
100
-
101
- ## `kind` — display hint
102
-
103
- | kind | use for |
104
- |---|---|
105
- | `decision` | fork in the road; user picks a path forward |
106
- | `validation` | sign-off on completed work |
107
- | `notify` | FYI; user acknowledges |
108
- | `context` | surfacing background that needs a response |
109
- | `error` | something went wrong; user picks a recovery |
110
-
111
- The dashboard uses `kind` for inbox icons and sort weight. Mis-tagging trains the user to ignore the icons. Pick the closest fit.
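
A pre-submit lint can catch a mis-typed `kind` before the deck is written. The Python sketch below assumes a deck dict with an `interactions` list whose entries carry `id`, `kind`, and an `options` list. The function name `lint_deck` is hypothetical, and the 2 to 4 option check encodes the authoring guideline from this skill rather than anything the CLI is known to enforce.

```python
ALLOWED_KINDS = {"decision", "validation", "notify", "context", "error"}


def lint_deck(deck):
    """Return a list of human-readable problems; empty means nothing was flagged."""
    problems = []
    for interaction in deck.get("interactions", []):
        label = interaction.get("id", "<missing id>")
        kind = interaction.get("kind")
        # Only flag a kind that is present but not one of the five listed values.
        if kind is not None and kind not in ALLOWED_KINDS:
            problems.append(f"{label}: unknown kind {kind!r}")
        count = len(interaction.get("options", []))
        if not 2 <= count <= 4:
            problems.append(f"{label}: {count} options, guideline is 2 to 4")
    return problems
```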
112
-
113
- ## Bundling
114
-
115
- If you'd otherwise submit two decks in the same beat, merge them. One deck with multiple `interactions` is one context switch for the user; two decks is two.
116
-
117
- ```bash
118
- deck="$SISYPHUS_SESSION_DIR/context/.ask-$(date +%s).json"
119
- cat > "$deck" <<'EOF'
120
- {
121
- "title": "Phase 2 sign-off + follow-on decisions",
122
- "interactions": [
123
- {
124
- "id": "approve-phase-2",
125
- "title": "Phase 2 looks good?",
126
- "kind": "validation",
127
- "options": [...]
128
- },
129
- {
130
- "id": "phase-3-scope",
131
- "title": "Phase 3 scope?",
132
- "kind": "decision",
133
- "options": [...]
134
- }
135
- ]
136
- }
137
- EOF
138
- # Then invoke `sis ask "$deck"` via the Bash tool with run_in_background: true.
139
- # Each interaction returns its own selectedOptionId / freetext in output.responses[], indexed by id.
140
- ```
-
- ## Submission notes
-
- - The deck is validated at submit (precise errors — trust them).
- - `kind` is an enum: `notify` | `validation` | `decision` | `context` | `error`. No other values accepted (see the table above for which to pick).
- - `bodyPath` points at a markdown file instead of inlining the body in JSON. The path is resolved **relative to the deck JSON's directory** and must stay inside it (no `..`, no symlinks out, no absolute paths pointing elsewhere). Practical pattern: write the deck JSON next to its body file — e.g. both inside `$SISYPHUS_SESSION_DIR/context/` — and use a basename like `"completion-summary.md"`. Mutually exclusive with `body`.
- - On completion, stdout is one line of JSON: `{responses, completedAt}`. Parse `responses[]` and dispatch on each interaction's `id`.
- - See `sis ask -h` for the full CLI surface.
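
The `kind` and `bodyPath` rules in the notes above can be checked locally before a deck is written. The following is a minimal sketch in plain bash. The helper names `valid_kind` and `safe_body_path` are hypothetical and not part of the `sis` CLI, and nothing here reflects how `sis ask` validates internally. The path check is deliberately stricter than the rule as stated (it rejects every absolute path) and does not detect symlink escapes, so submit-time validation stays the authority.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of the sis CLI.

# Succeeds only for the five kind values listed in the submission notes.
valid_kind() {
  case "$1" in
    notify|validation|decision|context|error) return 0 ;;
    *) return 1 ;;
  esac
}

# Conservative bodyPath check: non-empty, relative, no ".." path segment.
# Symlinks that point outside the deck directory are NOT detected here.
safe_body_path() {
  local p="$1"
  [ -n "$p" ] || return 1
  case "$p" in
    /*) return 1 ;;                    # absolute path (stricter than required)
    ..|../*|*/..|*/../*) return 1 ;;   # parent-directory segment
  esac
  return 0
}
```

A caller might run `valid_kind "$kind" && safe_body_path "completion-summary.md"` before writing the deck, treating a failure as a cue to re-read the notes rather than as a definitive verdict.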
@@ -1,64 +0,0 @@
- ---
- name: operator-memory
- description: Use right before the operator agent submits its final report. Provides guidance for updating the project-local operator memory at .sisyphus/agent-plugin/skills/operator/ — what to capture, where to put it (SKILL.md vs a new reference file), naming conventions, and what to skip. Defers to /authoring:skills for generic skill conventions (frontmatter, length budgets, structure).
- user-invocable: false
- ---
-
- # Updating operator memory
-
- You're about to submit. Spend a minute capturing what the next operator should not have to rediscover.
-
- The memory lives at `.sisyphus/agent-plugin/skills/operator/`:
- - `SKILL.md` — the high-level map of this app's surfaces and operations
- - per-task-family reference files alongside it (`auth.md`, `db-reset.md`, `checkout-flow.md`, etc.)
-
- ## When to update (and when NOT to)
-
- The bar is **"will future operators benefit from this?"** Specifics:
-
- UPDATE when you discovered:
- - A repeatable operational procedure (login flow, db reset, seed step, environment toggle)
- - A surface that wasn't obvious (admin route, debug overlay, hidden flag, internal port)
- - A footgun you hit and worked around (race condition, ordering requirement, stale-cache trap)
- - A convention this app uses that differs from defaults (custom auth headers, non-standard ports, weird redirect chains)
-
- DON'T update when:
- - It's session-specific state (this user's email, this session's seeded data)
- - It's a one-off observation that won't reproduce
- - It's already covered (read existing files first — duplication is worse than nothing)
- - It's about the codebase, not about operating the app — that's the orchestrator's domain, not yours
-
- ## SKILL.md vs a reference file
-
- **SKILL.md** is the high-level map. It answers "what surfaces does this app have, what are the most common operations, where do I find deep dives?" Keep it dense — under ~80 lines. Each entry is a line or two with a pointer.
-
- **A reference file** is the deep dive for one task family. It answers "exactly how do I do X step by step in this project". Each file has scope: `auth.md`, `db-reset.md`, `checkout-flow.md`, `feature-flags.md`.
-
- Decision rule:
- - New task family the operator might face → new reference file (and add a one-line entry to SKILL.md's Reference Files section).
- - Refinement to existing knowledge → update the existing reference file or SKILL.md.
- - A surface name you keep referencing → add it to SKILL.md's App Surfaces section once.
-
- ## Naming conventions
-
- - Reference files: kebab-case, task-family scope, no `operator-` prefix (the directory already implies it), `.md` extension.
- - Good: `auth.md`, `admin-panel.md`, `db-reset.md`, `feature-flags.md`.
- - Bad: `operator-auth.md`, `flows.md`, `notes.md`, `stuff.md`.
- - One file per task family. If `auth.md` exists, append to it; don't create `auth-new.md` or `auth-2.md`.
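
The mechanical parts of the naming rules above can be expressed as a small lint. This is a rough sketch, assuming bash with `[[ =~ ]]` support. The function name `good_reference_name` is hypothetical, and the duplicate-suffix rule is a heuristic derived only from the `auth-new.md` / `auth-2.md` examples, so it may reject a legitimate name that happens to end in a number. It cannot judge whether a name is too vague (`flows.md`, `notes.md`, `stuff.md` all pass), which still needs a human or agent decision.

```bash
#!/usr/bin/env bash
# Hypothetical lint for reference-file names; rules taken from the list above.
good_reference_name() {
  local LC_ALL=C                      # keep [a-z] ranges ASCII-only
  local name="$1"
  local kebab='^[a-z0-9]+(-[a-z0-9]+)*\.md$'
  local dup='-(new|[0-9]+)\.md$'

  # kebab-case stem plus the .md extension
  [[ "$name" =~ $kebab ]] || return 1
  # the directory already implies "operator"
  [[ "$name" != operator-* ]] || return 1
  # heuristic: suffixes that suggest a duplicate of an existing file
  [[ ! "$name" =~ $dup ]] || return 1
  return 0
}
```

For example, `good_reference_name db-reset.md` succeeds, while `good_reference_name operator-auth.md` and `good_reference_name auth-2.md` fail.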
-
- ## How to update
-
- 1. **Read first.** Open the current `SKILL.md` and any reference file you'll touch — orient before writing. Avoid duplicating what's already there.
- 2. **Write/edit with the Write or Edit tool.** The directory already exists at `.sisyphus/agent-plugin/skills/operator/` (the hook scaffolds it on first run).
- 3. **Keep prose dense.** The next operator pays in tokens for everything you write. If a step is obvious, omit it.
- 4. **Register new reference files** by adding a one-line entry to `SKILL.md`'s "Reference Files" section so they're discoverable.
-
- For frontmatter, length budgets, and general skill structure rules, invoke `/authoring:skills`. Don't reinvent those rules here — this skill only covers operator-specific guidance.
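
Step 4 above (registering new reference files) is easy to forget, and it can be verified mechanically. The sketch below lists reference files that `SKILL.md` never mentions. It assumes bash and a standard `grep`; the function name `unregistered_refs` is hypothetical, and the default directory is simply the path named in the text. A mention anywhere in `SKILL.md` counts as registered, so a file name that is a substring of another (for example `auth.md` inside `oauth.md`) can mask a missing entry.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print reference files that SKILL.md never mentions.
# $1 is the operator skill directory (defaults to the path named in the text).
unregistered_refs() {
  local dir="${1:-.sisyphus/agent-plugin/skills/operator}"
  local f name
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue                 # glob matched nothing
    name="${f##*/}"                         # strip the directory part
    [ "$name" = "SKILL.md" ] && continue    # the map itself is not a reference file
    # -F: fixed string, -q: quiet; any mention in SKILL.md counts as registered
    grep -Fq -- "$name" "$dir/SKILL.md" 2>/dev/null || printf '%s\n' "$name"
  done
}
```

Running `unregistered_refs` from the project root before submitting prints one file name per line; empty output means every reference file is mentioned somewhere in `SKILL.md`.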
-
- ## Examples
-
- **Discovered magic-link auth flow:** Create `auth.md` with the steps (email submit → check inbox → click link → cookie set). Add a one-liner to `SKILL.md` App Surfaces (`/login` — magic-link, see `auth.md`). Add to Common Operational Patterns (`Log in: see auth.md`).
-
- **Hit a stale-cache footgun:** The `/dashboard` route serves stale data for ~30s after a write because of an SWR cache. Add a single bullet to `SKILL.md` Known Footguns: `Dashboard SWR cache holds stale data ~30s after writes — hard refresh or wait`. No new reference file needed — it's a one-liner.
-
- **Found admin overlay:** `?admin=1` query param toggles an admin panel with seed/reset buttons. Add to `SKILL.md` App Surfaces: `Admin overlay: append ?admin=1 to any page; has seed/reset/feature-flag buttons`. If the overlay is rich enough to need step-by-step coverage, create `admin-panel.md` and link from there.