stagent 0.9.5 → 0.10.0

Files changed (277)
  1. package/README.md +5 -42
  2. package/dist/cli.js +42 -18
  3. package/docs/.coverage-gaps.json +13 -55
  4. package/docs/.last-generated +1 -1
  5. package/docs/features/provider-runtimes.md +4 -0
  6. package/docs/features/schedules.md +32 -4
  7. package/docs/features/settings.md +28 -5
  8. package/docs/features/tables.md +9 -2
  9. package/docs/features/workflows.md +10 -4
  10. package/docs/journeys/developer.md +15 -1
  11. package/docs/journeys/personal-use.md +21 -4
  12. package/docs/superpowers/plans/2026-04-07-instance-bootstrap.md +1691 -0
  13. package/docs/superpowers/plans/2026-04-08-schedule-orchestration.md +2983 -0
  14. package/docs/superpowers/plans/2026-04-11-schedule-maxturns-api-control.md +551 -0
  15. package/docs/superpowers/plans/2026-04-11-task-create-profile-validation.md +864 -0
  16. package/docs/superpowers/plans/2026-04-11-task-runtime-stagent-mcp-injection.md +739 -0
  17. package/docs/superpowers/specs/2026-04-08-chat-sse-resilience-hotfix-design.md +201 -0
  18. package/docs/superpowers/specs/2026-04-08-schedule-orchestration-design.md +371 -0
  19. package/docs/superpowers/specs/2026-04-08-swarm-visibility-design.md +213 -0
  20. package/package.json +3 -2
  21. package/src/__tests__/instrumentation-smoke.test.ts +15 -0
  22. package/src/app/analytics/page.tsx +1 -21
  23. package/src/app/api/chat/conversations/[id]/messages/route.ts +22 -1
  24. package/src/app/api/diagnostics/chat-streams/route.ts +65 -0
  25. package/src/app/api/instance/config/route.ts +41 -0
  26. package/src/app/api/instance/init/route.ts +34 -0
  27. package/src/app/api/instance/upgrade/check/route.ts +26 -0
  28. package/src/app/api/instance/upgrade/route.ts +96 -0
  29. package/src/app/api/instance/upgrade/status/route.ts +35 -0
  30. package/src/app/api/memory/route.ts +0 -11
  31. package/src/app/api/notifications/route.ts +4 -2
  32. package/src/app/api/projects/[id]/route.ts +5 -155
  33. package/src/app/api/projects/__tests__/delete-project.test.ts +10 -19
  34. package/src/app/api/schedules/[id]/execute/route.ts +111 -0
  35. package/src/app/api/schedules/[id]/route.ts +9 -1
  36. package/src/app/api/schedules/__tests__/execute-route.test.ts +118 -0
  37. package/src/app/api/schedules/route.ts +3 -12
  38. package/src/app/api/settings/openai/login/route.ts +22 -0
  39. package/src/app/api/settings/openai/logout/route.ts +7 -0
  40. package/src/app/api/settings/openai/route.ts +21 -1
  41. package/src/app/api/settings/providers/route.ts +35 -8
  42. package/src/app/api/tables/[id]/enrich/__tests__/route.test.ts +153 -0
  43. package/src/app/api/tables/[id]/enrich/plan/route.ts +98 -0
  44. package/src/app/api/tables/[id]/enrich/route.ts +147 -0
  45. package/src/app/api/tables/[id]/enrich/runs/route.ts +25 -0
  46. package/src/app/api/tasks/[id]/execute/route.ts +0 -21
  47. package/src/app/api/workflows/[id]/resume/route.ts +59 -0
  48. package/src/app/api/workflows/[id]/status/route.ts +22 -8
  49. package/src/app/api/workspace/context/route.ts +2 -0
  50. package/src/app/api/workspace/fix-data-dir/route.ts +81 -0
  51. package/src/app/chat/page.tsx +11 -0
  52. package/src/app/inbox/page.tsx +12 -5
  53. package/src/app/layout.tsx +42 -21
  54. package/src/app/page.tsx +0 -2
  55. package/src/app/settings/page.tsx +6 -9
  56. package/src/components/chat/__tests__/chat-session-provider.test.tsx +408 -0
  57. package/src/components/chat/chat-command-popover.tsx +2 -2
  58. package/src/components/chat/chat-input.tsx +2 -3
  59. package/src/components/chat/chat-session-provider.tsx +720 -0
  60. package/src/components/chat/chat-shell.tsx +92 -401
  61. package/src/components/instance/__tests__/instance-section.test.tsx +125 -0
  62. package/src/components/instance/instance-section.tsx +382 -0
  63. package/src/components/instance/upgrade-badge.tsx +219 -0
  64. package/src/components/notifications/__tests__/batch-proposal-review.test.tsx +95 -0
  65. package/src/components/notifications/__tests__/notification-item.test.tsx +106 -0
  66. package/src/components/notifications/batch-proposal-review.tsx +20 -5
  67. package/src/components/notifications/inbox-list.tsx +11 -2
  68. package/src/components/notifications/notification-item.tsx +56 -2
  69. package/src/components/notifications/pending-approval-host.tsx +56 -37
  70. package/src/components/schedules/schedule-create-sheet.tsx +19 -1
  71. package/src/components/schedules/schedule-edit-sheet.tsx +20 -1
  72. package/src/components/schedules/schedule-form.tsx +31 -0
  73. package/src/components/settings/__tests__/providers-runtimes-section.test.tsx +149 -0
  74. package/src/components/settings/auth-method-selector.tsx +19 -4
  75. package/src/components/settings/auth-status-badge.tsx +28 -3
  76. package/src/components/settings/openai-chatgpt-auth-control.tsx +278 -0
  77. package/src/components/settings/openai-runtime-section.tsx +7 -1
  78. package/src/components/settings/providers-runtimes-section.tsx +138 -19
  79. package/src/components/shared/app-sidebar.tsx +4 -3
  80. package/src/components/shared/command-palette.tsx +4 -5
  81. package/src/components/shared/theme-toggle.tsx +5 -24
  82. package/src/components/shared/workspace-indicator.tsx +61 -2
  83. package/src/components/tables/__tests__/table-enrichment-sheet.test.tsx +130 -0
  84. package/src/components/tables/table-create-sheet.tsx +4 -0
  85. package/src/components/tables/table-enrichment-runs.tsx +103 -0
  86. package/src/components/tables/table-enrichment-sheet.tsx +538 -0
  87. package/src/components/tables/table-spreadsheet.tsx +29 -5
  88. package/src/components/tables/table-toolbar.tsx +10 -1
  89. package/src/components/tasks/kanban-board.tsx +1 -0
  90. package/src/components/tasks/kanban-column.tsx +53 -14
  91. package/src/components/tasks/task-bento-grid.tsx +19 -0
  92. package/src/components/tasks/task-card.tsx +26 -3
  93. package/src/components/tasks/task-chip-bar.tsx +24 -0
  94. package/src/components/tasks/task-result-renderer.tsx +1 -1
  95. package/src/components/workflows/delay-step-body.tsx +109 -0
  96. package/src/components/workflows/hooks/use-workflow-status.ts +50 -0
  97. package/src/components/workflows/loop-status-view.tsx +1 -1
  98. package/src/components/workflows/shared/step-result.tsx +78 -0
  99. package/src/components/workflows/shared/workflow-header.tsx +141 -0
  100. package/src/components/workflows/shared/workflow-loading-skeleton.tsx +36 -0
  101. package/src/components/workflows/swarm-dashboard.tsx +2 -15
  102. package/src/components/workflows/views/loop-pattern-view.tsx +137 -0
  103. package/src/components/workflows/views/sequence-pattern-view.tsx +511 -0
  104. package/src/components/workflows/workflow-form-view.tsx +133 -16
  105. package/src/components/workflows/workflow-status-view.tsx +30 -740
  106. package/src/instrumentation-node.ts +94 -0
  107. package/src/instrumentation.ts +4 -48
  108. package/src/lib/agents/__tests__/claude-agent.test.ts +199 -0
  109. package/src/lib/agents/__tests__/execution-manager.test.ts +1 -27
  110. package/src/lib/agents/__tests__/failure-reason.test.ts +68 -0
  111. package/src/lib/agents/__tests__/learned-context.test.ts +0 -11
  112. package/src/lib/agents/__tests__/learning-session.test.ts +158 -0
  113. package/src/lib/agents/__tests__/pattern-extractor.test.ts +48 -0
  114. package/src/lib/agents/claude-agent.ts +155 -18
  115. package/src/lib/agents/execution-manager.ts +0 -35
  116. package/src/lib/agents/learned-context.ts +0 -12
  117. package/src/lib/agents/learning-session.ts +18 -5
  118. package/src/lib/agents/profiles/__tests__/registry.test.ts +6 -4
  119. package/src/lib/agents/profiles/builtins/upgrade-assistant/SKILL.md +70 -0
  120. package/src/lib/agents/profiles/builtins/upgrade-assistant/profile.yaml +32 -0
  121. package/src/lib/agents/runtime/__tests__/openai-codex-auth.test.ts +118 -0
  122. package/src/lib/agents/runtime/codex-app-server-client.ts +11 -5
  123. package/src/lib/agents/runtime/openai-codex-auth.ts +389 -0
  124. package/src/lib/agents/runtime/openai-codex.ts +29 -60
  125. package/src/lib/agents/runtime/types.ts +8 -0
  126. package/src/lib/book/chapter-mapping.ts +11 -0
  127. package/src/lib/book/content.ts +10 -0
  128. package/src/lib/chat/__tests__/active-streams.test.ts +49 -0
  129. package/src/lib/chat/__tests__/finalize-safety-net.test.ts +139 -0
  130. package/src/lib/chat/__tests__/reconcile.test.ts +137 -0
  131. package/src/lib/chat/__tests__/stream-telemetry.test.ts +151 -0
  132. package/src/lib/chat/active-streams.ts +27 -0
  133. package/src/lib/chat/codex-engine.ts +16 -17
  134. package/src/lib/chat/context-builder.ts +5 -3
  135. package/src/lib/chat/engine.ts +50 -3
  136. package/src/lib/chat/reconcile.ts +117 -0
  137. package/src/lib/chat/stagent-tools.ts +1 -0
  138. package/src/lib/chat/stream-telemetry.ts +132 -0
  139. package/src/lib/chat/suggested-prompts.ts +28 -1
  140. package/src/lib/chat/system-prompt.ts +26 -1
  141. package/src/lib/chat/tool-catalog.ts +2 -1
  142. package/src/lib/chat/tools/__tests__/enrich-table-tool.test.ts +127 -0
  143. package/src/lib/chat/tools/__tests__/schedule-tools.test.ts +261 -0
  144. package/src/lib/chat/tools/__tests__/task-tools.test.ts +352 -0
  145. package/src/lib/chat/tools/__tests__/workflow-tools-dedup.test.ts +217 -0
  146. package/src/lib/chat/tools/document-tools.ts +29 -13
  147. package/src/lib/chat/tools/helpers.ts +39 -0
  148. package/src/lib/chat/tools/notification-tools.ts +9 -5
  149. package/src/lib/chat/tools/project-tools.ts +33 -0
  150. package/src/lib/chat/tools/schedule-tools.ts +44 -11
  151. package/src/lib/chat/tools/table-tools.ts +71 -0
  152. package/src/lib/chat/tools/task-tools.ts +84 -20
  153. package/src/lib/chat/tools/workflow-tools.ts +234 -32
  154. package/src/lib/constants/settings.ts +8 -18
  155. package/src/lib/data/__tests__/clear.test.ts +56 -2
  156. package/src/lib/data/clear.ts +20 -15
  157. package/src/lib/data/delete-project.ts +171 -0
  158. package/src/lib/db/__tests__/bootstrap.test.ts +1 -1
  159. package/src/lib/db/bootstrap.ts +45 -16
  160. package/src/lib/db/index.ts +5 -0
  161. package/src/lib/db/migrations/0009_add_app_instances.sql +25 -0
  162. package/src/lib/db/migrations/0024_add_workflow_resume_at.sql +10 -0
  163. package/src/lib/db/migrations/0025_drop_app_instances.sql +3 -0
  164. package/src/lib/db/migrations/0026_drop_license.sql +3 -0
  165. package/src/lib/db/migrations/meta/_journal.json +21 -0
  166. package/src/lib/db/schema.ts +68 -23
  167. package/src/lib/environment/workspace-context.ts +13 -1
  168. package/src/lib/import/dedup.ts +4 -54
  169. package/src/lib/instance/__tests__/bootstrap.test.ts +362 -0
  170. package/src/lib/instance/__tests__/detect.test.ts +115 -0
  171. package/src/lib/instance/__tests__/fingerprint.test.ts +48 -0
  172. package/src/lib/instance/__tests__/git-ops.test.ts +95 -0
  173. package/src/lib/instance/__tests__/settings.test.ts +83 -0
  174. package/src/lib/instance/__tests__/upgrade-poller.test.ts +131 -0
  175. package/src/lib/instance/bootstrap.ts +270 -0
  176. package/src/lib/instance/detect.ts +49 -0
  177. package/src/lib/instance/fingerprint.ts +78 -0
  178. package/src/lib/instance/git-ops.ts +95 -0
  179. package/src/lib/instance/settings.ts +61 -0
  180. package/src/lib/instance/types.ts +77 -0
  181. package/src/lib/instance/upgrade-poller.ts +153 -0
  182. package/src/lib/notifications/__tests__/visibility.test.ts +51 -0
  183. package/src/lib/notifications/visibility.ts +33 -0
  184. package/src/lib/schedules/__tests__/collision-check.test.ts +93 -0
  185. package/src/lib/schedules/__tests__/config.test.ts +62 -0
  186. package/src/lib/schedules/__tests__/firing-metrics.test.ts +99 -0
  187. package/src/lib/schedules/__tests__/integration.test.ts +82 -0
  188. package/src/lib/schedules/__tests__/slot-claim.test.ts +242 -0
  189. package/src/lib/schedules/__tests__/tick-scheduler.test.ts +102 -0
  190. package/src/lib/schedules/__tests__/turn-budget.test.ts +228 -0
  191. package/src/lib/schedules/collision-check.ts +105 -0
  192. package/src/lib/schedules/config.ts +53 -0
  193. package/src/lib/schedules/scheduler.ts +232 -13
  194. package/src/lib/schedules/slot-claim.ts +105 -0
  195. package/src/lib/settings/__tests__/openai-auth.test.ts +101 -0
  196. package/src/lib/settings/__tests__/openai-login-manager.test.ts +64 -0
  197. package/src/lib/settings/__tests__/runtime-setup.test.ts +33 -0
  198. package/src/lib/settings/openai-auth.ts +105 -10
  199. package/src/lib/settings/openai-login-manager.ts +260 -0
  200. package/src/lib/settings/runtime-setup.ts +14 -4
  201. package/src/lib/tables/__tests__/enrichment-planner.test.ts +124 -0
  202. package/src/lib/tables/__tests__/enrichment.test.ts +147 -0
  203. package/src/lib/tables/enrichment-planner.ts +454 -0
  204. package/src/lib/tables/enrichment.ts +328 -0
  205. package/src/lib/tables/query-builder.ts +5 -2
  206. package/src/lib/tables/trigger-evaluator.ts +3 -2
  207. package/src/lib/theme.ts +71 -0
  208. package/src/lib/usage/ledger.ts +2 -18
  209. package/src/lib/util/__tests__/similarity.test.ts +106 -0
  210. package/src/lib/util/similarity.ts +77 -0
  211. package/src/lib/utils/format-timestamp.ts +24 -0
  212. package/src/lib/utils/stagent-paths.ts +12 -0
  213. package/src/lib/validators/__tests__/blueprint.test.ts +172 -0
  214. package/src/lib/validators/__tests__/settings.test.ts +10 -0
  215. package/src/lib/validators/blueprint.ts +70 -9
  216. package/src/lib/validators/profile.ts +2 -2
  217. package/src/lib/validators/settings.ts +3 -1
  218. package/src/lib/workflows/__tests__/delay.test.ts +196 -0
  219. package/src/lib/workflows/__tests__/engine.test.ts +8 -0
  220. package/src/lib/workflows/__tests__/loop-executor.test.ts +54 -0
  221. package/src/lib/workflows/__tests__/post-action.test.ts +108 -0
  222. package/src/lib/workflows/blueprints/instantiator.ts +22 -1
  223. package/src/lib/workflows/blueprints/types.ts +10 -2
  224. package/src/lib/workflows/delay.ts +106 -0
  225. package/src/lib/workflows/engine.ts +207 -4
  226. package/src/lib/workflows/loop-executor.ts +349 -24
  227. package/src/lib/workflows/post-action.ts +91 -0
  228. package/src/lib/workflows/types.ts +166 -1
  229. package/src/app/api/license/checkout/route.ts +0 -28
  230. package/src/app/api/license/portal/route.ts +0 -26
  231. package/src/app/api/license/route.ts +0 -89
  232. package/src/app/api/license/usage/route.ts +0 -63
  233. package/src/app/api/marketplace/browse/route.ts +0 -15
  234. package/src/app/api/marketplace/import/route.ts +0 -28
  235. package/src/app/api/marketplace/publish/route.ts +0 -40
  236. package/src/app/api/onboarding/email/route.ts +0 -53
  237. package/src/app/api/settings/telemetry/route.ts +0 -14
  238. package/src/app/api/sync/export/route.ts +0 -54
  239. package/src/app/api/sync/restore/route.ts +0 -37
  240. package/src/app/api/sync/sessions/route.ts +0 -24
  241. package/src/app/auth/callback/route.ts +0 -73
  242. package/src/app/marketplace/page.tsx +0 -19
  243. package/src/components/analytics/analytics-gate-card.tsx +0 -101
  244. package/src/components/marketplace/blueprint-card.tsx +0 -61
  245. package/src/components/marketplace/marketplace-browser.tsx +0 -131
  246. package/src/components/onboarding/email-capture-card.tsx +0 -104
  247. package/src/components/settings/activation-form.tsx +0 -95
  248. package/src/components/settings/cloud-account-section.tsx +0 -147
  249. package/src/components/settings/cloud-sync-section.tsx +0 -155
  250. package/src/components/settings/subscription-section.tsx +0 -410
  251. package/src/components/settings/telemetry-section.tsx +0 -80
  252. package/src/components/shared/premium-gate-overlay.tsx +0 -50
  253. package/src/components/shared/schedule-gate-dialog.tsx +0 -64
  254. package/src/components/shared/upgrade-banner.tsx +0 -112
  255. package/src/hooks/use-supabase-auth.ts +0 -79
  256. package/src/lib/billing/email.ts +0 -54
  257. package/src/lib/billing/products.ts +0 -80
  258. package/src/lib/billing/stripe.ts +0 -101
  259. package/src/lib/cloud/supabase-browser.ts +0 -32
  260. package/src/lib/cloud/supabase-client.ts +0 -56
  261. package/src/lib/license/__tests__/features.test.ts +0 -56
  262. package/src/lib/license/__tests__/key-format.test.ts +0 -88
  263. package/src/lib/license/__tests__/manager.test.ts +0 -64
  264. package/src/lib/license/__tests__/tier-limits.test.ts +0 -79
  265. package/src/lib/license/cloud-validation.ts +0 -60
  266. package/src/lib/license/features.ts +0 -44
  267. package/src/lib/license/key-format.ts +0 -101
  268. package/src/lib/license/limit-check.ts +0 -111
  269. package/src/lib/license/limit-queries.ts +0 -51
  270. package/src/lib/license/manager.ts +0 -345
  271. package/src/lib/license/notifications.ts +0 -59
  272. package/src/lib/license/tier-limits.ts +0 -71
  273. package/src/lib/marketplace/marketplace-client.ts +0 -107
  274. package/src/lib/sync/cloud-sync.ts +0 -235
  275. package/src/lib/telemetry/conversion-events.ts +0 -71
  276. package/src/lib/telemetry/queue.ts +0 -122
  277. package/src/lib/validators/license.ts +0 -33
@@ -0,0 +1,864 @@
1
+ # Task Create Profile Validation + Disappearance Spike — Implementation Plan
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use `superpowers:subagent-driven-development` to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
+ **Goal:** Close the validation gap in `create_task` (which today accepts any string as `agentProfile`, including runtime IDs like `"anthropic-direct"`), surface stale-profile errors synchronously at `execute_task`, add operator-facing UX for the `list_tasks` project-scoping behavior that is the most probable root cause of the reported "task disappears" symptom, and document the spike findings that corrected the original handoff's "task was deleted" framing.
6
+
7
+ **Architecture:** Four commits on `main`:
8
+ 1. Spike addendum into the feature spec (docs-only, no code)
9
+ 2. Profile validation on `create_task` / `update_task` Zod schemas via `z.string().refine(id => getProfile(id) !== undefined, …)` + synchronous pre-check in `execute_task` for stale stored profiles + unit tests
10
+ 3. `list_tasks` empty-result note surfacing the active project scope + unit test
11
+ 4. Flip-to-completed (spec frontmatter + roadmap + changelog)
12
+
13
+ **Tech Stack:** TypeScript, Zod, Drizzle ORM (SQLite), Vitest.
14
+
15
+ ---
16
+
17
+ ## What already exists
18
+
19
+ Verified by reading the current codebase, not trusting the spec's line numbers blindly.
20
+
21
+ | What | Where | Evidence |
22
+ |---|---|---|
23
+ | `create_task` Zod schema with unvalidated `agentProfile` | `src/lib/chat/tools/task-tools.ts:91-96` | `agentProfile: z.string().optional()` — accepts any string. Compare to `assignedAgent` (runtime) at `:85-90` which has a post-parse check via `isAgentRuntimeId()` in the handler body at `:100-104`. |
24
+ | `update_task` has the same gap | `src/lib/chat/tools/task-tools.ts:163-168` | Same `z.string().optional()` pattern; no validation at all. |
25
+ | `execute_task` does not read or validate `agentProfile` | `src/lib/chat/tools/task-tools.ts:238-285` | Only validates `assignedAgent` (runtime). The stored `task.agentProfile` is ignored at queue time and passes through to the runtime where it eventually fails with a runtime-level error via `buildTaskQueryContext`. |
26
+ | `list_tasks` silent project scoping | `src/lib/chat/tools/task-tools.ts:41-54` | `effectiveProjectId = args.projectId ?? ctx.projectId ?? undefined` — if truthy, filters by project. No explanation returned when the filter produces 0 rows. This is the prime candidate for the user-reported "task disappeared" symptom. |
27
+ | `getProfile` / `listProfiles` — sync, cached | `src/lib/agents/profiles/registry.ts:239-245` | `getProfile(id): AgentProfile \| undefined` and `listProfiles(): AgentProfile[]`. Both synchronous. Backed by `ensureLoaded()` which uses a filesystem-signature cache at `:224-233` — first call reads the disk, subsequent calls return from memory. |
28
+ | Profile registry is loaded from `~/.claude/skills/` | `src/lib/agents/profiles/registry.ts:33-38, 143-218` | `SKILLS_DIR = ~/.claude/skills`; `scanProfiles()` reads every subdir's `profile.yaml` and returns a `Map<string, AgentProfile>`. |
29
+ | Builtin profiles (20) | `src/lib/agents/profiles/builtins/` | `general`, `code-reviewer`, `researcher`, `data-analyst`, `devops-engineer`, `document-writer`, `financial-analyst`, `health-fitness-coach`, `learning-coach`, `marketing-strategist`, `operations-coordinator`, `project-manager`, `sales-researcher`, `shopping-assistant`, `sweep`, `technical-writer`, `travel-planner`, `upgrade-assistant`, `customer-support-agent`, `content-creator`. Use `"general"` as the known-good test case. |
30
+ | Runtime IDs (for negative test) | `src/lib/agents/runtime/catalog.ts` → `SUPPORTED_AGENT_RUNTIMES` | Includes `"anthropic-direct"` — which is the exact value the handoff documented as the historic bug trigger. |
31
+ | No task deletion anywhere in `src/` | verified by grep | `grep -r "delete(tasks)" src/` → 0 results. Every failure path in `claude-agent.ts` (lines 130, 418-420, 745-748, 809-811) preserves the row with `status: "failed"` and a `failureReason`. |
32
+ | Every runtime reads `task.agentProfile ?? "general"` | `claude-agent.ts:521`, `openai-direct.ts:199`, `openai-codex.ts:143`, `ollama-adapter.ts:169`, `anthropic-direct.ts:272` | The `?? "general"` fallback only triggers when `task.agentProfile` is `null`. A non-null invalid string (`"anthropic-direct"`) passes through and fails at runtime, not at queue time — which is the exact gap AC #5 targets. |
33
+ | `STAGENT_DATA_DIR` per-process isolation | `src/lib/utils/stagent-paths.ts:4-6`, `src/lib/db/index.ts:9-18` | `getStagentDataDir()` reads `process.env.STAGENT_DATA_DIR` once at module load. Different processes (main vs. domain clones) hit different SQLite files. Per `MEMORY.md → shared-stagent-data-dir.md`, this is intentional — fix is operator-facing messaging, not a scoping change. Out of scope for this feature. |
34
+ | Test file `task-tools.test.ts` does NOT yet exist | `ls src/lib/chat/tools/__tests__/` | Current contents: `enrich-table-tool.test.ts`, `schedule-tools.test.ts` (shipped in Task 2), `workflow-tools-dedup.test.ts`. We create a fresh `task-tools.test.ts`. |
35
+
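The `?? "general"` row above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory set of known profile IDs (`knownProfileIds`) in place of the real registry, and the helper names are invented for illustration. It only shows why a non-null invalid string slips past the fallback.

```typescript
// Hypothetical stand-in for the profile registry; the real registry
// described in the table loads profile.yaml files from disk.
const knownProfileIds = new Set<string>(["general", "code-reviewer"]);

// Mirrors the `task.agentProfile ?? "general"` pattern quoted in the table.
// The nullish-coalescing operator falls back only on null or undefined.
function resolveProfileId(agentProfile: string | null): string {
  return agentProfile ?? "general";
}

// Hypothetical helper: membership check against the stand-in registry.
function isKnownProfileId(id: string): boolean {
  return knownProfileIds.has(id);
}

// A null profile falls back to the known-good default.
const resolvedFromNull = resolveProfileId(null);

// A runtime ID stored as a profile is non-null, so it passes through unchanged
// and would only be rejected later, when something looks it up.
const resolvedFromRuntimeId = resolveProfileId("anthropic-direct");
const runtimeIdIsProfile = isKnownProfileId(resolvedFromRuntimeId);
```

In this sketch `resolvedFromNull` is a valid profile ID while `resolvedFromRuntimeId` is not, which is the gap the plan closes at the queue gate rather than in the runtime adapters.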
36
+ ## NOT in scope
37
+
38
+ - **`schedule-tools.ts:agentProfile` validation.** Same bug class (`z.string().optional()`) exists at `schedule-tools.ts:63`, and the fix is the same 5-line pattern. Explicitly outside the spec title and Included list. File a follow-up after Task 3 if wanted.
39
+ - **`STAGENT_DATA_DIR` changes.** Per spec Scope Boundaries: "Any change to the domain-clone `STAGENT_DATA_DIR` isolation model (even if the spike finds it is the cause — the fix there is error messaging, not isolation changes)."
40
+ - **Adding a health-check/startup log for the active data dir.** Infrastructure-level change affecting all instances; not task-validation feature work. File separately if wanted.
41
+ - **Adding projectId filter to `get_task`.** Would harm AC #4 ("No task returned from `create_task` is unfindable via `get_task` within the same data-dir + project scope") and break intentional lookup-by-ID. Do not do this.
42
+ - **Removing `list_tasks` project scoping entirely.** Breaks intentional per-project isolation for other tool calls. The remediation is messaging, not behavior change.
43
+ - **A task cleanup/GC retention policy.** No such policy exists today (verified by grep — no `delete(tasks)` anywhere). Do not build one speculatively per spec Excluded list.
44
+ - **Refactoring the runtime-vs-profile taxonomy.** Explicit spec Excluded.
45
+ - **Profile validation on `execute_task` args** (not on the stored task row). Already handled — `execute_task` validates `args.assignedAgent` via `isAgentRuntimeId` at `:252-256`; the gap is on the stored `task.agentProfile`, not on the tool's own args.
46
+ - **Smoke test against a running dev server.** `task-tools.ts` imports from `@/lib/agents/runtime/catalog` (pre-existing static) and this plan adds a new static import from `@/lib/agents/profiles/registry`. The registry file's import tree (`@/lib/validators/profile`, `./compatibility`, `./types`, `./project-profiles`, `@/lib/environment/data`, `@/lib/db`, `@/lib/db/schema`, `drizzle-orm`) does not transitively reach `runtime/catalog` or `claude-agent.ts`, so no cycle. Per TDR-032's smoke-test budget policy, this file does not meet the adjacency criteria. Unit tests are sufficient.
47
+
48
+ ## Error & Rescue Registry
49
+
50
+ | Failure mode | Recovery |
51
+ |---|---|
52
+ | Zod `.refine()` attached to `z.string().optional()` receives `undefined` when field omitted and rejects it | Attach refinement to `z.string()` *before* `.optional()` — refinement only runs when the field is present. Pattern: `z.string().refine(fn, {...}).optional()`. |
53
+ | `getProfile()` returns `undefined` for a builtin ID because `ensureBuiltins()` hasn't been called yet | `getProfile` calls `ensureLoaded()` internally which calls `ensureBuiltins()`. First call is self-initializing. Tests mock `@/lib/agents/profiles/registry` directly so this concern doesn't apply to them. |
54
+ | Stale stored `task.agentProfile` from a pre-fix task with `"anthropic-direct"` breaks `execute_task` with a runtime-level error instead of a synchronous chat-tool error | `execute_task` handler calls `getProfile(task.agentProfile)` after the task lookup and before the queue-and-fire path. If `task.agentProfile !== null` and `getProfile(task.agentProfile) === undefined`, return `err(...)` synchronously. Do not update the task row (status transition is the caller's decision). |
55
+ | `list_tasks` note message leaks into clients that don't expect it | The note is a sibling field in the response envelope (`{tasks: [...], note: "..."}`), not an injected string. Only added when zero results AND a project filter is active, so existing happy-path consumers are untouched. |
56
+ | Test file collides with an existing test file | `ls src/lib/chat/tools/__tests__/` confirms `task-tools.test.ts` does not exist. Create fresh. |
57
+ | New static import of `@/lib/agents/profiles/registry` introduces a module-load cycle | Manually trace the registry file's imports against `runtime/catalog` and `claude-agent.ts`. Registry only imports: validators/profile, compatibility, types, project-profiles, environment/data, db, db/schema, drizzle-orm — none reach the runtime registry. Safe. If `tsc --noEmit` surfaces a new error after the change, stop and investigate before committing. |
58
+
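The stale-profile recovery in the registry above can be sketched roughly as follows. This is not the plan's actual handler: `StoredTask`, `PrecheckResult`, `lookupProfile`, and `precheckStoredProfile` are hypothetical names, and the registry is a small in-memory map assumed for illustration.

```typescript
// Hypothetical shapes; the real task row and tool result types may differ.
interface StoredTask {
  id: string;
  agentProfile: string | null;
}

type PrecheckResult =
  | { ok: true; task: StoredTask }
  | { ok: false; error: string };

// Stand-in for the synchronous registry lookup; returns undefined for unknown IDs.
const profileTable = new Map<string, { id: string }>([
  ["general", { id: "general" }],
]);

function lookupProfile(id: string): { id: string } | undefined {
  return profileTable.get(id);
}

// Runs after the task lookup and before any queue-and-fire step.
// A null profile is allowed (the runtime falls back to "general");
// a non-null ID that the registry does not know is rejected synchronously.
// The task row itself is left untouched.
function precheckStoredProfile(task: StoredTask): PrecheckResult {
  if (task.agentProfile !== null && lookupProfile(task.agentProfile) === undefined) {
    return {
      ok: false,
      error: `Task ${task.id} references unknown agent profile "${task.agentProfile}"`,
    };
  }
  return { ok: true, task };
}
```

Returning an error value instead of throwing keeps the failure inside the tool's normal response path, so the caller decides whether to change the task's status.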
59
+ ---
60
+
61
+ ## File Structure
62
+
63
+ Files modified:
64
+ - `src/lib/chat/tools/task-tools.ts` — four logical edits: (a) add `.refine()` to `create_task.agentProfile`, (b) add `.refine()` to `update_task.agentProfile`, (c) add synchronous stale-profile check in `execute_task` handler, (d) add empty-result note in `list_tasks` handler.
65
+ - `features/task-create-profile-validation.md` — append the spike addendum into the References section; flip frontmatter `status:` at the end.
66
+ - `features/roadmap.md` — flip the row status.
67
+ - `features/changelog.md` — prepend a completed entry under 2026-04-11.
68
+
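Edit (d) in the list above, the `list_tasks` empty-result note, might look something like the sketch below. It filters an in-memory array instead of querying the database, and the names `TaskSummary`, `ListTasksEnvelope`, and `listTasksWithNote` as well as the note wording are assumptions for illustration.

```typescript
// Hypothetical shapes for illustration only.
interface TaskSummary {
  id: string;
  projectId: string | null;
}

interface ListTasksEnvelope {
  tasks: TaskSummary[];
  note?: string; // sibling field, present only for empty scoped results
}

function listTasksWithNote(
  allTasks: TaskSummary[],
  argsProjectId: string | undefined,
  ctxProjectId: string | undefined,
): ListTasksEnvelope {
  // Same precedence as the expression quoted in the plan:
  // explicit argument first, then the chat context's project.
  const effectiveProjectId = argsProjectId ?? ctxProjectId ?? undefined;

  const tasks = effectiveProjectId
    ? allTasks.filter((t) => t.projectId === effectiveProjectId)
    : allTasks;

  // Add the note only when a project filter is active AND nothing matched,
  // so happy-path consumers see the same envelope as before.
  if (effectiveProjectId && tasks.length === 0) {
    return {
      tasks,
      note: `No tasks found in project "${effectiveProjectId}". Results are scoped to the active project; a task created elsewhere can still be fetched by ID.`,
    };
  }
  return { tasks };
}
```

Keeping the note as an optional sibling field, not text mixed into the task list, means clients that ignore unknown fields are unaffected.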
69
+ Files created:
70
+ - `src/lib/chat/tools/__tests__/task-tools.test.ts` — fresh test file following the `schedule-tools.test.ts` pattern that just shipped in commit `ed783bb` / `649db6d`.
71
+
72
+ Files NOT touched (explicitly):
73
+ - `src/lib/agents/profiles/registry.ts` — read-only consumer
74
+ - `src/lib/agents/runtime/catalog.ts` — existing runtime validator, used but not modified
75
+ - `src/lib/agents/claude-agent.ts` or any runtime adapter — the `task.agentProfile ?? "general"` fallback is fine as-is; we add validation at the queue gate, not at the runtime
76
+ - `src/lib/db/schema.ts`, `src/lib/db/bootstrap.ts` — no schema change
77
+ - `src/lib/chat/tools/schedule-tools.ts` — same bug class but explicitly excluded
78
+ - Any UI file — chat/MCP tool access only
79
+
80
+ ---
81
+
82
+ ## Task 1: Spike addendum into the spec (docs-only, no code)
83
+
84
+ This commit lands first per the handoff rule "grooming separate from implementation." It is the written evidence that satisfies spec AC #3 ("The investigation spike documents the actual cause …") and corrects the handoff's false framing before any remediation code is merged.
85
+
86
+ **Files:**
87
+ - Modify: `features/task-create-profile-validation.md` (append addendum to References section)
88
+
89
+ - [ ] **Step 1.1: Append the spike addendum**
90
+
91
+ Open `features/task-create-profile-validation.md`. Find the References section, which currently ends with a placeholder line:
92
+
93
+ ```markdown
94
+ - **Spike addendum (to be filled in by spike subtask):** _actual root cause + file:line evidence_
95
+ ```
96
+
97
+ Replace that single line with the full addendum below. Do not remove any other lines in the References section.
98
+
99
+ ```markdown
100
+ - **Spike addendum — 2026-04-11**
101
+
102
+ A codebase walk performed in the controller session before any code changes ruled out the handoff's original "task was deleted" framing and identified two actual root-cause candidates for the reported disappearance symptom.
103
+
104
+ **Ruled out: task deletion.**
105
+ - No `db.delete(tasks)` anywhere in `src/` (grep confirmed; prior Explore pass had already established this).
106
+ - Every failure path in `src/lib/agents/claude-agent.ts` preserves the row with `status: "failed"` and a `failureReason`:
107
+ - `:130` — partial-update path annotating a mid-stream error
108
+ - `:418-420` — stream-exhaustion safety net
109
+ - `:745-748` — OAuth/auth failure (`failureReason: "auth_failed"`)
110
+ - `:809-811` — generic handler via `classifyError`
111
+ - `create_task` at `src/lib/chat/tools/task-tools.ts:110-126` is a single `db.insert()` with no transaction wrapper. The subsequent read-back at `:123-126` confirms the insert before returning, so a silently-failed insert would surface as an empty result at creation time, not post-creation.
112
+
113
+ **Root cause 1 (probable primary — UX-level):** `list_tasks` silently filters by `ctx.projectId`.
114
+ - `src/lib/chat/tools/task-tools.ts:41` computes `effectiveProjectId = args.projectId ?? ctx.projectId ?? undefined`. If truthy, `:43-44` applies `eq(tasks.projectId, effectiveProjectId)` as a WHERE clause.
115
+ - `get_task` at `:223-227` has no projectId filter — tasks are findable by ID regardless of active project scope.
116
+ - Most likely user path: `create_task` under project A → `list_tasks` in a new session with `ctx.projectId = B` (or a different chat context) → empty result → perceived disappearance. The task is still in the DB and still findable by ID; the operator just does not know the filter is active.
117
+ - **Remediation in this feature:** `list_tasks` returns a sibling `note` field in its response envelope when `effectiveProjectId` is set and zero rows are returned, naming the active scope and suggesting `projectId: null` or `get_task <id>` as alternatives. No behavior change, only messaging.
118
+
119
+ **Root cause 2 (probable secondary — infrastructure-level):** `STAGENT_DATA_DIR` per-process isolation.
120
+ - `src/lib/utils/stagent-paths.ts:4-6`: `getStagentDataDir()` reads `process.env.STAGENT_DATA_DIR || ~/.stagent`.
121
+ - `src/lib/db/index.ts:9-13`: the DB is opened from `join(dataDir, "stagent.db")` **once at module load**. The var is baked in per-process.
122
+ - Per `MEMORY.md → shared-stagent-data-dir.md`, the user runs domain clones (`stagent-wealth`, `stagent-growth`, `stagent-venture`) which set this var to different paths. A task created in one process is physically in a different SQLite file than a task queried from another process. This is architecturally intentional — the three domain clones isolate state so wealth/growth/venture do not leak into each other.
123
+ - **Remediation in this feature: none.** Per the Excluded list, domain-clone isolation changes are out of scope. A follow-up feature (outside this batch) could add an operator-facing startup log echoing the active data dir, or a `get_stagent_info` health-check tool. Not in this commit.
124
+
125
+ **Ruled out: transaction rollback.** Not a transaction; single insert. If the insert fails, the error surfaces immediately at `create_task` return time.
126
+
127
+ **Conclusion:** The profile validation gap (the primary spec ask) is unchanged in scope. The disappearance symptom is best addressed by the `list_tasks` empty-result note (added in this feature) plus operator-facing infrastructure discoverability (deferred). Failed-state preservation (AC #3) is verification-only — the code already does it correctly on every failure path identified.
128
+ ```
129
+
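The project-scoping expression quoted in the addendum can be modeled in isolation. The following is a minimal sketch, assuming only the `args.projectId ?? ctx.projectId ?? undefined` expression cited at `task-tools.ts:41`; the `ListArgs` and `Ctx` types and the `resolveProjectScope` name are hypothetical. It also surfaces one thing worth checking before relying on `projectId: null` as an escape hatch: under `??`, an explicit `null` is nullish and falls through to `ctx.projectId`.

```typescript
// Hypothetical shapes; the real tool args and context types may differ.
interface ListArgs { projectId?: string | null }
interface Ctx { projectId?: string | null }

// Mirrors the expression cited in the addendum.
function resolveProjectScope(args: ListArgs, ctx: Ctx): string | undefined {
  return args.projectId ?? ctx.projectId ?? undefined;
}

// Task created under project A, then listed from a session scoped to project B.
const scoped = resolveProjectScope({}, { projectId: "proj-B" });
// With this expression alone, an explicit null does not clear the scope.
const explicitNull = resolveProjectScope({ projectId: null }, { projectId: "proj-B" });
// No scope anywhere: no WHERE clause would be applied.
const unscoped = resolveProjectScope({}, {});

console.log(scoped, explicitNull, unscoped);
```

If the real handler distinguishes `null` from `undefined` somewhere else, the `projectId: null` hint is accurate as written; if it does not, either the hint text or the expression needs adjusting.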
130
+ - [ ] **Step 1.2: Verify the spec parses correctly**
131
+
132
+ Run `head -100 features/task-create-profile-validation.md` and confirm the YAML frontmatter is still valid (no trailing `---` issues, no mid-line breakage). Run `grep -c '^## ' features/task-create-profile-validation.md` — expected count: the same as before the edit (the addendum is a bullet inside the References section, not a new H2).
133
+
134
+ - [ ] **Step 1.3: Commit**
135
+
136
+ ```bash
137
+ git add features/task-create-profile-validation.md
138
+ git commit -m "$(cat <<'EOF'
139
+ docs(features): add spike addendum for task disappearance symptom
140
+
141
+ Codebase walk confirmed the original handoff's "task was deleted"
142
+ framing is false — no db.delete(tasks) exists anywhere in src/, and
143
+ every failure path in claude-agent.ts preserves the row with
144
+ status: "failed" and a failureReason. create_task is not wrapped in
145
+ a transaction, so rollback is also ruled out.
146
+
147
+ The actual root cause is a two-layer UX + infrastructure issue.
148
+ Primary: list_tasks silently filters by ctx.projectId, so a task
149
+ created under project A is hidden when the user asks "list my tasks"
150
+ under project B — still findable by get_task <id> but perceived as
151
+ disappeared. Secondary: STAGENT_DATA_DIR per-process isolation means
152
+ different domain-clone processes hit different SQLite files; this is
153
+ intentional (MEMORY.md → shared-stagent-data-dir) and out of scope
154
+ for this feature.
155
+
156
+ Remediation in this feature is the list_tasks empty-result note
157
+ plus the profile validation gap that is the primary spec ask.
158
+ Operator-facing data-dir discoverability is deferred to a separate
159
+ feature.
160
+
161
+ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
162
+ EOF
163
+ )"
164
+ ```
165
+
166
+ Expected: commit lands on `main`, `git status` clean. Verify with `git log --oneline -1`.
167
+
168
+ ---
169
+
170
+ ## Task 2: Profile validation (create_task, update_task, execute_task) + tests
171
+
172
+ **Files:**
173
+ - Modify: `src/lib/chat/tools/task-tools.ts`
174
+ - Create: `src/lib/chat/tools/__tests__/task-tools.test.ts`
175
+
176
+ ### Code changes
177
+
178
+ - [ ] **Step 2.1: Add a new static import at the top of `task-tools.ts`**
179
+
180
+ After the existing imports from `@/lib/agents/runtime/catalog` (around line 11), add:
181
+
182
+ ```ts
183
+ import { getProfile, listProfiles } from "@/lib/agents/profiles/registry";
184
+ ```
185
+
186
+ Rationale: `getProfile` is synchronous and cached. A static import here does not cycle with the runtime registry — verified against the registry's own import tree. Do not use a dynamic `await import()` — that pattern is only required for files that sit inside the `runtime/catalog` load cycle, and `task-tools.ts` is a leaf consumer.
187
+
188
+ - [ ] **Step 2.2: Add a private helper for the validation refinement**
189
+
190
+ Near the top of the file (after the `VALID_TASK_STATUSES` constant around line 20), add:
191
+
192
+ ```ts
193
+ /**
194
+ * Zod refinement shared by create_task and update_task for the agentProfile
195
+ * field. Returns true for valid registered profile IDs. The error message
196
+ * lists a truncated sample of valid IDs from the registry so operators can
197
+ * self-correct without cross-referencing docs.
198
+ */
199
+ function isValidAgentProfile(id: string): boolean {
200
+ return getProfile(id) !== undefined;
201
+ }
202
+
203
+ function agentProfileErrorMessage(invalid: string): string {
204
+ const valid = listProfiles()
205
+ .map((p) => p.id)
206
+ .sort();
207
+ const sample = valid.slice(0, 8).join(", ");
208
+ const more = valid.length > 8 ? `, and ${valid.length - 8} more` : "";
209
+ return `Invalid agentProfile "${invalid}". Valid profiles: ${sample}${more}. Run list_profiles (or inspect ~/.claude/skills/) to see the full set.`;
210
+ }
211
+ ```
212
+
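To see what the truncated message looks like, here is a minimal standalone sketch of the same formatting logic. The `formatProfileError` name and the `ids` parameter are hypothetical; `ids` stands in for `listProfiles().map((p) => p.id)`, which is not available outside the codebase, and the trailing hint sentence is omitted for brevity.

```typescript
// Standalone version of the message builder from Step 2.2.
// `ids` is a stand-in for the registry's profile ids (assumption).
function formatProfileError(invalid: string, ids: string[]): string {
  const valid = [...ids].sort();
  // Show at most 8 ids so the error stays readable in a chat response.
  const sample = valid.slice(0, 8).join(", ");
  const more = valid.length > 8 ? `, and ${valid.length - 8} more` : "";
  return `Invalid agentProfile "${invalid}". Valid profiles: ${sample}${more}.`;
}

const few = formatProfileError("anthropic-direct", ["researcher", "general", "code-reviewer"]);
const many = formatProfileError("x", Array.from({ length: 10 }, (_, i) => `p${i}`));

console.log(few);
console.log(many);
```

With three ids the full sorted list is shown; with ten ids the first eight are listed followed by a count of the remainder.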
213
+ - [ ] **Step 2.3: Wire the refinement into `create_task.agentProfile`**
214
+
215
+ Replace the existing field at lines 91-96 (`agentProfile: z.string().optional().describe(...)`) with:
216
+
217
+ ```ts
218
+ agentProfile: z
219
+ .string()
220
+ .refine(isValidAgentProfile, {
221
+ message: "Invalid agentProfile (not in profile registry). See list_profiles.",
222
+ })
223
+ .optional()
224
+ .describe(
225
+ "Agent profile ID (e.g. general, code-reviewer, researcher). Validated against the profile registry."
226
+ ),
227
+ ```
228
+
229
+ **Important:** `.refine(...)` goes BEFORE `.optional()` so the refinement only runs when the field is present. Reversing the order makes Zod pass `undefined` into the refine callback, which the `isValidAgentProfile` helper would reject — breaking the "omit to use default" path.
230
+
231
+ The in-schema message is intentionally short because a long rendered string makes the tool description noisy. The richer error (with listed profile IDs) is produced by a post-parse check in the handler body (next step).
232
+
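The ordering rule above can be illustrated with a toy model of wrapper composition. This is a sketch only and is not Zod's implementation: `Check`, `optional`, `refine`, and `inRegistry` are hypothetical names, and the two-entry registry is made up for the example. It shows why wrapping the refinement inside the optional layer keeps `undefined` away from the predicate.

```typescript
// Toy model of wrapper composition. NOT Zod's implementation;
// it only illustrates why the order of the two wrappers matters.
type Check = (value: unknown) => boolean;

// Accepts undefined without ever calling the inner check.
const optional = (inner: Check): Check => (v) => v === undefined || inner(v);

// Runs the predicate on whatever the inner check accepted.
const refine = (inner: Check, pred: (v: unknown) => boolean): Check =>
  (v) => inner(v) && pred(v);

const isString: Check = (v) => typeof v === "string";
const registry = new Set(["general", "code-reviewer"]);
// Stand-in for isValidAgentProfile: rejects anything not registered, including undefined.
const inRegistry = (v: unknown): boolean => typeof v === "string" && registry.has(v);

// Refine first, then optional: undefined short-circuits before the predicate.
const refineThenOptional = optional(refine(isString, inRegistry));
// Optional first, then refine: undefined reaches the predicate and is rejected.
const optionalThenRefine = refine(optional(isString), inRegistry);

console.log(refineThenOptional(undefined), optionalThenRefine(undefined));
```

Both compositions agree on present values; they differ only on the omitted-field case, which is the case the "omit to use default" path depends on.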
233
+ - [ ] **Step 2.4: Add a richer error path in the `create_task` handler body**
234
+
235
+ At the top of the `create_task` handler's `try` block (just after line 99, before the `assignedAgent` runtime check at `:100-104`), add:
236
+
237
+ ```ts
238
+ if (args.agentProfile !== undefined && !isValidAgentProfile(args.agentProfile)) {
239
+ return err(agentProfileErrorMessage(args.agentProfile));
240
+ }
241
+ ```
242
+
243
+ This is belt-and-suspenders: the Zod refinement catches the bad value at parse time (giving the short message), and the handler body catches it again with the richer enumerated message if the parse layer is bypassed by a direct handler call. The test suite exercises the handler directly so this path matters for test assertions.
244
+
245
+ - [ ] **Step 2.5: Wire the same refinement into `update_task.agentProfile`**
246
+
247
+ Same pattern at lines 163-168:
248
+
249
+ ```ts
250
+ agentProfile: z
251
+ .string()
252
+ .refine(isValidAgentProfile, {
253
+ message: "Invalid agentProfile (not in profile registry). See list_profiles.",
254
+ })
255
+ .optional()
256
+ .describe(
257
+ "Agent profile ID (e.g. general, code-reviewer, researcher). Validated against the profile registry."
258
+ ),
259
+ ```
260
+
261
+ Add the same richer-error handler check at the top of the `update_task` try block (just before the existing `assignedAgent` check at `:172-176`):
262
+
263
+ ```ts
264
+ if (args.agentProfile !== undefined && !isValidAgentProfile(args.agentProfile)) {
265
+ return err(agentProfileErrorMessage(args.agentProfile));
266
+ }
267
+ ```
268
+
269
+ - [ ] **Step 2.6: Add synchronous stale-profile check in `execute_task`**
270
+
271
+ In the `execute_task` handler body, after the task is fetched (just after line 264, where the `task` variable is available but before the `runtimeId` resolution at `:267`), add:
272
+
273
+ ```ts
274
+ if (task.agentProfile && !isValidAgentProfile(task.agentProfile)) {
275
+ return err(
276
+ `Task ${args.taskId} has an invalid agentProfile "${task.agentProfile}" (not in profile registry). ` +
277
+ `Fix with update_task { taskId, agentProfile: "<valid-id>" } before retrying. ${agentProfileErrorMessage(task.agentProfile).split(". ").slice(1).join(". ")}`
278
+ );
279
+ }
280
+ ```
281
+
282
+ Rationale: `task.agentProfile` may be `null` (acceptable — every runtime falls back to `"general"`). Only check when it is a non-null string. Do not mutate the task row; the caller decides whether to fix the profile or cancel the task.
283
+
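The guard and the message composition in this step can be exercised in isolation. The sketch below is an approximation under stated assumptions: `errorMessage` is a stand-in for `agentProfileErrorMessage` with a fixed id list, and `dropFirstSentence` and `staleProfileError` are hypothetical names. It shows that a `null` profile skips the check and that the `split(". ").slice(1)` idiom removes the leading "Invalid agentProfile" sentence.

```typescript
// Stand-in for agentProfileErrorMessage with a fixed list of ids (assumption).
function errorMessage(invalid: string): string {
  return `Invalid agentProfile "${invalid}". Valid profiles: code-reviewer, general. Run list_profiles to see the full set.`;
}

// Drops the first sentence, as the execute_task snippet does.
function dropFirstSentence(message: string): string {
  return message.split(". ").slice(1).join(". ");
}

// Mirrors the guard: null (or any other falsy value) skips the check entirely.
function staleProfileError(
  taskId: string,
  profile: string | null,
  isValid: (id: string) => boolean,
): string | null {
  if (profile && !isValid(profile)) {
    return `Task ${taskId} has an invalid agentProfile "${profile}". ${dropFirstSentence(errorMessage(profile))}`;
  }
  return null;
}

const isValid = (id: string): boolean => id === "general" || id === "code-reviewer";
const stale = staleProfileError("task-1", "anthropic-direct", isValid);
const usesDefault = staleProfileError("task-1", null, isValid);

console.log(stale);
console.log(usesDefault);
```

One caveat on the design: the split assumes the first sentence contains no `". "` sequence, so a stored profile value that itself contains a period followed by a space would leave part of the first sentence in the output.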
284
+ - [ ] **Step 2.7: Add empty-result note to `list_tasks`**
285
+
286
+ This is the UX remediation for root cause #1 identified in the spike. In the `list_tasks` handler, replace the current `return ok(result);` at line 54 with:
287
+
288
+ ```ts
289
+ if (result.length === 0 && effectiveProjectId) {
290
+ return ok({
291
+ tasks: [],
292
+ note: `No tasks found in project ${effectiveProjectId}. ` +
293
+ `Use projectId: null to list tasks from any project, ` +
294
+ `or get_task <id> to look up a specific task directly.`,
295
+ });
296
+ }
297
+ return ok(result);
298
+ ```
299
+
300
+ **Important:** Do not change the happy-path return shape (`result` is an array). The note is injected only on the empty-with-filter path, where the response becomes a `{ tasks, note }` envelope. Clients that always read the response as an array keep working whenever rows are returned, but they will receive an object instead of `[]` in this one known-empty case.
301
+
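Because this step introduces a second response shape, a consumer may want to normalize both into one structure. The following is a minimal sketch of that idea; the `Task` and `ListTasksPayload` types and the `normalizeListTasks` helper are hypothetical and not part of the plan.

```typescript
// Hypothetical client-side types for the two payload shapes in this step.
interface Task { id: string; title: string }
type ListTasksPayload = Task[] | { tasks: Task[]; note: string };

// Normalizes both shapes so callers need a single code path.
function normalizeListTasks(payload: ListTasksPayload): { tasks: Task[]; note?: string } {
  if (Array.isArray(payload)) {
    // Happy path, or empty result with no project filter: plain array.
    return { tasks: payload };
  }
  // Empty result with an active project filter: envelope with a note.
  return { tasks: payload.tasks, note: payload.note };
}

const happy = normalizeListTasks([{ id: "task-1", title: "existing" }]);
const emptyScoped = normalizeListTasks({ tasks: [], note: "No tasks found in project proj-active." });

console.log(happy.tasks.length, emptyScoped.note);
```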
302
+ - [ ] **Step 2.8: Type check**
303
+
304
+ `npx tsc --noEmit 2>&1 | tail -5; echo exit=${PIPESTATUS[0]}` (bash-specific; a plain `$?` after the pipe would report the exit status of `tail`, not `tsc`). Expected: `exit=0` or pre-existing-only errors (the known set from handoff: `claude-agent.test.ts:83, 408-410, 432, 669`; `chat-session-provider.test.tsx` module-not-found; `schedule-tools.ts` "await has no effect" spurious IDE diagnostics).
305
+
306
+ ### Test file
307
+
308
+ - [ ] **Step 2.9: Create the test scaffold**
309
+
310
+ Create `src/lib/chat/tools/__tests__/task-tools.test.ts`:
311
+
312
+ ```ts
313
+ import { describe, it, expect, vi, beforeEach } from "vitest";
314
+ import { z } from "zod";
315
+
316
+ interface TaskRow {
317
+ id: string;
318
+ title: string;
319
+ status: string;
320
+ projectId: string | null;
321
+ agentProfile: string | null;
322
+ assignedAgent: string | null;
323
+ [key: string]: unknown;
324
+ }
325
+
326
+ const { mockState } = vi.hoisted(() => ({
327
+ mockState: {
328
+ rows: [] as TaskRow[],
329
+ lastInsertValues: null as Record<string, unknown> | null,
330
+ lastUpdateValues: null as Record<string, unknown> | null,
331
+ },
332
+ }));
333
+
334
+ vi.mock("@/lib/db", () => {
335
+ const selectBuilder = {
336
+ from() { return this; },
337
+ where() { return this; },
338
+ orderBy() { return this; },
339
+ limit() {
340
+ return Promise.resolve(mockState.rows);
341
+ },
342
+ get() { return Promise.resolve(mockState.rows[0]); },
343
+ then<TResolve>(resolve: (rows: TaskRow[]) => TResolve) {
344
+ return Promise.resolve(mockState.rows).then(resolve);
345
+ },
346
+ };
347
+ return {
348
+ db: {
349
+ select: () => selectBuilder,
350
+ insert: () => ({
351
+ values: (v: Record<string, unknown>) => {
352
+ mockState.lastInsertValues = v;
353
+ mockState.rows = [{
354
+ id: "task-1",
355
+ title: "",
356
+ status: "planned",
357
+ projectId: null,
358
+ agentProfile: null,
359
+ assignedAgent: null,
360
+ ...v,
361
+ } as TaskRow];
362
+ return Promise.resolve();
363
+ },
364
+ }),
365
+ update: () => ({
366
+ set: (v: Record<string, unknown>) => {
367
+ mockState.lastUpdateValues = v;
368
+ if (mockState.rows[0]) {
369
+ mockState.rows[0] = { ...mockState.rows[0], ...v } as TaskRow;
370
+ }
371
+ return { where: () => Promise.resolve() };
372
+ },
373
+ }),
374
+ },
375
+ };
376
+ });
377
+
378
+ vi.mock("@/lib/db/schema", () => ({
379
+ tasks: {
380
+ id: "id",
381
+ projectId: "projectId",
382
+ status: "status",
383
+ priority: "priority",
384
+ createdAt: "createdAt",
385
+ },
386
+ }));
387
+
388
+ vi.mock("drizzle-orm", () => ({
389
+ eq: () => ({}),
390
+ and: () => ({}),
391
+ desc: () => ({}),
392
+ }));
393
+
394
+ // Mock the profile registry: accept "general", "code-reviewer", and "researcher",
395
+ // reject everything else. listProfiles returns a small known set.
396
+ vi.mock("@/lib/agents/profiles/registry", () => {
397
+ const validIds = new Set(["general", "code-reviewer", "researcher"]);
398
+ return {
399
+ getProfile: (id: string) =>
400
+ validIds.has(id)
401
+ ? { id, name: id, description: "test", tags: [], skillMd: "", allowedTools: [], mcpServers: {}, systemPrompt: "" }
402
+ : undefined,
403
+ listProfiles: () => Array.from(validIds).map((id) => ({ id, name: id })),
404
+ };
405
+ });
406
+
407
+ // Mock the runtime catalog so isAgentRuntimeId is deterministic in tests.
408
+ vi.mock("@/lib/agents/runtime/catalog", () => ({
409
+ DEFAULT_AGENT_RUNTIME: "claude",
410
+ SUPPORTED_AGENT_RUNTIMES: ["claude", "anthropic-direct", "openai-direct"],
411
+ isAgentRuntimeId: (id: string) => ["claude", "anthropic-direct", "openai-direct"].includes(id),
412
+ }));
413
+
414
+ // Mock the router so execute_task's dynamic import doesn't explode.
415
+ vi.mock("@/lib/agents/router", () => ({
416
+ executeTaskWithAgent: () => Promise.resolve(),
417
+ }));
418
+
419
+ import { taskTools } from "../task-tools";
420
+
421
+ function getTool(name: string) {
422
+ const tools = taskTools({ projectId: undefined } as never);
423
+ const tool = tools.find((t) => t.name === name);
424
+ if (!tool) throw new Error(`Tool not found: ${name}`);
425
+ return tool;
426
+ }
427
+
428
+ function parseArgs(toolName: string, args: unknown) {
429
+ const tool = getTool(toolName);
430
+ return z.object(tool.zodShape).safeParse(args);
431
+ }
432
+
433
+ function callHandler(toolName: string, args: unknown) {
434
+ const tool = getTool(toolName);
435
+ return tool.handler(args);
436
+ }
437
+
438
+ function getToolResultText(result: { content: Array<{ type: string; text: string }>; isError?: boolean }) {
439
+ return result.content[0]?.text ?? "";
440
+ }
441
+
442
+ beforeEach(() => {
443
+ mockState.rows = [];
444
+ mockState.lastInsertValues = null;
445
+ mockState.lastUpdateValues = null;
446
+ });
447
+ ```
448
+
449
+ - [ ] **Step 2.10: Smoke-run the scaffold**
450
+
451
+ `npx vitest run src/lib/chat/tools/__tests__/task-tools.test.ts 2>&1 | tail -20`. Expected: a "no tests found" report or similar, because the scaffold defines no `describe` blocks yet. If imports fail instead, extend the mocks with whatever is missing. Do not change source code.
452
+
453
+ - [ ] **Step 2.11: Add `create_task` validation tests**
454
+
455
+ ```ts
456
+ describe("create_task agentProfile Zod validation", () => {
457
+ const base = { title: "test task" };
458
+
459
+ it("accepts a valid profile id", () => {
460
+ const result = parseArgs("create_task", { ...base, agentProfile: "general" });
461
+ expect(result.success).toBe(true);
462
+ });
463
+
464
+ it("accepts another valid profile id", () => {
465
+ const result = parseArgs("create_task", { ...base, agentProfile: "code-reviewer" });
466
+ expect(result.success).toBe(true);
467
+ });
468
+
469
+ it("accepts omitted agentProfile", () => {
470
+ const result = parseArgs("create_task", base);
471
+ expect(result.success).toBe(true);
472
+ });
473
+
474
+ it("rejects a runtime id passed as agentProfile", () => {
475
+ const result = parseArgs("create_task", { ...base, agentProfile: "anthropic-direct" });
476
+ expect(result.success).toBe(false);
477
+ });
478
+
479
+ it("rejects an arbitrary invalid string", () => {
480
+ const result = parseArgs("create_task", { ...base, agentProfile: "not-a-profile" });
481
+ expect(result.success).toBe(false);
482
+ });
483
+ });
484
+
485
+ describe("create_task handler-level error messages", () => {
486
+ it("returns a descriptive error naming the invalid value and listing valid profile ids", async () => {
487
+ const result = await callHandler("create_task", {
488
+ title: "test task",
489
+ agentProfile: "anthropic-direct",
490
+ });
491
+ expect(result.isError).toBe(true);
492
+ const text = getToolResultText(result);
493
+ expect(text).toContain("anthropic-direct");
494
+ expect(text).toMatch(/code-reviewer|general|researcher/);
495
+ });
496
+
497
+ it("inserts a task when agentProfile is valid", async () => {
498
+ const result = await callHandler("create_task", {
499
+ title: "test task",
500
+ agentProfile: "general",
501
+ });
502
+ expect(result.isError).toBeFalsy();
503
+ expect(mockState.lastInsertValues?.agentProfile).toBe("general");
504
+ });
505
+
506
+ it("inserts with null agentProfile when omitted", async () => {
507
+ await callHandler("create_task", { title: "test task" });
508
+ expect(mockState.lastInsertValues?.agentProfile).toBe(null);
509
+ });
510
+ });
511
+ ```
512
+
513
+ - [ ] **Step 2.12: Add `update_task` validation tests**
514
+
515
+ ```ts
516
+ describe("update_task agentProfile Zod validation", () => {
517
+ const base = { taskId: "task-1" };
518
+
519
+ it("accepts a valid profile id", () => {
520
+ const result = parseArgs("update_task", { ...base, agentProfile: "researcher" });
521
+ expect(result.success).toBe(true);
522
+ });
523
+
524
+ it("rejects a runtime id", () => {
525
+ const result = parseArgs("update_task", { ...base, agentProfile: "anthropic-direct" });
526
+ expect(result.success).toBe(false);
527
+ });
528
+ });
529
+
530
+ describe("update_task handler-level agentProfile validation", () => {
531
+ beforeEach(() => {
532
+ mockState.rows = [{
533
+ id: "task-1",
534
+ title: "existing",
535
+ status: "planned",
536
+ projectId: null,
537
+ agentProfile: null,
538
+ assignedAgent: null,
539
+ } as TaskRow];
540
+ });
541
+
542
+ it("returns a descriptive error when the new agentProfile is invalid", async () => {
543
+ const result = await callHandler("update_task", {
544
+ taskId: "task-1",
545
+ agentProfile: "anthropic-direct",
546
+ });
547
+ expect(result.isError).toBe(true);
548
+ expect(getToolResultText(result)).toContain("anthropic-direct");
549
+ });
550
+
551
+ it("updates when the new agentProfile is valid", async () => {
552
+ const result = await callHandler("update_task", {
553
+ taskId: "task-1",
554
+ agentProfile: "code-reviewer",
555
+ });
556
+ expect(result.isError).toBeFalsy();
557
+ expect(mockState.lastUpdateValues?.agentProfile).toBe("code-reviewer");
558
+ });
559
+ });
560
+ ```
561
+
562
+ - [ ] **Step 2.13: Add `execute_task` stale-profile tests**
563
+
564
+ ```ts
565
+ describe("execute_task stale agentProfile surfacing", () => {
566
+ it("returns synchronous error when the stored task.agentProfile is invalid", async () => {
567
+ mockState.rows = [{
568
+ id: "task-1",
569
+ title: "stale task",
570
+ status: "planned",
571
+ projectId: null,
572
+ agentProfile: "anthropic-direct", // invalid — a runtime id
573
+ assignedAgent: null,
574
+ } as TaskRow];
575
+
576
+ const result = await callHandler("execute_task", { taskId: "task-1" });
577
+ expect(result.isError).toBe(true);
578
+ const text = getToolResultText(result);
579
+ expect(text).toContain("anthropic-direct");
580
+ expect(text).toContain("update_task");
581
+ });
582
+
583
+ it("queues execution when task.agentProfile is valid", async () => {
584
+ mockState.rows = [{
585
+ id: "task-1",
586
+ title: "ok task",
587
+ status: "planned",
588
+ projectId: null,
589
+ agentProfile: "general",
590
+ assignedAgent: null,
591
+ } as TaskRow];
592
+
593
+ const result = await callHandler("execute_task", { taskId: "task-1" });
594
+ expect(result.isError).toBeFalsy();
595
+ });
596
+
597
+ it("queues execution when task.agentProfile is null (runtime falls back to general)", async () => {
598
+ mockState.rows = [{
599
+ id: "task-1",
600
+ title: "ok task",
601
+ status: "planned",
602
+ projectId: null,
603
+ agentProfile: null,
604
+ assignedAgent: null,
605
+ } as TaskRow];
606
+
607
+ const result = await callHandler("execute_task", { taskId: "task-1" });
608
+ expect(result.isError).toBeFalsy();
609
+ });
610
+ });
611
+ ```
612
+
613
+ - [ ] **Step 2.14: Add `list_tasks` empty-result note tests**
614
+
615
+ ```ts
616
+ describe("list_tasks empty-result note", () => {
617
+ it("returns a note when a project filter is active and zero rows result", async () => {
618
+ mockState.rows = [];
619
+ const tool = getTool("list_tasks");
620
+ // Invoke via a ctx that has a projectId, mimicking a scoped chat session.
621
+ const tools = taskTools({ projectId: "proj-active" } as never);
622
+ const list = tools.find((t) => t.name === "list_tasks")!;
623
+ const result = await list.handler({});
624
+ expect(result.isError).toBeFalsy();
625
+ const text = getToolResultText(result);
626
+ expect(text).toContain("proj-active");
627
+ expect(text).toContain("projectId: null");
628
+ // Also confirm the envelope shape: tasks is an empty array and note is present
629
+ const parsed = JSON.parse(text);
630
+ expect(parsed).toMatchObject({ tasks: [], note: expect.stringContaining("proj-active") });
631
+ });
632
+
633
+ it("returns the plain array (no note) when a project filter is active and rows are returned", async () => {
634
+ mockState.rows = [{
635
+ id: "task-1",
636
+ title: "existing",
637
+ status: "planned",
638
+ projectId: "proj-active",
639
+ agentProfile: null,
640
+ assignedAgent: null,
641
+ } as TaskRow];
642
+
643
+ const tools = taskTools({ projectId: "proj-active" } as never);
644
+ const list = tools.find((t) => t.name === "list_tasks")!;
645
+ const result = await list.handler({});
646
+ const parsed = JSON.parse(getToolResultText(result));
647
+ expect(Array.isArray(parsed)).toBe(true);
648
+ expect(parsed).toHaveLength(1);
649
+ });
650
+
651
+ it("returns the plain array (no note) when no filter is active and zero rows result", async () => {
652
+ mockState.rows = [];
653
+ const tools = taskTools({ projectId: undefined } as never);
654
+ const list = tools.find((t) => t.name === "list_tasks")!;
655
+ const result = await list.handler({});
656
+ const parsed = JSON.parse(getToolResultText(result));
657
+ expect(Array.isArray(parsed)).toBe(true);
658
+ expect(parsed).toHaveLength(0);
659
+ });
660
+ });
661
+ ```
662
+
663
+ **Note for the implementer:** the `ok()` helper returns `{ content: [{ type: "text", text: JSON.stringify(payload) }] }`. Confirm by reading `src/lib/chat/tools/helpers.ts` before running the tests. If the payload is stored differently, adjust the `getToolResultText` helper and the `JSON.parse` calls accordingly.
664
+
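The result shape the tests rely on can be sketched as follows. This is a stand-in under stated assumptions, not the contents of `helpers.ts`: the `ok` shape is taken from the note above, while the `err` shape (same envelope with `isError: true`) is an assumption inferred from the test assertions and should be verified against the real helper.

```typescript
interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Assumed success shape, as described in the note above.
function ok(payload: unknown): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

// Assumed error shape: same envelope with isError set. Verify against helpers.ts.
function err(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Same accessor the test scaffold defines.
function getToolResultText(result: ToolResult): string {
  return result.content[0]?.text ?? "";
}

const listResult = ok({ tasks: [], note: "No tasks found in project proj-active." });
const parsed = JSON.parse(getToolResultText(listResult));
const failure = err('Invalid agentProfile "anthropic-direct".');

console.log(parsed.note, failure.isError);
```

If the real `ok()` wraps the payload differently, only `getToolResultText` and the `JSON.parse` calls in the tests need to change; the assertions on the parsed value can stay as they are.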
665
+ - [ ] **Step 2.15: Add `get_task` AC #4 regression test**
666
+
667
+ ```ts
668
+ describe("get_task AC #4: failed tasks remain findable", () => {
669
+ it("finds a task regardless of status (including failed)", async () => {
670
+ mockState.rows = [{
671
+ id: "task-1",
672
+ title: "a failed task",
673
+ status: "failed",
674
+ projectId: "proj-other",
675
+ agentProfile: null,
676
+ assignedAgent: null,
677
+ } as TaskRow];
678
+
679
+ const result = await callHandler("get_task", { taskId: "task-1" });
680
+ expect(result.isError).toBeFalsy();
681
+ const text = getToolResultText(result);
682
+ expect(text).toContain("task-1");
683
+ expect(text).toContain("failed");
684
+ });
685
+
686
+ it("does not apply a project filter (returns the task even when stored under a different project)", async () => {
687
+ mockState.rows = [{
688
+ id: "task-1",
689
+ title: "cross-project task",
690
+ status: "completed",
691
+ projectId: "proj-A",
692
+ agentProfile: null,
693
+ assignedAgent: null,
694
+ } as TaskRow];
695
+
696
+ // Call with a ctx that has projectId = B — get_task should still find it.
697
+ const tools = taskTools({ projectId: "proj-B" } as never);
698
+ const tool = tools.find((t) => t.name === "get_task")!;
699
+ const result = await tool.handler({ taskId: "task-1" });
700
+ expect(result.isError).toBeFalsy();
701
+ });
702
+ });
703
+ ```
704
+
705
+ - [ ] **Step 2.16: Run the full test file and verify all pass**
706
+
707
+ `npx vitest run src/lib/chat/tools/__tests__/task-tools.test.ts 2>&1 | tail -40`
708
+
709
+ Expected: all tests pass. Typical count: 5 (create Zod) + 3 (create handler) + 2 (update Zod) + 2 (update handler) + 3 (execute stale) + 3 (list_tasks note) + 2 (get_task AC#4) = 20 tests.
710
+
711
+ If a test fails on an unexpected mock shape, extend the mock rather than changing source code. Common sources of friction: the `ok()` helper's actual text format (read `helpers.ts` to confirm), the drizzle `update().set().where()` chain shape, and whether `list_tasks` uses `.limit(50)` on its builder chain (if so, the `limit` stub needs to return the rows, which it already does).
712
+
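The mock-shape friction mentioned above usually comes down to how the query chain is awaited. Here is a minimal sketch of the thenable-builder pattern the scaffold uses; the `builder` object and `demo` function are illustrative stand-ins, not the real drizzle API. Each chain method returns the builder, and the `then` method lets the chain be awaited at any point, whether or not it ends in `.limit()`.

```typescript
const rows = [{ id: "task-1" }];

// Each chain method returns the builder; `then` makes the builder itself awaitable.
const builder = {
  from() { return this; },
  where() { return this; },
  // Terminal call that returns a real Promise.
  limit() { return Promise.resolve(rows); },
  // Lets `await builder.from().where()` resolve to the rows without a terminal call.
  then<T>(resolve: (r: typeof rows) => T) {
    return Promise.resolve(rows).then(resolve);
  },
};

async function demo() {
  const viaThen = await builder.from().where();          // resolved through `then`
  const viaLimit = await builder.from().where().limit(); // resolved through the Promise from `limit`
  return { viaThen, viaLimit };
}

demo().then((r) => console.log(r.viaThen.length, r.viaLimit.length));
```

If the source chain ends in a method the mock does not define, the failure shows up as "is not a function" on the builder, which points at the stub to add.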
713
+ - [ ] **Step 2.17: Regression sanity — adjacent tests still pass**
714
+
715
+ `npx vitest run src/lib/chat/tools/__tests__/ 2>&1 | tail -20`. Expected: task-tools + schedule-tools + workflow-tools-dedup + enrich-table-tool all green.
716
+
717
+ - [ ] **Step 2.18: Type check**
718
+
719
+ `npx tsc --noEmit 2>&1 | tail -5; echo exit=${PIPESTATUS[0]}` (bash-specific; a plain `$?` would report the exit status of `tail`, not `tsc`). Expected: exit 0 or pre-existing-only.
720
+
721
+ - [ ] **Step 2.19: Verify diff scope**
722
+
723
+ `git status` should show:
724
+ - Modified: `src/lib/chat/tools/task-tools.ts`
725
+ - New: `src/lib/chat/tools/__tests__/task-tools.test.ts`
726
+
727
+ And nothing else.
728
+
729
+ - [ ] **Step 2.20: Commit**
730
+
731
+ ```bash
732
+ git add src/lib/chat/tools/task-tools.ts src/lib/chat/tools/__tests__/task-tools.test.ts
733
+ git commit -m "$(cat <<'EOF'
734
+ feat(chat): validate agentProfile against profile registry
735
+
736
+ create_task and update_task previously accepted any string as
737
+ agentProfile, including runtime ids like "anthropic-direct" that are
738
+ guaranteed to fail at execution time with no feedback at creation
739
+ time. Both tools now run a Zod .refine() against the profile registry
740
+ and the handler body also returns a richer error enumerating the
741
+ valid profile ids so operators can self-correct without reading docs.
742
+
743
+ execute_task now also runs a synchronous stale-profile check on the
744
+ stored task.agentProfile before queuing — this catches tasks created
745
+ before this fix (or via a direct DB write) that carry invalid profile
746
+ values, surfacing the error in the immediate chat-tool response
747
+ instead of letting them fail later at runtime.
748
+
749
+ list_tasks now returns a sibling "note" field in its response
750
+ envelope when a project filter is active and the result is empty,
751
+ explaining that tasks may exist in other projects and suggesting
752
+ projectId: null or get_task as alternatives. This addresses the
753
+ most probable root cause of the originally-reported "task disappears
754
+ after creation" symptom, which the spike addendum documented in the
755
+ preceding commit.
756
+
757
+ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
758
+ EOF
759
+ )"
760
+ ```
761
+
762
+ Verify: `git log --oneline -3`.
763
+
764
+ ---
765
+
766
+ ## Task 3: Flip to completed + roadmap + changelog
767
+
768
+ **Files:**
769
+ - Modify: `features/task-create-profile-validation.md` (frontmatter only)
770
+ - Modify: `features/roadmap.md`
771
+ - Modify: `features/changelog.md`
772
+
773
+ - [ ] **Step 3.1: Flip spec frontmatter**
774
+
775
+ In `features/task-create-profile-validation.md`, change `status: planned` to `status: completed` in the YAML frontmatter.
776
+
777
+ - [ ] **Step 3.2: Update roadmap row**
778
+
779
+ In `features/roadmap.md`, find the row for `task-create-profile-validation` and change its Status column from `planned` to `completed`.
780
+
781
+ - [ ] **Step 3.3: Prepend changelog entry**
782
+
783
+ In `features/changelog.md`, under the existing `## 2026-04-11` section, add a `### Completed — task-create-profile-validation (P1)` subsection above the existing `### Completed — schedule-maxturns-api-control (P2)` entry. Content:
784
+
785
+ ```markdown
786
+ ### Completed — task-create-profile-validation (P1)
787
+
788
+ Closed the profile validation gap at `create_task` and `update_task` — both previously accepted any string as `agentProfile`, including runtime ids like `"anthropic-direct"` that are guaranteed to fail at execution time. Both tools now run a Zod `.refine()` against the profile registry via the new shared `isValidAgentProfile` helper, and the handler body returns a richer enumerated error so operators can self-correct without reading docs.
789
+
790
+ `execute_task` now runs a synchronous stale-profile check on the stored `task.agentProfile` before queuing, surfacing the error in the immediate chat-tool response instead of letting it fail later at runtime. `list_tasks` now returns a sibling `note` field on empty-result-with-active-filter responses, addressing the most probable UX-level root cause of the original "task disappears after creation" symptom the spike addendum documented.
791
+
792
+ **Spike conclusion:** The original handoff's "task was deleted" framing was false — no `db.delete(tasks)` exists anywhere in `src/`, and every failure path in `claude-agent.ts` preserves the row with `status: "failed"` and a `failureReason`. Real root causes are (1) `list_tasks` silent project-scoping by `ctx.projectId` (fixed in this feature via the empty-result note) and (2) `STAGENT_DATA_DIR` per-process domain-clone isolation (intentional per `MEMORY.md → shared-stagent-data-dir.md`, remediation deferred to a separate feature).
793
+
794
+ **Commits:**
795
+ - `<SHA-task1>` — `docs(features): add spike addendum for task disappearance symptom`
796
+ - `<SHA-task2>` — `feat(chat): validate agentProfile against profile registry`
797
+
798
+ **Verification:**
799
+ - `npx vitest run src/lib/chat/tools/__tests__/task-tools.test.ts` → 20/20 passing
800
+ - Adjacent `src/lib/chat/tools/__tests__/` suite → all green (task-tools + schedule-tools + workflow-tools-dedup + enrich-table-tool)
801
+ - `npx tsc --noEmit` → exit 0 or pre-existing-only
802
+ - No smoke test required — `task-tools.ts` is a leaf consumer of `profiles/registry.ts`, no runtime-registry adjacency per TDR-032.
803
+ ```
804
+
805
+ Replace `<SHA-task1>` and `<SHA-task2>` with the actual commit SHAs from `git log --oneline -5` before committing.
806
+
807
+ - [ ] **Step 3.4: Ship verification walk-through**
808
+
809
+ Walk through the spec's Acceptance Criteria and confirm:
810
+
811
+ 1. `create_task` rejects invalid `agentProfile` values with descriptive error — Step 2.3, Step 2.4, tests Step 2.11
812
+ 2. New test in `task-tools.test.ts` asserts `create_task` rejects `"anthropic-direct"` — test Step 2.11 `"rejects a runtime id passed as agentProfile"`
813
+ 3. Spike documents actual cause before code — Task 1 commit lands first
814
+ 4. No task from `create_task` is unfindable via `get_task` — tests Step 2.15 (existing `get_task` behavior verified unchanged; no project filter)
815
+ 5. `execute_task` surfaces validation/profile errors synchronously — Step 2.6, tests Step 2.13
816
+ 6. Existing task-tools tests still pass — Step 2.17 adjacent regression check
817
+ 7. `list_tasks` note documenting the project-scoping root cause fix — Step 2.7, tests Step 2.14
818
+
+ - [ ] **Step 3.5: Commit the flip**
+
+ ```bash
+ git add features/task-create-profile-validation.md features/roadmap.md features/changelog.md
+ git commit -m "$(cat <<'EOF'
+ docs(features): flip task-create-profile-validation to completed
+
+ create_task and update_task now validate agentProfile against the
+ profile registry, execute_task surfaces stale-profile errors
+ synchronously, and list_tasks explains the active project scope
+ when a filter yields zero rows. Ship-verified against all spec
+ acceptance criteria, including the spike documentation requirement.
+
+ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+ EOF
+ )"
+ ```
+
+ - [ ] **Step 3.6: Push all three commits as a stack**
+
+ ```bash
+ git push origin main
+ ```
+
+ Expected: three commits pushed (the spike addendum, the feature commit, and the flip commit), plus the plan-file commit if it has been committed. No hook failures. Verify with `git log --oneline origin/main..HEAD`, which should print nothing after the push.
+
+ ---
+
+ ## Verification before declaring done
+
+ - [ ] `npx vitest run src/lib/chat/tools/__tests__/task-tools.test.ts` → ~20 tests green
+ - [ ] Adjacent `src/lib/chat/tools/__tests__/` suite → all green
+ - [ ] `npx tsc --noEmit` → exit 0, or only pre-existing errors
+ - [ ] `git log --oneline origin/main..HEAD` → empty (remote in sync)
+ - [ ] Spec frontmatter, roadmap row, and changelog entry agree (all three say "completed")
+ - [ ] The spike addendum is present in the spec's References section with file:line citations
+ - [ ] The `list_tasks` note appears only on the empty-with-filter path, not on the happy path
+ - [ ] No changes to `schema.ts`, `bootstrap.ts`, `claude-agent.ts`, `runtime/`, or any UI file
+
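The `list_tasks` item in the checklist above distinguishes two return shapes. A minimal sketch, assuming a hypothetical `listTasks` helper over an in-memory array, a status filter as the only filter, and an illustrative `note` field name (the plan states only that the happy path returns a raw array and that the empty-with-filter path carries an explanatory note):

```typescript
interface Task {
  id: string;
  projectId: string;
  status: string;
}

// Envelope returned only when a filter yields zero rows; `note` is an assumed field name.
interface EmptyEnvelope {
  tasks: Task[];
  note: string;
}

// Hypothetical helper modelling the two return shapes of list_tasks.
function listTasks(
  all: Task[],
  activeProjectId: string,
  filter?: { status?: string },
): Task[] | EmptyEnvelope {
  const status = filter?.status;
  const rows = all.filter(
    (task) => task.projectId === activeProjectId && (status === undefined || task.status === status),
  );
  if (rows.length === 0 && status !== undefined) {
    // Empty-with-filter path: explain the active project scope.
    return {
      tasks: [],
      note: `No tasks matched in active project "${activeProjectId}"; tasks in other projects are not listed.`,
    };
  }
  // Happy path (and empty-without-filter): raw array, unchanged shape.
  return rows;
}
```

A test for this item would assert that the result is an array whenever rows are returned, and that the note is present only when a filter matched nothing.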
+ ## Self-review notes
+
+ - The `.refine()` order (`.refine().optional()`, not `.optional().refine()`) is load-bearing. If an implementer reverses it, the refine callback receives `undefined` and the omit-to-default path breaks. The Step 2.11 test "accepts omitted agentProfile" will catch this.
+ - The richer-error check in the handler body is intentionally redundant with the Zod refinement. The Zod layer fires for callers that go through `tool-registry`'s validation wrapper; the handler layer fires for direct handler calls (including all tests in this file). Both paths must return the richer enumerated message.
+ - The `execute_task` check uses `task.agentProfile && !isValidAgentProfile(...)`, which short-circuits on `null`, the valid "use runtime default" state. Do not change it to `task.agentProfile !== null` or similar without re-reading the `?? "general"` fallback pattern that every runtime uses.
+ - The `list_tasks` envelope change applies **only** to the empty-with-filter branch. The happy path still returns a raw array, and we keep that shape: happy-path callers reading `result[0]` keep working, whereas callers reading `result.tasks[0]` would break.
+ - The plan file (`docs/superpowers/plans/2026-04-11-task-create-profile-validation.md`) is committed separately as a `docs(plan)` commit before the spike-addendum commit, matching the precedent from Task 2 (commit `484c2ea`). The controller handles this, not the implementer.
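The first note, on `.refine()` versus `.optional()` ordering, can be illustrated without Zod itself. The sketch below hand-rolls two small combinators, `refine` and `optional`, that mimic the wrapping order the note describes. They are illustrative stand-ins and not Zod's API, and the profile list is hypothetical.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: string };
type Validator<I, O> = (input: I) => Result<O>;

// Hypothetical registry contents for the example.
const PROFILES = ["general", "researcher"];

// Wraps a validator with an extra predicate, in the spirit of a schema refinement.
function refine<I, O>(inner: Validator<I, O>, check: (value: O) => boolean, message: string): Validator<I, O> {
  return (input: I): Result<O> => {
    const result = inner(input);
    if (!result.ok) return result;
    if (check(result.value)) return result;
    return { ok: false, error: message };
  };
}

// Wraps a validator so that undefined passes through without reaching the inner validator.
function optional<I, O>(inner: Validator<I, O>): Validator<I | undefined, O | undefined> {
  return (input: I | undefined): Result<O | undefined> => {
    if (input === undefined) return { ok: true, value: undefined };
    return inner(input);
  };
}

const str: Validator<unknown, string> = (input) =>
  typeof input === "string" ? { ok: true, value: input } : { ok: false, error: "expected string" };

const isProfile = (value: string | undefined): boolean => PROFILES.some((p) => p === value);
const message = "unknown agentProfile";

// Correct order: refine first, then optional. undefined never reaches the check.
const refineThenOptional = optional(refine(str, isProfile, message));

// Reversed order: the check runs on undefined and rejects an omitted value.
const optionalThenRefine = refine(optional(str), isProfile, message);
```

In this model, `refineThenOptional(undefined)` succeeds with `value: undefined`, leaving the handler free to apply its default, while `optionalThenRefine(undefined)` fails with the "unknown agentProfile" error. That is the breakage the "accepts omitted agentProfile" test is meant to catch.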