stagent 0.9.5 → 0.10.0
- package/README.md +5 -42
- package/dist/cli.js +42 -18
- package/docs/.coverage-gaps.json +13 -55
- package/docs/.last-generated +1 -1
- package/docs/features/provider-runtimes.md +4 -0
- package/docs/features/schedules.md +32 -4
- package/docs/features/settings.md +28 -5
- package/docs/features/tables.md +9 -2
- package/docs/features/workflows.md +10 -4
- package/docs/journeys/developer.md +15 -1
- package/docs/journeys/personal-use.md +21 -4
- package/docs/superpowers/plans/2026-04-07-instance-bootstrap.md +1691 -0
- package/docs/superpowers/plans/2026-04-08-schedule-orchestration.md +2983 -0
- package/docs/superpowers/plans/2026-04-11-schedule-maxturns-api-control.md +551 -0
- package/docs/superpowers/plans/2026-04-11-task-create-profile-validation.md +864 -0
- package/docs/superpowers/plans/2026-04-11-task-runtime-stagent-mcp-injection.md +739 -0
- package/docs/superpowers/specs/2026-04-08-chat-sse-resilience-hotfix-design.md +201 -0
- package/docs/superpowers/specs/2026-04-08-schedule-orchestration-design.md +371 -0
- package/docs/superpowers/specs/2026-04-08-swarm-visibility-design.md +213 -0
- package/package.json +3 -2
- package/src/__tests__/instrumentation-smoke.test.ts +15 -0
- package/src/app/analytics/page.tsx +1 -21
- package/src/app/api/chat/conversations/[id]/messages/route.ts +22 -1
- package/src/app/api/diagnostics/chat-streams/route.ts +65 -0
- package/src/app/api/instance/config/route.ts +41 -0
- package/src/app/api/instance/init/route.ts +34 -0
- package/src/app/api/instance/upgrade/check/route.ts +26 -0
- package/src/app/api/instance/upgrade/route.ts +96 -0
- package/src/app/api/instance/upgrade/status/route.ts +35 -0
- package/src/app/api/memory/route.ts +0 -11
- package/src/app/api/notifications/route.ts +4 -2
- package/src/app/api/projects/[id]/route.ts +5 -155
- package/src/app/api/projects/__tests__/delete-project.test.ts +10 -19
- package/src/app/api/schedules/[id]/execute/route.ts +111 -0
- package/src/app/api/schedules/[id]/route.ts +9 -1
- package/src/app/api/schedules/__tests__/execute-route.test.ts +118 -0
- package/src/app/api/schedules/route.ts +3 -12
- package/src/app/api/settings/openai/login/route.ts +22 -0
- package/src/app/api/settings/openai/logout/route.ts +7 -0
- package/src/app/api/settings/openai/route.ts +21 -1
- package/src/app/api/settings/providers/route.ts +35 -8
- package/src/app/api/tables/[id]/enrich/__tests__/route.test.ts +153 -0
- package/src/app/api/tables/[id]/enrich/plan/route.ts +98 -0
- package/src/app/api/tables/[id]/enrich/route.ts +147 -0
- package/src/app/api/tables/[id]/enrich/runs/route.ts +25 -0
- package/src/app/api/tasks/[id]/execute/route.ts +0 -21
- package/src/app/api/workflows/[id]/resume/route.ts +59 -0
- package/src/app/api/workflows/[id]/status/route.ts +22 -8
- package/src/app/api/workspace/context/route.ts +2 -0
- package/src/app/api/workspace/fix-data-dir/route.ts +81 -0
- package/src/app/chat/page.tsx +11 -0
- package/src/app/inbox/page.tsx +12 -5
- package/src/app/layout.tsx +42 -21
- package/src/app/page.tsx +0 -2
- package/src/app/settings/page.tsx +6 -9
- package/src/components/chat/__tests__/chat-session-provider.test.tsx +408 -0
- package/src/components/chat/chat-command-popover.tsx +2 -2
- package/src/components/chat/chat-input.tsx +2 -3
- package/src/components/chat/chat-session-provider.tsx +720 -0
- package/src/components/chat/chat-shell.tsx +92 -401
- package/src/components/instance/__tests__/instance-section.test.tsx +125 -0
- package/src/components/instance/instance-section.tsx +382 -0
- package/src/components/instance/upgrade-badge.tsx +219 -0
- package/src/components/notifications/__tests__/batch-proposal-review.test.tsx +95 -0
- package/src/components/notifications/__tests__/notification-item.test.tsx +106 -0
- package/src/components/notifications/batch-proposal-review.tsx +20 -5
- package/src/components/notifications/inbox-list.tsx +11 -2
- package/src/components/notifications/notification-item.tsx +56 -2
- package/src/components/notifications/pending-approval-host.tsx +56 -37
- package/src/components/schedules/schedule-create-sheet.tsx +19 -1
- package/src/components/schedules/schedule-edit-sheet.tsx +20 -1
- package/src/components/schedules/schedule-form.tsx +31 -0
- package/src/components/settings/__tests__/providers-runtimes-section.test.tsx +149 -0
- package/src/components/settings/auth-method-selector.tsx +19 -4
- package/src/components/settings/auth-status-badge.tsx +28 -3
- package/src/components/settings/openai-chatgpt-auth-control.tsx +278 -0
- package/src/components/settings/openai-runtime-section.tsx +7 -1
- package/src/components/settings/providers-runtimes-section.tsx +138 -19
- package/src/components/shared/app-sidebar.tsx +4 -3
- package/src/components/shared/command-palette.tsx +4 -5
- package/src/components/shared/theme-toggle.tsx +5 -24
- package/src/components/shared/workspace-indicator.tsx +61 -2
- package/src/components/tables/__tests__/table-enrichment-sheet.test.tsx +130 -0
- package/src/components/tables/table-create-sheet.tsx +4 -0
- package/src/components/tables/table-enrichment-runs.tsx +103 -0
- package/src/components/tables/table-enrichment-sheet.tsx +538 -0
- package/src/components/tables/table-spreadsheet.tsx +29 -5
- package/src/components/tables/table-toolbar.tsx +10 -1
- package/src/components/tasks/kanban-board.tsx +1 -0
- package/src/components/tasks/kanban-column.tsx +53 -14
- package/src/components/tasks/task-bento-grid.tsx +19 -0
- package/src/components/tasks/task-card.tsx +26 -3
- package/src/components/tasks/task-chip-bar.tsx +24 -0
- package/src/components/tasks/task-result-renderer.tsx +1 -1
- package/src/components/workflows/delay-step-body.tsx +109 -0
- package/src/components/workflows/hooks/use-workflow-status.ts +50 -0
- package/src/components/workflows/loop-status-view.tsx +1 -1
- package/src/components/workflows/shared/step-result.tsx +78 -0
- package/src/components/workflows/shared/workflow-header.tsx +141 -0
- package/src/components/workflows/shared/workflow-loading-skeleton.tsx +36 -0
- package/src/components/workflows/swarm-dashboard.tsx +2 -15
- package/src/components/workflows/views/loop-pattern-view.tsx +137 -0
- package/src/components/workflows/views/sequence-pattern-view.tsx +511 -0
- package/src/components/workflows/workflow-form-view.tsx +133 -16
- package/src/components/workflows/workflow-status-view.tsx +30 -740
- package/src/instrumentation-node.ts +94 -0
- package/src/instrumentation.ts +4 -48
- package/src/lib/agents/__tests__/claude-agent.test.ts +199 -0
- package/src/lib/agents/__tests__/execution-manager.test.ts +1 -27
- package/src/lib/agents/__tests__/failure-reason.test.ts +68 -0
- package/src/lib/agents/__tests__/learned-context.test.ts +0 -11
- package/src/lib/agents/__tests__/learning-session.test.ts +158 -0
- package/src/lib/agents/__tests__/pattern-extractor.test.ts +48 -0
- package/src/lib/agents/claude-agent.ts +155 -18
- package/src/lib/agents/execution-manager.ts +0 -35
- package/src/lib/agents/learned-context.ts +0 -12
- package/src/lib/agents/learning-session.ts +18 -5
- package/src/lib/agents/profiles/__tests__/registry.test.ts +6 -4
- package/src/lib/agents/profiles/builtins/upgrade-assistant/SKILL.md +70 -0
- package/src/lib/agents/profiles/builtins/upgrade-assistant/profile.yaml +32 -0
- package/src/lib/agents/runtime/__tests__/openai-codex-auth.test.ts +118 -0
- package/src/lib/agents/runtime/codex-app-server-client.ts +11 -5
- package/src/lib/agents/runtime/openai-codex-auth.ts +389 -0
- package/src/lib/agents/runtime/openai-codex.ts +29 -60
- package/src/lib/agents/runtime/types.ts +8 -0
- package/src/lib/book/chapter-mapping.ts +11 -0
- package/src/lib/book/content.ts +10 -0
- package/src/lib/chat/__tests__/active-streams.test.ts +49 -0
- package/src/lib/chat/__tests__/finalize-safety-net.test.ts +139 -0
- package/src/lib/chat/__tests__/reconcile.test.ts +137 -0
- package/src/lib/chat/__tests__/stream-telemetry.test.ts +151 -0
- package/src/lib/chat/active-streams.ts +27 -0
- package/src/lib/chat/codex-engine.ts +16 -17
- package/src/lib/chat/context-builder.ts +5 -3
- package/src/lib/chat/engine.ts +50 -3
- package/src/lib/chat/reconcile.ts +117 -0
- package/src/lib/chat/stagent-tools.ts +1 -0
- package/src/lib/chat/stream-telemetry.ts +132 -0
- package/src/lib/chat/suggested-prompts.ts +28 -1
- package/src/lib/chat/system-prompt.ts +26 -1
- package/src/lib/chat/tool-catalog.ts +2 -1
- package/src/lib/chat/tools/__tests__/enrich-table-tool.test.ts +127 -0
- package/src/lib/chat/tools/__tests__/schedule-tools.test.ts +261 -0
- package/src/lib/chat/tools/__tests__/task-tools.test.ts +352 -0
- package/src/lib/chat/tools/__tests__/workflow-tools-dedup.test.ts +217 -0
- package/src/lib/chat/tools/document-tools.ts +29 -13
- package/src/lib/chat/tools/helpers.ts +39 -0
- package/src/lib/chat/tools/notification-tools.ts +9 -5
- package/src/lib/chat/tools/project-tools.ts +33 -0
- package/src/lib/chat/tools/schedule-tools.ts +44 -11
- package/src/lib/chat/tools/table-tools.ts +71 -0
- package/src/lib/chat/tools/task-tools.ts +84 -20
- package/src/lib/chat/tools/workflow-tools.ts +234 -32
- package/src/lib/constants/settings.ts +8 -18
- package/src/lib/data/__tests__/clear.test.ts +56 -2
- package/src/lib/data/clear.ts +20 -15
- package/src/lib/data/delete-project.ts +171 -0
- package/src/lib/db/__tests__/bootstrap.test.ts +1 -1
- package/src/lib/db/bootstrap.ts +45 -16
- package/src/lib/db/index.ts +5 -0
- package/src/lib/db/migrations/0009_add_app_instances.sql +25 -0
- package/src/lib/db/migrations/0024_add_workflow_resume_at.sql +10 -0
- package/src/lib/db/migrations/0025_drop_app_instances.sql +3 -0
- package/src/lib/db/migrations/0026_drop_license.sql +3 -0
- package/src/lib/db/migrations/meta/_journal.json +21 -0
- package/src/lib/db/schema.ts +68 -23
- package/src/lib/environment/workspace-context.ts +13 -1
- package/src/lib/import/dedup.ts +4 -54
- package/src/lib/instance/__tests__/bootstrap.test.ts +362 -0
- package/src/lib/instance/__tests__/detect.test.ts +115 -0
- package/src/lib/instance/__tests__/fingerprint.test.ts +48 -0
- package/src/lib/instance/__tests__/git-ops.test.ts +95 -0
- package/src/lib/instance/__tests__/settings.test.ts +83 -0
- package/src/lib/instance/__tests__/upgrade-poller.test.ts +131 -0
- package/src/lib/instance/bootstrap.ts +270 -0
- package/src/lib/instance/detect.ts +49 -0
- package/src/lib/instance/fingerprint.ts +78 -0
- package/src/lib/instance/git-ops.ts +95 -0
- package/src/lib/instance/settings.ts +61 -0
- package/src/lib/instance/types.ts +77 -0
- package/src/lib/instance/upgrade-poller.ts +153 -0
- package/src/lib/notifications/__tests__/visibility.test.ts +51 -0
- package/src/lib/notifications/visibility.ts +33 -0
- package/src/lib/schedules/__tests__/collision-check.test.ts +93 -0
- package/src/lib/schedules/__tests__/config.test.ts +62 -0
- package/src/lib/schedules/__tests__/firing-metrics.test.ts +99 -0
- package/src/lib/schedules/__tests__/integration.test.ts +82 -0
- package/src/lib/schedules/__tests__/slot-claim.test.ts +242 -0
- package/src/lib/schedules/__tests__/tick-scheduler.test.ts +102 -0
- package/src/lib/schedules/__tests__/turn-budget.test.ts +228 -0
- package/src/lib/schedules/collision-check.ts +105 -0
- package/src/lib/schedules/config.ts +53 -0
- package/src/lib/schedules/scheduler.ts +232 -13
- package/src/lib/schedules/slot-claim.ts +105 -0
- package/src/lib/settings/__tests__/openai-auth.test.ts +101 -0
- package/src/lib/settings/__tests__/openai-login-manager.test.ts +64 -0
- package/src/lib/settings/__tests__/runtime-setup.test.ts +33 -0
- package/src/lib/settings/openai-auth.ts +105 -10
- package/src/lib/settings/openai-login-manager.ts +260 -0
- package/src/lib/settings/runtime-setup.ts +14 -4
- package/src/lib/tables/__tests__/enrichment-planner.test.ts +124 -0
- package/src/lib/tables/__tests__/enrichment.test.ts +147 -0
- package/src/lib/tables/enrichment-planner.ts +454 -0
- package/src/lib/tables/enrichment.ts +328 -0
- package/src/lib/tables/query-builder.ts +5 -2
- package/src/lib/tables/trigger-evaluator.ts +3 -2
- package/src/lib/theme.ts +71 -0
- package/src/lib/usage/ledger.ts +2 -18
- package/src/lib/util/__tests__/similarity.test.ts +106 -0
- package/src/lib/util/similarity.ts +77 -0
- package/src/lib/utils/format-timestamp.ts +24 -0
- package/src/lib/utils/stagent-paths.ts +12 -0
- package/src/lib/validators/__tests__/blueprint.test.ts +172 -0
- package/src/lib/validators/__tests__/settings.test.ts +10 -0
- package/src/lib/validators/blueprint.ts +70 -9
- package/src/lib/validators/profile.ts +2 -2
- package/src/lib/validators/settings.ts +3 -1
- package/src/lib/workflows/__tests__/delay.test.ts +196 -0
- package/src/lib/workflows/__tests__/engine.test.ts +8 -0
- package/src/lib/workflows/__tests__/loop-executor.test.ts +54 -0
- package/src/lib/workflows/__tests__/post-action.test.ts +108 -0
- package/src/lib/workflows/blueprints/instantiator.ts +22 -1
- package/src/lib/workflows/blueprints/types.ts +10 -2
- package/src/lib/workflows/delay.ts +106 -0
- package/src/lib/workflows/engine.ts +207 -4
- package/src/lib/workflows/loop-executor.ts +349 -24
- package/src/lib/workflows/post-action.ts +91 -0
- package/src/lib/workflows/types.ts +166 -1
- package/src/app/api/license/checkout/route.ts +0 -28
- package/src/app/api/license/portal/route.ts +0 -26
- package/src/app/api/license/route.ts +0 -89
- package/src/app/api/license/usage/route.ts +0 -63
- package/src/app/api/marketplace/browse/route.ts +0 -15
- package/src/app/api/marketplace/import/route.ts +0 -28
- package/src/app/api/marketplace/publish/route.ts +0 -40
- package/src/app/api/onboarding/email/route.ts +0 -53
- package/src/app/api/settings/telemetry/route.ts +0 -14
- package/src/app/api/sync/export/route.ts +0 -54
- package/src/app/api/sync/restore/route.ts +0 -37
- package/src/app/api/sync/sessions/route.ts +0 -24
- package/src/app/auth/callback/route.ts +0 -73
- package/src/app/marketplace/page.tsx +0 -19
- package/src/components/analytics/analytics-gate-card.tsx +0 -101
- package/src/components/marketplace/blueprint-card.tsx +0 -61
- package/src/components/marketplace/marketplace-browser.tsx +0 -131
- package/src/components/onboarding/email-capture-card.tsx +0 -104
- package/src/components/settings/activation-form.tsx +0 -95
- package/src/components/settings/cloud-account-section.tsx +0 -147
- package/src/components/settings/cloud-sync-section.tsx +0 -155
- package/src/components/settings/subscription-section.tsx +0 -410
- package/src/components/settings/telemetry-section.tsx +0 -80
- package/src/components/shared/premium-gate-overlay.tsx +0 -50
- package/src/components/shared/schedule-gate-dialog.tsx +0 -64
- package/src/components/shared/upgrade-banner.tsx +0 -112
- package/src/hooks/use-supabase-auth.ts +0 -79
- package/src/lib/billing/email.ts +0 -54
- package/src/lib/billing/products.ts +0 -80
- package/src/lib/billing/stripe.ts +0 -101
- package/src/lib/cloud/supabase-browser.ts +0 -32
- package/src/lib/cloud/supabase-client.ts +0 -56
- package/src/lib/license/__tests__/features.test.ts +0 -56
- package/src/lib/license/__tests__/key-format.test.ts +0 -88
- package/src/lib/license/__tests__/manager.test.ts +0 -64
- package/src/lib/license/__tests__/tier-limits.test.ts +0 -79
- package/src/lib/license/cloud-validation.ts +0 -60
- package/src/lib/license/features.ts +0 -44
- package/src/lib/license/key-format.ts +0 -101
- package/src/lib/license/limit-check.ts +0 -111
- package/src/lib/license/limit-queries.ts +0 -51
- package/src/lib/license/manager.ts +0 -345
- package/src/lib/license/notifications.ts +0 -59
- package/src/lib/license/tier-limits.ts +0 -71
- package/src/lib/marketplace/marketplace-client.ts +0 -107
- package/src/lib/sync/cloud-sync.ts +0 -235
- package/src/lib/telemetry/conversion-events.ts +0 -71
- package/src/lib/telemetry/queue.ts +0 -122
- package/src/lib/validators/license.ts +0 -33
|
@@ -0,0 +1,739 @@
|
|
|
1
|
+
# Task Runtime Stagent MCP Injection — Implementation Plan
|
|
2
|
+
|
|
3
|
+
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
|
4
|
+
|
|
5
|
+
**Goal:** Wire the in-process stagent MCP server into `executeClaudeTask` and `resumeClaudeTask` so scheduled and manual tasks running under the `claude-code` runtime have reliable access to `mcp__stagent__*` tools.
|
|
6
|
+
|
|
7
|
+
**Architecture:** Small, surgical edit in one file (`src/lib/agents/claude-agent.ts`). Both execution entry points already build a `mergedMcpServers` object and already conditionally pass `allowedTools`. We (1) call the existing `createToolServer(task.projectId)` factory and merge the result of its `.asMcpServer()` into `mergedMcpServers` under the `stagent` key, and (2) prepend `"mcp__stagent__*"` to `allowedTools` only when the profile already provided one, so profiles relying on the `claude_code` preset's default tool surface are not accidentally restricted. Permission gating is unchanged: the existing `handleToolPermission` + per-profile `canUseToolPolicy` model is the correct design for task execution and does not need to be ported from the chat engine's inline switch.
|
|
8
|
+
|
|
9
|
+
**Tech Stack:** TypeScript, `@anthropic-ai/claude-agent-sdk`, Vitest (hoisted mocks), better-sqlite3 via Drizzle (untouched here).
|
|
10
|
+
|
|
11
|
+
**Spec:** `features/task-runtime-stagent-mcp-injection.md`
|
|
12
|
+
**Source handoff:** `handoff/bug-task-execution-missing-stagent-mcp.md`
|
|
13
|
+
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
## What already exists (no new code needed)
|
|
17
|
+
|
|
18
|
+
| Asset | Path | Why we reuse it |
|
|
19
|
+
|---|---|---|
|
|
20
|
+
| `createToolServer(projectId, onToolResult?)` factory | `src/lib/chat/stagent-tools.ts:70-113` | Returns `{ asMcpServer, forProvider, definitions }`. Call `.asMcpServer()` to get the SDK-compatible server object for the `claude-code` runtime path. **Note:** `createStagentMcpServer` is a deprecated wrapper for chat-engine back-compat — new code should call `createToolServer().asMcpServer()` directly (see the `@deprecated` JSDoc at line 125). |
|
|
21
|
+
| `mergedMcpServers` merge pattern | `src/lib/agents/claude-agent.ts:487-493` (execute) and `:606-612` (resume) | Already merges profile + browser + external MCP servers. We add `stagent:` as the last key so no earlier entry can shadow it. |
|
|
22
|
+
| Conditional `allowedTools` pattern | `src/lib/agents/claude-agent.ts:511` and `:631` | Already omits `allowedTools` when the profile has none, preserving preset defaults. We extend this pattern: when present, prepend `"mcp__stagent__*"`; when absent, still omit. |
|
|
23
|
+
| `handleToolPermission` + `ctx.canUseToolPolicy` | `src/lib/agents/claude-agent.ts:516-521` and `:635-641`; `src/lib/agents/tool-permissions.ts:115` | Per-profile `autoApprove`/`autoDeny` + saved user patterns + notification-based approval. Already correctly gates stagent tools by default — any stagent tool not explicitly auto-approved by a profile creates an approval notification. **Do not change.** |
|
|
24
|
+
| Test harness with hoisted mocks | `src/lib/agents/__tests__/claude-agent.test.ts` | `vi.mocked(query)` captures call args on the first call. `createMockStream()` helper yields fake SDK frames. `makeTask()` helper produces task rows. `mockQuery.mock.calls[0][0].options` is the assertion surface. |
|
|
25
|
+
|
|
26
|
+
## NOT in scope
|
|
27
|
+
|
|
28
|
+
| Excluded item | Why |
|
|
29
|
+
|---|---|
|
|
30
|
+
| Lifting `PERMISSION_GATED_TOOLS` out of `src/lib/chat/engine.ts` into a shared constant | The task path already has a stronger per-profile `canUseToolPolicy` model. The chat engine's inline deny-list is a chat-specific shortcut, and porting it would conflict with the existing architecture. See the third bullet of the spec's Technical Approach. |
|
|
31
|
+
| Refactoring the stagent tool registry (`createToolServer` / `asMcpServer`) | Factory is already correctly structured and already reused by `openai-direct` / `anthropic-direct`. |
|
|
32
|
+
| Adding wildcard support to `canUseToolPolicy.autoApprove` | Profiles currently list exact tool names. Wildcard support is a separate feature — file as follow-up if any profile needs to auto-approve "all stagent read tools". |
|
|
33
|
+
| Rewiring `openai-direct` / `anthropic-direct` runtimes | They already inject stagent tools via `createToolServer` (see `src/lib/agents/runtime/openai-direct.ts:19`, `anthropic-direct.ts:18`). |
|
|
34
|
+
| Chat engine changes | The chat engine's injection already works. Do not touch `src/lib/chat/engine.ts`. |
|
|
35
|
+
| Backfill of historical tasks | This bug is about runtime wiring, not data. |
|
|
36
|
+
| End-to-end smoke test against a real DB | Vitest mocks are sufficient for the wiring assertion. A real-DB smoke run is step 4 of the verification section, not a coded test. |
|
|
37
|
+
|
|
38
|
+
## Error & Rescue Registry
|
|
39
|
+
|
|
40
|
+
| Failure mode | Detection | Recovery |
|
|
41
|
+
|---|---|---|
|
|
42
|
+
| Profile has `allowedTools: []` (empty array, truthy, falls into the "has allowlist" branch) | Test A (below) covers this — the empty-array case should still prepend `mcp__stagent__*`, producing `["mcp__stagent__*"]`. | Intentional: an empty allowlist plus stagent prepended gives the agent access to stagent tools only. This is the safest interpretation of a profile that explicitly opted out of all other tools. |
|
|
43
|
+
| Profile has `allowedTools: undefined` / not set | Conditional spread omits `allowedTools` entirely. SDK falls back to `claude_code` preset defaults. Stagent tools are still reachable because they are registered via `mcpServers.stagent`. | Intentional: do not pass `allowedTools` unless the profile set one. No code change needed beyond what the current conditional already does. |
|
|
44
|
+
| `createToolServer(...).asMcpServer()` throws (tool registry init failure) | The `try { ... } catch` block at `claude-agent.ts:478-548` already catches and calls `handleExecutionError`, which persists `status: "failed"` with `failureReason`. No new handling needed. | Existing `handleExecutionError` path. Task is marked failed with the thrown error message. |
|
|
45
|
+
| Duplicate `stagent` key collision (profile defines its own `stagent` MCP server) | With spread order `{ stagent: stagentServer, ...profileMcpServers }`, the profile wins and overwrites ours. | **Problem:** a malicious or misconfigured profile could shadow our stagent server. **Mitigation:** reverse the spread order to `{ ...profileMcpServers, stagent: stagentServer }` so stagent always wins. Codified in Task 1, Step 5. |
|
|
46
|
+
| `allowedTools` contains a literal `"mcp__stagent__*"` already (profile pre-declared it) | Test D covers: when the profile already lists `"mcp__stagent__*"`, don't duplicate it. | Use `profileAllowedTools.includes("mcp__stagent__*") ? profileAllowedTools : ["mcp__stagent__*", ...profileAllowedTools]`. Simpler alternative: deduplicate via `Array.from(new Set(...))`. Task 1 uses the `Set` form — cleaner and handles overlaps from browser/external patterns too. |
|
|
47
|
+
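The `allowedTools` rows of the table above can be sketched as one small function. This is a minimal illustration, not code from the repository: `mergeAllowedTools` and `STAGENT_TOOL_PATTERN` are hypothetical names, and the sketch assumes the profile's allowlist is a plain `string[]` or `undefined`.

```typescript
// Hypothetical helper (not part of the codebase): illustrates the merge rule
// from the Error & Rescue Registry rows about allowedTools.
const STAGENT_TOOL_PATTERN = "mcp__stagent__*";

function mergeAllowedTools(
  profileAllowedTools: string[] | undefined
): string[] | undefined {
  // No allowlist on the profile: return undefined so the caller can omit the
  // option entirely and leave the preset defaults in effect.
  if (!profileAllowedTools) return undefined;

  // A Set keeps first-insertion order, so the stagent pattern stays first and
  // a second copy supplied by the profile is dropped.
  return Array.from(new Set([STAGENT_TOOL_PATTERN, ...profileAllowedTools]));
}
```

Note that an empty array is truthy in JavaScript, so `mergeAllowedTools([])` yields a one-element list rather than `undefined`. The `Set` form also moves a pre-declared pattern to the front, whereas the `includes` form would keep the profile's original order.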
|
|
48
|
+
## File Structure
|
|
49
|
+
|
|
50
|
+
**Modified:**
|
|
51
|
+
- `src/lib/agents/claude-agent.ts` — 2 edit sites: `executeClaudeTask` MCP merge (~line 487-514) and `resumeClaudeTask` MCP merge (~line 606-634). One new import.
|
|
52
|
+
- `src/lib/agents/__tests__/claude-agent.test.ts` — add 4 new tests (2 per execution path), plus one new `vi.mock` block for `@/lib/chat/stagent-tools`.
|
|
53
|
+
|
|
54
|
+
**Created:** None.
|
|
55
|
+
|
|
56
|
+
**Unchanged (do not touch):** `src/lib/chat/engine.ts`, `src/lib/chat/stagent-tools.ts`, `src/lib/agents/tool-permissions.ts`, `src/lib/agents/profiles/**`, any runtime adapter under `src/lib/agents/runtime/`.
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## Task 1: Wire stagent injection into `executeClaudeTask`
|
|
61
|
+
|
|
62
|
+
**Files:**
|
|
63
|
+
- Modify: `src/lib/agents/claude-agent.ts` (imports + `executeClaudeTask` MCP merge at lines 487-514)
|
|
64
|
+
- Test: `src/lib/agents/__tests__/claude-agent.test.ts` (add `vi.mock` for stagent-tools + 3 new tests in Group A)
|
|
65
|
+
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
- [ ] **Step 1: Add the stagent-tools mock to the test file**
|
|
69
|
+
|
|
70
|
+
Open `src/lib/agents/__tests__/claude-agent.test.ts`. Locate the block of `vi.mock(...)` calls around lines 81-142 (after the hoisted mock declarations, before the static `import { executeClaudeTask, resumeClaudeTask } from "../claude-agent"` at line 147). Add this mock at the end of the block, just before the static imports:
|
|
71
|
+
|
|
72
|
+
```ts
|
|
73
|
+
vi.mock("@/lib/chat/stagent-tools", () => ({
|
|
74
|
+
createToolServer: vi.fn((_projectId?: string | null) => ({
|
|
75
|
+
asMcpServer: () => ({ __mockStagentServer: true }),
|
|
76
|
+
})),
|
|
77
|
+
}));
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
This returns a sentinel object whose identity we can assert on in later steps. Mocking `createToolServer` (not the deprecated `createStagentMcpServer` wrapper) matches the production import.
|
|
81
|
+
|
|
82
|
+
- [ ] **Step 2: Write the failing test — `executeClaudeTask` injects stagent into `mcpServers`**
|
|
83
|
+
|
|
84
|
+
In `src/lib/agents/__tests__/claude-agent.test.ts`, inside the `describe("executeClaudeTask", ...)` block (around line 215), add this test after the existing A1/A2 tests:
|
|
85
|
+
|
|
86
|
+
```ts
|
|
87
|
+
it("A-stagent-1: injects stagent MCP server into query mcpServers", async () => {
|
|
88
|
+
mockWhere.mockResolvedValueOnce([makeTask({ projectId: "proj-7" })]);
|
|
89
|
+
mockQuery.mockReturnValue(
|
|
90
|
+
createMockStream([
|
|
91
|
+
{ type: "result", result: "done" },
|
|
92
|
+
]) as unknown as ReturnType<typeof query>
|
|
93
|
+
);
|
|
94
|
+
|
|
95
|
+
await executeClaudeTask("task-1");
|
|
96
|
+
|
|
97
|
+
const queryCall = mockQuery.mock.calls[0][0] as {
|
|
98
|
+
options: { mcpServers?: Record<string, unknown> };
|
|
99
|
+
};
|
|
100
|
+
expect(queryCall.options.mcpServers).toBeDefined();
|
|
101
|
+
expect(queryCall.options.mcpServers!.stagent).toEqual({ __mockStagentServer: true });
|
|
102
|
+
});
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
- [ ] **Step 3: Run the test to verify it fails**
|
|
106
|
+
|
|
107
|
+
Run:
|
|
108
|
+
```bash
|
|
109
|
+
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent-1"
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
Expected: FAIL. Either `mcpServers` is undefined (current behavior when `mergedMcpServers` is empty due to the `Object.keys().length > 0` guard) or `mcpServers.stagent` is missing.
|
|
113
|
+
|
|
114
|
+
- [ ] **Step 4: Add the import to `claude-agent.ts`**
|
|
115
|
+
|
|
116
|
+
Open `src/lib/agents/claude-agent.ts`. Find the line importing from `browser-mcp`:
|
|
117
|
+
|
|
118
|
+
```ts
|
|
119
|
+
import { getBrowserMcpServers, getExternalMcpServers } from "./browser-mcp";
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
Immediately after this line, add:
|
|
123
|
+
|
|
124
|
+
```ts
|
|
125
|
+
import { createToolServer } from "@/lib/chat/stagent-tools";
|
|
126
|
+
```
|
|
127
|
+
|
|
128
|
+
(`createStagentMcpServer` is a `@deprecated` wrapper — new code uses `createToolServer(...).asMcpServer()`.)
|
|
129
|
+
|
|
130
|
+
- [ ] **Step 5: Inject stagent into `executeClaudeTask`'s MCP merge**
|
|
131
|
+
|
|
132
|
+
Still in `src/lib/agents/claude-agent.ts`, find this block around lines 487-493:
|
|
133
|
+
|
|
134
|
+
```ts
|
|
135
|
+
// Merge browser + external MCP servers when enabled globally
|
|
136
|
+
const [browserServers, externalServers] = await Promise.all([
|
|
137
|
+
getBrowserMcpServers(),
|
|
138
|
+
getExternalMcpServers(),
|
|
139
|
+
]);
|
|
140
|
+
const profileMcpServers = ctx.payload?.mcpServers ?? {};
|
|
141
|
+
const mergedMcpServers = { ...profileMcpServers, ...browserServers, ...externalServers };
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
Replace it with:
|
|
145
|
+
|
|
146
|
+
```ts
|
|
147
|
+
// Merge browser + external MCP servers when enabled globally
|
|
148
|
+
const [browserServers, externalServers] = await Promise.all([
|
|
149
|
+
getBrowserMcpServers(),
|
|
150
|
+
getExternalMcpServers(),
|
|
151
|
+
]);
|
|
152
|
+
// Inject the in-process stagent MCP server so scheduled and manual tasks
|
|
153
|
+
// have access to mcp__stagent__* tools (table CRUD, notifications, etc.).
|
|
154
|
+
// Spread profile/browser/external first, then stagent — ensures no profile
|
|
155
|
+
// can accidentally shadow our server under the `stagent` key.
|
|
156
|
+
const stagentServer = createToolServer(task.projectId).asMcpServer();
|
|
157
|
+
const profileMcpServers = ctx.payload?.mcpServers ?? {};
|
|
158
|
+
const mergedMcpServers = {
|
|
159
|
+
...profileMcpServers,
|
|
160
|
+
...browserServers,
|
|
161
|
+
...externalServers,
|
|
162
|
+
stagent: stagentServer,
|
|
163
|
+
};
|
|
164
|
+
```
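The reason the `stagent` key goes last is that, in an object literal, a later property overwrites an earlier one with the same name. The standalone sketch below demonstrates that behaviour with placeholder values; `mergeMcpServers`, `McpServerMap`, and the sample server objects are illustrative assumptions, not names from `claude-agent.ts`.

```typescript
// Illustrative only: plain objects stand in for real MCP server configs.
type McpServerMap = Record<string, unknown>;

function mergeMcpServers(
  profileServers: McpServerMap,
  browserServers: McpServerMap,
  externalServers: McpServerMap,
  stagentServer: unknown
): McpServerMap {
  return {
    ...profileServers,
    ...browserServers,
    ...externalServers,
    // Written last, so it replaces any `stagent` entry spread in above.
    stagent: stagentServer,
  };
}

const trustedStagent = { name: "in-process stagent" };
const merged = mergeMcpServers(
  { stagent: { name: "profile override" }, docs: { name: "docs server" } },
  {},
  {},
  trustedStagent
);
```

Here `merged.stagent` is the `trustedStagent` object, while the unrelated `docs` entry from the profile is preserved.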
|
|
165
|
+
|
|
166
|
+
- [ ] **Step 6: Run the test to verify it passes**
|
|
167
|
+
|
|
168
|
+
Run:
|
|
169
|
+
```bash
|
|
170
|
+
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent-1"
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
Expected: PASS. `mockQuery.mock.calls[0][0].options.mcpServers.stagent` equals `{ __mockStagentServer: true }`.
|
|
174
|
+
|
|
175
|
+
- [ ] **Step 7: Write the failing test — `executeClaudeTask` prepends `mcp__stagent__*` only when profile has an allowlist**
|
|
176
|
+
|
|
177
|
+
Back in the test file, the default `mockGetProfile` mock (lines 199-204) returns `{ allowedTools: undefined }`, so `ctx.payload?.allowedTools` will also be falsy by default. We need a test that sets up a profile with an explicit allowlist.
|
|
178
|
+
|
|
179
|
+
Add this test right after A-stagent-1:
|
|
180
|
+
|
|
181
|
+
```ts
|
|
182
|
+
it("A-stagent-2: prepends mcp__stagent__* when profile has allowedTools", async () => {
|
|
183
|
+
mockWhere.mockResolvedValueOnce([makeTask({ projectId: "proj-7" })]);
|
|
184
|
+
mockGetProfile.mockReturnValueOnce({
|
|
185
|
+
id: "restricted",
|
|
186
|
+
name: "Restricted",
|
|
187
|
+
systemPrompt: "",
|
|
188
|
+
allowedTools: ["Read", "Grep"],
|
|
189
|
+
});
|
|
190
|
+
mockQuery.mockReturnValue(
|
|
191
|
+
createMockStream([
|
|
192
|
+
{ type: "result", result: "done" },
|
|
193
|
+
]) as unknown as ReturnType<typeof query>
|
|
194
|
+
);
|
|
195
|
+
|
|
196
|
+
await executeClaudeTask("task-1");
|
|
197
|
+
|
|
198
|
+
const queryCall = mockQuery.mock.calls[0][0] as {
|
|
199
|
+
options: { allowedTools?: string[] };
|
|
200
|
+
};
|
|
201
|
+
expect(queryCall.options.allowedTools).toBeDefined();
|
|
202
|
+
expect(queryCall.options.allowedTools).toContain("mcp__stagent__*");
|
|
203
|
+
expect(queryCall.options.allowedTools).toContain("Read");
|
|
204
|
+
expect(queryCall.options.allowedTools).toContain("Grep");
|
|
205
|
+
// The stagent pattern must appear exactly once (no duplicate entries)
|
|
206
|
+
const stagentCount = queryCall.options.allowedTools!.filter(
|
|
207
|
+
(t) => t === "mcp__stagent__*"
|
|
208
|
+
).length;
|
|
209
|
+
expect(stagentCount).toBe(1);
|
|
210
|
+
});
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
- [ ] **Step 8: Write the failing test (`executeClaudeTask` omits `allowedTools` entirely when profile has none)**

Add this test right after A-stagent-2:

```ts
it("A-stagent-3: omits allowedTools when profile has none (preset defaults preserved)", async () => {
  mockWhere.mockResolvedValueOnce([makeTask({ projectId: "proj-7" })]);
  // Default mockGetProfile returns allowedTools: undefined, so ctx.payload.allowedTools
  // will also be undefined, and the query() call should NOT include an allowedTools option.
  mockQuery.mockReturnValue(
    createMockStream([
      { type: "result", result: "done" },
    ]) as unknown as ReturnType<typeof query>
  );

  await executeClaudeTask("task-1");

  const queryCall = mockQuery.mock.calls[0][0] as {
    options: { allowedTools?: string[] };
  };
  expect(queryCall.options.allowedTools).toBeUndefined();
});
```

- [ ] **Step 9: Run the new tests to verify they fail**

Run:
```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent-2"
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent-3"
```

Expected: A-stagent-2 FAILS (current code passes profile's `allowedTools` as-is without prepending). A-stagent-3 PASSES already (current conditional already omits when falsy); that's fine, it's a regression test.

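The conditional-spread behavior that A-stagent-3 relies on can be sketched in isolation. This is a minimal illustration, not the project's code: `SketchOptions`, `buildSketchOptions`, and the `model` placeholder are hypothetical names introduced here. It also surfaces one edge case worth keeping in mind: an empty array is truthy, so an empty allowlist still produces the key.

```typescript
// Hypothetical stand-in for the options object handed to query().
interface SketchOptions {
  model: string;
  allowedTools?: string[];
}

// Mirrors the conditional-spread idiom: the allowedTools key is only
// added when the profile supplied an allowlist.
function buildSketchOptions(profileAllowedTools?: string[]): SketchOptions {
  return {
    model: "sketch-model", // placeholder value, not taken from the source
    ...(profileAllowedTools && { allowedTools: profileAllowedTools }),
  };
}

const withoutAllowlist = buildSketchOptions(undefined);
const withAllowlist = buildSketchOptions(["Read", "Grep"]);
const withEmptyAllowlist = buildSketchOptions([]);

// Spreading `undefined` adds nothing, so the key is absent entirely.
console.log("allowedTools" in withoutAllowlist); // false
// A non-empty allowlist is passed through unchanged.
console.log(withAllowlist.allowedTools);
// An empty array is truthy, so the key is present with an empty list.
console.log(withEmptyAllowlist.allowedTools);
```

Because the key is absent rather than set to `undefined`, an assertion such as `toBeUndefined()` passes in the no-allowlist case either way; checking with `in` is the stricter form if that distinction ever matters.
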
- [ ] **Step 10: Implement the `allowedTools` merge in `executeClaudeTask`**

In `src/lib/agents/claude-agent.ts`, find this line around 511 (inside the `query({ ... options: { ... } })` call):

```ts
...(ctx.payload?.allowedTools && { allowedTools: ctx.payload.allowedTools }),
```

Replace it with:

```ts
// When the profile set an explicit allowedTools, prepend mcp__stagent__*
// so the stagent tool registration is not filtered out. When the profile
// has no allowedTools, fall through to the preset defaults (stagent tools
// are still reachable because they're registered via mcpServers.stagent).
...(ctx.payload?.allowedTools && {
  allowedTools: Array.from(
    new Set(["mcp__stagent__*", ...ctx.payload.allowedTools])
  ),
}),
```

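The merge above leans on two properties of `Set`: duplicates are dropped, and iteration follows insertion order, so the pattern listed first stays first. A minimal sketch of just that behavior, assuming a hypothetical helper name `prependStagentPattern` that does not appear in the source:

```typescript
const STAGENT_PATTERN = "mcp__stagent__*";

// Hypothetical helper: puts the stagent pattern first and drops any later
// duplicate. Set iteration follows insertion order, so the first occurrence
// of each entry determines its position.
function prependStagentPattern(profileAllowedTools: string[]): string[] {
  return Array.from(new Set([STAGENT_PATTERN, ...profileAllowedTools]));
}

// Profile that does not mention the pattern: it is added at index 0.
const plain = prependStagentPattern(["Read", "Grep"]);

// Profile that already lists the pattern mid-array: the copy at index 0
// wins and the later duplicate is removed.
const alreadyListed = prependStagentPattern(["Read", "mcp__stagent__*", "Grep"]);

console.log(plain);
console.log(alreadyListed);
```

Both calls yield the same three entries with `mcp__stagent__*` at index 0, which is the property A-stagent-2's count assertion checks.
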
- [ ] **Step 11: Run the new tests to verify they pass**

Run:
```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent"
```

Expected: all three (A-stagent-1, A-stagent-2, A-stagent-3) PASS.

- [ ] **Step 12: Run the full `claude-agent.test.ts` file to check for regressions**

Run:
```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts
```

Expected: all tests PASS (existing A1/A2/B/C/D groups and the new A-stagent tests).

If any previously-passing test now fails, diagnose before proceeding. Most likely failure: a Group A test that previously asserted `mcpServers` was absent because the merge was empty. The new code always merges `stagent`, so `mcpServers` is now always present in the call args. Adjust the existing assertion to be `expect(queryCall.options.mcpServers).toBeDefined()` or inspect specific keys.

- [ ] **Step 13: Commit**

```bash
git add src/lib/agents/claude-agent.ts src/lib/agents/__tests__/claude-agent.test.ts
git commit -m "$(cat <<'EOF'
fix(agents): inject stagent MCP into executeClaudeTask

The claude-code runtime's executeClaudeTask was missing the in-process
stagent MCP server that chat engine, openai-direct, and anthropic-direct
all inject. Scheduled and manual tasks reported "No stagent table MCP
tools are available" when their prompts tried to read/write tables.

Adds createStagentMcpServer(task.projectId) to the mergedMcpServers
merge, and prepends mcp__stagent__* to allowedTools only when the
profile has an explicit allowlist (profiles without one continue to
use the claude_code preset defaults). The per-profile canUseToolPolicy
permission model is untouched; it already gates dangerous stagent
tools via handleToolPermission.

Tests A-stagent-1/2/3 cover the three branches. resumeClaudeTask
will receive the same treatment in the next commit.

Refs: features/task-runtime-stagent-mcp-injection.md
      handoff/bug-task-execution-missing-stagent-mcp.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Task 2: Extract shared helpers and mirror into `resumeClaudeTask`

**Why this task extracts helpers first:** The code review of Task 1 pointed out that the two execution paths (`executeClaudeTask` and `resumeClaudeTask`) have already drifted once, and that drift is literally the bug this feature fixes. Duplicating the stagent injection inline at two call sites recreates the drift vector. Extracting two tiny private helpers at the top of the file and calling them from both sites makes future drift structurally impossible. This was a judgment call deferred by Task 1 (YAGNI, two call sites) that Task 2 reverses now that we can see both call sites in context.

**Files:**
- Modify: `src/lib/agents/claude-agent.ts` (add two private helpers near the top of the file, below imports; refactor `executeClaudeTask` (lines 488-530) and `resumeClaudeTask` (lines 606-634) to call them).
- Test: `src/lib/agents/__tests__/claude-agent.test.ts` (add 2 new tests for `resumeClaudeTask`, and add one line to A-stagent-1 and R-stagent-1 asserting `createToolServer` was called with the task's `projectId`).

---

- [ ] **Step 1: Write the failing resume tests**

Open `src/lib/agents/__tests__/claude-agent.test.ts` and locate the `describe("resumeClaudeTask", ...)` block. If it does not exist, create it at the end of the file after Group D. Add these two tests:

```ts
it("R-stagent-1: injects stagent MCP server into query mcpServers on resume", async () => {
  mockWhere.mockResolvedValueOnce([
    makeTask({
      projectId: "proj-7",
      sessionId: "session-abc",
      resumeCount: 1,
    }),
  ]);
  mockQuery.mockReturnValue(
    createMockStream([
      { type: "result", result: "resumed and done" },
    ]) as unknown as ReturnType<typeof query>
  );

  await resumeClaudeTask("task-1");

  const queryCall = mockQuery.mock.calls[0][0] as {
    options: { mcpServers?: Record<string, unknown>; resume?: string };
  };
  expect(queryCall.options.resume).toBe("session-abc");
  expect(queryCall.options.mcpServers).toBeDefined();
  expect(queryCall.options.mcpServers!.stagent).toEqual({ __mockStagentServer: true });
  // Review note: assert the factory saw the task's projectId so a future
  // regression that passes undefined or the wrong id is caught.
  expect(vi.mocked(createToolServer)).toHaveBeenCalledWith("proj-7");
});

it("R-stagent-2: prepends mcp__stagent__* on resume when profile has allowedTools", async () => {
  mockWhere.mockResolvedValueOnce([
    makeTask({
      projectId: "proj-7",
      sessionId: "session-abc",
      resumeCount: 1,
    }),
  ]);
  mockGetProfile.mockReturnValueOnce({
    id: "restricted",
    name: "Restricted",
    systemPrompt: "",
    allowedTools: ["Read", "Grep"],
  });
  mockQuery.mockReturnValue(
    createMockStream([
      { type: "result", result: "resumed and done" },
    ]) as unknown as ReturnType<typeof query>
  );

  await resumeClaudeTask("task-1");

  const queryCall = mockQuery.mock.calls[0][0] as {
    options: { allowedTools?: string[] };
  };
  expect(queryCall.options.allowedTools).toContain("mcp__stagent__*");
  expect(queryCall.options.allowedTools).toContain("Read");
  // Assert ordering: stagent prepended, not appended. The comment in the
  // helper claims "prepend"; this test pins that contract.
  expect(queryCall.options.allowedTools![0]).toBe("mcp__stagent__*");
});
```

For the `createToolServer` assertion to work, the mocked factory must be importable in the test. Near the top-level static imports (after `import { query } from "@anthropic-ai/claude-agent-sdk"`), add:

```ts
import { createToolServer } from "@/lib/chat/stagent-tools";
```

This works because `vi.mock` is hoisted: the import picks up the mocked factory, not the real one.

- [ ] **Step 2: Add the same `toHaveBeenCalledWith` assertion to A-stagent-1**

While editing the test file, also add one line to the existing A-stagent-1 test (from Task 1) right before its closing brace:

```ts
expect(vi.mocked(createToolServer)).toHaveBeenCalledWith("proj-7");
```

This hardens Task 1's test against the same regression class.

- [ ] **Step 3: Run the new/updated tests to verify they fail**

```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "R-stagent"
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent-1"
```

Expected: R-stagent-1 and R-stagent-2 FAIL (the resume path has no stagent injection yet). A-stagent-1 should PASS regardless of mock reset behavior: if mocks are not reset between tests, `vi.mocked(createToolServer).mock.calls` still holds calls recorded by earlier tests; if they are reset, the test's own `await executeClaudeTask(...)` triggers the call before the assertion runs. If A-stagent-1 is red, the `vi.mocked(createToolServer)` access is broken; diagnose before proceeding.

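The mock-reset reasoning in Step 3 can be made concrete with a hand-rolled call recorder. This sketch is a simulation only: `CallRecorder` is a hypothetical class written for illustration and is not Vitest's mock API. It shows why a "was called with" check can be satisfied by a stale call from an earlier test when recorded calls are not cleared, and why a test that triggers the call itself passes either way.

```typescript
// Hypothetical stand-in for a mocked factory. It records every call and
// answers "was there any call with exactly these arguments?".
class CallRecorder {
  readonly calls: unknown[][] = [];

  invoke(...args: unknown[]): void {
    this.calls.push(args);
  }

  wasCalledWith(...expected: unknown[]): boolean {
    return this.calls.some(
      (call) =>
        call.length === expected.length &&
        call.every((arg, i) => arg === expected[i]),
    );
  }

  clear(): void {
    this.calls.length = 0;
  }
}

const factory = new CallRecorder();

// "Earlier test" triggers the factory.
factory.invoke("proj-7");

// "Later test" without a reset: the stale call satisfies the check even
// though this test has not invoked the factory itself.
const staleCheck = factory.wasCalledWith("proj-7");

// With a reset between tests, the check only passes after the test makes
// its own call.
factory.clear();
const checkAfterReset = factory.wasCalledWith("proj-7");
factory.invoke("proj-7");
const checkAfterOwnCall = factory.wasCalledWith("proj-7");

console.log({ staleCheck, checkAfterReset, checkAfterOwnCall });
```

If the suite does not reset mocks between tests, the stale-call case means the `toHaveBeenCalledWith` assertion is weaker than it looks; clearing the mock at the start of the test would tighten it.
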
- [ ] **Step 4: Add the two helpers to `claude-agent.ts`**

Open `src/lib/agents/claude-agent.ts`. Below the import block (after line ~45, before the first function definition), add:

```ts
// ─── Stagent MCP injection helpers ──────────────────────────────────────
//
// Shared by executeClaudeTask and resumeClaudeTask so the two runtime entry
// points cannot drift apart. The drift between chat engine injection and
// claude-code runtime injection is what produced the P0 bug this feature
// fixes. Do not duplicate these patterns inline.

/**
 * Merge the in-process stagent MCP server into a profile/browser/external
 * MCP server map. Stagent is spread LAST so no upstream source can shadow
 * the `stagent` key with its own server.
 */
function withStagentMcpServer(
  profileServers: Record<string, unknown>,
  browserServers: Record<string, unknown>,
  externalServers: Record<string, unknown>,
  projectId?: string | null,
): Record<string, unknown> {
  const stagentServer = createToolServer(projectId).asMcpServer();
  return {
    ...profileServers,
    ...browserServers,
    ...externalServers,
    stagent: stagentServer,
  };
}

/**
 * Prepend `mcp__stagent__*` to a profile's explicit allowedTools so the
 * stagent tool registration survives the SDK preset filter. Returns
 * `undefined` when the profile has no allowedTools; callers should spread
 * the result conditionally so the SDK falls through to preset defaults in
 * that case.
 */
function withStagentAllowedTools(
  profileAllowedTools: string[] | undefined,
): string[] | undefined {
  if (!profileAllowedTools) return undefined;
  return Array.from(new Set(["mcp__stagent__*", ...profileAllowedTools]));
}
```

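The "spread LAST" guarantee documented on `withStagentMcpServer` comes down to object-spread semantics: when the same key appears more than once, the last assignment wins. A minimal sketch under stated assumptions: `mergeWithStagentLast`, `ServerMap`, and the sample server objects are hypothetical, and the real helper obtains its server from `createToolServer` rather than taking it as a parameter.

```typescript
type ServerMap = Record<string, unknown>;

// Hypothetical variant of the merge that takes the trusted server as an
// argument so the sketch has no external dependencies.
function mergeWithStagentLast(
  profile: ServerMap,
  browser: ServerMap,
  external: ServerMap,
  stagentServer: unknown,
): ServerMap {
  // Later properties overwrite earlier ones with the same key, so the
  // explicit `stagent` entry replaces anything an upstream map supplied.
  return { ...profile, ...browser, ...external, stagent: stagentServer };
}

const trustedServer = { name: "in-process stagent" }; // illustrative value

const merged = mergeWithStagentLast(
  // A profile that (accidentally or not) claims the `stagent` key.
  { stagent: { name: "profile-supplied server" }, docs: { name: "docs" } },
  { browser: { name: "browser" } },
  {},
  trustedServer,
);

console.log(merged.stagent === trustedServer); // true
console.log(Object.keys(merged).length); // 3
```

Had `stagent` been spread first instead, the profile's entry would have replaced the trusted server, which is the shadowing scenario the helper's doc comment rules out.
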
- [ ] **Step 5: Refactor `executeClaudeTask` to call the helpers**

Still in `src/lib/agents/claude-agent.ts`, find the MCP merge block in `executeClaudeTask` (lines ~488-504) that currently reads:

```ts
// Merge browser + external MCP servers when enabled globally
const [browserServers, externalServers] = await Promise.all([
  getBrowserMcpServers(),
  getExternalMcpServers(),
]);
// Inject the in-process stagent MCP server so scheduled and manual tasks
// have access to mcp__stagent__* tools (table CRUD, notifications, etc.).
// Spread profile/browser/external first, then stagent — ensures no profile
// can accidentally shadow our server under the `stagent` key.
const stagentServer = createToolServer(task.projectId).asMcpServer();
const profileMcpServers = ctx.payload?.mcpServers ?? {};
const mergedMcpServers = {
  ...profileMcpServers,
  ...browserServers,
  ...externalServers,
  stagent: stagentServer,
};
```

Replace it with:

```ts
// Merge browser + external MCP servers, then inject the in-process
// stagent server via the shared helper (see withStagentMcpServer above).
const [browserServers, externalServers] = await Promise.all([
  getBrowserMcpServers(),
  getExternalMcpServers(),
]);
const mergedMcpServers = withStagentMcpServer(
  ctx.payload?.mcpServers ?? {},
  browserServers,
  externalServers,
  task.projectId,
);
```

Then find the `allowedTools` spread in the same function's `query({ options: { ... } })` call (currently around lines 522-530):

```ts
// When the profile set an explicit allowedTools, prepend mcp__stagent__*
// so the stagent tool registration is not filtered out. When the profile
// has no allowedTools, fall through to the preset defaults (stagent tools
// are still reachable because they're registered via mcpServers.stagent).
...(ctx.payload?.allowedTools && {
  allowedTools: Array.from(
    new Set(["mcp__stagent__*", ...ctx.payload.allowedTools])
  ),
}),
```

Replace it with:

```ts
// allowedTools prepended via shared helper (see withStagentAllowedTools).
// Returns undefined when the profile has no allowlist so the SDK falls
// through to claude_code preset defaults.
...(withStagentAllowedTools(ctx.payload?.allowedTools) && {
  allowedTools: withStagentAllowedTools(ctx.payload?.allowedTools)!,
}),
```

**Note on the spread pattern:** calling the helper twice is a micro-cost we accept to preserve the existing conditional-spread idiom used throughout this file. If you prefer, extract a local `const allowed = withStagentAllowedTools(ctx.payload?.allowedTools);` one line above the `query({ ... })` call and reference it twice; use whichever reads more cleanly at the call site. The two forms are functionally identical.

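The local-`const` alternative mentioned in the note can be sketched as follows. The names `withStagentAllowedToolsSketch`, `SketchQueryOptions`, `buildQueryOptionsSketch`, and the `maxTurns` placeholder are assumptions made for a self-contained example; only the helper's body mirrors the plan. One small benefit of this form is that TypeScript narrows the `const`, so the non-null assertion (`!`) is no longer needed.

```typescript
// Copy of the helper's logic under a hypothetical name so the sketch runs
// on its own.
function withStagentAllowedToolsSketch(
  profileAllowedTools: string[] | undefined,
): string[] | undefined {
  if (!profileAllowedTools) return undefined;
  return Array.from(new Set(["mcp__stagent__*", ...profileAllowedTools]));
}

// Hypothetical options shape; the real query() options have more fields.
interface SketchQueryOptions {
  maxTurns: number;
  allowedTools?: string[];
}

function buildQueryOptionsSketch(
  profileAllowedTools: string[] | undefined,
): SketchQueryOptions {
  // Compute once, reference twice. Inside the `&&` right-hand side the
  // const is narrowed to string[], so no `!` assertion is required.
  const allowed = withStagentAllowedToolsSketch(profileAllowedTools);
  return {
    maxTurns: 10, // placeholder value, not taken from the source
    ...(allowed && { allowedTools: allowed }),
  };
}

const restricted = buildQueryOptionsSketch(["Read"]);
const unrestricted = buildQueryOptionsSketch(undefined);

console.log(restricted.allowedTools);
console.log("allowedTools" in unrestricted); // false
```
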
- [ ] **Step 6: Run all Task 1 tests to confirm no regression from the refactor**

```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "A-stagent"
```

Expected: all three A-stagent tests PASS. The refactor is behavior-preserving.

- [ ] **Step 7: Apply the helpers in `resumeClaudeTask`**

In `resumeClaudeTask` (around lines 606-634), find the same two blocks as `executeClaudeTask` had:

```ts
// Merge browser + external MCP servers when enabled globally
const [browserServers, externalServers] = await Promise.all([
  getBrowserMcpServers(),
  getExternalMcpServers(),
]);
const profileMcpServers = ctx.payload?.mcpServers ?? {};
const mergedMcpServers = { ...profileMcpServers, ...browserServers, ...externalServers };
```

Replace with:

```ts
// Merge browser + external MCP servers, then inject the in-process
// stagent server via the shared helper (see withStagentMcpServer).
const [browserServers, externalServers] = await Promise.all([
  getBrowserMcpServers(),
  getExternalMcpServers(),
]);
const mergedMcpServers = withStagentMcpServer(
  ctx.payload?.mcpServers ?? {},
  browserServers,
  externalServers,
  task.projectId,
);
```

Then find the `allowedTools` spread:

```ts
...(ctx.payload?.allowedTools && { allowedTools: ctx.payload.allowedTools }),
```

Replace with the same pattern used in Step 5:

```ts
// allowedTools prepended via shared helper (see withStagentAllowedTools).
...(withStagentAllowedTools(ctx.payload?.allowedTools) && {
  allowedTools: withStagentAllowedTools(ctx.payload?.allowedTools)!,
}),
```

- [ ] **Step 8: Run the resume tests to verify they pass**

```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts -t "R-stagent"
```

Expected: both R-stagent-1 and R-stagent-2 PASS.

- [ ] **Step 9: Run the full `claude-agent.test.ts` file**

```bash
npx vitest run src/lib/agents/__tests__/claude-agent.test.ts
```

Expected: all tests PASS, namely 3 A-stagent tests, 2 R-stagent tests, and all pre-existing tests (total should be 34).

- [ ] **Step 10: Run the TypeScript check**

```bash
npx tsc --noEmit
```

Expected: exit 0, or the same pre-existing errors noted at the end of Task 1 (`callOptions possibly undefined`, `never has no call signatures`, `@/lib/db/schema` not found). These predate this feature and are not regressions. If NEW errors appear that are not on that list, diagnose before committing.

- [ ] **Step 11: Commit**

```bash
git add src/lib/agents/claude-agent.ts src/lib/agents/__tests__/claude-agent.test.ts
git commit -m "$(cat <<'EOF'
fix(agents): extract stagent helpers + inject into resumeClaudeTask

Introduces two private helpers in claude-agent.ts (withStagentMcpServer
and withStagentAllowedTools) that both executeClaudeTask and resumeClaudeTask
now call. Previously the stagent injection was inline in executeClaudeTask
(092f925); this commit refactors that call site to use the shared helpers
and applies the same helpers to resumeClaudeTask.

Extracting before mirroring was recommended by the Task 1 code review:
the two runtime entry points have already drifted once, and that drift is
literally the bug this feature fixes. Shared helpers make the same class
of drift structurally impossible going forward.

New tests R-stagent-1/2 cover the resume path. Existing A-stagent-1 and
new R-stagent-1 now also assert createToolServer was called with the
task's projectId. R-stagent-2 asserts mcp__stagent__* is prepended
(not merely present) so future reorderings can't silently break the
preset filter behavior.

With this commit, features/task-runtime-stagent-mcp-injection.md is
fully implemented on both claude-code runtime entry points.

Refs: features/task-runtime-stagent-mcp-injection.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Task 3: End-to-end smoke verification

**Purpose:** The unit tests prove the wiring reaches the SDK call with the right shape. This task proves that a real task against a real stagent tool invocation actually works end-to-end, which is the acceptance criterion the user actually cares about.

**Files:** None (manual verification; record results in the spec's References section).

---

- [ ] **Step 1: Start the dev server**

```bash
npm run dev
```

Expected: server boots on :3000 without errors. Wait for "✓ Ready".

- [ ] **Step 2: Create a test task that exercises stagent MCP**

Using the chat interface or a direct `create_task` MCP call, create a task whose prompt explicitly uses a stagent tool. A minimal example:

> Use `mcp__stagent__query_table` to list rows from any table in the current project. Report the row count. If no tables exist, say "no tables" and stop.

Assign it to the `general` profile (or any profile that does not override `allowedTools`).

- [ ] **Step 3: Execute the task**

Approve execution via the Inbox or the chat UI's approval toast. Watch the task's log stream.

Expected: the agent successfully invokes `mcp__stagent__query_table` (you should see a `tool_use` log entry with that tool name) and reports back with either the row count or "no tables". The agent must NOT report "No stagent table MCP tools are available in this session."

- [ ] **Step 4: Verify with a schedule (bonus, skip if time-pressed)**

Create a minimal schedule that runs every 5 minutes with the same prompt as Step 2. Wait for one firing, then check `get_schedule` for that schedule: `lastTurnCount` should be small (single digits) and the task should have completed successfully.

- [ ] **Step 5: Stop the dev server and record results**

Stop the dev server. Append a short "Verification run - 2026-04-11" note to `features/task-runtime-stagent-mcp-injection.md` in the References section, citing:
- Test task ID
- Tool invocation observed (`mcp__stagent__query_table`)
- Completion status
- (Optional) schedule firing id if Step 4 was run

If Step 3 fails (the agent still reports missing stagent tools), **do not proceed to flip the feature to completed**. Diagnose by checking the `agentLogs` table for the task, reading the SDK stderr chunks captured in `claude-agent.ts`, and re-running with `console.log(JSON.stringify(mergedMcpServers))` temporarily added before the `query()` call to confirm the stagent key is present at runtime. Report findings to the user.

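One caveat on the temporary `JSON.stringify(mergedMcpServers)` log: `JSON.stringify` omits function-valued properties and throws a `TypeError` on circular references. Whether the in-process stagent server object contains either is an assumption here, not something the plan states, but if it does, a key-only summary is a safer way to confirm the `stagent` key is present. The helper name `summarizeMcpServers` below is hypothetical.

```typescript
// Hypothetical debug helper: reports each server key and the runtime type
// of its value without serializing the value itself, so functions and
// circular references inside a server object cannot break the log line.
function summarizeMcpServers(servers: Record<string, unknown>): string {
  const summary = Object.entries(servers).map(([name, value]) => ({
    name,
    kind: value === null ? "null" : typeof value,
  }));
  return JSON.stringify(summary);
}

// Illustrative server map with a circular reference in one entry.
const circularServer: Record<string, unknown> = { name: "stagent" };
circularServer.self = circularServer;

const serverSummary = summarizeMcpServers({
  stagent: circularServer,
  browser: { command: "npx" },
});

// Direct serialization of the circular object fails, the summary does not.
let directStringifyFailed = false;
try {
  JSON.stringify(circularServer);
} catch {
  directStringifyFailed = true;
}

console.log(serverSummary);
console.log(directStringifyFailed); // true
```
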
- [ ] **Step 6: Commit the verification note**

```bash
git add features/task-runtime-stagent-mcp-injection.md
git commit -m "$(cat <<'EOF'
docs(features): record verification run for stagent MCP injection

Appends end-to-end verification notes to the feature spec after
confirming a test task successfully invoked mcp__stagent__query_table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Self-Review Checklist

**1. Spec coverage:** Every acceptance criterion in `features/task-runtime-stagent-mcp-injection.md` maps to a task:
- "executeClaudeTask calls createStagentMcpServer..." → Task 1, Step 5
- "resumeClaudeTask does the same" → Task 2, Step 7
- "When the profile has an explicit allowedTools, mcp__stagent__* is prepended" → Task 1 Step 10 + Task 2 Step 7; test A-stagent-2 + R-stagent-2
- "When the profile has no allowedTools, the SDK option is still omitted" → Task 1 Step 10 kept the conditional spread; test A-stagent-3
- "Permission-gated stagent tools still route through handleToolPermission" → No code change (already correct); verified by not touching lines 516-521 and 635-641
- "Existing claude-agent.test.ts tests still pass" → Task 1 Step 12, Task 2 Step 9
- "New unit tests assert mcpServers.stagent present on both paths" → A-stagent-1, R-stagent-1
- "Chat engine behavior is unchanged" → No edits to `src/lib/chat/engine.ts` (NOT in scope)

**2. Placeholder scan:** No TBDs. Every code block has concrete content. Error messages and commit messages are literal.

**3. Type consistency:** `createStagentMcpServer(projectId?: string | null, onToolResult?: ...)`: we pass only the first argument. `task.projectId` is `string | null` on the task row, matching the factory signature. The sentinel `{ __mockStagentServer: true }` is a `Record<string, unknown>` assignable to whatever shape the SDK expects for a registered MCP server (the test only asserts identity, not shape). The `allowedTools` merge produces `string[]` via `Array.from(new Set<string>(...))`, matching the SDK's `allowedTools?: string[]` option type.

---

## Execution Handoff

**Plan complete and saved to `docs/superpowers/plans/2026-04-11-task-runtime-stagent-mcp-injection.md`. Two execution options:**

**1. Subagent-Driven (recommended):** I dispatch a fresh subagent per task, with review between tasks and fast iteration.

**2. Inline Execution:** Execute tasks in this session using `superpowers:executing-plans`, with batch execution and checkpoints.

**Which approach?**