antpath 0.1.0 → 0.1.1
- package/README.md +66 -67
- package/dist/credentials.js +34 -5
- package/dist/credentials.js.map +1 -1
- package/dist/files/downloader.js +8 -0
- package/dist/files/downloader.js.map +1 -1
- package/dist/index.d.ts +5 -1
- package/dist/index.js +2 -0
- package/dist/index.js.map +1 -1
- package/dist/platform/client.d.ts +73 -0
- package/dist/platform/client.js +107 -0
- package/dist/platform/client.js.map +1 -0
- package/dist/platform/index.d.ts +1 -0
- package/dist/platform/index.js +2 -0
- package/dist/platform/index.js.map +1 -0
- package/dist/providers/anthropic/provider.d.ts +6 -0
- package/dist/providers/anthropic/provider.js +90 -12
- package/dist/providers/anthropic/provider.js.map +1 -1
- package/dist/utils/paths.js +9 -3
- package/dist/utils/paths.js.map +1 -1
- package/docs/cleanup.md +15 -15
- package/docs/credentials.md +23 -23
- package/docs/mcp.md +18 -18
- package/docs/outputs.md +16 -16
- package/docs/quickstart.md +13 -13
- package/docs/release.md +22 -22
- package/docs/skills.md +16 -16
- package/docs/templates.md +24 -24
- package/docs/testing.md +26 -27
- package/examples/mcp-static-bearer.ts +30 -30
- package/examples/quickstart.ts +23 -23
- package/package.json +46 -51
- package/references/architecture-decisions.md +427 -203
- package/references/implementation-plan.md +430 -527
- package/references/research-sources.md +41 -30
- package/references/testing-strategy.md +29 -108
@@ -1,203 +1,427 @@

(Removed: the previous 203-line version of `package/references/architecture-decisions.md`. Only fragments survive in this extract: the front matter `title: antpath architecture decisions` and `status: accepted`, and the headings `# antpath architecture decisions` and `## Product framing`.)

---
title: antpath architecture decisions
status: accepted
scope: platform MVP
---

# antpath architecture decisions

## Product framing

antpath is a TypeScript-first platform plus SDK for running autonomous sessions on provider-managed agent infrastructure. Claude Managed Agents remains the execution runtime. antpath provides the durable control plane around it: submit jobs, dispatch provider sessions, track tenant-scoped metadata, observe lifecycle state, capture configured outputs, enforce quotas, and retain or clean up provider resources according to run policy.

The platform is intentionally not a custom agent loop or sandbox runtime. The worker dispatches, observes, stores metadata, captures outputs, and applies retention/cleanup policy. It does not execute tools, approve tool calls, or participate in provider-side reasoning.

This supersedes the earlier SDK-only MVP boundary.

## Goals

- Stable start-to-finish lifecycle for provider-managed agent runs.
- Minimal tenant-scoped dashboard metadata for monitoring run execution.
- Durable worker recovery after restarts, deploys, crashes, or missed notifications.
- Horizontal worker scaling without duplicate run ownership.
- Provider cleanup by default after terminal states, with optional retention when configured per-request or per-deployment.
- BYO provider key custody with encrypted storage.
- Private output capture with quota enforcement.
- Programmatic SDK access for submitting and observing platform runs.
- Test-driven implementation across SDK, dashboard, worker, database, storage, and provider adapters.

## Non-goals for platform MVP

- No custom agent loop.
- No antpath-managed sandbox.
- No runtime human approval/tool approval flow.
- No direct browser access to provider APIs.
- No direct browser Supabase data access.
- No raw prompt, raw model output, raw provider event payload, MCP credential, provider key, or output file content in normal application tables.
- No provider Agent/Environment caching in MVP.
- No provider webhooks in MVP.
- No Supabase Realtime in MVP.
- No OpenAI or other provider integration in MVP.

## Core decisions

| Area | Decision |
| --- | --- |
| Product surface | Platform plus SDK. |
| Repository | Convert the repository to an npm TypeScript workspace. |
| SDK location | Move the existing SDK package to `packages/sdk`. |
| SDK role | The SDK submits runs and observes status, metadata, outputs, cancellation, and deletion through the platform API. |
| Dashboard | Build a first-party minimal dashboard. |
| Worker | Run a Railway persistent service, designed as ephemeral and horizontally scalable. |
| Database | Use Supabase Postgres as the source of truth. |
| Storage | Use private Supabase Storage buckets for captured outputs. |
| Auth | Use Auth.js for user authentication, not Supabase Auth. |
| Tenant model | Workspace is the tenant boundary. |
| Membership | Users access workspaces through `workspace_memberships`. |
| Authorization boundary | Vercel BFF/server actions validate Auth.js sessions or SDK API tokens and scope every DB operation by workspace membership/scope. |
| Browser data access | Browser code talks to the BFF/server actions, not directly to Supabase data APIs. |
| Service credentials | Supabase service-role credentials are server/worker only. |
| SDK auth | SDK uses hashed, workspace-scoped API tokens. Dashboard uses Auth.js sessions. |
| API token attribution | Token-authenticated runs are attributed to the token creator at submission time unless a future service-account model is introduced. |
| Provider key custody | Workspace BYO Anthropic key is stored encrypted through Supabase Vault. |
| Secret lifetime | Worker resolves provider keys per claimed lifecycle step, keeps them only in memory for that step, and drops them before lease release. |
| Provider resources | Create provider Agent, Environment, Vault/Credential, Session, and file resources per run in MVP. |
| Provider resource caching | Backlog; later cache Agent/Environment by Template/config hash. |
| Worker wakeup | Poll due rows from the runs table. Postgres `NOTIFY` is a latency optimization only. |
| Worker ownership | Use DB leases, `FOR UPDATE SKIP LOCKED`, lease tokens, and expiry for horizontally scaled workers. |
| Provider observation | Poll provider session status and list provider events since the last cursor/timestamp where available. |
| Event dedupe | Provider event id is the dedupe authority; cursor/timestamp is advisory. |
| Webhooks | Backlog only; if added, use as wakeup/reconciliation signals, not sole source of truth. |
| SSE | Not the primary monitoring mechanism in MVP. |
| MCP approvals | Disallow approval-required tools. Worker must not approve tools or return custom tool results. |
| Template boundary | Templates are code-first, secret-free snapshots with stable hashes. |
| Idempotency | Run submission idempotency is scoped by `(workspace_id, idempotency_key)` and request hash. Same key and same hash returns the existing run; same key and different hash returns conflict. |
| Quotas | Enforce workspace plan quotas with per-user attribution. |
| Run caps | Plan-based caps cover duration, concurrency, storage, polling, retries, and output capture. |
| Operational config | Exact cap/tier values are environment-configurable defaults with conservative fallbacks and run-level snapshots. |
| Output capture | Enabled by default when configured by the run/template and within quota. |
| Output access | BFF returns signed links only after workspace authorization. |
| Cleanup | Clean up Claude provider resources by default after terminal state and output capture; explicit run policy or worker default can retain them for inspection. |
| Delete semantics | User delete is soft/pending while execution, cleanup, or storage deletion is active. Hard purge happens only after cleanup/storage deletion succeeds. |
| Retention | Run metadata and stored outputs remain until user deletion in MVP. |
| Realtime | Defer from MVP; use BFF-mediated refresh/polling first. Revisit with custom Supabase JWT/RLS or a server-mediated realtime bridge. |
| Development workflow | Use test-driven development: write or update the failing test at the narrowest useful layer before implementation. |
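
The idempotency rule in the table can be illustrated with a minimal in-memory sketch. The `RunStore` class, its method name, and the result shapes are hypothetical; the map stands in for the `runs` table with its `(workspace_id, idempotency_key)` uniqueness, and the request hash is assumed to be computed by the caller.

```typescript
// Hypothetical in-memory stand-in for the `runs` table; names are illustrative only.
type SubmitResult =
  | { kind: "created"; runId: string }
  | { kind: "existing"; runId: string }
  | { kind: "conflict"; runId: string };

interface StoredRun {
  runId: string;
  requestHash: string;
}

class RunStore {
  // Keyed by workspace id plus idempotency key, mirroring the unique scope.
  private readonly runs = new Map<string, StoredRun>();
  private nextId = 1;

  submit(workspaceId: string, idempotencyKey: string, requestHash: string): SubmitResult {
    const key = JSON.stringify([workspaceId, idempotencyKey]);
    const existing = this.runs.get(key);
    if (existing === undefined) {
      const runId = `run_${this.nextId++}`;
      this.runs.set(key, { runId, requestHash });
      return { kind: "created", runId };
    }
    // Same key and same hash: return the existing run instead of creating a new one.
    if (existing.requestHash === requestHash) {
      return { kind: "existing", runId: existing.runId };
    }
    // Same key but a different request hash: report a conflict.
    return { kind: "conflict", runId: existing.runId };
  }
}
```

In a real deployment the database constraint, not application memory, would presumably be the authority, so two concurrent submissions with the same key cannot both create a run.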

## Workspace layout

Target structure:

```text
apps/
  dashboard/
    app/
    server/
    components/
    auth/
    db/
  worker/
    src/
      main.ts
      polling/
      providers/
      lifecycle/
      cleanup/
      storage/
      observability/
packages/
  sdk/
    src/
  shared/
    src/
      types/
      status/
      redaction/
      templates/
  db/
    migrations/
    schema/
    queries/
```

`packages/shared` and `packages/db` may start small and grow only when invariants or schema/query helpers are shared by dashboard, worker, and SDK.

## Core data model

### Identity and tenancy

- `users`
  - Auth.js user identity mirror or Auth.js adapter user table reference.
  - No provider secrets.
- `workspaces`
  - Tenant boundary.
  - Plan, quota, status, and retention settings.
- `workspace_memberships`
  - Workspace, user, role, and membership status.
  - Every dashboard/API operation must prove membership/scope.
- `api_tokens`
  - Workspace-scoped SDK credentials.
  - Store hashed token material only, plus scopes, creator, last-used timestamp, and revoked timestamp.
  - API-token submissions freeze the attributed user on the run row at creation time.

### Provider connections

- `provider_connections`
  - Workspace-owned provider configuration.
  - Provider type, display name, validation status, rotation/revocation status, and encrypted secret reference.
  - Secret values live in Supabase Vault.
  - Only trusted server/worker code can resolve decrypted provider keys.

### Runs

- `runs`
  - User-visible logical run.
  - Workspace, creator/attributed user, template snapshot/hash, status, lifecycle phase, plan caps, timestamps.
  - Submission idempotency key and request hash.
  - Unique constraint: `(workspace_id, idempotency_key)`.
  - Lease fields: `lease_owner`, `lease_token`, `lease_expires_at`, `attempt_count`, `next_check_at`, `priority`.
  - Cancellation/deletion fields: `cancel_requested_at`, `pending_delete_at`, `deleted_at`.
  - Provider observation cursor/watermark where available.
- `run_attempts`
  - A logical run may have multiple provider attempts.
  - Tracks provider session id, attempt state, start/end timestamps, and error classification.
- `provider_resources`
  - Journal of intended and created provider resources.
  - Resource type, local idempotency key, deterministic provider name/metadata tags, provider id when known, cleanup status.
  - Insert an intended row before provider side effects whenever possible.
- `run_events`
  - Metadata only.
  - Provider event id/type, processed timestamp, redacted summary fields, usage metadata.
  - Unique constraint on `(run_attempt_id, provider_event_id)`.
  - No raw prompts, raw outputs, tool inputs/results, file contents, or credentials.

### Outputs and cleanup

- `output_objects`
  - Workspace/run owner, storage bucket/path, size, checksum if available, content type, provider file id.
  - Used for quota accounting and signed-link generation.
- `cleanup_attempts`
  - Per-resource cleanup action, status, retry count, redacted error code/message, timestamps.
- `usage_ledger`
  - Token/cost/storage attribution by workspace and attributed user.
  - Written transactionally with source event/output rows to avoid drift.

## Run lifecycle

1. SDK/dashboard submits a run to the BFF.
2. BFF validates Auth.js session or SDK API token.
3. BFF resolves active workspace membership/scope.
4. BFF validates Template/request shape, plan limits, and idempotency key/request hash.
5. BFF writes a `runs` row with execution-affecting cap values snapshotted from plan/env defaults.
6. BFF emits Postgres `NOTIFY` for fast wakeup.
7. Worker claims due runs with row locking, `SKIP LOCKED`, lease token, and lease expiry.
8. Worker resolves the provider key from Supabase Vault for the claimed step.
9. Worker validates platform constraints: no approval-required tools, no custom antpath-executed tools, known output policy.
10. Worker pre-journals intended provider resources.
11. Worker creates provider resources and records provider IDs immediately after successful calls.
12. Worker creates provider session and sends the initial user event.
13. Worker schedules provider polling through `next_check_at`.
14. Worker later reclaims the run and polls provider session status plus event list.
15. Worker stores only metadata events and usage, deduped by provider event id.
16. Before every side effect, worker checks lease token and cancellation/deletion requests.
17. On terminal provider state or antpath timeout, worker captures configured outputs within per-file, per-run, and workspace quota caps.
18. Worker cleans up provider resources by default, or records them as retained when the run/deployment policy asks to retain them.
19. Worker marks the run terminal and releases the lease.
20. Dashboard/SDK read tenant-scoped metadata and signed output links through BFF APIs.

## Worker concurrency and recovery

The database is the coordination primitive. Worker instances are stateless and can be added or removed without losing correctness.

Claiming rules:

- Query due rows ordered by priority, fairness across workspaces, and `next_check_at`.
- Use `FOR UPDATE SKIP LOCKED`.
- Set `lease_owner`, `lease_token`, `lease_expires_at`, and attempt counters.
- Commit immediately.
- Execute one bounded lifecycle step outside long transactions.
- Persist result and either schedule next step or mark terminal.

Safety rules:

- Every status-mutating update includes `WHERE lease_token = $token AND lease_expires_at > now()` and verifies exactly one affected row.
- Every side-effecting step checks `cancel_requested_at` and `pending_delete_at`.
- Expired leases are reclaimable by any worker.
- `next_check_at` includes jitter.
- Claiming enforces per-workspace active-run caps and provider-key-scoped rate limits.
- Workers handle SIGTERM by stopping new claims and relying on bounded steps, idempotency, leases, and reconciliation for in-flight work.
- Polling is always enabled; `NOTIFY` only reduces latency.
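
The claiming and safety rules above can be sketched with an in-memory model. This is only an illustration under stated assumptions: the real design relies on Postgres row locks with `FOR UPDATE SKIP LOCKED`, which the sketch does not reproduce, and it leaves out workspace fairness, active-run caps, and rate limits. The `RunRow` shape and the `claimDueRun` and `guardedUpdate` helpers are hypothetical names, and treating a higher `priority` number as more urgent is an assumption.

```typescript
// Hypothetical row shape using the lease fields named in the data model.
interface RunRow {
  id: string;
  status: string;
  priority: number;
  nextCheckAt: number; // epoch milliseconds
  leaseOwner: string | null;
  leaseToken: string | null;
  leaseExpiresAt: number; // epoch milliseconds; 0 means never leased
  attemptCount: number;
}

// Claim the most urgent due run whose lease is free or expired.
function claimDueRun(
  rows: RunRow[],
  workerId: string,
  token: string,
  now: number,
  leaseMs: number,
): RunRow | undefined {
  const due = rows
    .filter((r) => r.nextCheckAt <= now && r.leaseExpiresAt <= now)
    // Assumption: higher priority first, then the earliest next_check_at.
    .sort((a, b) => b.priority - a.priority || a.nextCheckAt - b.nextCheckAt);
  const row = due[0];
  if (row === undefined) return undefined;
  row.leaseOwner = workerId;
  row.leaseToken = token;
  row.leaseExpiresAt = now + leaseMs;
  row.attemptCount += 1;
  return row;
}

// Mirrors `WHERE lease_token = $token AND lease_expires_at > now()`:
// the mutation applies only while the caller still holds a live lease.
function guardedUpdate(
  row: RunRow,
  token: string,
  now: number,
  mutate: (r: RunRow) => void,
): boolean {
  if (row.leaseToken !== token || row.leaseExpiresAt <= now) return false;
  mutate(row);
  return true;
}
```

A worker whose lease has expired and been reclaimed gets `false` from `guardedUpdate`, which corresponds to the "exactly one affected row" check failing in SQL.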

## Provider observation

MVP source of truth:

- Provider session retrieve for status.
- Provider session events list for event metadata and usage.

Rules:

- Cursor/timestamp filters are used when available.
- Provider event id is always used for dedupe.
- Phase 5 must verify Claude Managed Agents event pagination/filter semantics.
- If no stable cursor or since filter exists, use a bounded re-list strategy with event-id dedupe instead of unbounded full-history scans.
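
The dedupe rule can be sketched as follows. The `ProviderEvent` shape, the `IngestState` record, and `ingestEvents` are hypothetical; the in-memory set stands in for the unique constraint on `(run_attempt_id, provider_event_id)`, and the watermark is kept only as an advisory hint for the next list call.

```typescript
// Hypothetical minimal event shape; only metadata is kept.
interface ProviderEvent {
  id: string;
  type: string;
  createdAt: number; // epoch milliseconds
}

interface IngestState {
  seenIds: Set<string>; // stands in for the unique constraint on provider event id
  watermark: number; // advisory only: narrows the next list call, never decides dedupe
}

// Returns the events that were not seen before and advances the advisory watermark.
function ingestEvents(state: IngestState, listed: ProviderEvent[]): ProviderEvent[] {
  const fresh: ProviderEvent[] = [];
  for (const event of listed) {
    if (state.seenIds.has(event.id)) continue; // duplicate from an overlapping re-list
    state.seenIds.add(event.id);
    fresh.push(event);
    state.watermark = Math.max(state.watermark, event.createdAt);
  }
  return fresh;
}
```

Because the event id decides what is new, an overlapping re-list or an out-of-order timestamp does not produce duplicate rows or drop a late event.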

Excluded from MVP:

- SSE as primary monitoring.
- Anthropic webhooks as primary monitoring.

## Auth and dashboard security

Auth.js handles interactive user sign-in. SDK API tokens handle programmatic access.

BFF/server actions must:

- validate Auth.js session or API token;
- resolve user identity and active workspace;
- check workspace membership or token scope;
- scope every query/mutation by workspace id;
- return only metadata allowed by the user's role/scope;
- keep Supabase service-role credentials out of browser bundles.

Supabase Realtime is backlog until antpath either mints short-lived Supabase-compatible JWTs with RLS policies or exposes a server-mediated realtime bridge.

## Secret handling

- Provider keys are workspace BYOK and stored via Supabase Vault.
- MCP credentials follow the same no-log/no-table-secret rule.
- Normal app tables store only secret references and validation/rotation status.
- Worker resolves decrypted provider keys only for a claimed lifecycle step.
- Secret values should use explicit redacted wrappers in code so they cannot serialize into logs, metrics, errors, events, or fixtures.
- Secret redaction applies to logs, errors, run metadata, provider events, tests, fixtures, and docs.
- If a provider key is revoked/rotated mid-run, active runs are failed or cancelled with a tenant-permanent error unless cleanup can still authenticate.
- Revoked keys do not get a hidden grace cache in MVP. If cleanup cannot authenticate after revocation, the run/resource moves to `cleanup_failed` with an actionable tenant error.
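
One possible shape for the "explicit redacted wrapper" mentioned above is sketched here. The `Redacted` interface, the `redact` function, and the placeholder string are hypothetical and are not taken from the antpath codebase.

```typescript
// Hypothetical wrapper: the value lives in a closure rather than in a property,
// so it does not appear in JSON output or in string conversion.
interface Redacted {
  reveal(): string;
  toString(): string;
  toJSON(): string;
}

const REDACTED_PLACEHOLDER = "[REDACTED]";

function redact(value: string): Redacted {
  return {
    // Call only at the point of use, for example when building a request header.
    reveal: () => value,
    toString: () => REDACTED_PLACEHOLDER,
    toJSON: () => REDACTED_PLACEHOLDER,
  };
}
```

With this shape, accidentally passing the wrapper to a logger or embedding it in an error message yields the placeholder, and reading the key requires a deliberate `reveal()` call that is easy to audit.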

## Output storage and quotas

Output flow:

1. Worker lists provider session files.
2. Worker checks provider file metadata, output policy, and remaining quota.
3. Worker downloads selected files with streaming hard caps.
4. If provider size is unknown or exceeds caps, worker aborts capture for that object and records a quota/cap warning.
5. Worker uploads accepted files to private Supabase Storage.
6. Worker records `output_objects` rows.
7. Worker cleans up provider resources by default, or retains them when explicitly configured.
8. Dashboard/SDK request signed links through the BFF.

Quota rules:

- Workspace plan quota is the hard enforcement boundary.
- Per-file and per-run caps protect the worker before workspace quota is consumed.
- Per-user attribution is frozen from the run row at submission time.
- Free first X users is a plan/billing rule on top of workspace-level enforcement.
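
The capture decision in step 4 and the quota rules above can be sketched as a pure function. The `Caps`, `Usage`, and `CaptureDecision` shapes, the reason codes, and `decideCapture` are hypothetical names; the cap values are assumed to come from the run-level snapshot.

```typescript
// Hypothetical shapes; cap values would come from the run-level snapshot.
interface Caps {
  perFileBytes: number;
  perRunBytes: number;
  workspaceBytes: number;
}

interface Usage {
  runBytes: number; // bytes already captured for this run
  workspaceBytes: number; // bytes already stored for the workspace
}

type CaptureDecision =
  | { capture: true }
  | { capture: false; reason: "size_unknown" | "per_file_cap" | "per_run_cap" | "workspace_quota" };

// Checks the per-file and per-run caps first, then the workspace quota.
function decideCapture(sizeBytes: number | undefined, caps: Caps, usage: Usage): CaptureDecision {
  if (sizeBytes === undefined) return { capture: false, reason: "size_unknown" };
  if (sizeBytes > caps.perFileBytes) return { capture: false, reason: "per_file_cap" };
  if (usage.runBytes + sizeBytes > caps.perRunBytes) {
    return { capture: false, reason: "per_run_cap" };
  }
  if (usage.workspaceBytes + sizeBytes > caps.workspaceBytes) {
    return { capture: false, reason: "workspace_quota" };
  }
  return { capture: true };
}
```

A metadata check like this would not replace the streaming hard cap during download, since provider-reported sizes may be missing or inaccurate.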

## Cleanup and reconciliation

Claude provider-resource cleanup is mandatory by default after terminal provider state and output capture so antpath does not leave behind provider state. Retention is opt-in and policy-driven through `cleanup.claudeSession = "retain"` or `ANTPATH_WORKER_CLAUDE_SESSION_CLEANUP_DEFAULT=retain`. Retained resources are recorded in `provider_resources.cleanup_status = retained` and remain reachable via the provider API.
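
How the two retention switches might combine is sketched below. Only the `"retain"` value and the environment variable name come from the text above; the `"delete"` value, the `RunCleanupPolicy` shape, the `resolveSessionCleanup` function, and the precedence (run policy over worker default) are assumptions made for illustration.

```typescript
// "retain" comes from the document; "delete" is a hypothetical name for the default.
type SessionCleanup = "delete" | "retain";

interface RunCleanupPolicy {
  claudeSession?: SessionCleanup;
}

type WorkerEnv = Record<string, string | undefined>;

// Assumed precedence: an explicit run-level policy wins; otherwise the worker
// default applies, and anything other than an explicit "retain" means clean up.
function resolveSessionCleanup(
  policy: RunCleanupPolicy | undefined,
  env: WorkerEnv,
): SessionCleanup {
  if (policy !== undefined && policy.claudeSession !== undefined) {
    return policy.claudeSession;
  }
  return env["ANTPATH_WORKER_CLAUDE_SESSION_CLEANUP_DEFAULT"] === "retain" ? "retain" : "delete";
}
```

Treating every unrecognized value as cleanup keeps a typo in configuration from silently retaining provider state.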

Explicit cleanup order:

1. Provider credentials/vaults.
2. Session files/session where supported.
3. Agent/archive.
4. Environment/archive or delete where allowed.
5. Uploaded provider file resources.
6. Local output metadata/storage only when user deletes a run.

Cleanup properties:

- Retried independently from run execution.
- Idempotent where provider APIs allow.
- Redacted errors recorded in `cleanup_attempts`.
- Runs can be terminal while cleanup remains pending or retryable-failed.
- User deletion sets `pending_delete_at` while cleanup/storage deletion is active.

Resource leak recovery:

- Every provider create starts from a local intended row with deterministic provider name/metadata where provider APIs allow.
- A reconciliation sweeper reviews unfinished intended rows, expired leases, and provider-listable resources tagged with antpath metadata.
- If provider create succeeded but provider id was not persisted before a crash, the sweeper attempts to match by deterministic name/metadata and attach the provider id for cleanup.

Workspace/key deletion:

- Workspace deletion blocks new runs, requests cancellation for active runs, drains cleanup, deletes stored outputs, then purges metadata.
- Provider key revocation blocks new runs immediately and may force existing runs to fail/cancel if cleanup can no longer authenticate.

## Run state model

Suggested run statuses:

- `queued`
- `claiming`
- `provisioning`
- `session_created`
- `dispatched`
- `provider_running`
- `provider_idle`
- `provider_rescheduled`
- `cancelling`
- `capturing_outputs`
- `cleaning_up`
- `succeeded`
- `failed`
- `timed_out`
- `cancelled`
- `cleanup_failed`
- `pending_delete`
- `deleted`

Terminal user-facing run status is separate from cleanup state so a successful provider run can still show cleanup retry warnings.

Error classes:

- `transient_provider`: retry with backoff and jitter.
- `provider_permanent`: fail the attempt/run and clean up.
- `tenant_permanent`: invalid key, quota exceeded, invalid Template, approval-required event observed.
- `antpath_bug`: fail safely, alert, and preserve cleanup work.
- `cancelled_by_user`: clean up and mark cancelled.
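
The retry rule for `transient_provider` can be sketched as below. The error class names come from the list above; `isRetryable`, `retryDelayMs`, and the specific backoff formula (capped exponential growth reduced by a random fraction) are assumptions, since the document only asks for "backoff and jitter".

```typescript
type ErrorClass =
  | "transient_provider"
  | "provider_permanent"
  | "tenant_permanent"
  | "antpath_bug"
  | "cancelled_by_user";

// Only transient provider errors are retried; every other class ends the attempt.
function isRetryable(errorClass: ErrorClass): boolean {
  return errorClass === "transient_provider";
}

// Exponential backoff capped at maxMs, then reduced by up to jitterRatio of its value.
// `random` is injectable so the schedule can be tested deterministically.
function retryDelayMs(
  attempt: number,
  baseMs: number,
  maxMs: number,
  jitterRatio: number,
  random: () => number = Math.random,
): number {
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.round(capped * (1 - jitterRatio * random()));
}
```

The same helper could feed `next_check_at`, so that many runs failing at once do not all retry at the same instant.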

## Deployment baseline

- Dashboard: Vercel.
- Worker: Railway persistent service with configurable replicas.
- Database: Supabase Postgres.
- Storage: Supabase Storage private bucket.
- Secrets: Vercel/Railway environment variables plus Supabase Vault for tenant provider keys.

Required runtime config includes:

- database URL;
- Supabase service credentials;
- Supabase Storage bucket;
- Supabase Vault access function/schema;
- Auth.js secret/providers;
- SDK token hashing secret/pepper;
- worker identity;
- provider API base/version settings.

Environment-configurable defaults include:

- max run duration;
- max active runs per workspace;
- max active runs per user/token;
- polling base interval;
- polling max interval;
- polling jitter;
- provider create/delete/poll token-bucket limits;
- provider retry backoff;
- lease duration;
- lease renewal threshold;
- max provider attempts;
- cleanup retry count;
- cleanup retry backoff;
- per-file output cap;
- per-run output cap;
- workspace storage cap;
- signed URL TTL;
- free user allowance;
- metadata retention toggles.

Missing optional env vars must fall back to conservative low limits. Missing required secret/connectivity env vars must fail service startup.
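
The fallback rule can be sketched as a small config loader. The variable names `DATABASE_URL`, `ANTPATH_MAX_RUN_DURATION_MS`, and `ANTPATH_MAX_ACTIVE_RUNS_PER_WORKSPACE`, the fallback values, and the helper names are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical variable names and fallback values, chosen only for illustration.
type Env = Record<string, string | undefined>;

interface WorkerConfig {
  databaseUrl: string;
  maxRunDurationMs: number;
  maxActiveRunsPerWorkspace: number;
}

// Required connectivity/secret values: a missing one aborts startup.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Optional limits: fall back to the conservative default when the variable
// is unset or is not a positive integer.
function optionalInt(env: Env, name: string, conservativeDefault: number): number {
  const raw = env[name];
  if (raw === undefined) return conservativeDefault;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : conservativeDefault;
}

function loadWorkerConfig(env: Env): WorkerConfig {
  return {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    maxRunDurationMs: optionalInt(env, "ANTPATH_MAX_RUN_DURATION_MS", 5 * 60 * 1000),
    maxActiveRunsPerWorkspace: optionalInt(env, "ANTPATH_MAX_ACTIVE_RUNS_PER_WORKSPACE", 1),
  };
}
```

A service would presumably call something like `loadWorkerConfig(process.env)` once at startup, so a missing required value stops the process before it claims any run.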

## Open implementation details

These are not architecture blockers, but must be pinned during implementation planning:

- Exact plan tier values for duration, concurrency, storage, polling, and free user allowance.
- Exact Auth.js providers.
- Auth.js adapter/session mode and user mirror lifecycle.
- Supabase Vault access method and SQL privilege wrapper shape.
- Signed URL TTL and whether storage paths include random unguessable components in addition to workspace/run ids.
- Deletion audit requirements.
- Provider metadata naming convention.
- Provider list/search capabilities and reconciliation coverage for each resource type.
- Exact Claude Managed Agents event pagination/filter semantics and bounded polling fallback.

## Backlog

- Provider webhooks as wakeup/reconciliation accelerator.
- SSE live event stream for richer UI.
- Supabase Realtime with explicit Auth.js-to-Supabase authorization design.
- Agent/Environment caching by Template/config hash.
- More providers.
- Runtime human approval flow if product scope changes.
- Advanced billing and plan management.
- Cloud Template registry.
- Curated MCP adapter catalog.