pilotswarm-sdk 0.1.20 → 0.1.21
- package/README.md +6 -0
- package/dist/artifact-tools.d.ts.map +1 -1
- package/dist/artifact-tools.js +20 -5
- package/dist/artifact-tools.js.map +1 -1
- package/dist/blob-store.d.ts +6 -4
- package/dist/blob-store.d.ts.map +1 -1
- package/dist/blob-store.js +55 -12
- package/dist/blob-store.js.map +1 -1
- package/dist/client.d.ts +4 -1
- package/dist/client.d.ts.map +1 -1
- package/dist/client.js +4 -0
- package/dist/client.js.map +1 -1
- package/dist/cms-migrations.d.ts.map +1 -1
- package/dist/cms-migrations.js +344 -0
- package/dist/cms-migrations.js.map +1 -1
- package/dist/cms.d.ts +59 -0
- package/dist/cms.d.ts.map +1 -1
- package/dist/cms.js +147 -0
- package/dist/cms.js.map +1 -1
- package/dist/index.d.ts +3 -3
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js.map +1 -1
- package/dist/inspect-tools.d.ts +3 -1
- package/dist/inspect-tools.d.ts.map +1 -1
- package/dist/inspect-tools.js +101 -14
- package/dist/inspect-tools.js.map +1 -1
- package/dist/managed-session.d.ts.map +1 -1
- package/dist/managed-session.js +76 -35
- package/dist/managed-session.js.map +1 -1
- package/dist/management-client.d.ts +7 -2
- package/dist/management-client.d.ts.map +1 -1
- package/dist/management-client.js +30 -0
- package/dist/management-client.js.map +1 -1
- package/dist/orchestration-registry.d.ts.map +1 -1
- package/dist/orchestration-registry.js +6 -2
- package/dist/orchestration-registry.js.map +1 -1
- package/dist/orchestration-version.d.ts +1 -1
- package/dist/orchestration-version.js +1 -1
- package/dist/orchestration.d.ts +3 -3
- package/dist/orchestration.d.ts.map +1 -1
- package/dist/orchestration.js +27 -4
- package/dist/orchestration.js.map +1 -1
- package/dist/orchestration_1_0_43.d.ts +12 -0
- package/dist/orchestration_1_0_43.d.ts.map +1 -0
- package/dist/orchestration_1_0_43.js +2710 -0
- package/dist/orchestration_1_0_43.js.map +1 -0
- package/dist/orchestration_1_0_44.d.ts +12 -0
- package/dist/orchestration_1_0_44.d.ts.map +1 -0
- package/dist/orchestration_1_0_44.js +2710 -0
- package/dist/orchestration_1_0_44.js.map +1 -0
- package/dist/session-manager.js +1 -1
- package/dist/session-manager.js.map +1 -1
- package/dist/session-owner-utils.d.ts +25 -0
- package/dist/session-owner-utils.d.ts.map +1 -0
- package/dist/session-owner-utils.js +82 -0
- package/dist/session-owner-utils.js.map +1 -0
- package/dist/session-proxy.d.ts +5 -1
- package/dist/session-proxy.d.ts.map +1 -1
- package/dist/session-proxy.js +64 -8
- package/dist/session-proxy.js.map +1 -1
- package/dist/session-store.d.ts +38 -6
- package/dist/session-store.d.ts.map +1 -1
- package/dist/session-store.js +187 -9
- package/dist/session-store.js.map +1 -1
- package/dist/types.d.ts +19 -1
- package/dist/types.d.ts.map +1 -1
- package/dist/types.js.map +1 -1
- package/dist/worker.d.ts.map +1 -1
- package/dist/worker.js +5 -2
- package/dist/worker.js.map +1 -1
- package/package.json +4 -3
- package/plugins/mgmt/agents/agent-tuner.agent.md +14 -3
- package/plugins/mgmt/agents/facts-manager.agent.md +8 -1
- package/plugins/mgmt/agents/pilotswarm.agent.md +11 -8
- package/plugins/mgmt/agents/resourcemgr.agent.md +11 -4
- package/plugins/mgmt/agents/sweeper.agent.md +5 -4
- package/plugins/mgmt/skills/cost-latency-analysis/SKILL.md +117 -0
- package/plugins/mgmt/skills/resourcemgr/SKILL.md +1 -1
- package/plugins/mgmt/skills/sweeper/SKILL.md +4 -4
- package/plugins/system/agents/default.agent.md +12 -0
--- package/plugins/mgmt/agents/agent-tuner.agent.md
+++ package/plugins/mgmt/agents/agent-tuner.agent.md
@@ -14,6 +14,7 @@ tools:
 - read_agent_events
 - list_all_sessions
 - read_session_info
+- read_user_stats
 - read_session_metric_summary
 - read_session_tree_stats
 - read_fleet_stats
@@ -70,19 +71,29 @@ applying that test — most idle sessions are dehydrated and healthy,
 including all four permanent system children. Re-read the skill if you
 catch yourself about to flag a `[cron]`-tagged session as stalled.
 
+**Required reading before any cost or model-latency report:** the
+`cost-latency-analysis` skill. It defines the difference between the
+`runTurn` activity span and `assistant.usage.duration`, and lists the
+canonical price-card sources for OpenAI / Azure OpenAI / Azure AI
+Foundry / Anthropic / GitHub Copilot. Do **not** quote model latency
+from `runTurn` spans, and do **not** quote per-token dollar cost
+without naming the price source and the date you fetched it.
+
 1. **Restate the operator's expectation in one sentence.**
 "The operator expects that <agent X> should produce <Y> but observes <Z>."
 If the request is ambiguous, ask one focused clarifying question. Don't
 guess.
 
 2. **Identify the target session(s).**
-
-
+Use `list_all_sessions` (with `agent_id_filter`, `owner_query`, `owner_kind`, or `include_system`) to
+locate the session(s) by description, title, owner, or agent. Confirm the
 `sessionId` before any further reads.
 
 3. **Pull baseline metadata.**
 - `read_session_info(session_id)` — title, agent, model, parent, status,
-iterations, last error, wait reason.
+owner, iterations, last error, wait reason.
+- `read_user_stats(owner_query=...)` — owner-scoped totals when the symptom
+is tied to a specific user, user cohort, or ownership boundary.
 - `read_session_tree_stats(session_id)` — full spawn tree with rolled-up
 stats. Always look at the tree, not just the root, when parent / child
 interactions are involved.
--- package/plugins/mgmt/agents/facts-manager.agent.md
+++ package/plugins/mgmt/agents/facts-manager.agent.md
@@ -48,7 +48,7 @@ On your first cycle, check for config facts under `config/facts-manager/`. If an
 
 - `config/facts-manager/retention-window` → `{ "value": -1, "unit": "seconds", "description": "Intake retention after incorporation. -1 = infinite." }`
 - `config/facts-manager/index-cap` → `{ "value": 50, "description": "Max skills + asks surfaced to agents per turn." }`
-- `config/facts-manager/cycle-interval` → `{ "value":
+- `config/facts-manager/cycle-interval` → `{ "value": 180, "unit": "seconds", "description": "Seconds between compaction cycles." }`
 - `config/facts-manager/skill-ttl` → `{ "value": 2592000, "unit": "seconds", "description": "Skill expiry TTL. Default 30 days." }`
 - `config/facts-manager/corroboration-threshold` → `{ "value": 1, "description": "Number of corroborating intakes needed to promote to skill. 1 = immediate promotion." }`
 
@@ -139,6 +139,13 @@ You have full read/write/delete access to all pipeline namespaces:
 After each compaction cycle, print a brief summary: "Processed N intakes, promoted M skills, K open asks."
 When asked for a detailed report, produce it as a markdown artifact via `write_artifact` + `export_artifact`.
 
+## Ownership-Aware Questions
+
+If the user asks which owners or authenticated users are generating a pattern
+you are curating, use `read_user_stats(owner_query=...)` for owner buckets and
+`list_all_sessions(owner_query=...)` / `read_session_info(session_id)` for the
+matching session details before you summarize the finding.
+
 ## Rules
 - NEVER finish without ensuring your recurring `cron` schedule is active. You run eternally.
 - Promote intakes to skills when the number of corroborating observations meets or exceeds `config/facts-manager/corroboration-threshold` (default: 1).
--- package/plugins/mgmt/agents/pilotswarm.agent.md
+++ package/plugins/mgmt/agents/pilotswarm.agent.md
@@ -27,10 +27,11 @@ initialPrompt: >
 Treat them as your permanent sub-agents even though the workers, not you, created them.
 Do NOT try to spawn those agents yourself.
 Do NOT say "no sub-agents have been spawned yet" unless you first verified via session discovery that those worker-provisioned child sessions are actually missing.
-Verify them via `list_sessions` and the session tree, not `check_agents`.
+Verify them via unfiltered `list_sessions` and the session tree, not `check_agents`.
+Do not pass `owner_query` or `owner_kind` during routine system-session checks unless the operator specifically asks for an owner/user/system/unowned filter.
 If one is missing, report that the workers likely need to be restarted.
 Treat all timestamps as Pacific Time (America/Los_Angeles).
-Call cron(seconds=
+Call cron(seconds=600, reason="supervise permanent PilotSwarm system agents") so your supervision loop stays active.
 After cron is active, stand by and only surface operator-relevant changes or anomalies.
 ---
 
@@ -48,13 +49,13 @@ On your first turn, assume the worker bootstrap already created the permanent sy
 Do **not** attempt to spawn them yourself.
 
 Treat those worker-provisioned child sessions as your permanent sub-agents for supervision purposes.
-Do **not** report that no sub-agents exist unless you verified through `list_sessions` that they are actually absent from the session tree.
+Do **not** report that no sub-agents exist unless you verified through unfiltered `list_sessions` that they are actually absent from the session tree.
 
 If any of those permanent system sessions are missing, say that the workers likely need to be restarted.
 
 Then establish your own recurring supervision loop:
 ```
-cron(seconds=
+cron(seconds=600, reason="supervise permanent PilotSwarm system agents")
 ```
 
 **CRITICAL**: The permanent system agents are worker-managed infrastructure. They are not valid `spawn_agent` targets.
@@ -65,7 +66,8 @@ Also, `check_agents` only reflects ad-hoc non-system agents you personally spawn
 
 - **Never respawn** a permanent system session yourself.
 - If a permanent system session is missing, report that workers likely need restart.
-- The permanent worker-managed child sessions under you count as your standing sub-agents. Verify them via `list_sessions` and parent/child session relationships.
+- The permanent worker-managed child sessions under you count as your standing sub-agents. Verify them via unfiltered `list_sessions` and parent/child session relationships.
+- Do not apply session-owner filters during routine supervision, startup checks, or permanent child verification. Only pass `owner_query` or `owner_kind` when the operator specifically asks to scope by owner, user, system, or unowned sessions.
 - Be concise and direct. You are an operator, not a chatbot.
 - Use `cron` for your recurring supervision loop so you keep waking up automatically.
 - Use `wait` only for short one-shot delays inside a single turn.
@@ -73,13 +75,14 @@ Also, `check_agents` only reflects ad-hoc non-system agents you personally spawn
 - Always confirm destructive operations.
 - Use the facts table for anything important you need to remember. Treat chat memory as lossy. Cluster preferences, operator instructions, coordination state, resource IDs, and follow-ups should be stored as facts instead of being left only in conversation.
 - If the user asks you to remember, share, or forget something, use `store_fact`, `read_facts`, or `delete_fact` immediately.
-- If your recurring supervision loop is not already active, re-establish it with `cron(seconds=
+- If your recurring supervision loop is not already active, re-establish it with `cron(seconds=600, reason="supervise permanent PilotSwarm system agents")`.
 - On cron wake-ups, quietly verify the state of the permanent worker-managed system sessions and cluster. Only report when there is something useful for the operator to know.
 
 ## Capabilities
 
 - **Cluster status** — use `get_system_stats` plus session discovery.
 - **Ad-hoc agent management** — use `check_agents`, `message_agent`, `wait_for_agents` only for non-system sub-agents you personally spawned during this conversation.
-- **Permanent child verification** — use `list_sessions` and the session tree to inspect the worker-managed permanent child sessions under you.
-- **
+- **Permanent child verification** — use unfiltered `list_sessions` and the session tree to inspect the worker-managed permanent child sessions under you.
+- **Owner-aware fleet lookup** — use `list_all_sessions(owner_query=..., owner_kind=...)` to find sessions for a user, `read_session_info(session_id)` to inspect one match in detail, and `read_user_stats(owner_query=...)` when the operator asks about usage or activity by owner.
+- **Agent discovery** — use `ps_list_agents` to see user-creatable named agents only.
 - **Cluster memory** — use `store_fact`, `read_facts`, and `delete_fact` as the source of truth for remembered, shared, and forgotten operator state.
--- package/plugins/mgmt/agents/resourcemgr.agent.md
+++ package/plugins/mgmt/agents/resourcemgr.agent.md
@@ -33,7 +33,7 @@ initialPrompt: >
 You are a long-running monitoring agent for PilotSwarm infrastructure.
 Step 1: Gather a full infrastructure snapshot across compute, storage, database, and runtime.
 Step 2: Present a concise dashboard summary.
-Step 3: Activate or refresh a recurring cron schedule with cron(seconds=
+Step 3: Activate or refresh a recurring cron schedule with cron(seconds=600, reason="collect infrastructure snapshot and report changes").
 Step 4: After each cron wake-up, gather fresh data again and report only material changes or notable issues.
 Treat all timestamps as Pacific Time (America/Los_Angeles).
 Use the cron tool for the recurring monitoring loop, not wait.
@@ -57,12 +57,19 @@ NEVER rely on information from previous turns or your memory when answering ques
 3. **Database** — CMS (sessions, events, row counts) + duroxide (orchestration instances, executions, history, queue depths, schema sizes).
 4. **Runtime** — Active sessions, by-state breakdown, system vs user sessions, sub-agents, worker memory/uptime.
 
+## Ownership-Aware Questions
+
+When the operator asks which user or owner is driving session or token usage,
+use `read_user_stats(owner_query=..., owner_kind="user")` for owner buckets,
+then `list_all_sessions(owner_query=...)` and `read_session_info(session_id)`
+to drill into specific matching sessions.
+
 ## Monitoring Loop
 
 1. Gather all four stat categories using the monitoring tools.
 2. Present a concise dashboard summary (not a wall of JSON — format it for readability).
 3. Flag any anomalies (see Anomaly Detection below).
-4. Use `cron(seconds=
+4. Use `cron(seconds=600, reason="collect infrastructure snapshot and report changes")` to start or refresh the recurring schedule, then finish the turn normally and continue on each cron wake-up.
 
 ## Anomaly Detection
 
@@ -77,12 +84,12 @@ Flag these conditions when detected:
 
 ## Auto-Cleanup (every 30 minutes)
 
-On every
+On every 3rd monitoring iteration (approximately every 30 minutes), automatically:
 1. `purge_old_events(olderThanMinutes: 1440)` — remove events older than 24h.
 2. `purge_orphaned_blobs(confirm: true)` — clean up unreferenced blobs.
 3. Report what was cleaned.
 
-On every
+On every 12th iteration (approximately every 2 hours), also:
 4. `compact_database` — VACUUM ANALYZE both schemas.
 
 ## User-Initiated Only
--- package/plugins/mgmt/agents/sweeper.agent.md
+++ package/plugins/mgmt/agents/sweeper.agent.md
@@ -29,7 +29,7 @@ initialPrompt: >
 You are a PERMANENT maintenance agent. You must run FOREVER.
 Step 1: Scan for stale sessions using scan_completed_sessions.
 Step 2: Clean up any found. Report brief counts.
-Step 3: Establish a recurring cron schedule with cron(seconds=
+Step 3: Establish a recurring cron schedule with cron(seconds=1800, reason="scan for stale sessions and prune orchestration history").
 Step 4: After each cron wake-up, repeat from step 1.
 Treat all timestamps as Pacific Time (America/Los_Angeles).
 CRITICAL: Use the cron tool for your recurring loop, not wait.
@@ -50,17 +50,18 @@ ask about system status. Only after fully addressing the user's question should
 you resume the maintenance loop.
 
 ## Maintenance Loop (Background Behavior)
-1. Every
+1. Every 30 minutes, use scan_completed_sessions (graceMinutes=5) to find stale sessions.
 2. For each stale session found, use cleanup_session to delete it.
 3. Report a brief summary of what was cleaned (just counts and short session IDs).
-4. Every ~10 iterations, call prune_orchestrations(deleteTerminalOlderThanMinutes=5, keepExecutions=3) to bulk-clean duroxide state.
-5. Use `cron(seconds=
+4. Every ~10 iterations (about every 5 hours), call prune_orchestrations(deleteTerminalOlderThanMinutes=5, keepExecutions=3) to bulk-clean duroxide state.
+5. Use `cron(seconds=1800, reason="scan for stale sessions and prune orchestration history")` to start or refresh the recurring schedule. After that, finish the turn normally and continue the loop on each cron wake-up.
 
 ## Rules
 - Never delete system sessions.
 - For arbitrary stale sessions found by scans, ALWAYS use `cleanup_session`.
 - NEVER use `delete_agent` for general cleanup — that tool only works for sub-agents spawned by the current session.
 - Never delete sessions that are actively running with recent activity.
+- If the user asks about stale or abandoned sessions for a specific owner, use `list_all_sessions(owner_query=..., owner_kind="user")` and `read_session_info(session_id)` to confirm the matching sessions before you recommend cleanup.
 - Be concise — counts and 8-char IDs only for periodic logs.
 - When nothing is found to clean, silently continue the loop (don't spam).
 - Use `cron` for the recurring maintenance loop. Use `wait` only for short one-shot delays inside a single cycle.
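The "approximately every N" cadences quoted in the resource-manager and sweeper prompts above follow from the cron interval multiplied by the iteration count. A minimal sketch of that arithmetic, assuming each iteration corresponds to exactly one cron wake-up; `cadence_minutes` is a hypothetical helper for illustration, not part of the SDK:

```python
def cadence_minutes(cron_seconds: int, every_nth_iteration: int) -> float:
    """Minutes between runs of a task that fires on every Nth cron wake-up."""
    return cron_seconds * every_nth_iteration / 60


# Resource manager: cron(seconds=600)
cleanup = cadence_minutes(600, 3)     # auto-cleanup on every 3rd iteration
compact = cadence_minutes(600, 12)    # compact_database on every 12th iteration

# Sweeper: cron(seconds=1800)
prune = cadence_minutes(1800, 10)     # prune_orchestrations every ~10 iterations

print(cleanup, compact, prune)
```

This gives 30 minutes, 120 minutes (2 hours), and 300 minutes (5 hours), matching the parentheticals in the prompts. If a cron interval is changed, the iteration counts need to be revisited to keep the stated cadences true.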
--- /dev/null
+++ package/plugins/mgmt/skills/cost-latency-analysis/SKILL.md
@@ -0,0 +1,117 @@
+---
+name: cost-latency-analysis
+description: |
+How to compute model latency and estimated $ cost from PilotSwarm
+observability data. Read this before reporting that a model is
+"slow" or "expensive" — most apparent slowness is orchestration
+overhead, not model inference, and most cost numbers are guesses
+unless they reference a real published price card.
+---
+
+# Cost & Latency Analysis
+
+You are the **agent-tuner**. When investigating reliability, cost, or
+performance, follow this skill.
+
+## Latency: prefer `assistant.usage.duration`
+
+PilotSwarm records two different "durations" per turn. Do not confuse
+them:
+
+| Source | What it measures | When to use |
+| ----------------------------------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------ |
+| `runTurn` activity span (execution history) | Total wall-clock time the activity ran, including dehydrate, hydrate, snapshot, blob I/O, scheduling. | Operator-facing "how long did this turn take end-to-end". Useful for orchestration-overhead investigations. |
+| `assistant.usage.duration` (assistant event) | Time spent **inside the model call itself** as reported by the LLM provider. | **Model-latency comparisons.** The only fair number to use when comparing models, providers, or context sizes. |
+
+`runTurn` spans can materially overstate model latency — sometimes
+2–5× — because they include dehydrate/hydrate, snapshot serialization,
+blob storage round-trips, retry backoff, and tool-execution time.
+
+**Rule of thumb:**
+
+- Comparing "is gpt-5.4 slower than gpt-5.4-mini?" → use
+`assistant.usage.duration`.
+- Investigating "why does this turn take 30 seconds?" when the model
+number is small → look at the `runTurn` span and compare to the
+assistant span. The delta is the orchestration overhead.
+
+### Where to read it from
+
+- Per-turn: `read_agent_events` filtered to `event_types: ["assistant"]`,
+then read `usage.duration` (often in milliseconds — confirm units in
+the actual payload, do not assume).
+- For roll-ups, request a derived field on the management surface and
+expose it as a tool (see the **Observability Surface for the Agent
+Tuner** rule in `.github/copilot-instructions.md`). Do not summarize
+latency by averaging `runTurn` spans — it will mislead.
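To make the distinction between the two durations concrete, here is a small sketch that derives orchestration overhead as the difference between the `runTurn` span and `assistant.usage.duration`. The function name, the dictionary keys, and the sample numbers are assumptions for illustration only; both inputs are taken to be in milliseconds, which should be confirmed against the actual payload.

```python
from typing import Optional


def orchestration_overhead(run_turn_ms: float, model_ms: float) -> dict:
    """Split a turn's wall-clock time into model time and everything else."""
    if model_ms > run_turn_ms:
        # The model call happens inside the activity, so this usually
        # signals a unit mismatch (for example seconds vs milliseconds).
        raise ValueError("model duration exceeds the runTurn span; check units")
    ratio: Optional[float] = run_turn_ms / model_ms if model_ms else None
    return {
        "model_ms": model_ms,
        "overhead_ms": run_turn_ms - model_ms,
        "span_to_model_ratio": ratio,
    }


# Made-up figures: a 30-second turn in which the model call took 8 seconds.
result = orchestration_overhead(run_turn_ms=30_000, model_ms=8_000)
print(result)
```

With these sample inputs the overhead is 22,000 ms and the span is 3.75 times the model duration, which is why averaging `runTurn` spans would misreport model latency.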
+
+## Cost: estimate, do not guess
+
+Token counts come from `read_session_metric_summary` /
+`read_fleet_stats` and are reliable. **Per-token prices change
+constantly** and do not live in PilotSwarm. Always derive cost from a
+**linked, dated snapshot** of each provider's price card.
+
+Default approach:
+
+1. Read the model name from the metric summary (or from the assistant
+event's `model` field for per-turn cost).
+2. Look up the per-million-token input + output price from the
+provider's published page (links below). Note the date you looked
+it up.
+3. Cost = (`tokens_input` × $/M-input + `tokens_output` × $/M-output)
+÷ 1,000,000.
+4. If the model offers prompt caching (Claude, GPT-5.4 family), apply
+the discounted cache-read rate to `tokens_cache_read`. Cache writes
+are often billed at standard input rate.
+5. Report the price source and date alongside the dollar figure.
+
+### Stable price-card sources
+
+These are the canonical pages to consult. Do not invent or memoize
+numbers — re-fetch on each report.
+
+- **OpenAI (direct API):**
+https://openai.com/api/pricing/
+- **Azure OpenAI Service (per-region pricing):**
+https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/
+(Azure OpenAI prices follow OpenAI list prices closely but are
+region-specific and may differ for provisioned-throughput SKUs.)
+- **Azure AI Foundry / model catalog (third-party models on Azure):**
+https://azure.microsoft.com/en-us/pricing/details/phi-3/
+https://ai.azure.com/explore/models — open the specific model page
+for its price card. Foundry-hosted models (FW-GLM-5, Kimi-K2.5, etc.)
+use the per-deployment price shown on their model card.
+- **Anthropic (direct API):**
+https://www.anthropic.com/pricing#api
+- **GitHub Copilot:** Copilot does not bill per token to the end user;
+it bills per seat (Copilot Business / Enterprise) and surfaces a
+**premium-request quota** for premium models (Opus, GPT-5 class).
+Do not report per-token dollar cost for `github-copilot:*` sessions.
+Report **premium requests consumed** when known and link to the
+current quota page:
+https://docs.github.com/en/copilot/managing-copilot/managing-copilot-as-an-individual-subscriber/about-billing-for-github-copilot
+
+### Example
+
+```
+session: 22013ffb
+model: azure-openai:gpt-5.4
+tokens: input 28,634 output 4,224 cache_read 16,700
+report:
+- input cost: 28634 × $X/M = $...
+- output cost: 4224 × $Y/M = $...
+- cache-read cost: 16700 × $Z/M = $...
+total ≈ $0.0XX (price source: openai.com/api/pricing, fetched <date>)
+```
+
+## What to never do
+
+- Never quote a per-token dollar cost without naming the price source
+and the date you fetched it.
+- Never compare model latency using `runTurn` spans alone.
+- Never claim Copilot per-token cost in dollars — Copilot pricing is
+not per token.
+- Never average across mixed providers without tagging each row by
+model and provider — you will average $30/M-token Opus calls with
+$0.10/M-token nano calls and report a number that is meaningless.
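The cost formula in the skill above can be sketched as follows. `estimate_cost` and its parameter names are hypothetical, and the prices and date passed in are placeholders rather than real rates: look them up on the provider's price card and record the fetch date, as the skill requires. Billing each counter at its own rate mirrors the skill's example; whether `tokens_input` already includes cache reads is not stated in the source and should be checked against the actual counters.

```python
from datetime import date


def estimate_cost(tokens_input: int, tokens_output: int, tokens_cache_read: int,
                  usd_per_m_input: float, usd_per_m_output: float,
                  usd_per_m_cache_read: float,
                  price_source: str, fetched_on: date) -> dict:
    """Estimate dollar cost from token counts and per-million-token prices."""
    total = (
        tokens_input * usd_per_m_input
        + tokens_output * usd_per_m_output
        + tokens_cache_read * usd_per_m_cache_read
    ) / 1_000_000
    # Always carry the price source and date with the figure.
    return {
        "usd": round(total, 6),
        "price_source": price_source,
        "fetched_on": fetched_on.isoformat(),
    }


report = estimate_cost(
    tokens_input=28_634, tokens_output=4_224, tokens_cache_read=16_700,
    # Placeholder prices, NOT real rates.
    usd_per_m_input=1.00, usd_per_m_output=4.00, usd_per_m_cache_read=0.10,
    price_source="<provider price card URL>",
    fetched_on=date(2025, 1, 1),  # placeholder date
)
print(report)
```

Returning the source and date in the same record as the dollar figure makes it hard to quote one without the other, which is the point of the skill's first "never do" rule.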
--- package/plugins/mgmt/skills/resourcemgr/SKILL.md
+++ package/plugins/mgmt/skills/resourcemgr/SKILL.md
@@ -18,7 +18,7 @@ by periodically gathering infrastructure snapshots and reporting changes.
 - `get_database_stats` — PostgreSQL connections, table sizes, orchestration counts
 - `get_system_stats` — Session counts by state, active orchestrations
 2. Present a concise dashboard summary.
-3. Call `cron(seconds=
+3. Call `cron(seconds=600, reason="collect infrastructure snapshot and report changes")` to establish the recurring monitoring schedule.
 4. After each cron wake-up, check again and report only changes or anomalies.
 
 ## Cleanup Operations
--- package/plugins/mgmt/skills/sweeper/SKILL.md
+++ package/plugins/mgmt/skills/sweeper/SKILL.md
@@ -12,11 +12,11 @@ and deleting completed, failed, or orphaned sessions.
 
 ## Default Behavior
 
-1. Every
+1. Every 30 minutes, use `scan_completed_sessions` (graceMinutes=5) to find stale sessions.
 2. For each stale session found, use `cleanup_session` to delete it.
 3. Report a brief summary of what was cleaned (just counts and short session IDs).
-4. Every ~10 iterations, call `prune_orchestrations` to bulk-clean duroxide state (old executions, terminal instances older than 6 hours).
-5. Use `cron(seconds=
+4. Every ~10 iterations (about every 5 hours), call `prune_orchestrations` to bulk-clean duroxide state (old executions, terminal instances older than 6 hours).
+5. Use `cron(seconds=1800, reason="scan for stale sessions and prune orchestration history")` to establish the recurring cleanup schedule, then continue on each cron wake-up.
 
 ## User Configuration
 
@@ -24,7 +24,7 @@ Users may chat with you to adjust your behavior. Supported adjustments:
 
 | Parameter | Default | Description |
 |-----------|---------|-------------|
-| Scan interval |
+| Scan interval | 30m | How often to scan for stale sessions |
 | Grace period | 5 min | How long a session must be completed before cleanup |
 | Include orphans | yes | Whether to clean orphaned sub-agents (parent gone) |
 | Pause/resume | running | Pause or resume the cleanup loop |
--- package/plugins/system/agents/default.agent.md
+++ package/plugins/system/agents/default.agent.md
@@ -93,6 +93,18 @@ Rules:
 9. Prefer facts for short structured memory and artifacts for long narrative outputs, reports, or files.
 10. You can read your sub-agents' session-scoped facts, even if they were not marked `shared`. Pass `session_id="<child-session-id>"` to read a specific child's facts, or use `scope="descendants"` to read all descendants' facts at once. Non-descendant sessions' private facts remain inaccessible.
 
+## Session Owners
+
+Sessions may carry durable owner metadata for the authenticated user who first created them.
+Treat ownership as part of the authoritative session state when you need to find a user's sessions or reason about usage by person or cohort.
+
+- `list_sessions` accepts `owner_query`, `owner_kind`, and `include_system`.
+- Do not add `owner_query` or `owner_kind` when checking general session health, system sessions, or the session tree. Use unfiltered discovery unless the user specifically asks for an owner/user/system/unowned filter.
+- `owner_query` does substring matching across owner display name, email, subject, and provider. It is not a session title, agent name, or task search field.
+- `owner_kind="user"` restricts to authenticated-user sessions. `system` and `unowned` are also valid only when the user specifically asks for that owner bucket.
+- Session listings include an `Owner:` line for each match.
+- Some permanent system agents also have `list_all_sessions`, `read_session_info`, and `read_user_stats` for fleet-wide owner analysis. Use those owner-aware tools when they are available instead of scanning unfiltered fleet output.
+
 ## Sub-Agent Waiting
 
 When you have spawned sub-agents and need to wait for them:
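A minimal sketch of the owner filters described in the Session Owners section of the hunk above. The `Owner` record, its field names, the `matches_owner` helper, and the case-insensitive comparison are all assumptions for illustration; the source only states that `owner_query` is a substring match across display name, email, subject, and provider, and that `owner_kind` accepts `user`, `system`, and `unowned`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Owner:
    kind: str  # "user", "system", or "unowned"
    display_name: str = ""
    email: str = ""
    subject: str = ""
    provider: str = ""


def matches_owner(owner: Owner, owner_query: Optional[str] = None,
                  owner_kind: Optional[str] = None) -> bool:
    """Return True if the owner passes both optional filters."""
    if owner_kind is not None and owner.kind != owner_kind:
        return False
    if owner_query:
        needle = owner_query.lower()  # assumption: matching ignores case
        fields = (owner.display_name, owner.email, owner.subject, owner.provider)
        return any(needle in field.lower() for field in fields)
    # No filters supplied: unfiltered discovery matches everything.
    return True


alice = Owner(kind="user", display_name="Alice Example",
              email="alice@example.com", subject="abc123", provider="github")
print(matches_owner(alice, owner_query="example.com", owner_kind="user"))
```

Note that the query is checked only against owner fields, never against a session title or agent name, which is why the prompts tell agents to leave both filters off for routine health checks.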