watchmyagents 1.0.3 → 1.1.1
- package/README.md +29 -8
- package/SECURITY.md +2 -0
- package/package.json +1 -1
- package/scripts/agents.js +3 -0
- package/scripts/fetch-anthropic.js +122 -8
- package/scripts/inspect.js +3 -0
- package/scripts/service.js +3 -0
- package/scripts/shield.js +38 -4
- package/scripts/signals.js +3 -0
- package/scripts/upload-fortress.js +8 -1
- package/src/labels.js +39 -0
- package/src/shield/policy-stream.js +227 -0
- package/src/shield/sources/fortress.js +11 -0
- package/src/sources/anthropic-managed.js +36 -0
- package/src/version.js +52 -0
package/README.md
CHANGED
|
@@ -1,12 +1,26 @@
|
|
|
1
1
|
# Watch My Agents
|
|
2
2
|
|
|
3
|
-
**
|
|
3
|
+
**Real-time security observability AND enforcement for AI agents.** A zero-dependency CLI + SDK that captures every action your AI agents take — tool calls, prompts, state transitions, errors, multi-agent comms — into local NDJSON logs **AND** enforces security policies live, with sub-second propagation from the Fortress control plane to the Shield runtime.
|
|
4
4
|
|
|
5
|
-
Designed around
|
|
5
|
+
Designed around four guarantees:
|
|
6
6
|
|
|
7
7
|
1. **Local-first.** Raw payloads (prompts, outputs, tool arguments) stay 100% on your machine. Nothing leaves unless you explicitly opt in.
|
|
8
|
-
2. **Trace everything, not just what costs tokens.** A `web_fetch` to a suspicious URL carries zero tokens but is exactly what a security audit needs to see.
|
|
9
|
-
3. **
|
|
8
|
+
2. **Trace everything, not just what costs tokens.** A `web_fetch` to a suspicious URL carries zero tokens but is exactly what a security audit needs to see. Even tool calls that were blocked, denied, or interrupted before producing a result are logged with `status: error` so the audit trail is complete.
|
|
9
|
+
3. **Real-time enforcement, not post-hoc auditing.** A policy accepted in Fortress UI is active in Shield within ~1 second via SSE + Postgres realtime. A policy violation is blocked in ~3ms via Anthropic's `user.tool_confirmation` / `user.interrupt` events. Measured in production, not promised in roadmap.
|
|
10
|
+
4. **Zero dependencies.** Only Node.js 18+ built-ins. No telemetry, no phone-home, no hidden network calls. Preserved through every release including the SSE realtime work (custom RFC-compliant SSE parser, no `@supabase/realtime-js` or `ws` dep).
|
|
11
|
+
|
|
12
|
+
### Measured end-to-end loop latency (v1.1.0+)
|
|
13
|
+
|
|
14
|
+
```
|
|
15
|
+
Anthropic agent action ────────► Watch capture : ≤ 60s (configurable via --interval)
|
|
16
|
+
Watch capture ────────► Fortress signal upload : ≤ 60s (same cycle)
|
|
17
|
+
Fortress signal ────────► Guardian analysis : ≤ 30s (event-triggered, debounced)
|
|
18
|
+
Guardian proposal ────────► Operator accepts in UI : (human)
|
|
19
|
+
Policy accepted ────────► Shield receives via SSE : ≤ 1s (sub-second push, validated)
|
|
20
|
+
Shield evaluates ────────► Decision (allow/deny) : ≤ 3ms (measured on Anthropic Managed)
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
Full audit-clean: 3 successful Codex audit passes (v1.0.1, v1.0.2, v1.0.3) closed 7 findings with zero regression. Containment invariant (raw payloads never leave the customer machine) is formalized in `docs/CONTAINMENT.md` and locked by 8 regression tests.
|
|
10
24
|
|
|
11
25
|
---
|
|
12
26
|
|
|
@@ -107,7 +121,7 @@ Each entry carries: `id`, `agent_id`, `framework`, `timestamp`, `action_type`, `
|
|
|
107
121
|
```bash
|
|
108
122
|
wma-fetch (--agent-id <agent_id> | --all-agents) [--session-id <sess_id>] [--since 1h]
|
|
109
123
|
[--log-dir ./watchmyagents-logs] [--dump-raw]
|
|
110
|
-
[--watch [--interval
|
|
124
|
+
[--watch [--interval 1m] [--upload]]
|
|
111
125
|
```
|
|
112
126
|
|
|
113
127
|
| Flag | Effect |
|
|
@@ -119,11 +133,12 @@ wma-fetch (--agent-id <agent_id> | --all-agents) [--session-id <sess_id>] [--sin
|
|
|
119
133
|
| `--log-dir ./logs` | Where to write NDJSON (default `./watchmyagents-logs`) |
|
|
120
134
|
| `--dump-raw` | Also save raw API events alongside (forensic / debugging) |
|
|
121
135
|
| `--watch` | **Continuous daemon** — loop forever, incrementally capturing NEW events (deduped by stable event id) until `Ctrl+C` |
|
|
122
|
-
| `--interval
|
|
136
|
+
| `--interval 1m` | Poll interval in watch mode (default `1m` since v1.1.0; was `5m` in v1.0.x; accepts `30s`/`1h`/…). At each tick Watch re-discovers the fleet AND polls for new events on tracked sessions. |
|
|
123
137
|
| `--upload` | In watch mode, anonymize each new window and ship signals to Fortress (needs `WMA_API_KEY` + `WMA_FORTRESS_BASE_URL` + `WMA_SIGNALS_SALT`). Raw stays local. |
|
|
124
138
|
| `--discovery-since 7d` | Window for discovering NEW sessions (default `7d`). Sessions already being tracked are re-fetched regardless of age, so long-running ones never drop out. |
|
|
125
139
|
| `--no-send-agent-names` | Opt-out: send only the agent id as the Fortress `display_name`. **By default, the human agent name** (sanitized) is sent so dashboards/decisions stay legible. Pass this flag if your agent names themselves carry client/project info you'd rather keep pseudonymized. |
|
|
126
140
|
| `--api-key sk-ant-…` | Override the `ANTHROPIC_API_KEY` env var. **Discouraged** — visible in shell history & process list. Prefer the env var. |
|
|
141
|
+
| `--discover-now` | **One-shot fast-register mode** (v1.1.0+). Lists every agent your Anthropic key can see and pushes a discovery signal to Fortress so they appear in the dashboard immediately — no waiting for the next Watch cycle, no need to trigger activity first. Requires the same env (`WMA_API_KEY`, `WMA_FORTRESS_BASE_URL`, `WMA_SIGNALS_SALT`) as `--upload`. Exits when done. Typical use: after creating a new agent in the Anthropic console, run `wma-fetch --discover-now` and it shows up in Fortress in ~2 seconds. |
|
|
127
142
|
|
|
128
143
|
Logs land in `./watchmyagents-logs/<agent_id>/<date>.ndjson` (file mode `0600`, dir `0700`).
|
|
129
144
|
|
|
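The `--interval` row in the flag table above says the option accepts short duration strings such as `30s`, `1m`, or `1h`. This diff calls `parseDurationMs` but does not show its body, so the following is only a minimal sketch of how such strings could be converted to milliseconds. The name `parseDurationSketch`, the accepted units, and the fallback behavior are assumptions for illustration, not the package's implementation.

```javascript
// Hypothetical helper (not the package's parseDurationMs).
// Converts "30s" / "1m" / "1h" / "7d" to milliseconds.
const UNIT_MS = { s: 1000, m: 60_000, h: 3_600_000, d: 86_400_000 };

function parseDurationSketch(text, fallbackMs) {
  // Missing or non-string input: use the caller's default.
  if (typeof text !== 'string') return fallbackMs;
  const match = /^(\d+)([smhd])$/.exec(text.trim());
  // Unrecognized format: use the default rather than guessing.
  if (!match) return fallbackMs;
  return Number(match[1]) * UNIT_MS[match[2]];
}
```

Under these assumptions, `parseDurationSketch(undefined, 60_000)` yields the 60-second default that the watch mode uses since v1.1.0, and `parseDurationSketch('5m', 60_000)` yields the legacy 5-minute cadence.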
@@ -198,7 +213,7 @@ export WMA_API_KEY="wma_..."
|
|
|
198
213
|
export WMA_FORTRESS_BASE_URL="https://<project>.supabase.co/functions/v1"
|
|
199
214
|
export WMA_SIGNALS_SALT="..." # stable per-customer salt
|
|
200
215
|
|
|
201
|
-
wma-service install (--agent-id agent_01ABC... | --all-agents) [--interval
|
|
216
|
+
wma-service install (--agent-id agent_01ABC... | --all-agents) [--interval 1m] [--with-shield]
|
|
202
217
|
wma-service status
|
|
203
218
|
wma-service uninstall [--with-shield]
|
|
204
219
|
```
|
|
@@ -217,7 +232,7 @@ After this, the full Watch→Guardian→Shield loop runs hands-off.
|
|
|
217
232
|
If you'd rather run the loop in a terminal you control (the service wraps this):
|
|
218
233
|
|
|
219
234
|
```bash
|
|
220
|
-
wma-fetch --agent-id agent_01ABC... --watch --upload --interval
|
|
235
|
+
wma-fetch --agent-id agent_01ABC... --watch --upload --interval 1m
|
|
221
236
|
```
|
|
222
237
|
|
|
223
238
|
It loops until `Ctrl+C`, dedupes by the stable Anthropic event id (no duplicate
|
|
@@ -286,6 +301,12 @@ wma-shield --agent-id agent_xxx --policies-source fortress
|
|
|
286
301
|
|
|
287
302
|
In Fortress mode, Shield also POSTs each enforcement decision back to Fortress (`/functions/v1/ingest-decisions`), so the dashboard's live timeline + Loop Visualizer light up in real time.
|
|
288
303
|
|
|
304
|
+
### Realtime policy propagation (v1.1.0+)
|
|
305
|
+
|
|
306
|
+
When you accept a Guardian suggestion or deploy a manual rule in the Fortress dashboard, Shield is notified within ~100ms via a persistent Server-Sent Events (SSE) connection to `/functions/v1/policies-stream` and refreshes its ruleset immediately. Shield falls back gracefully to its 60s polling cadence if the SSE endpoint isn't deployed yet on your Fortress instance (HTTP 404), so the SDK ships safely either way.
|
|
307
|
+
|
|
308
|
+
Why SSE (not WebSocket): zero runtime dependencies preserved (HTTPS = Node built-in), firewall-friendly (many enterprise proxies block raw WS but pass `text/event-stream` cleanly), and the protocol is one-way push-only — exactly what we need.
|
|
309
|
+
|
|
289
310
|
### Enforcement mode auto-detection
|
|
290
311
|
|
|
291
312
|
Shield auto-detects the best mode at startup:
|
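The README hunk above states that Shield receives policy updates over Server-Sent Events using a custom parser with no extra dependencies. The parser itself lives in `package/src/shield/policy-stream.js`, whose content is not shown in this section, so the block below is a hedged sketch of what a minimal SSE frame parser can look like. The name `createSseParser` and the callback shape are assumptions; the sketch handles `\n` and `\r\n` line endings only and ignores the `id` and `retry` fields.

```javascript
// Hypothetical sketch of an incremental SSE parser; not the package's code.
// feed() accepts text chunks in arbitrary sizes and calls onEvent per event.
function createSseParser(onEvent) {
  let buffer = '';
  let eventName = 'message'; // default event type when no "event:" field is sent
  let dataLines = [];

  function handleLine(line) {
    if (line === '') {
      // A blank line ends the event; dispatch only if it carried data.
      if (dataLines.length > 0) {
        onEvent({ event: eventName, data: dataLines.join('\n') });
      }
      eventName = 'message';
      dataLines = [];
      return;
    }
    if (line.startsWith(':')) return; // comment, often used as keep-alive
    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
    if (field === 'event') eventName = value;
    else if (field === 'data') dataLines.push(value);
    // Other fields are ignored in this sketch.
  }

  return function feed(chunk) {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      let line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      if (line.endsWith('\r')) line = line.slice(0, -1);
      handleLine(line);
    }
  };
}
```

A caller would pass each decoded chunk from the HTTPS response body to `feed` and react to events named `policy_changed`, which is the event name the Shield hunk later in this diff subscribes to.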
package/SECURITY.md
CHANGED
|
@@ -57,6 +57,8 @@ WMA combines **two complementary layers**:
|
|
|
57
57
|
- **Blind spots in agent behavior.** Watch captures tool calls, prompts, state transitions, and errors for after-the-fact analysis.
|
|
58
58
|
- **Token-only observability tools.** WMA captures every action including zero-token ones (`tool_use`, `state_transition`, etc.) that are the most security-relevant.
|
|
59
59
|
- **Inline policy violations** (Shield). When the agent has `permission_policy: always_ask` configured, Shield blocks tool calls before execution. When not, Shield interrupts the session on first violation (the offending tool already ran, but the agent loop stops).
|
|
60
|
+
- **Stale enforcement after a policy update.** A new policy accepted in the Fortress dashboard is active in Shield within ~1 second via SSE + Postgres realtime (validated in production on v1.1.0). The 60s polling refresh is a fallback for environments where the SSE channel can't be established (firewall, proxy stripping `text/event-stream`).
|
|
61
|
+
- **Lost audit trail for blocked / denied / interrupted tool calls.** Tool calls that started but never produced a result (Shield pre-block, operator denial, mid-execution kill, session termination) are logged as explicit `tool_use` entries with `status: error` and `error: "no_result_observed"` — they cannot disappear silently from the audit. (Fix shipped in v1.1.1 after the Codex P1 finding.)
|
|
60
62
|
- **Vendor lock-in.** NDJSON is portable; you own the data.
|
|
61
63
|
|
|
62
64
|
### What WMA does NOT defend against
|
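The SECURITY.md hunk above adds a guarantee that tool calls which never produced a result are still logged, with `status: error` and `error: "no_result_observed"`. The code that implements this is not part of this section. As a rough illustration only, the reconciliation could look like the sketch below. The input event shape (`type`, `id`, `name`, `tool_use_id`), the `tool` output field, and the function name `reconcileToolCalls` are assumptions, and the sketch treats any observed result as success.

```javascript
// Hypothetical sketch; event and entry shapes are assumed for illustration.
// Input events: { type: 'tool_use', id, name } or { type: 'tool_result', tool_use_id }.
function reconcileToolCalls(events) {
  // First pass: collect the ids of tool calls that did get a result.
  const resolved = new Set();
  for (const ev of events) {
    if (ev.type === 'tool_result') resolved.add(ev.tool_use_id);
  }
  // Second pass: emit one log entry per tool call, flagging the unmatched ones.
  const entries = [];
  for (const ev of events) {
    if (ev.type !== 'tool_use') continue;
    const hasResult = resolved.has(ev.id);
    entries.push({
      action_type: 'tool_use',
      tool: ev.name,
      status: hasResult ? 'ok' : 'error',
      error: hasResult ? null : 'no_result_observed',
    });
  }
  return entries;
}
```

The point of the two-pass shape is that a blocked or interrupted call still yields an explicit entry instead of being dropped because its result never arrived.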
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "watchmyagents",
|
|
3
|
-
"version": "1.
|
|
3
|
+
"version": "1.1.1",
|
|
4
4
|
"description": "Security observability + real-time policy enforcement for AI agents. Local-first NDJSON capture with a continuous Watch daemon that auto-uploads anonymized signals, Shield CLI that blocks policy violations live (with policies pulled from Fortress cloud), anonymizer producing signals-only payloads, bidirectional sync with WatchMyAgents Fortress, and one-command install as an always-on launchd/systemd service — closing the recursive Watch→Guardian→Shield security loop.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"files": [
|
package/scripts/agents.js
CHANGED
|
@@ -23,6 +23,7 @@ import { listAgents } from '../src/sources/anthropic-managed.js';
|
|
|
23
23
|
import { classifyAgentType } from '../src/typology.js';
|
|
24
24
|
import { aggregate, buildFeatures, NON_DERIVABLE } from '../src/typology-features.js';
|
|
25
25
|
import { isValidAgentId, assertSafePathSegment } from '../src/validate.js';
|
|
26
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
26
27
|
|
|
27
28
|
function parseArgs(argv) {
|
|
28
29
|
const out = { _: [] };
|
|
@@ -43,6 +44,8 @@ function info(msg) { process.stdout.write(`[wma-agents] ${msg}\n`); }
|
|
|
43
44
|
// extraction). The rest of this file is just CLI presentation.
|
|
44
45
|
|
|
45
46
|
async function main() {
|
|
47
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
48
|
+
maybePrintVersionAndExit(process.argv);
|
|
46
49
|
const args = parseArgs(process.argv.slice(2));
|
|
47
50
|
if (args._[0] && args._[0] !== 'list') die(`unknown command "${args._[0]}" (only "list" supported)`);
|
|
48
51
|
const apiKey = args['api-key'] || process.env.ANTHROPIC_API_KEY;
|
package/scripts/fetch-anthropic.js
CHANGED
|
@@ -29,6 +29,7 @@ import { Logger } from '../src/logger.js';
|
|
|
29
29
|
import { TokenTracker } from '../src/tokens.js';
|
|
30
30
|
import { SignalsAggregator } from '../src/anonymizer.js';
|
|
31
31
|
import { resolveFortressBase, fortressEndpoint } from '../src/fortress/url.js';
|
|
32
|
+
import { cleanLabel } from '../src/labels.js';
|
|
32
33
|
import { isValidAgentId, isValidSessionId, assertSafePathSegment } from '../src/validate.js';
|
|
33
34
|
import { classifyAgentType } from '../src/typology.js';
|
|
34
35
|
import { aggregate, buildFeatures } from '../src/typology-features.js';
|
|
@@ -36,6 +37,7 @@ import {
|
|
|
36
37
|
getAgent, listAgents, listSessions, fetchSessionEntries, fetchRawEvents,
|
|
37
38
|
AnthropicManagedSource, effectiveEnforcementMode,
|
|
38
39
|
} from '../src/sources/anthropic-managed.js';
|
|
40
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
39
41
|
|
|
40
42
|
function parseArgs(argv) {
|
|
41
43
|
const out = {};
|
|
@@ -73,9 +75,9 @@ function parseSince(s) {
|
|
|
73
75
|
function die(msg, code = 1) { process.stderr.write(`${msg}\n`); process.exit(code); }
|
|
74
76
|
function info(msg) { process.stdout.write(`[wma-fetch] ${msg}\n`); }
|
|
75
77
|
function warn(msg) { process.stderr.write(`[wma-fetch] ⚠️ ${msg}\n`); }
|
|
76
|
-
//
|
|
77
|
-
//
|
|
78
|
-
|
|
78
|
+
// v1.1.1 F-11: cleanLabel moved to src/labels.js so wma-upload-fortress
|
|
79
|
+
// (and any future consumer) shares the exact same sanitization. Defense
|
|
80
|
+
// in depth vs log/payload injection from customer-set agent names.
|
|
79
81
|
|
|
80
82
|
function resolveModel(agent) {
|
|
81
83
|
const raw = agent.model || agent.config?.model || null;
|
|
@@ -159,6 +161,80 @@ async function uploadSignals(uploadCtx, agentId, displayName, entries, classific
|
|
|
159
161
|
return resp;
|
|
160
162
|
}
|
|
161
163
|
|
|
164
|
+
// v1.1.0 L2 — minimal one-shot registration signal sent to Fortress so
|
|
165
|
+
// a freshly-created Anthropic agent appears in the dashboard immediately,
|
|
166
|
+
// without waiting for the next Watch cycle AND without waiting for actual
|
|
167
|
+
// activity. The signal carries an empty SignalsAggregator payload + a
|
|
168
|
+
// degenerate window (window_start == window_end == now) so Fortress's
|
|
169
|
+
// ingest-signals upserts the agent row but contributes zero metrics.
|
|
170
|
+
// Used by --discover-now CLI mode.
|
|
171
|
+
async function uploadDiscoverySignal(uploadCtx, agentId, displayName, enforcementMode) {
|
|
172
|
+
const now = new Date().toISOString();
|
|
173
|
+
const body = JSON.stringify({
|
|
174
|
+
provider: AnthropicManagedSource.providerName,
|
|
175
|
+
native_agent_id: agentId,
|
|
176
|
+
anthropic_agent_id: agentId,
|
|
177
|
+
parent_agent_id: null,
|
|
178
|
+
composition_pattern: 'solo',
|
|
179
|
+
enforcement_mode: enforcementMode || AnthropicManagedSource.enforcementMode,
|
|
180
|
+
display_name: displayName,
|
|
181
|
+
window_start: now,
|
|
182
|
+
window_end: now,
|
|
183
|
+
payload: {
|
|
184
|
+
counts: {},
|
|
185
|
+
tool_counts: {},
|
|
186
|
+
latencies_p50_ms: {},
|
|
187
|
+
latencies_p95_ms: {},
|
|
188
|
+
error_rate_by_tool: {},
|
|
189
|
+
ioc_hashes: [],
|
|
190
|
+
sequences_top10: [],
|
|
191
|
+
stop_reasons: {},
|
|
192
|
+
tokens_total: 0,
|
|
193
|
+
session_ids: [],
|
|
194
|
+
},
|
|
195
|
+
});
|
|
196
|
+
const { status, body: resp } = await postJson(
|
|
197
|
+
uploadCtx.url, { authorization: `Bearer ${uploadCtx.apiKey}` }, body,
|
|
198
|
+
);
|
|
199
|
+
if (status < 200 || status >= 300) {
|
|
200
|
+
throw new Error(`ingest-signals HTTP ${status}: ${typeof resp === 'string' ? resp.slice(0, 200) : JSON.stringify(resp)}`);
|
|
201
|
+
}
|
|
202
|
+
return resp;
|
|
203
|
+
}
|
|
204
|
+
|
|
205
|
+
// One-shot "discover and register" mode: list every agent the customer's
|
|
206
|
+
// Anthropic key can see, derive each effective enforcement mode, and push
|
|
207
|
+
// a discovery signal to Fortress so the agent appears in the dashboard
|
|
208
|
+
// immediately. Exits when done — no watch loop, no event polling.
|
|
209
|
+
async function runDiscoverNow({ apiKey, uploadCtx, sendNames }) {
|
|
210
|
+
info('discover-now: listing agents from Anthropic…');
|
|
211
|
+
let agents;
|
|
212
|
+
try { agents = await listAgents(apiKey); }
|
|
213
|
+
catch (e) { die(`failed to list agents: ${e.message}`); }
|
|
214
|
+
info(`discover-now: ${agents.length} agent(s) found`);
|
|
215
|
+
|
|
216
|
+
let registered = 0;
|
|
217
|
+
let skipped = 0;
|
|
218
|
+
let failed = 0;
|
|
219
|
+
for (const a of agents) {
|
|
220
|
+
if (!a.id || !isValidAgentId(a.id)) { skipped++; continue; }
|
|
221
|
+
const displayName = sendNames ? cleanLabel(a.name) || a.id : a.id;
|
|
222
|
+
// Resolve effective enforcement mode best-effort; fall back to provider max.
|
|
223
|
+
let mode;
|
|
224
|
+
try { mode = await effectiveEnforcementMode(apiKey, a.id); }
|
|
225
|
+
catch (e) { warn(` enforcement_mode resolution failed for ${a.id}: ${e.message} (using provider max)`); }
|
|
226
|
+
try {
|
|
227
|
+
const resp = await uploadDiscoverySignal(uploadCtx, a.id, displayName, mode);
|
|
228
|
+
registered++;
|
|
229
|
+
info(` ✓ ${a.id} (${displayName})${resp?.registered_new_agent ? ' 🆕' : ''}`);
|
|
230
|
+
} catch (e) {
|
|
231
|
+
failed++;
|
|
232
|
+
warn(` ✗ ${a.id}: ${e.message}`);
|
|
233
|
+
}
|
|
234
|
+
}
|
|
235
|
+
info(`discover-now: done — ${registered} registered, ${skipped} skipped, ${failed} failed`);
|
|
236
|
+
}
|
|
237
|
+
|
|
162
238
|
// Preload already-written entry ids so a restarted daemon doesn't re-append
|
|
163
239
|
// events captured in a previous run (dedup by the stable Anthropic event id).
|
|
164
240
|
async function preloadSeenIds(logDir, agentId) {
|
|
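The hunk above picks the Fortress display name with `sendNames ? cleanLabel(a.name) || a.id : a.id`, and an earlier comment describes `cleanLabel` as defense in depth against log and payload injection from customer-set agent names. The body of `cleanLabel` (in `package/src/labels.js`) is not shown in this section. The sketch below shows one plausible sanitization, assuming control characters are replaced, whitespace is collapsed, and the result is capped at 64 characters. `sanitizeLabelSketch`, `pickDisplayName`, and the length cap are hypothetical.

```javascript
// Hypothetical sanitizer; the real cleanLabel in src/labels.js may differ.
function sanitizeLabelSketch(value, maxLen = 64) {
  if (typeof value !== 'string') return '';
  return value
    .replace(/[\u0000-\u001f\u007f]/g, ' ') // control characters, including newlines
    .replace(/\s+/g, ' ')                   // collapse runs of whitespace
    .trim()
    .slice(0, maxLen);
}

// Mirrors the selection logic in the hunk: prefer the sanitized name,
// fall back to the agent id when the name is empty or names are not sent.
function pickDisplayName(agent, sendNames) {
  return sendNames ? sanitizeLabelSketch(agent.name) || agent.id : agent.id;
}
```

With `--no-send-agent-names`, `sendNames` is false and only the agent id is used, which matches the flag description in the README hunk.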
@@ -187,7 +263,7 @@ const sleep = (ms, signal) => new Promise((res) => {
|
|
|
187
263
|
});
|
|
188
264
|
|
|
189
265
|
// ── ONE-SHOT ──────────────────────────────────────────────────────────────
|
|
190
|
-
async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId, dumpRaw }) {
|
|
266
|
+
async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId, dumpRaw, forceDuplicates = false }) {
|
|
191
267
|
let sessions;
|
|
192
268
|
if (sessionId) {
|
|
193
269
|
sessions = [{ id: sessionId, created_at: new Date().toISOString() }];
|
|
@@ -198,7 +274,18 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
|
|
|
198
274
|
if (sessions.length === 0) { info('no sessions to fetch'); return; }
|
|
199
275
|
info(`${sessions.length} session(s) to fetch`);
|
|
200
276
|
|
|
277
|
+
// v1.1.1 F-10 (P2 Codex audit): preload the entry ids already on disk for
|
|
278
|
+
// this agent so re-running the one-shot doesn't duplicate events. The
|
|
279
|
+
// watch daemon does this already; the one-shot was the missing piece.
|
|
280
|
+
// Operators who explicitly want the legacy duplicate-on-rerun behavior
|
|
281
|
+
// can opt back in with --force-duplicates.
|
|
282
|
+
const seenIds = forceDuplicates ? new Set() : await preloadSeenIds(logDir, agentId);
|
|
283
|
+
if (!forceDuplicates && seenIds.size > 0) {
|
|
284
|
+
info(`preloaded ${seenIds.size} known event id(s) for dedup`);
|
|
285
|
+
}
|
|
286
|
+
|
|
201
287
|
let totalEntries = 0;
|
|
288
|
+
let totalSkipped = 0;
|
|
202
289
|
for (const s of sessions) {
|
|
203
290
|
const sid = s.id;
|
|
204
291
|
process.stdout.write(`\n[wma-fetch] session ${sid}\n`);
|
|
@@ -214,23 +301,27 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
|
|
|
214
301
|
const logger = new Logger({ logDir, agentId, sessionId: sid, silent: true });
|
|
215
302
|
const tracker = new TokenTracker();
|
|
216
303
|
let count = 0;
|
|
304
|
+
let skipped = 0;
|
|
217
305
|
for await (const entry of fetchSessionEntries({ apiKey, agentId, sessionId: sid, model })) {
|
|
306
|
+
if (entry.id && seenIds.has(entry.id)) { skipped++; continue; }
|
|
218
307
|
const written = await logger.write(entry);
|
|
308
|
+
if (entry.id) seenIds.add(entry.id);
|
|
219
309
|
tracker.record(written);
|
|
220
310
|
count++;
|
|
221
311
|
}
|
|
312
|
+
totalSkipped += skipped;
|
|
222
313
|
const stats = tracker.stats().total;
|
|
223
314
|
await logger.write({
|
|
224
315
|
action_type: 'session_end', provider: 'anthropic-managed', status: 'ok', model,
|
|
225
316
|
session_tokens: { input: stats.input, output: stats.output, cache_read: stats.cache_read, cache_creation: stats.cache_creation, total: stats.sum },
|
|
226
317
|
session_cost_usd: stats.cost_usd || null,
|
|
227
318
|
});
|
|
228
|
-
process.stdout.write(` entries : ${count} (+1 session_end)\n`);
|
|
319
|
+
process.stdout.write(` entries : ${count} (+1 session_end)${skipped ? ` · ${skipped} skipped (dedup)` : ''}\n`);
|
|
229
320
|
process.stdout.write(` tokens : in=${stats.input} out=${stats.output} cache_r=${stats.cache_read} cache_w=${stats.cache_creation}\n`);
|
|
230
321
|
process.stdout.write(` written to : ${logger._pathForToday()}\n`);
|
|
231
322
|
totalEntries += count + 1;
|
|
232
323
|
}
|
|
233
|
-
process.stdout.write(`\n[wma-fetch] done — ${totalEntries} total entries across ${sessions.length} session(s)\n`);
|
|
324
|
+
process.stdout.write(`\n[wma-fetch] done — ${totalEntries} total entries across ${sessions.length} session(s)${totalSkipped ? `, ${totalSkipped} skipped (dedup)` : ''}\n`);
|
|
234
325
|
process.stdout.write(`[wma-fetch] inspect with: npx wma-inspect ${logDir}\n`);
|
|
235
326
|
}
|
|
236
327
|
|
|
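The `fetchOneShot` changes above add id-based deduplication: ids already on disk are preloaded into a set, entries whose id is in the set are skipped, and newly written ids are added so that repeats within the same run are skipped too. The standalone sketch below isolates that pattern so it can be run without the logger or the Anthropic API. The function name `dedupeEntries` and the return shape are assumptions for illustration.

```javascript
// Standalone illustration of the dedup pattern; not the package's function.
// seenIds is a Set of entry ids that were already written in earlier runs.
function dedupeEntries(entries, seenIds) {
  const written = [];
  let skipped = 0;
  for (const entry of entries) {
    // Entries with a known id are skipped; entries without an id always pass.
    if (entry.id && seenIds.has(entry.id)) { skipped++; continue; }
    written.push(entry);
    // Remember the id so a repeat later in the same run is skipped as well.
    if (entry.id) seenIds.add(entry.id);
  }
  return { written, skipped };
}
```

Passing an empty `Set` corresponds to the `--force-duplicates` path in the hunk, where nothing is preloaded. Repeats within the same run are still skipped on that path, because ids are added to the set as entries are written.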
@@ -367,6 +458,8 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
|
|
|
367
458
|
}
|
|
368
459
|
|
|
369
460
|
async function main() {
|
|
461
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
462
|
+
maybePrintVersionAndExit(process.argv);
|
|
370
463
|
const args = parseArgs(process.argv.slice(2));
|
|
371
464
|
const apiKey = args['api-key'] || process.env.ANTHROPIC_API_KEY;
|
|
372
465
|
const agentId = args['agent-id'];
|
|
@@ -374,8 +467,23 @@ async function main() {
|
|
|
374
467
|
const watch = !!args.watch;
|
|
375
468
|
const upload = !!args.upload;
|
|
376
469
|
const allAgents = !!args['all-agents'];
|
|
470
|
+
const discoverNow = !!args['discover-now'];
|
|
377
471
|
|
|
378
472
|
if (!apiKey) die('error: --api-key or ANTHROPIC_API_KEY required');
|
|
473
|
+
// --discover-now is its own mode: list+register every agent immediately, exit.
|
|
474
|
+
// It requires the same Fortress credentials as --upload (it IS a one-shot upload).
|
|
475
|
+
if (discoverNow) {
|
|
476
|
+
const wmaKey = process.env.WMA_API_KEY;
|
|
477
|
+
const salt = process.env.WMA_SIGNALS_SALT;
|
|
478
|
+
const base = resolveFortressBase({});
|
|
479
|
+
if (!wmaKey) die('error: --discover-now needs WMA_API_KEY env (from Fortress dashboard → Settings → API Keys)');
|
|
480
|
+
if (!base) die('error: --discover-now needs WMA_FORTRESS_BASE_URL env');
|
|
481
|
+
if (!salt) die('error: --discover-now needs WMA_SIGNALS_SALT env');
|
|
482
|
+
if (salt.length < 16) die('error: WMA_SIGNALS_SALT too short (need ≥16 hex chars)');
|
|
483
|
+
const uploadCtx = { apiKey: wmaKey, salt, url: fortressEndpoint(base, 'ingest-signals') };
|
|
484
|
+
const sendNames = args['no-send-agent-names'] !== true;
|
|
485
|
+
return runDiscoverNow({ apiKey, uploadCtx, sendNames });
|
|
486
|
+
}
|
|
379
487
|
if (!allAgents && !agentId) die('error: --agent-id required (or --all-agents for fleet mode)');
|
|
380
488
|
if (allAgents && !watch) die('error: --all-agents requires --watch (fleet daemon). For a one-shot, target a single --agent-id.');
|
|
381
489
|
if (agentId && !isValidAgentId(agentId)) {
|
|
@@ -404,7 +512,13 @@ async function main() {
|
|
|
404
512
|
}
|
|
405
513
|
|
|
406
514
|
if (watch) {
|
|
407
|
-
|
|
515
|
+
// v1.1.0 Phase 1 L1: default Watch cycle = 60s (was 300s/5min). At this
|
|
516
|
+
// cadence both event polling AND fleet re-discovery happen every minute,
|
|
517
|
+
// bringing the agent-to-Fortress visibility from 5min worst-case down to
|
|
518
|
+
// ~60s. ~1440 list/get calls/day against Anthropic — well inside free
|
|
519
|
+
// tier limits, no behavioral risk. Operators who want the legacy 5min
|
|
520
|
+
// cadence can still pass --interval 5m explicitly.
|
|
521
|
+
const intervalMs = parseDurationMs(args.interval, 60_000);
|
|
408
522
|
// Discovery window for NEW sessions (default 7d, configurable). Sessions we
|
|
409
523
|
// already track are re-fetched regardless of age, so long-lived ones don't drop.
|
|
410
524
|
const windowMs = parseDurationMs(args['discovery-since'], 7 * 24 * 3600_000);
|
|
@@ -450,7 +564,7 @@ async function main() {
|
|
|
450
564
|
info(`resolving agent ${agentId}…`);
|
|
451
565
|
const agent = await getAgent(apiKey, agentId).catch((e) => die(`failed to GET agent: ${e.message}`));
|
|
452
566
|
const since = args.since ? parseSince(args.since) : null;
|
|
453
|
-
await fetchOneShot({ apiKey, agentId, model: resolveModel(agent), logDir, since, sessionId: args['session-id'], dumpRaw: !!args['dump-raw'] });
|
|
567
|
+
await fetchOneShot({ apiKey, agentId, model: resolveModel(agent), logDir, since, sessionId: args['session-id'], dumpRaw: !!args['dump-raw'], forceDuplicates: !!args['force-duplicates'] });
|
|
454
568
|
}
|
|
455
569
|
}
|
|
456
570
|
|
package/scripts/inspect.js
CHANGED
|
@@ -13,6 +13,7 @@ import { createReadStream } from 'node:fs';
|
|
|
13
13
|
import { createInterface } from 'node:readline';
|
|
14
14
|
import { join, resolve } from 'node:path';
|
|
15
15
|
import { TokenTracker } from '../src/tokens.js';
|
|
16
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
16
17
|
|
|
17
18
|
// Streaming line-by-line reader — bounds memory usage on large NDJSON files
|
|
18
19
|
// (a long-running agent can produce hundreds of MB per day).
|
|
@@ -66,6 +67,8 @@ function extractDestination(input) {
|
|
|
66
67
|
}
|
|
67
68
|
|
|
68
69
|
async function main() {
|
|
70
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
71
|
+
maybePrintVersionAndExit(process.argv);
|
|
69
72
|
const files = await collectFiles(target);
|
|
70
73
|
if (files.length === 0) {
|
|
71
74
|
process.stderr.write(`No .ndjson files found under ${target}\n`); process.exit(1);
|
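Every CLI entry point in this diff gains a call to `maybePrintVersionAndExit(process.argv)` as the first statement of `main()`, so `--version` / `-v` is handled before any other argument parsing. The helper is defined in `package/src/version.js`, which is not shown in this section. A minimal sketch of the idea follows, split into pure functions so it can be exercised without exiting the process. The names `wantsVersion` and `printVersionIfRequested`, and the choice to match the flag at any argument position, are assumptions.

```javascript
// Hypothetical sketch; the real helper in src/version.js is not shown here.
function wantsVersion(argv) {
  // argv[0] is the node binary and argv[1] the script path, as in process.argv.
  return argv.slice(2).some((arg) => arg === '--version' || arg === '-v');
}

// Writes the version and reports whether it did, leaving the decision
// to exit (for example via process.exit(0)) to the caller.
function printVersionIfRequested(argv, version, write = console.log) {
  if (!wantsVersion(argv)) return false;
  write(version);
  return true;
}
```

Running the check before `parseArgs` means a version query never trips validation such as the missing API key errors seen in the other hunks.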
package/scripts/service.js
CHANGED
|
@@ -25,6 +25,7 @@ import { join } from 'node:path';
|
|
|
25
25
|
import { fileURLToPath } from 'node:url';
|
|
26
26
|
import { execFileSync } from 'node:child_process';
|
|
27
27
|
import { isValidAgentId } from '../src/validate.js';
|
|
28
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
28
29
|
|
|
29
30
|
const HOME = os.homedir();
|
|
30
31
|
const PLATFORM = process.platform; // 'darwin' | 'linux' | …
|
|
@@ -338,6 +339,8 @@ The service starts at login and restarts on crash. Raw logs stay local.
|
|
|
338
339
|
}
|
|
339
340
|
|
|
340
341
|
function main() {
|
|
342
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
343
|
+
maybePrintVersionAndExit(process.argv);
|
|
341
344
|
const args = parseArgs(process.argv.slice(2));
|
|
342
345
|
const cmd = args._[0];
|
|
343
346
|
switch (cmd) {
|
package/scripts/shield.js
CHANGED
|
@@ -32,10 +32,12 @@ import {
|
|
|
32
32
|
confirmAllow, confirmDeny, interruptSession,
|
|
33
33
|
getAgentConfig, detectAlwaysAsk,
|
|
34
34
|
} from '../src/shield/enforce.js';
|
|
35
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
35
36
|
import { DecisionLogger } from '../src/shield/decisions.js';
|
|
36
37
|
import { listSessions, listAgents } from '../src/sources/anthropic-managed.js';
|
|
37
38
|
import { FortressPolicySource, postDecision } from '../src/shield/sources/fortress.js';
|
|
38
|
-
import { resolveFortressBase } from '../src/fortress/url.js';
|
|
39
|
+
import { resolveFortressBase, fortressEndpoint } from '../src/fortress/url.js';
|
|
40
|
+
import { PolicyStream } from '../src/shield/policy-stream.js';
|
|
39
41
|
import { isValidAgentId, isValidSessionId } from '../src/validate.js';
|
|
40
42
|
|
|
41
43
|
function parseArgs(argv) {
|
|
@@ -404,6 +406,8 @@ async function runAgentWide(ctx) {
|
|
|
404
406
|
// Main
|
|
405
407
|
// ────────────────────────────────────────────────────────────────────────
|
|
406
408
|
async function main() {
|
|
409
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
410
|
+
maybePrintVersionAndExit(process.argv);
|
|
407
411
|
const args = parseArgs(process.argv.slice(2));
|
|
408
412
|
const apiKey = args['api-key'] || process.env.ANTHROPIC_API_KEY;
|
|
409
413
|
const agentId = args['agent-id'];
|
|
@@ -482,9 +486,11 @@ async function main() {
|
|
|
482
486
|
// Shared infra: one shutdown signal, one fortress-source registry, one pusher.
|
|
483
487
|
const ac = new AbortController();
|
|
484
488
|
const fortressSources = [];
|
|
489
|
+
const fortressStreams = []; // v1.1.0 Phase 2 PolicyStream instances
|
|
485
490
|
const shutdown = (sig) => {
|
|
486
491
|
info(`${sig} received, shutting down…`);
|
|
487
492
|
for (const fp of fortressSources) fp.stop();
|
|
493
|
+
for (const ps of fortressStreams) ps.close();
|
|
488
494
|
ac.abort();
|
|
489
495
|
};
|
|
490
496
|
process.on('SIGINT', () => shutdown('SIGINT'));
|
|
@@ -508,8 +514,13 @@ async function main() {
|
|
|
508
514
|
let fortressPolicies = null;
|
|
509
515
|
let ruleset = sharedLocalRuleset;
|
|
510
516
|
if (policiesSource === 'fortress') {
|
|
517
|
+
// v1.1.0 Phase 1 L3.5: policy refresh from Fortress every 60s
|
|
518
|
+
// (was 5min). Combined with Phase 2 realtime subscription work,
|
|
519
|
+
// this brings new-policy-deployed-to-Shield latency from 5min
|
|
520
|
+
// worst-case down to ~60s, with the Phase 2 push model taking
|
|
521
|
+
// it to sub-second later.
|
|
511
522
|
fortressPolicies = new FortressPolicySource({
|
|
512
|
-
apiKey: wmaApiKey, base: fortressBase, anthropicAgentId: aid, refreshIntervalMs:
|
|
523
|
+
apiKey: wmaApiKey, base: fortressBase, anthropicAgentId: aid, refreshIntervalMs: 60_000,
|
|
513
524
|
onError: (e) => warn(`${tag}policy refresh failed (keeping cached): ${e.message}`),
|
|
514
525
|
onRefresh: ({ policies, fetched_at, initial }) => info(`${tag}policies ${initial ? 'loaded' : 'refreshed'} from Fortress — ${policies.length} active (fetched_at: ${fetched_at})`),
|
|
515
526
|
});
|
|
@@ -519,6 +530,27 @@ async function main() {
|
|
|
519
530
|
die(`error fetching policies from Fortress: ${e.message}\n Check WMA_FORTRESS_BASE_URL and WMA_API_KEY.`);
|
|
520
531
|
}
|
|
521
532
|
fortressSources.push(fortressPolicies);
|
|
533
|
+
// v1.1.0 Phase 2: persistent SSE connection to Fortress for instant
|
|
534
|
+
// policy updates (~100ms latency vs 60s poll). Falls back silently
|
|
535
|
+
// when the /policies-stream endpoint isn't deployed yet (HTTP 404),
|
|
536
|
+
// so the SDK ships safely even if the companion Lovable prompt
|
|
537
|
+
// hasn't landed on a given Fortress instance.
|
|
538
|
+
const streamUrl = fortressEndpoint(fortressBase, 'policies-stream');
|
|
539
|
+
const policyStream = new PolicyStream({
|
|
540
|
+
url: streamUrl,
|
|
541
|
+
apiKey: wmaApiKey,
|
|
542
|
+
anthropicAgentId: aid,
|
|
543
|
+
onError: (e) => warn(`${tag}policy-stream: ${e.message}`),
|
|
544
|
+
onInfo: (msg) => info(`${tag}${msg}`),
|
|
545
|
+
});
|
|
546
|
+
policyStream.on('policy_changed', () => {
|
|
547
|
+
// Fortress pushed a policy change for this agent — trigger an
|
|
548
|
+
// immediate refresh through the standard path so all the existing
|
|
549
|
+
// compile/validation logic applies.
|
|
550
|
+
fortressPolicies.refresh().catch((e) => warn(`${tag}stream-triggered refresh failed: ${e.message}`));
|
|
551
|
+
});
|
|
552
|
+
policyStream.start();
|
|
553
|
+
fortressStreams.push(policyStream);
|
|
522
554
|
ruleset = fortressPolicies.current();
|
|
523
555
|
}
|
|
524
556
|
|
|
@@ -572,9 +604,11 @@ async function main() {
|
|
|
572
604
|
if (armed.size === 0) {
|
|
573
605
|
die(`error: no agents could be armed (${agentIds.length} discovered; all policy fetches failed). Check WMA_API_KEY / WMA_FORTRESS_BASE_URL.`);
|
|
574
606
|
}
|
|
575
|
-
|
|
607
|
+
// v1.1.0 Phase 1 L3: supervisor reconcile every 30s (was 60s) so a
|
|
608
|
+
// freshly-created Anthropic agent gets armed sub-30s instead of sub-minute.
|
|
609
|
+
info(`fleet: ${armed.size}/${agentIds.length} agent(s) armed; reconciling every 30s for new agents.`);
|
|
576
610
|
while (!ac.signal.aborted) {
|
|
577
|
-
await sleep(
|
|
611
|
+
await sleep(30_000, ac.signal);
|
|
578
612
|
if (ac.signal.aborted) break;
|
|
579
613
|
let all;
|
|
580
614
|
try { all = await listAgents(apiKey); }
|
package/scripts/signals.js
CHANGED
|
@@ -25,6 +25,7 @@ import { resolve, join } from 'node:path';
|
|
|
25
25
|
import { SignalsAggregator } from '../src/anonymizer.js';
|
|
26
26
|
import { createReadStream } from 'node:fs';
|
|
27
27
|
import { createInterface } from 'node:readline';
|
|
28
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
28
29
|
|
|
29
30
|
function parseArgs(argv) {
|
|
30
31
|
const out = {};
|
|
@@ -59,6 +60,8 @@ async function collectFiles(p) {
|
|
|
59
60
|
}
|
|
60
61
|
|
|
61
62
|
async function main() {
|
|
63
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
64
|
+
maybePrintVersionAndExit(process.argv);
|
|
62
65
|
const args = parseArgs(process.argv.slice(2));
|
|
63
66
|
|
|
64
67
|
if (!args._target) {
|
package/scripts/upload-fortress.js
CHANGED
|
@@ -31,6 +31,8 @@ import { createInterface } from 'node:readline';
|
|
|
31
31
|
import { SignalsAggregator } from '../src/anonymizer.js';
|
|
32
32
|
import { resolveFortressBase, fortressEndpoint } from '../src/fortress/url.js';
|
|
33
33
|
import { AnthropicManagedSource } from '../src/sources/anthropic-managed.js';
|
|
34
|
+
import { cleanLabel } from '../src/labels.js';
|
|
35
|
+
import { maybePrintVersionAndExit } from '../src/version.js';
|
|
34
36
|
|
|
35
37
|
function parseArgs(argv) {
|
|
36
38
|
const out = {};
|
|
@@ -99,13 +101,18 @@ function postJson(url, headers, body) {
|
|
|
99
101
|
}
|
|
100
102
|
|
|
101
103
|
async function main() {
|
|
104
|
+
// v1.1.1 F-13: --version / -v short-circuit before any other parsing.
|
|
105
|
+
maybePrintVersionAndExit(process.argv);
|
|
102
106
|
const args = parseArgs(process.argv.slice(2));
|
|
103
107
|
|
|
104
108
|
const agentId = args['agent-id'];
|
|
105
109
|
const logDir = resolve(args['log-dir'] || './watchmyagents-logs');
|
|
106
110
|
const apiKey = args['api-key'] || process.env.WMA_API_KEY;
|
|
107
111
|
const salt = args.salt || process.env.WMA_SIGNALS_SALT;
|
|
108
|
-
|
|
112
|
+
// v1.1.1 F-11: sanitize the customer-supplied display name with the
|
|
113
|
+
// same cleanLabel used by the Watch daemon (defense-in-depth vs log
|
|
114
|
+
// injection / Fortress payload injection via control bytes).
|
|
115
|
+
const displayName = cleanLabel(args['display-name'] || agentId) || agentId;
|
|
109
116
|
const dryRun = !!args['dry-run'];
|
|
110
117
|
|
|
111
118
|
// Resolve Fortress base URL. Accepts:
|
package/src/labels.js
ADDED
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
2
|
+
// labels — shared sanitization for human-facing identifiers
|
|
3
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
4
|
+
//
|
|
5
|
+
// Customer-set strings (agent display names, workspace labels, etc.) end
|
|
6
|
+
// up in:
|
|
7
|
+
// - log lines (stdout/stderr of the Watch + Shield daemons)
|
|
8
|
+
// - the Fortress ingest-signals payload (`display_name` field)
|
|
9
|
+
// - eventually rendered in the Fortress dashboard
|
|
10
|
+
//
|
|
11
|
+
// We don't trust them. A name carrying:
|
|
12
|
+
// - control bytes (0x00-0x1F, 0x7F) can poison terminal output (ANSI
|
|
13
|
+
// escape sequences) or break NDJSON parsing
|
|
14
|
+
// - excessive length can bloat payloads and break UI columns
|
|
15
|
+
//
|
|
16
|
+
// `cleanLabel()` is the single, shared sanitizer. Both wma-fetch (the
|
|
17
|
+
// daemon) and wma-upload-fortress (the one-shot uploader) MUST run
|
|
18
|
+
// every customer-supplied label through it before logging or shipping.
|
|
19
|
+
// Extracted to its own module in v1.1.1 (F-11 Codex audit fix) so a
|
|
20
|
+
// future change benefits both consumers automatically.
|
|
21
|
+
|
|
22
|
+
const MAX_LABEL_CHARS = 60;
|
|
23
|
+
|
|
24
|
+
/**
|
|
25
|
+
* Strip control bytes (< 0x20 and 0x7F DEL) and truncate to MAX_LABEL_CHARS
|
|
26
|
+
* characters. Returns the empty string for null/undefined input.
|
|
27
|
+
*
|
|
28
|
+
* Uses [...str] to iterate by code point so surrogate pairs aren't split.
|
|
29
|
+
*/
|
|
30
|
+
export function cleanLabel(s) {
|
|
31
|
+
return [...String(s ?? '')]
|
|
32
|
+
.filter((c) => {
|
|
33
|
+
const code = c.charCodeAt(0);
|
|
34
|
+
return code >= 32 && code !== 127;
|
|
35
|
+
})
|
|
36
|
+
.join('')
|
|
37
|
+
.slice(0, MAX_LABEL_CHARS)
|
|
38
|
+
.trim();
|
|
39
|
+
}
|
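Review note on `cleanLabel()`: the doc comment promises truncation to MAX_LABEL_CHARS characters without splitting surrogate pairs, but in the code as shown `.slice(0, MAX_LABEL_CHARS)` runs after `.join('')`, so it counts UTF-16 code units and could cut an astral character (an emoji, for example) in half at the boundary. A minimal sketch of a variant that truncates the code-point array before joining is below. `cleanLabelByCodePoint` is a hypothetical name used only for illustration, not part of the package; the limit of 60 is copied from the diff.

```javascript
const MAX_LABEL_CHARS = 60; // same limit as in the diff above

// Hypothetical variant of cleanLabel (an illustration, not the package's code).
// Truncation happens on the array of code points, before join(), so a
// surrogate pair is either kept whole or dropped whole.
function cleanLabelByCodePoint(s) {
  return [...String(s ?? '')]
    .filter((c) => {
      const code = c.codePointAt(0);
      // Drop C0 control characters (< 0x20) and DEL (0x7F).
      return code >= 32 && code !== 127;
    })
    .slice(0, MAX_LABEL_CHARS)
    .join('')
    .trim();
}
```

With this ordering the limit is 60 code points rather than 60 UTF-16 code units, so the resulting string can be slightly longer in `.length` terms when it contains astral characters.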
package/src/shield/policy-stream.js
ADDED
|
@@ -0,0 +1,227 @@
|
|
|
1
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
2
|
+
// PolicyStream — Server-Sent Events consumer for instant policy propagation
|
|
3
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
4
|
+
//
|
|
5
|
+
// v1.1.0 Phase 2: instead of polling Fortress every 60s for new policies
|
|
6
|
+
// (the FortressPolicySource refreshIntervalMs path), Shield maintains a
|
|
7
|
+
// persistent SSE connection to /functions/v1/policies-stream and refreshes
|
|
8
|
+
// its ruleset within ~100ms of a policy change in Fortress.
|
|
9
|
+
//
|
|
10
|
+
// Why SSE (not WebSocket):
|
|
11
|
+
// - Zero runtime dependencies preserved: HTTPS + SSE = node:https built-in,
|
|
12
|
+
// no @supabase/realtime-js, no custom Phoenix Channels client.
|
|
13
|
+
// - Node 18+ compat preserved: no native WebSocket needed.
|
|
14
|
+
// - Firewall-friendly: SSE rides on standard HTTPS — many enterprise
|
|
15
|
+
// proxies block raw WebSocket but pass through text/event-stream cleanly.
|
|
16
|
+
// - Realtime is uni-directional (Fortress → Shield) anyway. SSE is the
|
|
17
|
+
// right tool for one-way push notifications.
|
|
18
|
+
//
|
|
19
|
+
// Graceful fallback:
|
|
20
|
+
// - On HTTP 404 from the SSE endpoint (Fortress side not yet upgraded
|
|
21
|
+
// with the Lovable prompt), this stream goes into "fallback mode" and
|
|
22
|
+
// stops trying to reconnect aggressively. The FortressPolicySource's
|
|
23
|
+
// existing poll cadence (60s in v1.1.0) covers the gap.
|
|
24
|
+
// - On HTTP 401, this is a config error — logged once, stream stays
|
|
25
|
+
// down.
|
|
26
|
+
// - On network errors / disconnects, reconnect with exponential backoff
|
|
27
|
+
// (1s → 60s cap).
|
|
28
|
+
//
|
|
29
|
+
// Per-agent: each PolicyStream targets a single anthropic_agent_id so the
|
|
30
|
+
// Fortress side can scope the channel to "this customer + this agent".
|
|
31
|
+
|
|
32
|
+
import { request as httpsRequest } from 'node:https';
|
|
33
|
+
import { URL } from 'node:url';
|
|
34
|
+
import { EventEmitter } from 'node:events';
|
|
35
|
+
|
|
36
|
+
const RECONNECT_MIN_MS = 1_000;
|
|
37
|
+
const RECONNECT_MAX_MS = 60_000;
|
|
38
|
+
const FALLBACK_RETRY_INTERVAL_MS = 5 * 60_000;
|
|
39
|
+
const PERMANENT_FAILURE_LOG_INTERVAL_MS = 5 * 60_000;
|
|
40
|
+
// v1.1.1 F-9 (P2 Codex audit): hard cap on a single SSE event's buffer.
|
|
41
|
+
// A buggy or compromised Fortress endpoint could stream bytes forever
|
|
42
|
+
// without emitting the "\n\n" event separator, growing Shield's memory.
|
|
43
|
+
// 1 MB is far above any legitimate `policy_changed` payload (the data
|
|
44
|
+
// field carries {rule_id, action, ts, kind} = maybe 200 bytes) so we
|
|
45
|
+
// abort the connection and reconnect on overflow.
|
|
46
|
+
const MAX_SSE_EVENT_BYTES = 1 * 1024 * 1024;
|
|
47
|
+
|
|
48
|
+
export class PolicyStream extends EventEmitter {
|
|
49
|
+
constructor({ url, apiKey, anthropicAgentId, onError, onInfo }) {
|
|
50
|
+
super();
|
|
51
|
+
if (!url) throw new Error('PolicyStream requires url');
|
|
52
|
+
if (!apiKey) throw new Error('PolicyStream requires apiKey');
|
|
53
|
+
if (!anthropicAgentId) throw new Error('PolicyStream requires anthropicAgentId');
|
|
54
|
+
this.url = url;
|
|
55
|
+
this.apiKey = apiKey;
|
|
56
|
+
this.agentId = anthropicAgentId;
|
|
57
|
+
this.onError = onError || (() => {});
|
|
58
|
+
this.onInfo = onInfo || (() => {});
|
|
59
|
+
this._req = null;
|
|
60
|
+
this._closed = false;
|
|
61
|
+
this._started = false;
|
|
62
|
+
this._backoffMs = RECONNECT_MIN_MS;
|
|
63
|
+
this._inFallback = false;
|
|
64
|
+
this._lastFallbackLogAt = 0;
|
|
65
|
+
this._lastConfigErrorLogAt = 0;
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
start() {
|
|
69
|
+
if (this._closed) return;
|
|
70
|
+
this._started = true;
|
|
71
|
+
this._connect();
|
|
72
|
+
}
|
|
73
|
+
|
|
74
|
+
close() {
|
|
75
|
+
this._closed = true;
|
|
76
|
+
if (this._req) {
|
|
77
|
+
try { this._req.destroy(); } catch { /* already destroyed */ }
|
|
78
|
+
this._req = null;
|
|
79
|
+
}
|
|
80
|
+
}
|
|
81
|
+
|
|
82
|
+
// Whether the stream is currently the source of truth (i.e., started,
|
|
83
|
+
// not closed, AND not in fallback mode). Useful for Shield to know
|
|
84
|
+
// whether to trust SSE or rely on its own polling cadence.
|
|
85
|
+
isLive() {
|
|
86
|
+
return this._started && !this._inFallback && !this._closed;
|
|
87
|
+
}
|
|
88
|
+
|
|
89
|
+
_connect() {
|
|
90
|
+
if (this._closed) return;
|
|
91
|
+
const u = new URL(this.url);
|
|
92
|
+
// Query-param scoping so Fortress can filter to this agent's channel.
|
|
93
|
+
u.searchParams.set('agent_id', this.agentId);
|
|
94
|
+
if (u.protocol !== 'https:') {
|
|
95
|
+
this.onError(new Error(`policy-stream: refusing non-https URL: ${this.url}`));
|
|
96
|
+
return;
|
|
97
|
+
}
|
|
98
|
+
|
|
99
|
+
const req = httpsRequest({
|
|
100
|
+
hostname: u.hostname,
|
|
101
|
+
port: u.port || 443,
|
|
102
|
+
path: u.pathname + (u.search || ''),
|
|
103
|
+
method: 'GET',
|
|
104
|
+
headers: {
|
|
105
|
+
'authorization': `Bearer ${this.apiKey}`,
|
|
106
|
+
'accept': 'text/event-stream',
|
|
107
|
+
'cache-control': 'no-cache',
|
|
108
|
+
'connection': 'keep-alive',
|
|
109
|
+
},
|
|
110
|
+
rejectUnauthorized: true,
|
|
111
|
+
}, (res) => {
|
|
112
|
+
this._req = req;
|
|
113
|
+
|
|
114
|
+
// 404 — Fortress side hasn't deployed the endpoint yet. Silent
|
|
115
|
+
// fallback: log once per 5 min, retry every 5 min, don't spam.
|
|
116
|
+
if (res.statusCode === 404) {
|
|
117
|
+
this._inFallback = true;
|
|
118
|
+
const now = Date.now();
|
|
119
|
+
if (now - this._lastFallbackLogAt > PERMANENT_FAILURE_LOG_INTERVAL_MS) {
|
|
120
|
+
this.onInfo(`policy-stream: SSE endpoint not deployed (HTTP 404). Falling back to polling.`);
|
|
121
|
+
this._lastFallbackLogAt = now;
|
|
122
|
+
}
|
|
123
|
+
res.resume(); // drain to free the socket
|
|
124
|
+
this._scheduleReconnect(FALLBACK_RETRY_INTERVAL_MS);
|
|
125
|
+
return;
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
// 401 — auth error. Config bug; log once per 5 min.
|
|
129
|
+
if (res.statusCode === 401 || res.statusCode === 403) {
|
|
130
|
+
const now = Date.now();
|
|
131
|
+
if (now - this._lastConfigErrorLogAt > PERMANENT_FAILURE_LOG_INTERVAL_MS) {
|
|
132
|
+
this.onError(new Error(`policy-stream: auth error (HTTP ${res.statusCode}) — check WMA_API_KEY`));
|
|
133
|
+
this._lastConfigErrorLogAt = now;
|
|
134
|
+
}
|
|
135
|
+
this._inFallback = true;
|
|
136
|
+
res.resume();
|
|
137
|
+
this._scheduleReconnect(FALLBACK_RETRY_INTERVAL_MS);
|
|
138
|
+
return;
|
|
139
|
+
}
|
|
140
|
+
|
|
141
|
+
if (res.statusCode !== 200) {
|
|
142
|
+
this.onError(new Error(`policy-stream: unexpected HTTP ${res.statusCode}`));
|
|
143
|
+
res.resume();
|
|
144
|
+
this._scheduleReconnect();
|
|
145
|
+
return;
|
|
146
|
+
}
|
|
147
|
+
|
|
148
|
+
// We're live. Reset backoff + fallback flag.
|
|
149
|
+
this._backoffMs = RECONNECT_MIN_MS;
|
|
150
|
+
this._inFallback = false;
|
|
151
|
+
this.onInfo(`policy-stream: connected for ${this.agentId.slice(0, 16)}…`);
|
|
152
|
+
res.setEncoding('utf8');
|
|
153
|
+
|
|
154
|
+
let buffer = '';
|
|
155
|
+
res.on('data', (chunk) => {
|
|
156
|
+
buffer += chunk;
|
|
157
|
+
// v1.1.1 F-9: cap on a single SSE event buffer. A buggy/compromised
|
|
158
|
+
// endpoint that never emits "\n\n" would otherwise OOM Shield.
|
|
159
|
+
// Abort + reconnect on overflow; the buffer is dropped so we
|
|
160
|
+
// restart fresh on the new connection.
|
|
161
|
+
if (buffer.length > MAX_SSE_EVENT_BYTES) {
|
|
162
|
+
this.onError(new Error(`policy-stream: SSE event exceeded ${MAX_SSE_EVENT_BYTES} bytes — aborting connection and reconnecting`));
|
|
163
|
+
buffer = '';
|
|
164
|
+
try { res.destroy(); } catch { /* already destroyed */ }
|
|
165
|
+
if (!this._closed) this._scheduleReconnect();
|
|
166
|
+
return;
|
|
167
|
+
}
|
|
168
|
+
// SSE events are separated by a blank line ("\n\n").
|
|
169
|
+
let eolIdx;
|
|
170
|
+
while ((eolIdx = buffer.indexOf('\n\n')) !== -1) {
|
|
171
|
+
const rawEvent = buffer.slice(0, eolIdx);
|
|
172
|
+
buffer = buffer.slice(eolIdx + 2);
|
|
173
|
+
this._parseAndEmit(rawEvent);
|
|
174
|
+
}
|
|
175
|
+
});
|
|
176
|
+
res.on('end', () => {
|
|
177
|
+
if (!this._closed) {
|
|
178
|
+
this.onInfo('policy-stream: connection closed, reconnecting…');
|
|
179
|
+
this._scheduleReconnect();
|
|
180
|
+
}
|
|
181
|
+
});
|
|
182
|
+
res.on('error', (e) => {
|
|
183
|
+
this.onError(new Error(`policy-stream: response error: ${e.message}`));
|
|
184
|
+
if (!this._closed) this._scheduleReconnect();
|
|
185
|
+
});
|
|
186
|
+
});
|
|
187
|
+
|
|
188
|
+
req.on('error', (e) => {
|
|
189
|
+
this.onError(new Error(`policy-stream: request error: ${e.message}`));
|
|
190
|
+
if (!this._closed) this._scheduleReconnect();
|
|
191
|
+
});
|
|
192
|
+
// Stream MUST remain open — no body, no end() until close.
|
|
193
|
+
req.end();
|
|
194
|
+
}
|
|
195
|
+
|
|
196
|
+
_parseAndEmit(rawEvent) {
|
|
197
|
+
// SSE spec: each event is a set of "field: value" lines.
|
|
198
|
+
// We care about the `data:` field (multiple data: lines concatenate).
|
|
199
|
+
const dataLines = [];
|
|
200
|
+
for (const line of rawEvent.split('\n')) {
|
|
201
|
+
// Skip comments (lines starting with ":")
|
|
202
|
+
if (line.startsWith(':')) continue;
|
|
203
|
+
if (line.startsWith('data:')) {
|
|
204
|
+
// Drop leading "data:" and optional space
|
|
205
|
+
const v = line.slice(5).replace(/^ /, '');
|
|
206
|
+
dataLines.push(v);
|
|
207
|
+
}
|
|
208
|
+
}
|
|
209
|
+
if (dataLines.length === 0) return;
|
|
210
|
+
const data = dataLines.join('\n');
|
|
211
|
+
let parsed;
|
|
212
|
+
try { parsed = JSON.parse(data); }
|
|
213
|
+
catch (e) {
|
|
214
|
+
this.onError(new Error(`policy-stream: invalid JSON in event: ${e.message}`));
|
|
215
|
+
return;
|
|
216
|
+
}
|
|
217
|
+
// Emit 'policy_changed' — consumers should refresh their ruleset.
|
|
218
|
+
this.emit('policy_changed', parsed);
|
|
219
|
+
}
|
|
220
|
+
|
|
221
|
+
_scheduleReconnect(forceDelay) {
|
|
222
|
+
if (this._closed) return;
|
|
223
|
+
const delay = forceDelay != null ? forceDelay : this._backoffMs;
|
|
224
|
+
this._backoffMs = Math.min(this._backoffMs * 2, RECONNECT_MAX_MS);
|
|
225
|
+
setTimeout(() => this._connect(), delay);
|
|
226
|
+
}
|
|
227
|
+
}
|
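Review note on the SSE framing in `PolicyStream`: the `data` handler and `_parseAndEmit()` above can be read as one pure step that takes the accumulated text and returns the complete event payloads plus the unterminated remainder. The sketch below isolates that step so it can be exercised without a network connection. `drainSseEvents` and its `maxBytes` parameter are hypothetical names, not part of the package. Like the code in the diff, it assumes LF-only line endings (the SSE format also permits CRLF and CR, which neither version handles). Unlike the diff, it measures the remainder with `Buffer.byteLength`, since `buffer.length` on a string counts UTF-16 code units rather than bytes.

```javascript
// Hypothetical helper (an illustration, not the package's code).
// Splits accumulated SSE text into complete event payloads and returns
// whatever is left over after the last "\n\n" separator.
function drainSseEvents(buffer, maxBytes = 1024 * 1024) {
  const payloads = [];
  let idx;
  while ((idx = buffer.indexOf('\n\n')) !== -1) {
    const rawEvent = buffer.slice(0, idx);
    buffer = buffer.slice(idx + 2);
    const dataLines = [];
    for (const line of rawEvent.split('\n')) {
      if (line.startsWith(':')) continue; // comment / keep-alive line
      if (line.startsWith('data:')) {
        // Drop "data:" and at most one leading space.
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    // Multiple data: lines in one event are joined with "\n".
    if (dataLines.length > 0) payloads.push(dataLines.join('\n'));
  }
  // Only the unterminated remainder counts against the cap.
  if (Buffer.byteLength(buffer, 'utf8') > maxBytes) {
    throw new RangeError(`SSE event exceeded ${maxBytes} bytes`);
  }
  return { payloads, rest: buffer };
}
```

A caller would feed `rest` back in together with the next chunk, run `JSON.parse` on each payload, and treat the `RangeError` as the signal to drop the connection and reconnect.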
package/src/shield/sources/fortress.js
CHANGED
|
@@ -148,6 +148,17 @@ export class FortressPolicySource {
|
|
|
148
148
|
return this.ruleset;
|
|
149
149
|
}
|
|
150
150
|
|
|
151
|
+
/**
|
|
152
|
+
* Public refresh hook for out-of-band triggers — e.g. the v1.1.0 SSE
|
|
153
|
+
* PolicyStream fires this when Fortress pushes a policy_changed event,
|
|
154
|
+
* collapsing the up-to-60s polling latency to ~100ms.
|
|
155
|
+
* Safe to call concurrently with the internal interval timer: each
|
|
156
|
+
* call only performs a single network round-trip.
|
|
157
|
+
*/
|
|
158
|
+
async refresh() {
|
|
159
|
+
return this._refresh();
|
|
160
|
+
}
|
|
161
|
+
|
|
151
162
|
async _refresh({ initial = false } = {}) {
|
|
152
163
|
if (this._aborted) return;
|
|
153
164
|
try {
|
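Review note on `refresh()`: per its doc comment, each call performs its own network round-trip, so a burst of `policy_changed` events would trigger one fetch per event. If that ever became a concern, one option is to coalesce callers onto a single in-flight promise. The sketch below shows the idea under that assumption. `coalesce` is a hypothetical helper, not something the package provides, and the diff does not indicate that the package does this.

```javascript
// Hypothetical helper (an illustration, not the package's code).
// Wraps an async function so that overlapping calls share one in-flight
// promise; a new run starts only after the previous one has settled.
function coalesce(fn) {
  let inFlight = null;
  return function run() {
    if (!inFlight) {
      inFlight = Promise.resolve()
        .then(fn)
        .finally(() => { inFlight = null; });
    }
    return inFlight;
  };
}
```

Usage would look like `const refreshOnce = coalesce(() => fortressPolicies.refresh());` with the `policy_changed` handler calling `refreshOnce()`. The trade-off is that an event arriving while a fetch is already in flight does not trigger a second fetch, so a change landing in that window would be picked up by the next event or the regular poll.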
package/src/sources/anthropic-managed.js
CHANGED
|
@@ -326,6 +326,11 @@ export async function* fetchSessionEntries({ apiKey, agentId, sessionId, model }
|
|
|
326
326
|
isMcp: type === 'agent.mcp_tool_use',
|
|
327
327
|
input: ev.input ?? null,
|
|
328
328
|
mcpServer: ev.server_name ?? ev.mcp_server_name ?? null,
|
|
329
|
+
// v1.1.1 F-8: capture sub-agent context at storage time so the
|
|
330
|
+
// end-of-session flush yields entries with the right attribution.
|
|
331
|
+
startTimestamp: ts,
|
|
332
|
+
session_thread_id,
|
|
333
|
+
agent_name,
|
|
329
334
|
});
|
|
330
335
|
continue;
|
|
331
336
|
}
|
|
@@ -483,6 +488,37 @@ export async function* fetchSessionEntries({ apiKey, agentId, sessionId, model }
|
|
|
483
488
|
continue;
|
|
484
489
|
}
|
|
485
490
|
}
|
|
491
|
+
|
|
492
|
+
// v1.1.1 F-8 (P1 Codex audit): flush remaining pendingToolUse entries
|
|
493
|
+
// as explicit "no_result_observed" tool_use events. These are tool
|
|
494
|
+
// calls that started (we saw agent.tool_use) but never produced a
|
|
495
|
+
// result (no agent.tool_result paired): most commonly because Shield
|
|
496
|
+
// pre-blocked them, the operator denied via tool_confirmation, the
|
|
497
|
+
// tool died mid-execution, or the session terminated before the
|
|
498
|
+
// result event arrived. For a security audit product, these incomplete
|
|
499
|
+
// calls are often the MOST useful signals — a blocked exfil attempt
|
|
500
|
+
// shows up here, not in successful tool_results. Yielding them
|
|
501
|
+
// explicitly with status='error' keeps the local NDJSON, anonymizer
|
|
502
|
+
// signals (counts, IoC hashes, tool_counts), and Fortress decisions
|
|
503
|
+
// honest about what actually happened.
|
|
504
|
+
for (const [toolUseId, pending] of pendingToolUse) {
|
|
505
|
+
yield {
|
|
506
|
+
...base,
|
|
507
|
+
session_thread_id: pending.session_thread_id,
|
|
508
|
+
agent_name: pending.agent_name,
|
|
509
|
+
id: toolUseId,
|
|
510
|
+
action_type: pending.isMcp ? 'mcp_tool_use' : 'tool_use',
|
|
511
|
+
tool_name: pending.name,
|
|
512
|
+
model: model || null,
|
|
513
|
+
timestamp: pending.startTimestamp,
|
|
514
|
+
duration_ms: null,
|
|
515
|
+
status: 'error',
|
|
516
|
+
error: 'no_result_observed',
|
|
517
|
+
input: pending.input,
|
|
518
|
+
output: { mcp_server: pending.mcpServer ?? undefined },
|
|
519
|
+
};
|
|
520
|
+
}
|
|
521
|
+
pendingToolUse.clear();
|
|
486
522
|
}
|
|
487
523
|
|
|
488
524
|
// ────────────────────────────────────────────────────────────────────────
|
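Review note on the F-8 flush: the pattern above is "remember each tool call when it starts, emit it when its result arrives, and at end of stream emit whatever is still pending as an error". The sketch below reproduces that pattern on a simplified event list. The event shape (`type`, `id`, `name`, `ts`), the function name `pairToolEvents`, and the `'success'` status value are assumptions made for illustration; only `status: 'error'` and `error: 'no_result_observed'` come from the diff.

```javascript
// Hypothetical simplification (an illustration, not the package's code).
// Assumed event shape: { type: 'tool_use' | 'tool_result', id, name?, ts? }
function* pairToolEvents(events) {
  const pending = new Map();
  for (const ev of events) {
    if (ev.type === 'tool_use') {
      // Capture context at storage time so a late flush can still attribute it.
      pending.set(ev.id, { name: ev.name, startTimestamp: ev.ts });
    } else if (ev.type === 'tool_result' && pending.has(ev.id)) {
      const started = pending.get(ev.id);
      pending.delete(ev.id);
      yield {
        id: ev.id,
        tool_name: started.name,
        timestamp: started.startTimestamp,
        status: 'success', // assumed value; the diff only shows the error case
        error: null,
      };
    }
  }
  // End of stream: anything still pending never produced a result.
  for (const [id, started] of pending) {
    yield {
      id,
      tool_name: started.name,
      timestamp: started.startTimestamp,
      status: 'error',
      error: 'no_result_observed',
    };
  }
  pending.clear();
}
```

Because a `Map` iterates in insertion order, the flushed entries come out in the order the calls started.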
package/src/version.js
ADDED
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
2
|
+
// version — shared --version flag handler for the wma-* CLI binaries
|
|
3
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
4
|
+
//
|
|
5
|
+
// v1.1.1 F-13: every CLI binary (wma-fetch, wma-shield, wma-signals,
|
|
6
|
+
// wma-upload-fortress, wma-inspect, wma-agents, wma-service) gets a
|
|
7
|
+
// --version / -v flag that prints the installed version and exits.
|
|
8
|
+
// Operators previously had to grep package.json under npm root to know
|
|
9
|
+
// what was deployed; this is now a one-liner.
|
|
10
|
+
//
|
|
11
|
+
// We resolve the version from the package.json next to the SDK source
|
|
12
|
+
// (../package.json relative to this file) so it stays in sync with the
|
|
13
|
+
// release that's actually executing.
|
|
14
|
+
|
|
15
|
+
import { readFileSync } from 'node:fs';
|
|
16
|
+
import { dirname, join } from 'node:path';
|
|
17
|
+
import { fileURLToPath } from 'node:url';
|
|
18
|
+
|
|
19
|
+
const HERE = dirname(fileURLToPath(import.meta.url));
|
|
20
|
+
const PKG_PATH = join(HERE, '..', 'package.json');
|
|
21
|
+
|
|
22
|
+
let cachedVersion = null;
|
|
23
|
+
|
|
24
|
+
/** Returns the installed watchmyagents version, parsed from package.json. */
|
|
25
|
+
export function getVersion() {
|
|
26
|
+
if (cachedVersion) return cachedVersion;
|
|
27
|
+
try {
|
|
28
|
+
const pkg = JSON.parse(readFileSync(PKG_PATH, 'utf8'));
|
|
29
|
+
cachedVersion = pkg.version || 'unknown';
|
|
30
|
+
} catch {
|
|
31
|
+
cachedVersion = 'unknown';
|
|
32
|
+
}
|
|
33
|
+
return cachedVersion;
|
|
34
|
+
}
|
|
35
|
+
|
|
36
|
+
/**
|
|
37
|
+
* If argv contains --version or -v, print the version and exit(0).
|
|
38
|
+
* Call this BEFORE any other parsing so it short-circuits on bad input
|
|
39
|
+
* (e.g., the user types `wma-fetch --version` with no env vars set).
|
|
40
|
+
*
|
|
41
|
+
* Usage at the top of every wma-* script:
|
|
42
|
+
* import { maybePrintVersionAndExit } from '../src/version.js';
|
|
43
|
+
* maybePrintVersionAndExit(process.argv);
|
|
44
|
+
*/
|
|
45
|
+
export function maybePrintVersionAndExit(argv) {
|
|
46
|
+
for (const a of argv) {
|
|
47
|
+
if (a === '--version' || a === '-v') {
|
|
48
|
+
process.stdout.write(`watchmyagents ${getVersion()}\n`);
|
|
49
|
+
process.exit(0);
|
|
50
|
+
}
|
|
51
|
+
}
|
|
52
|
+
}
|
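Review note on `maybePrintVersionAndExit()`: the loop above scans every element of `argv`, including the Node binary path, the script path, and any flag values, so an invocation such as `--display-name -v` would also print the version and exit. A minimal sketch of a stricter check is below. `wantsVersion` is a hypothetical name, not part of the package, and stopping at a `--` separator is a common CLI convention assumed here rather than something the diff shows.

```javascript
// Hypothetical stricter check (an illustration, not the package's code).
// Skips argv[0] (node) and argv[1] (script path) and stops at a bare "--".
function wantsVersion(argv) {
  for (const a of argv.slice(2)) {
    if (a === '--') return false;
    if (a === '--version' || a === '-v') return true;
  }
  return false;
}
```

This still treats `-v` as the version flag when it appears as the value of another option; ruling that out would need knowledge of which options take values, which belongs in each script's `parseArgs`.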