watchmyagents 0.9.2 → 0.9.4
- package/README.md +4 -4
- package/package.json +1 -1
- package/scripts/agents.js +4 -4
- package/scripts/fetch-anthropic.js +9 -4
- package/scripts/service.js +1 -1
- package/scripts/upload-fortress.js +1 -1
- package/src/typology-features.js +3 -3
- package/src/typology-weights.json +2 -2
- package/src/typology.js +4 -4
package/README.md
CHANGED
@@ -129,7 +129,7 @@ Logs land in `./watchmyagents-logs/<agent_id>/<date>.ndjson` (file mode `0600`,
 
 ### `wma-anonymize` — preview what would leave your machine
 
-Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify
+Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Containment compliance and to test the format.
 
 ```bash
 export WMA_SIGNALS_SALT="$(node -e 'console.log(require("crypto").randomBytes(16).toString("hex"))')"
@@ -155,7 +155,7 @@ wma-upload-fortress --agent-id agent_01ABC... [--display-name "My agent"]
 wma-upload-fortress --agent-id agent_xxx --dry-run
 ```
 
-**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output) **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the
+**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output), the agent's **`classification`** when the daemon has it (`{agent_type, confidence, stage}` — anonymized metadata, never raw content), **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the **human-readable agent name** (sanitized to strip control chars) for UX in the dashboard — pass `--no-send-agent-names` to keep it pseudonymized (sends the agent id instead) if your agent names themselves carry sensitive client/project info.
 **What is NOT sent:** raw prompts, raw URLs/commands/queries, raw agent responses, raw error messages. All payload content stays on your machine.
 
 The endpoint auto-registers the agent on the first upload if it doesn't exist in Fortress yet — no manual onboarding needed for new agents.
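The README hunks above mention salted IoC hashes, but this diff does not show how they are computed. As a rough illustration only, the sketch below assumes an HMAC-SHA-256 keyed with `WMA_SIGNALS_SALT`. The names `hashIoc` and `toSignals` and the payload field names are hypothetical and are not taken from the package.

```javascript
// Sketch only: the package's real hashing scheme is not visible in this diff.
const { createHmac } = require('node:crypto');

// Hypothetical helper: keyed SHA-256 of one indicator value (URL, command, ...).
function hashIoc(salt, rawValue) {
  if (!salt) throw new Error('WMA_SIGNALS_SALT must be set');
  return createHmac('sha256', salt).update(String(rawValue)).digest('hex');
}

// Hypothetical payload shape: a count plus de-duplicated hashes, no raw values.
function toSignals(salt, rawUrls) {
  const hashes = [...new Set(rawUrls.map((u) => hashIoc(salt, u)))];
  return { url_count: rawUrls.length, ioc_hashes: hashes };
}

const salt = process.env.WMA_SIGNALS_SALT || 'example-salt-for-local-testing';
const signals = toSignals(salt, [
  'https://a.example/x',
  'https://a.example/x',
  'https://b.example/y',
]);
console.log(signals.url_count, signals.ioc_hashes.length);
```

Keying the hash with a per-installation salt means the same URL hashes differently on two machines, which is presumably why the README has each user generate their own salt.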
@@ -174,7 +174,7 @@ Outputs sections aligned with security audit needs: tokens summary, by-tool / by
 
 Lists every Managed Agent under your key and classifies each one's **typology**
 (one of 10 Guardian Core archetypes) from its OBSERVED behaviour in your local
-logs — which drives the cold-start Shield template.
+logs — which drives the cold-start Shield template. Containment: reads local logs
 only (tool-category fractions, never raw content) and transmits nothing.
 
 ```bash
@@ -303,7 +303,7 @@ Decisions are logged to the same NDJSON stream as Watch (`action_type: shield_de
 
 - ✅ Watch SDK — Anthropic Managed Agents post-hoc fetch + local audit
 - ✅ Shield SDK — real-time enforcement (interrupt mode + tool_confirmation mode)
-- ✅ Anonymizer — produce signals payloads (
+- ✅ Anonymizer — produce signals payloads (Containment: no raw content leaves)
 - ✅ Anonymized telemetry to WMA Fortress cloud (`wma-upload-fortress` in v0.5.0)
 - ✅ Guardian AI (cloud) — automatic policy suggestions from observed behavior
 - ✅ Fortress (cloud) — dashboard + human-in-the-loop validation queue
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "watchmyagents",
-"version": "0.9.
+"version": "0.9.4",
 "description": "Security observability + real-time policy enforcement for AI agents. Local-first NDJSON capture with a continuous Watch daemon that auto-uploads anonymized signals, Shield CLI that blocks policy violations live (with policies pulled from Fortress cloud), anonymizer producing signals-only payloads, bidirectional sync with WatchMyAgents Fortress, and one-command install as an always-on launchd/systemd service — closing the recursive Watch→Guardian→Shield security loop.",
 "type": "module",
 "files": [
package/scripts/agents.js
CHANGED
@@ -5,14 +5,14 @@
 // Usage:
 // wma-agents [list] [--log-dir ~/.watchmyagents/logs] [--json]
 //
-// Reads the local Watch logs (NEVER leaves the machine —
+// Reads the local Watch logs (NEVER leaves the machine — Containment) and derives
 // the anonymized behavioural FEATURE VECTOR per the typology spec:
 // per-tool-category FRACTIONS (f_*), boolean local flags (flag_*), aux ratios
 // (aux_*), and n_events. It then calls classifyAgentType() and prints the
 // schema-conformant result. With <50 events an agent is `generic` (cold start)
 // and refines as activity accumulates.
 //
-//
+// Containment invariant: only counts/ratios/flags are computed here — never raw
 // prompt/output content, never the agent display name. Nothing is transmitted.
 //
 // ANTHROPIC_API_KEY from env (or --api-key, discouraged).
@@ -39,7 +39,7 @@ function die(msg, code = 1) { process.stderr.write(`error: ${msg}\n`); process.e
 function info(msg) { process.stdout.write(`[wma-agents] ${msg}\n`); }
 
 // Feature aggregation lives in src/typology-features.js (shared with the Watch
-// daemon so both CLI snapshot and continuous upload use the same
+// daemon so both CLI snapshot and continuous upload use the same Containment
 // extraction). The rest of this file is just CLI presentation.
 
 async function main() {
@@ -76,7 +76,7 @@ async function main() {
 if (asJson) { process.stdout.write(JSON.stringify(results, null, 2) + '\n'); return; }
 
 info(`discovered ${results.length} agent(s) - classified from local logs in ${logDir}`);
-info(`
+info(`Containment: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
 process.stdout.write('\n');
 for (const r of results) {
 const mods = (r.modifiers && r.modifiers.length) ? ` [+${r.modifiers.join(',')}]` : '';
package/scripts/fetch-anthropic.js
CHANGED
@@ -14,7 +14,7 @@
 // by the stable Anthropic event id), appends them to the NDJSON, and — with
 // --upload — anonymizes the new window and ships signals to Fortress. This
 // automates the Watch leg of the WGS loop so Guardian gets fresh data with
-// no manual step. The raw NDJSON always stays local (
+// no manual step. The raw NDJSON always stays local (Containment).
 //
 // API key from --api-key or env ANTHROPIC_API_KEY.
 // --upload also needs: WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT.
@@ -213,7 +213,7 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
 // up. `windowMs` bounds discovery of NEW sessions, but sessions we're ALREADY
 // tracking are re-fetched regardless of age, so a long-running (>window) session
 // never drops out of capture. `sendNames`: include the human agent name in the
-// Fortress display_name (opt-in); default sends the agent id only (
+// Fortress display_name (opt-in); default sends the agent id only (Containment).
 async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, windowMs, uploadCtx, sendNames }) {
 let agents = await resolveAgents();
 const seenIds = new Set(); // stable Anthropic event ids already captured
@@ -284,7 +284,7 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
 try {
 // Compute the agent's typology from its CUMULATIVE local logs and
 // thread the prior across cycles so the state machine refines toward
-// stable (
+// stable (Containment: features = counts/categories only, no raw content).
 let classification;
 try {
 const features = buildFeatures(await aggregate(logDir, ag.agentId));
@@ -365,7 +365,12 @@ async function main() {
 // Discovery window for NEW sessions (default 7d, configurable). Sessions we
 // already track are re-fetched regardless of age, so long-lived ones don't drop.
 const windowMs = parseDurationMs(args['discovery-since'], 7 * 24 * 3600_000);
-
+// display_name on the Fortress payload: defaults to the human agent name
+// (UX-friendly — operators identify agents by name in the dashboard). The
+// name is sanitized via cleanLabel() so log/payload injection is impossible.
+// Use --no-send-agent-names to opt OUT (sends the agent_id instead) for
+// setups where the agent name itself is considered sensitive metadata.
+const sendNames = args['no-send-agent-names'] !== true;
 
 let resolveAgents;
 if (allAgents) {
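The hunk above describes `display_name` defaulting to a sanitized agent name, with `--no-send-agent-names` as the opt-out. The package's own `cleanLabel()` is not shown in this diff, so the sketch below uses a hypothetical stand-in that strips control characters. The `resolveDisplayName` helper, the `name` field, and the 64-character cap are assumptions made for illustration.

```javascript
// Hypothetical stand-in for cleanLabel(); the real implementation is not in this diff.
function cleanLabel(name, maxLen = 64) {
  return String(name ?? '')
    .replace(/[\u0000-\u001f\u007f-\u009f]/g, '') // drop C0/C1 control characters
    .trim()
    .slice(0, maxLen); // assumed length cap
}

// Uses the same flag test as the hunk: names are sent unless the opt-out was passed.
function resolveDisplayName(args, agent) {
  const sendNames = args['no-send-agent-names'] !== true;
  const label = sendNames ? cleanLabel(agent.name) : '';
  return label || agent.agentId; // fall back to the id when there is no usable name
}

console.log(resolveDisplayName({}, { agentId: 'agent_01', name: 'Billing\n bot' }));
console.log(resolveDisplayName({ 'no-send-agent-names': true }, { agentId: 'agent_01', name: 'Billing bot' }));
```

Falling back to the agent id when the cleaned name is empty keeps the routing identifier non-empty in both modes.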
package/scripts/service.js
CHANGED
@@ -17,7 +17,7 @@
 // environment) into a protected env file (~/.watchmyagents/env, chmod 600) that
 // the service loads at runtime. Required env at install time:
 // ANTHROPIC_API_KEY, WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT
-// Raw logs stay local (
+// Raw logs stay local (Containment); only anonymized signals are uploaded.
 
 import os from 'node:os';
 import { mkdirSync, writeFileSync, rmSync, existsSync, chmodSync } from 'node:fs';
package/scripts/upload-fortress.js
CHANGED
@@ -4,7 +4,7 @@
 //
 // Composable with the rest of the SDK:
 // wma-fetch → ./watchmyagents-logs/<agent_id>/<date>.ndjson (local capture)
-// wma-anonymize → signals payload (
+// wma-anonymize → signals payload (Containment: no raw content)
 // wma-upload-fortress → POST signals to https://<project>.supabase.co/functions/v1/ingest-signals
 //
 // Usage:
package/src/typology-features.js
CHANGED
@@ -1,9 +1,9 @@
-// Shared local-log feature extraction for the typology classifier (
+// Shared local-log feature extraction for the typology classifier (Containment).
 // Both wma-agents (CLI snapshot) and the Watch daemon (continuous upload) use
 // this to derive the anonymized behavioural FEATURE VECTOR from local NDJSON
 // logs, then feed it to classifyAgentType() in ./typology.js.
 //
-//
+// Containment invariant: only `action_type` and `tool_name` are read from each log
 // line — the raw payload fields (input/output/content/error/thinking) are NEVER
 // touched here, so no raw content can ever enter a feature.
 
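The header comment in this hunk states that only `action_type` and `tool_name` are read from each log line. A minimal sketch of that idea follows. The category rules, the `extractFractions` name, the `tool_use` value, and the choice of denominator (all counted events) are assumptions, since the package's `CATEGORY_RULES` and aggregation code are not visible in this diff.

```javascript
// Hypothetical category rules; the package's CATEGORY_RULES are not shown here.
const RULES = [
  ['f_code', /bash|exec|python/],
  ['f_browser', /browser|navigate/],
  ['f_search', /search/],
  ['f_file', /file|read|write/],
];

function extractFractions(ndjson) {
  const counts = {};
  let nEvents = 0;
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue;
    let event;
    try { event = JSON.parse(line); } catch { continue; } // skip malformed lines
    // Only these two fields are destructured; payload fields are never read.
    const { action_type: actionType, tool_name: toolName } = event;
    if (typeof actionType !== 'string') continue;
    nEvents += 1;
    if (typeof toolName !== 'string') continue;
    const hit = RULES.find(([, re]) => re.test(toolName.toLowerCase()));
    if (hit) counts[hit[0]] = (counts[hit[0]] || 0) + 1;
  }
  // Assumed denominator: every counted event, so fractions stay within [0, 1].
  const features = { n_events: nEvents };
  for (const [key] of RULES) features[key] = nEvents ? (counts[key] || 0) / nEvents : 0;
  return features;
}

const sample = [
  { action_type: 'tool_use', tool_name: 'bash', input: 'never read' },
  { action_type: 'tool_use', tool_name: 'web_search' },
].map((e) => JSON.stringify(e)).join('\n');
console.log(extractFractions(sample));
```

Because the output object holds only counts and ratios keyed by category, raw `input` or `content` strings cannot reach it even when they are present in the log line.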
@@ -37,7 +37,7 @@ const CATEGORY_RULES = [
 const DEPLOY_RE = /deploy|terraform|kubectl|helm|(^|_)release($|_)|ansible|pulumi|cloudformation/;
 
 // Features that the WMA NDJSON logs CANNOT reliably expose today (opaque tool
-// names, no behavioural signal, or raw content off-limits under
+// names, no behavioural signal, or raw content off-limits under Containment).
 // They default to 0/false; callers can surface this to the user.
 export const NON_DERIVABLE = [
 'f_database', 'f_email', 'f_payment', 'f_secret', 'f_memory',
package/src/typology-weights.json
CHANGED
@@ -1,5 +1,5 @@
 {
-"$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic.
+"$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Containment: all inputs are anonymized behavioural fractions/flags only.",
 "version": "0.1.0",
 "updated_at": "2026-05-29T00:00:00Z",
 
@@ -42,7 +42,7 @@
 },
 
 "features": {
-"$comment": "Canonical anonymized feature keys (
+"$comment": "Canonical anonymized feature keys (Containment). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
 "fractions": ["f_code", "f_browser", "f_database", "f_http", "f_email", "f_payment", "f_secret", "f_search", "f_memory", "f_handoff", "f_user_msg", "f_file"],
 "flags": ["flag_deploy", "flag_internal_sys", "flag_on_behalf"],
 "aux": ["aux_autonomy", "aux_untrusted", "aux_sensitive"]
package/src/typology.js
CHANGED
@@ -8,7 +8,7 @@
 // Why behaviour, not config: Anthropic Managed Agents expose their tools as an
 // opaque bundle (`agent_toolset_20260401`), so static config can't tell a
 // researcher from a coder. We classify from anonymized behavioural signals
-// (
+// (Containment): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
 // and aux ratios (aux_*). NEVER raw content — no prompts, no outputs, no names.
 //
 // ──────────────────────────────────────────────────────────────────────────
@@ -23,7 +23,7 @@
 // ──────────────────────────────────────────────────────────────────────────
 //
 // INVARIANTS enforced here:
-// 1.
+// 1. Containment — inputs are anonymized fractions/flags/aux ONLY.
 // 2. Weights + thresholds come from config (typology-weights.json), never
 // hardcoded in the logic below.
 // 3. No easy downgrade — moving to a LESS strict template needs a raised
@@ -76,7 +76,7 @@ function strictnessOf(cfg, type) {
  * Build the canonical feature vector from a loose features object.
  * Only the schema-legal keys are kept; everything is coerced to a number and
  * clamped to [0,1] (the schema requires every feature_vector value in [0,1]).
- * Missing features default to 0 —
+ * Missing features default to 0 — Containment: an absent signal is "not observed",
  * never inferred from content.
  */
 function normalizeFeatures(cfg, features) {
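The doc comment in this hunk describes the normalization contract: keep only schema-legal keys, coerce to a number, clamp to [0,1], and default missing values to 0. The body of `normalizeFeatures` is not part of this diff, so the following is only a sketch of that contract under a hypothetical name, `normalizeSketch`, using the key lists from the `features` block of `typology-weights.json`.

```javascript
// Key lists copied from the "features" block of typology-weights.json in this diff.
const FEATURE_KEYS = {
  fractions: ['f_code', 'f_browser', 'f_database', 'f_http', 'f_email', 'f_payment',
    'f_secret', 'f_search', 'f_memory', 'f_handoff', 'f_user_msg', 'f_file'],
  flags: ['flag_deploy', 'flag_internal_sys', 'flag_on_behalf'],
  aux: ['aux_autonomy', 'aux_untrusted', 'aux_sensitive'],
};

const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Sketch of the documented contract, not the package's implementation.
function normalizeSketch(keys, features = {}) {
  const out = {};
  for (const key of [...keys.fractions, ...keys.flags, ...keys.aux]) {
    const n = Number(features[key]); // true -> 1, false -> 0, undefined -> NaN
    out[key] = Number.isFinite(n) ? clamp01(n) : 0; // missing or invalid -> 0
  }
  return out;
}

console.log(normalizeSketch(FEATURE_KEYS, { f_code: 1.7, flag_deploy: true, prompt: 'dropped' }));
```

Iterating over the allowed keys, rather than over the input object, is what guarantees that an unexpected field such as `prompt` never reaches the output.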
@@ -121,7 +121,7 @@ function rankTypes(cfg, fv) {
  * classifyAgentType(features[, prior][, opts]) → object conforming EXACTLY to
  * agent-classification.schema.json.
  *
- * @param {object} features Anonymized behavioural signals (
+ * @param {object} features Anonymized behavioural signals (Containment):
  * agent_id {string} pass-through identifier (no content)
  * f_code,f_browser,… {number} per-category FRACTIONS in [0,1]
  * flag_deploy,… {0|1|bool} local discriminator flags (no content)