watchmyagents 0.9.2 → 0.9.4

package/README.md CHANGED
@@ -129,7 +129,7 @@ Logs land in `./watchmyagents-logs/<agent_id>/<date>.ndjson` (file mode `0600`,
 
  ### `wma-anonymize` — preview what would leave your machine
 
- Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Modèle C compliance and to test the format.
+ Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Containment compliance and to test the format.
 
  ```bash
  export WMA_SIGNALS_SALT="$(node -e 'console.log(require("crypto").randomBytes(16).toString("hex"))')"
@@ -155,7 +155,7 @@ wma-upload-fortress --agent-id agent_01ABC... [--display-name "My agent"]
  wma-upload-fortress --agent-id agent_xxx --dry-run
  ```
 
- **What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output) **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the agent id and only carries the human-readable agent name if you opt in (`wma-fetch --watch --upload --send-agent-names`).
+ **What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output), the agent's **`classification`** when the daemon has it (`{agent_type, confidence, stage}` — anonymized metadata, never raw content), **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the **human-readable agent name** (sanitized to strip control chars) for UX in the dashboard. Pass `--no-send-agent-names` to keep it pseudonymized (sends the agent id instead) if your agent names themselves carry sensitive client/project info.
  **What is NOT sent:** raw prompts, raw URLs/commands/queries, raw agent responses, raw error messages. All payload content stays on your machine.
 
  The endpoint auto-registers the agent on the first upload if it doesn't exist in Fortress yet — no manual onboarding needed for new agents.
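
Reviewer sketch (not part of the package): the README hunk above says `display_name` defaults to the sanitized agent name and falls back to the agent id when names are withheld. The snippet below illustrates one way that selection could work. `sanitizeLabel` and `pickDisplayName` are hypothetical names, the 64-character cap is an assumption, and the package's real `cleanLabel()` is not shown in this diff and may behave differently.

```javascript
// Hypothetical helper: strips C0/C1 control characters (including CR/LF),
// trims whitespace, and caps the length. Not the package's cleanLabel().
function sanitizeLabel(name, maxLen = 64) {
  return String(name)
    .replace(/[\u0000-\u001f\u007f-\u009f]/g, '')
    .trim()
    .slice(0, maxLen);
}

// Hypothetical helper: picks the value sent as display_name.
function pickDisplayName({ agentId, agentName, sendNames }) {
  if (!sendNames) return agentId; // pseudonymized: the id stands in for the name
  const cleaned = sanitizeLabel(agentName ?? '');
  return cleaned || agentId; // fall back to the id if nothing printable remains
}

console.log(pickDisplayName({ agentId: 'agent_x', agentName: 'Billing bot', sendNames: true }));
console.log(pickDisplayName({ agentId: 'agent_x', agentName: 'Billing bot', sendNames: false }));
```

Falling back to the id when the cleaned name is empty keeps the routing field non-empty; whether the real implementation does this is not visible in the diff.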
@@ -174,7 +174,7 @@ Outputs sections aligned with security audit needs: tokens summary, by-tool / by
 
  Lists every Managed Agent under your key and classifies each one's **typology**
  (one of 10 Guardian Core archetypes) from its OBSERVED behaviour in your local
- logs — which drives the cold-start Shield template. Modèle C: reads local logs
+ logs — which drives the cold-start Shield template. Containment: reads local logs
  only (tool-category fractions, never raw content) and transmits nothing.
 
  ```bash
@@ -303,7 +303,7 @@ Decisions are logged to the same NDJSON stream as Watch (`action_type: shield_de
 
  - ✅ Watch SDK — Anthropic Managed Agents post-hoc fetch + local audit
  - ✅ Shield SDK — real-time enforcement (interrupt mode + tool_confirmation mode)
- - ✅ Anonymizer — produce signals payloads (Modèle C: no raw content leaves)
+ - ✅ Anonymizer — produce signals payloads (Containment: no raw content leaves)
  - ✅ Anonymized telemetry to WMA Fortress cloud (`wma-upload-fortress` in v0.5.0)
  - ✅ Guardian AI (cloud) — automatic policy suggestions from observed behavior
  - ✅ Fortress (cloud) — dashboard + human-in-the-loop validation queue
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "watchmyagents",
- "version": "0.9.2",
+ "version": "0.9.4",
  "description": "Security observability + real-time policy enforcement for AI agents. Local-first NDJSON capture with a continuous Watch daemon that auto-uploads anonymized signals, Shield CLI that blocks policy violations live (with policies pulled from Fortress cloud), anonymizer producing signals-only payloads, bidirectional sync with WatchMyAgents Fortress, and one-command install as an always-on launchd/systemd service — closing the recursive Watch→Guardian→Shield security loop.",
  "type": "module",
  "files": [
package/scripts/agents.js CHANGED
@@ -5,14 +5,14 @@
  // Usage:
  // wma-agents [list] [--log-dir ~/.watchmyagents/logs] [--json]
  //
- // Reads the local Watch logs (NEVER leaves the machine — Modèle C) and derives
+ // Reads the local Watch logs (NEVER leaves the machine — Containment) and derives
  // the anonymized behavioural FEATURE VECTOR per the typology spec:
  // per-tool-category FRACTIONS (f_*), boolean local flags (flag_*), aux ratios
  // (aux_*), and n_events. It then calls classifyAgentType() and prints the
  // schema-conformant result. With <50 events an agent is `generic` (cold start)
  // and refines as activity accumulates.
  //
- // Modèle C invariant: only counts/ratios/flags are computed here — never raw
+ // Containment invariant: only counts/ratios/flags are computed here — never raw
  // prompt/output content, never the agent display name. Nothing is transmitted.
  //
  // ANTHROPIC_API_KEY from env (or --api-key, discouraged).
@@ -39,7 +39,7 @@ function die(msg, code = 1) { process.stderr.write(`error: ${msg}\n`); process.e
  function info(msg) { process.stdout.write(`[wma-agents] ${msg}\n`); }
 
  // Feature aggregation lives in src/typology-features.js (shared with the Watch
- // daemon so both CLI snapshot and continuous upload use the same Modèle C
+ // daemon so both CLI snapshot and continuous upload use the same Containment
  // extraction). The rest of this file is just CLI presentation.
 
  async function main() {
@@ -76,7 +76,7 @@ async function main() {
  if (asJson) { process.stdout.write(JSON.stringify(results, null, 2) + '\n'); return; }
 
  info(`discovered ${results.length} agent(s) - classified from local logs in ${logDir}`);
- info(`Modele C: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
+ info(`Containment: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
  process.stdout.write('\n');
  for (const r of results) {
  const mods = (r.modifiers && r.modifiers.length) ? ` [+${r.modifiers.join(',')}]` : '';
@@ -14,7 +14,7 @@
  // by the stable Anthropic event id), appends them to the NDJSON, and — with
  // --upload — anonymizes the new window and ships signals to Fortress. This
  // automates the Watch leg of the WGS loop so Guardian gets fresh data with
- // no manual step. The raw NDJSON always stays local (Modèle C).
+ // no manual step. The raw NDJSON always stays local (Containment).
  //
  // API key from --api-key or env ANTHROPIC_API_KEY.
  // --upload also needs: WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT.
@@ -213,7 +213,7 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
  // up. `windowMs` bounds discovery of NEW sessions, but sessions we're ALREADY
  // tracking are re-fetched regardless of age, so a long-running (>window) session
  // never drops out of capture. `sendNames`: include the human agent name in the
- // Fortress display_name (opt-in); default sends the agent id only (Modèle C).
+ // Fortress display_name (opt-in); default sends the agent id only (Containment).
  async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, windowMs, uploadCtx, sendNames }) {
  let agents = await resolveAgents();
  const seenIds = new Set(); // stable Anthropic event ids already captured
@@ -284,7 +284,7 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
  try {
  // Compute the agent's typology from its CUMULATIVE local logs and
  // thread the prior across cycles so the state machine refines toward
- // stable (Modèle C: features = counts/categories only, no raw content).
+ // stable (Containment: features = counts/categories only, no raw content).
  let classification;
  try {
  const features = buildFeatures(await aggregate(logDir, ag.agentId));
@@ -365,7 +365,12 @@ async function main() {
  // Discovery window for NEW sessions (default 7d, configurable). Sessions we
  // already track are re-fetched regardless of age, so long-lived ones don't drop.
  const windowMs = parseDurationMs(args['discovery-since'], 7 * 24 * 3600_000);
- const sendNames = !!args['send-agent-names'];
+ // display_name on the Fortress payload: defaults to the human agent name
+ // (UX-friendly — operators identify agents by name in the dashboard). The
+ // name is sanitized via cleanLabel() so log/payload injection is impossible.
+ // Use --no-send-agent-names to opt OUT (sends the agent_id instead) for
+ // setups where the agent name itself is considered sensitive metadata.
+ const sendNames = args['no-send-agent-names'] !== true;
 
  let resolveAgents;
  if (allAgents) {
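
Reviewer sketch (not part of the package): the hunk above flips `sendNames` from opt-in to opt-out. The snippet below puts the two expressions side by side so the change in default is easy to see. It assumes the argument parser stores a bare flag as boolean `true` under its name; the parser itself is not shown in this diff, and `oldSendNames` / `newSendNames` are illustrative wrappers.

```javascript
// 0.9.2 expression: names are sent only when --send-agent-names is present.
const oldSendNames = (args) => !!args['send-agent-names'];

// 0.9.4 expression: names are sent unless --no-send-agent-names is exactly `true`.
const newSendNames = (args) => args['no-send-agent-names'] !== true;

// Assumed parser output shapes for three invocations.
const cases = {
  'no flags': {},
  '--send-agent-names': { 'send-agent-names': true },
  '--no-send-agent-names': { 'no-send-agent-names': true },
};

for (const [label, args] of Object.entries(cases)) {
  console.log(label, { old: oldSendNames(args), new: newSendNames(args) });
}
```

Two consequences follow directly from the expressions: an upgrade with no flag changes starts sending agent names, and because the new check is a strict comparison, a parser that yields the string `'true'` instead of the boolean would not trigger the opt-out.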
@@ -17,7 +17,7 @@
  // environment) into a protected env file (~/.watchmyagents/env, chmod 600) that
  // the service loads at runtime. Required env at install time:
  // ANTHROPIC_API_KEY, WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT
- // Raw logs stay local (Modèle C); only anonymized signals are uploaded.
+ // Raw logs stay local (Containment); only anonymized signals are uploaded.
 
  import os from 'node:os';
  import { mkdirSync, writeFileSync, rmSync, existsSync, chmodSync } from 'node:fs';
@@ -4,7 +4,7 @@
  //
  // Composable with the rest of the SDK:
  // wma-fetch → ./watchmyagents-logs/<agent_id>/<date>.ndjson (local capture)
- // wma-anonymize → signals payload (Modèle C: no raw content)
+ // wma-anonymize → signals payload (Containment: no raw content)
  // wma-upload-fortress → POST signals to https://<project>.supabase.co/functions/v1/ingest-signals
  //
  // Usage:
@@ -1,9 +1,9 @@
- // Shared local-log feature extraction for the typology classifier (Modèle C).
+ // Shared local-log feature extraction for the typology classifier (Containment).
  // Both wma-agents (CLI snapshot) and the Watch daemon (continuous upload) use
  // this to derive the anonymized behavioural FEATURE VECTOR from local NDJSON
  // logs, then feed it to classifyAgentType() in ./typology.js.
  //
- // Modèle C invariant: only `action_type` and `tool_name` are read from each log
+ // Containment invariant: only `action_type` and `tool_name` are read from each log
  // line — the raw payload fields (input/output/content/error/thinking) are NEVER
  // touched here, so no raw content can ever enter a feature.
 
@@ -37,7 +37,7 @@ const CATEGORY_RULES = [
  const DEPLOY_RE = /deploy|terraform|kubectl|helm|(^|_)release($|_)|ansible|pulumi|cloudformation/;
 
  // Features that the WMA NDJSON logs CANNOT reliably expose today (opaque tool
- // names, no behavioural signal, or raw content off-limits under Modèle C).
+ // names, no behavioural signal, or raw content off-limits under Containment).
  // They default to 0/false; callers can surface this to the user.
  export const NON_DERIVABLE = [
  'f_database', 'f_email', 'f_payment', 'f_secret', 'f_memory',
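
Reviewer sketch (not part of the package): the comments in this file describe deriving per-category fractions from local NDJSON while reading only `action_type` and `tool_name`. The snippet below is a minimal illustration of that idea. The `RULES` table, the `'tool_use'` action type, and the function name `fractionsFromNdjson` are assumptions made for the example; the package's real `CATEGORY_RULES` and aggregation code are not shown in this diff.

```javascript
// Hypothetical category rules, for illustration only.
const RULES = [
  ['f_code', /bash|exec|code/],
  ['f_search', /search/],
  ['f_file', /read|write|edit|file/],
];

function fractionsFromNdjson(ndjson) {
  const counts = {};
  let n = 0;
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue;
    let ev;
    try { ev = JSON.parse(line); } catch { continue; } // skip malformed lines
    if (!ev || typeof ev !== 'object') continue;
    // Only these two fields are pulled out; payload fields are never referenced.
    const { action_type: actionType, tool_name: toolName } = ev;
    if (actionType !== 'tool_use' || typeof toolName !== 'string') continue; // assumed action type
    n += 1;
    const rule = RULES.find(([, re]) => re.test(toolName.toLowerCase()));
    if (rule) counts[rule[0]] = (counts[rule[0]] || 0) + 1;
  }
  const features = { n_events: n };
  // Categories with no matching events stay at 0, mirroring the "default to 0" note.
  for (const [key] of RULES) features[key] = n ? (counts[key] || 0) / n : 0;
  return features;
}

const sample = [
  '{"action_type":"tool_use","tool_name":"bash","input":"rm -rf secret"}',
  '{"action_type":"tool_use","tool_name":"web_search","input":"private query"}',
  '{"action_type":"message","content":"hello"}',
].join('\n');
console.log(fractionsFromNdjson(sample));
```

Because the output is built solely from counts keyed by category name, nothing from `input`, `content`, or similar fields can appear in it.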
@@ -1,5 +1,5 @@
  {
- "$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Modèle C: all inputs are anonymized behavioural fractions/flags only.",
+ "$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Containment: all inputs are anonymized behavioural fractions/flags only.",
  "version": "0.1.0",
  "updated_at": "2026-05-29T00:00:00Z",
 
@@ -42,7 +42,7 @@
  },
 
  "features": {
- "$comment": "Canonical anonymized feature keys (Modèle C). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
+ "$comment": "Canonical anonymized feature keys (Containment). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
  "fractions": ["f_code", "f_browser", "f_database", "f_http", "f_email", "f_payment", "f_secret", "f_search", "f_memory", "f_handoff", "f_user_msg", "f_file"],
  "flags": ["flag_deploy", "flag_internal_sys", "flag_on_behalf"],
  "aux": ["aux_autonomy", "aux_untrusted", "aux_sensitive"]
package/src/typology.js CHANGED
@@ -8,7 +8,7 @@
  // Why behaviour, not config: Anthropic Managed Agents expose their tools as an
  // opaque bundle (`agent_toolset_20260401`), so static config can't tell a
  // researcher from a coder. We classify from anonymized behavioural signals
- // (Modèle C): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
+ // (Containment): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
  // and aux ratios (aux_*). NEVER raw content — no prompts, no outputs, no names.
  //
  // ──────────────────────────────────────────────────────────────────────────
@@ -23,7 +23,7 @@
  // ──────────────────────────────────────────────────────────────────────────
  //
  // INVARIANTS enforced here:
- // 1. Modèle C — inputs are anonymized fractions/flags/aux ONLY.
+ // 1. Containment — inputs are anonymized fractions/flags/aux ONLY.
  // 2. Weights + thresholds come from config (typology-weights.json), never
  // hardcoded in the logic below.
  // 3. No easy downgrade — moving to a LESS strict template needs a raised
@@ -76,7 +76,7 @@ function strictnessOf(cfg, type) {
  * Build the canonical feature vector from a loose features object.
  * Only the schema-legal keys are kept; everything is coerced to a number and
  * clamped to [0,1] (the schema requires every feature_vector value in [0,1]).
- * Missing features default to 0 — Modèle C: an absent signal is "not observed",
+ * Missing features default to 0 — Containment: an absent signal is "not observed",
  * never inferred from content.
  */
  function normalizeFeatures(cfg, features) {
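
Reviewer sketch (not part of the package): the doc comment above describes keeping only schema-legal keys, coercing each value to a number, clamping to [0,1], and defaulting missing features to 0. The snippet below illustrates those four steps. The key list is copied from the `features` block of the weights config shown earlier in this diff; the function name `normalizeSketch` is hypothetical, and the real `normalizeFeatures(cfg, features)` body is not visible here and may differ, for instance in how it treats non-finite values.

```javascript
// Keys taken from the "features" section of the weights config in this diff.
const FEATURE_KEYS = [
  'f_code', 'f_browser', 'f_database', 'f_http', 'f_email', 'f_payment',
  'f_secret', 'f_search', 'f_memory', 'f_handoff', 'f_user_msg', 'f_file',
  'flag_deploy', 'flag_internal_sys', 'flag_on_behalf',
  'aux_autonomy', 'aux_untrusted', 'aux_sensitive',
];

function normalizeSketch(features = {}) {
  const out = {};
  for (const key of FEATURE_KEYS) {
    const n = Number(features[key]); // true -> 1, undefined -> NaN
    // Non-finite values (missing, NaN, Infinity) are treated as "not observed" in this sketch.
    out[key] = Number.isFinite(n) ? Math.min(1, Math.max(0, n)) : 0;
  }
  return out; // keys outside FEATURE_KEYS are dropped
}

console.log(normalizeSketch({ f_code: 1.7, f_http: -2, flag_deploy: true, raw_prompt: 'ignored' }));
```

Iterating over the allowed keys, instead of over the input object, is what guarantees that an unexpected field such as `raw_prompt` can never reach the feature vector.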
@@ -121,7 +121,7 @@ function rankTypes(cfg, fv) {
  * classifyAgentType(features[, prior][, opts]) → object conforming EXACTLY to
  * agent-classification.schema.json.
  *
- * @param {object} features Anonymized behavioural signals (Modèle C):
+ * @param {object} features Anonymized behavioural signals (Containment):
  * agent_id {string} pass-through identifier (no content)
  * f_code,f_browser,… {number} per-category FRACTIONS in [0,1]
  * flag_deploy,… {0|1|bool} local discriminator flags (no content)