npm - watchmyagents - Versions diffs - 0.9.2 → 0.9.4 - Mend

watchmyagents 0.9.2 → 0.9.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +4 -4
package/package.json +1 -1
package/scripts/agents.js +4 -4
package/scripts/fetch-anthropic.js +9 -4
package/scripts/service.js +1 -1
package/scripts/upload-fortress.js +1 -1
package/src/typology-features.js +3 -3
package/src/typology-weights.json +2 -2
package/src/typology.js +4 -4

package/README.md CHANGED

@@ -129,7 +129,7 @@ Logs land in `./watchmyagents-logs/<agent_id>/<date>.ndjson` (file mode `0600`,
 ### `wma-anonymize` — preview what would leave your machine
-Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Modèle C compliance and to test the format.
+Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Containment compliance and to test the format.
 ```bash
 export WMA_SIGNALS_SALT="$(node -e 'console.log(require("crypto").randomBytes(16).toString("hex"))')"
@@ -155,7 +155,7 @@ wma-upload-fortress --agent-id agent_01ABC... [--display-name "My agent"]
 wma-upload-fortress --agent-id agent_xxx --dry-run
 ```
-**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output) **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the agent id and only carries the human-readable agent name if you opt in (`wma-fetch --watch --upload --send-agent-names`).
+**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output), the agent's **`classification`** when the daemon has it (`{agent_type, confidence, stage}` — anonymized metadata, never raw content), **plus two routing identifiers**: your `anthropic_agent_id` and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the **human-readable agent name** (sanitized to strip control chars) for UX in the dashboard — pass `--no-send-agent-names` to keep it pseudonymized (sends the agent id instead) if your agent names themselves carry sensitive client/project info.
 **What is NOT sent:** raw prompts, raw URLs/commands/queries, raw agent responses, raw error messages. All payload content stays on your machine.
 The endpoint auto-registers the agent on the first upload if it doesn't exist in Fortress yet — no manual onboarding needed for new agents.
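
The new README text says `display_name` now defaults to the human-readable agent name, "sanitized to strip control chars", and falls back to the agent id under `--no-send-agent-names`. The sanitizer itself is not part of this diff, so the following is only a minimal sketch of that behaviour. `stripControlChars` and `resolveDisplayName` are hypothetical names, and the package's real `cleanLabel()` may use different rules.

```javascript
// Hypothetical helper: NOT the package's cleanLabel(), whose body is not shown
// in this diff. Removes C0/C1 control characters (newlines, tabs, BEL, DEL, ...)
// and trims surrounding whitespace.
function stripControlChars(label) {
  return String(label).replace(/[\u0000-\u001f\u007f-\u009f]/g, '').trim();
}

// Hypothetical: choose the display_name sent alongside the signals payload.
//   sendNames === true  -> sanitized human-readable name (agent id if it ends up empty)
//   sendNames === false -> the agent id, so the name stays pseudonymized
function resolveDisplayName({ agentId, agentName }, sendNames) {
  if (!sendNames) return agentId;
  const cleaned = stripControlChars(agentName ?? '');
  return cleaned || agentId;
}

console.log(resolveDisplayName({ agentId: 'agent_01', agentName: 'Billing\nbot\u0007' }, true));
console.log(resolveDisplayName({ agentId: 'agent_01', agentName: 'Billing bot' }, false));
```

Falling back to the agent id when the cleaned name is empty is a design choice of this sketch, not something the README states.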
@@ -174,7 +174,7 @@ Outputs sections aligned with security audit needs: tokens summary, by-tool / by
 Lists every Managed Agent under your key and classifies each one's **typology**
 (one of 10 Guardian Core archetypes) from its OBSERVED behaviour in your local
-logs — which drives the cold-start Shield template. Modèle C: reads local logs
+logs — which drives the cold-start Shield template. Containment: reads local logs
 only (tool-category fractions, never raw content) and transmits nothing.
 ```bash
@@ -303,7 +303,7 @@ Decisions are logged to the same NDJSON stream as Watch (`action_type: shield_de
 - ✅ Watch SDK — Anthropic Managed Agents post-hoc fetch + local audit
 - ✅ Shield SDK — real-time enforcement (interrupt mode + tool_confirmation mode)
-- ✅ Anonymizer — produce signals payloads (Modèle C: no raw content leaves)
+- ✅ Anonymizer — produce signals payloads (Containment: no raw content leaves)
 - ✅ Anonymized telemetry to WMA Fortress cloud (`wma-upload-fortress` in v0.5.0)
 - ✅ Guardian AI (cloud) — automatic policy suggestions from observed behavior
 - ✅ Fortress (cloud) — dashboard + human-in-the-loop validation queue

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "watchmyagents",
-  "version": "0.9.2",
+  "version": "0.9.4",
   "description": "Security observability + real-time policy enforcement for AI agents. Local-first NDJSON capture with a continuous Watch daemon that auto-uploads anonymized signals, Shield CLI that blocks policy violations live (with policies pulled from Fortress cloud), anonymizer producing signals-only payloads, bidirectional sync with WatchMyAgents Fortress, and one-command install as an always-on launchd/systemd service — closing the recursive Watch→Guardian→Shield security loop.",
   "type": "module",
   "files": [

package/scripts/agents.js CHANGED

@@ -5,14 +5,14 @@
 // Usage:
 //   wma-agents [list] [--log-dir ~/.watchmyagents/logs] [--json]
 //
-// Reads the local Watch logs (NEVER leaves the machine — Modèle C) and derives
+// Reads the local Watch logs (NEVER leaves the machine — Containment) and derives
 // the anonymized behavioural FEATURE VECTOR per the typology spec:
 //   per-tool-category FRACTIONS (f_*), boolean local flags (flag_*), aux ratios
 //   (aux_*), and n_events. It then calls classifyAgentType() and prints the
 //   schema-conformant result. With <50 events an agent is `generic` (cold start)
 //   and refines as activity accumulates.
 //
-// Modèle C invariant: only counts/ratios/flags are computed here — never raw
+// Containment invariant: only counts/ratios/flags are computed here — never raw
 // prompt/output content, never the agent display name. Nothing is transmitted.
 //
 // ANTHROPIC_API_KEY from env (or --api-key, discouraged).
@@ -39,7 +39,7 @@ function die(msg, code = 1) { process.stderr.write(`error: ${msg}\n`); process.e
 function info(msg) { process.stdout.write(`[wma-agents] ${msg}\n`); }
 // Feature aggregation lives in src/typology-features.js (shared with the Watch
-// daemon so both CLI snapshot and continuous upload use the same Modèle C
+// daemon so both CLI snapshot and continuous upload use the same Containment
 // extraction). The rest of this file is just CLI presentation.
 async function main() {
@@ -76,7 +76,7 @@ async function main() {
   if (asJson) { process.stdout.write(JSON.stringify(results, null, 2) + '\n'); return; }
   info(`discovered ${results.length} agent(s) - classified from local logs in ${logDir}`);
-  info(`Modele C: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
+  info(`Containment: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
   process.stdout.write('\n');
   for (const r of results) {
     const mods = (r.modifiers && r.modifiers.length) ? ` [+${r.modifiers.join(',')}]` : '';

package/scripts/fetch-anthropic.js CHANGED

@@ -14,7 +14,7 @@
 //     by the stable Anthropic event id), appends them to the NDJSON, and — with
 //     --upload — anonymizes the new window and ships signals to Fortress. This
 //     automates the Watch leg of the WGS loop so Guardian gets fresh data with
-//     no manual step. The raw NDJSON always stays local (Modèle C).
+//     no manual step. The raw NDJSON always stays local (Containment).
 //
 // API key from --api-key or env ANTHROPIC_API_KEY.
 // --upload also needs: WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT.
@@ -213,7 +213,7 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
 // up. `windowMs` bounds discovery of NEW sessions, but sessions we're ALREADY
 // tracking are re-fetched regardless of age, so a long-running (>window) session
 // never drops out of capture. `sendNames`: include the human agent name in the
-// Fortress display_name (opt-in); default sends the agent id only (Modèle C).
+// Fortress display_name (opt-in); default sends the agent id only (Containment).
 async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, windowMs, uploadCtx, sendNames }) {
   let agents = await resolveAgents();
   const seenIds = new Set();     // stable Anthropic event ids already captured
@@ -284,7 +284,7 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
         try {
           // Compute the agent's typology from its CUMULATIVE local logs and
           // thread the prior across cycles so the state machine refines toward
-          // stable (Modèle C: features = counts/categories only, no raw content).
+          // stable (Containment: features = counts/categories only, no raw content).
           let classification;
           try {
             const features = buildFeatures(await aggregate(logDir, ag.agentId));
@@ -365,7 +365,12 @@ async function main() {
     // Discovery window for NEW sessions (default 7d, configurable). Sessions we
     // already track are re-fetched regardless of age, so long-lived ones don't drop.
     const windowMs = parseDurationMs(args['discovery-since'], 7 * 24 * 3600_000);
-    const sendNames = !!args['send-agent-names'];
+    // display_name on the Fortress payload: defaults to the human agent name
+    // (UX-friendly — operators identify agents by name in the dashboard). The
+    // name is sanitized via cleanLabel() so log/payload injection is impossible.
+    // Use --no-send-agent-names to opt OUT (sends the agent_id instead) for
+    // setups where the agent name itself is considered sensitive metadata.
+    const sendNames = args['no-send-agent-names'] !== true;
     let resolveAgents;
     if (allAgents) {
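
The last hunk above flips the default for agent names from opt-in to opt-out. The sketch below puts the two expressions side by side to make the behavioural difference concrete. It assumes `args` is a plain object in which a flag that is present parses to the boolean `true`; the package's actual argument parser is not shown in this diff. Note that the comment above `runWatch` in the same file still describes the name as opt-in.

```javascript
// 0.9.2 expression: names are sent only when --send-agent-names is passed.
function sendNamesBefore(args) {
  return !!args['send-agent-names'];
}

// 0.9.4 expression: names are sent unless --no-send-agent-names is passed.
function sendNamesAfter(args) {
  return args['no-send-agent-names'] !== true;
}

// `cases` is illustrative input, assuming a present flag parses to boolean true.
const cases = [
  {},
  { 'send-agent-names': true },
  { 'no-send-agent-names': true },
];
for (const args of cases) {
  console.log(JSON.stringify(args), sendNamesBefore(args), sendNamesAfter(args));
}
```

Because the new check is a strict comparison against `true`, a parser that yields the string `'true'` (for example from `--no-send-agent-names=true`) would still send names; whether that can happen depends on the parser in use.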

package/scripts/service.js CHANGED

@@ -17,7 +17,7 @@
 // environment) into a protected env file (~/.watchmyagents/env, chmod 600) that
 // the service loads at runtime. Required env at install time:
 //   ANTHROPIC_API_KEY, WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT
-// Raw logs stay local (Modèle C); only anonymized signals are uploaded.
+// Raw logs stay local (Containment); only anonymized signals are uploaded.
 import os from 'node:os';
 import { mkdirSync, writeFileSync, rmSync, existsSync, chmodSync } from 'node:fs';

package/scripts/upload-fortress.js CHANGED

@@ -4,7 +4,7 @@
 //
 // Composable with the rest of the SDK:
 //   wma-fetch  →  ./watchmyagents-logs/<agent_id>/<date>.ndjson   (local capture)
-//   wma-anonymize  →  signals payload (Modèle C: no raw content)
+//   wma-anonymize  →  signals payload (Containment: no raw content)
 //   wma-upload-fortress  →  POST signals to https://<project>.supabase.co/functions/v1/ingest-signals
 //
 // Usage:

package/src/typology-features.js CHANGED

@@ -1,9 +1,9 @@
-// Shared local-log feature extraction for the typology classifier (Modèle C).
+// Shared local-log feature extraction for the typology classifier (Containment).
 // Both wma-agents (CLI snapshot) and the Watch daemon (continuous upload) use
 // this to derive the anonymized behavioural FEATURE VECTOR from local NDJSON
 // logs, then feed it to classifyAgentType() in ./typology.js.
 //
-// Modèle C invariant: only `action_type` and `tool_name` are read from each log
+// Containment invariant: only `action_type` and `tool_name` are read from each log
 // line — the raw payload fields (input/output/content/error/thinking) are NEVER
 // touched here, so no raw content can ever enter a feature.
@@ -37,7 +37,7 @@ const CATEGORY_RULES = [
 const DEPLOY_RE = /deploy|terraform|kubectl|helm|(^|_)release($|_)|ansible|pulumi|cloudformation/;
 // Features that the WMA NDJSON logs CANNOT reliably expose today (opaque tool
-// names, no behavioural signal, or raw content off-limits under Modèle C).
+// names, no behavioural signal, or raw content off-limits under Containment).
 // They default to 0/false; callers can surface this to the user.
 export const NON_DERIVABLE = [
   'f_database', 'f_email', 'f_payment', 'f_secret', 'f_memory',

package/src/typology-weights.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Modèle C: all inputs are anonymized behavioural fractions/flags only.",
+  "$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Containment: all inputs are anonymized behavioural fractions/flags only.",
   "version": "0.1.0",
   "updated_at": "2026-05-29T00:00:00Z",
@@ -42,7 +42,7 @@
   },
   "features": {
-    "$comment": "Canonical anonymized feature keys (Modèle C). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
+    "$comment": "Canonical anonymized feature keys (Containment). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
     "fractions": ["f_code", "f_browser", "f_database", "f_http", "f_email", "f_payment", "f_secret", "f_search", "f_memory", "f_handoff", "f_user_msg", "f_file"],
     "flags": ["flag_deploy", "flag_internal_sys", "flag_on_behalf"],
     "aux": ["aux_autonomy", "aux_untrusted", "aux_sensitive"]

package/src/typology.js CHANGED

@@ -8,7 +8,7 @@
 // Why behaviour, not config: Anthropic Managed Agents expose their tools as an
 // opaque bundle (`agent_toolset_20260401`), so static config can't tell a
 // researcher from a coder. We classify from anonymized behavioural signals
-// (Modèle C): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
+// (Containment): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
 // and aux ratios (aux_*). NEVER raw content — no prompts, no outputs, no names.
 //
 // ──────────────────────────────────────────────────────────────────────────
@@ -23,7 +23,7 @@
 // ──────────────────────────────────────────────────────────────────────────
 //
 // INVARIANTS enforced here:
-//   1. Modèle C — inputs are anonymized fractions/flags/aux ONLY.
+//   1. Containment — inputs are anonymized fractions/flags/aux ONLY.
 //   2. Weights + thresholds come from config (typology-weights.json), never
 //      hardcoded in the logic below.
 //   3. No easy downgrade — moving to a LESS strict template needs a raised
@@ -76,7 +76,7 @@ function strictnessOf(cfg, type) {
  * Build the canonical feature vector from a loose features object.
  * Only the schema-legal keys are kept; everything is coerced to a number and
  * clamped to [0,1] (the schema requires every feature_vector value in [0,1]).
- * Missing features default to 0 — Modèle C: an absent signal is "not observed",
+ * Missing features default to 0 — Containment: an absent signal is "not observed",
  * never inferred from content.
  */
 function normalizeFeatures(cfg, features) {
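
The doc comment above describes the normalization contract: keep only schema-legal keys, coerce each value to a number, clamp to [0,1], and default missing features to 0. The function body is outside this diff, so the following is a rough sketch of that contract rather than the package's code. `normalizeFeaturesSketch` is a hypothetical name and takes the key lists directly instead of the real `cfg` object; the key lists are copied from the `features` block of `typology-weights.json` shown earlier.

```javascript
// Key lists copied from the "features" block of typology-weights.json in this diff.
const FEATURE_KEYS = {
  fractions: ['f_code', 'f_browser', 'f_database', 'f_http', 'f_email', 'f_payment',
    'f_secret', 'f_search', 'f_memory', 'f_handoff', 'f_user_msg', 'f_file'],
  flags: ['flag_deploy', 'flag_internal_sys', 'flag_on_behalf'],
  aux: ['aux_autonomy', 'aux_untrusted', 'aux_sensitive'],
};

// Hypothetical stand-in for normalizeFeatures(); the real body is not shown here.
function normalizeFeaturesSketch(keys, features) {
  const out = {};
  for (const key of [...keys.fractions, ...keys.flags, ...keys.aux]) {
    // Number(true) is 1; Number(undefined) is NaN, which is treated as "not observed".
    const n = Number(features?.[key]);
    out[key] = Number.isFinite(n) ? Math.min(1, Math.max(0, n)) : 0;
  }
  // Keys outside the schema (for example raw content fields) never reach the output.
  return out;
}

const fv = normalizeFeaturesSketch(FEATURE_KEYS, {
  f_code: 0.7, f_http: 1.4, flag_deploy: true, prompt: 'raw text',
});
console.log(fv.f_code, fv.f_http, fv.flag_deploy, fv.f_email, 'prompt' in fv);
```

Iterating over the allowed keys, rather than over the input object, is what guarantees that an unexpected field such as `prompt` cannot leak into the feature vector.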
@@ -121,7 +121,7 @@ function rankTypes(cfg, fv) {
  * classifyAgentType(features[, prior][, opts]) → object conforming EXACTLY to
  * agent-classification.schema.json.
  *
- * @param {object} features          Anonymized behavioural signals (Modèle C):
+ * @param {object} features          Anonymized behavioural signals (Containment):
  *   agent_id            {string}    pass-through identifier (no content)
  *   f_code,f_browser,…  {number}    per-category FRACTIONS in [0,1]
  *   flag_deploy,…       {0|1|bool}  local discriminator flags (no content)