watchmyagents 0.9.3 → 1.0.0
- package/README.md +5 -5
- package/package.json +1 -1
- package/scripts/agents.js +4 -4
- package/scripts/fetch-anthropic.js +30 -5
- package/scripts/service.js +1 -1
- package/scripts/upload-fortress.js +14 -1
- package/src/logger.js +15 -2
- package/src/shield/decisions.js +1 -1
- package/src/sources/anthropic-managed.js +76 -1
- package/src/sources/contract.js +259 -0
- package/src/typology-features.js +3 -3
- package/src/typology-weights.json +2 -2
- package/src/typology.js +4 -4
package/README.md
CHANGED
|
@@ -129,7 +129,7 @@ Logs land in `./watchmyagents-logs/<agent_id>/<date>.ndjson` (file mode `0600`,
|
|
|
129
129
|
|
|
130
130
|
### `wma-anonymize` — preview what would leave your machine
|
|
131
131
|
|
|
132
|
-
Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify
|
|
132
|
+
Produces the anonymized signals payload (counts, latencies, salted IoC hashes, sequence histograms — no raw URLs/commands/prompts) that future WMA cloud features would ship. Useful to verify Containment compliance and to test the format.
|
|
133
133
|
|
|
134
134
|
```bash
|
|
135
135
|
export WMA_SIGNALS_SALT="$(node -e 'console.log(require("crypto").randomBytes(16).toString("hex"))')"
|
|
@@ -155,7 +155,7 @@ wma-upload-fortress --agent-id agent_01ABC... [--display-name "My agent"]
|
|
|
155
155
|
wma-upload-fortress --agent-id agent_xxx --dry-run
|
|
156
156
|
```
|
|
157
157
|
|
|
158
|
-
**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output), the agent's **`classification`** when the daemon has it (`{agent_type, confidence, stage}` — anonymized metadata, never raw content), **plus
|
|
158
|
+
**What is sent:** the anonymized signals payload (counts, latencies, salted IoC hashes, sequences — same as `wma-anonymize` output), the agent's **`classification`** when the daemon has it (`{agent_type, confidence, stage}` — anonymized metadata, never raw content), **plus the routing identifiers**: `provider` (e.g., `"anthropic-managed"` — added in v1.0 for the multi-framework SDK), `native_agent_id` (the canonical provider-agnostic field), `anthropic_agent_id` (kept for backwards compat with existing Fortress instances; will be dropped once Fortress migrates), `parent_agent_id` (`null` for root agents — populated for sub-agents detected via OpenAI Agents handoffs, CrewAI manager mode, Hermes Agent `spawn_subagent`, LangGraph sub-graphs), `composition_pattern` (`"solo" | "hierarchy" | "graph" | "peer"` — defaults to `"solo"` for Anthropic until thread-message detection lands), `enforcement_mode` (`"sync_confirm" | "sync_interrupt" | "detect_only"` — the strongest enforcement capability the Source provides; Fortress greys out Shield UI for `detect_only` agents to prevent UI/runtime mismatch), and a `display_name`. The agent id is required so Fortress can associate signals with the right agent; `display_name` defaults to the **human-readable agent name** (sanitized to strip control chars) for UX in the dashboard — pass `--no-send-agent-names` to keep it pseudonymized (sends the agent id instead) if your agent names themselves carry sensitive client/project info.
|
|
159
159
|
**What is NOT sent:** raw prompts, raw URLs/commands/queries, raw agent responses, raw error messages. All payload content stays on your machine.
|
|
160
160
|
|
|
161
161
|
The endpoint auto-registers the agent on the first upload if it doesn't exist in Fortress yet — no manual onboarding needed for new agents.
|
|
@@ -174,7 +174,7 @@ Outputs sections aligned with security audit needs: tokens summary, by-tool / by
|
|
|
174
174
|
|
|
175
175
|
Lists every Managed Agent under your key and classifies each one's **typology**
|
|
176
176
|
(one of 10 Guardian Core archetypes) from its OBSERVED behaviour in your local
|
|
177
|
-
logs — which drives the cold-start Shield template.
|
|
177
|
+
logs — which drives the cold-start Shield template. Containment: reads local logs
|
|
178
178
|
only (tool-category fractions, never raw content) and transmits nothing.
|
|
179
179
|
|
|
180
180
|
```bash
|
|
@@ -247,7 +247,7 @@ WatchMyAgents is built so that **your prompts and outputs never have to leave yo
|
|
|
247
247
|
|---|---|
|
|
248
248
|
| **Your machine** (`./watchmyagents-logs/`) | Full NDJSON with all prompts, tool inputs, agent outputs. `chmod 600` on every file. |
|
|
249
249
|
| **Anthropic API** | Where the agent runs. WMA pulls events via the public REST API only. |
|
|
250
|
-
| **WMA Fortress** (opt-in, only with `--upload` / `wma-upload-fortress` / `wma-shield --policies-source fortress`) | The **anonymized signals** payload (counts, timings, salted hashes, sequences) +
|
|
250
|
+
| **WMA Fortress** (opt-in, only with `--upload` / `wma-upload-fortress` / `wma-shield --policies-source fortress`) | The **anonymized signals** payload (counts, timings, salted hashes, sequences) + routing identifiers: `provider` (e.g. `"anthropic-managed"`), `native_agent_id`, `anthropic_agent_id` (legacy alias), and `display_name` (defaults to the agent id; the human agent name only with `--send-agent-names`). Shield enforcement **decisions** (hashed session/event/input fingerprints — never raw values). **Never** raw prompts, URLs, commands, or outputs. |
|
|
251
251
|
|
|
252
252
|
This is the "local-first" guarantee: **raw payloads never leave your machine.** Cloud upload is opt-in and carries only anonymized metadata + the agent id/name needed to route it.
|
|
253
253
|
|
|
@@ -303,7 +303,7 @@ Decisions are logged to the same NDJSON stream as Watch (`action_type: shield_de
|
|
|
303
303
|
|
|
304
304
|
- ✅ Watch SDK — Anthropic Managed Agents post-hoc fetch + local audit
|
|
305
305
|
- ✅ Shield SDK — real-time enforcement (interrupt mode + tool_confirmation mode)
|
|
306
|
-
- ✅ Anonymizer — produce signals payloads (
|
|
306
|
+
- ✅ Anonymizer — produce signals payloads (Containment: no raw content leaves)
|
|
307
307
|
- ✅ Anonymized telemetry to WMA Fortress cloud (`wma-upload-fortress` in v0.5.0)
|
|
308
308
|
- ✅ Guardian AI (cloud) — automatic policy suggestions from observed behavior
|
|
309
309
|
- ✅ Fortress (cloud) — dashboard + human-in-the-loop validation queue
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "watchmyagents",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "1.0.0",
|
|
4
4
|
"description": "Security observability + real-time policy enforcement for AI agents. Local-first NDJSON capture with a continuous Watch daemon that auto-uploads anonymized signals, Shield CLI that blocks policy violations live (with policies pulled from Fortress cloud), anonymizer producing signals-only payloads, bidirectional sync with WatchMyAgents Fortress, and one-command install as an always-on launchd/systemd service — closing the recursive Watch→Guardian→Shield security loop.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"files": [
|
package/scripts/agents.js
CHANGED
|
@@ -5,14 +5,14 @@
|
|
|
5
5
|
// Usage:
|
|
6
6
|
// wma-agents [list] [--log-dir ~/.watchmyagents/logs] [--json]
|
|
7
7
|
//
|
|
8
|
-
// Reads the local Watch logs (NEVER leaves the machine —
|
|
8
|
+
// Reads the local Watch logs (NEVER leaves the machine — Containment) and derives
|
|
9
9
|
// the anonymized behavioural FEATURE VECTOR per the typology spec:
|
|
10
10
|
// per-tool-category FRACTIONS (f_*), boolean local flags (flag_*), aux ratios
|
|
11
11
|
// (aux_*), and n_events. It then calls classifyAgentType() and prints the
|
|
12
12
|
// schema-conformant result. With <50 events an agent is `generic` (cold start)
|
|
13
13
|
// and refines as activity accumulates.
|
|
14
14
|
//
|
|
15
|
-
//
|
|
15
|
+
// Containment invariant: only counts/ratios/flags are computed here — never raw
|
|
16
16
|
// prompt/output content, never the agent display name. Nothing is transmitted.
|
|
17
17
|
//
|
|
18
18
|
// ANTHROPIC_API_KEY from env (or --api-key, discouraged).
|
|
@@ -39,7 +39,7 @@ function die(msg, code = 1) { process.stderr.write(`error: ${msg}\n`); process.e
|
|
|
39
39
|
function info(msg) { process.stdout.write(`[wma-agents] ${msg}\n`); }
|
|
40
40
|
|
|
41
41
|
// Feature aggregation lives in src/typology-features.js (shared with the Watch
|
|
42
|
-
// daemon so both CLI snapshot and continuous upload use the same
|
|
42
|
+
// daemon so both CLI snapshot and continuous upload use the same Containment
|
|
43
43
|
// extraction). The rest of this file is just CLI presentation.
|
|
44
44
|
|
|
45
45
|
async function main() {
|
|
@@ -76,7 +76,7 @@ async function main() {
|
|
|
76
76
|
if (asJson) { process.stdout.write(JSON.stringify(results, null, 2) + '\n'); return; }
|
|
77
77
|
|
|
78
78
|
info(`discovered ${results.length} agent(s) - classified from local logs in ${logDir}`);
|
|
79
|
-
info(`
|
|
79
|
+
info(`Containment: features below default to 0 (logs don't expose them): ${NON_DERIVABLE.join(', ')}`);
|
|
80
80
|
process.stdout.write('\n');
|
|
81
81
|
for (const r of results) {
|
|
82
82
|
const mods = (r.modifiers && r.modifiers.length) ? ` [+${r.modifiers.join(',')}]` : '';
|
package/scripts/fetch-anthropic.js
CHANGED
|
@@ -14,7 +14,7 @@
|
|
|
14
14
|
// by the stable Anthropic event id), appends them to the NDJSON, and — with
|
|
15
15
|
// --upload — anonymizes the new window and ships signals to Fortress. This
|
|
16
16
|
// automates the Watch leg of the WGS loop so Guardian gets fresh data with
|
|
17
|
-
// no manual step. The raw NDJSON always stays local (
|
|
17
|
+
// no manual step. The raw NDJSON always stays local (Containment).
|
|
18
18
|
//
|
|
19
19
|
// API key from --api-key or env ANTHROPIC_API_KEY.
|
|
20
20
|
// --upload also needs: WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT.
|
|
@@ -34,6 +34,7 @@ import { classifyAgentType } from '../src/typology.js';
|
|
|
34
34
|
import { aggregate, buildFeatures } from '../src/typology-features.js';
|
|
35
35
|
import {
|
|
36
36
|
getAgent, listAgents, listSessions, fetchSessionEntries, fetchRawEvents,
|
|
37
|
+
AnthropicManagedSource,
|
|
37
38
|
} from '../src/sources/anthropic-managed.js';
|
|
38
39
|
|
|
39
40
|
function parseArgs(argv) {
|
|
@@ -115,8 +116,32 @@ async function uploadSignals(uploadCtx, agentId, displayName, entries, classific
|
|
|
115
116
|
for (const e of entries) agg.add(e);
|
|
116
117
|
const sig = agg.finalize();
|
|
117
118
|
if (!sig.window_start || !sig.window_end) return null; // nothing datable to ship
|
|
119
|
+
// PR-C: derive the agent's composition pattern + parent from the
|
|
120
|
+
// observed entries. For Anthropic today, the Source yields solo/root
|
|
121
|
+
// agents — sub-agent detection from thread_message_* events lands
|
|
122
|
+
// with PR-D or a follow-up. Once a future adapter populates these on
|
|
123
|
+
// the events themselves, this carries them up to Fortress without
|
|
124
|
+
// any payload-shape change.
|
|
125
|
+
const firstWithHierarchy = entries.find((e) => e.parent_agent_id != null);
|
|
126
|
+
const parent_agent_id = firstWithHierarchy?.parent_agent_id ?? null;
|
|
127
|
+
const composition_pattern = firstWithHierarchy?.composition_pattern
|
|
128
|
+
|| entries.find((e) => e.composition_pattern && e.composition_pattern !== 'solo')?.composition_pattern
|
|
129
|
+
|| 'solo';
|
|
130
|
+
// PR-B: payload carries the canonical provider-agnostic identifiers
|
|
131
|
+
// (`provider` + `native_agent_id`) AND the legacy `anthropic_agent_id`
|
|
132
|
+
// so old Fortress instances still recognize the upload. Once the
|
|
133
|
+
// Lovable-deployed ingest-signals migrates, future SDK releases will
|
|
134
|
+
// stop emitting `anthropic_agent_id`.
|
|
135
|
+
// PR-D: enforcement_mode is read CANONICALLY from the Source's static
|
|
136
|
+
// declaration so it stays in sync with the actual capability of the
|
|
137
|
+
// adapter — never re-declared inline.
|
|
118
138
|
const body = JSON.stringify({
|
|
139
|
+
provider: AnthropicManagedSource.providerName,
|
|
140
|
+
native_agent_id: agentId,
|
|
119
141
|
anthropic_agent_id: agentId,
|
|
142
|
+
parent_agent_id,
|
|
143
|
+
composition_pattern,
|
|
144
|
+
enforcement_mode: AnthropicManagedSource.enforcementMode,
|
|
120
145
|
display_name: displayName,
|
|
121
146
|
window_start: sig.window_start,
|
|
122
147
|
window_end: sig.window_end,
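Reviewer note: the hierarchy derivation added in this hunk can be exercised in isolation. The sketch below restates that logic as a standalone function, assuming entries are plain objects with optional `parent_agent_id` and `composition_pattern` fields. The name `deriveHierarchy` is hypothetical and does not exist in the package.

```javascript
// Hypothetical helper: mirrors the parent/composition derivation in the hunk above.
function deriveHierarchy(entries) {
  // The first entry that names a parent decides the parent id; root agents never set one.
  const firstWithHierarchy = entries.find((e) => e.parent_agent_id != null);
  const parent_agent_id = firstWithHierarchy?.parent_agent_id ?? null;
  // Prefer the pattern on that entry, then any non-solo pattern, then 'solo'.
  const composition_pattern = firstWithHierarchy?.composition_pattern
    || entries.find((e) => e.composition_pattern && e.composition_pattern !== 'solo')?.composition_pattern
    || 'solo';
  return { parent_agent_id, composition_pattern };
}

// Made-up sample entries, for illustration only.
const solo = deriveHierarchy([{ id: 'a' }, { id: 'b', composition_pattern: 'solo' }]);
const child = deriveHierarchy([
  { id: 'a' },
  { id: 'b', parent_agent_id: 'agent_root', composition_pattern: 'hierarchy' },
]);
console.log(solo, child);
```

One behaviour worth noticing under these assumptions: an entry that carries a `parent_agent_id` but no `composition_pattern` still resolves to `'solo'` unless another entry reports a non-solo pattern.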
|
|
@@ -194,7 +219,7 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
|
|
|
194
219
|
}
|
|
195
220
|
const stats = tracker.stats().total;
|
|
196
221
|
await logger.write({
|
|
197
|
-
action_type: 'session_end',
|
|
222
|
+
action_type: 'session_end', provider: 'anthropic-managed', status: 'ok', model,
|
|
198
223
|
session_tokens: { input: stats.input, output: stats.output, cache_read: stats.cache_read, cache_creation: stats.cache_creation, total: stats.sum },
|
|
199
224
|
session_cost_usd: stats.cost_usd || null,
|
|
200
225
|
});
|
|
@@ -213,7 +238,7 @@ async function fetchOneShot({ apiKey, agentId, model, logDir, since, sessionId,
|
|
|
213
238
|
// up. `windowMs` bounds discovery of NEW sessions, but sessions we're ALREADY
|
|
214
239
|
// tracking are re-fetched regardless of age, so a long-running (>window) session
|
|
215
240
|
// never drops out of capture. `sendNames`: include the human agent name in the
|
|
216
|
-
// Fortress display_name (opt-in); default sends the agent id only (
|
|
241
|
+
// Fortress display_name (opt-in); default sends the agent id only (Containment).
|
|
217
242
|
async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, windowMs, uploadCtx, sendNames }) {
|
|
218
243
|
let agents = await resolveAgents();
|
|
219
244
|
const seenIds = new Set(); // stable Anthropic event ids already captured
|
|
@@ -284,7 +309,7 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
|
|
|
284
309
|
try {
|
|
285
310
|
// Compute the agent's typology from its CUMULATIVE local logs and
|
|
286
311
|
// thread the prior across cycles so the state machine refines toward
|
|
287
|
-
// stable (
|
|
312
|
+
// stable (Containment: features = counts/categories only, no raw content).
|
|
288
313
|
let classification;
|
|
289
314
|
try {
|
|
290
315
|
const features = buildFeatures(await aggregate(logDir, ag.agentId));
|
|
@@ -307,7 +332,7 @@ async function runWatch({ apiKey, resolveAgents, fleet, logDir, intervalMs, wind
|
|
|
307
332
|
for (const e of fresh) tracker.record(e);
|
|
308
333
|
const stats = tracker.stats().total;
|
|
309
334
|
await logger.write({
|
|
310
|
-
action_type: 'session_end',
|
|
335
|
+
action_type: 'session_end', provider: 'anthropic-managed', status: 'ok', model: ag.model,
|
|
311
336
|
session_tokens: { input: stats.input, output: stats.output, cache_read: stats.cache_read, cache_creation: stats.cache_creation, total: stats.sum },
|
|
312
337
|
session_cost_usd: stats.cost_usd || null,
|
|
313
338
|
});
|
package/scripts/service.js
CHANGED
|
@@ -17,7 +17,7 @@
|
|
|
17
17
|
// environment) into a protected env file (~/.watchmyagents/env, chmod 600) that
|
|
18
18
|
// the service loads at runtime. Required env at install time:
|
|
19
19
|
// ANTHROPIC_API_KEY, WMA_API_KEY, WMA_FORTRESS_BASE_URL, WMA_SIGNALS_SALT
|
|
20
|
-
// Raw logs stay local (
|
|
20
|
+
// Raw logs stay local (Containment); only anonymized signals are uploaded.
|
|
21
21
|
|
|
22
22
|
import os from 'node:os';
|
|
23
23
|
import { mkdirSync, writeFileSync, rmSync, existsSync, chmodSync } from 'node:fs';
|
package/scripts/upload-fortress.js
CHANGED
|
@@ -4,7 +4,7 @@
|
|
|
4
4
|
//
|
|
5
5
|
// Composable with the rest of the SDK:
|
|
6
6
|
// wma-fetch → ./watchmyagents-logs/<agent_id>/<date>.ndjson (local capture)
|
|
7
|
-
// wma-anonymize → signals payload (
|
|
7
|
+
// wma-anonymize → signals payload (Containment: no raw content)
|
|
8
8
|
// wma-upload-fortress → POST signals to https://<project>.supabase.co/functions/v1/ingest-signals
|
|
9
9
|
//
|
|
10
10
|
// Usage:
|
|
@@ -30,6 +30,7 @@ import { createReadStream } from 'node:fs';
|
|
|
30
30
|
import { createInterface } from 'node:readline';
|
|
31
31
|
import { SignalsAggregator } from '../src/anonymizer.js';
|
|
32
32
|
import { resolveFortressBase, fortressEndpoint } from '../src/fortress/url.js';
|
|
33
|
+
import { AnthropicManagedSource } from '../src/sources/anthropic-managed.js';
|
|
33
34
|
|
|
34
35
|
function parseArgs(argv) {
|
|
35
36
|
const out = {};
|
|
@@ -177,8 +178,20 @@ async function main() {
|
|
|
177
178
|
die('error: no entries had timestamps — nothing to upload');
|
|
178
179
|
}
|
|
179
180
|
|
|
181
|
+
// PR-B: provider-agnostic identifiers + legacy fallback (see fetch-anthropic.js).
|
|
182
|
+
// PR-C: ship the agent's hierarchy + composition pattern. wma-upload-fortress
|
|
183
|
+
// is a one-shot post-hoc tool — it has no per-entry context to derive
|
|
184
|
+
// hierarchy from, so it sends defaults (solo / null) until a future
|
|
185
|
+
// adapter writes those fields into the local NDJSON.
|
|
186
|
+
// PR-D: enforcement_mode read from the Source class so any change to
|
|
187
|
+
// the adapter's capability automatically reflects in the payload.
|
|
180
188
|
const body = {
|
|
189
|
+
provider: AnthropicManagedSource.providerName,
|
|
190
|
+
native_agent_id: agentId,
|
|
181
191
|
anthropic_agent_id: agentId,
|
|
192
|
+
parent_agent_id: null,
|
|
193
|
+
composition_pattern: 'solo',
|
|
194
|
+
enforcement_mode: AnthropicManagedSource.enforcementMode,
|
|
182
195
|
display_name: displayName,
|
|
183
196
|
window_start: signals.window_start,
|
|
184
197
|
window_end: signals.window_end,
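Reviewer note: both upload paths read `provider` and `enforcement_mode` from static fields on the Source class instead of repeating string literals. A minimal sketch of that pattern follows, assuming a stand-in class with the same two static fields. `ExampleSource` and `routingIdentifiers` are hypothetical names, not part of the package.

```javascript
// Stand-in for the real adapter class; only the two static fields matter here.
class ExampleSource {
  static providerName = 'anthropic-managed';
  static enforcementMode = 'sync_confirm';
}

// Hypothetical helper: builds the routing part of the upload body for a one-shot upload.
function routingIdentifiers(SourceClass, agentId, displayName) {
  return {
    provider: SourceClass.providerName,
    native_agent_id: agentId,
    anthropic_agent_id: agentId, // legacy alias, same value as native_agent_id
    parent_agent_id: null, // one-shot upload has no hierarchy context
    composition_pattern: 'solo',
    enforcement_mode: SourceClass.enforcementMode,
    display_name: displayName,
  };
}

const ids = routingIdentifiers(ExampleSource, 'agent_xxx', 'agent_xxx');
console.log(JSON.stringify(ids, null, 2));
```

Reading the capability from the class means a change to the adapter's declared mode shows up in the payload without touching the upload scripts, which appears to be the intent stated in the PR-D comment.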
|
package/src/logger.js
CHANGED
|
@@ -3,8 +3,16 @@ import { join } from 'node:path';
|
|
|
3
3
|
import { randomUUID } from 'node:crypto';
|
|
4
4
|
import { assertSafePathSegment } from './validate.js';
|
|
5
5
|
|
|
6
|
+
// PR-B: `framework` → `provider` (canonical name per src/sources/contract.js).
|
|
7
|
+
// PR-C: adds `parent_agent_id` + `composition_pattern` so any future
|
|
8
|
+
// adapter that knows the hierarchy (OpenAI Agents handoffs, CrewAI
|
|
9
|
+
// manager, Hermes Agent spawn_subagent, LangGraph sub-graphs) can
|
|
10
|
+
// thread the relationship through to Fortress without rework.
|
|
11
|
+
// NDJSON written before PR-B may carry `framework`; readers that need the
|
|
12
|
+
// provider tag should read `provider` first and fall back to `framework`.
|
|
6
13
|
const EXPORT_FIELDS = [
|
|
7
|
-
'id', 'agent_id', '
|
|
14
|
+
'id', 'agent_id', 'parent_agent_id', 'composition_pattern',
|
|
15
|
+
'provider', 'timestamp', 'action_type',
|
|
8
16
|
'tool_name', 'duration_ms', 'tokens_used',
|
|
9
17
|
'input_tokens', 'output_tokens', 'cache_read_tokens', 'cache_creation_tokens',
|
|
10
18
|
'cost_usd', 'model',
|
|
@@ -47,7 +55,12 @@ export class Logger {
|
|
|
47
55
|
const full = {
|
|
48
56
|
id: e.id || randomUUID(),
|
|
49
57
|
agent_id: this.agentId,
|
|
50
|
-
|
|
58
|
+
// PR-C: sub-agent fields. Defaults are honest for solo / root agents.
|
|
59
|
+
// An adapter that detects hierarchy (e.g. OpenAI Agents handoffs)
|
|
60
|
+
// populates these on the event, and the Logger threads them through.
|
|
61
|
+
parent_agent_id: e.parent_agent_id ?? null,
|
|
62
|
+
composition_pattern: e.composition_pattern || 'solo',
|
|
63
|
+
provider: e.provider || e.framework || 'generic',
|
|
51
64
|
timestamp: e.timestamp || new Date().toISOString(),
|
|
52
65
|
action_type: e.action_type || 'tool_call',
|
|
53
66
|
tool_name: e.tool_name || null,
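Reviewer note: the header comment asks readers of older NDJSON to prefer `provider` and fall back to `framework`. A sketch of such a reader is shown below, assuming one JSON object per line. `providerOf` and `countByProvider` are hypothetical helpers, and the sample lines are made up for illustration.

```javascript
// Hypothetical reader helper: prefer `provider`, fall back to legacy `framework`,
// then to the same 'generic' default the Logger uses.
function providerOf(entry) {
  return entry.provider || entry.framework || 'generic';
}

// Parse NDJSON text (one JSON object per line) and tally entries per provider tag.
function countByProvider(ndjsonText) {
  const counts = {};
  for (const line of ndjsonText.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    const tag = providerOf(JSON.parse(line));
    counts[tag] = (counts[tag] || 0) + 1;
  }
  return counts;
}

// Made-up sample: one legacy line, one current line, one untagged line.
const sample = [
  '{"id":"1","framework":"anthropic-managed"}',
  '{"id":"2","provider":"anthropic-managed"}',
  '{"id":"3"}',
].join('\n');
const counts = countByProvider(sample);
console.log(counts);
```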
|
package/src/shield/decisions.js
CHANGED
|
@@ -25,7 +25,7 @@ export class DecisionLogger {
|
|
|
25
25
|
}) {
|
|
26
26
|
return this._logger.write({
|
|
27
27
|
action_type: 'shield_decision',
|
|
28
|
-
|
|
28
|
+
provider: 'anthropic-managed',
|
|
29
29
|
tool_name: sourceEvent?.name || sourceEvent?.tool_name || null,
|
|
30
30
|
status: decision === 'deny' || decision === 'interrupt' ? 'error' : 'ok',
|
|
31
31
|
error: decision === 'deny' || decision === 'interrupt' ? message : null,
|
package/src/sources/anthropic-managed.js
CHANGED
|
@@ -17,6 +17,7 @@
|
|
|
17
17
|
|
|
18
18
|
import { request } from 'node:https';
|
|
19
19
|
import { URLSearchParams } from 'node:url';
|
|
20
|
+
import { Source, PROVIDERS, ENFORCEMENT_MODES } from './contract.js';
|
|
20
21
|
|
|
21
22
|
const API_HOST = 'api.anthropic.com';
|
|
22
23
|
const BETA = 'managed-agents-2026-04-01';
|
|
@@ -163,7 +164,14 @@ export async function* fetchSessionEntries({ apiKey, agentId, sessionId, model }
|
|
|
163
164
|
const pendingModelReq = new Map(); // span.model_request_start.id → ts
|
|
164
165
|
const pendingToolUse = new Map(); // agent.tool_use.id → { ts, name, isMcp, input }
|
|
165
166
|
|
|
166
|
-
|
|
167
|
+
// `provider` is the canonical field per src/sources/contract.js (no
|
|
168
|
+
// other consumer ever read the previous `framework` field, so it was
|
|
169
|
+
// dropped in PR-B with zero downstream impact).
|
|
170
|
+
const base = {
|
|
171
|
+
provider: PROVIDERS.ANTHROPIC_MANAGED,
|
|
172
|
+
agent_id: agentId,
|
|
173
|
+
session_id: sessionId,
|
|
174
|
+
};
|
|
167
175
|
|
|
168
176
|
// No server-side `types[]` filter: the API rejects unknown values, but the
|
|
169
177
|
// exact filterable set is undocumented & evolves. We pull everything and
|
|
@@ -459,3 +467,70 @@ function extractText(content) {
|
|
|
459
467
|
if (content && typeof content === 'object') return content.text || JSON.stringify(content);
|
|
460
468
|
return '';
|
|
461
469
|
}
|
|
470
|
+
|
|
471
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
472
|
+
// AnthropicManagedSource — V1 Source contract wrapper
|
|
473
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
474
|
+
// Implements the Source ABC over the low-level functions above. New SDK
|
|
475
|
+
// code should use this class; the function exports stay public for
|
|
476
|
+
// backwards compat with the existing wma-fetch + wma-shield daemons
|
|
477
|
+
// (migration is PR-B / PR-D).
|
|
478
|
+
//
|
|
479
|
+
// Capability declaration:
|
|
480
|
+
// sync_confirm — Anthropic Managed Agents exposes pre-execution
|
|
481
|
+
// `user.tool_confirmation` (block before the tool runs) AND
|
|
482
|
+
// `user.interrupt` (stop the current LLM turn). The stronger of the
|
|
483
|
+
// two is sync_confirm.
|
|
484
|
+
|
|
485
|
+
export class AnthropicManagedSource extends Source {
|
|
486
|
+
static providerName = PROVIDERS.ANTHROPIC_MANAGED;
|
|
487
|
+
static enforcementMode = ENFORCEMENT_MODES.SYNC_CONFIRM;
|
|
488
|
+
|
|
489
|
+
constructor({ apiKey } = {}) {
|
|
490
|
+
super({ apiKey });
|
|
491
|
+
if (!apiKey) throw new Error('AnthropicManagedSource requires an apiKey');
|
|
492
|
+
this.apiKey = apiKey;
|
|
493
|
+
}
|
|
494
|
+
|
|
495
|
+
/**
|
|
496
|
+
* Discover Managed Agents under this API key. Returns the canonical
|
|
497
|
+
* agent descriptor (`{ id, name, native }`) — the raw vendor agent
|
|
498
|
+
* stays in `native` for adapters/UI that want richer metadata.
|
|
499
|
+
*/
|
|
500
|
+
async listAgents() {
|
|
501
|
+
const raw = await listAgents(this.apiKey);
|
|
502
|
+
return raw.map((a) => ({
|
|
503
|
+
id: a.id,
|
|
504
|
+
name: a.name || null,
|
|
505
|
+
native: a,
|
|
506
|
+
}));
|
|
507
|
+
}
|
|
508
|
+
|
|
509
|
+
/**
|
|
510
|
+
* Stream WMAAction entries for a session. Anthropic events are
|
|
511
|
+
* per-session, so opts.sessionId is required — fleet-wide watching is
|
|
512
|
+
* the caller's job (wma-fetch already orchestrates this).
|
|
513
|
+
*/
|
|
514
|
+
async *streamEvents(agentId, { sessionId, model } = {}) {
|
|
515
|
+
if (!sessionId) {
|
|
516
|
+
throw new Error('AnthropicManagedSource.streamEvents requires opts.sessionId — Anthropic events are scoped to a session');
|
|
517
|
+
}
|
|
518
|
+
yield* fetchSessionEntries({
|
|
519
|
+
apiKey: this.apiKey, agentId, sessionId, model,
|
|
520
|
+
});
|
|
521
|
+
}
|
|
522
|
+
|
|
523
|
+
/**
|
|
524
|
+
* Enforce a policy decision against a pending action.
|
|
525
|
+
*
|
|
526
|
+
* PR-A scaffold: the actual `user.tool_confirmation` / `user.interrupt`
|
|
527
|
+
* HTTP call currently lives in scripts/shield.js, which talks to the
|
|
528
|
+
* Anthropic API directly. Migrating that into this method is PR-D — at
|
|
529
|
+
* which point this body will POST the decision via the SSE/HTTP control
|
|
530
|
+
* channel. For PR-A, the method exists to satisfy the contract;
|
|
531
|
+
* Shield does not call it yet.
|
|
532
|
+
*/
|
|
533
|
+
async enforce(action, decision) { // eslint-disable-line no-unused-vars
|
|
534
|
+
throw new Error('AnthropicManagedSource.enforce() — Shield migration pending PR-D (scripts/shield.js still handles enforcement directly)');
|
|
535
|
+
}
|
|
536
|
+
}
|
package/src/sources/contract.js
ADDED
|
@@ -0,0 +1,259 @@
|
|
|
1
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
2
|
+
// WatchMyAgents — Source contract (V1)
|
|
3
|
+
// ────────────────────────────────────────────────────────────────────────
|
|
4
|
+
//
|
|
5
|
+
// THIS FILE IS THE CONTRACT every adapter MUST follow.
|
|
6
|
+
//
|
|
7
|
+
// Why this exists:
|
|
8
|
+
// The SDK shipped today integrates Anthropic Managed Agents via the
|
|
9
|
+
// functions in `./anthropic-managed.js`. To add OpenAI / LangGraph /
|
|
10
|
+
// CrewAI / Bedrock / etc. without rewriting the pipe each time, the
|
|
11
|
+
// contract between "fetching events" and "the rest of WMA" has to be
|
|
12
|
+
// explicit.
|
|
13
|
+
//
|
|
14
|
+
// Everything in WMA (anonymizer, typology classifier, Guardian scoring,
|
|
15
|
+
// Shield enforcement, Fortress signals payload) operates on `WMAAction`
|
|
16
|
+
// objects — the canonical shape defined below. Each Source adapter is
|
|
17
|
+
// responsible for translating its provider's native events into this
|
|
18
|
+
// shape, and nothing else.
|
|
19
|
+
//
|
|
20
|
+
// Containment invariant:
|
|
21
|
+
// A Source's `streamEvents()` yields WMAAction objects that MAY carry
|
|
22
|
+
// raw payload bytes in `input`/`output` — these are written to the
|
|
23
|
+
// LOCAL NDJSON file but NEVER sent to Fortress. The anonymizer is the
|
|
24
|
+
// single gate between WMAAction (raw) and the signals payload (cloud).
|
|
25
|
+
// See `src/anonymizer.js` and `docs/SOURCE-ADAPTER-CONTRACT.md`.
|
|
26
|
+
|
|
27
|
+
// ── Canonical vocabulary ────────────────────────────────────────────────
|
|
28
|
+
|
|
29
|
+
// Every WMAAction.action_type MUST be one of these. New adapters that
|
|
30
|
+
// emit a novel kind of action should propose adding a new constant here
|
|
31
|
+
// (and document it) rather than inventing one inline.
|
|
32
|
+
export const ACTION_TYPES = Object.freeze({
|
|
33
|
+
LLM_CALL: 'llm_call',
|
|
34
|
+
TOOL_USE: 'tool_use',
|
|
35
|
+
MCP_TOOL_USE: 'mcp_tool_use',
|
|
36
|
+
CUSTOM_TOOL_USE: 'custom_tool_use',
|
|
37
|
+
CUSTOM_TOOL_RESULT: 'custom_tool_result',
|
|
38
|
+
TOOL_CONFIRMATION: 'tool_confirmation',
|
|
39
|
+
USER_MESSAGE: 'user_message',
|
|
40
|
+
USER_INTERRUPT: 'user_interrupt',
|
|
41
|
+
MESSAGE: 'message',
|
|
42
|
+
THINKING: 'thinking',
|
|
43
|
+
CONTEXT_COMPACTED: 'context_compacted',
|
|
44
|
+
THREAD_CREATED: 'thread_created',
|
|
45
|
+
THREAD_MESSAGE_SENT: 'thread_message_sent',
|
|
46
|
+
THREAD_MESSAGE_RECEIVED: 'thread_message_received',
|
|
47
|
+
CONFIG_CHANGE: 'config_change',
|
|
48
|
+
STATE_TRANSITION: 'state_transition',
|
|
49
|
+
SESSION_ERROR: 'session_error',
|
|
50
|
+
// Shield-only — emitted when WMA itself blocks an action:
|
|
51
|
+
SHIELD_DECISION: 'shield_decision',
|
|
52
|
+
});
|
|
53
|
+
|
|
54
|
+
export const STATUS_VALUES = Object.freeze({
|
|
55
|
+
OK: 'ok',
|
|
56
|
+
ERROR: 'error',
|
|
57
|
+
BLOCKED: 'blocked',
|
|
58
|
+
});
|
|
59
|
+
|
|
60
|
+
// A Source declares how strongly it can enforce policies. This drives
|
|
61
|
+
// what Shield can do on its events:
|
|
62
|
+
// sync_confirm → can confirm/deny a tool call before execution
|
|
63
|
+
// (Anthropic user.tool_confirmation, AgentCore
|
|
64
|
+
// Gateway interceptor with transformedResponse)
|
|
65
|
+
// sync_interrupt → can interrupt mid-execution after an LLM call
|
|
66
|
+
// (Anthropic user.interrupt)
|
|
67
|
+
// detect_only → can observe but cannot block — post-hoc audit
|
|
68
|
+
// (E2B lifecycle webhooks, pure observability sinks)
|
|
69
|
+
export const ENFORCEMENT_MODES = Object.freeze({
|
|
70
|
+
SYNC_CONFIRM: 'sync_confirm',
|
|
71
|
+
SYNC_INTERRUPT: 'sync_interrupt',
|
|
72
|
+
DETECT_ONLY: 'detect_only',
|
|
73
|
+
});
|
|
74
|
+
|
|
75
|
+
// How the agent composes with other agents — drives the WMA dashboard
|
|
76
|
+
// tree view and the policy `subtree` surface (PR-C).
|
|
77
|
+
// solo → no sub-agents, one tool-loop
|
|
78
|
+
// hierarchy → boss + workers (CrewAI manager, Anthropic Task tool,
|
|
79
|
+
// Hermes Agent spawn-subagent)
|
|
80
|
+
// graph → nodes + edges (LangGraph)
|
|
81
|
+
// peer → N agents converse on equal footing (AutoGen)
|
|
82
|
+
export const COMPOSITION_PATTERNS = Object.freeze({
|
|
83
|
+
SOLO: 'solo',
|
|
84
|
+
HIERARCHY: 'hierarchy',
|
|
85
|
+
GRAPH: 'graph',
|
|
86
|
+
PEER: 'peer',
|
|
87
|
+
});
|
|
88
|
+
|
|
89
|
+
// Known provider identifiers. Adapters should register their provider
|
|
90
|
+
// name here as they land so consumers can build provider-specific UI.
|
|
91
|
+
export const PROVIDERS = Object.freeze({
|
|
92
|
+
ANTHROPIC_MANAGED: 'anthropic-managed',
|
|
93
|
+
// Coming next:
|
|
94
|
+
// OPENAI_AGENTS: 'openai-agents',
|
|
95
|
+
// AWS_BEDROCK_AGENTCORE: 'aws-bedrock-agentcore',
|
|
96
|
+
// LANGGRAPH: 'langgraph',
|
|
97
|
+
// CREWAI: 'crewai',
|
|
98
|
+
});
|
|
99
|
+
|
|
100
|
+
// ── WMAAction canonical shape ───────────────────────────────────────────
|
|
101
|
+
//
|
|
102
|
+
// /**
|
|
103
|
+
// * @typedef {object} WMAAction
|
|
104
|
+
// *
|
|
105
|
+
// * REQUIRED — every Source MUST populate these:
|
|
106
|
+
// * @property {string} id Stable, dedup-friendly event id
|
|
107
|
+
// * @property {string} provider From PROVIDERS (e.g. 'anthropic-managed')
|
|
108
|
+
// * @property {string} agent_id Native agent identifier
|
|
109
|
+
// * @property {string} session_id Native session/thread/run identifier
|
|
110
|
+
// * @property {string} action_type From ACTION_TYPES
|
|
111
|
+
// * @property {string} timestamp ISO-8601
|
|
112
|
+
// * @property {'ok'|'error'|'blocked'} status
|
|
113
|
+
// *
|
|
114
|
+
// * OPTIONAL — present when applicable:
|
|
115
|
+
// * @property {string|null} tool_name For tool_use family
|
|
116
|
+
// * @property {string|null} model For llm_call
|
|
117
|
+
// * @property {number|null} duration_ms Latency (start→end pair)
|
|
118
|
+
// * @property {number|null} tokens_used For llm_call (input+output+cache)
|
|
119
|
+
// * @property {number|null} input_tokens
|
|
120
|
+
// * @property {number|null} output_tokens
|
|
121
|
+
// * @property {number|null} cache_read_tokens
|
|
122
|
+
// * @property {number|null} cache_creation_tokens
|
|
123
|
+
// * @property {string|null} error Truncated error message (≤500ch)
|
|
124
|
+
// * @property {object|null} input Raw input payload — STAYS LOCAL
|
|
125
|
+
// * @property {object|null} output Raw output payload — STAYS LOCAL
|
|
126
|
+
// *
|
|
127
|
+
// * SUB-AGENT FIELDS (PR-C — see WMAAction.parent_agent_id):
|
|
128
|
+
// * @property {string|null} parent_agent_id Null for root agents
|
|
129
|
+
// * @property {string|null} composition_pattern From COMPOSITION_PATTERNS
|
|
130
|
+
// */
|
|
131
|
+
|
|
132
|
+
const REQUIRED_FIELDS = ['id', 'provider', 'agent_id', 'session_id', 'action_type', 'timestamp', 'status'];
|
|
133
|
+
|
|
134
|
+
/**
|
|
135
|
+
* Validate a WMAAction at runtime. Returns `{ valid, errors }`.
|
|
136
|
+
* Cheap enough to run on every yield in dev (process.env.WMA_DEV_VALIDATE=1).
|
|
137
|
+
*
|
|
138
|
+
* Adapters should call this BEFORE yielding in their test suite, and the
|
|
139
|
+
* SDK can opt into runtime validation via the env flag.
|
|
140
|
+
*/
|
|
141
|
+
export function validateWMAAction(obj) {
|
|
142
|
+
const errors = [];
|
|
143
|
+
if (!obj || typeof obj !== 'object') {
|
|
144
|
+
return { valid: false, errors: ['not an object'] };
|
|
145
|
+
}
|
|
146
|
+
for (const f of REQUIRED_FIELDS) {
|
|
147
|
+
if (obj[f] == null) errors.push(`missing required field: ${f}`);
|
|
148
|
+
}
|
|
149
|
+
if (obj.action_type && !Object.values(ACTION_TYPES).includes(obj.action_type)) {
|
|
150
|
+
errors.push(`unknown action_type "${obj.action_type}" — add to ACTION_TYPES in contract.js`);
|
|
151
|
+
}
|
|
152
|
+
if (obj.status && !Object.values(STATUS_VALUES).includes(obj.status)) {
|
|
153
|
+
errors.push(`unknown status "${obj.status}" — must be one of ${Object.values(STATUS_VALUES).join(', ')}`);
|
|
154
|
+
}
|
|
155
|
+
if (obj.composition_pattern != null
|
|
156
|
+
&& !Object.values(COMPOSITION_PATTERNS).includes(obj.composition_pattern)) {
|
|
157
|
+
errors.push(`unknown composition_pattern "${obj.composition_pattern}"`);
|
|
158
|
+
}
|
|
159
|
+
if (obj.timestamp && Number.isNaN(Date.parse(obj.timestamp))) {
|
|
160
|
+
errors.push(`timestamp not parseable: ${obj.timestamp}`);
|
|
161
|
+
}
|
|
162
|
+
return { valid: errors.length === 0, errors };
|
|
163
|
+
}
|
|
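The required-field loop in `validateWMAAction` tests `obj[f] == null`, so `null` and `undefined` count as missing while an empty string or `0` does not. A minimal sketch of that behaviour follows. `missingRequired` and the sample field values are hypothetical, and `'tool_use'` is only assumed to be a member of `ACTION_TYPES`.

```javascript
// Stand-in copy of the required-field loop, so this snippet runs on its own.
const REQUIRED_FIELDS = ['id', 'provider', 'agent_id', 'session_id', 'action_type', 'timestamp', 'status'];

// Hypothetical helper: returns the names of required fields that are null or undefined.
function missingRequired(obj) {
  return REQUIRED_FIELDS.filter((f) => obj[f] == null);
}

// Illustrative action; every value here is made up for the example.
const action = {
  id: 'evt_001',
  provider: 'anthropic-managed',
  agent_id: 'agent_xxx',
  session_id: 'sess_xxx',
  action_type: 'tool_use', // assumed to exist in ACTION_TYPES
  timestamp: '2026-05-29T00:00:00Z',
  status: 'ok',
};

const complete = missingRequired(action);
const withNull = missingRequired({ ...action, status: null });
const withUndefined = missingRequired({ ...action, id: undefined });
// An empty string is not null, so the loop does not report it.
const withEmpty = missingRequired({ ...action, session_id: '' });

console.log({ complete, withNull, withUndefined, withEmpty });
```

Because the later enum checks are guarded by truthiness (`obj.status && ...`), an empty-string `status` would also skip the `STATUS_VALUES` check; adapters that may emit empty strings might want a stricter test of their own.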
164
|
+
|
|
165
|
+
// ── Source abstract base class ──────────────────────────────────────────
|
|
166
|
+
|
|
167
|
+
/**
|
|
168
|
+
* Every framework adapter MUST extend this class and override the abstract
|
|
169
|
+
* methods. The Source is the boundary between "the customer's agent
|
|
170
|
+
* runtime" and "the rest of WMA" — the only place where vendor-specific
|
|
171
|
+
* code lives. Once a Source yields WMAAction objects, the pipe is
|
|
172
|
+
* provider-agnostic.
|
|
173
|
+
*
|
|
174
|
+
* Static contract:
|
|
175
|
+
* providerName — value from PROVIDERS
|
|
176
|
+
* enforcementMode — value from ENFORCEMENT_MODES
|
|
177
|
+
*
|
|
178
|
+
* Instance contract:
|
|
179
|
+
* listAgents() — return all agents accessible with the client creds
|
|
180
|
+
* streamEvents(id) — async generator yielding WMAAction objects
|
|
181
|
+
* enforce(action, d) — only required if enforcementMode != detect_only
|
|
182
|
+
*
|
|
183
|
+
* See `docs/SOURCE-ADAPTER-CONTRACT.md` for the full author guide.
|
|
184
|
+
*/
|
|
185
|
+
export class Source {
|
|
186
|
+
static providerName = null;
|
|
187
|
+
static enforcementMode = null;
|
|
188
|
+
|
|
189
|
+
constructor(config = {}) {
|
|
190
|
+
if (new.target === Source) {
|
|
191
|
+
throw new Error('Source is abstract — extend it in a subclass (e.g., AnthropicManagedSource).');
|
|
192
|
+
}
|
|
193
|
+
this.config = config;
|
|
194
|
+
}
|
|
195
|
+
|
|
196
|
+
/**
|
|
197
|
+
* Discover all agents under the client's credentials.
|
|
198
|
+
* @returns {Promise<Array<{id: string, name?: string, native?: object}>>}
|
|
199
|
+
*/
|
|
200
|
+
async listAgents() {
|
|
201
|
+
throw new Error(`${this.constructor.name}.listAgents() not implemented`);
|
|
202
|
+
}
|
|
203
|
+
|
|
204
|
+
/**
|
|
205
|
+
* Stream WMAAction objects for the given agent. The implementation may
|
|
206
|
+
* page, retry, dedup, or restart internally — consumers see a single
|
|
207
|
+
* ordered stream.
|
|
208
|
+
* @param {string} agentId
|
|
209
|
+
* @param {object} [opts]
|
|
210
|
+
* @yields {WMAAction}
|
|
211
|
+
*/
|
|
212
|
+
async *streamEvents(agentId, opts) { // eslint-disable-line no-unused-vars
|
|
213
|
+
throw new Error(`${this.constructor.name}.streamEvents() not implemented`);
|
|
214
|
+
yield; // make this a generator
|
|
215
|
+
}
|
|
216
|
+
|
|
217
|
+
/**
|
|
218
|
+
* Enforce a policy decision against a pending action. Only called when
|
|
219
|
+
* the Source's static `enforcementMode` is not `detect_only`. The
|
|
220
|
+
* subclass is responsible for translating WMA's canonical decision
|
|
221
|
+
* (`allow`|`deny`) into the provider's native confirm/interrupt call.
|
|
222
|
+
* @param {WMAAction} action
|
|
223
|
+
* @param {{decision: 'allow'|'deny', reason?: string}} decision
|
|
224
|
+
* @returns {Promise<{enforced: boolean, native_response?: object}>}
|
|
225
|
+
*/
|
|
226
|
+
async enforce(action, decision) { // eslint-disable-line no-unused-vars
|
|
227
|
+
if (this.constructor.enforcementMode === ENFORCEMENT_MODES.DETECT_ONLY) {
|
|
228
|
+
throw new Error(`${this.constructor.name} is detect_only — enforce() must not be called`);
|
|
229
|
+
}
|
|
230
|
+
throw new Error(`${this.constructor.name}.enforce() not implemented`);
|
|
231
|
+
}
|
|
232
|
+
}
|
|
233
|
+
|
|
234
|
+
/**
|
|
235
|
+
* Assertion helper for tests: verify a Source subclass declares the
|
|
236
|
+
* required static fields and overrides the abstract methods.
|
|
237
|
+
* Throws on any contract violation.
|
|
238
|
+
*/
|
|
239
|
+
export function assertImplementsSource(SourceClass) {
|
|
240
|
+
  if (!(SourceClass?.prototype instanceof Source)) {
|
|
241
|
+
throw new Error(`${SourceClass?.name || SourceClass} does not extend Source`);
|
|
242
|
+
}
|
|
243
|
+
if (!Object.values(PROVIDERS).includes(SourceClass.providerName)) {
|
|
244
|
+
throw new Error(`${SourceClass.name}.providerName="${SourceClass.providerName}" not in PROVIDERS`);
|
|
245
|
+
}
|
|
246
|
+
if (!Object.values(ENFORCEMENT_MODES).includes(SourceClass.enforcementMode)) {
|
|
247
|
+
throw new Error(`${SourceClass.name}.enforcementMode="${SourceClass.enforcementMode}" not in ENFORCEMENT_MODES`);
|
|
248
|
+
}
|
|
249
|
+
// The base class throws "not implemented" — a real subclass must override.
|
|
250
|
+
for (const m of ['listAgents', 'streamEvents']) {
|
|
251
|
+
if (SourceClass.prototype[m] === Source.prototype[m]) {
|
|
252
|
+
throw new Error(`${SourceClass.name}.${m}() must be overridden`);
|
|
253
|
+
}
|
|
254
|
+
}
|
|
255
|
+
if (SourceClass.enforcementMode !== ENFORCEMENT_MODES.DETECT_ONLY
|
|
256
|
+
&& SourceClass.prototype.enforce === Source.prototype.enforce) {
|
|
257
|
+
throw new Error(`${SourceClass.name}.enforce() must be overridden (enforcementMode=${SourceClass.enforcementMode})`);
|
|
258
|
+
}
|
|
259
|
+
}
|
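To make the adapter contract in `contract.js` concrete, here is a minimal sketch of a detect-only adapter. So that it runs on its own, it uses trimmed stand-ins for `Source`, `PROVIDERS`, and `ENFORCEMENT_MODES` rather than importing them. The `InMemorySource` class, the `PROVIDERS.ANTHROPIC_MANAGED` key name, and the `config.events` shape are assumptions made for illustration, not part of the package.

```javascript
// Stand-ins for the contract.js exports (key names are assumed).
const PROVIDERS = { ANTHROPIC_MANAGED: 'anthropic-managed' };
const ENFORCEMENT_MODES = { DETECT_ONLY: 'detect_only' };

// Trimmed stand-in for the abstract base class shown in the diff.
class Source {
  static providerName = null;
  static enforcementMode = null;
  constructor(config = {}) {
    if (new.target === Source) throw new Error('Source is abstract');
    this.config = config;
  }
  async listAgents() { throw new Error('listAgents() not implemented'); }
  async *streamEvents(agentId, opts) { throw new Error('streamEvents() not implemented'); }
}

// Hypothetical adapter that replays events held in memory.
class InMemorySource extends Source {
  // Reuses an existing provider value only because PROVIDERS must contain it.
  static providerName = PROVIDERS.ANTHROPIC_MANAGED;
  static enforcementMode = ENFORCEMENT_MODES.DETECT_ONLY;

  async listAgents() {
    return Object.keys(this.config.events ?? {}).map((id) => ({ id }));
  }

  async *streamEvents(agentId) {
    for (const ev of this.config.events?.[agentId] ?? []) yield ev;
  }
}

// Same identity check assertImplementsSource uses to detect missing overrides.
const overridesBoth = ['listAgents', 'streamEvents']
  .every((m) => InMemorySource.prototype[m] !== Source.prototype[m]);

// The base class refuses direct instantiation.
let abstractGuardFired = false;
try { new Source(); } catch { abstractGuardFired = true; }

const source = new InMemorySource({ events: { agent_xxx: [{ id: 'evt_001' }] } });
console.log({ overridesBoth, abstractGuardFired, mode: InMemorySource.enforcementMode });
```

A detect-only adapter like this one does not need to override `enforce()`. In a real test suite, calling `assertImplementsSource` on the subclass would perform the same override check and also validate the two static fields.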
package/src/typology-features.js
CHANGED
|
@@ -1,9 +1,9 @@
|
|
|
1
|
-
// Shared local-log feature extraction for the typology classifier (
|
|
1
|
+
// Shared local-log feature extraction for the typology classifier (Containment).
|
|
2
2
|
// Both wma-agents (CLI snapshot) and the Watch daemon (continuous upload) use
|
|
3
3
|
// this to derive the anonymized behavioural FEATURE VECTOR from local NDJSON
|
|
4
4
|
// logs, then feed it to classifyAgentType() in ./typology.js.
|
|
5
5
|
//
|
|
6
|
-
//
|
|
6
|
+
// Containment invariant: only `action_type` and `tool_name` are read from each log
|
|
7
7
|
// line — the raw payload fields (input/output/content/error/thinking) are NEVER
|
|
8
8
|
// touched here, so no raw content can ever enter a feature.
|
|
9
9
|
|
|
@@ -37,7 +37,7 @@ const CATEGORY_RULES = [
|
|
|
37
37
|
const DEPLOY_RE = /deploy|terraform|kubectl|helm|(^|_)release($|_)|ansible|pulumi|cloudformation/;
|
|
38
38
|
|
|
39
39
|
// Features that the WMA NDJSON logs CANNOT reliably expose today (opaque tool
|
|
40
|
-
// names, no behavioural signal, or raw content off-limits under
|
|
40
|
+
// names, no behavioural signal, or raw content off-limits under Containment).
|
|
41
41
|
// They default to 0/false; callers can surface this to the user.
|
|
42
42
|
export const NON_DERIVABLE = [
|
|
43
43
|
'f_database', 'f_email', 'f_payment', 'f_secret', 'f_memory',
|
package/src/typology-weights.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
{
|
|
2
|
-
"$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic.
|
|
2
|
+
"$comment": "WatchMyAgents — typology classifier weights + thresholds (Guardian Core, agent-typology-classification.spec.md §3/§4/§5). INVARIANT: weights and thresholds live HERE, never hardcoded in typology.js ('poids de signature en config, pas en dur'). Calibrate on labelled real traffic. Containment: all inputs are anonymized behavioural fractions/flags only.",
|
|
3
3
|
"version": "0.1.0",
|
|
4
4
|
"updated_at": "2026-05-29T00:00:00Z",
|
|
5
5
|
|
|
@@ -42,7 +42,7 @@
|
|
|
42
42
|
},
|
|
43
43
|
|
|
44
44
|
"features": {
|
|
45
|
-
"$comment": "Canonical anonymized feature keys (
|
|
45
|
+
"$comment": "Canonical anonymized feature keys (Containment). Fractions f_* in [0,1]; flag_* in {0,1}; aux_* in [0,1]. Order is informational only — scoring is key-addressed.",
|
|
46
46
|
"fractions": ["f_code", "f_browser", "f_database", "f_http", "f_email", "f_payment", "f_secret", "f_search", "f_memory", "f_handoff", "f_user_msg", "f_file"],
|
|
47
47
|
"flags": ["flag_deploy", "flag_internal_sys", "flag_on_behalf"],
|
|
48
48
|
"aux": ["aux_autonomy", "aux_untrusted", "aux_sensitive"]
|
package/src/typology.js
CHANGED
|
@@ -8,7 +8,7 @@
|
|
|
8
8
|
// Why behaviour, not config: Anthropic Managed Agents expose their tools as an
|
|
9
9
|
// opaque bundle (`agent_toolset_20260401`), so static config can't tell a
|
|
10
10
|
// researcher from a coder. We classify from anonymized behavioural signals
|
|
11
|
-
// (
|
|
11
|
+
// (Containment): per-tool-category FRACTIONS (f_*), boolean local flags (flag_*),
|
|
12
12
|
// and aux ratios (aux_*). NEVER raw content — no prompts, no outputs, no names.
|
|
13
13
|
//
|
|
14
14
|
// ──────────────────────────────────────────────────────────────────────────
|
|
@@ -23,7 +23,7 @@
|
|
|
23
23
|
// ──────────────────────────────────────────────────────────────────────────
|
|
24
24
|
//
|
|
25
25
|
// INVARIANTS enforced here:
|
|
26
|
-
// 1.
|
|
26
|
+
// 1. Containment — inputs are anonymized fractions/flags/aux ONLY.
|
|
27
27
|
// 2. Weights + thresholds come from config (typology-weights.json), never
|
|
28
28
|
// hardcoded in the logic below.
|
|
29
29
|
// 3. No easy downgrade — moving to a LESS strict template needs a raised
|
|
@@ -76,7 +76,7 @@ function strictnessOf(cfg, type) {
|
|
|
76
76
|
* Build the canonical feature vector from a loose features object.
|
|
77
77
|
* Only the schema-legal keys are kept; everything is coerced to a number and
|
|
78
78
|
* clamped to [0,1] (the schema requires every feature_vector value in [0,1]).
|
|
79
|
-
* Missing features default to 0 —
|
|
79
|
+
* Missing features default to 0 — Containment: an absent signal is "not observed",
|
|
80
80
|
* never inferred from content.
|
|
81
81
|
*/
|
|
82
82
|
function normalizeFeatures(cfg, features) {
|
|
@@ -121,7 +121,7 @@ function rankTypes(cfg, fv) {
|
|
|
121
121
|
* classifyAgentType(features[, prior][, opts]) → object conforming EXACTLY to
|
|
122
122
|
* agent-classification.schema.json.
|
|
123
123
|
*
|
|
124
|
-
* @param {object} features Anonymized behavioural signals (
|
|
124
|
+
* @param {object} features Anonymized behavioural signals (Containment):
|
|
125
125
|
* agent_id {string} pass-through identifier (no content)
|
|
126
126
|
* f_code,f_browser,… {number} per-category FRACTIONS in [0,1]
|
|
127
127
|
* flag_deploy,… {0|1|bool} local discriminator flags (no content)
|