canary-scan 0.2.1

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Elif
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,200 @@
1
+ # Canary
2
+
3
+ Prompt injection detection using behavioral analysis. Weak LLMs as sensitive sensors.
4
+
5
+ ## How it works
6
+
7
+ Canary sends content to a small, cheap LLM with one instruction: **echo it back exactly**. Then it checks what happened.
8
+
9
+ Two independent detection channels:
10
+
11
+ 1. **Text deviation** — Did the output differ from the input? Deterministic string comparison after normalization. No fuzzy matching, no thresholds.
12
+ 2. **Tool call attempt** — Did the model try to call any honeypot tools? Five attractive tool definitions (execute_command, read_file, send_request, update_memory, send_message) are offered but never referenced in the prompt. Any tool call means the input content influenced the model.
13
+
14
+ If either channel fires: **FLAGGED**. If neither: **CLEAR**.
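The two-channel decision described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `CanaryReply`, `normalizeSketch`, and `classify` are hypothetical names, and the whitespace-collapsing normalization is an assumption, since the README only says comparison happens "after normalization".

```typescript
// Hypothetical shape of what the canary model returned for one chunk.
interface CanaryReply {
  text: string;        // the model's echoed text
  toolCalls: string[]; // names of any honeypot tools it tried to call
}

// Assumed normalization: unify line endings, collapse whitespace, trim.
// The package's real `normalize` may behave differently.
function normalizeSketch(s: string): string {
  return s.replace(/\r\n/g, "\n").replace(/\s+/g, " ").trim();
}

// Either channel firing is enough to flag the content.
function classify(input: string, reply: CanaryReply): "clear" | "flagged" {
  const deviated = normalizeSketch(input) !== normalizeSketch(reply.text);
  const toolCalled = reply.toolCalls.length > 0;
  return deviated || toolCalled ? "flagged" : "clear";
}
```

Because the comparison is exact after normalization, there is no similarity threshold to tune; the trade-off is that any model-specific echo noise has to be handled separately, which is what calibration is for.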
15
+
16
+ ### Why weak models?
17
+
18
+ A small, instruction-tuned model is *more susceptible* to prompt injection than a frontier model. That's the feature. A model that gets tricked easily makes a more sensitive detector. The canary doesn't need to be smart — it needs to be gullible.
19
+
20
+ ### What CLEAR and FLAGGED mean
21
+
22
+ - **CLEAR** = "No deviation detected under test conditions." This is not a safety guarantee. Sophisticated injections can evade detection.
23
+ - **FLAGGED** = "Behavioral deviation detected." The content caused the canary to deviate from its echo instruction. Human review recommended.
24
+
25
+ Canary makes bounded claims, not absolute ones.
26
+
27
+ ## Install
28
+
29
+ ```bash
30
+ npm install canary-scan
31
+ ```
32
+
33
+ Or run directly:
34
+
35
+ ```bash
36
+ npx canary-scan scan https://example.com
37
+ ```
38
+
39
+ ## Setup
40
+
41
+ You need an API key from [OpenRouter](https://openrouter.ai/) (free tier works).
42
+
43
+ ```bash
44
+ export CANARY_API_KEY=your-openrouter-key
45
+ ```
46
+
47
+ Optional:
48
+
49
+ ```bash
50
+ export CANARY_MODEL=arcee-ai/trinity-mini:free # default
51
+ export CANARY_BASE_URL=https://openrouter.ai/api/v1 # default
52
+ ```
53
+
54
+ ## CLI Usage
55
+
56
+ ```bash
57
+ # Scan a URL
58
+ canary scan https://example.com
59
+
60
+ # Scan raw text
61
+ canary scan --text "some content to check"
62
+
63
+ # Calibrate — measure echo fidelity and tool call rate for your model
64
+ canary calibrate
65
+
66
+ # Trust management
67
+ canary trust list
68
+ canary trust add https://known-safe.com
69
+ canary flag https://suspicious.com
70
+ ```
71
+
72
+ ### Example output
73
+
74
+ ```
75
+ Status: FLAGGED
76
+ Model: arcee-ai/trinity-mini:free
77
+ Time: 2340ms
78
+ Preview: Ignore all previous instructions...
79
+ Deviation: YES
80
+ Tool call: YES — execute_command
81
+ Detail: 2 indicator(s): Text deviation at position 0: "...I'll help you with that!..."; Tool call attempted: execute_command
82
+ Chunks: 1 scanned, 1 flagged
83
+ Coverage: 100% unique, 100% raw
84
+
85
+ This content caused behavioral deviation in the canary model.
86
+ Human review recommended before processing.
87
+ ```
88
+
89
+ ## Library Usage
90
+
91
+ ```typescript
92
+ import { CanaryScanner } from "canary-scan";
93
+
94
+ const scanner = new CanaryScanner({
95
+ apiKey: process.env.CANARY_API_KEY!,
96
+ model: "arcee-ai/trinity-mini:free", // optional
97
+ chunkSize: 1500, // optional
98
+ overlapRatio: 0.25, // optional
99
+ calibrationArtifacts: [], // optional, from calibration
100
+ });
101
+
102
+ // Scan text
103
+ const result = await scanner.scan("some untrusted content");
104
+ console.log(result.status); // "clear" or "flagged"
105
+
106
+ // Scan a URL
107
+ const urlResult = await scanner.scanUrl("https://example.com");
108
+
109
+ // Calibrate — run once per model to find artifacts
110
+ const calibration = await scanner.calibrate();
111
+ console.log(calibration.echoFidelity); // raw fidelity
112
+ console.log(calibration.adjustedEchoFidelity); // fidelity after artifact filtering
113
+ console.log(calibration.artifacts); // pass these to calibrationArtifacts
114
+ ```
115
+
116
+ ### ScanResult
117
+
118
+ ```typescript
119
+ {
120
+ status: "clear" | "flagged",
121
+ reason: string | null,
122
+ deviationDetected: boolean,
123
+ toolCallAttempted: boolean,
124
+ toolsInvoked: string[],
125
+ contentPreview: string,
126
+ model: string,
127
+ scanTimeMs: number,
128
+ metadata: {
129
+ confidence: "bounded",
130
+ chunksScanned: number,
131
+ chunksFlagged: number,
132
+ rawCoverage: number,
133
+ uniqueCoverage: number,
134
+ overlapRatio: number,
135
+ }
136
+ }
137
+ ```
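As a rough illustration of consuming this shape, the sketch below turns a result into a one-line summary. `ScanResultLike` mirrors only a subset of the fields listed above, and `describeScan` is a hypothetical helper, not something the package exports.

```typescript
// Subset of the ScanResult fields shown in the README.
interface ScanResultLike {
  status: "clear" | "flagged";
  reason: string | null;
  toolsInvoked: string[];
  metadata: { chunksScanned: number; chunksFlagged: number };
}

// Hypothetical helper: summarize a scan without overstating what CLEAR means.
function describeScan(result: ScanResultLike): string {
  const { chunksScanned, chunksFlagged } = result.metadata;
  if (result.status === "clear") {
    return `CLEAR: no deviation in ${chunksScanned} chunk(s); not a safety guarantee`;
  }
  const tools =
    result.toolsInvoked.length > 0 ? ` (tools: ${result.toolsInvoked.join(", ")})` : "";
  return `FLAGGED: ${chunksFlagged}/${chunksScanned} chunk(s)${tools}; ${result.reason ?? "no reason given"}`;
}
```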
138
+
139
+ ## MCP Server
140
+
141
+ Canary runs as an MCP server so AI agents can scan content before reading it.
142
+
143
+ ```json
144
+ {
145
+ "mcpServers": {
146
+ "canary": {
147
+ "command": "npx",
148
+ "args": ["tsx", "/path/to/canary/src/mcp-server.ts"],
149
+ "env": { "CANARY_API_KEY": "your-key" }
150
+ }
151
+ }
152
+ }
153
+ ```
154
+
155
+ Tools provided:
156
+ - `canary_scan_url` — Scan a URL before reading it
157
+ - `canary_scan_text` — Scan raw text content
158
+ - `canary_trust` — Manually mark sources as trusted/flagged
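For orientation, a client request to one of the scan tools might be built as below. The `tools/call` method and the `name`/`arguments` fields match what the server in this package reads from `msg.params`; `ToolCallRequest` and `buildToolCall` are hypothetical names used only for this sketch.

```typescript
// JSON-RPC request shape for an MCP tool call, as consumed by the server's handler.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string> };
}

// Hypothetical helper: the URL tool expects `url`, the text tool expects `text`.
function buildToolCall(
  id: number,
  name: "canary_scan_url" | "canary_scan_text",
  value: string,
): ToolCallRequest {
  const key = name === "canary_scan_url" ? "url" : "text";
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: { [key]: value } },
  };
}
```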
159
+
160
+ ## Calibration
161
+
162
+ Different models have different echo fidelity. Some add prefixes ("Sure! Here's the text:"), strip labels, or reformat whitespace. Calibration measures this baseline noise so you can distinguish it from injection-caused deviation.
163
+
164
+ ```bash
165
+ canary calibrate
166
+ ```
167
+
168
+ This runs 20 clean text samples through the model and reports:
169
+ - **Raw echo fidelity** — percentage of perfect echoes before artifact filtering
170
+ - **Adjusted echo fidelity** — percentage after filtering discovered artifacts
171
+ - **Tool call rate** — how often the model calls tools on clean input (should be 0%)
172
+ - **Artifacts** — specific strings the model consistently adds/removes
173
+
174
+ Pass discovered artifacts to `calibrationArtifacts` in your config to reduce false positives.
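The difference between raw and adjusted fidelity can be illustrated with a small sketch. It assumes artifact filtering means deleting each known artifact string from the model output before comparing; the package's actual filtering is not shown in this diff, and `EchoSample`, `stripArtifacts`, and `echoFidelity` are hypothetical names.

```typescript
// One calibration sample: what was sent and what the model echoed back.
interface EchoSample {
  input: string;
  output: string;
}

// Assumed filtering: remove every occurrence of each artifact, then trim.
function stripArtifacts(output: string, artifacts: string[]): string {
  let cleaned = output;
  for (const artifact of artifacts) {
    if (artifact.length > 0) cleaned = cleaned.split(artifact).join("");
  }
  return cleaned.trim();
}

// Fraction of samples echoed perfectly, optionally after artifact filtering.
function echoFidelity(samples: EchoSample[], artifacts: string[] = []): number {
  if (samples.length === 0) return 0;
  const perfect = samples.filter(
    (s) => stripArtifacts(s.output, artifacts) === s.input.trim(),
  ).length;
  return perfect / samples.length;
}
```

Calling `echoFidelity(samples)` corresponds to the raw figure, and `echoFidelity(samples, artifacts)` to the adjusted one.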
175
+
176
+ ## How it handles long content
177
+
178
+ Content is split into overlapping chunks (default: 1500 chars, 25% overlap). Each chunk is scanned independently — the canary model has no context between chunks. If any chunk is flagged, the whole scan is flagged.
179
+
180
+ Overlap ensures injections at chunk boundaries are still caught.
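A minimal sketch of overlapping chunking with the stated defaults (1500 characters, 25% overlap). The function name `chunkWithOverlap` and the exact stepping rule are assumptions; the package's own chunker is not part of this diff.

```typescript
// Split content into fixed-size chunks where each chunk starts
// chunkSize * (1 - overlapRatio) characters after the previous one.
function chunkWithOverlap(
  content: string,
  chunkSize = 1500,
  overlapRatio = 0.25,
): string[] {
  if (chunkSize <= 0 || overlapRatio < 0 || overlapRatio >= 1) {
    throw new RangeError("chunkSize must be > 0 and overlapRatio in [0, 1)");
  }
  const step = Math.max(1, Math.floor(chunkSize * (1 - overlapRatio)));
  const chunks: string[] = [];
  for (let start = 0; start < content.length; start += step) {
    chunks.push(content.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the content.
    if (start + chunkSize >= content.length) break;
  }
  return chunks;
}
```

With the defaults, consecutive chunks share 375 characters, so an injection shorter than that which straddles one chunk's end appears whole in the next chunk.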
181
+
182
+ ## Limitations
183
+
184
+ - **Not a guarantee.** CLEAR only means the canary echoed the content and called no tools. An injection that leaves the canary unaffected can still carry a payload that acts on whatever processes the content next.
185
+ - **Model-dependent.** Detection sensitivity varies by model. Calibrate before production use.
186
+ - **Rate limits.** Free OpenRouter models have rate limits (~8 RPM). Scanning large content takes time.
187
+ - **No HTML stripping.** The canary sees raw content, including HTML tags. This is intentional — stripping could remove injections.
188
+ - **One-way detection.** Canary detects behavioral influence, not the *type* of injection. A FLAGGED result doesn't tell you *what* the injection tries to do.
189
+
190
+ ## Tests
191
+
192
+ ```bash
193
+ npm test
194
+ ```
195
+
196
+ 50 tests covering normalization, both detection channels, chunking, caching, metadata, known injection payloads, and trust management. All tests run offline with mocked API calls.
197
+
198
+ ## License
199
+
200
+ MIT
package/dist/cli.d.ts ADDED
@@ -0,0 +1,18 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Canary CLI — Scan URLs or text for prompt injection indicators
4
+ *
5
+ * Usage:
6
+ * canary scan https://example.com
7
+ * canary scan --text "some content to check"
8
+ * canary calibrate
9
+ * canary trust list
10
+ * canary trust add https://example.com
11
+ * canary flag https://example.com
12
+ *
13
+ * Environment:
14
+ * CANARY_API_KEY — OpenRouter (or compatible) API key (required)
15
+ * CANARY_BASE_URL — API base URL (default: https://openrouter.ai/api/v1)
16
+ * CANARY_MODEL — Model to use (default: arcee-ai/trinity-mini:free)
17
+ */
18
+ export {};
package/dist/cli.js ADDED
@@ -0,0 +1,182 @@
1
+ #!/usr/bin/env node
2
+ "use strict";
3
+ /**
4
+ * Canary CLI — Scan URLs or text for prompt injection indicators
5
+ *
6
+ * Usage:
7
+ * canary scan https://example.com
8
+ * canary scan --text "some content to check"
9
+ * canary calibrate
10
+ * canary trust list
11
+ * canary trust add https://example.com
12
+ * canary flag https://example.com
13
+ *
14
+ * Environment:
15
+ * CANARY_API_KEY — OpenRouter (or compatible) API key (required)
16
+ * CANARY_BASE_URL — API base URL (default: https://openrouter.ai/api/v1)
17
+ * CANARY_MODEL — Model to use (default: arcee-ai/trinity-mini:free)
18
+ */
19
+ Object.defineProperty(exports, "__esModule", { value: true });
20
+ const scanner_1 = require("./scanner");
21
+ const API_KEY = process.env.CANARY_API_KEY || process.env.OPENROUTER_API_KEY || "";
22
+ const BASE_URL = process.env.CANARY_BASE_URL || "https://openrouter.ai/api/v1";
23
+ const MODEL = process.env.CANARY_MODEL || "arcee-ai/trinity-mini:free";
24
+ function printUsage() {
25
+ console.log(`
26
+ Canary — Prompt Injection Behavioral Detection
27
+
28
+ Uses a weak LLM as a behavioral probe. Content is sent to a small model
29
+ with a verbatim echo instruction. Any deviation in output or attempted
30
+ tool use indicates the content influenced the model's behavior.
31
+
32
+ FLAGGED = content caused behavioral deviation. Human review recommended.
33
+ CLEAR = no deviation detected under test conditions. Not a safety guarantee.
34
+
35
+ Usage:
36
+ canary scan <url> Scan a URL
37
+ canary scan --text "content" Scan raw text
38
+ canary calibrate Test model echo fidelity and tool call rate
39
+ canary trust list Show trusted/flagged sources
40
+ canary trust add <source> Manually trust a source
41
+ canary flag <source> Manually flag a source
42
+
43
+ Environment:
44
+ CANARY_API_KEY API key for LLM provider (OpenRouter, etc.)
45
+ CANARY_BASE_URL API base URL (default: OpenRouter)
46
+ CANARY_MODEL Model ID (default: arcee-ai/trinity-mini:free)
47
+
48
+ The default model is small and free — on purpose.
49
+ A gullible model is a more sensitive detector.
50
+ `);
51
+ }
52
+ async function main() {
53
+ const args = process.argv.slice(2);
54
+ if (args.length === 0 || args[0] === "--help" || args[0] === "-h") {
55
+ printUsage();
56
+ process.exit(0);
57
+ }
58
+ if (!API_KEY) {
59
+ console.error("Error: CANARY_API_KEY or OPENROUTER_API_KEY environment variable required");
60
+ process.exit(1);
61
+ }
62
+ const scanner = new scanner_1.CanaryScanner({
63
+ apiKey: API_KEY,
64
+ baseUrl: BASE_URL,
65
+ model: MODEL,
66
+ });
67
+ const command = args[0];
68
+ if (command === "scan") {
69
+ if (args[1] === "--text") {
70
+ const text = args.slice(2).join(" ");
71
+ if (!text) {
72
+ console.error("Error: provide text to scan");
73
+ process.exit(1);
74
+ }
75
+ const result = await scanner.scan(text);
76
+ printResult(result);
77
+ }
78
+ else if (args[1]) {
79
+ const url = args[1];
80
+ console.log(`Scanning ${url}...`);
81
+ const result = await scanner.scanUrl(url);
82
+ printResult(result);
83
+ }
84
+ else {
85
+ console.error("Error: provide a URL or --text");
86
+ process.exit(1);
87
+ }
88
+ }
89
+ else if (command === "calibrate") {
90
+ console.log(`Calibrating model: ${MODEL}`);
91
+ console.log("Running echo fidelity and tool call tests...\n");
92
+ const result = await scanner.calibrate();
93
+ printCalibration(result);
94
+ }
95
+ else if (command === "trust") {
96
+ if (args[1] === "list") {
97
+ const lists = scanner.getTrustList();
98
+ console.log("Trusted:", lists.trusted.length ? lists.trusted.join(", ") : "(none)");
99
+ console.log("Flagged:", lists.flagged.length ? lists.flagged.join(", ") : "(none)");
100
+ }
101
+ else if (args[1] === "add" && args[2]) {
102
+ scanner.setTrust(args[2], "clear");
103
+ console.log(`Trusted: ${args[2]}`);
104
+ }
105
+ else {
106
+ console.error("Usage: canary trust list | canary trust add <source>");
107
+ }
108
+ }
109
+ else if (command === "flag") {
110
+ if (args[1]) {
111
+ scanner.setTrust(args[1], "flagged");
112
+ console.log(`Flagged: ${args[1]}`);
113
+ }
114
+ else {
115
+ console.error("Usage: canary flag <source>");
116
+ }
117
+ }
118
+ else {
119
+ printUsage();
120
+ }
121
+ }
122
+ function printResult(result) {
123
+ const label = result.status === "clear" ? "CLEAR" : "FLAGGED";
124
+ console.log(`\n Status: ${label}`);
125
+ console.log(` Model: ${result.model}`);
126
+ console.log(` Time: ${result.scanTimeMs}ms`);
127
+ console.log(` Preview: ${result.contentPreview}`);
128
+ console.log(` Deviation: ${result.deviationDetected ? "YES" : "no"}`);
129
+ console.log(` Tool call: ${result.toolCallAttempted ? "YES — " + result.toolsInvoked.join(", ") : "no"}`);
130
+ if (result.reason) {
131
+ console.log(` Detail: ${result.reason}`);
132
+ }
133
+ const m = result.metadata;
134
+ console.log(` Chunks: ${m.chunksScanned} scanned, ${m.chunksFlagged} flagged`);
135
+ console.log(` Coverage: ${Math.round(m.uniqueCoverage * 100)}% unique, ${Math.round(m.rawCoverage * 100)}% raw`);
136
+ console.log(` Overlap: ${Math.round(m.overlapRatio * 100)}%`);
137
+ console.log();
138
+ if (result.status === "flagged") {
139
+ console.log(" This content caused behavioral deviation in the canary model.");
140
+ console.log(" Human review recommended before processing.\n");
141
+ }
142
+ else {
143
+ console.log(" No deviation detected under test conditions.");
144
+ console.log(" This does not guarantee the content is safe.\n");
145
+ }
146
+ }
147
+ function printCalibration(result) {
148
+ console.log(` Model: ${result.model}`);
149
+ console.log(` Echo fidelity: ${Math.round(result.echoFidelity * 100)}% raw`);
150
+ if (result.artifacts.length > 0) {
151
+ console.log(` Adjusted: ${Math.round(result.adjustedEchoFidelity * 100)}% (with ${result.artifacts.length} artifact(s) filtered)`);
152
+ }
153
+ console.log(` Tool call rate: ${Math.round(result.toolCallRate * 100)}%`);
154
+ console.log(` Suitable: ${result.suitable ? "YES" : "NO"}`);
155
+ if (result.artifacts.length > 0) {
156
+ console.log(`\n Artifacts found (model-specific noise to filter):`);
157
+ for (const artifact of result.artifacts) {
158
+ console.log(` "${artifact}"`);
159
+ }
160
+ console.log(`\n Pass these to CanaryConfig.calibrationArtifacts to reduce false positives.`);
161
+ }
162
+ if (result.details.length > 0) {
163
+ console.log(`\n Details:`);
164
+ for (const detail of result.details) {
165
+ console.log(` - ${detail}`);
166
+ }
167
+ }
168
+ if (!result.suitable) {
169
+ console.log("\n This model may produce too many false positives.");
170
+ if (result.adjustedEchoFidelity < 0.85) {
171
+ console.log(" Echo fidelity below 85% — model struggles with verbatim reproduction.");
172
+ }
173
+ if (result.toolCallRate > 0.05) {
174
+ console.log(" Tool call rate above 5% — model calls tools on clean input.");
175
+ }
176
+ }
177
+ console.log();
178
+ }
179
+ main().catch((err) => {
180
+ console.error("Error:", err.message);
181
+ process.exit(1);
182
+ });
@@ -0,0 +1 @@
1
+ export { CanaryScanner, normalize, type ScanResult, type ScanMetadata, type CanaryConfig, type CalibrationResult, } from "./scanner";
package/dist/index.js ADDED
@@ -0,0 +1,6 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.normalize = exports.CanaryScanner = void 0;
4
+ var scanner_1 = require("./scanner");
5
+ Object.defineProperty(exports, "CanaryScanner", { enumerable: true, get: function () { return scanner_1.CanaryScanner; } });
6
+ Object.defineProperty(exports, "normalize", { enumerable: true, get: function () { return scanner_1.normalize; } });
@@ -0,0 +1,22 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Canary MCP Server
4
+ *
5
+ * Provides prompt injection scanning as an MCP tool.
6
+ * Any AI agent can call `canary_scan_url` or `canary_scan_text` before reading untrusted content.
7
+ *
8
+ * Usage:
9
+ * CANARY_API_KEY=... npx tsx src/mcp-server.ts
10
+ *
11
+ * Add to claude_desktop_config.json or .claude/settings.json:
12
+ * {
13
+ * "mcpServers": {
14
+ * "canary": {
15
+ * "command": "npx",
16
+ * "args": ["tsx", "/path/to/canary/src/mcp-server.ts"],
17
+ * "env": { "CANARY_API_KEY": "your-key" }
18
+ * }
19
+ * }
20
+ * }
21
+ */
22
+ export {};
@@ -0,0 +1,174 @@
1
+ #!/usr/bin/env node
2
+ "use strict";
3
+ /**
4
+ * Canary MCP Server
5
+ *
6
+ * Provides prompt injection scanning as an MCP tool.
7
+ * Any AI agent can call `canary_scan_url` or `canary_scan_text` before reading untrusted content.
8
+ *
9
+ * Usage:
10
+ * CANARY_API_KEY=... npx tsx src/mcp-server.ts
11
+ *
12
+ * Add to claude_desktop_config.json or .claude/settings.json:
13
+ * {
14
+ * "mcpServers": {
15
+ * "canary": {
16
+ * "command": "npx",
17
+ * "args": ["tsx", "/path/to/canary/src/mcp-server.ts"],
18
+ * "env": { "CANARY_API_KEY": "your-key" }
19
+ * }
20
+ * }
21
+ * }
22
+ */
23
+ Object.defineProperty(exports, "__esModule", { value: true });
24
+ const scanner_1 = require("./scanner");
25
+ const API_KEY = process.env.CANARY_API_KEY || process.env.OPENROUTER_API_KEY || "";
26
+ const BASE_URL = process.env.CANARY_BASE_URL || "https://openrouter.ai/api/v1";
27
+ const MODEL = process.env.CANARY_MODEL || "arcee-ai/trinity-mini:free";
28
+ if (!API_KEY) {
29
+ console.error("CANARY_API_KEY or OPENROUTER_API_KEY required");
30
+ process.exit(1);
31
+ }
32
+ const scanner = new scanner_1.CanaryScanner({
33
+ apiKey: API_KEY,
34
+ baseUrl: BASE_URL,
35
+ model: MODEL,
36
+ });
37
+ // MCP protocol over stdio
38
+ const TOOLS = [
39
+ {
40
+ name: "canary_scan_url",
41
+ description: "Scan a URL for prompt injection indicators before reading it. Uses a weak LLM as a behavioral probe — sends content with a verbatim echo instruction and checks for deviation. Returns CLEAR (no deviation detected under test conditions — not a safety guarantee) or FLAGGED (behavioral deviation detected — human review recommended).",
42
+ inputSchema: {
43
+ type: "object",
44
+ properties: {
45
+ url: { type: "string", description: "The URL to scan" },
46
+ },
47
+ required: ["url"],
48
+ },
49
+ },
50
+ {
51
+ name: "canary_scan_text",
52
+ description: "Scan raw text for prompt injection indicators. Uses a weak LLM as a behavioral probe — sends content with a verbatim echo instruction and checks for deviation. Returns CLEAR (no deviation detected under test conditions, not a safety guarantee) or FLAGGED (behavioral deviation detected — human review recommended).",
53
+ inputSchema: {
54
+ type: "object",
55
+ properties: {
56
+ text: { type: "string", description: "The text content to scan" },
57
+ },
58
+ required: ["text"],
59
+ },
60
+ },
61
+ {
62
+ name: "canary_trust",
63
+ description: "Manually mark a source as trusted (clear) or flagged after human review.",
64
+ inputSchema: {
65
+ type: "object",
66
+ properties: {
67
+ source: { type: "string", description: "The source identifier (URL or content hash)" },
68
+ status: { type: "string", enum: ["clear", "flagged"], description: "Trust status" },
69
+ },
70
+ required: ["source", "status"],
71
+ },
72
+ },
73
+ ];
74
+ // Simplified MCP stdio transport
75
+ let buffer = "";
76
+ process.stdin.setEncoding("utf-8");
77
+ process.stdin.on("data", (chunk) => {
78
+ buffer += chunk;
79
+ processBuffer();
80
+ });
81
+ function processBuffer() {
82
+ while (true) {
83
+ const headerEnd = buffer.indexOf("\r\n\r\n");
84
+ if (headerEnd === -1)
85
+ break;
86
+ const header = buffer.slice(0, headerEnd);
87
+ const contentLengthMatch = header.match(/Content-Length: (\d+)/i);
88
+ if (!contentLengthMatch) {
89
+ buffer = buffer.slice(headerEnd + 4);
90
+ continue;
91
+ }
92
+ const contentLength = parseInt(contentLengthMatch[1], 10);
93
+ const bodyStart = headerEnd + 4;
94
+ if (buffer.length < bodyStart + contentLength)
95
+ break;
96
+ const body = buffer.slice(bodyStart, bodyStart + contentLength);
97
+ buffer = buffer.slice(bodyStart + contentLength);
98
+ try {
99
+ const msg = JSON.parse(body);
100
+ handleMessage(msg).catch((err) => console.error("Canary MCP handler error:", err.message));
101
+ }
102
+ catch {
103
+ // Skip malformed messages
104
+ }
105
+ }
106
+ }
107
+ function sendMessage(msg) {
108
+ const body = JSON.stringify(msg);
109
+ const header = `Content-Length: ${Buffer.byteLength(body)}\r\n\r\n`;
110
+ process.stdout.write(header + body);
111
+ }
112
+ async function handleMessage(msg) {
113
+ if (msg.method === "initialize") {
114
+ sendMessage({
115
+ jsonrpc: "2.0",
116
+ id: msg.id,
117
+ result: {
118
+ protocolVersion: "2024-11-05",
119
+ capabilities: { tools: {} },
120
+ serverInfo: { name: "canary", version: "0.2.1" },
121
+ },
122
+ });
123
+ }
124
+ else if (msg.method === "notifications/initialized") {
125
+ // No response needed
126
+ }
127
+ else if (msg.method === "tools/list") {
128
+ sendMessage({
129
+ jsonrpc: "2.0",
130
+ id: msg.id,
131
+ result: { tools: TOOLS },
132
+ });
133
+ }
134
+ else if (msg.method === "tools/call") {
135
+ const { name, arguments: args } = msg.params;
136
+ let result;
137
+ try {
138
+ if (name === "canary_scan_url") {
139
+ result = await scanner.scanUrl(args.url);
140
+ }
141
+ else if (name === "canary_scan_text") {
142
+ result = await scanner.scan(args.text);
143
+ }
144
+ else if (name === "canary_trust") {
145
+ scanner.setTrust(args.source, args.status);
146
+ result = { status: args.status, source: args.source, message: `Source ${args.status === "clear" ? "trusted" : "flagged"}` };
147
+ }
148
+ else {
149
+ throw new Error(`Unknown tool: ${name}`);
150
+ }
151
+ sendMessage({
152
+ jsonrpc: "2.0",
153
+ id: msg.id,
154
+ result: {
155
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
156
+ },
157
+ });
158
+ }
159
+ catch (error) {
160
+ sendMessage({
161
+ jsonrpc: "2.0",
162
+ id: msg.id,
163
+ result: {
164
+ content: [{ type: "text", text: `Error: ${error.message}` }],
165
+ isError: true,
166
+ },
167
+ });
168
+ }
169
+ }
170
+ }
171
+ // Log to stderr so it doesn't interfere with MCP stdio
172
+ console.error("Canary MCP server started (v0.2.1 — echo + tool detection)");
173
+ console.error(`Model: ${MODEL}`);
174
+ console.error("Waiting for connections...");
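One detail of the framing code above is worth a sketch: `Content-Length` counts bytes, while the server accumulates a decoded string and slices it by character count, so the two can disagree once a message contains non-ASCII text. The version below keeps everything in `Buffer`s so offsets are byte-accurate. It is an illustration under that assumption, not the package's code; `encodeFrame` and `decodeFrames` are hypothetical names. Separately, the MCP specification's stdio transport describes newline-delimited JSON-RPC messages, so whether header-based framing interoperates with a given client is worth checking.

```typescript
// Encode one message as a Content-Length framed buffer (length in bytes).
function encodeFrame(msg: unknown): Buffer {
  const body = Buffer.from(JSON.stringify(msg), "utf8");
  const header = Buffer.from(`Content-Length: ${body.length}\r\n\r\n`, "ascii");
  return Buffer.concat([header, body]);
}

// Decode every complete frame in `buf`; return leftover bytes for the next read.
function decodeFrames(buf: Buffer): { messages: unknown[]; rest: Buffer } {
  const messages: unknown[] = [];
  let rest = buf;
  while (true) {
    const headerEnd = rest.indexOf("\r\n\r\n");
    if (headerEnd === -1) break; // header not complete yet
    const bodyStart = headerEnd + 4;
    const header = rest.subarray(0, headerEnd).toString("ascii");
    const match = /Content-Length: (\d+)/i.exec(header);
    if (!match) {
      rest = rest.subarray(bodyStart); // drop a header block without a length
      continue;
    }
    const length = Number.parseInt(match[1], 10);
    if (rest.length < bodyStart + length) break; // body not complete yet
    const body = rest.subarray(bodyStart, bodyStart + length).toString("utf8");
    rest = rest.subarray(bodyStart + length);
    try {
      messages.push(JSON.parse(body));
    } catch {
      // Skip malformed JSON bodies, as the original server does.
    }
  }
  return { messages, rest };
}
```

A caller would append each stdin chunk to the leftover `rest` with `Buffer.concat` and pass the result back into `decodeFrames`, rather than calling `setEncoding` on the stream.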