tunnel-mcp 0.1.3 → 0.1.4

package/CHANGELOG.md CHANGED
@@ -11,6 +11,36 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
11
11
 
12
12
  - Nothing yet.
13
13
 
14
+ ## [0.1.4] - 2026-07-01
15
+
16
+ ### Fixed
17
+
18
+ - **Guests no longer fail to join with `getaddrinfo ENOTFOUND …trycloudflare.com`.**
19
+ Root cause: a cloudflared quick tunnel prints its URL ~8–25s before the
20
+ per-tunnel DNS record propagates, and the old host-side reachability probe
21
+ looked the name up immediately — seeding an `NXDOMAIN` that the resolver
22
+ negative-cached for up to 30 minutes (the zone's SOA minimum), breaking the
23
+ guest's join even after the tunnel went live. The probe was the cause, not a
24
+ diagnostic.
25
+
26
+ ### Changed
27
+
28
+ - **`tunnel_open` now gates the join link on real DNS readiness via DoH.** After
29
+ cloudflared reports the URL, the host polls liveness over IP-literal DoH
30
+ endpoints (`1.1.1.1`/`1.0.0.1`/`8.8.8.8`) — which never touch, and so never
31
+ poison, the system resolver — and only returns the link once the record
32
+ resolves (best-effort: it never hard-fails or waits indefinitely; once a 60s budget is
33
+ spent it returns the link anyway).
34
+ - **The guest resolves system-first with a DoH fallback**, connecting by the
35
+ resolved IP while keeping SNI/Host = the hostname, so a guest whose resolver
36
+ lags or holds a stale negative cache still connects.
37
+ - **Guest connection is now time-bounded** (handshake + overall connect deadline),
38
+ so a black-hole link fails fast with a clear error instead of hanging.
39
+ - The `0.1.3` `TUNNEL_REACHABILITY` (and `0.1.2` `TUNNEL_SKIP_REACHABILITY_CHECK`)
40
+ environment variables are **no longer read** — they only ever relaxed the
41
+ now-deleted probe. New single knob: `TUNNEL_DOH=off` disables DoH for networks
42
+ that block it and where system DNS already works.
43
+
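The root cause described in this entry can be illustrated with a toy model. This is a sketch, not tunnel-mcp code: `makeCachingResolver` is a hypothetical stand-in for a caching resolver, and it assumes the early probe and the later join go through the same resolver. The record goes live at 20s (inside the 8–25s window quoted above) and the negative TTL is 1800s (the 30-minute figure).

```javascript
// Hypothetical model of a resolver with negative caching (not part of tunnel-mcp).
function makeCachingResolver(recordLiveAtSec, negativeTtlSec) {
  let negativeUntil = -Infinity; // a cached NXDOMAIN is served until this time
  return function lookup(nowSec) {
    if (nowSec < negativeUntil) return 'NXDOMAIN (cached)';
    if (nowSec < recordLiveAtSec) {
      negativeUntil = nowSec + negativeTtlSec; // the early miss gets cached
      return 'NXDOMAIN';
    }
    return 'RESOLVED';
  };
}

// 0.1.3-style behaviour: a probe at t=0, then a join at t=60 (record live since t=20).
const poisoned = makeCachingResolver(20, 1800);
const probe = poisoned(0);
const guestAfterProbe = poisoned(60);

// 0.1.4-style behaviour: nothing asks this resolver until the join at t=60.
const clean = makeCachingResolver(20, 1800);
const guestNoProbe = clean(60);

console.log({ probe, guestAfterProbe, guestNoProbe });
```

In this model the join at t=60 fails only when something looked the name up before the record existed, which is the sense in which the probe was the cause rather than a diagnostic.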
14
44
  ## [0.1.3] - 2026-07-01
15
45
 
16
46
  ### Changed
@@ -94,7 +124,8 @@ install-skill` copies the `tunnel-etiquette` skill into `~/.claude/skills`
94
124
  declaring a fix "confirmed".
95
125
  - Test suite of 109 tests built with vitest, developed test-first (TDD).
96
126
 
97
- [Unreleased]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.3...HEAD
127
+ [Unreleased]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.4...HEAD
128
+ [0.1.4]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.3...v0.1.4
98
129
  [0.1.3]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.2...v0.1.3
99
130
  [0.1.2]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.1...v0.1.2
100
131
  [0.1.1]: https://github.com/zachlikefolio/tunnel-mcp/compare/v0.1.0...v0.1.1
package/README.md CHANGED
@@ -162,7 +162,7 @@ vulnerability.
162
162
 
163
163
  ```bash
164
164
  npm ci # install dependencies
165
- npm test # run the test suite (136 tests, TDD)
165
+ npm test # run the test suite (159 tests, TDD)
166
166
  npm run build # compile TypeScript
167
167
  npm run lint # eslint
168
168
  npm run format:check # prettier --check .
@@ -178,21 +178,18 @@ an interactive CLI — with no arguments it starts and waits for an MCP client t
178
178
  connect over stdin/stdout. That's working as intended. Register it with a client
179
179
  (above), or run `tunnel-mcp --help`.
180
180
 
181
- **`tunnel_open` warns that this machine can't reach `*.trycloudflare.com`.**
182
- cloudflared reaches Cloudflare's edge over its own protocol, but the public
183
- `*.trycloudflare.com` hostname still has to resolve via normal DNS and some
184
- networks (corporate/filtered networks, some public DNS resolvers, or a proxy that
185
- Node's `fetch` ignores) can't reach it from the host. Because only your
186
- **guest's** network truly has to reach the link, `tunnel_open` **opens anyway and
187
- returns a `reachabilityWarning` by default**: share the link and have your guest
188
- confirm they can open it. Control this with `TUNNEL_REACHABILITY`:
189
-
190
- - `warn` (default): open, but warn when the host can't reach the URL
191
- - `strict`: fail `tunnel_open` unless the host can reach the URL first
192
- - `off`: skip the host-side reachability probe entirely
193
-
194
- Diagnose the host's DNS with `dig +short <random>.trycloudflare.com` or
195
- `curl -sI https://<the-url>`.
181
+ **Guest join fails with `getaddrinfo ENOTFOUND trycloudflare.com`.** A
182
+ cloudflared quick tunnel prints its URL roughly 8–25s _before_ the per-tunnel DNS
183
+ record has propagated. If anything looks the name up too early it gets an
184
+ `NXDOMAIN` that the resolver negative-caches for up to 30 minutes — breaking the
185
+ join even after the tunnel is live. `tunnel-mcp` avoids this: `tunnel_open` waits
186
+ for the record to actually resolve (via DoH to IP-literal endpoints such as Cloudflare's
187
+ `1.1.1.1`, which never touch, and so never poison, your system resolver) before returning the
188
+ link, and the guest resolves system-first with a DoH fallback. So a fresh join
189
+ should just work; if you hit `ENOTFOUND`, an _earlier_ attempt likely poisoned the
190
+ cache; wait for it to expire, or flush DNS (`sudo dscacheutil -flushcache` on
191
+ macOS). Set `TUNNEL_DOH=off` only on networks that block DoH (`1.1.1.1`) and where
192
+ system DNS already resolves `*.trycloudflare.com`.
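How `TUNNEL_DOH` values are interpreted can be sketched as below. `dohEnabledFor` is a hypothetical name used only for illustration. The logic mirrors the `dohOn` helper and the visible tail of `envFlag` in this release's `dist` changes, and assumes `envFlag` applies no other rules to the value.

```javascript
// Sketch of the TUNNEL_DOH interpretation (hypothetical helper, not the package's API).
function dohEnabledFor(rawValue) {
  if (rawValue === undefined) return true; // unset: DoH stays on
  const s = rawValue.trim().toLowerCase();
  // Same falsy spellings as the envFlag tail shown in this diff.
  return s !== '' && s !== '0' && s !== 'false' && s !== 'no' && s !== 'off';
}

const table = [undefined, 'off', ' OFF ', '0', 'false', 'no', '', '1', 'on'].map(
  (value) => [value, dohEnabledFor(value)],
);
console.log(table);
```

Under these assumptions a variable that is set but empty also disables DoH, so leaving `TUNNEL_DOH` unset is the way to keep the default.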
196
193
 
197
194
  ## Roadmap / not yet supported
198
195
 
package/dist/cli.js CHANGED
@@ -62,10 +62,10 @@ export function helpText(version = readVersion()) {
62
62
  `Environment:`,
63
63
  ` TUNNEL_SKILLS_DIR Override the skills directory`,
64
64
  ` TUNNEL_SKIP_SKILL_INSTALL=1 Skip the automatic skill install on npm install`,
65
- ` TUNNEL_REACHABILITY warn (default) | strict | off how tunnel_open reacts`,
66
- ` when this host can't reach *.trycloudflare.com. Only the`,
67
- ` guest's network truly must reach it, so the default warns`,
68
- ` and opens anyway; strict fails; off skips the check.`,
65
+ ` TUNNEL_DOH=off Disable DoH used to confirm the tunnel hostname is`,
66
+ ` live before sharing the link, and as the guest's`,
67
+ ` resolver fallback. Only disable on networks that block`,
68
+ ` DoH (1.1.1.1) and where system DNS already works.`,
69
69
  ``,
70
70
  `Docs: https://github.com/zachlikefolio/tunnel-mcp`,
71
71
  ].join('\n');
@@ -1,27 +1,30 @@
1
- import { ReachabilityMode } from '../env.js';
1
+ import { DohResult } from '../net/doh.js';
2
2
  export interface TunnelHandle {
3
3
  publicUrl: string;
4
4
  stop(): void;
5
- reachabilityWarning?: string;
6
5
  }
7
6
  export interface StartOptions {
8
7
  timeoutMs?: number;
9
8
  extraArgs?: string[];
10
- attempts?: number;
11
- intervalMs?: number;
12
- healthCheck?: (url: string) => Promise<boolean>;
13
- probeTimeoutMs?: number;
14
- reachability?: ReachabilityMode;
9
+ budgetMs?: number;
10
+ pollIntervalMs?: number;
11
+ initialDelayMs?: number;
12
+ resolveHost?: (host: string) => Promise<DohResult>;
13
+ dohEnabled?: boolean;
15
14
  }
16
15
  export declare function parsePublicUrl(line: string): string | null;
17
- export declare function describeProbeError(e: unknown): string;
18
- export declare function defaultHealthCheck(url: string): Promise<boolean>;
19
- export declare function unreachableMessage(url: string, attempts: number, lastReason?: string): string;
20
- export declare function reachabilityWarningMessage(url: string, lastReason?: string): string;
21
16
  /**
22
- * `extraArgs` exists for tests: it lets a fake binary (e.g. `node fake.mjs`) be
23
- * launched in place of `cloudflared tunnel --url ...`. Production passes none.
24
- * The URL is surfaced only after a health probe confirms the edge is reachable
25
- * (cloudflared prints the hostname before routing is live).
17
+ * Spawn `cloudflared tunnel --url ...` and resolve with the public URL, but
18
+ * only after a readiness gate confirms the per-tunnel DNS record has propagated.
19
+ *
20
+ * cloudflared prints the URL ~8–25s before the record exists. Looking the name
21
+ * up via the system resolver during that window returns NXDOMAIN and gets it
22
+ * negative-cached (SOA min 1800s), breaking the guest's join for up to 30 min.
23
+ * So the gate polls DoH over IP-literal endpoints (which never touch the system
24
+ * resolver, so they cannot poison anything). It is best-effort: it never blocks
25
+ * on "not live yet" or "DoH unavailable" — after the budget it returns the link
26
+ * optimistically (the guest's own DoH fallback is the safety net).
27
+ *
28
+ * `extraArgs` exists for tests: it launches a fake binary in place of cloudflared.
26
29
  */
27
30
  export declare function startCloudflared(binPath: string, localPort: number, opts?: StartOptions): Promise<TunnelHandle>;
@@ -1,143 +1,55 @@
1
1
  import { spawn } from 'node:child_process';
2
- import { CLOUDFLARED_URL_TIMEOUT_MS, CLOUDFLARED_HEALTH_ATTEMPTS, CLOUDFLARED_HEALTH_INTERVAL_MS, } from '../config.js';
2
+ import { CLOUDFLARED_URL_TIMEOUT_MS, READINESS_GATE_BUDGET_MS, READINESS_INITIAL_DELAY_MS, READINESS_POLL_INTERVAL_MS, } from '../config.js';
3
+ import { dohResolve } from '../net/doh.js';
4
+ import { envFlag } from '../env.js';
3
5
  const URL_RE = /https:\/\/[a-z0-9-]+\.trycloudflare\.com/;
4
- // Bounds a single probe so a black-hole connection (or a caller-supplied
5
- // healthCheck that hangs/throws) can't stall the health-check loop forever.
6
- const DEFAULT_PROBE_TIMEOUT_MS = 5000;
6
+ const delay = (ms) => new Promise((r) => setTimeout(r, ms));
7
7
  export function parsePublicUrl(line) {
8
8
  const m = line.match(URL_RE);
9
9
  return m ? m[0] : null;
10
10
  }
11
- // undici (Node's global fetch) reports the real network error on `.cause`, e.g.
12
- // a DNS failure surfaces as `cause.code === 'ENOTFOUND'`. Pull that out so the
13
- // caller can tell "DNS can't resolve the host" apart from "edge not ready yet".
14
- export function describeProbeError(e) {
15
- const err = e;
16
- const code = err?.cause?.code;
17
- if (code)
18
- return err.cause?.message ? `${code}: ${err.cause.message}` : code;
19
- if (err?.name === 'TimeoutError')
20
- return 'probe timed out';
21
- return err?.message || 'unknown error';
22
- }
23
- // Any HTTP response (even 404/502/426) means the Cloudflare edge is routing to
24
- // us. A thrown error carries why it failed (DNS, TLS, refused, timeout).
25
- async function reachabilityProbe(url, probeTimeoutMs) {
26
- try {
27
- await fetch(url, { method: 'GET', signal: AbortSignal.timeout(probeTimeoutMs) });
28
- return { ok: true };
29
- }
30
- catch (e) {
31
- return { ok: false, reason: describeProbeError(e) };
32
- }
33
- }
34
- // Back-compat boolean probe (kept for external callers/tests).
35
- export async function defaultHealthCheck(url) {
36
- return (await reachabilityProbe(url, DEFAULT_PROBE_TIMEOUT_MS)).ok;
37
- }
38
- // The shared DNS sentence: when a probe failure looks like name resolution,
39
- // point at *.trycloudflare.com being blocked — the single most common real-world
40
- // cause. Returns '' when the failure isn't DNS-shaped.
41
- function dnsHint(url, reason) {
42
- if (!reason || !/ENOTFOUND|EAI_AGAIN|getaddrinfo|\bdns\b/i.test(reason))
43
- return '';
44
- let host = url;
45
- try {
46
- host = new URL(url).host;
47
- }
48
- catch {
49
- /* keep the raw url */
50
- }
51
- return (` This machine can't resolve ${host} — your DNS or network may be blocking *.trycloudflare.com` +
52
- ` (common on filtered/corporate networks and some public DNS resolvers).`);
53
- }
54
- // Fatal error for `strict` mode: the host never confirmed the edge was routing.
55
- export function unreachableMessage(url, attempts, lastReason) {
56
- let msg = `cloudflared reported ${url} but it never became reachable from this machine after ${attempts} probe(s)`;
57
- if (lastReason)
58
- msg += ` (last error: ${lastReason})`;
59
- msg += '.' + dnsHint(url, lastReason);
60
- msg +=
61
- ` Both you and your guest must be able to reach it. Set TUNNEL_REACHABILITY=warn (the default) to` +
62
- ` open anyway with a warning, or TUNNEL_REACHABILITY=off to skip this check entirely.`;
63
- return msg;
64
- }
65
- // Non-fatal warning for `warn` mode: the tunnel is open, but this host couldn't
66
- // confirm reachability. Only the guest's network has to reach the URL, so this
67
- // is often a false alarm — but surface it so the human can sanity-check.
68
- export function reachabilityWarningMessage(url, lastReason) {
69
- let msg = `Tunnel opened, but this machine could not reach ${url}`;
70
- if (lastReason)
71
- msg += ` (${lastReason})`;
72
- msg += '.' + dnsHint(url, lastReason);
73
- msg +=
74
- ` Your guest still needs to reach the link — if they can't open it, check your DNS/proxy. Set` +
75
- ` TUNNEL_REACHABILITY=strict to require host reachability, or =off to silence this check.`;
76
- return msg;
77
- }
78
- // Races a single probe against a per-attempt timeout so that a caller-supplied
79
- // `check` that throws, rejects, or simply never resolves can never leave the
80
- // loop (and therefore the outer startCloudflared promise) hanging.
81
- function probeOnce(url, probe, probeTimeoutMs) {
82
- return new Promise((resolve) => {
83
- let settled = false;
84
- const finish = (r) => {
85
- if (!settled) {
86
- settled = true;
87
- resolve(r);
88
- }
89
- };
90
- const timer = setTimeout(() => finish({ ok: false, reason: 'probe timed out' }), probeTimeoutMs);
91
- Promise.resolve()
92
- .then(() => probe(url))
93
- .then((r) => {
94
- clearTimeout(timer);
95
- finish(r);
96
- })
97
- .catch((err) => {
98
- clearTimeout(timer);
99
- finish({ ok: false, reason: describeProbeError(err) });
100
- });
101
- });
102
- }
103
- async function waitHealthy(url, attempts, intervalMs, probe, probeTimeoutMs) {
104
- let lastReason;
105
- for (let i = 0; i < attempts; i++) {
106
- const r = await probeOnce(url, probe, probeTimeoutMs);
107
- if (r.ok)
108
- return r;
109
- lastReason = r.reason;
110
- await new Promise((res) => setTimeout(res, intervalMs));
111
- }
112
- return { ok: false, reason: lastReason };
11
+ function dohOn(explicit) {
12
+ return explicit ?? (process.env.TUNNEL_DOH === undefined || envFlag('TUNNEL_DOH'));
113
13
  }
114
14
  /**
115
- * `extraArgs` exists for tests: it lets a fake binary (e.g. `node fake.mjs`) be
116
- * launched in place of `cloudflared tunnel --url ...`. Production passes none.
117
- * The URL is surfaced only after a health probe confirms the edge is reachable
118
- * (cloudflared prints the hostname before routing is live).
15
+ * Spawn `cloudflared tunnel --url ...` and resolve with the public URL, but
16
+ * only after a readiness gate confirms the per-tunnel DNS record has propagated.
17
+ *
18
+ * cloudflared prints the URL ~8–25s before the record exists. Looking the name
19
+ * up via the system resolver during that window returns NXDOMAIN and gets it
20
+ * negative-cached (SOA min 1800s), breaking the guest's join for up to 30 min.
21
+ * So the gate polls DoH over IP-literal endpoints (which never touch the system
22
+ * resolver, so they cannot poison anything). It is best-effort: it never blocks
23
+ * on "not live yet" or "DoH unavailable" — after the budget it returns the link
24
+ * optimistically (the guest's own DoH fallback is the safety net).
25
+ *
26
+ * `extraArgs` exists for tests: it launches a fake binary in place of cloudflared.
119
27
  */
120
28
  export function startCloudflared(binPath, localPort, opts = {}) {
121
29
  const args = opts.extraArgs ?? ['tunnel', '--url', `http://localhost:${localPort}`];
122
30
  const timeoutMs = opts.timeoutMs ?? CLOUDFLARED_URL_TIMEOUT_MS;
123
- const attempts = opts.attempts ?? CLOUDFLARED_HEALTH_ATTEMPTS;
124
- const intervalMs = opts.intervalMs ?? CLOUDFLARED_HEALTH_INTERVAL_MS;
125
- const probeTimeoutMs = opts.probeTimeoutMs ?? DEFAULT_PROBE_TIMEOUT_MS;
126
- // A caller-supplied boolean healthCheck carries no failure reason; the default
127
- // probe does. Adapt the former into a ProbeResult either way.
128
- const custom = opts.healthCheck;
129
- const probe = custom
130
- ? async (u) => ({ ok: await custom(u) })
131
- : (u) => reachabilityProbe(u, probeTimeoutMs);
31
+ const budgetMs = opts.budgetMs ?? READINESS_GATE_BUDGET_MS;
32
+ const pollIntervalMs = opts.pollIntervalMs ?? READINESS_POLL_INTERVAL_MS;
33
+ const initialDelayMs = opts.initialDelayMs ?? READINESS_INITIAL_DELAY_MS;
34
+ const resolveHost = opts.resolveHost ?? ((h) => dohResolve(h, 4));
35
+ const dohEnabled = dohOn(opts.dohEnabled);
132
36
  return new Promise((resolve, reject) => {
133
37
  const child = spawn(binPath, args, { stdio: ['ignore', 'pipe', 'pipe'] });
134
38
  let settled = false;
39
+ let gateStarted = false;
40
+ let exited = false;
135
41
  const stop = () => {
136
42
  try {
137
43
  child.kill('SIGTERM');
138
44
  }
139
45
  catch {
140
- /* gone */
46
+ /* already gone */
47
+ }
48
+ };
49
+ const succeed = (url) => {
50
+ if (!settled) {
51
+ settled = true;
52
+ resolve({ publicUrl: url, stop });
141
53
  }
142
54
  };
143
55
  const fail = (err) => {
@@ -149,45 +61,42 @@ export function startCloudflared(binPath, localPort, opts = {}) {
149
61
  }
150
62
  };
151
63
  const timer = setTimeout(() => fail(new Error('cloudflared did not report a URL in time')), timeoutMs);
64
+ const runGate = async (url, host) => {
65
+ if (!dohEnabled) {
66
+ await delay(pollIntervalMs); // brief settle; the guest's DoH fallback covers readiness
67
+ succeed(url);
68
+ return;
69
+ }
70
+ await delay(initialDelayMs);
71
+ const deadline = Date.now() + budgetMs;
72
+ while (!settled && !exited && Date.now() < deadline) {
73
+ const res = await resolveHost(host).catch(() => ({ klass: 'INDETERMINATE', addresses: [] }));
74
+ if (res.klass === 'RESOLVED') {
75
+ succeed(url);
76
+ return;
77
+ }
78
+ await delay(pollIntervalMs);
79
+ }
80
+ if (settled)
81
+ return;
82
+ if (exited) {
83
+ fail(new Error('cloudflared exited during readiness wait'));
84
+ return;
85
+ }
86
+ // Budget exhausted without a RESOLVED — hand out the link optimistically.
87
+ // No system-DNS lookup ever happened, so nothing was poisoned, and the
88
+ // guest resolves the name itself (system-first, DoH fallback).
89
+ succeed(url);
90
+ };
152
91
  const onData = (buf) => {
92
+ if (gateStarted)
93
+ return;
153
94
  for (const line of buf.toString().split('\n')) {
154
95
  const url = parsePublicUrl(line);
155
- if (url && !settled) {
156
- settled = true;
96
+ if (url) {
97
+ gateStarted = true;
157
98
  clearTimeout(timer);
158
- // The reachability probe runs on the *host*, but only the guest's
159
- // network must reach the URL for messaging. 'off' skips it entirely;
160
- // 'warn' (the product default) opens anyway and reports a warning;
161
- // 'strict' fails open() if the host can't confirm reachability.
162
- const mode = opts.reachability ?? 'strict';
163
- if (mode === 'off') {
164
- resolve({ publicUrl: url, stop });
165
- return;
166
- }
167
- waitHealthy(url, attempts, intervalMs, probe, probeTimeoutMs)
168
- .then((res) => {
169
- if (res.ok)
170
- resolve({ publicUrl: url, stop });
171
- else if (mode === 'warn') {
172
- resolve({
173
- publicUrl: url,
174
- stop,
175
- reachabilityWarning: reachabilityWarningMessage(url, res.reason),
176
- });
177
- }
178
- else {
179
- stop();
180
- reject(new Error(unreachableMessage(url, attempts, res.reason)));
181
- }
182
- })
183
- .catch((err) => {
184
- // Should be unreachable (waitHealthy/probeOnce never reject), but
185
- // this guarantees the child is never orphaned and the outer
186
- // promise always settles, even on a future bug or surprise throw.
187
- stop();
188
- const reason = err instanceof Error ? err.message : String(err);
189
- reject(new Error(`cloudflared health check failed unexpectedly: ${reason}`));
190
- });
99
+ void runGate(url, new URL(url).host);
191
100
  return;
192
101
  }
193
102
  }
@@ -195,6 +104,9 @@ export function startCloudflared(binPath, localPort, opts = {}) {
195
104
  child.stdout?.on('data', onData);
196
105
  child.stderr?.on('data', onData);
197
106
  child.on('error', (err) => fail(err));
198
- child.on('exit', (code) => fail(new Error(`cloudflared exited (${code})`)));
107
+ child.on('exit', (code) => {
108
+ exited = true;
109
+ fail(new Error(`cloudflared exited (${code})`));
110
+ });
199
111
  });
200
112
  }
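How long the readiness gate in `runGate` can hold the link follows from the constants in `dist/config.js`. The arithmetic below is an approximate sketch. It assumes the default `resolveHost` (three providers tried in sequence, each bounded by the per-request timeout) and ignores scheduling overhead. The `slowestPollMs`, `worstCaseGateMs`, `minimumGateMs` and `dohOffGateMs` names are illustrative only.

```javascript
// Values copied from dist/config.js in this diff.
const READINESS_INITIAL_DELAY_MS = 5_000;
const READINESS_GATE_BUDGET_MS = 60_000;
const READINESS_POLL_INTERVAL_MS = 1_000;
const DOH_REQUEST_TIMEOUT_MS = 3_000;
const PROVIDER_COUNT = 3; // cloudflare, cloudflare2, google

// Slowest single poll: every provider runs to its timeout, one after another.
const slowestPollMs = PROVIDER_COUNT * DOH_REQUEST_TIMEOUT_MS;

// The loop checks the deadline only before each poll, so the last poll can
// start just before the deadline, run to completion, and be followed by one
// more interval sleep before the loop exits and the link is returned.
const worstCaseGateMs =
  READINESS_INITIAL_DELAY_MS +
  READINESS_GATE_BUDGET_MS +
  slowestPollMs +
  READINESS_POLL_INTERVAL_MS;

// Lower bound with DoH on: the initial delay always elapses before the first poll.
const minimumGateMs = READINESS_INITIAL_DELAY_MS;

// With DoH off, the gate is a single settle delay of one poll interval.
const dohOffGateMs = READINESS_POLL_INTERVAL_MS;

console.log({ slowestPollMs, worstCaseGateMs, minimumGateMs, dohOffGateMs });
```

These figures count from the moment cloudflared prints the URL. Waiting for that line is bounded separately by `CLOUDFLARED_URL_TIMEOUT_MS`.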
package/dist/config.d.ts CHANGED
@@ -1,3 +1,4 @@
1
+ import type { DohProvider } from './net/doh.js';
1
2
  export declare const TUNNEL_HOME: string;
2
3
  export declare const BIN_DIR: string;
3
4
  export declare const SESSIONS_DIR: string;
@@ -5,6 +6,14 @@ export declare const DEFAULT_LISTEN_TIMEOUT_MS = 60000;
5
6
  export declare const DEFAULT_IDLE_TEARDOWN_MS: number;
6
7
  export declare const DEFAULT_JOIN_LINK_TTL_MS: number;
7
8
  export declare const CLOUDFLARED_URL_TIMEOUT_MS = 30000;
8
- export declare const CLOUDFLARED_HEALTH_ATTEMPTS = 10;
9
- export declare const CLOUDFLARED_HEALTH_INTERVAL_MS = 1000;
10
9
  export declare const OPEN_RETRY_ATTEMPTS = 3;
10
+ export declare const READINESS_GATE_BUDGET_MS = 60000;
11
+ export declare const READINESS_INITIAL_DELAY_MS = 5000;
12
+ export declare const READINESS_POLL_INTERVAL_MS = 1000;
13
+ export declare const DOH_REQUEST_TIMEOUT_MS = 3000;
14
+ export declare const DOH_PROVIDERS: DohProvider[];
15
+ export declare const GUEST_HANDSHAKE_TIMEOUT_MS = 15000;
16
+ export declare const GUEST_CONNECT_DEADLINE_MS = 20000;
17
+ export declare const GUEST_SYS_LOOKUP_TIMEOUT_MS = 2000;
18
+ export declare const DOH_GUEST_RETRIES = 3;
19
+ export declare const DOH_GUEST_RETRY_DELAY_MS = 700;
package/dist/config.js CHANGED
@@ -8,8 +8,41 @@ export const DEFAULT_IDLE_TEARDOWN_MS = 30 * 60_000;
8
8
  // Join links are single-use and expire after this window; a leaked link that
9
9
  // is never used (or is reused after the guest joined) can't admit anyone.
10
10
  export const DEFAULT_JOIN_LINK_TTL_MS = 10 * 60_000;
11
- // cloudflared startup robustness
11
+ // cloudflared startup
12
12
  export const CLOUDFLARED_URL_TIMEOUT_MS = 30_000; // wait for the URL line
13
- export const CLOUDFLARED_HEALTH_ATTEMPTS = 10; // edge-reachability probes
14
- export const CLOUDFLARED_HEALTH_INTERVAL_MS = 1_000; // delay between probes
15
13
  export const OPEN_RETRY_ATTEMPTS = 3; // re-spawn attempts in session.open
14
+ // Host readiness gate. cloudflared prints the quick-tunnel URL before the
15
+ // per-tunnel DNS record has propagated (~8–25s). Any early lookup of the name
16
+ // via the system resolver would be NXDOMAIN and get negative-cached for the
17
+ // zone's SOA minimum (1800s), breaking the guest's join for up to 30 minutes.
18
+ // So we confirm liveness via DoH to IP-literal endpoints (which never touch the
19
+ // system resolver) before handing out the link.
20
+ export const READINESS_GATE_BUDGET_MS = 60_000; // total wait for the record to go live
21
+ export const READINESS_INITIAL_DELAY_MS = 5_000; // wait before the first poll (the record is never live in under ~8s)
22
+ export const READINESS_POLL_INTERVAL_MS = 1_000; // between DoH polls
23
+ // DoH resolver
24
+ export const DOH_REQUEST_TIMEOUT_MS = 3_000; // per-request (measured 40–110ms)
25
+ export const DOH_PROVIDERS = [
26
+ {
27
+ name: 'cloudflare',
28
+ url: (h, t) => `https://1.1.1.1/dns-query?name=${encodeURIComponent(h)}&type=${t}`,
29
+ headers: { accept: 'application/dns-json' },
30
+ },
31
+ {
32
+ name: 'cloudflare2',
33
+ url: (h, t) => `https://1.0.0.1/dns-query?name=${encodeURIComponent(h)}&type=${t}`,
34
+ headers: { accept: 'application/dns-json' },
35
+ },
36
+ // dns.google's cert carries an 8.8.8.8 SAN; the JSON endpoint is /resolve
37
+ // (NOT /dns-query, which expects wire format). IP-literal, so no system DNS.
38
+ {
39
+ name: 'google',
40
+ url: (h, t) => `https://8.8.8.8/resolve?name=${encodeURIComponent(h)}&type=${t}`,
41
+ },
42
+ ];
43
+ // Guest connection bounds (so a black-hole/lagging resolver can't hang the join)
44
+ export const GUEST_HANDSHAKE_TIMEOUT_MS = 15_000; // ws handshake (DNS+TCP+TLS+upgrade)
45
+ export const GUEST_CONNECT_DEADLINE_MS = 20_000; // overall connect+auth deadline (> handshake)
46
+ export const GUEST_SYS_LOOKUP_TIMEOUT_MS = 2_000; // bound the system-first lookup before DoH fallback
47
+ export const DOH_GUEST_RETRIES = 3; // DoH attempts in the guest fallback
48
+ export const DOH_GUEST_RETRY_DELAY_MS = 700; // backoff between guest DoH attempts
package/dist/env.d.ts CHANGED
@@ -5,15 +5,3 @@
5
5
  * truthiness check treats "0"/"false" as true).
6
6
  */
7
7
  export declare function envFlag(name: string): boolean;
8
- export type ReachabilityMode = 'warn' | 'strict' | 'off';
9
- /**
10
- * How `tunnel_open` treats a host-side reachability-probe failure:
11
- * warn (default) — open anyway, surface a warning; the guest is the real test
12
- * strict — fail open() if the host can't reach the public URL
13
- * off — skip the probe entirely
14
- * Reads `TUNNEL_REACHABILITY`. Only when it is unset/blank does it fall back to
15
- * the deprecated `TUNNEL_SKIP_REACHABILITY_CHECK` (== off) from 0.1.2 — an
16
- * explicitly set (even mistyped) `TUNNEL_REACHABILITY` never defers to the alias.
17
- * Any unrecognized value defaults to warn.
18
- */
19
- export declare function reachabilityMode(): ReachabilityMode;
package/dist/env.js CHANGED
@@ -11,21 +11,3 @@ export function envFlag(name) {
11
11
  const s = v.trim().toLowerCase();
12
12
  return s !== '' && s !== '0' && s !== 'false' && s !== 'no' && s !== 'off';
13
13
  }
14
- /**
15
- * How `tunnel_open` treats a host-side reachability-probe failure:
16
- * warn (default) — open anyway, surface a warning; the guest is the real test
17
- * strict — fail open() if the host can't reach the public URL
18
- * off — skip the probe entirely
19
- * Reads `TUNNEL_REACHABILITY`. Only when it is unset/blank does it fall back to
20
- * the deprecated `TUNNEL_SKIP_REACHABILITY_CHECK` (== off) from 0.1.2 — an
21
- * explicitly set (even mistyped) `TUNNEL_REACHABILITY` never defers to the alias.
22
- * Any unrecognized value defaults to warn.
23
- */
24
- export function reachabilityMode() {
25
- const raw = (process.env.TUNNEL_REACHABILITY ?? '').trim().toLowerCase();
26
- if (raw === 'warn' || raw === 'strict' || raw === 'off')
27
- return raw;
28
- if (raw === '' && envFlag('TUNNEL_SKIP_REACHABILITY_CHECK'))
29
- return 'off';
30
- return 'warn';
31
- }
@@ -0,0 +1,16 @@
1
+ export type DohClass = 'RESOLVED' | 'NXDOMAIN' | 'INDETERMINATE';
2
+ export interface DohAddress {
3
+ address: string;
4
+ family: 4 | 6;
5
+ }
6
+ export interface DohResult {
7
+ klass: DohClass;
8
+ addresses: DohAddress[];
9
+ }
10
+ export interface DohProvider {
11
+ name: string;
12
+ url: (host: string, type: 'A' | 'AAAA') => string;
13
+ headers?: Record<string, string>;
14
+ }
15
+ export declare function dohQueryOnce(provider: DohProvider, host: string, family: 4 | 6, timeoutMs?: number, fetchImpl?: typeof fetch): Promise<DohResult>;
16
+ export declare function dohResolve(host: string, family: 4 | 6, providers?: DohProvider[], timeoutMs?: number, fetchImpl?: typeof fetch): Promise<DohResult>;
@@ -0,0 +1,52 @@
1
+ import { isIP } from 'node:net';
2
+ import { DOH_PROVIDERS, DOH_REQUEST_TIMEOUT_MS } from '../config.js';
3
+ // Query ONE provider for ONE record type over an IP-literal endpoint (so it can
4
+ // never re-enter the system resolver). Never throws; classifies every failure.
5
+ export async function dohQueryOnce(provider, host, family, timeoutMs = DOH_REQUEST_TIMEOUT_MS, fetchImpl = fetch) {
6
+ const type = family === 6 ? 'AAAA' : 'A';
7
+ const rrType = family === 6 ? 28 : 1;
8
+ try {
9
+ const r = await fetchImpl(provider.url(host, type), {
10
+ headers: { accept: 'application/dns-json', ...(provider.headers ?? {}) },
11
+ signal: AbortSignal.timeout(timeoutMs),
12
+ });
13
+ if (!r.ok)
14
+ return { klass: 'INDETERMINATE', addresses: [] };
15
+ let j;
16
+ try {
17
+ j = (await r.json()); // captive-portal HTML / non-JSON body → catch below
18
+ }
19
+ catch {
20
+ return { klass: 'INDETERMINATE', addresses: [] };
21
+ }
22
+ if (!j || typeof j.Status !== 'number')
23
+ return { klass: 'INDETERMINATE', addresses: [] };
24
+ if (j.Status === 3)
25
+ return { klass: 'NXDOMAIN', addresses: [] }; // not live yet → keep polling
26
+ if (j.Status !== 0)
27
+ return { klass: 'INDETERMINATE', addresses: [] }; // SERVFAIL(2) etc → unreachable-ish
28
+ const answers = Array.isArray(j.Answer) ? j.Answer : [];
29
+ const addresses = answers
30
+ .filter((a) => a.type === rrType && typeof a.data === 'string' && isIP(a.data) === family)
31
+ .map((a) => ({ address: a.data, family }));
32
+ if (!addresses.length)
33
+ return { klass: 'NXDOMAIN', addresses: [] }; // A-less / CNAME-only → not routable yet
34
+ return { klass: 'RESOLVED', addresses };
35
+ }
36
+ catch {
37
+ return { klass: 'INDETERMINATE', addresses: [] }; // refused/timeout/ENETUNREACH/TLS reset
38
+ }
39
+ }
40
+ // Try providers in order; first RESOLVED wins. Fold classes: any NXDOMAIN (and
41
+ // no RESOLVED) → NXDOMAIN; otherwise INDETERMINATE (DoH itself unavailable).
42
+ export async function dohResolve(host, family, providers = DOH_PROVIDERS, timeoutMs = DOH_REQUEST_TIMEOUT_MS, fetchImpl = fetch) {
43
+ let sawNx = false;
44
+ for (const p of providers) {
45
+ const res = await dohQueryOnce(p, host, family, timeoutMs, fetchImpl);
46
+ if (res.klass === 'RESOLVED')
47
+ return res;
48
+ if (res.klass === 'NXDOMAIN')
49
+ sawNx = true;
50
+ }
51
+ return { klass: sawNx ? 'NXDOMAIN' : 'INDETERMINATE', addresses: [] };
52
+ }
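The fold rule used by `dohResolve` can be shown in isolation, without any network access. The sketch below uses a hypothetical `foldClasses` helper that takes the per-provider outcomes in order; it is an illustration of the rule, not a function exported by the package.

```javascript
// Hypothetical helper illustrating the fold in dohResolve (no network involved).
function foldClasses(perProviderClasses) {
  let sawNx = false;
  for (const klass of perProviderClasses) {
    if (klass === 'RESOLVED') return 'RESOLVED'; // first RESOLVED wins
    if (klass === 'NXDOMAIN') sawNx = true;
  }
  // No provider resolved: one authoritative "not there" outranks "could not ask".
  return sawNx ? 'NXDOMAIN' : 'INDETERMINATE';
}

const notLiveYet = foldClasses(['INDETERMINATE', 'NXDOMAIN', 'INDETERMINATE']);
const dohBlocked = foldClasses(['INDETERMINATE', 'INDETERMINATE', 'INDETERMINATE']);
const live = foldClasses(['NXDOMAIN', 'RESOLVED']);

console.log({ notLiveYet, dohBlocked, live });
```

The host-side gate treats both non-`RESOLVED` classes the same way (it keeps polling until the budget runs out). The distinction is available to callers that want to tell "record not live yet" apart from "DoH unavailable".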
@@ -2,13 +2,19 @@ import { EventEmitter } from 'node:events';
2
2
  import { JoinLink } from '../protocol/link.js';
3
3
  import { SessionLog } from '../log/sessionLog.js';
4
4
  import { WireMessage } from '../protocol/messages.js';
5
+ export interface GuestNetOptions {
6
+ handshakeTimeoutMs?: number;
7
+ connectDeadlineMs?: number;
8
+ lookup?: unknown;
9
+ }
5
10
  export declare class GuestClient extends EventEmitter {
6
11
  private link;
7
12
  private guestName;
8
13
  private log;
14
+ private netOpts;
9
15
  private ws?;
10
16
  private pending;
11
- constructor(link: JoinLink, guestName: string, log: SessionLog);
17
+ constructor(link: JoinLink, guestName: string, log: SessionLog, netOpts?: GuestNetOptions);
12
18
  connect(sinceSeq?: number): Promise<{
13
19
  goal: string;
14
20
  peerName: string;
@@ -2,23 +2,61 @@ import { EventEmitter } from 'node:events';
  import WebSocket from 'ws';
  import { respondChallenge } from '../protocol/crypto.js';
  import { encodeFrame, decodeFrame } from '../protocol/messages.js';
- import { DEFAULT_LISTEN_TIMEOUT_MS } from '../config.js';
+ import { DEFAULT_LISTEN_TIMEOUT_MS, GUEST_HANDSHAKE_TIMEOUT_MS, GUEST_CONNECT_DEADLINE_MS, } from '../config.js';
+ import { makeGuestLookup } from './guestLookup.js';
  export class GuestClient extends EventEmitter {
      link;
      guestName;
      log;
+     netOpts;
      ws;
      pending = new Map();
-     constructor(link, guestName, log) {
+     constructor(link, guestName, log, netOpts = {}) {
          super();
          this.link = link;
          this.guestName = guestName;
          this.log = log;
+         this.netOpts = netOpts;
      }
      connect(sinceSeq = 0) {
          return new Promise((resolve, reject) => {
-             const ws = new WebSocket(this.link.wsUrl);
+             const ws = new WebSocket(this.link.wsUrl, {
+                 // Resolve system-first, DoH-fallback (bypasses a stale NXDOMAIN negative
+                 // cache). ws keeps SNI/Host = the hostname, so returning a DoH IP here
+                 // does not break TLS validation or Cloudflare routing.
+                 lookup: this.netOpts.lookup ?? makeGuestLookup(),
+                 handshakeTimeout: this.netOpts.handshakeTimeoutMs ?? GUEST_HANDSHAKE_TIMEOUT_MS,
+             });
              this.ws = ws;
+             // Overall connect+auth deadline: handshakeTimeout only bounds DNS+TCP+TLS+
+             // upgrade; the post-open challenge/auth round-trip is otherwise unbounded.
+             let settled = false;
+             const deadline = setTimeout(() => {
+                 if (settled)
+                     return;
+                 settled = true;
+                 try {
+                     ws.terminate();
+                 }
+                 catch {
+                     /* already gone */
+                 }
+                 reject(new Error('timed out establishing tunnel'));
+             }, this.netOpts.connectDeadlineMs ?? GUEST_CONNECT_DEADLINE_MS);
+             const settleResolve = (v) => {
+                 if (settled)
+                     return;
+                 settled = true;
+                 clearTimeout(deadline);
+                 resolve(v);
+             };
+             const settleReject = (e) => {
+                 if (settled)
+                     return;
+                 settled = true;
+                 clearTimeout(deadline);
+                 reject(e);
+             };
              ws.on('message', (data) => {
                  let frame;
                  try {
@@ -38,10 +76,10 @@ export class GuestClient extends EventEmitter {
                  else if (frame.t === 'auth_ok') {
                      for (const m of frame.backlog)
                          this.log.record(m);
-                     resolve({ goal: frame.goal, peerName: frame.peerName });
+                     settleResolve({ goal: frame.goal, peerName: frame.peerName });
                  }
                  else if (frame.t === 'auth_fail') {
-                     reject(new Error(`auth failed: ${frame.reason}`));
+                     settleReject(new Error(`auth failed: ${frame.reason}`));
                      ws.close();
                  }
                  else if (frame.t === 'msg') {
@@ -56,7 +94,7 @@ export class GuestClient extends EventEmitter {
              });
              ws.on('close', () => this.failPending(new Error('tunnel disconnected')));
              ws.on('error', (err) => {
-                 reject(err);
+                 settleReject(err);
                  this.failPending(err);
              });
          });
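The `connect()` change above guards the promise with a `settled` flag so that the deadline timer, `auth_ok`, `auth_fail`, and socket errors cannot settle it more than once. A minimal standalone sketch of that pattern follows. The helper name `withDeadline` and its parameters are hypothetical, chosen for illustration only; they are not part of tunnel-mcp.

```javascript
// Hypothetical helper, not part of tunnel-mcp: returns resolve/reject wrappers
// that let only the first outcome through and always clear the deadline timer.
function withDeadline(resolve, reject, deadlineMs, onTimeout) {
  let settled = false;
  const timer = setTimeout(() => {
    if (settled) return;
    settled = true;
    onTimeout(); // e.g. terminate the socket
    reject(new Error('timed out establishing tunnel'));
  }, deadlineMs);
  const once = (fn) => (value) => {
    if (settled) return;
    settled = true;
    clearTimeout(timer);
    fn(value);
  };
  return { settleResolve: once(resolve), settleReject: once(reject) };
}

// Usage: the first outcome wins; later calls are ignored.
const outcomes = [];
const { settleResolve, settleReject } = withDeadline(
  (v) => outcomes.push(['resolved', v]),
  (e) => outcomes.push(['rejected', e.message]),
  60000,
  () => {},
);
settleResolve('auth_ok');
settleReject(new Error('late socket error'));
```

Clearing the timer in the wrappers matters as much as the flag: without it, a successful connection would keep a pending timeout alive until the deadline elapsed.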
@@ -0,0 +1,26 @@
+ import type { LookupOptions } from 'node:dns';
+ import { dohResolve } from '../net/doh.js';
+ type Addr = {
+     address: string;
+     family: number;
+ };
+ type LookupCallback = (err: NodeJS.ErrnoException | null, address?: string | Addr[], family?: number) => void;
+ type SysLookup = (hostname: string, options: LookupOptions, callback: LookupCallback) => void;
+ export interface GuestLookupOpts {
+     dohEnabled?: boolean;
+     doh?: typeof dohResolve;
+     sys?: SysLookup;
+     sysTimeoutMs?: number;
+     retries?: number;
+     retryDelayMs?: number;
+ }
+ export declare function dohEnabledByDefault(): boolean;
+ /**
+ * A drop-in `dns.lookup` for the guest WebSocket. Tries the system resolver
+ * first (respects split-horizon/corp DNS, and is what most guests need), then —
+ * only on failure — falls back to DoH, so a guest whose resolver lags or holds a
+ * stale NXDOMAIN negative cache still connects. Returns only an address; ws/tls
+ * keep SNI/Host = the hostname, so returning a DoH IP does not break routing.
+ */
+ export declare function makeGuestLookup(o?: GuestLookupOpts): (hostname: string, options: LookupOptions | number, callback: LookupCallback) => void;
+ export {};
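The doc comment above describes a two-stage lookup: system resolver first, DoH only on failure. The sketch below shows that control flow in isolation, under simplifying assumptions. `makeFallbackLookup`, `staleSystem`, and `fakeDoh` are hypothetical names, both resolvers are stubs, and the callback signature omits the `options` argument that the real `dns.lookup` contract (and `makeGuestLookup`) accepts.

```javascript
// Hypothetical sketch, not the package's implementation: try `primary` first,
// and call `fallback` only if primary errors, returns nothing, or times out.
function makeFallbackLookup(primary, fallback, primaryTimeoutMs = 1500) {
  return function lookup(hostname, callback) {
    let primaryDone = false; // ignores a late primary answer after the timeout
    const timer = setTimeout(() => {
      if (primaryDone) return;
      primaryDone = true;
      fallback(hostname, callback);
    }, primaryTimeoutMs);
    primary(hostname, (err, address, family) => {
      if (primaryDone) return;
      primaryDone = true;
      clearTimeout(timer);
      if (!err && address) return callback(null, address, family);
      fallback(hostname, callback);
    });
  };
}

// Stub resolvers standing in for dns.lookup and a DoH client (both assumed).
const staleSystem = (host, cb) => {
  const e = new Error(`getaddrinfo ENOTFOUND ${host}`);
  e.code = 'ENOTFOUND';
  cb(e);
};
const fakeDoh = (host, cb) => cb(null, '203.0.113.7', 4);

let resolved;
makeFallbackLookup(staleSystem, fakeDoh)('example.trycloudflare.com', (err, address, family) => {
  resolved = { err, address, family };
});
```

Because the lookup returns only an address, the caller still presents the original hostname for SNI and the `Host` header, which is why the fallback IP does not affect TLS validation.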
@@ -0,0 +1,80 @@
+ import { lookup as sysLookup } from 'node:dns';
+ import { dohResolve } from '../net/doh.js';
+ import { envFlag } from '../env.js';
+ import { GUEST_SYS_LOOKUP_TIMEOUT_MS, DOH_GUEST_RETRIES, DOH_GUEST_RETRY_DELAY_MS, } from '../config.js';
+ // DoH fallback is ON by default; only an explicit off/0/false/no disables it.
+ export function dohEnabledByDefault() {
+     return process.env.TUNNEL_DOH === undefined || envFlag('TUNNEL_DOH');
+ }
+ /**
+ * A drop-in `dns.lookup` for the guest WebSocket. Tries the system resolver
+ * first (respects split-horizon/corp DNS, and is what most guests need), then —
+ * only on failure — falls back to DoH, so a guest whose resolver lags or holds a
+ * stale NXDOMAIN negative cache still connects. Returns only an address; ws/tls
+ * keep SNI/Host = the hostname, so returning a DoH IP does not break routing.
+ */
+ export function makeGuestLookup(o = {}) {
+     const dohEnabled = o.dohEnabled ?? dohEnabledByDefault();
+     const doh = o.doh ?? dohResolve;
+     const sys = o.sys ?? sysLookup;
+     const sysTimeoutMs = o.sysTimeoutMs ?? GUEST_SYS_LOOKUP_TIMEOUT_MS;
+     const retries = o.retries ?? DOH_GUEST_RETRIES;
+     const retryDelayMs = o.retryDelayMs ?? DOH_GUEST_RETRY_DELAY_MS;
+     return function guestLookup(hostname, options, callback) {
+         const opts = typeof options === 'number' ? { family: options } : (options ?? {});
+         const wantAll = opts.all === true;
+         const family = opts.family === 6 ? 6 : 4; // prefer A/IPv4; AAAA only when explicitly asked
+         let settled = false;
+         const done = (err, address, fam) => {
+             if (settled)
+                 return;
+             settled = true;
+             callback(err, address, fam);
+         };
+         // Stage 1: system resolver first, bounded so a poisoned/lagging getaddrinfo
+         // can't stall for seconds before we fall back to DoH.
+         let sysSettled = false;
+         const sysTimer = setTimeout(() => {
+             if (!sysSettled) {
+                 sysSettled = true;
+                 goDoh(new Error('system lookup timed out'));
+             }
+         }, sysTimeoutMs);
+         sys(hostname, opts, (err, address, fam) => {
+             if (sysSettled)
+                 return;
+             sysSettled = true;
+             clearTimeout(sysTimer);
+             const ok = !err && (wantAll ? Array.isArray(address) && address.length > 0 : !!address);
+             if (ok)
+                 return done(null, address, fam);
+             goDoh(err ?? new Error(`getaddrinfo failed for ${hostname}`));
+         });
+         function goDoh(sysErr) {
+             if (!dohEnabled)
+                 return fail(sysErr);
+             let attempt = 0;
+             const tryOnce = () => {
+                 doh(hostname, family)
+                     .then((res) => {
+                         if (res.klass === 'RESOLVED') {
+                             if (wantAll)
+                                 return done(null, res.addresses.map((a) => ({ address: a.address, family: a.family })));
+                             return done(null, res.addresses[0].address, res.addresses[0].family);
+                         }
+                         // NXDOMAIN (still propagating) or INDETERMINATE (DoH blocked): retry a few times.
+                         if (++attempt < retries)
+                             return void setTimeout(tryOnce, retryDelayMs);
+                         fail(sysErr);
+                     })
+                     .catch(() => ++attempt < retries ? void setTimeout(tryOnce, retryDelayMs) : fail(sysErr));
+             };
+             tryOnce();
+         }
+         function fail(sysErr) {
+             const e = new Error(`could not resolve ${hostname}: system resolver failed (${sysErr.message}) and DoH (1.1.1.1/1.0.0.1/8.8.8.8) also failed`);
+             e.code = 'ENOTFOUND';
+             done(e, wantAll ? [] : '', family);
+         }
+     };
+ }
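The `dohEnabledByDefault` check in the file above treats an unset `TUNNEL_DOH` as "on" and defers everything else to `envFlag`, whose implementation is not shown in this diff. The sketch below illustrates the default-on behaviour that the source comment describes. The parsing rules and the names `OFF_VALUES` and `dohEnabled` are assumptions for illustration, not the package's actual code.

```javascript
// Hypothetical stand-in for the package's envFlag(); its real parsing rules are
// not shown in this diff. Assumption: off/0/false/no (any case) mean "disabled".
const OFF_VALUES = new Set(['off', '0', 'false', 'no']);

function dohEnabled(env) {
  const raw = env.TUNNEL_DOH;
  if (raw === undefined) return true; // unset: the DoH fallback stays on
  return !OFF_VALUES.has(raw.trim().toLowerCase());
}

// Taking the environment as a parameter keeps the check easy to exercise.
const checks = [
  dohEnabled({}),
  dohEnabled({ TUNNEL_DOH: '1' }),
  dohEnabled({ TUNNEL_DOH: 'off' }),
  dohEnabled({ TUNNEL_DOH: 'FALSE' }),
];
```

A guest on a network that blocks or forbids DoH would, under this reading, set `TUNNEL_DOH=off` to keep resolution on the system resolver only.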
package/dist/session.d.ts CHANGED
@@ -32,7 +32,6 @@ export declare class TunnelSession {
          joinLink: string;
          status: string;
          joinLinkExpiresInSec: number;
-         reachabilityWarning?: string;
      }>;
      join(joinLink: string, guestName: string): Promise<{
          tunnelId: string;
package/dist/session.js CHANGED
@@ -7,12 +7,9 @@ import { GuestClient } from './relay/guestClient.js';
  import { ensureCloudflared as realEnsure } from './cloudflared/provision.js';
  import { startCloudflared as realStart } from './cloudflared/tunnelProcess.js';
  import { DEFAULT_LISTEN_TIMEOUT_MS, DEFAULT_IDLE_TEARDOWN_MS, DEFAULT_JOIN_LINK_TTL_MS, OPEN_RETRY_ATTEMPTS, } from './config.js';
- import { reachabilityMode } from './env.js';
  const DEFAULT_DEPS = {
      ensureCloudflared: realEnsure,
-     // Resolve the reachability mode per-call (not at module load) so TUNNEL_REACHABILITY
-     // can be set right before opening a tunnel.
-     startCloudflared: (bin, port) => realStart(bin, port, { reachability: reachabilityMode() }),
+     startCloudflared: (bin, port) => realStart(bin, port),
  };
  export class TunnelSession {
      deps;
@@ -84,9 +81,6 @@ export class TunnelSession {
              joinLink,
              status: 'waiting_for_guest',
              joinLinkExpiresInSec: Math.round(joinTtlMs / 1000),
-             // Present only in 'warn' mode when the host couldn't confirm reachability;
-             // the agent should relay it to the human before sharing the link.
-             ...(tunnel.reachabilityWarning ? { reachabilityWarning: tunnel.reachabilityWarning } : {}),
          };
      }
      async join(joinLink, guestName) {
package/dist/tools.js CHANGED
@@ -28,7 +28,7 @@ function register(server, name, schema, cb) {
  }
  export function registerTools(server, session, opts) {
      register(server, 'tunnel_open', {
- description: 'Open a tunnel as host and get a join link to share. The link is a secret — share it over a trusted channel. It is single-use (works for exactly one guest) and expires (see joinLinkExpiresInSec in the result), so tell the human to share it promptly. If the result includes a reachabilityWarning, relay it to the human: this host could not confirm it can reach the link, so the guest should verify they can open it.',
+ description: 'Open a tunnel as host and get a join link to share. The link is a secret — share it over a trusted channel. It is single-use (works for exactly one guest) and expires (see joinLinkExpiresInSec in the result), so tell the human to share it promptly.',
          inputSchema: { goal: z.string() },
      }, async ({ goal }) => ok(await session.open(goal, opts.displayName)));
      register(server, 'tunnel_join', {
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "tunnel-mcp",
-   "version": "0.1.3",
+   "version": "0.1.4",
    "description": "Let two developers' Claude agents talk directly through an ephemeral, end-to-end-encrypted tunnel.",
    "type": "module",
    "bin": {