npm - fullstackgtm - Versions diffs - 0.25.0 → 0.25.2 - Mend

fullstackgtm 0.25.0 → 0.25.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/CHANGELOG.md +85 -0
package/INSTALL_FOR_AGENTS.md +15 -2
package/README.md +14 -3
package/dist/cli.js +17 -2
package/dist/connectors/hubspot.js +5 -2
package/dist/connectors/salesforce.js +4 -2
package/dist/connectors/stripe.js +4 -2
package/dist/credentials.js +22 -1
package/dist/enrich.js +24 -2
package/dist/enrichApollo.js +5 -2
package/dist/market.d.ts +1 -0
package/dist/market.js +144 -8
package/dist/marketReport.d.ts +9 -0
package/dist/marketReport.js +29 -4
package/dist/schedule.d.ts +17 -0
package/dist/schedule.js +83 -2
package/docs/api.md +28 -2
package/docs/crm-health-lifecycle.md +11 -6
package/docs/roadmap-to-1.0.md +27 -0
package/package.json +1 -1
package/skills/fullstackgtm/SKILL.md +6 -4
package/src/cli.ts +18 -1
package/src/connectors/hubspot.ts +5 -2
package/src/connectors/salesforce.ts +4 -2
package/src/connectors/stripe.ts +4 -2
package/src/credentials.ts +24 -0
package/src/enrich.ts +25 -2
package/src/enrichApollo.ts +5 -2
package/src/market.ts +129 -8
package/src/marketReport.ts +30 -4
package/src/schedule.ts +92 -2

package/CHANGELOG.md CHANGED

@@ -5,6 +5,68 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and the project adheres to [Semantic Versioning](https://semver.org/).
 The path to 1.0 is planned in [docs/roadmap-to-1.0.md](./docs/roadmap-to-1.0.md).
+## [0.25.2] — 2026-06-15
+Security hardening I — confirmed fixes from an adversarial audit (each verified
+by a refute-by-default re-attack; the crontab and report fixes took three
+rounds because the re-attack kept finding deeper paths).
+### Security
+- **Crontab injection via `schedule install` (was: arbitrary code execution).**
+  `schedule add --label` rejects newlines/control chars; `renderManagedBlock`
+  now refuses to render any entry (or CLI invocation) whose interpolated
+  fields — label, cron, id, profile, argv, **and the resolved node/script
+  path + `FSGTM_HOME`** — carry a control character, so a hand-edited
+  `schedules.json` or a newline in `FSGTM_HOME` can no longer inject a live
+  crontab line. `parseCron` now accepts ASCII space/tab only (rejects Unicode
+  whitespace), and a stray `%` in a path is escaped (`\%`) so it can't truncate
+  the managed line.
+- **SSRF in `market capture`.** Page fetches now allow only http/https, refuse
+  any host that is or resolves to a private/loopback/link-local/CGNAT/metadata
+  address (IPv4, IPv6, and IPv4-mapped IPv6 in dotted or hex form), follow
+  redirects manually with per-hop re-validation, and cap time/body size.
+- **Stored XSS in the market HTML report.** The embedded JSON data island is
+  serialized with `<`/`>`/`&`/U+2028/U+2029 escaped (no `</script>` breakout),
+  the tooltip is built with `textContent` (no `innerHTML`), and the two
+  remaining raw sinks (anchor vendor name, evidence-appendix confidence) are
+  now `escapeHtml`'d; `validateObservationSet` rejects a non-enum `confidence`
+  so an `observe --from` file can't smuggle markup.
+- **Provider response bodies no longer leak into errors.** HubSpot, Salesforce,
+  Apollo, and Stripe connectors throw status-line-only errors (a 4xx body can
+  echo submitted emails/domains or the key, and these errors are persisted into
+  scheduled-run records).
+- **CSV/formula injection neutralized at the enrich write path.** Ingested
+  string values beginning with `= + - @` / tab / CR are prefixed with `'` so
+  they can't execute if the CRM is later exported to a spreadsheet; numeric
+  values keep full fidelity.
+- **Credential-store mode enforced on read, not just write.** A pre-existing
+  `credentials.json` with group/other permissions is re-tightened to 0600 (and
+  warned) on read, closing the inherited-loose-permissions gap.
+Known residuals tracked for follow-up: `marketMapToMarkdown` does not
+HTML-escape (safe in terminals/GitHub; only a risk if a downstream renderer
+trusts raw HTML — to be addressed with the report work); the credential read
+check is reactive (a loose file is exposed until the next CLI read).
+## [0.25.1] — 2026-06-12
+Docs-sync release — no code changes.
+### Fixed
+- README, INSTALL_FOR_AGENTS.md, the agent skill, and docs/api.md corrected
+  against the shipped surface: the MCP tool list now enumerates all 8 tools
+  (read-only vs gated), the builtin rule count is 12, docs/api.md gains a
+  Schedule section and the `schedule` command in its CLI list, the README
+  cites the 612-run CRM-ops benchmark and the `diff --fail-on-new-findings`
+  CI gate, the bulk-update section covers `!~` / `--create-task` /
+  `--force-archive-duplicates`, and the skill's verb map completes the
+  `schedule`, `market`, and `plans` rows and adds `report`.
+- The 0.23.0 entry below is amended retroactively: `dedupe`, `reassign`,
+  `fix`, and `--set <field>=from:<sourceField>` shipped in 0.23.0 without a
+  changelog record.
 ## [0.25.0] — 2026-06-12
 ### Added
@@ -136,6 +198,29 @@ everything that shipped 0.19–0.23. (No code changes.)
   - Every `enrich` subcommand catches `--help`/`-h` before config load,
     credential resolution, or any network call. No scheduling/cron logic —
     that is the horizontal schedule layer's job (docs/schedule.md).
+- **Four task-shaped verbs** (entry added retroactively — these shipped in
+  0.23.0 without a changelog record). The 612-run benchmark's gated-agent
+  failures clustered into four missing verbs; all four compile to plans
+  through the existing plan → approve → apply gate — nothing writes directly.
+  - `dedupe <account|contact|deal> --key <domain|email|name>` — duplicate
+    groups by normalized identity key, one `merge_records` operation per
+    group with a deterministic survivor (`richest` = most populated data
+    fields, ties to lowest id; `oldest` = lowest id). High risk, approval
+    required; merges are irreversible on apply.
+  - `reassign --from <ownerId> --to <ownerId>` — the ownership-handoff
+    playbook: one bulk-update-style plan per object type, extra `--where`
+    scoping account-lifted for deals/contacts, `--except-deal-stage`
+    excluding the stage AND records whose account has an open deal in it —
+    re-verified per record at apply time.
+  - `fix --rule <id>` — one-shot composite: audit one rule → save → suggest
+    → approve only suggestion-backed values at the confidence bar → apply
+    (`--yes` required), with a stage-by-stage summary.
+  - `bulk-update --set <field>=from:<sourceField>` — per-record derived
+    values resolved from the filter view (relational sources like
+    `account.ownerId` included); empty-source records are skipped and
+    counted, never guessed. Plus the `--archive` duplicate guard: archiving
+    a record that shares its identity key with another is refused and
+    pointed at `dedupe`, overridable with `--force-archive-duplicates`.
 ## [0.22.0] — 2026-06-12
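The data-island escaping described in the 0.25.2 "Stored XSS" entry can be illustrated with a minimal sketch. The `marketReport.js` change itself is not shown in this part of the diff, so the helper name `serializeForScriptTag` and its exact escape table are assumptions for illustration, not the package's code.

```javascript
// Hypothetical helper (not from the package): serialize data for embedding
// inside an inline <script> data island without allowing a </script> breakout.
function serializeForScriptTag(data) {
  const replacements = {
    "<": "\\u003c",
    ">": "\\u003e",
    "&": "\\u0026",
    "\u2028": "\\u2028",
    "\u2029": "\\u2029",
  };
  // JSON.stringify leaves these characters raw; swap each for its \uXXXX
  // escape, which JSON.parse decodes back to the original character.
  return JSON.stringify(data).replace(/[<>&\u2028\u2029]/g, (ch) => replacements[ch]);
}

const island = serializeForScriptTag({ vendor: "</script><script>alert(1)</script>" });
console.log(island.includes("</script>")); // false
console.log(JSON.parse(island).vendor === "</script><script>alert(1)</script>"); // true
```

The escaped output is still valid JSON, so the page can read it back with `JSON.parse` and receive the original strings unchanged.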

package/INSTALL_FOR_AGENTS.md CHANGED

@@ -3,6 +3,10 @@
 Deterministic install-and-verify steps. Every command is non-interactive, every
 check has an expected output, and nothing here writes to a CRM.
+If your harness supports agent skills, `npx skills add fullstackgtm/core`
+installs the compact operating guide; this document remains the deterministic
+install-and-verify path.
 ## 1. Install
 ```bash
@@ -60,6 +64,11 @@ page texts — every span is checked character-for-character against the stored
 capture, and paraphrased quotes are rejected. In non-interactive contexts the
 CLI never prompts — it fails with this guidance.
+Apollo enrichment (`enrich append --source apollo`) needs `APOLLO_API_KEY` in
+the environment, or have the human run `echo "$KEY" | fullstackgtm login apollo`
+once. Without it, `enrich ingest <file> --source clay` still stages push-style
+data keyless.
 Provider prerequisites (what the human must create, and which scopes) are in
 the README's **"Connect your CRM"** section: HubSpot needs a private app with
 four `crm.objects.*.read` scopes (plus write scopes only for `apply`);
@@ -111,8 +120,12 @@ If the working directory's project already has the peers in its node_modules,
 the server resolves them from there (peer-dependency semantics) — so this
 works from inside existing projects too.
-Tools exposed over stdio: `fullstackgtm_audit` (read-only),
-`fullstackgtm_rules`, `fullstackgtm_apply` (requires `approvedOperationIds`).
+Tools exposed over stdio — read-only: `fullstackgtm_audit`,
+`fullstackgtm_rules`, `fullstackgtm_suggest`, `fullstackgtm_call_parse`,
+`fullstackgtm_resolve`, `fullstackgtm_market_worksheet`. Gated:
+`fullstackgtm_apply` (requires explicit `approvedOperationIds`),
+`fullstackgtm_market_observe` (every quoted span is verified against the
+stored captures before anything is appended).
 ## Troubleshooting

package/README.md CHANGED

@@ -127,7 +127,7 @@ fullstackgtm reassign --from 411 --to 902 --except-deal-stage closing --save   #
 fullstackgtm fix --rule missing-deal-owner --provider hubspot --yes  # audit one rule → suggest → approve → apply, one command
 ```
-`bulk-update` filters the snapshot (`=`, `!=`, `~` substring, `:empty`/`:notempty`, `|` any-of, relational pseudo-fields like `account.domain` or `openDealStages`) into a dry-run patch plan — and **the full filter is re-verified per record at apply time**, with mid-apply rechecks, so a record that stopped matching between audit and apply is skipped, not clobbered. Equality filters double as preconditions; `--require` adds explicit ones; `--guard` asserts cross-record conditions; `--max-operations` caps blast radius. `--set field=from:<sourceField>` derives values per record; `--archive` refuses records whose identity key (account domain, contact email) is shared with another record — that's a duplicate, and duplicates are merged with `dedupe`, not archived around.
+`bulk-update` filters the snapshot (`=`, `!=`, `~` substring, `!~` not-substring, `:empty`/`:notempty`, `|` any-of, relational pseudo-fields like `account.domain` or `openDealStages`) into a dry-run patch plan — and **the full filter is re-verified per record at apply time**, with mid-apply rechecks, so a record that stopped matching between audit and apply is skipped, not clobbered. Equality filters double as preconditions; `--require` adds explicit ones; `--guard` asserts cross-record conditions; `--max-operations` caps blast radius. `--set field=from:<sourceField>` derives values per record; `--create-task <text>` is the third change mode, emitting approval-gated `create_task` operations instead of field writes; `--archive` refuses records whose identity key (account domain, contact email) is shared with another record — that's a duplicate, and duplicates are merged with `dedupe`, not archived around (`--force-archive-duplicates` overrides that refusal explicitly).
 `dedupe` finds duplicate groups by normalized identity key and emits one `merge_records` operation per group with a deterministic survivor (`richest` = most populated fields, ties to lowest id; `oldest`). Merges stay irreversible-and-therefore-low-confidence-capped on approval, exactly like merge suggestions from the audit. `reassign` is the ownership-handoff playbook: one plan per object type, extra scoping account-lifted to deals and contacts, and `--except-deal-stage` excludes both deals in that stage and every record whose account has an open deal in it. `fix` is the one-shot composite for a single rule: audit → save → suggest → approve suggestion-backed operations at the confidence bar → with `--yes`, apply and print the stage-by-stage summary; without it, stop after approval and print the apply command.
@@ -210,12 +210,17 @@ fullstackgtm audit --input snap.json --rules stale-deal --stale-days 45 --json
 # Gate a nightly CI job or agent run on hygiene: exit 2 if findings ≥ threshold
 fullstackgtm audit --provider hubspot --fail-on warning
+# Gate CI on hygiene drift instead: exit 2 only when a NEW (rule, record) finding appears
+fullstackgtm diff --before old.json --after new.json --fail-on-new-findings
 ```
 - Finding and operation ids are **stable hashes** of rule + record, so two runs over the same data produce identical ids — agents can diff plans, track findings across runs, and approve operations by id without re-parsing.
 - `--demo` (with `--seed`) generates a realistic mid-market CRM with injected real-world failure modes — departed owners, unlinked deals, orphan accounts, stale pipeline — so agents and CI can exercise the full snapshot → audit → apply pipeline with zero credentials.
 - Exit codes: `0` success, `1` error, `2` findings at/above `--fail-on`.
+"Built for agents" is measured, not asserted: a 612-run benchmark (17 scenarios × 3 tool-surface arms × 4 trials, deterministic graders over final CRM state, τ-bench-style pass^k) shows the gated CLI surface beating raw CRM-API access on completion-under-policy for every model tested. Full matrix and methodology: [the leaderboard](./evals/crm/leaderboard/RESULTS.md).
 ## Authentication: CLI-first, browser only at the consent moment
 Credential resolution is a ladder — the first rung that yields a token wins:
@@ -297,7 +302,7 @@ The Stripe connector only reads customers and subscriptions, and `apply` is read
 | Concept | What it is |
 |---|---|
 | **Canonical snapshot** | Provider-independent view of users, accounts, contacts, deals, activities. Records carry `identities` — `(provider, externalId)` claims — so the same real-world entity can be tracked across several systems. |
-| **Audit rule** | A deterministic function `(context) => { findings, operations }`. Eleven built-ins cover orphan accounts, ownerless/unlinked/amount-less deals, past close dates, stale pipeline, duplicates, and more — `fullstackgtm rules` lists them all. Write your own in ~10 lines. |
+| **Audit rule** | A deterministic function `(context) => { findings, operations }`. Twelve built-ins cover orphan accounts, ownerless/unlinked/amount-less deals, past close dates, stale pipeline, duplicates, and more — `fullstackgtm rules` lists them all. Write your own in ~10 lines. |
 | **Patch plan** | The dry-run output of an audit: findings plus typed patch operations with before/after values, reasons, risk levels, and approval flags. Always a proposal, never a mutation. |
 | **Connector** | A provider adapter: `fetchSnapshot()` for reads, optional `applyOperation()` for writes. HubSpot and Salesforce reference connectors ship in the package; connectors never drop records they can't fully resolve — the audit flags them instead. |
 | **Patch plan run** | The audit record of one apply attempt: per-operation applied/failed/skipped results. |
@@ -396,7 +401,13 @@ Or configure any MCP client (Cursor, Claude Desktop, …) with:
 }
 ```
-Exposes `fullstackgtm_audit` (read-only; sample, demo, file, or live provider sources with optional rule scoping), `fullstackgtm_rules` (rule discovery), and `fullstackgtm_apply` (requires explicit `approvedOperationIds`) over stdio. Tokens stored via `fullstackgtm login` are picked up automatically — the env var is only needed when no stored login exists.
+Eight tools are exposed over stdio.
+**Read-only:** `fullstackgtm_audit` (sample, demo, file, or live provider sources with optional rule scoping), `fullstackgtm_rules` (rule discovery), `fullstackgtm_suggest` (deterministic placeholder values with confidence + reasons), `fullstackgtm_call_parse` (transcripts → provenance-marked segments, insights, and evidence), `fullstackgtm_resolve` (the create gate: exists / ambiguous / safe_to_create), and `fullstackgtm_market_worksheet` (the classification packet for one vendor: claims, judging rules, captured page texts).
+**Gated:** `fullstackgtm_apply` (requires explicit `approvedOperationIds`; placeholders still need value overrides) and `fullstackgtm_market_observe` (verifies every quoted span against the stored captures before appending — nothing is stored unless the whole set passes).
+Tokens stored via `fullstackgtm login` are picked up automatically — the env var is only needed when no stored login exists.
 ## Safety model

package/dist/cli.js CHANGED

@@ -27,7 +27,7 @@ import { marketMapToHtml, marketMapToMarkdown } from "./marketReport.js";
 import { DEFAULT_RUBRIC, detectProviderFromKey, extractInsightsLlm, parseRubric, resolveLlmCredential, scoreCallLlm, validateLlmKey, } from "./llm.js";
 import { buildEnrichPlan, createFileEnrichRunStore, DEFAULT_STALE_DAYS, ENRICH_CONFIG_FILE_NAME, enrichRunId, inferIngestObjectType, latestStamps, loadEnrichConfig, parseCsv, resolveCrmField, selectStaleWork, stagedSourceRecords, staleDaysFor, } from "./enrich.js";
 import { apolloPullKeysForAppend, apolloPullKeysForRefresh, createApolloClient, pullApolloRecords, } from "./enrichApollo.js";
-import { computeMissedFirings, createFileScheduleRunStore, createFileScheduleStore, nextCronFiring, parseCron, renderManagedBlock, replaceManagedBlock, scheduleId, systemCrontabIo, tokenizeCommand, validateSchedulableArgv, } from "./schedule.js";
+import { computeMissedFirings, createFileScheduleRunStore, createFileScheduleStore, nextCronFiring, parseCron, renderManagedBlock, replaceManagedBlock, assertSingleLineLabel, hasControlChar, scheduleId, systemCrontabIo, tokenizeCommand, validateSchedulableArgv, } from "./schedule.js";
 import { resolveRecord } from "./resolve.js";
 import { buildBulkUpdatePlan } from "./bulkUpdate.js";
 import { buildDedupePlan } from "./dedupe.js";
@@ -1614,6 +1614,7 @@ trigger: manual. status shows next firing and surfaces missed firings
         const createdAt = new Date().toISOString();
         const label = option(rest, "--label") ??
             argv.filter((arg) => !arg.startsWith("--")).slice(0, 2).join("-").replace(/[^\w.-]+/g, "-");
+        assertSingleLineLabel(label);
         const entry = {
             id: scheduleId(label, cron.source, argv, createdAt),
             label,
@@ -1819,13 +1820,27 @@ function scheduleCliInvocation() {
     if (!script || !existsSync(script)) {
         throw new Error("Cannot resolve the fullstackgtm entry point for crontab lines (process.argv[1] is missing).");
     }
+    // A newline/control char in any of these flows verbatim into the crontab
+    // executable line; single-quote escaping defends the shell, not cron's line
+    // parser. Refuse early with a clear message (renderManagedBlock re-checks).
+    for (const [name, value] of [
+        ["FSGTM_HOME", process.env.FSGTM_HOME],
+        ["the node executable path", process.execPath],
+        ["the CLI script path", script],
+    ]) {
+        if (value && hasControlChar(value)) {
+            throw new Error(`Cannot install schedules: ${name} contains a newline or control character.`);
+        }
+    }
     const quote = (value) => `'${value.replace(/'/g, `'\\''`)}'`;
     const parts = [quote(process.execPath)];
     if (script.endsWith(".ts"))
         parts.push("--experimental-strip-types");
     parts.push(quote(script));
     const home = process.env.FSGTM_HOME ? `FSGTM_HOME=${quote(process.env.FSGTM_HOME)} ` : "";
-    return home + parts.join(" ");
+    // cron treats an unescaped `%` in the command field as a newline/stdin split.
+    // Escape it as `\%` so a stray `%` in a path can't truncate the managed line.
+    return (home + parts.join(" ")).replace(/%/g, "\\%");
 }
 /**
  * The single provider entry point: execute the scheduled command in-process
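The three defenses in `scheduleCliInvocation` (control-character refusal, single-quote shell quoting, and `%` escaping) can be exercised in isolation. This is a rough sketch: the body of `hasControlChar` is not shown in this diff, so the version here is an assumed stand-in, and `shellQuote` and `cronSafeCommand` are hypothetical names.

```javascript
// Assumed stand-in for the package's hasControlChar (its real body is not
// shown in this diff): true for C0 control characters and DEL.
function hasControlChar(value) {
  return /[\u0000-\u001f\u007f]/.test(value);
}

// Same quoting idea as the `quote` arrow function in scheduleCliInvocation:
// wrap in single quotes and escape each embedded single quote as '\''.
function shellQuote(value) {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical helper combining the checks in the order the diff applies them.
function cronSafeCommand(parts) {
  for (const part of parts) {
    if (hasControlChar(part)) {
      throw new Error("Refusing to render: field contains a newline or control character.");
    }
  }
  // An unescaped % would otherwise be treated by cron as a line split.
  return parts.map(shellQuote).join(" ").replace(/%/g, "\\%");
}

console.log(cronSafeCommand(["/usr/bin/node", "/opt/it's 100%/cli.js"]));
// prints: '/usr/bin/node' '/opt/it'\''s 100\%/cli.js'
```

Quoting alone does not help with newlines, because cron splits lines before any shell sees them, which is why the control-character check runs first.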

package/dist/connectors/hubspot.js CHANGED

@@ -44,8 +44,11 @@ export function createHubspotConnector(options) {
             throw new Error(`Cannot reach HubSpot at ${baseUrl}${cause}. Check network access.`);
         }
         if (!response.ok) {
-            const body = await response.text();
-            throw new Error(`HubSpot API error ${response.status}: ${body}`);
+            // Status line only — HubSpot 4xx bodies echo submitted property values
+            // (contact emails, company domains) and the request payload, and these
+            // errors are persisted into scheduled-run records. Never interpolate it.
+            await response.text().catch(() => undefined);
+            throw new Error(`HubSpot API error ${response.status}. Check the token scopes and request.`);
         }
         // DELETE and some association writes return 204 with an empty body.
         const text = await response.text();

package/dist/connectors/salesforce.js CHANGED

@@ -46,8 +46,10 @@ export function createSalesforceConnector(options) {
             throw new Error(`Cannot reach Salesforce at ${connection.instanceUrl}${cause}. Check SALESFORCE_INSTANCE_URL (your My Domain URL, e.g. https://yourco.my.salesforce.com) and network access.`);
         }
         if (!response.ok) {
-            const body = await response.text();
-            throw new Error(`Salesforce API error ${response.status}: ${body}`);
+            // Status line only — the body echoes submitted field values and the
+            // request, and these errors are persisted into scheduled-run records.
+            await response.text().catch(() => undefined);
+            throw new Error(`Salesforce API error ${response.status}. Check the token and request.`);
         }
         // Salesforce PATCH returns 204 No Content on success.
         const text = await response.text();

package/dist/connectors/stripe.js CHANGED

@@ -26,8 +26,10 @@ export function createStripeConnector(options) {
             headers: { Authorization: `Bearer ${apiKey}` },
         });
         if (!response.ok) {
-            const body = await response.text();
-            throw new Error(`Stripe API error ${response.status}: ${body}`);
+            // Status line only — the body can echo request details bound to a live
+            // billing key, and these errors land in scheduled-run records.
+            await response.text().catch(() => undefined);
+            throw new Error(`Stripe API error ${response.status}. Check the restricted key and request.`);
         }
         return response.json();
     }
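The HubSpot, Salesforce, and Stripe hunks share one pattern: drain the response body, discard it, and throw an error built from the status code alone. A minimal sketch of that pattern follows. The helper names `statusOnlyError` and `failOnBadResponse` are hypothetical, and the response object is a hand-written stub rather than a real provider reply.

```javascript
// Hypothetical helper: the message is built from the status and a fixed hint,
// so no part of the response body can reach it.
function statusOnlyError(provider, status, hint) {
  return new Error(`${provider} API error ${status}. ${hint}`);
}

// Hypothetical wrapper mirroring the connector hunks: read and discard the
// body, then throw the status-only error.
async function failOnBadResponse(provider, response, hint) {
  if (response.ok) return response;
  await response.text().catch(() => undefined);
  throw statusOnlyError(provider, response.status, hint);
}

// Stub response whose body echoes a submitted email address.
const stub = { ok: false, status: 400, text: async () => "invalid email alice@example.com" };

failOnBadResponse("HubSpot", stub, "Check the token scopes and request.")
  .catch((err) => console.log(err.message));
// prints: HubSpot API error 400. Check the token scopes and request.
```

Because these errors are persisted into scheduled-run records, keeping the body out of the message also keeps it out of anything stored on disk.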

package/dist/credentials.js CHANGED

@@ -1,4 +1,4 @@
-import { chmodSync, existsSync, mkdirSync, readdirSync, readFileSync, unlinkSync, writeFileSync, } from "node:fs";
+import { chmodSync, existsSync, mkdirSync, readdirSync, readFileSync, statSync, unlinkSync, writeFileSync, } from "node:fs";
 import { homedir } from "node:os";
 import { join } from "node:path";
 import { refreshHubspotToken } from "./connectors/hubspotAuth.js";
@@ -98,8 +98,29 @@ export function writeSecureFile(path, contents) {
         // Non-POSIX filesystems ignore chmod.
     }
 }
+/**
+ * The 0600/0700 guarantee was write-only: a credentials.json inherited at
+ * looser permissions (a restored backup, a file created by another tool, a
+ * cloned home) was read and trusted regardless of its actual mode. Enforce the
+ * mode on read too — re-tighten to 0600 and warn once — so a world-readable
+ * credential store can't sit there silently leaking the token to other users.
+ */
+function enforceCredentialFileMode(path) {
+    try {
+        const mode = statSync(path).mode & 0o777;
+        if ((mode & 0o077) !== 0) {
+            chmodSync(path, 0o600);
+            console.error(`fullstackgtm: tightened ${path} from ${mode.toString(8).padStart(3, "0")} to 600 ` +
+                "(it was readable or writable by other users).");
+        }
+    }
+    catch {
+        // Missing file or non-POSIX filesystem: nothing to enforce.
+    }
+}
 function readFile() {
     try {
+        enforceCredentialFileMode(credentialsPath());
         const parsed = JSON.parse(readFileSync(credentialsPath(), "utf8"));
         if (parsed && typeof parsed === "object" && parsed.version === 1 && parsed.providers) {
             return parsed;
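The permission test inside `enforceCredentialFileMode` is plain bit arithmetic on the file mode, and it can be checked without touching the filesystem. The sketch below pulls the two expressions out into pure functions. `isLooseMode` and `formatMode` are hypothetical names, not exports of the package.

```javascript
// Hypothetical pure helpers mirroring the expressions in enforceCredentialFileMode.

// True when any group or other permission bit is set (mask 0o077).
function isLooseMode(mode) {
  return (mode & 0o077) !== 0;
}

// Permission bits as a three-digit octal string, as used in the warning text.
function formatMode(mode) {
  return (mode & 0o777).toString(8).padStart(3, "0");
}

// 0o100640 is what stat reports for a regular file with mode 640: the
// file-type bits sit above the permission bits and are masked off.
for (const mode of [0o600, 0o644, 0o660, 0o100640]) {
  console.log(formatMode(mode), isLooseMode(mode));
}
// prints: 600 false, 644 true, 660 true, 640 true (one pair per line)
```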

package/dist/enrich.js CHANGED

@@ -291,6 +291,28 @@ function valueToString(value) {
         return String(value);
     return "";
 }
+/**
+ * CSV/formula-injection neutralization for string values destined for a CRM
+ * write. Third-party export rows (Clay CSV, webhook JSON) can contain cells
+ * like `=cmd|'/c calc'!A1` or `@SUM(...)`; written verbatim to a CRM field they
+ * lie dormant until someone exports the CRM to CSV and opens it in a spreadsheet,
+ * where the leading `= + - @` (or a leading tab/CR) makes the client execute it.
+ * We prefix a single apostrophe — the spreadsheet-standard escape that renders
+ * the cell as literal text. Numeric values bypass this (they're written as
+ * numbers, not strings), so signed numbers keep full fidelity; a phone number
+ * supplied as a string and starting with `+` gains a leading `'`, which the
+ * human sees in the approved diff. Applied only at the write path, never to
+ * match keys.
+ */
+function neutralizeFormulaInjection(value) {
+    if (value && /^[=+\-@\t\r]/.test(value))
+        return `'${value}`;
+    return value;
+}
+/** valueToString for a value that will be written to a CRM field. */
+function writeSafeString(value) {
+    return neutralizeFormulaInjection(valueToString(value));
+}
 function normalizeKeyValue(key, value) {
     const text = valueToString(value).toLowerCase();
     if (!text)
@@ -498,7 +520,7 @@ export function buildEnrichPlan(options) {
                     operation: "set_field",
                     field: canonicalField,
                     beforeValue: currentValue ?? null,
-                    afterValue: typeof sourceValue === "number" ? sourceValue : valueToString(sourceValue),
+                    afterValue: typeof sourceValue === "number" ? sourceValue : writeSafeString(sourceValue),
                     reason: `${source} ${record.objectType} "${describeSourceRecord(record)}" (matched by ` +
                         `${outcome.matchedKey}) reports a changed value for ${canonicalField}.`,
                     sourceRuleOrPolicy: `enrich:${source}:${canonicalField}`,
@@ -516,7 +538,7 @@ export function buildEnrichPlan(options) {
             if (!isEmptyValue(currentValue))
                 continue;
             emittedForRecord = true;
-            const afterValue = typeof sourceValue === "number" ? sourceValue : valueToString(sourceValue);
+            const afterValue = typeof sourceValue === "number" ? sourceValue : writeSafeString(sourceValue);
             operations.push({
                 id: `op_enr_${fnv1a(`${source}:${record.objectType}:${outcome.recordId}:${canonicalField}`)}`,
                 objectType: canonicalObjectType(record.objectType),
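A short sketch of how the write-path neutralization behaves on typical inputs. `neutralizeFormulaInjection` is copied from the hunk above. `toAfterValue` is a hypothetical wrapper that mirrors the `afterValue` expression, and it uses `String()` in place of the package's `valueToString`, whose full body is not shown here.

```javascript
// Copied from the diff: prefix an apostrophe when the string starts with a
// character a spreadsheet would treat as the start of a formula.
function neutralizeFormulaInjection(value) {
  if (value && /^[=+\-@\t\r]/.test(value)) return `'${value}`;
  return value;
}

// Hypothetical wrapper mirroring the afterValue expression. String() is an
// assumed stand-in for the package's valueToString.
function toAfterValue(sourceValue) {
  return typeof sourceValue === "number"
    ? sourceValue
    : neutralizeFormulaInjection(String(sourceValue ?? ""));
}

console.log(toAfterValue("=cmd|'/c calc'!A1")); // '=cmd|'/c calc'!A1
console.log(toAfterValue("+1 555 0100"));       // '+1 555 0100
console.log(toAfterValue(-42));                 // -42 (number, untouched)
console.log(toAfterValue("Acme Corp"));         // Acme Corp
```

The phone-number case shows the trade-off the code comment mentions: a string starting with `+` gains a leading apostrophe, which is visible in the approved diff.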

package/dist/enrichApollo.js CHANGED

@@ -56,9 +56,12 @@ export function createApolloClient(options) {
             if (response.status === 404)
                 return null;
             if (!response.ok) {
-                const body = await response.text();
+                // Status line only — never interpolate the response body. It can echo
+                // the submitted query (contact emails / company domains) or the API key,
+                // and these errors are persisted verbatim into scheduled-run records.
+                await response.text().catch(() => undefined);
                 const exhausted = response.status === 429 ? ` (rate limited; ${maxRetries} retries exhausted)` : "";
-                throw new Error(`Apollo API error ${response.status}${exhausted}: ${body}`);
+                throw new Error(`Apollo API error ${response.status}${exhausted}. Check the API key and request.`);
             }
             const text = await response.text();
             return text ? JSON.parse(text) : null;

package/dist/market.d.ts CHANGED

@@ -153,6 +153,7 @@ export type FetchPage = (url: string) => Promise<{
     status: number;
     body: string;
 }>;
+export declare function assertPublicUrl(rawUrl: string): Promise<URL>;
 export type CaptureOptions = {
     /** Directory for captures; defaults to <marketHome>/captures. */
     dir?: string;

package/dist/market.js CHANGED

@@ -1,5 +1,7 @@
 import { createHash } from "node:crypto";
+import { lookup } from "node:dns/promises";
 import { existsSync, mkdirSync, readFileSync, readdirSync, writeFileSync } from "node:fs";
+import { isIP } from "node:net";
 import { join } from "node:path";
 import { credentialsDir } from "./credentials.js";
 const INTENSITY_RANK = {
@@ -141,15 +143,144 @@ export function extractReadableText(html) {
         .filter(Boolean)
         .join("\n");
 }
+/**
+ * SSRF guard. market.config.json URLs are operator-authored, but configs are
+ * shared/templated in consulting/team use and `market capture|refresh` is on
+ * the cron allowlist — an unguarded fetch is an unattended internal-network
+ * and cloud-metadata probe. We therefore (1) allow only http/https, (2) refuse
+ * any host that is or resolves to a private/loopback/link-local/metadata
+ * address, and (3) follow redirects manually, re-validating each hop.
+ *
+ * Residual gap (documented, not defended here): TOCTOU DNS rebinding between
+ * our lookup and fetch's own resolution. Out of scope for fetching public
+ * competitor pages; a hardened deployment should fetch through an egress proxy.
+ */
+const MAX_REDIRECTS = 5;
+const FETCH_TIMEOUT_MS = 15_000;
+const MAX_BODY_BYTES = 5_000_000;
+function ipv4IsPrivate(ip) {
+    const parts = ip.split(".").map((n) => Number(n));
+    if (parts.length !== 4 || parts.some((n) => !Number.isInteger(n) || n < 0 || n > 255))
+        return true;
+    const [a, b] = parts;
+    if (a === 0 || a === 127)
+        return true; // this-host, loopback
+    if (a === 10)
+        return true; // private
+    if (a === 172 && b >= 16 && b <= 31)
+        return true; // private
+    if (a === 192 && b === 168)
+        return true; // private
+    if (a === 169 && b === 254)
+        return true; // link-local incl. 169.254.169.254 metadata
+    if (a === 100 && b >= 64 && b <= 127)
+        return true; // CGNAT
+    if (a >= 224)
+        return true; // multicast / reserved
+    return false;
+}
+function ipIsPrivate(ip) {
+    const family = isIP(ip);
+    if (family === 4)
+        return ipv4IsPrivate(ip);
+    if (family === 6) {
+        const lower = ip.toLowerCase();
+        if (lower === "::1" || lower === "::")
+            return true; // loopback / unspecified
+        // IPv4-mapped (::ffff:…) — Node normalizes ::ffff:127.0.0.1 to ::ffff:7f00:1,
+        // so accept both the dotted and the hex-pair forms, unwrap, check the v4.
+        const mapped = lower.match(/^::ffff:(.+)$/);
+        if (mapped) {
+            const rest = mapped[1];
+            if (rest.includes("."))
+                return ipv4IsPrivate(rest);
+            const groups = rest.split(":");
+            if (groups.length === 2) {
+                const hi = parseInt(groups[0], 16);
+                const lo = parseInt(groups[1], 16);
+                if (Number.isNaN(hi) || Number.isNaN(lo))
+                    return true;
+                return ipv4IsPrivate(`${(hi >> 8) & 0xff}.${hi & 0xff}.${(lo >> 8) & 0xff}.${lo & 0xff}`);
+            }
+            return true; // unrecognized mapped form → refuse
+        }
+        if (lower.startsWith("fe8") || lower.startsWith("fe9") || lower.startsWith("fea") || lower.startsWith("feb"))
+            return true; // link-local fe80::/10
+        if (lower.startsWith("fc") || lower.startsWith("fd"))
+            return true; // unique-local fc00::/7
+        return false;
+    }
+    return true; // not a recognizable IP literal → refuse
+}
+export async function assertPublicUrl(rawUrl) {
+    let url;
+    try {
+        url = new URL(rawUrl);
+    }
+    catch {
+        throw new Error(`market capture: "${rawUrl}" is not a valid URL.`);
+    }
+    if (url.protocol !== "http:" && url.protocol !== "https:") {
+        throw new Error(`market capture refuses ${url.protocol} URLs (only http/https): ${rawUrl}`);
+    }
+    const host = url.hostname.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
+    if (isIP(host)) {
+        if (ipIsPrivate(host))
+            throw new Error(`market capture refuses private/loopback address ${host} (SSRF guard).`);
+        return url;
+    }
+    // Hostname: resolve and refuse if ANY address is private.
+    const addrs = await lookup(host, { all: true });
+    for (const { address } of addrs) {
+        if (ipIsPrivate(address)) {
+            throw new Error(`market capture refuses ${host} — it resolves to private/internal address ${address} (SSRF guard).`);
+        }
+    }
+    return url;
+}
 const defaultFetchPage = async (url) => {
-    const response = await fetch(url, {
-        headers: {
-            "User-Agent": "fullstackgtm-market/0 (+https://github.com/fullstackgtm/core)",
-            "Accept-Language": "en-US",
-        },
-        redirect: "follow",
-    });
-    return { status: response.status, body: await response.text() };
+    let current = url;
+    for (let hop = 0; hop <= MAX_REDIRECTS; hop++) {
+        await assertPublicUrl(current);
+        const controller = new AbortController();
+        const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
+        let response;
+        try {
+            response = await fetch(current, {
+                headers: {
+                    "User-Agent": "fullstackgtm-market/0 (+https://github.com/fullstackgtm/core)",
+                    "Accept-Language": "en-US",
+                },
+                redirect: "manual",
+                signal: controller.signal,
+            });
+        }
+        finally {
+            clearTimeout(timer);
+        }
+        if (response.status >= 300 && response.status < 400 && response.headers.get("location")) {
+            current = new URL(response.headers.get("location"), current).toString();
+            continue; // re-validate the redirect target on the next iteration
+        }
+        const reader = response.body?.getReader();
+        if (!reader)
+            return { status: response.status, body: await response.text() };
+        const chunks = [];
+        let total = 0;
+        for (;;) {
+            const { done, value } = await reader.read();
+            if (done)
+                break;
+            total += value.length;
+            if (total > MAX_BODY_BYTES) {
+                await reader.cancel();
+                break;
+            }
+            chunks.push(value);
+        }
+        return { status: response.status, body: Buffer.concat(chunks).toString("utf8") };
+    }
+    throw new Error(`market capture: too many redirects (>${MAX_REDIRECTS}) for ${url}`);
 };
 export async function captureMarket(config, options = {}) {
     const dir = options.dir ?? join(marketHome(config.category), "captures");
@@ -284,6 +415,11 @@ export function validateObservationSet(config, set) {
         if (!INTENSITY_RANK[obs.intensity] && obs.intensity !== "unobservable") {
             problems.push(`${cell}: invalid intensity "${obs.intensity}"`);
         }
+        // confidence is rendered into the HTML report; only the enum is allowed, so
+        // an `observe --from` file can't smuggle markup through a free-text value.
+        if (obs.confidence !== "high" && obs.confidence !== "medium" && obs.confidence !== "low") {
+            problems.push(`${cell}: invalid confidence "${String(obs.confidence)}" (expected high, medium, or low)`);
+        }
         if ((obs.intensity === "loud" || obs.intensity === "quiet") && obs.evidence.length === 0) {
             problems.push(`${cell}: ${obs.intensity} reading with no quoted evidence`);
         }
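
The IPv4-mapped branch of `ipIsPrivate` above turns the hex-pair form (`::ffff:7f00:1`) back into a dotted quad before applying the IPv4 checks. A minimal sketch of that bit arithmetic on its own follows; `mappedHexToDotted` is a hypothetical helper name used only for illustration and is not part of the package.

```javascript
// Hypothetical helper (not in the package): converts the two trailing hex
// groups of an IPv4-mapped IPv6 address into dotted-quad notation.
// Each 16-bit group carries two octets: the high byte and the low byte.
function mappedHexToDotted(hiGroup, loGroup) {
    const hi = parseInt(hiGroup, 16);
    const lo = parseInt(loGroup, 16);
    if (Number.isNaN(hi) || Number.isNaN(lo)) {
        return null; // unparseable: a caller should treat this as "refuse"
    }
    return `${(hi >> 8) & 0xff}.${hi & 0xff}.${(lo >> 8) & 0xff}.${lo & 0xff}`;
}

console.log(mappedHexToDotted("7f00", "1"));    // 127.0.0.1
console.log(mappedHexToDotted("a9fe", "a9fe")); // 169.254.169.254
console.log(mappedHexToDotted("c0a8", "101"));  // 192.168.1.1
```

Returning `null` on a parse failure mirrors the fail-closed stance in the diff, where anything that cannot be recognized is refused rather than allowed through.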

package/dist/marketReport.d.ts CHANGED

@@ -1,3 +1,12 @@
 import type { MarketConfig, ObservationSet } from "./market.ts";
+/**
+ * Serialize JSON for embedding inside an inline <script> block. JSON.stringify
+ * does not escape `<`, `>`, `&`, or the U+2028/U+2029 line separators, so a
+ * vendor name containing `</script>` (these are untrusted, competitor-authored
+ * strings) would close the tag and inject markup. Replacing them with their
+ * \uXXXX escapes keeps the parsed value identical while making the breakout
+ * sequence unrepresentable in the HTML source.
+ */
+export declare function safeJsonForScript(value: unknown): string;
 export declare function marketMapToMarkdown(config: MarketConfig, set: ObservationSet): string;
 export declare function marketMapToHtml(config: MarketConfig, set: ObservationSet): string;
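
The doc comment on `safeJsonForScript` describes the technique but this declaration file does not show the body. The following is a sketch of one way to implement that description, under the assumption that a single regex pass over the `JSON.stringify` output is sufficient; `safeJsonForScriptSketch` is a hypothetical name and may differ from the package's actual code.

```javascript
// Sketch only: an illustration of the escaping described in the doc comment,
// not the package's implementation.
function safeJsonForScriptSketch(value) {
    const escapes = {
        "<": "\\u003c",
        ">": "\\u003e",
        "&": "\\u0026",
        "\u2028": "\\u2028",
        "\u2029": "\\u2029",
    };
    // JSON.stringify returns undefined for values such as undefined or a
    // function, so fall back to "null" to keep the output valid JSON.
    const json = JSON.stringify(value) ?? "null";
    return json.replace(/[<>&\u2028\u2029]/g, (ch) => escapes[ch]);
}

const payload = { vendor: "Acme</script><script>alert(1)</script>" };
const embedded = safeJsonForScriptSketch(payload);

console.log(embedded.includes("</script>"));                 // false
console.log(JSON.parse(embedded).vendor === payload.vendor); // true
```

Because `\u003c` inside a JSON string parses back to `<`, the consumer sees the original value, while the HTML parser never encounters a literal `</script>` in the page source.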