seo-intel 1.5.29 → 1.5.31

package/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
  # Changelog

+ ## 1.5.31 (2026-05-17)
+
+ ### MCP — `export_intel` ships the full data layer to AI agents
+ The biggest gap closed: agents can now grab seo-intel's entire structured intelligence in a single call. Mirrors `seo-intel export --full <project>` as an MCP tool, with a sharp safety valve and an explicit "do not blind-ingest" notice.
+
+ - **`export_intel(project, tables?, max_rows_per_table?)`** — bulk JSON export. Free tables: `pages, keywords, headings, links, technical, sitemap_urls`. Paid (Solo) tables: `extractions, analyses, page_schemas, citability_scores, insights`. Per-table row cap (default 1000, max 50000) so big projects can't OOM Node on `JSON.stringify`.
+ - **The notice field is the design point.** Every response includes a top-level `notice` with `level: important|critical`, token estimate, byte size, and a clear instruction set: *"🛑 DO NOT INGEST THIS RESPONSE WHOLESALE. (1) write to file and query with jq/sqlite-utils, (2) use get_intel(for=audit|blog|competitor) for digests, (3) for pre-parsed analysis upgrade to Solo."* Free users see the list of paid tables they're missing + the Solo tool names that return digests instead of raw rows.
+ - Truncation is first-class: `counts: { pages: { total: 3422, returned: 1000, truncated: true } }`. Notice flips to `critical` whenever any table truncates, with the explicit "re-call with `max_rows_per_table: <N>` or `tables: ['specific_one']`" guidance.
+ - Verified: carbium full free export = 1.2 MB / 314k tokens with 6 tables truncated — still fits the safety valve, won't crash Node. Free-tier `analyses` request → clean paid gate. Small slices (e.g. `tables: ['technical']` on risunouto) → tiny notice, no truncation.
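Reviewer sketch (not part of the package): the bullets above describe an envelope with a per-table `counts` map and a `notice.level`. A client could use those two fields to decide what to do next before reading any rows. The helper name `planFollowUp` and the `technical` sample numbers are assumptions made for illustration; the `pages` numbers come from the changelog entry.

```javascript
// Hypothetical client-side helper; not a seo-intel API.
// Assumes the envelope shape described in the changelog:
//   { notice: { level }, counts: { <table>: { total, returned, truncated } } }
function planFollowUp(envelope) {
  const truncated = Object.entries(envelope.counts)
    .filter(([, c]) => c.truncated)
    .map(([table, c]) => ({ table, missing: c.total - c.returned }));
  return {
    // "critical" means heavy or truncated output, so avoid reading it wholesale.
    safeToRead: envelope.notice.level !== 'critical',
    // One narrower re-call per truncated table, as the notice suggests.
    recalls: truncated.map(({ table, missing }) => ({ tables: [table], missing })),
  };
}

const sample = {
  notice: { level: 'critical' },
  counts: {
    pages: { total: 3422, returned: 1000, truncated: true },
    technical: { total: 40, returned: 40, truncated: false }, // made-up sample row
  },
};
console.log(planFollowUp(sample));
```

Only `pages` ends up in `recalls` here, since it is the only table flagged `truncated: true`.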
+
+ **The strategy this lands:** free tier offers the firehose with explicit guardrails ("hiccup with tokens or pay €20"). Paid tools (`run_citability_audit`, `get_competitor_positioning`, `prescore_draft`, `draft_blog_prompt`, `get_intel(for=audit|blog|competitor)`) return *digested, AI-ready* output — the value-add for Solo subscribers vs raw-data parsing on the client side.
+
+ **MCP surface: 13 tools total** — 9 free (including export_intel for free-table subset) + 5 paid (including export_intel for paid tables).
+
+ ## 1.5.30 (2026-05-17)
+
+ ### MCP — paid analysis tools (the full Solo surface for AI agents)
+ Solo subscribers can now reach the full analysis layer from any MCP host, not just the dashboard. Four new tools, all paid, all wrap existing `analyses/*` modules — same library-first pattern.
+
+ - **`run_citability_audit(project, include_competitors?)`** — Run AEO scoring across all crawled pages (6 signals: entity authority, structured claims, answer density, Q&A proximity, freshness, schema coverage). Persists scores to `citability_scores` and upserts `citability_gap` insights into the ledger. Pure function — fast, no LLM calls. Returns target/competitor page counts, average score, top 20 low-score pages.
+ - **`get_competitor_positioning(project)`** — Latest positioning analysis (from analyze runs or agent ingests) + per-competitor crawl stats (page counts, keyword counts, last crawl). The strategic narrative + the raw coverage in one envelope.
+ - **`prescore_draft(draft_md)`** — Pre-publish AEO scorer for agent-written content. Same scorer the dashboard uses; takes markdown (frontmatter-aware) and returns 0–100 score, tier, signal breakdown, AI intents. Includes revision hints for sub-60 drafts. Pair with `draft_blog_prompt` for a write→score→revise loop.
+ - **`draft_blog_prompt(project, topic?, lang?, content_type?)`** — Assemble an AEO-aware prompt seeded with the project's keyword gaps, citability gaps, entities, brand voice, and competitor heading patterns. The agent's own flagship LLM (Opus 4.7 / GPT-4o / Gemini) writes the draft. Supports `en` and `fi`. Topic optional — if omitted, prompt asks the LLM to pick the highest-leverage topic from gap data.
+
+ **MCP surface now:** 12 tools total — 8 free (read raw data, trigger crawls, persist findings) + 4 paid (`get_intel` audit/blog/competitor slices, `run_citability_audit`, `get_competitor_positioning`, `prescore_draft`, `draft_blog_prompt`). Paid tools share a unified gate message that surfaces the Ahrefs/Semrush price comparison.
+
+ A Solo agent session now looks like: `run_citability_audit` → `get_competitor_positioning` → `draft_blog_prompt(topic)` → agent's LLM writes the draft → `prescore_draft(output)` → revise if < 60 → `ingest_insight` to persist the gap that motivated the draft. Closed loop, all via MCP, no dashboard required.
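Reviewer sketch (not part of the package): the write, score, revise portion of the session above could be driven by a small loop like the one below. `callTool` and `writeDraft` are hypothetical stand-ins for the MCP host's tool-call function and the agent's own LLM, and the sketch assumes `callTool` resolves to the tool's structured result (`prompt` for `draft_blog_prompt`; `score`, `tier`, `hint` for `prescore_draft`). The cap of three rounds is an arbitrary choice.

```javascript
// Hypothetical driver; `callTool` and `writeDraft` are injected by the caller
// and are NOT seo-intel APIs.
async function draftUntilGood(callTool, writeDraft, project, topic, maxRounds = 3) {
  const { prompt } = await callTool('draft_blog_prompt', { project, topic });
  let draft = await writeDraft(prompt);
  let result = await callTool('prescore_draft', { draft_md: draft });
  // The changelog's threshold: revise while the score is below 60.
  for (let round = 1; round < maxRounds && result.score < 60; round++) {
    // Feed the scorer's hint back to the writer and try again.
    draft = await writeDraft(
      `${prompt}\n\nRevise this draft. Reviewer hint: ${result.hint}\n\n${draft}`
    );
    result = await callTool('prescore_draft', { draft_md: draft });
  }
  return { draft, score: result.score, tier: result.tier };
}
```

Passing the two functions in keeps the loop testable with stubs and independent of any particular MCP host.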
+
+ ### Deferred
+ - `run_gap_intel` (Ollama-based, long-running) — deferred to v1.5.31 where it'll use the detached-spawn pattern from `run_crawl`.
+
  ## 1.5.29 (2026-05-17)

  ### MCP — `ingest_insight` closes the loop (agents become collaborators, not consumers)
package/mcp/server.js CHANGED
@@ -25,11 +25,29 @@ import { spawn } from 'child_process';
  import { dirname, join } from 'path';
  import { fileURLToPath } from 'url';

- import { getDb, insertAgentInsight, AGENT_INSIGHT_TYPES } from '../db/db.js';
+ import { getDb, insertAgentInsight, AGENT_INSIGHT_TYPES, getActiveInsights, getCompetitorSummary } from '../db/db.js';
  import { getIntel, INTEL_SLICES, FREE_SLICES } from '../lib/intel.js';
  import { isPro } from '../lib/license.js';
  import { readProgress } from '../lib/progress.js';

+ import { runAeoAnalysis, persistAeoScores, upsertCitabilityInsights } from '../analyses/aeo/index.js';
+ import { prescore } from '../analyses/blog-draft/prescorer.js';
+ import { gatherBlogDraftContext, buildBlogDraftPrompt } from '../analyses/blog-draft/index.js';
+
+ // ── Helpers ────────────────────────────────────────────────────────────────
+ function paidGate(toolName) {
+   return {
+     content: [{ type: 'text', text: `The "${toolName}" tool requires SEO Intel Solo (€19.99/mo — vs Ahrefs ~$129/mo or Semrush ~$140/mo). Free tier already covers list_projects, get_intel(raw), get_pages, list_keywords, get_headings, run_crawl, get_crawl_status, ingest_insight. Activate at https://ukkometa.fi/en/seo-intel/ — set SEO_INTEL_LICENSE=SI-xxxx-xxxx-xxxx-xxxx in your env.` }],
+     isError: true,
+   };
+ }
+
+ function loadProjectConfig(project) {
+   const p = join(CONFIG_DIR, `${project}.json`);
+   if (!existsSync(p)) return null;
+   try { return JSON.parse(readFileSync(p, 'utf8')); } catch { return null; }
+ }
+
  const __dirname = dirname(fileURLToPath(import.meta.url));
  const ROOT = join(__dirname, '..');
  const VERSION = JSON.parse(readFileSync(join(ROOT, 'package.json'), 'utf8')).version;
@@ -364,11 +382,304 @@ server.registerTool(
    }
  );

+ // ── Tool: run_citability_audit (PAID) ─────────────────────────────────────
+ server.registerTool(
+   'run_citability_audit',
+   {
+     description: 'Run AEO citability scoring across all crawled pages (6 signals: entity authority, structured claims, answer density, Q&A proximity, freshness, schema coverage). Persists scores to citability_scores and upserts citability_gap insights into the ledger. Pure function — fast, no LLM calls. Paid tier.',
+     inputSchema: {
+       project: z.string(),
+       include_competitors: z.boolean().optional().describe('Score competitor pages too (default true)'),
+     },
+   },
+   async ({ project, include_competitors = true }) => {
+     if (!isPro()) return paidGate('run_citability_audit');
+     if (!loadProjectConfig(project)) {
+       return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+     }
+     try {
+       const db = getDb();
+       const results = runAeoAnalysis(db, project, { includeCompetitors: include_competitors, log: () => {} });
+       persistAeoScores(db, results);
+       upsertCitabilityInsights(db, project, results.target);
+       const competitorPageCount = [...results.competitors.values()].reduce((a, list) => a + list.length, 0);
+       const avgTargetScore = results.target.length
+         ? Math.round(results.target.reduce((s, p) => s + p.score, 0) / results.target.length)
+         : 0;
+       const lowScorePages = results.target
+         .filter(p => p.score < 40)
+         .sort((a, b) => a.score - b.score)
+         .slice(0, 20)
+         .map(p => ({ url: p.url, score: p.score, tier: p.tier }));
+       const summary = {
+         ok: true,
+         project,
+         target_pages_scored: results.target.length,
+         competitor_pages_scored: competitorPageCount,
+         avg_target_score: avgTargetScore,
+         low_score_target_pages: lowScorePages,
+         hint: 'Scores persisted to DB. Call get_intel(project, for=audit) to see the full citability matrix + insights ledger.',
+       };
+       return {
+         content: [{ type: 'text', text: JSON.stringify(summary, null, 2) }],
+         structuredContent: summary,
+       };
+     } catch (err) {
+       return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+     }
+   }
+ );
+
+ // ── Tool: get_competitor_positioning (PAID) ───────────────────────────────
+ server.registerTool(
+   'get_competitor_positioning',
+   {
+     description: 'Return the latest positioning analysis for a project + per-competitor crawl stats. Combines the positioning insight from the ledger (from `analyze` or agent ingests) with raw competitor coverage (page counts, keyword counts, last crawl). Paid tier.',
+     inputSchema: {
+       project: z.string(),
+     },
+   },
+   async ({ project }) => {
+     if (!isPro()) return paidGate('get_competitor_positioning');
+     if (!loadProjectConfig(project)) {
+       return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+     }
+     try {
+       const db = getDb();
+       const insights = getActiveInsights(db, project);
+       const competitorSummary = getCompetitorSummary(db, project);
+       const out = {
+         project,
+         positioning: insights.positioning, // null if never analysed
+         competitor_summary: competitorSummary,
+         last_insight_at: insights.generated_at ? new Date(insights.generated_at).toISOString() : null,
+         hint: insights.positioning ? 'Positioning is from the most recent analyze run or agent ingest.' : 'No positioning insight yet — run `seo-intel analyze <project>` or ingest one via ingest_insight(type=positioning).',
+       };
+       return {
+         content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
+         structuredContent: out,
+       };
+     } catch (err) {
+       return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+     }
+   }
+ );
+
+ // ── Tool: prescore_draft (PAID) ───────────────────────────────────────────
+ server.registerTool(
+   'prescore_draft',
+   {
+     description: 'Run the AEO scorer on a markdown draft before publishing. Returns the same 6-signal breakdown the dashboard uses (entity authority, structured claims, answer density, Q&A proximity, freshness, schema coverage) plus the overall 0-100 score and tier (excellent / good / fair / poor). Use this as a pre-publish gate when drafting via draft_blog_prompt — score < 60 means revise. Paid tier.',
+     inputSchema: {
+       draft_md: z.string().describe('Full markdown of the draft, including YAML frontmatter if present. The scorer extracts headings, word count, schema_type from frontmatter, etc.'),
+     },
+   },
+   async ({ draft_md }) => {
+     if (!isPro()) return paidGate('prescore_draft');
+     try {
+       const score = prescore(draft_md);
+       const out = {
+         ok: true,
+         score: score.score,
+         tier: score.tier,
+         signals: score.signals,
+         ai_intents: score.ai_intents,
+         hint: score.score >= 60
+           ? 'Draft scores well. Safe to publish.'
+           : 'Below 60 — consider strengthening: add FAQ schema for Q&A proximity, increase entity authority via named experts/citations, shorten paragraphs for answer density, add structured claims (numbers/dates).',
+       };
+       return {
+         content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
+         structuredContent: out,
+       };
+     } catch (err) {
+       return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+     }
+   }
+ );
+
+ // ── Tool: draft_blog_prompt (PAID) ────────────────────────────────────────
+ server.registerTool(
+   'draft_blog_prompt',
+   {
+     description: 'Generate an AEO-aware blog draft prompt seeded with full project context — keyword gaps, citability gaps, top entities, brand voice notes, competitor heading patterns. The agent\'s own LLM writes the draft using this prompt. Pair with prescore_draft for a write→score→revise loop. Paid tier.',
+     inputSchema: {
+       project: z.string(),
+       topic: z.string().optional().describe('Specific topic to draft about. If omitted, the prompt asks the LLM to pick the highest-leverage topic from the gap data.'),
+       lang: z.enum(['en', 'fi']).optional().describe('Output language (default en)'),
+       content_type: z.enum(['blog', 'article', 'guide']).optional().describe('Content type framing (default blog)'),
+     },
+   },
+   async ({ project, topic, lang = 'en', content_type = 'blog' }) => {
+     if (!isPro()) return paidGate('draft_blog_prompt');
+     const config = loadProjectConfig(project);
+     if (!config) {
+       return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+     }
+     try {
+       const db = getDb();
+       const context = gatherBlogDraftContext(db, project, topic);
+       const prompt = buildBlogDraftPrompt(context, { config, lang, topic, contentType: content_type });
+       const out = {
+         project,
+         topic: topic || '(LLM to pick from gap data)',
+         lang,
+         content_type,
+         prompt_length_chars: prompt.length,
+         prompt,
+         hint: 'Pass `prompt` to your flagship LLM (Opus 4.7 / GPT-4o / etc) to generate the draft. Then run prescore_draft on the output to AEO-score before publishing.',
+       };
+       return {
+         content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
+         structuredContent: out,
+       };
+     } catch (err) {
+       return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+     }
+   }
+ );
+
+ // ── Tool: export_intel (firehose; free tables + paid tables) ──────────────
+ const FREE_EXPORT_TABLES = ['pages', 'keywords', 'headings', 'links', 'technical', 'sitemap_urls'];
+ const PAID_EXPORT_TABLES = ['extractions', 'analyses', 'page_schemas', 'citability_scores', 'insights'];
+ const ALL_EXPORT_TABLES = [...FREE_EXPORT_TABLES, ...PAID_EXPORT_TABLES];
+
+ const EXPORT_TABLE_QUERIES = {
+   pages: `SELECT p.url, d.domain, d.role, p.status_code, p.word_count, p.load_ms, p.is_indexable, p.click_depth, p.published_date, p.modified_date, p.title, p.meta_desc, p.final_url, p.x_robots_tag
+     FROM pages p JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY d.role, d.domain, p.click_depth`,
+   keywords: `SELECT k.keyword, k.location, p.url, d.domain, d.role FROM keywords k JOIN pages p ON p.id = k.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY k.keyword`,
+   headings: `SELECT h.level, h.text, p.url, d.domain FROM headings h JOIN pages p ON p.id = h.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY p.url, h.level`,
+   links: `SELECT l.target_url, l.anchor_text, l.is_internal, p.url as source_url, d.domain FROM links l JOIN pages p ON p.id = l.source_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY l.is_internal DESC, d.domain`,
+   technical: `SELECT t.has_canonical, t.has_og_tags, t.has_schema, t.is_mobile_ok, t.has_sitemap, t.has_robots, t.core_web_vitals, p.url, d.domain FROM technical t JOIN pages p ON p.id = t.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ?`,
+   sitemap_urls: `SELECT s.url, s.sitemap_source, s.head_status, s.head_location, s.discovered_at, d.domain FROM sitemap_urls s JOIN domains d ON d.id = s.domain_id WHERE d.project = ?`,
+   extractions: `SELECT e.title, e.meta_desc, e.h1, e.product_type, e.pricing_tier, e.cta_primary, e.tech_stack, e.schema_types, e.search_intent, e.primary_entities, e.intent_scores, p.url, d.domain FROM extractions e JOIN pages p ON p.id = e.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ?`,
+   analyses: `SELECT generated_at, model, keyword_gaps, long_tails, quick_wins, new_pages, content_gaps, positioning, technical_gaps FROM analyses WHERE project = ? ORDER BY generated_at DESC`,
+   page_schemas: `SELECT ps.schema_type, ps.name, ps.description, ps.rating, ps.rating_count, ps.price, ps.currency, ps.author, ps.date_published, p.url, d.domain FROM page_schemas ps JOIN pages p ON p.id = ps.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY ps.schema_type`,
+   citability_scores: `SELECT cs.url, cs.score, cs.tier, cs.entity_authority, cs.structured_claims, cs.answer_density, cs.qa_proximity, cs.freshness, cs.schema_coverage, cs.ai_intents, cs.scored_at, p.title, d.domain, d.role FROM citability_scores cs JOIN pages p ON p.id = cs.page_id JOIN domains d ON d.id = p.domain_id WHERE d.project = ? ORDER BY cs.score`,
+   insights: `SELECT id, type, status, fingerprint, first_seen, last_seen, source, data FROM insights WHERE project = ? ORDER BY last_seen DESC`,
+ };
+
+ const DEFAULT_MAX_ROWS_PER_TABLE = 1000;
+ const MAX_MAX_ROWS_PER_TABLE = 50000;
+
+ function buildExportNotice({ tokens, bytes, free, paidRequested, paidExcluded, anyTruncated, maxRowsPerTable }) {
+   const tooBig = tokens > 50000;
+   const upgradeBlurb = free
+     ? `\n\n📦 Tables NOT in this response (require SEO Intel Solo, €19.99/mo — vs Ahrefs ~$129/mo): ${PAID_EXPORT_TABLES.join(', ')}.\n These are the AI-derived layers: per-page entity/intent/schema extraction, full analysis history, structured @type inventory, citability scores, and the Intelligence Ledger.\n For pre-parsed digests instead of raw rows, the Solo tools return ready-to-use analysis: run_citability_audit, get_competitor_positioning, prescore_draft, draft_blog_prompt.`
+     : `\n\nYou have Solo. Paid tables in this export: ${(paidRequested || []).join(', ') || '(none requested)'}.`;
+
+   const sizeLine = tooBig
+     ? `\n\n⚠️ HEAVY EXPORT: ${tokens.toLocaleString()} estimated tokens (~${(bytes / 1024 / 1024).toFixed(1)} MB). This WILL blow up a typical agent's context budget.`
+     : `\n\nSize: ${tokens.toLocaleString()} estimated tokens (~${(bytes / 1024).toFixed(0)} KB).`;
+
+   const truncLine = anyTruncated
+     ? `\n\n✂️ TRUNCATED: some tables hit the per-table row cap (currently ${maxRowsPerTable.toLocaleString()}). Check the per-table \`counts\` map — \`truncated: true\` means there are more rows. To pull more, re-call with \`max_rows_per_table: <N up to ${MAX_MAX_ROWS_PER_TABLE.toLocaleString()}>\` or \`tables: ["specific_one"]\`.`
+     : '';
+
+   return {
+     level: tooBig || anyTruncated ? 'critical' : 'important',
+     message: [
+       '🛑 DO NOT INGEST THIS RESPONSE WHOLESALE INTO YOUR CONTEXT.',
+       '',
+       'This is a raw structured-data firehose — designed for tooling, not direct LLM consumption. Recommended ways to handle it:',
+       ' 1. Write it to a file via your shell tool (e.g. `... > intel.json`), then query selectively with jq / sqlite-utils / a small Python script.',
+       ' 2. For pre-digested intelligence, call get_intel(for=audit|blog|competitor) — same data, summarized.',
+       ' 3. For specific record lookups, use the targeted tools: get_pages, get_headings, list_keywords, get_competitor_positioning.',
+     ].join('\n') + sizeLine + truncLine + upgradeBlurb,
+     token_estimate: tokens,
+     size_bytes: bytes,
+     max_rows_per_table: maxRowsPerTable,
+     truncated: anyTruncated,
+     tables_paid_excluded: paidExcluded,
+   };
+ }
+
+ server.registerTool(
+   'export_intel',
+   {
+     description: [
+       'Bulk export of raw structured intelligence — pages, keywords, headings, links, technical, sitemap URLs (free), plus extractions, analyses, schemas, citability scores, and insights (Solo). Mirrors `seo-intel export --full <project>` as a single MCP call.',
+       '',
+       '⚠️ FIREHOSE WARNING: this is raw rows, not summaries. For carbium-sized projects it can be 5–10 MB / 200k+ tokens. The response includes a `notice` field telling the agent how to handle it (pipe to file, use other tools, or upgrade). Agents SHOULD NOT paste the response wholesale into their context — read the `notice` first, then either query selectively or save to a file.',
+       '',
+       'For pre-parsed AI-ready intel, prefer: get_intel(for=audit|blog|competitor), run_citability_audit, get_competitor_positioning, draft_blog_prompt.',
+     ].join('\n'),
+     inputSchema: {
+       project: z.string(),
+       tables: z.array(z.enum(ALL_EXPORT_TABLES)).optional().describe(`Tables to include. Free: ${FREE_EXPORT_TABLES.join(', ')}. Paid (Solo only): ${PAID_EXPORT_TABLES.join(', ')}. Omit to get the free subset.`),
+       max_rows_per_table: z.number().int().positive().max(MAX_MAX_ROWS_PER_TABLE).optional().describe(`Cap on rows returned per table — safety valve against OOM on large projects (default ${DEFAULT_MAX_ROWS_PER_TABLE}, max ${MAX_MAX_ROWS_PER_TABLE}). When truncated, the per-table counts map shows total + returned so you know what's missing.`),
+     },
+   },
+   async ({ project, tables, max_rows_per_table }) => {
+     if (!loadProjectConfig(project)) {
+       return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+     }
+     const requested = tables && tables.length ? tables : FREE_EXPORT_TABLES;
+     const paidRequested = requested.filter(t => PAID_EXPORT_TABLES.includes(t));
+     if (paidRequested.length && !isPro()) {
+       return paidGate(`export_intel (paid tables: ${paidRequested.join(', ')})`);
+     }
+     const maxRows = max_rows_per_table || DEFAULT_MAX_ROWS_PER_TABLE;
+     try {
+       const db = getDb();
+       const domains = db.prepare('SELECT domain, role, last_crawled FROM domains WHERE project=? ORDER BY role, domain').all(project);
+       const data = {};
+       const counts = {};
+       let anyTruncated = false;
+       for (const table of requested) {
+         try {
+           // Two-step: count first, then fetch with LIMIT. Cheaper than .all() then .slice() for huge tables.
+           const countRow = db.prepare(`SELECT COUNT(*) AS n FROM (${EXPORT_TABLE_QUERIES[table]}) AS sub`).get(project);
+           const total = countRow?.n || 0;
+           const rows = db.prepare(`${EXPORT_TABLE_QUERIES[table]} LIMIT ?`).all(project, maxRows);
+           const truncated = total > rows.length;
+           if (truncated) anyTruncated = true;
+           data[table] = rows;
+           counts[table] = { total, returned: rows.length, truncated };
+         } catch (e) {
+           data[table] = { error: e.message };
+           counts[table] = { total: 0, returned: 0, truncated: false, error: e.message };
+         }
+       }
+       const free = !isPro();
+       const dataJson = JSON.stringify(data);
+       const tokenEstimate = Math.ceil(dataJson.length / 4);
+       const sizeBytes = Buffer.byteLength(dataJson, 'utf8');
+       const notice = buildExportNotice({
+         tokens: tokenEstimate,
+         bytes: sizeBytes,
+         free,
+         paidRequested,
+         paidExcluded: free ? PAID_EXPORT_TABLES : undefined,
+         anyTruncated,
+         maxRowsPerTable: maxRows,
+       });
+       const envelope = {
+         project,
+         exported_at: new Date().toISOString(),
+         seo_intel_version: VERSION,
+         tier: free ? 'free' : 'paid',
+         tables_included: requested,
+         counts,
+         domains,
+         notice,
+         data,
+       };
+       return {
+         content: [{ type: 'text', text: JSON.stringify(envelope, null, 2) }],
+         structuredContent: envelope,
+       };
+     } catch (err) {
+       return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+     }
+   }
+ );
677
+
367
678
  async function main() {
368
679
  const transport = new StdioServerTransport();
369
680
  await server.connect(transport);
370
681
  // stderr is fine; the host typically surfaces this in its MCP logs panel.
371
- console.error(`[seo-intel-mcp] v${VERSION} ready on stdio. Tools: list_projects, get_intel, get_pages, list_keywords, get_headings, run_crawl, get_crawl_status, ingest_insight.`);
682
+ console.error(`[seo-intel-mcp] v${VERSION} ready on stdio. 13 tools — free: list_projects, get_intel(raw), get_pages, list_keywords, get_headings, run_crawl, get_crawl_status, ingest_insight, export_intel (free-tier subset); paid: get_intel(audit/blog/competitor), run_citability_audit, get_competitor_positioning, prescore_draft, draft_blog_prompt, export_intel (paid tables).`);
372
683
  }
373
684
 
374
685
  main().catch(err => {
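Reviewer sketch (not part of the package): two rules in the `export_intel` handler above are easy to miss in the full diff, so they are restated here as standalone functions. The first is the table gate: an omitted `tables` argument falls back to the free subset, and a single paid table without a license refuses the whole call. The second is the sizing rule: tokens are estimated at roughly four characters each, and the notice becomes `critical` above 50,000 tokens or when any table was truncated. `resolveTables`, `sizeNotice`, and the `hasLicense` parameter (standing in for `isPro()`) are hypothetical names.

```javascript
// Illustrative restatement only; the names below are not seo-intel APIs.
const FREE_TABLES = ['pages', 'keywords', 'headings', 'links', 'technical', 'sitemap_urls'];
const PAID_TABLES = ['extractions', 'analyses', 'page_schemas', 'citability_scores', 'insights'];

// Default to the free subset; reject the call if any paid table is requested
// without a license. `hasLicense` is a stand-in for isPro().
function resolveTables(tables, hasLicense) {
  const requested = tables && tables.length ? tables : FREE_TABLES;
  const paidRequested = requested.filter((t) => PAID_TABLES.includes(t));
  if (paidRequested.length && !hasLicense) {
    return { allowed: false, paidRequested };
  }
  return { allowed: true, requested, paidRequested };
}

// About 4 characters per token; "critical" when heavy or truncated.
function sizeNotice(data, anyTruncated) {
  const json = JSON.stringify(data);
  const tokens = Math.ceil(json.length / 4);
  const bytes = Buffer.byteLength(json, 'utf8');
  const level = tokens > 50000 || anyTruncated ? 'critical' : 'important';
  return { tokens, bytes, level };
}
```

Note that `json.length` counts UTF-16 code units while `Buffer.byteLength` counts UTF-8 bytes, so for non-ASCII content the byte size is larger than the character count that feeds the token estimate.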
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "seo-intel",
-   "version": "1.5.29",
+   "version": "1.5.31",
    "description": "Local Ahrefs-style SEO competitor intelligence. Crawl → SQLite → cloud analysis.",
    "type": "module",
    "license": "SEE LICENSE IN LICENSE",