seo-intel 1.5.33 → 1.5.37
- package/CHANGELOG.md +56 -0
- package/cli.js +55 -0
- package/db/db.js +14 -0
- package/lib/notify.js +74 -0
- package/lib/problems.js +382 -0
- package/mcp/server.js +159 -4
- package/package.json +1 -1
- package/setup/web-routes.js +19 -17
- package/setup/wizard.html +14 -4
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,61 @@
 # Changelog
 
+## 1.5.37 (2026-05-23)
+
+### Notify — native macOS / Linux notifications for pending problems
+The "subtle nudge" delivery channel agreed in the v1.5.34 brainstorm. Users don't have to remember to open the dashboard; the OS reminds them when there's work to do.
+
+- **New CLI:** `seo-intel notify [project]` — scans configured projects (or one if specified), fires a native notification per project with critical/warn/info problem counts. Cron-friendly: no interactive output, never blocks, never throws. Pass `--open` to also open the dashboard URL after notifying.
+- **macOS:** uses built-in `osascript` (Notification Center). Glass sound fires when any project has critical issues; quiet otherwise. No third-party deps (no `terminal-notifier` etc).
+- **Linux:** uses `notify-send` (libnotify, ships with GNOME/KDE/XFCE). Falls through to console if not installed.
+- **Windows/unknown:** console-prints the notification so cron logs still capture it.
+- **New library:** `lib/notify.js` exports `notify({ title, message, subtitle?, sound? })` + `openUrl(url)`. Reusable from any future module (e.g. a Site Watch hook firing notifications on regressions).
+
+**Suggested cron entry** (macOS): `0 9 * * * cd /path/to/seo-intel && node cli.js notify` — fires at 9am every day for every project with pending issues.
+
+**Verified live:** 4 notifications fired correctly during testing (carbium 190 warn · 51 info; dgents 11 warn · 1 info; risunouto 26 warn · 11 info; ukkometa 55 warn · 20 info). All four landed in macOS Notification Center.
+
+## 1.5.36 (2026-05-23)
+
+### Setup — LM Studio detection works for LAN hosts (fixes "unreachable" false negative)
+The wizard's host-ping logic was gated on port number — only checked LM Studio if port was exactly 1234, only checked Ollama if anything else. That broke for any non-default setup. **Now probes both engines in parallel for every host** regardless of port.
+
+- **`/api/setup/ping-ollama`** runs `checkOllamaRemote` and `checkLmStudio` in parallel via `Promise.all`. Whichever responds wins. Order: Ollama preferred when both respond (preserves existing behaviour for ambiguous setups).
+- Success message now identifies the engine: *"Connected to LM Studio — 5 model(s) found"* vs *"Connected to Ollama — 3 model(s) found"*.
+- Unreachable error returns a structured `hint` with three common causes (bind to 127.0.0.1 only, firewall, wrong port) — much more useful than the old "check IP, port, and that Ollama is running" message.
+- Wizard surfaces the `hint` directly via HTML-escaped error text. No more misleading "Ollama is running on that machine" when the user is running LM Studio.
+
+The "EXTRACTION HOSTS" section copy already mentioned both engines correctly — only the per-ping result message and the backend gating needed fixing. Existing localhost auto-detection (the green `localhost:1234 active` row in the screenshot) was unaffected.
+
+## 1.5.35 (2026-05-22)
+
+### MCP — `mark_problem_status` closes the Problems loop
+Agents can now confirm fixes and dismiss problems they've handled. Without this tool, subjective problems (positioning, content gaps) would keep re-appearing in `list_problems` even after the agent had addressed them.
+
+- **`mark_problem_status(problem_id, project, status, snooze_days?, agent_name?, note?)`** — **free tier**. Status: `fixed` | `wont_fix` | `snoozed`. Snoozed requires `snooze_days` (1-365). Re-marking the same problem_id updates the existing record.
+- **`list_problems` gains `include_marked: boolean`** — by default marked problems are hidden; set true to audit what's been suppressed (each row gains a `status: 'active' | 'fixed' | 'wont_fix' | 'snoozed'` field).
+- **`problem_counts` in `list_projects` honor marks** — when an agent marks 12 of 26 orphans as fixed, the nag immediately drops to 14. The "warm fuzzy" of clearing things.
+
+Schema: idempotent `CREATE TABLE IF NOT EXISTS problem_status` migration in `getDb()`. Stores `problem_id` (matches `list_problems` output), project, status, marked_at, marked_by (e.g. `agent:claude-opus-4-7`), note, expires_at (for snoozes). Indexed by `(project, status)`.
+
+Verified end-to-end: mark a real orphan → count drops 26→25 → re-list with `include_marked` reveals it with `status: 'fixed'`. Smoke 10/10. MCP surface: 15 tools.
+
+## 1.5.34 (2026-05-22)
+
+### MCP — Problems as the entry surface ("what should I fix?")
+The single biggest UX shift in the agent flow. Two new touchpoints turn `list_projects` into a passive nag layer and `list_problems` into the canonical "fix-able findings" tool.
+
+- **`list_problems(project, severity?, category?, limit?, max_fix_difficulty?)`** — severity-sorted, agent-fixable problem list. Every item returns `{id, severity, category, tier, title, description, affected_urls, evidence, fix_template, verification, first_seen, last_seen, fix_difficulty}`. The `fix_template` is the design point — it gives a coding agent a concrete next step (file/URL, what to change, how to verify).
+- **Free categories**: `tech` (HTTP 4xx/5xx), `indexability` (robots header conflicts), `links` (orphan pages), `schema` (missing structured data on substantive pages).
+- **Paid categories**: `citability` (low AEO scores from `citability_scores`), `content` / `keyword` / `positioning` (mapped from Intelligence Ledger).
+- Sorting: severity (critical → warn → info), then fix_difficulty (1=trivial → 5=deep work), then last_seen DESC.
+- **`list_projects` now nags.** Every project response includes `problem_counts`, `stale_days`, and a `nag` string that flags critical/warn counts and stale crawls. Solo users see paid-tier counts; free users see free-tier counts only (no teasing). Example output: `risunouto: 26 warn · crawl 42d stale. Call list_problems('risunouto') to see them.`
+- **New library: `lib/problems.js`** — `getProblems(db, project, opts)` + `getProblemCounts(db, project, opts)` are the unifying primitive. Six collectors today (4 free + 2 paid); future patches add more (decay targets, friction points, mark_problem_status, schema-vs-competitor diffs).
+
+The agent loop this unlocks: `list_projects` → see the nag → `list_problems(project, severity='critical')` → fix the highest-leverage one → `run_crawl(project)` → re-call `list_problems` to verify it cleared. Closed loop, no dashboard required.
+
+**MCP surface: 14 tools.** Next patches: `mark_problem_status` (v1.5.35) + native notification daemon (v1.5.36) + dashboard Problems tab as landing (v1.5.37).
+
 ## 1.5.33 (2026-05-19)
 
 ### Dashboard — visual brief foundation (intel-blue tokens + component utilities)
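The 1.5.34 entry describes a three-level sort for `list_problems`: severity, then `fix_difficulty`, then `last_seen` descending. The sorting code itself is not part of this diff, so the following is only a minimal sketch of a comparator with that ordering. The name `compareProblems`, the sample rows, and the assumption that `last_seen` is a numeric timestamp are all hypothetical.

```javascript
// Severity order as stated in the changelog: critical, then warn, then info.
const SEVERITY_RANK = { critical: 0, warn: 1, info: 2 };

// Hypothetical comparator; the package's own sort is not shown in this diff.
// Assumes last_seen is a number (larger means more recent).
function compareProblems(a, b) {
  return (
    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] || // most severe first
    a.fix_difficulty - b.fix_difficulty ||                   // easiest fix first
    b.last_seen - a.last_seen                                // newest first
  );
}

// Made-up sample rows, for illustration only.
const sample = [
  { id: 'a', severity: 'info', fix_difficulty: 2, last_seen: 300 },
  { id: 'b', severity: 'critical', fix_difficulty: 4, last_seen: 100 },
  { id: 'c', severity: 'critical', fix_difficulty: 2, last_seen: 100 },
  { id: 'd', severity: 'warn', fix_difficulty: 2, last_seen: 100 },
  { id: 'e', severity: 'warn', fix_difficulty: 2, last_seen: 200 },
];

const sortedIds = [...sample].sort(compareProblems).map((p) => p.id);
console.log(sortedIds.join(' ')); // c b e d a
```

Chaining the three differences with `||` works because each difference is `0` exactly when the two rows tie on that key, so the next key only decides ties.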
package/cli.js
CHANGED
@@ -1456,6 +1456,61 @@ program
     }
   });
 
+// ── NOTIFY (native OS notification for pending problems) ───────────────────
+// Designed to run on cron/launchd. Subtle nudge to push the user back into
+// the dashboard. macOS via osascript, Linux via notify-send, no deps.
+program
+  .command('notify [project]')
+  .description('Fire native macOS/Linux notification for projects with pending problems. Cron-friendly. Pass a project to limit; omit to scan all configured projects.')
+  .option('--open', 'Also open the dashboard URL when a notification fires')
+  .option('--all', 'Notify even when no problems exist (useful for testing)')
+  .option('--port <port>', 'Dashboard port for --open (default 3000)', '3000')
+  .action(async (project, opts) => {
+    const { notify, openUrl } = await import('./lib/notify.js');
+    const { getProblemCounts } = await import('./lib/problems.js');
+    const db = getDb();
+
+    // Resolve project list: explicit arg → single project; otherwise all configs with data
+    const configDir = join(__dirname, 'config');
+    const { readFileSync, readdirSync, existsSync } = await import('node:fs');
+    let projects = [];
+    if (project) {
+      try { loadConfig(project); projects = [{ project }]; }
+      catch (e) { console.error(chalk.red(e.message)); process.exit(1); }
+    } else if (existsSync(configDir)) {
+      projects = readdirSync(configDir)
+        .filter(f => f.endsWith('.json') && f !== 'example.json' && !f.startsWith('setup'))
+        .map(f => {
+          try { const c = JSON.parse(readFileSync(join(configDir, f), 'utf8')); return { project: c.project || f.replace('.json', '') }; }
+          catch { return null; }
+        })
+        .filter(Boolean);
+    }
+
+    let fired = 0;
+    const pro = isPro();
+    for (const { project: p } of projects) {
+      let counts;
+      try { counts = getProblemCounts(db, p, { includePaid: pro }); }
+      catch { continue; }
+      const score = counts.critical * 10 + counts.warn; // weighted: 1 critical = 10 warns
+      if (score === 0 && !opts.all) continue;
+      const title = `SEO Intel — ${p}`;
+      const message = counts.critical > 0
+        ? `${counts.critical} CRITICAL · ${counts.warn} warn pending`
+        : `${counts.warn} warn · ${counts.info} info pending`;
+      notify({ title, message, subtitle: 'Click the dashboard to fix', sound: counts.critical > 0 ? 'Glass' : false });
+      fired++;
+      console.log(chalk.dim(` 📣 ${p}: ${message}`));
+    }
+    if (opts.open && fired > 0) {
+      const url = `http://localhost:${opts.port}/`;
+      openUrl(url);
+      console.log(chalk.dim(` → opened ${url}`));
+    }
+    if (fired === 0) console.log(chalk.dim(' ✓ No projects need attention.'));
+  });
+
 // ── STATUS ─────────────────────────────────────────────────────────────────
 program
   .command('status')
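The `notify` action above mixes I/O with a small decision rule: weight the counts, skip quiet projects, and pick a message and sound. As a rough illustration, that rule can be pulled into a pure function. `buildNotification` is a hypothetical name that does not exist in the package, and the `counts` objects in the usage lines are made up.

```javascript
// Hypothetical helper isolating the decision logic of the `notify` action.
// `counts` is assumed to have the shape { critical, warn, info }.
function buildNotification(counts, { all = false } = {}) {
  // Same weighting as the CLI: one critical counts as ten warns.
  const score = counts.critical * 10 + counts.warn;
  // Nothing critical or warn-level: stay quiet unless --all was passed.
  if (score === 0 && !all) return null;
  const message = counts.critical > 0
    ? `${counts.critical} CRITICAL · ${counts.warn} warn pending`
    : `${counts.warn} warn · ${counts.info} info pending`;
  // The CLI plays the 'Glass' sound only when something is critical.
  return { message, sound: counts.critical > 0 ? 'Glass' : false };
}

const quiet = buildNotification({ critical: 0, warn: 0, info: 5 });
const loud = buildNotification({ critical: 2, warn: 3, info: 1 });
console.log(quiet, loud);
```

One consequence of the weighting is visible in the first call: `info` does not contribute to the score, so a project with only info-level problems is skipped unless `--all` is set.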
package/db/db.js
CHANGED
@@ -31,6 +31,20 @@ export function getDb(dbPath = './seo-intel.db') {
   try { _db.exec('ALTER TABLE extractions ADD COLUMN intent_scores TEXT'); } catch { /* already exists */ }
   try { _db.exec("ALTER TABLE insights ADD COLUMN source TEXT DEFAULT 'cli'"); } catch { /* already exists */ }
 
+  // Problem status tracking (v1.5.35) — agents/users mark items as fixed/wont_fix/snoozed
+  _db.exec(`
+    CREATE TABLE IF NOT EXISTS problem_status (
+      problem_id TEXT PRIMARY KEY, -- matches lib/problems.js makeId() output
+      project TEXT NOT NULL,
+      status TEXT NOT NULL, -- fixed | wont_fix | snoozed
+      marked_at INTEGER NOT NULL,
+      marked_by TEXT, -- 'agent:<name>' | 'cli' | 'dashboard'
+      note TEXT,
+      expires_at INTEGER -- for snoozed; NULL for fixed/wont_fix
+    );
+    CREATE INDEX IF NOT EXISTS idx_problem_status_project ON problem_status(project, status);
+  `);
+
   // Backfill first_seen_at from crawled_at for existing rows
   _db.exec('UPDATE pages SET first_seen_at = crawled_at WHERE first_seen_at IS NULL');
 
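The `expires_at` column carries the snooze semantics: it is NULL for `fixed` and `wont_fix`, and a millisecond timestamp for `snoozed`. The sketch below restates that rule in plain JavaScript, following the arithmetic in `markProblemStatus` and the `WHERE` clause in `getActiveStatusMap` from `lib/problems.js`. The helper names `expiresAtFor` and `isMarkActive`, and the timestamp `t0`, are hypothetical.

```javascript
const DAY_MS = 86_400_000; // milliseconds per day

// Hypothetical helper: expiry timestamp for a mark, null unless snoozed.
function expiresAtFor(status, now, snoozeDays) {
  return status === 'snoozed' ? now + snoozeDays * DAY_MS : null;
}

// A mark keeps suppressing its problem while expires_at is null (permanent)
// or still in the future. Mirrors `expires_at IS NULL OR expires_at > ?`.
function isMarkActive(mark, now) {
  return mark.expires_at === null || mark.expires_at > now;
}

const t0 = 1_700_000_000_000; // arbitrary illustrative timestamp
const snoozed = { status: 'snoozed', expires_at: expiresAtFor('snoozed', t0, 7) };
const fixed = { status: 'fixed', expires_at: expiresAtFor('fixed', t0) };

console.log(isMarkActive(snoozed, t0 + 6 * DAY_MS)); // still hidden on day 6
console.log(isMarkActive(snoozed, t0 + 8 * DAY_MS)); // visible again on day 8
console.log(isMarkActive(fixed, t0 + 365 * DAY_MS)); // fixed marks never expire
```

Because the comparison is strict, a snoozed problem becomes visible again at the exact expiry instant, not one tick later.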
package/lib/notify.js
ADDED
@@ -0,0 +1,74 @@
+/**
+ * lib/notify.js — Fire native OS notifications (macOS / Linux).
+ *
+ * Subtle nudge channel for the "user forgets to check SEO" problem.
+ * Click action: opens the dashboard URL (configurable). No third-party
+ * deps — uses built-in `osascript` on macOS and `notify-send` (libnotify)
+ * on Linux. Falls through to console on Windows / unknown platforms.
+ *
+ * Designed to be safe in cron contexts: never throws, never blocks the
+ * process, fire-and-forget via detached subprocess.
+ */
+
+import { spawn } from 'child_process';
+
+/**
+ * @param {object} opts
+ * @param {string} opts.title Headline shown bold in the notification
+ * @param {string} opts.message Body text
+ * @param {string} [opts.subtitle] macOS only — small subtitle below title
+ * @param {string} [opts.sound] macOS sound name (e.g. 'Glass', 'Tink'). Set false to silence.
+ * @returns {boolean} true if a native notification was fired; false on fallback
+ */
+export function notify({ title, message, subtitle, sound = false }) {
+  if (!title || !message) return false;
+  const platform = process.platform;
+  try {
+    if (platform === 'darwin') return notifyMacOS({ title, message, subtitle, sound });
+    if (platform === 'linux') return notifyLinux({ title, message });
+  } catch { /* fall through */ }
+  // Windows / unknown — print to console so cron jobs still leave a trace
+  console.log(`[seo-intel notify] ${title}: ${message}`);
+  return false;
+}
+
+function notifyMacOS({ title, message, subtitle, sound }) {
+  // osascript can fire a notification but cannot wire click→URL natively.
+  // For click-to-open we'd need terminal-notifier (third-party). Keeping
+  // zero-dep here; the CLI's `--open` flag opens the dashboard alongside.
+  const safe = (s) => String(s).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
+  const parts = [`display notification "${safe(message)}" with title "${safe(title)}"`];
+  if (subtitle) parts.push(`subtitle "${safe(subtitle)}"`);
+  if (sound) parts.push(`sound name "${safe(sound)}"`);
+  const script = parts.join(' ');
+  spawn('osascript', ['-e', script], { detached: true, stdio: 'ignore' }).unref();
+  return true;
+}
+
+function notifyLinux({ title, message }) {
+  // notify-send is shipped with libnotify on most Linux distros (GNOME, KDE,
+  // XFCE). On minimal/headless installs it may be missing — we fail
+  // silently and console-print in that case.
+  const child = spawn('notify-send', [
+    '--app-name=SEO Intel',
+    '--icon=dialog-information',
+    title,
+    message,
+  ], { detached: true, stdio: 'ignore' });
+  child.on('error', () => console.log(`[seo-intel notify] ${title}: ${message}`));
+  child.unref();
+  return true;
+}
+
+/**
+ * Open a URL in the user's default browser. Cross-platform, fire-and-forget.
+ * @param {string} url
+ */
+export function openUrl(url) {
+  try {
+    const platform = process.platform;
+    if (platform === 'darwin') spawn('open', [url], { detached: true, stdio: 'ignore' }).unref();
+    else if (platform === 'linux') spawn('xdg-open', [url], { detached: true, stdio: 'ignore' }).unref();
+    else if (platform === 'win32') spawn('cmd', ['/c', 'start', '', url], { detached: true, stdio: 'ignore' }).unref();
+  } catch { /* best-effort */ }
+}
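`notifyMacOS` builds an AppleScript string before handing it to `osascript`, escaping backslashes and double quotes so that text such as a page title cannot break out of the quoted literal. The sketch below reproduces only the string-building step so it can be inspected without spawning a process. `escapeAppleScript` and `buildNotificationScript` are hypothetical names, and the sample title and message are made up.

```javascript
// Same escaping as the `safe` helper in notifyMacOS: backslashes first,
// then double quotes, so the escapes themselves are not escaped twice.
const escapeAppleScript = (s) =>
  String(s).replace(/\\/g, '\\\\').replace(/"/g, '\\"');

// Hypothetical pure function returning the script notifyMacOS would run.
function buildNotificationScript({ title, message, subtitle, sound }) {
  const parts = [
    `display notification "${escapeAppleScript(message)}" with title "${escapeAppleScript(title)}"`,
  ];
  if (subtitle) parts.push(`subtitle "${escapeAppleScript(subtitle)}"`);
  if (sound) parts.push(`sound name "${escapeAppleScript(sound)}"`);
  return parts.join(' ');
}

const script = buildNotificationScript({
  title: 'SEO Intel',
  message: 'Fix "home" page',
  sound: 'Glass',
});
console.log(script);
// display notification "Fix \"home\" page" with title "SEO Intel" sound name "Glass"
```

The order of the two `replace` calls matters: escaping quotes first would add backslashes that the second pass would then double.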
package/lib/problems.js
ADDED
|
@@ -0,0 +1,382 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* lib/problems.js — Unified Problems list.
|
|
3
|
+
*
|
|
4
|
+
* Aggregates problem-shaped findings from every source in the DB (technical
|
|
5
|
+
* audit, citability scores, orphan analysis, schema gaps, intelligence
|
|
6
|
+
* ledger) into a single severity-sorted list with everything an AI coding
|
|
7
|
+
* agent needs to fix it: affected_urls, fix_template, verification.
|
|
8
|
+
*
|
|
9
|
+
* This is the canonical "what should I work on?" surface — backs both the
|
|
10
|
+
* MCP `list_problems` tool and the upcoming dashboard Problems tab.
|
|
11
|
+
*
|
|
12
|
+
* Each problem returns:
|
|
13
|
+
* {
|
|
14
|
+
* id, severity, category, tier, title, description, affected_urls,
|
|
15
|
+
* evidence, fix_template, verification, first_seen, last_seen,
|
|
16
|
+
* fix_difficulty
|
|
17
|
+
* }
|
|
18
|
+
*/
|
|
19
|
+
|
|
20
|
+
import crypto from 'node:crypto';
|
|
21
|
+
|
|
22
|
+
export const PROBLEM_CATEGORIES = ['tech', 'indexability', 'links', 'schema', 'citability', 'content', 'keyword', 'positioning'];
|
|
23
|
+
export const FREE_CATEGORIES = ['tech', 'indexability', 'links', 'schema'];
|
|
24
|
+
export const PAID_CATEGORIES = ['citability', 'content', 'keyword', 'positioning'];
|
|
25
|
+
export const PROBLEM_STATUSES = ['fixed', 'wont_fix', 'snoozed'];
|
|
26
|
+
|
|
27
|
+
const SEVERITY_RANK = { critical: 0, warn: 1, info: 2 };
|
|
28
|
+
|
|
29
|
+
/**
|
|
30
|
+
* Persist a problem-status mark. Same problem_id → upserts.
|
|
31
|
+
*
|
|
32
|
+
* @param {import('node:sqlite').DatabaseSync} db
|
|
33
|
+
* @param {{ problemId: string, project: string, status: string, markedBy?: string, note?: string, snoozeDays?: number }} args
|
|
34
|
+
* @returns {{ ok: boolean, error?: string, status?: string, expires_at?: number }}
|
|
35
|
+
*/
|
|
36
|
+
export function markProblemStatus(db, { problemId, project, status, markedBy, note, snoozeDays }) {
|
|
37
|
+
if (!PROBLEM_STATUSES.includes(status)) {
|
|
38
|
+
return { ok: false, error: `Unknown status "${status}". Allowed: ${PROBLEM_STATUSES.join(', ')}` };
|
|
39
|
+
}
|
|
40
|
+
if (!problemId || !project) return { ok: false, error: 'problemId and project are required' };
|
|
41
|
+
if (status === 'snoozed' && (!snoozeDays || snoozeDays <= 0)) {
|
|
42
|
+
return { ok: false, error: 'snoozeDays (positive integer) is required when status=snoozed' };
|
|
43
|
+
}
|
|
44
|
+
const now = Date.now();
|
|
45
|
+
const expiresAt = status === 'snoozed' ? now + snoozeDays * 86_400_000 : null;
|
|
46
|
+
db.prepare(`
|
|
47
|
+
INSERT INTO problem_status (problem_id, project, status, marked_at, marked_by, note, expires_at)
|
|
48
|
+
VALUES (?, ?, ?, ?, ?, ?, ?)
|
|
49
|
+
ON CONFLICT(problem_id) DO UPDATE SET
|
|
50
|
+
status = excluded.status,
|
|
51
|
+
marked_at = excluded.marked_at,
|
|
52
|
+
marked_by = excluded.marked_by,
|
|
53
|
+
note = excluded.note,
|
|
54
|
+
expires_at = excluded.expires_at
|
|
55
|
+
`).run(problemId, project, status, now, markedBy || null, note || null, expiresAt);
|
|
56
|
+
return { ok: true, status, marked_at: now, expires_at: expiresAt };
|
|
57
|
+
}
|
|
58
|
+
|
|
59
|
+
/**
|
|
60
|
+
* Read all active status marks for a project. "Active" = not expired.
|
|
61
|
+
* Used internally by getProblems to filter; also exposed via MCP for inspection.
|
|
62
|
+
*
|
|
63
|
+
* @returns {Map<string, { status, marked_at, marked_by, note, expires_at }>}
|
|
64
|
+
*/
|
|
65
|
+
export function getActiveStatusMap(db, project) {
|
|
66
|
+
const now = Date.now();
|
|
67
|
+
const rows = db.prepare(`
|
|
68
|
+
SELECT problem_id, status, marked_at, marked_by, note, expires_at
|
|
69
|
+
FROM problem_status
|
|
70
|
+
WHERE project = ? AND (expires_at IS NULL OR expires_at > ?)
|
|
71
|
+
`).all(project, now);
|
|
72
|
+
return new Map(rows.map(r => [r.problem_id, r]));
|
|
73
|
+
}
|
|
74
|
+
|
|
75
|
+
function shortHash(str) {
|
|
76
|
+
return crypto.createHash('sha1').update(str).digest('hex').slice(0, 10);
|
|
77
|
+
}
|
|
78
|
+
|
|
79
|
+
function makeId(category, kind, key) {
|
|
80
|
+
return `${category}::${kind}::${shortHash(key)}`;
|
|
81
|
+
}
|
|
82
|
+
|
|
83
|
+
// ── Collectors (each returns Problem[]) ─────────────────────────────────────
|
|
84
|
+
|
|
85
|
+
// 1. HTTP errors on target/owned pages — broken pages, critical
|
|
86
|
+
function collectHttpErrors(db, project) {
|
|
87
|
+
const rows = db.prepare(`
|
|
88
|
+
SELECT p.url, p.status_code, p.crawled_at, p.first_seen_at, d.domain, d.role
|
|
89
|
+
FROM pages p JOIN domains d ON d.id = p.domain_id
|
|
90
|
+
WHERE d.project = ? AND d.role IN ('target', 'owned')
|
|
91
|
+
AND p.status_code >= 400 AND p.status_code < 600
|
|
92
|
+
ORDER BY p.status_code, p.url
|
|
93
|
+
`).all(project);
|
|
94
|
+
return rows.map(r => ({
|
|
95
|
+
id: makeId('tech', `http-${r.status_code}`, r.url),
|
|
96
|
+
severity: 'critical',
|
|
97
|
+
category: 'tech',
|
|
98
|
+
tier: 'free',
|
|
99
|
+
title: `${r.status_code} on ${shortPath(r.url)}`,
|
|
100
|
+
description: `Page returns HTTP ${r.status_code}. Search engines and AI crawlers will drop this URL.`,
|
|
101
|
+
affected_urls: [r.url],
|
|
102
|
+
evidence: { status_code: r.status_code, domain: r.domain, role: r.role },
|
|
103
|
+
fix_template: r.status_code === 404
|
|
104
|
+
? `Either restore the page at \`${r.url}\` or add a 301 redirect to its replacement. Check internal links pointing here via \`get_pages\` and update them.`
|
|
105
|
+
: `Investigate why \`${r.url}\` returns ${r.status_code}. Server error, auth wall, or rate-limit. Restore 200 status or redirect.`,
|
|
106
|
+
verification: `Re-crawl with \`run_crawl(${project})\`, then re-run \`list_problems\` — this entry should disappear.`,
|
|
107
|
+
first_seen: r.first_seen_at || r.crawled_at,
|
|
108
|
+
last_seen: r.crawled_at,
|
|
109
|
+
fix_difficulty: r.status_code === 404 ? 2 : 4,
|
|
110
|
+
}));
|
|
111
|
+
}
|
|
112
|
+
|
|
113
|
+
// 2. Indexability — pages marked noindex via x_robots_tag header but indexable=1 in meta (conflict)
|
|
114
|
+
// OR pages explicitly noindex that have backlinks (wasted authority)
|
|
115
|
+
function collectIndexabilityIssues(db, project) {
|
|
116
|
+
const xRobotsNoindex = db.prepare(`
|
|
117
|
+
SELECT p.url, p.x_robots_tag, p.is_indexable, p.crawled_at, p.first_seen_at, d.domain
|
|
118
|
+
FROM pages p JOIN domains d ON d.id = p.domain_id
|
|
119
|
+
WHERE d.project = ? AND d.role IN ('target', 'owned')
|
|
120
|
+
AND p.x_robots_tag IS NOT NULL
|
|
121
|
+
AND lower(p.x_robots_tag) LIKE '%noindex%'
|
|
122
|
+
AND p.is_indexable = 1
|
|
123
|
+
`).all(project);
|
|
124
|
+
|
|
125
|
+
const out = [];
|
|
126
|
+
for (const r of xRobotsNoindex) {
|
|
127
|
+
out.push({
|
|
128
|
+
id: makeId('indexability', 'robots-conflict', r.url),
|
|
129
|
+
severity: 'warn',
|
|
130
|
+
category: 'indexability',
|
|
131
|
+
tier: 'free',
|
|
132
|
+
title: `Robots header conflict on ${shortPath(r.url)}`,
|
|
133
|
+
description: `X-Robots-Tag header says noindex but the meta robots tag allows indexing. Search engines will follow the header — page won't be indexed.`,
|
|
134
|
+
affected_urls: [r.url],
|
|
135
|
+
evidence: { x_robots_tag: r.x_robots_tag, is_indexable_meta: !!r.is_indexable },
|
|
136
|
+
fix_template: `Decide which is canonical. Either remove \`X-Robots-Tag: noindex\` from the server response, or set \`<meta name="robots" content="noindex">\` so both agree. Check Cloudflare/nginx config if the header is unexpected.`,
|
|
137
|
+
verification: `Re-crawl and confirm \`x_robots_tag\` no longer contains noindex via \`get_pages\`.`,
|
|
138
|
+
first_seen: r.first_seen_at || r.crawled_at,
|
|
139
|
+
last_seen: r.crawled_at,
|
|
140
|
+
fix_difficulty: 3,
|
|
141
|
+
});
|
|
142
|
+
}
|
|
143
|
+
return out;
|
|
144
|
+
}
|
|
145
|
+
|
|
146
|
+
// 3. Orphan pages — target/owned pages on the site with no incoming internal links
|
|
147
|
+
function collectOrphans(db, project) {
|
|
148
|
+
const rows = db.prepare(`
|
|
149
|
+
SELECT p.url, p.crawled_at, p.first_seen_at, p.click_depth, d.domain
|
|
150
|
+
FROM pages p JOIN domains d ON d.id = p.domain_id
|
|
151
|
+
WHERE d.project = ? AND d.role IN ('target', 'owned')
|
|
152
|
+
AND p.status_code = 200
|
|
153
|
+
AND p.click_depth > 0
|
|
154
|
+
AND p.url NOT IN (
|
|
155
|
+
SELECT DISTINCT l.target_url FROM links l
|
|
156
|
+
JOIN pages sp ON sp.id = l.source_id
|
|
157
|
+
JOIN domains sd ON sd.id = sp.domain_id
|
|
158
|
+
WHERE sd.project = ? AND l.is_internal = 1
|
|
159
|
+
)
|
|
160
|
+
ORDER BY p.click_depth, p.url
|
|
161
|
+
LIMIT 200
|
|
162
|
+
`).all(project, project);
|
|
163
|
+
return rows.map(r => ({
|
|
164
|
+
id: makeId('links', 'orphan', r.url),
|
|
165
|
+
severity: 'warn',
|
|
166
|
+
category: 'links',
|
|
167
|
+
tier: 'free',
|
|
168
|
+
title: `Orphan: ${shortPath(r.url)}`,
|
|
169
|
+
description: `No internal links point to this page. Search engines can only find it via sitemap; AI agents won't surface it.`,
|
|
170
|
+
affected_urls: [r.url],
|
|
171
|
+
evidence: { click_depth: r.click_depth, domain: r.domain },
|
|
172
|
+
fix_template: `Find 2–3 thematically related pages and add internal links to \`${r.url}\` from them. Use anchor text matching the page's primary keyword. Call \`get_pages(${project})\` to find candidates by topic, or \`list_keywords(${project})\` to find pages targeting overlapping keywords.`,
|
|
173
|
+
verification: `Re-crawl, then re-run \`list_problems\` — the orphan entry should be gone once any incoming link exists.`,
|
|
174
|
+
first_seen: r.first_seen_at || r.crawled_at,
|
|
175
|
+
last_seen: r.crawled_at,
|
|
176
|
+
fix_difficulty: 2,
|
|
177
|
+
}));
|
|
178
|
+
}
|
|
179
|
+
|
|
180
|
+
// 4. Schema coverage gaps — target pages missing schema where competitors have it
|
|
181
|
+
function collectSchemaGaps(db, project) {
|
|
182
|
+
// Per-page: target pages with no page_schemas entries
|
|
183
|
+
const rows = db.prepare(`
|
|
184
|
+
SELECT p.url, p.title, p.word_count, p.crawled_at, p.first_seen_at, d.domain
|
|
185
|
+
FROM pages p JOIN domains d ON d.id = p.domain_id
|
|
186
|
+
WHERE d.project = ? AND d.role IN ('target', 'owned')
|
|
187
|
+
AND p.status_code = 200 AND p.word_count >= 300
|
|
188
|
+
AND p.id NOT IN (SELECT DISTINCT page_id FROM page_schemas)
|
|
189
|
+
ORDER BY p.word_count DESC
|
|
190
|
+
LIMIT 50
|
|
191
|
+
`).all(project);
|
|
192
|
+
return rows.map(r => ({
|
|
193
|
+
id: makeId('schema', 'missing', r.url),
|
|
194
|
+
severity: 'info',
|
|
195
|
+
category: 'schema',
|
|
196
|
+
tier: 'free',
|
|
197
|
+
title: `No schema on ${shortPath(r.url)}`,
|
|
198
|
+
description: `Substantive page (${r.word_count} words) ships zero structured-data markup. AI engines and rich-results lose out.`,
|
|
199
|
+
affected_urls: [r.url],
|
|
200
|
+
evidence: { word_count: r.word_count, title: r.title },
|
|
201
|
+
fix_template: `Add JSON-LD schema appropriate to the page type. Article / BlogPosting / Product / FAQPage / Organization are the common ones. Use \`get_headings(${project}, '${r.url}')\` to inspect the page structure first. Keep it short — 5–10 fields is enough.`,
|
|
202
|
+
verification: `Re-crawl, then \`get_intel(${project}, for=raw)\` should show schema count increment.`,
|
|
203
|
+
first_seen: r.first_seen_at || r.crawled_at,
|
|
204
|
+
last_seen: r.crawled_at,
|
|
205
|
+
fix_difficulty: 2,
|
|
206
|
+
}));
|
|
207
|
+
}
|
|
208
|
+
|
|
209
|
+
// 5. PAID — low-citability pages (AEO score < 40 in citability_scores table)
|
|
210
|
+
function collectCitabilityGaps(db, project) {
|
|
211
|
+
let rows = [];
|
|
212
|
+
try {
|
|
213
|
+
rows = db.prepare(`
|
|
214
|
+
SELECT cs.url, cs.score, cs.tier, cs.entity_authority, cs.structured_claims,
|
|
215
|
+
cs.answer_density, cs.qa_proximity, cs.freshness, cs.schema_coverage,
|
|
216
|
+
cs.scored_at, p.title, p.word_count, d.role
|
|
217
|
+
FROM citability_scores cs
|
|
218
|
+
JOIN pages p ON p.id = cs.page_id
|
|
219
|
+
JOIN domains d ON d.id = p.domain_id
|
|
220
|
+
WHERE d.project = ? AND d.role IN ('target', 'owned') AND cs.score < 60
|
|
221
|
+
ORDER BY cs.score ASC
|
|
222
|
+
LIMIT 100
|
|
223
|
+
`).all(project);
|
|
224
|
+
} catch { /* citability_scores may not exist if AEO never run */ }
|
|
225
|
+
return rows.map(r => ({
|
|
226
|
+
id: makeId('citability', 'low-score', r.url),
|
|
227
|
+
severity: r.score < 30 ? 'critical' : r.score < 45 ? 'warn' : 'info',
|
|
228
|
+
category: 'citability',
|
|
229
|
+
tier: 'paid',
|
|
230
|
+
title: `Citability ${r.score}/100 on ${shortPath(r.url)}`,
|
|
231
|
+
description: `Page scores poorly for AI citability. Weak: ${weakestSignals(r)}.`,
|
|
232
|
+
affected_urls: [r.url],
|
|
233
|
+
evidence: {
|
|
234
|
+
score: r.score, tier: r.tier,
|
|
235
|
+
signals: {
|
|
236
|
+
entity_authority: r.entity_authority,
|
|
237
|
+
structured_claims: r.structured_claims,
|
|
238
|
+
answer_density: r.answer_density,
|
|
239
|
+
qa_proximity: r.qa_proximity,
|
|
240
|
+
freshness: r.freshness,
|
|
241
|
+
schema_coverage: r.schema_coverage,
|
|
242
|
+
},
|
|
243
|
+
word_count: r.word_count,
|
|
244
|
+
},
|
|
245
|
+
fix_template: citabilityFix(r),
|
|
246
|
+
verification: `Re-crawl, run \`run_citability_audit(${project})\`, then \`list_problems\` — score should rise.`,
|
|
247
|
+
first_seen: r.scored_at,
|
|
248
|
+
last_seen: r.scored_at,
|
|
249
|
+
fix_difficulty: 3,
|
|
250
|
+
}));
|
|
251
|
+
}
|
|
252
|
+
|
|
253
|
+
// 6. PAID — Intelligence Ledger insights mapped to problems
|
|
254
|
+
function collectInsightProblems(db, project) {
|
|
255
|
+
let rows = [];
|
|
256
|
+
try {
|
|
257
|
+
rows = db.prepare(`
|
|
258
|
+
SELECT id, type, fingerprint, first_seen, last_seen, data, source
|
|
259
|
+
FROM insights
|
|
260
|
+
WHERE project = ? AND status = 'active'
|
|
261
|
+
AND type IN ('content_gap', 'keyword_gap', 'technical_gap', 'positioning')
|
|
262
|
+
ORDER BY last_seen DESC
|
|
263
|
+
LIMIT 100
|
|
264
|
+
`).all(project);
|
|
265
|
+
} catch { return []; }
|
|
266
|
+
return rows.map(r => {
|
|
267
|
+
const data = safeParse(r.data);
|
|
268
|
+
const typeMeta = INSIGHT_TYPE_MAP[r.type] || { category: 'content', severity: 'info' };
|
|
269
|
+
const titleHint = data?.keyword || data?.topic || data?.gap || data?.phrase || `Insight ${r.id}`;
|
|
270
|
+
return {
|
|
271
|
+
id: makeId(typeMeta.category, r.type, r.fingerprint),
|
|
272
|
+
severity: typeMeta.severity,
|
|
273
|
+
category: typeMeta.category,
|
|
274
|
+
tier: 'paid',
|
|
275
|
+
title: `${typeMeta.label}: ${titleHint}`,
|
|
276
|
+
description: data?.why || data?.description || `Active insight in the Intelligence Ledger (type=${r.type}).`,
|
|
277
|
+
affected_urls: data?.url ? [data.url] : (data?.pages || []),
|
|
278
|
+
evidence: { insight_id: r.id, source: r.source, ...data },
|
|
279
|
+
fix_template: data?.suggestion || data?.fix || `Address this ${r.type} via blog draft, page update, or content fix. Use \`draft_blog_prompt(${project}, topic='${titleHint}')\` for an AEO-aware draft prompt.`,
|
|
280
|
+
verification: `After the fix, call \`mark_problem_status('${makeId(typeMeta.category, r.type, r.fingerprint)}', 'fixed')\` (coming in v1.5.35) or wait for the next analyze run to clear it.`,
|
|
281
|
+
first_seen: r.first_seen,
|
|
282
|
+
last_seen: r.last_seen,
|
|
283
|
+
fix_difficulty: typeMeta.difficulty,
|
|
284
|
+
};
|
|
285
|
+
});
|
|
286
|
+
}
+
+const INSIGHT_TYPE_MAP = {
+  content_gap: { category: 'content', severity: 'warn', label: 'Content gap', difficulty: 4 },
+  keyword_gap: { category: 'keyword', severity: 'warn', label: 'Keyword gap', difficulty: 3 },
+  technical_gap: { category: 'tech', severity: 'warn', label: 'Technical gap', difficulty: 3 },
+  positioning: { category: 'positioning', severity: 'info', label: 'Positioning', difficulty: 5 },
+};
+
+// ── Aggregator ─────────────────────────────────────────────────────────────
+
+/**
+ * @param {import('node:sqlite').DatabaseSync} db
+ * @param {string} project
+ * @param {{ severity?: string, category?: string, limit?: number, includePaid?: boolean, maxFixDifficulty?: number, includeMarked?: boolean }} opts
+ * @returns {object[]}
+ */
+export function getProblems(db, project, opts = {}) {
+  const all = [
+    ...collectHttpErrors(db, project),
+    ...collectIndexabilityIssues(db, project),
+    ...collectOrphans(db, project),
+    ...collectSchemaGaps(db, project),
+  ];
+  if (opts.includePaid) {
+    all.push(...collectCitabilityGaps(db, project));
+    all.push(...collectInsightProblems(db, project));
+  }
+  // Filter out problems marked fixed/wont_fix/snoozed (unless caller asks to see them).
+  const statusMap = getActiveStatusMap(db, project);
+  let filtered = all.flatMap(p => {
+    const mark = statusMap.get(p.id);
+    if (mark && !opts.includeMarked) return []; // hide by default
+    return [{ ...p, status: mark ? mark.status : 'active', status_mark: mark || null }];
+  });
+  if (opts.severity) filtered = filtered.filter(p => p.severity === opts.severity);
+  if (opts.category) filtered = filtered.filter(p => p.category === opts.category);
+  if (opts.maxFixDifficulty) filtered = filtered.filter(p => p.fix_difficulty <= opts.maxFixDifficulty);
+  filtered.sort((a, b) =>
+    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
+    a.fix_difficulty - b.fix_difficulty ||
+    b.last_seen - a.last_seen
+  );
+  if (opts.limit) filtered = filtered.slice(0, opts.limit);
+  return filtered;
+}
+
+/**
+ * Counts only — used by list_projects nag to surface "5 critical pending".
+ * Free tier sees free-only counts so we don't tease paid data.
+ */
+export function getProblemCounts(db, project, { includePaid = false } = {}) {
+  const problems = getProblems(db, project, { includePaid });
+  const counts = { critical: 0, warn: 0, info: 0, total: problems.length, by_category: {} };
+  for (const p of problems) {
+    counts[p.severity]++;
+    counts.by_category[p.category] = (counts.by_category[p.category] || 0) + 1;
+  }
+  return counts;
+}
+
+// ── Helpers ────────────────────────────────────────────────────────────────
+
+function shortPath(url) {
+  try {
+    const u = new URL(url);
+    const p = (u.pathname + u.search + u.hash).slice(0, 60);
+    return `${u.hostname}${p}`;
+  } catch { return url.slice(0, 60); }
+}
+
+function safeParse(s) { try { return JSON.parse(s); } catch { return null; } }
+
+function weakestSignals(r) {
+  const signals = [
+    ['entity authority', r.entity_authority],
+    ['structured claims', r.structured_claims],
+    ['answer density', r.answer_density],
+    ['Q&A proximity', r.qa_proximity],
+    ['freshness', r.freshness],
+    ['schema coverage', r.schema_coverage],
+  ];
+  return signals.sort((a, b) => a[1] - b[1]).slice(0, 2).map(s => s[0]).join(' + ');
+}
+
+function citabilityFix(r) {
+  const fixes = [];
+  if (r.entity_authority < 4) fixes.push('cite 2–3 named experts/authoritative sources');
+  if (r.structured_claims < 4) fixes.push('add concrete numbers, dates, or measurable claims (e.g. "47ms latency")');
+  if (r.answer_density < 4) fixes.push('shorten paragraphs; one answer per heading');
+  if (r.qa_proximity < 4) fixes.push('add an FAQ section with `FAQPage` schema');
+  if (r.freshness < 4) fixes.push('update the publish date and add a brief "last updated" note');
+  if (r.schema_coverage < 4) fixes.push('add JSON-LD schema appropriate to the page type');
+  return fixes.length
+    ? `To raise score: ${fixes.join('; ')}.`
+    : `Page just under threshold — minor improvements suffice. Use \`prescore_draft\` on a revised version to confirm before publishing.`;
+}
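The ordering applied by `getProblems` (severity, then easiest fix, then most recently seen) can be tried in isolation. This is a minimal sketch, not the package's code: `SEVERITY_RANK` is referenced in the diff but its values are not shown, so the ranking below is an assumption, and `sortProblems` and the sample records are hypothetical.

```javascript
// Assumed ranking (lower sorts first). The real SEVERITY_RANK is not visible in this diff.
const SEVERITY_RANK = { critical: 0, warn: 1, info: 2 };

// Hypothetical helper reusing the comparator chain from getProblems:
// each `||` falls through to the next key only when the previous difference is 0.
function sortProblems(problems) {
  return [...problems].sort((a, b) =>
    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
    a.fix_difficulty - b.fix_difficulty ||
    b.last_seen - a.last_seen
  );
}

// Illustrative data only; last_seen is treated as a numeric timestamp.
const sample = [
  { id: 'a', severity: 'warn', fix_difficulty: 3, last_seen: 100 },
  { id: 'b', severity: 'critical', fix_difficulty: 4, last_seen: 50 },
  { id: 'c', severity: 'warn', fix_difficulty: 1, last_seen: 10 },
  { id: 'd', severity: 'warn', fix_difficulty: 1, last_seen: 90 },
];
const sortedIds = sortProblems(sample).map(p => p.id);
console.log(sortedIds.join(','));
```

The chain only works when every key is numeric: subtracting non-numeric strings yields `NaN`, which is falsy, so a tie-break on such values would silently do nothing.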
package/mcp/server.js
CHANGED
@@ -29,6 +29,7 @@ import { getDb, insertAgentInsight, AGENT_INSIGHT_TYPES, getActiveInsights, getC
 import { getIntel, INTEL_SLICES, FREE_SLICES } from '../lib/intel.js';
 import { isPro } from '../lib/license.js';
 import { readProgress } from '../lib/progress.js';
+import { getProblems, getProblemCounts, markProblemStatus, getActiveStatusMap, PROBLEM_CATEGORIES, PROBLEM_STATUSES } from '../lib/problems.js';
 
 import { runAeoAnalysis, persistAeoScores, upsertCitabilityInsights } from '../analyses/aeo/index.js';
 import { prescore } from '../analyses/blog-draft/prescorer.js';
@@ -69,19 +70,50 @@ function listConfigProjects() {
 }
 
 // ── Tool: list_projects (free) ────────────────────────────────────────────
+// Designed as the natural entry point. Returns per-project pending problem
+// counts so that every interaction with the agent surfaces stale audits —
+// the "freemium nag" pattern. Solo users see paid-tier problem counts too.
 server.registerTool(
   'list_projects',
   {
-    description: 'List all SEO Intel projects
+    description: 'List all SEO Intel projects on this machine with crawled-page counts AND pending problem counts. Use this as the entry point for every SEO conversation — the response includes a `nag` field per project flagging stale crawls or unresolved critical issues. After noticing the nag, the next natural tool is list_problems(project). Free tier — no license required (Solo users see paid-tier problem counts in addition to free ones).',
   },
   async () => {
     const db = getDb();
     const configs = listConfigProjects();
+    const includePaid = isPro();
+    const now = Date.now();
     const out = configs.map(c => {
       const row = db.prepare(
-        'SELECT COUNT(*) AS n FROM pages p JOIN domains d ON d.id=p.domain_id WHERE d.project=?'
+        'SELECT COUNT(*) AS n, MAX(d.last_crawled) AS last_crawled FROM pages p JOIN domains d ON d.id=p.domain_id WHERE d.project=?'
       ).get(c.project);
-
+      const pages = row?.n || 0;
+      const lastCrawl = row?.last_crawled || null;
+      const staleDays = lastCrawl ? Math.floor((now - lastCrawl) / 86_400_000) : null;
+      let counts = null;
+      let nag = null;
+      if (pages > 0) {
+        try {
+          counts = getProblemCounts(db, c.project, { includePaid });
+          const reasons = [];
+          if (counts.critical > 0) reasons.push(`${counts.critical} CRITICAL`);
+          if (counts.warn > 0) reasons.push(`${counts.warn} warn`);
+          if (staleDays !== null && staleDays >= 7) reasons.push(`crawl ${staleDays}d stale`);
+          if (reasons.length) {
+            nag = `${reasons.join(' · ')}. Call list_problems('${c.project}') to see them${counts.critical > 0 ? ', then fix the criticals first' : ''}.`;
+          }
+        } catch { /* problems collector failed (e.g. fresh DB) — silent */ }
+      }
+      return {
+        project: c.project,
+        target: c.target,
+        pages,
+        last_crawled: lastCrawl ? new Date(lastCrawl).toISOString() : null,
+        stale_days: staleDays,
+        problem_counts: counts,
+        nag,
+        problems_tier: includePaid ? 'paid' : 'free',
+      };
     });
     return {
       content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
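The `nag` string assembled inline in the `list_projects` handler can be factored into a pure function for experimentation. The sketch below is illustrative only: `buildNag` is a hypothetical name, and it assumes, as the handler's arithmetic suggests, that `last_crawled` is stored as epoch milliseconds.

```javascript
const DAY_MS = 86_400_000;

// Hypothetical helper: same reasons and wording as the inline logic in list_projects.
// Returns null when there is nothing worth nagging about.
function buildNag(project, counts, lastCrawlMs, nowMs) {
  const staleDays = lastCrawlMs ? Math.floor((nowMs - lastCrawlMs) / DAY_MS) : null;
  const reasons = [];
  if (counts.critical > 0) reasons.push(`${counts.critical} CRITICAL`);
  if (counts.warn > 0) reasons.push(`${counts.warn} warn`);
  if (staleDays !== null && staleDays >= 7) reasons.push(`crawl ${staleDays}d stale`);
  if (!reasons.length) return null;
  const tail = counts.critical > 0 ? ', then fix the criticals first' : '';
  return `${reasons.join(' · ')}. Call list_problems('${project}') to see them${tail}.`;
}

// Illustrative inputs: a fixed "now" keeps the result deterministic.
const nowMs = Date.UTC(2026, 4, 23);
const nag = buildNag('demo', { critical: 2, warn: 5 }, nowMs - 10 * DAY_MS, nowMs);
const quiet = buildNag('demo', { critical: 0, warn: 0 }, nowMs - 1 * DAY_MS, nowMs);
console.log(nag);
console.log(quiet);
```

Passing `nowMs` in rather than calling `Date.now()` inside the helper is a design choice for testability; the handler in the diff reads the clock once per request instead.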
@@ -675,11 +707,134 @@ server.registerTool(
   }
 );
 
+// ── Tool: list_problems (the "what should I fix?" entry tool) ─────────────
+// This is the canonical Problems surface. Returns severity-sorted, agent-
+// fixable findings with affected_urls, fix_template, verification. Free
+// categories: tech, indexability, links, schema. Paid adds: citability,
+// content, keyword, positioning.
+server.registerTool(
+  'list_problems',
+  {
+    description: [
+      "List concrete, fixable SEO problems for a project — severity-sorted, with everything an AI coding agent needs to remediate (affected_urls, fix_template, verification). This is the primary 'what should I work on?' tool: call list_projects first to see the nag/counts, then call list_problems here.",
+      "",
+      "Free tier categories: tech (HTTP errors), indexability (robots conflicts), links (orphan pages), schema (missing structured data).",
+      "Paid tier adds: citability (low AEO scores), content/keyword/positioning gaps from the Intelligence Ledger.",
+      "",
+      "Each problem returns {id, severity, category, tier, title, description, affected_urls, evidence, fix_template, verification, first_seen, last_seen, fix_difficulty}. fix_difficulty: 1=trivial → 5=deep work.",
+      "",
+      "Typical agent loop: list_projects → list_problems(project, severity='critical') → fix highest-leverage one → run_crawl(project) → list_problems again to verify it cleared.",
+    ].join("\n"),
+    inputSchema: {
+      project: z.string(),
+      severity: z.enum(['critical', 'warn', 'info']).optional().describe('Filter to one severity'),
+      category: z.enum(PROBLEM_CATEGORIES).optional().describe('Filter to one category'),
+      limit: z.number().int().positive().max(500).optional().describe('Max problems to return (default 50)'),
+      max_fix_difficulty: z.number().int().min(1).max(5).optional().describe('Cap on fix_difficulty — useful for picking quick wins (set to 2 for "easy" only)'),
+      include_marked: z.boolean().optional().describe('Include problems already marked fixed/wont_fix/snoozed (default false — they are hidden). Useful for auditing what has been suppressed.'),
+    },
+  },
+  async ({ project, severity, category, limit = 50, max_fix_difficulty, include_marked }) => {
+    if (!loadProjectConfig(project)) {
+      return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+    }
+    try {
+      const db = getDb();
+      const includePaid = isPro();
+      const problems = getProblems(db, project, {
+        severity, category, limit,
+        maxFixDifficulty: max_fix_difficulty,
+        includePaid,
+        includeMarked: !!include_marked,
+      });
+      const counts = getProblemCounts(db, project, { includePaid });
+      const out = {
+        project,
+        tier: includePaid ? 'paid' : 'free',
+        counts,
+        returned: problems.length,
+        filters: { severity: severity || 'any', category: category || 'any', max_fix_difficulty: max_fix_difficulty || null },
+        upsell: includePaid ? null : 'Solo unlocks citability, content_gap, keyword_gap, positioning categories. Currently showing free-tier categories only (tech, indexability, links, schema). Upgrade at https://ukkometa.fi/en/seo-intel/',
+        problems,
+      };
+      return {
+        content: [{ type: 'text', text: JSON.stringify(out, null, 2) }],
+        structuredContent: out,
+      };
+    } catch (err) {
+      return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+    }
+  }
+);
+
+// ── Tool: mark_problem_status (free — closes the loop) ────────────────────
+server.registerTool(
+  'mark_problem_status',
+  {
+    description: [
+      "Mark a problem (from list_problems) as fixed, wont_fix, or snoozed. Subsequent list_problems calls will hide it unless the underlying source data re-surfaces it. Free tier — agents need this to confirm 'I fixed it' on subjective problems (positioning, content_gap) whose source data won't auto-clear on re-crawl.",
+      "",
+      "Statuses:",
+      " fixed — done. Hide permanently. (If the same problem_id resurfaces from a future crawl, the mark is ignored and it shows again — i.e. the mark is per-instance not per-fingerprint.)",
+      " wont_fix — accepted/ignored. Hide permanently.",
+      " snoozed — hide for N days (require `snooze_days`). After that the problem re-appears in list_problems.",
+      "",
+      "Re-marking the same problem_id with a different status updates the existing record. To un-hide, re-mark with status='fixed' and snooze_days=0, then re-mark with 'snoozed' / snooze_days=0 — actually simpler: call list_problems(include_marked=true) to see hidden ones and mark them again.",
+    ].join("\n"),
+    inputSchema: {
+      problem_id: z.string().describe('The `id` field from a list_problems result, e.g. "links::orphan::abc1234567"'),
+      project: z.string().describe('Must match the project the problem belongs to'),
+      status: z.enum(PROBLEM_STATUSES).describe('fixed | wont_fix | snoozed'),
+      snooze_days: z.number().int().positive().max(365).optional().describe('Required when status=snoozed. Max 365.'),
+      agent_name: z.string().optional().describe('Provenance — stored as marked_by'),
+      note: z.string().optional().describe('Optional context, e.g. "added internal link from homepage"'),
+    },
+  },
+  async ({ problem_id, project, status, snooze_days, agent_name, note }) => {
+    if (!loadProjectConfig(project)) {
+      return { content: [{ type: 'text', text: `Project "${project}" not found. Use list_projects to discover.` }], isError: true };
+    }
+    try {
+      const db = getDb();
+      const result = markProblemStatus(db, {
+        problemId: problem_id,
+        project,
+        status,
+        markedBy: agent_name ? `agent:${agent_name}` : 'agent',
+        note,
+        snoozeDays: snooze_days,
+      });
+      if (!result.ok) {
+        return { content: [{ type: 'text', text: `seo-intel mark error: ${result.error}` }], isError: true };
+      }
+      const payload = {
+        ok: true,
+        problem_id,
+        project,
+        status,
+        marked_at: new Date(result.marked_at).toISOString(),
+        expires_at: result.expires_at ? new Date(result.expires_at).toISOString() : null,
+        hint: status === 'fixed'
+          ? 'Marked fixed. Hidden from list_problems. Re-call list_problems to confirm.'
+          : status === 'wont_fix'
+            ? 'Marked wont_fix. Permanently hidden. To un-hide: call this tool again with a different status.'
+            : `Snoozed until ${new Date(result.expires_at).toISOString()}. Will re-appear in list_problems after that.`,
+      };
+      return {
+        content: [{ type: 'text', text: JSON.stringify(payload, null, 2) }],
+        structuredContent: payload,
+      };
+    } catch (err) {
+      return { content: [{ type: 'text', text: `seo-intel error: ${err.message}` }], isError: true };
+    }
+  }
+);
+
 async function main() {
   const transport = new StdioServerTransport();
   await server.connect(transport);
   // stderr is fine; the host typically surfaces this in its MCP logs panel.
-  console.error(`[seo-intel-mcp] v${VERSION} ready on stdio.
+  console.error(`[seo-intel-mcp] v${VERSION} ready on stdio. 15 tools — free: list_projects (with nag), list_problems, mark_problem_status, get_intel(raw), get_pages, list_keywords, get_headings, run_crawl, get_crawl_status, ingest_insight, export_intel (free-tier subset); paid: get_intel(audit/blog/competitor), run_citability_audit, get_competitor_positioning, prescore_draft, draft_blog_prompt, export_intel (paid tables), and list_problems unlocks paid categories.`);
 }
 
 main().catch(err => {
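How `snooze_days` becomes the `expires_at` value returned by `markProblemStatus` is not visible in this part of the diff. One plausible reading, sketched here under the assumption that marks carry epoch-millisecond timestamps, is that only snoozed marks expire. Both function names below are hypothetical and are not taken from `lib/problems.js`.

```javascript
const DAY_MS = 86_400_000;

// Hypothetical: only 'snoozed' marks get an expiry; 'fixed' and 'wont_fix' never expire.
function computeExpiry(status, snoozeDays, markedAtMs) {
  if (status !== 'snoozed') return null;
  if (!Number.isInteger(snoozeDays) || snoozeDays < 1 || snoozeDays > 365) {
    throw new RangeError('snooze_days must be an integer from 1 to 365 when status is snoozed');
  }
  return markedAtMs + snoozeDays * DAY_MS;
}

// Hypothetical: a mark hides its problem while it has no expiry
// or while the expiry is still in the future.
function isMarkActive(mark, nowMs) {
  return mark.expires_at === null || mark.expires_at > nowMs;
}

// Illustrative marks with a fixed timestamp so the checks are deterministic.
const markedAt = Date.UTC(2026, 4, 23);
const snoozedMark = { status: 'snoozed', expires_at: computeExpiry('snoozed', 7, markedAt) };
const fixedMark = { status: 'fixed', expires_at: computeExpiry('fixed', undefined, markedAt) };
console.log(isMarkActive(snoozedMark, markedAt + 6 * DAY_MS)); // still hidden on day 6
console.log(isMarkActive(snoozedMark, markedAt + 8 * DAY_MS)); // visible again on day 8
```

The range check mirrors the `snooze_days` bounds declared in the tool's `inputSchema` (positive integer, at most 365).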
package/package.json
CHANGED
package/setup/web-routes.js
CHANGED
@@ -314,25 +314,27 @@ async function handlePingOllama(req, res) {
 
     const { checkOllamaRemote, checkLmStudio } = await import('./checks.js');
 
-    //
-
-
-
-
+    // Probe BOTH engines in parallel. Whichever responds, that's the engine.
+    // Don't gate on port — users can run Ollama on 1234 or LM Studio on 11434
+    // (and many do, e.g. when avoiding port conflicts on shared dev boxes).
+    const [ollamaResult, lmResult] = await Promise.all([
+      checkOllamaRemote(host).catch(() => ({ reachable: false, models: [], host })),
+      checkLmStudio(host).catch(() => ({ reachable: false, models: [], host })),
+    ]);
+
+    if (ollamaResult.reachable) {
+      jsonResponse(res, { ...ollamaResult, mode: 'ollama' });
+    } else if (lmResult.reachable) {
       jsonResponse(res, { ...lmResult, host, mode: 'lmstudio' });
     } else {
-
-
-
-
-
-
-
-
-    } else {
-      jsonResponse(res, result); // return original Ollama failure
-    }
-  }
+      // Neither engine responded — return a useful error.
+      const port = (() => { try { return new URL(host).port; } catch { return ''; } })();
+      jsonResponse(res, {
+        reachable: false,
+        host,
+        probed: { ollama: `${host}/api/tags`, lmstudio: `${host}/api/v1/models` },
+        hint: `Neither Ollama (port ${port || '11434'} expected) nor LM Studio (port ${port || '1234'} expected) responded at ${host}. Common causes: (1) the engine is bound to 127.0.0.1 only — re-bind to 0.0.0.0 to allow LAN access, (2) firewall blocking inbound port ${port || 'N'}, (3) wrong port. Verify from the LM Studio Developer tab / 'ollama serve' output.`,
+      });
     }
   } catch (err) {
     jsonResponse(res, { error: err.message }, 500);
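The parallel-probe pattern in `handlePingOllama` depends on attaching `.catch` to each probe so that one failing engine cannot reject the whole `Promise.all`. A self-contained sketch of that idea follows. The two `fake…Probe` functions are hypothetical stand-ins for `checkOllamaRemote` and `checkLmStudio` (whose real implementations live in `./checks.js` and are not shown), and `detectEngine` is an illustrative name.

```javascript
// Hypothetical stand-ins: one probe fails outright, the other succeeds.
async function fakeOllamaProbe(host) { throw new Error('ECONNREFUSED'); }
async function fakeLmStudioProbe(host) { return { reachable: true, models: ['m1'], host }; }

// Run both probes concurrently. A rejected probe degrades to an "unreachable"
// result instead of rejecting Promise.all, so the other probe's answer survives.
async function detectEngine(host, probeOllama, probeLmStudio) {
  const unreachable = () => ({ reachable: false, models: [], host });
  const [ollama, lm] = await Promise.all([
    probeOllama(host).catch(unreachable),
    probeLmStudio(host).catch(unreachable),
  ]);
  if (ollama.reachable) return { ...ollama, mode: 'ollama' };
  if (lm.reachable) return { ...lm, host, mode: 'lmstudio' };
  return { reachable: false, host };
}

// Illustrative host; no network traffic happens because the probes are fakes.
const detection = detectEngine('http://192.168.1.20:1234', fakeOllamaProbe, fakeLmStudioProbe);
detection.then(r => console.log(r.mode));
```

Because both probes start before either is awaited, the total wait is roughly the slower of the two rather than their sum. Ollama still wins when both respond, matching the branch order in the diff.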
package/setup/wizard.html
CHANGED
@@ -2274,7 +2274,9 @@ input::placeholder {
   let host = input.value.trim();
   if (!host) return;
 
-  // Auto-add protocol
+  // Auto-add protocol. Don't auto-add a default port — LM Studio uses 1234,
+  // Ollama uses 11434. If the user pasted without a port, default to Ollama's
+  // 11434 but the probe will also try LM Studio at 1234 if Ollama fails.
   if (!host.startsWith('http')) host = 'http://' + host;
   if (!host.includes(':', host.indexOf('//') + 2)) host += ':11434';
 
@@ -2285,8 +2287,9 @@ input::placeholder {
   try {
     const res = await API.get(`/api/setup/ping-ollama?host=${encodeURIComponent(host)}`);
     if (res.reachable) {
-
-
+      const engine = res.mode === 'lmstudio' ? 'LM Studio' : 'Ollama';
+      result.innerHTML = `<span style="color:var(--color-success);"><i class="fa-solid fa-check"></i> Connected to ${engine} — ${res.models.length} model(s) found</span>`;
+      // Append to comma-separated OLLAMA_HOSTS in .env (legacy name covers both engines)
       const currentHosts = (state.systemStatus?.env?.raw?.OLLAMA_HOSTS || '').split(',').map(h => h.trim()).filter(Boolean);
       if (!currentHosts.includes(host)) currentHosts.push(host);
       await API.post('/api/setup/save-env', { key: 'OLLAMA_HOSTS', value: currentHosts.join(',') });
@@ -2298,7 +2301,9 @@ input::placeholder {
       // Reload models with new host
       loadModels();
     } else {
-
+      // Better diagnostic — backend now returns a `hint` with common-cause guidance
+      const hint = res.hint || `Check IP, port, and that Ollama (port 11434) or LM Studio (port 1234) is running on that machine and bound to a network interface (not just 127.0.0.1).`;
+      result.innerHTML = `<span style="color:var(--color-danger);"><i class="fa-solid fa-xmark"></i> Unreachable — ${escapeHtml(hint)}</span>`;
     }
   } catch (err) {
     result.innerHTML = `<span style="color:var(--color-danger);"><i class="fa-solid fa-xmark"></i> ${err.message}</span>`;
@@ -2307,6 +2312,11 @@ input::placeholder {
   btn.innerHTML = '<i class="fa-solid fa-satellite-dish"></i> Ping';
 };
 
+// Light HTML escape so backend-supplied hint strings can't break out
+function escapeHtml(s) {
+  return String(s).replace(/[&<>"']/g, c => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }[c]));
+}
+
 window.removeOllamaHost = async function(host) {
   // Remove single host from comma-separated OLLAMA_HOSTS
   const currentHosts = (state.systemStatus?.env?.raw?.OLLAMA_HOSTS || '').split(',').map(h => h.trim()).filter(Boolean);
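The `escapeHtml` helper added in this hunk guards the one place where a backend-supplied string is interpolated into `innerHTML`. A standalone version is sketched below, assuming the conventional five-character entity mapping; the `HTML_ENTITIES` constant and the sample input are illustrative and not part of the package.

```javascript
// Assumed mapping: each HTML-significant character to its entity.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Coerce to string first so null, undefined, and numbers are handled without throwing.
function escapeHtml(s) {
  return String(s).replace(/[&<>"']/g, c => HTML_ENTITIES[c]);
}

// Illustrative hostile input: markup and quotes that would otherwise be parsed as HTML.
const escaped = escapeHtml(`<img src=x onerror="alert('hi')"> & more`);
console.log(escaped);
```

Since `&` is replaced in the same single pass as the other characters, entities produced by the replacement are never escaped a second time.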