aeorank 2.0.0 → 2.2.0
- package/README.md +104 -35
- package/dist/browser.d.ts +1 -1
- package/dist/browser.js +239 -99
- package/dist/browser.js.map +1 -1
- package/dist/cli.js +157 -73
- package/dist/cli.js.map +1 -1
- package/dist/index.cjs +239 -99
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +1 -1
- package/dist/index.d.ts +1 -1
- package/dist/index.js +239 -99
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -41,38 +41,90 @@ console.log(result.opportunities); // Prioritized improvements
|
|
|
41
41
|
|
|
42
42
|
## What It Checks
|
|
43
43
|
|
|
44
|
-
AEORank evaluates 28 criteria
|
|
45-75
|
-
(old lines 45-75 removed; their text is not preserved in this rendering)
|
|
44
|
+
AEORank evaluates 28 criteria that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content. Criteria are organized into three tiers by impact on real-world AI citations:
|
|
45
|
+
|
|
46
|
+
### Scoring Tiers (by importance)
|
|
47
|
+
|
|
48
|
+
**Content Substance (~55%)** - *Why* an AI engine would cite you:
|
|
49
|
+
|
|
50
|
+
| Criterion | Weight | What it measures |
|
|
51
|
+
|-----------|--------|------------------|
|
|
52
|
+
| Topic Coherence | 14% | Blog content focus on core expertise vs scattered topics |
|
|
53
|
+
| Original Data & Expert Analysis | 10% | Proprietary research, case studies, unique data points |
|
|
54
|
+
| Content Depth | 7% | Article length, heading structure, deep vs thin pages |
|
|
55
|
+
| Fact & Data Density | 6% | Specific numbers, statistics, data points per page |
|
|
56
|
+
| Direct Answer Paragraphs | 5% | Concise answer paragraphs after question headings |
|
|
57
|
+
| Q&A Content Format | 5% | Question-format headings (What, How, Why) with answers |
|
|
58
|
+
| Query-Answer Alignment | 5% | Every question heading followed by a direct answer |
|
|
59
|
+
| Comprehensive FAQ Section | 4% | Dedicated FAQ with FAQPage schema markup |
|
|
60
|
+
|
|
61
|
+
**Content Organization (~30%)** - *How* easily AI can extract and trust your content:
|
|
62
|
+
|
|
63
|
+
| Criterion | Weight | What it measures |
|
|
64
|
+
|-----------|--------|------------------|
|
|
65
|
+
| Entity Authority & NAP Consistency | 5% | Organization schema, consistent name/address/phone |
|
|
66
|
+
| Internal Linking Structure | 4% | Topic clusters, breadcrumbs, reachability from homepage |
|
|
67
|
+
| Content Freshness Signals | 4% | dateModified schema, visible dates, recent content |
|
|
68
|
+
| Schema.org Structured Data | 3% | JSON-LD blocks (Organization, Article, FAQPage, etc.) |
|
|
69
|
+
| Author & Expert Schema | 3% | Person schema with credentials and expertise |
|
|
70
|
+
| Table & List Extractability | 3% | HTML tables with headers, ordered/unordered lists |
|
|
71
|
+
| Definition Patterns | 2% | Clear "X is defined as..." patterns for key terms |
|
|
72
|
+
| Visible Date Signal | 2% | Visible publication dates with `<time>` elements |
|
|
73
|
+
| Semantic HTML5 & Accessibility | 2% | Semantic elements (main, article, nav), ARIA, lang |
|
|
74
|
+
| Clean, Crawlable HTML | 2% | HTTPS, meta tags, proper heading hierarchy |
|
|
75
|
+
|
|
76
|
+
**Technical Plumbing (~15%)** - *Whether* AI crawlers can find you (table stakes):
|
|
77
|
+
|
|
78
|
+
| Criterion | Weight | What it measures |
|
|
79
|
+
|-----------|--------|------------------|
|
|
80
|
+
| Content Cannibalization | 2% | Overlapping pages competing for the same topic |
|
|
81
|
+
| llms.txt File | 2% | /llms.txt with site description and key page URLs |
|
|
82
|
+
| robots.txt for AI Crawlers | 2% | GPTBot, ClaudeBot, PerplexityBot access |
|
|
83
|
+
| Content Publishing Velocity | 2% | Regular publishing cadence in sitemap |
|
|
84
|
+
| Content Licensing & AI Permissions | 2% | /ai.txt file, license schema for AI usage |
|
|
85
|
+
| Sitemap Completeness | 1% | sitemap.xml with lastmod dates |
|
|
86
|
+
| Canonical URL Strategy | 1% | Self-referencing canonical tags |
|
|
87
|
+
| RSS/Atom Feed | 1% | RSS feed linked from homepage |
|
|
88
|
+
| Schema Coverage & Depth | 1% | Schema markup on inner pages, not just homepage |
|
|
89
|
+
| Speakable Schema | 1% | SpeakableSpecification for voice assistants |
|
|
90
|
+
|
|
91
|
+
> **Coherence Gate:** Sites with topic coherence below 6/10 are score-capped regardless of technical perfection. A scattered site with perfect robots.txt, llms.txt, and schema will score lower than a focused site with mediocre technical implementation.
|
|
92
|
+
|
|
93
|
+
<details>
|
|
94
|
+
<summary>All 28 criteria (numbered list)</summary>
|
|
95
|
+
|
|
96
|
+
| # | Criterion | Weight | Tier |
|
|
97
|
+
|---|-----------|--------|------|
|
|
98
|
+
| 1 | llms.txt File | 2% | Plumbing |
|
|
99
|
+
| 2 | Schema.org Structured Data | 3% | Organization |
|
|
100
|
+
| 3 | Q&A Content Format | 5% | Substance |
|
|
101
|
+
| 4 | Clean, Crawlable HTML | 2% | Organization |
|
|
102
|
+
| 5 | Entity Authority & NAP Consistency | 5% | Organization |
|
|
103
|
+
| 6 | robots.txt for AI Crawlers | 2% | Plumbing |
|
|
104
|
+
| 7 | Comprehensive FAQ Section | 4% | Substance |
|
|
105
|
+
| 8 | Original Data & Expert Analysis | 10% | Substance |
|
|
106
|
+
| 9 | Internal Linking Structure | 4% | Organization |
|
|
107
|
+
| 10 | Semantic HTML5 & Accessibility | 2% | Organization |
|
|
108
|
+
| 11 | Content Freshness Signals | 4% | Organization |
|
|
109
|
+
| 12 | Sitemap Completeness | 1% | Plumbing |
|
|
110
|
+
| 13 | RSS/Atom Feed | 1% | Plumbing |
|
|
111
|
+
| 14 | Table & List Extractability | 3% | Organization |
|
|
112
|
+
| 15 | Definition Patterns | 2% | Organization |
|
|
113
|
+
| 16 | Direct Answer Paragraphs | 5% | Substance |
|
|
114
|
+
| 17 | Content Licensing & AI Permissions | 2% | Plumbing |
|
|
115
|
+
| 18 | Author & Expert Schema | 3% | Organization |
|
|
116
|
+
| 19 | Fact & Data Density | 6% | Substance |
|
|
117
|
+
| 20 | Canonical URL Strategy | 1% | Plumbing |
|
|
118
|
+
| 21 | Content Publishing Velocity | 2% | Plumbing |
|
|
119
|
+
| 22 | Schema Coverage & Depth | 1% | Plumbing |
|
|
120
|
+
| 23 | Speakable Schema | 1% | Plumbing |
|
|
121
|
+
| 24 | Query-Answer Alignment | 5% | Substance |
|
|
122
|
+
| 25 | Content Cannibalization | 2% | Plumbing |
|
|
123
|
+
| 26 | Visible Date Signal | 2% | Organization |
|
|
124
|
+
| 27 | Topic Coherence | 14% | Substance |
|
|
125
|
+
| 28 | Content Depth | 7% | Substance |
|
|
126
|
+
|
|
127
|
+
</details>
|
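The tier percentages in the tables above can be tallied directly. The sketch below copies the per-criterion weights as whole percentages (to avoid floating-point noise); the names `TIER_WEIGHTS`, `sum`, and `tierTotals` are illustrative and not part of the package.

```javascript
// Per-criterion weights (in percent), copied from the three tier tables.
const TIER_WEIGHTS = {
  substance: [14, 10, 7, 6, 5, 5, 5, 4],
  organization: [5, 4, 4, 3, 3, 3, 2, 2, 2, 2],
  plumbing: [2, 2, 2, 2, 2, 1, 1, 1, 1, 1],
};

const sum = (xs) => xs.reduce((a, b) => a + b, 0);

// Total weight per tier.
const tierTotals = Object.fromEntries(
  Object.entries(TIER_WEIGHTS).map(([tier, ws]) => [tier, sum(ws)])
);
// Number of criteria across all tiers.
const criteriaCount = sum(Object.values(TIER_WEIGHTS).map((ws) => ws.length));
// Sum of all weights.
const grandTotal = sum(Object.values(tierTotals));

console.log(tierTotals);    // { substance: 56, organization: 30, plumbing: 15 }
console.log(criteriaCount); // 28
console.log(grandTotal);    // 101
```

The listed weights add up to 101 rather than 100, which matches the "~" in the tier headings. The scoring function shown later in this diff divides the weighted sum by the accumulated total weight, so the extra point should not by itself push a score past 100.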
|
76
128
|
|
|
77
129
|
## CLI Options
|
|
78
130
|
|
|
@@ -296,9 +348,26 @@ console.log(crawlResult.discoveredUrls.length); // Total URLs found
|
|
|
296
348
|
|
|
297
349
|
AEORank scores each individual page (0-100) against the 14 criteria that apply at page level. Instead of only seeing "your site scores 62," you get "your /about page scores 45, your /blog/guide scores 78."
|
|
298
350
|
|
|
299
|
-
The 14 per-page criteria
|
|
300
|
-
|
|
301
|
-
|
|
351
|
+
The 14 per-page criteria follow the same substance-first weighting as the site-level score:
|
|
352
|
+
|
|
353
|
+
| Tier | Per-Page Criteria | Weight |
|
|
354
|
+
|------|-------------------|--------|
|
|
355
|
+
| **Substance** | Original Data & Expert Content | 10% |
|
|
356
|
+
| | Fact & Data Density | 6% |
|
|
357
|
+
| | Direct Answer Paragraphs | 5% |
|
|
358
|
+
| | Q&A Content Format | 5% |
|
|
359
|
+
| | Query-Answer Alignment | 5% |
|
|
360
|
+
| | FAQ Section Content | 4% |
|
|
361
|
+
| **Organization** | Content Freshness Signals | 4% |
|
|
362
|
+
| | Schema.org Structured Data | 3% |
|
|
363
|
+
| | Table & List Extractability | 3% |
|
|
364
|
+
| | Definition Patterns | 2% |
|
|
365
|
+
| | Visible Date Signal | 2% |
|
|
366
|
+
| | Semantic HTML5 & Accessibility | 2% |
|
|
367
|
+
| | Clean, Crawlable HTML | 2% |
|
|
368
|
+
| **Plumbing** | Canonical URL Strategy | 1% |
|
|
369
|
+
|
|
370
|
+
The remaining 14 criteria are site-level only: llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization, topic coherence, and content depth.
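The split between per-page and site-level criteria can be reproduced from the criterion IDs that appear in the `WEIGHTS` and `PAGE_CRITERIA` objects elsewhere in this diff. This is a minimal sketch; the constant names `ALL_CRITERIA`, `PAGE_LEVEL`, and `siteLevelOnly` are made up for illustration.

```javascript
// All 28 criterion IDs, in the order they appear in WEIGHTS.
const ALL_CRITERIA = [
  "topic_coherence", "original_data", "content_depth", "fact_density",
  "direct_answer_density", "qa_content_format", "query_answer_alignment",
  "faq_section", "entity_consistency", "internal_linking", "content_freshness",
  "schema_markup", "author_schema_depth", "table_list_extractability",
  "definition_patterns", "visible_date_signal", "semantic_html", "clean_html",
  "content_cannibalization", "llms_txt", "robots_txt", "content_velocity",
  "content_licensing", "sitemap_completeness", "canonical_url", "rss_feed",
  "schema_coverage", "speakable_schema",
];

// The 14 IDs that PAGE_CRITERIA scores on individual pages.
const PAGE_LEVEL = new Set([
  "original_data", "fact_density", "direct_answer_density", "qa_content_format",
  "query_answer_alignment", "faq_section", "content_freshness", "schema_markup",
  "table_list_extractability", "definition_patterns", "visible_date_signal",
  "semantic_html", "clean_html", "canonical_url",
]);

// Everything else is evaluated only at site level.
const siteLevelOnly = ALL_CRITERIA.filter((id) => !PAGE_LEVEL.has(id));

console.log(siteLevelOnly.length); // 14
console.log(siteLevelOnly.includes("topic_coherence")); // true
```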
|
|
302
371
|
|
|
303
372
|
### CLI Output
|
|
304
373
|
|
package/dist/browser.d.ts
CHANGED
|
@@ -376,7 +376,7 @@ declare function analyzeAllPages(siteData: SiteData): PageReview[];
|
|
|
376
376
|
|
|
377
377
|
/**
|
|
378
378
|
* Per-page AEO scoring.
|
|
379
|
-
* Evaluates 14 of
|
|
379
|
+
* Evaluates 14 of 28 criteria that apply at individual page level.
|
|
380
380
|
* Produces a 0-100 AEO score per page.
|
|
381
381
|
*/
|
|
382
382
|
|
package/dist/browser.js
CHANGED
|
@@ -2069,40 +2069,58 @@ function auditSiteFromData(data) {
|
|
|
2069
2069
|
|
|
2070
2070
|
// src/scoring.ts
|
|
2071
2071
|
var WEIGHTS = {
|
|
2072
|
-
// ───
|
|
2073
|
-
|
|
2074
|
-
original_data: 0.12,
|
|
2072
|
+
// ─── Content Substance (~55%) ─────────────────────────────────────────────
|
|
2073
|
+
// WHY an AI engine would cite you. These drive citation quality directly.
|
|
2075
2074
|
topic_coherence: 0.14,
|
|
2076
|
-
//
|
|
2077
|
-
|
|
2078
|
-
|
|
2079
|
-
content_depth: 0.
|
|
2080
|
-
//
|
|
2081-2087
|
-
(old lines 2081-2087 removed; their text is not preserved in this rendering)
|
|
2088
|
-
//
|
|
2089-2105
|
-
(old lines 2089-2105 removed; their text is not preserved in this rendering)
|
|
2075
|
+
// Topical authority - THE gating signal
|
|
2076
|
+
original_data: 0.1,
|
|
2077
|
+
// Unique value AI can't find elsewhere
|
|
2078
|
+
content_depth: 0.07,
|
|
2079
|
+
// Comprehensive vs thin coverage
|
|
2080
|
+
fact_density: 0.06,
|
|
2081
|
+
// Information density per page
|
|
2082
|
+
direct_answer_density: 0.05,
|
|
2083
|
+
// Direct answers to queries
|
|
2084
|
+
qa_content_format: 0.05,
|
|
2085
|
+
// Answer-shaped content structure
|
|
2086
|
+
query_answer_alignment: 0.05,
|
|
2087
|
+
// Relevance to actual AI queries
|
|
2088
|
+
faq_section: 0.04,
|
|
2089
|
+
// Structured Q&A pairs
|
|
2090
|
+
// ─── Content Organization (~30%) ──────────────────────────────────────────
|
|
2091
|
+
// HOW easily AI engines can extract and trust your content.
|
|
2092
|
+
entity_consistency: 0.05,
|
|
2093
|
+
// Brand authority and E-E-A-T
|
|
2094
|
+
internal_linking: 0.04,
|
|
2095
|
+
// Site structure and topic clusters
|
|
2096
|
+
content_freshness: 0.04,
|
|
2097
|
+
// Recency signals
|
|
2098
|
+
schema_markup: 0.03,
|
|
2099
|
+
// Structured data for discovery
|
|
2100
|
+
author_schema_depth: 0.03,
|
|
2101
|
+
// Expert attribution
|
|
2102
|
+
table_list_extractability: 0.03,
|
|
2103
|
+
// Extractable structured data
|
|
2104
|
+
definition_patterns: 0.02,
|
|
2105
|
+
// Clear definitions
|
|
2106
|
+
visible_date_signal: 0.02,
|
|
2107
|
+
// Publication date trust
|
|
2108
|
+
semantic_html: 0.02,
|
|
2109
|
+
// Clean semantic structure
|
|
2110
|
+
clean_html: 0.02,
|
|
2111
|
+
// Parseable markup
|
|
2112
|
+
// ─── Technical Plumbing (~15%) ────────────────────────────────────────────
|
|
2113
|
+
// WHETHER AI crawlers can find you. Table stakes with diminishing returns.
|
|
2114
|
+
content_cannibalization: 0.02,
|
|
2115
|
+
llms_txt: 0.02,
|
|
2116
|
+
robots_txt: 0.02,
|
|
2117
|
+
content_velocity: 0.02,
|
|
2118
|
+
content_licensing: 0.02,
|
|
2119
|
+
sitemap_completeness: 0.01,
|
|
2120
|
+
canonical_url: 0.01,
|
|
2121
|
+
rss_feed: 0.01,
|
|
2122
|
+
schema_coverage: 0.01,
|
|
2123
|
+
speakable_schema: 0.01
|
|
2106
2124
|
};
|
|
2107
2125
|
function calculateOverallScore(criteria) {
|
|
2108
2126
|
let totalWeight = 0;
|
|
@@ -2113,7 +2131,13 @@ function calculateOverallScore(criteria) {
|
|
|
2113
2131
|
totalWeight += weight;
|
|
2114
2132
|
}
|
|
2115
2133
|
if (totalWeight === 0) return 0;
|
|
2116
|
-
|
|
2134
|
+
let score = Math.round(weightedSum / totalWeight);
|
|
2135
|
+
const coherence = criteria.find((c) => c.criterion === "topic_coherence");
|
|
2136
|
+
if (coherence && coherence.score < 6) {
|
|
2137
|
+
const cap2 = 35 + coherence.score * 5;
|
|
2138
|
+
score = Math.min(score, cap2);
|
|
2139
|
+
}
|
|
2140
|
+
return score;
|
|
2117
2141
|
}
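The coherence cap added to `calculateOverallScore` can be checked with a small worked example. The threshold (6) and the cap formula (`35 + score * 5`) are copied from the added lines; the helper name `applyCoherenceGate` and the sample scores are illustrative only.

```javascript
// Hypothetical stand-alone helper that isolates the cap logic from the diff.
function applyCoherenceGate(overallScore, coherenceScore) {
  if (coherenceScore < 6) {
    // Cap rises by 5 points per coherence point: 35 at 0, up to 60 at 5.
    const cap = 35 + coherenceScore * 5;
    return Math.min(overallScore, cap);
  }
  // At 6 or above the overall score passes through unchanged.
  return overallScore;
}

console.log(applyCoherenceGate(82, 3)); // 50 (capped)
console.log(applyCoherenceGate(82, 6)); // 82 (gate not triggered)
console.log(applyCoherenceGate(40, 3)); // 40 (already below the cap)
```

Under these numbers, a site with a topic coherence score of 3 cannot score above 50 overall, however well it does on the other criteria.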
|
|
2118
2142
|
|
|
2119
2143
|
// src/scorecard-builder.ts
|
|
@@ -2231,32 +2255,37 @@ function buildDetailedFindings(results) {
|
|
|
2231
2255
|
|
|
2232
2256
|
// src/narrative-generator.ts
|
|
2233
2257
|
var CRITERION_WEIGHTS = {
|
|
2234
|
-
|
|
2235
|
-
|
|
2236
|
-
qa_content_format: 0.15,
|
|
2237
|
-
clean_html: 0.1,
|
|
2238
|
-
entity_consistency: 0.1,
|
|
2239
|
-
robots_txt: 0.05,
|
|
2240
|
-
faq_section: 0.1,
|
|
2258
|
+
// Content Substance (~55%)
|
|
2259
|
+
topic_coherence: 0.14,
|
|
2241
2260
|
original_data: 0.1,
|
|
2242-2259
|
-
(old lines 2242-2259 removed; their text is not preserved in this rendering)
|
|
2261
|
+
content_depth: 0.07,
|
|
2262
|
+
fact_density: 0.06,
|
|
2263
|
+
direct_answer_density: 0.05,
|
|
2264
|
+
qa_content_format: 0.05,
|
|
2265
|
+
query_answer_alignment: 0.05,
|
|
2266
|
+
faq_section: 0.04,
|
|
2267
|
+
// Content Organization (~30%)
|
|
2268
|
+
entity_consistency: 0.05,
|
|
2269
|
+
internal_linking: 0.04,
|
|
2270
|
+
content_freshness: 0.04,
|
|
2271
|
+
schema_markup: 0.03,
|
|
2272
|
+
author_schema_depth: 0.03,
|
|
2273
|
+
table_list_extractability: 0.03,
|
|
2274
|
+
definition_patterns: 0.02,
|
|
2275
|
+
visible_date_signal: 0.02,
|
|
2276
|
+
semantic_html: 0.02,
|
|
2277
|
+
clean_html: 0.02,
|
|
2278
|
+
// Technical Plumbing (~15%)
|
|
2279
|
+
content_cannibalization: 0.02,
|
|
2280
|
+
llms_txt: 0.02,
|
|
2281
|
+
robots_txt: 0.02,
|
|
2282
|
+
content_velocity: 0.02,
|
|
2283
|
+
content_licensing: 0.02,
|
|
2284
|
+
sitemap_completeness: 0.01,
|
|
2285
|
+
canonical_url: 0.01,
|
|
2286
|
+
rss_feed: 0.01,
|
|
2287
|
+
schema_coverage: 0.01,
|
|
2288
|
+
speakable_schema: 0.01
|
|
2260
2289
|
};
|
|
2261
2290
|
var OPPORTUNITY_TEMPLATES = {
|
|
2262
2291
|
llms_txt: {
|
|
@@ -2388,6 +2417,16 @@ var OPPORTUNITY_TEMPLATES = {
|
|
|
2388
2417
|
name: "Add Visible Date Signals",
|
|
2389
2418
|
effort: "Low",
|
|
2390
2419
|
description: "Display publication/modification dates visibly using <time> elements and add datePublished/dateModified to JSON-LD schema."
|
|
2420
|
+
},
|
|
2421
|
+
topic_coherence: {
|
|
2422
|
+
name: "Focus Content on Core Topics",
|
|
2423
|
+
effort: "High",
|
|
2424
|
+
description: 'Ensure blog content consistently covers your core expertise areas rather than scattering across unrelated topics. AI engines build authority models - a site about "Medicare coverage" that also publishes about humidifiers and groceries dilutes its topical authority.'
|
|
2425
|
+
},
|
|
2426
|
+
content_depth: {
|
|
2427
|
+
name: "Increase Content Depth",
|
|
2428
|
+
effort: "Medium",
|
|
2429
|
+
description: "Expand articles to 1000+ words with structured H2/H3 sections, comparison tables, and expert analysis. Thin content (under 300 words) is rarely cited by AI engines. Deep, well-structured articles demonstrate expertise."
|
|
2391
2430
|
}
|
|
2392
2431
|
};
|
|
2393
2432
|
function calculateImpact(score, weight, effort) {
|
|
@@ -2509,7 +2548,7 @@ function generatePitchNumbers(score, rawData, scorecard) {
|
|
|
2509
2548
|
const passing = scorecard.filter((s) => s.score >= 7).length;
|
|
2510
2549
|
metrics.push({
|
|
2511
2550
|
metric: "Criteria Passing",
|
|
2512
|
-
value: `${passing}/
|
|
2551
|
+
value: `${passing}/28`,
|
|
2513
2552
|
significance: passing >= 18 ? "Excellent coverage across AEO dimensions" : passing >= 12 ? "Good foundation with room to improve remaining criteria" : `${26 - passing} criteria need attention for full AI visibility`
|
|
2514
2553
|
});
|
|
2515
2554
|
return metrics;
|
|
@@ -2616,6 +2655,38 @@ function extractNavLinks(html, domain) {
|
|
|
2616
2655
|
}
|
|
2617
2656
|
return Array.from(paths);
|
|
2618
2657
|
}
|
|
2658
|
+
function extractAllInternalLinks(html, domain, limit = 30) {
|
|
2659
|
+
const cleanDomain = domain.replace(/^www\./, "").toLowerCase();
|
|
2660
|
+
const hrefMatches = html.match(/href="([^"#]*)"/gi) || [];
|
|
2661
|
+
const paths = /* @__PURE__ */ new Set();
|
|
2662
|
+
for (const match of hrefMatches) {
|
|
2663
|
+
const href = match.match(/href="([^"#]*)"/i)?.[1];
|
|
2664
|
+
if (!href) continue;
|
|
2665
|
+
let path;
|
|
2666
|
+
if (href.startsWith("/")) {
|
|
2667
|
+
path = href;
|
|
2668
|
+
} else if (href.startsWith("http")) {
|
|
2669
|
+
try {
|
|
2670
|
+
const url = new URL(href);
|
|
2671
|
+
const linkDomain = url.hostname.replace(/^www\./, "").toLowerCase();
|
|
2672
|
+
if (linkDomain !== cleanDomain) continue;
|
|
2673
|
+
path = url.pathname;
|
|
2674
|
+
} catch {
|
|
2675
|
+
continue;
|
|
2676
|
+
}
|
|
2677
|
+
} else {
|
|
2678
|
+
continue;
|
|
2679
|
+
}
|
|
2680
|
+
path = path.replace(/\/+$/, "") || "/";
|
|
2681
|
+
if (path === "/") continue;
|
|
2682
|
+
if (path.includes("#") || path.includes("?")) continue;
|
|
2683
|
+
if (/\.(js|css|png|jpg|jpeg|gif|svg|ico|pdf|xml|txt|zip|woff|woff2|ttf|eot|mp4|webm|mp3)$/i.test(path)) continue;
|
|
2684
|
+
if (/^\/(api|wp-admin|wp-includes|wp-json|static|assets|_next|auth|login|signup|sign-up|register|cart|checkout|account|admin|cdn-cgi|feed|rss)\b/i.test(path)) continue;
|
|
2685
|
+
if (path.startsWith("mailto:") || path.startsWith("tel:")) continue;
|
|
2686
|
+
paths.add(path);
|
|
2687
|
+
}
|
|
2688
|
+
return Array.from(paths).sort((a, b) => a.split("/").length - b.split("/").length || a.localeCompare(b)).slice(0, limit);
|
|
2689
|
+
}
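The final `sort(...).slice(0, limit)` step in `extractAllInternalLinks` orders paths by depth first and alphabetically second, so shallow pages are kept when the limit cuts the list. A small sketch of that ordering, using the same comparator and made-up sample paths:

```javascript
// Same comparator as in the diffed function: fewer path segments first,
// then alphabetical order as a tie-breaker.
const byDepthThenAlpha = (a, b) =>
  a.split("/").length - b.split("/").length || a.localeCompare(b);

// Illustrative input; these paths are not from the package.
const paths = ["/blog/post-b", "/pricing", "/docs/api/auth", "/about", "/blog/post-a"];

// Copy before sorting so the input array is left untouched, then apply a limit of 3.
const ordered = [...paths].sort(byDepthThenAlpha).slice(0, 3);

console.log(ordered); // [ '/about', '/pricing', '/blog/post-a' ]
```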
|
|
2619
2690
|
function extractContentPagesFromSitemap(sitemapText, domain, limit = 6) {
|
|
2620
2691
|
const urlBlocks = sitemapText.match(/<url>([\s\S]*?)<\/url>/gi) || [];
|
|
2621
2692
|
const cleanDomain = domain.replace(/^www\./, "").toLowerCase();
|
|
@@ -2683,6 +2754,16 @@ async function fetchMultiPageData(siteData, options) {
|
|
|
2683
2754
|
if (!existingUrls.has(url)) urlsToFetch.set(url, "content");
|
|
2684
2755
|
}
|
|
2685
2756
|
}
|
|
2757
|
+
const hasBlogSample = (siteData.blogSample?.length ?? 0) > 3;
|
|
2758
|
+
if (!hasBlogSample) {
|
|
2759
|
+
const allPaths = extractAllInternalLinks(siteData.homepage.text, siteData.domain, 30);
|
|
2760
|
+
for (const path of allPaths) {
|
|
2761
|
+
const url = `${baseUrl}${path}`;
|
|
2762
|
+
if (!existingUrls.has(url) && !urlsToFetch.has(url)) {
|
|
2763
|
+
urlsToFetch.set(url, "content");
|
|
2764
|
+
}
|
|
2765
|
+
}
|
|
2766
|
+
}
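The fallback above only runs when the blog sample has three or fewer entries, and it skips URLs that are already known or already queued. The sketch below restates that logic as a stand-alone function. `planFallbackFetches` and its parameters are hypothetical names, and `homepagePaths` stands in for the result of `extractAllInternalLinks`.

```javascript
// Hypothetical stand-alone version of the fallback discovery step.
function planFallbackFetches(blogSample, homepagePaths, baseUrl, existingUrls) {
  const urlsToFetch = new Map();
  // Mirrors the diff: a sample counts as present only with more than 3 entries.
  const hasBlogSample = (blogSample?.length ?? 0) > 3;
  if (hasBlogSample) return urlsToFetch;
  for (const path of homepagePaths) {
    const url = `${baseUrl}${path}`;
    // Skip pages that were already fetched or already queued.
    if (!existingUrls.has(url) && !urlsToFetch.has(url)) {
      urlsToFetch.set(url, "content");
    }
  }
  return urlsToFetch;
}

const plan = planFallbackFetches(
  [],                                // no blog sample available
  ["/about", "/pricing", "/about"],  // illustrative homepage links
  "https://example.com",
  new Set(["https://example.com/pricing"])
);

console.log([...plan.keys()]); // [ 'https://example.com/about' ]
```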
|
|
2686
2767
|
const entries = Array.from(urlsToFetch.entries());
|
|
2687
2768
|
if (entries.length === 0) return 0;
|
|
2688
2769
|
const results = await Promise.all(entries.map(([url]) => fetchPage(url, timeoutMs)));
|
|
@@ -2701,20 +2782,23 @@ async function fetchMultiPageData(siteData, options) {
|
|
|
2701
2782
|
|
|
2702
2783
|
// src/page-scorer.ts
|
|
2703
2784
|
var PAGE_CRITERIA = {
|
|
2704
|
-
|
|
2705
|
-
qa_content_format: { weight: 0.15, label: "Q&A Content Format" },
|
|
2706
|
-
clean_html: { weight: 0.1, label: "Clean, Crawlable HTML" },
|
|
2707
|
-
faq_section: { weight: 0.1, label: "FAQ Section Content" },
|
|
2785
|
+
// Content Substance
|
|
2708
2786
|
original_data: { weight: 0.1, label: "Original Data & Expert Content" },
|
|
2709-2717
|
-
(old lines 2709-2717 removed; their text is not preserved in this rendering)
|
|
2787
|
+
fact_density: { weight: 0.06, label: "Fact & Data Density" },
|
|
2788
|
+
direct_answer_density: { weight: 0.05, label: "Direct Answer Paragraphs" },
|
|
2789
|
+
qa_content_format: { weight: 0.05, label: "Q&A Content Format" },
|
|
2790
|
+
query_answer_alignment: { weight: 0.05, label: "Query-Answer Alignment" },
|
|
2791
|
+
faq_section: { weight: 0.04, label: "FAQ Section Content" },
|
|
2792
|
+
// Content Organization
|
|
2793
|
+
content_freshness: { weight: 0.04, label: "Content Freshness Signals" },
|
|
2794
|
+
schema_markup: { weight: 0.03, label: "Schema.org Structured Data" },
|
|
2795
|
+
table_list_extractability: { weight: 0.03, label: "Table & List Extractability" },
|
|
2796
|
+
definition_patterns: { weight: 0.02, label: "Definition Patterns" },
|
|
2797
|
+
visible_date_signal: { weight: 0.02, label: "Visible Date Signal" },
|
|
2798
|
+
semantic_html: { weight: 0.02, label: "Semantic HTML5 & Accessibility" },
|
|
2799
|
+
clean_html: { weight: 0.02, label: "Clean, Crawlable HTML" },
|
|
2800
|
+
// Technical Plumbing
|
|
2801
|
+
canonical_url: { weight: 0.01, label: "Canonical URL Strategy" }
|
|
2718
2802
|
};
|
|
2719
2803
|
function extractJsonLdBlocks(html) {
|
|
2720
2804
|
const blocks = [];
|
|
@@ -3484,32 +3568,37 @@ function buildLinkGraph(pages, domain, homepageUrl) {
|
|
|
3484
3568
|
|
|
3485
3569
|
// src/fix-engine.ts
|
|
3486
3570
|
var CRITERION_WEIGHTS2 = {
|
|
3487
|
-
|
|
3488
|
-
|
|
3489
|
-
qa_content_format: 0.15,
|
|
3490
|
-
clean_html: 0.1,
|
|
3491
|
-
entity_consistency: 0.1,
|
|
3492
|
-
robots_txt: 0.05,
|
|
3493
|
-
faq_section: 0.1,
|
|
3571
|
+
// Content Substance (~55%)
|
|
3572
|
+
topic_coherence: 0.14,
|
|
3494
3573
|
original_data: 0.1,
|
|
3495-3512
|
-
(old lines 3495-3512 removed; their text is not preserved in this rendering)
|
|
3574
|
+
content_depth: 0.07,
|
|
3575
|
+
fact_density: 0.06,
|
|
3576
|
+
direct_answer_density: 0.05,
|
|
3577
|
+
qa_content_format: 0.05,
|
|
3578
|
+
query_answer_alignment: 0.05,
|
|
3579
|
+
faq_section: 0.04,
|
|
3580
|
+
// Content Organization (~30%)
|
|
3581
|
+
entity_consistency: 0.05,
|
|
3582
|
+
internal_linking: 0.04,
|
|
3583
|
+
content_freshness: 0.04,
|
|
3584
|
+
schema_markup: 0.03,
|
|
3585
|
+
author_schema_depth: 0.03,
|
|
3586
|
+
table_list_extractability: 0.03,
|
|
3587
|
+
definition_patterns: 0.02,
|
|
3588
|
+
visible_date_signal: 0.02,
|
|
3589
|
+
semantic_html: 0.02,
|
|
3590
|
+
clean_html: 0.02,
|
|
3591
|
+
// Technical Plumbing (~15%)
|
|
3592
|
+
content_cannibalization: 0.02,
|
|
3593
|
+
llms_txt: 0.02,
|
|
3594
|
+
robots_txt: 0.02,
|
|
3595
|
+
content_velocity: 0.02,
|
|
3596
|
+
content_licensing: 0.02,
|
|
3597
|
+
sitemap_completeness: 0.01,
|
|
3598
|
+
canonical_url: 0.01,
|
|
3599
|
+
rss_feed: 0.01,
|
|
3600
|
+
schema_coverage: 0.01,
|
|
3601
|
+
speakable_schema: 0.01
|
|
3513
3602
|
};
|
|
3514
3603
|
var PHASE_CONFIG = [
|
|
3515
3604
|
{
|
|
@@ -3532,7 +3621,9 @@ var PHASE_CONFIG = [
|
|
|
3532
3621
|
"content_freshness",
|
|
3533
3622
|
"table_list_extractability",
|
|
3534
3623
|
"query_answer_alignment",
|
|
3535
|
-
"visible_date_signal"
|
|
3624
|
+
"visible_date_signal",
|
|
3625
|
+
"topic_coherence",
|
|
3626
|
+
"content_depth"
|
|
3536
3627
|
]
|
|
3537
3628
|
},
|
|
3538
3629
|
{
|
|
@@ -4436,6 +4527,55 @@ Summarization: yes`,
|
|
|
4436
4527
|
affectedPages: affected,
|
|
4437
4528
|
pageCount: affected?.length
|
|
4438
4529
|
}];
|
|
4530
|
+
},
|
|
4531
|
+
topic_coherence: (c) => {
|
|
4532
|
+
if (c.score >= 10) return [];
|
|
4533
|
+
const impact = impactFromScore(c.score);
|
|
4534
|
+
const effort = effortForCriterion("topic_coherence", c.score);
|
|
4535
|
+
return [{
|
|
4536
|
+
id: "fix-topic-coherence",
|
|
4537
|
+
criterion: c.criterion_label,
|
|
4538
|
+
criterionId: c.criterion,
|
|
4539
|
+
title: "Focus blog content on core expertise",
|
|
4540
|
+
description: "Ensure blog content consistently covers your core topic areas. Scattered content across unrelated topics weakens AI engine authority signals.",
|
|
4541
|
+
impact,
|
|
4542
|
+
effort: effort === "trivial" ? "low" : effort,
|
|
4543
|
+
impactScore: 0,
|
|
4544
|
+
category: "content",
|
|
4545
|
+
steps: [
|
|
4546
|
+
"Identify 2-3 core expertise areas your brand is known for",
|
|
4547
|
+
"Audit existing blog posts and remove or consolidate off-topic content",
|
|
4548
|
+
"Create a content calendar focused on core topics",
|
|
4549
|
+
"Use topic clusters: pillar pages linking to supporting articles within the same niche"
|
|
4550
|
+
],
|
|
4551
|
+
successCriteria: "80%+ of blog content covers core expertise areas with consistent topic focus"
|
|
4552
|
+
}];
|
|
4553
|
+
},
|
|
4554
|
+
content_depth: (c, pages) => {
|
|
4555
|
+
if (c.score >= 10) return [];
|
|
4556
|
+
const impact = impactFromScore(c.score);
|
|
4557
|
+
const effort = effortForCriterion("content_depth", c.score);
|
|
4558
|
+
const affected = getAffectedPages("content_depth", pages);
|
|
4559
|
+
return [{
|
|
4560
|
+
id: "fix-content-depth",
|
|
4561
|
+
criterion: c.criterion_label,
|
|
4562
|
+
criterionId: c.criterion,
|
|
4563
|
+
title: "Increase content depth and structure",
|
|
4564
|
+
description: "Expand thin content with more detail, examples, and structured sections. AI engines prefer comprehensive articles with clear heading hierarchies.",
|
|
4565
|
+
impact,
|
|
4566
|
+
effort: effort === "trivial" ? "low" : effort,
|
|
4567
|
+
impactScore: 0,
|
|
4568
|
+
category: "content",
|
|
4569
|
+
steps: [
|
|
4570
|
+
"Aim for 1000+ words per article with expert analysis and examples",
|
|
4571
|
+
"Use H2/H3 subheadings every 200-300 words for clear structure",
|
|
4572
|
+
"Add comparison tables, numbered steps, and data points",
|
|
4573
|
+
"Remove or expand thin pages (under 300 words) that dilute site quality"
|
|
4574
|
+
],
|
|
4575
|
+
successCriteria: "Average article length exceeds 1000 words with 5+ subheadings per page",
|
|
4576
|
+
affectedPages: affected,
|
|
4577
|
+
pageCount: affected?.length
|
|
4578
|
+
}];
|
|
4439
4579
|
}
|
|
4440
4580
|
};
|
|
4441
4581
|
function generateFixPlan(domain, overallScore, criteria, pagesReviewed, linkGraph) {
|