aeorank 1.5.0 → 2.0.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # AEORank
 
- Score any website for AI engine visibility across 26 criteria. Pure HTTP + regex - zero API keys required.
+ Score any website for AI engine visibility across 28 criteria. Pure HTTP + regex - zero API keys required.
 
  [![npm version](https://img.shields.io/npm/v/aeorank.svg)](https://www.npmjs.com/package/aeorank)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -35,42 +35,44 @@ import { audit } from 'aeorank';
 
  const result = await audit('example.com');
  console.log(result.overallScore); // 0-100
- console.log(result.scorecard); // 26 criteria with scores
+ console.log(result.scorecard); // 28 criteria with scores
  console.log(result.opportunities); // Prioritized improvements
  ```
 
  ## What It Checks
 
- AEORank evaluates 26 criteria across 4 categories that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content:
+ AEORank evaluates 28 criteria across 4 categories that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content:
 
  | # | Criterion | Weight | Category |
  |---|-----------|--------|----------|
- | 1 | llms.txt File | 10% | Discovery |
- | 2 | Schema.org Structured Data | 15% | Structure |
- | 3 | Q&A Content Format | 15% | Content |
- | 4 | Clean, Crawlable HTML | 10% | Structure |
- | 5 | Entity Authority & NAP Consistency | 10% | Authority |
- | 6 | robots.txt for AI Crawlers | 5% | Discovery |
- | 7 | Comprehensive FAQ Section | 10% | Content |
- | 8 | Original Data & Expert Analysis | 10% | Content |
- | 9 | Internal Linking Structure | 10% | Structure |
- | 10 | Semantic HTML5 & Accessibility | 5% | Structure |
- | 11 | Content Freshness Signals | 7% | Content |
- | 12 | Sitemap Completeness | 5% | Discovery |
- | 13 | RSS/Atom Feed | 3% | Discovery |
- | 14 | Table & List Extractability | 7% | Structure |
+ | 1 | llms.txt File | 8% | Discovery |
+ | 2 | Schema.org Structured Data | 8% | Structure |
+ | 3 | Q&A Content Format | 12% | Content |
+ | 4 | Clean, Crawlable HTML | 8% | Structure |
+ | 5 | Entity Authority & NAP Consistency | 8% | Authority |
+ | 6 | robots.txt for AI Crawlers | 3% | Discovery |
+ | 7 | Comprehensive FAQ Section | 8% | Content |
+ | 8 | Original Data & Expert Analysis | 12% | Content |
+ | 9 | Internal Linking Structure | 8% | Structure |
+ | 10 | Semantic HTML5 & Accessibility | 4% | Structure |
+ | 11 | Content Freshness Signals | 6% | Content |
+ | 12 | Sitemap Completeness | 3% | Discovery |
+ | 13 | RSS/Atom Feed | 2% | Discovery |
+ | 14 | Table & List Extractability | 5% | Structure |
  | 15 | Definition Patterns | 4% | Content |
  | 16 | Direct Answer Paragraphs | 7% | Content |
- | 17 | Content Licensing & AI Permissions | 4% | Discovery |
+ | 17 | Content Licensing & AI Permissions | 3% | Discovery |
  | 18 | Author & Expert Schema | 4% | Authority |
- | 19 | Fact & Data Density | 5% | Content |
- | 20 | Canonical URL Strategy | 4% | Structure |
+ | 19 | Fact & Data Density | 8% | Content |
+ | 20 | Canonical URL Strategy | 2% | Structure |
  | 21 | Content Publishing Velocity | 3% | Content |
- | 22 | Schema Coverage & Depth | 3% | Structure |
- | 23 | Speakable Schema | 3% | Structure |
- | 24 | Query-Answer Alignment | 8% | Content |
+ | 22 | Schema Coverage & Depth | 2% | Structure |
+ | 23 | Speakable Schema | 2% | Structure |
+ | 24 | Query-Answer Alignment | 6% | Content |
  | 25 | Content Cannibalization | 5% | Content |
  | 26 | Visible Date Signal | 4% | Content |
+ | 27 | **Topic Coherence** | **14%** | **Content** |
+ | 28 | **Content Depth** | **6%** | **Content** |
 
  ## CLI Options
 
@@ -99,7 +101,7 @@ Use the built-in action to gate deployments on AEO score:
 
  ```yaml
  - name: AEO Audit
-   uses: AEO-Content-Inc/aeorank@v1
+   uses: AEO-Content-Inc/aeorank@v2
    with:
      domain: example.com
      threshold: 70
@@ -119,7 +121,7 @@ Or use `npx` directly:
  Run a complete audit. Returns `AuditResult` with:
 
  - `overallScore` - 0-100 weighted score
- - `scorecard` - 26 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
+ - `scorecard` - 28 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
  - `detailedFindings` - Per-criterion findings with severity
  - `opportunities` - Prioritized improvements with effort/impact
  - `pitchNumbers` - Key metrics (schema types, AI crawler access, etc.)
@@ -150,6 +152,55 @@ Score a single HTML page against 14 per-page AEO criteria. Returns `PageScoreRes
 
  Batch-score all pages (homepage + blogSample) from a `SiteData` object. Returns `PageScoreResult[]`.
 
+ ### `buildLinkGraph(pages, domain, homepageUrl)`
+
+ Analyze internal linking structure from crawled pages. Returns `LinkGraph` with:
+
+ - `nodes` - Map of URL to `PageNode` (in/out degree, depth, pillar/hub/orphan flags)
+ - `edges` - Array of `LinkEdge` (from, to, anchor text)
+ - `stats` - `LinkGraphStats` (total pages, orphans, pillars, hubs, avg depth, clusters)
+ - `clusters` - `TopicCluster[]` (pillar URL, spoke URLs, cohesion score)
+
+ ```ts
+ import { crawlFullSite, prefetchSiteData, buildLinkGraph } from 'aeorank';
+
+ const siteData = await prefetchSiteData('example.com');
+ const crawl = await crawlFullSite(siteData, { maxPages: 200 });
+ const graph = buildLinkGraph(crawl.pages, 'example.com', 'https://example.com');
+
+ console.log(graph.stats.orphanPages); // Pages with no inbound links
+ console.log(graph.stats.pillarPages); // High-authority hub pages
+ console.log(graph.clusters); // Topic clusters detected
+ ```
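The orphan flag described in this hunk ("pages with no inbound links") can be illustrated with a small in-degree count. This is a minimal sketch, not aeorank's implementation: the `Edge` shape and the `inDegrees` and `findOrphans` helpers are hypothetical, and treating the homepage as never orphaned is an assumption.

```typescript
// Hypothetical edge shape for illustration; aeorank's real `LinkEdge` may differ.
interface Edge {
  from: string;
  to: string;
}

// Count inbound links per known URL, ignoring self-links.
function inDegrees(urls: string[], edges: Edge[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const url of urls) counts.set(url, 0);
  for (const { from, to } of edges) {
    if (from !== to && counts.has(to)) {
      counts.set(to, (counts.get(to) ?? 0) + 1);
    }
  }
  return counts;
}

// A page is treated as an orphan when no other page links to it.
// Assumption: the homepage is the crawl root and is never reported as an orphan.
function findOrphans(urls: string[], edges: Edge[], homepageUrl: string): string[] {
  const counts = inDegrees(urls, edges);
  return urls.filter((url) => url !== homepageUrl && counts.get(url) === 0);
}
```

Counting only links between crawled URLs keeps external links and self-links from masking a page that nothing else on the site points to.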
+
+ ### `generateFixPlan(domain, score, criteria, pages?, linkGraph?)`
+
+ Generate a phased fix plan from audit results. Returns `FixPlan` with:
+
+ - `phases` - 4 phases (Foundation, Content, Authority, Architecture) with prioritized `FixAction[]`
+ - `quickWins` - Low-effort, high-impact fixes
+ - `projectedScore` - Estimated score after applying all fixes
+ - `summary` - Counts by impact level, top opportunity, estimated effort
+
+ Each `FixAction` includes: title, description, impact/effort levels, step-by-step instructions, code examples, affected pages, and dependency ordering.
+
+ ```ts
+ import { audit, generateFixPlan } from 'aeorank';
+
+ const result = await audit('example.com');
+ const plan = generateFixPlan(
+   'example.com',
+   result.overallScore,
+   result.criterionResults,
+   result.pagesReviewed,
+ );
+
+ console.log(plan.projectedScore); // e.g. 82
+ console.log(plan.quickWins[0].title); // e.g. "Add llms.txt file"
+ console.log(plan.quickWins[0].impactScore); // e.g. 10
+ console.log(plan.phases[0].fixes.length); // Foundation phase fixes
+ ```
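One plausible way to derive a "low-effort, high-impact" list such as `quickWins` is a filter followed by a sort. The sketch below is illustrative only: the `Fix` type, its `Level` values, and `selectQuickWins` are hypothetical, and the package's actual selection rule is not shown in this diff.

```typescript
type Level = 'low' | 'medium' | 'high';

// Hypothetical fix shape; field names loosely mirror the README examples
// (`title`, `impactScore`, `effort`) but are not taken from aeorank's types.
interface Fix {
  title: string;
  impact: Level;
  effort: Level;
  impactScore: number;
}

// Keep fixes that are high impact and low effort, strongest first.
function selectQuickWins(fixes: Fix[]): Fix[] {
  return fixes
    .filter((fix) => fix.impact === 'high' && fix.effort === 'low')
    .sort((a, b) => b.impactScore - a.impactScore);
}
```

Because `filter` returns a new array, sorting the result leaves the caller's original list untouched.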
+
  ### Advanced API
 
  For custom pipelines, import individual stages:
@@ -165,6 +216,8 @@ import {
    generateOpportunities,
    scorePage,
    scoreAllPages,
+   buildLinkGraph,
+   generateFixPlan,
    isSpaShell,
    fetchWithHeadless,
  } from 'aeorank';
@@ -174,6 +227,24 @@ const results = auditSiteFromData(siteData);
  const score = calculateOverallScore(results);
  ```
 
+ ### Browser Entry Point
+
+ For browser environments (Chrome extensions, web apps), import from `aeorank/browser` to avoid Node.js dependencies (Puppeteer, fs):
+
+ ```ts
+ import {
+   prefetchSiteData,
+   auditSiteFromData,
+   calculateOverallScore,
+   buildLinkGraph,
+   generateFixPlan,
+   analyzeAllPages,
+   crawlFullSite,
+ } from 'aeorank/browser';
+ ```
+
+ The browser entry exports everything except `headless-fetch` (Puppeteer), `html-report` (Node fs), `audit` orchestrator, and CLI.
+
  ## SPA Support
 
  Sites that use client-side rendering (React, Vue, Angular) return empty HTML shells to regular HTTP requests. AEORank detects these automatically and re-renders them with Puppeteer if available.
@@ -188,7 +259,7 @@ Use `--no-headless` to skip SPA rendering (faster but may produce lower scores f
 
  ## Full-Site Crawl
 
- By default, AEORank audits the homepage plus ~20 discovered pages. For deeper analysis, enable `--full-crawl` to BFS-crawl every discoverable page:
+ By default, AEORank audits the homepage plus up to 50 blog pages from the sitemap. For deeper analysis, enable `--full-crawl` to BFS-crawl every discoverable page:
 
  ```bash
  npx aeorank example.com --full-crawl # Up to 200 pages
@@ -227,7 +298,7 @@ AEORank scores each individual page (0-100) against the 14 criteria that apply a
 
  The 14 per-page criteria: Schema.org Structured Data, Q&A Content Format, Clean Crawlable HTML, FAQ Section Content, Original Data & Expert Content, Query-Answer Alignment, Content Freshness Signals, Table & List Extractability, Direct Answer Paragraphs, Semantic HTML5 & Accessibility, Fact & Data Density, Definition Patterns, Canonical URL Strategy, Visible Date Signal.
 
- The remaining 12 criteria (llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization) are site-level only.
+ The remaining 14 criteria (llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization, topic coherence, content depth) are site-level only.
 
  ### CLI Output
 
@@ -258,6 +329,70 @@ console.log(result.criterionScores); // 14 per-criterion scores
  const allScores = scoreAllPages(siteData);
  ```
 
+ ## Link Graph Analysis
+
+ Analyze your site's internal linking structure to find orphan pages, identify pillar content, and detect topic clusters:
+
+ ```bash
+ npx aeorank example.com --full-crawl --json | jq '.linkGraph.stats'
+ ```
+
+ ```ts
+ import { crawlFullSite, prefetchSiteData, buildLinkGraph, serializeLinkGraph } from 'aeorank';
+
+ const siteData = await prefetchSiteData('example.com');
+ const crawl = await crawlFullSite(siteData, { maxPages: 200 });
+ const graph = buildLinkGraph(crawl.pages, 'example.com', 'https://example.com');
+
+ // Orphan pages (no inbound links - invisible to crawlers)
+ const orphans = [...graph.nodes.values()].filter(n => n.isOrphan);
+
+ // Pillar pages (high authority, many inbound links)
+ const pillars = [...graph.nodes.values()].filter(n => n.isPillar);
+
+ // Topic clusters (pillar + spoke pages with high cohesion)
+ graph.clusters.forEach(c => {
+   console.log(`${c.pillarTitle}: ${c.spokes.length} spokes, cohesion ${c.cohesion}`);
+ });
+
+ // Serialize for storage/transport (Map -> plain object)
+ const json = serializeLinkGraph(graph);
+ ```
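The "Map -> plain object" comment in this hunk reflects a general JavaScript constraint: `JSON.stringify` on a `Map` yields `{}`, so the entries are lost unless the map is converted first. A minimal sketch of that conversion follows; `mapToObject` is a hypothetical helper for illustration, not the body of `serializeLinkGraph`.

```typescript
// Convert a string-keyed Map into a plain object so JSON.stringify keeps its entries.
function mapToObject<V>(map: Map<string, V>): Record<string, V> {
  return Object.fromEntries(map);
}

// Example: in-degree counts keyed by URL path (sample data, not aeorank output).
const inbound = new Map<string, number>([
  ['/', 0],
  ['/blog', 3],
]);

const lossy = JSON.stringify(inbound); // the Map's entries are dropped
const kept = JSON.stringify(mapToObject(inbound)); // entries survive as object keys
```

`Object.fromEntries` needs an ES2019 or later runtime and library target.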
+
+ ## Fix Plan Engine
+
+ Generate actionable, phased fix plans from audit results. Each fix includes step-by-step instructions, code examples, effort/impact ratings, and dependency ordering:
+
+ ```bash
+ npx aeorank example.com --full-crawl --json | jq '.fixPlan'
+ ```
+
+ ```ts
+ import { audit, generateFixPlan } from 'aeorank';
+
+ const result = await audit('example.com', { fullCrawl: true });
+ const plan = generateFixPlan(
+   'example.com',
+   result.overallScore,
+   result.criterionResults,
+   result.pagesReviewed,
+   result.linkGraph, // optional - enables link-aware fixes
+ );
+
+ // 4 phases: Foundation -> Content -> Authority -> Architecture
+ plan.phases.forEach(phase => {
+   console.log(`${phase.title}: ${phase.fixes.length} fixes`);
+ });
+
+ // Quick wins: low effort + high impact
+ plan.quickWins.forEach(qw => {
+   console.log(`${qw.title} (+${qw.impactScore} pts) - ${qw.effort} effort`);
+   qw.steps.forEach(s => console.log(` - ${s}`));
+ });
+
+ console.log(`Current: ${plan.overallScore} -> Projected: ${plan.projectedScore}`);
+ ```
+
  ## Scoring
 
  Each criterion is scored 0-10 by deterministic checks (regex, HTML parsing, HTTP headers). The overall score is a weighted average normalized to 0-100.
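The "weighted average normalized to 0-100" can be sketched as below. This is one plausible reading, not the package's confirmed formula: the `ScoredCriterion` shape and `weightedOverall` are hypothetical, and dividing by the total weight is an assumption (a natural one, since the weights in the criteria table do not appear to sum to 100%).

```typescript
// Hypothetical input shape: a 0-10 criterion score plus its weight
// (any positive number, e.g. the percentage from the criteria table).
interface ScoredCriterion {
  score: number;
  weight: number;
}

// Weighted mean of 0-10 scores, scaled to 0-100 and rounded.
// Assumption: normalize by total weight so the weights need not sum to 100.
function weightedOverall(items: ScoredCriterion[]): number {
  const totalWeight = items.reduce((sum, item) => sum + item.weight, 0);
  if (totalWeight === 0) return 0;
  const weightedSum = items.reduce((sum, item) => sum + item.score * item.weight, 0);
  return Math.round((weightedSum / totalWeight) * 10);
}
```

For example, a 10/10 on a criterion weighted 14 and a 5/10 on a criterion weighted 6 give (140 + 30) / 20 = 8.5, which scales to 85.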
@@ -316,7 +451,7 @@ console.log(result.comparison.tied); // Criteria with equal scores
 
  ## Benchmark Dataset
 
- The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 26 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):
+ The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 28 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):
 
  | File | Contents |
  |------|----------|