aeorank 1.6.0 → 2.1.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # AEORank
 
- Score any website for AI engine visibility across 26 criteria. Pure HTTP + regex - zero API keys required.
+ Score any website for AI engine visibility across 28 criteria. Pure HTTP + regex - zero API keys required.
 
  [![npm version](https://img.shields.io/npm/v/aeorank.svg)](https://www.npmjs.com/package/aeorank)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -35,42 +35,96 @@ import { audit } from 'aeorank';
 
  const result = await audit('example.com');
  console.log(result.overallScore); // 0-100
- console.log(result.scorecard); // 26 criteria with scores
+ console.log(result.scorecard); // 28 criteria with scores
  console.log(result.opportunities); // Prioritized improvements
  ```
 
  ## What It Checks
 
- AEORank evaluates 26 criteria across 4 categories that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content:
-
- | # | Criterion | Weight | Category |
- |---|-----------|--------|----------|
- | 1 | llms.txt File | 10% | Discovery |
- | 2 | Schema.org Structured Data | 15% | Structure |
- | 3 | Q&A Content Format | 15% | Content |
- | 4 | Clean, Crawlable HTML | 10% | Structure |
- | 5 | Entity Authority & NAP Consistency | 10% | Authority |
- | 6 | robots.txt for AI Crawlers | 5% | Discovery |
- | 7 | Comprehensive FAQ Section | 10% | Content |
- | 8 | Original Data & Expert Analysis | 10% | Content |
- | 9 | Internal Linking Structure | 10% | Structure |
- | 10 | Semantic HTML5 & Accessibility | 5% | Structure |
- | 11 | Content Freshness Signals | 7% | Content |
- | 12 | Sitemap Completeness | 5% | Discovery |
- | 13 | RSS/Atom Feed | 3% | Discovery |
- | 14 | Table & List Extractability | 7% | Structure |
- | 15 | Definition Patterns | 4% | Content |
- | 16 | Direct Answer Paragraphs | 7% | Content |
- | 17 | Content Licensing & AI Permissions | 4% | Discovery |
- | 18 | Author & Expert Schema | 4% | Authority |
- | 19 | Fact & Data Density | 5% | Content |
- | 20 | Canonical URL Strategy | 4% | Structure |
- | 21 | Content Publishing Velocity | 3% | Content |
- | 22 | Schema Coverage & Depth | 3% | Structure |
- | 23 | Speakable Schema | 3% | Structure |
- | 24 | Query-Answer Alignment | 8% | Content |
- | 25 | Content Cannibalization | 5% | Content |
- | 26 | Visible Date Signal | 4% | Content |
+ AEORank evaluates 28 criteria that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content. Criteria are organized into three tiers by impact on real-world AI citations:
+
+ ### Scoring Tiers (by importance)
+
+ **Content Substance (~55%)** - *Why* an AI engine would cite you:
+
+ | Criterion | Weight | What it measures |
+ |-----------|--------|------------------|
+ | Topic Coherence | 14% | Blog content focus on core expertise vs scattered topics |
+ | Original Data & Expert Analysis | 10% | Proprietary research, case studies, unique data points |
+ | Content Depth | 7% | Article length, heading structure, deep vs thin pages |
+ | Fact & Data Density | 6% | Specific numbers, statistics, data points per page |
+ | Direct Answer Paragraphs | 5% | Concise answer paragraphs after question headings |
+ | Q&A Content Format | 5% | Question-format headings (What, How, Why) with answers |
+ | Query-Answer Alignment | 5% | Every question heading followed by a direct answer |
+ | Comprehensive FAQ Section | 4% | Dedicated FAQ with FAQPage schema markup |
+
+ **Content Organization (~30%)** - *How* easily AI can extract and trust your content:
+
+ | Criterion | Weight | What it measures |
+ |-----------|--------|------------------|
+ | Entity Authority & NAP Consistency | 5% | Organization schema, consistent name/address/phone |
+ | Internal Linking Structure | 4% | Topic clusters, breadcrumbs, reachability from homepage |
+ | Content Freshness Signals | 4% | dateModified schema, visible dates, recent content |
+ | Schema.org Structured Data | 3% | JSON-LD blocks (Organization, Article, FAQPage, etc.) |
+ | Author & Expert Schema | 3% | Person schema with credentials and expertise |
+ | Table & List Extractability | 3% | HTML tables with headers, ordered/unordered lists |
+ | Definition Patterns | 2% | Clear "X is defined as..." patterns for key terms |
+ | Visible Date Signal | 2% | Visible publication dates with `<time>` elements |
+ | Semantic HTML5 & Accessibility | 2% | Semantic elements (main, article, nav), ARIA, lang |
+ | Clean, Crawlable HTML | 2% | HTTPS, meta tags, proper heading hierarchy |
+
+ **Technical Plumbing (~15%)** - *Whether* AI crawlers can find you (table stakes):
+
+ | Criterion | Weight | What it measures |
+ |-----------|--------|------------------|
+ | Content Cannibalization | 2% | Overlapping pages competing for the same topic |
+ | llms.txt File | 2% | /llms.txt with site description and key page URLs |
+ | robots.txt for AI Crawlers | 2% | GPTBot, ClaudeBot, PerplexityBot access |
+ | Content Publishing Velocity | 2% | Regular publishing cadence in sitemap |
+ | Content Licensing & AI Permissions | 2% | /ai.txt file, license schema for AI usage |
+ | Sitemap Completeness | 1% | sitemap.xml with lastmod dates |
+ | Canonical URL Strategy | 1% | Self-referencing canonical tags |
+ | RSS/Atom Feed | 1% | RSS feed linked from homepage |
+ | Schema Coverage & Depth | 1% | Schema markup on inner pages, not just homepage |
+ | Speakable Schema | 1% | SpeakableSpecification for voice assistants |
+
+ > **Coherence Gate:** Sites with topic coherence below 6/10 are score-capped regardless of technical perfection. A scattered site with perfect robots.txt, llms.txt, and schema will score lower than a focused site with mediocre technical implementation.
+
+ <details>
+ <summary>All 28 criteria (numbered list)</summary>
+
+ | # | Criterion | Weight | Tier |
+ |---|-----------|--------|------|
+ | 1 | llms.txt File | 2% | Plumbing |
+ | 2 | Schema.org Structured Data | 3% | Organization |
+ | 3 | Q&A Content Format | 5% | Substance |
+ | 4 | Clean, Crawlable HTML | 2% | Organization |
+ | 5 | Entity Authority & NAP Consistency | 5% | Organization |
+ | 6 | robots.txt for AI Crawlers | 2% | Plumbing |
+ | 7 | Comprehensive FAQ Section | 4% | Substance |
+ | 8 | Original Data & Expert Analysis | 10% | Substance |
+ | 9 | Internal Linking Structure | 4% | Organization |
+ | 10 | Semantic HTML5 & Accessibility | 2% | Organization |
+ | 11 | Content Freshness Signals | 4% | Organization |
+ | 12 | Sitemap Completeness | 1% | Plumbing |
+ | 13 | RSS/Atom Feed | 1% | Plumbing |
+ | 14 | Table & List Extractability | 3% | Organization |
+ | 15 | Definition Patterns | 2% | Organization |
+ | 16 | Direct Answer Paragraphs | 5% | Substance |
+ | 17 | Content Licensing & AI Permissions | 2% | Plumbing |
+ | 18 | Author & Expert Schema | 3% | Organization |
+ | 19 | Fact & Data Density | 6% | Substance |
+ | 20 | Canonical URL Strategy | 1% | Plumbing |
+ | 21 | Content Publishing Velocity | 2% | Plumbing |
+ | 22 | Schema Coverage & Depth | 1% | Plumbing |
+ | 23 | Speakable Schema | 1% | Plumbing |
+ | 24 | Query-Answer Alignment | 5% | Substance |
+ | 25 | Content Cannibalization | 2% | Plumbing |
+ | 26 | Visible Date Signal | 2% | Organization |
+ | 27 | Topic Coherence | 14% | Substance |
+ | 28 | Content Depth | 7% | Substance |
+
+ </details>
 
  ## CLI Options
 
@@ -99,7 +153,7 @@ Use the built-in action to gate deployments on AEO score:
 
  ```yaml
  - name: AEO Audit
-   uses: AEO-Content-Inc/aeorank@v1
+   uses: AEO-Content-Inc/aeorank@v2
    with:
      domain: example.com
      threshold: 70
@@ -119,7 +173,7 @@ Or use `npx` directly:
  Run a complete audit. Returns `AuditResult` with:
 
  - `overallScore` - 0-100 weighted score
- - `scorecard` - 26 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
+ - `scorecard` - 28 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
  - `detailedFindings` - Per-criterion findings with severity
  - `opportunities` - Prioritized improvements with effort/impact
  - `pitchNumbers` - Key metrics (schema types, AI crawler access, etc.)
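
The hunk above describes `overallScore` as a 0-100 weighted score built from scorecard entries scored 0-10. The sketch below illustrates one way such a weighted score, together with the README's Coherence Gate, could be computed. It is not aeorank's actual implementation: the `ScoreItem` shape only mirrors the `criterion` and `score` fields named in the README, the weights are a subset copied from the 2.1.0 tier tables, and both the normalization by total weight and the cap value of 60 are assumptions (the README does not state the cap).

```typescript
// Hypothetical item shape: only `criterion` and `score` (0-10) come from the
// README's description of ScoreCardItem; the real type carries more fields.
interface ScoreItem {
  criterion: string;
  score: number; // 0-10
}

// Subset of the weights listed in the 2.1.0 README tier tables (percent values).
const WEIGHTS: Record<string, number> = {
  'Topic Coherence': 14,
  'Original Data & Expert Analysis': 10,
  'Content Depth': 7,
  'llms.txt File': 2,
};

// Assumption: the README says low-coherence sites are "score-capped" but gives
// no number; 60 is a placeholder chosen for illustration only.
const COHERENCE_CAP = 60;

function weightedScore(
  items: ScoreItem[],
  weights: Record<string, number> = WEIGHTS,
): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const { criterion, score } of items) {
    const w = weights[criterion] ?? 0; // criteria without a weight are ignored
    weighted += (score / 10) * w;
    totalWeight += w;
  }
  if (totalWeight === 0) return 0;

  // Normalize by the weights actually present so the result stays within 0-100.
  let result = Math.round((weighted / totalWeight) * 100);

  // Coherence Gate: topic coherence below 6/10 caps the overall score.
  const coherence = items.find((i) => i.criterion === 'Topic Coherence');
  if (coherence && coherence.score < 6) {
    result = Math.min(result, COHERENCE_CAP);
  }
  return result;
}

const demo: ScoreItem[] = [
  { criterion: 'Topic Coherence', score: 10 },
  { criterion: 'Original Data & Expert Analysis', score: 5 },
  { criterion: 'Content Depth', score: 0 },
  { criterion: 'llms.txt File', score: 10 },
];
console.log(weightedScore(demo)); // 64
```

Normalizing by the sum of the weights present keeps the sketch usable with a partial weight table, and it also sidesteps the fact that the published weights are approximate per-tier shares rather than an exact 100% split.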
@@ -257,7 +311,7 @@ Use `--no-headless` to skip SPA rendering (faster but may produce lower scores f
 
  ## Full-Site Crawl
 
- By default, AEORank audits the homepage plus ~20 discovered pages. For deeper analysis, enable `--full-crawl` to BFS-crawl every discoverable page:
+ By default, AEORank audits the homepage plus up to 50 blog pages from the sitemap. For deeper analysis, enable `--full-crawl` to BFS-crawl every discoverable page:
 
  ```bash
  npx aeorank example.com --full-crawl # Up to 200 pages
@@ -294,9 +348,26 @@ console.log(crawlResult.discoveredUrls.length); // Total URLs found
 
  AEORank scores each individual page (0-100) against the 14 criteria that apply at page level. Instead of only seeing "your site scores 62," you get "your /about page scores 45, your /blog/guide scores 78."
 
- The 14 per-page criteria: Schema.org Structured Data, Q&A Content Format, Clean Crawlable HTML, FAQ Section Content, Original Data & Expert Content, Query-Answer Alignment, Content Freshness Signals, Table & List Extractability, Direct Answer Paragraphs, Semantic HTML5 & Accessibility, Fact & Data Density, Definition Patterns, Canonical URL Strategy, Visible Date Signal.
-
- The remaining 12 criteria (llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization) are site-level only.
+ The 14 per-page criteria follow the same substance-first weighting as the site-level score:
+
+ | Tier | Per-Page Criteria | Weight |
+ |------|-------------------|--------|
+ | **Substance** | Original Data & Expert Content | 10% |
+ | | Fact & Data Density | 6% |
+ | | Direct Answer Paragraphs | 5% |
+ | | Q&A Content Format | 5% |
+ | | Query-Answer Alignment | 5% |
+ | | FAQ Section Content | 4% |
+ | **Organization** | Content Freshness Signals | 4% |
+ | | Schema.org Structured Data | 3% |
+ | | Table & List Extractability | 3% |
+ | | Definition Patterns | 2% |
+ | | Visible Date Signal | 2% |
+ | | Semantic HTML5 & Accessibility | 2% |
+ | | Clean, Crawlable HTML | 2% |
+ | **Plumbing** | Canonical URL Strategy | 1% |
+
+ The remaining 14 criteria are site-level only: llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization, topic coherence, and content depth.
 
  ### CLI Output
 
@@ -449,7 +520,7 @@ console.log(result.comparison.tied); // Criteria with equal scores
 
  ## Benchmark Dataset
 
- The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 26 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):
+ The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 28 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):
 
  | File | Contents |
  |------|----------|
package/dist/browser.d.ts CHANGED
@@ -173,7 +173,7 @@ interface SiteData {
  redirectedTo: string | null;
  /** Set when homepage is a parked/for-sale/lost domain */
  parkedReason: string | null;
- /** Sampled blog/content pages from sitemap (up to 5) */
+ /** Sampled blog/content pages from sitemap (up to 50) */
  blogSample?: FetchResult[];
  /** Full-crawl statistics (set when --full-crawl is used) */
  crawlStats?: {
@@ -376,7 +376,7 @@ declare function analyzeAllPages(siteData: SiteData): PageReview[];
 
  /**
  * Per-page AEO scoring.
- * Evaluates 14 of 26 criteria that apply at individual page level.
+ * Evaluates 14 of 28 criteria that apply at individual page level.
  * Produces a 0-100 AEO score per page.
  */