aeorank 3.1.0 → 3.2.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # AEORank

- Score any website for AI engine visibility across 36 criteria in a 5-pillar framework. Pure HTTP + regex - zero API keys, under 10 seconds.
+ Score any website for AI engine visibility across 40 criteria in a 5-pillar framework. Pure HTTP + regex - zero API keys, under 10 seconds.

  [![npm version](https://img.shields.io/npm/v/aeorank.svg)](https://www.npmjs.com/package/aeorank)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -35,7 +35,7 @@ import { audit } from 'aeorank';

  const result = await audit('example.com');
  console.log(result.overallScore); // 0-100
- console.log(result.scorecard); // 36 criteria with scores, pillars, weights
+ console.log(result.scorecard); // 40 criteria with scores, pillars, weights
  console.log(result.pillarScores); // { answerReadiness, contentStructure, ... }
  console.log(result.topFixes); // Top 3 highest-impact fixes
  console.log(result.opportunities); // Prioritized improvements
@@ -43,7 +43,7 @@ console.log(result.opportunities); // Prioritized improvements

  ## What It Checks

- AEORank evaluates 36 criteria that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content. Criteria are organized into five pillars:
+ AEORank evaluates 40 criteria that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content. Criteria are organized into five pillars:

  ### 5-Pillar Framework

@@ -60,6 +60,8 @@ AEORank evaluates 36 criteria that determine how AI engines (ChatGPT, Claude, Pe
  | Cross-Page Duplicate Content | 3% | Same paragraphs copy-pasted across multiple pages |
  | Answer-First Placement | 3% | Answer block in first 300 words, no throat-clearing openers |
  | Evidence Packaging | 3% | Inline citations, attribution phrases, sources sections |
+ | Helpful Purpose Alignment | 3% | Whether pages solve the promised visitor task vs reading like search-first filler |
+ | First-Hand Experience Signals | 3% | Concrete evidence of direct use, testing, implementation, or lived experience |

  **Pillar 2: Content Structure (~25%)** - *How* AI extracts your answers:

@@ -82,6 +84,8 @@ AEORank evaluates 36 criteria that determine how AI engines (ChatGPT, Claude, Pe
  | Content Freshness Signals | 4% | dateModified schema, visible dates, recent content |
  | Author & Expert Schema | 3% | Person schema with credentials and expertise |
  | Schema.org Structured Data | 3% | JSON-LD blocks (Organization, Article, FAQPage, etc.) |
+ | Creator Transparency | 2% | Visible bylines, author pages, and reviewer attribution where expected |
+ | Methodology Transparency | 2% | Whether pages explain how they were tested, researched, reviewed, or updated |

  **Pillar 4: Technical Foundation (~10%)** - *How* easily AI parses your pages:

@@ -89,70 +93,74 @@ AEORank evaluates 36 criteria that determine how AI engines (ChatGPT, Claude, Pe
  |-----------|--------|------------------|
  | Semantic HTML5 & Accessibility | 2% | Semantic elements (main, article, nav), ARIA, lang |
  | Clean, Crawlable HTML | 2% | HTTPS, meta tags, proper heading hierarchy |
- | Visible Date Signal | 2% | Visible publication dates with `<time>` elements |
+ | Visible Date Signal | 1.5% | Visible publication dates with `<time>` elements |
  | Extraction Friction | 2% | Sentence length, voice-friendly leads, jargon density |
- | Image Context for AI | 1% | Figure/figcaption, descriptive alt text, contextual placement |
- | Schema Coverage & Depth | 1% | Schema markup on inner pages, not just homepage |
- | Speakable Schema | 1% | SpeakableSpecification for voice assistants |
+ | Image Context for AI | 0.5% | Figure/figcaption, descriptive alt text, contextual placement |
+ | Schema Coverage & Depth | 0% | Schema markup on inner pages, not just homepage |
+ | Speakable Schema | 0% | SpeakableSpecification for voice assistants |

  **Pillar 5: AI Discovery (~10%)** - *Whether* AI crawlers can find you:

  | Criterion | Weight | What it measures |
  |-----------|--------|------------------|
  | Content Cannibalization | 2% | Overlapping pages competing for the same topic |
- | llms.txt File | 2% | /llms.txt with site description and key page URLs |
- | robots.txt for AI Crawlers | 2% | GPTBot, ClaudeBot, PerplexityBot access |
+ | llms.txt File | 1% | /llms.txt with site description and key page URLs |
+ | robots.txt for AI Crawlers | 1% | GPTBot, ClaudeBot, PerplexityBot access |
  | Content Publishing Velocity | 2% | Regular publishing cadence in sitemap |
- | Content Licensing & AI Permissions | 2% | /ai.txt file, license schema for AI usage |
+ | Content Licensing & AI Permissions | 1% | /ai.txt file, license schema for AI usage |
  | Sitemap Completeness | 1% | sitemap.xml with lastmod dates |
- | Canonical URL Strategy | 1% | Self-referencing canonical tags |
- | RSS/Atom Feed | 1% | RSS feed linked from homepage |
+ | Canonical URL Strategy | 0.5% | Self-referencing canonical tags |
+ | RSS/Atom Feed | 0% | RSS feed linked from homepage |

  > **Coherence Gate:** Sites with topic coherence below 6/10 are score-capped regardless of technical perfection. A scattered site with perfect robots.txt, llms.txt, and schema will score lower than a focused site with mediocre technical implementation.
  >
  > **Duplication Gate:** Per-page scores are capped when duplicate content blocks are detected. A page with 3+ identical copy-pasted paragraphs cannot score above 35/75 regardless of other signals — LLMs will flag it as low-quality content.

  <details>
- <summary>All 36 criteria (numbered list)</summary>
+ <summary>All 40 criteria (numbered list)</summary>

  | # | Criterion | Weight | Pillar |
  |---|-----------|--------|--------|
- | 1 | llms.txt File | 2% | AI Discovery |
+ | 1 | llms.txt File | 1% | AI Discovery |
  | 2 | Schema.org Structured Data | 3% | Trust & Authority |
  | 3 | Q&A Content Format | 4% | Content Structure |
  | 4 | Clean, Crawlable HTML | 2% | Technical Foundation |
  | 5 | Entity Authority & NAP Consistency | 5% | Trust & Authority |
- | 6 | robots.txt for AI Crawlers | 2% | AI Discovery |
+ | 6 | robots.txt for AI Crawlers | 1% | AI Discovery |
  | 7 | Comprehensive FAQ Section | 3% | Content Structure |
  | 8 | Original Data & Expert Analysis | 10% | Answer Readiness |
  | 9 | Internal Linking Structure | 4% | Trust & Authority |
  | 10 | Semantic HTML5 & Accessibility | 2% | Technical Foundation |
  | 11 | Content Freshness Signals | 4% | Trust & Authority |
  | 12 | Sitemap Completeness | 1% | AI Discovery |
- | 13 | RSS/Atom Feed | 1% | AI Discovery |
+ | 13 | RSS/Atom Feed | 0% | AI Discovery |
  | 14 | Table & List Extractability | 3% | Content Structure |
- | 15 | Definition Patterns | 2% | Content Structure |
+ | 15 | Definition Patterns | 1.5% | Content Structure |
  | 16 | Direct Answer Paragraphs | 5% | Content Structure |
- | 17 | Content Licensing & AI Permissions | 2% | AI Discovery |
+ | 17 | Content Licensing & AI Permissions | 1% | AI Discovery |
  | 18 | Author & Expert Schema | 3% | Trust & Authority |
  | 19 | Fact & Data Density | 6% | Answer Readiness |
- | 20 | Canonical URL Strategy | 1% | AI Discovery |
+ | 20 | Canonical URL Strategy | 0.5% | AI Discovery |
  | 21 | Content Publishing Velocity | 2% | AI Discovery |
- | 22 | Schema Coverage & Depth | 1% | Technical Foundation |
- | 23 | Speakable Schema | 1% | Technical Foundation |
+ | 22 | Schema Coverage & Depth | 0% | Technical Foundation |
+ | 23 | Speakable Schema | 0% | Technical Foundation |
  | 24 | Query-Answer Alignment | 4% | Content Structure |
  | 25 | Content Cannibalization | 2% | AI Discovery |
- | 26 | Visible Date Signal | 2% | Technical Foundation |
+ | 26 | Visible Date Signal | 1.5% | Technical Foundation |
  | 27 | Topic Coherence | 14% | Answer Readiness |
  | 28 | Content Depth | 7% | Answer Readiness |
- | 29 | Citation-Ready Writing | 4% | Answer Readiness |
- | 30 | Answer-First Placement | 3% | Answer Readiness |
- | 31 | Evidence Packaging | 3% | Answer Readiness |
- | 32 | Entity Disambiguation | 2% | Content Structure |
- | 33 | Extraction Friction | 2% | Technical Foundation |
- | 34 | Image Context for AI | 1% | Technical Foundation |
- | 35 | Duplicate Content Blocks | 5% | Answer Readiness |
- | 36 | Cross-Page Duplicate Content | 3% | Answer Readiness |
+ | 29 | Helpful Purpose Alignment | 3% | Answer Readiness |
+ | 30 | First-Hand Experience Signals | 3% | Answer Readiness |
+ | 31 | Creator Transparency | 2% | Trust & Authority |
+ | 32 | Methodology Transparency | 2% | Trust & Authority |
+ | 33 | Citation-Ready Writing | 4% | Answer Readiness |
+ | 34 | Answer-First Placement | 3% | Answer Readiness |
+ | 35 | Evidence Packaging | 3% | Answer Readiness |
+ | 36 | Entity Disambiguation | 2% | Content Structure |
+ | 37 | Extraction Friction | 2% | Technical Foundation |
+ | 38 | Image Context for AI | 0.5% | Technical Foundation |
+ | 39 | Duplicate Content Blocks | 5% | Answer Readiness |
+ | 40 | Cross-Page Duplicate Content | 3% | Answer Readiness |

  </details>

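Editor's note: the pillar tables in the diff above pair each criterion with a percentage weight, and the scorecard scores each criterion 0-10. A minimal sketch of how such weights and scores could combine into a 0-100 overall score (hypothetical names, not aeorank's actual code):

```typescript
// Illustrative sketch: combine 0-10 criterion scores with percentage
// weights into a 0-100 overall score. `CriterionScore` and `overallScore`
// are hypothetical names, not aeorank's API.
interface CriterionScore {
  criterion: string;
  score: number;  // 0-10, as in the scorecard
  weight: number; // percentage weight from the tables, e.g. 14 for Topic Coherence
}

function overallScore(items: CriterionScore[]): number {
  const totalWeight = items.reduce((sum, c) => sum + c.weight, 0);
  if (totalWeight === 0) return 0;
  // Each criterion contributes score/10 of its weight; normalize to 0-100.
  const earned = items.reduce((sum, c) => sum + (c.score / 10) * c.weight, 0);
  return (earned / totalWeight) * 100;
}
```

Note that criteria reweighted to 0% (Speakable Schema, RSS/Atom Feed, Schema Coverage & Depth) simply stop contributing under this kind of weighted sum, which matches how v3.2.0 deprioritizes them without removing them from the scorecard.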
@@ -183,7 +191,7 @@ Use the built-in action to gate deployments on AEO score:

  ```yaml
  - name: AEO Audit
-   uses: AEO-Content-Inc/aeorank@v2
+   uses: AEO-Content-Inc/aeorank@v3
    with:
      domain: example.com
      threshold: 70
@@ -203,7 +211,7 @@ Or use `npx` directly:
  Run a complete audit. Returns `AuditResult` with:

  - `overallScore` - 0-100 weighted score
- - `scorecard` - 36 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
+ - `scorecard` - 40 `ScoreCardItem` entries (criterion, score 0-10, status, key findings)
  - `detailedFindings` - Per-criterion findings with severity
  - `opportunities` - Prioritized improvements with effort/impact
  - `pitchNumbers` - Key metrics (schema types, AI crawler access, etc.)
@@ -225,10 +233,10 @@ Run a complete audit. Returns `AuditResult` with:

  ### `scorePage(html, url?)`

- Score a single HTML page against 21 per-page AEO criteria. Returns `PageScoreResult` with:
+ Score a single HTML page against 25 per-page AEO criteria. Returns `PageScoreResult` with:

  - `aeoScore` - 0-75 weighted score (capped; duplication gate may lower further)
- - `criterionScores` - 21 `PageCriterionScore` entries (criterion, score 0-10, weight)
+ - `criterionScores` - 25 `PageCriterionScore` entries (criterion, score 0-10, weight)

  ### `scoreAllPages(siteData)`

@@ -388,9 +396,9 @@ console.log(crawlResult.discoveredUrls.length); // Total URLs found

  ## Per-Page Scoring

- AEORank scores each individual page (0-75) against the 21 criteria that apply at page level. Instead of only seeing "your site scores 62," you get "your /about page scores 45, your /blog/guide scores 72."
+ AEORank scores each individual page (0-75) against the 25 criteria that apply at page level. Instead of only seeing "your site scores 62," you get "your /about page scores 45, your /blog/guide scores 72."

- The 21 per-page criteria follow the same pillar-first weighting as the site-level score:
+ The 25 per-page criteria follow the same pillar-first weighting as the site-level score:

  | Pillar | Per-Page Criteria | Weight |
  |--------|-------------------|--------|
@@ -400,21 +408,25 @@ The 25 per-page criteria follow the same pillar-first weighting as the site-leve
  | | Citation-Ready Writing | 4% |
  | | Answer-First Placement | 3% |
  | | Evidence Packaging | 3% |
+ | | Helpful Purpose Alignment | 3% |
+ | | First-Hand Experience Signals | 3% |
  | **Content Structure** | Direct Answer Paragraphs | 5% |
  | | Q&A Content Format | 4% |
  | | Query-Answer Alignment | 4% |
  | | FAQ Section Content | 3% |
  | | Table & List Extractability | 3% |
- | | Definition Patterns | 2% |
+ | | Definition Patterns | 1.5% |
  | | Entity Disambiguation | 2% |
  | **Trust & Authority** | Content Freshness Signals | 4% |
  | | Schema.org Structured Data | 3% |
+ | | Creator Transparency | 2% |
+ | | Methodology Transparency | 2% |
  | **Technical Foundation** | Semantic HTML5 & Accessibility | 2% |
  | | Clean, Crawlable HTML | 2% |
- | | Visible Date Signal | 2% |
+ | | Visible Date Signal | 1.5% |
  | | Extraction Friction | 2% |
- | | Image Context for AI | 1% |
- | **AI Discovery** | Canonical URL Strategy | 1% |
+ | | Image Context for AI | 0.5% |
+ | **AI Discovery** | Canonical URL Strategy | 0.5% |

  The remaining 15 criteria are site-level only: llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization, cross-page duplication, topic coherence, and content depth.

@@ -445,7 +457,7 @@ import type { PageScoreResult, PageCriterionScore } from 'aeorank';
  // Score a single page
  const result = scorePage(html, url);
  console.log(result.aeoScore); // 0-75 (capped for single pages)
- console.log(result.criterionScores); // 21 per-criterion scores
+ console.log(result.criterionScores); // 25 per-criterion scores
  console.log(result.scoreCapped); // true if score was capped at 75

  // Score all pages from site data
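Editor's note: the single-page cap and the duplication gate described in the README reduce to score ceilings applied after weighting. A hedged sketch of that logic (hypothetical function; aeorank's real gating may differ):

```typescript
// Illustrative sketch of the gating the README describes: single pages cap
// at 75, and a page with 3+ duplicate content blocks caps at 35 ("cannot
// score above 35/75"). Not aeorank's actual implementation.
const SINGLE_PAGE_CAP = 75;
const DUPLICATION_CAP = 35;

function applyGates(rawScore: number, duplicateBlocks: number): number {
  let score = Math.min(rawScore, SINGLE_PAGE_CAP);
  if (duplicateBlocks >= 3) score = Math.min(score, DUPLICATION_CAP);
  return score;
}
```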
@@ -574,13 +586,21 @@ console.log(result.comparison.tied); // Criteria with equal scores

  ## Changelog

+ ### v3.1.1 - Duplicate Detection False-Positive Fix
+
+ Duplicate-content detection now ignores short metadata rows like `Deadline:` and `Decision timeline:` so structured guides are not penalized for repeated timeline labels. Shared duplicate-matching logic is now used by both page scoring and site-wide crawling.
+
  ### v3.1.0 - Duplicate Content Detection

  2 new criteria (#35-#36): Duplicate Content Blocks (intra-page, 5%) and Cross-Page Duplicate Content (3%). Detects identical text blocks within pages and copy-pasted paragraphs across pages using shingle-based Jaccard similarity. Boilerplate filtering excludes CTAs, signups, and template content from false positives. Duplication gate caps per-page scores when severe duplication is found. CLI now shows duplicate section names inline per page.

+ ### v3.2.0 - Helpful Content Criteria
+
+ Added 4 new criteria: Helpful Purpose Alignment, First-Hand Experience Signals, Creator Transparency, and Methodology Transparency. The model now scores 40 total criteria and 25 page-level criteria while explicitly avoiding any "AI-written" detector.
+
  ### v3.0.0 - 5-Pillar Framework & 6 New Criteria

- Scoring Engine v2: 28 → 34 criteria (now 36) with 5-pillar framework (Answer Readiness, Content Structure, Trust & Authority, Technical Foundation, AI Discovery). 6 new criteria targeting citation quality, evidence packaging, and extraction friction. Per-pillar sub-scores, top-3 fixes, client-friendly names. Single-page score cap at 75. 15 per-page quality checks (up from 12).
+ Scoring Engine v2: 28 → 34 criteria with 5-pillar framework (Answer Readiness, Content Structure, Trust & Authority, Technical Foundation, AI Discovery). 6 new criteria targeting citation quality, evidence packaging, and extraction friction. Per-pillar sub-scores, top-3 fixes, client-friendly names. Single-page score cap at 75.

  ### v2.3.0 - Coherence Scaling & Script Stripping

@@ -604,11 +624,11 @@ Internal linking analysis with orphan/pillar/hub detection, topic clusters. Phas

  ### v1.5.0 - Per-Page Scoring

- Individual page scores (0-100) against 14 page-level criteria. Top/bottom page rankings.
+ Individual page scores against the initial page-level scoring model. Top/bottom page rankings.

  ## Benchmark Dataset

- The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 36 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):
+ The `data/` directory contains the largest open dataset of AI visibility scores - **13,619 domains** scored across 40 criteria, including **4,328 Y Combinator startups** across 48 batches (W06-W26):

  | File | Contents |
  |------|----------|
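Editor's note: the v3.1.0 changelog entry above names shingle-based Jaccard similarity as the duplicate-detection technique. The textbook form of that technique looks roughly like this (an illustrative sketch, not aeorank's implementation, which also filters boilerplate like CTAs and signups):

```typescript
// Sketch of shingle-based Jaccard similarity for duplicate-content
// detection: split text into overlapping k-word shingles, then compare
// the shingle sets. Purely illustrative.
function shingles(text: string, k = 3): Set<string> {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const out = new Set<string>();
  for (let i = 0; i + k <= words.length; i++) {
    out.add(words.slice(i, i + k).join(' '));
  }
  return out;
}

function jaccard(a: string, b: string): number {
  const sa = shingles(a);
  const sb = shingles(b);
  if (sa.size === 0 && sb.size === 0) return 1; // two empty texts are identical
  let intersection = 0;
  for (const s of sa) if (sb.has(s)) intersection++;
  const union = sa.size + sb.size - intersection;
  return union === 0 ? 0 : intersection / union;
}
```

A pair of paragraphs scoring near 1.0 would count as copy-pasted; a threshold somewhere below that separates duplicates from coincidental overlap.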
package/dist/browser.d.ts CHANGED
@@ -64,7 +64,7 @@ declare function buildLinkGraph(pages: FetchResult[], domain: string, homepageUr
  /**
   * V2 Pillar Framework — 5-pillar scoring model.
-  * Maps all 36 criteria into pillars, computes sub-scores,
+  * Maps all 40 criteria into pillars, computes sub-scores,
   * provides client-friendly names, and calculates top-3 fixes.
   */

@@ -320,7 +320,7 @@ interface SitemapDateAnalysis {
  declare function countRecentSitemapDates(sitemapText: string): SitemapDateAnalysis;
  declare function extractRawDataSummary(data: SiteData): RawDataSummary;
  /**
-  * Run all 36 criteria checks using pre-fetched site data.
+  * Run all 40 criteria checks using pre-fetched site data.
   * All functions are synchronous (no HTTP calls) - data was already fetched.
   */
  declare function auditSiteFromData(data: SiteData): CriterionResult[];
@@ -456,7 +456,7 @@ declare function analyzeAllPages(siteData: SiteData): PageReview[];

  /**
   * Per-page AEO scoring.
-  * Evaluates 21 of 36 criteria that apply at individual page level.
+  * Evaluates 25 of 40 criteria that apply at individual page level.
   * Produces a 0-75 AEO score per page (single-page cap at 75).
   */

@@ -484,7 +484,7 @@ declare function scoreExtractionFriction(html: string): number;
  /** 20. Image Context for AI */
  declare function scoreImageContextAI(html: string): number;
  /**
-  * Score a single page against 20 AEO criteria.
+  * Score a single page against 25 AEO criteria.
   * Returns a 0-100 AEO score (capped at 75 for single pages) and individual criterion scores.
   */
  declare function scorePage(html: string, url?: string): PageScoreResult;