@agentimization/core 0.1.0 → 0.1.1

Files changed (3)
  1. package/README.md +39 -155
  2. package/dist/index.js +48 -24
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -1,43 +1,6 @@
- <p align="center">
- <img src="https://img.shields.io/npm/v/agentimization?style=flat-square&color=blue" alt="npm version" />
- <img src="https://img.shields.io/badge/license-MIT-green?style=flat-square" alt="license" />
- <img src="https://img.shields.io/badge/checks-35-purple?style=flat-square" alt="checks" />
- </p>
+ # agentimization
 
- <h1 align="center">agentimization</h1>
-
- <p align="center">
- GEO audit for agent-ready websites.<br/>
- One command to check if AI agents can discover, parse, and cite your content.
- </p>
-
- ---
-
- ## Why
-
- AI agents (Claude, ChatGPT, Perplexity, Gemini) are becoming a major source of traffic and citations. But most websites are invisible to them — no `llms.txt`, no markdown endpoints, no structured data, client-rendered content that crawlers can't read.
-
- **Agentimization** runs checks across 8 categories and gives you a GEO score from 0–100, with specific fixes you can hand off to an AI coding agent.
-
- ## Install
-
- ```bash
- npx agentimization https://your-site.com
- ```
-
- Or install globally:
-
- ```bash
- npm install -g agentimization
- ```
-
- ## Usage
-
- ### Audit a live site
-
- ```bash
- agentimization https://docs.anthropic.com
- ```
+ [![npm version](https://img.shields.io/npm/v/agentimization?style=flat-square&color=blue)](https://www.npmjs.com/package/agentimization)
 
  ```text
  ╭───────────────────────────────────────────────╮
@@ -45,149 +8,70 @@ agentimization https://docs.anthropic.com
  │ ░▓▒░▓░░▒▓▒░▓░░▒▓▓░▒░▓▒░░▓▒░▓░▒░░▓▒░░▓░▒ │
  │ ▓░▒▓░░▒▓▒░░▓░▒▓▒░░▓░░▓▒░▓░▒░░▓▒░▓░░▒▓░ │
  │ ░▒▓░▒░▓▒░░▓░▒▓░░▒▓▒░░▓░▒▓░░▒▓░ agentimization │
- │ │
- │ https://docs.anthropic.com │
- │ │
- │ Crawling the site, one sec… │
  ╰───────────────────────────────────────────────╯
  ```
 
- ### Audit a local directory (great for CI)
+ geo audit for agent-ready websites and projects.
 
- ```bash
- agentimization .
- agentimization ./docs
- ```
+ geomaxx your site so ai agents can actually find, parse, and cite it.
 
- ### Output formats
+ ## install
 
  ```bash
- # JSON for CI pipelines
- agentimization https://example.com --json
-
- # Markdown report — paste into Claude, ChatGPT, etc.
- agentimization https://example.com --md
-
- # Filter by category
- agentimization https://example.com --category content-discoverability
+ npx agentimization https://your-site.com
  ```
 
- ### After the audit
-
- Agentimization shows an interactive menu when the audit finishes:
+ ## usage
 
- - **Copy fix prompt to clipboard** — structured markdown an AI coding agent can use to fix your GEO issues
- - **Save JSON report** — full audit data written to `agentimization-report.json`
- - **Run another URL or path** — keep the session open and audit the next site
- - **Exit**
+ audit a live site:
 
- ## Checks
-
- Agentimization runs **36 checks** across **8 categories**:
-
- | Category | What it checks |
- |---|---|
- | **Content Discoverability** | `llms.txt` existence, structure, size, coverage, link resolution. Sitemap presence. `robots.txt` AI agent rules. |
- | **Markdown Availability** | `.md` URL support, `Accept: text/markdown` content negotiation, HTML↔markdown parity. |
- | **Content Structure** | Code fence validity, heading hierarchy, tabbed content serialization. |
- | **Page Size & Rendering** | SSR vs CSR detection, HTML/markdown page size, content start position (boilerplate ratio). |
- | **URL Stability** | HTTP status codes, redirect behavior, cache header hygiene. |
- | **Authentication & Access** | Auth gate detection, alternative access paths for gated content. |
- | **GEO Signals** | Structured data (JSON-LD), citation worthiness, topical authority, content freshness, E-E-A-T signals, FAQ schema, canonical URLs. |
- | **Agent Protocols** | AGENTS.md, MCP server card, API catalog (RFC 9727), content signals (AI usage declarations), Link headers (RFC 8288), agent skills index. |
-
- ## Scoring
-
- Each check returns **pass**, **warn**, **fail**, **skip**, or **info**. Checks are weighted by importance, and scores roll up into category scores and an overall grade:
-
- | Grade | Score |
- |---|---|
- | A+ | 95–100 |
- | A | 85–94 |
- | B | 70–84 |
- | C | 55–69 |
- | D | 40–54 |
- | F | 0–39 |
-
- ## Example scores
-
- How popular sites score on Agentimization (approximate, scores change as sites update):
-
- | Site | Grade | Score | Notes |
- |---|---|---|---|
- | `docs.anthropic.com` | **A** | 88 | Strong `llms.txt`, good markdown, structured data |
- | `docs.stripe.com` | **A** | 91 | Excellent discoverability, markdown endpoints, great structure |
- | `nextjs.org/docs` | **B** | 76 | Good SSR, missing `llms.txt`, decent GEO signals |
- | `react.dev` | **B** | 72 | Good structure, no `llms.txt`, client-heavy rendering |
- | `en.wikipedia.org` | **A** | 86 | Great content structure, strong citations, no `llms.txt` |
- | `medium.com` | **D** | 45 | Auth gates, weak markdown, no `llms.txt` |
- | `substack.com` | **C** | 58 | Mixed access, some content gated |
-
- > These are illustrative examples. Run `agentimization <url>` to get real-time scores.
-
- ## Local mode
-
- When you pass a directory path instead of a URL, Agentimization runs in **local mode**:
+ ```bash
+ agentimization https://docs.your-site.com
+ ```
 
- - Scans your files on disk (HTML, markdown, `llms.txt`, `robots.txt`, `sitemap.xml`)
- - Skips network-only checks (content negotiation, auth detection, cache headers, etc.)
- - Perfect as a **CI pre-deploy step** — catch GEO regressions before they ship
+ audit a local directory:
 
  ```bash
- # In CI
- agentimization . --json
- # Exit code 1 if score < 50
+ agentimization .
  ```
 
- ## Programmatic API
-
- ```typescript
- import { audit, auditLocal } from "@agentimization/core"
+ pipe results to a tool or file:
 
- // Remote audit
- const result = await audit("https://docs.anthropic.com")
- console.log(result.grade, result.overall_score)
+ ```bash
+ agentimization https://your-site.com --json > report.json
+ agentimization https://your-site.com --md | pbcopy
+ ```
 
- // Local audit
- const local = await auditLocal("./docs")
- console.log(local.grade, local.overall_score)
+ ## what it checks
 
- // With options
- const result = await audit("https://example.com", {
- sampleSize: 20,
- categories: ["content-discoverability", "geo-signals"],
- onEvent: (event) => console.log(event),
- })
- ```
+ 36 checks across 8 categories. each one is a thing ai agents need to discover, parse, or cite your content.
 
- ## What is GEO?
+ - content discoverability: `llms.txt`, sitemap, robots
+ - markdown availability: `.md` urls, content negotiation
+ - content structure: headings, code fences, hidden tabs
+ - page size and rendering: ssr vs csr, boilerplate ratio
+ - url stability: status codes, redirects, canonicals
+ - authentication and access: gates, alternative paths
+ - geo signals: json-ld, citations, freshness, e-e-a-t
+ - agent protocols: mcp card, api catalog, agents.md, link headers
 
- **Generative Engine Optimization** is like SEO, but for AI. Instead of optimizing for Google's crawlers and ranking algorithm, GEO optimizes for AI agents that need to:
+ ## how it works
 
- 1. **Discover** your content (via `llms.txt`, sitemaps, `robots.txt`)
- 2. **Parse** it efficiently (markdown availability, clean HTML, SSR)
- 3. **Cite** it accurately (structured data, canonical URLs, E-E-A-T signals)
+ it samples up to 10 pages of your site, runs 36 checks against the html, headers, and well-known files, then weights them into a 0 to 100 score. failed checks come with a suggestion you can paste into your ai coding agent.
 
- Sites that score well on Agentimization are more likely to be surfaced and cited by Claude, ChatGPT, Perplexity, and other generative engines.
+ ## requirements
 
- ## Contributing
+ node 18 or newer.
 
- ```bash
- git clone https://github.com/antlio/agentimization
- cd agentimization
- bun install
- bun run build
- bun run typecheck
- ```
+ ## programmatic use
 
- The monorepo structure:
+ ```typescript
+ import { audit } from "@agentimization/core"
 
- ```
- packages/shared — Types, schemas, constants
- packages/core — Audit engine + all 36 checks
- apps/cli — CLI (Commander.js + Ink)
+ const result = await audit("https://your-site.com")
+ console.log(result.grade, result.overall_score)
  ```
 
- ## License
+ ## license
 
- MIT
+ mit
package/dist/index.js CHANGED
@@ -993,7 +993,7 @@ var contentStartPosition = {
  id: "content-start-position",
  name: "Content Start Position",
  category: "page-size",
- description: "Checks if main content starts within the first 10% of the HTML",
+ description: "Checks how soon main content starts relative to total HTML",
  weight: 0.5,
  run: async (ctx) => {
  const pages = ctx.sampledPages.slice(0, 10);
@@ -1010,17 +1010,16 @@ var contentStartPosition = {
  url: p.url,
  position: findContentStartPosition(p.html)
  }));
- const earlyStart = positions.filter((p) => p.position <= 0.1);
  const medianPct = Math.round(
  positions.map((p) => p.position).sort((a, b) => a - b)[Math.floor(positions.length / 2)] * 100
  );
- if (earlyStart.length === pages.length) {
+ if (medianPct <= 30) {
  return {
  id: "content-start-position",
  name: "Content Start Position",
  category: "page-size",
  status: "pass",
- message: `Content starts within first 10% on all ${pages.length} sampled pages (median ${medianPct}%)`,
+ message: `content starts at ${medianPct}% of html (median over ${pages.length} pages)`,
  metadata: { medianPct }
  };
  }
@@ -1028,10 +1027,10 @@ var contentStartPosition = {
  id: "content-start-position",
  name: "Content Start Position",
  category: "page-size",
- status: "warn",
- message: `Content starts late on ${pages.length - earlyStart.length}/${pages.length} pages (median ${medianPct}%)`,
- suggestion: "Move main content higher in the HTML. AI agents may waste context window tokens on navigation, headers, and boilerplate before reaching actual content.",
- metadata: { medianPct, earlyStart: earlyStart.length }
+ status: medianPct <= 50 ? "warn" : "fail",
+ message: `content starts at ${medianPct}% of html (median over ${pages.length} pages)`,
+ suggestion: "trim head metadata or move main content higher in the html so ai agents do not waste context tokens on boilerplate before reaching real content.",
+ metadata: { medianPct }
  };
  }
  };
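
Note on the three `contentStartPosition` hunks above: 0.1.0 passed only when every sampled page started its content within the first 10% of the HTML, while 0.1.1 grades the median alone (pass at 30% or less, warn up to 50%, fail beyond that). The sketch below restates that rule as a standalone function. It assumes positions are fractions between 0 and 1 and that at least one page was sampled; `medianPercent` and `gradeContentStart` are hypothetical names for illustration, not exports of the package.

```javascript
// Hypothetical helper: median start position as a whole percentage.
// Mirrors the diff's indexing, which takes the upper median for even-length input.
const medianPercent = (positions) => {
  const sorted = [...positions].sort((a, b) => a - b);
  return Math.round(sorted[Math.floor(sorted.length / 2)] * 100);
};

// Hypothetical helper: the 0.1.1 thresholds from the hunks above.
const gradeContentStart = (positions) => {
  const medianPct = medianPercent(positions);
  if (medianPct <= 30) return { status: "pass", medianPct };
  return { status: medianPct <= 50 ? "warn" : "fail", medianPct };
};

// One late page no longer drags the whole check down: the median is 12%.
console.log(gradeContentStart([0.05, 0.12, 0.45]));
// A site where most pages start late now fails instead of warning.
console.log(gradeContentStart([0.2, 0.4, 0.6, 0.8]));
```

Under the 0.1.0 rule the first input would have produced a warning, because two of its three pages start after the 10% mark.
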
@@ -1573,9 +1572,10 @@ var topicalAuthoritySignals = {
  const pages = ctx.sampledPages.slice(0, 10);
  let totalInternalLinks = 0;
  let pagesWithGoodLinking = 0;
+ const resolveBase = ctx.mode === "local" ? ctx.baseUrl.href : ctx.baseUrl.origin;
  for (const page of pages) {
- const links = extractLinks(page.html, ctx.baseUrl.origin);
- const internalLinks = ctx.mode === "local" ? links.filter((l) => !l.startsWith("http://") && !l.startsWith("https://")) : links.filter((l) => {
+ const links = extractLinks(page.html, resolveBase);
+ const internalLinks = ctx.mode === "local" ? links.filter((l) => l.startsWith("file:")) : links.filter((l) => {
  try {
  return new URL(l).origin === ctx.baseUrl.origin;
  } catch {
@@ -1671,7 +1671,8 @@ var eeatSignals = {
  const hasAuthorHtml = /class=["'][^"']*author[^"']*["']|rel=["']author["']/i.test(page.html);
  if (hasAuthorMeta || hasAuthorJsonLd || hasAuthorHtml) withAuthor++;
  const hasCredentials = /Ph\.?D|M\.?D|CPA|certified|licensed|expert|specialist/i.test(page.html);
- const hasAboutPage = extractLinks(page.html, ctx.baseUrl.origin).some((l) => /about|team|author/i.test(l));
+ const linkBase = ctx.mode === "local" ? ctx.baseUrl.href : ctx.baseUrl.origin;
+ const hasAboutPage = extractLinks(page.html, linkBase).some((l) => /about|team|author/i.test(l));
  if (hasCredentials || hasAboutPage) withExpertise++;
  }
  const score = (withAuthor + withExpertise) / (pages.length * 2);
@@ -1742,6 +1743,15 @@ var canonicalUrlConsistency = {
  description: "Checks if pages have consistent canonical URLs",
  weight: 0.5,
  run: async (ctx) => {
+ if (ctx.mode === "local") {
+ return {
+ id: "canonical-url-consistency",
+ name: "Canonical URL Consistency",
+ category: "geo-signals",
+ status: "info",
+ message: "only meaningful for live urls. re-run against a deployed site to verify"
+ };
+ }
  const pages = ctx.sampledPages.slice(0, 10);
  let withCanonical = 0;
  let selfReferencing = 0;
@@ -2185,7 +2195,30 @@ var ALL_CHECKS = [
 
  // src/utils/local.ts
  import { readFileSync, readdirSync, existsSync } from "fs";
- import { join, relative, extname } from "path";
+ import { dirname, join, relative, extname } from "path";
+ var readIfExists = (path) => {
+ try {
+ if (existsSync(path)) {
+ return readFileSync(path, "utf-8");
+ }
+ } catch {
+ }
+ return void 0;
+ };
+ var findUpward = (start, names, maxDepth = 6) => {
+ let current = start;
+ for (let i = 0; i < maxDepth; i++) {
+ for (const name of names) {
+ const candidate = join(current, name);
+ const value = readIfExists(candidate);
+ if (value !== void 0) return value;
+ }
+ const parent = dirname(current);
+ if (parent === current) break;
+ current = parent;
+ }
+ return void 0;
+ };
  var walkDir = (dir, extensions, maxDepth = 10) => {
  if (maxDepth <= 0) return [];
  const results = [];
@@ -2206,15 +2239,6 @@ var walkDir = (dir, extensions, maxDepth = 10) => {
  }
  return results;
  };
- var readIfExists = (path) => {
- try {
- if (existsSync(path)) {
- return readFileSync(path, "utf-8");
- }
- } catch {
- }
- return void 0;
- };
  var buildLocalContext = (dirPath, config) => {
  const baseUrl = new URL(`file://${dirPath}`);
  const robotsTxt = readIfExists(join(dirPath, "robots.txt"));
@@ -2224,7 +2248,7 @@ var buildLocalContext = (dirPath, config) => {
  const mcpServerCard2 = readIfExists(join(dirPath, ".well-known", "mcp", "server-card.json"));
  const apiCatalog2 = readIfExists(join(dirPath, ".well-known", "api-catalog"));
  const agentSkillsIndex2 = readIfExists(join(dirPath, ".well-known", "agent-skills", "index.json"));
- const agentsMd2 = readIfExists(join(dirPath, "AGENTS.md")) ?? readIfExists(join(dirPath, "AGENT.md"));
+ const agentsMd2 = findUpward(dirPath, ["AGENTS.md", "AGENT.md"]);
  const sitemapUrls = sitemapXml ? parseSitemapUrls(sitemapXml) : [];
  if (!sitemapXml && robotsTxt) {
  const sitemapMatch = robotsTxt.match(/Sitemap:\s*(.+)/i);
@@ -2241,8 +2265,8 @@ var buildLocalContext = (dirPath, config) => {
  }
  const htmlFiles = walkDir(dirPath, /* @__PURE__ */ new Set([".html", ".htm"]));
  const mdFiles = walkDir(dirPath, /* @__PURE__ */ new Set([".md", ".mdx"]));
- const allFiles = [...htmlFiles, ...mdFiles];
- const sampled = allFiles.slice(0, config.sampleSize);
+ const sampleSource = htmlFiles.length > 0 ? htmlFiles : mdFiles;
+ const sampled = sampleSource.slice(0, config.sampleSize);
  const sampledPages = sampled.map((filePath) => {
  const content = readFileSync(filePath, "utf-8");
  const relPath = relative(dirPath, filePath);
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@agentimization/core",
- "version": "0.1.0",
+ "version": "0.1.1",
  "description": "GEO audit engine. Check if your website is agent-ready and generative-engine optimized.",
  "keywords": [
  "geo",