npm - @agentimization/core - Versions diffs - 0.1.0 → 0.1.1 - Mend

@agentimization/core 0.1.0 → 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -1,43 +1,6 @@
-<p align="center">
-  <img src="https://img.shields.io/npm/v/agentimization?style=flat-square&color=blue" alt="npm version" />
-  <img src="https://img.shields.io/badge/license-MIT-green?style=flat-square" alt="license" />
-  <img src="https://img.shields.io/badge/checks-35-purple?style=flat-square" alt="checks" />
-</p>
+# agentimization
-<h1 align="center">agentimization</h1>
-<p align="center">
-  GEO audit for agent-ready websites.<br/>
-  One command to check if AI agents can discover, parse, and cite your content.
-</p>
----
-## Why
-AI agents (Claude, ChatGPT, Perplexity, Gemini) are becoming a major source of traffic and citations. But most websites are invisible to them — no `llms.txt`, no markdown endpoints, no structured data, client-rendered content that crawlers can't read.
-**Agentimization** runs checks across 8 categories and gives you a GEO score from 0–100, with specific fixes you can hand off to an AI coding agent.
-## Install
-```bash
-npx agentimization https://your-site.com
-```
-Or install globally:
-```bash
-npm install -g agentimization
-```
-## Usage
-### Audit a live site
-```bash
-agentimization https://docs.anthropic.com
-```
+[![npm version](https://img.shields.io/npm/v/agentimization?style=flat-square&color=blue)](https://www.npmjs.com/package/agentimization)
 ```text
 ╭───────────────────────────────────────────────╮
@@ -45,149 +8,70 @@ agentimization https://docs.anthropic.com
 │ ░▓▒░▓░░▒▓▒░▓░░▒▓▓░▒░▓▒░░▓▒░▓░▒░░▓▒░░▓░▒       │
 │ ▓░▒▓░░▒▓▒░░▓░▒▓▒░░▓░░▓▒░▓░▒░░▓▒░▓░░▒▓░        │
 │ ░▒▓░▒░▓▒░░▓░▒▓░░▒▓▒░░▓░▒▓░░▒▓░ agentimization │
-│                                               │
-│ https://docs.anthropic.com                    │
-│                                               │
-│ Crawling the site, one sec…                   │
 ╰───────────────────────────────────────────────╯
 ```
-### Audit a local directory (great for CI)
+geo audit for agent-ready websites and projects.
-```bash
-agentimization .
-agentimization ./docs
-```
+geomaxx your site so ai agents can actually find, parse, and cite it.
-### Output formats
+## install
 ```bash
-# JSON for CI pipelines
-agentimization https://example.com --json
-# Markdown report — paste into Claude, ChatGPT, etc.
-agentimization https://example.com --md
-# Filter by category
-agentimization https://example.com --category content-discoverability
+npx agentimization https://your-site.com
 ```
-### After the audit
-Agentimization shows an interactive menu when the audit finishes:
+## usage
-- **Copy fix prompt to clipboard** — structured markdown an AI coding agent can use to fix your GEO issues
-- **Save JSON report** — full audit data written to `agentimization-report.json`
-- **Run another URL or path** — keep the session open and audit the next site
-- **Exit**
+audit a live site:
-## Checks
-Agentimization runs **36 checks** across **8 categories**:
-| Category | What it checks |
-|---|---|
-| **Content Discoverability** | `llms.txt` existence, structure, size, coverage, link resolution. Sitemap presence. `robots.txt` AI agent rules. |
-| **Markdown Availability** | `.md` URL support, `Accept: text/markdown` content negotiation, HTML↔markdown parity. |
-| **Content Structure** | Code fence validity, heading hierarchy, tabbed content serialization. |
-| **Page Size & Rendering** | SSR vs CSR detection, HTML/markdown page size, content start position (boilerplate ratio). |
-| **URL Stability** | HTTP status codes, redirect behavior, cache header hygiene. |
-| **Authentication & Access** | Auth gate detection, alternative access paths for gated content. |
-| **GEO Signals** | Structured data (JSON-LD), citation worthiness, topical authority, content freshness, E-E-A-T signals, FAQ schema, canonical URLs. |
-| **Agent Protocols** | AGENTS.md, MCP server card, API catalog (RFC 9727), content signals (AI usage declarations), Link headers (RFC 8288), agent skills index. |
-## Scoring
-Each check returns **pass**, **warn**, **fail**, **skip**, or **info**. Checks are weighted by importance, and scores roll up into category scores and an overall grade:
-| Grade | Score |
-|---|---|
-| A+ | 95–100 |
-| A | 85–94 |
-| B | 70–84 |
-| C | 55–69 |
-| D | 40–54 |
-| F | 0–39 |
-## Example scores
-How popular sites score on Agentimization (approximate, scores change as sites update):
-| Site | Grade | Score | Notes |
-|---|---|---|---|
-| `docs.anthropic.com` | **A** | 88 | Strong `llms.txt`, good markdown, structured data |
-| `docs.stripe.com` | **A** | 91 | Excellent discoverability, markdown endpoints, great structure |
-| `nextjs.org/docs` | **B** | 76 | Good SSR, missing `llms.txt`, decent GEO signals |
-| `react.dev` | **B** | 72 | Good structure, no `llms.txt`, client-heavy rendering |
-| `en.wikipedia.org` | **A** | 86 | Great content structure, strong citations, no `llms.txt` |
-| `medium.com` | **D** | 45 | Auth gates, weak markdown, no `llms.txt` |
-| `substack.com` | **C** | 58 | Mixed access, some content gated |
-> These are illustrative examples. Run `agentimization <url>` to get real-time scores.
-## Local mode
-When you pass a directory path instead of a URL, Agentimization runs in **local mode**:
+```bash
+agentimization https://docs.your-site.com
+```
-- Scans your files on disk (HTML, markdown, `llms.txt`, `robots.txt`, `sitemap.xml`)
-- Skips network-only checks (content negotiation, auth detection, cache headers, etc.)
-- Perfect as a **CI pre-deploy step** — catch GEO regressions before they ship
+audit a local directory:
 ```bash
-# In CI
-agentimization . --json
-# Exit code 1 if score < 50
+agentimization .
 ```
-## Programmatic API
-```typescript
-import { audit, auditLocal } from "@agentimization/core"
+pipe results to a tool or file:
-// Remote audit
-const result = await audit("https://docs.anthropic.com")
-console.log(result.grade, result.overall_score)
+```bash
+agentimization https://your-site.com --json > report.json
+agentimization https://your-site.com --md | pbcopy
+```
-// Local audit
-const local = await auditLocal("./docs")
-console.log(local.grade, local.overall_score)
+## what it checks
-// With options
-const result = await audit("https://example.com", {
-  sampleSize: 20,
-  categories: ["content-discoverability", "geo-signals"],
-  onEvent: (event) => console.log(event),
-})
-```
+36 checks across 8 categories. each one is a thing ai agents need to discover, parse, or cite your content.
-## What is GEO?
+- content discoverability: `llms.txt`, sitemap, robots
+- markdown availability: `.md` urls, content negotiation
+- content structure: headings, code fences, hidden tabs
+- page size and rendering: ssr vs csr, boilerplate ratio
+- url stability: status codes, redirects, canonicals
+- authentication and access: gates, alternative paths
+- geo signals: json-ld, citations, freshness, e-e-a-t
+- agent protocols: mcp card, api catalog, agents.md, link headers
-**Generative Engine Optimization** is like SEO, but for AI. Instead of optimizing for Google's crawlers and ranking algorithm, GEO optimizes for AI agents that need to:
+## how it works
-1. **Discover** your content (via `llms.txt`, sitemaps, `robots.txt`)
-2. **Parse** it efficiently (markdown availability, clean HTML, SSR)
-3. **Cite** it accurately (structured data, canonical URLs, E-E-A-T signals)
+it samples up to 10 pages of your site, runs 36 checks against the html, headers, and well-known files, then weights them into a 0 to 100 score. failed checks come with a suggestion you can paste into your ai coding agent.
-Sites that score well on Agentimization are more likely to be surfaced and cited by Claude, ChatGPT, Perplexity, and other generative engines.
+## requirements
-## Contributing
+node 18 or newer.
-```bash
-git clone https://github.com/antlio/agentimization
-cd agentimization
-bun install
-bun run build
-bun run typecheck
-```
+## programmatic use
-The monorepo structure:
+```typescript
+import { audit } from "@agentimization/core"
-```
-packages/shared  — Types, schemas, constants
-packages/core    — Audit engine + all 36 checks
-apps/cli         — CLI (Commander.js + Ink)
+const result = await audit("https://your-site.com")
+console.log(result.grade, result.overall_score)
 ```
-## License
+## license
-MIT
+mit

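The grade table removed from the README in this diff maps an overall score to a letter grade. As a rough illustration of that mapping (the bands are copied from the 0.1.0 README; `GRADE_BANDS` and `gradeFor` are hypothetical names, and nothing here confirms that the engine computes grades this way), a lookup could be sketched as:

```javascript
// Grade bands as listed in the 0.1.0 README table. The helper itself is an
// illustration, not the package's code.
const GRADE_BANDS = [
  { min: 95, grade: "A+" },
  { min: 85, grade: "A" },
  { min: 70, grade: "B" },
  { min: 55, grade: "C" },
  { min: 40, grade: "D" },
  { min: 0, grade: "F" },
];

// Return the first band whose lower bound the score reaches.
const gradeFor = (score) => {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  return GRADE_BANDS.find((band) => score >= band.min).grade;
};

console.log(gradeFor(88)); // "A", matching the docs.anthropic.com row in the old README
console.log(gradeFor(45)); // "D", matching the medium.com row
```
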
package/dist/index.js CHANGED

@@ -993,7 +993,7 @@ var contentStartPosition = {
   id: "content-start-position",
   name: "Content Start Position",
   category: "page-size",
-  description: "Checks if main content starts within the first 10% of the HTML",
+  description: "Checks how soon main content starts relative to total HTML",
   weight: 0.5,
   run: async (ctx) => {
     const pages = ctx.sampledPages.slice(0, 10);
@@ -1010,17 +1010,16 @@ var contentStartPosition = {
       url: p.url,
       position: findContentStartPosition(p.html)
     }));
-    const earlyStart = positions.filter((p) => p.position <= 0.1);
     const medianPct = Math.round(
       positions.map((p) => p.position).sort((a, b) => a - b)[Math.floor(positions.length / 2)] * 100
     );
-    if (earlyStart.length === pages.length) {
+    if (medianPct <= 30) {
       return {
         id: "content-start-position",
         name: "Content Start Position",
         category: "page-size",
         status: "pass",
-        message: `Content starts within first 10% on all ${pages.length} sampled pages (median ${medianPct}%)`,
+        message: `content starts at ${medianPct}% of html (median over ${pages.length} pages)`,
         metadata: { medianPct }
       };
     }
@@ -1028,10 +1027,10 @@ var contentStartPosition = {
       id: "content-start-position",
       name: "Content Start Position",
       category: "page-size",
-      status: "warn",
-      message: `Content starts late on ${pages.length - earlyStart.length}/${pages.length} pages (median ${medianPct}%)`,
-      suggestion: "Move main content higher in the HTML. AI agents may waste context window tokens on navigation, headers, and boilerplate before reaching actual content.",
-      metadata: { medianPct, earlyStart: earlyStart.length }
+      status: medianPct <= 50 ? "warn" : "fail",
+      message: `content starts at ${medianPct}% of html (median over ${pages.length} pages)`,
+      suggestion: "trim head metadata or move main content higher in the html so ai agents do not waste context tokens on boilerplate before reaching real content.",
+      metadata: { medianPct }
     };
   }
 };
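
The two hunks above replace the "every page within 10%" rule with a decision based only on the median start position. A standalone sketch of that decision follows; `classifyContentStart` is a hypothetical name, and the input is assumed to be one fraction in [0, 1] per sampled page, as `findContentStartPosition` appears to return:

```javascript
// Hypothetical standalone version of the 0.1.1 decision logic.
// positions: fractions in [0, 1], one per sampled page.
const classifyContentStart = (positions) => {
  if (positions.length === 0) {
    throw new RangeError("need at least one position");
  }
  const sorted = [...positions].sort((a, b) => a - b);
  // Same index as in the diff: for an even count this picks the upper of the
  // two middle values rather than averaging them.
  const medianPct = Math.round(sorted[Math.floor(sorted.length / 2)] * 100);
  // <= 30 passes, 31 to 50 warns, anything later fails.
  const status = medianPct <= 30 ? "pass" : medianPct <= 50 ? "warn" : "fail";
  return { status, medianPct };
};

console.log(classifyContentStart([0.05, 0.08, 0.9])); // { status: 'pass', medianPct: 8 }
console.log(classifyContentStart([0.1, 0.45]));       // { status: 'warn', medianPct: 45 }
console.log(classifyContentStart([0.7]));             // { status: 'fail', medianPct: 70 }
```

The first call shows the behavioral change: under 0.1.0 the single page at 90% would have produced a warning, while the median rule lets one outlier through.
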
@@ -1573,9 +1572,10 @@ var topicalAuthoritySignals = {
     const pages = ctx.sampledPages.slice(0, 10);
     let totalInternalLinks = 0;
     let pagesWithGoodLinking = 0;
+    const resolveBase = ctx.mode === "local" ? ctx.baseUrl.href : ctx.baseUrl.origin;
     for (const page of pages) {
-      const links = extractLinks(page.html, ctx.baseUrl.origin);
-      const internalLinks = ctx.mode === "local" ? links.filter((l) => !l.startsWith("http://") && !l.startsWith("https://")) : links.filter((l) => {
+      const links = extractLinks(page.html, resolveBase);
+      const internalLinks = ctx.mode === "local" ? links.filter((l) => l.startsWith("file:")) : links.filter((l) => {
         try {
           return new URL(l).origin === ctx.baseUrl.origin;
         } catch {
@@ -1671,7 +1671,8 @@ var eeatSignals = {
       const hasAuthorHtml = /class=["'][^"']*author[^"']*["']|rel=["']author["']/i.test(page.html);
       if (hasAuthorMeta || hasAuthorJsonLd || hasAuthorHtml) withAuthor++;
       const hasCredentials = /Ph\.?D|M\.?D|CPA|certified|licensed|expert|specialist/i.test(page.html);
-      const hasAboutPage = extractLinks(page.html, ctx.baseUrl.origin).some((l) => /about|team|author/i.test(l));
+      const linkBase = ctx.mode === "local" ? ctx.baseUrl.href : ctx.baseUrl.origin;
+      const hasAboutPage = extractLinks(page.html, linkBase).some((l) => /about|team|author/i.test(l));
       if (hasCredentials || hasAboutPage) withExpertise++;
     }
     const score = (withAuthor + withExpertise) / (pages.length * 2);
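
Both link-related hunks above switch the base passed to `extractLinks` from `ctx.baseUrl.origin` to `ctx.baseUrl.href` in local mode. A likely reason is that, in Node, a `file:` URL has an opaque origin that serializes as the string `"null"`, which cannot serve as a base for relative links. `extractLinks` is not shown in this diff, so how it resolves hrefs is an assumption; the sketch below only demonstrates the URL behavior, with an illustrative `/tmp/site/` path and a hypothetical `resolveAgainst` helper:

```javascript
// Assumption: a local audit rooted at /tmp/site/ (illustrative path only).
const base = new URL("file:///tmp/site/");

// In Node, file: URLs report an opaque origin, serialized as "null".
console.log(base.origin); // "null"

// Hypothetical helper: resolve href against a base string, or return undefined.
const resolveAgainst = (href, baseString) => {
  try {
    return new URL(href, baseString).href;
  } catch {
    return undefined;
  }
};

const viaOrigin = resolveAgainst("guide/intro.html", base.origin);
const viaHref = resolveAgainst("guide/intro.html", base.href);

console.log(viaOrigin); // undefined, because "null" is not a valid base URL
console.log(viaHref);   // "file:///tmp/site/guide/intro.html"
```

Resolved this way, local links start with `file:`, which is what the new `internalLinks` filter tests for. Note that a base without a trailing slash resolves relative to its parent: `new URL("about.html", "file:///tmp/site").href` is `file:///tmp/about.html`.
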
@@ -1742,6 +1743,15 @@ var canonicalUrlConsistency = {
   description: "Checks if pages have consistent canonical URLs",
   weight: 0.5,
   run: async (ctx) => {
+    if (ctx.mode === "local") {
+      return {
+        id: "canonical-url-consistency",
+        name: "Canonical URL Consistency",
+        category: "geo-signals",
+        status: "info",
+        message: "only meaningful for live urls. re-run against a deployed site to verify"
+      };
+    }
     const pages = ctx.sampledPages.slice(0, 10);
     let withCanonical = 0;
     let selfReferencing = 0;
@@ -2185,7 +2195,30 @@ var ALL_CHECKS = [
 // src/utils/local.ts
 import { readFileSync, readdirSync, existsSync } from "fs";
-import { join, relative, extname } from "path";
+import { dirname, join, relative, extname } from "path";
+var readIfExists = (path) => {
+  try {
+    if (existsSync(path)) {
+      return readFileSync(path, "utf-8");
+    }
+  } catch {
+  }
+  return void 0;
+};
+var findUpward = (start, names, maxDepth = 6) => {
+  let current = start;
+  for (let i = 0; i < maxDepth; i++) {
+    for (const name of names) {
+      const candidate = join(current, name);
+      const value = readIfExists(candidate);
+      if (value !== void 0) return value;
+    }
+    const parent = dirname(current);
+    if (parent === current) break;
+    current = parent;
+  }
+  return void 0;
+};
 var walkDir = (dir, extensions, maxDepth = 10) => {
   if (maxDepth <= 0) return [];
   const results = [];
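
The `findUpward` helper added above walks from the starting directory toward the filesystem root, at most `maxDepth` levels, and returns the first matching file's content. To illustrate the walk without touching disk, here is a sketch with the reader injected; `findUpwardIn`, `parentOf`, and the in-memory `files` map are all hypothetical, and POSIX-style absolute paths are assumed:

```javascript
// Hypothetical stand-in for path.dirname on POSIX-style absolute paths.
const parentOf = (dir) => {
  const idx = dir.lastIndexOf("/");
  return idx <= 0 ? "/" : dir.slice(0, idx);
};

// Same loop shape as findUpward, but `read` replaces readIfExists.
const findUpwardIn = (read, start, names, maxDepth = 6) => {
  let current = start;
  for (let i = 0; i < maxDepth; i++) {
    for (const name of names) {
      const candidate = current === "/" ? `/${name}` : `${current}/${name}`;
      const value = read(candidate);
      if (value !== undefined) return value;
    }
    const parent = parentOf(current);
    if (parent === current) break; // reached the root, nothing further up
    current = parent;
  }
  return undefined;
};

// Illustrative layout: AGENTS.md lives at the repo root, the site two levels down.
const files = new Map([
  ["/repo/AGENTS.md", "# repo-level agent notes"],
  ["/repo/docs/site/index.html", "<html></html>"],
]);
const read = (path) => files.get(path);

const found = findUpwardIn(read, "/repo/docs/site", ["AGENTS.md", "AGENT.md"]);
const tooShallow = findUpwardIn(read, "/repo/docs/site", ["AGENTS.md"], 2);

console.log(found);      // "# repo-level agent notes"
console.log(tooShallow); // undefined: only /repo/docs/site and /repo/docs were checked
```

In 0.1.0 only the audited directory itself was checked, so an `AGENTS.md` at the repository root was missed when auditing a subdirectory such as a build output folder.
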
@@ -2206,15 +2239,6 @@ var walkDir = (dir, extensions, maxDepth = 10) => {
   }
   return results;
 };
-var readIfExists = (path) => {
-  try {
-    if (existsSync(path)) {
-      return readFileSync(path, "utf-8");
-    }
-  } catch {
-  }
-  return void 0;
-};
 var buildLocalContext = (dirPath, config) => {
   const baseUrl = new URL(`file://${dirPath}`);
   const robotsTxt = readIfExists(join(dirPath, "robots.txt"));
@@ -2224,7 +2248,7 @@ var buildLocalContext = (dirPath, config) => {
   const mcpServerCard2 = readIfExists(join(dirPath, ".well-known", "mcp", "server-card.json"));
   const apiCatalog2 = readIfExists(join(dirPath, ".well-known", "api-catalog"));
   const agentSkillsIndex2 = readIfExists(join(dirPath, ".well-known", "agent-skills", "index.json"));
-  const agentsMd2 = readIfExists(join(dirPath, "AGENTS.md")) ?? readIfExists(join(dirPath, "AGENT.md"));
+  const agentsMd2 = findUpward(dirPath, ["AGENTS.md", "AGENT.md"]);
   const sitemapUrls = sitemapXml ? parseSitemapUrls(sitemapXml) : [];
   if (!sitemapXml && robotsTxt) {
     const sitemapMatch = robotsTxt.match(/Sitemap:\s*(.+)/i);
@@ -2241,8 +2265,8 @@ var buildLocalContext = (dirPath, config) => {
   }
   const htmlFiles = walkDir(dirPath, /* @__PURE__ */ new Set([".html", ".htm"]));
   const mdFiles = walkDir(dirPath, /* @__PURE__ */ new Set([".md", ".mdx"]));
-  const allFiles = [...htmlFiles, ...mdFiles];
-  const sampled = allFiles.slice(0, config.sampleSize);
+  const sampleSource = htmlFiles.length > 0 ? htmlFiles : mdFiles;
+  const sampled = sampleSource.slice(0, config.sampleSize);
   const sampledPages = sampled.map((filePath) => {
     const content = readFileSync(filePath, "utf-8");
     const relPath = relative(dirPath, filePath);

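The last hunk of `dist/index.js` changes which files are sampled in local mode: 0.1.0 concatenated HTML and markdown files before slicing, while 0.1.1 samples markdown only when no HTML files exist. A minimal sketch of both behaviors, using hypothetical function names and illustrative file lists:

```javascript
// 0.1.0 behavior: HTML first, then markdown fills any remaining sample slots.
const pickSampleOld = (htmlFiles, mdFiles, sampleSize) =>
  [...htmlFiles, ...mdFiles].slice(0, sampleSize);

// 0.1.1 behavior: markdown is a fallback, used only when there is no HTML at all.
const pickSampleNew = (htmlFiles, mdFiles, sampleSize) => {
  const source = htmlFiles.length > 0 ? htmlFiles : mdFiles;
  return source.slice(0, sampleSize);
};

const html = ["index.html", "guide.html"];
const md = ["README.md", "CHANGELOG.md"];

console.log(pickSampleOld(html, md, 3)); // [ 'index.html', 'guide.html', 'README.md' ]
console.log(pickSampleNew(html, md, 3)); // [ 'index.html', 'guide.html' ]
console.log(pickSampleNew([], md, 3));   // [ 'README.md', 'CHANGELOG.md' ]
```

One plausible motivation is that checks which inspect `page.html` behave poorly when handed raw markdown, though the diff itself does not state the reason.
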
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@agentimization/core",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "GEO audit engine. Check if your website is agent-ready and generative-engine optimized.",
   "keywords": [
     "geo",