ax-audit 3.1.0 → 3.6.0

Files changed (58)
  1. package/CHANGELOG.md +60 -0
  2. package/README.md +61 -225
  3. package/dist/checks/agent-access.d.ts +16 -0
  4. package/dist/checks/agent-access.d.ts.map +1 -0
  5. package/dist/checks/agent-access.js +110 -0
  6. package/dist/checks/agent-access.js.map +1 -0
  7. package/dist/checks/crawl-efficiency.d.ts +4 -0
  8. package/dist/checks/crawl-efficiency.d.ts.map +1 -0
  9. package/dist/checks/crawl-efficiency.js +122 -0
  10. package/dist/checks/crawl-efficiency.js.map +1 -0
  11. package/dist/checks/index.d.ts.map +1 -1
  12. package/dist/checks/index.js +6 -0
  13. package/dist/checks/index.js.map +1 -1
  14. package/dist/checks/robots-txt.d.ts +20 -0
  15. package/dist/checks/robots-txt.d.ts.map +1 -1
  16. package/dist/checks/robots-txt.js +111 -3
  17. package/dist/checks/robots-txt.js.map +1 -1
  18. package/dist/checks/rsl.d.ts +6 -0
  19. package/dist/checks/rsl.d.ts.map +1 -0
  20. package/dist/checks/rsl.js +252 -0
  21. package/dist/checks/rsl.js.map +1 -0
  22. package/dist/cli.d.ts.map +1 -1
  23. package/dist/cli.js +20 -2
  24. package/dist/cli.js.map +1 -1
  25. package/dist/constants.d.ts +17 -0
  26. package/dist/constants.d.ts.map +1 -1
  27. package/dist/constants.js +39 -1
  28. package/dist/constants.js.map +1 -1
  29. package/dist/fetcher.d.ts +5 -1
  30. package/dist/fetcher.d.ts.map +1 -1
  31. package/dist/fetcher.js +32 -27
  32. package/dist/fetcher.js.map +1 -1
  33. package/dist/index.d.ts +2 -1
  34. package/dist/index.d.ts.map +1 -1
  35. package/dist/index.js +1 -0
  36. package/dist/index.js.map +1 -1
  37. package/dist/orchestrator.d.ts +2 -2
  38. package/dist/orchestrator.d.ts.map +1 -1
  39. package/dist/orchestrator.js +13 -6
  40. package/dist/orchestrator.js.map +1 -1
  41. package/dist/reporter/index.d.ts.map +1 -1
  42. package/dist/reporter/index.js +7 -0
  43. package/dist/reporter/index.js.map +1 -1
  44. package/dist/reporter/markdown.d.ts +8 -0
  45. package/dist/reporter/markdown.d.ts.map +1 -0
  46. package/dist/reporter/markdown.js +76 -0
  47. package/dist/reporter/markdown.js.map +1 -0
  48. package/dist/types.d.ts +7 -1
  49. package/dist/types.d.ts.map +1 -1
  50. package/docs/api.md +200 -0
  51. package/docs/architecture.md +88 -0
  52. package/docs/checks.md +322 -0
  53. package/docs/ci.md +89 -0
  54. package/docs/cli.md +67 -0
  55. package/docs/concepts.md +87 -0
  56. package/docs/faq.md +77 -0
  57. package/docs/getting-started.md +101 -0
  58. package/package.json +2 -1
package/docs/ci.md ADDED
@@ -0,0 +1,89 @@
1
+ # CI Integration
2
+
3
+ ax-audit's exit codes (see [cli.md](./cli.md)) make it a drop-in quality gate: `0` for Good/Excellent, `1` for Fair/Poor or regressions.
4
+
5
+ ## GitHub Actions
6
+
7
+ ### Basic gate
8
+
9
+ ```yaml
10
+ - name: AX Audit
11
+ run: npx ax-audit https://your-site.com
12
+ # Fails the step if the score < 70
13
+ ```
14
+
15
+ ### Regression gate with a committed baseline
16
+
17
+ Commit `.ax-baseline.json` to the repo and fail the build only when a check drops by more than the threshold (5 points in this example):
18
+
19
+ ```yaml
20
+ - name: AX Audit (regression gate)
21
+ run: npx ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 5
22
+ ```
23
+
24
+ Refresh the baseline deliberately (e.g., after intentional changes):
25
+
26
+ ```bash
27
+ npx ax-audit https://your-site.com --save-baseline .ax-baseline.json
28
+ git add .ax-baseline.json && git commit -m "chore: refresh AX baseline"
29
+ ```
30
+
31
+ ### Markdown report as a PR comment
32
+
33
+ ```yaml
34
+ - name: AX Audit (markdown)
35
+ run: npx ax-audit ${{ env.PREVIEW_URL }} --output markdown > ax-report.md
36
+ continue-on-error: true
37
+
38
+ - name: Comment PR
39
+ uses: marocchino/sticky-pull-request-comment@v2
40
+ with:
41
+ path: ax-report.md
42
+ ```
43
+
44
+ This pairs naturally with Vercel/Netlify preview deployments: audit the preview URL on every PR and the reviewer sees the AX impact inline.
45
+
46
+ ### Artifacts
47
+
48
+ ```yaml
49
+ - name: AX Audit (JSON)
50
+ run: npx ax-audit https://your-site.com --json > ax-report.json
51
+
52
+ - uses: actions/upload-artifact@v4
53
+ with:
54
+ name: ax-audit-report
55
+ path: ax-report.json
56
+ ```
57
+
58
+ ## Auditing multiple environments
59
+
60
+ ```yaml
61
+ - name: AX Audit (all properties)
62
+ run: npx ax-audit https://www.your-site.com https://docs.your-site.com https://api.your-site.com --concurrency 3
63
+ # Exit 1 if any property scores < 70
64
+ ```
65
+
66
+ ## Tuning for CI stability
67
+
68
+ - `--retries 3` absorbs transient 5xx/timeouts from cold preview deployments (default is 2).
69
+ - `--timeout 15000` for slow staging environments.
70
+ - `--checks ...` to gate only on the surface you are iterating on — but remember the overall score then averages only the selected checks.
71
+
72
+ ## Scheduled audits
73
+
74
+ A weekly audit catches drift from infrastructure changes (CDN settings, WAF rules, header changes deployed by other teams):
75
+
76
+ ```yaml
77
+ on:
78
+ schedule:
79
+ - cron: '0 6 * * 1'
80
+
81
+ jobs:
82
+ ax-audit:
83
+ runs-on: ubuntu-latest
84
+ steps:
85
+ - uses: actions/checkout@v4
86
+ - run: npx ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 0
87
+ ```
88
+
89
+ `--fail-on-regression 0` makes any per-check drop fail the workflow — appropriate for scheduled runs where every change is unexpected.
package/docs/cli.md ADDED
@@ -0,0 +1,67 @@
1
+ # CLI Reference
2
+
3
+ ```bash
4
+ ax-audit <urls...> [options]
5
+ ```
6
+
7
+ One or more fully qualified URLs (scheme required). A single URL produces a full report; multiple URLs run in batch mode with a summary table.
8
+
9
+ ## Options
10
+
11
+ | Flag | Default | Description |
12
+ | --- | --- | --- |
13
+ | `--output <format>` | `terminal` | Output format: `terminal`, `json`, `html`, `markdown`. Invalid values error out. |
14
+ | `--json` | — | Shorthand for `--output json`. |
15
+ | `--checks <list>` | all | Comma-separated check IDs to run (see [checks.md](./checks.md)). Unknown IDs error with the list of valid ones. |
16
+ | `--timeout <ms>` | `10000` | Per-request timeout in milliseconds. |
17
+ | `--retries <n>` | `2` | Retry attempts for transient fetch failures (network errors, timeouts, 408/425/429/5xx) with exponential backoff from 250ms. `0` disables retries. |
18
+ | `--concurrency <n>` | `1` | Batch mode only: maximum URLs audited in parallel. Output order always matches input order. |
19
+ | `--verbose` | — | Log every HTTP request, cache hit, retry, and per-check score to stderr. |
20
+ | `--only-failures` | — | Hide passing findings; checks with only passes are omitted entirely. |
21
+ | `--save-baseline <path>` | — | Save this audit as a baseline JSON file. |
22
+ | `--baseline <path>` | — | Compare against a saved baseline; shows per-check deltas (▲/▼). Single-URL mode only. |
23
+ | `--fail-on-regression <points>` | — | Exit 1 if any check regresses more than N points vs the baseline. Requires `--baseline`. |
24
+ | `-v, --version` | — | Print version. |
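The `--retries` row above describes a retry policy for transient failures with exponential backoff starting at 250ms. A minimal TypeScript sketch of that policy follows; it is not taken from the package source. The names `isTransientStatus` and `backoffDelays` are hypothetical, and the doubling factor is an assumption, since the table only states the 250ms starting point.

```typescript
// Hypothetical helpers illustrating the retry policy described for --retries.
// Assumption: the delay doubles on each attempt; the docs only say "from 250ms".

/** Statuses the table lists as transient: 408, 425, 429 and any 5xx. */
function isTransientStatus(status: number): boolean {
  return (
    status === 408 ||
    status === 425 ||
    status === 429 ||
    (status >= 500 && status <= 599)
  );
}

/** Delay in milliseconds before each retry; an empty list means retries are off. */
function backoffDelays(retries: number, baseMs: number = 250): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(baseMs * Math.pow(2, attempt));
  }
  return delays;
}

console.log(backoffDelays(2)); // [ 250, 500 ]
```

Under these assumptions, the default of 2 retries adds at most 750ms of waiting per request, and `--retries 0` produces no delays at all.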
25
+
26
+ ## Output formats
27
+
28
+ - **terminal** — colored report with score bar, per-check sections, and PASS/WARN/FAIL findings.
29
+ - **json** — the full `AuditReport` (plus `baselineDiff` when `--baseline` is used). Stable shape for CI pipelines.
30
+ - **html** — self-contained page (score gauge, dark/light mode, collapsible sections). Pipe to a file: `ax-audit <url> --output html > report.html`.
31
+ - **markdown** — summary table + per-check findings with status emoji. Built for CI logs and PR comments: `ax-audit <url> --output markdown > report.md`.
32
+
33
+ ## Exit codes
34
+
35
+ | Code | Meaning |
36
+ | --- | --- |
37
+ | `0` | Score ≥ 70 (single), or all URLs ≥ 70 (batch), and no regression beyond the `--fail-on-regression` threshold. |
38
+ | `1` | Score < 70, any batch URL < 70, invalid arguments, or regression beyond threshold. |
39
+ | `2` | Fatal error (network failure on the audit itself, unreadable baseline file). |
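The table above can be read as a small decision function. The sketch below is an illustration only, not the package's implementation: `GateInput` and `exitCodeFor` are hypothetical names, and code `2` (fatal errors) is deliberately left out because it is raised before any score exists.

```typescript
// Hypothetical sketch of the 0/1 rows of the exit-code table.
const PASS_THRESHOLD = 70; // "Good" or better

interface GateInput {
  scores: number[];          // one overall score per audited URL
  worstRegression?: number;  // largest per-check drop vs the baseline, in points
  failOnRegression?: number; // value passed to --fail-on-regression, if any
}

function exitCodeFor(input: GateInput): 0 | 1 {
  // Single mode has one score; batch mode fails if any URL is below the threshold.
  const belowThreshold = input.scores.some((score) => score < PASS_THRESHOLD);
  // "Regresses more than N points": a drop equal to N still passes.
  const regressed =
    input.failOnRegression !== undefined &&
    input.worstRegression !== undefined &&
    input.worstRegression > input.failOnRegression;
  return belowThreshold || regressed ? 1 : 0;
}

console.log(exitCodeFor({ scores: [82, 64] })); // 1
```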
40
+
41
+ ## Baseline workflow
42
+
43
+ ```bash
44
+ # First run — record the baseline
45
+ ax-audit https://your-site.com --save-baseline .ax-baseline.json
46
+
47
+ # Subsequent runs — compare and gate
48
+ ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 5
49
+ ```
50
+
51
+ The baseline stores the overall score and per-check scores. Checks added after the baseline was saved appear as new (no delta); removed checks are ignored.
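The comparison rules in the paragraph above can be sketched as follows. This is a hedged illustration, not the package's code: `Scores`, `CheckDelta`, `diffAgainstBaseline`, and `regressedBeyond` are hypothetical names, and the baseline is modelled as a plain map of check ID to score.

```typescript
// Hypothetical model of a baseline comparison; the real baseline file format may differ.
type Scores = Record<string, number>;

interface CheckDelta {
  id: string;
  delta: number | null; // null marks a check that is new since the baseline
}

function diffAgainstBaseline(baseline: Scores, current: Scores): CheckDelta[] {
  // Iterating over the current run means checks removed since the baseline are ignored.
  return Object.entries(current).map(([id, score]): CheckDelta => {
    const previous = baseline[id];
    return { id, delta: previous === undefined ? null : score - previous };
  });
}

/** True when any check dropped by more than `points` (the --fail-on-regression value). */
function regressedBeyond(deltas: CheckDelta[], points: number): boolean {
  return deltas.some((d) => d.delta !== null && -d.delta > points);
}

const deltas = diffAgainstBaseline(
  { "llms-txt": 80, "robots-txt": 60, removed: 90 },
  { "llms-txt": 74, "robots-txt": 60, rsl: 10 },
);
console.log(regressedBeyond(deltas, 5)); // true: llms-txt dropped 6 points
```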
52
+
53
+ ## Examples
54
+
55
+ ```bash
56
+ # Quick audit
57
+ npx ax-audit https://your-site.com
58
+
59
+ # Only the AI-licensing surface
60
+ npx ax-audit https://your-site.com --checks robots-txt,rsl,content-negotiation
61
+
62
+ # Batch, 4 at a time, machine-readable
63
+ npx ax-audit $(cat urls.txt) --concurrency 4 --json > batch.json
64
+
65
+ # Show me only what is broken
66
+ npx ax-audit https://your-site.com --only-failures
67
+ ```
package/docs/concepts.md ADDED
@@ -0,0 +1,87 @@
1
+ # Concepts: the AX standards landscape
2
+
3
+ "AI Agent Experience" (AX) is the sum of the conventions a site uses to be discovered, read, governed, and transacted with by autonomous AI agents and crawlers — the way "web accessibility" is the sum of conventions for assistive technology. This page maps the standards ax-audit checks against, why each exists, and how they relate. It's the conceptual companion to the mechanical detail in [checks.md](./checks.md).
4
+
5
+ ## Why AX is its own discipline
6
+
7
+ Agents are not browsers. Three differences drive every check:
8
+
9
+ 1. **They mostly don't run JavaScript.** GPTBot, ClaudeBot, CCBot and most crawlers fetch raw HTML. A client-rendered SPA that returns an empty `<div id="root">` is, to them, a blank page. (`html-rendering`, `content-negotiation`)
10
+ 2. **They look for declared structure, not visual layout.** An agent would rather read a `/llms.txt` summary or a JSON-LD graph than infer meaning from your CSS grid. (`llms-txt`, `structured-data`, `meta-tags`, `agent-json`, `mcp`, `openapi`)
11
+ 3. **Their access is a policy and economic question, not just a technical one.** Who may crawl, for what use, at what price, under what license — these now have machine-readable answers. (`robots-txt`, Content Signals, `rsl`, `agent-access`)
12
+
13
+ Bot traffic is projected to exceed human traffic by 2029. AX is the interface layer for that shift.
14
+
15
+ ## The four families of standards
16
+
17
+ ### 1. Content discovery & readability
18
+
19
+ | Standard | What it is | Check |
20
+ | --- | --- | --- |
21
+ | **[llms.txt](https://llmstxt.org)** | A Markdown file at your root summarizing your site for LLMs, with curated links. The "sitemap for AI." | `llms-txt` |
22
+ | **Server-side rendering** | Delivering real content in the HTML response, not assembling it client-side. | `html-rendering` |
23
+ | **[Markdown for Agents](https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/)** | Content negotiation: serve clean Markdown when a client sends `Accept: text/markdown`. ~80% fewer tokens than HTML. | `content-negotiation` |
24
+ | **schema.org / JSON-LD** | Structured data describing entities (Person, Organization, Product) in a graph agents can parse. | `structured-data` |
25
+ | **Sitemaps** | The classic XML index, still how crawlers enumerate your URLs. | `sitemap` |
26
+
27
+ These answer: *can an agent find your content and actually read it?*
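To make the content-negotiation row concrete, here is a minimal server-side sketch of deciding whether a request asks for Markdown. It is an illustration under assumptions, not code from ax-audit or from any particular platform: `acceptsMarkdown` is a hypothetical helper, and it only checks for an explicit `text/markdown` entry with a non-zero quality value rather than implementing full `Accept` ranking.

```typescript
// Hypothetical helper: does the Accept header explicitly ask for text/markdown?
// Simplification: wildcards (*/*) and relative q-value ranking are not considered.
function acceptsMarkdown(acceptHeader: string): boolean {
  return acceptHeader.split(",").some((entry) => {
    const [mediaType, ...params] = entry
      .split(";")
      .map((part) => part.trim().toLowerCase());
    if (mediaType !== "text/markdown") return false;
    const q = params.find((p) => p.startsWith("q="));
    // A missing q parameter means full preference; q=0 means "not acceptable".
    return q === undefined || Number(q.slice(2)) > 0;
  });
}

console.log(acceptsMarkdown("text/markdown, text/html;q=0.8")); // true
console.log(acceptsMarkdown("text/html"));                      // false
```

A handler built on this would return the Markdown representation when the function is true and fall back to HTML otherwise.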
28
+
29
+ ### 2. Agent interaction surface
30
+
31
+ | Standard | What it is | Check |
32
+ | --- | --- | --- |
33
+ | **[A2A — Agent2Agent](https://a2a-protocol.org)** | An "Agent Card" at `/.well-known/agent.json` advertising your agent's identity and skills, so other agents can interoperate. | `agent-json` |
34
+ | **[MCP — Model Context Protocol](https://modelcontextprotocol.io)** | A manifest at `/.well-known/mcp.json` describing tools and resources an agent can call. The emerging standard for exposing capabilities to LLMs. | `mcp` |
35
+ | **[OpenAPI](https://www.openapis.org)** | The long-standing machine-readable API description; agents use it to call your endpoints. | `openapi` |
36
+ | **Emerging discovery files** | `ai.txt`, `genai.txt`, `ai-plugin.json`, `agents.json`, `nlweb.json` — competing/early conventions, scored as coverage bonus. | `well-known-ai` |
37
+ | **AI meta tags & discovery links** | `ai:*` meta tags and `rel="alternate"` links pointing agents to your llms.txt / agent.json. | `meta-tags` |
38
+
39
+ These answer: *once an agent arrives, can it understand what you offer and act on it?*
40
+
41
+ ### 3. Access governance & licensing
42
+
43
+ This is the newest and fastest-moving family — the response to "AI scraped my content and now competes with me."
44
+
45
+ | Standard | What it is | Check |
46
+ | --- | --- | --- |
47
+ | **[Robots Exclusion Protocol](https://www.rfc-editor.org/rfc/rfc9309.html)** | The original robots.txt — *who* may crawl *what*. ax-audit grades coverage of 48 known AI crawlers. | `robots-txt` |
48
+ | **[Content Signals](https://contentsignals.org)** | A robots.txt extension (Cloudflare, CC0) declaring *how* content may be used after access: `search`, `ai-input`, `ai-train`. Served by default on 3.8M+ Cloudflare domains. | `robots-txt` (findings) |
49
+ | **[RSL — Really Simple Licensing](https://rslstandard.org)** | A full machine-readable licensing layer (license.xml): permits/prohibits vocabularies, payment models (free, attribution, pay-per-crawl, pay-per-inference). Endorsed by 1,500+ publishers. | `rsl` |
50
+ | **Cloaking integrity** | Not a standard but a failure mode: your stated policy (robots.txt allows GPTBot) contradicting enforcement (WAF returns 403). | `agent-access` |
51
+
52
+ These answer: *have you expressed your access and usage policy in a form agents can honor — and does your infrastructure actually match it?*
53
+
54
+ The progression is one of increasing expressiveness: robots.txt says **who/where**, Content Signals adds **how it may be used**, RSL adds **under what license and price**.
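As an illustration of the middle step in that progression, the sketch below parses a Content Signals line from a robots.txt file. It assumes the directive is written as `Content-Signal: search=yes, ai-train=no` (comma-separated `key=yes|no` pairs); check the Content Signals site for the authoritative syntax. `parseContentSignal` is a hypothetical name and is not part of ax-audit.

```typescript
// Hypothetical parser for one robots.txt line.
// Assumption: signals appear as "Content-Signal: key=yes|no, key=yes|no".
type Signal = "search" | "ai-input" | "ai-train";

function parseContentSignal(line: string): Partial<Record<Signal, boolean>> {
  const result: Partial<Record<Signal, boolean>> = {};
  const match = /^\s*content-signal\s*:\s*(.*)$/i.exec(line);
  if (match === null) return result; // not a Content-Signal line
  for (const pair of (match[1] ?? "").split(",")) {
    const [rawKey, rawValue] = pair
      .split("=")
      .map((part) => part.trim().toLowerCase());
    const knownKey =
      rawKey === "search" || rawKey === "ai-input" || rawKey === "ai-train";
    if (knownKey && (rawValue === "yes" || rawValue === "no")) {
      result[rawKey] = rawValue === "yes";
    }
  }
  return result;
}

// search is allowed, ai-train is disallowed, ai-input is left unset.
const signals = parseContentSignal("Content-Signal: search=yes, ai-train=no");
console.log(signals.search, signals["ai-train"], signals["ai-input"]);
```

Unknown keys and malformed values are skipped rather than rejected, which keeps the sketch tolerant of future signal names.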
55
+
56
+ ### 4. Transport, efficiency & hygiene
57
+
58
+ | Standard | What it is | Check |
59
+ | --- | --- | --- |
60
+ | **TLS / HSTS** | HTTPS everywhere; many agents refuse plaintext origins. | `tls-https` |
61
+ | **HTTP security & discovery headers** | Security headers plus `Link` headers advertising your AI files. | `http-headers` |
62
+ | **Compression & conditional GET** | Brotli/gzip and `ETag`/`304` — crawl cost matters when bots dominate traffic. | `crawl-efficiency` |
63
+ | **[RFC 9116 security.txt](https://www.rfc-editor.org/rfc/rfc9116)** | A machine-readable security contact. | `security-txt` |
64
+ | **SEO basics** | Title, description, canonical, lang, hreflang — agents use the same head-tag fundamentals search engines do. | `seo-basics` |
65
+
66
+ These answer: *is the connection trustworthy, cheap, and well-formed?*
67
+
68
+ ## On the horizon (not yet scored)
69
+
70
+ Two standards are maturing and worth watching:
71
+
72
+ - **[Web Bot Auth](https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-architecture/)** — cryptographic crawler verification via HTTP Message Signatures (RFC 9421). Bots sign requests with a key published at `/.well-known/http-message-signatures-directory`; sites verify identity instead of guessing from user-agent strings. Already implemented by Cloudflare and Google (`agent.bot.goog`). It directly affects the `agent-access` check: a WAF using Web Bot Auth may pass a real, signed crawler while rejecting ax-audit's unsigned probe — which is why that check's findings carry an explicit verified-bots caveat.
73
+ - **Pay-per-crawl / HTTP 402** — Cloudflare and the RSL payment vocabulary point toward metered, paid agent access. RSL already encodes the terms; enforcement protocols (Open License Protocol, x402) are emerging.
74
+
75
+ ## How the families compose
76
+
77
+ A fully AX-ready site tells a coherent story across all four:
78
+
79
+ > "Here's my content in a form you can read **(family 1)**, here's the interface to interact with me **(family 2)**, here's exactly who may use it and how, for what license **(family 3)**, over a fast and trustworthy connection **(family 4)**."
80
+
81
+ ax-audit's weighting reflects today's leverage: discovery and readability (`llms-txt`, `robots-txt`, `html-rendering`, `structured-data`, `http-headers`) carry the most weight because they're the highest-impact, most-adopted signals. The governance and efficiency standards are informational in 3.x — real and worth adopting, but still stabilizing — and gain weight in v4.0.
82
+
83
+ ## See also
84
+
85
+ - [getting-started.md](./getting-started.md) — run your first audit
86
+ - [checks.md](./checks.md) — exact scoring per standard
87
+ - The [remediation guides](https://lucioduran.com/projects/ax-audit/guides) — how to implement each one
package/docs/faq.md ADDED
@@ -0,0 +1,77 @@
1
+ # FAQ & Troubleshooting
2
+
3
+ ## Scores & results
4
+
5
+ ### Why did my score change after upgrading ax-audit?
6
+
7
+ In 3.x, score changes on the same site are treated as **breaking** and only happen in major or minor releases that explicitly say so. The 3.0.0 release redistributed weights across 14 checks and added Content-Type penalties — see its CHANGELOG entry. Every check added since (3.1.0–3.6.0) ships at **weight 0** precisely so your score and baselines don't move. To track changes deliberately, use `--baseline` (see [cli.md](./cli.md)).
8
+
9
+ ### Why is my score lower than my Lighthouse / SEO score?
10
+
11
+ ax-audit measures the *AI-agent* surface, not performance, accessibility, or human SEO. A fast, beautiful site can still score poorly if it has no `llms.txt`, ships an empty SPA shell to non-JS crawlers, and exposes no structured data. That gap is the reason the tool exists.
12
+
13
+ ### A check shows 0 but the file exists — why?
14
+
15
+ Most "not found" hard-fails mean the request didn't return a 2xx. Common causes: the file sits behind a redirect chain that breaks, the server returns an error status for that path, or, most often, a WAF/bot-rule is blocking ax-audit's request (see below). Re-run with `--verbose` to see the exact status per request.
16
+
17
+ ### What's the difference between a weighted and an informational check?
18
+
19
+ Weighted checks (14) sum to 100% and determine your overall score. Informational checks (4: `content-negotiation`, `rsl`, `agent-access`, `crawl-efficiency`) run and report full findings but contribute 0 to the score in 3.x. They gain weight in v4.0. The Content Signals findings inside `robots-txt` are likewise informational.
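A rough sketch of how weights and informational checks could combine into one number follows. It is not the package's scoring code: `CheckResult` and `overallScore` are hypothetical, and both the normalisation by the sum of selected weights and the rounding are assumptions consistent with the statement that a `--checks` subset averages only the selected checks.

```typescript
// Hypothetical scoring sketch; weights are percentage points, 0 for informational checks.
interface CheckResult {
  id: string;
  score: number;  // 0-100
  weight: number; // e.g. 11 for llms-txt, 0 for rsl
}

function overallScore(results: CheckResult[]): number {
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  // Assumed behavior: with only informational checks selected there is nothing to average.
  if (totalWeight === 0) return 0;
  const weighted = results.reduce((sum, r) => sum + r.score * r.weight, 0);
  return Math.round(weighted / totalWeight);
}

console.log(
  overallScore([
    { id: "llms-txt", score: 0, weight: 11 },
    { id: "robots-txt", score: 100, weight: 11 },
    { id: "rsl", score: 40, weight: 0 }, // informational: no effect on the result
  ]),
); // 50
```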
20
+
21
+ ## False positives & caveats
22
+
23
+ ### `agent-access` flags crawlers as blocked, but my real crawlers work fine
24
+
25
+ This is the most important caveat in the tool. ax-audit's probe sends a user-agent *containing* the crawler token (e.g. `...GPTBot/1.0`) but it is **not** the real, verified crawler. If your WAF verifies bots cryptographically ([Web Bot Auth](./concepts.md)) or by IP range, it will correctly pass the genuine GPTBot while rejecting ax-audit's unverified probe. **Before changing any WAF rule, confirm against your WAF logs** whether real crawler traffic is actually being served. If it is, this finding is a false positive for your setup.
26
+
27
+ ### `well-known-ai` is low — should I worry?
28
+
29
+ No. It's scored as a *coverage bonus* over five emerging, partly competing files (`ai.txt`, `genai.txt`, `ai-plugin.json`, `agents.json`, `nlweb.json`). None is universally adopted; a low score here is not a defect. Implement the ones relevant to your stack.
30
+
31
+ ### `crawl-efficiency` says no compression, but my CDN compresses
32
+
33
+ The check reads the `Content-Encoding` header on the response it received. If a proxy between ax-audit and your origin strips or fails to negotiate compression, you'll see this. Verify directly: `curl -sI -H 'Accept-Encoding: br, gzip' https://your-site.com | grep -i content-encoding`.
34
+
35
+ ### `content-negotiation` fails but I don't serve Markdown
36
+
37
+ That's expected — most sites don't yet. It's informational (weight 0). Adopt it when you're ready; the [guide](https://lucioduran.com/projects/ax-audit/guides/content-negotiation) covers Cloudflare/Vercel zero-code options.
38
+
39
+ ## Running the tool
40
+
41
+ ### My WAF is blocking ax-audit itself
42
+
43
+ ax-audit sends a `User-Agent` of `ax-audit/<version> (+https://github.com/lucioduran/ax-audit)`. If your firewall challenges unknown agents, allowlist that UA (or the IP you run from) for the duration of the audit. Note that several checks deliberately send *other* user-agents (`agent-access`) and unusual `Accept` headers (`content-negotiation`) — a WAF rejecting those is itself a finding, not a tool bug.
44
+
45
+ ### How do I audit a staging site behind auth?
46
+
47
+ ax-audit has no auth support today. Options: run it from inside the network perimeter, temporarily allowlist its UA/IP, or audit a public preview deployment (the typical CI pattern — see [ci.md](./ci.md)).
48
+
49
+ ### Audits are slow / flaky on cold deployments
50
+
51
+ Transient failures (timeouts, 5xx) retry automatically with backoff — raise `--retries` (default 2) for very cold preview environments and `--timeout` (default 10000ms) for slow origins. In batch mode, `--concurrency` speeds up multi-URL runs.
52
+
53
+ ### Can I run only some checks?
54
+
55
+ Yes: `--checks llms-txt,robots-txt,rsl`. Note the overall score then averages *only* those checks, so a subset run isn't comparable to a full-audit score. Unknown IDs error out with the valid list.
56
+
57
+ ### Is there rate limiting I should know about?
58
+
59
+ The tool itself doesn't rate-limit, but it makes several requests per audit (one per check, plus follow-ups for conditional GET, content negotiation, and the 8 `agent-access` probes). All responses are cached per run, so repeated checks of the same URL don't re-fetch. Be considerate auditing sites you don't own.
60
+
61
+ ## Integration
62
+
63
+ ### Does it work in CI?
64
+
65
+ Yes — exit codes gate the build (`0` = Good/Excellent, `1` = Fair/Poor). See [ci.md](./ci.md) for GitHub Actions recipes including PR comments via `--output markdown` and regression gates via `--baseline`.
66
+
67
+ ### Can I consume results programmatically?
68
+
69
+ Yes — `import { audit } from 'ax-audit'` returns a typed `AuditReport`. See [api.md](./api.md).
70
+
71
+ ### How do I generate the files ax-audit checks for?
72
+
73
+ Use [ax-init](https://github.com/lucioduran/ax-init) — it generates `llms.txt`, `robots.txt`, `agent.json`, `mcp.json`, `security.txt`, structured data, and header snippets, then you verify with `npx ax-audit`.
74
+
75
+ ## Still stuck?
76
+
77
+ Open an issue at [github.com/lucioduran/ax-audit/issues](https://github.com/lucioduran/ax-audit/issues) with the output of `npx ax-audit <url> --verbose`.
package/docs/getting-started.md ADDED
@@ -0,0 +1,101 @@
1
+ # Getting Started
2
+
3
+ This walkthrough takes you from zero to a passing AX score: run your first audit, learn to read the report, and fix findings in the order that moves your score most.
4
+
5
+ ## 1. Run your first audit
6
+
7
+ No install needed:
8
+
9
+ ```bash
10
+ npx ax-audit https://your-site.com
11
+ ```
12
+
13
+ You get a report like:
14
+
15
+ ```
16
+ AX Audit Report
17
+ https://your-site.com
18
+
19
+ ██████████████████████░░░░░░░░░░░░░░░░░░ 56/100 Fair
20
+
21
+ LLMs.txt (0/100)
22
+ FAIL /llms.txt not found
23
+ ...
24
+ ```
25
+
26
+ Three things to locate immediately:
27
+
28
+ - **The overall score and grade.** 0–100, weighted across 14 checks. Grades: Excellent (≥90), Good (≥70), Fair (≥50), Poor (<50). The CLI exits `0` at Good or better — that is the CI gate.
29
+ - **Per-check scores.** Each check is independent and scored 0–100. The weight of each check is in [checks.md](./checks.md).
30
+ - **Findings.** Every `WARN`/`FAIL` line carries a hint and a `learnMoreUrl` to a remediation guide with copy-pasteable fixes.
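The grade boundaries listed above map to a simple lookup. The sketch below is illustrative only; `Grade` and `gradeFor` are hypothetical names rather than exports of the package.

```typescript
// Hypothetical mapping from an overall score to the grade names used in the report.
type Grade = "Excellent" | "Good" | "Fair" | "Poor";

function gradeFor(score: number): Grade {
  if (score >= 90) return "Excellent";
  if (score >= 70) return "Good"; // the CLI exits 0 from here upward
  if (score >= 50) return "Fair";
  return "Poor";
}

console.log(gradeFor(56)); // Fair
```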
31
+
32
+ To see only what needs fixing:
33
+
34
+ ```bash
35
+ npx ax-audit https://your-site.com --only-failures
36
+ ```
37
+
38
+ ## 2. Understand what you're optimizing
39
+
40
+ AI agents interact with your site differently than browsers: most don't execute JavaScript, they look for machine-readable discovery files, and they respect (or at least read) your declared crawler policy. The audit measures three layers — if you're new to the standards involved (llms.txt, A2A, MCP, RSL, Content Signals), read [concepts.md](./concepts.md) first:
41
+
42
+ 1. **Can agents find and read your content?** (`html-rendering`, `robots-txt`, `sitemap`, `tls-https`, `agent-access`)
43
+ 2. **Did you publish the AI-specific surface?** (`llms-txt`, `agent-json`, `mcp`, `openapi`, `well-known-ai`, `meta-tags`, `structured-data`)
44
+ 3. **Is the interaction efficient and well-governed?** (`content-negotiation`, `crawl-efficiency`, `rsl`, Content Signals, `http-headers`, `security-txt`, `seo-basics`)
45
+
46
+ ## 3. Fix in impact order
47
+
48
+ The fastest path from Fair to Good, by weight and typical effort:
49
+
50
+ | Step | Check | Weight | Typical effort |
51
+ | --- | --- | --- | --- |
52
+ | 1 | Create `/llms.txt` | 11% | 30 minutes — it's a Markdown file. `npx ax-init` generates it. |
53
+ | 2 | Configure `robots.txt` for the 8 core AI crawlers | 11% | 15 minutes; `npx ax-init` generates this too |
54
+ | 3 | Verify server-rendered content | 9% | Free if you SSR; significant if you ship an SPA shell |
55
+ | 4 | Add JSON-LD structured data | 9% | 1–2 hours |
56
+ | 5 | Security + discovery headers | 9% | 30 minutes of server config |
57
+ | 6 | `agent.json` + `mcp.json` | 14% combined | An hour with the spec links in the guides |
58
+
59
+ The remaining weighted checks (`seo-basics`, `security-txt`, `meta-tags`, `openapi`, `tls-https`, `sitemap`, `well-known-ai`) are mostly configuration; the remediation guides give exact snippets for Nginx, Vercel, Netlify, and Express.
60
+
61
+ Re-run after each fix — all requests are cached per run, so audits are fast and cheap.
62
+
63
+ ## 4. Lock in your progress with a baseline
64
+
65
+ Once you reach a score you're happy with, freeze it:
66
+
67
+ ```bash
68
+ npx ax-audit https://your-site.com --save-baseline .ax-baseline.json
69
+ git add .ax-baseline.json && git commit -m "chore: AX baseline"
70
+ ```
71
+
72
+ From then on, compare every run against it:
73
+
74
+ ```bash
75
+ npx ax-audit https://your-site.com --baseline .ax-baseline.json --fail-on-regression 5
76
+ ```
77
+
78
+ This catches drift you didn't cause — a CDN toggle, a WAF rule, a header dropped in a refactor. Wire it into CI with the recipes in [ci.md](./ci.md).
79
+
80
+ ## 5. Look at the informational checks
81
+
82
+ Four checks report findings without affecting your score yet (they will in v4.0): `content-negotiation`, `rsl`, `agent-access`, `crawl-efficiency`. Treat them as the early-warning lane — they cover the newest standards, and fixing them now means v4.0 changes nothing for you.
83
+
84
+ The one to check first is `agent-access`: it detects the failure mode you cannot see — your robots.txt allows GPTBot while your WAF returns it a 403:
85
+
86
+ ```bash
87
+ npx ax-audit https://your-site.com --checks agent-access
88
+ ```
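The mismatch this check looks for can be expressed as a single comparison between declared policy and observed response. The sketch below is a simplification under assumptions, not the check's real logic: `policyContradictsEnforcement` is a hypothetical name, and treating any non-2xx final status as "blocked" is an assumption (the docs only give 403 as the example).

```typescript
// Hypothetical sketch of the policy-vs-enforcement comparison.
// allowedByRobots: what robots.txt declares for the crawler token.
// probeStatus: final HTTP status returned to a probe sent with that crawler's user-agent.
function policyContradictsEnforcement(
  allowedByRobots: boolean,
  probeStatus: number,
): boolean {
  const served = probeStatus >= 200 && probeStatus < 300; // assumption: non-2xx means blocked
  return allowedByRobots && !served;
}

console.log(policyContradictsEnforcement(true, 403)); // true: allowed on paper, blocked in practice
console.log(policyContradictsEnforcement(true, 200)); // false
```

A `true` result only says the unverified probe was rejected; a WAF that verifies crawlers cryptographically or by IP range may still serve the genuine bot.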
89
+
90
+ ## Common first-run questions
91
+
92
+ - **"My score seems harsh."** The audit measures the AI-agent surface, not site quality. A beautiful SPA with no llms.txt, no structured data, and an empty `#root` div is genuinely poor AX — that's the point of the tool.
93
+ - **"A check crashed / network error."** Transient failures retry automatically (`--retries`, default 2). For slow staging environments raise `--timeout`.
94
95
+
96
+ ## Next steps
97
+
98
+ - [checks.md](./checks.md) — exact scoring of all 18 checks
99
+ - [concepts.md](./concepts.md) — the AX standards landscape explained
100
+ - [cli.md](./cli.md) — every flag · [ci.md](./ci.md) — CI recipes · [api.md](./api.md) — programmatic use
101
+ - [ax-init](https://github.com/lucioduran/ax-init) — generates most of the files this tool audits
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ax-audit",
3
- "version": "3.1.0",
3
+ "version": "3.6.0",
4
4
  "description": "Audit websites for AI Agent Experience (AX) readiness. Lighthouse for AI Agents.",
5
5
  "type": "module",
6
6
  "license": "Apache-2.0",
@@ -40,6 +40,7 @@
40
40
  "files": [
41
41
  "bin/",
42
42
  "dist/",
43
+ "docs/",
43
44
  "LICENSE",
44
45
  "README.md",
45
46
  "CHANGELOG.md"