geo-checker 0.1.0 → 0.3.1

package/README.ko.md ADDED
@@ -0,0 +1,267 @@
1
+ <p align="center">
2
+ <img src="https://raw.githubusercontent.com/BaRam-OSS/geo-checker/main/docs/assets/baram.png" alt="BaRam" width="260" />
3
+ </p>
4
+
5
+ <h1 align="center">geo-checker</h1>
6
+
7
+ <p align="center">
8
+ A Lighthouse-grade audit tool for <b>Generative Engine Optimization (GEO)</b>. It measures how well AI search engines such as ChatGPT · Claude · Gemini · Perplexity can discover, understand, and cite your site.
9
+ </p>
10
+
11
+ <p align="center">
12
+ <a href="https://www.npmjs.com/package/geo-checker"><img src="https://img.shields.io/npm/v/geo-checker.svg" alt="npm" /></a>
13
+ <a href="./LICENSE"><img src="https://img.shields.io/npm/l/geo-checker.svg" alt="license" /></a>
14
+ <a href="https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml"><img src="https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
15
+ </p>
16
+
17
+ <p align="center">
18
+ <b>한국어</b> · <a href="./README.md">English</a>
19
+ </p>
20
+
21
+ ---
22
+
23
+ ## Why is it needed?
24
+
25
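+ Traditional SEO tools check **whether Google Search can rank your page**. `geo-checker` goes one step further and checks **whether AI search engines can cite your page**. It inspects **31 on-page signals** across four weighted categories and provides a 0–100 score per category, along with an interactive HTML report, prioritized improvement opportunities (Opportunities), and concrete fixes.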
26
+
27
+ It was inspired by Google Lighthouse, but its target is GEO: AI-crawler robots rules, `llms.txt`, schema.org graph quality, citation signals, and more.
28
+
29
+ ## Install
30
+
31
+ ```sh
32
+ # One-off run
33
+ npx geo-checker https://example.com
34
+
35
+ # Or add as a dev dependency
36
+ npm install --save-dev geo-checker
37
+ ```
38
+
39
+ Requires Node.js **20.18.1 or later**.
40
+
41
+ ## Usage: CLI
42
+
43
+ ```sh
44
+ # Terminal output (with impact chips and timing)
45
+ geo-checker https://example.com
46
+
47
+ # Generate a standalone HTML report
48
+ geo-checker https://example.com --html report.html
49
+
50
+ # Save report.json + report.html together in a directory
51
+ geo-checker https://example.com --out ./reports
52
+
53
+ # Print JSON to stdout (for jq and CI pipelines)
54
+ geo-checker https://example.com --json > report.json
55
+
56
+ # SPA / JS-rendered sites (requires the optional playwright dependency)
57
+ geo-checker https://example.com --render
58
+
59
+ # Run only specific categories or rules
60
+ geo-checker https://example.com --category crawler
61
+ geo-checker https://example.com --only crawler.https,sd.required-fields
62
+
63
+ # CI mode: exit 1 on warn or fail
64
+ geo-checker https://example.com --fail-on warn
65
+ ```
66
+
67
+ **Full flag list:**
68
+
69
+ | Flag | Description |
70
+ |---|---|
71
+ | `--json` | Print JSON to stdout. |
72
+ | `--html <path>` | Save a standalone HTML report to `<path>`. Use `-` for stdout. |
73
+ | `--out <dir>` | Save `report.json` + `report.html` together in `<dir>` (the directory is created automatically). |
74
+ | `--csv <path>` | Save a flat CSV with one row per rule, for ingestion into BI dashboards. |
75
+ | `--md <path>` | Save a Markdown PR-comment template with score badges and an issue table. |
76
+ | `--sarif <path>` | Save a SARIF 2.1.0 report for **GitHub Code Scanning** integration. |
77
+ | `--baseline <prev.json>` | Compare against a previous JSON report and print category deltas, regressions, and resolved items. |
78
+ | `--config <path>` | Load a config file (default: `geo-checker.config.{json,mjs,js}` in cwd). |
79
+ | `--render` | Use Playwright-based headless Chromium (optional dependency). |
80
+ | `--category <names>` | Comma-separated: `crawler`, `structured-data`, `citation`, `content`. |
81
+ | `--only <ids>` | Comma-separated rule IDs (or stableIds) to run. |
82
+ | `--fail-on <level>` | `fail` (default) or `warn`. |
83
+ | `--timeout <ms>` | Per-request timeout (default 20 000). |
84
+
85
+ **Batch mode:**
86
+
87
+ ```sh
88
+ # Audit every URL in urls.txt (one per line, `#` comments allowed) with concurrency 4
89
+ geo-checker batch urls.txt --out ./reports --concurrency 4
90
+ ```
91
+
92
+ Generates `<slug>.json` + `<slug>.html` per URL and an aggregated `summary.json`. A failure on one URL does not abort the whole batch.
93
+
94
+ **Exit codes:** `0` success · `1` policy failure · `2` runtime error.
95
+
96
+ ## Config file
97
+
98
+ Place a `geo-checker.config.json` in your project root (or pass `--config <path>`) to turn off rules, adjust weights, or inject custom rules:
99
+
100
+ ```json
101
+ {
102
+   "rules": {
103
+     "cnt.word-count": { "enabled": false },
104
+     "crawler.robots-ai-allow": { "weight": 10 }
105
+   },
106
+   "categories": {
107
+     "structured-data": { "weight": 40 }
108
+   }
109
+ }
110
+ ```
111
+
112
+ `.mjs` / `.js` are also supported (export the config object with `export default`). See [`docs/rules.md`](./docs/rules.md) for the full list of stableIds.
113
+
114
+ ## HTML report
115
+
116
+ `--html` generates a single HTML file that works without external CSS, fonts, or network calls. Open it directly in a browser for a Lighthouse-like UX:
117
+
118
+ - **Score ring**: overall score and per-category scores.
119
+ - **Opportunities**: improvement opportunities sorted by the points you can recover by fixing them.
120
+ - **Diagnostics**: the remaining non-passing audits, grouped by category.
121
+ - **Passed audits**: collapsed by default.
122
+ - **Raw JSON**: copy-to-clipboard button for piping into other tools.
123
+
124
+ Adapts automatically to light/dark mode. ~60–80 KB for a typical page.
125
+
126
+ ## Usage: programmatic API
127
+
128
+ ```ts
129
+ import { audit } from 'geo-checker';
130
+
131
+ const report = await audit('https://example.com', { render: false });
132
+
133
+ console.log(report.overall); // 78
134
+ console.log(report.categories.crawler.score); // 92
135
+ console.log(report.timing); // { fetchMs, auditMs, totalMs }
136
+ console.log(report.meta); // { toolVersion, nodeVersion, ... }
137
+ ```
138
+
139
+ ### Render an HTML or JSON report directly
140
+
141
+ ```ts
142
+ import * as fs from 'node:fs/promises';
+ import { audit } from 'geo-checker';
143
+ import { toHtml } from 'geo-checker/dist/reporters/html.js';
144
+ import { toJson } from 'geo-checker/dist/reporters/json.js';
145
+
146
+ const report = await audit('https://example.com');
147
+ await fs.writeFile('report.html', toHtml(report));
148
+ await fs.writeFile('report.json', toJson(report));
149
+ ```
150
+
151
+ ## What gets checked?
152
+
153
+ | Category | Checks | Rules | Weight |
154
+ |---|---|---:|---:|
155
+ | **AI Crawler Access** | HTTPS, robots.txt reachability, **allow-list of 17 AI bots** (GPTBot, OAI-SearchBot, ChatGPT-User, Google-Extended, Google-CloudVertexBot, ClaudeBot, anthropic-ai, Claude-Web, PerplexityBot, Applebot-Extended, Meta-ExternalAgent, Bytespider, DuckAssistBot, YouBot, cohere-ai, CCBot, Amazonbot), `llms.txt`, `llms-full.txt`, sitemap.xml | 7 | 25 |
156
+ | **Structured Data** | JSON-LD presence/validity, recognizable schema.org types, required-field coverage, microdata/RDFa fallback, duplicate primary type check, `sameAs` knowledge-graph linkage, BreadcrumbList item validity | 8 | 30 |
157
+ | **Citation Signals** | `<title>`, meta description, canonical, Open Graph, Twitter Card, `<html lang>`, author, publish/modified dates, content freshness (dateModified ≤ 1 year) | 9 | 25 |
158
+ | **Content Structure** | single `<h1>`, heading hierarchy, image alt coverage, TL;DR / FAQ blocks, word count, Q&A structure for answer extraction, external citations (E-E-A-T) | 7 | 20 |
159
+
160
+ Every rule declares the following fields:
161
+
162
+ - **`stableId`**: fixed identifier for CI budgets (never changes).
163
+ - **`impact`**: `critical` / `high` / `medium` / `low`.
164
+ - **`effort`**: `low` / `medium` / `high` (a rough indicator of how long the fix takes).
165
+ - **`group`**: `opportunity` (recoverable points) or `diagnostic` (binary signal).
166
+
167
+ See [`docs/rules.md`](./docs/rules.md) for the full rule list and fix guides.
168
+
169
+ ## Report schema
170
+
171
+ ```ts
172
+ interface AuditReport {
173
+   schemaVersion: 1;
174
+   url: string;
175
+   finalUrl: string;
176
+   fetchedAt: string;
177
+   renderMode: 'static' | 'rendered';
178
+   overall: number; // 0–100
179
+   categories: Record<Category, CategoryReport>;
180
+   warnings: string[];
181
+   version: string;
182
+   meta: { toolVersion: string; nodeVersion: string; userAgent?: string };
183
+   timing: { fetchMs: number; auditMs: number; totalMs: number };
184
+ }
185
+ ```
186
+
187
+ Each audit result carries `stableId`, `impact`, `effort`, `group`, `docsUrl`, and, where applicable, `estimatedImpact` (how many points the Opportunity is worth).
188
+
189
+ ## Extensibility
190
+
191
+ How to add a custom rule:
192
+
193
+ ```ts
194
+ import { audit, defineRule } from 'geo-checker';
195
+
196
+ const hasJsonFeed = defineRule({
197
+   id: 'custom.has-json-feed',
198
+   stableId: 'custom.has-json-feed',
199
+   category: 'crawler',
200
+   group: 'opportunity',
201
+   weight: 2,
202
+   impact: 'low',
203
+   effort: 'low',
204
+   title: 'JSON Feed present',
205
+   description: 'Site should expose a JSON Feed at /feed.json',
206
+   docsUrl: 'https://example.com/docs/json-feed',
207
+   async run(ctx) {
208
+     // Check logic using ctx.$ / ctx.headers / ctx.robots, etc.
209
+     return { status: 'pass', score: 1, rationale: 'JSON feed found' };
210
+   },
211
+ });
212
+
213
+ const report = await audit('https://example.com', { extraRules: [hasJsonFeed] });
214
+ ```
215
+
216
+ Custom rules are merged with the default rules automatically and are output the same way by every reporter.
217
+
218
+ ## CI recipe
219
+
220
+ ```yaml
221
+ # .github/workflows/geo.yml
222
+ name: GEO audit
223
+ on: [push, pull_request]
224
+ jobs:
225
+   geo:
226
+     runs-on: ubuntu-latest
227
+     permissions:
228
+       contents: read
229
+       security-events: write # for SARIF upload
230
+       pull-requests: write # for PR comment
231
+     steps:
232
+       - uses: actions/checkout@v4
233
+       - uses: actions/setup-node@v4
234
+         with: { node-version: 20 }
235
+       - run: |
236
+           npx -y geo-checker https://staging.example.com \
237
+             --fail-on warn \
238
+             --out ./geo \
239
+             --sarif ./geo/results.sarif \
240
+             --md ./geo/summary.md
241
+       - uses: actions/upload-artifact@v4
242
+         with:
243
+           name: geo-report
244
+           path: ./geo
245
+       - uses: github/codeql-action/upload-sarif@v3
246
+         if: always()
247
+         with:
248
+           sarif_file: ./geo/results.sarif
249
+       - if: github.event_name == 'pull_request'
250
+         run: gh pr comment ${{ github.event.pull_request.number }} --body-file ./geo/summary.md
251
+         env:
252
+           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
253
+ ```
254
+
255
+ **Track regressions** by saving the last stable `report.json` and passing it via `--baseline`:
256
+
257
+ ```sh
258
+ geo-checker https://staging.example.com --baseline ./baselines/main.json --out ./geo
259
+ ```
260
+
261
+ ## License
262
+
263
+ MIT © BaRam-OSS. See [LICENSE](./LICENSE).
264
+
265
+ ## Contributing
266
+
267
+ PRs are welcome, including new rules, fixtures, and documentation improvements. See [CONTRIBUTING.md](./CONTRIBUTING.md).
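
Note on scoring: the README above lists category weights (25 / 30 / 25 / 20) and lets the config file override them, but it does not state how the overall score is derived from the category scores. The sketch below assumes a plain weighted mean, which may not match what geo-checker actually does. `weightedOverall` and `defaultWeights` are hypothetical names used for illustration and are not part of the package API.

```ts
// Hypothetical sketch: assumes the overall score is a weighted mean of the
// category scores. The README does not give geo-checker's real formula.
type Category = 'crawler' | 'structured-data' | 'citation' | 'content';

// Default category weights as listed in the README table.
const defaultWeights: Record<Category, number> = {
  crawler: 25,
  'structured-data': 30,
  citation: 25,
  content: 20,
};

function weightedOverall(
  scores: Record<Category, number>,
  overrides: Partial<Record<Category, number>> = {},
): number {
  // Config-style overrides replace individual default weights.
  const weights = { ...defaultWeights, ...overrides };
  let weightedSum = 0;
  let weightTotal = 0;
  for (const name of Object.keys(weights) as Category[]) {
    weightedSum += scores[name] * weights[name];
    weightTotal += weights[name];
  }
  // Guard against all-zero weights, then round to a whole 0-100 score.
  return weightTotal === 0 ? 0 : Math.round(weightedSum / weightTotal);
}

const scores = { crawler: 100, 'structured-data': 50, citation: 100, content: 100 };
console.log(weightedOverall(scores)); // 85
console.log(weightedOverall(scores, { 'structured-data': 40 })); // 82
```

Under this assumption, raising the `structured-data` weight to 40 (as in the sample config) makes a weak structured-data score pull the overall result down further, from 85 to 82 in the example.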
package/README.md CHANGED
@@ -1,16 +1,30 @@
1
- # geo-checker
1
+ <p align="center">
2
+ <img src="https://raw.githubusercontent.com/BaRam-OSS/geo-checker/main/docs/assets/baram.png" alt="BaRam" width="260" />
3
+ </p>
2
4
 
3
- > Lighthouse-style auditor for **Generative Engine Optimization (GEO)**. Checks how ready your site is to be cited by ChatGPT, Claude, Gemini, and Perplexity.
5
+ <h1 align="center">geo-checker</h1>
4
6
 
5
- [![npm](https://img.shields.io/npm/v/geo-checker.svg)](https://www.npmjs.com/package/geo-checker)
6
- [![license](https://img.shields.io/npm/l/geo-checker.svg)](./LICENSE)
7
- [![CI](https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml/badge.svg)](https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml)
7
+ <p align="center">
8
+ Lighthouse-grade auditor for <b>Generative Engine Optimization (GEO)</b>. Measures how ready your site is to be found, understood, and cited by ChatGPT · Claude · Gemini · Perplexity.
9
+ </p>
10
+
11
+ <p align="center">
12
+ <a href="https://www.npmjs.com/package/geo-checker"><img src="https://img.shields.io/npm/v/geo-checker.svg" alt="npm" /></a>
13
+ <a href="./LICENSE"><img src="https://img.shields.io/npm/l/geo-checker.svg" alt="license" /></a>
14
+ <a href="https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml"><img src="https://github.com/BaRam-OSS/geo-checker/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
15
+ </p>
16
+
17
+ <p align="center">
18
+ <a href="./README.ko.md">한국어</a> · <b>English</b>
19
+ </p>
8
20
 
9
21
  ---
10
22
 
11
23
  ## Why
12
24
 
13
- SEO tools check whether Google can rank your page. `geo-checker` checks whether **AI search engines** can find, understand, and cite it. It inspects 24 on-page signals across four categories and gives you a 0–100 score per category, plus concrete fix suggestions.
25
+ SEO tools tell you whether **Google** can rank your page. `geo-checker` tells you whether **AI search engines** can cite it. It inspects **31 on-page signals** across four weighted categories and returns a 0–100 score per category plus an interactive HTML report, prioritized Opportunities, and concrete fixes.
26
+
27
+ Inspired by Google Lighthouse, but built for GEO: AI-crawler robots rules, `llms.txt`, schema.org graph quality, citation signals.
14
28
 
15
29
  ## Install
16
30
 
@@ -18,24 +32,96 @@ SEO tools check whether Google can rank your page. `geo-checker` checks whether
18
32
  # One-off
19
33
  npx geo-checker https://example.com
20
34
 
21
- # Or as a dependency
35
+ # Or as a dev dependency
22
36
  npm install --save-dev geo-checker
23
37
  ```
24
38
 
25
- Node.js 20.18.1 or later required.
39
+ Requires Node.js **≥ 20.18.1**.
26
40
 
27
41
  ## Usage — CLI
28
42
 
29
43
  ```sh
30
- geo-checker https://example.com # pretty output
31
- geo-checker https://example.com --json > report.json # JSON output
32
- geo-checker https://example.com --render # use headless browser (SPA sites)
33
- geo-checker https://example.com --category crawler # run only one category
34
- geo-checker https://example.com --only sd.required-fields
35
- geo-checker https://example.com --fail-on warn # CI mode (exit 1 on warn/fail)
44
+ # Pretty terminal output (with impact chips and timing)
45
+ geo-checker https://example.com
46
+
47
+ # Standalone interactive HTML report
48
+ geo-checker https://example.com --html report.html
49
+
50
+ # Write report.json + report.html side-by-side
51
+ geo-checker https://example.com --out ./reports
52
+
53
+ # JSON to stdout (for piping to jq, CI, etc.)
54
+ geo-checker https://example.com --json > report.json
55
+
56
+ # SPA / JS-rendered sites (requires optional playwright)
57
+ geo-checker https://example.com --render
58
+
59
+ # Filter to a single category or rule set
60
+ geo-checker https://example.com --category crawler
61
+ geo-checker https://example.com --only crawler.https,sd.required-fields
62
+
63
+ # CI mode — exit 1 on warn or fail
64
+ geo-checker https://example.com --fail-on warn
65
+ ```
66
+
67
+ **All flags:**
68
+
69
+ | Flag | Description |
70
+ |---|---|
71
+ | `--json` | Emit JSON to stdout. |
72
+ | `--html <path>` | Write a self-contained HTML report to `<path>`. Use `-` for stdout. |
73
+ | `--out <dir>` | Write `report.json` + `report.html` to `<dir>` (directory is created if missing). |
74
+ | `--csv <path>` | Flat CSV export (one row per rule result) — feed into BI dashboards. |
75
+ | `--md <path>` | Markdown PR-comment summary with score badges and an issue table. |
76
+ | `--sarif <path>` | SARIF 2.1.0 report for **GitHub Code Scanning** integration. |
77
+ | `--baseline <prev.json>` | Compare against a prior JSON report and print per-category deltas + regressions/fixes. |
78
+ | `--config <path>` | Load a config file (defaults to `geo-checker.config.{json,mjs,js}` in cwd). |
79
+ | `--render` | Use headless Chromium via Playwright (optional dep). |
80
+ | `--category <names>` | Comma-separated: `crawler`, `structured-data`, `citation`, `content`. |
81
+ | `--only <ids>` | Comma-separated rule IDs (or stableIds) to run. |
82
+ | `--fail-on <level>` | `fail` (default) or `warn`. |
83
+ | `--timeout <ms>` | Per-request timeout (default 20 000). |
84
+
85
+ **Batch mode:**
86
+
87
+ ```sh
88
+ # Audit every URL in urls.txt (one per line, # comments allowed) with 4 workers
89
+ geo-checker batch urls.txt --out ./reports --concurrency 4
90
+ ```
91
+
92
+ Writes per-URL `<slug>.json` + `<slug>.html` and an aggregated `summary.json`. Per-URL failures are isolated — one timeout doesn't abort the batch.
93
+
94
+ **Exit codes:** `0` success · `1` policy failure · `2` runtime error.
95
+
96
+ ## Config file
97
+
98
+ Drop a `geo-checker.config.json` in your project root (or pass `--config <path>`) to disable rules, adjust weights, or inject custom rules:
99
+
100
+ ```json
101
+ {
102
+   "rules": {
103
+     "cnt.word-count": { "enabled": false },
104
+     "crawler.robots-ai-allow": { "weight": 10 }
105
+   },
106
+   "categories": {
107
+     "structured-data": { "weight": 40 }
108
+   }
109
+ }
36
110
  ```
37
111
 
38
- Exit codes: `0` success · `1` policy failure · `2` runtime error.
112
+ `.mjs` and `.js` are also supported (must `export default` the config object). See [`docs/rules.md`](./docs/rules.md) for every rule `stableId`.
113
+
114
+ ## The HTML report
115
+
116
+ `--html` produces a single, self-contained HTML file — no external CSS, fonts, or network calls. Open it in any browser. It mirrors the Lighthouse UX:
117
+
118
+ - **Score rings** for overall and each category.
119
+ - **Opportunities** section — ranked by the points you would recover by fixing each issue.
120
+ - **Diagnostics** — the remaining non-passing audits, grouped by category.
121
+ - **Passed audits** — collapsed by default.
122
+ - **Raw JSON** — copy-to-clipboard button for piping into other tools.
123
+
124
+ Auto-adapts to light/dark mode. ~60–80 KB for a typical page.
39
125
 
40
126
  ## Usage — Programmatic
41
127
 
@@ -43,20 +129,62 @@ Exit codes: `0` success · `1` policy failure · `2` runtime error.
43
129
  import { audit } from 'geo-checker';
44
130
 
45
131
  const report = await audit('https://example.com', { render: false });
46
- console.log(report.overall); // 78
47
- console.log(report.categories.crawler); // { score: 92, results: [...] }
132
+
133
+ console.log(report.overall); // 78
134
+ console.log(report.categories.crawler.score); // 92
135
+ console.log(report.timing); // { fetchMs, auditMs, totalMs }
136
+ console.log(report.meta); // { toolVersion, nodeVersion, ... }
137
+ ```
138
+
139
+ ### Render an HTML or JSON report directly
140
+
141
+ ```ts
142
+ import * as fs from 'node:fs/promises';
+ import { audit } from 'geo-checker';
143
+ import { toHtml } from 'geo-checker/dist/reporters/html.js';
144
+ import { toJson } from 'geo-checker/dist/reporters/json.js';
145
+
146
+ const report = await audit('https://example.com');
147
+ await fs.writeFile('report.html', toHtml(report));
148
+ await fs.writeFile('report.json', toJson(report));
48
149
  ```
49
150
 
50
151
  ## What gets checked
51
152
 
52
- | Category | What it covers | Weight |
53
- |---|---|---:|
54
- | **AI Crawler Access** | HTTPS, robots.txt, AI bot allow-lists (GPTBot, Google-Extended, ClaudeBot, PerplexityBot, CCBot, Amazonbot), `llms.txt`, sitemap.xml | 25 |
55
- | **Structured Data** | JSON-LD presence & validity, schema.org types (Article, FAQPage, HowTo, Organization, BreadcrumbList, Product), required fields | 30 |
56
- | **Citation Signals** | title, meta description, canonical, Open Graph, Twitter Card, author, publish/modified dates, `lang` | 25 |
57
- | **Content Structure** | single H1, heading hierarchy, image alt coverage, TL;DR/FAQ blocks, word count | 20 |
153
+ | Category | Signals | Rules | Weight |
154
+ |---|---|---:|---:|
155
+ | **AI Crawler Access** | HTTPS, robots.txt reachability, allow-list of **17 AI bots** (GPTBot, OAI-SearchBot, ChatGPT-User, Google-Extended, Google-CloudVertexBot, ClaudeBot, anthropic-ai, Claude-Web, PerplexityBot, Applebot-Extended, Meta-ExternalAgent, Bytespider, DuckAssistBot, YouBot, cohere-ai, CCBot, Amazonbot), `llms.txt`, `llms-full.txt`, sitemap.xml | 7 | 25 |
156
+ | **Structured Data** | JSON-LD presence & validity, recognised schema.org types, required-field coverage, microdata/RDFa fallback, no duplicate primary types, `sameAs` knowledge-graph linkage, BreadcrumbList item validity | 8 | 30 |
157
+ | **Citation Signals** | `<title>`, meta description, canonical, Open Graph, Twitter Card, `<html lang>`, author, publish/modified dates, content freshness (dateModified ≤ 1y) | 9 | 25 |
158
+ | **Content Structure** | single `<h1>`, heading hierarchy, image alt coverage, TL;DR / FAQ blocks, word count, Q&A structure for answer extraction, external citations (E-E-A-T) | 7 | 20 |
159
+
160
+ Every rule declares:
161
+
162
+ - **`stableId`** — frozen identifier for CI budgets (never renamed).
163
+ - **`impact`** — `critical` / `high` / `medium` / `low`.
164
+ - **`effort`** — `low` / `medium` / `high` (roughly how long the fix takes).
165
+ - **`group`** — `opportunity` (points you can recover) or `diagnostic` (binary signal).
166
+
167
+ See [`docs/rules.md`](./docs/rules.md) for every rule, why it matters for GEO, and how to fix a failure.
58
168
 
59
- See [`docs/rules.md`](./docs/rules.md) for every individual rule and how to fix a failure.
169
+ ## Report schema
170
+
171
+ ```ts
172
+ interface AuditReport {
173
+   schemaVersion: 1;
174
+   url: string;
175
+   finalUrl: string;
176
+   fetchedAt: string;
177
+   renderMode: 'static' | 'rendered';
178
+   overall: number; // 0–100
179
+   categories: Record<Category, CategoryReport>;
180
+   warnings: string[];
181
+   version: string;
182
+   meta: { toolVersion: string; nodeVersion: string; userAgent?: string };
183
+   timing: { fetchMs: number; auditMs: number; totalMs: number };
184
+ }
185
+ ```
186
+
187
+ Each audit result carries `stableId`, `impact`, `effort`, `group`, `docsUrl`, and, where applicable, `estimatedImpact` (the points an Opportunity is worth).
60
188
 
61
189
  ## Extensibility
62
190
 
@@ -67,17 +195,67 @@ import { audit, defineRule } from 'geo-checker';
67
195
 
68
196
  const hasJsonFeed = defineRule({
69
197
  id: 'custom.has-json-feed',
198
+ stableId: 'custom.has-json-feed',
70
199
  category: 'crawler',
200
+ group: 'opportunity',
71
201
  weight: 2,
202
+ impact: 'low',
203
+ effort: 'low',
72
204
  title: 'JSON Feed present',
73
205
  description: 'Site should expose a JSON Feed at /feed.json',
206
+ docsUrl: 'https://example.com/docs/json-feed',
74
207
  async run(ctx) {
75
- // ...your logic
208
+ // ...your logic using ctx.$ / ctx.headers / ctx.robots, etc.
76
209
  return { status: 'pass', score: 1, rationale: 'JSON feed found' };
77
210
  },
78
211
  });
79
212
 
80
- await audit('https://example.com', { extraRules: [hasJsonFeed] });
213
+ const report = await audit('https://example.com', { extraRules: [hasJsonFeed] });
214
+ ```
215
+
216
+ Custom rules are merged with the defaults and appear in every reporter automatically.
217
+
218
+ ## CI recipe
219
+
220
+ ```yaml
221
+ # .github/workflows/geo.yml
222
+ name: GEO audit
223
+ on: [push, pull_request]
224
+ jobs:
225
+   geo:
226
+     runs-on: ubuntu-latest
227
+     permissions:
228
+       contents: read
229
+       security-events: write # for SARIF upload
230
+       pull-requests: write # for PR comment
231
+     steps:
232
+       - uses: actions/checkout@v4
233
+       - uses: actions/setup-node@v4
234
+         with: { node-version: 20 }
235
+       - run: |
236
+           npx -y geo-checker https://staging.example.com \
237
+             --fail-on warn \
238
+             --out ./geo \
239
+             --sarif ./geo/results.sarif \
240
+             --md ./geo/summary.md
241
+       - uses: actions/upload-artifact@v4
242
+         with:
243
+           name: geo-report
244
+           path: ./geo
245
+       - uses: github/codeql-action/upload-sarif@v3
246
+         if: always()
247
+         with:
248
+           sarif_file: ./geo/results.sarif
249
+       - if: github.event_name == 'pull_request'
250
+         run: gh pr comment ${{ github.event.pull_request.number }} --body-file ./geo/summary.md
251
+         env:
252
+           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
253
+ ```
254
+
255
+ **Track regressions** by saving last-known-good `report.json` and passing it as `--baseline`:
256
+
257
+ ```sh
258
+ geo-checker https://staging.example.com --baseline ./baselines/main.json --out ./geo
81
259
  ```
82
260
 
83
261
  ## License
@@ -86,4 +264,4 @@ MIT © BaRam-OSS. See [LICENSE](./LICENSE).
86
264
 
87
265
  ## Contributing
88
266
 
89
- PRs welcome. See [CONTRIBUTING.md](./CONTRIBUTING.md).
267
+ PRs welcome — especially new rules, fixtures, and docs. See [CONTRIBUTING.md](./CONTRIBUTING.md).
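
To make the `--baseline` comparison from the CI recipe more concrete, here is a rough sketch of how per-category deltas and regressions could be derived from two saved `report.json` files. It relies only on the `overall` and `categories.<name>.score` fields shown in the programmatic example. The `MinimalReport` type and the `categoryDeltas` / `regressedCategories` helpers are hypothetical, are not part of the geo-checker API, and the tool's actual baseline logic may differ.

```ts
// Hypothetical helper types and functions, for illustration only.
interface MinimalReport {
  overall: number;
  categories: Record<string, { score: number }>;
}

// Score change per category (current minus baseline). Categories that are
// missing from the baseline are skipped.
function categoryDeltas(
  baseline: MinimalReport,
  current: MinimalReport,
): Record<string, number> {
  const result: Record<string, number> = {};
  for (const [name, category] of Object.entries(current.categories)) {
    const previous = baseline.categories[name];
    if (previous !== undefined) {
      result[name] = category.score - previous.score;
    }
  }
  return result;
}

// Names of categories whose score dropped, sorted alphabetically.
function regressedCategories(changes: Record<string, number>): string[] {
  return Object.entries(changes)
    .filter(([, delta]) => delta < 0)
    .map(([name]) => name)
    .sort();
}

// Sample data standing in for two parsed report.json files.
const baselineReport: MinimalReport = {
  overall: 78,
  categories: { crawler: { score: 92 }, content: { score: 70 } },
};
const currentReport: MinimalReport = {
  overall: 75,
  categories: { crawler: { score: 85 }, content: { score: 74 } },
};

const deltas = categoryDeltas(baselineReport, currentReport);
console.log(deltas);
console.log(regressedCategories(deltas));
```

In a CI job, a non-empty list from `regressedCategories` could be used to fail the build independently of `--fail-on`, though that policy is a design choice and not something the README prescribes.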