@growthub/cli 0.3.52 → 0.3.53

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. package/assets/worker-kits/growthub-geo-seo-v1/.env.example +12 -0
  2. package/assets/worker-kits/growthub-geo-seo-v1/QUICKSTART.md +138 -0
  3. package/assets/worker-kits/growthub-geo-seo-v1/brands/NEW-CLIENT.md +104 -0
  4. package/assets/worker-kits/growthub-geo-seo-v1/brands/_template/brand-kit.md +116 -0
  5. package/assets/worker-kits/growthub-geo-seo-v1/brands/growthub/brand-kit.md +117 -0
  6. package/assets/worker-kits/growthub-geo-seo-v1/bundles/growthub-geo-seo-v1.json +54 -0
  7. package/assets/worker-kits/growthub-geo-seo-v1/docs/geo-seo-fork-integration.md +244 -0
  8. package/assets/worker-kits/growthub-geo-seo-v1/docs/pdf-report-layer.md +139 -0
  9. package/assets/worker-kits/growthub-geo-seo-v1/docs/scoring-methodology.md +230 -0
  10. package/assets/worker-kits/growthub-geo-seo-v1/docs/subagent-dispatch.md +273 -0
  11. package/assets/worker-kits/growthub-geo-seo-v1/examples/citability-sample.md +155 -0
  12. package/assets/worker-kits/growthub-geo-seo-v1/examples/geo-audit-sample.md +126 -0
  13. package/assets/worker-kits/growthub-geo-seo-v1/examples/pdf-report-sample.md +207 -0
  14. package/assets/worker-kits/growthub-geo-seo-v1/examples/prospect-proposal-sample.md +184 -0
  15. package/assets/worker-kits/growthub-geo-seo-v1/growthub-meta/README.md +124 -0
  16. package/assets/worker-kits/growthub-geo-seo-v1/growthub-meta/kit-standard.md +116 -0
  17. package/assets/worker-kits/growthub-geo-seo-v1/kit.json +102 -0
  18. package/assets/worker-kits/growthub-geo-seo-v1/output/README.md +114 -0
  19. package/assets/worker-kits/growthub-geo-seo-v1/output-standards.md +143 -0
  20. package/assets/worker-kits/growthub-geo-seo-v1/runtime-assumptions.md +175 -0
  21. package/assets/worker-kits/growthub-geo-seo-v1/setup/check-deps.sh +80 -0
  22. package/assets/worker-kits/growthub-geo-seo-v1/setup/clone-fork.sh +56 -0
  23. package/assets/worker-kits/growthub-geo-seo-v1/setup/verify-env.mjs +152 -0
  24. package/assets/worker-kits/growthub-geo-seo-v1/skills.md +359 -0
  25. package/assets/worker-kits/growthub-geo-seo-v1/templates/brand-visibility-report.md +101 -0
  26. package/assets/worker-kits/growthub-geo-seo-v1/templates/citability-analysis.md +131 -0
  27. package/assets/worker-kits/growthub-geo-seo-v1/templates/client-proposal.md +172 -0
  28. package/assets/worker-kits/growthub-geo-seo-v1/templates/content-analysis.md +136 -0
  29. package/assets/worker-kits/growthub-geo-seo-v1/templates/crawler-access-report.md +115 -0
  30. package/assets/worker-kits/growthub-geo-seo-v1/templates/geo-audit-brief.md +114 -0
  31. package/assets/worker-kits/growthub-geo-seo-v1/templates/geo-score-summary.md +113 -0
  32. package/assets/worker-kits/growthub-geo-seo-v1/templates/llmstxt-plan.md +173 -0
  33. package/assets/worker-kits/growthub-geo-seo-v1/templates/remediation-roadmap.md +150 -0
  34. package/assets/worker-kits/growthub-geo-seo-v1/templates/schema-validation.md +177 -0
  35. package/assets/worker-kits/growthub-geo-seo-v1/templates/technical-foundations.md +108 -0
  36. package/assets/worker-kits/growthub-geo-seo-v1/validation-checklist.md +139 -0
  37. package/assets/worker-kits/growthub-geo-seo-v1/workers/geo-seo-operator/CLAUDE.md +320 -0
  38. package/package.json +1 -1
@@ -0,0 +1,244 @@
1
+ # geo-seo-claude Fork Integration
2
+
3
+ **Source repo:** https://github.com/zubair-trabzada/geo-seo-claude
4
+
5
+ ---
6
+
7
+ ## What geo-seo-claude Is
8
+
9
+ `geo-seo-claude` is an open-source Claude Code skill for GEO (Generative Engine Optimization) and SEO auditing. It provides 14 specialized commands that run against any live URL to produce AI search visibility data, citability scores, crawler access reports, and remediation artifacts.
10
+
11
+ The Growthub GEO SEO Studio wraps this tool with:
12
+ - A brand kit system (per-client configuration)
13
+ - Structured output templates (11 templates, 9 core artifact types)
14
+ - A 5-layer documentation architecture
15
+ - A 4-week remediation roadmap format
16
+ - PDF report generation integration
17
+ - Agency proposal templates
18
+
19
+ ---
20
+
21
+ ## Architecture of the Fork
22
+
23
+ ```
24
+ geo-seo-claude/
25
+ geo/ # Main skill entry point — Claude Code reads this
26
+ skill.md # Master skill definition
27
+ skills/ # 14 sub-skill definitions (one per /geo command)
28
+ audit.md
29
+ citability.md
30
+ crawlers.md
31
+ brands.md
32
+ report.md
33
+ report-pdf.md
34
+ content.md
35
+ schema.md
36
+ technical.md
37
+ llmstxt.md
38
+ quick.md
39
+ proposal.md
40
+ prospect.md
41
+ compare.md
42
+ agents/ # 5 parallel subagent definitions
43
+ geo-ai-visibility.md
44
+ geo-content.md
45
+ geo-platform-analysis.md
46
+ geo-schema.md
47
+ geo-technical.md
48
+ scripts/ # Python utility scripts
49
+ fetch_page.py # Playwright-based page fetcher and parser
50
+ citability_scorer.py # 5-metric citability algorithm
51
+ brand_scanner.py # 8-platform brand mention scanner
52
+ generate_pdf_report.py # ReportLab PDF generator
53
+ llmstxt_generator.py # llms.txt and llms-full.txt generator
54
+ crm_dashboard.py # Flask CRM dashboard
55
+ schema/ # 6 JSON-LD templates
56
+ organization.json
57
+ article.json
58
+ faqpage.json
59
+ product.json
60
+ localbusiness.json
61
+ breadcrumblist.json
62
+ requirements.txt # Python dependencies
63
+ README.md
64
+ ```
65
+
66
+ ---
67
+
68
+ ## Key Scripts and What Each Does
69
+
70
+ ### `scripts/fetch_page.py`
71
+
72
+ Fetches a URL using Playwright (chromium) and extracts all signals needed for GEO analysis.
73
+
74
+ **What it produces:**
75
+ - Rendered HTML (after JavaScript execution)
76
+ - robots.txt contents (fetched from domain root)
77
+ - llms.txt and llms-full.txt status (exists/missing/malformed)
78
+ - sitemap.xml discovery and URL count
79
+ - HTTP response headers
80
+ - Page word count and heading hierarchy
81
+ - JSON-LD structured data blocks
82
+
83
+ **Usage:**
84
+ ```bash
85
+ python scripts/fetch_page.py https://example.com
86
+ python scripts/fetch_page.py https://example.com --output analysis.json
87
+ ```
88
+
89
+ ---
90
+
91
+ ### `scripts/citability_scorer.py`
92
+
93
+ Runs the 5-metric citability algorithm against page content extracted by `fetch_page.py`.
94
+
95
+ **5 metrics:**
96
+ 1. Answer Block Quality (30%)
97
+ 2. Self-Containment (25%)
98
+ 3. Structural Readability (20%)
99
+ 4. Statistical Density (15%)
100
+ 5. Uniqueness Signals (10%)
101
+
102
+ **Usage:**
103
+ ```bash
104
+ python scripts/citability_scorer.py --input analysis.json
105
+ python scripts/citability_scorer.py --url https://example.com
106
+ ```
107
+
108
+ ---
109
+
110
+ ### `scripts/brand_scanner.py`
111
+
112
+ Scans 8 platforms for brand mentions and computes a brand authority score.
113
+
114
+ **Platforms scanned:**
115
+ YouTube, Reddit, Wikipedia, LinkedIn, Twitter/X, GitHub, Quora, HackerNews
116
+
117
+ **Usage:**
118
+ ```bash
119
+ python scripts/brand_scanner.py --brand "Brand Name" --domain example.com
120
+ ```
121
+
122
+ ---
123
+
124
+ ### `scripts/generate_pdf_report.py`
125
+
126
+ Generates a branded PDF report using ReportLab from a GEO score JSON data file.
127
+
128
+ **Inputs required:**
129
+ - `geo_score_data.json` — machine-readable score data produced by the audit
130
+ - Client name and target URL
131
+ - Optional: logo file path, color scheme
132
+
133
+ **Usage:**
134
+ ```bash
135
+ python scripts/generate_pdf_report.py \
136
+ --input output/<client>/geo_score_data.json \
137
+ --output output/<client>/report.pdf \
138
+ --brand "Client Name"
139
+ ```
140
+
141
+ ---
142
+
143
+ ### `scripts/llmstxt_generator.py`
144
+
145
+ Generates `llms.txt` and `llms-full.txt` files for a domain.
146
+
147
+ **Usage:**
148
+ ```bash
149
+ python scripts/llmstxt_generator.py --domain https://example.com --from-sitemap
150
+ python scripts/llmstxt_generator.py --domain https://example.com --full
151
+ python scripts/llmstxt_generator.py --domain https://example.com --dry-run
152
+ ```
153
+
154
+ ---
155
+
156
+ ### `scripts/crm_dashboard.py`
157
+
158
+ Launches a Flask web dashboard for managing audit history and client accounts.
159
+
160
+ **Usage:**
161
+ ```bash
162
+ python scripts/crm_dashboard.py
163
+ # Dashboard available at http://localhost:5000 (or FLASK_PORT if set)
164
+ ```
165
+
166
+ ---
167
+
168
+ ## Python Dependencies
169
+
170
+ ```
171
+ beautifulsoup4 # HTML parsing
172
+ playwright # Dynamic page fetching (requires: playwright install chromium)
173
+ reportlab # PDF generation
174
+ flask # CRM dashboard web server
175
+ rich # Terminal output formatting
176
+ validators # URL validation
177
+ requests # HTTP requests for robots.txt, llms.txt
178
+ lxml # Fast XML/HTML parsing (sitemap processing)
179
+ ```
180
+
181
+ Install all:
182
+ ```bash
183
+ pip install -r requirements.txt
184
+ playwright install chromium
185
+ ```
186
+
187
+ ---
188
+
189
+ ## Installation
190
+
191
+ ```bash
192
+ # Clone the fork
193
+ git clone https://github.com/zubair-trabzada/geo-seo-claude ~/geo-seo-claude
194
+
195
+ # Install Python dependencies
196
+ cd ~/geo-seo-claude
197
+ pip install -r requirements.txt
198
+
199
+ # Install Playwright browser
200
+ playwright install chromium
201
+
202
+ # Verify installation
203
+ python scripts/fetch_page.py https://example.com
204
+ ```
205
+
206
+ Or use the kit's setup script:
207
+ ```bash
208
+ bash setup/clone-fork.sh
209
+ ```
210
+
211
+ ---
212
+
213
+ ## When the Fork Is Unavailable
214
+
215
+ If the local fork is not cloned, the GEO SEO Operator switches to **agent-only mode**:
216
+
217
+ - Page fetching is performed via Claude's built-in fetch capability
218
+ - Citability scoring is performed manually using the 5-metric algorithm from `docs/scoring-methodology.md`
219
+ - Brand scanning is performed via Claude's knowledge of platform signals
220
+ - PDF generation is not available — Markdown equivalents are produced instead
221
+ - All output artifacts are still produced and follow the same templates
222
+
223
+ Agent-only mode is always valid and produces complete outputs. The local fork adds:
224
+ - Higher accuracy citability scores (Python parser vs. agent estimate)
225
+ - Real-time robots.txt parsing with exact user-agent matching
226
+ - Playwright-rendered page content (handles JavaScript-heavy sites)
227
+ - PDF generation capability
228
+ - Flask CRM dashboard
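A minimal Python sketch of the mode switch described above, not part of the published package. The function name `operator_mode`, the default clone path `~/geo-seo-claude` (taken from the Installation section), and the choice of `scripts/fetch_page.py` as the marker file are all assumptions; the kit does not document how the operator actually detects the fork.

```python
from pathlib import Path


def operator_mode(fork_dir: str = "~/geo-seo-claude") -> str:
    """Return "fork" when the local clone looks usable, else "agent-only".

    Hypothetical check: the fork counts as available if its page fetcher
    script exists at <fork_dir>/scripts/fetch_page.py.
    """
    fetcher = Path(fork_dir).expanduser() / "scripts" / "fetch_page.py"
    return "fork" if fetcher.is_file() else "agent-only"
```

A caller could branch on the result to decide whether to run the Python scripts or fall back to the manual scoring rules in `docs/scoring-methodology.md`.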
229
+
230
+ ---
231
+
232
+ ## Upstream Assumptions Frozen in This Kit
233
+
234
+ These assumptions were verified against the fork at kit creation time (2026-04-14):
235
+
236
+ - 14 /geo commands are available (listed in skills.md)
237
+ - 5 parallel subagents are defined in `agents/`
238
+ - Python 3.8+ is required
239
+ - Playwright uses chromium by default
240
+ - `requirements.txt` includes all dependencies listed above
241
+ - `schema/` contains 6 JSON-LD templates
242
+ - CRM dashboard runs on Flask at port 5000 by default
243
+
244
+ If the upstream fork changes its API or file structure after this date, update `runtime-assumptions.md` accordingly.
@@ -0,0 +1,139 @@
1
+ # PDF Report Layer
2
+
3
+ ---
4
+
5
+ ## When to Trigger PDF Generation
6
+
7
+ Trigger PDF generation only in these cases:
8
+
9
+ 1. The user explicitly requests `/geo report-pdf`
10
+ 2. The brand kit has `delivery_format: pdf` or `delivery_format: both`
11
+ 3. The operator completes a full audit and the user confirms PDF delivery
12
+
13
+ Do not generate a PDF unless one of these conditions is met. Markdown is the default delivery format.
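The three trigger conditions above can be sketched as a single predicate. This is an illustration only, not code from the package: the function and parameter names are hypothetical, and the non-PDF `delivery_format` value (`"markdown"` in the usage lines) is an assumed spelling.

```python
def should_generate_pdf(
    command: str,
    delivery_format: str,
    full_audit_done: bool,
    user_confirmed_pdf: bool,
) -> bool:
    """Return True only when one of the three documented triggers applies."""
    # 1. The user explicitly requested the PDF command.
    if command.strip().startswith("/geo report-pdf"):
        return True
    # 2. The brand kit asks for PDF delivery ("pdf" or "both").
    if delivery_format in ("pdf", "both"):
        return True
    # 3. A full audit finished and the user confirmed PDF delivery.
    return full_audit_done and user_confirmed_pdf


print(should_generate_pdf("/geo audit", "markdown", True, False))  # False
print(should_generate_pdf("/geo audit", "both", False, False))     # True
```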
14
+
15
+ ---
16
+
17
+ ## Script and Requirements
18
+
19
+ **Script:** `scripts/generate_pdf_report.py` (in geo-seo-claude fork)
20
+
21
+ **Python dependency:** ReportLab (`pip install reportlab`)
22
+
23
+ **Required data:** The script consumes `geo_score_data.json` — the machine-readable score data produced at the end of the audit workflow. This file must be written to the output directory before triggering PDF generation.
24
+
25
+ ---
26
+
27
+ ## What the Script Needs
28
+
29
+ The following inputs are required by `generate_pdf_report.py`:
30
+
31
+ | Input | Source | Required |
32
+ |---|---|---|
33
+ | `--input path/to/geo_score_data.json` | Written by operator at end of audit | Yes |
34
+ | `--output path/to/report.pdf` | Specified by operator | Yes |
35
+ | `--brand "Client Name"` | From brand kit `client_name` field | Yes |
36
+ | Target URL | From `geo_score_data.json` | Yes (in data) |
37
+ | All 6 component scores | From `geo_score_data.json` | Yes (in data) |
38
+ | Top findings list | From `geo_score_data.json` | Yes (in data) |
39
+ | Remediation roadmap summary | From `geo_score_data.json` | Yes (in data) |
40
+ | `--logo path/to/logo.png` | From brand kit `logo_file` field | Optional |
41
+
42
+ ---
43
+
44
+ ## Usage
45
+
46
+ ```bash
47
+ # Standard invocation from kit root
48
+ python ~/geo-seo-claude/scripts/generate_pdf_report.py \
49
+ --input output/<client-slug>/<project-slug>/geo_score_data.json \
50
+ --output output/<client-slug>/<project-slug>/<ClientSlug>_GeoScoreReport_v1_<YYYYMMDD>.pdf \
51
+ --brand "Client Name" \
52
+ --logo brands/<client-slug>/assets/logo.png
53
+
54
+ # Without --logo (the default Growthub placeholder is used)
55
+ python ~/geo-seo-claude/scripts/generate_pdf_report.py \
56
+ --input output/<client-slug>/<project-slug>/geo_score_data.json \
57
+ --output output/<client-slug>/<project-slug>/report.pdf \
58
+ --brand "Client Name"
59
+ ```
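For callers that prefer to launch the script from Python, the invocation above can be assembled as an argument list. This is a sketch, not part of the package: the helper name `pdf_report_command` and the `fork_dir` default are assumptions, while the `--input`, `--output`, `--brand`, and `--logo` flags are the ones listed in the table above.

```python
import os
from typing import List, Optional


def pdf_report_command(
    input_json: str,
    output_pdf: str,
    brand: str,
    logo: Optional[str] = None,
    fork_dir: str = "~/geo-seo-claude",
) -> List[str]:
    """Build the argv list for generate_pdf_report.py without running it."""
    script = os.path.join(
        os.path.expanduser(fork_dir), "scripts", "generate_pdf_report.py"
    )
    cmd = [
        "python", script,
        "--input", input_json,
        "--output", output_pdf,
        "--brand", brand,
    ]
    if logo:  # --logo is optional; omit it to get the default placeholder
        cmd += ["--logo", logo]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a shell string avoids quoting problems with client names that contain spaces.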
60
+
61
+ ---
62
+
63
+ ## What the PDF Contains
64
+
65
+ The branded PDF report produced by `generate_pdf_report.py` includes the following sections:
66
+
67
+ | Section | Pages | Content |
68
+ |---|---|---|
69
+ | Cover page | 1 | Client name, target URL, GEO Score gauge visualization, letter grade, audit date |
70
+ | Executive summary | 1 | 3 key findings, score vs. category benchmark, top recommended action |
71
+ | GEO Score breakdown | 2 | Visual bar chart for all 6 components, weighted contribution table |
72
+ | Citability analysis | 1 | 5-metric breakdown with visual gauges, letter grade, top 3 improvements |
73
+ | Crawler access matrix | 1 | 14-crawler table with color-coded access status (green/yellow/red) |
74
+ | Top findings | 2 | Ranked findings with impact level, effort estimate, and specific fix instructions |
75
+ | 4-week roadmap | 1 | Sprint table with actions, owners, and projected score gain |
76
+ | Back cover | 1 | Growthub contact information, next steps, rescore recommendation |
77
+
78
+ **Typical PDF length:** 10–12 pages (Letter or A4)
79
+
80
+ ---
81
+
82
+ ## geo_score_data.json Format
83
+
84
+ The operator must write this file before triggering PDF generation. See `examples/pdf-report-sample.md` for the complete format.
85
+
86
+ Minimum required structure:
87
+ ```json
88
+ {
89
+ "report_meta": {
90
+ "client_name": "",
91
+ "target_url": "",
92
+ "audit_date": "",
93
+ "report_version": "1"
94
+ },
95
+ "geo_score": {
96
+ "composite": 0,
97
+ "grade": ""
98
+ },
99
+ "components": [
100
+ { "name": "", "weight": 0.0, "raw_score": 0, "weighted_score": 0.0, "grade": "", "primary_issue": "" }
101
+ ],
102
+ "top_findings": [
103
+ { "rank": 1, "title": "", "description": "", "component": "", "priority": "", "expected_score_gain": 0 }
104
+ ],
105
+ "remediation_summary": {
106
+ "score_before": 0,
107
+ "score_after_projected": 0
108
+ }
109
+ }
110
+ ```
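Before triggering PDF generation, the minimum structure above could be checked with a small validator. The following is a sketch under stated assumptions, not the script's own validation: the helper name `missing_fields` is hypothetical, and treating an empty `components` or `top_findings` list as missing is an assumption.

```python
from typing import Any, Dict, List

# Required keys per object, copied from the minimum structure above.
REQUIRED_KEYS = {
    "report_meta": ["client_name", "target_url", "audit_date", "report_version"],
    "geo_score": ["composite", "grade"],
    "remediation_summary": ["score_before", "score_after_projected"],
}
REQUIRED_LISTS = ["components", "top_findings"]


def missing_fields(data: Dict[str, Any]) -> List[str]:
    """Return dotted names of required fields absent from parsed score data."""
    missing: List[str] = []
    for section, keys in REQUIRED_KEYS.items():
        block = data.get(section)
        if not isinstance(block, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    for name in REQUIRED_LISTS:
        # Assumption: an empty list is treated the same as a missing one.
        if not isinstance(data.get(name), list) or not data[name]:
            missing.append(name)
    return missing
```

Typical use would be `missing_fields(json.load(f))` on the file in the output directory, proceeding only when the result is an empty list.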
111
+
112
+ ---
113
+
114
+ ## In Agent-Only Mode
115
+
116
+ If the local fork is not available, PDF generation is not possible.
117
+
118
+ **Fallback behavior:**
119
+ 1. Produce the complete `GeoScoreSummary` Markdown file as the primary score artifact
120
+ 2. Note at the top: `> PDF generation requires the local geo-seo-claude fork. Run bash setup/clone-fork.sh to enable.`
121
+ 3. Write `geo_score_data.json` to the output directory so PDF can be generated later when the fork is available
122
+
123
+ The Markdown output is a complete substitute for stakeholder communication. All the same data is present — the PDF simply adds visual formatting.
124
+
125
+ ---
126
+
127
+ ## PDF Styling
128
+
129
+ The PDF uses the brand kit's color values:
130
+
131
+ | Element | Color Source |
132
+ |---|---|
133
+ | Header and cover background | `colors.primary` from brand kit |
134
+ | Accent bars and highlights | `colors.accent` from brand kit |
135
+ | Score gauge fill | Dynamic: green (A), blue (B), yellow (C), orange (D), red (F) |
136
+ | Body text | `colors.dark` or black |
137
+ | Table alternating rows | `colors.secondary` at 10% opacity |
138
+
139
+ If no brand kit colors are found, the PDF uses Growthub defaults (`#1A1A2E` primary, `#E94560` accent).
@@ -0,0 +1,230 @@
1
+ # GEO Scoring Methodology
2
+
3
+ **Source of truth for all scoring rules. The operator must apply these formulas exactly.**
4
+
5
+ ---
6
+
7
+ ## GEO Score Formula
8
+
9
+ The GEO Score is a weighted composite of 6 component scores, each ranging from 0 to 100.
10
+
11
+ ```
12
+ GEO Score = (AI Citability & Visibility × 0.25)
13
+ + (Brand Authority × 0.20)
14
+ + (Content Quality & E-E-A-T × 0.20)
15
+ + (Technical Foundations × 0.15)
16
+ + (Structured Data × 0.10)
17
+ + (Platform Optimization × 0.10)
18
+ ```
19
+
20
+ ### Component Weights
21
+
22
+ | Component | Weight | Rationale |
23
+ |---|---|---|
24
+ | AI Citability & Visibility | 25% | Core measure of AI search readiness — crawler access + citability signals |
25
+ | Brand Authority | 20% | AI systems prefer to cite recognized, cross-platform brands |
26
+ | Content Quality & E-E-A-T | 20% | Google and AI systems both weight E-E-A-T heavily for content ranking |
27
+ | Technical Foundations | 15% | Technical barriers prevent all other optimization from working |
28
+ | Structured Data | 10% | Schema markup directly feeds AI answer surfaces (Google AI Overviews, ChatGPT) |
29
+ | Platform Optimization | 10% | Platform-specific readiness for the 4 major AI search engines |
30
+
31
+ ### Computation Rules
32
+
33
+ - Each component score is 0–100 (no decimals before aggregation)
34
+ - Apply weights before rounding
35
+ - Round the composite score to the nearest integer
36
+ - If a component score is unavailable (data gap), use **50** as the neutral default
37
+ - Flag any data-gap component in the GeoScoreSummary output
38
+
39
+ **Example:**
40
+ ```
41
+ AI Citability: 58 × 0.25 = 14.50
42
+ Brand Auth: 66 × 0.20 = 13.20
43
+ Content: 71 × 0.20 = 14.20
44
+ Technical: 84 × 0.15 = 12.60
45
+ Schema: 42 × 0.10 = 4.20
46
+ Platform: 61 × 0.10 = 6.10
47
+ ─────
48
+ Composite: 64.80 → rounds to 65
49
+ ```
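The formula and computation rules above can be sketched in Python as follows. This is an illustration, not code from the package: the function name `geo_score` and the dictionary keys are assumptions, and because the rules say only "nearest integer", rounding exact .5 ties upward is also an assumption (Python's built-in `round` would round ties to the nearest even integer instead).

```python
from typing import Dict, List, Optional, Tuple

# Weights copied from the Component Weights table.
WEIGHTS: Dict[str, float] = {
    "AI Citability & Visibility": 0.25,
    "Brand Authority": 0.20,
    "Content Quality & E-E-A-T": 0.20,
    "Technical Foundations": 0.15,
    "Structured Data": 0.10,
    "Platform Optimization": 0.10,
}
NEUTRAL_DEFAULT = 50  # substituted when a component score is unavailable


def geo_score(components: Dict[str, Optional[int]]) -> Tuple[int, List[str]]:
    """Return (composite, data_gap_components) for 0-100 component scores."""
    total = 0.0
    gaps: List[str] = []
    for name, weight in WEIGHTS.items():
        score = components.get(name)
        if score is None:  # data gap: use the neutral default and flag it
            score = NEUTRAL_DEFAULT
            gaps.append(name)
        total += score * weight  # weights are applied before rounding
    return int(total + 0.5), gaps  # assumption: .5 ties round up


example = {
    "AI Citability & Visibility": 58,
    "Brand Authority": 66,
    "Content Quality & E-E-A-T": 71,
    "Technical Foundations": 84,
    "Structured Data": 42,
    "Platform Optimization": 61,
}
print(geo_score(example))  # (65, [])
```

The second element of the result lists the components that fell back to 50, so they can be flagged in the GeoScoreSummary output.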
50
+
51
+ ---
52
+
53
+ ## Letter Grade Thresholds
54
+
55
+ | Grade | Score Range | AI Search Status |
56
+ |---|---|---|
57
+ | A | 85–100 | Highly optimized — strong citability, clean crawler access, rich schema |
58
+ | B | 70–84 | Good — some gaps, addressable in one sprint cycle |
59
+ | C | 55–69 | Moderate — missing key citability signals, schema gaps, crawler issues |
60
+ | D | 40–54 | Poor — likely not capturing meaningful AI-referred traffic |
61
+ | F | Below 40 | Not AI-search-ready — critical issues across multiple components |
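The threshold table maps directly to a lookup. A minimal sketch, with the function name `letter_grade` as an assumption:

```python
def letter_grade(score: int) -> str:
    """Map a rounded 0-100 GEO Score to its letter grade."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 55:
        return "C"
    if score >= 40:
        return "D"
    return "F"


print(letter_grade(65))  # C
```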
62
+
63
+ ---
64
+
65
+ ## Citability Algorithm
66
+
67
+ The Citability Score (one of the inputs to AI Citability & Visibility) uses a 5-metric algorithm.
68
+
69
+ ### 5-Metric Breakdown
70
+
71
+ | Metric | Weight | What It Measures | Scoring Method |
72
+ |---|---|---|---|
73
+ | Answer Block Quality | 30% | Do paragraphs contain complete, self-sufficient answers AI can quote verbatim? | Score each paragraph 0–10; average across all paragraphs |
74
+ | Self-Containment | 25% | Can each paragraph be understood without reading surrounding context? | Pronoun-to-noun ratio; lower = better self-containment |
75
+ | Structural Readability | 20% | Does the page use headings, short paragraphs, and lists to enable AI parsing? | Check H1/H2/list/paragraph-length signals |
76
+ | Statistical Density | 15% | Does the page contain specific numbers, percentages, and data references? | Count data points per 1,000 words; optimal: 8–15 |
77
+ | Uniqueness Signals | 10% | Does the content contain proprietary claims or data not found elsewhere? | Check for first-party research, original data, unique terminology |
78
+
79
+ **Citability Score formula:**
80
+ ```
81
+ Citability = (Answer Block Quality × 0.30)
82
+ + (Self-Containment × 0.25)
83
+ + (Structural Readability × 0.20)
84
+ + (Statistical Density × 0.15)
85
+ + (Uniqueness Signals × 0.10)
86
+ ```
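As a worked illustration of the formula above, the weighted sum can be written as a short Python function. The function name and dictionary keys are hypothetical, and the sample metric values are invented for the example rather than taken from a real audit.

```python
from typing import Dict

# Weights copied from the 5-Metric Breakdown table.
CITABILITY_WEIGHTS: Dict[str, float] = {
    "answer_block_quality": 0.30,
    "self_containment": 0.25,
    "structural_readability": 0.20,
    "statistical_density": 0.15,
    "uniqueness_signals": 0.10,
}


def citability_score(metrics: Dict[str, float]) -> float:
    """Weighted sum of the five 0-100 metric scores."""
    return sum(metrics[name] * weight for name, weight in CITABILITY_WEIGHTS.items())


sample = {
    "answer_block_quality": 80,    # 80 x 0.30 = 24.0
    "self_containment": 70,        # 70 x 0.25 = 17.5
    "structural_readability": 85,  # 85 x 0.20 = 17.0
    "statistical_density": 60,     # 60 x 0.15 =  9.0
    "uniqueness_signals": 50,      # 50 x 0.10 =  5.0
}
print(round(citability_score(sample), 2))  # 72.5
```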
87
+
88
+ ---
89
+
90
+ ### Metric Scoring Rules
91
+
92
+ #### Answer Block Quality (0–100)
93
+
94
+ Evaluate each paragraph on 3 criteria:
95
+ 1. Subject is clearly stated (no pronoun-as-subject opener)
96
+ 2. Supporting evidence or data is present
97
+ 3. No unresolved pronoun references
98
+
99
+ Score each paragraph:
100
+ - All 3 criteria met: 10/10
101
+ - 2 criteria met: 7/10
102
+ - 1 criterion met: 4/10
103
+ - None met: 1/10
104
+
105
+ Average the paragraph scores, then multiply by 10 to get the 0–100 metric score. Because every paragraph earns at least 1/10, the practical floor is 10.
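The paragraph scoring above can be sketched as follows. Deciding how many criteria a paragraph meets remains a judgment made by the operator; this hypothetical helper only does the arithmetic once those counts exist.

```python
from typing import Iterable

# Points per paragraph, keyed by how many of the 3 criteria it meets.
POINTS_BY_CRITERIA_MET = {3: 10, 2: 7, 1: 4, 0: 1}


def answer_block_quality(criteria_met_per_paragraph: Iterable[int]) -> float:
    """Average the per-paragraph points and scale to the 0-100 range."""
    points = [POINTS_BY_CRITERIA_MET[n] for n in criteria_met_per_paragraph]
    if not points:
        raise ValueError("at least one paragraph is required")
    return sum(points) / len(points) * 10


# Four paragraphs meeting 3, 2, 1, and 0 criteria: (10 + 7 + 4 + 1) / 4 = 5.5
print(answer_block_quality([3, 2, 1, 0]))  # 55.0
```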
106
+
107
+ **High-score example:**
108
+ > "GPTBot (used by ChatGPT's Browse mode) is blocked in the site's robots.txt. This means ChatGPT cannot fetch and cite this page's content, even when users directly ask about topics the page covers."
109
+
110
+ **Low-score example:**
111
+ > "It blocks them from accessing the site. This causes problems because they can't see what's there."
112
+ *(No subject named, no evidence, multiple unresolved pronouns)*
113
+
114
+ ---
115
+
116
+ #### Self-Containment (0–100)
117
+
118
+ **Pronoun inventory:** it, they, this, that, these, those, he, she, we (when antecedent is not in same sentence)
119
+
120
+ 1. Count pronouns used as subjects or objects without a prior noun in the same sentence
121
+ 2. Count total noun references (named entities + common nouns)
122
+ 3. Pronoun-to-noun ratio = pronoun count / noun count
123
+
124
+ **Scoring:**
125
+ - Ratio < 0.15: 100
126
+ - Ratio 0.15–0.25: 85
127
+ - Ratio 0.25–0.35: 70
128
+ - Ratio 0.35–0.50: 50
129
+ - Ratio 0.50–0.65: 30
130
+ - Ratio > 0.65: 10
131
+
132
+ Also penalize word counts outside the optimal 800–2,500 word range:
133
+ - < 300 words: score cap at 40 (too thin to be self-contained on a topic)
134
+ - 300–800 words: light penalty (-10)
135
+ - 800–2,500 words: optimal range, no penalty
136
+ - > 2,500 words: light penalty (-5) — tends toward context-dependent sprawl
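A sketch of the ratio bands and word-count adjustments above. The listed ranges share their boundary values (0.25, 0.35, 0.50, and 800 words), so this version makes an assumption: each shared ratio boundary belongs to the lower-scoring band, and exactly 800 words counts as optimal. The function name is hypothetical.

```python
def self_containment(pronoun_count: int, noun_count: int, word_count: int) -> int:
    """Score self-containment from the pronoun-to-noun ratio and page length."""
    if noun_count <= 0:
        raise ValueError("noun_count must be positive")
    ratio = pronoun_count / noun_count

    # Ratio bands; upper bounds are exclusive except for the 0.50-0.65 band.
    if ratio < 0.15:
        base = 100
    elif ratio < 0.25:
        base = 85
    elif ratio < 0.35:
        base = 70
    elif ratio < 0.50:
        base = 50
    elif ratio <= 0.65:
        base = 30
    else:
        base = 10

    # Word-count adjustments.
    if word_count < 300:
        return min(base, 40)  # too thin: cap at 40
    if word_count < 800:
        return base - 10      # light penalty
    if word_count <= 2500:
        return base           # optimal range
    return base - 5           # long-page penalty


print(self_containment(20, 100, 500))  # ratio 0.20 -> 85, minus 10 = 75
```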
137
+
138
+ ---
139
+
140
+ #### Structural Readability (0–100)
141
+
142
+ Start at 100 and deduct:
143
+
144
+ | Issue | Deduction |
145
+ |---|---|
146
+ | No H1 present | -30 |
147
+ | Fewer than 2 H2 sections | -15 |
148
+ | No numbered or bulleted lists | -15 |
149
+ | Average paragraph length > 150 words | -15 |
150
+ | Any paragraph > 300 words (wall of text) | -10 per occurrence, capped at -20 |
151
+ | No visual separation between sections | -10 |
152
+
153
+ Minimum score: 0.
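The deduction table can be sketched as below. The `PageStructure` fields are hypothetical names for the signals in the table; how each signal is detected (for example, what counts as visual separation) is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class PageStructure:
    has_h1: bool
    h2_count: int
    has_lists: bool
    avg_paragraph_words: float
    paragraphs_over_300_words: int
    has_section_separation: bool


def structural_readability(page: PageStructure) -> int:
    """Start at 100 and apply each deduction from the table."""
    score = 100
    if not page.has_h1:
        score -= 30
    if page.h2_count < 2:
        score -= 15
    if not page.has_lists:
        score -= 15
    if page.avg_paragraph_words > 150:
        score -= 15
    # -10 per wall-of-text paragraph, capped at -20 in total.
    score -= min(10 * page.paragraphs_over_300_words, 20)
    if not page.has_section_separation:
        score -= 10
    return max(score, 0)  # the deductions sum to 105, so clamp at 0


page = PageStructure(False, 3, True, 90.0, 3, True)
print(structural_readability(page))  # 100 - 30 - 20 = 50
```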
154
+
155
+ ---
156
+
157
+ #### Statistical Density (0–100)
158
+
159
+ Count data points per 1,000 words. Data points include:
160
+ - Percentage figures (e.g., "47%," "3x increase")
161
+ - Specific numbers with units (e.g., "65ms," "$2,800," "14 crawlers")
162
+ - Named statistics (e.g., "200M weekly active users")
163
+ - Year references as evidence (e.g., "Q1 2026 data shows...")
164
+
165
+ **Scoring:**
166
+ - < 2 data points per 1,000 words: 10
167
+ - 2–5: 35
168
+ - 5–8: 60
169
+ - 8–15: 100 (optimal range)
170
+ - 15–25: 85 (slightly over-cited — readability may suffer)
171
+ - > 25: 60 (data overload — hard for AI to extract clean answers)
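A sketch of the density bands above. The listed ranges share boundary values, so this version assumes that 8 and 15 both fall in the optimal band (as the "optimal: 8–15" note in the metric table suggests), that 2 and 5 belong to the higher band, and that 25 belongs to the 15–25 band. Function names are hypothetical.

```python
def data_points_per_1000_words(data_points: int, word_count: int) -> float:
    """Normalize a raw data-point count to a per-1,000-word density."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return data_points * 1000 / word_count


def statistical_density(density: float) -> int:
    """Map data points per 1,000 words to the 0-100 metric score."""
    if density < 2:
        return 10
    if density < 5:
        return 35
    if density < 8:
        return 60
    if density <= 15:
        return 100  # optimal range
    if density <= 25:
        return 85   # slightly over-cited
    return 60       # data overload


# 12 data points in a 1,500-word page: 8.0 per 1,000 words
print(statistical_density(data_points_per_1000_words(12, 1500)))  # 100
```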
172
+
173
+ ---
174
+
175
+ #### Uniqueness Signals (0–100)
176
+
177
+ | Signal | Points |
178
+ |---|---|
179
+ | First-party study or original research | 35 |
180
+ | Proprietary data or named internal data source | 25 |
181
+ | Original methodology with named process | 20 |
182
+ | Unique branded terminology | 15 |
183
+ | Non-generic competitive differentiation statement | 10 |
184
+ | Named case study with specific results | 10 |
185
+
186
+ Cap at 100. If total > 100, use 100.
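The points table sums to 115 when every signal is present, which is why the cap matters. A minimal sketch, with hypothetical signal keys standing in for the table rows:

```python
from typing import Iterable

# Points copied from the table; the keys are illustrative names.
UNIQUENESS_POINTS = {
    "first_party_research": 35,
    "proprietary_data": 25,
    "original_methodology": 20,
    "branded_terminology": 15,
    "competitive_differentiation": 10,
    "named_case_study": 10,
}


def uniqueness_signals(signals_present: Iterable[str]) -> int:
    """Sum the points for each detected signal, capped at 100."""
    total = sum(UNIQUENESS_POINTS[signal] for signal in set(signals_present))
    return min(total, 100)


print(uniqueness_signals(UNIQUENESS_POINTS))  # 115 capped to 100
print(uniqueness_signals(["first_party_research", "named_case_study"]))  # 45
```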
187
+
188
+ ---
189
+
190
+ ## Component Score Normalization
191
+
192
+ Each subagent returns a raw 0–100 score. No normalization is needed — all scores are already on the same scale.
193
+
194
+ Do not normalize scores before applying weights. Apply weights directly to the 0–100 values.
195
+
196
+ ---
197
+
198
+ ## Score Interpretation for Client Communication
199
+
200
+ ### Grade A (85–100)
201
+ > "Your site is well-positioned for AI-driven search. You are in the top tier for citability, crawler access, and content quality. We recommend ongoing monthly monitoring and targeted improvements to maintain this position as AI search evolves."
202
+
203
+ ### Grade B (70–84)
204
+ > "Good foundation. You have established AI search presence, but specific gaps are limiting your ceiling. Targeted improvements in [lowest-scoring component] can push you into the A tier within 30–60 days."
205
+
206
+ ### Grade C (55–69)
207
+ > "Moderate visibility. You are not capturing significant AI-referred traffic yet. A full remediation roadmap is recommended. Addressing the top 3 gaps typically produces a measurable score improvement within 30 days."
208
+
209
+ ### Grade D (40–54)
210
+ > "Your site has significant AI search blindspots. AI systems may not be citing or recommending your content at all, even for queries you should rank for. Immediate technical and content remediation is required."
211
+
212
+ ### Grade F (below 40)
213
+ > "Critical issues detected. AI crawlers may be blocked outright, or your content lacks the structural signals needed for AI citation. A full remediation engagement is required before any AI-referred traffic is possible."
214
+
215
+ ---
216
+
217
+ ## Benchmark Context
218
+
219
+ These benchmarks are based on geo-seo-claude audit data:
220
+
221
+ | Benchmark | Score |
222
+ |---|---|
223
+ | Portfolio average (all audited sites) | 58 / 100 |
224
+ | Top quartile (75th percentile) | 76 / 100 |
225
+ | Top 10% | 85 / 100 |
226
+ | Minimum for meaningful AI-referred traffic | ~65 / 100 |
227
+ | Sites that actively appear in Perplexity citations | ~72+ / 100 |
228
+ | Sites that appear in Google AI Overviews regularly | ~78+ / 100 |
229
+
230
+ **llms.txt adoption rate:** Approximately 22% of audited sites have a valid `llms.txt` as of Q1 2026.