npm - @growthub/cli - Versions diffs - 0.3.52 → 0.3.53 - Mend

@growthub/cli 0.3.52 → 0.3.53

Files changed (38)

package/assets/worker-kits/growthub-geo-seo-v1/docs/geo-seo-fork-integration.md ADDED

@@ -0,0 +1,244 @@
+# geo-seo-claude Fork Integration
+**Source repo:** https://github.com/zubair-trabzada/geo-seo-claude
+---
+## What geo-seo-claude Is
+`geo-seo-claude` is an open-source Claude Code skill for GEO (Generative Engine Optimization) and SEO auditing. It provides 14 specialized commands that run against any live URL to produce AI search visibility data, citability scores, crawler access reports, and remediation artifacts.
+The Growthub GEO SEO Studio wraps this tool with:
+- A brand kit system (per-client configuration)
+- Structured output templates (11 templates, 9 core artifact types)
+- A 5-layer documentation architecture
+- A 4-week remediation roadmap format
+- PDF report generation integration
+- Agency proposal templates
+---
+## Architecture of the Fork
+```
+geo-seo-claude/
+  geo/                        # Main skill entry point — Claude Code reads this
+    skill.md                  # Master skill definition
+  skills/                     # 14 sub-skill definitions (one per /geo command)
+    audit.md
+    citability.md
+    crawlers.md
+    brands.md
+    report.md
+    report-pdf.md
+    content.md
+    schema.md
+    technical.md
+    llmstxt.md
+    quick.md
+    proposal.md
+    prospect.md
+    compare.md
+  agents/                     # 5 parallel subagent definitions
+    geo-ai-visibility.md
+    geo-content.md
+    geo-platform-analysis.md
+    geo-schema.md
+    geo-technical.md
+  scripts/                    # Python utility scripts
+    fetch_page.py             # Playwright-based page fetcher and parser
+    citability_scorer.py      # 5-metric citability algorithm
+    brand_scanner.py          # 8-platform brand mention scanner
+    generate_pdf_report.py    # ReportLab PDF generator
+    llmstxt_generator.py      # llms.txt and llms-full.txt generator
+    crm_dashboard.py          # Flask CRM dashboard
+  schema/                     # 6 JSON-LD templates
+    organization.json
+    article.json
+    faqpage.json
+    product.json
+    localbusiness.json
+    breadcrumblist.json
+  requirements.txt            # Python dependencies
+  README.md
+```
+---
+## Key Scripts and What Each Does
+### `scripts/fetch_page.py`
+Fetches a URL using Playwright (chromium) and extracts all signals needed for GEO analysis.
+**What it produces:**
+- Rendered HTML (after JavaScript execution)
+- robots.txt contents (fetched from domain root)
+- llms.txt and llms-full.txt status (exists/missing/malformed)
+- sitemap.xml discovery and URL count
+- HTTP response headers
+- Page word count and heading hierarchy
+- JSON-LD structured data blocks
+**Usage:**
+```bash
+python scripts/fetch_page.py https://example.com
+python scripts/fetch_page.py https://example.com --output analysis.json
+```
+---
+### `scripts/citability_scorer.py`
+Runs the 5-metric citability algorithm against page content extracted by `fetch_page.py`.
+**5 metrics:**
+1. Answer Block Quality (30%)
+2. Self-Containment (25%)
+3. Structural Readability (20%)
+4. Statistical Density (15%)
+5. Uniqueness Signals (10%)
+**Usage:**
+```bash
+python scripts/citability_scorer.py --input analysis.json
+python scripts/citability_scorer.py --url https://example.com
+```
+---
+### `scripts/brand_scanner.py`
+Scans 8 platforms for brand mentions and computes a brand authority score.
+**Platforms scanned:**
+YouTube, Reddit, Wikipedia, LinkedIn, Twitter/X, GitHub, Quora, HackerNews
+**Usage:**
+```bash
+python scripts/brand_scanner.py --brand "Brand Name" --domain example.com
+```
+---
+### `scripts/generate_pdf_report.py`
+Generates a branded PDF report using ReportLab from a GEO score JSON data file.
+**Inputs required:**
+- `geo_score_data.json` — machine-readable score data produced by the audit
+- Client name and target URL
+- Optional: logo file path, color scheme
+**Usage:**
+```bash
+python scripts/generate_pdf_report.py \
+  --input output/<client>/geo_score_data.json \
+  --output output/<client>/report.pdf \
+  --brand "Client Name"
+```
+---
+### `scripts/llmstxt_generator.py`
+Generates `llms.txt` and `llms-full.txt` files for a domain.
+**Usage:**
+```bash
+python scripts/llmstxt_generator.py --domain https://example.com --from-sitemap
+python scripts/llmstxt_generator.py --domain https://example.com --full
+python scripts/llmstxt_generator.py --domain https://example.com --dry-run
+```
+---
+### `scripts/crm_dashboard.py`
+Launches a Flask web dashboard for managing audit history and client accounts.
+**Usage:**
+```bash
+python scripts/crm_dashboard.py
+# Dashboard available at http://localhost:5000 (or FLASK_PORT if set)
+```
+---
+## Python Dependencies
+```
+beautifulsoup4      # HTML parsing
+playwright          # Dynamic page fetching (requires: playwright install chromium)
+reportlab           # PDF generation
+flask               # CRM dashboard web server
+rich                # Terminal output formatting
+validators          # URL validation
+requests            # HTTP requests for robots.txt, llms.txt
+lxml                # Fast XML/HTML parsing (sitemap processing)
+```
+Install all:
+```bash
+pip install -r requirements.txt
+playwright install chromium
+```
+---
+## Installation
+```bash
+# Clone the fork
+git clone https://github.com/zubair-trabzada/geo-seo-claude ~/geo-seo-claude
+# Install Python dependencies
+cd ~/geo-seo-claude
+pip install -r requirements.txt
+# Install Playwright browser
+playwright install chromium
+# Verify installation
+python scripts/fetch_page.py https://example.com
+```
+Or use the kit's setup script:
+```bash
+bash setup/clone-fork.sh
+```
+---
+## When the Fork Is Unavailable
+If the local fork is not cloned, the GEO SEO Operator switches to **agent-only mode**:
+- Page fetching is performed via Claude's built-in fetch capability
+- Citability scoring is performed manually using the 5-metric algorithm from `docs/scoring-methodology.md`
+- Brand scanning is performed via Claude's knowledge of platform signals
+- PDF generation is not available — Markdown equivalents are produced instead
+- All other output artifacts are still produced and follow the same templates
+Agent-only mode is always valid and produces complete outputs. The local fork adds:
+- Higher-accuracy citability scores (Python parser vs. agent estimate)
+- Real-time robots.txt parsing with exact user-agent matching
+- Playwright-rendered page content (handles JavaScript-heavy sites)
+- PDF generation capability
+- Flask CRM dashboard
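The switch between fork mode and agent-only mode can be sketched as a simple file check. This is a minimal sketch, not the kit's actual detection logic: the `select_mode` name, the default clone location, and the choice of which scripts must exist are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: the presence of these two scripts (paths taken from the tree
# above) is enough to treat the fork as usable. The kit may check differently.
REQUIRED_SCRIPTS = ("scripts/fetch_page.py", "scripts/citability_scorer.py")


def select_mode(fork_root=Path.home() / "geo-seo-claude"):
    """Return "fork" when the local clone looks usable, else "agent-only"."""
    root = Path(fork_root)
    if all((root / rel).is_file() for rel in REQUIRED_SCRIPTS):
        return "fork"
    return "agent-only"
```

Calling `select_mode()` with no arguments checks the clone location used in the installation steps (`~/geo-seo-claude`).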
+---
+## Upstream Assumptions Frozen in This Kit
+These assumptions were verified against the fork at kit creation time (2026-04-14):
+- 14 /geo commands are available (listed in skills.md)
+- 5 parallel subagents are defined in `agents/`
+- Python 3.8+ is required
+- Playwright uses chromium by default
+- `requirements.txt` includes all dependencies listed above
+- `schema/` contains 6 JSON-LD templates
+- CRM dashboard runs on Flask at port 5000 by default
+If the upstream fork changes its API or file structure after this date, update `runtime-assumptions.md` accordingly.

package/assets/worker-kits/growthub-geo-seo-v1/docs/pdf-report-layer.md ADDED

@@ -0,0 +1,139 @@
+# PDF Report Layer
+---
+## When to Trigger PDF Generation
+Trigger PDF generation only in these cases:
+1. The user explicitly requests `/geo report-pdf`
+2. The brand kit has `delivery_format: pdf` or `delivery_format: both`
+3. The operator completes a full audit and the user confirms PDF delivery
+Do not generate a PDF unless one of these conditions is met. Markdown is the default delivery format.
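The three trigger conditions reduce to a small predicate. The sketch below is illustrative only: the function and parameter names are hypothetical, and `"markdown"` as the literal default value of `delivery_format` is an assumption (the text only says Markdown is the default format).

```python
def should_generate_pdf(command=None, delivery_format="markdown",
                        audit_complete=False, user_confirmed=False):
    """Return True only when one of the three documented triggers applies."""
    # 1. Explicit request for the PDF command.
    if command == "/geo report-pdf":
        return True
    # 2. Brand kit asks for PDF delivery, alone or alongside Markdown.
    if delivery_format in ("pdf", "both"):
        return True
    # 3. Full audit finished and the user confirmed PDF delivery.
    return audit_complete and user_confirmed
```

A completed audit without user confirmation does not trigger generation, so `should_generate_pdf(audit_complete=True)` returns `False`.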
+---
+## Script and Requirements
+**Script:** `scripts/generate_pdf_report.py` (in geo-seo-claude fork)
+**Python dependency:** ReportLab (`pip install reportlab`)
+**Required data:** The script consumes `geo_score_data.json` — the machine-readable score data produced at the end of the audit workflow. This file must be written to the output directory before triggering PDF generation.
+---
+## What the Script Needs
+The following inputs are required by `generate_pdf_report.py`:
+| Input | Source | Required |
+|---|---|---|
+| `--input path/to/geo_score_data.json` | Written by operator at end of audit | Yes |
+| `--output path/to/report.pdf` | Specified by operator | Yes |
+| `--brand "Client Name"` | From brand kit `client_name` field | Yes |
+| Target URL | From `geo_score_data.json` | Yes (in data) |
+| All 6 component scores | From `geo_score_data.json` | Yes (in data) |
+| Top findings list | From `geo_score_data.json` | Yes (in data) |
+| Remediation roadmap summary | From `geo_score_data.json` | Yes (in data) |
+| `--logo path/to/logo.png` | From brand kit `logo_file` field | Optional |
+---
+## Usage
+```bash
+# Standard invocation from kit root
+python ~/geo-seo-claude/scripts/generate_pdf_report.py \
+  --input output/<client-slug>/<project-slug>/geo_score_data.json \
+  --output output/<client-slug>/<project-slug>/<ClientSlug>_GeoScoreReport_v1_<YYYYMMDD>.pdf \
+  --brand "Client Name" \
+  --logo brands/<client-slug>/assets/logo.png
+# Without a logo (the default Growthub placeholder is used instead)
+python ~/geo-seo-claude/scripts/generate_pdf_report.py \
+  --input output/<client-slug>/<project-slug>/geo_score_data.json \
+  --output output/<client-slug>/<project-slug>/report.pdf \
+  --brand "Client Name"
+```
+---
+## What the PDF Contains
+The branded PDF report produced by `generate_pdf_report.py` includes the following sections:
+| Section | Pages | Content |
+|---|---|---|
+| Cover page | 1 | Client name, target URL, GEO Score gauge visualization, letter grade, audit date |
+| Executive summary | 1 | 3 key findings, score vs. category benchmark, top recommended action |
+| GEO Score breakdown | 2 | Visual bar chart for all 6 components, weighted contribution table |
+| Citability analysis | 1 | 5-metric breakdown with visual gauges, letter grade, top 3 improvements |
+| Crawler access matrix | 1 | 14-crawler table with color-coded access status (green/yellow/red) |
+| Top findings | 2 | Ranked findings with impact level, effort estimate, and specific fix instructions |
+| 4-week roadmap | 1 | Sprint table with actions, owners, and projected score gain |
+| Back cover | 1 | Growthub contact information, next steps, rescore recommendation |
+**Typical PDF length:** 10–12 pages (Letter or A4)
+---
+## geo_score_data.json Format
+The operator must write this file before triggering PDF generation. See `examples/pdf-report-sample.md` for the complete format.
+Minimum required structure:
+```json
+{
+  "report_meta": {
+    "client_name": "",
+    "target_url": "",
+    "audit_date": "",
+    "report_version": "1"
+  },
+  "geo_score": {
+    "composite": 0,
+    "grade": ""
+  },
+  "components": [
+    { "name": "", "weight": 0.0, "raw_score": 0, "weighted_score": 0.0, "grade": "", "primary_issue": "" }
+  ],
+  "top_findings": [
+    { "rank": 1, "title": "", "description": "", "component": "", "priority": "", "expected_score_gain": 0 }
+  ],
+  "remediation_summary": {
+    "score_before": 0,
+    "score_after_projected": 0
+  }
+}
+```
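Before triggering PDF generation, it can help to confirm that the file carries the minimum structure shown above. The following is a hedged sketch: `missing_fields` is a hypothetical helper, not part of the fork, and it checks only for key presence, not value types or ranges.

```python
# Keys copied from the minimum structure above.
REQUIRED_OBJECTS = {
    "report_meta": ("client_name", "target_url", "audit_date", "report_version"),
    "geo_score": ("composite", "grade"),
    "remediation_summary": ("score_before", "score_after_projected"),
}
REQUIRED_LISTS = {
    "components": ("name", "weight", "raw_score", "weighted_score",
                   "grade", "primary_issue"),
    "top_findings": ("rank", "title", "description", "component",
                     "priority", "expected_score_gain"),
}


def missing_fields(data):
    """Return dotted paths of required keys that are absent from data."""
    missing = []
    for section, keys in REQUIRED_OBJECTS.items():
        obj = data.get(section)
        if not isinstance(obj, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{k}" for k in keys if k not in obj)
    for section, keys in REQUIRED_LISTS.items():
        items = data.get(section)
        if not isinstance(items, list):
            missing.append(section)
            continue
        for i, item in enumerate(items):
            if not isinstance(item, dict):
                missing.append(f"{section}[{i}]")
                continue
            missing.extend(f"{section}[{i}].{k}" for k in keys if k not in item)
    return missing
```

An empty return value means every required key is present; the data can be loaded with `json.load` from the output directory before calling the helper.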
+---
+## In Agent-Only Mode
+If the local fork is not available, PDF generation is not possible.
+**Fallback behavior:**
+1. Produce the complete `GeoScoreSummary` Markdown file as the primary score artifact
+2. Note at the top: `> PDF generation requires the local geo-seo-claude fork. Run bash setup/clone-fork.sh to enable.`
+3. Write `geo_score_data.json` to the output directory so PDF can be generated later when the fork is available
+The Markdown output is a complete substitute for stakeholder communication. All the same data is present — the PDF simply adds visual formatting.
+---
+## PDF Styling
+The PDF uses the brand kit's color values:
+| Element | Color Source |
+|---|---|
+| Header and cover background | `colors.primary` from brand kit |
+| Accent bars and highlights | `colors.accent` from brand kit |
+| Score gauge fill | Dynamic: green (A), blue (B), yellow (C), orange (D), red (F) |
+| Body text | `colors.dark` or black |
+| Table alternating rows | `colors.secondary` at 10% opacity |
+If no brand kit colors are found, the PDF uses Growthub defaults (`#1A1A2E` primary, `#E94560` accent).

package/assets/worker-kits/growthub-geo-seo-v1/docs/scoring-methodology.md ADDED

@@ -0,0 +1,230 @@
+# GEO Scoring Methodology
+**Source of truth for all scoring rules. The operator must apply these formulas exactly.**
+---
+## GEO Score Formula
+The GEO Score is a weighted composite of 6 component scores, each ranging from 0 to 100.
+```
+GEO Score = (AI Citability & Visibility × 0.25)
+          + (Brand Authority × 0.20)
+          + (Content Quality & E-E-A-T × 0.20)
+          + (Technical Foundations × 0.15)
+          + (Structured Data × 0.10)
+          + (Platform Optimization × 0.10)
+```
+### Component Weights
+| Component | Weight | Rationale |
+|---|---|---|
+| AI Citability & Visibility | 25% | Core measure of AI search readiness — crawler access + citability signals |
+| Brand Authority | 20% | AI systems prefer to cite recognized, cross-platform brands |
+| Content Quality & E-E-A-T | 20% | Google and AI systems both weight E-E-A-T heavily for content ranking |
+| Technical Foundations | 15% | Technical barriers prevent all other optimization from working |
+| Structured Data | 10% | Schema markup directly feeds AI answer surfaces (Google AI Overviews, ChatGPT) |
+| Platform Optimization | 10% | Platform-specific readiness for the 4 major AI search engines |
+### Computation Rules
+- Each component score is 0–100 (no decimals before aggregation)
+- Apply weights before rounding
+- Round the composite score to the nearest integer
+- If a component score is unavailable (data gap), use **50** as the neutral default
+- Flag any data-gap component in the GeoScoreSummary output
+**Example:**
+```
+AI Citability:  58 × 0.25 = 14.50
+Brand Auth:     66 × 0.20 = 13.20
+Content:        71 × 0.20 = 14.20
+Technical:      84 × 0.15 = 12.60
+Schema:         42 × 0.10 =  4.20
+Platform:       61 × 0.10 =  6.10
+                            ─────
+Composite:                  64.80 → rounds to 65
+```
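The composite formula and its computation rules can be expressed in a few lines. This is a sketch under stated assumptions: the component key names are hypothetical, and ties at exactly .5 are rounded half up, since the text says only "nearest integer". `Decimal` is used so the weighted sum is not affected by binary floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights from the Component Weights table; key names are illustrative.
WEIGHTS = {
    "ai_citability_visibility": Decimal("0.25"),
    "brand_authority": Decimal("0.20"),
    "content_quality_eeat": Decimal("0.20"),
    "technical_foundations": Decimal("0.15"),
    "structured_data": Decimal("0.10"),
    "platform_optimization": Decimal("0.10"),
}
NEUTRAL_DEFAULT = 50  # used when a component score is unavailable


def geo_score(components):
    """Return (composite, data_gaps) for a dict of integer 0-100 scores."""
    total = Decimal(0)
    data_gaps = []
    for name, weight in WEIGHTS.items():
        raw = components.get(name)
        if raw is None:
            raw = NEUTRAL_DEFAULT
            data_gaps.append(name)  # flag for the GeoScoreSummary output
        total += Decimal(raw) * weight  # weights applied before rounding
    # Assumption: half-up rounding for the final composite.
    composite = int(total.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return composite, data_gaps
```

With the six scores from the example above (58, 66, 71, 84, 42, 61), the weighted sum is 64.80 and the function returns a composite of 65 with no data gaps.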
+---
+## Letter Grade Thresholds
+| Grade | Score Range | AI Search Status |
+|---|---|---|
+| A | 85–100 | Highly optimized — strong citability, clean crawler access, rich schema |
+| B | 70–84 | Good — some gaps, addressable in one sprint cycle |
+| C | 55–69 | Moderate — missing key citability signals, schema gaps, crawler issues |
+| D | 40–54 | Poor — likely not capturing meaningful AI-referred traffic |
+| F | Below 40 | Not AI-search-ready — critical issues across multiple components |
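The thresholds map directly onto a lookup. A minimal sketch, assuming the input is the already-rounded integer composite; the function name is illustrative.

```python
def letter_grade(score):
    """Map an integer GEO Score (0-100) to its letter grade."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 55:
        return "C"
    if score >= 40:
        return "D"
    return "F"
```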
+---
+## Citability Algorithm
+The Citability Score (one of the inputs to AI Citability & Visibility) uses a 5-metric algorithm.
+### 5-Metric Breakdown
+| Metric | Weight | What It Measures | Scoring Method |
+|---|---|---|---|
+| Answer Block Quality | 30% | Do paragraphs contain complete, self-sufficient answers AI can quote verbatim? | Score each paragraph 0–10; average across all paragraphs |
+| Self-Containment | 25% | Can each paragraph be understood without reading surrounding context? | Pronoun-to-noun ratio; lower = better self-containment |
+| Structural Readability | 20% | Does the page use headings, short paragraphs, and lists to enable AI parsing? | Check H1/H2/list/paragraph-length signals |
+| Statistical Density | 15% | Does the page contain specific numbers, percentages, and data references? | Count data points per 1,000 words; optimal: 8–15 |
+| Uniqueness Signals | 10% | Does the content contain proprietary claims or data not found elsewhere? | Check for first-party research, original data, unique terminology |
+**Citability Score formula:**
+```
+Citability = (Answer Block Quality × 0.30)
+           + (Self-Containment × 0.25)
+           + (Structural Readability × 0.20)
+           + (Statistical Density × 0.15)
+           + (Uniqueness Signals × 0.10)
+```
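As a worked illustration of the formula, the sketch below combines five metric scores into a Citability Score. The key names and the sample values in the note are hypothetical, and no rounding is applied because the text does not specify one at this stage.

```python
# Weights from the 5-Metric Breakdown table; key names are illustrative.
CITABILITY_WEIGHTS = {
    "answer_block_quality": 0.30,
    "self_containment": 0.25,
    "structural_readability": 0.20,
    "statistical_density": 0.15,
    "uniqueness_signals": 0.10,
}


def citability_score(metrics):
    """Weighted sum of the five 0-100 metric scores."""
    return sum(metrics[name] * weight
               for name, weight in CITABILITY_WEIGHTS.items())
```

For metric scores of 70, 85, 85, 60, and 35 (in table order), the weighted terms are 21 + 21.25 + 17 + 9 + 3.5, which gives 71.75.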
+---
+### Metric Scoring Rules
+#### Answer Block Quality (0–100)
+Evaluate each paragraph on 3 criteria:
+1. Subject is clearly stated (no pronoun-as-subject opener)
+2. Supporting evidence or data is present
+3. No unresolved pronoun references
+Score each paragraph:
+- All 3 criteria met: 10/10
+- 2 criteria met: 7/10
+- 1 criterion met: 4/10
+- None met: 1/10
+Average the paragraph scores, then multiply by 10 to get the 0–100 score. Because a paragraph that meets no criteria still earns 1/10, the effective minimum is 10.
+**High-score example:**
+> "GPTBot (used by ChatGPT's Browse mode) is blocked in the site's robots.txt. This means ChatGPT cannot fetch and cite this page's content, even when users directly ask about topics the page covers."
+**Low-score example:**
+> "It blocks them from accessing the site. This causes problems because they can't see what's there."
+*(No subject named, no evidence, multiple unresolved pronouns)*
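The paragraph-level rubric can be sketched as follows. Judging the three criteria is left to the operator; this hypothetical helper only takes the resulting true/false judgments per paragraph and applies the point table.

```python
# Points awarded by number of criteria met, as listed in the rubric.
POINTS_BY_CRITERIA_MET = {3: 10, 2: 7, 1: 4, 0: 1}


def answer_block_quality(paragraph_checks):
    """Score a page from per-paragraph criteria judgments.

    paragraph_checks: one (subject_stated, has_evidence, pronouns_resolved)
    tuple of booleans per paragraph.
    """
    if not paragraph_checks:
        raise ValueError("at least one paragraph is required")
    points = [POINTS_BY_CRITERIA_MET[sum(bool(c) for c in checks)]
              for checks in paragraph_checks]
    # Average the per-paragraph points, then scale by 10.
    return sum(points) / len(points) * 10
```

A page with one paragraph meeting all three criteria and one meeting none averages (10 + 1) / 2 = 5.5 points, for a score of 55.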
+---
+#### Self-Containment (0–100)
+**Pronoun inventory:** it, they, this, that, these, those, he, she, we (when antecedent is not in same sentence)
+1. Count pronouns used as subjects or objects without a prior noun in the same sentence
+2. Count total noun references (named entities + common nouns)
+3. Pronoun-to-noun ratio = pronoun count / noun count
+**Scoring:**
+- Ratio < 0.15: 100
+- Ratio 0.15–0.25: 85
+- Ratio 0.25–0.35: 70
+- Ratio 0.35–0.50: 50
+- Ratio 0.50–0.65: 30
+- Ratio > 0.65: 10
+Also adjust the score for word count; 800–2,500 words is the optimal range:
+- < 300 words: score cap at 40 (too thin to be self-contained on a topic)
+- 300–800 words: light penalty (-10)
+- 800–2,500 words: optimal range, no penalty
+- > 2,500 words: light penalty (-5) — tends toward context-dependent sprawl
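The ratio bands and word-count adjustments can be combined as below. This is a sketch with two labeled assumptions, because the listed ranges share their endpoints: a ratio that sits exactly on a shared boundary (0.25, 0.35, 0.50) is placed in the lower-scoring band, and a page of exactly 800 words is treated as being in the optimal range. Function names are illustrative.

```python
def ratio_band_score(ratio):
    """Map a pronoun-to-noun ratio to its band score."""
    if ratio < 0.15:
        return 100
    if ratio < 0.25:   # assumption: 0.25 itself falls in the next band
        return 85
    if ratio < 0.35:
        return 70
    if ratio < 0.50:
        return 50
    if ratio <= 0.65:  # only ratios strictly above 0.65 score 10
        return 30
    return 10


def self_containment(pronoun_count, noun_count, word_count):
    """Band score adjusted for page length."""
    if noun_count <= 0:
        raise ValueError("noun_count must be positive")
    score = ratio_band_score(pronoun_count / noun_count)
    if word_count < 300:
        return min(score, 40)   # cap for thin pages
    if word_count < 800:        # assumption: 800 counts as optimal
        return score - 10
    if word_count <= 2500:
        return score
    return score - 5
```

For example, 30 unresolved pronouns against 100 noun references on a 500-word page gives a ratio of 0.30, a band score of 70, and a final score of 60 after the length penalty.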
+---
+#### Structural Readability (0–100)
+Start at 100 and deduct:
+| Issue | Deduction |
+|---|---|
+| No H1 present | -30 |
+| Fewer than 2 H2 sections | -15 |
+| No numbered or bulleted lists | -15 |
+| Average paragraph length > 150 words | -15 |
+| Any paragraph > 300 words (wall of text) | -10 per occurrence, capped at -20 |
+| No visual separation between sections | -10 |
+Minimum score: 0.
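The deduction table translates into a straightforward checklist. In this sketch the parameter names are hypothetical, and how each signal is detected (for example, what counts as visual separation) is left to the caller.

```python
def structural_readability(has_h1, h2_count, has_lists,
                           paragraph_word_counts, has_section_separation):
    """Start at 100 and apply the documented deductions, floored at 0."""
    score = 100
    if not has_h1:
        score -= 30
    if h2_count < 2:
        score -= 15
    if not has_lists:
        score -= 15
    if paragraph_word_counts:
        average = sum(paragraph_word_counts) / len(paragraph_word_counts)
        if average > 150:
            score -= 15
    # -10 per wall-of-text paragraph, capped at -20 in total.
    walls = sum(1 for count in paragraph_word_counts if count > 300)
    score -= min(10 * walls, 20)
    if not has_section_separation:
        score -= 10
    return max(score, 0)
```

A page with an H1, two H2 sections, lists, and clear separation, but paragraphs of 100 and 320 words, loses 15 for the 210-word average and 10 for the single long paragraph, ending at 75.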
+---
+#### Statistical Density (0–100)
+Count data points per 1,000 words. Data points include:
+- Percentages and multipliers (e.g., "47%," "3x increase")
+- Specific numbers with units (e.g., "65ms," "$2,800," "14 crawlers")
+- Named statistics (e.g., "200M weekly active users")
+- Year references as evidence (e.g., "Q1 2026 data shows...")
+**Scoring:**
+- < 2 data points per 1,000 words: 10
+- 2–5: 35
+- 5–8: 60
+- 8–15: 100 (optimal range)
+- 15–25: 85 (slightly over-cited — readability may suffer)
+- > 25: 60 (data overload — hard for AI to extract clean answers)
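The density bands can be sketched as a lookup on data points per 1,000 words. The listed ranges share endpoints, so this sketch makes its choices explicit as assumptions: 8 and 15 both count as optimal (matching the "optimal: 8–15" wording), 25 scores 85, and a density of exactly 5 is placed in the 5–8 band. The function name is illustrative.

```python
def statistical_density(data_points, word_count):
    """Score data-point density per 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    per_thousand = data_points * 1000 / word_count
    if per_thousand < 2:
        return 10
    if per_thousand < 5:    # assumption: exactly 5 moves to the next band
        return 35
    if per_thousand < 8:
        return 60
    if per_thousand <= 15:  # optimal range, both ends inclusive
        return 100
    if per_thousand <= 25:
        return 85
    return 60
```

For instance, 12 data points on a 1,000-word page land in the optimal band, while 3 data points across 2,000 words (1.5 per 1,000) score 10.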
+---
+#### Uniqueness Signals (0–100)
+| Signal | Points |
+|---|---|
+| First-party study or original research | 35 |
+| Proprietary data or named internal data source | 25 |
+| Original methodology with named process | 20 |
+| Unique branded terminology | 15 |
+| Non-generic competitive differentiation statement | 10 |
+| Named case study with specific results | 10 |
+Sum the points for every signal present and cap the total at 100 (the six signals together add up to 115).
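A short sketch of the points table with the 100-point cap. The signal key names are hypothetical labels for the six rows; deciding whether a signal is present remains the operator's judgment.

```python
# Points from the Uniqueness Signals table; key names are illustrative.
SIGNAL_POINTS = {
    "first_party_research": 35,
    "proprietary_data": 25,
    "original_methodology": 20,
    "branded_terminology": 15,
    "competitive_differentiation": 10,
    "named_case_study": 10,
}


def uniqueness_signals(present):
    """Sum the points for each distinct signal present, capped at 100."""
    total = sum(SIGNAL_POINTS[name] for name in set(present))
    return min(total, 100)
```

A page showing every signal would total 115 points before the cap and scores 100.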
+---
+## Component Score Normalization
+Each subagent returns a raw 0–100 score. No normalization is needed — all scores are already on the same scale.
+Do not normalize scores before applying weights. Apply weights directly to the 0–100 values.
+---
+## Score Interpretation for Client Communication
+### Grade A (85–100)
+> "Your site is well-positioned for AI-driven search. You are in the top tier for citability, crawler access, and content quality. We recommend ongoing monthly monitoring and targeted improvements to maintain this position as AI search evolves."
+### Grade B (70–84)
+> "Good foundation. You have established AI search presence, but specific gaps are limiting your ceiling. Targeted improvements in [lowest-scoring component] can push you into the A tier within 30–60 days."
+### Grade C (55–69)
+> "Moderate visibility. You are not capturing significant AI-referred traffic yet. A full remediation roadmap is recommended. Addressing the top 3 gaps typically produces a measurable score improvement within 30 days."
+### Grade D (40–54)
+> "Your site has significant AI search blindspots. AI systems may not be citing or recommending your content at all, even for queries you should rank for. Immediate technical and content remediation is required."
+### Grade F (below 40)
+> "Critical issues detected. AI crawlers may be blocked outright, or your content lacks the structural signals needed for AI citation. A full remediation engagement is required before any AI-referred traffic is possible."
+---
+## Benchmark Context
+These benchmarks are based on geo-seo-claude audit data:
+| Benchmark | Score |
+|---|---|
+| Portfolio average (all audited sites) | 58 / 100 |
+| Top quartile threshold (75th percentile) | 76 / 100 |
+| Top 10% | 85 / 100 |
+| Minimum for meaningful AI-referred traffic | ~65 / 100 |
+| Sites that actively appear in Perplexity citations | ~72+ / 100 |
+| Sites that appear in Google AI Overviews regularly | ~78+ / 100 |
+**llms.txt adoption rate:** Approximately 22% of audited sites have a valid `llms.txt` as of Q1 2026.