@cloudcreate/adsense-check 1.0.0 → 1.0.1

Files changed (2)
  1. package/README.md +113 -39
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,76 +1,150 @@
1
- # adsense-check
1
+ # @cloudcreate/adsense-check
2
2
 
3
- Check if a website meets Google AdSense review requirements.
3
+ Automated website checker for Google AdSense review requirements. Focuses on detecting "low value content" — the #1 rejection reason.
4
4
 
5
5
  ## Install
6
6
 
7
7
  ```bash
8
- npm install -g adsense-check
8
+ npm install -g @cloudcreate/adsense-check
9
9
  ```
10
10
 
11
- ## Usage
11
+ Or run directly without installing:
12
12
 
13
13
  ```bash
14
- # Basic check
15
- adsense-check https://example.com
14
+ npx @cloudcreate/adsense-check https://example.com
15
+ ```
16
16
 
17
- # JSON output (for programmatic use)
18
- adsense-check https://example.com --json
17
+ ## Quick Start
19
18
 
20
- # Crawl more internal pages (default: 5)
21
- adsense-check https://example.com --depth 10
19
+ ```bash
20
+ # Full check with AI analysis
21
+ adsense-check https://example.com
22
22
 
23
- # Skip AI content analysis
23
+ # Quick check without AI
24
24
  adsense-check https://example.com --skip-ai
25
25
 
26
- # Custom timeout (ms)
27
- adsense-check https://example.com --timeout 60000
26
+ # JSON output (for programmatic use)
27
+ adsense-check https://example.com --json
28
28
  ```
29
29
 
30
- ## What it checks
31
-
32
- | Category | Checks |
33
- |----------|--------|
34
- | **Content Quality** | Page word count, content duplication |
35
- | **Required Pages** | About, Privacy Policy, Contact, Terms of Service |
36
- | **Site Structure** | H1 tags, robots.txt, sitemap.xml, internal links, dead links |
37
- | **Performance** | Load time, mobile viewport, responsive layout, font size, popups |
38
- | **Policy Compliance** | Blacklisted keywords (porn, gambling, piracy, etc.) |
39
- | **AI Analysis** | Content originality, quality, compliance (requires `AI_API_KEY`) |
30
+ Reports are auto-saved to `tmp/<domain>-<timestamp>.json`.
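
As a rough illustration of working with the auto-saved reports, the sketch below picks the most recently modified report for one domain. It assumes only the `tmp/<domain>-<timestamp>.json` naming stated above; the helper name `latest_report` is hypothetical, and because the timestamp format is not documented, the sketch compares file modification times rather than parsing the filename.

```bash
# Hypothetical helper: print the newest report file for a domain.
# Assumes reports are named <dir>/<domain>-<timestamp>.json as described above.
# Runs in a subshell (parentheses) so its variables do not leak to the caller.
latest_report() (
  dir="$1"
  domain="$2"
  newest=""
  for f in "$dir/$domain"-*.json; do
    [ -e "$f" ] || continue            # glob matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"                      # keep the most recently modified file
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
)
```

Calling `latest_report tmp example.com` would print the path of the newest matching file, or return a non-zero status if none exists.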
31
+
32
+ ## What It Checks
33
+
34
+ | Category | Checks | Focus |
35
+ |----------|--------|-------|
36
+ | **Content Quality** (8) | Content ratio, depth, template detection, filler detection, duplication, freshness, site scale | Low-value content |
37
+ | **Required Pages** (4) | About, Privacy Policy, Contact, Terms of Service | Completeness |
38
+ | **Site Structure** (5) | H1 tags, robots.txt, sitemap, internal links, dead links | Crawlability |
39
+ | **Performance** (5) | Load speed, viewport, mobile overflow, font size, popups | User experience |
40
+ | **Policy Compliance** (1) | Blacklisted keywords | AdSense policy |
41
+ | **AI Analysis** (3+) | Content value, originality, compliance + per-page analysis | Low-value content |
42
+
43
+ ### Content Quality (Anti Low-Value Content)
44
+
45
+ The core focus of this tool — detecting content that AdSense reviewers flag as "low value":
46
+
47
+ - **Content Ratio**: Strips navigation/footer/sidebar, measures real content percentage
48
+ - **Content Depth**: Per-page word count of actual content (not total page text)
49
+ - **Template Detection**: Flags pages with identical structures but different words
50
+ - **Filler Detection**: Catches repeated phrases, padding, meaningless text
51
+ - **Cross-Page Duplication**: Segment-level dedup across all crawled pages
52
+ - **Content Freshness**: Checks if site has been updated recently
53
+ - **Site Scale**: Warns if site has too few content pages
54
+
55
+ ### AI Per-Page Analysis
56
+
57
+ With AI enabled, each crawled page gets individual assessment:
58
+
59
+ ```json
60
+ {
61
+ "pages": [
62
+ {
63
+ "url": "https://example.com/blog/post-1",
64
+ "title": "Post Title",
65
+ "contentChars": 1200,
66
+ "contentRatio": 85,
67
+ "contentStatus": "pass",
68
+ "issues": [],
69
+ "ai": {
70
+ "status": "pass",
71
+ "assessment": "Content provides genuine value...",
72
+ "suggestions": ["Add more specific examples"]
73
+ }
74
+ }
75
+ ]
76
+ }
77
+ ```
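
One possible way to act on this per-page data is to list the pages that need attention. The sketch below is an assumption-heavy illustration, not part of the package: it relies on the external `jq` tool, assumes the saved report exposes a top-level `pages` array shaped like the sample above, and treats any status other than `"pass"` as a problem, since `"pass"` is the only status value the README shows. The function name `failing_pages` is hypothetical.

```bash
# Hypothetical helper: print URLs of pages whose content or AI status is not "pass".
# Requires jq. Pages without an "ai" object (e.g. runs with --skip-ai) are judged
# on contentStatus alone, because a missing AI status defaults to "pass" here.
failing_pages() {
  jq -r '.pages[]
         | select(.contentStatus != "pass" or (.ai.status // "pass") != "pass")
         | .url' "$1"
}
```

It takes the path to a report file as its only argument, for example a file under `tmp/`.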
40
78
 
41
79
  ## Options
42
80
 
43
81
  ```
44
- -v, --version Show version
45
- -j, --json Output as JSON
46
- -d, --depth <n> Number of internal pages to crawl (default: 5)
47
- -s, --skip-ai Skip AI content analysis
48
- -t, --timeout <ms> Page load timeout (default: 30000)
49
- --api-key <key> Anthropic API key (or set ANTHROPIC_API_KEY env var)
82
+ -v, --version Show version
83
+ -j, --json Output JSON to stdout
84
+ -d, --depth <n> Pages to crawl (default: 10)
85
+ -s, --skip-ai Skip AI analysis
86
+ -t, --timeout <ms> Page load timeout (default: 30000)
87
+ --api-key <key> AI API key
88
+ -o, --output <dir> Report output dir (default: tmp)
89
+ --no-save Skip auto-saving report
50
90
  ```
51
91
 
52
- ## AI Analysis
92
+ ## AI Configuration
53
93
 
54
- For deeper content quality assessment, configure AI API in `.env`:
94
+ Supports any OpenAI-compatible API (DeepSeek, OpenAI, Moonshot, local LLM, etc.).
55
95
 
56
96
  ```bash
57
97
  cp .env.example .env
58
- # Edit .env and set AI_API_KEY, AI_API_BASE, AI_MODEL
98
+ # Edit .env:
99
+ # AI_API_KEY=sk-xxx
100
+ # AI_API_BASE=https://api.deepseek.com
101
+ # AI_MODEL=deepseek-chat
59
102
  ```
60
103
 
61
- Compatible with any OpenAI-format API: DeepSeek, OpenAI, Moonshot, local LLM, etc.
62
-
63
- Or pass the key directly:
104
+ Or pass directly:
64
105
 
65
106
  ```bash
66
107
  adsense-check https://example.com --api-key sk-xxx...
67
108
  ```
68
109
 
69
- ## Exit codes
110
+ ## Report Output
111
+
112
+ ### Terminal Report
113
+
114
+ ```
115
+ AdSense Checklist Report
116
+ Website: https://example.com
117
+
118
+ Content Quality
119
+ ✔ [PASS] Content ratio is normal on all pages
120
+ ✔ [PASS] Homepage has sufficient body content (2,340 characters)
121
+
122
+ ...
123
+
124
+ Page Details (5 pages analyzed)
125
+ ✔ /
126
+ Content 92% (2,340/2,540 characters)
127
+ ⚠ /blog/old-post
128
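+ Content 25% (80/320 characters)
129
+ ! Content ratio is only 25%; mostly template elements
130
+ ✘ AI: Content is too thin and lacks substantive information
131
+ → Add at least 500 characters of original analysis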
132
+
133
+ Score: 18/21
134
+ Status: NOT READY - 1 failed check needs fixing
135
+ ```
136
+
137
+ ### JSON Report
138
+
139
+ Full structured data including per-page details, saved automatically to `tmp/`.
140
+
141
+ ## Exit Codes
70
142
 
71
- - `0` No failures (ready or mostly ready)
72
- - `1` — Has failures (not ready)
73
- - `2` Error (invalid URL, network failure, etc.)
143
+ | Code | Meaning |
144
+ |------|---------|
145
+ | 0 | No failures (READY or MOSTLY READY) |
146
+ | 1 | Has failures (NOT READY) |
147
+ | 2 | Runtime error |
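
The exit codes in this table lend themselves to scripting, for instance in a CI step. A minimal sketch follows, assuming only the three codes listed; the function name `describe_exit_code` is hypothetical and the messages simply restate the table.

```bash
# Hypothetical helper: translate an adsense-check exit code into the
# meaning given in the Exit Codes table.
describe_exit_code() {
  case "$1" in
    0) echo "no failures (READY or MOSTLY READY)" ;;
    1) echo "has failures (NOT READY)" ;;
    2) echo "runtime error" ;;
    *) echo "unknown exit code: $1" ;;   # not documented in the README
  esac
}

# Example usage (not executed here), using flags from the Options section:
#   adsense-check https://example.com --skip-ai --no-save
#   describe_exit_code "$?"
```

Keeping the mapping in one function means a pipeline can print a readable reason while still failing the build on codes `1` and `2`.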
74
148
 
75
149
  ## License
76
150
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@cloudcreate/adsense-check",
3
- "version": "1.0.0",
3
+ "version": "1.0.1",
4
4
  "description": "Check if a website meets Google AdSense review requirements",
5
5
  "homepage": "https://cloudcreate.ai",
6
6
  "repository": {