google-search-scraper-api 0.0.1 → 0.0.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +104 -263
- package/index.js +6 -2
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,59 +1,59 @@
|
|
|
1
1
|
# google-search-scraper-api
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
[](https://www.scrapingbee.com/features/google/)
|
|
4
|
+
|
|
5
|
+
A Node.js client for scraping Google search results through the ScrapingBee Google Search API. It hands off the parts of Google scraping that don't add product value (proxy rotation, headless rendering, anti-bot, SERP parsing) and returns clean JSON.
|
|
4
6
|
|
|
5
7
|
```bash
|
|
6
8
|
npm install google-search-scraper-api
|
|
7
9
|
```
|
|
8
10
|
|
|
9
|
-
|
|
10
|
-
const { GoogleSearchScraper } = require('google-search-scraper-api');
|
|
11
|
+
You'll need a ScrapingBee API key. The free tier gives you 1,000 credits with no card required: [scrapingbee.com](https://www.scrapingbee.com/).
|
|
11
12
|
|
|
12
|
-
const scraper = new GoogleSearchScraper({ apiKey: 'YOUR-API-KEY' });
|
|
13
13
|
|
|
14
|
-
|
|
15
|
-
console.log(organic_results[0]);
|
|
16
|
-
// { position: 1, title: '...', url: '...', description: '...' }
|
|
17
|
-
```
|
|
14
|
+
## Why use a Google Search Scraper API?
|
|
18
15
|
|
|
19
|
-
|
|
16
|
+
Anyone who has run a homegrown Google search results scraper for more than a quarter knows the maintenance burden. Datacenter IPs work for a few hundred requests, then Google starts serving the consent wall. Residential proxies survive longer but cost more and need rotation logic. Headless browsers handle JS-rendered SERP features and AI Overviews, right up until Google rotates a layout and every CSS selector in your parser breaks at once.
|
|
20
17
|
|
|
18
|
+
A managed Google search scraper API collapses that stack into one HTTP call. ScrapingBee runs the proxies (classic, premium residential, stealth), the headless browser pool, and a parser that tracks Google's SERP schema changes. You get a structured JSON response back from a single endpoint:
|
|
21
19
|
|
|
22
|
-
|
|
20
|
+
```
|
|
21
|
+
https://app.scrapingbee.com/api/v1/store/google
|
|
22
|
+
```
|
|
23
23
|
|
|
24
|
-
|
|
24
|
+
This package is a thin Node.js wrapper around that endpoint with camelCase options and a Promise-based interface — so calling code stays idiomatic to Node rather than translating snake_case query strings by hand.
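As an illustration of that translation, here is a minimal sketch of the camelCase-to-snake_case mapping (the `PARAM_MAP` object below is hypothetical; the real translation lives in the package's `index.js`, and the parameter names come from the options table in the API reference):

```javascript
// Hypothetical sketch of the option-name translation the wrapper performs.
const PARAM_MAP = {
  query: 'search',
  country: 'country_code',
  searchType: 'search_type',
  lightRequest: 'light_request',
  addHtml: 'add_html',
  extraParams: 'extra_params',
};

function toQueryParams(options) {
  // Options without a mapping pass through under their own name.
  return Object.fromEntries(
    Object.entries(options).map(([key, value]) => [PARAM_MAP[key] ?? key, value])
  );
}

console.log(toQueryParams({ query: 'pizza new york', searchType: 'news' }));
// → { search: 'pizza new york', search_type: 'news' }
```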
|
|
25
25
|
|
|
26
|
-
The first ran on a pool of cheap datacenter IPs. It hit ~300 successful requests before Google started serving the consent wall, then the sorry page, then nothing. The second swapped in residential proxies, added a headless Chromium fleet behind a Redis queue, and survived longer — until Google rotated three SERP layout variants in two months and every CSS selector in the parsing layer broke at once.
|
|
27
26
|
|
|
28
|
-
|
|
27
|
+
## Quick start
|
|
29
28
|
|
|
29
|
+
The official ScrapingBee Google API call in JavaScript:
|
|
30
30
|
|
|
31
|
-
|
|
31
|
+
```javascript
|
|
32
|
+
const axios = require('axios');
|
|
32
33
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
34
|
+
axios.get('https://app.scrapingbee.com/api/v1/store/google', {
|
|
35
|
+
params: {
|
|
36
|
+
'api_key': 'YOUR-API-KEY',
|
|
37
|
+
'search': 'pizza new york',
|
|
38
|
+
}
|
|
39
|
+
}).then(function (response) {
|
|
40
|
+
console.log(response.data);
|
|
41
|
+
})
|
|
42
|
+
```
|
|
39
43
|
|
|
44
|
+
The same call through this package:
|
|
40
45
|
|
|
41
|
-
|
|
46
|
+
```javascript
|
|
47
|
+
const { GoogleSearchScraper } = require('google-search-scraper-api');
|
|
42
48
|
|
|
43
|
-
|
|
49
|
+
const scraper = new GoogleSearchScraper({ apiKey: 'YOUR-API-KEY' });
|
|
44
50
|
|
|
51
|
+
scraper.search({ query: 'pizza new york' }).then(function (response) {
|
|
52
|
+
console.log(response);
|
|
53
|
+
});
|
|
45
54
|
```
|
|
46
|
-
https://app.scrapingbee.com/api/v1/store/google?api_key=…&search=…
|
|
47
|
-
```
|
|
48
|
-
|
|
49
|
-
On ScrapingBee's side that triggers:
|
|
50
55
|
|
|
51
|
-
|
|
52
|
-
2. Loading the SERP in a real headless browser
|
|
53
|
-
3. Parsing the rendered page into a JSON SERP schema
|
|
54
|
-
4. Returning the structured result
|
|
55
|
-
|
|
56
|
-
You pay credits per successful response. Failed requests (HTTP 500) aren't charged, so it's safe to retry. There's no caching layer — every call returns the live SERP, which is what you want for rank tracking and what you'll need to add yourself if you're polling the same query repeatedly.
|
|
56
|
+
Both hit the same endpoint and return the same JSON. The wrapper exists so options like `country`, `device`, and `searchType` are plain JavaScript fields rather than query-string parameters you have to remember the casing of.
|
|
57
57
|
|
|
58
58
|
|
|
59
59
|
## API reference
|
|
@@ -67,278 +67,119 @@ You pay credits per successful response. Failed requests (HTTP 500) aren't charg
|
|
|
67
67
|
|
|
68
68
|
### `.search(options)`
|
|
69
69
|
|
|
70
|
-
|
|
71
|
-
| ------------ | --------- | -------------------------------------------------------------------------------------------- |
|
|
72
|
-
| `query` | string | The search query (required) |
|
|
73
|
-
| `country` | string | ISO-2 country code: `us`, `gb`, `de`, `fr`, `jp`, etc. Determines proxy geo + Google domain. |
|
|
74
|
-
| `language` | string | UI / results language: `en`, `de`, `es`, `fr`, etc. |
|
|
75
|
-
| `device` | string | `desktop` or `mobile` |
|
|
76
|
-
| `page` | number | Page number, 1-based |
|
|
77
|
-
| `nbResults` | number | Results per page (10–100) |
|
|
78
|
-
| `searchType` | string | `classic` (default), `news`, `images`, `videos`, `shopping`, `maps` |
|
|
79
|
-
| `addHtml` | boolean | Include raw HTML in the response |
|
|
80
|
-
| `extra` | object | Any additional ScrapingBee parameter passed through verbatim |
|
|
81
|
-
|
|
82
|
-
Returns a parsed JSON object — see the response shape below.
|
|
70
|
+
Every option below maps directly to a documented ScrapingBee Google Search API parameter. See the [official reference](https://www.scrapingbee.com/documentation/google-api/) for the canonical spec.
|
|
83
71
|
|
|
72
|
+
| Option | API parameter | Type | Default | Description |
|
|
73
|
+
| -------------- | --------------- | ------- | --------- | -------------------------------------------------------------------------------------------- |
|
|
74
|
+
| `query` | `search` | string | required | The search query |
|
|
75
|
+
| `country` | `country_code` | string | `us` | ISO 3166-1 country code |
|
|
76
|
+
| `language` | `language` | string | `en` | Results display language |
|
|
77
|
+
| `device` | `device` | string | `desktop` | `desktop` or `mobile` |
|
|
78
|
+
| `page` | `page` | integer | `1` | Results page number |
|
|
79
|
+
| `searchType` | `search_type` | string | `classic` | `classic`, `news`, `maps`, `images`, `lens`, `shopping`, `ai_mode` |
|
|
80
|
+
| `lightRequest` | `light_request` | boolean | `true` | `true` = 10 credits. `false` = 15 credits and unlocks AI Overviews + JS-rendered SERP fields. |
|
|
81
|
+
| `nfpr` | `nfpr` | boolean | `false` | Disable Google's autocorrection |
|
|
82
|
+
| `addHtml` | `add_html` | boolean | `false` | Include the raw `full_html` field in the response |
|
|
83
|
+
| `extraParams` | `extra_params` | string | `""` | Pass-through appended to the Google URL (e.g. `&uule=…`) |
|
|
84
84
|
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
A successful call returns an object that looks like this (trimmed for clarity):
|
|
88
|
-
|
|
89
|
-
```json
|
|
90
|
-
{
|
|
91
|
-
"search_metadata": {
|
|
92
|
-
"query": "best running shoes",
|
|
93
|
-
"url": "https://www.google.com/search?q=best+running+shoes",
|
|
94
|
-
"number_of_results": 412000000
|
|
95
|
-
},
|
|
96
|
-
"organic_results": [
|
|
97
|
-
{
|
|
98
|
-
"position": 1,
|
|
99
|
-
"title": "The 12 Best Running Shoes of 2026 - Runner's World",
|
|
100
|
-
"url": "https://www.runnersworld.com/...",
|
|
101
|
-
"description": "We tested hundreds of pairs..."
|
|
102
|
-
}
|
|
103
|
-
],
|
|
104
|
-
"featured_snippet": {
|
|
105
|
-
"title": "...",
|
|
106
|
-
"description": "...",
|
|
107
|
-
"url": "..."
|
|
108
|
-
},
|
|
109
|
-
"people_also_ask": [
|
|
110
|
-
{ "question": "...", "answer": "...", "url": "..." }
|
|
111
|
-
],
|
|
112
|
-
"related_searches": ["best running shoes for flat feet", "..."],
|
|
113
|
-
"top_stories": [],
|
|
114
|
-
"ads": [],
|
|
115
|
-
"knowledge_graph": {}
|
|
116
|
-
}
|
|
117
|
-
```
|
|
118
|
-
|
|
119
|
-
Fields are present only when Google actually rendered them for that query. Always null-check.
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
## Use cases (with working code)
|
|
123
|
-
|
|
124
|
-
These are the four use cases I keep coming back to. Each one is a complete, runnable snippet — drop your API key in and run with `node example.js`.
|
|
125
|
-
|
|
126
|
-
### 1. Rank tracking for an SEO dashboard
|
|
127
|
-
|
|
128
|
-
You have a list of keywords, you want to know where a domain ranks for each, refreshed daily.
|
|
129
|
-
|
|
130
|
-
```js
|
|
131
|
-
const { GoogleSearchScraper } = require('google-search-scraper-api');
|
|
132
|
-
|
|
133
|
-
const scraper = new GoogleSearchScraper({ apiKey: process.env.SB_KEY });
|
|
134
|
-
const targetDomain = 'runnersworld.com';
|
|
135
|
-
const keywords = ['best running shoes', 'best trail running shoes', 'best marathon shoes'];
|
|
136
|
-
|
|
137
|
-
async function rankFor(query) {
|
|
138
|
-
const { organic_results } = await scraper.search({ query, country: 'us', nbResults: 100 });
|
|
139
|
-
const hit = organic_results.find(r => new URL(r.url).hostname.endsWith(targetDomain));
|
|
140
|
-
return hit ? hit.position : null;
|
|
141
|
-
}
|
|
142
|
-
|
|
143
|
-
for (const kw of keywords) {
|
|
144
|
-
console.log(kw, '→', await rankFor(kw));
|
|
145
|
-
}
|
|
146
|
-
```
|
|
147
|
-
|
|
148
|
-
Two things worth noting from production:
|
|
149
|
-
|
|
150
|
-
- `nbResults: 100` matters. If you only fetch the first page, anything ranking past position 10 looks like "not ranking" — same bug ate a week of my dashboard's accuracy until I noticed.
|
|
151
|
-
- Polling daily? Add a 1–2 second jitter between calls. ScrapingBee handles concurrency on their side, but jittering also makes your own logs easier to debug.
|
|
152
|
-
|
|
153
|
-
### 2. People Also Ask mining for content briefs
|
|
154
|
-
|
|
155
|
-
When I build a content brief I want every PAA question Google fires for the head term plus a couple of variants — they're the cleanest signal for what the SERP audience is actually asking.
|
|
156
|
-
|
|
157
|
-
```js
|
|
158
|
-
const seeds = ['how to scrape google', 'google search scraping', 'scrape google search results'];
|
|
159
|
-
const questions = new Set();
|
|
160
|
-
|
|
161
|
-
for (const query of seeds) {
|
|
162
|
-
const { people_also_ask = [] } = await scraper.search({ query, country: 'us' });
|
|
163
|
-
for (const paa of people_also_ask) questions.add(paa.question);
|
|
164
|
-
}
|
|
165
|
-
|
|
166
|
-
console.log([...questions]);
|
|
167
|
-
```
|
|
85
|
+
**Documented constraints:**
|
|
168
86
|
|
|
169
|
-
|
|
87
|
+
- `searchType: 'news'` is not available with `device: 'mobile'`
|
|
88
|
+
- `searchType: 'lens'` only accepts image URLs as the `query` value
|
|
89
|
+
- `searchType: 'ai_mode'` caps the query at 400 characters
|
|
90
|
+
- AI Overviews require `lightRequest: false`
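These constraints can be checked client-side before a request spends credits. A hypothetical pre-flight helper, not part of the package API, just a sketch of the rules listed above:

```javascript
// Hypothetical pre-flight check for the documented constraints above.
function checkConstraints({ query = '', searchType, device } = {}) {
  const errors = [];
  if (searchType === 'news' && device === 'mobile') {
    errors.push("searchType 'news' is not available with device 'mobile'");
  }
  if (searchType === 'ai_mode' && query.length > 400) {
    errors.push('ai_mode queries are capped at 400 characters');
  }
  if (searchType === 'lens' && !/^https?:\/\//.test(query)) {
    errors.push('lens requires an image URL as the query');
  }
  return errors;
}

console.log(checkConstraints({ query: 'pizza', searchType: 'news', device: 'mobile' }));
// → [ "searchType 'news' is not available with device 'mobile'" ]
```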
|
|
170
91
|
|
|
171
|
-
### 3. Local SERP monitoring across regions
|
|
172
92
|
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
```js
|
|
176
|
-
const markets = [
|
|
177
|
-
{ country: 'us', language: 'en' },
|
|
178
|
-
{ country: 'gb', language: 'en' },
|
|
179
|
-
{ country: 'de', language: 'de' },
|
|
180
|
-
{ country: 'fr', language: 'fr' },
|
|
181
|
-
{ country: 'jp', language: 'ja' },
|
|
182
|
-
];
|
|
183
|
-
|
|
184
|
-
const query = 'noise cancelling headphones';
|
|
185
|
-
|
|
186
|
-
const snapshots = await Promise.all(
|
|
187
|
-
markets.map(m => scraper.search({ query, ...m, device: 'mobile' }).then(r => ({ ...m, top3: r.organic_results.slice(0, 3) })))
|
|
188
|
-
);
|
|
189
|
-
|
|
190
|
-
for (const s of snapshots) console.log(s.country, s.top3.map(r => r.url));
|
|
191
|
-
```
|
|
192
|
-
|
|
193
|
-
Two production notes:
|
|
194
|
-
|
|
195
|
-
- `device: 'mobile'` is the right default for most international markets — global mobile share crossed desktop years ago and Google's mobile SERP differs in feature set.
|
|
196
|
-
- Don't fan out to 200 parallel requests on day one. Start with `Promise.all` for ~10–20, then move to a concurrency-limited queue (see use case 4) once you exceed your plan's concurrency cap.
|
|
197
|
-
|
|
198
|
-
### 4. Bulk keyword scraping with controlled concurrency
|
|
199
|
-
|
|
200
|
-
You're feeding 5,000 keywords into a database. You don't want to fire them all at once, and you want retries on transient failures.
|
|
201
|
-
|
|
202
|
-
```js
|
|
203
|
-
const pLimit = require('p-limit');
|
|
204
|
-
const fs = require('fs/promises');
|
|
205
|
-
|
|
206
|
-
const limit = pLimit(10); // tune to match your ScrapingBee plan's concurrency cap
|
|
207
|
-
|
|
208
|
-
async function withRetry(fn, tries = 3) {
|
|
209
|
-
for (let i = 0; i < tries; i++) {
|
|
210
|
-
try { return await fn(); }
|
|
211
|
-
catch (err) { if (i === tries - 1) throw err; await new Promise(r => setTimeout(r, 1000 * (i + 1))); }
|
|
212
|
-
}
|
|
213
|
-
}
|
|
214
|
-
|
|
215
|
-
async function scrapeAll(keywords) {
|
|
216
|
-
const tasks = keywords.map(kw => limit(() => withRetry(() => scraper.search({ query: kw, country: 'us' }))));
|
|
217
|
-
return Promise.all(tasks);
|
|
218
|
-
}
|
|
219
|
-
|
|
220
|
-
const keywords = (await fs.readFile('./keywords.txt', 'utf8')).split('\n').filter(Boolean);
|
|
221
|
-
const results = await scrapeAll(keywords);
|
|
222
|
-
|
|
223
|
-
await fs.writeFile('./results.json', JSON.stringify(results, null, 2));
|
|
224
|
-
```
|
|
225
|
-
|
|
226
|
-
ScrapingBee's plan tiers cap concurrency at 10 / 50 / 100 / 200 depending on the plan — exceed it and the API returns a clear error. The `p-limit` value should match (or sit just below) your tier cap.
|
|
227
|
-
|
|
228
|
-
### 5. News monitoring
|
|
229
|
-
|
|
230
|
-
```js
|
|
231
|
-
const { news_results = [] } = await scraper.search({
|
|
232
|
-
query: '"product launch" "your brand"',
|
|
233
|
-
searchType: 'news',
|
|
234
|
-
country: 'us',
|
|
235
|
-
});
|
|
236
|
-
|
|
237
|
-
for (const item of news_results) {
|
|
238
|
-
console.log(item.date, item.source, item.title, item.url);
|
|
239
|
-
}
|
|
240
|
-
```
|
|
241
|
-
|
|
242
|
-
The `searchType: 'news'` flag returns Google News results with timestamps and publication metadata, which is what you actually want for a brand-monitoring pipeline. The classic SERP's Top Stories box is shallower.
|
|
243
|
-
|
|
244
|
-
### 6. SERPs as RAG retrieval
|
|
245
|
-
|
|
246
|
-
If you're building an LLM workflow that needs fresh web context — answering questions about events the model wasn't trained on, building a research agent, grounding a customer-support bot — a structured SERP is a cheap retrieval layer.
|
|
247
|
-
|
|
248
|
-
```js
|
|
249
|
-
async function searchContext(query) {
|
|
250
|
-
const { organic_results, featured_snippet, people_also_ask } = await scraper.search({ query, country: 'us', nbResults: 20 });
|
|
251
|
-
return {
|
|
252
|
-
answer: featured_snippet?.description,
|
|
253
|
-
sources: organic_results.slice(0, 5).map(r => ({ title: r.title, url: r.url, snippet: r.description })),
|
|
254
|
-
related_questions: people_also_ask?.map(p => p.question) ?? [],
|
|
255
|
-
};
|
|
256
|
-
}
|
|
257
|
-
|
|
258
|
-
// Pass `searchContext` output into your prompt as a tool-call result
|
|
259
|
-
```
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
## Patterns you'll need eventually
|
|
263
|
-
|
|
264
|
-
### Retry on transient failures
|
|
265
|
-
|
|
266
|
-
ScrapingBee doesn't charge credits on HTTP 500 — retrying is genuinely safe. Three tries with linear backoff (1s, 2s, 3s) covers almost every transient blip in practice.
|
|
267
|
-
|
|
268
|
-
### Caching
|
|
269
|
-
|
|
270
|
-
There's no server-side cache. If you're polling the same query inside a tight window, cache responses client-side with a key like `${query}|${country}|${device}|${page}`. Redis with a 6–24 hour TTL works well for rank tracking; in-memory `Map` is fine for short scripts.
|
|
93
|
+
## Response shape
|
|
271
94
|
|
|
272
|
-
|
|
95
|
+
Successful responses return an object with the following documented fields. Any given query will only populate the fields Google actually rendered — null-check before reading.
|
|
273
96
|
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
|
|
283
|
-
|
|
97
|
+
| Field | Description |
|
|
98
|
+
| ----------------- | -------------------------------------------- |
|
|
99
|
+
| `meta_data` | Query metadata and pagination info |
|
|
100
|
+
| `organic_results` | Standard organic search results |
|
|
101
|
+
| `top_ads` | Ads shown at the top of the SERP |
|
|
102
|
+
| `bottom_ads` | Ads shown at the bottom of the SERP |
|
|
103
|
+
| `ai_overviews` | AI-generated answer summary (when present) |
|
|
104
|
+
| `knowledge_graph` | Knowledge panel data |
|
|
105
|
+
| `questions` | People Also Ask questions and answers |
|
|
106
|
+
| `related_queries` | Related search suggestions |
|
|
107
|
+
| `related_searches` | "Searches related to" block |
|
|
108
|
+
| `top_stories` | Top news stories carousel |
|
|
109
|
+
| `news_results` | News results (when `searchType: 'news'`) |
|
|
110
|
+
| `images` | Image results (when `searchType: 'images'`) |
|
|
111
|
+
| `local_results` | Local business listings |
|
|
112
|
+
| `map_results` | Map location results (when `searchType: 'maps'`) |
|
|
113
|
+
| `hotel_results` | Hotel listings |
|
|
284
114
|
|
|
285
|
-
### TypeScript
|
|
286
115
|
|
|
287
|
-
|
|
116
|
+
## Use cases
|
|
288
117
|
|
|
118
|
+
Common scenarios for a structured Google search scraper API in a Node.js stack:
|
|
289
119
|
|
|
290
|
-
|
|
120
|
+
- **Rank tracking** — feed a list of keywords through the scraper on a schedule, compare positions over time, surface ranking changes in a dashboard.
|
|
121
|
+
- **People Also Ask mining** — collect the `questions` array across a seed list to build SERP-validated content briefs.
|
|
122
|
+
- **Brand monitoring** — track when a brand appears in `organic_results`, `news_results`, or `ai_overviews` across geographies.
|
|
123
|
+
- **Local SEO snapshots** — combine `country` + `language` + `extraParams` (UULE) for city-precision local SERPs across multiple markets.
|
|
124
|
+
- **News aggregation** — `searchType: 'news'` returns Google News results with publication metadata for media monitoring.
|
|
125
|
+
- **AI / RAG retrieval** — `lightRequest: false` unlocks `ai_overviews` and gives an LLM a fresh, structured grounding context for any query.
|
|
126
|
+
- **Ad intelligence** — `top_ads` and `bottom_ads` expose competitor ad copy and landing pages.
|
|
127
|
+
- **Shopping research** — `searchType: 'shopping'` populates product-style results.
|
|
291
128
|
|
|
292
|
-
|
|
129
|
+
Each scenario uses the same `.search()` call with different option combinations; every option maps to a documented ScrapingBee parameter listed above.
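As a concrete sketch of the rank-tracking scenario: once a response is in hand, extracting a domain's position is plain array work over `organic_results` (the shape from the response table above). The live fetch would be `await scraper.search({ query, country: 'us' })` with a real API key; the example below runs against a mocked array:

```javascript
// Pull a target domain's rank out of a SERP response.
function rankFor(organicResults, targetDomain) {
  const hit = organicResults.find(
    r => new URL(r.url).hostname.endsWith(targetDomain)
  );
  return hit ? hit.position : null; // null = not in the fetched results
}

// Mocked response for illustration:
const organic_results = [
  { position: 1, title: 'A', url: 'https://example.com/a' },
  { position: 4, title: 'B', url: 'https://www.runnersworld.com/shoes' },
];

console.log(rankFor(organic_results, 'runnersworld.com')); // → 4
console.log(rankFor(organic_results, 'nytimes.com'));      // → null
```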
|
|
293
130
|
|
|
294
|
-
- **You need login-walled content.** ScrapingBee's terms prohibit scraping behind logins. LinkedIn private profiles, gated content, intranet pages — wrong tool.
|
|
295
|
-
- **You're scraping 50 queries a month.** A free residential proxy and `cheerio` will get you there; you don't need a managed API.
|
|
296
|
-
- **You need millisecond latency.** SERP scraping involves a headless browser render. Expect 3–8 seconds per request. Fine for batch jobs, wrong for interactive search-as-you-type.
|
|
297
|
-
- **You want to avoid all SaaS dependencies.** This is a wrapper around an external API; the API going down means your scraper goes down.
|
|
298
131
|
|
|
132
|
+
## Pricing
|
|
299
133
|
|
|
300
|
-
|
|
134
|
+
ScrapingBee bills per successful API call:
|
|
301
135
|
|
|
302
|
-
|
|
136
|
+
- **Light request** (default, `lightRequest: true`) — 10 credits per call
|
|
137
|
+
- **Regular request** (`lightRequest: false`, required for AI Overviews) — 15 credits per call
|
|
138
|
+
- Failed requests retry for up to 30 seconds before returning an error; failed calls are not charged
|
|
303
139
|
|
|
304
|
-
|
|
140
|
+
A $49/month plan provides 250,000 credits — roughly 25,000 light requests or 16,000 regular requests per month. Current rate card and plan tiers: [scrapingbee.com/pricing](https://www.scrapingbee.com/pricing/).
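The arithmetic behind those figures is just credits divided by cost per call. A back-of-envelope helper using the numbers above (10 credits light, 15 regular):

```javascript
// Credit budgeting: 10 credits per light request, 15 per regular request.
function requestsPerMonth(credits, { light = true } = {}) {
  return Math.floor(credits / (light ? 10 : 15));
}

console.log(requestsPerMonth(250000));                   // → 25000
console.log(requestsPerMonth(250000, { light: false })); // → 16666
```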
|
|
305
141
|
|
|
306
142
|
|
|
307
143
|
## FAQ
|
|
308
144
|
|
|
309
145
|
### Is it legal to scrape Google search results?
|
|
310
146
|
|
|
311
|
-
Public SERP data is generally legal to collect in most jurisdictions, particularly for SEO research, brand monitoring, and competitive analysis. Personal data, copyrighted content, and login-walled material
|
|
147
|
+
Public SERP data is generally legal to collect in most jurisdictions, particularly for SEO research, brand monitoring, and competitive analysis. Personal data, copyrighted content, and login-walled material fall under different rules — check your local regulations and Google's terms before scaling. ScrapingBee's terms specifically prohibit scraping behind logins.
|
|
148
|
+
|
|
149
|
+
### Why use this package instead of `axios` directly?
|
|
150
|
+
|
|
151
|
+
You can call the ScrapingBee endpoint with any HTTP client. This wrapper just gives you camelCase options, a typed `apiKey` constructor, and parameter validation so your editor autocompletes the option names. Functionally identical to a raw `axios.get` against the same endpoint.
|
|
312
152
|
|
|
313
|
-
###
|
|
153
|
+
### What's the difference between `lightRequest: true` and `lightRequest: false`?
|
|
314
154
|
|
|
315
|
-
|
|
155
|
+
Light requests (the default) skip JavaScript rendering — they're fast, cheap (10 credits), and return organic results, questions, related searches, news, images, and maps. Regular requests (15 credits) render the page in a headless browser and unlock `ai_overviews` along with any other JS-injected SERP features Google rolls out.
|
|
316
156
|
|
|
317
|
-
###
|
|
157
|
+
### How do I scrape Google search results for a specific city?
|
|
318
158
|
|
|
319
|
-
|
|
159
|
+
Pass `country` for the country-level setting and `extraParams: '&uule=…'` for the city. UULE strings are encoded location identifiers; a number of small open-source libraries generate them.
|
|
320
160
|
|
|
321
|
-
###
|
|
161
|
+
### Does this work in serverless environments (Lambda, Vercel, Cloudflare Workers)?
|
|
322
162
|
|
|
323
|
-
|
|
163
|
+
Yes. The only runtime dependency is `axios`. Works in any Node.js 14+ environment.
|
|
324
164
|
|
|
325
|
-
###
|
|
165
|
+
### Can I track ranks past position 10?
|
|
326
166
|
|
|
327
|
-
|
|
167
|
+
Yes — increment the `page` option. Each page returns the next set of organic results.
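A sketch of that paging loop, with the search call injected so the logic reads on its own (`maxPages` is an illustrative cap, not an API limit; in real use pass `opts => scraper.search(opts)` as `searchFn`):

```javascript
// Walk SERP pages until the target domain shows up.
async function findRankAcrossPages(searchFn, query, domain, maxPages = 5) {
  for (let page = 1; page <= maxPages; page++) {
    const { organic_results = [] } = await searchFn({ query, page });
    const hit = organic_results.find(r => r.url.includes(domain));
    if (hit) return { page, position: hit.position };
  }
  return null; // not found within maxPages
}
```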
|
|
328
168
|
|
|
329
169
|
### How do I handle rate limits?
|
|
330
170
|
|
|
331
|
-
ScrapingBee
|
|
171
|
+
ScrapingBee caps concurrency per plan tier (10 / 50 / 100 / 200 depending on the plan). If you exceed it, the API returns a clear error — lower your concurrency rather than slowing per-request latency. Any standard Node.js concurrency limiter works.
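A dependency-free sketch of such a limiter (libraries like `p-limit` do the same job; set `max` at or just below your plan's concurrency tier):

```javascript
// Minimal promise concurrency limiter.
function createLimiter(max) {
  let active = 0;
  const queue = [];
  const runNext = () => {
    if (active >= max || queue.length === 0) return;
    active += 1;
    const { fn, resolve, reject } = queue.shift();
    fn().then(resolve, reject).finally(() => {
      active -= 1;
      runNext(); // a finished task frees a slot for the next queued one
    });
  };
  return fn =>
    new Promise((resolve, reject) => {
      queue.push({ fn, resolve, reject });
      runNext();
    });
}

// Usage sketch: const limit = createLimiter(10);
// keywords.map(kw => limit(() => scraper.search({ query: kw })));
```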
|
|
332
172
|
|
|
333
|
-
###
|
|
173
|
+
### What about Google Scholar, Google Trends, Google Flights?
|
|
334
174
|
|
|
335
|
-
This package targets the
|
|
175
|
+
This package targets the documented Google Search verticals: `classic`, `news`, `maps`, `images`, `lens`, `shopping`, `ai_mode`. Scholar, Trends, and Flights aren't part of the ScrapingBee Google Search API.
|
|
336
176
|
|
|
337
177
|
|
|
338
178
|
## Documentation
|
|
339
179
|
|
|
340
180
|
- [ScrapingBee Google Search API documentation](https://www.scrapingbee.com/documentation/google-api/)
|
|
341
181
|
- [ScrapingBee pricing](https://www.scrapingbee.com/pricing/)
|
|
182
|
+
- [ScrapingBee main site](https://www.scrapingbee.com/)
|
|
342
183
|
|
|
343
184
|
|
|
344
185
|
## License
|
|
@@ -348,4 +189,4 @@ MIT. See [LICENSE](LICENSE).
|
|
|
348
189
|
|
|
349
190
|
## Disclaimer
|
|
350
191
|
|
|
351
|
-
This is an unofficial Node.js wrapper around ScrapingBee
|
|
192
|
+
This is an unofficial Node.js wrapper around the ScrapingBee Google Search API. Not affiliated with ScrapingBee or Google. Compliance with Google's terms of service and applicable data-protection law is the responsibility of the operator.
|
package/index.js
CHANGED
|
@@ -17,9 +17,11 @@ class GoogleSearchScraper {
|
|
|
17
17
|
language,
|
|
18
18
|
device,
|
|
19
19
|
page,
|
|
20
|
-
nbResults,
|
|
21
20
|
searchType,
|
|
21
|
+
lightRequest,
|
|
22
|
+
nfpr,
|
|
22
23
|
addHtml,
|
|
24
|
+
extraParams,
|
|
23
25
|
extra = {},
|
|
24
26
|
} = {}) {
|
|
25
27
|
if (!query) {
|
|
@@ -36,9 +38,11 @@ class GoogleSearchScraper {
|
|
|
36
38
|
if (language) params.language = language;
|
|
37
39
|
if (device) params.device = device;
|
|
38
40
|
if (page !== undefined) params.page = page;
|
|
39
|
-
if (nbResults !== undefined) params.nb_results = nbResults;
|
|
40
41
|
if (searchType) params.search_type = searchType;
|
|
42
|
+
if (lightRequest !== undefined) params.light_request = lightRequest ? 'true' : 'false';
|
|
43
|
+
if (nfpr !== undefined) params.nfpr = nfpr ? 'true' : 'false';
|
|
41
44
|
if (addHtml !== undefined) params.add_html = addHtml ? 'true' : 'false';
|
|
45
|
+
if (extraParams) params.extra_params = extraParams;
|
|
42
46
|
|
|
43
47
|
const response = await axios.get(ENDPOINT, {
|
|
44
48
|
params,
|