thordata-sdk 0.5.0__py3-none-any.whl → 0.7.0__py3-none-any.whl
- thordata/__init__.py +139 -135
- thordata/_utils.py +144 -126
- thordata/async_client.py +815 -768
- thordata/client.py +1040 -995
- thordata/demo.py +140 -0
- thordata/enums.py +384 -315
- thordata/exceptions.py +344 -344
- thordata/models.py +840 -725
- thordata/parameters.py +53 -53
- thordata/retry.py +380 -380
- {thordata_sdk-0.5.0.dist-info → thordata_sdk-0.7.0.dist-info}/METADATA +1053 -896
- thordata_sdk-0.7.0.dist-info/RECORD +15 -0
- {thordata_sdk-0.5.0.dist-info → thordata_sdk-0.7.0.dist-info}/licenses/LICENSE +21 -21
- thordata_sdk-0.5.0.dist-info/RECORD +0 -14
- {thordata_sdk-0.5.0.dist-info → thordata_sdk-0.7.0.dist-info}/WHEEL +0 -0
- {thordata_sdk-0.5.0.dist-info → thordata_sdk-0.7.0.dist-info}/top_level.txt +0 -0
|
@@ -1,896 +1,1053 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: thordata-sdk
|
|
3
|
-
Version: 0.5.0
|
|
4
|
-
Summary: The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.
|
|
5
|
-
Author-email: Thordata Developer Team <support@thordata.com>
|
|
6
|
-
License: MIT
|
|
7
|
-
Project-URL: Homepage, https://www.thordata.com
|
|
8
|
-
Project-URL: Documentation, https://github.com/Thordata/thordata-python-sdk#readme
|
|
9
|
-
Project-URL: Source, https://github.com/Thordata/thordata-python-sdk
|
|
10
|
-
Project-URL: Tracker, https://github.com/Thordata/thordata-python-sdk/issues
|
|
11
|
-
Project-URL: Changelog, https://github.com/Thordata/thordata-python-sdk/blob/main/CHANGELOG.md
|
|
12
|
-
Keywords: web scraping,proxy,residential proxy,datacenter proxy,ai,llm,data-mining,serp,thordata,web scraper,anti-bot bypass
|
|
13
|
-
Classifier: Development Status :: 4 - Beta
|
|
14
|
-
Classifier: Intended Audience :: Developers
|
|
15
|
-
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
16
|
-
Classifier: Topic :: Internet :: WWW/HTTP
|
|
17
|
-
Classifier: Topic :: Internet :: Proxy Servers
|
|
18
|
-
Classifier: Programming Language :: Python :: 3
|
|
19
|
-
Classifier: Programming Language :: Python :: 3.9
|
|
20
|
-
Classifier: Programming Language :: Python :: 3.10
|
|
21
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
22
|
-
Classifier: Programming Language :: Python :: 3.12
|
|
23
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
24
|
-
Classifier: Operating System :: OS Independent
|
|
25
|
-
Classifier: Typing :: Typed
|
|
26
|
-
Requires-Python: >=3.9
|
|
27
|
-
Description-Content-Type: text/markdown
|
|
28
|
-
License-File: LICENSE
|
|
29
|
-
Requires-Dist: requests>=2.25.0
|
|
30
|
-
Requires-Dist: aiohttp>=3.9.0
|
|
31
|
-
Provides-Extra: dev
|
|
32
|
-
Requires-Dist: pytest>=7.0.0; extra == "dev"
|
|
33
|
-
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
|
|
34
|
-
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
|
|
35
|
-
Requires-Dist: pytest-httpserver>=1.0.0; extra == "dev"
|
|
36
|
-
Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
|
|
37
|
-
Requires-Dist: black>=23.0.0; extra == "dev"
|
|
38
|
-
Requires-Dist: ruff>=0.1.0; extra == "dev"
|
|
39
|
-
Requires-Dist: mypy>=1.0.0; extra == "dev"
|
|
40
|
-
Requires-Dist: types-requests>=2.28.0; extra == "dev"
|
|
41
|
-
Dynamic: license-file
|
|
42
|
-
|
|
43
|
-
# Thordata Python SDK
|
|
44
|
-
|
|
45
|
-
<div align="center">
|
|
46
|
-
|
|
47
|
-
**Official Python client for Thordata's Proxy Network, SERP API, Web Unlocker, and Web Scraper API.**
|
|
48
|
-
|
|
49
|
-
*Async-ready, type-safe, built for AI agents and large-scale data collection.*
|
|
50
|
-
|
|
51
|
-
[](https://github.com/Thordata/thordata-python-sdk/actions/workflows/ci.yml)
|
|
52
|
-
[](https://pypi.org/project/thordata-sdk/)
|
|
53
|
-
[](LICENSE)
|
|
55
|
-
[](https://github.com/Thordata/thordata-python-sdk)
|
|
56
|
-
|
|
57
|
-
[Documentation](https://doc.thordata.com) • [Dashboard](https://www.thordata.com) • [Examples](examples/) • [Changelog](CHANGELOG.md)
|
|
58
|
-
|
|
59
|
-
</div>
|
|
60
|
-
|
|
61
|
-
---
|
|
62
|
-
|
|
63
|
-
## ✨ Features
|
|
64
|
-
|
|
65
|
-
| Feature | Description |
|
|
66
|
-
|---------|-------------|
|
|
67
|
-
| 🌐 **Proxy Network** | Residential, Mobile, Datacenter, ISP proxies with geo-targeting |
|
|
68
|
-
| 🔍 **SERP API** | Google, Bing, Yandex, DuckDuckGo, Baidu search results |
|
|
69
|
-
| 🔓 **Web Unlocker** | Bypass Cloudflare, CAPTCHAs, anti-bot systems automatically |
|
|
70
|
-
| 🕷️ **Web Scraper** | Async task-based scraping for complex sites |
|
|
71
|
-
| ⚡ **Async Support** | Full async/await support with aiohttp |
|
|
72
|
-
| 🔄 **Auto Retry** | Configurable retry with exponential backoff |
|
|
73
|
-
| 📝 **Type Safe** | Full type annotations for IDE autocomplete |
|
|
74
|
-
|
|
75
|
-
---
|
|
76
|
-
|
|
77
|
-
## 📦 Installation
|
|
78
|
-
|
|
79
|
-
```bash
|
|
80
|
-
pip install thordata-sdk
|
|
81
|
-
```
|
|
82
|
-
|
|
83
|
-
For development:
|
|
84
|
-
|
|
85
|
-
```bash
|
|
86
|
-
pip install thordata-sdk[dev]
|
|
87
|
-
```
|
|
88
|
-
|
|
89
|
-
---
|
|
90
|
-
|
|
91
|
-
## 🚀 Quick Start
|
|
92
|
-
|
|
93
|
-
### Get Your Credentials
|
|
94
|
-
|
|
95
|
-
1. Sign up at [thordata.com](https://www.thordata.com)
|
|
96
|
-
2. Navigate to your Dashboard
|
|
97
|
-
3. Copy your Scraper Token, Public Token, and Public Key
|
|
98
|
-
|
|
99
|
-
### Basic Usage
|
|
100
|
-
|
|
101
|
-
```python
|
|
102
|
-
from thordata import ThordataClient
|
|
103
|
-
|
|
104
|
-
# Initialize the client
|
|
105
|
-
client = ThordataClient(
|
|
106
|
-
scraper_token="your_scraper_token",
|
|
107
|
-
public_token="your_public_token", # Optional, for task APIs
|
|
108
|
-
public_key="your_public_key" # Optional, for task APIs
|
|
109
|
-
)
|
|
110
|
-
|
|
111
|
-
# Make a request through the proxy network
|
|
112
|
-
response = client.get("https://httpbin.org/ip")
|
|
113
|
-
print(response.json())
|
|
114
|
-
# {'origin': '123.45.67.89'} # Residential IP
|
|
115
|
-
```
|
|
116
|
-
|
|
117
|
-
### Environment Variables
|
|
118
|
-
|
|
119
|
-
Create a `.env` file:
|
|
120
|
-
|
|
121
|
-
```env
|
|
122
|
-
THORDATA_SCRAPER_TOKEN=your_scraper_token
|
|
123
|
-
THORDATA_PUBLIC_TOKEN=your_public_token
|
|
124
|
-
THORDATA_PUBLIC_KEY=your_public_key
|
|
125
|
-
```
|
|
126
|
-
|
|
127
|
-
Then use with python-dotenv:
|
|
128
|
-
|
|
129
|
-
```python
|
|
130
|
-
import os
|
|
131
|
-
from dotenv import load_dotenv
|
|
132
|
-
from thordata import ThordataClient
|
|
133
|
-
|
|
134
|
-
load_dotenv()
|
|
135
|
-
|
|
136
|
-
client = ThordataClient(
|
|
137
|
-
scraper_token=os.getenv("THORDATA_SCRAPER_TOKEN"),
|
|
138
|
-
public_token=os.getenv("THORDATA_PUBLIC_TOKEN"),
|
|
139
|
-
public_key=os.getenv("THORDATA_PUBLIC_KEY"),
|
|
140
|
-
)
|
|
141
|
-
```
|
|
142
|
-
|
|
143
|
-
---
|
|
144
|
-
|
|
145
|
-
## 📖 Usage Guide
|
|
146
|
-
|
|
147
|
-
### 1. Proxy Network
|
|
148
|
-
|
|
149
|
-
#### Basic Proxy Request
|
|
150
|
-
|
|
151
|
-
```python
|
|
152
|
-
from thordata import ThordataClient
|
|
153
|
-
|
|
154
|
-
client = ThordataClient(scraper_token="your_token")
|
|
155
|
-
|
|
156
|
-
# GET request through proxy
|
|
157
|
-
response = client.get("https://example.com")
|
|
158
|
-
print(response.text)
|
|
159
|
-
|
|
160
|
-
# POST request through proxy
|
|
161
|
-
response = client.post("https://httpbin.org/post", json={"key": "value"})
|
|
162
|
-
print(response.json())
|
|
163
|
-
```
|
|
164
|
-
|
|
165
|
-
#### Geo-Targeting
|
|
166
|
-
|
|
167
|
-
```python
|
|
168
|
-
from thordata import ThordataClient, ProxyConfig
|
|
169
|
-
|
|
170
|
-
client = ThordataClient(scraper_token="your_token")
|
|
171
|
-
|
|
172
|
-
# Create a proxy config with geo-targeting
|
|
173
|
-
config = ProxyConfig(
|
|
174
|
-
username="your_username",
|
|
175
|
-
password="your_password",
|
|
176
|
-
country="us", # Target country
|
|
177
|
-
state="california", # Target state
|
|
178
|
-
city="los_angeles", # Target city
|
|
179
|
-
)
|
|
180
|
-
|
|
181
|
-
response = client.get("https://httpbin.org/ip", proxy_config=config)
|
|
182
|
-
print(response.json())
|
|
183
|
-
```
|
|
184
|
-
|
|
185
|
-
#### Sticky Sessions
|
|
186
|
-
|
|
187
|
-
Keep the same IP for multiple requests:
|
|
188
|
-
|
|
189
|
-
```python
|
|
190
|
-
from thordata import ThordataClient, StickySession
|
|
191
|
-
|
|
192
|
-
client = ThordataClient(scraper_token="your_token")
|
|
193
|
-
|
|
194
|
-
# Create a sticky session (same IP for 10 minutes)
|
|
195
|
-
session = StickySession(
|
|
196
|
-
username="your_username",
|
|
197
|
-
password="your_password",
|
|
198
|
-
country="gb",
|
|
199
|
-
duration_minutes=10,
|
|
200
|
-
)
|
|
201
|
-
|
|
202
|
-
# All requests use the same IP
|
|
203
|
-
for i in range(5):
|
|
204
|
-
response = client.get("https://httpbin.org/ip", proxy_config=session)
|
|
205
|
-
print(f"Request {i+1}: {response.json()['origin']}")
|
|
206
|
-
```
|
|
207
|
-
[Removed lines 208-896 of the 0.5.0 METADATA: only empty diff gutters and isolated fragments survive in this extract; the corresponding 0.7.0 text follows.]
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: thordata-sdk
|
|
3
|
+
Version: 0.7.0
|
|
4
|
+
Summary: The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.
|
|
5
|
+
Author-email: Thordata Developer Team <support@thordata.com>
|
|
6
|
+
License: MIT
|
|
7
|
+
Project-URL: Homepage, https://www.thordata.com
|
|
8
|
+
Project-URL: Documentation, https://github.com/Thordata/thordata-python-sdk#readme
|
|
9
|
+
Project-URL: Source, https://github.com/Thordata/thordata-python-sdk
|
|
10
|
+
Project-URL: Tracker, https://github.com/Thordata/thordata-python-sdk/issues
|
|
11
|
+
Project-URL: Changelog, https://github.com/Thordata/thordata-python-sdk/blob/main/CHANGELOG.md
|
|
12
|
+
Keywords: web scraping,proxy,residential proxy,datacenter proxy,ai,llm,data-mining,serp,thordata,web scraper,anti-bot bypass
|
|
13
|
+
Classifier: Development Status :: 4 - Beta
|
|
14
|
+
Classifier: Intended Audience :: Developers
|
|
15
|
+
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
16
|
+
Classifier: Topic :: Internet :: WWW/HTTP
|
|
17
|
+
Classifier: Topic :: Internet :: Proxy Servers
|
|
18
|
+
Classifier: Programming Language :: Python :: 3
|
|
19
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
20
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
21
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
22
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
23
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
24
|
+
Classifier: Operating System :: OS Independent
|
|
25
|
+
Classifier: Typing :: Typed
|
|
26
|
+
Requires-Python: >=3.9
|
|
27
|
+
Description-Content-Type: text/markdown
|
|
28
|
+
License-File: LICENSE
|
|
29
|
+
Requires-Dist: requests>=2.25.0
|
|
30
|
+
Requires-Dist: aiohttp>=3.9.0
|
|
31
|
+
Provides-Extra: dev
|
|
32
|
+
Requires-Dist: pytest>=7.0.0; extra == "dev"
|
|
33
|
+
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
|
|
34
|
+
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
|
|
35
|
+
Requires-Dist: pytest-httpserver>=1.0.0; extra == "dev"
|
|
36
|
+
Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
|
|
37
|
+
Requires-Dist: black>=23.0.0; extra == "dev"
|
|
38
|
+
Requires-Dist: ruff>=0.1.0; extra == "dev"
|
|
39
|
+
Requires-Dist: mypy>=1.0.0; extra == "dev"
|
|
40
|
+
Requires-Dist: types-requests>=2.28.0; extra == "dev"
|
|
41
|
+
Dynamic: license-file
|
|
42
|
+
|
|
43
|
+
# Thordata Python SDK
|
|
44
|
+
|
|
45
|
+
<div align="center">
|
|
46
|
+
|
|
47
|
+
**Official Python client for Thordata's Proxy Network, SERP API, Web Unlocker, and Web Scraper API.**
|
|
48
|
+
|
|
49
|
+
*Async-ready, type-safe, built for AI agents and large-scale data collection.*
|
|
50
|
+
|
|
51
|
+
[](https://github.com/Thordata/thordata-python-sdk/actions/workflows/ci.yml)
|
|
52
|
+
[](https://pypi.org/project/thordata-sdk/)
|
|
53
|
+
[](https://python.org)
|
|
54
|
+
[](LICENSE)
|
|
55
|
+
[](https://github.com/Thordata/thordata-python-sdk)
|
|
56
|
+
|
|
57
|
+
[Documentation](https://doc.thordata.com) • [Dashboard](https://www.thordata.com) • [Examples](examples/) • [Changelog](CHANGELOG.md)
|
|
58
|
+
|
|
59
|
+
</div>
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
## ✨ Features
|
|
64
|
+
|
|
65
|
+
| Feature | Description |
|
|
66
|
+
|---------|-------------|
|
|
67
|
+
| 🌐 **Proxy Network** | Residential, Mobile, Datacenter, ISP proxies with geo-targeting |
|
|
68
|
+
| 🔍 **SERP API** | Google, Bing, Yandex, DuckDuckGo, Baidu search results |
|
|
69
|
+
| 🔓 **Web Unlocker** | Bypass Cloudflare, CAPTCHAs, anti-bot systems automatically |
|
|
70
|
+
| 🕷️ **Web Scraper** | Async task-based scraping for complex sites |
|
|
71
|
+
| ⚡ **Async Support** | Full async/await support with aiohttp |
|
|
72
|
+
| 🔄 **Auto Retry** | Configurable retry with exponential backoff |
|
|
73
|
+
| 📝 **Type Safe** | Full type annotations for IDE autocomplete |
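
The Auto Retry row refers to retrying with exponential backoff. As a rough illustration of the idea only, the sketch below doubles the wait after each failed attempt, up to a cap. It is a generic, standard-library example and not the SDK's `thordata/retry.py`; the names `backoff_delays`, `retry_with_backoff`, `base_delay`, and `max_delay` are assumptions made here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base_delay: float = 0.5, max_delay: float = 8.0) -> list[float]:
    """Return the wait before each retry: base, 2*base, 4*base, ... capped at max_delay."""
    return [min(base_delay * (2 ** attempt), max_delay) for attempt in range(retries)]


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func up to retries + 1 times, sleeping between failed attempts."""
    for delay in backoff_delays(retries, base_delay, max_delay):
        try:
            return func()
        except Exception:
            sleep(delay)  # wait longer after every failure
    return func()  # final attempt: any exception propagates to the caller
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; production code would usually also restrict which exception types are retried.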
|
|
74
|
+
|
|
75
|
+
---
|
|
76
|
+
|
|
77
|
+
## 📦 Installation
|
|
78
|
+
|
|
79
|
+
```bash
|
|
80
|
+
pip install thordata-sdk
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
For development:
|
|
84
|
+
|
|
85
|
+
```bash
|
|
86
|
+
pip install thordata-sdk[dev]
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
---
|
|
90
|
+
|
|
91
|
+
## 🚀 Quick Start
|
|
92
|
+
|
|
93
|
+
### Get Your Credentials
|
|
94
|
+
|
|
95
|
+
1. Sign up at [thordata.com](https://www.thordata.com)
|
|
96
|
+
2. Navigate to your Dashboard
|
|
97
|
+
3. Copy your Scraper Token, Public Token, and Public Key
|
|
98
|
+
|
|
99
|
+
### Basic Usage
|
|
100
|
+
|
|
101
|
+
```python
|
|
102
|
+
from thordata import ThordataClient
|
|
103
|
+
|
|
104
|
+
# Initialize the client
|
|
105
|
+
client = ThordataClient(
|
|
106
|
+
scraper_token="your_scraper_token",
|
|
107
|
+
public_token="your_public_token", # Optional, for task APIs
|
|
108
|
+
public_key="your_public_key" # Optional, for task APIs
|
|
109
|
+
)
|
|
110
|
+
|
|
111
|
+
# Make a request through the proxy network
|
|
112
|
+
response = client.get("https://httpbin.org/ip")
|
|
113
|
+
print(response.json())
|
|
114
|
+
# {'origin': '123.45.67.89'} # Residential IP
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
### Environment Variables
|
|
118
|
+
|
|
119
|
+
Create a `.env` file:
|
|
120
|
+
|
|
121
|
+
```env
|
|
122
|
+
THORDATA_SCRAPER_TOKEN=your_scraper_token
|
|
123
|
+
THORDATA_PUBLIC_TOKEN=your_public_token
|
|
124
|
+
THORDATA_PUBLIC_KEY=your_public_key
|
|
125
|
+
```
|
|
126
|
+
|
|
127
|
+
Then use with python-dotenv:
|
|
128
|
+
|
|
129
|
+
```python
|
|
130
|
+
import os
|
|
131
|
+
from dotenv import load_dotenv
|
|
132
|
+
from thordata import ThordataClient
|
|
133
|
+
|
|
134
|
+
load_dotenv()
|
|
135
|
+
|
|
136
|
+
client = ThordataClient(
|
|
137
|
+
scraper_token=os.getenv("THORDATA_SCRAPER_TOKEN"),
|
|
138
|
+
public_token=os.getenv("THORDATA_PUBLIC_TOKEN"),
|
|
139
|
+
public_key=os.getenv("THORDATA_PUBLIC_KEY"),
|
|
140
|
+
)
|
|
141
|
+
```
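
`os.getenv` returns `None` for an unset variable, so a missing token would only surface later, when a request is made. A small check before constructing the client can fail fast instead. The helper name `require_env` is an assumption of this sketch, not an SDK function:

```python
import os
from typing import Mapping, Optional


def require_env(names: list[str], env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the requested variables, raising if any is unset or empty."""
    source = os.environ if env is None else env
    missing = [name for name in names if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: source[name] for name in names}
```

For example, `require_env(["THORDATA_SCRAPER_TOKEN"])` could run right after `load_dotenv()`; the public token and key are marked optional in Basic Usage, so they are left out of the required list here.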
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
## 📖 Usage Guide
|
|
146
|
+
|
|
147
|
+
### 1. Proxy Network
|
|
148
|
+
|
|
149
|
+
#### Basic Proxy Request
|
|
150
|
+
|
|
151
|
+
```python
|
|
152
|
+
from thordata import ThordataClient
|
|
153
|
+
|
|
154
|
+
client = ThordataClient(scraper_token="your_token")
|
|
155
|
+
|
|
156
|
+
# GET request through proxy
|
|
157
|
+
response = client.get("https://example.com")
|
|
158
|
+
print(response.text)
|
|
159
|
+
|
|
160
|
+
# POST request through proxy
|
|
161
|
+
response = client.post("https://httpbin.org/post", json={"key": "value"})
|
|
162
|
+
print(response.json())
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
#### Geo-Targeting
|
|
166
|
+
|
|
167
|
+
```python
|
|
168
|
+
from thordata import ThordataClient, ProxyConfig
|
|
169
|
+
|
|
170
|
+
client = ThordataClient(scraper_token="your_token")
|
|
171
|
+
|
|
172
|
+
# Create a proxy config with geo-targeting
|
|
173
|
+
config = ProxyConfig(
|
|
174
|
+
username="your_username",
|
|
175
|
+
password="your_password",
|
|
176
|
+
country="us", # Target country
|
|
177
|
+
state="california", # Target state
|
|
178
|
+
city="los_angeles", # Target city
|
|
179
|
+
)
|
|
180
|
+
|
|
181
|
+
response = client.get("https://httpbin.org/ip", proxy_config=config)
|
|
182
|
+
print(response.json())
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
#### Sticky Sessions
|
|
186
|
+
|
|
187
|
+
Keep the same IP for multiple requests:
|
|
188
|
+
|
|
189
|
+
```python
|
|
190
|
+
from thordata import ThordataClient, StickySession
|
|
191
|
+
|
|
192
|
+
client = ThordataClient(scraper_token="your_token")
|
|
193
|
+
|
|
194
|
+
# Create a sticky session (same IP for 10 minutes)
|
|
195
|
+
session = StickySession(
|
|
196
|
+
username="your_username",
|
|
197
|
+
password="your_password",
|
|
198
|
+
country="gb",
|
|
199
|
+
duration_minutes=10,
|
|
200
|
+
)
|
|
201
|
+
|
|
202
|
+
# All requests use the same IP
|
|
203
|
+
for i in range(5):
|
|
204
|
+
response = client.get("https://httpbin.org/ip", proxy_config=session)
|
|
205
|
+
print(f"Request {i+1}: {response.json()['origin']}")
|
|
206
|
+
```
|
|
207
|
+
|
|
208
|
+
#### Proxy Credentials
|
|
209
|
+
|
|
210
|
+
Each proxy product requires **separate credentials** from Thordata Dashboard:
|
|
211
|
+
|
|
212
|
+
```env
|
|
213
|
+
# Residential Proxy (port 9999)
|
|
214
|
+
THORDATA_RESIDENTIAL_USERNAME=your_residential_username
|
|
215
|
+
THORDATA_RESIDENTIAL_PASSWORD=your_residential_password
|
|
216
|
+
|
|
217
|
+
# Datacenter Proxy (port 7777)
|
|
218
|
+
THORDATA_DATACENTER_USERNAME=your_datacenter_username
|
|
219
|
+
THORDATA_DATACENTER_PASSWORD=your_datacenter_password
|
|
220
|
+
|
|
221
|
+
# Mobile Proxy (port 5555)
|
|
222
|
+
THORDATA_MOBILE_USERNAME=your_mobile_username
|
|
223
|
+
THORDATA_MOBILE_PASSWORD=your_mobile_password
|
|
224
|
+
|
|
225
|
+
# Static ISP Proxy (port 6666, direct IP connection)
|
|
226
|
+
THORDATA_ISP_HOST=your_static_ip_address
|
|
227
|
+
THORDATA_ISP_USERNAME=your_isp_username
|
|
228
|
+
THORDATA_ISP_PASSWORD=your_isp_password
|
|
229
|
+
```
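
Because each product has its own username, password, and port, it can help to resolve them in one place. The sketch below reads the variable names and ports listed above. The `ProxyCredentials` dataclass and `load_credentials` helper are hypothetical names used for illustration and are not part of the SDK:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Ports as given in the .env comments above.
PRODUCT_PORTS = {"RESIDENTIAL": 9999, "DATACENTER": 7777, "MOBILE": 5555, "ISP": 6666}


@dataclass(frozen=True)
class ProxyCredentials:
    product: str
    username: str
    password: str
    port: int
    host: Optional[str] = None  # only set for static ISP proxies


def load_credentials(product: str, env: Optional[Mapping[str, str]] = None) -> ProxyCredentials:
    """Read THORDATA_<PRODUCT>_USERNAME/PASSWORD (plus THORDATA_ISP_HOST for ISP)."""
    source = os.environ if env is None else env
    key = product.upper()
    if key not in PRODUCT_PORTS:
        raise ValueError(f"Unknown proxy product: {product}")
    return ProxyCredentials(
        product=key,
        username=source[f"THORDATA_{key}_USERNAME"],  # KeyError if unset
        password=source[f"THORDATA_{key}_PASSWORD"],
        port=PRODUCT_PORTS[key],
        host=source.get("THORDATA_ISP_HOST") if key == "ISP" else None,
    )
```

A missing username or password raises `KeyError`, which makes a mixed-up product name visible immediately rather than at connection time.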
|
|
230
|
+
|
|
231
|
+
#### Residential Proxy
|
|
232
|
+
|
|
233
|
+
```python
|
|
234
|
+
import requests
from thordata import ProxyConfig, ProxyProduct
|
|
235
|
+
|
|
236
|
+
proxy = ProxyConfig(
|
|
237
|
+
username="your_username",
|
|
238
|
+
password="your_password",
|
|
239
|
+
product=ProxyProduct.RESIDENTIAL,
|
|
240
|
+
country="us",
|
|
241
|
+
)
|
|
242
|
+
|
|
243
|
+
response = requests.get(
|
|
244
|
+
"http://httpbin.org/ip",
|
|
245
|
+
proxies=proxy.to_proxies_dict(),
|
|
246
|
+
)
|
|
247
|
+
print(response.json())
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
#### Datacenter Proxy
|
|
251
|
+
|
|
252
|
+
```python
|
|
253
|
+
from thordata import ProxyConfig, ProxyProduct

proxy = ProxyConfig(
|
|
254
|
+
username="your_username",
|
|
255
|
+
password="your_password",
|
|
256
|
+
product=ProxyProduct.DATACENTER,
|
|
257
|
+
)
|
|
258
|
+
```
|
|
259
|
+
|
|
260
|
+
#### Mobile Proxy
|
|
261
|
+
|
|
262
|
+
```python
|
|
263
|
+
from thordata import ProxyConfig, ProxyProduct

proxy = ProxyConfig(
|
|
264
|
+
username="your_username",
|
|
265
|
+
password="your_password",
|
|
266
|
+
product=ProxyProduct.MOBILE,
|
|
267
|
+
country="gb",
|
|
268
|
+
)
|
|
269
|
+
```
|
|
270
|
+
|
|
271
|
+
#### Static ISP Proxy
|
|
272
|
+
|
|
273
|
+
Static ISP proxies connect directly to your purchased IP address:
|
|
274
|
+
|
|
275
|
+
```python
|
|
276
|
+
import requests
from thordata import StaticISPProxy
|
|
277
|
+
|
|
278
|
+
proxy = StaticISPProxy(
|
|
279
|
+
host="your_static_ip_address", # Your purchased IP
|
|
280
|
+
username="your_username",
|
|
281
|
+
password="your_password",
|
|
282
|
+
)
|
|
283
|
+
|
|
284
|
+
response = requests.get(
|
|
285
|
+
"http://httpbin.org/ip",
|
|
286
|
+
proxies=proxy.to_proxies_dict(),
|
|
287
|
+
)
|
|
288
|
+
# Returns your purchased static IP
|
|
289
|
+
```
|
|
290
|
+
|
|
291
|
+
#### Proxy Examples
|
|
292
|
+
|
|
293
|
+
```bash
|
|
294
|
+
python examples/proxy_residential.py
|
|
295
|
+
python examples/proxy_datacenter.py
|
|
296
|
+
python examples/proxy_mobile.py
|
|
297
|
+
python examples/proxy_isp.py
|
|
298
|
+
```
|
|
299
|
+
|
|
300
|
+
### Run All Examples
|
|
301
|
+
|
|
302
|
+
```bash
|
|
303
|
+
# SERP API examples
|
|
304
|
+
python examples/demo_serp_api.py
|
|
305
|
+
python examples/demo_serp_google_news.py
|
|
306
|
+
|
|
307
|
+
# Universal API examples
|
|
308
|
+
python examples/demo_universal.py
|
|
309
|
+
python examples/demo_scraping_browser.py
|
|
310
|
+
|
|
311
|
+
# Web Scraper API examples
|
|
312
|
+
python examples/demo_web_scraper_api.py
|
|
313
|
+
|
|
314
|
+
# Proxy Network examples
|
|
315
|
+
python examples/proxy_residential.py
|
|
316
|
+
python examples/proxy_datacenter.py
|
|
317
|
+
python examples/proxy_mobile.py
|
|
318
|
+
python examples/proxy_isp.py
|
|
319
|
+
|
|
320
|
+
# Async high concurrency example
|
|
321
|
+
python examples/async_high_concurrency.py
|
|
322
|
+
```
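
The last script name refers to high-concurrency async use. As a general illustration of that pattern, a semaphore can cap the number of requests in flight. This sketch uses only the standard-library `asyncio`; `fake_fetch` is a stand-in for a real request, and nothing here is taken from `examples/async_high_concurrency.py`:

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Stand-in for a real network call."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


async def fetch_all(urls: list[str], limit: int = 5) -> list[str]:
    """Fetch every URL with at most `limit` requests in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> str:
        async with semaphore:
            return await fake_fetch(url)

    # gather returns results in the same order as the input URLs
    return list(await asyncio.gather(*(bounded(url) for url in urls)))
```

Calling `asyncio.run(fetch_all(urls, limit=3))` runs the whole batch; the limit is a tuning knob that would depend on the account's allowed concurrency.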
|
|
323
|
+
|
|
324
|
+
### 2. SERP API (Search Engine Results)
|
|
325
|
+
|
|
326
|
+
#### Basic Search
|
|
327
|
+
|
|
328
|
+
```python
|
|
329
|
+
from thordata import ThordataClient, Engine
|
|
330
|
+
|
|
331
|
+
client = ThordataClient(scraper_token="your_token")
|
|
332
|
+
|
|
333
|
+
# Google search
|
|
334
|
+
results = client.serp_search(
|
|
335
|
+
query="python programming",
|
|
336
|
+
engine=Engine.GOOGLE,
|
|
337
|
+
num=10
|
|
338
|
+
)
|
|
339
|
+
|
|
340
|
+
# Print organic results
|
|
341
|
+
for result in results.get("organic", []):
|
|
342
|
+
print(f"{result['title']}: {result['link']}")
|
|
343
|
+
```
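
The loop above indexes `result['title']` and `result['link']` directly, which raises `KeyError` if an entry lacks either field. A more defensive variant is sketched below. The `sample` dictionary is made up for illustration and only mirrors the `organic`, `title`, and `link` keys used above; `organic_links` is a hypothetical helper:

```python
from typing import Any


def organic_links(results: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (title, link) pairs, skipping entries missing either field."""
    pairs = []
    for item in results.get("organic", []):
        title, link = item.get("title"), item.get("link")
        if title and link:
            pairs.append((title, link))
    return pairs


# Hypothetical response shape, for illustration only.
sample = {
    "organic": [
        {"title": "Welcome to Python.org", "link": "https://www.python.org/"},
        {"title": "Entry without a link"},
    ]
}
for title, link in organic_links(sample):
    print(f"{title}: {link}")
```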
|
|
344
|
+
|
|
345
|
+
#### General Calling Method
|
|
346
|
+
|
|
347
|
+
```python
|
|
348
|
+
from thordata import ThordataClient
|
|
349
|
+
|
|
350
|
+
client = ThordataClient(scraper_token="YOUR_SCRAPER_TOKEN")
|
|
351
|
+
|
|
352
|
+
# Recommended: use dedicated engines for Google verticals when available
|
|
353
|
+
news = client.serp_search(
|
|
354
|
+
query="pizza",
|
|
355
|
+
engine="google_news",
|
|
356
|
+
country="us",
|
|
357
|
+
language="en",
|
|
358
|
+
num=10,
|
|
359
|
+
so=1, # 0=relevance, 1=date (Google News)
|
|
360
|
+
)
|
|
361
|
+
|
|
362
|
+
# Alternative: use Google generic engine + tbm via `search_type`
|
|
363
|
+
# Note: `search_type` maps to Google tbm and is mainly intended for engine="google".
|
|
364
|
+
results = client.serp_search(
|
|
365
|
+
query="pizza",
|
|
366
|
+
engine="google",
|
|
367
|
+
num=10,
|
|
368
|
+
country="us",
|
|
369
|
+
language="en",
|
|
370
|
+
search_type="news", # tbm=nws (Google generic engine)
|
|
371
|
+
ibp="some_ibp_value",
|
|
372
|
+
lsig="some_lsig_value",
|
|
373
|
+
)
|
|
374
|
+
```
|
|
375
|
+
|
|
376
|
+
**Note**: All parameters above will be assembled into Thordata SERP API request parameters.
|
|
377
|
+
|
|
378
|
+
#### Advanced Search Options
|
|
379
|
+
|
|
380
|
+
```python
|
|
381
|
+
from thordata import ThordataClient, SerpRequest
|
|
382
|
+
|
|
383
|
+
client = ThordataClient(scraper_token="your_token")
|
|
384
|
+
|
|
385
|
+
# Create a detailed search request
|
|
386
|
+
request = SerpRequest(
|
|
387
|
+
query="best laptops 2024",
|
|
388
|
+
engine="google_shopping",
|
|
389
|
+
num=20,
|
|
390
|
+
country="us",
|
|
391
|
+
language="en",
|
|
392
|
+
safe_search=True,
|
|
393
|
+
device="mobile",
|
|
394
|
+
# Shopping-specific params can be passed via extra_params
|
|
395
|
+
# e.g. min_price=500, max_price=1500, sort_by=1, shoprs="..."
|
|
396
|
+
)
|
|
397
|
+
|
|
398
|
+
results = client.serp_search_advanced(request)
|
|
399
|
+
```
|
|
400
|
+
|
|
401
|
+
#### Multiple Search Engines
|
|
402
|
+
|
|
403
|
+
```python
|
|
404
|
+
from thordata import ThordataClient, Engine
|
|
405
|
+
|
|
406
|
+
client = ThordataClient(scraper_token="your_token")
|
|
407
|
+
|
|
408
|
+
# Google
|
|
409
|
+
google_results = client.serp_search("AI news", engine=Engine.GOOGLE)
|
|
410
|
+
|
|
411
|
+
# Bing
|
|
412
|
+
bing_results = client.serp_search("AI news", engine=Engine.BING)
|
|
413
|
+
|
|
414
|
+
# Yandex (Russian search engine)
|
|
415
|
+
yandex_results = client.serp_search("AI news", engine=Engine.YANDEX)
|
|
416
|
+
|
|
417
|
+
# DuckDuckGo
|
|
418
|
+
ddg_results = client.serp_search("AI news", engine=Engine.DUCKDUCKGO)
|
|
419
|
+
```
|
|
420
|
+
|
|
421
|
+
---
|
|
422
|
+
|
|
423
|
+
## 🔧 SERP API Parameter Mapping
|
|
424
|
+
|
|
425
|
+
Thordata's SERP API supports multiple search engines and sub-features (Google Search/Shopping/News, etc.).
|
|
426
|
+
This SDK wraps common parameters through `ThordataClient.serp_search` and `SerpRequest`, while other parameters can be passed directly through `**kwargs`.
|
|
427
|
+
|
|
428
|
+
### Google Search Parameter Mapping
|
|
429
|
+
|
|
430
|
+
| Document Parameter | SDK Field/Usage | Description |
|
|
431
|
+
|-------------------|-----------------|-------------|
|
|
432
|
+
| q | query | Search keyword |
|
|
433
|
+
| engine | engine | Engine.GOOGLE / "google" |
|
|
434
|
+
| google_domain | google_domain | e.g., "google.co.uk" |
|
|
435
|
+
| gl | country | Country/region, e.g., "us" |
|
|
436
|
+
| hl | language | Language, e.g., "en", "zh-CN" |
|
|
437
|
+
| cr | countries_filter | Multi-country filter, e.g., "countryFR" |
|
|
438
|
+
| lr | languages_filter | Multi-language filter, e.g., "lang_en\|lang_fr" |
|
|
439
|
+
| location | location | Exact location, e.g., "India" |
|
|
440
|
+
| uule | uule | Base64 encoded location string |
|
|
441
|
+
| tbm | search_type | "images"→tbm=isch, "shopping"→tbm=shop, "news"→tbm=nws, "videos"→tbm=vid, other values passed through as-is |
|
|
442
|
+
| start | start | Result offset for pagination |
|
|
443
|
+
| num | num | Number of results per page |
|
|
444
|
+
| ludocid | ludocid | Google Place ID |
|
|
445
|
+
| kgmid | kgmid | Google Knowledge Graph ID |
|
|
446
|
+
| ibp | ibp="..." (kwargs) | Passed through **kwargs |
|
|
447
|
+
| lsig | lsig="..." (kwargs) | Same as above |
|
|
448
|
+
| si | si="..." (kwargs) | Same as above |
|
|
449
|
+
| uds | uds="ADV" (kwargs) | Same as above |
|
|
450
|
+
| tbs | time_filter or tbs="..." | time_filter="week" generates tbs=qdr:w, can also pass complete tbs directly |
|
|
451
|
+
| safe | safe_search | True → safe=active, False → safe=off |
|
|
452
|
+
| nfpr | no_autocorrect | True → nfpr=1 |
|
|
453
|
+
| filter | filter_duplicates | True → filter=1, False → filter=0 |
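
The mapping in the table above can be sketched as a plain function. This is only an illustration of the translation rules, not the SDK's actual implementation: `build_google_params` is a hypothetical helper, and the `time_filter` values other than `"week"` are assumptions based on Google's `qdr` codes.

```python
from typing import Any, Dict, Optional

# search_type -> tbm, as listed in the table
TBM_MAP = {"images": "isch", "shopping": "shop", "news": "nws", "videos": "vid"}

# time_filter -> tbs; only "week" is documented above, the others are assumed
TBS_MAP = {
    "hour": "qdr:h",
    "day": "qdr:d",
    "week": "qdr:w",
    "month": "qdr:m",
    "year": "qdr:y",
}


def build_google_params(
    query: str,
    country: Optional[str] = None,
    language: Optional[str] = None,
    search_type: Optional[str] = None,
    time_filter: Optional[str] = None,
    safe_search: Optional[bool] = None,
    no_autocorrect: bool = False,
    filter_duplicates: Optional[bool] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: translate SDK-style fields into raw SERP parameters."""
    params: Dict[str, Any] = {"q": query, "engine": "google"}
    if country:
        params["gl"] = country
    if language:
        params["hl"] = language
    if search_type:
        # Values not in the map are passed through as-is
        params["tbm"] = TBM_MAP.get(search_type, search_type)
    if time_filter:
        # Assumption: unknown values are forwarded unchanged
        params["tbs"] = TBS_MAP.get(time_filter, time_filter)
    if safe_search is not None:
        params["safe"] = "active" if safe_search else "off"
    if no_autocorrect:
        params["nfpr"] = 1
    if filter_duplicates is not None:
        params["filter"] = 1 if filter_duplicates else 0
    # Extra parameters (ibp, lsig, si, uds, ...) are forwarded unchanged
    params.update(kwargs)
    return params


params = build_google_params(
    "pizza",
    country="us",
    search_type="news",
    time_filter="week",
    safe_search=True,
    ibp="some_ibp_value",
)
print(params)
```

Fields left at their defaults are omitted from the result, so the request carries only what the caller set explicitly.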
|
|
454
|
+
|
|
455
|
+
**Example: Google Search Basic Usage**
|
|
456
|
+
|
|
457
|
+
```python
|
|
458
|
+
results = client.serp_search(
|
|
459
|
+
query="python web scraping best practices",
|
|
460
|
+
engine=Engine.GOOGLE,
|
|
461
|
+
country="us",
|
|
462
|
+
language="en",
|
|
463
|
+
num=10,
|
|
464
|
+
time_filter="week", # Last week
|
|
465
|
+
safe_search=True, # Adult content filter
|
|
466
|
+
)
|
|
467
|
+
```
|
|
468
|
+
|
|
469
|
+
### Google Shopping Parameter Mapping
|
|
470
|
+
|
|
471
|
+
Recommended: use the dedicated Google Shopping engine (`engine="google_shopping"`):
|
|
472
|
+
|
|
473
|
+
```python
|
|
474
|
+
results = client.serp_search(
|
|
475
|
+
query="iPhone 15",
|
|
476
|
+
engine="google_shopping",
|
|
477
|
+
country="us",
|
|
478
|
+
language="en",
|
|
479
|
+
num=20,
|
|
480
|
+
# Shopping parameters are passed through kwargs
|
|
481
|
+
min_price=500,
|
|
482
|
+
max_price=1500,
|
|
483
|
+
sort_by=1,
|
|
484
|
+
free_shipping=True,
|
|
485
|
+
on_sale=True,
|
|
486
|
+
small_business=True,
|
|
487
|
+
direct_link=True,
|
|
488
|
+
shoprs="FILTER_ID_HERE",
|
|
489
|
+
)
|
|
490
|
+
shopping_items = results.get("shopping_results", [])
|
|
491
|
+
```
|
|
492
|
+
Alternative: use `engine="google"` with `search_type="shopping"` (tbm=shop).
|
|
493
|
+
|
|
494
|
+
| Document Parameter | SDK Field/Usage | Description |
|
|
495
|
+
|-------------------|-----------------|-------------|
|
|
496
|
+
| q | query | Search keyword |
|
|
497
|
+
| google_domain | google_domain | Same as above |
|
|
498
|
+
| gl | country | Same as above |
|
|
499
|
+
| hl | language | Same as above |
|
|
500
|
+
| location | location | Same as above |
|
|
501
|
+
| uule | uule | Same as above |
|
|
502
|
+
| start | start | Offset |
|
|
503
|
+
| num | num | Quantity |
|
|
504
|
+
| tbs | time_filter or tbs="..." | Same as above |
|
|
505
|
+
| shoprs | shoprs="..." (kwargs) | Filter ID |
|
|
506
|
+
| min_price | min_price=... (kwargs) | Minimum price |
|
|
507
|
+
| max_price | max_price=... (kwargs) | Maximum price |
|
|
508
|
+
| sort_by | sort_by=1/2 (kwargs) | Sort order |
|
|
509
|
+
| free_shipping | free_shipping=True/False (kwargs) | Free shipping |
|
|
510
|
+
| on_sale | on_sale=True/False (kwargs) | On sale |
|
|
511
|
+
| small_business | small_business=True/False (kwargs) | Small business |
|
|
512
|
+
| direct_link | direct_link=True/False (kwargs) | Include direct links |
|
|
513
|
+
|
|
514
|
+
### Google Local Parameter Mapping
|
|
515
|
+
|
|
516
|
+
Google Local is mainly about location-based local searches.
|
|
517
|
+
In the SDK, pass `search_type="local"` to select Local mode (the value is passed through to `tbm` as `"local"`), combined with `location` and `uule`.
|
|
518
|
+
|
|
519
|
+
```python
|
|
520
|
+
results = client.serp_search(
|
|
521
|
+
query="pizza near me",
|
|
522
|
+
engine=Engine.GOOGLE,
|
|
523
|
+
search_type="local",
|
|
524
|
+
google_domain="google.com",
|
|
525
|
+
country="us",
|
|
526
|
+
language="en",
|
|
527
|
+
location="San Francisco",
|
|
528
|
+
uule="w+CAIQICIFU2FuIEZyYW5jaXNjbw", # Example value
|
|
529
|
+
start=0, # Local only accepts 0, 20, 40...
|
|
530
|
+
)
|
|
531
|
+
local_results = results.get("local_results", results.get("organic", []))
|
|
532
|
+
```
|
|
533
|
+
|
|
534
|
+
| Document Parameter | SDK Field/Usage | Description |
|
|
535
|
+
|-------------------|-----------------|-------------|
|
|
536
|
+
| q | query | Search term |
|
|
537
|
+
| google_domain | google_domain | Domain |
|
|
538
|
+
| gl | country | Country |
|
|
539
|
+
| hl | language | Language |
|
|
540
|
+
| location | location | Location |
|
|
541
|
+
| uule | uule | Encoded location |
|
|
542
|
+
| start | start | Offset (must be 0,20,40...) |
|
|
543
|
+
| ludocid | ludocid | Place ID (commonly used in Local results) |
|
|
544
|
+
| tbs | time_filter or tbs="..." | Advanced filtering |
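
Because Local mode only accepts `start` values in steps of 20, it can help to generate the offsets up front. A minimal sketch; `local_start_offsets` is a hypothetical helper, not part of the SDK:

```python
from typing import List

LOCAL_PAGE_SIZE = 20  # Local results advance in steps of 20 (0, 20, 40, ...)


def local_start_offsets(pages: int) -> List[int]:
    """Return valid `start` values for the first `pages` pages of Local results."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return [page * LOCAL_PAGE_SIZE for page in range(pages)]


print(local_start_offsets(3))  # [0, 20, 40]
```

Each offset can then be passed as `start=` in a separate `client.serp_search(...)` call.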
|
|
545
|
+
|
|
546
|
+
### Google Videos Parameter Mapping
|
|
547
|
+
|
|
548
|
+
```python
|
|
549
|
+
results = client.serp_search(
|
|
550
|
+
query="python async tutorial",
|
|
551
|
+
engine=Engine.GOOGLE,
|
|
552
|
+
search_type="videos", # tbm=vid
|
|
553
|
+
country="us",
|
|
554
|
+
language="en",
|
|
555
|
+
languages_filter="lang_en|lang_fr",
|
|
556
|
+
location="United States",
|
|
557
|
+
uule="ENCODED_LOCATION_HERE",
|
|
558
|
+
num=10,
|
|
559
|
+
time_filter="month",
|
|
560
|
+
safe_search=True,
|
|
561
|
+
filter_duplicates=True,
|
|
562
|
+
)
|
|
563
|
+
video_results = results.get("video_results", results.get("organic", []))
|
|
564
|
+
```
|
|
565
|
+
|
|
566
|
+
| Document Parameter | SDK Field/Usage | Description |
|
|
567
|
+
|-------------------|-----------------|-------------|
|
|
568
|
+
| q | query | Search term |
|
|
569
|
+
| google_domain | google_domain | Domain |
|
|
570
|
+
| gl | country | Country |
|
|
571
|
+
| hl | language | Language |
|
|
572
|
+
| lr | languages_filter | Multi-language filter |
|
|
573
|
+
| location | location | Geographic location |
|
|
574
|
+
| uule | uule | Encoded location |
|
|
575
|
+
| start | start | Offset |
|
|
576
|
+
| num | num | Quantity |
|
|
577
|
+
| tbs | time_filter or tbs="..." | Time and advanced filtering |
|
|
578
|
+
| safe | safe_search | Adult content filter |
|
|
579
|
+
| nfpr | no_autocorrect | Disable auto-correction |
|
|
580
|
+
| filter | filter_duplicates | Remove duplicates |
|
|
581
|
+
|
|
582
|
+
### Google News Parameter Mapping
|
|
583
|
+
|
|
584
|
+
Google News adds a set of dedicated token parameters for selecting specific topics, publications, sections, and stories.
|
|
585
|
+
|
|
586
|
+
```python
|
|
587
|
+
results = client.serp_search(
|
|
588
|
+
query="AI regulation",
|
|
589
|
+
engine="google_news",
|
|
590
|
+
country="us",
|
|
591
|
+
language="en",
|
|
592
|
+
topic_token="YOUR_TOPIC_TOKEN",
|
|
593
|
+
publication_token="YOUR_PUBLICATION_TOKEN",
|
|
594
|
+
section_token="YOUR_SECTION_TOKEN",
|
|
595
|
+
story_token="YOUR_STORY_TOKEN",
|
|
596
|
+
so=1, # 0=relevance, 1=date
|
|
597
|
+
)
|
|
598
|
+
news_results = results.get("news_results", results.get("organic", []))
|
|
599
|
+
```
|
|
600
|
+
|
|
601
|
+
| Document Parameter | SDK Field/Usage | Description |
|
|
602
|
+
|-------------------|-----------------|-------------|
|
|
603
|
+
| q | query | Search term |
|
|
604
|
+
| gl | country | Country |
|
|
605
|
+
| hl | language | Language |
|
|
606
|
+
| topic_token | topic_token="..." (kwargs) | Topic token |
|
|
607
|
+
| publication_token | publication_token="..." (kwargs) | Publication token |
|
|
608
|
+
| section_token | section_token="..." (kwargs) | Section token |
|
|
609
|
+
| story_token | story_token="..." (kwargs) | Story token |
|
|
610
|
+
| so | so=0/1 (kwargs) | Sort: 0=relevance, 1=time |
|
|
611
|
+
|
|
612
|
+
---
|
|
613
|
+
|
|
614
|
+
👉 For more SERP modes and parameter mappings, see docs/serp_reference.md.
|
|
615
|
+
|
|
616
|
+
## 🔓 Web Unlocker (Universal Scraping API)
|
|
617
|
+
|
|
618
|
+
Automatically bypass anti-bot protections:
|
|
619
|
+
|
|
620
|
+
#### Basic Usage
|
|
621
|
+
|
|
622
|
+
```python
|
|
623
|
+
from thordata import ThordataClient
|
|
624
|
+
|
|
625
|
+
client = ThordataClient(scraper_token="your_token")
|
|
626
|
+
|
|
627
|
+
# Get HTML content
|
|
628
|
+
html = client.universal_scrape(
|
|
629
|
+
url="https://example.com",
|
|
630
|
+
js_render=True, # Enable JavaScript rendering
|
|
631
|
+
)
|
|
632
|
+
print(html[:500])
|
|
633
|
+
```
|
|
634
|
+
|
|
635
|
+
#### Advanced Options
|
|
636
|
+
|
|
637
|
+
```python
|
|
638
|
+
from thordata import ThordataClient, UniversalScrapeRequest
|
|
639
|
+
|
|
640
|
+
client = ThordataClient(scraper_token="your_token")
|
|
641
|
+
|
|
642
|
+
request = UniversalScrapeRequest(
|
|
643
|
+
url="https://example.com",
|
|
644
|
+
js_render=True,
|
|
645
|
+
output_format="html",
|
|
646
|
+
country="us",
|
|
647
|
+
block_resources="image,font", # Speed up by blocking resources
|
|
648
|
+
clean_content="js,css", # Remove JS/CSS from output
|
|
649
|
+
wait=5000, # Wait 5 seconds after load
|
|
650
|
+
wait_for=".content-loaded", # Wait for CSS selector
|
|
651
|
+
headers=[
|
|
652
|
+
{"name": "Accept-Language", "value": "en-US"}
|
|
653
|
+
],
|
|
654
|
+
cookies=[
|
|
655
|
+
{"name": "session", "value": "abc123"}
|
|
656
|
+
],
|
|
657
|
+
)
|
|
658
|
+
|
|
659
|
+
html = client.universal_scrape_advanced(request)
|
|
660
|
+
```
|
|
661
|
+
|
|
662
|
+
#### Take Screenshots
|
|
663
|
+
|
|
664
|
+
```python
|
|
665
|
+
from thordata import ThordataClient
|
|
666
|
+
|
|
667
|
+
client = ThordataClient(scraper_token="your_token")
|
|
668
|
+
|
|
669
|
+
# Get PNG screenshot
|
|
670
|
+
png_bytes = client.universal_scrape(
|
|
671
|
+
url="https://example.com",
|
|
672
|
+
js_render=True,
|
|
673
|
+
output_format="png",
|
|
674
|
+
)
|
|
675
|
+
|
|
676
|
+
# Save to file
|
|
677
|
+
with open("screenshot.png", "wb") as f:
|
|
678
|
+
f.write(png_bytes)
|
|
679
|
+
```
|
|
680
|
+
|
|
681
|
+
### Web Scraper API (Async Tasks)
|
|
682
|
+
|
|
683
|
+
For complex scraping jobs that run asynchronously:
|
|
684
|
+
|
|
685
|
+
```python
|
|
686
|
+
from thordata import ThordataClient
|
|
687
|
+
|
|
688
|
+
client = ThordataClient(
|
|
689
|
+
scraper_token="your_token",
|
|
690
|
+
public_token="your_public_token",
|
|
691
|
+
public_key="your_public_key",
|
|
692
|
+
)
|
|
693
|
+
|
|
694
|
+
# Create a scraping task
|
|
695
|
+
task_id = client.create_scraper_task(
|
|
696
|
+
file_name="youtube_channel_data",
|
|
697
|
+
spider_id="youtube_video-post_by-url", # From Dashboard
|
|
698
|
+
spider_name="youtube.com",
|
|
699
|
+
parameters={
|
|
700
|
+
"url": "https://www.youtube.com/@PewDiePie/videos",
|
|
701
|
+
"num_of_posts": "50"
|
|
702
|
+
}
|
|
703
|
+
)
|
|
704
|
+
print(f"Task created: {task_id}")
|
|
705
|
+
|
|
706
|
+
# Wait for completion (with timeout)
|
|
707
|
+
status = client.wait_for_task(task_id, max_wait=300)
|
|
708
|
+
print(f"Task status: {status}")
|
|
709
|
+
|
|
710
|
+
# Get results
|
|
711
|
+
if status in ("ready", "success"):
|
|
712
|
+
download_url = client.get_task_result(task_id)
|
|
713
|
+
print(f"Download: {download_url}")
|
|
714
|
+
```
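
Conceptually, waiting for a task is a polling loop with a deadline. The sketch below shows that pattern in isolation. It is not the SDK's `wait_for_task` implementation: `wait_until_done`, the `"failed"` state, and the polling interval are assumptions made for illustration.

```python
import time
from typing import Callable, Iterable


def wait_until_done(
    get_status: Callable[[], str],
    max_wait: float = 300.0,
    poll_interval: float = 5.0,
    done_states: Iterable[str] = ("ready", "success", "failed"),
) -> str:
    """Poll `get_status` until it returns a terminal state or `max_wait` elapses."""
    finished = set(done_states)
    deadline = time.monotonic() + max_wait
    while True:
        status = get_status()
        if status in finished:
            return status
        # Give up if another sleep would take us past the deadline
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"task still '{status}' after {max_wait} seconds")
        time.sleep(poll_interval)


# Stand-in for a real status call: reports "running" twice, then "ready"
fake_statuses = iter(["running", "running", "ready"])
final_status = wait_until_done(
    lambda: next(fake_statuses), max_wait=1.0, poll_interval=0.01
)
print(final_status)  # ready
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock changes while waiting.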
|
|
715
|
+
|
|
716
|
+
### Async Client (High Concurrency)
|
|
717
|
+
|
|
718
|
+
For maximum performance with concurrent requests:
|
|
719
|
+
|
|
720
|
+
```python
|
|
721
|
+
import asyncio
|
|
722
|
+
from thordata import AsyncThordataClient
|
|
723
|
+
|
|
724
|
+
async def main():
|
|
725
|
+
async with AsyncThordataClient(
|
|
726
|
+
scraper_token="your_token",
|
|
727
|
+
public_token="your_public_token",
|
|
728
|
+
public_key="your_public_key",
|
|
729
|
+
) as client:
|
|
730
|
+
|
|
731
|
+
# Concurrent proxy requests
|
|
732
|
+
urls = [
|
|
733
|
+
"https://httpbin.org/ip",
|
|
734
|
+
"https://httpbin.org/headers",
|
|
735
|
+
"https://httpbin.org/user-agent",
|
|
736
|
+
]
|
|
737
|
+
|
|
738
|
+
tasks = [client.get(url) for url in urls]
|
|
739
|
+
responses = await asyncio.gather(*tasks)
|
|
740
|
+
|
|
741
|
+
for resp in responses:
|
|
742
|
+
print(await resp.json())
|
|
743
|
+
|
|
744
|
+
asyncio.run(main())
|
|
745
|
+
```
|
|
746
|
+
|
|
747
|
+
#### Async SERP Search
|
|
748
|
+
|
|
749
|
+
```python
|
|
750
|
+
import asyncio
|
|
751
|
+
from thordata import AsyncThordataClient, Engine
|
|
752
|
+
|
|
753
|
+
async def search_multiple():
|
|
754
|
+
async with AsyncThordataClient(scraper_token="your_token") as client:
|
|
755
|
+
queries = ["python", "javascript", "rust", "go"]
|
|
756
|
+
|
|
757
|
+
tasks = [
|
|
758
|
+
client.serp_search(q, engine=Engine.GOOGLE)
|
|
759
|
+
for q in queries
|
|
760
|
+
]
|
|
761
|
+
|
|
762
|
+
results = await asyncio.gather(*tasks)
|
|
763
|
+
|
|
764
|
+
for query, result in zip(queries, results):
|
|
765
|
+
count = len(result.get("organic", []))
|
|
766
|
+
print(f"{query}: {count} results")
|
|
767
|
+
|
|
768
|
+
asyncio.run(search_multiple())
|
|
769
|
+
```
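
`asyncio.gather` starts every coroutine at once, which may exceed rate limits for large batches. One common way to cap the number of in-flight requests is an `asyncio.Semaphore`. The sketch below is self-contained: `gather_limited` is a hypothetical helper and `fake_search` stands in for a real `client.serp_search(...)` call.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def gather_limited(
    factories: List[Callable[[], Awaitable[T]]], limit: int
) -> List[T]:
    """Run the coroutines produced by `factories`, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory: Callable[[], Awaitable[T]]) -> T:
        async with semaphore:
            return await factory()

    # gather returns results in the same order as its inputs
    return await asyncio.gather(*(run(f) for f in factories))


async def fake_search(query: str) -> dict:
    """Stand-in for a network call; returns an empty result set."""
    await asyncio.sleep(0.01)
    return {"query": query, "organic": []}


async def demo() -> List[dict]:
    queries = ["python", "javascript", "rust", "go"]
    # Bind q as a default argument so each lambda keeps its own query
    factories = [lambda q=q: fake_search(q) for q in queries]
    return await gather_limited(factories, limit=2)


limited_results = asyncio.run(demo())
print([r["query"] for r in limited_results])
```

Passing factories rather than coroutine objects means no request is created until a semaphore slot is free.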
|
|
770
|
+
|
|
771
|
+
### Location APIs
|
|
772
|
+
|
|
773
|
+
Discover available geo-targeting options:
|
|
774
|
+
|
|
775
|
+
```python
|
|
776
|
+
from thordata import ThordataClient, ProxyType
|
|
777
|
+
|
|
778
|
+
client = ThordataClient(
|
|
779
|
+
scraper_token="your_token",
|
|
780
|
+
public_token="your_public_token",
|
|
781
|
+
public_key="your_public_key",
|
|
782
|
+
)
|
|
783
|
+
|
|
784
|
+
# List all supported countries
|
|
785
|
+
countries = client.list_countries(proxy_type=ProxyType.RESIDENTIAL)
|
|
786
|
+
print(f"Supported countries: {len(countries)}")
|
|
787
|
+
|
|
788
|
+
# List states for a country
|
|
789
|
+
states = client.list_states("US")
|
|
790
|
+
for state in states[:5]:
|
|
791
|
+
print(f" {state['state_code']}: {state['state_name']}")
|
|
792
|
+
|
|
793
|
+
# List cities
|
|
794
|
+
cities = client.list_cities("US", state_code="california")
|
|
795
|
+
print(f"Cities in California: {len(cities)}")
|
|
796
|
+
|
|
797
|
+
# List ASNs (for ISP targeting)
|
|
798
|
+
asns = client.list_asn("US")
|
|
799
|
+
for asn in asns[:5]:
|
|
800
|
+
print(f" {asn['asn_code']}: {asn['asn_name']}")
|
|
801
|
+
```
|
|
802
|
+
|
|
803
|
+
### Error Handling
|
|
804
|
+
|
|
805
|
+
```python
|
|
806
|
+
from thordata import (
|
|
807
|
+
ThordataClient,
|
|
808
|
+
ThordataError,
|
|
809
|
+
ThordataAuthError,
|
|
810
|
+
ThordataRateLimitError,
|
|
811
|
+
ThordataNetworkError,
|
|
812
|
+
ThordataTimeoutError,
|
|
813
|
+
)
|
|
814
|
+
|
|
815
|
+
client = ThordataClient(scraper_token="your_token")
|
|
816
|
+
|
|
817
|
+
try:
|
|
818
|
+
result = client.serp_search("test query")
|
|
819
|
+
except ThordataAuthError as e:
|
|
820
|
+
print(f"Authentication failed: {e}")
|
|
821
|
+
print(f"Check your token. Status code: {e.status_code}")
|
|
822
|
+
except ThordataRateLimitError as e:
|
|
823
|
+
print(f"Rate limited: {e}")
|
|
824
|
+
if e.retry_after:
|
|
825
|
+
print(f"Retry after {e.retry_after} seconds")
|
|
826
|
+
except ThordataTimeoutError as e:
|
|
827
|
+
print(f"Request timed out: {e}")
|
|
828
|
+
except ThordataNetworkError as e:
|
|
829
|
+
print(f"Network error: {e}")
|
|
830
|
+
except ThordataError as e:
|
|
831
|
+
print(f"General error: {e}")
|
|
832
|
+
```
|
|
833
|
+
|
|
834
|
+
### Retry Configuration
|
|
835
|
+
|
|
836
|
+
Customize automatic retry behavior:
|
|
837
|
+
|
|
838
|
+
```python
|
|
839
|
+
from thordata import ThordataClient, RetryConfig
|
|
840
|
+
|
|
841
|
+
# Custom retry configuration
|
|
842
|
+
retry_config = RetryConfig(
|
|
843
|
+
max_retries=5, # Maximum retry attempts
|
|
844
|
+
backoff_factor=2.0, # Exponential backoff multiplier
|
|
845
|
+
max_backoff=120.0, # Maximum wait between retries
|
|
846
|
+
jitter=True, # Add randomness to prevent thundering herd
|
|
847
|
+
)
|
|
848
|
+
|
|
849
|
+
client = ThordataClient(
|
|
850
|
+
scraper_token="your_token",
|
|
851
|
+
retry_config=retry_config,
|
|
852
|
+
)
|
|
853
|
+
|
|
854
|
+
# Requests will automatically retry on transient failures
|
|
855
|
+
response = client.get("https://example.com")
|
|
856
|
+
```
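
To give a feel for how these settings interact, here is one common backoff scheme: exponential growth capped at `max_backoff`, with optional "full jitter". The SDK's exact formula is not documented in this README, so treat `backoff_delay` as a hypothetical illustration rather than a description of `RetryConfig` internals.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    backoff_factor: float = 2.0,
    max_backoff: float = 120.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    delay = min(max_backoff, backoff_factor ** attempt)
    if jitter:
        # Full jitter: pick a random delay between 0 and the capped value
        source = rng or random
        delay = source.uniform(0, delay)
    return delay


print([backoff_delay(n, jitter=False) for n in range(5)])  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

With jitter enabled, clients that fail at the same moment spread their retries out instead of hitting the server simultaneously.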
|
|
857
|
+
|
|
858
|
+
---
|
|
859
|
+
|
|
860
|
+
## 🔧 Configuration Reference
|
|
861
|
+
|
|
862
|
+
### ThordataClient Parameters
|
|
863
|
+
|
|
864
|
+
| Parameter | Type | Default | Description |
|
|
865
|
+
|-----------|------|---------|-------------|
|
|
866
|
+
| scraper_token | str | required | API token from Dashboard |
|
|
867
|
+
| public_token | str | None | Public API token (for tasks/locations) |
|
|
868
|
+
| public_key | str | None | Public API key |
|
|
869
|
+
| proxy_host | str | "pr.thordata.net" | Proxy gateway host |
|
|
870
|
+
| proxy_port | int | 9999 | Proxy gateway port |
|
|
871
|
+
| timeout | int | 30 | Default request timeout (seconds) |
|
|
872
|
+
| retry_config | RetryConfig | None | Retry configuration |
|
|
873
|
+
|
|
874
|
+
### ProxyConfig Parameters
|
|
875
|
+
|
|
876
|
+
| Parameter | Type | Default | Description |
|
|
877
|
+
|-----------|------|---------|-------------|
|
|
878
|
+
| username | str | required | Proxy username |
|
|
879
|
+
| password | str | required | Proxy password |
|
|
880
|
+
| product | ProxyProduct | RESIDENTIAL | Proxy type |
|
|
881
|
+
| country | str | None | ISO 3166-1 alpha-2 code |
|
|
882
|
+
| state | str | None | State name (lowercase) |
|
|
883
|
+
| city | str | None | City name (lowercase) |
|
|
884
|
+
| continent | str | None | Continent code (af/an/as/eu/na/oc/sa) |
|
|
885
|
+
| asn | str | None | ASN code (requires country) |
|
|
886
|
+
| session_id | str | None | Session ID for sticky sessions |
|
|
887
|
+
| session_duration | int | None | Session duration (1-90 minutes) |
|
|
888
|
+
|
|
889
|
+
### Proxy Products & Ports
|
|
890
|
+
|
|
891
|
+
| Product | Port | Description |
|
|
892
|
+
|---------|------|-------------|
|
|
893
|
+
| RESIDENTIAL | 9999 | Rotating residential IPs |
|
|
894
|
+
| MOBILE | 5555 | Mobile carrier IPs |
|
|
895
|
+
| DATACENTER | 7777 | Datacenter IPs |
|
|
896
|
+
| ISP | 6666 | Static ISP IPs |
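
The port table can be expressed as a simple lookup when building a proxy URL by hand. This is a sketch under assumptions: `build_proxy_url` is a hypothetical helper, the default host `pr.thordata.net` is the one listed for `proxy_host` above (other products may use a different host), and real usernames may need a product-specific format that is not shown here.

```python
from urllib.parse import quote

# Ports taken from the table above
PROXY_PORTS = {
    "RESIDENTIAL": 9999,
    "MOBILE": 5555,
    "DATACENTER": 7777,
    "ISP": 6666,
}


def build_proxy_url(
    username: str,
    password: str,
    product: str = "RESIDENTIAL",
    host: str = "pr.thordata.net",  # assumed default; may differ per product
) -> str:
    """Return an http:// proxy URL with percent-encoded credentials."""
    port = PROXY_PORTS[product.upper()]
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


print(build_proxy_url("user", "p@ss", product="mobile"))
# http://user:p%40ss@pr.thordata.net:5555
```

Percent-encoding the credentials keeps characters such as `@` or `:` in a password from breaking the URL.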
|
|
897
|
+
|
|
898
|
+
---
|
|
899
|
+
|
|
900
|
+
## 📁 Project Structure
|
|
901
|
+
|
|
902
|
+
```
|
|
903
|
+
thordata-python-sdk/
|
|
904
|
+
├── src/thordata/
|
|
905
|
+
│ ├── __init__.py # Public API exports
|
|
906
|
+
│ ├── client.py # Sync client
|
|
907
|
+
│ ├── async_client.py # Async client
|
|
908
|
+
│ ├── models.py # Data models (ProxyConfig, SerpRequest, etc.)
|
|
909
|
+
│ ├── enums.py # Enumerations
|
|
910
|
+
│ ├── exceptions.py # Exception hierarchy
|
|
911
|
+
│ ├── retry.py # Retry mechanism
|
|
912
|
+
│ ├── parameters.py # Parameter definitions
|
|
913
|
+
│ ├── demo.py # Demo functionality
|
|
914
|
+
│ └── _utils.py # Internal utilities
|
|
915
|
+
├── tests/
|
|
916
|
+
│ ├── __init__.py # Test initialization
|
|
917
|
+
│ ├── conftest.py # Pytest configuration
|
|
918
|
+
│ ├── test_client.py # Client tests
|
|
919
|
+
│ ├── test_async_client.py # Async client tests
|
|
920
|
+
│ ├── test_client_errors.py # Client error tests
|
|
921
|
+
│ ├── test_async_client_errors.py # Async client error tests
|
|
922
|
+
│ ├── test_enums.py # Enums tests
|
|
923
|
+
│ ├── test_models.py # Models tests
|
|
924
|
+
│ ├── test_exceptions.py # Exceptions tests
|
|
925
|
+
│ ├── test_demo_entrypoint.py # Demo entrypoint tests
|
|
926
|
+
│ ├── test_task_status_and_wait.py # Task status tests
|
|
927
|
+
│ ├── test_user_agent.py # User agent tests
|
|
928
|
+
│ ├── test_examples_demo_serp_api.py # SERP API examples tests
|
|
929
|
+
│ ├── test_examples_demo_universal.py # Universal API examples tests
|
|
930
|
+
│ ├── test_examples_demo_web_scraper_api.py # Web scraper examples tests
|
|
931
|
+
│ └── test_examples_async_high_concurrency.py # Async high concurrency tests
|
|
932
|
+
├── examples/
|
|
933
|
+
│ ├── demo_serp_api.py # SERP API demo
|
|
934
|
+
│ ├── demo_serp_google_news.py # Google News demo
|
|
935
|
+
│ ├── demo_universal.py # Universal API demo
|
|
936
|
+
│ ├── demo_web_scraper_api.py # Web scraper demo
|
|
937
|
+
│ ├── demo_scraping_browser.py # Scraping browser demo
|
|
938
|
+
│ ├── async_high_concurrency.py # Async high concurrency demo
|
|
939
|
+
│ ├── proxy_residential.py # Residential proxy example
|
|
940
|
+
│ ├── proxy_datacenter.py # Datacenter proxy example
|
|
941
|
+
│ ├── proxy_mobile.py # Mobile proxy example
|
|
942
|
+
│ ├── proxy_isp.py # Static ISP proxy example
|
|
943
|
+
│ └── .env.example # Environment variables template
|
|
944
|
+
├── docs/
|
|
945
|
+
│ ├── serp_reference.md # SERP API reference
|
|
946
|
+
│ ├── serp_reference_legacy.md # Legacy SERP reference
|
|
947
|
+
│ └── universal_reference.md # Universal API reference
|
|
948
|
+
├── .github/
|
|
949
|
+
│ ├── dependabot.yml # Dependabot configuration
|
|
950
|
+
│ ├── pull_request_template.md # PR template
|
|
951
|
+
│ ├── ISSUE_TEMPLATE/
|
|
952
|
+
│ │ ├── bug_report.md # Bug report template
|
|
953
|
+
│ │ └── feature_request.md # Feature request template
|
|
954
|
+
│ └── workflows/
|
|
955
|
+
│ ├── ci.yml # Continuous integration
|
|
956
|
+
│ └── pypi-publish.yml # PyPI publishing workflow
|
|
957
|
+
├── .env.example # Environment variables template
|
|
958
|
+
├── .prettierrc # Prettier configuration
|
|
959
|
+
├── .prettierignore # Prettier ignore patterns
|
|
960
|
+
├── eslint.config.cjs # ESLint configuration
|
|
961
|
+
├── LICENSE # License file
|
|
962
|
+
├── package.json # Package configuration
|
|
963
|
+
├── py.typed # Type hints marker
|
|
964
|
+
├── pyproject.toml # Python package configuration
|
|
965
|
+
├── pytest.ini # Pytest configuration
|
|
966
|
+
├── requirements.txt # Python dependencies
|
|
967
|
+
├── tsconfig.json # TypeScript configuration
|
|
968
|
+
├── tsconfig.build.json # TypeScript build configuration
|
|
969
|
+
├── vitest.config.ts # Vitest testing configuration
|
|
970
|
+
└── README.md # This file
|
|
971
|
+
```
|
|
972
|
+
|
|
973
|
+
---
|
|
974
|
+
|
|
975
|
+
## 🧪 Development
|
|
976
|
+
|
|
977
|
+
### Setup
|
|
978
|
+
|
|
979
|
+
```bash
|
|
980
|
+
# Clone the repository
|
|
981
|
+
git clone https://github.com/Thordata/thordata-python-sdk.git
|
|
982
|
+
cd thordata-python-sdk
|
|
983
|
+
|
|
984
|
+
# Create virtual environment
|
|
985
|
+
python -m venv venv
|
|
986
|
+
source venv/bin/activate # On Windows: venv\Scripts\activate
|
|
987
|
+
|
|
988
|
+
# Install with dev dependencies
|
|
989
|
+
pip install -e ".[dev]"
|
|
990
|
+
```
|
|
991
|
+
|
|
992
|
+
### Run Tests
|
|
993
|
+
|
|
994
|
+
```bash
|
|
995
|
+
# Run all tests
|
|
996
|
+
pytest
|
|
997
|
+
|
|
998
|
+
# Run with coverage
|
|
999
|
+
pytest --cov=thordata --cov-report=html
|
|
1000
|
+
|
|
1001
|
+
# Run specific test file
|
|
1002
|
+
pytest tests/test_client.py -v
|
|
1003
|
+
```
|
|
1004
|
+
|
|
1005
|
+
### Code Quality
|
|
1006
|
+
|
|
1007
|
+
```bash
|
|
1008
|
+
# Format code
|
|
1009
|
+
black src tests
|
|
1010
|
+
|
|
1011
|
+
# Lint
|
|
1012
|
+
ruff check src tests
|
|
1013
|
+
|
|
1014
|
+
# Type check
|
|
1015
|
+
mypy src
|
|
1016
|
+
```
|
|
1017
|
+
|
|
1018
|
+
---
|
|
1019
|
+
|
|
1020
|
+
## 📝 Changelog
|
|
1021
|
+
|
|
1022
|
+
See [CHANGELOG.md](CHANGELOG.md) for version history.
|
|
1023
|
+
|
|
1024
|
+
---
|
|
1025
|
+
|
|
1026
|
+
## 🤝 Contributing
|
|
1027
|
+
|
|
1028
|
+
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
|
|
1029
|
+
|
|
1030
|
+
1. Fork the repository
|
|
1031
|
+
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
|
|
1032
|
+
3. Commit your changes (`git commit -m 'Add amazing feature'`)
|
|
1033
|
+
4. Push to the branch (`git push origin feature/amazing-feature`)
|
|
1034
|
+
5. Open a Pull Request
|
|
1035
|
+
|
|
1036
|
+
---
|
|
1037
|
+
|
|
1038
|
+
## 📄 License
|
|
1039
|
+
|
|
1040
|
+
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
|
|
1041
|
+
|
|
1042
|
+
---
|
|
1043
|
+
|
|
1044
|
+
## 🆘 Support
|
|
1045
|
+
|
|
1046
|
+
- 📧 **Email**: support@thordata.com
|
|
1047
|
+
- 📚 **Documentation**: [doc.thordata.com](https://doc.thordata.com)
|
|
1048
|
+
- 🐛 **Issues**: [GitHub Issues](https://github.com/Thordata/thordata-python-sdk/issues)
|
|
1049
|
+
- 💬 **Dashboard**: [thordata.com](https://www.thordata.com)
|
|
1050
|
+
|
|
1051
|
+
<div align="center">
|
|
1052
|
+
<sub>Built with ❤️ by the Thordata Team</sub>
|
|
1053
|
+
</div>
|