thordata-sdk 0.3.1__py3-none-any.whl → 0.5.0__py3-none-any.whl

@@ -0,0 +1,896 @@
1
+ Metadata-Version: 2.4
2
+ Name: thordata-sdk
3
+ Version: 0.5.0
4
+ Summary: The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.
5
+ Author-email: Thordata Developer Team <support@thordata.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://www.thordata.com
8
+ Project-URL: Documentation, https://github.com/Thordata/thordata-python-sdk#readme
9
+ Project-URL: Source, https://github.com/Thordata/thordata-python-sdk
10
+ Project-URL: Tracker, https://github.com/Thordata/thordata-python-sdk/issues
11
+ Project-URL: Changelog, https://github.com/Thordata/thordata-python-sdk/blob/main/CHANGELOG.md
12
+ Keywords: web scraping,proxy,residential proxy,datacenter proxy,ai,llm,data-mining,serp,thordata,web scraper,anti-bot bypass
13
+ Classifier: Development Status :: 4 - Beta
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
16
+ Classifier: Topic :: Internet :: WWW/HTTP
17
+ Classifier: Topic :: Internet :: Proxy Servers
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3.9
20
+ Classifier: Programming Language :: Python :: 3.10
21
+ Classifier: Programming Language :: Python :: 3.11
22
+ Classifier: Programming Language :: Python :: 3.12
23
+ Classifier: License :: OSI Approved :: MIT License
24
+ Classifier: Operating System :: OS Independent
25
+ Classifier: Typing :: Typed
26
+ Requires-Python: >=3.9
27
+ Description-Content-Type: text/markdown
28
+ License-File: LICENSE
29
+ Requires-Dist: requests>=2.25.0
30
+ Requires-Dist: aiohttp>=3.9.0
31
+ Provides-Extra: dev
32
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
33
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
34
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
35
+ Requires-Dist: pytest-httpserver>=1.0.0; extra == "dev"
36
+ Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
37
+ Requires-Dist: black>=23.0.0; extra == "dev"
38
+ Requires-Dist: ruff>=0.1.0; extra == "dev"
39
+ Requires-Dist: mypy>=1.0.0; extra == "dev"
40
+ Requires-Dist: types-requests>=2.28.0; extra == "dev"
41
+ Dynamic: license-file
42
+
43
+ # Thordata Python SDK
44
+
45
+ <div align="center">
46
+
47
+ **Official Python client for Thordata's Proxy Network, SERP API, Web Unlocker, and Web Scraper API.**
48
+
49
+ *Async-ready, type-safe, built for AI agents and large-scale data collection.*
50
+
51
+ [![CI](https://github.com/Thordata/thordata-python-sdk/actions/workflows/ci.yml/badge.svg)](https://github.com/Thordata/thordata-python-sdk/actions/workflows/ci.yml)
52
+ [![PyPI version](https://img.shields.io/pypi/v/thordata-sdk?color=blue)](https://pypi.org/project/thordata-sdk/)
53
+ [![Python](https://img.shields.io/badge/python-3.9+-blue)](https://python.org)
54
+ [![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
55
+ [![Typed](https://img.shields.io/badge/typing-typed-purple)](https://github.com/Thordata/thordata-python-sdk)
56
+
57
+ [Documentation](https://doc.thordata.com) • [Dashboard](https://www.thordata.com) • [Examples](examples/) • [Changelog](CHANGELOG.md)
58
+
59
+ </div>
60
+
61
+ ---
62
+
63
+ ## ✨ Features
64
+
65
+ | Feature | Description |
66
+ |---------|-------------|
67
+ | 🌐 **Proxy Network** | Residential, Mobile, Datacenter, ISP proxies with geo-targeting |
68
+ | 🔍 **SERP API** | Google, Bing, Yandex, DuckDuckGo, Baidu search results |
69
+ | 🔓 **Web Unlocker** | Bypass Cloudflare, CAPTCHAs, anti-bot systems automatically |
70
+ | 🕷️ **Web Scraper** | Async task-based scraping for complex sites |
71
+ | ⚡ **Async Support** | Full async/await support with aiohttp |
72
+ | 🔄 **Auto Retry** | Configurable retry with exponential backoff |
73
+ | 📝 **Type Safe** | Full type annotations for IDE autocomplete |
74
+
75
+ ---
76
+
77
+ ## 📦 Installation
78
+
79
+ ```bash
80
+ pip install thordata-sdk
81
+ ```
82
+
83
+ For development:
84
+
85
+ ```bash
86
+ pip install thordata-sdk[dev]
87
+ ```
88
+
89
+ ---
90
+
91
+ ## 🚀 Quick Start
92
+
93
+ ### Get Your Credentials
94
+
95
+ 1. Sign up at [thordata.com](https://www.thordata.com)
96
+ 2. Navigate to your Dashboard
97
+ 3. Copy your Scraper Token, Public Token, and Public Key
98
+
99
+ ### Basic Usage
100
+
101
+ ```python
102
+ from thordata import ThordataClient
103
+
104
+ # Initialize the client
105
+ client = ThordataClient(
106
+ scraper_token="your_scraper_token",
107
+ public_token="your_public_token", # Optional, for task APIs
108
+ public_key="your_public_key" # Optional, for task APIs
109
+ )
110
+
111
+ # Make a request through the proxy network
112
+ response = client.get("https://httpbin.org/ip")
113
+ print(response.json())
114
+ # {'origin': '123.45.67.89'} # Residential IP
115
+ ```
116
+
117
+ ### Environment Variables
118
+
119
+ Create a `.env` file:
120
+
121
+ ```env
122
+ THORDATA_SCRAPER_TOKEN=your_scraper_token
123
+ THORDATA_PUBLIC_TOKEN=your_public_token
124
+ THORDATA_PUBLIC_KEY=your_public_key
125
+ ```
126
+
127
+ Then use with python-dotenv:
128
+
129
+ ```python
130
+ import os
131
+ from dotenv import load_dotenv
132
+ from thordata import ThordataClient
133
+
134
+ load_dotenv()
135
+
136
+ client = ThordataClient(
137
+ scraper_token=os.getenv("THORDATA_SCRAPER_TOKEN"),
138
+ public_token=os.getenv("THORDATA_PUBLIC_TOKEN"),
139
+ public_key=os.getenv("THORDATA_PUBLIC_KEY"),
140
+ )
141
+ ```
142
+
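If a variable is missing, `os.getenv` returns `None`, and the problem would likely only surface later as an authentication failure. Below is a minimal sketch of failing fast instead. It assumes only the scraper token is mandatory, as the "Optional" comments in Basic Usage suggest; `load_credentials` is a hypothetical helper, not part of the SDK.

```python
import os
from typing import Mapping, Optional


def load_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect Thordata credentials from an environment mapping.

    Hypothetical helper (not part of the SDK). Only the scraper token is
    treated as required; the two public credentials may be absent.
    """
    env = os.environ if env is None else env
    token = env.get("THORDATA_SCRAPER_TOKEN")
    if not token:
        # Fail at startup rather than on the first API call.
        raise RuntimeError("THORDATA_SCRAPER_TOKEN is not set")
    return {
        "scraper_token": token,
        "public_token": env.get("THORDATA_PUBLIC_TOKEN"),
        "public_key": env.get("THORDATA_PUBLIC_KEY"),
    }
```

The returned keys match the keyword arguments shown above, so the result could be passed on as `ThordataClient(**load_credentials())`.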
143
+ ---
144
+
145
+ ## 📖 Usage Guide
146
+
147
+ ### 1. Proxy Network
148
+
149
+ #### Basic Proxy Request
150
+
151
+ ```python
152
+ from thordata import ThordataClient
153
+
154
+ client = ThordataClient(scraper_token="your_token")
155
+
156
+ # GET request through proxy
157
+ response = client.get("https://example.com")
158
+ print(response.text)
159
+
160
+ # POST request through proxy
161
+ response = client.post("https://httpbin.org/post", json={"key": "value"})
162
+ print(response.json())
163
+ ```
164
+
165
+ #### Geo-Targeting
166
+
167
+ ```python
168
+ from thordata import ThordataClient, ProxyConfig
169
+
170
+ client = ThordataClient(scraper_token="your_token")
171
+
172
+ # Create a proxy config with geo-targeting
173
+ config = ProxyConfig(
174
+ username="your_username",
175
+ password="your_password",
176
+ country="us", # Target country
177
+ state="california", # Target state
178
+ city="los_angeles", # Target city
179
+ )
180
+
181
+ response = client.get("https://httpbin.org/ip", proxy_config=config)
182
+ print(response.json())
183
+ ```
184
+
185
+ #### Sticky Sessions
186
+
187
+ Keep the same IP for multiple requests:
188
+
189
+ ```python
190
+ from thordata import ThordataClient, StickySession
191
+
192
+ client = ThordataClient(scraper_token="your_token")
193
+
194
+ # Create a sticky session (same IP for 10 minutes)
195
+ session = StickySession(
196
+ username="your_username",
197
+ password="your_password",
198
+ country="gb",
199
+ duration_minutes=10,
200
+ )
201
+
202
+ # All requests use the same IP
203
+ for i in range(5):
204
+     response = client.get("https://httpbin.org/ip", proxy_config=session)
205
+     print(f"Request {i+1}: {response.json()['origin']}")
206
+ ```
207
+
208
+ #### Different Proxy Products
209
+
210
+ ```python
211
+ from thordata import ProxyConfig, ProxyProduct
212
+
213
+ # Residential proxy (default, port 9999)
214
+ residential = ProxyConfig(
215
+ username="user", password="pass",
216
+ product=ProxyProduct.RESIDENTIAL
217
+ )
218
+
219
+ # Mobile proxy (port 5555)
220
+ mobile = ProxyConfig(
221
+ username="user", password="pass",
222
+ product=ProxyProduct.MOBILE
223
+ )
224
+
225
+ # Datacenter proxy (port 7777)
226
+ datacenter = ProxyConfig(
227
+ username="user", password="pass",
228
+ product=ProxyProduct.DATACENTER
229
+ )
230
+ ```
231
+
232
+ ### 2. SERP API (Search Engine Results)
233
+
234
+ #### Basic Search
235
+
236
+ ```python
237
+ from thordata import ThordataClient, Engine
238
+
239
+ client = ThordataClient(scraper_token="your_token")
240
+
241
+ # Google search
242
+ results = client.serp_search(
243
+ query="python programming",
244
+ engine=Engine.GOOGLE,
245
+ num=10
246
+ )
247
+
248
+ # Print organic results
249
+ for result in results.get("organic", []):
250
+     print(f"{result['title']}: {result['link']}")
251
+ ```
252
+
253
+ #### General Calling Method
254
+
255
+ ```python
256
+ from thordata import ThordataClient, Engine
257
+
258
+ client = ThordataClient(scraper_token="YOUR_SCRAPER_TOKEN")
259
+
260
+ results = client.serp_search(
261
+ query="pizza",
262
+ engine=Engine.GOOGLE, # or "google"
263
+ num=10,
264
+ country="us",
265
+ language="en",
266
+ search_type="news", # corresponds to tbm=nws
267
+ # Other parameters are passed in via kwargs
268
+ ibp="some_ibp_value",
269
+ lsig="some_lsig_value",
270
+ )
271
+ ```
272
+
273
+ **Note**: All parameters above will be assembled into Thordata SERP API request parameters.
274
+
275
+ #### Advanced Search Options
276
+
277
+ ```python
278
+ from thordata import ThordataClient, SerpRequest
279
+
280
+ client = ThordataClient(scraper_token="your_token")
281
+
282
+ # Create a detailed search request
283
+ request = SerpRequest(
284
+ query="best laptops 2024",
285
+ engine="google",
286
+ num=20,
287
+ country="us",
288
+ language="en",
289
+ search_type="shopping", # shopping, news, images, videos
290
+ time_filter="month", # hour, day, week, month, year
291
+ safe_search=True,
292
+ device="mobile", # desktop, mobile, tablet
293
+ )
294
+
295
+ results = client.serp_search_advanced(request)
296
+ ```
297
+
298
+ #### Multiple Search Engines
299
+
300
+ ```python
301
+ from thordata import ThordataClient, Engine
302
+
303
+ client = ThordataClient(scraper_token="your_token")
304
+
305
+ # Google
306
+ google_results = client.serp_search("AI news", engine=Engine.GOOGLE)
307
+
308
+ # Bing
309
+ bing_results = client.serp_search("AI news", engine=Engine.BING)
310
+
311
+ # Yandex (Russian search engine)
312
+ yandex_results = client.serp_search("AI news", engine=Engine.YANDEX)
313
+
314
+ # DuckDuckGo
315
+ ddg_results = client.serp_search("AI news", engine=Engine.DUCKDUCKGO)
316
+ ```
317
+
318
+ ---
319
+
320
+ ## 🔧 SERP API Parameter Mapping
321
+
322
+ Thordata's SERP API supports multiple search engines and sub-features (Google Search/Shopping/News, etc.).
323
+ This SDK wraps common parameters through `ThordataClient.serp_search` and `SerpRequest`, while other parameters can be passed directly through `**kwargs`.
324
+
325
+ ### Google Search Parameter Mapping
326
+
327
+ | Document Parameter | SDK Field/Usage | Description |
328
+ |-------------------|-----------------|-------------|
329
+ | q | query | Search keyword |
330
+ | engine | engine | Engine.GOOGLE / "google" |
331
+ | google_domain | google_domain | e.g., "google.co.uk" |
332
+ | gl | country | Country/region, e.g., "us" |
333
+ | hl | language | Language, e.g., "en", "zh-CN" |
334
+ | cr | countries_filter | Multi-country filter, e.g., "countryFR" |
335
+ | lr | languages_filter | Multi-language filter, e.g., "lang_en" |
336
+ | location | location | Exact location, e.g., "India" |
337
+ | uule | uule | Base64 encoded location string |
338
+ | tbm | search_type | "images"→tbm=isch, "shopping"→tbm=shop, "news"→tbm=nws, "videos"→tbm=vid, other values passed through as-is |
339
+ | start | start | Result offset for pagination |
340
+ | num | num | Number of results per page |
341
+ | ludocid | ludocid | Google Place ID |
342
+ | kgmid | kgmid | Google Knowledge Graph ID |
343
+ | ibp | ibp="..." (kwargs) | Passed through `**kwargs` |
344
+ | lsig | lsig="..." (kwargs) | Same as above |
345
+ | si | si="..." (kwargs) | Same as above |
346
+ | uds | uds="ADV" (kwargs) | Same as above |
347
+ | tbs | time_filter or tbs="..." | time_filter="week" generates tbs=qdr:w, can also pass complete tbs directly |
348
+ | safe | safe_search | True → safe=active, False → safe=off |
349
+ | nfpr | no_autocorrect | True → nfpr=1 |
350
+ | filter | filter_duplicates | True → filter=1, False → filter=0 |
351
+
352
+ **Example: Google Search Basic Usage**
353
+
354
+ ```python
355
+ results = client.serp_search(
356
+ query="python web scraping best practices",
357
+ engine=Engine.GOOGLE,
358
+ country="us",
359
+ language="en",
360
+ num=10,
361
+ time_filter="week", # Last week
362
+ safe_search=True, # Adult content filter
363
+ )
364
+ ```
365
+
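The mapping table above can be read as a small translation step from SDK fields to API parameters. The sketch below illustrates that reading and is not the SDK's implementation: `build_google_params` is a hypothetical function, and only the `week` time filter code is taken from the table (the other `qdr` codes are Google's usual values and are an assumption here).

```python
from typing import Any, Dict

# search_type -> tbm values as listed in the table; other values pass through.
SEARCH_TYPE_TO_TBM = {
    "images": "isch",
    "shopping": "shop",
    "news": "nws",
    "videos": "vid",
}

# "week" -> "qdr:w" comes from the table; the remaining codes are assumed.
TIME_FILTER_TO_TBS = {
    "hour": "qdr:h",
    "day": "qdr:d",
    "week": "qdr:w",
    "month": "qdr:m",
    "year": "qdr:y",
}


def build_google_params(
    query: str,
    *,
    country: str = None,
    language: str = None,
    search_type: str = None,
    time_filter: str = None,
    safe_search: bool = None,
    no_autocorrect: bool = None,
    filter_duplicates: bool = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Translate SDK-style fields into API parameter names (illustrative)."""
    params: Dict[str, Any] = {"q": query, "engine": "google"}
    if country:
        params["gl"] = country
    if language:
        params["hl"] = language
    if search_type:
        params["tbm"] = SEARCH_TYPE_TO_TBM.get(search_type, search_type)
    if time_filter:
        params["tbs"] = TIME_FILTER_TO_TBS[time_filter]
    if safe_search is not None:
        params["safe"] = "active" if safe_search else "off"
    if no_autocorrect:
        params["nfpr"] = 1
    if filter_duplicates is not None:
        params["filter"] = 1 if filter_duplicates else 0
    # Anything else (ibp, lsig, a complete tbs, ...) is passed through as-is.
    params.update(extra)
    return params
```

Because the extra keyword arguments are applied last, a complete `tbs` value passed directly takes precedence over `time_filter` in this sketch.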
366
+ ### Google Shopping Parameter Mapping
367
+
368
+ Shopping searches still use `engine="google"`; set `search_type="shopping"` to select Shopping mode:
369
+
370
+ ```python
371
+ results = client.serp_search(
372
+ query="iPhone 15",
373
+ engine=Engine.GOOGLE,
374
+ search_type="shopping", # tbm=shop
375
+ country="us",
376
+ language="en",
377
+ num=20,
378
+ min_price=500, # Parameters below passed through kwargs
379
+ max_price=1500,
380
+ sort_by=1, # 1=price low to high, 2=high to low
381
+ free_shipping=True,
382
+ on_sale=True,
383
+ small_business=True,
384
+ direct_link=True,
385
+ shoprs="FILTER_ID_HERE",
386
+ )
387
+ shopping_items = results.get("shopping_results", [])
388
+ ```
389
+
390
+ | Document Parameter | SDK Field/Usage | Description |
391
+ |-------------------|-----------------|-------------|
392
+ | q | query | Search keyword |
393
+ | google_domain | google_domain | Same as above |
394
+ | gl | country | Same as above |
395
+ | hl | language | Same as above |
396
+ | location | location | Same as above |
397
+ | uule | uule | Same as above |
398
+ | start | start | Offset |
399
+ | num | num | Quantity |
400
+ | tbs | time_filter or tbs="..." | Same as above |
401
+ | shoprs | shoprs="..." (kwargs) | Filter ID |
402
+ | min_price | min_price=... (kwargs) | Minimum price |
403
+ | max_price | max_price=... (kwargs) | Maximum price |
404
+ | sort_by | sort_by=1/2 (kwargs) | Sort order |
405
+ | free_shipping | free_shipping=True/False (kwargs) | Free shipping |
406
+ | on_sale | on_sale=True/False (kwargs) | On sale |
407
+ | small_business | small_business=True/False (kwargs) | Small business |
408
+ | direct_link | direct_link=True/False (kwargs) | Include direct links |
409
+
410
+ ### Google Local Parameter Mapping
411
+
412
+ Google Local covers location-based searches.
412
+ In the SDK, set `search_type="local"` to select Local mode (`tbm` is passed through as "local") and combine it with `location` and `uule`.
414
+
415
+ ```python
416
+ results = client.serp_search(
417
+ query="pizza near me",
418
+ engine=Engine.GOOGLE,
419
+ search_type="local",
420
+ google_domain="google.com",
421
+ country="us",
422
+ language="en",
423
+ location="San Francisco",
424
+ uule="w+CAIQICIFU2FuIEZyYW5jaXNjbw", # Example value
425
+ start=0, # Local only accepts 0, 20, 40...
426
+ )
427
+ local_results = results.get("local_results", results.get("organic", []))
428
+ ```
429
+
430
+ | Document Parameter | SDK Field/Usage | Description |
431
+ |-------------------|-----------------|-------------|
432
+ | q | query | Search term |
433
+ | google_domain | google_domain | Domain |
434
+ | gl | country | Country |
435
+ | hl | language | Language |
436
+ | location | location | Local location |
437
+ | uule | uule | Encoded location |
438
+ | start | start | Offset (must be 0,20,40...) |
439
+ | ludocid | ludocid | Place ID (commonly used in Local results) |
440
+ | tbs | time_filter or tbs="..." | Advanced filtering |
441
+
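Since Local results only accept offsets in steps of 20, paging amounts to generating multiples of 20 for `start`. A minimal sketch, with `local_offsets` as a hypothetical helper name:

```python
from typing import List


def local_offsets(pages: int, page_size: int = 20) -> List[int]:
    """Return the `start` offsets for the first `pages` result pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return [page * page_size for page in range(pages)]


print(local_offsets(3))  # [0, 20, 40]
```

Each value could then be passed as `start=` in a call like the one above.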
442
+ ### Google Videos Parameter Mapping
443
+
444
+ ```python
445
+ results = client.serp_search(
446
+ query="python async tutorial",
447
+ engine=Engine.GOOGLE,
448
+ search_type="videos", # tbm=vid
449
+ country="us",
450
+ language="en",
451
+ languages_filter="lang_en|lang_fr",
452
+ location="United States",
453
+ uule="ENCODED_LOCATION_HERE",
454
+ num=10,
455
+ time_filter="month",
456
+ safe_search=True,
457
+ filter_duplicates=True,
458
+ )
459
+ video_results = results.get("video_results", results.get("organic", []))
460
+ ```
461
+
462
+ | Document Parameter | SDK Field/Usage | Description |
463
+ |-------------------|-----------------|-------------|
464
+ | q | query | Search term |
465
+ | google_domain | google_domain | Domain |
466
+ | gl | country | Country |
467
+ | hl | language | Language |
468
+ | lr | languages_filter | Multi-language filter |
469
+ | location | location | Geographic location |
470
+ | uule | uule | Encoded location |
471
+ | start | start | Offset |
472
+ | num | num | Quantity |
473
+ | tbs | time_filter or tbs="..." | Time and advanced filtering |
474
+ | safe | safe_search | Adult content filter |
475
+ | nfpr | no_autocorrect | Disable auto-correction |
476
+ | filter | filter_duplicates | Remove duplicates |
477
+
478
+ ### Google News Parameter Mapping
479
+
480
+ Google News accepts a set of News-specific token parameters for selecting topics, publications, sections, and stories.
481
+
482
+ ```python
483
+ results = client.serp_search(
484
+ query="AI regulation",
485
+ engine=Engine.GOOGLE,
486
+ search_type="news", # tbm=nws
487
+ country="us",
488
+ language="en",
489
+ topic_token="YOUR_TOPIC_TOKEN", # Optional
490
+ publication_token="YOUR_PUBLICATION_TOKEN", # Optional
491
+ section_token="YOUR_SECTION_TOKEN", # Optional
492
+ story_token="YOUR_STORY_TOKEN", # Optional
493
+ so=1, # 0=relevance, 1=time
494
+ )
495
+ news_results = results.get("news_results", results.get("organic", []))
496
+ ```
497
+
498
+ | Document Parameter | SDK Field/Usage | Description |
499
+ |-------------------|-----------------|-------------|
500
+ | q | query | Search term |
501
+ | gl | country | Country |
502
+ | hl | language | Language |
503
+ | topic_token | topic_token="..." (kwargs) | Topic token |
504
+ | publication_token | publication_token="..." (kwargs) | Publication token |
505
+ | section_token | section_token="..." (kwargs) | Section token |
506
+ | story_token | story_token="..." (kwargs) | Story token |
507
+ | so | so=0/1 (kwargs) | Sort: 0=relevance, 1=time |
508
+
509
+ ---
510
+
511
+ 👉 For more SERP modes and parameter mappings, see docs/serp_reference.md.
512
+
513
+ ## 🔓 Web Unlocker (Universal Scraping API)
514
+
515
+ Automatically bypass anti-bot protections:
516
+
517
+ #### Basic Usage
518
+
519
+ ```python
520
+ from thordata import ThordataClient
521
+
522
+ client = ThordataClient(scraper_token="your_token")
523
+
524
+ # Get HTML content
525
+ html = client.universal_scrape(
526
+ url="https://example.com",
527
+ js_render=True, # Enable JavaScript rendering
528
+ )
529
+ print(html[:500])
530
+ ```
531
+
532
+ #### Advanced Options
533
+
534
+ ```python
535
+ from thordata import ThordataClient, UniversalScrapeRequest
536
+
537
+ client = ThordataClient(scraper_token="your_token")
538
+
539
+ request = UniversalScrapeRequest(
540
+ url="https://example.com",
541
+ js_render=True,
542
+ output_format="html",
543
+ country="us",
544
+ block_resources="image,font", # Speed up by blocking resources
545
+ clean_content="js,css", # Remove JS/CSS from output
546
+ wait=5000, # Wait 5 seconds after load
547
+ wait_for=".content-loaded", # Wait for CSS selector
548
+ headers=[
549
+ {"name": "Accept-Language", "value": "en-US"}
550
+ ],
551
+ cookies=[
552
+ {"name": "session", "value": "abc123"}
553
+ ],
554
+ )
555
+
556
+ html = client.universal_scrape_advanced(request)
557
+ ```
558
+
559
+ #### Take Screenshots
560
+
561
+ ```python
562
+ from thordata import ThordataClient
563
+
564
+ client = ThordataClient(scraper_token="your_token")
565
+
566
+ # Get PNG screenshot
567
+ png_bytes = client.universal_scrape(
568
+ url="https://example.com",
569
+ js_render=True,
570
+ output_format="png",
571
+ )
572
+
573
+ # Save to file
574
+ with open("screenshot.png", "wb") as f:
575
+     f.write(png_bytes)
576
+ ```
577
+
578
+ ### Web Scraper API (Async Tasks)
579
+
580
+ For complex scraping jobs that run asynchronously:
581
+
582
+ ```python
583
+ from thordata import ThordataClient
584
+
585
+ client = ThordataClient(
586
+ scraper_token="your_token",
587
+ public_token="your_public_token",
588
+ public_key="your_public_key",
589
+ )
590
+
591
+ # Create a scraping task
592
+ task_id = client.create_scraper_task(
593
+ file_name="youtube_channel_data",
594
+ spider_id="youtube_video-post_by-url", # From Dashboard
595
+ spider_name="youtube.com",
596
+ parameters={
597
+ "url": "https://www.youtube.com/@PewDiePie/videos",
598
+ "num_of_posts": "50"
599
+ }
600
+ )
601
+ print(f"Task created: {task_id}")
602
+
603
+ # Wait for completion (with timeout)
604
+ status = client.wait_for_task(task_id, max_wait=300)
605
+ print(f"Task status: {status}")
606
+
607
+ # Get results
608
+ if status in ("ready", "success"):
609
+     download_url = client.get_task_result(task_id)
610
+     print(f"Download: {download_url}")
611
+ ```
612
+
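The README does not show how `wait_for_task` works internally. The sketch below shows the general poll-until-done pattern such a call typically follows, under stated assumptions: `poll_until_done` and its `interval` parameter are hypothetical, and "failed" as a terminal status is a guess ("ready" and "success" come from the example above).

```python
import time
from typing import Callable, Tuple


def poll_until_done(
    get_status: Callable[[], str],
    max_wait: float = 300.0,
    interval: float = 5.0,
    done: Tuple[str, ...] = ("ready", "success", "failed"),
) -> str:
    """Call get_status() until it reports a terminal state or time runs out."""
    deadline = time.monotonic() + max_wait
    while True:
        status = get_status()
        if status in done:
            return status
        # Stop if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"task still '{status}' after {max_wait} seconds")
        time.sleep(interval)
```

Using `time.monotonic()` rather than wall-clock time keeps the timeout stable if the system clock is adjusted while waiting.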
613
+ ### Async Client (High Concurrency)
614
+
615
+ For maximum performance with concurrent requests:
616
+
617
+ ```python
618
+ import asyncio
619
+ from thordata import AsyncThordataClient
620
+
621
+ async def main():
622
+     async with AsyncThordataClient(
623
+         scraper_token="your_token",
624
+         public_token="your_public_token",
625
+         public_key="your_public_key",
626
+     ) as client:
627
+
628
+         # Concurrent proxy requests
629
+         urls = [
630
+             "https://httpbin.org/ip",
631
+             "https://httpbin.org/headers",
632
+             "https://httpbin.org/user-agent",
633
+         ]
634
+
635
+         tasks = [client.get(url) for url in urls]
636
+         responses = await asyncio.gather(*tasks)
637
+
638
+         for resp in responses:
639
+             print(await resp.json())
640
+
641
+ asyncio.run(main())
642
+ ```
643
+
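With many URLs, an unbounded `asyncio.gather` starts every request at once. One common way to cap the number in flight is an `asyncio.Semaphore`. The sketch below is independent of the SDK: `gather_limited` is a hypothetical helper and `fake_get` stands in for a real call such as `client.get(url)`.

```python
import asyncio


async def gather_limited(coro_factories, limit: int = 10):
    """Run zero-argument coroutine factories with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory):
        async with semaphore:
            return await factory()

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(run(f) for f in coro_factories))


async def fake_get(url: str) -> str:
    # Stand-in for a network call; sleeps briefly instead of fetching.
    await asyncio.sleep(0.01)
    return f"fetched {url}"


urls = [f"https://example.com/{i}" for i in range(5)]
results = asyncio.run(
    gather_limited([lambda u=u: fake_get(u) for u in urls], limit=2)
)
```

Factories are used instead of ready-made coroutines so that each request is only created once a slot is free.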
644
+ #### Async SERP Search
645
+
646
+ ```python
647
+ import asyncio
648
+ from thordata import AsyncThordataClient, Engine
649
+
650
+ async def search_multiple():
651
+     async with AsyncThordataClient(scraper_token="your_token") as client:
652
+         queries = ["python", "javascript", "rust", "go"]
653
+
654
+         tasks = [
655
+             client.serp_search(q, engine=Engine.GOOGLE)
656
+             for q in queries
657
+         ]
658
+
659
+         results = await asyncio.gather(*tasks)
660
+
661
+         for query, result in zip(queries, results):
662
+             count = len(result.get("organic", []))
663
+             print(f"{query}: {count} results")
664
+
665
+ asyncio.run(search_multiple())
666
+ ```
667
+
668
+ ### Location APIs
669
+
670
+ Discover available geo-targeting options:
671
+
672
+ ```python
673
+ from thordata import ThordataClient, ProxyType
674
+
675
+ client = ThordataClient(
676
+ scraper_token="your_token",
677
+ public_token="your_public_token",
678
+ public_key="your_public_key",
679
+ )
680
+
681
+ # List all supported countries
682
+ countries = client.list_countries(proxy_type=ProxyType.RESIDENTIAL)
683
+ print(f"Supported countries: {len(countries)}")
684
+
685
+ # List states for a country
686
+ states = client.list_states("US")
687
+ for state in states[:5]:
688
+     print(f"  {state['state_code']}: {state['state_name']}")
689
+
690
+ # List cities
691
+ cities = client.list_cities("US", state_code="california")
692
+ print(f"Cities in California: {len(cities)}")
693
+
694
+ # List ASNs (for ISP targeting)
695
+ asns = client.list_asn("US")
696
+ for asn in asns[:5]:
697
+     print(f"  {asn['asn_code']}: {asn['asn_name']}")
698
+ ```
699
+
700
+ ### Error Handling
701
+
702
+ ```python
703
+ from thordata import (
704
+ ThordataClient,
705
+ ThordataError,
706
+ ThordataAuthError,
707
+ ThordataRateLimitError,
708
+ ThordataNetworkError,
709
+ ThordataTimeoutError,
710
+ )
711
+
712
+ client = ThordataClient(scraper_token="your_token")
713
+
714
+ try:
715
+     result = client.serp_search("test query")
716
+ except ThordataAuthError as e:
717
+     print(f"Authentication failed: {e}")
718
+     print(f"Check your token. Status code: {e.status_code}")
719
+ except ThordataRateLimitError as e:
720
+     print(f"Rate limited: {e}")
721
+     if e.retry_after:
722
+         print(f"Retry after {e.retry_after} seconds")
723
+ except ThordataTimeoutError as e:
724
+     print(f"Request timed out: {e}")
725
+ except ThordataNetworkError as e:
726
+     print(f"Network error: {e}")
727
+ except ThordataError as e:
728
+     print(f"General error: {e}")
729
+ ```
730
+
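The order of the `except` clauses matters whenever one error class derives from another, because Python picks the first matching clause. The README does not spell out the SDK's exception hierarchy, so the sketch below uses stand-in classes and assumes the timeout error is a subclass of the network error, which is what the ordering above suggests.

```python
class SdkError(Exception):
    """Stand-in for a base error such as ThordataError."""


class NetworkError(SdkError):
    """Stand-in for a network-level error (assumed hierarchy)."""


class RequestTimeout(NetworkError):
    """Stand-in for a timeout error, assumed to derive from NetworkError."""


def label(exc: Exception) -> str:
    """Return which handler would catch `exc`, most specific first."""
    try:
        raise exc
    except RequestTimeout:
        return "timeout"
    except NetworkError:
        return "network"
    except SdkError:
        return "general"


print(label(RequestTimeout()))  # timeout
print(label(NetworkError()))    # network
```

If the `NetworkError` clause came first, it would also catch `RequestTimeout` and the more specific handler would never run.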
731
+ ### Retry Configuration
732
+
733
+ Customize automatic retry behavior:
734
+
735
+ ```python
736
+ from thordata import ThordataClient, RetryConfig
737
+
738
+ # Custom retry configuration
739
+ retry_config = RetryConfig(
740
+ max_retries=5, # Maximum retry attempts
741
+ backoff_factor=2.0, # Exponential backoff multiplier
742
+ max_backoff=120.0, # Maximum wait between retries
743
+ jitter=True, # Add randomness to prevent thundering herd
744
+ )
745
+
746
+ client = ThordataClient(
747
+ scraper_token="your_token",
748
+ retry_config=retry_config,
749
+ )
750
+
751
+ # Requests will automatically retry on transient failures
752
+ response = client.get("https://example.com")
753
+ ```
754
+
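The README names the `RetryConfig` fields but not the formula behind them. The sketch below shows one common way such parameters combine into a wait time: exponential growth, a cap, and optional jitter. `backoff_delay` is hypothetical, and the SDK's actual calculation may differ.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    backoff_factor: float = 2.0,
    max_backoff: float = 120.0,
    jitter: bool = True,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Grow exponentially, but never beyond the configured ceiling.
    delay = min(backoff_factor ** attempt, max_backoff)
    if jitter:
        # Scale to 50-100% of the capped delay so clients do not retry in lockstep.
        delay *= 0.5 + rng() / 2
    return delay


print([backoff_delay(n, jitter=False) for n in range(5)])
# [1.0, 2.0, 4.0, 8.0, 16.0]
```

With the values from the example above, the uncapped delay would reach 128 seconds on the eighth retry, so `max_backoff=120.0` is what limits it.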
755
+ ---
756
+
757
+ ## 🔧 Configuration Reference
758
+
759
+ ### ThordataClient Parameters
760
+
761
+ | Parameter | Type | Default | Description |
762
+ |-----------|------|---------|-------------|
763
+ | scraper_token | str | required | API token from Dashboard |
764
+ | public_token | str | None | Public API token (for tasks/locations) |
765
+ | public_key | str | None | Public API key |
766
+ | proxy_host | str | "pr.thordata.net" | Proxy gateway host |
767
+ | proxy_port | int | 9999 | Proxy gateway port |
768
+ | timeout | int | 30 | Default request timeout (seconds) |
769
+ | retry_config | RetryConfig | None | Retry configuration |
770
+
771
+ ### ProxyConfig Parameters
772
+
773
+ | Parameter | Type | Default | Description |
774
+ |-----------|------|---------|-------------|
775
+ | username | str | required | Proxy username |
776
+ | password | str | required | Proxy password |
777
+ | product | ProxyProduct | RESIDENTIAL | Proxy type |
778
+ | country | str | None | ISO 3166-1 alpha-2 code |
779
+ | state | str | None | State name (lowercase) |
780
+ | city | str | None | City name (lowercase) |
781
+ | continent | str | None | Continent code (af/an/as/eu/na/oc/sa) |
782
+ | asn | str | None | ASN code (requires country) |
783
+ | session_id | str | None | Session ID for sticky sessions |
784
+ | session_duration | int | None | Session duration (1-90 minutes) |
785
+
786
+ ### Proxy Products & Ports
787
+
788
+ | Product | Port | Description |
789
+ |---------|------|-------------|
790
+ | RESIDENTIAL | 9999 | Rotating residential IPs |
791
+ | MOBILE | 5555 | Mobile carrier IPs |
792
+ | DATACENTER | 7777 | Datacenter IPs |
793
+ | ISP | 6666 | Static ISP IPs |
794
+
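For use with a plain HTTP library, the ports in this table can be combined with the gateway host into a proxy URL. The sketch below is illustrative only: `Product` is a stand-in for the SDK's `ProxyProduct`, and it assumes every product shares the default host `pr.thordata.net` and accepts a plain username and password, neither of which the README confirms.

```python
from enum import Enum
from urllib.parse import quote


class Product(Enum):
    """Stand-in for ProxyProduct; values are the ports from the table."""

    RESIDENTIAL = 9999
    MOBILE = 5555
    DATACENTER = 7777
    ISP = 6666


def proxy_url(
    username: str,
    password: str,
    product: Product = Product.RESIDENTIAL,
    host: str = "pr.thordata.net",  # assumed to apply to all products
) -> str:
    """Build an http:// proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{product.value}"


print(proxy_url("user", "p@ss:word", Product.MOBILE))
# http://user:p%40ss%3Aword@pr.thordata.net:5555
```

Percent-encoding the credentials keeps characters such as `@` and `:` from being misread as URL delimiters.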
795
+ ---
796
+
797
+ ## 📁 Project Structure
798
+
799
+ ```
800
+ thordata-python-sdk/
801
+ ├── src/thordata/
802
+ │ ├── __init__.py # Public API exports
803
+ │ ├── client.py # Sync client
804
+ │ ├── async_client.py # Async client
805
+ │ ├── models.py # Data models (ProxyConfig, SerpRequest, etc.)
806
+ │ ├── enums.py # Enumerations
807
+ │ ├── exceptions.py # Exception hierarchy
808
+ │ ├── retry.py # Retry mechanism
809
+ │ └── _utils.py # Internal utilities
810
+ ├── tests/ # Test suite
811
+ ├── examples/ # Usage examples
812
+ ├── pyproject.toml # Package configuration
813
+ └── README.md
814
+ ```
815
+
816
+ ---
817
+
818
+ ## 🧪 Development
819
+
820
+ ### Setup
821
+
822
+ ```bash
823
+ # Clone the repository
824
+ git clone https://github.com/Thordata/thordata-python-sdk.git
825
+ cd thordata-python-sdk
826
+
827
+ # Create virtual environment
828
+ python -m venv venv
829
+ source venv/bin/activate # On Windows: venv\Scripts\activate
830
+
831
+ # Install with dev dependencies
832
+ pip install -e ".[dev]"
833
+ ```
834
+
835
+ ### Run Tests
836
+
837
+ ```bash
838
+ # Run all tests
839
+ pytest
840
+
841
+ # Run with coverage
842
+ pytest --cov=thordata --cov-report=html
843
+
844
+ # Run specific test file
845
+ pytest tests/test_client.py -v
846
+ ```
847
+
848
+ ### Code Quality
849
+
850
+ ```bash
851
+ # Format code
852
+ black src tests
853
+
854
+ # Lint
855
+ ruff check src tests
856
+
857
+ # Type check
858
+ mypy src
859
+ ```
860
+
861
+ ---
862
+
863
+ ## 📝 Changelog
864
+
865
+ See [CHANGELOG.md](CHANGELOG.md) for version history.
866
+
867
+ ---
868
+
869
+ ## 🤝 Contributing
870
+
871
+ Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
872
+
873
+ 1. Fork the repository
874
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
875
+ 3. Commit your changes (`git commit -m 'Add amazing feature'`)
876
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
877
+ 5. Open a Pull Request
878
+
879
+ ---
880
+
881
+ ## 📄 License
882
+
883
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
884
+
885
+ ---
886
+
887
+ ## 🆘 Support
888
+
889
+ - 📧 **Email**: support@thordata.com
890
+ - 📚 **Documentation**: [doc.thordata.com](https://doc.thordata.com)
891
+ - 🐛 **Issues**: [GitHub Issues](https://github.com/Thordata/thordata-python-sdk/issues)
892
+ - 💬 **Dashboard**: [thordata.com](https://www.thordata.com)
893
+
894
+ <div align="center">
895
+ <sub>Built with ❤️ by the Thordata Team</sub>
896
+ </div>