thordata-sdk 0.4.0__py3-none-any.whl → 0.5.0__py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: thordata-sdk
- Version: 0.4.0
+ Version: 0.5.0
  Summary: The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.
  Author-email: Thordata Developer Team <support@thordata.com>
  License: MIT
@@ -16,7 +16,6 @@ Classifier: Topic :: Software Development :: Libraries :: Python Modules
  Classifier: Topic :: Internet :: WWW/HTTP
  Classifier: Topic :: Internet :: Proxy Servers
  Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
  Classifier: Programming Language :: Python :: 3.9
  Classifier: Programming Language :: Python :: 3.10
  Classifier: Programming Language :: Python :: 3.11
@@ -24,15 +23,17 @@ Classifier: Programming Language :: Python :: 3.12
  Classifier: License :: OSI Approved :: MIT License
  Classifier: Operating System :: OS Independent
  Classifier: Typing :: Typed
- Requires-Python: >=3.8
+ Requires-Python: >=3.9
  Description-Content-Type: text/markdown
  License-File: LICENSE
  Requires-Dist: requests>=2.25.0
- Requires-Dist: aiohttp>=3.8.0
+ Requires-Dist: aiohttp>=3.9.0
  Provides-Extra: dev
  Requires-Dist: pytest>=7.0.0; extra == "dev"
  Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
  Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
+ Requires-Dist: pytest-httpserver>=1.0.0; extra == "dev"
+ Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
  Requires-Dist: black>=23.0.0; extra == "dev"
  Requires-Dist: ruff>=0.1.0; extra == "dev"
  Requires-Dist: mypy>=1.0.0; extra == "dev"
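
The metadata hunks above raise the interpreter floor from 3.8 to 3.9 and the aiohttp floor from 3.8.0 to 3.9.0. A minimal sketch of a pre-upgrade check follows; `can_install_050` and `parse_version` are hypothetical helpers written for this illustration, not part of the SDK, and the parser only handles plain `X.Y.Z` release strings.

```python
import sys

# Floors copied from the 0.5.0 METADATA lines in the diff.
MIN_PYTHON = (3, 9)
MIN_AIOHTTP = (3, 9, 0)


def parse_version(text):
    """Turn a plain 'X.Y.Z' release string into a tuple of ints.

    Assumption: no pre-release or local version tags are present.
    """
    return tuple(int(part) for part in text.split("."))


def can_install_050(python_version, aiohttp_version):
    """Hypothetical helper: True if both versions meet the 0.5.0 floors."""
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    aiohttp_ok = parse_version(aiohttp_version) >= MIN_AIOHTTP
    return python_ok and aiohttp_ok


if __name__ == "__main__":
    # Check the running interpreter against an example aiohttp version.
    print(can_install_050(sys.version_info, "3.9.0"))
```

In practice pip enforces `Requires-Python` on its own; a check like this is only useful for auditing environments before an upgrade.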
@@ -249,6 +250,28 @@ for result in results.get("organic", []):
      print(f"{result['title']}: {result['link']}")
  ```
 
+ #### General Calling Method
+
+ ```python
+ from thordata import ThordataClient, Engine
+
+ client = ThordataClient(scraper_token="YOUR_SCRAPER_TOKEN")
+
+ results = client.serp_search(
+     query="pizza",
+     engine=Engine.GOOGLE,  # or "google"
+     num=10,
+     country="us",
+     language="en",
+     search_type="news",  # corresponds to tbm=nws
+     # Other parameters are passed in via kwargs
+     ibp="some_ibp_value",
+     lsig="some_lsig_value",
+ )
+ ```
+
+ **Note**: All parameters above will be assembled into Thordata SERP API request parameters.
+
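
The note in the added README text says that named arguments and extra keyword arguments end up in one set of request parameters. A minimal sketch of such a merge follows. `assemble_params` is a hypothetical function, the SDK's real assembly logic is not shown in this diff, and dropping `None` values is an assumption made here.

```python
def assemble_params(query, engine="google", **kwargs):
    """Hypothetical merge of named arguments and pass-through kwargs."""
    params = {"q": query, "engine": engine}
    # Extra keys (ibp, lsig, ...) are forwarded under their own names.
    # Assumption: None values are dropped so they never reach the request.
    params.update({key: value for key, value in kwargs.items() if value is not None})
    return params


if __name__ == "__main__":
    print(assemble_params("pizza", ibp="some_ibp_value", lsig="some_lsig_value"))
```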
  #### Advanced Search Options
 
  ```python
@@ -292,7 +315,202 @@ yandex_results = client.serp_search("AI news", engine=Engine.YANDEX)
  ddg_results = client.serp_search("AI news", engine=Engine.DUCKDUCKGO)
  ```
 
- ### 3. Web Unlocker (Universal Scraping API)
+ ---
+
+ ## 🔧 SERP API Parameter Mapping
+
+ Thordata's SERP API supports multiple search engines and sub-features (Google Search/Shopping/News, etc.).
+ This SDK wraps common parameters through `ThordataClient.serp_search` and `SerpRequest`, while other parameters can be passed directly through `**kwargs`.
+
+ ### Google Search Parameter Mapping
+
+ | Document Parameter | SDK Field/Usage | Description |
+ |-------------------|-----------------|-------------|
+ | q | query | Search keyword |
+ | engine | engine | Engine.GOOGLE / "google" |
+ | google_domain | google_domain | e.g., "google.co.uk" |
+ | gl | country | Country/region, e.g., "us" |
+ | hl | language | Language, e.g., "en", "zh-CN" |
+ | cr | countries_filter | Multi-country filter, e.g., "countryFR" |
+ | lr | languages_filter | Multi-language filter, e.g., "lang_en" |
+ | location | location | Exact location, e.g., "India" |
+ | uule | uule | Base64 encoded location string |
+ | tbm | search_type | "images"→tbm=isch, "shopping"→tbm=shop, "news"→tbm=nws, "videos"→tbm=vid, other values passed through as-is |
+ | start | start | Result offset for pagination |
+ | num | num | Number of results per page |
+ | ludocid | ludocid | Google Place ID |
+ | kgmid | kgmid | Google Knowledge Graph ID |
+ | ibp | ibp="..." (kwargs) | Passed through **kwargs |
+ | lsig | lsig="..." (kwargs) | Same as above |
+ | si | si="..." (kwargs) | Same as above |
+ | uds | uds="ADV" (kwargs) | Same as above |
+ | tbs | time_filter or tbs="..." | time_filter="week" generates tbs=qdr:w, can also pass complete tbs directly |
+ | safe | safe_search | True → safe=active, False → safe=off |
+ | nfpr | no_autocorrect | True → nfpr=1 |
+ | filter | filter_duplicates | True → filter=1, False → filter=0 |
+
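
The value translations listed in the Google Search table can be sketched as a small pure function. `translate_google_fields` is a hypothetical name and not the SDK's implementation. Only the `"week"` time filter is stated in the table; the other `qdr` entries follow Google's usual convention and are assumptions about what the SDK accepts.

```python
# search_type -> tbm, as listed in the table; other values pass through as-is.
TBM_BY_SEARCH_TYPE = {"images": "isch", "shopping": "shop", "news": "nws", "videos": "vid"}

# Only "week" -> "qdr:w" comes from the table; the rest are assumptions.
TBS_BY_TIME_FILTER = {
    "hour": "qdr:h",
    "day": "qdr:d",
    "week": "qdr:w",
    "month": "qdr:m",
    "year": "qdr:y",
}


def translate_google_fields(search_type=None, time_filter=None, safe_search=None,
                            no_autocorrect=None, filter_duplicates=None):
    """Hypothetical translation of SDK-style fields into Google query parameters."""
    params = {}
    if search_type is not None:
        params["tbm"] = TBM_BY_SEARCH_TYPE.get(search_type, search_type)
    if time_filter is not None:
        # Raises KeyError for a filter name this sketch does not know.
        params["tbs"] = TBS_BY_TIME_FILTER[time_filter]
    if safe_search is not None:
        params["safe"] = "active" if safe_search else "off"
    if no_autocorrect:
        params["nfpr"] = "1"
    if filter_duplicates is not None:
        params["filter"] = "1" if filter_duplicates else "0"
    return params


if __name__ == "__main__":
    print(translate_google_fields(search_type="news", time_filter="week", safe_search=True))
```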
+ **Example: Google Search Basic Usage**
+
+ ```python
+ results = client.serp_search(
+     query="python web scraping best practices",
+     engine=Engine.GOOGLE,
+     country="us",
+     language="en",
+     num=10,
+     time_filter="week",  # Last week
+     safe_search=True,  # Adult content filter
+ )
+ ```
+
+ ### Google Shopping Parameter Mapping
+
+ Shopping still uses engine="google"; set search_type="shopping" to select Shopping mode:
+
+ ```python
+ results = client.serp_search(
+     query="iPhone 15",
+     engine=Engine.GOOGLE,
+     search_type="shopping",  # tbm=shop
+     country="us",
+     language="en",
+     num=20,
+     min_price=500,  # Parameters below passed through kwargs
+     max_price=1500,
+     sort_by=1,  # 1=price low to high, 2=high to low
+     free_shipping=True,
+     on_sale=True,
+     small_business=True,
+     direct_link=True,
+     shoprs="FILTER_ID_HERE",
+ )
+ shopping_items = results.get("shopping_results", [])
+ ```
+
+ | Document Parameter | SDK Field/Usage | Description |
+ |-------------------|-----------------|-------------|
+ | q | query | Search keyword |
+ | google_domain | google_domain | Same as above |
+ | gl | country | Same as above |
+ | hl | language | Same as above |
+ | location | location | Same as above |
+ | uule | uule | Same as above |
+ | start | start | Offset |
+ | num | num | Quantity |
+ | tbs | time_filter or tbs="..." | Same as above |
+ | shoprs | shoprs="..." (kwargs) | Filter ID |
+ | min_price | min_price=... (kwargs) | Minimum price |
+ | max_price | max_price=... (kwargs) | Maximum price |
+ | sort_by | sort_by=1/2 (kwargs) | Sort order |
+ | free_shipping | free_shipping=True/False (kwargs) | Free shipping |
+ | on_sale | on_sale=True/False (kwargs) | On sale |
+ | small_business | small_business=True/False (kwargs) | Small business |
+ | direct_link | direct_link=True/False (kwargs) | Include direct links |
+
+ ### Google Local Parameter Mapping
+
+ Google Local handles location-based local searches.
+ In the SDK, set search_type="local" to select Local mode (tbm is passed through as "local"), and combine it with location and uule.
+
+ ```python
+ results = client.serp_search(
+     query="pizza near me",
+     engine=Engine.GOOGLE,
+     search_type="local",
+     google_domain="google.com",
+     country="us",
+     language="en",
+     location="San Francisco",
+     uule="w+CAIQICIFU2FuIEZyYW5jaXNjbw",  # Example value
+     start=0,  # Local only accepts 0, 20, 40...
+ )
+ local_results = results.get("local_results", results.get("organic", []))
+ ```
+
+ | Document Parameter | SDK Field/Usage | Description |
+ |-------------------|-----------------|-------------|
+ | q | query | Search term |
+ | google_domain | google_domain | Domain |
+ | gl | country | Country |
+ | hl | language | Language |
+ | location | location | Local location |
+ | uule | uule | Encoded location |
+ | start | start | Offset (must be 0,20,40...) |
+ | ludocid | ludocid | Place ID (commonly used in Local results) |
+ | tbs | time_filter or tbs="..." | Advanced filtering |
+
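
The Local table requires `start` to be 0, 20, 40, and so on. A minimal sketch of generating and validating such offsets follows. Both helpers are hypothetical, and the page size of 20 is inferred from that rule rather than confirmed by the SDK.

```python
LOCAL_PAGE_SIZE = 20  # inferred from the "0, 20, 40..." rule in the table


def local_offsets(pages):
    """Return the `start` values for the first `pages` pages of Local results."""
    return [page * LOCAL_PAGE_SIZE for page in range(pages)]


def check_local_start(start):
    """Return `start` unchanged, or raise ValueError if it is not a valid Local offset."""
    if start < 0 or start % LOCAL_PAGE_SIZE != 0:
        raise ValueError(f"start must be a non-negative multiple of {LOCAL_PAGE_SIZE}, got {start}")
    return start


if __name__ == "__main__":
    for offset in local_offsets(3):
        print(check_local_start(offset))
```

Each offset could then be passed as `start=` to `client.serp_search(...)` alongside `search_type="local"`.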
+ ### Google Videos Parameter Mapping
+
+ ```python
+ results = client.serp_search(
+     query="python async tutorial",
+     engine=Engine.GOOGLE,
+     search_type="videos",  # tbm=vid
+     country="us",
+     language="en",
+     languages_filter="lang_en|lang_fr",
+     location="United States",
+     uule="ENCODED_LOCATION_HERE",
+     num=10,
+     time_filter="month",
+     safe_search=True,
+     filter_duplicates=True,
+ )
+ video_results = results.get("video_results", results.get("organic", []))
+ ```
+
+ | Document Parameter | SDK Field/Usage | Description |
+ |-------------------|-----------------|-------------|
+ | q | query | Search term |
+ | google_domain | google_domain | Domain |
+ | gl | country | Country |
+ | hl | language | Language |
+ | lr | languages_filter | Multi-language filter |
+ | location | location | Geographic location |
+ | uule | uule | Encoded location |
+ | start | start | Offset |
+ | num | num | Quantity |
+ | tbs | time_filter or tbs="..." | Time and advanced filtering |
+ | safe | safe_search | Adult content filter |
+ | nfpr | no_autocorrect | Disable auto-correction |
+ | filter | filter_duplicates | Remove duplicates |
+
+ ### Google News Parameter Mapping
+
+ Google News has its own set of token parameters for precise control over topics, media, sections, and stories.
+
+ ```python
+ results = client.serp_search(
+     query="AI regulation",
+     engine=Engine.GOOGLE,
+     search_type="news",  # tbm=nws
+     country="us",
+     language="en",
+     topic_token="YOUR_TOPIC_TOKEN",  # Optional
+     publication_token="YOUR_PUBLICATION_TOKEN",  # Optional
+     section_token="YOUR_SECTION_TOKEN",  # Optional
+     story_token="YOUR_STORY_TOKEN",  # Optional
+     so=1,  # 0=relevance, 1=time
+ )
+ news_results = results.get("news_results", results.get("organic", []))
+ ```
+
+ | Document Parameter | SDK Field/Usage | Description |
+ |-------------------|-----------------|-------------|
+ | q | query | Search term |
+ | gl | country | Country |
+ | hl | language | Language |
+ | topic_token | topic_token="..." (kwargs) | Topic token |
+ | publication_token | publication_token="..." (kwargs) | Media token |
+ | section_token | section_token="..." (kwargs) | Section token |
+ | story_token | story_token="..." (kwargs) | Story token |
+ | so | so=0/1 (kwargs) | Sort: 0=relevance, 1=time |
+
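
The Local, Videos, and News examples all read a mode-specific key and fall back to `"organic"`. That pattern can be factored into one helper, sketched below. `pick_results` is a hypothetical name, and the sample payloads are invented for illustration rather than taken from real API responses.

```python
def pick_results(response, preferred_key):
    """Return response[preferred_key] if present, else the "organic" list, else []."""
    return response.get(preferred_key, response.get("organic", []))


if __name__ == "__main__":
    # Invented payloads that only mimic the key names used in the README examples.
    news_payload = {"news_results": [{"title": "A"}], "organic": [{"title": "B"}]}
    plain_payload = {"organic": [{"title": "B"}]}
    print(pick_results(news_payload, "news_results"))
    print(pick_results(plain_payload, "news_results"))
```

As in the README one-liners, a key that is present but holds an empty list is returned as-is and does not trigger the fallback.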
+ ---
+
+ 👉 For more SERP modes and parameter mappings, see docs/serp_reference.md.
+
+ ## 🔓 Web Unlocker (Universal Scraping API)
 
  Automatically bypass anti-bot protections:
 
@@ -357,7 +575,7 @@ with open("screenshot.png", "wb") as f:
      f.write(png_bytes)
  ```
 
- ### 4. Web Scraper API (Async Tasks)
+ ### Web Scraper API (Async Tasks)
 
  For complex scraping jobs that run asynchronously:
 
@@ -392,7 +610,7 @@ if status in ("ready", "success"):
      print(f"Download: {download_url}")
  ```
 
- ### 5. Async Client (High Concurrency)
+ ### Async Client (High Concurrency)
 
  For maximum performance with concurrent requests:
 
@@ -447,7 +665,7 @@ async def search_multiple():
  asyncio.run(search_multiple())
  ```
 
- ### 6. Location APIs
+ ### Location APIs
 
  Discover available geo-targeting options:
 
@@ -479,7 +697,7 @@ for asn in asns[:5]:
      print(f" {asn['asn_code']}: {asn['asn_name']}")
  ```
 
- ### 7. Error Handling
+ ### Error Handling
 
  ```python
  from thordata import (
@@ -510,7 +728,7 @@ except ThordataError as e:
      print(f"General error: {e}")
  ```
 
- ### 8. Retry Configuration
+ ### Retry Configuration
 
  Customize automatic retry behavior:
 
@@ -0,0 +1,14 @@
+ thordata/__init__.py,sha256=nJYeULWtg4rm5nQ4I6nJd_AUdS9CqwB00F9M0vPZTTc,3018
+ thordata/_utils.py,sha256=nYiyNVeHATDGszeiZtZt56NjNkH_FRkntv7iVmzqy3E,3148
+ thordata/async_client.py,sha256=fDD0quARX9eIqOwcXWFusvTSgtXAXsqDtJwzNk8jbgc,25974
+ thordata/client.py,sha256=xcMC7q48SzA4HnUkaafxVv7GcdKH4btH_MoLHxchOqY,33164
+ thordata/enums.py,sha256=bVRJ7tk2_maidffInn0cTCHfKJHqNni-20eSEST_o5g,6762
+ thordata/exceptions.py,sha256=9cU7_opMt-jBsW1W072PvvWN5w94XHQ4mXAPzCjDRyM,9887
+ thordata/models.py,sha256=1ik3zapkbV6hq_hexH9tZctTqkdWusw--Qy_P5GQksU,24240
+ thordata/parameters.py,sha256=WHylIoRPvYwYebPknBNtVvJy-dDMHZEdbHl8mtGQBjM,1802
+ thordata/retry.py,sha256=asa-jsBmpCSN4jRnbf2P-LLs1vkAEwOZFE4ZUKJ0Qas,11815
+ thordata_sdk-0.5.0.dist-info/licenses/LICENSE,sha256=iGPxYqZSwa4imqBS7FDaNpUJgKrPfYjcv2nD0aJBmNI,1109
+ thordata_sdk-0.5.0.dist-info/METADATA,sha256=avKvCs7BooLUtLxux4Js3LTHavuCDYan_CnvwgBCh-g,26322
+ thordata_sdk-0.5.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ thordata_sdk-0.5.0.dist-info/top_level.txt,sha256=Z8R_07m0lXCCSb1hapL9_nxMtyO3rf_9wOvq4n9u2Hg,9
+ thordata_sdk-0.5.0.dist-info/RECORD,,
@@ -1,14 +0,0 @@
- thordata/__init__.py,sha256=1IDM3RMWb8OyzBRtEUOgZfj_lcv0g65AOniL-8ohvfk,2986
- thordata/_utils.py,sha256=gDk0RpplRk-oV0cjN02_cXKS7ucpmHEtgJxxz4H5Ers,3243
- thordata/async_client.py,sha256=2mgG1uwrD1bxocjEPxxeiCKYjVVe-suMCcJX9SdjR84,24279
- thordata/client.py,sha256=UNBYXP1sDkdKGTEhcYK8KodAu_qPrXbQDobyhj6QBxI,30742
- thordata/enums.py,sha256=SSriGhEIkYebie_-Yyan1wPcWCF-jIo9CJ_Y_0Ngu7U,6574
- thordata/exceptions.py,sha256=uqFGkmJpUYGRZ7KmvNVVpYz9O0fueenmXJHKb4A_UaE,9402
- thordata/models.py,sha256=HdFLP_bSXU8Ul8eq90Idng9Ogl3KNIm3LdWazLDKKc0,23870
- thordata/parameters.py,sha256=1lNx_BSS8ztBKEj_MXZMaIQQ9_W3EAlS-VFiBqSWb9E,1841
- thordata/retry.py,sha256=w2G_xv_jOlCqDaFIMR_c6ZsJFRKdjHAZHZ61TggM61U,12273
- thordata_sdk-0.4.0.dist-info/licenses/LICENSE,sha256=iGPxYqZSwa4imqBS7FDaNpUJgKrPfYjcv2nD0aJBmNI,1109
- thordata_sdk-0.4.0.dist-info/METADATA,sha256=ffmLzRS_pPTxlEiBOsofIewTApOGBGPt8RvMVIjFJVM,18454
- thordata_sdk-0.4.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- thordata_sdk-0.4.0.dist-info/top_level.txt,sha256=Z8R_07m0lXCCSb1hapL9_nxMtyO3rf_9wOvq4n9u2Hg,9
- thordata_sdk-0.4.0.dist-info/RECORD,,
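
Each RECORD entry above has the form `path,sha256=<digest>,size`, where the digest is the URL-safe Base64 encoding of the SHA-256 hash with trailing `=` padding removed, as the wheel format specifies. A minimal sketch for parsing an entry and recomputing a digest follows; `record_hash` and `parse_record_line` are hypothetical helper names, and the parser assumes the path itself contains no commas.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the RECORD-style hash field for the given file contents."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_line(line: str):
    """Split a RECORD line into (path, hash_field, size); size is None when empty."""
    path, hash_field, size = line.rsplit(",", 2)
    return path, hash_field, int(size) if size else None


if __name__ == "__main__":
    entry = "thordata/parameters.py,sha256=WHylIoRPvYwYebPknBNtVvJy-dDMHZEdbHl8mtGQBjM,1802"
    print(parse_record_line(entry))
    print(record_hash(b"example file contents"))
```

To verify an installed file, compare `record_hash(open(path, "rb").read())` with the parsed hash field and the file length with the parsed size. The RECORD file's own entry carries no hash or size, which is why it ends in `,,`.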