pytrends-modern 0.1.1__tar.gz → 0.1.2__tar.gz
- {pytrends_modern-0.1.1/pytrends_modern.egg-info → pytrends_modern-0.1.2}/PKG-INFO +39 -1
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/README.md +38 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pyproject.toml +6 -1
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/__init__.py +2 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/request.py +1 -6
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/rss.py +8 -8
- pytrends_modern-0.1.2/pytrends_modern/scraper.py +292 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2/pytrends_modern.egg-info}/PKG-INFO +39 -1
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern.egg-info/SOURCES.txt +1 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/LICENSE +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/MANIFEST.in +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/examples/advanced_usage.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/examples/basic_usage.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/cli.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/config.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/exceptions.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/py.typed +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/utils.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern.egg-info/dependency_links.txt +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern.egg-info/entry_points.txt +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern.egg-info/requires.txt +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern.egg-info/top_level.txt +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/setup.cfg +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/tests/conftest.py +0 -0
- {pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/tests/test_basic.py +0 -0
{pytrends_modern-0.1.1/pytrends_modern.egg-info → pytrends_modern-0.1.2}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pytrends-modern
-Version: 0.1.
+Version: 0.1.2
 Summary: Modern Google Trends API - Combining the best of pytrends, with RSS feeds, Selenium scraping, and enhanced features
 Author: pytrends-modern contributors
 License: MIT
@@ -389,6 +389,44 @@ This project builds upon and combines features from:
 - CLI interface
 - Multiple export formats
 
+## 📌 Important Notes
+
+### Google API Changes
+
+Google has deprecated several trending search API endpoints. **pytrends-modern provides two working alternatives:**
+
+#### Option 1: Fast RSS Feed (Recommended for most use cases)
+```python
+from pytrends_modern import TrendsRSS
+
+rss = TrendsRSS()
+trends = rss.get_trends(geo='US')  # ~0.7s, returns 10 trends with images/articles
+```
+**Pros:** Lightning fast, includes rich media, no browser needed
+**Cons:** Limited to 10 trends, no filtering options
+
+#### Option 2: Selenium Web Scraper (For complete data)
+```python
+from pytrends_modern import TrendsScraper
+
+scraper = TrendsScraper(headless=True)
+df = scraper.trending_searches(geo='US', hours=24)  # ~15s, returns 400+ trends
+scraper.close()
+```
+**Pros:** Complete data (400+ trends), supports categories/filters
+**Cons:** Slower, requires Chrome browser
+
+### Working Features
+✅ All core API methods work perfectly:
+- `interest_over_time()` - Historical search trends
+- `interest_by_region()` - Geographic distribution
+- `related_queries()` / `related_topics()` - Related searches
+- `suggestions()` - Keyword suggestions
+- And more!
+
+✅ RSS feeds for 125+ countries
+✅ Selenium scraper for comprehensive trending data
+
 ## ⚠️ Disclaimer
 
 This is an unofficial library and is not affiliated with or endorsed by Google. Use responsibly and in accordance with Google's Terms of Service.
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/README.md

@@ -342,6 +342,44 @@ This project builds upon and combines features from:
 - CLI interface
 - Multiple export formats
 
+## 📌 Important Notes
+
+### Google API Changes
+
+Google has deprecated several trending search API endpoints. **pytrends-modern provides two working alternatives:**
+
+#### Option 1: Fast RSS Feed (Recommended for most use cases)
+```python
+from pytrends_modern import TrendsRSS
+
+rss = TrendsRSS()
+trends = rss.get_trends(geo='US')  # ~0.7s, returns 10 trends with images/articles
+```
+**Pros:** Lightning fast, includes rich media, no browser needed
+**Cons:** Limited to 10 trends, no filtering options
+
+#### Option 2: Selenium Web Scraper (For complete data)
+```python
+from pytrends_modern import TrendsScraper
+
+scraper = TrendsScraper(headless=True)
+df = scraper.trending_searches(geo='US', hours=24)  # ~15s, returns 400+ trends
+scraper.close()
+```
+**Pros:** Complete data (400+ trends), supports categories/filters
+**Cons:** Slower, requires Chrome browser
+
+### Working Features
+✅ All core API methods work perfectly:
+- `interest_over_time()` - Historical search trends
+- `interest_by_region()` - Geographic distribution
+- `related_queries()` / `related_topics()` - Related searches
+- `suggestions()` - Keyword suggestions
+- And more!
+
+✅ RSS feeds for 125+ countries
+✅ Selenium scraper for comprehensive trending data
+
 ## ⚠️ Disclaimer
 
 This is an unofficial library and is not affiliated with or endorsed by Google. Use responsibly and in accordance with Google's Terms of Service.
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pyproject.toml

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "pytrends-modern"
-version = "0.1.
+version = "0.1.2"
 description = "Modern Google Trends API - Combining the best of pytrends, with RSS feeds, Selenium scraping, and enhanced features"
 readme = "README.md"
 requires-python = ">=3.8"
@@ -92,3 +92,8 @@ python_files = ["test_*.py"]
 python_classes = ["Test*"]
 python_functions = ["test_*"]
 addopts = "-v --cov=pytrends_modern --cov-report=term-missing"
+
+[tool.uv.workspace]
+members = [
+    "test_space",
+]
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/__init__.py

@@ -8,6 +8,7 @@ __license__ = "MIT"
 
 from pytrends_modern.request import TrendReq
 from pytrends_modern.rss import TrendsRSS
+from pytrends_modern.scraper import TrendsScraper
 from pytrends_modern.exceptions import (
     TooManyRequestsError,
     ResponseError,
@@ -19,6 +20,7 @@ from pytrends_modern.exceptions import (
 __all__ = [
     "TrendReq",
     "TrendsRSS",
+    "TrendsScraper",
     "TooManyRequestsError",
     "ResponseError",
     "InvalidParameterError",
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/request.py

@@ -428,7 +428,7 @@ class TrendReq:
 
         # Add isPartial column
         if "isPartial" in df:
-            df = df.
+            df["isPartial"] = df["isPartial"].where(df["isPartial"].notna(), False)
             is_partial_df = df["isPartial"].apply(
                 lambda x: pd.Series(str(x).replace("[", "").replace("]", "").split(","))
             )
@@ -716,11 +716,6 @@ class TrendReq:
 
         Returns:
             DataFrame of real-time trending searches with entity names and titles
-
-        Example:
-            >>> pytrends = TrendReq()
-            >>> df = pytrends.realtime_trending_searches(pn='US', count=50)
-            >>> print(df.head())
         """
         # Validate count
         ri_value = min(count, 300)
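The replacement line in the request.py hunk fills missing `isPartial` values with `False` before the string-splitting step. A standalone sketch of that pandas idiom, using toy data rather than the library's real response frame:

```python
import pandas as pd

# Toy frame mimicking Google's payload: only the most recent row is
# flagged isPartial=True; earlier rows come back as missing values.
df = pd.DataFrame({"keyword": [40, 55, 61], "isPartial": [None, None, True]})

# The idiom from the diff: keep non-null values, replace NaN/None with False.
df["isPartial"] = df["isPartial"].where(df["isPartial"].notna(), False)

print(df["isPartial"].tolist())  # [False, False, True]
```

`Series.where(cond, other)` keeps values where `cond` is True and substitutes `other` elsewhere, which is why pairing it with `notna()` acts as a null-fill.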
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2}/pytrends_modern/rss.py

@@ -36,7 +36,7 @@ class TrendsRSS:
     ...     print(f"{trend['title']}: {trend['traffic']}")
     """
 
-    RSS_URL_TEMPLATE = "https://trends.google.com/
+    RSS_URL_TEMPLATE = "https://trends.google.com/trending/rss?geo={geo}"
 
     def __init__(self, timeout: int = 10):
         """
@@ -123,7 +123,7 @@ class TrendsRSS:
             trend_data["pub_date_datetime"] = None
 
             # Traffic volume (from ht:approx_traffic namespace)
-            traffic_elem = item.find(".//{
+            traffic_elem = item.find(".//{https://trends.google.com/trending/rss}approx_traffic")
             if traffic_elem is not None and traffic_elem.text:
                 # Remove '+' and ',' from traffic string
                 traffic_str = traffic_elem.text.replace("+", "").replace(",", "")
@@ -136,12 +136,12 @@ class TrendsRSS:
 
             # Image
             if include_images:
-                picture_elem = item.find(".//{
+                picture_elem = item.find(".//{https://trends.google.com/trending/rss}picture")
                 trend_data["picture"] = picture_elem.text if picture_elem is not None else None
 
             # News articles
             if include_articles:
-                news_items = item.findall(".//{
+                news_items = item.findall(".//{https://trends.google.com/trending/rss}news_item")
                 articles = []
 
                 for news_item in news_items[:max_articles_per_trend]:
@@ -149,25 +149,25 @@ class TrendsRSS:
 
                     # Article title
                     title_elem = news_item.find(
-                        ".//{
+                        ".//{https://trends.google.com/trending/rss}news_item_title"
                     )
                     article["title"] = title_elem.text if title_elem is not None else None
 
                     # Article URL
                     url_elem = news_item.find(
-                        ".//{
+                        ".//{https://trends.google.com/trending/rss}news_item_url"
                     )
                     article["url"] = url_elem.text if url_elem is not None else None
 
                     # Article snippet
                     snippet_elem = news_item.find(
-                        ".//{
+                        ".//{https://trends.google.com/trending/rss}news_item_snippet"
                    )
                     article["snippet"] = snippet_elem.text if snippet_elem is not None else None
 
                     # Article source
                     source_elem = news_item.find(
-                        ".//{
+                        ".//{https://trends.google.com/trending/rss}news_item_source"
                     )
                     article["source"] = source_elem.text if source_elem is not None else None
 
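The URL in curly braces on the `+` lines of the rss.py hunk is an XML namespace, not a web request: ElementTree matches elements by `{namespace-uri}tag`. A minimal sketch using a fabricated one-item feed (the element names follow the diff; the sample XML itself is invented):

```python
import xml.etree.ElementTree as ET

NS = "https://trends.google.com/trending/rss"

# Fabricated single-item feed declaring the ht: namespace from the diff.
xml = f"""
<rss xmlns:ht="{NS}">
  <channel>
    <item>
      <title>example trend</title>
      <ht:approx_traffic>2,000+</ht:approx_traffic>
    </item>
  </channel>
</rss>
"""

root = ET.fromstring(xml)
item = root.find(".//item")
# Same lookup shape as the library: ".//{namespace}tag"
traffic_elem = item.find(f".//{{{NS}}}approx_traffic")
traffic = traffic_elem.text.replace("+", "").replace(",", "")
print(traffic)  # 2000
```

Changing the namespace string is therefore as breaking as changing a tag name, which is why the old (truncated) URI had to be replaced everywhere it appears.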
pytrends_modern-0.1.2/pytrends_modern/scraper.py (new)

@@ -0,0 +1,292 @@
+"""
+Selenium-based scraper for Google Trends trending searches
+Uses browser automation to download CSV data when API endpoints are unavailable
+"""
+
+import os
+import time
+from typing import Optional, Union, List, Dict
+from pathlib import Path
+import warnings
+
+import pandas as pd
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+from selenium.webdriver.chrome.options import Options
+from selenium.common.exceptions import (
+    WebDriverException,
+    TimeoutException,
+    NoSuchElementException,
+)
+
+from . import exceptions
+
+
+class TrendsScraper:
+    """
+    Selenium-based scraper for Google Trends when API endpoints are unavailable
+
+    This provides an alternative to deprecated API methods like trending_searches()
+    by using browser automation to download actual trending data.
+    """
+
+    def __init__(self, headless: bool = True, download_dir: Optional[str] = None):
+        """
+        Initialize the scraper
+
+        Args:
+            headless: Run browser in headless mode
+            download_dir: Directory for downloads (default: temp directory)
+        """
+        self.headless = headless
+
+        if download_dir is None:
+            import tempfile
+
+            self.download_dir = tempfile.mkdtemp(prefix="pytrends_")
+        else:
+            self.download_dir = os.path.abspath(download_dir)
+            os.makedirs(self.download_dir, exist_ok=True)
+
+        self.driver = None
+
+    def _init_driver(self):
+        """Initialize Selenium WebDriver"""
+        if self.driver is not None:
+            return
+
+        chrome_options = Options()
+        prefs = {
+            "download.default_directory": self.download_dir,
+            "download.prompt_for_download": False,
+            "download.directory_upgrade": True,
+            "safebrowsing.enabled": True,
+        }
+        chrome_options.add_experimental_option("prefs", prefs)
+
+        if self.headless:
+            chrome_options.add_argument("--headless=new")
+            chrome_options.add_argument("--disable-gpu")
+            chrome_options.add_argument("--no-sandbox")
+            chrome_options.add_argument("--disable-dev-shm-usage")
+
+        chrome_options.add_argument("--log-level=3")
+        chrome_options.add_experimental_option("excludeSwitches", ["enable-logging"])
+
+        try:
+            self.driver = webdriver.Chrome(options=chrome_options)
+        except WebDriverException as e:
+            raise exceptions.BrowserError(
+                f"Failed to start Chrome browser: {e}\n\n"
+                "Please ensure:\n"
+                "1. Chrome/Chromium browser is installed\n"
+                "2. ChromeDriver is compatible with your Chrome version\n"
+                "3. You have proper permissions\n"
+            )
+
+    def trending_searches(
+        self,
+        geo: str = "US",
+        hours: int = 24,
+        category: str = "all",
+        active_only: bool = False,
+        sort_by: str = "relevance",
+        return_df: bool = True,
+    ) -> Union[pd.DataFrame, str]:
+        """
+        Get trending searches by scraping Google Trends
+
+        Args:
+            geo: Country code (US, GB, IN, etc.)
+            hours: Time period in hours (4, 24, 48, 168)
+            category: Category filter (all, sports, entertainment, etc.)
+            active_only: Show only active trends
+            sort_by: Sort criteria (relevance, title, volume, recency)
+            return_df: Return DataFrame (True) or CSV path (False)
+
+        Returns:
+            DataFrame of trending searches or path to CSV file
+
+        Raises:
+            BrowserError: If browser automation fails
+            DownloadError: If download fails
+
+        Example:
+            >>> scraper = TrendsScraper()
+            >>> df = scraper.trending_searches(geo='US', hours=24)
+            >>> print(df.head())
+        """
+        self._init_driver()
+
+        # Get existing files before download
+        existing_files = set(f for f in os.listdir(self.download_dir) if f.endswith(".csv"))
+
+        try:
+            # Build URL with parameters
+            url = f"https://trends.google.com/trending?geo={geo}"
+
+            if hours != 24:
+                url += f"&hours={hours}"
+
+            # Add category if not 'all'
+            categories = {
+                "all": "",
+                "business": "b",
+                "entertainment": "e",
+                "health": "m",
+                "science": "t",
+                "sports": "s",
+                "top": "h",
+            }
+            cat_code = categories.get(category.lower(), "")
+            if cat_code:
+                url += f"&cat={cat_code}"
+
+            # Navigate to page
+            self.driver.get(url)
+
+            # Wait for page to load by checking for Export button
+            WebDriverWait(self.driver, 10).until(
+                EC.presence_of_element_located((By.XPATH, "//button[contains(., 'Export')]"))
+            )
+
+            # Click Export button
+            try:
+                export_button = WebDriverWait(self.driver, 10).until(
+                    EC.element_to_be_clickable((By.XPATH, "//button[contains(., 'Export')]"))
+                )
+                export_button.click()
+                time.sleep(1)
+
+                # Click Download CSV from the menu
+                download_csv = WebDriverWait(self.driver, 5).until(
+                    EC.presence_of_element_located((By.CSS_SELECTOR, 'li[data-action="csv"]'))
+                )
+                self.driver.execute_script("arguments[0].click();", download_csv)
+                time.sleep(1)
+
+            except Exception as e:
+                raise exceptions.DownloadError(f"Could not find or click Export/CSV button: {e}")
+
+            # Wait for file download
+            max_wait = 30
+            waited = 0
+            downloaded_file = None
+
+            while waited < max_wait:
+                current_files = set(f for f in os.listdir(self.download_dir) if f.endswith(".csv"))
+                new_files = current_files - existing_files
+
+                if new_files:
+                    # Get the most recently created file
+                    newest = max(
+                        new_files,
+                        key=lambda f: os.path.getctime(os.path.join(self.download_dir, f)),
+                    )
+                    downloaded_file = os.path.join(self.download_dir, newest)
+
+                    # Verify file is complete (size > 0 and not growing)
+                    size1 = os.path.getsize(downloaded_file)
+                    time.sleep(0.5)
+                    size2 = os.path.getsize(downloaded_file)
+
+                    if size1 > 0 and size1 == size2:
+                        break
+
+                time.sleep(0.5)
+                waited += 0.5
+
+            if not downloaded_file:
+                raise exceptions.DownloadError(
+                    f"Download timeout after {max_wait}s. No CSV file appeared in {self.download_dir}"
+                )
+
+            # Return DataFrame or path
+            if return_df:
+                df = pd.read_csv(downloaded_file)
+                # Clean up file
+                try:
+                    os.remove(downloaded_file)
+                except:
+                    pass
+                return df
+            else:
+                return downloaded_file
+
+        except Exception as e:
+            if isinstance(e, (exceptions.BrowserError, exceptions.DownloadError)):
+                raise
+            raise exceptions.DownloadError(f"Scraping failed: {e}")
+
+    def today_searches(self, geo: str = "US", return_df: bool = True) -> Union[pd.DataFrame, str]:
+        """
+        Get today's trending searches (shortcut for 24 hour trends)
+
+        Args:
+            geo: Country code (US, GB, IN, etc.)
+            return_df: Return DataFrame (True) or CSV path (False)
+
+        Returns:
+            DataFrame of today's trending searches
+
+        Example:
+            >>> scraper = TrendsScraper()
+            >>> df = scraper.today_searches(geo='US')
+        """
+        return self.trending_searches(geo=geo, hours=24, return_df=return_df)
+
+    def realtime_trending_searches(
+        self, geo: str = "US", hours: int = 4, return_df: bool = True
+    ) -> Union[pd.DataFrame, str]:
+        """
+        Get real-time trending searches (4 hour window)
+
+        Args:
+            geo: Country code (US, GB, IN, etc.)
+            hours: Time period (default 4 for real-time)
+            return_df: Return DataFrame (True) or CSV path (False)
+
+        Returns:
+            DataFrame of real-time trending searches
+
+        Example:
+            >>> scraper = TrendsScraper()
+            >>> df = scraper.realtime_trending_searches(geo='US')
+        """
+        return self.trending_searches(geo=geo, hours=hours, return_df=return_df)
+
+    def __enter__(self):
+        """Context manager support"""
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Context manager cleanup"""
+        self.close()
+
+    def close(self):
+        # Ignore errors during shutdown
+        """Close browser and cleanup"""
+        if self.driver:
+            try:
+                self.driver.quit()
+            except:
+                pass
+            self.driver = None
+
+        # Cleanup temp directory if we created it
+        if self.download_dir and "/tmp/" in self.download_dir:
+            import shutil
+
+            try:
+                shutil.rmtree(self.download_dir)
+            except:
+                pass
+
+    def __del__(self):
+        """Cleanup on deletion"""
+        try:
+            self.close()
+        except:
+            pass
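The download-wait loop in the new scraper polls the download directory for a `.csv` that wasn't there before and checks that its size has stopped growing. That pattern can be exercised without a browser; the sketch below uses a hypothetical `wait_for_new_csv` helper and simulates the download with a plain file write:

```python
import os
import tempfile
import time


def wait_for_new_csv(download_dir, existing, max_wait=5.0):
    """Poll download_dir until a .csv not in `existing` appears with a stable, nonzero size."""
    waited = 0.0
    while waited < max_wait:
        current = {f for f in os.listdir(download_dir) if f.endswith(".csv")}
        new = current - existing
        if new:
            path = os.path.join(download_dir, new.pop())
            size1 = os.path.getsize(path)
            time.sleep(0.1)  # re-check: size unchanged means the write finished
            if size1 > 0 and size1 == os.path.getsize(path):
                return path
        time.sleep(0.1)
        waited += 0.1
    return None


d = tempfile.mkdtemp(prefix="pytrends_demo_")
before = {f for f in os.listdir(d) if f.endswith(".csv")}

# Simulate the browser finishing a download.
with open(os.path.join(d, "trends.csv"), "w") as fh:
    fh.write("Trends,Search volume\nexample,2000\n")

found = wait_for_new_csv(d, before)
print(os.path.basename(found))  # trends.csv
```

Snapshotting the directory *before* triggering the download is the key design choice: Chrome names the exported file itself, so the only reliable signal is "a CSV that did not exist a moment ago".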
{pytrends_modern-0.1.1 → pytrends_modern-0.1.2/pytrends_modern.egg-info}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pytrends-modern
-Version: 0.1.
+Version: 0.1.2
 Summary: Modern Google Trends API - Combining the best of pytrends, with RSS feeds, Selenium scraping, and enhanced features
 Author: pytrends-modern contributors
 License: MIT
@@ -389,6 +389,44 @@ This project builds upon and combines features from:
 - CLI interface
 - Multiple export formats
 
+## 📌 Important Notes
+
+### Google API Changes
+
+Google has deprecated several trending search API endpoints. **pytrends-modern provides two working alternatives:**
+
+#### Option 1: Fast RSS Feed (Recommended for most use cases)
+```python
+from pytrends_modern import TrendsRSS
+
+rss = TrendsRSS()
+trends = rss.get_trends(geo='US')  # ~0.7s, returns 10 trends with images/articles
+```
+**Pros:** Lightning fast, includes rich media, no browser needed
+**Cons:** Limited to 10 trends, no filtering options
+
+#### Option 2: Selenium Web Scraper (For complete data)
+```python
+from pytrends_modern import TrendsScraper
+
+scraper = TrendsScraper(headless=True)
+df = scraper.trending_searches(geo='US', hours=24)  # ~15s, returns 400+ trends
+scraper.close()
+```
+**Pros:** Complete data (400+ trends), supports categories/filters
+**Cons:** Slower, requires Chrome browser
+
+### Working Features
+✅ All core API methods work perfectly:
+- `interest_over_time()` - Historical search trends
+- `interest_by_region()` - Geographic distribution
+- `related_queries()` / `related_topics()` - Related searches
+- `suggestions()` - Keyword suggestions
+- And more!
+
+✅ RSS feeds for 125+ countries
+✅ Selenium scraper for comprehensive trending data
+
 ## ⚠️ Disclaimer
 
 This is an unofficial library and is not affiliated with or endorsed by Google. Use responsibly and in accordance with Google's Terms of Service.