webtools-cli 1.0.0__tar.gz → 1.0.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,162 @@
1
+ Metadata-Version: 2.4
2
+ Name: webtools-cli
3
+ Version: 1.0.3
4
+ Summary: Advanced Web Intelligence & Scraping Toolkit with CLI and Web UI
5
+ Author: Abhinav Adarsh
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/abhinavgautam08/webtools-cli
8
+ Keywords: web-scraping,osint,seo,intelligence,cli
9
+ Classifier: Development Status :: 4 - Beta
10
+ Classifier: Environment :: Console
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: Programming Language :: Python :: 3
13
+ Classifier: Programming Language :: Python :: 3.9
14
+ Classifier: Programming Language :: Python :: 3.10
15
+ Classifier: Programming Language :: Python :: 3.11
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Topic :: Internet :: WWW/HTTP
18
+ Requires-Python: >=3.9
19
+ Description-Content-Type: text/markdown
20
+ License-File: LICENSE
21
+ Requires-Dist: flask
22
+ Requires-Dist: requests
23
+ Requires-Dist: beautifulsoup4
24
+ Requires-Dist: qrcode
25
+ Requires-Dist: opencv-python
26
+ Requires-Dist: numpy
27
+ Requires-Dist: textblob
28
+ Requires-Dist: Pillow
29
+ Requires-Dist: mtranslate
30
+ Requires-Dist: colorama
31
+ Requires-Dist: pyreadline3; platform_system == "Windows"
32
+ Provides-Extra: playwright
33
+ Requires-Dist: playwright; extra == "playwright"
34
+ Dynamic: license-file
35
+
36
+ # WebTools CLI
37
+
38
+ [![PyPI Version](https://img.shields.io/pypi/v/webtools-cli)](https://pypi.org/project/webtools-cli/)
39
+ [![License](https://img.shields.io/github/license/abhinavgautam08/webtools-cli)](https://github.com/abhinavgautam08/webtools-cli/blob/main/LICENSE)
40
+ [![Python Version](https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white)](https://pypi.org/project/webtools-cli/)
41
+
42
+ ![WebTools CLI Interface](CLI.png)
43
+
44
+ WebTools CLI is an advanced web intelligence suite for researchers, OSINT enthusiasts, and developers. It brings the power of deep web analysis and automated scraping directly into your terminal, bridging the gap between a high-speed **Terminal UI** and a feature-rich **Cyber-themed Dashboard**.
45
+
46
+ ---
47
+
48
+ ## 🚀 Why WebTools CLI?
49
+
50
+ - **🎯 Stealth & Speed**: Smart proxy rotation and Turbo-Fetch logic for evasion and performance.
51
+ - **🧠 AI-Powered**: Automated content summarization, sentiment analysis, and readability scoring.
52
+ - **🔧 Security-Centric**: Built-in honeypot detection, threat leveling, and image forensic analysis.
53
+ - **💻 Terminal-First**: Designed for power users who live in the command line.
54
+ - **🛡️ Cross-Platform**: Works seamlessly on Windows, Linux, and macOS (with auto-download for Windows tunnels).
55
+ - **🔌 SPA Ready**: Automatic Playwright fallback for JavaScript-heavy sites like LinkedIn/Instagram.
56
+
57
+ ---
58
+
59
+ ## 📦 Installation
60
+
61
+ See the installation guide for recommended system specifications.
62
+
63
+ ### Quick Install
64
+
65
+ Install globally via pip:
66
+
67
+ ```bash
68
+ pip install webtools-cli
69
+ ```
70
+
71
+ To upgrade to the latest version:
72
+
73
+ ```bash
74
+ pip install webtools-cli --upgrade
75
+ ```
76
+
77
+ ### Optional Dependencies
78
+
79
+ For Single Page Application (SPA) support:
80
+
81
+ ```bash
82
+ playwright install chromium
83
+ ```
84
+
85
+ ---
86
+
87
+ ## 📋 Key Features
88
+
89
+ ### Advanced Scraping & Stealth
90
+ - **Smart Proxy Rotation**: Automatically rotates User-Agents and Proxies to evade detection.
91
+ - **Turbo-Fetch**: Parallel chunk downloads for large media (Videos/Images).
92
+ - **Deep Crawl**: Recursive link mapping up to 3 levels deep.
93
+ - **Headless Fallback**: Integrated Playwright support for auth-walled or SPA environments.
94
+
95
+ ### Intelligence & Security Analysis
96
+ - **OSINT Toolkit**: Auto-extract emails, phones, locations, social media, and tech stacks.
97
+ - **SEO Auditor**: Page score, heading hierarchy, link integrity, and image alt-text auditing.
98
+ - **Image Forensics**: CLI-based Error Level Analysis (ELA) and AI-likelihood detection.
99
+ - **Honeypot Detector**: Identifies hidden traps and anti-bot measures (Cloudflare/CAPTCHAs).
100
+
101
+ ### Modern Experience
102
+ - **Matrix Background**: "Flickering Grid" animated dashboard (Canvas-based).
103
+ - **Responsive Preview**: Live rendering that scales between desktop and mobile viewports.
104
+ - **History & Stats**: Phase-by-phase performance tracking and historical session management.
105
+
106
+ ---
107
+
108
+ ## 🚀 Getting Started
109
+
110
+ ### Basic Usage
111
+
112
+ #### Launch Interactive Menu
113
+ ```bash
114
+ webtools
115
+ ```
116
+
117
+ #### Non-Interactive Script Mode
118
+ ```bash
119
+ python -m webtools
120
+ ```
121
+
122
+ ### Slash Commands Reference
123
+
124
+ Navigate the suite using quick terminal commands:
125
+
126
+ | Command | Alias | Description |
127
+ |---------|-------|-------------|
128
+ | `/web` | `/w` | Launch **Web UI** (Cloudflare Tunnel + QR) |
129
+ | `/cli` | `/c` | Launch **CLI Intelligence** scan |
130
+ | `/image` | `/i` | **Image Forensics** & AI Likelihood |
131
+ | `/history`| `/hi`| View and manage scan history |
132
+ | `/help` | `/h` | Show full command documentation |
133
+ | `/clear` | - | Purge all locally scraped data |
134
+ | `/quit` | `/q` | Exit the application |
135
+
136
+ ---
137
+
138
+ ## ☁️ Deployment Options
139
+
140
+ - **Local Development**: Run on your machine with a generated QR code for mobile access.
141
+ - **Cloud Tunnels**: Automatic `cloudflared` integration to expose your UI globally.
142
+ - **Google Colab**: Compatible with Colab for cloud-based scraping (see badge above).
143
+
144
+ ---
145
+
146
+ ## 🤝 Resources & Support
147
+
148
+ - **[GitHub Repository](https://github.com/abhinavgautam08/webtools-cli)** - Source code and updates.
149
+ - **[Issue Tracker](https://github.com/abhinavgautam08/webtools-cli/issues)** - Report bugs or request features.
150
+ - **[License](./LICENSE)** - MIT License.
151
+
152
+ ---
153
+
154
+ ## ⚖️ Legal
155
+
156
+ This tool is for **educational and testing purposes only**. Always respect `robots.txt` and the Terms of Service of the websites you scrape. Neither the author nor the contributors are responsible for any misuse of this tool.
157
+
158
+ ---
159
+
160
+ <p align="center">
161
+ Built with ❤️ by <strong>Abhinav Adarsh</strong> and the open source community
162
+ </p>
@@ -0,0 +1,127 @@
1
+ # WebTools CLI
2
+
3
+ [![PyPI Version](https://img.shields.io/pypi/v/webtools-cli)](https://pypi.org/project/webtools-cli/)
4
+ [![License](https://img.shields.io/github/license/abhinavgautam08/webtools-cli)](https://github.com/abhinavgautam08/webtools-cli/blob/main/LICENSE)
5
+ [![Python Version](https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white)](https://pypi.org/project/webtools-cli/)
6
+
7
+ ![WebTools CLI Interface](CLI.png)
8
+
9
+ WebTools CLI is an advanced web intelligence suite for researchers, OSINT enthusiasts, and developers. It brings the power of deep web analysis and automated scraping directly into your terminal, bridging the gap between a high-speed **Terminal UI** and a feature-rich **Cyber-themed Dashboard**.
10
+
11
+ ---
12
+
13
+ ## 🚀 Why WebTools CLI?
14
+
15
+ - **🎯 Stealth & Speed**: Smart proxy rotation and Turbo-Fetch logic for evasion and performance.
16
+ - **🧠 AI-Powered**: Automated content summarization, sentiment analysis, and readability scoring.
17
+ - **🔧 Security-Centric**: Built-in honeypot detection, threat leveling, and image forensic analysis.
18
+ - **💻 Terminal-First**: Designed for power users who live in the command line.
19
+ - **🛡️ Cross-Platform**: Works seamlessly on Windows, Linux, and macOS (with auto-download for Windows tunnels).
20
+ - **🔌 SPA Ready**: Automatic Playwright fallback for JavaScript-heavy sites like LinkedIn/Instagram.
21
+
22
+ ---
23
+
24
+ ## 📦 Installation
25
+
26
+ See the installation guide for recommended system specifications.
27
+
28
+ ### Quick Install
29
+
30
+ Install globally via pip:
31
+
32
+ ```bash
33
+ pip install webtools-cli
34
+ ```
35
+
36
+ To upgrade to the latest version:
37
+
38
+ ```bash
39
+ pip install webtools-cli --upgrade
40
+ ```
41
+
42
+ ### Optional Dependencies
43
+
44
+ For Single Page Application (SPA) support:
45
+
46
+ ```bash
47
+ playwright install chromium
48
+ ```
49
+
50
+ ---
51
+
52
+ ## 📋 Key Features
53
+
54
+ ### Advanced Scraping & Stealth
55
+ - **Smart Proxy Rotation**: Automatically rotates User-Agents and Proxies to evade detection.
56
+ - **Turbo-Fetch**: Parallel chunk downloads for large media (Videos/Images).
57
+ - **Deep Crawl**: Recursive link mapping up to 3 levels deep.
58
+ - **Headless Fallback**: Integrated Playwright support for auth-walled or SPA environments.
59
+
60
+ ### Intelligence & Security Analysis
61
+ - **OSINT Toolkit**: Auto-extract emails, phones, locations, social media, and tech stacks.
62
+ - **SEO Auditor**: Page score, heading hierarchy, link integrity, and image alt-text auditing.
63
+ - **Image Forensics**: CLI-based Error Level Analysis (ELA) and AI-likelihood detection.
64
+ - **Honeypot Detector**: Identifies hidden traps and anti-bot measures (Cloudflare/CAPTCHAs).
65
+
66
+ ### Modern Experience
67
+ - **Matrix Background**: "Flickering Grid" animated dashboard (Canvas-based).
68
+ - **Responsive Preview**: Live rendering that scales between desktop and mobile viewports.
69
+ - **History & Stats**: Phase-by-phase performance tracking and historical session management.
70
+
71
+ ---
72
+
73
+ ## 🚀 Getting Started
74
+
75
+ ### Basic Usage
76
+
77
+ #### Launch Interactive Menu
78
+ ```bash
79
+ webtools
80
+ ```
81
+
82
+ #### Non-Interactive Script Mode
83
+ ```bash
84
+ python -m webtools
85
+ ```
86
+
87
+ ### Slash Commands Reference
88
+
89
+ Navigate the suite using quick terminal commands:
90
+
91
+ | Command | Alias | Description |
92
+ |---------|-------|-------------|
93
+ | `/web` | `/w` | Launch **Web UI** (Cloudflare Tunnel + QR) |
94
+ | `/cli` | `/c` | Launch **CLI Intelligence** scan |
95
+ | `/image` | `/i` | **Image Forensics** & AI Likelihood |
96
+ | `/history`| `/hi`| View and manage scan history |
97
+ | `/help` | `/h` | Show full command documentation |
98
+ | `/clear` | - | Purge all locally scraped data |
99
+ | `/quit` | `/q` | Exit the application |
100
+
101
+ ---
102
+
103
+ ## ☁️ Deployment Options
104
+
105
+ - **Local Development**: Run on your machine with a generated QR code for mobile access.
106
+ - **Cloud Tunnels**: Automatic `cloudflared` integration to expose your UI globally.
107
+ - **Google Colab**: Compatible with Colab for cloud-based scraping (see badge above).
108
+
109
+ ---
110
+
111
+ ## 🤝 Resources & Support
112
+
113
+ - **[GitHub Repository](https://github.com/abhinavgautam08/webtools-cli)** - Source code and updates.
114
+ - **[Issue Tracker](https://github.com/abhinavgautam08/webtools-cli/issues)** - Report bugs or request features.
115
+ - **[License](./LICENSE)** - MIT License.
116
+
117
+ ---
118
+
119
+ ## ⚖️ Legal
120
+
121
+ This tool is for **educational and testing purposes only**. Always respect `robots.txt` and the Terms of Service of the websites you scrape. Neither the author nor the contributors are responsible for any misuse of this tool.
122
+
123
+ ---
124
+
125
+ <p align="center">
126
+ Built with ❤️ by <strong>Abhinav Adarsh</strong> and the open source community
127
+ </p>
@@ -4,10 +4,10 @@ build-backend = "setuptools.build_meta"
  [project]
  name = "webtools-cli"
- version = "1.0.0"
+ version = "1.0.3"
  description = "Advanced Web Intelligence & Scraping Toolkit with CLI and Web UI"
  readme = "README.md"
- license = {text = "MIT"}
+ license = "MIT"
  requires-python = ">=3.9"
  authors = [
      {name = "Abhinav Adarsh"},
@@ -17,7 +17,6 @@ classifiers = [
      "Development Status :: 4 - Beta",
      "Environment :: Console",
      "Intended Audience :: Developers",
-     "License :: OSI Approved :: MIT License",
      "Programming Language :: Python :: 3",
      "Programming Language :: Python :: 3.9",
      "Programming Language :: Python :: 3.10",
@@ -46,7 +45,7 @@ playwright = ["playwright"]
  webtools = "webtools.cli:main"

  [project.urls]
- Homepage = "https://github.com/abhinavgautam08/webtools"
+ Homepage = "https://github.com/abhinavgautam08/webtools-cli"

  [tool.setuptools.packages.find]
  include = ["webtools*"]
@@ -4,7 +4,8 @@ sys.dont_write_bytecode = True
  # --- PACKAGE PATHS ---
  PACKAGE_DIR = os.path.dirname(os.path.abspath(__file__))
  DATA_DIR = os.path.join(os.path.expanduser('~'), '.webtools')
- os.makedirs(DATA_DIR, exist_ok=True)
+ SCRAPED_DIR = os.path.join(DATA_DIR, 'scraped')
+ os.makedirs(SCRAPED_DIR, exist_ok=True)
  try:
      from colorama import init, Fore, Style
      init(autoreset=True)
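
The hunk above moves scraped output from a path relative to the working directory to a per-user directory under the home folder. A minimal sketch of that layout, assuming a hypothetical `ensure_data_dirs` helper and `home` parameter (neither is part of the package), could look like this with `pathlib`:

```python
from pathlib import Path


def ensure_data_dirs(home=None):
    """Create the per-user data tree and return its paths.

    `ensure_data_dirs` and `home` are hypothetical names; the directory
    names mirror the ones that appear in the diff.
    """
    base = Path(home) if home is not None else Path.home()
    data_dir = base / ".webtools"
    scraped_dir = data_dir / "scraped"
    # parents=True also creates data_dir and scraped_dir on the way down.
    for leaf in (scraped_dir / "images", scraped_dir / "videos"):
        leaf.mkdir(parents=True, exist_ok=True)
    return {"data": data_dir, "scraped": scraped_dir}
```

Passing an explicit `home` keeps the helper easy to exercise against a temporary directory instead of the real home folder.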
@@ -97,9 +98,8 @@ log.setLevel(logging.ERROR)
  urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

  # Set up the output directories
- os.makedirs('webfiles/scraped', exist_ok=True)
- os.makedirs('webfiles/scraped/images', exist_ok=True)
- os.makedirs('webfiles/scraped/videos', exist_ok=True)
+ os.makedirs(os.path.join(SCRAPED_DIR, 'images'), exist_ok=True)
+ os.makedirs(os.path.join(SCRAPED_DIR, 'videos'), exist_ok=True)

  # --- PERFORMANCE AUDITOR ---
  class PerformanceTracker:
@@ -385,7 +385,7 @@ def serve_favicon():
  @app.route('/download/<path:filename>')
  def serve_scraped_file(filename):
-     return send_from_directory('webfiles/scraped', filename)
+     return send_from_directory(SCRAPED_DIR, filename)

  def scrape_with_playwright(url, proxy=None):
      if not PLAYWRIGHT_AVAILABLE:
@@ -561,9 +561,24 @@ def detect_tech_stack(soup, response):
      return list(stack)

+ def ensure_textblob_corpora():
+     """Ensure necessary NLTK corpora for TextBlob are downloaded"""
+     try:
+         import nltk
+         needed = ['punkt', 'brown', 'averaged_perceptron_tagger']
+         for corpus in needed:
+             try:
+                 nltk.data.find(f'tokenizers/{corpus}' if corpus == 'punkt' else f'corpora/{corpus}')
+             except (LookupError, AttributeError):
+                 print(f"Downloading required AI data: {corpus}...")
+                 nltk.download(corpus, quiet=True)
+     except:
+         pass
+
  def analyze_ai_content(text):
      """Analyze text (sentiment, summary, readability, and keywords)"""
      try:
+         ensure_textblob_corpora()
          from textblob import TextBlob
          import re
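
The lookup in `ensure_textblob_corpora` builds the resource path as `corpora/<name>` for everything except `punkt`. In NLTK's standard data layout the perceptron tagger lives under `taggers/`, so that check would presumably miss an already-installed tagger and fall through to the download branch each time. A hedged sketch of an explicit name-to-path mapping follows; `RESOURCE_PATHS`, `missing_resources`, and the injectable `find` callable are illustrative names, not part of the package or of NLTK.

```python
# Category prefixes follow NLTK's data directory layout.
RESOURCE_PATHS = {
    "punkt": "tokenizers/punkt",
    "brown": "corpora/brown",
    "averaged_perceptron_tagger": "taggers/averaged_perceptron_tagger",
}


def missing_resources(find, names=RESOURCE_PATHS):
    """Return the resource names whose lookup raises LookupError.

    `find` is any callable shaped like `nltk.data.find`; passing it in
    keeps this sketch importable without NLTK installed.
    """
    missing = []
    for name in names:
        try:
            find(RESOURCE_PATHS[name])
        except LookupError:
            missing.append(name)
    return missing
```

With NLTK available, `for name in missing_resources(nltk.data.find): nltk.download(name, quiet=True)` would fetch only what is absent.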
 
@@ -816,9 +831,9 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
816
831
  self.headers = {}
817
832
  response = MockResponse(pw_html)
818
833
  else:
819
- return jsonify({'error': f'Request failed and Playwright fallback failed: {str(e)}'}), 400
834
+ return {'success': False, 'error': f'Request failed and Playwright fallback failed: {str(e)}'}
820
835
  else:
821
- return jsonify({'error': f'Request failed: {str(e)}'}), 400
836
+ return {'success': False, 'error': f'Request failed: {str(e)}'}
822
837
 
823
838
  soup = BeautifulSoup(response.text, 'html.parser')
824
839
  perf_tracker.record_phase("HTML Parsing")
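
The two replaced `return` statements swap a Flask response tuple for a plain dictionary, which lets the scraping function be called outside a request context, for example from the CLI. One way to keep the HTTP status at the route boundary is sketched below. `scrape_result_error`, `to_http`, and the 400 default are assumptions for illustration, not code from the package.

```python
def scrape_result_error(message):
    """Build a failure payload in the shape used by the new return values."""
    return {"success": False, "error": message}


def to_http(result, error_status=400):
    """Map a plain result dict to a (payload, status) pair for a web route.

    Hypothetical helper: a Flask view could pass the payload to `jsonify`.
    A result without a `success` key is treated as successful.
    """
    status = 200 if result.get("success", True) else error_status
    return result, status
```

Keeping the core function free of web-framework objects means the same result can be printed by the terminal UI or serialized by the dashboard.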
@@ -990,7 +1005,7 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
990
1005
  else:
991
1006
  filename = f"{base}_{uuid.uuid4().hex[:8]}{ext}"
992
1007
 
993
- filepath = f'webfiles/scraped/videos/{filename}'
1008
+ filepath = os.path.join(SCRAPED_DIR, 'videos', filename)
994
1009
 
995
1010
  # Try TURBO FETCH first
996
1011
  if not download_file_turbo(v_url, filepath):
@@ -1161,7 +1176,7 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
1161
1176
 
1162
1177
  filename = f"{uuid.uuid4().hex[:8]}_{filename}"
1163
1178
 
1164
- filepath = f'webfiles/scraped/images/{filename}'
1179
+ filepath = os.path.join(SCRAPED_DIR, 'images', filename)
1165
1180
  with open(filepath, 'wb') as f:
1166
1181
  f.write(content)
1167
1182
  return (img_src, f'images/{filename}', f'/download/images/{filename}', image_hash, filepath)
@@ -1212,7 +1227,7 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
1212
1227
  # Collect image tasks
1213
1228
  image_tasks = []
1214
1229
  if fetch_images:
1215
- os.makedirs('webfiles/scraped/images', exist_ok=True)
1230
+ os.makedirs(os.path.join(SCRAPED_DIR, 'images'), exist_ok=True)
1216
1231
 
1217
1232
  # Identify video posters so they can be excluded
1218
1233
  poster_blacklist = set()
@@ -1277,7 +1292,7 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
1277
1292
  # Collect video tasks
1278
1293
  video_tasks = []
1279
1294
  if fetch_videos:
1280
- os.makedirs('webfiles/scraped/videos', exist_ok=True)
1295
+ os.makedirs(os.path.join(SCRAPED_DIR, 'videos'), exist_ok=True)
1281
1296
  # -------------------------------------------------------------------------
1282
1297
  # RESOURCE SNIFFER (Deep Scan)
1283
1298
  # Scans raw HTML/JS for hidden video links (mp4, m3u8, etc)
@@ -1461,11 +1476,11 @@ def execute_scrape_logic(url, fetch_images=False, fetch_videos=False, crawl_dept
1461
1476
  body.append(js_script)
1462
1477
  # Save the files
1463
1478
  html_content = str(soup)
1464
- with open('webfiles/scraped/index.html', 'w', encoding='utf-8') as f:
1479
+ with open(os.path.join(SCRAPED_DIR, 'index.html'), 'w', encoding='utf-8') as f:
1465
1480
  f.write(html_content)
1466
- with open('webfiles/scraped/style.css', 'w', encoding='utf-8') as f:
1481
+ with open(os.path.join(SCRAPED_DIR, 'style.css'), 'w', encoding='utf-8') as f:
1467
1482
  f.write('\n\n'.join(css_content) or '/* No CSS found */')
1468
- with open('webfiles/scraped/script.js', 'w', encoding='utf-8') as f:
1483
+ with open(os.path.join(SCRAPED_DIR, 'script.js'), 'w', encoding='utf-8') as f:
1469
1484
  f.write('\n\n'.join(js_content) or '// No JS found */')
1470
1485
  # Calculate stats
1471
1486
  def get_size(content):
@@ -1767,12 +1782,12 @@ def api_save():
  @app.route('/api/download-zip')
  def download_zip():
      try:
-         zip_path = '/tmp/scraped_files.zip'
+         zip_path = os.path.join(DATA_DIR, 'scraped_files.zip')
          with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
-             for root, dirs, files in os.walk('webfiles/scraped'):
+             for root, dirs, files in os.walk(SCRAPED_DIR):
                  for file in files:
                      file_path = os.path.join(root, file)
-                     arcname = os.path.relpath(file_path, 'webfiles/scraped')
+                     arcname = os.path.relpath(file_path, SCRAPED_DIR)
                      zipf.write(file_path, arcname)

          return send_file(zip_path, as_attachment=True, download_name='scraped_files.zip')
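
The archive is now written under `DATA_DIR` instead of a hard-coded `/tmp`, a path that typically does not exist on Windows. In the hunk the archive sits beside `SCRAPED_DIR` rather than inside it. The sketch below shows the same walk-and-zip idea with an explicit guard in case a caller passes a path within the tree; `zip_directory` is a hypothetical helper, not a function from the package.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir.

    Hypothetical helper; returns the sorted archive member names.
    """
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                file_path = os.path.join(root, name)
                # Skip the archive itself if it lives inside src_dir.
                if os.path.abspath(file_path) == zip_abs:
                    continue
                zipf.write(file_path, os.path.relpath(file_path, src_dir))
        return sorted(zipf.namelist())
```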
@@ -1781,12 +1796,10 @@ def download_zip():
  def clear_scraped_data():
      try:
-         folder = 'webfiles/scraped'
-         if os.path.exists(folder):
-             shutil.rmtree(folder)
-         os.makedirs('webfiles/scraped', exist_ok=True)
-         os.makedirs('webfiles/scraped/images', exist_ok=True)
-         os.makedirs('webfiles/scraped/videos', exist_ok=True)
+         if os.path.exists(SCRAPED_DIR):
+             shutil.rmtree(SCRAPED_DIR)
+         os.makedirs(os.path.join(SCRAPED_DIR, 'images'), exist_ok=True)
+         os.makedirs(os.path.join(SCRAPED_DIR, 'videos'), exist_ok=True)
          return True
      except Exception as e:
          print(f"Cleanup Error: {e}")
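
The cleanup drops the separate call that recreated the parent folder. That is safe because `os.makedirs` creates missing intermediate directories, so recreating `images` and `videos` also recreates `SCRAPED_DIR`. A small sketch of the same reset pattern, using a hypothetical `reset_tree` helper:

```python
import shutil
from pathlib import Path


def reset_tree(root, subdirs=("images", "videos")):
    """Delete root if present, then recreate it with empty subdirectories.

    Hypothetical helper mirroring the cleanup in the diff.
    """
    root = Path(root)
    if root.exists():
        shutil.rmtree(root)
    for sub in subdirs:
        # parents=True recreates root itself along with each subdirectory.
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```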
@@ -2156,11 +2169,20 @@ def start_cloudflare_tunnel(port):
2156
2169
  # Choose the executable for the current OS
2157
2170
  cf_executable = os.path.join(DATA_DIR, 'cloudflared.exe') if os.name == 'nt' else os.path.join(DATA_DIR, 'cloudflared')
2158
2171
 
2159
- # Download it if missing (Linux/Colab)
2160
- if not os.path.exists(cf_executable) and os.name != 'nt':
2161
- print("Downloading cloudflared...")
2162
- subprocess.run(['wget', '-q', '-O', cf_executable, 'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64'])
2163
- subprocess.run(['chmod', '+x', cf_executable])
2172
+ # Download it if missing
2173
+ if not os.path.exists(cf_executable):
2174
+ print(f"Downloading cloudflared for {os.name}...")
2175
+ if os.name == 'nt':
2176
+ # Windows binary download URL
2177
+ win_url = "https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-windows-amd64.exe"
2178
+ resp = requests.get(win_url, stream=True)
2179
+ with open(cf_executable, 'wb') as f:
2180
+ for chunk in resp.iter_content(chunk_size=8192):
2181
+ f.write(chunk)
2182
+ else:
2183
+ # Linux binary download URL
2184
+ subprocess.run(['wget', '-q', '-O', cf_executable, 'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64'])
2185
+ subprocess.run(['chmod', '+x', cf_executable])
2164
2186
 
2165
2187
  process = subprocess.Popen(
2166
2188
  [cf_executable, 'tunnel', '--protocol', 'http2', '--url', f'http://127.0.0.1:{port}'],
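
The new branch downloads the Windows binary with `requests` and keeps `wget` for every other platform. Since `os.name` is `posix` on macOS as well, the `else` branch would request the Linux build there. A hedged sketch of choosing the release asset by platform and streaming it to disk with the standard library is shown below. `pick_asset` and `download_binary` are illustrative names; only the two asset names that appear in the diff are mapped, and any other platform raises instead of guessing.

```python
import os
import platform
import shutil
import urllib.request

RELEASE_BASE = "https://github.com/cloudflare/cloudflared/releases/latest/download/"

# Only the assets named in the diff; other platforms are left unmapped.
ASSETS = {
    "Windows": "cloudflared-windows-amd64.exe",
    "Linux": "cloudflared-linux-amd64",
}


def pick_asset(system=None):
    """Return the download URL for the given platform.system() value."""
    system = system or platform.system()
    try:
        return RELEASE_BASE + ASSETS[system]
    except KeyError:
        raise RuntimeError(f"No cloudflared asset mapped for {system!r}") from None


def download_binary(url, dest, timeout=60):
    """Stream url to dest, then mark it executable on POSIX systems."""
    tmp = dest + ".part"
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
    os.replace(tmp, dest)  # only expose dest once the download has finished
    if os.name != "nt":
        os.chmod(dest, 0o755)
```

Writing to a temporary `.part` file first means an interrupted download does not leave a truncated executable at the path that the `os.path.exists` check looks for.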
@@ -0,0 +1,162 @@
1
+ Metadata-Version: 2.4
2
+ Name: webtools-cli
3
+ Version: 1.0.3
4
+ Summary: Advanced Web Intelligence & Scraping Toolkit with CLI and Web UI
5
+ Author: Abhinav Adarsh
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/abhinavgautam08/webtools-cli
8
+ Keywords: web-scraping,osint,seo,intelligence,cli
9
+ Classifier: Development Status :: 4 - Beta
10
+ Classifier: Environment :: Console
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: Programming Language :: Python :: 3
13
+ Classifier: Programming Language :: Python :: 3.9
14
+ Classifier: Programming Language :: Python :: 3.10
15
+ Classifier: Programming Language :: Python :: 3.11
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Topic :: Internet :: WWW/HTTP
18
+ Requires-Python: >=3.9
19
+ Description-Content-Type: text/markdown
20
+ License-File: LICENSE
21
+ Requires-Dist: flask
22
+ Requires-Dist: requests
23
+ Requires-Dist: beautifulsoup4
24
+ Requires-Dist: qrcode
25
+ Requires-Dist: opencv-python
26
+ Requires-Dist: numpy
27
+ Requires-Dist: textblob
28
+ Requires-Dist: Pillow
29
+ Requires-Dist: mtranslate
30
+ Requires-Dist: colorama
31
+ Requires-Dist: pyreadline3; platform_system == "Windows"
32
+ Provides-Extra: playwright
33
+ Requires-Dist: playwright; extra == "playwright"
34
+ Dynamic: license-file
35
+
36
+ # WebTools CLI
37
+
38
+ [![PyPI Version](https://img.shields.io/pypi/v/webtools-cli)](https://pypi.org/project/webtools-cli/)
39
+ [![License](https://img.shields.io/github/license/abhinavgautam08/webtools-cli)](https://github.com/abhinavgautam08/webtools-cli/blob/main/LICENSE)
40
+ [![Python Version](https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white)](https://pypi.org/project/webtools-cli/)
41
+
42
+ ![WebTools CLI Interface](CLI.png)
43
+
44
+ WebTools CLI is an advanced web intelligence suite for researchers, OSINT enthusiasts, and developers. It brings the power of deep web analysis and automated scraping directly into your terminal, bridging the gap between a high-speed **Terminal UI** and a feature-rich **Cyber-themed Dashboard**.
45
+
46
+ ---
47
+
48
+ ## 🚀 Why WebTools CLI?
49
+
50
+ - **🎯 Stealth & Speed**: Smart proxy rotation and Turbo-Fetch logic for evasion and performance.
51
+ - **🧠 AI-Powered**: Automated content summarization, sentiment analysis, and readability scoring.
52
+ - **🔧 Security-Centric**: Built-in honeypot detection, threat leveling, and image forensic analysis.
53
+ - **💻 Terminal-First**: Designed for power users who live in the command line.
54
+ - **🛡️ Cross-Platform**: Works seamlessly on Windows, Linux, and macOS (with auto-download for Windows tunnels).
55
+ - **🔌 SPA Ready**: Automatic Playwright fallback for JavaScript-heavy sites like LinkedIn/Instagram.
56
+
57
+ ---
58
+
59
+ ## 📦 Installation
60
+
61
+ See the installation guide for recommended system specifications.
62
+
63
+ ### Quick Install
64
+
65
+ Install globally via pip:
66
+
67
+ ```bash
68
+ pip install webtools-cli
69
+ ```
70
+
71
+ To upgrade to the latest version:
72
+
73
+ ```bash
74
+ pip install webtools-cli --upgrade
75
+ ```
76
+
77
+ ### Optional Dependencies
78
+
79
+ For Single Page Application (SPA) support:
80
+
81
+ ```bash
82
+ playwright install chromium
83
+ ```
84
+
85
+ ---
86
+
87
+ ## 📋 Key Features
88
+
89
+ ### Advanced Scraping & Stealth
90
+ - **Smart Proxy Rotation**: Automatically rotates User-Agents and Proxies to evade detection.
91
+ - **Turbo-Fetch**: Parallel chunk downloads for large media (Videos/Images).
92
+ - **Deep Crawl**: Recursive link mapping up to 3 levels deep.
93
+ - **Headless Fallback**: Integrated Playwright support for auth-walled or SPA environments.
94
+
95
+ ### Intelligence & Security Analysis
96
+ - **OSINT Toolkit**: Auto-extract emails, phones, locations, social media, and tech stacks.
97
+ - **SEO Auditor**: Page score, heading hierarchy, link integrity, and image alt-text auditing.
98
+ - **Image Forensics**: CLI-based Error Level Analysis (ELA) and AI-likelihood detection.
99
+ - **Honeypot Detector**: Identifies hidden traps and anti-bot measures (Cloudflare/CAPTCHAs).
100
+
101
+ ### Modern Experience
102
+ - **Matrix Background**: "Flickering Grid" animated dashboard (Canvas-based).
103
+ - **Responsive Preview**: Live rendering that scales between desktop and mobile viewports.
104
+ - **History & Stats**: Phase-by-phase performance tracking and historical session management.
105
+
106
+ ---
107
+
108
+ ## 🚀 Getting Started
109
+
110
+ ### Basic Usage
111
+
112
+ #### Launch Interactive Menu
113
+ ```bash
114
+ webtools
115
+ ```
116
+
117
+ #### Non-Interactive Script Mode
118
+ ```bash
119
+ python -m webtools
120
+ ```
121
+
122
+ ### Slash Commands Reference
123
+
124
+ Navigate the suite using quick terminal commands:
125
+
126
+ | Command | Alias | Description |
127
+ |---------|-------|-------------|
128
+ | `/web` | `/w` | Launch **Web UI** (Cloudflare Tunnel + QR) |
129
+ | `/cli` | `/c` | Launch **CLI Intelligence** scan |
130
+ | `/image` | `/i` | **Image Forensics** & AI Likelihood |
131
+ | `/history`| `/hi`| View and manage scan history |
132
+ | `/help` | `/h` | Show full command documentation |
133
+ | `/clear` | - | Purge all locally scraped data |
134
+ | `/quit` | `/q` | Exit the application |
135
+
136
+ ---
137
+
138
+ ## ☁️ Deployment Options
139
+
140
+ - **Local Development**: Run on your machine with a generated QR code for mobile access.
141
+ - **Cloud Tunnels**: Automatic `cloudflared` integration to expose your UI globally.
142
+ - **Google Colab**: Compatible with Colab for cloud-based scraping (see badge above).
143
+
144
+ ---
145
+
146
+ ## 🤝 Resources & Support
147
+
148
+ - **[GitHub Repository](https://github.com/abhinavgautam08/webtools-cli)** - Source code and updates.
149
+ - **[Issue Tracker](https://github.com/abhinavgautam08/webtools-cli/issues)** - Report bugs or request features.
150
+ - **[License](./LICENSE)** - MIT License.
151
+
152
+ ---
153
+
154
+ ## ⚖️ Legal
155
+
156
+ This tool is for **educational and testing purposes only**. Always respect `robots.txt` and the Terms of Service of the websites you scrape. Neither the author nor the contributors are responsible for any misuse of this tool.
157
+
158
+ ---
159
+
160
+ <p align="center">
161
+ Built with ❤️ by <strong>Abhinav Adarsh</strong> and the open source community
162
+ </p>
@@ -1,110 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: webtools-cli
3
- Version: 1.0.0
4
- Summary: Advanced Web Intelligence & Scraping Toolkit with CLI and Web UI
5
- Author: Abhinav Adarsh
6
- License: MIT
7
- Project-URL: Homepage, https://github.com/abhinavgautam08/webtools
8
- Keywords: web-scraping,osint,seo,intelligence,cli
9
- Classifier: Development Status :: 4 - Beta
10
- Classifier: Environment :: Console
11
- Classifier: Intended Audience :: Developers
12
- Classifier: License :: OSI Approved :: MIT License
13
- Classifier: Programming Language :: Python :: 3
14
- Classifier: Programming Language :: Python :: 3.9
15
- Classifier: Programming Language :: Python :: 3.10
16
- Classifier: Programming Language :: Python :: 3.11
17
- Classifier: Programming Language :: Python :: 3.12
18
- Classifier: Topic :: Internet :: WWW/HTTP
19
- Requires-Python: >=3.9
20
- Description-Content-Type: text/markdown
21
- License-File: LICENSE
22
- Requires-Dist: flask
23
- Requires-Dist: requests
24
- Requires-Dist: beautifulsoup4
25
- Requires-Dist: qrcode
26
- Requires-Dist: opencv-python
27
- Requires-Dist: numpy
28
- Requires-Dist: textblob
29
- Requires-Dist: Pillow
30
- Requires-Dist: mtranslate
31
- Requires-Dist: colorama
32
- Requires-Dist: pyreadline3; platform_system == "Windows"
33
- Provides-Extra: playwright
34
- Requires-Dist: playwright; extra == "playwright"
35
- Dynamic: license-file
36
-
37
- <p align="center">
38
- <img src="Web_Tools.png" alt="WebTools CLI" width="180">
39
- </p>
40
-
41
- <h1 align="center">WebTools CLI</h1>
42
-
43
- <p align="center">
44
- <strong>Advanced Web Intelligence & Scraping Toolkit</strong><br>
45
- <em>OSINT - SEO - AI Analysis - Security Scanner</em>
46
- </p>
47
-
48
- <p align="center">
49
- <img src="https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white" alt="Python">
50
- <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
51
- <img src="https://img.shields.io/badge/version-1.0.0-cyan" alt="Version">
52
- </p>
53
-
54
- ---
55
-
56
- ## Install
57
-
58
- ```bash
59
- pip install webtools-cli
60
- ```
61
-
62
- ## Usage
63
-
64
- ```bash
65
- # Launch CLI
66
- webtools
67
-
68
- # Or via Python module
69
- python -m webtools
70
- ```
71
-
72
- ## Features
73
-
74
- | Feature | Description |
75
- |---------|-------------|
76
- | **Web Mode** | Full web UI with Cloudflare tunnel + QR code sharing |
77
- | **CLI Intelligence** | Deep scan any URL from terminal |
78
- | **Security Scanner** | Threat detection, honeypot traps, CSRF checks |
79
- | **SEO Analyzer** | Score, headings, broken links, image audit |
80
- | **AI Analysis** | Sentiment, readability, keywords, summarization |
81
- | **OSINT** | Emails, phones, locations, social media, tech stack |
82
- | **Smart Media** | Image quality filter + video deep-scan with sniffer |
83
- | **Proxy Intelligence** | Smart proxy rotation with learning algorithm |
84
- | **Playwright Fallback** | Handles SPAs and auth walls automatically |
85
- | **Performance Tracker** | Phase-by-phase timing with historical stats |
86
-
87
- ## CLI Commands
88
-
89
- | Command | Description |
90
- |---------|-------------|
91
- | `/web` or `/w` | Launch Web UI mode |
92
- | `/cli` or `/c` | Launch CLI Intelligence mode |
93
- | `/image` or `/i` | Image Forensics & AI Detection |
94
- | `/help` or `/h` | Show all commands |
95
- | `/history` or `/hi` | View scan history |
96
- | `/clear` | Purge scraped data |
97
- | `/quit` or `/q` | Exit |
98
-
99
- ## Requirements
100
-
101
- - Python 3.9+
102
- - Optional: `playwright` for SPA/auth wall bypass
103
-
104
- ## Author
105
-
106
- **Abhinav Adarsh**
107
-
108
- ## License
109
-
110
- MIT
@@ -1,74 +0,0 @@
1
- <p align="center">
2
- <img src="Web_Tools.png" alt="WebTools CLI" width="180">
3
- </p>
4
-
5
- <h1 align="center">WebTools CLI</h1>
6
-
7
- <p align="center">
8
- <strong>Advanced Web Intelligence & Scraping Toolkit</strong><br>
9
- <em>OSINT - SEO - AI Analysis - Security Scanner</em>
10
- </p>
11
-
12
- <p align="center">
13
- <img src="https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white" alt="Python">
14
- <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
15
- <img src="https://img.shields.io/badge/version-1.0.0-cyan" alt="Version">
16
- </p>
17
-
18
- ---
19
-
20
- ## Install
21
-
22
- ```bash
23
- pip install webtools-cli
24
- ```
25
-
26
- ## Usage
27
-
28
- ```bash
29
- # Launch CLI
30
- webtools
31
-
32
- # Or via Python module
33
- python -m webtools
34
- ```
35
-
36
- ## Features
37
-
38
- | Feature | Description |
39
- |---------|-------------|
40
- | **Web Mode** | Full web UI with Cloudflare tunnel + QR code sharing |
41
- | **CLI Intelligence** | Deep scan any URL from terminal |
42
- | **Security Scanner** | Threat detection, honeypot traps, CSRF checks |
43
- | **SEO Analyzer** | Score, headings, broken links, image audit |
44
- | **AI Analysis** | Sentiment, readability, keywords, summarization |
45
- | **OSINT** | Emails, phones, locations, social media, tech stack |
46
- | **Smart Media** | Image quality filter + video deep-scan with sniffer |
47
- | **Proxy Intelligence** | Smart proxy rotation with learning algorithm |
48
- | **Playwright Fallback** | Handles SPAs and auth walls automatically |
49
- | **Performance Tracker** | Phase-by-phase timing with historical stats |
50
-
51
- ## CLI Commands
52
-
53
- | Command | Description |
54
- |---------|-------------|
55
- | `/web` or `/w` | Launch Web UI mode |
56
- | `/cli` or `/c` | Launch CLI Intelligence mode |
57
- | `/image` or `/i` | Image Forensics & AI Detection |
58
- | `/help` or `/h` | Show all commands |
59
- | `/history` or `/hi` | View scan history |
60
- | `/clear` | Purge scraped data |
61
- | `/quit` or `/q` | Exit |
62
-
63
- ## Requirements
64
-
65
- - Python 3.9+
66
- - Optional: `playwright` for SPA/auth wall bypass
67
-
68
- ## Author
69
-
70
- **Abhinav Adarsh**
71
-
72
- ## License
73
-
74
- MIT
@@ -1,110 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: webtools-cli
3
- Version: 1.0.0
4
- Summary: Advanced Web Intelligence & Scraping Toolkit with CLI and Web UI
5
- Author: Abhinav Adarsh
6
- License: MIT
7
- Project-URL: Homepage, https://github.com/abhinavgautam08/webtools
8
- Keywords: web-scraping,osint,seo,intelligence,cli
9
- Classifier: Development Status :: 4 - Beta
10
- Classifier: Environment :: Console
11
- Classifier: Intended Audience :: Developers
12
- Classifier: License :: OSI Approved :: MIT License
13
- Classifier: Programming Language :: Python :: 3
14
- Classifier: Programming Language :: Python :: 3.9
15
- Classifier: Programming Language :: Python :: 3.10
16
- Classifier: Programming Language :: Python :: 3.11
17
- Classifier: Programming Language :: Python :: 3.12
18
- Classifier: Topic :: Internet :: WWW/HTTP
19
- Requires-Python: >=3.9
20
- Description-Content-Type: text/markdown
21
- License-File: LICENSE
22
- Requires-Dist: flask
23
- Requires-Dist: requests
24
- Requires-Dist: beautifulsoup4
25
- Requires-Dist: qrcode
26
- Requires-Dist: opencv-python
27
- Requires-Dist: numpy
28
- Requires-Dist: textblob
29
- Requires-Dist: Pillow
30
- Requires-Dist: mtranslate
31
- Requires-Dist: colorama
32
- Requires-Dist: pyreadline3; platform_system == "Windows"
33
- Provides-Extra: playwright
34
- Requires-Dist: playwright; extra == "playwright"
35
- Dynamic: license-file
36
-
37
- <p align="center">
38
- <img src="Web_Tools.png" alt="WebTools CLI" width="180">
39
- </p>
40
-
41
- <h1 align="center">WebTools CLI</h1>
42
-
43
- <p align="center">
44
- <strong>Advanced Web Intelligence & Scraping Toolkit</strong><br>
45
- <em>OSINT - SEO - AI Analysis - Security Scanner</em>
46
- </p>
47
-
48
- <p align="center">
49
- <img src="https://img.shields.io/badge/python-3.9+-blue?logo=python&logoColor=white" alt="Python">
50
- <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
51
- <img src="https://img.shields.io/badge/version-1.0.0-cyan" alt="Version">
52
- </p>
53
-
54
- ---
55
-
56
- ## Install
57
-
58
- ```bash
59
- pip install webtools-cli
60
- ```
61
-
62
- ## Usage
63
-
64
- ```bash
65
- # Launch CLI
66
- webtools
67
-
68
- # Or via Python module
69
- python -m webtools
70
- ```
71
-
72
- ## Features
73
-
74
- | Feature | Description |
75
- |---------|-------------|
76
- | **Web Mode** | Full web UI with Cloudflare tunnel + QR code sharing |
77
- | **CLI Intelligence** | Deep scan any URL from terminal |
78
- | **Security Scanner** | Threat detection, honeypot traps, CSRF checks |
79
- | **SEO Analyzer** | Score, headings, broken links, image audit |
80
- | **AI Analysis** | Sentiment, readability, keywords, summarization |
81
- | **OSINT** | Emails, phones, locations, social media, tech stack |
82
- | **Smart Media** | Image quality filter + video deep-scan with sniffer |
83
- | **Proxy Intelligence** | Smart proxy rotation with learning algorithm |
84
- | **Playwright Fallback** | Handles SPAs and auth walls automatically |
85
- | **Performance Tracker** | Phase-by-phase timing with historical stats |
86
-
87
- ## CLI Commands
88
-
89
- | Command | Description |
90
- |---------|-------------|
91
- | `/web` or `/w` | Launch Web UI mode |
92
- | `/cli` or `/c` | Launch CLI Intelligence mode |
93
- | `/image` or `/i` | Image Forensics & AI Detection |
94
- | `/help` or `/h` | Show all commands |
95
- | `/history` or `/hi` | View scan history |
96
- | `/clear` | Purge scraped data |
97
- | `/quit` or `/q` | Exit |
98
-
99
- ## Requirements
100
-
101
- - Python 3.9+
102
- - Optional: `playwright` for SPA/auth wall bypass
103
-
104
- ## Author
105
-
106
- **Abhinav Adarsh**
107
-
108
- ## License
109
-
110
- MIT
File without changes
File without changes