pydoll-python 1.6.0__tar.gz → 2.0.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- pydoll_python-2.0.0/PKG-INFO +377 -0
- pydoll_python-2.0.0/README.md +357 -0
- pydoll_python-2.0.0/pydoll/browser/__init__.py +4 -0
- pydoll_python-2.0.0/pydoll/browser/chromium/__init__.py +7 -0
- pydoll_python-2.0.0/pydoll/browser/chromium/base.py +542 -0
- pydoll_python-2.0.0/pydoll/browser/chromium/chrome.py +61 -0
- pydoll_python-2.0.0/pydoll/browser/chromium/edge.py +67 -0
- pydoll_python-2.0.0/pydoll/browser/interfaces.py +27 -0
- pydoll_python-2.0.0/pydoll/browser/managers/__init__.py +15 -0
- pydoll_python-2.0.0/pydoll/browser/managers/browser_options_manager.py +46 -0
- pydoll_python-2.0.0/pydoll/browser/managers/browser_process_manager.py +72 -0
- pydoll_python-2.0.0/pydoll/browser/managers/proxy_manager.py +85 -0
- pydoll_python-2.0.0/pydoll/browser/managers/temp_dir_manager.py +96 -0
- {pydoll_python-1.6.0 → pydoll_python-2.0.0}/pydoll/browser/options.py +9 -15
- pydoll_python-2.0.0/pydoll/browser/tab.py +695 -0
- pydoll_python-2.0.0/pydoll/commands/__init__.py +22 -0
- pydoll_python-2.0.0/pydoll/commands/browser_commands.py +239 -0
- pydoll_python-2.0.0/pydoll/commands/dom_commands.py +1272 -0
- pydoll_python-2.0.0/pydoll/commands/fetch_commands.py +310 -0
- pydoll_python-2.0.0/pydoll/commands/input_commands.py +615 -0
- pydoll_python-2.0.0/pydoll/commands/network_commands.py +945 -0
- pydoll_python-2.0.0/pydoll/commands/page_commands.py +848 -0
- pydoll_python-2.0.0/pydoll/commands/runtime_commands.py +512 -0
- pydoll_python-2.0.0/pydoll/commands/storage_commands.py +726 -0
- pydoll_python-2.0.0/pydoll/commands/target_commands.py +400 -0
- pydoll_python-2.0.0/pydoll/connection/__init__.py +5 -0
- pydoll_python-2.0.0/pydoll/connection/connection_handler.py +259 -0
- pydoll_python-2.0.0/pydoll/connection/managers/__init__.py +7 -0
- pydoll_python-2.0.0/pydoll/connection/managers/commands_manager.py +47 -0
- pydoll_python-2.0.0/pydoll/connection/managers/events_manager.py +115 -0
- pydoll_python-2.0.0/pydoll/constants.py +915 -0
- pydoll_python-2.0.0/pydoll/elements/mixins/__init__.py +5 -0
- pydoll_python-2.0.0/pydoll/elements/mixins/find_elements_mixin.py +501 -0
- pydoll_python-2.0.0/pydoll/elements/web_element.py +397 -0
- pydoll_python-2.0.0/pydoll/exceptions.py +235 -0
- pydoll_python-2.0.0/pydoll/protocol/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/base.py +47 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/events.py +35 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/methods.py +23 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/params.py +48 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/responses.py +32 -0
- pydoll_python-2.0.0/pydoll/protocol/browser/types.py +13 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/events.py +149 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/methods.py +56 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/params.py +228 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/responses.py +253 -0
- pydoll_python-2.0.0/pydoll/protocol/dom/types.py +81 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/events.py +57 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/methods.py +13 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/params.py +58 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/responses.py +18 -0
- pydoll_python-2.0.0/pydoll/protocol/fetch/types.py +22 -0
- pydoll_python-2.0.0/pydoll/protocol/input/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/input/events.py +20 -0
- pydoll_python-2.0.0/pydoll/protocol/input/methods.py +17 -0
- pydoll_python-2.0.0/pydoll/protocol/input/params.py +132 -0
- pydoll_python-2.0.0/pydoll/protocol/input/types.py +28 -0
- pydoll_python-2.0.0/pydoll/protocol/network/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/network/events.py +518 -0
- pydoll_python-2.0.0/pydoll/protocol/network/methods.py +35 -0
- pydoll_python-2.0.0/pydoll/protocol/network/params.py +214 -0
- pydoll_python-2.0.0/pydoll/protocol/network/responses.py +180 -0
- pydoll_python-2.0.0/pydoll/protocol/network/types.py +152 -0
- pydoll_python-2.0.0/pydoll/protocol/page/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/page/events.py +257 -0
- pydoll_python-2.0.0/pydoll/protocol/page/methods.py +52 -0
- pydoll_python-2.0.0/pydoll/protocol/page/params.py +242 -0
- pydoll_python-2.0.0/pydoll/protocol/page/responses.py +245 -0
- pydoll_python-2.0.0/pydoll/protocol/page/types.py +263 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/events.py +93 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/methods.py +27 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/params.py +113 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/responses.py +104 -0
- pydoll_python-2.0.0/pydoll/protocol/runtime/types.py +130 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/events.py +186 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/methods.py +43 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/params.py +154 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/responses.py +109 -0
- pydoll_python-2.0.0/pydoll/protocol/storage/types.py +36 -0
- pydoll_python-2.0.0/pydoll/protocol/target/__init__.py +1 -0
- pydoll_python-2.0.0/pydoll/protocol/target/events.py +77 -0
- pydoll_python-2.0.0/pydoll/protocol/target/methods.py +21 -0
- pydoll_python-2.0.0/pydoll/protocol/target/params.py +88 -0
- pydoll_python-2.0.0/pydoll/protocol/target/responses.py +59 -0
- pydoll_python-2.0.0/pydoll/protocol/target/types.py +19 -0
- pydoll_python-2.0.0/pydoll/utils.py +72 -0
- {pydoll_python-1.6.0 → pydoll_python-2.0.0}/pyproject.toml +14 -4
- pydoll_python-1.6.0/PKG-INFO +0 -766
- pydoll_python-1.6.0/README.md +0 -746
- pydoll_python-1.6.0/pydoll/browser/__init__.py +0 -4
- pydoll_python-1.6.0/pydoll/browser/base.py +0 -595
- pydoll_python-1.6.0/pydoll/browser/chrome.py +0 -69
- pydoll_python-1.6.0/pydoll/browser/constants.py +0 -6
- pydoll_python-1.6.0/pydoll/browser/edge.py +0 -78
- pydoll_python-1.6.0/pydoll/browser/managers.py +0 -387
- pydoll_python-1.6.0/pydoll/browser/page.py +0 -783
- pydoll_python-1.6.0/pydoll/commands/__init__.py +0 -22
- pydoll_python-1.6.0/pydoll/commands/browser.py +0 -127
- pydoll_python-1.6.0/pydoll/commands/dom.py +0 -397
- pydoll_python-1.6.0/pydoll/commands/fetch.py +0 -315
- pydoll_python-1.6.0/pydoll/commands/input.py +0 -195
- pydoll_python-1.6.0/pydoll/commands/network.py +0 -364
- pydoll_python-1.6.0/pydoll/commands/page.py +0 -210
- pydoll_python-1.6.0/pydoll/commands/runtime.py +0 -92
- pydoll_python-1.6.0/pydoll/commands/storage.py +0 -54
- pydoll_python-1.6.0/pydoll/commands/target.py +0 -101
- pydoll_python-1.6.0/pydoll/common/__init__.py +0 -1
- pydoll_python-1.6.0/pydoll/common/keyboard.py +0 -101
- pydoll_python-1.6.0/pydoll/common/keys.py +0 -52
- pydoll_python-1.6.0/pydoll/connection/connection.py +0 -419
- pydoll_python-1.6.0/pydoll/connection/managers.py +0 -262
- pydoll_python-1.6.0/pydoll/constants.py +0 -125
- pydoll_python-1.6.0/pydoll/element.py +0 -523
- pydoll_python-1.6.0/pydoll/events/__init__.py +0 -13
- pydoll_python-1.6.0/pydoll/events/browser.py +0 -26
- pydoll_python-1.6.0/pydoll/events/dom.py +0 -108
- pydoll_python-1.6.0/pydoll/events/fetch.py +0 -29
- pydoll_python-1.6.0/pydoll/events/network.py +0 -160
- pydoll_python-1.6.0/pydoll/events/page.py +0 -163
- pydoll_python-1.6.0/pydoll/exceptions.py +0 -99
- pydoll_python-1.6.0/pydoll/mixins/__init__.py +0 -0
- pydoll_python-1.6.0/pydoll/mixins/find_elements.py +0 -244
- pydoll_python-1.6.0/pydoll/utils.py +0 -50
- {pydoll_python-1.6.0 → pydoll_python-2.0.0}/LICENSE +0 -0
- {pydoll_python-1.6.0 → pydoll_python-2.0.0}/pydoll/__init__.py +0 -0
- {pydoll_python-1.6.0/pydoll/connection → pydoll_python-2.0.0/pydoll/elements}/__init__.py +0 -0
@@ -0,0 +1,377 @@
Metadata-Version: 2.3
Name: pydoll-python
Version: 2.0.0
Summary:
Author: Thalison Fernandes
Author-email: thalissfernandes99@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: aiofiles (>=23.2.1,<24.0.0)
Requires-Dist: aiohttp (>=3.9.5,<4.0.0)
Requires-Dist: bs4 (>=0.0.2,<0.0.3)
Requires-Dist: mkdocstrings (>=0.29.1,<0.30.0)
Requires-Dist: websockets (>=13.1,<14.0)
Description-Content-Type: text/markdown

<p align="center">
    <img src="https://github.com/user-attachments/assets/219f2dbc-37ed-4aea-a289-ba39cdbb335d" alt="Pydoll Logo" /> <br><br>
</p>

<p align="center">
    <a href="https://codecov.io/gh/autoscrape-labs/pydoll">
        <img src="https://codecov.io/gh/autoscrape-labs/pydoll/graph/badge.svg?token=40I938OGM9"/>
    </a>
    <img src="https://github.com/thalissonvs/pydoll/actions/workflows/tests.yml/badge.svg" alt="Tests">
    <img src="https://github.com/thalissonvs/pydoll/actions/workflows/ruff-ci.yml/badge.svg" alt="Ruff CI">
    <img src="https://github.com/thalissonvs/pydoll/actions/workflows/release.yml/badge.svg" alt="Release">
    <img src="https://github.com/thalissonvs/pydoll/actions/workflows/mypy.yml/badge.svg" alt="MyPy CI">
</p>

<p align="center">
    <a href="https://autoscrape-labs.github.io/pydoll/">Documentation</a> •
    <a href="#getting-started">Getting Started</a> •
    <a href="#advanced-features">Advanced Features</a> •
    <a href="#contributing">Contributing</a> •
    <a href="#support-my-work">Support</a> •
    <a href="#license">License</a>
</p>

## Why Pydoll Exists

Picture this: you need to automate browser tasks. Maybe it's testing your web application, scraping data from websites, or automating repetitive processes. Traditionally, this meant dealing with external drivers, complex configurations, and a host of compatibility issues that seemed to appear out of nowhere.

**Pydoll was born to change that.**

Built from the ground up with a different philosophy, Pydoll connects directly to the Chrome DevTools Protocol (CDP), eliminating the need for external drivers entirely. This isn't just a technical change - it's a revolution in how you interact with browsers through Python.

We believe that powerful automation shouldn't require you to become a configuration expert. With Pydoll, you focus on what matters: your automation logic, not the underlying complexity.
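
To make "connects directly to CDP" more concrete, here is a minimal sketch of the JSON message a CDP client sends over the browser's WebSocket to navigate a page. The `id`/`method`/`params` envelope and the `Page.navigate` method come from the Chrome DevTools Protocol; the helper name `build_cdp_command` is hypothetical, and this is not a copy of Pydoll's internal code.

```python
import itertools
import json

# Each CDP command carries a unique integer id so its response can be matched to it.
_command_ids = itertools.count(1)


def build_cdp_command(method, params=None):
    """Serialize one CDP command (hypothetical helper, not part of Pydoll)."""
    message = {'id': next(_command_ids), 'method': method, 'params': params or {}}
    return json.dumps(message)


navigate = build_cdp_command('Page.navigate', {'url': 'https://example.com'})
print(navigate)
```

A driverless library sends messages of this shape straight to the browser, which is why no separate WebDriver binary is involved.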

## What Makes Pydoll Special

**Genuine Simplicity**: We don't want you wasting time configuring drivers or dealing with compatibility issues. With Pydoll, you install and you're ready to automate.

**Truly Human Interactions**: Our algorithms simulate real human behavior patterns - from timing between clicks to how the mouse moves across the screen.

**Native Async Performance**: Built from the ground up with `asyncio`, Pydoll doesn't just support asynchronous operations - it was designed for them.

**Integrated Intelligence**: Automatic bypass of Cloudflare Turnstile and reCAPTCHA v3 captchas, without external services or complex configurations.

**Powerful Network Monitoring**: Intercept, modify, and analyze all network traffic with ease, giving you complete control over requests.

**Event-Driven Architecture**: React to page events, network requests, and user interactions in real-time.

**Intuitive Element Finding**: Modern `find()` and `query()` methods that make sense and work as you'd expect.

**Robust Type Safety**: Comprehensive type system for better IDE support and error prevention.

## Installation

```bash
pip install pydoll-python
```

That's it. No drivers to download, no complex configurations. Just install and start automating.

## Getting Started

### Your First Automation

Let's start with something simple. The code below opens a browser, navigates to a website, and interacts with elements:

```python
import asyncio
from pydoll.browser import Chrome

async def my_first_automation():
    # Create a browser instance
    async with Chrome() as browser:
        # Start the browser and get a tab
        tab = await browser.start()

        # Navigate to a website
        await tab.go_to('https://example.com')

        # Find elements intuitively
        button = await tab.find(tag_name='button', class_name='submit')
        await button.click()

        # Or use CSS selectors/XPath directly
        link = await tab.query('a[href*="contact"]')
        await link.click()

# Run the automation
asyncio.run(my_first_automation())
```

### Custom Configuration

Sometimes you need more control. Pydoll offers flexible configuration options:

```python
import asyncio
from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions

async def custom_automation():
    # Configure browser options
    options = ChromiumOptions()
    options.add_argument('--proxy-server=username:password@ip:port')
    options.add_argument('--window-size=1920,1080')
    options.add_argument('--disable-web-security')
    options.binary_location = '/path/to/your/browser'

    async with Chrome(options=options) as browser:
        tab = await browser.start()

        # Your automation code here
        await tab.go_to('https://example.com')

        # The browser is now using your custom settings

asyncio.run(custom_automation())
```
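
The proxy argument above embeds credentials in the `--proxy-server` value, and the file list for this release includes a `proxy_manager.py` module. As a rough illustration of what separating those credentials could look like, here is a minimal stdlib-only sketch. The helper `split_proxy_argument` is hypothetical and is not Pydoll's implementation; it also does not handle passwords that themselves contain `://`.

```python
def split_proxy_argument(argument):
    """Split '--proxy-server=user:pass@host:port' into (clean_argument, credentials).

    Hypothetical helper for illustration only; credentials is None when the
    argument carries no 'user:pass@' part.
    """
    prefix = '--proxy-server='
    if not argument.startswith(prefix):
        raise ValueError(f'not a proxy argument: {argument!r}')
    value = argument[len(prefix):]
    # Keep an optional scheme such as 'http://' or 'socks5://' in front of the host.
    scheme, separator, rest = value.rpartition('://')
    if '@' not in rest:
        return argument, None
    userinfo, _, host = rest.rpartition('@')
    username, _, password = userinfo.partition(':')
    return f'{prefix}{scheme}{separator}{host}', (username, password)


clean, credentials = split_proxy_argument('--proxy-server=username:password@ip:port')
print(clean, credentials)
```

Keeping the credentials apart from the launch argument lets the caller answer the proxy's authentication challenge separately instead of relying on the command line.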

## Advanced Features

### Intelligent Captcha Bypass

One of Pydoll's most impressive features is its ability to automatically handle Cloudflare Turnstile captchas. This means fewer interruptions and smoother automations:

```python
import asyncio
from pydoll.browser import Chrome

async def bypass_cloudflare():
    async with Chrome() as browser:
        tab = await browser.start()

        # Method 1: Context manager (waits for captcha completion)
        async with tab.expect_and_bypass_cloudflare_captcha():
            await tab.go_to('https://site-with-cloudflare.com')
        print("Captcha automatically solved!")

        # Method 2: Background processing
        await tab.enable_auto_solve_cloudflare_captcha()
        await tab.go_to('https://another-protected-site.com')
        # Captcha solved in background while code continues

        await tab.disable_auto_solve_cloudflare_captcha()

asyncio.run(bypass_cloudflare())
```

### Advanced Element Finding

Pydoll offers multiple intuitive ways to find elements. No matter how you prefer to work, we have an approach that makes sense for you:

```python
import asyncio
from pydoll.browser import Chrome

async def element_finding_examples():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Find by attributes (most intuitive)
        submit_btn = await tab.find(
            tag_name='button',
            class_name='btn-primary',
            text='Submit'
        )

        # Find by ID
        username_field = await tab.find(id='username')

        # Find multiple elements
        all_links = await tab.find(tag_name='a', find_all=True)

        # CSS selectors and XPath
        nav_menu = await tab.query('nav.main-menu')
        specific_item = await tab.query('//div[@data-testid="item-123"]')

        # With timeout and error handling
        delayed_element = await tab.find(
            class_name='dynamic-content',
            timeout=10,
            raise_exc=False  # Returns None if not found
        )

        # Advanced: Custom attributes
        custom_element = await tab.find(
            data_testid='submit-button',
            aria_label='Submit form'
        )

asyncio.run(element_finding_examples())
```
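
One way to picture how keyword-based lookups and a dual CSS/XPath `query()` could work is to translate keyword arguments into a single XPath expression and to guess the selector type from its first characters. The sketch below is an assumption-laden illustration: `build_xpath` and `looks_like_xpath` are hypothetical names, the underscore-to-hyphen mapping for custom attributes is an assumption, and values are not escaped, so it should not be read as Pydoll's actual implementation.

```python
def build_xpath(tag_name=None, id=None, class_name=None, text=None, **attributes):
    """Combine keyword filters into one XPath expression (hypothetical helper)."""
    tag = tag_name or '*'
    conditions = []
    if id is not None:
        conditions.append(f'@id="{id}"')
    if class_name is not None:
        # Match one class token inside a space-separated class attribute.
        conditions.append(
            f'contains(concat(" ", normalize-space(@class), " "), " {class_name} ")'
        )
    if text is not None:
        conditions.append(f'contains(text(), "{text}")')
    for name, value in attributes.items():
        # Assumption: data_testid is meant as data-testid, since Python
        # keyword names cannot contain hyphens.
        conditions.append(f'@{name.replace("_", "-")}="{value}"')
    predicate = ' and '.join(conditions)
    return f'//{tag}[{predicate}]' if predicate else f'//{tag}'


def looks_like_xpath(expression):
    """Heuristic: XPath expressions usually start with '/', './' or '(/'."""
    return expression.startswith(('/', './', '(/'))


print(build_xpath(tag_name='button', id='send'))
print(looks_like_xpath('nav.main-menu'))
```

Building one expression from all filters means every condition must match the same element, which is the behaviour the combined `tag_name`/`class_name`/`text` call above suggests.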

### Concurrent Automation

One of the great advantages of Pydoll's asynchronous design is the ability to process multiple tasks simultaneously:

```python
import asyncio
from pydoll.browser import Chrome

async def scrape_page(url):
    """Extract data from a single page"""
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to(url)

        title = await tab.execute_script('return document.title')
        links = await tab.find(tag_name='a', find_all=True)

        return {
            'url': url,
            'title': title,
            'link_count': len(links)
        }

async def concurrent_scraping():
    urls = [
        'https://example1.com',
        'https://example2.com',
        'https://example3.com'
    ]

    # Process all URLs simultaneously
    tasks = [scrape_page(url) for url in urls]
    results = await asyncio.gather(*tasks)

    for result in results:
        print(f"{result['url']}: {result['title']} ({result['link_count']} links)")

asyncio.run(concurrent_scraping())
```
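
Because each `scrape_page` call in the example starts its own browser, an unbounded `asyncio.gather` over a long URL list may start more browsers than the machine can comfortably run. A common pattern is to cap concurrency with `asyncio.Semaphore`. The sketch below uses only the standard library; `fetch_title` is a hypothetical stand-in that simulates latency rather than driving a browser.

```python
import asyncio


async def fetch_title(url):
    """Stand-in for a real page visit; it only simulates latency."""
    await asyncio.sleep(0.01)
    return {'url': url, 'title': f'Title of {url}'}


async def bounded_gather(urls, limit=2):
    """Run fetch_title for every URL, with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(url):
        async with semaphore:
            return await fetch_title(url)

    # return_exceptions=True keeps one failing URL from discarding the other results.
    return await asyncio.gather(*(worker(url) for url in urls), return_exceptions=True)


urls = ['https://example1.com', 'https://example2.com', 'https://example3.com']
results = asyncio.run(bounded_gather(urls))
for result in results:
    print(result)
```

`asyncio.gather` returns results in the order the awaitables were passed, so `results` lines up with `urls` even though the tasks finish at different times.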

### Event-Driven Automation

React to page events and user interactions in real-time. This enables more sophisticated and responsive automations:

```python
import asyncio
from pydoll.browser import Chrome
from pydoll.protocol.page.events import PageEvent

async def event_driven_automation():
    async with Chrome() as browser:
        tab = await browser.start()

        # Enable page events
        await tab.enable_page_events()

        # React to page load
        async def on_page_load(event):
            print("Page loaded! Starting automation...")
            # Perform actions after page loads
            search_box = await tab.find(id='search-box')
            await search_box.type('automation')

        # React to navigation
        async def on_navigation(event):
            # Page.frameNavigated reports the URL on the nested frame object
            url = event['params']['frame']['url']
            print(f"Navigated to: {url}")

        await tab.on(PageEvent.LOAD_EVENT_FIRED, on_page_load)
        await tab.on(PageEvent.FRAME_NAVIGATED, on_navigation)

        await tab.go_to('https://example.com')
        await asyncio.sleep(5)  # Let events process

asyncio.run(event_driven_automation())
```
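
The callback registration above can be pictured as a small dispatcher keyed by CDP event name. CDP delivers events as messages with a `method` (for example `Page.frameNavigated`) and a `params` object, and for `Page.frameNavigated` the URL sits under `params.frame.url`. The `EventDispatcher` class below is a toy written for illustration under those assumptions, not Pydoll's events manager.

```python
import asyncio
from collections import defaultdict


class EventDispatcher:
    """Toy dispatcher keyed by CDP event name (illustrative, not Pydoll's class)."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def on(self, event_name, callback):
        """Register an async callback for one event name."""
        self._callbacks[event_name].append(callback)

    async def dispatch(self, message):
        # CDP events arrive as {'method': <event name>, 'params': {...}}.
        for callback in self._callbacks[message['method']]:
            await callback(message)


async def main():
    seen = []
    dispatcher = EventDispatcher()

    async def on_navigation(event):
        seen.append(event['params']['frame']['url'])

    dispatcher.on('Page.frameNavigated', on_navigation)
    await dispatcher.dispatch(
        {'method': 'Page.frameNavigated', 'params': {'frame': {'url': 'https://example.com'}}}
    )
    # No callback is registered for this event, so it is silently ignored.
    await dispatcher.dispatch({'method': 'Page.loadEventFired', 'params': {'timestamp': 1.0}})
    return seen


visited = asyncio.run(main())
print(visited)
```

Keying callbacks by event name keeps dispatch cheap: a message only touches the handlers registered for its own `method`.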

### Working with iFrames

Pydoll provides seamless iframe interaction through the `get_frame()` method. This is especially useful for dealing with embedded content:

```python
import asyncio
from pydoll.browser.chromium import Chrome

async def iframe_interaction():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/page-with-iframe')

        # Find the iframe element
        iframe_element = await tab.query('.hcaptcha-iframe', timeout=10)

        # Get a Tab instance for the iframe content
        frame = await tab.get_frame(iframe_element)

        # Now interact with elements inside the iframe
        submit_button = await frame.find(tag_name='button', class_name='submit')
        await submit_button.click()

        # You can use all Tab methods on the frame
        form_input = await frame.find(id='captcha-input')
        await form_input.type('verification-code')

        # Find elements by various methods
        links = await frame.find(tag_name='a', find_all=True)
        specific_element = await frame.query('#specific-id')

asyncio.run(iframe_interaction())
```

## The Philosophy Behind Pydoll

Pydoll isn't just another automation library. It represents a different approach to solving real problems that developers face daily.

**Simplicity Without Sacrificing Power**: We believe that powerful tools don't need to be complex. Pydoll offers advanced functionality through a clean and intuitive API.

**Performance That Matters**: In a world where every millisecond counts, Pydoll's native asynchronous design ensures your automations are not just functional, but efficient.

**Constant Evolution**: The web ecosystem is always changing, and Pydoll evolves with it. New challenges like advanced captchas are met with innovative solutions integrated into the library.

## Documentation

For comprehensive documentation, detailed examples, and deep dives into Pydoll's features, visit our [official documentation site](https://autoscrape-labs.github.io/pydoll/).

The documentation includes:
- **Getting Started Guide** - Step-by-step tutorials
- **API Reference** - Complete method documentation
- **Advanced Techniques** - Network interception, event handling, performance optimization
- **Migration Guide** - Upgrading from older versions
- **Troubleshooting** - Common issues and solutions
- **Best Practices** - Patterns for reliable automation

## Contributing

We'd love your help making Pydoll even better! Check out our [contribution guidelines](CONTRIBUTING.md) to get started. Whether it's fixing bugs, adding features, or improving documentation - all contributions are welcome!

Please make sure to:
- Write tests for new features or bug fixes
- Follow coding style and conventions
- Use conventional commits for pull requests
- Run lint and test checks before submitting

## Support My Work

If you find my projects helpful, consider [sponsoring me on GitHub](https://github.com/sponsors/thalissonvs).
You'll get access to exclusive perks like prioritized support, custom features, and more!

Can't sponsor right now? No problem - you can still help a lot by:
- Starring the repo
- Sharing it on social media
- Writing blog posts or tutorials
- Giving feedback or reporting issues

Every bit of support makes a difference - thank you!

## License

Pydoll is licensed under the [MIT License](LICENSE).

<p align="center">
    <b>Pydoll</b> - Making browser automation magical!
</p>