puppeteer-bidi 0.0.3.beta1 → 0.0.3.beta2

@@ -1,271 +0,0 @@
1
- # ExposeFunction and EvaluateOnNewDocument Implementation
2
-
3
- This document details the implementation of `Page.evaluateOnNewDocument` and `Page.exposeFunction`, which use BiDi preload scripts and the `script.message` channel for communication.
4
-
5
- ## Page.evaluateOnNewDocument
6
-
7
- Injects JavaScript to be evaluated before any page scripts run.
8
-
9
- ### BiDi Implementation
10
-
11
- Uses `script.addPreloadScript` command:
12
-
13
- ```ruby
14
- def evaluate_on_new_document(page_function, *args)
15
- expression = build_evaluation_expression(page_function, *args)
16
- script_id = @browsing_context.add_preload_script(expression).wait
17
- NewDocumentScriptEvaluation.new(script_id)
18
- end
19
- ```
20
-
21
- ### Key Points
22
-
23
- 1. **Preload Scripts Persist**: Scripts added via `addPreloadScript` run on every navigation
24
- 2. **Argument Serialization**: Arguments are serialized into the script as JSON literals
25
- 3. **Return Value**: Returns a `NewDocumentScriptEvaluation` with the script ID for later removal
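
The argument serialization in point 2 can be sketched as follows. This is a minimal illustration, not the library's code: the body of `build_evaluation_expression` is an assumption, and it assumes `page_function` is the source text of a JavaScript function whose arguments are all JSON-serializable.

```ruby
require "json"

# Hypothetical stand-in for the build_evaluation_expression helper.
# Assumes page_function is the source text of a JavaScript function.
def build_evaluation_expression(page_function, *args)
  # Each argument becomes a JSON literal embedded directly in the script text.
  serialized_args = args.map { |arg| JSON.generate(arg) }.join(", ")
  # Wrap the function in parentheses and call it immediately with the literals.
  "(#{page_function})(#{serialized_args})"
end

expression = build_evaluation_expression("(a, b) => { window.sum = [a, b] }", 1, "two")
puts expression
```

Because the arguments are baked into the script text, values that JSON cannot represent (handles, functions) would need a different mechanism.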
26
-
27
- ### Example
28
-
29
- ```ruby
30
- # Inject code that runs before any page scripts
31
- script = page.evaluate_on_new_document("window.injected = 123")
32
- page.goto(server.empty_page)
33
- result = page.evaluate("window.injected") # => 123
34
-
35
- # Remove when done
36
- page.remove_script_to_evaluate_on_new_document(script.identifier)
37
- ```
38
-
39
- ## Page.exposeFunction
40
-
41
- Exposes a Ruby callable as a JavaScript function on the page.
42
-
43
- ### BiDi Implementation
44
-
45
- Uses `script.message` channel for bidirectional communication:
46
-
47
- ```
48
- Page (JS) Ruby (ExposedFunction)
49
- | |
50
- |-- callback([resolve,reject,args]) -->|
51
- | |
52
- |<-- resolve(result) or reject(error) -|
53
- ```
54
-
55
- ### Key Components
56
-
57
- #### 1. Channel Argument Pattern
58
-
59
- BiDi uses a special `channel` argument type:
60
-
61
- ```ruby
62
- def channel_argument
63
- {
64
- type: "channel",
65
- value: {
66
- channel: @channel, # Unique channel ID
67
- ownership: "root" # Keep handles alive
68
- }
69
- }
70
- end
71
- ```
72
-
73
- #### 2. Function Declaration
74
-
75
- The exposed function creates a Promise that waits for the Ruby callback:
76
-
77
- ```javascript
78
- (callback) => {
79
- Object.assign(globalThis, {
80
- [name]: function (...args) {
81
- return new Promise((resolve, reject) => {
82
- callback([resolve, reject, args]);
83
- });
84
- },
85
- });
86
- }
87
- ```
88
-
89
- #### 3. Message Handling
90
-
91
- Ruby listens for `script.message` events and processes calls:
92
-
93
- ```ruby
94
- def handle_message(params)
95
- return unless params["channel"] == @channel
96
-
97
- # Extract data handle with [resolve, reject, args]
98
- data_handle = JSHandle.from(params["data"], realm.core_realm)
99
-
100
- # Call Ruby function and send result back
101
- # `args` is the third element of the data handle; deserializing it is elided in this excerpt
- result = @apply.call(*args)
102
- send_result(data_handle, result)
103
- end
104
- ```
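
The channel filtering and call dispatch described above can be sketched in plain Ruby. This is a simplified model under stated assumptions: `ExposedCallDispatcher` and the `"args"` key are hypothetical, and the real `script.message` payload carries a remote handle rather than ready-to-use arguments.

```ruby
# Hypothetical model of the Ruby side of an exposed function.
# Assumes params already contains deserialized arguments under "args".
class ExposedCallDispatcher
  def initialize(channel, &apply)
    @channel = channel
    @apply = apply
  end

  # Returns nil for messages on other channels, otherwise a pair that
  # says which of the page-side callbacks (resolve or reject) to invoke.
  def handle(params)
    return nil unless params["channel"] == @channel

    args = params.fetch("args", [])
    [:resolve, @apply.call(*args)]
  rescue StandardError => e
    [:reject, { name: e.class.name, message: e.message }]
  end
end

dispatcher = ExposedCallDispatcher.new("channel-1") { |a, b| a + b }
outcome = dispatcher.handle({ "channel" => "channel-1", "args" => [2, 3] })
```

Returning a tagged pair keeps the decision (resolve or reject) separate from the transport that sends it back to the page.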
105
-
106
- ### Session Event Subscription
107
-
108
- **Important**: `script.message` must be subscribed in the session:
109
-
110
- ```ruby
111
- # In Core::Session
112
- def subscribe_to_events
113
- subscribe([
114
- "browsingContext.load",
115
- "browsingContext.domContentLoaded",
116
- # ... other events
117
- "script.message", # Required for exposeFunction
118
- ]).wait
119
- end
120
- ```
121
-
122
- ### Frame Handling
123
-
124
- ExposedFunction handles dynamic frames by:
125
-
126
- 1. **Listening to frameattached**: Injects into new frames
127
- 2. **Using preload scripts**: For top-level browsing contexts (not iframes)
128
- 3. **Using callFunction**: For immediate injection into current context
129
-
130
- ```ruby
131
- def inject_into_frame(frame)
132
- # Add preload script for top-level contexts only
133
- if frame.browsing_context.parent.nil?
134
- script_id = frame.browsing_context.add_preload_script(
135
- function_declaration,
136
- arguments: [channel]
137
- ).wait
138
- end
139
-
140
- # Always call function for immediate availability
141
- realm.core_realm.call_function(
142
- function_declaration,
143
- false,
144
- arguments: [channel]
145
- ).wait
146
- end
147
- ```
148
-
149
- ### Error Handling
150
-
151
- #### Standard Errors
152
-
153
- Errors are serialized with name, message, and stack trace:
154
-
155
- ```ruby
156
- def send_error(data_handle, error)
157
- name = error.class.name
158
- message = error.message
159
- stack = error.backtrace&.join("\n")
160
-
161
- data_handle.evaluate(<<~JS, name, message, stack)
162
- ([, reject], name, message, stack) => {
163
- const error = new Error(message);
164
- error.name = name;
165
- if (stack) { error.stack = stack; }
166
- reject(error);
167
- }
168
- JS
169
- end
170
- ```
171
-
172
- #### Non-Error Values (ThrownValue)
173
-
174
- Ruby cannot raise an arbitrary non-exception value the way JavaScript's `throw "string"` can. Wrap the value in `ThrownValue` instead:
175
-
176
- ```ruby
177
- class ThrownValue < StandardError
178
- attr_reader :value
179
-
180
- def initialize(value)
181
- @value = value
182
- super("Thrown value")
183
- end
184
- end
185
-
186
- # Usage
187
- page.expose_function("throwValue") do |value|
188
- raise ExposedFunction::ThrownValue.new(value)
189
- end
190
- ```
191
-
192
- ### Cleanup
193
-
194
- Disposal removes the function from all frames and cleans up resources:
195
-
196
- ```ruby
197
- def dispose
198
- session.off("script.message", &@listener)
199
- page.off(:frameattached, &@frame_listener)
200
-
201
- # Remove from globalThis
202
- remove_binding_from_frame(@frame)
203
-
204
- # Remove preload scripts
205
- @scripts.each do |frame, script_id|
206
- frame.browsing_context.remove_preload_script(script_id).wait
207
- end
208
- end
209
- ```
210
-
211
- ## Testing Considerations
212
-
213
- ### CSP Headers
214
-
215
- Some tests require Content-Security-Policy headers. Use `TestServer#set_csp`:
216
-
217
- ```ruby
218
- server.set_csp("/empty.html", "script-src 'self'")
219
- ```
220
-
221
- ### Test Asset
222
-
223
- `spec/assets/tamperable.html` captures `window.injected` before page scripts run:
224
-
225
- ```html
226
- <script>
227
- window.result = window.injected;
228
- </script>
229
- ```
230
-
231
- ## Common Pitfalls
232
-
233
- ### 1. Missing script.message Subscription
234
-
235
- **Problem**: `exposeFunction` doesn't receive callbacks
236
-
237
- **Solution**: Ensure `script.message` is in session event subscriptions
238
-
239
- ### 2. Ownership: "root" Required
240
-
241
- **Problem**: JSHandle becomes invalid before processing
242
-
243
- **Solution**: Use `ownership: "root"` in channel argument to keep handles alive
244
-
245
- ### 3. Preload Scripts for Iframes
246
-
247
- **Problem**: `addPreloadScript` not supported for iframe contexts
248
-
249
- **Solution**: Only use preload scripts for top-level contexts; use `callFunction` for iframes
250
-
251
- ### 4. TypeError on raise nil
252
-
253
- **Problem**: Ruby's `raise nil` raises `TypeError: exception class/object expected`
254
-
255
- **Solution**: Catch and convert to `send_thrown_value`:
256
-
257
- ```ruby
258
- rescue TypeError => e
259
- if e.message.include?("exception class/object expected")
260
- send_thrown_value(data_handle, nil)
261
- else
262
- send_error(data_handle, e)
263
- end
264
- end
265
- ```
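
Taken together with `ThrownValue`, the error paths can be sketched as one classification step. This is an illustrative sketch only: `classify_outcome` and the returned symbols are hypothetical names, and the real implementation sends the result to the page instead of returning it.

```ruby
# Same shape as the ThrownValue class shown earlier in this document.
class ThrownValue < StandardError
  attr_reader :value

  def initialize(value)
    @value = value
    super("Thrown value")
  end
end

# Hypothetical helper: runs the block and reports how the call ended.
def classify_outcome
  [:resolve, yield]
rescue ThrownValue => e
  # A wrapped non-error value should be rejected as-is on the page side.
  [:throw, e.value]
rescue TypeError => e
  # `raise nil` surfaces as this specific TypeError; treat it as throwing nil.
  if e.message.include?("exception class/object expected")
    [:throw, nil]
  else
    [:error, e.class.name, e.message]
  end
rescue StandardError => e
  [:error, e.class.name, e.message]
end

thrown = classify_outcome { raise ThrownValue.new("oops") }
```

The order of the `rescue` clauses matters: the more specific classes are listed before `StandardError` so they are matched first.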
266
-
267
- ## References
268
-
269
- - [WebDriver BiDi script.message](https://w3c.github.io/webdriver-bidi/#event-script-message)
270
- - [WebDriver BiDi addPreloadScript](https://w3c.github.io/webdriver-bidi/#command-script-addPreloadScript)
271
- - [Puppeteer ExposedFunction.ts](https://github.com/puppeteer/puppeteer/blob/main/packages/puppeteer-core/src/bidi/ExposedFunction.ts)
@@ -1,95 +0,0 @@
1
- # FileChooser Implementation
2
-
3
- ## Overview
4
-
5
- FileChooser provides file upload functionality through the WebDriver BiDi `input.setFiles` command and `input.fileDialogOpened` event.
6
-
7
- ## Firefox Nightly Requirement
8
-
9
- **Important**: The `input.fileDialogOpened` event is only supported in Firefox Nightly. Stable Firefox does not fire this event.
10
-
11
- ```bash
12
- # Run tests with Firefox Nightly
13
- FIREFOX_PATH="/Applications/Firefox Nightly.app/Contents/MacOS/firefox" bundle exec rspec spec/integration/input_spec.rb
14
- ```
15
-
16
- The browser launcher prioritizes Firefox Nightly in its search order.
17
-
18
- ## API Design
19
-
20
- ### Block-based Interface
21
-
22
- Ruby uses a block-based interface instead of JavaScript's Promise.all pattern:
23
-
24
- ```ruby
25
- # Ruby (block-based)
26
- chooser = page.wait_for_file_chooser do
27
- page.click('input[type=file]')
28
- end
29
- chooser.accept(['/path/to/file.txt'])
30
-
31
- # JavaScript (Puppeteer) equivalent, shown for comparison only; the lines below are not Ruby
32
- const [chooser] = await Promise.all([
33
- page.waitForFileChooser(),
34
- page.click('input[type=file]'),
35
- ]);
36
- await chooser.accept(['/path/to/file.txt']);
37
- ```
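
The block-based pattern works because the listener is registered before the block runs, so an event fired during the block is not missed. A minimal sketch of that ordering follows; `SketchPage`, its `click`, and the `:file_chooser` value are hypothetical stand-ins, not the library's classes.

```ruby
require "timeout"

# Hypothetical page that fires a "file dialog opened" event on click.
class SketchPage
  def initialize
    @listeners = []
  end

  def wait_for_file_chooser(timeout: 1)
    queue = Queue.new
    listener = ->(chooser) { queue << chooser }
    @listeners << listener                  # subscribe first
    yield                                   # then run the action that opens the dialog
    Timeout.timeout(timeout) { queue.pop }  # finally wait for the event
  ensure
    @listeners.delete(listener)
  end

  # Stand-in for a click that opens a file dialog.
  def click(_selector)
    @listeners.each { |listener| listener.call(:file_chooser) }
  end
end

page = SketchPage.new
chooser = page.wait_for_file_chooser { page.click("input[type=file]") }
```

If the block never triggers the event, `Timeout::Error` is raised after the timeout and the listener is still removed by the `ensure` clause.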
38
-
39
- ### Direct Upload
40
-
41
- ```ruby
42
- input = page.query_selector('input[type=file]')
43
- input.upload_file('/path/to/file.txt')
44
- ```
45
-
46
- ## Implementation Flow
47
-
48
- ```
49
- ElementHandle#upload_file(files)
50
- └── Frame#set_files(element, files)
51
- └── BrowsingContext#set_files(shared_reference, files)
52
- └── BiDi command: input.setFiles
53
- ```
54
-
55
- This follows Puppeteer's pattern where ElementHandle delegates to Frame, which then calls the BrowsingContext.
56
-
57
- ## Firefox BiDi Limitations
58
-
59
- ### 1. Detached Elements Not Supported
60
-
61
- Firefox does not fire `input.fileDialogOpened` for file inputs that are not attached to the DOM:
62
-
63
- ```ruby
64
- # This will timeout - detached element
65
- page.wait_for_file_chooser do
66
- page.evaluate(<<~JS)
67
- () => {
68
- const el = document.createElement('input');
69
- el.type = 'file';
70
- el.click(); // No event fired
71
- }
72
- JS
73
- end
74
- ```
75
-
76
- ### 2. Non-existent Files Rejected
77
-
78
- Firefox BiDi rejects files that don't exist with `NS_ERROR_FILE_NOT_FOUND`. Chrome allows setting non-existent files.
79
-
80
- ### 3. Event Subscription
81
-
82
- The `input` module must be subscribed at session level. This is handled automatically in `Session#initialize_session`:
83
-
84
- ```ruby
85
- subscribe_modules = %w[browsingContext network log script input]
86
- subscribe(subscribe_modules)
87
- ```
88
-
89
- ## Key Classes
90
-
91
- - `FileChooser` - Wraps the element with `accept()`, `cancel()`, `multiple?()` methods
92
- - `Page#wait_for_file_chooser` - Listens for `filedialogopened` event with timeout
93
- - `ElementHandle#upload_file` - Resolves paths and delegates to frame
94
- - `Frame#set_files` - Calls BrowsingContext with shared reference
95
- - `BrowsingContext#set_files` - Sends BiDi `input.setFiles` command
@@ -1,346 +0,0 @@
1
- # Frame Architecture Implementation
2
-
3
- ## Overview
4
-
5
- This document details the Frame architecture implementation following Puppeteer's parent-based design pattern.
6
-
7
- ## Architecture Change
8
-
9
- ### Before (Incorrect)
10
-
11
- ```ruby
12
- class Frame
13
- def initialize(browsing_context, page = nil)
14
- @browsing_context = browsing_context
15
- @page = page
16
- end
17
- end
18
-
19
- # Page creates frame
20
- Frame.new(@browsing_context, self)
21
- ```
22
-
23
- **Problem**: Frame stores a direct reference to Page, so it cannot represent nested frames (iframes).
24
-
25
- ### After (Correct - Following Puppeteer)
26
-
27
- ```ruby
28
- class Frame
29
- def initialize(parent, browsing_context)
30
- @parent = parent # Page or Frame
31
- @browsing_context = browsing_context
32
- end
33
-
34
- def page
35
- @parent.is_a?(Page) ? @parent : @parent.page
36
- end
37
-
38
- def parent_frame
39
- @parent.is_a?(Frame) ? @parent : nil
40
- end
41
- end
42
-
43
- # Page creates frame
44
- Frame.new(self, @browsing_context)
45
- ```
46
-
47
- **Benefits**:
48
- - Supports nested frames (iframe within iframe)
49
- - Matches Puppeteer's TypeScript implementation
50
- - Enables recursive page traversal
51
- - Simplifies parent_frame implementation
52
-
53
- ## Reference Implementation
54
-
55
- Based on [Puppeteer's Frame.ts](https://github.com/puppeteer/puppeteer/blob/main/packages/puppeteer-core/src/bidi/Frame.ts):
56
-
57
- ```typescript
58
- export class BidiFrame extends Frame {
59
- #parent: BidiPage | BidiFrame;
60
- #browsingContext: BrowsingContext;
61
-
62
- constructor(
63
- parent: BidiPage | BidiFrame,
64
- browsingContext: BrowsingContext,
65
- ) {
66
- super();
67
- this.#parent = parent;
68
- this.#browsingContext = browsingContext;
69
- }
70
-
71
- override page(): BidiPage {
72
- let parent = this.#parent;
73
- while (parent instanceof BidiFrame) {
74
- parent = parent.#parent;
75
- }
76
- return parent;
77
- }
78
-
79
- override parentFrame(): BidiFrame | null {
80
- if (this.#parent instanceof BidiFrame) {
81
- return this.#parent;
82
- }
83
- return null;
84
- }
85
- }
86
- ```
87
-
88
- ## Implementation Details
89
-
90
- ### Constructor Signature
91
-
92
- **Critical**: The first parameter is `parent` (Page or Frame), not `page`:
93
-
94
- ```ruby
95
- def initialize(parent, browsing_context)
96
- @parent = parent
97
- @browsing_context = browsing_context
98
- end
99
- ```
100
-
101
- ### Page Traversal
102
-
103
- Recursive implementation using ternary operator:
104
-
105
- ```ruby
106
- def page
107
- @parent.is_a?(Page) ? @parent : @parent.page
108
- end
109
- ```
110
-
111
- This is simpler than the `while` loop in Puppeteer's implementation and returns the same result.
112
-
113
- ### Parent Frame Access
114
-
115
- ```ruby
116
- def parent_frame
117
- @parent.is_a?(Frame) ? @parent : nil
118
- end
119
- ```
120
-
121
- Returns:
122
- - `Frame` instance if this is a child frame
123
- - `nil` if this is a main frame (parent is Page)
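
The traversal can be exercised without a browser. The sketch below reuses the `Frame` methods shown above with an empty stub `Page`; the symbols passed as browsing contexts are placeholders, an assumption made only so the example runs on its own.

```ruby
# Stub standing in for the real Page class.
class Page; end

class Frame
  def initialize(parent, browsing_context)
    @parent = parent # Page or Frame
    @browsing_context = browsing_context
  end

  # Walks up the parent chain until a Page is reached.
  def page
    @parent.is_a?(Page) ? @parent : @parent.page
  end

  # nil for the main frame, whose parent is the Page itself.
  def parent_frame
    @parent.is_a?(Frame) ? @parent : nil
  end
end

page = Page.new
main_frame = Frame.new(page, :main_context)
child = Frame.new(main_frame, :child_context)
grandchild = Frame.new(child, :grandchild_context)
```

Here `grandchild.page` resolves through two recursive calls, which is the case the old `(browsing_context, page)` signature could not express.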
124
-
125
- ## Usage Examples
126
-
127
- ### Main Frame
128
-
129
- ```ruby
130
- page = browser.new_page
131
- main_frame = page.main_frame
132
-
133
- main_frame.page # => page
134
- main_frame.parent_frame # => nil
135
- ```
136
-
137
- ### Nested Frames (Future)
138
-
139
- ```ruby
140
- # When iframe support is added:
141
- iframe = main_frame.child_frames.first
142
-
143
- iframe.page # => page (traverses up to Page)
144
- iframe.parent_frame # => main_frame
145
- ```
146
-
147
- ## Testing
148
-
149
- The integration suite runs 108 examples with this architecture, with no failures and 4 pending:
150
-
151
- ```bash
152
- bundle exec rspec spec/integration/
153
- # 108 examples, 0 failures, 4 pending
154
- ```
155
-
156
- ## Key Takeaways
157
-
158
- 1. **Follow Puppeteer's constructor signature exactly** - `(parent, browsing_context)` not `(browsing_context, page)`
159
- 2. **Use ternary operator for simplicity** - `@parent.is_a?(Page) ? @parent : @parent.page`
160
- 3. **Enables future iframe support** - Architecture supports nested frame trees
161
- 4. **Remove redundant attr_reader** - No need for `attr_reader :parent` when the parent is kept in a private instance variable
162
-
163
- ## Frame Events
164
-
165
- ### Overview
166
-
167
- Frame lifecycle events are emitted on the Page object, following Puppeteer's pattern:
168
-
169
- - `:frameattached` - Fired when a new child frame is created
170
- - `:framedetached` - Fired when a frame's browsing context is closed
171
- - `:framenavigated` - Fired on DOMContentLoaded or fragment navigation
172
-
173
- ### Event Emission Locations (Following Puppeteer Exactly)
174
-
175
- **Critical**: The location where each event is emitted matters for correct behavior.
176
-
177
- | Event | Location | Trigger |
178
- |-------|----------|---------|
179
- | `:frameattached` | `Frame#create_frame_target` | Child browsing context created |
180
- | `:framedetached` | `Frame#initialize_frame` | **Self's** browsing context closed |
181
- | `:framenavigated` | `Frame#initialize_frame` | DOMContentLoaded or fragment_navigated |
182
-
183
- ### Puppeteer Reference Code
184
-
185
- From [Puppeteer's bidi/Frame.ts](https://github.com/puppeteer/puppeteer/blob/main/packages/puppeteer-core/src/bidi/Frame.ts):
186
-
187
- ```typescript
188
- // In #initialize() - FrameDetached is emitted for THIS frame
189
- this.browsingContext.on('closed', () => {
190
- this.page().trustedEmitter.emit(PageEvent.FrameDetached, this);
191
- });
192
-
193
- // In #createFrameTarget() - FrameAttached is emitted for child frame
194
- #createFrameTarget(browsingContext: BrowsingContext) {
195
- const frame = BidiFrame.from(this, browsingContext);
196
- this.#frames.set(browsingContext, frame);
197
- this.page().trustedEmitter.emit(PageEvent.FrameAttached, frame);
198
-
199
- // Note: FrameDetached is NOT emitted here
200
- browsingContext.on('closed', () => {
201
- this.#frames.delete(browsingContext);
202
- });
203
-
204
- return frame;
205
- }
206
- ```
207
-
208
- ### Ruby Implementation
209
-
210
- ```ruby
211
- # Frame#initialize_frame
212
- def initialize_frame
213
- # ... child frame setup ...
214
-
215
- # FrameDetached: emit when THIS frame's context closes
216
- @browsing_context.on(:closed) do
217
- @frames.clear
218
- page.emit(:framedetached, self)
219
- end
220
-
221
- # FrameNavigated: emit on navigation events
222
- @browsing_context.on(:dom_content_loaded) do
223
- page.emit(:framenavigated, self)
224
- end
225
-
226
- @browsing_context.on(:fragment_navigated) do
227
- page.emit(:framenavigated, self)
228
- end
229
- end
230
-
231
- # Frame#create_frame_target
232
- def create_frame_target(browsing_context)
233
- frame = Frame.from(self, browsing_context)
234
- @frames[browsing_context.id] = frame
235
-
236
- # FrameAttached: emit for the new child frame
237
- page.emit(:frameattached, frame)
238
-
239
- # Only cleanup, NO FrameDetached here
240
- browsing_context.once(:closed) do
241
- @frames.delete(browsing_context.id)
242
- end
243
-
244
- frame
245
- end
246
- ```
247
-
248
- ### Common Mistake
249
-
250
- **Wrong**: Emitting `:framedetached` in `create_frame_target` when child's context closes.
251
-
252
- **Correct**: Each frame emits its own `:framedetached` in `initialize_frame` when its own browsing context closes.
253
-
254
- This matters because the event should be emitted by the frame instance that is being detached, not by its parent.
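
The division of responsibility can be sketched with stand-in classes. `TinyEmitter` and `SketchFrame` are hypothetical and greatly simplified; they only model who emits which event, assuming a browsing context that emits `:closed`.

```ruby
# Hypothetical minimal emitter used for both the page and browsing contexts.
class TinyEmitter
  def initialize
    @listeners = Hash.new { |hash, event| hash[event] = [] }
  end

  def on(event, &block)
    @listeners[event] << block
  end

  def emit(event, data = nil)
    @listeners[event].dup.each { |listener| listener.call(data) }
  end
end

class SketchFrame
  attr_reader :name, :context

  def initialize(page_events, name)
    @page_events = page_events
    @name = name
    @context = TinyEmitter.new # stands in for the browsing context
    @children = {}
    # Each frame reports its own detachment when its own context closes.
    @context.on(:closed) { @page_events.emit(:framedetached, self) }
  end

  def create_child(name)
    child = SketchFrame.new(@page_events, name)
    @children[name] = child
    @page_events.emit(:frameattached, child)
    # Parent side: bookkeeping only, no :framedetached here.
    child.context.on(:closed) { @children.delete(name) }
    child
  end

  def child_count
    @children.size
  end
end

page_events = TinyEmitter.new
log = []
page_events.on(:frameattached) { |frame| log << [:attached, frame.name] }
page_events.on(:framedetached) { |frame| log << [:detached, frame.name] }

main_frame = SketchFrame.new(page_events, "main")
iframe = main_frame.create_child("iframe")
iframe.context.emit(:closed)
```

If the parent also emitted `:framedetached` in `create_child`, the log would contain the detach entry twice.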
255
-
256
- ## Page Event Emitter
257
-
258
- Page delegates to `Core::EventEmitter` for event handling:
259
-
260
- ```ruby
261
- class Page
262
- def initialize(...)
263
- @emitter = Core::EventEmitter.new
264
- end
265
-
266
- def on(event, &block)
267
- @emitter.on(event, &block)
268
- end
269
-
270
- def emit(event, data = nil)
271
- @emitter.emit(event, data)
272
- end
273
- end
274
- ```
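
For readers unfamiliar with the emitter pattern, here is a minimal sketch of what an emitter with this interface might look like. It is not the library's `Core::EventEmitter`; `SketchEmitter` and its `off` signature are assumptions for illustration.

```ruby
# Hypothetical emitter exposing on/off/emit.
class SketchEmitter
  def initialize
    @listeners = Hash.new { |hash, event| hash[event] = [] }
  end

  # Registers a listener and returns it so the caller can remove it later.
  def on(event, &block)
    @listeners[event] << block
    block
  end

  def off(event, listener)
    @listeners[event].delete(listener)
  end

  # Iterates over a copy so listeners may unsubscribe while being called.
  def emit(event, data = nil)
    @listeners[event].dup.each { |listener| listener.call(data) }
  end
end

emitter = SketchEmitter.new
seen = []
listener = emitter.on(:framenavigated) { |frame| seen << frame }
emitter.emit(:framenavigated, :main_frame)
emitter.off(:framenavigated, listener)
emitter.emit(:framenavigated, :ignored_frame)
```

Returning the block from `on` is one way to make later removal straightforward, since Ruby blocks have no identity until captured as a `Proc`.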
275
-
276
- ## Files Changed
277
-
278
- - `lib/puppeteer/bidi/frame.rb`: Constructor signature, page method, parent_frame method, frame events
279
- - `lib/puppeteer/bidi/page.rb`: main_frame initialization, event emitter delegation
280
-
281
- ## BiDi Protocol Limitations
282
-
283
- ### Frame.frameElement with Shadow DOM
284
-
285
- **Status**: Not supported in BiDi protocol
286
-
287
- `Frame#frame_element` returns `nil` for iframes inside Shadow DOM (both open and closed).
288
-
289
- #### Root Cause
290
-
291
- | Protocol | Behavior | Mechanism |
292
- |----------|----------|-----------|
293
- | **CDP (Chrome)** | Works | Uses `DOM.getFrameOwner` command |
294
- | **BiDi (Firefox)** | Returns nil | Uses `document.querySelectorAll` (cannot traverse Shadow DOM) |
295
-
296
- #### Technical Details
297
-
298
- 1. **CDP Implementation** (`cdp/Frame.js`):
299
- ```javascript
300
- const { backendNodeId } = await parent.client.send('DOM.getFrameOwner', {
301
- frameId: this._id,
302
- });
303
- return await parent.mainRealm().adoptBackendNode(backendNodeId);
304
- ```
305
-
306
- 2. **BiDi Implementation** (base `api/Frame.js`):
307
- ```javascript
308
- const list = await parentFrame.isolatedRealm().evaluateHandle(() => {
309
- return document.querySelectorAll('iframe,frame');
310
- });
311
- // Cannot find elements inside Shadow DOM
312
- ```
313
-
314
- 3. **WebDriver BiDi Specification**: No `DOM.getFrameOwner` equivalent command exists.
315
-
316
- #### Verification
317
-
318
- Tested with Puppeteer (Node.js) using both protocols:
319
-
320
- ```
321
- === Firefox (BiDi protocol) ===
322
- Frame element is NULL - Shadow DOM issue confirmed
323
-
324
- === Chrome (CDP protocol) ===
325
- Frame element tagName: iframe
326
- ```
327
-
328
- #### References
329
-
330
- - [Puppeteer Issue #13155](https://github.com/puppeteer/puppeteer/issues/13155) - Original bug report
331
- - [Puppeteer PR #13156](https://github.com/puppeteer/puppeteer/pull/13156) - CDP-only fix (October 2024)
332
- - [WebDriver BiDi Specification](https://w3c.github.io/webdriver-bidi/) - browsingContext module
333
-
334
- #### Test Status
335
-
336
- ```ruby
337
- it 'should handle shadow roots', pending: 'BiDi protocol limitation: no DOM.getFrameOwner equivalent' do
338
- # ...
339
- end
340
- ```
341
-
342
- This is a **protocol limitation**, not an implementation bug in this library.
343
-
344
- ## Commit Reference
345
-
346
- See commit: "Refactor Frame to use parent-based architecture following Puppeteer"