zenrows 0.2.1 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 5c04285fb07c401d81848da8a4ba763b3e0cd0a550c5e34fc1716101f229fbb7
-   data.tar.gz: 04d35bac2bafd413d8969d5d61b54f414c54390db0deb014dd721e2874944cc2
+   metadata.gz: e684c8840821205883d52bd782aa3d22c8c21404bc9e3a4ab068f9069389e575
+   data.tar.gz: eed9546964086c061082a55aa4fa9218c5c24dfa02b70f20297b04b55fa77891
  SHA512:
-   metadata.gz: 12becd343181666b4dc8b8edccc5fffcb410ecaf7ba74b9b7ed8a6859693d20f7b466e27f5e8a245b9841c56d7f2c4b1d2b3fa7a84b241b3725140acc546ad2c
-   data.tar.gz: '0368ed029138f05de5e2041fb4136caacc04d3ae9b172e7b5983f1204eefa0d12afc2ab70724f124922358d201195aa070e60db2248cee7b9031bb9d64e0c6d2'
+   metadata.gz: 31a6e6f8d95e0431cd7ce0bf545e813c64954021914eb67cf8312155b0c9d7cdea2ab1ede5378972757c1b148be0cd7be13e7ddda5ed2fbdb83cb30aad9fc18a
+   data.tar.gz: 74584d555a2915b1e0f8ac2fd0fb46ca78b21a80430c0e20f252237497a27087b4027e244c450222ae322b2f1ae2f0a02b59652daddc2069ac48783525d5cc69
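The values above are hex digests of the `metadata.gz` and `data.tar.gz` members of the packaged gem. As a rough sketch of how such a digest could be checked with Ruby's standard `Digest` library: the helper name `digest_matches?` is hypothetical, and the payload here is an in-memory stand-in rather than a real gem member.

```ruby
require "digest"

# Hypothetical helper: compare a payload's SHA256 hex digest with an expected value.
def digest_matches?(payload, expected_hex)
  Digest::SHA256.hexdigest(payload) == expected_hex.downcase
end

# Stand-in payload. For a real check you would read metadata.gz or data.tar.gz
# in binary mode (File.binread) after unpacking the .gem archive.
payload = "example payload"
expected = Digest::SHA256.hexdigest(payload)

puts digest_matches?(payload, expected)        # prints true
puts digest_matches?(payload + "x", expected)  # prints false
```

The same pattern applies to the SHA512 entries by swapping in `Digest::SHA512`.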
data/CHANGELOG.md CHANGED
@@ -7,6 +7,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [0.3.0] - 2025-12-30
+
+ ### Added
+
+ - Hooks/callbacks system for request lifecycle events
+ - Five hook types: `before_request`, `after_request`, `on_response`, `on_error`, `around_request`
+ - Global and per-client hook registration
+ - `Zenrows::Hooks::LogSubscriber` built-in logging subscriber
+ - `InstrumentedClient` wrapper for HTTP client instrumentation
+ - Context object with ZenRows header parsing (request_cost, concurrency_remaining, request_id, final_url)
+ - RBS type signatures for all hooks classes
+ - Monotonic clock for accurate request duration timing
+
+ ### Changed
+
+ - `after_request` hook now always runs (via `ensure`), even on errors
+
  ## [0.2.1] - 2025-12-25
 
  ### Added
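The changelog entry on header parsing names four context keys but not the response headers they come from. The sketch below shows one way such parsing might look. The header names in `HEADER_MAP` and the helper `parse_zenrows_headers` are assumptions made for illustration; they are not taken from this diff.

```ruby
# Assumed header names; the diff lists only the resulting keys.
HEADER_MAP = {
  request_cost: "X-Request-Cost",
  concurrency_remaining: "Concurrency-Remaining",
  request_id: "X-Request-Id",
  final_url: "Zr-Final-Url"
}.freeze

# Hypothetical helper: pick the known headers (case-insensitively),
# coerce the numeric ones, and drop anything missing or unparseable.
def parse_zenrows_headers(headers)
  normalized = headers.to_h { |name, value| [name.to_s.downcase, value] }
  parsed = HEADER_MAP.to_h { |key, header| [key, normalized[header.downcase]] }

  cost = parsed[:request_cost]
  parsed[:request_cost] = Float(cost, exception: false) if cost

  remaining = parsed[:concurrency_remaining]
  parsed[:concurrency_remaining] = Integer(remaining, exception: false) if remaining

  parsed.compact
end

p parse_zenrows_headers({"X-Request-Cost" => "0.001", "Concurrency-Remaining" => "199"})
```

Returning only the keys that were present keeps guards such as `if cost` in a callback meaningful.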
data/README.md CHANGED
@@ -208,6 +208,46 @@ response.concurrency_remaining # => 199
  | `response_type` | String | Output format ('markdown') |
  | `outputs` | String | Extract specific data (headings,links) |
 
+ ## Hooks
+
+ Register callbacks for request lifecycle events:
+
+ ```ruby
+ Zenrows.configure do |c|
+   c.api_key = 'YOUR_KEY'
+
+   # Log responses
+   c.on_response { |resp, ctx| puts "#{ctx[:host]} -> #{resp.status}" }
+
+   # Track errors
+   c.on_error { |err, ctx| Sentry.capture_exception(err) }
+
+   # Monitor costs
+   c.on_response do |resp, ctx|
+     cost = ctx[:zenrows_headers][:request_cost]
+     StatsD.increment('zenrows.cost', cost) if cost
+   end
+ end
+ ```
+
+ Per-client hooks:
+
+ ```ruby
+ client = Zenrows::Client.new do |c|
+   c.on_response { |resp, ctx| log_specific(resp) }
+ end
+ ```
+
+ Built-in logger:
+
+ ```ruby
+ c.add_subscriber(Zenrows::Hooks::LogSubscriber.new)
+ ```
+
+ Available hooks: `before_request`, `after_request`, `on_response`, `on_error`, `around_request`
+
+ Context includes: `method`, `url`, `host`, `duration`, `zenrows_headers` (request_cost, concurrency_remaining, request_id, final_url)
+
  ## Error Handling
 
  ```ruby
@@ -25,6 +25,11 @@ module Zenrows
    # @example With markdown output
    #   response = api.get(url, response_type: 'markdown')
    #
+   # @example With per-client hooks
+   #   api = Zenrows::ApiClient.new do |c|
+   #     c.on_response { |resp, ctx| puts "#{ctx[:host]} -> #{resp.status}" }
+   #   end
+   #
    # @author Ernest Bursa
    # @since 0.2.0
    # @api public
@@ -38,15 +43,23 @@ module Zenrows
      # @return [Configuration] Configuration instance
      attr_reader :config
 
+     # @return [Hooks] Hook registry for this client
+     attr_reader :hooks
+
      # Initialize API client
      #
      # @param api_key [String, nil] Override API key (uses global config if nil)
      # @param api_endpoint [String, nil] Override API endpoint (uses global config if nil)
-     def initialize(api_key: nil, api_endpoint: nil)
+     # @yield [config] Optional block for per-client configuration (hooks)
+     # @yieldparam config [HookConfigurator] Hook configuration DSL
+     def initialize(api_key: nil, api_endpoint: nil, &block)
        @config = Zenrows.configuration
        @api_key = api_key || @config.api_key
        @api_endpoint = api_endpoint || @config.api_endpoint
        @config.validate! unless api_key
+
+       # Build hooks: start with global, allow per-client additions
+       @hooks = block ? build_hooks(&block) : Zenrows.configuration.hooks.dup
      end
 
      # Make GET request through ZenRows API
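The constructor above derives each client's registry from `Zenrows.configuration.hooks.dup`, which isolates per-client hooks only if the copy does not share its callback lists with the global registry. The diff does not show how `Zenrows::Hooks` handles copying, so the following is a sketch with a hypothetical `CallbackRegistry` class, illustrating one way to make `dup` produce an independent copy via `initialize_copy`.

```ruby
# Hypothetical registry, not the gem's Zenrows::Hooks class.
class CallbackRegistry
  def initialize
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Invoked by #dup and #clone. Without this, the copy would share the
  # same Hash (and Arrays) as the source, so per-client hooks would leak
  # into the global registry.
  def initialize_copy(source)
    super
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
    source.callbacks.each { |event, list| @callbacks[event] = list.dup }
  end

  def register(event, &block)
    @callbacks[event] << block
    self
  end

  def count(event)
    @callbacks.key?(event) ? @callbacks[event].size : 0
  end

  protected

  attr_reader :callbacks
end

global = CallbackRegistry.new.register(:on_response) { |resp, ctx| }
per_client = global.dup.register(:on_error) { |err, ctx| }

puts global.count(:on_error)         # prints 0
puts per_client.count(:on_response)  # prints 1
puts per_client.count(:on_error)     # prints 1
```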
@@ -76,9 +89,11 @@ module Zenrows
      # @raise [AuthenticationError] if API key invalid
      # @raise [RateLimitError] if rate limited
      def get(url, **options)
-       params = build_params(url, options)
-       http_response = build_http_client.get(api_endpoint, params: params)
-       handle_response(http_response, options)
+       instrument(:get, url, options) do
+         params = build_params(url, options)
+         http_response = build_http_client.get(api_endpoint, params: params)
+         handle_response(http_response, options)
+       end
      end
 
      # Make POST request through ZenRows API
@@ -88,13 +103,61 @@ module Zenrows
      # @param options [Hash] Request options (same as #get)
      # @return [ApiResponse] Response wrapper
      def post(url, body: nil, **options)
-       params = build_params(url, options)
-       http_response = build_http_client.post(api_endpoint, params: params, body: body)
-       handle_response(http_response, options)
+       instrument(:post, url, options) do
+         params = build_params(url, options)
+         http_response = build_http_client.post(api_endpoint, params: params, body: body)
+         handle_response(http_response, options)
+       end
      end
 
      private
 
+     # Build hooks registry for this client
+     #
+     # @yield [config] Block for registering per-client hooks
+     # @return [Hooks] Combined hooks registry
+     def build_hooks
+       client_hooks = Zenrows.configuration.hooks.dup
+       hook_config = HookConfigurator.new(client_hooks)
+       yield(hook_config)
+       client_hooks
+     end
+
+     # Instrument a request with hooks
+     #
+     # @param method [Symbol] HTTP method
+     # @param url [String] Target URL
+     # @param options [Hash] Request options
+     # @yield Block that executes the actual request
+     # @return [Object] Response from block
+     def instrument(method, url, options)
+       return yield if hooks.empty?
+
+       context = Hooks::Context.for_request(
+         method: method,
+         url: url,
+         options: options,
+         backend: :api
+       )
+
+       hooks.run(:before_request, context)
+
+       response = hooks.run_around(context) do
+         result = yield
+         Hooks::Context.enrich_with_response(context, result)
+         hooks.run(:on_response, result, context)
+         result
+       end
+
+       response
+     rescue => e
+       context[:error] = e if context
+       hooks.run(:on_error, e, context) if context
+       raise
+     ensure
+       hooks.run(:after_request, context) if context
+     end
+
      def build_http_client
        HTTP
          .timeout(connect: config.connect_timeout, read: config.read_timeout)
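The `instrument` method above relies on Ruby's method-level `rescue`/`ensure` so that `on_error` fires only on failure while `after_request` fires on both paths. The sketch below reproduces that ordering with a stand-in registry. `TinyHooks`, `with_lifecycle`, and `trace_lifecycle` are hypothetical names, and the around-hook step is left out to keep the example short.

```ruby
# Minimal stand-in for a hook registry (not the gem's Zenrows::Hooks class).
class TinyHooks
  def initialize
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  def register(event, &block)
    @callbacks[event] << block
    self
  end

  def run(event, *args)
    @callbacks[event].each { |callback| callback.call(*args) }
  end
end

# Runs before_request, then on_response or on_error, and always
# after_request, because that call sits in the `ensure` clause.
def with_lifecycle(hooks, context)
  hooks.run(:before_request, context)
  result = yield
  hooks.run(:on_response, result, context)
  result
rescue => e
  hooks.run(:on_error, e, context)
  raise
ensure
  hooks.run(:after_request, context)
end

# Records which hooks fire for the given block.
def trace_lifecycle
  events = []
  hooks = TinyHooks.new
  hooks.register(:before_request) { |_ctx| events << :before_request }
  hooks.register(:on_response) { |_resp, _ctx| events << :on_response }
  hooks.register(:on_error) { |_err, _ctx| events << :on_error }
  hooks.register(:after_request) { |_ctx| events << :after_request }
  begin
    with_lifecycle(hooks, {}) { yield }
  rescue StandardError
    events << :reraised
  end
  events
end

p trace_lifecycle { :ok }
# prints [:before_request, :on_response, :after_request]
p trace_lifecycle { raise "boom" }
# prints [:before_request, :on_error, :after_request, :reraised]
```

Note that the error is re-raised after `on_error` runs, so callers still see the original exception.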
@@ -18,11 +18,16 @@ module Zenrows
        # @return [Zenrows::Configuration] Configuration instance
        attr_reader :config
 
+       # @return [Zenrows::Hooks] Hook registry for this backend
+       attr_reader :hooks
+
        # @param proxy [Zenrows::Proxy] Proxy configuration builder
        # @param config [Zenrows::Configuration] Configuration instance
-       def initialize(proxy:, config:)
+       # @param hooks [Zenrows::Hooks, nil] Optional hook registry (defaults to config.hooks)
+       def initialize(proxy:, config:, hooks: nil)
          @proxy = proxy
          @config = config
+         @hooks = hooks || config.hooks&.dup || Hooks.new
        end
 
        # Build a configured HTTP client
@@ -74,6 +79,31 @@ module Zenrows
          {connect: connect, read: read}
        end
 
+       # Wrap HTTP client with instrumentation if hooks are registered
+       #
+       # @param client [Object] The underlying HTTP client
+       # @param options [Hash] Request options used for this client
+       # @return [Object] Instrumented client or original if no hooks
+       def wrap_client(client, options)
+         return client if hooks.empty?
+
+         InstrumentedClient.new(
+           client,
+           hooks: hooks,
+           context_base: {
+             options: options,
+             backend: backend_name
+           }
+         )
+       end
+
+       # Get the backend name for context
+       #
+       # @return [Symbol] Backend identifier
+       def backend_name
+         :base
+       end
+
        private
 
        # Normalize wait value to seconds
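`wrap_client` returns the raw client when no hooks are registered and an `InstrumentedClient` otherwise. The diff does not include `InstrumentedClient` itself, so the following is only a sketch of the wrapper idea. `FakeClient`, `SketchInstrumentedClient`, and `wrap` are hypothetical, and a single `on_response` callable stands in for the full hook registry.

```ruby
# Stand-in HTTP client; a real backend would return HTTP::Client or NetHttpClient.
class FakeClient
  Response = Struct.new(:status)

  def get(url, **_options)
    Response.new(200)
  end

  def post(url, **_options)
    Response.new(201)
  end
end

# Hypothetical wrapper: forwards each verb to the wrapped client and
# reports the response together with a per-request context hash.
class SketchInstrumentedClient
  def initialize(client, on_response:, context_base: {})
    @client = client
    @on_response = on_response
    @context_base = context_base
  end

  %i[get post].each do |verb|
    define_method(verb) do |url, **options|
      context = @context_base.merge(method: verb, url: url)
      response = @client.public_send(verb, url, **options)
      @on_response.call(response, context)
      response
    end
  end
end

# Mirrors the early return in wrap_client: no callback, no wrapper.
def wrap(client, on_response, context_base)
  return client if on_response.nil?

  SketchInstrumentedClient.new(client, on_response: on_response, context_base: context_base)
end

seen = []
recorder = ->(resp, ctx) { seen << [ctx[:backend], ctx[:method], resp.status] }
client = wrap(FakeClient.new, recorder, {backend: :http_rb})
client.get("https://example.com")
p seen  # prints [[:http_rb, :get, 200]]
```

Skipping the wrapper when nothing is registered keeps the uninstrumented path free of extra allocations and indirection.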
@@ -27,7 +27,7 @@ module Zenrows
        # @option options [Boolean, Integer] :wait Wait time
        # @option options [String] :wait_for CSS selector to wait for
        # @option options [Hash] :headers Custom HTTP headers
-       # @return [HTTP::Client] Configured HTTP client
+       # @return [HTTP::Client, InstrumentedClient] Configured HTTP client (instrumented if hooks registered)
        def build_client(options = {})
          opts = options.dup
          headers = opts.delete(:headers) || {}
@@ -42,7 +42,7 @@ module Zenrows
          timeouts = calculate_timeouts(opts)
 
          # Build HTTP client with SSL context and proxy
-         HTTP
+         client = HTTP
            .timeout(connect: timeouts[:connect], read: timeouts[:read])
            .headers(headers)
            .via(
@@ -52,6 +52,14 @@ module Zenrows
              proxy_config[:password],
              ssl_context: ssl_context
            )
+
+         # Wrap with instrumentation if hooks registered
+         wrap_client(client, opts)
+       end
+
+       # @return [Symbol] Backend identifier
+       def backend_name
+         :http_rb
        end
      end
    end
@@ -23,7 +23,7 @@ module Zenrows
        # Build a configured HTTP client wrapper
        #
        # @param options [Hash] Request options
-       # @return [NetHttpClient] Configured client wrapper
+       # @return [NetHttpClient, InstrumentedClient] Configured client wrapper (instrumented if hooks registered)
        def build_client(options = {})
          opts = options.dup
          headers = opts.delete(:headers) || {}
@@ -32,12 +32,20 @@ module Zenrows
          proxy_config = proxy.build(opts)
          timeouts = calculate_timeouts(opts)
 
-         NetHttpClient.new(
+         client = NetHttpClient.new(
            proxy_config: proxy_config,
            headers: headers,
            timeouts: timeouts,
            ssl_context: ssl_context
          )
+
+         # Wrap with instrumentation if hooks registered
+         wrap_client(client, opts)
+       end
+
+       # @return [Symbol] Backend identifier
+       def backend_name
+         :net_http
        end
      end
 
@@ -19,6 +19,11 @@ module Zenrows
    #   client = Zenrows::Client.new(api_key: 'KEY', host: 'proxy.zenrows.com')
    #   http = client.http(premium_proxy: true, proxy_country: 'us')
    #
+   # @example With per-client hooks
+   #   client = Zenrows::Client.new do |c|
+   #     c.on_response { |resp, ctx| puts "#{ctx[:host]} -> #{resp.status}" }
+   #   end
+   #
    # @author Ernest Bursa
    # @since 0.1.0
    # @api public
@@ -32,17 +37,30 @@ module Zenrows
      # @return [Backends::Base] HTTP backend instance
      attr_reader :backend
 
+     # @return [Hooks] Hook registry for this client
+     attr_reader :hooks
+
      # Initialize a new client
      #
      # @param api_key [String, nil] Override API key from global config
      # @param host [String, nil] Override proxy host
      # @param port [Integer, nil] Override proxy port
      # @param backend [Symbol] Backend to use (:http_rb)
+     # @yield [config] Optional block for per-client configuration (hooks, etc.)
+     # @yieldparam config [Configuration] Client configuration for hook registration
      # @raise [ConfigurationError] if api_key is not configured
-     def initialize(api_key: nil, host: nil, port: nil, backend: nil)
+     #
+     # @example With per-client hooks
+     #   client = Zenrows::Client.new do |c|
+     #     c.on_response { |resp, ctx| puts resp.status }
+     #   end
+     def initialize(api_key: nil, host: nil, port: nil, backend: nil, &block)
        @config = build_config(api_key: api_key, host: host, port: port, backend: backend)
        @config.validate!
 
+       # Build hooks: start with global, allow per-client additions
+       @hooks = block ? build_hooks(&block) : Zenrows.configuration.hooks.dup
+
        @proxy = Proxy.new(
          api_key: @config.api_key,
          host: @config.host,
@@ -149,14 +167,31 @@ module Zenrows
        backend_name = resolve_backend
        case backend_name
        when :http_rb
-         Backends::HttpRb.new(proxy: proxy, config: config)
+         Backends::HttpRb.new(proxy: proxy, config: config, hooks: hooks)
        when :net_http
-         Backends::NetHttp.new(proxy: proxy, config: config)
+         Backends::NetHttp.new(proxy: proxy, config: config, hooks: hooks)
        else
          raise ConfigurationError, "Unsupported backend: #{backend_name}. Use :http_rb or :net_http"
        end
      end
 
+     # Build hooks registry for this client
+     #
+     # Starts with global hooks, then applies per-client hooks from block.
+     #
+     # @yield [config] Block for registering per-client hooks
+     # @return [Hooks] Combined hooks registry
+     def build_hooks
+       # Start with a copy of global hooks
+       client_hooks = Zenrows.configuration.hooks.dup
+
+       # Create a temporary config-like object for hook registration
+       hook_config = HookConfigurator.new(client_hooks)
+       yield(hook_config)
+
+       client_hooks
+     end
+
      # Resolve which backend to use
      #
      # @return [Symbol] Backend name
@@ -182,4 +217,52 @@ module Zenrows
        false
      end
    end
+
+   # Helper class for per-client hook configuration
+   #
+   # Provides the same hook registration DSL as Configuration.
+   #
+   # @api private
+   class HookConfigurator
+     # @param hooks [Hooks] Hook registry to configure
+     def initialize(hooks)
+       @hooks = hooks
+     end
+
+     # Register a before_request callback
+     def before_request(callable = nil, &block)
+       @hooks.register(:before_request, callable, &block)
+       self
+     end
+
+     # Register an after_request callback
+     def after_request(callable = nil, &block)
+       @hooks.register(:after_request, callable, &block)
+       self
+     end
+
+     # Register an on_response callback
+     def on_response(callable = nil, &block)
+       @hooks.register(:on_response, callable, &block)
+       self
+     end
+
+     # Register an on_error callback
+     def on_error(callable = nil, &block)
+       @hooks.register(:on_error, callable, &block)
+       self
+     end
+
+     # Register an around_request callback
+     def around_request(callable = nil, &block)
+       @hooks.register(:around_request, callable, &block)
+       self
+     end
+
+     # Add a subscriber object
+     def add_subscriber(subscriber)
+       @hooks.add_subscriber(subscriber)
+       self
+     end
+   end
  end
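`HookConfigurator#add_subscriber` hands an object to the registry, and a subscriber may implement any subset of the hook methods. How `Zenrows::Hooks` dispatches to subscribers is not visible in this diff. A plausible approach, sketched here with the hypothetical `SubscriberList` and `StatusCollector` classes, is to call an event method only on subscribers that respond to it.

```ruby
# Hypothetical dispatcher, not the gem's implementation.
class SubscriberList
  def initialize
    @subscribers = []
  end

  def add_subscriber(subscriber)
    @subscribers << subscriber
    self # returning self allows chained registration, as in HookConfigurator
  end

  # Call the event method on every subscriber that implements it.
  def dispatch(event, *args)
    @subscribers.each do |subscriber|
      subscriber.public_send(event, *args) if subscriber.respond_to?(event)
    end
  end
end

# Example subscriber implementing only on_response.
class StatusCollector
  attr_reader :statuses

  def initialize
    @statuses = []
  end

  def on_response(response, _context)
    @statuses << response[:status]
  end
end

collector = StatusCollector.new
list = SubscriberList.new.add_subscriber(collector)

list.dispatch(:on_response, {status: 200}, {})
list.dispatch(:on_error, RuntimeError.new("ignored"), {}) # no on_error method, so skipped
p collector.statuses  # prints [200]
```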
@@ -45,6 +45,9 @@ module Zenrows
      # @return [String] ZenRows API endpoint for ApiClient
      attr_accessor :api_endpoint
 
+     # @return [Zenrows::Hooks] Hook registry for request lifecycle events
+     attr_reader :hooks
+
      # Default configuration values
      DEFAULTS = {
        host: "superproxy.zenrows.com",
@@ -73,9 +76,117 @@ module Zenrows
          @read_timeout = DEFAULTS[:read_timeout]
          @backend = DEFAULTS[:backend]
          @logger = nil
+         @hooks = Hooks.new
        end
      end
 
+     # Register a callback to run before each request
+     #
+     # @param callable [#call, nil] Callable object
+     # @yield [context] Block to execute
+     # @yieldparam context [Hash] Request context
+     # @return [self]
+     #
+     # @example
+     #   config.before_request { |ctx| puts "Starting: #{ctx[:url]}" }
+     def before_request(callable = nil, &block)
+       hooks.register(:before_request, callable, &block)
+       self
+     end
+
+     # Register a callback to run after each request (always runs)
+     #
+     # @param callable [#call, nil] Callable object
+     # @yield [context] Block to execute
+     # @yieldparam context [Hash] Request context
+     # @return [self]
+     #
+     # @example
+     #   config.after_request { |ctx| puts "Finished: #{ctx[:duration]}s" }
+     def after_request(callable = nil, &block)
+       hooks.register(:after_request, callable, &block)
+       self
+     end
+
+     # Register a callback to run on successful response
+     #
+     # @param callable [#call, nil] Callable object
+     # @yield [response, context] Block to execute
+     # @yieldparam response [Object] HTTP response
+     # @yieldparam context [Hash] Request context with ZenRows headers
+     # @return [self]
+     #
+     # @example Log by host
+     #   config.on_response { |resp, ctx| puts "#{ctx[:host]} -> #{resp.status}" }
+     #
+     # @example Track costs
+     #   config.on_response do |resp, ctx|
+     #     cost = ctx[:zenrows_headers][:request_cost]
+     #     StatsD.increment('zenrows.cost', cost) if cost
+     #   end
+     def on_response(callable = nil, &block)
+       hooks.register(:on_response, callable, &block)
+       self
+     end
+
+     # Register a callback to run on request error
+     #
+     # @param callable [#call, nil] Callable object
+     # @yield [error, context] Block to execute
+     # @yieldparam error [Exception] The error that occurred
+     # @yieldparam context [Hash] Request context
+     # @return [self]
+     #
+     # @example
+     #   config.on_error { |err, ctx| Sentry.capture_exception(err) }
+     def on_error(callable = nil, &block)
+       hooks.register(:on_error, callable, &block)
+       self
+     end
+
+     # Register a callback to wrap around requests
+     #
+     # Around callbacks can modify timing, add retries, etc.
+     # The block MUST call the passed block and return its result.
+     #
+     # @param callable [#call, nil] Callable object
+     # @yield [context, &block] Block to execute
+     # @yieldparam context [Hash] Request context
+     # @yieldparam block [Proc] Block to call to execute the request
+     # @return [self]
+     #
+     # @example Timing
+     #   config.around_request do |ctx, &block|
+     #     start = Time.now
+     #     response = block.call
+     #     puts "Request took #{Time.now - start}s"
+     #     response
+     #   end
+     def around_request(callable = nil, &block)
+       hooks.register(:around_request, callable, &block)
+       self
+     end
+
+     # Add a subscriber object for hook events
+     #
+     # Subscribers can implement any of: before_request, after_request,
+     # on_response, on_error, around_request.
+     #
+     # @param subscriber [Object] Object responding to hook methods
+     # @return [self]
+     #
+     # @example
+     #   class MySubscriber
+     #     def on_response(response, context)
+     #       puts response.status
+     #     end
+     #   end
+     #   config.add_subscriber(MySubscriber.new)
+     def add_subscriber(subscriber)
+       hooks.add_subscriber(subscriber)
+       self
+     end
+
      # Validate that required configuration is present
      #
      # @raise [ConfigurationError] if api_key is missing
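The `around_request` documentation in this hunk requires each callback to call the passed block and return its result, and its timing example uses `Time.now`, whereas the changelog mentions a monotonic clock for durations. The sketch below combines both ideas. `run_around`, `TIMER`, and `TAGGER` are hypothetical, and the nesting order (first registered is outermost) is an assumption, since the gem's own `run_around` is not part of this diff.

```ruby
# Monotonic time is unaffected by wall-clock adjustments, which makes it
# a safer basis for durations than Time.now.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Nest around callbacks so the first one registered is the outermost wrapper.
def run_around(callbacks, context, &request)
  chain = callbacks.reverse.inject(request) do |inner, callback|
    -> { callback.call(context, &inner) }
  end
  chain.call
end

# Records the elapsed time even if the request raises.
TIMER = lambda do |ctx, &request|
  started = monotonic_now
  begin
    request.call
  ensure
    ctx[:duration] = monotonic_now - started
  end
end

# Marks entry and exit, and returns the inner result unchanged.
TAGGER = lambda do |ctx, &request|
  ctx[:order] = (ctx[:order] || []) << :tagger_in
  result = request.call
  ctx[:order] << :tagger_out
  result
end

context = {}
response = run_around([TIMER, TAGGER], context) { :response }

p response         # prints :response
p context[:order]  # prints [:tagger_in, :tagger_out]
puts context[:duration] >= 0  # prints true
```

A callback that forgets to return the inner result would replace the response for every caller, which is why the documentation insists on it.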