require-hooks 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 25df07edf75eef2e55e17548e0e02d54ba54e542e4ee3ebe20ec54958d3e2385
4
- data.tar.gz: 8337f6fc90e9ee9e917ce30c3f1eb7536309ca590b0882798e70969d519cf626
3
+ metadata.gz: 0ad50776a8d1a68e226e2813265d503d7e11ecba3a991daf56945a07ca0ae34c
4
+ data.tar.gz: 906c6874932cfeb3e684b45f572b33f18f25304983a9566c01d540b7570c42ae
5
5
  SHA512:
6
- metadata.gz: 55979ba3219aacfc8a1626e4c198518f29e527a1bb547e39d86f4fd170d1562c71166df87e50c5d4ad5a7deb58adfa5248dfeac85baa21a41839cecac4441460
7
- data.tar.gz: 33c22e744609eccfd80eb46949e7e5740a563cabbbfb9e0a6d68e6fede8a377ea10cfb47998670ae67003e7723e6b2f9c58e3a7cce832488409f254624a4f2f0
6
+ metadata.gz: 89c6aea0a0fe1c1b1a9dcf4831435d0a6c98d52149ffd8fc079e5fd134a768f8203253b7d5ec9b0148ef5d1f595f19b95274d18f8e889ed4e140a0a1258cc524
7
+ data.tar.gz: cdc4edf49fc5c7d7f9201e800ee54fda332a1fa94595bc2cf625b0fa032aaf0aa6d72f2653753f399f9bc8ac1d7da9426dab6620197f19f0db8394a390921b9e
data/CHANGELOG.md CHANGED
@@ -2,6 +2,10 @@
2
2
 
3
3
  ## master
4
4
 
5
+ ## 0.2.0 (2023-08-23)
6
+
7
+ - Add `patterns` and `exclude_patterns` options to hooks. ([@palkan][])
8
+
5
9
  ## 0.1.0 (2023-07-14)
6
10
 
7
11
  - Extracted from Ruby Next. ([@palkan][])
data/README.md CHANGED
@@ -33,24 +33,29 @@ spec.add_dependency "require-hooks"
33
33
 
34
34
  ## Usage
35
35
 
36
- To enable hooks, you need to load `require-hooks/setup` as early as possible. For example, in your gem's entrypoint:
36
+ To enable hooks, you need to load `require-hooks/setup` before any code that you want to pre-process via hooks:
37
37
 
38
38
  ```ruby
39
39
  require "require-hooks/setup"
40
40
  ```
41
41
 
42
+ For example, in an application (e.g., Rails), you may want to only process the source files you own, so you must activate Require Hooks after loading the dependencies (e.g., in the `config/application.rb` file right after `Bundler.require(*)`).
43
+
44
+ If you want to pre-process all files, you can activate Require Hooks earlier.
45
+
42
46
  Then, you can add hooks:
43
47
 
44
48
  - **around_load:** a hook that wraps the code loading operation. Useful for logging and debugging purposes.
45
49
 
46
50
  ```ruby
47
51
  # Simple logging
48
- RequireHooks.around_load do |path, &block|
52
+ RequireHooks.around_load(patterns: ["/gem/dir/*.rb"]) do |path, &block|
49
53
  puts "Loading #{path}"
50
54
  block.call.tap { puts "Loaded #{path}" }
51
55
  end
52
56
 
53
- # Error enrichment
57
+ # Error enrichment.
58
+ # No patterns — all files are affected.
54
59
  RequireHooks.around_load do |path, &block|
55
60
  block.call
56
61
  rescue SyntaxError => e
@@ -63,8 +68,7 @@ The return value MUST be a result of calling the passed block.
63
68
  - **source_transform:** perform source-to-source transformations.
64
69
 
65
70
  ```ruby
66
- RequireHooks.source_transform do |path, source|
67
- next unless path =~ /my_project\/.*/
71
+ RequireHooks.source_transform(patterns: ["/my_project/*.rb"], exclude_patterns: ["/my_project/vendor/*"]) do |path, source|
68
72
  source ||= File.read(path)
69
73
  "# frozen_string_literal: true\n#{source}"
70
74
  end
@@ -75,9 +79,8 @@ The return value MUST be either String (new source code) or `nil` (indicating th
75
79
  - **hijack_load:** a hook that is used to manually compile byte code for VM to load it.
76
80
 
77
81
  ```ruby
78
- RequireHooks.hijack_load do |path, source|
79
- next unless path =~ /my_project\/.*/
80
-
82
+ # Pattern can be a Proc. If it returns `true`, the hijacker is used.
83
+ RequireHooks.hijack_load(patterns: ["/my_project/*.rb"]) do |path, source|
81
84
  source ||= File.read(path)
82
85
  if defined?(RubyVM::InstructionSequence)
83
86
  RubyVM::InstructionSequence.compile(source)
@@ -89,6 +92,8 @@ end
89
92
 
90
93
  The return value is platform-specific. If there are multiple _hijackers_, the first one that returns a non-`nil` value is used, others are ignored.
91
94
 
95
+ **NOTE:** The `patterns` and `exclude_patterns` arguments accept globs as recognized by [File.fnmatch](https://rubyapi.org/3.2/o/file#method-c-fnmatch).
96
+
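The matching rule behind `patterns` and `exclude_patterns` can be sketched in isolation. The helper below is hypothetical (`hook_applies?` is not part of the gem's API); it only mirrors the selection logic visible in `RequireHooks.context_for` and assumes the patterns are plain glob strings passed to `File.fnmatch?` without extra flags.

```ruby
# Hypothetical helper mirroring the selection logic in `RequireHooks.context_for`:
# a hook applies when no patterns are given or at least one pattern matches,
# and no exclude pattern matches.
def hook_applies?(path, patterns: nil, exclude_patterns: nil)
  return false unless patterns.nil? || patterns.any? { |pattern| File.fnmatch?(pattern, path) }
  return false if exclude_patterns&.any? { |pattern| File.fnmatch?(pattern, path) }

  true
end

patterns = ["/my_project/*.rb"]
exclude_patterns = ["/my_project/vendor/*"]

# Without File::FNM_PATHNAME, `*` also matches "/", so nested files are covered.
nested = hook_applies?("/my_project/lib/foo.rb", patterns: patterns, exclude_patterns: exclude_patterns)
# Matches the include pattern, but the exclude pattern wins.
vendored = hook_applies?("/my_project/vendor/gem/foo.rb", patterns: patterns, exclude_patterns: exclude_patterns)
# Does not match any include pattern.
foreign = hook_applies?("/other/foo.rb", patterns: patterns, exclude_patterns: exclude_patterns)
# No patterns at all: every path is affected.
unscoped = hook_applies?("/other/foo.rb")

puts [nested, vendored, foreign, unscoped].inspect
```

Because `File.fnmatch?` is called without `File::FNM_PATHNAME`, a single `*` spans directory separators, which is why `"/my_project/*.rb"` also covers files in subdirectories.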
92
97
  ## Modes
93
98
 
94
99
  Depending on the runtime conditions, Require Hooks picks an optimal strategy for injecting the code. You can enforce a particular _mode_ by setting the `REQUIRE_HOOKS_MODE` env variable (`patch`, `load_iseq` or `bootsnap`). In practice, only setting it to `patch` may make sense.
@@ -118,6 +123,66 @@ The _around load_ hooks are executed for all files independently of whether they
118
123
 
119
124
  Thus, if you introduce new source transformers or hijackers, you must invalidate the cache. (We plan to implement automatic invalidation in future versions.)
120
125
 
126
+ ## Limitations
127
+
128
+ - `Kernel#load` with a wrap argument (e.g., `load "some_path", true` or `load "some_path", MyModule`) is not supported (it falls back to the original implementation). The biggest challenge here is supporting constant nesting.
129
+ - Some rare symlinking edge cases are not supported (unlikely to affect real-world projects).
130
+
131
+ ## Performance
132
+
133
+ We conducted a benchmark to measure the performance overhead of Require Hooks using a large Rails project with the following characteristics:
134
+
135
+ ```sh
136
+ $ find config/ lib/ app/ -name "*.rb" | wc -l
137
+
138
+ 2689
139
+ ```
140
+
141
+ ```sh
142
+ $ bundle list | wc -l
143
+
144
+ 427
145
+ ```
146
+
147
+ Total number of `#require` calls: **12741**.
148
+
149
+ We activated Require Hooks at the very start of the program (`config/boot.rb`).
150
+
151
+ There is a single around load hook to count all the calls:
152
+
153
+ ```ruby
154
+ counter = 0
155
+ RequireHooks.around_load do |_, &block|
156
+ counter += 1
157
+ block.call
158
+ end
159
+
160
+ at_exit { puts "Total hooked calls: #{counter}" }
161
+ ```
162
+
163
+ ## Results
164
+
165
+ All tests were run with `eager_load=true`.
166
+
167
+ Test script: `time bundle exec rails runner 'puts "done"'`.
168
+
169
+ | Configuration | Time |
170
+ |-------------------------------------|--------------|
171
+ | baseline | 29s |
172
+ | baseline w/bootsnap  | 12s |
173
+ | rhooks (iseq)  | 30s |
174
+ | rhooks (patch)  | **8m** |
175
+ | rhooks (bootsnap)  | 12s |
176
+
177
+ As the table shows, requiring a large number of files with Require Hooks in patch mode is currently very slow. The main reason is that we MUST check `$LOADED_FEATURES` for the file we are about to load, and we currently do this via a `$LOADED_FEATURES.include?(path)` call, which becomes very slow when `$LOADED_FEATURES` is huge. Thus, on non-MRI platforms, we recommend activating Require Hooks after loading all the dependencies and limiting the scope of affected files (via the `patterns` option) to avoid this overhead.
178
+
179
+ **NOTE:** Why is Ruby's internal implementation fast despite doing the same checks? It uses an internal hash table to keep track of the loaded features (`vm->loaded_features_realpaths`), not an array. Unfortunately, it's not accessible from Ruby.
180
+
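The cost difference between an array scan and a hash-based lookup can be illustrated with a synthetic sketch. This is not how the gem tracks features; the paths and sizes below are made up, and the `Set` only stands in for the kind of hash table the note describes. Timings will vary by machine, so none are shown.

```ruby
require "set"
require "benchmark"

# Synthetic stand-in for a large $LOADED_FEATURES array (paths are made up).
features = Array.new(20_000) { |i| "/app/lib/feature_#{i}.rb" }
# Hash-based index over the same paths.
feature_index = Set.new(features)

# Worst case for the array: the entry we look for is the last one.
target = features.last

# Array#include? scans linearly; Set#include? is a hash lookup.
array_time = Benchmark.realtime { 200.times { features.include?(target) } }
set_time = Benchmark.realtime { 200.times { feature_index.include?(target) } }

puts format("Array#include?: %.4fs, Set#include?: %.4fs", array_time, set_time)
```

Every hooked `require` pays the array scan at least once, which is consistent with the overhead growing with both the number of loaded features and the number of affected files.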
181
+ Here are the numbers for the same project with scoped hooks (only some folders) activated after `Bundler.require(*)`:
182
+
183
+ - 732 files affected: 2m36s (vs. 30s w/o hooks)
184
+ - 153 files affected: 55s (vs. 30s w/o hooks)
185
+
121
186
  ## Contributing
122
187
 
123
188
  Bug reports and pull requests are welcome on GitHub at [https://github.com/ruby-next/require-hooks](https://github.com/ruby-next/require-hooks).
@@ -5,7 +5,61 @@ module RequireHooks
5
5
  @@source_transform = []
6
6
  @@hijack_load = []
7
7
 
8
+ class Context
9
+ def initialize(around_load, source_transform, hijack_load)
10
+ @around_load = around_load
11
+ @source_transform = source_transform
12
+ @hijack_load = hijack_load
13
+ end
14
+
15
+ def empty?
16
+ @around_load.empty? && @source_transform.empty? && @hijack_load.empty?
17
+ end
18
+
19
+ def source_transform?
20
+ @source_transform.any?
21
+ end
22
+
23
+ def hijack?
24
+ @hijack_load.any?
25
+ end
26
+
27
+ def run_around_load_callbacks(path)
28
+ return yield if @around_load.empty?
29
+
30
+ chain = @around_load.reverse.inject do |acc_proc, next_proc|
31
+ proc { |path, &block| acc_proc.call(path) { next_proc.call(path, &block) } }
32
+ end
33
+
34
+ chain.call(path) { yield }
35
+ end
36
+
37
+ def perform_source_transform(path)
38
+ return unless @source_transform.any?
39
+
40
+ source = nil
41
+
42
+ @source_transform.each do |transform|
43
+ source = transform.call(path, source) || source
44
+ end
45
+
46
+ source
47
+ end
48
+
49
+ def try_hijack_load(path, source)
50
+ return unless @hijack_load.any?
51
+
52
+ @hijack_load.each do |hijack|
53
+ result = hijack.call(path, source)
54
+ return result if result
55
+ end
56
+ nil
57
+ end
58
+ end
59
+
8
60
  class << self
61
+ attr_accessor :print_warnings
62
+
9
63
  # Define a block to wrap the code loading.
10
64
  # The return value MUST be a result of calling the passed block.
11
65
  # For example, you can use such hooks for instrumentation, debugging purposes.
@@ -14,8 +68,8 @@ module RequireHooks
14
68
  # puts "Loading #{path}"
15
69
  # block.call.tap { puts "Loaded #{path}" }
16
70
  # end
17
- def around_load(&block)
18
- @@around_load << block
71
+ def around_load(patterns: nil, exclude_patterns: nil, &block)
72
+ @@around_load << [patterns, exclude_patterns, block]
19
73
  end
20
74
 
21
75
  # Define hooks to perform source-to-source transformations.
@@ -28,8 +82,8 @@ module RequireHooks
28
82
  # RequireHooks.source_transform do |path, source|
29
83
  # "# frozen_string_literal: true\n#{source}"
30
84
  # end
31
- def source_transform(&block)
32
- @@source_transform << block
85
+ def source_transform(patterns: nil, exclude_patterns: nil, &block)
86
+ @@source_transform << [patterns, exclude_patterns, block]
33
87
  end
34
88
 
35
89
  # This hook should be used to manually compile byte code to be loaded by the VM.
@@ -46,40 +100,33 @@ module RequireHooks
46
100
  # JRuby.compile(source)
47
101
  # end
48
102
  # end
49
- def hijack_load(&block)
50
- @@hijack_load << block
103
+ def hijack_load(patterns: nil, exclude_patterns: nil, &block)
104
+ @@hijack_load << [patterns, exclude_patterns, block]
51
105
  end
52
106
 
53
- def run_around_load_callbacks(path)
54
- return yield if @@around_load.empty?
107
+ def context_for(path)
108
+ around_load = @@around_load.select do |patterns, exclude_patterns, _block|
109
+ next unless !patterns || patterns.any? { |pattern| File.fnmatch?(pattern, path) }
110
+ next if exclude_patterns&.any? { |pattern| File.fnmatch?(pattern, path) }
55
111
 
56
- chain = @@around_load.reverse.inject do |acc_proc, next_proc|
57
- proc { |path, &block| acc_proc.call(path) { next_proc.call(path, &block) } }
58
- end
112
+ true
113
+ end.map { |_patterns, _exclude_patterns, block| block }
59
114
 
60
- chain.call(path) { yield }
61
- end
115
+ source_transform = @@source_transform.select do |patterns, exclude_patterns, _block|
116
+ next unless !patterns || patterns.any? { |pattern| File.fnmatch?(pattern, path) }
117
+ next if exclude_patterns&.any? { |pattern| File.fnmatch?(pattern, path) }
62
118
 
63
- def perform_source_transform(path)
64
- return unless @@source_transform.any?
119
+ true
120
+ end.map { |_patterns, _exclude_patterns, block| block }
65
121
 
66
- source = nil
122
+ hijack_load = @@hijack_load.select do |patterns, exclude_patterns, _block|
123
+ next unless !patterns || patterns.any? { |pattern| File.fnmatch?(pattern, path) }
124
+ next if exclude_patterns&.any? { |pattern| File.fnmatch?(pattern, path) }
67
125
 
68
- @@source_transform.each do |transform|
69
- source = transform.call(path, source) || source
70
- end
71
-
72
- source
73
- end
126
+ true
127
+ end.map { |_patterns, _exclude_patterns, block| block }
74
128
 
75
- def try_hijack_load(path, source)
76
- return unless @@hijack_load.any?
77
-
78
- @@hijack_load.each do |hijack|
79
- result = hijack.call(path, source)
80
- return result if result
81
- end
82
- nil
129
+ Context.new(around_load, source_transform, hijack_load)
83
130
  end
84
131
  end
85
132
  end
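The `reverse.inject` composition used by `Context#run_around_load_callbacks` in the hunk above can be reproduced standalone to see the resulting nesting order. The two hooks and the `log` array below are hypothetical; only the composition expression is taken from the diff.

```ruby
log = []

# Two hypothetical around hooks, in registration order.
hooks = [
  proc { |_path, &block| log << "a:before"; block.call.tap { log << "a:after" } },
  proc { |_path, &block| log << "b:before"; block.call.tap { log << "b:after" } }
]

# Same composition as in Context#run_around_load_callbacks.
chain = hooks.reverse.inject do |acc_proc, next_proc|
  proc { |path, &block| acc_proc.call(path) { next_proc.call(path, &block) } }
end

# The block plays the role of the actual file load.
result = chain.call("some/file.rb") do
  log << "load"
  :loaded
end

puts log.inspect
# → ["b:before", "a:before", "load", "a:after", "b:after"]
puts result.inspect
# → :loaded
```

With this composition the last registered hook ends up outermost, and the chain returns whatever the innermost block returns, as long as every hook returns the result of calling its block.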
@@ -4,8 +4,10 @@ module RequireHooks
4
4
  module Bootsnap
5
5
  module CompileCacheExt
6
6
  def input_to_storage(source, path, *)
7
- new_contents = RequireHooks.perform_source_transform(path)
8
- hijacked = RequireHooks.try_hijack_load(path, new_contents)
7
+ ctx = RequireHooks.context_for(path)
8
+
9
+ new_contents = ctx.perform_source_transform(path)
10
+ hijacked = ctx.try_hijack_load(path, new_contents)
9
11
 
10
12
  if hijacked
11
13
  raise TypeError, "Unsupported bytecode format for #{path}: #{hijacked.class}" unless hijacked.is_a?(::RubyVM::InstructionSequence)
@@ -24,7 +26,7 @@ module RequireHooks
24
26
  # Around hooks must be performed every time we trigger a file load, even if
25
27
  # the file is already cached.
26
28
  def load_iseq(path)
27
- RequireHooks.run_around_load_callbacks(path) { super }
29
+ RequireHooks.context_for(path).run_around_load_callbacks(path) { super }
28
30
  end
29
31
  end
30
32
  end
@@ -7,9 +7,13 @@ module RequireHooks
7
7
  module KernelPatch
8
8
  class << self
9
9
  def load(path)
10
- RequireHooks.run_around_load_callbacks(path) do
11
- new_contents = RequireHooks.perform_source_transform(path)
12
- hijacked = RequireHooks.try_hijack_load(path, new_contents)
10
+ ctx = RequireHooks.context_for(path)
11
+
12
+ ctx.run_around_load_callbacks(path) do
13
+ next load_without_require_hooks(path) unless ctx.source_transform? || ctx.hijack?
14
+
15
+ new_contents = ctx.perform_source_transform(path)
16
+ hijacked = ctx.try_hijack_load(path, new_contents)
13
17
 
14
18
  return try_evaluate(path, hijacked) if hijacked
15
19
 
@@ -104,7 +108,7 @@ module RequireHooks
104
108
 
105
109
  # Recursive require
106
110
  if lock.owned? && lock.locked?
107
- warn "circular require considered harmful: #{fname}"
111
+ warn "loading in progress, circular require considered harmful: #{fname}" if RequireHooks.print_warnings
108
112
  return yield(true)
109
113
  end
110
114
 
@@ -227,47 +231,38 @@ module Kernel
227
231
  raise TypeError unless path.is_a?(::String)
228
232
 
229
233
  realpath = nil
234
+ feature = path
230
235
 
231
236
  # if extname == ".rb" => lookup feature -> resolve feature -> load
232
237
  # if extname != ".rb" => append ".rb" - lookup feature -> resolve feature -> lookup orig (no ext) -> resolve orig (no ext) -> load
233
238
  if File.extname(path) != ".rb"
234
- return false if RequireHooks::KernelPatch::Features.feature_loaded?(path + ".rb")
235
-
236
- loaded = RequireHooks::KernelPatch::Features::LOCK.lock_feature(path + ".rb") do |loaded|
237
- return false if loaded
238
-
239
- realpath = RequireHooks::KernelPatch::Features.feature_path(path + ".rb")
239
+ realpath = RequireHooks::KernelPatch::Features.feature_path(path + ".rb")
240
240
 
241
- if realpath
242
- $LOADED_FEATURES << realpath
243
- RequireHooks::KernelPatch.load(realpath)
244
- true
245
- end
241
+ if realpath
242
+ feature = path + ".rb"
246
243
  end
247
-
248
- return true if loaded
249
244
  end
250
245
 
251
- return false if RequireHooks::KernelPatch::Features.feature_loaded?(path)
246
+ realpath ||= RequireHooks::KernelPatch::Features.feature_path(path)
252
247
 
253
- loaded = RequireHooks::KernelPatch::Features::LOCK.lock_feature(path) do |loaded|
254
- return false if loaded
248
+ return require_without_require_hooks(path) unless realpath
255
249
 
256
- realpath = RequireHooks::KernelPatch::Features.feature_path(path)
250
+ ctx = RequireHooks.context_for(realpath)
257
251
 
258
- if realpath
259
- $LOADED_FEATURES << realpath
260
- RequireHooks::KernelPatch.load(realpath)
261
- true
262
- end
263
- end
252
+ return require_without_require_hooks(path) if ctx.empty?
264
253
 
265
- return true if loaded
254
+ return false if RequireHooks::KernelPatch::Features.feature_loaded?(feature)
266
255
 
267
- require_without_require_hooks(path)
256
+ RequireHooks::KernelPatch::Features::LOCK.lock_feature(feature) do |loaded|
257
+ return false if loaded
258
+
259
+ $LOADED_FEATURES << realpath
260
+ RequireHooks::KernelPatch.load(realpath)
261
+ true
262
+ end
268
263
  rescue LoadError => e
269
264
  $LOADED_FEATURES.delete(realpath) if realpath
270
- warn "RequireHooks failed to require '#{path}': #{e.message}"
265
+ warn "RequireHooks failed to require '#{path}': #{e.message}" if RequireHooks.print_warnings
271
266
  require_without_require_hooks(path)
272
267
  rescue Errno::ENOENT, Errno::EACCES
273
268
  raise LoadError, "cannot load such file -- #{path}"
@@ -301,7 +296,7 @@ module Kernel
301
296
  alias_method :load_without_require_hooks, :load
302
297
  def load(path, wrap = false)
303
298
  if wrap
304
- warn "RequireHooks does not support `load(smth, wrap: ...)`. Falling back to original `Kernel#load`"
299
+ warn "RequireHooks does not support `load(smth, wrap: ...)`. Falling back to original `Kernel#load`" if RequireHooks.print_warnings
305
300
  return load_without_require_hooks(path, wrap)
306
301
  end
307
302
 
@@ -325,7 +320,7 @@ module Kernel
325
320
  rescue Errno::ENOENT, Errno::EACCES
326
321
  raise LoadError, "cannot load such file -- #{path}"
327
322
  rescue LoadError => e
328
- warn "RequireHooks failed to load '#{path}': #{e.message}"
323
+ warn "RequireHooks failed to load '#{path}': #{e.message}" if RequireHooks.print_warnings
329
324
  load_without_require_hooks(path)
330
325
  end
331
326
  end
@@ -3,15 +3,19 @@
3
3
  module RequireHooks
4
4
  module LoadIseq
5
5
  def load_iseq(path)
6
- RequireHooks.run_around_load_callbacks(path) do
7
- new_contents = RequireHooks.perform_source_transform(path)
8
- hijacked = RequireHooks.try_hijack_load(path, new_contents)
6
+ ctx = RequireHooks.context_for(path)
9
7
 
10
- if hijacked
11
- raise TypeError, "Unsupported bytecode format for #{path}: #{hijacked.class}" unless hijacked.is_a?(::RubyVM::InstructionSequence)
12
- return hijacked
13
- elsif new_contents
14
- return RubyVM::InstructionSequence.compile(new_contents, path, path, 1)
8
+ ctx.run_around_load_callbacks(path) do
9
+ if ctx.source_transform? || ctx.hijack?
10
+ new_contents = ctx.perform_source_transform(path)
11
+ hijacked = ctx.try_hijack_load(path, new_contents)
12
+
13
+ if hijacked
14
+ raise TypeError, "Unsupported bytecode format for #{path}: #{hijacked.class}" unless hijacked.is_a?(::RubyVM::InstructionSequence)
15
+ return hijacked
16
+ elsif new_contents
17
+ return RubyVM::InstructionSequence.compile(new_contents, path, path, 1)
18
+ end
15
19
  end
16
20
 
17
21
  defined?(super) ? super : RubyVM::InstructionSequence.compile_file(path)
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module RequireHooks
4
- VERSION = "0.1.0"
4
+ VERSION = "0.2.0"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: require-hooks
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Vladimir Dementyev
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2023-07-15 00:00:00.000000000 Z
11
+ date: 2023-08-23 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: Require Hooks provide infrastructure for intercepting require/load calls
14
14
  in Ruby