selma 0.3.0-x64-mingw-ucrt → 0.4.0-x64-mingw-ucrt

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a0ddf7d69640ad327cf8e8efd8e35e0441f6d4910e98d98629f96254d22268f8
4
- data.tar.gz: 0c1ff8f5237f35d2353b6d78f79a2316ae052631094920cb17ff8dda5062ff56
3
+ metadata.gz: '09223eb3558240eceac68e85d2b1176d5200bfe0e5615f2fac578ce0aa009442'
4
+ data.tar.gz: 2055d3638f4f85f1b8b7257e820c441894fb5ee78185f9caa43334c6465b3406
5
5
  SHA512:
6
- metadata.gz: 94ecfa9b7f1ef2f36a9397880ed901900fc128996d3c79b4ef337cbd7f3e0fbeebcd66dea3e509b29cffb2a21a951f49263648c8485a4954d7ccd5987f532e52
7
- data.tar.gz: 806f0ed6ff134d58e7aa5a63a455ee5e0364a29d77b2169893d1712c9268b43efd5f1f1c1dcb8c98b2092efdd0133846043868680d69b30a55847051ec9b3365
6
+ metadata.gz: 870af5762919e3f45bf4c9386811910802fbc9c2f8dec2447ae84335694978044b117aa00803201a3bfbe680038be99e2a4ee5cd489eb689071a01cec58519e5
7
+ data.tar.gz: dc883f86c62281e50abfb6dccd77ac79dbea6068f2f683b2a37a67ab01d2332e968eaa7abfaa2ce9e7846f7eb6ee98d1215d3e3071997744377010802ecb128e
data/README.md CHANGED
@@ -180,6 +180,19 @@ The `element` argument in `handle_element` has the following methods:
180
180
  - `after(content, as: content_type)`: Inserts `content` after the text. `content_type` is either `:text` or `:html` and determines how the content will be applied.
181
181
  - `replace(content, as: content_type)`: Replaces the text node with `content`. `content_type` is either `:text` or `:html` and determines how the content will be applied.
182
182
 
183
+ ## Security
184
+
185
+ Theoretically, a malicious user can provide a very large document for processing, which can exhaust the memory of the host machine. To set a limit on how much string content is processed at once, you can provide two options into the `memory` namespace:
186
+
187
+ ```ruby
188
+ memory: {
189
+ max_allowed_memory_usage: 1000,
190
+ preallocated_parsing_buffer_size: 100,
191
+ },
192
+ ```
193
+
194
+ Note that `preallocated_parsing_buffer_size` must always be less than `max_allowed_memory_usage`. See [the `lol_html` project documentation](https://docs.rs/lol_html/1.2.1/lol_html/struct.MemorySettings.html) to learn more about the default values.
195
+
183
196
  ## Benchmarks
184
197
 
185
198
  When you run `bundle exec rake benchmark`, two different benchmarks are calculated. Here are those results on my machine.
@@ -191,30 +204,33 @@ Comparing Selma against popular Ruby sanitization gems:
191
204
  <!-- prettier-ignore-start -->
192
205
  <details>
193
206
  <pre>
207
+ input size = 25309 bytes, 0.03 MB
208
+
209
+ ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
194
210
  Warming up --------------------------------------
195
- sanitize-sm 15.000 i/100ms
196
- selma-sm 126.000 i/100ms
211
+ sanitize-sm 16.000 i/100ms
212
+ selma-sm 214.000 i/100ms
197
213
  Calculating -------------------------------------
198
- sanitize-sm 155.074 (± 1.9%) i/s - 4.665k in 30.092214s
199
- selma-sm 1.290k (± 1.3%) i/s - 38.808k in 30.085333s
214
+ sanitize-sm 171.670 (± 1.2%) i/s - 5.152k in 30.017081s
215
+ selma-sm 2.146k (± 3.0%) i/s - 64.414k in 30.058470s
200
216
 
201
217
  Comparison:
202
- selma-sm: 1290.1 i/s
203
- sanitize-sm: 155.1 i/s - 8.32x slower
218
+ selma-sm: 2145.8 i/s
219
+ sanitize-sm: 171.7 i/s - 12.50x slower
204
220
 
205
221
  input size = 86686 bytes, 0.09 MB
206
222
 
207
223
  ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
208
224
  Warming up --------------------------------------
209
- sanitize-md 3.000 i/100ms
210
- selma-md 33.000 i/100ms
225
+ sanitize-md 4.000 i/100ms
226
+ selma-md 56.000 i/100ms
211
227
  Calculating -------------------------------------
212
- sanitize-md 40.321 (± 5.0%) i/s - 1.206k in 30.004711s
213
- selma-md 337.417 (± 1.5%) i/s - 10.131k in 30.032772s
228
+ sanitize-md 44.397 (± 2.3%) i/s - 1.332k in 30.022430s
229
+ selma-md 558.448 (± 1.4%) i/s - 16.800k in 30.089196s
214
230
 
215
231
  Comparison:
216
- selma-md: 337.4 i/s
217
- sanitize-md: 40.3 i/s - 8.37x slower
232
+ selma-md: 558.4 i/s
233
+ sanitize-md: 44.4 i/s - 12.58x slower
218
234
 
219
235
  input size = 7172510 bytes, 7.17 MB
220
236
 
@@ -223,12 +239,12 @@ Warming up --------------------------------------
223
239
  sanitize-lg 1.000 i/100ms
224
240
  selma-lg 1.000 i/100ms
225
241
  Calculating -------------------------------------
226
- sanitize-lg 0.144 (± 0.0%) i/s - 5.000 in 34.772526s
227
- selma-lg 4.026 (± 0.0%) i/s - 121.000 in 30.067415s
242
+ sanitize-lg 0.163 (± 0.0%) i/s - 6.000 in 37.375628s
243
+ selma-lg 6.750 (± 0.0%) i/s - 203.000 in 30.080976s
228
244
 
229
245
  Comparison:
230
- selma-lg: 4.0 i/s
231
- sanitize-lg: 0.1 i/s - 27.99x slower
246
+ selma-lg: 6.7 i/s
247
+ sanitize-lg: 0.2 i/s - 41.32x slower
232
248
  </pre>
233
249
  </details>
234
250
  <!-- prettier-ignore-end -->
@@ -239,41 +255,39 @@ Comparing Selma against popular Ruby HTML parsing gems:
239
255
 
240
256
  <!-- prettier-ignore-start -->
241
257
  <details>
242
- <pre>
243
-
244
- input size = 25309 bytes, 0.03 MB
258
+ <pre>input size = 25309 bytes, 0.03 MB
245
259
 
246
260
  ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
247
261
  Warming up --------------------------------------
248
- nokogiri-sm 79.000 i/100ms
249
- nokolexbor-sm 285.000 i/100ms
250
- selma-sm 244.000 i/100ms
262
+ nokogiri-sm 107.000 i/100ms
263
+ nokolexbor-sm 340.000 i/100ms
264
+ selma-sm 380.000 i/100ms
251
265
  Calculating -------------------------------------
252
- nokogiri-sm 807.790 (± 3.1%) i/s - 24.253k in 30.056301s
253
- nokolexbor-sm 2.880k (± 6.4%) i/s - 86.070k in 30.044766s
254
- selma-sm 2.508k (± 1.2%) i/s - 75.396k in 30.068792s
266
+ nokogiri-sm 1.073k (± 2.1%) i/s - 32.207k in 30.025474s
267
+ nokolexbor-sm 3.300k (±13.2%) i/s - 27.540k in 36.788212s
268
+ selma-sm 3.779k (± 3.4%) i/s - 113.240k in 30.013908s
255
269
 
256
270
  Comparison:
257
- nokolexbor-sm: 2880.3 i/s
258
- selma-sm: 2507.8 i/s - 1.15x slower
259
- nokogiri-sm: 807.8 i/s - 3.57x slower
271
+ selma-sm: 3779.4 i/s
272
+ nokolexbor-sm: 3300.1 i/s - same-ish: difference falls within error
273
+ nokogiri-sm: 1073.1 i/s - 3.52x slower
260
274
 
261
275
  input size = 86686 bytes, 0.09 MB
262
276
 
263
277
  ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
264
278
  Warming up --------------------------------------
265
- nokogiri-md 8.000 i/100ms
266
- nokolexbor-md 43.000 i/100ms
267
- selma-md 39.000 i/100ms
279
+ nokogiri-md 11.000 i/100ms
280
+ nokolexbor-md 48.000 i/100ms
281
+ selma-md 53.000 i/100ms
268
282
  Calculating -------------------------------------
269
- nokogiri-md 87.367 (± 3.4%) i/s - 2.624k in 30.061642s
270
- nokolexbor-md 438.782 (± 3.9%) i/s - 13.158k in 30.031163s
271
- selma-md 392.591 (± 3.1%) i/s - 11.778k in 30.031391s
283
+ nokogiri-md 103.998 (± 5.8%) i/s - 3.113k in 30.029932s
284
+ nokolexbor-md 428.928 (± 7.9%) i/s - 12.816k in 30.066662s
285
+ selma-md 492.190 (± 6.9%) i/s - 14.734k in 30.082943s
272
286
 
273
287
  Comparison:
274
- nokolexbor-md: 438.8 i/s
275
- selma-md: 392.6 i/s - 1.12x slower
276
- nokogiri-md: 87.4 i/s - 5.02x slower
288
+ selma-md: 492.2 i/s
289
+ nokolexbor-md: 428.9 i/s - same-ish: difference falls within error
290
+ nokogiri-md: 104.0 i/s - 4.73x slower
277
291
 
278
292
  input size = 7172510 bytes, 7.17 MB
279
293
 
@@ -283,14 +297,14 @@ Warming up --------------------------------------
283
297
  nokolexbor-lg 1.000 i/100ms
284
298
  selma-lg 1.000 i/100ms
285
299
  Calculating -------------------------------------
286
- nokogiri-lg 0.895 (± 0.0%) i/s - 27.000 in 30.300832s
287
- nokolexbor-lg 2.163 (± 0.0%) i/s - 65.000 in 30.085656s
288
- selma-lg 5.867 (± 0.0%) i/s - 176.000 in 30.006240s
300
+ nokogiri-lg 0.874 (± 0.0%) i/s - 27.000 in 30.921090s
301
+ nokolexbor-lg 2.227 (± 0.0%) i/s - 67.000 in 30.137903s
302
+ selma-lg 8.354 (± 0.0%) i/s - 251.000 in 30.075227s
289
303
 
290
304
  Comparison:
291
- selma-lg: 5.9 i/s
292
- nokolexbor-lg: 2.2 i/s - 2.71x slower
293
- nokogiri-lg: 0.9 i/s - 6.55x slower
305
+ selma-lg: 8.4 i/s
306
+ nokolexbor-lg: 2.2 i/s - 3.75x slower
307
+ nokogiri-lg: 0.9 i/s - 9.56x slower
294
308
  </pre>
295
309
  </details>
296
310
  <!-- prettier-ignore-end -->
Binary file
Binary file
Binary file
data/lib/selma/config.rb ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Selma
4
+ module Config
5
+ OPTIONS = {
6
+ memory: {
7
+ max_allowed_memory_usage: nil,
8
+ preallocated_parsing_buffer_size: nil,
9
+ },
10
+ }
11
+ end
12
+ end
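
The README hunk above states that `preallocated_parsing_buffer_size` must always be less than `max_allowed_memory_usage`, and the new config file defaults both keys to `nil`. The following is a minimal sketch of how a caller might merge its own values over those defaults and check the constraint before handing the hash to Selma. The helpers `default_memory_options` and `build_memory_options` are hypothetical and not part of the gem. How Selma itself validates or consumes these options is not shown in this diff.

```ruby
# frozen_string_literal: true

# Hypothetical helper: mirrors the nil defaults added under `memory:`
# in lib/selma/config.rb in this release.
def default_memory_options
  {
    max_allowed_memory_usage: nil,
    preallocated_parsing_buffer_size: nil,
  }
end

# Hypothetical helper (not a Selma API): merges caller-supplied values over
# the defaults and enforces the constraint described in the README, namely
# that the preallocated buffer must be smaller than the overall limit.
def build_memory_options(overrides = {})
  memory = default_memory_options.merge(overrides)
  max = memory[:max_allowed_memory_usage]
  buffer = memory[:preallocated_parsing_buffer_size]

  # Only compare when both values are set; nil means "use the default".
  if max && buffer && buffer >= max
    raise ArgumentError,
      "preallocated_parsing_buffer_size (#{buffer}) must be less than " \
      "max_allowed_memory_usage (#{max})"
  end

  { memory: memory }
end

# Same values as the README example.
options = build_memory_options(
  max_allowed_memory_usage: 1000,
  preallocated_parsing_buffer_size: 100,
)
```

Checking the constraint on the Ruby side, as sketched here, would surface a misconfiguration with a readable message before any document is processed. Whether that is preferable to relying on the underlying library's own checks is a design choice for the caller.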
data/lib/selma/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Selma
4
- VERSION = "0.3.0"
4
+ VERSION = "0.4.0"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: selma
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.0
4
+ version: 0.4.0
5
5
  platform: x64-mingw-ucrt
6
6
  authors:
7
7
  - Garen J. Torikian
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2024-06-07 00:00:00.000000000 Z
11
+ date: 2024-07-15 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rake
@@ -51,6 +51,7 @@ files:
51
51
  - lib/selma/3.1/selma.so
52
52
  - lib/selma/3.2/selma.so
53
53
  - lib/selma/3.3/selma.so
54
+ - lib/selma/config.rb
54
55
  - lib/selma/extension.rb
55
56
  - lib/selma/html.rb
56
57
  - lib/selma/rewriter.rb