scrub_rb 0.2.0 → 1.0.0

Files changed (4)
  1. checksums.yaml +4 -4
  2. data/README.md +10 -8
  3. data/lib/scrub_rb/version.rb +1 -1
  4. metadata +2 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 416f1b945e82bf566a572fa0ac825204e2928d5e
- data.tar.gz: 1c4b189c9e523d2ed21d96159e60e9efb1fe22b4
+ metadata.gz: 644acb6ff368a1469fabcce3a3a41b8c3389fbb1
+ data.tar.gz: 4b80ec9d61708227bb65a43bb799d66b48b2e14f
  SHA512:
- metadata.gz: e3fad2d47eb39cffa8a96a4a9a63c55b7deaf9224a37bdfc844f2fbfdda6c79d7fdcd5feccfe03a398218ce360ecc049eb57476d9b5fef4d6779d6c1c731347e
- data.tar.gz: f8552f53acf3a26440b64145e086f175511b83c03bf50f30ed7ba1b867a38a5922883bc7a51e6f7d1a9a0822709ca4b456dadc445be16982e3c67217373e7bfa
+ metadata.gz: 9ad061344fd9587e8ed1eb39404e828a05f357a769c9838043870cb7034efaa763f2a827322043e5fa943e17ff6318df42f89050c4de7fc4aeba7d876a823c7c
+ data.tar.gz: 69b9a20817599ed7647ad6007f7c8b123f247cd32484719b88d7f14ab3e29879fc5d7b1d6d1225c209477fa5eb6e8df2f19444debd75578089b02b8acc1c8685
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
  Pure-ruby polyfill of MRI 2.1 String#scrub, for ruby 1.9 and 2.0 any interpreter
 
- [![Build Status](https://travis-ci.org/jrochkind/scrub_rb.png?branch=master)](https://travis-ci.org/jrochkind/scrub_rb)
+ [![Build Status](https://travis-ci.org/jrochkind/scrub_rb.png?branch=master)](https://travis-ci.org/jrochkind/scrub_rb) [![Gem Version](https://badge.fury.io/rb/scrub_rb.png)](http://badge.fury.io/rb/scrub_rb)
 
  ## Installation
 
@@ -21,8 +21,8 @@ Or install it yourself as:
 
  ## What it is
 
- Ruby 2.1 introduces String#scrub, a method to replace invalid bytes in a given string
- and it's specified encoding. See docs in [MRI ruby source](https://github.com/ruby/ruby/blob/1e8a05c1dfee94db9b6b825097e1d192ad32930a/string.c#L7772)
+ Ruby 2.1 introduces String#scrub, a method to replace bytes in a string that are invalid for it's specified encoding.
+ See docs in [MRI ruby source](https://github.com/ruby/ruby/blob/1e8a05c1dfee94db9b6b825097e1d192ad32930a/string.c#L7772)
 
  If you need String#scrub in MRI ruby 2.0, you can use the [string-scrub gem](https://github.com/hsbt/string-scrub), which provides a backport of the C code from MRI ruby 2.1 into MRI 2.0.
 
@@ -50,7 +50,7 @@ This pure ruby implementation is about an order of magnitude slower than stdlib
 
  ## Discrepency with MRI 2.1 String#scrub
 
- If there are more than one concurrent invalid byte in a string, should the entire block be replaced with only one replacement, or should each invalid byte be replaced with a replacement?
+ If there is a sequence of multiple contiguous invalid bytes in a string, should the entire block be replaced with only one replacement, or should each invalid byte be replaced with a replacement?
 
  I have not been able to understand the logic MRI 2.1 uses to divide contiguous invalid bytes into
  certain sub-sequences for replacement, as represented in the [test suite](https://github.com/ruby/ruby/blob/3ac0ec4ecdea849143ed64e8935e6675b341e44b/test/ruby/test_m17n.rb#L1505). The test suite may be suggesting that the examples are from unicode documentation, but I wasn't able to find such documentation to see if it shed any light on the matter.
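
Reviewer note: the question raised in this hunk (one replacement per run of invalid bytes, or one per invalid byte) can be made concrete with a small sketch. The two helpers below, `scrub_each_invalid` and `scrub_collapsing_runs`, are hypothetical names written for this note; they are not taken from scrub_rb or from MRI, and they make no claim about which policy either one follows. They assume that `String#each_char` yields each invalid byte of the sample input as its own chunk, which holds for the UTF-8 sample used here.

```ruby
# Hypothetical helpers for illustration only; not scrub_rb's or MRI's code.

# Policy A: replace every invalid chunk yielded by each_char on its own.
def scrub_each_invalid(str, replacement = "?")
  out = String.new.force_encoding(str.encoding)
  str.each_char do |ch|
    out << (ch.valid_encoding? ? ch : replacement)
  end
  out
end

# Policy B: collapse a run of contiguous invalid chunks into one replacement.
def scrub_collapsing_runs(str, replacement = "?")
  out = String.new.force_encoding(str.encoding)
  in_bad_run = false
  str.each_char do |ch|
    if ch.valid_encoding?
      out << ch
      in_bad_run = false
    else
      # Emit the replacement only for the first chunk of a bad run.
      out << replacement unless in_bad_run
      in_bad_run = true
    end
  end
  out
end

# Two contiguous bytes that are never valid in UTF-8.
bad = "abc\xFF\xFExyz".dup.force_encoding(Encoding::UTF_8)

puts scrub_each_invalid(bad)     # abc??xyz
puts scrub_collapsing_runs(bad)  # abc?xyz
```

Both results are valid UTF-8; they differ only in how many replacement characters stand in for the bad run, which is the discrepancy the README describes.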
@@ -63,17 +63,19 @@ For most uses, this discrepency is probably not of consequence.
 
  If anyone can explain whats going on here, I'm very curious! I can't read C very well to try and figure it out from source.
 
- ## Jruby may raise
+ ## JRuby may raise
 
  Due to an apparent JRuby bug, some invalid strings cause an internal
- exception from JRuby when trying to scrub_rb. The entire original MRI test suite
+ exception from JRuby when trying to scrub_rb. This bug should [be fixed in jruby 1.7.11](https://github.com/jruby/jruby/issues/1361#issuecomment-35776377)
+
+ In Jruby versions prior to that, The entire original MRI test suite
  does passes against scrub_rb in JRuby -- but [one test original to us, involving
  input tagged 'ascii' encoding](./test/scrub_test.rb#L67), fails raising an ArrayIndexOutOfBoundsException
  from inside of JRuby. I have filed an [issue with JRuby](https://github.com/jruby/jruby/issues/1361).
 
- I believe this problem should be rare -- so far, the only reproduction case involves an input string tagged 'ascii' encoding, which probably isn't a common use case. But it's unfortunate
+ **I believe this problem is likely to be rare** -- so far, the only reproduction case involves an input string tagged 'ascii' encoding, which probably isn't a common use case. But it's unfortunate
  that `scrub_rb` isn't reliable on jruby. I haven't been able to figure out any workaround in ruby to the jruby bug -- you could theoretically provide a Java alternate implementation usable in jruby, but I'm not sure what Java tools are available and how hard it would be to match the scrub api.
 
  ## Contributions
 
- Pull requests or suggestions welcome, especially on performance, on JRuby issue, and on discrepencies with official String#scrub.
+ Pull requests or suggestions welcome, especially on performance, on JRuby issue, and on discrepencies with official String#scrub.
data/lib/scrub_rb/version.rb CHANGED
@@ -1,3 +1,3 @@
  module ScrubRb
- VERSION = "0.2.0"
+ VERSION = "1.0.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: scrub_rb
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 1.0.0
  platform: ruby
  authors:
  - Jonathan Rochkind
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-12-26 00:00:00.000000000 Z
+ date: 2014-07-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler