scrub_rb 0.2.0 → 1.0.0

Files changed (4)
  1. checksums.yaml +4 -4
  2. data/README.md +10 -8
  3. data/lib/scrub_rb/version.rb +1 -1
  4. metadata +2 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 416f1b945e82bf566a572fa0ac825204e2928d5e
- data.tar.gz: 1c4b189c9e523d2ed21d96159e60e9efb1fe22b4
+ metadata.gz: 644acb6ff368a1469fabcce3a3a41b8c3389fbb1
+ data.tar.gz: 4b80ec9d61708227bb65a43bb799d66b48b2e14f
  SHA512:
- metadata.gz: e3fad2d47eb39cffa8a96a4a9a63c55b7deaf9224a37bdfc844f2fbfdda6c79d7fdcd5feccfe03a398218ce360ecc049eb57476d9b5fef4d6779d6c1c731347e
- data.tar.gz: f8552f53acf3a26440b64145e086f175511b83c03bf50f30ed7ba1b867a38a5922883bc7a51e6f7d1a9a0822709ca4b456dadc445be16982e3c67217373e7bfa
+ metadata.gz: 9ad061344fd9587e8ed1eb39404e828a05f357a769c9838043870cb7034efaa763f2a827322043e5fa943e17ff6318df42f89050c4de7fc4aeba7d876a823c7c
+ data.tar.gz: 69b9a20817599ed7647ad6007f7c8b123f247cd32484719b88d7f14ab3e29879fc5d7b1d6d1225c209477fa5eb6e8df2f19444debd75578089b02b8acc1c8685
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
  Pure-ruby polyfill of MRI 2.1 String#scrub, for ruby 1.9 and 2.0 any interpreter
 
- [![Build Status](https://travis-ci.org/jrochkind/scrub_rb.png?branch=master)](https://travis-ci.org/jrochkind/scrub_rb)
+ [![Build Status](https://travis-ci.org/jrochkind/scrub_rb.png?branch=master)](https://travis-ci.org/jrochkind/scrub_rb) [![Gem Version](https://badge.fury.io/rb/scrub_rb.png)](http://badge.fury.io/rb/scrub_rb)
 
  ## Installation
 
@@ -21,8 +21,8 @@ Or install it yourself as:
 
  ## What it is
 
- Ruby 2.1 introduces String#scrub, a method to replace invalid bytes in a given string
- and it's specified encoding. See docs in [MRI ruby source](https://github.com/ruby/ruby/blob/1e8a05c1dfee94db9b6b825097e1d192ad32930a/string.c#L7772)
+ Ruby 2.1 introduces String#scrub, a method to replace bytes in a string that are invalid for it's specified encoding.
+ See docs in [MRI ruby source](https://github.com/ruby/ruby/blob/1e8a05c1dfee94db9b6b825097e1d192ad32930a/string.c#L7772)
 
  If you need String#scrub in MRI ruby 2.0, you can use the [string-scrub gem](https://github.com/hsbt/string-scrub), which provides a backport of the C code from MRI ruby 2.1 into MRI 2.0.
 
@@ -50,7 +50,7 @@ This pure ruby implementation is about an order of magnitude slower than stdlib
 
  ## Discrepency with MRI 2.1 String#scrub
 
- If there are more than one concurrent invalid byte in a string, should the entire block be replaced with only one replacement, or should each invalid byte be replaced with a replacement?
+ If there is a sequence of multiple contiguous invalid bytes in a string, should the entire block be replaced with only one replacement, or should each invalid byte be replaced with a replacement?
 
  I have not been able to understand the logic MRI 2.1 uses to divide contiguous invalid bytes into
  certain sub-sequences for replacement, as represented in the [test suite](https://github.com/ruby/ruby/blob/3ac0ec4ecdea849143ed64e8935e6675b341e44b/test/ruby/test_m17n.rb#L1505). The test suite may be suggesting that the examples are from unicode documentation, but I wasn't able to find such documentation to see if it shed any light on the matter.
@@ -63,17 +63,19 @@ For most uses, this discrepency is probably not of consequence.
 
  If anyone can explain whats going on here, I'm very curious! I can't read C very well to try and figure it out from source.
 
- ## Jruby may raise
+ ## JRuby may raise
 
  Due to an apparent JRuby bug, some invalid strings cause an internal
- exception from JRuby when trying to scrub_rb. The entire original MRI test suite
+ exception from JRuby when trying to scrub_rb. This bug should [be fixed in jruby 1.7.11](https://github.com/jruby/jruby/issues/1361#issuecomment-35776377)
+
+ In Jruby versions prior to that, The entire original MRI test suite
  does passes against scrub_rb in JRuby -- but [one test original to us, involving
  input tagged 'ascii' encoding](./test/scrub_test.rb#L67), fails raising an ArrayIndexOutOfBoundsException
  from inside of JRuby. I have filed an [issue with JRuby](https://github.com/jruby/jruby/issues/1361).
 
- I believe this problem should be rare -- so far, the only reproduction case involves an input string tagged 'ascii' encoding, which probably isn't a common use case. But it's unfortunate
+ **I believe this problem is likely to be rare** -- so far, the only reproduction case involves an input string tagged 'ascii' encoding, which probably isn't a common use case. But it's unfortunate
  that `scrub_rb` isn't reliable on jruby. I haven't been able to figure out any workaround in ruby to the jruby bug -- you could theoretically provide a Java alternate implementation usable in jruby, but I'm not sure what Java tools are available and how hard it would be to match the scrub api.
 
  ## Contributions
 
- Pull requests or suggestions welcome, especially on performance, on JRuby issue, and on discrepencies with official String#scrub.
+ Pull requests or suggestions welcome, especially on performance, on JRuby issue, and on discrepencies with official String#scrub.
data/lib/scrub_rb/version.rb CHANGED
@@ -1,3 +1,3 @@
  module ScrubRb
- VERSION = "0.2.0"
+ VERSION = "1.0.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: scrub_rb
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 1.0.0
  platform: ruby
  authors:
  - Jonathan Rochkind
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-12-26 00:00:00.000000000 Z
+ date: 2014-07-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler