rchardet19 1.3.5 → 1.3.6

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+ metadata.gz: 437c6ce14d2dcccc100d652abf7efd8ad2f76123
+ data.tar.gz: 59ab35c63ab22ce50d67eefeef71830500829654
+ SHA512:
+ metadata.gz: 55592006d5309c094cc335d23ad0d668397ef3cceed5d4b95721baadc4f43048b767dc6fdddb241f4de3aab843969e4827087b9793b81ac92599991cad4c5461
+ data.tar.gz: 19e54f6731b6cd4ecd10e5902dc0f8f9500fc0a98658c73c0d6d1a40a81c6791fc6383fdc0344c195a7ffdd62fce44eeb13611987b8e39ed5ccb7ee1bf008c11
data/README.markdown CHANGED
@@ -1,63 +1,66 @@
  # rCharDet*19*

- ### [Project Page](http://rubyforge.org/projects/rchardet) | [1.9 Author](https://github.com/edouard/rchardet) | [Original Author](https://github.com/jmhodges/rchardet)
+ ## [Project Page](http://rubyforge.org/projects/rchardet) | [1.9 Author](https://github.com/edouard/rchardet) | [Original Author](https://github.com/jmhodges/rchardet)

- rCharDet is a character encoding detection library for ruby and the implementation is based
- on Mozilla Charset Detectors.
+ *rCharDet* is a character encoding detection library for ruby and the implementation is based on Mozilla Charset Detectors.

- This is a forked project in a effort to make it Ruby 1.9 compatible
+ This is a forked project in a effort to make it Ruby 1.9 compatible.

- ### How do use
-
- require 'rubygems'
- require 'rchardet19'
-
- >> cd = CharDet.detect("some data")
- => #<struct #<Class:0x102216198> encoding="ascii", confidence=1.0>
+ Follow me on [Twitter](http://twitter.com/linusoleander) or [Github](https://github.com/oleander) for more info and updates.

- ### How to use - in real life
+ ### How to use

- `detect` takes the variable `data` that contains an unknown encoding.
+ `CharDet.detect` takes the variable `data` that contains an unknown encoding.

  We then try to change the encoding to UTF-8, but only if we are at least ~ 60% sure that we found the right encoding.

- data = "Some unknown data"
- cd = CharDet.detect(data)
- data = cd.confidence > 0.6 ? Iconv.conv(cd.encoding, "UTF-8", data) : data
-
- ### What do I've to work with?
+ ```` ruby
+ data = "Some unknown data"
+ cd = CharDet.detect(data)
+ data = cd.confidence > 0.6 ? Iconv.conv(cd.encoding, "UTF-8", data) : data
+ ````
+
+ ## What do I've to work with?

- A struct is being returned from the `detect` method and has the following accessors.
+ A struct is being returned from the `detect` method, it has the following accessors.

  - **encoding** (String) Encoding of the ingoing string, `UTF-8` for example.
  - **confidence** (Float) The confidence level of the *encoding*, from 0.0 to 1.0, where 1.0 is the best.

- ### Make it silent
+ ## Make it silent

  The `detect` takes two arguments, the string to guess the encoding on and an option hash.

  You can use the option hash de decide if you want the `detect` method to raise an exception or not if the ingoing string is `nil`.

- >> CharDet.detect("some data", :silent => true) # Won't raise an exception
- >> CharDet.detect(nil, :silent => true) # Won't raise an exception
- >> CharDet.detect(nil) # Will raise an exception
- >> CharDet.detect(nil, :silent => false) # Will raise an exception
-
- ### How do install
+ ```` ruby
+ CharDet.detect("some data", :silent => true) # Won't raise an exception
+ CharDet.detect(nil, :silent => true) # Won't raise an exception
+ CharDet.detect(nil) # Will raise an exception
+ CharDet.detect(nil, :silent => false) # Will raise an exception
+ ````
+
+ ## How do install

  [sudo] gem install rchardet19

- ### How to use it in a rails 3 project
+ ## How to use it in a rails 3 project

  Add `gem 'rchardet19'` to your Gemfile and run `bundle`.

- ### How to help
+ ## How to help

  - Start by copying the project or make your own branch.
  - Navigate to the root path of the project and run `bundle`.
  - Start by running all tests using rspec, `rspec spec/rchardet19_spec.rb`.
  - Implement your own code, write some tests, commit and do a pull request.

- ### Requirements
+ ## Requirements
+
+ *rCharDet19* is tested in
+ - OS X 10.6.6 using Ruby 1.8.7 and 1.9.2.
+ - Ubuntu 12.10 using Ruby 1.9.3 and 2.0.0-p0
+
+ ## License

- rCharDet19 is tested in OS X 10.6.6 using Ruby 1.8.7 and 1.9.2.
+ *rCharDet19* is released under the *MIT license*.
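Note on the README usage example in this hunk: `Iconv.conv` takes its arguments as `(to, from, str)`, so the call shown there names `cd.encoding` as the target and `"UTF-8"` as the source, which looks reversed relative to the stated goal of converting to UTF-8. Iconv was also removed from the standard library in Ruby 2.0, one of the versions the new README lists as tested. The following is a minimal sketch of the same "convert only above ~60% confidence" idea using `String#encode` instead. The helper name `to_utf8` and its parameters are hypothetical and not part of rchardet19; the encoding name and confidence are passed in directly so the snippet runs without the gem.

```ruby
# Hypothetical helper (not part of rchardet19): re-encode `data` to UTF-8
# only when the detector was confident enough about the source encoding.
def to_utf8(data, encoding, confidence, threshold = 0.6)
  # Leave the input untouched when detection failed or confidence is too low.
  return data unless encoding && confidence > threshold

  # Tag a copy of the bytes with the detected encoding, then transcode.
  data.dup.force_encoding(encoding).encode("UTF-8")
end

# Assumed sample input: "café" as ISO-8859-1 bytes.
latin1_bytes = "caf\xE9".dup.force_encoding("BINARY")
converted = to_utf8(latin1_bytes, "ISO-8859-1", 0.9)
unchanged = to_utf8(latin1_bytes, "ISO-8859-1", 0.3)
```

With the gem loaded, the values would presumably come from the struct described in the README, as in `to_utf8(data, cd.encoding, cd.confidence)`. `force_encoding` raises `ArgumentError` if Ruby does not recognize the encoding name, so a caller may want to rescue that case.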
data/Rakefile CHANGED
@@ -1,29 +1,2 @@
- require 'rubygems'
- require 'rake/testtask'
- require 'rake/gempackagetask'
- begin
- require 'lib/rchardet'
- rescue LoadError
- module CharDet; VERSION = '0.0.0'; end
- puts "Problem loading rfeedparser; try rake setup"
- end
-
- spec = Gem::Specification.new do |s|
- s.name = "rchardet"
- s.version = CharDet::VERSION
- s.author = "Jeff Hodges"
- s.email = "jeff at somethingsimilar dot com"
- s.homepage = "http://github.com/jmhodges/rchardet/tree/master"
- s.platform = Gem::Platform::RUBY
- s.summary = "Character encoding auto-detection in Ruby. As smart as your browser. Open source."
- s.files = FileList["lib/**/*"]
- s.require_path = "lib"
- # s.autorequire = "feedparser" # tHe 3vil according to Why.
- s.has_rdoc = false # TODO: fix
- s.extra_rdoc_files = ['README', 'COPYING']
- s.rubyforge_project = 'rchardet'
-
- end
-
- Rake::GemPackageTask.new(spec) do
- end
+ require "bundler"
+ Bundler::GemHelper.install_tasks
@@ -14,12 +14,12 @@
  # modify it under the terms of the GNU Lesser General Public
  # License as published by the Free Software Foundation; either
  # version 2.1 of the License, or (at your option) any later version.
- #
+ #
  # This library is distributed in the hope that it will be useful,
  # but WITHOUT ANY WARRANTY; without even the implied warranty of
  # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
  # Lesser General Public License for more details.
- #
+ #
  # You should have received a copy of the GNU Lesser General Public
  # License along with this library; if not, write to the Free Software
  # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
@@ -137,9 +137,9 @@ module CharDet
  return if @_mDone

  # The buffer we got is byte oriented, and a character may span in more than one
- # buffers. In case the last one or two byte in last buffer is not complete, we
+ # buffers. In case the last one or two byte in last buffer is not complete, we
  # record how many byte needed to complete that character and skip these bytes here.
- # We can choose to record those bytes as well and analyse the character once it
+ # We can choose to record those bytes as well and analyse the character once it
  # is complete, but since a character will not make much difference, by simply skipping
  # this character will simply our logic and improve performance.
  i = @_mNeedToSkipCharNum
@@ -195,10 +195,10 @@ module CharDet
  # return its order if it is hiragana
  if aStr.length > 1
  if (aStr[0..0] == "\202") and (aStr[1..1] >= "\x9F") and (aStr[1..1] <= "\xF1")
- return aStr[1] - 0x9F, charLen
+ return aStr[1].ord - 0x9F, charLen
  end
  end
-
+
  return -1, charLen
  end
  end
@@ -219,7 +219,7 @@ module CharDet
  # return its order if it is hiragana
  if aStr.length > 1
  if (aStr[0..0] == "\xA4") and (aStr[1..1] >= "\xA1") and (aStr[1..1] <= "\xF3")
- return aStr[1] - 0xA1, charLen
+ return aStr[1].ord - 0xA1, charLen
  end
  end

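Note on the `.ord` additions in the two hiragana hunks: they appear to address a Ruby 1.9 behavior change. From 1.9 on, `String#[]` with an integer index returns a one-character String rather than an Integer, so `aStr[1] - 0x9F` fails because String has no binary `-` method. A minimal sketch of the difference follows; the two-byte sample input is an assumption chosen for illustration and is forced to binary so that `ord` returns the raw byte value.

```ruby
# Assumed sample: a two-byte sequence whose lead byte is "\202" (0x82),
# matching the first check in the hunk above; tagged as binary data.
a_str = "\x82\xA0".dup.force_encoding("BINARY")

second = a_str[1]          # a one-byte String on Ruby 1.9+, not an Integer
order = second.ord - 0x9F  # byte value 0xA0 minus 0x9F

# The pre-1.9 form no longer works: String does not respond to binary `-`.
old_form_fails = begin
  a_str[1] - 0x9F
  false
rescue NoMethodError
  true
end
```

On Ruby 1.8, `aStr[1]` already returned the byte as an integer, which is why the subtraction worked there without `.ord`.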
data/rchardet.gemspec CHANGED
@@ -3,12 +3,12 @@ $:.push File.expand_path("../lib", __FILE__)

  Gem::Specification.new do |s|
  s.name = "rchardet19"
- s.version = "1.3.5"
+ s.version = "1.3.6"
  s.authors = ["Jeff Hodges", "Édouard Brière", "Linus Oleander"]
  s.email = "linus@oleander.nu"
  s.homepage = "https://github.com/oleander/rchardet"
  s.platform = Gem::Platform::RUBY
- s.summary = "Character encoding auto-detection. Ruby 1.9 compat."
+ s.summary = "Ruby 1.9 compatible character encoding auto-detection library"
  s.description = "Character encoding auto-detection in Ruby. This library is a port of the auto-detection code in Mozilla. It means taking a sequence of bytes in an unknown character encoding, and attempting to determine the encoding so you can read the text. It’s like cracking a code when you don’t have the decryption key."
  s.files = `git ls-files`.split("\n")
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
metadata CHANGED
@@ -1,43 +1,44 @@
- --- !ruby/object:Gem::Specification
+ --- !ruby/object:Gem::Specification
  name: rchardet19
- version: !ruby/object:Gem::Version
- prerelease:
- version: 1.3.5
+ version: !ruby/object:Gem::Version
+ version: 1.3.6
  platform: ruby
- authors:
+ authors:
  - Jeff Hodges
- - "\xC3\x89douard Bri\xC3\xA8re"
+ - "Édouard Brière"
  - Linus Oleander
  autorequire:
  bindir: bin
  cert_chain: []
-
- date: 2011-02-17 00:00:00 +01:00
- default_executable:
- dependencies:
- - !ruby/object:Gem::Dependency
+ date: 2014-04-18 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
  name: rspec
- prerelease: false
- requirement: &id001 !ruby/object:Gem::Requirement
- none: false
- requirements:
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
  - - ">="
- - !ruby/object:Gem::Version
- version: "0"
+ - !ruby/object:Gem::Version
+ version: '0'
  type: :development
- version_requirements: *id001
- description: "Character encoding auto-detection in Ruby. This library is a port of the auto-detection code in Mozilla. It means taking a sequence of bytes in an unknown character encoding, and attempting to determine the encoding so you can read the text. It\xE2\x80\x99s like cracking a code when you don\xE2\x80\x99t have the decryption key."
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ description: Character encoding auto-detection in Ruby. This library is a port of
+ the auto-detection code in Mozilla. It means taking a sequence of bytes in an unknown
+ character encoding, and attempting to determine the encoding so you can read the
+ text. It’s like cracking a code when you don’t have the decryption key.
  email: linus@oleander.nu
  executables: []
-
  extensions: []
-
- extra_rdoc_files:
+ extra_rdoc_files:
  - README.markdown
  - COPYING
- files:
- - .gitignore
- - .rspec
+ files:
+ - ".gitignore"
+ - ".rspec"
  - COPYING
  - Gemfile
  - Gemfile.lock
@@ -81,34 +82,29 @@ files:
  - rchardet.gemspec
  - spec/rchardet19_spec.rb
  - spec/spec_helper.rb
- has_rdoc: true
  homepage: https://github.com/oleander/rchardet
  licenses: []
-
+ metadata: {}
  post_install_message:
  rdoc_options: []
-
- require_paths:
+ require_paths:
  - lib
- required_ruby_version: !ruby/object:Gem::Requirement
- none: false
- requirements:
+ required_ruby_version: !ruby/object:Gem::Requirement
+ requirements:
  - - ">="
- - !ruby/object:Gem::Version
- version: "0"
- required_rubygems_version: !ruby/object:Gem::Requirement
- none: false
- requirements:
+ - !ruby/object:Gem::Version
+ version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+ requirements:
  - - ">="
- - !ruby/object:Gem::Version
- version: "0"
+ - !ruby/object:Gem::Version
+ version: '0'
  requirements: []
-
  rubyforge_project:
- rubygems_version: 1.5.0
+ rubygems_version: 2.1.8
  signing_key:
- specification_version: 3
- summary: Character encoding auto-detection. Ruby 1.9 compat.
- test_files:
+ specification_version: 4
+ summary: Ruby 1.9 compatible character encoding auto-detection library
+ test_files:
  - spec/rchardet19_spec.rb
  - spec/spec_helper.rb