hunspell-ffi 0.1.3.alpha2 → 0.1.3

@@ -0,0 +1,13 @@
1
+ = 0.1.3 / 2012-05-04
2
+ * Support for hunspell 1.3 including backwards compatibility with hunspell 1.2 and fallback support to any hunspell [drbrain (Eric Hodel)]
3
+ * Fixed memory leak in Hunspell#suggest [drbrain (Eric Hodel)]
4
+ * Feature: #stem and #analyze wrappers [drbrain (Eric Hodel)]
5
+ * Feature: Automatic dictionary determination from path and ENV [drbrain (Eric Hodel)]
6
+ * Update ffi version (~>1.0.7)
7
+ * New alias: Hunspell#check? for Hunspell#check [hamin (Haris Amin)]
8
+ * New methods: #add(word), #remove(word), #add_with_affix(word, example) to
9
+ add/remove words from the run-time dictionary.
10
+ * Show a warning when we cannot find aff/dic files.
11
+
12
+ = 0.1.2 / 2010-08-07
13
+ * First release that works on OSX and Debian
@@ -20,34 +20,43 @@ On Debian:
20
20
  gem install hunspell-ffi
21
21
 
22
22
  == Usage
23
+
23
24
  require 'hunspell-ffi'
24
- dict = Hunspell.new("path/to/cakes.aff", "path/to/cakes.dic")
25
- dict.spell("Baumkuchen") # => true same as #check, #check?
26
- dict.spell("Bomcuken") # => false
27
- dict.check?("Bomcuken") # => false
28
- dict.suggest("Baumgurken") # => ["Baumkuchen"]
25
+
26
+ # Detect language from ENV:
27
+ dict = Hunspell.new("/path/to/dictionaries")
28
+
29
+ # Directly specify language:
30
+ dict = Hunspell.new("/path/to/dictionaries", "en_US")
31
+
32
+ # Directly specify dictionaries (legacy):
33
+ dict = Hunspell.new("path/to/dictionaries/en_US.aff", "path/to/dictionaries/en_US.dic")
34
+
35
+ dict.spell("walked") # => true same as #check, #check?
36
+ dict.spell("woked") # => false
37
+ dict.check?("woked") # => false
38
+ dict.suggest("woked") # => ["woke", "worked", "waked", "woken", ...]
29
39
  dict.suggest("qwss43easd") # => []
30
-
40
+
41
+ dict.stem("Baumkuchen") # => ["Baumkuchen"]
42
+ dict.analyze("Baumkuchen") # => [" st:Baumkuchen"]
43
+
31
44
  # Modify the run-time dictionary:
32
45
  dict.add("Geburtstagskuchen")
33
46
  dict.remove("Fichte")
34
47
 
35
48
  == Authors
36
- Andreas Haller (https://github.com/ahaller) and contributors.
37
- Full list of contributors: https://github.com/ahaller/hunspell-ffi/contributors
49
+ Andreas Haller and contributors.
50
+ Full list of contributors: https://github.com/ahx/hunspell-ffi/contributors
38
51
 
39
52
  == License
40
53
  Hereby placed under public domain, do what you want, just do not hold me accountable.
41
54
 
42
55
  == Help wanted
43
- I hear Hunspell has some superpowers like stemming and some that i never even heard of.
44
- Maybe you want to help out to bring something of that power into the ruby world.
45
- Or maybe we can think of a nice way to find to locate .dict files on a system or something.
56
+ Maybe we can think of a nice way to locate .dic files on a system or something.
46
57
  Anyways, feel free to fork and send pull requests. kthxbye. Andreas.
47
58
 
48
- The source is on GitHub: https://github.com/ahaller/hunspell-ffi
59
+ The source is on GitHub: https://github.com/ahx/hunspell-ffi
49
60
 
50
61
  === TODOs
51
- Figure out how to use and add hunspell analyzing methods (analyze, stem ...)
52
-
53
62
  Test on Windows
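Note on the constructor forms in the Usage section: the sketch below is a simplified, standalone reading of how the new Hunspell#initialize in this diff appears to turn its arguments into an affix/dictionary pair. The helper name resolve_dictionary_paths is hypothetical, the ENV lookup is left out, and no Hunspell library is loaded; it only illustrates the file checks.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper mirroring the path handling in Hunspell#initialize.
# Returns [affix_path, dictionary_path] or raises ArgumentError.
def resolve_dictionary_paths(path, language = nil)
  if File.exist?(path) && language && File.exist?(language)
    # Legacy form: both arguments are existing files.
    affix = path
    dictionary = language
  else
    # Directory form: build the file names from the language code.
    affix = File.join(path, "#{language}.aff")
    dictionary = File.join(path, "#{language}.dic")
  end

  raise ArgumentError, "could not find affix file #{affix}" unless File.exist?(affix)
  raise ArgumentError, "could not find dictionary file #{dictionary}" unless File.exist?(dictionary)

  [affix, dictionary]
end

# Demonstration against empty placeholder files in a temporary directory.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "en_US.aff"))
  FileUtils.touch(File.join(dir, "en_US.dic"))

  p resolve_dictionary_paths(dir, "en_US")
  p resolve_dictionary_paths(File.join(dir, "en_US.aff"), File.join(dir, "en_US.dic"))
end
```

Both calls should yield the same pair of paths, which is what test_initialize and test_initialize_legacy in this diff check against the real class.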
@@ -1,16 +1,16 @@
1
1
  # encoding: utf-8
2
2
  Gem::Specification.new do |s|
3
3
  s.name = 'hunspell-ffi'
4
- s.version = '0.1.3.alpha2'
5
- s.date = '2011-03-23'
4
+ s.version = '0.1.3'
5
+ s.date = '2012-05-04'
6
6
  s.authors = ["Andreas Haller"]
7
7
  s.email = ["andreashaller@gmail.com"]
8
- s.homepage = "http://github.com/ahaller/hunspell-ffi"
8
+ s.homepage = "http://github.com/ahx/hunspell-ffi"
9
9
  s.summary = "A Ruby FFI interface to the Hunspell spelling checker"
10
-
10
+
11
11
  s.add_dependency 'ffi', '~> 1.0.7'
12
12
  s.required_rubygems_version = ">= 1.3.6"
13
-
13
+
14
14
  s.files = `git ls-files`.split("\n")
15
15
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
16
16
  s.require_path = 'lib'
@@ -3,21 +3,88 @@ require 'ffi'
3
3
  class Hunspell
4
4
  module C
5
5
  extend FFI::Library
6
- ffi_lib ['libhunspell', 'libhunspell-1.2', 'libhunspell-1.2.so.0']
6
+ ffi_lib %w[
7
+ libhunspell-1.3
8
+ libhunspell-1.3.so.0
9
+ libhunspell-1.2
10
+ libhunspell-1.2.so.0
11
+ libhunspell
12
+ ]
7
13
  attach_function :Hunspell_create, [:string, :string], :pointer
8
14
  attach_function :Hunspell_spell, [:pointer, :string], :bool
9
15
  attach_function :Hunspell_suggest, [:pointer, :pointer, :string], :int
10
16
  attach_function :Hunspell_add, [:pointer, :string], :int
11
17
  attach_function :Hunspell_add_with_affix, [:pointer, :string, :string], :int
18
+ attach_function :Hunspell_analyze, [:pointer, :pointer, :string], :int
19
+ attach_function :Hunspell_free_list, [:pointer, :pointer, :int], :void
20
+ attach_function :Hunspell_get_dic_encoding, [:pointer], :string
12
21
  attach_function :Hunspell_remove, [:pointer, :string], :int
22
+ attach_function :Hunspell_stem, [:pointer, :pointer, :string], :int
13
23
  end
14
-
15
- def initialize(affpath, dicpath)
16
- warn("Hunspell could not find aff-file #{affpath}") unless File.exist?(affpath)
17
- warn("Hunspell could not find dic-file #{affpath}") unless File.exist?(dicpath)
18
- @handler = C.Hunspell_create(affpath, dicpath)
24
+
25
+ ##
26
+ # The affix file used to check words
27
+
28
+ attr_reader :affix
29
+
30
+ ##
31
+ # The dictionary file used to check words
32
+
33
+ attr_reader :dictionary
34
+
35
+ ##
36
+ # Creates a spell-checking instance. If only +path+ is given, Hunspell will
37
+ # look for a dictionary using the language of your current locale, checking
38
+ # LC_ALL, LC_MESSAGES and LANG. If you would like to spell check words of a
39
+ # specific language provide it as the second parameter, +language+.
40
+ #
41
+ # You may also directly provide the affix file as the +path+ argument and
42
+ # the dictionary file as the +language+ argument, provided they both exist.
43
+ # This is for legacy use of Hunspell.
44
+
45
+ def initialize(path, language = nil)
46
+ if File.exist?(path) and language and File.exist?(language) then
47
+ @affix = path
48
+ @dictionary = language
49
+ else
50
+ language ||= find_language
51
+
52
+ @affix = File.join path, "#{language}.aff"
53
+ @dictionary = File.join path, "#{language}.dic"
54
+ end
55
+
56
+ raise ArgumentError,
57
+ "Hunspell could not find affix file #{@affix}" unless
58
+ File.exist?(@affix)
59
+ raise ArgumentError,
60
+ "Hunspell could not find dictionary file #{@dictionary}" unless
61
+ File.exist?(@dictionary)
62
+
63
+ @handler = C.Hunspell_create @affix, @dictionary
64
+ @dic_encoding = nil
65
+
66
+ if Object.const_defined? :Encoding then
67
+ begin
68
+ encoding_name = C.Hunspell_get_dic_encoding @handler
69
+ @dic_encoding = Encoding.find encoding_name
70
+ rescue ArgumentError
71
+ # unknown encoding name, results will be ASCII-8BIT
72
+ end
73
+ end
19
74
  end
20
-
75
+
76
+ def find_language
77
+ %w[LC_ALL LC_MESSAGES LANG].each do |var|
78
+ next unless value = ENV[var]
79
+
80
+ lang, charset = value.split('.', 2)
81
+
82
+ return lang if charset
83
+ end
84
+
85
+ nil
86
+ end
87
+
21
88
  # Returns true for a known word or false.
22
89
  def spell(word)
23
90
  C.Hunspell_spell(@handler, word)
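As a reading aid for find_language above, here is a standalone sketch of the same lookup that takes the environment as a plain hash so it can be tried without touching ENV. The name detect_language and its env parameter are hypothetical and not part of the gem.

```ruby
# Hypothetical standalone version of the lookup in Hunspell#find_language.
# `env` is any hash-like object; it defaults to the process environment.
def detect_language(env = ENV)
  %w[LC_ALL LC_MESSAGES LANG].each do |var|
    value = env[var]
    next unless value

    # "en_CA.UTF-8" splits into "en_CA" and "UTF-8".
    lang, charset = value.split('.', 2)

    # As in the diff, a value without a charset suffix is skipped.
    return lang if charset
  end

  nil
end

p detect_language({ "LANG" => "en_CA.UTF-8" })
p detect_language({ "LC_ALL" => "C", "LANG" => "de_DE.UTF-8" })
p detect_language({ "LANG" => "en_US" })
```

One consequence worth knowing: because the language is only returned when a charset follows the dot, a setting such as LANG=en_US falls through, and the method returns nil if no other variable matches.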
@@ -27,10 +94,11 @@ class Hunspell
27
94
 
28
95
  # Returns an array of suggested words, or an empty array if there are none.
29
96
  def suggest(word)
30
- ptr = FFI::MemoryPointer.new(:pointer, 1)
31
- len = Hunspell::C.Hunspell_suggest(@handler, ptr, word)
32
- str_ptr = ptr.read_pointer
33
- str_ptr.null? ? [] : str_ptr.get_array_of_string(0, len).compact
97
+ list_pointer = FFI::MemoryPointer.new(:pointer, 1)
98
+
99
+ len = C.Hunspell_suggest(@handler, list_pointer, word)
100
+
101
+ read_list(list_pointer, len)
34
102
  end
35
103
 
36
104
  # Add word to the run-time dictionary
@@ -44,9 +112,47 @@ class Hunspell
44
112
  def add_with_affix(word, example)
45
113
  C.Hunspell_add_with_affix(@handler, word, example)
46
114
  end
47
-
115
+
116
+ # Performs morphological analysis of +word+. See hunspell(4) for details on
117
+ # the output format.
118
+ def analyze(word)
119
+ list_pointer = FFI::MemoryPointer.new(:pointer, 1)
120
+
121
+ len = C.Hunspell_analyze(@handler, list_pointer, word)
122
+
123
+ read_list(list_pointer, len)
124
+ end
125
+
126
+ def read_list(list_pointer, len)
127
+ return [] if len.zero?
128
+
129
+ list = list_pointer.read_pointer
130
+
131
+ strings = list.get_array_of_string(0, len)
132
+
133
+ C.Hunspell_free_list(@handler, list_pointer, len)
134
+
135
+ if @dic_encoding then
136
+ strings.each do |string|
137
+ string.force_encoding @dic_encoding
138
+ end
139
+ end
140
+
141
+ strings
142
+ end
143
+
48
144
  # Remove word from the run-time dictionary
49
145
  def remove(word)
50
146
  C.Hunspell_remove(@handler, word)
51
147
  end
148
+
149
+ # Returns the stems of +word+
150
+ def stem(word)
151
+ list_pointer = FFI::MemoryPointer.new(:pointer, 1)
152
+
153
+ len = C.Hunspell_stem(@handler, list_pointer, word)
154
+
155
+ read_list(list_pointer, len)
156
+ end
157
+
52
158
  end
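The encoding step in read_list can be tried in isolation. The sketch below assumes the strings handed back through FFI arrive as binary (ASCII-8BIT) data and are relabelled with the dictionary encoding; tag_encoding is a hypothetical name, and no Hunspell call is involved.

```ruby
# Hypothetical stand-in for the tagging step in read_list.
def tag_encoding(strings, encoding_name)
  encoding = begin
    Encoding.find(encoding_name)
  rescue ArgumentError
    nil # unknown encoding name: leave the strings untouched
  end

  # force_encoding relabels each string in place; the bytes do not change.
  strings.each { |string| string.force_encoding(encoding) } if encoding
  strings
end

raw = ["K\xC3\xA4sekuchen".b] # UTF-8 bytes carrying a binary label
tagged = tag_encoding(raw, "UTF-8")

puts tagged.first.encoding
puts tagged.first == "K\u00E4sekuchen"
```

Since force_encoding mutates its receiver, the block's return value is not needed, which is why the original method can return strings directly after the loop.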
@@ -0,0 +1,16 @@
1
+ SET UTF-8
2
+ TRY esianrtolcdugmphbyfvkwzESIANRTOLCDUGMPHBYFVKWZ'
3
+ REP 2
4
+ REP f ph
5
+ REP ph f
6
+
7
+ PFX A Y 1
8
+ PFX A 0 re .
9
+
10
+ SFX B Y 2
11
+ SFX B 0 ed [^y]
12
+ SFX B y ied y
13
+
14
+ OCONV 1
15
+ OCONV ' ’
16
+
@@ -0,0 +1,5 @@
1
+ 3
2
+ hello
3
+ hots's
4
+ try/B
5
+ work/AB
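To make the flags in the two test files above easier to read, the following sketch expands dictionary entries by hand using the PFX A and SFX B rules. It is a hand-written model of these two rules only, not an affix-file parser, and the names apply_suffix_b and expand are hypothetical.

```ruby
# SFX B: replace a trailing "y" with "ied", otherwise append "ed".
def apply_suffix_b(word)
  word.end_with?("y") ? word.sub(/y\z/, "ied") : word + "ed"
end

# Expands one dictionary entry (stem plus flag string) into accepted forms.
def expand(stem, flags)
  forms = [stem]
  forms << apply_suffix_b(stem) if flags.include?("B")

  # PFX A prepends "re". Both rules are declared with Y (cross product),
  # so the prefix also combines with the suffixed form.
  forms += forms.map { |form| "re" + form } if flags.include?("A")

  forms
end

p expand("work", "AB") # => ["work", "worked", "rework", "reworked"]
p expand("try", "B")   # => ["try", "tried"]
p expand("hello", "")  # => ["hello"]
```

Under this reading, "worked" is accepted and "working" is not, which matches the assertions in test_basic_spelling.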
@@ -4,14 +4,40 @@ require File.expand_path(File.dirname(__FILE__)) + '/../lib/hunspell-ffi'
4
4
  class TestHunspell < Test::Unit::TestCase
5
5
  def setup
6
6
  @dict_dir = File.dirname(__FILE__)
7
- @dict = Hunspell.new("#{@dict_dir}/cakes.aff", "#{@dict_dir}/cakes.dic")
7
+ @dict = Hunspell.new(@dict_dir, "en_US")
8
+ end
9
+
10
+ def test_initialize
11
+ assert_equal File.join(@dict_dir, "en_US.aff"), @dict.affix
12
+ assert_equal File.join(@dict_dir, "en_US.dic"), @dict.dictionary
13
+ end
14
+
15
+ def test_initialize_legacy
16
+ h = Hunspell.new("#{@dict_dir}/en_US.aff", "#{@dict_dir}/en_US.dic")
17
+
18
+ assert_equal File.join(@dict_dir, "en_US.aff"), h.affix
19
+ assert_equal File.join(@dict_dir, "en_US.dic"), h.dictionary
20
+ end
21
+
22
+ def test_initialize_missing
23
+ e = assert_raises ArgumentError do
24
+ Hunspell.new(@dict_dir, "en_CA")
25
+ end
26
+
27
+ dict = File.join(@dict_dir, "en_CA.aff")
28
+ assert_equal "Hunspell could not find affix file #{dict}", e.message
29
+ end
30
+
31
+ def test_analyze
32
+ assert_equal [" st:hello"], @dict.analyze("hello")
8
33
  end
9
34
 
10
35
  def test_basic_spelling
11
- assert @dict.spell("Baumkuchen") == true
12
- assert @dict.check("Baumkuchen") == true # check alias
13
- assert @dict.spell("Bomcuken") == false
14
- assert_equal ["Baumkuchen"], @dict.suggest("Baumgurken")
36
+ assert @dict.spell("worked")
37
+ assert @dict.check("worked") # check alias
38
+ assert !@dict.spell("working")
39
+
40
+ assert_equal ["worked", "work"], @dict.suggest("woked")
15
41
  assert_equal [], @dict.suggest("qwss43easd")
16
42
  end
17
43
 
@@ -23,4 +49,67 @@ class TestHunspell < Test::Unit::TestCase
23
49
  assert @dict.spell("Neuer Kuchen") == false
24
50
  # TODO test add_with_affix
25
51
  end
26
- end
52
+
53
+ def test_find_language_none
54
+ orig_LC_ALL = ENV["LC_ALL"]
55
+ orig_LC_MESSAGES = ENV["LC_MESSAGES"]
56
+ orig_LANG = ENV["LANG"]
57
+
58
+ ENV.delete "LC_ALL"
59
+ ENV.delete "LC_MESSAGES"
60
+ ENV.delete "LANG"
61
+
62
+ assert_nil @dict.find_language
63
+ ensure
64
+ ENV["LC_ALL"] = orig_LC_ALL
65
+ ENV["LC_MESSAGES"] = orig_LC_MESSAGES
66
+ ENV["LANG"] = orig_LANG
67
+ end
68
+
69
+ def test_find_language_LANG
70
+ orig_LC_ALL = ENV["LC_ALL"]
71
+ orig_LC_MESSAGES = ENV["LC_MESSAGES"]
72
+ orig_LANG = ENV["LANG"]
73
+
74
+ ENV.delete "LC_ALL"
75
+ ENV.delete "LC_MESSAGES"
76
+ ENV["LANG"] = "en_CA.UTF-8"
77
+
78
+ assert_equal "en_CA", @dict.find_language
79
+ ensure
80
+ ENV["LC_ALL"] = orig_LC_ALL
81
+ ENV["LC_MESSAGES"] = orig_LC_MESSAGES
82
+ ENV["LANG"] = orig_LANG
83
+ end
84
+
85
+ def test_find_language_LC_ALL
86
+ orig_LC_ALL = ENV["LC_ALL"]
87
+ ENV["LC_ALL"] = "en_CA.UTF-8"
88
+
89
+ assert_equal "en_CA", @dict.find_language
90
+ ensure
91
+ ENV["LC_ALL"] = orig_LC_ALL
92
+ end
93
+
94
+ def test_find_language_LC_MESSAGES
95
+ orig_LC_ALL = ENV["LC_ALL"]
96
+ orig_LC_MESSAGES = ENV["LC_MESSAGES"]
97
+ ENV.delete "LC_ALL"
98
+ ENV["LC_MESSAGES"] = "en_CA.UTF-8"
99
+
100
+ assert_equal "en_CA", @dict.find_language
101
+ ensure
102
+ ENV["LC_ALL"] = orig_LC_ALL
103
+ ENV["LC_MESSAGES"] = orig_LC_MESSAGES
104
+ end
105
+
106
+ def test_stem
107
+ assert_equal %w[hello], @dict.stem("hello")
108
+ end
109
+
110
+ def test_suggest
111
+ suggestions = @dict.suggest "HOWTOs"
112
+
113
+ assert_equal %w[Hots’s], suggestions
114
+ end
115
+ end
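The find_language tests each save and restore the locale variables by hand. One possible way to factor that out is a small block helper; with_env is a hypothetical name and is not part of this gem or of Test::Unit. The sketch relies on the documented behavior that assigning nil to an ENV key deletes the variable.

```ruby
# Hypothetical test helper: apply the given overrides for the duration of
# the block, then restore the previous values (a nil value deletes).
def with_env(overrides)
  saved = {}
  overrides.each_key { |name| saved[name] = ENV[name] }
  overrides.each { |name, value| ENV[name] = value }
  yield
ensure
  saved.each { |name, value| ENV[name] = value } if saved
end

with_env({ "LC_ALL" => nil, "LC_MESSAGES" => nil, "LANG" => "en_CA.UTF-8" }) do
  puts ENV["LANG"]
end
```

Inside a test, the body of the block would hold the assertion, for example a call to @dict.find_language.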
metadata CHANGED
@@ -1,94 +1,71 @@
1
- --- !ruby/object:Gem::Specification
1
+ --- !ruby/object:Gem::Specification
2
2
  name: hunspell-ffi
3
- version: !ruby/object:Gem::Version
4
- prerelease: true
5
- segments:
6
- - 0
7
- - 1
8
- - 3
9
- - alpha2
10
- version: 0.1.3.alpha2
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.3
5
+ prerelease:
11
6
  platform: ruby
12
- authors:
7
+ authors:
13
8
  - Andreas Haller
14
9
  autorequire:
15
10
  bindir: bin
16
11
  cert_chain: []
17
-
18
- date: 2011-03-23 00:00:00 +01:00
19
- default_executable:
20
- dependencies:
21
- - !ruby/object:Gem::Dependency
12
+ date: 2012-05-04 00:00:00.000000000Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
22
15
  name: ffi
23
- prerelease: false
24
- requirement: &id001 !ruby/object:Gem::Requirement
16
+ requirement: &70297776116200 !ruby/object:Gem::Requirement
25
17
  none: false
26
- requirements:
18
+ requirements:
27
19
  - - ~>
28
- - !ruby/object:Gem::Version
29
- segments:
30
- - 1
31
- - 0
32
- - 7
20
+ - !ruby/object:Gem::Version
33
21
  version: 1.0.7
34
22
  type: :runtime
35
- version_requirements: *id001
23
+ prerelease: false
24
+ version_requirements: *70297776116200
36
25
  description:
37
- email:
26
+ email:
38
27
  - andreashaller@gmail.com
39
28
  executables: []
40
-
41
29
  extensions: []
42
-
43
30
  extra_rdoc_files: []
44
-
45
- files:
31
+ files:
46
32
  - .gitignore
47
- - CHANGES
33
+ - CHANGELOG
48
34
  - Gemfile
49
35
  - Gemfile.lock
50
36
  - README.rdoc
51
37
  - Rakefile
52
38
  - hunspell-ffi.gemspec
53
39
  - lib/hunspell-ffi.rb
54
- - test/cakes.aff
55
- - test/cakes.dic
40
+ - test/en_US.aff
41
+ - test/en_US.dic
56
42
  - test/test_hunspell.rb
57
- has_rdoc: true
58
- homepage: http://github.com/ahaller/hunspell-ffi
43
+ homepage: http://github.com/ahx/hunspell-ffi
59
44
  licenses: []
60
-
61
45
  post_install_message:
62
46
  rdoc_options: []
63
-
64
- require_paths:
47
+ require_paths:
65
48
  - lib
66
- required_ruby_version: !ruby/object:Gem::Requirement
49
+ required_ruby_version: !ruby/object:Gem::Requirement
67
50
  none: false
68
- requirements:
69
- - - ">="
70
- - !ruby/object:Gem::Version
71
- segments:
72
- - 0
73
- version: "0"
74
- required_rubygems_version: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ! '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ required_rubygems_version: !ruby/object:Gem::Requirement
75
56
  none: false
76
- requirements:
77
- - - ">="
78
- - !ruby/object:Gem::Version
79
- segments:
80
- - 1
81
- - 3
82
- - 6
57
+ requirements:
58
+ - - ! '>='
59
+ - !ruby/object:Gem::Version
83
60
  version: 1.3.6
84
61
  requirements: []
85
-
86
62
  rubyforge_project:
87
- rubygems_version: 1.3.7
63
+ rubygems_version: 1.8.6
88
64
  signing_key:
89
65
  specification_version: 3
90
66
  summary: A Ruby FFI interface to the Hunspell spelling checker
91
- test_files:
92
- - test/cakes.aff
93
- - test/cakes.dic
67
+ test_files:
68
+ - test/en_US.aff
69
+ - test/en_US.dic
94
70
  - test/test_hunspell.rb
71
+ has_rdoc:
data/CHANGES DELETED
@@ -1,10 +0,0 @@
1
- = 0.1.3 (Work in progress)
2
- * Update ffi version (~>1.0.7)
3
- * New alias: Hunspell#check? for Hunspell#check
4
- * New methods: #add(word), #remove(word), #add_with_affix(word, example) to
5
- add/remove words from the run-time dictionary.
6
- * Show a warning when we cannot find aff/dic files.
7
-
8
- = 0.1.2 / 2010-08-07
9
- * First release that works on OSX and Debian
10
-
File without changes
@@ -1,21 +0,0 @@
1
- 20
2
- Apfelkuchen
3
- Baumkuchen
4
- Bienenstich
5
- Butterkuchen
6
- Donauwelle
7
- Eierschecke
8
- Guglhupf
9
- Hefekuchen
10
- Käsekuchen
11
- Marmorkuchen
12
- Obstkuchen
13
- Panettone
14
- Pflaumenkuchen
15
- Russischer Zupfkuchen
16
- Rührkuchen
17
- Sandkuchen
18
- Schmandkuchen
19
- Stollen
20
- Streuselkuchen
21
- Zwiebelkuchen