raspell 0.3

data/README ADDED
@@ -0,0 +1,133 @@
1
+
2
+ RASPELL - an interface binding for ruby to aspell.
3
+ ==================================================
4
+
5
+ Author:
6
+ -------
7
+ - Matthias Veit <matthias_veit@yahoo.de>
8
+
9
+ Requirements:
10
+ -------------
11
+ - aspell Ver 0.5 (or later) - see aspell.net.
12
+
13
+ Documentation:
14
+ --------------
15
+ Aspell comes with a very brief description of each feature.
16
+ Read the manual for the use of aspell.
17
+
18
+ Notes for the Wrapper:
19
+ ----------------------
20
+ Aspell comes with a lot of classes (iterators, stringhelpers etc.)
21
+ that are not needed by ruby and are therefore not bridged.
22
+ Only 2 classes are introduced:
23
+
24
+ => Aspell:
25
+ Almost all functionality of aspell is accessible via this class:
26
+
27
+ *inheritance tree:
28
+ -Object
29
+
30
+ *included modules:
31
+ -Kernel
32
+
33
+ *defined constants:
34
+ ULTRA, - option of suggestion mode
35
+ NORMAL, - option of suggestion mode
36
+ FAST, - option of suggestion mode
37
+ BADSPELLERS, - option of suggestion mode
38
+ CheckerOptions - list of checker options
39
+ DictionaryOptions, - list of dictionary options
40
+ FilterOptions, - list of filter options
41
+ MiscOptions, - list of misc options
42
+ RunTogetherOptions, - list of run together options
43
+ UtilityOptions, - list of utility options
44
+
45
+ *class methods:
46
+ -list_dicts()
47
+ List of all available dictionaries.
48
+ Return a list of AspellDictInfo objects.
49
+ -new(language, jargon, size, encoding)
50
+ Constructor.
51
+ -language ISO639 language code plus optional ISO 3166 country code as string (eg: "de" or "en_US")
52
+ -jargon a special jargon of the selected language
53
+ -size the size of the dictionary to choose (if there are options)
54
+ -encoding the encoding to use
55
+ Note: All parameters are optional and have default values.
56
+ In most cases it is enough to create an Aspell instance with, e.g., Aspell.new("en").
57
+
58
+ *instance methods:
59
+ -add_to_personal(word)
60
+ Add the word to the private dictionary.
61
+ -add_to_session(word)
62
+ Add the word to the session list (Ignore the
63
+ word for the rest of the session). The session
64
+ relates to the lifetime of the object.
65
+ -check(word)
66
+ Check given word for correctness. Returns true
67
+ if word is correct, otherwise false.
68
+ -correct_file(filename)
69
+ Check the whole file with name filename.
70
+ This method needs a block, which will yield
71
+ each misspelled word.
72
+ -correct_lines(array_of_strings)
73
+ Check an array of strings for correctness.
74
+ This method needs a block, which will yield
75
+ each misspelled word.
76
+ -clear_session()
77
+ Delete all words inside the session-wordlist.
78
+ -get_option(option)
79
+ Value of option in config.
80
+ -get_option_as_list(option)
81
+ Value of option in config.
82
+ The result is a list of strings.
83
+ -list_misspelled(array_of_strings)
84
+ Check an array of strings for correctness.
85
+ Return a list of all words that are misspelled.
86
+ -personal_wordlist()
87
+ Return a list of words inside private dictionary.
88
+ -save_all_word_lists()
89
+ All changed dictionaries get synchronized.
90
+ -session_wordlist()
91
+ Return a list of words inside session wordlist.
92
+ -set_option(option, value)
93
+ Set a specified configurable option to value.
94
+ -suggest(word)
95
+ Make a suggestion to the given misspelled word.
96
+ Return a list of words that are possible suggestions
97
+ to the misspelled given word.
98
+ -suggestion_mode=(mode)
99
+ Set the suggestion mode to one of:
100
+ Aspell::ULTRA - look for soundslikes with one edit distance
101
+ Aspell::FAST - like ultra, but with typo analysis
102
+ Aspell::NORMAL - look for soundslikes with two edit distances + typo analysis
103
+ Aspell::BADSPELLERS - tailored for bad spellers
104
+
105
+
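The README says list_misspelled returns the misspelled words of an array of strings, and the C source in this package gathers them as hash keys, so each word is reported once. The idea can be sketched in plain Ruby as follows. This is only an illustration: `KNOWN_WORDS`, `unique_misspellings`, and the regexp-based word splitting are assumptions made so the sketch runs without aspell installed; the real extension asks aspell's document checker for the misspelled tokens.

```ruby
require "set"

# Hypothetical stand-in for a dictionary lookup; a real speller would
# answer this question through Aspell#check.
KNOWN_WORDS = Set.new(%w[this is a text with to find])

# Collect each unknown word once, in order of first appearance.
# A hash is used as an ordered set, similar in spirit to the C source.
def unique_misspellings(lines, known_words)
  seen = {}
  lines.each do |line|
    # Assumption: words are plain runs of ASCII letters.
    line.scan(/[A-Za-z]+/) do |word|
      seen[word] = true unless known_words.include?(word.downcase)
    end
  end
  seen.keys
end

lines = ["This is a texxt with erors", "erors to find"]
puts unique_misspellings(lines, KNOWN_WORDS).inspect
```

The second occurrence of "erors" does not add a new entry, which matches the once-per-word behaviour described above.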
106
+ => AspellDictInfo:
107
+ Aspell.list_dicts returns a list of AspellDictInfo objects. Each dictionary is
108
+ described by an instance of AspellDictInfo:
109
+
110
+ *inheritance tree:
111
+ -Object
112
+
113
+ *included modules:
114
+ -Kernel
115
+
116
+ *class methods:
117
+ -new()
118
+ Constructor. Never invoke this method by hand.
119
+
120
+ *instance methods:
121
+ -code()
122
+ The code of the dictionary.
123
+ -jargon()
124
+ The jargon of the dictionary.
125
+ -name()
126
+ The name of the dictionary.
127
+ -size()
128
+ The size of the dictionary.
129
+
130
+ Example:
131
+ --------
132
+ See the examples directory inside the distribution.
133
+
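The README describes correct_lines as yielding each misspelled word to a block. According to the C source in this package, the block's return value replaces the word, while nil or an empty string leaves it unchanged, and a running offset keeps later replacements aligned after the line length changes. A minimal pure-Ruby sketch of that bookkeeping is given here. `apply_corrections` and the `tokens` argument are hypothetical: `tokens` stands in for the [offset, length] pairs that aspell's document checker would report, and the sketch assumes ASCII text so that character and byte offsets agree.

```ruby
# Pure-Ruby sketch of the replacement loop behind correct_lines.
# `tokens` is a hypothetical list of [offset, length] pairs, one per
# misspelled word, in the order the words occur in the line.
def apply_corrections(line, tokens)
  result = line.dup
  shift = 0 # how far earlier replacements have moved the later words
  tokens.each do |offset, length|
    bad_word = line[offset, length]
    replacement = yield(bad_word)
    # nil or an empty answer leaves the word untouched
    next if replacement.nil?
    replacement = replacement.chomp
    next if replacement.empty?
    result[offset + shift, length] = replacement
    shift += replacement.length - length
  end
  result
end

line = "manny erors"
tokens = [[0, 5], [6, 5]]
fixes = { "manny" => "many", "erors" => "errors\n" }
corrected = apply_corrections(line, tokens) { |word| fixes[word] }
puts corrected
```

Offsets are always taken from the original line and shifted when writing into the copy, so a shorter or longer replacement does not corrupt the words that follow.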
@@ -0,0 +1,46 @@
1
+ $:.unshift(File.dirname(__FILE__) + "/../lib/")
2
+ begin
3
+ require 'rubygems'
4
+ require_gem 'raspell'
5
+ rescue LoadError
6
+ require 'raspell'
7
+ end
8
+
9
+ aspell = Aspell.new
10
+ # Set suggestion mode: one of [ULTRA|FAST|NORMAL|BADSPELLERS]
11
+ aspell.suggestion_mode = Aspell::ULTRA
12
+
13
+
14
+ # some content to check - with many errors
15
+ content = ["This is a simpel sample texxt, with manny erors to find.",
16
+ "To check this, we need an englishh diktionary!"]
17
+ puts content.join("\n")+ "\n\nChecking..."
18
+
19
+ # get a list of all misspelled words
20
+ misspelled = aspell.list_misspelled(content).sort
21
+ puts "\n#{misspelled.length} misspelled words:\n -"+misspelled.join("\n -")
22
+
23
+ # correct content
24
+ correctreadme = aspell.correct_lines(content) { |badword|
25
+ puts "\nMisspelled: #{badword}\n"
26
+ suggestions = aspell.suggest(badword)
27
+ suggestions.each_with_index { |word, num|
28
+ puts " [#{num+1}] #{word}\n"
29
+ }
30
+ puts "Enter number or correct word: "
31
+ input = gets.chomp
32
+ if (input.to_i != 0)
33
+ input = suggestions[input.to_i-1]
34
+ else
35
+ if (!aspell.check(input))
36
+ puts "\nthe word #{input} is not known inside dictionary."
37
+ #possible to add the word into private dictionary or into session
38
+ #via aspell.add_to_personal or aspell.add_to_session
39
+ end
40
+ end
41
+ input #return input
42
+ }
43
+
44
+ puts "\n\nThe correct text is:\n"+correctreadme.join("\n")
45
+
46
+ # It is possible to correct files directly via aspell.correct_file(filename)
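The example above indexes the suggestion list directly with the number the user typed. A number beyond the end of the list yields nil, and a negative number would count from the end of the array. A slightly more defensive variant could look like the sketch below. `resolve_choice` is a hypothetical helper, not part of raspell; it returns nil for unusable input, which correct_lines treats as "leave the word unchanged" according to the C source.

```ruby
# Hypothetical helper: turn what the user typed into a replacement word.
# Returns nil for empty input or an out-of-range number, so the caller
# can keep the original word.
def resolve_choice(input, suggestions)
  text = input.strip
  return nil if text.empty?

  if text =~ /\A\d+\z/
    index = text.to_i
    return nil unless (1..suggestions.length).include?(index)
    suggestions[index - 1]
  else
    # Anything that is not a plain number is taken as the corrected word.
    text
  end
end

suggestions = ["simple", "simpler", "sample"]
puts resolve_choice("2\n", suggestions).inspect
puts resolve_choice("dimple", suggestions).inspect
puts resolve_choice("7", suggestions).inspect
```

Inside the correct_lines block, the result of such a helper could be returned directly in place of the `if`/`else` branch.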
@@ -0,0 +1,38 @@
1
+ $:.unshift(File.dirname(__FILE__) + "/../lib/")
2
+ begin
3
+ require 'rubygems'
4
+ require_gem 'raspell'
5
+ rescue LoadError
6
+ require 'raspell'
7
+ end
8
+
9
+ # Show all installed dictionaries.
10
+ # Select one to use.
11
+ puts "Choose Dictionary:\n"
12
+ dicts = Aspell.list_dicts
13
+ raise "\nNo dictionary installed!\nYou have to install some dictionaries (see: www.aspell.net)" if dicts.empty?
14
+ dicts.each_with_index { |dict, num|
15
+ puts " [#{num+1}] #{dict.name} (code:#{dict.code}, jargon:#{dict.jargon}, size:#{dict.size})\n"
16
+ }
17
+ puts "enter number of dictionary:"
18
+ dictionary = dicts[gets.to_i-1]
19
+
20
+
21
+ # Create a new aspell instance
22
+ # Aspell.new(language, jargon, size, encoding)
23
+ # It is possible to omit all parameters (leave them out, or pass nil as a placeholder).
24
+ aspell = Aspell.new(dictionary.code, dictionary.jargon, dictionary.size.to_s)
25
+ # Set suggestion mode: one of [ULTRA|FAST|NORMAL|BADSPELLERS]
26
+ aspell.suggestion_mode = Aspell::ULTRA
27
+
28
+ puts "\n\nCurrent Dictionary config=====================\n"
29
+ Aspell::DictionaryOptions.each { |option|
30
+ begin
31
+ option_details = aspell.get_option(option)
32
+ rescue
33
+ option_details = "unknown"
34
+ end
35
+ puts " -#{option}: #{option_details}\n"
36
+ }
37
+ puts "==============================================\n"
38
+
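The option dump in the example above falls back to "unknown" whenever get_option raises. Since AspellError is defined as a subclass of StandardError in the C source, rescuing StandardError is enough to catch it. The same pattern can be sketched as a small reusable function. `dump_options` and `FakeSpeller` are hypothetical names introduced for illustration; `FakeSpeller` only imitates get_option so that the sketch runs without aspell installed.

```ruby
# Collect option values into a hash, substituting "unknown" when the
# lookup fails. `speller` only needs to respond to get_option(name).
def dump_options(speller, names)
  names.each_with_object({}) do |name, values|
    values[name] = begin
      speller.get_option(name)
    rescue StandardError
      "unknown"
    end
  end
end

# Hypothetical stand-in for an Aspell instance.
class FakeSpeller
  OPTIONS = { "lang" => "en", "size" => "60" }.freeze

  def get_option(name)
    OPTIONS.fetch(name) { raise ArgumentError, "no such option: #{name}" }
  end
end

report = dump_options(FakeSpeller.new, ["lang", "size", "jargon"])
report.each { |name, value| puts " -#{name}: #{value}" }
```

With a real speller, the list of names could come from one of the option constants such as Aspell::DictionaryOptions.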
data/ext/extconf.rb ADDED
@@ -0,0 +1,6 @@
1
+ require "mkmf"
2
+
3
+ have_header("ruby.h")
4
+ have_header("aspell.h")
5
+ have_library("aspell")
6
+ create_makefile("raspell")
data/ext/raspell.c ADDED
@@ -0,0 +1,748 @@
1
+
2
+ #include "raspell.h"
3
+
4
+ /**
5
+ * raspell September 2002
6
+ *
7
+ * Copyright (c) aquanauten (resp: matthias_veit@yahoo.de)
8
+ *
9
+ * This program is free software; you can redistribute it and/or modify
10
+ * it under the terms of the GNU General Public License as published by
11
+ * the Free Software Foundation; either version 2, or (at your option)
12
+ * any later version.
13
+ *
14
+ * This program is distributed in the hope that it will be useful,
15
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
16
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17
+ * GNU General Public License for more details.
18
+ *
19
+ * You should have received a copy of the GNU General Public License
20
+ * along with this program; if not, write to the Free Software
21
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
22
+ * 02111-1307, USA.
23
+ *
24
+ */
25
+
26
+ extern void Init_dictinfo();
27
+ extern void Init_aspell();
28
+
29
+ void Init_raspell() {
30
+ cAspellError = rb_define_class("AspellError", rb_eStandardError);
31
+ Init_dictinfo();
32
+ Init_aspell();
33
+ }
34
+
35
+ static AspellDictInfo* get_info(VALUE info) {
36
+ AspellDictInfo *result;
37
+ Data_Get_Struct(info, AspellDictInfo, result);
38
+ return result;
39
+ }
40
+
41
+ static VALUE dictinfo_s_new(int argc, VALUE *argv, VALUE klass) {
42
+ rb_raise(rb_eException, "not instantiable");
43
+ }
44
+
45
+ static VALUE dictinfo_name(VALUE self) {
46
+ return rb_str_new2(get_info(self)->name);
47
+ }
48
+
49
+ static VALUE dictinfo_code(VALUE self) {
50
+ return rb_str_new2(get_info(self)->code);
51
+ }
52
+
53
+ static VALUE dictinfo_jargon(VALUE self) {
54
+ return rb_str_new2(get_info(self)->jargon);
55
+ }
56
+
57
+ static VALUE dictinfo_size(VALUE self) {
58
+ return INT2FIX(get_info(self)->size);
59
+ }
60
+
61
+ static VALUE dictinfo_size_str(VALUE self) {
62
+ return rb_str_new2(get_info(self)->size_str);
63
+ }
64
+
65
+ void Init_dictinfo() {
66
+ //CLASS DEFINITION=========================================================
67
+ cDictInfo = rb_define_class("AspellDictInfo", rb_cObject);
68
+
69
+ //CLASS METHODS============================================================
70
+ rb_define_singleton_method(cDictInfo, "new", dictinfo_s_new, 0);
71
+
72
+ //METHODS =================================================================
73
+ rb_define_method(cDictInfo, "name", dictinfo_name, 0);
74
+ rb_define_method(cDictInfo, "code", dictinfo_code, 0);
75
+ rb_define_method(cDictInfo, "jargon", dictinfo_jargon, 0);
76
+ rb_define_method(cDictInfo, "size", dictinfo_size, 0);
77
+ rb_define_method(cDictInfo, "size_str", dictinfo_size_str, 0);
78
+ }
79
+
80
+ extern VALUE rb_cFile;
81
+
82
+ /**
83
+ * This method is called from the garbage collector during finalization.
84
+ * @param p pointer to spellchecker-object.
85
+ */
86
+ static void aspell_free(void *p) {
87
+ delete_aspell_speller(p);
88
+ }
89
+
90
+ /**
91
+ * Check for an error - raise exception if so.
92
+ * @param speller the spellchecker-object.
93
+ * @return void
94
+ * @exception Exception if there was an error.
95
+ */
96
+ static void check_for_error(AspellSpeller * speller) {
97
+ if (aspell_speller_error(speller) != 0) {
98
+ rb_raise(cAspellError, "%s", aspell_speller_error_message(speller));
99
+ }
100
+ }
101
+
102
+ /**
103
+ * Set a specific option, that is known by aspell.
104
+ * @param config the config object of a specific spellchecker.
105
+ * @param key the option to set (eg: lang).
106
+ * @param value the value of the option to set (eg: "en_US").
107
+ * @exception Exception if key not known, or value undefined.
108
+ */
109
+ static void set_option(AspellConfig *config, char *key, char *value) {
110
+ //printf("set option: %s = %s\n", key, value);
111
+ if (aspell_config_replace(config, key, value) == 0) {
112
+ rb_raise(cAspellError, "%s", aspell_config_error_message(config));
113
+ }
114
+ //check config:
115
+ if (aspell_config_error(config) != 0) {
116
+ rb_raise(cAspellError, "%s", aspell_config_error_message(config));
117
+ }
118
+ }
119
+
120
+ static void set_options(AspellConfig *config, VALUE hash) {
121
+ VALUE options = rb_funcall(hash, rb_intern("keys"), 0);
122
+ int count=RARRAY(options)->len;
123
+ int c = 0;
124
+ //set all values
125
+ while(c<count) {
126
+ //fetch option
127
+ VALUE option = RARRAY(options)->ptr[c];
128
+ VALUE value = rb_funcall(hash, rb_intern("fetch"), 1, option);
129
+ if (TYPE(option)!=T_STRING) rb_raise(cAspellError, "Given key must be a string.");
130
+ if (TYPE(value )!=T_STRING) rb_raise(cAspellError, "Given value must be a string.");
131
+ set_option(config, STR2CSTR(option), STR2CSTR(value));
132
+ c++;
133
+ }
134
+ }
135
+
136
+ /**
137
+ * Extract c-struct speller from ruby object.
138
+ * @param speller the speller as ruby object.
139
+ * @return the speller as c-struct.
140
+ */
141
+ static AspellSpeller* get_speller(VALUE speller) {
142
+ AspellSpeller *result;
143
+ Data_Get_Struct(speller, AspellSpeller, result);
144
+ return result;
145
+ }
146
+
147
+ /**
148
+ * Generate a document checker object from a given speller.
149
+ * @param speller the speller that shall check a document.
150
+ * @return a fresh document checker.
151
+ */
152
+ static AspellDocumentChecker* get_checker(AspellSpeller *speller) {
153
+ AspellCanHaveError * ret;
154
+ AspellDocumentChecker * checker;
155
+ ret = new_aspell_document_checker(speller);
156
+ if (aspell_error(ret) != 0)
157
+ rb_raise(cAspellError, "%s", aspell_error_message(ret));
158
+ checker = to_aspell_document_checker(ret);
159
+ return checker;
160
+ }
161
+
162
+ /**
163
+ * Utility function that wraps a list of words as ruby array of ruby strings.
164
+ * @param list an aspell wordlist.
165
+ * @return an ruby array, containing all words as ruby strings.
166
+ */
167
+ static VALUE get_list(const AspellWordList *list) {
168
+ VALUE result = rb_ary_new(); //do not touch list before the null check
169
+ if (list != 0) {
170
+ AspellStringEnumeration * els = aspell_word_list_elements(list);
171
+ const char * word;
172
+ while ( (word = aspell_string_enumeration_next(els)) != 0) {
173
+ rb_ary_push(result, rb_str_new2(word));
174
+ }
175
+ delete_aspell_string_enumeration(els);
176
+ }
177
+ return result;
178
+ }
179
+
180
+ /**
181
+ * Generate a regexp from the given word with word boundaries.
182
+ * @param word the word to match.
183
+ * @return regular expression, matching exactly the word as whole.
184
+ */
185
+ static VALUE get_wordregexp(VALUE word) {
186
+ char *cword = STR2CSTR(word);
187
+ char *result = malloc((strlen(cword)+5)*sizeof(char));
188
+ *result='\0';
189
+ strcat(result, "\\b");
190
+ strcat(result, cword);
191
+ strcat(result, "\\b");
192
+ word = rb_reg_new(result, strlen(result), 0);
193
+ free(result);
194
+ return word;
195
+ }
196
+
197
+
198
+ /**
199
+ * Ctor for aspell objects:
200
+ * Aspell.new(language, jargon, size, encoding)
201
+ * Please note: All parameters are optional. If a parameter is omitted, a default value is assumed from
202
+ * the environment (eg lang from $LANG). To retain default values, you can use nil
203
+ * as value: to set only size: Aspell.new(nil, nil, "80")
204
+ * @param language ISO639 language code plus optional ISO 3166 country code as string (eg: "de" or "en_US")
205
+ * @param jargon a special jargon of the selected language
206
+ * @param size the size of the dictionary to choose (if there are options)
207
+ * @param encoding the encoding to use
208
+ * @exception Exception if the specified dictionary is not found.
209
+ */
210
+ static VALUE aspell_s_new(int argc, VALUE *argv, VALUE klass) {
211
+ VALUE vlang, vjargon, vsize, vencoding;
212
+ const char *tmp;
213
+ //aspell values
214
+ AspellCanHaveError * ret;
215
+ AspellSpeller * speller;
216
+ AspellConfig * config;
217
+
218
+ //create new config
219
+ config = new_aspell_config();
220
+
221
+ //extract values
222
+ rb_scan_args(argc, argv, "04", &vlang, &vjargon, &vsize, &vencoding);
223
+
224
+ //language:
225
+ if (RTEST(vlang)) set_option(config, "lang", STR2CSTR(vlang));
226
+ //jargon:
227
+ if (RTEST(vjargon)) set_option(config, "jargon", STR2CSTR(vjargon));
228
+ //size:
229
+ if (RTEST(vsize)) set_option(config, "size", STR2CSTR(vsize));
230
+ //encoding:
231
+ if (RTEST(vencoding)) set_option(config, "encoding", STR2CSTR(vencoding));
232
+
233
+ //create speller:
234
+ ret = new_aspell_speller(config);
235
+ delete_aspell_config(config);
236
+ if (aspell_error(ret) != 0) {
237
+ tmp = strdup(aspell_error_message(ret));
238
+ delete_aspell_can_have_error(ret);
239
+ rb_raise(cAspellError, "%s", tmp);
240
+ }
241
+
242
+ speller = to_aspell_speller(ret);
243
+
244
+ //wrap pointer
245
+ return Data_Wrap_Struct(klass, 0, aspell_free, speller);
246
+ }
247
+
248
+
249
+
250
+ /**
251
+ * Ctor for aspell objects.
252
+ * This is a custom constructor and takes a hash of config options: key, value pairs.
253
+ * Common use:
254
+ *
255
+ * a = Aspell.new({"lang"=>"de", "jargon"=>"berlin"})
256
+ *
257
+ * For a list of config options, see aspell manual.
258
+ * @param options hash of options
259
+ */
260
+ static VALUE aspell_s_new1(VALUE klass, VALUE options) {
261
+ //aspell values
262
+ AspellCanHaveError * ret;
263
+ AspellSpeller * speller;
264
+ AspellConfig * config;
265
+
266
+ //create new config
267
+ config = new_aspell_config();
268
+
269
+ //set options
270
+ set_options(config, options);
271
+
272
+ //create speller:
273
+ ret = new_aspell_speller(config);
274
+ delete_aspell_config(config);
275
+ if (aspell_error(ret) != 0) {
276
+ const char *tmp = strdup(aspell_error_message(ret));
277
+ delete_aspell_can_have_error(ret);
278
+ rb_raise(cAspellError, "%s", tmp);
279
+ }
280
+
281
+ speller = to_aspell_speller(ret);
282
+
283
+ //wrap pointer
284
+ return Data_Wrap_Struct(klass, 0, aspell_free, speller);
285
+ }
286
+
287
+ /**
288
+ * List all available dictionaries.
289
+ * @param class object
290
+ * @return array of AspellDictInfo objects.
291
+ */
292
+ static VALUE aspell_s_list_dicts(VALUE klass) {
293
+ AspellConfig * config;
294
+ AspellDictInfoList * dlist;
295
+ AspellDictInfoEnumeration * dels;
296
+ const AspellDictInfo * entry;
297
+ VALUE result = rb_ary_new();
298
+
299
+ //get a list of dictionaries
300
+ config = new_aspell_config();
301
+ dlist = get_aspell_dict_info_list(config);
302
+ delete_aspell_config(config);
303
+
304
+ //iterate over list - fill ruby array
305
+ dels = aspell_dict_info_list_elements(dlist);
306
+ while ( (entry = aspell_dict_info_enumeration_next(dels)) != 0) {
307
+ rb_ary_push(result, Data_Wrap_Struct(cDictInfo, 0, 0, (AspellDictInfo *)entry));
308
+ }
309
+ delete_aspell_dict_info_enumeration(dels);
310
+ return result;
311
+ }
312
+
313
+ /**
314
+ * @see set_option.
315
+ */
316
+ static VALUE aspell_set_option(VALUE self, VALUE option, VALUE value) {
317
+ AspellSpeller *speller = get_speller(self);
318
+ set_option(aspell_speller_config(speller), STR2CSTR(option), STR2CSTR(value));
319
+ return self;
320
+ }
321
+
322
+
323
+ /**
324
+ * Delete an option.
325
+ * @param option optionstring to remove from the options.
326
+ */
327
+ static VALUE aspell_remove_option(VALUE self, VALUE option) {
328
+ AspellSpeller *speller = get_speller(self);
329
+ aspell_config_remove(aspell_speller_config(speller), STR2CSTR(option));
330
+ return self;
331
+ }
332
+
333
+ /**
334
+ * Set the mode in which words are suggested.
335
+ * @param one of Aspell::[ULTRA|FAST|NORMAL|BADSPELLERS]
336
+ */
337
+ static VALUE aspell_set_suggestion_mode(VALUE self, VALUE value) {
338
+ AspellSpeller *speller = get_speller(self);
339
+ set_option(aspell_speller_config(speller), "sug-mode", STR2CSTR(value));
340
+ return self;
341
+ }
342
+
343
+ /**
344
+ * Returns the personal wordlist as array of strings.
345
+ * @return array of strings
346
+ */
347
+ static VALUE aspell_personal_wordlist(VALUE self) {
348
+ AspellSpeller *speller = get_speller(self);
349
+ return get_list(aspell_speller_personal_word_list(speller));
350
+ }
351
+
352
+ /**
353
+ * Returns the session wordlist as array of strings.
354
+ * @return array of strings
355
+ */
356
+ static VALUE aspell_session_wordlist(VALUE self) {
357
+ AspellSpeller *speller = get_speller(self);
358
+ return get_list(aspell_speller_session_word_list(speller));
359
+ }
360
+
361
+ /**
362
+ * Returns the main wordlist as array of strings.
363
+ * @return array of strings
364
+ */
365
+ static VALUE aspell_main_wordlist(VALUE self) {
366
+ AspellSpeller *speller = get_speller(self);
367
+ return get_list(aspell_speller_main_word_list(speller));
368
+ }
369
+
370
+ /**
371
+ * Synchronize all wordlists with the current session.
372
+ */
373
+ static VALUE aspell_save_all_wordlists(VALUE self) {
374
+ AspellSpeller *speller = get_speller(self);
375
+ aspell_speller_save_all_word_lists(speller);
376
+ check_for_error(speller);
377
+ return self;
378
+ }
379
+
380
+ /**
381
+ * Remove all words inside session.
382
+ */
383
+ static VALUE aspell_clear_session(VALUE self) {
384
+ AspellSpeller *speller = get_speller(self);
385
+ aspell_speller_clear_session(speller);
386
+ check_for_error(speller);
387
+ return self;
388
+ }
389
+
390
+ /**
391
+ * Suggest words for the given misspelled word.
392
+ * @param word the misspelled word.
393
+ * @return array of strings.
394
+ */
395
+ static VALUE aspell_suggest(VALUE self, VALUE word) {
396
+ AspellSpeller *speller = get_speller(self);
397
+ return get_list(aspell_speller_suggest(speller, STR2CSTR(word), -1));
398
+ }
399
+
400
+ /**
401
+ * Add a given word to the list of known words inside my private dictionary.
402
+ * You have to call aspell_save_all_wordlists to make sure the list gets persistent.
403
+ * @param word the word to add.
404
+ */
405
+ static VALUE aspell_add_to_personal(VALUE self, VALUE word) {
406
+ AspellSpeller *speller = get_speller(self);
407
+ aspell_speller_add_to_personal(speller, STR2CSTR(word), -1);
408
+ check_for_error(speller);
409
+ return self;
410
+ }
411
+
412
+ /**
413
+ * Add a given word to the list of known words just for the lifetime of this object.
414
+ * @param word the word to add.
415
+ */
416
+ static VALUE aspell_add_to_session(VALUE self, VALUE word) {
417
+ AspellSpeller *speller = get_speller(self);
418
+ aspell_speller_add_to_session(speller, STR2CSTR(word), -1);
419
+ check_for_error(speller);
420
+ return self;
421
+ }
422
+
423
+ /**
424
+ * Retrieve the value of a specific option.
425
+ * The options are listed inside
426
+ * Aspell::[DictionaryOptions|CheckerOptions|FilterOptions|RunTogetherOptions|MiscOptions|UtilityOptions]
427
+ * @param key the option as string.
428
+ */
429
+ static VALUE aspell_conf_retrieve(VALUE self, VALUE key) {
430
+ AspellSpeller *speller = get_speller(self);
431
+ AspellConfig *config = aspell_speller_config(speller);
432
+ VALUE result = rb_str_new2(aspell_config_retrieve(config, STR2CSTR(key)));
433
+ if (aspell_config_error(config) != 0) {
434
+ rb_raise(cAspellError, "%s", aspell_config_error_message(config));
435
+ }
436
+ return result;
437
+ }
438
+
439
+ /**
440
+ * Retrieve the value of a specific option as list.
441
+ * @param key the option as string.
442
+ */
443
+ static VALUE aspell_conf_retrieve_list(VALUE self, VALUE key) {
444
+ AspellSpeller *speller = get_speller(self);
445
+ AspellConfig *config = aspell_speller_config(speller);
446
+ AspellStringList * list = new_aspell_string_list();
447
+ AspellMutableContainer * container = aspell_string_list_to_mutable_container(list);
448
+ AspellStringEnumeration * els;
449
+ VALUE result = rb_ary_new();
450
+ const char *option_value;
451
+
452
+ //retrieve list
453
+ aspell_config_retrieve_list(config, STR2CSTR(key), container);
454
+ //check for error
455
+ if (aspell_config_error(config) != 0) {
456
+ char *tmp = strdup(aspell_config_error_message(config));
457
+ delete_aspell_string_list(list);
458
+ rb_raise( cAspellError, "%s", tmp);
459
+ }
460
+
461
+ //iterate over list
462
+ els = aspell_string_list_elements(list);
463
+ while ( (option_value = aspell_string_enumeration_next(els)) != 0) {
464
+ //push the option value to result
465
+ rb_ary_push(result, rb_str_new2(option_value));
466
+ }
467
+ //free list
468
+ delete_aspell_string_enumeration(els);
469
+ delete_aspell_string_list(list);
470
+
471
+ return result;
472
+ }
473
+
474
+ /**
475
+ * Simply dump config.
476
+ * Not very useful at all.
477
+ */
478
+ static VALUE aspell_dump_config(VALUE self) {
479
+ AspellSpeller *speller = get_speller(self);
480
+ AspellConfig *config = aspell_speller_config(speller);
481
+ AspellKeyInfoEnumeration * key_list = aspell_config_possible_elements( config, 0 );
482
+ const AspellKeyInfo * entry;
483
+
484
+ while ( (entry = aspell_key_info_enumeration_next(key_list) ) ) {
485
+ printf("%20s: %s\n", entry->name, aspell_config_retrieve(config, entry->name) );
486
+ }
487
+ delete_aspell_key_info_enumeration(key_list);
488
+ return self;
489
+ }
490
+
491
+ /**
492
+ * Check a given word for correctness.
493
+ * @param word the word to check
494
+ * @return true if the word is correct, otherwise false.
495
+ */
496
+ static VALUE aspell_check(VALUE self, VALUE word) {
497
+ AspellSpeller *speller = get_speller(self);
498
+ VALUE result = Qfalse;
499
+ int code = aspell_speller_check(speller, STR2CSTR(word), -1);
500
+ if (code == 1)
501
+ result = Qtrue;
502
+ else if (code == 0)
503
+ result = Qfalse;
504
+ else
505
+ rb_raise( cAspellError, "%s", aspell_speller_error_message(speller));
506
+ return result;
507
+ }
508
+
509
+ /**
510
+ * This method will check an array of strings for misspelled words.
511
+ * This method needs a block to work properly. Each misspelled word is yielded,
512
+ * a correct word as result from the block is assumed.
513
+ * Common use:
514
+ *
515
+ * a = Aspell.new(...)
516
+ * text = ...
517
+ * a.correct_lines(text) { |badword|
518
+ * puts "Error: #{badword}\n"
519
+ * puts a.suggest(badword).join(" | ")
520
+ * gets #the input is returned as right word
521
+ * }
522
+ *
523
+ * @param ary the array of strings to check.
524
+ * @result an array holding all lines with corrected words.
525
+ */
526
+ static VALUE aspell_correct_lines(VALUE self, VALUE ary) {
527
+ VALUE result = ary;
528
+ if (rb_block_given_p()) {
529
+ //create checker
530
+ AspellSpeller *speller = get_speller(self);
531
+ AspellDocumentChecker * checker = get_checker(speller);
532
+ AspellToken token;
533
+ //some tmp values
534
+ VALUE vline, sline;
535
+ VALUE word, rword;
536
+ char *line;
537
+ int count=RARRAY(ary)->len;
538
+ int c=0;
539
+ //create new result array
540
+ result = rb_ary_new();
541
+ //iterate over array
542
+ while(c<count) {
543
+ int offset=0;
544
+ //fetch line
545
+ vline = RARRAY(ary)->ptr[c];
546
+ //save line
547
+ sline = rb_funcall(vline, rb_intern("dup"), 0);
548
+ //c representation
549
+ line = STR2CSTR(vline);
550
+ //process line
551
+ aspell_document_checker_process(checker, line, -1);
552
+ //iterate over all misspelled words
553
+ while (token = aspell_document_checker_next_misspelling(checker), token.len != 0) {
554
+ //extract word by start/length qualifier
555
+ word = rb_funcall(vline, rb_intern("[]"), 2, INT2FIX(token.offset), INT2FIX(token.len));
556
+ //get the right word from the block
557
+ rword = rb_yield(word);
558
+ //nil -> do nothing
559
+ if(rword == Qnil) continue;
560
+ //check for string
561
+ if (TYPE(rword) != T_STRING) rb_raise(cAspellError, "Need a String to substitute");
562
+ //chomp the string
563
+ rb_funcall(rword, rb_intern("chomp!"), 0);
564
+ //empty string -> do nothing
565
+ if(strlen(STR2CSTR(rword)) == 0) continue;
566
+ //remember word for later suggestion
567
+ aspell_speller_store_replacement(speller, STR2CSTR(word), -1, STR2CSTR(rword), -1);
568
+ //substitute the word by replacement
569
+ rb_funcall(sline, rb_intern("[]="), 3, INT2FIX(token.offset+offset), INT2FIX(token.len), rword);
570
+ //adjust offset
571
+ offset += strlen(STR2CSTR(rword))-strlen(STR2CSTR(word));
572
+ //printf("replace >%s< with >%s< (offset now %d)\n", STR2CSTR(word), STR2CSTR(rword), offset);
573
+ }
574
+ //push the substituted line to result
575
+ rb_ary_push(result, sline);
576
+ c++;
577
+ }
578
+ //free checker
579
+ delete_aspell_document_checker(checker);
580
+ } else {
581
+ rb_raise(cAspellError, "No block given. How to correct?");
582
+ }
583
+ return result;
584
+ }
585
+
586
+ /**
587
+ * Remember a correction.
588
+ * This affects the suggestion of other words to fit this correction.
589
+ * @param badword the bad spelled word as string.
590
+ * @param rightword the correction of the bad spelled word as string.
591
+ * @result self
592
+ */
593
+ static VALUE aspell_store_replacement(VALUE self, VALUE badword, VALUE rightword) {
594
+ AspellSpeller *speller = get_speller(self);
595
+ aspell_speller_store_replacement(speller, STR2CSTR(badword), -1, STR2CSTR(rightword), -1);
596
+ return self;
597
+ }
598
+
599
+ /**
600
+ * Simple utility function to correct a file.
601
+ * The file gets read, its content is checked and written back.
602
+ * Please note: This method will change the file! - no backup and of course: no warranty!
603
+ * @param filename the name of the file as String.
604
+ * @exception Exception due to lack of read/write permissions.
605
+ */
606
+ static VALUE aspell_correct_file(VALUE self, VALUE filename) {
607
+ if (rb_block_given_p()) {
608
+ VALUE content = rb_funcall(rb_cFile, rb_intern("readlines"), 1, filename);
609
+ VALUE newcontent = aspell_correct_lines(self, content);
610
+ VALUE file = rb_file_open(STR2CSTR(filename), "w+");
611
+ rb_funcall(file, rb_intern("write"), 1, rb_funcall(newcontent, rb_intern("join"), 0)); //join lines explicitly before writing
612
+ rb_funcall(file, rb_intern("close"), 0);
613
+ } else {
614
+ rb_raise(cAspellError, "No block given. How to correct?");
615
+ }
616
+ return self;
617
+
618
+ }
619
+
620
+ /**
621
+ * Return a list of all misspelled words inside a given array of strings.
622
+ * @param ary an array of strings to check for.
623
+ * @return array of strings: words that are misspelled.
624
+ */
625
+ static VALUE aspell_list_misspelled(VALUE self, VALUE ary) {
626
+ VALUE result = rb_hash_new();
627
+ //create checker
628
+ AspellSpeller *speller = get_speller(self);
629
+ AspellDocumentChecker * checker = get_checker(speller);
630
+ AspellToken token;
631
+ VALUE word, vline;
632
+ int count=RARRAY(ary)->len;
633
+ int c=0;
634
+ //iterate over array
635
+ while(c<count) {
636
+ //process line
637
+ vline = RARRAY(ary)->ptr[c];
638
+ aspell_document_checker_process(checker, STR2CSTR(vline), -1);
639
+ //iterate over all misspelled words
640
+ while (token = aspell_document_checker_next_misspelling(checker), token.len != 0) {
641
+ //extract word by start/length qualifier
642
+ word = rb_funcall(vline, rb_intern("[]"), 2, INT2FIX(token.offset), INT2FIX(token.len));
643
+ rb_hash_aset(result, word, Qnil);
644
+ //yield block, if given
645
+ if (rb_block_given_p())
646
+ rb_yield(word);
647
+ }
648
+ c++;
649
+ }
650
+ //free checker
651
+ delete_aspell_document_checker(checker);
652
+ result = rb_funcall(result, rb_intern("keys"), 0);
653
+ return result;
654
+ }
655
+
656
+ void Init_aspell() {
657
+ //CLASS DEFINITION=========================================================
658
+ cAspell = rb_define_class("Aspell", rb_cObject);
659
+
660
+ //CONSTANTS================================================================
661
+ rb_define_const(cAspell, "ULTRA", rb_str_new2("ultra"));
662
+ rb_define_const(cAspell, "FAST", rb_str_new2("fast"));
663
+ rb_define_const(cAspell, "NORMAL", rb_str_new2("normal"));
664
+ rb_define_const(cAspell, "BADSPELLERS", rb_str_new2("bad-spellers"));
665
+ rb_define_const(cAspell, "DictionaryOptions", rb_ary_new3( 11,
666
+ rb_str_new2("master"),
667
+ rb_str_new2("dict-dir"),
668
+ rb_str_new2("lang"),
669
+ rb_str_new2("size"),
670
+ rb_str_new2("jargon"),
671
+ rb_str_new2("word-list-path"),
672
+ rb_str_new2("module-search-order"),
673
+ rb_str_new2("personal"),
674
+ rb_str_new2("repl"),
675
+ rb_str_new2("extra-dicts"),
676
+ rb_str_new2("strip-accents")));
677
+ rb_define_const(cAspell, "CheckerOptions", rb_ary_new3( 11,
678
+ rb_str_new2("ignore"),
679
+ rb_str_new2("ignore-case"),
680
+ rb_str_new2("ignore-accents"),
681
+ rb_str_new2("ignore-repl"),
682
+ rb_str_new2("save-repl"),
683
+ rb_str_new2("sug-mode"),
684
+ rb_str_new2("module-search-order"),
685
+ rb_str_new2("personal"),
686
+ rb_str_new2("repl"),
687
+ rb_str_new2("extra-dicts"),
688
+ rb_str_new2("strip-accents")));
689
+ rb_define_const(cAspell, "FilterOptions", rb_ary_new3( 10,
690
+ rb_str_new2("filter"),
691
+ rb_str_new2("mode"),
692
+ rb_str_new2("encoding"),
693
+ rb_str_new2("add-email-quote"),
694
+ rb_str_new2("rem-email-quote"),
695
+ rb_str_new2("email-margin"),
696
+ rb_str_new2("sgml-check"),
697
+ rb_str_new2("sgml-extension"),
698
+ rb_str_new2("tex-command"),
699
+ rb_str_new2("tex-check-command")));
700
+ rb_define_const(cAspell, "RunTogetherOptions", rb_ary_new3( 3,
701
+ rb_str_new2("run-together"),
702
+ rb_str_new2("run-together-limit"),
703
+ rb_str_new2("run-together-min")));
704
+ rb_define_const(cAspell, "MiscOptions", rb_ary_new3( 8,
705
+ rb_str_new2("conf"),
706
+ rb_str_new2("conf-dir"),
707
+ rb_str_new2("data-dir"),
708
+ rb_str_new2("local-data-dir"),
709
+ rb_str_new2("home-dir"),
710
+ rb_str_new2("per-conf"),
711
+ rb_str_new2("prefix"),
712
+ rb_str_new2("set-prefix")));
713
+
714
+ rb_define_const(cAspell, "UtilityOptions", rb_ary_new3( 4,
715
+ rb_str_new2("backup"),
716
+ rb_str_new2("time"),
717
+ rb_str_new2("reverse"),
718
+ rb_str_new2("keymapping")));
719
+
720
+ //CLASS METHODS============================================================
721
+ rb_define_singleton_method(cAspell, "new", aspell_s_new, -1);
722
+ rb_define_singleton_method(cAspell, "new1", aspell_s_new1, 1);
723
+ rb_define_singleton_method(cAspell, "list_dicts", aspell_s_list_dicts, 0);
724
+
725
+ //METHODS =================================================================
726
+ rb_define_method(cAspell, "add_to_personal", aspell_add_to_personal, 1);
727
+ rb_define_method(cAspell, "add_to_session", aspell_add_to_session, 1);
728
+ rb_define_method(cAspell, "check", aspell_check, 1);
729
+ rb_define_method(cAspell, "correct_lines", aspell_correct_lines, 1);
730
+ rb_define_method(cAspell, "correct_file", aspell_correct_file, 1);
731
+ rb_define_method(cAspell, "clear_session", aspell_clear_session, 0);
732
+ rb_define_method(cAspell, "dump_config", aspell_dump_config, 0);
733
+ rb_define_method(cAspell, "list_misspelled", aspell_list_misspelled, 1);
734
+ //This seems not to be very useful ...
735
+ //rb_define_method(cAspell, "main_wordlist", aspell_main_wordlist, 0);
736
+ rb_define_method(cAspell, "personal_wordlist", aspell_personal_wordlist, 0);
737
+ rb_define_method(cAspell, "save_all_word_lists", aspell_save_all_wordlists, 0);
738
+ rb_define_method(cAspell, "session_wordlist", aspell_session_wordlist, 0);
739
+ rb_define_method(cAspell, "set_option", aspell_set_option, 2);
740
+ rb_define_method(cAspell, "store_replacement", aspell_store_replacement, 2);
741
+ rb_define_method(cAspell, "remove_option", aspell_remove_option, 1);
742
+ rb_define_method(cAspell, "get_option", aspell_conf_retrieve, 1);
743
+ rb_define_method(cAspell, "get_option_as_list", aspell_conf_retrieve_list, 1);
744
+ rb_define_method(cAspell, "suggest", aspell_suggest, 1);
745
+ rb_define_method(cAspell, "suggestion_mode=", aspell_set_suggestion_mode, 1);
746
+ }
747
+
748
+
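The control flow of `aspell_list_misspelled` above (collect words as hash keys to drop duplicates, yield each occurrence to an optional block, return the keys) can be sketched in plain Ruby. This is only an illustration and not raspell's implementation: `KNOWN_WORDS` and `list_misspelled_sketch` are hypothetical names, and a small word set stands in for the aspell document checker.

```ruby
require 'set'

# Hypothetical stand-in for the aspell dictionary; not part of raspell.
KNOWN_WORDS = Set.new(%w[here is something wrong on the planet])

# Mirrors the shape of aspell_list_misspelled: scan each line, record
# every unknown word once, and yield it to the block if one is given.
def list_misspelled_sketch(lines)
  seen = {}                       # hash used as an ordered set, like rb_hash_new()
  lines.each do |line|
    line.scan(/[A-Za-z']+/) do |word|
      next if KNOWN_WORDS.include?(word.downcase)
      seen[word] = nil            # like rb_hash_aset(result, word, Qnil)
      yield word if block_given?  # yielded on every occurrence, not only the first
    end
  end
  seen.keys                       # like calling "keys" on the result hash
end
```

Using a hash as the accumulator is what makes the returned list free of duplicates, while the block still sees every occurrence.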
data/ext/raspell.h ADDED
@@ -0,0 +1,34 @@
1
+ #ifndef _RASPELL_GLOBAL_H
2
+ #define _RASPELL_GLOBAL_H
3
+
4
+ #include <ruby.h>
5
+ #include <aspell.h>
6
+
7
+ VALUE cAspell;
8
+ VALUE cDictInfo;
9
+
10
+ VALUE cAspellError;
11
+
12
+ #endif
13
+
14
+ /**
15
+ * raspell September 2002
16
+ *
17
+ * Copyright (c) aquanauten (resp: matthias_veit@yahoo.de)
18
+ *
19
+ * This program is free software; you can redistribute it and/or modify
20
+ * it under the terms of the GNU General Public License as published by
21
+ * the Free Software Foundation; either version 2, or (at your option)
22
+ * any later version.
23
+ *
24
+ * This program is distributed in the hope that it will be useful,
25
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
26
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
27
+ * GNU General Public License for more details.
28
+ *
29
+ * You should have received a copy of the GNU General Public License
30
+ * along with this program; if not, write to the Free Software
31
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
32
+ * 02111-1307, USA.
33
+ *
34
+ */
data/raspell.gemspec ADDED
@@ -0,0 +1,25 @@
1
+ require 'rubygems'
2
+ spec = Gem::Specification.new do |s|
3
+ s.name = "raspell"
4
+ s.version = "0.3"
5
+ s.author = "Evan Weaver"
6
+ s.email = "evan at cloudbur st"
7
+ s.homepage = "http://rubyforge.org/projects/fauna"
8
+ s.platform = Gem::Platform::RUBY
9
+ s.summary = "an interface binding for ruby to aspell (aspell.net)"
10
+ candidates = Dir.glob("{.,ext,lib,test,examples}/**/*")
11
+ s.files = candidates.delete_if do |item|
12
+ item.include?("CVS") || item.include?("rdoc") || item.include?("svn")
13
+ end
14
+ s.require_path = "lib"
15
+ s.autorequire = "raspell"
16
+ s.extensions = ["ext/extconf.rb"]
17
+ s.test_file = "test/simple_test.rb"
18
+ s.has_rdoc = false
19
+ s.extra_rdoc_files = ["README"]
20
+ end
21
+
22
+ if $0 == __FILE__
23
+ Gem::manage_gems
24
+ Gem::Builder.new(spec).build
25
+ end
data/test/simple_test.rb ADDED
@@ -0,0 +1,36 @@
1
+ $:.unshift(File.dirname(__FILE__) + "/../lib/")
2
+
3
+ require 'test/unit'
4
+ require "raspell"
5
+
6
+ class TestSpell < Test::Unit::TestCase
7
+
8
+ def setup
9
+ @aspell = Aspell.new
10
+ @text = ["Hiere is somthing wrong on the planett. And it was not the Apollo."]
11
+ end
12
+
13
+ def test_correct_lines
14
+ assert_equal(["<wrong word> is <wrong word> wrong on the <wrong word>. And it was not the Apollo."],
15
+ @aspell.correct_lines(@text) { |word| "<wrong word>" })
16
+ end
17
+
18
+ def test_list_misspelled
19
+ misspelled = @aspell.list_misspelled(@text).sort
20
+ assert_equal(3, misspelled.length)
21
+ end
22
+
23
+ def test_suggest
24
+ suggestions = @aspell.suggest("spel")
25
+ assert_equal(["spell", "spiel", "spew", "Opel", "spec", "sped"],
26
+ suggestions)
27
+ end
28
+
29
+ def test_check
30
+ assert_equal(false, @aspell.check("spel"))
31
+ assert(@aspell.check("spell"))
32
+ end
33
+
34
+ end
35
+
36
+
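The block contract that `test_correct_lines` relies on (each misspelled word is passed to the block and replaced by the block's return value) can be sketched without aspell. This is a rough stand-in and not how raspell does it: `SKETCH_DICTIONARY` and `correct_lines_sketch` are hypothetical names, and the fixed word set is an assumption made so the example runs on its own.

```ruby
require 'set'

# Hypothetical word list standing in for the aspell dictionary.
SKETCH_DICTIONARY = Set.new(%w[here is something wrong on the planet and it was not apollo])

# Returns a new array of lines in which every unknown word has been
# replaced by whatever the caller's block returns for it.
def correct_lines_sketch(lines)
  lines.map do |line|
    line.gsub(/[A-Za-z']+/) do |word|
      SKETCH_DICTIONARY.include?(word.downcase) ? word : yield(word)
    end
  end
end
```

Called with the text from `setup` and the block `{ |word| "<wrong word>" }`, the sketch produces the same string the test expects from `correct_lines`.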
metadata ADDED
@@ -0,0 +1,64 @@
1
+ --- !ruby/object:Gem::Specification
2
+ rubygems_version: 0.9.2
3
+ specification_version: 1
4
+ name: raspell
5
+ version: !ruby/object:Gem::Version
6
+ version: "0.3"
7
+ date: 2007-03-10 00:00:00 -05:00
8
+ summary: an interface binding for ruby to aspell (aspell.net)
9
+ require_paths:
10
+ - lib
11
+ email: evan at cloudbur st
12
+ homepage: http://rubyforge.org/projects/fauna
13
+ rubyforge_project:
14
+ description:
15
+ autorequire: raspell
16
+ default_executable:
17
+ bindir: bin
18
+ has_rdoc: false
19
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
20
+ requirements:
21
+ - - ">"
22
+ - !ruby/object:Gem::Version
23
+ version: 0.0.0
24
+ version:
25
+ platform: ruby
26
+ signing_key:
27
+ cert_chain:
28
+ post_install_message:
29
+ authors:
30
+ - Evan Weaver
31
+ files:
32
+ - ./README
33
+ - ./examples
34
+ - ./ext
35
+ - ./lib
36
+ - ./raspell.gemspec
37
+ - ./test
38
+ - ./examples/check_correct.rb
39
+ - ./examples/dictionaries.rb
40
+ - ./ext/extconf.rb
41
+ - ./ext/raspell.c
42
+ - ./ext/raspell.h
43
+ - ./test/simple_test.rb
44
+ - ext/extconf.rb
45
+ - ext/raspell.c
46
+ - ext/raspell.h
47
+ - test/simple_test.rb
48
+ - examples/check_correct.rb
49
+ - examples/dictionaries.rb
50
+ - README
51
+ test_files:
52
+ - test/simple_test.rb
53
+ rdoc_options: []
54
+
55
+ extra_rdoc_files:
56
+ - README
57
+ executables: []
58
+
59
+ extensions:
60
+ - ext/extconf.rb
61
+ requirements: []
62
+
63
+ dependencies: []
64
+