pascoale 0.0.1 → 0.1.0
- checksums.yaml +4 -4
- data/.ruby-gemset +1 -0
- data/.ruby-version +1 -0
- data/README.md +32 -4
- data/data/errors.txt +1124 -0
- data/data/everything.txt +177302 -0
- data/data/unique_errors.txt +957 -0
- data/lib/pascoale/constants.rb +8 -0
- data/lib/pascoale/edits.rb +1 -1
- data/lib/pascoale/syllable_separator.rb +44 -0
- data/lib/pascoale/syllable_separator_benchmark.rb +29 -0
- data/lib/pascoale/version.rb +1 -1
- data/lib/pascoale.rb +8 -3
- data/pascoale.gemspec +1 -0
- data/spec/lib/pascoale/syllable_separator_spec.rb +150 -0
- metadata +38 -14
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: bfd6cfb79a1e6ef372e86ccd6eba1e414f256ac7
+data.tar.gz: 85e7e768620a374cb72588e54de90e6166cb2f4f
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: b3f0f665e8daab808873ca392de860ddbd1da1e7e361466bee69aed3728032b73aa926720056bc01e416eee12cc4270de3df036b45f2f9801f839d0c38aea705
+data.tar.gz: 69e36730f1c1f809f8e5d4117f044e5948a2ab89065dae57bb69160117d499181ded61ba873d2a6640953018155c03671964dfd1e9cead0c9f9b65e53e385896
data/.ruby-gemset
ADDED
@@ -0,0 +1 @@
+pascoale
data/.ruby-version
ADDED
@@ -0,0 +1 @@
+ruby-2.1
data/README.md
CHANGED
@@ -1,11 +1,17 @@
 # Pascoale
 
-Minor utilities for text processing in Brazilian Portuguese
+Minor utilities for text processing in **Brazilian Portuguese**.
 
-I'm going to add new functions as I need them.
+I'm going to add new functions as I need them.
+
+Currently it has:
+- variations of a word at one and two **edit distances** (Reference: http://norvig.com/spell-correct.html).
+- Syllabic separation. My tests against a corpus of ~170K words shows 99.36% of correctness \o/.
 
 The code is kinda slow, but I'm not worried about speed (yet).
 
+The name of the gem is a homage to "Prof. Pasquale Cipro Neto" (http://pt.wikipedia.org/wiki/Pasquale_Cipro_Neto), a great teacher! And yes, the name of the gem is wrong spelled as a joke ^_^
+
 ## Installation
 
 Add this line to your application's Gemfile:
@@ -27,15 +33,37 @@ Variations of a word (typos and misspelling)
 ```ruby
 require 'pascoale'
 
-edits = Pascoale
+edits = Pascoale::Edits.new('você')
 
 # 1 edit distance
 puts edits.editions.inspect
 
 # 2 edits distance
-puts edits.editions2.inspect # LOTS of output,
+puts edits.editions2.inspect # LOTS of output, beware.
 ```
 
+Syllabic separation
+
+```ruby
+require 'pascoale'
+
+separator = Pascoale::SyllableSeparator.new('exceção')
+puts separator.separated.inspect # ["ex", "ce", "ção"]
+
+separator = Pascoale::SyllableSeparator.new('aéreo')
+puts separator.separated.inspect # ["a", "é", "re", "o"]
+
+separator = Pascoale::SyllableSeparator.new('apneia')
+puts separator.separated.inspect # ["ap", "nei", "a"]
+
+separator = Pascoale::SyllableSeparator.new('construir')
+puts separator.separated.inspect # ["cons", "tru", "ir"]
+
+# Known error :( :( :(
+separator = Pascoale::SyllableSeparator.new('traidor')
+puts separator.separated.inspect # ["tra", "i", "dor"] should be ["trai", "dor"]
+
+```
 
 ## Contributing
 
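The README hunk above mentions word variations at one and two edit distances and points to Norvig's spelling-corrector article. The body of `Pascoale::Edits` is not part of this diff, so the following is only a rough sketch of the general technique from that article, not the gem's implementation. The names `ALPHABET`, `edits1`, and `edits2` are hypothetical, and the choice of accented letters is an assumption.

```ruby
# Hypothetical alphabet (assumption): ASCII letters plus common
# accented letters used in Brazilian Portuguese.
ALPHABET = (('a'..'z').to_a + %w[á â ã à é ê í ó ô õ ú ç]).freeze

# Returns every distinct string that is one edit away from +word+:
# a single deletion, adjacent transposition, replacement, or insertion.
def edits1(word)
  # All ways to split the word into a left and right part,
  # including the empty left and empty right parts.
  splits = (0..word.length).map { |i| [word[0...i], word[i..-1]] }
  non_empty = splits.reject { |_, right| right.empty? }

  deletes    = non_empty.map { |l, r| l + r[1..-1] }
  transposes = splits.select { |_, r| r.length > 1 }
                     .map { |l, r| l + r[1] + r[0] + r[2..-1] }
  replaces   = non_empty.flat_map { |l, r| ALPHABET.map { |c| l + c + r[1..-1] } }
  inserts    = splits.flat_map { |l, r| ALPHABET.map { |c| l + c + r } }

  (deletes + transposes + replaces + inserts).uniq
end

# Two-edit variations: apply edits1 to every one-edit variation.
# The result grows quickly, which matches the README's "LOTS of output" warning.
def edits2(word)
  edits1(word).flat_map { |variation| edits1(variation) }.uniq
end

puts edits1('voce').include?('você') # true
```

Under these assumptions, a misspelling such as `voce` reaches `você` through a single replacement, which is the kind of candidate a spell corrector would then check against a dictionary.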