sastrawi 0.1.0.pre
- checksums.yaml +7 -0
- data/.gitignore +50 -0
- data/.travis.yml +8 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +21 -0
- data/README.md +70 -0
- data/Rakefile +6 -0
- data/data/kata-dasar.txt +29932 -0
- data/lib/sastrawi/dictionary/array_dictionary.rb +33 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule10.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule11.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule12.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule13a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule13b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule14.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule15a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule15b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule16.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule17a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule17b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule17c.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule17d.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule18a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule18b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule19.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule1a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule1b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule2.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule20.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule21a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule21b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule23.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule24.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule25.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule26a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule26b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule27.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule28a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule28b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule29.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule3.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule30a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule30b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule30c.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule31a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule31b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule32.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule34.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule35.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule36.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule37a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule37b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule38a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule38b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule39a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule39b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule4.rb +11 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule40a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule40b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule41.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule42.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule5.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule6a.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule6b.rb +17 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule7.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule8.rb +19 -0
- data/lib/sastrawi/morphology/disambiguator/disambiguator_prefix_rule9.rb +19 -0
- data/lib/sastrawi/morphology/invalid_affix_pair_specification.rb +24 -0
- data/lib/sastrawi/stemmer/cache/array_cache.rb +25 -0
- data/lib/sastrawi/stemmer/cached_stemmer.rb +33 -0
- data/lib/sastrawi/stemmer/confix_stripping/precedence_adjustment_specification.rb +20 -0
- data/lib/sastrawi/stemmer/context/context.rb +170 -0
- data/lib/sastrawi/stemmer/context/removal.rb +17 -0
- data/lib/sastrawi/stemmer/context/visitor/dont_stem_short_word.rb +17 -0
- data/lib/sastrawi/stemmer/context/visitor/prefix_disambiguator.rb +46 -0
- data/lib/sastrawi/stemmer/context/visitor/remove_derivational_suffix.rb +28 -0
- data/lib/sastrawi/stemmer/context/visitor/remove_inflectional_particle.rb +26 -0
- data/lib/sastrawi/stemmer/context/visitor/remove_inflectional_possessive_pronoun.rb +26 -0
- data/lib/sastrawi/stemmer/context/visitor/remove_plain_prefix.rb +26 -0
- data/lib/sastrawi/stemmer/context/visitor/visitor_provider.rb +157 -0
- data/lib/sastrawi/stemmer/filter/text_normalizer.rb +15 -0
- data/lib/sastrawi/stemmer/stemmer.rb +85 -0
- data/lib/sastrawi/stemmer/stemmer_factory.rb +45 -0
- data/lib/sastrawi/stop_word_remover/stop_word_remover.rb +24 -0
- data/lib/sastrawi/stop_word_remover/stop_word_remover_factory.rb +152 -0
- data/lib/sastrawi/version.rb +3 -0
- data/lib/sastrawi.rb +12 -0
- data/sastrawi.gemspec +25 -0
- metadata +173 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 85c8c97313e9ebf76008045f30bec9a9eaca39dd
+  data.tar.gz: f4353fd69f5a722fd8003c37e8c0a61c43ce8c34
+SHA512:
+  metadata.gz: 86ac9f0de919863b86bc7e41b2428f958c43d921453fc34336282996352005bde335c5790ac509a894251b79fa9c79685a638c1ca97469921493fc54f91cad7a
+  data.tar.gz: 826cef4036c182fd855b399c0db1f7a417b43950f8ac9e710a4e8bb335880369f589196bc947162ccb664e427b6ea7e79d014a6ee9cbf0053084665c6df82fa9
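The digests recorded in checksums.yaml can be checked against the packaged files. The following is a minimal sketch, not part of the gem: it assumes the two files named above have already been read into memory, and the helper name `verify_checksums` is hypothetical.

```ruby
require 'digest'

# Digest classes for the two algorithms that appear in checksums.yaml.
DIGESTS = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }.freeze

# checksums: parsed checksums.yaml, e.g. { 'SHA1' => { 'metadata.gz' => '85c8...' } }
# files:     file name => raw file contents (a binary string)
# Returns one [algorithm, file name, matched?] triple per recorded digest.
def verify_checksums(checksums, files)
  checksums.flat_map do |algorithm, entries|
    digest_class = DIGESTS.fetch(algorithm)
    entries.map do |name, expected|
      actual = digest_class.hexdigest(files.fetch(name))
      [algorithm, name, actual == expected]
    end
  end
end
```

With an unpacked archive, `checksums` could come from `YAML.load_file` on the decompressed checksums.yaml and each `files` value from `File.binread`; any triple ending in `false` indicates a file that does not match its recorded digest.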
data/.gitignore
ADDED
@@ -0,0 +1,50 @@
+*.gem
+*.rbc
+/.config
+/coverage/
+/InstalledFiles
+/pkg/
+/spec/reports/
+/spec/examples.txt
+/test/tmp/
+/test/version_tmp/
+/tmp/
+
+# Used by dotenv library to load environment variables.
+# .env
+
+## Specific to RubyMotion:
+.dat*
+.repl_history
+build/
+*.bridgesupport
+build-iPhoneOS/
+build-iPhoneSimulator/
+
+## Specific to RubyMotion (use of CocoaPods):
+#
+# We recommend against adding the Pods directory to your .gitignore. However
+# you should judge for yourself, the pros and cons are mentioned at:
+# https://guides.cocoapods.org/using/using-cocoapods.html#should-i-check-the-pods-directory-into-source-control
+#
+# vendor/Pods/
+
+## Documentation cache and generated files:
+/.yardoc/
+/_yardoc/
+/doc/
+/rdoc/
+
+## Environment normalization:
+/.bundle/
+/vendor/bundle
+/lib/bundler/man/
+
+# for a library or gem, you might want to ignore these files since the code is
+# intended to run in multiple environments; otherwise, check them in:
+# Gemfile.lock
+# .ruby-version
+# .ruby-gemset
+
+# unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
+.rvmrc
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2016-2017 Andrias Meisyal
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,70 @@
+# Sastrawi Bindings for Ruby [![Build Status](https://travis-ci.org/meisyal/sastrawi-ruby.svg?branch=master)](https://travis-ci.org/meisyal/sastrawi-ruby)
+
+sastrawi-ruby is Ruby bindings for [Sastrawi][sastrawi], a library which allows you
+to stem words in Bahasa Indonesia. The original implementation of Sastrawi was
+written in PHP and this library is written in Ruby language.
+
+Taken from [Wikipedia][stemmingwiki], stemming is the process of reducing
+inflected (or sometimes derived) words to their word stem, base or root form.
+For instance, "menahan" has "tahan" as its base form.
+
+## Documentation
+
+Documentation for this library is not available at this moment. But, you can
+check [sastrawi-ruby GitHub Wiki][documentation] that contains TODO list.
+
+## Installation
+
+There are two options to install this library. First, if you just want to use
+Ruby bindings for Sastrawi, add this line to your application's Gemfile:
+
+    gem 'sastrawi'
+
+and then execute:
+
+    bundle install
+
+or you can install directly:
+
+    gem install sastrawi
+
+Note that, this library requires Ruby. Ruby 1.9.3 or above should be installed
+on your system. I would recommend to choose the stable versions.
+
+## Usage
+
+Currently, this library supports stemming words with provided base forms. You
+can't add or remove any base form. This feature will be implemented for next
+release.
+
+```ruby
+require 'sastrawi'
+
+# prepare a sentence or words to be stemmed and call the stem API
+sentence = 'Perekonomian Indonesia sedang dalam pertumbuhan yang membanggakan.'
+stemming_result = Sastrawi.stem(sentence)
+
+# the stemming result should be "ekonomi indonesia sedang dalam tumbuh yang
+bangga"
+puts stemming_result
+```
+
+## Contributing
+
+Contributions are welcome. If you find a bug, please report it to issue
+tracker. Use `dev` branch as a target of your feature branch for pull request.
+Both issue and pull request details should be written in English.
+
+## License
+
+This library is released under the terms of MIT License. See the
+[LICENSE][license] file for more details. sastrawi-ruby contains base form of
+words from [Kateglo][kateglo] and it is licensed under a [Creative Commons
+Attribution-NonCommercial-ShareAlike 3.0 Unported License][kateglolicense].
+
+[sastrawi]: https://github.com/sastrawi/sastrawi
+[stemmingwiki]: https://en.wikipedia.org/wiki/Stemming
+[documentation]: https://github.com/meisyal/sastrawi-ruby/wiki
+[license]: https://github.com/meisyal/sastrawi-ruby/blob/master/LICENSE.txt
+[kateglo]: http://kateglo.com
+[kateglolicense]: https://creativecommons.org/licenses/by-nc-sa/3.0/
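The README's examples ("menahan" to "tahan", "pertumbuhan" to "tumbuh") can be illustrated with a toy dictionary-driven stemmer. This is only a sketch of the general idea of stripping affixes until a known base form is reached; it is not the gem's algorithm, and `BASE_FORMS`, `TOY_RULES`, and `toy_stem` are hypothetical names with a deliberately tiny rule set.

```ruby
require 'set'

# Toy dictionary of base forms; the gem ships its own list in data/kata-dasar.txt.
BASE_FORMS = Set.new(%w[tahan ekonomi tumbuh bangga]).freeze

# Hypothetical, heavily simplified rules as [pattern, replacement] pairs.
TOY_RULES = [
  [/\Amen(?=[aiueo])/, 't'], # "men-" before a vowel may hide a dropped "t"
  [/\Aper/, ''],             # plain "per-" prefix
  [/\Amem/, ''],             # plain "mem-" prefix
  [/kan\z/, ''],             # "-kan" suffix
  [/an\z/, '']               # "-an" suffix
].freeze

# Apply the rules in order, stopping as soon as the candidate is a known
# base form. Unknown words are returned lowercased and otherwise unchanged.
def toy_stem(word)
  candidate = word.downcase
  return candidate if BASE_FORMS.include?(candidate)

  TOY_RULES.each do |pattern, replacement|
    candidate = candidate.sub(pattern, replacement)
    return candidate if BASE_FORMS.include?(candidate)
  end
  word.downcase
end

puts toy_stem('menahan')     # tahan
puts toy_stem('pertumbuhan') # tumbuh
```

Checking the dictionary after every step is what keeps a stemmer of this kind from over-stripping; a real implementation needs far more rules, which is what the many disambiguator files in this release suggest.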