freeling-analyzer 0.1.0
- data/.gitignore +18 -0
- data/.travis.yml +9 -0
- data/Gemfile +6 -0
- data/Guardfile +17 -0
- data/LICENSE +22 -0
- data/README.md +85 -0
- data/Rakefile +11 -0
- data/freeling-analyzer.gemspec +30 -0
- data/lib/freeling-analyzer.rb +1 -0
- data/lib/freeling/analyzer.rb +129 -0
- data/lib/freeling/analyzer/freeling_default.rb +57 -0
- data/lib/freeling/analyzer/process_wrapper.rb +82 -0
- data/lib/freeling/analyzer/version.rb +5 -0
- data/test/analyzer_test.rb +45 -0
- data/test/freeling_default_test.rb +41 -0
- data/test/process_wrapper_test.rb +89 -0
- data/test/test_helper.rb +5 -0
- metadata +203 -0
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/Guardfile
ADDED
@@ -0,0 +1,17 @@
+# A sample Guardfile
+# More info at https://github.com/guard/guard#readme
+
+guard :minitest do
+  # with Minitest::Unit
+  #watch(%r|^test/(.*)\/?test_(.*)\.rb|)
+  #watch(%r|^lib/(.*)([^/]+)\.rb|) { |m| "test/#{m[1]}test_#{m[2]}.rb" }
+  #watch(%r|^test/test_helper\.rb|) { "test" }
+  watch(%r{^lib/(.+)\.rb$}) { |m| "test/#{m[1].split("/").last}_test.rb" }
+  watch(%r{^test/.+_test\.rb$})
+  watch('test/test_helper.rb') { "test" }
+
+  # with Minitest::Spec
+  # watch(%r|^spec/(.*)_spec\.rb|)
+  # watch(%r|^lib/(.*)([^/]+)\.rb|) { |m| "spec/#{m[1]}#{m[2]}_spec.rb" }
+  # watch(%r|^spec/spec_helper\.rb|) { "spec" }
+end
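The active `watch` rule in this Guardfile maps a changed file under `lib/` to a test file named after its basename. A minimal sketch of that mapping in plain Ruby, outside Guard; `test_file_for` is a hypothetical helper name used only for illustration.

```ruby
# Same pattern as the active watch rule in the Guardfile.
LIB_PATTERN = %r{^lib/(.+)\.rb$}

# Hypothetical helper: return the test file Guard would run for a
# changed path, or nil when the path is not a Ruby file under lib/.
def test_file_for(path)
  m = LIB_PATTERN.match(path)
  m && "test/#{m[1].split("/").last}_test.rb"
end

wrapper_test = test_file_for("lib/freeling/analyzer/process_wrapper.rb")
analyzer_test = test_file_for("lib/freeling/analyzer.rb")
unrelated = test_file_for("README.md")
```

Because only the last path segment is kept, nested library files all map to a flat `test/` directory, which matches the layout of the test files in this package.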
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2012 Damián Silvani
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,85 @@
+# FreeLing::Analyzer [](https://travis-ci.org/munshkr/freeling-analyzer-ruby)
+
+**FreeLing::Analyzer** is a Ruby wrapper around `analyzer`, a binary tool
+included in FreeLing's package that allows the user to process a stream of text
+with FreeLing.
+
+*This has been tested with version 3.0+ only*.
+
+## Usage
+
+```ruby
+text = "Mi amigo Juan Mesa se mesa la barba al lado de la mesa."
+analyzer = FreeLing::Analyzer.new(text, :language => :es)
+
+analyzer.tokens.first
+# => #<FreeLing::Analyzer::Token form="Mi" lemma="mi" prob=0.995536 tag="DP1CSS">
+
+analyzer.tokens.map { |t| t.lemma }
+# => ["mi", "amigo", "juan_mesa", "se", "mesar", "el", "barba", "a+el", "lado", "de", "el", "mesa", "."]
+```
+
+## Features
+
+* Analyzer is *lazy*: it does not spawn the process until needed, and it sends
+  the input text to `analyzer` on demand.
+* It just works with the default installation of FreeLing. Just set the language
+  to use and you're good to go.
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'freeling-analyzer'
+```
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install freeling-analyzer
+
+## FreeLing
+
+[FreeLing](http://nlp.lsi.upc.edu/freeling/) is an open source suite of
+language analyzers written in C++.
+
+The main services offered are:
+* Text tokenization
+* Sentence splitting
+* Morphological analysis
+* Suffix treatment
+* Retokenization of clitic pronouns
+* Flexible multiword recognition
+* Contraction splitting
+* Probabilistic prediction of unknown word categories
+* Named entity detection (NER)
+* Recognition of dates, numbers, ratios, currency, and physical magnitudes
+  (speed, weight, temperature, density, etc.)
+* PoS tagging
+* Chart-based shallow parsing
+* Named entity classification (NEC)
+* WordNet based sense annotation and disambiguation
+* Rule-based dependency parsing
+* Nominal coreference resolution.
+
+Currently supported languages are:
+* Spanish
+* Catalan
+* Galician
+* Italian
+* English
+* Welsh
+* Portuguese
+* Asturian
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Added some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
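The README describes the analyzer as lazy. The idea can be illustrated without FreeLing installed. The following is a minimal sketch, assuming a hypothetical `fake_token_stream` that stands in for the external `analyzer` process. It only shows how a Ruby `Enumerator` defers work until an element is requested, and is not the gem's actual implementation.

```ruby
# Records every word the fake "process" has been asked to handle, so we
# can observe when work actually happens.
PRODUCED = []

# Hypothetical stand-in for reading tokens from the external process.
def fake_token_stream(words)
  Enumerator.new do |yielder|
    words.each do |word|
      PRODUCED << word          # side effect marks "work done"
      yielder << word.downcase  # pretend this is the lemma
    end
  end
end

stream = fake_token_stream(%w[Mi amigo Juan])
nothing_yet = PRODUCED.dup      # building the enumerator does no work
first = stream.first            # pulls exactly one element
```

Asking for `first` stops the enumeration after one element, so the remaining words are never processed unless the caller keeps iterating.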
data/Rakefile
ADDED
data/freeling-analyzer.gemspec
ADDED
@@ -0,0 +1,30 @@
+# -*- encoding: utf-8 -*-
+require File.expand_path('../lib/freeling/analyzer/version', __FILE__)
+
+Gem::Specification.new do |gem|
+  gem.authors = ["Damián Silvani"]
+  gem.email = ["munshkr@gmail.com"]
+  gem.summary = %q{Ruby wrapper for FreeLing's analyzer tool}
+  gem.description = %q{FreeLing::Analyzer is a Ruby wrapper around
+                       `analyzer`, a binary tool included in FreeLing's
+                       package that allows the user to process a stream of
+                       text with FreeLing.}
+  gem.homepage = "https://github.com/munshkr/freeling-analyzer-ruby"
+
+  gem.files = `git ls-files`.split($\)
+  gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+  gem.name = "freeling-analyzer"
+  gem.require_paths = ["lib"]
+  gem.version = FreeLing::Analyzer::VERSION
+
+  gem.add_development_dependency "rake"
+  gem.add_development_dependency "yard"
+  gem.add_development_dependency "mocha", "~> 0.13.3"
+  gem.add_development_dependency "minitest"
+  gem.add_development_dependency "minitest-wscolor"
+  gem.add_development_dependency "guard"
+  gem.add_development_dependency "guard-minitest"
+
+  gem.add_runtime_dependency "hashie"
+end
data/lib/freeling-analyzer.rb
ADDED
@@ -0,0 +1 @@
+require "freeling/analyzer"
data/lib/freeling/analyzer.rb
ADDED
@@ -0,0 +1,129 @@
+require "open3"
+require "hashie/mash"
+require "freeling/analyzer/process_wrapper"
+require "freeling/analyzer/freeling_default"
+
+module FreeLing
+  class Analyzer
+    attr_reader :document, :latest_error_log
+
+    Token = Class.new(Hashie::Mash)
+
+    def initialize(document, opts={})
+      @document = document
+
+      @options = {
+        :share_path => freeling_path,
+        :analyze_path => analyzer_path,
+        :input_format => :plain,
+        :output_format => :tagged,
+        :memoize => true,
+        :language => :es
+      }.merge(opts)
+    end
+
+    def sentences(run_again=false)
+      if @options[:output_format] == :token
+        raise "Sentence splitter is not available with output format set to 'token'"
+      end
+
+      if not run_again and @sentences
+        return @sentences.to_enum
+      end
+
+      Enumerator.new do |yielder|
+        tokens = []
+        read_tokens.each do |token|
+          if token
+            tokens << token
+          else
+            yielder << tokens
+            if @options[:memoize]
+              @sentences ||= []
+              @sentences << tokens
+            end
+            tokens = []
+          end
+        end
+      end
+    end
+
+    def tokens(run_again=false)
+      if not run_again and @tokens
+        return @tokens.to_enum
+      end
+
+      if @sentences
+        @tokens ||= @sentences.flatten
+        return @tokens.to_enum
+      end
+
+      Enumerator.new do |yielder|
+        read_tokens.each do |token|
+          if token
+            yielder << token
+            if @options[:memoize]
+              @tokens ||= []
+              @tokens << token
+            end
+          end
+        end
+      end
+    end
+
+
+    private
+    def command
+      "#{@options[:analyze_path]} " \
+      "-f #{config_path} " \
+      "--inpf #{@options[:input_format]} " \
+      "--outf #{@options[:output_format]} " \
+      "--nec " \
+      "--noflush"
+    end
+
+    def config_path
+      @options[:config_path] || File.join(language_config, "#{@options[:language]}.cfg")
+    end
+
+    def read_tokens
+      Enumerator.new do |yielder|
+        output_fd = @document.respond_to?(:read) ? @document : StringIO.new(@document)
+        @process_wrapper = ProcessWrapper.new(command, output_fd, "FREELINGSHARE" => @options[:share_path])
+
+        @process_wrapper.run.each do |line|
+          if not line.empty?
+            yielder << parse_token_line(line)
+          end
+        end
+
+        @latest_error_log = @process_wrapper.error_log
+
+        @process_wrapper.close
+        @process_wrapper = nil
+      end
+    end
+
+    def parse_token_line(str)
+      form, lemma, tag, prob = str.split(' ')[0..3]
+      Token.new({
+        :form => form,
+        :lemma => lemma,
+        :tag => tag,
+        :prob => prob && prob.to_f,
+      }.reject { |k, v| v.nil? })
+    end
+
+    def language_config
+      FreelingDefault.language_config
+    end
+
+    def freeling_path
+      FreelingDefault.freeling_path
+    end
+
+    def analyzer_path
+      FreelingDefault.analyzer_path
+    end
+  end
+end
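`parse_token_line` above splits each output line on whitespace into `form`, `lemma`, `tag` and `prob`. The following is a minimal sketch of that parsing plus sentence grouping. It assumes that an empty line marks the end of a sentence in the tagged output, which the hunk does not confirm. A plain `Struct` replaces the gem's `Hashie::Mash`-based `Token` to avoid dependencies, and `group_sentences` is a hypothetical helper. The first sample line mirrors the README example; the other lines are invented for illustration.

```ruby
# Plain Struct used instead of the gem's Hashie::Mash-based Token, so
# the sketch has no external dependencies.
Token = Struct.new(:form, :lemma, :tag, :prob)

# Split one "form lemma tag prob" line into a Token.
def parse_token_line(line)
  form, lemma, tag, prob = line.split(' ')[0..3]
  Token.new(form, lemma, tag, prob && prob.to_f)
end

# Hypothetical helper: group lines into sentences. An empty line is
# ASSUMED to end a sentence; this is not confirmed by the source.
def group_sentences(lines)
  sentences = []
  current = []
  lines.each do |line|
    if line.empty?
      sentences << current unless current.empty?
      current = []
    else
      current << parse_token_line(line)
    end
  end
  sentences << current unless current.empty?
  sentences
end

# Illustrative input only; apart from the first line the values are made up.
sample = ["Mi mi DP1CSS 0.995536", "amigo amigo NCMS000 1", "", ". . Fp 1"]
sentences = group_sentences(sample)
```

Note that `read_tokens` in the hunk skips empty lines before parsing, so a grouping step like this would have to see the raw lines rather than the already filtered token stream.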
data/lib/freeling/analyzer/freeling_default.rb
ADDED
@@ -0,0 +1,57 @@
+module FreeLing
+  class Analyzer
+    class FreelingDefault
+
+      LOCAL_ANALYZE_PATH = "/usr/local/bin/analyzer"
+      USR_ANALYZE_PATH = "/usr/bin/analyzer"
+      LOCAL_FREELING_SHARE_PATH = "/usr/local/share/freeling"
+      USR_FREELING_SHARE_PATH = "/usr/share/freeling"
+
+      class << self
+        def analyzer_path
+          self.new.analyzer_path
+        end
+
+        def freeling_path
+          self.new.freeling_path
+        end
+
+        def language_config
+          self.new.language_config
+        end
+      end
+
+      def language_config
+        if freeling_path.instance_of? String
+          File.join(freeling_path, "config")
+        else
+          raise_error(:analyze)
+        end
+      end
+
+      def analyzer_path
+        if File.exists? LOCAL_ANALYZE_PATH
+          LOCAL_ANALYZE_PATH
+        elsif File.exists? USR_ANALYZE_PATH
+          USR_ANALYZE_PATH
+        else
+          raise_error(:analyze)
+        end
+      end
+
+      def freeling_path
+        if Dir.exists? LOCAL_FREELING_SHARE_PATH
+          LOCAL_FREELING_SHARE_PATH
+        elsif Dir.exists? USR_FREELING_SHARE_PATH
+          USR_FREELING_SHARE_PATH
+        else
+          raise_error(:freeling)
+        end
+      end
+
+      def raise_error(type)
+        raise "#{type} is not installed."
+      end
+    end
+  end
+end
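The lookup above relies on `File.exists?` and `Dir.exists?`, which were deprecated and then removed in Ruby 3.2, so on current Rubies these calls raise `NoMethodError`. The following is a sketch of the same first-match lookup using `File.exist?` and `Dir.exist?`. `first_existing` is a hypothetical helper introduced for illustration and is not part of the gem.

```ruby
# Hypothetical helper: return the first candidate accepted by the
# block, or raise with the same message format as raise_error above.
def first_existing(candidates, label)
  found = candidates.find { |path| yield(path) }
  found or raise "#{label} is not installed."
end

ANALYZE_PATHS = ["/usr/local/bin/analyzer", "/usr/bin/analyzer"]
SHARE_PATHS   = ["/usr/local/share/freeling", "/usr/share/freeling"]

# /usr/local takes priority over /usr, as in the original class.
def analyzer_path
  first_existing(ANALYZE_PATHS, :analyze) { |p| File.exist?(p) }
end

def freeling_path
  first_existing(SHARE_PATHS, :freeling) { |p| Dir.exist?(p) }
end
```

Keeping the candidate lists in arrays makes it easy to add further install prefixes without another `elsif` branch.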
data/lib/freeling/analyzer/process_wrapper.rb
ADDED
@@ -0,0 +1,82 @@
+require "enumerator"
+require "thread"
+require "open3"
+
+module FreeLing
+  class Analyzer
+    class ProcessWrapper
+      attr_accessor :command, :output_fd, :env, :error_log
+
+      def initialize(command, output_fd, env={})
+        @command = command
+        @output_fd = output_fd
+        @env = env
+        @error_log = nil
+      end
+
+      def run
+        @error_log = nil
+
+        Enumerator.new do |yielder|
+          open_process
+
+          if @stdout.nil?
+            run_process
+          end
+
+          while line = @stdout.gets
+            line.chomp!
+            yielder << line
+          end
+
+          @stdout.close_read
+          @error_log = @stderr.read
+          @write_thr.join
+          close_process
+        end
+      end
+
+      def close
+        close_process
+      end
+
+      private
+      def open_process
+        @stdin, @stdout, @stderr, @wait_thr = Open3.popen3(@env, command)
+
+        @write_thr = Thread.new do
+          begin
+            IO.copy_stream(@output_fd, @stdin)
+            @stdin.close_write
+          rescue Errno::EPIPE
+            @error_log = @stderr.read
+          end
+        end
+      end
+
+      def close_process
+        close_fds
+        kill_threads
+        @stdin = @stdout = @stderr = nil
+        @wait_thr = @write_thr = nil
+      end
+
+      def close_fds
+        [@stdin, @stdout, @stderr].each do |fd|
+          if fd and not fd.closed?
+            fd.close
+          end
+        end
+      end
+
+      def kill_threads
+        [@wait_thr, @write_thr].each do |thr|
+          if thr and thr.alive?
+            thr.kill
+          end
+        end
+      end
+
+    end # ProcessWrapper
+  end # Analyzer
+end # FreeLing
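`ProcessWrapper` feeds the child process from a background thread while the caller reads its standard output. The following is a condensed sketch of that pattern using the block form of `Open3.popen3`. `run_filter` is a made-up name, and the sketch reads all output eagerly instead of yielding lines lazily as the class above does. It also reads stdout to the end before stderr, so it is only suitable for commands that write little to stderr.

```ruby
require "open3"
require "stringio"

# Hypothetical helper: feed `input` (any IO-like object) to `command`
# on a background thread, then return [stdout_lines, stderr_text].
def run_filter(command, input, env = {})
  Open3.popen3(env, command) do |stdin, stdout, stderr, wait_thr|
    writer = Thread.new do
      begin
        IO.copy_stream(input, stdin)
      rescue Errno::EPIPE
        # The child exited before consuming all of its input.
      ensure
        stdin.close
      end
    end

    lines = stdout.each_line.map(&:chomp)
    errors = stderr.read
    writer.join
    wait_thr.value # block until the child has exited
    [lines, errors]
  end
end

lines, errors = run_filter("cat", StringIO.new("Hello\nworld!"))
```

Writing from a separate thread avoids the deadlock that can occur when a single thread tries to write a large input while the child is blocked on a full output pipe.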
data/lib/freeling/analyzer/version.rb
ADDED
data/test/analyzer_test.rb
ADDED
@@ -0,0 +1,45 @@
+# encoding: utf-8
+require "test_helper"
+
+class AnalyzerTest < MiniTest::Unit::TestCase
+  def setup
+    @a = "El gato come pescado y bebe agua."
+    @b = "Yo bajo con el hombre bajo a tocar el bajo bajo la escalera."
+    @c = "Mi amigo Juan Mesa se mesa la barba al lado de la mesa."
+  end
+
+  def test_token_attributes
+    skip
+
+    expected_token = { form: "El", lemma: "el", prob: 1, tag: "DA0MS0" }
+
+    analyzer = FreeLing::Analyzer.new(@a, :language => :es)
+    token = analyzer.tokens.first
+
+    [:form, :lemma, :prob, :tag].each do |key|
+      assert_equal expected_token[key], token[key]
+    end
+  end
+
+  def test_token_list
+    skip
+
+    expected_tokens = [
+      { form: "El", lemma: "el", prob: 1, tag: "DA0MS0" },
+      { form: "gato", lemma: "gato", prob: 1, tag: "NCMS000" },
+      { form: "come", lemma: "comer", prob: 0.75, tag: "VMIP3S0" },
+      { form: "pescado", lemma: "pescado", prob: 0.833333, tag: "NCMS000" },
+      { form: "y", lemma: "y", prob: 0.999812, tag: "CC" },
+      { form: "bebe", lemma: "beber", prob: 0.994868, tag: "VMIP3S0" },
+      { form: "agua", lemma: "agua", prob: 0.973333, tag: "NCCS000" },
+      { form: ".", lemma: ".", prob: 1, tag: "Fp" },
+    ]
+
+    analyzer = FreeLing::Analyzer.new(@a, :language => :es)
+    analyzer.tokens.each.with_index do |token, i|
+      [:form, :lemma, :prob, :tag].each do |key|
+        assert_equal expected_tokens[i][key], token[key]
+      end
+    end
+  end
+end
data/test/freeling_default_test.rb
ADDED
@@ -0,0 +1,41 @@
+require 'test_helper'
+
+class FreelingDefaultTest < MiniTest::Unit::TestCase
+  def setup
+    @usr_bin_analyzer = "/usr/bin/analyzer"
+    @local_bin_analyzer = "/usr/local/bin/analyzer"
+    @usr_share_freeling = "/usr/share/freeling"
+    @local_share_freeling = "/usr/local/share/freeling"
+  end
+
+  def test_freeling_installed_on_usr
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(true)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(false)
+
+    FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @usr_bin_analyzer
+  end
+
+  def test_freeling_installed_on_local
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(false)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(true)
+
+    FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @local_bin_analyzer
+  end
+
+  def test_freeling_not_installed
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(false)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(false)
+
+    lambda {
+      FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @local_bin_analyzer
+    }.must_raise RuntimeError
+  end
+
+  def test_instanciate_analyzer
+    File.stubs("exists?").with(@local_bin_analyzer).returns(true)
+    Dir.stubs("exists?").with(@local_share_freeling).returns(true)
+    document = mock("document")
+
+    assert FreeLing::Analyzer.new document
+  end
+end
data/test/process_wrapper_test.rb
ADDED
@@ -0,0 +1,89 @@
+require "test_helper"
+
+class ProcessWrapperTest < MiniTest::Unit::TestCase
+  def setup
+    @command = "cat"
+    @text = ["Hello", "world!"]
+    @output_fd = StringIO.new(@text.join("\n"))
+    @pw = FreeLing::Analyzer::ProcessWrapper.new(@command, @output_fd)
+  end
+
+  def test_constructor_paramenters_can_be_read
+    assert_equal @command, @pw.command
+    assert_equal @output_fd, @pw.output_fd
+    assert_equal({}, @pw.env)
+  end
+
+  def test_constructor_paramenters_can_be_written
+    assert_equal @command, @pw.command
+    @pw.command = "cat -n"
+    assert_equal "cat -n", @pw.command
+
+    assert_equal @output_fd, @pw.output_fd
+    new_text_io = StringIO.new("New text")
+    @pw.output_fd = new_text_io
+    assert_equal new_text_io, @pw.output_fd
+
+    assert_equal({}, @pw.env)
+    @pw.env["FREELINGSHARE"] = "/usr/local/share/freeling"
+    assert_equal({
+      "FREELINGSHARE" => "/usr/local/share/freeling"
+    }, @pw.env)
+  end
+
+  def test_command_can_be_changed_once_wrapper_was_created
+    assert_equal @command, @pw.command
+    @pw.command = "cat -n"
+    assert_equal "cat -n", @pw.command
+  end
+
+  def test_run_returns_an_enumerator
+    enum = @pw.run
+    assert_instance_of Enumerator, enum
+  end
+
+  def test_run_returns_process_output_per_line
+    assert_equal @text, @pw.run.to_a
+  end
+
+  def test_run_twice_without_rewinding_output_fd
+    assert_equal @text, @pw.run.to_a
+    assert @pw.output_fd.eof?
+    assert_equal [], @pw.run.to_a
+  end
+
+  def test_run_twice_but_rewind_output_fd_after_first_run
+    assert_equal @text, @pw.run.to_a
+    assert @pw.output_fd.eof?
+    @pw.output_fd.rewind
+    assert_equal @text, @pw.run.to_a
+  end
+
+  def test_another_command
+    @pw.command = "cat -n"
+    assert_equal ["     1\tHello", "     2\tworld!"], @pw.run.to_a
+  end
+
+  def test_invalid_command
+    @pw.command = "inexistant_command"
+    assert_raises(Errno::ENOENT) { @pw.run.first }
+  end
+
+  def test_custom_env_variables
+    @pw.env["MY_VARIABLE"] = "foobar"
+    @pw.command = "echo $MY_VARIABLE"
+    assert_equal "foobar", @pw.run.first
+  end
+
+  def test_run_process_that_only_prints_to_stderr
+    @pw.command = "echo 'this is only printed on stderr' > /dev/fd/2"
+    assert @pw.run.first.nil?
+    assert @pw.error_log = "this is only printed on stderr\n"
+  end
+
+  def test_close_process_early
+    @pw.command = "echo '1\n2\n3'"
+    assert_equal @pw.run.first, "1"
+    @pw.close
+  end
+end
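One assertion in the stderr test above, `assert @pw.error_log = "..."`, uses assignment rather than comparison, so it passes for any value of `error_log`. The sketch below shows the difference, with a hypothetical `Wrapper` struct standing in for `ProcessWrapper`. A comparison such as `assert_equal expected, @pw.error_log` is presumably what was intended, but the source does not say so.

```ruby
# Hypothetical stand-in exposing the same accessor as ProcessWrapper.
Wrapper = Struct.new(:error_log)

expected = "this is only printed on stderr\n"

# Assignment: the expression evaluates to the assigned (truthy) string
# and silently overwrites whatever the log contained.
pw = Wrapper.new("something else entirely\n")
assignment_result = (pw.error_log = expected)

# Comparison: reports false when the log differs from the expectation.
fresh = Wrapper.new("something else entirely\n")
comparison_result = (fresh.error_log == expected)
```

With assignment the test can never fail, whatever the child process wrote to stderr.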
data/test/test_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,203 @@
+--- !ruby/object:Gem::Specification
+name: freeling-analyzer
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+  prerelease:
+platform: ruby
+authors:
+- Damián Silvani
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2013-05-23 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: yard
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: mocha
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.13.3
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.13.3
+- !ruby/object:Gem::Dependency
+  name: minitest
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: minitest-wscolor
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard-minitest
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: hashie
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+description: ! "FreeLing::Analyzer is a Ruby wrapper around\n `analyzer`,
+  a binary tool included in FreeLing's\n package that allows
+  the user to process a stream of\n text with FreeLing."
+email:
+- munshkr@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- .travis.yml
+- Gemfile
+- Guardfile
+- LICENSE
+- README.md
+- Rakefile
+- freeling-analyzer.gemspec
+- lib/freeling-analyzer.rb
+- lib/freeling/analyzer.rb
+- lib/freeling/analyzer/freeling_default.rb
+- lib/freeling/analyzer/process_wrapper.rb
+- lib/freeling/analyzer/version.rb
+- test/analyzer_test.rb
+- test/freeling_default_test.rb
+- test/process_wrapper_test.rb
+- test/test_helper.rb
+homepage: https://github.com/munshkr/freeling-analyzer-ruby
+licenses: []
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -171360197096185165
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -171360197096185165
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.25
+signing_key:
+specification_version: 3
+summary: Ruby wrapper for FreeLing's analyzer tool
+test_files:
+- test/analyzer_test.rb
+- test/freeling_default_test.rb
+- test/process_wrapper_test.rb
+- test/test_helper.rb
+has_rdoc: