freeling-analyzer 0.1.0
- data/.gitignore +18 -0
- data/.travis.yml +9 -0
- data/Gemfile +6 -0
- data/Guardfile +17 -0
- data/LICENSE +22 -0
- data/README.md +85 -0
- data/Rakefile +11 -0
- data/freeling-analyzer.gemspec +30 -0
- data/lib/freeling-analyzer.rb +1 -0
- data/lib/freeling/analyzer.rb +129 -0
- data/lib/freeling/analyzer/freeling_default.rb +57 -0
- data/lib/freeling/analyzer/process_wrapper.rb +82 -0
- data/lib/freeling/analyzer/version.rb +5 -0
- data/test/analyzer_test.rb +45 -0
- data/test/freeling_default_test.rb +41 -0
- data/test/process_wrapper_test.rb +89 -0
- data/test/test_helper.rb +5 -0
- metadata +203 -0
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/Guardfile
ADDED
@@ -0,0 +1,17 @@
+# A sample Guardfile
+# More info at https://github.com/guard/guard#readme
+
+guard :minitest do
+  # with Minitest::Unit
+  #watch(%r|^test/(.*)\/?test_(.*)\.rb|)
+  #watch(%r|^lib/(.*)([^/]+)\.rb|) { |m| "test/#{m[1]}test_#{m[2]}.rb" }
+  #watch(%r|^test/test_helper\.rb|) { "test" }
+  watch(%r{^lib/(.+)\.rb$}) { |m| "test/#{m[1].split("/").last}_test.rb" }
+  watch(%r{^test/.+_test\.rb$})
+  watch('test/test_helper.rb') { "test" }
+
+  # with Minitest::Spec
+  # watch(%r|^spec/(.*)_spec\.rb|)
+  # watch(%r|^lib/(.*)([^/]+)\.rb|) { |m| "spec/#{m[1]}#{m[2]}_spec.rb" }
+  # watch(%r|^spec/spec_helper\.rb|) { "spec" }
+end
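The active `watch` rule in the Guardfile maps a changed file under `lib/` to a test file named after its basename. A minimal sketch of that mapping in plain Ruby, outside of Guard; the helper name `test_file_for` is hypothetical and only mirrors the block passed to `watch`:

```ruby
# Hypothetical helper that mirrors the Guardfile rule
#   watch(%r{^lib/(.+)\.rb$}) { |m| "test/#{m[1].split("/").last}_test.rb" }
def test_file_for(path)
  m = path.match(%r{^lib/(.+)\.rb$})
  return nil unless m

  # Only the last path component is kept, so nested directories under
  # lib/ all map to a flat test/ directory.
  "test/#{m[1].split("/").last}_test.rb"
end

puts test_file_for("lib/freeling/analyzer/process_wrapper.rb")
puts test_file_for("lib/freeling/analyzer.rb")
```

Because only the basename is kept, a file such as `lib/freeling-analyzer.rb` would map to `test/freeling-analyzer_test.rb`, which is not among the files in this release.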
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2012 Damián Silvani
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,85 @@
+# FreeLing::Analyzer [![Build Status](https://secure.travis-ci.org/munshkr/freeling-analyzer-ruby.png?branch=master)](https://travis-ci.org/munshkr/freeling-analyzer-ruby)
+
+**FreeLing::Analyzer** is a Ruby wrapper around `analyzer`, a binary tool
+included in FreeLing's package that allows the user to process a stream of text
+with FreeLing.
+
+*This has been tested with FreeLing 3.0+ only.*
+
+## Usage
+
+```ruby
+text = "Mi amigo Juan Mesa se mesa la barba al lado de la mesa."
+analyzer = FreeLing::Analyzer.new(text, :language => :es)
+
+analyzer.tokens.first
+# => #<FreeLing::Analyzer::Token form="Mi" lemma="mi" prob=0.995536 tag="DP1CSS">
+
+analyzer.tokens.map { |t| t.lemma }
+# => ["mi", "amigo", "juan_mesa", "se", "mesar", "el", "barba", "a+el", "lado", "de", "el", "mesa", "."]
+```
+
+## Features
+
+* Analyzer is *lazy*: it does not spawn the process until needed, and it sends
+  the input text to `analyzer` on demand.
+* It works with the default installation of FreeLing out of the box. Just set
+  the language to use and you're good to go.
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'freeling-analyzer'
+```
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install freeling-analyzer
+
+## FreeLing
+
+[FreeLing](http://nlp.lsi.upc.edu/freeling/) is an open source suite of
+language analyzers written in C++.
+
+The main services offered are:
+* Text tokenization
+* Sentence splitting
+* Morphological analysis
+* Suffix treatment
+* Retokenization of clitic pronouns
+* Flexible multiword recognition
+* Contraction splitting
+* Probabilistic prediction of unknown word categories
+* Named entity detection (NER)
+* Recognition of dates, numbers, ratios, currency, and physical magnitudes
+  (speed, weight, temperature, density, etc.)
+* PoS tagging
+* Chart-based shallow parsing
+* Named entity classification (NEC)
+* WordNet based sense annotation and disambiguation
+* Rule-based dependency parsing
+* Nominal coreference resolution.
+
+Currently supported languages are:
+* Spanish
+* Catalan
+* Galician
+* Italian
+* English
+* Welsh
+* Portuguese
+* Asturian
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Added some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
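The README describes the analyzer as lazy: nothing runs until tokens are requested. A minimal sketch of that behaviour with a plain Ruby `Enumerator`, independent of the gem. The word list stands in for the output of the external process, and the `calls` array exists only to make the order of events visible; both are assumptions for illustration.

```ruby
calls = []

# The block is not executed when the Enumerator is created, only when
# a consumer asks for elements.
lines = Enumerator.new do |yielder|
  calls << :started          # stands in for spawning the external process
  %w[Mi amigo Juan].each do |word|
    calls << word            # records how far the producer has advanced
    yielder << word
  end
end

puts calls.inspect           # nothing has run yet

first = lines.first          # runs the block only up to the first element
puts first
puts calls.inspect           # the producer stopped after one word
```

Asking for `first` stops the producer as soon as one element has been yielded, which is why the remaining words never reach `calls`.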
data/Rakefile
ADDED
data/freeling-analyzer.gemspec
ADDED
@@ -0,0 +1,30 @@
+# -*- encoding: utf-8 -*-
+require File.expand_path('../lib/freeling/analyzer/version', __FILE__)
+
+Gem::Specification.new do |gem|
+  gem.authors = ["Damián Silvani"]
+  gem.email = ["munshkr@gmail.com"]
+  gem.summary = %q{Ruby wrapper for FreeLing's analyzer tool}
+  gem.description = %q{FreeLing::Analyzer is a Ruby wrapper around
+                     `analyzer`, a binary tool included in FreeLing's
+                     package that allows the user to process a stream of
+                     text with FreeLing.}
+  gem.homepage = "https://github.com/munshkr/freeling-analyzer-ruby"
+
+  gem.files = `git ls-files`.split($\)
+  gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+  gem.name = "freeling-analyzer"
+  gem.require_paths = ["lib"]
+  gem.version = FreeLing::Analyzer::VERSION
+
+  gem.add_development_dependency "rake"
+  gem.add_development_dependency "yard"
+  gem.add_development_dependency "mocha", "~> 0.13.3"
+  gem.add_development_dependency "minitest"
+  gem.add_development_dependency "minitest-wscolor"
+  gem.add_development_dependency "guard"
+  gem.add_development_dependency "guard-minitest"
+
+  gem.add_runtime_dependency "hashie"
+end
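The gemspec pins `mocha` with the pessimistic operator `~> 0.13.3`, while every other dependency is unconstrained. A short sketch of what that constraint accepts, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes; the version numbers tested are arbitrary examples, not releases taken from this diff:

```ruby
require "rubygems"

# "~> 0.13.3" allows patch-level updates only: >= 0.13.3 and < 0.14.
requirement = Gem::Requirement.new("~> 0.13.3")

%w[0.13.2 0.13.3 0.13.9 0.14.0].each do |candidate|
  ok = requirement.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{ok}"
end
```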
data/lib/freeling-analyzer.rb
ADDED
@@ -0,0 +1 @@
+require "freeling/analyzer"
data/lib/freeling/analyzer.rb
ADDED
@@ -0,0 +1,129 @@
+require "stringio"
+require "hashie/mash"
+require "freeling/analyzer/process_wrapper"
+require "freeling/analyzer/freeling_default"
+
+module FreeLing
+  class Analyzer
+    attr_reader :document, :latest_error_log
+
+    Token = Class.new(Hashie::Mash)
+
+    def initialize(document, opts={})
+      @document = document
+
+      @options = {
+        :share_path => freeling_path,
+        :analyze_path => analyzer_path,
+        :input_format => :plain,
+        :output_format => :tagged,
+        :memoize => true,
+        :language => :es
+      }.merge(opts)
+    end
+
+    def sentences(run_again=false)
+      if @options[:output_format] == :token
+        raise "Sentence splitter is not available with output format set to 'token'"
+      end
+
+      if not run_again and @sentences
+        return @sentences.to_enum
+      end
+
+      Enumerator.new do |yielder|
+        tokens = []
+        read_tokens.each do |token|
+          if token
+            tokens << token
+          else
+            yielder << tokens
+            if @options[:memoize]
+              @sentences ||= []
+              @sentences << tokens
+            end
+            tokens = []
+          end
+        end
+      end
+    end
+
+    def tokens(run_again=false)
+      if not run_again and @tokens
+        return @tokens.to_enum
+      end
+
+      if @sentences
+        @tokens ||= @sentences.flatten
+        return @tokens.to_enum
+      end
+
+      Enumerator.new do |yielder|
+        read_tokens.each do |token|
+          if token
+            yielder << token
+            if @options[:memoize]
+              @tokens ||= []
+              @tokens << token
+            end
+          end
+        end
+      end
+    end
+
+
+    private
+    def command
+      "#{@options[:analyze_path]} " \
+      "-f #{config_path} " \
+      "--inpf #{@options[:input_format]} " \
+      "--outf #{@options[:output_format]} " \
+      "--nec " \
+      "--noflush"
+    end
+
+    def config_path
+      @options[:config_path] || File.join(language_config, "#{@options[:language]}.cfg")
+    end
+
+    def read_tokens
+      Enumerator.new do |yielder|
+        output_fd = @document.respond_to?(:read) ? @document : StringIO.new(@document)
+        @process_wrapper = ProcessWrapper.new(command, output_fd, "FREELINGSHARE" => @options[:share_path])
+
+        @process_wrapper.run.each do |line|
+          # An empty line marks a sentence boundary: yield nil so that
+          # #sentences can split on it (#tokens skips nil entries).
+          yielder << (line.empty? ? nil : parse_token_line(line))
+        end
+
+        @latest_error_log = @process_wrapper.error_log
+
+        @process_wrapper.close
+        @process_wrapper = nil
+      end
+    end
+
+    def parse_token_line(str)
+      form, lemma, tag, prob = str.split(' ')[0..3]
+      Token.new({
+        :form => form,
+        :lemma => lemma,
+        :tag => tag,
+        :prob => prob && prob.to_f,
+      }.reject { |k, v| v.nil? })
+    end
+
+    def language_config
+      FreelingDefault.language_config
+    end
+
+    def freeling_path
+      FreelingDefault.freeling_path
+    end
+
+    def analyzer_path
+      FreelingDefault.analyzer_path
+    end
+  end
+end
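`parse_token_line` splits each output line into form, lemma, tag and probability, and `sentences` groups tokens until it sees a boundary. A self-contained sketch of those two steps without FreeLing or Hashie. Assumptions: a plain `Struct` replaces `Hashie::Mash`, sentences are separated by an empty line in the `tagged` output, the hypothetical `group_sentences` helper returns an array eagerly for brevity rather than a lazy enumerator, and the sample lines are hand-written from the values in the test suite, not captured from a real run.

```ruby
# Stand-in for FreeLing::Analyzer::Token (the gem uses Hashie::Mash).
Token = Struct.new(:form, :lemma, :tag, :prob)

# Same field order as parse_token_line in the diff: form, lemma, tag, prob.
def parse_token_line(line)
  form, lemma, tag, prob = line.split(" ")[0..3]
  Token.new(form, lemma, tag, prob && prob.to_f)
end

# Group parsed tokens into sentences, splitting on empty lines.
def group_sentences(lines)
  sentences = []
  current = []
  lines.each do |line|
    if line.empty?
      sentences << current unless current.empty?
      current = []
    else
      current << parse_token_line(line)
    end
  end
  sentences << current unless current.empty?
  sentences
end

output = [
  "El el DA0MS0 1",
  "gato gato NCMS000 1",
  ". . Fp 1",
  "",
  "Bebe beber VMIP3S0 0.994868",
  ". . Fp 1",
  "",
]

sentences = group_sentences(output)
sentences.each { |s| puts s.map(&:lemma).join(" ") }
```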
data/lib/freeling/analyzer/freeling_default.rb
ADDED
@@ -0,0 +1,57 @@
+module FreeLing
+  class Analyzer
+    class FreelingDefault
+
+      LOCAL_ANALYZE_PATH = "/usr/local/bin/analyzer"
+      USR_ANALYZE_PATH = "/usr/bin/analyzer"
+      LOCAL_FREELING_SHARE_PATH = "/usr/local/share/freeling"
+      USR_FREELING_SHARE_PATH = "/usr/share/freeling"
+
+      class << self
+        def analyzer_path
+          self.new.analyzer_path
+        end
+
+        def freeling_path
+          self.new.freeling_path
+        end
+
+        def language_config
+          self.new.language_config
+        end
+      end
+
+      def language_config
+        if freeling_path.instance_of? String
+          File.join(freeling_path, "config")
+        else
+          raise_error(:analyze)
+        end
+      end
+
+      def analyzer_path
+        if File.exists? LOCAL_ANALYZE_PATH
+          LOCAL_ANALYZE_PATH
+        elsif File.exists? USR_ANALYZE_PATH
+          USR_ANALYZE_PATH
+        else
+          raise_error(:analyze)
+        end
+      end
+
+      def freeling_path
+        if Dir.exists? LOCAL_FREELING_SHARE_PATH
+          LOCAL_FREELING_SHARE_PATH
+        elsif Dir.exists? USR_FREELING_SHARE_PATH
+          USR_FREELING_SHARE_PATH
+        else
+          raise_error(:freeling)
+        end
+      end
+
+      def raise_error(type)
+        raise "#{type} is not installed."
+      end
+    end
+  end
+end
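`FreelingDefault` checks `/usr/local` first and falls back to `/usr`, raising when neither location exists. The same lookup can be sketched as a single helper over an ordered list of candidates. The name `first_existing` is hypothetical, and the sketch uses `File.exist?` because `File.exists?` and `Dir.exists?`, as used in the 2013 code, were removed in Ruby 3.2.

```ruby
require "tmpdir"

# Hypothetical helper: return the first candidate path that exists,
# or raise with the same message format as FreelingDefault#raise_error.
def first_existing(candidates, label)
  candidates.find { |path| File.exist?(path) } ||
    raise("#{label} is not installed.")
end

Dir.mktmpdir do |dir|
  present = File.join(dir, "share")
  missing = File.join(dir, "local-share")
  Dir.mkdir(present)

  # The missing path is listed first, so the lookup falls through to
  # the second candidate.
  puts first_existing([missing, present], :freeling) == present
end
```

Candidates are tried in order, so listing the `/usr/local` path first reproduces the precedence used in the diff.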
data/lib/freeling/analyzer/process_wrapper.rb
ADDED
@@ -0,0 +1,82 @@
+require "enumerator"
+require "thread"
+require "open3"
+
+module FreeLing
+  class Analyzer
+    class ProcessWrapper
+      attr_accessor :command, :output_fd, :env, :error_log
+
+      def initialize(command, output_fd, env={})
+        @command = command
+        @output_fd = output_fd
+        @env = env
+        @error_log = nil
+      end
+
+      def run
+        @error_log = nil
+
+        Enumerator.new do |yielder|
+          open_process
+
+          if @stdout.nil?
+            run_process
+          end
+
+          while line = @stdout.gets
+            line.chomp!
+            yielder << line
+          end
+
+          @stdout.close_read
+          @error_log = @stderr.read
+          @write_thr.join
+          close_process
+        end
+      end
+
+      def close
+        close_process
+      end
+
+      private
+      def open_process
+        @stdin, @stdout, @stderr, @wait_thr = Open3.popen3(@env, command)
+
+        @write_thr = Thread.new do
+          begin
+            IO.copy_stream(@output_fd, @stdin)
+            @stdin.close_write
+          rescue Errno::EPIPE
+            @error_log = @stderr.read
+          end
+        end
+      end
+
+      def close_process
+        close_fds
+        kill_threads
+        @stdin = @stdout = @stderr = nil
+        @wait_thr = @write_thr = nil
+      end
+
+      def close_fds
+        [@stdin, @stdout, @stderr].each do |fd|
+          if fd and not fd.closed?
+            fd.close
+          end
+        end
+      end
+
+      def kill_threads
+        [@wait_thr, @write_thr].each do |thr|
+          if thr and thr.alive?
+            thr.kill
+          end
+        end
+      end
+
+    end # ProcessWrapper
+  end # Analyzer
+end # FreeLing
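`ProcessWrapper` feeds the child process from a separate thread while the caller reads its standard output line by line, so a full pipe on one side cannot block the other. A condensed sketch of the same pattern using the block form of `Open3.popen3`. Assumptions: a POSIX system with `cat` and `echo` available, the hypothetical name `stream_through`, and no draining of stderr (a child that writes a lot to stderr could stall this simplified version).

```ruby
require "open3"
require "stringio"

# Hypothetical helper: pipe `input` through `command` and return an
# Enumerator over the chomped lines of the command's standard output.
def stream_through(command, input, env = {})
  Enumerator.new do |yielder|
    Open3.popen3(env, command) do |stdin, stdout, _stderr, _wait_thr|
      # Write from a separate thread so reading and writing overlap.
      writer = Thread.new do
        begin
          IO.copy_stream(input, stdin)
        rescue Errno::EPIPE, IOError
          # The child exited or the pipe was closed before all input was sent.
        ensure
          stdin.close unless stdin.closed?
        end
      end

      stdout.each_line { |line| yielder << line.chomp }
      writer.join
    end
  end
end

p stream_through("cat", StringIO.new("Hello\nworld!")).to_a
p stream_through("echo $MY_VARIABLE", StringIO.new(""), "MY_VARIABLE" => "foobar").to_a
```

The block form of `popen3` closes the pipes and waits for the child when the block exits, which replaces the explicit `close_fds` and `kill_threads` bookkeeping in the diff at the cost of less control over early termination.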
data/lib/freeling/analyzer/version.rb
ADDED
data/test/analyzer_test.rb
ADDED
@@ -0,0 +1,45 @@
+# encoding: utf-8
+require "test_helper"
+
+class AnalyzerTest < MiniTest::Unit::TestCase
+  def setup
+    @a = "El gato come pescado y bebe agua."
+    @b = "Yo bajo con el hombre bajo a tocar el bajo bajo la escalera."
+    @c = "Mi amigo Juan Mesa se mesa la barba al lado de la mesa."
+  end
+
+  def test_token_attributes
+    skip
+
+    expected_token = { form: "El", lemma: "el", prob: 1, tag: "DA0MS0" }
+
+    analyzer = FreeLing::Analyzer.new(@a, :language => :es)
+    token = analyzer.tokens.first
+
+    [:form, :lemma, :prob, :tag].each do |key|
+      assert_equal expected_token[key], token[key]
+    end
+  end
+
+  def test_token_list
+    skip
+
+    expected_tokens = [
+      { form: "El", lemma: "el", prob: 1, tag: "DA0MS0" },
+      { form: "gato", lemma: "gato", prob: 1, tag: "NCMS000" },
+      { form: "come", lemma: "comer", prob: 0.75, tag: "VMIP3S0" },
+      { form: "pescado", lemma: "pescado", prob: 0.833333, tag: "NCMS000" },
+      { form: "y", lemma: "y", prob: 0.999812, tag: "CC" },
+      { form: "bebe", lemma: "beber", prob: 0.994868, tag: "VMIP3S0" },
+      { form: "agua", lemma: "agua", prob: 0.973333, tag: "NCCS000" },
+      { form: ".", lemma: ".", prob: 1, tag: "Fp" },
+    ]
+
+    analyzer = FreeLing::Analyzer.new(@a, :language => :es)
+    analyzer.tokens.each.with_index do |token, i|
+      [:form, :lemma, :prob, :tag].each do |key|
+        assert_equal expected_tokens[i][key], token[key]
+      end
+    end
+  end
+end
data/test/freeling_default_test.rb
ADDED
@@ -0,0 +1,41 @@
+require 'test_helper'
+
+class FreelingDefaultTest < MiniTest::Unit::TestCase
+  def setup
+    @usr_bin_analyzer = "/usr/bin/analyzer"
+    @local_bin_analyzer = "/usr/local/bin/analyzer"
+    @usr_share_freeling = "/usr/share/freeling"
+    @local_share_freeling = "/usr/local/share/freeling"
+  end
+
+  def test_freeling_installed_on_usr
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(true)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(false)
+
+    FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @usr_bin_analyzer
+  end
+
+  def test_freeling_installed_on_local
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(false)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(true)
+
+    FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @local_bin_analyzer
+  end
+
+  def test_freeling_not_installed
+    File.stubs("exists?").with(@usr_bin_analyzer).returns(false)
+    File.stubs("exists?").with(@local_bin_analyzer).returns(false)
+
+    lambda {
+      FreeLing::Analyzer::FreelingDefault.analyzer_path.must_equal @local_bin_analyzer
+    }.must_raise RuntimeError
+  end
+
+  def test_instanciate_analyzer
+    File.stubs("exists?").with(@local_bin_analyzer).returns(true)
+    Dir.stubs("exists?").with(@local_share_freeling).returns(true)
+    document = mock("document")
+
+    assert FreeLing::Analyzer.new document
+  end
+end
data/test/process_wrapper_test.rb
ADDED
@@ -0,0 +1,89 @@
+require "test_helper"
+
+class ProcessWrapperTest < MiniTest::Unit::TestCase
+  def setup
+    @command = "cat"
+    @text = ["Hello", "world!"]
+    @output_fd = StringIO.new(@text.join("\n"))
+    @pw = FreeLing::Analyzer::ProcessWrapper.new(@command, @output_fd)
+  end
+
+  def test_constructor_paramenters_can_be_read
+    assert_equal @command, @pw.command
+    assert_equal @output_fd, @pw.output_fd
+    assert_equal({}, @pw.env)
+  end
+
+  def test_constructor_paramenters_can_be_written
+    assert_equal @command, @pw.command
+    @pw.command = "cat -n"
+    assert_equal "cat -n", @pw.command
+
+    assert_equal @output_fd, @pw.output_fd
+    new_text_io = StringIO.new("New text")
+    @pw.output_fd = new_text_io
+    assert_equal new_text_io, @pw.output_fd
+
+    assert_equal({}, @pw.env)
+    @pw.env["FREELINGSHARE"] = "/usr/local/share/freeling"
+    assert_equal({
+      "FREELINGSHARE" => "/usr/local/share/freeling"
+    }, @pw.env)
+  end
+
+  def test_command_can_be_changed_once_wrapper_was_created
+    assert_equal @command, @pw.command
+    @pw.command = "cat -n"
+    assert_equal "cat -n", @pw.command
+  end
+
+  def test_run_returns_an_enumerator
+    enum = @pw.run
+    assert_instance_of Enumerator, enum
+  end
+
+  def test_run_returns_process_output_per_line
+    assert_equal @text, @pw.run.to_a
+  end
+
+  def test_run_twice_without_rewinding_output_fd
+    assert_equal @text, @pw.run.to_a
+    assert @pw.output_fd.eof?
+    assert_equal [], @pw.run.to_a
+  end
+
+  def test_run_twice_but_rewind_output_fd_after_first_run
+    assert_equal @text, @pw.run.to_a
+    assert @pw.output_fd.eof?
+    @pw.output_fd.rewind
+    assert_equal @text, @pw.run.to_a
+  end
+
+  def test_another_command
+    @pw.command = "cat -n"
+    assert_equal ["     1\tHello", "     2\tworld!"], @pw.run.to_a
+  end
+
+  def test_invalid_command
+    @pw.command = "inexistant_command"
+    assert_raises(Errno::ENOENT) { @pw.run.first }
+  end
+
+  def test_custom_env_variables
+    @pw.env["MY_VARIABLE"] = "foobar"
+    @pw.command = "echo $MY_VARIABLE"
+    assert_equal "foobar", @pw.run.first
+  end
+
+  def test_run_process_that_only_prints_to_stderr
+    @pw.command = "echo 'this is only printed on stderr' > /dev/fd/2"
+    assert @pw.run.first.nil?
+    assert_equal "this is only printed on stderr\n", @pw.error_log
+  end
+
+  def test_close_process_early
+    @pw.command = "echo '1\n2\n3'"
+    assert_equal "1", @pw.run.first
+    @pw.close
+  end
+end
data/test/test_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,203 @@
+--- !ruby/object:Gem::Specification
+name: freeling-analyzer
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+prerelease:
+platform: ruby
+authors:
+- Damián Silvani
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2013-05-23 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: yard
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: mocha
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.13.3
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.13.3
+- !ruby/object:Gem::Dependency
+  name: minitest
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: minitest-wscolor
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard-minitest
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: hashie
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+description: ! "FreeLing::Analyzer is a Ruby wrapper around\n `analyzer`,
+  a binary tool included in FreeLing's\n package that allows
+  the user to process a stream of\n text with FreeLing."
+email:
+- munshkr@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- .travis.yml
+- Gemfile
+- Guardfile
+- LICENSE
+- README.md
+- Rakefile
+- freeling-analyzer.gemspec
+- lib/freeling-analyzer.rb
+- lib/freeling/analyzer.rb
+- lib/freeling/analyzer/freeling_default.rb
+- lib/freeling/analyzer/process_wrapper.rb
+- lib/freeling/analyzer/version.rb
+- test/analyzer_test.rb
+- test/freeling_default_test.rb
+- test/process_wrapper_test.rb
+- test/test_helper.rb
+homepage: https://github.com/munshkr/freeling-analyzer-ruby
+licenses: []
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -171360197096185165
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -171360197096185165
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.25
+signing_key:
+specification_version: 3
+summary: Ruby wrapper for FreeLing's analyzer tool
+test_files:
+- test/analyzer_test.rb
+- test/freeling_default_test.rb
+- test/process_wrapper_test.rb
+- test/test_helper.rb
+has_rdoc: