treetagger-ruby 0.0.1.prealpha → 0.0.1
- data/.yardopts +1 -1
- data/CHANGELOG.rdoc +2 -3
- data/README.rdoc +76 -0
- data/bin/rtt +31 -0
- data/lib/tree_tagger/argv_parser.rb +9 -0
- data/lib/tree_tagger/chunker.rb +6 -0
- data/lib/tree_tagger/error.rb +19 -0
- data/lib/tree_tagger/tagger.rb +22 -0
- data/lib/tree_tagger/version.rb +3 -0
- metadata +17 -15
- data/lib/treetagger/tagger.rb +0 -6
- data/lib/treetagger/version.rb +0 -3
data/.yardopts
CHANGED
data/CHANGELOG.rdoc
CHANGED
@@ -1,11 +1,10 @@
 == COMPLETED
+=== 0.0.1
+Implemented simple tagging. The TreeTagger is invoked through the environment variable.
 === 0.0.1.prealpha
 Created the structure for this project, added documentation and a public repo.
 
-
 == PLANNED
-=== 0.0.1
-
 === 0.1.0
 
 === 0.2.0
data/README.rdoc
CHANGED
@@ -1 +1,77 @@
+= TreeTagger for Ruby
+
+* {RubyGems}[http://rubygems.org/gems/treetagger-ruby]
+* Developer's {Homepage}[http://bu.chsta.be/]
+* {RTT Project Page}[http://bu.chsta.be/projects/treetagger-ruby/]
+* {Source Code}[https://github.com/arbox/treetagger-ruby]
+* {Bug Tracker}[https://github.com/arbox/treetagger-ruby/issues]
+
+== DESCRIPTION
 The Ruby based wrapper for the TreeTagger by Helmut Schmid.
+Check it out if you are interested
+in Natural Language Processing (NLP) and Human Language Technology (HLT).
+=== Implemented Features
+Simple tagging.
+
+
+== INSTALLATION
+Before you install the <tt>treetagger-ruby</tt> package please ensure
+you have downloaded and installed the <tt>TreeTagger</tt> itself.
+
+The {TreeTagger}[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/]
+is copyrighted software by Helmut Schmid and IMS, please read the license
+agreement before you download the package.
+
+After the installation of the <tt>TreeTagger</tt> set the environment variable
+<tt>TREETAGGERHOME</tt> to the location where you have the program installed.
+Usually this directory contains subdirectories <tt>bin, cmd, lib</tt> and
+<tt>doc</tt>.
+For instance you may add the following line to your <tt>.profile</tt> file:
+  export TREETAGGERHOME='/path/to/your/TreeTagger/installation'
+
+<tt>treetagger-ruby</tt> is provided as a .gem package. Simply install it via
+{RubyGems}[http://rubygems.org/gems/treetagger-ruby].
+To install <tt>treetagger-ruby</tt> issue the following command:
+  $ gem install treetagger-ruby
+
+If you want to do a system wide installation, do this as root
+(possibly using +sudo+).
+
+Alternatively use your Gemfile for dependency management.
+
+
+== SYNOPSIS
+
+Basic usage is very simple:
+  require 'tree_tagger/tagger'
+  tagger = TreeTagger::Tagger.new
+  tagger.process('Ich gehe in die Schule')
+
+See documentation in the TreeTagger::Tagger class for details
+on particular methods.
+
+== EXCEPTION HIERARCHY
+While using TreeTagger you may encounter the following errors:
+* <tt>TreeTagger::UserError</tt>;
+* <tt>TreeTagger::RuntimeError</tt>;
+* <tt>TreeTagger::ExternalError</tt>.
+
+== SUPPORT
+If you have questions, bug reports or any suggestions, please drop me an email :)
+Any help is deeply appreciated!
+
+== CHANGELOG
+For details on future plans and work in progress see CHANGELOG.
+
+== CAUTION
+This library is <b>work in progress</b>! Though the interface is mostly complete,
+you might encounter some features that are not yet implemented.
+
+Please contact me with your suggestions, bug reports and feature requests.
+== LICENSE
+
+RTT is copyrighted software by Andrei Beliankou, 2011-
+
+You may use, redistribute and change it under the terms
+provided in the LICENSE file.
+
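The README asks for <tt>TREETAGGERHOME</tt> to point at a directory that contains <tt>bin</tt>, <tt>cmd</tt> and <tt>lib</tt>. A minimal sketch of how that setup could be checked before tagging; the helper name +missing_treetagger_dirs+ is hypothetical and not part of the gem:

```ruby
# Hypothetical helper (not part of treetagger-ruby): returns the expected
# subdirectories that are missing from the given installation directory.
def missing_treetagger_dirs(home, expected = %w[bin cmd lib])
  expected.reject { |dir| File.directory?(File.join(home, dir)) }
end

home = ENV['TREETAGGERHOME']
if home.nil? || home.empty?
  warn 'TREETAGGERHOME is not set'
else
  missing = missing_treetagger_dirs(home)
  warn "Missing in #{home}: #{missing.join(', ')}" unless missing.empty?
end
```

The check only looks at the directory layout; it does not verify that the TreeTagger binaries or parameter files themselves are present.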
data/bin/rtt
ADDED
@@ -0,0 +1,31 @@
+#!/usr/bin/env ruby
+# rtt - Ruby TreeTagger
+
+require 'tree_tagger/tagger'
+require 'tree_tagger/argv_parser'
+
+options = TreeTagger::ARGVParser.parse(ARGV)
+
+tagger = TreeTagger::Tagger.new(options)
+
+while line = ARGF.gets
+  # [['token', 'tag', 'lemma'], ['token', 'tag', 'lemma']]
+  result_array = tagger.process(line.chomp)
+
+  # Adding some colors to the output.
+  # Using ANSI escape codes.
+  red = "\e[31m"
+  green = "\e[32m"
+  blue = "\e[34m"
+  reset = "\e[0m"
+
+  result_array.each do |tuple|
+    if $stdout.tty?
+      tuple[0].insert(0, red).insert(-1, reset)
+      tuple[1].insert(0, green).insert(-1, reset)
+      tuple[2].insert(0, blue).insert(-1, reset)
+    end
+
+    $stdout.puts tuple.join("\t")
+  end
+end
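The script colors each field by mutating the strings in place with +insert+. A non-mutating variant might look like the sketch below; +colorize_tuple+, +COLORS+ and +RESET+ are hypothetical names chosen for illustration and do not appear in the gem:

```ruby
# ANSI escape codes, the same ones the script uses: red, green, blue.
COLORS = ["\e[31m", "\e[32m", "\e[34m"].freeze
RESET = "\e[0m"

# Hypothetical helper: returns a new array with each field wrapped in its
# color. The input tuple is left untouched; fields beyond the third are
# passed through without color.
def colorize_tuple(tuple)
  tuple.zip(COLORS).map do |field, color|
    color ? "#{color}#{field}#{RESET}" : field
  end
end

tuple = %w[Schule NN Schule]
line = $stdout.tty? ? colorize_tuple(tuple) : tuple
$stdout.puts line.join("\t")
```

Leaving the tuples unmodified means the same result array can still be written to a file or compared in a test after it has been printed.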
data/lib/tree_tagger/error.rb
ADDED
@@ -0,0 +1,19 @@
+module TreeTagger
+  # A simple error wrapper,
+  # you can intercept all errors from the library.
+  class Error < StandardError; end
+
+  # Something went wrong: no env variable, data not encoded properly, etc.
+  class ExternalError < Error
+  end
+
+  # Execution error, an assert-like exception.
+  class RuntimeError < Error
+
+  end
+
+  # User tries to use the lib in a wrong manner, e.g. provides
+  # wrong parameters.
+  class UserError < Error
+  end
+end
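Because every class in this hunk derives from <tt>TreeTagger::Error</tt>, a caller can rescue the base class to intercept any library error, or a subclass to handle one kind separately. A minimal sketch that re-declares the hierarchy so it runs standalone; the +classify+ helper is hypothetical and only there for illustration:

```ruby
# Re-declaration of the hierarchy from the hunk, so the sketch runs standalone.
module TreeTagger
  class Error < StandardError; end
  class ExternalError < Error; end
  class RuntimeError < Error; end
  class UserError < Error; end
end

# Hypothetical helper: runs the block and reports which kind of error,
# if any, it raised.
def classify
  yield
  :ok
rescue TreeTagger::UserError
  :user_error
rescue TreeTagger::Error
  :library_error
end

puts classify { raise TreeTagger::UserError, 'wrong parameters' }
puts classify { raise TreeTagger::ExternalError, 'no env variable' }
```

Note that <tt>TreeTagger::RuntimeError</tt> is a separate class from Ruby's built-in <tt>::RuntimeError</tt>, so <tt>rescue RuntimeError</tt> outside the module does not catch it.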
data/lib/tree_tagger/tagger.rb
ADDED
@@ -0,0 +1,22 @@
+# -*- encoding: utf-8 -*-
+
+module TreeTagger
+  class Tagger
+    def initialize(
+      lang = :de,
+      opts = {
+        :sgml => true,
+        :token => true,
+        :lemma => true
+      }
+    )
+      @lang = lang
+      @opt = opts
+    end
+    def process(str)
+      line = %x(echo '#{str}' | #{ENV['TREETAGGERHOME']}/cmd/tree-tagger-german)
+      arr = line.split("\n").collect { |el| el.split("\t") }
+    end
+  end # class
+end # module
+
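The +process+ method interpolates the input into a shell command, so an apostrophe in the text ends the single-quoted string early. One possible alternative, sketched here under assumptions, is to pass the text on standard input with <tt>Open3.capture2</tt> from the Ruby standard library. The names +run_tagger+ and +parse_tagger_output+ are hypothetical; the script path is the one used in the hunk:

```ruby
require 'open3'

# Hypothetical helper: splits tab-separated tagger output into one array
# per non-empty line, e.g. [token, tag, lemma].
def parse_tagger_output(output)
  output.split("\n").reject(&:empty?).map { |row| row.split("\t") }
end

# Hypothetical helper: feeds the text to the wrapper script on stdin, so the
# text itself never goes through shell quoting.
def run_tagger(text, home = ENV['TREETAGGERHOME'])
  raise ArgumentError, 'TREETAGGERHOME is not set' if home.nil? || home.empty?

  script = File.join(home, 'cmd', 'tree-tagger-german')
  output, status = Open3.capture2(script, stdin_data: text)
  raise "tree-tagger-german exited with status #{status.exitstatus}" unless status.success?

  parse_tagger_output(output)
end
```

With a working installation, <tt>run_tagger("Ich geh' in die Schule")</tt> would hand the sentence to the script unchanged, apostrophe included.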
metadata
CHANGED
@@ -1,14 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: treetagger-ruby
 version: !ruby/object:Gem::Version
-  hash:
-  prerelease:
+  hash: 29
+  prerelease:
   segments:
   - 0
   - 0
   - 1
-
-  version: 0.0.1.prealpha
+  version: 0.0.1
 platform: ruby
 authors:
 - Andrei Beliankou
@@ -16,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2011-12-
+date: 2011-12-18 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rdoc
@@ -78,8 +77,8 @@ dependencies:
   version_requirements: *id004
 description: This package contains a simple wrapper for the TreeTagger, a POS tagger based on decision trees and developed by Helmut Schmid at IMS in Stuttgart, Germany. You should have the TreeTagger with all library files installed on your machine in order to use this wrapper.
 email: a.belenkow@uni-trier.de
-executables:
-
+executables:
+- rtt
 extensions: []
 
 extra_rdoc_files:
@@ -87,12 +86,16 @@ extra_rdoc_files:
 - LICENCE.rdoc
 - CHANGELOG.rdoc
 files:
-- lib/
-- lib/
+- lib/tree_tagger/chunker.rb
+- lib/tree_tagger/error.rb
+- lib/tree_tagger/argv_parser.rb
+- lib/tree_tagger/tagger.rb
+- lib/tree_tagger/version.rb
 - README.rdoc
 - LICENCE.rdoc
 - CHANGELOG.rdoc
 - .yardopts
+- bin/rtt
 homepage: http://www.uni-trier.de/index.php?id=34451
 licenses: []
 
@@ -116,14 +119,12 @@ required_ruby_version: !ruby/object:Gem::Requirement
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
-  - - "
+  - - ">="
     - !ruby/object:Gem::Version
-      hash:
+      hash: 3
       segments:
-      -
-
-      - 1
-      version: 1.3.1
+      - 0
+      version: "0"
 requirements: []
 
 rubyforge_project:
@@ -133,3 +134,4 @@ specification_version: 3
 summary: A wrapper for the TreeTagger by Helmut Schmid.
 test_files: []
 
+has_rdoc:
data/lib/treetagger/tagger.rb
DELETED
data/lib/treetagger/version.rb
DELETED