opener-pos-tagger 2.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: ccc6c90ace4f3e79af9d820dd7b773ccffdd65fe
4
+ data.tar.gz: 3af96ef7ef65f6210ff076f3c8dc605e5394ec0f
5
+ SHA512:
6
+ metadata.gz: 70f3208c340084a7fa7c5a7f1701c3aa4bdd39f7bbe43b47a652c6da09179eedf7d3a5ef1f5680f827c953e6063935af11ca343d10a38e7552f47b95a08773a6
7
+ data.tar.gz: a72e6d5d3cdd505179e9bac632ba9bd9cf125a34a2b579942c8a98b7d62f7ebb34cec01e50f90d4fd22a9257f6e44ccbe22798e35730f95fe381af0b15d2801a
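The digests in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. A minimal Ruby sketch of checking one of them by hand, assuming the member has already been extracted to disk. The helper name `checksum_matches?` and the file path in the last comment are hypothetical.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Returns true when the digest of +bytes+ equals the expected hex string.
# +algorithm+ is any Digest class, for example Digest::SHA1 or Digest::SHA512.
def checksum_matches?(bytes, expected_hex, algorithm = Digest::SHA512)
  algorithm.hexdigest(bytes) == expected_hex.strip.downcase
end

# Self-check with a well-known value: the SHA1 digest of the empty string.
puts checksum_matches?('', 'da39a3ee5e6b4b0d3255bfef95601890afd80709', Digest::SHA1)

# Against a real archive member (hypothetical path and variable):
# checksum_matches?(File.binread('data.tar.gz'), expected_sha512_from_yaml)
```

Reading the file with `File.binread` matters here, since a text-mode read could alter the bytes on some platforms and change the digest.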
data/README.md ADDED
@@ -0,0 +1,164 @@
1
+ POS-tagger
2
+ ------------
3
+
4
+ Component that wraps the different existing POS Taggers based on OpenNLP.
5
+
6
+ ### Confused by some terminology?
7
+
8
+ This software is part of a larger collection of natural language processing
9
+ tools known as "the OpeNER project". You can find more information about the
10
+ project at [the OpeNER portal](http://opener-project.github.io). There you can
11
+ also find references to terms like KAF (an XML standard to represent linguistic
12
+ annotations in texts), components, cores, scenarios and pipelines.
13
+
14
+ Quick Use Example
15
+ -----------------
16
+
17
+ Installing the pos-tagger can be done by executing:
18
+
19
+ gem install opener-pos-tagger
20
+
21
+ Please bear in mind that all components in OpeNER take KAF as an input and
22
+ output KAF by default.
23
+
24
+ ### Command line interface
25
+
26
+ You should now be able to call the POS tagger as a regular shell
27
+ command: by its name. Once installed, the gem normally sits in your path, so you can call it directly from anywhere.
28
+
29
+ This application reads a tokenized KAF document from standard input and tags each term with its part of speech.
30
+
31
+ POS tagging some text (assuming that the KAF input is in a file called *english.kaf*):
32
+
33
+ cat english.kaf | pos-tagger
34
+
35
+ Will result in
36
+
37
+ <?xml version='1.0' encoding='UTF-8'?>
38
+ <KAF version="v1.opener" xml:lang="en">
39
+ <kafHeader>
40
+ <linguisticProcessors layer="text">
41
+ <lp name="opennlp-en-tok" timestamp="2013-06-11T13:41:37Z" version="1.0"/>
42
+ <lp name="opennlp-en-sent" timestamp="2013-06-11T13:41:37Z" version="1.0"/>
43
+ </linguisticProcessors>
44
+ <linguisticProcessor layer="term">
45
+ <lp timestamp="2013-06-12T15:18:03CEST" version="1.0" name="Open nlp pos tagger"/>
46
+ </linguisticProcessor>
47
+ </kafHeader>
48
+ <text>
49
+ <wf length="4" offset="0" para="1" sent="1" wid="w1">this</wf>
50
+ <wf length="2" offset="5" para="1" sent="1" wid="w2">is</wf>
51
+ <wf length="2" offset="8" para="1" sent="1" wid="w3">an</wf>
52
+ <wf length="7" offset="11" para="1" sent="1" wid="w4">english</wf>
53
+ <wf length="4" offset="19" para="1" sent="1" wid="w5">text</wf>
54
+ </text>
55
+ <terms>
56
+ <term lemma="this" morphofeat="FM" pos="O" tid="t_1" type="open">
57
+ <span>
58
+ <target id="w1"/>
59
+ </span>
60
+ </term>
61
+ <term lemma="is" morphofeat="FM" pos="O" tid="t_2" type="open">
62
+ <span>
63
+ <target id="w2"/>
64
+ </span>
65
+ </term>
66
+ <term lemma="an" morphofeat="APPR" pos="P" tid="t_3" type="close">
67
+ <span>
68
+ <target id="w3"/>
69
+ </span>
70
+ </term>
71
+ <term lemma="english" morphofeat="FM" pos="O" tid="t_4" type="open">
72
+ <span>
73
+ <target id="w4"/>
74
+ </span>
75
+ </term>
76
+ <term lemma="text" morphofeat="FM" pos="O" tid="t_5" type="open">
77
+ <span>
78
+ <target id="w5"/>
79
+ </span>
80
+ </term>
81
+ </terms>
82
+ </KAF>
83
+
84
+ ### Webservices
85
+
86
+ You can launch a POS tagging webservice by executing:
87
+
88
+ pos-tagger-server
89
+
90
+ This will launch a mini webserver with the webservice. It defaults to port 9292,
91
+ so you can access it at <http://localhost:9292>.
92
+
93
+ To launch it on a different port provide the `-p [port-number]` option like
94
+ this:
95
+
96
+ pos-tagger-server -p 1234
97
+
98
+ It then launches at <http://localhost:1234>
99
+
100
+ Documentation on the Webservice is provided by surfing to the urls provided
101
+ above. For more information on how to launch a webservice run the command with
102
+ the ```-h``` option.
103
+
104
+
105
+ ### Daemon
106
+
107
+ Last but not least the POS tagger comes shipped with a daemon that
108
+ can read jobs from (and write jobs to) Amazon SQS queues. For more
109
+ information type:
110
+
111
+ pos-tagger-daemon -h
112
+
113
+ Description of dependencies
114
+ ---------------------------
115
+
116
+ This component runs best if you run it in an environment suited for OpeNER
117
+ components. You can find an installation guide and helper tools in the [OpeNER
+ installer](https://github.com/opener-project/opener-installer) and [an
+ installation guide on the OpeNER
+ website](http://opener-project.github.io/getting-started/how-to/local-installation.html).
121
+
122
+ At a minimum, you need the following system setup:
123
+
124
+ ### Dependencies for normal use:
125
+
126
+ * JRuby (1.7.9+)
127
+ * Java 1.7 or newer (There are problems with encoding in older versions).
128
+
129
+ ### Dependencies if you want to modify the component:
130
+
131
+ * Maven (for building the Gem)
132
+
133
+ Language Extension
134
+ ------------------
135
+
136
+ TODO
137
+
138
+ The Core
139
+ --------
140
+
141
+ The component is a fat wrapper around the actual language technology core. You
142
+ can find the core technologies in the following repositories: <https://github.com/opener-project/?query=pos>
143
+
144
+ Where to go from here
145
+ ---------------------
146
+
147
+ * Check [the project website](http://opener-project.github.io)
+ * [Check out the webservice](http://opener.olery.com/pos-tagger)
149
+
150
+ Report problem/Get help
151
+ -----------------------
152
+
153
+ If you encounter problems, please email support@opener-project.eu or leave an
154
+ issue in the [issue tracker](https://github.com/opener-project/pos-tagger/issues).
155
+
156
+
157
+ Contributing
158
+ ------------
159
+
160
+ 1. Fork it ( http://github.com/opener-project/pos-tagger/fork )
161
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
162
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
163
+ 4. Push to the branch (`git push origin my-new-feature`)
164
+ 5. Create new Pull Request
data/bin/pos-tagger ADDED
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require_relative '../lib/opener/pos_tagger'
4
+
5
+ cli = Opener::POSTagger::CLI.new(:args => ARGV)
6
+
7
+ cli.run(STDIN.tty? ? nil : STDIN.read)
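`STDIN.tty?` is true when the command runs interactively with nothing piped in, so in that case the script passes `nil` to `CLI#run` instead of blocking on a read. A small sketch of that decision in isolation, using `StringIO` to stand in for piped input. `read_piped_input` and `FakeTerminal` are illustrative names, not part of the gem.

```ruby
require 'stringio'

# Returns everything available on +io+, or nil when +io+ is an interactive
# terminal and reading would block waiting for the user.
def read_piped_input(io)
  io.tty? ? nil : io.read
end

# Illustrative stand-in for an interactive terminal.
class FakeTerminal
  def tty?
    true
  end

  def read
    raise 'read should not be called on a terminal'
  end
end

piped = StringIO.new('<KAF xml:lang="en"/>')

p read_piped_input(piped)            # the KAF string
p read_piped_input(FakeTerminal.new) # nil
```

What the tagger does with a `nil` input depends on Nokogiri and the language kernels, which this diff does not show.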
data/bin/pos-tagger-server ADDED
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require_relative '../lib/opener/pos_tagger/server'
4
+
5
+ # Without calling `Rack::Server#options` manually the CLI arguments will never
6
+ # be passed, thus the application can't be specified as a constructor argument.
7
+ server = Rack::Server.new
8
+ server.options[:config] = File.expand_path('../../config.ru', __FILE__)
9
+
10
+ server.start
data/config.ru ADDED
@@ -0,0 +1,4 @@
1
+ require File.expand_path('../lib/opener/pos_tagger', __FILE__)
2
+ require File.expand_path('../lib/opener/pos_tagger/server', __FILE__)
3
+
4
+ run Opener::POSTagger::Server
data/lib/opener/pos_tagger.rb ADDED
@@ -0,0 +1,90 @@
1
+ require 'opener/pos_taggers/base'
2
+ require 'opener/pos_taggers/en'
3
+ require 'nokogiri'
4
+ require 'open3'
5
+ require 'optparse'
6
+
7
+ require_relative 'pos_tagger/version'
8
+ require_relative 'pos_tagger/cli'
9
+
10
+ module Opener
11
+ ##
12
+ # Primary POS tagger class that delegates work to the various POS tagging
13
+ # kernels.
14
+ #
15
+ # @!attribute [r] options
16
+ # @return [Hash]
17
+ #
18
+ class POSTagger
19
+ attr_reader :options
20
+
21
+ ##
22
+ # Hash containing the default options to use.
23
+ #
24
+ # @return [Hash]
25
+ #
26
+ DEFAULT_OPTIONS = {
27
+ :args => []
28
+ }.freeze
29
+
30
+ ##
31
+ # @param [Hash] options
32
+ #
33
+ # @option options [Array] :args Arbitrary arguments to pass to the
34
+ # underlying kernel.
35
+ #
36
+ def initialize(options = {})
37
+ @options = DEFAULT_OPTIONS.merge(options)
38
+ end
39
+
40
+ ##
41
+ # Processes the input and returns an Array containing the output of STDOUT,
42
+ # STDERR and an object containing process information.
43
+ #
44
+ # @param [String] input The input to process.
45
+ # @return [Array]
46
+ #
47
+ def run(input)
48
+ language = language_from_kaf(input)
49
+
50
+ unless valid_language?(language)
51
+ raise ArgumentError, "The specified language (#{language}) is invalid"
52
+ end
53
+
54
+ kernel = language_constant(language).new(:args => options[:args])
55
+
56
+ return kernel.run(input)
57
+ end
58
+
59
+ alias tag run
60
+
61
+ protected
62
+
63
+ ##
64
+ # Extracts the language from a KAF document.
65
+ #
66
+ # @param [String] input
67
+ # @return [String]
68
+ #
69
+ def language_from_kaf(input)
70
+ reader = Nokogiri::XML::Reader(input)
71
+
72
+ return reader.read.lang
73
+ end
74
+
75
+ ##
76
+ # @param [String] language
77
+ # @return [Class]
78
+ #
79
+ def language_constant(language)
80
+ return language && POSTaggers.const_get(language.upcase)
81
+ end
82
+
83
+ ##
84
+ # @return [TrueClass|FalseClass]
85
+ #
86
+ def valid_language?(language)
87
+ return Opener::POSTaggers.const_defined?(language.upcase)
88
+ end
89
+ end # POSTagger
90
+ end # Opener
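`POSTagger#run` picks a kernel by upcasing the `xml:lang` value and looking it up as a constant under `Opener::POSTaggers`. Note that `valid_language?` calls `upcase` on its argument, so a document without `xml:lang` (a `nil` language) would raise `NoMethodError` before the `ArgumentError` is reached. The sketch below shows the same lookup pattern with a guard for that case. `DemoTaggers` and `kernel_for` are hypothetical stand-ins, since the real kernels live in other gems.

```ruby
# Hypothetical stand-in for the Opener::POSTaggers namespace.
module DemoTaggers
  class EN
    def run(input)
      "tagged(en): #{input}"
    end
  end
end

# Resolves a language code such as "en" to a kernel class, or returns nil
# when the code is missing, malformed, or has no kernel defined.
def kernel_for(language)
  return nil if language.nil?

  name = language.upcase

  # const_defined? raises NameError for strings that are not valid constant
  # names (for example "PT-BR"), so reject those up front.
  return nil unless name =~ /\A[A-Z][A-Z0-9_]*\z/

  # Passing false restricts the lookup to DemoTaggers itself, so top-level
  # constants are not mistaken for kernels.
  return nil unless DemoTaggers.const_defined?(name, false)

  DemoTaggers.const_get(name, false)
end

p kernel_for('en')     # DemoTaggers::EN
p kernel_for('xx')     # nil
p kernel_for(nil)      # nil
```

Returning `nil` for unknown codes keeps the decision about the error message with the caller, which is where the original code raises `ArgumentError`.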
data/lib/opener/pos_tagger/cli.rb ADDED
@@ -0,0 +1,73 @@
1
+ require 'optparse'
2
+
3
+ module Opener
4
+ class POSTagger
5
+ ##
6
+ # CLI wrapper around {Opener::POSTagger} using OptionParser.
7
+ #
8
+ # @!attribute [r] options
9
+ # @return [Hash]
10
+ # @!attribute [r] option_parser
11
+ # @return [OptionParser]
12
+ #
13
+ class CLI
14
+ attr_reader :options, :option_parser
15
+
16
+ ##
17
+ # @param [Hash] options
18
+ #
19
+ def initialize(options = {})
20
+ @options = DEFAULT_OPTIONS.merge(options)
21
+
22
+ @option_parser = ::OptionParser.new do |opts|
23
+ opts.program_name = 'pos-tagger'
24
+ opts.summary_indent = ' '
25
+
26
+ opts.on('-h', '--help', 'Shows this help message') do
27
+ show_help
28
+ end
29
+
30
+ opts.on('-v', '--version', 'Shows the current version') do
31
+ show_version
32
+ end
33
+
34
+ opts.separator <<-EOF
35
+
36
+ Examples:
37
+
38
+ cat example.kaf | #{opts.program_name}
39
+ EOF
40
+ end
41
+ end
42
+
43
+ ##
44
+ # @param [String] input
45
+ #
46
+ def run(input)
47
+ option_parser.parse!(options[:args])
48
+
49
+ tagger = POSTagger.new(options)
50
+
51
+ stdout = tagger.run(input)
52
+
53
+ puts stdout
54
+ end
55
+
56
+ private
57
+
58
+ ##
59
+ # Shows the help message and exits the program.
60
+ #
61
+ def show_help
62
+ abort option_parser.to_s
63
+ end
64
+
65
+ ##
66
+ # Shows the version and exits the program.
67
+ #
68
+ def show_version
69
+ abort "#{option_parser.program_name} v#{VERSION} on #{RUBY_DESCRIPTION}"
70
+ end
71
+ end # CLI
72
+ end # POSTagger
73
+ end # Opener
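`OptionParser#parse!` removes the options it recognises from the Array it is given, so by the time `CLI#run` hands the options Hash to `POSTagger.new`, `options[:args]` holds only what the parser did not consume. A trimmed-down sketch of that behaviour. `parse_cli` is a hypothetical helper that records flags instead of printing and exiting.

```ruby
require 'optparse'

# Parses +args+ in place and returns the flags that were seen together with
# the arguments the parser left untouched.
def parse_cli(args)
  seen = { :help => false, :version => false }

  parser = OptionParser.new do |opts|
    opts.program_name = 'pos-tagger'

    opts.on('-h', '--help', 'Shows this help message') { seen[:help] = true }
    opts.on('-v', '--version', 'Shows the current version') { seen[:version] = true }
  end

  # parse! mutates the Array and returns what is left over.
  rest = parser.parse!(args)

  [seen, rest]
end

seen, rest = parse_cli(['-v', 'extra'])

p seen
p rest
```

In the gem itself both flags call `abort`, which prints to STDERR and exits with a non-zero status, so nothing after `parse!` runs when `-h` or `-v` is given.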
data/lib/opener/pos_tagger/public/markdown.css ADDED
@@ -0,0 +1,283 @@
1
+ input[type="text"], textarea
2
+ {
3
+ width: 500px;
4
+ }
5
+
6
+ body {
7
+ font-family: Helvetica, arial, sans-serif;
8
+ font-size: 14px;
9
+ line-height: 1.6;
10
+ padding-top: 10px;
11
+ padding-bottom: 10px;
12
+ background-color: white;
13
+ padding: 30px; }
14
+
15
+ body > *:first-child {
16
+ margin-top: 0 !important; }
17
+ body > *:last-child {
18
+ margin-bottom: 0 !important; }
19
+
20
+ a {
21
+ color: #4183C4; }
22
+ a.absent {
23
+ color: #cc0000; }
24
+ a.anchor {
25
+ display: block;
26
+ padding-left: 30px;
27
+ margin-left: -30px;
28
+ cursor: pointer;
29
+ position: absolute;
30
+ top: 0;
31
+ left: 0;
32
+ bottom: 0; }
33
+
34
+ h1, h2, h3, h4, h5, h6 {
35
+ margin: 20px 0 10px;
36
+ padding: 0;
37
+ font-weight: bold;
38
+ -webkit-font-smoothing: antialiased;
39
+ cursor: text;
40
+ position: relative; }
41
+
42
+ h1:hover a.anchor, h2:hover a.anchor, h3:hover a.anchor, h4:hover a.anchor, h5:hover a.anchor, h6:hover a.anchor {
43
+ background: url("../../images/modules/styleguide/para.png") no-repeat 10px center;
44
+ text-decoration: none; }
45
+
46
+ h1 tt, h1 code {
47
+ font-size: inherit; }
48
+
49
+ h2 tt, h2 code {
50
+ font-size: inherit; }
51
+
52
+ h3 tt, h3 code {
53
+ font-size: inherit; }
54
+
55
+ h4 tt, h4 code {
56
+ font-size: inherit; }
57
+
58
+ h5 tt, h5 code {
59
+ font-size: inherit; }
60
+
61
+ h6 tt, h6 code {
62
+ font-size: inherit; }
63
+
64
+ h1 {
65
+ font-size: 28px;
66
+ color: black; }
67
+
68
+ h2 {
69
+ font-size: 24px;
70
+ border-bottom: 1px solid #cccccc;
71
+ color: black; }
72
+
73
+ h3 {
74
+ font-size: 18px; }
75
+
76
+ h4 {
77
+ font-size: 16px; }
78
+
79
+ h5 {
80
+ font-size: 14px; }
81
+
82
+ h6 {
83
+ color: #777777;
84
+ font-size: 14px; }
85
+
86
+ p, blockquote, ul, ol, dl, li, table, pre {
87
+ margin: 15px 0; }
88
+
89
+ hr {
90
+ background: transparent url("../../images/modules/pulls/dirty-shade.png") repeat-x 0 0;
91
+ border: 0 none;
92
+ color: #cccccc;
93
+ height: 4px;
94
+ padding: 0; }
95
+
96
+ body > h2:first-child {
97
+ margin-top: 0;
98
+ padding-top: 0; }
99
+ body > h1:first-child {
100
+ margin-top: 0;
101
+ padding-top: 0; }
102
+ body > h1:first-child + h2 {
103
+ margin-top: 0;
104
+ padding-top: 0; }
105
+ body > h3:first-child, body > h4:first-child, body > h5:first-child, body > h6:first-child {
106
+ margin-top: 0;
107
+ padding-top: 0; }
108
+
109
+ a:first-child h1, a:first-child h2, a:first-child h3, a:first-child h4, a:first-child h5, a:first-child h6 {
110
+ margin-top: 0;
111
+ padding-top: 0; }
112
+
113
+ h1 p, h2 p, h3 p, h4 p, h5 p, h6 p {
114
+ margin-top: 0; }
115
+
116
+ li p.first {
117
+ display: inline-block; }
118
+
119
+ ul, ol {
120
+ padding-left: 30px; }
121
+
122
+ ul :first-child, ol :first-child {
123
+ margin-top: 0; }
124
+
125
+ ul :last-child, ol :last-child {
126
+ margin-bottom: 0; }
127
+
128
+ dl {
129
+ padding: 0; }
130
+ dl dt {
131
+ font-size: 14px;
132
+ font-weight: bold;
133
+ font-style: italic;
134
+ padding: 0;
135
+ margin: 15px 0 5px; }
136
+ dl dt:first-child {
137
+ padding: 0; }
138
+ dl dt > :first-child {
139
+ margin-top: 0; }
140
+ dl dt > :last-child {
141
+ margin-bottom: 0; }
142
+ dl dd {
143
+ margin: 0 0 15px;
144
+ padding: 0 15px; }
145
+ dl dd > :first-child {
146
+ margin-top: 0; }
147
+ dl dd > :last-child {
148
+ margin-bottom: 0; }
149
+
150
+ blockquote {
151
+ border-left: 4px solid #dddddd;
152
+ padding: 0 15px;
153
+ color: #777777; }
154
+ blockquote > :first-child {
155
+ margin-top: 0; }
156
+ blockquote > :last-child {
157
+ margin-bottom: 0; }
158
+
159
+ table {
160
+ padding: 0; }
161
+ table tr {
162
+ border-top: 1px solid #cccccc;
163
+ background-color: white;
164
+ margin: 0;
165
+ padding: 0; }
166
+ table tr:nth-child(2n) {
167
+ background-color: #f8f8f8; }
168
+ table tr th {
169
+ font-weight: bold;
170
+ border: 1px solid #cccccc;
171
+ text-align: left;
172
+ margin: 0;
173
+ padding: 6px 13px; }
174
+ table tr td {
175
+ border: 1px solid #cccccc;
176
+ text-align: left;
177
+ margin: 0;
178
+ padding: 6px 13px; }
179
+ table tr th :first-child, table tr td :first-child {
180
+ margin-top: 0; }
181
+ table tr th :last-child, table tr td :last-child {
182
+ margin-bottom: 0; }
183
+
184
+ img {
185
+ max-width: 100%; }
186
+
187
+ span.frame {
188
+ display: block;
189
+ overflow: hidden; }
190
+ span.frame > span {
191
+ border: 1px solid #dddddd;
192
+ display: block;
193
+ float: left;
194
+ overflow: hidden;
195
+ margin: 13px 0 0;
196
+ padding: 7px;
197
+ width: auto; }
198
+ span.frame span img {
199
+ display: block;
200
+ float: left; }
201
+ span.frame span span {
202
+ clear: both;
203
+ color: #333333;
204
+ display: block;
205
+ padding: 5px 0 0; }
206
+ span.align-center {
207
+ display: block;
208
+ overflow: hidden;
209
+ clear: both; }
210
+ span.align-center > span {
211
+ display: block;
212
+ overflow: hidden;
213
+ margin: 13px auto 0;
214
+ text-align: center; }
215
+ span.align-center span img {
216
+ margin: 0 auto;
217
+ text-align: center; }
218
+ span.align-right {
219
+ display: block;
220
+ overflow: hidden;
221
+ clear: both; }
222
+ span.align-right > span {
223
+ display: block;
224
+ overflow: hidden;
225
+ margin: 13px 0 0;
226
+ text-align: right; }
227
+ span.align-right span img {
228
+ margin: 0;
229
+ text-align: right; }
230
+ span.float-left {
231
+ display: block;
232
+ margin-right: 13px;
233
+ overflow: hidden;
234
+ float: left; }
235
+ span.float-left span {
236
+ margin: 13px 0 0; }
237
+ span.float-right {
238
+ display: block;
239
+ margin-left: 13px;
240
+ overflow: hidden;
241
+ float: right; }
242
+ span.float-right > span {
243
+ display: block;
244
+ overflow: hidden;
245
+ margin: 13px auto 0;
246
+ text-align: right; }
247
+
248
+ code, tt {
249
+ margin: 0 2px;
250
+ padding: 0 5px;
251
+ white-space: nowrap;
252
+ border: 1px solid #eaeaea;
253
+ background-color: #f8f8f8;
254
+ border-radius: 3px; }
255
+
256
+ pre code {
257
+ margin: 0;
258
+ padding: 0;
259
+ white-space: pre;
260
+ border: none;
261
+ background: transparent; }
262
+
263
+ .highlight pre {
264
+ background-color: #f8f8f8;
265
+ border: 1px solid #cccccc;
266
+ font-size: 13px;
267
+ line-height: 19px;
268
+ overflow: auto;
269
+ padding: 6px 10px;
270
+ border-radius: 3px; }
271
+
272
+ pre {
273
+ background-color: #f8f8f8;
274
+ border: 1px solid #cccccc;
275
+ font-size: 13px;
276
+ line-height: 19px;
277
+ overflow: auto;
278
+ padding: 6px 10px;
279
+ border-radius: 3px; }
280
+ pre code, pre tt {
281
+ background-color: transparent;
282
+ border: none; }
283
+
data/lib/opener/pos_tagger/server.rb ADDED
@@ -0,0 +1,16 @@
1
+ require 'sinatra/base'
2
+ require 'httpclient'
3
+ require 'opener/webservice'
4
+
5
+ module Opener
6
+ class POSTagger
7
+ ##
8
+ # POS Tagger server powered by Sinatra.
9
+ #
10
+ class Server < Webservice
11
+ set :views, File.expand_path('../views', __FILE__)
12
+ text_processor POSTagger
13
+ accepted_params :input
14
+ end # Server
15
+ end # POSTagger
16
+ end # Opener
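The server class accepts a single `input` parameter, which matches the form field and the `curl` example in the bundled index view. A sketch of building the same request from Ruby with the standard library. It assumes the default address `http://localhost:9292` from the README, and `build_tag_request` is a hypothetical helper. The request is only constructed here, not sent.

```ruby
require 'net/http'
require 'uri'

# Builds (but does not send) a form-encoded POST carrying the KAF document
# in the "input" field.
def build_tag_request(kaf, base_url = 'http://localhost:9292')
  uri     = URI.parse(base_url + '/')
  request = Net::HTTP::Post.new(uri.request_uri)

  request.set_form_data('input' => kaf)

  [uri, request]
end

uri, request = build_tag_request('<KAF xml:lang="en"/>')

puts request.body

# To send it while pos-tagger-server is running:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.body
```

`set_form_data` URL-encodes the document, which avoids the manual quoting that the `curl` one-liner needs.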
data/lib/opener/pos_tagger/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Opener
2
+ class POSTagger
3
+ VERSION = "2.0.0"
4
+ end
5
+ end
data/lib/opener/pos_tagger/views/index.erb ADDED
@@ -0,0 +1,163 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <link type="text/css" rel="stylesheet" charset="UTF-8" href="markdown.css"/>
5
+ <title>POS Tagger Webservice</title>
6
+ </head>
7
+ <body>
8
+ <h1>POS Tagger Web Service</h1>
9
+
10
+ <h2>Example Usage</h2>
11
+
12
+ <p>
13
+ <pre>pos-tagger-server start</pre>
14
+ <pre>curl -d &#39;input=&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&lt;KAF version=&quot;v1.opener&quot; xml:lang=&quot;en&quot;&gt;&lt;kafHeader&gt;&lt;linguisticProcessors layer=&quot;text&quot;&gt;&lt;lp name=&quot;opennlp-en-tok&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;&lt;lp name=&quot;opennlp-en-sent&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;&lt;/linguisticProcessors&gt;&lt;/kafHeader&gt;&lt;text&gt;&lt;wf length=&quot;4&quot; offset=&quot;0&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w1&quot;&gt;this&lt;/wf&gt;&lt;wf length=&quot;2&quot; offset=&quot;5&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w2&quot;&gt;is&lt;/wf&gt;&lt;wf length=&quot;2&quot; offset=&quot;8&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w3&quot;&gt;an&lt;/wf&gt;&lt;wf length=&quot;7&quot; offset=&quot;11&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w4&quot;&gt;english&lt;/wf&gt;&lt;wf length=&quot;4&quot; offset=&quot;19&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w5&quot;&gt;text&lt;/wf&gt;&lt;/text&gt;&lt;/KAF&gt;&#39; http://localhost:9292 -XPOST</pre>
15
+
16
+ outputs:
17
+
18
+ <pre>
19
+ &lt;?xml version=&#39;1.0&#39; encoding=&#39;UTF-8&#39;?&gt;
20
+ &lt;KAF version=&quot;v1.opener&quot; xml:lang=&quot;en&quot;&gt;
21
+ &lt;kafHeader&gt;
22
+ &lt;linguisticProcessors layer=&quot;text&quot;&gt;
23
+ &lt;lp name=&quot;opennlp-en-tok&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;
24
+ &lt;lp name=&quot;opennlp-en-sent&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;
25
+ &lt;/linguisticProcessors&gt;
26
+ &lt;linguisticProcessor layer=&quot;term&quot;&gt;
27
+ &lt;lp timestamp=&quot;2013-06-12T15:18:03CEST&quot; version=&quot;1.0&quot; name=&quot;Open nlp pos tagger&quot;/&gt;
28
+ &lt;/linguisticProcessor&gt;
29
+ &lt;/kafHeader&gt;
30
+ &lt;text&gt;
31
+ &lt;wf length=&quot;4&quot; offset=&quot;0&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w1&quot;&gt;this&lt;/wf&gt;
32
+ &lt;wf length=&quot;2&quot; offset=&quot;5&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w2&quot;&gt;is&lt;/wf&gt;
33
+ &lt;wf length=&quot;2&quot; offset=&quot;8&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w3&quot;&gt;an&lt;/wf&gt;
34
+ &lt;wf length=&quot;7&quot; offset=&quot;11&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w4&quot;&gt;english&lt;/wf&gt;
35
+ &lt;wf length=&quot;4&quot; offset=&quot;19&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w5&quot;&gt;text&lt;/wf&gt;
36
+ &lt;/text&gt;
37
+ &lt;terms&gt;
38
+ &lt;term lemma=&quot;this&quot; morphofeat=&quot;FM&quot; pos=&quot;O&quot; tid=&quot;t_1&quot; type=&quot;open&quot;&gt;
39
+ &lt;span&gt;
40
+ &lt;target id=&quot;w1&quot;/&gt;
41
+ &lt;/span&gt;
42
+ &lt;/term&gt;
43
+ &lt;term lemma=&quot;is&quot; morphofeat=&quot;FM&quot; pos=&quot;O&quot; tid=&quot;t_2&quot; type=&quot;open&quot;&gt;
44
+ &lt;span&gt;
45
+ &lt;target id=&quot;w2&quot;/&gt;
46
+ &lt;/span&gt;
47
+ &lt;/term&gt;
48
+ &lt;term lemma=&quot;an&quot; morphofeat=&quot;APPR&quot; pos=&quot;P&quot; tid=&quot;t_3&quot; type=&quot;close&quot;&gt;
49
+ &lt;span&gt;
50
+ &lt;target id=&quot;w3&quot;/&gt;
51
+ &lt;/span&gt;
52
+ &lt;/term&gt;
53
+ &lt;term lemma=&quot;english&quot; morphofeat=&quot;FM&quot; pos=&quot;O&quot; tid=&quot;t_4&quot; type=&quot;open&quot;&gt;
54
+ &lt;span&gt;
55
+ &lt;target id=&quot;w4&quot;/&gt;
56
+ &lt;/span&gt;
57
+ &lt;/term&gt;
58
+ &lt;term lemma=&quot;text&quot; morphofeat=&quot;FM&quot; pos=&quot;O&quot; tid=&quot;t_5&quot; type=&quot;open&quot;&gt;
59
+ &lt;span&gt;
60
+ &lt;target id=&quot;w5&quot;/&gt;
61
+ &lt;/span&gt;
62
+ &lt;/term&gt;
63
+ &lt;/terms&gt;
64
+ &lt;/KAF&gt;</pre>
65
+ </p>
66
+
67
+ <h2>Try the webservice</h2>
68
+
69
+ <p>* required</p>
70
+ <p>** When entering a value no response will be displayed in the browser.</p>
71
+
72
+ <form action="<%=url("/")%>" method="POST">
73
+ <div>
74
+ <label for="text">Type your text here*</label>
75
+ <br/>
76
+
77
+ <textarea name="input" id="text" rows="10" cols="50"></textarea>
78
+ </div>
79
+
80
+ <% 10.times do |t| %>
81
+ <div>
82
+ <label for="callbacks">Callback URL <%=t+1%>(**)</label>
83
+ <br />
84
+
85
+ <input id="callbacks" type="text" name="callbacks[]" />
86
+ </div>
87
+ <% end %>
88
+
89
+
90
+ <div>
91
+ <label for="error_callback">Error Callback</label>
92
+ <br />
93
+
94
+ <input id="error_callback" type="text" name="error_callback" />
95
+ </div>
96
+ <input type="submit" value="Submit" />
97
+ </form>
98
+
99
+ <h2>Actions</h2>
100
+
101
+ <p>
102
+ <dl>
103
+ <dt>POST /</dt>
104
+ <dd>Tag the input tokenized text. See arguments listing for more options.</dd>
105
+ <dt>GET /</dt>
106
+ <dd>Show this page</dd>
107
+ </dl>
108
+ </p>
109
+
110
+ <h2>Arguments</h2>
111
+
112
+ <p> The webservice takes the following arguments: </p>
113
+ <p>* required</p>
114
+
115
+ <dl>
116
+ <dt>input*</dt>
117
+ <dd>The input text in KAF format. Sample KAF input:</dd>
118
+ <pre>
119
+ &lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;
120
+ &lt;KAF version=&quot;v1.opener&quot; xml:lang=&quot;en&quot;&gt;
121
+ &lt;kafHeader&gt;
122
+ &lt;linguisticProcessors layer=&quot;text&quot;&gt;
123
+ &lt;lp name=&quot;opennlp-en-tok&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;
124
+ &lt;lp name=&quot;opennlp-en-sent&quot; timestamp=&quot;2013-06-11T13:41:37Z&quot; version=&quot;1.0&quot;/&gt;
125
+ &lt;/linguisticProcessors&gt;
126
+ &lt;/kafHeader&gt;
127
+ &lt;text&gt;
128
+ &lt;wf length=&quot;4&quot; offset=&quot;0&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w1&quot;&gt;this&lt;/wf&gt;
129
+ &lt;wf length=&quot;2&quot; offset=&quot;5&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w2&quot;&gt;is&lt;/wf&gt;
130
+ &lt;wf length=&quot;2&quot; offset=&quot;8&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w3&quot;&gt;an&lt;/wf&gt;
131
+ &lt;wf length=&quot;7&quot; offset=&quot;11&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w4&quot;&gt;english&lt;/wf&gt;
132
+ &lt;wf length=&quot;4&quot; offset=&quot;19&quot; para=&quot;1&quot; sent=&quot;1&quot; wid=&quot;w5&quot;&gt;text&lt;/wf&gt;
133
+ &lt;/text&gt;
134
+ &lt;/KAF&gt;</pre>
135
+
136
+ <dt>callbacks</dt>
137
+ <dd>
138
+ You can provide a list of callback urls. If you provide callback urls
139
+ the POS tagger will run as a background job and a callback
140
+ with the results will be performed (POST) to the first url in the callback
141
+ list. The other urls in callback list will be provided in the "callbacks"
142
+ argument.<br/><br/>
143
+ Using callbacks you can chain together several OpeNER webservices in
144
+ one call. The first will call the second, which will call the third, etc.
145
+ For more information, see the <a href="http://opener-project.github.io">
146
+ webservice documentation online</a>.
147
+ </dd>
148
+ <dt>error_callback</dt>
149
+ <dd>URL to notify if errors occur in the background process. The error
150
+ callback will do a POST with the error message in the 'error' field.</dd>
151
+ </dt>
152
+
153
+
154
+
155
+ </dl>
156
+
157
+
158
+ <p>
159
+
160
+ </p>
161
+
162
+ </body>
163
+ </html>
data/lib/opener/pos_tagger/views/result.erb ADDED
@@ -0,0 +1,15 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <link type="text/css" rel="stylesheet" charset="UTF-8" href="markdown.css"/>
5
+ <title>POS Tagger Webservice</title>
6
+ </head>
7
+ <body>
8
+ <h1>Output URL</h1>
9
+ <p>
10
+ When ready, you can view the result
11
+ <a href="<%= output_url %>">here</a>
12
+ </p>
13
+
14
+ </body>
15
+ </html>
data/opener-pos-tagger.gemspec ADDED
@@ -0,0 +1,35 @@
1
+ require File.expand_path('../lib/opener/pos_tagger/version', __FILE__)
2
+
3
+ Gem::Specification.new do |gem|
4
+ gem.name = 'opener-pos-tagger'
5
+ gem.version = Opener::POSTagger::VERSION
6
+ gem.authors = ['development@olery.com']
7
+ gem.summary = 'Gem that wraps up the different existing pos-taggers'
8
+ gem.description = gem.summary
9
+ gem.homepage = 'http://opener-project.github.com/'
10
+ gem.has_rdoc = "yard"
11
+ gem.required_ruby_version = ">= 1.9.2"
12
+
13
+ gem.files = Dir.glob([
14
+ 'lib/**/*',
15
+ 'config.ru',
16
+ '*.gemspec',
17
+ 'README.md'
18
+ ]).select { |file| File.file?(file) }
19
+
20
+ gem.executables = Dir.glob('bin/*').map { |file| File.basename(file) }
21
+
22
+ gem.add_dependency 'opener-pos-tagger-base'
23
+ gem.add_dependency 'opener-pos-tagger-en-es'
24
+ gem.add_dependency 'opener-webservice'
25
+
26
+ gem.add_dependency 'nokogiri'
27
+ gem.add_dependency 'sinatra', '~>1.4.2'
28
+ gem.add_dependency 'httpclient'
29
+
30
+ gem.add_development_dependency 'rspec'
31
+ gem.add_development_dependency 'cucumber'
32
+ gem.add_development_dependency 'pry'
33
+ gem.add_development_dependency 'rake'
34
+ end
35
+
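The `~>1.4.2` constraint on sinatra is RubyGems' pessimistic operator: it admits 1.4.2 and later 1.4.x releases but stops short of 1.5. The other dependencies are declared without a constraint, which the generated metadata records as `>= 0`. A quick way to check what a constraint admits, using only classes that ship with RubyGems (the list of versions is arbitrary):

```ruby
require 'rubygems'

sinatra = Gem::Requirement.new('~> 1.4.2')

# Print whether each candidate version satisfies the constraint.
%w[1.4.1 1.4.2 1.4.8 1.5.0].each do |version|
  allowed = sinatra.satisfied_by?(Gem::Version.new(version))

  puts "#{version}: #{allowed}"
end
```

Only 1.4.2 and 1.4.8 satisfy the constraint in this list.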
metadata ADDED
@@ -0,0 +1,197 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: opener-pos-tagger
3
+ version: !ruby/object:Gem::Version
4
+ version: 2.0.0
5
+ platform: ruby
6
+ authors:
7
+ - development@olery.com
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2014-05-20 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: opener-pos-tagger-base
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: opener-pos-tagger-en-es
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: opener-webservice
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: nokogiri
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: sinatra
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: 1.4.2
76
+ type: :runtime
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: 1.4.2
83
+ - !ruby/object:Gem::Dependency
84
+ name: httpclient
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - ">="
88
+ - !ruby/object:Gem::Version
89
+ version: '0'
90
+ type: :runtime
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - ">="
95
+ - !ruby/object:Gem::Version
96
+ version: '0'
97
+ - !ruby/object:Gem::Dependency
98
+ name: rspec
99
+ requirement: !ruby/object:Gem::Requirement
100
+ requirements:
101
+ - - ">="
102
+ - !ruby/object:Gem::Version
103
+ version: '0'
104
+ type: :development
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - ">="
109
+ - !ruby/object:Gem::Version
110
+ version: '0'
111
+ - !ruby/object:Gem::Dependency
112
+ name: cucumber
113
+ requirement: !ruby/object:Gem::Requirement
114
+ requirements:
115
+ - - ">="
116
+ - !ruby/object:Gem::Version
117
+ version: '0'
118
+ type: :development
119
+ prerelease: false
120
+ version_requirements: !ruby/object:Gem::Requirement
121
+ requirements:
122
+ - - ">="
123
+ - !ruby/object:Gem::Version
124
+ version: '0'
125
+ - !ruby/object:Gem::Dependency
126
+ name: pry
127
+ requirement: !ruby/object:Gem::Requirement
128
+ requirements:
129
+ - - ">="
130
+ - !ruby/object:Gem::Version
131
+ version: '0'
132
+ type: :development
133
+ prerelease: false
134
+ version_requirements: !ruby/object:Gem::Requirement
135
+ requirements:
136
+ - - ">="
137
+ - !ruby/object:Gem::Version
138
+ version: '0'
139
+ - !ruby/object:Gem::Dependency
140
+ name: rake
141
+ requirement: !ruby/object:Gem::Requirement
142
+ requirements:
143
+ - - ">="
144
+ - !ruby/object:Gem::Version
145
+ version: '0'
146
+ type: :development
147
+ prerelease: false
148
+ version_requirements: !ruby/object:Gem::Requirement
149
+ requirements:
150
+ - - ">="
151
+ - !ruby/object:Gem::Version
152
+ version: '0'
153
+ description: Gem that wraps up the different existing pos-taggers
154
+ email:
155
+ executables:
156
+ - pos-tagger-server
157
+ - pos-tagger
158
+ extensions: []
159
+ extra_rdoc_files: []
160
+ files:
161
+ - README.md
162
+ - bin/pos-tagger
163
+ - bin/pos-tagger-server
164
+ - config.ru
165
+ - lib/opener/pos_tagger.rb
166
+ - lib/opener/pos_tagger/cli.rb
167
+ - lib/opener/pos_tagger/public/markdown.css
168
+ - lib/opener/pos_tagger/server.rb
169
+ - lib/opener/pos_tagger/version.rb
170
+ - lib/opener/pos_tagger/views/index.erb
171
+ - lib/opener/pos_tagger/views/result.erb
172
+ - opener-pos-tagger.gemspec
173
+ homepage: http://opener-project.github.com/
174
+ licenses: []
175
+ metadata: {}
176
+ post_install_message:
177
+ rdoc_options: []
178
+ require_paths:
179
+ - lib
180
+ required_ruby_version: !ruby/object:Gem::Requirement
181
+ requirements:
182
+ - - ">="
183
+ - !ruby/object:Gem::Version
184
+ version: 1.9.2
185
+ required_rubygems_version: !ruby/object:Gem::Requirement
186
+ requirements:
187
+ - - ">="
188
+ - !ruby/object:Gem::Version
189
+ version: '0'
190
+ requirements: []
191
+ rubyforge_project:
192
+ rubygems_version: 2.2.2
193
+ signing_key:
194
+ specification_version: 4
195
+ summary: Gem that wraps up the different existing pos-taggers
196
+ test_files: []
197
+ has_rdoc: yard