stanfordparser 2.0.0 → 2.1.0
- data/README +3 -3
- data/lib/stanfordparser.rb +19 -10
- metadata +2 -2
data/README
CHANGED
@@ -9,9 +9,9 @@ The Stanford Natural Language Parser is a Java implementation of a probabilistic
 
 In addition to the Ruby gems it requires, to run this module you must manually install the {Stanford Natural Language Parser}[http://nlp.stanford.edu/downloads/lex-parser.shtml].
 
-This module expects the parser to be installed in the <tt>/usr/local/stanford-parser/current</tt> directory. This is the directory that contains the <tt>stanford-parser.jar</tt> file. When the module is loaded, it adds this directory to the Java classpath and launches the Java VM with the arguments <tt>-server -Xmx150m</tt>.
+This module expects the parser to be installed in the <tt>/usr/local/stanford-parser/current</tt> directory on UNIX platforms and in the <tt>C:\stanford-parser\current</tt> directory on Windows platforms. This is the directory that contains the <tt>stanford-parser.jar</tt> file. When the module is loaded, it adds this directory to the Java classpath and launches the Java VM with the arguments <tt>-server -Xmx150m</tt>.
 
-These defaults can be overridden by creating
+These defaults can be overridden by creating the configuration file <tt>/etc/ruby_stanford_parser.yaml</tt> on UNIX platforms and <tt>C:\stanford-parser\ruby-stanford-parser.yaml</tt> on Windows platforms. This file is in the Ruby YAML[http://ruby-doc.org/stdlib/libdoc/yaml/rdoc/index.html] format, and may contain two values: <tt>root</tt> and <tt>jvmargs</tt>. For example, the file might look like the following:
 
 root: /usr/local/stanford-parser/other/location
 jvmargs: -Xmx100m -verbose
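The README hunk above describes a YAML file with two optional keys, root and jvmargs, that override the built-in defaults. A minimal sketch of that override logic follows. The helper name merge_parser_config is hypothetical and not part of the gem, and splitting jvmargs on whitespace is an assumption, since the diff does not show how the gem turns that value into VM arguments.

```ruby
require "yaml"

# Defaults taken from the README text in this diff.
DEFAULT_JVMARGS = ["-server", "-Xmx150m"]

# Hypothetical helper (not part of the gem): merge the two optional keys
# from a YAML document over the defaults and return [root, jvmargs].
def merge_parser_config(yaml_text, default_root)
  # An empty document does not load as a Hash, so fall back to an empty one.
  configuration = YAML.load(yaml_text) || {}
  root = default_root
  jvmargs = DEFAULT_JVMARGS
  if configuration.key?("root") && !configuration["root"].nil?
    root = configuration["root"]
  end
  if configuration.key?("jvmargs") && !configuration["jvmargs"].nil?
    # Assumption: the value is one whitespace-separated string.
    jvmargs = configuration["jvmargs"].split
  end
  [root, jvmargs]
end

example = "root: /usr/local/stanford-parser/other/location\n" \
          "jvmargs: -Xmx100m -verbose\n"
root, jvmargs = merge_parser_config(example, "/usr/local/stanford-parser/current")
puts root
puts jvmargs.inspect
```

Passing the file contents as a string keeps the sketch independent of which configuration path is in effect on a given platform.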
@@ -108,7 +108,7 @@ Unlike their parents StanfordParser::DocumentPreprocessor and StanfordParser::Le
 1.1.0:: Make module initialization function private. Add example code.
 1.2.0:: Read Java VM arguments from the configuration file. Add Word class.
 2.0.0:: Add support for standoff parsing. Change the way Rjb::JavaObjectWrapper wraps returned values: see wrap_java_object for details. Rjb::JavaObjectWrapper supports static members. Minor changes to stanford-sentence-parser script.
-
+2.1.0:: Different default paths for Windows machines; Minor changes to StandoffToken definition
 
 = Copyright
 
data/lib/stanfordparser.rb
CHANGED
@@ -34,7 +34,7 @@ require "java_object.rb"
 # Parser}[http://nlp.stanford.edu/downloads/lex-parser.shtml].
 module StanfordParser
 
-  VERSION = "2.
+  VERSION = "2.1.0"
 
   # The default sentence segmenter and tokenizer. This is an English-language
   # tokenizer with support for Penn Treebank markup.
@@ -47,17 +47,28 @@ module StanfordParser
 
   # This function is executed once when the module is loaded. It initializes
   # the Java virtual machine in which the Stanford parser will run. By
-  # default, it adds the parser installation root
-  # <tt>/usr/local/stanford-parser/current</tt> to the Java classpath and
+  # default, it adds the parser installation root to the Java classpath and
   # launches the VM with the arguments <tt>-server -Xmx150m</tt>. Different
-  # values may be specified with the <tt
+  # values may be specified with the <tt>ruby-stanford-parser.yaml</tt>
   # configuration file.
   #
+  # This function determines which operating system we are running on and sets
+  # default pathnames accordingly:
+  #
+  # UNIX:: /usr/local/stanford-parser/current, /etc/ruby-stanford-parser.yaml
+  # Windows:: C:\stanford-parser\current,
+  # C:\stanford-parser\ruby-stanford-parser.yaml
+  #
   # This function returns the path of the parser installation root.
   def StanfordParser.initialize_on_load
-
+    if RUBY_PLATFORM =~ /(win|w)32$/
+      root = Pathname.new("C:\\stanford-parser\\current")
+      config = Pathname.new("C:\\stanford-parser\\ruby-stanford-parser.yaml")
+    else
+      root = Pathname.new("/usr/local/stanford-parser/current")
+      config = Pathname.new("/etc/ruby-stanford-parser.yaml")
+    end
     jvmargs = ["-server", "-Xmx150m"]
-    config = Pathname.new("/etc/ruby-stanford-parser.yaml")
     if config.file?
       configuration = open(config) {|f| YAML.load(f)}
       if configuration.key?("root") and not configuration["root"].nil?
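The hunk above picks default paths by matching RUBY_PLATFORM against /(win|w)32$/. The following sketch isolates that decision so it can be exercised with arbitrary platform strings. The method name default_parser_paths and the hash return value are illustrative assumptions; the gem itself performs the check inline inside initialize_on_load.

```ruby
require "pathname"

# The same pattern the diff uses to recognise Windows builds of Ruby.
WINDOWS_PLATFORM = /(win|w)32$/

# Hypothetical helper (not part of the gem): return the default root and
# configuration paths for a given platform string.
def default_parser_paths(platform = RUBY_PLATFORM)
  if platform =~ WINDOWS_PLATFORM
    { root: Pathname.new("C:\\stanford-parser\\current"),
      config: Pathname.new("C:\\stanford-parser\\ruby-stanford-parser.yaml") }
  else
    { root: Pathname.new("/usr/local/stanford-parser/current"),
      config: Pathname.new("/etc/ruby-stanford-parser.yaml") }
  end
end

# Platform strings below are sample inputs chosen for illustration.
["i386-mswin32", "i386-mingw32", "x86_64-linux"].each do |platform|
  paths = default_parser_paths(platform)
  puts "#{platform}: #{paths[:root]}"
end
```

Because the pattern is anchored at the end of the string, any platform name that does not end in "win32" or "w32" falls through to the UNIX defaults. A Windows build whose platform string ends differently would therefore need the configuration file, or a broader pattern, to pick up the Windows paths.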
@@ -243,14 +254,12 @@ module StanfordParser
     end
   end # DocumentPreprocessor
 
-  StandoffToken = Struct.new(:current, :word, :before, :after,
-                             :begin_position, :end_position)
-
   # A text token that contains raw and normalized token identity (.e.g "(" and
   # "-LRB-"), an offset span, and the characters immediately preceding and
   # following the token. Given a list of these objects it is possible to
   # recreate the text from which they came verbatim.
-  class StandoffToken
+  class StandoffToken < Struct.new(:current, :word, :before, :after,
+                                   :begin_position, :end_position)
     def to_s
       "#{current} [#{begin_position},#{end_position}]"
     end
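The hunk above replaces a Struct constant that was later reopened with a class that inherits from an anonymous Struct. A standalone sketch of the new form is shown here, outside the StanfordParser module for brevity. The field values in the usage lines are made up for illustration.

```ruby
# Same definition as the added lines in the diff, without the enclosing module.
class StandoffToken < Struct.new(:current, :word, :before, :after,
                                 :begin_position, :end_position)
  def to_s
    "#{current} [#{begin_position},#{end_position}]"
  end
end

# Illustrative values: a raw "(" normalized to "-LRB-", spanning offsets 10..11.
token = StandoffToken.new("(", "-LRB-", " ", "", 10, 11)
puts token.to_s                 # prints "( [10,11]"
puts token.word                 # prints "-LRB-"
puts StandoffToken.superclass.superclass == Struct  # prints "true"
```

With this form StandoffToken is a subclass of the generated struct class rather than the struct class itself, so the accessors come from the parent and methods such as to_s live on the subclass.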
metadata
CHANGED
@@ -3,8 +3,8 @@ rubygems_version: 0.9.2
 specification_version: 1
 name: stanfordparser
 version: !ruby/object:Gem::Version
-  version: 2.
-date: 2008-06-
+  version: 2.1.0
+date: 2008-06-23 00:00:00 -07:00
 summary: Ruby wrapper for the Stanford Natural Language Parser
 require_paths:
 - lib