stanford-core-nlp 0.5.0 → 0.5.1
- data/README.md +17 -12
- data/lib/stanford-core-nlp.rb +16 -8
- metadata +32 -28
data/README.md
CHANGED
@@ -8,7 +8,7 @@ This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1. It is t
 
 **Installing**
 
-First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files.
+First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Two packages are available:
 
 * A [minimal package](http://louismullie.com/treat/stanford-core-nlp-minimal.zip) with the default tagger and parser models for English, French and German.
 * A [full package](http://louismullie.com/treat/stanford-core-nlp-full.zip), with all of the tagger and parser models for English, French and German, as well as named entity and coreference resolution models for English.
@@ -17,7 +17,7 @@ Place the contents of the extracted archive inside the /bin/ folder of the stanf
 
 **Configuration**
 
-
+You may want to set some optional configuration options. Here are some examples:
 
 ```ruby
 # Set an alternative path to look for the JAR files
@@ -36,9 +36,6 @@ StanfordCoreNLP.jvm_args = ['-option1', '-option2']
 # Redirect VM output to log.txt
 StanfordCoreNLP.log_file = 'log.txt'
 
-# Use the model files for a different language than English.
-StanfordCoreNLP.use(:french) # or :german
-
 # Change a specific model file.
 StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
 ```
@@ -46,6 +43,9 @@ StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
 **Using the gem**
 
 ```ruby
+# Use the model files for a different language than English.
+StanfordCoreNLP.use :french # or :german
+
 text = 'Angela Merkel met Nicolas Sarkozy on January 25th in ' +
 'Berlin to discuss a new austerity package. Sarkozy ' +
 'looked pleased, but Merkel was dismayed.'
@@ -71,18 +71,22 @@ text.get(:sentences).each do |sentence|
 puts token.get(:named_entity_tag).to_s
 # Coreference
 puts token.get(:coref_cluster_id).to_s
-# Also of interest: coref, coref_chain,
+# Also of interest: coref, coref_chain,
+# coref_cluster, coref_dest, coref_graph.
 end
 end
 ```
 
 > Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
 
-
+The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class is the `snake_case` of the class name, with 'Annotation' at the end removed. For example, `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
+
+A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the `config.rb` file inside the gem.
+
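The naming rule described in the added README paragraph can be sketched in plain Ruby. This is a minimal illustration only, not the gem's own code: `annotation_symbol` is a hypothetical helper name, and class names containing runs of capitals may map differently in practice, so the gem's `config.rb` remains the authoritative list.

```ruby
# Hypothetical helper (not part of the gem): derive the Ruby symbol
# for a Java annotation class name by dropping the trailing
# 'Annotation' and converting CamelCase to snake_case.
def annotation_symbol(java_class_name)
  java_class_name
    .sub(/Annotation\z/, '')            # strip the 'Annotation' suffix
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # underscore at lower-to-upper boundaries
    .downcase
    .to_sym
end

puts annotation_symbol('NamedEntityTagAnnotation').inspect  # prints :named_entity_tag
puts annotation_symbol('PartOfSpeechAnnotation').inspect    # prints :part_of_speech
```

The resulting symbols are the ones passed to `get` in the README example, such as `token.get(:named_entity_tag)`.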
 
 **Loading specific classes**
 
-You may
+You may want to load additional Java classes (including any class from the Stanford NLP packages). The gem provides an API for this:
 
 ```ruby
 # Default base class is edu.stanford.nlp.pipeline.
@@ -120,9 +124,7 @@ Here is a full list of annotator classes provided by the Stanford Core NLP packa
 Here is a full list of the default models for the Stanford Core NLP pipeline. You can change these models individually using `StanfordCoreNLP.set_model` (see above).
 
 * 'pos.model' - 'english-left3words-distsim.tagger'
-* 'ner.model
-* 'ner.model.7class' - 'muc.7class.distsim.crf.ser.gz'
-* 'ner.model.MISCclass' -- 'conll.4class.distsim.crf.ser.gz'
+* 'ner.model' - 'all.3class.distsim.crf.ser.gz'
 * 'parse.model' - 'englishPCFG.ser.gz'
 * 'dcoref.demonym' - 'demonyms.txt'
 * 'dcoref.animate' - 'animate.unigrams.txt'
@@ -137,4 +139,7 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
 
 **Contributing**
 
-
+Simple.
+
+1. Fork the project.
+2. Send me a pull request!
data/lib/stanford-core-nlp.rb
CHANGED
@@ -2,7 +2,7 @@ require 'stanford-core-nlp/config'
 
 module StanfordCoreNLP
 
-VERSION = '0.5.
+VERSION = '0.5.1'
 
 require 'bind-it'
 extend BindIt::Binding
@@ -44,6 +44,8 @@ module StanfordCoreNLP
 ['CoreLabel', 'edu.stanford.nlp.ling'],
 ['MaxentTagger', 'edu.stanford.nlp.tagger.maxent'],
 ['CRFClassifier', 'edu.stanford.nlp.ie.crf'],
+['LexicalizedParser', 'edu.stanford.nlp.parser.lexparser'],
+['Options', 'edu.stanford.nlp.parser.lexparser'],
 ['Properties', 'java.util'],
 ['ArrayList', 'java.util'],
 ['AnnotationBridge', '']
@@ -111,11 +113,8 @@ module StanfordCoreNLP
 # Public API methods #
 # ########################### #
 
-
-
-# properties.
-def self.load(*annotators)
-
+def self.bind
+
 # Take care of Windows users.
 if self.running_on_windows?
 self.jar_path.gsub!('/', '\\')
@@ -123,14 +122,23 @@ module StanfordCoreNLP
 end
 
 # Make the bindings.
-
+super
 
 # Bind annotation bridge.
 self.default_classes.each do |info|
 klass = const_get(info.first)
 self.inject_get_method(klass)
 end
-
+
+end
+
+# Load a StanfordCoreNLP pipeline with the
+# specified JVM flags and StanfordCoreNLP
+# properties.
+def self.load(*annotators)
+
+self.bind unless self.bound
+
 # Prepend the JAR path to the model files.
 properties = {}
 self.model_files.each do |k,v|
metadata
CHANGED
@@ -1,87 +1,91 @@
 --- !ruby/object:Gem::Specification
 name: stanford-core-nlp
 version: !ruby/object:Gem::Version
-version: 0.5.
-prerelease:
+version: 0.5.1
+prerelease:
 platform: ruby
 authors:
 - Louis Mullie
-autorequire:
+autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2013-01-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: bind-it
-
-none: false
+version_requirements: !ruby/object:Gem::Requirement
 requirements:
 - - ~>
 - !ruby/object:Gem::Version
 version: 0.2.5
-type: :runtime
-prerelease: false
-version_requirements: !ruby/object:Gem::Requirement
 none: false
+requirement: !ruby/object:Gem::Requirement
 requirements:
 - - ~>
 - !ruby/object:Gem::Version
 version: 0.2.5
+none: false
+prerelease: false
+type: :runtime
 - !ruby/object:Gem::Dependency
 name: rspec
-
-none: false
+version_requirements: !ruby/object:Gem::Requirement
 requirements:
 - - ! '>='
 - !ruby/object:Gem::Version
-version:
-
-prerelease: false
-version_requirements: !ruby/object:Gem::Requirement
+version: !binary |-
+MA==
 none: false
+requirement: !ruby/object:Gem::Requirement
 requirements:
 - - ! '>='
 - !ruby/object:Gem::Version
-version:
-
-
-
-
+version: !binary |-
+MA==
+none: false
+prerelease: false
+type: :development
+description: " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
+\ language processing \ntools that provides tokenization, part-of-speech tagging\
+\ and parsing for several languages, as well as named entity \nrecognition and coreference\
+\ resolution for English. "
 email:
 - louis.mullie@gmail.com
 executables: []
 extensions: []
 extra_rdoc_files: []
 files:
+- lib/stanford-core-nlp.rb
 - lib/stanford-core-nlp/bridge.rb
 - lib/stanford-core-nlp/config.rb
-- lib/stanford-core-nlp.rb
 - bin/AnnotationBridge.java
 - bin/bridge.jar
 - README.md
 - LICENSE
 homepage: https://github.com/louismullie/stanford-core-nlp
 licenses: []
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
-none: false
 requirements:
 - - ! '>='
 - !ruby/object:Gem::Version
-version:
-
+version: !binary |-
+MA==
 none: false
+required_rubygems_version: !ruby/object:Gem::Requirement
 requirements:
 - - ! '>='
 - !ruby/object:Gem::Version
-version:
+version: !binary |-
+MA==
+none: false
 requirements: []
-rubyforge_project:
+rubyforge_project:
 rubygems_version: 1.8.24
-signing_key:
+signing_key:
 specification_version: 3
 summary: Ruby bindings to the Stanford Core NLP tools.
 test_files: []