stanford-core-nlp 0.5.1 → 0.5.3
- checksums.yaml +7 -0
- data/README.md +94 -20
- data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java +28 -0
- data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java +58 -0
- data/lib/stanford-core-nlp.rb +24 -24
- data/lib/stanford-core-nlp/config.rb +6 -4
- data/lib/stanford-core-nlp/version.rb +3 -0
- metadata +30 -51
- data/bin/bridge.jar +0 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: f58c7f886c5f1f7fae0ff5915adb48c9fd9dc8df
|
4
|
+
data.tar.gz: 8d0b101b58a6584f3f19ed714a9e9c004d73acc8
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: c47c18b4d11651d6e64f48a87d9a5318b5506a92f2dc0ec3453e31bbc1a75f96b7cef72b09dfc77eba98f35b8779bc0dc0ab1befef96107e3611ab2dd34fec65
|
7
|
+
data.tar.gz: 4ff45b83b273e2b7bc29a6614124f857c016fed38c2382a064db8a355c54d998a5c850d60ed204a76aef3ed277c0c4513eabb7cae98a815d5f2d43c0fa1b4c2c
|
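For context, `checksums.yaml` in a packaged gem records digests of the archive members `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be recomputed with Ruby's standard `digest` library follows. The helper name `digests_for` is hypothetical, and extracting the two members from the `.gem` archive is left out.

```ruby
require 'digest'

# Hypothetical helper (not part of the gem): compute the two digest kinds
# listed in checksums.yaml for a string of raw bytes.
def digests_for(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Usage: pass the raw contents of data.tar.gz or metadata.gz instead of this
# placeholder string, then compare the result with the values in checksums.yaml.
digests_for('example bytes').each do |algorithm, hex|
  puts "#{algorithm}: #{hex}"
end
```

Comparing the printed hex strings with the recorded ones is enough to confirm that an archive member has not changed.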
data/README.md
CHANGED
@@ -1,21 +1,25 @@
|
|
1
|
-
[![Build Status](https://secure.travis-ci.org/louismullie/stanford-core-nlp.png)](http://travis-ci.org/louismullie/stanford-core-nlp)
|
1
|
+
# Stanford CoreNLP [![Build Status](https://secure.travis-ci.org/louismullie/stanford-core-nlp.png)](http://travis-ci.org/louismullie/stanford-core-nlp) [![Awesome RubyNLP](https://img.shields.io/badge/Awesome-RubyNLP-brightgreen.svg)](https://github.com/arbox/nlp-with-ruby)
|
2
2
|
|
3
|
-
|
4
|
-
|
5
|
-
This gem provides high-level Ruby bindings to the [Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml), a set natural language processing tools for tokenization, sentence segmentation, part-of-speech tagging, lemmatization, and parsing of English, French and German. The package also provides named entity recognition and coreference resolution for English.
|
3
|
+
> Ruby bindings for the Stanford [CoreNLP Toolchain](http://stanfordnlp.github.io/CoreNLP/).
|
6
4
|
|
7
|
-
This gem
|
5
|
+
This gem provides high-level Ruby bindings to the
|
6
|
+
[Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml),
|
7
|
+
a set natural language processing tools for tokenization, sentence segmentation,
|
8
|
+
part-of-speech tagging, lemmatization, and parsing of English, French and German.
|
9
|
+
The package also provides named entity recognition and coreference resolution for English.
|
8
10
|
|
9
|
-
|
11
|
+
This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1.
|
12
|
+
It is tested on both Java 6 and Java 7.
|
13
|
+
Newer Ruby version should work as well.
|
10
14
|
|
11
|
-
|
15
|
+
## Installation
|
12
16
|
|
13
|
-
|
14
|
-
|
17
|
+
First, install the gem: `gem install stanford-core-nlp`.
|
18
|
+
Then, download the Stanford Core NLP JAR and model files: [Stanford CoreNLP](http://nlp.stanford.edu/software/stanford-postagger-full-2014-10-26.zip)
|
15
19
|
|
16
20
|
Place the contents of the extracted archive inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/).
|
17
21
|
|
18
|
-
|
22
|
+
## Configuration
|
19
23
|
|
20
24
|
You may want to set some optional configuration options. Here are some examples:
|
21
25
|
|
@@ -40,7 +44,7 @@ StanfordCoreNLP.log_file = 'log.txt'
|
|
40
44
|
StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
|
41
45
|
```
|
42
46
|
|
43
|
-
|
47
|
+
## Using the gem
|
44
48
|
|
45
49
|
```ruby
|
46
50
|
# Use the model files for a different language than English.
|
@@ -71,7 +75,7 @@ text.get(:sentences).each do |sentence|
|
|
71
75
|
puts token.get(:named_entity_tag).to_s
|
72
76
|
# Coreference
|
73
77
|
puts token.get(:coref_cluster_id).to_s
|
74
|
-
# Also of interest: coref, coref_chain,
|
78
|
+
# Also of interest: coref, coref_chain,
|
75
79
|
# coref_cluster, coref_dest, coref_graph.
|
76
80
|
end
|
77
81
|
end
|
@@ -79,9 +83,13 @@ end
|
|
79
83
|
|
80
84
|
> Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
|
81
85
|
|
82
|
-
The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation
|
86
|
+
The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation
|
87
|
+
class is the `snake_case` of the class name, with 'Annotation' at the end removed.
|
88
|
+
For example, `NamedEntityTagAnnotation` translates to `:named_entity_tag`,
|
89
|
+
`PartOfSpeechAnnotation` to `:part_of_speech`, etc.
|
83
90
|
|
84
|
-
A good reference for names of annotations are the Stanford Javadocs
|
91
|
+
A good reference for names of annotations are the Stanford Javadocs
|
92
|
+
for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the `config.rb` file inside the gem.
|
85
93
|
|
86
94
|
|
87
95
|
**Loading specific classes**
|
@@ -90,17 +98,17 @@ You may want to load additional Java classes (including any class from the Stanf
|
|
90
98
|
|
91
99
|
```ruby
|
92
100
|
# Default base class is edu.stanford.nlp.pipeline.
|
93
|
-
StanfordCoreNLP.load_class('PTBTokenizerAnnotator')
|
101
|
+
StanfordCoreNLP.load_class('PTBTokenizerAnnotator')
|
94
102
|
puts StanfordCoreNLP::PTBTokenizerAnnotator.inspect
|
95
103
|
# => #<Rjb::Edu_stanford_nlp_pipeline_PTBTokenizerAnnotator>
|
96
104
|
|
97
105
|
# Here, we specify another base class.
|
98
|
-
StanfordCoreNLP.load_class('MaxentTagger', 'edu.stanford.nlp.tagger')
|
106
|
+
StanfordCoreNLP.load_class('MaxentTagger', 'edu.stanford.nlp.tagger')
|
99
107
|
puts StanfordCoreNLP::MaxentTagger.inspect
|
100
108
|
# => <Rjb::Edu_stanford_nlp_tagger_maxent_MaxentTagger:0x007f88491e2020>
|
101
109
|
```
|
102
110
|
|
103
|
-
|
111
|
+
## List of annotator classes
|
104
112
|
|
105
113
|
Here is a full list of annotator classes provided by the Stanford Core NLP package. You can load these classes individually using `StanfordCoreNLP.load_class` (see above). Once this is done, you can use them like you would from a Java program. Refer to the Java documentation for a list of functions provided by each of these classes.
|
106
114
|
|
@@ -119,7 +127,7 @@ Here is a full list of annotator classes provided by the Stanford Core NLP packa
|
|
119
127
|
* DeterministicCorefAnnotator - implements anaphora resolution using a deterministic model.
|
120
128
|
* NFLAnnotator - implements entity and relation mention extraction for the NFL domain.
|
121
129
|
|
122
|
-
|
130
|
+
## List of model files
|
123
131
|
|
124
132
|
Here is a full list of the default models for the Stanford Core NLP pipeline. You can change these models individually using `StanfordCoreNLP.set_model` (see above).
|
125
133
|
|
@@ -137,9 +145,75 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
|
|
137
145
|
* 'dcoref.states' - 'state-abbreviations.txt'
|
138
146
|
* 'dcoref.extra.gender' - 'namegender.combine.txt'
|
139
147
|
|
140
|
-
|
148
|
+
## Testing
|
149
|
+
|
150
|
+
To run the specs for each language (after copying the JARs into the `bin` folder):
|
151
|
+
|
152
|
+
``` shell
|
153
|
+
rake spec[english]
|
154
|
+
rake spec[german]
|
155
|
+
rake spec[french]
|
156
|
+
```
|
157
|
+
|
158
|
+
## Using the latest version of the Stanford CoreNLP
|
159
|
+
|
160
|
+
Using the latest version of the Stanford CoreNLP (version 3.5.0 as of 31/10/2014) requires some additional manual steps:
|
161
|
+
|
162
|
+
* Download [Stanford CoreNLP version 3.5.0](http://nlp.stanford.edu/software/stanford-corenlp-full-2014-10-31.zip) from http://nlp.stanford.edu/.
|
163
|
+
* Place the contents of the extracted archive inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/) or inside the directory location configured by setting StanfordCoreNLP.jar_path.
|
164
|
+
* Download [the full Stanford Tagger version 3.5.0](http://nlp.stanford.edu/software/stanford-postagger-full-2014-10-26.zip) from http://nlp.stanford.edu/.
|
165
|
+
* Make a directory named 'taggers' inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/) or inside the directory configured by setting StanfordCoreNLP.jar_path.
|
166
|
+
* Place the contents of the extracted archive inside taggers directory.
|
167
|
+
* Download [the bridge.jar file](https://github.com/louismullie/stanford-core-nlp/blob/master/bin/bridge.jar?raw=true) from https://github.com/louismullie/stanford-core-nlp.
|
168
|
+
* Place the downloaded bridger.jar file inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/taggers/) or inside the directory configured by setting StanfordCoreNLP.jar_path.
|
169
|
+
* Configure your setup (for English) as follows:
|
170
|
+
```ruby
|
171
|
+
StanfordCoreNLP.use :english
|
172
|
+
StanfordCoreNLP.model_files = {}
|
173
|
+
StanfordCoreNLP.default_jars = [
|
174
|
+
'joda-time.jar',
|
175
|
+
'xom.jar',
|
176
|
+
'stanford-corenlp-3.5.0.jar',
|
177
|
+
'stanford-corenlp-3.5.0-models.jar',
|
178
|
+
'jollyday.jar',
|
179
|
+
'bridge.jar'
|
180
|
+
]
|
181
|
+
end
|
182
|
+
```
|
183
|
+
Or configure your setup (for French) as follows:
|
184
|
+
```ruby
|
185
|
+
StanfordCoreNLP.use :french
|
186
|
+
StanfordCoreNLP.model_files = {}
|
187
|
+
StanfordCoreNLP.set_model('pos.model', 'french.tagger')
|
188
|
+
StanfordCoreNLP.default_jars = [
|
189
|
+
'joda-time.jar',
|
190
|
+
'xom.jar',
|
191
|
+
'stanford-corenlp-3.5.0.jar',
|
192
|
+
'stanford-corenlp-3.5.0-models.jar',
|
193
|
+
'jollyday.jar',
|
194
|
+
'bridge.jar'
|
195
|
+
]
|
196
|
+
end
|
197
|
+
```
|
198
|
+
Or configure your setup (for German) as follows:
|
199
|
+
```ruby
|
200
|
+
StanfordCoreNLP.use :german
|
201
|
+
StanfordCoreNLP.model_files = {}
|
202
|
+
StanfordCoreNLP.set_model('pos.model', 'german-fast.tagger')
|
203
|
+
StanfordCoreNLP.default_jars = [
|
204
|
+
'joda-time.jar',
|
205
|
+
'xom.jar',
|
206
|
+
'stanford-corenlp-3.5.0.jar',
|
207
|
+
'stanford-corenlp-3.5.0-models.jar',
|
208
|
+
'jollyday.jar',
|
209
|
+
'bridge.jar'
|
210
|
+
]
|
211
|
+
end
|
212
|
+
```
|
213
|
+
|
214
|
+
## Contributing
|
141
215
|
|
142
216
|
Simple.
|
143
217
|
|
144
218
|
1. Fork the project.
|
145
|
-
2. Send me a pull request!
|
219
|
+
2. Send me a pull request!
|
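The naming rule added to the README above (the Ruby symbol is the `snake_case` of the Java class name with the trailing 'Annotation' removed) can be sketched in plain Ruby. The helper `annotation_symbol` is hypothetical and is not the gem's own implementation. It only handles simple CamelCase names, and how the gem treats acronyms inside class names is not covered here.

```ruby
# Hypothetical helper (not part of the gem): derive the Ruby symbol that
# corresponds to a Java annotation class name, following the README rule.
def annotation_symbol(class_name)
  class_name
    .sub(/Annotation\z/, '')            # drop the trailing 'Annotation'
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # add '_' at each lower-to-upper boundary
    .downcase
    .to_sym
end

puts annotation_symbol('NamedEntityTagAnnotation').inspect
puts annotation_symbol('PartOfSpeechAnnotation').inspect
```

The two calls reproduce the README's own examples, `:named_entity_tag` and `:part_of_speech`.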
data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
|
2
|
+
import java.io.BufferedReader;
|
3
|
+
import java.io.FileReader;
|
4
|
+
import java.util.List;
|
5
|
+
|
6
|
+
import edu.stanford.nlp.ling.Sentence;
|
7
|
+
import edu.stanford.nlp.ling.TaggedWord;
|
8
|
+
import edu.stanford.nlp.ling.HasWord;
|
9
|
+
import edu.stanford.nlp.tagger.maxent.MaxentTagger;
|
10
|
+
|
11
|
+
class TaggerDemo {
|
12
|
+
|
13
|
+
private TaggerDemo() {}
|
14
|
+
|
15
|
+
public static void main(String[] args) throws Exception {
|
16
|
+
if (args.length != 2) {
|
17
|
+
System.err.println("usage: java TaggerDemo modelFile fileToTag");
|
18
|
+
return;
|
19
|
+
}
|
20
|
+
MaxentTagger tagger = new MaxentTagger(args[0]);
|
21
|
+
List<List<HasWord>> sentences = MaxentTagger.tokenizeText(new BufferedReader(new FileReader(args[1])));
|
22
|
+
for (List<HasWord> sentence : sentences) {
|
23
|
+
List<TaggedWord> tSentence = tagger.tagSentence(sentence);
|
24
|
+
System.out.println(Sentence.listToString(tSentence, false));
|
25
|
+
}
|
26
|
+
}
|
27
|
+
|
28
|
+
}
|
data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java
ADDED
@@ -0,0 +1,58 @@
|
|
1
|
+
|
2
|
+
import java.io.BufferedReader;
|
3
|
+
import java.io.FileInputStream;
|
4
|
+
import java.io.InputStreamReader;
|
5
|
+
import java.io.OutputStreamWriter;
|
6
|
+
import java.io.PrintWriter;
|
7
|
+
import java.util.List;
|
8
|
+
|
9
|
+
import edu.stanford.nlp.ling.Sentence;
|
10
|
+
import edu.stanford.nlp.ling.TaggedWord;
|
11
|
+
import edu.stanford.nlp.ling.HasWord;
|
12
|
+
import edu.stanford.nlp.ling.CoreLabel;
|
13
|
+
import edu.stanford.nlp.process.CoreLabelTokenFactory;
|
14
|
+
import edu.stanford.nlp.process.DocumentPreprocessor;
|
15
|
+
import edu.stanford.nlp.process.PTBTokenizer;
|
16
|
+
import edu.stanford.nlp.process.TokenizerFactory;
|
17
|
+
import edu.stanford.nlp.tagger.maxent.MaxentTagger;
|
18
|
+
|
19
|
+
/** This demo shows user-provided sentences (i.e., {@code List<HasWord>})
|
20
|
+
* being tagged by the tagger. The sentences are generated by direct use
|
21
|
+
* of the DocumentPreprocessor class.
|
22
|
+
*
|
23
|
+
* @author Christopher Manning
|
24
|
+
*/
|
25
|
+
class TaggerDemo2 {
|
26
|
+
|
27
|
+
private TaggerDemo2() {}
|
28
|
+
|
29
|
+
public static void main(String[] args) throws Exception {
|
30
|
+
if (args.length != 2) {
|
31
|
+
System.err.println("usage: java TaggerDemo2 modelFile fileToTag");
|
32
|
+
return;
|
33
|
+
}
|
34
|
+
MaxentTagger tagger = new MaxentTagger(args[0]);
|
35
|
+
TokenizerFactory<CoreLabel> ptbTokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(),
|
36
|
+
"untokenizable=noneKeep");
|
37
|
+
BufferedReader r = new BufferedReader(new InputStreamReader(new FileInputStream(args[1]), "utf-8"));
|
38
|
+
PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, "utf-8"));
|
39
|
+
DocumentPreprocessor documentPreprocessor = new DocumentPreprocessor(r);
|
40
|
+
documentPreprocessor.setTokenizerFactory(ptbTokenizerFactory);
|
41
|
+
for (List<HasWord> sentence : documentPreprocessor) {
|
42
|
+
List<TaggedWord> tSentence = tagger.tagSentence(sentence);
|
43
|
+
pw.println(Sentence.listToString(tSentence, false));
|
44
|
+
}
|
45
|
+
|
46
|
+
// print the adjectives in one more sentence. This shows how to get at words and tags in a tagged sentence.
|
47
|
+
List<HasWord> sent = Sentence.toWordList("The", "slimy", "slug", "crawled", "over", "the", "long", ",", "green", "grass", ".");
|
48
|
+
List<TaggedWord> taggedSent = tagger.tagSentence(sent);
|
49
|
+
for (TaggedWord tw : taggedSent) {
|
50
|
+
if (tw.tag().startsWith("JJ")) {
|
51
|
+
pw.println(tw.word());
|
52
|
+
}
|
53
|
+
}
|
54
|
+
|
55
|
+
pw.close();
|
56
|
+
}
|
57
|
+
|
58
|
+
}
|
data/lib/stanford-core-nlp.rb
CHANGED
@@ -1,10 +1,7 @@
|
|
1
|
+
require 'bind-it'
|
1
2
|
require 'stanford-core-nlp/config'
|
2
3
|
|
3
4
|
module StanfordCoreNLP
|
4
|
-
|
5
|
-
VERSION = '0.5.1'
|
6
|
-
|
7
|
-
require 'bind-it'
|
8
5
|
extend BindIt::Binding
|
9
6
|
|
10
7
|
# ############################ #
|
@@ -29,9 +26,7 @@ module StanfordCoreNLP
|
|
29
26
|
StanfordCoreNLP.default_jars = [
|
30
27
|
'joda-time.jar',
|
31
28
|
'xom.jar',
|
32
|
-
'stanford-parser.jar',
|
33
29
|
'stanford-corenlp.jar',
|
34
|
-
'stanford-segmenter.jar',
|
35
30
|
'jollyday.jar',
|
36
31
|
'bridge.jar'
|
37
32
|
]
|
@@ -57,7 +52,7 @@ module StanfordCoreNLP
|
|
57
52
|
|
58
53
|
require 'stanford-core-nlp/bridge'
|
59
54
|
extend StanfordCoreNLP::Bridge
|
60
|
-
|
55
|
+
|
61
56
|
class << self
|
62
57
|
# The model file names for a given language.
|
63
58
|
attr_accessor :model_files
|
@@ -65,13 +60,17 @@ module StanfordCoreNLP
|
|
65
60
|
attr_accessor :model_path
|
66
61
|
# Store the language currently being used.
|
67
62
|
attr_accessor :language
|
63
|
+
#Custom properties
|
64
|
+
attr_accessor :custom_properties
|
68
65
|
end
|
69
66
|
|
67
|
+
self.custom_properties = {}
|
68
|
+
|
70
69
|
# The path to the main folder containing the folders
|
71
70
|
# with the individual models inside. By default, this
|
72
71
|
# is the same as the JAR path.
|
73
72
|
self.model_path = self.jar_path
|
74
|
-
|
73
|
+
|
75
74
|
# ########################### #
|
76
75
|
# Public configuration params #
|
77
76
|
# ########################### #
|
@@ -102,7 +101,7 @@ module StanfordCoreNLP
|
|
102
101
|
|
103
102
|
# Use english by default.
|
104
103
|
self.use :english
|
105
|
-
|
104
|
+
|
106
105
|
# Set a model file.
|
107
106
|
def self.set_model(name, file)
|
108
107
|
n = name.split('.')[0].intern
|
@@ -114,7 +113,7 @@ module StanfordCoreNLP
|
|
114
113
|
# ########################### #
|
115
114
|
|
116
115
|
def self.bind
|
117
|
-
|
116
|
+
|
118
117
|
# Take care of Windows users.
|
119
118
|
if self.running_on_windows?
|
120
119
|
self.jar_path.gsub!('/', '\\')
|
@@ -129,16 +128,16 @@ module StanfordCoreNLP
|
|
129
128
|
klass = const_get(info.first)
|
130
129
|
self.inject_get_method(klass)
|
131
130
|
end
|
132
|
-
|
131
|
+
|
133
132
|
end
|
134
|
-
|
133
|
+
|
135
134
|
# Load a StanfordCoreNLP pipeline with the
|
136
135
|
# specified JVM flags and StanfordCoreNLP
|
137
136
|
# properties.
|
138
137
|
def self.load(*annotators)
|
139
|
-
|
138
|
+
|
140
139
|
self.bind unless self.bound
|
141
|
-
|
140
|
+
|
142
141
|
# Prepend the JAR path to the model files.
|
143
142
|
properties = {}
|
144
143
|
self.model_files.each do |k,v|
|
@@ -156,7 +155,7 @@ module StanfordCoreNLP
|
|
156
155
|
end
|
157
156
|
properties[k] = f
|
158
157
|
end
|
159
|
-
|
158
|
+
|
160
159
|
properties['annotators'] = annotators.map { |x| x.to_s }.join(', ')
|
161
160
|
|
162
161
|
unless self.language == :english
|
@@ -168,45 +167,46 @@ module StanfordCoreNLP
|
|
168
167
|
# Otherswise throws java.lang.NullPointerException: null.
|
169
168
|
properties['parse.buildgraphs'] = 'false'
|
170
169
|
end
|
171
|
-
|
170
|
+
|
172
171
|
# Bug fix for NER system. Otherwise throws:
|
173
172
|
# Error initializing binder 1 at edu.stanford.
|
174
173
|
# nlp.time.Options.<init>(Options.java:88)
|
175
174
|
properties['sutime.binders'] = '0'
|
176
|
-
|
175
|
+
|
177
176
|
# Manually include SUTime models.
|
178
177
|
if annotators.include?(:ner)
|
179
|
-
properties['sutime.rules'] =
|
178
|
+
properties['sutime.rules'] =
|
180
179
|
self.model_path + 'sutime/defs.sutime.txt, ' +
|
181
180
|
self.model_path + 'sutime/english.sutime.txt'
|
182
181
|
end
|
183
|
-
|
182
|
+
|
184
183
|
props = get_properties(properties)
|
185
|
-
|
184
|
+
|
186
185
|
# Hack for Java7 compatibility.
|
187
186
|
bridge = const_get(:AnnotationBridge)
|
188
187
|
bridge.getPipelineWithProperties(props)
|
189
188
|
|
190
189
|
end
|
191
|
-
|
190
|
+
|
192
191
|
# Hack in order not to break backwards compatibility.
|
193
192
|
def self.const_missing(const)
|
194
193
|
if const == :Text
|
195
194
|
puts "WARNING: StanfordCoreNLP::Text has been deprecated." +
|
196
195
|
"Please use StanfordCoreNLP::Annotation instead."
|
197
196
|
Annotation
|
198
|
-
else
|
197
|
+
else
|
199
198
|
super(const)
|
200
199
|
end
|
201
200
|
end
|
202
201
|
|
203
202
|
private
|
204
|
-
|
203
|
+
|
205
204
|
# Create a java.util.Properties object from a hash.
|
206
205
|
def self.get_properties(properties)
|
206
|
+
properties = properties.merge(self.custom_properties)
|
207
207
|
props = Properties.new
|
208
208
|
properties.each do |property, value|
|
209
|
-
props.set_property(property, value)
|
209
|
+
props.set_property(property.to_s, value.to_s)
|
210
210
|
end
|
211
211
|
props
|
212
212
|
end
|
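The change to `get_properties` above merges `custom_properties` into the computed properties and converts every key and value with `to_s` before calling `set_property`. A minimal sketch of that behavior, using a plain Hash in place of `java.util.Properties`, might look as follows. The method name `stringified_properties` and the sample values are illustrative assumptions, not part of the gem.

```ruby
# Hypothetical stand-in for the gem's private get_properties: it returns a
# plain Hash of strings instead of a java.util.Properties object.
def stringified_properties(defaults, custom)
  # Entries in `custom` win over `defaults` when both use the same key.
  defaults.merge(custom).each_with_object({}) do |(key, value), out|
    out[key.to_s] = value.to_s  # Java properties only hold strings
  end
end

defaults = { 'annotators' => 'tokenize, ssplit', 'sutime.binders' => '0' }
custom   = { 'ssplit.eolonly' => true, 'sutime.binders' => 1 }

stringified_properties(defaults, custom).each do |name, value|
  puts "#{name} = #{value}"
end
```

With the accessor introduced in this diff, a caller would presumably set something like `StanfordCoreNLP.custom_properties['ssplit.eolonly'] = true` before calling `StanfordCoreNLP.load`. The `to_s` conversion is what allows non-string values such as `true` or `1` to be passed.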
data/lib/stanford-core-nlp/config.rb
CHANGED
@@ -41,7 +41,7 @@ module StanfordCoreNLP
|
|
41
41
|
},
|
42
42
|
|
43
43
|
:ner => {
|
44
|
-
:english => 'all.3class.distsim.crf.ser.gz'
|
44
|
+
:english => 'english.all.3class.distsim.crf.ser.gz'
|
45
45
|
# :german => {} # Add this at some point.
|
46
46
|
},
|
47
47
|
|
@@ -58,7 +58,8 @@ module StanfordCoreNLP
|
|
58
58
|
'states' => 'state-abbreviations.txt',
|
59
59
|
'countries' => 'countries',
|
60
60
|
'states.provinces' => 'statesandprovinces',
|
61
|
-
'extra.gender' => 'namegender.combine.txt'
|
61
|
+
'extra.gender' => 'namegender.combine.txt',
|
62
|
+
'singleton.predictor' => 'singleton.predictor.ser'
|
62
63
|
},
|
63
64
|
:german => {},
|
64
65
|
:french => {}
|
@@ -351,7 +352,7 @@ module StanfordCoreNLP
|
|
351
352
|
'ConstraintAnnotation'
|
352
353
|
],
|
353
354
|
|
354
|
-
'nlp.
|
355
|
+
'nlp.semgraph.SemanticGraphCoreAnnotations' => [
|
355
356
|
'BasicDependenciesAnnotation',
|
356
357
|
'CollapsedCCProcessedDependenciesAnnotation',
|
357
358
|
'CollapsedDependenciesAnnotation'
|
@@ -364,7 +365,8 @@ module StanfordCoreNLP
|
|
364
365
|
|
365
366
|
'nlp.time.TimeExpression' => [
|
366
367
|
'Annotation',
|
367
|
-
'ChildrenAnnotation'
|
368
|
+
'ChildrenAnnotation',
|
369
|
+
'TimeIndexAnnotation'
|
368
370
|
],
|
369
371
|
|
370
372
|
'nlp.trees.TreeCoreAnnotations' => [
|
metadata
CHANGED
@@ -1,91 +1,70 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: stanford-core-nlp
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.5.
|
5
|
-
prerelease:
|
4
|
+
version: 0.5.3
|
6
5
|
platform: ruby
|
7
6
|
authors:
|
8
7
|
- Louis Mullie
|
9
|
-
autorequire:
|
8
|
+
autorequire:
|
10
9
|
bindir: bin
|
11
10
|
cert_chain: []
|
12
|
-
date:
|
11
|
+
date: 2016-12-24 00:00:00.000000000 Z
|
13
12
|
dependencies:
|
14
13
|
- !ruby/object:Gem::Dependency
|
15
14
|
name: bind-it
|
16
|
-
version_requirements: !ruby/object:Gem::Requirement
|
17
|
-
requirements:
|
18
|
-
- - ~>
|
19
|
-
- !ruby/object:Gem::Version
|
20
|
-
version: 0.2.5
|
21
|
-
none: false
|
22
15
|
requirement: !ruby/object:Gem::Requirement
|
23
16
|
requirements:
|
24
|
-
- - ~>
|
17
|
+
- - "~>"
|
25
18
|
- !ruby/object:Gem::Version
|
26
|
-
version: 0.2.
|
27
|
-
none: false
|
28
|
-
prerelease: false
|
19
|
+
version: 0.2.7
|
29
20
|
type: :runtime
|
30
|
-
|
31
|
-
name: rspec
|
21
|
+
prerelease: false
|
32
22
|
version_requirements: !ruby/object:Gem::Requirement
|
33
23
|
requirements:
|
34
|
-
- -
|
35
|
-
- !ruby/object:Gem::Version
|
36
|
-
version: !binary |-
|
37
|
-
MA==
|
38
|
-
none: false
|
39
|
-
requirement: !ruby/object:Gem::Requirement
|
40
|
-
requirements:
|
41
|
-
- - ! '>='
|
24
|
+
- - "~>"
|
42
25
|
- !ruby/object:Gem::Version
|
43
|
-
version:
|
44
|
-
|
45
|
-
|
46
|
-
|
47
|
-
|
48
|
-
description: " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
|
49
|
-
\ language processing \ntools that provides tokenization, part-of-speech tagging\
|
50
|
-
\ and parsing for several languages, as well as named entity \nrecognition and coreference\
|
51
|
-
\ resolution for English. "
|
26
|
+
version: 0.2.7
|
27
|
+
description: High-level Ruby bindings to the Stanford CoreNLP package, a set natural
|
28
|
+
language processing tools that provides tokenization, part-of-speech tagging and
|
29
|
+
parsing for several languages, as well as named entity recognition and coreference
|
30
|
+
resolution for English, German, French and other languages.
|
52
31
|
email:
|
53
32
|
- louis.mullie@gmail.com
|
54
33
|
executables: []
|
55
34
|
extensions: []
|
56
35
|
extra_rdoc_files: []
|
57
36
|
files:
|
37
|
+
- LICENSE
|
38
|
+
- README.md
|
39
|
+
- bin/AnnotationBridge.java
|
40
|
+
- bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java
|
41
|
+
- bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java
|
58
42
|
- lib/stanford-core-nlp.rb
|
59
43
|
- lib/stanford-core-nlp/bridge.rb
|
60
44
|
- lib/stanford-core-nlp/config.rb
|
61
|
-
-
|
62
|
-
- bin/bridge.jar
|
63
|
-
- README.md
|
64
|
-
- LICENSE
|
45
|
+
- lib/stanford-core-nlp/version.rb
|
65
46
|
homepage: https://github.com/louismullie/stanford-core-nlp
|
66
|
-
licenses:
|
67
|
-
|
47
|
+
licenses:
|
48
|
+
- GPL-3.0
|
49
|
+
metadata: {}
|
50
|
+
post_install_message:
|
68
51
|
rdoc_options: []
|
69
52
|
require_paths:
|
70
53
|
- lib
|
71
54
|
required_ruby_version: !ruby/object:Gem::Requirement
|
72
55
|
requirements:
|
73
|
-
- -
|
56
|
+
- - ">="
|
74
57
|
- !ruby/object:Gem::Version
|
75
|
-
version:
|
76
|
-
MA==
|
77
|
-
none: false
|
58
|
+
version: '0'
|
78
59
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
79
60
|
requirements:
|
80
|
-
- -
|
61
|
+
- - ">="
|
81
62
|
- !ruby/object:Gem::Version
|
82
|
-
version:
|
83
|
-
MA==
|
84
|
-
none: false
|
63
|
+
version: '0'
|
85
64
|
requirements: []
|
86
|
-
rubyforge_project:
|
87
|
-
rubygems_version:
|
88
|
-
signing_key:
|
89
|
-
specification_version:
|
65
|
+
rubyforge_project:
|
66
|
+
rubygems_version: 2.5.1
|
67
|
+
signing_key:
|
68
|
+
specification_version: 4
|
90
69
|
summary: Ruby bindings to the Stanford Core NLP tools.
|
91
70
|
test_files: []
|
data/bin/bridge.jar
DELETED
Binary file
|