stanford-core-nlp 0.5.1 → 0.5.3
- checksums.yaml +7 -0
- data/README.md +94 -20
- data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java +28 -0
- data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java +58 -0
- data/lib/stanford-core-nlp.rb +24 -24
- data/lib/stanford-core-nlp/config.rb +6 -4
- data/lib/stanford-core-nlp/version.rb +3 -0
- metadata +30 -51
- data/bin/bridge.jar +0 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: f58c7f886c5f1f7fae0ff5915adb48c9fd9dc8df
+  data.tar.gz: 8d0b101b58a6584f3f19ed714a9e9c004d73acc8
+SHA512:
+  metadata.gz: c47c18b4d11651d6e64f48a87d9a5318b5506a92f2dc0ec3453e31bbc1a75f96b7cef72b09dfc77eba98f35b8779bc0dc0ab1befef96107e3611ab2dd34fec65
+  data.tar.gz: 4ff45b83b273e2b7bc29a6614124f857c016fed38c2382a064db8a355c54d998a5c850d60ed204a76aef3ed277c0c4513eabb7cae98a815d5f2d43c0fa1b4c2c
|
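The SHA1 and SHA512 entries in `checksums.yaml` can be checked against a locally downloaded copy of the gem. The following is a minimal sketch, assuming the `.gem` archive has already been unpacked so that `metadata.gz` and `data.tar.gz` are available as plain files. `checksum_matches?` is a hypothetical helper name used for illustration; it is not provided by the gem or by RubyGems.

```ruby
require 'digest'

# Hypothetical helper (not part of the gem): returns true when the digest of
# the file at `path` equals `expected_hex`. The comparison ignores the case
# of the expected hex string. `algorithm` is any Digest class, for example
# Digest::SHA1 or Digest::SHA512.
def checksum_matches?(path, expected_hex, algorithm = Digest::SHA512)
  algorithm.file(path).hexdigest == expected_hex.downcase
end
```

With the 0.5.3 archive unpacked, a call such as `checksum_matches?('data.tar.gz', '8d0b101b58a6584f3f19ed714a9e9c004d73acc8', Digest::SHA1)` would be expected to return `true` if the file is intact.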
data/README.md
CHANGED
@@ -1,21 +1,25 @@
|
|
1
|
-
[](http://travis-ci.org/louismullie/stanford-core-nlp)
|
1
|
+
# Stanford CoreNLP [](http://travis-ci.org/louismullie/stanford-core-nlp) [](https://github.com/arbox/nlp-with-ruby)
|
2
2
|
|
3
|
-
|
4
|
-
|
5
|
-
This gem provides high-level Ruby bindings to the [Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml), a set natural language processing tools for tokenization, sentence segmentation, part-of-speech tagging, lemmatization, and parsing of English, French and German. The package also provides named entity recognition and coreference resolution for English.
|
3
|
+
> Ruby bindings for the Stanford [CoreNLP Toolchain](http://stanfordnlp.github.io/CoreNLP/).
|
6
4
|
|
7
|
-
This gem
|
5
|
+
This gem provides high-level Ruby bindings to the
|
6
|
+
[Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml),
|
7
|
+
a set of natural language processing tools for tokenization, sentence segmentation,
|
8
|
+
part-of-speech tagging, lemmatization, and parsing of English, French and German.
|
9
|
+
The package also provides named entity recognition and coreference resolution for English.
|
8
10
|
|
9
|
-
|
11
|
+
This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1.
|
12
|
+
It is tested on both Java 6 and Java 7.
|
13
|
+
Newer Ruby versions should work as well.
|
10
14
|
|
11
|
-
|
15
|
+
## Installation
|
12
16
|
|
13
|
-
|
14
|
-
|
17
|
+
First, install the gem: `gem install stanford-core-nlp`.
|
18
|
+
Then, download the Stanford Core NLP JAR and model files: [Stanford CoreNLP](http://nlp.stanford.edu/software/stanford-postagger-full-2014-10-26.zip)
|
15
19
|
|
16
20
|
Place the contents of the extracted archive inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/).
|
17
21
|
|
18
|
-
|
22
|
+
## Configuration
|
19
23
|
|
20
24
|
You may want to set some optional configuration options. Here are some examples:
|
21
25
|
|
@@ -40,7 +44,7 @@ StanfordCoreNLP.log_file = 'log.txt'
|
|
40
44
|
StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
|
41
45
|
```
|
42
46
|
|
43
|
-
|
47
|
+
## Using the gem
|
44
48
|
|
45
49
|
```ruby
|
46
50
|
# Use the model files for a different language than English.
|
@@ -71,7 +75,7 @@ text.get(:sentences).each do |sentence|
|
|
71
75
|
puts token.get(:named_entity_tag).to_s
|
72
76
|
# Coreference
|
73
77
|
puts token.get(:coref_cluster_id).to_s
|
74
|
-
# Also of interest: coref, coref_chain,
|
78
|
+
# Also of interest: coref, coref_chain,
|
75
79
|
# coref_cluster, coref_dest, coref_graph.
|
76
80
|
end
|
77
81
|
end
|
@@ -79,9 +83,13 @@ end
|
|
79
83
|
|
80
84
|
> Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
|
81
85
|
|
82
|
-
The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation
|
86
|
+
The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation
|
87
|
+
class is the `snake_case` of the class name, with 'Annotation' at the end removed.
|
88
|
+
For example, `NamedEntityTagAnnotation` translates to `:named_entity_tag`,
|
89
|
+
`PartOfSpeechAnnotation` to `:part_of_speech`, etc.
|
83
90
|
|
84
|
-
A good reference for names of annotations are the Stanford Javadocs
|
91
|
+
Good references for the names of annotations are the Stanford Javadocs
|
92
|
+
for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CorefCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the `config.rb` file inside the gem.
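The naming rule described in the README (drop the trailing `Annotation`, then convert the class name to `snake_case`) can be sketched in plain Ruby. `annotation_symbol` is a hypothetical helper written for illustration only; the gem's own conversion may treat class names that contain acronyms (such as `CollapsedCCProcessedDependenciesAnnotation`) differently.

```ruby
# Hypothetical helper (not part of the gem): derive the Ruby symbol for a
# Java annotation class name by removing the trailing 'Annotation' and
# inserting an underscore at each lowercase-to-uppercase boundary.
def annotation_symbol(class_name)
  class_name
    .sub(/Annotation\z/, '')
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')
    .downcase
    .to_sym
end

puts annotation_symbol('NamedEntityTagAnnotation').inspect
puts annotation_symbol('PartOfSpeechAnnotation').inspect
```

The two calls correspond to the README's own examples, `:named_entity_tag` and `:part_of_speech`.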
|
85
93
|
|
86
94
|
|
87
95
|
**Loading specific classes**
|
@@ -90,17 +98,17 @@ You may want to load additional Java classes (including any class from the Stanf
|
|
90
98
|
|
91
99
|
```ruby
|
92
100
|
# Default base class is edu.stanford.nlp.pipeline.
|
93
|
-
StanfordCoreNLP.load_class('PTBTokenizerAnnotator')
|
101
|
+
StanfordCoreNLP.load_class('PTBTokenizerAnnotator')
|
94
102
|
puts StanfordCoreNLP::PTBTokenizerAnnotator.inspect
|
95
103
|
# => #<Rjb::Edu_stanford_nlp_pipeline_PTBTokenizerAnnotator>
|
96
104
|
|
97
105
|
# Here, we specify another base class.
|
98
|
-
StanfordCoreNLP.load_class('MaxentTagger', 'edu.stanford.nlp.tagger')
|
106
|
+
StanfordCoreNLP.load_class('MaxentTagger', 'edu.stanford.nlp.tagger')
|
99
107
|
puts StanfordCoreNLP::MaxentTagger.inspect
|
100
108
|
# => <Rjb::Edu_stanford_nlp_tagger_maxent_MaxentTagger:0x007f88491e2020>
|
101
109
|
```
|
102
110
|
|
103
|
-
|
111
|
+
## List of annotator classes
|
104
112
|
|
105
113
|
Here is a full list of annotator classes provided by the Stanford Core NLP package. You can load these classes individually using `StanfordCoreNLP.load_class` (see above). Once this is done, you can use them like you would from a Java program. Refer to the Java documentation for a list of functions provided by each of these classes.
|
106
114
|
|
@@ -119,7 +127,7 @@ Here is a full list of annotator classes provided by the Stanford Core NLP packa
|
|
119
127
|
* DeterministicCorefAnnotator - implements anaphora resolution using a deterministic model.
|
120
128
|
* NFLAnnotator - implements entity and relation mention extraction for the NFL domain.
|
121
129
|
|
122
|
-
|
130
|
+
## List of model files
|
123
131
|
|
124
132
|
Here is a full list of the default models for the Stanford Core NLP pipeline. You can change these models individually using `StanfordCoreNLP.set_model` (see above).
|
125
133
|
|
@@ -137,9 +145,75 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
|
|
137
145
|
* 'dcoref.states' - 'state-abbreviations.txt'
|
138
146
|
* 'dcoref.extra.gender' - 'namegender.combine.txt'
|
139
147
|
|
140
|
-
|
148
|
+
## Testing
|
149
|
+
|
150
|
+
To run the specs for each language (after copying the JARs into the `bin` folder):
|
151
|
+
|
152
|
+
``` shell
|
153
|
+
rake spec[english]
|
154
|
+
rake spec[german]
|
155
|
+
rake spec[french]
|
156
|
+
```
|
157
|
+
|
158
|
+
## Using the latest version of the Stanford CoreNLP
|
159
|
+
|
160
|
+
Using the latest version of the Stanford CoreNLP (version 3.5.0 as of 31/10/2014) requires some additional manual steps:
|
161
|
+
|
162
|
+
* Download [Stanford CoreNLP version 3.5.0](http://nlp.stanford.edu/software/stanford-corenlp-full-2014-10-31.zip) from http://nlp.stanford.edu/.
|
163
|
+
* Place the contents of the extracted archive inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/) or inside the directory location configured by setting StanfordCoreNLP.jar_path.
|
164
|
+
* Download [the full Stanford Tagger version 3.5.0](http://nlp.stanford.edu/software/stanford-postagger-full-2014-10-26.zip) from http://nlp.stanford.edu/.
|
165
|
+
* Make a directory named 'taggers' inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/) or inside the directory configured by setting StanfordCoreNLP.jar_path.
|
166
|
+
* Place the contents of the extracted archive inside the taggers directory.
|
167
|
+
* Download [the bridge.jar file](https://github.com/louismullie/stanford-core-nlp/blob/master/bin/bridge.jar?raw=true) from https://github.com/louismullie/stanford-core-nlp.
|
168
|
+
* Place the downloaded bridge.jar file inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/) or inside the directory configured by setting StanfordCoreNLP.jar_path.
|
169
|
+
* Configure your setup (for English) as follows:
|
170
|
+
```ruby
|
171
|
+
StanfordCoreNLP.use :english
|
172
|
+
StanfordCoreNLP.model_files = {}
|
173
|
+
StanfordCoreNLP.default_jars = [
|
174
|
+
'joda-time.jar',
|
175
|
+
'xom.jar',
|
176
|
+
'stanford-corenlp-3.5.0.jar',
|
177
|
+
'stanford-corenlp-3.5.0-models.jar',
|
178
|
+
'jollyday.jar',
|
179
|
+
'bridge.jar'
|
180
|
+
]
|
181
|
+
|
182
|
+
```
|
183
|
+
Or configure your setup (for French) as follows:
|
184
|
+
```ruby
|
185
|
+
StanfordCoreNLP.use :french
|
186
|
+
StanfordCoreNLP.model_files = {}
|
187
|
+
StanfordCoreNLP.set_model('pos.model', 'french.tagger')
|
188
|
+
StanfordCoreNLP.default_jars = [
|
189
|
+
'joda-time.jar',
|
190
|
+
'xom.jar',
|
191
|
+
'stanford-corenlp-3.5.0.jar',
|
192
|
+
'stanford-corenlp-3.5.0-models.jar',
|
193
|
+
'jollyday.jar',
|
194
|
+
'bridge.jar'
|
195
|
+
]
|
196
|
+
|
197
|
+
```
|
198
|
+
Or configure your setup (for German) as follows:
|
199
|
+
```ruby
|
200
|
+
StanfordCoreNLP.use :german
|
201
|
+
StanfordCoreNLP.model_files = {}
|
202
|
+
StanfordCoreNLP.set_model('pos.model', 'german-fast.tagger')
|
203
|
+
StanfordCoreNLP.default_jars = [
|
204
|
+
'joda-time.jar',
|
205
|
+
'xom.jar',
|
206
|
+
'stanford-corenlp-3.5.0.jar',
|
207
|
+
'stanford-corenlp-3.5.0-models.jar',
|
208
|
+
'jollyday.jar',
|
209
|
+
'bridge.jar'
|
210
|
+
]
|
211
|
+
|
212
|
+
```
|
213
|
+
|
214
|
+
## Contributing
|
141
215
|
|
142
216
|
Simple.
|
143
217
|
|
144
218
|
1. Fork the project.
|
145
|
-
2. Send me a pull request!
|
219
|
+
2. Send me a pull request!
|
data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java
ADDED
@@ -0,0 +1,28 @@
+
+import java.io.BufferedReader;
+import java.io.FileReader;
+import java.util.List;
+
+import edu.stanford.nlp.ling.Sentence;
+import edu.stanford.nlp.ling.TaggedWord;
+import edu.stanford.nlp.ling.HasWord;
+import edu.stanford.nlp.tagger.maxent.MaxentTagger;
+
+class TaggerDemo {
+
+  private TaggerDemo() {}
+
+  public static void main(String[] args) throws Exception {
+    if (args.length != 2) {
+      System.err.println("usage: java TaggerDemo modelFile fileToTag");
+      return;
+    }
+    MaxentTagger tagger = new MaxentTagger(args[0]);
+    List<List<HasWord>> sentences = MaxentTagger.tokenizeText(new BufferedReader(new FileReader(args[1])));
+    for (List<HasWord> sentence : sentences) {
+      List<TaggedWord> tSentence = tagger.tagSentence(sentence);
+      System.out.println(Sentence.listToString(tSentence, false));
+    }
+  }
+
+}
|
data/bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java
ADDED
@@ -0,0 +1,58 @@
+
+import java.io.BufferedReader;
+import java.io.FileInputStream;
+import java.io.InputStreamReader;
+import java.io.OutputStreamWriter;
+import java.io.PrintWriter;
+import java.util.List;
+
+import edu.stanford.nlp.ling.Sentence;
+import edu.stanford.nlp.ling.TaggedWord;
+import edu.stanford.nlp.ling.HasWord;
+import edu.stanford.nlp.ling.CoreLabel;
+import edu.stanford.nlp.process.CoreLabelTokenFactory;
+import edu.stanford.nlp.process.DocumentPreprocessor;
+import edu.stanford.nlp.process.PTBTokenizer;
+import edu.stanford.nlp.process.TokenizerFactory;
+import edu.stanford.nlp.tagger.maxent.MaxentTagger;
+
+/** This demo shows user-provided sentences (i.e., {@code List<HasWord>})
+ *  being tagged by the tagger. The sentences are generated by direct use
+ *  of the DocumentPreprocessor class.
+ *
+ *  @author Christopher Manning
+ */
+class TaggerDemo2 {
+
+  private TaggerDemo2() {}
+
+  public static void main(String[] args) throws Exception {
+    if (args.length != 2) {
+      System.err.println("usage: java TaggerDemo2 modelFile fileToTag");
+      return;
+    }
+    MaxentTagger tagger = new MaxentTagger(args[0]);
+    TokenizerFactory<CoreLabel> ptbTokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(),
+        "untokenizable=noneKeep");
+    BufferedReader r = new BufferedReader(new InputStreamReader(new FileInputStream(args[1]), "utf-8"));
+    PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, "utf-8"));
+    DocumentPreprocessor documentPreprocessor = new DocumentPreprocessor(r);
+    documentPreprocessor.setTokenizerFactory(ptbTokenizerFactory);
+    for (List<HasWord> sentence : documentPreprocessor) {
+      List<TaggedWord> tSentence = tagger.tagSentence(sentence);
+      pw.println(Sentence.listToString(tSentence, false));
+    }
+
+    // print the adjectives in one more sentence. This shows how to get at words and tags in a tagged sentence.
+    List<HasWord> sent = Sentence.toWordList("The", "slimy", "slug", "crawled", "over", "the", "long", ",", "green", "grass", ".");
+    List<TaggedWord> taggedSent = tagger.tagSentence(sent);
+    for (TaggedWord tw : taggedSent) {
+      if (tw.tag().startsWith("JJ")) {
+        pw.println(tw.word());
+      }
+    }
+
+    pw.close();
+  }
+
+}
|
data/lib/stanford-core-nlp.rb
CHANGED
@@ -1,10 +1,7 @@
|
|
1
|
+
require 'bind-it'
|
1
2
|
require 'stanford-core-nlp/config'
|
2
3
|
|
3
4
|
module StanfordCoreNLP
|
4
|
-
|
5
|
-
VERSION = '0.5.1'
|
6
|
-
|
7
|
-
require 'bind-it'
|
8
5
|
extend BindIt::Binding
|
9
6
|
|
10
7
|
# ############################ #
|
@@ -29,9 +26,7 @@ module StanfordCoreNLP
|
|
29
26
|
StanfordCoreNLP.default_jars = [
|
30
27
|
'joda-time.jar',
|
31
28
|
'xom.jar',
|
32
|
-
'stanford-parser.jar',
|
33
29
|
'stanford-corenlp.jar',
|
34
|
-
'stanford-segmenter.jar',
|
35
30
|
'jollyday.jar',
|
36
31
|
'bridge.jar'
|
37
32
|
]
|
@@ -57,7 +52,7 @@ module StanfordCoreNLP
|
|
57
52
|
|
58
53
|
require 'stanford-core-nlp/bridge'
|
59
54
|
extend StanfordCoreNLP::Bridge
|
60
|
-
|
55
|
+
|
61
56
|
class << self
|
62
57
|
# The model file names for a given language.
|
63
58
|
attr_accessor :model_files
|
@@ -65,13 +60,17 @@ module StanfordCoreNLP
|
|
65
60
|
attr_accessor :model_path
|
66
61
|
# Store the language currently being used.
|
67
62
|
attr_accessor :language
|
63
|
+
# Custom properties.
|
64
|
+
attr_accessor :custom_properties
|
68
65
|
end
|
69
66
|
|
67
|
+
self.custom_properties = {}
|
68
|
+
|
70
69
|
# The path to the main folder containing the folders
|
71
70
|
# with the individual models inside. By default, this
|
72
71
|
# is the same as the JAR path.
|
73
72
|
self.model_path = self.jar_path
|
74
|
-
|
73
|
+
|
75
74
|
# ########################### #
|
76
75
|
# Public configuration params #
|
77
76
|
# ########################### #
|
@@ -102,7 +101,7 @@ module StanfordCoreNLP
|
|
102
101
|
|
103
102
|
# Use english by default.
|
104
103
|
self.use :english
|
105
|
-
|
104
|
+
|
106
105
|
# Set a model file.
|
107
106
|
def self.set_model(name, file)
|
108
107
|
n = name.split('.')[0].intern
|
@@ -114,7 +113,7 @@ module StanfordCoreNLP
|
|
114
113
|
# ########################### #
|
115
114
|
|
116
115
|
def self.bind
|
117
|
-
|
116
|
+
|
118
117
|
# Take care of Windows users.
|
119
118
|
if self.running_on_windows?
|
120
119
|
self.jar_path.gsub!('/', '\\')
|
@@ -129,16 +128,16 @@ module StanfordCoreNLP
|
|
129
128
|
klass = const_get(info.first)
|
130
129
|
self.inject_get_method(klass)
|
131
130
|
end
|
132
|
-
|
131
|
+
|
133
132
|
end
|
134
|
-
|
133
|
+
|
135
134
|
# Load a StanfordCoreNLP pipeline with the
|
136
135
|
# specified JVM flags and StanfordCoreNLP
|
137
136
|
# properties.
|
138
137
|
def self.load(*annotators)
|
139
|
-
|
138
|
+
|
140
139
|
self.bind unless self.bound
|
141
|
-
|
140
|
+
|
142
141
|
# Prepend the JAR path to the model files.
|
143
142
|
properties = {}
|
144
143
|
self.model_files.each do |k,v|
|
@@ -156,7 +155,7 @@ module StanfordCoreNLP
|
|
156
155
|
end
|
157
156
|
properties[k] = f
|
158
157
|
end
|
159
|
-
|
158
|
+
|
160
159
|
properties['annotators'] = annotators.map { |x| x.to_s }.join(', ')
|
161
160
|
|
162
161
|
unless self.language == :english
|
@@ -168,45 +167,46 @@ module StanfordCoreNLP
|
|
168
167
|
# Otherwise throws java.lang.NullPointerException: null.
|
169
168
|
properties['parse.buildgraphs'] = 'false'
|
170
169
|
end
|
171
|
-
|
170
|
+
|
172
171
|
# Bug fix for NER system. Otherwise throws:
|
173
172
|
# Error initializing binder 1 at edu.stanford.
|
174
173
|
# nlp.time.Options.<init>(Options.java:88)
|
175
174
|
properties['sutime.binders'] = '0'
|
176
|
-
|
175
|
+
|
177
176
|
# Manually include SUTime models.
|
178
177
|
if annotators.include?(:ner)
|
179
|
-
properties['sutime.rules'] =
|
178
|
+
properties['sutime.rules'] =
|
180
179
|
self.model_path + 'sutime/defs.sutime.txt, ' +
|
181
180
|
self.model_path + 'sutime/english.sutime.txt'
|
182
181
|
end
|
183
|
-
|
182
|
+
|
184
183
|
props = get_properties(properties)
|
185
|
-
|
184
|
+
|
186
185
|
# Hack for Java7 compatibility.
|
187
186
|
bridge = const_get(:AnnotationBridge)
|
188
187
|
bridge.getPipelineWithProperties(props)
|
189
188
|
|
190
189
|
end
|
191
|
-
|
190
|
+
|
192
191
|
# Hack in order not to break backwards compatibility.
|
193
192
|
def self.const_missing(const)
|
194
193
|
if const == :Text
|
195
194
|
puts "WARNING: StanfordCoreNLP::Text has been deprecated. " +
|
196
195
|
"Please use StanfordCoreNLP::Annotation instead."
|
197
196
|
Annotation
|
198
|
-
else
|
197
|
+
else
|
199
198
|
super(const)
|
200
199
|
end
|
201
200
|
end
|
202
201
|
|
203
202
|
private
|
204
|
-
|
203
|
+
|
205
204
|
# Create a java.util.Properties object from a hash.
|
206
205
|
def self.get_properties(properties)
|
206
|
+
properties = properties.merge(self.custom_properties)
|
207
207
|
props = Properties.new
|
208
208
|
properties.each do |property, value|
|
209
|
-
props.set_property(property, value)
|
209
|
+
props.set_property(property.to_s, value.to_s)
|
210
210
|
end
|
211
211
|
props
|
212
212
|
end
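The `get_properties` change above merges `StanfordCoreNLP.custom_properties` over the computed properties and converts every key and value to a string before calling `set_property`. The sketch below reproduces only that merge-and-stringify step in plain Ruby, so it runs without a JVM. `effective_properties` is a hypothetical name, and the property values are illustrative assumptions rather than recommended settings.

```ruby
# Hypothetical stand-in for the merge step in get_properties (no JVM needed):
# custom entries override defaults with the same key, and all keys and values
# are converted to strings, as java.util.Properties expects.
def effective_properties(defaults, custom)
  merged = defaults.merge(custom)
  Hash[merged.map { |key, value| [key.to_s, value.to_s] }]
end

defaults = { 'annotators' => 'tokenize, ssplit', 'sutime.binders' => '0' }
custom   = { 'tokenize.whitespace' => true, 'sutime.binders' => 1 }

p effective_properties(defaults, custom)
```

In the gem itself, the equivalent would presumably be to assign entries on the new accessor before loading a pipeline, for example `StanfordCoreNLP.custom_properties['tokenize.whitespace'] = true`. The `to_s` conversion is what allows non-string values such as `true` or `1` to be passed.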
|
data/lib/stanford-core-nlp/config.rb
CHANGED
@@ -41,7 +41,7 @@ module StanfordCoreNLP
|
|
41
41
|
},
|
42
42
|
|
43
43
|
:ner => {
|
44
|
-
:english => 'all.3class.distsim.crf.ser.gz'
|
44
|
+
:english => 'english.all.3class.distsim.crf.ser.gz'
|
45
45
|
# :german => {} # Add this at some point.
|
46
46
|
},
|
47
47
|
|
@@ -58,7 +58,8 @@ module StanfordCoreNLP
|
|
58
58
|
'states' => 'state-abbreviations.txt',
|
59
59
|
'countries' => 'countries',
|
60
60
|
'states.provinces' => 'statesandprovinces',
|
61
|
-
'extra.gender' => 'namegender.combine.txt'
|
61
|
+
'extra.gender' => 'namegender.combine.txt',
|
62
|
+
'singleton.predictor' => 'singleton.predictor.ser'
|
62
63
|
},
|
63
64
|
:german => {},
|
64
65
|
:french => {}
|
@@ -351,7 +352,7 @@ module StanfordCoreNLP
|
|
351
352
|
'ConstraintAnnotation'
|
352
353
|
],
|
353
354
|
|
354
|
-
'nlp.
|
355
|
+
'nlp.semgraph.SemanticGraphCoreAnnotations' => [
|
355
356
|
'BasicDependenciesAnnotation',
|
356
357
|
'CollapsedCCProcessedDependenciesAnnotation',
|
357
358
|
'CollapsedDependenciesAnnotation'
|
@@ -364,7 +365,8 @@ module StanfordCoreNLP
|
|
364
365
|
|
365
366
|
'nlp.time.TimeExpression' => [
|
366
367
|
'Annotation',
|
367
|
-
'ChildrenAnnotation'
|
368
|
+
'ChildrenAnnotation',
|
369
|
+
'TimeIndexAnnotation'
|
368
370
|
],
|
369
371
|
|
370
372
|
'nlp.trees.TreeCoreAnnotations' => [
|
metadata
CHANGED
@@ -1,91 +1,70 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: stanford-core-nlp
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.5.
|
5
|
-
prerelease:
|
4
|
+
version: 0.5.3
|
6
5
|
platform: ruby
|
7
6
|
authors:
|
8
7
|
- Louis Mullie
|
9
|
-
autorequire:
|
8
|
+
autorequire:
|
10
9
|
bindir: bin
|
11
10
|
cert_chain: []
|
12
|
-
date:
|
11
|
+
date: 2016-12-24 00:00:00.000000000 Z
|
13
12
|
dependencies:
|
14
13
|
- !ruby/object:Gem::Dependency
|
15
14
|
name: bind-it
|
16
|
-
version_requirements: !ruby/object:Gem::Requirement
|
17
|
-
requirements:
|
18
|
-
- - ~>
|
19
|
-
- !ruby/object:Gem::Version
|
20
|
-
version: 0.2.5
|
21
|
-
none: false
|
22
15
|
requirement: !ruby/object:Gem::Requirement
|
23
16
|
requirements:
|
24
|
-
- - ~>
|
17
|
+
- - "~>"
|
25
18
|
- !ruby/object:Gem::Version
|
26
|
-
version: 0.2.
|
27
|
-
none: false
|
28
|
-
prerelease: false
|
19
|
+
version: 0.2.7
|
29
20
|
type: :runtime
|
30
|
-
|
31
|
-
name: rspec
|
21
|
+
prerelease: false
|
32
22
|
version_requirements: !ruby/object:Gem::Requirement
|
33
23
|
requirements:
|
34
|
-
- -
|
35
|
-
- !ruby/object:Gem::Version
|
36
|
-
version: !binary |-
|
37
|
-
MA==
|
38
|
-
none: false
|
39
|
-
requirement: !ruby/object:Gem::Requirement
|
40
|
-
requirements:
|
41
|
-
- - ! '>='
|
24
|
+
- - "~>"
|
42
25
|
- !ruby/object:Gem::Version
|
43
|
-
version:
|
44
|
-
|
45
|
-
|
46
|
-
|
47
|
-
|
48
|
-
description: " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
|
49
|
-
\ language processing \ntools that provides tokenization, part-of-speech tagging\
|
50
|
-
\ and parsing for several languages, as well as named entity \nrecognition and coreference\
|
51
|
-
\ resolution for English. "
|
26
|
+
version: 0.2.7
|
27
|
+
description: High-level Ruby bindings to the Stanford CoreNLP package, a set of natural
|
28
|
+
language processing tools that provides tokenization, part-of-speech tagging and
|
29
|
+
parsing for several languages, as well as named entity recognition and coreference
|
30
|
+
resolution for English, German, French and other languages.
|
52
31
|
email:
|
53
32
|
- louis.mullie@gmail.com
|
54
33
|
executables: []
|
55
34
|
extensions: []
|
56
35
|
extra_rdoc_files: []
|
57
36
|
files:
|
37
|
+
- LICENSE
|
38
|
+
- README.md
|
39
|
+
- bin/AnnotationBridge.java
|
40
|
+
- bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo.java
|
41
|
+
- bin/taggers/stanford-postagger-full-2014-10-26/TaggerDemo2.java
|
58
42
|
- lib/stanford-core-nlp.rb
|
59
43
|
- lib/stanford-core-nlp/bridge.rb
|
60
44
|
- lib/stanford-core-nlp/config.rb
|
61
|
-
-
|
62
|
-
- bin/bridge.jar
|
63
|
-
- README.md
|
64
|
-
- LICENSE
|
45
|
+
- lib/stanford-core-nlp/version.rb
|
65
46
|
homepage: https://github.com/louismullie/stanford-core-nlp
|
66
|
-
licenses:
|
67
|
-
|
47
|
+
licenses:
|
48
|
+
- GPL-3.0
|
49
|
+
metadata: {}
|
50
|
+
post_install_message:
|
68
51
|
rdoc_options: []
|
69
52
|
require_paths:
|
70
53
|
- lib
|
71
54
|
required_ruby_version: !ruby/object:Gem::Requirement
|
72
55
|
requirements:
|
73
|
-
- -
|
56
|
+
- - ">="
|
74
57
|
- !ruby/object:Gem::Version
|
75
|
-
version:
|
76
|
-
MA==
|
77
|
-
none: false
|
58
|
+
version: '0'
|
78
59
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
79
60
|
requirements:
|
80
|
-
- -
|
61
|
+
- - ">="
|
81
62
|
- !ruby/object:Gem::Version
|
82
|
-
version:
|
83
|
-
MA==
|
84
|
-
none: false
|
63
|
+
version: '0'
|
85
64
|
requirements: []
|
86
|
-
rubyforge_project:
|
87
|
-
rubygems_version:
|
88
|
-
signing_key:
|
89
|
-
specification_version:
|
65
|
+
rubyforge_project:
|
66
|
+
rubygems_version: 2.5.1
|
67
|
+
signing_key:
|
68
|
+
specification_version: 4
|
90
69
|
summary: Ruby bindings to the Stanford Core NLP tools.
|
91
70
|
test_files: []
|
data/bin/bridge.jar
DELETED
Binary file
|