stanford-core-nlp 0.5.0 → 0.5.1

Files changed (3)
  1. data/README.md +17 -12
  2. data/lib/stanford-core-nlp.rb +16 -8
  3. metadata +32 -28
data/README.md CHANGED
@@ -8,7 +8,7 @@ This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1. It is t
 
  **Installing**
 
- First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Three different packages are available:
+ First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Two packages are available:
 
  * A [minimal package](http://louismullie.com/treat/stanford-core-nlp-minimal.zip) with the default tagger and parser models for English, French and German.
  * A [full package](http://louismullie.com/treat/stanford-core-nlp-full.zip), with all of the tagger and parser models for English, French and German, as well as named entity and coreference resolution models for English.
@@ -17,7 +17,7 @@ Place the contents of the extracted archive inside the /bin/ folder of the stanf
 
  **Configuration**
 
- After installing and requiring the gem (`require 'stanford-core-nlp'`), you may want to set some optional configuration options. Here are some examples:
+ You may want to set some optional configuration options. Here are some examples:
 
  ```ruby
  # Set an alternative path to look for the JAR files
@@ -36,9 +36,6 @@ StanfordCoreNLP.jvm_args = ['-option1', '-option2']
  # Redirect VM output to log.txt
  StanfordCoreNLP.log_file = 'log.txt'
 
- # Use the model files for a different language than English.
- StanfordCoreNLP.use(:french) # or :german
-
  # Change a specific model file.
  StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
  ```
@@ -46,6 +43,9 @@ StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
  **Using the gem**
 
  ```ruby
+ # Use the model files for a different language than English.
+ StanfordCoreNLP.use :french # or :german
+
  text = 'Angela Merkel met Nicolas Sarkozy on January 25th in ' +
  'Berlin to discuss a new austerity package. Sarkozy ' +
  'looked pleased, but Merkel was dismayed.'
@@ -71,18 +71,22 @@ text.get(:sentences).each do |sentence|
  puts token.get(:named_entity_tag).to_s
  # Coreference
  puts token.get(:coref_cluster_id).to_s
- # Also of interest: coref, coref_chain, coref_cluster, coref_dest, coref_graph.
+ # Also of interest: coref, coref_chain,
+ # coref_cluster, coref_dest, coref_graph.
  end
  end
  ```
 
  > Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
 
- A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the 'config.rb' file inside the gem. The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class follows the simple un-camel-casing convention, with 'Annotation' at the end removed. For example, the annotation `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
+ The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class is the `snake_case` of the class name, with 'Annotation' at the end removed. For example, `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
+
+ A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the `config.rb` file inside the gem.
+
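Review note: the naming convention described in the added README paragraph can be sketched in plain Ruby. This is only an illustration; `annotation_symbol` is a hypothetical helper that is not part of the gem, and the gem's own mapping (see `config.rb`) may treat edge cases differently.

```ruby
# Hypothetical helper (not part of the gem): derive the Ruby symbol
# for a Java annotation class name by dropping the trailing
# 'Annotation' and converting CamelCase to snake_case.
def annotation_symbol(java_class_name)
  # Strip the 'Annotation' suffix, if present.
  base = java_class_name.sub(/Annotation\z/, '')
  # Insert '_' wherever a lowercase letter or digit precedes an
  # uppercase letter, then lowercase the whole string.
  snake = base.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  snake.to_sym
end

annotation_symbol('NamedEntityTagAnnotation') # => :named_entity_tag
annotation_symbol('PartOfSpeechAnnotation')   # => :part_of_speech
```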
 
  **Loading specific classes**
 
- You may also want to load your own classes from the Stanford NLP to do more specific tasks. The gem provides an API to do this:
+ You may want to load additional Java classes (including any class from the Stanford NLP packages). The gem provides an API for this:
 
  ```ruby
  # Default base class is edu.stanford.nlp.pipeline.
@@ -120,9 +124,7 @@ Here is a full list of annotator classes provided by the Stanford Core NLP packa
  Here is a full list of the default models for the Stanford Core NLP pipeline. You can change these models individually using `StanfordCoreNLP.set_model` (see above).
 
  * 'pos.model' - 'english-left3words-distsim.tagger'
- * 'ner.model.3class' - 'all.3class.distsim.crf.ser.gz'
- * 'ner.model.7class' - 'muc.7class.distsim.crf.ser.gz'
- * 'ner.model.MISCclass' -- 'conll.4class.distsim.crf.ser.gz'
+ * 'ner.model' - 'all.3class.distsim.crf.ser.gz'
  * 'parse.model' - 'englishPCFG.ser.gz'
  * 'dcoref.demonym' - 'demonyms.txt'
  * 'dcoref.animate' - 'animate.unigrams.txt'
@@ -137,4 +139,7 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
 
  **Contributing**
 
- Feel free to fork the project and send me a pull request!
+ Simple.
+
+ 1. Fork the project.
+ 2. Send me a pull request!
data/lib/stanford-core-nlp.rb CHANGED
@@ -2,7 +2,7 @@ require 'stanford-core-nlp/config'
 
  module StanfordCoreNLP
 
- VERSION = '0.5.0'
+ VERSION = '0.5.1'
 
  require 'bind-it'
  extend BindIt::Binding
@@ -44,6 +44,8 @@ module StanfordCoreNLP
  ['CoreLabel', 'edu.stanford.nlp.ling'],
  ['MaxentTagger', 'edu.stanford.nlp.tagger.maxent'],
  ['CRFClassifier', 'edu.stanford.nlp.ie.crf'],
+ ['LexicalizedParser', 'edu.stanford.nlp.parser.lexparser'],
+ ['Options', 'edu.stanford.nlp.parser.lexparser'],
  ['Properties', 'java.util'],
  ['ArrayList', 'java.util'],
  ['AnnotationBridge', '']
@@ -111,11 +113,8 @@ module StanfordCoreNLP
  # Public API methods #
  # ########################### #
 
- # Load a StanfordCoreNLP pipeline with the
- # specified JVM flags and StanfordCoreNLP
- # properties.
- def self.load(*annotators)
-
+ def self.bind
+
  # Take care of Windows users.
  if self.running_on_windows?
  self.jar_path.gsub!('/', '\\')
@@ -123,14 +122,23 @@ module StanfordCoreNLP
  end
 
  # Make the bindings.
- self.bind
+ super
 
  # Bind annotation bridge.
  self.default_classes.each do |info|
  klass = const_get(info.first)
  self.inject_get_method(klass)
  end
-
+
+ end
+
+ # Load a StanfordCoreNLP pipeline with the
+ # specified JVM flags and StanfordCoreNLP
+ # properties.
+ def self.load(*annotators)
+
+ self.bind unless self.bound
+
  # Prepend the JAR path to the model files.
  properties = {}
  self.model_files.each do |k,v|
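Review note: the hunks above move the binding setup out of `load` into an overriding `self.bind` that calls `super`, and make `load` bind only once. The pattern can be sketched in isolation as follows. `FakeBinding` and `Pipeline` are hypothetical stand-ins for illustration only; they are not the real `BindIt::Binding` or `StanfordCoreNLP`, whose internals are not shown in this diff.

```ruby
# Hypothetical stand-in for BindIt::Binding, greatly simplified.
module FakeBinding
  attr_reader :bound, :bind_calls

  # Pretend to create the Java bindings and remember that we did.
  def bind
    @bind_calls = (@bind_calls || 0) + 1
    @bound = true
  end
end

# Hypothetical stand-in for the StanfordCoreNLP module.
module Pipeline
  extend FakeBinding

  # Overrides the extended `bind`. `super` reaches FakeBinding#bind
  # because the singleton method sits in front of the extended module
  # in the method lookup chain.
  def self.bind
    # Module-specific setup (e.g. path fixes) would go here.
    super
  end

  # Binds lazily: the guard skips `bind` once `bound` is true.
  def self.load(*annotators)
    bind unless bound
    annotators
  end
end

Pipeline.load(:tokenize)
Pipeline.load(:tokenize, :ssplit)
Pipeline.bind_calls # => 1
```

One consequence of this split, under these assumptions, is that callers can invoke `bind` on its own without loading a pipeline, while `load` stays safe to call repeatedly.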
metadata CHANGED
@@ -1,87 +1,91 @@
  --- !ruby/object:Gem::Specification
  name: stanford-core-nlp
  version: !ruby/object:Gem::Version
- version: 0.5.0
- prerelease:
+ version: 0.5.1
+ prerelease:
  platform: ruby
  authors:
  - Louis Mullie
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-12-26 00:00:00.000000000 Z
+ date: 2013-01-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bind-it
- requirement: !ruby/object:Gem::Requirement
- none: false
+ version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
  version: 0.2.5
- type: :runtime
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
  none: false
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
  version: 0.2.5
+ none: false
+ prerelease: false
+ type: :runtime
  - !ruby/object:Gem::Dependency
  name: rspec
- requirement: !ruby/object:Gem::Requirement
- none: false
+ version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
+ version: !binary |-
+ MA==
  none: false
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- description: ! " High-level Ruby bindings to the Stanford CoreNLP package, a set natural
- language processing \ntools that provides tokenization, part-of-speech tagging and
- parsing for several languages, as well as named entity \nrecognition and coreference
- resolution for English. "
+ version: !binary |-
+ MA==
+ none: false
+ prerelease: false
+ type: :development
+ description: " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
+ \ language processing \ntools that provides tokenization, part-of-speech tagging\
+ \ and parsing for several languages, as well as named entity \nrecognition and coreference\
+ \ resolution for English. "
  email:
  - louis.mullie@gmail.com
  executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - lib/stanford-core-nlp.rb
  - lib/stanford-core-nlp/bridge.rb
  - lib/stanford-core-nlp/config.rb
- - lib/stanford-core-nlp.rb
  - bin/AnnotationBridge.java
  - bin/bridge.jar
  - README.md
  - LICENSE
  homepage: https://github.com/louismullie/stanford-core-nlp
  licenses: []
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- required_rubygems_version: !ruby/object:Gem::Requirement
+ version: !binary |-
+ MA==
  none: false
+ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
+ version: !binary |-
+ MA==
+ none: false
  requirements: []
- rubyforge_project:
+ rubyforge_project:
  rubygems_version: 1.8.24
- signing_key:
+ signing_key:
  specification_version: 3
  summary: Ruby bindings to the Stanford Core NLP tools.
  test_files: []
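Review note: the `version: !binary |-` / `MA==` lines that replace `version: '0'` in the metadata look like a requirement change, but they appear to be only a serialization difference, since `MA==` is the Base64 encoding of the string `0`. A quick check in plain Ruby, using only core `String#unpack1`:

```ruby
# 'm' is the Base64 directive for String#unpack1.
decoded = 'MA=='.unpack1('m')
puts decoded # prints "0"
```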