stanford-core-nlp 0.5.0 → 0.5.1

Files changed (3)
  1. data/README.md +17 -12
  2. data/lib/stanford-core-nlp.rb +16 -8
  3. metadata +32 -28
data/README.md CHANGED
@@ -8,7 +8,7 @@ This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1. It is t
 
  **Installing**
 
- First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Three different packages are available:
+ First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Two packages are available:
 
  * A [minimal package](http://louismullie.com/treat/stanford-core-nlp-minimal.zip) with the default tagger and parser models for English, French and German.
  * A [full package](http://louismullie.com/treat/stanford-core-nlp-full.zip), with all of the tagger and parser models for English, French and German, as well as named entity and coreference resolution models for English.
@@ -17,7 +17,7 @@ Place the contents of the extracted archive inside the /bin/ folder of the stanf
 
  **Configuration**
 
- After installing and requiring the gem (`require 'stanford-core-nlp'`), you may want to set some optional configuration options. Here are some examples:
+ You may want to set some optional configuration options. Here are some examples:
 
  ```ruby
  # Set an alternative path to look for the JAR files
@@ -36,9 +36,6 @@ StanfordCoreNLP.jvm_args = ['-option1', '-option2']
  # Redirect VM output to log.txt
  StanfordCoreNLP.log_file = 'log.txt'
 
- # Use the model files for a different language than English.
- StanfordCoreNLP.use(:french) # or :german
-
  # Change a specific model file.
  StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
  ```
@@ -46,6 +43,9 @@ StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
  **Using the gem**
 
  ```ruby
+ # Use the model files for a different language than English.
+ StanfordCoreNLP.use :french # or :german
+
  text = 'Angela Merkel met Nicolas Sarkozy on January 25th in ' +
  'Berlin to discuss a new austerity package. Sarkozy ' +
  'looked pleased, but Merkel was dismayed.'
@@ -71,18 +71,22 @@ text.get(:sentences).each do |sentence|
  puts token.get(:named_entity_tag).to_s
  # Coreference
  puts token.get(:coref_cluster_id).to_s
- # Also of interest: coref, coref_chain, coref_cluster, coref_dest, coref_graph.
+ # Also of interest: coref, coref_chain,
+ # coref_cluster, coref_dest, coref_graph.
  end
  end
  ```
 
  > Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
 
- A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the 'config.rb' file inside the gem. The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class follows the simple un-camel-casing convention, with 'Annotation' at the end removed. For example, the annotation `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
+ The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class is the `snake_case` of the class name, with 'Annotation' at the end removed. For example, `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
+
+ A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the `config.rb` file inside the gem.
+
 
  **Loading specific classes**
 
- You may also want to load your own classes from the Stanford NLP to do more specific tasks. The gem provides an API to do this:
+ You may want to load additional Java classes (including any class from the Stanford NLP packages). The gem provides an API for this:
 
  ```ruby
  # Default base class is edu.stanford.nlp.pipeline.
@@ -120,9 +124,7 @@ Here is a full list of annotator classes provided by the Stanford Core NLP packa
  Here is a full list of the default models for the Stanford Core NLP pipeline. You can change these models individually using `StanfordCoreNLP.set_model` (see above).
 
  * 'pos.model' - 'english-left3words-distsim.tagger'
- * 'ner.model.3class' - 'all.3class.distsim.crf.ser.gz'
- * 'ner.model.7class' - 'muc.7class.distsim.crf.ser.gz'
- * 'ner.model.MISCclass' -- 'conll.4class.distsim.crf.ser.gz'
+ * 'ner.model' - 'all.3class.distsim.crf.ser.gz'
  * 'parse.model' - 'englishPCFG.ser.gz'
  * 'dcoref.demonym' - 'demonyms.txt'
  * 'dcoref.animate' - 'animate.unigrams.txt'
@@ -137,4 +139,7 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
 
  **Contributing**
 
- Feel free to fork the project and send me a pull request!
+ Simple.
+
+ 1. Fork the project.
+ 2. Send me a pull request!
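
Note on the README change above: the reworded paragraph describes the symbol naming rule as the `snake_case` of the Java class name with the trailing 'Annotation' removed. A minimal sketch of that rule follows. The helper name `annotation_symbol` is hypothetical and is not part of the gem's API; the gem's own conversion may treat acronyms or digits differently.

```ruby
# Hypothetical helper illustrating the naming rule from the README;
# this is not the gem's implementation.
def annotation_symbol(java_class_name)
  java_class_name
    .sub(/Annotation\z/, '')            # drop the trailing 'Annotation'
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # underscore at each lower-to-upper boundary
    .downcase
    .to_sym
end

puts annotation_symbol('NamedEntityTagAnnotation').inspect # prints :named_entity_tag
puts annotation_symbol('PartOfSpeechAnnotation').inspect   # prints :part_of_speech
```

Both example inputs are the ones the README itself gives.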
data/lib/stanford-core-nlp.rb CHANGED
@@ -2,7 +2,7 @@ require 'stanford-core-nlp/config'
 
  module StanfordCoreNLP
 
- VERSION = '0.5.0'
+ VERSION = '0.5.1'
 
  require 'bind-it'
  extend BindIt::Binding
@@ -44,6 +44,8 @@ module StanfordCoreNLP
  ['CoreLabel', 'edu.stanford.nlp.ling'],
  ['MaxentTagger', 'edu.stanford.nlp.tagger.maxent'],
  ['CRFClassifier', 'edu.stanford.nlp.ie.crf'],
+ ['LexicalizedParser', 'edu.stanford.nlp.parser.lexparser'],
+ ['Options', 'edu.stanford.nlp.parser.lexparser'],
  ['Properties', 'java.util'],
  ['ArrayList', 'java.util'],
  ['AnnotationBridge', '']
@@ -111,11 +113,8 @@ module StanfordCoreNLP
  # Public API methods #
  # ########################### #
 
- # Load a StanfordCoreNLP pipeline with the
- # specified JVM flags and StanfordCoreNLP
- # properties.
- def self.load(*annotators)
-
+ def self.bind
+
  # Take care of Windows users.
  if self.running_on_windows?
  self.jar_path.gsub!('/', '\\')
@@ -123,14 +122,23 @@ module StanfordCoreNLP
  end
 
  # Make the bindings.
- self.bind
+ super
 
  # Bind annotation bridge.
  self.default_classes.each do |info|
  klass = const_get(info.first)
  self.inject_get_method(klass)
  end
-
+
+ end
+
+ # Load a StanfordCoreNLP pipeline with the
+ # specified JVM flags and StanfordCoreNLP
+ # properties.
+ def self.load(*annotators)
+
+ self.bind unless self.bound
+
  # Prepend the JAR path to the model files.
  properties = {}
  self.model_files.each do |k,v|
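
Note on the change above: 0.5.1 moves the binding setup out of `load` into an overridden `bind` that calls `super`, and `load` now runs `self.bind unless self.bound`. The sketch below shows that pattern in isolation. `FakeBinding` and `Demo` are hypothetical stand-ins, not the real `BindIt::Binding` or `StanfordCoreNLP`, and it assumes that `bind` sets `bound`, which the diff implies but does not show.

```ruby
# FakeBinding is a hypothetical stand-in for BindIt::Binding.
# Assumption: the real `bind` marks the module as bound.
module FakeBinding
  attr_accessor :bound, :bind_calls

  def bind
    self.bind_calls = (bind_calls || 0) + 1
    self.bound = true
  end
end

module Demo
  extend FakeBinding

  # Overrides the extended method; `super` reaches FakeBinding#bind.
  def self.bind
    # Platform-specific setup (e.g. path fixes) would go here.
    super
  end

  # Binds lazily: only the first call triggers `bind`.
  def self.load(*annotators)
    bind unless bound
    annotators
  end
end

Demo.load(:tokenize)
Demo.load(:ssplit)
puts Demo.bind_calls # prints 1
```

With this split, a caller can presumably invoke `bind` on its own to use the Java classes without building a pipeline, while `load` still binds on first use.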
metadata CHANGED
@@ -1,87 +1,91 @@
  --- !ruby/object:Gem::Specification
  name: stanford-core-nlp
  version: !ruby/object:Gem::Version
- version: 0.5.0
- prerelease:
+ version: 0.5.1
+ prerelease:
  platform: ruby
  authors:
  - Louis Mullie
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-12-26 00:00:00.000000000 Z
+ date: 2013-01-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bind-it
- requirement: !ruby/object:Gem::Requirement
- none: false
+ version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
  version: 0.2.5
- type: :runtime
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
  none: false
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
  version: 0.2.5
+ none: false
+ prerelease: false
+ type: :runtime
  - !ruby/object:Gem::Dependency
  name: rspec
- requirement: !ruby/object:Gem::Requirement
- none: false
+ version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
+ version: !binary |-
+ MA==
  none: false
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- description: ! " High-level Ruby bindings to the Stanford CoreNLP package, a set natural
- language processing \ntools that provides tokenization, part-of-speech tagging and
- parsing for several languages, as well as named entity \nrecognition and coreference
- resolution for English. "
+ version: !binary |-
+ MA==
+ none: false
+ prerelease: false
+ type: :development
+ description: " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
+ \ language processing \ntools that provides tokenization, part-of-speech tagging\
+ \ and parsing for several languages, as well as named entity \nrecognition and coreference\
+ \ resolution for English. "
  email:
  - louis.mullie@gmail.com
  executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - lib/stanford-core-nlp.rb
  - lib/stanford-core-nlp/bridge.rb
  - lib/stanford-core-nlp/config.rb
- - lib/stanford-core-nlp.rb
  - bin/AnnotationBridge.java
  - bin/bridge.jar
  - README.md
  - LICENSE
  homepage: https://github.com/louismullie/stanford-core-nlp
  licenses: []
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
- required_rubygems_version: !ruby/object:Gem::Requirement
+ version: !binary |-
+ MA==
  none: false
+ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
- version: '0'
+ version: !binary |-
+ MA==
+ none: false
  requirements: []
- rubyforge_project:
+ rubyforge_project:
  rubygems_version: 1.8.24
- signing_key:
+ signing_key:
  specification_version: 3
  summary: Ruby bindings to the Stanford Core NLP tools.
  test_files: []