stanford-core-nlp 0.3.5 → 0.4.0

data/README.md CHANGED
@@ -1,18 +1,17 @@
  **About**
 
- This gem provides high-level Ruby bindings to the [Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml), a set natural language processing tools for tokenization, part-of-speech tagging, lemmatization, and parsing of several languages, as well as named entity recognition and coreference resolution in English. This gem is compatible with Ruby 1.9.2 and above.
+ This gem provides high-level Ruby bindings to the [Stanford Core NLP package](http://nlp.stanford.edu/software/corenlp.shtml), a set natural language processing tools for tokenization, sentence segmentation, part-of-speech tagging, lemmatization, and parsing of English, French and German. The package also provides named entity recognition and coreference resolution for English. This gem is compatible with JRuby 1.6.4 and above, as well as Ruby 1.9.2 and 1.9.3 (through Rjb).
 
- If you are looking for an full-scale natural language processing framework in Ruby, have a look at [Treat](https://github.com/louismullie/treat).
+ This gem only provides a thin wrapper over the Stanford Core NLP API. If you are looking for a Ruby natural language processing framework, have a look at [Treat](https://github.com/louismullie/treat).
 
  **Installing**
 
- _Note: This gem uses the Ruby-Java Bridge (Rjb), which currently does not support Java 7. Therefore, if you have installed Java 7, you should set your JAVA_HOME to point to your old Java 6 install before installing Rjb; for example, `export "JAVA_HOME=/usr/lib/jvm/java-6-openjdk/"`._
+ _Note: If you are running on MRI, this gem will use the Ruby-Java Bridge (Rjb), which currently does not support Java 7. Therefore, if you have installed Java 7, you should set your JAVA_HOME to point to your old Java 6 install before installing Rjb; for example, `export "JAVA_HOME=/usr/lib/jvm/java-6-openjdk/"`._
 
  First, install the gem: `gem install stanford-core-nlp`. Then, download the Stanford Core NLP JAR and model files. Three different packages are available:
 
- * A [minimal package for English](http://louismullie.com/treat/stanford-core-nlp-minimal.zip) with one tagger model and one parser model for English.
- * A [full package for English](http://louismullie.com/treat/stanford-core-nlp-english.zip), with all tagger and parser models for English, plus the coreference resolution and named entity recognition models.
- * A [full package for all languages](http://louismullie.com/treat/stanford-core-nlp-all.zip), including tagger and parser models for English, French, German, Arabic and Chinese.
+ * A [minimal package](http://louismullie.com/treat/stanford-core-nlp-minimal.zip) with the default tagger and parser models for English, French and German.
+ * A [full package](http://louismullie.com/treat/stanford-core-nlp-all.zip), with all of the tagger and parser models for English, French and German, as well as named entity and coreference resolution models for English.
 
  Place the contents of the extracted archive inside the /bin/ folder of the stanford-core-nlp gem (e.g. [...]/gems/stanford-core-nlp-0.x/bin/).
 
@@ -38,7 +37,7 @@ StanfordCoreNLP.jvm_args = ['-option1', '-option2']
  StanfordCoreNLP.log_file = 'log.txt'
 
  # Use the model files for a different language than English.
- StanfordCoreNLP.use(:french)
+ StanfordCoreNLP.use(:french) # or :german
 
  # Change a specific model file.
  StanfordCoreNLP.set_model('pos.model', 'english-left3words-distsim.tagger')
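 
The `set_model` call in this hunk takes a dotted property name, and the text before the first dot selects the folder the model file lives in. A minimal sketch of that lookup, assuming the folder table from `config.rb` as it stands in 0.4.0. `MODEL_FOLDERS` and `model_path_for` are hypothetical names used for illustration only; they are not part of the gem.

```ruby
# Folder table copied from Config::ModelFolders as of 0.4.0.
MODEL_FOLDERS = {
  :pos    => 'taggers/',
  :parse  => 'grammar/',
  :ner    => 'classifiers/',
  :dcoref => 'dcoref/'
}

# Hypothetical helper mirroring what set_model does with its arguments:
# the segment before the first dot picks the folder, then the file name
# is appended to it.
def model_path_for(name, file)
  folder = MODEL_FOLDERS.fetch(name.split('.')[0].to_sym)
  folder + file
end

puts model_path_for('pos.model', 'english-left3words-distsim.tagger')

# The pre-0.4.0 'parser.*' prefix is no longer a key, so fetch raises KeyError.
begin
  model_path_for('parser.model', 'englishPCFG.ser.gz')
rescue KeyError => e
  puts "unknown prefix: #{e.message}"
end
```

The gem itself uses a plain `Config::ModelFolders[n]` lookup; `fetch` is used in this sketch only so that an outdated `parser.` prefix fails with an explicit error.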
@@ -52,7 +51,7 @@ text = 'Angela Merkel met Nicolas Sarkozy on January 25th in ' +
  'looked pleased, but Merkel was dismayed.'
 
  pipeline = StanfordCoreNLP.load(:tokenize, :ssplit, :pos, :lemma, :parse, :ner, :dcoref)
- text = StanfordCoreNLP::Text.new(text)
+ text = StanfordCoreNLP::Annotation.new(text)
  pipeline.annotate(text)
 
  text.get(:sentences).each do |sentence|
@@ -71,13 +70,13 @@ text.get(:sentences).each do |sentence|
  # Named entity tag
  puts token.get(:named_entity_tag).to_s
  # Coreference
- puts token.get(:coref_cluster_id).to_s
+ puts token.get(:coref_cluster_id).to_s
  # Also of interest: coref, coref_chain, coref_cluster, coref_dest, coref_graph.
  end
  end
  ```
 
- > Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Text class.
+ > Important: You need to load the StanfordCoreNLP pipeline before using the StanfordCoreNLP::Annotation class.
 
  A good reference for names of annotations are the Stanford Javadocs for [CoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ling/CoreAnnotations.html), [CoreCorefAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/dcoref/CorefCoreAnnotations.html), and [TreeCoreAnnotations](http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/TreeCoreAnnotations.html). For a full list of all possible annotations, see the 'config.rb' file inside the gem. The Ruby symbol (e.g. `:named_entity_tag`) corresponding to a Java annotation class follows the simple un-camel-casing convention, with 'Annotation' at the end removed. For example, the annotation `NamedEntityTagAnnotation` translates to `:named_entity_tag`, `PartOfSpeechAnnotation` to `:part_of_speech`, etc.
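 
The un-camel-casing convention described in this README paragraph can be sketched in a few lines. `annotation_symbol` is a hypothetical helper written for illustration; the gem works in the opposite direction (symbol to class name), so this is only a way to check which symbol a given Java annotation class corresponds to.

```ruby
# Hypothetical helper, not part of the gem: derive the Ruby symbol
# for a Java annotation class name.
def annotation_symbol(java_class_name)
  java_class_name
    .sub(/Annotation\z/, '')             # drop the trailing 'Annotation'
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')   # add '_' at each lower-to-upper boundary
    .downcase
    .to_sym
end

p annotation_symbol('NamedEntityTagAnnotation')
p annotation_symbol('PartOfSpeechAnnotation')
```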
 
@@ -124,7 +123,7 @@ Here is a full list of the default models for the Stanford Core NLP pipeline. Yo
  * 'ner.model.3class' - 'all.3class.distsim.crf.ser.gz'
  * 'ner.model.7class' - 'muc.7class.distsim.crf.ser.gz'
  * 'ner.model.MISCclass' -- 'conll.4class.distsim.crf.ser.gz'
- * 'parser.model' - 'englishPCFG.ser.gz'
+ * 'parse.model' - 'englishPCFG.ser.gz'
  * 'dcoref.demonym' - 'demonyms.txt'
  * 'dcoref.animate' - 'animate.unigrams.txt'
  * 'dcoref.female' - 'female.unigrams.txt'
data/lib/stanford-core-nlp.rb CHANGED
@@ -1,57 +1,56 @@
+ require 'stanford-core-nlp/config'
+
  module StanfordCoreNLP
 
- VERSION = '0.3.5'
+ VERSION = '0.4.0'
 
  require 'bind-it'
  extend BindIt::Binding
-
+
  # ############################ #
  # BindIt Configuration Options #
  # ############################ #
-
- # The default path for the JAR files
+
+ # The default path for the JAR files
  # is the gem's bin folder.
- self.jar_path = File.dirname(__FILE__).
- gsub(/\/lib\z/, '') + '/bin/'
-
+ self.jar_path = File.dirname(__FILE__).gsub(/\/lib\z/, '') + '/bins/'
+
+ # Default namespace is the Stanford pipeline namespace.
+ self.default_namespace = 'edu.stanford.nlp.pipeline'
+
  # Load the JVM with a minimum heap size of 512MB,
  # and a maximum heap size of 1024MB.
- self.jvm_args = ['-Xms512M', '-Xmx1024M']
-
+ StanfordCoreNLP.jvm_args = ['-Xms512M', '-Xmx1024M']
+
  # Turn logging off by default.
- self.log_file = nil
-
+ StanfordCoreNLP.log_file = nil
+
  # Default JAR files to load.
- self.default_jars = [
- 'joda-time.jar',
- 'xom.jar',
+ StanfordCoreNLP.default_jars = [
+ 'joda-time.jar',
+ 'xom.jar',
  'stanford-parser.jar',
- 'stanford-corenlp.jar',
+ 'stanford-corenlp.jar',
+ 'stanford-segmenter.jar',
  'bridge.jar'
  ]
-
+
  # Default classes to load.
- self.default_classes = [
+ StanfordCoreNLP.default_classes = [
  ['StanfordCoreNLP', 'edu.stanford.nlp.pipeline', 'CoreNLP'],
- ['Annotation', 'edu.stanford.nlp.pipeline', 'Text'],
+ ['Annotation', 'edu.stanford.nlp.pipeline'],
  ['Word', 'edu.stanford.nlp.ling'],
+ ['CoreLabel', 'edu.stanford.nlp.ling'],
  ['MaxentTagger', 'edu.stanford.nlp.tagger.maxent'],
  ['CRFClassifier', 'edu.stanford.nlp.ie.crf'],
  ['Properties', 'java.util'],
- ['ArrayList', 'java.util'],
- ['AnnotationBridge', '']
+ ['ArrayList', 'java.util']
  ]
-
- # Default namespace is the Stanford pipeline namespace.
- self.default_namespace = 'edu.stanford.nlp.pipeline'
-
+
  # ########################### #
  # Stanford Core NLP bindings #
  # ########################### #
-
- require 'stanford-core-nlp/config'
- require 'stanford-core-nlp/bridge'
-
+
  class << self
  # The model file names for a given language.
  attr_accessor :model_files
@@ -60,12 +59,28 @@ module StanfordCoreNLP
  # Store the language currently being used.
  attr_accessor :language
  end
-
+
  # The path to the main folder containing the folders
  # with the individual models inside. By default, this
  # is the same as the JAR path.
  self.model_path = self.jar_path
 
+ # ########################### #
+ # Annotation bridge (Rjb/Jrb) #
+ # ########################### #
+
+ if RUBY_PLATFORM =~ /java/
+ require 'stanford-core-nlp/jruby_bridge'
+ extend StanfordCoreNLP::JrubyBridge
+ else
+ require 'stanford-core-nlp/rjb_bridge'
+ extend StanfordCoreNLP::RjbBridge
+ end
+
+ # ########################### #
+ # Public configuration params #
+ # ########################### #
+
  # Use models for a given language. Language can be
  # supplied as full-length, or ISO-639 2 or 3 letter
  # code (e.g. :english, :eng or :en will work).
@@ -83,40 +98,47 @@ module StanfordCoreNLP
  n = n.to_s
  n += '.model' if n == 'ner'
  models.each do |m, file|
- self.model_files["#{n}.#{m}"] =
- folder + file
+ self.model_files["#{n}.#{m}"] = folder + file
  end
  elsif models.is_a?(String)
- self.model_files["#{n}.model"] =
- folder + models
+ self.model_files["#{n}.model"] = folder + models
  end
  end
  end
 
  # Use english by default.
  self.use :english
-
- # Set a model file.
+
+ # Set a model file.
  def self.set_model(name, file)
  n = name.split('.')[0].intern
- self.model_files[name] =
- Config::ModelFolders[n] + file
+ self.model_files[name] = Config::ModelFolders[n] + file
  end
 
+ # ########################### #
+ # Public API methods #
+ # ########################### #
+
  # Load a StanfordCoreNLP pipeline with the
  # specified JVM flags and StanfordCoreNLP
  # properties.
  def self.load(*annotators)
-
+
  # Take care of Windows users.
  if self.running_on_windows?
  self.jar_path.gsub!('/', '\\')
  self.model_path.gsub!('/', '\\')
  end
-
+
  # Make the bindings.
  self.bind
-
+
+ # Bind annotation bridge.
+ self.default_classes.each do |info|
+ klass = const_get(info.first)
+ self.inject_get_method(klass)
+ end
+
  # Prepend the JAR path to the model files.
  properties = {}
  self.model_files.each do |k,v|
@@ -129,26 +151,42 @@ module StanfordCoreNLP
  f = self.model_path + v
  unless File.readable?(f)
  raise "Model file #{f} could not be found. " +
- "You may need to download this file manually "+
- " and/or set paths properly."
+ "You may need to download this file manually " +
+ "and/or set paths properly."
  end
  properties[k] = f
  end
+
+ properties['annotators'] = annotators.map { |x| x.to_s }.join(', ')
 
- # Bug fix for French/German parser due to Stanford bug.
- # Otherwise throws IllegalArgumentException:
- # Unknown option: -retainTmpSubcategories
- if self.language == :french ||
- self.language == :german
- properties['parser.flags'] = ''
+ unless self.language == :english
+ # Bug fix for French/German parsers.
+ # Otherwise throws "IllegalArgumentException:
+ # Unknown option: -retainTmpSubcategories"
+ properties['parse.flags'] = ''
+ # Bug fix for French/German parsers.
+ # Otherswise throws java.lang.NullPointerException: null.
+ properties['parse.buildgraphs'] = 'false'
+ end
+
+ # Hack for Rjb compatibility.
+ const_get(:CoreNLP).new(get_properties(properties))
+
+ end
+
+ # Hack in order not to break backwards compatibility.
+ def self.const_missing(const)
+ if const == :Text
+ puts "WARNING: StanfordCoreNLP::Text has been deprecated." +
+ "Please use StanfordCoreNLP::Annotation instead."
+ Annotation
+ else
+ super(const)
  end
-
- properties['annotators'] =
- annotators.map { |x| x.to_s }.join(', ')
-
- CoreNLP.new(get_properties(properties))
  end
 
+ private
+
  # Create a java.util.Properties object from a hash.
  def self.get_properties(properties)
  props = Properties.new
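 
The `const_missing` hook added in this hunk keeps `StanfordCoreNLP::Text` resolving to the renamed class while warning the caller. A minimal, self-contained sketch of the same technique follows. The `Demo` module and its empty `Annotation` class are hypothetical stand-ins so the example runs without the JVM, and `warn` is used here in place of the `puts` seen in the hunk.

```ruby
module Demo
  # Stand-in for the renamed class.
  class Annotation; end

  # Called by Ruby when a constant lookup under Demo fails.
  def self.const_missing(const)
    if const == :Text
      warn 'WARNING: Demo::Text has been deprecated. ' \
           'Please use Demo::Annotation instead.'
      Annotation
    else
      super # any other unknown constant still raises NameError
    end
  end
end

# The old name resolves to the new class (after printing the warning).
p Demo::Text.equal?(Demo::Annotation)
```

Because the hook returns the class without defining the constant, the warning is printed on every reference to the old name, which keeps the deprecation visible.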
@@ -157,13 +195,13 @@ module StanfordCoreNLP
  end
  props
  end
-
+
  # Get a Java ArrayList binding to pass lists
  # of tokens to the Stanford Core NLP process.
  def self.get_list(tokens)
  list = StanfordCoreNLP::ArrayList.new
  tokens.each do |t|
- list.add(StanfordCoreNLP::Word.new(t.to_s))
+ list.add(Word.new(t.to_s))
  end
  list
  end
@@ -173,4 +211,10 @@ module StanfordCoreNLP
  RUBY_PLATFORM.split("-")[1] == 'mswin32'
  end
 
- end
+ # camel_case which also support dot as separator
+ def self.camel_case(s)
+ s = s.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }
+ s.gsub(/(?:^|_|\.)(.)/) { $1.upcase }
+ end
+
+ end
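 
The dot handling in `camel_case` is what lets the JRuby bridge turn a dotted Java package into a JRuby constant path. The sketch below copies the two `gsub` calls from this hunk into a standalone method and repeats the string assembly used in `jruby_bridge.rb`. `jruby_constant_path` is a hypothetical wrapper, and the base class `'nlp.ling.CoreAnnotations'` is an assumed example value in the same format as the keys of `Config::Annotations`.

```ruby
# Standalone copy of StanfordCoreNLP.camel_case.
def camel_case(s)
  s = s.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }
  s.gsub(/(?:^|_|\.)(.)/) { $1.upcase }
end

# Hypothetical wrapper around the string assembly in get_with_casting.
def jruby_constant_path(annotation, base_class)
  anno_class = "#{camel_case(annotation)}Annotation"
  class_path = "edu.stanford.#{base_class}".split('.')
  class_name = class_path.pop
  "Java::#{camel_case(class_path.join('.'))}::#{class_name}::#{anno_class}"
end

puts camel_case(:named_entity_tag)
puts camel_case('edu.stanford.nlp.ling')
puts jruby_constant_path(:named_entity_tag, 'nlp.ling.CoreAnnotations') # assumed base class
```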
data/lib/stanford-core-nlp/config.rb CHANGED
@@ -7,15 +7,13 @@ module StanfordCoreNLP
  LanguageCodes = {
  :english => [:en, :eng, :english],
  :german => [:de, :ger, :german],
- :french => [:fr, :fre, :french],
- :arabic => [:ar, :ara, :arabic],
- :chinese => [:ch, :chi, :chinese]
+ :french => [:fr, :fre, :french]
  }
 
  # Folders inside the JAR path for the models.
  ModelFolders = {
  :pos => 'taggers/',
- :parser => 'grammar/',
+ :parse => 'grammar/',
  :ner => 'classifiers/',
  :dcoref => 'dcoref/'
  }
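 
The `LanguageCodes` table in this hunk is what allows `use` to accept a full language name or a 2 or 3 letter code. The body of `use` is not part of this diff, so the following is only one way such a lookup could be written. `LANGUAGE_CODES` is a copy of the 0.4.0 table, and `resolve_language` is a hypothetical helper, not a method of the gem.

```ruby
# Copy of Config::LanguageCodes as of 0.4.0 (Arabic and Chinese removed).
LANGUAGE_CODES = {
  :english => [:en, :eng, :english],
  :german  => [:de, :ger, :german],
  :french  => [:fr, :fre, :french]
}

# Hypothetical helper: map any accepted code to its canonical language key.
def resolve_language(code)
  wanted = code.to_s.downcase.to_sym
  LANGUAGE_CODES.each do |language, codes|
    return language if codes.include?(wanted)
  end
  raise ArgumentError, "Unsupported language: #{code}"
end

p resolve_language(:fr)
p resolve_language('ger')
```

With this table, a code such as `:chi` that was accepted in 0.3.5 would no longer resolve.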
@@ -24,7 +22,6 @@ module StanfordCoreNLP
  TagSets = {
  :english => :penn,
  :german => :stutgart,
- :chinese => :chinese,
  :french => :paris7
  }
 
@@ -34,17 +31,13 @@ module StanfordCoreNLP
  :pos => {
  :english => 'english-left3words-distsim.tagger',
  :german => 'german-fast.tagger',
- :french => 'french.tagger',
- :arabic => 'arabic-fast.tagger',
- :chinese => 'chinese.tagger'
+ :french => 'french.tagger'
  },
 
- :parser => {
+ :parse => {
  :english => 'englishPCFG.ser.gz',
  :german => 'germanPCFG.ser.gz',
- :french => 'frenchFactored.ser.gz',
- :arabic => 'arabicFactored.ser.gz',
- :chinese => 'chinesePCFG.ser.gz'
+ :french => 'frenchFactored.ser.gz'
  },
 
  :ner => {
@@ -54,9 +47,7 @@ module StanfordCoreNLP
  'MISCclass' => 'conll.4class.distsim.crf.ser.gz'
  },
  :german => {},
- :french => {},
- :arabic => {},
- :chinese => {}
+ :french => {}
  },
 
  :dcoref => {
@@ -75,9 +66,7 @@ module StanfordCoreNLP
  'extra.gender' => 'namegender.combine.txt'
  },
  :german => {},
- :french => {},
- :arabic => {},
- :chinese => {}
+ :french => {}
  }
 
  # Models to add.
@@ -92,61 +81,6 @@ module StanfordCoreNLP
  # List of annotations by JAVA class path.
  Annotations = {
 
- 'nlp.trees.international.pennchinese.ChineseGrammaticalRelations' => [
- 'AdjectivalModifierGRAnnotation',
- 'AdverbialModifierGRAnnotation',
- 'ArgumentGRAnnotation',
- 'AspectMarkerGRAnnotation',
- 'AssociativeMarkerGRAnnotation',
- 'AssociativeModifierGRAnnotation',
- 'AttributiveGRAnnotation',
- 'AuxModifierGRAnnotation',
- 'AuxPassiveGRAnnotation',
- 'BaGRAnnotation',
- 'ClausalComplementGRAnnotation',
- 'ClausalSubjectGRAnnotation',
- 'ClauseModifierGRAnnotation',
- 'ComplementGRAnnotation',
- 'ComplementizerGRAnnotation',
- 'ControllingSubjectGRAnnotation',
- 'CoordinationGRAnnotation',
- 'DeterminerGRAnnotation',
- 'DirectObjectGRAnnotation',
- 'DvpMarkerGRAnnotation',
- 'DvpModifierGRAnnotation',
- 'EtcGRAnnotation',
- 'LocalizerComplementGRAnnotation',
- 'ModalGRAnnotation',
- 'ModifierGRAnnotation',
- 'NegationModifierGRAnnotation',
- 'NominalPassiveSubjectGRAnnotation',
- 'NominalSubjectGRAnnotation',
- 'NounCompoundModifierGRAnnotation',
- 'NumberModifierGRAnnotation',
- 'NumericModifierGRAnnotation',
- 'ObjectGRAnnotation',
- 'OrdNumberGRAnnotation',
- 'ParentheticalGRAnnotation',
- 'ParticipialModifierGRAnnotation',
- 'PreconjunctGRAnnotation',
- 'PrepositionalLocalizerModifierGRAnnotation',
- 'PrepositionalModifierGRAnnotation',
- 'PrepositionalObjectGRAnnotation',
- 'PunctuationGRAnnotation',
- 'RangeGRAnnotation',
- 'RelativeClauseModifierGRAnnotation',
- 'ResultativeComplementGRAnnotation',
- 'SemanticDependentGRAnnotation',
- 'SubjectGRAnnotation',
- 'TemporalClauseGRAnnotation',
- 'TemporalGRAnnotation',
- 'TimePostpositionGRAnnotation',
- 'TopicGRAnnotation',
- 'VerbCompoundGRAnnotation',
- 'VerbModifierGRAnnotation',
- 'XClausalComplementGRAnnotation'
- ],
-
  'nlp.dcoref.CoNLL2011DocumentReader' => [
  'CorefMentionAnnotation',
  'NamedEntityAnnotation'
data/lib/stanford-core-nlp/jruby_bridge.rb ADDED
@@ -0,0 +1,41 @@
+ module StanfordCoreNLP::JrubyBridge
+
+ def inject_get_method(klass)
+ return unless klass.method_defined?(:get)
+ klass.class_eval do
+
+ # Dynamically defined on all proxied annotation classes.
+ # Get an annotation using the annotation bridge.
+ def get_with_casting(annotation, anno_base = nil)
+ anno_class = "#{StanfordCoreNLP.camel_case(annotation)}Annotation"
+ if anno_base
+ unless StanfordNLP::Config::Annotations[anno_base]
+ raise "The path #{anno_base} doesn't exist."
+ end
+ anno_bases = [anno_base]
+ else
+ anno_bases = StanfordCoreNLP::Config::AnnotationsByName[anno_class]
+ raise "The annotation #{anno_class} doesn't exist." unless anno_bases
+ end
+ if anno_bases.size > 1
+ msg = "There are many different annotations bearing the name #{anno_class}. \nPlease specify one of the following base classes as second parameter to disambiguate: "
+ msg << anno_bases.join(',')
+ raise msg
+ else
+ base_class = anno_bases[0]
+ end
+
+ fqcn = "edu.stanford.#{base_class}"
+ class_path = fqcn.split(".")
+ class_name = class_path.pop
+ jruby_class = "Java::#{StanfordCoreNLP.camel_case(class_path.join("."))}::#{class_name}::#{anno_class}"
+
+ get_without_casting(Object.module_eval(jruby_class))
+ end
+
+ alias_method :get_without_casting, :get
+ alias_method :get, :get_with_casting
+ end
+ end
+
+ end
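 
`inject_get_method` in the JRuby bridge wraps the proxy's existing `get` with an alias-method chain: the original is kept under a second name, and `get` is then pointed at the wrapper. A minimal sketch of that pattern on a plain Ruby class follows. `FakeProxy` and its string return value are hypothetical stand-ins for a proxied Java class, so the example runs without JRuby.

```ruby
# Hypothetical stand-in for a Java proxy class that already defines get.
class FakeProxy
  def get(key)
    "looked up #{key}"
  end
end

FakeProxy.class_eval do
  # Wrapper: translate the short name, then delegate to the original get.
  def get_with_casting(annotation)
    get_without_casting("#{annotation}Annotation")
  end

  # Order matters: save the original first, then replace get.
  alias_method :get_without_casting, :get
  alias_method :get, :get_with_casting
end

puts FakeProxy.new.get('Lemma')
```

The Rjb bridge in the same release takes a different route and defines `get` directly, delegating to the Java `AnnotationBridge` helper instead of calling the original method.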
data/lib/stanford-core-nlp/rjb_bridge.rb ADDED
@@ -0,0 +1,42 @@
+ module StanfordCoreNLP::RjbBridge
+
+ StanfordCoreNLP.default_classes << ['AnnotationBridge', '']
+
+ def inject_get_method(klass)
+ klass.class_eval do
+
+ # Dynamically defined on all proxied annotation classes.
+ # Get an annotation using the annotation bridge.
+ def get(annotation, anno_base = nil)
+ if !java_methods.include?('get(Ljava.lang.Class;)')
+ raise 'No annotation can be retrieved on this object.'
+ else
+ anno_class = "#{StanfordCoreNLP.camel_case(annotation)}Annotation"
+ if anno_base
+ unless StanfordNLP::Config::Annotations[anno_base]
+ raise "The path #{anno_base} doesn't exist."
+ end
+ anno_bases = [anno_base]
+ else
+ anno_bases = StanfordCoreNLP::Config::AnnotationsByName[anno_class]
+ raise "The annotation #{anno_class} doesn't exist." unless anno_bases
+ end
+ if anno_bases.size > 1
+ msg = "There are many different annotations " +
+ "bearing the name #{anno_class}. \nPlease specify " +
+ "one of the following base classes as second " +
+ "parameter to disambiguate: "
+ msg << anno_bases.join(',')
+ raise msg
+ else
+ base_class = anno_bases[0]
+ end
+ url = "edu.stanford.#{base_class}$#{anno_class}"
+ StanfordCoreNLP::AnnotationBridge.getAnnotation(self, url)
+ end
+ end
+
+ end
+
+ end
+ end
metadata CHANGED
@@ -1,45 +1,46 @@
  --- !ruby/object:Gem::Specification
  name: stanford-core-nlp
  version: !ruby/object:Gem::Version
- version: 0.3.5
- prerelease:
+ prerelease:
+ version: 0.4.0
  platform: ruby
  authors:
  - Louis Mullie
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-12-04 00:00:00.000000000 Z
+ date: 2012-12-18 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bind-it
- requirement: !ruby/object:Gem::Requirement
- none: false
+ version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
- type: :runtime
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
  none: false
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
- description: ! " High-level Ruby bindings to the Stanford CoreNLP package, a set natural
- language processing \ntools that provides tokenization, part-of-speech tagging and
- parsing for several languages, as well as named entity \nrecognition and coreference
- resolution for English. "
+ none: false
+ prerelease: false
+ type: :runtime
+ description: ! " High-level Ruby bindings to the Stanford CoreNLP package, a set natural\
+ \ language processing \ntools that provides tokenization, part-of-speech tagging\
+ \ and parsing for several languages, as well as named entity \nrecognition and coreference\
+ \ resolution for English. "
  email:
  - louis.mullie@gmail.com
  executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - lib/stanford-core-nlp/bridge.rb
- - lib/stanford-core-nlp/config.rb
  - lib/stanford-core-nlp.rb
+ - lib/stanford-core-nlp/config.rb
+ - lib/stanford-core-nlp/jruby_bridge.rb
+ - lib/stanford-core-nlp/rjb_bridge.rb
  - bin/AnnotationBridge.java
  - bin/bridge.jar
  - bin/Stanford.java
@@ -47,26 +48,27 @@ files:
  - LICENSE
  homepage: https://github.com/louismullie/stanford-core-nlp
  licenses: []
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
- required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
+ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
+ none: false
  requirements: []
- rubyforge_project:
+ rubyforge_project:
  rubygems_version: 1.8.24
- signing_key:
+ signing_key:
  specification_version: 3
  summary: Ruby bindings to the Stanford Core NLP tools.
  test_files: []
+ ...
data/lib/stanford-core-nlp/bridge.rb DELETED
@@ -1,40 +0,0 @@
- module StanfordCoreNLP
-
- # Modify the Rjb JavaProxy class to add our
- # own methods to every Java object.
- Rjb::Rjb_JavaProxy.class_eval do
-
- # Dynamically defined on all proxied annotation classes.
- # Get an annotation using the annotation bridge.
- def get(annotation, anno_base = nil)
- if !java_methods.include?('get(Ljava.lang.Class;)')
- raise 'No annotation can be retrieved on this object.'
- else
- anno_class = "#{StanfordCoreNLP.camel_case(annotation)}Annotation"
- if anno_base
- unless StanfordNLP::Config::Annotations[anno_base]
- raise "The path #{anno_base} doesn't exist."
- end
- anno_bases = [anno_base]
- else
- anno_bases = StanfordCoreNLP::Config::AnnotationsByName[anno_class]
- raise "The annotation #{anno_class} doesn't exist." unless anno_bases
- end
- if anno_bases.size > 1
- msg = "There are many different annotations " +
- "bearing the name #{anno_class}. \nPlease specify " +
- "one of the following base classes as second " +
- "parameter to disambiguate: "
- msg << anno_bases.join(',')
- raise msg
- else
- base_class = anno_bases[0]
- end
- url = "edu.stanford.#{base_class}$#{anno_class}"
- StanfordCoreNLP::AnnotationBridge.getAnnotation(self, url)
- end
- end
-
- end
-
- end