corenlp 0.0.4 → 0.0.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: a88d19d7dc8eae9e7df59d4fe9b1e0c492aa194f
-   data.tar.gz: 5c6a32994f6720210b7a909c5b839503530793b3
+   metadata.gz: 13fb6d3e676a78f59359715c392e857d4836198e
+   data.tar.gz: ee58cbc6af1c899a1ef7cb070e2bad0c939c3765
  SHA512:
-   metadata.gz: b5f185cda3feb604e97e5682440a01f763773e80631bf5a68aa19a1a35e2874999770dd8707af2d74cb9baf630491219fdba498284f37a784813510d48c6549f
-   data.tar.gz: 99cfc0054a47e92c517b6ba9025e063d6fb316a2788acf9255079ab828a7f046ff82057fee9b2b1a154127f7825ce6abcef0eae52a08e59175c0d94bcb58233b
+   metadata.gz: 649e14c7fc8936da85e307bde80eaf1b3615c87faa61b2137a8a673aafd8dfea18b8ca300851fb6107b55519edb7dd42782213272e9585553d77878a5674f53b
+   data.tar.gz: c0fc54d340e2d2c03067066989d9a9b70e866c49f5f5250c1512735ab2b8bc742d0d894b8c7edc3504edd4f00d7b947135e0e9f5e8998a19e21e1b63df81278e
data/README.md CHANGED
@@ -38,11 +38,12 @@ The following code will build up a treebank structure for the raw text "Put the

  ## Options

- The Treebank object can be initialize with various options.
+ The Treebank object can be initialized with various options.

  * `java_max_memory` - set to 3GB by default. This can be customized via the Treebank initializer to be `-Xmx2g`, which would use a max of 2GB of memory, for example.
  * `threads_to_use` - number of threads Stanford CoreNLP uses to parse text. This is set to 4 by default. This option is passed to the Java executable.
  * `output_directory` - by default this is `./tmp/language_processing`, which already exists. This is where Stanford CoreNLP XML files are placed. These XML files represented the structured parser output.
+ * `deps_dir` - the directory where the Stanford CoreNLP dependencies files are. By default this is './lib/ext`.

  ## Tests

data/lib/corenlp.rb CHANGED
@@ -4,7 +4,7 @@ Bundler.require

  module Corenlp
    class Treebank
-     attr_accessor :raw_text, :filenames, :output_directory, :summary_file, :threads_to_use, :java_max_memory, :sentences
+     attr_accessor :raw_text, :filenames, :output_directory, :summary_file, :threads_to_use, :java_max_memory, :sentences, :deps_dir

      def initialize(attrs = {})
        self.raw_text = attrs[:raw_text] || ""
@@ -15,6 +15,7 @@ module Corenlp
        self.threads_to_use = attrs[:threads_to_use] || 4
        self.java_max_memory = attrs[:java_max_memory] || "-Xmx3g"
        self.sentences = []
+       self.deps_dir = attrs[:deps_dir] || "./lib/ext"
      end

      def write_output_file_and_summary_file
@@ -25,8 +26,7 @@ module Corenlp
      end

      def process_files_with_stanford_corenlp
-       deps = "./lib/ext" # dependencies directory: JARs, model files, taggers, etc.
-       classpath = "#{deps}/stanford-corenlp-3.4.jar:#{deps}/stanford-corenlp-3.4-models.jar:#{deps}/xom.jar:#{deps}/joda-time.jar:#{deps}/jollyday.jar:#{deps}/ejml-0.23.jar"
+       classpath = "#{deps_dir}/stanford-corenlp-3.4.jar:#{deps_dir}/stanford-corenlp-3.4-models.jar:#{deps_dir}/xom.jar:#{deps_dir}/joda-time.jar:#{deps_dir}/jollyday.jar:#{deps_dir}/ejml-0.23.jar"
        stanford_bin = "edu.stanford.nlp.pipeline.StanfordCoreNLP"
        annotators = "tokenize,ssplit,pos,lemma,parse,ner"

@@ -1,3 +1,3 @@
  module Corenlp
-   VERSION = "0.0.4"
+   VERSION = "0.0.5"
  end
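
The net effect of the `deps_dir` change is that the Java classpath is now derived from a configurable directory instead of a hard-coded `./lib/ext`. A minimal sketch of that derivation follows. The helper names `build_classpath` and `missing_jars` and the constant `CORENLP_JARS` are hypothetical and not part of the gem; the JAR filenames and the `:` separator are taken from the diff above.

```ruby
# Hypothetical helpers, not part of the corenlp gem. They rebuild the
# classpath string shown in the diff for an arbitrary dependencies directory.
CORENLP_JARS = %w[
  stanford-corenlp-3.4.jar
  stanford-corenlp-3.4-models.jar
  xom.jar
  joda-time.jar
  jollyday.jar
  ejml-0.23.jar
].freeze

# Joins each JAR name onto deps_dir and separates entries with ":",
# mirroring the interpolated string in process_files_with_stanford_corenlp.
def build_classpath(deps_dir = "./lib/ext")
  CORENLP_JARS.map { |jar| File.join(deps_dir, jar) }.join(":")
end

# Lists the JARs that are absent from deps_dir, which can help catch a
# misconfigured directory before the Java process is launched.
def missing_jars(deps_dir = "./lib/ext")
  CORENLP_JARS.reject { |jar| File.exist?(File.join(deps_dir, jar)) }
end

puts build_classpath("/opt/corenlp")
```

Going by the initializer in the diff, the directory would be supplied as `Corenlp::Treebank.new(deps_dir: "/opt/corenlp")`, where `/opt/corenlp` is only an example path. Omitting the key falls back to `./lib/ext`, which matches the behavior of 0.0.4.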
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: corenlp
  version: !ruby/object:Gem::Version
-   version: 0.0.4
+   version: 0.0.5
  platform: ruby
  authors:
  - Lengio Corporation