corenlp 0.0.3

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 072b1b153bb4591c16e6242713e9b4431ba003da
4
+ data.tar.gz: 0e71dd5289c128e0245f082ace874d29b51cd92f
5
+ SHA512:
6
+ metadata.gz: 7969ddc18c42ca6c832c06bf677df56212f4ec54bc35bebbd4d9e4925425804015ef2e333cbc0463a88af76204ed46f814c355c10c703586761c9a7db501442d
7
+ data.tar.gz: 646eb3e03f42182e5a957fe6d52db3a7767cd52c9cf48bb4bd4bc36d0e07c03f1f4890c652addad58ccaf845f9a430c5d877143902b3824efc60c42e44757f31
@@ -0,0 +1,95 @@
1
+ # Corenlp
2
+
3
+ Corenlp is a Ruby gem that uses the [Stanford CoreNLP Java tools](http://nlp.stanford.edu/software/corenlp.shtml) to parse text. The gem takes the output from Stanford CoreNLP and builds objects in Ruby, for use in Ruby applications.
4
+
5
+ Stanford CoreNLP requires Java version 1.6 or higher. Installation varies by platform, so Java must be installed by the developer before continuing.
6
+
7
+ Development has been done with 64-bit Java on an OS X machine. 3GB of RAM is allocated for the Java process by default, and this can be changed any time the parser is called.
8
+
9
+ The following Java version was used during development:
10
+
11
+ $ java -version
12
+ java version "1.7.0_45"
13
+ Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
14
+ Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
15
+
16
+ ## Installing Stanford CoreNLP
17
+
18
+ Run `rake corenlp:download_deps` to download the Stanford CoreNLP dependencies. The files will be extracted to the `lib/ext` directory, which is ignored by git. Both the dependencies directory and the download URL can be customized.
19
+
20
+ The rake task will use the values from the following environment variables if they are set.
21
+
22
+ * `CORENLP_DOWNLOAD_URL` - Defaults to "http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip", which was the latest version at the time of writing.
23
+ * `CORENLP_DEPS_DIR` - Defaults to "./lib/ext/", the directory in the project where the Stanford CoreNLP files are placed.
24
+
25
+ To customize these values, supply environment variable arguments when calling the rake task like this:
26
+
27
+ rake corenlp:download_deps CORENLP_DEPS_DIR='./my_directory'
28
+
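+ Both variables can be overridden in the same invocation, for example:
+
+ rake corenlp:download_deps CORENLP_DOWNLOAD_URL='http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip' CORENLP_DEPS_DIR='./my_directory'
+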
29
+ ## Testing the output in an IRB console
30
+
31
+ The Corenlp gem builds a treebank of structured parts that define tokens, sentences, and dependencies between the tokens. This treebank structure is represented as a nested Ruby hash: Token objects that are part of a sentence are nested within the sentence, token dependencies are nested within the sentence, and so on.
32
+
33
+ The following code will build up a treebank structure for the raw text "Put the book down.". On my machine this takes around 10 seconds to run.
34
+
35
+ bundle exec irb
36
+ Bundler.require
37
+ Corenlp::Treebank.new(raw_text: "Put the book down.").parse
38
+
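+ A minimal sketch of walking the parsed result, assuming `parse` returns the populated treebank (as the test suite does); the attribute names come from the Sentence, Token, and TokenDependency classes in this gem:
+
+ treebank = Corenlp::Treebank.new(raw_text: "Put the book down.").parse
+ treebank.sentences.each do |sentence|
+   # tokens and token dependencies are nested within each sentence
+   sentence.tokens.sort_by(&:index).each do |token|
+     puts "#{token.index} #{token.text} #{token.penn_treebank_tag}"
+   end
+   sentence.token_dependencies.each do |td|
+     puts "#{td.dependent.text} <- #{td.governor.text} (#{td.relation})"
+   end
+ end
+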
39
+ ## Options
40
+
41
+ The Treebank object can be initialized with various options (see the example after the list).
42
+
43
+ * `java_max_memory` - set to 3GB by default. It can be customized via the Treebank initializer; for example, `-Xmx2g` would use a maximum of 2GB of memory.
44
+ * `threads_to_use` - number of threads Stanford CoreNLP uses to parse text. This is set to 4 by default. This option is passed to the Java executable.
45
+ * `output_directory` - by default this is `./tmp/language_processing`, which already exists. This is where the Stanford CoreNLP XML files are placed. These XML files represent the structured parser output.
46
+
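+ A minimal sketch combining these options, assuming they are passed as keyword arguments to the Treebank initializer alongside `raw_text` (the values shown are illustrative):
+
+ Corenlp::Treebank.new(
+   raw_text: "Put the book down.",
+   java_max_memory: "-Xmx2g",
+   threads_to_use: 2,
+   output_directory: "./tmp/language_processing"
+ ).parse
+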
47
+ ## Tests
48
+
49
+ Minitest is used as the test suite for the Ruby objects. New code should include test coverage. Manual testing is also useful. Internally we have additional test methods that verify parser output on the same content over time, but they are not included at this time.
50
+
51
+ rake
52
+
53
+ To run a single test:
54
+
55
+ ruby path/to/file.rb --name test_method_name
56
+
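+ For example, to run a single test from this gem's suite (adding `-Itest` so that `test_helper` is on the load path):
+
+ ruby -Itest test/token_test.rb --name test_number_recognition
+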
57
+ ## Terminology
58
+
59
+ Stanford CoreNLP uses a lot of terminology from the natural language processing field, and defines its own terminology. Refer to the [Stanford CoreNLP documentation](http://nlp.stanford.edu/software/corenlp.shtml) to learn more.
60
+
61
+ ## Contributors
62
+
63
+ This gem was developed at Lengio by Andy Atkinson as an extraction of some of our natural language processing tools.
64
+
65
+ * Andy Atkinson, gem author and maintainer
66
+ * Kamran Khan
67
+ * Rodolfo Carvalho
68
+
69
+ ## Rubygems badge
70
+
71
+ [![Gem Version](https://badge.fury.io/rb/corenlp.svg)](http://badge.fury.io/rb/corenlp)
72
+
73
+ ## License
74
+
75
+ The MIT License (MIT)
76
+
77
+ Copyright (c) 2014 Lengio Corporation
78
+
79
+ Permission is hereby granted, free of charge, to any person obtaining a copy
80
+ of this software and associated documentation files (the "Software"), to deal
81
+ in the Software without restriction, including without limitation the rights
82
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
83
+ copies of the Software, and to permit persons to whom the Software is
84
+ furnished to do so, subject to the following conditions:
85
+
86
+ The above copyright notice and this permission notice shall be included in
87
+ all copies or substantial portions of the Software.
88
+
89
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
90
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
91
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
92
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
93
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
94
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
95
+ THE SOFTWARE.
@@ -0,0 +1,56 @@
1
+ require 'net/http'
2
+ require 'zip/zip'
3
+ require 'fileutils'
4
+ require 'uri'
5
+
6
+ module Corenlp
7
+ class Downloader
8
+ attr_accessor :url, :destination, :local_file
9
+ def initialize(url, destination)
10
+ self.url = url
11
+ self.destination = destination
12
+ self.local_file = nil
13
+ end
14
+
15
+ def extract
16
+ puts "extracting file..."
17
+ Zip::ZipFile.open(local_file) do |zip_file|
18
+ zip_file.each do |file|
19
+ file_path = File.join(destination, file.name)
20
+ zip_file.extract(file, file_path) unless File.exist?(file_path)
21
+ end
22
+
23
+ puts "moving files into directory..."
24
+ dirname = local_file[0...-4] # strip the ".zip" extension
25
+ dir = File.join(destination, dirname)
26
+ if File.exist?(dir)
27
+ Dir.glob(File.join(dir, "*")).each do |file|
28
+ FileUtils.mv(file, File.join(destination, File.basename(file)))
29
+ end
30
+ FileUtils.rm_rf(dir)
31
+ end
32
+
33
+ puts "deleting original zip file..."
34
+ FileUtils.rm(local_file)
35
+ puts "done."
36
+ end
37
+ end
38
+
39
+ def download
40
+ return unless url
41
+ puts "downloading zip file from url #{url} to #{destination}..."
42
+ self.local_file = File.basename(url)
43
+ uri = URI.parse(url)
44
+ if local_file && uri
45
+ Net::HTTP.start(uri.host) do |http|
46
+ resp = http.get(uri.request_uri)
47
+ open(local_file, "wb") do |file|
48
+ file.write(resp.body)
49
+ end
50
+ end
51
+ puts "done. Downloaded file #{local_file}."
52
+ extract
53
+ end
54
+ end
55
+ end
56
+ end
@@ -0,0 +1,22 @@
1
+ module Corenlp
2
+ class Enclitic < Token
3
+ def enclitic_map
4
+ # Note: This isn't really one-to-one (e.g. "'d" could be "had" or "would"):
5
+ {
6
+ "'ll" => "will",
7
+ "'m" => "am",
8
+ "'re" => "are",
9
+ "'s" => "is",
10
+ "'d" => "would",
11
+ "'t" => "not",
12
+ "'ve" => "have",
13
+ "'nt" => "not",
14
+ "n't" => "not"
15
+ }
16
+ end
17
+
18
+ def expanded
19
+ enclitic_map[text]
20
+ end
21
+ end
22
+ end
@@ -0,0 +1,4 @@
1
+ module Corenlp
2
+ class Number < Token
3
+ end
4
+ end
@@ -0,0 +1,20 @@
1
+ module Corenlp
2
+ class Punctuation < Token
3
+ # From http://www.unicode.org/charts/PDF/U20A0.pdf
4
+ CURRENCY_SYMBOLS = %W(\u0024 \u00A2 \u00A3 \u00A4 \u00A5 \u20AC)
5
+ DASH_SYMBOLS = %W(\u2010 \u2011 \u2012 \u2013 \u2014)
6
+ OPENING_SYMBOLS = %W(\u201C \u2018 \u00A1 \u00BF \( [ {)
7
+
8
+ def currency?
9
+ CURRENCY_SYMBOLS.include?(text)
10
+ end
11
+
12
+ def dash?
13
+ DASH_SYMBOLS.include?(text)
14
+ end
15
+
16
+ def opening?
17
+ OPENING_SYMBOLS.include?(text)
18
+ end
19
+ end
20
+ end
@@ -0,0 +1,28 @@
1
+ module Corenlp
2
+ class Sentence
3
+ attr_accessor :index, :tokens, :token_dependencies, :parse_tree_raw
4
+
5
+ def initialize(attrs = {})
6
+ @index = attrs[:index]
7
+ @tokens = []
8
+ @token_dependencies = []
9
+ @parse_tree_raw = ''
10
+ end
11
+
12
+ def governor_dependencies(token)
13
+ token_dependencies.select{|td| td.governor == token}
14
+ end
15
+
16
+ def next_token(token)
17
+ tokens.sort_by(&:index).detect{|t| t.index > token.index}
18
+ end
19
+
20
+ def previous_token(token)
21
+ tokens.sort_by(&:index).reverse.detect{|t| t.index < token.index}
22
+ end
23
+
24
+ def get_dependency_token_by_index(index)
25
+ tokens.detect{|t| t.index == index}
26
+ end
27
+ end
28
+ end
@@ -0,0 +1,73 @@
1
+ module Corenlp
2
+ class Token
3
+ attr_accessor :index, :text, :penn_treebank_tag, :stanford_lemma, :type, :ner
4
+
5
+ def initialize(attrs = {})
6
+ @index = attrs[:index]
7
+ @text = attrs[:text]
8
+ @penn_treebank_tag = attrs[:penn_treebank_tag]
9
+ @stanford_lemma = attrs[:stanford_lemma]
10
+ @type = attrs[:type]
11
+ @ner = attrs[:ner]
12
+ end
13
+
14
+ IGNORED_ENTITIES = ["PERSON"]
15
+
16
+ def content?
17
+ is_a?(Word) || is_a?(Enclitic)
18
+ end
19
+
20
+ def top_level_penn_treebank_category
21
+ penn_treebank_tag[0]
22
+ end
23
+
24
+ def ==(other)
25
+ index == other.index && \
26
+ penn_treebank_tag == other.penn_treebank_tag && type == other.type # text is not part of the comparison
27
+ end
28
+
29
+ def website_text?
30
+ text =~ /http:\/\//
31
+ end
32
+
33
+ def self.clean_stanford_text(text)
34
+ Token::STANFORD_TEXT_REPLACEMENTS.each_pair do |original, replacement|
35
+ text.gsub!(replacement, original)
36
+ end
37
+ text
38
+ end
39
+
40
+ Enclitics = %w{'ll 'm 're 's 't 've 'nt n't 'd ’ll ’m ’re ’s ’t ’ve ’nt n’t ’d}
41
+ WordRegexp = /^[[:alpha:]\-'\/]+$/
42
+ NumberRegexp = /^#?(\d+)(,\d+)*(\.\d+)?$/
43
+ PunctRegexp = /^[[:punct:]'"\$]+$/
44
+ WebsiteRegexp = /https?:\/\/[\S]+/
45
+
46
+ # The character replacements that Stanford CoreNLP performs, which we reverse:
47
+ STANFORD_TEXT_REPLACEMENTS = {
48
+ '”' => "''", '“' => '``', '(' => '-LRB-',
49
+ ')' => '-RRB-', '[' => '-LSB-', ']' => '-RSB-',
50
+ '{' => '-LCB-', '}' => '-RCB-',
51
+ '‘' => '`', '’' => '\'', '—' => '--', '/' => '\\/'
52
+ }
53
+
54
+ def ignored_entity?
55
+ IGNORED_ENTITIES.include?(self.ner)
56
+ end
57
+
58
+ def self.token_subclass_from_text(text)
59
+ case
60
+ when Enclitics.include?(text)
61
+ Enclitic
62
+ when (text =~ WordRegexp && text != '-') || (text =~ WebsiteRegexp)
63
+ Word
64
+ when text =~ PunctRegexp
65
+ Punctuation
66
+ when text =~ NumberRegexp
67
+ Number
68
+ else
69
+ Token
70
+ end
71
+ end
72
+ end
73
+ end
@@ -0,0 +1,11 @@
1
+ module Corenlp
2
+ class TokenDependency
3
+ attr_accessor :dependent, :governor, :relation
4
+
5
+ def initialize(attrs = {})
6
+ @dependent = attrs[:dependent]
7
+ @governor = attrs[:governor]
8
+ @relation = attrs[:relation]
9
+ end
10
+ end
11
+ end
@@ -0,0 +1,3 @@
1
+ module Corenlp
2
+ VERSION = "0.0.3"
3
+ end
@@ -0,0 +1,4 @@
1
+ module Corenlp
2
+ class Word < Token
3
+ end
4
+ end
@@ -0,0 +1,9 @@
1
+ require "test_helper"
2
+
3
+ class DownloaderTest < Minitest::Test
4
+ def test_initialized_ok
5
+ zip_file_url = "http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip"
6
+ destination = File.join('./lib/ext/')
7
+ assert Downloader.new(zip_file_url, destination)
8
+ end
9
+ end
@@ -0,0 +1,8 @@
1
+ require "test_helper"
2
+
3
+ class EncliticTest < Minitest::Test
4
+ def test_expanded_returns_the_expanded_version
5
+ enclitic = Enclitic.new(text: "'ll")
6
+ assert_equal "will", enclitic.expanded
7
+ end
8
+ end
@@ -0,0 +1,8 @@
1
+ require "test_helper"
2
+
3
+ class NumberTest < Minitest::Test
4
+ def test_number_is_initialized_ok
5
+ number = Number.new(text: "1")
6
+ assert number.is_a?(Number)
7
+ end
8
+ end
@@ -0,0 +1,38 @@
1
+ require "test_helper"
2
+
3
+ class PunctuationTest < Minitest::Test
4
+ def test_punctuation_is_initialized_ok
5
+ punctuation = Punctuation.new(text: "$")
6
+ assert punctuation.is_a?(Punctuation)
7
+ end
8
+
9
+ def test_punctuation_is_currency
10
+ Punctuation::CURRENCY_SYMBOLS.each do |s|
11
+ token = Punctuation.new text: s
12
+ assert token.currency?, "Token #{token.text} should be a currency"
13
+ end
14
+
15
+ token = Punctuation.new text: "a"
16
+ assert !token.currency?
17
+ end
18
+
19
+ def test_punctuation_is_dash
20
+ Punctuation::DASH_SYMBOLS.each do |s|
21
+ token = Punctuation.new text: s
22
+ assert token.dash?, "Token #{token.text} should be a dash"
23
+ end
24
+
25
+ token = Punctuation.new text: "{"
26
+ assert !token.dash?
27
+ end
28
+
29
+ def test_punctuation_is_opening
30
+ Punctuation::OPENING_SYMBOLS.each do |s|
31
+ token = Punctuation.new text: s
32
+ assert token.opening?, "Token #{token.text} should be an opening"
33
+ end
34
+
35
+ token = Punctuation.new text: "$"
36
+ assert !token.opening?
37
+ end
38
+ end
@@ -0,0 +1,33 @@
1
+ require "test_helper"
2
+
3
+ class SentenceTest < Minitest::Test
4
+ def test_initialized_ok
5
+ sentence = Sentence.new(index: 0, text: "some text in a sentence.")
6
+ t1 = Token.new(index: 0, text: "some", type: "Word", sentence: sentence)
7
+ t2 = Token.new(index: 1, text: "text", type: "Word", sentence: sentence)
8
+ t3 = Token.new(index: 2, text: "in", type: "Word", sentence: sentence)
9
+ t4 = Token.new(index: 3, text: "a", type: "Word", sentence: sentence)
10
+ td1 = TokenDependency.new(dependent: t1, governor: t2, relation: "det")
11
+ sentence.tokens << t1 << t2 << t3 << t4
12
+ sentence.token_dependencies << td1
13
+ assert_equal 4, sentence.tokens.size
14
+ assert_equal 1, sentence.token_dependencies.size
15
+ assert_equal [td1], sentence.governor_dependencies(t2)
16
+ assert_equal [td1], sentence.token_dependencies
17
+ assert_equal 0, sentence.index
18
+ end
19
+
20
+ def test_calculate_previous_and_next_token_from_token_in_a_sentence
21
+ sentence = Sentence.new(index: 0, text: "some text in a sentence.")
22
+ t0 = Word.new(index: 0, text: "some", sentence: sentence)
23
+ t1 = Word.new(index: 1, text: "text", sentence: sentence)
24
+ t2 = Word.new(index: 2, text: "in", sentence: sentence)
25
+ t3 = Word.new(index: 3, text: "a", sentence: sentence)
26
+ sentence.tokens << t0 << t1 << t2 << t3
27
+ assert_equal nil, sentence.previous_token(t0)
28
+ assert_equal nil, sentence.next_token(t3)
29
+ assert_equal t1, sentence.next_token(t0)
30
+ assert_equal t2, sentence.previous_token(t3)
31
+ assert_equal t1, sentence.previous_token(t2)
32
+ end
33
+ end
@@ -0,0 +1,5 @@
1
+ require 'minitest/autorun'
2
+ lib = File.expand_path('../../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "corenlp"
5
+ include Corenlp
@@ -0,0 +1,41 @@
1
+ require "test_helper"
2
+
3
+ class TokenTest < Minitest::Test
4
+ def test_initialized_ok
5
+ assert token = Word.new(index: 0, text: "some", penn_treebank_tag: "NNP",
6
+ stanford_lemma: "some")
7
+ assert_equal "N", token.top_level_penn_treebank_category
8
+ assert token.content?
9
+ end
10
+
11
+ def test_token_equality
12
+ t0 = Word.new(index: 0, text: "some", penn_treebank_tag: "NNP")
13
+ t1 = Word.new(index: 0, text: "more", penn_treebank_tag: "NNP")
14
+ assert t0 == t1
15
+ end
16
+
17
+ def test_token_person_ner_value_is_ignored
18
+ assert Token.new(text: "Walter", ner: "PERSON").ignored_entity?
19
+ end
20
+
21
+ def test_number_recognition
22
+ text_samples = ["33,333", "20", "30.00", "30,000,000.00"]
23
+
24
+ text_samples.each do |text|
25
+ assert_equal Number, Token.token_subclass_from_text(text), text
26
+ end
27
+
28
+ text_samples = ["33A333", "20", "30F00", "30X000;000"]
29
+ text_samples.each do |text|
30
+ refute_equal Number, Token.token_subclass_from_text(text), text
31
+ end
32
+ end
33
+
34
+ def test_enclitic_recognition
35
+ text_samples = %w{n't 'nt 'll n’t ’nt ’ll}
36
+
37
+ text_samples.each do |text|
38
+ assert_equal Enclitic, Token.token_subclass_from_text(text)
39
+ end
40
+ end
41
+ end
@@ -0,0 +1,12 @@
1
+ require 'test_helper'
2
+
3
+ class TreebankTest < Minitest::Test
4
+ @@treebank = Treebank.new(raw_text: 'I put the book down on the coffee table.').parse
5
+
6
+ def test_treebank_has_all_the_parsed_parts
7
+ # earlier versions of Stanford CoreNLP produced somewhat different dependencies:
8
+ # ["I put nsubj", "the book det", "book put dobj", "down put prt", "on put prep", "the table det", "coffee table nn", "table on pobj"]
9
+ expected = ["I put nsubj", "the book det", "book put dobj", "down put prt", "the table det", "coffee table nn", "table put prep_on"]
10
+ assert_equal expected, @@treebank.sentences.map(&:token_dependencies).flatten.map{|x| "#{x.dependent.text} #{x.governor.text} #{x.relation}"}
11
+ end
12
+ end
@@ -0,0 +1,8 @@
1
+ require "test_helper"
2
+
3
+ class WordTest < Minitest::Test
4
+ def test_word_is_initialized_ok
5
+ word = Word.new(text: "here")
6
+ assert word.is_a?(Word)
7
+ end
8
+ end
metadata ADDED
@@ -0,0 +1,150 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: corenlp
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.3
5
+ platform: ruby
6
+ authors:
7
+ - Lengio Corporation
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2014-06-23 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: nokogiri
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '='
18
+ - !ruby/object:Gem::Version
19
+ version: 1.6.1
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - '='
25
+ - !ruby/object:Gem::Version
26
+ version: 1.6.1
27
+ - !ruby/object:Gem::Dependency
28
+ name: rubyzip
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - '>='
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - '>='
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: bundler
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ~>
46
+ - !ruby/object:Gem::Version
47
+ version: '1.5'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ~>
53
+ - !ruby/object:Gem::Version
54
+ version: '1.5'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rake
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - '>='
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - '>='
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: pry
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - '>='
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - '>='
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ - !ruby/object:Gem::Dependency
84
+ name: minitest
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - '>='
88
+ - !ruby/object:Gem::Version
89
+ version: '0'
90
+ type: :development
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - '>='
95
+ - !ruby/object:Gem::Version
96
+ version: '0'
97
+ description: Corenlp is a Ruby gem that uses the Stanford CoreNLP Java tools to parse
98
+ text.
99
+ email:
100
+ - engineering@leng.io
101
+ executables: []
102
+ extensions: []
103
+ extra_rdoc_files: []
104
+ files:
105
+ - lib/corenlp/downloader.rb
106
+ - lib/corenlp/enclitic.rb
107
+ - lib/corenlp/number.rb
108
+ - lib/corenlp/punctuation.rb
109
+ - lib/corenlp/sentence.rb
110
+ - lib/corenlp/token.rb
111
+ - lib/corenlp/token_dependency.rb
112
+ - lib/corenlp/version.rb
113
+ - lib/corenlp/word.rb
114
+ - test/downloader_test.rb
115
+ - test/enclitic_test.rb
116
+ - test/number_test.rb
117
+ - test/punctuation_test.rb
118
+ - test/sentence_test.rb
119
+ - test/test_helper.rb
120
+ - test/token_test.rb
121
+ - test/treebank_test.rb
122
+ - test/word_test.rb
123
+ - README.md
124
+ homepage: https://github.com/lengio/corenlp
125
+ licenses:
126
+ - MIT
127
+ metadata: {}
128
+ post_install_message:
129
+ rdoc_options: []
130
+ require_paths:
131
+ - lib
132
+ required_ruby_version: !ruby/object:Gem::Requirement
133
+ requirements:
134
+ - - '>='
135
+ - !ruby/object:Gem::Version
136
+ version: '0'
137
+ required_rubygems_version: !ruby/object:Gem::Requirement
138
+ requirements:
139
+ - - '>='
140
+ - !ruby/object:Gem::Version
141
+ version: '0'
142
+ requirements: []
143
+ rubyforge_project:
144
+ rubygems_version: 2.0.14
145
+ signing_key:
146
+ specification_version: 4
147
+ summary: Corenlp is a Ruby gem that uses the Stanford CoreNLP Java tools to parse
148
+ text.
149
+ test_files: []
150
+ has_rdoc: