RubyGems - engtagger - Versions diffs - 0.1.1 → 0.1.2 - Mend

engtagger 0.1.1 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15) hide show

data/lib/engtagger/porter.rb CHANGED

@@ -1,9 +1,5 @@
-#! /local/ruby/bin/ruby
-#
-# $Id: stemmable.rb,v 1.2 2003/02/01 02:07:30 condit Exp $
-#
-# See example usage at the end of this file.
-#
+#!/usr/bin/env ruby
+# -*- coding: utf-8 -*-
 module Stemmable

data/lib/engtagger/pos_tags.hash CHANGED

Binary file

data/lib/engtagger/pos_words.hash CHANGED

Binary file

data/lib/engtagger/version.rb ADDED

@@ -0,0 +1,3 @@
+module EngTagger
+  VERSION = "0.1.2"
+end

metadata CHANGED

@@ -1,86 +1,63 @@
---- !ruby/object:Gem::Specification
+--- !ruby/object:Gem::Specification
 name: engtagger
-version: !ruby/object:Gem::Version
-  version: 0.1.1
+version: !ruby/object:Gem::Version
+  version: 0.1.2
+  prerelease:
 platform: ruby
-authors:
+authors:
 - Yoichiro Hasebe
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2008-05-15 00:00:00 +09:00
-default_executable:
-dependencies:
-- !ruby/object:Gem::Dependency
-  name: hpricot
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: "0"
-    version:
-- !ruby/object:Gem::Dependency
-  name: hoe
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.5.1
-    version:
-description: A Ruby port of Perl Lingua::EN::Tagger, a probability based, corpus-trained  tagger that assigns POS tags to English text based on a lookup dictionary and  a set of probability values. The tagger assigns appropriate tags based on  conditional probabilities--it examines the preceding tag to determine the  appropriate tag for the current word. Unknown words are classified according to  word morphology or can be set to be treated as nouns or other parts of speech.   The tagger also extracts as many nouns and noun phrases as it can, using a set  of regular expressions.
-email: yohasebe@gmail.com
+date: 2012-06-05 00:00:00.000000000 Z
+dependencies: []
+description: A Ruby port of Perl Lingua::EN::Tagger, a probability based, corpus-trained
+  tagger that assigns POS tags to English text based on a lookup dictionary and a
+  set of probability values.
+email:
+- yohasebe@gmail.com
 executables: []
 extensions: []
-extra_rdoc_files:
-- History.txt
-- LICENSE.txt
-- Manifest.txt
-- README.txt
-files:
-- History.txt
-- LICENSE.txt
-- Manifest.txt
-- README.txt
+extra_rdoc_files: []
+files:
+- .gitignore
+- Gemfile
+- LICENSE
+- README.md
 - Rakefile
+- engtagger.gemspec
 - lib/engtagger.rb
 - lib/engtagger/porter.rb
 - lib/engtagger/pos_tags.hash
 - lib/engtagger/pos_words.hash
 - lib/engtagger/tags.yml
 - lib/engtagger/unknown.yml
+- lib/engtagger/version.rb
 - lib/engtagger/words.yml
 - test/test_engtagger.rb
-has_rdoc: true
-homepage: http://engtagger.rubyforge.org
+homepage: http://github.com/yohasebe/engtagger
+licenses: []
 post_install_message:
-rdoc_options:
-- --main
-- README.txt
-require_paths:
+rdoc_options: []
+require_paths:
 - lib
-required_ruby_version: !ruby/object:Gem::Requirement
-  requirements:
-  - - ">="
-    - !ruby/object:Gem::Version
-      version: "0"
-  version:
-required_rubygems_version: !ruby/object:Gem::Requirement
-  requirements:
-  - - ">="
-    - !ruby/object:Gem::Version
-      version: "0"
-  version:
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
 requirements: []
-rubyforge_project: engtagger
-rubygems_version: 1.1.1
+rubyforge_project:
+rubygems_version: 1.8.24
 signing_key:
-specification_version: 2
-summary: English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger
-test_files:
+specification_version: 3
+summary: A probability based, corpus-trained English POS tagger
+test_files:
 - test/test_engtagger.rb

data/History.txt DELETED

@@ -1,10 +0,0 @@
-=== 0.1.0 / 2008-05-14
-* Modified Synopsis section of Readme.txt
-* Created a description of tag set in Readme.txt
-* Fixed a few minor bugs
-=== 0.1.0 / 2008-05-06
-* Initial release
-* Functionalities are basically the same as those of Perl Lingua::EN::Tagger.

data/Manifest.txt DELETED

@@ -1,13 +0,0 @@
-History.txt
-LICENSE.txt
-Manifest.txt
-README.txt
-Rakefile
-lib/engtagger.rb
-lib/engtagger/porter.rb
-lib/engtagger/pos_tags.hash
-lib/engtagger/pos_words.hash
-lib/engtagger/tags.yml
-lib/engtagger/unknown.yml
-lib/engtagger/words.yml
-test/test_engtagger.rb

data/README.txt DELETED

@@ -1,140 +0,0 @@
-= EngTagger
-English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger
-=== Description
-A Ruby port of Perl Lingua::EN::Tagger, a probability based, corpus-trained
-tagger that assigns POS tags to English text based on a lookup dictionary and
-a set of probability values. The tagger assigns appropriate tags based on
-conditional probabilities--it examines the preceding tag to determine the
-appropriate tag for the current word. Unknown words are classified according to
-word morphology or can be set to be treated as nouns or other parts of speech.
-The tagger also extracts as many nouns and noun phrases as it can, using a set
-of regular expressions.
-=== Features
-* Assigns POS tags to English text
-* Extract noun phrases from tagged text
-* etc.
-=== Synopsis:
-  require 'rubygems'
-  require 'engtagger'
-  # Create a parser object
-  tgr = EngTagger.new
-  # Sample text
-  text = "Alice chased the big fat cat."
-  # Add part-of-speech tags to text
-  tagged = tgr.add_tags(text)
-  #=> "<nnp>Alice</nnp> <vbd>chased</vbd> <det>the</det> <jj>big</jj> <jj>fat</jj><nn>cat</nn> <pp>.</pp>"
-  # Get a list of all nouns and noun phrases with occurrence counts
-  word_list = tgr.get_words(text)
-  #=> {"Alice"=>1, "cat"=>1, "fat cat"=>1, "big fat cat"=>1}
-  # Get a readable version of the tagged text
-  readable = tgr.get_readable(text)
-  #=> "Alice/NNP chased/VBD the/DET big/JJ fat/JJ cat/NN ./PP"
-  # Get all nouns from a tagged output
-  nouns = tgr.get_nouns(tagged)
-  #=> {"cat"=>1, "Alice"=>1}
-  # Get all proper nouns
-  proper = tgr.get_proper_nouns(tagged)
-  #=> {"Alice"=>1}
-  # Get all noun phrases of any syntactic level
-  # (same as word_list but take a tagged input)
-  nps = tgr.get_noun_phrases(tagged)
-  #=> {"Alice"=>1, "cat"=>1, "fat cat"=>1, "big fat cat"=>1}
-=== Tag Set
-The set of POS tags used here is a modified version of the Penn Treebank tagset. Tags with non-letter characters have been redefined to work better in our data structures. Also, the "Determiner" tag (DET) has been changed from 'DT', in order to avoid confusion with the HTML tag, <DT>.
-  CC      Conjunction, coordinating               and, or
-  CD      Adjective, cardinal number              3, fifteen
-  DET     Determiner                              this, each, some
-  EX      Pronoun, existential there              there
-  FW      Foreign words
-  IN      Preposition / Conjunction               for, of, although, that
-  JJ      Adjective                               happy, bad
-  JJR     Adjective, comparative                  happier, worse
-  JJS     Adjective, superlative                  happiest, worst
-  LS      Symbol, list item                       A, A.
-  MD      Verb, modal                             can, could, 'll
-  NN      Noun                                    aircraft, data
-  NNP     Noun, proper                            London, Michael
-  NNPS    Noun, proper, plural                    Australians, Methodists
-  NNS     Noun, plural                            women, books
-  PDT     Determiner, prequalifier                quite, all, half
-  POS     Possessive                              's, '
-  PRP     Determiner, possessive second           mine, yours
-  PRPS    Determiner, possessive                  their, your
-  RB      Adverb                                  often, not, very, here
-  RBR     Adverb, comparative                     faster
-  RBS     Adverb, superlative                     fastest
-  RP      Adverb, particle                        up, off, out
-  SYM     Symbol                                  *
-  TO      Preposition                             to
-  UH      Interjection                            oh, yes, mmm
-  VB      Verb, infinitive                        take, live
-  VBD     Verb, past tense                        took, lived
-  VBG     Verb, gerund                            taking, living
-  VBN     Verb, past/passive participle           taken, lived
-  VBP     Verb, base present form                 take, live
-  VBZ     Verb, present 3SG -s form               takes, lives
-  WDT     Determiner, question                    which, whatever
-  WP      Pronoun, question                       who, whoever
-  WPS     Determiner, possessive & question       whose
-  WRB     Adverb, question                        when, how, however
-  PP      Punctuation, sentence ender             ., !, ?
-  PPC     Punctuation, comma                      ,
-  PPD     Punctuation, dollar sign                $
-  PPL     Punctuation, quotation mark left        ``
-  PPR     Punctuation, quotation mark right       ''
-  PPS     Punctuation, colon, semicolon, elipsis  :, ..., -
-  LRB     Punctuation, left bracket               (, {, [
-  RRB     Punctuation, right bracket              ), }, ]
-=== Requirements
-* Ruby 1.8.6
-* Hpricot[http://code.whytheluckystiff.net/hpricot/] (optional)
-=== Install
-  (sudo) gem install engtagger
-=== Author
-of this Ruby library
-* Yoichiro Hasebe (yohasebe [at] gmail.com)
-of the original Perl module
-* Aaron Coburn (acoburn [at] middlebury.edu)
-=== Acknowledgement
-This Ruby library is a direct port of Lingua::EN::Tagger available at CPAN.
-The credit for the crucial part of its algorithm/design therefore goes to
-Aaron Coburn, the author of the original Perl version.
-=== License
-This library is distributed under the GPL.  Please see the LICENSE file.