DRMacIver-term-extractor 0.0.0 → 0.0.1

Files changed (14)

data/README.markdown +31 -1
data/VERSION +1 -1
data/app.rb +56 -0
data/lib/term-extractor.rb +78 -30
data/lib/term-extractor/nlp.rb +2 -0
data/term-extractor.gemspec +11 -7
data/test/examples/2009-08-16-14:41_spec.rb +22 -0
data/test/nlp_spec.rb +1 -2
data/test/term_extractor_spec.rb +0 -13
data/training/bad +3 -0
data/training/good +13 -0
data/views/index.haml +19 -0
metadata +9 -6
data/test/examples_spec.rb +0 -131

data/README.markdown CHANGED

@@ -1,7 +1,11 @@
 # The Trampoline Systems term extractor
+## Introduction
 The term extractor is a library for taking natural text and extracting a
-set of terms from it which make sense without additional context. For example, feeding it the following text from my home page:
+set of terms from it which make sense without additional context. We developed it at [Trampoline Systems](http://trampolinesystems.com/) as part of our work on [SONAR](http://trampolinesystems.com/product/Sonar+Expertise/benefits).
+For example, feeding it the following text from my home page:
     Hi. I’m David.
@@ -35,6 +39,32 @@ One limitation of this is that it doesn't necessarily extract all reasonable ter
 Currently only English is supported. There are plans to support other languages, but nothing is implemented in that regard: it requires someone who is native to that language, a competent programmer and at least passingly familiar with NLP, so understandably we're a bit resource constrained on getting widespread non-English support.
+## Usage
+The primary use for the term extractor is as a JRuby library. There is a command line script wrapping it, but it currently only supports very basic use and isn't really practical because of a long startup time (this is more to do with loading models than Java startup).
+Use of the library is very simple:
+    jirb -rubygems -rterm-extractor
+    irb(main):001:0> extractor = TermExtractor.new
+    irb(main):002:0> puts extractor.extract_terms_from_text("Scala is a multi-paradigm programming language designed to integrate features of object-oriented programming and functional programming.")
+    Scala
+    multi-paradigm programming language
+    features
+    irb(main):003:0> p extractor.extract_terms_from_text("Scala is a multi-paradigm programming language designed to integrate features of object-oriented programming and functional programming.")
+    [#<Term:0xd36ff3 @to_s="Scala", @pos="NNP", @sentence=0>, #<Term:0x15af049 @to_s="multi-paradigm programming language", @pos="JJ-NN-NN", @sentence=0>, #<Term:0x1555185 @to_s="features", @pos="NNS", @sentence=0>]
+    irb(main):004:0> terms = extractor.extract_terms_from_text("Scala is a multi-paradigm programming language designed to integrate features of object-oriented programming and functional programming.")
+    irb(main):006:0> p terms[0]
+    #<Term:0x1c958af @to_s="Scala", @pos="NNP", @sentence=0>
+    irb(main):007:0> puts terms[0].pos
+    NNP
+    irb(main):008:0> puts terms[0].sentence
+    0
+    irb(main):009:0> puts terms[0].to_s
+    Scala
+You create a term extractor, pass it text with extract_terms_from_text, and it returns an array of Term objects. Most often you will simply convert these to strings, which give the extracted snippets of text from the document, but they also carry additional information: currently the parts of speech and the index of the sentence in which the term occurs. More information may be added later.
 ## Copyright
 Copyright (c) 2009 Trampoline Systems. See LICENSE for details.

data/VERSION CHANGED

@@ -1 +1 @@
-0.0.0
+0.0.1

data/app.rb ADDED

@@ -0,0 +1,56 @@
+require "date"
+require "rubygems"
+require "sinatra"
+$: << "lib"
+require "term-extractor"
+TE = TermExtractor.new
+get '/' do
+  haml :index
+end
+post '/' do
+  if params[:extract]
+    @text = params[:text]
+    @terms = TE.extract_terms_from_text(@text).map{|x| x.to_s}.uniq
+  elsif params[:train]
+    File.open("training/good", "a"){|o| o.puts params[:goodterms]}
+    File.open("training/bad", "a"){|o| o.puts params[:badterms]}
+    time = DateTime.now.strftime("%Y-%m-%d-%H:%M")
+    File.open("test/examples/#{time}_spec.rb", "w"){ |o|
+o.puts <<SPEC
+TE = TermExtractor.new unless defined? TE
+Text = <<TEXT
+#{params[:text]}
+TEXT
+Terms = TE.extract_terms_from_text(Text).map{|x| x.to_s}.sort.uniq
+describe "the example generated at #{time}" do
+  it "should contain the following terms" do
+    #{(params[:goodterms] || "").split(/\r?\n/).map{|x| x.strip}.inspect}.each do |term|
+      Terms.should include(term)
+    end
+  end
+  it "should not contain the following terms" do
+    #{(params[:badterms] || "").split(/\r?\n/).map{|x| x.strip}.inspect}.each do |term|
+      Terms.should_not include(term)
+    end
+  end
+end
+SPEC
+    }
+  end
+  haml :index
+end
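
The training handler above turns the contents of each textarea into a Ruby array literal before writing it into the generated spec. A minimal sketch of just that step follows; the helper name `terms_literal` is hypothetical and does not appear in app.rb, but the split/strip/inspect chain is the one interpolated into the heredoc:

```ruby
# Hypothetical helper mirroring the expression that app.rb interpolates
# into the generated spec; the name is an assumption, not the gem's API.
def terms_literal(raw)
  (raw || "").split(/\r?\n/).map { |x| x.strip }.inspect
end

# Textarea content may arrive with CRLF line endings and stray spaces.
puts terms_literal("answer \r\ngusto\r\n")
# A missing parameter falls back to an empty list.
puts terms_literal(nil)
```

Because the result of `inspect` is pasted into Ruby source, terms containing quotes or non-ASCII characters are escaped as needed by the Ruby version in use.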

data/lib/term-extractor.rb CHANGED

@@ -1,28 +1,29 @@
 require "term-extractor/nlp"
 class Term
-  attr_accessor :to_s, :pos, :sentence
+  attr_accessor :pos, :sentence, :chunks, :tokens
-  def initialize(ts, pos, sentence = nil)
-    @to_s, @pos, @sentence = ts, pos, sentence
+  def initialize(tokens)
+    @tokens = tokens
+    yield self if block_given?
+  end
+  def to_s
+    @to_s ||= TermExtractor.recombobulate_term(@tokens)
   end
 end
+# A class for extracting useful snippets of text from a document
 class TermExtractor
-  attr_accessor :nlp, :max_term_length, :proscribed_start, :required_ending, :remove_urls, :remove_paths
+  attr_accessor :nlp, :max_term_length, :remove_urls, :remove_paths
   def initialize(models = File.dirname(__FILE__) + "/../models")
     @nlp = NLP.new(models)
     # Empirically, terms longer than about 5 words seem to be either
     # too specific to be useful or very noisy.
-    @max_term_length = 5
-    # Common sources of crap starting words
-    @proscribed_start = /CC|PRP|IN|DT|PRP\$|WP|WP\$|TO|EX/
+    @max_term_length = 4
-    # We have to end in a noun, foreign word or number.
-    @required_ending = /NN|NNS|NNP|NNPS|FW|CD/
     self.remove_urls = true
     self.remove_paths = true
@@ -30,7 +31,14 @@ class TermExtractor
     yield self if block_given?
   end
+  # This class holds all the state needed for term calculations
+  # on a single sentence.
+  # It uses chunking and part of speech tagging information to
+  # mark each token in the sentence as to whether it is allowed
+  # to start a term or end a term and whether terms can cross it.
+  # Terms are then calculated by simply looking for all sequences
+  # of tokens up to the maximum length which meet these constraints.
   class TermContext
     attr_accessor :parent, :tokens, :postags, :chunks
@@ -55,7 +63,8 @@ class TermExtractor
       @sentence = sentence
     end
+    # This is the bit where all the work happens
     def boundaries
       return @boundaries if @boundaries
@@ -66,13 +75,32 @@ class TermExtractor
       @boundaries = tokens.map{|t| {}}
       @boundaries.each_with_index do |b, i|
+        # WARNING: It's important to only write boundaries for indices
+        # <= i. Otherwise the next loop iteration will overwrite the
+        # set value.
         tok = tokens[i]
         pos = postags[i]
         chunk = chunks[i]
         # Cannot cross commas or coordinating conjunctions (and, or, etc)
-        b[:can_cross] = !(pos =~ /,|CC/)
+        b[:can_cross] = !(pos =~ /,/)
+        # words which are extra double plus stop wordy and shouldn't appear inside
+        # terms
+        # FIXME: This is a hack. We're really hitting the limit of
+        # rule based systems here
+        b[:can_cross] &&= ![
+          "after",
+          "where",
+          "when",
+          "for",
+          "at",
+          "to",
+          "with"
+        ].include?(tok)
         # Cannot cross the beginning of verb terms
         # i.e. we may start with verb terms but not include them
         b[:can_cross] = (chunk != "B-VP") if b[:can_cross]
@@ -83,21 +111,36 @@ class TermExtractor
         # We are only allowed to start terms on the beginning of a term chunk
         b[:can_start] = (chunks[i] == "B-NP")
-        if i > 0
-          if postags[i-1] =~ /DT|WDT|PRP|JJR|JJS/
-              # In some cases we want to move the start of a term to the right. These cases are:
-              # - a determiner (the, a, etc)
-              # - a posessive pronoun (my, your, etc)
-              # - comparative and superlative adjectives (best, better, etc.)
-              # In all cases we only do this for noun terms, and will only move them to internal points.
-              b[:can_start] ||= (chunks[i] == "I-NP")
-              @boundaries[i - 1][:can_start] = false
-          end
+        # In some cases we want to move the start of a term to the right. These cases are:
+        # - a determiner (the, a, etc)
+        # - a possessive pronoun (my, your, etc)
+        # - comparative and superlative adjectives (best, better, etc.)
+        # - a number. In this case note that starting with the number is also allowed, e.g. "two cities" will produce both "two cities" and "cities"
+        # In all cases we only do this for noun terms, and will only move them to internal points.
+        if (chunks[i] == "I-NP") && (postags[i-1] =~ /DT|WDT|PRP|JJR|JJS|CD/)
+            b[:can_start] = true
         end
         # We must include any tokens internal to the current chunk
         b[:can_end] = !(chunks[i + 1] =~ /I-/)
+        # We break phrases around coordinating conjunctions (and, or, etc):
+        # a term may end just before the conjunction and start just after it,
+        # even where the chunker would force the phrase to continue past it.
+        # e.g. in "nuts and bolts", we allow "nuts" and "bolts" but not the
+        # whole phrase. This is true even if this resolves as a single chunk
+        if pos == 'CC'
+          @boundaries[i-1][:can_end] = true if i > 0
+          @boundaries[i][:can_cross] = false
+        end
+        # need to do it here rather than in previous if statement
+        # as otherwise the next pass along will overwrite the result
+        # we set here.
+        if i > 0 && @postags[i-1] == 'CC'
+          @boundaries[i][:can_start] = true
+        end
         # It is permitted to cross stopwords, but they cannot lie at the term boundary
         if (nlp.stopword? tok) || (nlp.stopword? tokens[i..i+1].join) # Need to take into account contractions, which span multiple tokens
           b[:can_end] = false
@@ -111,10 +154,12 @@ class TermExtractor
           b[:can_start] = false
           @boundaries[i - 1][:can_end] = false
         end
-        # Must match the requirements for POSes at the beginning and end.
-        b[:can_start] &&= !(pos =~ parent.proscribed_start)
-        b[:can_end] &&= (pos =~ parent.required_ending)
+        # Common sources of crap starting words
+        b[:can_start] &&= !(pos =~ /CC|PRP|IN|DT|PRP\$|WP|WP\$|TO|EX|JJR|JJS/)
+        # TODO: Is this still a good idea?
+        b[:can_end] &&= (pos =~ /NN|NNS|NNP|NNPS|FW|CD/)
       end
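
The comments on TermContext describe the overall scheme: mark each token with can_start, can_end and can_cross, then take every token sequence up to the maximum length that satisfies those marks. The sketch below illustrates that idea only; `enumerate_terms` is a hypothetical helper, the flags are filled in by hand rather than derived from tagging and chunking, and the reading of can_cross (a term may begin on an uncrossable token but may not extend over one) is an assumption based on the comments in this diff:

```ruby
# Hypothetical helper, not part of the gem: list candidate terms from
# hand-written per-token boundary flags.
def enumerate_terms(tokens, boundaries, max_length = 4)
  terms = []
  tokens.each_index do |i|
    next unless boundaries[i][:can_start]
    last = [i + max_length, tokens.length].min
    (i...last).each do |j|
      # Assumption: a term cannot extend over a token marked uncrossable.
      break if j > i && !boundaries[j][:can_cross]
      terms << tokens[i..j].join(" ") if boundaries[j][:can_end]
    end
  end
  terms
end

# "and" is a coordinating conjunction: it cannot start, end, or be crossed.
tokens = %w[nuts and bolts]
boundaries = [
  { :can_start => true,  :can_end => true,  :can_cross => true  },
  { :can_start => false, :can_end => false, :can_cross => false },
  { :can_start => true,  :can_end => true,  :can_cross => true  }
]
p enumerate_terms(tokens, boundaries)
```

With these flags the sketch yields "nuts" and "bolts" but not "nuts and bolts", matching the behaviour the conjunction comment describes.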
@@ -149,7 +194,10 @@ class TermExtractor
         term = tokens[i..j]
         poses = postags.to_a[i..j]
-        term = Term.new(TermExtractor.recombobulate_term(term), poses.join("-"))
+        term = Term.new(term){ |it|
+          it.pos = poses.join("-")
+          it.chunks = chunks.to_a[i..j]
+        }
         terms << term if TermExtractor.allowed_term?(term)
         j += 1
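
Term is now built from its tokens, configured through a block, and only computes its string form on first use. The sketch below shows that pattern in isolation. `SketchTerm` is a stand-in, and the plain space join is an assumption: the real class calls TermExtractor.recombobulate_term, whose body this diff does not show.

```ruby
# Stand-in for the gem's Term class, for illustration only.
class SketchTerm
  attr_accessor :pos, :sentence, :chunks, :tokens

  def initialize(tokens)
    @tokens = tokens
    yield self if block_given? # let the caller set pos, chunks, etc.
  end

  def to_s
    # Assumption: join with spaces; the gem uses recombobulate_term here.
    @to_s ||= @tokens.join(" ")
  end
end

term = SketchTerm.new(%w[healthcare debate]) do |t|
  t.pos = "NN-NN"
  t.chunks = %w[B-NP I-NP]
end
puts term.to_s
puts term.pos
```

Memoising with `||=` means the string is built at most once per term, which matters when many candidate terms are created and then discarded by the final filter.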
@@ -179,7 +227,7 @@ class TermExtractor
   # Final post filter on terms to determine if they're allowed.
   def self.allowed_term?(p)
-    return false if p.pos =~ /^CD(-CD)*$/ # We don't allow things which are just sequences of numbers
+    return false if p.to_s =~ /^[^a-zA-Z]*$/ # We don't allow things which contain no letters, e.g. sequences of numbers
     return false if p.to_s.length > 255
     true
   end
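
The post filter now looks at the term's text rather than its part-of-speech tags, so anything without an ASCII letter is rejected, not only runs of CD tokens. A small sketch of the same two checks applied to plain strings, with `allowed_text?` as a hypothetical stand-in for allowed_term?:

```ruby
# Hypothetical stand-in for TermExtractor.allowed_term?, taking a String
# instead of a Term.
def allowed_text?(text)
  return false if text =~ /^[^a-zA-Z]*$/ # no ASCII letters at all
  return false if text.length > 255      # overly long terms are dropped
  true
end

["1,000", "2009-08-16", "Six months", "e-mail", "x" * 256].each do |s|
  puts "#{s[0, 20].inspect} => #{allowed_text?(s)}"
end
```

Note that the character class covers ASCII letters only, so a term written entirely in another script would also be rejected; that is consistent with the README's statement that only English is supported.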

data/lib/term-extractor/nlp.rb CHANGED

@@ -86,6 +86,8 @@ class TermExtractor
       text = text.dup
       text.gsub!(/--+/, " -- ") # TODO: What's this for?
+      text.gsub!(/…/, "...") # expand ellipsis character
       # Normalize bracket types.
       # TODO: Shouldn't do this inside of tokens.
       text.gsub!(/{\[/, "(")
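
The added substitution expands the single-character ellipsis (U+2026) into three periods before the rest of the cleanup runs, which is presumably what allows the new example spec to rule out "days…" as a term. A sketch of that one step in isolation, with `normalize_ellipsis` as a hypothetical name:

```ruby
# Hypothetical wrapper around the gsub added to nlp.rb.
def normalize_ellipsis(text)
  text = text.dup        # work on a copy, as the surrounding method does
  text.gsub!(/…/, "...") # expand ellipsis character
  text
end

puts normalize_ellipsis("but these days…  I don’t know")
```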

data/term-extractor.gemspec CHANGED

@@ -2,11 +2,11 @@
 Gem::Specification.new do |s|
   s.name = %q{term-extractor}
-  s.version = "0.0.0"
+  s.version = "0.0.1"
   s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
   s.authors = ["David R. MacIver"]
-  s.date = %q{2009-08-06}
+  s.date = %q{2009-09-08}
   s.default_executable = %q{terms.rb}
   s.email = %q{david.maciver@gmail.com}
   s.executables = ["terms.rb"]
@@ -19,6 +19,7 @@ Gem::Specification.new do |s|
      "README.markdown",
      "Rakefile",
      "VERSION",
+     "app.rb",
      "bin/terms.rb",
      "lib/term-extractor.rb",
      "lib/term-extractor/maxent-2.5.2.jar",
@@ -37,11 +38,14 @@ Gem::Specification.new do |s|
      "models/tagdict",
      "models/tok.bin.gz",
      "term-extractor.gemspec",
-     "test/examples_spec.rb",
+     "test/examples/2009-08-16-14:41_spec.rb",
      "test/files/1.email",
      "test/files/juries_seg_8_v1",
      "test/nlp_spec.rb",
-     "test/term_extractor_spec.rb"
+     "test/term_extractor_spec.rb",
+     "training/bad",
+     "training/good",
+     "views/index.haml"
   ]
   s.homepage = %q{http://github.com/david.maciver@gmail.com/term-extractor}
   s.rdoc_options = ["--charset=UTF-8"]
@@ -49,9 +53,9 @@ Gem::Specification.new do |s|
   s.rubygems_version = %q{1.3.4}
   s.summary = %q{A library for extracting useful terms from text}
   s.test_files = [
-    "test/term_extractor_spec.rb",
-     "test/nlp_spec.rb",
-     "test/examples_spec.rb"
+    "test/examples/2009-08-16-14:41_spec.rb",
+     "test/term_extractor_spec.rb",
+     "test/nlp_spec.rb"
   ]
   if s.respond_to? :specification_version then

data/test/examples/2009-08-16-14:41_spec.rb ADDED

@@ -0,0 +1,22 @@
+TE = TermExtractor.new unless defined? TE
+Text = <<TEXT
+As the healthcare debate picks up pace, I find myself being asked with increasing regularity what I think of Britain’s healthcare system.  Six months ago, I’d have jumped into the answer with gusto, but these days…  I don’t know, I am just so fatigued by all the fear-mongering and hysteria, the ignorance and the downright idiocy of the current debate that I can hardly summon the energy to add my voice to the cacophony.
+TEXT
+Terms = TE.extract_terms_from_text(Text).map{|x| x.to_s}.sort.uniq
+describe "the example generated at 2009-08-16-14:41" do
+  it "should contain the following terms" do
+    ["healthcare debate", "Britain's healthcare system", "Six months", "answer", "gusto", "fear-mongering", "hysteria", "ignorance", "downright idiocy", "current debate", "energy", "voice", "cacophony"].each do |term|
+      Terms.should include(term)
+    end
+  end
+  it "should not contain the following terms" do
+    ["days\342\200\246", "voice to the cacophony", "answer with gusto"].each do |term|
+      Terms.should_not include(term)
+    end
+  end
+end

data/test/nlp_spec.rb CHANGED

@@ -39,9 +39,8 @@ I like kitties
 I like puppies
 KITTIES
   end
 end
 describe "url removal" do

data/test/term_extractor_spec.rb CHANGED

@@ -15,18 +15,6 @@ def each_tag(&blk)
 end
 describe TermExtractor do
-  it "should only return themes ending in nouns" do
-    each_tag do |tag|
-      tag.pos.should =~ /(^|-)(#{PE.required_ending})$/
-    end
-  end
-  it "must not return themes starting with proscribed parts of speech" do
-    each_tag do  |tag|
-      tag.pos.should_not =~ /^(#{PE.proscribed_start})($|-)/
-    end
-  end
   it "should produce at least as many tags as words" do
     each_tag do |tag|
       tag.pos.split("-").length.should be >= tag.to_s.split.length
@@ -137,5 +125,4 @@ BINARYSOLO
       term.to_s.should_not =~ /’|'/
     }
   end
 end

data/training/bad ADDED

@@ -0,0 +1,3 @@
+days…
+voice to the cacophony
+answer with gusto

data/training/good ADDED

@@ -0,0 +1,13 @@
+healthcare debate
+Britain's healthcare system
+Six months
+answer
+gusto
+fear-mongering
+hysteria
+ignorance
+downright idiocy
+current debate
+energy
+voice
+cacophony

data/views/index.haml ADDED

@@ -0,0 +1,19 @@
+%html{ :xmlns => "http://www.w3.org/1999/xhtml", "xml:lang" => "en" }
+  %head
+    %title Training
+    %link{:rel => "stylesheet", :type => "text/css", :href => "/style.css"}/
+  %body
+    %h1 Term Extractor Training
+    %form{ :method => "POST" }
+      %div{ :style => "float: left; width: 60%;"}
+        %textarea{ :style =>"width: 100%; height: 60em;", :name => "text"}= @text
+        %input{:type => "submit", :name => "extract", :value => "Extract"}
+      %div{ :style => "float: right; width: 35%;"}
+        %h2 Good terms
+        %textarea{ :style => "width: 100%; height: 20em;", :name => "goodterms"}= @terms && @terms.join("\n")
+        %h2 Bad terms
+        %textarea{ :style => "width: 100%; height: 20em;", :name => "badterms"}
+        %input{:type => "submit", :name => "train", :value => "Train"}

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: DRMacIver-term-extractor
 version: !ruby/object:Gem::Version
-  version: 0.0.0
+  version: 0.0.1
 platform: ruby
 authors:
 - David R. MacIver
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2009-08-06 00:00:00 -07:00
+date: 2009-09-08 00:00:00 -07:00
 default_executable: terms.rb
 dependencies: []
@@ -27,6 +27,7 @@ files:
 - README.markdown
 - Rakefile
 - VERSION
+- app.rb
 - bin/terms.rb
 - lib/term-extractor.rb
 - lib/term-extractor/maxent-2.5.2.jar
@@ -45,14 +46,16 @@ files:
 - models/tagdict
 - models/tok.bin.gz
 - term-extractor.gemspec
-- test/examples_spec.rb
+- test/examples/2009-08-16-14:41_spec.rb
 - test/files/1.email
 - test/files/juries_seg_8_v1
 - test/nlp_spec.rb
 - test/term_extractor_spec.rb
+- training/bad
+- training/good
+- views/index.haml
 has_rdoc: false
 homepage: http://github.com/david.maciver@gmail.com/term-extractor
-licenses:
 post_install_message:
 rdoc_options:
 - --charset=UTF-8
@@ -73,11 +76,11 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubyforge_project:
-rubygems_version: 1.3.5
+rubygems_version: 1.2.0
 signing_key:
 specification_version: 3
 summary: A library for extracting useful terms from text
 test_files:
+- test/examples/2009-08-16-14:41_spec.rb
 - test/term_extractor_spec.rb
 - test/nlp_spec.rb
-- test/examples_spec.rb

data/test/examples_spec.rb DELETED

@@ -1,131 +0,0 @@
-require "term-extractor"
-PE = TermExtractor.new
-Diagrams = <<DIAGRAMS
-I think having nice standardised diagrams of stuff like that is REALLY
-useful. One OO architect drops dead and your replacement walks in and
-can pick up the documents and read them because they already speak
-that language. That's a great thing. I sort of wish it had been pushed
-as being that -- a lingua franca for documenting designs.
-DIAGRAMS
-describe "Diagram terms" do
-end
-Murray = <<MURRAY
-The MCHS Department of Music is one of the most distinguished music programs in the State, having an award-winning choral and band program. The Marching Indians, under the direction of Mr. Mike Weaver, have performed all over the country, most recently at Universal Studios in Orlando, Disney World and the St. Patrick's Day Parade in New York City. Since 1958, the Marching Indians have been entreating fans with exciting, visually stimulating shows and their trademark deep, loud sound. Recently the Marching Indians received the Grand Championship at the 2008 Golden River Music Festival and won the first ever US101 radio battle of the bands receiving a concert by the Eli Young Band. Many students from MCHS Department of Bands have been involved with All District and All State bands as well as various summer clinics, orchestras and even the Georgia Lions All State Band.
-MURRAY
-MurrayTerms = PE.extract_terms_from_text(Murray).map{|x| x.to_s}
-describe "Murray terms" do
-  it "should get Mike's name right" do
-    MurrayTerms.should_not include("Mr . Mike Weaver")
-    MurrayTerms.should include("Mr. Mike Weaver")
-  end
-end
-Chromosome = <<CHROM
-Humans have 23 pairs of chromosomes packed with genes that dictate every aspect of our biological functioning. Of these pairs, the sex chromosomes are different; women have two X chromosomes and men have an X and a Y chromosome. The Y chromosome contains essential blueprints for the male reproductive system, in particular those for sperm development.
-But the Y chromosome, which once contained as many genes as the X chromosome, has deteriorated over time and now contains less than 80 functional genes compared to its partner, which contains more than 1,000 genes. Geneticists and evolutionary biologists determined that the Y chromosome's deterioration is due to accumulated mutations, deletions and anomalies that have nowhere to go because the chromosome doesn't swap genes with the X chromosome like every other chromosomal pair in our cells do.
-CHROM
-ChromosomeTerms = PE.extract_terms_from_text(Chromosome).map{|x| x.to_s}
-describe "Chromosome terms" do
-  it "should say nothing about what humans have" do
-    ChromosomeTerms.should_not include("Humans have 23 pairs")
-  end
-  it "knows about the male reproductive system, if you know what I mean" do
-    ChromosomeTerms.should include("male reproductive system")
-    ChromosomeTerms.should include("sperm development")
-  end
-  it "is about humans" do
-    ChromosomeTerms.should include("Humans")
-  end
-end
-Environment = "Please consider the environment before printing this e-mail"
-EnvironmentTerms = PE.extract_terms_from_text(Environment).map{|x| x.to_s}.sort
-describe "Environment terms" do
-  it "is about email" do
-    EnvironmentTerms.should include("e-mail")
-  end
-end
-Apollo = <<APOLLO
-Fate has ordained that the men who went to the moon to explore in peace will stay on the moon to rest in peace.
-These brave men, Neil Armstrong and Edwin Aldrin, know that there is no hope for their recovery. But they also know that there is hope for mankind in their sacrifice.
-These two men are laying down their lives in mankind's most noble goal: the search for truth and understanding.
-They will be mourned by their families and friends; they will be mourned by their nation; they will be mourned by the people of the world; they will be mourned by a Mother Earth that dared send two of her sons into the unknown.
-In their exploration, they stirred the people of the world to feel as one; in their sacrifice, they bind more tightly the brotherhood of man.
-In ancient days, men looked at stars and saw their heroes in the constellations. In modern times, we do much the same, but our heroes are epic men of flesh and blood.
-Others will follow, and surely find their way home. Man's search will not be denied. But these men were the first, and they will remain the foremost in our hearts.
-APOLLO
-ApolloTerms = PE.extract_terms_from_text(Apollo).map{|x| x.to_s}.sort.uniq
-describe "Apollo terms" do
-  it "knows of Neil and Buzz" do
-    ApolloTerms.should include("Neil Armstrong")
-    ApolloTerms.should include("Edwin Aldrin")
-  end
-  it "knows of where they've been" do
-    ApolloTerms.should include("moon")
-  end
-  it "knows of times past and present" do
-    ApolloTerms.should include("ancient days")
-    ApolloTerms.should include("modern times")
-  end
-  it "knows of destiny" do
-    ApolloTerms.should include("Fate")
-  end
-  it "knows of searching" do
-    ApolloTerms.should include("exploration")
-    ApolloTerms.should include("search")
-    ApolloTerms.should include("Man's search")
-  end
-  it "knows not of mourning, but of courage and sacrifice" do
-    ApolloTerms.should_not include("mourned")
-    ApolloTerms.should include("brave men")
-    ApolloTerms.should include("sacrifice")
-  end
-  it "knows of brotherhood" do
-    ApolloTerms.should include("brotherhood of man")
-  end
-  it "knows of mankind, and of its heroes" do
-    ApolloTerms.should include("man")
-    ApolloTerms.should include("men")
-    ApolloTerms.should include("mankind")
-    ApolloTerms.should include("heroes")
-    ApolloTerms.should include("epic men")
-  end
-  it "looks to the stars from the earth" do
-    ApolloTerms.should include("stars")
-    ApolloTerms.should include("constellations")
-    ApolloTerms.should include("Mother Earth")
-    ApolloTerms.should include("world")
-  end
-end