RubyGems - solrizer - Versions diffs - 3.4.1 → 4.0.0 - Mend

solrizer 3.4.1 → 4.0.0


Files changed (17)

checksums.yaml +4 -4
data/CONTRIBUTING.md +66 -20
data/README.md +1 -75
data/lib/solrizer.rb +0 -3
data/lib/solrizer/field_mapper.rb +1 -1
data/lib/solrizer/version.rb +1 -1
data/solrizer.gemspec +0 -2
metadata +3 -44
data/bin/solrizer +0 -107
data/bin/solrizerd +0 -68
data/lib/solrizer/extractor.rb +0 -68
data/lib/solrizer/html.rb +0 -7
data/lib/solrizer/html/extractor.rb +0 -36
data/lib/solrizer/xml.rb +0 -5
data/lib/solrizer/xml/extractor.rb +0 -32
data/spec/units/extractor_spec.rb +0 -44
data/spec/units/xml_extractor_spec.rb +0 -26

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 75f93d429c92672c47052351cd82036b31a98916
-  data.tar.gz: bd2336c84504de0921a74feb4fead735275b9110
+  metadata.gz: d9aa32a4193f5ad11975f40824475634bf149efb
+  data.tar.gz: 6300a897b884f28fd8a335f36e6a2ee214df4d16
 SHA512:
-  metadata.gz: 034013543a506460f4e39f31ef449336a533383ecd9279b5900e7a4f341266620084c47459469d14f79abcd15aa6dd4c31bcd404e0e4d3653ec58915df817428
-  data.tar.gz: dc8d9f340f30bec3e03820677b833d0012188aafafad1f21d6a234a21640621d0eb74b702bd5f508fc61672b512387bc871388d9346eee46ce18b862417fcc8f
+  metadata.gz: 955481c893fbd628a1c61a088465f40d07978647eb5d711f1ad7ff63711da6301a4a735dd7a770d2d95e63a93dc430a509d715cdeddd7a1c4b1a57131a2ed8bd
+  data.tar.gz: c7eb036b3a788b8de7a764c077f60ef7e74ae90ce70335f613463920852cc47f5adb6541d2a304fc509c0000619f1d2601924a6fa4a314275d4ba186323952f0

data/CONTRIBUTING.md CHANGED

@@ -3,6 +3,13 @@
 We want your help to make Project Hydra great.
 There are a few guidelines that we need contributors to follow so that we can have a chance of keeping on top of things.
+## Code of Conduct
+The Hydra community is dedicated to providing a welcoming and positive experience for all its
+members, whether they are at a formal gathering, in a social setting, or taking part in activities
+online.  Please see our [Code of Conduct](https://wiki.duraspace.org/display/hydra/Code+of+Conduct)
+for more information.
 ## Hydra Project Intellectual Property Licensing and Ownership
 All code contributors must have an Individual Contributor License Agreement (iCLA) on file with the Hydra Project Steering Group.
@@ -16,8 +23,10 @@ You should also add yourself to the `CONTRIBUTORS.md` file in the root of the pr
 * Reporting Issues
 * Making Changes
+* Documenting Code
+* Committing Changes
 * Submitting Changes
-* Merging Changes
+* Reviewing and Merging Changes
 ### Reporting Issues
@@ -38,8 +47,28 @@ You should also add yourself to the `CONTRIBUTORS.md` file in the root of the pr
   * Then checkout the new branch with `git checkout fix/master/my_contribution`.
   * Please avoid working directly on the `master` branch.
   * You may find the [hub suite of commands](https://github.com/defunkt/hub) helpful
+* Make sure you have added sufficient tests and documentation for your changes.
+  * Test functionality with RSpec; est features / UI with Capybara.
+* Run _all_ the tests to assure nothing else was accidentally broken.
+### Documenting Code
+* All new public methods, modules, and classes should include inline documentation in [YARD](http://yardoc.org/).
+  * Documentation should seek to answer the question "why does this code exist?"
+* Document private / protected methods as desired.
+* If you are working in a file with no prior documentation, do try to document as you gain understanding of the code.
+  * If you don't know exactly what a bit of code does, it is extra likely that it needs to be documented. Take a stab at it and ask for feedback in your pull request. You can use the 'blame' button on GitHub to identify the original developer of the code and @mention them in your comment.
+  * This work greatly increases the usability of the code base and supports the on-ramping of new committers.
+  * We will all be understanding of one another's time constraints in this area.
+* YARD examples:
+  * [Hydra::Works::RemoveGenericFile](https://github.com/projecthydra-labs/hydra-works/blob/master/lib/hydra/works/services/generic_work/remove_generic_file.rb)
+  * [ActiveTriples::LocalName::Minter](https://github.com/ActiveTriples/active_triples-local_name/blob/master/lib/active_triples/local_name/minter.rb)
+* [Getting started with YARD](http://www.rubydoc.info/gems/yard/file/docs/GettingStarted.md)
+### Committing changes
 * Make commits of logical units.
-  * Your commit should include a high level description of your work in HISTORY.textile
+  * Your commit should include a high level description of your work in HISTORY.textile
 * Check for unnecessary whitespace with `git diff --check` before committing.
 * Make sure your commit messages are [well formed](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
 * If you created an issue, you can close it by including "Closes #issue" in your commit message. See [Github's blog post for more details](https://github.com/blog/1386-closing-issues-via-commit-messages)
@@ -60,7 +89,9 @@ You should also add yourself to the `CONTRIBUTORS.md` file in the root of the pr
         class PostsController
           def index
-            respond_with Post.limit(10)
+            respond_to do |wants|
+              wants.html { render 'index' }
+            end
           end
         end
@@ -72,38 +103,53 @@ You should also add yourself to the `CONTRIBUTORS.md` file in the root of the pr
       long to fit in 72 characters
 ```
-* Make sure you have added the necessary tests for your changes.
-* Run _all_ the tests to assure nothing else was accidentally broken.
-* When you are ready to submit a pull request
 ### Submitting Changes
-[Detailed Walkthrough of One Pull Request per Commit](http://ndlib.github.io/practices/one-commit-per-pull-request/)
 * Read the article ["Using Pull Requests"](https://help.github.com/articles/using-pull-requests) on GitHub.
 * Make sure your branch is up to date with its parent branch (i.e. master)
   * `git checkout master`
   * `git pull --rebase`
   * `git checkout <your-branch>`
   * `git rebase master`
-  * It is likely a good idea to run your tests again.
-* Squash the commits for your branch into one commit
-  * `git rebase --interactive HEAD~<number-of-commits>` ([See Github help](https://help.github.com/articles/interactive-rebase))
-  * To determine the number of commits on your branch: `git log master..<your-branch> --oneline | wc -l`
+  * It is a good idea to run your tests again.
+* If you've made more than one commit take a moment to consider whether squashing commits together would help improve their logical grouping.
+  * [Detailed Walkthrough of One Pull Request per Commit](http://ndlib.github.io/practices/one-commit-per-pull-request/)
+  * `git rebase --interactive master` ([See Github help](https://help.github.com/articles/interactive-rebase))
   * Squashing your branch's changes into one commit is "good form" and helps the person merging your request to see everything that is going on.
 * Push your changes to a topic branch in your fork of the repository.
 * Submit a pull request from your fork to the project.
-### Merging Changes
+### Reviewing and Merging Changes
+We adopted [Github's Pull Request Review](https://help.github.com/articles/about-pull-request-reviews/) for our repositories.
+Common checks that may occur in our repositories:
+1. Travis CI - where our automated tests are running
+2. Hound CI - where we check for style violations
+3. Approval Required - Github enforces at least one person approve a pull request. Also, all reviewers that have chimed in must approve.
+4. CodeClimate - is our code remaining healthy (at least according to static code analysis)
+If one or more of the required checks failed (or are incomplete), the code should not be merged (and the UI will not allow it). If all of the checks have passed, then anyone on the project (including the pull request submitter) may merge the code.
+*Example: Carolyn submits a pull request, Justin reviews the pull request and approves. However, Justin is still waiting on other checks (Travis CI is usually the culprit), so he does not merge the pull request. Eventually, all of the checks pass. At this point, Carolyn or anyone else may merge the pull request.*
+#### Things to Consider When Reviewing
+First, the person contributing the code is putting themselves out there. Be mindful of what you say in a review.
+* Ask clarifying questions
+* State your understanding and expectations
+* Provide example code or alternate solutions, and explain why
+This is your chance for a mentoring moment of another developer. Take time to give an honest and thorough review of what has changed. Things to consider:
-* It is considered "poor from" to merge your own request.
-* Please take the time to review the changes and get a sense of what is being changed. Things to consider:
   * Does the commit message explain what is going on?
-  * Does the code changes have tests? _Not all changes need new tests, some changes are refactorings_
+  * Does the code changes have tests? _Not all changes need new tests, some changes are refactors_
+  * Do new or changed methods, modules, and classes have documentation?
   * Does the commit contain more than it should? Are two separate concerns being addressed in one commit?
-  * Did the Travis tests complete successfully?
-* If you are uncertain, bring other contributors into the conversation by creating a comment that includes their @username.
-* If you like the pull request, but want others to chime in, create a +1 comment and tag a user.
+  * Does the description of the new/changed specs match your understanding of what the spec is doing?
+If you are uncertain, bring other contributors into the conversation by assigning them as a reviewer.
 # Additional Resources

data/README.md CHANGED

@@ -3,13 +3,7 @@
 [![Build Status](https://travis-ci.org/projecthydra/solrizer.png?branch=master)](https://travis-ci.org/projecthydra/solrizer)
 [![Gem Version](https://badge.fury.io/rb/solrizer.png)](http://badge.fury.io/rb/solrizer)
-A lightweight, configurable tool for indexing metadata into solr.  Can be triggered from within your application, from
-the command line, or as a JMS listener.
-Solrizer provides the baseline and structures for the process of solrizing.  In order to actually read objects from a
-data source and write solr documents into a solr instance, you need to use an implementation specific gem, such as
-"solrizer-fedora":https://github.com/projecthydra/solrizer-fedora, which provides the mechanics for reading from a
-fedora repository and writing to a solr instance.
+A lightweight tool for creating dynamic solr schema sufixes.
 ## Installation
@@ -157,74 +151,6 @@ But now you may also pass an Descriptor instance if that works for you:
     indexer = Solrizer::Descriptor.new(:integer, :indexed, :stored)
     t.main_title(:index_as=>[indexer],:path=>"title", :label=>"title") { ... }
-### Extractor and Extractor Mixins
-Solrizer::Extractor provides utilities for extracting solr fields from objects or inserting solr fields into documents:
-    > extractor = Solrizer::Extractor.new
-    > solr_doc = Hash.new
-    > extractor.format_node_value(["foo     ","\n      bar"])
-    => "foo bar"
-    > extractor.insert_solr_field_value(solr_doc, "foo","bar")
-    => {"foo"=>"bar"}
-    > extractor.insert_solr_field_value(solr_doc,"foo","baz")
-    => {"foo"=>["bar", "baz"]}
-    > extractor.insert_solr_field_value(solr_doc, "boo","hoo")
-    => {"foo"=>["bar", "baz"], "boo"=>"hoo"}
-#### Solrizer provides some default mixins:
-`Solrizer::HTML::Extractor` provides html_to_solr method and `Solrizer::XML::Extractor` provides xml_to_solr method:
-    > Solrizer::XML::Extractor
-    > extractor = Solrizer::Extractor.new
-    > xml = "<fields><foo>bar</foo><bar>baz</bar></fields>"
-    > extractor.xml_to_solr(xml)
-    => {:foo_tesim=>"bar", :bar_tesim=>"baz"}
-#### Solrizer::XML::TerminologyBasedSolrizer
-Another powerful mixin for use with classes that include the `OM::XML::Document` module is
-`Solrizer::XML::TerminologyBasedSolrizer`. The methods provided by this module map provides a robust way of mapping
-terms and solr fields via om terminologies. A notable example  can be found in `ActiveFedora::NokogiriDatatstream`.
-## JMS Listener for Hydra Rails Applications
-### The executables: solrizer and solrizerd
-The solrizer gem provides two executables:
- * solrizer is a stomp consumer which listens for fedora.apim.updates and solrizes (or de-solrizes) objects accordingly.
- * solrizerd is a wrapper script that spawns a daemonized version of solrizer and handles start|stop|restart|status requests.
-### Usage
-The usage for solrizerd is as follows:
-    solrizerd command --hydra_home PATH [options]
-The commands are as follows:
- *  start      start an instance of the application
- *  stop       stop all instances of the application
- *  restart    stop all instances and restart them afterwards
- *  status     show status (PID) of application instances
-Required parameters:
---hydra_home: this is the path to your hydra rails applications' root directory.  Solrizerd needs this in order to load all your models and corresponding terminoligies.
-The options:
- *  -p, --port         Stomp port  61613
- *  -o, --host         Host to connect to  localhost
- *  -u, --user         User name for stomp listener
- *  -w, --password     Password for stomp listener
- *  -d, --destination  Topic to listen to (default: /topic/fedora.apim.update)
- *  -h, --help         Display this screen
-Note:
-Since the solrizer script must fire up your hydra rails application, it must have all the gems installed that your hydra instance needs.
 ## Note on Patches/Pull Requests
 * Fork the project.

data/lib/solrizer.rb CHANGED

@@ -5,14 +5,11 @@ module Solrizer
   extend ActiveSupport::Autoload
   autoload :Common
-  autoload :Extractor
   autoload :Descriptor
   autoload :FieldMapper
   autoload :DefaultDescriptors
   autoload :Suffix
-  autoload :HTML, 'solrizer/html'
   autoload :VERSION, 'solrizer/version'
-  autoload :XML, 'solrizer/xml'
   mattr_accessor :logger, instance_writer: false

data/lib/solrizer/field_mapper.rb CHANGED

@@ -163,7 +163,7 @@ module Solrizer
     def extract_type(value)
       case value
       when NilClass
-      when 0.class # Fixnum for ruby < 2.4, and Integer afterwards
+      when Integer # In ruby < 2.4, Fixnum extends Integer
         :integer
       when DateTime
         :time
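
Note on the hunk above: `case` dispatches with `===`, so `when Integer` matches every integer value on any Ruby version. On Ruby < 2.4, `0.class` evaluated to `Fixnum`, which would not match a `Bignum`. The following is a minimal sketch of that behaviour; `sketch_extract_type` is a hypothetical stand-in, not the gem's `extract_type`.

```ruby
# Hypothetical helper that mirrors only the integer branch of the hunk above.
def sketch_extract_type(value)
  case value
  when Integer # Integer === value holds for Fixnum and Bignum (Ruby < 2.4) and for Integer (Ruby >= 2.4)
    :integer
  else
    :other
  end
end

small = 42
large = 2**80 # a Bignum on Ruby < 2.4, a plain Integer afterwards

puts sketch_extract_type(small).inspect
puts sketch_extract_type(large).inspect
puts sketch_extract_type("42").inspect
```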

data/lib/solrizer/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Solrizer
-  VERSION = "3.4.1"
+  VERSION = "4.0.0"
 end
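
Note on the version bump: because this is a major release, a pessimistic constraint written against the 3.x series will not resolve to 4.0.0. The sketch below checks this with RubyGems' own requirement classes; the constraint strings are illustrative assumptions, not taken from any particular Gemfile.

```ruby
require "rubygems"

previous = Gem::Version.new("3.4.1")
released = Gem::Version.new("4.0.0")

# "~> 3.4" is shorthand for ">= 3.4, < 4.0".
pessimistic = Gem::Requirement.new("~> 3.4")
# An open-ended constraint has no upper bound.
open_ended = Gem::Requirement.new(">= 3.4")

puts pessimistic.satisfied_by?(previous) # true
puts pessimistic.satisfied_by?(released) # false
puts open_ended.satisfied_by?(released)  # true
```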

data/solrizer.gemspec CHANGED

@@ -14,8 +14,6 @@ Gem::Specification.new do |s|
   s.add_dependency "nokogiri"
   s.add_dependency "xml-simple"
-  s.add_dependency "stomp"
-  s.add_dependency "daemons"
   s.add_dependency "activesupport"
   s.add_development_dependency 'rspec', '~> 3.5'
   s.add_development_dependency 'rake'

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: solrizer
 version: !ruby/object:Gem::Version
-  version: 3.4.1
+  version: 4.0.0
 platform: ruby
 authors:
 - Matt Zumwalt
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-01-05 00:00:00.000000000 Z
+date: 2017-01-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri
@@ -38,34 +38,6 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
-- !ruby/object:Gem::Dependency
-  name: stomp
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '0'
-  type: :runtime
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '0'
-- !ruby/object:Gem::Dependency
-  name: daemons
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '0'
-  type: :runtime
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '0'
 - !ruby/object:Gem::Dependency
   name: activesupport
   requirement: !ruby/object:Gem::Requirement
@@ -139,9 +111,7 @@ dependencies:
 description: Use solrizer to populate solr indexes.  You can run solrizer from within
   your app, using the provided rake tasks, or as a JMS listener
 email: hydra-tech@googlegroups.com
-executables:
-- solrizer
-- solrizerd
+executables: []
 extensions: []
 extra_rdoc_files:
 - LICENSE
@@ -155,31 +125,22 @@ files:
 - LICENSE
 - README.md
 - Rakefile
-- bin/solrizer
-- bin/solrizerd
 - lib/solrizer.rb
 - lib/solrizer/common.rb
 - lib/solrizer/default_descriptors.rb
 - lib/solrizer/descriptor.rb
-- lib/solrizer/extractor.rb
 - lib/solrizer/field_mapper.rb
-- lib/solrizer/html.rb
-- lib/solrizer/html/extractor.rb
 - lib/solrizer/suffix.rb
 - lib/solrizer/version.rb
-- lib/solrizer/xml.rb
-- lib/solrizer/xml/extractor.rb
 - lib/tasks/solrizer.rake
 - solrizer.gemspec
 - spec/.rspec
 - spec/fixtures/druid-bv448hq0314-descMetadata.xml
 - spec/spec_helper.rb
 - spec/units/common_spec.rb
-- spec/units/extractor_spec.rb
 - spec/units/field_mapper_spec.rb
 - spec/units/solrizer_spec.rb
 - spec/units/suffix_spec.rb
-- spec/units/xml_extractor_spec.rb
 homepage: http://github.com/projecthydra/solrizer
 licenses: []
 metadata: {}
@@ -208,8 +169,6 @@ test_files:
 - spec/fixtures/druid-bv448hq0314-descMetadata.xml
 - spec/spec_helper.rb
 - spec/units/common_spec.rb
-- spec/units/extractor_spec.rb
 - spec/units/field_mapper_spec.rb
 - spec/units/solrizer_spec.rb
 - spec/units/suffix_spec.rb
-- spec/units/xml_extractor_spec.rb

data/bin/solrizer DELETED

@@ -1,107 +0,0 @@
-#!/usr/bin/env ruby
-require 'rubygems'
-require 'optparse'
-require 'stomp'
-options = {}
-optparse = OptionParser.new do|opts|
-  opts.banner = "Usage: solrizer [options]"
-  options[:hydra_home] = nil
-  opts.on( '--hydra_home PATH', 'Load the Hydra instance  at this path' ) do |path|
-    if File.exist?(File.join(path,"config","environment.rb"))
-      options[:hydra_home] = path
-    else
-      puts "#{path} does not appear to be a valid rails home"
-      exit
-    end
-  end
-  options[:port] = 61613
-  opts.on('-p','--port NUM', 'Stomp port') do |port|
-    options[:port] = port
-  end
-  options[:host] = 'localhost'
-  opts.on('-o','--host HOSTNAME', 'Host to connect to') do |host|
-    options[:host] = host
-  end
-  options[:user] = 'fedoraStomper'
-  opts.on('-u', '--user USERNAME', 'User name for stomp listener') do |user|
-    options[:user] = user
-  end
-  options[:password] = 'fedoraStomper'
-  opts.on('-w', '--password PASSWORD', 'Password for stomp listener') do |password|
-    options[:password] = password
-  end
-  options[:destination] = '/topic/fedora.apim.update'
-  opts.on('-d','--destination TOPIC', 'Topic to listen to') do |destination|
-    options[:destination] = destination
-  end
-  opts.on('-h', '--help', 'Display this screen') do
-    puts opts
-    exit
-  end
-end
-optparse.parse!
-begin; require 'rubygems'; rescue; end
-if options[:hydra_home]
-  puts "Loading app..."
-  Dir.chdir(options[:hydra_home])
-  require File.join(options[:hydra_home],"config","environment.rb")
-  puts "app loaded"
-else
-  $stderr.puts "The --hydra_home PATH option is mandatory. Please provide the path to the root of a valid Hydra instance."
-  exit 1
-end
-puts "loading listener"
-begin
-  @port = options[:port]
-  @host = options[:host]
-  @user = options[:user]
-  @password = options[:password]
-  @reliable = true
-  @clientid = "fedora_stomper"
-  @destination = options[:destination]
-  $stderr.print "Connecting to stomp://#{@host}:#{@port} as #{@user}\n"
-  @conn = Stomp::Connection.open(@user, @password, @host, @port, @reliable, 5, {"client-id" => @clientid} )
-  $stderr.print "Getting output from #{@destination}\n"
-  @conn.subscribe(@destination, {"activemq.subscriptionName" => @clientid, :ack =>"client" })
-  while true
-      @msg = @conn.receive
-      pid = @msg.headers["pid"]
-      method = @msg.headers["methodName"]
-      puts @msg.headers.inspect
-      puts "\nPID: #{@msg.headers["pid"]}\n"
-      if ["addDatastream", "addRelationship","ingest","modifyDatastreamByValue","modifyDatastreamByReference","modifyObject","purgeDatastream","purgeRelationship"].include? method
-        ActiveFedora::Base.find(@msg.headers["pid"], cast: true).update_index
-      elsif method == "purgeObject"
-        ActiveFedora::SolrService.instance.conn.delete_by_id(pid)
-      else
-        $stderr.puts "Unknown Method: #{method}"
-      end
-      puts  "updated solr index for #{@msg.headers["pid"]}\n"
-      @conn.ack @msg.headers["message-id"]
-  end
-  @conn.join
-rescue Exception => e
-p e
-end
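
Note on the deleted script: applications that still need this listener would have to host it themselves. As a starting point, the dispatch rule in the receive loop can be isolated as a pure function. This is a sketch only: the method names are copied from the script above, while `UPDATE_METHODS`, `index_action_for` and the returned symbols are hypothetical names.

```ruby
# Fedora API-M method names that triggered a reindex in the deleted script.
UPDATE_METHODS = %w[
  addDatastream addRelationship ingest modifyDatastreamByValue
  modifyDatastreamByReference modifyObject purgeDatastream purgeRelationship
].freeze

# Hypothetical helper: names the action the removed listener took for a message's methodName header.
def index_action_for(method_name)
  if UPDATE_METHODS.include?(method_name)
    :update_index      # the script called update_index on the object
  elsif method_name == "purgeObject"
    :delete_from_index # the script deleted the Solr document by id
  else
    :unknown           # the script logged "Unknown Method"
  end
end

puts index_action_for("ingest").inspect
puts index_action_for("purgeObject").inspect
puts index_action_for("getDatastream").inspect
```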

data/bin/solrizerd DELETED

@@ -1,68 +0,0 @@
-#!/usr/bin/env ruby
-require 'rubygems'
-require 'daemons'
-require 'stomp'
-banner=<<-EOC
-Usage: solrizerd command --hydra_home PATH [options]
-        PATH must point to a valid hydra application
-Commands:
-        start         start an instance of the application
-        stop          stop all instances of the application
-        restart       stop all instances and restart them afterwards
-        status        show status (PID) of application instances
-Options:
-        --hydra_home PATH          Load the hydra instance at this path
-        -p, --port NUM             Stomp port (default 61613)
-        -o, --host HOSTNAME        Host to connect to
-        -u, --user USERNAME        User name for stomp listener
-        -w, --password PASSWORD    Password for stomp listener
-        -d, --destination TOPIC    Topic to listen to (default: /topic/fedora.apim.update)
-        -h, --help                 Display this screen
-EOC
-# check for a valid command
-unless ['start','stop','restart','status'].include? ARGV[0]
-  puts banner
-  exit 7
-end
-if ARGV.include?('-h') || ARGV.include?('--help')
-  puts banner
-  exit 0
-end
-# Make sure --hydra_home was set for the start and restart commands
-if ARGV[0] == 'start' || ARGV[0] == 'restart'
-  unless ARGV[1] == '--hydra_home'
-    puts "ERROR: You must --hydra_home to specify the path to a valid hydra application"
-    exit 8
-  end
-# make sure valid path was set for hydra_home
-  unless ARGV[2] && File.exist?(File.join(ARGV[2],"config","environment.rb"))
-    puts "ERROR: the path entered does not appear to be a valid hydra instance"
-    exit 9
-  end
-end
-options = {
-  :multiple=>false,
-  :dir_mode=>:normal,
-  :dir=>'/tmp',
-  :backtrace=>true
-}
-argv_array = []
-argv_array << ARGV[0]
-argv_array << '--'
-ARGV[1..-1].each {|ele| argv_array << ele }
-options[:ARGV] = argv_array
-version = '>=0'
-app = Gem.bin_path('solrizer','solrizer',version)
-Daemons.run(app,options)
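
Note on the deleted wrapper: before handing control to `Daemons.run`, it rewrote the command line so that the command came first, followed by `--`, followed by the remaining arguments. As far as the Daemons convention goes, arguments after `--` are forwarded to the wrapped application. A minimal sketch of that rewrite, using a hypothetical `daemons_argv` helper:

```ruby
# Hypothetical helper reproducing the argv_array construction in the deleted script.
def daemons_argv(argv)
  [argv[0], "--", *argv[1..-1]]
end

puts daemons_argv(["start", "--hydra_home", "/path/to/app", "-p", "61613"]).inspect
puts daemons_argv(["status"]).inspect
```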

data/lib/solrizer/extractor.rb DELETED

@@ -1,68 +0,0 @@
-module Solrizer
-# Provides utilities for extracting solr fields from a variety of objects and/or creating solr documents from a given object
-# Note: These utilities are optional.  You can implement .to_solr directly on your classes if you want to bypass using Extractors.
-#
-# Each of the Solrizer implementations (ie. solrizer-fedora) provides its own Extractor module that extends the behaviors of Solrizer::Extractor
-# with methods specific to that implementation (ie. extract_tag, extract_rels_ext, xml_to_solr, html_to_solr).
-# By convention, the solrizer implementations will mix their own Extractors' behaviors into this class when you load them into an application.
-#
-class Extractor
-  class << self
-    # Insert +field_value+ for +field_name+ into +solr_doc+
-    # Handles inserting new values into a Hash while ensuring that you don't destroy or overwrite any existing values in the hash.
-    # Ensures that field values are always appended to arrays within the values hash.
-    # Also ensures that values are run through format_node_value
-    # @param [Hash] solr_doc
-    # @param [String] field_name
-    # @param [String] field_value
-    def insert_solr_field_value(solr_doc, field_name, field_value)
-      formatted_value = format_node_value(field_value)
-      if solr_doc[field_name]
-        solr_doc[field_name] = Array(solr_doc[field_name]) << formatted_value
-      else
-        solr_doc[field_name] = formatted_value
-      end
-      return solr_doc
-    end
-    # Strips the majority of whitespace from the values array and then joins them with a single blank delimitter
-    # Returns an empty string if values argument is nil
-    #
-    # @param [Array] values Array of strings representing the values to be formatted
-    # @return [String]
-    def format_node_value values
-      if values.nil?
-        ""
-      else
-        Array(values).map{|val| val.gsub(/\s+/,' ').strip}.join(" ")
-      end
-    end
-  end
-  # Instance Methods
-  # Alias for Solrizer::Extractor#insert_solr_field_value
-  def insert_solr_field_value(solr_doc, field_name, field_value)
-    Solrizer::Extractor.insert_solr_field_value(solr_doc, field_name, field_value)
-  end
-  # Alias for Solrizer::Extractor#format_node_value
-  def format_node_value values
-    Solrizer::Extractor.format_node_value(values)
-  end
-  # Deprecated.
-  # merges input_hash into solr_hash
-  # @param [Hash] input_hash the input hash of values
-  # @param [Hash] solr_hash the solr values hash to add the values into
-  # @return [Hash] the populated Solr values hash
-  #
-  def extract_hash( input_hash, solr_hash=Hash.new )
-    warn "[DEPRECATION] `extract_hash` is deprecated.  Just pass values directly into your solr values hash"
-    return solr_hash.merge!(input_hash)
-  end
-end
-end
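
Note on the deleted class: code that called `Solrizer::Extractor.format_node_value` or `insert_solr_field_value` has no counterpart in 4.0.0. One option is to carry the two helpers locally. The sketch below mirrors the deleted method bodies; the module name `LocalSolrFieldHelpers` is a hypothetical choice, not something the gem provides.

```ruby
# Hypothetical local replacement for the two class methods removed with Solrizer::Extractor.
module LocalSolrFieldHelpers
  module_function

  # Collapse whitespace runs inside each value, strip the ends, and join the values with one space.
  # Returns an empty string when given nil.
  def format_node_value(values)
    return "" if values.nil?
    Array(values).map { |val| val.gsub(/\s+/, " ").strip }.join(" ")
  end

  # Set the field when it is absent or nil; otherwise append, promoting the existing value to an array.
  def insert_solr_field_value(solr_doc, field_name, field_value)
    formatted = format_node_value(field_value)
    if solr_doc[field_name]
      solr_doc[field_name] = Array(solr_doc[field_name]) << formatted
    else
      solr_doc[field_name] = formatted
    end
    solr_doc
  end
end

doc = {}
LocalSolrFieldHelpers.insert_solr_field_value(doc, "foo", "bar")
LocalSolrFieldHelpers.insert_solr_field_value(doc, "foo", "baz")
puts doc.inspect
puts LocalSolrFieldHelpers.format_node_value(["foo     ", "\n      bar"]).inspect
```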

data/lib/solrizer/html.rb DELETED

@@ -1,7 +0,0 @@
-require "solrizer"
-module Solrizer::HTML
-end
-Dir[File.join(File.dirname(__FILE__),"html","*.rb")].each {|file| require file }
-Solrizer::Extractor.send(:include, Solrizer::HTML::Extractor)

data/lib/solrizer/html/extractor.rb DELETED

@@ -1,36 +0,0 @@
-require "nokogiri"
-require 'yaml'
-module Solrizer::HTML::Extractor
-  #
-  # This method strips html tags out and returns content to be indexed in solr
-  #
-  # @param [Datastream] ds object that responds to .content with HTML content
-  # @param [Hash] solr_doc hash of values to be inserted into solr as a solr document
-  def html_to_solr( ds, solr_doc=Hash.new )
-    text = CGI.unescapeHTML(ds.content)
-    doc = Nokogiri::HTML(text)
-    # html to story_display
-    stories = doc.xpath('//story')
-    stories.each do |story|
-      solr_doc.merge!({:story_display => story.children.to_xml})
-    end
-    #strip out text and put in story_t
-    text_nodes = doc.xpath("//text()")
-    text = String.new
-     text_nodes.each do |text_node|
-       text << text_node.content
-     end
-     solr_doc.merge!({:story_t => text})
-     return solr_doc
-  end
-end

data/lib/solrizer/xml.rb DELETED

@@ -1,5 +0,0 @@
-module Solrizer::XML
-end
-Dir[File.join(File.dirname(__FILE__),"xml","*.rb")].each {|file| require file }
-Solrizer::Extractor.send(:include, Solrizer::XML::Extractor)

data/lib/solrizer/xml/extractor.rb DELETED

@@ -1,32 +0,0 @@
-require "xmlsimple"
-module Solrizer::XML::Extractor
-  #
-  # This method extracts solr fields from simple xml
-  # If you want to do anything more nuanced with the xml, use OM instead.
-  #
-  # @param [xml] text xml content to index
-  # @param [Hash] solr_doc
-  def xml_to_solr( text, solr_doc=Hash.new, mapper = Solrizer.default_field_mapper )
-    doc = XmlSimple.xml_in( text )
-    doc.each_pair do |name, value|
-      if value.kind_of?(Array)
-        if value.first.kind_of?(Hash)
-          # This deals with the way xml-simple handles nodes with attributes
-          solr_doc.merge!({mapper.solr_name(name, :stored_searchable, :type=>:text).to_sym => "#{value.first["content"]}"})
-        elsif value.length > 1
-          solr_doc.merge!({mapper.solr_name(name, :stored_searchable, :type=>:text).to_sym => value})
-        else
-          solr_doc.merge!({mapper.solr_name(name, :stored_searchable, :type=>:text).to_sym => "#{value.first}"})
-        end
-      else
-        solr_doc.merge!({mapper.solr_name(name, :stored_searchable, :type=>:text).to_sym => "#{value}"})
-      end
-    end
-    return solr_doc
-  end
-end

data/spec/units/extractor_spec.rb DELETED

@@ -1,44 +0,0 @@
-require 'spec_helper'
-describe Solrizer::Extractor do
-  before(:all) do
-    @extractor = Solrizer::Extractor.new
-  end
-  describe ".format_node_value" do
-    it "should strip white space out of the array and join it with a single blank" do
-      expect(Solrizer::Extractor.format_node_value([" test    \n   node    \t value \t"])).to eq "test node value"
-      expect(Solrizer::Extractor.format_node_value([" test ", "     \n   node ", "   \t value \t"])).to eq "test node value"
-    end
-    it "should return an empty string if given an argument of nil" do
-      expect(Solrizer::Extractor.format_node_value(nil)).to eq ''
-    end
-    it "should strip white space out of a string" do
-      expect(Solrizer::Extractor.format_node_value("raw  string\n with whitespace")).to eq "raw string with whitespace"
-    end
-  end
-  describe "#insert_solr_field_value" do
-    it "should initialize a solr doc list if it is nil" do
-       solr_doc = {'title_tesim' => nil }
-       Solrizer::Extractor.insert_solr_field_value(solr_doc, 'title_tesim', 'Frank')
-       expect(solr_doc).to eq("title_tesim"=>"Frank")
-    end
-    it "should insert multiple" do
-       solr_doc = {'title_tesim' => nil }
-       Solrizer::Extractor.insert_solr_field_value(solr_doc, 'title_tesim', 'Frank')
-       Solrizer::Extractor.insert_solr_field_value(solr_doc, 'title_tesim', 'Margret')
-       Solrizer::Extractor.insert_solr_field_value(solr_doc, 'title_tesim', 'Joyce')
-       expect(solr_doc).to eq("title_tesim"=>["Frank", 'Margret', 'Joyce'])
-    end
-    it "should not make a list if a single valued field is passed in" do
-       solr_doc = {}
-       Solrizer::Extractor.insert_solr_field_value(solr_doc, 'title_dtsi', '2013-03-22T12:33:00Z')
-       expect(solr_doc).to eq("title_dtsi"=>"2013-03-22T12:33:00Z")
-    end
-  end
-end

data/spec/units/xml_extractor_spec.rb DELETED

@@ -1,26 +0,0 @@
-require 'spec_helper'
-describe Solrizer::XML::Extractor do
-  before do
-    @extractor = Solrizer::Extractor.new
-  end
-  let(:result) { @extractor.xml_to_solr(fixture("druid-bv448hq0314-descMetadata.xml"))}
-  describe ".xml_to_solr" do
-    it "should turn simple xml into a solr document" do
-      expect(result[:type_tesim]).to eq "text"
-      expect(result[:medium_tesim]).to eq "Paper Document"
-      expect(result[:rights_tesim]).to eq "Presumed under copyright. Do not publish."
-      expect(result[:date_tesim]).to eq "1985-12-30"
-      expect(result[:format_tesim]).to be_kind_of(Array)
-      expect(result[:format_tesim]).to include("application/tiff")
-      expect(result[:format_tesim]).to include("application/pdf")
-      expect(result[:format_tesim]).to include("application/jp2000")
-      expect(result[:title_tesim]).to eq "This is a Sample Title"
-      expect(result[:publisher_tesim]).to eq "Sample Unversity"
-    end
-  end
-end