mileszs-importex 0.1.2
- data/CHANGELOG.rdoc +8 -0
- data/LICENSE +20 -0
- data/README.rdoc +72 -0
- data/Rakefile +13 -0
- data/lib/importex/base.rb +79 -0
- data/lib/importex/column.rb +37 -0
- data/lib/importex/core_ext/blank.rb +59 -0
- data/lib/importex/core_ext/importex_value.rb +45 -0
- data/lib/importex.rb +17 -0
- data/spec/fixtures/simple.xls +0 -0
- data/spec/importex/base_spec.rb +62 -0
- data/spec/importex/column_spec.rb +4 -0
- data/spec/spec_helper.rb +8 -0
- metadata +99 -0
data/CHANGELOG.rdoc
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,20 @@
+Copyright (c) 2010 Ryan Bates
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc
ADDED
@@ -0,0 +1,72 @@
+= Importex
+
+RDocs[http://rdoc.info/projects/ryanb/importex] | Metrics[http://getcaliper.com/caliper/project?repo=git%3A%2F%2Fgithub.com%2Fryanb%2Fimportex.git] | Tests[http://runcoderun.com/ryanb/importex]
+
+This Ruby gem helps import an Excel document into a database or some other format. Just create a class defining the columns and pass in a path to an "xls" file. It will automatically format the columns into specified Ruby objects and raise errors on bad data.
+
+This is extracted from an internal set of administration scripts used for importing products into an e-commerce application. Rather than going through a web interface or directly into an SQL database, it is easiest to fill out an Excel spreadsheet with a row for each product, and filter that through a Ruby script.
+
+Note: This library has some hacks and is not intended to be a full featured, production quality library. I designed it to fit my needs for importing records through internal administration scripts.
+
+
+== Installation
+
+You can install through a gem.
+
+  gem install importex
+
+
+== Usage
+
+First create a class which inherits from Importex::Base and define the columns there.
+
+  require 'importex'
+  class Product < Importex::Base
+    column "Name", :required => true
+    column "Price", :format => /^\d+\.\d\d$/, :required => true
+    column "Amount in Stock", :type => Integer
+    column "Release Date", :type => Date
+    column "Discontinued", :type => Boolean
+  end
+
+Pass in an "xls" file to the Import class method to import the data. It expects the first row to be the column names and every row after that to be the records.
+
+  Product.import("path/to/products.xls")
+
+Use the "all" method to fetch the product instances for all of the records. You can access the columns like a hash.
+
+  products = Product.all
+  products.first["Discontinued"] # => false
+
+It is up to you to import this data into the database or other location. You can do this through something like Active Record, DataMapper, or Sequel.
+
+
+== Handling Bad Data
+
+If the Excel document is formatted improperly it will raise some form of Importex::ImportError exception. I recommend rescuing from this and handling it in a clean way for the user so they do not get a full stack trace.
+
+  begin
+    Product.import(...)
+  rescue Importex::ImportError => e
+    puts e.message
+  end
+
+
+== Custom Types
+
+It is possible to have smart columns which reference other Ruby objects. Importex expects a class method called "importex_value" to exist which it passes the Excel content to and expects a ruby object in return. Let's say you have a Category model in Active Record and you have the name of the category in the Products Excel sheet.
+
+  class Category < ActiveRecord::Base
+    def self.importex_value(str)
+      find_by_name!(str)
+    rescue ActiveRecord::RecordNotFound
+      raise Importex::InvalidCell, "No category with that name."
+    end
+  end
+
+  class Product < Importex::Base
+    column "Category", :type => Category
+  end
+
+Then product["Category"] will return an instance of the found Category.
+
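The "importex_value" protocol described under Custom Types in the README does not depend on Active Record. The following is a minimal sketch of a custom type in plain Ruby. Percentage and the ImportexSketch module are hypothetical names introduced only for illustration (they are not part of the gem); with the real gem the error raised would be Importex::InvalidCell, and the class would be passed as the :type option of a column.

```ruby
# Hypothetical stand-in for Importex::InvalidCell so this runs without the gem.
module ImportexSketch
  class InvalidCell < StandardError; end
end

# Hypothetical custom type: turns a cell such as "50%" into the fraction 0.5.
class Percentage
  attr_reader :fraction

  def initialize(fraction)
    @fraction = fraction
  end

  # The importer hands this class method the cell content as a string
  # and stores whatever object comes back as the column value.
  def self.importex_value(str)
    return nil if str.strip.empty? # blank cells stay empty
    unless str =~ /\A\d+(\.\d+)?%\z/
      raise ImportexSketch::InvalidCell, "Not a percentage."
    end
    new(str.to_f / 100)
  end
end

discount = Percentage.importex_value("50%")
puts discount.fraction
```

Raising the cell error from inside importex_value, as the README's Category example does, lets the importer report the offending column and row.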
data/Rakefile
ADDED
@@ -0,0 +1,13 @@
+require 'rubygems'
+require 'rake'
+require 'spec/rake/spectask'
+
+spec_files = Rake::FileList["spec/**/*_spec.rb"]
+
+desc "Run specs"
+Spec::Rake::SpecTask.new do |t|
+  t.spec_files = spec_files
+  t.spec_opts = ["-c"]
+end
+
+task :default => :spec
data/lib/importex/base.rb
ADDED
@@ -0,0 +1,79 @@
+module Importex
+  class Base
+    attr_reader :attributes
+
+    # Defines a column that may be found in the excel document. The first argument is a string
+    # representing the name of the column. The second argument is a hash of options.
+    #
+    # Options:
+    # [:+type+]
+    #   The Ruby class to be used as the value on import.
+    #
+    #     column :type => Date
+    #
+    # [:+format+]
+    #   Usually a regular expression representing the required format for the value. Can also be a string
+    #   or an array of strings and regular expressions.
+    #
+    #     column :format => [/^\d+$/, "0.0"]
+    #
+    # [:+required+]
+    #   Boolean specifying whether or not the given column must be present in the Excel document.
+    #   Defaults to false.
+    def self.column(*args)
+      @columns ||= []
+      @columns << Column.new(*args)
+    end
+
+    # Pass a path to an Excel (xls) document and optionally the worksheet index. The worksheet
+    # will default to the first one (0). The first row in the Excel document should be the column
+    # names, all rows after that should be records.
+    def self.import(path, worksheet_index = 0)
+      Ole::Log.level = Logger::ERROR # to avoid the annoying "root name was" warning
+      @records ||= []
+      workbook = Spreadsheet.open(path)
+      worksheet = workbook.worksheet(worksheet_index)
+      worksheet.format_dates!
+      columns = worksheet.row(0).map do |cell|
+        @columns.detect { |column| column.name == cell.to_s }
+      end
+      (@columns.select(&:required?) - columns).each do |column|
+        raise MissingColumn, "Column #{column.name} is required but it doesn't exist."
+      end
+      (1..worksheet.last_row_index).each do |row_number|
+        row = worksheet.row(row_number)
+        unless row.at(0).nil?
+          attributes = {}
+          columns.each_with_index do |column, index|
+            if column
+              if row.at(index).nil?
+                value = ""
+              else
+                value = row.at(index).to_s
+              end
+              attributes[column.name] = column.cell_value(value, row_number)
+            end
+          end
+          @records << new(attributes)
+        end
+      end
+    end
+
+    # Returns all records imported from the excel document.
+    def self.all
+      @records
+    end
+
+    def initialize(attributes = {})
+      @attributes = attributes
+    end
+
+    # A convenient way to access the column data for a given record.
+    #
+    #   product["Price"]
+    #
+    def [](name)
+      @attributes[name]
+    end
+  end
+end
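The import method above does three things: it maps each header cell to a declared column, checks that every required column appeared, and then builds one attributes hash per data row. The sketch below walks through the same steps on plain arrays so it runs without the spreadsheet gem. ColumnSketch, MissingColumnSketch, map_headers and rows_to_attributes are hypothetical names for illustration only, and type conversion and format validation are left out.

```ruby
# Hypothetical stand-ins for Importex::Column and Importex::MissingColumn.
ColumnSketch = Struct.new(:name, :required)
class MissingColumnSketch < StandardError; end

# Maps each header cell to its declared column (nil when the header was not
# declared), then raises if a required column never appeared in the header row.
def map_headers(header_row, declared)
  mapped = header_row.map do |cell|
    declared.detect { |column| column.name == cell.to_s }
  end
  (declared.select(&:required) - mapped).each do |column|
    raise MissingColumnSketch, "Column #{column.name} is required but it doesn't exist."
  end
  mapped
end

# Builds one attributes hash per data row. Undeclared columns are ignored,
# nil cells become empty strings, and rows whose first cell is nil are skipped.
def rows_to_attributes(rows, mapped)
  rows.reject { |row| row[0].nil? }.map do |row|
    attributes = {}
    mapped.each_with_index do |column, index|
      attributes[column.name] = row[index].to_s if column
    end
    attributes
  end
end

declared = [ColumnSketch.new("Name", true), ColumnSketch.new("Age", false)]
mapped = map_headers(["Name", "Notes", "Age"], declared)
p rows_to_attributes([["Foo", "x", 27], [nil, "y", 1]], mapped)
```

One consequence of the skip rule is worth noting: a row with data in later cells but nothing in the first cell is dropped silently rather than reported.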
data/lib/importex/column.rb
ADDED
@@ -0,0 +1,37 @@
+module Importex
+  class Column
+    attr_reader :name
+
+    def initialize(name, options = {})
+      @name = name
+      @type = options[:type]
+      @format = [options[:format]].compact.flatten
+      @required = options[:required]
+    end
+
+    def cell_value(str, row_number)
+      validate_cell(str)
+      @type ? @type.importex_value(str) : str
+    rescue InvalidCell => e
+      raise InvalidCell, "#{str} (column #{name}, row #{row_number+1}) does not match required format: #{e.message}"
+    end
+
+    def validate_cell(str)
+      if @format && !@format.empty? && !@format.any? { |format| match_format?(str, format) }
+        raise InvalidCell, @format.reject { |r| r.kind_of? Proc }.inspect
+      end
+    end
+
+    def match_format?(str, format)
+      case format
+      when String then str == format
+      when Regexp then str =~ format
+      when Proc then format.call(str)
+      end
+    end
+
+    def required?
+      @required
+    end
+  end
+end
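The :format option accepts a string, a regular expression, a proc, or an array mixing them, and a cell passes when it satisfies at least one rule. A standalone sketch of that matching logic follows. InvalidCellSketch, matches_rule?, check_cell and PRICE_RULES are hypothetical names used so the block runs without the gem.

```ruby
# Hypothetical stand-in for Importex::InvalidCell so this runs without the gem.
class InvalidCellSketch < StandardError; end

# Returns true when +str+ satisfies a single format rule.
def matches_rule?(str, rule)
  case rule
  when String then str == rule          # exact match
  when Regexp then !(str =~ rule).nil?  # pattern match
  when Proc   then !!rule.call(str)     # arbitrary predicate
  else false
  end
end

# Returns +str+ when it satisfies at least one rule (or when no rules are
# given); otherwise raises with the printable rules as the message.
def check_cell(str, rules)
  rules = [rules].compact.flatten
  return str if rules.empty? || rules.any? { |rule| matches_rule?(str, rule) }
  raise InvalidCellSketch, rules.reject { |rule| rule.is_a?(Proc) }.inspect
end

# Accept either a blank cell or a price with exactly two decimals.
PRICE_RULES = ["", /^\d+\.\d\d$/]
puts check_cell("19.99", PRICE_RULES)
```

Procs are left out of the error message because their inspect output is not useful to a spreadsheet author, which is why a column validated only by a lambda reports an empty list.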
data/lib/importex/core_ext/blank.rb
ADDED
@@ -0,0 +1,59 @@
+# Stolen straight from active_support/core_ext/object/blank.rb
+class Object
+  # An object is blank if it's false, empty, or a whitespace string.
+  # For example, "", " ", +nil+, [], and {} are blank.
+  #
+  # This simplifies
+  #
+  #   if !address.nil? && !address.empty?
+  #
+  # to
+  #
+  #   if !address.blank?
+  def blank?
+    respond_to?(:empty?) ? empty? : !self
+  end
+
+  # An object is present if it's not blank.
+  def present?
+    !blank?
+  end
+end
+
+class NilClass #:nodoc:
+  def blank?
+    true
+  end
+end
+
+class FalseClass #:nodoc:
+  def blank?
+    true
+  end
+end
+
+class TrueClass #:nodoc:
+  def blank?
+    false
+  end
+end
+
+class Array #:nodoc:
+  alias_method :blank?, :empty?
+end
+
+class Hash #:nodoc:
+  alias_method :blank?, :empty?
+end
+
+class String #:nodoc:
+  def blank?
+    self !~ /\S/
+  end
+end
+
+class Numeric #:nodoc:
+  def blank?
+    false
+  end
+end
data/lib/importex/core_ext/importex_value.rb
ADDED
@@ -0,0 +1,45 @@
+class Integer
+  def self.importex_value(str)
+    unless str.blank?
+      if str =~ /^[.\d]+$/
+        str.to_i
+      else
+        raise Importex::InvalidCell, "Not a number."
+      end
+    end
+  end
+end
+
+class Float
+  def self.importex_value(str)
+    unless str.blank?
+      if str =~ /^[.\d]+$/
+        str.to_f
+      else
+        raise Importex::InvalidCell, "Not a number."
+      end
+    end
+  end
+end
+
+class Boolean
+  def self.importex_value(str)
+    !["", "f", "F", "n", "N", "0"].include?(str)
+  end
+end
+
+class Time
+  def self.importex_value(str)
+    Time.parse(str) unless str.blank?
+  rescue ArgumentError
+    raise Importex::InvalidCell, "Not a time."
+  end
+end
+
+class Date
+  def self.importex_value(str)
+    Date.parse(str) unless str.blank?
+  rescue ArgumentError
+    raise Importex::InvalidCell, "Not a date."
+  end
+end
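The Integer and Float conversions above accept any string made only of digits and dots, so a cell such as "1.2.3" passes the check and is then truncated to 1 by to_i. If that matters for your data, one option is a stricter custom type. The sketch below is an alternative under that assumption, not the gem's behavior; WholeNumber and InvalidCellSketch are hypothetical names. It still accepts "27.0", the string form in which whole numbers arrive according to the error messages in the gem's specs.

```ruby
# Hypothetical stand-in for Importex::InvalidCell so this runs without the gem.
class InvalidCellSketch < StandardError; end

# Hypothetical stricter type: accepts "27", "27.0" or "-5", rejects "1.2.3" and "27.5".
class WholeNumber
  def self.importex_value(str)
    return nil if str.strip.empty? # blank cells stay empty
    unless str =~ /\A-?\d+(\.0+)?\z/
      raise InvalidCellSketch, "Not a whole number."
    end
    str.to_i
  end
end

puts WholeNumber.importex_value("27.0")
```

Using \A and \z instead of ^ and $ anchors the pattern to the whole string, so a multi-line cell cannot pass on the strength of one matching line.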
data/lib/importex.rb
ADDED
@@ -0,0 +1,17 @@
+require 'spreadsheet'
+
+require File.expand_path(File.dirname(__FILE__) + '/importex/base')
+require File.expand_path(File.dirname(__FILE__) + '/importex/column')
+require File.expand_path(File.dirname(__FILE__) + '/importex/core_ext/importex_value.rb')
+
+module Importex
+  # This is an abstract exception for errors occurred during import. It is recommended
+  # that you rescue from this and handle importe errors in a clean way.
+  class ImportError < StandardError; end
+
+  # This exception is raised when a cell does not fit the column's :format specification.
+  class InvalidCell < ImportError; end
+
+  # This exception is raised when a :required column is missing.
+  class MissingColumn < ImportError; end
+end
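Because both InvalidCell and MissingColumn inherit from ImportError, a caller can rescue the base class to catch everything, or rescue a subclass first to treat it differently. The sketch below mirrors that hierarchy in a hypothetical ImportexSketch module so it runs without the gem; import_report is an illustrative helper, not part of the library.

```ruby
# Hypothetical mirror of the gem's exception hierarchy.
module ImportexSketch
  class ImportError < StandardError; end
  class InvalidCell < ImportError; end
  class MissingColumn < ImportError; end
end

# Runs the block and turns any import failure into a short message.
# The more specific rescue clause must come before the base class.
def import_report
  yield
  "Import succeeded."
rescue ImportexSketch::MissingColumn => e
  "Fix the header row: #{e.message}"
rescue ImportexSketch::ImportError => e
  "Fix the data: #{e.message}"
end

puts import_report { raise ImportexSketch::MissingColumn, "Column Foo is required but it doesn't exist." }
puts import_report { raise ImportexSketch::InvalidCell, "Not a number." }
puts import_report { :ok }
```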
data/spec/fixtures/simple.xls
ADDED
Binary file
data/spec/importex/base_spec.rb
ADDED
@@ -0,0 +1,62 @@
+require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
+
+describe Importex::Base do
+  before(:each) do
+    @simple_class = Class.new(Importex::Base)
+    @xls_file = File.dirname(__FILE__) + '/../fixtures/simple.xls'
+  end
+
+  it "should import simple excel doc" do
+    @simple_class.column "Name"
+    @simple_class.column "Age", :type => Integer
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Name" => "Foo", "Age" => 27}, {"Name" => "Bar", "Age" => 42}]
+  end
+
+  it "should import only the column given and ignore others" do
+    @simple_class.column "Age", :type => Integer
+    @simple_class.column "Nothing"
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+  end
+
+  it "should add restrictions through an array of strings or regular expressions" do
+    @simple_class.column "Age", :format => ["foo", /bar/]
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, '27.0 (column Age, row 2) does not match required format: ["foo", /bar/]')
+  end
+
+  it "should support a lambda as a requirement" do
+    @simple_class.column "Age", :format => lambda { |age| age.to_i < 30 }
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, '42.0 (column Age, row 3) does not match required format: []')
+  end
+
+  it "should have some default requirements" do
+    @simple_class.column "Name", :type => Integer
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::InvalidCell, 'Foo (column Name, row 2) does not match required format: Not a number.')
+  end
+
+  it "should have a [] method which returns attributes" do
+    simple = @simple_class.new("Foo" => "Bar")
+    simple["Foo"].should == "Bar"
+  end
+
+  it "should import if it matches one of the requirements given in array" do
+    @simple_class.column "Age", :type => Integer, :format => ["", /^[.\d]+$/]
+    @simple_class.import(@xls_file)
+    @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+  end
+
+  it "should raise an exception if required column is missing" do
+    @simple_class.column "Age", :required => true
+    @simple_class.column "Foo", :required => true
+    lambda {
+      @simple_class.import(@xls_file)
+    }.should raise_error(Importex::MissingColumn, "Column Foo is required but it doesn't exist.")
+  end
+end
data/spec/spec_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,99 @@
+--- !ruby/object:Gem::Specification
+name: mileszs-importex
+version: !ruby/object:Gem::Version
+  prerelease: false
+  segments:
+  - 0
+  - 1
+  - 2
+  version: 0.1.2
+platform: ruby
+authors:
+- Ryan Bates
+- Miles Z. Sterrett
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2010-03-03 00:00:00 -05:00
+default_executable:
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: spreadsheet
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        segments:
+        - 0
+        - 6
+        - 4
+        - 1
+        version: 0.6.4.1
+  type: :runtime
+  version_requirements: *id001
+description: Import an Excel document by creating a Ruby class and passing in an 'xls' file. It will automatically format the columns into specified Ruby objects and raise errors on bad data.
+email:
+- ryan@railscasts.com
+- miles@mileszs.com
+executables: []
+
+extensions: []
+
+extra_rdoc_files:
+- README.rdoc
+- CHANGELOG.rdoc
+- LICENSE
+files:
+- lib/importex.rb
+- lib/importex/core_ext/importex_value.rb
+- lib/importex/core_ext/blank.rb
+- lib/importex/base.rb
+- lib/importex/column.rb
+- spec/fixtures/simple.xls
+- spec/importex/base_spec.rb
+- spec/importex/column_spec.rb
+- spec/spec_helper.rb
+- LICENSE
+- README.rdoc
+- Rakefile
+- CHANGELOG.rdoc
+has_rdoc: true
+homepage: http://github.com/mileszs/importex
+licenses: []
+
+post_install_message:
+rdoc_options:
+- --line-numbers
+- --inline-source
+- --title
+- Importex
+- --main
+- README.rdoc
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 1
+      - 2
+      version: "1.2"
+requirements: []
+
+rubyforge_project:
+rubygems_version: 1.3.6
+signing_key:
+specification_version: 3
+summary: Import an Excel document with Ruby.
+test_files: []
+