importex 0.1.0

@@ -0,0 +1,3 @@
+ 0.1.0 (Feb 8, 2010)
+
+ * initial release
data/LICENSE ADDED
@@ -0,0 +1,20 @@
+ Copyright (c) 2010 Ryan Bates
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,72 @@
+ = Importex
+
+ RDocs[http://rdoc.info/projects/ryanb/importex] | Metrics[http://getcaliper.com/caliper/project?repo=git%3A%2F%2Fgithub.com%2Fryanb%2Fimportex.git] | Tests[http://runcoderun.com/ryanb/importex]
+
+ This Ruby gem helps import an Excel document into a database or some other format. Just create a class defining the columns and pass in a path to an "xls" file. It will automatically format the columns into specified Ruby objects and raise errors on bad data.
+
+ This is extracted from an internal set of administration scripts used for importing products into an e-commerce application. Rather than going through a web interface or directly into an SQL database, it is easiest to fill out an Excel spreadsheet with a row for each product, and filter that through a Ruby script.
+
+ Note: This library has some hacks and is not intended to be a full-featured, production-quality library. I designed it to fit my needs for importing records through internal administration scripts.
+
+
+ == Installation
+
+ You can install through a gem.
+
+   gem install importex
+
+
+ == Usage
+
+ First create a class which inherits from Importex::Base and define the columns there.
+
+   require 'importex'
+   class Product < Importex::Base
+     column "Name", :required => true
+     column "Price", :format => /^\d+\.\d\d$/, :required => true
+     column "Amount in Stock", :type => Integer
+     column "Release Date", :type => Date
+     column "Discontinued", :type => Boolean
+   end
+
+ Pass in an "xls" file to the "import" class method to import the data. It expects the first row to be the column names and every row after that to be the records.
+
+   Product.import("path/to/products.xls")
+
+ Use the "all" method to fetch the product instances for all of the records. You can access the columns like a hash.
+
+   products = Product.all
+   products.first["Discontinued"] # => false
+
+ It is up to you to import this data into the database or other location. You can do this through something like Active Record, DataMapper, or Sequel.
+
+
+ == Handling Bad Data
+
+ If the Excel document is formatted improperly it will raise some form of Importex::ImportError exception. I recommend rescuing from this and handling it in a clean way for the user so they do not get a full stack trace.
+
+   begin
+     Product.import(...)
+   rescue Importex::ImportError => e
+     puts e.message
+   end
+
+
+ == Custom Types
+
+ It is possible to have smart columns which reference other Ruby objects. Importex expects the column's type to define a class method called "importex_value"; it passes the Excel content to this method and expects a Ruby object in return. Let's say you have a Category model in Active Record and you have the name of the category in the Products Excel sheet.
+
+   class Category < ActiveRecord::Base
+     def self.importex_value(str)
+       find_by_name!(str)
+     rescue ActiveRecord::RecordNotFound
+       raise Importex::InvalidCell, "No category with that name."
+     end
+   end
+
+   class Product < Importex::Base
+     column "Category", :type => Category
+   end
+
+ Then product["Category"] will return an instance of the found Category.
+
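As an illustration of the "Custom Types" section of the README above, here is a minimal sketch of the `importex_value` contract that does not depend on Active Record. The `Category` class and its `KNOWN_NAMES` list are hypothetical, and the `Importex` module is a stand-in defined only so the snippet runs without the gem; with the gem installed you would `require 'importex'` instead.

```ruby
# Stand-in for the gem's exception classes so this sketch runs on its own.
# (Assumption: with the real gem you would `require 'importex'` instead.)
module Importex
  class ImportError < StandardError; end
  class InvalidCell < ImportError; end
end

# Hypothetical custom type: a plain Ruby class that exposes a class-level
# importex_value(str), which is what a column :type is expected to provide.
class Category
  attr_reader :name

  # Illustrative in-memory data in place of a database lookup.
  KNOWN_NAMES = ["Books", "Music"].freeze

  def initialize(name)
    @name = name
  end

  # Receives the raw cell content as a string and returns a Ruby object,
  # or raises InvalidCell when the content cannot be converted.
  def self.importex_value(str)
    unless KNOWN_NAMES.include?(str)
      raise Importex::InvalidCell, "No category with that name."
    end
    new(str)
  end
end

category = Category.importex_value("Books")
puts category.name

begin
  Category.importex_value("Toys")
rescue Importex::InvalidCell => e
  puts e.message
end
```

Raising `Importex::InvalidCell` rather than another exception type lets the caller handle a bad lookup the same way as any other bad cell.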
@@ -0,0 +1,13 @@
+ require 'rubygems'
+ require 'rake'
+ require 'spec/rake/spectask'
+
+ spec_files = Rake::FileList["spec/**/*_spec.rb"]
+
+ desc "Run specs"
+ Spec::Rake::SpecTask.new do |t|
+   t.spec_files = spec_files
+   t.spec_opts = ["-c"]
+ end
+
+ task :default => :spec
@@ -0,0 +1,17 @@
+ require 'parseexcel'
+
+ require File.expand_path(File.dirname(__FILE__) + '/importex/base')
+ require File.expand_path(File.dirname(__FILE__) + '/importex/column')
+ require File.expand_path(File.dirname(__FILE__) + '/importex/ruby_additions')
+
+ module Importex
+   # This is an abstract exception for errors that occur during import. It is recommended
+   # that you rescue from this and handle import errors in a clean way.
+   class ImportError < StandardError; end
+
+   # This exception is raised when a cell does not fit the column's :format specification.
+   class InvalidCell < ImportError; end
+
+   # This exception is raised when a :required column is missing.
+   class MissingColumn < ImportError; end
+ end
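Because `InvalidCell` and `MissingColumn` both inherit from `ImportError`, a single `rescue` clause covers every import failure. The sketch below illustrates this under stated assumptions: the module is a stand-in that mirrors the hierarchy in the hunk above so the snippet runs without the gem, and `describe_import` is a hypothetical helper, not part of Importex.

```ruby
# Stand-in definitions mirroring the exception hierarchy, so the sketch
# runs without the gem installed.
module Importex
  class ImportError < StandardError; end
  class InvalidCell < ImportError; end
  class MissingColumn < ImportError; end
end

# Hypothetical helper: runs a block and turns any import failure into a
# short message instead of a stack trace.
def describe_import
  yield
  "ok"
rescue Importex::ImportError => e
  "#{e.class.name}: #{e.message}"
end

puts describe_import { raise Importex::MissingColumn, "Column Foo is required but it doesn't exist." }
puts describe_import { raise Importex::InvalidCell, "Not a number." }
puts describe_import { :nothing_raised }
```

Errors that are not import failures (for example a missing file) are not descendants of `ImportError`, so they still propagate past this helper.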
@@ -0,0 +1,79 @@
+ module Importex
+   class Base
+     attr_reader :attributes
+
+     # Defines a column that may be found in the Excel document. The first argument is a string
+     # representing the name of the column. The second argument is a hash of options.
+     #
+     # Options:
+     # [:+type+]
+     #   The Ruby class to be used as the value on import.
+     #
+     #     column :type => Date
+     #
+     # [:+format+]
+     #   Usually a regular expression representing the required format for the value. Can also be a string
+     #   or an array of strings and regular expressions.
+     #
+     #     column :format => [/^\d+$/, "0.0"]
+     #
+     # [:+required+]
+     #   Boolean specifying whether or not the given column must be present in the Excel document.
+     #   Defaults to false.
+     def self.column(*args)
+       @columns ||= []
+       @columns << Column.new(*args)
+     end
+
+     # Pass a path to an Excel (xls) document and optionally the worksheet index. The worksheet
+     # will default to the first one (0). The first row in the Excel document should be the column
+     # names, all rows after that should be records.
+     def self.import(path, worksheet_index = 0)
+       @records ||= []
+       workbook = Spreadsheet::ParseExcel.parse(path)
+       worksheet = workbook.worksheet(worksheet_index)
+       columns = worksheet.row(0).map do |cell|
+         @columns.detect { |column| column.name == cell.to_s('latin1') }
+       end
+       (@columns.select(&:required?) - columns).each do |column|
+         raise MissingColumn, "Column #{column.name} is required but it doesn't exist."
+       end
+       (1...worksheet.num_rows).each do |row_number|
+         row = worksheet.row(row_number)
+         unless row.at(0).nil?
+           attributes = {}
+           columns.each_with_index do |column, index|
+             if column
+               if row.at(index).nil?
+                 value = ""
+               elsif row.at(index).type == :date
+                 value = row.at(index).date.strftime("%Y-%m-%d %H:%M:%S")
+               else
+                 value = row.at(index).to_s('latin1')
+               end
+               attributes[column.name] = column.cell_value(value, row_number)
+             end
+           end
+           @records << new(attributes)
+         end
+       end
+     end
+
+     # Returns all records imported from the Excel document.
+     def self.all
+       @records
+     end
+
+     def initialize(attributes = {})
+       @attributes = attributes
+     end
+
+     # A convenient way to access the column data for a given record.
+     #
+     #   product["Price"]
+     #
+     def [](name)
+       @attributes[name]
+     end
+   end
+ end
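The header handling in `Importex::Base.import` above can be hard to follow inline, so here is a minimal standalone sketch of the same idea: map each header cell to a declared column (or `nil`), then subtract the mapped columns from the required ones to find what is missing. `DeclaredColumn` and the sample header are hypothetical stand-ins, not the gem's own classes or data.

```ruby
# Minimal stand-in for a declared column (hypothetical; the gem's own
# Column class carries more options).
DeclaredColumn = Struct.new(:name, :required) do
  def required?
    required
  end
end

declared = [
  DeclaredColumn.new("Name", true),
  DeclaredColumn.new("Age", false),
  DeclaredColumn.new("Email", true)
]

# Header cells as they might appear in the first worksheet row.
header = ["Name", "Nickname", "Age"]

# One entry per worksheet column: the matching declaration, or nil when
# the header is not declared (such columns are skipped during import).
mapped = header.map { |title| declared.detect { |column| column.name == title } }

# Required declarations that never showed up in the header.
missing = declared.select(&:required?) - mapped

p mapped.map { |column| column && column.name }  # ["Name", nil, "Age"]
p missing.map(&:name)                            # ["Email"]
```

Keeping the `nil` placeholders in `mapped` preserves the worksheet's column positions, which is what allows the per-row loop to look up cells by index.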
@@ -0,0 +1,37 @@
+ module Importex
+   class Column
+     attr_reader :name
+
+     def initialize(name, options = {})
+       @name = name
+       @type = options[:type]
+       @format = [options[:format]].compact.flatten
+       @required = options[:required]
+     end
+
+     def cell_value(str, row_number)
+       validate_cell(str)
+       @type ? @type.importex_value(str) : str
+     rescue InvalidCell => e
+       raise InvalidCell, "#{str} (column #{name}, row #{row_number+1}) does not match required format: #{e.message}"
+     end
+
+     def validate_cell(str)
+       if @format && !@format.empty? && !@format.any? { |format| match_format?(str, format) }
+         raise InvalidCell, @format.reject { |r| r.kind_of? Proc }.inspect
+       end
+     end
+
+     def match_format?(str, format)
+       case format
+       when String then str == format
+       when Regexp then str =~ format
+       when Proc then format.call(str)
+       end
+     end
+
+     def required?
+       @required
+     end
+   end
+ end
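The `:format` check in `Column` accepts a string, a regular expression, a proc, or an array mixing them, and a value passes when any one entry accepts it. The following is a standalone sketch of that rule; `valid?` is a hypothetical wrapper written for this example, and the sample values are taken from the specs later in this diff.

```ruby
# Standalone copy of the matching rule for a single format entry.
def match_format?(str, format)
  case format
  when String then str == format
  when Regexp then !!(str =~ format)
  when Proc   then format.call(str)
  end
end

# Hypothetical wrapper: normalizes the option to a flat list and passes
# the value when the list is empty or any entry accepts it.
def valid?(str, formats)
  formats = [formats].compact.flatten
  formats.empty? || formats.any? { |format| match_format?(str, format) }
end

under_thirty = lambda { |age| age.to_i < 30 }

p valid?("27.0", ["", /^[.\d]+$/])   # true: the regular expression matches
p valid?("27.0", ["foo", /bar/])     # false: neither entry matches
p valid?("42.0", under_thirty)       # false: the lambda rejects it
p valid?("anything", nil)            # true: no :format given, nothing to check

# Procs are dropped from the list shown in the error message, which is why
# a lambda-only :format is reported as "[]".
shown = [under_thirty].reject { |f| f.kind_of?(Proc) }
puts shown.inspect
```

Since a proc cannot be printed meaningfully, a lambda-based format produces an error message with an empty list; pairing it with a descriptive string or regular expression is one way to make the message more useful.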
@@ -0,0 +1,45 @@
+ class Integer
+   def self.importex_value(str)
+     unless str.blank?
+       if str =~ /^[.\d]+$/
+         str.to_i
+       else
+         raise Importex::InvalidCell, "Not a number."
+       end
+     end
+   end
+ end
+
+ class Float
+   def self.importex_value(str)
+     unless str.blank?
+       if str =~ /^[.\d]+$/
+         str.to_f
+       else
+         raise Importex::InvalidCell, "Not a number."
+       end
+     end
+   end
+ end
+
+ class Boolean
+   def self.importex_value(str)
+     !["", "f", "F", "n", "N", "0"].include?(str)
+   end
+ end
+
+ class Time
+   def self.importex_value(str)
+     Time.parse(str) unless str.blank?
+   rescue ArgumentError
+     raise Importex::InvalidCell, "Not a time."
+   end
+ end
+
+ class Date
+   def self.importex_value(str)
+     Date.parse(str) unless str.blank?
+   rescue ArgumentError
+     raise Importex::InvalidCell, "Not a date."
+   end
+ end
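The built-in conversions above have a few edge cases worth knowing before relying on them. The sketch below restates the Boolean and Integer rules as standalone methods so they can be tried without the gem; `to_boolean` and `to_integer` are hypothetical names, `String#strip` plus `empty?` stands in for ActiveSupport's `blank?`, and `ArgumentError` stands in for `Importex::InvalidCell`.

```ruby
FALSE_STRINGS = ["", "f", "F", "n", "N", "0"].freeze

# Same rule as Boolean.importex_value: only the listed strings are false.
def to_boolean(str)
  !FALSE_STRINGS.include?(str)
end

# Same rule as Integer.importex_value, with strip/empty? standing in for
# ActiveSupport's blank? (an assumption made to avoid the dependency).
def to_integer(str)
  return nil if str.strip.empty?
  raise ArgumentError, "Not a number." unless str =~ /^[.\d]+$/
  str.to_i
end

p to_boolean("0")       # false
p to_boolean("false")   # true, because "false" is not in the list
p to_integer("27.0")    # 27, the fractional part is dropped by to_i
p to_integer("")        # nil, blank cells come back as nil

begin
  to_integer("-5")      # the pattern has no place for a sign
rescue ArgumentError => e
  puts e.message
end
```

If a spreadsheet spells out "false" or "no", or contains negative quantities, a custom type with its own `importex_value` is one way to handle those cells.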
Binary file
@@ -0,0 +1,62 @@
+ require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
+
+ describe Importex::Base do
+   before(:each) do
+     @simple_class = Class.new(Importex::Base)
+     @xls_file = File.dirname(__FILE__) + '/../fixtures/simple.xls'
+   end
+
+   it "should import simple excel doc" do
+     @simple_class.column "Name"
+     @simple_class.column "Age", :type => Integer
+     @simple_class.import(@xls_file)
+     @simple_class.all.map(&:attributes).should == [{"Name" => "Foo", "Age" => 27}, {"Name" => "Bar", "Age" => 42}]
+   end
+
+   it "should import only the column given and ignore others" do
+     @simple_class.column "Age", :type => Integer
+     @simple_class.column "Nothing"
+     @simple_class.import(@xls_file)
+     @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+   end
+
+   it "should add restrictions through an array of strings or regular expressions" do
+     @simple_class.column "Age", :format => ["foo", /bar/]
+     lambda {
+       @simple_class.import(@xls_file)
+     }.should raise_error(Importex::InvalidCell, '27.0 (column Age, row 2) does not match required format: ["foo", /bar/]')
+   end
+
+   it "should support a lambda as a requirement" do
+     @simple_class.column "Age", :format => lambda { |age| age.to_i < 30 }
+     lambda {
+       @simple_class.import(@xls_file)
+     }.should raise_error(Importex::InvalidCell, '42.0 (column Age, row 3) does not match required format: []')
+   end
+
+   it "should have some default requirements" do
+     @simple_class.column "Name", :type => Integer
+     lambda {
+       @simple_class.import(@xls_file)
+     }.should raise_error(Importex::InvalidCell, 'Foo (column Name, row 2) does not match required format: Not a number.')
+   end
+
+   it "should have a [] method which returns attributes" do
+     simple = @simple_class.new("Foo" => "Bar")
+     simple["Foo"].should == "Bar"
+   end
+
+   it "should import if it matches one of the requirements given in array" do
+     @simple_class.column "Age", :type => Integer, :format => ["", /^[.\d]+$/]
+     @simple_class.import(@xls_file)
+     @simple_class.all.map(&:attributes).should == [{"Age" => 27}, {"Age" => 42}]
+   end
+
+   it "should raise an exception if required column is missing" do
+     @simple_class.column "Age", :required => true
+     @simple_class.column "Foo", :required => true
+     lambda {
+       @simple_class.import(@xls_file)
+     }.should raise_error(Importex::MissingColumn, "Column Foo is required but it doesn't exist.")
+   end
+ end
@@ -0,0 +1,4 @@
+ require File.expand_path(File.dirname(__FILE__) + '/../spec_helper')
+
+ describe Importex::Column do
+ end
@@ -0,0 +1,9 @@
+ require 'rubygems'
+ require 'spec'
+ require 'active_support'
+ require 'fileutils'
+ require File.dirname(__FILE__) + '/../lib/importex'
+
+ Spec::Runner.configure do |config|
+   config.mock_with :rr
+ end
metadata ADDED
@@ -0,0 +1,82 @@
+ --- !ruby/object:Gem::Specification
+ name: importex
+ version: !ruby/object:Gem::Version
+   version: 0.1.0
+ platform: ruby
+ authors:
+ - Ryan Bates
+ autorequire:
+ bindir: bin
+ cert_chain: []
+
+ date: 2010-02-08 00:00:00 -08:00
+ default_executable:
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: parseexcel
+   type: :runtime
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.5.2
+     version:
+ description: Import an Excel document by creating a Ruby class and passing in an 'xls' file. It will automatically format the columns into specified Ruby objects and raise errors on bad data.
+ email: ryan@railscasts.com
+ executables: []
+
+ extensions: []
+
+ extra_rdoc_files:
+ - README.rdoc
+ - CHANGELOG.rdoc
+ - LICENSE
+ files:
+ - lib/importex/base.rb
+ - lib/importex/column.rb
+ - lib/importex/ruby_additions.rb
+ - lib/importex.rb
+ - spec/fixtures/simple.xls
+ - spec/importex/base_spec.rb
+ - spec/importex/column_spec.rb
+ - spec/spec_helper.rb
+ - LICENSE
+ - README.rdoc
+ - Rakefile
+ - CHANGELOG.rdoc
+ has_rdoc: true
+ homepage: http://github.com/ryanb/importex
+ licenses: []
+
+ post_install_message:
+ rdoc_options:
+ - --line-numbers
+ - --inline-source
+ - --title
+ - Importex
+ - --main
+ - README.rdoc
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "1.2"
+   version:
+ requirements: []
+
+ rubyforge_project:
+ rubygems_version: 1.3.5
+ signing_key:
+ specification_version: 3
+ summary: Import an Excel document with Ruby.
+ test_files: []
+