cumulus_csv 0.0.2

data/.document ADDED
@@ -0,0 +1,5 @@
+ README.rdoc
+ lib/**/*.rb
+ bin/*
+ features/**/*.feature
+ LICENSE
data/.gitignore ADDED
@@ -0,0 +1,21 @@
+ ## MAC OS
+ .DS_Store
+
+ ## TEXTMATE
+ *.tmproj
+ tmtags
+
+ ## EMACS
+ *~
+ \#*
+ .\#*
+
+ ## VIM
+ *.swp
+
+ ## PROJECT::GENERAL
+ coverage
+ rdoc
+ pkg
+
+ ## PROJECT::SPECIFIC
data/LICENSE ADDED
@@ -0,0 +1,20 @@
+ Copyright (c) 2009 evizitei
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc ADDED
@@ -0,0 +1,44 @@
+ = cumulus_csv
+
+ CSV Files: I hate them, you probably do too, but sometimes you need to get data into your system and this is the only way it's happening.
+
+ If you're deploying a Rails app in a cloud setup, you may have trouble if you're trying to store an uploaded file locally and process it later in a background thread (I know I have).
+
+ cumulus_csv is one way to solve that problem. You can save your file to your S3 account, and loop over the data inside it at your convenience later. So it doesn't matter where you're doing the processing, you just need to have the key you used to store the file, and you can process away.
+
+ THIS GEM IS DEPENDENT ON AWS::S3!
+
+ Since this gem uses AWS::S3, it should be no surprise that you'll need similar auth parameters:
+
+   manager = Cumulus::CSV::DataFileManager.new(
+     :access_key_id => 'abc',
+     :secret_access_key => '123'
+   )
+
+ This manager has two main functions: storing your files as they're uploaded, and letting you iterate over them later when you need to.
+
+ To store your file on S3 when you upload it, you'd do something like this:
+
+   key = manager.store_uploaded_file!(params[:uploaded_file])
+
+ That will work for your standard multipart form. The key the file is stored under is returned; it's just the basename of the file. You can pass this key to a rake task or whatever you're using, and use it later when you want to process the file:
+
+   manager.each_row_of(key) do |row|
+     #...some processing of this CSV row
+   end
+
+ In that block, you can load each row into your database, or send an email based on each one, whatever it is you're trying to accomplish by having your app interact with this data file.
+
+ == Note on Patches/Pull Requests
+
+ * Fork the project.
+ * Make your feature addition or bug fix.
+ * Add tests for it. This is important so I don't break it in a
+   future version unintentionally.
+ * Commit, do not mess with rakefile, version, or history.
+   (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
+ * Send me a pull request. Bonus points for topic branches.
+
+ == Copyright
+
+ Copyright (c) 2010 evizitei. See LICENSE for details.
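The store-then-iterate flow described in the README can be tried locally without an S3 account. The following is a minimal sketch, not the gem's code: `InMemoryManager` and `FakeUpload` are hypothetical stand-ins that keep file contents in a Hash instead of a bucket, and rows are parsed with the standard library's `CSV.parse`.

```ruby
require 'csv'
require 'stringio'

# Hypothetical stand-in for Cumulus::CSV::DataFileManager: it keeps
# uploaded data in a Hash instead of an S3 bucket (illustration only).
class InMemoryManager
  def initialize
    @objects = {}
  end

  # Mirrors store_uploaded_file!: the key is the basename of the upload.
  def store_uploaded_file!(uploaded_file)
    key = File.basename(uploaded_file.original_filename)
    @objects[key] = uploaded_file.read
    key
  end

  # Mirrors each_row_of: yields each parsed row as an Array of Strings.
  def each_row_of(key)
    CSV.parse(@objects.fetch(key)).each { |row| yield row }
  end
end

# Hypothetical upload object exposing the two methods the manager uses.
FakeUpload = Struct.new(:original_filename, :io) do
  def read
    io.read
  end
end

upload  = FakeUpload.new('/tmp/uploads/people.csv', StringIO.new("name,age\nAda,36\n"))
manager = InMemoryManager.new
key     = manager.store_uploaded_file!(upload)

rows = []
manager.each_row_of(key) { |row| rows << row }
```

Because the key is only the basename, two uploads that share a file name end up under the same key, so the later one replaces the earlier one. Callers that need both would have to make the names unique before uploading.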
data/Rakefile ADDED
@@ -0,0 +1,59 @@
+ require 'rubygems'
+ require 'rake'
+
+ begin
+   require 'jeweler'
+   Jeweler::Tasks.new do |gem|
+     gem.name = "cumulus_csv"
+     gem.summary = %Q{Helps you save uploaded csv files containing data to amazon s3, and gives you a way to download and loop through the data in a background process easily}
+     gem.description = %Q{CSV Files: I hate them, you probably do too, but sometimes you need to get data into your system and this is the only way it's happening.
+
+ If you're deploying a rails app in a cloud setup, you may have troubles if you're trying to store an uploaded file locally and process it later in a background thread (I know I have).
+
+ cumulus_csv is one way to solve that problem. You can save your file to your S3 account, and loop over the data inside it at your convenience later. So it doesn't matter where you're doing the processing, you just need to have the key you used to store the file, and you can process away.}
+     gem.email = "ethan.vizitei@gmail.com"
+     gem.homepage = "http://github.com/evizitei/cumulus_csv"
+     gem.authors = ["evizitei"]
+     gem.add_development_dependency "thoughtbot-shoulda", ">= 0"
+     gem.add_development_dependency "mocha", ">= 0.9.8"
+     gem.add_dependency "aws-s3", ">= 0.6.2"
+     # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
+   end
+   Jeweler::GemcutterTasks.new
+ rescue LoadError
+   puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
+ end
+
+ require 'rake/testtask'
+ Rake::TestTask.new(:test) do |test|
+   test.libs << 'lib' << 'test'
+   test.pattern = 'test/**/test_*.rb'
+   test.verbose = true
+ end
+
+ begin
+   require 'rcov/rcovtask'
+   Rcov::RcovTask.new do |test|
+     test.libs << 'test'
+     test.pattern = 'test/**/test_*.rb'
+     test.verbose = true
+   end
+ rescue LoadError
+   task :rcov do
+     abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
+   end
+ end
+
+ task :test => :check_dependencies
+
+ task :default => :test
+
+ require 'rake/rdoctask'
+ Rake::RDocTask.new do |rdoc|
+   version = File.exist?('VERSION') ? File.read('VERSION') : ""
+
+   rdoc.rdoc_dir = 'rdoc'
+   rdoc.title = "cumulus_csv #{version}"
+   rdoc.rdoc_files.include('README*')
+   rdoc.rdoc_files.include('lib/**/*.rb')
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
+ 0.0.2
@@ -0,0 +1,65 @@
+ # Generated by jeweler
+ # DO NOT EDIT THIS FILE DIRECTLY
+ # Instead, edit Jeweler::Tasks in Rakefile, and run the gemspec command
+ # -*- encoding: utf-8 -*-
+
+ Gem::Specification.new do |s|
+   s.name = %q{cumulus_csv}
+   s.version = "0.0.2"
+
+   s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
+   s.authors = ["evizitei"]
+   s.date = %q{2010-03-09}
+   s.description = %q{CSV Files: I hate them, you probably do too, but sometimes you need to get data into your system and this is the only way it's happening.
+
+ If you're deploying a rails app in a cloud setup, you may have troubles if you're trying to store an uploaded file locally and process it later in a background thread (I know I have).
+
+ cumulus_csv is one way to solve that problem. You can save your file to your S3 account, and loop over the data inside it at your convenience later. So it doesn't matter where you're doing the processing, you just need to have the key you used to store the file, and you can process away.}
+   s.email = %q{ethan.vizitei@gmail.com}
+   s.extra_rdoc_files = [
+     "LICENSE",
+     "README.rdoc"
+   ]
+   s.files = [
+     ".document",
+     ".gitignore",
+     "LICENSE",
+     "README.rdoc",
+     "Rakefile",
+     "VERSION",
+     "cumulus_csv.gemspec",
+     "lib/cumulus_csv.rb",
+     "lib/cumulus_csv/data_file_manager.rb",
+     "test/helper.rb",
+     "test/test_data_file_manager.rb"
+   ]
+   s.homepage = %q{http://github.com/evizitei/cumulus_csv}
+   s.rdoc_options = ["--charset=UTF-8"]
+   s.require_paths = ["lib"]
+   s.rubygems_version = %q{1.3.5}
+   s.summary = %q{Helps you save uploaded csv files containing data to amazon s3, and gives you a way to download and loop through the data in a background process easily}
+   s.test_files = [
+     "test/helper.rb",
+     "test/test_data_file_manager.rb"
+   ]
+
+   if s.respond_to? :specification_version then
+     current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
+     s.specification_version = 3
+
+     if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
+       s.add_development_dependency(%q<thoughtbot-shoulda>, [">= 0"])
+       s.add_development_dependency(%q<mocha>, [">= 0.9.8"])
+       s.add_runtime_dependency(%q<aws-s3>, [">= 0.6.2"])
+     else
+       s.add_dependency(%q<thoughtbot-shoulda>, [">= 0"])
+       s.add_dependency(%q<mocha>, [">= 0.9.8"])
+       s.add_dependency(%q<aws-s3>, [">= 0.6.2"])
+     end
+   else
+     s.add_dependency(%q<thoughtbot-shoulda>, [">= 0"])
+     s.add_dependency(%q<mocha>, [">= 0.9.8"])
+     s.add_dependency(%q<aws-s3>, [">= 0.6.2"])
+   end
+ end
+
@@ -0,0 +1,3 @@
+ require 'csv'
+ require 'aws/s3'
+ require 'cumulus_csv/data_file_manager'
@@ -0,0 +1,50 @@
+ module Cumulus
+   module CSV
+     BUCKET_NAME = "cumuluscsvtmp"
+     # DataFileManager is the gatekeeper for sending your data files to S3, and for iterating over them later.
+     #
+     # In the constructor, it takes the same authentication parameters as aws-s3:
+     #
+     #   DataFileManager.new(:access_key_id => 'abc',:secret_access_key => '123')
+     #
+     # For storing your csv data file on S3, you need to set up a controller to send your uploaded files through this interface:
+     #
+     #   DataFileManager.new(connection_params).store_uploaded_file!(params[:uploaded_file])
+     #
+     # The file will be posted to S3 in a bucket set aside for this gem (it will be created upon connection if it doesn't exist already)
+     #
+     # When you're ready to iterate over this csv file later in a background job (or wherever), you'll use this:
+     #
+     #   DataFileManager.new(connection_params).each_row_of(name) {|row| #...whatever processing you need }
+     #
+     class DataFileManager
+       attr_reader :bucket
+
+       def initialize(connect_params)
+         AWS::S3::Base.establish_connection!(connect_params)
+         cache_bucket
+       end
+
+       def store_uploaded_file!(uploaded_file)
+         name = File.basename(uploaded_file.original_filename)
+         AWS::S3::S3Object.store(name,uploaded_file.read,BUCKET_NAME)
+         return name
+       end
+
+       def each_row_of(file_name)
+         data = AWS::S3::S3Object.value(file_name,BUCKET_NAME)
+         ::CSV::Reader.parse(data).each{|row| yield row }
+       end
+
+       private
+       def cache_bucket
+         begin
+           @bucket = AWS::S3::Bucket.find(BUCKET_NAME)
+         rescue AWS::S3::S3Exception
+           AWS::S3::Bucket.create(BUCKET_NAME)
+           @bucket = AWS::S3::Bucket.find(BUCKET_NAME)
+         end
+       end
+     end
+   end
+ end
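`each_row_of` parses with `CSV::Reader`, which belongs to the Ruby 1.8 `csv` library and is not available from Ruby 1.9 on. As a hedged sketch of how the same row-by-row iteration could look on a current Ruby, the hypothetical helper `each_csv_row` below (not part of the gem) uses `CSV.parse` on a string that stands in for the object body fetched from S3:

```ruby
require 'csv'

# Hypothetical helper, not part of cumulus_csv: yields each row of a CSV
# string as an Array of Strings. CSV.parse stands in for the
# CSV::Reader.parse call used in the gem.
def each_csv_row(data)
  CSV.parse(data).each { |row| yield row }
end

# The string here plays the role of the data returned for a stored key.
rows = []
each_csv_row("1,2,3\nA,B,C\nx,y,z\n") { |row| rows << row }
```

`CSV.parse` reads the whole string into memory before yielding, which matches the gem's behaviour of downloading the full object first. For very large files a streaming approach would be worth considering instead.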
data/test/helper.rb ADDED
@@ -0,0 +1,12 @@
+ require 'rubygems'
+ require 'test/unit'
+ require 'shoulda'
+ require 'mocha'
+
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
+ require 'cumulus_csv'
+
+ class Test::Unit::TestCase
+   include Cumulus::CSV
+ end
@@ -0,0 +1,64 @@
+ require 'helper'
+
+ class TestDataFileManager < Test::Unit::TestCase
+   context "Test" do
+     setup do
+       nueter_aws!
+       @auth_hash = {:access_key_id => 'abc',:secret_access_key => '123'}
+     end
+
+     context "Infrastructure" do
+       should "connect to s3 at creation" do
+         AWS::S3::Base.expects(:establish_connection!).with(@auth_hash)
+         DataFileManager.new(@auth_hash)
+       end
+
+       should "cache cumulus bucket if it already exists" do
+         AWS::S3::Bucket.expects(:create).times(0)
+         AWS::S3::Bucket.stubs(:find).with(BUCKET_NAME).returns(AWS::S3::Bucket.new(:name=>BUCKET_NAME))
+         manager = DataFileManager.new(@auth_hash)
+         assert_equal AWS::S3::Bucket, manager.bucket.class
+       end
+
+       should "create new cumulus bucket if it does not yet exist" do
+         AWS::S3::Bucket.expects(:create).times(1)
+         AWS::S3::Bucket.stubs(:find).with(BUCKET_NAME).raises(AWS::S3::S3Exception).then.returns(AWS::S3::Bucket.new(:name=>BUCKET_NAME))
+         manager = DataFileManager.new(@auth_hash)
+         assert_equal AWS::S3::Bucket, manager.bucket.class
+       end
+     end
+
+     context "Uploaded File" do
+       setup do
+         @uploaded_file = stub("uploaded_file")
+         @uploaded_file.stubs(:original_filename).returns("/long/path/to/some_name.csv")
+         @uploaded_file.stubs(:read).returns("file data galore")
+       end
+
+       should "be stored by key" do
+         manager = DataFileManager.new(@auth_hash)
+         AWS::S3::S3Object.expects("store").with("some_name.csv","file data galore",BUCKET_NAME)
+         assert_equal "some_name.csv",manager.store_uploaded_file!(@uploaded_file)
+       end
+     end
+
+     context "Stored csv file" do
+       should "be iterated over row by row" do
+         AWS::S3::S3Object.expects(:value).with("some_name.csv",BUCKET_NAME).returns("1,2,3\nA,B,C\nx,y,z\n")
+         manager = DataFileManager.new(@auth_hash)
+         results = []
+         manager.each_row_of("some_name.csv") do |row|
+           results << row
+         end
+         assert_equal ["1","2","3"],results.first
+         assert_equal ["x","y","z"],results.last
+       end
+     end
+
+   end
+   def nueter_aws!
+     AWS::S3::Base.stubs(:establish_connection!)
+     AWS::S3::Bucket.stubs(:find)
+     AWS::S3::Bucket.stubs(:create)
+   end
+ end
metadata ADDED
@@ -0,0 +1,101 @@
+ --- !ruby/object:Gem::Specification
+ name: cumulus_csv
+ version: !ruby/object:Gem::Version
+   version: 0.0.2
+ platform: ruby
+ authors:
+ - evizitei
+ autorequire:
+ bindir: bin
+ cert_chain: []
+
+ date: 2010-03-09 00:00:00 +00:00
+ default_executable:
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: thoughtbot-shoulda
+   type: :development
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+     version:
+ - !ruby/object:Gem::Dependency
+   name: mocha
+   type: :development
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.9.8
+     version:
+ - !ruby/object:Gem::Dependency
+   name: aws-s3
+   type: :runtime
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.6.2
+     version:
+ description: |-
+   CSV Files: I hate them, you probably do too, but sometimes you need to get data into your system and this is the only way it's happening.
+
+   If you're deploying a rails app in a cloud setup, you may have troubles if you're trying to store an uploaded file locally and process it later in a background thread (I know I have).
+
+   cumulus_csv is one way to solve that problem. You can save your file to your S3 account, and loop over the data inside it at your convenience later. So it doesn't matter where you're doing the processing, you just need to have the key you used to store the file, and you can process away.
+ email: ethan.vizitei@gmail.com
+ executables: []
+
+ extensions: []
+
+ extra_rdoc_files:
+ - LICENSE
+ - README.rdoc
+ files:
+ - .document
+ - .gitignore
+ - LICENSE
+ - README.rdoc
+ - Rakefile
+ - VERSION
+ - cumulus_csv.gemspec
+ - lib/cumulus_csv.rb
+ - lib/cumulus_csv/data_file_manager.rb
+ - test/helper.rb
+ - test/test_data_file_manager.rb
+ has_rdoc: true
+ homepage: http://github.com/evizitei/cumulus_csv
+ licenses: []
+
+ post_install_message:
+ rdoc_options:
+ - --charset=UTF-8
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ requirements: []
+
+ rubyforge_project:
+ rubygems_version: 1.3.5
+ signing_key:
+ specification_version: 3
+ summary: Helps you save uploaded csv files containing data to amazon s3, and gives you a way to download and loop through the data in a background process easily
+ test_files:
+ - test/helper.rb
+ - test/test_data_file_manager.rb