jsuchal-activerecord-fast-import 0.1.3 → 0.1.4
- data/README.markdown +66 -66
- data/Rakefile +1 -0
- data/VERSION +1 -1
- data/activerecord-fast-import.gemspec +5 -7
- data/lib/activerecord-fast-import.rb +10 -3
- metadata +4 -5
- data/.document +0 -5
data/README.markdown
CHANGED
@@ -1,66 +1,66 @@
-# activerecord-fast-import
-
-Loads data from text files into tables using fast native MySQL [LOAD DATA INFILE](http://dev.mysql.com/doc/refman/5.1/en/load-data.html) query.
-
-## Examples
-
-### Loading data from tab delimited log file
-
-Suppose you have an ActiveRecord model LogEntry defined as `LogEntry happened_at:datetime url:string` and a log file with tab delimited columns like this:
-
-    2009-09-30 12:32:43<tab>http://github.com/
-    2009-09-30 13:36:13<tab>http://facebook.com/
-
-To import data from this log file, you have to use
-
-    LogEntry.fast_import('huge.log')
-
-That's it!
-
-Of course in real world you will also need more advanced features. Read on...
-
-
-### Changing delimiters and ignoring some rows
-
-Of course not all log files are delimited by tabs and newlines. Just pass custom delimiters to options. If you want to ignore first 10 lines, just use `:ignore_lines`
-
-    import_options = {
-      :fields_terminated_by => ',',
-      :lines_terminated_by => ';',
-      :ignore_lines => 10
-    }
-
-### Changing order of columns and ignoring columns
-
-Now, imagine you want to import data from a huge log file with following format:
-
-    http://github.com/<tab>Mozilla<tab>2009-09-30 12:32:43
-    http://facebook.com/<tab>Opera<tab>2009-09-30 13:36:13
-
-It is clear that columns are in different order and we even want to ignore the second column. Let's do it
-
-    import_options = {:columns => ["url", "@dummy", "happened_at"]}
-    LogEntry.fast_import('huge.log', import_options)
-
-The special `@dummy` loads that column into a local variable and when unused (in a transformation) is just ignored.
-
-### Transforming data
-
-Now imagine we have a log file like this:
-
-    2009-09-30 12:32:43<tab>http://github.com/<tab>image.jpg
-    2009-09-30 13:36:13<tab>http://facebook.com/<tab>styles/default.css
-
-We want to concatenate those two columns into one.
-
-    import_options = {
-      :columns => ["happened_at", "@domain", "@file"],
-      :mapping => { :url => "CONCAT(@domain, @file)" }
-    }
-    LogEntry.fast_import('huge.log', import_options)
-
-Of course you can use any of those shiny [MySQL functions](http://dev.mysql.com/doc/refman/5.1/en/functions.html).
-
-## Copyright
-
-Copyright (c) 2009 Jan Suchal. See LICENSE for details.
+# activerecord-fast-import
+
+Loads data from text files into tables using fast native MySQL [LOAD DATA INFILE](http://dev.mysql.com/doc/refman/5.1/en/load-data.html) query.
+
+## Examples
+
+### Loading data from tab delimited log file
+
+Suppose you have an ActiveRecord model LogEntry defined as `LogEntry happened_at:datetime url:string` and a log file with tab delimited columns like this:
+
+    2009-09-30 12:32:43<tab>http://github.com/
+    2009-09-30 13:36:13<tab>http://facebook.com/
+
+To import data from this log file, you have to use
+
+    LogEntry.fast_import('huge.log')
+
+That's it!
+
+Of course in real world you will also need more advanced features. Read on...
+
+
+### Changing delimiters and ignoring some rows
+
+Of course not all log files are delimited by tabs and newlines. Just pass custom delimiters to options. If you want to ignore first 10 lines, just use `:ignore_lines`
+
+    import_options = {
+      :fields_terminated_by => ',',
+      :lines_terminated_by => ';',
+      :ignore_lines => 10
+    }
+
+### Changing order of columns and ignoring columns
+
+Now, imagine you want to import data from a huge log file with following format:
+
+    http://github.com/<tab>Mozilla<tab>2009-09-30 12:32:43
+    http://facebook.com/<tab>Opera<tab>2009-09-30 13:36:13
+
+It is clear that columns are in different order and we even want to ignore the second column. Let's do it
+
+    import_options = {:columns => ["url", "@dummy", "happened_at"]}
+    LogEntry.fast_import('huge.log', import_options)
+
+The special `@dummy` loads that column into a local variable and when unused (in a transformation) is just ignored.
+
+### Transforming data
+
+Now imagine we have a log file like this:
+
+    2009-09-30 12:32:43<tab>http://github.com/<tab>image.jpg
+    2009-09-30 13:36:13<tab>http://facebook.com/<tab>styles/default.css
+
+We want to concatenate those two columns into one.
+
+    import_options = {
+      :columns => ["happened_at", "@domain", "@file"],
+      :mapping => { :url => "CONCAT(@domain, @file)" }
+    }
+    LogEntry.fast_import('huge.log', import_options)
+
+Of course you can use any of those shiny [MySQL functions](http://dev.mysql.com/doc/refman/5.1/en/functions.html).
+
+## Copyright
+
+Copyright (c) 2009 Jan Suchal. See LICENSE for details.
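Note on the README options: the diff does not show how the gem turns `:fields_terminated_by`, `:lines_terminated_by`, `:ignore_lines`, `:columns` and `:mapping` into SQL. The sketch below is only a guess at that translation, following the clause order of MySQL's `LOAD DATA INFILE` statement. The helper name `build_load_data_sql` and the table name `log_entries` are hypothetical and not taken from the gem, and the sketch does no quoting or escaping.

```ruby
# Hypothetical helper, NOT the gem's implementation: assembles a
# LOAD DATA INFILE statement from the option names used in the README.
# Values are interpolated as-is, so this is for illustration only.
def build_load_data_sql(table, file, options = {})
  parts = ["LOAD DATA INFILE '#{file}' INTO TABLE #{table}"]

  if options[:fields_terminated_by]
    parts << "FIELDS TERMINATED BY '#{options[:fields_terminated_by]}'"
  end
  if options[:lines_terminated_by]
    parts << "LINES TERMINATED BY '#{options[:lines_terminated_by]}'"
  end
  if options[:ignore_lines]
    parts << "IGNORE #{Integer(options[:ignore_lines])} LINES"
  end

  # Column list may mix real columns and @user_variables such as @dummy.
  parts << "(#{options[:columns].join(', ')})" if options[:columns]

  # Each mapping entry becomes one "column = expression" assignment.
  if options[:mapping]
    assignments = options[:mapping].map { |column, expr| "#{column} = #{expr}" }
    parts << "SET #{assignments.join(', ')}"
  end

  parts.join(" ")
end

puts build_load_data_sql(
  "log_entries", "huge.log",
  :columns => ["happened_at", "@domain", "@file"],
  :mapping => { :url => "CONCAT(@domain, @file)" }
)
```

Under these assumptions the "Transforming data" example from the README would correspond to a statement with a column list followed by a `SET url = CONCAT(@domain, @file)` clause.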
data/Rakefile
CHANGED
data/VERSION
CHANGED
@@ -1 +1 @@
-0.1.
+0.1.4
data/activerecord-fast-import.gemspec
CHANGED
@@ -5,11 +5,11 @@
 
 Gem::Specification.new do |s|
   s.name = %q{activerecord-fast-import}
-  s.version = "0.1.
+  s.version = "0.1.4"
 
   s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
   s.authors = ["Jan Suchal"]
-  s.date = %q{2009-09-
+  s.date = %q{2009-09-10}
   s.description = %q{Native MySQL additions to ActiveRecord, like LOAD DATA INFILE, ENABLE/DISABLE KEYS, TRUNCATE TABLE.}
   s.email = %q{johno@jsmf.net}
   s.extra_rdoc_files = [
@@ -17,8 +17,7 @@ Gem::Specification.new do |s|
     "README.markdown"
   ]
   s.files = [
-    ".
-    ".gitignore",
+    ".gitignore",
     "LICENSE",
     "README.markdown",
     "Rakefile",
@@ -30,11 +29,10 @@ Gem::Specification.new do |s|
     "spec/activerecord-fast-import_spec.rb",
     "spec/spec_helper.rb"
   ]
-  s.has_rdoc = true
   s.homepage = %q{http://github.com/jsuchal/activerecord-fast-import}
   s.rdoc_options = ["--charset=UTF-8"]
   s.require_paths = ["lib"]
-  s.rubygems_version = %q{1.3.
+  s.rubygems_version = %q{1.3.5}
   s.summary = %q{Fast MySQL import for ActiveRecord}
   s.test_files = [
     "spec/activerecord-fast-import_spec.rb",
@@ -43,7 +41,7 @@ Gem::Specification.new do |s|
 
   if s.respond_to? :specification_version then
     current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
-    s.specification_version =
+    s.specification_version = 3
 
     if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
       s.add_development_dependency(%q<rspec>, [">= 0"])
data/lib/activerecord-fast-import.rb
CHANGED
@@ -16,6 +16,13 @@ module ActiveRecord #:nodoc:
       connection.execute("ALTER TABLE #{quoted_table_name} ENABLE KEYS")
     end
 
+    # Disables keys, yields block, enables keys.
+    def self.with_keys_disabled
+      disable_keys
+      yield
+      enable_keys
+    end
+
     # Loads data from file using MySQL native LOAD DATA INFILE query, disabling
     # key updates for even faster import speed
     #
@@ -24,9 +31,9 @@ module ActiveRecord #:nodoc:
     # * +options+ (see <tt>load_data_infile</tt>)
     def self.fast_import(files, options = {})
       files = [files] unless files.is_a? Array
-
-
-
+      with_keys_disabled do
+        files.each {|file| load_data_infile(file, options)}
+      end
     end
 
     # Loads data from file using MySQL native LOAD DATA INFILE query
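Note on `with_keys_disabled`: as added in this release, the method calls `enable_keys` only when the block returns normally, so an exception raised during the import would leave keys disabled. The sketch below shows one possible variant using `ensure`. It is not part of the gem. `FakeTable` is a hypothetical stand-in that only records calls, so the example runs without ActiveRecord or MySQL.

```ruby
# Hypothetical stand-in for an ActiveRecord model: it records which
# key-management methods were called instead of running ALTER TABLE.
class FakeTable
  def self.calls
    @calls ||= []
  end

  def self.disable_keys
    calls << :disable_keys
  end

  def self.enable_keys
    calls << :enable_keys
  end

  # Variant of with_keys_disabled that re-enables keys even when the
  # block raises. The method still returns the block's value.
  def self.with_keys_disabled
    disable_keys
    yield
  ensure
    enable_keys
  end
end

begin
  FakeTable.with_keys_disabled { raise "import failed" }
rescue RuntimeError
  # The error still reaches the caller; keys were re-enabled first.
end

p FakeTable.calls
```

One trade-off of this form: `enable_keys` also runs if `disable_keys` itself raises, which may or may not be desirable depending on how the table should be left after a failure.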
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: jsuchal-activerecord-fast-import
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.4
 platform: ruby
 authors:
 - Jan Suchal
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2009-09-
+date: 2009-09-10 00:00:00 -07:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -42,7 +42,6 @@ extra_rdoc_files:
 - LICENSE
 - README.markdown
 files:
-- .document
 - .gitignore
 - LICENSE
 - README.markdown
@@ -54,7 +53,7 @@ files:
 - nbproject/project.xml
 - spec/activerecord-fast-import_spec.rb
 - spec/spec_helper.rb
-has_rdoc:
+has_rdoc: false
 homepage: http://github.com/jsuchal/activerecord-fast-import
 post_install_message:
 rdoc_options:
@@ -78,7 +77,7 @@ requirements: []
 rubyforge_project:
 rubygems_version: 1.2.0
 signing_key:
-specification_version:
+specification_version: 3
 summary: Fast MySQL import for ActiveRecord
 test_files:
 - spec/activerecord-fast-import_spec.rb