embulk-input-random 0.0.2
- checksums.yaml +7 -0
- data/.gitignore +14 -0
- data/Gemfile +2 -0
- data/LICENSE.txt +22 -0
- data/README.md +109 -0
- data/Rakefile +2 -0
- data/embulk-input-random.gemspec +22 -0
- data/lib/embulk/input/random.rb +77 -0
- metadata +80 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 344eddc7c080632f70da8dacc8d9d96f38954e07
  data.tar.gz: c03c9dd6a7e71f2eace49339f0fa7f240d4f931c
SHA512:
  metadata.gz: 281fb6c47bf7763fa9e0609c34b3c99aa3db634419962b38d025bfcbaf33acb8638a4b75689f974de7cad7bac20fcb087a41ec3a562cee1704ebd60c22c8aa72
  data.tar.gz: 9d82e8a8656c33271560aa57b0dfd2262f52e4de30488bf76999b1af81412d94b4b7ce0201cec5699a07bbe31502a869710c1bfba47377c7c4a54172b0eedee2
data/.gitignore
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
Copyright (c) 2015 KUMAZAKI Hiroki

MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,109 @@
# Embulk::Input::Random

Random data generator for [Embulk](https://github.com/embulk/embulk).
Intended for testing and benchmarking.

## Installation

Run this command with your embulk binary.

```
$ embulk gem install embulk-input-random
```

## Usage

Specify the plugin in your config.yml file:

```yaml
in:
  type: random
  rows: 100
  threads: 2
  schema:
    myid: primary_key
    name: string
    score: integer
```

- type: specify this plugin as `random`
- rows: number of rows to generate (optional, default: 5000)
- threads: number of threads (optional, default: 1)
- schema: column names and their data types (required)
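
As a rough illustration of how the `schema` entries are interpreted, the sketch below maps each declared type name to an Embulk column type, following the `case` statement in `lib/embulk/input/random.rb`. The names `COLUMN_TYPES` and `column_type_for` are hypothetical and not part of the plugin, and raising on an unknown name is a choice made for this sketch only.

```ruby
# Hypothetical lookup table mirroring the type handling in random.rb.
COLUMN_TYPES = {
  "boolean"     => :boolean,
  "string"      => :string,
  "integer"     => :long,
  "int"         => :long,
  "long"        => :long,
  "primary_key" => :long,
  "double"      => :double,
  "float"       => :double,
  "date"        => :timestamp,
}.freeze

# Returns the column type for a schema type name (case-insensitive).
# Raising on unknown names is an assumption of this sketch.
def column_type_for(type)
  COLUMN_TYPES.fetch(type.to_s.downcase) do
    raise ArgumentError, "unknown type: #{type}"
  end
end

schema = { "myid" => "primary_key", "name" => "string", "score" => "integer" }
schema.each { |attr, type| puts "#{attr}: #{column_type_for(type)}" }
```

This is why `primary_key` and `integer` columns both appear as `long` in the preview output.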


### Try

You can try this plugin by saving the following as random.yml

```
exec: {}
in:
  type: random
  rows: 100
  threads: 1
  schema:
    id: primary_key
    name: string
    score: integer
out:
  type: stdout
```

and then running

```
$ java -jar embulk.jar preview random.yml
```

which generates output like

```
Random generation started.
Random generator input thread 0...
+---------+---------------------------------------------+------------+
| id:long |                                 name:string | score:long |
+---------+---------------------------------------------+------------+
|       0 | UPPYQ0S1oiKDddasQxOlXPhZ9ys-FtVwH6-DIywnHG8 |        875 |
|       1 | IT8KHcI48wM_0ygtm8OVSZQSR1xA4g5lntZ9xAQwY5Y |      2,652 |
|       2 | 6HOLiPz9-srgwV8bgBX0Whd7Dq6HRUPKusZdONRxesw |      8,560 |
|       3 | r8X3G5iVZsJJEAp5Wqy8LdUte-2wmnz2Zb9gMiiTp-Q |      2,288 |
|       4 | g7DDPm6J0y6G9FYGcDgsMk-V6Rewz03sLIu3VUfmp5M |      2,065 |
|       5 | fwNJWwztnraaa9MH01sq1Uhx2iz66djdkeUSw18DFnQ |      9,214 |
|       6 | EE2WZ3Z7UIFN4U93fgjWYmGqzWEruVBVBaWJXGjfCsQ |      9,972 |
|       7 | 70WHrnDYAPx5qNtRxcG2HF-Y4yMO1SXigMep0NFtOo8 |      2,988 |
|       8 | wbbi1qQlC3x0WY8uksUc_b0PvJjN6e6QhTrykMF7BJE |        831 |
|       9 | zNBjwP_l1Fu7t8b4xIiYz7dfEO0v0BHS5vZd-xqdmCk |      8,596 |
|      10 | Rj_NNmf4MG0UASImGbmEHPKf_MUOZe97Jyrs5RQA3q4 |      1,129 |
...
```

You can load this data into any storage that Embulk supports!

### Data Type

The currently supported types are:
- string: URL-safe Base64 string encoding 32 random bytes (43 ASCII characters)
- integer: random integer from 0 up to (but not including) 10000
- primary_key: increasing number for each row
- float: random floating-point number from 0 up to (but not including) 10000
- date: random timestamp between 1970 and now

More types will be added.
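
To make the type descriptions concrete, here is a minimal standalone sketch of how one value per type can be produced. It follows the logic in `lib/embulk/input/random.rb` but runs without Embulk. The helper name `random_value` is hypothetical, introduced only for this illustration.

```ruby
require 'securerandom'

# Hypothetical helper: returns one random value for a schema type name,
# following the generation rules used in random.rb.
def random_value(type, row_number)
  case type
  when "string"
    SecureRandom.urlsafe_base64(32)        # 32 random bytes, 43 characters
  when "integer", "int", "long"
    (Random.rand * 10000).to_i             # integer in 0..9999
  when "primary_key"
    row_number                             # the row counter itself
  when "float", "double"
    Random.rand * 10000                    # float in [0, 10000)
  when "date"
    Time.at(Random.rand * Time.now.to_f)   # between the epoch and now
  else
    raise "unknown type: #{type}"
  end
end

schema = { "id" => "primary_key", "name" => "string", "score" => "integer" }
3.times do |n|
  p schema.map { |_attr, type| random_value(type, n) }
end
```

Each printed array corresponds to one row of the preview table: the row counter, a 43-character string, and an integer below 10000.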

## Todo

- Add more data types to generate
  - fake user names
  - flexible length of strings
  - flexible range of numerics
  - random generator with Gaussian distribution
  - binary

## Contributing

1. Fork it ( https://github.com/kumagi/embulk-input-random/fork )
2. Create your feature branch (`git checkout -b my-cool-feature`)
3. Commit your changes (`git commit -am 'Add cool feature'`)
4. Push to the branch (`git push origin my-cool-feature`)
5. Create a new Pull Request
data/Rakefile
ADDED
data/embulk-input-random.gemspec
ADDED
@@ -0,0 +1,22 @@
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)

Gem::Specification.new do |spec|
  spec.name          = "embulk-input-random"
  spec.version       = "0.0.2"
  spec.authors       = ["KUMAZAKI Hiroki"]
  spec.email         = ["hiroki.kumazaki@gmail.com"]
  spec.summary       = %q{Random Table Generator for Embulk}
  spec.description   = %q{Create dummy table}
  spec.homepage      = "https://github.com/kumagi/embulk-input-random"
  spec.license       = "MIT"

  spec.files         = `git ls-files -z`.split("\x0")
  spec.executables   = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
  spec.test_files    = spec.files.grep(%r{^(test|spec|features)/})
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler", "~> 1.7"
  spec.add_development_dependency "rake", "~> 10.0"
end
data/lib/embulk/input/random.rb
ADDED
@@ -0,0 +1,77 @@
# -*- coding:utf-8 -*-

module Embulk
  require 'securerandom'
  class InputRandom < InputPlugin
    require 'json'

    Plugin.register_input('random', self)

    # Reads the config, builds the column definitions, and runs the tasks.
    def self.transaction(config, &control)
      schema = config.param('schema', :hash)
      rows = config.param('rows', :integer, default: 5000)
      threads = config.param('threads', :integer, default: 1)

      columns = schema.each_with_index.map{|column, index|
        attr, type = column
        # TODO: type should be more flexible
        case type.downcase
        when "boolean"
          Column.new(index, attr, :boolean)
        when "string"
          Column.new(index, attr, :string)
        when "integer", "int", "long", "primary_key"
          Column.new(index, attr, :long)
        when "double", "float"
          Column.new(index, attr, :double)
        when "date"
          Column.new(index, attr, :timestamp)
        else
          # Fail early instead of leaving a nil entry in the column list.
          raise "unknown type: #{type}"
        end
      }

      task = {'schema' => schema, 'rows' => rows}

      puts "Random generation started."
      commit_reports = yield(task, columns, threads)
      puts "Random input finished. Commit reports = #{commit_reports.to_json}"

      return {}
    end

    def initialize(task, schema, index, page_builder)
      super
    end

    # Generates `rows` random records for this task and returns a commit report.
    def run
      puts "Random generator input thread #{@index}..."
      rows = @task['rows']
      schema = @task['schema']

      rows.times{|n|
        @page_builder.add(schema.map{|attr, type|
          # Match type names case-insensitively, as self.transaction does.
          case type.downcase
          when "boolean"
            Random.rand < 0.5
          when "string"
            SecureRandom.urlsafe_base64(32)
          when "integer", "int", "long"
            (Random.rand * 10000).to_i
          when "primary_key"
            n # restarts at 0 in each task, so values repeat when threads > 1
          when "float", "double"
            Random.rand * 10000
          when "date"
            Time.at(rand * Time.now.to_f)
          else
            raise "unknown type: #{type}"
          end
        })
      }
      @page_builder.finish

      { # commit report
        "rows" => rows,
        "columns" => schema.size
      }
    end
  end

end
metadata
ADDED
@@ -0,0 +1,80 @@
--- !ruby/object:Gem::Specification
name: embulk-input-random
version: !ruby/object:Gem::Version
  version: 0.0.2
platform: ruby
authors:
- KUMAZAKI Hiroki
autorequire:
bindir: bin
cert_chain: []
date: 2015-02-23 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.7'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.7'
- !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
description: Create dummy table
email:
- hiroki.kumazaki@gmail.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- ".gitignore"
- Gemfile
- LICENSE.txt
- README.md
- Rakefile
- embulk-input-random.gemspec
- lib/embulk/input/random.rb
homepage: https://github.com/kumagi/embulk-input-random
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.4.5
signing_key:
specification_version: 4
summary: Random Table Generator for Embulk
test_files: []
has_rdoc: