embulk-filter-column 0.6.0.pre6 → 0.6.0
- checksums.yaml +4 -4
- data/.travis.yml +1 -1
- data/CHANGELOG.md +1 -1
- data/README.md +28 -26
- data/build.gradle +1 -1
- metadata +5 -5
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9ce6ab66b7e8d2cbccac522531d8fc271b8be874
+  data.tar.gz: 6b5d29acd7231d2f2d7ee2375a5096796323c8b5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0a00986235b253418cb0da460d7c2b38c86e3aae6ac98793a5a2aec84466dfdde7361a31dad2ca2b67442a35b731f206e04ca1ff80e33a6e8fb5e523582fabe5
+  data.tar.gz: 253cf3387f7032f3db29490103872693ce8583722107ca8502cf4b221968ea25f786174e080867fbc49b323e9ad8e52498da4f6110f1d533fbefc9b5dcecacfe
data/.travis.yml
CHANGED
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -25,15 +25,15 @@ A filter plugin for Embulk to filter out columns
 - **default_timestamp_format**: default timestamp format for timestamp columns (string, default is `%Y-%m-%d %H:%M:%S.%N %z`)
 - **default_timezone**: default timezone for timestamp columns (string, default is `UTC`)
 
-## Example
+## Example - columns
 
 Say input.csv is as follows:
 
 ```
 time,id,key,score
-2015-07-13,0,
-2015-07-13,1,
-2015-07-13,2,
+2015-07-13,0,Vqjht6YE,1370
+2015-07-13,1,VmjbjAA0,3962
+2015-07-13,2,C40P5H1W,7323
 ```
 
 ```yaml
@@ -48,22 +48,23 @@ filters:
 reduces columns to only `time`, `id`, and `key` columns as:
 
 ```
-
-2015-07-13,
-2015-07-13,
+time,id,key
+2015-07-13,0,Vqjht6YE
+2015-07-13,1,VmjbjAA0
+2015-07-13,2,C40P5H1W
 ```
 
 Note that column types are automatically retrieved from input data (inputSchema).
 
-## Example
+## Example - add_columns
 
 Say input.csv is as follows:
 
 ```
 time,id,key,score
-2015-07-13,0,
-2015-07-13,1,
-2015-07-13,2,
+2015-07-13,0,Vqjht6YE,1370
+2015-07-13,1,VmjbjAA0,3962
+2015-07-13,2,C40P5H1W,7323
 ```
 
 ```yaml
@@ -77,20 +78,21 @@ filters:
 add `d` column, and `copy_id` column which is a copy of `id` column as:
 
 ```
-
-2015-07-13,
-2015-07-13,
+time,id,key,score,d,copy_id
+2015-07-13,0,Vqjht6YE,1370,2015-07-13,0
+2015-07-13,1,VmjbjAA0,3962,2015-07-13,1
+2015-07-13,2,C40P5H1W,7323,2015-07,13,2
 ```
 
-## Example
+## Example - drop_columns
 
 Say input.csv is as follows:
 
 ```
 time,id,key,score
-2015-07-13,0,
-2015-07-13,1,
-2015-07-13,2,
+2015-07-13,0,Vqjht6YE,1370
+2015-07-13,1,VmjbjAA0,3962
+2015-07-13,2,C40P5H1W,7323
 ```
 
 ```yaml
@@ -104,12 +106,13 @@ filters:
 drop `time` and `id` columns as:
 
 ```
-
-
-
+key,score
+Vqjht6YE,1370
+VmjbjAA0,3962
+C40P5H1W,7323
 ```
 
-## JSONPath
+## JSONPath
 
 For type: json column, you can specify [JSONPath](http://goessner.net/articles/JsonPath/) for column's name as:
 
@@ -128,12 +131,11 @@ EXAMPLE:
 
 NOTE:
 
-
-* `type: timesatmp` for `add_columns` or `columns` is not available because Embulk's `type: json` cannot have timestamp column
+1) JSONPath syntax is not fully supported
 
-
+2) `type: timesatmp` for `add_columns` or `columns` is not available because Embulk's `type: json` cannot have timestamp column inside
 
-Rename or copy of json paths by `src` option is only partially supported yet. The parent json path must be same like:
+3) Rename or copy of json paths by `src` option is only partially supported yet. The parent json path must be same like:
 
 ```
 - {name: $.payload.foo.dest, src: $.payload.foo.src}
data/build.gradle
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: embulk-filter-column
 version: !ruby/object:Gem::Version
-  version: 0.6.0
+  version: 0.6.0
 platform: ruby
 authors:
 - Naotoshi Seo
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-
+date: 2016-11-05 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -82,7 +82,7 @@ files:
 - classpath/accessors-smart-1.1.jar
 - classpath/asm-5.0.3.jar
 - classpath/commons-lang3-3.4.jar
-- classpath/embulk-filter-column-0.6.0.
+- classpath/embulk-filter-column-0.6.0.jar
 - classpath/json-smart-2.2.1.jar
 - classpath/JsonPathCompiler-0.0.12.jar
 - classpath/slf4j-api-1.7.21.jar
@@ -101,9 +101,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
-  - - '
+  - - '>='
     - !ruby/object:Gem::Version
-      version:
+      version: '0'
 requirements: []
 rubyforge_project:
 rubygems_version: 2.1.9
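The three README examples touched in this diff (`columns`, `add_columns`, `drop_columns`) describe simple column transformations on `input.csv`. The following is a minimal Python sketch of those transformations, not the plugin's implementation (the plugin itself is a Java Embulk filter configured through YAML). The function names, the spec dictionaries, and the `default` value used for the `d` column are assumptions made for illustration, since the YAML configuration is not shown in the diff.

```python
import csv
import io

# Sample input taken from the README examples in this diff.
INPUT_CSV = """time,id,key,score
2015-07-13,0,Vqjht6YE,1370
2015-07-13,1,VmjbjAA0,3962
2015-07-13,2,C40P5H1W,7323
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def keep_columns(rows, names):
    """Mimic `columns`: keep only the named columns, in the given order."""
    return [{name: row[name] for name in names} for row in rows]


def add_columns(rows, specs):
    """Mimic `add_columns`: append columns copied from `src` or set to `default`.

    The spec shape ({"name", "src"} or {"name", "default"}) is hypothetical.
    """
    result = []
    for row in rows:
        new_row = dict(row)
        for spec in specs:
            if "src" in spec:
                new_row[spec["name"]] = row[spec["src"]]
            else:
                new_row[spec["name"]] = spec["default"]
        result.append(new_row)
    return result


def drop_columns(rows, names):
    """Mimic `drop_columns`: remove the named columns, keep the rest."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def to_csv(rows):
    """Serialize rows back to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    rows = read_rows(INPUT_CSV)
    print(to_csv(keep_columns(rows, ["time", "id", "key"])))
    # The default for `d` is an assumed value chosen to match the README output.
    print(to_csv(add_columns(rows, [{"name": "d", "default": "2015-07-13"},
                                    {"name": "copy_id", "src": "id"}])))
    print(to_csv(drop_columns(rows, ["time", "id"])))
```

The sketch treats every value as a string, whereas the plugin works on typed Embulk columns and, per the README, takes column types from the input schema.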