embulk-filter-timestamp_format 0.1.9 → 0.2.0

Sign up to get free protection for your applications and to get access to all the features.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 88c7a2edd52859e9304b268ecea6c766158857cd
4
- data.tar.gz: 263540b3154fdd4c0a04892db3708b09dac23cc0
3
+ metadata.gz: 6c2034a4c12e1d4439ed09e907953e99ebca2ab7
4
+ data.tar.gz: 042b74ef4661bb9c1a629178df7e13c76bab03db
5
5
  SHA512:
6
- metadata.gz: 08c540df5bab3aeb3f03c3507042f623fead8488126f714a17dfa3ff09a8c536d03c94ed1773cb6fa37b6255d8cf55f86ed454cb4b6cb229c8bc68e5bcee8c27
7
- data.tar.gz: 945fb0c472ee3d9cc42bcfb9dda0a4444919773c42a9b4cca13a274e61505475ce2733e8bfc430449103488cddae5617300d9d0f2195cf779fafc24a96716819
6
+ metadata.gz: eda3234fc719296056ee7640dcf889dc9af2230f14db52c0bb1678d5a08f709a938bd7a8fbf796ce740a64d22f6a7bf7f7a2930a83b2d30152321e1fd485fc7a
7
+ data.tar.gz: dc28a057caaf95736d2667542253850d569f00ac2fb9181b4d9526e903a0842eefaca635ce2a41f1b5adc00863f21c3443f89905ff1da9848c8d4f2e897d8f38
data/CHANGELOG.md CHANGED
@@ -1,3 +1,9 @@
1
+ # 0.2.0 (2016-05-13)
2
+
3
+ Enhancements:
4
+
5
+ * Add `timestamp_format: auto_java` option (experimental)
6
+
1
7
  # 0.1.9 (2016-05-10)
2
8
 
3
9
  Enhancements:
data/README.md CHANGED
@@ -19,9 +19,10 @@ A filter plugin for Embulk to change timestamp format
19
19
  - **default_from_timezone**: default timezone for the input string (string, default is `UTC`)
20
20
  - **default_to_timestamp_format**: default timestamp format for the output string (string, default is `%Y-%m-%d %H:%M:%S.%N %z`)
21
21
  - **default_to_timezone**: default timezone for the output string (string, default is `UTC`)
22
- - **default_from_timestamp_unit**: default time unit such as `second`, `ms`, `us`, `ns` for the input unixtimestamp (string, default is `second`)
23
- - **default_to_timestamp_unit**: default time unit such as `second`, `ms`, `us`, `ns` for the output unixtimestamp (string, default is `second`)
22
+ - **default_from_timestamp_unit**: default time unit such as `sec` (for second), `ms` (for milli second), `us` (for micro second), `ns` (for nano second) for the input unixtimestamp (string, default is `second`)
23
+ - **default_to_timestamp_unit**: default time unit such as `sec` (for second), `ms` (for milli second), `us` (for micro second), `ns` (for nano second) for the output unixtimestamp (string, default is `second`)
24
24
  - **stop_on_invalid_record**: stop bulk load transaction if a invalid record is found (boolean, default is `false`)
25
+ - **timestamp_parser** (experimental): set `auto_java` to try to convert ruby format to java format to use faster java timestamp parser (string, default is `auto`)
25
26
 
26
27
  ## Example
27
28
 
@@ -62,9 +63,9 @@ Output will be as:
62
63
 
63
64
  See [./example](./example) for more examples.
64
65
 
65
- ## JRuby Timestamp Parser/Formatter Performance Issue
66
+ ## JRuby Timestamp Parser Performance Issue
66
67
 
67
- Embulk's timestamp parser/formatter originally uses jruby implementation, but it is slow.
68
+ Embulk's timestamp parser originally uses jruby implementation, but it is slow.
68
69
  To improve performance, this plugin also supports Java's Joda-Time [DateTimeFormat](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html) format as:
69
70
 
70
71
  ```yaml
@@ -90,6 +91,10 @@ out:
90
91
 
91
92
  If format strings contain `%`, jruby parser/formatter is used. Otherwirse, java parser/formatter is used
92
93
 
94
+ **Auto Java timestamp format conversion** (experimental)
95
+
96
+ If you configure `timestamp_parser: auto_java`, this plugin tries to convert ruby format into java format to use faster java timestamp parser.
97
+
93
98
  **COMPARISON:**
94
99
 
95
100
  Benchmark test sets are available at [./bench](./bench). In my environment (Mac Book Pro), for 1000000 timestamps:
@@ -99,9 +104,11 @@ Benchmark test sets are available at [./bench](./bench). In my environment (Mac
99
104
  * jruby parser / java formatter: 64.52s
100
105
  * jruby parser / jruby formatter: 65.06s
101
106
 
107
+ JRuby parser is slow, but JRuby formatter is not so slow.
108
+
102
109
  ## Nano Resolution
103
110
 
104
- JRuby parser has micro second resolution. Java (Joda-Time) parser has milli second resolution (although Java8's DateTimeFormatter supports nano second resolution)
111
+ JRuby parser has micro second resolution. Java (Joda-Time) parser has milli second resolution.
105
112
 
106
113
  Nano second resolution is partially supported by this plugin itself. Use parser format `nnnnnnnnn` for Java parser as
107
114
 
@@ -118,6 +125,8 @@ yyyy-MM-dd HH:mm:ss.nnnnnnnnn z
118
125
  yyyy-MM-dd HH:mm:ss.nnnnnn z
119
126
  ```
120
127
 
128
+ FYI: Java8's DateTimeFormatter supports nano second resolution, but we can not use it because embulk supports Java7.
129
+
121
130
  ## ToDo
122
131
 
123
132
  * Write test
data/build.gradle CHANGED
@@ -13,7 +13,7 @@ configurations {
13
13
  provided
14
14
  }
15
15
 
16
- version = "0.1.9"
16
+ version = "0.2.0"
17
17
  sourceCompatibility = 1.7
18
18
  targetCompatibility = 1.7
19
19
 
@@ -22,6 +22,7 @@ dependencies {
22
22
  provided "org.embulk:embulk-core:0.8.+"
23
23
  // compile "YOUR_JAR_DEPENDENCY_GROUP:YOUR_JAR_DEPENDENCY_MODULE:YOUR_JAR_DEPENDENCY_VERSION"
24
24
  testCompile "junit:junit:4.+"
25
+ testCompile "org.embulk:embulk-core:0.7.+:tests"
25
26
  }
26
27
 
27
28
  checkstyle {
data/example/empty.yml ADDED
@@ -0,0 +1,9 @@
1
+ in:
2
+ type: file
3
+ path_prefix: example/example.jsonl
4
+ parser:
5
+ type: json
6
+ filters:
7
+ - type: timestamp_format
8
+ out:
9
+ type: "null"
@@ -0,0 +1,24 @@
1
+ in:
2
+ type: file
3
+ path_prefix: example/string.csv
4
+ parser:
5
+ type: csv
6
+ columns:
7
+ - {name: string1, type: string}
8
+ - {name: string2, type: string}
9
+ - {name: string3, type: string}
10
+ - {name: string4, type: string}
11
+ filters:
12
+ - type: timestamp_format
13
+ default_from_timezone: "Asia/Taipei"
14
+ default_from_timestamp_format: ["%Y-%m-%d", "%Y-%m-%d %Z", "%Y-%m-%d %H:%M:%S.%N %Z", "%Y-%m-%d %H:%M:%S %Z", "%Y-%m-%d %H:%M:%S"]
15
+ default_to_timezone: "Asia/Taipei"
16
+ default_to_timestamp_format: "%Y-%m-%d %H:%M:%S.%N"
17
+ timestamp_parser: auto_java
18
+ columns:
19
+ - {name: string1}
20
+ - {name: string2, type: timestamp}
21
+ - {name: string3, type: long, to_unit: ms}
22
+ - {name: string4, type: double, to_unit: ms}
23
+ out:
24
+ type: "null"
@@ -13,7 +13,7 @@ filters:
13
13
  default_from_timezone: "Asia/Taipei"
14
14
  default_from_timestamp_format: ["yyyy-MM-dd", "yyyy-MM-dd z", "yyyy-MM-dd HH:mm:ss.nnnnnnnnn z", "yyyy-MM-dd HH:mm:ss z", "yyyy-MM-dd HH:mm:ss"]
15
15
  default_to_timezone: "Asia/Taipei"
16
- default_to_timestamp_format: "yyyy-MM-dd HH:mm:ss.nnnnnnnn"
16
+ default_to_timestamp_format: "yyyy-MM-dd HH:mm:ss.nnnnnn Z"
17
17
  columns:
18
18
  - {name: string1}
19
19
  - {name: string2, type: timestamp}
@@ -21,6 +21,7 @@ import org.joda.time.DateTimeZone;
21
21
  import org.msgpack.value.Value;
22
22
  import org.slf4j.Logger;
23
23
 
24
+ import java.util.ArrayList;
24
25
  import java.util.HashMap;
25
26
  import java.util.List;
26
27
 
@@ -69,7 +70,24 @@ public class ColumnCaster
69
70
  {
70
71
  DateTimeZone timezone = columnConfig.getFromTimeZone().or(task.getDefaultFromTimeZone());
71
72
  List<String> formatList = columnConfig.getFromFormat().or(task.getDefaultFromTimestampFormat());
72
- return new TimestampParser(task.getJRuby(), formatList, timezone);
73
+ List<String> newFormatList = new ArrayList<>(formatList);
74
+ String name = columnConfig.getName();
75
+ if (task.getTimeStampParser().equals("auto_java")) {
76
+ for (int i = 0; i < formatList.size(); i++) {
77
+ String format = formatList.get(i);
78
+ if (!format.contains("%")) {
79
+ continue;
80
+ }
81
+ String javaFormat = TimestampFormatConverter.toJavaFormat(format);
82
+ if (javaFormat == null) {
83
+ logger.info(String.format("%s: Failed to convert ruby parser to java parser: \"%s\", Use ruby parser as is", name, format));
84
+ } else {
85
+ logger.debug(String.format("%s: Convert ruby parser \"%s\" to java parser \"%s\"", name, format, javaFormat));
86
+ newFormatList.set(i, javaFormat);
87
+ }
88
+ }
89
+ }
90
+ return new TimestampParser(task.getJRuby(), newFormatList, timezone);
73
91
  }
74
92
 
75
93
  private void buildTimestampFormatterMap()
@@ -0,0 +1,147 @@
1
+ package org.embulk.filter.timestamp_format;
2
+
3
+ // Convert JRuby Time Format into Java (Joda-Time) Format
4
+ // Aimed only for parser (JRuby format is too rich than Java Format in terms of formatter)
5
+
6
+ import java.util.HashMap;
7
+ import java.util.regex.Matcher;
8
+ import java.util.regex.Pattern;
9
+
10
+ public class TimestampFormatConverter
11
+ {
12
+ public static final HashMap<String, String> RUBY_TO_JAVA_FORMAT_TABLE = new HashMap<>();
13
+ private static final Pattern IDENTIFIER_PATTERN;
14
+ private static final Pattern NON_IDENTIFIER_PATTERN;
15
+
16
+ static
17
+ {
18
+ // %A<Friday> EEEE<Friday>
19
+ // %B<May> MMMM<May>
20
+ // %C<20> CC<20>
21
+ // %D<05/13/16> MM/dd/yy<05/13/16>
22
+ // %F<2016-05-13> yyyy-MM-dd<2016-05-13>
23
+ // %H<09> HH<09>
24
+ // %I<09> hh<09>
25
+ // %L<123> SSS<123>
26
+ // %M<02> mm<02>
27
+ // %N<123456789> nnnnnnnnn<123456789>
28
+ // %P<am> a<AM>
29
+ // %R<09:02> HH:mm<09:02>
30
+ // %S<39> ss<39>
31
+ // %T<09:02:39> HH:mm:ss<09:02:39>
32
+ // %U<19> w<19>
33
+ // %V<19> w<19>
34
+ // %W<19> w<19>
35
+ // %X<09:02:39> HH:mm:ss<09:02:39>
36
+ // %Y<2016> yyyy<2016>
37
+ // %Z<UTC> z<UTC>
38
+ // %a<Fri> EEE<Fri>
39
+ // %b<May> MMM<May>
40
+ // %c<Fri May 13 09:02:39 2016> EEE MMM dd HH:mm:ss yyyy<Fri May 13 09:02:39 2016>
41
+ // %d<13> dd<13>
42
+ // %e<13> dd<13>
43
+ // %h<May> MMM<May>
44
+ // %j<134> DDD<134>
45
+ // %k< 9> HH<09>
46
+ // %m<05> MM<05>
47
+ // %p<AM> a<AM>
48
+ // %r<09:02:39 AM> hh:mm:ss a<09:02:39 AM>
49
+ // %u<5> e<5>
50
+ // %v<13-MAY-2016> dd-MMM-yyyy<13-May-2016>
51
+ // %w<5> e<5>
52
+ // %x<05/13/16> MM/dd/yy<05/13/16>
53
+ // %y<16> yy<16>
54
+ // %z<+0000> Z<+0000>
55
+ // %:z<+00:00> Z<+0000>
56
+ // %::z<+00:00:00> Z<+0000>
57
+ RUBY_TO_JAVA_FORMAT_TABLE.put("A", "EEEE");
58
+ RUBY_TO_JAVA_FORMAT_TABLE.put("a", "EEE");
59
+ RUBY_TO_JAVA_FORMAT_TABLE.put("B", "MMMM");
60
+ RUBY_TO_JAVA_FORMAT_TABLE.put("b", "MMM");
61
+ RUBY_TO_JAVA_FORMAT_TABLE.put("C", "CC");
62
+ RUBY_TO_JAVA_FORMAT_TABLE.put("c", "EEE MMM dd HH:mm:ss yyyy");
63
+ RUBY_TO_JAVA_FORMAT_TABLE.put("D", "MM/dd/yy");
64
+ RUBY_TO_JAVA_FORMAT_TABLE.put("d", "dd");
65
+ RUBY_TO_JAVA_FORMAT_TABLE.put("e", "dd");
66
+ RUBY_TO_JAVA_FORMAT_TABLE.put("F", "yyyy-MM-dd");
67
+ RUBY_TO_JAVA_FORMAT_TABLE.put("H", "HH");
68
+ RUBY_TO_JAVA_FORMAT_TABLE.put("h", "MMM");
69
+ RUBY_TO_JAVA_FORMAT_TABLE.put("I", "hh");
70
+ RUBY_TO_JAVA_FORMAT_TABLE.put("j", "DDD");
71
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("k", "HH"); // " 9" fails with HH
72
+ RUBY_TO_JAVA_FORMAT_TABLE.put("L", "SSS");
73
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("l", "hh"); // " 9" fails with hh
74
+ RUBY_TO_JAVA_FORMAT_TABLE.put("M", "mm");
75
+ RUBY_TO_JAVA_FORMAT_TABLE.put("m", "MM");
76
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("n", "");
77
+ RUBY_TO_JAVA_FORMAT_TABLE.put("N", "nnnnnnnnn");
78
+ RUBY_TO_JAVA_FORMAT_TABLE.put("P", "a");
79
+ RUBY_TO_JAVA_FORMAT_TABLE.put("p", "a");
80
+ RUBY_TO_JAVA_FORMAT_TABLE.put("R", "HH:mm");
81
+ RUBY_TO_JAVA_FORMAT_TABLE.put("r", "hh:mm:ss a");
82
+ RUBY_TO_JAVA_FORMAT_TABLE.put("S", "ss");
83
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("s", "")); // N/A
84
+ RUBY_TO_JAVA_FORMAT_TABLE.put("T", "HH:mm:ss");
85
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("t", "");
86
+ RUBY_TO_JAVA_FORMAT_TABLE.put("U", "w");
87
+ RUBY_TO_JAVA_FORMAT_TABLE.put("u", "e");
88
+ RUBY_TO_JAVA_FORMAT_TABLE.put("v", "dd-MMM-yyyy");
89
+ RUBY_TO_JAVA_FORMAT_TABLE.put("V", "w");
90
+ RUBY_TO_JAVA_FORMAT_TABLE.put("W", "w");
91
+ RUBY_TO_JAVA_FORMAT_TABLE.put("w", "e");
92
+ RUBY_TO_JAVA_FORMAT_TABLE.put("X", "HH:mm:ss");
93
+ RUBY_TO_JAVA_FORMAT_TABLE.put("x", "MM/dd/yy");
94
+ RUBY_TO_JAVA_FORMAT_TABLE.put("Y", "yyyy");
95
+ RUBY_TO_JAVA_FORMAT_TABLE.put("y", "yy");
96
+ RUBY_TO_JAVA_FORMAT_TABLE.put("Z", "z");
97
+ RUBY_TO_JAVA_FORMAT_TABLE.put("z", "Z");
98
+ //RUBY_TO_JAVA_FORMAT_TABLE.put("%", "");
99
+
100
+ String[] array = RUBY_TO_JAVA_FORMAT_TABLE.keySet().toArray(new String[0]);
101
+ StringBuilder keyPatternBuilder = new StringBuilder(array[0]);
102
+ for (int i = 1; i < array.length; i++) {
103
+ keyPatternBuilder.append(array[i]);
104
+ }
105
+ IDENTIFIER_PATTERN = Pattern.compile(new StringBuilder()
106
+ .append("%[-_^#0-9:]*([")
107
+ .append(keyPatternBuilder.toString())
108
+ .append("])")
109
+ .toString());
110
+
111
+ NON_IDENTIFIER_PATTERN = Pattern.compile("(^|\\s)([^%\\s]\\S*)");
112
+ }
113
+
114
+ // @return returns null if appropriate java format is not available
115
+ public static String toJavaFormat(String rubyFormat)
116
+ {
117
+ String quotedFormat = quoteFormat(rubyFormat);
118
+ Matcher match = IDENTIFIER_PATTERN.matcher(quotedFormat);
119
+ StringBuffer buf = new StringBuffer();
120
+ while (match.find()) {
121
+ String key = match.group(1);
122
+ String replacement = RUBY_TO_JAVA_FORMAT_TABLE.get(key);
123
+ match.appendReplacement(buf, replacement);
124
+ }
125
+ match.appendTail(buf);
126
+ String javaFormat = buf.toString();
127
+
128
+ if (javaFormat.contains("%")) {
129
+ return null; // give up to use java format
130
+ }
131
+ else {
132
+ return javaFormat;
133
+ }
134
+ }
135
+
136
+ private static String quoteFormat(String rubyFormat)
137
+ {
138
+ Matcher match = NON_IDENTIFIER_PATTERN.matcher(rubyFormat);
139
+ StringBuffer buf = new StringBuffer();
140
+ while (match.find()) {
141
+ String replacement = new StringBuilder().append(match.group(1)).append("'").append(match.group(2)).append("'").toString();
142
+ match.appendReplacement(buf, replacement);
143
+ }
144
+ match.appendTail(buf);
145
+ return buf.toString();
146
+ }
147
+ }
@@ -19,7 +19,6 @@ import org.embulk.spi.PageOutput;
19
19
  import org.embulk.spi.PageReader;
20
20
  import org.embulk.spi.Schema;
21
21
 
22
- import org.embulk.spi.time.Timestamp;
23
22
  import org.embulk.spi.type.BooleanType;
24
23
  import org.embulk.spi.type.JsonType;
25
24
  import org.embulk.spi.type.TimestampType;
@@ -66,6 +65,10 @@ public class TimestampFormatFilterPlugin implements FilterPlugin
66
65
  @ConfigDefault("false")
67
66
  Boolean getStopOnInvalidRecord();
68
67
 
68
+ @Config("timestamp_parser")
69
+ @ConfigDefault("\"auto\"") // or auto_java
70
+ String getTimeStampParser();
71
+
69
72
  @Config("default_from_timestamp_unit")
70
73
  @ConfigDefault("\"second\"")
71
74
  TimestampUnit getDefaultFromTimestampUnit();
@@ -16,8 +16,6 @@ import org.jruby.embed.ScriptingContainer;
16
16
  import org.jruby.util.RubyDateFormat;
17
17
 
18
18
  import java.util.Locale;
19
- import java.util.regex.Matcher;
20
- import java.util.regex.Pattern;
21
19
 
22
20
  import org.joda.time.format.DateTimeFormat;
23
21
  import org.joda.time.format.DateTimeFormatter;
@@ -21,14 +21,20 @@ public class TimestampUnitDeserializer
21
21
  builder.put("MilliSecond", TimestampUnit.MilliSecond);
22
22
  builder.put("millisecond", TimestampUnit.MilliSecond);
23
23
  builder.put("milli_second", TimestampUnit.MilliSecond);
24
+ builder.put("milli", TimestampUnit.MilliSecond);
25
+ builder.put("msec", TimestampUnit.MilliSecond);
24
26
  builder.put("ms", TimestampUnit.MilliSecond);
25
27
  builder.put("MicroSecond", TimestampUnit.MicroSecond);
26
28
  builder.put("microsecond", TimestampUnit.MicroSecond);
27
29
  builder.put("micro_second", TimestampUnit.MicroSecond);
30
+ builder.put("micro", TimestampUnit.MicroSecond);
31
+ builder.put("usec", TimestampUnit.MicroSecond);
28
32
  builder.put("us", TimestampUnit.MicroSecond);
29
33
  builder.put("NanoSecond", TimestampUnit.NanoSecond);
30
34
  builder.put("nanosecond", TimestampUnit.NanoSecond);
31
35
  builder.put("nano_second", TimestampUnit.NanoSecond);
36
+ builder.put("nano", TimestampUnit.NanoSecond);
37
+ builder.put("nsec", TimestampUnit.NanoSecond);
32
38
  builder.put("ns", TimestampUnit.NanoSecond);
33
39
  stringToTimestampUnitMap = builder.build();
34
40
  }
@@ -0,0 +1,114 @@
1
+ package org.embulk.filter.timestamp_format;
2
+
3
+ import org.embulk.EmbulkTestRuntime;
4
+
5
+ import org.embulk.spi.time.Timestamp;
6
+ import org.joda.time.DateTimeZone;
7
+ import org.jruby.embed.ScriptingContainer;
8
+
9
+ import org.junit.Before;
10
+ import org.junit.Rule;
11
+ import org.junit.Test;
12
+
13
+ import java.util.Arrays;
14
+ import java.util.Map;
15
+
16
+ import static org.junit.Assert.assertEquals;
17
+ import static org.junit.Assert.fail;
18
+
19
+ public class TestTimestampFormatConverter
20
+ {
21
+ @Rule
22
+ public EmbulkTestRuntime runtime = new EmbulkTestRuntime();
23
+ public ScriptingContainer jruby;
24
+ public DateTimeZone zone;
25
+ public Timestamp timestamp;
26
+
27
+ @Before
28
+ public void createResource()
29
+ {
30
+ jruby = new ScriptingContainer();
31
+ zone = DateTimeZone.UTC;
32
+ timestamp = Timestamp.ofEpochSecond(1463130159, 123456789);
33
+ }
34
+
35
+ @Test
36
+ public void testRUBY_TO_JAVA_FORMAT_TABLE()
37
+ {
38
+ for(Map.Entry<String, String> entry : TimestampFormatConverter.RUBY_TO_JAVA_FORMAT_TABLE.entrySet()) {
39
+ String rubyFormat = "%" + entry.getKey();
40
+ String javaFormat = entry.getValue();
41
+
42
+ TimestampFormatter rubyFormatter = new TimestampFormatter(jruby, rubyFormat, zone);
43
+ TimestampFormatter javaFormatter = new TimestampFormatter(jruby, javaFormat, zone);
44
+ String rubyFormatted = rubyFormatter.format(timestamp);
45
+ String javaFormatted = javaFormatter.format(timestamp);
46
+ // System.out.println(String.format("%s<%s> %s<%s>", rubyFormat, rubyFormatted, javaFormat, javaFormatted));
47
+
48
+ TimestampParser rubyParser = new TimestampParser(jruby, Arrays.asList("." + rubyFormat), zone);
49
+ TimestampParser javaParser = new TimestampParser(jruby, Arrays.asList("." + javaFormat), zone);
50
+ Timestamp rubyParsed = rubyParser.parse("." + rubyFormatted);
51
+ try {
52
+ Timestamp javaParsed = javaParser.parse("." + rubyFormatted);
53
+ }
54
+ catch (IllegalArgumentException ex) {
55
+ fail(String.format("Parse \"%s\" with java format \"%s\" failed (corresponding ruby format \"%s\")", rubyFormatted, javaFormat, rubyFormat));
56
+ }
57
+ }
58
+ }
59
+
60
+ @Test
61
+ public void testToJavaFormat()
62
+ {
63
+ for(Map.Entry<String, String> entry : TimestampFormatConverter.RUBY_TO_JAVA_FORMAT_TABLE.entrySet()) {
64
+ String rubyFormat = "%-2" + entry.getKey();
65
+ String javaFormat = entry.getValue();
66
+ assertEquals(javaFormat, TimestampFormatConverter.toJavaFormat(rubyFormat));
67
+ }
68
+ }
69
+
70
+ @Test
71
+ public void testToJavaFormats()
72
+ {
73
+ {
74
+ String rubyFormat = "%Y-%m-%d %H:%M:%S.%6N %:z";
75
+ String javaFormat = "yyyy-MM-dd HH:mm:ss.nnnnnnnnn Z";
76
+ assertEquals(javaFormat, TimestampFormatConverter.toJavaFormat(rubyFormat));
77
+
78
+ TimestampParser parser = new TimestampParser(jruby, Arrays.asList(javaFormat), zone);
79
+ try {
80
+ parser.parse("2016-05-12 20:14:13.123456789 +09:00");
81
+ }
82
+ catch (IllegalArgumentException ex) {
83
+ fail();
84
+ }
85
+ }
86
+ {
87
+ String rubyFormat = "%Y-%m-%d %H:%M:%S.%6N UTC";
88
+ String javaFormat = "yyyy-MM-dd HH:mm:ss.nnnnnnnnn 'UTC'";
89
+ assertEquals(javaFormat, TimestampFormatConverter.toJavaFormat(rubyFormat));
90
+
91
+ TimestampParser parser = new TimestampParser(jruby, Arrays.asList(javaFormat), zone);
92
+ try {
93
+ parser.parse("2016-05-12 20:14:13.123456789 UTC");
94
+ }
95
+ catch (IllegalArgumentException ex) {
96
+ fail();
97
+ }
98
+ }
99
+ {
100
+ String rubyFormat = "%Y-%m-%d %H:%M:%S.%6N +00:00";
101
+ String javaFormat = "yyyy-MM-dd HH:mm:ss.nnnnnnnnn '+00:00'";
102
+ assertEquals(javaFormat, TimestampFormatConverter.toJavaFormat(rubyFormat));
103
+
104
+ TimestampParser parser = new TimestampParser(jruby, Arrays.asList(javaFormat), zone);
105
+ try {
106
+ parser.parse("2016-05-12 20:14:13.123456789 +00:00");
107
+ }
108
+ catch (IllegalArgumentException ex) {
109
+ fail();
110
+ }
111
+ }
112
+ }
113
+ }
114
+
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: embulk-filter-timestamp_format
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.9
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Naotoshi Seo
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-05-10 00:00:00.000000000 Z
11
+ date: 2016-05-13 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -60,6 +60,7 @@ files:
60
60
  - config/checkstyle/checkstyle.xml
61
61
  - example/double.csv
62
62
  - example/double.yml
63
+ - example/empty.yml
63
64
  - example/example.jsonl
64
65
  - example/example.yml
65
66
  - example/example2.csv
@@ -74,6 +75,7 @@ files:
74
75
  - example/long.yml
75
76
  - example/string.csv
76
77
  - example/string.yml
78
+ - example/string_auto_java.yml
77
79
  - example/string_java.yml
78
80
  - example/string_nano.yml
79
81
  - example/timestamp.csv
@@ -87,6 +89,7 @@ files:
87
89
  - src/main/java/org/embulk/filter/timestamp_format/ColumnVisitorImpl.java
88
90
  - src/main/java/org/embulk/filter/timestamp_format/JsonCaster.java
89
91
  - src/main/java/org/embulk/filter/timestamp_format/JsonVisitor.java
92
+ - src/main/java/org/embulk/filter/timestamp_format/TimestampFormatConverter.java
90
93
  - src/main/java/org/embulk/filter/timestamp_format/TimestampFormatFilterPlugin.java
91
94
  - src/main/java/org/embulk/filter/timestamp_format/TimestampFormatter.java
92
95
  - src/main/java/org/embulk/filter/timestamp_format/TimestampParser.java
@@ -96,8 +99,9 @@ files:
96
99
  - src/main/java/org/embulk/filter/timestamp_format/cast/LongCast.java
97
100
  - src/main/java/org/embulk/filter/timestamp_format/cast/StringCast.java
98
101
  - src/main/java/org/embulk/filter/timestamp_format/cast/TimestampCast.java
102
+ - src/test/java/org/embulk/filter/timestamp_format/TestTimestampFormatConverter.java
99
103
  - src/test/java/org/embulk/filter/timestamp_format/TestTimestampUnit.java
100
- - classpath/embulk-filter-timestamp_format-0.1.9.jar
104
+ - classpath/embulk-filter-timestamp_format-0.2.0.jar
101
105
  homepage: https://github.com/sonots/embulk-filter-timestamp_format
102
106
  licenses:
103
107
  - MIT