embulk-output-elasticsearch_1.x 0.1.8

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: f7d5473cbc2fdad178a9d50b9b9b5e4c416290dc
4
+ data.tar.gz: edabbc01a70b8350e5963d668602fe5e5e923fbb
5
+ SHA512:
6
+ metadata.gz: 4b455b0eda32674f9e6c64d5af0c1a5b191b824b59d82b9180d5c3d1c0ad220b409c959fa8b125b757d5aa6d4b0fec185626f31b21a2726e8372b34ca772d5d4
7
+ data.tar.gz: f077a539cc227e714110ffe60e2e19ebe53ab9b527c907a005483d30c07ccc549323f115921ad6c8b30ff3281a36373ec3f6603aa0ffe423e54a5a894c145194
data/.gitignore ADDED
@@ -0,0 +1,7 @@
1
+ *~
2
+ *.iml
3
+ .idea
4
+ build/
5
+ /classpath/
6
+ /.gradle
7
+ /pkg/
data/CHANGELOG.md ADDED
@@ -0,0 +1,51 @@
1
+ ## 0.3.1 - 2016-06-21
2
+
3
+ * [maintenance] Update Elasticsearch client to 2.3.3 [#25](https://github.com/muga/embulk-output-elasticsearch/pull/25)
4
+
5
+ ## 0.3.0 - 2016-02-22
6
+
7
+ * [maintenance] Upgrade Embulk v08 [#21](https://github.com/muga/embulk-output-elasticsearch/pull/21)
8
+
9
+ ## 0.2.1 - 2016-02-05
10
+
11
+ * [maintenance] Fix bug. Force to fail jobs if nodes down while executing [#19](https://github.com/muga/embulk-output-elasticsearch/pull/19)
12
+
13
+ ## 0.2.0 - 2016-01-26
14
+
15
+ * [new feature] Support Elasticsearch 2.x [#12](https://github.com/muga/embulk-output-elasticsearch/pull/12)
16
+ * [new feature] Added replace mode [#15](https://github.com/muga/embulk-output-elasticsearch/pull/15)
17
+ * [maintenance] Fix id param's behavior [#14](https://github.com/muga/embulk-output-elasticsearch/pull/14)
18
+ * [maintenance] Added unit tests [#17](https://github.com/muga/embulk-output-elasticsearch/pull/17)
19
+ * [maintenance] Upgraded Embulk to v0.7.7
20
+
21
+ ## 0.1.8 - 2015-08-19
22
+
23
+ * [maintenance] Upgraded Embulk to v0.7.0
24
+ * [maintenance] Upgraded Elasticsearch to v1.5.2
25
+
26
+ ## 0.1.7 - 2015-05-09
27
+
28
+ * [maintenance] Fixed handling null value [#10](https://github.com/muga/embulk-output-elasticsearch/pull/10)
29
+
30
+ ## 0.1.6 - 2015-04-14
31
+
32
+ * [new feature] Added bulk_size parameter [#8](https://github.com/muga/embulk-output-elasticsearch/pull/8)
33
+
34
+ ## 0.1.5 - 2015-03-26
35
+
36
+ * [new feature] Added cluster_name parameter [#7](https://github.com/muga/embulk-output-elasticsearch/pull/7)
37
+
38
+ ## 0.1.4 - 2015-03-19
39
+
40
+ * [maintenance] Fixed parameter names index_name to index, doc_id_column to id. [#5](https://github.com/muga/embulk-output-elasticsearch/pull/5)
41
+ * [maintenance] Fixed typo at parameter [#6](https://github.com/muga/embulk-output-elasticsearch/pull/6)
42
+
43
+ ## 0.1.3 - 2015-02-25
44
+
45
+ * [new feature] Supported timestamp column [#4](https://github.com/muga/embulk-output-elasticsearch/pull/4)
46
+
47
+ ## 0.1.2 - 2015-02-24
48
+
49
+ ## 0.1.1 - 2015-02-16
50
+
51
+ ## 0.1.0 - 2015-02-16
data/README.md ADDED
@@ -0,0 +1,117 @@
1
+ # Elasticsearch output plugin for Embulk
2
+
3
+ **Notice** This plugin doesn't support [Amazon(AWS) Elasticsearch Service](https://aws.amazon.com/elasticsearch-service/).
4
+ The plugin uses the [Transport Client](https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.0/transport-client.html), which AWS Elasticsearch does not support.
5
+ > The service supports HTTP on port 80, but does not support TCP transport.
6
+ - *[Amazon Elasticsearch Service Limits](http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-limits.html)*
7
+
8
+ ## Overview
9
+
10
+ * **Plugin type**: output
11
+ * **Rollback supported**: no
12
+ * **Resume supported**: no
13
+ * **Cleanup supported**: no
14
+
15
+ ## Configuration
16
+
17
+ - **mode**: "insert" or "replace". See below (string, optional, default is insert)
18
+ - **nodes**: list of nodes. Each node is a pair of host and port (list, required)
19
+ - **cluster_name**: name of the cluster (string, default is "elasticsearch")
20
+ - **index**: index name (string, required)
21
+ - **index_type**: index type (string, required)
22
+ - **id**: document id column (string, default is null)
23
+ - **bulk_actions**: Sets when to flush a new bulk request based on the number of actions currently added. (int, default is 1000)
24
+ - **bulk_size**: Sets when to flush a new bulk request based on the size of actions currently added. (long, default is 5242880)
25
+ - **concurrent_requests**: number of concurrent bulk requests allowed to execute (int, default is 5)
26
+
27
+ ### Modes
28
+
29
+ #### insert:
30
+
31
+ default.
32
+ This mode writes data to an existing index.
33
+
34
+ #### replace:
35
+
36
+ 1. Create new temporary index
37
+ 2. Insert data into the new index
38
+ 3. Replace the alias with the new index. If the alias doesn't exist, the plugin creates a new alias.
39
+ 4. Delete the existing (old) index if it exists
40
+
41
+ An index must not already exist with the same name as the alias.
42
+
43
+ ```yaml
44
+ out:
45
+ type: elasticsearch
46
+ mode: replace
47
+ nodes:
48
+ - {host: localhost, port: 9300}
49
+ index: <alias name> # plugin generates index name like <index>_%Y%m%d-%H%M%S
50
+ index_type: <index type>
51
+ ```
52
+
53
+ ## Example
54
+
55
+ ```yaml
56
+ out:
57
+ type: elasticsearch
58
+ mode: insert
59
+ nodes:
60
+ - {host: localhost, port: 9300}
61
+ index: <index name>
62
+ index_type: <index type>
63
+ ```
64
+
65
+ ## Build
66
+
67
+ ```
68
+ $ ./gradlew gem # -t to watch change of files and rebuild continuously
69
+ ```
70
+
71
+ ## Test
72
+
73
+ ```
74
+ $ ./gradlew test # -t to watch change of files and rebuild continuously
75
+ ```
76
+
77
+ To run unit tests, we need to configure the following environment variables.
78
+
79
+ When these environment variables are not set, most test cases are skipped.
80
+
81
+ ```
82
+ ES_HOST
83
+ ES_PORT(optional, if needed, default: 9300)
84
+ ES_INDEX
85
+ ES_INDEX_TYPE
86
+ ```
87
+
88
+ If you're using Mac OS X El Capitan and run the tests from a GUI application (such as an IDE), set the variables as follows.
89
+ ```
90
+ $ vi ~/Library/LaunchAgents/environment.plist
91
+ <?xml version="1.0" encoding="UTF-8"?>
92
+ <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
93
+ <plist version="1.0">
94
+ <dict>
95
+ <key>Label</key>
96
+ <string>my.startup</string>
97
+ <key>ProgramArguments</key>
98
+ <array>
99
+ <string>sh</string>
100
+ <string>-c</string>
101
+ <string>
102
+ launchctl setenv ES_HOST example.com
103
+ launchctl setenv ES_PORT 9300
104
+ launchctl setenv ES_INDEX embulk
105
+ launchctl setenv ES_INDEX_TYPE embulk
106
+ </string>
107
+ </array>
108
+ <key>RunAtLoad</key>
109
+ <true/>
110
+ </dict>
111
+ </plist>
112
+
113
+ $ launchctl load ~/Library/LaunchAgents/environment.plist
114
+ $ launchctl getenv ES_INDEX //try to get value.
115
+
116
+ Then start your applications.
117
+ ```
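The README's replace-mode example notes that the plugin generates index names like `<index>_%Y%m%d-%H%M%S`. A minimal Java sketch of that naming scheme is shown here. The class name `ReplaceModeIndexName`, the method `generate`, and the choice of UTC are assumptions made for illustration; the plugin source in this diff does not contain this code.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration; not part of the plugin source in this diff.
final class ReplaceModeIndexName
{
    private ReplaceModeIndexName() { }

    // Builds "<alias>_yyyyMMdd-HHmmss", the Java counterpart of the
    // strftime-style pattern <index>_%Y%m%d-%H%M%S quoted in the README.
    static String generate(String alias, Date now)
    {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd-HHmmss");
        // Assumption: UTC. The README does not say which time zone is used.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return alias + "_" + format.format(now);
    }

    public static void main(String[] args)
    {
        System.out.println(generate("embulk", new Date()));
    }
}
```

Because the alias name is the prefix of every generated index name, an existing index that has exactly the alias name would collide with the alias itself, which is consistent with the README's warning about that case.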
data/build.gradle ADDED
@@ -0,0 +1,70 @@
1
+ plugins {
2
+ id "com.jfrog.bintray" version "1.1"
3
+ id "com.github.jruby-gradle.base" version "0.1.5"
4
+ id "java"
5
+ }
6
+ import com.github.jrubygradle.JRubyExec
7
+ repositories {
8
+ mavenCentral()
9
+ jcenter()
10
+ mavenLocal()
11
+ }
12
+ configurations {
13
+ provided
14
+ }
15
+
16
+ version = "0.1.8"
17
+
18
+ compileJava.options.encoding = 'UTF-8' // source encoding
19
+ sourceCompatibility = 1.7
20
+ targetCompatibility = 1.7
21
+
22
+ dependencies {
23
+ compile "org.embulk:embulk-core:0.8.9"
24
+ provided "org.embulk:embulk-core:0.8.9"
25
+ compile 'org.elasticsearch:elasticsearch:1.7.2'
26
+ testCompile "junit:junit:4.+"
27
+ testCompile "org.mockito:mockito-core:1.+"
28
+ }
29
+
30
+ task classpath(type: Copy, dependsOn: ["jar"]) {
31
+ doFirst { file("classpath").deleteDir() }
32
+ from (configurations.runtime - configurations.provided + files(jar.archivePath))
33
+ into "classpath"
34
+ }
35
+ clean { delete 'classpath' }
36
+
37
+ //task copyDependencies(type:Copy) {
38
+ // new File("$buildDir/libs/dependencies").mkdirs()
39
+ // into "$buildDir/libs/dependencies" from configurations.runtime
40
+ //}
41
+
42
+ task gem(type: JRubyExec, dependsOn: ["build", "gemspec", "classpath"]) {
43
+ jrubyArgs "-rrubygems/gem_runner", "-eGem::GemRunner.new.run(ARGV)", "build"
44
+ script "build/gemspec"
45
+ doLast { ant.move(file: "${project.name}-${project.version}.gem", todir: "pkg") }
46
+ }
47
+
48
+ task gemspec << { file("build/gemspec").write($/
49
+ Gem::Specification.new do |spec|
50
+ spec.name = "${project.name}"
51
+ spec.version = "${project.version}"
52
+ spec.authors = ["Muga Nishizawa", "Shinji Ikeda"]
53
+ spec.summary = %[Elasticsearch 1.x output plugin for Embulk]
54
+ spec.description = %[Elasticsearch 1.x output plugin is an Embulk plugin that loads records to Elasticsearch read by any input plugins. Search the input plugins by "embulk-input" keyword.]
55
+ spec.email = ["muga.nishizawa@gmail.com", "gm.ikeda@gmail.com"]
56
+ spec.licenses = ["Apache 2.0"]
57
+ spec.homepage = "https://github.com/shinjiikeda/embulk-output-elasticsearch"
58
+
59
+ spec.files = `git ls-files`.split("\n") + Dir["classpath/*.jar"]
60
+ spec.test_files = spec.files.grep(%r"^(test|spec)/")
61
+ spec.require_paths = ["lib"]
62
+ spec.executables = spec.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
63
+ spec.has_rdoc = false
64
+
65
+ spec.add_development_dependency "bundler", [">= 1.0"]
66
+ spec.add_development_dependency "rake", [">= 10.0"]
67
+ spec.add_development_dependency "test-unit", ["~> 3.0.2"]
68
+ end
69
+ /$)
70
+ }
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
@@ -0,0 +1,6 @@
1
+ #Tue Aug 11 00:26:20 PDT 2015
2
+ distributionBase=GRADLE_USER_HOME
3
+ distributionPath=wrapper/dists
4
+ zipStoreBase=GRADLE_USER_HOME
5
+ zipStorePath=wrapper/dists
6
+ distributionUrl=https\://services.gradle.org/distributions/gradle-2.6-bin.zip
data/gradlew ADDED
@@ -0,0 +1,164 @@
1
+ #!/usr/bin/env bash
2
+
3
+ ##############################################################################
4
+ ##
5
+ ## Gradle start up script for UN*X
6
+ ##
7
+ ##############################################################################
8
+
9
+ # Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
10
+ DEFAULT_JVM_OPTS=""
11
+
12
+ APP_NAME="Gradle"
13
+ APP_BASE_NAME=`basename "$0"`
14
+
15
+ # Use the maximum available, or set MAX_FD != -1 to use that value.
16
+ MAX_FD="maximum"
17
+
18
+ warn ( ) {
19
+ echo "$*"
20
+ }
21
+
22
+ die ( ) {
23
+ echo
24
+ echo "$*"
25
+ echo
26
+ exit 1
27
+ }
28
+
29
+ # OS specific support (must be 'true' or 'false').
30
+ cygwin=false
31
+ msys=false
32
+ darwin=false
33
+ case "`uname`" in
34
+ CYGWIN* )
35
+ cygwin=true
36
+ ;;
37
+ Darwin* )
38
+ darwin=true
39
+ ;;
40
+ MINGW* )
41
+ msys=true
42
+ ;;
43
+ esac
44
+
45
+ # For Cygwin, ensure paths are in UNIX format before anything is touched.
46
+ if $cygwin ; then
47
+ [ -n "$JAVA_HOME" ] && JAVA_HOME=`cygpath --unix "$JAVA_HOME"`
48
+ fi
49
+
50
+ # Attempt to set APP_HOME
51
+ # Resolve links: $0 may be a link
52
+ PRG="$0"
53
+ # Need this for relative symlinks.
54
+ while [ -h "$PRG" ] ; do
55
+ ls=`ls -ld "$PRG"`
56
+ link=`expr "$ls" : '.*-> \(.*\)$'`
57
+ if expr "$link" : '/.*' > /dev/null; then
58
+ PRG="$link"
59
+ else
60
+ PRG=`dirname "$PRG"`"/$link"
61
+ fi
62
+ done
63
+ SAVED="`pwd`"
64
+ cd "`dirname \"$PRG\"`/" >&-
65
+ APP_HOME="`pwd -P`"
66
+ cd "$SAVED" >&-
67
+
68
+ CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar
69
+
70
+ # Determine the Java command to use to start the JVM.
71
+ if [ -n "$JAVA_HOME" ] ; then
72
+ if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
73
+ # IBM's JDK on AIX uses strange locations for the executables
74
+ JAVACMD="$JAVA_HOME/jre/sh/java"
75
+ else
76
+ JAVACMD="$JAVA_HOME/bin/java"
77
+ fi
78
+ if [ ! -x "$JAVACMD" ] ; then
79
+ die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME
80
+
81
+ Please set the JAVA_HOME variable in your environment to match the
82
+ location of your Java installation."
83
+ fi
84
+ else
85
+ JAVACMD="java"
86
+ which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
87
+
88
+ Please set the JAVA_HOME variable in your environment to match the
89
+ location of your Java installation."
90
+ fi
91
+
92
+ # Increase the maximum file descriptors if we can.
93
+ if [ "$cygwin" = "false" -a "$darwin" = "false" ] ; then
94
+ MAX_FD_LIMIT=`ulimit -H -n`
95
+ if [ $? -eq 0 ] ; then
96
+ if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then
97
+ MAX_FD="$MAX_FD_LIMIT"
98
+ fi
99
+ ulimit -n $MAX_FD
100
+ if [ $? -ne 0 ] ; then
101
+ warn "Could not set maximum file descriptor limit: $MAX_FD"
102
+ fi
103
+ else
104
+ warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT"
105
+ fi
106
+ fi
107
+
108
+ # For Darwin, add options to specify how the application appears in the dock
109
+ if $darwin; then
110
+ GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\""
111
+ fi
112
+
113
+ # For Cygwin, switch paths to Windows format before running java
114
+ if $cygwin ; then
115
+ APP_HOME=`cygpath --path --mixed "$APP_HOME"`
116
+ CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`
117
+
118
+ # We build the pattern for arguments to be converted via cygpath
119
+ ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`
120
+ SEP=""
121
+ for dir in $ROOTDIRSRAW ; do
122
+ ROOTDIRS="$ROOTDIRS$SEP$dir"
123
+ SEP="|"
124
+ done
125
+ OURCYGPATTERN="(^($ROOTDIRS))"
126
+ # Add a user-defined pattern to the cygpath arguments
127
+ if [ "$GRADLE_CYGPATTERN" != "" ] ; then
128
+ OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)"
129
+ fi
130
+ # Now convert the arguments - kludge to limit ourselves to /bin/sh
131
+ i=0
132
+ for arg in "$@" ; do
133
+ CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -`
134
+ CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option
135
+
136
+ if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition
137
+ eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"`
138
+ else
139
+ eval `echo args$i`="\"$arg\""
140
+ fi
141
+ i=$((i+1))
142
+ done
143
+ case $i in
144
+ (0) set -- ;;
145
+ (1) set -- "$args0" ;;
146
+ (2) set -- "$args0" "$args1" ;;
147
+ (3) set -- "$args0" "$args1" "$args2" ;;
148
+ (4) set -- "$args0" "$args1" "$args2" "$args3" ;;
149
+ (5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;;
150
+ (6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;;
151
+ (7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;;
152
+ (8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;;
153
+ (9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;;
154
+ esac
155
+ fi
156
+
157
+ # Split up the JVM_OPTS And GRADLE_OPTS values into an array, following the shell quoting and substitution rules
158
+ function splitJvmOpts() {
159
+ JVM_OPTS=("$@")
160
+ }
161
+ eval splitJvmOpts $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS
162
+ JVM_OPTS[${#JVM_OPTS[*]}]="-Dorg.gradle.appname=$APP_BASE_NAME"
163
+
164
+ exec "$JAVACMD" "${JVM_OPTS[@]}" -classpath "$CLASSPATH" org.gradle.wrapper.GradleWrapperMain "$@"
data/gradlew.bat ADDED
@@ -0,0 +1,90 @@
1
+ @if "%DEBUG%" == "" @echo off
2
+ @rem ##########################################################################
3
+ @rem
4
+ @rem Gradle startup script for Windows
5
+ @rem
6
+ @rem ##########################################################################
7
+
8
+ @rem Set local scope for the variables with windows NT shell
9
+ if "%OS%"=="Windows_NT" setlocal
10
+
11
+ @rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
12
+ set DEFAULT_JVM_OPTS=
13
+
14
+ set DIRNAME=%~dp0
15
+ if "%DIRNAME%" == "" set DIRNAME=.
16
+ set APP_BASE_NAME=%~n0
17
+ set APP_HOME=%DIRNAME%
18
+
19
+ @rem Find java.exe
20
+ if defined JAVA_HOME goto findJavaFromJavaHome
21
+
22
+ set JAVA_EXE=java.exe
23
+ %JAVA_EXE% -version >NUL 2>&1
24
+ if "%ERRORLEVEL%" == "0" goto init
25
+
26
+ echo.
27
+ echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
28
+ echo.
29
+ echo Please set the JAVA_HOME variable in your environment to match the
30
+ echo location of your Java installation.
31
+
32
+ goto fail
33
+
34
+ :findJavaFromJavaHome
35
+ set JAVA_HOME=%JAVA_HOME:"=%
36
+ set JAVA_EXE=%JAVA_HOME%/bin/java.exe
37
+
38
+ if exist "%JAVA_EXE%" goto init
39
+
40
+ echo.
41
+ echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
42
+ echo.
43
+ echo Please set the JAVA_HOME variable in your environment to match the
44
+ echo location of your Java installation.
45
+
46
+ goto fail
47
+
48
+ :init
49
+ @rem Get command-line arguments, handling Windowz variants
50
+
51
+ if not "%OS%" == "Windows_NT" goto win9xME_args
52
+ if "%@eval[2+2]" == "4" goto 4NT_args
53
+
54
+ :win9xME_args
55
+ @rem Slurp the command line arguments.
56
+ set CMD_LINE_ARGS=
57
+ set _SKIP=2
58
+
59
+ :win9xME_args_slurp
60
+ if "x%~1" == "x" goto execute
61
+
62
+ set CMD_LINE_ARGS=%*
63
+ goto execute
64
+
65
+ :4NT_args
66
+ @rem Get arguments from the 4NT Shell from JP Software
67
+ set CMD_LINE_ARGS=%$
68
+
69
+ :execute
70
+ @rem Setup the command line
71
+
72
+ set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar
73
+
74
+ @rem Execute Gradle
75
+ "%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS%
76
+
77
+ :end
78
+ @rem End local scope for the variables with windows NT shell
79
+ if "%ERRORLEVEL%"=="0" goto mainEnd
80
+
81
+ :fail
82
+ rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
83
+ rem the _cmd.exe /c_ return code!
84
+ if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
85
+ exit /b 1
86
+
87
+ :mainEnd
88
+ if "%OS%"=="Windows_NT" endlocal
89
+
90
+ :omega
@@ -0,0 +1,3 @@
1
+ Embulk::JavaPlugin.register_output(
2
+ :elasticsearch, "org.embulk.output.elasticsearch.ElasticsearchOutputPlugin",
3
+ File.expand_path('../../../../classpath', __FILE__))
data/settings.gradle ADDED
@@ -0,0 +1 @@
1
+ rootProject.name = 'embulk-output-elasticsearch_1.x'
@@ -0,0 +1,431 @@
1
+ package org.embulk.output.elasticsearch;
2
+
3
+ import com.google.common.base.Optional;
4
+ import com.google.common.base.Throwables;
5
+ import com.google.common.collect.ImmutableList;
6
+ import com.google.inject.Inject;
7
+ import org.elasticsearch.action.bulk.BulkItemResponse;
8
+ import org.elasticsearch.action.bulk.BulkProcessor;
9
+ import org.elasticsearch.action.bulk.BulkRequest;
10
+ import org.elasticsearch.action.bulk.BulkResponse;
11
+ import org.elasticsearch.action.index.IndexRequest;
12
+ import org.elasticsearch.client.Client;
13
+ import org.elasticsearch.client.Requests;
14
+ import org.elasticsearch.client.transport.TransportClient;
15
+ import org.elasticsearch.common.unit.ByteSizeValue;
16
+ import org.elasticsearch.common.unit.ByteSizeUnit;
17
+ import org.elasticsearch.common.settings.ImmutableSettings;
18
+ import org.elasticsearch.common.settings.Settings;
19
+ import org.elasticsearch.common.transport.InetSocketTransportAddress;
20
+ import org.elasticsearch.common.xcontent.XContentBuilder;
21
+ import org.elasticsearch.common.xcontent.XContentFactory;
22
+ import org.elasticsearch.node.Node;
23
+ import org.elasticsearch.node.NodeBuilder;
24
+ import org.embulk.config.TaskReport;
25
+ import org.embulk.config.Config;
26
+ import org.embulk.config.ConfigDefault;
27
+ import org.embulk.config.ConfigDiff;
28
+ import org.embulk.config.ConfigSource;
29
+ import org.embulk.config.Task;
30
+ import org.embulk.config.TaskSource;
31
+ import org.embulk.spi.Column;
32
+ import org.embulk.spi.Exec;
33
+ import org.embulk.spi.OutputPlugin;
34
+ import org.embulk.spi.Page;
35
+ import org.embulk.spi.PageReader;
36
+ import org.embulk.spi.Schema;
37
+ import org.embulk.spi.ColumnVisitor;
38
+ import org.embulk.spi.TransactionalPageOutput;
39
+ import org.embulk.spi.type.Types;
40
+ import org.slf4j.Logger;
41
+
42
+ import java.io.IOException;
43
+ import java.util.Date;
44
+ import java.util.List;
45
+ import java.util.concurrent.TimeUnit;
46
+
47
+ import static com.google.common.base.Preconditions.checkState;
48
+
49
+ public class ElasticsearchOutputPlugin
50
+ implements OutputPlugin
51
+ {
52
+ public interface NodeAddressTask
53
+ extends Task
54
+ {
55
+ @Config("host")
56
+ public String getHost();
57
+
58
+ @Config("port")
59
+ @ConfigDefault("9300")
60
+ public int getPort();
61
+ }
62
+
63
+ public interface PluginTask
64
+ extends Task
65
+ {
66
+ @Config("nodes")
67
+ public List<NodeAddressTask> getNodes();
68
+
69
+ @Config("cluster_name")
70
+ @ConfigDefault("\"elasticsearch\"")
71
+ public String getClusterName();
72
+
73
+ @Config("index")
74
+ public String getIndex();
75
+
76
+ @Config("index_type")
77
+ public String getType();
78
+
79
+ @Config("id")
80
+ @ConfigDefault("null")
81
+ public Optional<String> getId();
82
+
83
+ @Config("bulk_actions")
84
+ @ConfigDefault("1000")
85
+ public int getBulkActions();
86
+
87
+ @Config("bulk_size")
88
+ @ConfigDefault("5242880")
89
+ public long getBulkSize();
90
+
91
+ @Config("concurrent_requests")
92
+ @ConfigDefault("5")
93
+ public int getConcurrentRequests();
94
+ }
95
+
96
+ private final Logger log;
97
+
98
+ @Inject
99
+ public ElasticsearchOutputPlugin()
100
+ {
101
+ log = Exec.getLogger(getClass());
102
+ }
103
+
104
+ @Override
105
+ public ConfigDiff transaction(ConfigSource config, Schema schema,
106
+ int processorCount, Control control)
107
+ {
108
+ final PluginTask task = config.loadConfig(PluginTask.class);
109
+
110
+ // confirm that a client can be initialized
111
+ try (Client client = createClient(task)) {
112
+ }
113
+
114
+ try {
115
+ control.run(task.dump());
116
+ } catch (Exception e) {
117
+ throw Throwables.propagate(e);
118
+ }
119
+
120
+ ConfigDiff nextConfig = Exec.newConfigDiff();
121
+ return nextConfig;
122
+ }
123
+
124
+ @Override
125
+ public ConfigDiff resume(TaskSource taskSource,
126
+ Schema schema, int processorCount,
127
+ OutputPlugin.Control control)
128
+ {
129
+ // TODO
130
+ return Exec.newConfigDiff();
131
+ }
132
+
133
+ @Override
134
+ public void cleanup(TaskSource taskSource,
135
+ Schema schema, int processorCount,
136
+ List<TaskReport> successTaskReports)
137
+ { }
138
+
139
+ private Client createClient(final PluginTask task)
140
+ {
141
+ // @see http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/client.html
142
+ Settings settings = ImmutableSettings.settingsBuilder()
143
+ .classLoader(Settings.class.getClassLoader())
144
+ .put("cluster.name", task.getClusterName())
145
+ .build();
146
+ TransportClient client = new TransportClient(settings);
147
+ List<NodeAddressTask> nodes = task.getNodes();
148
+ for (NodeAddressTask node : nodes) {
149
+ client.addTransportAddress(new InetSocketTransportAddress(node.getHost(), node.getPort()));
150
+ }
151
+ return client;
152
+ }
153
+
154
+ private BulkProcessor newBulkProcessor(final PluginTask task, final Client client)
155
+ {
156
+ return BulkProcessor.builder(client, new BulkProcessor.Listener() {
157
+ @Override
158
+ public void beforeBulk(long executionId, BulkRequest request)
159
+ {
160
+ log.info("Execute {} bulk actions", request.numberOfActions());
161
+ }
162
+
163
+ @Override
164
+ public void afterBulk(long executionId, BulkRequest request, BulkResponse response)
165
+ {
166
+ if (response.hasFailures()) {
167
+ long items = 0;
168
+ if (log.isDebugEnabled()) {
169
+ for (BulkItemResponse item : response.getItems()) {
170
+ if (item.isFailed()) {
171
+ items += 1;
172
+ log.debug(" Error for {}/{}/{} for {} operation: {}",
173
+ item.getIndex(), item.getType(), item.getId(),
174
+ item.getOpType(), item.getFailureMessage());
175
+ }
176
+ }
177
+ }
178
+ log.warn("{} bulk actions failed: {}", items, response.buildFailureMessage());
179
+ } else {
180
+ log.info("{} bulk actions succeeded", request.numberOfActions());
181
+ }
182
+ }
183
+
184
+ @Override
185
+ public void afterBulk(long executionId, BulkRequest request, Throwable failure)
186
+ {
187
+ log.warn("Got the error during bulk processing", failure);
188
+ }
189
+ }).setBulkActions(task.getBulkActions())
190
+ .setBulkSize(new ByteSizeValue(task.getBulkSize()))
191
+ .setConcurrentRequests(task.getConcurrentRequests())
192
+ .build();
193
+ }
194
+
195
+ @Override
196
+ public TransactionalPageOutput open(TaskSource taskSource, Schema schema,
197
+ int processorIndex)
198
+ {
199
+ final PluginTask task = taskSource.loadTask(PluginTask.class);
200
+
201
+ Client client = createClient(task);
202
+ BulkProcessor bulkProcessor = newBulkProcessor(task, client);
203
+ ElasticsearchPageOutput pageOutput = new ElasticsearchPageOutput(task, client, bulkProcessor);
204
+ pageOutput.open(schema);
205
+ return pageOutput;
206
+ }
207
+
208
+ public static class ElasticsearchPageOutput implements TransactionalPageOutput
209
+ {
210
+ private Logger log;
211
+
212
+ private Client client;
213
+ private BulkProcessor bulkProcessor;
214
+
215
+ private PageReader pageReader;
216
+ private Column idColumn;
217
+
218
+ private final String index;
219
+ private final String type;
220
+ private final String id;
221
+
222
+ public ElasticsearchPageOutput(PluginTask task, Client client, BulkProcessor bulkProcessor)
223
+ {
224
+ this.log = Exec.getLogger(getClass());
225
+
226
+ this.client = client;
227
+ this.bulkProcessor = bulkProcessor;
228
+
229
+ this.index = task.getIndex();
230
+ this.type = task.getType();
231
+ this.id = task.getId().orNull();
232
+ }
233
+
234
+ void open(final Schema schema)
235
+ {
236
+ pageReader = new PageReader(schema);
237
+ idColumn = (id == null) ? null : schema.lookupColumn(id);
238
+ }
239
+
240
+ @Override
241
+ public void add(Page page)
242
+ {
243
+ pageReader.setPage(page);
244
+
245
+ while (pageReader.nextRecord()) {
246
+ try {
247
+ final XContentBuilder contextBuilder = XContentFactory.jsonBuilder().startObject(); // TODO reusable??
248
+ pageReader.getSchema().visitColumns(new ColumnVisitor() {
249
+ @Override
250
+ public void booleanColumn(Column column) {
251
+ try {
252
+ if (pageReader.isNull(column)) {
253
+ contextBuilder.nullField(column.getName());
254
+ } else {
255
+ contextBuilder.field(column.getName(), pageReader.getBoolean(column));
256
+ }
257
+ } catch (IOException e) {
258
+ try {
259
+ contextBuilder.nullField(column.getName());
260
+ } catch (IOException ex) {
261
+ throw Throwables.propagate(ex);
262
+ }
263
+ }
264
+ }
265
+
266
+ @Override
267
+ public void longColumn(Column column) {
268
+ try {
269
+ if (pageReader.isNull(column)) {
270
+ contextBuilder.nullField(column.getName());
271
+ } else {
272
+ contextBuilder.field(column.getName(), pageReader.getLong(column));
273
+ }
274
+ } catch (IOException e) {
275
+ try {
276
+ contextBuilder.nullField(column.getName());
277
+ } catch (IOException ex) {
278
+ throw Throwables.propagate(ex);
279
+ }
280
+ }
281
+ }
282
+
283
+ @Override
284
+ public void doubleColumn(Column column) {
285
+ try {
286
+ if (pageReader.isNull(column)) {
287
+ contextBuilder.nullField(column.getName());
288
+ } else {
289
+ contextBuilder.field(column.getName(), pageReader.getDouble(column));
290
+ }
291
+ } catch (IOException e) {
292
+ try {
293
+ contextBuilder.nullField(column.getName());
294
+ } catch (IOException ex) {
295
+ throw Throwables.propagate(ex);
296
+ }
297
+ }
298
+ }
299
+
300
+ @Override
301
+ public void stringColumn(Column column) {
302
+ try {
303
+ if (pageReader.isNull(column)) {
304
+ contextBuilder.nullField(column.getName());
305
+ } else {
306
+ contextBuilder.field(column.getName(), pageReader.getString(column));
307
+ }
308
+ } catch (IOException e) {
309
+ try {
310
+ contextBuilder.nullField(column.getName());
311
+ } catch (IOException ex) {
312
+ throw Throwables.propagate(ex);
313
+ }
314
+ }
315
+ }
316
+
317
+ @Override
318
+ public void jsonColumn(Column column) {
319
+ try {
320
+ if (pageReader.isNull(column)) {
321
+ contextBuilder.nullField(column.getName());
322
+ } else {
323
+ contextBuilder.field(column.getName(), pageReader.getJson(column).toJson());
324
+ }
325
+ } catch (IOException e) {
326
+ try {
327
+ contextBuilder.nullField(column.getName());
328
+ } catch (IOException ex) {
329
+ throw Throwables.propagate(ex);
330
+ }
331
+ }
332
+ }
333
+
334
+ @Override
335
+ public void timestampColumn(Column column) {
336
+ try {
337
+ if (pageReader.isNull(column)) {
338
+ contextBuilder.nullField(column.getName());
339
+ } else {
340
+ contextBuilder.field(column.getName(), new Date(pageReader.getTimestamp(column).toEpochMilli()));
341
+ }
342
+ } catch (IOException e) {
343
+ try {
344
+ contextBuilder.nullField(column.getName());
345
+ } catch (IOException ex) {
346
+ throw Throwables.propagate(ex);
347
+ }
348
+ }
349
+ }
350
+ });
351
+
352
+ contextBuilder.endObject();
353
+ bulkProcessor.add(newIndexRequest(getIdValue(idColumn)).source(contextBuilder));
354
+
355
+ } catch (IOException e) {
356
+ throw Throwables.propagate(e); // TODO error handling
357
+ }
358
+ }
359
+ }
360
+
361
+ private String getIdValue(Column inputColumn) {
362
+ if (inputColumn == null) return null;
363
+ if (pageReader.isNull(inputColumn)) return null;
364
+ String idValue = null;
365
+ if (Types.STRING.equals(inputColumn.getType())) {
366
+ idValue = pageReader.getString(inputColumn);
367
+ } else if (Types.BOOLEAN.equals(inputColumn.getType())) {
368
+ idValue = pageReader.getBoolean(inputColumn) + "";
369
+ } else if (Types.DOUBLE.equals(inputColumn.getType())) {
370
+ idValue = pageReader.getDouble(inputColumn) + "";
371
+ } else if (Types.LONG.equals(inputColumn.getType())) {
372
+ idValue = pageReader.getLong(inputColumn) + "";
373
+ } else if (Types.TIMESTAMP.equals(inputColumn.getType())) {
374
+ idValue = pageReader.getTimestamp(inputColumn).toString();
375
+ } else {
376
+ idValue = null;
377
+ }
378
+ return idValue;
379
+ }
380
+
381
+ private IndexRequest newIndexRequest(String idValue)
382
+ {
383
+ return Requests.indexRequest(index).type(type).id(idValue);
384
+ }
385
+
386
+ @Override
387
+ public void finish()
388
+ {
389
+ try {
390
+ bulkProcessor.flush();
391
+ } finally {
392
+ close();
393
+ }
394
+ }
395
+
396
+ @Override
397
+ public void close()
398
+ {
399
+ if (bulkProcessor != null) {
400
+ try {
401
+ while (!bulkProcessor.awaitClose(3, TimeUnit.SECONDS)) {
402
+ log.debug("wait for closing the bulk processing..");
403
+ }
404
+ } catch (InterruptedException e) {
405
+ Thread.currentThread().interrupt();
406
+ }
407
+ bulkProcessor = null;
408
+ }
409
+
410
+ if (client != null) {
411
+ client.close(); // ElasticsearchException
412
+ client = null;
413
+ }
414
+ }
415
+
416
+ @Override
417
+ public void abort()
418
+ {
419
+ // TODO do nothing
420
+ }
421
+
422
+ @Override
423
+ public TaskReport commit()
424
+ {
425
+ TaskReport report = Exec.newTaskReport();
426
+ // TODO
427
+ return report;
428
+ }
429
+
430
+ }
431
+ }
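The `bulk_actions` and `bulk_size` options are passed to `BulkProcessor` through `setBulkActions` and `setBulkSize`, and the README describes both as thresholds for flushing a bulk request. The sketch below illustrates that idea with the standard library only. `ThresholdBuffer` and its methods are hypothetical names, and this is not Elasticsearch's `BulkProcessor` implementation; it flushes as soon as either limit is reached, which is an assumption about the intended semantics.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; this is not Elasticsearch's BulkProcessor.
final class ThresholdBuffer
{
    private final int maxActions;   // plays the role of bulk_actions
    private final long maxBytes;    // plays the role of bulk_size
    private final List<String> pending = new ArrayList<String>();
    private long pendingBytes = 0;
    private int flushCount = 0;

    ThresholdBuffer(int maxActions, long maxBytes)
    {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
    }

    // Queues one document and flushes as soon as either threshold is reached.
    void add(String json)
    {
        pending.add(json);
        pendingBytes += json.getBytes(StandardCharsets.UTF_8).length;
        if (pending.size() >= maxActions || pendingBytes >= maxBytes) {
            flush();
        }
    }

    // Empties the queue; "sending" is reduced to counting the flush.
    void flush()
    {
        if (pending.isEmpty()) {
            return;
        }
        flushCount++;
        pending.clear();
        pendingBytes = 0;
    }

    int getFlushCount() { return flushCount; }

    int getPendingCount() { return pending.size(); }
}
```

With the plugin defaults (1000 actions, 5242880 bytes), many small records would typically hit the action limit first, while a few large documents would hit the size limit first. The explicit `flush()` mirrors what `finish()` does in the plugin for records still queued at the end of a task.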
@@ -0,0 +1,5 @@
1
+ package org.embulk.output.elasticsearch;
2
+
3
+ public class TestElasticsearchOutputPlugin
4
+ {
5
+ }
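The test class in this version is empty, while the README says the tests read `ES_HOST`, `ES_PORT` (optional, default 9300), `ES_INDEX`, and `ES_INDEX_TYPE` and are mostly skipped when these are unset. The following is a sketch of how such settings could be read, assuming a hypothetical `TestEnv` helper that takes the environment as a map so it can be exercised without a running cluster. It is not code from the plugin.

```java
import java.util.Map;

// Hypothetical helper for illustration; the test class in this diff is empty.
final class TestEnv
{
    final String host;
    final int port;
    final String index;
    final String indexType;

    private TestEnv(String host, int port, String index, String indexType)
    {
        this.host = host;
        this.port = port;
        this.index = index;
        this.indexType = indexType;
    }

    // Returns null when a required variable is missing, so callers can skip their tests.
    static TestEnv from(Map<String, String> env)
    {
        String host = env.get("ES_HOST");
        String index = env.get("ES_INDEX");
        String indexType = env.get("ES_INDEX_TYPE");
        if (host == null || index == null || indexType == null) {
            return null;
        }
        // ES_PORT is optional; 9300 is the default stated in the README.
        String port = env.get("ES_PORT");
        return new TestEnv(host, port == null ? 9300 : Integer.parseInt(port), index, indexType);
    }

    public static void main(String[] args)
    {
        TestEnv env = from(System.getenv());
        System.out.println(env == null
                ? "skipping Elasticsearch tests"
                : "using " + env.host + ":" + env.port);
    }
}
```

A JUnit test could call `TestEnv.from(System.getenv())` in its setup and skip when the result is `null`; how the actual project does this is not shown in this diff.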
@@ -0,0 +1,5 @@
1
+ id,account,time,purchase,flg,score,comment
2
+ 1,32864,2015-01-27 19:23:49,20150127,1,123.45,embulk
3
+ 2,14824,2015-01-27 19:01:23,20150127,0,234.56,embulk
4
+ 3,27559,2015-01-28 02:20:02,20150128,1,678.90,embulk
5
+ 4,11270,2015-01-29 11:54:36,20150129,0,100.00,embulk
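A CSV fixture like this one is only usable if every row has as many fields as the header. The sketch below shows one way to check that. `CsvFieldCount` is a hypothetical helper that splits on plain commas and does not handle quoted fields, which is sufficient for a simple file like the sample above but is not a general CSV parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the plugin or its tests.
final class CsvFieldCount
{
    private CsvFieldCount() { }

    // Returns the 1-based line numbers whose field count differs from the header's.
    // Splits on plain commas only; quoted fields are out of scope for this sketch.
    static List<Integer> mismatchedLines(List<String> lines)
    {
        List<Integer> bad = new ArrayList<Integer>();
        if (lines.isEmpty()) {
            return bad;
        }
        // A limit of -1 keeps trailing empty fields, so "a,b," counts as three fields.
        int expected = lines.get(0).split(",", -1).length;
        for (int i = 1; i < lines.size(); i++) {
            if (lines.get(i).split(",", -1).length != expected) {
                bad.add(i + 1);
            }
        }
        return bad;
    }
}
```

Passing the file's lines (header first) returns an empty list when all rows are consistent, and otherwise the line numbers worth inspecting by hand.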
metadata ADDED
@@ -0,0 +1,116 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: embulk-output-elasticsearch_1.x
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.8
5
+ platform: ruby
6
+ authors:
7
+ - Muga Nishizawa
8
+ - Shinji Ikeda
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2016-06-24 00:00:00.000000000 Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '>='
18
+ - !ruby/object:Gem::Version
19
+ version: '1.0'
20
+ name: bundler
21
+ prerelease: false
22
+ type: :development
23
+ version_requirements: !ruby/object:Gem::Requirement
24
+ requirements:
25
+ - - '>='
26
+ - !ruby/object:Gem::Version
27
+ version: '1.0'
28
+ - !ruby/object:Gem::Dependency
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - '>='
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ name: rake
35
+ prerelease: false
36
+ type: :development
37
+ version_requirements: !ruby/object:Gem::Requirement
38
+ requirements:
39
+ - - '>='
40
+ - !ruby/object:Gem::Version
41
+ version: '10.0'
42
+ - !ruby/object:Gem::Dependency
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ~>
46
+ - !ruby/object:Gem::Version
47
+ version: 3.0.2
48
+ name: test-unit
49
+ prerelease: false
50
+ type: :development
51
+ version_requirements: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - ~>
54
+ - !ruby/object:Gem::Version
55
+ version: 3.0.2
56
+ description: Elasticsearch 1.x output plugin is an Embulk plugin that loads records to Elasticsearch read by any input plugins. Search the input plugins by "embulk-input" keyword.
57
+ email:
58
+ - muga.nishizawa@gmail.com
59
+ - gm.ikeda@gmail.com
60
+ executables: []
61
+ extensions: []
62
+ extra_rdoc_files: []
63
+ files:
64
+ - .gitignore
65
+ - CHANGELOG.md
66
+ - README.md
67
+ - build.gradle
68
+ - gradle/wrapper/gradle-wrapper.jar
69
+ - gradle/wrapper/gradle-wrapper.properties
70
+ - gradlew
71
+ - gradlew.bat
72
+ - lib/embulk/output/elasticsearch.rb
73
+ - settings.gradle
74
+ - src/main/java/org/embulk/output/elasticsearch/ElasticsearchOutputPlugin.java
75
+ - src/test/java/org/embulk/output/elasticsearch/TestElasticsearchOutputPlugin.java
76
+ - src/test/resources/sample_01.csv
77
+ - classpath/elasticsearch-1.7.2.jar
78
+ - classpath/embulk-output-elasticsearch_1.x-0.1.8.jar
79
+ - classpath/lucene-analyzers-common-4.10.4.jar
80
+ - classpath/lucene-core-4.10.4.jar
81
+ - classpath/lucene-grouping-4.10.4.jar
82
+ - classpath/lucene-highlighter-4.10.4.jar
83
+ - classpath/lucene-join-4.10.4.jar
84
+ - classpath/lucene-memory-4.10.4.jar
85
+ - classpath/lucene-misc-4.10.4.jar
86
+ - classpath/lucene-queries-4.10.4.jar
87
+ - classpath/lucene-queryparser-4.10.4.jar
88
+ - classpath/lucene-sandbox-4.10.4.jar
89
+ - classpath/lucene-spatial-4.10.4.jar
90
+ - classpath/lucene-suggest-4.10.4.jar
91
+ - classpath/spatial4j-0.4.1.jar
92
+ homepage: https://github.com/shinjiikeda/embulk-output-elasticsearch
93
+ licenses:
94
+ - Apache 2.0
95
+ metadata: {}
96
+ post_install_message:
97
+ rdoc_options: []
98
+ require_paths:
99
+ - lib
100
+ required_ruby_version: !ruby/object:Gem::Requirement
101
+ requirements:
102
+ - - '>='
103
+ - !ruby/object:Gem::Version
104
+ version: '0'
105
+ required_rubygems_version: !ruby/object:Gem::Requirement
106
+ requirements:
107
+ - - '>='
108
+ - !ruby/object:Gem::Version
109
+ version: '0'
110
+ requirements: []
111
+ rubyforge_project:
112
+ rubygems_version: 2.1.9
113
+ signing_key:
114
+ specification_version: 4
115
+ summary: Elasticsearch 1.x output plugin for Embulk
116
+ test_files: []