embulk-input-google_analytics 0.1.0 → 0.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: f461f4cc05a23ba6ed9b53e23c826fc69b7c82c4
4
- data.tar.gz: d73dea5ede7fb7ce0b348409808e2a7036392fa7
3
+ metadata.gz: 122431f951c569688cc1df231944216e766f9356
4
+ data.tar.gz: 687477c6f8e0c08853b07e681bd3f3e39e81f876
5
5
  SHA512:
6
- metadata.gz: 5d014c2caf49c2b73c325433cb55580581184634b1a4b7dbff9d7030278f8b569b2bcf5d934e0638f92b8114ac3a0f6e1dd8df7fff0747ff4d51ab855af534ec
7
- data.tar.gz: 3f36eed0bb7b8926d7aeecb7902eb0631e6a31f86e91e81f041b2e97a5894abb1a9bc47329b1acd9d96d80ed482a6915dfce4561fdb406ad37b2c6ed582d8d42
6
+ metadata.gz: 226f36d2128a30c5f30da1083c6f9bb63db86bb7cd08a9221c3f55f891fee816bc2e154fcf0e48eadbb9f71c959547ca6c644e3cbebe3fdce5b20fcb5234b9b7
7
+ data.tar.gz: 2c1f6b067797788227f46210ba99270d80f6720da43cb97349fa439e7463fadc6b150c05f665ca60e15388cdf57ac4b67d409a5d2da4dc8e9b73e1b55e9f0854
data/CHANGELOG.md CHANGED
@@ -1,3 +1,8 @@
1
+ ## 0.1.1 - 2016-07-13
2
+ * Enable scheduled execution [#4](https://github.com/treasure-data/embulk-input-google_analytics/pull/4)
3
+ * Error handling [#6](https://github.com/treasure-data/embulk-input-google_analytics/pull/6)
4
+ * Ignore too early accessing data due to it is not fixed value [#5](https://github.com/treasure-data/embulk-input-google_analytics/pull/5)
5
+
1
6
  ## 0.1.0 - 2016-07-07
2
7
 
3
8
  The first release!!
data/README.md CHANGED
@@ -15,8 +15,45 @@ Embulk input plugin for Google Analytics reports.
15
15
  - **time_series**: Only `ga:dateHour` or `ga:date` (string, required)
16
16
  - **dimensions**: Target dimensions (array, default: `[]` )
17
17
  - **metrics**: Target metrics (array, default: `[]` )
18
- - **start_date**: Target report start date (string, default: [7 days ago](https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#reportrequest))
19
- - **end_date**: Target report end date (string, default: [1 day ago](https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#reportrequest))
18
+ - **start_date**: Target report start date. Valid format is "YYYY-MM-DD". (string, default: [7 days ago](https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#reportrequest))
19
+ - **end_date**: Target report end date. Valid format is "YYYY-MM-DD". (string, default: [1 day ago](https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#reportrequest))
20
+ - **incremental**: `true` for generate "config_diff" with `embulk run -c config.diff` (bool, default: true)
21
+ - **last_record_time**: Ignore fetched records until this time. Mainly for incremental:true. (string, default: nil)
22
+ - **retry_limit**: Try to retry this times (integer, default: 5)
23
+ - **retry_initial_wait_sec**: Wait seconds for exponential backoff initial value (integer, default: 2)
24
+
25
+ ### About `json_key_content` option.
26
+
27
+ You need a service account on Google.
28
+
29
+ <ol>
30
+ <li>Open the <a href="https://console.developers.google.com/permissions/serviceaccounts"><b>Service accounts</b> page</a>. If prompted,
31
+ select a project.</li>
32
+ <li>Click <b>Create service account</b>.</li>
33
+ <li>
34
+
35
+ In the <b>Create service account</b> window, type a name for the service
36
+ account, and select <b>Furnish a new private key</b>. If you want to
37
+ <a href="https://developers.google.com/identity/protocols/OAuth2ServiceAccount#delegatingauthority">grant
38
+ Google Apps domain-wide authority</a> to the service account, also select
39
+ <b>Enable Google Apps Domain-wide Delegation</b>.
40
+
41
+ Then click <b>Create</b>.</li>
42
+ </ol>
43
+ From: <https://developers.google.com/identity/protocols/OAuth2ServiceAccount>
44
+
45
+ Screenshot: ![Service Account](./service_account.png)
46
+
47
+ ## Why the result doesn't match with web interface?
48
+
49
+ Google Reporting API uses "sampling" data.
50
+
51
+ - https://developers.google.com/analytics/devguides/reporting/core/v4/basics#sampling
52
+ - https://support.google.com/analytics/answer/2637192
53
+
54
+ That means sometimes result will be unmatched with Google Analytics web interface, and the result is based on sampled data, not all of raw data. This is a Google API's limitation.
55
+
56
+ Currently a sampling level supported by this plugin is DEFAULT only. Let us know if you want to use other sampling level (SMALL or LARGE).
20
57
 
21
58
  ## Example
22
59
 
data/Rakefile CHANGED
@@ -1,4 +1,5 @@
1
1
  require "bundler/gem_tasks"
2
+ require "gem_release_helper/tasks"
2
3
 
3
4
  task default: :test
4
5
 
@@ -13,3 +14,8 @@ task :cov do
13
14
  ruby("--debug", "test/run-test.rb", "--use-color=yes", "--collector=dir")
14
15
  end
15
16
 
17
+ GemReleaseHelper::Tasks.install({
18
+ gemspec: "./embulk-input-google_analytics.gemspec",
19
+ github_name: "treasure-data/embulk-input-google_analytics",
20
+ })
21
+
@@ -1,7 +1,7 @@
1
1
 
2
2
  Gem::Specification.new do |spec|
3
3
  spec.name = "embulk-input-google_analytics"
4
- spec.version = "0.1.0"
4
+ spec.version = "0.1.1"
5
5
  spec.authors = ["uu59"]
6
6
  spec.summary = "Google Analytics input plugin for Embulk"
7
7
  spec.description = "Loads records from Google Analytics."
@@ -16,7 +16,8 @@ Gem::Specification.new do |spec|
16
16
  spec.add_dependency "httpclient"
17
17
  spec.add_dependency "google-api-client", "~> 0.9"
18
18
  spec.add_dependency "signet"
19
- spec.add_dependency "activesupport" # for Time.zone.parse
19
+ spec.add_dependency "activesupport" # for Time.zone.parse, Time.zone.now
20
+ spec.add_dependency "perfect_retry", "~> 0.5"
20
21
 
21
22
  spec.add_development_dependency 'embulk', ['>= 0.8.9']
22
23
  spec.add_development_dependency 'bundler', ['>= 1.10.6']
@@ -26,4 +27,5 @@ Gem::Specification.new do |spec|
26
27
  spec.add_development_dependency 'simplecov'
27
28
  spec.add_development_dependency "codeclimate-test-reporter"
28
29
  spec.add_development_dependency "pry"
30
+ spec.add_development_dependency "gem_release_helper", "~> 1.0"
29
31
  end
@@ -1,3 +1,4 @@
1
+ require "perfect_retry"
1
2
  require "active_support/core_ext/time"
2
3
  require "google/apis/analyticsreporting_v4"
3
4
  require "google/apis/analytics_v3"
@@ -46,8 +47,9 @@ module Embulk
46
47
  dim = dimensions.zip(row[:dimensions]).to_h
47
48
  met = metrics.zip(row[:metrics].first[:values]).to_h
48
49
  format_row = dim.merge(met)
49
- time = format_row[task["time_series"]]
50
- format_row[task["time_series"]] = time_parse_with_profile_timezone(time)
50
+ raw_time = format_row[task["time_series"]]
51
+ next if too_early_data?(raw_time)
52
+ format_row[task["time_series"]] = time_parse_with_profile_timezone(raw_time)
51
53
  block.call format_row
52
54
  end
53
55
 
@@ -80,7 +82,9 @@ module Embulk
80
82
  service.authorization = auth
81
83
 
82
84
  Embulk.logger.debug "Fetching profile from API"
83
- service.list_profiles("~all", "~all")
85
+ retryer.with_retry do
86
+ service.list_profiles("~all", "~all")
87
+ end
84
88
  end
85
89
 
86
90
  def time_parse_with_profile_timezone(time_string)
@@ -93,11 +97,9 @@ module Embulk
93
97
  end
94
98
  parts = Date._strptime(time_string, date_format)
95
99
 
96
- orig_timezone = Time.zone
97
- Time.zone = get_profile[:timezone]
98
- Time.zone.local(*parts.values_at(:year, :mon, :mday, :hour)).to_time
99
- ensure
100
- Time.zone = orig_timezone
100
+ swap_time_zone do
101
+ Time.zone.local(*parts.values_at(:year, :mon, :mday, :hour)).to_time
102
+ end
101
103
  end
102
104
 
103
105
  def get_reports(page_token = nil)
@@ -109,14 +111,18 @@ module Embulk
109
111
  request.report_requests = build_report_request(page_token)
110
112
 
111
113
  Embulk.logger.info "Query to Core Report API: #{request.to_json}"
112
- service.batch_get_reports request
114
+ retryer.with_retry do
115
+ service.batch_get_reports request
116
+ end
113
117
  end
114
118
 
115
119
  def get_columns_list
116
120
  # https://developers.google.com/analytics/devguides/reporting/metadata/v3/reference/metadata/columns/list
117
121
  service = Google::Apis::AnalyticsV3::AnalyticsService.new
118
122
  service.authorization = auth
119
- service.list_metadata_columns("ga").to_h[:items]
123
+ retryer.with_retry do
124
+ service.list_metadata_columns("ga").to_h[:items]
125
+ end
120
126
  end
121
127
 
122
128
  def build_report_request(page_token = nil)
@@ -147,13 +153,53 @@ module Embulk
147
153
  end
148
154
 
149
155
  def auth
150
- Google::Auth::ServiceAccountCredentials.make_creds(
151
- json_key_io: StringIO.new(task["json_key_content"]),
152
- scope: "https://www.googleapis.com/auth/analytics.readonly"
153
- )
154
- rescue => e
156
+ retryer.with_retry do
157
+ Google::Auth::ServiceAccountCredentials.make_creds(
158
+ json_key_io: StringIO.new(task["json_key_content"]),
159
+ scope: "https://www.googleapis.com/auth/analytics.readonly"
160
+ )
161
+ end
162
+ rescue Google::Apis::AuthorizationError => e
155
163
  raise ConfigError.new(e.message)
156
164
  end
165
+
166
+ def swap_time_zone(&block)
167
+ orig_timezone = Time.zone
168
+ Time.zone = get_profile[:timezone]
169
+ yield
170
+ ensure
171
+ Time.zone = orig_timezone
172
+ end
173
+
174
+ def too_early_data?(time_str)
175
+ # fetching 20160720 data on 2016-07-20, it is too early fetching
176
+ swap_time_zone do
177
+ now = Time.zone.now
178
+ case task["time_series"]
179
+ when "ga:dateHour"
180
+ time_str.to_i >= now.strftime("%Y%m%d%H").to_i
181
+ when "ga:date"
182
+ time_str.to_i >= now.strftime("%Y%m%d").to_i
183
+ end
184
+ end
185
+ end
186
+
187
+ def retryer
188
+ PerfectRetry.new do |config|
189
+ config.limit = task["retry_limit"]
190
+ config.logger = Embulk.logger
191
+ config.log_level = nil
192
+
193
+ # https://developers.google.com/analytics/devguides/reporting/core/v4/errors
194
+ # https://developers.google.com/analytics/devguides/reporting/core/v4/limits-quotas#additional_quota
195
+ # https://github.com/google/google-api-ruby-client/blob/master/lib/google/apis/errors.rb
196
+ # https://github.com/google/google-api-ruby-client/blob/0.9.11/lib/google/apis/core/http_command.rb#L33
197
+ config.rescues = Google::Apis::Core::HttpCommand::RETRIABLE_ERRORS
198
+ config.dont_rescues = [Embulk::DataError, Embulk::ConfigError]
199
+ config.sleep = lambda{|n| task["retry_initial_wait_sec"]* (2 ** (n-1)) }
200
+ config.raise_original_error = true
201
+ end
202
+ end
157
203
  end
158
204
  end
159
205
  end
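The `retryer` defined above sleeps for `retry_initial_wait_sec * 2 ** (n - 1)` seconds before each retry. A minimal sketch of the resulting wait schedule follows. `backoff_waits` is a hypothetical helper, not part of the plugin, and it assumes the retry counter `n` passed to the sleep lambda runs from 1 to `retry_limit`.

```ruby
# Hypothetical helper (not part of the plugin): lists the waits produced by
# the sleep lambda in the diff, assuming n runs from 1 to retry_limit.
def backoff_waits(retry_limit, initial_wait_sec)
  (1..retry_limit).map { |n| initial_wait_sec * (2**(n - 1)) }
end

# README defaults: retry_limit = 5, retry_initial_wait_sec = 2
waits = backoff_waits(5, 2)
puts waits.inspect # [2, 4, 8, 16, 32]
puts waits.sum     # 62 seconds of waiting in total, under the assumption above
```

With the test fixture values (`retry_limit: 2`, `retry_initial_wait_sec: 0`) every wait is 0 seconds, which keeps the retry tests fast.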
@@ -42,7 +42,7 @@ module Embulk
42
42
  def self.resume(task, columns, count, &control)
43
43
  task_reports = yield(task, columns, count)
44
44
 
45
- next_config_diff = {}
45
+ next_config_diff = task_reports.first
46
46
  return next_config_diff
47
47
  end
48
48
 
@@ -56,6 +56,10 @@ module Embulk
56
56
  "time_series" => config.param("time_series", :string),
57
57
  "start_date" => config.param("start_date", :string, default: nil),
58
58
  "end_date" => config.param("end_date", :string, default: nil),
59
+ "incremental" => config.param("incremental", :bool, default: true),
60
+ "last_record_time" => config.param("last_record_time", :string, default: nil),
61
+ "retry_limit" => config.param("retry_limit", :integer, default: 5),
62
+ "retry_initial_wait_sec" => config.param("retry_initial_wait_sec", :integer, default: 2),
59
63
  }
60
64
  end
61
65
 
@@ -79,14 +83,28 @@ module Embulk
79
83
  client = Client.new(task, preview?)
80
84
  columns = self.class.columns_from_task(task)
81
85
 
86
+ last_record_time = task["last_record_time"] ? Time.parse(task["last_record_time"]) : nil
87
+
88
+ latest_time_series = nil
82
89
  client.each_report_row do |row|
90
+ time = row[task["time_series"]]
91
+ next if last_record_time && time <= last_record_time
92
+
83
93
  values = row.values_at(*columns)
84
94
  page_builder.add values
95
+
96
+ latest_time_series = [
97
+ latest_time_series,
98
+ time,
99
+ ].compact.max
85
100
  end
86
101
  page_builder.finish
87
102
 
88
- task_report = {}
89
- return task_report
103
+ if task["incremental"]
104
+ calculate_next_times(latest_time_series)
105
+ else
106
+ {}
107
+ end
90
108
  end
91
109
 
92
110
  def preview?
@@ -95,6 +113,49 @@ module Embulk
95
113
  false
96
114
  end
97
115
 
116
+ def calculate_next_times(fetched_latest_time)
117
+ task_report = {}
118
+
119
+ if fetched_latest_time
120
+ task_report[:start_date] = fetched_latest_time.strftime("%Y-%m-%d")
121
+
122
+ # if end_date specified as statically YYYY-MM-DD, it will be conflict with start_date (end_date < start_date)
123
+ # Modify it as "today" to be safe
124
+ if task["end_date"].match(/[0-9]{4}-[0-9]{2}-[0-9]{2}/)
125
+ task_report[:end_date] = "today" # "today" means now. running at 03:30 AM, will got 3 o'clock data.
126
+ end
127
+
128
+ # "start_date" format is YYYY-MM-DD, but ga:dateHour will return records by hourly.
129
+ # If run at 2016-07-03 05:00:00, start_date will set "2016-07-03" and got records until 2016-07-03 05:00:00.
130
+ # Then next run at 2016-07-04 05:00, will got records between 2016-07-03 00:00:00 and 2016-07-04 05:00:00.
131
+ # Records between 2016-07-03 00:00:00 and 2016-07-03 05:00:00 will eventually be duplicated
132
+ #
133
+ # Date| 2016-07-03 | 2016-07-04
134
+ # Hour| 5 | 5
135
+ # 1st run ------|----| |
136
+ # 2nd run |------------------------|-----
137
+ # ^^^^^ duplicated
138
+ #
139
+ # "last_record_time" option solves that problem
140
+ #
141
+ # Date| 2016-07-03 | 2016-07-04
142
+ # Hour| 5 | 5
143
+ # 1st run ------|----| |
144
+ # 2nd run #####|-------------------|-----
145
+ # ^^^^^ ignored (skipped)
146
+ #
147
+ task_report[:last_record_time] = fetched_latest_time.strftime("%Y-%m-%d %H:%M:%S %z")
148
+ else
149
+ # no records fetched, don't modify config_diff
150
+ task_report = {
151
+ start_date: task["start_date"],
152
+ end_date: task["end_date"],
153
+ last_record_time: task["last_record_time"],
154
+ }
155
+ end
156
+
157
+ task_report
158
+ end
98
159
  end
99
160
  end
100
161
  end
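The interplay of `last_record_time` and `calculate_next_times` shown above can be sketched without Embulk. This is a simplified illustration, not the plugin's code: `filter_and_advance` is a hypothetical name, rows are reduced to bare `Time` values, and the no-records case returns an empty hash instead of echoing the previous settings.

```ruby
# Hypothetical sketch: drop rows at or before last_record_time, then derive
# the next start_date and last_record_time from the newest row that was kept.
def filter_and_advance(times, last_record_time)
  fresh = times.reject { |t| last_record_time && t <= last_record_time }
  latest = fresh.max
  diff = if latest
    {
      start_date: latest.strftime("%Y-%m-%d"),
      last_record_time: latest.strftime("%Y-%m-%d %H:%M:%S %z"),
    }
  else
    {} # simplification: the plugin keeps the previous values here
  end
  [fresh, diff]
end

# The first run ended at 05:00; the second run re-fetches the whole day.
last = Time.new(2016, 7, 3, 5, 0, 0, "-07:00")
fetched = [3, 4, 5, 6].map { |h| Time.new(2016, 7, 3, h, 0, 0, "-07:00") }

fresh, diff = filter_and_advance(fetched, last)
puts fresh.length            # 1 (only the 06:00 row survives)
puts diff[:start_date]       # 2016-07-03
puts diff[:last_record_time] # 2016-07-03 06:00:00 -0700
```

Because `start_date` has only day granularity, the hourly rows for 03:00 to 05:00 are fetched again on the second run; the comparison against `last_record_time` is what keeps them out of the output.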
Binary file
@@ -179,17 +179,95 @@ module Embulk
179
179
  sub_test_case "auth" do
180
180
  setup do
181
181
  conf = valid_config["in"]
182
+ mute_logger
182
183
  @client = Client.new(task(embulk_config(conf)))
183
184
  end
184
185
 
185
- test "raise ConfigError when auth failed" do
186
- stub(Google::Auth::ServiceAccountCredentials).make_creds { raise "some error" }
187
- assert_raise(Embulk::ConfigError) do
188
- @client.auth
186
+ sub_test_case "retry" do
187
+ def should_retry
188
+ mock(Google::Auth::ServiceAccountCredentials).make_creds(anything).times(retryer.config.limit + 1) { raise error }
189
+ assert_raise do
190
+ @client.auth
191
+ end
192
+ end
193
+
194
+ def should_not_retry
195
+ mock(Google::Auth::ServiceAccountCredentials).make_creds(anything).times(1) { raise error }
196
+ assert_raise do
197
+ @client.auth
198
+ end
199
+ end
200
+
201
+ setup do
202
+ # stub(Google::Auth::ServiceAccountCredentials).make_creds { raise error }
203
+ end
204
+
205
+ sub_test_case "Server error (5xx)" do
206
+ def error
207
+ Google::Apis::ServerError.new("error")
208
+ end
209
+
210
+ test "should retry" do
211
+ should_retry
212
+ end
213
+ end
214
+
215
+ sub_test_case "Rate Limit" do
216
+ def error
217
+ Google::Apis::RateLimitError.new("error")
218
+ end
219
+
220
+ test "should retry" do
221
+ should_retry
222
+ end
223
+ end
224
+
225
+ sub_test_case "Auth Error" do
226
+ def error
227
+ Google::Apis::AuthorizationError.new("error")
228
+ end
229
+
230
+ test "should not retry" do
231
+ should_not_retry
232
+ end
189
233
  end
190
234
  end
191
235
  end
192
236
 
237
+ sub_test_case "too_early_data?" do
238
+ def stub_timezone(client)
239
+ stub(client).get_profile { {timezone: "America/Los_Angeles" } }
240
+ stub(client).swap_time_zone do |block|
241
+ stub(Time.zone).now { @now }
242
+ block.call
243
+ end
244
+ end
245
+
246
+ test "ga:dateHour" do
247
+ conf = valid_config["in"]
248
+ conf["time_series"] = "ga:dateHour"
249
+ client = Client.new(task(embulk_config(conf)))
250
+ @now = Time.parse("2016-06-01 05:00:00 PDT")
251
+ stub_timezone(client)
252
+
253
+ assert_equal false, client.too_early_data?("2016060104")
254
+ assert_equal true , client.too_early_data?("2016060105")
255
+ assert_equal true , client.too_early_data?("2016060106")
256
+ end
257
+
258
+ test "ga:date" do
259
+ conf = valid_config["in"]
260
+ conf["time_series"] = "ga:date"
261
+ client = Client.new(task(embulk_config(conf)))
262
+ @now = Time.parse("2016-06-03 05:00:00 PDT")
263
+ stub_timezone(client)
264
+
265
+ assert_equal false, client.too_early_data?("20160601")
266
+ assert_equal false, client.too_early_data?("20160602")
267
+ assert_equal true , client.too_early_data?("20160603")
268
+ end
269
+ end
270
+
193
271
  sub_test_case "each_report_row" do
194
272
  setup do
195
273
  conf = valid_config["in"]
@@ -284,6 +362,15 @@ module Embulk
284
362
  def embulk_config(hash)
285
363
  Embulk::DataSource.new(hash)
286
364
  end
365
+
366
+ def mute_logger
367
+ @logger = Logger.new(File::NULL)
368
+ stub(Embulk).logger { @logger }
369
+ end
370
+
371
+ def retryer
372
+ @client.retryer
373
+ end
287
374
  end
288
375
  end
289
376
  end
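The `too_early_data?` cases tested above reduce to an integer comparison between the raw time string from the API and the current time formatted the same way. A minimal sketch, assuming plain `Time` values in place of ActiveSupport's `Time.zone`: `too_early?` is a hypothetical stand-alone function, and unlike the plugin method it raises on an unsupported `time_series` rather than returning `nil`.

```ruby
# Hypothetical stand-alone version of the check: a row is "too early" when its
# hour (ga:dateHour) or day (ga:date) is the current one or later, because the
# values for that period are not final yet.
def too_early?(time_str, time_series, now)
  format = case time_series
           when "ga:dateHour" then "%Y%m%d%H"
           when "ga:date"     then "%Y%m%d"
           else raise ArgumentError, "unsupported time_series: #{time_series}"
           end
  time_str.to_i >= now.strftime(format).to_i
end

# "now" expressed in the profile's offset (PDT is UTC-07:00)
now = Time.new(2016, 6, 1, 5, 0, 0, "-07:00")

puts too_early?("2016060104", "ga:dateHour", now) # false
puts too_early?("2016060105", "ga:dateHour", now) # true
puts too_early?("20160531", "ga:date", now)       # false
puts too_early?("20160601", "ga:date", now)       # true
```

The comparison works on integers because both formats are zero-padded and ordered from the most significant field to the least.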
@@ -156,6 +156,37 @@ module Embulk
156
156
  mock(@page_builder).finish
157
157
  @plugin.run
158
158
  end
159
+
160
+ sub_test_case "last_record_time option" do
161
+ setup do
162
+ Time.zone = "America/Los_Angeles"
163
+ @last_record_time = Time.zone.parse("2016-06-01 12:00:00").to_time
164
+
165
+ conf = valid_config["in"]
166
+ conf["time_series"] = time_series
167
+ conf["last_record_time"] = @last_record_time.strftime("%Y-%m-%d %H:%M:%S %z")
168
+ @plugin = Plugin.new(embulk_config(conf), nil, nil, @page_builder)
169
+ end
170
+
171
+ test "ignore records when old" do
172
+ any_instance_of(Client) do |klass|
173
+ stub(klass).each_report_row do |block|
174
+ row = {
175
+ "ga:dateHour" => @last_record_time,
176
+ "ga:browser" => "wget",
177
+ "ga:visits" => 3,
178
+ "ga:pageviews" => 4,
179
+ }
180
+ block.call row
181
+ end
182
+ end
183
+
184
+ mock(@page_builder).add.never
185
+ mock(@page_builder).finish
186
+ @plugin.run
187
+ end
188
+ end
189
+
159
190
  end
160
191
 
161
192
  sub_test_case "time_series: 'ga:date'" do
@@ -182,6 +213,36 @@ module Embulk
182
213
  mock(@page_builder).finish
183
214
  @plugin.run
184
215
  end
216
+
217
+ sub_test_case "last_record_time option" do
218
+ setup do
219
+ Time.zone = "America/Los_Angeles"
220
+ @last_record_time = Time.zone.parse("2016-06-01 12:00:00").to_time
221
+
222
+ conf = valid_config["in"]
223
+ conf["time_series"] = time_series
224
+ conf["last_record_time"] = @last_record_time.strftime("%Y-%m-%d %H:%M:%S %z")
225
+ @plugin = Plugin.new(embulk_config(conf), nil, nil, @page_builder)
226
+ end
227
+
228
+ test "ignore records when old" do
229
+ any_instance_of(Client) do |klass|
230
+ stub(klass).each_report_row do |block|
231
+ row = {
232
+ "ga:date" => @last_record_time,
233
+ "ga:browser" => "wget",
234
+ "ga:visits" => 3,
235
+ "ga:pageviews" => 4,
236
+ }
237
+ block.call row
238
+ end
239
+ end
240
+
241
+ mock(@page_builder).add.never
242
+ mock(@page_builder).finish
243
+ @plugin.run
244
+ end
245
+ end
185
246
  end
186
247
  end
187
248
  end
@@ -201,6 +262,135 @@ module Embulk
201
262
  end
202
263
  end
203
264
 
265
+ sub_test_case "calculate_next_times" do
266
+ setup do
267
+ @page_builder = Object.new
268
+ @config = embulk_config(valid_config["in"])
269
+ end
270
+
271
+ sub_test_case "ga:dateHour" do
272
+ setup do
273
+ conf = valid_config["in"]
274
+ conf["time_series"] = "ga:dateHour"
275
+ @config = embulk_config(conf)
276
+ end
277
+
278
+ sub_test_case "no records fetched" do
279
+ test "config_diff won't modify" do
280
+ plugin = Plugin.new(config, nil, nil, @page_builder)
281
+ expected = {
282
+ start_date: task["start_date"],
283
+ end_date: task["end_date"],
284
+ last_record_time: task["last_record_time"],
285
+ }
286
+ assert_equal expected, plugin.calculate_next_times(nil)
287
+ end
288
+ end
289
+
290
+ sub_test_case "updated" do
291
+ sub_test_case "end_date is given as YYYY-MM-DD" do
292
+ setup do
293
+ @config[:start_date] = "2000-01-01"
294
+ @config[:end_date] = "2000-01-05"
295
+ end
296
+
297
+ test "config_diff will modify" do
298
+ latest_time = Time.parse("2000-01-07")
299
+ plugin = Plugin.new(config, nil, nil, @page_builder)
300
+ expected = {
301
+ start_date: latest_time.strftime("%Y-%m-%d"),
302
+ end_date: "today",
303
+ last_record_time: latest_time.strftime("%Y-%m-%d %H:%M:%S %z"),
304
+ }
305
+ assert_equal expected, plugin.calculate_next_times(latest_time)
306
+ end
307
+ end
308
+
309
+ sub_test_case "end_date is given as nDaysAgo" do
310
+ setup do
311
+ @config[:start_date] = "2000-01-01"
312
+ @config[:end_date] = "10DaysAgo"
313
+ end
314
+
315
+ test "config_diff end_date won't modify" do
316
+ latest_time = Time.parse("2000-01-07")
317
+ plugin = Plugin.new(config, nil, nil, @page_builder)
318
+ expected = {
319
+ start_date: latest_time.strftime("%Y-%m-%d"),
320
+ last_record_time: latest_time.strftime("%Y-%m-%d %H:%M:%S %z"),
321
+ }
322
+ assert_equal expected, plugin.calculate_next_times(latest_time)
323
+ end
324
+ end
325
+ end
326
+ end
327
+
328
+ sub_test_case "ga:date" do
329
+ setup do
330
+ conf = valid_config["in"]
331
+ conf["time_series"] = "ga:date"
332
+ @config = embulk_config(conf)
333
+ end
334
+
335
+ sub_test_case "no records fetched" do
336
+ test "config_diff will keep previous" do
337
+ plugin = Plugin.new(config, nil, nil, @page_builder)
338
+ expected = {
339
+ start_date: task["start_date"],
340
+ end_date: task["end_date"],
341
+ last_record_time: task["last_record_time"],
342
+ }
343
+ assert_equal expected, plugin.calculate_next_times(nil)
344
+ end
345
+ end
346
+
347
+ sub_test_case "updated" do
348
+ sub_test_case "end_date is given as YYYY-MM-DD" do
349
+ setup do
350
+ @config[:start_date] = "2000-01-01"
351
+ @config[:end_date] = "2000-01-05"
352
+ end
353
+
354
+ test "config_diff will modify" do
355
+ latest_time = Time.parse("2000-01-07")
356
+ plugin = Plugin.new(config, nil, nil, @page_builder)
357
+ expected = {
358
+ start_date: latest_time.strftime("%Y-%m-%d"),
359
+ end_date: "today",
360
+ last_record_time: latest_time.strftime("%Y-%m-%d %H:%M:%S %z"),
361
+ }
362
+ assert_equal expected, plugin.calculate_next_times(latest_time)
363
+ end
364
+ end
365
+
366
+ sub_test_case "end_date is given as nDaysAgo" do
367
+ setup do
368
+ @config[:start_date] = "2000-01-01"
369
+ @config[:end_date] = "10DaysAgo"
370
+ end
371
+
372
+ test "config_diff end_date won't modify" do
373
+ latest_time = Time.parse("2000-01-07")
374
+ plugin = Plugin.new(config, nil, nil, @page_builder)
375
+ expected = {
376
+ start_date: latest_time.strftime("%Y-%m-%d"),
377
+ last_record_time: latest_time.strftime("%Y-%m-%d %H:%M:%S %z"),
378
+ }
379
+ assert_equal expected, plugin.calculate_next_times(latest_time)
380
+ end
381
+ end
382
+ end
383
+ end
384
+
385
+ def task
386
+ Plugin.task_from_config(@config)
387
+ end
388
+
389
+ def config
390
+ @config
391
+ end
392
+ end
393
+
204
394
  def valid_config
205
395
  fixture_load("valid.yml")
206
396
  end
@@ -12,6 +12,8 @@ in:
12
12
  metrics:
13
13
  - "ga:visits"
14
14
  - "ga:pageviews"
15
+ retry_limit: 2
16
+ retry_initial_wait_sec: 0
15
17
 
16
18
  out:
17
19
  type: stdout
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: embulk-input-google_analytics
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.1.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - uu59
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-07-07 00:00:00.000000000 Z
11
+ date: 2016-07-13 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  requirement: !ruby/object:Gem::Requirement
@@ -66,6 +66,20 @@ dependencies:
66
66
  - - ">="
67
67
  - !ruby/object:Gem::Version
68
68
  version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ requirement: !ruby/object:Gem::Requirement
71
+ requirements:
72
+ - - "~>"
73
+ - !ruby/object:Gem::Version
74
+ version: '0.5'
75
+ name: perfect_retry
76
+ prerelease: false
77
+ type: :runtime
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '0.5'
69
83
  - !ruby/object:Gem::Dependency
70
84
  requirement: !ruby/object:Gem::Requirement
71
85
  requirements:
@@ -178,6 +192,20 @@ dependencies:
178
192
  - - ">="
179
193
  - !ruby/object:Gem::Version
180
194
  version: '0'
195
+ - !ruby/object:Gem::Dependency
196
+ requirement: !ruby/object:Gem::Requirement
197
+ requirements:
198
+ - - "~>"
199
+ - !ruby/object:Gem::Version
200
+ version: '1.0'
201
+ name: gem_release_helper
202
+ prerelease: false
203
+ type: :development
204
+ version_requirements: !ruby/object:Gem::Requirement
205
+ requirements:
206
+ - - "~>"
207
+ - !ruby/object:Gem::Version
208
+ version: '1.0'
181
209
  description: Loads records from Google Analytics.
182
210
  email:
183
211
  - k@uu59.org
@@ -197,6 +225,7 @@ files:
197
225
  - lib/embulk/input/google_analytics.rb
198
226
  - lib/embulk/input/google_analytics/client.rb
199
227
  - lib/embulk/input/google_analytics/plugin.rb
228
+ - service_account.png
200
229
  - test/embulk/input/google_analytics/test_client.rb
201
230
  - test/embulk/input/google_analytics/test_plugin.rb
202
231
  - test/fixture_helper.rb