ferto 0.0.4 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 9fe993b148492a811c70a78c299c483b2d23ef5e
4
- data.tar.gz: '0180f279e8372d85c5e908060cb3c43e50467d62'
2
+ SHA256:
3
+ metadata.gz: 7c778310b7f3bba9e3d6984c185b7c61ed475e398a38924c34ccecbf490ddb8d
4
+ data.tar.gz: 3f027ed81e81275682260282b1b72c09a0469ebb7d026864105a79e3bb0c6ad8
5
5
  SHA512:
6
- metadata.gz: '09ca432db72f3bb2f2c6456672f95a075c58ce32530fe0fc4fdd52849918d013f361157d2626e2899bc9b8f8912979cfbce46bbdf32cc5750a900d6b6cdb6d4a'
7
- data.tar.gz: 6c1384ff84dcb8ce3afbafed63e61562b58743194a1ccf6c50f24be0624d95cd70165a543235b6f2e016fbaf5f3879b94be577da560883504b5cf5f5276302e1
6
+ metadata.gz: 2745ec8da102954efe1eb8e6682290ca56ab9970816f1411e5e01947d9d61b4fcd69efbc4d329af567dc7f04ff1d9d590a845c245b30bac67bb78633ebabc3bb
7
+ data.tar.gz: e3d558ff92fd5bcd542f56a7a924b77119ddaae6bcb4887f966fe34c0c39b221a0d29bac848d16db1793005d90a92aff6b21afc50626e5693cbb7750dc63a0e3
@@ -0,0 +1,29 @@
1
+ name: CI
2
+
3
+ on:
4
+ push:
5
+ branches: [ master ]
6
+ pull_request:
7
+ branches: [ master ]
8
+
9
+ jobs:
10
+ test:
11
+ runs-on: ubuntu-latest
12
+
13
+ strategy:
14
+ fail-fast: false
15
+ matrix:
16
+ ruby-version: ['2.7', '3.2']
17
+
18
+ steps:
19
+ - uses: actions/checkout@v3
20
+ - name: Install dependencies
21
+ run: sudo apt install -y libcurl4-openssl-dev
22
+ - name: Set up Ruby ${{ matrix.ruby-version }}
23
+ uses: ruby/setup-ruby@v1
24
+ with:
25
+ ruby-version: ${{ matrix.ruby-version }}
26
+ - name: Install dependencies
27
+ run: bundle install
28
+ - name: Run tests
29
+ run: bundle exec rspec
data/CHANGELOG.md ADDED
@@ -0,0 +1,50 @@
1
+ # Changelog
2
+
3
+ Breaking changes are prefixed with a "[BREAKING]" label.
4
+
5
+ ## master (unreleased)
6
+
7
+ ## 0.1.0 (2023-06-16)
8
+
9
+ - Add compatibility for Ruby 3
10
+ - Unpin curb version from gemspec
11
+ - Unpin faker version
12
+ - specs: Pass params as kwargs instead of hash
13
+
14
+ ## 0.0.9 (2022-11-14)
15
+
16
+ ### Added
17
+
18
+ - Support for different callbacks when a job fails [[#13](https://github.com/skroutz/ferto/pull/13)]
19
+
20
+ ## 0.0.8 (2022-08-16)
21
+
22
+ ### Added
23
+
24
+ - Support for setting subpath in download requests [[#12](https://github.com/skroutz/ferto/pull/12)]
25
+
26
+ ## 0.0.6 (2019-07-09)
27
+
28
+ ### Added
29
+
30
+ - Support for setting request headers in download requests [[#10](https://github.com/skroutz/ferto/pull/10)]
31
+
32
+ ## 0.0.7 (2022-07-21)
33
+
34
+ ### Added
35
+
36
+ - Support for setting AWS S3 bucket as filestorage solution [[#11](https://github.com/skroutz/ferto/pull/11)]
37
+
38
+ ## 0.0.5 (2019-05-16)
39
+
40
+ ### Added
41
+
42
+ - [BREAKING] `Ferto::ResponseError` exception raising when 40X or 50X response is returned [[#9](https://github.com/skroutz/ferto/pull/9)]
43
+
44
+ ## 0.0.4 (2019-04-18)
45
+
46
+ ### Added
47
+
48
+ - Support setting a job download timeout [[#7](https://github.com/skroutz/ferto/pull/7)]
49
+ - Support setting an HTTP proxy for use in download requests [[#7](https://github.com/skroutz/ferto/pull/7)]
50
+ - Support setting the User-Agent header in download requests [[#7](https://github.com/skroutz/ferto/pull/7)]
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Ferto
2
2
 
3
- [![Build Status](https://travis-ci.org/skroutz/ferto.svg?branch=master)](https://travis-ci.org/skroutz/ferto)
3
+ ![Build Status](https://github.com/skroutz/ferto/actions/workflows/CI.yml/badge.svg?branch=master)
4
4
  [![Gem Version](https://badge.fury.io/rb/ferto.svg)](https://badge.fury.io/rb/ferto)
5
5
  [![Documentation](http://img.shields.io/badge/yard-docs-blue.svg)](http://www.rubydoc.info/github/skroutz/ferto)
6
6
 
@@ -50,7 +50,8 @@ dl_resp = client.download(aggr_id: 'bucket1',
50
50
  mime_type: 'text/html',
51
51
  callback_type: 'http',
52
52
  callback_dst: 'http://myservice.com/downloader_callback',
53
- extra: { some_extra_info: 'info' })
53
+ extra: { some_extra_info: 'info' },
54
+ request_headers: { "Accept" => "application/html,application/xhtml+html" })
54
55
  ```
55
56
 
56
57
  In order for a service to consume downloader's result, it *must* accept the HTTP
@@ -65,7 +66,8 @@ dl_resp = client.download(aggr_id: 'bucket1',
65
66
  mime_type: 'text/html',
66
67
  callback_type: 'kafka',
67
68
  callback_dst: 'my-kafka-topic',
68
- extra: { some_extra_info: 'info' })
69
+ extra: { some_extra_info: 'info' },
70
+ request_headers: { "Accept" => "application/html,application/xhtml+html" })
69
71
  ```
70
72
 
71
73
  To consume the downloader's result, you can use your favorite Kafka library and
@@ -77,6 +79,10 @@ If the connection with the `downloader` API was successful, the aforementioned
77
79
  object. If the client failed to connect, a
78
80
  [`Ferto::ConnectionError`](https://github.com/skroutz/ferto/blob/master/lib/ferto.rb#L18)
79
81
  exception is raised.
82
+ Also if the download call, results to a response with code
83
+ either `40X` or `50X` then a [`Ferto::ResponseError`](https://github.com/skroutz/ferto/blob/master/lib/ferto.rb#L21)
84
+ is raised with the response object encapsulated in the raised exception in order
85
+ to be further handled by the end user.
80
86
 
81
87
  To handle the actual callback message, e.g. from inside a Rails controller:
82
88
 
@@ -103,6 +109,17 @@ end
103
109
  > parameters](https://github.com/skroutz/downloader#endpoints), [callback
104
110
  > payload](https://github.com/skroutz/downloader/tree/kafka-backend#usage)).
105
111
 
112
+
113
+ #### A Note on User-Agent
114
+
115
+ We continue to expose the `user_agent` field as tools like `curl` and `wget` do.
116
+ Along with that we will follow their paradigm where if both a `user-agent` flag
117
+ and a `User-Agent` in the request headers are provided then the user-agent in
118
+ the request headers is preferred.
119
+
120
+ Also if the `user_agent` is provided but the request headers do not
121
+ contain a `User-Agent` key, then the `user_agent` is copied to the headers
122
+
106
123
  ## Contributing
107
124
 
108
125
  Bug reports and pull requests are welcome on GitHub at
data/ferto.gemspec CHANGED
@@ -23,12 +23,11 @@ Gem::Specification.new do |spec|
23
23
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
24
24
  spec.require_paths = ["lib"]
25
25
 
26
- spec.add_dependency 'curb', "~> 0.9"
26
+ spec.add_dependency 'curb'
27
27
 
28
- spec.add_development_dependency "bundler", "~> 1.13"
29
28
  spec.add_development_dependency "rake", "~> 10.0"
30
29
  spec.add_development_dependency "rspec", "~> 3.0"
31
30
  spec.add_development_dependency "webmock", "~> 3.5"
32
31
  spec.add_development_dependency "factory_bot", "~> 4.10"
33
- spec.add_development_dependency "faker", "~> 1.9"
32
+ spec.add_development_dependency "faker"
34
33
  end
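Review note on the unpinned dependencies: dropping the `~> 0.9` constraint on `curb` leaves a requirement of `>= 0`, which matches what the `metadata` diff records. The sketch below uses only RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show the difference between the two constraints. The version numbers are illustrative and not taken from this diff.

```ruby
require "rubygems"

# Old constraint from the gemspec: "~> 0.9" means >= 0.9 and < 1.0.
pinned = Gem::Requirement.new("~> 0.9")

# New constraint: add_dependency without a version records ">= 0".
unpinned = Gem::Requirement.new(">= 0")

# Illustrative version numbers (assumptions, not taken from the diff).
old_series = Gem::Version.new("0.9.11")
new_series = Gem::Version.new("1.0.0")

puts pinned.satisfied_by?(old_series)   # inside the 0.9 series
puts pinned.satisfied_by?(new_series)   # outside the pessimistic range
puts unpinned.satisfied_by?(new_series) # any version is accepted
```

The trade-off is the usual one: the unpinned form lets the gem resolve against newer `curb` releases, but it no longer guards against an incompatible major version.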
data/lib/ferto/client.rb CHANGED
@@ -50,6 +50,8 @@ module Ferto
50
50
  # @param url [String] the resource to be downloaded
51
51
  # @param callback_type [String]
52
52
  # @param callback_dst [String] the callback destination
53
+ # @param callback_error_type [String]
54
+ # @param callback_error_dst [String] the callback destination in case the job fails
53
55
  # @param mime_type [String] (default: "") accepted MIME types for the
54
56
  # resource
55
57
  # @param aggr_id [String] aggregation identifier
@@ -62,6 +64,11 @@ module Ferto
62
64
  # @param user_agent [String] the User-Agent string to use for
63
65
  # downloading the resource, by default it uses the User-Agent string
64
66
  # set in the downloader's configuration
67
+ # @param request_headers [Hash] the request headers that will be used
68
+ # in downloader when performing the actual request in order to fetch
69
+ # the desired resource
70
+ # @param subpath [String] the subfolder(s) that the jobs will be stored
71
+ # under the top level directory of storage backend
65
72
  #
66
73
  # @example
67
74
  # client.download(
@@ -73,25 +80,34 @@ module Ferto
73
80
  # aggr_proxy: 'http://myproxy.com/',
74
81
  # user_agent: 'my-useragent',
75
82
  # mime_type: "image/jpeg",
83
+ # request_headers: { "Accept" => "image/*,*/*;q=0.8" },
76
84
  # extra: { something: 'someone' }
77
85
  # )
78
86
  #
79
87
  # @raise [Ferto::ConnectionError] if there was an error scheduling the
80
- # job to downloader
88
+ # job to downloader with respect to the fact that a Curl ConnectionFailedError occured
89
+ # @raise [Ferto::ResponseError] if a response code of 40X or 50X is received
81
90
  #
82
91
  # @return [Ferto::Response]
83
92
  #
84
93
  # @see https://github.com/skroutz/downloader/#post-download
85
94
  def download(aggr_id:, aggr_limit: @aggr_limit, url:,
86
95
  aggr_proxy: nil, download_timeout: nil, user_agent: nil,
87
- callback_url: "", callback_dst: "",
88
- callback_type: "", mime_type: "", extra: {})
96
+ callback_url: "", callback_dst: "", callback_type: "",
97
+ callback_error_type: "", callback_error_dst: "",
98
+ mime_type: "", extra: {},
99
+ request_headers: {},
100
+ s3_bucket: nil, s3_region: nil, subpath: nil)
89
101
  uri = URI::HTTP.build(
90
102
  scheme: scheme, host: host, port: port, path: path
91
103
  )
92
104
  body = build_body(
93
- aggr_id, aggr_limit, url, callback_url, callback_type, callback_dst,
94
- aggr_proxy, download_timeout, user_agent, mime_type, extra
105
+ aggr_id, aggr_limit, url,
106
+ callback_url, callback_type, callback_dst,
107
+ callback_error_type, callback_error_dst,
108
+ aggr_proxy, download_timeout, user_agent,
109
+ mime_type, extra, request_headers,
110
+ s3_bucket, s3_region, subpath
95
111
  )
96
112
  # Curl.post reuses the same handler
97
113
  begin
@@ -100,6 +116,14 @@ module Ferto
100
116
  handle.connect_timeout = connect_timeout
101
117
  handle.timeout = timeout
102
118
  end
119
+
120
+ case res.response_code
121
+ when 400..599
122
+ error_msg = ("An error occured during the download call. " \
123
+ "Received a #{res.response_code} response code and body " \
124
+ "#{res.body_str}")
125
+ raise Ferto::ResponseError.new(error_msg, res)
126
+ end
103
127
  rescue Curl::Err::ConnectionFailedError => e
104
128
  raise Ferto::ConnectionError.new(e)
105
129
  end
@@ -117,14 +141,27 @@ module Ferto
117
141
  end
118
142
 
119
143
  def build_body(aggr_id, aggr_limit, url, callback_url, callback_type,
120
- callback_dst, aggr_proxy, download_timeout, user_agent,
121
- mime_type, extra)
144
+ callback_dst, callback_error_type, callback_error_dst,
145
+ aggr_proxy, download_timeout, user_agent,
146
+ mime_type, extra, request_headers,
147
+ s3_bucket, s3_region, subpath)
122
148
  body = {
123
149
  aggr_id: aggr_id,
124
150
  aggr_limit: aggr_limit,
125
151
  url: url
126
152
  }
127
153
 
154
+ if s3_bucket && s3_region
155
+ body[:s3_bucket] = s3_bucket
156
+ body[:s3_region] = s3_region
157
+ end
158
+
159
+ if !s3_bucket && s3_region
160
+ raise ArgumentError, "s3_region provided without an s3_bucket"
161
+ elsif !s3_region && s3_bucket
162
+ raise ArgumentError, "s3_bucket provided without an s3_region"
163
+ end
164
+
128
165
  if callback_url.empty?
129
166
  body[:callback_type] = callback_type
130
167
  body[:callback_dst] = callback_dst
@@ -132,10 +169,14 @@ module Ferto
132
169
  body[:callback_url] = callback_url
133
170
  end
134
171
 
172
+ body[:callback_error_type] = callback_error_type unless callback_error_type.to_s.empty?
173
+ body[:callback_error_dst] = callback_error_dst unless callback_error_dst.to_s.empty?
174
+
135
175
  if !mime_type.empty?
136
176
  body[:mime_type] = mime_type
137
177
  end
138
178
 
179
+ body[:subpath] = subpath if subpath
139
180
  body[:aggr_proxy] = aggr_proxy if aggr_proxy
140
181
  body[:download_timeout] = download_timeout if download_timeout
141
182
  body[:user_agent] = user_agent if user_agent
@@ -144,6 +185,19 @@ module Ferto
144
185
  body[:extra] = extra.is_a?(Hash) ? extra.to_json : extra.to_s
145
186
  end
146
187
 
188
+ # We will continue to expose the user_agent field just like tools
189
+ # like curl and wget do. Along with that we will follow their paradigm
190
+ # where if both a user-agent flag and a `User-Agent` in the request headers
191
+ # are provided then the user agent in the request headers is preferred.
192
+ #
193
+ # Also if the `user_agent` is provided but the request headers do not
194
+ # contain a `User-Agent` key, then the `user_agent` is copied to the headers
195
+ if user_agent && !request_headers.key?("User-Agent")
196
+ request_headers["User-Agent"] = user_agent
197
+ end
198
+
199
+ body[:request_headers] = request_headers
200
+
147
201
  body
148
202
  end
149
203
  end
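Review note on the `User-Agent` handling in `build_body`: the precedence rule can be sketched in isolation as below. `merge_user_agent` is a hypothetical helper name and is not part of ferto. Unlike the code in the diff, which writes into the hash passed by the caller, this sketch works on a copy. The key lookup is case-sensitive, as in the diff.

```ruby
# Hypothetical helper (not part of ferto) mirroring the rule in build_body:
# an explicit "User-Agent" request header wins over the user_agent argument,
# and user_agent is copied into the headers only when no such header exists.
def merge_user_agent(user_agent, request_headers)
  headers = request_headers.dup # copy so the caller's hash is left untouched
  if user_agent && !headers.key?("User-Agent")
    headers["User-Agent"] = user_agent
  end
  headers
end

# Only user_agent given: it is copied into the headers.
only_param = merge_user_agent("my-useragent", {})

# Both given: the header value is kept.
both_given = merge_user_agent("my-useragent", { "User-Agent" => "from-headers" })

# Neither given: the headers stay empty.
neither = merge_user_agent(nil, {})
```

Because the lookup is case-sensitive, a caller passing `"user-agent"` in lowercase would end up with both keys in the resulting hash. Whether that matters depends on how the downloader service treats duplicate headers, which this diff does not show.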
data/lib/ferto/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Ferto
2
- VERSION = "0.0.4"
2
+ VERSION = "0.1.0"
3
3
  end
data/lib/ferto.rb CHANGED
@@ -16,4 +16,25 @@ module Ferto
16
16
  }.freeze
17
17
 
18
18
  class ConnectionError < StandardError; end
19
+
20
+ # A custom error class for 40X and 50X responses
21
+ class ResponseError < StandardError
22
+
23
+ # Initialize a Ferto::ResponseError
24
+ #
25
+ # @param [String] err A string describing the error occured
26
+ # @param [Curl::Easy | nil] response a Curl::Easy object
27
+ # that represents the response returned by the download method.
28
+ # Default: nil
29
+ def initialize(err, response=nil)
30
+ super(err)
31
+ @response = response
32
+ end
33
+
34
+ # response is set, during the download in case of
35
+ # 40X or 50X responses are returned, so that it
36
+ # can be used in case of debugging but it is also
37
+ # included for reasons of completeness.
38
+ attr_reader :response
39
+ end
19
40
  end
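Review note on `Ferto::ResponseError`: since the exception carries the response object, callers can inspect the status code and body when rescuing it. The sketch below is a minimal illustration that runs without the gem or a downloader instance. It re-declares the error class as shown in the diff, and `FakeResponse` and `schedule_download` are hypothetical names introduced here. In real use the response would be the `Curl::Easy` object, whose `response_code` and `body_str` methods are the ones the diff itself calls.

```ruby
# Stand-in definition copied from the diff so the sketch is self-contained;
# in real use, `require "ferto"` provides this class.
module Ferto
  class ResponseError < StandardError
    attr_reader :response

    def initialize(err, response = nil)
      super(err)
      @response = response
    end
  end
end

# Hypothetical stand-in for the Curl::Easy object attached to the error.
FakeResponse = Struct.new(:response_code, :body_str)

# Hypothetical wrapper: the block is expected to call client.download.
# A ResponseError is turned into a small status hash instead of propagating.
def schedule_download
  { ok: true, result: yield }
rescue Ferto::ResponseError => e
  { ok: false, code: e.response&.response_code, body: e.response&.body_str }
end

# Simulate the 40X/50X branch added to Ferto::Client#download.
outcome = schedule_download do
  raise Ferto::ResponseError.new("Received a 503 response code",
                                 FakeResponse.new(503, "unavailable"))
end

puts outcome.inspect
```

The safe-navigation calls cover the `response=nil` default in the constructor, so the wrapper also works when the error is raised without a response attached.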
metadata CHANGED
@@ -1,43 +1,29 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: ferto
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.4
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Aggelos Avgerinos
8
- autorequire:
8
+ autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2019-04-18 00:00:00.000000000 Z
11
+ date: 2023-06-16 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: curb
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
17
+ - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: '0.9'
19
+ version: '0'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - "~>"
24
+ - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: '0.9'
27
- - !ruby/object:Gem::Dependency
28
- name: bundler
29
- requirement: !ruby/object:Gem::Requirement
30
- requirements:
31
- - - "~>"
32
- - !ruby/object:Gem::Version
33
- version: '1.13'
34
- type: :development
35
- prerelease: false
36
- version_requirements: !ruby/object:Gem::Requirement
37
- requirements:
38
- - - "~>"
39
- - !ruby/object:Gem::Version
40
- version: '1.13'
26
+ version: '0'
41
27
  - !ruby/object:Gem::Dependency
42
28
  name: rake
43
29
  requirement: !ruby/object:Gem::Requirement
@@ -98,16 +84,16 @@ dependencies:
98
84
  name: faker
99
85
  requirement: !ruby/object:Gem::Requirement
100
86
  requirements:
101
- - - "~>"
87
+ - - ">="
102
88
  - !ruby/object:Gem::Version
103
- version: '1.9'
89
+ version: '0'
104
90
  type: :development
105
91
  prerelease: false
106
92
  version_requirements: !ruby/object:Gem::Requirement
107
93
  requirements:
108
- - - "~>"
94
+ - - ">="
109
95
  - !ruby/object:Gem::Version
110
- version: '1.9'
96
+ version: '0'
111
97
  description: Ruby API client for Downloader service
112
98
  email:
113
99
  - avgerinos@skroutz.gr
@@ -115,8 +101,9 @@ executables: []
115
101
  extensions: []
116
102
  extra_rdoc_files: []
117
103
  files:
104
+ - ".github/workflows/CI.yml"
118
105
  - ".gitignore"
119
- - ".travis.yml"
106
+ - CHANGELOG.md
120
107
  - Gemfile
121
108
  - LICENSE.txt
122
109
  - README.md
@@ -134,7 +121,7 @@ homepage: https://github.com/skroutz/ferto
134
121
  licenses:
135
122
  - GPL-3.0
136
123
  metadata: {}
137
- post_install_message:
124
+ post_install_message:
138
125
  rdoc_options: []
139
126
  require_paths:
140
127
  - lib
@@ -149,9 +136,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
149
136
  - !ruby/object:Gem::Version
150
137
  version: '0'
151
138
  requirements: []
152
- rubyforge_project:
153
- rubygems_version: 2.5.2
154
- signing_key:
139
+ rubygems_version: 3.4.14
140
+ signing_key:
155
141
  specification_version: 4
156
142
  summary: Ruby API client for Downloader
157
143
  test_files: []
data/.travis.yml DELETED
@@ -1,5 +0,0 @@
1
- sudo: false
2
- language: ruby
3
- rvm:
4
- - 2.3
5
- - 2.4