ferto 0.0.3 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 97a79db2d44895104a87860165f149d3b08ebe7b
4
- data.tar.gz: 28bd5801d93445f394629766cf18d6bc2547ece3
2
+ SHA256:
3
+ metadata.gz: 7c778310b7f3bba9e3d6984c185b7c61ed475e398a38924c34ccecbf490ddb8d
4
+ data.tar.gz: 3f027ed81e81275682260282b1b72c09a0469ebb7d026864105a79e3bb0c6ad8
5
5
  SHA512:
6
- metadata.gz: 7c04fbf4d9943121888bb8d211a4a31d8098e891150154f7c6853e4309d55fb87f46b2c86046fca8589d981d2d7eb5827fee54487d6f8ce12109abcad6bb2690
7
- data.tar.gz: 05d6a9d9f3f4064d7e461eb57a4702eed647e9fca734e722e21db50ab973a901335fcb0f0e4a9ef339ffe2ecbaa01c5c00a6307d25a74ce50e748a165869177c
6
+ metadata.gz: 2745ec8da102954efe1eb8e6682290ca56ab9970816f1411e5e01947d9d61b4fcd69efbc4d329af567dc7f04ff1d9d590a845c245b30bac67bb78633ebabc3bb
7
+ data.tar.gz: e3d558ff92fd5bcd542f56a7a924b77119ddaae6bcb4887f966fe34c0c39b221a0d29bac848d16db1793005d90a92aff6b21afc50626e5693cbb7750dc63a0e3
data/.github/workflows/CI.yml ADDED
@@ -0,0 +1,29 @@
1
+ name: CI
2
+
3
+ on:
4
+ push:
5
+ branches: [ master ]
6
+ pull_request:
7
+ branches: [ master ]
8
+
9
+ jobs:
10
+ test:
11
+ runs-on: ubuntu-latest
12
+
13
+ strategy:
14
+ fail-fast: false
15
+ matrix:
16
+ ruby-version: ['2.7', '3.2']
17
+
18
+ steps:
19
+ - uses: actions/checkout@v3
20
+ - name: Install dependencies
21
+ run: sudo apt install -y libcurl4-openssl-dev
22
+ - name: Set up Ruby ${{ matrix.ruby-version }}
23
+ uses: ruby/setup-ruby@v1
24
+ with:
25
+ ruby-version: ${{ matrix.ruby-version }}
26
+ - name: Install dependencies
27
+ run: bundle install
28
+ - name: Run tests
29
+ run: bundle exec rspec
data/.gitignore CHANGED
@@ -9,4 +9,5 @@
9
9
  /tmp/
10
10
 
11
11
  *.swp
12
+ *.gem
12
13
  .rspec
data/CHANGELOG.md ADDED
@@ -0,0 +1,50 @@
1
+ # Changelog
2
+
3
+ Breaking changes are prefixed with a "[BREAKING]" label.
4
+
5
+ ## master (unreleased)
6
+
7
+ ## 0.1.0 (2023-06-16)
8
+
9
+ - Add compatibility for Ruby 3
10
+ - Unpin curb version from gemspec
11
+ - Unpin faker version
12
+ - specs: Pass params as kwargs instead of hash
13
+
14
+ ## 0.0.9 (2022-11-14)
15
+
16
+ ### Added
17
+
18
+ - Support for different callbacks when a job fails [[#13](https://github.com/skroutz/ferto/pull/13)]
19
+
20
+ ## 0.0.8 (2022-08-16)
21
+
22
+ ### Added
23
+
24
+ - Support for setting subpath in download requests [[#12](https://github.com/skroutz/ferto/pull/12)]
25
+
26
+ ## 0.0.6 (2019-07-09)
27
+
28
+ ### Added
29
+
30
+ - Support for setting request headers in download requests [[#10](https://github.com/skroutz/ferto/pull/10)]
31
+
32
+ ## 0.0.7 (2022-07-21)
33
+
34
+ ### Added
35
+
36
+ - Support for setting AWS S3 bucket as filestorage solution [[#11](https://github.com/skroutz/ferto/pull/11)]
37
+
38
+ ## 0.0.5 (2019-05-16)
39
+
40
+ ### Added
41
+
42
+ - [BREAKING] `Ferto::ResponseError` exception raising when 40X or 50X response is returned [[#9](https://github.com/skroutz/ferto/pull/9)]
43
+
44
+ ## 0.0.4 (2019-04-18)
45
+
46
+ ### Added
47
+
48
+ - Support setting a job download timeout [[#7](https://github.com/skroutz/ferto/pull/7)]
49
+ - Support setting an HTTP proxy for use in download requests [[#7](https://github.com/skroutz/ferto/pull/7)]
50
+ - Support setting the User-Agent header in download requests [[#7](https://github.com/skroutz/ferto/pull/7)]
data/README.md CHANGED
@@ -1,7 +1,8 @@
1
1
  # Ferto
2
2
 
3
- [![Build Status](https://travis-ci.org/skroutz/ferto.svg?branch=master)](https://travis-ci.org/skroutz/ferto)
3
+ ![Build Status](https://github.com/skroutz/ferto/actions/workflows/CI.yml/badge.svg?branch=master)
4
4
  [![Gem Version](https://badge.fury.io/rb/ferto.svg)](https://badge.fury.io/rb/ferto)
5
+ [![Documentation](http://img.shields.io/badge/yard-docs-blue.svg)](http://www.rubydoc.info/github/skroutz/ferto)
5
6
 
6
7
  A Ruby client for [skroutz/downloader](https://github.com/skroutz/downloader).
7
8
 
@@ -49,7 +50,8 @@ dl_resp = client.download(aggr_id: 'bucket1',
49
50
  mime_type: 'text/html',
50
51
  callback_type: 'http',
51
52
  callback_dst: 'http://myservice.com/downloader_callback',
52
- extra: { some_extra_info: 'info' })
53
+ extra: { some_extra_info: 'info' },
54
+ request_headers: { "Accept" => "application/html,application/xhtml+html" })
53
55
  ```
54
56
 
55
57
  In order for a service to consume downloader's result, it *must* accept the HTTP
@@ -64,7 +66,8 @@ dl_resp = client.download(aggr_id: 'bucket1',
64
66
  mime_type: 'text/html',
65
67
  callback_type: 'kafka',
66
68
  callback_dst: 'my-kafka-topic',
67
- extra: { some_extra_info: 'info' })
69
+ extra: { some_extra_info: 'info' },
70
+ request_headers: { "Accept" => "application/html,application/xhtml+html" })
68
71
  ```
69
72
 
70
73
  To consume the downloader's result, you can use your favorite Kafka library and
@@ -76,6 +79,10 @@ If the connection with the `downloader` API was successful, the aforementioned
76
79
  object. If the client failed to connect, a
77
80
  [`Ferto::ConnectionError`](https://github.com/skroutz/ferto/blob/master/lib/ferto.rb#L18)
78
81
  exception is raised.
82
+ Also, if the download call results in a response with code
83
+ either `40X` or `50X` then a [`Ferto::ResponseError`](https://github.com/skroutz/ferto/blob/master/lib/ferto.rb#L21)
84
+ is raised with the response object encapsulated in the raised exception in order
85
+ to be further handled by the end user.
79
86
 
80
87
  To handle the actual callback message, e.g. from inside a Rails controller:
81
88
 
@@ -102,6 +109,17 @@ end
102
109
  > parameters](https://github.com/skroutz/downloader#endpoints), [callback
103
110
  > payload](https://github.com/skroutz/downloader/tree/kafka-backend#usage)).
104
111
 
112
+
113
+ #### A Note on User-Agent
114
+
115
+ We continue to expose the `user_agent` field as tools like `curl` and `wget` do.
116
+ Along with that we will follow their paradigm where if both a `user-agent` flag
117
+ and a `User-Agent` in the request headers are provided then the user-agent in
118
+ the request headers is preferred.
119
+
120
+ Also if the `user_agent` is provided but the request headers do not
121
+ contain a `User-Agent` key, then the `user_agent` is copied to the headers
122
+
105
123
  ## Contributing
106
124
 
107
125
  Bug reports and pull requests are welcome on GitHub at
data/ferto.gemspec CHANGED
@@ -23,12 +23,11 @@ Gem::Specification.new do |spec|
23
23
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
24
24
  spec.require_paths = ["lib"]
25
25
 
26
- spec.add_dependency 'curb', "~> 0.9"
26
+ spec.add_dependency 'curb'
27
27
 
28
- spec.add_development_dependency "bundler", "~> 1.13"
29
28
  spec.add_development_dependency "rake", "~> 10.0"
30
29
  spec.add_development_dependency "rspec", "~> 3.0"
31
30
  spec.add_development_dependency "webmock", "~> 3.5"
32
31
  spec.add_development_dependency "factory_bot", "~> 4.10"
33
- spec.add_development_dependency "faker", "~> 1.9"
32
+ spec.add_development_dependency "faker"
34
33
  end
data/lib/ferto/client.rb CHANGED
@@ -26,7 +26,7 @@ module Ferto
26
26
  # the service to make.
27
27
  attr_reader :aggr_limit
28
28
 
29
- # @param [Hash{Symbol => String, Fixnum}]
29
+ # @param opts [Hash{Symbol => String, Fixnum}]
30
30
  # @option opts [String] :scheme
31
31
  # @option opts [String] :host
32
32
  # @option opts [String] :path
@@ -47,30 +47,67 @@ module Ferto
47
47
 
48
48
  # Sends a request to Downloader and returns its reply.
49
49
  #
50
+ # @param url [String] the resource to be downloaded
51
+ # @param callback_type [String]
52
+ # @param callback_dst [String] the callback destination
53
+ # @param callback_error_type [String]
54
+ # @param callback_error_dst [String] the callback destination in case the job fails
55
+ # @param mime_type [String] (default: "") accepted MIME types for the
56
+ # resource
57
+ # @param aggr_id [String] aggregation identifier
58
+ # @param aggr_limit [Integer] aggregation concurrency limit
59
+ # @param aggr_proxy [String] the HTTP proxy to use for downloading the
60
+ # resource, by default no proxy is used. The proxy is set up on
61
+ # aggregation level and it cannot be updated for an existing aggregation.
62
+ # @param download_timeout [Integer] the maximum time to wait for the
63
+ # resource to be downloaded in seconds, by default there is no timeout
64
+ # @param user_agent [String] the User-Agent string to use for
65
+ # downloading the resource, by default it uses the User-Agent string
66
+ # set in the downloader's configuration
67
+ # @param request_headers [Hash] the request headers that will be used
68
+ # in downloader when performing the actual request in order to fetch
69
+ # the desired resource
70
+ # @param subpath [String] the subfolder(s) that the jobs will be stored
71
+ # under the top level directory of storage backend
72
+ #
50
73
  # @example
51
- # downloader = Ferto::Client.new
52
- # dl_resp = downloader.download(
53
- # aggr_id: 'msystems',
54
- # aggr_limit: 3,
74
+ # client.download(
55
75
  # url: 'http://foo.bar/a.jpg',
56
76
  # callback_type: 'http',
57
- # callback_dst: 'http://example.com/downloads/myfile',
58
- # extra: { groupno: 'foobar' }
77
+ # callback_dst: 'http://myapp.com/handle-download',
78
+ # aggr_id: 'foo', aggr_limit: 3,
79
+ # download_timeout: 120,
80
+ # aggr_proxy: 'http://myproxy.com/',
81
+ # user_agent: 'my-useragent',
82
+ # mime_type: "image/jpeg",
83
+ # request_headers: { "Accept" => "image/*,*/*;q=0.8" },
84
+ # extra: { something: 'someone' }
59
85
  # )
60
86
  #
61
- # @raise [Ferto::ConnectionError] if the client failed to connect to the
62
- # downloader API
87
+ # @raise [Ferto::ConnectionError] if there was an error scheduling the
88
+ # job to downloader because a Curl ConnectionFailedError occurred
89
+ # @raise [Ferto::ResponseError] if a response code of 40X or 50X is received
63
90
  #
64
91
  # @return [Ferto::Response]
92
+ #
93
+ # @see https://github.com/skroutz/downloader/#post-download
65
94
  def download(aggr_id:, aggr_limit: @aggr_limit, url:,
66
- callback_url: "", callback_dst: "",
67
- callback_type: "", mime_type: "", extra: {})
95
+ aggr_proxy: nil, download_timeout: nil, user_agent: nil,
96
+ callback_url: "", callback_dst: "", callback_type: "",
97
+ callback_error_type: "", callback_error_dst: "",
98
+ mime_type: "", extra: {},
99
+ request_headers: {},
100
+ s3_bucket: nil, s3_region: nil, subpath: nil)
68
101
  uri = URI::HTTP.build(
69
102
  scheme: scheme, host: host, port: port, path: path
70
103
  )
71
104
  body = build_body(
72
- aggr_id, aggr_limit, url, callback_url, callback_type, callback_dst,
73
- mime_type, extra
105
+ aggr_id, aggr_limit, url,
106
+ callback_url, callback_type, callback_dst,
107
+ callback_error_type, callback_error_dst,
108
+ aggr_proxy, download_timeout, user_agent,
109
+ mime_type, extra, request_headers,
110
+ s3_bucket, s3_region, subpath
74
111
  )
75
112
  # Curl.post reuses the same handler
76
113
  begin
@@ -79,6 +116,14 @@ module Ferto
79
116
  handle.connect_timeout = connect_timeout
80
117
  handle.timeout = timeout
81
118
  end
119
+
120
+ case res.response_code
121
+ when 400..599
122
+ error_msg = ("An error occured during the download call. " \
123
+ "Received a #{res.response_code} response code and body " \
124
+ "#{res.body_str}")
125
+ raise Ferto::ResponseError.new(error_msg, res)
126
+ end
82
127
  rescue Curl::Err::ConnectionFailedError => e
83
128
  raise Ferto::ConnectionError.new(e)
84
129
  end
@@ -96,13 +141,27 @@ module Ferto
96
141
  end
97
142
 
98
143
  def build_body(aggr_id, aggr_limit, url, callback_url, callback_type,
99
- callback_dst, mime_type, extra)
144
+ callback_dst, callback_error_type, callback_error_dst,
145
+ aggr_proxy, download_timeout, user_agent,
146
+ mime_type, extra, request_headers,
147
+ s3_bucket, s3_region, subpath)
100
148
  body = {
101
149
  aggr_id: aggr_id,
102
150
  aggr_limit: aggr_limit,
103
151
  url: url
104
152
  }
105
153
 
154
+ if s3_bucket && s3_region
155
+ body[:s3_bucket] = s3_bucket
156
+ body[:s3_region] = s3_region
157
+ end
158
+
159
+ if !s3_bucket && s3_region
160
+ raise ArgumentError, "s3_region provided without an s3_bucket"
161
+ elsif !s3_region && s3_bucket
162
+ raise ArgumentError, "s3_bucket provided without an s3_region"
163
+ end
164
+
106
165
  if callback_url.empty?
107
166
  body[:callback_type] = callback_type
108
167
  body[:callback_dst] = callback_dst
@@ -110,14 +169,35 @@ module Ferto
110
169
  body[:callback_url] = callback_url
111
170
  end
112
171
 
172
+ body[:callback_error_type] = callback_error_type unless callback_error_type.to_s.empty?
173
+ body[:callback_error_dst] = callback_error_dst unless callback_error_dst.to_s.empty?
174
+
113
175
  if !mime_type.empty?
114
176
  body[:mime_type] = mime_type
115
177
  end
116
178
 
179
+ body[:subpath] = subpath if subpath
180
+ body[:aggr_proxy] = aggr_proxy if aggr_proxy
181
+ body[:download_timeout] = download_timeout if download_timeout
182
+ body[:user_agent] = user_agent if user_agent
183
+
117
184
  if !extra.nil?
118
185
  body[:extra] = extra.is_a?(Hash) ? extra.to_json : extra.to_s
119
186
  end
120
187
 
188
+ # We will continue to expose the user_agent field just like tools
189
+ # like curl and wget do. Along with that we will follow their paradigm
190
+ # where if both a user-agent flag and a `User-Agent` in the request headers
191
+ # are provided then the user agent in the request headers is preferred.
192
+ #
193
+ # Also if the `user_agent` is provided but the request headers do not
194
+ # contain a `User-Agent` key, then the `user_agent` is copied to the headers
195
+ if user_agent && !request_headers.key?("User-Agent")
196
+ request_headers["User-Agent"] = user_agent
197
+ end
198
+
199
+ body[:request_headers] = request_headers
200
+
121
201
  body
122
202
  end
123
203
  end
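Two of the rules added to `build_body` are easy to misread in diff form: the User-Agent precedence and the paired S3 arguments. The sketch below restates both as standalone helpers so they can be run without the gem. The helper names `merge_user_agent` and `check_s3_args` are hypothetical and do not exist in ferto. The first helper also differs from the diffed code in one respect: it works on a copy, whereas `build_body` writes into the `request_headers` hash it was given.

```ruby
# Hypothetical helper mirroring the User-Agent rule in build_body.
# Assumption: it duplicates the hash, so the caller's headers stay untouched.
def merge_user_agent(user_agent, request_headers)
  headers = request_headers.dup
  # An explicit "User-Agent" header wins; user_agent only fills the gap.
  if user_agent && !headers.key?("User-Agent")
    headers["User-Agent"] = user_agent
  end
  headers
end

# Hypothetical helper mirroring the S3 argument check in build_body:
# bucket and region must be given together or not at all.
def check_s3_args(s3_bucket, s3_region)
  if !s3_bucket && s3_region
    raise ArgumentError, "s3_region provided without an s3_bucket"
  elsif !s3_region && s3_bucket
    raise ArgumentError, "s3_bucket provided without an s3_region"
  end
  true
end

# Both a user_agent and a User-Agent header: the header is kept.
explicit = merge_user_agent("my-useragent", { "User-Agent" => "from-headers" })

# Only a user_agent: it is copied into the headers.
fallback = merge_user_agent("my-useragent", { "Accept" => "image/*" })
```

Note that the lookup is an exact, case-sensitive match on the string key `"User-Agent"`, so a header supplied as `"user-agent"` would not count as present under this rule.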
data/lib/ferto/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Ferto
2
- VERSION = "0.0.3"
2
+ VERSION = "0.1.0"
3
3
  end
data/lib/ferto.rb CHANGED
@@ -16,4 +16,25 @@ module Ferto
16
16
  }.freeze
17
17
 
18
18
  class ConnectionError < StandardError; end
19
+
20
+ # A custom error class for 40X and 50X responses
21
+ class ResponseError < StandardError
22
+
23
+ # Initialize a Ferto::ResponseError
24
+ #
25
+ # @param [String] err A string describing the error that occurred
26
+ # @param [Curl::Easy | nil] response a Curl::Easy object
27
+ # that represents the response returned by the download method.
28
+ # Default: nil
29
+ def initialize(err, response=nil)
30
+ super(err)
31
+ @response = response
32
+ end
33
+
34
+ # response is set during the download when a
35
+ # 40X or 50X response is returned, so that it
36
+ # can be used for debugging; it is also
37
+ # included for completeness.
38
+ attr_reader :response
39
+ end
19
40
  end
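Since `Ferto::ResponseError` is labelled `[BREAKING]` in the changelog, callers that previously inspected the returned response now need a `rescue`. The following is a rough sketch of that calling pattern, written so it runs without ferto or curb installed. `FertoSketch::ResponseError` is a stand-in copied from the class in the hunk above, while `FakeResponse` and `check_response!` are hypothetical names that model only the `response_code` and `body_str` readers the diffed code relies on.

```ruby
# Stand-in for the class added in lib/ferto.rb, copied here so the sketch
# runs without the gem installed.
module FertoSketch
  class ResponseError < StandardError
    attr_reader :response

    def initialize(err, response = nil)
      super(err)
      @response = response
    end
  end
end

# Hypothetical stand-in for the Curl::Easy object returned by the request.
FakeResponse = Struct.new(:response_code, :body_str)

# Mirrors the status check added to Ferto::Client#download:
# any 4xx or 5xx code raises, everything else is passed through.
def check_response!(res)
  case res.response_code
  when 400..599
    raise FertoSketch::ResponseError.new(
      "Received a #{res.response_code} response code and body #{res.body_str}",
      res
    )
  end
  res
end

begin
  check_response!(FakeResponse.new(503, "unavailable"))
rescue FertoSketch::ResponseError => e
  # The original response travels with the exception for further handling.
  failed_code = e.response.response_code
  failed_body = e.response.body_str
end
```

With the real gem the `rescue` clause would name `Ferto::ResponseError` and wrap the `client.download(...)` call; connection failures still surface separately as `Ferto::ConnectionError`.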
metadata CHANGED
@@ -1,43 +1,29 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: ferto
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.3
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Aggelos Avgerinos
8
- autorequire:
8
+ autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2019-04-08 00:00:00.000000000 Z
11
+ date: 2023-06-16 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: curb
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
17
+ - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: '0.9'
19
+ version: '0'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - "~>"
24
+ - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: '0.9'
27
- - !ruby/object:Gem::Dependency
28
- name: bundler
29
- requirement: !ruby/object:Gem::Requirement
30
- requirements:
31
- - - "~>"
32
- - !ruby/object:Gem::Version
33
- version: '1.13'
34
- type: :development
35
- prerelease: false
36
- version_requirements: !ruby/object:Gem::Requirement
37
- requirements:
38
- - - "~>"
39
- - !ruby/object:Gem::Version
40
- version: '1.13'
26
+ version: '0'
41
27
  - !ruby/object:Gem::Dependency
42
28
  name: rake
43
29
  requirement: !ruby/object:Gem::Requirement
@@ -98,16 +84,16 @@ dependencies:
98
84
  name: faker
99
85
  requirement: !ruby/object:Gem::Requirement
100
86
  requirements:
101
- - - "~>"
87
+ - - ">="
102
88
  - !ruby/object:Gem::Version
103
- version: '1.9'
89
+ version: '0'
104
90
  type: :development
105
91
  prerelease: false
106
92
  version_requirements: !ruby/object:Gem::Requirement
107
93
  requirements:
108
- - - "~>"
94
+ - - ">="
109
95
  - !ruby/object:Gem::Version
110
- version: '1.9'
96
+ version: '0'
111
97
  description: Ruby API client for Downloader service
112
98
  email:
113
99
  - avgerinos@skroutz.gr
@@ -115,15 +101,15 @@ executables: []
115
101
  extensions: []
116
102
  extra_rdoc_files: []
117
103
  files:
104
+ - ".github/workflows/CI.yml"
118
105
  - ".gitignore"
119
- - ".travis.yml"
106
+ - CHANGELOG.md
120
107
  - Gemfile
121
108
  - LICENSE.txt
122
109
  - README.md
123
110
  - Rakefile
124
111
  - bin/console
125
112
  - bin/setup
126
- - ferto-0.1.0.gem
127
113
  - ferto.gemspec
128
114
  - lib/ferto.rb
129
115
  - lib/ferto/callback.rb
@@ -135,7 +121,7 @@ homepage: https://github.com/skroutz/ferto
135
121
  licenses:
136
122
  - GPL-3.0
137
123
  metadata: {}
138
- post_install_message:
124
+ post_install_message:
139
125
  rdoc_options: []
140
126
  require_paths:
141
127
  - lib
@@ -150,9 +136,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
150
136
  - !ruby/object:Gem::Version
151
137
  version: '0'
152
138
  requirements: []
153
- rubyforge_project:
154
- rubygems_version: 2.5.2
155
- signing_key:
139
+ rubygems_version: 3.4.14
140
+ signing_key:
156
141
  specification_version: 4
157
142
  summary: Ruby API client for Downloader
158
143
  test_files: []
data/.travis.yml DELETED
@@ -1,5 +0,0 @@
1
- sudo: false
2
- language: ruby
3
- rvm:
4
- - 2.3
5
- - 2.4
data/ferto-0.1.0.gem DELETED
Binary file