httpdisk 0.3.0 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0e3b6af20bf5a6b76a23a6d170328c040cf3f0715e012c392739b70ab6911bee
4
- data.tar.gz: 46385301b89bc19f2e8862bcc9bf48b0f1b14fafbd2fa4fc8542b7bf4dcfe930
3
+ metadata.gz: 7b1e66859e069e390390dd87fac3753ee43501613884bfca2d7f40f70a224274
4
+ data.tar.gz: 8de194055b6fc7ac858305eb739dc25494b139413d36decbbbaeaa00ce8f1da0
5
5
  SHA512:
6
- metadata.gz: c4b163648a7adf00ae2ae103ccef899f5a3b958d8e38eeb56fc0d609f0bae73a19eed4fb7d5921c583b4f6d2755bd70d5cf1da748a6d8e858f5195da1683d009
7
- data.tar.gz: 3270ce507204023e25451048fd0ddecc713ce6c637abfe795c2f159d619328892393bb8fa73249f08199a30e9db895fb2e2b6c440bfae0dcf07abd277851d050
6
+ metadata.gz: 4594b3eb07cd683a901883484ce88ed3be63f4a65a1c2d9c1a2eb80c9791606ceaaf7287ca74ae35d5fa5015e3218133a7e1199111233841f58c67696edc3f0e
7
+ data.tar.gz: 5d9f0dd6f3d8407e4b5133ec407d4aa56b6c442d42679c7715f9519b60a791ebb212e11c3c8474ad92bf4dd9e2d5d8daaded2eadcff4b210a6fa5b1e0245677b
data/.rubocop.yml CHANGED
@@ -15,9 +15,11 @@ Naming/MethodParameterName: { Enabled: false }
15
15
  Naming/VariableNumber: { Enabled: false }
16
16
  Style/Documentation: { Enabled: false }
17
17
  Style/DoubleNegation: { Enabled: false }
18
+ Style/EmptyCaseCondition: { Enabled: false }
18
19
  Style/FrozenStringLiteralComment: { Enabled: false }
19
20
  Style/IfUnlessModifier: { Enabled: false }
20
21
  Style/NegatedIf: { Enabled: false }
22
+ Style/NumericPredicate: { Enabled: false }
21
23
  Style/ParallelAssignment: { Enabled: false }
22
24
  Style/StderrPuts: { Enabled: false }
23
25
  Style/TrailingCommaInArrayLiteral: { EnforcedStyleForMultiline: consistent_comma }
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- httpdisk (0.3.0)
4
+ httpdisk (0.4.0)
5
5
  faraday (~> 1.4)
6
6
  faraday-cookie_jar (~> 0.0)
7
7
  faraday_middleware (~> 1.0)
@@ -18,7 +18,9 @@ GEM
18
18
  rexml
19
19
  domain_name (0.5.20190701)
20
20
  unf (>= 0.0.5, < 1.0.0)
21
- faraday (1.4.1)
21
+ faraday (1.4.2)
22
+ faraday-em_http (~> 1.0)
23
+ faraday-em_synchrony (~> 1.0)
22
24
  faraday-excon (~> 1.1)
23
25
  faraday-net_http (~> 1.0)
24
26
  faraday-net_http_persistent (~> 1.1)
@@ -27,6 +29,8 @@ GEM
27
29
  faraday-cookie_jar (0.0.7)
28
30
  faraday (>= 0.8.0)
29
31
  http-cookie (~> 1.0.0)
32
+ faraday-em_http (1.0.0)
33
+ faraday-em_synchrony (1.0.0)
30
34
  faraday-excon (1.1.0)
31
35
  faraday-net_http (1.0.1)
32
36
  faraday-net_http_persistent (1.1.0)
data/README.md CHANGED
@@ -125,7 +125,7 @@ In general, if you make a request it will be cached regardless of the outcome.
125
125
  httpdisk supports a few options:
126
126
 
127
127
  - `dir:` location for disk cache, defaults to `~/httpdisk`
128
- - `expires_in:` when to expire cached requests, default is nil (never expire)
128
+ - `expires:` when to expire cached requests, default is nil (never expire)
129
129
  - `force:` don't read anything from cache (but still write)
130
130
  - `force_errors:` don't read errors from cache (but still write)
131
131
  - `ignore_params:` array of query params to ignore when calculating cache_key
@@ -135,7 +135,7 @@ Pass these in when setting up Faraday:
135
135
 
136
136
  ```ruby
137
137
  faraday = Faraday.new do
138
- _1.use :httpdisk, expires_in: 7*24*60*60, force: true
138
+ _1.use :httpdisk, expires: 7*24*60*60, force: true
139
139
  end
140
140
  ```
141
141
 
@@ -163,10 +163,13 @@ Specific to httpdisk:
163
163
  --force don't read anything from cache (but still write)
164
164
  --force-errors don't read errors from cache (but still write)
165
165
  --status show status for a url in the cache
166
- --version show version
167
- --help show this help
168
166
  ```
169
167
 
168
+ ## Goodies: httpdisk-grep
169
+
170
+ The `httpdisk-grep` command makes it easy to search your cache directory.
171
+ It can be challenging to use grep/ripgrep because cache files are compressed and JSON bodies often lack newlines. httpdisk-grep is the right tool for the job. See `httpdisk-grep --help`.
172
+
170
173
  ## Limitations & Gotchas
171
174
 
172
175
  - Transient errors are cached. This is appropriate for many uses cases (like crawling) but can be confusing. Use `httpdisk --status` to debug.
@@ -177,7 +180,13 @@ Specific to httpdisk:
177
180
 
178
181
  ## Changelog
179
182
 
180
- #### 0.3 (unreleased)
183
+ #### 0.4
184
+
185
+ - added httpdisk-grep for searching cache files
186
+ - added HTTPDisk::Cache#delete
187
+ - rename `:expires_in` to `:expires`
188
+
189
+ #### 0.3
181
190
 
182
191
  - added :ignore_params, for ignoring query params when generating cache keys
183
192
  - HTTP 40x & 50x responses return :error status (and respond to `force_error`)
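The changelog above renames `:expires_in` to `:expires`, so callers that still pass the old key would need updating. A minimal sketch of a compatibility shim follows, assuming a hypothetical helper named `migrate_httpdisk_options` (not part of httpdisk) that is applied to an options hash before it is handed to `_1.use :httpdisk, ...`:

```ruby
# Hypothetical helper, not part of httpdisk: translate the pre-0.4 option
# name (:expires_in) into the 0.4 name (:expires).
def migrate_httpdisk_options(options)
  options = options.dup
  if options.key?(:expires_in)
    legacy = options.delete(:expires_in)
    # Prefer an explicit :expires if the caller supplied both keys.
    options[:expires] = legacy unless options.key?(:expires)
  end
  options
end

one_week = 7 * 24 * 60 * 60
p migrate_httpdisk_options(expires_in: one_week, force: true)
```

The helper copies the hash rather than mutating it, so the caller's original options are left untouched.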
data/Rakefile CHANGED
@@ -10,13 +10,15 @@ spec = Gem::Specification.load('httpdisk.gemspec')
10
10
  #
11
11
 
12
12
  # test (default)
13
- Rake::TestTask.new { _1.libs << 'test' }
13
+ Rake::TestTask.new do
14
+ _1.libs << 'test'
15
+ _1.warning = false # https://github.com/lostisland/faraday/issues/1285
16
+ end
14
17
  task default: :test
15
18
 
16
19
  # Watch rb files, run tests whenever something changes
17
20
  task :watch do
18
- # https://superuser.com/a/665208 / https://unix.stackexchange.com/a/42288
19
- system("while true; do find . -name '*.rb' | entr -c -d rake; test $? -gt 128 && break; done")
21
+ sh "find . -name '*.rb' | entr -c rake"
20
22
  end
21
23
 
22
24
  #
@@ -24,7 +26,7 @@ end
24
26
  #
25
27
 
26
28
  task :pry do
27
- system 'pry -I lib -r httpdisk.rb'
29
+ sh 'pry -I lib -r httpdisk.rb'
28
30
  end
29
31
 
30
32
  #
data/bin/httpdisk CHANGED
@@ -6,20 +6,22 @@
6
6
 
7
7
  $LOAD_PATH.unshift(File.join(__dir__, '../lib'))
8
8
 
9
+ BIN = File.basename($PROGRAM_NAME)
10
+
9
11
  def puts_error(s)
10
- $stderr.puts "httpdisk: #{s}"
12
+ $stderr.puts "#{BIN}: #{s}"
11
13
  end
12
14
 
13
15
  #
14
16
  # Load the bare minimum and parse args with slop. We do this separately for speed.
15
17
  #
16
18
 
17
- require 'httpdisk/cli_slop'
19
+ require 'httpdisk/cli/args'
18
20
  begin
19
- slop = HTTPDisk::CliSlop.slop(ARGV)
21
+ slop = HTTPDisk::Cli::Args.slop(ARGV)
20
22
  rescue Slop::Error => e
21
23
  puts_error(e) if e.message != ''
22
- puts_error("try 'httpdisk --help' for more information")
24
+ puts_error("try '#{BIN} --help' for more information")
23
25
  exit 1
24
26
  end
25
27
 
@@ -28,11 +30,11 @@ end
28
30
  #
29
31
 
30
32
  require 'httpdisk'
31
- cli = HTTPDisk::Cli.new(slop)
33
+ main = HTTPDisk::Cli::Main.new(slop)
32
34
  begin
33
- cli.run
35
+ main.run
34
36
  rescue StandardError => e
35
- puts_error(e) if !cli.options[:silent]
37
+ puts_error(e) if !main.options[:silent]
36
38
  if ENV['HTTPDISK_DEBUG']
37
39
  $stderr.puts
38
40
  $stderr.puts e.backtrace.join("\n")
data/bin/httpdisk-grep ADDED
@@ -0,0 +1,46 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ #
4
+ # Search an HTTPDisk cache, similar to grep.
5
+ #
6
+
7
+ $LOAD_PATH.unshift(File.join(__dir__, '../lib'))
8
+
9
+ BIN = File.basename($PROGRAM_NAME)
10
+
11
+ def puts_error(s)
12
+ $stderr.puts "#{BIN}: #{s}"
13
+ end
14
+
15
+ #
16
+ # Load the bare minimum and parse args with slop. We do this separately for speed.
17
+ #
18
+
19
+ require 'httpdisk/grep/args'
20
+ begin
21
+ slop = HTTPDisk::Grep::Args.slop(ARGV)
22
+ rescue Slop::Error => e
23
+ puts_error(e) if e.message != ''
24
+ puts_error("try '#{BIN} --help' for more information")
25
+ exit 1
26
+ end
27
+
28
+ #
29
+ # now load everything and run
30
+ #
31
+
32
+ require 'httpdisk'
33
+
34
+ main = HTTPDisk::Grep::Main.new(slop)
35
+ begin
36
+ success = main.run
37
+ exit 1 if !success
38
+ rescue StandardError => e
39
+ puts_error(e)
40
+ if ENV['HTTPDISK_DEBUG']
41
+ $stderr.puts
42
+ $stderr.puts e.class
43
+ $stderr.puts e.backtrace.join("\n")
44
+ end
45
+ exit 2
46
+ end
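The script above exits with 1 when `main.run` returns a falsy value and with 2 when an error is raised; otherwise it falls through to the default status of 0. This resembles grep's own convention. A minimal sketch of that mapping, using a hypothetical helper `exit_status_for` that is not part of the gem:

```ruby
# Hypothetical helper illustrating the exit-status convention of the script
# above: 0 when matches were found, 1 when none were, 2 on error.
def exit_status_for
  found = yield
  found ? 0 : 1
rescue StandardError
  2
end

puts exit_status_for { true }          # matches found
puts exit_status_for { nil }           # no matches
puts exit_status_for { raise 'boom' }  # failure while searching
```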
data/lib/httpdisk.rb CHANGED
@@ -1,13 +1,17 @@
1
1
  require 'httpdisk/cache_key'
2
2
  require 'httpdisk/cache'
3
- require 'httpdisk/cli_slop'
4
- require 'httpdisk/cli'
5
3
  require 'httpdisk/client'
6
4
  require 'httpdisk/error'
7
5
  require 'httpdisk/payload'
6
+ require 'httpdisk/slop_duration'
8
7
  require 'httpdisk/sloptions'
9
8
  require 'httpdisk/version'
10
9
 
11
- module HTTPDisk
12
- ERROR_STATUS = 999
13
- end
10
+ # cli
11
+ require 'httpdisk/cli/args'
12
+ require 'httpdisk/cli/main'
13
+
14
+ # grep
15
+ require 'httpdisk/grep/args'
16
+ require 'httpdisk/grep/main'
17
+ require 'httpdisk/grep/printer'
data/lib/httpdisk/cache.rb CHANGED
@@ -9,7 +9,7 @@ module HTTPDisk
9
9
  @options = options
10
10
  end
11
11
 
12
- %i[dir expires_in force force_errors].each do |method|
12
+ %i[dir expires force force_errors].each do |method|
13
13
  define_method(method) do
14
14
  options[method]
15
15
  end
@@ -38,6 +38,12 @@ module HTTPDisk
38
38
  Zlib::GzipWriter.open(path) { payload.write(_1) }
39
39
  end
40
40
 
41
+ # Delete existing response, if any
42
+ def delete(cache_key)
43
+ path = diskpath(cache_key)
44
+ FileUtils.rm(path) if File.exist?(path)
45
+ end
46
+
41
47
  # Relative path for this cache_key based on the cache key
42
48
  def diskpath(cache_key)
43
49
  File.join(dir, cache_key.diskpath)
@@ -61,7 +67,7 @@ module HTTPDisk
61
67
 
62
68
  # Is this path expired?
63
69
  def expired?(path)
64
- expires_in && File.stat(path).mtime < Time.now - expires_in
70
+ expires && File.stat(path).mtime < Time.now - expires
65
71
  end
66
72
  end
67
73
  end
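The expiry check and the new `delete` method in the hunks above both operate on plain files. The following is a rough standalone sketch of the same idea, assuming free-standing functions `expired?` and `delete_cached` and a made-up file name; the real methods live on the cache class and derive the path from a cache key:

```ruby
require 'fileutils'
require 'tmpdir'

# Simplified stand-in for the cache logic in the diff: a file counts as
# expired when its mtime is older than `expires` seconds; nil never expires.
def expired?(path, expires)
  !!(expires && File.stat(path).mtime < Time.now - expires)
end

# Remove a cached file if it exists (same guard as the delete method above).
def delete_cached(path)
  FileUtils.rm(path) if File.exist?(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'response.gz') # hypothetical cache file
  File.write(path, 'cached')

  # Backdate the file by two hours.
  two_hours_ago = Time.now - (2 * 60 * 60)
  File.utime(two_hours_ago, two_hours_ago, path)

  puts expired?(path, 60 * 60)      # compared against a one hour limit
  puts expired?(path, 24 * 60 * 60) # compared against a one day limit
  puts expired?(path, nil)          # nil means never expire

  delete_cached(path)
  delete_cached(path) # second call is a no-op rather than an error
  puts File.exist?(path)
end
```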
data/lib/httpdisk/cli/args.rb ADDED
@@ -0,0 +1,57 @@
1
+ # manually load dependencies here since this is loaded standalone by bin
2
+ require 'httpdisk/error'
3
+ require 'httpdisk/slop_duration'
4
+ require 'httpdisk/version'
5
+ require 'slop'
6
+
7
+ module HTTPDisk
8
+ module Cli
9
+ # Slop parsing. This is broken out so we can run without require 'httpdisk'.
10
+ module Args
11
+ def self.slop(args)
12
+ slop = Slop.parse(args) do |o|
13
+ o.banner = 'httpdisk [options] [url]'
14
+
15
+ # similar to curl
16
+ o.separator 'Similar to curl:'
17
+ o.string '-d', '--data', 'HTTP POST data'
18
+ o.array '-H', '--header', 'pass custom header(s) to server', delimiter: nil
19
+ o.boolean '-i', '--include', 'include response headers in the output'
20
+ o.integer '-m', '--max-time', 'maximum time allowed for the transfer'
21
+ o.string '-o', '--output', 'write to file instead of stdout'
22
+ o.string '-x', '--proxy', 'use host[:port] as proxy'
23
+ o.string '-X', '--request', 'HTTP method to use'
24
+ o.integer '--retry', 'retry request if problems occur'
25
+ o.boolean '-s', '--silent', "silent mode (don't print errors)"
26
+ o.string '-A', '--user-agent', 'send User-Agent to server'
27
+
28
+ # from httpdisk
29
+ o.separator 'Specific to httpdisk:'
30
+ o.string '--dir', 'httpdisk cache directory (defaults to ~/httpdisk)'
31
+ o.duration '--expires', 'when to expire cached requests (ex: 1h, 2d, 3w)'
32
+ o.boolean '--force', "don't read anything from cache (but still write)"
33
+ o.boolean '--force-errors', "don't read errors from cache (but still write)"
34
+ o.boolean '--status', 'show status for a url in the cache'
35
+
36
+ # generic
37
+ o.boolean '--version', 'show version' do
38
+ puts "httpdisk #{HTTPDisk::VERSION}"
39
+ exit
40
+ end
41
+ o.on '--help', 'show this help' do
42
+ puts o
43
+ exit
44
+ end
45
+ end
46
+
47
+ raise Slop::Error, '' if args.empty?
48
+ raise Slop::Error, 'no URL specified' if slop.args.empty?
49
+ raise Slop::Error, 'more than one URL specified' if slop.args.length > 1
50
+
51
+ slop.to_h.tap do
52
+ _1[:url] = slop.args.first
53
+ end
54
+ end
55
+ end
56
+ end
57
+ end
data/lib/httpdisk/cli/main.rb ADDED
@@ -0,0 +1,166 @@
1
+ require 'faraday-cookie_jar'
2
+ require 'faraday_middleware'
3
+ require 'ostruct'
4
+
5
+ module HTTPDisk
6
+ module Cli
7
+ # Command line httpdisk command.
8
+ class Main
9
+ attr_reader :options
10
+
11
+ def initialize(options)
12
+ @options = options
13
+ end
14
+
15
+ # Make the request (or print status)
16
+ def run
17
+ # short circuit --status
18
+ if options[:status]
19
+ status
20
+ return
21
+ end
22
+
23
+ # create Faraday client
24
+ faraday = create_faraday
25
+
26
+ # run request
27
+ response = faraday.run_request(request_method, request_url, request_body, request_headers)
28
+ if response.status >= 400
29
+ raise CliError, "the requested URL returned error: #{response.status} #{response.reason_phrase}"
30
+ end
31
+
32
+ # output
33
+ if options[:output]
34
+ File.open(options[:output], 'w') { output(response, _1) }
35
+ else
36
+ output(response, $stdout)
37
+ end
38
+ end
39
+
40
+ def create_faraday
41
+ Faraday.new do
42
+ # connection settings
43
+ _1.proxy = options[:proxy] if options[:proxy]
44
+ _1.options.timeout = options[:max_time] if options[:max_time]
45
+
46
+ # cookie middleware
47
+ _1.use :cookie_jar
48
+
49
+ # BEFORE httpdisk so each redirect segment is cached
50
+ _1.response :follow_redirects
51
+
52
+ # httpdisk
53
+ _1.use :httpdisk, client_options
54
+
55
+ # AFTER httpdisk so transient failures are not cached
56
+ if options[:retry]
57
+ # we have a very liberal retry policy
58
+ retry_options = {
59
+ max: options[:retry],
60
+ methods: %w[delete get head options patch post put trace],
61
+ retry_statuses: (500..600).to_a,
62
+ retry_if: ->(_env, _err) { true },
63
+ }
64
+ _1.request :retry, retry_options
65
+ end
66
+ end
67
+ end
68
+
69
+ # Support for --status
70
+ def status
71
+ # build env
72
+ env = Faraday::Env.new.tap do
73
+ _1.method = request_method
74
+ _1.request_body = request_body
75
+ _1.request_headers = request_headers
76
+ _1.url = request_url
77
+ end
78
+
79
+ # now print status
80
+ client = HTTPDisk::Client.new(nil, client_options)
81
+ client.status(env).each do
82
+ puts "#{_1}: #{_2.inspect}"
83
+ end
84
+ end
85
+
86
+ # Output response to f
87
+ def output(response, f)
88
+ if options[:include]
89
+ f.puts "HTTPDISK #{response.status} #{response.reason_phrase}"
90
+ response.headers.each { f.puts("#{_1}: #{_2}") }
91
+ f.puts
92
+ end
93
+ f.write(response.body)
94
+ end
95
+
96
+ #
97
+ # request_XXX
98
+ #
99
+
100
+ # HTTP method (get, post, etc.)
101
+ def request_method
102
+ method = if options[:request]
103
+ options[:request]
104
+ elsif options[:data]
105
+ 'post'
106
+ end
107
+ method ||= 'get'
108
+ method = method.downcase.to_sym
109
+
110
+ if !Faraday::Connection::METHODS.include?(method)
111
+ raise CliError, "invalid --request #{method.inspect}"
112
+ end
113
+
114
+ method
115
+ end
116
+
117
+ # Request url
118
+ def request_url
119
+ url = options[:url]
120
+ # recover from missing http:
121
+ if url !~ %r{^https?://}i
122
+ if url =~ %r{^\w+://}
123
+ raise CliError, 'only http/https supported'
124
+ end
125
+
126
+ url = "http://#{url}"
127
+ end
128
+ URI.parse(url)
129
+ rescue URI::InvalidURIError
130
+ raise CliError, "invalid url #{url.inspect}"
131
+ end
132
+
133
+ # Request body
134
+ def request_body
135
+ options[:data]
136
+ end
137
+
138
+ # Request headers
139
+ def request_headers
140
+ {}.tap do |headers|
141
+ if options[:user_agent]
142
+ headers['User-Agent'] = options[:user_agent]
143
+ end
144
+
145
+ options[:header].each do |header|
146
+ key, value = header.split(': ', 2)
147
+ if !key || !value || key.empty? || value.empty?
148
+ raise CliError, "invalid --header #{header.inspect}"
149
+ end
150
+
151
+ headers[key] = value
152
+ end
153
+ end
154
+ end
155
+
156
+ #
157
+ # helpers
158
+ #
159
+
160
+ # Options to HTTPDisk::Client
161
+ def client_options
162
+ options.slice(:dir, :expires, :force, :force_errors)
163
+ end
164
+ end
165
+ end
166
+ end
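The `request_url` method above recovers from a missing scheme and rejects anything that is not http or https. A standalone sketch of that behaviour follows, assuming a hypothetical function `normalize_url`; `ArgumentError` stands in for `CliError`, and the patterns are anchored with `\A` rather than `^`, which is a choice of this sketch:

```ruby
require 'uri'

# Sketch of the URL handling in request_url: add "http://" when the scheme is
# missing and reject non-HTTP schemes. ArgumentError stands in for CliError.
def normalize_url(url)
  if url !~ %r{\Ahttps?://}i
    raise ArgumentError, 'only http/https supported' if url =~ %r{\A\w+://}

    url = "http://#{url}"
  end
  URI.parse(url)
rescue URI::InvalidURIError
  raise ArgumentError, "invalid url #{url.inspect}"
end

puts normalize_url('example.com/path')
puts normalize_url('https://example.com/a')
```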
data/lib/httpdisk/client.rb CHANGED
@@ -9,7 +9,7 @@ module HTTPDisk
9
9
  def initialize(app, options = {})
10
10
  options = Sloptions.parse(options) do
11
11
  _1.string :dir, default: File.join(ENV['HOME'], 'httpdisk')
12
- _1.integer :expires_in
12
+ _1.integer :expires
13
13
  _1.boolean :force
14
14
  _1.boolean :force_errors
15
15
  _1.array :ignore_params, default: []
data/lib/httpdisk/error.rb CHANGED
@@ -1,3 +1,5 @@
1
1
  module HTTPDisk
2
+ ERROR_STATUS = 999
3
+
2
4
  class CliError < StandardError; end
3
5
  end
data/lib/httpdisk/grep/args.rb ADDED
@@ -0,0 +1,35 @@
1
+ # manually load dependencies here since this is loaded standalone by bin
2
+ require 'httpdisk/version'
3
+ require 'slop'
4
+
5
+ module HTTPDisk
6
+ module Grep
7
+ module Args
8
+ # Slop parsing. This is broken out so we can run without require 'httpdisk'.
9
+ def self.slop(args)
10
+ slop = Slop.parse(args) do |o|
11
+ o.banner = 'httpdisk-grep [options] pattern [path ...]'
12
+ o.boolean '-c', '--count', 'suppress normal output and show count'
13
+ o.boolean '-h', '--head', 'show req headers before each match'
14
+ o.boolean '-s', '--silent', 'do not print anything to stdout'
15
+ o.boolean '--version', 'show version' do
16
+ puts "httpdisk-grep #{HTTPDisk::VERSION}"
17
+ exit
18
+ end
19
+ o.on '--help', 'show this help' do
20
+ puts o
21
+ exit
22
+ end
23
+ end
24
+
25
+ raise Slop::Error, '' if args.empty?
26
+ raise Slop::Error, 'no PATTERN specified' if slop.args.empty?
27
+
28
+ slop.to_h.tap do
29
+ _1[:pattern] = slop.args.shift
30
+ _1[:roots] = slop.args
31
+ end
32
+ end
33
+ end
34
+ end
35
+ end
data/lib/httpdisk/grep/main.rb ADDED
@@ -0,0 +1,110 @@
1
+ require 'find'
2
+ require 'json'
3
+
4
+ module HTTPDisk
5
+ module Grep
6
+ class Main
7
+ attr_reader :options, :success
8
+
9
+ def initialize(options)
10
+ @options = options
11
+ end
12
+
13
+ # Enumerate file paths one at a time. Returns true if matches were found.
14
+ def run
15
+ paths.each do
16
+ begin
17
+ run_one(_1)
18
+ rescue StandardError => e
19
+ if ENV['HTTPDISK_DEBUG']
20
+ $stderr.puts
21
+ $stderr.puts e.class
22
+ $stderr.puts e.backtrace.join("\n")
23
+ end
24
+ raise CliError, "#{e.message[0, 70]} (#{_1})"
25
+ end
26
+ end
27
+ success
28
+ end
29
+
30
+ def run_one(path)
31
+ # read payload & body
32
+ payload = Zlib::GzipReader.open(path) { Payload.read(_1) }
33
+ body = prepare_body(payload)
34
+
35
+ # collect all_matches
36
+ all_matches = body.each_line.map do |line|
37
+ [].tap do |matches|
38
+ line.scan(pattern) { matches << Regexp.last_match }
39
+ end
40
+ end.reject(&:empty?)
41
+ return if all_matches.empty?
42
+
43
+ # print
44
+ @success = true
45
+ printer.print(path, payload, all_matches)
46
+ end
47
+
48
+ # file paths to be searched
49
+ def paths
50
+ # roots
51
+ roots = options[:roots]
52
+ roots = ['.'] if roots.empty?
53
+
54
+ # find files in roots
55
+ paths = roots.flat_map { Find.find(_1).to_a }.sort
56
+ paths = paths.select { File.file?(_1) }
57
+
58
+ # strip default './'
59
+ paths = paths.map { _1.gsub(%r{^\./}, '') } if options[:roots].empty?
60
+ paths
61
+ end
62
+
63
+ # convert raw body into something palatable for pattern matching
64
+ def prepare_body(payload)
65
+ body = payload.body
66
+
67
+ if content_type = payload.headers['Content-Type']
68
+ # Mismatches between Content-Type and body.encoding are fatal, so make
69
+ # an effort to align them.
70
+ if charset = content_type[/charset=([^;]+)/, 1]
71
+ encoding = begin
72
+ Encoding.find(charset)
73
+ rescue StandardError
74
+ nil
75
+ end
76
+ if encoding && body.encoding != encoding
77
+ body.force_encoding(encoding)
78
+ end
79
+ end
80
+
81
+ # pretty print json for easier searching
82
+ if content_type =~ /\bjson\b/
83
+ body = JSON.pretty_generate(JSON.parse(body))
84
+ end
85
+ end
86
+
87
+ body
88
+ end
89
+
90
+ # regex pattern from options
91
+ def pattern
92
+ @pattern ||= Regexp.new(options[:pattern], Regexp::IGNORECASE)
93
+ end
94
+
95
+ # printer for output
96
+ def printer
97
+ @printer ||= case
98
+ when options[:silent]
99
+ Grep::SilentPrinter.new
100
+ when options[:count]
101
+ Grep::CountPrinter.new($stdout)
102
+ when options[:head] || $stdout.tty?
103
+ Grep::HeaderPrinter.new($stdout, options[:head])
104
+ else
105
+ Grep::TersePrinter.new($stdout)
106
+ end
107
+ end
108
+ end
109
+ end
110
+ end
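The matching step in `run_one` and the JSON handling in `prepare_body` above can be tried in isolation. This is a rough sketch under stated assumptions: `matches_by_line` and its `json:` keyword are hypothetical names, and the body is a made-up string rather than a payload read from a gzipped cache file:

```ruby
require 'json'

# Sketch of the matching step: optionally pretty-print JSON so each field
# lands on its own line, then collect MatchData objects per line.
def matches_by_line(body, pattern, json: false)
  body = JSON.pretty_generate(JSON.parse(body)) if json
  body.each_line.map do |line|
    [].tap do |matches|
      line.scan(pattern) { matches << Regexp.last_match }
    end
  end.reject(&:empty?)
end

body = '{"name":"httpdisk","tags":["cache","http"]}' # made-up payload body
pattern = Regexp.new('http', Regexp::IGNORECASE)

matches_by_line(body, pattern, json: true).each do |matches|
  puts matches.first.string
end
```

Pretty-printing first matters because a compact JSON body is a single line, so a line-oriented search would otherwise report the whole document for any hit.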
data/lib/httpdisk/grep/printer.rb ADDED
@@ -0,0 +1,99 @@
1
+ module HTTPDisk
2
+ module Grep
3
+ class Printer
4
+ GREP_COLOR = '37;45'.freeze
5
+
6
+ attr_reader :output
7
+
8
+ def initialize(output)
9
+ @output = output
10
+ end
11
+
12
+ def print(path, payload, all_matches); end
13
+
14
+ protected
15
+
16
+ #
17
+ # helpers for subclasses
18
+ #
19
+
20
+ def grep_color
21
+ @grep_color ||= (ENV['GREP_COLOR'] || GREP_COLOR)
22
+ end
23
+
24
+ def print_matches(matches)
25
+ s = matches.first.string
26
+ if output.tty?
27
+ s = [].tap do |result|
28
+ ii = 0
29
+ matches.each do
30
+ result << s[ii..._1.begin(0)]
31
+ result << "\e["
32
+ result << grep_color
33
+ result << 'm'
34
+ result << _1[0]
35
+ result << "\e[0m"
36
+ ii = _1.end(0)
37
+ end
38
+ result << s[ii..]
39
+ end.join
40
+ end
41
+ output.puts s
42
+ end
43
+ end
44
+
45
+ #
46
+ # subclasses
47
+ #
48
+
49
+ # path:count
50
+ class CountPrinter < Printer
51
+ def print(path, _payload, all_matches)
52
+ output.puts "#{path}:#{all_matches.length}"
53
+ end
54
+ end
55
+
56
+ # header, then each match
57
+ class HeaderPrinter < Printer
58
+ attr_reader :head, :printed
59
+
60
+ def initialize(output, head)
61
+ super(output)
62
+ @head = head
63
+ @printed = 0
64
+ end
65
+
66
+ def print(path, payload, all_matches)
67
+ # separator & filename
68
+ output.puts if (@printed += 1) > 1
69
+ output.puts path
70
+
71
+ # --head
72
+ if head
73
+ io = StringIO.new
74
+ payload.write_header(io)
75
+ io.string.lines.each { output.puts "< #{_1}" }
76
+ end
77
+
78
+ # matches
79
+ all_matches.each { print_matches(_1) }
80
+ end
81
+ end
82
+
83
+ class SilentPrinter < Printer
84
+ def initialize
85
+ super(nil)
86
+ end
87
+ end
88
+
89
+ # each match as path:match
90
+ class TersePrinter < Printer
91
+ def print(path, _payload, all_matches)
92
+ all_matches.each do
93
+ output.write("#{path}:")
94
+ print_matches(_1)
95
+ end
96
+ end
97
+ end
98
+ end
99
+ end
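The `print_matches` helper above rebuilds each line with ANSI escape sequences around every match when the output is a terminal. A minimal sketch of the same string assembly, assuming a hypothetical function `highlight` that takes the line and pattern directly instead of a list of `MatchData` objects:

```ruby
# Sketch of the highlighting in print_matches: wrap every match in ANSI
# escape codes, using the same default color as the diff ('37;45').
def highlight(line, pattern, color = '37;45')
  result = +''
  cursor = 0
  line.scan(pattern) do
    m = Regexp.last_match
    result << line[cursor...m.begin(0)] # text before this match
    result << "\e[#{color}m" << m[0] << "\e[0m"
    cursor = m.end(0)
  end
  result << line[cursor..] # text after the last match
end

puts highlight('GET http://example.com/http', /http/)
```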
data/lib/httpdisk/payload.rb CHANGED
@@ -43,7 +43,7 @@ module HTTPDisk
43
43
  status >= 400
44
44
  end
45
45
 
46
- def write(f)
46
+ def write_header(f)
47
47
  # comment
48
48
  f.puts "# #{comment}"
49
49
 
@@ -52,9 +52,11 @@ module HTTPDisk
52
52
 
53
53
  # headers
54
54
  headers.each { f.puts("#{_1}: #{_2}") }
55
- f.puts
55
+ end
56
56
 
57
- # body
57
+ def write(f)
58
+ write_header(f)
59
+ f.puts
58
60
  f.write(body)
59
61
  end
60
62
  end
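The hunks above split the old `write` into `write_header` plus a thin `write`, which lets the grep printer emit headers without the body. A simplified sketch of that split follows; `SketchPayload`, its field names, and the status line format are assumptions of this example, since the diff shows only part of the real class:

```ruby
require 'stringio'

# Simplified stand-in for the payload class; the field names and the status
# line format are assumptions made for this sketch.
SketchPayload = Struct.new(:comment, :status, :reason_phrase, :headers, :body) do
  # Everything up to (but not including) the blank separator line.
  def write_header(f)
    f.puts "# #{comment}"
    f.puts "HTTPDISK #{status} #{reason_phrase}"
    headers.each { |key, value| f.puts("#{key}: #{value}") }
  end

  # Full payload: header, blank line, then the body.
  def write(f)
    write_header(f)
    f.puts
    f.write(body)
  end
end

payload = SketchPayload.new('GET http://example.com', 200, 'OK',
                            { 'Content-Type' => 'text/plain' }, 'hello')
header = StringIO.new
payload.write_header(header)
header.string.lines.each { |line| puts "< #{line}" }
```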
data/lib/httpdisk/slop_duration.rb ADDED
@@ -0,0 +1,24 @@
1
+ require 'slop'
2
+
3
+ module Slop
4
+ # Custom duration type for Slop, used for --expires. Raises aggressively
5
+ # because this is a tricky and lightly documented option.
6
+ class DurationOption < Option
7
+ UNITS = {
8
+ s: 1,
9
+ m: 60,
10
+ h: 60 * 60,
11
+ d: 24 * 60 * 60,
12
+ w: 7 * 24 * 60 * 60,
13
+ y: 365 * 7 * 24 * 60 * 60,
14
+ }.freeze
15
+
16
+ def call(value)
17
+ m = value.match(/^(\d+)([smhdwy])?$/)
18
+ raise Slop::Error, "invalid --expires #{value.inspect}" if !m
19
+
20
+ num, unit = m[1].to_i, (m[2] || 's').to_sym
21
+ num * UNITS[unit]
22
+ end
23
+ end
24
+ end
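The duration option above converts strings such as `1h`, `2d` or `3w` into seconds. A standalone sketch of the same parsing without Slop follows, assuming a hypothetical function `parse_duration` and using `ArgumentError` in place of `Slop::Error`. Note that the `UNITS` table in the diff computes `y` as `365 * 7 * 24 * 60 * 60`, which works out to 365 weeks; this sketch assumes a year of 365 days instead:

```ruby
# Standalone sketch of the duration parsing, without Slop. ArgumentError
# stands in for Slop::Error.
UNIT_SECONDS = {
  's' => 1,
  'm' => 60,
  'h' => 60 * 60,
  'd' => 24 * 60 * 60,
  'w' => 7 * 24 * 60 * 60,
  'y' => 365 * 24 * 60 * 60, # assumption of this sketch: one year = 365 days
}.freeze

def parse_duration(value)
  m = value.match(/\A(\d+)([smhdwy])?\z/)
  raise ArgumentError, "invalid duration #{value.inspect}" if !m

  # A bare number is treated as seconds.
  m[1].to_i * UNIT_SECONDS[m[2] || 's']
end

%w[90 1h 2d 3w].each { |s| puts "#{s} => #{parse_duration(s)}" }
```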
data/lib/httpdisk/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module HTTPDisk
2
- VERSION = '0.3.0'.freeze
2
+ VERSION = '0.4.0'.freeze
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: httpdisk
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.0
4
+ version: 0.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Adam Doppelt
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2021-05-20 00:00:00.000000000 Z
11
+ date: 2021-06-05 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: faraday
@@ -70,6 +70,7 @@ description: httpdisk works with faraday to aggressively cache responses on disk
70
70
  email: amd@gurge.com
71
71
  executables:
72
72
  - httpdisk
73
+ - httpdisk-grep
73
74
  extensions: []
74
75
  extra_rdoc_files: []
75
76
  files:
@@ -82,16 +83,21 @@ files:
82
83
  - README.md
83
84
  - Rakefile
84
85
  - bin/httpdisk
86
+ - bin/httpdisk-grep
85
87
  - examples.rb
86
88
  - httpdisk.gemspec
87
89
  - lib/httpdisk.rb
88
90
  - lib/httpdisk/cache.rb
89
91
  - lib/httpdisk/cache_key.rb
90
- - lib/httpdisk/cli.rb
91
- - lib/httpdisk/cli_slop.rb
92
+ - lib/httpdisk/cli/args.rb
93
+ - lib/httpdisk/cli/main.rb
92
94
  - lib/httpdisk/client.rb
93
95
  - lib/httpdisk/error.rb
96
+ - lib/httpdisk/grep/args.rb
97
+ - lib/httpdisk/grep/main.rb
98
+ - lib/httpdisk/grep/printer.rb
94
99
  - lib/httpdisk/payload.rb
100
+ - lib/httpdisk/slop_duration.rb
95
101
  - lib/httpdisk/sloptions.rb
96
102
  - lib/httpdisk/version.rb
97
103
  - logo.svg
data/lib/httpdisk/cli.rb DELETED
@@ -1,223 +0,0 @@
1
- require 'faraday-cookie_jar'
2
- require 'faraday_middleware'
3
- require 'ostruct'
4
-
5
- module HTTPDisk
6
- # Command line httpdisk command.
7
- class Cli
8
- attr_reader :options
9
-
10
- # for --expires
11
- UNITS = {
12
- s: 1,
13
- m: 60,
14
- h: 60 * 60,
15
- d: 24 * 60 * 60,
16
- w: 7 * 24 * 60 * 60,
17
- y: 365 * 7 * 24 * 60 * 60,
18
- }.freeze
19
-
20
- def initialize(options)
21
- @options = options
22
- end
23
-
24
- # we have a very liberal retry policy
25
- RETRY_OPTIONS = {
26
- methods: %w[delete get head options patch post put trace],
27
- retry_statuses: (400..600).to_a,
28
- retry_if: ->(_env, _err) { true },
29
- }.freeze
30
-
31
- # Make the request (or print status)
32
- def run
33
- # short circuit --status
34
- if options[:status]
35
- status
36
- return
37
- end
38
-
39
- # create Faraday client
40
- faraday = create_faraday
41
-
42
- # run request
43
- response = faraday.run_request(request_method, request_url, request_body, request_headers)
44
- if response.status >= 400
45
- raise CliError, "the requested URL returned error: #{response.status} #{response.reason_phrase}"
46
- end
47
-
48
- # output
49
- if options[:output]
50
- File.open(options[:output], 'w') { output(response, _1) }
51
- else
52
- output(response, $stdout)
53
- end
54
- end
55
-
56
- def create_faraday
57
- Faraday.new do
58
- # connection settings
59
- _1.proxy = proxy if options[:proxy]
60
- _1.options.timeout = options[:max_time] if options[:max_time]
61
-
62
- # cookie middleware
63
- _1.use :cookie_jar
64
-
65
- # BEFORE httpdisk so each redirect segment is cached
66
- _1.response :follow_redirects
67
-
68
- # httpdisk
69
- _1.use :httpdisk, client_options
70
-
71
- # AFTER httpdisk so transient failures are not cached
72
- if options[:retry]
73
- _1.request :retry, RETRY_OPTIONS.merge(max: options[:retry])
74
- end
75
- end
76
- end
77
-
78
- # Support for --status
79
- def status
80
- # build env
81
- env = Faraday::Env.new.tap do
82
- _1.method = request_method
83
- _1.request_body = request_body
84
- _1.request_headers = request_headers
85
- _1.url = request_url
86
- end
87
-
88
- # now print status
89
- client = HTTPDisk::Client.new(nil, client_options)
90
- client.status(env).each do
91
- puts "#{_1}: #{_2.inspect}"
92
- end
93
- end
94
-
95
- # Output response to f
96
- def output(response, f)
97
- if options[:include]
98
- f.puts "HTTPDISK #{response.status} #{response.reason_phrase}"
99
- response.headers.each { f.puts("#{_1}: #{_2}") }
100
- f.puts
101
- end
102
- f.write(response.body)
103
- end
104
-
105
- #
106
- # request_XXX
107
- #
108
-
109
- # HTTP method (get, post, etc.)
110
- def request_method
111
- method = if options[:request]
112
- options[:request]
113
- elsif options[:data]
114
- 'post'
115
- end
116
- method ||= 'get'
117
- method = method.downcase.to_sym
118
-
119
- if !Faraday::Connection::METHODS.include?(method)
120
- raise CliError, "invalid --request #{method.inspect}"
121
- end
122
-
123
- method
124
- end
125
-
126
- # Request url
127
- def request_url
128
- url = options[:url]
129
- # recover from missing http:
130
- if url !~ %r{^https?://}i
131
- if url =~ %r{^\w+://}
132
- raise CliError, 'only http/https supported'
133
- end
134
-
135
- url = "http://#{url}"
136
- end
137
- URI.parse(url)
138
- rescue URI::InvalidURIError
139
- raise CliError, "invalid url #{url.inspect}"
140
- end
141
-
142
- # Request body
143
- def request_body
144
- options[:data]
145
- end
146
-
147
- # Request headers
148
- def request_headers
149
- {}.tap do |headers|
150
- if options[:user_agent]
151
- headers['User-Agent'] = options[:user_agent]
152
- end
153
-
154
- options[:header].each do |header|
155
- key, value = header.split(': ', 2)
156
- if !key || !value || key.empty? || value.empty?
157
- raise CliError, "invalid --header #{header.inspect}"
158
- end
159
-
160
- headers[key] = value
161
- end
162
- end
163
- end
164
-
165
- #
166
- # helpers
167
- #
168
-
169
- # Options to HTTPDisk::Client
170
- def client_options
171
- {}.tap do |client_options|
172
- client_options[:dir] = options[:dir]
173
- if options[:expires]
174
- seconds = parse_expires(options[:expires])
175
- if !seconds
176
- raise CliError, "invalid --expires #{options[:expires].inspect}"
177
- end
178
-
179
- client_options[:expires_in] = seconds
180
- end
181
- client_options[:force] = options[:force]
182
- client_options[:force_errors] = options[:force_errors]
183
- end
184
- end
185
-
186
- # Return validated --proxy flag if present
187
- def proxy
188
- return if !options[:proxy]
189
-
190
- proxy = parse_proxy(options[:proxy])
191
- raise CliError, "--proxy should be host[:port], not #{options[:proxy].inspect}" if !proxy
192
-
193
- proxy
194
- end
195
-
196
- # Parse --expires flag
197
- def parse_expires(s)
198
- m = s.match(/^(\d+)([smhdwy])?$/)
199
- return if !m
200
-
201
- num, unit = m[1].to_i, (m[2] || 's').to_sym
202
- return if !UNITS.key?(unit)
203
-
204
- num * UNITS[unit]
205
- end
206
-
207
- # Parse --proxy flag
208
- def parse_proxy(proxy_flag)
209
- host, port = proxy_flag.split(':', 2)
210
- return if !host || host.empty?
211
- return if port&.empty?
212
-
213
- URI.parse('http://placeholder').tap do
214
- begin
215
- _1.host = host
216
- _1.port = port if port
217
- rescue URI::InvalidComponentError
218
- return
219
- end
220
- end.to_s
221
- end
222
- end
223
- end
data/lib/httpdisk/cli_slop.rb DELETED
@@ -1,54 +0,0 @@
1
- # manually load dependencies here since this is loaded standalone by bin
2
- require 'httpdisk/error'
3
- require 'httpdisk/version'
4
- require 'slop'
5
-
6
- module HTTPDisk
7
- # Slop parsing. This is broken out so we can run without require 'httpdisk'.
8
- module CliSlop
9
- def self.slop(args)
10
- slop = Slop.parse(args) do |o|
11
- o.banner = 'httpdisk [options] [url]'
12
-
13
- # similar to curl
14
- o.separator 'Similar to curl:'
15
- o.string '-d', '--data', 'HTTP POST data'
16
- o.array '-H', '--header', 'pass custom header(s) to server', delimiter: nil
17
- o.boolean '-i', '--include', 'include response headers in the output'
18
- o.integer '-m', '--max-time', 'maximum time allowed for the transfer'
19
- o.string '-o', '--output', 'write to file instead of stdout'
20
- o.string '-x', '--proxy', 'use host[:port] as proxy'
21
- o.string '-X', '--request', 'HTTP method to use'
22
- o.integer '--retry', 'retry request if problems occur'
23
- o.boolean '-s', '--silent', "silent mode (don't print errors)"
24
- o.string '-A', '--user-agent', 'send User-Agent to server'
25
-
26
- # from httpdisk
27
- o.separator 'Specific to httpdisk:'
28
- o.string '--dir', 'httpdisk cache directory (defaults to ~/httpdisk)'
29
- o.string '--expires', 'when to expire cached requests (ex: 1h, 2d, 3w)'
30
- o.boolean '--force', "don't read anything from cache (but still write)"
31
- o.boolean '--force-errors', "don't read errors from cache (but still write)"
32
- o.boolean '--status', 'show status for a url in the cache'
33
-
34
- # generic
35
- o.boolean '--version', 'show version' do
36
- puts "httpdisk #{HTTPDisk::VERSION}"
37
- exit
38
- end
39
- o.on '--help', 'show this help' do
40
- puts o
41
- exit
42
- end
43
- end
44
-
45
- raise Slop::Error, '' if args.empty?
46
- raise Slop::Error, 'no URL specified' if slop.args.empty?
47
- raise Slop::Error, 'more than one URL specified' if slop.args.length > 1
48
-
49
- slop.to_h.tap do
50
- _1[:url] = slop.args.first
51
- end
52
- end
53
- end
54
- end