curlyq 0.0.7 → 0.0.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 92b27e3065435d17fd5d6129bd640481dee8acf66f9ed82b2b68ae6a7589f463
4
- data.tar.gz: a5cd5299248fd01d8f12a80a1c9982de4f8e0160ea5033ab96b60677bb9ea2c8
3
+ metadata.gz: 44e01914de08789721522e24e506fe88a49106610fac9a16736efcba0916be88
4
+ data.tar.gz: ec2887fee0dab67c64c0095f59091e867445b22559674b31f2eea64d8f4b9fea
5
5
  SHA512:
6
- metadata.gz: 1348b97fdf89faf44cd0cfc0f2aecc05a679606f19fe57392d588209da26fc3a5c2407569173d41e1497bdf409d762adee0d2533089b5ce27854e298fe98cc13
7
- data.tar.gz: c80ecd381e1d941d8e8e5ead0dd925682e26a2f8cc639f0202d5b5cd30f025582fc5d5a2665daca3e5e7d0f2099066d3ca3d9ad8ad00b07d2ee5673b122bae01
6
+ metadata.gz: aa8338482e3d9414e6347195abc7ba8645adbc120f13469f219df7f2aa5fa1ba3c209740c608f728539969502965f6c29afc9d65b5ba411c8387b26ebd640c9d
7
+ data.tar.gz: 4fe071cb872a259795163da084851afdb5d003f7607a82a5ab6fe868f7f2edae22f0caa31327dfe1bad31e28e48fddde1857d882344359db8bf47b8579aef22c
data/CHANGELOG.md CHANGED
@@ -1,3 +1,23 @@
1
+ ### 0.0.9
2
+
3
+ 2024-01-16 12:38
4
+
5
+ #### IMPROVED
6
+
7
+ - You can now use dot syntax inside of a square bracket comparison in --query (`[attrs.id*=what]`)
8
+ - *=, ^=, $=, and == work with array values
9
+ - [] comparisons with no comparison, e.g. [attrs.id], will return every match that has that element populated
10
+
11
+ ### 0.0.8
12
+
13
+ 2024-01-15 16:45
14
+
15
+ #### IMPROVED
16
+
17
+ - Dot syntax query can now operate on a full array using empty set []
18
+ - Dot syntax query should output a specific key, e.g. attrs[id*=news].content (work in progress)
19
+ - Dot query syntax handling touch-ups. Piping to jq is still more flexible, but the basics are there.
20
+
1
21
  ### 0.0.7
2
22
 
3
23
  2024-01-12 17:03
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- curlyq (0.0.7)
4
+ curlyq (0.0.9)
5
5
  gli (~> 2.21.0)
6
6
  nokogiri (~> 1.16.0)
7
7
  selenium-webdriver (~> 4.16.0)
data/README.md CHANGED
@@ -10,10 +10,13 @@ _If you find this useful, feel free to [buy me some coffee][donate]._
10
10
  [donate]: https://brettterpstra.com/donate
11
11
 
12
12
 
13
- The current version of `curlyq` is 0.0.7
13
+ [jq]: https://github.com/jqlang/jq "Command-line JSON processor"
14
+ [yq]: https://github.com/mikefarah/yq "yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor"
15
+
16
+ The current version of `curlyq` is 0.0.9
14
17
  .
15
18
 
16
- CurlyQ is a utility that provides a simple interface for curl, with additional features for things like extracting images and links, finding elements by CSS selector or XPath, getting detailed header info, and more. It's designed to be part of a scripting pipeline, outputting everything as structured data (JSON or YAML). It also has rudimentary support for making calls to JSON endpoints easier, but it's expected that you'll use something like `jq` to parse the output.
19
+ CurlyQ is a utility that provides a simple interface for curl, with additional features for things like extracting images and links, finding elements by CSS selector or XPath, getting detailed header info, and more. It's designed to be part of a scripting pipeline, outputting everything as structured data (JSON or YAML). It also has rudimentary support for making calls to JSON endpoints easier, but it's expected that you'll use something like [jq] to parse the output.
17
20
 
18
21
  [github]: https://github.com/ttscoff/curlyq/
19
22
 
@@ -44,7 +47,7 @@ SYNOPSIS
44
47
  curlyq [global options] command [command options] [arguments...]
45
48
 
46
49
  VERSION
47
- 0.0.7
50
+ 0.0.9
48
51
 
49
52
  GLOBAL OPTIONS
50
53
  --help - Show this message
@@ -71,6 +74,9 @@ You can shape the results using `--search` (`-s`) and `--query` (`-q`) on some c
71
74
 
72
75
  A search uses either CSS or XPath syntax to locate elements. For example, if you wanted to locate all of the `<article>` elements with a class of `post` inside of the div with an id of `main`, you would run `--search '#main article.post'`. Searches can target tags, ids, and classes, and can accept `>` to target direct descendants. You can also use XPaths, but I hate those so I'm not going to document them.
73
76
 
77
+ > I've tried to make the query function useful, but if you want to do any kind of advanced shaping, you're better off piping the JSON output to [jq] or [yq].
78
+
79
+
74
80
  Queries are specifically for shaping CurlyQ output. If you're using the `html` command, it returns a key called `images`, so you can target just the images in the response with `-q 'images'`. The queries accept array syntax, so to get the first image, you would use `-q 'images[0]'`. Ranges are accepted as well, so `-q 'images[1..4]'` will return the 2nd through 5th images found on the page. You can also do comparisons, e.g. `'images[rel=me]'` to target only images with a `rel` attribute of `me`.
75
81
 
76
82
  The comparisons for the query flag are:
@@ -84,6 +90,16 @@ The comparisons for the query flag are:
84
90
  - `^=` starts with text
85
91
  - `$=` ends with text
86
92
 
93
+ Comparisons can be numeric or string comparisons. A numeric comparison like `curlyq images -q '[width>500]' URL` would return all of the images on the page with a width attribute greater than 500.
94
+
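To make the numeric comparison described above concrete, here is a minimal Ruby sketch of the idea. It is not the gem's implementation: the sample `images` data, the `NUMERIC_OPS` table, and the `numeric_match?` helper are all hypothetical names introduced for illustration, and the attribute value is assumed to arrive as a string that can be converted with `to_i`.

```ruby
# Hypothetical sample data standing in for `curlyq images` output.
images = [
  { 'src' => 'a.png', 'width' => '320' },
  { 'src' => 'b.png', 'width' => '800' },
  { 'src' => 'c.png', 'width' => '1200' }
]

# Map query operators to Ruby comparison methods (illustrative only).
# A bare `=` is treated as numeric equality.
NUMERIC_OPS = {
  '>' => :>, '<' => :<, '>=' => :>=, '<=' => :<=, '=' => :==, '==' => :==
}.freeze

# Return true when item[key], read as an integer, satisfies `op val`.
def numeric_match?(item, key, op, val)
  return false unless item.key?(key)

  item[key].to_i.public_send(NUMERIC_OPS.fetch(op), val)
end

# Rough equivalent of -q '[width>500]'
wide = images.select { |img| numeric_match?(img, 'width', '>', 500) }
puts wide.map { |img| img['src'] }.inspect
```

The sketch dispatches through `public_send` rather than building a string to evaluate, which keeps the set of accepted operators explicit.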
95
+ You can also use dot syntax inside of comparisons, e.g. `[links.rel*=me]` to target the links object (`html` command), and return only the links with a `rel=me` attribute. If the comparison is to an array object (like `class` or `rel`), it will match if any of the elements of the array match your comparison.
96
+
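The "match if any element matches" rule for array values can be sketched as follows. This is an illustration of the documented behavior under stated assumptions, not the code shipped in the gem: `text_match?` and the sample `links` data are hypothetical, matching is assumed to be case-insensitive, and the search text is escaped so it is treated literally.

```ruby
# Return true when the value (a string or an array of strings) satisfies
# the text comparison. Arrays match if ANY element matches.
def text_match?(value, op, needle)
  pattern = Regexp.escape(needle)
  regex = case op
          when '^=' then /\A#{pattern}/i   # starts with
          when '$=' then /#{pattern}\z/i   # ends with
          when '*=' then /#{pattern}/i     # contains
          else /\A#{pattern}\z/i           # exact match
          end

  # Array() wraps a lone string in an array and turns nil into [].
  Array(value).any? { |v| v.to_s.match?(regex) }
end

# Hypothetical sample data standing in for the `links` key of `curlyq html`.
links = [
  { 'href' => 'https://example.com/about', 'rel' => %w[me nofollow] },
  { 'href' => 'https://example.com/feed', 'rel' => ['alternate'] },
  { 'href' => 'https://example.com/home', 'rel' => nil }
]

# Rough equivalent of -q '[links.rel*=me]'
matches = links.select { |link| text_match?(link['rel'], '*=', 'me') }
puts matches.map { |link| link['href'] }.inspect
```

Treating every value as a list via `Array()` lets strings and arrays share one code path, so a missing attribute simply never matches.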
97
+ If you end the query with a specific key, only that key will be output. If there's only one match, it will be output as a raw string. If there are multiple matches, output will be an array:
98
+
99
+ curlyq tags --search '#main .post h3' -q '[attrs.id*=what].source' 'https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/'
100
+
101
+ <h3 id="whats-next">What’s Next</h3>
102
+
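The output rule described above (a raw string for a single match, an array for several) can be sketched in a few lines of Ruby. This is a simplified illustration, not the gem's `dot_query`: the `pluck` helper and the sample `tags` data are hypothetical, only a "contains" comparison is modeled, and the dotted path is assumed to walk nested hashes.

```ruby
# Hypothetical sample data standing in for `curlyq tags` output.
tags = [
  { 'tag' => 'h3', 'attrs' => { 'id' => 'whats-next' },
    'source' => '<h3 id="whats-next">Next steps</h3>' },
  { 'tag' => 'h3', 'attrs' => { 'id' => 'installation' },
    'source' => '<h3 id="installation">Installation</h3>' }
]

# Keep items whose dotted path contains `needle` (case-insensitive),
# then return only `key` from each hit. A single hit is unwrapped to a
# bare value; multiple hits stay an array.
def pluck(items, path, needle, key)
  keys = path.split('.')
  hits = items.select do |item|
    item.dig(*keys).to_s.downcase.include?(needle.downcase)
  end
  values = hits.map { |item| item[key] }
  values.length == 1 ? values.first : values
end

# Rough equivalent of -q '[attrs.id*=what].source'
puts pluck(tags, 'attrs.id', 'what', 'source')
```

Callers that consume this kind of output in a script need to handle both shapes, since the result type depends on how many elements matched.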
87
103
  #### Commands
88
104
 
89
105
  curlyq makes use of subcommands, e.g. `curlyq html [options] URL` or `curlyq extract [options] URL`. Each subcommand takes its own options, but I've made an effort to standardize the choices between each command as much as possible.
@@ -440,7 +456,7 @@ COMMAND OPTIONS
440
456
 
441
457
  Return a hierarchy of all tags in a page. Use `-t` to limit to a specific tag.
442
458
 
443
- curlyq tags --search '#main .post h3' -q 'attrs[id*=what]' https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/
459
+ curlyq tags --search '#main .post h3' -q '[attrs.id*=what]' https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/
444
460
 
445
461
  [
446
462
  {
data/bin/curlyq CHANGED
@@ -130,13 +130,13 @@ command %i[html curl] do |c|
130
130
  out = res.parse(source)
131
131
 
132
132
  if options[:query]
133
- out = out.to_data(url: url, clean: options[:clean]).dot_query(options[:query])
133
+ out = out.to_data(url: url, clean: options[:clean]).dot_query(options[:query], full_tag: false)
134
134
  else
135
135
  out = out.to_data
136
136
  end
137
137
  output.push([out])
138
138
  elsif options[:query]
139
- queried = res.to_data.dot_query(options[:query])
139
+ queried = res.to_data.dot_query(options[:query], full_tag: false)
140
140
  output.push(queried) if queried
141
141
  else
142
142
  output.push(res.to_data(url: url))
@@ -147,6 +147,12 @@ command %i[html curl] do |c|
147
147
  # output = output[0] if output.count == 1
148
148
  output.map! { |o| o[options[:raw].to_sym] } if options[:raw]
149
149
 
150
+ if output.is_a?(Array)
151
+ while output.length == 1
152
+ output = output[0]
153
+ end
154
+ end
155
+
150
156
  print_out(output, global_options[:yaml], raw: options[:raw], pretty: global_options[:pretty])
151
157
  end
152
158
  end
@@ -342,9 +348,7 @@ command :tags do |c|
342
348
  out = out.dot_query(options[:query]) if options[:query]
343
349
  output.push(out)
344
350
  elsif options[:query]
345
- query = options[:query] =~ /^links/ ? options[:query] : "links#{options[:query]}"
346
-
347
- output = res.to_data.dot_query(query)
351
+ output = res.to_data.dot_query(options[:query])
348
352
  elsif tags.count.positive?
349
353
  tags.each { |tag| output.concat(res.tags(tag)) }
350
354
  else
@@ -352,7 +356,9 @@ command :tags do |c|
352
356
  end
353
357
  end
354
358
 
355
- output = output[0] if output.count == 1
359
+ while output.is_a?(Array) && output.count == 1
360
+ output = output[0]
361
+ end
356
362
 
357
363
  if options[:source]
358
364
  puts output.to_html
@@ -393,13 +399,13 @@ command :images do |c|
393
399
  res.curl
394
400
 
395
401
  res = res.images(types: types)
402
+ res = { images: res }.dot_query(options[:query], 'images', full_tag: false) if options[:query]
396
403
 
397
- if options[:query]
398
- query = options[:query] =~ /^images/ ? options[:query] : "images#{options[:query]}"
399
- res = { images: res }.dot_query(query)
404
+ if res.is_a?(Array)
405
+ output.concat(res)
406
+ else
407
+ output.push(res)
400
408
  end
401
-
402
- output.concat(res)
403
409
  end
404
410
 
405
411
  print_out(output, global_options[:yaml], pretty: global_options[:pretty])
@@ -439,9 +445,9 @@ command :links do |c|
439
445
  res.curl
440
446
 
441
447
  if options[:query]
442
- query = options[:query] =~ /^links/ ? options[:query] : "links#{options[:query]}"
443
- queried = res.to_data.dot_query(query)
444
- output.concat(queried) if queried
448
+ queried = res.to_data.dot_query(options[:query], 'links', full_tag: false)
449
+
450
+ queried.is_a?(Array) ? output.concat(queried) : output.push(queried) if queried
445
451
  else
446
452
  output.concat(res.body_links)
447
453
  end
@@ -469,9 +475,8 @@ command :headlinks do |c|
469
475
  res.curl
470
476
 
471
477
  if options[:query]
472
- query = options[:query] =~ /^links/ ? options[:query] : "links#{options[:query]}"
473
- queried = { links: res.to_data[:meta_links] }.dot_query(query)
474
- output.concat(queried) if queried
478
+ queried = { links: res.to_data[:meta_links] }.dot_query(options[:query], 'links', full_tag: false)
479
+ output.push(queried) if queried
475
480
  else
476
481
  output.push(res.to_data[:meta_links])
477
482
  end
@@ -516,10 +521,10 @@ command :scrape do |c|
516
521
  if options[:search]
517
522
  out = res.search(options[:search])
518
523
 
519
- out = out.dot_query(options[:query]) if options[:query]
524
+ out = out.dot_query(options[:query], full_tag: false) if options[:query]
520
525
  output.push(out)
521
526
  elsif options[:query]
522
- queried = res.to_data(url: url).dot_query(options[:query])
527
+ queried = res.to_data(url: url).dot_query(options[:query], full_tag: false)
523
528
  output.push(queried) if queried
524
529
  else
525
530
  output.push(res.to_data(url: url))
data/lib/curly/array.rb CHANGED
@@ -74,20 +74,18 @@ class ::Array
74
74
  ## @return [Array] elements matching dot query
75
75
  ##
76
76
  def dot_query(path)
77
- filter! do |tag|
78
- r = tag.dot_query(path)
79
- if r.is_a?(Array)
80
- r.count.positive?
81
- else
82
- r
83
- end
84
- end
77
+ res = map { |el| el.dot_query(path) }
78
+ res.delete_if { |r| !r }
79
+ res.delete_if(&:empty?)
80
+ res
81
+ end
85
82
 
86
- return self
83
+ def get_value(path)
84
+ map { |el| el.get_value(path) }
87
85
  end
88
86
 
89
87
  def to_html
90
- map { |el| el.to_html }
88
+ map(&:to_html)
91
89
  end
92
90
 
93
91
  ##
data/lib/curly/hash.rb CHANGED
@@ -29,24 +29,62 @@ class ::Hash
29
29
  end
30
30
  end
31
31
 
32
+ def get_value(query)
33
+ return nil if self.empty?
34
+ stringify_keys!
35
+
36
+ query.split('.').inject(self) do |v, k|
37
+ if v.is_a? Array
38
+ return v.map { |el| el.get_value(k) }
39
+ end
40
+ # k = k.to_i if v.is_a? Array
41
+ next unless v.key?(k)
42
+
43
+ v.fetch(k)
44
+ end
45
+ end
46
+
32
47
  # Extract data using a dot-syntax path
33
48
  #
34
49
  # @param path [String] The path
35
50
  #
36
51
  # @return Result of path query
37
52
  #
38
- def dot_query(path)
53
+ def dot_query(path, root = nil, full_tag: true)
39
54
  res = stringify_keys
55
+ res = res[root] unless root.nil?
56
+
57
+ unless path =~ /\[/
58
+ return res.get_value(path)
59
+ end
40
60
 
61
+ path.gsub!(/\[(.*?)\]/) do
62
+ inter = Regexp.last_match(1).gsub(/\./, '%')
63
+ "[#{inter}]"
64
+ end
65
+
66
+ enumerate = false
41
67
  out = []
42
68
  q = path.split(/(?<![\d.])\./)
43
- q.each do |pth|
44
- el = Regexp.last_match(1) if pth =~ /\[([0-9,.]+)\]/
45
- pth.sub!(/\[([0-9,.]+)\]/, '')
69
+
70
+ while q.count.positive?
71
+ pth = q.shift
72
+ pth.gsub!(/%/, '.')
73
+
74
+ return nil if res.nil?
75
+
76
+ unless pth =~ /\[/
77
+ return res.get_value(pth)
78
+ end
79
+
80
+ el = Regexp.last_match(1) if pth =~ /\[([0-9,.]+)?\]/
81
+ pth.sub!(/\[([0-9,.]+)?\]/, '')
82
+
46
83
  ats = []
47
84
  at = []
48
- while pth =~ /\[[+&,]?\w+ *[\^*$=<>]=? *\w+/
49
- m = pth.match(/\[(?<com>[,+&])? *(?<key>\w+) *(?<op>[\^*$=<>]{1,2}) *(?<val>[^,&\]]+) */)
85
+ while pth =~ /\[[+&,]?[\w.]+( *[\^*$=<>]=? *\w+)?/
86
+ m = pth.match(/\[(?<com>[,+&])? *(?<key>[\w.]+)( *(?<op>[\^*$=<>]{1,2}) *(?<val>[^,&\]]+))? */)
87
+
50
88
  comp = [m['key'], m['op'], m['val']]
51
89
  case m['com']
52
90
  when ','
@@ -56,16 +94,32 @@ class ::Hash
56
94
  at.push(comp)
57
95
  end
58
96
 
59
- pth.sub!(/\[(?<com>[,&+])? *(?<key>\w+) *(?<op>[\^*$=<>]{1,2}) *(?<val>[^,&\]]+)/, '[')
97
+ pth.sub!(/\[(?<com>[,&+])? *(?<key>[\w.]+)( *(?<op>[\^*$=<>]{1,2}) *(?<val>[^,&\]]+))?/, '[')
60
98
  end
61
99
  ats.push(at) unless at.empty?
62
100
  pth.sub!(/\[\]/, '')
63
101
 
64
- res = res[0] if res.is_a?(Array)
102
+ res = res[0] if res.is_a?(Array) && res.count == 1
103
+ if ats.empty? && el.nil? && res.is_a?(Array) && res[0]&.key?(pth)
104
+ res.map! { |r| r[pth] }
105
+ next
106
+ end
107
+
108
+ res.map!(&:stringify_keys) if res.is_a?(Array) && res[0].is_a?(Hash)
109
+ # if res.is_a?(String) || (res.is_a?(Array) && res[0].is_a?(String))
110
+ # out.push(res)
111
+ # next
112
+ # end
65
113
 
66
- return false if el.nil? && ats.empty? && !res.key?(pth)
114
+ # if res.is_a?(Array) && !pth.nil?
115
+ # return res.delete_if { |r| !r.key?(pth) }
116
+ # else
117
+ # return false if el.nil? && ats.empty? && res.is_a?(Hash) && (res.nil? || !res.key?(pth))
118
+ # end
119
+ tag = res
120
+ res = res[pth] unless pth.nil? || pth.empty?
67
121
 
68
- res = res[pth] unless pth.empty?
122
+ pth = ''
69
123
 
70
124
  return false if res.nil?
71
125
 
@@ -73,22 +127,49 @@ class ::Hash
73
127
  while ats.count.positive?
74
128
  atr = ats.shift
75
129
  res = [res] if res.is_a?(Hash)
76
- keepers = res.filter do |r|
77
- evaluate_comp(r, atr)
130
+ res.each do |r|
131
+ out.push(full_tag ? tag : r) if evaluate_comp(r, atr)
78
132
  end
79
-
80
- out.concat(keepers)
81
133
  end
82
134
  else
83
135
  out = res
84
136
  end
85
137
 
86
- out = out[eval(el)] if out.is_a?(Array) && el =~ /^[\d.,]+$/
138
+ out = out.get_value(pth) unless pth.nil?
139
+
140
+ if el.nil? && out.is_a?(Array) && out[0].is_a?(Hash)
141
+ out.map! { |o|
142
+ o.stringify_keys
143
+ # o.key?(pth) ? o[pth] : o
144
+ }
145
+ elsif out.is_a?(Array) && el =~ /^[\d.,]+$/
146
+ out = out[eval(el)]
147
+ end
148
+ res = out
87
149
  end
88
150
 
151
+ out = out[0] if out&.count == 1
89
152
  out
90
153
  end
91
154
 
155
+ def array_match(array, key, comp)
156
+ keep = false
157
+ array.each do |el|
158
+ keep = case comp
159
+ when /^\^/
160
+ key =~ /^#{el}/i ? true : false
161
+ when /^\$/
162
+ key =~ /#{el}$/i ? true : false
163
+ when /^\*/
164
+ key =~ /#{el}/i ? true : false
165
+ else
166
+ key =~ /^#{el}$/i ? true : false
167
+ end
168
+ break if keep
169
+ end
170
+ keep
171
+ end
172
+
92
173
  ##
93
174
  ## Evaluate a comparison
94
175
  ##
@@ -112,39 +193,59 @@ class ::Hash
112
193
  else
113
194
  a[2]
114
195
  end
196
+ r = r.get_value(key.to_s) if key.to_s =~ /\./
197
+
198
+ if val.nil?
199
+ if r.is_a?(Hash)
200
+ return r.key?(key) && !r[key].nil? && !r[key].empty?
201
+ elsif r.is_a?(String)
202
+ return r.nil? ? false : true
203
+ elsif r.is_a?(Array)
204
+ return r.empty? ? false : true
205
+ end
206
+ end
115
207
 
116
- if !r.key?(key)
208
+ if r.nil?
117
209
  keep = false
118
- elsif r[key].is_a?(Array)
119
- valid = r[key].filter do |k|
120
- case a[1]
121
- when /^\^/
122
- k =~ /^#{a[2]}/i ? true : false
123
- when /^\$/
124
- k =~ /#{a[2]}$/i ? true : false
125
- when /^\*/
126
- k =~ /#{a[2]}/i ? true : false
210
+ elsif r.is_a?(Array)
211
+ valid = r.filter do |k|
212
+ if k.is_a? Array
213
+ array_match(k, a[2], a[1])
127
214
  else
128
- k =~ /^#{a[2]}$/i ? true : false
215
+ case a[1]
216
+ when /^\^/
217
+ k =~ /^#{a[2]}/i ? true : false
218
+ when /^\$/
219
+ k =~ /#{a[2]}$/i ? true : false
220
+ when /^\*/
221
+ k =~ /#{a[2]}/i ? true : false
222
+ else
223
+ k =~ /^#{a[2]}$/i ? true : false
224
+ end
129
225
  end
130
226
  end
131
227
 
132
228
  keep = valid.count.positive?
133
229
  elsif val.is_a?(Numeric) && a[1] =~ /^[<>=]{1,2}$/
134
- k = r[key].to_i
230
+ k = r.to_i
135
231
  comp = a[1] =~ /^=$/ ? '==' : a[1]
136
232
  keep = eval("#{k}#{comp}#{val}")
137
233
  else
138
- keep = case a[1]
139
- when /^\^/
140
- r[key] =~ /^#{a[2]}/i ? true : false
141
- when /^\$/
142
- r[key] =~ /#{a[2]}$/i ? true : false
143
- when /^\*/
144
- r[key] =~ /#{a[2]}/i ? true : false
145
- else
146
- r[key] =~ /^#{a[2]}$/i ? true : false
147
- end
234
+ v = r.is_a?(Hash) ? r[key] : r
235
+ if v.is_a? Array
236
+ keep = array_match(v, a[2], a[1])
237
+ else
238
+ keep = case a[1]
239
+ when /^\^/
240
+ v =~ /^#{a[2]}/i ? true : false
241
+ when /^\$/
242
+ v =~ /#{a[2]}$/i ? true : false
243
+ when /^\*/
244
+ v =~ /#{a[2]}/i ? true : false
245
+ else
246
+ v =~ /^#{a[2]}$/i ? true : false
247
+ end
248
+ end
148
249
  end
149
250
 
150
251
  return false unless keep
@@ -251,4 +352,8 @@ class ::Hash
251
352
  hsh[k.to_s] = v.is_a?(Hash) ? v.stringify_keys : v
252
353
  end
253
354
  end
355
+
356
+ def stringify_keys!
357
+ replace stringify_keys
358
+ end
254
359
  end
data/lib/curly/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Curly
2
- VERSION = '0.0.7'
2
+ VERSION = '0.0.9'
3
3
  end
data/src/_README.md CHANGED
@@ -10,9 +10,12 @@ _If you find this useful, feel free to [buy me some coffee][donate]._
10
10
  [donate]: https://brettterpstra.com/donate
11
11
  <!--END GITHUB-->
12
12
 
13
- The current version of `curlyq` is <!--VER-->0.0.6<!--END VER-->.
13
+ [jq]: https://github.com/jqlang/jq "Command-line JSON processor"
14
+ [yq]: https://github.com/mikefarah/yq "yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor"
14
15
 
15
- CurlyQ is a utility that provides a simple interface for curl, with additional features for things like extracting images and links, finding elements by CSS selector or XPath, getting detailed header info, and more. It's designed to be part of a scripting pipeline, outputting everything as structured data (JSON or YAML). It also has rudimentary support for making calls to JSON endpoints easier, but it's expected that you'll use something like `jq` to parse the output.
16
+ The current version of `curlyq` is <!--VER-->0.0.4<!--END VER-->.
17
+
18
+ CurlyQ is a utility that provides a simple interface for curl, with additional features for things like extracting images and links, finding elements by CSS selector or XPath, getting detailed header info, and more. It's designed to be part of a scripting pipeline, outputting everything as structured data (JSON or YAML). It also has rudimentary support for making calls to JSON endpoints easier, but it's expected that you'll use something like [jq] to parse the output.
16
19
 
17
20
  [github]: https://github.com/ttscoff/curlyq/
18
21
 
@@ -45,6 +48,9 @@ You can shape the results using `--search` (`-s`) and `--query` (`-q`) on some c
45
48
 
46
49
  A search uses either CSS or XPath syntax to locate elements. For example, if you wanted to locate all of the `<article>` elements with a class of `post` inside of the div with an id of `main`, you would run `--search '#main article.post'`. Searches can target tags, ids, and classes, and can accept `>` to target direct descendants. You can also use XPaths, but I hate those so I'm not going to document them.
47
50
 
51
+ > I've tried to make the query function useful, but if you want to do any kind of advanced shaping, you're better off piping the JSON output to [jq] or [yq].
52
+ <!--JEKYLL{:.warn}-->
53
+
48
54
  Queries are specifically for shaping CurlyQ output. If you're using the `html` command, it returns a key called `images`, so you can target just the images in the response with `-q 'images'`. The queries accept array syntax, so to get the first image, you would use `-q 'images[0]'`. Ranges are accepted as well, so `-q 'images[1..4]'` will return the 2nd through 5th images found on the page. You can also do comparisons, e.g. `'images[rel=me]'` to target only images with a `rel` attribute of `me`.
49
55
 
50
56
  The comparisons for the query flag are:
@@ -58,6 +64,16 @@ The comparisons for the query flag are:
58
64
  - `^=` starts with text
59
65
  - `$=` ends with text
60
66
 
67
+ Comparisons can be numeric or string comparisons. A numeric comparison like `curlyq images -q '[width>500]' URL` would return all of the images on the page with a width attribute greater than 500.
68
+
69
+ You can also use dot syntax inside of comparisons, e.g. `[links.rel*=me]` to target the links object (`html` command), and return only the links with a `rel=me` attribute. If the comparison is to an array object (like `class` or `rel`), it will match if any of the elements of the array match your comparison.
70
+
71
+ If you end the query with a specific key, only that key will be output. If there's only one match, it will be output as a raw string. If there are multiple matches, output will be an array:
72
+
73
+ curlyq tags --search '#main .post h3' -q '[attrs.id*=what].source' 'https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/'
74
+
75
+ <h3 id="whats-next">What’s Next</h3>
76
+
61
77
  #### Commands
62
78
 
63
79
  curlyq makes use of subcommands, e.g. `curlyq html [options] URL` or `curlyq extract [options] URL`. Each subcommand takes its own options, but I've made an effort to standardize the choices between each command as much as possible.
@@ -314,7 +330,7 @@ Example:
314
330
 
315
331
  Return a hierarchy of all tags in a page. Use `-t` to limit to a specific tag.
316
332
 
317
- curlyq tags --search '#main .post h3' -q 'attrs[id*=what]' https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/
333
+ curlyq tags --search '#main .post h3' -q '[attrs.id*=what]' https://brettterpstra.com/2024/01/10/introducing-curlyq-a-pipeline-oriented-curl-helper/
318
334
 
319
335
  [
320
336
  {
@@ -7,7 +7,7 @@ require 'helpers/curlyq-helpers'
7
7
  require 'test_helper'
8
8
 
9
9
  # Tests for tags command
10
- class CurlyQTagsTest < Test::Unit::TestCase
10
+ class CurlyQExtractTest < Test::Unit::TestCase
11
11
  include CurlyQHelpers
12
12
 
13
13
  def setup
@@ -12,9 +12,9 @@ class CurlyQHtmlTest < Test::Unit::TestCase
12
12
 
13
13
  def test_html_search_query
14
14
  result = curlyq('html', '-s', '#main article .aligncenter', '-q', 'images[1]', 'https://brettterpstra.com')
15
- json = JSON.parse(result)[0]
15
+ json = JSON.parse(result)
16
16
 
17
- assert_match(/aligncenter/, json[0]['class'], 'Should have found an image with class "aligncenter"')
17
+ assert_match(/aligncenter/, json['class'], 'Should have found an image with class "aligncenter"')
18
18
  end
19
19
 
20
20
  def test_html_query
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: curlyq
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.7
4
+ version: 0.0.9
5
5
  platform: ruby
6
6
  authors:
7
7
  - Brett Terpstra
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-01-12 00:00:00.000000000 Z
11
+ date: 2024-01-16 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rake