datahunter 0.4.0 → 0.4.1
- checksums.yaml +4 -4
- data/README.md +44 -39
- data/bin/hunter +5 -4
- data/lib/datahunter/base.rb +24 -7
- data/lib/datahunter/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 763f9761c548add060339d31f4af973bda927417
+  data.tar.gz: 89cfe34f6be6ffeb997ae347a2532aed03765946
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ffbe1bb02f5505e9aa9fadeacf563c5cdae07740c2b2a23d2cb735ec1500f624d219cced5380c1320309acb8055b4fc11af6feb6c96b7beb14e0adce632d7931
+  data.tar.gz: abb5d5dbb003a81b723556f4625eb76c80ce7ae92b45d0548f009664f4065ee1c2d8e34469c260a19ff72ff4a6921e85105fd3cb253c830d4c4cda86dd6da4a1
data/README.md
CHANGED
@@ -20,73 +20,77 @@ We believe that pre-processing data is a bad idea and that hosting them could cr
 
 $ gem install datahunter
 
+## Version
+
+Hunter is currently `0.4.xx`
+
+Be careful every version before `0.4.0` **won't work** since the Hunter API has changed and now use Elastic Search.
+
 ## Usage
 
 ### $ hunter find
 
 $ hunter find consumer
-### Response in 0.
-### We've found
+### Response in 0.266687 seconds
+### We've found 14 datasets corresponding to your query 'consumer':
 
-1. Consumer Complaint Database id:
-["us", "usa", "america", "united states"
+1. Consumer Complaint Database id: AUzHZIcnlutSVlmd6y27
+["us", "usa", "america", "united states"]
 These are complaints we’ve received about financial products and services.
-2.
-["us", "usa", "america", "united states"
-The Consumer Expenditure Survey (CE) program consists of two surveys, the quarterly Interview ...
-3. Farmers Markets Geographic Data id: 54de229ba82653913d1142fe
-["us", "usa", "america", "united states", "united-states"]
+2. Farmers Markets Geographic Data id: AUzHZJZilutSVlmd6y3K
+["us", "usa", "america", "united states"]
 longitude and latitude, state, address, name, and zip code of Farmers Markets in the United ...
-
-["
-
-
-["
-
+3. Food Price Outlook id: AUzHZKVglutSVlmd6y3Z
+["us", "usa", "america", "united states"]
+The Consumer Price Index (CPI) for food is a component of the all-items CPI. The CPI measures the ...
+4. All Product Recalls id: AUzHZLOplutSVlmd6y3o
+["us", "usa", "america", "united states"]
+Recalls and product safety news. CPSC is charged with protecting the public from unreasonable risks ...
+5. SaferProducts API id: AUzHZPzclutSVlmd6y4x
+["us", "usa", "america", "united states"]
+On March 11, 2011, the U.S. Consumer Product Safety Commission launched SaferProducts.gov. This ...
 
-###
-1
+### Get data? (1..5) Show next 5 datasets? (RET) abort? (q)
+> 1
 0. Consumer Complaint Database - CSV
 1. Consumer Complaint Database - JSON
 2. Consumer Complaint Database - XML
 3. Consumer Complaint Database - api
 ### which one? (0/1/...)
-1
-Create/overwrite /Users/
-
-Start downloading...
-Your file has been downloaded ;)
-
-https://data.consumerfinance.gov/api/views/x94z-ydhh/rows.csv?accessType=DOWNLOAD
-
+> 1
+### Create/overwrite /Users/my_project/views.json? (RET) Rename? (r) abort? (q)
+>
+### Start downloading...
+### Your file has been downloaded ;)
+
 
 
 ### $ hunter info
 
-$ hunter info
-
+$ hunter info AUzHZIcnlutSVlmd6y27
+
 Consumer Complaint Database
 These are complaints we’ve received about financial products and services.
 
 publisher: Consumer Financial Protection Bureau
 temporal: ["2011", "2012", "2013"]
-spatial: ["us", "usa", "america", "united states"
-created: 2014-02-
-updated: 2015-
-score:
+spatial: ["us", "usa", "america", "united states"]
+created: 2014-02-25T19:48:25:192000
+updated: 2015-03-13T01:32:35:438000
+score: 35.119
 
-### $ hunter get
+### $ hunter get
+
+$ bin/hunter get AUzHZIcnlutSVlmd6y27
 
-$ hunter get 548c82a7a826dfe85070e5fa
-
 0. Consumer Complaint Database - CSV
 1. Consumer Complaint Database - JSON
 2. Consumer Complaint Database - XML
+3. Consumer Complaint Database - api
 ### which one? (0/1/...)
-1
-### Create/overwrite /Users/Terpolilli/views.json?(
-
-Path/to/filename: /Users/Terpolilli/Downloads/consumer-data.json
+> 1
+### Create/overwrite /Users/Terpolilli/Documents/Sites/datahunter/views.json? (RET) Rename? (r) abort? (q)
+>
 ### Start downloading...
 ### Your file has been downloaded ;)
 If this is not the file you expected, it's maybe because publisher don't always keep the metadata up-to-date. We try to clean most of uri's and check the url. Anyway you may be able to download your file by hand here:
@@ -97,11 +101,12 @@ Don't hesitate to [give us any feedback about you experience with Hunter!](https
 
 ## Update
 
-* datasets indexed: 8336
+* datasets indexed: 8336 (temporaly only ~2000 are available)
 * last datasets indexed: Canada open data, NETL's Energy Data eXchange, dati.gov.it, complete french health DAMIR data.
 
 ## Change Log
 
+* 0.4.x - Adapted to the new Hunter API version (based on ElasticSearch)
 * 0.3.x - Merge `$ hunter find <keyword>` and `$ hunter search <keyword>` commands.
 The new `$ hunter find` command displays the datasets corresponding to the query, 5 by 5,
 sorted by popularity
data/bin/hunter
CHANGED
@@ -44,10 +44,11 @@ command :find do |c|
 sub_datasets = datasets[(5 * i - 5) .. (5 * i - 1)]
 
 Datahunter.print_coll_of_datasets_info_light sub_datasets
-
-
-
-
+
+puts ("### Get data? (1..5) ".colorize(:yellow) +
+      "Show next 5 datasets? (RET) ".colorize(:cyan) +
+      "abort? (q)")
+case ask "> "
 when '1'
   Datahunter.get_dataset sub_datasets[0]
   break
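The `find` hunk pages through results five at a time with `datasets[(5 * i - 5) .. (5 * i - 1)]`. A minimal sketch of that slicing, assuming `i` is a 1-based page counter (the surrounding loop is outside the hunk, so this is a guess); `page_of` and `PAGE_SIZE` are hypothetical names, not part of the gem:

```ruby
PAGE_SIZE = 5 # hypothetical constant; the diff hard-codes 5

# Hypothetical helper: returns the i-th page (1-based) of a result list,
# or an empty array once the list is exhausted.
def page_of(datasets, i)
  datasets[(PAGE_SIZE * i - PAGE_SIZE)..(PAGE_SIZE * i - 1)] || []
end

results = (1..12).map { |n| "dataset-#{n}" }
puts page_of(results, 1).inspect # first five entries
puts page_of(results, 3).inspect # partial last page
```

Ruby returns `nil` when a range slice starts past the end of the array, which is why the sketch falls back to `[]`; whether the gem guards against that case is not visible in this diff.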
data/lib/datahunter/base.rb
CHANGED
@@ -22,6 +22,14 @@ module Datahunter
     "?q=#{s}"
   end
 
+  def self.clean_string string
+    string
+      .gsub(/\n/, "")
+      .gsub(/\r/, "")
+      .gsub(/--/, "")
+      .gsub(/ /, " ")
+  end
+
   def self.datasets_url query
     "#{DATASETS_URL}#{Datahunter.query_string_builder query}"
   end
@@ -32,7 +40,7 @@ module Datahunter
 
   def self.print_dataset_info dataset
     puts ("#{dataset["title"]}".colorize(:green))
-    puts ("#{dataset["description"]}".colorize(:blue))
+    puts ("#{Datahunter.clean_string (dataset["description"])}".colorize(:blue))
     puts
     puts ("publisher: ".colorize(:blue) + "#{dataset["publisher"]}")
     puts ("temporal: ".colorize(:blue) + "#{dataset["temporal"]}")
@@ -44,12 +52,13 @@ module Datahunter
 
   def self.print_coll_of_datasets_info_light coll_of_datasets
     coll_of_datasets.each_with_index do |ds, index|
+      desc = clean_string ds["description"]
       puts ("#{index+1}. ".colorize(:yellow) +
             "#{ds["title"]}".colorize(:green) +
             " id: ".colorize(:blue) +
             "#{ds["_id"]}")
       puts ("#{ds["spatial"].take(5)}")
-      puts ("#{
+      puts ("#{desc[0..100].gsub(/\w+\s*$/,'...')}".colorize(:blue))
     end
     puts
   end
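The `base.rb` hunks up to this point add `clean_string` and use it to build the one-line description preview printed by `hunter find`. A minimal standalone sketch of that behaviour follows. It assumes the last `gsub` is meant to collapse double spaces (as rendered in this diff it replaces a single space with itself, which may be whitespace lost in rendering); `DescriptionSketch`, `preview` and `PREVIEW_LENGTH` are hypothetical names, not part of the gem:

```ruby
module DescriptionSketch
  PREVIEW_LENGTH = 100 # hypothetical constant; the diff hard-codes 100

  # Same gsub chain as the diff: drop newlines, carriage returns and "--".
  def self.clean_string(string)
    string
      .gsub(/\n/, "")
      .gsub(/\r/, "")
      .gsub(/--/, "")
      .gsub(/  /, " ") # assumption: two spaces collapsed into one
  end

  # Keeps characters 0..100 of the cleaned text, then replaces a trailing
  # word (plus any trailing whitespace) with "...".
  def self.preview(description)
    clean_string(description)[0..PREVIEW_LENGTH].gsub(/\w+\s*$/, "...")
  end
end

puts DescriptionSketch.preview("word " * 30)
puts DescriptionSketch.preview("Done.")
```

Because the pattern only matches a trailing word character run, a description ending in punctuation is printed unchanged, while a short description ending in a word also has that last word replaced by `...`.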
@@ -78,9 +87,14 @@ module Datahunter
 file_name = uri.basename
 loc = location + "/" + file_name
 
-
+puts ("### Create/overwrite #{loc}? (RET) ".colorize(:yellow) +
+      "Rename? (r) ".colorize(:cyan) +
+      "abort? (q)")
+
+case ask "> "
 when 'rename'
-
+  puts "Path/to/filename: ".colorize(:yellow)
+  loc = ask "> "
 when 'n'
   abort("Ok then")
 end
@@ -99,7 +113,8 @@ module Datahunter
   dl = 0
 else
   Datahunter.print_downloadable_links resources
-
+  puts "### which one? (0/1/...)".colorize(:yellow)
+  dl = ask("> ", Integer) {|i| i.in = 0..(number_of_downloadable_links - 1)}
 end
 
 dl = dl.to_i
@@ -147,7 +162,8 @@ module Datahunter
 
 ## Feedback requests
 def self.print_feedback_request
-
+  puts "### give feedback? (y/n)".colorize(:yellow)
+  case ask "> "
   when 'y'
     Launchy.open(FEEDBACK_URL, options = {})
   else
@@ -156,7 +172,8 @@ module Datahunter
 end
 
 def self.print_request_dataset_message
-
+  puts "### request a dataset? (y/n)".colorize(:yellow)
+  case ask "> "
   when 'y'
     Launchy.open(REQUEST_URL, options = {})
   end
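In the `Create/overwrite` hunk of this file, the new prompt advertises `(RET)`, `(r)` and `(q)`, while the unchanged `when` branches still match `'rename'` and `'n'`. One possible way to accept both the old and the advertised answers is sketched below; `overwrite_action` is a hypothetical helper, not the gem's code, and the mapping of `q` to abort is an assumption drawn from the prompt text:

```ruby
# Hypothetical helper: maps raw prompt input to an action symbol.
def overwrite_action(answer)
  case answer.strip.downcase
  when ""            then :overwrite # plain RET
  when "r", "rename" then :rename
  when "q", "n"      then :abort
  else                    :unknown
  end
end

puts overwrite_action("r")
puts overwrite_action("")
```

Returning a symbol keeps the input parsing separate from the side effects (asking for a new path, calling `abort`), so the branch logic can be checked without a terminal.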
data/lib/datahunter/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: datahunter
 version: !ruby/object:Gem::Version
-  version: 0.4.
+  version: 0.4.1
 platform: ruby
 authors:
 - Terpo
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2015-04-
+date: 2015-04-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler