fech-ftp 0.1.1 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +16 -9
- data/lib/fech-ftp.rb +2 -2
- data/lib/fech-ftp/table.rb +10 -15
- data/lib/fech-ftp/version.rb +1 -1
- metadata +3 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2497302b726593b2adf7596e8129f7ae35d06a29
+  data.tar.gz: d19312acbe195d7a9f6fb5174b0fb7085b7ac969
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ed07c1f566fc9f617486c823d38b59012c02aab1b11337eb60f69afd75c6c21aa9472bce7a382982d875d220474edb0fec9e093caeae03497719ff0ed1e1b2b7
+  data.tar.gz: 3c585a2dde2d9c1d841f309627d8aae66d749fb329dc829dd970452ed26235a1585ee9e142be889f9c3c4efd74456117e2bbe6180675aeb2321b8bd262d4e55b
data/README.md
CHANGED
@@ -1,8 +1,10 @@
 # fech-ftp
 
-A Ruby library for retrieving and parsing [
+A Ruby library for retrieving and parsing [bulk data downloads](http://classic.fec.gov/finance/disclosure/ftp_download.shtml) from the Federal Election Commission. While fech-ftp provides an API to some transaction data (contributions from a committee to a candidate and contributions between two committees), its main purpose is to provide a simple interface to the "[committee master](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteeMaster.shtml)" and "[candidate master](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCandidateMaster.shtml)" files, with the ultimate goal of providing a way to connect individual transactions parsed by [Fech](https://github.com/NYTimes/Fech) with their canonical recipients.
 
-fech-ftp is tested under Ruby 2.0.0, 2.1.X and 2.
+fech-ftp is tested under Ruby 2.0.0, 2.1.X, 2.2.X, and 2.3.X.
+
+Ironically, the FEC announced in late 2017 that it would be moving its bulk data files from an FTP server to S3. This library has been updated to work with that setup, so it performs exactly zero FTP requests.
 
 ## Installation
 
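The S3 note added to the README above corresponds to the rewritten `fetch_file` in `lib/fech-ftp/table.rb` further down this diff, which builds the download URL from `AWS_URL`, the cycle, and a two-digit year suffix. A minimal sketch of that construction follows, assuming a file prefix of `"cn"` purely for illustration. `bulk_download_uri` is a hypothetical helper, not part of the gem, and the sketch makes no network request.

```ruby
require "uri"

# Hypothetical helper mirroring the URL construction in the new fetch_file.
# The default base is the AWS_URL value introduced in this diff; the "cn"
# prefix used below is an assumption for illustration.
def bulk_download_uri(file_prefix, cycle,
                      base: "https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads")
  # Characters 2..3 of the cycle give the two-digit year, e.g. 2018 -> "18".
  filename = "#{file_prefix}#{cycle.to_s[2..3]}.zip"
  URI("#{base}/#{cycle}/#{filename}")
end

uri = bulk_download_uri("cn", 2018)
puts uri.path # prints "/bulk-downloads/2018/cn18.zip"
```

Because the suffix comes from slicing the cycle's string form, the cycle is expected to be a four-digit year.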
@@ -24,33 +26,33 @@ Fech-FTP can be used by itself or in combination with Fech. To retrieve canonica
 
 ```ruby
 require 'fech-ftp'
-cands = Fech::Candidate.detail(
-cands.first # => {:candidate_id=>"
+cands = Fech::Candidate.detail(2018)
+cands.first # => {:candidate_id=>"H0AL02087", :candidate_name=>"ROBY, MARTHA", :party=>"REP", :election_year=>"2018", :office_state=>"AL", :office=>"H", :district=>"02", :incumbent_challenger_status=>"I", :candidate_status=>"C", :committee_id=>"C00462143", :street_one=>"3260 BANKHEAD AVE", :street_two=>"", :city=>"MONTGOMERY", :state=>"AL", :zipcode=>"361062448"}
 ```
 
 If you want to have the data transferred into an csv, add the property `format: :csv`, like so:
 
 ```ruby
-Fech::Candidate.detail(
+Fech::Candidate.detail(2018, format: :csv)
 ```
 
 You can specify the location that the FEC zip files opened by fech-ftp are downloaded by adding the property `location`, with an absolute path to an _existing_ directory that must include a trailing slash, like so:
 
 ```ruby
-Fech::Candidate.detail(
+Fech::Candidate.detail(2018, location: "/tmp/fec/")
 ```
 
 If you are using the [Sequel Gem](https://github.com/jeremyevans/sequel), you can pass in the DB table object as the `connection` property:
 
 ```ruby
-Fech::Candidate.detail(
+Fech::Candidate.detail(2018, format: :db, connection: DB[:candidates])
 ```
 
 Please note that it assumes the table object's columns == header properties for the data that gets passed in. Otherwise, an exception will be thrown.
 To get around this, you can provide the `connection` property as an array, with the traditional `DB` constant value being the in the first element, followed by the table name:
 
 ```ruby
-Fech::Candidate.detail(
+Fech::Candidate.detail(2018, format: :db, connection: [<DATABASE OBJECT>, :candidates])
 ```
 
 The table will automatically be created, and then will be populated with the selected dataset. Also please note that it assumes there are no foreign keys. To add them, please follow the Sequel documentation guidelines for adding/altering foreign keys.
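The trailing-slash requirement on `location` in the README text above follows from how `unzip` in `lib/fech-ftp/table.rb` (later in this diff) builds the extraction path: `@location + entry.name`, which is plain string concatenation. A small sketch of that behavior, with `extraction_path` as a hypothetical stand-in for the inline expression and `cn.txt` as an assumed entry name. `File.join` appears only for comparison and is not what the gem uses.

```ruby
# Hypothetical stand-in for the path expression inside Fech::Table#unzip.
def extraction_path(location, entry_name)
  location.nil? ? entry_name : location + entry_name
end

puts extraction_path("/tmp/fec/", "cn.txt") # prints "/tmp/fec/cn.txt"
puts extraction_path("/tmp/fec", "cn.txt")  # prints "/tmp/feccn.txt"
puts extraction_path(nil, "cn.txt")         # prints "cn.txt"

# For comparison only: File.join inserts the separator when it is missing.
puts File.join("/tmp/fec", "cn.txt")        # prints "/tmp/fec/cn.txt"
```

Without the trailing slash the file lands beside the intended directory rather than inside it, which is why the README insists on it.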
@@ -68,7 +70,12 @@ Fech::IndividualContribution
 Fech::CommitteeContribution
 ```
 
-There are additional classes representing [PAC contributions to candidates](http://
+There are additional classes representing [PAC contributions to candidates](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryContributionstoCandidates.shtml) (`CandidateContribution`) and [transactions involving two committees](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteetoCommittee.shtml) (`CommitteeContribution`). Be advised that both of the FTP files loaded by these classes are large and can take minutes to parse. They are appropriately used for background processing or data loading purposes, not for providing a live API. Individual Contributions in particular runs in excess of 1-2 million rows of data (~ 200mb)
+
+## Authors
+
+* [Derek Willis](https://github.com/dwillis)
+* [Matt Long](https://github.com/wismer)
 
 ## Contributing
 
data/lib/fech-ftp.rb
CHANGED
@@ -7,7 +7,7 @@ require "fech-ftp/candidate_contribution"
 require "fech-ftp/committee_contribution"
 require "fech-ftp/individual_contribution"
 require "fech-ftp/table_methods"
-require "net/
+require "net/http"
 require 'zip'
 require 'active_support'
 require 'active_support/deprecation'
@@ -20,7 +20,7 @@ require 'csv'
 module Fech
   class Utilities
     def self.superpacs
-      url = "http://
+      url = "http://classic.fec.gov/press/press2011/ieoc_alpha.shtml"
       t = RemoteTable.new url, :row_xpath => '//table/tr', :column_xpath => 'td', :encoding => 'windows-1252', :headers => %w{ row_id committee_id committee_name filing_frequency}
       t.entries
     end
data/lib/fech-ftp/table.rb
CHANGED
@@ -1,5 +1,7 @@
 module Fech
   class Table
+    AWS_URL = "https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads"
+
     def initialize(cycle, opts={})
       @cycle = cycle
       @headers = opts[:headers]
@@ -59,19 +61,13 @@ module Fech
     end
 
     def fetch_file(&blk)
-
-
-
-        ftp.login
-        ftp.chdir("./FEC/#{@cycle}")
-        begin
-          ftp.get(zip_file, "./#{zip_file}")
-        rescue Net::FTPPermError
-          raise 'File not found - please try the other methods'
-        end
-      end
+      filename = "#{@file}#{@cycle.to_s[2..3]}.zip"
+      uri = URI("#{AWS_URL}/#{@cycle}/#{filename}")
+      response = Net::HTTP.get_response(uri)
 
-
+      if response.code == '200'
+        unzip(response.body, &blk)
+      end
     end
 
     def parser
@@ -114,15 +110,14 @@ module Fech
     end
 
     def unzip(zip_file, &blk)
-      Zip::File.
+      Zip::File.open_buffer(zip_file) do |zip|
         zip.each do |entry|
           path = @location.nil? ? entry.name : @location + entry.name
           entry.extract(path) if !File.file?(path)
-
+
           File.foreach(path) do |row|
             blk.call(format_row(row))
           end
-          File.delete(path)
         end
       end
     end
data/lib/fech-ftp/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: fech-ftp
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - Derek Willis
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2017-11-10 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -200,7 +200,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.4.5.1
 signing_key:
 specification_version: 4
 summary: A Ruby interface for FTP data from the Federal Election Commission.
@@ -213,4 +213,3 @@ test_files:
 - test/test_individual.rb
 - test/webk.txt
 - test/webl.txt
-has_rdoc: