fech-ftp 0.1.1 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 74b4ad9ebf4adbe59ec021a13408871e0180a6fe
4
- data.tar.gz: 2d5a44faf28e8e4bc12d0fa351c0b61651fa17f8
3
+ metadata.gz: 2497302b726593b2adf7596e8129f7ae35d06a29
4
+ data.tar.gz: d19312acbe195d7a9f6fb5174b0fb7085b7ac969
5
5
  SHA512:
6
- metadata.gz: bf7a38d8dd0906992baf77bbe19666b477945a1209895cb8158b07b43fc9db68fba6e386e0a7856e53be830e1f9f0cccea7bbe6b40781def329e135cf11d4392
7
- data.tar.gz: 8ed8360b64583bfe81ca70846b60f864e1e4bfe87cf0aaf9208a5cb851d0c6a94ef575b9afdff6a7a278602cb12320baa1ee88a5ea6fa943799707b26b2fe2aa
6
+ metadata.gz: ed07c1f566fc9f617486c823d38b59012c02aab1b11337eb60f69afd75c6c21aa9472bce7a382982d875d220474edb0fec9e093caeae03497719ff0ed1e1b2b7
7
+ data.tar.gz: 3c585a2dde2d9c1d841f309627d8aae66d749fb329dc829dd970452ed26235a1585ee9e142be889f9c3c4efd74456117e2bbe6180675aeb2321b8bd262d4e55b
data/README.md CHANGED
@@ -1,8 +1,10 @@
1
1
  # fech-ftp
2
2
 
3
- A Ruby library for retrieving and parsing [FTP data downloads](http://www.fec.gov/finance/disclosure/ftp_download.shtml) from the Federal Election Commission. While fech-ftp provides an API to some transaction data (contributions from a committee to a candidate and contributions between two committees), its main purpose is to provide a simple interface to the "[committee master](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteeMaster.shtml)" and "[candidate master](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCandidateMaster.shtml)" files, with the ultimate goal of providing a way to connect individual transactions parsed by [Fech](https://github.com/NYTimes/Fech) with their canonical recipients.
3
+ A Ruby library for retrieving and parsing [bulk data downloads](http://classic.fec.gov/finance/disclosure/ftp_download.shtml) from the Federal Election Commission. While fech-ftp provides an API to some transaction data (contributions from a committee to a candidate and contributions between two committees), its main purpose is to provide a simple interface to the "[committee master](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteeMaster.shtml)" and "[candidate master](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCandidateMaster.shtml)" files, with the ultimate goal of providing a way to connect individual transactions parsed by [Fech](https://github.com/NYTimes/Fech) with their canonical recipients.
4
4
 
5
- fech-ftp is tested under Ruby 2.0.0, 2.1.X and 2.2.X.
5
+ fech-ftp is tested under Ruby 2.0.0, 2.1.X, 2.2.X, and 2.3.X.
6
+
7
+ Ironically, the FEC announced in late 2017 that it would be moving its bulk data files from an FTP server to S3. This library has been updated to work with that setup, so it performs exactly zero FTP requests.
6
8
 
7
9
  ## Installation
8
10
 
@@ -24,33 +26,33 @@ Fech-FTP can be used by itself or in combination with Fech. To retrieve canonica
24
26
 
25
27
  ```ruby
26
28
  require 'fech-ftp'
27
- cands = Fech::Candidate.detail(2014)
28
- cands.first # => {:candidate_id=>"H0AK00097", :candidate_name=>"COX, JOHN ROBERT", :party=>"REP", :election_year=>"2012", :office_state=>"AK", :office=>"H", :district=>"00", :incumbent_challenger_status=>"C", :candidate_status=>"N", :committee_id=>"C00525261", :street_one=>"PO BOX 1092", :street_two=>"", :city=>"ANCHOR POINT", :state=>"AK", :zipcode=>"995561092"}
29
+ cands = Fech::Candidate.detail(2018)
30
+ cands.first # => {:candidate_id=>"H0AL02087", :candidate_name=>"ROBY, MARTHA", :party=>"REP", :election_year=>"2018", :office_state=>"AL", :office=>"H", :district=>"02", :incumbent_challenger_status=>"I", :candidate_status=>"C", :committee_id=>"C00462143", :street_one=>"3260 BANKHEAD AVE", :street_two=>"", :city=>"MONTGOMERY", :state=>"AL", :zipcode=>"361062448"}
29
31
  ```
30
32
 
31
33
  If you want to have the data written out as CSV, add the property `format: :csv`, like so:
32
34
 
33
35
  ```ruby
34
- Fech::Candidate.detail(2014, format: :csv)
36
+ Fech::Candidate.detail(2018, format: :csv)
35
37
  ```
36
38
 
37
39
  You can specify the directory that the FEC zip files opened by fech-ftp are downloaded to by adding the property `location`, an absolute path to an _existing_ directory that must include a trailing slash, like so:
38
40
 
39
41
  ```ruby
40
- Fech::Candidate.detail(2014, location: "/tmp/fec/")
42
+ Fech::Candidate.detail(2018, location: "/tmp/fec/")
41
43
  ```
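
A note on the trailing slash: the `unzip` method in this release builds each extraction path as `@location + entry.name`, which is why the slash is required. The sketch below shows one way the requirement could be relaxed with `File.join`. The helper name `extraction_path` is hypothetical and is not part of fech-ftp.

```ruby
# Hypothetical helper, not part of fech-ftp: builds the path an entry
# would be extracted to, with or without a trailing slash on `location`.
def extraction_path(location, entry_name)
  # With no location, fall back to the entry name (current directory),
  # mirroring the nil check in the gem's unzip method.
  return entry_name if location.nil?

  # File.join inserts a single separator and collapses a duplicate one.
  File.join(location, entry_name)
end

extraction_path("/tmp/fec/", "cn.txt") # => "/tmp/fec/cn.txt"
extraction_path("/tmp/fec", "cn.txt")  # => "/tmp/fec/cn.txt"
```

The directory itself still has to exist. This only removes the dependence on how the path string ends.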
42
44
 
43
45
  If you are using the [Sequel Gem](https://github.com/jeremyevans/sequel), you can pass in the DB table object as the `connection` property:
44
46
 
45
47
  ```ruby
46
- Fech::Candidate.detail(2014, format: :db, connection: DB[:candidates])
48
+ Fech::Candidate.detail(2018, format: :db, connection: DB[:candidates])
47
49
  ```
48
50
 
49
51
  Please note that it assumes the table object's columns == header properties for the data that gets passed in. Otherwise, an exception will be thrown.
50
52
  To get around this, you can provide the `connection` property as an array, with the traditional `DB` constant value as the first element, followed by the table name:
51
53
 
52
54
  ```ruby
53
- Fech::Candidate.detail(2014, format: :db, connection: [<DATABASE OBJECT>, :candidates])
55
+ Fech::Candidate.detail(2018, format: :db, connection: [<DATABASE OBJECT>, :candidates])
54
56
  ```
55
57
 
56
58
  The table will automatically be created, and then will be populated with the selected dataset. Also please note that it assumes there are no foreign keys. To add them, please follow the Sequel documentation guidelines for adding/altering foreign keys.
@@ -68,7 +70,12 @@ Fech::IndividualContribution
68
70
  Fech::CommitteeContribution
69
71
  ```
70
72
 
71
- There are additional classes representing [PAC contributions to candidates](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryContributionstoCandidates.shtml) (`CandidateContribution`) and [transactions involving two committees](http://www.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteetoCommittee.shtml) (`CommitteeContribution`). Be advised that both of the FTP files loaded by these classes are large and can take minutes to parse. They are appropriately used for background processing or data loading purposes, not for providing a live API. Individual Contributions in particular runs in excess of 1-2 million rows of data (~ 200mb)
73
+ There are additional classes representing [PAC contributions to candidates](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryContributionstoCandidates.shtml) (`CandidateContribution`) and [transactions involving two committees](http://classic.fec.gov/finance/disclosure/metadata/DataDictionaryCommitteetoCommittee.shtml) (`CommitteeContribution`). Be advised that both of the FTP files loaded by these classes are large and can take minutes to parse. They are appropriately used for background processing or data loading purposes, not for providing a live API. Individual Contributions in particular runs in excess of 1-2 million rows of data (~ 200mb)
74
+
75
+ ## Authors
76
+
77
+ * [Derek Willis](https://github.com/dwillis)
78
+ * [Matt Long](https://github.com/wismer)
72
79
 
73
80
  ## Contributing
74
81
 
data/lib/fech-ftp.rb CHANGED
@@ -7,7 +7,7 @@ require "fech-ftp/candidate_contribution"
7
7
  require "fech-ftp/committee_contribution"
8
8
  require "fech-ftp/individual_contribution"
9
9
  require "fech-ftp/table_methods"
10
- require "net/ftp"
10
+ require "net/http"
11
11
  require 'zip'
12
12
  require 'active_support'
13
13
  require 'active_support/deprecation'
@@ -20,7 +20,7 @@ require 'csv'
20
20
  module Fech
21
21
  class Utilities
22
22
  def self.superpacs
23
- url = "http://www.fec.gov/press/press2011/ieoc_alpha.shtml"
23
+ url = "http://classic.fec.gov/press/press2011/ieoc_alpha.shtml"
24
24
  t = RemoteTable.new url, :row_xpath => '//table/tr', :column_xpath => 'td', :encoding => 'windows-1252', :headers => %w{ row_id committee_id committee_name filing_frequency}
25
25
  t.entries
26
26
  end
@@ -1,5 +1,7 @@
1
1
  module Fech
2
2
  class Table
3
+ AWS_URL = "https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads"
4
+
3
5
  def initialize(cycle, opts={})
4
6
  @cycle = cycle
5
7
  @headers = opts[:headers]
@@ -59,19 +61,13 @@ module Fech
59
61
  end
60
62
 
61
63
  def fetch_file(&blk)
62
- zip_file = "#{@file}#{@cycle.to_s[2..3]}.zip"
63
- Net::FTP.open("ftp.fec.gov") do |ftp|
64
- ftp.passive = true if @passive
65
- ftp.login
66
- ftp.chdir("./FEC/#{@cycle}")
67
- begin
68
- ftp.get(zip_file, "./#{zip_file}")
69
- rescue Net::FTPPermError
70
- raise 'File not found - please try the other methods'
71
- end
72
- end
64
+ filename = "#{@file}#{@cycle.to_s[2..3]}.zip"
65
+ uri = URI("#{AWS_URL}/#{@cycle}/#{filename}")
66
+ response = Net::HTTP.get_response(uri)
73
67
 
74
- unzip(zip_file, &blk)
68
+ if response.code == '200'
69
+ unzip(response.body, &blk)
70
+ end
75
71
  end
76
72
 
77
73
  def parser
@@ -114,15 +110,14 @@ module Fech
114
110
  end
115
111
 
116
112
  def unzip(zip_file, &blk)
117
- Zip::File.open(zip_file) do |zip|
113
+ Zip::File.open_buffer(zip_file) do |zip|
118
114
  zip.each do |entry|
119
115
  path = @location.nil? ? entry.name : @location + entry.name
120
116
  entry.extract(path) if !File.file?(path)
121
- File.delete(zip_file)
117
+
122
118
  File.foreach(path) do |row|
123
119
  blk.call(format_row(row))
124
120
  end
125
- File.delete(path)
126
121
  end
127
122
  end
128
123
  end
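
Reviewer note: the old `fetch_file` raised `'File not found'` on `Net::FTPPermError`, whereas the new version yields nothing when the response code is not `'200'`, so a missing cycle fails silently. The following is a minimal sketch of how the URL construction and an explicit failure could look. It is not the gem's implementation. `bulk_uri` and `body_or_raise` are hypothetical helpers, and the `"cn"` file prefix in the usage comment is an assumption.

```ruby
require "uri"
require "net/http"

# Base URL copied from the AWS_URL constant added in this release.
AWS_URL = "https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads"

# Hypothetical helper: builds the download URI the same way the new
# fetch_file does (file prefix + two-digit cycle + ".zip").
def bulk_uri(file, cycle)
  filename = "#{file}#{cycle.to_s[2..3]}.zip"
  URI("#{AWS_URL}/#{cycle}/#{filename}")
end

# Hypothetical helper: returns the response body on HTTP 200 and raises
# otherwise, instead of silently doing nothing.
def body_or_raise(response, uri)
  return response.body if response.code == "200"

  raise "File not found (HTTP #{response.code}) at #{uri}"
end

# Usage (performs a network request, so it is left commented out):
#   uri = bulk_uri("cn", 2018)   # "cn" is an assumed file prefix
#   zip_bytes = body_or_raise(Net::HTTP.get_response(uri), uri)
#   # zip_bytes could then be handed to unzip, as in the diff above.
```

Raising keeps the behaviour close to 0.1.1, where callers got an exception for a missing file. Whether that is preferable to returning nothing is a design choice for the maintainers.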
@@ -1,5 +1,5 @@
1
1
  module Fech
2
2
  class Ftp
3
- VERSION = "0.1.1"
3
+ VERSION = "0.2.0"
4
4
  end
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fech-ftp
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.1
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Derek Willis
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-07-13 00:00:00.000000000 Z
11
+ date: 2017-11-10 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -200,7 +200,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
200
200
  version: '0'
201
201
  requirements: []
202
202
  rubyforge_project:
203
- rubygems_version: 2.2.2
203
+ rubygems_version: 2.4.5.1
204
204
  signing_key:
205
205
  specification_version: 4
206
206
  summary: A Ruby interface for FTP data from the Federal Election Commission.
@@ -213,4 +213,3 @@ test_files:
213
213
  - test/test_individual.rb
214
214
  - test/webk.txt
215
215
  - test/webl.txt
216
- has_rdoc: