ecfs 0.5.1 → 0.6.0
- checksums.yaml +4 -4
- data/.gitignore +9 -18
- data/.rspec +2 -0
- data/.ruby-version +2 -0
- data/.travis.yml +2 -1
- data/CODE_OF_CONDUCT.md +13 -0
- data/LICENSE.txt +17 -18
- data/README.md +107 -230
- data/Rakefile +1 -11
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/ecfs.gemspec +20 -17
- data/lib/ecfs.rb +209 -13
- data/lib/ecfs/version.rb +1 -1
- metadata +47 -73
- data/test/helper.rb +0 -30
- data/test/test_daily_releases.rb +0 -21
- data/test/test_filing.rb +0 -51
- data/test/test_filings_query.rb +0 -54
- data/test/test_large_proceeding.rb +0 -43
- data/test/test_proceeding.rb +0 -62
- data/test/test_proceedings_query.rb +0 -67
- data/test/test_solr_scrape.rb +0 -72
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: f978c7a0c7b9a75ff8296f121cd2815a7bd743c9
|
4
|
+
data.tar.gz: 0e8beb8217159728293ab16461da4d60eb259cd1
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 29c08e644aeb1380e946cef3f2ec5a49fd78cbbab8f4f4b547fd13c6eb29f05a78d82d0c30776b4810b8564758d1b279245d6bddb5bba1d4dccb0b910dd15ba4
|
7
|
+
data.tar.gz: 2b62bdf3e967ca050a38221f9cee90ac9fe495977f8a7bcd2508dc0b24f2b395691161cecb54f85c7913ad6f66b9c046ade1879139dfbfc159130a5ad56dbb79
|
data/.gitignore
CHANGED
@@ -1,18 +1,9 @@
|
|
1
|
-
|
2
|
-
|
3
|
-
.
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
doc/
|
11
|
-
lib/bundler/man
|
12
|
-
pkg
|
13
|
-
rdoc
|
14
|
-
spec/reports
|
15
|
-
test/tmp
|
16
|
-
test/version_tmp
|
17
|
-
tmp
|
18
|
-
.ruby-version
|
1
|
+
/.bundle/
|
2
|
+
/.yardoc
|
3
|
+
/Gemfile.lock
|
4
|
+
/_yardoc/
|
5
|
+
/coverage/
|
6
|
+
/doc/
|
7
|
+
/pkg/
|
8
|
+
/spec/reports/
|
9
|
+
/tmp/
|
data/.rspec
ADDED
data/.ruby-version
ADDED
data/.travis.yml
CHANGED
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,13 @@
|
|
1
|
+
# Contributor Code of Conduct
|
2
|
+
|
3
|
+
As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
|
4
|
+
|
5
|
+
We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, age, or religion.
|
6
|
+
|
7
|
+
Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
|
8
|
+
|
9
|
+
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
|
10
|
+
|
11
|
+
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
|
12
|
+
|
13
|
+
This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
|
data/LICENSE.txt
CHANGED
@@ -1,22 +1,21 @@
|
|
1
|
-
|
1
|
+
The MIT License (MIT)
|
2
2
|
|
3
|
-
|
3
|
+
Copyright (c) 2015 TODO: Write your name
|
4
4
|
|
5
|
-
Permission is hereby granted, free of charge, to any person obtaining
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
the following conditions:
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
12
11
|
|
13
|
-
The above copyright notice and this permission notice shall be
|
14
|
-
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
13
|
+
all copies or substantial portions of the Software.
|
15
14
|
|
16
|
-
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
OF
|
22
|
-
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
21
|
+
THE SOFTWARE.
|
data/README.md
CHANGED
@@ -1,13 +1,5 @@
|
|
1
1
|
# ECFS
|
2
2
|
|
3
|
-
ECFS helps you download and parse filings from the FCC's Electronic Comment Filing System.
|
4
|
-
|
5
|
-
[![Build Status](https://travis-ci.org/adelevie/ecfs.png?branch=master)](https://travis-ci.org/adelevie/ecfs)
|
6
|
-
|
7
|
-
[![Gem Version](https://badge.fury.io/rb/ecfs.png)](http://badge.fury.io/rb/ecfs)
|
8
|
-
|
9
|
-
[![Code Climate](https://codeclimate.com/github/adelevie/ecfs.png)](https://codeclimate.com/github/adelevie/ecfs)
|
10
|
-
|
11
3
|
## Installation
|
12
4
|
|
13
5
|
Add this line to your application's Gemfile:
|
@@ -24,294 +16,179 @@ $ bundle
|
|
24
16
|
|
25
17
|
Or install it yourself as:
|
26
18
|
|
27
|
-
```
|
19
|
+
```
|
28
20
|
$ gem install ecfs
|
29
21
|
```
|
30
22
|
|
31
23
|
## Usage
|
32
24
|
|
33
|
-
###
|
25
|
+
### Filings
|
34
26
|
|
35
|
-
|
27
|
+
Get a list of filings given a docket number:
|
36
28
|
|
37
29
|
```ruby
|
38
|
-
|
39
|
-
q.bureau_code = "WC" # Wireline Competition Bureau
|
40
|
-
q.per_page = "100" # Defaults to 10, maximum is 100
|
41
|
-
q.page_number = "1"
|
42
|
-
end.get
|
30
|
+
filings = ECFS::Filings.search(docket: '14-57')
|
43
31
|
#=>
|
44
|
-
|
45
|
-
{
|
46
|
-
|
47
|
-
"
|
48
|
-
|
49
|
-
|
32
|
+
[
|
33
|
+
{
|
34
|
+
:docket=>"14-57",
|
35
|
+
:filer=>"Media Bureau",
|
36
|
+
:url=>"http://apps.fcc.gov/ecfs/comment/view?id=6017610890",
|
37
|
+
:lawfirm=>"FCC",
|
38
|
+
:received=>"2014-04-04T00:00:00-04:00",
|
39
|
+
:posted=>"2014-04-04T14:00:29-04:00",
|
40
|
+
:exparte=>"No",
|
41
|
+
:type=>"ORDER",
|
42
|
+
:pages=>10
|
50
43
|
},
|
51
|
-
"fcc_url" => "http://apps.fcc.gov/ecfs/proceeding_search/execute?bureauCode=WC&pageNumber=1&pageSize=100",
|
52
|
-
"current_page" => 1,
|
53
|
-
"total_pages" => 16,
|
54
|
-
"first_result" => 1,
|
55
|
-
"last_result" => 100,
|
56
|
-
"total_results" => 1504,
|
57
|
-
"results" => [
|
58
|
-
{
|
59
|
-
"docket_number" => "10-90",
|
60
|
-
"bureau" => "Wireline Competition Bureau",
|
61
|
-
"subject" => "In the Matter of Connect America Fund A National Brooadband Plan for Our Future High-Cost\r\nUniversal Service Support. .",
|
62
|
-
"filings_in_last_30_days" => 182
|
63
|
-
},
|
64
|
-
{
|
65
|
-
"docket_number" => "05-337",
|
66
|
-
"bureau" => "Wireline Competition Bureau",
|
67
|
-
"subject" =>
|
68
|
-
"In the Matter of Federal -State Joint Board on Universal Service High-Cost Universal\r\nService Support. .. .",
|
69
|
-
"filings_in_last_30_days" => 102
|
70
|
-
},
|
71
44
|
#...
|
72
|
-
]
|
73
45
|
}
|
74
46
|
```
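Each filing in the result is a plain hash with symbol keys, so ordinary `Enumerable` methods apply. The following is a minimal sketch, assuming results shaped like the sample above. The first record copies that sample; the second record, including its `"Yes"` value for `:exparte`, is invented purely for illustration.

```ruby
require 'time'

# First record mirrors the sample output above; the second is hypothetical.
filings = [
  { docket: '14-57', filer: 'Media Bureau', lawfirm: 'FCC',
    received: '2014-04-04T00:00:00-04:00', posted: '2014-04-04T14:00:29-04:00',
    exparte: 'No', type: 'ORDER', pages: 10 },
  { docket: '14-57', filer: 'Example Filer', lawfirm: '',
    received: '2014-05-01T00:00:00-04:00', posted: '2014-05-02T09:30:00-04:00',
    exparte: 'Yes', type: 'NOTICE OF EXPARTE', pages: 3 }
]

# Count filings per :type value.
counts_by_type = filings.each_with_object(Hash.new(0)) { |f, h| h[f[:type]] += 1 }

# The timestamps are ISO 8601 strings, so Time.iso8601 can parse them.
latest = filings.max_by { |f| Time.iso8601(f[:posted]) }

# Sum the page counts across all filings.
total_pages = filings.sum { |f| f[:pages] }

puts counts_by_type.inspect
puts latest[:filer]
puts total_pages
```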
|
75
47
|
|
76
|
-
|
48
|
+
By default, `ECFS::Filings::search` will fetch up to 1000 filings in ascending order by date. However, you can supply your own parameters:
|
77
49
|
|
78
50
|
```ruby
|
79
|
-
|
80
|
-
|
81
|
-
|
82
|
-
|
83
|
-
"bureau_code" => "WC",
|
84
|
-
"per_page" => "100",
|
85
|
-
"page_number" => "2" # automagically incremented the page number
|
86
|
-
},
|
87
|
-
"fcc_url" => "http://apps.fcc.gov/ecfs/proceeding_search/execute?bureauCode=WC&pageSize=100&pageNumber=2",
|
88
|
-
"current_page" => 2,
|
89
|
-
"total_pages" => 16,
|
90
|
-
"first_result" => 101,
|
91
|
-
"last_result" => 200,
|
92
|
-
"total_results" => 1504,
|
93
|
-
"results" => [
|
94
|
-
# ...
|
95
|
-
]
|
96
|
-
}
|
51
|
+
filings1 = ECFS::Filings.search(docket: '14-57', size: 500, start: 0)
|
52
|
+
filings2 = ECFS::Filings.search(docket: '14-57', size: 500, start: 500)
|
53
|
+
filings3 = ECFS::Filings.search(docket: '14-57', size: 1000, start: 0)
|
54
|
+
filings1 + filings2 == filings3 #=> true
|
97
55
|
```
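The `size:` and `start:` options suggest a simple paging loop for dockets with more filings than one request returns. The sketch below is not part of the gem: `fetch_all` is a hypothetical helper, and the block passed to it stands in for a call such as `ECFS::Filings.search(docket: '14-57', size: size, start: start)`. A fake in-memory data source (with an invented `:id` field) is used so the example runs without network access.

```ruby
# Hypothetical helper: keeps requesting pages until a short page signals the end.
# The block receives size:/start: keywords mirroring the search options above.
def fetch_all(page_size: 500)
  results = []
  start = 0
  loop do
    page = yield(size: page_size, start: start)
    results.concat(page)
    break if page.length < page_size
    start += page_size
  end
  results
end

# Stand-in data source (assumption): 1200 fake filings held in memory.
fake_filings = (1..1200).map { |i| { docket: '14-57', id: i } }
fake_search = ->(size:, start:) { fake_filings[start, size] || [] }

all = fetch_all(page_size: 500) do |size:, start:|
  fake_search.call(size: size, start: start)
end

puts all.length
```

Stopping on the first short page avoids needing a total count up front, at the cost of one extra request when the total is an exact multiple of the page size.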
|
98
|
-
See `ECFS::ProceedingsQuery#constraints_dictionary` for a list of query options.
|
99
56
|
|
100
|
-
|
57
|
+
Get filings in descending order (e.g. to get the most recent filings first):
|
101
58
|
|
102
59
|
```ruby
|
103
|
-
|
104
|
-
proceeding.fetch_info!
|
105
|
-
pp proceeding
|
106
|
-
#=>
|
107
|
-
{
|
108
|
-
"docket_number" => "12-375",
|
109
|
-
"bureau" => "Wireline Competition Bureau",
|
110
|
-
"subject" => "Implementation of the Pay Telephone Reclassification and Compensation Provisions of the Telecommunications Act of 1996 et al.",
|
111
|
-
"bureau_name" => "Wireline Competition Bureau",
|
112
|
-
"prepared_by" => "Aleta.Bowers",
|
113
|
-
"date_created" => "2012-12-26T00:00:00.000Z", # iso8601 string
|
114
|
-
"status" => "Open",
|
115
|
-
"total_filings" => "292",
|
116
|
-
"filings_in_last_30_days" => "58"
|
117
|
-
}
|
60
|
+
filings = ECFS::Filings.search(docket: '14-57', order: 'desc')
|
118
61
|
```
|
119
62
|
|
120
|
-
|
63
|
+
#### Fetching document links
|
64
|
+
|
65
|
+
The `:url` in a filings hash points to a filing that could contain multiple documents ([example](http://apps.fcc.gov/ecfs/comment/view?id=60001016691)).
|
66
|
+
|
67
|
+
To programmatically get the links to all documents from that filing:
|
121
68
|
|
122
69
|
```ruby
|
123
|
-
|
70
|
+
links = ECFS::Filings.get_document_links(url: 'http://apps.fcc.gov/ecfs/comment/view?id=60001016691')
|
124
71
|
#=>
|
125
|
-
|
126
|
-
"
|
127
|
-
|
128
|
-
"subject" => "Implementation of the Pay Telephone Reclassification and Compensation Provisions of the Telecommunications Act of 1996 et al.",
|
129
|
-
"bureau_name" => "Wireline Competition Bureau",
|
130
|
-
"prepared_by" => "Aleta.Bowers",
|
131
|
-
"date_created" => "2012-12-26T00:00:00.000Z",
|
132
|
-
"status" => "Open",
|
133
|
-
"total_filings" => "292",
|
134
|
-
"filings_in_last_30_days" => "58"
|
135
|
-
}
|
72
|
+
[
|
73
|
+
"http://apps.fcc.gov/ecfs/document/view?id=60001029567", "http://apps.fcc.gov/ecfs/document/view?id=60001029568"
|
74
|
+
]
|
136
75
|
```
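The returned links are plain strings, so the numeric document id can be recovered with the standard `uri` library. This is a small sketch assuming links shaped like the ones above; `document_id` is a hypothetical helper, not a method the gem provides.

```ruby
require 'uri'

# Links copied from the sample output above.
links = [
  'http://apps.fcc.gov/ecfs/document/view?id=60001029567',
  'http://apps.fcc.gov/ecfs/document/view?id=60001029568'
]

# Hypothetical helper: pulls the `id` query parameter out of a document URL.
# Returns nil when the URL has no `id` parameter.
def document_id(url)
  query = URI.parse(url).query.to_s
  URI.decode_www_form(query).to_h['id']
end

ids = links.map { |link| document_id(link) }
puts ids.inspect
```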
|
137
76
|
|
138
|
-
|
77
|
+
### Proceedings
|
78
|
+
|
79
|
+
Get information about a proceeding given a docket number:
|
139
80
|
|
140
81
|
```ruby
|
141
|
-
proceeding = ECFS::
|
142
|
-
|
143
|
-
|
82
|
+
proceeding = ECFS::Proceedings.search(docket: '14-57')
|
83
|
+
#=>
|
84
|
+
{
|
85
|
+
:bureauname=>"Media Bureau",
|
86
|
+
:subject=>
|
87
|
+
"Applications of Comcast Corporation and Time Warner Cable Inc. for Consent to Assign or Transfer Control of Licenses and Applications",
|
88
|
+
:preparedby=>"Robin.Minor",
|
89
|
+
:datecreated=>"2014-04-04 13:08:57.993",
|
90
|
+
:status=>"Open",
|
91
|
+
:totalfilings=>"100697",
|
92
|
+
:filingsinlast30days=>"139"
|
93
|
+
}
|
144
94
|
```
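Note that the counts and the creation date in the sample above are strings. A minimal sketch of converting them to typed values, assuming the formats shown in that sample hold in general (the source does not state a time zone for `:datecreated`, so the parsed time is interpreted in the local zone):

```ruby
require 'time'

# Values copied from the sample output above (subject omitted for brevity).
proceeding = {
  bureauname: 'Media Bureau',
  preparedby: 'Robin.Minor',
  datecreated: '2014-04-04 13:08:57.993',
  status: 'Open',
  totalfilings: '100697',
  filingsinlast30days: '139'
}

# Integer() raises on malformed input instead of silently returning 0.
total  = Integer(proceeding[:totalfilings], 10)
recent = Integer(proceeding[:filingsinlast30days], 10)

# %L matches the millisecond part of the timestamp.
created = Time.strptime(proceeding[:datecreated], '%Y-%m-%d %H:%M:%S.%L')

# Share of all filings that arrived in the last 30 days, as a percentage.
recent_share = (recent.to_f / total * 100).round(2)

puts total
puts created.year
puts recent_share
```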
|
145
95
|
|
146
|
-
###
|
96
|
+
### EDOCS
|
97
|
+
|
98
|
+
Provides a few ways to get documents published by the FCC.
|
147
99
|
|
148
|
-
####
|
100
|
+
#### Given a docket number
|
149
101
|
|
150
102
|
```ruby
|
151
|
-
|
152
|
-
|
153
|
-
end.get
|
154
|
-
#=>
|
103
|
+
docs = ECFS::EDOCS.search(docket: '14-261')
|
104
|
+
#=>
|
155
105
|
[
|
156
|
-
# Each result is instance of `ECFS::Filing`, which is a subclass of `Hash`
|
157
106
|
{
|
158
|
-
|
159
|
-
|
160
|
-
"
|
161
|
-
"date_received" => "2013-05-14T00:00:00.000Z", # iso8601 string
|
162
|
-
"date_posted" => "2013-05-14T00:00:00.000Z",
|
163
|
-
"exparte" => true,
|
164
|
-
"type_of_filing" => "NOTICE OF EXPARTE",
|
165
|
-
"document_urls" => [
|
166
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313561",
|
167
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313562",
|
168
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313563"
|
169
|
-
]
|
170
|
-
},
|
171
|
-
{
|
172
|
-
"name_of_filer" => "The Leadership Conference on Civil and Human Rights",
|
173
|
-
"docket_number" => "12-375",
|
174
|
-
"lawfirm_name" => "",
|
175
|
-
"date_received" => "2013-05-13T00:00:00.000Z",
|
176
|
-
"date_posted" => "2013-05-13T00:00:00.000Z",
|
177
|
-
"exparte" => true,
|
178
|
-
"type_of_filing" => "NOTICE OF EXPARTE",
|
179
|
-
"document_urls" => [
|
180
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313134"
|
181
|
-
]
|
107
|
+
:title=>"Promoting Innovation and Competition in the Provision of Multichannel Video Programming Distribution Services",
|
108
|
+
:released=>"03/11/2015",
|
109
|
+
:description=>"Granted the request for extension of the reply comment deadline", :word=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-15-314A1.doc"], :pdf=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-15-314A1.pdf"], :txt=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-15-314A1.txt"]
|
182
110
|
},
|
183
111
|
# ...
|
184
112
|
]
|
185
113
|
```
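The `:released` value is a `MM/DD/YYYY` string and each format key (`:word`, `:pdf`, `:txt`) holds an array of URLs. The sketch below sorts documents by release date and collects PDF links. It assumes hashes shaped like the EDOCS samples in this README, and it combines records from two separate example searches purely for illustration.

```ruby
require 'date'

# Records abbreviated from the EDOCS sample outputs in this README.
docs = [
  { title: 'Promoting Innovation and Competition in the Provision of Multichannel Video Programming Distribution Services',
    released: '03/11/2015',
    pdf: ['https://apps.fcc.gov/edocs_public/attachmatch/DA-15-314A1.pdf'] },
  { title: 'PACIFIC WIRELESS TECHNOLOGIES, INC. AND NEXTEL OF CALIFORNIA',
    released: '11/16/2001',
    pdf: ['https://apps.fcc.gov/edocs_public/attachmatch/DA-01-2685A1.pdf'] }
]

# Parse the release date explicitly; the sample dates are month-first.
sorted = docs.sort_by { |doc| Date.strptime(doc[:released], '%m/%d/%Y') }

# Flatten the per-document arrays into one list of PDF URLs.
pdf_urls = docs.flat_map { |doc| doc[:pdf] }

puts sorted.first[:released]
puts pdf_urls.length
```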
|
186
114
|
|
187
|
-
|
188
|
-
|
189
|
-
#### Working with filing documents
|
190
|
-
|
191
|
-
`ECFS::Filing#documents` returns an `Array` of `ECFS::Document` instances.
|
115
|
+
#### Given an FCC Record citation
|
192
116
|
|
193
117
|
```ruby
|
194
|
-
|
195
|
-
|
196
|
-
|
197
|
-
#<ECFS::Document:0x007fed7c95bf48
|
198
|
-
@filing=
|
118
|
+
docs = ECFS::EDOCS.search(fcc_rcd_vol: '16', fcc_rcd_page: '20341')
|
119
|
+
#=>
|
120
|
+
[
|
199
121
|
{
|
200
|
-
|
201
|
-
|
202
|
-
"
|
203
|
-
|
204
|
-
|
205
|
-
"exparte" => true,
|
206
|
-
"type_of_filing" => "NOTICE OF EXPARTE",
|
207
|
-
"document_urls" => [
|
208
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313561",
|
209
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313562",
|
210
|
-
"http://apps.fcc.gov/ecfs/document/view?id=7022313563"
|
211
|
-
]
|
212
|
-
},
|
213
|
-
@pages=[#<ECFS::Document::Page @text=String, @page_number=1>],
|
214
|
-
@url="http://apps.fcc.gov/ecfs/document/view?id=7022313561">
|
122
|
+
:title=>"PACIFIC WIRELESS TECHNOLOGIES, INC. AND NEXTEL OF CALIFORNIA",
|
123
|
+
:released=>"11/16/2001",
|
124
|
+
:description=>"Granted application of Pacific Wireless Technologies to assign its licenses to Nextel of Calfornia, Inc.", :word=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-01-2685A1.doc"], :pdf=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-01-2685A1.pdf"], :txt=>["https://apps.fcc.gov/edocs_public/attachmatch/DA-01-2685A1.txt"]
|
125
|
+
}
|
126
|
+
]
|
215
127
|
```
|
216
128
|
|
217
|
-
|
218
|
-
|
219
|
-
You can also keep track of page numbers with `ECFS::Document#pages`, which returns an `Array` of `ECFS::Document::Page` instances. `ECFS::Document::Page#text` and `ECFS::Document::Page#page_number` are self-explanatory.
|
220
|
-
|
221
|
-
### Bulk Queries
|
222
|
-
|
223
|
-
None of this works (leaving here for posterity):
|
224
|
-
|
225
|
-
This has been a problem that's been bothering me for a while: ECFS filing pages won't create spreadsheets when a query returns more than 10,000 filings. A simple solution is to add date constraints to the query until you have a set of queries where each result set contains 10,000 or fewer filings.
|
129
|
+
### Parsing footnotes
|
226
130
|
|
227
|
-
|
131
|
+
You can get machine-readable access to an EDOCS document's footnotes with a URL to its `:word` version. E.g.:
|
228
132
|
|
229
133
|
```ruby
|
230
|
-
|
231
|
-
|
232
|
-
|
134
|
+
docs = ECFS::EDOCS.search(docket: '14-261')
|
135
|
+
doc = docs[3]
|
136
|
+
word_url = doc[:word].first
|
137
|
+
footnotes = ECFS::Util.get_footnotes(url: word_url)
|
138
|
+
#=>
|
139
|
+
[
|
140
|
+
{
|
141
|
+
:index=>1,
|
142
|
+
:text=>
|
143
|
+
"We see daily news that cable operators and satellite television providers are obtaining rights for online distribution of content. Sam Adams and Christian Plumb, Verizon CEO says to launch Web TV product in 2015, Reuters, September 11, 2014, available at http://www.reuters.com/article/2014/09/11/us-verizon-comms-towers-idUSKBN0H61KB20140911 (reporting that Sony, Dish Network, DIRECTV and Verizon are each developing Internet-delivered streaming video services that are a “viable alternative to cable TV service.”); Edmund Lee, Scott Moritz and Alex Sherman, Dish Leads in Race to Offer Online TV to Compete With Cable, Bloomberg, March 15, 2014, available at http://www.bloomberg.com/news/2014-03-04/dish-takes-lead-in-race-to-offer-streaming-tv-to-rival-cable.html (“If Dish goes ahead with an online service, competitors could follow -- including cable companies like Comcast and Cablevision Systems Corp., which could move out of their traditional regions to offer TV nationwide, said Bernard Gershon, a digital media consultant in New York.”); Chris Young, Industry awaits linear OTT experiment, SNL Kagan, July 18, 2014, available at http://www.snl.com/interactivex/article.aspx?id=28627040&KPLT=2; Comcast branches out cloud DVR, live streaming service, CED Magazine, May 8, 2014, available at http://www.cedmagazine.com/news/2014/05/comcast-branches-out-cloud-dvr-live-streaming-service (“Like other video service providers, Comcast is focused on offering live streaming out of the home.”). AT&T’s U-Verse service is delivered via Internet Protocol (“IP”) today. See AT&T, What is IPTV? (2009), available at https://www.att.com/Common/about_us/files/pdf/IPTV_background.pdf. 
In recognition of the increasing prevalence of Internet distribution of video, the National Cable & Telecommunications Association has renamed its annual Cable Show as INTX: the Internet and Television Expo, “in an effort to broaden the three-day gathering to include online video providers and distributors beyond the traditional Cable Show crowd.” Kent Gibbons, NCTA: ‘Cable Show’ Convention Becoming INTX, Multichannel News (Sept. 17, 2014), http://www.multichannel.com/ncta-cable-show-convention-becoming-intx/383922."},
|
144
|
+
{
|
145
|
+
:index=>2,
|
146
|
+
:text=>
|
147
|
+
"For readability throughout this NPRM, we use the term “Internet-delivered” to refer to any service delivered using IP whether or not it uses the public Internet, except for cable service. See infra ¶ 71."
|
148
|
+
},
|
149
|
+
# ...
|
150
|
+
]
|
233
151
|
```
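Because each footnote's `:text` is a plain string, cited URLs can be pulled out with a regular expression. The following is a rough sketch, not a feature of the gem: `cited_urls` is a hypothetical helper, the pattern is a simple heuristic rather than a full URL grammar, and the footnote text is an abbreviated excerpt of the first footnote above.

```ruby
# Abbreviated excerpt of the first footnote shown above.
footnote = {
  index: 1,
  text: 'Sam Adams and Christian Plumb, Verizon CEO says to launch Web TV product in 2015, ' \
        'Reuters, September 11, 2014, available at ' \
        'http://www.reuters.com/article/2014/09/11/us-verizon-comms-towers-idUSKBN0H61KB20140911 ' \
        '(reporting on streaming video services). See AT&T, What is IPTV? (2009), available at ' \
        'https://www.att.com/Common/about_us/files/pdf/IPTV_background.pdf.'
}

# Hypothetical helper: finds http(s) URLs and trims trailing sentence punctuation.
def cited_urls(text)
  text.scan(%r{https?://[^\s"<>]+}).map { |url| url.sub(/[.,;)]+\z/, '') }
end

urls = cited_urls(footnote[:text])
puts urls
```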
|
234
152
|
|
235
|
-
|
153
|
+
#### Counting Ids
|
236
154
|
|
237
|
-
|
238
|
-
|
239
|
-
The FCC has a [SOLR search page](http://apps.fcc.gov/ecfs/solr/search) which is not limited to 10,000 results. The bad news is that each page of results is maxed out at twenty. So this is all scrapable, but every 20 results requires a new HTTP request. Nevertheless, here's how you can scrape it:
|
155
|
+
In citations, especially legal citations, ["Idem"](http://en.wikipedia.org/wiki/Idem) (abbreviated "Id.") points back to the previous citation. If you want to know how many "children" a given footnote has (i.e. which later footnotes reference it with "Id."), pass `id_tree: true` to `ECFS::Util::get_footnotes`:
|
240
156
|
|
241
157
|
```ruby
|
242
|
-
|
243
|
-
|
244
|
-
|
245
|
-
|
246
|
-
|
247
|
-
|
248
|
-
|
249
|
-
|
250
|
-
|
251
|
-
|
252
|
-
|
253
|
-
|
254
|
-
|
255
|
-
|
158
|
+
url = "https://apps.fcc.gov/edocs_public/attachmatch/FCC-14-210A1.docx"
|
159
|
+
footnotes = ECFS::Util.get_footnotes(url: url, id_tree: true)
|
160
|
+
footnotes.select {|fn| fn[:ids].length > 0}
|
161
|
+
#=>
|
162
|
+
[
|
163
|
+
{
|
164
|
+
:index=>270,
|
165
|
+
:text=>
|
166
|
+
"U.S. Census Bureau, 2007 Economic Census. See U.S. Census Bureau, American FactFinder, “Information: Subject Series – Estab and Firm Size: Employment Size of Establishments for the United States: 2007 – 2007 Economic Census,” NAICS code 517110, Table EC0751SSSZ2; available at http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml.",
|
167
|
+
:ids=>[
|
168
|
+
{:index=>271, :text=>"Id.", :ids=>[], :id=>true}
|
169
|
+
],
|
170
|
+
:id=>false
|
171
|
+
},
|
172
|
+
# ...
|
256
173
|
}
|
257
174
|
```
|
258
175
|
|
259
|
-
|
176
|
+
If you were so inclined, you could quickly find all idem footnotes:
|
260
177
|
|
261
178
|
```ruby
|
262
|
-
|
263
|
-
q.docket_number = '12-83'
|
264
|
-
|
265
|
-
# a minimum date, inclusive. mm/dd/yyyy
|
266
|
-
q.received_min_date = '03/30/2012'
|
267
|
-
|
268
|
-
# an after_scrape block
|
269
|
-
q.after_scrape = Proc.new do |filings|
|
270
|
-
p "Fetched asynchronyously: #{filings.length}"
|
271
|
-
end
|
272
|
-
# This is handy for large scrapes.
|
273
|
-
end.get
|
179
|
+
footnotes.select {|fn| fn[:id] == true}
|
274
180
|
```
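To make the shape of the `id_tree: true` output concrete, here is one way such a tree could be built from a flat footnote list. This is an illustrative sketch only, not the gem's implementation: `build_id_tree` is a hypothetical function, and it assumes that every footnote whose text starts with "Id." attaches to the most recent footnote that is not itself an "Id." citation. The footnote texts are placeholders.

```ruby
# Hypothetical reconstruction of the tree shape shown above.
# Each node keeps :index and :text, and gains :ids (children) and :id (is it an "Id."?).
def build_id_tree(flat_footnotes)
  parent = nil
  flat_footnotes.map do |fn|
    node = {
      index: fn[:index],
      text: fn[:text],
      ids: [],
      id: fn[:text].strip.start_with?('Id.')
    }
    if node[:id]
      parent[:ids] << node if parent # attach to the last full citation
    else
      parent = node
    end
    node
  end
end

# Placeholder footnotes for illustration.
flat = [
  { index: 1, text: 'First source, available at http://example.com/a.' },
  { index: 2, text: 'Id.' },
  { index: 3, text: 'Id. at 5.' },
  { index: 4, text: 'Second source.' }
]

tree = build_id_tree(flat)

# Number of "Id." children per footnote, keyed by footnote index.
child_counts = tree.map { |fn| [fn[:index], fn[:ids].length] }.to_h
puts child_counts.inspect
```

As in the sample output, "Id." footnotes stay in the top-level list as well as under their parent, which is why selecting on `fn[:id] == true` finds them.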
|
275
181
|
|
276
|
-
|
277
|
-
|
278
|
-
This feature parses these types of pages: http://transition.fcc.gov/Daily_Releases/Daily_Business/2014/db0917/.
|
279
|
-
|
280
|
-
The documents listed are PDFs, text files, and `.docx` files.
|
281
|
-
|
282
|
-
```ruby
|
283
|
-
releases = ECFS::DailyReasesQuery.new.tap do |q|
|
284
|
-
q.month = '12'
|
285
|
-
q.day = '17'
|
286
|
-
q.year = '2014'
|
287
|
-
end.get
|
288
|
-
|
289
|
-
txt_urls = releases.txts
|
290
|
-
pdf_urls = releases.pdfs
|
291
|
-
docs_urls = releases.docxs
|
292
|
-
|
293
|
-
p txt_urls.first
|
294
|
-
#=>
|
295
|
-
{
|
296
|
-
title: "DA-14-1835A1.txt",
|
297
|
-
url: "http://transition.fcc.gov/Daily_Releases/Daily_Business/2014/db1217//DA-14-1835A1.txt"
|
298
|
-
}
|
299
|
-
```
|
300
|
-
|
301
|
-
## Testing
|
302
|
-
|
303
|
-
```
|
304
|
-
$ bundle exec m
|
305
|
-
```
|
182
|
+
## Development
|
306
183
|
|
307
|
-
|
184
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `bin/console` for an interactive prompt that will allow you to experiment.
|
308
185
|
|
309
|
-
|
186
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release` to create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
|
310
187
|
|
311
188
|
## Contributing
|
312
189
|
|
313
|
-
1. Fork it
|
190
|
+
1. Fork it ( https://github.com/[my-github-username]/ecfs/fork )
|
314
191
|
2. Create your feature branch (`git checkout -b my-new-feature`)
|
315
192
|
3. Commit your changes (`git commit -am 'Add some feature'`)
|
316
193
|
4. Push to the branch (`git push origin my-new-feature`)
|
317
|
-
5. Create new Pull Request
|
194
|
+
5. Create a new Pull Request
|