factbook 0.1.2 → 0.1.3
- data/Manifest.txt +1 -0
- data/README.md +436 -20
- data/Rakefile +9 -88
- data/lib/factbook.rb +0 -11
- data/lib/factbook/page.rb +41 -12
- data/lib/factbook/sect.rb +29 -12
- data/lib/factbook/version.rb +18 -2
- data/test/test_fields.rb +48 -0
- metadata +15 -13
data/Manifest.txt
CHANGED
data/README.md
CHANGED
@@ -21,44 +21,460 @@ offers free country profiles in the public domain (that is, no copyright(s), no
|
|
21
21
|
|
22
22
|
### Get country profile page as a hash (that is, structured data e.g. nested key/values)
|
23
23
|
|
24
|
-
|
25
|
-
pp page.data # pretty print hash
|
24
|
+
```ruby
|
26
25
|
|
26
|
+
page = Factbook::Page.new( 'br' ) # br is the country code for Brazil
|
27
|
+
pp page.data # pretty print hash
|
28
|
+
|
29
|
+
```
|
27
30
|
|
28
31
|
### Save to disk as JSON
|
29
32
|
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
33
|
+
```ruby
|
34
|
+
|
35
|
+
page = Factbook::Page.new( 'br' )
|
36
|
+
File.open( 'br.json', 'w') do |f|
|
37
|
+
f.write page.to_json( pretty: true )
|
38
|
+
end
|
39
|
+
|
40
|
+
```
|
41
|
+
|
42
|
+
### Options - Header, "Long" Category / Field Names
|
43
|
+
|
44
|
+
#### Include Header Option - `header: true`
|
45
|
+
|
46
|
+
```ruby
|
47
|
+
page = Factbook::Page.new( 'br', header: true )
|
48
|
+
```
|
49
|
+
|
50
|
+
will include a leading header section. Example:
|
51
|
+
|
52
|
+
```json
|
53
|
+
{
|
54
|
+
"Header": {
|
55
|
+
"code": "au",
|
56
|
+
"generator": "factbook/0.1.2",
|
57
|
+
"last_built": "2014-08-24 12:55:39 +0200"
|
58
|
+
}
|
59
|
+
...
|
60
|
+
}
|
61
|
+
```
|
62
|
+
|
63
|
+
#### "Long" Category / Field Names Option - `fields: 'long'`
|
64
|
+
|
65
|
+
```ruby
|
66
|
+
page = Factbook::Page.new( 'br', fields: 'long')
|
67
|
+
```
|
68
|
+
|
69
|
+
will change the category / field names to the long form (that is, passing through unchanged from the source).
|
70
|
+
e.g.
|
71
|
+
|
72
|
+
```ruby
|
73
|
+
page['econ']['budget_surplus_or_deficit']['text']
|
74
|
+
page['econ']['labor_force_by_occupation']['agriculture']
|
75
|
+
page['trans']['ports_and_terminals']['river_ports']
|
76
|
+
```
|
77
|
+
becomes
|
78
|
+
|
79
|
+
```ruby
|
80
|
+
page['Economy']['Budget surplus (+) or deficit (-)']['text']
|
81
|
+
page['Economy']['Labor force - by occupation']['agriculture']
|
82
|
+
page['Transportation']['Ports and terminals']['river port(s)']
|
83
|
+
```
|
84
|
+
|
85
|
+
Note: You can - of course - use the options together e.g.
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
page = Factbook::Page.new( 'br', header: true, fields: 'long' )
|
89
|
+
```
|
90
|
+
|
91
|
+
or
|
92
|
+
|
93
|
+
```ruby
|
94
|
+
opts = {
|
95
|
+
header: true,
|
96
|
+
fields: 'long'
|
97
|
+
}
|
98
|
+
page = Factbook::Page.new( 'br', opts )
|
99
|
+
```
|
100
|
+
|
101
|
+
|
102
|
+
## The World Factbook Summary (267 Entries)
|
103
|
+
|
104
|
+
The World Factbook includes 267 entries -
|
105
|
+
195 sovereign countries /
|
106
|
+
2 others /
|
107
|
+
58 dependencies /
|
108
|
+
6 miscellaneous /
|
109
|
+
5 oceans /
|
110
|
+
1 world:
|
111
|
+
|
112
|
+
|
113
|
+
### Sovereign Countries (195)
|
114
|
+
|
115
|
+
**A**
|
116
|
+
`af` Afghanistan
|
117
|
+
`al` Albania
|
118
|
+
`ag` Algeria
|
119
|
+
`an` Andorra
|
120
|
+
`ao` Angola
|
121
|
+
`ac` Antigua and Barbuda
|
122
|
+
`ar` Argentina
|
123
|
+
`am` Armenia
|
124
|
+
`as` Australia
|
125
|
+
`au` Austria
|
126
|
+
`aj` Azerbaijan
|
127
|
+
**B**
|
128
|
+
`bf` The Bahamas
|
129
|
+
`ba` Bahrain
|
130
|
+
`bg` Bangladesh
|
131
|
+
`bb` Barbados
|
132
|
+
`bo` Belarus
|
133
|
+
`be` Belgium
|
134
|
+
`bh` Belize
|
135
|
+
`bn` Benin
|
136
|
+
`bt` Bhutan
|
137
|
+
`bl` Bolivia
|
138
|
+
`bk` Bosnia and Herzegovina
|
139
|
+
`bc` Botswana
|
140
|
+
`br` Brazil
|
141
|
+
`bx` Brunei
|
142
|
+
`bu` Bulgaria
|
143
|
+
`uv` Burkina Faso
|
144
|
+
`bm` Burma
|
145
|
+
`by` Burundi
|
146
|
+
**C**
|
147
|
+
`cb` Cambodia
|
148
|
+
`cm` Cameroon
|
149
|
+
`ca` Canada
|
150
|
+
`cv` Cape Verde
|
151
|
+
`ct` Central African Republic
|
152
|
+
`cd` Chad
|
153
|
+
`ci` Chile
|
154
|
+
`ch` China
|
155
|
+
`co` Colombia
|
156
|
+
`cn` Comoros
|
157
|
+
`cg` Congo DR
|
158
|
+
`cf` Congo
|
159
|
+
`cs` Costa Rica
|
160
|
+
`iv` Cote d'Ivoire
|
161
|
+
`hr` Croatia
|
162
|
+
`cu` Cuba
|
163
|
+
`cy` Cyprus
|
164
|
+
`ez` Czech Republic
|
165
|
+
**D**
|
166
|
+
`da` Denmark
|
167
|
+
`dj` Djibouti
|
168
|
+
`do` Dominica
|
169
|
+
`dr` Dominican Republic
|
170
|
+
**E**
|
171
|
+
`ec` Ecuador
|
172
|
+
`eg` Egypt
|
173
|
+
`es` El Salvador
|
174
|
+
`ek` Equatorial Guinea
|
175
|
+
`er` Eritrea
|
176
|
+
`en` Estonia
|
177
|
+
`et` Ethiopia
|
178
|
+
**F**
|
179
|
+
`fj` Fiji
|
180
|
+
`fi` Finland
|
181
|
+
`fr` France
|
182
|
+
**G**
|
183
|
+
`gb` Gabon
|
184
|
+
`ga` The Gambia
|
185
|
+
`gg` Georgia
|
186
|
+
`gm` Germany
|
187
|
+
`gh` Ghana
|
188
|
+
`gr` Greece
|
189
|
+
`gj` Grenada
|
190
|
+
`gt` Guatemala
|
191
|
+
`gv` Guinea
|
192
|
+
`pu` Guinea-Bissau
|
193
|
+
`gy` Guyana
|
194
|
+
**H**
|
195
|
+
`ha` Haiti
|
196
|
+
`ho` Honduras
|
197
|
+
`hu` Hungary
|
198
|
+
**I**
|
199
|
+
`ic` Iceland
|
200
|
+
`in` India
|
201
|
+
`id` Indonesia
|
202
|
+
`ir` Iran
|
203
|
+
`iz` Iraq
|
204
|
+
`ei` Ireland
|
205
|
+
`is` Israel
|
206
|
+
`it` Italy
|
207
|
+
**J**
|
208
|
+
`jm` Jamaica
|
209
|
+
`ja` Japan
|
210
|
+
`jo` Jordan
|
211
|
+
**K**
|
212
|
+
`kz` Kazakhstan
|
213
|
+
`ke` Kenya
|
214
|
+
`kr` Kiribati
|
215
|
+
`kn` North Korea
|
216
|
+
`ks` South Korea
|
217
|
+
`kv` Kosovo
|
218
|
+
`ku` Kuwait
|
219
|
+
`kg` Kyrgyzstan
|
220
|
+
**L**
|
221
|
+
`la` Laos
|
222
|
+
`lg` Latvia
|
223
|
+
`le` Lebanon
|
224
|
+
`lt` Lesotho
|
225
|
+
`li` Liberia
|
226
|
+
`ly` Libya
|
227
|
+
`ls` Liechtenstein
|
228
|
+
`lh` Lithuania
|
229
|
+
`lu` Luxembourg
|
230
|
+
**M**
|
231
|
+
`mk` Macedonia
|
232
|
+
`ma` Madagascar
|
233
|
+
`mi` Malawi
|
234
|
+
`my` Malaysia
|
235
|
+
`mv` Maldives
|
236
|
+
`ml` Mali
|
237
|
+
`mt` Malta
|
238
|
+
`rm` Marshall Islands
|
239
|
+
`mr` Mauritania
|
240
|
+
`mp` Mauritius
|
241
|
+
`mx` Mexico
|
242
|
+
`fm` Micronesia
|
243
|
+
`md` Moldova
|
244
|
+
`mn` Monaco
|
245
|
+
`mg` Mongolia
|
246
|
+
`mj` Montenegro
|
247
|
+
`mo` Morocco
|
248
|
+
`mz` Mozambique
|
249
|
+
**N**
|
250
|
+
`wa` Namibia
|
251
|
+
`nr` Nauru
|
252
|
+
`np` Nepal
|
253
|
+
`nl` Netherlands
|
254
|
+
`nz` New Zealand
|
255
|
+
`nu` Nicaragua
|
256
|
+
`ng` Niger
|
257
|
+
`ni` Nigeria
|
258
|
+
`no` Norway
|
259
|
+
**O**
|
260
|
+
`mu` Oman
|
261
|
+
**P**
|
262
|
+
`pk` Pakistan
|
263
|
+
`ps` Palau
|
264
|
+
`pm` Panama
|
265
|
+
`pp` Papua New Guinea
|
266
|
+
`pa` Paraguay
|
267
|
+
`pe` Peru
|
268
|
+
`rp` Philippines
|
269
|
+
`pl` Poland
|
270
|
+
`po` Portugal
|
271
|
+
**Q**
|
272
|
+
`qa` Qatar
|
273
|
+
**R**
|
274
|
+
`ro` Romania
|
275
|
+
`rs` Russia
|
276
|
+
`rw` Rwanda
|
277
|
+
**S**
|
278
|
+
`sc` Saint Kitts and Nevis
|
279
|
+
`st` Saint Lucia
|
280
|
+
`vc` Saint Vincent and the Grenadines
|
281
|
+
`ws` Samoa
|
282
|
+
`sm` San Marino
|
283
|
+
`tp` Sao Tome and Principe
|
284
|
+
`sa` Saudi Arabia
|
285
|
+
`sg` Senegal
|
286
|
+
`ri` Serbia
|
287
|
+
`se` Seychelles
|
288
|
+
`sl` Sierra Leone
|
289
|
+
`sn` Singapore
|
290
|
+
`lo` Slovakia
|
291
|
+
`si` Slovenia
|
292
|
+
`bp` Solomon Islands
|
293
|
+
`so` Somalia
|
294
|
+
`sf` South Africa
|
295
|
+
`od` South Sudan
|
296
|
+
`sp` Spain
|
297
|
+
`ce` Sri Lanka
|
298
|
+
`su` Sudan
|
299
|
+
`ns` Suriname
|
300
|
+
`wz` Swaziland
|
301
|
+
`sw` Sweden
|
302
|
+
`sz` Switzerland
|
303
|
+
`sy` Syria
|
304
|
+
**T**
|
305
|
+
`ti` Tajikistan
|
306
|
+
`tz` Tanzania
|
307
|
+
`th` Thailand
|
308
|
+
`tt` Timor-Leste
|
309
|
+
`to` Togo
|
310
|
+
`tn` Tonga
|
311
|
+
`td` Trinidad and Tobago
|
312
|
+
`ts` Tunisia
|
313
|
+
`tu` Turkey
|
314
|
+
`tx` Turkmenistan
|
315
|
+
`tv` Tuvalu
|
316
|
+
**U**
|
317
|
+
`ug` Uganda
|
318
|
+
`up` Ukraine
|
319
|
+
`ae` United Arab Emirates
|
320
|
+
`uk` United Kingdom
|
321
|
+
`us` United States
|
322
|
+
`uy` Uruguay
|
323
|
+
`uz` Uzbekistan
|
324
|
+
**V**
|
325
|
+
`nh` Vanuatu
|
326
|
+
`vt` Vatican City (Holy See)
|
327
|
+
`ve` Venezuela
|
328
|
+
`vm` Vietnam
|
329
|
+
**Y**
|
330
|
+
`ym` Yemen
|
331
|
+
**Z**
|
332
|
+
`za` Zambia
|
333
|
+
`zi` Zimbabwe
|
334
|
+
|
335
|
+
|
336
|
+
### Other (2)
|
337
|
+
|
338
|
+
`tw` Taiwan
|
339
|
+
`ee` European Union
|
340
|
+
|
341
|
+
### Dependencies (58)
|
342
|
+
|
343
|
+
Australia (6):
|
344
|
+
`at` Ashmore and Cartier Islands
|
345
|
+
`kt` Christmas Island
|
346
|
+
`ck` Cocos (Keeling) Islands
|
347
|
+
`cr` Coral Sea Islands
|
348
|
+
`hm` Heard Island and McDonald Islands
|
349
|
+
`nf` Norfolk Island
|
350
|
+
|
351
|
+
China (2):
|
352
|
+
`hk` Hong Kong
|
353
|
+
`mc` Macau
|
354
|
+
|
355
|
+
Denmark (2):
|
356
|
+
`fo` Faroe Islands
|
357
|
+
`gl` Greenland
|
358
|
+
|
359
|
+
France (8):
|
360
|
+
`ip` Clipperton Island
|
361
|
+
`fp` French Polynesia
|
362
|
+
`fs` French Southern and Antarctic Lands
|
363
|
+
`nc` New Caledonia
|
364
|
+
`tb` Saint Barthelemy
|
365
|
+
`rn` Saint Martin
|
366
|
+
`sb` Saint Pierre and Miquelon
|
367
|
+
`wf` Wallis and Futuna
|
368
|
+
|
369
|
+
Netherlands (3):
|
370
|
+
`aa` Aruba
|
371
|
+
`uc` Curacao
|
372
|
+
`nn` Sint Maarten
|
373
|
+
|
374
|
+
New Zealand (3):
|
375
|
+
`cw` Cook Islands
|
376
|
+
`ne` Niue
|
377
|
+
`tl` Tokelau
|
378
|
+
|
379
|
+
Norway (3):
|
380
|
+
`bv` Bouvet Island
|
381
|
+
`jn` Jan Mayen
|
382
|
+
`sv` Svalbard
|
383
|
+
|
384
|
+
Great Britain (17):
|
385
|
+
`ax` Akrotiri (Sovereign Base)
|
386
|
+
`av` Anguilla
|
387
|
+
`bd` Bermuda
|
388
|
+
`io` British Indian Ocean Territory
|
389
|
+
`vi` British Virgin Islands
|
390
|
+
`cj` Cayman Islands
|
391
|
+
`dx` Dhekelia (Sovereign Base)
|
392
|
+
`fk` Falkland Islands
|
393
|
+
`gi` Gibraltar
|
394
|
+
`gk` Guernsey
|
395
|
+
`je` Jersey
|
396
|
+
`im` Isle of Man
|
397
|
+
`mh` Montserrat
|
398
|
+
`pc` Pitcairn Islands
|
399
|
+
`sh` Saint Helena
|
400
|
+
`sx` South Georgia and the South Sandwich Islands
|
401
|
+
`tk` Turks and Caicos Islands
|
402
|
+
|
403
|
+
United States (14):
|
404
|
+
`aq` American Samoa
|
405
|
+
`gq` Guam
|
406
|
+
`bq` Navassa Island
|
407
|
+
`cq` Northern Mariana Islands
|
408
|
+
`rq` Puerto Rico
|
409
|
+
`vq` US Virgin Islands
|
410
|
+
`wq` Wake Island
|
411
|
+
`um` US Pacific Island Wildlife Refuges
|
412
|
+
(Baker Island, Howland Island, Jarvis Island, Johnston Atoll, Kingman Reef, Midway Islands, Palmyra Atoll)
|
413
|
+
|
414
|
+
|
415
|
+
### Miscellaneous (6)
|
416
|
+
|
417
|
+
`ay` Antarctica
|
418
|
+
`gz` Gaza Strip
|
419
|
+
`pf` Paracel Islands
|
420
|
+
`pg` Spratly Islands
|
421
|
+
`we` West Bank
|
422
|
+
`wi` Western Sahara
|
423
|
+
|
424
|
+
### Oceans (5)
|
425
|
+
|
426
|
+
`xq` Arctic Ocean
|
427
|
+
`zh` Atlantic Ocean
|
428
|
+
`xo` Indian Ocean
|
429
|
+
`zn` Pacific Ocean
|
430
|
+
`oo` Southern Ocean
|
431
|
+
|
432
|
+
### World (1)
|
433
|
+
|
434
|
+
`xx` World
|
435
|
+
|
436
|
+
|
437
|
+
|
438
|
+
|
439
|
+
## Ready-To-Use Public Domain Factbook Datasets
|
34
440
|
|
441
|
+
[openmundi/factbook.json](https://github.com/openmundi/factbook.json) - open (public domain)
|
442
|
+
factbook country profiles in JSON for all the world's countries (using internet domain names
|
443
|
+
for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
|
35
444
|
|
36
|
-
## Install
|
37
445
|
|
38
|
-
Just install the gem:
|
39
446
|
|
40
|
-
|
447
|
+
## Alternatives (Libraries and Gems)
|
41
448
|
|
449
|
+
Ruby
|
42
450
|
|
43
|
-
|
451
|
+
- [worldfactbook gem](https://github.com/sayem/worldfactbook)
|
452
|
+
by Sayem Khan (aka sayem);
|
453
|
+
fetches data from its own mirror, that is, [rubyworldfactbook.com](http://rubyworldfactbook.com)
|
454
|
+
(last updated 2011?)
|
44
455
|
|
45
|
-
|
456
|
+
- [the_country_identity gem](https://github.com/p1nox/the_country_identity)
|
457
|
+
by Raul Pino (aka p1nox);
|
458
|
+
fetches data from an [RDF Turtle endpoint](http://wifo5-03.informatik.uni-mannheim.de/factbook/)
|
459
|
+
hosted by the Research Group Data and Web Science at the University of Mannheim, Germany
|
46
460
|
|
47
|
-
|
48
|
-
factbook country profiles in JSON for all the world's countries (using internet domain names
|
49
|
-
for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
|
461
|
+
JavaScript
|
50
462
|
|
463
|
+
- [worldfactbook-dataset](https://github.com/twigkit/worldfactbook-dataset)
|
464
|
+
by Richard Marr (aka richmarr); fetches data using Node.js
|
465
|
+
(last updated 2013)
|
51
466
|
|
467
|
+
Others
|
52
468
|
|
53
|
-
|
469
|
+
TBD
|
54
470
|
|
55
|
-
Ruby
|
56
471
|
|
57
|
-
- [worldfactbook gem](https://github.com/sayem/worldfactbook) by sayem (aka Sayem Khan); fetches data from its own mirror, that is, rubyworldfactbook.com (last updated 2011?)
|
58
472
|
|
59
|
-
|
473
|
+
## Install
|
60
474
|
|
61
|
-
|
475
|
+
Just install the gem:
|
476
|
+
|
477
|
+
$ gem install factbook
|
62
478
|
|
63
479
|
|
64
480
|
## License
|
@@ -70,4 +486,4 @@ Use it as you please with no restrictions whatsoever.
|
|
70
486
|
## Questions? Comments?
|
71
487
|
|
72
488
|
Send them along to the [Open Mundi (world.db) Database Forum/Mailing List](http://groups.google.com/group/openmundi).
|
73
|
-
Thanks!
|
489
|
+
Thanks!
|
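As a rough illustration of the "Save to disk as JSON" and `header: true` sections in the README changes above, the sketch below writes a hand-built hash, shaped like the documented header, to a JSON file using only Ruby's standard `json` library. The `profile` hash, its values, and the temporary file path are assumptions made for illustration; the sketch does not call the factbook gem or fetch anything.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical stand-in for page.data; shaped like the README's header example.
profile = {
  'Header' => {
    'code'      => 'au',
    'generator' => 'factbook/0.1.3'
  }
}

# Write pretty-printed JSON to a temporary file (the path is illustrative only).
path = File.join( Dir.mktmpdir, 'au.json' )
File.open( path, 'w' ) do |f|
  f.write JSON.pretty_generate( profile )
end

# Read the file back to confirm the round trip.
loaded = JSON.parse( File.read( path ) )
puts loaded['Header']['code']
```

With the gem itself, the README's `page.to_json( pretty: true )` call would take the place of `JSON.pretty_generate( profile )`.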
data/Rakefile
CHANGED
@@ -32,88 +32,15 @@ Hoe.spec 'factbook' do
|
|
32
32
|
end
|
33
33
|
|
34
34
|
|
35
|
-
=begin
|
36
|
-
# errors to fix:
|
37
|
-
saving a copy to europe/li-liechtenstein.html for debugging
|
38
|
-
found section 0 @ 38
|
39
|
-
found section 1 @ 1882
|
40
|
-
found section 2 @ 13160
|
41
|
-
found section 3 @ 29355
|
42
|
-
found section 4 @ 46010
|
43
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
44
|
-
found section 6 @ 64725
|
45
|
-
|
46
|
-
saving a copy to europe/mc-monaco.html for debugging
|
47
|
-
found section 0 @ 38
|
48
|
-
found section 1 @ 1446
|
49
|
-
found section 2 @ 12736
|
50
|
-
found section 3 @ 31192
|
51
|
-
found section 4 @ 47762
|
52
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
53
|
-
|
54
|
-
saving a copy to europe/sm-san-marino.html for debugging
|
55
|
-
found section 0 @ 38
|
56
|
-
found section 1 @ 1379
|
57
|
-
found section 2 @ 12243
|
58
|
-
found section 3 @ 27349
|
59
|
-
found section 4 @ 46949
|
60
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
61
|
-
|
62
|
-
saving a copy to europe/va-vatican-city.html for debugging
|
63
|
-
found section 0 @ 38
|
64
|
-
found section 1 @ 2000
|
65
|
-
found section 2 @ 13093
|
66
|
-
found section 3 @ 19912
|
67
|
-
found section 4 @ 37264
|
68
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
69
|
-
found section 6 @ 44353
|
70
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Trans"
|
71
|
-
|
72
|
-
saving a copy to pacific/mh-marshall-islands.html for debugging
|
73
|
-
found section 0 @ 38
|
74
|
-
found section 1 @ 1414
|
75
|
-
found section 2 @ 13404
|
76
|
-
found section 3 @ 34854
|
77
|
-
found section 4 @ 52734
|
78
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
79
|
-
|
80
|
-
saving a copy to pacific/pw-palau.html for debugging
|
81
|
-
found section 0 @ 38
|
82
|
-
found section 1 @ 1338
|
83
|
-
found section 2 @ 12729
|
84
|
-
found section 3 @ 34145
|
85
|
-
found section 4 @ 51005
|
86
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
87
|
-
|
88
|
-
saving a copy to pacific/tv-tuvalu.html for debugging
|
89
|
-
found section 0 @ 38
|
90
|
-
found section 1 @ 1391
|
91
|
-
found section 2 @ 13580
|
92
|
-
found section 3 @ 33729
|
93
|
-
found section 4 @ 50390
|
94
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
95
|
-
|
96
|
-
saving a copy to africa/ss-south-sudan.html for debugging
|
97
|
-
found section 0 @ 38
|
98
|
-
found section 1 @ 2560
|
99
|
-
found section 2 @ 11342
|
100
|
-
found section 3 @ 26234
|
101
|
-
found section 4 @ 42271
|
102
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
103
|
-
|
104
|
-
|
105
|
-
=end
|
106
|
-
|
107
|
-
|
108
35
|
|
109
36
|
desc 'generate json for factbook.json repo'
|
110
37
|
task :genjson do
|
111
38
|
require 'factbook'
|
112
39
|
|
113
40
|
countries = [
|
114
|
-
=begin
|
115
41
|
['xx', 'world' ], ## special code for the world
|
116
42
|
|
43
|
+
=begin
|
117
44
|
['ee', 'europe/eu-european-union'], ## special code for the european union
|
118
45
|
['al', 'europe/al-albania' ],
|
119
46
|
['an', 'europe/ad-andorra' ],
|
@@ -140,13 +67,13 @@ task :genjson do
|
|
140
67
|
['ei', 'europe/ie-ireland' ],
|
141
68
|
['it', 'europe/it-italy' ],
|
142
69
|
['lg', 'europe/lv-latvia' ],
|
143
|
-
|
70
|
+
['ls', 'europe/li-liechtenstein' ],
|
144
71
|
['lh', 'europe/lt-lithuania' ],
|
145
72
|
['lu', 'europe/lu-luxembourg' ],
|
146
73
|
['mk', 'europe/mk-macedonia' ],
|
147
74
|
['mt', 'europe/mt-malta' ],
|
148
75
|
['md', 'europe/md-moldova' ],
|
149
|
-
|
76
|
+
['mn', 'europe/mc-monaco' ],
|
150
77
|
['mj', 'europe/me-montenegro' ],
|
151
78
|
['nl', 'europe/nl-netherlands' ],
|
152
79
|
['no', 'europe/no-norway' ],
|
@@ -154,7 +81,7 @@ task :genjson do
|
|
154
81
|
['po', 'europe/pt-portugal' ],
|
155
82
|
['ro', 'europe/ro-romania' ],
|
156
83
|
['rs', 'europe/ru-russia' ],
|
157
|
-
|
84
|
+
['sm', 'europe/sm-san-marino' ],
|
158
85
|
['ri', 'europe/rs-serbia' ],
|
159
86
|
['lo', 'europe/sk-slovakia' ],
|
160
87
|
['si', 'europe/si-slovenia' ],
|
@@ -163,7 +90,7 @@ task :genjson do
|
|
163
90
|
['sz', 'europe/ch-switzerland' ],
|
164
91
|
['tu', 'europe/tr-turkey' ],
|
165
92
|
['up', 'europe/ua-ukraine' ],
|
166
|
-
|
93
|
+
['vt', 'europe/va-vatican-city' ],
|
167
94
|
|
168
95
|
['ca', 'north-america/ca-canada' ],
|
169
96
|
['us', 'north-america/us-united-states' ],
|
@@ -249,7 +176,7 @@ task :genjson do
|
|
249
176
|
['sl', 'africa/sl-sierra-leone' ],
|
250
177
|
['so', 'africa/so-somalia' ],
|
251
178
|
['sf', 'africa/za-south-africa' ],
|
252
|
-
|
179
|
+
['od', 'africa/ss-south-sudan' ],
|
253
180
|
['su', 'africa/sd-sudan' ],
|
254
181
|
['wz', 'africa/sz-swaziland' ],
|
255
182
|
['tz', 'africa/tz-tanzania' ],
|
@@ -308,25 +235,19 @@ task :genjson do
|
|
308
235
|
['as', 'pacific/au-australia' ],
|
309
236
|
['fj', 'pacific/fj-fiji' ],
|
310
237
|
['kr', 'pacific/ki-kiribati' ],
|
311
|
-
|
238
|
+
['rm', 'pacific/mh-marshall-islands' ],
|
312
239
|
['fm', 'pacific/fm-micronesia' ],
|
313
240
|
['nr', 'pacific/nr-nauru' ],
|
314
241
|
['nz', 'pacific/nz-new-zealand' ],
|
315
|
-
|
242
|
+
['ps', 'pacific/pw-palau' ],
|
316
243
|
['pp', 'pacific/pg-papua-new-guinea' ],
|
317
244
|
['ws', 'pacific/ws-samoa' ],
|
318
245
|
['bp', 'pacific/sb-solomon-islands' ],
|
319
246
|
['tn', 'pacific/to-tonga' ],
|
320
|
-
|
247
|
+
['tv', 'pacific/tv-tuvalu' ],
|
321
248
|
['nh', 'pacific/vu-vanuatu' ],
|
322
|
-
|
323
249
|
=end
|
324
250
|
|
325
|
-
|
326
|
-
|
327
|
-
=begin
|
328
|
-
['', 'africa/' ],
|
329
|
-
=end
|
330
251
|
]
|
331
252
|
|
332
253
|
countries.each do |country|
|
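The `countries` array in the Rakefile pairs a factbook country code with an output path whose file name starts with a different two-letter code. A minimal sketch of how the two codes relate, assuming the letters before the first dash in the file name are the internet domain code (the `domain_code` helper is hypothetical and not part of the gem; the loop body of `:genjson` is not shown in this diff):

```ruby
# A few [factbook_code, output_path] pairs copied from the Rakefile above.
countries = [
  ['ls', 'europe/li-liechtenstein'],
  ['mn', 'europe/mc-monaco'],
  ['od', 'africa/ss-south-sudan'],
  ['ps', 'pacific/pw-palau']
]

# Hypothetical helper (not part of the gem): assume the two letters before
# the first dash in the file name are the internet domain code.
def domain_code( path )
  File.basename( path ).split( '-', 2 ).first
end

codes = countries.map { |code, path| [code, domain_code( path )] }.to_h
codes.each { |factbook, domain| puts "#{factbook} => #{domain}" }
```

This matches the README's note that the JSON datasets are named after internet domain codes rather than factbook codes.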
data/lib/factbook.rb
CHANGED
@@ -25,17 +25,6 @@ require 'factbook/page'
|
|
25
25
|
require 'factbook/sect'
|
26
26
|
|
27
27
|
|
28
|
-
module Factbook
|
29
|
-
|
30
|
-
def self.banner
|
31
|
-
"factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
|
32
|
-
end
|
33
|
-
|
34
|
-
def self.root
|
35
|
-
"#{File.expand_path( File.dirname(File.dirname(__FILE__)) )}"
|
36
|
-
end
|
37
|
-
|
38
|
-
end # module Factbook
|
39
28
|
|
40
29
|
|
41
30
|
puts Factbook.banner
|
data/lib/factbook/page.rb
CHANGED
@@ -13,11 +13,14 @@ module Factbook
|
|
13
13
|
## e.g. www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_br.html
|
14
14
|
SITE_BASE = 'https://www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_{code}.html'
|
15
15
|
|
16
|
-
def initialize( code )
|
16
|
+
def initialize( code, opts={} )
|
17
17
|
## note: requires factbook country code
|
18
18
|
# e.g. austria is au
|
19
19
|
# germany is gm and so on
|
20
20
|
@code = code
|
21
|
+
|
22
|
+
### rename fields to format option?? why? why not? e.g. :format => 'long' ??
|
23
|
+
@opts = opts # fields: full|long|keep|std|?? -- find a good name for the option keeping field names as is
|
21
24
|
|
22
25
|
@html = nil
|
23
26
|
@doc = nil
|
@@ -38,10 +41,35 @@ module Factbook
|
|
38
41
|
end
|
39
42
|
end
|
40
43
|
|
44
|
+
|
45
|
+
def [](key) ### convenience shortcut
|
46
|
+
# lets you use
|
47
|
+
# page['geo']
|
48
|
+
# instead of
|
49
|
+
# page.data['geo']
|
50
|
+
|
51
|
+
## fix: use delegate data, [] from forwardable lib - why?? why not??
|
52
|
+
|
53
|
+
data[key]
|
54
|
+
end
|
55
|
+
|
56
|
+
|
41
57
|
def data
|
42
58
|
if @data.nil?
|
43
59
|
@data = {}
|
44
60
|
|
61
|
+
if @opts[:header] ## include (leading) header section ??
|
62
|
+
|
63
|
+
header_key = @opts[:fields] ? 'Header' : 'header'
|
64
|
+
last_built_key = @opts[:fields] ? 'last built' : 'last_built'
|
65
|
+
|
66
|
+
@data[header_key] = {
|
67
|
+
'code' => @code,
|
68
|
+
'generator' => "factbook/#{VERSION}",
|
69
|
+
last_built_key => "#{Time.now}",
|
70
|
+
}
|
71
|
+
end
|
72
|
+
|
45
73
|
sects.each_with_index do |sect,i|
|
46
74
|
logger.debug "############################"
|
47
75
|
logger.debug "### [#{i}] stats sect >#{sect.title}<: "
|
@@ -58,17 +86,18 @@ module Factbook
|
|
58
86
|
## split html into sections
|
59
87
|
## lets us avoid errors w/ (wrongly) nested tags
|
60
88
|
|
89
|
+
## check opts for using long or short category/field names
|
61
90
|
divs = [
|
62
|
-
[ 'intro', '<div id="CollapsiblePanel1_Intro"' ],
|
63
|
-
[ 'geo', '<div id="CollapsiblePanel1_Geo"' ],
|
64
|
-
[ 'people', '<div id="CollapsiblePanel1_People"' ],
|
65
|
-
[ 'govt', '<div id="CollapsiblePanel1_Govt"' ],
|
66
|
-
[ 'econ', '<div id="CollapsiblePanel1_Econ"' ],
|
67
|
-
[ 'energy', '<div id="CollapsiblePanel1_Energy"' ],
|
68
|
-
[ 'comm', '<div id="CollapsiblePanel1_Comm"' ],
|
69
|
-
[ 'trans', '<div id="CollapsiblePanel1_Trans"' ],
|
70
|
-
[ 'military', '<div id="CollapsiblePanel1_Military"'],
|
71
|
-
[ 'issues', '<div id="CollapsiblePanel1_Issues"' ]
|
91
|
+
[ @opts[:fields] ? 'Introduction' : 'intro', '<div id="CollapsiblePanel1_Intro"' ],
|
92
|
+
[ @opts[:fields] ? 'Geography' : 'geo', '<div id="CollapsiblePanel1_Geo"' ],
|
93
|
+
[ @opts[:fields] ? 'People and Society' : 'people', '<div id="CollapsiblePanel1_People"' ],
|
94
|
+
[ @opts[:fields] ? 'Government' : 'govt', '<div id="CollapsiblePanel1_Govt"' ],
|
95
|
+
[ @opts[:fields] ? 'Economy' : 'econ', '<div id="CollapsiblePanel1_Econ"' ],
|
96
|
+
[ @opts[:fields] ? 'Energy' : 'energy', '<div id="CollapsiblePanel1_Energy"' ],
|
97
|
+
[ @opts[:fields] ? 'Communications' : 'comm', '<div id="CollapsiblePanel1_Comm"' ],
|
98
|
+
[ @opts[:fields] ? 'Transportation' : 'trans', '<div id="CollapsiblePanel1_Trans"' ],
|
99
|
+
[ @opts[:fields] ? 'Military' : 'military', '<div id="CollapsiblePanel1_Military"'],
|
100
|
+
[ @opts[:fields] ? 'Transnational Issues': 'issues', '<div id="CollapsiblePanel1_Issues"' ]
|
72
101
|
]
|
73
102
|
|
74
103
|
indexes = []
|
@@ -102,7 +131,7 @@ module Factbook
|
|
102
131
|
|
103
132
|
## todo: check that from is smaller than to
|
104
133
|
logger.debug " cut section #{i} [#{from}..#{to}]"
|
105
|
-
@sects << Sect.new( title, html[ from..to ] )
|
134
|
+
@sects << Sect.new( title, html[ from..to ], @opts )
|
106
135
|
|
107
136
|
##if i==0 || i==1
|
108
137
|
## puts "debug sect #{i}:"
|
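The new `[]` shortcut in `page.rb` carries a comment asking whether `Forwardable` should be used instead. A minimal sketch of that alternative follows, assuming a stand-in class (`PageSketch` is hypothetical) that builds its data from fixed values rather than fetching and parsing a country profile; only the header key selection is copied from the diff.

```ruby
require 'forwardable'

# Hypothetical stand-in for Factbook::Page; it builds data from fixed values
# instead of fetching and parsing a country profile.
class PageSketch
  extend Forwardable

  # Forward page[key] to data[key], as the "fix:" comment in page.rb suggests.
  def_delegator :data, :[]

  def initialize( code, opts={} )
    @code = code
    @opts = opts
    @data = nil
  end

  def data
    if @data.nil?
      @data = {}
      if @opts[:header]
        # Same key selection as the diff: long names when :fields is set.
        header_key = @opts[:fields] ? 'Header' : 'header'
        @data[header_key] = { 'code' => @code }
      end
    end
    @data
  end
end

page = PageSketch.new( 'au', header: true, fields: 'long' )
puts page['Header']['code']
```

Whether delegation is preferable to the hand-written method is a style choice; both give `page['geo']` in place of `page.data['geo']`.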
data/lib/factbook/sect.rb
CHANGED
@@ -7,11 +7,12 @@ module Factbook
|
|
7
7
|
|
8
8
|
attr_reader :title, :html
|
9
9
|
|
10
|
-
def initialize( title, html )
|
10
|
+
def initialize( title, html, opts={} )
|
11
11
|
## todo: passing a ref to the parent page - why? why not??
|
12
12
|
@title = title
|
13
13
|
@html = html
|
14
|
-
|
14
|
+
@opts = opts # fields: full|long|keep|std|??? -- find a good name for the option keeping field names as is
|
15
|
+
|
15
16
|
@doc = nil
|
16
17
|
@data = nil
|
17
18
|
end
|
@@ -28,15 +29,31 @@ module Factbook
|
|
28
29
|
private
|
29
30
|
|
30
31
|
def cleanup_key( key )
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
|
35
|
-
|
36
|
-
|
37
|
-
|
38
|
-
|
39
|
-
|
32
|
+
|
33
|
+
if @opts[:fields] # if set assume full|long|keep for now
|
34
|
+
### keep field names as is
|
35
|
+
## e.g.
|
36
|
+
## GDP - composition, by sector of origin:
|
37
|
+
## Budget surplus (+) or deficit (-):
|
38
|
+
## becomes:
|
39
|
+
## GDP - composition, by sector of origin
|
40
|
+
## Budget surplus (+) or deficit (-)
|
41
|
+
key = key.strip
|
42
|
+
key = key.gsub( /[ ]{2,}/, ' ' ) # fold two plus spaces into one -- check if exists?
|
43
|
+
key = key.gsub( /:\z/, '' ) # remove trailing : if present
|
44
|
+
key = key.strip
|
45
|
+
else
|
46
|
+
## to lower case
|
47
|
+
key = key.downcase
|
48
|
+
## seaport(s) => seaports
|
49
|
+
key = key.gsub( '(s)', 's' )
|
50
|
+
key = key.gsub( ':', '' ) # trailing : ## fix: use regex /:$/ w/ anchor??
|
51
|
+
## remove special chars ()+-/,'
|
52
|
+
key = key.gsub( /['()+\-\/,]/, ' ' )
|
53
|
+
key = key.strip
|
54
|
+
key = key.gsub( /[ ]+/, '_' )
|
55
|
+
end
|
56
|
+
|
40
57
|
key
|
41
58
|
end
|
42
59
|
|
@@ -140,7 +157,7 @@ private
|
|
140
157
|
last_pair[1] += " #{text}" ## append w/o separator
|
141
158
|
end
|
142
159
|
else
|
143
|
-
if last_cat == 'demographic_profile' ## special case (use space a sep)
|
160
|
+
if last_cat == 'demographic_profile' || last_cat == 'Demographic profile' ## special case (use space a sep)
|
144
161
|
last_pair[1] += " #{text}" ## append with separator
|
145
162
|
else
|
146
163
|
last_pair[1] += "; #{text}" ## append with separator
|
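The two branches of `cleanup_key` can be tried in isolation. The sketch below is a standalone copy of the logic from the diff; the `long:` keyword is a hypothetical replacement for the `@opts[:fields]` check so that the function runs outside the `Sect` class.

```ruby
# Standalone copy of the cleanup_key logic from the diff; the `long`
# keyword is a hypothetical replacement for the @opts[:fields] check.
def cleanup_key( key, long: false )
  if long
    key = key.strip
    key = key.gsub( /[ ]{2,}/, ' ' )      # fold two or more spaces into one
    key = key.gsub( /:\z/, '' )           # remove trailing : if present
    key = key.strip
  else
    key = key.downcase
    key = key.gsub( '(s)', 's' )          # seaport(s) => seaports
    key = key.gsub( ':', '' )             # removes every colon, not only a trailing one
    key = key.gsub( /['()+\-\/,]/, ' ' )  # special chars become spaces
    key = key.strip
    key = key.gsub( /[ ]+/, '_' )         # runs of spaces become one underscore
  end
  key
end

puts cleanup_key( 'Budget surplus (+) or deficit (-):' )
puts cleanup_key( 'Budget surplus (+) or deficit (-):', long: true )
puts cleanup_key( 'river port(s):' )
```

The short form yields keys such as `budget_surplus_or_deficit` and `river_ports`, which are the ones asserted in `test_fields_std`; the long form only trims whitespace and the trailing colon.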
data/lib/factbook/version.rb
CHANGED
@@ -1,5 +1,21 @@
|
|
1
1
|
|
2
2
|
module Factbook
|
3
|
-
VERSION = '0.1.2'
|
4
|
-
end
|
5
3
|
|
4
|
+
MAJOR = 0
|
5
|
+
MINOR = 1
|
6
|
+
PATCH = 3
|
7
|
+
VERSION = [MAJOR,MINOR,PATCH].join('.')
|
8
|
+
|
9
|
+
def self.version
|
10
|
+
VERSION
|
11
|
+
end
|
12
|
+
|
13
|
+
def self.banner
|
14
|
+
"factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
|
15
|
+
end
|
16
|
+
|
17
|
+
def self.root
|
18
|
+
"#{File.expand_path( File.dirname(File.dirname(File.dirname(__FILE__))) )}"
|
19
|
+
end
|
20
|
+
|
21
|
+
end
|
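Two details of the `version.rb` change can be checked in a few lines: the version string is now assembled from three integer constants, and `root` gains a third `File.dirname` because it moved from `lib/factbook.rb` to `lib/factbook/version.rb`. In the sketch below, the module name `VersionSketch` and the file path are assumptions for illustration only.

```ruby
# Hypothetical module name; mirrors the constants added in version.rb.
module VersionSketch
  MAJOR = 0
  MINOR = 1
  PATCH = 3
  VERSION = [MAJOR, MINOR, PATCH].join( '.' )

  def self.version
    VERSION
  end
end

# root moved from lib/factbook.rb to lib/factbook/version.rb, one directory
# deeper, so it needs one more File.dirname to reach the gem root.
# The path below is illustrative only.
file = '/gems/factbook/lib/factbook/version.rb'
root = File.dirname( File.dirname( File.dirname( file ) ) )

puts VersionSketch.version
puts root
```

The new test helper `read_test_page` relies on `Factbook.root` pointing at the gem root, which is why the extra `File.dirname` matters.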
data/test/test_fields.rb
ADDED
@@ -0,0 +1,48 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
|
3
|
+
|
4
|
+
require 'helper'
|
5
|
+
|
6
|
+
|
7
|
+
class TestFields < MiniTest::Unit::TestCase
|
8
|
+
|
9
|
+
def read_test_page( code )
|
10
|
+
File.read( "#{Factbook.root}/test/data/countrytemplate_#{code}.html" )
|
11
|
+
end
|
12
|
+
|
13
|
+
def test_fields_full_w_header
|
14
|
+
page = Factbook::Page.new( 'au', header: true, fields: 'full' )
|
15
|
+
page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
|
16
|
+
|
17
|
+
assert_equal 'au', page['Header']['code']
|
18
|
+
assert_equal "factbook/#{Factbook::VERSION}", page['Header']['generator']
|
19
|
+
|
20
|
+
assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
|
21
|
+
assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
|
22
|
+
|
23
|
+
assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
|
24
|
+
end
|
25
|
+
|
26
|
+
|
27
|
+
def test_fields_full
|
28
|
+
page = Factbook::Page.new( 'au', fields: 'full' )
|
29
|
+
page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
|
30
|
+
|
31
|
+
assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
|
32
|
+
assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
|
33
|
+
|
34
|
+
assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
|
35
|
+
end
|
36
|
+
|
37
|
+
def test_fields_std
|
38
|
+
page = Factbook::Page.new( 'au' )
|
39
|
+
page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
|
40
|
+
|
41
|
+
assert_equal '-3.1% of GDP (2012 est.)', page['econ']['budget_surplus_or_deficit']['text']
|
42
|
+
assert_equal '5.5%', page['econ']['labor_force_by_occupation']['agriculture']
|
43
|
+
|
44
|
+
assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['trans']['ports_and_terminals']['river_ports']
|
45
|
+
end
|
46
|
+
|
47
|
+
|
48
|
+
end # class TestFields
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: factbook
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.1.
|
4
|
+
version: 0.1.3
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2014-
|
12
|
+
date: 2014-08-24 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: logutils
|
16
|
-
requirement: &
|
16
|
+
requirement: &74365370 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *74365370
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: fetcher
|
27
|
-
requirement: &
|
27
|
+
requirement: &74365080 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: '0'
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *74365080
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: nokogiri
|
38
|
-
requirement: &
|
38
|
+
requirement: &74364810 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: '0'
|
44
44
|
type: :runtime
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *74364810
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: rdoc
|
49
|
-
requirement: &
|
49
|
+
requirement: &74364520 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ~>
|
@@ -54,18 +54,18 @@ dependencies:
|
|
54
54
|
version: '4.0'
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
57
|
+
version_requirements: *74364520
|
58
58
|
- !ruby/object:Gem::Dependency
|
59
59
|
name: hoe
|
60
|
-
requirement: &
|
60
|
+
requirement: &74364240 !ruby/object:Gem::Requirement
|
61
61
|
none: false
|
62
62
|
requirements:
|
63
63
|
- - ~>
|
64
64
|
- !ruby/object:Gem::Version
|
65
|
-
version: '3.
|
65
|
+
version: '3.12'
|
66
66
|
type: :development
|
67
67
|
prerelease: false
|
68
|
-
version_requirements: *
|
68
|
+
version_requirements: *74364240
|
69
69
|
description: factbook - scripts for the world factbook (get open structured data e.g
|
70
70
|
JSON etc.)
|
71
71
|
email: openmundi@googlegroups.com
|
@@ -93,6 +93,7 @@ files:
|
|
93
93
|
- test/data/countrytemplate_vt.html
|
94
94
|
- test/data/countrytemplate_xx.html
|
95
95
|
- test/helper.rb
|
96
|
+
- test/test_fields.rb
|
96
97
|
- test/test_json.rb
|
97
98
|
- test/test_page.rb
|
98
99
|
- test/test_page_old.rb
|
@@ -129,5 +130,6 @@ summary: factbook - scripts for the world factbook (get open structured data e.g
|
|
129
130
|
test_files:
|
130
131
|
- test/test_page_old.rb
|
131
132
|
- test/test_strip.rb
|
133
|
+
- test/test_fields.rb
|
132
134
|
- test/test_json.rb
|
133
135
|
- test/test_page.rb
|