factbook 0.1.2 → 0.1.3
- data/Manifest.txt +1 -0
- data/README.md +436 -20
- data/Rakefile +9 -88
- data/lib/factbook.rb +0 -11
- data/lib/factbook/page.rb +41 -12
- data/lib/factbook/sect.rb +29 -12
- data/lib/factbook/version.rb +18 -2
- data/test/test_fields.rb +48 -0
- metadata +15 -13
data/Manifest.txt
CHANGED
data/README.md
CHANGED
@@ -21,44 +21,460 @@ offers free country profiles in the public domain (that is, no copyright(s), no
|
|
21
21
|
|
22
22
|
### Get country profile page as a hash (that is, structured data e.g. nested key/values)
|
23
23
|
|
24
|
-
|
25
|
-
pp page.data # pretty print hash
|
24
|
+
```ruby
|
26
25
|
|
26
|
+
page = Factbook::Page.new( 'br' ) # br is the country code for Brazil
|
27
|
+
pp page.data # pretty print hash
|
28
|
+
|
29
|
+
```
|
27
30
|
|
28
31
|
### Save to disk as JSON
|
29
32
|
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
33
|
+
```ruby
|
34
|
+
|
35
|
+
page = Factbook::Page.new( 'br' )
|
36
|
+
File.open( 'br.json', 'w') do |f|
|
37
|
+
f.write page.to_json( pretty: true )
|
38
|
+
end
|
39
|
+
|
40
|
+
```
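A minimal sketch of the same save step using only Ruby's standard `json` library, assuming the page data is a plain nested hash. The `data` hash is a hypothetical stand-in with placeholder values, and `JSON.pretty_generate` is used here in place of `page.to_json( pretty: true )`, whose internals are not shown in this README:

```ruby
require 'json'
require 'tmpdir'

# Hypothetical stand-in for page.data (placeholder values, not real factbook data)
data = {
  'geo'  => { 'location' => { 'text' => 'placeholder' } },
  'econ' => { 'budget_surplus_or_deficit' => { 'text' => 'placeholder' } }
}

# Write the hash as indented JSON into the system temp directory
path = File.join( Dir.tmpdir, 'factbook-sketch.json' )
File.open( path, 'w' ) do |f|
  f.write JSON.pretty_generate( data )
end

# Read the file back and parse it to check the round trip
loaded = JSON.parse( File.read( path ) )
puts loaded['geo']['location']['text']
```

Reading the file back with `JSON.parse` is a cheap way to confirm that what landed on disk is valid JSON with the same nesting.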
|
41
|
+
|
42
|
+
### Options - Header, "Long" Category / Field Names
|
43
|
+
|
44
|
+
#### Include Header Option - `header: true`
|
45
|
+
|
46
|
+
```ruby
|
47
|
+
page = Factbook::Page.new( 'br', header: true )
|
48
|
+
```
|
49
|
+
|
50
|
+
will include a leading header section. Example:
|
51
|
+
|
52
|
+
```json
|
53
|
+
{
|
54
|
+
"Header": {
|
55
|
+
"code": "au",
|
56
|
+
"generator": "factbook/0.1.2",
|
57
|
+
"last_built": "2014-08-24 12:55:39 +0200"
|
58
|
+
}
|
59
|
+
...
|
60
|
+
}
|
61
|
+
```
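As a rough illustration of how the header section could be assembled, here is a standalone sketch modeled on the `data` method in `page.rb` from this diff. The names `build_header` and `SKETCH_VERSION` are hypothetical and exist only so the snippet runs without the gem installed:

```ruby
# Hypothetical stand-in for Factbook::VERSION
SKETCH_VERSION = '0.1.3'

# Build the leading header section for a country code.
# When the fields option is set, the keys switch to their "long" spelling.
def build_header( code, opts={} )
  header_key     = opts[:fields] ? 'Header'     : 'header'
  last_built_key = opts[:fields] ? 'last built' : 'last_built'

  { header_key => {
      'code'         => code,
      'generator'    => "factbook/#{SKETCH_VERSION}",
      last_built_key => Time.now.to_s
    } }
end

short = build_header( 'au' )
long  = build_header( 'au', fields: 'long' )
puts short.keys.inspect
puts long.keys.inspect
```

In this sketch the spelling of the header keys depends on the `fields` option, which follows the key selection in the `page.rb` hunk.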
|
62
|
+
|
63
|
+
#### "Long" Category / Field Names Option - `fields: 'long'`
|
64
|
+
|
65
|
+
```ruby
|
66
|
+
page = Factbook::Page.new( 'br', fields: 'long')
|
67
|
+
```
|
68
|
+
|
69
|
+
will change the category / field names to the long form (that is, passing through unchanged from the source).
|
70
|
+
e.g.
|
71
|
+
|
72
|
+
```ruby
|
73
|
+
page['econ']['budget_surplus_or_deficit']['text']
|
74
|
+
page['econ']['labor_force_by_occupation']['agriculture']
|
75
|
+
page['trans']['ports_and_terminals']['river_ports']
|
76
|
+
```
|
77
|
+
becomes
|
78
|
+
|
79
|
+
```ruby
|
80
|
+
page['Economy']['Budget surplus (+) or deficit (-)']['text']
|
81
|
+
page['Economy']['Labor force - by occupation']['agriculture']
|
82
|
+
page['Transportation']['Ports and terminals']['river port(s)']
|
83
|
+
```
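The mapping between long and short names can be sketched as a standalone function that mirrors the `cleanup_key` logic in `sect.rb` from this diff. The name `normalize_key` and its `long:` keyword are hypothetical; the gem itself reads the setting from `@opts[:fields]`:

```ruby
# Standalone sketch of the key cleanup; not the gem's own method
def normalize_key( key, long: false )
  if long
    # keep the field name as is, apart from whitespace and a trailing colon
    key = key.strip
    key = key.gsub( /[ ]{2,}/, ' ' )       # fold runs of spaces into one
    key = key.sub( /:\z/, '' )             # drop a trailing colon
    key.strip
  else
    key = key.downcase
    key = key.gsub( '(s)', 's' )           # port(s) => ports
    key = key.gsub( ':', '' )              # drop colons
    key = key.gsub( /['()+\-\/,]/, ' ' )   # special chars => space
    key = key.strip
    key.gsub( /[ ]+/, '_' )                # runs of spaces => one underscore
  end
end

puts normalize_key( 'Budget surplus (+) or deficit (-):' )
puts normalize_key( 'Budget surplus (+) or deficit (-):', long: true )
puts normalize_key( 'river port(s):' )
```

The short form is lossy (punctuation and case are dropped), which is why the long form is offered for callers who want the source spelling.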
|
84
|
+
|
85
|
+
Note: You can - of course - use the options together e.g.
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
page = Factbook::Page.new( 'br', header: true, fields: 'long' )
|
89
|
+
```
|
90
|
+
|
91
|
+
or
|
92
|
+
|
93
|
+
```ruby
|
94
|
+
opts = {
|
95
|
+
header: true,
|
96
|
+
fields: 'long'
|
97
|
+
}
|
98
|
+
page = Factbook::Page.new( 'br', opts )
|
99
|
+
```
|
100
|
+
|
101
|
+
|
102
|
+
## The World Factbook Summary (267 Entries)
|
103
|
+
|
104
|
+
The World Factbook includes 267 entries -
|
105
|
+
195 sovereign countries /
|
106
|
+
2 others /
|
107
|
+
58 dependencies /
|
108
|
+
6 miscellaneous /
|
109
|
+
5 oceans /
|
110
|
+
1 world:
|
111
|
+
|
112
|
+
|
113
|
+
### Sovereign Countries (195)
|
114
|
+
|
115
|
+
**A**
|
116
|
+
`af` Afghanistan
|
117
|
+
`al` Albania
|
118
|
+
`ag` Algeria
|
119
|
+
`an` Andorra
|
120
|
+
`ao` Angola
|
121
|
+
`ac` Antigua and Barbuda
|
122
|
+
`ar` Argentina
|
123
|
+
`am` Armenia
|
124
|
+
`as` Australia
|
125
|
+
`au` Austria
|
126
|
+
`aj` Azerbaijan
|
127
|
+
**B**
|
128
|
+
`bf` The Bahamas
|
129
|
+
`ba` Bahrain
|
130
|
+
`bg` Bangladesh
|
131
|
+
`bb` Barbados
|
132
|
+
`bo` Belarus
|
133
|
+
`be` Belgium
|
134
|
+
`bh` Belize
|
135
|
+
`bn` Benin
|
136
|
+
`bt` Bhutan
|
137
|
+
`bl` Bolivia
|
138
|
+
`bk` Bosnia and Herzegovina
|
139
|
+
`bc` Botswana
|
140
|
+
`br` Brazil
|
141
|
+
`bx` Brunei
|
142
|
+
`bu` Bulgaria
|
143
|
+
`uv` Burkina Faso
|
144
|
+
`bm` Burma
|
145
|
+
`by` Burundi
|
146
|
+
**C**
|
147
|
+
`cb` Cambodia
|
148
|
+
`cm` Cameroon
|
149
|
+
`ca` Canada
|
150
|
+
`cv` Cape Verde
|
151
|
+
`ct` Central African Republic
|
152
|
+
`cd` Chad
|
153
|
+
`ci` Chile
|
154
|
+
`ch` China
|
155
|
+
`co` Colombia
|
156
|
+
`cn` Comoros
|
157
|
+
`cg` Congo DR
|
158
|
+
`cf` Congo
|
159
|
+
`cs` Costa Rica
|
160
|
+
`iv` Cote d'Ivoire
|
161
|
+
`hr` Croatia
|
162
|
+
`cu` Cuba
|
163
|
+
`cy` Cyprus
|
164
|
+
`ez` Czech Republic
|
165
|
+
**D**
|
166
|
+
`da` Denmark
|
167
|
+
`dj` Djibouti
|
168
|
+
`do` Dominica
|
169
|
+
`dr` Dominican Republic
|
170
|
+
**E**
|
171
|
+
`ec` Ecuador
|
172
|
+
`eg` Egypt
|
173
|
+
`es` El Salvador
|
174
|
+
`ek` Equatorial Guinea
|
175
|
+
`er` Eritrea
|
176
|
+
`en` Estonia
|
177
|
+
`et` Ethiopia
|
178
|
+
**F**
|
179
|
+
`fj` Fiji
|
180
|
+
`fi` Finland
|
181
|
+
`fr` France
|
182
|
+
**G**
|
183
|
+
`gb` Gabon
|
184
|
+
`ga` The Gambia
|
185
|
+
`gg` Georgia
|
186
|
+
`gm` Germany
|
187
|
+
`gh` Ghana
|
188
|
+
`gr` Greece
|
189
|
+
`gj` Grenada
|
190
|
+
`gt` Guatemala
|
191
|
+
`gv` Guinea
|
192
|
+
`pu` Guinea-Bissau
|
193
|
+
`gy` Guyana
|
194
|
+
**H**
|
195
|
+
`ha` Haiti
|
196
|
+
`ho` Honduras
|
197
|
+
`hu` Hungary
|
198
|
+
**I**
|
199
|
+
`ic` Iceland
|
200
|
+
`in` India
|
201
|
+
`id` Indonesia
|
202
|
+
`ir` Iran
|
203
|
+
`iz` Iraq
|
204
|
+
`ei` Ireland
|
205
|
+
`is` Israel
|
206
|
+
`it` Italy
|
207
|
+
**J**
|
208
|
+
`jm` Jamaica
|
209
|
+
`ja` Japan
|
210
|
+
`jo` Jordan
|
211
|
+
**K**
|
212
|
+
`kz` Kazakhstan
|
213
|
+
`ke` Kenya
|
214
|
+
`kr` Kiribati
|
215
|
+
`kn` North Korea
|
216
|
+
`ks` South Korea
|
217
|
+
`kv` Kosovo
|
218
|
+
`ku` Kuwait
|
219
|
+
`kg` Kyrgyzstan
|
220
|
+
**L**
|
221
|
+
`la` Laos
|
222
|
+
`lg` Latvia
|
223
|
+
`le` Lebanon
|
224
|
+
`lt` Lesotho
|
225
|
+
`li` Liberia
|
226
|
+
`ly` Libya
|
227
|
+
`ls` Liechtenstein
|
228
|
+
`lh` Lithuania
|
229
|
+
`lu` Luxembourg
|
230
|
+
**M**
|
231
|
+
`mk` Macedonia
|
232
|
+
`ma` Madagascar
|
233
|
+
`mi` Malawi
|
234
|
+
`my` Malaysia
|
235
|
+
`mv` Maldives
|
236
|
+
`ml` Mali
|
237
|
+
`mt` Malta
|
238
|
+
`rm` Marshall Islands
|
239
|
+
`mr` Mauritania
|
240
|
+
`mp` Mauritius
|
241
|
+
`mx` Mexico
|
242
|
+
`fm` Micronesia
|
243
|
+
`md` Moldova
|
244
|
+
`mn` Monaco
|
245
|
+
`mg` Mongolia
|
246
|
+
`mj` Montenegro
|
247
|
+
`mo` Morocco
|
248
|
+
`mz` Mozambique
|
249
|
+
**N**
|
250
|
+
`wa` Namibia
|
251
|
+
`nr` Nauru
|
252
|
+
`np` Nepal
|
253
|
+
`nl` Netherlands
|
254
|
+
`nz` New Zealand
|
255
|
+
`nu` Nicaragua
|
256
|
+
`ng` Niger
|
257
|
+
`ni` Nigeria
|
258
|
+
`no` Norway
|
259
|
+
**O**
|
260
|
+
`mu` Oman
|
261
|
+
**P**
|
262
|
+
`pk` Pakistan
|
263
|
+
`ps` Palau
|
264
|
+
`pm` Panama
|
265
|
+
`pp` Papua New Guinea
|
266
|
+
`pa` Paraguay
|
267
|
+
`pe` Peru
|
268
|
+
`rp` Philippines
|
269
|
+
`pl` Poland
|
270
|
+
`po` Portugal
|
271
|
+
**Q**
|
272
|
+
`qa` Qatar
|
273
|
+
**R**
|
274
|
+
`ro` Romania
|
275
|
+
`rs` Russia
|
276
|
+
`rw` Rwanda
|
277
|
+
**S**
|
278
|
+
`sc` Saint Kitts and Nevis
|
279
|
+
`st` Saint Lucia
|
280
|
+
`vc` Saint Vincent and the Grenadines
|
281
|
+
`ws` Samoa
|
282
|
+
`sm` San Marino
|
283
|
+
`tp` Sao Tome and Principe
|
284
|
+
`sa` Saudi Arabia
|
285
|
+
`sg` Senegal
|
286
|
+
`ri` Serbia
|
287
|
+
`se` Seychelles
|
288
|
+
`sl` Sierra Leone
|
289
|
+
`sn` Singapore
|
290
|
+
`lo` Slovakia
|
291
|
+
`si` Slovenia
|
292
|
+
`bp` Solomon Islands
|
293
|
+
`so` Somalia
|
294
|
+
`sf` South Africa
|
295
|
+
`od` South Sudan
|
296
|
+
`sp` Spain
|
297
|
+
`ce` Sri Lanka
|
298
|
+
`su` Sudan
|
299
|
+
`ns` Suriname
|
300
|
+
`wz` Swaziland
|
301
|
+
`sw` Sweden
|
302
|
+
`sz` Switzerland
|
303
|
+
`sy` Syria
|
304
|
+
**T**
|
305
|
+
`ti` Tajikistan
|
306
|
+
`tz` Tanzania
|
307
|
+
`th` Thailand
|
308
|
+
`tt` Timor-Leste
|
309
|
+
`to` Togo
|
310
|
+
`tn` Tonga
|
311
|
+
`td` Trinidad and Tobago
|
312
|
+
`ts` Tunisia
|
313
|
+
`tu` Turkey
|
314
|
+
`tx` Turkmenistan
|
315
|
+
`tv` Tuvalu
|
316
|
+
**U**
|
317
|
+
`ug` Uganda
|
318
|
+
`up` Ukraine
|
319
|
+
`ae` United Arab Emirates
|
320
|
+
`uk` United Kingdom
|
321
|
+
`us` United States
|
322
|
+
`uy` Uruguay
|
323
|
+
`uz` Uzbekistan
|
324
|
+
**V**
|
325
|
+
`nh` Vanuatu
|
326
|
+
`vt` Vatican City (Holy See)
|
327
|
+
`ve` Venezuela
|
328
|
+
`vm` Vietnam
|
329
|
+
**Y**
|
330
|
+
`ym` Yemen
|
331
|
+
**Z**
|
332
|
+
`za` Zambia
|
333
|
+
`zi` Zimbabwe
|
334
|
+
|
335
|
+
|
336
|
+
### Other (2)
|
337
|
+
|
338
|
+
`tw` Taiwan
|
339
|
+
`ee` European Union
|
340
|
+
|
341
|
+
### Dependencies (58)
|
342
|
+
|
343
|
+
Australia (6):
|
344
|
+
`at` Ashmore and Cartier Islands
|
345
|
+
`kt` Christmas Island
|
346
|
+
`ck` Cocos (Keeling) Islands
|
347
|
+
`cr` Coral Sea Islands
|
348
|
+
`hm` Heard Island and McDonald Islands
|
349
|
+
`nf` Norfolk Island
|
350
|
+
|
351
|
+
China (2):
|
352
|
+
`hk` Hong Kong
|
353
|
+
`mc` Macau
|
354
|
+
|
355
|
+
Denmark (2):
|
356
|
+
`fo` Faroe Islands
|
357
|
+
`gl` Greenland
|
358
|
+
|
359
|
+
France (8):
|
360
|
+
`ip` Clipperton Island
|
361
|
+
`fp` French Polynesia
|
362
|
+
`fs` French Southern and Antarctic Lands
|
363
|
+
`nc` New Caledonia
|
364
|
+
`tb` Saint Barthelemy
|
365
|
+
`rn` Saint Martin
|
366
|
+
`sb` Saint Pierre and Miquelon
|
367
|
+
`wf` Wallis and Futuna
|
368
|
+
|
369
|
+
Netherlands (3):
|
370
|
+
`aa` Aruba
|
371
|
+
`uc` Curacao
|
372
|
+
`nn` Sint Maarten
|
373
|
+
|
374
|
+
New Zealand (3):
|
375
|
+
`cw` Cook Islands
|
376
|
+
`ne` Niue
|
377
|
+
`tl` Tokelau
|
378
|
+
|
379
|
+
Norway (3):
|
380
|
+
`bv` Bouvet Island
|
381
|
+
`jn` Jan Mayen
|
382
|
+
`sv` Svalbard
|
383
|
+
|
384
|
+
Great Britain (17):
|
385
|
+
`ax` Akrotiri (Sovereign Base)
|
386
|
+
`av` Anguilla
|
387
|
+
`bd` Bermuda
|
388
|
+
`io` British Indian Ocean Territory
|
389
|
+
`vi` British Virgin Islands
|
390
|
+
`cj` Cayman Islands
|
391
|
+
`dx` Dhekelia (Sovereign Base)
|
392
|
+
`fk` Falkland Islands
|
393
|
+
`gi` Gibraltar
|
394
|
+
`gk` Guernsey
|
395
|
+
`je` Jersey
|
396
|
+
`im` Isle of Man
|
397
|
+
`mh` Montserrat
|
398
|
+
`pc` Pitcairn Islands
|
399
|
+
`sh` Saint Helena
|
400
|
+
`sx` South Georgia and the South Sandwich Islands
|
401
|
+
`tk` Turks and Caicos Islands
|
402
|
+
|
403
|
+
United States (14):
|
404
|
+
`aq` American Samoa
|
405
|
+
`gq` Guam
|
406
|
+
`bq` Navassa Island
|
407
|
+
`cq` Northern Mariana Islands
|
408
|
+
`rq` Puerto Rico
|
409
|
+
`vq` US Virgin Islands
|
410
|
+
`wq` Wake Island
|
411
|
+
`um` US Pacific Island Wildlife Refuges
|
412
|
+
(Baker Island, Howland Island, Jarvis Island, Johnston Atoll, Kingman Reef, Midway Islands, Palmyra Atoll)
|
413
|
+
|
414
|
+
|
415
|
+
### Miscellaneous (6)
|
416
|
+
|
417
|
+
`ay` Antarctica
|
418
|
+
`gz` Gaza Strip
|
419
|
+
`pf` Paracel Islands
|
420
|
+
`pg` Spratly Islands
|
421
|
+
`we` West Bank
|
422
|
+
`wi` Western Sahara
|
423
|
+
|
424
|
+
### Oceans (5)
|
425
|
+
|
426
|
+
`xq` Arctic Ocean
|
427
|
+
`zh` Atlantic Ocean
|
428
|
+
`xo` Indian Ocean
|
429
|
+
`zn` Pacific Ocean
|
430
|
+
`oo` Southern Ocean
|
431
|
+
|
432
|
+
### World (1)
|
433
|
+
|
434
|
+
`xx` World
|
435
|
+
|
436
|
+
|
437
|
+
|
438
|
+
|
439
|
+
## Ready-To-Use Public Domain Factbook Datasets
|
34
440
|
|
441
|
+
[openmundi/factbook.json](https://github.com/openmundi/factbook.json) - open (public domain)
|
442
|
+
factbook country profiles in JSON for all the world's countries (using internet domain names
|
443
|
+
for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
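
A small sketch of the code translation described above. The pairs come from this README note and from the Rakefile paths in this diff (for example `europe/ad-andorra`); the constant `FACTBOOK_TO_DOMAIN` and the helper `json_filename` are hypothetical names, and the table is deliberately incomplete:

```ruby
# Factbook code => internet domain code (partial table, for illustration)
FACTBOOK_TO_DOMAIN = {
  'au' => 'at',   # Austria
  'gm' => 'de',   # Germany
  'an' => 'ad',   # Andorra
  'ei' => 'ie',   # Ireland
  'lg' => 'lv',   # Latvia
  'sz' => 'ch'    # Switzerland
}

# Hypothetical helper: JSON file name for a factbook code.
# Falls back to the code itself, which is only right when both codes coincide.
def json_filename( code )
  "#{FACTBOOK_TO_DOMAIN.fetch( code, code )}.json"
end

puts json_filename( 'au' )
puts json_filename( 'gm' )
```

A real lookup would need the full list of codes; the fallback here is a shortcut for the sketch, not a safe default.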
|
35
444
|
|
36
|
-
## Install
|
37
445
|
|
38
|
-
Just install the gem:
|
39
446
|
|
40
|
-
|
447
|
+
## Alternatives (Libraries and Gems)
|
41
448
|
|
449
|
+
Ruby
|
42
450
|
|
43
|
-
|
451
|
+
- [worldfactbook gem](https://github.com/sayem/worldfactbook)
|
452
|
+
by Sayem Khan (aka sayem);
|
453
|
+
fetches data from its own mirror, that is, [rubyworldfactbook.com](http://rubyworldfactbook.com)
|
454
|
+
(last updated 2011?)
|
44
455
|
|
45
|
-
|
456
|
+
- [the_country_identity gem](https://github.com/p1nox/the_country_identity)
|
457
|
+
by Raul Pino (aka p1nox);
|
458
|
+
fetches data from an [RDF Turtle endpoint](http://wifo5-03.informatik.uni-mannheim.de/factbook/)
|
459
|
+
hosted by the Research Group Data and Web Science at the University of Mannheim, Germany
|
46
460
|
|
47
|
-
|
48
|
-
factbook country profiles in JSON for all the world's countries (using internet domain names
|
49
|
-
for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
|
461
|
+
JavaScript
|
50
462
|
|
463
|
+
- [worldfactbook-dataset](https://github.com/twigkit/worldfactbook-dataset)
|
464
|
+
by Richard Marr (aka richmarr); fetches data using Node.js
|
465
|
+
(last updated 2013)
|
51
466
|
|
467
|
+
Others
|
52
468
|
|
53
|
-
|
469
|
+
TBD
|
54
470
|
|
55
|
-
Ruby
|
56
471
|
|
57
|
-
- [worldfactbook gem](https://github.com/sayem/worldfactbook) by sayem (aka Sayem Khan); fetches data from its own mirror, that is, rubyworldfactbook.com (last updated 2011?)
|
58
472
|
|
59
|
-
|
473
|
+
## Install
|
60
474
|
|
61
|
-
|
475
|
+
Just install the gem:
|
476
|
+
|
477
|
+
$ gem install factbook
|
62
478
|
|
63
479
|
|
64
480
|
## License
|
@@ -70,4 +486,4 @@ Use it as you please with no restrictions whatsoever.
|
|
70
486
|
## Questions? Comments?
|
71
487
|
|
72
488
|
Send them along to the [Open Mundi (world.db) Database Forum/Mailing List](http://groups.google.com/group/openmundi).
|
73
|
-
Thanks!
|
489
|
+
Thanks!
|
data/Rakefile
CHANGED
@@ -32,88 +32,15 @@ Hoe.spec 'factbook' do
|
|
32
32
|
end
|
33
33
|
|
34
34
|
|
35
|
-
=begin
|
36
|
-
# errors to fix:
|
37
|
-
saving a copy to europe/li-liechtenstein.html for debugging
|
38
|
-
found section 0 @ 38
|
39
|
-
found section 1 @ 1882
|
40
|
-
found section 2 @ 13160
|
41
|
-
found section 3 @ 29355
|
42
|
-
found section 4 @ 46010
|
43
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
44
|
-
found section 6 @ 64725
|
45
|
-
|
46
|
-
aving a copy to europe/mc-monaco.html for debugging
|
47
|
-
found section 0 @ 38
|
48
|
-
found section 1 @ 1446
|
49
|
-
found section 2 @ 12736
|
50
|
-
found section 3 @ 31192
|
51
|
-
found section 4 @ 47762
|
52
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
53
|
-
|
54
|
-
saving a copy to europe/sm-san-marino.html for debugging
|
55
|
-
found section 0 @ 38
|
56
|
-
found section 1 @ 1379
|
57
|
-
found section 2 @ 12243
|
58
|
-
found section 3 @ 27349
|
59
|
-
found section 4 @ 46949
|
60
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
61
|
-
|
62
|
-
saving a copy to europe/va-vatican-city.html for debugging
|
63
|
-
found section 0 @ 38
|
64
|
-
found section 1 @ 2000
|
65
|
-
found section 2 @ 13093
|
66
|
-
found section 3 @ 19912
|
67
|
-
found section 4 @ 37264
|
68
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
69
|
-
found section 6 @ 44353
|
70
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Trans"
|
71
|
-
|
72
|
-
saving a copy to pacific/mh-marshall-islands.html for debugging
|
73
|
-
found section 0 @ 38
|
74
|
-
found section 1 @ 1414
|
75
|
-
found section 2 @ 13404
|
76
|
-
found section 3 @ 34854
|
77
|
-
found section 4 @ 52734
|
78
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
79
|
-
|
80
|
-
saving a copy to pacific/pw-palau.html for debugging
|
81
|
-
found section 0 @ 38
|
82
|
-
found section 1 @ 1338
|
83
|
-
found section 2 @ 12729
|
84
|
-
found section 3 @ 34145
|
85
|
-
found section 4 @ 51005
|
86
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
87
|
-
|
88
|
-
saving a copy to pacific/tv-tuvalu.html for debugging
|
89
|
-
found section 0 @ 38
|
90
|
-
found section 1 @ 1391
|
91
|
-
found section 2 @ 13580
|
92
|
-
found section 3 @ 33729
|
93
|
-
found section 4 @ 50390
|
94
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
95
|
-
|
96
|
-
saving a copy to africa/ss-south-sudan.html for debugging
|
97
|
-
found section 0 @ 38
|
98
|
-
found section 1 @ 2560
|
99
|
-
found section 2 @ 11342
|
100
|
-
found section 3 @ 26234
|
101
|
-
found section 4 @ 42271
|
102
|
-
*** error: section not found -- <div id="CollapsiblePanel1_Energy"
|
103
|
-
|
104
|
-
|
105
|
-
=end
|
106
|
-
|
107
|
-
|
108
35
|
|
109
36
|
desc 'generate json for factbook.json repo'
|
110
37
|
task :genjson do
|
111
38
|
require 'factbook'
|
112
39
|
|
113
40
|
countries = [
|
114
|
-
=begin
|
115
41
|
['xx', 'world' ], ## special code for the world
|
116
42
|
|
43
|
+
=begin
|
117
44
|
['ee', 'europe/eu-european-union'], ## special code for the european union
|
118
45
|
['al', 'europe/al-albania' ],
|
119
46
|
['an', 'europe/ad-andorra' ],
|
@@ -140,13 +67,13 @@ task :genjson do
|
|
140
67
|
['ei', 'europe/ie-ireland' ],
|
141
68
|
['it', 'europe/it-italy' ],
|
142
69
|
['lg', 'europe/lv-latvia' ],
|
143
|
-
|
70
|
+
['ls', 'europe/li-liechtenstein' ],
|
144
71
|
['lh', 'europe/lt-lithuania' ],
|
145
72
|
['lu', 'europe/lu-luxembourg' ],
|
146
73
|
['mk', 'europe/mk-macedonia' ],
|
147
74
|
['mt', 'europe/mt-malta' ],
|
148
75
|
['md', 'europe/md-moldova' ],
|
149
|
-
|
76
|
+
['mn', 'europe/mc-monaco' ],
|
150
77
|
['mj', 'europe/me-montenegro' ],
|
151
78
|
['nl', 'europe/nl-netherlands' ],
|
152
79
|
['no', 'europe/no-norway' ],
|
@@ -154,7 +81,7 @@ task :genjson do
|
|
154
81
|
['po', 'europe/pt-portugal' ],
|
155
82
|
['ro', 'europe/ro-romania' ],
|
156
83
|
['rs', 'europe/ru-russia' ],
|
157
|
-
|
84
|
+
['sm', 'europe/sm-san-marino' ],
|
158
85
|
['ri', 'europe/rs-serbia' ],
|
159
86
|
['lo', 'europe/sk-slovakia' ],
|
160
87
|
['si', 'europe/si-slovenia' ],
|
@@ -163,7 +90,7 @@ task :genjson do
|
|
163
90
|
['sz', 'europe/ch-switzerland' ],
|
164
91
|
['tu', 'europe/tr-turkey' ],
|
165
92
|
['up', 'europe/ua-ukraine' ],
|
166
|
-
|
93
|
+
['vt', 'europe/va-vatican-city' ],
|
167
94
|
|
168
95
|
['ca', 'north-america/ca-canada' ],
|
169
96
|
['us', 'north-america/us-united-states' ],
|
@@ -249,7 +176,7 @@ task :genjson do
|
|
249
176
|
['sl', 'africa/sl-sierra-leone' ],
|
250
177
|
['so', 'africa/so-somalia' ],
|
251
178
|
['sf', 'africa/za-south-africa' ],
|
252
|
-
|
179
|
+
['od', 'africa/ss-south-sudan' ],
|
253
180
|
['su', 'africa/sd-sudan' ],
|
254
181
|
['wz', 'africa/sz-swaziland' ],
|
255
182
|
['tz', 'africa/tz-tanzania' ],
|
@@ -308,25 +235,19 @@ task :genjson do
|
|
308
235
|
['as', 'pacific/au-australia' ],
|
309
236
|
['fj', 'pacific/fj-fiji' ],
|
310
237
|
['kr', 'pacific/ki-kiribati' ],
|
311
|
-
|
238
|
+
['rm', 'pacific/mh-marshall-islands' ],
|
312
239
|
['fm', 'pacific/fm-micronesia' ],
|
313
240
|
['nr', 'pacific/nr-nauru' ],
|
314
241
|
['nz', 'pacific/nz-new-zealand' ],
|
315
|
-
|
242
|
+
['ps', 'pacific/pw-palau' ],
|
316
243
|
['pp', 'pacific/pg-papua-new-guinea' ],
|
317
244
|
['ws', 'pacific/ws-samoa' ],
|
318
245
|
['bp', 'pacific/sb-solomon-islands' ],
|
319
246
|
['tn', 'pacific/to-tonga' ],
|
320
|
-
|
247
|
+
['tv', 'pacific/tv-tuvalu' ],
|
321
248
|
['nh', 'pacific/vu-vanuatu' ],
|
322
|
-
|
323
249
|
=end
|
324
250
|
|
325
|
-
|
326
|
-
|
327
|
-
=begin
|
328
|
-
['', 'africa/' ],
|
329
|
-
=end
|
330
251
|
]
|
331
252
|
|
332
253
|
countries.each do |country|
|
data/lib/factbook.rb
CHANGED
@@ -25,17 +25,6 @@ require 'factbook/page'
 require 'factbook/sect'
 
 
-module Factbook
-
-  def self.banner
-    "factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
-  end
-
-  def self.root
-    "#{File.expand_path( File.dirname(File.dirname(__FILE__)) )}"
-  end
-
-end # module Factbook
 
 
 puts Factbook.banner
data/lib/factbook/page.rb
CHANGED
@@ -13,11 +13,14 @@ module Factbook
|
|
13
13
|
## e.g. www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_br.html
|
14
14
|
SITE_BASE = 'https://www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_{code}.html'
|
15
15
|
|
16
|
-
def initialize( code )
|
16
|
+
def initialize( code, opts={} )
|
17
17
|
## note: requires factbook country code
|
18
18
|
# e.g. austria is au
|
19
19
|
# germany is gm and so on
|
20
20
|
@code = code
|
21
|
+
|
22
|
+
### rename fields to format option?? why? why not? e.g. :format => 'long' ??
|
23
|
+
@opts = opts # fields: full|long|keep|std|?? -- find a good name for the option keeping field names as is
|
21
24
|
|
22
25
|
@html = nil
|
23
26
|
@doc = nil
|
@@ -38,10 +41,35 @@ module Factbook
|
|
38
41
|
end
|
39
42
|
end
|
40
43
|
|
44
|
+
|
45
|
+
def [](key) ### convenience shortcut
|
46
|
+
# lets you use
|
47
|
+
# page['geo']
|
48
|
+
# instead of
|
49
|
+
# page.data['geo']
|
50
|
+
|
51
|
+
## fix: use delegate data, [] from forwardable lib - why?? why not??
|
52
|
+
|
53
|
+
data[key]
|
54
|
+
end
|
55
|
+
|
56
|
+
|
41
57
|
def data
|
42
58
|
if @data.nil?
|
43
59
|
@data = {}
|
44
60
|
|
61
|
+
if @opts[:header] ## include (leading) header section ??
|
62
|
+
|
63
|
+
header_key = @opts[:fields] ? 'Header' : 'header'
|
64
|
+
last_built_key = @opts[:fields] ? 'last built' : 'last_built'
|
65
|
+
|
66
|
+
@data[header_key] = {
|
67
|
+
'code' => @code,
|
68
|
+
'generator' => "factbook/#{VERSION}",
|
69
|
+
last_built_key => "#{Time.now}",
|
70
|
+
}
|
71
|
+
end
|
72
|
+
|
45
73
|
sects.each_with_index do |sect,i|
|
46
74
|
logger.debug "############################"
|
47
75
|
logger.debug "### [#{i}] stats sect >#{sect.title}<: "
|
@@ -58,17 +86,18 @@ module Factbook
|
|
58
86
|
## split html into sections
|
59
87
|
## lets us avoid errors w/ (wrongly) nested tags
|
60
88
|
|
89
|
+
## check opts for using long or short category/field names
|
61
90
|
divs = [
|
62
|
-
[ 'intro', '<div id="CollapsiblePanel1_Intro"' ],
|
63
|
-
[ 'geo', '<div id="CollapsiblePanel1_Geo"' ],
|
64
|
-
[ 'people', '<div id="CollapsiblePanel1_People"' ],
|
65
|
-
[ 'govt', '<div id="CollapsiblePanel1_Govt"' ],
|
66
|
-
[ 'econ', '<div id="CollapsiblePanel1_Econ"' ],
|
67
|
-
[ 'energy', '<div id="CollapsiblePanel1_Energy"' ],
|
68
|
-
[ 'comm', '<div id="CollapsiblePanel1_Comm"' ],
|
69
|
-
[ 'trans', '<div id="CollapsiblePanel1_Trans"' ],
|
70
|
-
[ 'military', '<div id="CollapsiblePanel1_Military"'],
|
71
|
-
[ 'issues', '<div id="CollapsiblePanel1_Issues"' ]
|
91
|
+
[ @opts[:fields] ? 'Introduction' : 'intro', '<div id="CollapsiblePanel1_Intro"' ],
|
92
|
+
[ @opts[:fields] ? 'Geography' : 'geo', '<div id="CollapsiblePanel1_Geo"' ],
|
93
|
+
[ @opts[:fields] ? 'People and Society' : 'people', '<div id="CollapsiblePanel1_People"' ],
|
94
|
+
[ @opts[:fields] ? 'Government' : 'govt', '<div id="CollapsiblePanel1_Govt"' ],
|
95
|
+
[ @opts[:fields] ? 'Economy' : 'econ', '<div id="CollapsiblePanel1_Econ"' ],
|
96
|
+
[ @opts[:fields] ? 'Energy' : 'energy', '<div id="CollapsiblePanel1_Energy"' ],
|
97
|
+
[ @opts[:fields] ? 'Communications' : 'comm', '<div id="CollapsiblePanel1_Comm"' ],
|
98
|
+
[ @opts[:fields] ? 'Transportation' : 'trans', '<div id="CollapsiblePanel1_Trans"' ],
|
99
|
+
[ @opts[:fields] ? 'Military' : 'military', '<div id="CollapsiblePanel1_Military"'],
|
100
|
+
[ @opts[:fields] ? 'Transnational Issues': 'issues', '<div id="CollapsiblePanel1_Issues"' ]
|
72
101
|
]
|
73
102
|
|
74
103
|
indexes = []
|
@@ -102,7 +131,7 @@ module Factbook
|
|
102
131
|
|
103
132
|
## todo: check that from is smaller than to
|
104
133
|
logger.debug " cut section #{i} [#{from}..#{to}]"
|
105
|
-
@sects << Sect.new( title, html[ from..to ] )
|
134
|
+
@sects << Sect.new( title, html[ from..to ], @opts )
|
106
135
|
|
107
136
|
##if i==0 || i==1
|
108
137
|
## puts "debug sect #{i}:"
|
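The `fix:` comment in the `page.rb` hunk above asks whether `[]` could be delegated through the `forwardable` library. A minimal sketch of that alternative, using a hypothetical `PageSketch` class with placeholder data rather than the real `Factbook::Page`:

```ruby
require 'forwardable'

# Hypothetical stand-in for Factbook::Page; only the delegation is shown
class PageSketch
  extend Forwardable

  # forward page[key] to data[key]
  def_delegator :data, :[]

  # placeholder data; the real class builds this hash from the fetched page
  def data
    @data ||= { 'geo' => { 'area' => 'placeholder' } }
  end
end

page = PageSketch.new
puts page['geo']['area']
```

Both approaches behave the same for callers; the explicit `def [](key)` in the diff is arguably easier to read, while `def_delegator` avoids writing the method body by hand.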
data/lib/factbook/sect.rb
CHANGED
@@ -7,11 +7,12 @@ module Factbook
|
|
7
7
|
|
8
8
|
attr_reader :title, :html
|
9
9
|
|
10
|
-
def initialize( title, html )
|
10
|
+
def initialize( title, html, opts={} )
|
11
11
|
## todo: passing a ref to the parent page - why? why not??
|
12
12
|
@title = title
|
13
13
|
@html = html
|
14
|
-
|
14
|
+
@opts = opts # fields: full|long|keep|std|??? -- find a good name for the option keeping field names as is
|
15
|
+
|
15
16
|
@doc = nil
|
16
17
|
@data = nil
|
17
18
|
end
|
@@ -28,15 +29,31 @@ module Factbook
|
|
28
29
|
private
|
29
30
|
|
30
31
|
def cleanup_key( key )
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
|
35
|
-
|
36
|
-
|
37
|
-
|
38
|
-
|
39
|
-
|
32
|
+
|
33
|
+
if @opts[:fields] # if set assume full|long|keep for now
|
34
|
+
### keep field names as is
|
35
|
+
## e.g.
|
36
|
+
## GDP - composition, by sector of origin:
|
37
|
+
## Budget surplus (+) or deficit (-):
|
38
|
+
## becomes:
|
39
|
+
## GDP - composition, by sector of origin
|
40
|
+
## Budget surplus (+) or deficit (-)
|
41
|
+
key = key.strip
|
42
|
+
key = key.gsub( /[ ]{2,}/, ' ' ) # fold two plus spaces into one -- check if exists?
|
43
|
+
key = key.gsub( /:\z/, '' ) # remove trailing : if present
|
44
|
+
key = key.strip
|
45
|
+
else
|
46
|
+
## to lower case
|
47
|
+
key = key.downcase
|
48
|
+
## seaport(s) => seaports
|
49
|
+
key = key.gsub( '(s)', 's' )
|
50
|
+
key = key.gsub( ':', '' ) # trailing : ## fix: use regex /:$/ w/ anchor??
|
51
|
+
## remove special chars ()+-/,'
|
52
|
+
key = key.gsub( /['()+\-\/,]/, ' ' )
|
53
|
+
key = key.strip
|
54
|
+
key = key.gsub( /[ ]+/, '_' )
|
55
|
+
end
|
56
|
+
|
40
57
|
key
|
41
58
|
end
|
42
59
|
|
@@ -140,7 +157,7 @@ private
|
|
140
157
|
last_pair[1] += " #{text}" ## append w/o separator
|
141
158
|
end
|
142
159
|
else
|
143
|
-
if last_cat == 'demographic_profile' ## special case (use space a sep)
|
160
|
+
if last_cat == 'demographic_profile' || last_cat == 'Demographic profile' ## special case (use space a sep)
|
144
161
|
last_pair[1] += " #{text}" ## append with separator
|
145
162
|
else
|
146
163
|
last_pair[1] += "; #{text}" ## append with separator
|
data/lib/factbook/version.rb
CHANGED
@@ -1,5 +1,21 @@
 
 module Factbook
-  VERSION = '0.1.2'
-end
 
+  MAJOR = 0
+  MINOR = 1
+  PATCH = 3
+  VERSION = [MAJOR,MINOR,PATCH].join('.')
+
+  def self.version
+    VERSION
+  end
+
+  def self.banner
+    "factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
+  end
+
+  def self.root
+    "#{File.expand_path( File.dirname(File.dirname(File.dirname(__FILE__))) )}"
+  end
+
+end
data/test/test_fields.rb
ADDED
@@ -0,0 +1,48 @@
+# encoding: utf-8
+
+
+require 'helper'
+
+
+class TestFields < MiniTest::Unit::TestCase
+
+  def read_test_page( code )
+    File.read( "#{Factbook.root}/test/data/countrytemplate_#{code}.html" )
+  end
+
+  def test_fields_full_w_header
+    page = Factbook::Page.new( 'au', header: true, fields: 'full' )
+    page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
+
+    assert_equal 'au', page['Header']['code']
+    assert_equal "factbook/#{Factbook::VERSION}", page['Header']['generator']
+
+    assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
+    assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
+
+    assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
+  end
+
+
+  def test_fields_full
+    page = Factbook::Page.new( 'au', fields: 'full' )
+    page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
+
+    assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
+    assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
+
+    assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
+  end
+
+  def test_fields_std
+    page = Factbook::Page.new( 'au' )
+    page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
+
+    assert_equal '-3.1% of GDP (2012 est.)', page['econ']['budget_surplus_or_deficit']['text']
+    assert_equal '5.5%', page['econ']['labor_force_by_occupation']['agriculture']
+
+    assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['trans']['ports_and_terminals']['river_ports']
+  end
+
+
+end # class TestFields
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: factbook
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.1.
|
4
|
+
version: 0.1.3
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2014-
|
12
|
+
date: 2014-08-24 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: logutils
|
16
|
-
requirement: &
|
16
|
+
requirement: &74365370 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,10 +21,10 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *74365370
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
name: fetcher
|
27
|
-
requirement: &
|
27
|
+
requirement: &74365080 !ruby/object:Gem::Requirement
|
28
28
|
none: false
|
29
29
|
requirements:
|
30
30
|
- - ! '>='
|
@@ -32,10 +32,10 @@ dependencies:
|
|
32
32
|
version: '0'
|
33
33
|
type: :runtime
|
34
34
|
prerelease: false
|
35
|
-
version_requirements: *
|
35
|
+
version_requirements: *74365080
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: nokogiri
|
38
|
-
requirement: &
|
38
|
+
requirement: &74364810 !ruby/object:Gem::Requirement
|
39
39
|
none: false
|
40
40
|
requirements:
|
41
41
|
- - ! '>='
|
@@ -43,10 +43,10 @@ dependencies:
|
|
43
43
|
version: '0'
|
44
44
|
type: :runtime
|
45
45
|
prerelease: false
|
46
|
-
version_requirements: *
|
46
|
+
version_requirements: *74364810
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
48
|
name: rdoc
|
49
|
-
requirement: &
|
49
|
+
requirement: &74364520 !ruby/object:Gem::Requirement
|
50
50
|
none: false
|
51
51
|
requirements:
|
52
52
|
- - ~>
|
@@ -54,18 +54,18 @@ dependencies:
|
|
54
54
|
version: '4.0'
|
55
55
|
type: :development
|
56
56
|
prerelease: false
|
57
|
-
version_requirements: *
|
57
|
+
version_requirements: *74364520
|
58
58
|
- !ruby/object:Gem::Dependency
|
59
59
|
name: hoe
|
60
|
-
requirement: &
|
60
|
+
requirement: &74364240 !ruby/object:Gem::Requirement
|
61
61
|
none: false
|
62
62
|
requirements:
|
63
63
|
- - ~>
|
64
64
|
- !ruby/object:Gem::Version
|
65
|
-
version: '3.
|
65
|
+
version: '3.12'
|
66
66
|
type: :development
|
67
67
|
prerelease: false
|
68
|
-
version_requirements: *
|
68
|
+
version_requirements: *74364240
|
69
69
|
description: factbook - scripts for the world factbook (get open structured data e.g
|
70
70
|
JSON etc.)
|
71
71
|
email: openmundi@googlegroups.com
|
@@ -93,6 +93,7 @@ files:
|
|
93
93
|
- test/data/countrytemplate_vt.html
|
94
94
|
- test/data/countrytemplate_xx.html
|
95
95
|
- test/helper.rb
|
96
|
+
- test/test_fields.rb
|
96
97
|
- test/test_json.rb
|
97
98
|
- test/test_page.rb
|
98
99
|
- test/test_page_old.rb
|
@@ -129,5 +130,6 @@ summary: factbook - scripts for the world factbook (get open structured data e.g
|
|
129
130
|
test_files:
|
130
131
|
- test/test_page_old.rb
|
131
132
|
- test/test_strip.rb
|
133
|
+
- test/test_fields.rb
|
132
134
|
- test/test_json.rb
|
133
135
|
- test/test_page.rb
|