factbook 0.1.2 → 0.1.3

data/Manifest.txt CHANGED
@@ -15,6 +15,7 @@ test/data/countrytemplate_mx.html
15
15
  test/data/countrytemplate_vt.html
16
16
  test/data/countrytemplate_xx.html
17
17
  test/helper.rb
18
+ test/test_fields.rb
18
19
  test/test_json.rb
19
20
  test/test_page.rb
20
21
  test/test_page_old.rb
data/README.md CHANGED
@@ -21,44 +21,460 @@ offers free country profiles in the public domain (that is, no copyright(s), no
21
21
 
22
22
  ### Get country profile page as a hash (that is, structured data e.g. nested key/values)
23
23
 
24
- page = Factbook::Page.new( 'br' ) # br is the country code for Brazil
25
- pp page.data # pretty print hash
24
+ ```ruby
26
25
 
26
+ page = Factbook::Page.new( 'br' ) # br is the country code for Brazil
27
+ pp page.data # pretty print hash
28
+
29
+ ```
27
30
 
28
31
  ### Save to disk as JSON
29
32
 
30
- page = Factbook::Page.new( 'br' )
31
- File.open( 'br.json', 'w') do |f|
32
- f.write page.to_json( pretty: true )
33
- end
33
+ ```ruby
34
+
35
+ page = Factbook::Page.new( 'br' )
36
+ File.open( 'br.json', 'w') do |f|
37
+ f.write page.to_json( pretty: true )
38
+ end
39
+
40
+ ```
41
+
42
+ ### Options - Header, "Long" Category / Field Names
43
+
44
+ #### Include Header Option - `header: true`
45
+
46
+ ```ruby
47
+ page = Factbook::Page.new( 'br', header: true )
48
+ ```
49
+
50
+ will include a leading header section. Example:
51
+
52
+ ```json
53
+ {
54
+ "Header": {
55
+ "code": "au",
56
+ "generator": "factbook/0.1.2",
57
+ "last_built": "2014-08-24 12:55:39 +0200"
58
+ }
59
+ ...
60
+ }
61
+ ```
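A minimal sketch of how such a header could be assembled, modeled on the header logic this release adds to `Factbook::Page#data`. `build_header` and its `version` and `now` parameters are hypothetical names used for illustration; they are not part of the gem's API.

```ruby
# Hypothetical helper (not part of the gem): mirrors the header logic
# added to Factbook::Page#data in this release.
def build_header(code, opts = {}, version = '0.1.3', now = Time.now)
  return {} unless opts[:header]

  # Key names follow the fields option: long names when it is set,
  # short (lowercase, underscored) names otherwise.
  header_key     = opts[:fields] ? 'Header'     : 'header'
  last_built_key = opts[:fields] ? 'last built' : 'last_built'

  {
    header_key => {
      'code'         => code,
      'generator'    => "factbook/#{version}",
      last_built_key => now.to_s
    }
  }
end

short = build_header('au', { header: true })
long  = build_header('au', { header: true, fields: 'long' })

puts short.keys.inspect          # top-level key with short names
puts long['Header'].keys.inspect # inner keys with long names
```

With only `header: true` the top-level key comes out as `header`; once `fields` is also set it becomes `Header` and `last_built` becomes `last built`.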
62
+
63
+ #### "Long" Category / Field Names Option - `fields: 'long'`
64
+
65
+ ```ruby
66
+ page = Factbook::Page.new( 'br', fields: 'long')
67
+ ```
68
+
69
+ will change the category / field names to the long form (that is, the names from the source, kept as is apart from trimmed whitespace and a dropped trailing colon).
70
+ e.g.
71
+
72
+ ```ruby
73
+ page['econ']['budget_surplus_or_deficit']['text']
74
+ page['econ']['labor_force_by_occupation']['agriculture']
75
+ page['trans']['ports_and_terminals']['river_ports']
76
+ ```
77
+ becomes
78
+
79
+ ```ruby
80
+ page['Economy']['Budget surplus (+) or deficit (-)']['text']
81
+ page['Economy']['Labor force - by occupation']['agriculture']
82
+ page['Transportation']['Ports and terminals']['river port(s)']
83
+ ```
84
+
85
+ Note: You can - of course - use the options together e.g.
86
+
87
+ ```ruby
88
+ page = Factbook::Page.new( 'br', header: true, fields: 'long' )
89
+ ```
90
+
91
+ or
92
+
93
+ ```ruby
94
+ opts = {
95
+ header: true,
96
+ fields: 'long'
97
+ }
98
+ page = Factbook::Page.new( 'br', opts )
99
+ ```
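As a small sketch of what the two naming schemes mean for lookups: the hashes below are hand-written stand-ins for `page.data` (no page is fetched), filled with the value the gem's own test for Austria expects. `Hash#dig` is plain Ruby (2.3 or later), not something the gem provides.

```ruby
# Stand-ins for page.data; only the one field used in the gem's tests.
std_data = {
  'econ' => {
    'budget_surplus_or_deficit' => { 'text' => '-3.1% of GDP (2012 est.)' }
  }
}
long_data = {
  'Economy' => {
    'Budget surplus (+) or deficit (-)' => { 'text' => '-3.1% of GDP (2012 est.)' }
  }
}

# Each scheme needs its own key path.
std_text  = std_data.dig('econ', 'budget_surplus_or_deficit', 'text')
long_text = long_data.dig('Economy', 'Budget surplus (+) or deficit (-)', 'text')

# A long-form path against short-form data yields nil instead of raising.
missing = std_data.dig('Economy', 'Budget surplus (+) or deficit (-)', 'text')

puts std_text
puts long_text
puts missing.inspect
```

Code written against one scheme will not find fields under the other, so it helps to settle on the `fields` option once per project.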
100
+
101
+
102
+ ## The World Factbook Summary (267 Entries)
103
+
104
+ The World Factbook includes 267 entries -
105
+ 195 sovereign countries /
106
+ 2 others /
107
+ 58 dependencies /
108
+ 6 miscellaneous /
109
+ 5 oceans /
110
+ 1 world:
111
+
112
+
113
+ ### Sovereign Countries (195)
114
+
115
+ **A**
116
+ `af` Afghanistan
117
+ `al` Albania
118
+ `ag` Algeria
119
+ `an` Andorra
120
+ `ao` Angola
121
+ `ac` Antigua and Barbuda
122
+ `ar` Argentina
123
+ `am` Armenia
124
+ `as` Australia
125
+ `au` Austria
126
+ `aj` Azerbaijan
127
+ **B**
128
+ `bf` The Bahamas
129
+ `ba` Bahrain
130
+ `bg` Bangladesh
131
+ `bb` Barbados
132
+ `bo` Belarus
133
+ `be` Belgium
134
+ `bh` Belize
135
+ `bn` Benin
136
+ `bt` Bhutan
137
+ `bl` Bolivia
138
+ `bk` Bosnia and Herzegovina
139
+ `bc` Botswana
140
+ `br` Brazil
141
+ `bx` Brunei
142
+ `bu` Bulgaria
143
+ `uv` Burkina Faso
144
+ `bm` Burma
145
+ `by` Burundi
146
+ **C**
147
+ `cb` Cambodia
148
+ `cm` Cameroon
149
+ `ca` Canada
150
+ `cv` Cape Verde
151
+ `ct` Central African Republic
152
+ `cd` Chad
153
+ `ci` Chile
154
+ `ch` China
155
+ `co` Colombia
156
+ `cn` Comoros
157
+ `cg` Congo DR
158
+ `cf` Congo
159
+ `cs` Costa Rica
160
+ `iv` Cote d'Ivoire
161
+ `hr` Croatia
162
+ `cu` Cuba
163
+ `cy` Cyprus
164
+ `ez` Czech Republic
165
+ **D**
166
+ `da` Denmark
167
+ `dj` Djibouti
168
+ `do` Dominica
169
+ `dr` Dominican Republic
170
+ **E**
171
+ `ec` Ecuador
172
+ `eg` Egypt
173
+ `es` El Salvador
174
+ `ek` Equatorial Guinea
175
+ `er` Eritrea
176
+ `en` Estonia
177
+ `et` Ethiopia
178
+ **F**
179
+ `fj` Fiji
180
+ `fi` Finland
181
+ `fr` France
182
+ **G**
183
+ `gb` Gabon
184
+ `ga` The Gambia
185
+ `gg` Georgia
186
+ `gm` Germany
187
+ `gh` Ghana
188
+ `gr` Greece
189
+ `gj` Grenada
190
+ `gt` Guatemala
191
+ `gv` Guinea
192
+ `pu` Guinea-Bissau
193
+ `gy` Guyana
194
+ **H**
195
+ `ha` Haiti
196
+ `ho` Honduras
197
+ `hu` Hungary
198
+ **I**
199
+ `ic` Iceland
200
+ `in` India
201
+ `id` Indonesia
202
+ `ir` Iran
203
+ `iz` Iraq
204
+ `ei` Ireland
205
+ `is` Israel
206
+ `it` Italy
207
+ **J**
208
+ `jm` Jamaica
209
+ `ja` Japan
210
+ `jo` Jordan
211
+ **K**
212
+ `kz` Kazakhstan
213
+ `ke` Kenya
214
+ `kr` Kiribati
215
+ `kn` North Korea
216
+ `ks` South Korea
217
+ `kv` Kosovo
218
+ `ku` Kuwait
219
+ `kg` Kyrgyzstan
220
+ **L**
221
+ `la` Laos
222
+ `lg` Latvia
223
+ `le` Lebanon
224
+ `lt` Lesotho
225
+ `li` Liberia
226
+ `ly` Libya
227
+ `ls` Liechtenstein
228
+ `lh` Lithuania
229
+ `lu` Luxembourg
230
+ **M**
231
+ `mk` Macedonia
232
+ `ma` Madagascar
233
+ `mi` Malawi
234
+ `my` Malaysia
235
+ `mv` Maldives
236
+ `ml` Mali
237
+ `mt` Malta
238
+ `rm` Marshall Islands
239
+ `mr` Mauritania
240
+ `mp` Mauritius
241
+ `mx` Mexico
242
+ `fm` Micronesia
243
+ `md` Moldova
244
+ `mn` Monaco
245
+ `mg` Mongolia
246
+ `mj` Montenegro
247
+ `mo` Morocco
248
+ `mz` Mozambique
249
+ **N**
250
+ `wa` Namibia
251
+ `nr` Nauru
252
+ `np` Nepal
253
+ `nl` Netherlands
254
+ `nz` New Zealand
255
+ `nu` Nicaragua
256
+ `ng` Niger
257
+ `ni` Nigeria
258
+ `no` Norway
259
+ **O**
260
+ `mu` Oman
261
+ **P**
262
+ `pk` Pakistan
263
+ `ps` Palau
264
+ `pm` Panama
265
+ `pp` Papua New Guinea
266
+ `pa` Paraguay
267
+ `pe` Peru
268
+ `rp` Philippines
269
+ `pl` Poland
270
+ `po` Portugal
271
+ **Q**
272
+ `qa` Qatar
273
+ **R**
274
+ `ro` Romania
275
+ `rs` Russia
276
+ `rw` Rwanda
277
+ **S**
278
+ `sc` Saint Kitts and Nevis
279
+ `st` Saint Lucia
280
+ `vc` Saint Vincent and the Grenadines
281
+ `ws` Samoa
282
+ `sm` San Marino
283
+ `tp` Sao Tome and Principe
284
+ `sa` Saudi Arabia
285
+ `sg` Senegal
286
+ `ri` Serbia
287
+ `se` Seychelles
288
+ `sl` Sierra Leone
289
+ `sn` Singapore
290
+ `lo` Slovakia
291
+ `si` Slovenia
292
+ `bp` Solomon Islands
293
+ `so` Somalia
294
+ `sf` South Africa
295
+ `od` South Sudan
296
+ `sp` Spain
297
+ `ce` Sri Lanka
298
+ `su` Sudan
299
+ `ns` Suriname
300
+ `wz` Swaziland
301
+ `sw` Sweden
302
+ `sz` Switzerland
303
+ `sy` Syria
304
+ **T**
305
+ `ti` Tajikistan
306
+ `tz` Tanzania
307
+ `th` Thailand
308
+ `tt` Timor-Leste
309
+ `to` Togo
310
+ `tn` Tonga
311
+ `td` Trinidad and Tobago
312
+ `ts` Tunisia
313
+ `tu` Turkey
314
+ `tx` Turkmenistan
315
+ `tv` Tuvalu
316
+ **U**
317
+ `ug` Uganda
318
+ `up` Ukraine
319
+ `ae` United Arab Emirates
320
+ `uk` United Kingdom
321
+ `us` United States
322
+ `uy` Uruguay
323
+ `uz` Uzbekistan
324
+ **V**
325
+ `nh` Vanuatu
326
+ `vt` Vatican City (Holy See)
327
+ `ve` Venezuela
328
+ `vm` Vietnam
329
+ **Y**
330
+ `ym` Yemen
331
+ **Z**
332
+ `za` Zambia
333
+ `zi` Zimbabwe
334
+
335
+
336
+ ### Other (2)
337
+
338
+ `tw` Taiwan
339
+ `ee` European Union
340
+
341
+ ### Dependencies (58)
342
+
343
+ Australia (6):
344
+ `at` Ashmore and Cartier Islands
345
+ `kt` Christmas Island
346
+ `ck` Cocos (Keeling) Islands
347
+ `cr` Coral Sea Islands
348
+ `hm` Heard Island and McDonald Islands
349
+ `nf` Norfolk Island
350
+
351
+ China (2):
352
+ `hk` Hong Kong
353
+ `mc` Macau
354
+
355
+ Denmark (2):
356
+ `fo` Faroe Islands
357
+ `gl` Greenland
358
+
359
+ France (8):
360
+ `ip` Clipperton Island
361
+ `fp` French Polynesia
362
+ `fs` French Southern and Antarctic Lands
363
+ `nc` New Caledonia
364
+ `tb` Saint Barthelemy
365
+ `rn` Saint Martin
366
+ `sb` Saint Pierre and Miquelon
367
+ `wf` Wallis and Futuna
368
+
369
+ Netherlands (3):
370
+ `aa` Aruba
371
+ `uc` Curacao
372
+ `nn` Sint Maarten
373
+
374
+ New Zealand (3):
375
+ `cw` Cook Islands
376
+ `ne` Niue
377
+ `tl` Tokelau
378
+
379
+ Norway (3):
380
+ `bv` Bouvet Island
381
+ `jn` Jan Mayen
382
+ `sv` Svalbard
383
+
384
+ Great Britain (17):
385
+ `ax` Akrotiri (Sovereign Base)
386
+ `av` Anguilla
387
+ `bd` Bermuda
388
+ `io` British Indian Ocean Territory
389
+ `vi` British Virgin Islands
390
+ `cj` Cayman Islands
391
+ `dx` Dhekelia (Sovereign Base)
392
+ `fk` Falkland Islands
393
+ `gi` Gibraltar
394
+ `gk` Guernsey
395
+ `je` Jersey
396
+ `im` Isle of Man
397
+ `mh` Montserrat
398
+ `pc` Pitcairn Islands
399
+ `sh` Saint Helena
400
+ `sx` South Georgia and the South Sandwich Islands
401
+ `tk` Turks and Caicos Islands
402
+
403
+ United States (14):
404
+ `aq` American Samoa
405
+ `gq` Guam
406
+ `bq` Navassa Island
407
+ `cq` Northern Mariana Islands
408
+ `rq` Puerto Rico
409
+ `vq` US Virgin Islands
410
+ `wq` Wake Island
411
+ `um` US Pacific Island Wildlife Refuges
412
+ (Baker Island, Howland Island, Jarvis Island, Johnston Atoll, Kingman Reef, Midway Islands, Palmyra Atoll)
413
+
414
+
415
+ ### Miscellaneous (6)
416
+
417
+ `ay` Antarctica
418
+ `gz` Gaza Strip
419
+ `pf` Paracel Islands
420
+ `pg` Spratly Islands
421
+ `we` West Bank
422
+ `wi` Western Sahara
423
+
424
+ ### Oceans (5)
425
+
426
+ `xq` Arctic Ocean
427
+ `zh` Atlantic Ocean
428
+ `xo` Indian Ocean
429
+ `zn` Pacific Ocean
430
+ `oo` Southern Ocean
431
+
432
+ ### World (1)
433
+
434
+ `xx` World
435
+
436
+
437
+
438
+
439
+ ## Ready-To-Use Public Domain Factbook Datasets
34
440
 
441
+ [openmundi/factbook.json](https://github.com/openmundi/factbook.json) - open (public domain)
442
+ factbook country profiles in JSON for all the world's countries (using internet domain names
443
+ for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
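The difference between the two code schemes can be sketched with a few pairs taken from the gem's Rakefile, where each factbook code sits next to an output path whose file name starts with the internet domain code. `domain_code` is a hypothetical helper, and reading the code off the first two characters of the file name is an assumption about that naming pattern.

```ruby
# [factbook code, output path] pairs as listed in the Rakefile.
PAIRS = [
  ['ls', 'europe/li-liechtenstein'],
  ['mn', 'europe/mc-monaco'],
  ['vt', 'europe/va-vatican-city'],
  ['od', 'africa/ss-south-sudan']
]

# Hypothetical helper: take the two-letter prefix of the file name.
def domain_code(path)
  File.basename(path)[0, 2]
end

codes = PAIRS.map { |factbook_code, path| [factbook_code, domain_code(path)] }.to_h

puts codes.inspect
# => {"ls"=>"li", "mn"=>"mc", "vt"=>"va", "od"=>"ss"}
```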
35
444
 
36
- ## Install
37
445
 
38
- Just install the gem:
39
446
 
40
- $ gem install factbook
447
+ ## Alternatives (Libraries and Gems)
41
448
 
449
+ Ruby
42
450
 
43
- ## Ready-To-Use Public Domain Datasets
451
+ - [worldfactbook gem](https://github.com/sayem/worldfactbook)
452
+ by Sayem Khan (aka sayem);
453
+ fetches data from its own mirror, that is, [rubyworldfactbook.com](http://rubyworldfactbook.com)
454
+ (last updated 2011?)
44
455
 
45
- Datasets generated by `factbook` include:
456
+ - [the_country_identity gem](https://github.com/p1nox/the_country_identity)
457
+ by Raul Pino (aka p1nox);
458
+ fetches data from an [RDF Turtle endpoint](http://wifo5-03.informatik.uni-mannheim.de/factbook/)
459
+ hosted by the Research Group Data and Web Science at the University of Mannheim, Germany
46
460
 
47
- [openmundi/factbook.json](https://github.com/openmundi/factbook.json) - open (public domain)
48
- factbook country profiles in JSON for all the world's countries (using internet domain names
49
- for country codes e.g. Austria is `at.json` not `au.json`, Germany is `de.json` not `gm.json` and so on)
461
+ JavaScript
50
462
 
463
+ - [worldfactbook-dataset](https://github.com/twigkit/worldfactbook-dataset)
464
+ by Richard Marr (aka richmarr); fetches data using Node.js
465
+ (last updated 2013)
51
466
 
467
+ Others
52
468
 
53
- ## Alternatives Libraries and Gems
469
+ TBD
54
470
 
55
- Ruby
56
471
 
57
- - [worldfactbook gem](https://github.com/sayem/worldfactbook) by sayem (aka Sayem Khan); fetches data from its own mirror, that is, rubyworldfactbook.com (last updated 2011?)
58
472
 
59
- Others
473
+ ## Install
60
474
 
61
- TBD
475
+ Just install the gem:
476
+
477
+ $ gem install factbook
62
478
 
63
479
 
64
480
  ## License
@@ -70,4 +486,4 @@ Use it as you please with no restrictions whatsoever.
70
486
  ## Questions? Comments?
71
487
 
72
488
  Send them along to the [Open Mundi (world.db) Database Forum/Mailing List](http://groups.google.com/group/openmundi).
73
- Thanks!
489
+ Thanks!
data/Rakefile CHANGED
@@ -32,88 +32,15 @@ Hoe.spec 'factbook' do
32
32
  end
33
33
 
34
34
 
35
- =begin
36
- # errors to fix:
37
- saving a copy to europe/li-liechtenstein.html for debugging
38
- found section 0 @ 38
39
- found section 1 @ 1882
40
- found section 2 @ 13160
41
- found section 3 @ 29355
42
- found section 4 @ 46010
43
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
44
- found section 6 @ 64725
45
-
46
- saving a copy to europe/mc-monaco.html for debugging
47
- found section 0 @ 38
48
- found section 1 @ 1446
49
- found section 2 @ 12736
50
- found section 3 @ 31192
51
- found section 4 @ 47762
52
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
53
-
54
- saving a copy to europe/sm-san-marino.html for debugging
55
- found section 0 @ 38
56
- found section 1 @ 1379
57
- found section 2 @ 12243
58
- found section 3 @ 27349
59
- found section 4 @ 46949
60
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
61
-
62
- saving a copy to europe/va-vatican-city.html for debugging
63
- found section 0 @ 38
64
- found section 1 @ 2000
65
- found section 2 @ 13093
66
- found section 3 @ 19912
67
- found section 4 @ 37264
68
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
69
- found section 6 @ 44353
70
- *** error: section not found -- <div id="CollapsiblePanel1_Trans"
71
-
72
- saving a copy to pacific/mh-marshall-islands.html for debugging
73
- found section 0 @ 38
74
- found section 1 @ 1414
75
- found section 2 @ 13404
76
- found section 3 @ 34854
77
- found section 4 @ 52734
78
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
79
-
80
- saving a copy to pacific/pw-palau.html for debugging
81
- found section 0 @ 38
82
- found section 1 @ 1338
83
- found section 2 @ 12729
84
- found section 3 @ 34145
85
- found section 4 @ 51005
86
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
87
-
88
- saving a copy to pacific/tv-tuvalu.html for debugging
89
- found section 0 @ 38
90
- found section 1 @ 1391
91
- found section 2 @ 13580
92
- found section 3 @ 33729
93
- found section 4 @ 50390
94
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
95
-
96
- saving a copy to africa/ss-south-sudan.html for debugging
97
- found section 0 @ 38
98
- found section 1 @ 2560
99
- found section 2 @ 11342
100
- found section 3 @ 26234
101
- found section 4 @ 42271
102
- *** error: section not found -- <div id="CollapsiblePanel1_Energy"
103
-
104
-
105
- =end
106
-
107
-
108
35
 
109
36
  desc 'generate json for factbook.json repo'
110
37
  task :genjson do
111
38
  require 'factbook'
112
39
 
113
40
  countries = [
114
- =begin
115
41
  ['xx', 'world' ], ## special code for the world
116
42
 
43
+ =begin
117
44
  ['ee', 'europe/eu-european-union'], ## special code for the european union
118
45
  ['al', 'europe/al-albania' ],
119
46
  ['an', 'europe/ad-andorra' ],
@@ -140,13 +67,13 @@ task :genjson do
140
67
  ['ei', 'europe/ie-ireland' ],
141
68
  ['it', 'europe/it-italy' ],
142
69
  ['lg', 'europe/lv-latvia' ],
143
- # ['ls', 'europe/li-liechtenstein' ],
70
+ ['ls', 'europe/li-liechtenstein' ],
144
71
  ['lh', 'europe/lt-lithuania' ],
145
72
  ['lu', 'europe/lu-luxembourg' ],
146
73
  ['mk', 'europe/mk-macedonia' ],
147
74
  ['mt', 'europe/mt-malta' ],
148
75
  ['md', 'europe/md-moldova' ],
149
- # ['mn', 'europe/mc-monaco' ],
76
+ ['mn', 'europe/mc-monaco' ],
150
77
  ['mj', 'europe/me-montenegro' ],
151
78
  ['nl', 'europe/nl-netherlands' ],
152
79
  ['no', 'europe/no-norway' ],
@@ -154,7 +81,7 @@ task :genjson do
154
81
  ['po', 'europe/pt-portugal' ],
155
82
  ['ro', 'europe/ro-romania' ],
156
83
  ['rs', 'europe/ru-russia' ],
157
- # ['sm', 'europe/sm-san-marino' ],
84
+ ['sm', 'europe/sm-san-marino' ],
158
85
  ['ri', 'europe/rs-serbia' ],
159
86
  ['lo', 'europe/sk-slovakia' ],
160
87
  ['si', 'europe/si-slovenia' ],
@@ -163,7 +90,7 @@ task :genjson do
163
90
  ['sz', 'europe/ch-switzerland' ],
164
91
  ['tu', 'europe/tr-turkey' ],
165
92
  ['up', 'europe/ua-ukraine' ],
166
- # ['vt', 'europe/va-vatican-city' ],
93
+ ['vt', 'europe/va-vatican-city' ],
167
94
 
168
95
  ['ca', 'north-america/ca-canada' ],
169
96
  ['us', 'north-america/us-united-states' ],
@@ -249,7 +176,7 @@ task :genjson do
249
176
  ['sl', 'africa/sl-sierra-leone' ],
250
177
  ['so', 'africa/so-somalia' ],
251
178
  ['sf', 'africa/za-south-africa' ],
252
- # ['od', 'africa/ss-south-sudan' ],
179
+ ['od', 'africa/ss-south-sudan' ],
253
180
  ['su', 'africa/sd-sudan' ],
254
181
  ['wz', 'africa/sz-swaziland' ],
255
182
  ['tz', 'africa/tz-tanzania' ],
@@ -308,25 +235,19 @@ task :genjson do
308
235
  ['as', 'pacific/au-australia' ],
309
236
  ['fj', 'pacific/fj-fiji' ],
310
237
  ['kr', 'pacific/ki-kiribati' ],
311
- # ['rm', 'pacific/mh-marshall-islands' ],
238
+ ['rm', 'pacific/mh-marshall-islands' ],
312
239
  ['fm', 'pacific/fm-micronesia' ],
313
240
  ['nr', 'pacific/nr-nauru' ],
314
241
  ['nz', 'pacific/nz-new-zealand' ],
315
- # ['ps', 'pacific/pw-palau' ],
242
+ ['ps', 'pacific/pw-palau' ],
316
243
  ['pp', 'pacific/pg-papua-new-guinea' ],
317
244
  ['ws', 'pacific/ws-samoa' ],
318
245
  ['bp', 'pacific/sb-solomon-islands' ],
319
246
  ['tn', 'pacific/to-tonga' ],
320
- # ['tv', 'pacific/tv-tuvalu' ],
247
+ ['tv', 'pacific/tv-tuvalu' ],
321
248
  ['nh', 'pacific/vu-vanuatu' ],
322
-
323
249
  =end
324
250
 
325
-
326
-
327
- =begin
328
- ['', 'africa/' ],
329
- =end
330
251
  ]
331
252
 
332
253
  countries.each do |country|
data/lib/factbook.rb CHANGED
@@ -25,17 +25,6 @@ require 'factbook/page'
25
25
  require 'factbook/sect'
26
26
 
27
27
 
28
- module Factbook
29
-
30
- def self.banner
31
- "factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
32
- end
33
-
34
- def self.root
35
- "#{File.expand_path( File.dirname(File.dirname(__FILE__)) )}"
36
- end
37
-
38
- end # module Factbook
39
28
 
40
29
 
41
30
  puts Factbook.banner
data/lib/factbook/page.rb CHANGED
@@ -13,11 +13,14 @@ module Factbook
13
13
  ## e.g. www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_br.html
14
14
  SITE_BASE = 'https://www.cia.gov/library/publications/the-world-factbook/geos/countrytemplate_{code}.html'
15
15
 
16
- def initialize( code )
16
+ def initialize( code, opts={} )
17
17
  ## note: requires factbook country code
18
18
  # e.g. austria is au
19
19
  # germany is gm and so on
20
20
  @code = code
21
+
22
+ ### rename fields to format option?? why? why not? e.g. :format => 'long' ??
23
+ @opts = opts # fields: full|long|keep|std|?? -- find a good name for the option keeping field names as is
21
24
 
22
25
  @html = nil
23
26
  @doc = nil
@@ -38,10 +41,35 @@ module Factbook
38
41
  end
39
42
  end
40
43
 
44
+
45
+ def [](key) ### convenience shortcut
46
+ # lets you use
47
+ # page['geo']
48
+ # instead of
49
+ # page.data['geo']
50
+
51
+ ## fix: use delegate data, [] from forwardable lib - why?? why not??
52
+
53
+ data[key]
54
+ end
55
+
56
+
41
57
  def data
42
58
  if @data.nil?
43
59
  @data = {}
44
60
 
61
+ if @opts[:header] ## include (leading) header section ??
62
+
63
+ header_key = @opts[:fields] ? 'Header' : 'header'
64
+ last_built_key = @opts[:fields] ? 'last built' : 'last_built'
65
+
66
+ @data[header_key] = {
67
+ 'code' => @code,
68
+ 'generator' => "factbook/#{VERSION}",
69
+ last_built_key => "#{Time.now}",
70
+ }
71
+ end
72
+
45
73
  sects.each_with_index do |sect,i|
46
74
  logger.debug "############################"
47
75
  logger.debug "### [#{i}] stats sect >#{sect.title}<: "
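The inline comment on the new `[]` shortcut floats using the `forwardable` library instead of a hand-written method. A minimal sketch of that alternative follows; `PageSketch` is a hypothetical stand-in class with placeholder data, not the gem's `Factbook::Page`.

```ruby
require 'forwardable'

# Hypothetical stand-in for Factbook::Page, reduced to the data accessor.
class PageSketch
  extend Forwardable

  # Forward page[key] to data[key].
  def_delegator :data, :[]

  attr_reader :data

  def initialize(data)
    @data = data
  end
end

page = PageSketch.new('geo' => { 'location' => 'placeholder value' })

puts page['geo'].inspect       # same object as page.data['geo']
puts page['missing'].inspect   # nil, as with a plain Hash lookup
```

Both forms behave the same for callers. The delegator saves a few lines, while the explicit method leaves room for the explanatory comment the diff includes.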
@@ -58,17 +86,18 @@ module Factbook
58
86
  ## split html into sections
59
87
  ## lets us avoids errors w/ (wrongly) nested tags
60
88
 
89
+ ## check opts for using long or short category/field names
61
90
  divs = [
62
- [ 'intro', '<div id="CollapsiblePanel1_Intro"' ],
63
- [ 'geo', '<div id="CollapsiblePanel1_Geo"' ],
64
- [ 'people', '<div id="CollapsiblePanel1_People"' ],
65
- [ 'govt', '<div id="CollapsiblePanel1_Govt"' ],
66
- [ 'econ', '<div id="CollapsiblePanel1_Econ"' ],
67
- [ 'energy', '<div id="CollapsiblePanel1_Energy"' ],
68
- [ 'comm', '<div id="CollapsiblePanel1_Comm"' ],
69
- [ 'trans', '<div id="CollapsiblePanel1_Trans"' ],
70
- [ 'military', '<div id="CollapsiblePanel1_Military"'],
71
- [ 'issues', '<div id="CollapsiblePanel1_Issues"' ]
91
+ [ @opts[:fields] ? 'Introduction' : 'intro', '<div id="CollapsiblePanel1_Intro"' ],
92
+ [ @opts[:fields] ? 'Geography' : 'geo', '<div id="CollapsiblePanel1_Geo"' ],
93
+ [ @opts[:fields] ? 'People and Society' : 'people', '<div id="CollapsiblePanel1_People"' ],
94
+ [ @opts[:fields] ? 'Government' : 'govt', '<div id="CollapsiblePanel1_Govt"' ],
95
+ [ @opts[:fields] ? 'Economy' : 'econ', '<div id="CollapsiblePanel1_Econ"' ],
96
+ [ @opts[:fields] ? 'Energy' : 'energy', '<div id="CollapsiblePanel1_Energy"' ],
97
+ [ @opts[:fields] ? 'Communications' : 'comm', '<div id="CollapsiblePanel1_Comm"' ],
98
+ [ @opts[:fields] ? 'Transportation' : 'trans', '<div id="CollapsiblePanel1_Trans"' ],
99
+ [ @opts[:fields] ? 'Military' : 'military', '<div id="CollapsiblePanel1_Military"'],
100
+ [ @opts[:fields] ? 'Transnational Issues': 'issues', '<div id="CollapsiblePanel1_Issues"' ]
72
101
  ]
73
102
 
74
103
  indexes = []
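The `divs` table above repeats the same `@opts[:fields] ? long : short` choice on every row. One way to sketch the same selection is to keep both names in a single table and pick a column once; `SECTIONS` and `section_divs` are hypothetical names, while the titles and markers are copied from the hunk.

```ruby
# [short name, long name, start marker] per section, as in the hunk above.
SECTIONS = [
  ['intro',    'Introduction',         '<div id="CollapsiblePanel1_Intro"'],
  ['geo',      'Geography',            '<div id="CollapsiblePanel1_Geo"'],
  ['people',   'People and Society',   '<div id="CollapsiblePanel1_People"'],
  ['govt',     'Government',           '<div id="CollapsiblePanel1_Govt"'],
  ['econ',     'Economy',              '<div id="CollapsiblePanel1_Econ"'],
  ['energy',   'Energy',               '<div id="CollapsiblePanel1_Energy"'],
  ['comm',     'Communications',       '<div id="CollapsiblePanel1_Comm"'],
  ['trans',    'Transportation',       '<div id="CollapsiblePanel1_Trans"'],
  ['military', 'Military',             '<div id="CollapsiblePanel1_Military"'],
  ['issues',   'Transnational Issues', '<div id="CollapsiblePanel1_Issues"']
]

# Hypothetical helper: returns [title, marker] pairs for the chosen naming.
def section_divs(opts = {})
  SECTIONS.map do |short, long, marker|
    [opts[:fields] ? long : short, marker]
  end
end

puts section_divs.first.inspect
puts section_divs({ fields: 'long' }).last.inspect
```

As in the diff, any truthy `fields` value selects the long names. The option's value itself (`'long'`, `'full'`) is not inspected.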
@@ -102,7 +131,7 @@ module Factbook
102
131
 
103
132
  ## todo: check that from is smaller than to
104
133
  logger.debug " cut section #{i} [#{from}..#{to}]"
105
- @sects << Sect.new( title, html[ from..to ] )
134
+ @sects << Sect.new( title, html[ from..to ], @opts )
106
135
 
107
136
  ##if i==0 || i==1
108
137
  ## puts "debug sect #{i}:"
data/lib/factbook/sect.rb CHANGED
@@ -7,11 +7,12 @@ module Factbook
7
7
 
8
8
  attr_reader :title, :html
9
9
 
10
- def initialize( title, html )
10
+ def initialize( title, html, opts={} )
11
11
  ## todo: passing a ref to the parent page - why? why not??
12
12
  @title = title
13
13
  @html = html
14
-
14
+ @opts = opts # fields: full|long|keep|std|??? -- find a good name for the option keeping field names as is
15
+
15
16
  @doc = nil
16
17
  @data = nil
17
18
  end
@@ -28,15 +29,31 @@ module Factbook
28
29
  private
29
30
 
30
31
  def cleanup_key( key )
31
- ## to lower case
32
- key = key.downcase
33
- ## seaport(s) => seaports
34
- key = key.gsub( '(s)', 's' )
35
- key = key.gsub( ':', '' ) # trailing :
36
- ## remove special chars ()-/,'
37
- key = key.gsub( /['()\-\/,]/, ' ' )
38
- key = key.strip
39
- key = key.gsub( /[ ]+/, '_' )
32
+
33
+ if @opts[:fields] # if set assume full|long|keep for now
34
+ ### keep field names as is
35
+ ## e.g.
36
+ ## GDP - composition, by sector of origin:
37
+ ## Budget surplus (+) or deficit (-):
38
+ ## becomes:
39
+ ## GDP - composition, by sector of origin
40
+ ## Budget surplus (+) or deficit (-)
41
+ key = key.strip
42
+ key = key.gsub( /[ ]{2,}/, ' ' ) # fold two plus spaces into one -- check if exists?
43
+ key = key.gsub( /:\z/, '' ) # remove trailing : if present
44
+ key = key.strip
45
+ else
46
+ ## to lower case
47
+ key = key.downcase
48
+ ## seaport(s) => seaports
49
+ key = key.gsub( '(s)', 's' )
50
+ key = key.gsub( ':', '' ) # trailing : ## fix: use regex /:$/ w/ anchor??
51
+ ## remove special chars ()+-/,'
52
+ key = key.gsub( /['()+\-\/,]/, ' ' )
53
+ key = key.strip
54
+ key = key.gsub( /[ ]+/, '_' )
55
+ end
56
+
40
57
  key
41
58
  end
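To see what the two branches of `cleanup_key` produce, here is a standalone sketch of the same steps. It takes `opts` as a parameter instead of reading `@opts`, which is a change made only so the snippet runs outside the `Sect` class. The sample keys are the ones quoted in the diff's comments and tests.

```ruby
# Standalone copy of the cleanup steps; opts is passed in for illustration.
def cleanup_key(key, opts = {})
  if opts[:fields]
    # long form: keep the name, only tidy whitespace and the trailing colon
    key = key.strip
    key = key.gsub(/[ ]{2,}/, ' ')
    key = key.gsub(/:\z/, '')
    key.strip
  else
    # short form: lowercase, drop punctuation, join words with underscores
    key = key.downcase
    key = key.gsub('(s)', 's')
    key = key.gsub(':', '')
    key = key.gsub(/['()+\-\/,]/, ' ')
    key = key.strip
    key.gsub(/[ ]+/, '_')
  end
end

puts cleanup_key('Budget surplus (+) or deficit (-):')
# => budget_surplus_or_deficit
puts cleanup_key('river port(s):')
# => river_ports
puts cleanup_key('Budget surplus (+) or deficit (-):', { fields: 'long' })
# => Budget surplus (+) or deficit (-)
```

Adding `+` to the character class is what lets `Budget surplus (+) or deficit (-)` collapse to `budget_surplus_or_deficit` in the short form.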
42
59
 
@@ -140,7 +157,7 @@ private
140
157
  last_pair[1] += " #{text}" ## append w/o separator
141
158
  end
142
159
  else
143
- if last_cat == 'demographic_profile' ## special case (use space a sep)
160
+ if last_cat == 'demographic_profile' || last_cat == 'Demographic profile' ## special case (use space a sep)
144
161
  last_pair[1] += " #{text}" ## append with separator
145
162
  else
146
163
  last_pair[1] += "; #{text}" ## append with separator
@@ -1,5 +1,21 @@
1
1
 
2
2
  module Factbook
3
- VERSION = '0.1.2'
4
- end
5
3
 
4
+ MAJOR = 0
5
+ MINOR = 1
6
+ PATCH = 3
7
+ VERSION = [MAJOR,MINOR,PATCH].join('.')
8
+
9
+ def self.version
10
+ VERSION
11
+ end
12
+
13
+ def self.banner
14
+ "factbook/#{VERSION} on Ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE}) [#{RUBY_PLATFORM}]"
15
+ end
16
+
17
+ def self.root
18
+ "#{File.expand_path( File.dirname(File.dirname(File.dirname(__FILE__))) )}"
19
+ end
20
+
21
+ end
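Two details of this hunk can be checked in isolation: the version string built from its parts, and the extra `File.dirname` in `root`. The sketch assumes the methods now live in a file one directory deeper than `lib/factbook.rb` (the file header for this hunk is not shown). The path below is made up for illustration, and `VersionSketch` is a hypothetical module name.

```ruby
module VersionSketch
  MAJOR = 0
  MINOR = 1
  PATCH = 3
  VERSION = [MAJOR, MINOR, PATCH].join('.')
end

# Made-up install path, standing in for __FILE__.
path = '/gems/factbook/lib/factbook/version.rb'

two_up   = File.dirname(File.dirname(path))
three_up = File.dirname(File.dirname(File.dirname(path)))

puts VersionSketch::VERSION   # => 0.1.3
puts two_up                   # => /gems/factbook/lib
puts three_up                 # => /gems/factbook
```

Under that assumption, two `dirname` calls would stop at `lib`, so the third call is what brings `root` back to the gem's top-level directory.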
@@ -0,0 +1,48 @@
1
+ # encoding: utf-8
2
+
3
+
4
+ require 'helper'
5
+
6
+
7
+ class TestFields < MiniTest::Unit::TestCase
8
+
9
+ def read_test_page( code )
10
+ File.read( "#{Factbook.root}/test/data/countrytemplate_#{code}.html" )
11
+ end
12
+
13
+ def test_fields_full_w_header
14
+ page = Factbook::Page.new( 'au', header: true, fields: 'full' )
15
+ page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
16
+
17
+ assert_equal 'au', page['Header']['code']
18
+ assert_equal "factbook/#{Factbook::VERSION}", page['Header']['generator']
19
+
20
+ assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
21
+ assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
22
+
23
+ assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
24
+ end
25
+
26
+
27
+ def test_fields_full
28
+ page = Factbook::Page.new( 'au', fields: 'full' )
29
+ page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
30
+
31
+ assert_equal '-3.1% of GDP (2012 est.)', page['Economy']['Budget surplus (+) or deficit (-)']['text']
32
+ assert_equal '5.5%', page['Economy']['Labor force - by occupation']['agriculture']
33
+
34
+ assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['Transportation']['Ports and terminals']['river port(s)']
35
+ end
36
+
37
+ def test_fields_std
38
+ page = Factbook::Page.new( 'au' )
39
+ page.html = read_test_page( 'au' ) # use builtin test page (do NOT fetch via internet)
40
+
41
+ assert_equal '-3.1% of GDP (2012 est.)', page['econ']['budget_surplus_or_deficit']['text']
42
+ assert_equal '5.5%', page['econ']['labor_force_by_occupation']['agriculture']
43
+
44
+ assert_equal 'Enns, Krems, Linz, Vienna (Danube)', page['trans']['ports_and_terminals']['river_ports']
45
+ end
46
+
47
+
48
+ end # class TestFields
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: factbook
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.1.3
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2014-07-13 00:00:00.000000000 Z
12
+ date: 2014-08-24 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: logutils
16
- requirement: &82513280 !ruby/object:Gem::Requirement
16
+ requirement: &74365370 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
21
21
  version: '0'
22
22
  type: :runtime
23
23
  prerelease: false
24
- version_requirements: *82513280
24
+ version_requirements: *74365370
25
25
  - !ruby/object:Gem::Dependency
26
26
  name: fetcher
27
- requirement: &82544000 !ruby/object:Gem::Requirement
27
+ requirement: &74365080 !ruby/object:Gem::Requirement
28
28
  none: false
29
29
  requirements:
30
30
  - - ! '>='
@@ -32,10 +32,10 @@ dependencies:
32
32
  version: '0'
33
33
  type: :runtime
34
34
  prerelease: false
35
- version_requirements: *82544000
35
+ version_requirements: *74365080
36
36
  - !ruby/object:Gem::Dependency
37
37
  name: nokogiri
38
- requirement: &82543300 !ruby/object:Gem::Requirement
38
+ requirement: &74364810 !ruby/object:Gem::Requirement
39
39
  none: false
40
40
  requirements:
41
41
  - - ! '>='
@@ -43,10 +43,10 @@ dependencies:
43
43
  version: '0'
44
44
  type: :runtime
45
45
  prerelease: false
46
- version_requirements: *82543300
46
+ version_requirements: *74364810
47
47
  - !ruby/object:Gem::Dependency
48
48
  name: rdoc
49
- requirement: &82542590 !ruby/object:Gem::Requirement
49
+ requirement: &74364520 !ruby/object:Gem::Requirement
50
50
  none: false
51
51
  requirements:
52
52
  - - ~>
@@ -54,18 +54,18 @@ dependencies:
54
54
  version: '4.0'
55
55
  type: :development
56
56
  prerelease: false
57
- version_requirements: *82542590
57
+ version_requirements: *74364520
58
58
  - !ruby/object:Gem::Dependency
59
59
  name: hoe
60
- requirement: &82541780 !ruby/object:Gem::Requirement
60
+ requirement: &74364240 !ruby/object:Gem::Requirement
61
61
  none: false
62
62
  requirements:
63
63
  - - ~>
64
64
  - !ruby/object:Gem::Version
65
- version: '3.11'
65
+ version: '3.12'
66
66
  type: :development
67
67
  prerelease: false
68
- version_requirements: *82541780
68
+ version_requirements: *74364240
69
69
  description: factbook - scripts for the world factbook (get open structured data e.g
70
70
  JSON etc.)
71
71
  email: openmundi@googlegroups.com
@@ -93,6 +93,7 @@ files:
93
93
  - test/data/countrytemplate_vt.html
94
94
  - test/data/countrytemplate_xx.html
95
95
  - test/helper.rb
96
+ - test/test_fields.rb
96
97
  - test/test_json.rb
97
98
  - test/test_page.rb
98
99
  - test/test_page_old.rb
@@ -129,5 +130,6 @@ summary: factbook - scripts for the world factbook (get open structured data e.g
129
130
  test_files:
130
131
  - test/test_page_old.rb
131
132
  - test/test_strip.rb
133
+ - test/test_fields.rb
132
134
  - test/test_json.rb
133
135
  - test/test_page.rb