swissmatch-location 0.1.2.201409 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 4b41908ad58903a56c562d0ad66b1f0f0518c4f4
4
- data.tar.gz: 2273c6732c40767514b1ca058bad38706292967f
3
+ metadata.gz: 561e487d2ba8a0c726877e72e89c26bef87931a1
4
+ data.tar.gz: 22332920078c7e433fd68f34c7716eea73e7f9fe
5
5
  SHA512:
6
- metadata.gz: 0dc83aefbf0466108498dfba097b120fcc6794cfff5cb13f8df1ff5c08078b765ac7933986e5326463f31864b3545b8186c1d26e87dd63890255c947a48e3780
7
- data.tar.gz: acd33e0b2bfa223c271370a899726bc01f86929227b48e65d92b44346e4798500223f62685906d35610f43af469c1e04444b554ef29731bb6028036e1cbf440b
6
+ metadata.gz: b2a72500cce5454dcb9731c68647af586f0e44def5b8fece74db640be777825fa72682782f57e6a0873416d12bf35af6b4e03283805d3b1d49f5a36bd685348d
7
+ data.tar.gz: fe23fbc5b3d87ba1160931d6cee52abd7cf9479c7e4f64701df8dce15e2511eca9c1ac7655c6068415bc7e059dc8f375e100f51b43a9cdace7df5e19054bcc8d
@@ -4,21 +4,50 @@ README
4
4
 
5
5
  Summary
6
6
  -------
7
+
7
8
  Deal with swiss zip codes, cantons and communities, using the official swiss post mat[ch]
8
9
  database.
9
10
 
10
11
 
11
12
  Installation
12
13
  ------------
14
+
13
15
  Install the gem: `gem install swissmatch-location`
14
16
  Depending on how you installed rubygems, you have to use `sudo`:
15
17
  `sudo gem install swissmatch-location`
16
18
  In Ruby: `require 'swissmatch/location'`
17
19
  To automatically load the datafiles: `require 'swissmatch/location/autoload'`
18
20
 
21
+ **IMPORTANT!**
22
+
23
+ Due to a change in the license agreement of the swiss post, I'm no longer
24
+ allowed to ship the data together with the gem. Here's a guide on how to
25
+ install and update your swissmatch data:
26
+
27
+ 1. Go to https://www.post.ch/de/pages/downloadcenter-match
28
+ 2. **In the pop-up menu top-left** select "Register"
29
+ 3. Once you're registered (you'll get a snail-mail letter from the post to sign),
30
+ you visit the same page again and this time you choose "Login"
31
+ **from the pop-up menu top-left**, the login button top right **does not work
32
+ for this!** (the former logs you into the downloadcenter, the latter into
33
+ the customer center).
34
+ 4. After login, you choose the download page for "Address master data"
35
+ (de: "Adressstammdaten", fr: "Base de données d'adresses de référence", it:
36
+ "Banca dati indirizzi di riferimento")
37
+ 5. Download "Existing data" (de: "Bestand", fr: "Etat", it: "Versione completa")
38
+ 6. Unzip the file
39
+ 7. Open a shell and cd into the directory with the unzipped master data
40
+ 8. Run `swissmatch-location install-data PATH_TO_MASTER_DATA_FILE`
41
+
42
+ You can test your installation by running `swissmatch-location stats`. It should
43
+ tell you the age of the data and a number >0 of zip codes.
44
+ A negative age is possible since the swiss post provides files which start to be
45
+ valid in the future.
46
+
19
47
 
20
48
  Usage
21
49
  -----
50
+
22
51
  require 'swissmatch/location/autoload' # use this to automatically load the data
23
52
 
24
53
  # Get all zip codes for a given code, the example returns the official name of the first
@@ -42,19 +71,40 @@ Usage
42
71
  SwissMatch.canton("Zurigo").name # => "Zürich"
43
72
 
44
73
  # SwissMatch also provides data over swiss communities (Gemeinden)
45
- SwissMatch.communities("Zürich").first.community_number # => 261
46
- SwissMatch.community(261).name # => "Zürich"
74
+ SwissMatch.community("Zürich").community_number # => 261
75
+ SwissMatch.community(261).name # => "Zürich"
47
76
 
48
77
 
49
78
  SwissMatch and Rails/Databases
50
79
  ------------------------------
80
+
51
81
  If you want to load the data into your database, or use it in a rails project,
52
82
  then you should look at swissmatch-rails. It provides a couple of models and
53
83
  a data loading script.
54
84
 
55
85
 
86
+ Notable Recent Changes
87
+ ----------------------
88
+
89
+ ### 0.1.2.x -> 1.0.0
90
+
91
+ * Zip code master data is no longer bundled with the gem. Check the installation
92
+ guide for how to obtain, install and update the data.
93
+ * swissmatch-location executable added to support the installation and updating
94
+ of the master data
95
+ * SwissMatch.communities and SwissMatch::Location.communities no longer return
96
+ communities by name. This has moved to the .community methods since community
97
+ names are unique now.
98
+ * SwissMatch::Location::Converter has been added to convert the new data format
99
+ into a compact data file (~140MB down to ~400KB)
100
+ * SwissMatch::Location::DataFiles rewritten to read the new compact master data
101
+ file.
102
+ * Dropped rubyzip dependency.
103
+
104
+
56
105
  Relevant Classes and Modules
57
106
  ----------------------------
107
+
58
108
  * __{SwissMatch}__
59
109
  Convenience methods to access cantons, communities and zip codes
60
110
  * __{SwissMatch::Cantons}__
@@ -68,7 +118,8 @@ Relevant Classes and Modules
68
118
  * __{SwissMatch::ZipCodes}__
69
119
  Swiss zip code collection
70
120
  * __{SwissMatch::ZipCode}__
71
- A swiss zip code (a zip code can be described and uniquely identified by either code and city, code and add-on or the swiss posts ONRP)
121
+ A swiss zip code (a zip code can be described and uniquely identified by
122
+ either code and city, code and add-on or the swiss posts ONRP)
72
123
 
73
124
 
74
125
  Links
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ lib_dir = File.expand_path("#{__dir__}/../lib")
4
+ $LOAD_PATH << lib_dir if File.directory?(lib_dir) && !$LOAD_PATH.include?(lib_dir)
5
+
6
+ require "fileutils"
7
+ require "swissmatch/location"
8
+ require "swissmatch/location/converter"
9
+
10
+ begin
11
+ is_empty = false
12
+ SwissMatch::Location.load
13
+ rescue SwissMatch::LoadError, ArgumentError
14
+ is_empty = true
15
+ SwissMatch::Location.load(SwissMatch::Location::DataFiles.empty)
16
+ end
17
+
18
+ case ARGV[0]
19
+ when "stats"
20
+ puts "SwissMatch::Location Statistics"
21
+ puts "Master Data from #{SwissMatch::Location.data.date} (age #{(Date.today - SwissMatch::Location.data.date).floor} days), random code #{SwissMatch::Location.data.random_code}"
22
+ puts "Zip Codes: #{SwissMatch.zip_codes.size}"
23
+ puts "Cantons: #{SwissMatch.cantons.size}"
24
+ puts "Communities: #{SwissMatch.communities.size}"
25
+ puts "Districts: #{SwissMatch.districts.size}"
26
+
27
+ when "install-data"
28
+ master_file = ARGV[1]
29
+ install_dir = ARGV[2]
30
+
31
+ if !master_file
32
+ abort("Please supply a master file (`swissmatch-location install-data MASTER_FILE [INSTALL_DIRECTORY]`)")
33
+ elsif !File.exist?(master_file)
34
+ abort("Could not find #{master_file.inspect}")
35
+ elsif !File.readable?(master_file)
36
+ abort("Could not read #{master_file.inspect}")
37
+ end
38
+ unless install_dir
39
+ install_dir = File.expand_path('~/.swissmatch')
40
+ FileUtils.mkdir_p(install_dir)
41
+ end
42
+
43
+ puts "Installing data from #{master_file} in #{install_dir}"
44
+ binary_file = "#{install_dir}/locations_#{Time.now.strftime('%F')}.binary"
45
+ SwissMatch::Location::Converter.new(master_file).convert.write(binary_file)
46
+ puts "Done"
47
+
48
+ else
49
+ puts "Please supply either `stats` or `install-data MASTER_FILE [INSTALL_DIRECTORY]` as arguments"
50
+ exit(1)
51
+ end
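Note on the `stats` branch: the age is printed as `(Date.today - SwissMatch::Location.data.date).floor`. The sketch below reproduces only that arithmetic with Ruby's `date` library, to show why the README warns that the age can be negative. The helper name `data_age_in_days` and the dates are made up for illustration and are not part of the gem.

```ruby
require "date"

# Hypothetical helper (not in the gem): age of the master data in whole days.
# Date - Date yields a Rational number of days, which floor turns into an Integer.
def data_age_in_days(valid_from, today = Date.today)
  (today - valid_from).floor
end

today = Date.new(2014, 9, 1)
puts data_age_in_days(Date.new(2014, 8, 1), today)   # => 31  (valid since last month)
puts data_age_in_days(Date.new(2014, 10, 1), today)  # => -30 (only valid from next month)
```

A negative result therefore just means the installed file carries a validity date that has not been reached yet.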
@@ -72,22 +72,26 @@ module SwissMatch
72
72
  @data.districts
73
73
  end
74
74
 
75
- # @param [Integer] key
76
- # The community number of the community
75
+ # @param [Integer, String] key
76
+ # The name or community number of the community
77
77
  #
78
78
  # @return [SwissMatch::Community]
79
- # The community with the community number
79
+ # The community with the given name or community number
80
80
  def self.community(key)
81
- @data.communities.by_community_number(key)
81
+ case key
82
+ when Integer
83
+ @data.communities.by_community_number(key)
84
+ when String
85
+ @data.communities.by_name(key)
86
+ else
87
+ raise TypeError, "Expected Integer or String, but got #{key.inspect}:#{key.class}"
88
+ end
82
89
  end
83
90
 
84
- # @param [String] name
85
- # The name of the communities
86
- #
87
91
  # @return [SwissMatch::Communities]
88
- # All communities, or those matching the given name
89
- def self.communities(name=nil)
90
- name ? @data.communities.by_name(name) : @data.communities
92
+ # All communities
93
+ def self.communities
94
+ @data.communities
91
95
  end
92
96
 
93
97
  # @param [String, Integer] code_or_name
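Note on the hunk above: `SwissMatch.community` now dispatches on the type of its argument, and `SwissMatch.communities` no longer takes a name. The sketch below imitates that dispatch with a stand-in class, so it runs without the gem. `Community` and `CommunityIndex` are hypothetical names for this illustration; only the `case`/`when` structure and the error message are taken from the diff.

```ruby
# Hypothetical stand-in for a community record; not the gem's class.
Community = Struct.new(:community_number, :name)

# Hypothetical stand-in for the lookup side of SwissMatch::Communities.
class CommunityIndex
  def initialize(communities)
    @by_number = communities.map { |c| [c.community_number, c] }.to_h
    @by_name   = communities.map { |c| [c.name, c] }.to_h
  end

  # Same shape as the new SwissMatch.community(key): Integer or String, else TypeError.
  def community(key)
    case key
    when Integer then @by_number[key]
    when String  then @by_name[key]
    else raise TypeError, "Expected Integer or String, but got #{key.inspect}:#{key.class}"
    end
  end
end

index = CommunityIndex.new([Community.new(261, "Zürich"), Community.new(351, "Bern")])
puts index.community(261).name                 # => Zürich
puts index.community("Bern").community_number  # => 351
```

Callers that previously wrote `SwissMatch.communities("Zürich").first` would, per the README hunk, switch to `SwissMatch.community("Zürich")`.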
@@ -0,0 +1,155 @@
1
+ module SwissMatch
2
+ module Location
3
+
4
+ # SwissMatch::Location::Converter
5
+ #
6
+ # Converts the files supplied by post.ch and bfs.admin.ch into a single
7
+ # binary file which is faster to load
8
+ #
9
+ # Format:
10
+ # Byte 0...4: PostMatch master file date, in Date.jd format
11
+ # Byte 4...8: PostMatch master file random code
12
+ # Byte 8...18: zip1_count, zip2_count, community1_count, community2_count, district_count; packed with n*
13
+ # Byte 18...34: bytesizes of int1_columns, int2_columns, int4_columns and text_columns
14
+ # Byte 34...-1: int1_columns + int2_columns + int4_columns + text_columns
15
+ #
16
+ # int1_columns: packed with C* the columns
17
+ # * zip1_type
18
+ # * zip1_addon
19
+ # * zip1_language
20
+ # * zip1_language_alternative
21
+ # * zip2_region
22
+ # * zip2_type
23
+ # * zip2_lang
24
+ # * com2_PLZZ
25
+ #
26
+ # int2_columns: packed with n* the columns
27
+ # * zip1_onrp
28
+ # * zip1_code
29
+ # * zip1_delivery_by
30
+ # * zip1_largest_community_number
31
+ # * zip2_onrp
32
+ # * com1_bfsnr
33
+ # * com1_agglomeration
34
+ # * com2_GDENR
35
+ # * com2_PLZ4
36
+ # * district_GDEBZNR
37
+ #
38
+ # int4_columns: packed with N* the columns
39
+ # * zip1_valid_from
40
+ #
41
+ # text_columns: joined with \x1f
42
+ # * zip1_name_short
43
+ # * zip1_name
44
+ # * zip1_canton
45
+ # * zip2_short
46
+ # * zip2_name
47
+ # * com1_name
48
+ # * com1_canton
49
+ # * district_GDEKT
50
+ # * district_GDEBZNA
51
+ #
52
+ class Converter
53
+ def initialize(match_path, districts_path=nil, communities_path=nil)
54
+ @match_path = match_path
55
+ @districts_path = districts_path || gem_districts_path
56
+ @communities_path = communities_path || gem_communities_path
57
+ @data = nil
58
+ end
59
+
60
+ def gem_data_path
61
+ data_directory = File.expand_path('../../../../data/swissmatch-location', __FILE__)
62
+ data_directory = Gem.datadir 'swissmatch-location' if defined?(Gem) && !File.directory?(data_directory)
63
+
64
+ data_directory
65
+ end
66
+
67
+ def gem_districts_path
68
+ Dir.enum_for(:glob, "#{gem_data_path}/districts_*.csv").sort.last
69
+ end
70
+
71
+ def gem_communities_path
72
+ Dir.enum_for(:glob, "#{gem_data_path}/communities_*.csv").sort.last
73
+ end
74
+
75
+ def generate_expression(size, separator, terminator)
76
+ /^#{Array.new(size) { "([^#{separator}]*)" }.join(eval("'#{separator}'"))}#{terminator}/
77
+ end
78
+
79
+ def convert
80
+ match_data = File.read(@match_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
81
+ districts_data = File.read(@districts_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
82
+ communities_data = File.read(@communities_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
83
+
84
+ r_base = generate_expression(3, ';', '\r\n')
85
+ r_zip_1 = generate_expression(16, ';', '\r\n')
86
+ r_zip_2 = generate_expression(7, ';', '\r\n')
87
+ r_community1 = generate_expression(5, ';', '\r\n')
88
+ r_community2 = generate_expression(10, ',', '(?:\n|\z)')
89
+ r_district = generate_expression(3, ',', '\n')
90
+
91
+ start_zip1 = match_data.index(/^01/)
92
+ start_zip2 = match_data.index(/^02/, start_zip1)
93
+ start_com = match_data.index(/^03/, start_zip2)
94
+ end_com = match_data.index(/^04/, start_com)
95
+
96
+ base = match_data[0...start_zip1].scan(r_base).first
97
+ zip1 = match_data[start_zip1...start_zip2].scan(r_zip_1); zip1.size
98
+ zip2 = match_data[start_zip2...start_com].scan(r_zip_2); zip2.size
99
+ com1 = match_data[start_com...end_com].scan(r_community1); com1.size
100
+ com2 = communities_data.scan(r_community2); com2.size
101
+ districts = districts_data.scan(r_district); districts.size
102
+
103
+ zip1_columns = zip1.transpose; 0
104
+ zip2_columns = zip2.transpose; 0
105
+ com1_columns = com1.transpose; 0
106
+ com2_columns = com2.transpose; 0
107
+ dist_columns = districts.transpose; 0
108
+
109
+ int1_columns = (
110
+ zip1_columns.values_at(3,5,10,11).flatten+
111
+ zip2_columns.values_at(2,3,4).flatten+
112
+ com2_columns[8]
113
+ ).map(&:to_i).pack("C*")
114
+
115
+ int2_columns = (
116
+ zip1_columns.values_at(1,4,12,2).flatten+
117
+ zip2_columns[1]+
118
+ com1_columns.values_at(1,4).flatten+
119
+ com2_columns[4]+
120
+ com2_columns[7]+
121
+ dist_columns[1]
122
+ ).map(&:to_i).pack("n*")
123
+
124
+ int4_columns = (
125
+ zip1_columns[13].map { |date| Date.civil(*date.match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)).jd }
126
+ ).pack("N*")
127
+
128
+ text_columns = (
129
+ zip1_columns.values_at(7,8,9).flatten+
130
+ zip2_columns[5]+
131
+ zip2_columns[6]+
132
+ com1_columns[2]+
133
+ com1_columns[3]+
134
+ dist_columns[0]+
135
+ dist_columns[2]
136
+ ).join("\x1f").force_encoding(Encoding::BINARY)
137
+
138
+ @data =
139
+ [Date.civil(*base[1].match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)).jd, base[2].to_i].pack("NN")+
140
+ [zip1.size, zip2.size, com1.size, com2.size, districts.size].pack("n*")+
141
+ [int1_columns.bytesize, int2_columns.bytesize, int4_columns.bytesize, text_columns.bytesize].pack("N*")+
142
+ int1_columns+
143
+ int2_columns+
144
+ int4_columns+
145
+ text_columns
146
+
147
+ self
148
+ end
149
+
150
+ def write(path)
151
+ File.write(path, @data, encoding: Encoding::BINARY)
152
+ end
153
+ end
154
+ end
155
+ end
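Note on the file format: `convert` builds the 34-byte header from three `pack` calls (`"NN"`, `"n*"` and `"N*"`), and `DataFiles#load!` reads it back with `unpack`. The sketch below round-trips such a header using only the standard library. All values are invented for illustration; real counts and the random code come from the master data file.

```ruby
require "date"

# Illustrative values only.
date        = Date.new(2014, 9, 1)
random_code = 12_345
counts      = [5, 4, 3, 2, 1]      # zip1, zip2, community1, community2, district
sizes       = [20, 30, 20, 100]    # bytesizes of int1, int2, int4 and text columns

# 2 x 32 bit + 5 x 16 bit + 4 x 32 bit, all big-endian.
header = [date.jd, random_code].pack("NN") + counts.pack("n*") + sizes.pack("N*")
puts header.bytesize  # => 34

# Reading mirrors the slices used by load!: bytes 0...18, then bytes 18...34.
jd, code, *read_counts = header[0, 18].unpack("NNn*")
read_sizes = header[18, 16].unpack("N*")
puts Date.jd(jd)      # => 2014-09-01
```

Because the five counts are written as 16-bit values, each of them has to stay below 65536 for this layout to work.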
@@ -12,12 +12,15 @@ require 'swissmatch/community'
12
12
  require 'swissmatch/communities'
13
13
  require 'swissmatch/zipcode'
14
14
  require 'swissmatch/zipcodes'
15
+ require 'swissmatch/location/ruby'
15
16
 
16
17
 
17
18
 
18
19
  module SwissMatch
19
20
  module Location
20
21
 
22
+ # SwissMatch::Location::DataFiles
23
+ #
21
24
  # Deals with retrieving and updating the files provided by the swiss postal service,
22
25
  # and loading the data from them.
23
26
  #
@@ -26,73 +29,58 @@ module SwissMatch
26
29
  # change over iterations.
27
30
  class DataFiles
28
31
 
29
- # Used to generate the regular expressions used to parse the data files.
30
- # Generates a regular expression, that matches +size+ tab separated fields,
31
- # delimited by \r\n.
32
- # @private
33
- def self.generate_expression(size, separator, terminator)
34
- /^#{Array.new(size) { "([^#{separator}]*)" }.join(eval("'#{separator}'"))}#{terminator}/
35
- end
36
-
37
- # Regular expressions used to parse the different files.
38
- # @private
39
- Expressions = {
40
- :community => generate_expression(4, '\t', '\r\n'),
41
- :zip_2 => generate_expression(6, '\t', '\r\n'),
42
- :zip_1 => generate_expression(13, '\t', '\r\n'),
43
- :districts => generate_expression(3, ',', '\n'),
44
- :communities => generate_expression(10, ',', '\n'),
45
- }
46
-
47
- # @private
48
- # The URL of the plz_p1 file
49
- URLZip1 = "https://match.post.ch/download?file=10001&tid=11&rol=0"
32
+ # Used to convert numerical language codes to symbols
33
+ LanguageCodes = [nil, :de, :fr, :it, :rt]
50
34
 
35
+ # The data of all cantons
51
36
  # @private
52
- # The URL of the plz_p2 file
53
- URLZip2 = "https://match.post.ch/download?file=10002&tid=14&rol=0"
37
+ AllCantons = Cantons.new([
38
+ Canton.new("AG", "Aargau", "Aargau", "Argovie", "Argovia", "Argovia"),
39
+ Canton.new("AI", "Appenzell Innerrhoden", "Appenzell Innerrhoden", "Appenzell Rhodes-Intérieures", "Appenzello Interno", "Appenzell Dadens"),
40
+ Canton.new("AR", "Appenzell Ausserrhoden", "Appenzell Ausserrhoden", "Appenzell Rhodes-Extérieures", "Appenzello Esterno", "Appenzell Dadora"),
41
+ Canton.new("BE", "Bern", "Bern", "Berne", "Berna", "Berna"),
42
+ Canton.new("BL", "Basel-Landschaft", "Basel-Landschaft", "Bâle-Campagne", "Basilea Campagna", "Basilea-Champagna"),
43
+ Canton.new("BS", "Basel-Stadt", "Basel-Stadt", "Bâle-Ville", "Basilea Città", "Basilea-Citad"),
44
+ Canton.new("FR", "Freiburg", "Fribourg", "Fribourg", "Friburgo", "Friburg"),
45
+ Canton.new("GE", "Genève", "Genf", "Genève", "Ginevra", "Genevra"),
46
+ Canton.new("GL", "Glarus", "Glarus", "Glaris", "Glarona", "Glaruna"),
47
+ Canton.new("GR", "Graubünden", "Graubünden", "Grisons", "Grigioni", "Grischun"),
48
+ Canton.new("JU", "Jura", "Jura", "Jura", "Giura", "Giura"),
49
+ Canton.new("LU", "Luzern", "Luzern", "Lucerne", "Lucerna", "Lucerna"),
50
+ Canton.new("NE", "Neuchâtel", "Neuenburg", "Neuchâtel", "Neuchâtel", "Neuchâtel"),
51
+ Canton.new("NW", "Nidwalden", "Nidwalden", "Nidwald", "Nidvaldo", "Sutsilvania"),
52
+ Canton.new("OW", "Obwalden", "Obwalden", "Obwald", "Obvaldo", "Sursilvania"),
53
+ Canton.new("SG", "St. Gallen", "St. Gallen", "Saint-Gall", "San Gallo", "Son Gagl"),
54
+ Canton.new("SH", "Schaffhausen", "Schaffhausen", "Schaffhouse", "Sciaffusa", "Schaffusa"),
55
+ Canton.new("SO", "Solothurn", "Solothurn", "Soleure", "Soletta", "Soloturn"),
56
+ Canton.new("SZ", "Schwyz", "Schwyz", "Schwytz", "Svitto", "Sviz"),
57
+ Canton.new("TG", "Thurgau", "Thurgau", "Thurgovie", "Turgovia", "Turgovia"),
58
+ Canton.new("TI", "Ticino", "Tessin", "Tessin", "Ticino", "Tessin"),
59
+ Canton.new("UR", "Uri", "Uri", "Uri", "Uri", "Uri"),
60
+ Canton.new("VD", "Vaud", "Waadt", "Vaud", "Vaud", "Vad"),
61
+ Canton.new("VS", "Valais", "Wallis", "Valais", "Vallese", "Vallais"),
62
+ Canton.new("ZG", "Zug", "Zug", "Zoug", "Zugo", "Zug"),
63
+ Canton.new("ZH", "Zürich", "Zürich", "Zurich", "Zurigo", "Turitg"),
64
+ Canton.new("FL", "Fürstentum Liechtenstein", "Fürstentum Liechtenstein", "Liechtenstein", "Liechtenstein", "Liechtenstein"),
65
+ Canton.new("DE", "Deutschland", "Deutschland", "Allemagne", "Germania", "Germania"),
66
+ Canton.new("IT", "Italien", "Italien", "Italie", "Italia", "Italia"),
67
+ ])
68
+
69
+ def self.empty
70
+ data = new
71
+ data.load_empty!
72
+
73
+ data
74
+ end
54
75
 
55
- # @private
56
- # The URL of the plz_c file
57
- URLCommunity = "https://match.post.ch/download?file=10003&tid=13&rol=0"
76
+ # @return [Date]
77
+ # The date from when the data from the swiss post master data file
78
+ # starts to be valid
79
+ attr_reader :date
58
80
 
59
- # @private
60
- # An array of all urls
61
- URLAll = [URLZip1, URLZip2, URLCommunity]
62
-
63
- # The data of all cantons
64
- # @private
65
- CantonData = [
66
- ["AG", "Aargau", "Aargau", "Argovie", "Argovia", "Argovia"],
67
- ["AI", "Appenzell Innerrhoden", "Appenzell Innerrhoden", "Appenzell Rhodes-Intérieures", "Appenzello Interno", "Appenzell Dadens"],
68
- ["AR", "Appenzell Ausserrhoden", "Appenzell Ausserrhoden", "Appenzell Rhodes-Extérieures", "Appenzello Esterno", "Appenzell Dadora"],
69
- ["BE", "Bern", "Bern", "Berne", "Berna", "Berna"],
70
- ["BL", "Basel-Landschaft", "Basel-Landschaft", "Bâle-Campagne", "Basilea Campagna", "Basilea-Champagna"],
71
- ["BS", "Basel-Stadt", "Basel-Stadt", "Bâle-Ville", "Basilea Città", "Basilea-Citad"],
72
- ["FR", "Freiburg", "Fribourg", "Fribourg", "Friburgo", "Friburg"],
73
- ["GE", "Genève", "Genf", "Genève", "Ginevra", "Genevra"],
74
- ["GL", "Glarus", "Glarus", "Glaris", "Glarona", "Glaruna"],
75
- ["GR", "Graubünden", "Graubünden", "Grisons", "Grigioni", "Grischun"],
76
- ["JU", "Jura", "Jura", "Jura", "Giura", "Giura"],
77
- ["LU", "Luzern", "Luzern", "Lucerne", "Lucerna", "Lucerna"],
78
- ["NE", "Neuchâtel", "Neuenburg", "Neuchâtel", "Neuchâtel", "Neuchâtel"],
79
- ["NW", "Nidwalden", "Nidwalden", "Nidwald", "Nidvaldo", "Sutsilvania"],
80
- ["OW", "Obwalden", "Obwalden", "Obwald", "Obvaldo", "Sursilvania"],
81
- ["SG", "St. Gallen", "St. Gallen", "Saint-Gall", "San Gallo", "Son Gagl"],
82
- ["SH", "Schaffhausen", "Schaffhausen", "Schaffhouse", "Sciaffusa", "Schaffusa"],
83
- ["SO", "Solothurn", "Solothurn", "Soleure", "Soletta", "Soloturn"],
84
- ["SZ", "Schwyz", "Schwyz", "Schwytz", "Svitto", "Sviz"],
85
- ["TG", "Thurgau", "Thurgau", "Thurgovie", "Turgovia", "Turgovia"],
86
- ["TI", "Ticino", "Tessin", "Tessin", "Ticino", "Tessin"],
87
- ["UR", "Uri", "Uri", "Uri", "Uri", "Uri"],
88
- ["VD", "Vaud", "Waadt", "Vaud", "Vaud", "Vad"],
89
- ["VS", "Valais", "Wallis", "Valais", "Vallese", "Vallais"],
90
- ["ZG", "Zug", "Zug", "Zoug", "Zugo", "Zug"],
91
- ["ZH", "Zürich", "Zürich", "Zurich", "Zurigo", "Turitg"],
92
- ["FL", "Fürstentum Liechtenstein", "Fürstentum Liechtenstein", "Liechtenstein", "Liechtenstein", "Liechtenstein"],
93
- ["DE", "Deutschland", "Deutschland", "Allemagne", "Germania", "Germania"],
94
- ["IT", "Italien", "Italien", "Italie", "Italia", "Italia"],
95
- ]
81
+ # @return [Integer]
82
+ # The random code from the swiss post master data file
83
+ attr_reader :random_code
96
84
 
97
85
  # The directory in which the post mat[ch] files reside
98
86
  attr_accessor :data_directory
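Note on `generate_expression`, which this hunk removes from `DataFiles` and which reappears in the new `Converter`: it builds a regular expression that captures a fixed number of separated fields per record. The sketch below shows the same idea on a tiny input. It is not the gem's method: `record_expression` is a hypothetical name, and it uses `Regexp.escape` where the original uses `eval`. The sample records are invented.

```ruby
# Builds a regex matching one record of `size` fields split by `separator`.
# `terminator` is regex source, e.g. '\r\n' (single-quoted on purpose).
def record_expression(size, separator, terminator)
  escaped = Regexp.escape(separator)
  field   = "([^#{escaped}]*)"
  /^#{Array.new(size) { field }.join(escaped)}#{terminator}/
end

data = "01;8000;Zürich\r\n01;3000;Bern\r\n"
rows = data.scan(record_expression(3, ";", '\r\n'))
p rows            # => [["01", "8000", "Zürich"], ["01", "3000", "Bern"]]

# The converter then works column-wise, which transpose provides.
p rows.transpose  # => [["01", "01"], ["8000", "3000"], ["Zürich", "Bern"]]
```

`String#scan` with capture groups returns one array per record, so a record with the wrong number of fields is skipped rather than reported.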
@@ -116,14 +104,13 @@ module SwissMatch
116
104
  # The directory in which the post mat[ch] files reside
117
105
  def initialize(data_directory=nil)
118
106
  reset_errors!
107
+ @loaded = false
119
108
  if data_directory then
120
109
  @data_directory = data_directory
121
110
  elsif ENV['SWISSMATCH_DATA'] then
122
111
  @data_directory = ENV['SWISSMATCH_DATA']
123
112
  else
124
- data_directory = File.expand_path('../../../../data/swissmatch-location', __FILE__)
125
- data_directory = Gem.datadir 'swissmatch-location' if defined?(Gem) && !File.directory?(data_directory)
126
- @data_directory = data_directory
113
+ @data_directory = File.expand_path('~/.swissmatch')
127
114
  end
128
115
  end
129
116
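Note on the new default directory: data now lives in `~/.swissmatch` (or wherever `SWISSMATCH_DATA` points), and the executable writes files named `locations_YYYY-MM-DD.binary`. Because the date is in ISO order, the lexicographically last file name is also the newest. The sketch below demonstrates that in a temporary directory. The helper is a hypothetical stand-in, not the gem's `latest_binary_file`, and it sorts explicitly so the result does not depend on the order `Dir.glob` returns.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: newest locations_*.binary in a directory, or nil if none.
def latest_binary_file(directory)
  Dir.glob(File.join(directory, "locations_*.binary")).sort.last
end

Dir.mktmpdir do |dir|
  %w[2014-09-01 2013-12-31 2014-01-15].each do |day|
    FileUtils.touch(File.join(dir, "locations_#{day}.binary"))
  end
  puts File.basename(latest_binary_file(dir))  # => locations_2014-09-01.binary
end
```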
 
@@ -134,120 +121,116 @@ module SwissMatch
134
121
  self
135
122
  end
136
123
 
137
- # Load new files
138
- #
139
- # @return [Array<String>]
140
- # An array with the absolute file paths of the extracted files.
141
- def load_updates
142
- URLAll.flat_map { |url|
143
- http_get_zip_file(url, @data_directory)
144
- }
124
+ def latest_binary_file
125
+ Dir.enum_for(:glob, "#{@data_directory}/locations_*.binary").last
145
126
  end
146
127
 
147
- # Performs an HTTP-GET for the given url, extracts it as a zipped file into the
148
- # destination directory.
149
- #
150
- # @return [Array<String>]
151
- # An array with the absolute file paths of the extracted files.
152
- def http_get_zip_file(url, destination)
153
- require 'open-uri'
154
- require 'swissmatch/zip' # patched zip/zip
155
- require 'fileutils'
156
-
157
- files = []
158
-
159
- open(url) do |zip_buffer|
160
- Zip::ZipFile.open(zip_buffer) do |zip_file|
161
- zip_file.each do |f|
162
- target_path = File.join(destination, f.name)
163
- FileUtils.mkdir_p(File.dirname(target_path))
164
- zip_file.extract(f, target_path) unless File.exist?(target_path)
165
- files << target_path
166
- end
167
- end
168
- end
169
-
170
- files
171
- end
128
+ def load_empty!
129
+ return if @loaded
172
130
 
173
- # Unzips it as a zipped file into the destination directory.
174
- def unzip_file(file, destination)
175
- require 'swissmatch/zip'
176
- Zip::ZipFile.open(file) do |zip_file|
177
- zip_file.each do |f|
178
- target_path = File.join(destination, f.name)
179
- FileUtils.mkdir_p(File.dirname(target_path))
180
- zip_file.extract(f, target_path) unless File.exist?(target_path)
181
- end
182
- end
131
+ @loaded = true
132
+ @date = Date.new(0)
133
+ @random_code = 0
134
+ @cantons = AllCantons
135
+ @districts = Districts.new([])
136
+ @communities = Communities.new([])
137
+ @zip_codes = ZipCodes.new([])
183
138
  end
184
139
 
185
- # Used to convert numerical language codes to symbols
186
- LanguageCodes = [nil, :de, :fr, :it, :rt]
187
-
188
140
  # Loads the data into this DataFiles instance
189
141
  #
190
142
  # @return [self]
191
143
  # Returns self.
192
- def load!
193
- @cantons, @districts, @communities, @zip_codes = *load
194
- self
195
- end
144
+ def load!(file=nil)
145
+ return if @loaded
146
+
147
+ file ||= latest_binary_file
148
+
149
+ raise LoadError.new("File #{file.inspect} not found or not readable", nil) unless file && File.readable?(file)
150
+
151
+ data = File.read(file, encoding: Encoding::BINARY)
152
+ date, random_code, zip1_count, zip2_count, com1_count, com2_count, district_count = *data[0,18].unpack("NNn*")
153
+ int1_size, int2_size, int4_size, text_size = *data[18,16].unpack("N*")
154
+
155
+ offset = 34
156
+ int1_cols = data[offset, int1_size].unpack("C*")
157
+ int2_cols = data[offset+=int1_size, int2_size].unpack("n*")
158
+ int4_cols = data[offset+=int2_size, int4_size].unpack("N*")
159
+ text_cols = data[offset+=int4_size, text_size].force_encoding(Encoding::UTF_8).split("\x1f")
160
+
161
+ offset = 0
162
+ zip1_type = int1_cols[offset, zip1_count]
163
+ zip1_addon = int1_cols[offset += zip1_count, zip1_count]
164
+ zip1_language = int1_cols[offset += zip1_count, zip1_count]
165
+ zip1_language_alternative = int1_cols[offset += zip1_count, zip1_count]
166
+ zip2_region = int1_cols[offset += zip1_count, zip2_count]
167
+ zip2_type = int1_cols[offset += zip2_count, zip2_count]
168
+ zip2_lang = int1_cols[offset += zip2_count, zip2_count]
169
+ com2_PLZZ = int1_cols[offset += zip2_count, com2_count]
170
+
171
+ offset = 0
172
+ zip1_onrp = int2_cols[offset, zip1_count]
173
+ zip1_code = int2_cols[offset += zip1_count, zip1_count]
174
+ zip1_delivery_by = int2_cols[offset += zip1_count, zip1_count]
175
+ zip1_largest_community_number = int2_cols[offset += zip1_count, zip1_count]
176
+ zip2_onrp = int2_cols[offset += zip1_count, zip2_count]
177
+ com1_bfsnr = int2_cols[offset += zip2_count, com1_count]
178
+ com1_agglomeration = int2_cols[offset += com1_count, com1_count]
179
+ com2_GDENR = int2_cols[offset += com1_count, com2_count]
180
+ com2_PLZ4 = int2_cols[offset += com2_count, com2_count]
181
+ district_GDEBZNR = int2_cols[offset += com2_count, district_count]
182
+
183
+ zip1_valid_from = int4_cols
184
+
185
+ offset = 0
186
+ zip1_name_short = text_cols[offset, zip1_count]
187
+ zip1_name = text_cols[offset += zip1_count, zip1_count]
188
+ zip1_canton = text_cols[offset += zip1_count, zip1_count]
189
+ zip2_short = text_cols[offset += zip1_count, zip2_count]
190
+ zip2_name = text_cols[offset += zip2_count, zip2_count]
191
+ com1_name = text_cols[offset += zip2_count, com1_count]
192
+ com1_canton = text_cols[offset += com1_count, com1_count]
193
+ district_GDEKT = text_cols[offset += com1_count, district_count]
194
+ district_GDEBZNA = text_cols[offset += district_count, district_count]
195
+
196
+ zip1 = [
197
+ zip1_onrp, zip1_type, zip1_canton, zip1_code, zip1_addon,
198
+ zip1_delivery_by, zip1_language, zip1_language_alternative,
199
+ zip1_name_short, zip1_name, zip1_largest_community_number,
200
+ zip1_valid_from
201
+ ].transpose
202
+ zip2 = [zip2_onrp, zip2_region, zip2_type, zip2_lang, zip2_short, zip2_name].transpose
203
+ com1 = [com1_bfsnr, com1_name, com1_canton, com1_agglomeration].transpose
204
+ com2 = [com2_PLZ4, com2_PLZZ, com2_GDENR].transpose
205
+ district = [district_GDEKT, district_GDEBZNR, district_GDEBZNA].transpose
206
+
207
+ @date = Date.jd(date)
208
+ @random_code = random_code
209
+ @cantons = AllCantons
210
+ @districts = load_districts(district)
211
+ @communities = load_communities(com1)
212
+ @zip_codes = load_zipcodes(zip1, zip2, com2)
196
213
 
197
- # @return [Array]
198
- # Returns an array of the form [SwissMatch::Cantons, SwissMatch::Communities,
199
- # SwissMatch::ZipCodes].
200
- def load
201
- reset_errors!
202
-
203
- cantons = load_cantons
204
- districts = load_districts(cantons)
205
- communities = load_communities(cantons)
206
- zip_codes = load_zipcodes(cantons, communities)
207
-
208
- [cantons, districts, communities, zip_codes]
214
+ self
209
215
  end
210
216
 
211
- # @return [SwissMatch::Cantons]
212
- # A SwissMatch::Cantons containing all cantons used by the swiss postal service.
213
- def load_cantons
214
- Cantons.new(
215
- CantonData.map { |tag, name, name_de, name_fr, name_it, name_rt|
216
- Canton.new(tag, name, name_de, name_fr, name_it, name_rt)
217
- }
218
- )
219
- end
220
-
221
- def load_districts(cantons)
222
- # File format: GDEKT,GDEBZNR,GDEBZNA
223
- path = Dir.enum_for(:glob, "#{@data_directory}/districts_*.csv").last
224
- data = File.read(path, encoding: Encoding::UTF_8.to_s).scan(Expressions[:districts])
225
- districts = data[1..-1].map { |canton_tag, district_number, district_name|
226
- district_number = Integer(district_number, 10)
227
- canton = cantons.by_license_tag(canton_tag)
228
-
229
- District.new(district_number, district_name, canton, SwissMatch::Communities.new([]))
230
- }
231
-
232
- Districts.new(districts)
217
+ def load_districts(data)
218
+ Districts.new(data.map { |data|
219
+ District.new(*data, SwissMatch::Communities.new([]))
220
+ })
233
221
  end
234
222
 
235
223
  # @return [SwissMatch::Communities]
236
224
  # An instance of SwissMatch::Communities containing all communities defined by the
237
225
  # files known to this DataFiles instance.
238
- def load_communities(cantons)
239
- raise "Must load cantons first" unless cantons
240
-
241
- file = Dir.enum_for(:glob, "#{@data_directory}/plz_c_*.txt").last
226
+ def load_communities(data)
242
227
  temporary = []
243
228
  complete = {}
244
- load_table(file, :community).each do |bfsnr, name, canton, agglomeration|
245
- bfsnr = bfsnr.to_i
246
- agglomeration = agglomeration.to_i
247
- canton = cantons.by_license_tag(canton)
229
+ data.each do |bfsnr, name, canton, agglomeration|
230
+ canton = @cantons.by_license_tag(canton)
248
231
  if agglomeration == bfsnr then
249
232
  complete[bfsnr] = Community.new(bfsnr, name, canton, :self)
250
- elsif agglomeration.nil? then
233
+ elsif agglomeration.zero? then
251
234
  complete[bfsnr] = Community.new(bfsnr, name, canton, nil)
252
235
  else
253
236
  temporary << [bfsnr, name, canton, agglomeration]
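Note on the column layout read by `load!`: each section of the binary file stores whole columns back to back, and the loader cuts them apart with a running `offset` and the row counts from the header. The sketch below shows that write/read cycle on two tiny tables. The table contents and variable names are invented for illustration.

```ruby
# Rows as a parser might deliver them (illustrative values).
zip1 = [[1, 10], [2, 20], [3, 30]]  # three rows, two integer columns
zip2 = [[7], [8]]                   # two rows, one integer column

# Writing: column after column, 16 bit big-endian each.
packed = (zip1.transpose.flatten + zip2.transpose.flatten).pack("n*")

# Reading: slice the flat array back into columns using the row counts.
flat   = packed.unpack("n*")
offset = 0
zip1_a = flat[offset, zip1.size]
zip1_b = flat[offset += zip1.size, zip1.size]
zip2_a = flat[offset += zip1.size, zip2.size]

p [zip1_a, zip1_b].transpose  # => [[1, 10], [2, 20], [3, 30]]
p zip2_a                      # => [7, 8]
```

The order of the slices has to match the order in which the converter concatenated the columns, since the file itself carries no column names.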
@@ -268,48 +251,31 @@ module SwissMatch
268
251
  # @return [SwissMatch::ZipCodes]
269
252
  # An instance of SwissMatch::ZipCodes containing all zip codes defined by the
270
253
  # files known to this DataFiles instance.
271
- def load_zipcodes(cantons, communities)
272
- raise "Must load cantons first" unless cantons
273
- raise "Must load communities first" unless communities
274
-
254
+ def load_zipcodes(zip1_data, zip2_data, com2_data)
275
255
  community_mapping = Hash.new { |h,k| h[k] = [] }
276
256
  self_delivered = []
277
257
  others = []
278
- zip1_file = Dir.enum_for(:glob, "#{@data_directory}/plz_p1_*.txt").last
279
- zip2_file = Dir.enum_for(:glob, "#{@data_directory}/plz_p2_*.txt").last
280
- communities_file = Dir.enum_for(:glob, "#{@data_directory}/communities_*.csv").last
281
-
282
- # KTKZ,OHW,ORTNAME,GHW,GDENR,GDENAMK,PHW,PLZ4,PLZZ,PLZNAMK
283
- communities_data = File.read(
284
- communities_file,
285
- encoding: Encoding::UTF_8.to_s
286
- ).scan(Expressions[:communities])[1..-1].transpose.values_at(4,7,8)
287
- communities_data[0].map!(&:to_i)
288
- communities_data[1].map!(&:to_i)
289
- communities_data[2].map!(&:to_i)
290
- communities_data.transpose.each do |data|
291
- community_mapping[data.last(2)] << data.at(0)
258
+ temporary = {}
259
+
260
+ com2_data.each do |*key, value|
261
+ community_mapping[key] << value
292
262
  end
293
263
 
294
- temporary = {}
295
- load_table(zip1_file, :zip_1).each do |row|
296
- onrp = row.at(0).to_i
297
- code = row.at(2).to_i
298
- addon = row.at(3).to_i
299
- delivery_by = row.at(10).to_i
264
+ zip1_data.each do |onrp, type, canton, code, addon, delivery_by, lang, lang_alt, name_short, name, largest_community_number, valid_from|
300
265
  delivery_by = case delivery_by when 0 then nil; when onrp then :self; else delivery_by; end
301
- language = LanguageCodes[row.at(7).to_i]
302
- language_alternative = LanguageCodes[row.at(8).to_i]
303
- name_short = Name.new(row.at(4), language)
304
- name = Name.new(row.at(5), language)
305
- largest_community_number = row.at(11).to_i
266
+ language = LanguageCodes[lang]
267
+ language_alternative = LanguageCodes[lang_alt]
268
+ name_short = Name.new(name_short, language)
269
+ name = Name.new(name, language)
270
+
306
271
  # compact, because some communities already no longer exist, so by_community_numbers can
307
272
  # contain nils which must be removed
308
- community_numbers = (community_mapping[[code, addon]]|[largest_community_number]).sort
309
- communities = Communities.new(communities.by_community_numbers(*community_numbers).compact)
273
+ community_numbers = (community_mapping[[code, addon]] | [largest_community_number]).sort
274
+ communities = Communities.new(@communities.by_community_numbers(*community_numbers).compact)
275
+
310
276
  data = [
311
277
  onrp, # ordering_number
312
- row.at(1).to_i, # type
278
+ type, # type
313
279
  code,
314
280
  addon,
315
281
  name, # name (official)
@@ -318,14 +284,14 @@ module SwissMatch
318
284
  [name_short], # names_short (official + alternative)
319
285
  [], # PLZ2 type 3 short names (additional region names)
320
286
  [], # PLZ2 type 3 names (additional region names)
321
- cantons.by_license_tag(row.at(6)), # canton
287
+ cantons.by_license_tag(canton), # canton
322
288
  language,
323
289
  language_alternative,
324
- row.at(9) == "1", # sortfile_member
290
+ false, # sortfile_member TODO: remove
325
291
  delivery_by, # delivery_by
326
292
  communities.by_community_number(largest_community_number), # community_number
327
293
  communities,
328
- Date.civil(*row.at(12).match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)) # valid_from
294
+ Date.jd(valid_from) # valid_from
329
295
  ]
330
296
  temporary[onrp] = data
331
297
  if :self == delivery_by then
@@ -335,26 +301,26 @@ module SwissMatch
335
301
  end
336
302
  end
337
303
 
338
- load_table(zip2_file, :zip_2).each do |onrp, rn, type, lang, short, name|
339
- onrp = onrp.to_i
340
- lang_code = lang.to_i
304
+ zip2_data.each do |onrp, rn, type, lang, short, name|
305
+ onrp = onrp
306
+ lang_code = lang
341
307
  language = LanguageCodes[lang_code]
342
308
  entry = temporary[onrp]
343
- if type == "2"
344
- entry[5] << Name.new(name, language, rn.to_i)
345
- entry[7] << Name.new(short, language, rn.to_i)
346
- elsif type == "3"
347
- entry[8] << Name.new(name, language, rn.to_i)
348
- entry[9] << Name.new(short, language, rn.to_i)
309
+ if type == 2
310
+ entry[5] << Name.new(name, language, rn)
311
+ entry[7] << Name.new(short, language, rn)
312
+ elsif type == 3
313
+ entry[8] << Name.new(name, language, rn)
314
+ entry[9] << Name.new(short, language, rn)
349
315
  end
350
316
  end
351
317
 
352
318
  self_delivered.each do |row|
353
- temporary[row.at(0)] = ZipCode.new(*row)
319
+ temporary[row[0]] = ZipCode.new(*row)
354
320
  end
355
321
  others.each do |row|
356
- if row.at(14) then
357
- raise "Delivery not found:\n#{row.inspect}" unless tmp = temporary[row.at(14)]
322
+ if row[14] then
323
+ raise "Delivery not found:\n#{row.inspect}" unless tmp = temporary[row[14]]
358
324
  if tmp.kind_of?(Array) then
359
325
  @errors << LoadError.new("Invalid reference: onrp #{row.at(0)} delivery by #{row.at(14)}", row)
360
326
  row[14] = nil
@@ -362,26 +328,11 @@ module SwissMatch
362
328
  row[14] = tmp
363
329
  end
364
330
  end
365
- temporary[row.at(0)] = ZipCode.new(*row)
331
+ temporary[row[0]] = ZipCode.new(*row)
366
332
  end
367
333
 
368
334
  ZipCodes.new(temporary.values)
369
335
  end
370
-
371
- # Reads a file and parses using the pattern of the given name.
372
- #
373
- # @param [String] path
374
- # The path of the file to parse
375
- # @param [Symbol] pattern
376
- # The pattern-name used to parse the file (see Expressions)
377
- #
378
- # @return [Array<Array<String>>]
379
- # A 2 dimensional array representing the tabular data contained in the given file.
380
- def load_table(path, pattern)
381
- File.read(path, :encoding => Encoding::Windows_1252.to_s). # to_s because sadly, ruby 1.9.2 can't handle an Encoding instance as argument
382
- encode(Encoding::UTF_8).
383
- scan(Expressions[pattern])
384
- end
385
336
  end
386
337
  end
387
338
  end
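
Note on the `valid_from` change in `load_zipcodes` above: the old code parsed a `YYYYMMDD` string with `Date.civil`, while the new code calls `Date.jd(valid_from)`. This suggests, but does not confirm, that the pre-parsed rows now carry the date as an integer Julian day number. The sketch below shows how the two representations relate under that assumption. `parse_compact_date` and the sample date are hypothetical and not part of the gem.

```ruby
require 'date'

# Hypothetical helper mirroring the removed code path:
# turn a "YYYYMMDD" string into a Date via Date.civil.
def parse_compact_date(text)
  match = text.match(/\A(\d{4})(\d\d)(\d\d)\z/)
  raise ArgumentError, "not a YYYYMMDD date: #{text.inspect}" unless match
  Date.civil(*match.captures.map(&:to_i))
end

# Old representation: the date as text (sample value, assumed for illustration).
from_text = parse_compact_date("20140901")

# Assumed new representation: the same date stored as an Integer
# Julian day number, restored with Date.jd as in the added line.
stored_jd = from_text.jd
from_jd   = Date.jd(stored_jd)

puts from_text.iso8601
puts from_jd.iso8601
```

Storing the Julian day number keeps the value a plain integer, so loading needs no regular expression and no string-to-integer conversion. Both forms yield the same `Date`.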