swissmatch-location 0.1.2.201409 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 4b41908ad58903a56c562d0ad66b1f0f0518c4f4
4
- data.tar.gz: 2273c6732c40767514b1ca058bad38706292967f
3
+ metadata.gz: 561e487d2ba8a0c726877e72e89c26bef87931a1
4
+ data.tar.gz: 22332920078c7e433fd68f34c7716eea73e7f9fe
5
5
  SHA512:
6
- metadata.gz: 0dc83aefbf0466108498dfba097b120fcc6794cfff5cb13f8df1ff5c08078b765ac7933986e5326463f31864b3545b8186c1d26e87dd63890255c947a48e3780
7
- data.tar.gz: acd33e0b2bfa223c271370a899726bc01f86929227b48e65d92b44346e4798500223f62685906d35610f43af469c1e04444b554ef29731bb6028036e1cbf440b
6
+ metadata.gz: b2a72500cce5454dcb9731c68647af586f0e44def5b8fece74db640be777825fa72682782f57e6a0873416d12bf35af6b4e03283805d3b1d49f5a36bd685348d
7
+ data.tar.gz: fe23fbc5b3d87ba1160931d6cee52abd7cf9479c7e4f64701df8dce15e2511eca9c1ac7655c6068415bc7e059dc8f375e100f51b43a9cdace7df5e19054bcc8d
@@ -4,21 +4,50 @@ README
4
4
 
5
5
  Summary
6
6
  -------
7
+
7
8
  Deal with swiss zip codes, cantons and communities, using the official swiss post mat[ch]
8
9
  database.
9
10
 
10
11
 
11
12
  Installation
12
13
  ------------
14
+
13
15
  Install the gem: `gem install swissmatch-location`
14
16
  Depending on how you installed rubygems, you have to use `sudo`:
15
17
  `sudo gem install swissmatch-location`
16
18
  In Ruby: `require 'swissmatch/location'`
17
19
  To automatically load the datafiles: `require 'swissmatch/location/autoload'`
18
20
 
21
+ **IMPORTANT!**
22
+
23
+ Due to a change in the license agreement of the swiss post, I'm no longer
24
+ allowed to ship the data together with the gem. Here's a guide on how to
25
+ install and update your swissmatch data:
26
+
27
+ 1. Go to https://www.post.ch/de/pages/downloadcenter-match
28
+ 2. **In the pop-up menu top-left** select "Register"
29
+ 3. Once you're registered (you'll get a snail-mail letter from the post to sign),
30
+ you visit the same page again and this time you choose "Login"
31
+ **from the pop-up menu top-left**, the login button top right **does not work
32
+ for this!** (the former logs you into the downloadcenter, the latter into
33
+ the customer center).
34
+ 3. After login, you choose the download page for "Address master data"
35
+ (de: "Adressstammdaten", fr: "Base de données d'adresses de référence", it:
36
+ "Banca dati indirizzi di riferimento")
37
+ 4. Download "Existing data" (de: "Bestand", fr: "Etat", it: "Versione completa")
38
+ 5. Unzip the file
39
+ 6. Open a shell and cd into the directory with the unzipped master data
40
+ 7. Run `swissmatch-location install-data PATH_TO_MASTER_DATA_FILE`
41
+
42
+ You can test your installation by running `swissmatch-location stats`. It should
43
+ tell you the age of the data and a number >0 of zip codes.
44
+ A negative age is possible since the swiss post provides files which start to be
45
+ valid in the future.
46
+
19
47
 
20
48
  Usage
21
49
  -----
50
+
22
51
  require 'swissmatch/location/autoload' # use this to automatically load the data
23
52
 
24
53
  # Get all zip codes for a given code, the example returns the official name of the first
@@ -42,19 +71,40 @@ Usage
42
71
  SwissMatch.canton("Zurigo").name # => "Zürich"
43
72
 
44
73
  # SwissMatch also provides data over swiss communities (Gemeinden)
45
- SwissMatch.communities("Zürich").first.community_number # => 261
46
- SwissMatch.community(261).name # => "Zürich"
74
+ SwissMatch.community("Zürich").community_number # => 261
75
+ SwissMatch.community(261).name # => "Zürich"
47
76
 
48
77
 
49
78
  SwissMatch and Rails/Databases
50
79
  ------------------------------
80
+
51
81
  If you want to load the data into your database, or use it in a rails project,
52
82
  then you should look at swissmatch-rails. It provides a couple of models and
53
83
  a data loading script.
54
84
 
55
85
 
86
+ Notable Recent Changes
87
+ ----------------------
88
+
89
+ ### 0.1.2.x -> 1.0.0
90
+
91
+ * Zip code master data is no longer bundled with the gem. Check the installation
92
+ guide for how to obtain, install and update the data.
93
+ * swissmatch-location executable added to support the installation and updating
94
+ of the master data
95
+ * SwissMatch.communities and SwissMatch::Location.communities no longer return
96
+ communities by name. This has moved to the .community methods since community
97
+ names are unique now.
98
+ * SwissMatch::Location::Converter has been added to convert the new data format
99
+ into a compact data file (~140MB down to ~400KB)
100
+ * SwissMatch::Location::DataFiles rewritten to read the new compact master data
101
+ file.
102
+ * Dropped rubyzip dependency.
103
+
104
+
56
105
  Relevant Classes and Modules
57
106
  ----------------------------
107
+
58
108
  * __{SwissMatch}__
59
109
  Convenience methods to access cantons, communities and zip codes
60
110
  * __{SwissMatch::Cantons}__
@@ -68,7 +118,8 @@ Relevant Classes and Modules
68
118
  * __{SwissMatch::ZipCodes}__
69
119
  Swiss zip code collection
70
120
  * __{SwissMatch::ZipCode}__
71
- A swiss zip code (a zip code can be described and uniquely identified by either code and city, code and add-on or the swiss posts ONRP)
121
+ A swiss zip code (a zip code can be described and uniquely identified by
122
+ either code and city, code and add-on or the swiss posts ONRP)
72
123
 
73
124
 
74
125
  Links
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ lib_dir = File.expand_path("#{__dir__}/../lib")
4
+ $LOAD_PATH << lib_dir if File.directory?(lib_dir) && !$LOAD_PATH.include?(lib_dir)
5
+
6
+ require "fileutils"
7
+ require "swissmatch/location"
8
+ require "swissmatch/location/converter"
9
+
10
+ begin
11
+ is_empty = false
12
+ SwissMatch::Location.load
13
+ rescue SwissMatch::LoadError, ArgumentError
14
+ is_empty = true
15
+ SwissMatch::Location.load(SwissMatch::Location::DataFiles.empty)
16
+ end
17
+
18
+ case ARGV[0]
19
+ when "stats"
20
+ puts "SwissMatch::Location Statistics"
21
+ puts "Master Data from #{SwissMatch::Location.data.date} (age #{(Date.today - SwissMatch::Location.data.date).floor} days), random code #{SwissMatch::Location.data.random_code}"
22
+ puts "Zip Codes: #{SwissMatch.zip_codes.size}"
23
+ puts "Cantons: #{SwissMatch.cantons.size}"
24
+ puts "Communities: #{SwissMatch.communities.size}"
25
+ puts "Districts: #{SwissMatch.districts.size}"
26
+
27
+ when "install-data"
28
+ master_file = ARGV[1]
29
+ install_dir = ARGV[2]
30
+
31
+ if !master_file
32
+ abort("Please supply a master file (`swissmatch-location install-data MASTER_FILE [INSTALL_DIRECTORY]`)")
33
+ elsif !File.exist?(master_file)
34
+ abort("Could not find #{master_file.inspect}")
35
+ elsif !File.readable?(master_file)
36
+ abort("Could not read #{master_file.inspect}")
37
+ end
38
+ unless install_dir
39
+ install_dir = File.expand_path('~/.swissmatch')
40
+ FileUtils.mkdir_p(install_dir)
41
+ end
42
+
43
+ puts "Installing data from #{master_file} in #{install_dir}"
44
+ binary_file = "#{install_dir}/locations_#{Time.now.strftime('%F')}.binary"
45
+ SwissMatch::Location::Converter.new(master_file).convert.write(binary_file)
46
+ puts "Done"
47
+
48
+ else
49
+ puts "Please supply either `stats` or `install-data MASTER_FILE [INSTALL_DIRECTORY]` as arguments"
50
+ exit(1)
51
+ end
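Note on the dated file name: `install-data` writes `locations_YYYY-MM-DD.binary` (via `strftime('%F')`), so the newest file can be found by sorting the paths as plain strings. The following is a minimal sketch of that idea, not code from the gem; `latest_dated_file` is a hypothetical helper name and the dates are made up.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of swissmatch-location): returns the path of
# the newest locations_YYYY-MM-DD.binary file in +dir+, or nil if none exist.
def latest_dated_file(dir)
  # ISO 8601 dates order correctly as plain strings, so a lexicographic
  # sort of the paths is also a chronological sort.
  Dir.glob(File.join(dir, "locations_*.binary")).sort.last
end

Dir.mktmpdir do |dir|
  # Create three empty files with made-up dates, deliberately out of order.
  %w[2014-09-01 2015-01-15 2014-12-31].each do |date|
    FileUtils.touch(File.join(dir, "locations_#{date}.binary"))
  end

  puts File.basename(latest_dated_file(dir)) # => locations_2015-01-15.binary
end
```

The explicit `sort` matters on older Rubies, where the order of glob results depends on the file system.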
@@ -72,22 +72,26 @@ module SwissMatch
72
72
  @data.districts
73
73
  end
74
74
 
75
- # @param [Integer] key
76
- # The community number of the community
75
+ # @param [Integer, String] key
76
+ # The name or community number of the community
77
77
  #
78
78
  # @return [SwissMatch::Community]
79
- # The community with the community number
79
+ # The community with the given name or community number
80
80
  def self.community(key)
81
- @data.communities.by_community_number(key)
81
+ case key
82
+ when Integer
83
+ @data.communities.by_community_number(key)
84
+ when String
85
+ @data.communities.by_name(key)
86
+ else
87
+ raise TypeError, "Expected Integer or String, but got #{key.inspect}:#{key.class}"
88
+ end
82
89
  end
83
90
 
84
- # @param [String] name
85
- # The name of the communities
86
- #
87
91
  # @return [SwissMatch::Communities]
88
- # All communities, or those matching the given name
89
- def self.communities(name=nil)
90
- name ? @data.communities.by_name(name) : @data.communities
92
+ # All communities
93
+ def self.communities
94
+ @data.communities
91
95
  end
92
96
 
93
97
  # @param [String, Integer] code_or_name
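Note on the hunk above: `community` now dispatches on the class of its argument, and `communities` no longer takes a name. The sketch below reproduces that dispatch with stand-in classes so it runs without the gem or its data; `DemoCommunity` and `DemoCommunities` are hypothetical names, and the two hash lookups only imitate `by_community_number` and `by_name`.

```ruby
# Stand-in for SwissMatch::Community, reduced to the two fields used here.
DemoCommunity = Struct.new(:community_number, :name)

# Stand-in collection; the real gem delegates to @data.communities instead.
class DemoCommunities
  def initialize(list)
    @by_number = list.to_h { |c| [c.community_number, c] }
    @by_name   = list.to_h { |c| [c.name, c] }
  end

  # Same shape as the new SwissMatch.community: Integer looks up by
  # number, String looks up by name, anything else is rejected.
  def community(key)
    case key
    when Integer then @by_number[key]
    when String  then @by_name[key]
    else
      raise TypeError, "Expected Integer or String, but got #{key.inspect}:#{key.class}"
    end
  end
end

demo = DemoCommunities.new([DemoCommunity.new(261, "Zürich")])
puts demo.community(261).name
puts demo.community("Zürich").community_number

begin
  demo.community(:zuerich)
rescue TypeError => e
  puts e.message
end
```

Callers that relied on `SwissMatch.communities("Zürich").first` would, per the README hunk, switch to `SwissMatch.community("Zürich")`.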
@@ -0,0 +1,155 @@
1
+ module SwissMatch
2
+ module Location
3
+
4
+ # SwissMatch::Location::Converter
5
+ #
6
+ # Converts the files supplied by post.ch and bfs.admin.ch into a single
7
+ # binary file which is faster to load
8
+ #
9
+ # Format:
10
+ # Byte 0...4: PostMatch master file date, in Date.jd format
11
+ # Byte 4...8: PostMach master file random code
12
+ # Byte 8...18: zip1_count, zip2_count, community1_count, community2_count, district_count; packed with N*
13
+ # Byte 18...34: bytesizes of int1_columns, int2_columns, int4_columns and text_columns
14
+ # Byte 34...-1: int1_columns + int2_columns + int4_columns + text_columns
15
+ #
16
+ # int1_columns: packed with C* the columns
17
+ # * zip1_type
18
+ # * zip1_addon
19
+ # * zip1_language
20
+ # * zip1_language_alternative
21
+ # * zip2_region
22
+ # * zip2_type
23
+ # * zip2_lang
24
+ # * com2_PLZZ
25
+ #
26
+ # int2_columns: packed with n* the columns
27
+ # * zip1_onrp
28
+ # * zip1_code
29
+ # * zip1_delivery_by
30
+ # * zip1_largest_community_number
31
+ # * zip2_onrp
32
+ # * com1_bfsnr
33
+ # * com1_agglomeration
34
+ # * com2_GDENR
35
+ # * com2_PLZ4
36
+ # * district_GDEBZNR
37
+ #
38
+ # int4_columns: packed with N* the columns
39
+ # * zip1_valid_from
40
+ #
41
+ # text_columns: joined with \x1f
42
+ # * zip1_name_short
43
+ # * zip1_name
44
+ # * zip1_canton
45
+ # * zip2_short
46
+ # * zip2_name
47
+ # * com1_name
48
+ # * com1_canton
49
+ # * district_GDEKT
50
+ # * district_GDEBZNA
51
+ #
52
+ class Converter
53
+ def initialize(match_path, districts_path=nil, communities_path=nil)
54
+ @match_path = match_path
55
+ @districts_path = districts_path || gem_districts_path
56
+ @communities_path = communities_path || gem_communities_path
57
+ @data = nil
58
+ end
59
+
60
+ def gem_data_path
61
+ data_directory = File.expand_path('../../../../data/swissmatch-location', __FILE__)
62
+ data_directory = Gem.datadir 'swissmatch-location' if defined?(Gem) && !File.directory?(data_directory)
63
+
64
+ data_directory
65
+ end
66
+
67
+ def gem_districts_path
68
+ Dir.enum_for(:glob, "#{gem_data_path}/districts_*.csv").sort.last
69
+ end
70
+
71
+ def gem_communities_path
72
+ Dir.enum_for(:glob, "#{gem_data_path}/communities_*.csv").sort.last
73
+ end
74
+
75
+ def generate_expression(size, separator, terminator)
76
+ /^#{Array.new(size) { "([^#{separator}]*)" }.join(eval("'#{separator}'"))}#{terminator}/
77
+ end
78
+
79
+ def convert
80
+ match_data = File.read(@match_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
81
+ districts_data = File.read(@districts_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
82
+ communities_data = File.read(@communities_path, encoding: Encoding::Windows_1252).encode(Encoding::UTF_8)
83
+
84
+ r_base = generate_expression(3, ';', '\r\n')
85
+ r_zip_1 = generate_expression(16, ';', '\r\n')
86
+ r_zip_2 = generate_expression(7, ';', '\r\n')
87
+ r_community1 = generate_expression(5, ';', '\r\n')
88
+ r_community2 = generate_expression(10, ',', '(?:\n|\z)')
89
+ r_district = generate_expression(3, ',', '\n')
90
+
91
+ start_zip1 = match_data.index(/^01/)
92
+ start_zip2 = match_data.index(/^02/, start_zip1)
93
+ start_com = match_data.index(/^03/, start_zip2)
94
+ end_com = match_data.index(/^04/, start_com)
95
+
96
+ base = match_data[0...start_zip1].scan(r_base).first
97
+ zip1 = match_data[start_zip1...start_zip2].scan(r_zip_1); zip1.size
98
+ zip2 = match_data[start_zip2...start_com].scan(r_zip_2); zip2.size
99
+ com1 = match_data[start_com...end_com].scan(r_community1); com1.size
100
+ com2 = communities_data.scan(r_community2); com2.size
101
+ districts = districts_data.scan(r_district); districts.size
102
+
103
+ zip1_columns = zip1.transpose; 0
104
+ zip2_columns = zip2.transpose; 0
105
+ com1_columns = com1.transpose; 0
106
+ com2_columns = com2.transpose; 0
107
+ dist_columns = districts.transpose; 0
108
+
109
+ int1_columns = (
110
+ zip1_columns.values_at(3,5,10,11).flatten+
111
+ zip2_columns.values_at(2,3,4).flatten+
112
+ com2_columns[8]
113
+ ).map(&:to_i).pack("C*")
114
+
115
+ int2_columns = (
116
+ zip1_columns.values_at(1,4,12,2).flatten+
117
+ zip2_columns[1]+
118
+ com1_columns.values_at(1,4).flatten+
119
+ com2_columns[4]+
120
+ com2_columns[7]+
121
+ dist_columns[1]
122
+ ).map(&:to_i).pack("n*")
123
+
124
+ int4_columns = (
125
+ zip1_columns[13].map { |date| Date.civil(*date.match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)).jd }
126
+ ).pack("N*")
127
+
128
+ text_columns = (
129
+ zip1_columns.values_at(7,8,9).flatten+
130
+ zip2_columns[5]+
131
+ zip2_columns[6]+
132
+ com1_columns[2]+
133
+ com1_columns[3]+
134
+ dist_columns[0]+
135
+ dist_columns[2]
136
+ ).join("\x1f").force_encoding(Encoding::BINARY)
137
+
138
+ @data =
139
+ [Date.civil(*base[1].match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)).jd, base[2].to_i].pack("NN")+
140
+ [zip1.size, zip2.size, com1.size, com2.size, districts.size].pack("n*")+
141
+ [int1_columns.bytesize, int2_columns.bytesize, int4_columns.bytesize, text_columns.bytesize].pack("N*")+
142
+ int1_columns+
143
+ int2_columns+
144
+ int4_columns+
145
+ text_columns
146
+
147
+ self
148
+ end
149
+
150
+ def write(path)
151
+ File.write(path, @data, encoding: Encoding::BINARY)
152
+ end
153
+ end
154
+ end
155
+ end
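Note on the header layout: the format comment says the five counts in bytes 8...18 are packed with `N*`, but `convert` packs them with `n*` (16-bit big-endian), which is what makes them fit in 10 bytes and gives a 34-byte header. The sketch below builds and reads such a header in isolation; all values are made up for illustration and nothing here calls the gem.

```ruby
require "date"

# Made-up sample values, for illustration only.
date        = Date.new(2015, 1, 15)
random_code = 123_456
counts      = [5000, 2000, 2500, 4000, 150] # zip1, zip2, com1, com2, district
sizes       = [10, 20, 30, 40]              # byte sizes of the four column blocks

# Bytes 0...8:   two 32-bit big-endian integers (Julian day number, random code)
# Bytes 8...18:  five 16-bit big-endian integers (the counts)
# Bytes 18...34: four 32-bit big-endian integers (the block sizes)
header = [date.jd, random_code].pack("NN") + counts.pack("n*") + sizes.pack("N*")
puts header.bytesize # => 34

# Reading mirrors the slices used by DataFiles#load! in this diff.
jd, code, *read_counts = header[0, 18].unpack("NNn*")
read_sizes = header[18, 16].unpack("N*")

puts Date.jd(jd)         # => 2015-01-15
puts code                # => 123456
puts read_counts.inspect # => [5000, 2000, 2500, 4000, 150]
puts read_sizes.inspect  # => [10, 20, 30, 40]
```

One consequence of the 16-bit counts is that each count must stay below 65536.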
@@ -12,12 +12,15 @@ require 'swissmatch/community'
12
12
  require 'swissmatch/communities'
13
13
  require 'swissmatch/zipcode'
14
14
  require 'swissmatch/zipcodes'
15
+ require 'swissmatch/location/ruby'
15
16
 
16
17
 
17
18
 
18
19
  module SwissMatch
19
20
  module Location
20
21
 
22
+ # SwissMatch::Location::DataFiles
23
+ #
21
24
  # Deals with retrieving and updating the files provided by the swiss postal service,
22
25
  # and loading the data from them.
23
26
  #
@@ -26,73 +29,58 @@ module SwissMatch
26
29
  # change over iterations.
27
30
  class DataFiles
28
31
 
29
- # Used to generate the regular expressions used to parse the data files.
30
- # Generates a regular expression, that matches +size+ tab separated fields,
31
- # delimited by \r\n.
32
- # @private
33
- def self.generate_expression(size, separator, terminator)
34
- /^#{Array.new(size) { "([^#{separator}]*)" }.join(eval("'#{separator}'"))}#{terminator}/
35
- end
36
-
37
- # Regular expressions used to parse the different files.
38
- # @private
39
- Expressions = {
40
- :community => generate_expression(4, '\t', '\r\n'),
41
- :zip_2 => generate_expression(6, '\t', '\r\n'),
42
- :zip_1 => generate_expression(13, '\t', '\r\n'),
43
- :districts => generate_expression(3, ',', '\n'),
44
- :communities => generate_expression(10, ',', '\n'),
45
- }
46
-
47
- # @private
48
- # The URL of the plz_p1 file
49
- URLZip1 = "https://match.post.ch/download?file=10001&tid=11&rol=0"
32
+ # Used to convert numerical language codes to symbols
33
+ LanguageCodes = [nil, :de, :fr, :it, :rt]
50
34
 
35
+ # The data of all cantons
51
36
  # @private
52
- # The URL of the plz_p2 file
53
- URLZip2 = "https://match.post.ch/download?file=10002&tid=14&rol=0"
37
+ AllCantons = Cantons.new([
38
+ Canton.new("AG", "Aargau", "Aargau", "Argovie", "Argovia", "Argovia"),
39
+ Canton.new("AI", "Appenzell Innerrhoden", "Appenzell Innerrhoden", "Appenzell Rhodes-Intérieures", "Appenzello Interno", "Appenzell Dadens"),
40
+ Canton.new("AR", "Appenzell Ausserrhoden", "Appenzell Ausserrhoden", "Appenzell Rhodes-Extérieures", "Appenzello Esterno", "Appenzell Dadora"),
41
+ Canton.new("BE", "Bern", "Bern", "Berne", "Berna", "Berna"),
42
+ Canton.new("BL", "Basel-Landschaft", "Basel-Landschaft", "Bâle-Campagne", "Basilea Campagna", "Basilea-Champagna"),
43
+ Canton.new("BS", "Basel-Stadt", "Basel-Stadt", "Bâle-Ville", "Basilea Città", "Basilea-Citad"),
44
+ Canton.new("FR", "Freiburg", "Fribourg", "Fribourg", "Friburgo", "Friburg"),
45
+ Canton.new("GE", "Genève", "Genf", "Genève", "Ginevra", "Genevra"),
46
+ Canton.new("GL", "Glarus", "Glarus", "Glaris", "Glarona", "Glaruna"),
47
+ Canton.new("GR", "Graubünden", "Graubünden", "Grisons", "Grigioni", "Grischun"),
48
+ Canton.new("JU", "Jura", "Jura", "Jura", "Giura", "Giura"),
49
+ Canton.new("LU", "Luzern", "Luzern", "Lucerne", "Lucerna", "Lucerna"),
50
+ Canton.new("NE", "Neuchâtel", "Neuenburg", "Neuchâtel", "Neuchâtel", "Neuchâtel"),
51
+ Canton.new("NW", "Nidwalden", "Nidwalden", "Nidwald", "Nidvaldo", "Sutsilvania"),
52
+ Canton.new("OW", "Obwalden", "Obwalden", "Obwald", "Obvaldo", "Sursilvania"),
53
+ Canton.new("SG", "St. Gallen", "St. Gallen", "Saint-Gall", "San Gallo", "Son Gagl"),
54
+ Canton.new("SH", "Schaffhausen", "Schaffhausen", "Schaffhouse", "Sciaffusa", "Schaffusa"),
55
+ Canton.new("SO", "Solothurn", "Solothurn", "Soleure", "Soletta", "Soloturn"),
56
+ Canton.new("SZ", "Schwyz", "Schwyz", "Schwytz", "Svitto", "Sviz"),
57
+ Canton.new("TG", "Thurgau", "Thurgau", "Thurgovie", "Turgovia", "Turgovia"),
58
+ Canton.new("TI", "Ticino", "Tessin", "Tessin", "Ticino", "Tessin"),
59
+ Canton.new("UR", "Uri", "Uri", "Uri", "Uri", "Uri"),
60
+ Canton.new("VD", "Vaud", "Waadt", "Vaud", "Vaud", "Vad"),
61
+ Canton.new("VS", "Valais", "Wallis", "Valais", "Vallese", "Vallais"),
62
+ Canton.new("ZG", "Zug", "Zug", "Zoug", "Zugo", "Zug"),
63
+ Canton.new("ZH", "Zürich", "Zürich", "Zurich", "Zurigo", "Turitg"),
64
+ Canton.new("FL", "Fürstentum Liechtenstein", "Fürstentum Liechtenstein", "Liechtenstein", "Liechtenstein", "Liechtenstein"),
65
+ Canton.new("DE", "Deutschland", "Deutschland", "Allemagne", "Germania", "Germania"),
66
+ Canton.new("IT", "Italien", "Italien", "Italie", "Italia", "Italia"),
67
+ ])
68
+
69
+ def self.empty
70
+ data = new
71
+ data.load_empty!
72
+
73
+ data
74
+ end
54
75
 
55
- # @private
56
- # The URL of the plz_c file
57
- URLCommunity = "https://match.post.ch/download?file=10003&tid=13&rol=0"
76
+ # @return [Date]
77
+ # The date from when the data from the swiss post master data file
78
+ # starts to be valid
79
+ attr_reader :date
58
80
 
59
- # @private
60
- # An array of all urls
61
- URLAll = [URLZip1, URLZip2, URLCommunity]
62
-
63
- # The data of all cantons
64
- # @private
65
- CantonData = [
66
- ["AG", "Aargau", "Aargau", "Argovie", "Argovia", "Argovia"],
67
- ["AI", "Appenzell Innerrhoden", "Appenzell Innerrhoden", "Appenzell Rhodes-Intérieures", "Appenzello Interno", "Appenzell Dadens"],
68
- ["AR", "Appenzell Ausserrhoden", "Appenzell Ausserrhoden", "Appenzell Rhodes-Extérieures", "Appenzello Esterno", "Appenzell Dadora"],
69
- ["BE", "Bern", "Bern", "Berne", "Berna", "Berna"],
70
- ["BL", "Basel-Landschaft", "Basel-Landschaft", "Bâle-Campagne", "Basilea Campagna", "Basilea-Champagna"],
71
- ["BS", "Basel-Stadt", "Basel-Stadt", "Bâle-Ville", "Basilea Città", "Basilea-Citad"],
72
- ["FR", "Freiburg", "Fribourg", "Fribourg", "Friburgo", "Friburg"],
73
- ["GE", "Genève", "Genf", "Genève", "Ginevra", "Genevra"],
74
- ["GL", "Glarus", "Glarus", "Glaris", "Glarona", "Glaruna"],
75
- ["GR", "Graubünden", "Graubünden", "Grisons", "Grigioni", "Grischun"],
76
- ["JU", "Jura", "Jura", "Jura", "Giura", "Giura"],
77
- ["LU", "Luzern", "Luzern", "Lucerne", "Lucerna", "Lucerna"],
78
- ["NE", "Neuchâtel", "Neuenburg", "Neuchâtel", "Neuchâtel", "Neuchâtel"],
79
- ["NW", "Nidwalden", "Nidwalden", "Nidwald", "Nidvaldo", "Sutsilvania"],
80
- ["OW", "Obwalden", "Obwalden", "Obwald", "Obvaldo", "Sursilvania"],
81
- ["SG", "St. Gallen", "St. Gallen", "Saint-Gall", "San Gallo", "Son Gagl"],
82
- ["SH", "Schaffhausen", "Schaffhausen", "Schaffhouse", "Sciaffusa", "Schaffusa"],
83
- ["SO", "Solothurn", "Solothurn", "Soleure", "Soletta", "Soloturn"],
84
- ["SZ", "Schwyz", "Schwyz", "Schwytz", "Svitto", "Sviz"],
85
- ["TG", "Thurgau", "Thurgau", "Thurgovie", "Turgovia", "Turgovia"],
86
- ["TI", "Ticino", "Tessin", "Tessin", "Ticino", "Tessin"],
87
- ["UR", "Uri", "Uri", "Uri", "Uri", "Uri"],
88
- ["VD", "Vaud", "Waadt", "Vaud", "Vaud", "Vad"],
89
- ["VS", "Valais", "Wallis", "Valais", "Vallese", "Vallais"],
90
- ["ZG", "Zug", "Zug", "Zoug", "Zugo", "Zug"],
91
- ["ZH", "Zürich", "Zürich", "Zurich", "Zurigo", "Turitg"],
92
- ["FL", "Fürstentum Liechtenstein", "Fürstentum Liechtenstein", "Liechtenstein", "Liechtenstein", "Liechtenstein"],
93
- ["DE", "Deutschland", "Deutschland", "Allemagne", "Germania", "Germania"],
94
- ["IT", "Italien", "Italien", "Italie", "Italia", "Italia"],
95
- ]
81
+ # @return [Integer]
82
+ # The random code from the swiss post master data file
83
+ attr_reader :random_code
96
84
 
97
85
  # The directory in which the post mat[ch] files reside
98
86
  attr_accessor :data_directory
@@ -116,14 +104,13 @@ module SwissMatch
116
104
  # The directory in which the post mat[ch] files reside
117
105
  def initialize(data_directory=nil)
118
106
  reset_errors!
107
+ @loaded = false
119
108
  if data_directory then
120
109
  @data_directory = data_directory
121
110
  elsif ENV['SWISSMATCH_DATA'] then
122
111
  @data_directory = ENV['SWISSMATCH_DATA']
123
112
  else
124
- data_directory = File.expand_path('../../../../data/swissmatch-location', __FILE__)
125
- data_directory = Gem.datadir 'swissmatch-location' if defined?(Gem) && !File.directory?(data_directory)
126
- @data_directory = data_directory
113
+ @data_directory = File.expand_path('~/.swissmatch')
127
114
  end
128
115
  end
129
116
 
@@ -134,120 +121,116 @@ module SwissMatch
134
121
  self
135
122
  end
136
123
 
137
- # Load new files
138
- #
139
- # @return [Array<String>]
140
- # An array with the absolute file paths of the extracted files.
141
- def load_updates
142
- URLAll.flat_map { |url|
143
- http_get_zip_file(url, @data_directory)
144
- }
124
+ def latest_binary_file
125
+ Dir.enum_for(:glob, "#{@data_directory}/locations_*.binary").last
145
126
  end
146
127
 
147
- # Performs an HTTP-GET for the given url, extracts it as a zipped file into the
148
- # destination directory.
149
- #
150
- # @return [Array<String>]
151
- # An array with the absolute file paths of the extracted files.
152
- def http_get_zip_file(url, destination)
153
- require 'open-uri'
154
- require 'swissmatch/zip' # patched zip/zip
155
- require 'fileutils'
156
-
157
- files = []
158
-
159
- open(url) do |zip_buffer|
160
- Zip::ZipFile.open(zip_buffer) do |zip_file|
161
- zip_file.each do |f|
162
- target_path = File.join(destination, f.name)
163
- FileUtils.mkdir_p(File.dirname(target_path))
164
- zip_file.extract(f, target_path) unless File.exist?(target_path)
165
- files << target_path
166
- end
167
- end
168
- end
169
-
170
- files
171
- end
128
+ def load_empty!
129
+ return if @loaded
172
130
 
173
- # Unzips it as a zipped file into the destination directory.
174
- def unzip_file(file, destination)
175
- require 'swissmatch/zip'
176
- Zip::ZipFile.open(file) do |zip_file|
177
- zip_file.each do |f|
178
- target_path = File.join(destination, f.name)
179
- FileUtils.mkdir_p(File.dirname(target_path))
180
- zip_file.extract(f, target_path) unless File.exist?(target_path)
181
- end
182
- end
131
+ @loaded = true
132
+ @date = Date.new(0)
133
+ @random_code = 0
134
+ @cantons = AllCantons
135
+ @districts = Districts.new([])
136
+ @communities = Communities.new([])
137
+ @zip_codes = ZipCodes.new([])
183
138
  end
184
139
 
185
- # Used to convert numerical language codes to symbols
186
- LanguageCodes = [nil, :de, :fr, :it, :rt]
187
-
188
140
  # Loads the data into this DataFiles instance
189
141
  #
190
142
  # @return [self]
191
143
  # Returns self.
192
- def load!
193
- @cantons, @districts, @communities, @zip_codes = *load
194
- self
195
- end
144
+ def load!(file=nil)
145
+ return if @loaded
146
+
147
+ file ||= latest_binary_file
148
+
149
+ raise LoadError.new("File #{file.inspect} not found or not readable", nil) unless file && File.readable?(file)
150
+
151
+ data = File.read(file, encoding: Encoding::BINARY)
152
+ date, random_code, zip1_count, zip2_count, com1_count, com2_count, district_count = *data[0,18].unpack("NNn*")
153
+ int1_size, int2_size, int4_size, text_size = *data[18,16].unpack("N*")
154
+
155
+ offset = 34
156
+ int1_cols = data[offset, int1_size].unpack("C*")
157
+ int2_cols = data[offset+=int1_size, int2_size].unpack("n*")
158
+ int4_cols = data[offset+=int2_size, int4_size].unpack("N*")
159
+ text_cols = data[offset+=int4_size, text_size].force_encoding(Encoding::UTF_8).split("\x1f")
160
+
161
+ offset = 0
162
+ zip1_type = int1_cols[offset, zip1_count]
163
+ zip1_addon = int1_cols[offset += zip1_count, zip1_count]
164
+ zip1_language = int1_cols[offset += zip1_count, zip1_count]
165
+ zip1_language_alternative = int1_cols[offset += zip1_count, zip1_count]
166
+ zip2_region = int1_cols[offset += zip1_count, zip2_count]
167
+ zip2_type = int1_cols[offset += zip2_count, zip2_count]
168
+ zip2_lang = int1_cols[offset += zip2_count, zip2_count]
169
+ com2_PLZZ = int1_cols[offset += zip2_count, com2_count]
170
+
171
+ offset = 0
172
+ zip1_onrp = int2_cols[offset, zip1_count]
173
+ zip1_code = int2_cols[offset += zip1_count, zip1_count]
174
+ zip1_delivery_by = int2_cols[offset += zip1_count, zip1_count]
175
+ zip1_largest_community_number = int2_cols[offset += zip1_count, zip1_count]
176
+ zip2_onrp = int2_cols[offset += zip1_count, zip2_count]
177
+ com1_bfsnr = int2_cols[offset += zip2_count, com1_count]
178
+ com1_agglomeration = int2_cols[offset += com1_count, com1_count]
179
+ com2_GDENR = int2_cols[offset += com1_count, com2_count]
180
+ com2_PLZ4 = int2_cols[offset += com2_count, com2_count]
181
+ district_GDEBZNR = int2_cols[offset += com2_count, district_count]
182
+
183
+ zip1_valid_from = int4_cols
184
+
185
+ offset = 0
186
+ zip1_name_short = text_cols[offset, zip1_count]
187
+ zip1_name = text_cols[offset += zip1_count, zip1_count]
188
+ zip1_canton = text_cols[offset += zip1_count, zip1_count]
189
+ zip2_short = text_cols[offset += zip1_count, zip2_count]
190
+ zip2_name = text_cols[offset += zip2_count, zip2_count]
191
+ com1_name = text_cols[offset += zip2_count, com1_count]
192
+ com1_canton = text_cols[offset += com1_count, com1_count]
193
+ district_GDEKT = text_cols[offset += com1_count, district_count]
194
+ district_GDEBZNA = text_cols[offset += district_count, district_count]
195
+
196
+ zip1 = [
197
+ zip1_onrp, zip1_type, zip1_canton, zip1_code, zip1_addon,
198
+ zip1_delivery_by, zip1_language, zip1_language_alternative,
199
+ zip1_name_short, zip1_name, zip1_largest_community_number,
200
+ zip1_valid_from
201
+ ].transpose
202
+ zip2 = [zip2_onrp, zip2_region, zip2_type, zip2_lang, zip2_short, zip2_name].transpose
203
+ com1 = [com1_bfsnr, com1_name, com1_canton, com1_agglomeration].transpose
204
+ com2 = [com2_PLZ4, com2_PLZZ, com2_GDENR].transpose
205
+ district = [district_GDEKT, district_GDEBZNR, district_GDEBZNA].transpose
206
+
207
+ @date = Date.jd(date)
208
+ @random_code = random_code
209
+ @cantons = AllCantons
210
+ @districts = load_districts(district)
211
+ @communities = load_communities(com1)
212
+ @zip_codes = load_zipcodes(zip1, zip2, com2)
196
213
 
197
- # @return [Array]
198
- # Returns an array of the form [SwissMatch::Cantons, SwissMatch::Communities,
199
- # SwissMatch::ZipCodes].
200
- def load
201
- reset_errors!
202
-
203
- cantons = load_cantons
204
- districts = load_districts(cantons)
205
- communities = load_communities(cantons)
206
- zip_codes = load_zipcodes(cantons, communities)
207
-
208
- [cantons, districts, communities, zip_codes]
214
+ self
209
215
  end
210
216
 
211
- # @return [SwissMatch::Cantons]
212
- # A SwissMatch::Cantons containing all cantons used by the swiss postal service.
213
- def load_cantons
214
- Cantons.new(
215
- CantonData.map { |tag, name, name_de, name_fr, name_it, name_rt|
216
- Canton.new(tag, name, name_de, name_fr, name_it, name_rt)
217
- }
218
- )
219
- end
220
-
221
- def load_districts(cantons)
222
- # File format: GDEKT,GDEBZNR,GDEBZNA
223
- path = Dir.enum_for(:glob, "#{@data_directory}/districts_*.csv").last
224
- data = File.read(path, encoding: Encoding::UTF_8.to_s).scan(Expressions[:districts])
225
- districts = data[1..-1].map { |canton_tag, district_number, district_name|
226
- district_number = Integer(district_number, 10)
227
- canton = cantons.by_license_tag(canton_tag)
228
-
229
- District.new(district_number, district_name, canton, SwissMatch::Communities.new([]))
230
- }
231
-
232
- Districts.new(districts)
217
+ def load_districts(data)
218
+ Districts.new(data.map { |data|
219
+ District.new(*data, SwissMatch::Communities.new([]))
220
+ })
233
221
  end
234
222
 
235
223
  # @return [SwissMatch::Communities]
236
224
  # An instance of SwissMatch::Communities containing all communities defined by the
237
225
  # files known to this DataFiles instance.
238
- def load_communities(cantons)
239
- raise "Must load cantons first" unless cantons
240
-
241
- file = Dir.enum_for(:glob, "#{@data_directory}/plz_c_*.txt").last
226
+ def load_communities(data)
242
227
  temporary = []
243
228
  complete = {}
244
- load_table(file, :community).each do |bfsnr, name, canton, agglomeration|
245
- bfsnr = bfsnr.to_i
246
- agglomeration = agglomeration.to_i
247
- canton = cantons.by_license_tag(canton)
229
+ data.each do |bfsnr, name, canton, agglomeration|
230
+ canton = @cantons.by_license_tag(canton)
248
231
  if agglomeration == bfsnr then
249
232
  complete[bfsnr] = Community.new(bfsnr, name, canton, :self)
250
- elsif agglomeration.nil? then
233
+ elsif agglomeration.zero? then
251
234
  complete[bfsnr] = Community.new(bfsnr, name, canton, nil)
252
235
  else
253
236
  temporary << [bfsnr, name, canton, agglomeration]
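Note on the column slicing in `load!`: the binary file stores whole columns back to back, grouped by integer width, and the reader recovers each column by advancing an offset by the row count. The sketch below shows the round trip on a tiny table; the rows, the ONRP values and the variable names are made up for illustration and are not taken from the master data.

```ruby
# Made-up rows of [onrp, code, addon, name].
rows = [
  [101, 8000, 0, "Zürich"],
  [102, 8001, 0, "Zürich"],
  [103, 3000, 0, "Bern"],
]
count = rows.size

# Writing: turn rows into columns, then concatenate columns of equal width.
onrp_col, code_col, addon_col, name_col = rows.transpose
int2_block = (onrp_col + code_col).pack("n*") # two 16-bit columns back to back
int1_block = addon_col.pack("C*")             # one 8-bit column
text_block = name_col.join("\x1f").force_encoding(Encoding::BINARY)

# Reading: unpack each block, then cut it into columns of +count+ entries.
int2_cols = int2_block.unpack("n*")
int1_cols = int1_block.unpack("C*")
text_cols = text_block.force_encoding(Encoding::UTF_8).split("\x1f")

offset   = 0
onrp     = int2_cols[offset, count]
code     = int2_cols[offset += count, count]
addon    = int1_cols[0, count]
name     = text_cols[0, count]
restored = [onrp, code, addon, name].transpose

puts restored == rows # => true
```

Storing columns rather than rows lets each block be decoded with a single `unpack` call, which is presumably where much of the load-time saving comes from.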
@@ -268,48 +251,31 @@ module SwissMatch
268
251
  # @return [SwissMatch::ZipCodes]
269
252
  # An instance of SwissMatch::ZipCodes containing all zip codes defined by the
270
253
  # files known to this DataFiles instance.
271
- def load_zipcodes(cantons, communities)
272
- raise "Must load cantons first" unless cantons
273
- raise "Must load communities first" unless communities
274
-
254
+ def load_zipcodes(zip1_data, zip2_data, com2_data)
275
255
  community_mapping = Hash.new { |h,k| h[k] = [] }
276
256
  self_delivered = []
277
257
  others = []
278
- zip1_file = Dir.enum_for(:glob, "#{@data_directory}/plz_p1_*.txt").last
279
- zip2_file = Dir.enum_for(:glob, "#{@data_directory}/plz_p2_*.txt").last
280
- communities_file = Dir.enum_for(:glob, "#{@data_directory}/communities_*.csv").last
281
-
282
- # KTKZ,OHW,ORTNAME,GHW,GDENR,GDENAMK,PHW,PLZ4,PLZZ,PLZNAMK
283
- communities_data = File.read(
284
- communities_file,
285
- encoding: Encoding::UTF_8.to_s
286
- ).scan(Expressions[:communities])[1..-1].transpose.values_at(4,7,8)
287
- communities_data[0].map!(&:to_i)
288
- communities_data[1].map!(&:to_i)
289
- communities_data[2].map!(&:to_i)
290
- communities_data.transpose.each do |data|
291
- community_mapping[data.last(2)] << data.at(0)
258
+ temporary = {}
259
+
260
+ com2_data.each do |*key, value|
261
+ community_mapping[key] << value
292
262
  end
293
263
 
294
- temporary = {}
295
- load_table(zip1_file, :zip_1).each do |row|
296
- onrp = row.at(0).to_i
297
- code = row.at(2).to_i
298
- addon = row.at(3).to_i
299
- delivery_by = row.at(10).to_i
264
+ zip1_data.each do |onrp, type, canton, code, addon, delivery_by, lang, lang_alt, name_short, name, largest_community_number, valid_from|
300
265
  delivery_by = case delivery_by when 0 then nil; when onrp then :self; else delivery_by; end
301
- language = LanguageCodes[row.at(7).to_i]
302
- language_alternative = LanguageCodes[row.at(8).to_i]
303
- name_short = Name.new(row.at(4), language)
304
- name = Name.new(row.at(5), language)
305
- largest_community_number = row.at(11).to_i
266
+ language = LanguageCodes[lang]
267
+ language_alternative = LanguageCodes[lang_alt]
268
+ name_short = Name.new(name_short, language)
269
+ name = Name.new(name, language)
270
+
306
271
  # compact, because some communities already no longer exist, so by_community_numbers can
307
272
  # contain nils which must be removed
308
- community_numbers = (community_mapping[[code, addon]]|[largest_community_number]).sort
309
- communities = Communities.new(communities.by_community_numbers(*community_numbers).compact)
273
+ community_numbers = (community_mapping[[code, addon]] | [largest_community_number]).sort
274
+ communities = Communities.new(@communities.by_community_numbers(*community_numbers).compact)
275
+
310
276
  data = [
311
277
  onrp, # ordering_number
312
- row.at(1).to_i, # type
278
+ type, # type
313
279
  code,
314
280
  addon,
315
281
  name, # name (official)
@@ -318,14 +284,14 @@ module SwissMatch
318
284
  [name_short], # names_short (official + alternative)
319
285
  [], # PLZ2 type 3 short names (additional region names)
320
286
  [], # PLZ2 type 3 names (additional region names)
321
- cantons.by_license_tag(row.at(6)), # canton
287
+ cantons.by_license_tag(canton), # canton
322
288
  language,
323
289
  language_alternative,
324
- row.at(9) == "1", # sortfile_member
290
+ false, # sortfile_member TODO: remove
325
291
  delivery_by, # delivery_by
326
292
  communities.by_community_number(largest_community_number), # community_number
327
293
  communities,
328
- Date.civil(*row.at(12).match(/^(\d{4})(\d\d)(\d\d)$/).captures.map(&:to_i)) # valid_from
294
+ Date.jd(valid_from) # valid_from
329
295
  ]
330
296
  temporary[onrp] = data
331
297
  if :self == delivery_by then
@@ -335,26 +301,26 @@ module SwissMatch
335
301
  end
336
302
  end
337
303
 
338
- load_table(zip2_file, :zip_2).each do |onrp, rn, type, lang, short, name|
339
- onrp = onrp.to_i
340
- lang_code = lang.to_i
304
+ zip2_data.each do |onrp, rn, type, lang, short, name|
305
+ onrp = onrp
306
+ lang_code = lang
341
307
  language = LanguageCodes[lang_code]
342
308
  entry = temporary[onrp]
343
- if type == "2"
344
- entry[5] << Name.new(name, language, rn.to_i)
345
- entry[7] << Name.new(short, language, rn.to_i)
346
- elsif type == "3"
347
- entry[8] << Name.new(name, language, rn.to_i)
348
- entry[9] << Name.new(short, language, rn.to_i)
309
+ if type == 2
310
+ entry[5] << Name.new(name, language, rn)
311
+ entry[7] << Name.new(short, language, rn)
312
+ elsif type == 3
313
+ entry[8] << Name.new(name, language, rn)
314
+ entry[9] << Name.new(short, language, rn)
349
315
  end
350
316
  end
351
317
 
352
318
  self_delivered.each do |row|
353
- temporary[row.at(0)] = ZipCode.new(*row)
319
+ temporary[row[0]] = ZipCode.new(*row)
354
320
  end
355
321
  others.each do |row|
356
- if row.at(14) then
357
- raise "Delivery not found:\n#{row.inspect}" unless tmp = temporary[row.at(14)]
322
+ if row[14] then
323
+ raise "Delivery not found:\n#{row.inspect}" unless tmp = temporary[row[14]]
358
324
  if tmp.kind_of?(Array) then
359
325
  @errors << LoadError.new("Invalid reference: onrp #{row.at(0)} delivery by #{row.at(14)}", row)
360
326
  row[14] = nil
@@ -362,26 +328,11 @@ module SwissMatch
362
328
  row[14] = tmp
363
329
  end
364
330
  end
365
- temporary[row.at(0)] = ZipCode.new(*row)
331
+ temporary[row[0]] = ZipCode.new(*row)
366
332
  end
367
333
 
368
334
  ZipCodes.new(temporary.values)
369
335
  end
370
-
371
- # Reads a file and parses using the pattern of the given name.
372
- #
373
- # @param [String] path
374
- # The path of the file to parse
375
- # @param [Symbol] pattern
376
- # The pattern-name used to parse the file (see Expressions)
377
- #
378
- # @return [Array<Array<String>>]
379
- # A 2 dimensional array representing the tabular data contained in the given file.
380
- def load_table(path, pattern)
381
- File.read(path, :encoding => Encoding::Windows_1252.to_s). # to_s because sadly, ruby 1.9.2 can't handle an Encoding instance as argument
382
- encode(Encoding::UTF_8).
383
- scan(Expressions[pattern])
384
- end
385
336
  end
386
337
  end
387
338
  end
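
Two idioms in the rewritten `load_zipcodes` above may be easier to follow in isolation: grouping rows with a splat block parameter (`|*key, value|`) and restoring dates with `Date.jd`. The following is a minimal sketch, not code from the gem. It assumes each `com2_data` row has the shape `[code, addon, community_number]` and that `valid_from` is stored as a Julian day number; both are inferred from the diff, and all sample values are made up.

```ruby
require 'date'

# Hypothetical sample rows, assumed shape: [code, addon, community_number]
com2_rows = [
  [8000, 0, 261],
  [8000, 0, 262],
  [8001, 0, 261],
]

# The default block creates an empty array the first time a key is read,
# so values can be appended without checking whether the key exists.
community_mapping = Hash.new { |hash, key| hash[key] = [] }

# With |*key, value| each row is destructured: the last element becomes
# `value`, everything before it is collected into the array `key`.
com2_rows.each do |*key, value|
  community_mapping[key] << value
end

# Old representation (removed lines): an 8-digit "YYYYMMDD" string.
original = Date.civil(*"20140901".match(/\A(\d{4})(\d\d)(\d\d)\z/).captures.map(&:to_i))

# Assumed new representation (added lines): an integer Julian day number.
julian_day = original.jd
restored   = Date.jd(julian_day)
```

Storing an integer avoids the regular expression match on every row, at the cost of a value that is no longer human-readable in the data file.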