smo_scottish_lidar 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/README.md +246 -0
- data/lib/smo_scottish_lidar/client.rb +148 -0
- data/lib/smo_scottish_lidar/constants.rb +61 -0
- data/lib/smo_scottish_lidar/downloader.rb +134 -0
- data/lib/smo_scottish_lidar/lister.rb +75 -0
- data/lib/smo_scottish_lidar/version.rb +5 -0
- data/lib/smo_scottish_lidar.rb +7 -0
- metadata +55 -0
checksums.yaml
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
---
|
|
2
|
+
SHA256:
|
|
3
|
+
metadata.gz: e863ff5b29426cd4b10ab869f95e6a87cd1a5531d0a9210b0f84cec1ff3fa0f3
|
|
4
|
+
data.tar.gz: a3c1b55c62746a9dd9661dc1541c9b0f0f67cf1f969419373333c760aa31f989
|
|
5
|
+
SHA512:
|
|
6
|
+
metadata.gz: c7ac311402bce11dfbea9e8bcf030df07e7e9d3269f462eb942a8705a6a1d3a3815242f211c1b03dbf6e7413ef6e4dda4250c197dc2696241285a6b16dd8ff74
|
|
7
|
+
data.tar.gz: b853e6a7f50d7ec73ac3b016dd266bb4352ffe86ab72433689502d6139f15e4e81b16245ca38f38ce2fce7d6216991a13b71b2b5c88eafbd7d450e270441164f
|
data/README.md
ADDED
|
@@ -0,0 +1,246 @@
|
|
|
1
|
+
# smo_scottish_lidar
|
|
2
|
+
|
|
3
|
+
A pure Ruby gem for listing and downloading Scottish Public Sector LiDAR data from the [Registry of Open Data on AWS](https://registry.opendata.aws/scottish-lidar/). Built by Sebastian Madrid Ontiveros to support hydraulic modellers in Scotland working on 1D-2D flood risk assessments and model build workflows.
|
|
4
|
+
|
|
5
|
+
No external dependencies. No AWS CLI. No credentials. Uses only Ruby stdlib (`net/http`, `uri`, `fileutils`). Compatible with InfoWorks ICM 2027 embedded Ruby.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## What is this data?
|
|
10
|
+
|
|
11
|
+
The Scottish Government has made LiDAR survey data publicly available through an S3 bucket (`srsp-open-data`). The dataset covers most of Scotland across five survey phases plus a dedicated Outer Hebrides survey. Each phase includes:
|
|
12
|
+
|
|
13
|
+
- **DSM** - Digital Surface Model (includes buildings, trees, structures)
|
|
14
|
+
- **DTM** - Digital Terrain Model (bare earth, vegetation removed)
|
|
15
|
+
- **LAZ** - Raw LiDAR point cloud in compressed LAS format
|
|
16
|
+
|
|
17
|
+
Files are organised by OS National Grid square (e.g. NS, NT, NO, NN) and are free to download.
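
The example filenames in this README begin with the two-letter grid square (for instance `NS56_1M_DSM_PHASE1.tif`). A minimal sketch of grouping tile names by grid square, assuming every filename follows that pattern; `grid_square_of` is a hypothetical helper, not part of the gem, and the filenames are illustrative only:

```ruby
# Hypothetical helper (not part of the gem): returns the two-letter
# OS National Grid square at the start of a tile filename, or nil if
# the name does not begin with two letters.
def grid_square_of(filename)
  match = filename.match(/\A([A-Za-z]{2})/)
  match && match[1].upcase
end

# Illustrative filenames only; real names come from the bucket listing.
filenames = [
  "NS56_1M_DSM_PHASE1.tif",
  "NS57_1M_DSM_PHASE1.tif",
  "NT06_1M_DSM_PHASE1.tif"
]

by_square = filenames.group_by { |name| grid_square_of(name) }
by_square.each { |square, names| puts "#{square}: #{names.size} tile(s)" }
```

Grouping like this can help estimate how many tiles a catchment spanning several grid squares will need before any download starts.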
|
|
18
|
+
|
|
19
|
+
| Phase | Area covered |
|
|
20
|
+
|---|---|
|
|
21
|
+
| phase-1 | Central Scotland |
|
|
22
|
+
| phase-2 | South and East Scotland |
|
|
23
|
+
| phase-3 | North and West Scotland |
|
|
24
|
+
| phase-4 | Additional coverage |
|
|
25
|
+
| phase-5 | Latest survey phase |
|
|
26
|
+
| outer-hebrides | Western Isles (25cm and 50cm resolution) |
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Installation
|
|
31
|
+
|
|
32
|
+
```sh
|
|
33
|
+
gem install smo_scottish_lidar
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
```ruby
|
|
37
|
+
require "smo_scottish_lidar"
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
---
|
|
41
|
+
|
|
42
|
+
## Quick start
|
|
43
|
+
|
|
44
|
+
```ruby
|
|
45
|
+
require "smo_scottish_lidar"
|
|
46
|
+
|
|
47
|
+
# List all Phase 1 DSM tiles in the NS grid square
|
|
48
|
+
lister = SmoScottishLidar::Lister.new
|
|
49
|
+
lister.summary("phase-1", "dsm", grid_square: "NS")
|
|
50
|
+
|
|
51
|
+
# Download a single tile
|
|
52
|
+
downloader = SmoScottishLidar::Downloader.new
|
|
53
|
+
downloader.download_file("phase-1", "dsm", "NS56_1M_DSM_PHASE1.tif",
|
|
54
|
+
destination: "/tmp/lidar"
|
|
55
|
+
)
|
|
56
|
+
|
|
57
|
+
# Batch download all NS tiles for Phase 1 DSM
|
|
58
|
+
downloader.download("phase-1", "dsm",
|
|
59
|
+
destination: "/tmp/lidar/phase-1/dsm",
|
|
60
|
+
grid_square: "NS"
|
|
61
|
+
)
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## API reference
|
|
67
|
+
|
|
68
|
+
### `SmoScottishLidar::Lister`
|
|
69
|
+
|
|
70
|
+
Lists available files from the S3 bucket. All filtering is done client-side after fetching the S3 listing.
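
Because filtering happens client-side, the full listing for a prefix has to be collected first, one page at a time. The sketch below shows the general continuation-token loop under stated assumptions: the `pages` hash is a made-up stand-in for S3 responses so the code runs offline, and `list_all` is a hypothetical name, not the gem's API.

```ruby
# Fake three-page listing keyed by continuation token (nil = first page).
# Made-up data standing in for real S3 responses.
pages = {
  nil  => { objects: ["NS56.tif", "NS57.tif"], next_token: "t1", truncated: true },
  "t1" => { objects: ["NT06.tif"], next_token: "t2", truncated: true },
  "t2" => { objects: ["NO10.tif"], next_token: nil, truncated: false }
}
fetch_page = ->(token) { pages.fetch(token) }

# Collects objects from every page, following continuation tokens
# until a page reports that the listing is no longer truncated.
def list_all(fetch_page)
  objects = []
  token = nil
  loop do
    page = fetch_page.call(token)
    objects.concat(page[:objects])
    break unless page[:truncated]

    token = page[:next_token]
  end
  objects
end

all_keys = list_all(fetch_page)
# Client-side filter, applied only after the whole listing is collected.
ns_keys = all_keys.select { |name| name.upcase.start_with?("NS") }
puts "#{all_keys.size} keys listed, #{ns_keys.size} in NS"
```

One consequence of this design is that a grid square filter does not reduce the number of listing requests; it only narrows the result afterwards.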
|
|
71
|
+
|
|
72
|
+
```ruby
|
|
73
|
+
lister = SmoScottishLidar::Lister.new(verbose: false)
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
#### `lister.list(phase, type, grid_square: nil, resolution: nil)`
|
|
77
|
+
|
|
78
|
+
Returns an `Array<Hash>` of matching files. Each hash contains:
|
|
79
|
+
|
|
80
|
+
| Key | Type | Description |
|
|
81
|
+
|---|---|---|
|
|
82
|
+
| `:key` | String | Full S3 object key |
|
|
83
|
+
| `:filename` | String | Bare filename (e.g. `NS56_1M_DSM_PHASE1.tif`) |
|
|
84
|
+
| `:size` | Integer | File size in bytes |
|
|
85
|
+
| `:last_modified` | String | ISO 8601 timestamp |
|
|
86
|
+
|
|
87
|
+
```ruby
|
|
88
|
+
files = lister.list("phase-1", "dsm", grid_square: "NS")
|
|
89
|
+
files.each { |f| puts "#{f[:filename]} #{f[:size]} bytes" }
|
|
90
|
+
```
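
Since `list` returns plain hashes, ordinary `Enumerable` methods are enough to summarise a listing. A small sketch using sample data shaped like the table above; the keys, sizes, and timestamps are made up for illustration:

```ruby
# Sample data shaped like the documented return value of `lister.list`.
# All values are invented for this example.
files = [
  { key: "lidar/phase-1/dsm/27700/gridded/NS56_1M_DSM_PHASE1.tif",
    filename: "NS56_1M_DSM_PHASE1.tif", size: 1_500_000,
    last_modified: "2020-01-01T00:00:00.000Z" },
  { key: "lidar/phase-1/dsm/27700/gridded/NS57_1M_DSM_PHASE1.tif",
    filename: "NS57_1M_DSM_PHASE1.tif", size: 2_250_000,
    last_modified: "2020-01-01T00:00:00.000Z" },
  { key: "lidar/phase-1/dsm/27700/gridded/NS58_1M_DSM_PHASE1.tif",
    filename: "NS58_1M_DSM_PHASE1.tif", size: 900_000,
    last_modified: "2020-01-01T00:00:00.000Z" }
]

total_bytes = files.sum { |f| f[:size] }           # combined download size
largest     = files.max_by { |f| f[:size] }        # biggest single tile
small_first = files.sort_by { |f| f[:size] }.map { |f| f[:filename] }

puts "Total: #{total_bytes} bytes, largest: #{largest[:filename]}"
```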
|
|
91
|
+
|
|
92
|
+
#### `lister.summary(phase, type, grid_square: nil, resolution: nil)`
|
|
93
|
+
|
|
94
|
+
Prints a formatted table to stdout and returns the same `Array<Hash>`.
|
|
95
|
+
|
|
96
|
+
```ruby
|
|
97
|
+
lister.summary("phase-2", "dtm", grid_square: "NT")
|
|
98
|
+
lister.summary("outer-hebrides", "dsm", resolution: "50cm")
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
### `SmoScottishLidar::Downloader`
|
|
104
|
+
|
|
105
|
+
Downloads files from the S3 bucket. Streams in chunks to avoid loading large files into memory.
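
The idea behind chunked streaming can be sketched without a network connection. The gem's client reads the HTTP response body chunk by chunk; the sketch below swaps in a `StringIO` as the data source so it runs offline. `copy_in_chunks` and its `chunk_size` default are assumptions for illustration, not part of the gem.

```ruby
require "stringio"
require "tmpdir"

# Hypothetical helper: copies `io` to `path` in fixed-size chunks so the
# whole payload is never held in memory at once, and reports progress
# to an optional block after each chunk is written.
def copy_in_chunks(io, path, total:, chunk_size: 16 * 1024)
  received = 0
  File.open(path, "wb") do |file|
    while (chunk = io.read(chunk_size))
      file.write(chunk)
      received += chunk.bytesize
      yield received, total if block_given?
    end
  end
  received
end

payload  = "x" * 50_000                     # stand-in for a large tile
path     = File.join(Dir.mktmpdir, "tile.bin")
progress = []

written = copy_in_chunks(StringIO.new(payload), path, total: payload.bytesize) do |received, total|
  progress << (received.to_f / total * 100).round(1)
end

puts "wrote #{written} bytes in #{progress.size} chunks"
```

Memory use stays bounded by the chunk size rather than the file size, which matters for large raster tiles and point clouds.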
|
|
106
|
+
|
|
107
|
+
```ruby
|
|
108
|
+
downloader = SmoScottishLidar::Downloader.new(verbose: false)
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
#### `downloader.download(phase, type, destination:, ...)`
|
|
112
|
+
|
|
113
|
+
Downloads all tiles matching the given filters. Returns a summary hash `{ downloaded: [...], skipped: [...], failed: [...] }`.
|
|
114
|
+
|
|
115
|
+
```ruby
|
|
116
|
+
downloader.download(
|
|
117
|
+
"phase-1", "dsm",
|
|
118
|
+
destination: "/tmp/lidar/phase-1/dsm",
|
|
119
|
+
grid_square: "NS", # optional. nil downloads everything
|
|
120
|
+
skip_existing: true, # skip files already on disk at the correct size
|
|
121
|
+
dry_run: false # set true to preview without downloading
|
|
122
|
+
)
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
| Option | Default | Description |
|
|
126
|
+
|---|---|---|
|
|
127
|
+
| `destination:` | required | Local directory to save files into |
|
|
128
|
+
| `grid_square:` | `nil` | OS grid square filter, e.g. `"NS"`, `"NT"` |
|
|
129
|
+
| `resolution:` | `nil` | Outer Hebrides only, e.g. `"25cm"`, `"50cm"`, `"4ppm"`, `"16ppm"` |
|
|
130
|
+
| `skip_existing:` | `true` | Skip files that already exist locally at the correct size |
|
|
131
|
+
| `dry_run:` | `false` | Print what would be downloaded without downloading |
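
The `skip_existing:` rule compares the local file size with the size reported in the listing. A minimal sketch of that check, with `already_downloaded?` as a hypothetical helper name and temporary files standing in for tiles:

```ruby
require "tmpdir"

# Hypothetical helper mirroring the documented rule: a file is skipped
# only when it exists locally and its size matches the listed size.
def already_downloaded?(local_path, expected_size)
  File.exist?(local_path) && File.size(local_path) == expected_size
end

dir      = Dir.mktmpdir
complete = File.join(dir, "complete.tif")
partial  = File.join(dir, "partial.tif")
missing  = File.join(dir, "missing.tif")

File.binwrite(complete, "a" * 100)
File.binwrite(partial, "a" * 40)   # simulates an interrupted download

puts already_downloaded?(complete, 100)
puts already_downloaded?(partial, 100)
puts already_downloaded?(missing, 100)
```

A size comparison is cheap and catches truncated files, but it does not verify content; a file of the right length with corrupted bytes would still be skipped.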
|
|
132
|
+
|
|
133
|
+
#### `downloader.download_file(phase, type, filename, destination:, resolution: nil)`
|
|
134
|
+
|
|
135
|
+
Downloads a single tile by exact filename.
|
|
136
|
+
|
|
137
|
+
```ruby
|
|
138
|
+
downloader.download_file(
|
|
139
|
+
"phase-1", "dsm", "NS56_1M_DSM_PHASE1.tif",
|
|
140
|
+
destination: "/tmp/lidar"
|
|
141
|
+
)
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
---
|
|
145
|
+
|
|
146
|
+
### `SmoScottishLidar.prefix_for(phase, type, resolution: nil)`
|
|
147
|
+
|
|
148
|
+
Returns the S3 prefix string for a given phase and type. Useful if you need to build custom queries.
|
|
149
|
+
|
|
150
|
+
```ruby
|
|
151
|
+
SmoScottishLidar.prefix_for("phase-1", "dsm")
|
|
152
|
+
# => "lidar/phase-1/dsm/27700/gridded/"
|
|
153
|
+
|
|
154
|
+
SmoScottishLidar.prefix_for("outer-hebrides", "dtm", resolution: "50cm")
|
|
155
|
+
# => "lidar/outer-hebrides/2019/dtm/50cm/27700/gridded/"
|
|
156
|
+
```
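
For custom queries, a prefix can be combined with the bucket's base URL. The sketch below assumes the base URL defined in the gem's constants and uses stdlib `URI.encode_www_form` to build the listing query; this is an illustration, not the gem's own encoding routine.

```ruby
require "uri"

# Base URL as defined in the gem's constants (bucket srsp-open-data, eu-west-2).
base_url = "https://srsp-open-data.s3.eu-west-2.amazonaws.com"
prefix   = "lidar/phase-1/dsm/27700/gridded/"
filename = "NS56_1M_DSM_PHASE1.tif"

# Direct URL of one object: base URL + prefix + filename.
object_url = "#{base_url}/#{prefix}#{filename}"

# Listing URL for everything under the prefix (S3 ListObjectsV2 parameters).
query    = URI.encode_www_form("list-type" => "2", "prefix" => prefix, "max-keys" => "1000")
list_url = "#{base_url}/?#{query}"

puts object_url
puts list_url
```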
|
|
157
|
+
|
|
158
|
+
---
|
|
159
|
+
|
|
160
|
+
## Valid phases and types
|
|
161
|
+
|
|
162
|
+
```ruby
|
|
163
|
+
SmoScottishLidar::PHASES
|
|
164
|
+
# => ["phase-1", "phase-2", "phase-3", "phase-4", "phase-5", "outer-hebrides"]
|
|
165
|
+
|
|
166
|
+
SmoScottishLidar::DATASET_TYPES
|
|
167
|
+
# => ["dsm", "dtm", "laz"]
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
### Outer Hebrides resolutions
|
|
171
|
+
|
|
172
|
+
| Type | Available resolutions |
|
|
173
|
+
|---|---|
|
|
174
|
+
| dsm | `"25cm"` (default), `"50cm"` |
|
|
175
|
+
| dtm | `"25cm"` (default), `"50cm"` |
|
|
176
|
+
| laz | `"4ppm"` (default), `"16ppm"` |
|
|
177
|
+
|
|
178
|
+
---
|
|
179
|
+
|
|
180
|
+
## Examples
|
|
181
|
+
|
|
182
|
+
The `examples/` directory contains ready-to-run scripts:
|
|
183
|
+
|
|
184
|
+
| Script | Description |
|
|
185
|
+
|---|---|
|
|
186
|
+
| `demo.rb` | Full walkthrough of all features |
|
|
187
|
+
| `phase_1_dsm.rb` | List Phase 1 DSM tiles |
|
|
188
|
+
| `phase_1_dtm.rb` | List Phase 1 DTM tiles |
|
|
189
|
+
| `phase_1_laz.rb` | List Phase 1 LAZ tiles |
|
|
190
|
+
| `phase_2_dsm.rb` ... | One script per phase and type |
|
|
191
|
+
| `outer_hebrides_dsm.rb` | Outer Hebrides DSM |
|
|
192
|
+
| `outer_hebrides_dtm.rb` | Outer Hebrides DTM |
|
|
193
|
+
| `outer_hebrides_laz.rb` | Outer Hebrides LAZ |
|
|
194
|
+
| `download_individual_tile.rb` | Download a single named tile |
|
|
195
|
+
| `download_batch_tiles.rb` | Batch download with grid square filter |
|
|
196
|
+
|
|
197
|
+
Run any script with:
|
|
198
|
+
|
|
199
|
+
```sh
|
|
200
|
+
ruby examples/phase_1_dsm.rb
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
---
|
|
204
|
+
|
|
205
|
+
## Typical workflow for hydraulic modelling
|
|
206
|
+
|
|
207
|
+
```ruby
|
|
208
|
+
require "smo_scottish_lidar"
|
|
209
|
+
|
|
210
|
+
downloader = SmoScottishLidar::Downloader.new(verbose: true)
|
|
211
|
+
|
|
212
|
+
# 1. Check what is available for your catchment (e.g. the NS grid square)
|
|
213
|
+
lister = SmoScottishLidar::Lister.new
|
|
214
|
+
lister.summary("phase-1", "dtm", grid_square: "NS")
|
|
215
|
+
|
|
216
|
+
# 2. Dry run first to confirm file sizes and count
|
|
217
|
+
downloader.download("phase-1", "dtm",
|
|
218
|
+
destination: "/projects/my_catchment/lidar/dtm",
|
|
219
|
+
grid_square: "NS",
|
|
220
|
+
dry_run: true
|
|
221
|
+
)
|
|
222
|
+
|
|
223
|
+
# 3. Download for real
|
|
224
|
+
downloader.download("phase-1", "dtm",
|
|
225
|
+
destination: "/projects/my_catchment/lidar/dtm",
|
|
226
|
+
grid_square: "NS",
|
|
227
|
+
dry_run: false
|
|
228
|
+
)
|
|
229
|
+
|
|
230
|
+
# 4. If the download is interrupted, re-run the same command.
|
|
231
|
+
# Files already on disk at the correct size are skipped automatically.
|
|
232
|
+
```
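
After a batch run, the returned hash can be inspected to decide whether anything needs retrying. A short sketch using a sample result; the filenames and error message are made up, and the `{ file:, error: }` shape of each failed entry follows the downloader source:

```ruby
# Sample result shaped like the return value of `downloader.download`.
# Filenames and the error text are invented for this example.
results = {
  downloaded: ["NS56_1M_DTM_PHASE1.tif"],
  skipped:    ["NS57_1M_DTM_PHASE1.tif"],
  failed:     [{ file: "NS58_1M_DTM_PHASE1.tif", error: "HTTP 503" }]
}

retry_list = results[:failed].map { |failure| failure[:file] }
complete   = results[:failed].empty?

if complete
  puts "All tiles present."
else
  puts "Retry needed for: #{retry_list.join(', ')}"
end
```

Each name in `retry_list` could then be passed to `download_file`, or the whole batch re-run so that completed tiles are skipped.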
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
## Support
|
|
237
|
+
|
|
238
|
+
If this gem saves you time on a project, you can buy me a coffee.
|
|
239
|
+
|
|
240
|
+
[Buy me a coffee](https://buymeacoffee.com/smadrid)
|
|
241
|
+
|
|
242
|
+
---
|
|
243
|
+
|
|
244
|
+
## License
|
|
245
|
+
|
|
246
|
+
MIT. Copyright (c) 2024 Sebastian Madrid Ontiveros.
|
|
data/lib/smo_scottish_lidar/client.rb
ADDED
|
@@ -0,0 +1,148 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "net/http"
|
|
4
|
+
require "uri"
|
|
5
|
+
|
|
6
|
+
module SmoScottishLidar
|
|
7
|
+
# Low-level S3 client. Uses only Ruby stdlib net/http.
|
|
8
|
+
# Accesses the public (unsigned) srsp-open-data bucket directly over HTTPS.
|
|
9
|
+
class Client
|
|
10
|
+
MAX_KEYS = 1000
|
|
11
|
+
|
|
12
|
+
def initialize(verbose: false)
|
|
13
|
+
@verbose = verbose
|
|
14
|
+
end
|
|
15
|
+
|
|
16
|
+
# Lists all object keys under a given S3 prefix.
|
|
17
|
+
# Handles S3 pagination (continuation tokens) automatically.
|
|
18
|
+
# Returns an Array of Hashes: [{ key:, size:, last_modified: }, ...]
|
|
19
|
+
def list_objects(prefix)
|
|
20
|
+
objects = []
|
|
21
|
+
continuation_token = nil
|
|
22
|
+
|
|
23
|
+
loop do
|
|
24
|
+
xml = fetch_list_page(prefix, continuation_token)
|
|
25
|
+
page_objects, next_token, truncated = parse_list_response(xml)
|
|
26
|
+
objects.concat(page_objects)
|
|
27
|
+
|
|
28
|
+
break unless truncated
|
|
29
|
+
|
|
30
|
+
continuation_token = next_token
|
|
31
|
+
end
|
|
32
|
+
|
|
33
|
+
objects
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
# Downloads a single S3 object key to a local file path.
|
|
37
|
+
# Follows redirects, streams in chunks to avoid loading into memory.
|
|
38
|
+
# Returns true on success, raises on error.
|
|
39
|
+
def download_object(key, local_path, &progress_block)
|
|
40
|
+
url = "#{BASE_URL}/#{key}"
|
|
41
|
+
log "Downloading: #{url} -> #{local_path}"
|
|
42
|
+
|
|
43
|
+
uri = URI.parse(url)
|
|
44
|
+
fetch_with_redirect(uri, local_path, &progress_block)
|
|
45
|
+
true
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
private
|
|
49
|
+
|
|
50
|
+
def fetch_list_page(prefix, continuation_token)
|
|
51
|
+
params = {
|
|
52
|
+
"list-type" => "2",
|
|
53
|
+
"prefix" => prefix,
|
|
54
|
+
"max-keys" => MAX_KEYS.to_s
|
|
55
|
+
}
|
|
56
|
+
params["continuation-token"] = continuation_token if continuation_token
|
|
57
|
+
|
|
58
|
+
query = params.map { |k, v| "#{uri_encode(k)}=#{uri_encode(v)}" }.join("&")
|
|
59
|
+
uri = URI.parse("#{BASE_URL}/?#{query}")
|
|
60
|
+
|
|
61
|
+
log "Listing: #{uri}"
|
|
62
|
+
response = get_response(uri)
|
|
63
|
+
response.body
|
|
64
|
+
end
|
|
65
|
+
|
|
66
|
+
def parse_list_response(xml)
|
|
67
|
+
objects = []
|
|
68
|
+
|
|
69
|
+
xml.scan(%r{<Contents>(.*?)</Contents>}m).each do |match|
|
|
70
|
+
block = match[0]
|
|
71
|
+
key = extract_tag(block, "Key")
|
|
72
|
+
size = extract_tag(block, "Size").to_i
|
|
73
|
+
mtime = extract_tag(block, "LastModified")
|
|
74
|
+
objects << { key: key, size: size, last_modified: mtime }
|
|
75
|
+
end
|
|
76
|
+
|
|
77
|
+
truncated = extract_tag(xml, "IsTruncated") == "true"
|
|
78
|
+
next_token = extract_tag(xml, "NextContinuationToken")
|
|
79
|
+
|
|
80
|
+
[objects, next_token, truncated]
|
|
81
|
+
end
|
|
82
|
+
|
|
83
|
+
def fetch_with_redirect(uri, local_path, redirects_remaining: 5, &progress_block)
|
|
84
|
+
raise "Too many redirects" if redirects_remaining.zero?
|
|
85
|
+
|
|
86
|
+
Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
|
|
87
|
+
request = Net::HTTP::Get.new(uri.request_uri)
|
|
88
|
+
http.request(request) do |response|
|
|
89
|
+
case response.code.to_i
|
|
90
|
+
when 200
|
|
91
|
+
total = response["content-length"]&.to_i
|
|
92
|
+
received = 0
|
|
93
|
+
File.open(local_path, "wb") do |f|
|
|
94
|
+
response.read_body do |chunk|
|
|
95
|
+
f.write(chunk)
|
|
96
|
+
received += chunk.bytesize
|
|
97
|
+
progress_block&.call(received, total)
|
|
98
|
+
end
|
|
99
|
+
end
|
|
100
|
+
when 301, 302, 307, 308
|
|
101
|
+
location = response["location"]
|
|
102
|
+
raise "Redirect with no Location header" unless location
|
|
103
|
+
|
|
104
|
+
log "Redirect -> #{location}"
|
|
105
|
+
fetch_with_redirect(URI.parse(location), local_path,
|
|
106
|
+
redirects_remaining: redirects_remaining - 1,
|
|
107
|
+
&progress_block)
|
|
108
|
+
else
|
|
109
|
+
raise "HTTP #{response.code} for #{uri}"
|
|
110
|
+
end
|
|
111
|
+
end
|
|
112
|
+
end
|
|
113
|
+
end
|
|
114
|
+
|
|
115
|
+
def get_response(uri, redirects_remaining: 5)
|
|
116
|
+
raise "Too many redirects listing #{uri}" if redirects_remaining.zero?
|
|
117
|
+
|
|
118
|
+
Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
|
|
119
|
+
response = http.get(uri.request_uri)
|
|
120
|
+
case response.code.to_i
|
|
121
|
+
when 200
|
|
122
|
+
response
|
|
123
|
+
when 301, 302, 307, 308
|
|
124
|
+
location = response["location"]
|
|
125
|
+
raise "Redirect with no Location header" unless location
|
|
126
|
+
|
|
127
|
+
log "Redirect -> #{location}"
|
|
128
|
+
get_response(URI.parse(location), redirects_remaining: redirects_remaining - 1)
|
|
129
|
+
else
|
|
130
|
+
raise "HTTP #{response.code} listing #{uri}"
|
|
131
|
+
end
|
|
132
|
+
end
|
|
133
|
+
end
|
|
134
|
+
|
|
135
|
+
def extract_tag(xml, tag)
|
|
136
|
+
match = xml.match(%r{<#{tag}>(.*?)</#{tag}>}m)
|
|
137
|
+
match ? match[1].strip : ""
|
|
138
|
+
end
|
|
139
|
+
|
|
140
|
+
def uri_encode(str)
|
|
141
|
+
URI.encode_www_form_component(str.to_s).gsub("+", "%20")
|
|
142
|
+
end
|
|
143
|
+
|
|
144
|
+
def log(msg)
|
|
145
|
+
warn "[smo_scottish_lidar] #{msg}" if @verbose
|
|
146
|
+
end
|
|
147
|
+
end
|
|
148
|
+
end
|
|
data/lib/smo_scottish_lidar/constants.rb
ADDED
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module SmoScottishLidar
|
|
4
|
+
BUCKET = "srsp-open-data"
|
|
5
|
+
BASE_PREFIX = "lidar"
|
|
6
|
+
REGION = "eu-west-2"
|
|
7
|
+
BASE_URL = "https://#{BUCKET}.s3.#{REGION}.amazonaws.com"
|
|
8
|
+
|
|
9
|
+
# Valid phases and dataset types derived from the Scottish Government
|
|
10
|
+
# Registry of Open Data on AWS documentation.
|
|
11
|
+
PHASES = %w[phase-1 phase-2 phase-3 phase-4 phase-5 outer-hebrides].freeze
|
|
12
|
+
DATASET_TYPES = %w[dsm dtm laz].freeze
|
|
13
|
+
|
|
14
|
+
# Outer Hebrides has resolution sub-folders; all other phases do not.
|
|
15
|
+
OUTER_HEBRIDES_RESOLUTIONS = {
|
|
16
|
+
"dsm" => %w[25cm 50cm],
|
|
17
|
+
"dtm" => %w[25cm 50cm],
|
|
18
|
+
"laz" => %w[4ppm 16ppm]
|
|
19
|
+
}.freeze
|
|
20
|
+
|
|
21
|
+
# Canonical S3 prefix builder. Returns the prefix string (no bucket).
|
|
22
|
+
# Examples:
|
|
23
|
+
# prefix_for("phase-1", "dsm") => "lidar/phase-1/dsm/27700/gridded/"
|
|
24
|
+
# prefix_for("outer-hebrides", "dtm", resolution: "50cm") => "lidar/outer-hebrides/2019/dtm/50cm/27700/gridded/"
|
|
25
|
+
def self.prefix_for(phase, type, resolution: nil)
|
|
26
|
+
validate_phase!(phase)
|
|
27
|
+
validate_type!(type)
|
|
28
|
+
|
|
29
|
+
if phase == "outer-hebrides"
|
|
30
|
+
res = resolve_outer_hebrides_resolution(type, resolution)
|
|
31
|
+
"#{BASE_PREFIX}/outer-hebrides/2019/#{type}/#{res}/27700/gridded/"
|
|
32
|
+
else
|
|
33
|
+
"#{BASE_PREFIX}/#{phase}/#{type}/27700/gridded/"
|
|
34
|
+
end
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
def self.validate_phase!(phase)
|
|
38
|
+
return if PHASES.include?(phase)
|
|
39
|
+
|
|
40
|
+
raise ArgumentError, "Unknown phase '#{phase}'. Valid: #{PHASES.join(', ')}"
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
def self.validate_type!(type)
|
|
44
|
+
return if DATASET_TYPES.include?(type)
|
|
45
|
+
|
|
46
|
+
raise ArgumentError, "Unknown dataset type '#{type}'. Valid: #{DATASET_TYPES.join(', ')}"
|
|
47
|
+
end
|
|
48
|
+
|
|
49
|
+
def self.resolve_outer_hebrides_resolution(type, resolution)
|
|
50
|
+
available = OUTER_HEBRIDES_RESOLUTIONS[type]
|
|
51
|
+
if resolution.nil?
|
|
52
|
+
available.first
|
|
53
|
+
elsif available.include?(resolution)
|
|
54
|
+
resolution
|
|
55
|
+
else
|
|
56
|
+
raise ArgumentError,
|
|
57
|
+
"Resolution '#{resolution}' not available for outer-hebrides #{type}. " \
|
|
58
|
+
"Available: #{available.join(', ')}"
|
|
59
|
+
end
|
|
60
|
+
end
|
|
61
|
+
end
|
|
data/lib/smo_scottish_lidar/downloader.rb
ADDED
|
@@ -0,0 +1,134 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "fileutils"
|
|
4
|
+
|
|
5
|
+
module SmoScottishLidar
|
|
6
|
+
# Downloads LiDAR files from the Scottish Government S3 bucket.
|
|
7
|
+
# No external dependencies. Uses only Ruby stdlib.
|
|
8
|
+
class Downloader
|
|
9
|
+
attr_reader :client, :lister
|
|
10
|
+
|
|
11
|
+
def initialize(verbose: false)
|
|
12
|
+
@client = Client.new(verbose: verbose)
|
|
13
|
+
@lister = Lister.new(verbose: verbose)
|
|
14
|
+
@verbose = verbose
|
|
15
|
+
end
|
|
16
|
+
|
|
17
|
+
# Download all files for a given phase/type, with optional filtering.
|
|
18
|
+
#
|
|
19
|
+
# @param phase [String] e.g. "phase-1", "outer-hebrides"
|
|
20
|
+
# @param type [String] "dsm", "dtm", or "laz"
|
|
21
|
+
# @param destination [String] Local directory to save files into
|
|
22
|
+
# @param grid_square [String, nil] OS grid square filter e.g. "NS"
|
|
23
|
+
# @param resolution [String, nil] Outer Hebrides resolution e.g. "50cm"
|
|
24
|
+
# @param skip_existing [Boolean] Skip files that already exist locally (default: true)
|
|
25
|
+
# @param dry_run [Boolean] List what would be downloaded without downloading
|
|
26
|
+
# @return [Hash] { downloaded: [...], skipped: [...], failed: [...] }
|
|
27
|
+
def download(phase, type,
|
|
28
|
+
destination:,
|
|
29
|
+
grid_square: nil,
|
|
30
|
+
resolution: nil,
|
|
31
|
+
skip_existing: true,
|
|
32
|
+
dry_run: false)
|
|
33
|
+
|
|
34
|
+
FileUtils.mkdir_p(destination) unless dry_run
|
|
35
|
+
|
|
36
|
+
objects = lister.list(phase, type, grid_square: grid_square, resolution: resolution)
|
|
37
|
+
|
|
38
|
+
if objects.empty?
|
|
39
|
+
puts "No files matched your criteria."
|
|
40
|
+
return { downloaded: [], skipped: [], failed: [] }
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
total_bytes = objects.sum { |o| o[:size] }
|
|
44
|
+
puts "Found #{objects.size} file(s) (#{format_bytes(total_bytes)} total)"
|
|
45
|
+
puts "Destination: #{destination}"
|
|
46
|
+
puts "(Dry run - no files will be downloaded)" if dry_run
|
|
47
|
+
puts
|
|
48
|
+
|
|
49
|
+
results = { downloaded: [], skipped: [], failed: [] }
|
|
50
|
+
|
|
51
|
+
objects.each_with_index do |obj, idx|
|
|
52
|
+
local_path = File.join(destination, obj[:filename])
|
|
53
|
+
label = "[#{idx + 1}/#{objects.size}] #{obj[:filename]} (#{format_bytes(obj[:size])})"
|
|
54
|
+
|
|
55
|
+
if skip_existing && File.exist?(local_path) && File.size(local_path) == obj[:size]
|
|
56
|
+
puts "SKIP #{label}"
|
|
57
|
+
results[:skipped] << obj[:filename]
|
|
58
|
+
next
|
|
59
|
+
end
|
|
60
|
+
|
|
61
|
+
if dry_run
|
|
62
|
+
puts "WOULD #{label}"
|
|
63
|
+
results[:downloaded] << obj[:filename]
|
|
64
|
+
next
|
|
65
|
+
end
|
|
66
|
+
|
|
67
|
+
print "GET #{label} ... "
|
|
68
|
+
$stdout.flush
|
|
69
|
+
|
|
70
|
+
begin
|
|
71
|
+
client.download_object(obj[:key], local_path) do |received, total|
|
|
72
|
+
next unless @verbose && total
|
|
73
|
+
|
|
74
|
+
pct = (received.to_f / total * 100).round(1)
|
|
75
|
+
print "\rGET #{label} ... #{pct}%"
|
|
76
|
+
$stdout.flush
|
|
77
|
+
end
|
|
78
|
+
puts "OK"
|
|
79
|
+
results[:downloaded] << obj[:filename]
|
|
80
|
+
rescue StandardError => e
|
|
81
|
+
puts "FAILED (#{e.message})"
|
|
82
|
+
results[:failed] << { file: obj[:filename], error: e.message }
|
|
83
|
+
end
|
|
84
|
+
end
|
|
85
|
+
|
|
86
|
+
print_summary(results)
|
|
87
|
+
results
|
|
88
|
+
end
|
|
89
|
+
|
|
90
|
+
# Convenience: download a single file by its exact filename.
|
|
91
|
+
#
|
|
92
|
+
# @param phase [String]
|
|
93
|
+
# @param type [String]
|
|
94
|
+
# @param filename [String] Exact filename to download
|
|
95
|
+
# @param destination [String] Local directory
|
|
96
|
+
# @param resolution [String, nil]
|
|
97
|
+
def download_file(phase, type, filename, destination:, resolution: nil)
|
|
98
|
+
prefix = SmoScottishLidar.prefix_for(phase, type, resolution: resolution)
|
|
99
|
+
key = "#{prefix}#{filename}"
|
|
100
|
+
local = File.join(destination, filename)
|
|
101
|
+
|
|
102
|
+
FileUtils.mkdir_p(destination)
|
|
103
|
+
puts "Downloading #{filename} ..."
|
|
104
|
+
client.download_object(key, local)
|
|
105
|
+
puts "Saved to #{local}"
|
|
106
|
+
local
|
|
107
|
+
end
|
|
108
|
+
|
|
109
|
+
private
|
|
110
|
+
|
|
111
|
+
def print_summary(results)
|
|
112
|
+
puts
|
|
113
|
+
puts "Done. Downloaded: #{results[:downloaded].size}, " \
|
|
114
|
+
"Skipped: #{results[:skipped].size}, " \
|
|
115
|
+
"Failed: #{results[:failed].size}"
|
|
116
|
+
|
|
117
|
+
return if results[:failed].empty?
|
|
118
|
+
|
|
119
|
+
puts "Failed files:"
|
|
120
|
+
results[:failed].each { |f| puts " #{f[:file]}: #{f[:error]}" }
|
|
121
|
+
end
|
|
122
|
+
|
|
123
|
+
def format_bytes(bytes)
|
|
124
|
+
units = %w[B KB MB GB TB]
|
|
125
|
+
idx = 0
|
|
126
|
+
size = bytes.to_f
|
|
127
|
+
while size >= 1024 && idx < units.size - 1
|
|
128
|
+
size /= 1024.0
|
|
129
|
+
idx += 1
|
|
130
|
+
end
|
|
131
|
+
format("%.1f %s", size, units[idx])
|
|
132
|
+
end
|
|
133
|
+
end
|
|
134
|
+
end
|
|
data/lib/smo_scottish_lidar/lister.rb
ADDED
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module SmoScottishLidar
|
|
4
|
+
# Lists available LiDAR files from the Scottish Government S3 bucket.
|
|
5
|
+
# All filtering is done client-side after fetching the S3 listing.
|
|
6
|
+
class Lister
|
|
7
|
+
attr_reader :client
|
|
8
|
+
|
|
9
|
+
def initialize(verbose: false)
|
|
10
|
+
@client = Client.new(verbose: verbose)
|
|
11
|
+
end
|
|
12
|
+
|
|
13
|
+
# List files for a given phase and dataset type.
|
|
14
|
+
#
|
|
15
|
+
# @param phase [String] e.g. "phase-1", "outer-hebrides"
|
|
16
|
+
# @param type [String] "dsm", "dtm", or "laz"
|
|
17
|
+
# @param grid_square [String, nil] Optional OS grid square filter, e.g. "NS", "NT"
|
|
18
|
+
# @param resolution [String, nil] For outer-hebrides only, e.g. "50cm", "4ppm"
|
|
19
|
+
# @return [Array<Hash>] Array of { key:, size:, last_modified:, filename: }
|
|
20
|
+
def list(phase, type, grid_square: nil, resolution: nil)
|
|
21
|
+
prefix = SmoScottishLidar.prefix_for(phase, type, resolution: resolution)
|
|
22
|
+
objects = client.list_objects(prefix)
|
|
23
|
+
|
|
24
|
+
objects.map! do |obj|
|
|
25
|
+
obj.merge(filename: File.basename(obj[:key]))
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
if grid_square
|
|
29
|
+
pattern = grid_square.upcase
|
|
30
|
+
objects.select! { |obj| obj[:filename].upcase.start_with?(pattern) }
|
|
31
|
+
end
|
|
32
|
+
|
|
33
|
+
objects
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
# Print a human-readable summary of available files.
|
|
37
|
+
def summary(phase, type, grid_square: nil, resolution: nil)
|
|
38
|
+
objects = list(phase, type, grid_square: grid_square, resolution: resolution)
|
|
39
|
+
|
|
40
|
+
if objects.empty?
|
|
41
|
+
puts "No files found."
|
|
42
|
+
return objects
|
|
43
|
+
end
|
|
44
|
+
|
|
45
|
+
total_bytes = objects.sum { |o| o[:size] }
|
|
46
|
+
|
|
47
|
+
puts "Phase : #{phase}"
|
|
48
|
+
puts "Type : #{type}"
|
|
49
|
+
puts "Grid sq. : #{grid_square || '(all)'}"
|
|
50
|
+
puts "Files : #{objects.size}"
|
|
51
|
+
puts "Total size: #{format_bytes(total_bytes)}"
|
|
52
|
+
puts
|
|
53
|
+
puts format("%-50s %10s %s", "Filename", "Size", "Last Modified")
|
|
54
|
+
puts "-" * 80
|
|
55
|
+
objects.each do |obj|
|
|
56
|
+
puts format("%-50s %10s %s", obj[:filename], format_bytes(obj[:size]), obj[:last_modified])
|
|
57
|
+
end
|
|
58
|
+
|
|
59
|
+
objects
|
|
60
|
+
end
|
|
61
|
+
|
|
62
|
+
private
|
|
63
|
+
|
|
64
|
+
def format_bytes(bytes)
|
|
65
|
+
units = %w[B KB MB GB TB]
|
|
66
|
+
idx = 0
|
|
67
|
+
size = bytes.to_f
|
|
68
|
+
while size >= 1024 && idx < units.size - 1
|
|
69
|
+
size /= 1024.0
|
|
70
|
+
idx += 1
|
|
71
|
+
end
|
|
72
|
+
format("%.1f %s", size, units[idx])
|
|
73
|
+
end
|
|
74
|
+
end
|
|
75
|
+
end
|
|
data/lib/smo_scottish_lidar.rb
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require_relative "smo_scottish_lidar/version"
|
|
4
|
+
require_relative "smo_scottish_lidar/constants"
|
|
5
|
+
require_relative "smo_scottish_lidar/client"
|
|
6
|
+
require_relative "smo_scottish_lidar/downloader"
|
|
7
|
+
require_relative "smo_scottish_lidar/lister"
|
metadata
ADDED
|
@@ -0,0 +1,55 @@
|
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
|
2
|
+
name: smo_scottish_lidar
|
|
3
|
+
version: !ruby/object:Gem::Version
|
|
4
|
+
version: 0.1.0
|
|
5
|
+
platform: ruby
|
|
6
|
+
authors:
|
|
7
|
+
- Sebastian Madrid Ontiveros
|
|
8
|
+
bindir: bin
|
|
9
|
+
cert_chain: []
|
|
10
|
+
date: 2026-05-04 00:00:00.000000000 Z
|
|
11
|
+
dependencies: []
|
|
12
|
+
description: |
|
|
13
|
+
Developed by Sebastian Madrid Ontiveros to support hydraulic modellers in Scotland
|
|
14
|
+
building 1D-2D hydraulic models and flood risk assessments. Provides a pure Ruby
|
|
15
|
+
interface for listing and downloading Scottish Public Sector LiDAR datasets (DSM,
|
|
16
|
+
DTM, LAZ) from the Registry of Open Data on AWS. Supports all survey phases (1-5)
|
|
17
|
+
and Outer Hebrides, OS National Grid square filtering, paginated S3 listing, streamed
|
|
18
|
+
downloads with resume support, and dry-run mode. No external dependencies. Uses only
|
|
19
|
+
Ruby stdlib (net/http, uri, fileutils). If this gem saves you time, consider buying
|
|
20
|
+
Sebastian a coffee at https://buymeacoffee.com/smadrid
|
|
21
|
+
email: []
|
|
22
|
+
executables: []
|
|
23
|
+
extensions: []
|
|
24
|
+
extra_rdoc_files: []
|
|
25
|
+
files:
|
|
26
|
+
- README.md
|
|
27
|
+
- lib/smo_scottish_lidar.rb
|
|
28
|
+
- lib/smo_scottish_lidar/client.rb
|
|
29
|
+
- lib/smo_scottish_lidar/constants.rb
|
|
30
|
+
- lib/smo_scottish_lidar/downloader.rb
|
|
31
|
+
- lib/smo_scottish_lidar/lister.rb
|
|
32
|
+
- lib/smo_scottish_lidar/version.rb
|
|
33
|
+
homepage: https://github.com/Sebasmadridmx/smo_scottish_lidar
|
|
34
|
+
licenses:
|
|
35
|
+
- MIT
|
|
36
|
+
metadata: {}
|
|
37
|
+
rdoc_options: []
|
|
38
|
+
require_paths:
|
|
39
|
+
- lib
|
|
40
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
|
41
|
+
requirements:
|
|
42
|
+
- - ">="
|
|
43
|
+
- !ruby/object:Gem::Version
|
|
44
|
+
version: 2.7.0
|
|
45
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
|
46
|
+
requirements:
|
|
47
|
+
- - ">="
|
|
48
|
+
- !ruby/object:Gem::Version
|
|
49
|
+
version: '0'
|
|
50
|
+
requirements: []
|
|
51
|
+
rubygems_version: 3.6.2
|
|
52
|
+
specification_version: 4
|
|
53
|
+
summary: Download Scottish Public Sector LiDAR data from the Registry of Open Data
|
|
54
|
+
on AWS.
|
|
55
|
+
test_files: []
|