middleman-s3_sync 4.6.4 → 4.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: e6b576cfe91e75975edadd71f0cfd319e84772edecd882aabe2e3c8c06c90082
-  data.tar.gz: 86bc102e074f9b47980e4cbf93060babd7f7b8a5117cbf35b60ea78ba7476ff0
+  metadata.gz: 6b94ebf16b78ae47530a08332b802f50aa9c4510a47b07b60cab9a8b004576e0
+  data.tar.gz: 6d097be5659e364f9c789164da5838bed6781fbb0db3a4724433d2b78d43cbad
 SHA512:
-  metadata.gz: 20b465799e297cff4e8085ba9c9e25637a25a3a77242d862f1d2b93291dd6112c32835fc91eb8fd92d0e4cd11df6676c034330d804e4556b5fc0ba4c2312215e
-  data.tar.gz: 36a4b96fa0c6831047a90ef4140c598bfab3eb5279fffa09adeee5e23673cb30a811806019ea63de16d1c52383b9a644a2ed6cf0bf400ccd1e9eaade616ae00a
+  metadata.gz: 0f29594758ad18658482ee676c9fa2d505dbe133ccd164e9a76816a7803982736f644f62193b61a45aa5bc4204439dad2c0911852dad6b596897caed93fbfd4a
+  data.tar.gz: 5f309d8e5aa6b43e0b6cb76cf7be241b4d198cff4b87d561f33c140c2886cd9337be6f7187b70f38235bed8e469deeb9ce02fb7433c82c8ca08f208cb9a28a02
@@ -0,0 +1,29 @@
+name: CI
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        ruby-version: ['3.1', '3.2', '3.3', '3.4']
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Ruby ${{ matrix.ruby-version }}
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: ${{ matrix.ruby-version }}
+          bundler-cache: true
+
+      - name: Run tests
+        run: bundle exec rspec
+
+      - name: Build gem
+        run: gem build middleman-s3_sync.gemspec
@@ -0,0 +1,53 @@
+name: Release
+
+on:
+  push:
+    tags:
+      - 'v*'
+
+jobs:
+  release:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write # For creating GitHub releases
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '3.4'
+          bundler-cache: true
+
+      - name: Run tests
+        run: bundle exec rspec
+
+      - name: Build gem
+        run: gem build middleman-s3_sync.gemspec
+
+      - name: Get version
+        id: version
+        run: echo "VERSION=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT
+
+      - name: Generate checksums
+        run: |
+          sha256sum middleman-s3_sync-*.gem > checksums.txt
+          cat checksums.txt
+
+      - name: Publish to RubyGems
+        run: |
+          mkdir -p ~/.gem
+          echo -e "---\n:rubygems_api_key: ${RUBYGEMS_API_KEY}" > ~/.gem/credentials
+          chmod 0600 ~/.gem/credentials
+          gem push middleman-s3_sync-*.gem
+        env:
+          RUBYGEMS_API_KEY: ${{ secrets.RUBYGEMS_API_KEY }}
+
+      - name: Create GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          files: |
+            middleman-s3_sync-*.gem
+            checksums.txt
+          generate_release_notes: true
data/Changelog.md CHANGED
@@ -2,6 +2,18 @@
 
 The gem that tries really hard not to push files to S3.
 
+## v4.6.5
+- Performance and stability improvements
+- Thread-safe invalidation path tracking (use Set + mutex) when running in parallel
+- Cache CloudFront client (with reset hook for tests)
+- Single-pass resource categorization (reduce multiple iterations over resources)
+- Batch S3 deletes via delete_objects (up to 1000 keys/request)
+- Stream file uploads to reduce memory; compute MD5s in a single read when possible
+- Optimize CloudFront path deduplication to O(n × path_depth)
+- CLI/extension: support option writers (e.g., verbose=, dry_run=) to fix NoMethodError
+- Tests: add coverage for CloudFront, batch delete, and streaming uploads
+- No breaking changes; default behavior preserved
+
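The batched-delete entry above can be pictured with a short sketch. It assumes a client object that responds to `delete_objects(bucket:, delete:)` in the shape of `Aws::S3::Client#delete_objects`; the helper name `delete_in_batches`, the constant, and the `quiet: true` choice are illustrative and not taken from the gem.

```ruby
# Hypothetical helper: submit keys in slices of at most 1,000,
# the per-request limit noted in the changelog entry.
MAX_KEYS_PER_DELETE = 1000

def delete_in_batches(client, bucket, keys, batch_size: MAX_KEYS_PER_DELETE)
  # Returns the number of keys submitted for deletion.
  keys.each_slice(batch_size).sum do |slice|
    client.delete_objects(
      bucket: bucket,
      delete: { objects: slice.map { |key| { key: key } }, quiet: true }
    )
    slice.size
  end
end
```

With 2,500 stale keys this would issue three requests (1,000, 1,000, and 500 keys) instead of 2,500 single-object deletes.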
 ## v4.6.4
 * Remove map gem dependency and replace with native Ruby implementation
 * Add IndifferentHash class to provide string/symbol indifferent access without external dependencies
data/README.md CHANGED
@@ -19,6 +19,33 @@ that are no longer needed.
 * Use middleman-s3_sync version 4.x for Middleman 4.x
 * Use middleman-s3_sync version 3.x for Middleman 3.x
 
+## What's New in 4.7.0
+
+**New Features**
+- `after_s3_sync` callback for post-sync hooks (notifications, custom actions)
+- `scan_build_dir` option to sync files outside the Middleman sitemap
+- `routing_rules` option for S3 website redirect configuration
+- Improved content type detection with mime-types gem fallback
+
+**Performance & Efficiency**
+- Batch deletes using S3 `delete_objects` (up to 1,000 keys per request)
+- Streaming uploads to reduce memory usage on large files
+- Single-pass MD5 computation avoids redundant file reads
+- Single-pass resource categorization (create/update/delete)
+- Faster redundant-path pruning for CloudFront invalidations
+
+**Reliability**
+- Thread-safe CloudFront invalidation path tracking (mutex-protected Set)
+- Cached CloudFront client to reduce re-instantiation overhead
+- Proper sitemap population before sync (`ensure_resource_list_updated!`)
+- Fixed redirect detection to return boolean values
+
+**Developer Experience**
+- Extension now properly delegates option writers (`verbose=`, `dry_run=`, etc.)
+- GitHub Actions CI and release workflows
+- Tightened gemspec with bounded dependency versions
+- Ruby >= 3.0 requirement
+
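The "mutex-protected Set" item under Reliability can be illustrated with a small sketch. The class name `InvalidationPathTracker` and its methods are hypothetical; the gem's own tracking code is not shown in this diff.

```ruby
require 'set'

# Hypothetical tracker: a Set guarded by a Mutex so parallel upload
# threads can record paths without corrupting shared state.
class InvalidationPathTracker
  def initialize
    @paths = Set.new
    @mutex = Mutex.new
  end

  def add(path)
    @mutex.synchronize { @paths.add(path) }
  end

  # Returns a sorted snapshot so callers never touch the live Set.
  def to_a
    @mutex.synchronize { @paths.to_a.sort }
  end
end

tracker = InvalidationPathTracker.new
threads = 4.times.map do
  Thread.new { 100.times { |i| tracker.add("/page-#{i % 10}.html") } }
end
threads.each(&:join)
puts tracker.to_a.size
```

The Set removes duplicates recorded by different threads, and the Mutex serializes writes so that concurrent `add` calls cannot interleave.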
 ## Installation
 
 Add this line to your application's Gemfile:
@@ -258,6 +285,57 @@ Your AWS credentials need CloudFront permissions in addition to S3:
 - Use `cloudfront_invalidate_all: true` for major updates to minimize costs (counts as 1 path)
 - Consider the trade-off between immediate cache invalidation and cost
 
+## Callbacks
+
+### after_s3_sync
+
+You can configure a callback that runs after the sync completes. This is useful for triggering notifications, updating external services, or running post-deployment tasks.
+
+```ruby
+activate :s3_sync do |s3_sync|
+  # ... other configuration ...
+
+  # Using a lambda/proc
+  s3_sync.after_s3_sync = ->(results) {
+    puts "Created: #{results[:created]} files"
+    puts "Updated: #{results[:updated]} files"
+    puts "Deleted: #{results[:deleted]} files"
+    puts "Invalidation paths: #{results[:invalidation_paths].join(', ')}"
+  }
+end
+```
+
+The callback receives a hash with sync results:
+
+| Key                   | Type    | Description |
+| --------------------- | ------- | ----------- |
+| `:created`            | Integer | Number of files created |
+| `:updated`            | Integer | Number of files updated |
+| `:deleted`            | Integer | Number of files deleted |
+| `:invalidation_paths` | Array   | CloudFront paths that were invalidated |
+
+You can also use a symbol to call a method on the Middleman app:
+
+```ruby
+# In config.rb
+def notify_slack(results)
+  # Send deployment notification to Slack
+end
+
+activate :s3_sync do |s3_sync|
+  # ... other configuration ...
+  s3_sync.after_s3_sync = :notify_slack
+end
+```
+
+Callbacks that take no arguments are also supported:
+
+```ruby
+activate :s3_sync do |s3_sync|
+  s3_sync.after_s3_sync = -> { puts "Sync complete!" }
+end
+```
+
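The three callback forms described above (callable with one argument, Symbol, zero-argument callable) suggest a dispatch on type and arity. The sketch below is one plausible way to do that, not the gem's implementation: `run_after_sync` and the `app` parameter are hypothetical names.

```ruby
# Hypothetical dispatcher; `app` stands in for the Middleman app that
# would own a method named by a Symbol callback.
def run_after_sync(callback, results, app = nil)
  return if callback.nil?

  callable = callback.is_a?(Symbol) ? app.method(callback) : callback
  # Zero-arity callables are invoked bare; everything else gets the hash.
  callable.arity.zero? ? callable.call : callable.call(results)
end

results = { created: 2, updated: 1, deleted: 0, invalidation_paths: ['/index.html'] }
run_after_sync(->(r) { puts "Created: #{r[:created]} files" }, results)
run_after_sync(-> { puts 'Sync complete!' }, results)
```

Checking `arity` matters for lambdas in particular, because calling a zero-argument lambda with the results hash would raise `ArgumentError`.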
 #### IAM Policy
 
 Here's a sample IAM policy with least-privilege permissions that will allow syncing to a bucket named "mysite.com":
@@ -514,6 +592,16 @@ The full values and their semantics are [documented on AWS's
 documentation
 site](http://docs.aws.amazon.com/AmazonS3/latest/dev/ACLOverview.html#CannedACL).
 
+##### Buckets with ACLs Disabled
+
+If your bucket uses "Object Ownership: Bucket owner enforced" (ACLs disabled), set:
+
+```ruby
+s3_sync.acl = '' # or: s3_sync.acl = nil
+```
+
+The gem will also auto-detect buckets that reject ACL headers and transparently retry uploads without the `:acl` parameter.
+
 #### Encryption
 
 You can ask Amazon to encrypt your files at rest by setting the
data/WARP.md CHANGED
@@ -100,6 +100,10 @@ The gem determines what to do with each file by comparing:
 - **`config.rb`**: Middleman configuration with `activate :s3_sync` block
 - **Environment Variables**: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_BUCKET, etc.
 
+## Development Rules
+
+**All changes must be accompanied with unit tests.** This is non-negotiable for maintaining code quality and preventing regressions.
+
 ## Common Development Patterns
 
 When adding new functionality:
@@ -109,4 +113,4 @@ When adding new functionality:
 4. Add comprehensive specs following existing patterns
 5. Update README.md with new configuration options
 
-The codebase emphasizes security (credential handling), efficiency (parallel operations), and reliability (comprehensive error handling and dry-run support).
+The codebase emphasizes security (credential handling), efficiency (parallel operations), and reliability (comprehensive error handling and dry-run support).
@@ -1,5 +1,6 @@
 require 'aws-sdk-cloudfront'
 require 'securerandom'
+require 'set'
 
 module Middleman
   module S3Sync
@@ -101,25 +102,42 @@ module Middleman
           # Sort paths to ensure wildcards come before specific files
           sorted_paths = paths.sort
           result = []
+          # Use a Set for O(1) lookup of wildcard prefixes
+          wildcard_prefixes = Set.new
 
           sorted_paths.each do |path|
-            # Check if this path is already covered by a wildcard we've added
-            is_redundant = result.any? do |existing_path|
-              if existing_path.end_with?('/*')
-                # Check if current path is under this wildcard
-                wildcard_prefix = existing_path[0..-3] # Remove /*
-                path.start_with?(wildcard_prefix + '/')
-              else
-                false
+            # Check if this path is covered by any existing wildcard prefix
+            # by checking all parent directories of this path
+            is_redundant = path_covered_by_wildcard?(path, wildcard_prefixes)
+
+            unless is_redundant
+              result << path
+              # If this is a wildcard path, add its prefix for future lookups
+              if path.end_with?('/*')
+                wildcard_prefixes.add(path[0..-3]) # Remove /*
               end
             end
-
-            result << path unless is_redundant
           end
 
           result
         end
 
+        # Check if a path is covered by any wildcard prefix in O(path_depth) time
+        def path_covered_by_wildcard?(path, wildcard_prefixes)
+          return false if wildcard_prefixes.empty?
+
+          # Check each parent directory of the path
+          segments = path.split('/')
+          current_path = ''
+
+          segments[0..-2].each do |segment| # Exclude the last segment
+            current_path = current_path.empty? ? segment : "#{current_path}/#{segment}"
+            return true if wildcard_prefixes.include?(current_path)
+          end
+
+          false
+        end
+
         def create_invalidation_with_retry(paths, options)
           max_retries = options.cloudfront_invalidation_max_retries || 5
           retries = 0
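The pruning in this hunk can be sketched standalone. The code below is an illustration, not the gem's code: the names `prune_redundant_paths` and `covered_by_wildcard?` are hypothetical, and each candidate prefix is rebuilt with `join` rather than accumulated segment by segment. That choice keeps a leading slash intact, since `"/images/a.png".split('/')` yields `["", "images", "a.png"]`. In the hunk's accumulation an empty first segment leaves `current_path` empty, so the next prefix tested is `images` rather than `/images`; whether that matters depends on whether paths reach the method with a leading slash, which the hunk does not show.

```ruby
require 'set'

# True when some parent directory of `path` is a recorded wildcard prefix.
def covered_by_wildcard?(path, wildcard_prefixes)
  return false if wildcard_prefixes.empty?

  segments = path.split('/')
  # Test every proper prefix, shallowest first; the last segment is excluded.
  (1...segments.length).any? do |depth|
    wildcard_prefixes.include?(segments[0, depth].join('/'))
  end
end

# Drops paths already covered by a wildcard such as "/images/*".
def prune_redundant_paths(paths)
  prefixes = Set.new
  # Sorting places "/dir/*" ahead of most siblings because '*' sorts low.
  paths.sort.each_with_object([]) do |path, kept|
    next if covered_by_wildcard?(path, prefixes)

    kept << path
    prefixes.add(path.delete_suffix('/*')) if path.end_with?('/*')
  end
end

p prune_redundant_paths(['/images/a.png', '/images/*', '/index.html'])
```

Each path costs one Set lookup per parent directory, which is where the O(n × path_depth) figure in the changelog comes from (ignoring the cost of the initial sort and of building the prefix strings).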
@@ -161,24 +179,30 @@ module Middleman
         end
 
         def cloudfront_client(options)
-          client_options = {
-            region: 'us-east-1' # CloudFront is always in us-east-1
-          }
-
-          # Use the same credentials as S3 if available
-          if options.aws_access_key_id && options.aws_secret_access_key
-            client_options.merge!({
-              access_key_id: options.aws_access_key_id,
-              secret_access_key: options.aws_secret_access_key
-            })
-
-            # If using an assumed role
-            client_options.merge!({
-              session_token: options.aws_session_token
-            }) if options.aws_session_token
+          @cloudfront_client ||= begin
+            client_options = {
+              region: 'us-east-1' # CloudFront is always in us-east-1
+            }
+
+            # Use the same credentials as S3 if available
+            if options.aws_access_key_id && options.aws_secret_access_key
+              client_options.merge!({
+                access_key_id: options.aws_access_key_id,
+                secret_access_key: options.aws_secret_access_key
+              })
+
+              # If using an assumed role
+              client_options.merge!({
+                session_token: options.aws_session_token
+              }) if options.aws_session_token
+            end
+
+            Aws::CloudFront::Client.new(client_options)
           end
+        end
 
-          Aws::CloudFront::Client.new(client_options)
+        def reset_cloudfront_client!
+          @cloudfront_client = nil
         end
 
         def wait_for_invalidations(invalidation_ids, options)
@@ -26,13 +26,16 @@ module Middleman
         :ignore_paths,
         :index_document,
         :error_document,
+        :routing_rules,
+        :scan_build_dir,
         :cloudfront_distribution_id,
         :cloudfront_invalidate,
         :cloudfront_invalidate_all,
         :cloudfront_invalidation_batch_size,
         :cloudfront_invalidation_max_retries,
         :cloudfront_invalidation_batch_delay,
-        :cloudfront_wait
+        :cloudfront_wait,
+        :after_s3_sync
       ]
       attr_accessor *OPTIONS
 
@@ -113,6 +116,14 @@ module Middleman
         @version_bucket.nil? ? false : @version_bucket
       end
 
+      def routing_rules
+        @routing_rules || []
+      end
+
+      def scan_build_dir
+        @scan_build_dir.nil? ? false : @scan_build_dir
+      end
+
     end
   end
 end
@@ -6,9 +6,11 @@ module Middleman
 
       include Status
 
-      def initialize(resource, partial_s3_resource)
+      def initialize(resource, partial_s3_resource, path: nil)
         @resource = resource
-        @path = if resource
+        @path = if path
+                  path.sub(/^\//, '')
+                elsif resource
                   resource.destination_path.sub(/^\//, '')
                 elsif partial_s3_resource&.key
                   partial_s3_resource.key.sub(/^\//, '')
@@ -116,20 +118,25 @@ module Middleman
 
       def upload!
         object = bucket.object(remote_path.sub(/^\//, ''))
-        upload_options = build_upload_options
-
-        begin
-          object.put(upload_options)
-        rescue Aws::S3::Errors::AccessControlListNotSupported => e
-          # Bucket has ACLs disabled - retry without ACL
-          if upload_options.key?(:acl)
-            say_status "#{ANSI.yellow{"Note"}} Bucket does not support ACLs, retrying without ACL parameter"
-            # Automatically disable ACLs for this bucket going forward
-            options.acl = ''
-            upload_options.delete(:acl)
-            retry
-          else
-            raise e
+
+        # Use streaming upload for memory efficiency with large files
+        File.open(local_path, 'rb') do |file|
+          upload_options = build_upload_options_for_stream(file)
+
+          begin
+            object.put(upload_options)
+          rescue Aws::S3::Errors::AccessControlListNotSupported => e
+            # Bucket has ACLs disabled - retry without ACL
+            if upload_options.key?(:acl)
+              say_status "#{ANSI.yellow{"Note"}} Bucket does not support ACLs, retrying without ACL parameter"
+              # Automatically disable ACLs for this bucket going forward
+              options.acl = ''
+              upload_options.delete(:acl)
+              file.rewind # Reset file position for retry
+              retry
+            else
+              raise e
+            end
           end
         end
       end
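The `file.rewind` added before `retry` is the detail that makes a streamed body safe to resend: a failed first attempt may already have consumed part of the stream. The sketch below shows the pattern in isolation. `AclNotSupported` stands in for `Aws::S3::Errors::AccessControlListNotSupported`, and `stream_upload` and its `object` argument (anything with a `put` method shaped like `Aws::S3::Object#put`) are hypothetical.

```ruby
# Stand-in for Aws::S3::Errors::AccessControlListNotSupported.
class AclNotSupported < StandardError; end

# Hypothetical uploader: streams the file as the request body and,
# if the bucket rejects ACLs, retries once without the :acl option.
def stream_upload(object, path, acl: 'public-read')
  File.open(path, 'rb') do |file|
    opts = { body: file }
    opts[:acl] = acl if acl
    begin
      object.put(**opts)
    rescue AclNotSupported
      raise unless opts.delete(:acl) # no ACL left to strip: re-raise
      file.rewind                    # a partial read may have moved the cursor
      retry
    end
  end
end
```

Without the rewind, a retry after a partially read body could send a truncated or empty object.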
@@ -219,7 +226,8 @@ module Middleman
       end
 
       def local?
-        File.exist?(local_path) && resource
+        # For orphan files (scan_build_dir), resource is nil but file exists
+        File.exist?(local_path)
       end
 
       def remote?
@@ -227,8 +235,8 @@ module Middleman
       end
 
       def redirect?
-        (resource && resource.respond_to?(:redirect?) && resource.redirect?) ||
-          (full_s3_resource && full_s3_resource.respond_to?(:website_redirect_location) && full_s3_resource.website_redirect_location)
+        !!(resource && resource.respond_to?(:redirect?) && resource.redirect?) ||
+          !!(full_s3_resource && full_s3_resource.respond_to?(:website_redirect_location) && full_s3_resource.website_redirect_location)
       end
 
       def metadata_match?
@@ -278,12 +286,24 @@ module Middleman
       end
 
       def local_object_md5
-        @local_object_md5 ||= Digest::MD5.hexdigest(File.read(local_path))
+        @local_object_md5 ||= begin
+          # When not gzipped, compute both MD5s in single read to avoid redundant I/O
+          if !gzipped && local_path == original_path
+            compute_md5s_single_read
+            @local_object_md5
+          else
+            Digest::MD5.hexdigest(File.read(local_path))
+          end
+        end
       end
 
       def local_content_md5
         @local_content_md5 ||= begin
-          if File.exist?(original_path)
+          # When not gzipped, compute both MD5s in single read to avoid redundant I/O
+          if !gzipped && local_path == original_path
+            compute_md5s_single_read
+            @local_content_md5
+          elsif File.exist?(original_path)
             Digest::MD5.hexdigest(File.read(original_path))
           else
             nil
@@ -291,13 +311,39 @@ module Middleman
         end
       end
 
+      # Compute both MD5s from a single file read when they're the same file
+      def compute_md5s_single_read
+        return if @md5s_computed
+        content = File.read(local_path)
+        md5 = Digest::MD5.hexdigest(content)
+        @local_object_md5 = md5
+        @local_content_md5 = md5
+        @md5s_computed = true
+      end
+
       def original_path
         gzipped ? local_path.gsub(/\.gz$/, '') : local_path
       end
 
       def content_type
-        @content_type ||= Middleman::S3Sync.content_types[local_path]
-        @content_type ||= !resource.nil? && resource.respond_to?(:content_type) ? resource.content_type : nil
+        @content_type ||= begin
+          # Priority: content_types option > mm_resource > mime-types > default
+          ct = options.content_types[local_path] if options.content_types
+          ct ||= options.content_types[path] if options.content_types
+          ct ||= Middleman::S3Sync.content_types[local_path]
+          ct ||= Middleman::S3Sync.content_types[path]
+          ct ||= resource.content_type if resource&.respond_to?(:content_type)
+          ct ||= detect_content_type_from_extension
+          ct || 'application/octet-stream'
+        end
+      end
+
+      def detect_content_type_from_extension
+        return nil unless defined?(MIME::Types)
+        extension = File.extname(original_path).delete_prefix('.')
+        return nil if extension.empty?
+        types = MIME::Types.type_for(extension)
+        types.first&.content_type
       end
 
       def caching_policy
@@ -314,9 +360,10 @@ module Middleman
 
       protected
 
-      def build_upload_options
+      # Build upload options with a file stream as the body
+      def build_upload_options_for_stream(file_stream)
         upload_options = {
-          body: local_content,
+          body: file_stream,
           content_type: content_type
         }
         # Only add ACL if enabled (not for buckets with ACLs disabled)
@@ -1,5 +1,5 @@
 module Middleman
   module S3Sync
-    VERSION = "4.6.4"
+    VERSION = "4.7.0"
   end
 end