appydave-tools 0.68.0 → 0.70.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,780 @@
# Batch S3 Listing - Requirements Document

**Author:** Claude Code
**Date:** 2025-01-24
**Status:** Draft
**Related Issue:** Performance bottleneck in `dam list <brand> --s3`

---

## Executive Summary

**Problem:** The `dam list <brand> --s3` command makes N individual AWS S3 API calls (one per project), causing severe performance degradation for brands with many projects (3-5 seconds for 13 projects).

**Proposed Solution:** Implement batch S3 listing that makes a single AWS API call for all projects in a brand, then distributes the results locally.

**Expected Improvement:** 13 AWS API calls → 1 call (13x reduction, ~3-5s → ~300-500ms)

---

## 1. Problem Statement

### Current Behavior

**Command:** `dam list appydave --s3`

**What happens:**
1. Retrieves the list of 13 projects locally
2. For EACH project (loop):
   - Creates a new `S3Operations` instance
   - Calls `calculate_sync_status()`
   - Which calls `list_s3_files()` (AWS API call)
   - Returns the sync status (↑ upload, ↓ download, ✓ synced, none)

**Result:** 13 sequential AWS API calls, taking 3-5 seconds total

**User Impact:**
- Slow response time for a common operation
- Poor UX for daily workflows
- Cost implications (AWS charges per S3 API request)

### Root Cause

**N+1 Query Pattern:**
```ruby
# In project_listing.rb:172
project_data = projects.map do |project|
  collect_project_data(..., s3: true)                 # Called for each project
  #  → calculate_project_s3_sync_status()             # Line 566
  #    → S3Operations.new().calculate_sync_status()   # Line 493
  #      → list_s3_files()                            # Line 502 - AWS API CALL
end
```

Each project makes an independent S3 `list_objects_v2` call with its own prefix:
- `staging/appydave/b60/`
- `staging/appydave/b61/`
- `staging/appydave/b62/`
- ... (13 times)

---

## 2. Current Architecture

### Class Responsibilities

**`ProjectListing` (presenter layer):**
- Formats and displays brand/project lists
- Calls `S3Operations` for each project individually

**`S3Operations` (business logic):**
- Handles S3 operations for a SINGLE project
- Initialized with `brand` + `project_id`
- Method: `list_s3_files()` lists files for that specific project

### Current Flow

```
User: dam list appydave --s3
  ↓
ProjectListing.list_brand_projects(brand_arg, s3: true)
  ↓
[Loop 13 times]
  ↓
collect_project_data(project, s3: true)
  ↓
calculate_project_s3_sync_status(brand, project)
  ↓
S3Operations.new(brand, project)
  ↓
list_s3_files() ← AWS API CALL (prefix: staging/appydave/b60/)
  ↓
calculate_sync_status() → "✓ synced"
  ↓
Display table with S3 column
```

**Total AWS calls:** 13 (one per project)
98
+
99
+ ---
100
+
101
+ ## 3. Proposed Solution
102
+
103
+ ### Batch Listing Strategy
104
+
105
+ **Core Idea:** Make ONE AWS call to list ALL files for a brand, then distribute results to projects locally.
106
+
107
+ ### New Flow
108
+
109
+ ```
110
+ User: dam list appydave --s3
111
+
112
+ ProjectListing.list_brand_projects(brand_arg, s3: true)
113
+
114
+ S3Operations.list_all_brand_files(brand) ← SINGLE AWS API CALL
115
+ ↓ (prefix: staging/appydave/)
116
+ Returns: {
117
+ 'b60' => [file1, file2, ...],
118
+ 'b61' => [file3, file4, ...],
119
+ ...
120
+ }
121
+
122
+ [Loop 13 times]
123
+
124
+ collect_project_data(project, s3: true, s3_cache: s3_files_map)
125
+
126
+ calculate_project_s3_sync_status(brand, project, s3_files: s3_files_map[project])
127
+
128
+ calculate_sync_status(s3_files) ← Use cached data (no AWS call)
129
+
130
+ Display table with S3 column
131
+ ```
132
+
133
+ **Total AWS calls:** 1 (batch fetch for all projects)
134
+
135
+ ---
136
+
137
+ ## 4. Technical Design
138
+
139
+ ### 4.1 New Class Method
140
+
141
+ **Location:** `lib/appydave/tools/dam/s3_operations.rb`
142
+
143
+ ```ruby
144
+ # Class method: List all S3 files for a brand, grouped by project
145
+ # @param brand [String] Brand key (e.g., 'appydave')
146
+ # @param brand_info [BrandInfo] Optional pre-loaded brand info (DI)
147
+ # @return [Hash<String, Array<Hash>>] Map of project_id => array of S3 file hashes
148
+ #
149
+ # Example return value:
150
+ # {
151
+ # 'b60-automate-image-generation' => [
152
+ # { 'Key' => 'staging/appydave/b60-automate-image-generation/video.mp4',
153
+ # 'Size' => 12345, 'ETag' => '"abc123"', 'LastModified' => Time }
154
+ # ],
155
+ # 'b61-kdd-bmad' => [...]
156
+ # }
157
+ def self.list_all_brand_files(brand, brand_info: nil)
158
+ # Load brand info
159
+ brand_info ||= load_brand_info(brand)
160
+
161
+ # Create S3 client
162
+ s3_client = create_s3_client(brand_info)
163
+
164
+ # Fetch ALL objects for brand with single list_objects_v2 call
165
+ prefix = "#{brand_info.aws.s3_prefix}" # e.g., "staging/appydave/"
166
+
167
+ all_files = []
168
+ continuation_token = nil
169
+
170
+ loop do
171
+ response = s3_client.list_objects_v2(
172
+ bucket: brand_info.aws.s3_bucket,
173
+ prefix: prefix,
174
+ continuation_token: continuation_token
175
+ )
176
+
177
+ all_files.concat(response.contents) if response.contents
178
+
179
+ break unless response.is_truncated
180
+ continuation_token = response.next_continuation_token
181
+ end
182
+
183
+ # Group files by project ID
184
+ group_files_by_project(all_files, prefix)
185
+ end
186
+
187
+ private
188
+
189
+ # Extract project ID from S3 key and group files
190
+ # @param files [Array<Aws::S3::Types::Object>] S3 objects
191
+ # @param prefix [String] S3 prefix (e.g., "staging/appydave/")
192
+ # @return [Hash<String, Array<Hash>>] Map of project_id => files
193
+ def self.group_files_by_project(files, prefix)
194
+ grouped = Hash.new { |h, k| h[k] = [] }
195
+
196
+ files.each do |obj|
197
+ # Extract project ID from key
198
+ # Example: "staging/appydave/b60-project-name/video.mp4" → "b60-project-name"
199
+ relative_path = obj.key.sub(prefix, '')
200
+ project_id = relative_path.split('/').first
201
+
202
+ next if project_id.nil? || project_id.empty?
203
+
204
+ # Store file info in project's array
205
+ grouped[project_id] << {
206
+ 'Key' => obj.key,
207
+ 'Size' => obj.size,
208
+ 'ETag' => obj.etag,
209
+ 'LastModified' => obj.last_modified
210
+ }
211
+ end
212
+
213
+ grouped
214
+ end
215
+ ```
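
**Usage sketch** (illustrative only; assumes the method above is in place and AWS credentials are configured; `b99-never-uploaded` is a hypothetical project):

```ruby
# One batched fetch; everything after this line is a local hash lookup.
s3_files_map = S3Operations.list_all_brand_files('appydave')

s3_files_map.each do |project_id, files|
  total_bytes = files.sum { |f| f['Size'] }
  puts format('%-45s %3d files %12d bytes', project_id, files.size, total_bytes)
end

# Projects with nothing staged in S3 are simply absent from the map:
s3_files_map.fetch('b99-never-uploaded', []) # => []
```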

### 4.2 Updated Instance Method

**Modify:** `calculate_sync_status` to accept optional cached S3 files

```ruby
# Calculate 3-state S3 sync status ('none' when there is no S3 intent)
# @param s3_files [Array<Hash>, nil] Optional pre-fetched S3 files (for batch mode)
# @return [String] One of: '↑ upload', '↓ download', '✓ synced', 'none'
def calculate_sync_status(s3_files: nil)
  project_dir = project_directory_path
  staging_dir = File.join(project_dir, 's3-staging')

  # No s3-staging directory means no S3 intent
  return 'none' unless Dir.exist?(staging_dir)

  # Get S3 files (use the cache if provided, otherwise fetch)
  begin
    s3_files_list = s3_files || list_s3_files
  rescue StandardError
    # S3 not configured or not accessible
    return 'none'
  end

  local_files = list_local_files(staging_dir)

  # No files anywhere
  return 'none' if s3_files_list.empty? && local_files.empty?

  # ... (rest of logic unchanged)
end
```

### 4.3 Updated ProjectListing Integration

**Modify:** `list_brand_projects` to use batch listing

```ruby
def self.list_brand_projects(brand_arg, detailed: false, s3: false)
  # ... (existing code)

  # Batch-fetch S3 files if requested (SINGLE AWS CALL)
  s3_files_cache = if s3
                     begin
                       S3Operations.list_all_brand_files(brand_arg, brand_info: brand_info)
                     rescue StandardError => e
                       # S3 not configured or error - log and continue without S3
                       puts "⚠️  S3 listing failed: #{e.message}" if ENV['DAM_DEBUG']
                       {}
                     end
                   else
                     {}
                   end

  # Gather project data (use cached S3 files)
  project_data = projects.map do |project|
    collect_project_data(
      brand_arg, brand_path, brand_info, project, is_git_repo,
      detailed: detailed,
      s3: s3,
      s3_files_cache: s3_files_cache # NEW PARAMETER
    )
  end

  # ... (rest of method unchanged)
end
```

**Modify:** `collect_project_data` to use cached S3 files

```ruby
def self.collect_project_data(brand_arg, brand_path, brand_info, project, is_git_repo,
                              detailed: false, s3: false, s3_files_cache: {})
  # ... (existing code)

  # Calculate 3-state S3 sync status - use the cache if available
  s3_sync = if s3
              calculate_project_s3_sync_status(
                brand_arg, brand_info, project,
                s3_files: s3_files_cache[project] # Use cached S3 files
              )
            else
              'N/A'
            end

  # ... (rest of method)
end
```

**Modify:** `calculate_project_s3_sync_status` to accept cached files

```ruby
def self.calculate_project_s3_sync_status(brand_arg, brand_info, project, s3_files: nil)
  # Check if S3 is configured
  s3_bucket = brand_info.aws.s3_bucket
  return 'N/A' if s3_bucket.nil? || s3_bucket.empty? || s3_bucket == 'NOT-SET'

  # Use S3Operations to calculate sync status
  begin
    s3_ops = S3Operations.new(brand_arg, project, brand_info: brand_info)
    s3_ops.calculate_sync_status(s3_files: s3_files) # Pass cached files
  rescue StandardError
    # S3 not accessible or other error
    'N/A'
  end
end
```

---

## 5. Implementation Tasks

### Phase 1: Core Batch Listing (2-3 hours)

**Task 1.1:** Add `S3Operations.list_all_brand_files` class method
- [ ] Implement a single batched `list_objects_v2` fetch per brand
- [ ] Handle S3 pagination (continuation tokens)
- [ ] Group files by project ID
- [ ] Return hash map: `{ project_id => [files] }`
- [ ] Handle errors gracefully (return empty hash on failure)

**Task 1.2:** Extract `group_files_by_project` helper
- [ ] Parse S3 keys to extract project IDs
- [ ] Handle edge cases (empty keys, missing project folders)
- [ ] Unit tests with sample S3 responses

**Task 1.3:** Update `calculate_sync_status` to accept cached files
- [ ] Add optional `s3_files:` parameter
- [ ] Use cached files if provided, fall back to `list_s3_files()` otherwise
- [ ] Ensure backward compatibility (existing code still works)

### Phase 2: Integration (1-2 hours)

**Task 2.1:** Update `ProjectListing.list_brand_projects`
- [ ] Call `S3Operations.list_all_brand_files` once before the loop
- [ ] Pass `s3_files_cache` to `collect_project_data`
- [ ] Handle S3 fetch errors gracefully

**Task 2.2:** Update `collect_project_data`
- [ ] Accept `s3_files_cache:` parameter
- [ ] Pass cached files to `calculate_project_s3_sync_status`

**Task 2.3:** Update `calculate_project_s3_sync_status`
- [ ] Accept `s3_files:` parameter
- [ ] Pass to `calculate_sync_status`

### Phase 3: S3 Timestamps (1 hour)

**Task 3.1:** Update `calculate_s3_timestamps` for batch mode
- [ ] Accept optional `s3_files:` parameter
- [ ] Extract timestamps from cached data
- [ ] Fall back to `list_s3_files()` if not cached

**Task 3.2:** Update `collect_project_data` detailed mode
- [ ] Pass cached S3 files to `calculate_s3_timestamps`

### Phase 4: Testing (2-3 hours)

**Task 4.1:** Unit tests
- [ ] Test `group_files_by_project` with various S3 key formats
- [ ] Test empty brand (no projects in S3)
- [ ] Test partial match (some projects have S3 files, some don't)
- [ ] Test S3 pagination (large brands with >1000 files)

**Task 4.2:** Integration tests
- [ ] Test `list_brand_projects` with batch listing
- [ ] Test fallback to individual listing on batch error
- [ ] Test backward compatibility (existing code paths)

**Task 4.3:** Performance validation
- [ ] Measure AWS API call count (should be 1)
- [ ] Measure wall-clock time improvement
- [ ] Test with various brand sizes (1, 10, 50 projects)

### Phase 5: Documentation (30 minutes)

**Task 5.1:** Update CLAUDE.md
- [ ] Document batch listing approach
- [ ] Update performance notes

**Task 5.2:** Add code comments
- [ ] Document new class method
- [ ] Explain batching strategy

---

## 6. Testing Strategy

### 6.1 Unit Tests

**Location:** `spec/appydave/tools/dam/s3_operations_spec.rb`

```ruby
describe S3Operations do
  describe '.list_all_brand_files' do
    let(:brand) { 'appydave' }
    let(:s3_client) { instance_double(Aws::S3::Client) }
    let(:brand_info) do
      double(aws: double(s3_prefix: 'staging/appydave/', s3_bucket: 'appydave-video-projects'))
    end

    context 'with multiple projects' do
      it 'groups files by project ID' do
        # Mock S3 response
        response = double(
          contents: [
            double(key: 'staging/appydave/b60-project/video.mp4', size: 1000, etag: '"abc"', last_modified: Time.now),
            double(key: 'staging/appydave/b60-project/subtitle.srt', size: 100, etag: '"def"', last_modified: Time.now),
            double(key: 'staging/appydave/b61-other/video.mp4', size: 2000, etag: '"ghi"', last_modified: Time.now)
          ],
          is_truncated: false
        )

        allow(s3_client).to receive(:list_objects_v2).and_return(response)

        result = S3Operations.list_all_brand_files(brand, brand_info: brand_info, s3_client: s3_client)

        expect(result.keys).to contain_exactly('b60-project', 'b61-other')
        expect(result['b60-project'].size).to eq(2)
        expect(result['b61-other'].size).to eq(1)
      end
    end

    context 'with S3 pagination' do
      it 'fetches all pages' do
        # Test continuation token handling
      end
    end

    context 'with empty brand' do
      it 'returns empty hash' do
        # Test no files case
      end
    end
  end

  describe '#calculate_sync_status' do
    context 'with cached S3 files' do
      it 'uses cached data instead of AWS call' do
        # Verify no S3 API call made
      end
    end

    context 'without cached files' do
      it 'falls back to individual listing' do
        # Verify AWS call still made
      end
    end
  end
end
```
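
The pagination context above is still a stub; one possible shape for it, reusing the injected doubles (a sketch, not the final spec):

```ruby
it 'fetches all pages' do
  page1 = double(
    contents: [double(key: 'staging/appydave/b60-project/a.mp4', size: 1,
                      etag: '"a"', last_modified: Time.now)],
    is_truncated: true,
    next_continuation_token: 'token-2'
  )
  page2 = double(
    contents: [double(key: 'staging/appydave/b60-project/b.mp4', size: 2,
                      etag: '"b"', last_modified: Time.now)],
    is_truncated: false
  )

  # Consecutive calls return consecutive pages.
  allow(s3_client).to receive(:list_objects_v2).and_return(page1, page2)

  result = S3Operations.list_all_brand_files(brand, brand_info: brand_info, s3_client: s3_client)

  expect(s3_client).to have_received(:list_objects_v2).twice
  expect(result['b60-project'].size).to eq(2)
end
```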

### 6.2 Integration Tests

**Location:** `spec/appydave/tools/dam/project_listing_spec.rb`

```ruby
describe ProjectListing do
  describe '.list_brand_projects' do
    context 'with --s3 flag (batch mode)' do
      it 'makes single AWS call for all projects' do
        # `allow` only stubs the call; the count is asserted separately below.
        allow(S3Operations).to receive(:list_all_brand_files).and_return({})

        ProjectListing.list_brand_projects('appydave', s3: true)

        expect(S3Operations).to have_received(:list_all_brand_files).once
      end

      it 'displays correct S3 status for all projects' do
        # Verify output matches expectations
      end
    end
  end
end
```

### 6.3 Performance Tests

**Manual Testing Script:**

```bash
# Baseline (individual calls)
git checkout main
time bin/dam list appydave --s3

# After batch implementation
git checkout feature/batch-s3-listing
time bin/dam list appydave --s3

# Expected improvement: 3-5s → 300-500ms
```
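
To validate the API call count without timing noise, the AWS SDK's stubbed client records every operation in `api_requests`. A sketch, assuming the `s3_client:` injection from section 4.1:

```ruby
require 'aws-sdk-s3'

# Stubbed client: no network traffic, but every SDK operation is recorded.
client = Aws::S3::Client.new(stub_responses: true)
client.stub_responses(:list_objects_v2, contents: [], is_truncated: false)

S3Operations.list_all_brand_files('appydave', s3_client: client)

calls = client.api_requests.count { |r| r[:operation_name] == :list_objects_v2 }
puts "list_objects_v2 calls: #{calls}" # Expected: 1 (plus one per extra page)
```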

---

## 7. Risks & Mitigations

### Risk 1: S3 Key Format Variations

**Risk:** Different project naming conventions may break project ID extraction.

**Examples:**
- FliVideo: `b60-project-name` (standard)
- Storyline: `boy-baker` (no prefix)
- Edge case: `b60-project-name/archived/video.mp4` (nested folders)

**Mitigation:**
- Extract the project ID as the first path segment after the brand prefix
- Unit test with various key formats
- Gracefully handle unexpected formats (skip, don't crash)

### Risk 2: Large Brand Performance

**Risk:** Brands with 1000+ files may have slow S3 pagination.

**Mitigation:**
- Implement proper pagination (continuation tokens)
- Add a progress indicator for large brands
- Consider caching results (future enhancement)

### Risk 3: Partial S3 Failures

**Risk:** S3 listing succeeds but some projects fail to parse.

**Mitigation:**
- Return partial results (best-effort)
- Log warnings for unparseable keys
- Don't block the entire listing on a single project error

### Risk 4: Backward Compatibility

**Risk:** Breaking existing code that calls `calculate_sync_status()` without parameters.

**Mitigation:**
- Make the `s3_files:` parameter optional with default `nil` (both call forms shown below)
- Existing code falls back to individual listing
- Ensure all existing tests still pass
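
Concretely, both call forms remain valid (a sketch based on the design in section 4.2):

```ruby
s3_ops = S3Operations.new('appydave', 'b60-automate-image-generation')

# Legacy path: no argument, falls back to list_s3_files (one AWS call).
s3_ops.calculate_sync_status

# Batch path: pre-fetched files injected, no AWS call.
cache = S3Operations.list_all_brand_files('appydave')
s3_ops.calculate_sync_status(s3_files: cache['b60-automate-image-generation'])
```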

### Risk 5: S3 API Rate Limits

**Risk:** Large brands may hit S3 API rate limits.

**Mitigation:**
- A single call is much better than N calls
- AWS S3 rate limits are generous (5,500 GET/HEAD requests per second per prefix)
- Monitor CloudWatch metrics for throttling

---

## 8. Performance Expectations

### Current Performance (Baseline)

**Test case:** `dam list appydave --s3` (13 projects)

| Metric | Value |
|--------|-------|
| AWS API calls | 13 (sequential) |
| Network round-trips | 13 |
| Total time | 3-5 seconds |
| User experience | Slow, frustrating |

### Expected Performance (After Batch)

| Metric | Value | Improvement |
|--------|-------|-------------|
| AWS API calls | 1 (may paginate) | 13x reduction |
| Network round-trips | 1-2 | 6-13x reduction |
| Total time | 300-500ms | 6-10x faster |
| User experience | Fast, responsive | ✅ Acceptable |

### Performance by Brand Size

| Projects | Current | After Batch | Improvement |
|----------|---------|-------------|-------------|
| 5 | 1-2s | 200-300ms | 5-7x |
| 13 | 3-5s | 300-500ms | 6-10x |
| 30 | 8-12s | 500-800ms | 10-15x |
| 50 | 15-20s | 800-1200ms | 12-18x |

**Note:** The improvement scales with brand size (larger brands benefit more).

---

## 9. Future Enhancements

### 9.1 Caching Layer (Optional)

**Benefit:** Repeated `dam list` calls within 30-60 seconds reuse cached S3 data.

**Implementation:**
- Cache S3 results in a temp file or in memory
- TTL: 30-60 seconds
- Invalidate on `s3-up`, `s3-down`, `s3-cleanup`

**Trade-off:** Adds complexity and a stale-data risk
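
A minimal in-process sketch of such a cache (module name and structure hypothetical; out of scope for this change):

```ruby
# Hypothetical TTL cache keyed by brand; lives for the process lifetime.
module BrandS3Cache
  TTL_SECONDS = 60
  @store = {}

  def self.fetch(brand)
    entry = @store[brand]
    return entry[:data] if entry && Time.now - entry[:fetched_at] < TTL_SECONDS

    data = S3Operations.list_all_brand_files(brand)
    @store[brand] = { data: data, fetched_at: Time.now }
    data
  end

  # Call from s3-up / s3-down / s3-cleanup so results never go stale.
  def self.invalidate(brand)
    @store.delete(brand)
  end
end
```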

### 9.2 Parallel Git Status (Future)

**Note:** Git status checks also exhibit an N+1 pattern (one check per project).

**Potential fix:**
- Batch the checks into a single `git status --porcelain <project1> <project2> ...` call (see the sketch below)
- Or parallelize the git calls (less benefit than S3)

**Priority:** Lower (git is local, faster than S3)
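
A rough sketch of the batched form (paths and variable names hypothetical):

```ruby
brand_path   = '/path/to/brands/appydave' # hypothetical
project_dirs = ['b60-automate-image-generation', 'b61-kdd-bmad']

# One git invocation for the whole brand instead of one per project.
porcelain = `git -C #{brand_path} status --porcelain -- #{project_dirs.join(' ')}`

# Porcelain lines look like "XY <path>"; the first path segment names the project.
dirty_projects = porcelain.lines.map { |l| l[3..].to_s.strip.split('/').first }.compact.uniq
```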

### 9.3 Progress Indicators

**For large brands:** Show a spinner or progress bar during the S3 fetch.

```ruby
print "🔍 Fetching S3 data for #{brand}... "
s3_files_cache = S3Operations.list_all_brand_files(brand)
puts "✓ (#{s3_files_cache.size} projects)"
```

---

## 10. Success Criteria

### Must Have

- ✅ Single AWS S3 API call per brand (vs N calls currently)
- ✅ Correct S3 sync status for all projects
- ✅ No regressions in existing functionality
- ✅ All existing tests pass
- ✅ Performance improvement: 3-5s → < 1s

### Nice to Have

- ⭐ Unit test coverage > 90%
- ⭐ Performance improvement > 6x
- ⭐ Graceful error handling for S3 failures
- ⭐ Progress indicator for large brands

### Out of Scope

- ❌ Caching layer (future enhancement)
- ❌ Batch git status (separate issue)
- ❌ Parallel S3 requests (a single call is better)

---

## 11. Rollout Plan

### Phase 1: Development (This Sprint)

1. Implement batch listing (Tasks 1.1-1.3)
2. Integrate with ProjectListing (Tasks 2.1-2.3)
3. Add S3 timestamps support (Tasks 3.1-3.2)
4. Unit tests (Task 4.1)

### Phase 2: Testing (Next Sprint)

1. Integration tests (Task 4.2)
2. Performance validation (Task 4.3)
3. UAT with real brands (appydave, voz, ss)

### Phase 3: Documentation & Release

1. Update documentation (Tasks 5.1-5.2)
2. Code review
3. Merge to main
4. Release as a minor version (e.g., v0.70.0)

---

## 12. Appendix

### A. S3 Key Format Examples

**FliVideo (AppyDave):**
```
staging/appydave/b60-automate-image-generation/video.mp4
staging/appydave/b60-automate-image-generation/subtitle.srt
staging/appydave/b61-kdd-bmad/video.mp4
```

**Storyline (VOZ):**
```
staging/voz/boy-baker/final-edit.mov
staging/voz/boy-baker/scene-01.mp4
staging/voz/the-point/intro.mp4
```

**Project ID Extraction Logic:**
```ruby
key    = 'staging/appydave/b60-project/video.mp4'
prefix = 'staging/appydave/'

relative_path = key.sub(prefix, '')      # => "b60-project/video.mp4"
segments      = relative_path.split('/') # => ["b60-project", "video.mp4"]
project_id    = segments.first           # => "b60-project" ← Project ID
```

### B. AWS S3 list_objects_v2 API

**Request:**
```ruby
response = s3_client.list_objects_v2(
  bucket: 'appydave-video-projects',
  prefix: 'staging/appydave/',
  max_keys: 1000,          # Default (and maximum) page size
  continuation_token: nil  # For pagination
)
```

**Response Structure:**
```ruby
{
  contents: [
    {
      key: 'staging/appydave/b60-project/video.mp4',
      size: 12345678,
      etag: '"abc123def456"',
      last_modified: Time.parse('2025-01-20 10:30:00 UTC')
    },
    # ... more objects
  ],
  is_truncated: true,                  # More results available?
  next_continuation_token: 'xyz789...' # Use for the next page
}
```

**Pagination Example:**
```ruby
def fetch_all_objects(s3_client, bucket, prefix)
  all_objects = []
  continuation_token = nil

  loop do
    response = s3_client.list_objects_v2(
      bucket: bucket,
      prefix: prefix,
      continuation_token: continuation_token
    )

    all_objects.concat(response.contents) if response.contents

    break unless response.is_truncated

    continuation_token = response.next_continuation_token
  end

  all_objects
end
```

### C. Error Handling Strategy

**Graceful Degradation:**
```ruby
begin
  s3_files_cache = S3Operations.list_all_brand_files(brand)
rescue Aws::S3::Errors::ServiceError => e
  # AWS error (credentials, network, etc.)
  puts "⚠️  S3 listing failed: #{e.message}" if ENV['DAM_DEBUG']
  s3_files_cache = {} # Fall back to no S3 data
rescue StandardError => e
  # Unexpected error
  puts "⚠️  Unexpected error: #{e.message}" if ENV['DAM_DEBUG']
  s3_files_cache = {}
end

# Continue with the empty cache (shows 'N/A' for S3 status)
```

---

**End of Requirements Document**