activerecord-graph-extractor 0.1.0

data/README.md ADDED
@@ -0,0 +1,532 @@
1
+ # ActiveRecord Graph Extractor
2
+
3
+ A gem for extracting ActiveRecord models and their relationships into a JSON format that can be imported into another environment.
4
+
5
+ ## Features
6
+
7
+ - Complete graph traversal of ActiveRecord models
8
+ - Smart dependency resolution for importing
9
+ - Beautiful CLI with progress visualization
10
+ - Configurable extraction and import options
11
+ - Handles complex relationships including polymorphic associations
12
+ - Preserves referential integrity during import
13
+
14
+ ## Installation
15
+
16
+ Add this line to your application's Gemfile:
17
+
18
+ ```ruby
19
+ gem 'activerecord-graph-extractor'
20
+ ```
21
+
22
+ And then execute:
23
+ ```bash
24
+ $ bundle install
25
+ ```
26
+
27
+ Or install it yourself as:
28
+ ```bash
29
+ $ gem install activerecord-graph-extractor
30
+ ```
31
+
32
+ ### CLI Installation
33
+
34
+ The gem includes a command-line interface called `arge` (ActiveRecord Graph Extractor). After installing the gem, you can use the CLI in several ways:
35
+
36
+ #### Option 1: Using with Bundler (Recommended for Rails projects)
37
+
38
+ If you've added the gem to your Rails application's Gemfile:
39
+
40
+ ```bash
41
+ # Run CLI commands with bundle exec
42
+ $ bundle exec arge extract Order 12345 --output order.json
43
+ $ bundle exec arge s3_list --bucket my-bucket
44
+ ```
45
+
46
+ #### Option 2: Global Installation
47
+
48
+ To use the `arge` command globally from anywhere in your terminal:
49
+
50
+ ```bash
51
+ # Install the gem globally
52
+ $ gem install activerecord-graph-extractor
53
+
54
+ # Now you can use arge from anywhere
55
+ $ arge version
56
+ $ arge extract Order 12345 --output order.json
57
+ ```
58
+
59
+ #### Option 3: Using with rbenv/RVM
60
+
61
+ If you're using rbenv or RVM, make sure the gem is installed in your current Ruby version:
62
+
63
+ ```bash
64
+ # Check your current Ruby version
65
+ $ ruby -v
66
+
67
+ # Install the gem
68
+ $ gem install activerecord-graph-extractor
69
+
70
+ # Rehash to make the command available (rbenv only)
71
+ $ rbenv rehash
72
+
73
+ # Verify installation
74
+ $ arge version
75
+ ```
76
+
77
+ #### Option 4: Development Setup
78
+
79
+ If you're working on the gem itself or want to use the latest development version:
80
+
81
+ ```bash
82
+ # Clone the repository
83
+ $ git clone https://github.com/florrain/activerecord-graph-extractor.git
84
+ $ cd activerecord-graph-extractor
85
+
86
+ # Install dependencies
87
+ $ bundle install
88
+
89
+ # Use the CLI directly
90
+ $ bundle exec exe/arge version
91
+ $ bundle exec exe/arge extract Order 12345
92
+ ```
93
+
94
+ #### Verifying Installation
95
+
96
+ To verify the CLI is properly installed:
97
+
98
+ ```bash
99
+ # Check if the command is available
100
+ $ which arge
101
+ /usr/local/bin/arge
102
+
103
+ # Check the version
104
+ $ arge version
105
+ ActiveRecord Graph Extractor v0.1.0
106
+
107
+ # See all available commands
108
+ $ arge help
109
+ Commands:
110
+ arge extract MODEL_CLASS ID # Extract a record and its relationships
111
+ arge extract_to_s3 MODEL_CLASS ID # Extract and upload to S3
112
+ arge import FILE # Import records from JSON file
113
+ arge s3_list # List files in S3 bucket
114
+ arge s3_download S3_KEY # Download file from S3
115
+ arge analyze FILE # Analyze a JSON export file
116
+ arge dry_run MODEL_CLASS ID # Analyze extraction scope without extracting
117
+ ```
118
+
119
+ **Automated Verification:** You can also run our installation verification script:
120
+
121
+ ```bash
122
+ # Download and run the verification script
123
+ $ curl -fsSL https://raw.githubusercontent.com/florrain/activerecord-graph-extractor/main/scripts/verify_installation.rb | ruby
124
+
125
+ # Or if you have the gem source code
126
+ $ ruby scripts/verify_installation.rb
127
+ ```
128
+
129
+ This script checks your Ruby version, gem installation, and CLI availability, then tests every command.
130
+
131
+ #### Troubleshooting CLI Installation
132
+
133
+ **Command not found:**
134
+ ```bash
135
+ $ arge version
136
+ -bash: arge: command not found
137
+ ```
138
+
139
+ Solutions:
140
+ 1. **Check gem installation:** `gem list | grep activerecord-graph-extractor`
141
+ 2. **Rehash your shell:** `rbenv rehash` (for rbenv) or restart your terminal
142
+ 3. **Check your PATH:** `echo $PATH` should include your gem bin directory
143
+ 4. **Use bundle exec:** `bundle exec arge version` if installed via Gemfile
144
+
145
+ **Permission errors:**
146
+ ```bash
147
+ $ gem install activerecord-graph-extractor
148
+ ERROR: While executing gem ... (Gem::FilePermissionError)
149
+ ```
150
+
151
+ Solutions:
152
+ 1. **Use rbenv/RVM:** Install Ruby via rbenv or RVM instead of system Ruby
153
+ 2. **Use --user-install:** `gem install --user-install activerecord-graph-extractor`
154
+ 3. **Use sudo (not recommended):** `sudo gem install activerecord-graph-extractor`
155
+
156
+ **Wrong Ruby version:**
157
+ ```bash
158
+ $ arge version
159
+ Your Ruby version is 2.6.0, but this gem requires >= 2.7.0
160
+ ```
161
+
162
+ Solution: Upgrade your Ruby version using rbenv, RVM, or your system package manager.
163
+
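The version requirement above can also be checked programmatically. The following is a minimal sketch using RubyGems' `Gem::Requirement` and `Gem::Version`; the helper `ruby_supported?` and the constant `REQUIRED_RUBY` are hypothetical names, not part of the gem, and the `>= 2.7.0` bound is taken from the error message shown above.

```ruby
# Hypothetical helper, not part of the gem's API.
# The bound mirrors the ">= 2.7.0" requirement quoted in the error message.
REQUIRED_RUBY = Gem::Requirement.new(">= 2.7.0")

# Returns true when the given version string satisfies the requirement.
def ruby_supported?(version = RUBY_VERSION)
  REQUIRED_RUBY.satisfied_by?(Gem::Version.new(version))
end

puts "Ruby #{RUBY_VERSION} supported: #{ruby_supported?}"
```

Comparing through `Gem::Version` rather than plain strings avoids mistakes such as treating `2.10.0` as older than `2.7.0`.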
164
+ ## Quick Start
165
+
166
+ ### Ruby API
167
+
168
+ ```ruby
169
+ # Extract an Order and all related records
170
+ order = Order.find(12345)
171
+ extractor = ActiveRecordGraphExtractor::Extractor.new
172
+ data = extractor.extract(order)
173
+
174
+ # Export to a JSON file
175
+ File.write('order_12345.json', data.to_json)
176
+
177
+ # Import in another environment
178
+ importer = ActiveRecordGraphExtractor::Importer.new
179
+ importer.import_from_file('order_12345.json')
180
+ ```
181
+
182
+ ### CLI Usage
183
+
184
+ ```bash
185
+ # Extract an Order and related records
186
+ $ arge extract Order 12345 --output order_12345.json
187
+
188
+ # Import from a JSON file
189
+ $ arge import order_12345.json
190
+
191
+ # Extract and upload directly to S3
192
+ $ arge extract_to_s3 Order 12345 --bucket my-bucket --key extractions/order.json
193
+
194
+ # List files in S3 bucket
195
+ $ arge s3_list --bucket my-bucket --prefix extractions/
196
+
197
+ # Download from S3
198
+ $ arge s3_download extractions/order.json --bucket my-bucket --output local_order.json
199
+
200
+ # Analyze an export file
201
+ $ arge analyze order_12345.json
202
+
203
+ # Dry run analysis (analyze before extracting)
204
+ $ arge dry_run Order 12345 --max-depth 3
205
+ ```
206
+
207
+ ## Dry Run Analysis
208
+
209
+ Before performing large extractions, use the dry run feature to understand the scope and performance implications:
210
+
211
+ ### Ruby API
212
+
213
+ ```ruby
214
+ # Analyze what would be extracted
215
+ order = Order.find(12345)
216
+ extractor = ActiveRecordGraphExtractor::Extractor.new
217
+
218
+ analysis = extractor.dry_run(order)
219
+ puts "Would extract #{analysis['extraction_scope']['total_estimated_records']} records"
220
+ puts "Estimated file size: #{analysis['estimated_file_size']['human_readable']}"
221
+ puts "Estimated time: #{analysis['performance_estimates']['estimated_extraction_time_human']}"
222
+
223
+ # Check for warnings and recommendations
224
+ analysis['warnings'].each { |w| puts "⚠️ #{w['message']}" }
225
+ analysis['recommendations'].each { |r| puts "💡 #{r['message']}" }
226
+ ```
227
+
228
+ ### CLI Usage
229
+
230
+ ```bash
231
+ # Basic dry run analysis
232
+ $ arge dry_run Order 12345
233
+
234
+ # With a custom depth, saving the report to a file
235
+ $ arge dry_run Order 12345 --max-depth 2 --output analysis.json
236
+ ```
237
+
238
+ **Example Output:**
239
+ ```
240
+ 🔍 Performing dry run analysis...
241
+
242
+ ✅ Dry run analysis completed!
243
+
244
+ 📊 Analysis Summary:
245
+ Models involved: 8
246
+ Total estimated records: 1,247
247
+ Estimated file size: 2.3 MB
248
+ Estimated extraction time: 1.2 seconds
249
+
250
+ 📋 Records by Model:
251
+ Order 856 (68.6%)
252
+ Product 234 (18.8%)
253
+ Photo 145 (11.6%)
254
+
255
+ 💡 Recommendations:
256
+ S3: Large file detected - consider uploading directly to S3
257
+ ```
258
+
259
+ See the [Dry Run Guide](docs/dry_run.md) for comprehensive documentation.
260
+
261
+ ## Configuration
262
+
263
+ ### Extraction Options
264
+
265
+ ```ruby
266
+ extractor = ActiveRecordGraphExtractor::Extractor.new(
267
+ max_depth: 3, # Maximum depth of relationship traversal
268
+ include_relationships: %w[products customer shipping_address], # Only include specific relationships
269
+ exclude_relationships: %w[audit_logs], # Exclude specific relationships
270
+ custom_serializers: { # Custom serialization for specific models
271
+ 'User' => ->(record) {
272
+ {
273
+ id: record.id,
274
+ full_name: "#{record.first_name} #{record.last_name}",
275
+ email: record.email
276
+ }
277
+ }
278
+ }
279
+ )
280
+ ```
281
+
282
+ ### Import Options
283
+
284
+ ```ruby
285
+ importer = ActiveRecordGraphExtractor::Importer.new(
286
+ skip_existing: true, # Skip records that already exist
287
+ update_existing: false, # Update existing records instead of skipping
288
+ transaction: true, # Wrap import in a transaction
289
+ validate: true, # Validate records before saving
290
+ custom_finders: { # Custom finder methods for specific models
291
+ 'Product' => ->(attrs) { Product.find_by(product_number: attrs['product_number']) }
292
+ }
293
+ )
294
+ ```
295
+
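Importing a graph generally means creating referenced records before the records that point at them. As an illustration only (this is not the gem's actual resolver), the sketch below orders models with Ruby's standard `TSort` module. The `ImportOrder` class and the dependency map are assumptions made for the example.

```ruby
require "tsort"

# Hypothetical dependency map: each model lists the models it references
# through foreign keys (for example, Order has user_id, so it depends on User).
class ImportOrder
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort calls these two hooks to walk the graph.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  def tsort_each_child(model, &block)
    @dependencies.fetch(model, []).each(&block)
  end
end

dependencies = {
  "Order"   => ["User", "Address"],
  "Address" => ["User"],
  "User"    => []
}

# tsort lists a model only after everything it depends on.
import_order = ImportOrder.new(dependencies).tsort
puts import_order.inspect
```

With this map, `User` comes out first and `Order` last. A map containing a cycle would make `tsort` raise `TSort::Cyclic`.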
296
+ ## JSON Structure
297
+
298
+ The exported JSON has the following structure:
299
+
300
+ ```json
301
+ {
302
+ "metadata": {
303
+ "extracted_at": "2024-03-20T10:00:00Z",
304
+ "root_model": "Order",
305
+ "root_id": 12345,
306
+ "total_records": 150,
307
+ "models": ["Order", "User", "Product", "Address"],
308
+ "circular_references": [],
309
+ "max_depth": 3
310
+ },
311
+ "records": {
312
+ "Order": [
313
+ {
314
+ "id": 12345,
315
+ "user_id": 67890,
316
+ "state": "completed",
317
+ "total_amount": 99.99,
318
+ "created_at": "2024-03-19T15:30:00Z"
319
+ }
320
+ ],
321
+ "User": [
322
+ {
323
+ "id": 67890,
324
+ "email": "customer@example.com",
325
+ "first_name": "John",
326
+ "last_name": "Doe"
327
+ }
328
+ ]
329
+ }
330
+ }
331
+ ```
332
+
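Because exports follow the structure above, a file can be inspected with nothing but Ruby's `json` library. This is a minimal sketch that assumes only the `records` key shown in the sample; `records_per_model` is a hypothetical helper, not a gem method.

```ruby
require "json"

# Counts records per model in an export shaped like the sample above.
def records_per_model(export)
  export.fetch("records").transform_values(&:size)
end

# A trimmed-down version of the sample export.
sample = JSON.parse(<<~EXPORT)
  {
    "metadata": { "root_model": "Order", "root_id": 12345 },
    "records": {
      "Order": [{ "id": 12345, "user_id": 67890 }],
      "User":  [{ "id": 67890, "email": "customer@example.com" }]
    }
  }
EXPORT

records_per_model(sample).each { |model, count| puts "#{model}: #{count}" }
```

For a real export, replace the heredoc with `File.read` on the exported file.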
333
+ ## S3 Integration
334
+
335
+ The gem includes built-in support for uploading extractions directly to Amazon S3:
336
+
337
+ ```ruby
338
+ # Extract and upload to S3 in one step
339
+ extractor = ActiveRecordGraphExtractor::Extractor.new
340
+ result = extractor.extract_and_upload_to_s3(
341
+ order,
342
+ bucket_name: 'my-extraction-bucket',
343
+ s3_key: 'extractions/order_12345.json',
344
+ region: 'us-east-1'
345
+ )
346
+
347
+ puts "Uploaded to: #{result['s3_upload'][:url]}"
348
+ ```
349
+
350
+ ### S3Client Operations
351
+
352
+ ```ruby
353
+ # Create S3 client for advanced operations
354
+ s3_client = ActiveRecordGraphExtractor::S3Client.new(
355
+ bucket_name: 'my-bucket',
356
+ region: 'us-east-1'
357
+ )
358
+
359
+ # List files
360
+ files = s3_client.list_files(prefix: 'extractions/')
361
+
362
+ # Download files
363
+ s3_client.download_file('extractions/order.json', 'local_order.json')
364
+
365
+ # Generate presigned URLs
366
+ url = s3_client.presigned_url('extractions/order.json', expires_in: 3600)
367
+ ```
368
+
369
+ ### S3 CLI Commands
370
+
371
+ ```bash
372
+ # Extract and upload to S3
373
+ $ arge extract_to_s3 Order 12345 --bucket my-bucket --region us-east-1
374
+
375
+ # List S3 files
376
+ $ arge s3_list --bucket my-bucket --prefix extractions/2024/
377
+
378
+ # Download from S3
379
+ $ arge s3_download extractions/order.json --bucket my-bucket
380
+ ```
381
+
382
+ **AWS Configuration:** Set up your AWS credentials using environment variables, AWS credentials file, or IAM roles. See the [S3 Integration Guide](docs/s3_integration.md) for detailed configuration instructions.
383
+
384
+ ## Advanced Usage
385
+
386
+ ### Custom Traversal Rules
387
+
388
+ ```ruby
389
+ extractor = ActiveRecordGraphExtractor::Extractor.new(
390
+ traversal_rules: {
391
+ 'Order' => {
392
+ 'products' => { max_depth: 2 },
393
+ 'customer' => { max_depth: 1 },
394
+ 'shipping_address' => { max_depth: 1 }
395
+ }
396
+ }
397
+ )
398
+ ```
399
+
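To illustrate what a depth limit means during traversal, here is a small breadth-first sketch over a plain hash. It shows the general technique only and is not the gem's traversal code; `traverse` and the node names are made up for the example.

```ruby
require "set"

# Breadth-first walk over a hypothetical in-memory graph (node => neighbours).
# Relationships are not followed from nodes that are already max_depth hops
# from the root, and already-seen nodes are skipped so cycles terminate.
def traverse(graph, root, max_depth:)
  visited = Set[root]
  queue = [[root, 0]]

  until queue.empty?
    node, depth = queue.shift
    next if depth >= max_depth

    graph.fetch(node, []).each do |neighbour|
      # Set#add? returns nil when the element was already present.
      next unless visited.add?(neighbour)

      queue << [neighbour, depth + 1]
    end
  end

  visited.to_a
end

graph = {
  "order:1"   => ["user:7", "product:3"],
  "user:7"    => ["order:1", "address:9"], # back-reference forms a cycle
  "product:3" => ["photo:5"]
}

puts traverse(graph, "order:1", max_depth: 1).inspect
puts traverse(graph, "order:1", max_depth: 2).inspect
```

Raising `max_depth` from 1 to 2 pulls in `address:9` and `photo:5` as well, which is why a tighter limit on a busy relationship can shrink an extraction considerably.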
400
+ ### Handling Large Datasets
401
+
402
+ ```ruby
403
+ # Extract with progress tracking
404
+ extractor = ActiveRecordGraphExtractor::Extractor.new
405
+ extractor.extract(order) do |progress|
406
+ puts "Processed #{progress.current} of #{progress.total} records"
407
+ end
408
+
409
+ # Import with batch processing
410
+ importer = ActiveRecordGraphExtractor::Importer.new(batch_size: 1000)
411
+ importer.import_from_file('large_export.json')
412
+ ```
413
+
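Batch processing of the kind configured by `batch_size` can be pictured as slicing the record list and handling one slice at a time. This is a rough sketch under that assumption; `import_in_batches` is hypothetical and the block stands in for whatever actually persists the records.

```ruby
# Yields records in fixed-size batches and reports progress after each one.
# Returns the number of records processed.
def import_in_batches(records, batch_size:)
  processed = 0
  records.each_slice(batch_size) do |batch|
    yield batch
    processed += batch.size
    puts "Processed #{processed} of #{records.size} records"
  end
  processed
end

rows = (1..2500).map { |id| { "id" => id } }
batch_sizes = []
import_in_batches(rows, batch_size: 1000) { |batch| batch_sizes << batch.size }
puts batch_sizes.inspect
```

Here 2,500 rows are handled as two batches of 1,000 and one of 500, so memory use is bounded by the batch size rather than by the full export.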
414
+ ## CLI Features
415
+
416
+ The CLI provides a beautiful interface with progress visualization:
417
+
418
+ ```bash
419
+ $ arge extract Order 12345
420
+ Extracting Order #12345 and related records...
421
+ [====================] 100% | 150 records processed
422
+ Export completed: order_12345.json
423
+
424
+ $ arge import order_12345.json
425
+ Importing records from order_12345.json...
426
+ [====================] 100% | 150 records imported
427
+ Import completed successfully
428
+ ```
429
+
430
+ ## Error Handling
431
+
432
+ ### Validation Errors
433
+
434
+ ```ruby
435
+ begin
436
+ importer.import_from_file('data.json')
437
+ rescue ActiveRecordGraphExtractor::ImportError => e
438
+ puts "Import failed: #{e.message}"
439
+ puts "Failed records: #{e.failed_records}"
440
+ end
441
+ ```
442
+
443
+ ### Circular Dependencies
444
+
445
+ ```ruby
446
+ begin
447
+ extractor.extract(order)
448
+ rescue ActiveRecordGraphExtractor::CircularDependencyError => e
449
+ puts "Circular dependency detected: #{e.message}"
450
+ puts "Dependency chain: #{e.dependency_chain}"
451
+ end
452
+ ```
453
+
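A dependency chain like the one reported above can be found with a depth-first search. The sketch below is one possible approach over a plain hash of model dependencies, not the gem's detection logic; `find_cycle` and `walk_dependencies` are hypothetical names.

```ruby
# Returns the first dependency cycle found as an array of names
# (for example ["Order", "User", "Order"]), or nil when the graph is acyclic.
def find_cycle(graph)
  finished = {}
  graph.each_key do |node|
    cycle = walk_dependencies(graph, node, [], finished)
    return cycle if cycle
  end
  nil
end

# Depth-first helper: `path` holds the nodes on the current chain.
def walk_dependencies(graph, node, path, finished)
  if (start = path.index(node))
    return path[start..] + [node] # node is already on the chain: cycle found
  end
  return nil if finished[node]

  graph.fetch(node, []).each do |child|
    cycle = walk_dependencies(graph, child, path + [node], finished)
    return cycle if cycle
  end
  finished[node] = true
  nil
end

puts find_cycle("Order" => ["User"], "User" => ["Order"]).inspect
puts find_cycle("Order" => ["User"], "User" => []).inspect
```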
454
+ ## Performance
455
+
456
+ The gem is optimized for performance:
457
+
458
+ - Smart caching of traversed records
459
+ - Efficient SQL queries using includes
460
+ - Batch processing for large datasets
461
+ - Progress tracking for long-running operations
462
+
463
+ Benchmarks:
464
+ - Small dataset (100 records): ~1 second
465
+ - Medium dataset (1,000 records): ~5 seconds
466
+ - Large dataset (10,000 records): ~30 seconds
467
+
468
+ ## Contributing
469
+
470
+ 1. Fork the repository
471
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
472
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
473
+ 4. Push to the branch (`git push origin my-new-feature`)
474
+ 5. Create a new Pull Request
475
+
476
+ ## Quick Reference
477
+
478
+ ### CLI Commands
479
+
480
+ | Command | Description | Example |
481
+ |---------|-------------|---------|
482
+ | `arge version` | Show version information | `arge version` |
483
+ | `arge help` | Show all commands | `arge help` |
484
+ | `arge extract` | Extract records to JSON file | `arge extract Order 123 --output order.json` |
485
+ | `arge import` | Import records from JSON file | `arge import order.json` |
486
+ | `arge extract_to_s3` | Extract and upload to S3 | `arge extract_to_s3 Order 123 --bucket my-bucket` |
487
+ | `arge s3_list` | List files in S3 bucket | `arge s3_list --bucket my-bucket` |
488
+ | `arge s3_download` | Download file from S3 | `arge s3_download file.json --bucket my-bucket` |
489
+ | `arge analyze` | Analyze JSON export file | `arge analyze order.json` |
490
+ | `arge dry_run` | Analyze extraction scope without extracting | `arge dry_run Order 123 --max-depth 2` |
491
+
492
+ ### Installation Quick Start
493
+
494
+ ```bash
495
+ # For Rails projects (recommended)
496
+ echo 'gem "activerecord-graph-extractor"' >> Gemfile
497
+ bundle install
498
+ bundle exec arge version
499
+
500
+ # For global installation
501
+ gem install activerecord-graph-extractor
502
+ arge version
503
+
504
+ # Verify installation
505
+ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/florrain/activerecord-graph-extractor/main/scripts/verify_installation.rb)"
506
+ ```
507
+
508
+ ### Common Usage Patterns
509
+
510
+ ```bash
511
+ # Basic extraction
512
+ arge extract Order 123 --output order.json
513
+
514
+ # Extract with depth limit
515
+ arge extract Order 123 --max-depth 2 --output order.json
516
+
517
+ # Extract to S3
518
+ arge extract_to_s3 Order 123 --bucket my-bucket --region us-east-1
519
+
520
+ # Import with validation
521
+ arge import order.json --validate
522
+
523
+ # Dry run before large extraction
524
+ arge dry_run Order 123 --max-depth 3 --output analysis.json
525
+
526
+ # List recent S3 extractions
527
+ arge s3_list --bucket my-bucket --prefix extractions/$(date +%Y/%m)
528
+ ```
529
+
530
+ ## License
531
+
532
+ The gem is available as open source under the terms of the MIT License.
data/Rakefile ADDED
@@ -0,0 +1,36 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+ require "rubocop/rake_task"
6
+
7
+ RSpec::Core::RakeTask.new(:spec)
8
+ RuboCop::RakeTask.new
9
+
10
+ desc "Run all tests and checks"
11
+ task test: [:spec, :rubocop]
12
+
13
+ task default: :test
14
+
15
+ desc "Run tests with coverage report"
16
+ task :coverage do
17
+ ENV['COVERAGE'] = 'true'
18
+ Rake::Task[:spec].invoke
19
+ end
20
+
21
+ desc "Generate documentation"
22
+ task :doc do
23
+ sh "yard doc"
24
+ end
25
+
26
+ desc "Setup development environment"
27
+ task :setup do
28
+ sh "bundle install"
29
+ puts "✅ Development environment setup complete!"
30
+ puts
31
+ puts "Available tasks:"
32
+ puts " rake spec # Run tests"
33
+ puts " rake rubocop # Run code style checks"
34
+ puts " rake test # Run tests and style checks"
35
+ puts " rake coverage # Run tests with coverage report"
36
+ end
@@ -0,0 +1,64 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "lib/activerecord_graph_extractor/version"
4
+
5
+ Gem::Specification.new do |spec|
6
+ spec.name = "activerecord-graph-extractor"
7
+ spec.version = ActiveRecordGraphExtractor::VERSION
8
+ spec.authors = ["Florian Lorrain"]
9
+ spec.email = ["lorrain.florian@gmail.com"]
10
+
11
+ spec.summary = "Extract and import complex ActiveRecord object graphs while preserving relationships"
12
+ spec.description = "A Ruby gem for extracting and importing complex ActiveRecord object graphs with smart dependency resolution, beautiful CLI progress visualization, and memory-efficient streaming. Perfect for data migration, testing, and environment synchronization."
13
+ spec.homepage = "https://github.com/florrain/activerecord-graph-extractor"
14
+ spec.license = "MIT"
15
+ spec.required_ruby_version = ">= 2.7.0"
16
+
17
+ spec.metadata["allowed_push_host"] = "https://rubygems.org"
18
+ spec.metadata["homepage_uri"] = spec.homepage
19
+ spec.metadata["source_code_uri"] = "https://github.com/florrain/activerecord-graph-extractor"
20
+ spec.metadata["changelog_uri"] = "https://github.com/florrain/activerecord-graph-extractor/blob/main/CHANGELOG.md"
21
+
22
+ # Specify which files should be added to the gem when it is released.
23
+ spec.files = Dir.chdir(File.expand_path(__dir__)) do
24
+ `git ls-files -z`.split("\x0").reject do |f|
25
+ (f == __FILE__) || f.match(%r{\A(?:(?:bin|test|spec|features)/|\.(?:git|travis|circleci)|appveyor)})
26
+ end
27
+ end
28
+
29
+ spec.bindir = "exe"
30
+ spec.executables = spec.files.grep(%r{\Aexe/}) { |f| File.basename(f) }
31
+ spec.require_paths = ["lib"]
32
+
33
+ # Core dependencies
34
+ spec.add_dependency "activerecord", ">= 6.0"
35
+ spec.add_dependency "activesupport", ">= 6.0"
36
+
37
+ # CLI dependencies
38
+ spec.add_dependency "thor", "~> 1.2"
39
+ spec.add_dependency "tty-progressbar", "~> 0.18"
40
+ spec.add_dependency "tty-spinner", "~> 0.9"
41
+ spec.add_dependency "tty-tree", "~> 0.4"
42
+ spec.add_dependency "pastel", "~> 0.8"
43
+ spec.add_dependency "tty-prompt", "~> 0.23"
44
+
45
+ # JSON streaming
46
+ spec.add_dependency "oj", "~> 3.13"
47
+ spec.add_dependency "yajl-ruby", ">= 1.3"
48
+
49
+ # S3 support
50
+ spec.add_dependency "aws-sdk-s3", "~> 1.0"
51
+
52
+ # Development dependencies
53
+ spec.add_development_dependency "bundler", "~> 2.0"
54
+ spec.add_development_dependency "rake", "~> 13.0"
55
+ spec.add_development_dependency "rspec", "~> 3.12"
56
+ spec.add_development_dependency "rubocop", "~> 1.57"
57
+ spec.add_development_dependency "sqlite3", "~> 1.6"
58
+ spec.add_development_dependency "database_cleaner", "~> 2.0"
59
+ spec.add_development_dependency "factory_bot", "~> 6.2"
60
+ spec.add_development_dependency "simplecov", "~> 0.22"
61
+ spec.add_development_dependency "rubocop-rspec", "~> 2.25"
62
+ spec.add_development_dependency "pry", "~> 0.14"
63
+ spec.add_development_dependency "pry-byebug", "~> 3.10"
64
+ end