activerecord-graph-extractor 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/docs/usage.md ADDED
@@ -0,0 +1,363 @@
1
+ # Usage Guide
2
+
3
+ This guide covers common usage patterns and best practices for the ActiveRecord Graph Extractor gem.
4
+
5
+ ## Getting Started
6
+
7
+ ### Installation
8
+
9
+ Add the gem to your application's Gemfile:
10
+
11
+ ```ruby
12
+ gem 'activerecord-graph-extractor'
13
+ ```
14
+
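+ Then install it with Bundler:
+
+ ```bash
+ bundle install
+ ```
+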
15
+ Or install directly:
16
+
17
+ ```bash
18
+ gem install activerecord-graph-extractor
19
+ ```
20
+
21
+ ### Basic Usage
22
+
23
+ #### Extracting Data
24
+
25
+ ```ruby
26
+ require 'activerecord_graph_extractor'
27
+
28
+ # Find your root object
29
+ order = Order.find(12345)
30
+
31
+ # Create extractor with default configuration
32
+ extractor = ActiveRecordGraphExtractor::Extractor.new
33
+
34
+ # Extract to file
35
+ result = extractor.extract_to_file(order, 'order_export.json')
36
+ puts "Extracted #{result['metadata']['total_records']} records"
37
+ ```
38
+
39
+ #### Importing Data
40
+
41
+ ```ruby
42
+ # Create importer
43
+ importer = ActiveRecordGraphExtractor::Importer.new
44
+
45
+ # Import from file
46
+ result = importer.import_from_file('order_export.json')
47
+ puts "Imported #{result['imported_records']} records"
48
+ ```
49
+
50
+ ## Configuration
51
+
52
+ ### Global Configuration
53
+
54
+ ```ruby
55
+ ActiveRecordGraphExtractor.configure do |config|
56
+ config.max_depth = 3 # Limit relationship depth
57
+ config.include_relationship('user') # Only include specific relationships
58
+ config.include_relationship('products')
59
+ config.exclude_model('History') # Skip these models
60
+ config.exclude_model('AuditLog')
61
+ config.batch_size = 500 # Process in batches
62
+ config.stream_json = true # Use streaming for large files
63
+ end
64
+ ```
65
+
66
+ ### Per-Extraction Configuration
67
+
68
+ ```ruby
69
+ extractor = ActiveRecordGraphExtractor::Extractor.new
70
+ result = extractor.extract(order, {
71
+ max_depth: 3,
72
+ custom_serializers: {
73
+ 'User' => ->(user) { { id: user.id, email: user.email } }
74
+ }
75
+ })
76
+ ```
77
+
78
+ ## Progress Monitoring
79
+
80
+ ### Programmatic Progress Tracking
81
+
82
+ ```ruby
83
+ ActiveRecordGraphExtractor.configure do |config|
84
+ config.progress_enabled = true
85
+ end
86
+
87
+ extractor = ActiveRecordGraphExtractor::Extractor.new
88
+ result = extractor.extract(order)
89
+ ```
90
+
91
+ ### CLI Progress Visualization
92
+
93
+ ```bash
94
+ # Extract with beautiful progress bars
95
+ arge extract Order 12345 \
96
+ --output export.json \
97
+ --progress \
98
+ --show-graph
99
+
100
+ # Import with progress
101
+ arge import export.json \
102
+ --progress \
103
+ --batch-size 500
104
+ ```
105
+
106
+ ## Advanced Configuration
107
+
108
+ ### Handling Circular References
109
+
110
+ ```ruby
111
+ ActiveRecordGraphExtractor.configure do |config|
112
+ config.handle_circular_references = true
113
+ config.max_depth = 5
114
+ end
115
+
116
+ extractor = ActiveRecordGraphExtractor::Extractor.new
117
+ result = extractor.extract(order)
118
+ ```
119
+
120
+ ### Memory Optimization
121
+
122
+ ```ruby
123
+ # For large datasets
124
+ ActiveRecordGraphExtractor.configure do |config|
125
+ config.stream_json = true # Stream JSON writing/reading
126
+ config.batch_size = 250 # Smaller batches
127
+ config.progress_enabled = true # Monitor progress
128
+ config.exclude_model('History')
129
+ config.exclude_model('AuditLog')
130
+ config.exclude_model('UserEmailHistory')
131
+ end
132
+ ```
133
+
134
+ ## Error Handling
135
+
136
+ ### Extraction Errors
137
+
138
+ ```ruby
139
+ begin
140
+ result = extractor.extract_to_file(order, 'export.json')
141
+ rescue ActiveRecordGraphExtractor::ExtractionError => e
142
+ puts "Extraction failed: #{e.message}"
143
+ end
144
+ ```
145
+
146
+ ### Import Errors
147
+
148
+ ```ruby
149
+ begin
150
+ result = importer.import_from_file('export.json')
151
+ rescue ActiveRecordGraphExtractor::ImportError => e
152
+ puts "Import failed: #{e.message}"
153
+ end
154
+ ```
155
+
156
+ ## Performance Considerations
157
+
158
+ ### Extraction Performance
159
+
160
+ 1. **Limit Depth**: Use `max_depth` to prevent deep traversal
161
+ 2. **Filter Models**: Use `exclude_model` to skip unnecessary tables
162
+ 3. **Filter Relationships**: Use `include_relationship` for specific paths
163
+ 4. **Batch Size**: Adjust `batch_size` based on memory constraints
164
+ 5. **Streaming**: Enable `stream_json` for large datasets (see the sketch after this list)
165
+
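+ Taken together, a performance-conscious extraction setup might look like the sketch below. It only combines the options listed above; the specific values are illustrative and should be tuned for your schema.
+
+ ```ruby
+ ActiveRecordGraphExtractor.configure do |config|
+   config.max_depth = 3                     # 1. keep traversal shallow
+   config.exclude_model('AuditLog')         # 2. skip models you don't need
+   config.include_relationship('products')  # 3. follow only the relevant paths
+   config.batch_size = 500                  # 4. balance speed against memory
+   config.stream_json = true                # 5. stream output for large graphs
+ end
+ ```
+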
166
+ ### Import Performance
167
+
168
+ 1. **Skip Validations**: Use `validate_records: false` for trusted data
169
+ 2. **Larger Batches**: Increase `batch_size` for faster imports
170
+ 3. **Transaction Strategy**: Use `use_transactions: true` for consistency (see the sketch after this list)
171
+
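+ As a sketch, a faster import of trusted data could be configured like this. The `validate_records` and `batch_size` settings appear elsewhere in this guide; treating `use_transactions` as a configuration flag is an assumption, so check the gem's configuration reference before relying on it.
+
+ ```ruby
+ ActiveRecordGraphExtractor.configure do |config|
+   config.validate_records = false  # 1. skip validations for data you trust
+   config.batch_size = 1000         # 2. larger batches for throughput
+   config.use_transactions = true   # 3. assumed flag: wrap the import in transactions
+ end
+
+ importer = ActiveRecordGraphExtractor::Importer.new
+ result = importer.import_from_file('export.json')
+ ```
+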
172
+ ## Best Practices
173
+
174
+ ### 1. Start Small
175
+
176
+ ```ruby
177
+ # Test with limited scope first
178
+ ActiveRecordGraphExtractor.configure do |config|
179
+ config.max_depth = 2
180
+ config.include_relationship('user')
181
+ config.include_relationship('products')
182
+ end
183
+
184
+ extractor = ActiveRecordGraphExtractor::Extractor.new
185
+ result = extractor.extract(order)
186
+ ```
187
+
188
+ ### 2. Use Dry Runs
189
+
190
+ ```ruby
191
+ # Test imports with validation enabled first
+ # (the CLI equivalent is: arge import export.json --dry-run)
192
+ ActiveRecordGraphExtractor.configure do |config|
193
+ config.validate_records = true
194
+ end
195
+
196
+ importer = ActiveRecordGraphExtractor::Importer.new
197
+ result = importer.import_from_file('export.json')
198
+ puts "Imported #{result['imported_records']} records"
199
+ ```
200
+
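+ On the extraction side, the same idea is available through the extractor's `dry_run` method, demonstrated at length in the bundled dry-run example script. A minimal sketch, using the report keys from that script:
+
+ ```ruby
+ order = Order.find(12345)
+ extractor = ActiveRecordGraphExtractor::Extractor.new
+
+ # Estimate the scope before extracting anything
+ analysis = extractor.dry_run(order)
+ puts "Estimated records: #{analysis['extraction_scope']['total_estimated_records']}"
+ puts "Estimated file size: #{analysis['estimated_file_size']['human_readable']}"
+ ```
+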
201
+ ### 3. Monitor Memory Usage
202
+
203
+ ```ruby
204
+ ActiveRecordGraphExtractor.configure do |config|
205
+ config.progress_enabled = true
206
+ config.batch_size = 500 # Adjust based on memory constraints
207
+ end
208
+ ```
209
+
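+ To see actual memory usage while tuning `batch_size`, you can wrap the extraction in a rough helper like the one below. This is not part of the gem and the `/proc` read is Linux-only; it simply reports the process's resident set size before and after the run.
+
+ ```ruby
+ # Rough, Linux-only helper (not part of the gem): report resident memory
+ def log_rss(label)
+   rss_kb = File.read('/proc/self/status')[/VmRSS:\s*(\d+)/, 1].to_i
+   puts "#{label}: #{(rss_kb / 1024.0).round(1)} MB"
+ end
+
+ order = Order.find(12345)
+ extractor = ActiveRecordGraphExtractor::Extractor.new
+
+ log_rss('Before extraction')
+ extractor.extract_to_file(order, 'export.json')
+ log_rss('After extraction')
+ ```
+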
210
+ ### 4. Handle Large Datasets
211
+
212
+ ```ruby
213
+ # For datasets > 10k records
214
+ ActiveRecordGraphExtractor.configure do |config|
215
+ config.stream_json = true
216
+ config.batch_size = 250
217
+ config.progress_enabled = true
218
+ config.exclude_model('History')
219
+ config.exclude_model('AuditLog')
220
+ end
221
+ ```
222
+
223
+ ### 5. Version Your Exports
224
+
225
+ ```ruby
226
+ timestamp = Time.now.strftime('%Y%m%d_%H%M%S')
227
+ filename = "order_#{order.id}_#{timestamp}.json"
228
+ result = extractor.extract_to_file(order, filename)
229
+ ```
230
+
231
+ ## Common Patterns
232
+
233
+ ### Environment Migration
234
+
235
+ ```ruby
236
+ # Production -> Staging
237
+ production_order = Order.find(id)
238
+
239
+ ActiveRecordGraphExtractor.configure do |config|
240
+ config.exclude_model('History')
241
+ config.exclude_model('AuditLog')
242
+ config.max_depth = 4
243
+ end
244
+
245
+ extractor = ActiveRecordGraphExtractor::Extractor.new
246
+
247
+ # Export from production
248
+ result = extractor.extract_to_file(production_order, 'prod_export.json')
249
+
250
+ # Import to staging (different environment)
251
+ ActiveRecordGraphExtractor.configure do |config|
252
+ config.validate_records = true
253
+ end
254
+
255
+ importer = ActiveRecordGraphExtractor::Importer.new
256
+ staging_result = importer.import_from_file('prod_export.json')
257
+ ```
258
+
259
+ ### Test Data Setup
260
+
261
+ ```ruby
262
+ # Extract test fixtures
263
+ test_orders = Order.joins(:user)
264
+ .where(users: { email: 'test@example.com' })
265
+ .limit(3)
266
+
267
+ extractor = ActiveRecordGraphExtractor::Extractor.new
268
+
269
+ test_orders.each_with_index do |order, i|
270
+ ActiveRecordGraphExtractor.configure do |config|
271
+ config.max_depth = 2
272
+ end
273
+
274
+ extractor.extract_to_file(order, "test_order_#{i + 1}.json")
275
+ end
276
+ ```
277
+
278
+ ### Debugging Data Issues
279
+
280
+ ```ruby
281
+ # Extract with maximum detail for debugging
282
+ ActiveRecordGraphExtractor.configure do |config|
283
+ config.max_depth = 10
284
+ config.handle_circular_references = true
285
+ config.progress_enabled = true
286
+ end
287
+
288
+ extractor = ActiveRecordGraphExtractor::Extractor.new
289
+ result = extractor.extract(problematic_record)
290
+ ```
291
+
292
+ ## CLI Quick Reference
293
+
294
+ ```bash
295
+ # Basic extraction
296
+ arge extract Order 12345 -o export.json
297
+
298
+ # Advanced extraction
299
+ arge extract Order 12345 \
300
+ --output export.json \
301
+ --max-depth 3 \
302
+ --include-relationships user,products,partner \
303
+ --exclude-models History,AuditLog \
304
+ --batch-size 500 \
305
+ --progress \
306
+ --show-graph \
307
+ --stream
308
+
309
+ # Basic import
310
+ arge import export.json
311
+
312
+ # Advanced import
313
+ arge import export.json \
314
+ --batch-size 1000 \
315
+ --skip-validations \
316
+ --progress
317
+
318
+ # Dry run import
319
+ arge import export.json --dry-run
320
+
321
+ # Analyze export file
322
+ arge analyze export.json
323
+
324
+ # Get help
325
+ arge help extract
326
+ arge help import
327
+ ```
328
+
329
+ ## Troubleshooting
330
+
331
+ ### Memory Issues
332
+
333
+ If you encounter memory issues:
334
+
335
+ 1. Reduce `batch_size`
336
+ 2. Enable `stream_json`
337
+ 3. Enable progress monitoring (`progress_enabled = true`) to see where memory grows
338
+ 4. Use `exclude_model` to skip large tables
339
+
340
+ ### Performance Issues
341
+
342
+ For slow extraction/import:
343
+
344
+ 1. Increase `batch_size` (if memory allows)
345
+ 2. Use `validate_records: false` for imports
346
+ 3. Limit `max_depth`
347
+ 4. Filter relationships with `include_relationship`
348
+
349
+ ### Circular Reference Errors
350
+
351
+ If you hit circular references:
352
+
353
+ 1. Use `handle_circular_references: true`
354
+ 2. Set appropriate `max_depth`
355
+ 3. Use `include_relationship` to avoid problematic paths
356
+
357
+ ### Validation Errors
358
+
359
+ If import validation fails:
360
+
361
+ 1. Check for missing required fields
362
+ 2. Verify foreign key relationships
363
+ 3. Consider `validate_records: false` if the data is trusted (see the sketch below)
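+
+ A common pattern is to attempt the import with validations enabled, inspect the failure, and only then retry with validations disabled for data you trust. A sketch using the options documented above:
+
+ ```ruby
+ begin
+   ActiveRecordGraphExtractor::Importer.new.import_from_file('export.json')
+ rescue ActiveRecordGraphExtractor::ImportError => e
+   puts "Import validation failed: #{e.message}"
+
+   # Only retry without validations if the exported data is known to be good
+   ActiveRecordGraphExtractor.configure { |config| config.validate_records = false }
+   ActiveRecordGraphExtractor::Importer.new.import_from_file('export.json')
+ end
+ ```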
@@ -0,0 +1,227 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ # Example: Dry Run Analysis
5
+ # This script demonstrates how to use the dry run feature to analyze
6
+ # what would be extracted without performing the actual extraction.
7
+
8
+ require 'bundler/setup'
9
+ require 'activerecord_graph_extractor'
+ require 'json' # used by JSON.pretty_generate in Example 5
10
+
11
+ # This would typically be in your Rails application
12
+ # For this example, we'll assume you have User and Order models
13
+
14
+ puts "🔍 ActiveRecord Graph Extractor - Dry Run Examples"
15
+ puts "=" * 60
16
+ puts
17
+
18
+ # Example 1: Basic Dry Run Analysis
19
+ puts "1. Basic Dry Run Analysis"
20
+ puts "-" * 30
21
+
22
+ begin
23
+ # Find a user to analyze
24
+ user = User.first
25
+
26
+ if user
27
+ extractor = ActiveRecordGraphExtractor::Extractor.new
28
+
29
+ puts "Analyzing User ##{user.id}..."
30
+ analysis = extractor.dry_run(user)
31
+
32
+ puts "✅ Analysis completed in #{analysis['analysis_time']} seconds"
33
+ puts
34
+ puts "📊 Summary:"
35
+ puts " Models involved: #{analysis['extraction_scope']['total_models']}"
36
+ puts " Estimated records: #{analysis['extraction_scope']['total_estimated_records']}"
37
+ puts " Estimated file size: #{analysis['estimated_file_size']['human_readable']}"
38
+ puts " Estimated extraction time: #{analysis['performance_estimates']['estimated_extraction_time_human']}"
39
+ puts
40
+
41
+ # Show model breakdown
42
+ puts "📋 Records by Model:"
43
+ analysis['estimated_counts_by_model'].each do |model, count|
44
+ percentage = (count.to_f / analysis['extraction_scope']['total_estimated_records'] * 100).round(1)
45
+ puts " #{model.ljust(20)} #{count.to_s.rjust(8)} (#{percentage}%)"
46
+ end
47
+ puts
48
+ else
49
+ puts "❌ No users found in database"
50
+ end
51
+ rescue => e
52
+ puts "❌ Error: #{e.message}"
53
+ end
54
+
55
+ # Example 2: Dry Run with Custom Depth
56
+ puts "2. Dry Run with Custom Max Depth"
57
+ puts "-" * 35
58
+
59
+ begin
60
+ user = User.first
61
+
62
+ if user
63
+ extractor = ActiveRecordGraphExtractor::Extractor.new
64
+
65
+ # Compare different depths
66
+ [1, 2, 3].each do |depth|
67
+ analysis = extractor.dry_run(user, max_depth: depth)
68
+
69
+ puts "Depth #{depth}:"
70
+ puts " Models: #{analysis['extraction_scope']['total_models']}"
71
+ puts " Records: #{analysis['extraction_scope']['total_estimated_records']}"
72
+ puts " File size: #{analysis['estimated_file_size']['human_readable']}"
73
+ puts " Time: #{analysis['performance_estimates']['estimated_extraction_time_human']}"
74
+ puts
75
+ end
76
+ end
77
+ rescue => e
78
+ puts "❌ Error: #{e.message}"
79
+ end
80
+
81
+ # Example 3: Analyzing Multiple Objects
82
+ puts "3. Multiple Objects Analysis"
83
+ puts "-" * 30
84
+
85
+ begin
86
+ users = User.limit(3).to_a
87
+
88
+ if users.any?
89
+ extractor = ActiveRecordGraphExtractor::Extractor.new
90
+
91
+ analysis = extractor.dry_run(users)
92
+
93
+ puts "Analyzing #{users.size} users..."
94
+ puts "✅ Analysis completed"
95
+ puts
96
+ puts "📊 Summary:"
97
+ puts " Root objects: #{analysis['root_objects']['count']}"
98
+ puts " Total estimated records: #{analysis['extraction_scope']['total_estimated_records']}"
99
+ puts " Estimated file size: #{analysis['estimated_file_size']['human_readable']}"
100
+ puts
101
+ end
102
+ rescue => e
103
+ puts "❌ Error: #{e.message}"
104
+ end
105
+
106
+ # Example 4: Analyzing Warnings and Recommendations
107
+ puts "4. Warnings and Recommendations"
108
+ puts "-" * 35
109
+
110
+ begin
111
+ # Try to find a user with many relationships to trigger warnings
112
+ user = User.joins(:orders).group('users.id').having('COUNT(orders.id) > 0').first
113
+
114
+ if user
115
+ extractor = ActiveRecordGraphExtractor::Extractor.new
116
+
117
+ analysis = extractor.dry_run(user, max_depth: 5)
118
+
119
+ # Show warnings
120
+ if analysis['warnings'].any?
121
+ puts "⚠️ Warnings:"
122
+ analysis['warnings'].each do |warning|
123
+ puts " #{warning['type'].upcase} (#{warning['severity']}): #{warning['message']}"
124
+ end
125
+ puts
126
+ else
127
+ puts "✅ No warnings detected"
128
+ puts
129
+ end
130
+
131
+ # Show recommendations
132
+ if analysis['recommendations'].any?
133
+ puts "💡 Recommendations:"
134
+ analysis['recommendations'].each do |rec|
135
+ puts " #{rec['type'].upcase}: #{rec['message']}"
136
+ puts " → #{rec['action']}"
137
+ puts
138
+ end
139
+ else
140
+ puts "✅ No specific recommendations"
141
+ puts
142
+ end
143
+ end
144
+ rescue => e
145
+ puts "❌ Error: #{e.message}"
146
+ end
147
+
148
+ # Example 5: Saving Analysis Report
149
+ puts "5. Saving Analysis Report"
150
+ puts "-" * 28
151
+
152
+ begin
153
+ user = User.first
154
+
155
+ if user
156
+ extractor = ActiveRecordGraphExtractor::Extractor.new
157
+
158
+ analysis = extractor.dry_run(user)
159
+
160
+ # Save to file
161
+ report_file = "dry_run_analysis_#{Time.now.strftime('%Y%m%d_%H%M%S')}.json"
162
+ File.write(report_file, JSON.pretty_generate(analysis))
163
+
164
+ puts "📄 Analysis report saved to: #{report_file}"
165
+ puts " File size: #{File.size(report_file)} bytes"
166
+ puts
167
+
168
+ # Show a sample of the JSON structure
169
+ puts "📋 Report structure:"
170
+ analysis.keys.each do |key|
171
+ puts " #{key}"
172
+ end
173
+ puts
174
+ end
175
+ rescue => e
176
+ puts "❌ Error: #{e.message}"
177
+ end
178
+
179
+ # Example 6: Decision Making Based on Analysis
180
+ puts "6. Decision Making Example"
181
+ puts "-" * 28
182
+
183
+ begin
184
+ user = User.first
185
+
186
+ if user
187
+ extractor = ActiveRecordGraphExtractor::Extractor.new
188
+
189
+ analysis = extractor.dry_run(user)
190
+
191
+ total_records = analysis['extraction_scope']['total_estimated_records']
192
+ file_size_mb = analysis['estimated_file_size']['bytes'] / (1024.0 * 1024)
193
+ extraction_time = analysis['performance_estimates']['estimated_extraction_time_seconds']
194
+
195
+ puts "📊 Analysis Results:"
196
+ puts " Records: #{total_records}"
197
+ puts " File size: #{file_size_mb.round(1)} MB"
198
+ puts " Estimated time: #{extraction_time.round(1)} seconds"
199
+ puts
200
+
201
+ # Make decisions based on analysis
202
+ if total_records > 10_000
203
+ puts "🚨 Large dataset detected!"
204
+ puts " Recommendation: Consider using batch processing or reducing max_depth"
205
+ elsif file_size_mb > 50
206
+ puts "📦 Large file expected!"
207
+ puts " Recommendation: Consider uploading directly to S3"
208
+ elsif extraction_time > 300 # 5 minutes
209
+ puts "⏰ Long extraction time expected!"
210
+ puts " Recommendation: Run during off-peak hours"
211
+ else
212
+ puts "✅ Extraction looks manageable - proceed with confidence!"
213
+ end
214
+ puts
215
+ end
216
+ rescue => e
217
+ puts "❌ Error: #{e.message}"
218
+ end
219
+
220
+ puts "🎉 Dry run examples completed!"
221
+ puts
222
+ puts "💡 Tips:"
223
+ puts " • Use dry run before large extractions to understand scope"
224
+ puts " • Pay attention to warnings and recommendations"
225
+ puts " • Save analysis reports for documentation"
226
+ puts " • Compare different max_depth values to optimize performance"
227
+ puts " • Use dry run results to make informed decisions about extraction strategy"