grainery 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: ab9b7d46f3a3716cf0f043060f4f267a9be926ffae12cdb477dd85df016529e3
4
+ data.tar.gz: 44edc342af67c5fce8f77559830eada34772b18f5ac226c76886d1fe65a55fa0
5
+ SHA512:
6
+ metadata.gz: 5eea1be971b0ee1b02619377c1b2ec44fc68e8ea302653164a754d04493d31bbe914906391061ca2eaa773794890f6b4a49239c15aefbdc0c8386daa20bef518
7
+ data.tar.gz: 40258587446259c0e5770b61b1d7282fb7750d64b14aed6983cee287338e4d65338a7b363455c292f1fe56cfcc7f9dd93e71af12f1618a506c7dec71ef5fa35e
data/CHANGELOG.md ADDED
@@ -0,0 +1,33 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2025-10-01
9
+
10
+ ### Added
11
+ - Initial release
12
+ - Automatic database detection and configuration generation
13
+ - Multi-database support (SQL Server, MySQL, PostgreSQL)
14
+ - Dependency-aware seed loading with topological sort
15
+ - One seed file per table organization
16
+ - Configurable per-project via `config/grainery.yml`
17
+ - Lookup tables support (harvest all records)
18
+ - Test database management tasks
19
+ - Preserves custom `db/seeds.rb` (loaded last)
20
+ - Clean separation of harvested vs custom seeds
21
+ - Rake tasks:
22
+ - `grainery:init_config` - Initialize configuration
23
+ - `grainery:generate` - Harvest with limit (100 records per table)
24
+ - `grainery:generate_all` - Harvest ALL records
25
+ - `grainery:load` - Load seeds in dependency order
26
+ - `grainery:clean` - Clean grainery directory
27
+ - `test:db:setup_for_grainery` - Setup clean test database
28
+ - `test:db:seed_with_grainery` - Seed test database
29
+ - `test:db:reset_with_grainery` - Reset and seed test database
30
+ - `test:db:clean` - Truncate all test tables
31
+ - `test:db:stats` - Show test database statistics
32
+
33
+ [0.1.0]: https://github.com/mpantel/grainery/releases/tag/v0.1.0
data/MIT-LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Michail Pantelelis
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,313 @@
1
+ # Grainery
2
+
3
+ Database seed storage system for Rails applications. Extract database records and generate seed files organized by database with automatic dependency resolution. Like a grainery stores grain, this gem stores and organizes your database seeds.
4
+
5
+ ## Features
6
+
7
+ - ✅ Automatic database detection
8
+ - ✅ Dependency-aware loading (topological sort)
9
+ - ✅ Multi-database support
10
+ - ✅ Configurable per project
11
+ - ✅ Preserves custom seeds
12
+ - ✅ One seed file per table
13
+ - ✅ Clean separation of concerns
14
+ - ✅ Supports SQL Server, MySQL, PostgreSQL
15
+ - ✅ Test database management tasks
16
+
17
+ ## Installation
18
+
19
+ Add this line to your application's Gemfile:
20
+
21
+ ```ruby
22
+ gem 'grainery'
23
+ ```
24
+
25
+ And then execute:
26
+
27
+ ```bash
28
+ bundle install
29
+ ```
30
+
31
+ ## Usage
32
+
33
+ ### 1. Initialize Configuration
34
+
35
+ ```bash
36
+ rake grainery:init_config
37
+ ```
38
+
39
+ This auto-detects all databases and creates `config/grainery.yml`.
40
+
41
+ ### 2. Harvest Data
42
+
43
+ ```bash
44
+ # Harvest with limit (100 records per table)
45
+ rake grainery:generate
46
+
47
+ # Harvest ALL records (use with caution)
48
+ rake grainery:generate_all
49
+ ```
50
+
51
+ ### 3. Load Seeds
52
+
53
+ ```bash
54
+ rake grainery:load
55
+ ```
56
+
57
+ This loads:
58
+ 1. Harvested seeds (in dependency order)
59
+ 2. Custom seeds from `db/seeds.rb` (last)
60
+
61
+ ## Directory Structure
62
+
63
+ ```
64
+ db/
65
+ ├── grainery/ # Harvested seeds (auto-generated)
66
+ │ ├── load_order.txt # Load order respecting dependencies
67
+ │ ├── primary/ # Primary database
68
+ │ │ ├── users.rb
69
+ │ │ ├── posts.rb
70
+ │ │ └── comments.rb
71
+ │ ├── other/ # Other database
72
+ │ │ └── projects.rb
73
+ │ └── banking/ # Banking database
74
+ │ └── employees.rb
75
+ └── seeds.rb # Custom seeds (loaded last)
76
+ ```
77
+
78
+ ## Configuration
79
+
80
+ `config/grainery.yml`:
81
+
82
+ ```yaml
83
+ # Path for harvested seed files
84
+ grainery_path: db/grainery
85
+
86
+ # Database connection mappings
87
+ database_connections:
88
+ primary:
89
+ connection: test
90
+ adapter: sqlserver
91
+ model_base_class: ApplicationRecord
92
+ other:
93
+ connection: other
94
+ adapter: sqlserver
95
+ model_base_class: OtherDB
96
+ # ... other databases
97
+
98
+ # Lookup tables (harvest all records)
99
+ lookup_tables: []
100
+ ```
101
+
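As a rough illustration of how this file can be read, the sketch below parses an inline copy of the configuration with `YAML.safe_load` and normalizes each `database_connections` entry. The `normalize_connections` helper and the shorthand `other: other` entry are assumptions made for the example, with defaults modeled on the gem's `load_database_connections`; it is not the gem's public API.

```ruby
require 'yaml'

# Hypothetical inline copy of config/grainery.yml, for illustration only
yaml_text = <<~YAML
  grainery_path: db/grainery
  database_connections:
    primary:
      connection: test
      adapter: sqlserver
      model_base_class: ApplicationRecord
    other: other
  lookup_tables: []
YAML

# Turn each entry into a symbol-keyed hash, filling in defaults
# when only a connection name is given.
def normalize_connections(config)
  (config['database_connections'] || {}).each_with_object({}) do |(name, entry), result|
    result[name.to_sym] =
      if entry.is_a?(Hash)
        {
          connection: entry['connection'].to_sym,
          adapter: (entry['adapter'] || 'sqlserver').to_sym,
          model_base_class: entry['model_base_class'] || 'ApplicationRecord'
        }
      else
        { connection: entry.to_sym, adapter: :sqlserver, model_base_class: 'ApplicationRecord' }
      end
  end
end

config = YAML.safe_load(yaml_text, permitted_classes: [Symbol], aliases: true) || {}
connections = normalize_connections(config)
puts connections.inspect
```

Using `safe_load` rather than `load` keeps arbitrary Ruby objects from being instantiated out of the configuration file.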
102
+ ## Available Rake Tasks
103
+
104
+ ### Grainery Tasks
105
+
106
+ ```bash
107
+ # Initialize configuration
108
+ rake grainery:init_config
109
+
110
+ # Harvest data (with limit)
111
+ rake grainery:generate
112
+
113
+ # Harvest ALL records
114
+ rake grainery:generate_all
115
+
116
+ # Load harvested + custom seeds
117
+ rake grainery:load
118
+
119
+ # Clean grainery directory
120
+ rake grainery:clean
121
+ ```
122
+
123
+ ### Test Database Tasks
124
+
125
+ ```bash
126
+ # Setup clean test database (schema only)
127
+ rake test:db:setup_for_grainery
128
+ # or: rake db:test:setup_for_grainery
129
+
130
+ # Seed test database with grainery data
131
+ rake test:db:seed_with_grainery
132
+
133
+ # Reset and seed (one command)
134
+ rake test:db:reset_with_grainery
135
+ # or: rake db:test:reset_with_grainery
136
+
137
+ # Clean test database (truncate all tables)
138
+ rake test:db:clean
139
+ # or: rake db:test:clean
140
+
141
+ # Show test database statistics
142
+ rake test:db:stats
143
+ # or: rake db:test:stats
144
+ ```
145
+
146
+ ## Dependency Resolution
147
+
148
+ Grainer automatically:
149
+ 1. Analyzes `belongs_to` associations
150
+ 2. Builds dependency graph
151
+ 3. Performs topological sort
152
+ 4. Generates `load_order.txt`
153
+
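The steps above can be sketched in plain Ruby. This is a minimal depth-first topological sort over a hand-written dependency hash, assuming table names as strings stand in for the models that Grainery derives from `belongs_to` associations; the `graph` data is hypothetical.

```ruby
require 'set'

# Depth-first topological sort: every dependency is emitted before
# the node that depends on it. Raises if a cycle is found.
def topological_sort(graph)
  sorted = []
  visited = Set.new
  visiting = Set.new

  visit = lambda do |node|
    next if visited.include?(node)
    raise 'Circular dependency detected' if visiting.include?(node)

    visiting.add(node)
    graph.fetch(node, []).each { |dep| visit.call(dep) if graph.key?(dep) }
    visiting.delete(node)
    visited.add(node)
    sorted << node
  end

  graph.each_key { |node| visit.call(node) }
  sorted
end

# Hypothetical graph: each table lists the tables it belongs to
graph = {
  'comments' => %w[posts users],
  'posts' => %w[users categories],
  'users' => [],
  'categories' => []
}

order = topological_sort(graph)
puts order.inspect
```

Parents such as `users` and `categories` land ahead of `posts` and `comments`, which is the property the load order relies on.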
154
+ ### Example Load Order
155
+
156
+ ```
157
+ # PRIMARY Database
158
+ primary/users.rb
159
+ primary/categories.rb
160
+ primary/posts.rb
161
+ primary/comments.rb
162
+
163
+ # OTHER Database
164
+ other/departments.rb
165
+ other/projects.rb
166
+ ```
167
+
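Reading such a file back is straightforward. The sketch below, assuming the file format shown above, skips blank lines and `#` comments and returns the seed files in order; `seed_files_in_order` is a hypothetical helper name, not part of the gem's API.

```ruby
# Hypothetical contents of db/grainery/load_order.txt
load_order_text = <<~TXT
  # PRIMARY Database
  primary/users.rb
  primary/posts.rb

  # OTHER Database
  other/projects.rb
TXT

# Keep only real entries, preserving the order in which they appear.
def seed_files_in_order(text)
  text.each_line.map(&:strip).reject { |line| line.empty? || line.start_with?('#') }
end

files = seed_files_in_order(load_order_text)
puts files
```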
168
+ ## Lookup Tables
169
+
170
+ For small reference tables (statuses, types, categories), Grainery can harvest **all records** instead of the limited sample taken from other tables.
171
+
172
+ Add to `config/grainery.yml`:
173
+ ```yaml
174
+ lookup_tables:
175
+ - invoice_statuses
176
+ - user_roles
177
+ - categories
178
+ ```
179
+
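The selection rule can be sketched as follows, assuming rows are held in a plain array; `records_to_harvest` and the sample table names are hypothetical stand-ins for the ActiveRecord queries the gem runs.

```ruby
require 'set'

# Hypothetical lookup tables, as they might be listed in config/grainery.yml
LOOKUP_TABLES = Set['invoice_statuses', 'user_roles'].freeze

# Lookup tables are harvested in full; other tables honor the limit, if any.
def records_to_harvest(table_name, rows, limit: nil, lookup_tables: LOOKUP_TABLES)
  return rows if lookup_tables.include?(table_name)

  limit ? rows.first(limit) : rows
end

rows = (1..250).to_a # stand-in for 250 database records
sample = records_to_harvest('users', rows, limit: 100)
lookup = records_to_harvest('user_roles', rows, limit: 100)
puts sample.size
puts lookup.size
```

A table listed under `lookup_tables` ignores the limit, so it is worth keeping that list to genuinely small reference tables.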
180
+ ## Seed File Format
181
+
182
+ Each table gets its own seed file:
183
+
184
+ ```ruby
185
+ # Harvested from primary database: users
186
+ # Records: 100
187
+ # Generated: 2025-10-01 10:30:00
188
+
189
+ User.create!([
190
+ {
191
+ email: "user1@example.com",
192
+ name: "John Doe",
193
+ active: true
194
+ },
195
+ {
196
+ email: "user2@example.com",
197
+ name: "Jane Smith",
198
+ active: true
199
+ }
200
+ ])
201
+ ```
202
+
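To illustrate how attribute values might be rendered as Ruby literals in such a file, here is a small sketch. It assumes column types are given as symbols and covers only a few types; `format_value` and `seed_hash` are hypothetical helpers for this example.

```ruby
require 'date'

# Render one value as Ruby source text, based on its column type.
def format_value(value, type)
  return 'nil' if value.nil?

  case type
  when :string, :text then value.to_s.inspect
  when :integer then value.to_i.to_s
  when :boolean then value ? 'true' : 'false'
  when :date then "Date.parse(#{value.to_s.inspect})"
  else value.inspect
  end
end

# Render one record as a hash literal, one attribute per line.
def seed_hash(attributes, types)
  body = attributes.map { |name, value| "    #{name}: #{format_value(value, types[name])}" }
  "  {\n#{body.join(",\n")}\n  }"
end

# Hypothetical record and column types
row = { email: 'user1@example.com', active: true, born_on: Date.new(1990, 5, 17) }
types = { email: :string, active: :boolean, born_on: :date }

literal = seed_hash(row, types)
puts literal
```

Strings go through `inspect` so that quotes and backslashes in the data are escaped rather than breaking the generated file.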
203
+ ## Custom Seeds
204
+
205
+ Your custom seed logic in `db/seeds.rb` is **preserved and loaded last**.
206
+
207
+ Example `db/seeds.rb`:
208
+ ```ruby
209
+ # Custom seed logic
210
+ puts "Creating admin user..."
211
+ User.find_or_create_by!(email: 'admin@example.com') do |user|
212
+ user.name = 'Admin'
213
+ user.role = 'admin'
214
+ end
215
+
216
+ puts "Setting up application defaults..."
217
+ Setting.create!(key: 'app_name', value: 'My App')
218
+ ```
219
+
220
+ ## Use Cases
221
+
222
+ ### Development
223
+ ```bash
224
+ # Harvest production-like data for development
225
+ rake grainery:generate
226
+ rake grainery:load
227
+ ```
228
+
229
+ ### Testing
230
+ ```bash
231
+ # Create test fixtures
232
+ rake grainery:generate
233
+ # In test setup, load specific seeds as needed
234
+ ```
235
+
236
+ ### Staging
237
+ ```bash
238
+ # Harvest production data (anonymize sensitive fields yourself; harvested files contain the raw column values)
239
+ rake grainery:generate_all
240
+ # Deploy to staging
241
+ # Load on staging server
242
+ rake grainery:load
243
+ ```
244
+
245
+ ## Safety Features
246
+
247
+ 1. **Separate Directories**: Harvested seeds never touch `db/seeds.rb`
248
+ 2. **Dependency Order**: Foreign keys respected automatically
249
+ 3. **Custom Preservation**: Your `db/seeds.rb` always loads last
250
+ 4. **Clean Command**: `rake grainery:clean` removes only harvested files
251
+
252
+ ## Best Practices
253
+
254
+ 1. **Use Limits**: Start with `rake grainery:generate` (100 records)
255
+ 2. **Review Load Order**: Check `db/grainery/load_order.txt`
256
+ 3. **Test Loading**: Run `rake grainery:load` on clean database first
257
+ 4. **Commit Selectively**: Consider `.gitignore` for large grainery files
258
+ 5. **Custom Seeds Last**: Keep application-specific logic in `db/seeds.rb`
259
+
260
+ ## Troubleshooting
261
+
262
+ ### Circular Dependencies
263
+ If you see "Circular dependency detected", check for:
264
+ - Self-referential associations
265
+ - Circular foreign keys
266
+
267
+ Solution: Temporarily remove `optional: true` or `foreign_key: false`
268
+
269
+ ### Missing Records
270
+ If records fail to load:
271
+ 1. Check `load_order.txt` for correct ordering
272
+ 2. Verify foreign key constraints
273
+ 3. Review error messages in console output
274
+
275
+ ### Large Files
276
+ If seed files are too large:
277
+ ```bash
278
+ # Use the limited task rather than grainery:generate_all
279
+ rake grainery:generate # fixed limit of 100 records per table
280
+ ```
281
+
282
+ ## Example Workflow
283
+
284
+ ```bash
285
+ # 1. Initialize on first use
286
+ rake grainery:init_config
287
+
288
+ # 2. Harvest from production (with VPN/SSH tunnel)
289
+ RAILS_ENV=production rake grainery:generate
290
+
291
+ # 3. Review generated files
292
+ ls -la db/grainery/
293
+
294
+ # 4. Commit grainery files (optional)
295
+ git add db/grainery/
296
+ git commit -m "Add production seed data"
297
+
298
+ # 5. On another machine, pull and load
299
+ git pull
300
+ rake db:reset
301
+ rake grainery:load
302
+
303
+ # 6. Your custom seeds run automatically last
304
+ # db/seeds.rb is executed after all harvested seeds
305
+ ```
306
+
307
+ ## Contributing
308
+
309
+ Bug reports and pull requests are welcome on GitHub at https://github.com/mpantel/grainery.
310
+
311
+ ## License
312
+
313
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
@@ -0,0 +1,519 @@
1
+ require 'fileutils'
2
+ require 'yaml'
3
+
4
+ module Grainery
5
+ class Grainer
6
+ attr_reader :database_configs, :lookup_tables
7
+
8
+ # Security: Whitelist of allowed base class patterns
9
+ ALLOWED_BASE_CLASS_PATTERN = /\A(ApplicationRecord|ActiveRecord::Base|[A-Z][a-zA-Z0-9]*Record|[A-Z][a-zA-Z0-9]*(DB|Database|Connection))\z/
10
+
11
+ def initialize
12
+ @config = load_config
13
+ @grainery_path = load_grainery_path
14
+ @database_configs = load_database_connections
15
+ @lookup_tables = load_lookup_tables
16
+ end
17
+
18
+ def load_config
19
+ config_path = Rails.root.join('config/grainery.yml')
20
+
21
+ unless File.exist?(config_path)
22
+ puts " Warning: config/grainery.yml not found. Creating default configuration..."
23
+ create_default_config
24
+ end
25
+
26
+ YAML.safe_load_file(config_path, permitted_classes: [Symbol, Date, Time], aliases: true) || {}
27
+ rescue => e
28
+ puts " Warning: Could not load grainery.yml: #{e.message}"
29
+ {}
30
+ end
31
+
32
+ def create_default_config
33
+ config_path = Rails.root.join('config/grainery.yml')
34
+
35
+ # Detect databases and model base classes dynamically
36
+ detected_databases = detect_databases_and_models
37
+
38
+ # Build configuration hash
39
+ config = {
40
+ 'database_connections' => detected_databases,
41
+ 'grainery_path' => 'db/grainery',
42
+ 'lookup_tables' => [],
43
+ 'last_updated' => Time.now.to_s
44
+ }
45
+
46
+ # Write with custom formatting for better readability
47
+ write_config_file(config_path, config)
48
+
49
+ puts " ✓ Created config/grainery.yml with #{detected_databases.size} detected databases"
50
+ end
51
+
52
+ def detect_databases_and_models
53
+ puts " → Detecting databases and model base classes..."
54
+
55
+ Rails.application.eager_load!
56
+
57
+ # Find all model base classes
58
+ base_classes = ObjectSpace.each_object(Class).select do |klass|
59
+ klass < ActiveRecord::Base &&
60
+ !klass.abstract_class? &&
61
+ klass != ActiveRecord::Base &&
62
+ klass.descendants.any?
63
+ rescue
64
+ false
65
+ end
66
+
67
+ connections_map = {}
68
+
69
+ base_classes.each do |base_class|
70
+ begin
71
+ connection_config = base_class.connection_db_config
72
+ connection_name = connection_config.name.to_s
73
+ adapter = connection_config.adapter.to_s
74
+
75
+ logical_name = infer_logical_name(connection_name, base_class.name)
76
+
77
+ connections_map[logical_name] = {
78
+ 'connection' => connection_name,
79
+ 'adapter' => adapter,
80
+ 'model_base_class' => base_class.name
81
+ }
82
+
83
+ puts " ✓ Detected: #{logical_name} → #{base_class.name} (#{adapter})"
84
+ rescue => e
85
+ puts " ⚠ Warning: Could not detect connection for #{base_class.name}: #{e.message}"
86
+ end
87
+ end
88
+
89
+ # Ensure primary database is included
90
+ unless connections_map.key?('primary')
91
+ begin
92
+ primary_config = ApplicationRecord.connection_db_config
93
+ connections_map['primary'] = {
94
+ 'connection' => primary_config.name.to_s,
95
+ 'adapter' => primary_config.adapter.to_s,
96
+ 'model_base_class' => 'ApplicationRecord'
97
+ }
98
+ puts " ✓ Detected: primary → ApplicationRecord (#{primary_config.adapter})"
99
+ rescue => e
100
+ puts " ⚠ Warning: Could not detect primary database: #{e.message}"
101
+ end
102
+ end
103
+
104
+ connections_map
105
+ end
106
+
107
+ def infer_logical_name(connection_name, class_name)
108
+ logical = class_name.gsub(/DB$|Database$|Connection$/, '')
109
+ logical = logical.gsub(/([A-Z]+)([A-Z][a-z])/, '\\1_\\2')
110
+ .gsub(/([a-z\d])([A-Z])/, '\\1_\\2')
111
+ .downcase
112
+
113
+ if connection_name != 'test' && connection_name.length > logical.length
114
+ logical = connection_name.gsub(/_db$|_database$/, '')
115
+ end
116
+
117
+ logical
118
+ end
119
+
120
+ def write_config_file(path, config)
121
+ content = []
122
+
123
+ content << "# Data Harvest Configuration"
124
+ content << "# This file contains configuration for the data harvesting system"
125
+ content << "#"
126
+ content << "# Database Connections:"
127
+ content << "# Map of logical database names to connection name, adapter, and model base class"
128
+ content << "#"
129
+ content << "# Harvest Path:"
130
+ content << "# Where harvested seed files are stored (default: db/grainery)"
131
+ content << "#"
132
+ content << "# Lookup Tables:"
133
+ content << "# Tables to harvest ALL records (not just samples)"
134
+ content << ""
135
+ content << "# Path for harvested seed files"
136
+ content << "grainery_path: #{config['grainery_path'] || 'db/grainery'}"
137
+ content << ""
138
+ content << "# Database connection mappings"
139
+ content << "database_connections:"
140
+
141
+ config['database_connections'].each do |db_name, db_config|
142
+ content << " #{db_name}:"
143
+ content << " connection: #{db_config['connection']}"
144
+ content << " adapter: #{db_config['adapter']}"
145
+ content << " model_base_class: #{db_config['model_base_class']}"
146
+ end
147
+
148
+ content << ""
149
+ content << "# Lookup tables (harvest all records)"
150
+ content << "lookup_tables: #{config['lookup_tables'].inspect}"
151
+ content << ""
152
+ content << "# Metadata"
153
+ content << "last_updated: #{config['last_updated']}"
154
+ content << ""
155
+
156
+ File.write(path, content.join("\n"))
157
+ end
158
+
159
+ def load_database_connections
160
+ connections = @config['database_connections'] || {}
161
+ result = {}
162
+
163
+ connections.each do |db_name, config|
164
+ db_key = db_name.to_sym
165
+ if config.is_a?(Hash)
166
+ result[db_key] = {
167
+ connection: config['connection'].to_sym,
168
+ adapter: config['adapter']&.to_sym || :sqlserver,
169
+ model_base_class: config['model_base_class'] || 'ApplicationRecord'
170
+ }
171
+ else
172
+ result[db_key] = {
173
+ connection: config.to_sym,
174
+ adapter: :sqlserver,
175
+ model_base_class: 'ApplicationRecord'
176
+ }
177
+ end
178
+ end
179
+ result
180
+ end
181
+
182
+ def load_grainery_path
183
+ path = @config['grainery_path'] || 'db/grainery'
184
+ path = path.sub(/\/$/, '')
185
+
186
+ # Security: Validate path is within Rails.root to prevent path traversal
187
+ full_path = Rails.root.join(path).expand_path
188
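+ unless full_path == Rails.root || full_path.to_s.start_with?(Rails.root.to_s + File::SEPARATOR) # separator stops sibling dirs such as "<root>_other" from passing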
189
+ raise SecurityError, "Invalid grainery_path '#{path}': must be within Rails application directory"
190
+ end
191
+
192
+ path
193
+ rescue SecurityError => e
194
+ puts " Error: #{e.message}"
195
+ raise
196
+ rescue => e
197
+ puts " Warning: Could not load grainery_path: #{e.message}"
198
+ 'db/grainery'
199
+ end
200
+
201
+ def load_lookup_tables
202
+ (@config['lookup_tables'] || []).to_set
203
+ rescue => e
204
+ puts " Warning: Could not load lookup tables: #{e.message}"
205
+ Set.new
206
+ end
207
+
208
+ # Security: Safe constant resolution with whitelist
209
+ def safe_const_get(class_name)
210
+ unless class_name.match?(ALLOWED_BASE_CLASS_PATTERN)
211
+ raise SecurityError, "Unauthorized base class '#{class_name}'. Only ActiveRecord model base classes are allowed."
212
+ end
213
+
214
+ Object.const_get(class_name)
215
+ rescue NameError => e
216
+ raise NameError, "Could not find constant '#{class_name}': #{e.message}"
217
+ end
218
+
219
+ def get_all_models
220
+ Rails.application.eager_load!
221
+ models = []
222
+
223
+ @database_configs.each do |db_name, db_config|
224
+ base_class_name = db_config[:model_base_class]
225
+ next unless base_class_name
226
+
227
+ begin
228
+ base_class = safe_const_get(base_class_name)
229
+ models += base_class.descendants
230
+ rescue => e
231
+ puts " Warning: Could not load models from '#{base_class_name}': #{e.message}"
232
+ end
233
+ end
234
+
235
+ models.uniq.compact
236
+ end
237
+
238
+ def detect_database(model)
239
+ @database_configs.each do |db_name, db_config|
240
+ base_class_name = db_config[:model_base_class]
241
+ next unless base_class_name
242
+
243
+ begin
244
+ base_class = safe_const_get(base_class_name)
245
+ return db_name if model < base_class
246
+ rescue
247
+ next
248
+ end
249
+ end
250
+
251
+ :primary
252
+ end
253
+
254
+ def harvest_all(limit: nil)
255
+ all_models = get_all_models
256
+
257
+ models_to_harvest = all_models.reject do |model|
258
+ model.abstract_class? ||
259
+ model.name.start_with?('HABTM_', 'ActiveRecord::') ||
260
+ model.table_name.nil?
261
+ end
262
+
263
+ harvest_models(models_to_harvest, limit: limit)
264
+ end
265
+
266
+ def harvest_models(models, limit: nil)
267
+ models = Array(models)
268
+ return if models.empty?
269
+
270
+ puts "\n" + "="*80
271
+ puts "Grainer - Extracting Database Seeds"
272
+ puts "="*80
273
+ puts "Total models: #{models.size}"
274
+ puts "Limit per table: #{limit || 'ALL RECORDS'}"
275
+ puts "="*80 + "\n"
276
+
277
+ # Group by database
278
+ grouped_models = models.group_by { |model| detect_database(model) }
279
+
280
+ # Calculate dependencies for load order
281
+ dependency_graph = build_dependency_graph(models)
282
+ load_order = topological_sort(dependency_graph)
283
+
284
+ # Create harvest directories
285
+ grouped_models.each do |db_name, _|
286
+ db_dir = Rails.root.join(@grainery_path, db_name.to_s)
287
+ FileUtils.mkdir_p(db_dir)
288
+ end
289
+
290
+ # Harvest in dependency order
291
+ load_order.each do |model|
292
+ next unless models.include?(model)
293
+
294
+ begin
295
+ db_name = detect_database(model)
296
+ harvest_table(model, db_name, limit: limit)
297
+ rescue => e
298
+ puts " ✗ Error harvesting #{model.name}: #{e.message}"
299
+ end
300
+ end
301
+
302
+ # Create load order file
303
+ create_load_order_file(load_order, models)
304
+
305
+ puts "\n" + "="*80
306
+ puts "Data harvest complete!"
307
+ puts "Seed files created in #{@grainery_path}/"
308
+ puts "Load with: rake grainery:load"
309
+ puts "="*80
310
+ end
311
+
312
+ def build_dependency_graph(models)
313
+ graph = {}
314
+
315
+ models.each do |model|
316
+ graph[model] = []
317
+
318
+ # Find belongs_to associations (dependencies)
319
+ model.reflect_on_all_associations(:belongs_to).each do |assoc|
320
+ begin
321
+ if assoc.klass && !assoc.polymorphic? && models.include?(assoc.klass)
322
+ graph[model] << assoc.klass
323
+ end
324
+ rescue
325
+ next
326
+ end
327
+ end
328
+ end
329
+
330
+ graph
331
+ end
332
+
333
+ def topological_sort(graph)
334
+ sorted = []
335
+ visited = Set.new
336
+ visiting = Set.new
337
+
338
+ visit = lambda do |node|
339
+ return if visited.include?(node)
340
+ raise "Circular dependency detected" if visiting.include?(node)
341
+
342
+ visiting.add(node)
343
+
344
+ (graph[node] || []).each do |dependency|
345
+ visit.call(dependency) if graph.key?(dependency)
346
+ end
347
+
348
+ visiting.delete(node)
349
+ visited.add(node)
350
+ sorted << node
351
+ end
352
+
353
+ graph.keys.each { |node| visit.call(node) }
354
+
355
+ sorted
356
+ end
357
+
358
+ def harvest_table(model, db_name, limit: nil)
359
+ table_name = model.table_name
360
+ is_lookup = @lookup_tables.include?(table_name)
361
+
362
+ # Determine how many records to harvest
363
+ records = if is_lookup
364
+ model.all.to_a
365
+ elsif limit
366
+ model.limit(limit).to_a
367
+ else
368
+ model.all.to_a
369
+ end
370
+
371
+ if records.empty?
372
+ puts " ⚠ #{model.name.ljust(50)} → skipped (no data)"
373
+ return
374
+ end
375
+
376
+ # Generate seed file
377
+ seed_content = generate_seed_content(model, records, db_name)
378
+ seed_path = get_seed_path(model, db_name)
379
+
380
+ File.write(seed_path, seed_content)
381
+
382
+ record_info = is_lookup ? " (lookup: #{records.size} records)" : " (#{records.size} records)"
383
+ puts " ✓ #{model.name.ljust(50)} → #{table_name}.rb#{record_info}"
384
+ end
385
+
386
+ def get_seed_path(model, db_name)
387
+ db_dir = File.join(@grainery_path, db_name.to_s)
388
+ FileUtils.mkdir_p(Rails.root.join(db_dir))
389
+ Rails.root.join(db_dir, "#{model.table_name}.rb")
390
+ end
391
+
392
+ def generate_seed_content(model, records, db_name)
393
+ table_name = model.table_name
394
+
395
+ # Get columns to export (exclude id, timestamps)
396
+ columns = model.columns.reject do |col|
397
+ %w[id created_at updated_at].include?(col.name)
398
+ end
399
+
400
+ content = []
401
+ content << "# Harvested from #{db_name} database: #{table_name}"
402
+ content << "# Records: #{records.size}"
403
+ content << "# Generated: #{Time.now}"
404
+ content << ""
405
+ content << "#{model.name}.create!([" # create! accepts a single Array of attribute hashes, not several hash arguments
406
+
407
+ records.each_with_index do |record, idx|
408
+ content << " {" if idx == 0
409
+ content << " }," if idx > 0
410
+ content << " {" if idx > 0
411
+
412
+ columns.each_with_index do |col, col_idx|
413
+ value = record.send(col.name)
414
+ formatted_value = format_seed_value(value, col)
415
+ comma = col_idx < columns.size - 1 ? ',' : ''
416
+ content << " #{col.name}: #{formatted_value}#{comma}"
417
+ end
418
+ end
419
+
420
+ content << " }"
421
+ content << "])"
422
+ content << ""
423
+
424
+ content.join("\n")
425
+ end
426
+
427
+ def format_seed_value(value, column)
428
+ return 'nil' if value.nil?
429
+
430
+ case column.type
431
+ when :string, :text
432
+ value.to_s.inspect
433
+ when :integer, :bigint
434
+ value.to_i
435
+ when :decimal, :float
436
+ value.to_f
437
+ when :boolean
438
+ value ? 'true' : 'false'
439
+ when :date
440
+ "Date.parse(#{value.to_s.inspect})"
441
+ when :datetime, :timestamp
442
+ "Time.parse(#{value.to_s.inspect})"
443
+ when :json, :jsonb
444
+ value.to_json
445
+ else
446
+ value.inspect
447
+ end
448
+ end
449
+
450
+ def create_load_order_file(load_order, models)
451
+ order_path = Rails.root.join(@grainery_path, 'load_order.txt')
452
+
453
+ grouped_by_db = load_order.select { |m| models.include?(m) }.group_by { |m| detect_database(m) }
454
+
455
+ content = []
456
+ content << "# Load order for harvested seeds"
457
+ content << "# Load files in this order to respect foreign key dependencies"
458
+ content << ""
459
+
460
+ grouped_by_db.each do |db_name, db_models|
461
+ content << "# #{db_name.to_s.upcase} Database"
462
+ db_models.each do |model|
463
+ content << "#{db_name}/#{model.table_name}.rb"
464
+ end
465
+ content << ""
466
+ end
467
+
468
+ File.write(order_path, content.join("\n"))
469
+ puts "\n ✓ Load order written to #{@grainery_path}/load_order.txt"
470
+ end
471
+
472
+ def load_seeds
473
+ order_file = Rails.root.join(@grainery_path, 'load_order.txt')
474
+
475
+ unless File.exist?(order_file)
476
+ puts " ✗ Load order file not found. Run 'rake grainery:generate' first."
477
+ return
478
+ end
479
+
480
+ puts "\n" + "="*80
481
+ puts "Loading Harvested Seeds"
482
+ puts "="*80
483
+
484
+ # Load harvested seeds in dependency order
485
+ File.readlines(order_file).each do |line|
486
+ line = line.strip
487
+ next if line.empty? || line.start_with?('#')
488
+
489
+ seed_file = Rails.root.join(@grainery_path, line)
490
+ if File.exist?(seed_file)
491
+ puts " → Loading #{line}..."
492
+ begin
493
+ load seed_file
494
+ rescue => e
495
+ puts " ✗ Error loading #{line}: #{e.message}"
496
+ end
497
+ end
498
+ end
499
+
500
+ # Load custom seeds last (if they exist)
501
+ custom_seeds = Rails.root.join('db/seeds.rb')
502
+ if File.exist?(custom_seeds) && File.read(custom_seeds).strip.present?
503
+ puts "\n" + "-"*80
504
+ puts "Loading Custom Seeds"
505
+ puts "-"*80
506
+ puts " → Loading db/seeds.rb..."
507
+ begin
508
+ load custom_seeds
509
+ rescue => e
510
+ puts " ✗ Error loading custom seeds: #{e.message}"
511
+ end
512
+ end
513
+
514
+ puts "\n" + "="*80
515
+ puts "Seed loading complete!"
516
+ puts "="*80
517
+ end
518
+ end
519
+ end
@@ -0,0 +1,12 @@
1
+ require 'grainery/grainer'
2
+
3
+ module Grainery
4
+ class Railtie < Rails::Railtie
5
+ railtie_name :grainery
6
+
7
+ rake_tasks do
8
+ load 'tasks/grainery_tasks.rake'
9
+ load 'tasks/test_db_tasks.rake'
10
+ end
11
+ end
12
+ end
@@ -0,0 +1,3 @@
1
+ module Grainery
2
+ VERSION = "0.1.0"
3
+ end
data/lib/grainery.rb ADDED
@@ -0,0 +1,8 @@
1
+ require "grainery/version"
2
+ require "grainery/railtie" if defined?(Rails)
3
+
4
+ module Grainery
5
+ class Error < StandardError; end
6
+
7
+ # Your code goes here...
8
+ end
@@ -0,0 +1,59 @@
1
+ namespace :grainery do
2
+ desc "Initialize grainery.yml configuration file"
3
+ task init_config: :environment do
4
+ config_path = Rails.root.join('config/grainery.yml')
5
+
6
+ if File.exist?(config_path)
7
+ puts "Configuration file already exists at config/grainery.yml"
8
+ exit
9
+ end
10
+
11
+ puts "Initializing grainery configuration..."
12
+ grainery = Grainery::Grainer.new
13
+
14
+ puts "\n✓ Configuration file created successfully!"
15
+ puts "\nNext steps:"
16
+ puts "1. Review config/grainery.yml"
17
+ puts "2. Run 'rake grainery:generate' to harvest data"
18
+ end
19
+
20
+ desc "Harvest data from all tables (respecting dependencies)"
21
+ task generate: :environment do
22
+ puts "Harvesting data from all tables..."
23
+
24
+ grainery = Grainery::Grainer.new
25
+ grainery.harvest_all(limit: 100) # Default limit of 100 records per table
26
+
27
+ puts "\n✓ Data harvested to db/grainery/"
28
+ end
29
+
30
+ desc "Harvest all records from all tables (no limit)"
31
+ task generate_all: :environment do
32
+ puts "⚠ WARNING: This will harvest ALL records from ALL tables"
33
+ puts " This may take a while and create large files\n\n"
34
+
35
+ grainery = Grainery::Grainer.new
36
+ grainery.harvest_all(limit: nil)
37
+
38
+ puts "\n✓ All data harvested to db/grainery/"
39
+ end
40
+
41
+ desc "Load harvested seeds into database (in dependency order)"
42
+ task load: :environment do
43
+ grainery = Grainery::Grainer.new
44
+ grainery.load_seeds
45
+ end
46
+
47
+ desc "Clean grainery directory"
48
+ task clean: :environment do
49
+ grainery = Grainery::Grainer.new
50
+ grainery_path = Rails.root.join(grainery.instance_variable_get(:@grainery_path))
51
+
52
+ if Dir.exist?(grainery_path)
53
+ FileUtils.rm_rf(grainery_path)
54
+ puts "✓ Cleaned #{grainery_path}"
55
+ else
56
+ puts " Nothing to clean"
57
+ end
58
+ end
59
+ end
@@ -0,0 +1,174 @@
1
+ namespace :test do
2
+ namespace :db do
3
+ desc "Setup clean test database for grainery-based testing"
4
+ task setup_for_grainery: :environment do
5
+ puts "Setting up clean test database for grainery-based testing..."
6
+
7
+ # Switch to test environment
8
+ Rails.env = 'test'
9
+ ActiveRecord::Base.establish_connection(:test)
10
+
11
+ begin
12
+ # Drop existing test database (optional - remove if you want to preserve structure)
13
+ puts " → Dropping test database..."
14
+ ActiveRecord::Tasks::DatabaseTasks.drop_current('test')
15
+ rescue => e
16
+ puts " ⚠ Could not drop database: #{e.message}"
17
+ end
18
+
19
+ # Create test database
20
+ puts " → Creating test database..."
21
+ ActiveRecord::Tasks::DatabaseTasks.create_current('test')
22
+
23
+ # Load schema
24
+ puts " → Loading schema..."
25
+ ActiveRecord::Base.establish_connection(:test)
26
+ ActiveRecord::Tasks::DatabaseTasks.load_schema_for 'test', :ruby
27
+
28
+ # Reconnect
29
+ ActiveRecord::Base.establish_connection(:test)
30
+
31
+ puts "\n✓ Test database is ready for grainery-based testing!"
32
+ puts "\nNext steps:"
33
+ puts " 1. Load seeds: RAILS_ENV=test rake grainery:load"
34
+ puts " 2. Run tests: bundle exec rspec"
35
+ puts "\nOr use the combined task:"
36
+ puts " rake test:db:reset_with_grainery"
37
+ end
38
+
39
+ desc "Seed test database with grainery data"
40
+ task seed_with_grainery: :environment do
41
+ Rails.env = 'test'
42
+ ActiveRecord::Base.establish_connection(:test)
43
+
44
+ puts "Seeding test database with grainery data..."
45
+
46
+ begin
47
+ # Check if grainery seeds exist
48
+ grainery_path = Rails.root.join('db/grainery')
49
+ load_order_file = grainery_path.join('load_order.txt')
50
+
51
+ unless File.exist?(load_order_file)
52
+ puts "\n⚠ No grainery seeds found!"
53
+ puts "\nTo generate seeds:"
54
+ puts " 1. Harvest from another environment:"
55
+ puts " RAILS_ENV=development rake grainery:generate"
56
+ puts "\n 2. Then load into test:"
57
+ puts " RAILS_ENV=test rake grainery:load"
58
+ puts "\nOnce seeds exist, rebuild and seed the test database in one step:"
59
+ puts " rake test:db:reset_with_grainery"
60
+ exit 1
61
+ end
62
+
63
+ # Load grainery seeds
64
+ puts " → Loading grainery seeds..."
65
+ Rake::Task['grainery:load'].invoke
66
+
67
+ puts "\n✓ Test database seeded successfully with grainery data!"
68
+ rescue => e
69
+ puts "Error seeding database: #{e.message}"
70
+ puts e.backtrace.first(5).join("\n")
71
+ end
72
+ end
73
+
74
+ desc "Reset test database and seed with grainery"
75
+ task reset_with_grainery: [:setup_for_grainery, :seed_with_grainery]
76
+
77
+ desc "Clean test database (truncate all tables)"
78
+ task clean: :environment do
79
+ Rails.env = 'test'
80
+ ActiveRecord::Base.establish_connection(:test)
81
+
82
+ puts "Cleaning test database..."
83
+
84
+ # Get all table names
85
+ tables = ActiveRecord::Base.connection.tables.reject do |table|
86
+ table == 'schema_migrations' || table == 'ar_internal_metadata'
87
+ end
88
+
89
+ puts " → Truncating #{tables.size} tables..."
90
+
91
+ # Disable foreign key checks temporarily (SQL Server only: sp_MSforeachtable is not available on MySQL or PostgreSQL)
92
+ ActiveRecord::Base.connection.execute("EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'")
93
+
94
+ tables.each do |table|
95
+ begin
96
+ quoted_table = ActiveRecord::Base.connection.quote_table_name(table)
97
+ ActiveRecord::Base.connection.execute("TRUNCATE TABLE #{quoted_table}")
98
+ rescue => e
99
+ puts " ⚠ Could not truncate #{table}: #{e.message}"
100
+ end
101
+ end
102
+
103
+ # Re-enable foreign key checks
104
+ ActiveRecord::Base.connection.execute("EXEC sp_MSforeachtable 'ALTER TABLE ? CHECK CONSTRAINT ALL'")
105
+
106
+ puts "\n✓ Test database cleaned!"
107
+ end
108
+
109
+ desc "Show test database statistics"
110
+ task stats: :environment do
111
+ Rails.env = 'test'
112
+ ActiveRecord::Base.establish_connection(:test)
113
+
114
+ puts "Test Database Statistics"
115
+ puts "=" * 80
116
+
117
+ # Get all tables and their row counts
118
+ tables = ActiveRecord::Base.connection.tables.reject do |table|
119
+ table == 'schema_migrations' || table == 'ar_internal_metadata'
120
+ end
121
+
122
+ total_rows = 0
123
+ table_data = []
124
+
125
+ tables.each do |table|
126
+ begin
127
+ quoted_table = ActiveRecord::Base.connection.quote_table_name(table)
128
+ count = ActiveRecord::Base.connection.select_value("SELECT COUNT(*) FROM #{quoted_table}").to_i
129
+ total_rows += count
130
+ table_data << [table, count] if count > 0
131
+ rescue => e
132
+ # Skip tables we can't query
133
+ end
134
+ end
135
+
136
+ # Sort by row count descending
137
+ table_data.sort_by! { |_, count| -count }
138
+
139
+ if table_data.empty?
140
+ puts "\n Database is empty ✓"
141
+ puts "\n Run 'rake test:db:setup_for_grainery' to initialize"
142
+ else
143
+ puts "\n Total tables with data: #{table_data.size}"
144
+ puts " Total rows: #{total_rows.to_s.reverse.gsub(/(\d{3})(?=\d)/, '\\1,').reverse}"
145
+ puts "\n Top 20 tables by row count:"
146
+ puts " " + "-" * 76
147
+
148
+ table_data.first(20).each do |table, count|
149
+ formatted_count = count.to_s.reverse.gsub(/(\d{3})(?=\d)/, '\\1,').reverse
150
+ puts " #{table.ljust(50)} #{formatted_count.rjust(10)} rows"
151
+ end
152
+ end
153
+
154
+ puts "=" * 80
155
+ end
156
+ end
157
+ end
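Review note: the `clean` task above disables foreign keys with `sp_MSforeachtable`, which exists only on SQL Server, while the changelog also lists MySQL and PostgreSQL as supported. The sketch below shows one way the statements could be chosen per adapter. `truncate_statements` is a hypothetical helper, not part of grainery, and the adapter-name patterns are assumptions about what `connection.adapter_name` returns for each adapter.

```ruby
# Hypothetical helper (not part of grainery): builds the SQL needed to empty
# the given, already quoted, tables for one ActiveRecord adapter.
def truncate_statements(adapter_name, quoted_tables)
  case adapter_name.to_s.downcase
  when /sqlserver/
    # DELETE rather than TRUNCATE: SQL Server refuses to truncate a table that
    # a foreign key references, even while the constraint is disabled.
    # Note that DELETE does not reset identity columns.
    ["EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'"] +
      quoted_tables.map { |table| "DELETE FROM #{table}" } +
      ["EXEC sp_MSforeachtable 'ALTER TABLE ? CHECK CONSTRAINT ALL'"]
  when /mysql|trilogy/
    # MySQL allows TRUNCATE on referenced tables once checks are switched off.
    ["SET FOREIGN_KEY_CHECKS = 0"] +
      quoted_tables.map { |table| "TRUNCATE TABLE #{table}" } +
      ["SET FOREIGN_KEY_CHECKS = 1"]
  when /postg/
    # PostgreSQL can truncate several tables in a single statement.
    return [] if quoted_tables.empty?

    ["TRUNCATE TABLE #{quoted_tables.join(', ')} RESTART IDENTITY CASCADE"]
  else
    raise ArgumentError, "no truncate strategy for adapter #{adapter_name.inspect}"
  end
end
```

Inside the task, the result could be run with `conn.execute(sql)` for each statement, passing `conn.adapter_name` and the output of `conn.quote_table_name` for every table.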
158
+
159
+ # Aliases for convenience
160
+ namespace :db do
161
+ namespace :test do
162
+ desc "Alias for test:db:setup_for_grainery"
163
+ task setup_for_grainery: 'test:db:setup_for_grainery'
164
+
165
+ desc "Alias for test:db:clean"
166
+ task clean: 'test:db:clean'
167
+
168
+ desc "Alias for test:db:stats"
169
+ task stats: 'test:db:stats'
170
+
171
+ desc "Alias for test:db:reset_with_grainery"
172
+ task reset_with_grainery: 'test:db:reset_with_grainery'
173
+ end
174
+ end
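Review note: the `stats` task in this rake file formats row counts with the same reverse/`gsub`/reverse expression in two places. A minimal sketch of that expression pulled into one helper follows. The name `with_thousands` is hypothetical and does not appear in the gem.

```ruby
# Hypothetical helper (not part of grainery): inserts a comma between each
# group of three digits, counting from the right.
def with_thousands(number)
  # Reverse the digits so groups of three can be matched from the left,
  # add a comma after every full group that is followed by another digit,
  # then reverse back.
  number.to_s.reverse.gsub(/(\d{3})(?=\d)/, '\1,').reverse
end

puts with_thousands(1_234_567)
puts with_thousands(999)
```

The lookahead `(?=\d)` is what prevents a leading comma when the digit count is an exact multiple of three.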
metadata ADDED
@@ -0,0 +1,95 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: grainery
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Michail Pantelelis
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: rails
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - "~>"
17
+ - !ruby/object:Gem::Version
18
+ version: '6.0'
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - "~>"
24
+ - !ruby/object:Gem::Version
25
+ version: '6.0'
26
+ - !ruby/object:Gem::Dependency
27
+ name: rspec
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '3.0'
33
+ type: :development
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '3.0'
40
+ - !ruby/object:Gem::Dependency
41
+ name: sqlite3
42
+ requirement: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - "~>"
45
+ - !ruby/object:Gem::Version
46
+ version: '1.4'
47
+ type: :development
48
+ prerelease: false
49
+ version_requirements: !ruby/object:Gem::Requirement
50
+ requirements:
51
+ - - "~>"
52
+ - !ruby/object:Gem::Version
53
+ version: '1.4'
54
+ description: Extract database records and generate seed files organized by database
55
+ with automatic dependency resolution. Like a grainery stores grain, this gem stores
56
+ and organizes your database seeds.
57
+ email:
58
+ - mpantel@aegean.gr
59
+ executables: []
60
+ extensions: []
61
+ extra_rdoc_files: []
62
+ files:
63
+ - CHANGELOG.md
64
+ - MIT-LICENSE
65
+ - README.md
66
+ - lib/grainery.rb
67
+ - lib/grainery/grainer.rb
68
+ - lib/grainery/railtie.rb
69
+ - lib/grainery/version.rb
70
+ - lib/tasks/grainery_tasks.rake
71
+ - lib/tasks/test_db_tasks.rake
72
+ homepage: https://github.com/mpantel/grainery
73
+ licenses:
74
+ - MIT
75
+ metadata:
76
+ homepage_uri: https://github.com/mpantel/grainery
77
+ changelog_uri: https://github.com/mpantel/grainery/blob/main/CHANGELOG.md
78
+ rdoc_options: []
79
+ require_paths:
80
+ - lib
81
+ required_ruby_version: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - ">="
84
+ - !ruby/object:Gem::Version
85
+ version: 2.7.0
86
+ required_rubygems_version: !ruby/object:Gem::Requirement
87
+ requirements:
88
+ - - ">="
89
+ - !ruby/object:Gem::Version
90
+ version: '0'
91
+ requirements: []
92
+ rubygems_version: 3.7.1
93
+ specification_version: 4
94
+ summary: Database seed storage system for Rails applications
95
+ test_files: []
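Review note: the runtime dependency above is declared as `rails ~> 6.0`. The pessimistic operator with a two-segment version allows any 6.x release and rejects 7.0 and later. The sketch below checks a few versions with RubyGems' own `Gem::Requirement`. The version numbers tested are illustrative only.

```ruby
require "rubygems"

# Same constraint string as in the gem metadata.
rails_requirement = Gem::Requirement.new("~> 6.0")

%w[5.2.8 6.0.0 6.1.7 7.0.0].each do |version|
  allowed = rails_requirement.satisfied_by?(Gem::Version.new(version))
  puts format("rails %-6s %s", version, allowed ? "allowed" : "rejected")
end
```

As written, the constraint means an application on Rails 7 or newer cannot resolve this gem version with Bundler.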