antz 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: d729212462861daa56d7d8991a4619581ede1fe819f5c3ea8108b228df7efd60
4
+ data.tar.gz: bac708da0d1044ce92ea90341e2cbbb15a1ecde237d8d1547ae21731fe32ecfa
5
+ SHA512:
6
+ metadata.gz: fd3f948e683e7ec3597e969b78906964948a9ec70413675a4ff6cdeaef7a3945190d6e02b671b2d9c3eaf42cf8eea4da844bd9332fc38c62908734baad59a334
7
+ data.tar.gz: 2d394c6284afa5db8b5a1cdc102d83a4c2ff5b76168e8242e731764a8f6a7c10575dc463f396fa84849554ce711d57837c61b9f2c281763ad5cc5ccfd26bd43c
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## [Unreleased]
2
+
3
+ ## [0.1.0] - 2025-11-05
4
+
5
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 Ilmir Karimov
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,182 @@
1
+
2
+ ---
3
+
4
+ # 🐜 Antz
5
+
6
+ **Declarative CSV-to-ActiveRecord importer with dependency resolution**
7
+
8
+ > Import structured CSV data into your Rails (or plain ActiveRecord) app — safely, clearly, and in the right order.
9
+
10
+ ---
11
+
12
+ ## 🧩 Why Antz?
13
+
14
+ Ever needed to seed a database from CSVs…
15
+ but your `products` depend on `categories`, which depend on `vendors`?
16
+ And you don’t want to write fragile, one-off scripts?
17
+
18
+ **Antz** lets you:
19
+
20
+ - **Declare** what data goes where — like a recipe.
21
+ - **Map and transform** CSV columns to model attributes.
22
+ - **Automatically resolve dependencies** (e.g., import `users` before `orders`).
23
+ - **Batch-process** large files efficiently.
24
+ - **Dry-run** safely before touching your DB.
25
+
26
+ All with a clean, readable Ruby DSL using the `#table` method.
27
+
28
+ ---
29
+
30
+ ## 🚀 Quick Start
31
+
32
+ ### 1. Add to your Gemfile
33
+
34
+ ```ruby
35
+ gem "antz"
36
+ ```
37
+
38
+ Then run:
39
+
40
+ ```sh
41
+ bundle install
42
+ ```
43
+
44
+ > ✅ Requires Ruby ≥ 3.2 and ActiveRecord ≥ 6.1 (and < 9.0).
45
+
46
+ ---
47
+
48
+ ### 2. Configure (e.g. in Rails)
49
+
50
+ ```ruby
51
+ # config/initializers/antz.rb
52
+
53
+ Antz.configure do |c|
54
+ c.base_dir = Rails.root.join("data", "csv_imports") # where your CSVs live
55
+ c.batch_size = 500 # default: 200
56
+ end
57
+ ```
58
+
59
+ ---
60
+
61
+ ### 3. Define Your Import Plan
62
+
63
+ ```ruby
64
+ import = Antz.define do
65
+ table :categories do
66
+ map :id
67
+ map :name
68
+ end
69
+
70
+ table :products, depends_on: :categories do
71
+ map :name
72
+ map :category_id
73
+ map(:price_cents) { |v| v.to_i * 100 } # transform on the fly
74
+ end
75
+
76
+ table :users do
77
+ map :full_name, to: :name
78
+ map :email
79
+ map(:age) { |v| v&.to_i }
80
+ end
81
+ end
82
+ ```
83
+
84
+ > 💡 CSV filenames default to the table name plus `.csv` (e.g., `table :products` reads `products.csv`).
85
+ > You can override with `file: "my_custom_file.csv"`.
86
+
87
+ ---
88
+
89
+ ### 4. Run It
90
+
91
+ ```ruby
92
+ # Preview what would happen (no DB writes)
93
+ import.run(dry_run: true)
94
+
95
+ # Actually import
96
+ import.run
97
+ ```
98
+
99
+ ✅ Uses **`upsert_all`** by default (`on_duplicate: :update`). Any other `on_duplicate` value falls back to `insert_all`.
100
+
101
+ ---
102
+
103
+ ## 📁 Expected CSV Structure
104
+
105
+ Each file must have **headers** matching your source column names:
106
+
107
+ **`categories.csv`**
108
+
109
+ ```csv
110
+ id,name
111
+ 1,Fruits
112
+ 2,Vegetables
113
+ ```
114
+
115
+ **`products.csv`**
116
+
117
+ ```csv
118
+ name,category_id,price_cents
119
+ Apple,1,99
120
+ Carrot,2,59
121
+ ```
122
+
123
+ ---
124
+
125
+ ## 🔌 Advanced Usage
126
+
127
+ ### Custom Model Name
128
+
129
+ ```ruby
130
+ table :items, model: Product do
131
+ map :title, to: :name
132
+ end
133
+ ```
134
+
135
+ ### Skip Upsert (Insert Only)
136
+
137
+ ```ruby
138
+ table :logs, on_duplicate: :ignore do
139
+ map :message
140
+ end
141
+ ```
142
+
143
+ ### Absolute File Path
144
+
145
+ ```ruby
146
+ table :events, file: "/mnt/data/events_export.csv" do
147
+ map :timestamp
148
+ end
149
+ ```
150
+
151
+ ---
152
+
153
+ ## 🧪 Testing & Development
154
+
155
+ Antz is thoroughly tested with RSpec. To run the suite:
156
+
157
+ ```sh
158
+ bin/setup
159
+ bundle exec rspec
160
+ ```
161
+
162
+ Fixtures use in-memory SQLite — no external DB needed.
163
+
164
+ ---
165
+
166
+ ## 📄 License
167
+
168
+ MIT © 2025 [Ilmir Karimov](mailto:code.for.func@gmail.com)
169
+
170
+ ---
171
+
172
+ ## 💡 Inspired by
173
+
174
+ - The pain of one-off CSV import scripts
175
+ - The need for **reproducible**, **maintainable** data bootstrapping
176
+
177
+ ---
178
+
179
+ > 👉 Found a bug? Have an idea?
180
+ > Open an issue or PR on [GitHub](https://github.com/it1ro/antz)!
181
+
182
+ ---
data/Rakefile ADDED
@@ -0,0 +1,17 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ task default: %i[spec rubocop]
13
+
14
+ desc "Run RuboCop and fail on any warning or higher severity"
15
+ task :rubocop_strict do
16
+ sh "bundle exec rubocop --fail-level=W"
17
+ end
data/lib/antz/configuration.rb ADDED
@@ -0,0 +1,13 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ class Configuration
5
+ attr_accessor :batch_size, :base_dir, :logger
6
+
7
+ def initialize
8
+ @batch_size = 200
9
+ @base_dir = nil
10
+ @logger = defined?(Rails) ? Rails.logger : Logger.new($stdout)
11
+ end
12
+ end
13
+ end
data/lib/antz/dataset.rb ADDED
@@ -0,0 +1,20 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ class Dataset
5
+ def initialize
6
+ @tables = {}
7
+ end
8
+
9
+ def table(name, **, &)
10
+ @tables[name] = Importer.new(name, **, &)
11
+ end
12
+
13
+ def run(dry_run: false)
14
+ ordered = DependencyResolver.new(@tables).resolve
15
+ ordered.each do |table|
16
+ table.execute(dry_run: dry_run)
17
+ end
18
+ end
19
+ end
20
+ end
data/lib/antz/dependency_resolver.rb ADDED
@@ -0,0 +1,26 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "tsort"
4
+
5
+ module Antz
6
+ class DependencyResolver
7
+ include TSort
8
+
9
+ def initialize(importers)
10
+ @importers = importers
11
+ end
12
+
13
+ def resolve
14
+ tsort.map { |name| @importers[name] }
15
+ end
16
+
17
+ def tsort_each_node(&)
18
+ @importers.keys.each(&)
19
+ end
20
+
21
+ def tsort_each_child(node, &)
22
+ deps = @importers[node].depends_on
23
+ Array(deps).each(&)
24
+ end
25
+ end
26
+ end
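Note on `DependencyResolver`: the class above delegates ordering to Ruby's standard `TSort` module, which only needs the two hook methods `tsort_each_node` and `tsort_each_child`. The following is a minimal sketch of the same technique on a plain hash, not the gem's code; `ToyResolver` and the `deps` hash are hypothetical names used only for illustration.

```ruby
require "tsort"

# Hypothetical stand-in for the importer registry: each table name maps to
# the list of tables it depends on. Not part of the gem.
class ToyResolver
  include TSort

  def initialize(deps)
    @deps = deps
  end

  # TSort calls this to enumerate every node in the graph.
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # TSort calls this to enumerate the nodes that must come before `node`.
  def tsort_each_child(node, &block)
    @deps.fetch(node).each(&block)
  end
end

deps = { products: [:categories], categories: [:vendors], vendors: [], users: [] }
order = ToyResolver.new(deps).tsort

# A cycle cannot be ordered, so TSort raises instead of returning a list.
cyclic = begin
  ToyResolver.new({ a: [:b], b: [:a] }).tsort
  false
rescue TSort::Cyclic
  true
end
```

In `order`, `:vendors` comes before `:categories`, which comes before `:products`. The sketch uses `fetch`, so an undeclared dependency fails with a `KeyError`; the resolver in the diff looks the node up with `[]`, so an undeclared name in `depends_on` would appear to surface as a `NoMethodError` on `nil` instead.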
data/lib/antz/field_mapper.rb ADDED
@@ -0,0 +1,27 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ class FieldMapper
5
+ def initialize(&block)
6
+ @mappings = {}
7
+ instance_eval(&block) if block
8
+ end
9
+
10
+ def map(source_key, to: nil, &transform)
11
+ target_key = to || source_key
12
+ @mappings[source_key] = { target: target_key, transform: transform }
13
+ end
14
+
15
+ alias field map
16
+
17
+ def transform(row)
18
+ result = {}
19
+ @mappings.each do |src, opts|
20
+ value = row[src.to_s] || row[src.to_sym]
21
+ value = opts[:transform].call(value) if opts[:transform]
22
+ result[opts[:target]] = value
23
+ end
24
+ result
25
+ end
26
+ end
27
+ end
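Note on `FieldMapper#transform`: each mapping stores a target key and an optional transform, and only mapped columns reach the output. A minimal sketch of that behavior on a single row follows; the `apply` helper and the sample `mappings` hash are hypothetical and merely mirror the shape the class builds.

```ruby
# Hypothetical mapping table mirroring the shape FieldMapper builds:
# source column => { target:, transform: }
mappings = {
  full_name: { target: :name, transform: nil },
  age: { target: :age, transform: ->(v) { v&.to_i } }
}

# Applies every mapping to one row, looking the source column up by
# string key first and symbol key second, as the class above does.
def apply(mappings, row)
  mappings.each_with_object({}) do |(src, opts), out|
    value = row[src.to_s] || row[src.to_sym]
    value = opts[:transform].call(value) if opts[:transform]
    out[opts[:target]] = value
  end
end

row = { "full_name" => "Ada Lovelace", "age" => "36", "ignored" => "x" }
result = apply(mappings, row)
```

Here `result` holds `name` and `age` only: the `ignored` column is dropped because nothing maps it, and a column missing from the row would reach its transform as `nil`, which is why the README's `age` example uses `v&.to_i`.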
data/lib/antz/importer.rb ADDED
@@ -0,0 +1,43 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "sink/active_record"
4
+ require_relative "field_mapper"
5
+ require_relative "source/csv_file"
6
+
7
+ module Antz
8
+ class Importer
9
+ attr_reader :name, :depends_on, :model_class, :file_name
10
+
11
+ def initialize(name, depends_on: [], model: nil, file: nil, on_duplicate: :update, &)
12
+ @name = name
13
+ @depends_on = depends_on
14
+ @model_class = model || name.to_s.singularize.camelize.constantize
15
+ @file_name = file || "#{name}.csv"
16
+ @on_duplicate = on_duplicate
17
+ @field_mapper = FieldMapper.new(&)
18
+ end
19
+
20
+ def execute(dry_run: false)
21
+ source = Source::CsvFile.new(file_path: file_path)
22
+ sink = Sink::ActiveRecord.new(model: @model_class, on_duplicate: @on_duplicate)
23
+
24
+ source.each_row(batch_size: Antz.configuration.batch_size) do |batch|
25
+ mapped = batch.map { |row| @field_mapper.transform(row) }
26
+ if dry_run
27
+ Antz.configuration.logger.info("[DRY RUN] Would write: #{mapped}")
28
+ else
29
+ sink.write(mapped)
30
+ end
31
+ end
32
+ end
33
+
34
+ private
35
+
36
+ def file_path
37
+ return @file_name if @file_name.start_with?("/")
38
+
39
+ base = Antz.configuration.base_dir || Dir.pwd
40
+ File.join(base, @file_name)
41
+ end
42
+ end
43
+ end
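Note on `Importer#file_path`: a name starting with `/` is used as-is, and anything else is joined onto `base_dir`, or onto the current working directory when `base_dir` is unset. A minimal sketch of that rule, using a hypothetical `resolve_csv_path` helper and made-up directories:

```ruby
# Hypothetical helper mirroring the private file_path logic shown above.
def resolve_csv_path(file_name, base_dir: nil)
  # Absolute (POSIX-style) paths bypass base_dir entirely.
  return file_name if file_name.start_with?("/")

  File.join(base_dir || Dir.pwd, file_name)
end

relative = resolve_csv_path("products.csv", base_dir: "/srv/app/data")
absolute = resolve_csv_path("/mnt/data/events_export.csv", base_dir: "/srv/app/data")
fallback = resolve_csv_path("users.csv")
```

Because the check is a plain `start_with?("/")`, a Windows-style path such as `C:/data/users.csv` would be treated as relative and joined onto the base directory.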
data/lib/antz/sink/active_record.rb ADDED
@@ -0,0 +1,52 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ module Sink
5
+ class ActiveRecord
6
+ def initialize(model:, on_duplicate: :update)
7
+ @model = model
8
+ @on_duplicate = on_duplicate
9
+ end
10
+
11
+ def write(records)
12
+ return if records.empty?
13
+
14
+ trigger_schema_load
15
+ now = Time.current
16
+ enriched_records = enrich_with_timestamps(records, now)
17
+
18
+ persist_records(enriched_records)
19
+ end
20
+
21
+ private
22
+
23
+ def trigger_schema_load
24
+ @model.columns
25
+ end
26
+
27
+ def enrich_with_timestamps(records, now)
28
+ needs_created_at = @model.column_names.include?("created_at")
29
+ needs_updated_at = @model.column_names.include?("updated_at")
30
+
31
+ records.map do |record|
32
+ record = record.dup.transform_keys(&:to_s)
33
+ record["created_at"] = now if needs_created_at
34
+ record["updated_at"] = now if needs_updated_at
35
+ record
36
+ end
37
+ end
38
+
39
+ def persist_records(records)
40
+ if use_upsert?
41
+ @model.upsert_all(records, returning: false)
42
+ else
43
+ @model.insert_all(records, returning: false)
44
+ end
45
+ end
46
+
47
+ def use_upsert?
48
+ @on_duplicate == :update && @model.respond_to?(:upsert_all)
49
+ end
50
+ end
51
+ end
52
+ end
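Note on `enrich_with_timestamps`: bulk writes such as `insert_all` skip model callbacks, so the sink stamps `created_at` and `updated_at` itself when the table has those columns. A minimal sketch of that step without a database follows; `stamp` is a hypothetical helper and `column_names` stands in for `Model.column_names`.

```ruby
# Hypothetical helper: `column_names` stands in for Model.column_names.
def stamp(records, column_names, now)
  records.map do |record|
    # Normalize keys to strings so every row in the batch has the same shape.
    row = record.transform_keys(&:to_s)
    row["created_at"] = now if column_names.include?("created_at")
    row["updated_at"] = now if column_names.include?("updated_at")
    row
  end
end

now = Time.utc(2025, 11, 5)
rows = stamp([{ name: "Apple" }], %w[id name created_at], now)

# A timestamp supplied by the CSV is overwritten, not preserved.
overwritten = stamp([{ "created_at" => "2020-01-01" }], %w[created_at], now)
```

As the second call shows, the assignment is unconditional, so a `created_at` value mapped from the CSV would be replaced by the import time.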
data/lib/antz/sink.rb ADDED
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ class Sink
5
+ def write(records)
6
+ raise NotImplementedError
7
+ end
8
+ end
9
+ end
data/lib/antz/source/csv_file.rb ADDED
@@ -0,0 +1,23 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ module Source
5
+ class CsvFile
6
+ def initialize(file_path:)
7
+ @file_path = file_path
8
+ end
9
+
10
+ def each_row(batch_size:)
11
+ rows = []
12
+ CSV.foreach(@file_path, headers: true) do |row|
13
+ rows << row.to_h
14
+ if rows.size >= batch_size
15
+ yield rows
16
+ rows = []
17
+ end
18
+ end
19
+ yield rows unless rows.empty?
20
+ end
21
+ end
22
+ end
23
+ end
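Note on `CsvFile#each_row`: despite its name, the method yields arrays of row hashes, at most `batch_size` per call, with a final shorter batch for the remainder. The sketch below shows the same batching with `Enumerable#each_slice` instead of the manual accumulator, and reads from a `StringIO` rather than a file path; `each_batch` is a hypothetical name and this is a swapped-in technique, not the gem's implementation.

```ruby
require "csv"
require "stringio"

# Hypothetical helper: yields arrays of header-keyed hashes,
# at most `batch_size` rows per array.
def each_batch(io, batch_size:)
  CSV.new(io, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end

csv = "id,name\n1,Fruits\n2,Vegetables\n3,Grains\n"
batches = []
each_batch(StringIO.new(csv), batch_size: 2) { |rows| batches << rows }
```

With three data rows and a batch size of 2, `batches` ends up with one batch of two rows and one batch of one. Every value arrives as a string (`"1"`, not `1`), which is why numeric columns need a transform such as `to_i` in the mapping.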
data/lib/antz/source.rb ADDED
@@ -0,0 +1,9 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ class Source
5
+ def each_row
6
+ raise NotImplementedError
7
+ end
8
+ end
9
+ end
data/lib/antz/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Antz
4
+ VERSION = "0.1.0"
5
+ end
data/lib/antz.rb ADDED
@@ -0,0 +1,37 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "csv"
4
+ require_relative "antz/version"
5
+
6
+ # Main namespace for the Antz CSV-to-ActiveRecord import library.
7
+ module Antz
8
+ autoload :Configuration, "antz/configuration"
9
+ autoload :Dataset, "antz/dataset"
10
+ autoload :Importer, "antz/importer"
11
+ autoload :DependencyResolver, "antz/dependency_resolver"
12
+ autoload :FieldMapper, "antz/field_mapper"
13
+
14
+ module Source
15
+ autoload :CsvFile, "antz/source/csv_file"
16
+ end
17
+
18
+ module Sink
19
+ autoload :ActiveRecord, "antz/sink/active_record"
20
+ end
21
+
22
+ class << self
23
+ attr_writer :configuration
24
+ end
25
+
26
+ def self.configuration
27
+ @configuration ||= Configuration.new
28
+ end
29
+
30
+ def self.configure
31
+ yield(configuration)
32
+ end
33
+
34
+ def self.define(&block)
35
+ Dataset.new.tap { |ds| ds.instance_eval(&block) }
36
+ end
37
+ end
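Note on `Antz.configure` and `Antz.define`: the entry point combines a memoized configuration object with an `instance_eval`-based DSL, so `table` calls inside the block are dispatched to a fresh `Dataset`. A minimal sketch of that pattern under stated assumptions: `ToyDsl`, `Config`, and `Plan` are hypothetical stand-ins, and `Plan#table` only records dependencies instead of building importers.

```ruby
module ToyDsl
  # Hypothetical settings object with a default, like Configuration above.
  class Config
    attr_accessor :batch_size

    def initialize
      @batch_size = 200
    end
  end

  # Hypothetical DSL receiver: records table names and their dependencies.
  class Plan
    attr_reader :tables

    def initialize
      @tables = {}
    end

    def table(name, depends_on: [])
      @tables[name] = Array(depends_on)
    end
  end

  # Memoized so every caller shares one configuration instance.
  def self.configuration
    @configuration ||= Config.new
  end

  def self.configure
    yield configuration
  end

  # instance_eval makes the new Plan the receiver of bare `table` calls.
  def self.define(&block)
    Plan.new.tap { |plan| plan.instance_eval(&block) }
  end
end

ToyDsl.configure { |c| c.batch_size = 500 }
plan = ToyDsl.define do
  table :categories
  table :products, depends_on: :categories
end
```

One consequence of `instance_eval` is that `self` changes inside the block: local variables from the calling scope stay visible, but the caller's instance variables and private helper methods do not.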
metadata ADDED
@@ -0,0 +1,97 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: antz
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Ilmir Karimov
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2025-11-05 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: activerecord
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '6.1'
20
+ - - "<"
21
+ - !ruby/object:Gem::Version
22
+ version: '9.0'
23
+ type: :runtime
24
+ prerelease: false
25
+ version_requirements: !ruby/object:Gem::Requirement
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ version: '6.1'
30
+ - - "<"
31
+ - !ruby/object:Gem::Version
32
+ version: '9.0'
33
+ - !ruby/object:Gem::Dependency
34
+ name: csv
35
+ requirement: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '3.3'
40
+ type: :runtime
41
+ prerelease: false
42
+ version_requirements: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - "~>"
45
+ - !ruby/object:Gem::Version
46
+ version: '3.3'
47
+ description: |
48
+ Antz helps you safely import CSV data into ActiveRecord models,
49
+ with field mapping, batch processing, and dependency-aware execution order.
50
+ email:
51
+ - code.for.func@gmail.com
52
+ executables: []
53
+ extensions: []
54
+ extra_rdoc_files: []
55
+ files:
56
+ - CHANGELOG.md
57
+ - LICENSE.txt
58
+ - README.md
59
+ - Rakefile
60
+ - lib/antz.rb
61
+ - lib/antz/configuration.rb
62
+ - lib/antz/dataset.rb
63
+ - lib/antz/dependency_resolver.rb
64
+ - lib/antz/field_mapper.rb
65
+ - lib/antz/importer.rb
66
+ - lib/antz/sink.rb
67
+ - lib/antz/sink/active_record.rb
68
+ - lib/antz/source.rb
69
+ - lib/antz/source/csv_file.rb
70
+ - lib/antz/version.rb
71
+ homepage: https://github.com/it1ro/antz
72
+ licenses:
73
+ - MIT
74
+ metadata:
75
+ source_code_uri: https://github.com/it1ro/antz
76
+ changelog_uri: https://github.com/it1ro/antz/blob/main/CHANGELOG.md
77
+ rubygems_mfa_required: 'true'
78
+ post_install_message:
79
+ rdoc_options: []
80
+ require_paths:
81
+ - lib
82
+ required_ruby_version: !ruby/object:Gem::Requirement
83
+ requirements:
84
+ - - ">="
85
+ - !ruby/object:Gem::Version
86
+ version: '3.2'
87
+ required_rubygems_version: !ruby/object:Gem::Requirement
88
+ requirements:
89
+ - - ">="
90
+ - !ruby/object:Gem::Version
91
+ version: '0'
92
+ requirements: []
93
+ rubygems_version: 3.4.1
94
+ signing_key:
95
+ specification_version: 4
96
+ summary: Declarative CSV-to-ActiveRecord import with dependency resolution
97
+ test_files: []