RubyGems - antz - Versions diffs - 0.1.0 - Mend

antz 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

checksums.yaml +7 -0
data/CHANGELOG.md +5 -0
data/LICENSE.txt +21 -0
data/README.md +182 -0
data/Rakefile +17 -0
data/lib/antz/configuration.rb +13 -0
data/lib/antz/dataset.rb +20 -0
data/lib/antz/dependency_resolver.rb +26 -0
data/lib/antz/field_mapper.rb +27 -0
data/lib/antz/importer.rb +43 -0
data/lib/antz/sink/active_record.rb +52 -0
data/lib/antz/sink.rb +9 -0
data/lib/antz/source/csv_file.rb +23 -0
data/lib/antz/source.rb +9 -0
data/lib/antz/version.rb +5 -0
data/lib/antz.rb +37 -0
metadata +97 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: d729212462861daa56d7d8991a4619581ede1fe819f5c3ea8108b228df7efd60
+  data.tar.gz: bac708da0d1044ce92ea90341e2cbbb15a1ecde237d8d1547ae21731fe32ecfa
+SHA512:
+  metadata.gz: fd3f948e683e7ec3597e969b78906964948a9ec70413675a4ff6cdeaef7a3945190d6e02b671b2d9c3eaf42cf8eea4da844bd9332fc38c62908734baad59a334
+  data.tar.gz: 2d394c6284afa5db8b5a1cdc102d83a4c2ff5b76168e8242e731764a8f6a7c10575dc463f396fa84849554ce711d57837c61b9f2c281763ad5cc5ccfd26bd43c

data/CHANGELOG.md ADDED

@@ -0,0 +1,5 @@
+## [Unreleased]
+## [0.1.0] - 2025-11-05
+- Initial release

data/LICENSE.txt ADDED

@@ -0,0 +1,21 @@
+The MIT License (MIT)
+Copyright (c) 2025 Ilmir Karimov
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,182 @@
+---
+# 🐜 Antz
+**Declarative CSV-to-ActiveRecord importer with dependency resolution**
+> Import structured CSV data into your Rails (or plain ActiveRecord) app — safely, clearly, and in the right order.
+---
+## 🧩 Why Antz?
+Ever needed to seed a database from CSVs…
+but your `products` depend on `categories`, which depend on `vendors`?
+And you don’t want to write fragile, one-off scripts?
+**Antz** lets you:
+- **Declare** what data goes where — like a recipe.
+- **Map and transform** CSV columns to model attributes.
+- **Automatically resolve dependencies** (e.g., import `users` before `orders`).
+- **Batch-process** large files efficiently.
+- **Dry-run** safely before touching your DB.
+All with a clean, readable Ruby DSL using the `#table` method.
+---
+## 🚀 Quick Start
+### 1. Add to your Gemfile
+```ruby
+gem "antz"
+```
+Then run:
+```sh
+bundle install
+```
+> ✅ Requires Ruby ≥ 3.2 and ActiveRecord ≥ 6.1.
+---
+### 2. Configure (e.g. in Rails)
+```ruby
+# config/initializers/antz.rb
+Antz.configure do |c|
+  c.base_dir = Rails.root.join("data", "csv_imports")  # where your CSVs live
+  c.batch_size = 500                                   # default: 200
+end
+```
+---
+### 3. Define Your Import Plan
+```ruby
+import = Antz.define do
+  table :categories do
+    map :id
+    map :name
+  end
+  table :products, depends_on: :categories do
+    map :name
+    map :category_id
+    map(:price_cents) { |v| v.to_i * 100 }  # transform on the fly
+  end
+  table :users do
+    map :full_name, to: :name
+    map :email
+    map(:age) { |v| v&.to_i }
+  end
+end
+```
+> 💡 CSV filenames default to the table name plus `.csv` (e.g., `products.csv`).
+> You can override with `file: "my_custom_file.csv"`.
+---
+### 4. Run It
+```ruby
+# Preview what would happen (no DB writes)
+import.run(dry_run: true)
+# Actually import
+import.run
+```
+✅ Uses **`upsert_all`** by default (`on_duplicate: :update`). Any other `on_duplicate` value uses `insert_all`.
+---
+## 📁 Expected CSV Structure
+Each file must have **headers** matching your source column names:
+**`categories.csv`**
+```csv
+id,name
+1,Fruits
+2,Vegetables
+```
+**`products.csv`**
+```csv
+name,category_id,price_cents
+Apple,1,99
+Carrot,2,59
+```
+---
+## 🔌 Advanced Usage
+### Custom Model Name
+```ruby
+table :items, model: Product do
+  map :title, to: :name
+end
+```
+### Skip Upsert (Insert Only)
+```ruby
+table :logs, on_duplicate: :ignore do
+  map :message
+end
+```
+### Absolute File Path
+```ruby
+table :events, file: "/mnt/data/events_export.csv" do
+  map :timestamp
+end
+```
+---
+## 🧪 Testing & Development
+Antz is thoroughly tested with RSpec. To run the suite:
+```sh
+bin/setup
+bundle exec rspec
+```
+Fixtures use in-memory SQLite — no external DB needed.
+---
+## 📄 License
+MIT © 2025 [Ilmir Karimov](mailto:code.for.func@gmail.com)
+---
+## 💡 Inspired by
+- The pain of one-off CSV import scripts
+- The need for **reproducible**, **maintainable** data bootstrapping
+---
+> 👉 Found a bug? Have an idea?
+> Open an issue or PR on [GitHub](https://github.com/it1ro/antz)!
+---

data/Rakefile ADDED

@@ -0,0 +1,17 @@
+# frozen_string_literal: true
+require "bundler/gem_tasks"
+require "rspec/core/rake_task"
+RSpec::Core::RakeTask.new(:spec)
+require "rubocop/rake_task"
+RuboCop::RakeTask.new
+task default: %i[spec rubocop]
+desc "Run RuboCop and fail on any warning or higher severity"
+task :rubocop_strict do
+  sh "bundle exec rubocop --fail-level=W"
+end

data/lib/antz/configuration.rb ADDED

@@ -0,0 +1,13 @@
+# frozen_string_literal: true
+require "logger"
+module Antz
+  class Configuration
+    attr_accessor :batch_size, :base_dir, :logger
+    def initialize
+      @batch_size = 200
+      @base_dir = nil
+      @logger = defined?(Rails) ? Rails.logger : Logger.new($stdout)
+    end
+  end
+end

data/lib/antz/dataset.rb ADDED

@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+module Antz
+  class Dataset
+    def initialize
+      @tables = {}
+    end
+    def table(name, **, &)
+      @tables[name] = Importer.new(name, **, &)
+    end
+    def run(dry_run: false)
+      ordered = DependencyResolver.new(@tables).resolve
+      ordered.each do |table|
+        table.execute(dry_run: dry_run)
+      end
+    end
+  end
+end

data/lib/antz/dependency_resolver.rb ADDED

@@ -0,0 +1,26 @@
+# frozen_string_literal: true
+require "tsort"
+module Antz
+  class DependencyResolver
+    include TSort
+    def initialize(importers)
+      @importers = importers
+    end
+    def resolve
+      tsort.map { |name| @importers[name] }
+    end
+    def tsort_each_node(&)
+      @importers.keys.each(&)
+    end
+    def tsort_each_child(node, &)
+      deps = @importers[node].depends_on
+      Array(deps).each(&)
+    end
+  end
+end
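
The resolver above delegates ordering to Ruby's standard `TSort` module. The sketch below shows the same idea outside the gem, assuming a plain hash of table names to dependencies. `OrderSketch` and the sample table names are hypothetical and not part of antz.

```ruby
require "tsort"

# Hypothetical stand-in for the importer table: table name => its dependencies.
class OrderSketch
  include TSort

  def initialize(deps)
    @deps = deps
  end

  # TSort needs to enumerate every node...
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # ...and the children (here: prerequisites) of each node.
  def tsort_each_child(node, &block)
    Array(@deps.fetch(node)).each(&block)
  end
end

deps = { products: :categories, categories: :vendors, vendors: [], users: [] }
order = OrderSketch.new(deps).tsort

# A circular declaration raises instead of being silently mis-ordered.
cycle_detected =
  begin
    OrderSketch.new({ a: :b, b: :a }).tsort
    false
  rescue TSort::Cyclic
    true
  end
```

`TSort#tsort` returns nodes sorted from children to parents, so each prerequisite appears before the table that declares it. Unlike this sketch, which uses `fetch`, the resolver in the diff looks up `@importers[node]` directly, so a `depends_on` naming an undeclared table would likely surface as a `NoMethodError` on `nil`.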

data/lib/antz/field_mapper.rb ADDED

@@ -0,0 +1,27 @@
+# frozen_string_literal: true
+module Antz
+  class FieldMapper
+    def initialize(&block)
+      @mappings = {}
+      instance_eval(&block) if block
+    end
+    def map(source_key, to: nil, &transform)
+      target_key = to || source_key
+      @mappings[source_key] = { target: target_key, transform: transform }
+    end
+    alias field map
+    def transform(row)
+      result = {}
+      @mappings.each do |src, opts|
+        value = row[src.to_s] || row[src.to_sym]
+        value = opts[:transform].call(value) if opts[:transform]
+        result[opts[:target]] = value
+      end
+      result
+    end
+  end
+end
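
To make the mapping step concrete, here is a minimal standalone sketch of the same lookup-then-transform logic. It assumes a mapping table shaped like the one `FieldMapper#map` builds. `MAPPINGS`, `map_row`, and the sample rows are hypothetical names used only for illustration.

```ruby
# Hypothetical mapping table: source column => target attribute + optional transform.
MAPPINGS = {
  full_name: { target: :name, transform: nil },
  age: { target: :age, transform: ->(v) { v&.to_i } }
}.freeze

# Mirrors the transform loop: look the column up by string key first,
# then by symbol key, apply the transform if one exists, store under the target.
def map_row(row, mappings = MAPPINGS)
  mappings.each_with_object({}) do |(src, opts), result|
    value = row[src.to_s] || row[src.to_sym]
    value = opts[:transform].call(value) if opts[:transform]
    result[opts[:target]] = value
  end
end

mapped  = map_row({ "full_name" => "Ada", "age" => "36", "extra" => "x" })
missing = map_row({ "full_name" => "Bob" })
```

Two consequences follow from this shape: columns without a mapping (`extra` here) are dropped, and a mapped column that is absent from the row yields `nil`, which is why a transform such as `v&.to_i` uses safe navigation.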

data/lib/antz/importer.rb ADDED

@@ -0,0 +1,43 @@
+# frozen_string_literal: true
+require_relative "sink/active_record"
+require_relative "field_mapper"
+require_relative "source/csv_file"
+module Antz
+  class Importer
+    attr_reader :name, :depends_on, :model_class, :file_name
+    def initialize(name, depends_on: [], model: nil, file: nil, on_duplicate: :update, &)
+      @name = name
+      @depends_on = depends_on
+      @model_class = model || name.to_s.singularize.camelize.constantize
+      @file_name = file || "#{name}.csv"
+      @on_duplicate = on_duplicate
+      @field_mapper = FieldMapper.new(&)
+    end
+    def execute(dry_run: false)
+      source = Source::CsvFile.new(file_path: file_path)
+      sink = Sink::ActiveRecord.new(model: @model_class, on_duplicate: @on_duplicate)
+      source.each_row(batch_size: Antz.configuration.batch_size) do |batch|
+        mapped = batch.map { |row| @field_mapper.transform(row) }
+        if dry_run
+          Antz.configuration.logger.info("[DRY RUN] Would write: #{mapped}")
+        else
+          sink.write(mapped)
+        end
+      end
+    end
+    private
+    def file_path
+      return @file_name if @file_name.start_with?("/")
+      base = Antz.configuration.base_dir || Dir.pwd
+      File.join(base, @file_name)
+    end
+  end
+end
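
The private `file_path` method decides where a CSV is read from. A standalone sketch of that rule follows, with the base directory passed in explicitly rather than read from `Antz.configuration`. `resolve_path` and the sample paths are hypothetical.

```ruby
# Paths starting with "/" are used as-is; anything else is joined onto
# the base directory, falling back to the current working directory.
def resolve_path(file_name, base_dir: nil)
  return file_name if file_name.start_with?("/")

  File.join(base_dir || Dir.pwd, file_name)
end

absolute = resolve_path("/mnt/data/events_export.csv", base_dir: "/srv/imports")
relative = resolve_path("products.csv", base_dir: "/srv/imports")
fallback = resolve_path("products.csv")
```

Because the check is a plain `start_with?("/")`, a Windows-style absolute path such as `C:/data/x.csv` would be treated as relative under this rule.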

data/lib/antz/sink/active_record.rb ADDED

@@ -0,0 +1,52 @@
+# frozen_string_literal: true
+module Antz
+  module Sink
+    class ActiveRecord
+      def initialize(model:, on_duplicate: :update)
+        @model = model
+        @on_duplicate = on_duplicate
+      end
+      def write(records)
+        return if records.empty?
+        trigger_schema_load
+        now = Time.current
+        enriched_records = enrich_with_timestamps(records, now)
+        persist_records(enriched_records)
+      end
+      private
+      def trigger_schema_load
+        @model.columns
+      end
+      def enrich_with_timestamps(records, now)
+        needs_created_at = @model.column_names.include?("created_at")
+        needs_updated_at = @model.column_names.include?("updated_at")
+        records.map do |record|
+          record = record.dup.transform_keys(&:to_s)
+          record["created_at"] = now if needs_created_at
+          record["updated_at"] = now if needs_updated_at
+          record
+        end
+      end
+      def persist_records(records)
+        if use_upsert?
+          @model.upsert_all(records, returning: false)
+        else
+          @model.insert_all(records, returning: false)
+        end
+      end
+      def use_upsert?
+        @on_duplicate == :update && @model.respond_to?(:upsert_all)
+      end
+    end
+  end
+end
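
The timestamp step matters because bulk-insert calls receive raw attribute hashes, so the sink fills `created_at` and `updated_at` itself. Below is a database-free sketch of that enrichment, assuming the column names are passed in as an array instead of being read from a model. `enrich` and the sample data are hypothetical.

```ruby
# Adds timestamps only for columns the target table actually has,
# and normalizes keys to strings so every record has the same key type.
def enrich(records, column_names, now: Time.now)
  records.map do |record|
    record = record.transform_keys(&:to_s)
    record["created_at"] = now if column_names.include?("created_at")
    record["updated_at"] = now if column_names.include?("updated_at")
    record
  end
end

now   = Time.now
input = [{ name: "Fruits" }, { name: "Vegetables" }]

with_both    = enrich(input, %w[id name created_at updated_at], now: now)
with_neither = enrich(input, %w[id name], now: now)
```

`transform_keys` returns a new hash, so the caller's records are left unmodified, matching the `dup` in the diff.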

data/lib/antz/sink.rb ADDED

@@ -0,0 +1,9 @@
+# frozen_string_literal: true
+module Antz
+  class Sink
+    def write(records)
+      raise NotImplementedError
+    end
+  end
+end

data/lib/antz/source/csv_file.rb ADDED

@@ -0,0 +1,23 @@
+# frozen_string_literal: true
+module Antz
+  module Source
+    class CsvFile
+      def initialize(file_path:)
+        @file_path = file_path
+      end
+      def each_row(batch_size:)
+        rows = []
+        CSV.foreach(@file_path, headers: true) do |row|
+          rows << row.to_h
+          if rows.size >= batch_size
+            yield rows
+            rows = []
+          end
+        end
+        yield rows unless rows.empty?
+      end
+    end
+  end
+end
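
The batching loop above can be tried without a file on disk. This sketch applies the same accumulate-and-yield pattern to an in-memory string via `CSV.parse`, whereas the gem streams from a path with `CSV.foreach`. `CSV_TEXT` and `each_batch` are hypothetical names.

```ruby
require "csv"

CSV_TEXT = <<~CSV
  id,name
  1,Fruits
  2,Vegetables
  3,Grains
CSV

# Collect rows as hashes and yield them in groups of batch_size;
# the final, possibly smaller, group is yielded after the loop.
def each_batch(csv_text, batch_size:)
  rows = []
  CSV.parse(csv_text, headers: true).each do |row|
    rows << row.to_h
    next if rows.size < batch_size

    yield rows
    rows = []
  end
  yield rows unless rows.empty?
end

batches = []
each_batch(CSV_TEXT, batch_size: 2) { |batch| batches << batch }
```

With three data rows and a batch size of 2, the block is called twice: once with two rows and once with the remaining row. Values arrive as strings, so numeric columns need a transform such as `to_i` in the mapping step.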

data/lib/antz/source.rb ADDED

@@ -0,0 +1,9 @@
+# frozen_string_literal: true
+module Antz
+  class Source
+    def each_row
+      raise NotImplementedError
+    end
+  end
+end

data/lib/antz/version.rb ADDED

@@ -0,0 +1,5 @@
+# frozen_string_literal: true
+module Antz
+  VERSION = "0.1.0"
+end

data/lib/antz.rb ADDED

@@ -0,0 +1,37 @@
+# frozen_string_literal: true
+require "csv"
+require_relative "antz/version"
+# Main namespace for the Antz CSV-to-ActiveRecord import library.
+module Antz
+  autoload :Configuration, "antz/configuration"
+  autoload :Dataset, "antz/dataset"
+  autoload :Importer, "antz/importer"
+  autoload :DependencyResolver, "antz/dependency_resolver"
+  autoload :FieldMapper, "antz/field_mapper"
+  module Source
+    autoload :CsvFile, "antz/source/csv_file"
+  end
+  module Sink
+    autoload :ActiveRecord, "antz/sink/active_record"
+  end
+  class << self
+    attr_writer :configuration
+  end
+  def self.configuration
+    @configuration ||= Configuration.new
+  end
+  def self.configure
+    yield(configuration)
+  end
+  def self.define(&block)
+    Dataset.new.tap { |ds| ds.instance_eval(&block) }
+  end
+end

metadata ADDED

@@ -0,0 +1,97 @@
+--- !ruby/object:Gem::Specification
+name: antz
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Ilmir Karimov
+autorequire:
+bindir: exe
+cert_chain: []
+date: 2025-11-05 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: activerecord
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '6.1'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '9.0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '6.1'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '9.0'
+- !ruby/object:Gem::Dependency
+  name: csv
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.3'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.3'
+description: |
+  Antz helps you safely import CSV data into ActiveRecord models,
+  with field mapping, batch processing, and dependency-aware execution order.
+email:
+- code.for.func@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- CHANGELOG.md
+- LICENSE.txt
+- README.md
+- Rakefile
+- lib/antz.rb
+- lib/antz/configuration.rb
+- lib/antz/dataset.rb
+- lib/antz/dependency_resolver.rb
+- lib/antz/field_mapper.rb
+- lib/antz/importer.rb
+- lib/antz/sink.rb
+- lib/antz/sink/active_record.rb
+- lib/antz/source.rb
+- lib/antz/source/csv_file.rb
+- lib/antz/version.rb
+homepage: https://github.com/it1ro/antz
+licenses:
+- MIT
+metadata:
+  source_code_uri: https://github.com/it1ro/antz
+  changelog_uri: https://github.com/it1ro/antz/blob/main/CHANGELOG.md
+  rubygems_mfa_required: 'true'
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '3.2'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 3.4.1
+signing_key:
+specification_version: 4
+summary: Declarative CSV-to-ActiveRecord import with dependency resolution
+test_files: []