rails_ai_kit 0.1.3 → 0.1.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 5f5b6bc37b278c3ccd429b8b37a24256b89675b56b2d0ffad092d55a6b2a93ba
-   data.tar.gz: f876cc97ad7180007ad9e57481474232c3695e4538e59d820b94b365cc52def9
+   metadata.gz: e0ed50b48e8ada5fcbdee2169fbcb5e3bd427a66cc609a3b2794a0202372c9a3
+   data.tar.gz: ef0472f6a40a0f31768a814e5115d0da43c07623f194f2d9ff58c004aab7571c
  SHA512:
-   metadata.gz: fddc5b46a839156459abc6ba82419329d66aeca703b76151663a9b66296e0d9867467422ce71dd067f998e99c22639eb8fabc558e5ea07f34c4e697faf8694ca
-   data.tar.gz: 7a1420f049a63341d5783b20e08c7948ac4347c51a65ebfe71c6691f5f88ae9507174b16ccf1e42648c9ebd2dcbfc37aa7235ef378a99a63da525977bae016be
+   metadata.gz: 1d34a020030a1d8391b18d32c68374af6b918ab03f5edb51d56acc205e3a0c4f05a4bddbcb158b5bd9c0ac027bc8a5e60d5007f11fcf0135b6695716f25e3c96
+   data.tar.gz: b9a3e6a229716e9a28cee18ba85febb98c5ade412077c5e215cb4823a9bf0bb53bc495ba8f10109370368ab944571eab9ef50007023e232190dd189bf6ad204b
data/README.md CHANGED
@@ -1,41 +1,55 @@
+ <div align="center">
+ <img src="logo.png" alt="Rails AI Kit" width="280" />
+ </div>
+
  # Rails AI Kit
  
- A Rails gem that adds a **classification layer on top of [pgvector](https://github.com/pgvector/pgvector)**. Instead of building custom ML or calling LLMs every time, use vector similarity to classify data: support tickets, content moderation, ecommerce categories, document routing, and more.
+ **AI-first toolkit for Rails** embeddings, vector search, and classification without running your own ML or calling LLMs on every request.
  
- **Source:** [github.com/imrrohitt/rails_ai_kit](https://github.com/imrrohitt/rails_ai_kit)
+ > Rails gem for vector-based text classification (pgvector), embeddings (OpenAI, Cohere), similarity search, and generators. Train labels with examples; classify on save or in batch.
  
- - **No ML training** – train labels with example texts and compare embeddings
- - **No LLM cost** – one-time embedding per piece of content; classification is nearest-neighbor in PostgreSQL
- - **Rails-friendly** – `vector_classify`, `Classifier.train`, `Classifier.classify`, `Article.similar_to("query")`
- - **Multi-provider embeddings** – pass API keys for OpenAI, Cohere, or your own provider
+ [![Ruby](https://img.shields.io/badge/Ruby-3.0+-red.svg)](https://rubygems.org/gems/rails_ai_kit)
+ [![Rails](https://img.shields.io/badge/Rails-6.0+-blue.svg)](https://rubyonrails.org/)
+ [![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
  
- ## Architecture
+ ---
  
- ```
- Your Rails App
-
-
- Rails AI Kit (classification + indexing + similarity)
-
-
- pgvector (PostgreSQL)
-
-
- vector similarity search → predicted label
- ```
+ ### Overview
+
+ | | |
+ |:---|:---|
+ | **Rails-native** | Generators, ActiveRecord integration, single config file |
+ | **Vector-ready** | [pgvector](https://github.com/pgvector/pgvector) with OpenAI or Cohere embeddings |
+ | **Modular** | Use only the features you need; add more as the gem evolves |
+
+ **[Source code](https://github.com/imrrohitt/rails_ai_kit)** · **[RubyGems](https://rubygems.org/gems/rails_ai_kit)**
+
+ ---
+
+ ## Features
  
- The gem does **not** store your application data. You store data in your own PostgreSQL tables with pgvector. The gem provides:
+ Rails AI Kit is organized around **features**. Each is optional and self-contained.
  
- - Classification logic (label training, classify by similarity)
- - Indexing helpers (migrations for vector columns and label table)
- - Similarity search (`similar_to`)
- - Filtering (standard `where(label: "sports")`)
+ ### Classifier
+
+ Vector-based text classification: route support tickets, moderate content, tag articles, or categorize documents. No ML training — you provide example texts per label; classification is nearest-neighbor in PostgreSQL.
+
+ - Train labels with examples → one vector per label
+ - Auto-classify on save (embed + compare → `label`, `confidence_score`)
+ - Similarity search: `Model.similar_to("query", limit: 5)`
+ - Batch classify for backfills or background jobs
+
+ *Requires pgvector, an embedding API (OpenAI/Cohere), and the install + vector_columns generators.*
+
+ ---
  
  ## Requirements
  
  - Rails 6+
- - PostgreSQL with [pgvector](https://github.com/pgvector/pgvector) extension
- - An embedding API (OpenAI or Cohere) and API key
+ - PostgreSQL with [pgvector](https://github.com/pgvector/pgvector) (for Classifier and vector features)
+ - An embedding API key (OpenAI or Cohere) for features that use embeddings
+
+ ---
  
  ## Installation
  
@@ -51,16 +65,16 @@ Then:
  bundle install
  ```
  
- ### 1. Enable pgvector and create the labels table
+ ### 1. Install (pgvector + labels table)
  
- The gem needs one internal table to store **label embeddings** (one vector per label per classifier):
+ For the **Classifier** feature, the gem needs pgvector and one internal table for label embeddings:
  
  ```bash
  rails g rails_ai_kit:install
  rails db:migrate
  ```
  
- ### 2. Configure embedding provider and API keys
+ ### 2. Configure
  
  In `config/initializers/rails_ai_kit.rb` (create the file):
  
@@ -69,7 +83,6 @@ RailsAiKit.configure do |config|
    config.embedding_provider = :openai # or :cohere
    config.embedding_dimensions = 1536 # 1536 for OpenAI text-embedding-3-small
  
-   # Pass API keys for the provider(s) you use
    config.api_keys = {
      openai: ENV["OPENAI_API_KEY"],
      cohere: ENV["COHERE_API_KEY"]
@@ -81,9 +94,9 @@ end
  
  Use environment variables or Rails credentials; do not commit raw API keys.
  
- ### 3. Add vector columns to your model
+ ### 3. Add vector columns (for Classifier)
  
- For a new or existing table (e.g. `articles` with a `content` column):
+ For models you want to classify (e.g. `articles` with a `content` column):
  
  ```bash
  rails g rails_ai_kit:vector_columns Article content
@@ -92,9 +105,23 @@ rails db:migrate
  
  This adds `embedding` (vector), `label` (string), and `confidence_score` (float).
  
+ ---
+
  ## Usage
  
- ### Declare vector classification on a model
+ ### Embeddings (general)
+
+ Use embeddings anywhere in your app:
+
+ ```ruby
+ RailsAiKit.embed("some text") # => array of floats
+ RailsAiKit.embedding.embed("text") # same
+ RailsAiKit.embedding.embed_batch(["a", "b"]) # batch
+ ```
+
+ ### Classifier feature
+
+ #### Declare classification on a model
  
  ```ruby
  class Article < ApplicationRecord
@@ -107,55 +134,43 @@ On save, the gem will:
  
  1. Generate an embedding for `content`
  2. Store it in `embedding`
- 3. Run classification (nearest label vector) and set `label` and `confidence_score`
-
- Example:
-
- ```ruby
- article = Article.create!(content: "Apple released a new iPhone")
- article.label # => "technology"
- article.confidence_score # => 0.91
- ```
+ 3. Compare to trained label vectors and set `label` and `confidence_score`
  
- ### Train labels with examples
+ #### Train labels first
  
- Before classifying, train each label with example texts so the gem can build a label vector (average of example embeddings):
+ Before classifying, train each label with example texts:
  
  ```ruby
- classifier = RailsAiKit.classifier("Article") # or use default
+ c = RailsAiKit.classifier("Article")
  
- classifier.train("sports", examples: [
+ c.train("sports", examples: [
    "football match",
    "cricket tournament",
    "Olympic gold medal"
  ])
-
- classifier.train("politics", examples: [
-   "election results",
-   "parliament debate"
- ])
-
- classifier.train("technology", examples: [
-   "new iPhone launch",
-   "AI software update"
- ])
+ c.train("politics", examples: ["election results", "parliament debate"])
+ c.train("technology", examples: ["new iPhone launch", "AI software update"])
  ```
  
- You can use the default classifier name or pass a custom one:
+ #### Create records
  
  ```ruby
- RailsAiKit.classifier("Article").train("sports", examples: [...])
- RailsAiKit.classifier.train("sports", examples: [...]) # default classifier
+ article = Article.create!(content: "Apple released a new iPhone")
+ article.label # => "technology"
+ article.confidence_score # => 0.91
  ```
  
- ### Classify text without saving
+ #### Classify without saving
  
  ```ruby
- result = RailsAiKit.classifier("Article").classify("India won the cricket match")
+ RailsAiKit.classifier("Article").classify("India won the cricket match")
  # => { label: "sports", confidence: 0.91, distance: 0.09 }
+
+ RailsAiKit.classifier("Article").classify_by_embedding(article.embedding)
+ # => { label: "technology", confidence: 0.91, distance: 0.09 }
  ```
  
- ### Batch classification
+ #### Batch classification
  
  ```ruby
  records = Article.where(label: nil).limit(100)
@@ -164,26 +179,24 @@ RailsAiKit.classifier("Article").batch_classify(records,
    label_attribute: :label,
    confidence_attribute: :confidence_score
  )
- # Optionally save: records.each(&:save!)
+ # Optionally: records.each(&:save!)
  ```
  
- ### Filtering
-
- Use normal ActiveRecord scopes:
+ #### Similarity search
  
  ```ruby
- Article.where(label: "sports")
- Article.where("confidence_score >= ?", 0.8)
+ Article.similar_to("new iPhone launch", limit: 5)
  ```
  
- ### Similarity search
-
- Find records similar to a piece of text (embeds the query, then nearest-neighbor search):
+ #### Filtering
  
  ```ruby
- Article.similar_to("new iPhone launch", limit: 5)
+ Article.where(label: "sports")
+ Article.where("confidence_score >= ?", 0.8)
  ```
  
+ ---
+
  ## Example: Support ticket routing
  
  ```ruby
@@ -195,16 +208,18 @@ class SupportTicket < ApplicationRecord
  end
  
  # Train once
- classifier = RailsAiKit.classifier("SupportTicket")
- classifier.train("billing", examples: ["My payment failed", "Refund request", "Invoice issue"])
- classifier.train("technical", examples: ["App crashed", "Login not working", "Error message"])
- classifier.train("account", examples: ["Change email", "Close my account", "Password reset"])
+ c = RailsAiKit.classifier("SupportTicket")
+ c.train("billing", examples: ["My payment failed", "Refund request", "Invoice issue"])
+ c.train("technical", examples: ["App crashed", "Login not working", "Error message"])
+ c.train("account", examples: ["Change email", "Close my account", "Password reset"])
  
  # Incoming ticket
  ticket = SupportTicket.create!(message: "My payment failed last night")
  ticket.label # => "billing" → route to billing queue
  ```
  
+ ---
+
  ## Configuration reference
  
  | Option | Description | Default |
@@ -212,27 +227,33 @@ ticket.label # => "billing" → route to billing queue
  | `embedding_provider` | `:openai` or `:cohere` | `:openai` |
  | `embedding_dimensions` | Vector size (must match provider) | `1536` |
  | `api_keys` | Hash of provider => API key | `{}` |
- | `default_classifier_name` | Name when no classifier given | `"default"` |
+ | `default_classifier_name` | Classifier name when none given | `"default"` |
+
+ ---
  
  ## Generators
  
  | Generator | Purpose |
  |-----------|---------|
- | `rails g rails_ai_kit:install` | Migration: enable pgvector + create `rails_ai_kit_labels` |
- | `rails g rails_ai_kit:vector_columns ModelName content_column` | Migration: add `embedding`, `label`, `confidence_score` to a table |
+ | `rails g rails_ai_kit:install` | Enable pgvector + create `rails_ai_kit_labels` (for Classifier) |
+ | `rails g rails_ai_kit:vector_columns ModelName content_column` | Add `embedding`, `label`, `confidence_score` to a table |
  
- ## How it works
+ ---
  
- 1. **Label vectors** – Each label is represented by a vector (average of example embeddings). Stored in `rails_ai_kit_labels`.
- 2. **Classification** – New content is embedded and compared to all label vectors with cosine distance. The nearest label wins; confidence is `1 - distance`.
- 3. **Storage** – Your table holds the content, its embedding, the predicted label, and confidence. The gem only adds one table for label vectors.
+ ## How Classifier works
  
- ## Future ideas
+ 1. **Label vectors** – Each label is a vector (average of example embeddings), stored in `rails_ai_kit_labels`.
+ 2. **Classification** – New content is embedded and compared to label vectors (cosine distance). Nearest label wins; confidence = `1 - distance`.
+ 3. **Storage** – Your table holds content, embedding, predicted label, and confidence. The gem adds one table for label vectors only.
  
- - Hierarchical labels (e.g. technology → mobile, laptops)
- - Confidence threshold (e.g. mark as "unknown" if < 0.7)
- - Hybrid search (vector + keyword)
- - Incremental learning (add examples to improve labels over time)
+ ---
+
+ ## Roadmap
+
+ - **Classifier:** hierarchical labels, confidence threshold, hybrid search, incremental learning
+ - **More features** – additional AI capabilities as the gem evolves
+
+ ---
  
  ## Development
  
@@ -243,21 +264,26 @@ bundle exec rake install
  
  Run tests (when added) with `bundle exec rspec` or `bundle exec rake test`.
  
+ ---
+
  ## License
  
  MIT.
  
+ ---
+
  ## How it’s built
  
- - **Configuration** (`lib/rails_ai_kit/configuration.rb`) – Embedding provider, dimensions, and API keys (e.g. `api_keys[:openai]`).
- - **Embedding providers** (`lib/rails_ai_kit/embedding_providers/`) – Base class plus OpenAI and Cohere. Each implements `embed(text)` and `embed_batch(texts)` using the provider API.
- - **EmbeddingService** – Wraps the configured provider and API key so `RailsAiKit.embedding.embed(text)` works without passing keys every time.
- - **LabelRecord** – ActiveRecord model for `rails_ai_kit_labels` (classifier_name, label_name, embedding). Uses Neighbor’s `has_neighbors :embedding` for similarity.
- - **Classifier** – Trains labels by averaging example embeddings and storing them; classifies by nearest-neighbor (cosine) against those label vectors. Supports `classify(text)`, `classify_by_embedding(vector)`, and `batch_classify(records)`.
- - **VectorClassify** – Concern that adds the `vector_classify` macro: `has_neighbors` on the embedding column, a before_save that embeds the source column and runs `classify_by_embedding`, and a `similar_to(query_text)` scope that embeds the query and runs nearest-neighbor search.
- - **Generators** – `rails_ai_kit:install` creates the labels table migration; `rails_ai_kit:vector_columns` adds embedding/label/confidence_score to a given table.
+ - **Configuration** – Embedding provider, dimensions, API keys.
+ - **Embedding providers** – Base + OpenAI and Cohere (`embed`, `embed_batch`).
+ - **EmbeddingService** – `RailsAiKit.embedding` / `RailsAiKit.embed(text)`.
+ - **Classifier** – Label training, `classify`, `classify_by_embedding`, `batch_classify`; uses `rails_ai_kit_labels` + Neighbor for similarity.
+ - **VectorClassify** – ActiveRecord concern: `vector_classify` macro, `similar_to` scope.
+ - **Generators** – Install (labels table), vector_columns (embedding/label/confidence on a model).
+
+ ---
  
  ## Related
  
- - [pgvector](https://github.com/pgvector/pgvector) – Open-source vector similarity search for Postgres
+ - [pgvector](https://github.com/pgvector/pgvector) – Vector similarity search for Postgres
  - [Neighbor](https://github.com/ankane/neighbor) – Nearest neighbor search for Rails (used by this gem)
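
The "How Classifier works" steps in the README diff above (average the example embeddings per label, compare by cosine distance, report confidence as `1 - distance`) can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the gem's implementation: `CentroidSketch` and the two-dimensional toy vectors are hypothetical, and the gem itself stores label vectors in `rails_ai_kit_labels` and searches them through Neighbor.

```ruby
# Plain-Ruby sketch of centroid classification; no database or API calls.
# CentroidSketch is a hypothetical name used for illustration only.
module CentroidSketch
  module_function

  # Element-wise mean of equally sized vectors (the "label vector").
  def average(vectors)
    size = vectors.length.to_f
    vectors.transpose.map { |column| column.sum / size }
  end

  # Cosine distance = 1 - cosine similarity. Assumes non-zero vectors.
  def cosine_distance(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
    1.0 - dot / norm
  end

  # Picks the nearest label vector and reports confidence = 1 - distance.
  def classify(embedding, label_vectors)
    label, distance = label_vectors
      .map { |name, vector| [name, cosine_distance(embedding, vector)] }
      .min_by { |_, d| d }
    { label: label, confidence: 1.0 - distance, distance: distance }
  end
end

# Toy 2-dimensional "embeddings" standing in for provider output.
label_vectors = {
  "sports"     => CentroidSketch.average([[1.0, 0.0], [0.8, 0.2]]),
  "technology" => CentroidSketch.average([[0.0, 1.0], [0.2, 0.8]])
}

result = CentroidSketch.classify([0.9, 0.1], label_vectors)
puts result[:label]
```

Because the query vector here coincides with the averaged "sports" vector, the distance is effectively zero and the confidence effectively one; real embeddings would land somewhere in between.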
@@ -1,12 +1,14 @@
  # frozen_string_literal: true
  
  require "rails/generators"
- require "rails/generators/active_record/migration/migration_generator"
+ require "rails/generators/active_record/migration"
  
  module RailsAiKit
    module Generators
      class VectorColumnsGenerator < Rails::Generators::NamedBase
-       source_root File.expand_path("templates", __dir__)
+       include ActiveRecord::Generators::Migration
+
+       source_root File.expand_path("vector_columns/templates", __dir__)
  
        desc "Adds embedding, label, and confidence_score columns to an existing table for vector_classify"
  
@@ -16,9 +18,9 @@ module RailsAiKit
        class_option :embedding_dimensions, type: :numeric, default: 1536,
                                            desc: "Vector dimensions (must match your embedding provider)"
  
-       def create_migration
+       def add_vector_columns_migration
          migration_template "add_vector_columns.rb",
-                            "db/migrate/#{migration_timestamp}_add_rails_ai_kit_vector_columns_to_#{table_name}.rb"
+                            "db/migrate/add_rails_ai_kit_vector_columns_to_#{table_name}.rb"
        end
  
        private
@@ -27,10 +29,6 @@ module RailsAiKit
          name.underscore.pluralize
        end
  
-       def migration_timestamp
-         Time.now.utc.strftime("%Y%m%d%H%M%S")
-       end
-
        def dimensions
          options[:embedding_dimensions] || (defined?(RailsAiKit) && RailsAiKit.configuration.embedding_dimensions) || 1536
        end
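
The generator hunks above drop the hand-rolled `migration_timestamp` helper and pass an un-numbered destination to `migration_template`. Rails' `migration_template` normally prepends the migration number to the file name itself, which is presumably why the helper became unnecessary. The plain-Ruby sketch below only illustrates that naming step; `timestamped_migration_path` is a hypothetical helper, not a Rails or gem API, and it reuses the timestamp format from the removed code.

```ruby
# Hypothetical helper illustrating how a numbered migration path is derived
# from an un-numbered destination. Not part of Rails or rails_ai_kit.
def timestamped_migration_path(destination, now: Time.now.utc)
  dir  = File.dirname(destination)
  base = File.basename(destination)
  # Same format string as the removed migration_timestamp method.
  number = now.strftime("%Y%m%d%H%M%S")
  File.join(dir, "#{number}_#{base}")
end

path = timestamped_migration_path(
  "db/migrate/add_rails_ai_kit_vector_columns_to_articles.rb",
  now: Time.utc(2026, 3, 7, 12, 0, 0)
)
puts path
```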
@@ -6,7 +6,13 @@ module RailsAiKit
    class Classifier
      attr_reader :classifier_name
  
-     def initialize(classifier_name: nil)
+     def initialize(classifier_name: nil, **kwargs)
+       if kwargs.any?
+         raise ArgumentError,
+               "Classifier accepts only classifier_name: (optional). " \
+               "Labels are trained with .train(label_name, examples: [...]). " \
+               "Unknown: #{kwargs.keys.join(', ')}"
+       end
        @classifier_name = classifier_name || RailsAiKit.configuration.default_classifier_name
      end
  
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
  
  module RailsAiKit
-   VERSION = "0.1.3"
+   VERSION = "0.1.6"
  end
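
The `Classifier#initialize` change shown earlier in this diff captures unknown keywords with `**kwargs` and raises with a message pointing at `.train`. Ruby would already raise `ArgumentError` for an unknown keyword when a method declares explicit keyword parameters and no splat, so the practical effect appears to be a more helpful error message. The stand-in class below mirrors the guard so the behaviour can be tried without the gem; `ClassifierSketch`, `DEFAULT_NAME` and the shortened message are assumptions made for illustration.

```ruby
# Stand-in class mirroring the keyword guard; not the gem's Classifier.
class ClassifierSketch
  # Assumption: stands in for RailsAiKit.configuration.default_classifier_name.
  DEFAULT_NAME = "default"

  attr_reader :classifier_name

  def initialize(classifier_name: nil, **kwargs)
    # Reject anything other than classifier_name: with an explicit message.
    if kwargs.any?
      raise ArgumentError, "Unknown: #{kwargs.keys.join(', ')}"
    end
    @classifier_name = classifier_name || DEFAULT_NAME
  end
end

named = ClassifierSketch.new(classifier_name: "Article")
puts named.classifier_name

begin
  ClassifierSketch.new(labels: %w[sports politics])
rescue ArgumentError => e
  puts e.message
end
```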
data/lib/rails_ai_kit.rb CHANGED
@@ -4,7 +4,7 @@ require_relative "rails_ai_kit/version"
  require_relative "rails_ai_kit/configuration"
  
  module RailsAiKit
-   class Error < StandardError; end
+   class Error < StandardError; end
  end
  
  require_relative "rails_ai_kit/embedding_providers/base"
@@ -36,4 +36,9 @@ module RailsAiKit
    def self.classifier(classifier_name = nil)
      Classifier.new(classifier_name: classifier_name)
    end
+
+   # Convenience: embed a single text. Same as RailsAiKit.embedding.embed(text).
+   def self.embed(text)
+     embedding.embed(text)
+   end
  end
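
The new `RailsAiKit.embed` is a thin module-level delegate to `RailsAiKit.embedding.embed`. A minimal sketch of that delegation pattern follows, assuming a memoized service object; `EmbedSketch` and `FakeService` are hypothetical, and the fake service returns a deterministic toy vector instead of calling OpenAI or Cohere. Whether the gem memoizes its service is not visible in this diff.

```ruby
# Sketch of a module-level convenience method delegating to a service object.
module EmbedSketch
  # Hypothetical service: derives a toy vector from the first three bytes.
  class FakeService
    def embed(text)
      text.bytes.first(3).map { |byte| byte / 255.0 }
    end
  end

  # Assumption: the service is built once and reused.
  def self.embedding
    @embedding ||= FakeService.new
  end

  # Convenience wrapper, same result as EmbedSketch.embedding.embed(text).
  def self.embed(text)
    embedding.embed(text)
  end
end

vector = EmbedSketch.embed("abc")
puts vector.length
```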
metadata CHANGED
@@ -1,14 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: rails_ai_kit
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.1.6
  platform: ruby
  authors:
  - Rails AI Kit Contributors
- autorequire:
  bindir: bin
  cert_chain: []
- date: 2026-03-07 00:00:00.000000000 Z
+ date: 1980-01-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activerecord
@@ -66,11 +65,11 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0.2'
- description: 'Rails AI Kit provides ready-made tools to classify data using vector
-   similarity: label training, auto/batch classification, similarity search, and filtering—without
-   ML training or LLM calls.'
+ description: 'Rails AI Kit provides AI building blocks for Rails apps: embeddings
+   (OpenAI, Cohere), vector-based classification, similarity search, and generators.
+   Start with the Classifier feature; more capabilities as the gem grows.'
  email:
- - rohit.kushwaha@w3villa.com
+ - imrohitkushwaha2002@gmail.com
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -98,7 +97,6 @@ metadata:
    homepage_uri: https://github.com/imrrohitt/rails_ai_kit
    source_code_uri: https://github.com/imrrohitt/rails_ai_kit
    changelog_uri: https://github.com/imrrohitt/rails_ai_kit/blob/main/CHANGELOG.md
- post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -113,8 +111,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.0.3.1
- signing_key:
+ rubygems_version: 3.6.9
  specification_version: 4
- summary: Vector-based classification layer on top of pgvector for Rails
+ summary: 'AI-first toolkit for Rails: embeddings, vector search, and classification'
  test_files: []