clip-rb 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 29e8df54de90f811ae779c79b8430d07cf7f23cf0d1995ccb9d3dcd268357df2
- data.tar.gz: 556ac48aef1f241fb28e2907a9e5ae6b77930c57ccd67c6ba2e89d7f0a09d63a
+ metadata.gz: 9bffd8c438606eb432fc7aa887390917ae4176c6c33dfee8e5ebffe7e9c196ca
+ data.tar.gz: 8fac7c434d2740672969ac348f73b57798dfd5a13846a240dd219614fd70bb6c
  SHA512:
- metadata.gz: af8f9334c49912107269621066863a22c393b54c13ed5464df7e7abd77803fc70a236bcf41bf5a1ad33ad5c6226e8741b5ae4198be0a2e84c117bb181306a2c8
- data.tar.gz: 42db46f9e13d6792335607471c4250d272358da3e8fec24afdf23510da171bec9a88b0853575abdb53196676641f3a7da7141c62cd67656d6071a3f7769f00d9
+ metadata.gz: cc02707257b2eba026a925a4d63b24c9a71a81f8a0f7aae342bc4c9ac7eaa1330e93a4bcfcabdd4361aec61cd28c02c3e25265d9f90a57dd53f52d21b63136d8
+ data.tar.gz: ae464db71f045712d52e78891493fc8e5177f83311e14c968e5413b3600a5062efc65492375fd2935d6a5fb63022940e802ec1962d4fbddcbfe6120acdbffbb2
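For readers who want to check these digests themselves, here is a minimal sketch using Ruby's standard library; it assumes the published .gem archive has been unpacked so that metadata.gz and data.tar.gz are available locally:

```ruby
require "digest"

# Compute the SHA256 of an unpacked gem artifact and compare it to the value above
puts Digest::SHA256.file("data.tar.gz").hexdigest
```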
data/README.md CHANGED
@@ -1,18 +1,24 @@
  # clip-rb

- [![Gem Version](https://badge.fury.io/rb/clip-rb.svg)](https://badge.fury.io/rb/clip-rb)
+ [![Gem Version](https://badge.fury.io/rb/clip-rb.svg)](https://badge.fury.io/rb/clip-rb)
  [![Test](https://github.com/khasinski/clip-rb/workflows/clip-rb/badge.svg)](https://github.com/khasinski/clip-rb/actions/workflows/main.yml)

- Clip replacement that uses ONNX models. No Python required!
+ **clip-rb** is a Ruby implementation of [OpenAI CLIP](https://openai.com/index/clip/) powered by ONNX models—no Python required!
+
+ CLIP (Contrastive Language–Image Pre-training) is a powerful neural network developed by OpenAI. It connects text and images by learning shared representations, enabling tasks such as image-to-text matching, zero-shot classification, and visual search. With clip-rb, you can easily encode text and images into high-dimensional embeddings for similarity comparison or use in downstream applications like caption generation and vector search.
+
+ ---

  ## Requirements

  - Ruby 3.0.0 or later
- - ONNX models for CLIP (downloaded automatically on first use)
+ - ONNX CLIP models (downloaded automatically on first use)
+
+ ---

  ## Installation

- Install the gem and add to the application's Gemfile by executing:
+ Add the gem to your application by executing:

  ```bash
  bundle add clip-rb
@@ -31,11 +37,15 @@ require 'clip'

  clip = Clip::Model.new

- clip.encode_text("a photo of a cat") # => [0.15546110272407532, 0.07329428941011429, ...]
+ text_embedding = clip.encode_text("a photo of a cat")
+ # => [0.15546110272407532, 0.07329428941011429, ...]

- clip.encode_image("test/fixtures/test.jpg") # => [0.22115306556224823,, 0.19343754649162292, ...]
+ image_embedding = clip.encode_image("test/fixtures/test.jpg")
+ # => [0.22115306556224823, 0.19343754649162292, ...]
  ```

+ 💡 Tip: Use cosine similarity for KNN vector search when comparing embeddings!
+
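To make the cosine-similarity tip concrete, here is a minimal Ruby sketch; the `cosine_similarity` helper is illustrative only (it is not part of the clip-rb API) and it assumes the `text_embedding` and `image_embedding` arrays from the example above:

```ruby
# Illustrative helper (not part of clip-rb): cosine similarity of two embeddings
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

cosine_similarity(text_embedding, image_embedding) # closer to 1.0 means more similar
```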

  ## CLI

  Additionally you can fetch embeddings by calling:
@@ -45,8 +55,6 @@ $ clip-embed-text "a photo of a cat"
  $ clip-embed-image test/fixtures/test.jpg
  ```

- Use KNN vector search to find similar images, remember to use cosine distance!
-
  ## Development

  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
@@ -55,7 +63,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To

  ## Contributing

- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/clip-rb. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/[USERNAME]/clip-rb/blob/main/CODE_OF_CONDUCT.md).
+ Bug reports and pull requests are welcome on GitHub at https://github.com/khasinski/clip-rb. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/khasinski/clip-rb/blob/main/CODE_OF_CONDUCT.md).

  ## License

data/lib/clip/image_preprocessor.rb CHANGED
@@ -1,5 +1,62 @@
+ require "mini_magick"
+ require "numo/narray"
+
  module Clip
    class ImagePreprocessor
+     # CLIP's expected image normalization parameters
+     MEAN = Numo::DFloat[*[ 0.48145466, 0.4578275, 0.40821073 ]]
+     STD = Numo::DFloat[*[ 0.26862954, 0.26130258, 0.27577711 ]]
+
+     def initialize(target_size: 224)
+       @target_size = target_size
+     end
+
+     # Preprocess the image and return a tensor with shape [batch_size, 3, 224, 224]
+     def preprocess(image_path)
+       image = load_and_resize(image_path)
+       tensor = image_to_tensor(image)
+       normalized = normalize(tensor)
+       add_batch_dimension(normalized)
+     end
+
+     private
+
+     # Load image, convert to RGB, and resize to target size
+     def load_and_resize(image_path)
+       image = MiniMagick::Image.open(image_path)
+       image.format "png" # Ensure consistent format
+       image = image.combine_options do |c|
+         c.resize "#{@target_size}x#{@target_size}!"
+         c.quality 100
+         c.colorspace "RGB"
+       end
+       image
+     end
+
+     # Convert the image to a Numo array with shape [3, 224, 224], scaled to [0, 1]
+     def image_to_tensor(image)
+       pixels = image.get_pixels # Returns [[R, G, B], ...] for each row
+       # Convert to Numo::NArray and reshape
+       pixel_array = Numo::UInt8.asarray(pixels).cast_to(Numo::DFloat)
+       # Reshape to [height, width, channels]
+       pixel_array = pixel_array.reshape(@target_size, @target_size, 3)
+       # Transpose to [channels, height, width]
+       pixel_array = pixel_array.transpose(2, 0, 1)
+       # Normalize to [0, 1]
+       pixel_array / 255.0
+     end
+
+     # Apply CLIP normalization: (x - mean) / std
+     def normalize(tensor)
+       # Expand mean and std to match tensor shape
+       mean = MEAN.reshape(3, 1, 1)
+       std = STD.reshape(3, 1, 1)
+       (tensor - mean) / std
+     end

+     # Add batch dimension: [1, 3, 224, 224]
+     def add_batch_dimension(tensor)
+       tensor.reshape(1, 3, @target_size, @target_size)
+     end
    end
  end
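For readers tracing the new preprocessing pipeline above, a minimal usage sketch; calling the class directly like this is illustrative only (in normal use it is presumably driven internally by Clip::Model#encode_image), and it assumes ImageMagick is installed for MiniMagick:

```ruby
require "clip"

# Illustrative direct call into the preprocessor added in 0.2.0
preprocessor = Clip::ImagePreprocessor.new(target_size: 224)
input_tensor = preprocessor.preprocess("test/fixtures/test.jpg")
input_tensor.shape # expected to be [1, 3, 224, 224], per the comment above
```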
data/lib/clip/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Clip
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
data/sig/clip/.gitkeep ADDED
File without changes
data/sig/clip.rbs ADDED
@@ -0,0 +1,3 @@
+ module Clip
+   VERSION: String
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: clip-rb
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - Krzysztof Hasiński
@@ -66,6 +66,34 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: '1.6'
+ - !ruby/object:Gem::Dependency
+ name: numo-narray
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.9.2
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.9.2
+ - !ruby/object:Gem::Dependency
+ name: mini_magick
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '5.0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '5.0'
  description: OpenAI CLIP embeddings, uses ONNX models. Allows to create embeddings
  for images and text
  email:
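For context, the two dependency blocks above are the YAML that RubyGems serializes for runtime dependencies; a sketch of the gemspec declarations that would produce them (the clip-rb.gemspec itself is not part of this diff, so these exact lines are assumed):

```ruby
# Assumed gemspec declarations corresponding to the new runtime dependencies
spec.add_dependency "numo-narray", "~> 0.9.2"
spec.add_dependency "mini_magick", "~> 5.0"
```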
@@ -90,7 +118,8 @@ files:
  - lib/clip/model.rb
  - lib/clip/tokenizer.rb
  - lib/clip/version.rb
- - sig/clip/rb.rbs
+ - sig/clip.rbs
+ - sig/clip/.gitkeep
  homepage: https://github.com/khasinski/clip-rb
  licenses:
  - MIT
data/sig/clip/rb.rbs DELETED
@@ -1,6 +0,0 @@
- module Clip
-   module Rb
-     VERSION: String
-     # See the writing guide of rbs: https://github.com/ruby/rbs#guides
-   end
- end