RubyGems: nukitori 0.1.0 → 0.1.1

This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in the public registry.

Files changed (6)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 9aa7b220b6a1cfe138ce6a644fe38bf2aa4c3cd699f1cffd21ec67ccadcb451b
-  data.tar.gz: c94ac2f7da447a988c8e6b72049f0d6f908c18ad84c315f7d89988b009aa10a0
+  metadata.gz: 37519a4a86c49b4904f08d4c6d1b91109ce00df94523c5cccb31bcc5ac8d55b5
+  data.tar.gz: 544da85b164aade937654ed702b49f6aa49b193d672d4108dc8952b25c1b3514
 SHA512:
-  metadata.gz: 1ee431bc34a28cf4554eec19fe1be3599fa14f3de7f0aeff34efd702198c2e7fd3b7b0a69409b69e87d68811bb695e67deaf41eca755a369f0dfb53b5b00c414
-  data.tar.gz: e624dc374ca0d52b1e4a7bef3b892b1ad8284589089df829190d52be95ccd6307a10824e9614c9b7f4d00d3d77808995390553ea8192e8704489a0868940267e
+  metadata.gz: 64a804009ff6786aee759fd3ca16dee56f60da253a0438c6fceac1159680aa47c07917ef12d8bc49fcf767dab12ef25c4db491f04c4b9ea487010f35021c3193
+  data.tar.gz: ff17ddd62a07c37c8b04b3b09bed0f1ace7b98bbc48feb02bcac8a10d7acb4dd47a5dc71db6a2584b7d9cc7de4eb6bd702f591b9e1984137e07a606812c1b232
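The digests above can be checked against the unpacked members of the `.gem` archive. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted into a local directory; the method name `verify_checksums` and the directory layout are hypothetical and not part of RubyGems or nukitori.

```ruby
require 'digest'
require 'yaml'

# Digest classes for the two algorithms that appear in checksums.yaml.
DIGESTS = {
  'SHA256' => Digest::SHA256,
  'SHA512' => Digest::SHA512
}.freeze

# Hypothetical helper: compares every digest recorded in a checksums.yaml
# document with the files found in `dir`.
# Returns an array of [algorithm, filename, matches?] triples.
def verify_checksums(checksums_yaml, dir)
  checksums = YAML.safe_load(checksums_yaml)
  checksums.flat_map do |algorithm, entries|
    digest_class = DIGESTS.fetch(algorithm)
    entries.map do |filename, expected|
      actual = digest_class.file(File.join(dir, filename)).hexdigest
      [algorithm, filename, actual == expected.to_s]
    end
  end
end
```

Called as `verify_checksums(File.read('checksums.yaml'), 'unpacked_gem')`, any triple ending in `false` would point to a file whose contents differ from what the registry recorded.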

data/README.md CHANGED

@@ -8,17 +8,16 @@ Nukitori is a Ruby gem for HTML data extraction that uses an LLM once to generat
 - **Robust reusable schemas** — avoids page-specific IDs, dynamic hashes, and fragile selectors
 - **Transparent output** — generated schemas are plain JSON, easy to inspect, diff, and version
 - **Token-optimized** — strips scripts, styles, and redundant DOM before sending HTML to the LLM
-- **Any LLM provider** — works with OpenAI, Anthropic, Gemini, and local models
-Define what you want to extract from HTML using a simple schema DSL:
+- **Any LLM provider** — works with OpenAI, Anthropic, Gemini, and local models:
 ```ruby
-# github_extract.rb
+# example_extract.rb
 require 'nukitori'
 require 'json'
 html = "<HTML DOM from https://github.com/search?q=ruby+web+scraping&type=repositories>"
+# define what you want to extract from HTML using simple DSL:
 data = Nukitori(html, 'schema.json') do
   integer :repositories_found_count
   array :repositories do
@@ -35,12 +34,10 @@ end
 File.write('results.json', JSON.pretty_generate(data))
 ```
-On the first run `$ ruby github_extract.rb` Nukitori uses AI to generate a reusable XPath extraction schema:
-<details>
-  <summary><code>schema.json</code> (click to expand)</summary><br>
+On the first run `$ ruby example_extract.rb` Nukitori uses AI to generate a reusable XPath extraction schema:
 ```json
+/* schema.json */
 {
   "repositories_found_count": {
     "xpath": "//a[@data-testid='nav-item-repositories']//span[@data-testid='resolved-count-label']",
@@ -78,14 +75,11 @@ On the first run `$ ruby github_extract.rb` Nukitori uses AI to generate a reusa
   }
 }
 ```
-</details>
 After that, Nukitori extracts structured data from similar HTMLs without any LLM calls, in milliseconds:
-<details>
-  <summary><code>results.json</code> (click to expand)</summary><br>
 ```json
+/* results.json */
 {
   "repositories_found_count": 314,
   "repositories": [
@@ -114,7 +108,6 @@ After that, Nukitori extracts structured data from similar HTMLs without any LLM
   ]
 }
 ```
-</details>
 ## Installation

data/lib/nukitori/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Nukitori
-  VERSION = '0.1.0'
+  VERSION = '0.1.1'
 end

data/lib/nukitori.rb CHANGED

@@ -14,12 +14,8 @@ require_relative 'nukitori/schema_extractor'
 require_relative 'nukitori/llm_extractor'
 module Nukitori
-  # Path to bundled models.json with up-to-date model definitions
-  MODELS_JSON = File.expand_path('nukitori/models.json', __dir__)
   class << self
     # Configure RubyLLM through Nukitori
-    # Automatically uses bundled models.json with latest model definitions
-    #
     # @example
     #   Nukitori.configure do |config|
     #     config.default_model = 'gpt-5.2'
@@ -28,8 +24,6 @@ module Nukitori
     #
     def configure
       RubyLLM.configure do |config|
-        # Use bundled models.json with up-to-date model definitions
-        config.model_registry_file = MODELS_JSON
         yield config if block_given?
       end
     end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: nukitori
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.1.1
 platform: ruby
 authors:
 - Victor Afanasev
@@ -30,6 +30,9 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.9'
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.9.2
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
@@ -37,6 +40,9 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.9'
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.9.2
 description: Nukitori is a Ruby gem for HTML data extraction. It uses an LLM once
   to generate reusable XPath schemas, then extracts structured data from similarly
   structured pages using plain Nokogiri. This makes scraping fast, predictable, and
@@ -55,7 +61,6 @@ files:
 - lib/nukitori/chat_factory.rb
 - lib/nukitori/html_preprocessor.rb
 - lib/nukitori/llm_extractor.rb
-- lib/nukitori/models.json
 - lib/nukitori/response_parser.rb
 - lib/nukitori/schema_extractor.rb
 - lib/nukitori/schema_generator.rb
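
The two dependency hunks in this metadata diff add a `>= 1.9.2` bound alongside the existing `~> 1.9` constraint. The name of the affected dependency falls outside the shown hunks, so it is left unnamed here. As a rough sketch of what the combined requirement accepts, using RubyGems' own `Gem::Requirement` (the constant and method names below are illustrative only):

```ruby
require 'rubygems'

# Combined constraint as it appears in the 0.1.1 gemspec metadata:
# "~> 1.9" (at least 1.9, below 2.0) together with ">= 1.9.2".
DEPENDENCY_REQUIREMENT = Gem::Requirement.new('~> 1.9', '>= 1.9.2')

# Illustrative helper: true when the given version string satisfies
# both constraints.
def allowed?(version)
  DEPENDENCY_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

%w[1.9.0 1.9.1 1.9.2 1.10.0 2.0.0].each do |candidate|
  puts "#{candidate}: #{allowed?(candidate) ? 'allowed' : 'rejected'}"
end
# prints:
# 1.9.0: rejected
# 1.9.1: rejected
# 1.9.2: allowed
# 1.10.0: allowed
# 2.0.0: rejected
```

In effect, the change raises the minimum patch level within the 1.x series while keeping the upper bound below 2.0.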