
glancer 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

checksums.yaml +4 -4
data/CHANGELOG.md +39 -55
data/README.md +31 -2
data/lib/generators/glancer/install/templates/glancer.rb +14 -1
data/lib/glancer/configuration.rb +16 -1
data/lib/glancer/retriever.rb +16 -12
data/lib/glancer/utils/rate_limit_retry.rb +51 -0
data/lib/glancer/version.rb +1 -1
data/lib/glancer/workflow/builder.rb +12 -4
data/lib/glancer/workflow/llm.rb +12 -4
data/lib/glancer/workflow/query_enricher.rb +3 -1
data/lib/glancer.rb +2 -1
data/spec/lib/glancer/configuration_spec.rb +60 -0
data/spec/lib/glancer/utils/rate_limit_retry_spec.rb +193 -0
metadata +3 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: d4248cad39bdfcfd7cb3907a83c81170f49bf15b76c0af226d6333622479dd72
-  data.tar.gz: cd8b5588e4032750fa0f5be8eae5de5ded28a713a1223b948d4fd94b258ce012
+  metadata.gz: '0780756d0563d270e2669590d7fa195506c67eb9224abc24382356f54682d9cf'
+  data.tar.gz: b2035a81f75a73e2dd9a4ceaa0e4af1d084867c3c6ecf0554e97acc05d9f7df1
 SHA512:
-  metadata.gz: 4cb913d4829f562ceaa8ebc9ff808c22037b540c0ecf75991cfb6f785434fc0fd8588d73d6a65e54a09f96d002a7e0359bc0e8884ba112a75fac215832b32340
-  data.tar.gz: 4b5f94c281b5161e83bbbbf48296d321344302ee53811bf0b8b4184cce385f831552fca3827ee26dae34897dd501ff8de74ad33f6714e8d7e3ed1b6c135779d7
+  metadata.gz: 5c2fb146ee00751b91613a2bbc818cadfd44dff3c0f8fa4c42907516a02baeda4f281a273dfd8fba62f793ce16ed4d2ed2cebc3545026c282e4f78c9f6d5e2ef
+  data.tar.gz: cbca9b21bff036fd309a5f6fab87cea4e27ef83224f8f91bb0be618146bf611f156cc9e624940164e74e75e87586a52c2533f26d27e87f5bd725e075d91de1bd

data/CHANGELOG.md CHANGED

@@ -7,82 +7,66 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
-## [1.0.0] — 2026-05-24
+## [1.1.0] — 2026-06-05
 ### Added
-- **SQL editing**: users can edit the generated SQL directly in the chat UI before
-  executing it. Edits are persisted with a `user_edited_sql` flag and surfaced with
-  a visible badge.
-- **Pipeline status labels**: animated step-by-step labels (embedding → retrieval →
-  SQL generation → validation → execution → response) while the pipeline is running.
-- **Accordion results**: running a new query collapses the previous results panel;
-  panels can be toggled independently.
-- **Copy buttons**: one-click copy for both the raw SQL and the full assistant response.
-- **Large-result alert**: a warning banner when a query returns ≥ 500 rows or has no
-  `LIMIT` clause.
-- **Audio input**: microphone button powered by the Web Speech API; transcribed text
-  is appended to the question field.
-- **Desktop sidebar toggle**: chevron button to collapse/expand the chat list on large
-  screens; state is persisted in `localStorage`.
-- **Blazer integration**: "Open in Blazer" button auto-detected when the `blazer` gem
-  is present; configurable via `config.blazer_path`.
-- **`:silent` log verbosity level**: suppresses all log output including `warn` and
-  `error` — intended for test environments.
-- **100 % test coverage**: 552 RSpec examples covering every workflow path, edge case,
-  and rescue branch.
-- **CI badge**: automatically generated coverage SVG committed to the `badge-generator`
-  branch on every push to `main`.
-### Changed
-- LLM humanization prompt rewritten to never describe the query as "executed" — it now
-  explains the logic and why it answers the question.
-- Improved self-correction: the executor retries up to 3 times, passing the database
-  error back to the LLM on each attempt.
-- Immediate user-message rendering: the user's bubble appears in the chat instantly
-  (before the server responds) via a temporary DOM node that is replaced by the
-  Turbo Stream response.
+- **Rate-limit retry with backoff**: all LLM and embedding API calls now automatically
+  retry when a rate-limit / quota-exceeded error is received. If the provider returns a
+  "retry in Xs" hint (e.g. Gemini Flash), that exact delay is honoured; otherwise
+  exponential backoff is applied (`llm_retry_delay * 2^(attempt - 1)`). Resolves [#1].
+- **`max_llm_retries` config option** (default: `3`): maximum number of retry attempts
+  before propagating the error to the user. Set to `0` to disable automatic retries.
+- **`llm_retry_delay` config option** (default: `60` seconds): base delay in seconds
+  used when the provider does not supply a retry-after hint.
-### Fixed
-- Info panel never showed content because the controller was passing `message_for_info:`
-  but the partial expected `message_info:`.
+## [1.0.0] — 2026-05-24
-## [0.1.0] — 2026-05-14
+First public release.
 ### Added
-- **RAG pipeline**: embed → retrieve → generate SQL → execute → humanize, with automatic
-  retry (up to 3 attempts) on SQL errors using LLM self-correction.
+- **RAG pipeline**: embed → retrieve → generate code → validate → execute → humanize,
+  with automatic retry (up to 3 attempts) on errors using LLM self-correction.
+- **Dual query mode**: `query_mode: :sql` (default) generates read-only SQL;
+  `query_mode: :activerecord` generates and evaluates Ruby/ActiveRecord expressions.
+  Each mode has its own sanitizer (`SQLSanitizer` / `ARSanitizer`), extractor, prompt
+  builder, and executor.
 - **Multi-provider LLM support** via [ruby_llm](https://github.com/crmne/ruby_llm):
-  Gemini, OpenAI, and OpenRouter. Each role (SQL generation, chat responses, embeddings)
-  can use a different provider and model.
+  Gemini, OpenAI, and OpenRouter. Code generation, chat responses, and embeddings can
+  each use a different provider and model.
+- **Async message processing** via `Glancer::AsyncRunner`: messages are processed in a
+  background thread using `connection_pool.with_connection` — no external job queue
+  (Sidekiq, GoodJob, etc.) required.
+- **Client-side polling**: the UI polls `/messages/:id/poll` every 2 s and replaces the
+  message partial via Turbo Stream once done; a 5-minute hard timeout marks stuck
+  messages as failed automatically.
+- **Query enrichment**: `QueryEnricher` translates natural-language questions into dense
+  technical specifications before retrieval, improving code accuracy.
 - **Indexers** for `db/schema.rb`, `app/models/**/*.rb`, and a custom Markdown context
   file. Rake tasks: `glancer:index:all`, `glancer:index:schema`, `glancer:index:models`,
   `glancer:index:context`.
 - **Cosine similarity retrieval** with per-source-type relevance weights (schema 1.3×,
-  context 1.2×, models 1.1×) and a configurable minimum score threshold.
-- **Chunk overlap** to prevent context loss at document boundaries.
+  context 1.2×, models 1.1×), configurable minimum score threshold, and fallback to
+  top-k results when no embedding meets the threshold.
 - **SQL safety layer**: `SQLSanitizer` (blocks destructive statements),
-  `SQLValidator` (verifies table references against indexed schema), and
-  mandatory read-only transaction with automatic rollback.
+  `SQLValidator` (verifies table references against indexed schema), and mandatory
+  read-only transaction with automatic rollback.
 - **Audit trail**: every executed query is stored in `glancer_audits` with a unique
-  `run_id` UUID injected as an SQL comment (`/*glancer,run_id:UUID*/`).
+  `run_id` UUID injected as a comment (`/*glancer,run_id:UUID*/`).
 - **In-memory response cache** (`workflow_cache_ttl`) to avoid redundant LLM calls for
   repeated identical questions.
 - **Chat UI**: Stimulus + Turbo Streams interface with dark mode, typewriter effect,
-  CSV export, SQL re-run, pipeline status labels, accordion results, copy-to-clipboard,
-  and audio input (Web Speech API).
+  CSV export, SQL/AR re-run, chart visualizations (bar, line, pie), client-side polling,
+  pipeline status labels, accordion results, and copy-to-clipboard.
 - **Settings page** at `/glancer/settings` for runtime custom instructions.
 - **Schema viewer** at `/glancer/db-schema` showing indexed tables and columns.
 - **Install generator**: `rails generate glancer:install` scaffolds the initializer,
   context file, and mounts the engine.
-- Configurable `statement_timeout` enforced via adapter-native mechanisms
-  (PostgreSQL `SET statement_timeout`, MySQL `SET max_execution_time`).
-- `config.history_limit` to control how many prior turns are included in the prompt.
-- `config.read_only_db` to route queries to a replica connection string.
+- Configurable `statement_timeout`, `history_limit`, `read_only_db`, `k`, `min_score`,
+  and per-source document weights.
+- **100% line coverage**: 717 RSpec examples covering every workflow path, edge case,
+  and rescue branch.
 [Unreleased]: https://github.com/ErnaneJ/glancer/compare/v1.0.0...HEAD
-[1.0.0]: https://github.com/ErnaneJ/glancer/compare/v0.1.0...v1.0.0
-[0.1.0]: https://github.com/ErnaneJ/glancer/releases/tag/v0.1.0
+[1.0.0]: https://github.com/ErnaneJ/glancer/releases/tag/v1.0.0

data/README.md CHANGED

@@ -36,8 +36,11 @@ Glancer is a **Ruby on Rails engine** that mounts a full chat interface inside y
 → SELECT executed, results shown, answer written in plain language.
 ```
-[![Click to play video](./.github/assets/demo.png)](https://github.com/ErnaneJ/glancer/raw/refs/heads/main/.github/assets/demo.mp4)
-> Click to see demo. ☝
+<p align="center">
+  <a href="https://github.com/ErnaneJ/glancer/raw/refs/heads/main/.github/assets/demo.mp4">
+    <img src="./.github/assets/demo.gif" alt="DEMO">
+  </a>
+</p>
 ## Why Glancer?
@@ -64,6 +67,16 @@ Glancer removes that friction. It gives your app a persistent, context-aware dat
 Glancer is built on top of [**ruby_llm**](https://github.com/crmne/ruby_llm), a provider-agnostic LLM client for Ruby. All LLM calls (query generation, humanized responses, embeddings, and optional question enrichment) go through ruby_llm, so any model it supports works with Glancer.
+### Supported models by provider
+| Provider | Chat / Code models | Embedding models | Model list |
+|---|---|---|---|
+| **Gemini** | `gemini-2.0-flash`, `gemini-2.5-pro`, `gemini-1.5-pro`, … | `text-embedding-004` | [ai.google.dev/gemini-api/docs/models](https://ai.google.dev/gemini-api/docs/models) |
+| **OpenAI** | `gpt-4o`, `gpt-4o-mini`, `o3-mini`, … | `text-embedding-3-large`, `text-embedding-3-small` | [platform.openai.com/docs/models](https://platform.openai.com/docs/models) |
+| **OpenRouter** | Any model available on the platform (e.g. `anthropic/claude-3.5-sonnet`, `deepseek/deepseek-r1:free`) | Not natively supported — pair with `:gemini` or `:openai` for embeddings | [openrouter.ai/models](https://openrouter.ai/models) |
+The full list of models validated by the ruby_llm gem is at [rubyllm.com/available-models](https://rubyllm.com/available-models/). If you need a model not yet in that registry, configure an explicit `embedding_model` or `code_model` string; Glancer passes `assume_model_exists: true` to ruby_llm, so the registry check is bypassed.
 ## Installation
 ### 1. Add to your Gemfile
@@ -164,6 +177,20 @@ Route all queries to a replica to offload your primary database:
 config.read_only_db = ENV["REPLICA_DATABASE_URL"]
 ```
+### Rate limiting
+When using free-tier or low-quota LLM providers (e.g. Gemini Flash free tier), you may hit rate limits. Glancer automatically retries with backoff when it receives a quota-exceeded or rate-limit error from any provider:
+- If the provider returns a **"retry in Xs"** hint in the error message (Gemini does this), that exact delay is used.
+- Otherwise, **exponential backoff** is applied: `llm_retry_delay × 2^(attempt − 1)`.
+```ruby
+config.max_llm_retries = 3   # retries before propagating the error (0 = disable)
+config.llm_retry_delay = 60  # base delay in seconds when no retry hint is provided
+```
+A warning is logged on each retry attempt so you can monitor them in your logs. If all retries are exhausted, the error is surfaced to the user normally.
 ### Full configuration reference
 | Option | Default | Description |
@@ -194,6 +221,8 @@ config.read_only_db = ENV["REPLICA_DATABASE_URL"]
 | `models_documents_weight` | `1.1` | Score boost for model chunks |
 | `history_limit` | `6` | Prior conversation turns included in the LLM prompt |
 | `workflow_cache_ttl` | `5.minutes` | In-memory result cache TTL; `0` to disable |
+| `max_llm_retries` | `3` | Max retries on rate-limit / quota errors; `0` to disable |
+| `llm_retry_delay` | `60` | Base delay in seconds between retries (exponential backoff; API hint takes priority) |
 | `log_verbosity` | `:info` | `:silent`, `:none`, `:info`, or `:debug` |
 | `log_output_path` | `nil` | Log file path; `nil` writes to stdout |
 | `blazer_path` | `nil` | Blazer base path; auto-detected when `blazer` gem is present |
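
As a rough illustration of the backoff rule in the README hunk above, the sketch below lists the wait before each retry when the provider sends no "retry in Xs" hint. The helper name `backoff_schedule` is hypothetical and not part of the gem; it only reproduces the documented formula `llm_retry_delay × 2^(attempt − 1)`.

```ruby
# Hypothetical helper (not part of Glancer): delay in seconds before each retry,
# assuming the provider gives no "retry in Xs" hint.
def backoff_schedule(base_delay:, max_retries:)
  (1..max_retries).map { |attempt| base_delay * (2**(attempt - 1)) }
end

# Documented defaults: llm_retry_delay = 60, max_llm_retries = 3.
schedule = backoff_schedule(base_delay: 60, max_retries: 3)
puts schedule.inspect # one delay per retry
puts schedule.sum     # total wait before the error is propagated
```

With the documented defaults this comes to 60, 120, and 240 seconds, or 420 seconds in total, before the error reaches the user. Lowering `llm_retry_delay` or `max_llm_retries` shortens that window.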

data/lib/generators/glancer/install/templates/glancer.rb CHANGED

@@ -101,7 +101,7 @@ Glancer.configure do |config|
   #   avoid errors. If you must use an OpenRouter embedding model anyway, set
   #   embedding_model explicitly (e.g., 'openai/text-embedding-3-small') —
   #   Glancer will bypass the model registry check automatically.
-  #
+  #   https://rubyllm.com/available-models/
   # Accepted: nil | :gemini | :openai | :openrouter
   config.embedding_provider = nil
@@ -203,6 +203,19 @@ Glancer.configure do |config|
   # shared across Puma workers or restarts).
   config.workflow_cache_ttl = 5.minutes
+  # ─────────────────────────────────────────────────────────────────────────────
+  # Rate limiting
+  # ─────────────────────────────────────────────────────────────────────────────
+  # Maximum number of retries when an LLM or embedding API returns a rate-limit
+  # or quota-exceeded error. Set to 0 to disable automatic retries.
+  config.max_llm_retries = 3
+  # Base delay in seconds between rate-limit retries. When the provider returns a
+  # "retry in Xs" hint (e.g. Gemini), that hint takes priority. Otherwise,
+  # exponential backoff is applied: delay * 2^(attempt - 1).
+  config.llm_retry_delay = 60
   # ─────────────────────────────────────────────────────────────────────────────
   # Logging
   # ─────────────────────────────────────────────────────────────────────────────

data/lib/glancer/configuration.rb CHANGED

@@ -52,6 +52,8 @@ module Glancer
       self.query_enrichment_enabled = false # enrich question with table names before retrieval
       self.enrichment_provider = nil        # nil → falls back to llm_provider
       self.enrichment_model = nil           # nil → falls back to llm_model
+      self.max_llm_retries = 3             # retries on rate-limit errors
+      self.llm_retry_delay = 60            # base delay in seconds (fallback when no retry-after hint)
     end
     # === READERS ===
@@ -66,7 +68,8 @@ module Glancer
                 :code_provider, :code_model,
                 :chat_provider, :chat_model,
                 :blazer_path, :query_mode,
-                :query_enrichment_enabled, :enrichment_provider, :enrichment_model
+                :query_enrichment_enabled, :enrichment_provider, :enrichment_model,
+                :max_llm_retries, :llm_retry_delay
     # === WRITERS ===
     def adapter=(value)
@@ -340,6 +343,18 @@ module Glancer
       @query_mode = value
     end
+    def max_llm_retries=(value)
+      raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0
+      @max_llm_retries = value
+    end
+    def llm_retry_delay=(value)
+      raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?
+      @llm_retry_delay = value
+    end
     # Returns the Blazer base path if Blazer is available, nil otherwise.
     def resolved_blazer_path
       return @blazer_path unless @blazer_path.nil?
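
To see how the two new writers behave in isolation, here is a minimal sketch that copies their validation into a stand-in class. `RetrySettings` is a hypothetical name used only for illustration; the real writers live on `Glancer::Configuration` alongside many other options.

```ruby
# Hypothetical stand-in for Glancer::Configuration, reduced to the two new writers.
class RetrySettings
  attr_reader :max_llm_retries, :llm_retry_delay

  def initialize
    self.max_llm_retries = 3 # same defaults as the diff
    self.llm_retry_delay = 60
  end

  def max_llm_retries=(value)
    raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0

    @max_llm_retries = value
  end

  def llm_retry_delay=(value)
    raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?

    @llm_retry_delay = value
  end
end

settings = RetrySettings.new
settings.max_llm_retries = 0   # accepted: disables retries
settings.llm_retry_delay = 0.5 # accepted: any positive number

# Rejected values raise ArgumentError instead of being stored.
[[:max_llm_retries=, 2.5], [:llm_retry_delay=, 0]].each do |writer, bad_value|
  begin
    settings.public_send(writer, bad_value)
  rescue ArgumentError => e
    puts "#{writer} #{bad_value.inspect}: #{e.message}"
  end
end
```

Note the asymmetry: `0` is valid for `max_llm_retries` (it turns retries off) but invalid for `llm_retry_delay`, which must be strictly positive.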

data/lib/glancer/retriever.rb CHANGED

@@ -16,12 +16,14 @@ module Glancer
         Glancer::Utils::Logger.debug("Retriever",
                                      "Embedding chunk ##{idx + 1} (#{data[:source_type]} - #{data[:source_path]}): '#{preview}...'")
-        vector = RubyLLM.embed(
-          chunk,
-          model: Glancer.configuration.resolved_embedding_model,
-          provider: Glancer.configuration.resolved_embedding_provider,
-          assume_model_exists: true
-        ).vectors
+        vector = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
+          RubyLLM.embed(
+            chunk,
+            model: Glancer.configuration.resolved_embedding_model,
+            provider: Glancer.configuration.resolved_embedding_provider,
+            assume_model_exists: true
+          ).vectors
+        end
         Glancer::Utils::Logger.debug("Retriever",
                                      "Vector size: #{vector.size}, example values: #{vector.first(5).inspect}")
@@ -47,12 +49,14 @@ module Glancer
     def search(query)
       Glancer::Utils::Logger.info("Retriever", "Searching for top #{Glancer.configuration.k} results...")
-      query_embedding = RubyLLM.embed(
-        query,
-        model: Glancer.configuration.resolved_embedding_model,
-        provider: Glancer.configuration.resolved_embedding_provider,
-        assume_model_exists: true
-      ).vectors
+      query_embedding = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
+        RubyLLM.embed(
+          query,
+          model: Glancer.configuration.resolved_embedding_model,
+          provider: Glancer.configuration.resolved_embedding_provider,
+          assume_model_exists: true
+        ).vectors
+      end
       # @TODO Postgres with native search?
       perform_ruby_search(query_embedding)

data/lib/glancer/utils/rate_limit_retry.rb ADDED

@@ -0,0 +1,51 @@
+# frozen_string_literal: true
+module Glancer
+  module Utils
+    module RateLimitRetry
+      RATE_LIMIT_PATTERNS = [
+        /rate.?limit/i,
+        /quota.?exceed/i,
+        /exceeded.?your.?current.?quota/i,
+        /too.?many.?request/i,
+        /resource.?exhausted/i,
+        /\b429\b/
+      ].freeze
+      RETRY_AFTER_PATTERN = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i
+      def self.with_retry(context:, max_retries: nil, base_delay: nil)
+        max_retries ||= Glancer.configuration.max_llm_retries
+        base_delay  ||= Glancer.configuration.llm_retry_delay
+        attempt = 0
+        begin
+          yield
+        rescue StandardError => e
+          raise unless rate_limit_error?(e) && attempt < max_retries
+          attempt += 1
+          delay = parse_retry_after(e.message) || (base_delay * (2**(attempt - 1)))
+          Glancer::Utils::Logger.warn(
+            context,
+            "Rate limit hit (attempt #{attempt}/#{max_retries}). Retrying in #{delay.ceil}s..."
+          )
+          sleep(delay)
+          retry
+        end
+      end
+      def self.rate_limit_error?(error)
+        RATE_LIMIT_PATTERNS.any? { |p| error.message.match?(p) } ||
+          error.class.name.match?(/rate.?limit/i)
+      end
+      private_class_method :rate_limit_error?
+      def self.parse_retry_after(message)
+        m = message.match(RETRY_AFTER_PATTERN)
+        m && m[1].to_f.positive? ? m[1].to_f : nil
+      end
+      private_class_method :parse_retry_after
+    end
+  end
+end
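
The two private helpers in the file above are easy to misread because they work purely on error message text. The sketch below copies the regular expressions from the hunk into two standalone methods so the matching can be tried outside the gem. The method names `rate_limit_message?` and `retry_hint_seconds` are hypothetical; the class-name check in `rate_limit_error?` is left out.

```ruby
# Patterns copied from RATE_LIMIT_PATTERNS in the hunk above; method names are illustrative.
def rate_limit_message?(message)
  patterns = [
    /rate.?limit/i,
    /quota.?exceed/i,
    /exceeded.?your.?current.?quota/i,
    /too.?many.?request/i,
    /resource.?exhausted/i,
    /\b429\b/
  ]
  patterns.any? { |pattern| message.match?(pattern) }
end

# Same regex as RETRY_AFTER_PATTERN: returns the hinted delay in seconds, or nil
# when there is no hint or the hint is zero.
def retry_hint_seconds(message)
  match = message.match(/retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i)
  seconds = match && match[1].to_f
  seconds&.positive? ? seconds : nil
end

["RESOURCE_EXHAUSTED", "HTTP 429 error", "some other error"].each do |message|
  puts "#{message.inspect} rate-limited? #{rate_limit_message?(message)}"
end
puts retry_hint_seconds("quota exceeded. Please retry in 51.63s").inspect
puts retry_hint_seconds("quota exceeded").inspect
```

Because detection relies on message text, an unrelated error whose message happens to contain `429` or "rate limit" would also be retried; that is a property of the approach rather than of this sketch.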

data/lib/glancer/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Glancer
-  VERSION = "1.0.0"
+  VERSION = "1.1.0"
 end

data/lib/glancer/workflow/builder.rb CHANGED

@@ -17,7 +17,9 @@ module Glancer
           assume_model_exists: true
         )
-        response = chat.ask(prompt)
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         Glancer::Utils::Logger.info("Workflow::Builder",
                                     "LLM responded with SQL (length: #{response.content&.length || 0} characters)")
@@ -64,7 +66,9 @@ module Glancer
           assume_model_exists: true
         )
-        response = chat.ask(prompt)
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         # Clean the response to ensure we only have the raw SQL
         Glancer::Workflow::SQLExtractor.extract(response.content)
@@ -84,7 +88,9 @@ module Glancer
           assume_model_exists: true
         )
-        response = chat.ask(prompt)
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         Glancer::Utils::Logger.info("Workflow::Builder",
                                     "LLM responded with AR code (length: #{response.content&.length || 0} chars)")
         response.content
@@ -118,7 +124,9 @@ module Glancer
           assume_model_exists: true
         )
-        response = chat.ask(prompt)
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         Glancer::Workflow::ARExtractor.extract(response.content)
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::Builder", "Failed to fix AR code: #{e.message}")

data/lib/glancer/workflow/llm.rb CHANGED

@@ -39,7 +39,9 @@ module Glancer
         context += "\n\nADDITIONAL INSTRUCTIONS:\n#{custom}" if custom.present?
         chat.with_instructions(context)
-        response = chat.ask(question)
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(question)
+        end
         response.content
       rescue StandardError => e
@@ -72,7 +74,9 @@ module Glancer
           model: Glancer.configuration.resolved_chat_model,
           assume_model_exists: true
         )
-        chat.ask(prompt).content
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content
+        end
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::LLM", "explain_missing_tables failed: #{e.message}")
         "Não consegui encontrar a(s) tabela(s) **#{missing}** no schema indexado. " \
@@ -87,7 +91,9 @@ module Glancer
         )
         prompt = "Generate a concise, descriptive title (max 45 characters, no quotes, no punctuation at end) " \
                  "for a database query session starting with this question: #{question}"
-        chat.ask(prompt).content.strip.truncate(50)
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content.strip.truncate(50)
+        end
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::LLM", "generate_title failed: #{e.message}")
         question.truncate(45)
@@ -116,7 +122,9 @@ module Glancer
           5. Respond in the user's language.
         PROMPT
-        chat.ask(prompt).content
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content
+        end
       end
     end
   end
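
Several of the methods touched above place the retry wrapper inside a method that already has a `rescue StandardError` fallback, so a rate-limit error is retried first and the fallback applies only once retries are exhausted. The sketch below illustrates that ordering under simplifying assumptions: `retry_on_rate_limit` and `title_for` are hypothetical names, the detection is reduced to a single pattern, and the sleeper is injected so nothing actually waits.

```ruby
# Simplified, hypothetical retry helper: retries only when the message looks like
# a rate-limit error; the sleeper is passed in so the example runs instantly.
def retry_on_rate_limit(max_retries:, base_delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue StandardError => e
    raise unless e.message.match?(/rate.?limit/i) && attempt < max_retries

    attempt += 1
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Hypothetical caller in the style of generate_title: retry first, then fall back.
def title_for(question, sleeper:)
  retry_on_rate_limit(max_retries: 2, base_delay: 1, sleeper: sleeper) do
    yield # stands in for the LLM call
  end
rescue StandardError
  question[0, 45] # plain-Ruby stand-in for ActiveSupport's truncate
end

waits = []
title = title_for("How many orders shipped last week?", sleeper: ->(s) { waits << s }) do
  raise "rate limit exceeded"
end
puts title         # falls back to the question text
puts waits.inspect # the delays that would have been slept before giving up
```

One consequence of this layering is that a caller with a fallback now takes longer to reach it when the provider is rate-limiting, since every retry delay elapses first.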

data/lib/glancer/workflow/query_enricher.rb CHANGED

@@ -50,7 +50,9 @@ module Glancer
           assume_model_exists: true
         )
-        enriched = chat.ask(prompt).content.to_s.strip
+        enriched = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::QueryEnricher") do
+          chat.ask(prompt).content.to_s.strip
+        end
         enriched.presence || question
       rescue StandardError => e
         Glancer::Utils::Logger.warn("Workflow::QueryEnricher", "Enrichment failed, using original: #{e.message}")

data/lib/glancer.rb CHANGED

@@ -20,7 +20,8 @@ require "glancer/utils/logger" # Glancer::Utils::Logger
 require "glancer/utils/markdown_helper" # Glancer::Utils::MarkdownHelper
 require "glancer/utils/result_formatter" # Glancer::Utils::ResultFormatter
 require "glancer/utils/table_stats" # Glancer::Utils::TableStats
-require "glancer/utils/transaction" # Glancer::Utils::Transaction
+require "glancer/utils/transaction"        # Glancer::Utils::Transaction
+require "glancer/utils/rate_limit_retry"   # Glancer::Utils::RateLimitRetry
 require "glancer/engine" # Glancer::Engine

data/spec/lib/glancer/configuration_spec.rb CHANGED

@@ -143,6 +143,14 @@ RSpec.describe Glancer::Configuration do
     it "sets read_only_db to nil" do
       expect(config.read_only_db).to be_nil
     end
+    it "sets max_llm_retries to 3" do
+      expect(config.max_llm_retries).to eq(3)
+    end
+    it "sets llm_retry_delay to 60" do
+      expect(config.llm_retry_delay).to eq(60)
+    end
   end
   # ── adapter= ─────────────────────────────────────────────────────────────────
@@ -855,4 +863,56 @@ RSpec.describe Glancer::Configuration do
       expect(config.resolved_enrichment_model).to eq("gemini-2.0-flash")
     end
   end
+  # ── max_llm_retries= ─────────────────────────────────────────────────────────
+  describe "#max_llm_retries=" do
+    it "defaults to 3" do
+      expect(config.max_llm_retries).to eq(3)
+    end
+    it "accepts 0 (disables retries)" do
+      config.max_llm_retries = 0
+      expect(config.max_llm_retries).to eq(0)
+    end
+    it "accepts a positive integer" do
+      config.max_llm_retries = 5
+      expect(config.max_llm_retries).to eq(5)
+    end
+    it "raises ArgumentError for a negative integer" do
+      expect { config.max_llm_retries = -1 }.to raise_error(ArgumentError, /non-negative integer/)
+    end
+    it "raises ArgumentError for a non-integer" do
+      expect { config.max_llm_retries = 2.5 }.to raise_error(ArgumentError, /non-negative integer/)
+    end
+  end
+  # ── llm_retry_delay= ─────────────────────────────────────────────────────────
+  describe "#llm_retry_delay=" do
+    it "defaults to 60" do
+      expect(config.llm_retry_delay).to eq(60)
+    end
+    it "accepts a positive integer" do
+      config.llm_retry_delay = 30
+      expect(config.llm_retry_delay).to eq(30)
+    end
+    it "accepts a positive float" do
+      config.llm_retry_delay = 0.5
+      expect(config.llm_retry_delay).to eq(0.5)
+    end
+    it "raises ArgumentError for zero" do
+      expect { config.llm_retry_delay = 0 }.to raise_error(ArgumentError, /positive number/)
+    end
+    it "raises ArgumentError for a negative number" do
+      expect { config.llm_retry_delay = -5 }.to raise_error(ArgumentError, /positive number/)
+    end
+  end
 end

data/spec/lib/glancer/utils/rate_limit_retry_spec.rb ADDED

@@ -0,0 +1,193 @@
+# frozen_string_literal: true
+require "spec_helper"
+RSpec.describe Glancer::Utils::RateLimitRetry do
+  before do
+    Glancer.configuration.max_llm_retries = 3
+    Glancer.configuration.llm_retry_delay = 60
+    allow(described_class).to receive(:sleep)
+    allow(Glancer::Utils::Logger).to receive(:warn)
+  end
+  describe ".with_retry" do
+    context "when the block succeeds on the first attempt" do
+      it "returns the block result without retrying" do
+        result = described_class.with_retry(context: "Test") { 42 }
+        expect(result).to eq(42)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+    context "when the block raises a non-rate-limit error" do
+      it "re-raises immediately without retrying" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test") do
+            calls += 1
+            raise StandardError, "some other error"
+          end
+        end.to raise_error(StandardError, "some other error")
+        expect(calls).to eq(1)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+    context "when the block raises a rate-limit error" do
+      it "retries up to max_retries times then re-raises" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 2) do
+            calls += 1
+            raise StandardError, "You exceeded your current quota"
+          end
+        end.to raise_error(StandardError, "You exceeded your current quota")
+        expect(calls).to eq(3) # 1 initial + 2 retries
+      end
+      it "logs a warning on each retry" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            raise StandardError, "rate limit exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(Glancer::Utils::Logger).to have_received(:warn).with("Test", %r{Rate limit hit \(attempt 1/1\)})
+      end
+      it "sleeps for the base delay when no retry-after hint is present" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1, base_delay: 10) do
+            raise StandardError, "quota exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(10)
+      end
+      it "sleeps for the hint delay when a retry-after hint is present" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1, base_delay: 60) do
+            raise StandardError, "quota exceeded. Please retry in 51.63s"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(51.63)
+      end
+      it "uses exponential backoff when no hint is present" do
+        sleep_calls = []
+        allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 3, base_delay: 10) do
+            raise StandardError, "resource exhausted"
+          end
+        end.to raise_error(StandardError)
+        # attempt 1 → 10 * 2^0 = 10, attempt 2 → 10 * 2^1 = 20, attempt 3 → 10 * 2^2 = 40
+        expect(sleep_calls).to eq([10, 20, 40])
+      end
+      it "succeeds if a later attempt does not raise" do
+        calls = 0
+        result = described_class.with_retry(context: "Test", max_retries: 2) do
+          calls += 1
+          raise StandardError, "too many requests" if calls < 3
+          "ok"
+        end
+        expect(result).to eq("ok")
+        expect(calls).to eq(3)
+      end
+      it "reads max_retries from configuration when not supplied" do
+        Glancer.configuration.max_llm_retries = 1
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test") do
+            calls += 1
+            raise StandardError, "quota exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(calls).to eq(2) # 1 initial + 1 from config
+      end
+      it "reads base_delay from configuration when not supplied" do
+        Glancer.configuration.llm_retry_delay = 5
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            raise StandardError, "resource exhausted"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(5)
+      end
+      it "does not retry when max_retries is 0" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 0) do
+            calls += 1
+            raise StandardError, "rate limit"
+          end
+        end.to raise_error(StandardError)
+        expect(calls).to eq(1)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+    context "rate limit error detection" do
+      {
+        "rate limit" => /rate.?limit/i,
+        "quota exceeded" => /quota.?exceed/i,
+        "You exceeded your current quota" => /exceeded.?your.?current.?quota/i,
+        "Too Many Requests" => /too.?many.?request/i,
+        "RESOURCE_EXHAUSTED" => /resource.?exhausted/i,
+        "HTTP 429 error" => /\b429\b/
+      }.each_key do |message|
+        it "detects '#{message}' as a rate-limit error" do
+          calls = 0
+          expect do
+            described_class.with_retry(context: "Test", max_retries: 1) do
+              calls += 1
+              raise StandardError, message
+            end
+          end.to raise_error(StandardError)
+          expect(calls).to eq(2) # retried once
+        end
+      end
+      it "detects errors whose class name contains 'rate_limit'" do
+        klass = Class.new(StandardError) { def self.name = "SomeRateLimitError" }
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            calls += 1
+            raise klass, "any message"
+          end
+        end.to raise_error(klass)
+        expect(calls).to eq(2)
+      end
+    end
+    context "retry-after hint parsing" do
+      [
+        ["Please retry in 51.632812448s", 51.632812448],
+        ["retry in 30s", 30.0],
+        ["RETRY IN 120.5s now", 120.5],
+        ["retryIn 0s — ignored", nil],
+        ["no hint here", nil]
+      ].each do |message, expected|
+        it "parses #{expected.inspect} from '#{message}'" do
+          sleep_calls = []
+          allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
+          expect do
+            described_class.with_retry(context: "Test", max_retries: 1, base_delay: 99) do
+              raise StandardError, "quota exceeded. #{message}"
+            end
+          end.to raise_error(StandardError)
+          expected_delay = expected || 99
+          expect(sleep_calls.first).to eq(expected_delay)
+        end
+      end
+    end
+  end
+end
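
The spec stubs `sleep` on the module itself (`allow(described_class).to receive(:sleep)`). That works because a bare `sleep` call inside a module-level method is dispatched on the module object, so a singleton method named `sleep` shadows `Kernel#sleep`. The sketch below shows the same mechanism without RSpec; `Waiter` and `record_sleeps_on` are hypothetical names.

```ruby
# Hypothetical module shaped like a helper that sleeps between attempts.
module Waiter
  def self.pause(seconds)
    sleep(seconds) # implicit receiver: looked up on Waiter's singleton class first
  end
end

# Replaces sleep on the target with a recorder and returns the list of recorded delays.
def record_sleeps_on(target)
  recorded = []
  target.define_singleton_method(:sleep) { |seconds| recorded << seconds }
  recorded
end

log = record_sleeps_on(Waiter)
Waiter.pause(10)
Waiter.pause(20)
puts log.inspect # delays requested, with no real waiting
```

The same reasoning explains why the spec can assert on exact delays such as `[10, 20, 40]` without the suite taking over a minute to run.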

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: glancer
 version: !ruby/object:Gem::Version
-  version: 1.0.0
+  version: 1.1.0
 platform: ruby
 authors:
 - Ernane Ferreira
@@ -169,6 +169,7 @@ files:
 - lib/glancer/retriever.rb
 - lib/glancer/utils/logger.rb
 - lib/glancer/utils/markdown_helper.rb
+- lib/glancer/utils/rate_limit_retry.rb
 - lib/glancer/utils/result_formatter.rb
 - lib/glancer/utils/table_stats.rb
 - lib/glancer/utils/transaction.rb
@@ -201,6 +202,7 @@ files:
 - spec/lib/glancer/retriever_spec.rb
 - spec/lib/glancer/utils/logger_spec.rb
 - spec/lib/glancer/utils/markdown_helper_spec.rb
+- spec/lib/glancer/utils/rate_limit_retry_spec.rb
 - spec/lib/glancer/utils/result_formatter_spec.rb
 - spec/lib/glancer/utils/table_stats_spec.rb
 - spec/lib/glancer/utils/transaction_spec.rb