glancer 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d4248cad39bdfcfd7cb3907a83c81170f49bf15b76c0af226d6333622479dd72
- data.tar.gz: cd8b5588e4032750fa0f5be8eae5de5ded28a713a1223b948d4fd94b258ce012
+ metadata.gz: '0780756d0563d270e2669590d7fa195506c67eb9224abc24382356f54682d9cf'
+ data.tar.gz: b2035a81f75a73e2dd9a4ceaa0e4af1d084867c3c6ecf0554e97acc05d9f7df1
  SHA512:
- metadata.gz: 4cb913d4829f562ceaa8ebc9ff808c22037b540c0ecf75991cfb6f785434fc0fd8588d73d6a65e54a09f96d002a7e0359bc0e8884ba112a75fac215832b32340
- data.tar.gz: 4b5f94c281b5161e83bbbbf48296d321344302ee53811bf0b8b4184cce385f831552fca3827ee26dae34897dd501ff8de74ad33f6714e8d7e3ed1b6c135779d7
+ metadata.gz: 5c2fb146ee00751b91613a2bbc818cadfd44dff3c0f8fa4c42907516a02baeda4f281a273dfd8fba62f793ce16ed4d2ed2cebc3545026c282e4f78c9f6d5e2ef
+ data.tar.gz: cbca9b21bff036fd309a5f6fab87cea4e27ef83224f8f91bb0be618146bf611f156cc9e624940164e74e75e87586a52c2533f26d27e87f5bd725e075d91de1bd
data/CHANGELOG.md CHANGED
@@ -7,82 +7,66 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
- ## [1.0.0] — 2026-05-24
10
+ ## [1.1.0] — 2026-06-05
11
11
 
12
12
  ### Added
13
13
 
14
- - **SQL editing**: users can edit the generated SQL directly in the chat UI before
15
- executing it. Edits are persisted with a `user_edited_sql` flag and surfaced with
16
- a visible badge.
17
- - **Pipeline status labels**: animated step-by-step labels (embedding → retrieval →
18
- SQL generation → validation → execution → response) while the pipeline is running.
19
- - **Accordion results**: running a new query collapses the previous results panel;
20
- panels can be toggled independently.
21
- - **Copy buttons**: one-click copy for both the raw SQL and the full assistant response.
22
- - **Large-result alert**: a warning banner when a query returns ≥ 500 rows or has no
23
- `LIMIT` clause.
24
- - **Audio input**: microphone button powered by the Web Speech API; transcribed text
25
- is appended to the question field.
26
- - **Desktop sidebar toggle**: chevron button to collapse/expand the chat list on large
27
- screens; state is persisted in `localStorage`.
28
- - **Blazer integration**: "Open in Blazer" button auto-detected when the `blazer` gem
29
- is present; configurable via `config.blazer_path`.
30
- - **`:silent` log verbosity level**: suppresses all log output including `warn` and
31
- `error` — intended for test environments.
32
- - **100 % test coverage**: 552 RSpec examples covering every workflow path, edge case,
33
- and rescue branch.
34
- - **CI badge**: automatically generated coverage SVG committed to the `badge-generator`
35
- branch on every push to `main`.
36
-
37
- ### Changed
38
-
39
- - LLM humanization prompt rewritten to never describe the query as "executed" — it now
40
- explains the logic and why it answers the question.
41
- - Improved self-correction: the executor retries up to 3 times, passing the database
42
- error back to the LLM on each attempt.
43
- - Immediate user-message rendering: the user's bubble appears in the chat instantly
44
- (before the server responds) via a temporary DOM node that is replaced by the
45
- Turbo Stream response.
14
+ - **Rate-limit retry with backoff**: all LLM and embedding API calls now automatically
15
+ retry when a rate-limit / quota-exceeded error is received. If the provider returns a
16
+ "retry in Xs" hint (e.g. Gemini Flash), that exact delay is honoured; otherwise
17
+ exponential backoff is applied (`llm_retry_delay * 2^(attempt - 1)`). Resolves [#1].
18
+ - **`max_llm_retries` config option** (default: `3`): maximum number of retry attempts
19
+ before propagating the error to the user. Set to `0` to disable automatic retries.
20
+ - **`llm_retry_delay` config option** (default: `60` seconds): base delay in seconds
21
+ used when the provider does not supply a retry-after hint.
46
22
 
47
- ### Fixed
48
-
49
- - Info panel never showed content because the controller was passing `message_for_info:`
50
- but the partial expected `message_info:`.
23
+ ## [1.0.0] — 2026-05-24
51
24
 
52
- ## [0.1.0] — 2026-05-14
25
+ First public release.
53
26
 
54
27
  ### Added
55
28
 
56
- - **RAG pipeline**: embed → retrieve → generate SQL → execute → humanize, with automatic
57
- retry (up to 3 attempts) on SQL errors using LLM self-correction.
29
+ - **RAG pipeline**: embed → retrieve → generate code → validate → execute → humanize,
30
+ with automatic retry (up to 3 attempts) on errors using LLM self-correction.
31
+ - **Dual query mode**: `query_mode: :sql` (default) generates read-only SQL;
32
+ `query_mode: :activerecord` generates and evaluates Ruby/ActiveRecord expressions.
33
+ Each mode has its own sanitizer (`SQLSanitizer` / `ARSanitizer`), extractor, prompt
34
+ builder, and executor.
58
35
  - **Multi-provider LLM support** via [ruby_llm](https://github.com/crmne/ruby_llm):
59
- Gemini, OpenAI, and OpenRouter. Each role (SQL generation, chat responses, embeddings)
60
- can use a different provider and model.
36
+ Gemini, OpenAI, and OpenRouter. Code generation, chat responses, and embeddings can
37
+ each use a different provider and model.
38
+ - **Async message processing** via `Glancer::AsyncRunner`: messages are processed in a
39
+ background thread using `connection_pool.with_connection` — no external job queue
40
+ (Sidekiq, GoodJob, etc.) required.
41
+ - **Client-side polling**: the UI polls `/messages/:id/poll` every 2 s and replaces the
42
+ message partial via Turbo Stream once done; a 5-minute hard timeout marks stuck
43
+ messages as failed automatically.
44
+ - **Query enrichment**: `QueryEnricher` translates natural-language questions into dense
45
+ technical specifications before retrieval, improving code accuracy.
61
46
  - **Indexers** for `db/schema.rb`, `app/models/**/*.rb`, and a custom Markdown context
62
47
  file. Rake tasks: `glancer:index:all`, `glancer:index:schema`, `glancer:index:models`,
63
48
  `glancer:index:context`.
64
49
  - **Cosine similarity retrieval** with per-source-type relevance weights (schema 1.3×,
65
- context 1.2×, models 1.1×) and a configurable minimum score threshold.
66
- - **Chunk overlap** to prevent context loss at document boundaries.
50
+ context 1.2×, models 1.1×), configurable minimum score threshold, and fallback to
51
+ top-k results when no embedding meets the threshold.
67
52
  - **SQL safety layer**: `SQLSanitizer` (blocks destructive statements),
68
- `SQLValidator` (verifies table references against indexed schema), and
69
- mandatory read-only transaction with automatic rollback.
53
+ `SQLValidator` (verifies table references against indexed schema), and mandatory
54
+ read-only transaction with automatic rollback.
70
55
  - **Audit trail**: every executed query is stored in `glancer_audits` with a unique
71
- `run_id` UUID injected as an SQL comment (`/*glancer,run_id:UUID*/`).
56
+ `run_id` UUID injected as a comment (`/*glancer,run_id:UUID*/`).
72
57
  - **In-memory response cache** (`workflow_cache_ttl`) to avoid redundant LLM calls for
73
58
  repeated identical questions.
74
59
  - **Chat UI**: Stimulus + Turbo Streams interface with dark mode, typewriter effect,
75
- CSV export, SQL re-run, pipeline status labels, accordion results, copy-to-clipboard,
76
- and audio input (Web Speech API).
60
+ CSV export, SQL/AR re-run, chart visualizations (bar, line, pie), client-side polling,
61
+ pipeline status labels, accordion results, and copy-to-clipboard.
77
62
  - **Settings page** at `/glancer/settings` for runtime custom instructions.
78
63
  - **Schema viewer** at `/glancer/db-schema` showing indexed tables and columns.
79
64
  - **Install generator**: `rails generate glancer:install` scaffolds the initializer,
80
65
  context file, and mounts the engine.
81
- - Configurable `statement_timeout` enforced via adapter-native mechanisms
82
- (PostgreSQL `SET statement_timeout`, MySQL `SET max_execution_time`).
83
- - `config.history_limit` to control how many prior turns are included in the prompt.
84
- - `config.read_only_db` to route queries to a replica connection string.
66
+ - Configurable `statement_timeout`, `history_limit`, `read_only_db`, `k`, `min_score`,
67
+ and per-source document weights.
68
+ - **100% line coverage**: 717 RSpec examples covering every workflow path, edge case,
69
+ and rescue branch.
85
70
 
86
71
  [Unreleased]: https://github.com/ErnaneJ/glancer/compare/v1.0.0...HEAD
87
- [1.0.0]: https://github.com/ErnaneJ/glancer/compare/v0.1.0...v1.0.0
88
- [0.1.0]: https://github.com/ErnaneJ/glancer/releases/tag/v0.1.0
72
+ [1.0.0]: https://github.com/ErnaneJ/glancer/releases/tag/v1.0.0
data/README.md CHANGED
@@ -36,8 +36,11 @@ Glancer is a **Ruby on Rails engine** that mounts a full chat interface inside y
36
36
  → SELECT executed, results shown, answer written in plain language.
37
37
  ```
38
38
 
39
- [![Click to play video](./.github/assets/demo.png)](https://github.com/ErnaneJ/glancer/raw/refs/heads/main/.github/assets/demo.mp4)
40
- > Click to see demo.
39
+ <p align="center">
40
+ <a href="https://github.com/ErnaneJ/glancer/raw/refs/heads/main/.github/assets/demo.mp4">
41
+ <img src="./.github/assets/demo.gif" alt="DEMO">
42
+ </a>
43
+ </p>
41
44
 
42
45
  ## Why Glancer?
43
46
 
@@ -64,6 +67,16 @@ Glancer removes that friction. It gives your app a persistent, context-aware dat
64
67
 
65
68
  Glancer is built on top of [**ruby_llm**](https://github.com/crmne/ruby_llm), a provider-agnostic LLM client for Ruby. All LLM calls (query generation, humanized responses, embeddings, and optional question enrichment) go through ruby_llm, so any model it supports works with Glancer.
66
69
 
70
+ ### Supported models by provider
71
+
72
+ | Provider | Chat / Code models | Embedding models | Model list |
73
+ |---|---|---|---|
74
+ | **Gemini** | `gemini-2.0-flash`, `gemini-2.5-pro`, `gemini-1.5-pro`, … | `text-embedding-004` | [ai.google.dev/gemini-api/docs/models](https://ai.google.dev/gemini-api/docs/models) |
75
+ | **OpenAI** | `gpt-4o`, `gpt-4o-mini`, `o3-mini`, … | `text-embedding-3-large`, `text-embedding-3-small` | [platform.openai.com/docs/models](https://platform.openai.com/docs/models) |
76
+ | **OpenRouter** | Any model available on the platform (e.g. `anthropic/claude-3.5-sonnet`, `deepseek/deepseek-r1:free`) | Not natively supported — pair with `:gemini` or `:openai` for embeddings | [openrouter.ai/models](https://openrouter.ai/models) |
77
+
78
+ The full list of models validated by the ruby_llm gem is at [rubyllm.com/available-models](https://rubyllm.com/available-models/). If you need a model that is not yet in that registry, configure an explicit `embedding_model` or `code_model` string; Glancer then bypasses the registry check by passing `assume_model_exists: true`.
79
+
67
80
  ## Installation
68
81
 
69
82
  ### 1. Add to your Gemfile
@@ -164,6 +177,20 @@ Route all queries to a replica to offload your primary database:
164
177
  config.read_only_db = ENV["REPLICA_DATABASE_URL"]
165
178
  ```
166
179
 
180
+ ### Rate limiting
181
+
182
+ When using free-tier or low-quota LLM providers (e.g. Gemini Flash free tier), you may hit rate limits. Glancer automatically retries with backoff when it receives a quota-exceeded or rate-limit error from any provider:
183
+
184
+ - If the provider returns a **"retry in Xs"** hint in the error message (Gemini does this), that exact delay is used.
185
+ - Otherwise, **exponential backoff** is applied: `llm_retry_delay × 2^(attempt − 1)`.
186
+
187
+ ```ruby
188
+ config.max_llm_retries = 3 # retries before propagating the error (0 = disable)
189
+ config.llm_retry_delay = 60 # base delay in seconds when no retry hint is provided
190
+ ```
191
+
192
+ A warning is logged on each retry attempt so you can monitor them in your logs. If all retries are exhausted, the error is surfaced to the user normally.
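The fallback schedule described in this README section can be worked out by hand. The sketch below is only an illustration of that arithmetic under the default settings, not code from the gem; `backoff_schedule` is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of Glancer: lists the fallback delays
# (in seconds) that apply when the provider sends no "retry in Xs" hint.
def backoff_schedule(base_delay:, max_retries:)
  (1..max_retries).map { |attempt| base_delay * (2**(attempt - 1)) }
end

schedule = backoff_schedule(base_delay: 60, max_retries: 3)
puts schedule.inspect # [60, 120, 240]
puts schedule.sum     # 420
```

With the defaults, a call that keeps hitting the rate limit would therefore wait roughly seven minutes in total before the error is surfaced, which is worth keeping in mind next to the 5-minute polling timeout mentioned in the changelog.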
193
+
167
194
  ### Full configuration reference
168
195
 
169
196
  | Option | Default | Description |
@@ -194,6 +221,8 @@ config.read_only_db = ENV["REPLICA_DATABASE_URL"]
194
221
  | `models_documents_weight` | `1.1` | Score boost for model chunks |
195
222
  | `history_limit` | `6` | Prior conversation turns included in the LLM prompt |
196
223
  | `workflow_cache_ttl` | `5.minutes` | In-memory result cache TTL; `0` to disable |
224
+ | `max_llm_retries` | `3` | Max retries on rate-limit / quota errors; `0` to disable |
225
+ | `llm_retry_delay` | `60` | Base delay in seconds between retries (exponential backoff; API hint takes priority) |
197
226
  | `log_verbosity` | `:info` | `:silent`, `:none`, `:info`, or `:debug` |
198
227
  | `log_output_path` | `nil` | Log file path; `nil` writes to stdout |
199
228
  | `blazer_path` | `nil` | Blazer base path; auto-detected when `blazer` gem is present |
@@ -101,7 +101,7 @@ Glancer.configure do |config|
101
101
  # avoid errors. If you must use an OpenRouter embedding model anyway, set
102
102
  # embedding_model explicitly (e.g., 'openai/text-embedding-3-small') —
103
103
  # Glancer will bypass the model registry check automatically.
104
- #
104
+ # https://rubyllm.com/available-models/
105
105
  # Accepted: nil | :gemini | :openai | :openrouter
106
106
  config.embedding_provider = nil
107
107
 
@@ -203,6 +203,19 @@ Glancer.configure do |config|
203
203
  # shared across Puma workers or restarts).
204
204
  config.workflow_cache_ttl = 5.minutes
205
205
 
206
+ # ─────────────────────────────────────────────────────────────────────────────
207
+ # Rate limiting
208
+ # ─────────────────────────────────────────────────────────────────────────────
209
+
210
+ # Maximum number of retries when an LLM or embedding API returns a rate-limit
211
+ # or quota-exceeded error. Set to 0 to disable automatic retries.
212
+ config.max_llm_retries = 3
213
+
214
+ # Base delay in seconds between rate-limit retries. When the provider returns a
215
+ # "retry in Xs" hint (e.g. Gemini), that hint takes priority. Otherwise,
216
+ # exponential backoff is applied: delay * 2^(attempt - 1).
217
+ config.llm_retry_delay = 60
218
+
206
219
  # ─────────────────────────────────────────────────────────────────────────────
207
220
  # Logging
208
221
  # ─────────────────────────────────────────────────────────────────────────────
@@ -52,6 +52,8 @@ module Glancer
52
52
  self.query_enrichment_enabled = false # enrich question with table names before retrieval
53
53
  self.enrichment_provider = nil # nil → falls back to llm_provider
54
54
  self.enrichment_model = nil # nil → falls back to llm_model
55
+ self.max_llm_retries = 3 # retries on rate-limit errors
56
+ self.llm_retry_delay = 60 # base delay in seconds (fallback when no retry-after hint)
55
57
  end
56
58
 
57
59
  # === READERS ===
@@ -66,7 +68,8 @@ module Glancer
66
68
  :code_provider, :code_model,
67
69
  :chat_provider, :chat_model,
68
70
  :blazer_path, :query_mode,
69
- :query_enrichment_enabled, :enrichment_provider, :enrichment_model
71
+ :query_enrichment_enabled, :enrichment_provider, :enrichment_model,
72
+ :max_llm_retries, :llm_retry_delay
70
73
 
71
74
  # === WRITERS ===
72
75
  def adapter=(value)
@@ -340,6 +343,18 @@ module Glancer
340
343
  @query_mode = value
341
344
  end
342
345
 
346
+ def max_llm_retries=(value)
347
+ raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0
348
+
349
+ @max_llm_retries = value
350
+ end
351
+
352
+ def llm_retry_delay=(value)
353
+ raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?
354
+
355
+ @llm_retry_delay = value
356
+ end
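To see what the two new writers accept and reject, here is a minimal standalone sketch. `RetrySettings` is a hypothetical stand-in for `Glancer::Configuration`, reduced to these two options; the guard conditions are copied from the diff.

```ruby
# Hypothetical stand-in for Glancer::Configuration (illustration only).
class RetrySettings
  attr_reader :max_llm_retries, :llm_retry_delay

  def initialize
    self.max_llm_retries = 3
    self.llm_retry_delay = 60
  end

  def max_llm_retries=(value)
    raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0

    @max_llm_retries = value
  end

  def llm_retry_delay=(value)
    raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?

    @llm_retry_delay = value
  end
end

settings = RetrySettings.new
settings.max_llm_retries = 0   # allowed: disables retries
settings.llm_retry_delay = 0.5 # allowed: any positive Integer or Float

begin
  settings.llm_retry_delay = 0 # rejected: zero is not positive
rescue ArgumentError => e
  puts e.message
end
```

Note the asymmetry: `0` is valid for `max_llm_retries` but not for `llm_retry_delay`, and a float such as `2.5` is valid for the delay but not for the retry count.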
357
+
343
358
  # Returns the Blazer base path if Blazer is available, nil otherwise.
344
359
  def resolved_blazer_path
345
360
  return @blazer_path unless @blazer_path.nil?
@@ -16,12 +16,14 @@ module Glancer
16
16
  Glancer::Utils::Logger.debug("Retriever",
17
17
  "Embedding chunk ##{idx + 1} (#{data[:source_type]} - #{data[:source_path]}): '#{preview}...'")
18
18
 
19
- vector = RubyLLM.embed(
20
- chunk,
21
- model: Glancer.configuration.resolved_embedding_model,
22
- provider: Glancer.configuration.resolved_embedding_provider,
23
- assume_model_exists: true
24
- ).vectors
19
+ vector = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
20
+ RubyLLM.embed(
21
+ chunk,
22
+ model: Glancer.configuration.resolved_embedding_model,
23
+ provider: Glancer.configuration.resolved_embedding_provider,
24
+ assume_model_exists: true
25
+ ).vectors
26
+ end
25
27
 
26
28
  Glancer::Utils::Logger.debug("Retriever",
27
29
  "Vector size: #{vector.size}, example values: #{vector.first(5).inspect}")
@@ -47,12 +49,14 @@ module Glancer
47
49
  def search(query)
48
50
  Glancer::Utils::Logger.info("Retriever", "Searching for top #{Glancer.configuration.k} results...")
49
51
 
50
- query_embedding = RubyLLM.embed(
51
- query,
52
- model: Glancer.configuration.resolved_embedding_model,
53
- provider: Glancer.configuration.resolved_embedding_provider,
54
- assume_model_exists: true
55
- ).vectors
52
+ query_embedding = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
53
+ RubyLLM.embed(
54
+ query,
55
+ model: Glancer.configuration.resolved_embedding_model,
56
+ provider: Glancer.configuration.resolved_embedding_provider,
57
+ assume_model_exists: true
58
+ ).vectors
59
+ end
56
60
 
57
61
  # @TODO Postgres with native search?
58
62
  perform_ruby_search(query_embedding)
@@ -0,0 +1,51 @@
+ # frozen_string_literal: true
+
+ module Glancer
+   module Utils
+     module RateLimitRetry
+       RATE_LIMIT_PATTERNS = [
+         /rate.?limit/i,
+         /quota.?exceed/i,
+         /exceeded.?your.?current.?quota/i,
+         /too.?many.?request/i,
+         /resource.?exhausted/i,
+         /\b429\b/
+       ].freeze
+
+       RETRY_AFTER_PATTERN = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i
+
+       def self.with_retry(context:, max_retries: nil, base_delay: nil)
+         max_retries ||= Glancer.configuration.max_llm_retries
+         base_delay ||= Glancer.configuration.llm_retry_delay
+         attempt = 0
+
+         begin
+           yield
+         rescue StandardError => e
+           raise unless rate_limit_error?(e) && attempt < max_retries
+
+           attempt += 1
+           delay = parse_retry_after(e.message) || (base_delay * (2**(attempt - 1)))
+           Glancer::Utils::Logger.warn(
+             context,
+             "Rate limit hit (attempt #{attempt}/#{max_retries}). Retrying in #{delay.ceil}s..."
+           )
+           sleep(delay)
+           retry
+         end
+       end
+
+       def self.rate_limit_error?(error)
+         RATE_LIMIT_PATTERNS.any? { |p| error.message.match?(p) } ||
+           error.class.name.match?(/rate.?limit/i)
+       end
+       private_class_method :rate_limit_error?
+
+       def self.parse_retry_after(message)
+         m = message.match(RETRY_AFTER_PATTERN)
+         m && m[1].to_f.positive? ? m[1].to_f : nil
+       end
+       private_class_method :parse_retry_after
+     end
+   end
+ end
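The control flow of the new helper can be tried outside a Rails app with a reduced stand-in. The sketch below is an assumption-laden simplification, not the gem's code: `SketchRetry` and the `sleeper:` keyword are hypothetical names, the pattern list is shortened, and the class-name check and logging are left out. The injectable `sleeper` lets the example run without actually waiting.

```ruby
# Simplified stand-in for the helper in this hunk (illustration only).
module SketchRetry
  RATE_LIMIT = /rate.?limit|quota.?exceed|too.?many.?request|resource.?exhausted|\b429\b/i
  RETRY_HINT = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i

  def self.with_retry(max_retries:, base_delay:, sleeper: ->(seconds) { sleep(seconds) })
    attempt = 0
    begin
      yield
    rescue StandardError => e
      # Non-rate-limit errors, or an exhausted retry budget, propagate unchanged.
      raise unless e.message.match?(RATE_LIMIT) && attempt < max_retries

      attempt += 1
      hint = e.message[RETRY_HINT, 1]&.to_f
      # A positive provider hint wins; otherwise fall back to exponential backoff.
      delay = hint&.positive? ? hint : base_delay * (2**(attempt - 1))
      sleeper.call(delay)
      retry
    end
  end
end

waits = []
calls = 0
result = SketchRetry.with_retry(max_retries: 3, base_delay: 10, sleeper: ->(s) { waits << s }) do
  calls += 1
  raise "quota exceeded. Please retry in 2.5s" if calls == 1
  raise "HTTP 429 Too Many Requests" if calls == 2

  "ok"
end

puts result        # ok
puts waits.inspect # [2.5, 20]
```

The second wait is 20 rather than 10 because the attempt counter keeps advancing even when an earlier retry used the provider's hint instead of the backoff formula.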
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Glancer
4
- VERSION = "1.0.0"
4
+ VERSION = "1.1.0"
5
5
  end
@@ -17,7 +17,9 @@ module Glancer
17
17
  assume_model_exists: true
18
18
  )
19
19
 
20
- response = chat.ask(prompt)
20
+ response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
21
+ chat.ask(prompt)
22
+ end
21
23
 
22
24
  Glancer::Utils::Logger.info("Workflow::Builder",
23
25
  "LLM responded with SQL (length: #{response.content&.length || 0} characters)")
@@ -64,7 +66,9 @@ module Glancer
64
66
  assume_model_exists: true
65
67
  )
66
68
 
67
- response = chat.ask(prompt)
69
+ response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
70
+ chat.ask(prompt)
71
+ end
68
72
 
69
73
  # Clean the response to ensure we only have the raw SQL
70
74
  Glancer::Workflow::SQLExtractor.extract(response.content)
@@ -84,7 +88,9 @@ module Glancer
84
88
  assume_model_exists: true
85
89
  )
86
90
 
87
- response = chat.ask(prompt)
91
+ response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
92
+ chat.ask(prompt)
93
+ end
88
94
  Glancer::Utils::Logger.info("Workflow::Builder",
89
95
  "LLM responded with AR code (length: #{response.content&.length || 0} chars)")
90
96
  response.content
@@ -118,7 +124,9 @@ module Glancer
118
124
  assume_model_exists: true
119
125
  )
120
126
 
121
- response = chat.ask(prompt)
127
+ response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
128
+ chat.ask(prompt)
129
+ end
122
130
  Glancer::Workflow::ARExtractor.extract(response.content)
123
131
  rescue StandardError => e
124
132
  Glancer::Utils::Logger.error("Workflow::Builder", "Failed to fix AR code: #{e.message}")
@@ -39,7 +39,9 @@ module Glancer
39
39
  context += "\n\nADDITIONAL INSTRUCTIONS:\n#{custom}" if custom.present?
40
40
 
41
41
  chat.with_instructions(context)
42
- response = chat.ask(question)
42
+ response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
43
+ chat.ask(question)
44
+ end
43
45
 
44
46
  response.content
45
47
  rescue StandardError => e
@@ -72,7 +74,9 @@ module Glancer
72
74
  model: Glancer.configuration.resolved_chat_model,
73
75
  assume_model_exists: true
74
76
  )
75
- chat.ask(prompt).content
77
+ Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
78
+ chat.ask(prompt).content
79
+ end
76
80
  rescue StandardError => e
77
81
  Glancer::Utils::Logger.error("Workflow::LLM", "explain_missing_tables failed: #{e.message}")
78
82
  "Não consegui encontrar a(s) tabela(s) **#{missing}** no schema indexado. " \
@@ -87,7 +91,9 @@ module Glancer
87
91
  )
88
92
  prompt = "Generate a concise, descriptive title (max 45 characters, no quotes, no punctuation at end) " \
89
93
  "for a database query session starting with this question: #{question}"
90
- chat.ask(prompt).content.strip.truncate(50)
94
+ Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
95
+ chat.ask(prompt).content.strip.truncate(50)
96
+ end
91
97
  rescue StandardError => e
92
98
  Glancer::Utils::Logger.error("Workflow::LLM", "generate_title failed: #{e.message}")
93
99
  question.truncate(45)
@@ -116,7 +122,9 @@ module Glancer
116
122
  5. Respond in the user's language.
117
123
  PROMPT
118
124
 
119
- chat.ask(prompt).content
125
+ Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
126
+ chat.ask(prompt).content
127
+ end
120
128
  end
121
129
  end
122
130
  end
@@ -50,7 +50,9 @@ module Glancer
50
50
  assume_model_exists: true
51
51
  )
52
52
 
53
- enriched = chat.ask(prompt).content.to_s.strip
53
+ enriched = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::QueryEnricher") do
54
+ chat.ask(prompt).content.to_s.strip
55
+ end
54
56
  enriched.presence || question
55
57
  rescue StandardError => e
56
58
  Glancer::Utils::Logger.warn("Workflow::QueryEnricher", "Enrichment failed, using original: #{e.message}")
data/lib/glancer.rb CHANGED
@@ -20,7 +20,8 @@ require "glancer/utils/logger" # Glancer::Utils::Logger
20
20
  require "glancer/utils/markdown_helper" # Glancer::Utils::MarkdownHelper
21
21
  require "glancer/utils/result_formatter" # Glancer::Utils::ResultFormatter
22
22
  require "glancer/utils/table_stats" # Glancer::Utils::TableStats
23
- require "glancer/utils/transaction" # Glancer::Utils::Transaction
23
+ require "glancer/utils/transaction" # Glancer::Utils::Transaction
24
+ require "glancer/utils/rate_limit_retry" # Glancer::Utils::RateLimitRetry
24
25
 
25
26
  require "glancer/engine" # Glancer::Engine
26
27
 
@@ -143,6 +143,14 @@ RSpec.describe Glancer::Configuration do
143
143
  it "sets read_only_db to nil" do
144
144
  expect(config.read_only_db).to be_nil
145
145
  end
146
+
147
+ it "sets max_llm_retries to 3" do
148
+ expect(config.max_llm_retries).to eq(3)
149
+ end
150
+
151
+ it "sets llm_retry_delay to 60" do
152
+ expect(config.llm_retry_delay).to eq(60)
153
+ end
146
154
  end
147
155
 
148
156
  # ── adapter= ─────────────────────────────────────────────────────────────────
@@ -855,4 +863,56 @@ RSpec.describe Glancer::Configuration do
855
863
  expect(config.resolved_enrichment_model).to eq("gemini-2.0-flash")
856
864
  end
857
865
  end
866
+
867
+ # ── max_llm_retries= ─────────────────────────────────────────────────────────
868
+
869
+ describe "#max_llm_retries=" do
870
+ it "defaults to 3" do
871
+ expect(config.max_llm_retries).to eq(3)
872
+ end
873
+
874
+ it "accepts 0 (disables retries)" do
875
+ config.max_llm_retries = 0
876
+ expect(config.max_llm_retries).to eq(0)
877
+ end
878
+
879
+ it "accepts a positive integer" do
880
+ config.max_llm_retries = 5
881
+ expect(config.max_llm_retries).to eq(5)
882
+ end
883
+
884
+ it "raises ArgumentError for a negative integer" do
885
+ expect { config.max_llm_retries = -1 }.to raise_error(ArgumentError, /non-negative integer/)
886
+ end
887
+
888
+ it "raises ArgumentError for a non-integer" do
889
+ expect { config.max_llm_retries = 2.5 }.to raise_error(ArgumentError, /non-negative integer/)
890
+ end
891
+ end
892
+
893
+ # ── llm_retry_delay= ─────────────────────────────────────────────────────────
894
+
895
+ describe "#llm_retry_delay=" do
896
+ it "defaults to 60" do
897
+ expect(config.llm_retry_delay).to eq(60)
898
+ end
899
+
900
+ it "accepts a positive integer" do
901
+ config.llm_retry_delay = 30
902
+ expect(config.llm_retry_delay).to eq(30)
903
+ end
904
+
905
+ it "accepts a positive float" do
906
+ config.llm_retry_delay = 0.5
907
+ expect(config.llm_retry_delay).to eq(0.5)
908
+ end
909
+
910
+ it "raises ArgumentError for zero" do
911
+ expect { config.llm_retry_delay = 0 }.to raise_error(ArgumentError, /positive number/)
912
+ end
913
+
914
+ it "raises ArgumentError for a negative number" do
915
+ expect { config.llm_retry_delay = -5 }.to raise_error(ArgumentError, /positive number/)
916
+ end
917
+ end
858
918
  end
@@ -0,0 +1,193 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "spec_helper"
4
+
5
+ RSpec.describe Glancer::Utils::RateLimitRetry do
6
+ before do
7
+ Glancer.configuration.max_llm_retries = 3
8
+ Glancer.configuration.llm_retry_delay = 60
9
+ allow(described_class).to receive(:sleep)
10
+ allow(Glancer::Utils::Logger).to receive(:warn)
11
+ end
12
+
13
+ describe ".with_retry" do
14
+ context "when the block succeeds on the first attempt" do
15
+ it "returns the block result without retrying" do
16
+ result = described_class.with_retry(context: "Test") { 42 }
17
+ expect(result).to eq(42)
18
+ expect(described_class).not_to have_received(:sleep)
19
+ end
20
+ end
21
+
22
+ context "when the block raises a non-rate-limit error" do
23
+ it "re-raises immediately without retrying" do
24
+ calls = 0
25
+ expect do
26
+ described_class.with_retry(context: "Test") do
27
+ calls += 1
28
+ raise StandardError, "some other error"
29
+ end
30
+ end.to raise_error(StandardError, "some other error")
31
+ expect(calls).to eq(1)
32
+ expect(described_class).not_to have_received(:sleep)
33
+ end
34
+ end
35
+
36
+ context "when the block raises a rate-limit error" do
37
+ it "retries up to max_retries times then re-raises" do
38
+ calls = 0
39
+ expect do
40
+ described_class.with_retry(context: "Test", max_retries: 2) do
41
+ calls += 1
42
+ raise StandardError, "You exceeded your current quota"
43
+ end
44
+ end.to raise_error(StandardError, "You exceeded your current quota")
45
+ expect(calls).to eq(3) # 1 initial + 2 retries
46
+ end
47
+
48
+ it "logs a warning on each retry" do
49
+ expect do
50
+ described_class.with_retry(context: "Test", max_retries: 1) do
51
+ raise StandardError, "rate limit exceeded"
52
+ end
53
+ end.to raise_error(StandardError)
54
+ expect(Glancer::Utils::Logger).to have_received(:warn).with("Test", %r{Rate limit hit \(attempt 1/1\)})
55
+ end
56
+
57
+ it "sleeps for the base delay when no retry-after hint is present" do
58
+ expect do
59
+ described_class.with_retry(context: "Test", max_retries: 1, base_delay: 10) do
60
+ raise StandardError, "quota exceeded"
61
+ end
62
+ end.to raise_error(StandardError)
63
+ expect(described_class).to have_received(:sleep).with(10)
64
+ end
65
+
66
+ it "sleeps for the hint delay when a retry-after hint is present" do
67
+ expect do
68
+ described_class.with_retry(context: "Test", max_retries: 1, base_delay: 60) do
69
+ raise StandardError, "quota exceeded. Please retry in 51.63s"
70
+ end
71
+ end.to raise_error(StandardError)
72
+ expect(described_class).to have_received(:sleep).with(51.63)
73
+ end
74
+
75
+ it "uses exponential backoff when no hint is present" do
76
+ sleep_calls = []
77
+ allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
78
+
79
+ expect do
80
+ described_class.with_retry(context: "Test", max_retries: 3, base_delay: 10) do
81
+ raise StandardError, "resource exhausted"
82
+ end
83
+ end.to raise_error(StandardError)
84
+
85
+ # attempt 1 → 10 * 2^0 = 10, attempt 2 → 10 * 2^1 = 20, attempt 3 → 10 * 2^2 = 40
86
+ expect(sleep_calls).to eq([10, 20, 40])
87
+ end
88
+
89
+ it "succeeds if a later attempt does not raise" do
90
+ calls = 0
91
+ result = described_class.with_retry(context: "Test", max_retries: 2) do
92
+ calls += 1
93
+ raise StandardError, "too many requests" if calls < 3
94
+
95
+ "ok"
96
+ end
97
+ expect(result).to eq("ok")
98
+ expect(calls).to eq(3)
99
+ end
100
+
101
+ it "reads max_retries from configuration when not supplied" do
102
+ Glancer.configuration.max_llm_retries = 1
103
+ calls = 0
104
+ expect do
105
+ described_class.with_retry(context: "Test") do
106
+ calls += 1
107
+ raise StandardError, "quota exceeded"
108
+ end
109
+ end.to raise_error(StandardError)
110
+ expect(calls).to eq(2) # 1 initial + 1 from config
111
+ end
112
+
113
+ it "reads base_delay from configuration when not supplied" do
114
+ Glancer.configuration.llm_retry_delay = 5
115
+ expect do
116
+ described_class.with_retry(context: "Test", max_retries: 1) do
117
+ raise StandardError, "resource exhausted"
118
+ end
119
+ end.to raise_error(StandardError)
120
+ expect(described_class).to have_received(:sleep).with(5)
121
+ end
122
+
123
+ it "does not retry when max_retries is 0" do
124
+ calls = 0
125
+ expect do
126
+ described_class.with_retry(context: "Test", max_retries: 0) do
127
+ calls += 1
128
+ raise StandardError, "rate limit"
129
+ end
130
+ end.to raise_error(StandardError)
131
+ expect(calls).to eq(1)
132
+ expect(described_class).not_to have_received(:sleep)
133
+ end
134
+ end
135
+
136
+ context "rate limit error detection" do
137
+ {
138
+ "rate limit" => /rate.?limit/i,
139
+ "quota exceeded" => /quota.?exceed/i,
140
+ "You exceeded your current quota" => /exceeded.?your.?current.?quota/i,
141
+ "Too Many Requests" => /too.?many.?request/i,
142
+ "RESOURCE_EXHAUSTED" => /resource.?exhausted/i,
143
+ "HTTP 429 error" => /\b429\b/
144
+ }.each_key do |message|
145
+ it "detects '#{message}' as a rate-limit error" do
146
+ calls = 0
147
+ expect do
148
+ described_class.with_retry(context: "Test", max_retries: 1) do
149
+ calls += 1
150
+ raise StandardError, message
151
+ end
152
+ end.to raise_error(StandardError)
153
+ expect(calls).to eq(2) # retried once
154
+ end
155
+ end
156
+
157
+ it "detects errors whose class name contains 'rate_limit'" do
158
+ klass = Class.new(StandardError) { def self.name = "SomeRateLimitError" }
159
+ calls = 0
160
+ expect do
161
+ described_class.with_retry(context: "Test", max_retries: 1) do
162
+ calls += 1
163
+ raise klass, "any message"
164
+ end
165
+ end.to raise_error(klass)
166
+ expect(calls).to eq(2)
167
+ end
168
+ end
169
+
170
+ context "retry-after hint parsing" do
171
+ [
172
+ ["Please retry in 51.632812448s", 51.632812448],
173
+ ["retry in 30s", 30.0],
174
+ ["RETRY IN 120.5s now", 120.5],
175
+ ["retryIn 0s — ignored", nil],
176
+ ["no hint here", nil]
177
+ ].each do |message, expected|
178
+ it "parses #{expected.inspect} from '#{message}'" do
179
+ sleep_calls = []
180
+ allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
181
+ expect do
182
+ described_class.with_retry(context: "Test", max_retries: 1, base_delay: 99) do
183
+ raise StandardError, "quota exceeded. #{message}"
184
+ end
185
+ end.to raise_error(StandardError)
186
+
187
+ expected_delay = expected || 99
188
+ expect(sleep_calls.first).to eq(expected_delay)
189
+ end
190
+ end
191
+ end
192
+ end
193
+ end
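The hint-parsing cases exercised by the spec above can also be checked in isolation. This sketch reuses the regular expression from `RETRY_AFTER_PATTERN` as it appears in the diff; `parse_hint` is a hypothetical stand-in for the private `parse_retry_after` method, written slightly differently but intended to behave the same way.

```ruby
# Same expression as RETRY_AFTER_PATTERN in the diff.
HINT = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i

# Hypothetical stand-in for the private parse_retry_after method:
# returns the hinted delay in seconds, or nil when there is no usable hint.
def parse_hint(message)
  seconds = message[HINT, 1]&.to_f
  seconds if seconds&.positive?
end

[
  "Please retry in 51.632812448s",
  "RETRY IN 120.5s now",
  "retryIn 0s",
  "no hint here"
].each do |message|
  puts "#{message.inspect} => #{parse_hint(message).inspect}"
end
```

The first two messages yield `51.632812448` and `120.5`. The third matches the pattern but carries a zero delay, so it is treated like the fourth (no hint at all) and the caller falls back to exponential backoff.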
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: glancer
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ernane Ferreira
@@ -169,6 +169,7 @@ files:
169
169
  - lib/glancer/retriever.rb
170
170
  - lib/glancer/utils/logger.rb
171
171
  - lib/glancer/utils/markdown_helper.rb
172
+ - lib/glancer/utils/rate_limit_retry.rb
172
173
  - lib/glancer/utils/result_formatter.rb
173
174
  - lib/glancer/utils/table_stats.rb
174
175
  - lib/glancer/utils/transaction.rb
@@ -201,6 +202,7 @@ files:
201
202
  - spec/lib/glancer/retriever_spec.rb
202
203
  - spec/lib/glancer/utils/logger_spec.rb
203
204
  - spec/lib/glancer/utils/markdown_helper_spec.rb
205
+ - spec/lib/glancer/utils/rate_limit_retry_spec.rb
204
206
  - spec/lib/glancer/utils/result_formatter_spec.rb
205
207
  - spec/lib/glancer/utils/table_stats_spec.rb
206
208
  - spec/lib/glancer/utils/transaction_spec.rb