glancer 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +39 -55
- data/README.md +31 -2
- data/lib/generators/glancer/install/templates/glancer.rb +14 -1
- data/lib/glancer/configuration.rb +16 -1
- data/lib/glancer/retriever.rb +16 -12
- data/lib/glancer/utils/rate_limit_retry.rb +51 -0
- data/lib/glancer/version.rb +1 -1
- data/lib/glancer/workflow/builder.rb +12 -4
- data/lib/glancer/workflow/llm.rb +12 -4
- data/lib/glancer/workflow/query_enricher.rb +3 -1
- data/lib/glancer.rb +2 -1
- data/spec/lib/glancer/configuration_spec.rb +60 -0
- data/spec/lib/glancer/utils/rate_limit_retry_spec.rb +193 -0
- metadata +3 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: '0780756d0563d270e2669590d7fa195506c67eb9224abc24382356f54682d9cf'
+  data.tar.gz: b2035a81f75a73e2dd9a4ceaa0e4af1d084867c3c6ecf0554e97acc05d9f7df1
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5c2fb146ee00751b91613a2bbc818cadfd44dff3c0f8fa4c42907516a02baeda4f281a273dfd8fba62f793ce16ed4d2ed2cebc3545026c282e4f78c9f6d5e2ef
+  data.tar.gz: cbca9b21bff036fd309a5f6fab87cea4e27ef83224f8f91bb0be618146bf611f156cc9e624940164e74e75e87586a52c2533f26d27e87f5bd725e075d91de1bd
data/CHANGELOG.md
CHANGED
@@ -7,82 +7,66 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
-## [1.
+## [1.1.0] — 2026-06-05
 
 ### Added
 
-- **
-
-
-
-
-
-
-
-- **Large-result alert**: a warning banner when a query returns ≥ 500 rows or has no
-  `LIMIT` clause.
-- **Audio input**: microphone button powered by the Web Speech API; transcribed text
-  is appended to the question field.
-- **Desktop sidebar toggle**: chevron button to collapse/expand the chat list on large
-  screens; state is persisted in `localStorage`.
-- **Blazer integration**: "Open in Blazer" button auto-detected when the `blazer` gem
-  is present; configurable via `config.blazer_path`.
-- **`:silent` log verbosity level**: suppresses all log output including `warn` and
-  `error` — intended for test environments.
-- **100 % test coverage**: 552 RSpec examples covering every workflow path, edge case,
-  and rescue branch.
-- **CI badge**: automatically generated coverage SVG committed to the `badge-generator`
-  branch on every push to `main`.
-
-### Changed
-
-- LLM humanization prompt rewritten to never describe the query as "executed" — it now
-  explains the logic and why it answers the question.
-- Improved self-correction: the executor retries up to 3 times, passing the database
-  error back to the LLM on each attempt.
-- Immediate user-message rendering: the user's bubble appears in the chat instantly
-  (before the server responds) via a temporary DOM node that is replaced by the
-  Turbo Stream response.
+- **Rate-limit retry with backoff**: all LLM and embedding API calls now automatically
+  retry when a rate-limit / quota-exceeded error is received. If the provider returns a
+  "retry in Xs" hint (e.g. Gemini Flash), that exact delay is honoured; otherwise
+  exponential backoff is applied (`llm_retry_delay * 2^attempt`). Resolves [#1].
+- **`max_llm_retries` config option** (default: `3`): maximum number of retry attempts
+  before propagating the error to the user. Set to `0` to disable automatic retries.
+- **`llm_retry_delay` config option** (default: `60` seconds): base delay in seconds
+  used when the provider does not supply a retry-after hint.
 
-
-
-- Info panel never showed content because the controller was passing `message_for_info:`
-  but the partial expected `message_info:`.
+## [1.0.0] — 2026-05-24
 
-
+First public release.
 
 ### Added
 
-- **RAG pipeline**: embed → retrieve → generate
-  retry (up to 3 attempts) on
+- **RAG pipeline**: embed → retrieve → generate code → validate → execute → humanize,
+  with automatic retry (up to 3 attempts) on errors using LLM self-correction.
+- **Dual query mode**: `query_mode: :sql` (default) generates read-only SQL;
+  `query_mode: :activerecord` generates and evaluates Ruby/ActiveRecord expressions.
+  Each mode has its own sanitizer (`SQLSanitizer` / `ARSanitizer`), extractor, prompt
+  builder, and executor.
 - **Multi-provider LLM support** via [ruby_llm](https://github.com/crmne/ruby_llm):
-  Gemini, OpenAI, and OpenRouter.
-
+  Gemini, OpenAI, and OpenRouter. Code generation, chat responses, and embeddings can
+  each use a different provider and model.
+- **Async message processing** via `Glancer::AsyncRunner`: messages are processed in a
+  background thread using `connection_pool.with_connection` — no external job queue
+  (Sidekiq, GoodJob, etc.) required.
+- **Client-side polling**: the UI polls `/messages/:id/poll` every 2 s and replaces the
+  message partial via Turbo Stream once done; a 5-minute hard timeout marks stuck
+  messages as failed automatically.
+- **Query enrichment**: `QueryEnricher` translates natural-language questions into dense
+  technical specifications before retrieval, improving code accuracy.
 - **Indexers** for `db/schema.rb`, `app/models/**/*.rb`, and a custom Markdown context
   file. Rake tasks: `glancer:index:all`, `glancer:index:schema`, `glancer:index:models`,
   `glancer:index:context`.
 - **Cosine similarity retrieval** with per-source-type relevance weights (schema 1.3×,
-  context 1.2×, models 1.1×)
--
+  context 1.2×, models 1.1×), configurable minimum score threshold, and fallback to
+  top-k results when no embedding meets the threshold.
 - **SQL safety layer**: `SQLSanitizer` (blocks destructive statements),
-  `SQLValidator` (verifies table references against indexed schema), and
-
+  `SQLValidator` (verifies table references against indexed schema), and mandatory
+  read-only transaction with automatic rollback.
 - **Audit trail**: every executed query is stored in `glancer_audits` with a unique
-  `run_id` UUID injected as
+  `run_id` UUID injected as a comment (`/*glancer,run_id:UUID*/`).
 - **In-memory response cache** (`workflow_cache_ttl`) to avoid redundant LLM calls for
   repeated identical questions.
 - **Chat UI**: Stimulus + Turbo Streams interface with dark mode, typewriter effect,
-  CSV export, SQL re-run,
-
+  CSV export, SQL/AR re-run, chart visualizations (bar, line, pie), client-side polling,
+  pipeline status labels, accordion results, and copy-to-clipboard.
 - **Settings page** at `/glancer/settings` for runtime custom instructions.
 - **Schema viewer** at `/glancer/db-schema` showing indexed tables and columns.
 - **Install generator**: `rails generate glancer:install` scaffolds the initializer,
   context file, and mounts the engine.
-- Configurable `statement_timeout`
-
--
-
+- Configurable `statement_timeout`, `history_limit`, `read_only_db`, `k`, `min_score`,
+  and per-source document weights.
+- **100% line coverage**: 717 RSpec examples covering every workflow path, edge case,
+  and rescue branch.
 
 [Unreleased]: https://github.com/ErnaneJ/glancer/compare/v1.0.0...HEAD
-[1.0.0]: https://github.com/ErnaneJ/glancer/
-[0.1.0]: https://github.com/ErnaneJ/glancer/releases/tag/v0.1.0
+[1.0.0]: https://github.com/ErnaneJ/glancer/releases/tag/v1.0.0
data/README.md
CHANGED
@@ -36,8 +36,11 @@ Glancer is a **Ruby on Rails engine** that mounts a full chat interface inside y
 → SELECT executed, results shown, answer written in plain language.
 ```
 
-
-
+<p align="center">
+  <a href="https://github.com/ErnaneJ/glancer/raw/refs/heads/main/.github/assets/demo.mp4">
+    <img src="./.github/assets/demo.gif" alt="DEMO">
+  </a>
+</p>
 
 ## Why Glancer?
 
@@ -64,6 +67,16 @@ Glancer removes that friction. It gives your app a persistent, context-aware dat
 
 Glancer is built on top of [**ruby_llm**](https://github.com/crmne/ruby_llm), a provider-agnostic LLM client for Ruby. All LLM calls (query generation, humanized responses, embeddings, and optional question enrichment) go through ruby_llm, so any model it supports works with Glancer.
 
+### Supported models by provider
+
+| Provider | Chat / Code models | Embedding models | Model list |
+|---|---|---|---|
+| **Gemini** | `gemini-2.0-flash`, `gemini-2.5-pro`, `gemini-1.5-pro`, … | `text-embedding-004` | [ai.google.dev/gemini-api/docs/models](https://ai.google.dev/gemini-api/docs/models) |
+| **OpenAI** | `gpt-4o`, `gpt-4o-mini`, `o3-mini`, … | `text-embedding-3-large`, `text-embedding-3-small` | [platform.openai.com/docs/models](https://platform.openai.com/docs/models) |
+| **OpenRouter** | Any model available on the platform (e.g. `anthropic/claude-3.5-sonnet`, `deepseek/deepseek-r1:free`) | Not natively supported — pair with `:gemini` or `:openai` for embeddings | [openrouter.ai/models](https://openrouter.ai/models) |
+
+The full list of models validated by the ruby_llm gem is at [rubyllm.com/available-models](https://rubyllm.com/available-models/). If you need a model not yet in that registry, set `assume_model_exists: true` by configuring an explicit `embedding_model` or `code_model` string.
+
 ## Installation
 
 ### 1. Add to your Gemfile
@@ -164,6 +177,20 @@ Route all queries to a replica to offload your primary database:
 config.read_only_db = ENV["REPLICA_DATABASE_URL"]
 ```
 
+### Rate limiting
+
+When using free-tier or low-quota LLM providers (e.g. Gemini Flash free tier), you may hit rate limits. Glancer automatically retries with backoff when it receives a quota-exceeded or rate-limit error from any provider:
+
+- If the provider returns a **"retry in Xs"** hint in the error message (Gemini does this), that exact delay is used.
+- Otherwise, **exponential backoff** is applied: `llm_retry_delay × 2^(attempt − 1)`.
+
+```ruby
+config.max_llm_retries = 3 # retries before propagating the error (0 = disable)
+config.llm_retry_delay = 60 # base delay in seconds when no retry hint is provided
+```
+
+A warning is logged on each retry attempt so you can monitor them in your logs. If all retries are exhausted, the error is surfaced to the user normally.
+
 ### Full configuration reference
 
 | Option | Default | Description |
@@ -194,6 +221,8 @@ config.read_only_db = ENV["REPLICA_DATABASE_URL"]
 | `models_documents_weight` | `1.1` | Score boost for model chunks |
 | `history_limit` | `6` | Prior conversation turns included in the LLM prompt |
 | `workflow_cache_ttl` | `5.minutes` | In-memory result cache TTL; `0` to disable |
+| `max_llm_retries` | `3` | Max retries on rate-limit / quota errors; `0` to disable |
+| `llm_retry_delay` | `60` | Base delay in seconds between retries (exponential backoff; API hint takes priority) |
 | `log_verbosity` | `:info` | `:silent`, `:none`, `:info`, or `:debug` |
 | `log_output_path` | `nil` | Log file path; `nil` writes to stdout |
 | `blazer_path` | `nil` | Blazer base path; auto-detected when `blazer` gem is present |
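As a rough illustration of the fallback schedule described in the README's "Rate limiting" section, the sketch below computes `llm_retry_delay × 2^(attempt − 1)` for each retry. The `backoff_schedule` helper is hypothetical and not part of Glancer; it only mirrors the formula quoted in the README and assumes the provider sends no "retry in Xs" hint.

```ruby
# Hypothetical helper (not part of Glancer): lists the fallback delays that
# would apply when the provider supplies no "retry in Xs" hint.
def backoff_schedule(base_delay:, max_retries:)
  (1..max_retries).map { |attempt| base_delay * (2**(attempt - 1)) }
end

schedule = backoff_schedule(base_delay: 60, max_retries: 3)
puts schedule.inspect # => [60, 120, 240]
puts schedule.sum     # => 420
```

With the documented defaults, a call that keeps hitting the limit could therefore wait roughly seven minutes in total before the error surfaces, so a smaller `llm_retry_delay` may be worth considering for interactive use.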
data/lib/generators/glancer/install/templates/glancer.rb
CHANGED
@@ -101,7 +101,7 @@ Glancer.configure do |config|
   # avoid errors. If you must use an OpenRouter embedding model anyway, set
   # embedding_model explicitly (e.g., 'openai/text-embedding-3-small') —
   # Glancer will bypass the model registry check automatically.
-  #
+  # https://rubyllm.com/available-models/
   # Accepted: nil | :gemini | :openai | :openrouter
   config.embedding_provider = nil
 
@@ -203,6 +203,19 @@ Glancer.configure do |config|
   # shared across Puma workers or restarts).
   config.workflow_cache_ttl = 5.minutes
 
+  # ─────────────────────────────────────────────────────────────────────────────
+  # Rate limiting
+  # ─────────────────────────────────────────────────────────────────────────────
+
+  # Maximum number of retries when an LLM or embedding API returns a rate-limit
+  # or quota-exceeded error. Set to 0 to disable automatic retries.
+  config.max_llm_retries = 3
+
+  # Base delay in seconds between rate-limit retries. When the provider returns a
+  # "retry in Xs" hint (e.g. Gemini), that hint takes priority. Otherwise,
+  # exponential backoff is applied: delay * 2^(attempt - 1).
+  config.llm_retry_delay = 60
+
   # ─────────────────────────────────────────────────────────────────────────────
   # Logging
   # ─────────────────────────────────────────────────────────────────────────────
data/lib/glancer/configuration.rb
CHANGED
@@ -52,6 +52,8 @@ module Glancer
       self.query_enrichment_enabled = false # enrich question with table names before retrieval
       self.enrichment_provider = nil # nil → falls back to llm_provider
       self.enrichment_model = nil # nil → falls back to llm_model
+      self.max_llm_retries = 3 # retries on rate-limit errors
+      self.llm_retry_delay = 60 # base delay in seconds (fallback when no retry-after hint)
     end
 
     # === READERS ===
@@ -66,7 +68,8 @@ module Glancer
                 :code_provider, :code_model,
                 :chat_provider, :chat_model,
                 :blazer_path, :query_mode,
-                :query_enrichment_enabled, :enrichment_provider, :enrichment_model
+                :query_enrichment_enabled, :enrichment_provider, :enrichment_model,
+                :max_llm_retries, :llm_retry_delay
 
     # === WRITERS ===
     def adapter=(value)
@@ -340,6 +343,18 @@ module Glancer
       @query_mode = value
     end
 
+    def max_llm_retries=(value)
+      raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0
+
+      @max_llm_retries = value
+    end
+
+    def llm_retry_delay=(value)
+      raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?
+
+      @llm_retry_delay = value
+    end
+
     # Returns the Blazer base path if Blazer is available, nil otherwise.
     def resolved_blazer_path
       return @blazer_path unless @blazer_path.nil?
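To make the accepted and rejected values of the two new writers easier to see, here is a minimal standalone sketch. The class name `RetrySettings` is an assumption made for this example only; it copies the two validation rules from the diff and is not the gem's `Glancer::Configuration`.

```ruby
# Hypothetical stand-in for the two new setters; RetrySettings is not a
# Glancer class, it only reproduces the validation rules shown in the diff.
class RetrySettings
  attr_reader :max_llm_retries, :llm_retry_delay

  def initialize
    self.max_llm_retries = 3
    self.llm_retry_delay = 60
  end

  def max_llm_retries=(value)
    raise ArgumentError, "max_llm_retries must be a non-negative integer" unless value.is_a?(Integer) && value >= 0

    @max_llm_retries = value
  end

  def llm_retry_delay=(value)
    raise ArgumentError, "llm_retry_delay must be a positive number" unless value.is_a?(Numeric) && value.positive?

    @llm_retry_delay = value
  end
end

settings = RetrySettings.new
settings.max_llm_retries = 0   # accepted: zero disables retries
settings.llm_retry_delay = 0.5 # accepted: floats are allowed for the delay

begin
  settings.llm_retry_delay = 0 # rejected: the delay must be strictly positive
rescue ArgumentError => e
  puts e.message
end
```

Note the asymmetry: `0` is valid for `max_llm_retries` but not for `llm_retry_delay`, and a float such as `2.5` is valid for the delay but not for the retry count.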
data/lib/glancer/retriever.rb
CHANGED
@@ -16,12 +16,14 @@ module Glancer
         Glancer::Utils::Logger.debug("Retriever",
                                      "Embedding chunk ##{idx + 1} (#{data[:source_type]} - #{data[:source_path]}): '#{preview}...'")
 
-        vector =
-
-
-
-
-
+        vector = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
+          RubyLLM.embed(
+            chunk,
+            model: Glancer.configuration.resolved_embedding_model,
+            provider: Glancer.configuration.resolved_embedding_provider,
+            assume_model_exists: true
+          ).vectors
+        end
 
         Glancer::Utils::Logger.debug("Retriever",
                                      "Vector size: #{vector.size}, example values: #{vector.first(5).inspect}")
@@ -47,12 +49,14 @@ module Glancer
     def search(query)
       Glancer::Utils::Logger.info("Retriever", "Searching for top #{Glancer.configuration.k} results...")
 
-      query_embedding =
-
-
-
-
-
+      query_embedding = Glancer::Utils::RateLimitRetry.with_retry(context: "Retriever") do
+        RubyLLM.embed(
+          query,
+          model: Glancer.configuration.resolved_embedding_model,
+          provider: Glancer.configuration.resolved_embedding_provider,
+          assume_model_exists: true
+        ).vectors
+      end
 
       # @TODO Postgres with native search?
       perform_ruby_search(query_embedding)
data/lib/glancer/utils/rate_limit_retry.rb
ADDED
@@ -0,0 +1,51 @@
+# frozen_string_literal: true
+
+module Glancer
+  module Utils
+    module RateLimitRetry
+      RATE_LIMIT_PATTERNS = [
+        /rate.?limit/i,
+        /quota.?exceed/i,
+        /exceeded.?your.?current.?quota/i,
+        /too.?many.?request/i,
+        /resource.?exhausted/i,
+        /\b429\b/
+      ].freeze
+
+      RETRY_AFTER_PATTERN = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i
+
+      def self.with_retry(context:, max_retries: nil, base_delay: nil)
+        max_retries ||= Glancer.configuration.max_llm_retries
+        base_delay ||= Glancer.configuration.llm_retry_delay
+        attempt = 0
+
+        begin
+          yield
+        rescue StandardError => e
+          raise unless rate_limit_error?(e) && attempt < max_retries
+
+          attempt += 1
+          delay = parse_retry_after(e.message) || (base_delay * (2**(attempt - 1)))
+          Glancer::Utils::Logger.warn(
+            context,
+            "Rate limit hit (attempt #{attempt}/#{max_retries}). Retrying in #{delay.ceil}s..."
+          )
+          sleep(delay)
+          retry
+        end
+      end
+
+      def self.rate_limit_error?(error)
+        RATE_LIMIT_PATTERNS.any? { |p| error.message.match?(p) } ||
+          error.class.name.match?(/rate.?limit/i)
+      end
+      private_class_method :rate_limit_error?
+
+      def self.parse_retry_after(message)
+        m = message.match(RETRY_AFTER_PATTERN)
+        m && m[1].to_f.positive? ? m[1].to_f : nil
+      end
+      private_class_method :parse_retry_after
+    end
+  end
+end
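The detection and hint-parsing rules in `rate_limit_retry.rb` can be tried in isolation. The sketch below copies the two pattern constants from the diff into a standalone script; the method names `rate_limited?` and `retry_hint` are made up for this example, since the gem keeps its equivalents private.

```ruby
# Standalone sketch: the constants mirror the patterns in the diff, while
# rate_limited? and retry_hint are hypothetical names used only here.
RATE_LIMIT_PATTERNS = [
  /rate.?limit/i,
  /quota.?exceed/i,
  /exceeded.?your.?current.?quota/i,
  /too.?many.?request/i,
  /resource.?exhausted/i,
  /\b429\b/
].freeze

RETRY_AFTER_PATTERN = /retry.?in\s+([0-9]+(?:\.[0-9]+)?)\s*s/i

# True when the message text looks like a rate-limit or quota error.
def rate_limited?(message)
  RATE_LIMIT_PATTERNS.any? { |pattern| message.match?(pattern) }
end

# Returns the hinted delay in seconds, or nil when no positive hint is found.
def retry_hint(message)
  match = message.match(RETRY_AFTER_PATTERN)
  return nil unless match

  seconds = match[1].to_f
  seconds.positive? ? seconds : nil
end

puts rate_limited?("HTTP 429 error")              # => true
puts rate_limited?("connection reset by peer")    # => false
puts retry_hint("Please retry in 51.63s").inspect # => 51.63
puts retry_hint("no hint here").inspect           # => nil
```

Because detection works on message text, any error whose message happens to contain one of these phrases would be treated as retryable, which is worth keeping in mind when reading retry warnings in the logs.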
data/lib/glancer/version.rb
CHANGED
data/lib/glancer/workflow/builder.rb
CHANGED
@@ -17,7 +17,9 @@ module Glancer
           assume_model_exists: true
         )
 
-        response =
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
 
         Glancer::Utils::Logger.info("Workflow::Builder",
                                     "LLM responded with SQL (length: #{response.content&.length || 0} characters)")
@@ -64,7 +66,9 @@ module Glancer
           assume_model_exists: true
         )
 
-        response =
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
 
         # Clean the response to ensure we only have the raw SQL
         Glancer::Workflow::SQLExtractor.extract(response.content)
@@ -84,7 +88,9 @@ module Glancer
           assume_model_exists: true
         )
 
-        response =
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         Glancer::Utils::Logger.info("Workflow::Builder",
                                     "LLM responded with AR code (length: #{response.content&.length || 0} chars)")
         response.content
@@ -118,7 +124,9 @@ module Glancer
           assume_model_exists: true
         )
 
-        response =
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::Builder") do
+          chat.ask(prompt)
+        end
         Glancer::Workflow::ARExtractor.extract(response.content)
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::Builder", "Failed to fix AR code: #{e.message}")
data/lib/glancer/workflow/llm.rb
CHANGED
@@ -39,7 +39,9 @@ module Glancer
         context += "\n\nADDITIONAL INSTRUCTIONS:\n#{custom}" if custom.present?
 
         chat.with_instructions(context)
-        response =
+        response = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(question)
+        end
 
         response.content
       rescue StandardError => e
@@ -72,7 +74,9 @@ module Glancer
           model: Glancer.configuration.resolved_chat_model,
           assume_model_exists: true
         )
-
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content
+        end
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::LLM", "explain_missing_tables failed: #{e.message}")
         "Não consegui encontrar a(s) tabela(s) **#{missing}** no schema indexado. " \
@@ -87,7 +91,9 @@ module Glancer
         )
         prompt = "Generate a concise, descriptive title (max 45 characters, no quotes, no punctuation at end) " \
                  "for a database query session starting with this question: #{question}"
-
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content.strip.truncate(50)
+        end
       rescue StandardError => e
         Glancer::Utils::Logger.error("Workflow::LLM", "generate_title failed: #{e.message}")
         question.truncate(45)
@@ -116,7 +122,9 @@ module Glancer
           5. Respond in the user's language.
         PROMPT
 
-
+        Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::LLM") do
+          chat.ask(prompt).content
+        end
       end
     end
   end
data/lib/glancer/workflow/query_enricher.rb
CHANGED
@@ -50,7 +50,9 @@ module Glancer
           assume_model_exists: true
         )
 
-        enriched =
+        enriched = Glancer::Utils::RateLimitRetry.with_retry(context: "Workflow::QueryEnricher") do
+          chat.ask(prompt).content.to_s.strip
+        end
         enriched.presence || question
       rescue StandardError => e
         Glancer::Utils::Logger.warn("Workflow::QueryEnricher", "Enrichment failed, using original: #{e.message}")
data/lib/glancer.rb
CHANGED
@@ -20,7 +20,8 @@ require "glancer/utils/logger" # Glancer::Utils::Logger
 require "glancer/utils/markdown_helper" # Glancer::Utils::MarkdownHelper
 require "glancer/utils/result_formatter" # Glancer::Utils::ResultFormatter
 require "glancer/utils/table_stats" # Glancer::Utils::TableStats
-require "glancer/utils/transaction"
+require "glancer/utils/transaction" # Glancer::Utils::Transaction
+require "glancer/utils/rate_limit_retry" # Glancer::Utils::RateLimitRetry
 
 require "glancer/engine" # Glancer::Engine
 
data/spec/lib/glancer/configuration_spec.rb
CHANGED
@@ -143,6 +143,14 @@ RSpec.describe Glancer::Configuration do
     it "sets read_only_db to nil" do
       expect(config.read_only_db).to be_nil
     end
+
+    it "sets max_llm_retries to 3" do
+      expect(config.max_llm_retries).to eq(3)
+    end
+
+    it "sets llm_retry_delay to 60" do
+      expect(config.llm_retry_delay).to eq(60)
+    end
   end
 
   # ── adapter= ─────────────────────────────────────────────────────────────────
@@ -855,4 +863,56 @@ RSpec.describe Glancer::Configuration do
       expect(config.resolved_enrichment_model).to eq("gemini-2.0-flash")
     end
   end
+
+  # ── max_llm_retries= ─────────────────────────────────────────────────────────
+
+  describe "#max_llm_retries=" do
+    it "defaults to 3" do
+      expect(config.max_llm_retries).to eq(3)
+    end
+
+    it "accepts 0 (disables retries)" do
+      config.max_llm_retries = 0
+      expect(config.max_llm_retries).to eq(0)
+    end
+
+    it "accepts a positive integer" do
+      config.max_llm_retries = 5
+      expect(config.max_llm_retries).to eq(5)
+    end
+
+    it "raises ArgumentError for a negative integer" do
+      expect { config.max_llm_retries = -1 }.to raise_error(ArgumentError, /non-negative integer/)
+    end
+
+    it "raises ArgumentError for a non-integer" do
+      expect { config.max_llm_retries = 2.5 }.to raise_error(ArgumentError, /non-negative integer/)
+    end
+  end
+
+  # ── llm_retry_delay= ─────────────────────────────────────────────────────────
+
+  describe "#llm_retry_delay=" do
+    it "defaults to 60" do
+      expect(config.llm_retry_delay).to eq(60)
+    end
+
+    it "accepts a positive integer" do
+      config.llm_retry_delay = 30
+      expect(config.llm_retry_delay).to eq(30)
+    end
+
+    it "accepts a positive float" do
+      config.llm_retry_delay = 0.5
+      expect(config.llm_retry_delay).to eq(0.5)
+    end
+
+    it "raises ArgumentError for zero" do
+      expect { config.llm_retry_delay = 0 }.to raise_error(ArgumentError, /positive number/)
+    end
+
+    it "raises ArgumentError for a negative number" do
+      expect { config.llm_retry_delay = -5 }.to raise_error(ArgumentError, /positive number/)
+    end
+  end
 end
data/spec/lib/glancer/utils/rate_limit_retry_spec.rb
ADDED
@@ -0,0 +1,193 @@
+# frozen_string_literal: true
+
+require "spec_helper"
+
+RSpec.describe Glancer::Utils::RateLimitRetry do
+  before do
+    Glancer.configuration.max_llm_retries = 3
+    Glancer.configuration.llm_retry_delay = 60
+    allow(described_class).to receive(:sleep)
+    allow(Glancer::Utils::Logger).to receive(:warn)
+  end
+
+  describe ".with_retry" do
+    context "when the block succeeds on the first attempt" do
+      it "returns the block result without retrying" do
+        result = described_class.with_retry(context: "Test") { 42 }
+        expect(result).to eq(42)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+
+    context "when the block raises a non-rate-limit error" do
+      it "re-raises immediately without retrying" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test") do
+            calls += 1
+            raise StandardError, "some other error"
+          end
+        end.to raise_error(StandardError, "some other error")
+        expect(calls).to eq(1)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+
+    context "when the block raises a rate-limit error" do
+      it "retries up to max_retries times then re-raises" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 2) do
+            calls += 1
+            raise StandardError, "You exceeded your current quota"
+          end
+        end.to raise_error(StandardError, "You exceeded your current quota")
+        expect(calls).to eq(3) # 1 initial + 2 retries
+      end
+
+      it "logs a warning on each retry" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            raise StandardError, "rate limit exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(Glancer::Utils::Logger).to have_received(:warn).with("Test", %r{Rate limit hit \(attempt 1/1\)})
+      end
+
+      it "sleeps for the base delay when no retry-after hint is present" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1, base_delay: 10) do
+            raise StandardError, "quota exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(10)
+      end
+
+      it "sleeps for the hint delay when a retry-after hint is present" do
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1, base_delay: 60) do
+            raise StandardError, "quota exceeded. Please retry in 51.63s"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(51.63)
+      end
+
+      it "uses exponential backoff when no hint is present" do
+        sleep_calls = []
+        allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
+
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 3, base_delay: 10) do
+            raise StandardError, "resource exhausted"
+          end
+        end.to raise_error(StandardError)
+
+        # attempt 1 → 10 * 2^0 = 10, attempt 2 → 10 * 2^1 = 20, attempt 3 → 10 * 2^2 = 40
+        expect(sleep_calls).to eq([10, 20, 40])
+      end
+
+      it "succeeds if a later attempt does not raise" do
+        calls = 0
+        result = described_class.with_retry(context: "Test", max_retries: 2) do
+          calls += 1
+          raise StandardError, "too many requests" if calls < 3
+
+          "ok"
+        end
+        expect(result).to eq("ok")
+        expect(calls).to eq(3)
+      end
+
+      it "reads max_retries from configuration when not supplied" do
+        Glancer.configuration.max_llm_retries = 1
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test") do
+            calls += 1
+            raise StandardError, "quota exceeded"
+          end
+        end.to raise_error(StandardError)
+        expect(calls).to eq(2) # 1 initial + 1 from config
+      end
+
+      it "reads base_delay from configuration when not supplied" do
+        Glancer.configuration.llm_retry_delay = 5
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            raise StandardError, "resource exhausted"
+          end
+        end.to raise_error(StandardError)
+        expect(described_class).to have_received(:sleep).with(5)
+      end
+
+      it "does not retry when max_retries is 0" do
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 0) do
+            calls += 1
+            raise StandardError, "rate limit"
+          end
+        end.to raise_error(StandardError)
+        expect(calls).to eq(1)
+        expect(described_class).not_to have_received(:sleep)
+      end
+    end
+
+    context "rate limit error detection" do
+      {
+        "rate limit" => /rate.?limit/i,
+        "quota exceeded" => /quota.?exceed/i,
+        "You exceeded your current quota" => /exceeded.?your.?current.?quota/i,
+        "Too Many Requests" => /too.?many.?request/i,
+        "RESOURCE_EXHAUSTED" => /resource.?exhausted/i,
+        "HTTP 429 error" => /\b429\b/
+      }.each_key do |message|
+        it "detects '#{message}' as a rate-limit error" do
+          calls = 0
+          expect do
+            described_class.with_retry(context: "Test", max_retries: 1) do
+              calls += 1
+              raise StandardError, message
+            end
+          end.to raise_error(StandardError)
+          expect(calls).to eq(2) # retried once
+        end
+      end
+
+      it "detects errors whose class name contains 'rate_limit'" do
+        klass = Class.new(StandardError) { def self.name = "SomeRateLimitError" }
+        calls = 0
+        expect do
+          described_class.with_retry(context: "Test", max_retries: 1) do
+            calls += 1
+            raise klass, "any message"
+          end
+        end.to raise_error(klass)
+        expect(calls).to eq(2)
+      end
+    end
+
+    context "retry-after hint parsing" do
+      [
+        ["Please retry in 51.632812448s", 51.632812448],
+        ["retry in 30s", 30.0],
+        ["RETRY IN 120.5s now", 120.5],
+        ["retryIn 0s — ignored", nil],
+        ["no hint here", nil]
+      ].each do |message, expected|
+        it "parses #{expected.inspect} from '#{message}'" do
+          sleep_calls = []
+          allow(described_class).to receive(:sleep) { |t| sleep_calls << t }
+          expect do
+            described_class.with_retry(context: "Test", max_retries: 1, base_delay: 99) do
+              raise StandardError, "quota exceeded. #{message}"
+            end
+          end.to raise_error(StandardError)
+
+          expected_delay = expected || 99
+          expect(sleep_calls.first).to eq(expected_delay)
+        end
+      end
+    end
+  end
+end
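The spec above avoids real waiting by stubbing `sleep` on the module. Outside RSpec, one way to get the same effect is to pass the sleeping behaviour in as an argument. The following is a simplified sketch of the retry loop under that assumption; the `sleeper:` parameter is hypothetical and does not exist in Glancer, and detection is reduced to a single pattern for brevity.

```ruby
# Simplified retry loop for illustration. `sleeper` is a hypothetical
# injection point that plays the role of the stubbed `sleep` in the spec.
def with_retry(max_retries:, base_delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue StandardError => e
    # Re-raise anything that is not a rate-limit error, or once retries run out.
    raise unless e.message.match?(/rate.?limit/i) && attempt < max_retries

    attempt += 1
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

delays = []
calls = 0
result = with_retry(max_retries: 3, base_delay: 10, sleeper: ->(s) { delays << s }) do
  calls += 1
  raise StandardError, "rate limit exceeded" if calls < 3

  "ok"
end

puts result         # => ok
puts delays.inspect # => [10, 20]
```

Recording the requested delays instead of sleeping keeps such a check fast while still asserting the backoff sequence, which is the same idea the spec expresses with `allow(described_class).to receive(:sleep)`.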
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: glancer
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.1.0
 platform: ruby
 authors:
 - Ernane Ferreira
@@ -169,6 +169,7 @@ files:
 - lib/glancer/retriever.rb
 - lib/glancer/utils/logger.rb
 - lib/glancer/utils/markdown_helper.rb
+- lib/glancer/utils/rate_limit_retry.rb
 - lib/glancer/utils/result_formatter.rb
 - lib/glancer/utils/table_stats.rb
 - lib/glancer/utils/transaction.rb
@@ -201,6 +202,7 @@ files:
 - spec/lib/glancer/retriever_spec.rb
 - spec/lib/glancer/utils/logger_spec.rb
 - spec/lib/glancer/utils/markdown_helper_spec.rb
+- spec/lib/glancer/utils/rate_limit_retry_spec.rb
 - spec/lib/glancer/utils/result_formatter_spec.rb
 - spec/lib/glancer/utils/table_stats_spec.rb
 - spec/lib/glancer/utils/transaction_spec.rb