qasql 1.0.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- qasql-1.0.0/LICENSE +21 -0
- qasql-1.0.0/MANIFEST.in +5 -0
- qasql-1.0.0/PKG-INFO +640 -0
- qasql-1.0.0/README.md +599 -0
- qasql-1.0.0/pyproject.toml +84 -0
- qasql-1.0.0/qasql/__init__.py +45 -0
- qasql-1.0.0/qasql/__main__.py +10 -0
- qasql-1.0.0/qasql/cli.py +203 -0
- qasql-1.0.0/qasql/config.py +219 -0
- qasql-1.0.0/qasql/core/__init__.py +18 -0
- qasql-1.0.0/qasql/core/executor.py +239 -0
- qasql-1.0.0/qasql/core/generator.py +249 -0
- qasql-1.0.0/qasql/core/judge.py +204 -0
- qasql-1.0.0/qasql/core/prompts.py +173 -0
- qasql-1.0.0/qasql/core/schema_agent.py +277 -0
- qasql-1.0.0/qasql/database.py +331 -0
- qasql-1.0.0/qasql/engine.py +490 -0
- qasql-1.0.0/qasql/llm.py +218 -0
- qasql-1.0.0/qasql/py.typed +2 -0
- qasql-1.0.0/qasql/result.py +109 -0
- qasql-1.0.0/qasql.egg-info/PKG-INFO +640 -0
- qasql-1.0.0/qasql.egg-info/SOURCES.txt +26 -0
- qasql-1.0.0/qasql.egg-info/dependency_links.txt +1 -0
- qasql-1.0.0/qasql.egg-info/entry_points.txt +2 -0
- qasql-1.0.0/qasql.egg-info/requires.txt +21 -0
- qasql-1.0.0/qasql.egg-info/top_level.txt +1 -0
- qasql-1.0.0/setup.cfg +4 -0
- qasql-1.0.0/setup.py +9 -0
qasql-1.0.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Chansokheang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
qasql-1.0.0/MANIFEST.in
ADDED
qasql-1.0.0/PKG-INFO
ADDED
@@ -0,0 +1,640 @@
Metadata-Version: 2.4
Name: qasql
Version: 1.0.0
Summary: Local-first Text-to-SQL engine for enterprise deployment
Author-email: Chansokheang <heangs770@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/Chansokheang/Map-Reduce-Schema-sdk
Project-URL: Documentation, https://github.com/Chansokheang/Map-Reduce-Schema-sdk#readme
Project-URL: Repository, https://github.com/Chansokheang/Map-Reduce-Schema-sdk
Project-URL: Issues, https://github.com/Chansokheang/Map-Reduce-Schema-sdk/issues
Keywords: text-to-sql,natural-language,sql,database,llm,enterprise,local,ollama
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28.0
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.18.0; extra == "anthropic"
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: postgres
Requires-Dist: psycopg2-binary>=2.9.0; extra == "postgres"
Provides-Extra: all
Requires-Dist: anthropic>=0.18.0; extra == "all"
Requires-Dist: openai>=1.0.0; extra == "all"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# QA-SQL SDK

**Local-first Text-to-SQL engine for enterprise deployment.**

All processing happens locally - sensitive database schemas never leave your network.

[](https://www.python.org/downloads/)
[](https://opensource.org/licenses/MIT)

---

## Table of Contents

- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [CLI Usage](#cli-usage)
- [Python SDK](#python-sdk)
- [Configuration](#configuration)
- [How It Works](#how-it-works)
- [Examples](#examples)
- [API Reference](#api-reference)
- [Troubleshooting](#troubleshooting)

---

## Features

- **Privacy-First**: Use local LLMs (Ollama) - zero data leaves your network
- **Multi-Strategy Generation**: Generates 4-5 SQL candidates using different approaches
- **Automatic Schema Discovery**: Extracts and profiles database structure
- **Smart Selection**: LLM-as-a-Judge picks the best SQL candidate
- **Database Support**: SQLite and PostgreSQL
- **Flexible LLM Support**: Ollama (local), Anthropic Claude, OpenAI GPT

---

## Installation

### Step 1: Install the SDK

```bash
# From source (development)
cd qasql-sdk
pip install -e .

# Or from PyPI (after publish)
pip install qasql
```

### Step 2: Install Optional Dependencies

```bash
# For Anthropic Claude
pip install qasql[anthropic]

# For OpenAI
pip install qasql[openai]

# For PostgreSQL
pip install qasql[postgres]

# All extras
pip install qasql[all]
```
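
To see which optional backends are importable in the current environment, a small check along these lines may help. It is only a sketch: `available_extras` is a hypothetical helper, not part of qasql, and it assumes the extras install the import names `anthropic`, `openai`, and `psycopg2` (the module shipped by `psycopg2-binary`).

```python
from __future__ import annotations

from importlib.util import find_spec

# Import names assumed to correspond to each qasql extra.
EXTRA_MODULES = {
    "anthropic": "anthropic",
    "openai": "openai",
    "postgres": "psycopg2",  # module provided by psycopg2-binary
}


def available_extras(modules: dict[str, str] = EXTRA_MODULES) -> dict[str, bool]:
    """Report, per extra, whether its module can be located without importing it."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra}: {'installed' if ok else 'missing'}")
```

`find_spec` only locates the module, so the check stays cheap and does not trigger any client initialisation.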

### Step 3: Setup LLM Provider

#### Option A: Ollama (Local - Recommended for Privacy)

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama server (keep running in terminal)
ollama serve

# In another terminal, pull a model
ollama pull llama3.2

# Or for better SQL generation
ollama pull codellama:13b
```

#### Option B: Anthropic API

```bash
export ANTHROPIC_API_KEY='your-anthropic-api-key'
```

#### Option C: OpenAI API

```bash
export OPENAI_API_KEY='your-openai-api-key'
```

---

## Quick Start

### 1. Test Schema Extraction (No LLM Required)

```bash
cd qasql-sdk/examples
python test_schema_only.py
```

### 2. Full Text-to-SQL Test (Requires LLM)

```bash
# Make sure Ollama is running first
ollama serve

# Then run the test
cd qasql-sdk/examples
python test_california_schools.py
```

### 3. Interactive Demo

```bash
cd qasql-sdk/examples
python interactive_demo.py --db-uri sqlite:///../../app/california_schools.sqlite
```

---

## CLI Usage

### Using `python -m qasql`

```bash
cd qasql-sdk

# List tables
python -m qasql tables --db-uri sqlite:///path/to/database.sqlite

# Setup database (extract schema)
python -m qasql setup --db-uri sqlite:///path/to/database.sqlite

# Generate SQL from question
python -m qasql query --db-uri sqlite:///path/to/database.sqlite \
  --question "How many customers are there?"

# Generate SQL with hint (enables SME strategy)
python -m qasql query --db-uri sqlite:///path/to/database.sqlite \
  --question "What is the total revenue?" \
  --hint "revenue = sum(amount) from orders table"

# Execute the generated SQL
python -m qasql query --db-uri sqlite:///path/to/database.sqlite \
  --question "List all products" \
  --execute

# Show verbose output with timings
python -m qasql query --db-uri sqlite:///path/to/database.sqlite \
  --question "Count orders by status" \
  --verbose
```

### CLI Options

```
Global Options:
  --config, -c      Path to config file (JSON)
  --db-uri          Database URI (sqlite:/// or postgresql://)
  --provider        LLM provider: ollama, anthropic, openai (default: ollama)
  --model           LLM model name (default: llama3.2)
  --ollama-url      Ollama server URL (default: http://localhost:11434)
  --output-dir, -o  Output directory (default: ./qasql_output)

Commands:
  setup             Extract schema and generate descriptions
    --readable-names  Path to readable names mapping file
    --force, -f       Force regeneration

  query             Generate SQL from natural language
    --question, -q    Natural language question (required)
    --hint            SME hint for better accuracy
    --execute, -e     Execute the generated SQL
    --verbose, -v     Show timing information
    --json            Output as JSON

  tables            List database tables
```
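
When the CLI is driven from a script, the flags listed above can be assembled into an argument list rather than a shell string, which avoids quoting problems in questions and hints. The sketch below uses only flags documented above; `build_query_argv` is a hypothetical helper, and placing `--db-uri` after the subcommand simply follows the usage examples earlier in this section.

```python
from __future__ import annotations

import sys


def build_query_argv(
    db_uri: str,
    question: str,
    hint: str | None = None,
    execute: bool = False,
    as_json: bool = False,
) -> list[str]:
    """Build the argv for `python -m qasql query` from documented flags only."""
    argv = [sys.executable, "-m", "qasql", "query",
            "--db-uri", db_uri, "--question", question]
    if hint:
        argv += ["--hint", hint]   # a hint enables the SME strategy
    if execute:
        argv.append("--execute")
    if as_json:
        argv.append("--json")
    return argv


if __name__ == "__main__":
    print(build_query_argv("sqlite:///sales.sqlite", "How many customers are there?"))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)`. The structure of the `--json` output is not described here, so it is worth inspecting it before relying on particular fields.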

---

## Python SDK

### Basic Usage

```python
from qasql import QASQLEngine

# Initialize engine
engine = QASQLEngine(
    db_uri="sqlite:///path/to/database.sqlite",
    llm_provider="ollama",  # or "anthropic", "openai"
    llm_model="llama3.2",  # model name
    output_dir="./qasql_output"
)

# One-time setup (extracts schema, generates column descriptions)
setup_result = engine.setup()
print(f"Tables found: {setup_result.tables_found}")

# Query WITHOUT hint → generates 4 candidates
result = engine.query("How many customers are there?")
print(result.sql)
print(result.confidence)

# Query WITH hint → generates 5 candidates (includes SME strategy)
result = engine.query(
    question="What is the total revenue by month?",
    hint="revenue = sum(order_amount), use orders table"
)
print(result.sql)
print(result.reasoning)

# Execute SQL directly
rows, columns = engine.execute_sql(result.sql)
print(columns)
print(rows)
```

### With Configuration File

```python
from qasql import QASQLEngine

# Load from config file
engine = QASQLEngine(config_file="qasql.config.json")
engine.setup()
result = engine.query("Show all orders")
```

### Inspect Schema

```python
# Get table list
tables = engine.get_tables()
print(tables)  # ['customers', 'orders', 'products']

# Get full schema
schema = engine.get_schema()
for table_name, info in schema.items():
    print(f"{table_name}: {len(info['columns'])} columns, {info['row_count']} rows")

# Get column descriptions
profile = engine.get_profile()
```

### Inspect Query Results

```python
result = engine.query("Show top 10 customers by revenue")

# Access all fields
print(result.sql)         # Generated SQL
print(result.confidence)  # 0.0 - 1.0
print(result.reasoning)   # Why this SQL was selected
print(result.question)    # Original question
print(result.hint)        # Hint if provided

# Candidate details
print(f"Candidates: {result.successful_candidates}/{result.total_candidates}")
for candidate in result.candidates:
    print(f"  [{candidate.strategy}] {candidate.success} - {candidate.sql[:50]}...")

# Timing information
for stage, ms in result.metadata.get("timings", {}).items():
    print(f"  {stage}: {ms:.0f}ms")

# Convert to dictionary
result_dict = result.to_dict()
```
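
One possible use of these fields is to gate execution on the reported confidence and to log which strategies failed. The sketch below runs without qasql or an LLM: `Candidate` and `Result` are stand-in dataclasses that mirror only the fields shown above, and the `0.7` threshold and both helper functions are assumptions, not library behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Stand-in mirroring candidate.strategy / .success / .sql."""
    strategy: str
    success: bool
    sql: str


@dataclass
class Result:
    """Stand-in for the QueryResult fields used in this sketch."""
    sql: str
    confidence: float
    candidates: list[Candidate] = field(default_factory=list)


def should_execute(result: Result, min_confidence: float = 0.7) -> bool:
    """Run the SQL only if one was selected and its confidence clears the bar."""
    return bool(result.sql) and result.confidence >= min_confidence


def failed_strategies(result: Result) -> list[str]:
    """Names of the strategies whose candidate did not succeed."""
    return [c.strategy for c in result.candidates if not c.success]


demo = Result(
    sql="SELECT COUNT(*) FROM customers",
    confidence=0.82,
    candidates=[
        Candidate("full_schema", True, "SELECT COUNT(*) FROM customers"),
        Candidate("minimal_profile", False, "SELECT COUNT(*) FROM customer"),
    ],
)
print(should_execute(demo), failed_strategies(demo))
```

A real `QueryResult` could be passed to the same helpers as long as it exposes the attributes used here.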

---

## Configuration

### Config File (qasql.config.json)

```json
{
  "database": {
    "type": "sqlite",
    "uri": "./database.sqlite"
  },
  "llm": {
    "provider": "ollama",
    "model": "llama3.2",
    "base_url": "http://localhost:11434"
  },
  "options": {
    "readable_names": "mappings.json",
    "relevance_threshold": 0.5,
    "query_timeout": 30,
    "output_dir": "./output"
  }
}
```
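
A file with this layout can also be generated from a script, for example once per environment. The sketch below copies its keys from the example above; `write_config` is a hypothetical helper, and which other keys qasql accepts is not covered here.

```python
import json
from pathlib import Path


def write_config(path, db_uri, model="llama3.2"):
    """Write a qasql-style config file and return the dict that was written."""
    config = {
        "database": {"type": "sqlite", "uri": db_uri},
        "llm": {
            "provider": "ollama",
            "model": model,
            "base_url": "http://localhost:11434",
        },
        "options": {"query_timeout": 30, "output_dir": "./output"},
    }
    Path(path).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


if __name__ == "__main__":
    write_config("qasql.config.json", "./database.sqlite")
```

The written file can then be passed as `config_file="qasql.config.json"` in the way shown in the Python SDK section.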

### PostgreSQL Configuration

```json
{
  "database": {
    "type": "postgresql",
    "uri": "postgresql://user:password@localhost:5432/mydb"
  },
  "llm": {
    "provider": "ollama",
    "model": "llama3.2"
  }
}
```

### Environment Variables

```bash
export QASQL_DB_URI="sqlite:///database.sqlite"
export QASQL_DB_TYPE="sqlite"
export QASQL_LLM_PROVIDER="ollama"
export QASQL_LLM_MODEL="llama3.2"
export QASQL_OLLAMA_URL="http://localhost:11434"

# For cloud providers
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
```
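
For scripts that want to read the same variables, a lookup with fallbacks might look like the sketch below. The defaults are the ones listed under CLI Options; `settings_from_env` is a hypothetical helper, and how qasql itself prioritises environment variables against constructor arguments or the config file is not stated here, so this ordering is an assumption.

```python
import os

# Fallbacks taken from the defaults documented under CLI Options.
DEFAULTS = {
    "QASQL_LLM_PROVIDER": "ollama",
    "QASQL_LLM_MODEL": "llama3.2",
    "QASQL_OLLAMA_URL": "http://localhost:11434",
}


def settings_from_env(env=None):
    """Collect QASQL_* settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # No documented default for the database settings, so these may be None.
    settings["QASQL_DB_URI"] = env.get("QASQL_DB_URI")
    settings["QASQL_DB_TYPE"] = env.get("QASQL_DB_TYPE")
    return settings


if __name__ == "__main__":
    for name, value in settings_from_env().items():
        print(f"{name}={value}")
```

Accepting the mapping as a parameter keeps the helper easy to exercise without touching the real environment.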

### Readable Names Mapping

If your database has cryptic column names, provide a mapping file:

**JSON format:**
```json
{
  "tbl_cust_01": {
    "table_readable_name": "Customers",
    "columns": {
      "col_a": "Customer Name",
      "col_b": "Email Address",
      "col_c": "Registration Date"
    }
  },
  "tbl_ord_02": {
    "table_readable_name": "Orders",
    "columns": {
      "ord_id": "Order ID",
      "amt_val": "Order Amount"
    }
  }
}
```

**CSV format:**
```csv
table,column,readable_name
tbl_cust_01,col_a,Customer Name
tbl_cust_01,col_b,Email Address
tbl_ord_02,amt_val,Order Amount
```
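
The two formats carry the same column information, so one can be derived from the other. The sketch below converts the CSV layout into the nested JSON layout using only the standard library; `csv_to_mapping` is a hypothetical helper, and because the CSV has no column for `table_readable_name`, that key is simply left out.

```python
import csv
import io
import json


def csv_to_mapping(csv_text):
    """Group table,column,readable_name rows into {table: {"columns": {...}}}."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        table = mapping.setdefault(row["table"], {"columns": {}})
        table["columns"][row["column"]] = row["readable_name"]
    return mapping


SAMPLE = """table,column,readable_name
tbl_cust_01,col_a,Customer Name
tbl_cust_01,col_b,Email Address
tbl_ord_02,amt_val,Order Amount
"""

if __name__ == "__main__":
    print(json.dumps(csv_to_mapping(SAMPLE), indent=2))
```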

---

## How It Works

### Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                         QA-SQL SDK                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────┐    ┌──────────────┐    ┌──────────────────┐   │
│  │ Database │───▶│ QASQLEngine  │───▶│   LLM Provider   │   │
│  │ SQLite/  │    │              │    │ Ollama/Anthropic │   │
│  │ Postgres │◀───│              │◀───│     /OpenAI      │   │
│  └──────────┘    └──────────────┘    └──────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

### Two-Phase Flow

**Phase 1: Setup (One-time)**
```
Database → Schema Extraction → Column Descriptions → Ready
```

**Phase 2: Query (Runtime)**
```
Question → Schema Agent → Candidate Generation → Execution → Judge → SQL
           (Map-Reduce)   (4-5 strategies)       (retry)     (select best)
```
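
The execution and selection stages can be illustrated without an LLM. The sketch below runs hard-coded candidate queries against an in-memory SQLite database, records which ones execute, and picks the first that succeeds. That first-success rule is a deliberately simple stand-in for the LLM judge, the retry step is omitted, and none of the function names come from qasql.

```python
import sqlite3


def run_candidates(conn, candidates):
    """Execute each (strategy, sql) pair and record its rows or its error."""
    outcomes = []
    for strategy, sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
            outcomes.append({"strategy": strategy, "sql": sql,
                             "success": True, "rows": rows})
        except sqlite3.Error as exc:
            outcomes.append({"strategy": strategy, "sql": sql,
                             "success": False, "error": str(exc)})
    return outcomes


def pick_first_success(outcomes):
    """Stand-in for the judge: return the first candidate that executed."""
    return next((o for o in outcomes if o["success"]), None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Lin",)])

candidates = [
    ("minimal_profile", "SELECT COUNT(*) FROM customer"),   # wrong table name, fails
    ("full_schema", "SELECT COUNT(*) FROM customers"),
]
outcomes = run_candidates(conn, candidates)
best = pick_first_success(outcomes)
print(best["strategy"], best["rows"])
```

Keeping the failed outcomes alongside the successful ones mirrors the idea of reporting successful versus total candidates in the query result.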

### Candidate Generation Strategies

| Strategy | Description | With Hint | Without Hint |
|----------|-------------|-----------|--------------|
| full_schema | Complete database schema | ✓ | ✓ |
| sme_metadata | Schema + domain expert hints | ✓ | ✗ (skipped) |
| minimal_profile | Column names only | ✓ | ✓ |
| focused_schema | Relevant tables only | ✓ | ✓ |
| full_profile | Schema + descriptions | ✓ | ✓ |
| **Total** | | **5** | **4** |

When no hint is provided, the SME strategy is skipped since it requires domain knowledge.
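
The selection rule in the table can be written down in a few lines. This is a sketch of the rule as documented, not the package's code: `strategies_for` is a hypothetical name, and treating an empty or whitespace-only hint as "no hint" is an assumption.

```python
# Strategy names in the order they appear in the table above.
STRATEGIES = [
    "full_schema",
    "sme_metadata",
    "minimal_profile",
    "focused_schema",
    "full_profile",
]


def strategies_for(hint=None):
    """Return all five strategies with a hint, or four without (SME skipped)."""
    if hint and hint.strip():
        return list(STRATEGIES)
    return [name for name in STRATEGIES if name != "sme_metadata"]


if __name__ == "__main__":
    print(len(strategies_for("revenue = sum(amount)")), len(strategies_for()))
```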

---

## Examples

### Example 1: Simple Query

```python
from qasql import QASQLEngine

engine = QASQLEngine(db_uri="sqlite:///sales.sqlite")
engine.setup()

result = engine.query("How many orders were placed last month?")
print(result.sql)
# SELECT COUNT(*) FROM orders WHERE order_date >= date('now', '-1 month')
```

### Example 2: Query with Hint

```python
result = engine.query(
    question="What is the average order value by customer segment?",
    hint="order value = quantity * unit_price, segment is in customers table"
)
print(result.sql)
print(result.confidence)  # Higher confidence with hint
```

### Example 3: Execute and Display Results

```python
result = engine.query("List top 5 customers by total purchases")

if result.sql:
    rows, columns = engine.execute_sql(result.sql)

    # Print as table
    print(" | ".join(columns))
    print("-" * 50)
    for row in rows:
        print(" | ".join(str(v) for v in row))
```

### Example 4: Using Anthropic

```python
import os
os.environ["ANTHROPIC_API_KEY"] = "your-key"

engine = QASQLEngine(
    db_uri="sqlite:///mydb.sqlite",
    llm_provider="anthropic",
    llm_model="claude-sonnet-4-5-20250929"
)
```

---

## API Reference

### QASQLEngine

```python
class QASQLEngine:
    def __init__(
        self,
        db_uri: str = None,  # Database URI
        db_type: str = None,  # "sqlite" or "postgresql"
        llm_provider: str = "ollama",  # "ollama", "anthropic", "openai"
        llm_model: str = "llama3.2",  # Model name
        llm_base_url: str = "http://localhost:11434",
        readable_names: str = None,  # Path to mappings file
        output_dir: str = "./qasql_output",
        config_file: str = None,  # Path to config JSON
    ): ...

    def setup(self, force: bool = False) -> SetupResult: ...
    def query(self, question: str, hint: str = None) -> QueryResult: ...
    def execute_sql(self, sql: str) -> tuple[list, list]: ...
    def get_tables(self) -> list[str]: ...
    def get_schema(self) -> dict: ...
    def get_profile(self) -> dict: ...
```

### QueryResult

```python
@dataclass
class QueryResult:
    sql: str  # Generated SQL
    confidence: float  # 0.0 - 1.0
    question: str  # Original question
    hint: str | None  # Provided hint
    reasoning: str  # Selection reasoning
    candidates: list  # All candidates
    successful_candidates: int  # Count of successful
    total_candidates: int  # Total count
    metadata: dict  # Timings, etc.

    def to_dict(self) -> dict: ...
```

### SetupResult

```python
@dataclass
class SetupResult:
    success: bool
    database_name: str
    tables_found: int
    schema_path: str | None
    descriptions_path: str | None
    errors: list[str]
```

---

## Troubleshooting

### "command not found: qasql"

Use `python -m qasql` instead:
```bash
python -m qasql tables --db-uri sqlite:///mydb.sqlite
```

### "Cannot connect to Ollama"

Make sure Ollama is running:
```bash
# Terminal 1
ollama serve

# Terminal 2
ollama pull llama3.2
```
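
Before starting a long run it can be useful to confirm from Python that the server is reachable. The sketch below requests Ollama's `/api/tags` endpoint (which lists locally pulled models) using only the standard library; `ollama_is_up` is a hypothetical helper, and the 2-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the Ollama server answers /api/tags with HTTP 200."""
    url = f"{base_url.rstrip('/')}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```

A `False` result only says the endpoint did not answer; it does not distinguish a stopped server from a wrong `--ollama-url`.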

### "ANTHROPIC_API_KEY not found"

Set the environment variable:
```bash
export ANTHROPIC_API_KEY='your-key'
```

### "Database not found"

Check the path is correct:
```bash
# Use absolute path
python -m qasql tables --db-uri sqlite:////absolute/path/to/db.sqlite

# Or relative path
python -m qasql tables --db-uri sqlite:///./relative/path/db.sqlite
```
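
The four-slash form is easy to get wrong by hand, so building the URI from a path may be less error-prone. The sketch below resolves the path with `pathlib`; `sqlite_uri` is a hypothetical helper, and on Windows the result would look like `sqlite:///C:/...`, which qasql may or may not parse the same way.

```python
from pathlib import Path


def sqlite_uri(path):
    """Build a sqlite:/// URI from a file path, resolved to an absolute path."""
    absolute = Path(path).expanduser().resolve()
    # On POSIX the absolute path starts with "/", giving four slashes in total.
    return "sqlite:///" + absolute.as_posix()


if __name__ == "__main__":
    uri = sqlite_uri("./relative/path/db.sqlite")
    print(uri, "(file exists)" if Path(uri[len("sqlite:///"):]).exists() else "(file missing)")
```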

### "No module named 'qasql'"

Install the package:
```bash
cd qasql-sdk
pip install -e .
```

---

## Data Privacy

| Provider | Data Location | Recommendation |
|----------|---------------|----------------|
| **Ollama** | 100% Local | Enterprise / Sensitive data |
| Anthropic | Cloud (Anthropic servers) | Development / Non-sensitive |
| OpenAI | Cloud (OpenAI servers) | Development / Non-sensitive |

**With Ollama, zero data leaves your network.**

---

## License

MIT License - see [LICENSE](LICENSE) for details.

---

## Support

- Issues: [GitHub Issues](https://github.com/your-org/qasql/issues)
- Documentation: [GitHub Wiki](https://github.com/your-org/qasql/wiki)