@memberjunction/query-gen 0.0.1 → 2.126.1
- package/.turbo/turbo-build.log +4 -0
- package/CHANGELOG.md +34 -0
- package/COORDINATOR.md +768 -0
- package/IMPLEMENTATION_PLAN.md +1753 -0
- package/LLM_ENTITY_GROUPING_PLAN.md +977 -0
- package/README.md +675 -29
- package/dist/cli/commands/export.d.ts +15 -0
- package/dist/cli/commands/export.d.ts.map +1 -0
- package/dist/cli/commands/export.js +178 -0
- package/dist/cli/commands/export.js.map +1 -0
- package/dist/cli/commands/generate.d.ts +19 -0
- package/dist/cli/commands/generate.d.ts.map +1 -0
- package/dist/cli/commands/generate.js +282 -0
- package/dist/cli/commands/generate.js.map +1 -0
- package/dist/cli/commands/validate.d.ts +17 -0
- package/dist/cli/commands/validate.d.ts.map +1 -0
- package/dist/cli/commands/validate.js +193 -0
- package/dist/cli/commands/validate.js.map +1 -0
- package/dist/cli/config.d.ts +51 -0
- package/dist/cli/config.d.ts.map +1 -0
- package/dist/cli/config.js +142 -0
- package/dist/cli/config.js.map +1 -0
- package/dist/cli/index.d.ts +13 -0
- package/dist/cli/index.d.ts.map +1 -0
- package/dist/cli/index.js +57 -0
- package/dist/cli/index.js.map +1 -0
- package/dist/core/EntityGrouper.d.ts +74 -0
- package/dist/core/EntityGrouper.d.ts.map +1 -0
- package/dist/core/EntityGrouper.js +246 -0
- package/dist/core/EntityGrouper.js.map +1 -0
- package/dist/core/MetadataExporter.d.ts +59 -0
- package/dist/core/MetadataExporter.d.ts.map +1 -0
- package/dist/core/MetadataExporter.js +151 -0
- package/dist/core/MetadataExporter.js.map +1 -0
- package/dist/core/QueryDatabaseWriter.d.ts +50 -0
- package/dist/core/QueryDatabaseWriter.d.ts.map +1 -0
- package/dist/core/QueryDatabaseWriter.js +152 -0
- package/dist/core/QueryDatabaseWriter.js.map +1 -0
- package/dist/core/QueryFixer.d.ts +48 -0
- package/dist/core/QueryFixer.d.ts.map +1 -0
- package/dist/core/QueryFixer.js +115 -0
- package/dist/core/QueryFixer.js.map +1 -0
- package/dist/core/QueryRefiner.d.ts +94 -0
- package/dist/core/QueryRefiner.d.ts.map +1 -0
- package/dist/core/QueryRefiner.js +267 -0
- package/dist/core/QueryRefiner.js.map +1 -0
- package/dist/core/QueryTester.d.ts +70 -0
- package/dist/core/QueryTester.d.ts.map +1 -0
- package/dist/core/QueryTester.js +243 -0
- package/dist/core/QueryTester.js.map +1 -0
- package/dist/core/QueryWriter.d.ts +57 -0
- package/dist/core/QueryWriter.d.ts.map +1 -0
- package/dist/core/QueryWriter.js +184 -0
- package/dist/core/QueryWriter.js.map +1 -0
- package/dist/core/QuestionGenerator.d.ts +58 -0
- package/dist/core/QuestionGenerator.d.ts.map +1 -0
- package/dist/core/QuestionGenerator.js +145 -0
- package/dist/core/QuestionGenerator.js.map +1 -0
- package/dist/data/schema.d.ts +230 -0
- package/dist/data/schema.d.ts.map +1 -0
- package/dist/data/schema.js +6 -0
- package/dist/data/schema.js.map +1 -0
- package/dist/index.d.ts +28 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +77 -0
- package/dist/index.js.map +1 -0
- package/dist/prompts/PromptNames.d.ts +32 -0
- package/dist/prompts/PromptNames.d.ts.map +1 -0
- package/dist/prompts/PromptNames.js +35 -0
- package/dist/prompts/PromptNames.js.map +1 -0
- package/dist/utils/category-builder.d.ts +28 -0
- package/dist/utils/category-builder.d.ts.map +1 -0
- package/dist/utils/category-builder.js +90 -0
- package/dist/utils/category-builder.js.map +1 -0
- package/dist/utils/entity-helpers.d.ts +49 -0
- package/dist/utils/entity-helpers.d.ts.map +1 -0
- package/dist/utils/entity-helpers.js +189 -0
- package/dist/utils/entity-helpers.js.map +1 -0
- package/dist/utils/error-handlers.d.ts +19 -0
- package/dist/utils/error-handlers.d.ts.map +1 -0
- package/dist/utils/error-handlers.js +41 -0
- package/dist/utils/error-handlers.js.map +1 -0
- package/dist/utils/graph-helpers.d.ts +51 -0
- package/dist/utils/graph-helpers.d.ts.map +1 -0
- package/dist/utils/graph-helpers.js +82 -0
- package/dist/utils/graph-helpers.js.map +1 -0
- package/dist/utils/prompt-helpers.d.ts +25 -0
- package/dist/utils/prompt-helpers.d.ts.map +1 -0
- package/dist/utils/prompt-helpers.js +66 -0
- package/dist/utils/prompt-helpers.js.map +1 -0
- package/dist/utils/query-helpers.d.ts +23 -0
- package/dist/utils/query-helpers.d.ts.map +1 -0
- package/dist/utils/query-helpers.js +34 -0
- package/dist/utils/query-helpers.js.map +1 -0
- package/dist/utils/user-helpers.d.ts +15 -0
- package/dist/utils/user-helpers.d.ts.map +1 -0
- package/dist/utils/user-helpers.js +32 -0
- package/dist/utils/user-helpers.js.map +1 -0
- package/dist/vectors/EmbeddingService.d.ts +58 -0
- package/dist/vectors/EmbeddingService.d.ts.map +1 -0
- package/dist/vectors/EmbeddingService.js +90 -0
- package/dist/vectors/EmbeddingService.js.map +1 -0
- package/dist/vectors/SimilaritySearch.d.ts +51 -0
- package/dist/vectors/SimilaritySearch.d.ts.map +1 -0
- package/dist/vectors/SimilaritySearch.js +85 -0
- package/dist/vectors/SimilaritySearch.js.map +1 -0
- package/docs/API.md +1040 -0
- package/docs/ARCHITECTURE.md +1120 -0
- package/examples/advanced-usage.ts +401 -0
- package/examples/basic-usage.ts +285 -0
- package/package.json +48 -6
- package/src/cli/commands/export.ts +173 -0
- package/src/cli/commands/generate.ts +330 -0
- package/src/cli/commands/validate.ts +185 -0
- package/src/cli/config.ts +203 -0
- package/src/cli/index.ts +63 -0
- package/src/core/EntityGrouper.ts +318 -0
- package/src/core/MetadataExporter.ts +148 -0
- package/src/core/QueryDatabaseWriter.ts +187 -0
- package/src/core/QueryFixer.ts +153 -0
- package/src/core/QueryRefiner.ts +382 -0
- package/src/core/QueryTester.ts +264 -0
- package/src/core/QueryWriter.ts +239 -0
- package/src/core/QuestionGenerator.ts +199 -0
- package/src/data/golden-queries.json +1371 -0
- package/src/data/schema.ts +252 -0
- package/src/index.ts +49 -0
- package/src/prompts/PromptNames.ts +36 -0
- package/src/utils/category-builder.ts +97 -0
- package/src/utils/entity-helpers.ts +203 -0
- package/src/utils/error-handlers.ts +41 -0
- package/src/utils/graph-helpers.ts +99 -0
- package/src/utils/prompt-helpers.ts +79 -0
- package/src/utils/query-helpers.ts +32 -0
- package/src/utils/user-helpers.ts +39 -0
- package/src/vectors/EmbeddingService.ts +109 -0
- package/src/vectors/SimilaritySearch.ts +108 -0
- package/tsconfig.json +39 -0
@@ -0,0 +1,1753 @@
# Query Generation Package Implementation Plan

## Package Overview
**Package Name**: `@memberjunction/query-gen`
**Purpose**: AI-powered generation of domain-specific SQL query templates with automatic testing, refinement, and metadata export
**CLI Command**: `mj querygen`

---

## Phase 1: Project Setup & Infrastructure (Week 1)

### 1.1 Package Structure Creation
```
packages/QueryGen/
├── src/
│   ├── cli/
│   │   ├── commands/
│   │   │   ├── generate.ts          # Main generation command
│   │   │   ├── validate.ts          # Query validation command
│   │   │   └── export.ts            # Metadata export command
│   │   ├── config.ts                # Configuration loader
│   │   └── index.ts                 # CLI entry point
│   ├── core/
│   │   ├── EntityGrouper.ts         # Entity relationship analysis
│   │   ├── QuestionGenerator.ts     # Business question generation
│   │   ├── QueryWriter.ts           # SQL template generation
│   │   ├── QueryTester.ts           # Query execution & validation
│   │   ├── QueryRefiner.ts          # Query refinement logic
│   │   └── MetadataExporter.ts      # MJ metadata file export
│   ├── prompts/
│   │   └── PromptNames.ts           # Static prompt name constants
│   ├── vectors/
│   │   └── SimilaritySearch.ts      # Weighted similarity logic for golden queries
│   ├── data/
│   │   ├── golden-queries.json      # 20 example queries
│   │   └── schema.ts                # Type definitions
│   ├── utils/
│   │   ├── sql-helpers.ts           # SQL parsing utilities
│   │   └── error-handlers.ts        # Error handling helpers
│   └── index.ts
├── package.json
├── tsconfig.json
└── README.md
```

### 1.2 Configuration System
**Decision**: Integrate with `mj.config.cjs` for consistency with existing MJ packages

```typescript
// Add to mj.config.cjs
queryGeneration: {
  // Entity Filtering
  includeEntities: ['*'],  // Default: all entities
  excludeEntities: [],  // Default: none
  excludeSchemas: ['__mj'],  // Default: exclude MJ core schema

  // Entity Grouping
  maxEntitiesPerGroup: 3,  // Default: 3 related entities
  minEntitiesPerGroup: 1,  // Default: 1 (single entity queries)
  questionsPerGroup: 2,  // Default: 1-2 questions per group
  entityGroupStrategy: 'breadth',  // 'breadth' | 'depth' - prefer breadth-first grouping

  // AI Configuration
  modelOverride: undefined,  // Optional: prefer specific model
  vendorOverride: undefined,  // Optional: prefer specific vendor
  embeddingModel: 'all-MiniLM-L6-v2',  // Local embedding model

  // Iteration Limits
  maxRefinementIterations: 3,  // Default: 3 refinement cycles
  maxFixingIterations: 5,  // Default: 5 error-fixing attempts

  // Few-Shot Learning
  topSimilarQueries: 5,  // Default: top 5 example queries
  similarityThreshold: 0.7,  // Similarity threshold (still returns topN even if below threshold)

  // Similarity Weighting
  similarityWeights: {
    name: 0.1,  // 10% weight for name similarity
    userQuestion: 0.2,  // 20% weight for user question similarity
    description: 0.35,  // 35% weight for description similarity
    technicalDescription: 0.35  // 35% weight for technical description similarity
  },

  // Output Configuration
  outputMode: 'metadata',  // 'metadata' | 'database' | 'both'
  outputDirectory: './metadata/queries',

  // Performance
  parallelGenerations: 3,  // Generate 3 queries in parallel
  enableCaching: true,  // Cache prompt results

  // Validation
  testWithSampleData: true,  // Test queries before export
  requireMinRows: 1,  // Queries must return at least 1 row
  maxRefinementRows: 10,  // Maximum rows to use for refinement evaluation

  // Verbose Logging
  verbose: false
}
```
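
The configuration above implies two invariants that are easy to break when overriding values: the four similarity weights should sum to 1, and `minEntitiesPerGroup` should not exceed `maxEntitiesPerGroup`. A minimal sketch of a loader that merges overrides onto the defaults and checks both is shown below. The names `QueryGenConfig`, `ConfigOverrides`, and `resolveConfig` are hypothetical and cover only a subset of the settings; this is not the package's actual `config.ts`.

```typescript
// Hypothetical types for illustration; only a subset of the settings is modelled.
interface SimilarityWeights {
  name: number;
  userQuestion: number;
  description: number;
  technicalDescription: number;
}

interface QueryGenConfig {
  maxEntitiesPerGroup: number;
  minEntitiesPerGroup: number;
  topSimilarQueries: number;
  similarityWeights: SimilarityWeights;
}

type ConfigOverrides = Partial<Omit<QueryGenConfig, 'similarityWeights'>> & {
  similarityWeights?: Partial<SimilarityWeights>;
};

// Defaults copied from the configuration block above.
const DEFAULT_CONFIG: QueryGenConfig = {
  maxEntitiesPerGroup: 3,
  minEntitiesPerGroup: 1,
  topSimilarQueries: 5,
  similarityWeights: { name: 0.1, userQuestion: 0.2, description: 0.35, technicalDescription: 0.35 },
};

// Merge user overrides onto the defaults, then validate the result.
function resolveConfig(overrides: ConfigOverrides = {}): QueryGenConfig {
  const config: QueryGenConfig = {
    ...DEFAULT_CONFIG,
    ...overrides,
    similarityWeights: { ...DEFAULT_CONFIG.similarityWeights, ...overrides.similarityWeights },
  };

  const w = config.similarityWeights;
  const weightSum = w.name + w.userQuestion + w.description + w.technicalDescription;
  if (Math.abs(weightSum - 1) > 1e-9) {
    throw new Error(`similarityWeights must sum to 1, got ${weightSum}`);
  }
  if (config.minEntitiesPerGroup < 1 || config.minEntitiesPerGroup > config.maxEntitiesPerGroup) {
    throw new Error('minEntitiesPerGroup must be between 1 and maxEntitiesPerGroup');
  }
  return config;
}
```

Calling `resolveConfig({ maxEntitiesPerGroup: 2 })` keeps every other default, while overriding a single weight without rebalancing the others fails fast instead of silently skewing the similarity ranking.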

### 1.3 Golden Queries Data Structure
**Decision**: Embed as JSON file in `src/data/` directory (distributed with npm package)

```typescript
// src/data/golden-queries.json structure
[
  {
    "name": "Customer Orders Summary",
    "userQuestion": "Show me a summary of customer orders by region",
    "description": "Aggregates order data by customer region with totals",
    "technicalDescription": "Groups orders by customer region, calculates total orders and revenue per region",
    "sql": "SELECT ...",
    "parameters": [...],
    "selectClause": [...]
  },
  // ... 19 more queries
]

// Note: Embeddings for each field will be generated at runtime using AIEngine
// We don't pre-compute embeddings - they're computed on-demand during CLI execution
// This allows flexibility if embedding models change
```

### 1.4 Dependencies
```json
{
  "dependencies": {
    "@memberjunction/core": "workspace:*",
    "@memberjunction/core-entities": "workspace:*",
    "@memberjunction/ai": "workspace:*",
    "@memberjunction/ai-engine": "workspace:*",
    "@memberjunction/ai-prompts": "workspace:*",
    "@memberjunction/ai-vectors-memory": "workspace:*",
    "@memberjunction/sql-server-dataprovider": "workspace:*",
    "commander": "^11.0.0",
    "chalk": "^5.3.0",
    "ora": "^7.0.0",
    "nunjucks": "^3.2.4"
  }
}
```

---

## Phase 2: Entity Analysis & Grouping (Week 2)

### 2.1 EntityGrouper Implementation
**Purpose**: Create logical groups of 1-N related entities for query generation

**Key Features**:
- Load all entities from Metadata (respecting include/exclude filters)
- Analyze foreign key relationships to identify related entities
- Generate all valid combinations of 1-N entities
- Ensure no duplicate entity groups
- Allow same entity in multiple groups (different combinations)

**Algorithm**:
```typescript
class EntityGrouper {
  async generateEntityGroups(
    entities: EntityInfo[],
    minSize: number,
    maxSize: number
  ): Promise<EntityGroup[]> {
    // 1. Build relationship graph from foreign keys
    // 2. For each entity, find all connected entities using BREADTH-FIRST traversal
    //    - Prefer entities with direct relationships (1 hop away)
    //    - Then add entities 2 hops away, etc.
    //    - This creates more focused, practical entity groups
    // 3. Generate combinations of size minSize to maxSize
    // 4. Deduplicate groups (same entities = same group)
    // 5. Return unique groups with relationship metadata
  }
}

interface EntityGroup {
  entities: EntityInfo[];
  relationships: RelationshipInfo[];
  primaryEntity: EntityInfo;  // The "main" entity
  relationshipType: 'single' | 'parent-child' | 'many-to-many';
}
```

**Output Example**:
```typescript
[
  {
    entities: [CustomersEntity],
    relationships: [],
    primaryEntity: CustomersEntity,
    relationshipType: 'single'
  },
  {
    entities: [CustomersEntity, OrdersEntity],
    relationships: [{ from: 'Orders', to: 'Customers', via: 'CustomerID' }],
    primaryEntity: OrdersEntity,
    relationshipType: 'parent-child'
  },
  {
    entities: [CustomersEntity, OrdersEntity, OrderDetailsEntity],
    relationships: [...],
    primaryEntity: OrdersEntity,
    relationshipType: 'parent-child'
  }
]
```
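
The grouping steps can be sketched on a toy schema. This is a simplified illustration under stated assumptions, not the `EntityGrouper` implementation: entities are plain strings, foreign keys are treated as undirected edges, and only breadth-first prefixes are emitted rather than every valid combination. The helper names (`buildGraph`, `bfsNeighborhood`, `generateGroups`) are hypothetical.

```typescript
// Hypothetical, simplified shapes: entity names only, foreign keys as edges.
type ForeignKey = { from: string; to: string };

// Step 1: build an undirected adjacency map from the foreign keys.
function buildGraph(entities: string[], foreignKeys: ForeignKey[]): Map<string, Set<string>> {
  const graph = new Map<string, Set<string>>();
  for (const entity of entities) graph.set(entity, new Set<string>());
  for (const { from, to } of foreignKeys) {
    graph.get(from)?.add(to);
    graph.get(to)?.add(from);
  }
  return graph;
}

// Step 2: breadth-first order from a start entity (1 hop first, then 2 hops, ...),
// stopping once maxSize entities have been collected.
function bfsNeighborhood(graph: Map<string, Set<string>>, start: string, maxSize: number): string[] {
  const visited: string[] = [start];
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0 && visited.length < maxSize) {
    const current = queue.shift()!;
    const neighbors = Array.from(graph.get(current) ?? []).sort();
    for (const next of neighbors) {
      if (seen.has(next)) continue;
      seen.add(next);
      visited.push(next);
      queue.push(next);
      if (visited.length >= maxSize) break;
    }
  }
  return visited;
}

// Steps 3-5: every BFS prefix is a connected group; deduplicate by sorted member names.
function generateGroups(
  entities: string[],
  foreignKeys: ForeignKey[],
  minSize: number,
  maxSize: number
): string[][] {
  const graph = buildGraph(entities, foreignKeys);
  const unique = new Map<string, string[]>();
  for (const entity of entities) {
    const ordered = bfsNeighborhood(graph, entity, maxSize);
    for (let size = minSize; size <= ordered.length; size++) {
      const group = ordered.slice(0, size);
      const key = group.slice().sort().join('|');
      if (!unique.has(key)) unique.set(key, group);
    }
  }
  return Array.from(unique.values());
}

const exampleGroups = generateGroups(
  ['Customers', 'Orders', 'OrderDetails'],
  [{ from: 'Orders', to: 'Customers' }, { from: 'OrderDetails', to: 'Orders' }],
  1,
  3
);
```

Because each prefix of a breadth-first traversal is connected, unrelated pairs such as Customers and OrderDetails never form a group on their own; they only appear together when Orders links them.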

### 2.2 Metadata Preparation
**Purpose**: Format entity metadata for AI prompts

**⚠️ CRITICAL**: Entity metadata MUST include SchemaName and BaseView for functional SQL generation

**Data Structure**:
```typescript
interface EntityMetadataForPrompt {
  entityName: string;
  description: string;
  schemaName: string;  // REQUIRED: e.g., "dbo", "sales", "hr"
  baseTable: string;
  baseView: string;  // REQUIRED: e.g., "vwCustomers", "vwOrders"
  fields: {
    name: string;
    displayName: string;
    type: string;
    description: string;
    isPrimaryKey: boolean;
    isForeignKey: boolean;
    relatedEntity?: string;
    isRequired: boolean;
    defaultValue?: string;
  }[];
  relationships: {
    type: 'one-to-many' | 'many-to-one' | 'many-to-many';
    relatedEntity: string;
    relatedEntityView: string;  // Include view name for joins
    relatedEntitySchema: string;  // Include schema for joins
    foreignKeyField: string;
    description: string;
  }[];
}

// Example formatted metadata:
{
  entityName: "Customers",
  description: "Customer information and contact details",
  schemaName: "dbo",
  baseTable: "Customer",
  baseView: "vwCustomers",  // Query FROM [dbo].[vwCustomers]
  fields: [...],
  relationships: [
    {
      type: "one-to-many",
      relatedEntity: "Orders",
      relatedEntityView: "vwOrders",
      relatedEntitySchema: "sales",
      foreignKeyField: "CustomerID",
      description: "Customer orders"
    }
  ]
}
```

**Why This Matters**:
- SQL queries MUST reference `[SchemaName].[BaseView]` to work
- Without schema: Query fails with "Invalid object name"
- Without BaseView: Query uses base table instead of view (missing computed fields)
- Relationships need schema/view info for proper JOINs

**Example Query Fragment**:
```sql
-- ✅ CORRECT: Includes schema and uses view
SELECT c.Name, c.Email, COUNT(o.ID) as OrderCount
FROM [dbo].[vwCustomers] c
LEFT JOIN [sales].[vwOrders] o ON o.CustomerID = c.ID
GROUP BY c.Name, c.Email

-- ❌ WRONG: Missing schema or using table name
SELECT c.Name, c.Email
FROM Customers c  -- Will fail with "Invalid object name"!
```
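
Since every generated query depends on the `[SchemaName].[BaseView]` form, it may help to build that reference in one place rather than concatenating strings throughout the code. The sketch below assumes hypothetical helpers `quoteIdentifier` and `qualifiedView`; it follows the SQL Server convention of doubling a closing bracket inside a bracket-quoted identifier.

```typescript
// Hypothetical helper types; only the two fields needed for a view reference.
interface ViewRef {
  schemaName: string;
  baseView: string;
}

// Bracket-quote a SQL Server identifier, doubling any closing bracket.
function quoteIdentifier(name: string): string {
  return `[${name.replace(/\]/g, ']]')}]`;
}

// Build the schema-qualified view reference; both parts are mandatory.
function qualifiedView(entity: ViewRef): string {
  if (!entity.schemaName || !entity.baseView) {
    throw new Error('schemaName and baseView are both required to reference a view');
  }
  return `${quoteIdentifier(entity.schemaName)}.${quoteIdentifier(entity.baseView)}`;
}

// Produces the reference used in the correct query fragment above.
const customersView = qualifiedView({ schemaName: 'dbo', baseView: 'vwCustomers' });
```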

---

## Phase 3: Business Question Generation (Week 2-3)

**⚠️ IMPORTANT: Use Nunjucks Templates for All Prompts**

All AI prompts in this package MUST use Nunjucks template syntax to format data for readability:
- ✅ Use `{% for %}` loops to iterate over arrays
- ✅ Use `{{ variable }}` for simple values
- ✅ Use conditional logic with `{% if %}`
- ✅ Format structured data as markdown (not JSON dumps)
- ❌ AVOID `{{ data | json }}` - This makes prompts harder for LLMs to read
- ✅ PREFER structured markdown with loops and conditionals

**Why**: Structured markdown is much easier for LLMs to parse than raw JSON, leading to better AI responses.

### 3.1 QuestionGenerator Implementation
**Purpose**: Generate 1-2 domain-specific business questions per entity group

**AI Prompt**: `metadata/prompts/templates/query-gen/business-question-generator.template.md`

**Prompt Content**:
```markdown
# Business Question Generator

You are an expert data analyst helping to generate meaningful business questions that can be answered with SQL queries.

## Entity Group Context

{% for entity in entityGroupMetadata %}
### Entity: {{ entity.entityName }}
- **Schema**: {{ entity.schemaName }}
- **View**: {{ entity.baseView }}
- **Description**: {{ entity.description }}

**Fields**:
{% for field in entity.fields %}
- `{{ field.name }}` ({{ field.type }}){% if field.description %} - {{ field.description }}{% endif %}{% if field.isPrimaryKey %} [PRIMARY KEY]{% endif %}{% if field.isForeignKey %} [FK to {{ field.relatedEntity }}]{% endif %}
{% endfor %}

{% if entity.relationships.length > 0 %}
**Relationships**:
{% for rel in entity.relationships %}
- {{ rel.type }}: {{ rel.relatedEntity }} via `{{ rel.foreignKeyField }}`{% if rel.description %} - {{ rel.description }}{% endif %}
{% endfor %}
{% endif %}

---
{% endfor %}

## Instructions
Generate 1-2 realistic business questions that:
1. Use the available entities and their relationships
2. Are answerable with the data in these tables
3. Are practical questions a business user would ask
4. Vary in complexity (simple aggregations vs. complex joins)
5. Leverage entity descriptions to understand domain context

## Output Format
Return a JSON object with a `questions` array:
```json
{
  "questions": [
    {
      "userQuestion": "What are the top 5 customers by order volume?",
      "description": "Identify customers with the most orders",
      "technicalDescription": "Count orders per customer, sort descending, limit 5",
      "complexity": "simple",
      "requiresAggregation": true,
      "requiresJoins": true,
      "entities": ["Customers", "Orders"]
    }
  ]
}
```
```

**AI Prompt Configuration** (`.prompts.json`):
```json
{
  "fields": {
    "Name": "Business Question Generator",
    "Description": "Generates domain-specific business questions for entity groups",
    "TypeID": "@lookup:AI Prompt Types.Name=Chat",
    "TemplateText": "@file:templates/query-gen/business-question-generator.template.md",
    "Status": "Active",
    "ResponseFormat": "JSON",
    "SelectionStrategy": "Specific",
    "PowerPreference": "Highest",
    "ParallelizationMode": "None",
    "OutputType": "object",
    "ValidationBehavior": "Strict",
    "MaxRetries": 3,
    "FailoverMaxAttempts": 5,
    "PromptRole": "System",
    "PromptPosition": "First",
    "CategoryID": "@lookup:AI Prompt Categories.Name=Query Generation?create&Description=Prompts for QueryGen system"
  },
  "relatedEntities": {
    "MJ: AI Prompt Models": [
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Claude 4.5 Sonnet", "VendorID": "@lookup:MJ: AI Vendors.Name=Anthropic", "Priority": 1 } },
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 2 } },
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Cerebras", "Priority": 3 } },
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Gemini 2.5 Flash", "VendorID": "@lookup:MJ: AI Vendors.Name=Google", "Priority": 4 } },
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT-OSS-120B", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 5 } },
      { "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT 5-nano", "VendorID": "@lookup:MJ: AI Vendors.Name=OpenAI", "Priority": 6 } }
    ]
  }
}
```

**Using AIEngine for Prompts**:
```typescript
// Static prompt name constant
export const PROMPT_BUSINESS_QUESTION_GENERATOR = 'Business Question Generator';

// In QuestionGenerator class:
class QuestionGenerator {
  async generateQuestions(entityGroup: EntityGroup): Promise<BusinessQuestion[]> {
    // 1. Ensure AIEngine is configured
    const aiEngine = AIEngine.Instance;
    await aiEngine.Config(false, this.contextUser);

    // 2. Find the prompt by name (AIEngine caches all prompts)
    const prompt = aiEngine.Prompts.find(p => p.Name === PROMPT_BUSINESS_QUESTION_GENERATOR);
    if (!prompt) {
      throw new Error(`Prompt '${PROMPT_BUSINESS_QUESTION_GENERATOR}' not found`);
    }

    // 3. Use AIPromptRunner to execute
    const promptRunner = new AIPromptRunner();
    const result = await promptRunner.ExecutePrompt({
      prompt,
      data: { entityGroupMetadata: formatEntityGroupForPrompt(entityGroup) },
      contextUser: this.contextUser
    });

    return result.result.questions;
  }
}
```

**Note**: AIEngine loads all prompts during Config() - no need for separate PromptManager or caching.

### 3.2 Question Validation
**Purpose**: Filter out low-quality or unanswerable questions

**Validation Criteria**:
- Question must reference entities in the group
- Question should be specific enough to generate a query
- Avoid overly generic questions ("Show me all data")
- Prefer questions with measurable outcomes
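
The first three criteria lend themselves to simple rule-based checks before any query is generated. The sketch below is one possible filter under assumed heuristics (a minimum word count and a small list of generic phrasings); the `CandidateQuestion` shape and `validateQuestion` name are hypothetical, and the "measurable outcomes" preference is not modelled because it is a ranking concern rather than a pass/fail rule.

```typescript
// Hypothetical shape mirroring the fields the question prompt returns.
interface CandidateQuestion {
  userQuestion: string;
  entities: string[];
}

// Assumed heuristic: phrasings that ask for everything without any focus.
const GENERIC_PATTERNS: RegExp[] = [
  /^show (me )?(all|everything)\b/i,
  /^list (all|everything)\b/i,
];

// Returns a list of problems; an empty list means the question passes.
function validateQuestion(question: CandidateQuestion, groupEntityNames: string[]): string[] {
  const problems: string[] = [];
  const allowed = new Set<string>(groupEntityNames);
  const text = question.userQuestion.trim();

  if (question.entities.length === 0) {
    problems.push('references no entities');
  }
  const unknown = question.entities.filter(name => !allowed.has(name));
  if (unknown.length > 0) {
    problems.push(`references entities outside the group: ${unknown.join(', ')}`);
  }
  // Assumed threshold: fewer than four words is too vague to drive SQL generation.
  if (text.split(/\s+/).length < 4) {
    problems.push('too short to be specific');
  }
  if (GENERIC_PATTERNS.some(pattern => pattern.test(text))) {
    problems.push('overly generic');
  }
  return problems;
}

const groupNames = ['Customers', 'Orders'];
const acceptedProblems = validateQuestion(
  { userQuestion: 'What are the top 5 customers by order volume?', entities: ['Customers', 'Orders'] },
  groupNames
);
const rejectedProblems = validateQuestion(
  { userQuestion: 'Show me all data', entities: ['Customers'] },
  groupNames
);
```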

---

## Phase 4: Vector Similarity Search (Week 3)

### 4.1 Using AIEngine for Embeddings
**Purpose**: Use MemberJunction's AIEngine for all embedding operations

**Key Features**:
```typescript
// Use AIEngine.Instance.EmbedTextLocal() for all embeddings
// AIEngine is already configured with local embedding models
// No need for a separate EmbeddingService wrapper

// Example usage:
const aiEngine = AIEngine.Instance;
await aiEngine.Config(false, contextUser);

// Embed each query field separately
const nameEmbedding = await aiEngine.EmbedTextLocal(query.name);
const questionEmbedding = await aiEngine.EmbedTextLocal(query.userQuestion);
const descEmbedding = await aiEngine.EmbedTextLocal(query.description);
const techDescEmbedding = await aiEngine.EmbedTextLocal(query.technicalDescription);
```

**Note**: We embed each field separately (name, userQuestion, description, technicalDescription) for weighted similarity scoring, not as a concatenated string.
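
Because embeddings are computed on demand and the same text can recur across fields and queries, a small cache keyed by text avoids redundant embedding calls. The sketch below is an assumption-laden illustration: the embedding function is injected as a plain `(text) => Promise<number[]>` callback (`EmbedFn`, hypothetical), since the return shape of `EmbedTextLocal` is not shown here; a caller would wrap the AIEngine call to fit that signature.

```typescript
// Hypothetical types; the four fields match the similarity weights in the config.
type FieldName = 'name' | 'userQuestion' | 'description' | 'technicalDescription';
type FieldEmbeddings = Record<FieldName, number[]>;
type EmbedFn = (text: string) => Promise<number[]>;

const FIELDS: FieldName[] = ['name', 'userQuestion', 'description', 'technicalDescription'];

// Embed each field separately, reusing cached vectors for identical text.
async function embedQueryFields(
  query: Record<FieldName, string>,
  embed: EmbedFn,
  cache: Map<string, number[]> = new Map<string, number[]>()
): Promise<FieldEmbeddings> {
  const result = {} as FieldEmbeddings;
  for (const field of FIELDS) {
    const text = query[field];
    let vector = cache.get(text);
    if (!vector) {
      vector = await embed(text);
      cache.set(text, vector);
    }
    result[field] = vector;
  }
  return result;
}
```

Passing the same `cache` instance across all golden queries and the generated question keeps the number of embedding calls proportional to the number of distinct strings.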

### 4.2 Weighted Similarity Search Implementation
**Purpose**: Find top-K most similar golden queries using weighted field similarity

**Algorithm**: Weighted cosine similarity across multiple fields

```typescript
class SimilaritySearch {
  private weights = {
    name: 0.1,
    userQuestion: 0.2,
    description: 0.35,
    technicalDescription: 0.35
  };

  async findSimilarQueries(
    queryEmbeddings: {
      name: number[],
      userQuestion: number[],
      description: number[],
      technicalDescription: number[]
    },
    goldenEmbeddings: Array<{
      query: GoldenQuery,
      embeddings: {
        name: number[],
        userQuestion: number[],
        description: number[],
        technicalDescription: number[]
      }
    }>,
    topK: number = 5
  ): Promise<SimilarQuery[]> {
    // 1. Calculate weighted similarity for each golden query
    const similarities = goldenEmbeddings.map(golden => {
      // Calculate cosine similarity for each field
      const nameSim = this.cosineSimilarity(
        queryEmbeddings.name,
        golden.embeddings.name
      );
      const userQuestionSim = this.cosineSimilarity(
        queryEmbeddings.userQuestion,
        golden.embeddings.userQuestion
      );
      const descSim = this.cosineSimilarity(
        queryEmbeddings.description,
        golden.embeddings.description
      );
      const techDescSim = this.cosineSimilarity(
        queryEmbeddings.technicalDescription,
        golden.embeddings.technicalDescription
      );

      // Calculate weighted sum
      const weightedScore =
        (nameSim * this.weights.name) +
        (userQuestionSim * this.weights.userQuestion) +
        (descSim * this.weights.description) +
        (techDescSim * this.weights.technicalDescription);

      return {
        query: golden.query,
        similarity: weightedScore,
        fieldScores: { nameSim, userQuestionSim, descSim, techDescSim }
      };
    });

    // 2. Sort descending by weighted similarity
    const sorted = similarities.sort((a, b) => b.similarity - a.similarity);

    // 3. Return top K (ALWAYS return topK results, even if below threshold)
    // Threshold is informational only - we still use the best matches available
    return sorted.slice(0, topK);
  }

  private cosineSimilarity(a: number[], b: number[]): number {
    // Use SimpleVectorService.CosineSimilarity() from @memberjunction/ai-vectors-memory
    // Or implement: dot product / (magnitude(a) * magnitude(b))
    const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
    const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
    const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
    // Guard against zero-magnitude vectors, which would otherwise yield NaN
    const denominator = magA * magB;
    return denominator === 0 ? 0 : dotProduct / denominator;
  }
}
```
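
A short worked example shows how the weights shape the ranking. The per-field similarity values below are made up for illustration, and `weightedScore` is a hypothetical standalone version of the weighted sum used in the class above.

```typescript
// Per-field cosine similarities for one candidate golden query.
type FieldScores = {
  name: number;
  userQuestion: number;
  description: number;
  technicalDescription: number;
};

// Default weights from the configuration (10% / 20% / 35% / 35%).
const WEIGHTS: FieldScores = { name: 0.1, userQuestion: 0.2, description: 0.35, technicalDescription: 0.35 };

function weightedScore(scores: FieldScores, weights: FieldScores = WEIGHTS): number {
  return (
    scores.name * weights.name +
    scores.userQuestion * weights.userQuestion +
    scores.description * weights.description +
    scores.technicalDescription * weights.technicalDescription
  );
}

// Made-up similarities: A has the closer name, B the closer descriptions.
const candidateA: FieldScores = { name: 0.9, userQuestion: 0.8, description: 0.7, technicalDescription: 0.6 };
const candidateB: FieldScores = { name: 0.3, userQuestion: 0.6, description: 0.9, technicalDescription: 0.9 };

// A: 0.09 + 0.16 + 0.245 + 0.21  = 0.705
// B: 0.03 + 0.12 + 0.315 + 0.315 = 0.78
const ranked = [
  { id: 'A', score: weightedScore(candidateA) },
  { id: 'B', score: weightedScore(candidateB) },
].sort((x, y) => y.score - x.score);
```

Candidate B ranks first even though A has the more similar name, because the two description fields together carry 70% of the weight.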

### 4.3 Few-Shot Example Selection
**Output**: 3-5 most relevant golden queries to include in SQL generation prompt

**Example**:
```typescript
// User question: "What are the top customers by revenue?"
// Similar golden queries:
[
  { name: "Top Customers by Order Count", similarity: 0.92, sql: "..." },
  { name: "Revenue by Customer Segment", similarity: 0.87, sql: "..." },
  { name: "Customer Purchase Analysis", similarity: 0.83, sql: "..." }
]
```

---

## Phase 5: SQL Query Generation (Week 4)

### 5.1 QueryWriter Implementation
**Purpose**: Generate Nunjucks SQL templates using AI with few-shot learning

**AI Prompt**: `metadata/prompts/templates/query-gen/sql-query-writer.template.md`

**Prompt Content** (based on Skip's query-writer.md):
```markdown
# SQL Query Template Writer

You are an expert SQL developer specializing in creating MemberJunction-compatible Nunjucks SQL query templates.

## Task
Generate a SQL query template that answers the following business question:

**User Question**: {{ userQuestion }}
**Description**: {{ description }}
**Technical Description**: {{ technicalDescription }}

## Available Entities

{% for entity in entityMetadata %}
### {{ entity.entityName }}
- **Schema.View**: `[{{ entity.schemaName }}].[{{ entity.baseView }}]`
- **Description**: {{ entity.description }}

**Available Fields**:
{% for field in entity.fields %}
- `{{ field.name }}` ({{ field.type }}){% if field.description %} - {{ field.description }}{% endif %}{% if field.isPrimaryKey %} [PK]{% endif %}{% if field.isForeignKey %} [FK→{{ field.relatedEntity }}]{% endif %}
{% endfor %}

{% if entity.relationships.length > 0 %}
**Join Information**:
{% for rel in entity.relationships %}
- To join `{{ rel.relatedEntity }}`: `LEFT JOIN [{{ rel.relatedEntitySchema }}].[{{ rel.relatedEntityView }}] alias ON alias.{{ rel.foreignKeyField }} = {{ entity.entityName.substring(0,1).toLowerCase() }}.ID`
{% endfor %}
{% endif %}

---
{% endfor %}

## Example Queries (Similar to Your Task)

{% for example in fewShotExamples %}
### Example {{ loop.index }}: {{ example.name }}
**User Question**: {{ example.userQuestion }}
**Description**: {{ example.description }}

**SQL Template**:
```sql
{{ example.sql }}
```

**Parameters**:
{% for param in example.parameters %}
- `{{ param.name }}` ({{ param.type }}){% if param.isRequired %} [REQUIRED]{% endif %} - {{ param.description }}
- Sample: `{{ param.sampleValue }}`
{% endfor %}

**Output Fields**:
{% for field in example.selectClause %}
- `{{ field.name }}` ({{ field.type }}) - {{ field.description }}
{% endfor %}

---
{% endfor %}

## Requirements
1. **Use Nunjucks Syntax**: Parameters use `{{ paramName }}` syntax
2. **Use SQL Filters**: Apply `| sqlString`, `| sqlNumber`, `| sqlDate`, `| sqlIn` filters
|
|
633
|
+
3. **Use Base Views**: Query from `vw*` views, not base tables
|
|
634
|
+
4. **Include Comments**: Document query purpose and logic
|
|
635
|
+
5. **Handle NULLs**: Use COALESCE or ISNULL for aggregations
|
|
636
|
+
6. **Performance**: Include appropriate WHERE clauses and JOINs
|
|
637
|
+
7. **Parameterize**: Make queries reusable with parameters
|
|
638
|
+
|
|
639
|
+
## Output Format
|
|
640
|
+
Return JSON with three properties:
|
|
641
|
+
|
|
642
|
+
```json
|
|
643
|
+
{
|
|
644
|
+
"sql": "SELECT ... FROM ... WHERE ...",
|
|
645
|
+
"selectClause": [
|
|
646
|
+
{
|
|
647
|
+
"name": "CustomerName",
|
|
648
|
+
"description": "Name of the customer",
|
|
649
|
+
"type": "string",
|
|
650
|
+
"optional": false
|
|
651
|
+
}
|
|
652
|
+
],
|
|
653
|
+
"parameters": [
|
|
654
|
+
{
|
|
655
|
+
"name": "minRevenue",
|
|
656
|
+
"type": "number",
|
|
657
|
+
"isRequired": true,
|
|
658
|
+
"description": "Minimum revenue threshold",
|
|
659
|
+
"usage": ["WHERE clause: Revenue >= {{ minRevenue | sqlNumber }}"],
|
|
660
|
+
"defaultValue": null,
|
|
661
|
+
"sampleValue": "10000"
|
|
662
|
+
}
|
|
663
|
+
]
|
|
664
|
+
}
|
|
665
|
+
```
|
|
666
|
+
```
|
|
667
|
+
|
|
668
|
+
**AI Prompt Configuration** (`.prompts.json`):
|
|
669
|
+
```json
|
|
670
|
+
{
|
|
671
|
+
"fields": {
|
|
672
|
+
"Name": "SQL Query Writer",
|
|
673
|
+
"Description": "Generates Nunjucks SQL query templates from business questions",
|
|
674
|
+
"TypeID": "@lookup:AI Prompt Types.Name=Chat",
|
|
675
|
+
"TemplateText": "@file:templates/query-gen/sql-query-writer.template.md",
|
|
676
|
+
"Status": "Active",
|
|
677
|
+
"ResponseFormat": "JSON",
|
|
678
|
+
"SelectionStrategy": "Specific",
|
|
679
|
+
"PowerPreference": "Highest",
|
|
680
|
+
"ParallelizationMode": "None",
|
|
681
|
+
"OutputType": "object",
|
|
682
|
+
"ValidationBehavior": "Strict",
|
|
683
|
+
"MaxRetries": 3,
|
|
684
|
+
"FailoverMaxAttempts": 5,
|
|
685
|
+
"PromptRole": "System",
|
|
686
|
+
"PromptPosition": "First",
|
|
687
|
+
"CategoryID": "@lookup:AI Prompt Categories.Name=Query Generation"
|
|
688
|
+
},
|
|
689
|
+
"relatedEntities": {
|
|
690
|
+
"MJ: AI Prompt Models": [
|
|
691
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Claude 4.5 Sonnet", "VendorID": "@lookup:MJ: AI Vendors.Name=Anthropic", "Priority": 1 } },
|
|
692
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 2 } },
|
|
693
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Cerebras", "Priority": 3 } },
|
|
694
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Gemini 2.5 Flash", "VendorID": "@lookup:MJ: AI Vendors.Name=Google", "Priority": 4 } },
|
|
695
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT-OSS-120B", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 5 } },
|
|
696
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT 5-nano", "VendorID": "@lookup:MJ: AI Vendors.Name=OpenAI", "Priority": 6 } }
|
|
697
|
+
]
|
|
698
|
+
}
|
|
699
|
+
}
|
|
700
|
+
```
|
|
701
|
+
|
|
702
|
+
### 5.2 QueryParameterProcessor Integration
|
|
703
|
+
**Purpose**: Render Nunjucks templates with sample parameter values for testing
|
|
704
|
+
|
|
705
|
+
```typescript
|
|
706
|
+
class QueryWriter {
|
|
707
|
+
async generateQuery(
|
|
708
|
+
businessQuestion: BusinessQuestion,
|
|
709
|
+
entityMetadata: EntityMetadataForPrompt[],
|
|
710
|
+
fewShotExamples: GoldenQuery[]
|
|
711
|
+
): Promise<GeneratedQuery> {
|
|
712
|
+
// 1. Prepare prompt data
|
|
713
|
+
const promptData = {
|
|
714
|
+
userQuestion: businessQuestion.userQuestion,
|
|
715
|
+
description: businessQuestion.description,
|
|
716
|
+
technicalDescription: businessQuestion.technicalDescription,
|
|
717
|
+
entityMetadata,
|
|
718
|
+
fewShotExamples
|
|
719
|
+
};
|
|
720
|
+
|
|
721
|
+
// 2. Execute SQL Query Writer prompt
|
|
722
|
+
const promptRunner = new AIPromptRunner();
|
|
723
|
+
const result = await promptRunner.ExecutePrompt({
|
|
724
|
+
prompt: await this.getPrompt('SQL Query Writer'),
|
|
725
|
+
data: promptData,
|
|
726
|
+
contextUser: this.contextUser
|
|
727
|
+
});
|
|
728
|
+
|
|
729
|
+
// 3. Parse result
|
|
730
|
+
const generated = result.result as GeneratedQuery;
|
|
731
|
+
|
|
732
|
+
// 4. Validate structure
|
|
733
|
+
this.validateGeneratedQuery(generated);
|
|
734
|
+
|
|
735
|
+
return generated;
|
|
736
|
+
}
|
|
737
|
+
|
|
738
|
+
private validateGeneratedQuery(query: GeneratedQuery): void {
|
|
739
|
+
if (!query.sql || !query.parameters || !query.selectClause) {
|
|
740
|
+
throw new Error('Invalid query structure returned from AI');
|
|
741
|
+
}
|
|
742
|
+
}
|
|
743
|
+
}
|
|
744
|
+
```
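
The `validateGeneratedQuery` check above only tests that the three properties are truthy. A slightly stricter structural check might look like the following. This is a sketch under stated assumptions: `findStructureProblems` and `isRecord` are hypothetical helpers, and the field names are taken from the JSON output format of the SQL Query Writer prompt rather than from the package's actual types.

```typescript
// Narrow an unknown value to a plain object.
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of problems; an empty list means the structure looks valid.
function findStructureProblems(value: unknown): string[] {
  if (!isRecord(value)) return ["response is not a JSON object"];
  const problems: string[] = [];

  if (typeof value.sql !== "string" || value.sql.trim() === "") {
    problems.push("sql must be a non-empty string");
  }
  if (!Array.isArray(value.selectClause) || value.selectClause.length === 0) {
    problems.push("selectClause must be a non-empty array");
  }
  if (!Array.isArray(value.parameters)) {
    problems.push("parameters must be an array");
  } else {
    // An empty parameters array is allowed: not every query is parameterized.
    value.parameters.forEach((p, i) => {
      if (!isRecord(p) || typeof p.name !== "string" || typeof p.sampleValue !== "string") {
        problems.push(`parameters[${i}] needs a string name and sampleValue`);
      }
    });
  }
  return problems;
}
```

Returning a list of problems instead of throwing on the first one gives a fixer prompt or a log line something specific to report.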
|
|
745
|
+
|
|
746
|
+
---
|
|
747
|
+
|
|
748
|
+
## Phase 6: Query Testing & Fixing (Week 5)
|
|
749
|
+
|
|
750
|
+
### 6.1 QueryTester Implementation
|
|
751
|
+
**Purpose**: Render and execute SQL queries to validate they work correctly
|
|
752
|
+
|
|
753
|
+
**Key Features**:
|
|
754
|
+
```typescript
|
|
755
|
+
class QueryTester {
|
|
756
|
+
private processor: QueryParameterProcessor;
|
|
757
|
+
|
|
758
|
+
async testQuery(
|
|
759
|
+
query: GeneratedQuery,
|
|
760
|
+
maxAttempts: number = 5
|
|
761
|
+
): Promise<QueryTestResult> {
|
|
762
|
+
let attempt = 0;
|
|
763
|
+
let lastError: string | undefined;
|
|
764
|
+
|
|
765
|
+
while (attempt < maxAttempts) {
|
|
766
|
+
attempt++;
|
|
767
|
+
|
|
768
|
+
try {
|
|
769
|
+
// 1. Render template with sample parameter values
|
|
770
|
+
const renderedSQL = this.renderQueryTemplate(query);
|
|
771
|
+
|
|
772
|
+
// 2. Execute SQL on database
|
|
773
|
+
const results = await this.executeSQLQuery(renderedSQL);
|
|
774
|
+
|
|
775
|
+
// 3. Validate results
|
|
776
|
+
if (results.length === 0) {
|
|
777
|
+
throw new Error('Query returned no results (may need sample data)');
|
|
778
|
+
}
|
|
779
|
+
|
|
780
|
+
// 4. Success!
|
|
781
|
+
return {
|
|
782
|
+
success: true,
|
|
783
|
+
renderedSQL,
|
|
784
|
+
rowCount: results.length,
|
|
785
|
+
sampleRows: results.slice(0, 10),
|
|
786
|
+
attempts: attempt
|
|
787
|
+
};
|
|
788
|
+
|
|
789
|
+
} catch (error) {
|
|
790
|
+
lastError = extractErrorMessage(error, 'Query Testing');
|
|
791
|
+
console.error(`Attempt ${attempt}/${maxAttempts} failed:`, lastError);
|
|
792
|
+
|
|
793
|
+
// 5. If not last attempt, try to fix the query
|
|
794
|
+
if (attempt < maxAttempts) {
|
|
795
|
+
query = await this.fixQuery(query, lastError);
|
|
796
|
+
}
|
|
797
|
+
}
|
|
798
|
+
}
|
|
799
|
+
|
|
800
|
+
// Failed after max attempts
|
|
801
|
+
return {
|
|
802
|
+
success: false,
|
|
803
|
+
error: lastError,
|
|
804
|
+
attempts: maxAttempts
|
|
805
|
+
};
|
|
806
|
+
}
|
|
807
|
+
|
|
808
|
+
private renderQueryTemplate(query: GeneratedQuery): string {
|
|
809
|
+
// Use QueryParameterProcessor to render Nunjucks template
|
|
810
|
+
const paramValues: Record<string, any> = {};
|
|
811
|
+
|
|
812
|
+
// Convert sampleValue strings to proper types
|
|
813
|
+
for (const param of query.parameters) {
|
|
814
|
+
paramValues[param.name] = this.parseSampleValue(param.sampleValue, param.type);
|
|
815
|
+
}
|
|
816
|
+
|
|
817
|
+
const result = QueryParameterProcessor.processQueryTemplate(
|
|
818
|
+
{ SQL: query.sql, Parameters: query.parameters } as any,
|
|
819
|
+
paramValues
|
|
820
|
+
);
|
|
821
|
+
|
|
822
|
+
if (!result.success) {
|
|
823
|
+
throw new Error(`Template rendering failed: ${result.error}`);
|
|
824
|
+
}
|
|
825
|
+
|
|
826
|
+
return result.processedSQL;
|
|
827
|
+
}
|
|
828
|
+
|
|
829
|
+
private parseSampleValue(value: string, type: string): any {
|
|
830
|
+
switch (type) {
|
|
831
|
+
case 'number': return Number(value);
|
|
832
|
+
case 'boolean': return value.toLowerCase() === 'true';
|
|
833
|
+
case 'date': return new Date(value);
|
|
834
|
+
case 'array': return JSON.parse(value);
|
|
835
|
+
default: return value;
|
|
836
|
+
}
|
|
837
|
+
}
|
|
838
|
+
|
|
839
|
+
private async executeSQLQuery(sql: string): Promise<any[]> {
|
|
840
|
+
// Execute SQL against database
|
|
841
|
+
const result = await this.dataProvider.ExecuteSQL(sql);
|
|
842
|
+
return result.Results;
|
|
843
|
+
}
|
|
844
|
+
|
|
845
|
+
private async fixQuery(
|
|
846
|
+
query: GeneratedQuery,
|
|
847
|
+
errorMessage: string
|
|
848
|
+
): Promise<GeneratedQuery> {
|
|
849
|
+
// Use SQL Query Fixer prompt to correct the query
|
|
850
|
+
const fixer = new QueryFixer();
|
|
851
|
+
return await fixer.fixQuery(query, errorMessage);
|
|
852
|
+
}
|
|
853
|
+
}
|
|
854
|
+
```
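
In `parseSampleValue` above, a malformed sample such as `"abc"` for a number becomes `NaN`, and a bad date becomes an invalid `Date`, so the failure only surfaces later as a SQL error. A stricter variant could reject bad samples before rendering. The following is a sketch only: `parseSampleValueStrict` is a hypothetical name, and the type labels are the ones used in the `switch` above.

```typescript
// Converts a sampleValue string into a typed value, throwing on malformed input
// instead of silently producing NaN or an invalid Date.
function parseSampleValueStrict(value: string, type: string): unknown {
  switch (type) {
    case "number": {
      const n = Number(value);
      // Number("") is 0, so empty strings are rejected explicitly.
      if (value.trim() === "" || Number.isNaN(n)) {
        throw new Error(`Invalid number sample: "${value}"`);
      }
      return n;
    }
    case "boolean": {
      const v = value.trim().toLowerCase();
      if (v !== "true" && v !== "false") {
        throw new Error(`Invalid boolean sample: "${value}"`);
      }
      return v === "true";
    }
    case "date": {
      const d = new Date(value);
      if (Number.isNaN(d.getTime())) {
        throw new Error(`Invalid date sample: "${value}"`);
      }
      return d;
    }
    case "array": {
      let parsed: unknown;
      try {
        parsed = JSON.parse(value);
      } catch {
        throw new Error(`Invalid array sample (not JSON): "${value}"`);
      }
      if (!Array.isArray(parsed)) {
        throw new Error(`Invalid array sample (not an array): "${value}"`);
      }
      return parsed;
    }
    default:
      // Strings and unknown types pass through unchanged.
      return value;
  }
}
```

Failing at parse time keeps a bad sample value from being mistaken for a bad query, which would otherwise spend a fix attempt on SQL that may be correct.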
|
|
855
|
+
|
|
856
|
+
### 6.2 SQL Query Fixer Prompt
|
|
857
|
+
**AI Prompt**: `metadata/prompts/templates/query-gen/sql-query-fixer.template.md`
|
|
858
|
+
|
|
859
|
+
**Prompt Content** (based on Skip's query-fixer.md):
|
|
860
|
+
```markdown
|
|
861
|
+
# SQL Query Fixer
|
|
862
|
+
|
|
863
|
+
You are an expert SQL developer tasked with fixing a broken SQL query.
|
|
864
|
+
|
|
865
|
+
## Original Query
|
|
866
|
+
```sql
|
|
867
|
+
{{ originalSQL }}
|
|
868
|
+
```
|
|
869
|
+
|
|
870
|
+
## Error Message
|
|
871
|
+
```
|
|
872
|
+
{{ errorMessage }}
|
|
873
|
+
```
|
|
874
|
+
|
|
875
|
+
## Entity Metadata
|
|
876
|
+
|
|
877
|
+
{% for entity in entityMetadata %}
|
|
878
|
+
### {{ entity.entityName }}
|
|
879
|
+
- **Schema.View**: `[{{ entity.schemaName }}].[{{ entity.baseView }}]`
|
|
880
|
+
- **Description**: {{ entity.description }}
|
|
881
|
+
|
|
882
|
+
**Available Fields**:
|
|
883
|
+
{% for field in entity.fields %}
|
|
884
|
+
- `{{ field.name }}` ({{ field.type }}){% if field.isPrimaryKey %} [PK]{% endif %}{% if field.isForeignKey %} [FK→{{ field.relatedEntity }}]{% endif %}
|
|
885
|
+
{% endfor %}
|
|
886
|
+
{% endfor %}
|
|
887
|
+
|
|
888
|
+
## Query Parameters
|
|
889
|
+
|
|
890
|
+
{% if parameters.length > 0 %}
|
|
891
|
+
{% for param in parameters %}
|
|
892
|
+
- `{{ param.name }}` ({{ param.type }}){% if param.isRequired %} [REQUIRED]{% endif %} - {{ param.description }}
|
|
893
|
+
- Sample value: `{{ param.sampleValue }}`
|
|
894
|
+
{% endfor %}
|
|
895
|
+
{% else %}
|
|
896
|
+
No parameters defined for this query.
|
|
897
|
+
{% endif %}
|
|
898
|
+
|
|
899
|
+
## Instructions
|
|
900
|
+
Analyze the error and fix the SQL query. Common issues:
|
|
901
|
+
- Syntax errors (missing commas, parentheses, keywords)
|
|
902
|
+
- Invalid column names (check entity metadata)
|
|
903
|
+
- Type mismatches (ensure correct types for parameters)
|
|
904
|
+
- Missing JOINs or incorrect JOIN conditions
|
|
905
|
+
- Aggregation errors (missing GROUP BY, invalid aggregate usage)
|
|
906
|
+
- Subquery issues
|
|
907
|
+
|
|
908
|
+
## Requirements
|
|
909
|
+
1. Preserve the query's intent and logic
|
|
910
|
+
2. Fix only what's broken (minimal changes)
|
|
911
|
+
3. Maintain Nunjucks parameter syntax
|
|
912
|
+
4. Ensure SQL is valid for SQL Server
|
|
913
|
+
5. Update parameters array if needed
|
|
914
|
+
|
|
915
|
+
## Output Format
|
|
916
|
+
Return JSON with corrected query:
|
|
917
|
+
|
|
918
|
+
```json
|
|
919
|
+
{
|
|
920
|
+
"sql": "SELECT ... (corrected)",
|
|
921
|
+
"selectClause": [...],
|
|
922
|
+
"parameters": [...],
|
|
923
|
+
"changesSummary": "Fixed missing GROUP BY clause for aggregate functions"
|
|
924
|
+
}
|
|
925
|
+
```
|
|
926
|
+
```
|
|
927
|
+
|
|
928
|
+
**AI Prompt Configuration** (`.prompts.json`):
|
|
929
|
+
```json
|
|
930
|
+
{
|
|
931
|
+
"fields": {
|
|
932
|
+
"Name": "SQL Query Fixer",
|
|
933
|
+
"Description": "Fixes SQL syntax and logic errors in generated queries",
|
|
934
|
+
"TypeID": "@lookup:AI Prompt Types.Name=Chat",
|
|
935
|
+
"TemplateText": "@file:templates/query-gen/sql-query-fixer.template.md",
|
|
936
|
+
"Status": "Active",
|
|
937
|
+
"ResponseFormat": "JSON",
|
|
938
|
+
"SelectionStrategy": "Specific",
|
|
939
|
+
"PowerPreference": "Highest",
|
|
940
|
+
"ParallelizationMode": "None",
|
|
941
|
+
"OutputType": "object",
|
|
942
|
+
"ValidationBehavior": "Strict",
|
|
943
|
+
"MaxRetries": 3,
|
|
944
|
+
"FailoverMaxAttempts": 5,
|
|
945
|
+
"PromptRole": "System",
|
|
946
|
+
"PromptPosition": "First",
|
|
947
|
+
"CategoryID": "@lookup:AI Prompt Categories.Name=Query Generation"
|
|
948
|
+
},
|
|
949
|
+
"relatedEntities": {
|
|
950
|
+
"MJ: AI Prompt Models": [
|
|
951
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Claude 4.5 Sonnet", "VendorID": "@lookup:MJ: AI Vendors.Name=Anthropic", "Priority": 1 } },
|
|
952
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 2 } },
|
|
953
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Cerebras", "Priority": 3 } },
|
|
954
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Gemini 2.5 Flash", "VendorID": "@lookup:MJ: AI Vendors.Name=Google", "Priority": 4 } },
|
|
955
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT-OSS-120B", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 5 } },
|
|
956
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT 5-nano", "VendorID": "@lookup:MJ: AI Vendors.Name=OpenAI", "Priority": 6 } }
|
|
957
|
+
]
|
|
958
|
+
}
|
|
959
|
+
}
|
|
960
|
+
```
|
|
961
|
+
|
|
962
|
+
---
|
|
963
|
+
|
|
964
|
+
## Phase 7: Query Refinement & Evaluation (Week 6)
|
|
965
|
+
|
|
966
|
+
### 7.1 Query Evaluator Prompt
|
|
967
|
+
**Purpose**: Assess if the query answers the business question correctly
|
|
968
|
+
|
|
969
|
+
**AI Prompt**: `metadata/prompts/templates/query-gen/query-evaluator.template.md`
|
|
970
|
+
|
|
971
|
+
**Prompt Content**:
|
|
972
|
+
```markdown
|
|
973
|
+
# Query Result Evaluator
|
|
974
|
+
|
|
975
|
+
You are a data analyst evaluating whether a SQL query answers a business question correctly.
|
|
976
|
+
|
|
977
|
+
## Business Question
|
|
978
|
+
**User Question**: {{ userQuestion }}
|
|
979
|
+
**Description**: {{ description }}
|
|
980
|
+
**Technical Description**: {{ technicalDescription }}
|
|
981
|
+
|
|
982
|
+
## Generated SQL Query
|
|
983
|
+
```sql
|
|
984
|
+
{{ generatedSQL }}
|
|
985
|
+
```
|
|
986
|
+
|
|
987
|
+
## Query Parameters
|
|
988
|
+
{% if parameters.length > 0 %}
|
|
989
|
+
{% for param in parameters %}
|
|
990
|
+
- `{{ param.name }}` ({{ param.type }}){% if param.isRequired %} [REQUIRED]{% endif %} - {{ param.description }}
|
|
991
|
+
- Sample value used: `{{ param.sampleValue }}`
|
|
992
|
+
{% endfor %}
|
|
993
|
+
{% else %}
|
|
994
|
+
No parameters defined for this query.
|
|
995
|
+
{% endif %}
|
|
996
|
+
|
|
997
|
+
## Sample Results (Limited to Top 10 Rows for Efficiency)
|
|
998
|
+
|
|
999
|
+
{% if sampleResults.length > 0 %}
|
|
1000
|
+
**Sample rows shown**: {{ sampleResults.length }}
|
|
1001
|
+
|
|
1002
|
+
{% for row in sampleResults %}
|
|
1003
|
+
### Row {{ loop.index }}
|
|
1004
|
+
{% for key, value in row %}
|
|
1005
|
+
- **{{ key }}**: {{ value }}
|
|
1006
|
+
{% endfor %}
|
|
1007
|
+
{% if not loop.last %}---{% endif %}
|
|
1008
|
+
{% endfor %}
|
|
1009
|
+
|
|
1010
|
+
**Note**: Only the first 10 rows are shown to keep the prompt size manageable and reduce token costs.
|
|
1011
|
+
{% else %}
|
|
1012
|
+
⚠️ Query returned no results.
|
|
1013
|
+
{% endif %}
|
|
1014
|
+
|
|
1015
|
+
## Instructions
|
|
1016
|
+
Evaluate if the query answers the business question:
|
|
1017
|
+
1. **Result Relevance**: Do the results match what was asked?
|
|
1018
|
+
2. **Data Completeness**: Are all necessary columns present?
|
|
1019
|
+
3. **Correctness**: Are calculations and aggregations correct?
|
|
1020
|
+
4. **Usability**: Are results formatted appropriately?
|
|
1021
|
+
|
|
1022
|
+
## Output Format
|
|
1023
|
+
Return JSON evaluation:
|
|
1024
|
+
|
|
1025
|
+
```json
|
|
1026
|
+
{
|
|
1027
|
+
"answersQuestion": true,
|
|
1028
|
+
"confidence": 0.95,
|
|
1029
|
+
"reasoning": "Query correctly aggregates orders by customer and sorts by total revenue descending. Sample results show expected data.",
|
|
1030
|
+
"suggestions": [
|
|
1031
|
+
"Consider adding customer contact info for better usability",
|
|
1032
|
+
"Add date range parameter to filter orders by time period"
|
|
1033
|
+
],
|
|
1034
|
+
"needsRefinement": false
|
|
1035
|
+
}
|
|
1036
|
+
```
|
|
1037
|
+
```
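
The JSON evaluation described in the prompt's output format can be typed and checked before the refinement loop acts on it. This is a minimal sketch, assuming the model's response arrives as a raw JSON string: `QueryEvaluationShape`, `parseEvaluation`, and `isAccepted` are hypothetical names, and the field list is copied from the example output above.

```typescript
// Hypothetical type mirroring the evaluator's JSON output format.
interface QueryEvaluationShape {
  answersQuestion: boolean;
  confidence: number; // expected in [0, 1]
  reasoning: string;
  suggestions: string[];
  needsRefinement: boolean;
}

// Parses and validates the evaluator response, throwing on malformed fields.
function parseEvaluation(raw: string): QueryEvaluationShape {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Evaluation must be a JSON object");
  }
  const e = data as Record<string, unknown>;
  if (typeof e.answersQuestion !== "boolean") throw new Error("answersQuestion must be a boolean");
  if (typeof e.needsRefinement !== "boolean") throw new Error("needsRefinement must be a boolean");
  if (typeof e.confidence !== "number" || e.confidence < 0 || e.confidence > 1) {
    throw new Error("confidence must be a number between 0 and 1");
  }
  if (typeof e.reasoning !== "string") throw new Error("reasoning must be a string");
  // Treat a missing suggestions list as empty and ignore non-string entries.
  const suggestions = Array.isArray(e.suggestions)
    ? e.suggestions.filter((s): s is string => typeof s === "string")
    : [];
  return {
    answersQuestion: e.answersQuestion,
    confidence: e.confidence,
    reasoning: e.reasoning,
    suggestions,
    needsRefinement: e.needsRefinement
  };
}

// A query is accepted when it answers the question and needs no refinement.
function isAccepted(e: QueryEvaluationShape): boolean {
  return e.answersQuestion && !e.needsRefinement;
}
```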
|
|
1038
|
+
|
|
1039
|
+
**AI Prompt Configuration** (`.prompts.json`):
|
|
1040
|
+
```json
|
|
1041
|
+
{
|
|
1042
|
+
"fields": {
|
|
1043
|
+
"Name": "Query Result Evaluator",
|
|
1044
|
+
"Description": "Evaluates if a query correctly answers the business question",
|
|
1045
|
+
"TypeID": "@lookup:AI Prompt Types.Name=Chat",
|
|
1046
|
+
"TemplateText": "@file:templates/query-gen/query-evaluator.template.md",
|
|
1047
|
+
"Status": "Active",
|
|
1048
|
+
"ResponseFormat": "JSON",
|
|
1049
|
+
"SelectionStrategy": "Specific",
|
|
1050
|
+
"PowerPreference": "Highest",
|
|
1051
|
+
"ParallelizationMode": "None",
|
|
1052
|
+
"OutputType": "object",
|
|
1053
|
+
"ValidationBehavior": "Strict",
|
|
1054
|
+
"MaxRetries": 3,
|
|
1055
|
+
"FailoverMaxAttempts": 5,
|
|
1056
|
+
"PromptRole": "System",
|
|
1057
|
+
"PromptPosition": "First",
|
|
1058
|
+
"CategoryID": "@lookup:AI Prompt Categories.Name=Query Generation"
|
|
1059
|
+
},
|
|
1060
|
+
"relatedEntities": {
|
|
1061
|
+
"MJ: AI Prompt Models": [
|
|
1062
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Claude 4.5 Sonnet", "VendorID": "@lookup:MJ: AI Vendors.Name=Anthropic", "Priority": 1 } },
|
|
1063
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 2 } },
|
|
1064
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Cerebras", "Priority": 3 } },
|
|
1065
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Gemini 2.5 Flash", "VendorID": "@lookup:MJ: AI Vendors.Name=Google", "Priority": 4 } },
|
|
1066
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT-OSS-120B", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 5 } },
|
|
1067
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT 5-nano", "VendorID": "@lookup:MJ: AI Vendors.Name=OpenAI", "Priority": 6 } }
|
|
1068
|
+
]
|
|
1069
|
+
}
|
|
1070
|
+
}
|
|
1071
|
+
```
|
|
1072
|
+
|
|
1073
|
+
### 7.2 Query Refiner Implementation
|
|
1074
|
+
**Purpose**: Iteratively improve queries based on evaluation feedback
|
|
1075
|
+
|
|
1076
|
+
**AI Prompt**: `metadata/prompts/templates/query-gen/query-refiner.template.md`
|
|
1077
|
+
|
|
1078
|
+
**Prompt Content**:
|
|
1079
|
+
```markdown
|
|
1080
|
+
# Query Refiner
|
|
1081
|
+
|
|
1082
|
+
You are an expert SQL developer refining a query based on evaluation feedback.
|
|
1083
|
+
|
|
1084
|
+
## Original Business Question
|
|
1085
|
+
**User Question**: {{ userQuestion }}
|
|
1086
|
+
**Description**: {{ description }}
|
|
1087
|
+
|
|
1088
|
+
## Current Query
|
|
1089
|
+
```sql
|
|
1090
|
+
{{ currentSQL }}
|
|
1091
|
+
```
|
|
1092
|
+
|
|
1093
|
+
## Evaluation Feedback
|
|
1094
|
+
|
|
1095
|
+
**Answers Question**: {% if evaluationFeedback.answersQuestion %}✅ Yes{% else %}❌ No{% endif %}
|
|
1096
|
+
**Confidence**: {{ (evaluationFeedback.confidence * 100) | round }}%
|
|
1097
|
+
**Needs Refinement**: {% if evaluationFeedback.needsRefinement %}Yes{% else %}No{% endif %}
|
|
1098
|
+
|
|
1099
|
+
**Reasoning**: {{ evaluationFeedback.reasoning }}
|
|
1100
|
+
|
|
1101
|
+
{% if evaluationFeedback.suggestions.length > 0 %}
|
|
1102
|
+
**Suggestions for Improvement**:
|
|
1103
|
+
{% for suggestion in evaluationFeedback.suggestions %}
|
|
1104
|
+
{{ loop.index }}. {{ suggestion }}
|
|
1105
|
+
{% endfor %}
|
|
1106
|
+
{% endif %}
|
|
1107
|
+
|
|
1108
|
+
## Entity Metadata
|
|
1109
|
+
|
|
1110
|
+
{% for entity in entityMetadata %}
|
|
1111
|
+
### {{ entity.entityName }}
|
|
1112
|
+
- **Schema.View**: `[{{ entity.schemaName }}].[{{ entity.baseView }}]`
|
|
1113
|
+
- **Description**: {{ entity.description }}
|
|
1114
|
+
|
|
1115
|
+
**Available Fields**:
|
|
1116
|
+
{% for field in entity.fields %}
|
|
1117
|
+
- `{{ field.name }}` ({{ field.type }}){% if field.isPrimaryKey %} [PK]{% endif %}{% if field.isForeignKey %} [FK→{{ field.relatedEntity }}]{% endif %}
|
|
1118
|
+
{% endfor %}
|
|
1119
|
+
{% endfor %}
|
|
1120
|
+
|
|
1121
|
+
## Instructions
|
|
1122
|
+
Refine the query based on suggestions:
|
|
1123
|
+
1. Address concerns raised in evaluation
|
|
1124
|
+
2. Implement suggested improvements
|
|
1125
|
+
3. Maintain query correctness and performance
|
|
1126
|
+
4. Preserve existing parameters unless changing them improves the query
|
|
1127
|
+
|
|
1128
|
+
## Output Format
|
|
1129
|
+
Return JSON with refined query:
|
|
1130
|
+
|
|
1131
|
+
```json
|
|
1132
|
+
{
|
|
1133
|
+
"sql": "SELECT ... (refined)",
|
|
1134
|
+
"selectClause": [...],
|
|
1135
|
+
"parameters": [...],
|
|
1136
|
+
"improvementsSummary": "Added customer contact columns and date range filter as suggested"
|
|
1137
|
+
}
|
|
1138
|
+
```
|
|
1139
|
+
```
|
|
1140
|
+
|
|
1141
|
+
**AI Prompt Configuration** (`.prompts.json`):
|
|
1142
|
+
```json
|
|
1143
|
+
{
|
|
1144
|
+
"fields": {
|
|
1145
|
+
"Name": "Query Refiner",
|
|
1146
|
+
"Description": "Refines queries based on evaluation feedback",
|
|
1147
|
+
"TypeID": "@lookup:AI Prompt Types.Name=Chat",
|
|
1148
|
+
"TemplateText": "@file:templates/query-gen/query-refiner.template.md",
|
|
1149
|
+
"Status": "Active",
|
|
1150
|
+
"ResponseFormat": "JSON",
|
|
1151
|
+
"SelectionStrategy": "Specific",
|
|
1152
|
+
"PowerPreference": "Highest",
|
|
1153
|
+
"ParallelizationMode": "None",
|
|
1154
|
+
"OutputType": "object",
|
|
1155
|
+
"ValidationBehavior": "Strict",
|
|
1156
|
+
"MaxRetries": 3,
|
|
1157
|
+
"FailoverMaxAttempts": 5,
|
|
1158
|
+
"PromptRole": "System",
|
|
1159
|
+
"PromptPosition": "First",
|
|
1160
|
+
"CategoryID": "@lookup:AI Prompt Categories.Name=Query Generation"
|
|
1161
|
+
},
|
|
1162
|
+
"relatedEntities": {
|
|
1163
|
+
"MJ: AI Prompt Models": [
|
|
1164
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Claude 4.5 Sonnet", "VendorID": "@lookup:MJ: AI Vendors.Name=Anthropic", "Priority": 1 } },
|
|
1165
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 2 } },
|
|
1166
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Kimi K2", "VendorID": "@lookup:MJ: AI Vendors.Name=Cerebras", "Priority": 3 } },
|
|
1167
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=Gemini 2.5 Flash", "VendorID": "@lookup:MJ: AI Vendors.Name=Google", "Priority": 4 } },
|
|
1168
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT-OSS-120B", "VendorID": "@lookup:MJ: AI Vendors.Name=Groq", "Priority": 5 } },
|
|
1169
|
+
{ "fields": { "PromptID": "@parent:ID", "ModelID": "@lookup:AI Models.Name=GPT 5-nano", "VendorID": "@lookup:MJ: AI Vendors.Name=OpenAI", "Priority": 6 } }
|
|
1170
|
+
]
|
|
1171
|
+
}
|
|
1172
|
+
}
|
|
1173
|
+
```
|
|
1174
|
+
|
|
1175
|
+
### 7.3 Refinement Loop Implementation
|
|
1176
|
+
```typescript
|
|
1177
|
+
class QueryRefiner {
|
|
1178
|
+
async refineQuery(
|
|
1179
|
+
query: GeneratedQuery,
|
|
1180
|
+
businessQuestion: BusinessQuestion,
|
|
1181
|
+
entityMetadata: EntityMetadataForPrompt[],
|
|
1182
|
+
maxRefinements: number = 3
|
|
1183
|
+
): Promise<RefinedQuery> {
|
|
1184
|
+
let currentQuery = query;
|
|
1185
|
+
let refinementCount = 0;
|
|
1186
|
+
|
|
1187
|
+
while (refinementCount < maxRefinements) {
|
|
1188
|
+
// 1. Test the current query
|
|
1189
|
+
const testResult = await this.tester.testQuery(currentQuery);
|
|
1190
|
+
|
|
1191
|
+
if (!testResult.success) {
|
|
1192
|
+
throw new Error(`Query testing failed: ${testResult.error}`);
|
|
1193
|
+
}
|
|
1194
|
+
|
|
1195
|
+
// 2. Evaluate if it answers the question
|
|
1196
|
+
const evaluation = await this.evaluateQuery(
|
|
1197
|
+
currentQuery,
|
|
1198
|
+
businessQuestion,
|
|
1199
|
+
testResult.sampleRows
|
|
1200
|
+
);
|
|
1201
|
+
|
|
1202
|
+
// 3. If evaluation passes, we're done!
|
|
1203
|
+
if (evaluation.answersQuestion && !evaluation.needsRefinement) {
|
|
1204
|
+
return {
|
|
1205
|
+
query: currentQuery,
|
|
1206
|
+
testResult,
|
|
1207
|
+
evaluation,
|
|
1208
|
+
refinementCount
|
|
1209
|
+
};
|
|
1210
|
+
}
|
|
1211
|
+
|
|
1212
|
+
// 4. Refine the query based on suggestions
|
|
1213
|
+
refinementCount++;
|
|
1214
|
+
console.log(`Refinement iteration ${refinementCount}/${maxRefinements}`);
|
|
1215
|
+
|
|
1216
|
+
currentQuery = await this.performRefinement(
|
|
1217
|
+
currentQuery,
|
|
1218
|
+
businessQuestion,
|
|
1219
|
+
evaluation,
|
|
1220
|
+
entityMetadata
|
|
1221
|
+
);
|
|
1222
|
+
}
|
|
1223
|
+
|
|
1224
|
+
// Reached max refinements - test and evaluate the final query against its own sample rows
const finalTestResult = await this.tester.testQuery(currentQuery);
const finalEvaluation = await this.evaluateQuery(
currentQuery,
businessQuestion,
finalTestResult.sampleRows ?? []
);
return {
query: currentQuery,
testResult: finalTestResult,
evaluation: finalEvaluation,
refinementCount,
reachedMaxRefinements: true
};
|
|
1232
|
+
}
|
|
1233
|
+
|
|
1234
|
+
private async evaluateQuery(
|
|
1235
|
+
query: GeneratedQuery,
|
|
1236
|
+
businessQuestion: BusinessQuestion,
|
|
1237
|
+
sampleResults: any[]
|
|
1238
|
+
): Promise<QueryEvaluation> {
|
|
1239
|
+
const promptRunner = new AIPromptRunner();
|
|
1240
|
+
const result = await promptRunner.ExecutePrompt({
|
|
1241
|
+
prompt: await this.getPrompt('Query Result Evaluator'),
|
|
1242
|
+
data: {
|
|
1243
|
+
userQuestion: businessQuestion.userQuestion,
|
|
1244
|
+
description: businessQuestion.description,
|
|
1245
|
+
technicalDescription: businessQuestion.technicalDescription,
|
|
1246
|
+
generatedSQL: query.sql,
|
|
1247
|
+
parameters: query.parameters,
|
|
1248
|
+
sampleResults
|
|
1249
|
+
},
|
|
1250
|
+
contextUser: this.contextUser
|
|
1251
|
+
});
|
|
1252
|
+
|
|
1253
|
+
return result.result as QueryEvaluation;
|
|
1254
|
+
}
|
|
1255
|
+
|
|
1256
|
+
private async performRefinement(
|
|
1257
|
+
query: GeneratedQuery,
|
|
1258
|
+
businessQuestion: BusinessQuestion,
|
|
1259
|
+
evaluation: QueryEvaluation,
|
|
1260
|
+
entityMetadata: EntityMetadataForPrompt[]
|
|
1261
|
+
): Promise<GeneratedQuery> {
|
|
1262
|
+
const promptRunner = new AIPromptRunner();
|
|
1263
|
+
const result = await promptRunner.ExecutePrompt({
|
|
1264
|
+
prompt: await this.getPrompt('Query Refiner'),
|
|
1265
|
+
data: {
|
|
1266
|
+
userQuestion: businessQuestion.userQuestion,
|
|
1267
|
+
description: businessQuestion.description,
|
|
1268
|
+
currentSQL: query.sql,
|
|
1269
|
+
evaluationFeedback: evaluation,
|
|
1270
|
+
entityMetadata
|
|
1271
|
+
},
|
|
1272
|
+
contextUser: this.contextUser
|
|
1273
|
+
});
|
|
1274
|
+
|
|
1275
|
+
return result.result as GeneratedQuery;
|
|
1276
|
+
}
|
|
1277
|
+
}
|
|
1278
|
+
```
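
The control flow of the refinement loop can be isolated from the AI and database calls. The sketch below is a synchronous simplification for illustration, not the class above: `refineUntilAccepted`, `LoopEvaluation`, and `RefinementOutcome` are hypothetical names, and the evaluate and refine steps are injected as plain functions. Unlike the class above, it reports `reachedMaxRefinements` as `false` when the last refinement is accepted.

```typescript
// Minimal evaluation shape needed by the loop (hypothetical).
interface LoopEvaluation {
  answersQuestion: boolean;
  needsRefinement: boolean;
}

interface RefinementOutcome<Q> {
  query: Q;
  evaluation: LoopEvaluation;
  refinementCount: number;
  reachedMaxRefinements: boolean;
}

// Evaluate, then refine and re-evaluate until accepted or the budget is spent.
// Every evaluation, including the last one, is made against the current query.
function refineUntilAccepted<Q>(
  initial: Q,
  evaluate: (query: Q) => LoopEvaluation,
  refine: (query: Q, evaluation: LoopEvaluation) => Q,
  maxRefinements: number = 3
): RefinementOutcome<Q> {
  const accepted = (e: LoopEvaluation) => e.answersQuestion && !e.needsRefinement;

  let query = initial;
  let evaluation = evaluate(query);
  let refinementCount = 0;

  while (!accepted(evaluation) && refinementCount < maxRefinements) {
    refinementCount++;
    query = refine(query, evaluation);
    evaluation = evaluate(query);
  }

  return {
    query,
    evaluation,
    refinementCount,
    reachedMaxRefinements: !accepted(evaluation)
  };
}
```

Injecting the two steps lets the loop be unit-tested with stub functions, with no model or database involved.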
|
|
1279
|
+
|
|
1280
|
+
---
|
|
1281
|
+
|
|
1282
|
+
## Phase 8: Metadata Export (Week 7)
|
|
1283
|
+
|
|
1284
|
+
### 8.1 MetadataExporter Implementation
|
|
1285
|
+
**Purpose**: Export validated queries to MJ metadata format
|
|
1286
|
+
|
|
1287
|
+
**Output Format**: MJ Queries metadata JSON file
|
|
1288
|
+
|
|
1289
|
+
```typescript
|
|
1290
|
+
class MetadataExporter {
|
|
1291
|
+
async exportQueries(
|
|
1292
|
+
validatedQueries: ValidatedQuery[],
|
|
1293
|
+
outputDirectory: string
|
|
1294
|
+
): Promise<ExportResult> {
|
|
1295
|
+
// 1. Transform to MJ Query metadata format
|
|
1296
|
+
const metadata = validatedQueries.map(q => this.toQueryMetadata(q));
|
|
1297
|
+
|
|
1298
|
+
// 2. Create metadata file structure
|
|
1299
|
+
const metadataFile = {
|
|
1300
|
+
timestamp: new Date().toISOString(),
|
|
1301
|
+
generatedBy: 'query-gen',
|
|
1302
|
+
version: '1.0',
|
|
1303
|
+
queries: metadata
|
|
1304
|
+
};
|
|
1305
|
+
|
|
1306
|
+
// 3. Write to file
|
|
1307
|
+
const outputPath = path.join(outputDirectory, `queries-${Date.now()}.json`);
|
|
1308
|
+
await fs.writeFile(
|
|
1309
|
+
outputPath,
|
|
1310
|
+
JSON.stringify(metadataFile, null, 2),
|
|
1311
|
+
'utf-8'
|
|
1312
|
+
);
|
|
1313
|
+
|
|
1314
|
+
return {
|
|
1315
|
+
success: true,
|
|
1316
|
+
outputPath,
|
|
1317
|
+
queryCount: metadata.length
|
|
1318
|
+
};
|
|
1319
|
+
}
|
|
1320
|
+
|
|
1321
|
+
private toQueryMetadata(query: ValidatedQuery): QueryMetadataRecord {
|
|
1322
|
+
return {
|
|
1323
|
+
fields: {
|
|
1324
|
+
Name: this.generateQueryName(query.businessQuestion),
|
|
1325
|
+
CategoryID: '@lookup:Query Categories.Name=Auto-Generated',
|
|
1326
|
+
UserQuestion: query.businessQuestion.userQuestion,
|
|
1327
|
+
Description: query.businessQuestion.description,
|
|
1328
|
+
TechnicalDescription: query.businessQuestion.technicalDescription,
|
|
1329
|
+
SQL: query.query.sql,
|
|
1330
|
+
OriginalSQL: query.query.sql,
|
|
1331
|
+
UsesTemplate: true,
|
|
1332
|
+
Status: 'Active'
|
|
1333
|
+
},
|
|
1334
|
+
relatedEntities: {
|
|
1335
|
+
'Query Fields': query.query.selectClause.map((field, i) => ({
|
|
1336
|
+
fields: {
|
|
1337
|
+
QueryID: '@parent:ID',
|
|
1338
|
+
Name: field.name,
|
|
1339
|
+
Description: field.description,
|
|
1340
|
+
SQLBaseType: field.type,
|
|
1341
|
+
Sequence: i + 1
|
|
1342
|
+
}
|
|
1343
|
+
})),
|
|
1344
|
+
'Query Params': query.query.parameters.map((param, i) => ({
|
|
1345
|
+
fields: {
|
|
1346
|
+
QueryID: '@parent:ID',
|
|
1347
|
+
Name: param.name,
|
|
1348
|
+
Type: param.type,
|
|
1349
|
+
Description: param.description,
|
|
1350
|
+
ValidationFilters: param.usage.join(', '),
|
|
1351
|
+
IsRequired: param.isRequired,
|
|
1352
|
+
DefaultValue: param.defaultValue,
|
|
1353
|
+
Sequence: i + 1
|
|
1354
|
+
}
|
|
1355
|
+
}))
|
|
1356
|
+
}
|
|
1357
|
+
};
|
|
1358
|
+
}
|
|
1359
|
+
|
|
1360
|
+
private generateQueryName(question: BusinessQuestion): string {
|
|
1361
|
+
// Convert user question to a concise name
|
|
1362
|
+
// "What are the top customers by revenue?" -> "What Are The Top Customers" (words of 1-2 characters dropped, first five kept)
|
|
1363
|
+
return question.userQuestion
|
|
1364
|
+
.replace(/\?/g, '')
|
|
1365
|
+
.split(' ')
|
|
1366
|
+
.filter(word => word.length > 2)
|
|
1367
|
+
.map(word => word.charAt(0).toUpperCase() + word.slice(1))
|
|
1368
|
+
.slice(0, 5)
|
|
1369
|
+
.join(' ');
|
|
1370
|
+
}
|
|
1371
|
+
}
|
|
1372
|
+
```
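To make the naming step easier to check, here is a minimal standalone sketch. `generateQueryName` repeats the plan's steps on a plain string. `generateQueryNameWithStopWords` and `LEADING_STOP_WORDS` are hypothetical and not part of the plan; they show one possible way to reach a name like "Top Customers By Revenue" by skipping leading question words instead of filtering on word length.

```typescript
// Same steps as the plan: strip '?', drop words of 1-2 characters,
// capitalize each word, keep the first five.
function generateQueryName(userQuestion: string): string {
  return userQuestion
    .replace(/\?/g, '')
    .split(' ')
    .filter(word => word.length > 2)
    .map(word => word.charAt(0).toUpperCase() + word.slice(1))
    .slice(0, 5)
    .join(' ');
}

// Hypothetical alternative (an assumption, not the plan's method):
// skip leading question words, then keep up to maxWords words.
const LEADING_STOP_WORDS = new Set(['what', 'which', 'who', 'how', 'are', 'is', 'the', 'a', 'an']);

function generateQueryNameWithStopWords(userQuestion: string, maxWords = 5): string {
  const words = userQuestion
    .replace(/\?/g, '')
    .split(/\s+/)
    .filter(w => w.length > 0);

  // Advance past the leading run of stop words.
  let start = 0;
  while (start < words.length && LEADING_STOP_WORDS.has(words[start].toLowerCase())) {
    start++;
  }

  return words
    .slice(start, start + maxWords)
    .map(w => w.charAt(0).toUpperCase() + w.slice(1))
    .join(' ');
}

console.log(generateQueryName('What are the top customers by revenue?'));
// -> What Are The Top Customers
console.log(generateQueryNameWithStopWords('What are the top customers by revenue?'));
// -> Top Customers By Revenue
```

The length filter removes "by" and keeps "What", "are" and "the", so the five-word cut happens before "Revenue" is reached.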
|
|
1373
|
+
|
|
1374
|
+
### 8.2 Database Direct Insert (Optional)
|
|
1375
|
+
**Purpose**: Alternative to metadata files; inserts the generated queries directly into the database
|
|
1376
|
+
|
|
1377
|
+
```typescript
|
|
1378
|
+
class QueryDatabaseWriter {
|
|
1379
|
+
async writeQueriesToDatabase(
|
|
1380
|
+
validatedQueries: ValidatedQuery[],
|
|
1381
|
+
contextUser: UserInfo
|
|
1382
|
+
): Promise<WriteResult> {
|
|
1383
|
+
const md = new Metadata();
|
|
1384
|
+
const results: string[] = [];
|
|
1385
|
+
|
|
1386
|
+
for (const vq of validatedQueries) {
|
|
1387
|
+
try {
|
|
1388
|
+
// 1. Create Query entity
|
|
1389
|
+
const query = await md.GetEntityObject<QueryEntity>('Queries', contextUser);
|
|
1390
|
+
query.NewRecord();
|
|
1391
|
+
query.Name = this.generateQueryName(vq.businessQuestion);
|
|
1392
|
+
query.CategoryID = await this.findOrCreateCategory('Auto-Generated');
|
|
1393
|
+
query.UserQuestion = vq.businessQuestion.userQuestion;
|
|
1394
|
+
query.Description = vq.businessQuestion.description;
|
|
1395
|
+
query.TechnicalDescription = vq.businessQuestion.technicalDescription;
|
|
1396
|
+
query.SQL = vq.query.sql;
|
|
1397
|
+
query.OriginalSQL = vq.query.sql;
|
|
1398
|
+
query.UsesTemplate = true;
|
|
1399
|
+
query.Status = 'Active';
|
|
1400
|
+
|
|
1401
|
+
const saved = await query.Save();
|
|
1402
|
+
if (!saved) {
|
|
1403
|
+
throw new Error(`Failed to save query: ${query.LatestResult?.Message}`);
|
|
1404
|
+
}
|
|
1405
|
+
|
|
1406
|
+
// 2. Create Query Fields
|
|
1407
|
+
for (let i = 0; i < vq.query.selectClause.length; i++) {
|
|
1408
|
+
const field = vq.query.selectClause[i];
|
|
1409
|
+
const qf = await md.GetEntityObject<QueryFieldEntity>('Query Fields', contextUser);
|
|
1410
|
+
qf.NewRecord();
|
|
1411
|
+
qf.QueryID = query.ID;
|
|
1412
|
+
qf.Name = field.name;
|
|
1413
|
+
qf.Description = field.description;
|
|
1414
|
+
qf.SQLBaseType = field.type;
|
|
1415
|
+
qf.Sequence = i + 1;
|
|
1416
|
+
if (!(await qf.Save())) throw new Error(`Failed to save query field "${field.name}": ${qf.LatestResult?.Message}`);
|
|
1417
|
+
}
|
|
1418
|
+
|
|
1419
|
+
// 3. Create Query Params
|
|
1420
|
+
for (let i = 0; i < vq.query.parameters.length; i++) {
|
|
1421
|
+
const param = vq.query.parameters[i];
|
|
1422
|
+
const qp = await md.GetEntityObject<QueryParamEntity>('Query Params', contextUser);
|
|
1423
|
+
qp.NewRecord();
|
|
1424
|
+
qp.QueryID = query.ID;
|
|
1425
|
+
qp.Name = param.name;
|
|
1426
|
+
qp.Type = param.type;
|
|
1427
|
+
qp.Description = param.description;
|
|
1428
|
+
qp.IsRequired = param.isRequired;
|
|
1429
|
+
qp.DefaultValue = param.defaultValue;
|
|
1430
|
+
qp.Sequence = i + 1;
|
|
1431
|
+
if (!(await qp.Save())) throw new Error(`Failed to save query param "${param.name}": ${qp.LatestResult?.Message}`);
|
|
1432
|
+
}
|
|
1433
|
+
|
|
1434
|
+
results.push(`✓ ${query.Name} (ID: ${query.ID})`);
|
|
1435
|
+
|
|
1436
|
+
} catch (error) {
|
|
1437
|
+
results.push(`✗ ${vq.businessQuestion.userQuestion}: ${extractErrorMessage(error, 'Database Write')}`);
|
|
1438
|
+
}
|
|
1439
|
+
}
|
|
1440
|
+
|
|
1441
|
+
return {
|
|
1442
|
+
success: results.every(r => r.startsWith('✓')), // false if any query failed to save
|
|
1443
|
+
results
|
|
1444
|
+
};
|
|
1445
|
+
}
|
|
1446
|
+
}
|
|
1447
|
+
```
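The writer above continues past individual failures and records one line per query. A minimal sketch of that pattern in isolation follows. `writeAll`, `SaveFn` and the `label` callback are hypothetical names used only for illustration, and the `save` callback stands in for "save one query and its child records".

```typescript
interface WriteResult {
  success: boolean;
  results: string[];
}

// Hypothetical stand-in for saving one item; resolves to the new record ID.
type SaveFn<T> = (item: T) => Promise<string>;

async function writeAll<T>(
  items: T[],
  label: (item: T) => string,
  save: SaveFn<T>
): Promise<WriteResult> {
  const results: string[] = [];
  let failures = 0;

  for (const item of items) {
    try {
      const id = await save(item);
      results.push(`✓ ${label(item)} (ID: ${id})`);
    } catch (error) {
      // One failed item does not stop the rest of the batch.
      failures++;
      const message = error instanceof Error ? error.message : String(error);
      results.push(`✗ ${label(item)}: ${message}`);
    }
  }

  // The overall flag reflects whether every item was saved.
  return { success: failures === 0, results };
}

// Usage: the second item fails, so success is false and both lines are reported.
writeAll(
  ['Top Customers', 'Broken Query'],
  name => name,
  async name => {
    if (name === 'Broken Query') throw new Error('save failed');
    return 'id-1';
  }
).then(result => console.log(result.success, result.results));
```

Counting failures separately lets the caller distinguish a fully successful run from a partial one without parsing the result strings.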
|
|
1448
|
+
|
|
1449
|
+
---
|
|
1450
|
+
|
|
1451
|
+
## Phase 9: CLI Implementation (Week 8)
|
|
1452
|
+
|
|
1453
|
+
### 9.1 CLI Command Structure
|
|
1454
|
+
```typescript
|
|
1455
|
+
// src/cli/index.ts
|
|
1456
|
+
import { Command } from 'commander';
|
|
1457
|
+
|
|
1458
|
+
const program = new Command();
|
|
1459
|
+
|
|
1460
|
+
program
|
|
1461
|
+
.name('mj querygen')
|
|
1462
|
+
.description('AI-powered SQL query template generation for MemberJunction')
|
|
1463
|
+
.version('1.0.0');
|
|
1464
|
+
|
|
1465
|
+
program
|
|
1466
|
+
.command('generate')
|
|
1467
|
+
.description('Generate queries for entities')
|
|
1468
|
+
.option('-e, --entities <names...>', 'Specific entities to generate queries for')
|
|
1469
|
+
.option('-x, --exclude-entities <names...>', 'Entities to exclude')
|
|
1470
|
+
.option('-s, --exclude-schemas <names...>', 'Schemas to exclude')
|
|
1471
|
+
.option('-m, --max-entities <number>', 'Max entities per group', '3')
|
|
1472
|
+
.option('-r, --max-refinements <number>', 'Max refinement iterations', '3')
|
|
1473
|
+
.option('-f, --max-fixes <number>', 'Max error-fixing attempts', '5')
|
|
1474
|
+
.option('--model <name>', 'Preferred AI model')
|
|
1475
|
+
.option('--vendor <name>', 'Preferred AI vendor')
|
|
1476
|
+
.option('-o, --output <path>', 'Output directory', './metadata/queries')
|
|
1477
|
+
.option('--mode <mode>', 'Output mode: metadata|database|both', 'metadata')
|
|
1478
|
+
.option('-v, --verbose', 'Verbose output')
|
|
1479
|
+
.action(generateCommand);
|
|
1480
|
+
|
|
1481
|
+
program
|
|
1482
|
+
.command('validate')
|
|
1483
|
+
.description('Validate existing query templates')
|
|
1484
|
+
.option('-p, --path <path>', 'Path to queries metadata file')
|
|
1485
|
+
.action(validateCommand);
|
|
1486
|
+
|
|
1487
|
+
program
|
|
1488
|
+
.command('export')
|
|
1489
|
+
.description('Export queries from database to metadata files')
|
|
1490
|
+
.option('-o, --output <path>', 'Output directory')
|
|
1491
|
+
.action(exportCommand);
|
|
1492
|
+
|
|
1493
|
+
program.parse();
|
|
1494
|
+
```
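Commander passes option arguments to the action as strings unless a custom parser is supplied, so the numeric defaults above (`'3'`, `'5'`) arrive as text. A minimal sketch of converting them before use follows. `toPositiveInt`, `parseLimits`, `RawGenerateOptions`, `GenerationLimits` and the `maxEntitiesPerGroup` field are hypothetical; `maxRefinementIterations` and `maxFixingIterations` follow the config names used elsewhere in this plan.

```typescript
// Hypothetical helper: convert an option string such as '3' into a positive integer.
function toPositiveInt(value: string, optionName: string): number {
  const parsed = Number(value);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${optionName} must be a positive integer, got "${value}"`);
  }
  return parsed;
}

// Raw option values as the generate action would receive them (camelCased names, string values).
interface RawGenerateOptions {
  maxEntities: string;
  maxRefinements: string;
  maxFixes: string;
}

// Numeric limits after validation.
interface GenerationLimits {
  maxEntitiesPerGroup: number;
  maxRefinementIterations: number;
  maxFixingIterations: number;
}

function parseLimits(raw: RawGenerateOptions): GenerationLimits {
  return {
    maxEntitiesPerGroup: toPositiveInt(raw.maxEntities, '--max-entities'),
    maxRefinementIterations: toPositiveInt(raw.maxRefinements, '--max-refinements'),
    maxFixingIterations: toPositiveInt(raw.maxFixes, '--max-fixes'),
  };
}

// Usage with the default values from the command definition.
console.log(parseLimits({ maxEntities: '3', maxRefinements: '3', maxFixes: '5' }));
```

Validating at the CLI boundary keeps a value such as `--max-fixes abc` from reaching the fix loop as `NaN`.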
|
|
1495
|
+
|
|
1496
|
+
### 9.2 Generate Command Implementation
|
|
1497
|
+
```typescript
|
|
1498
|
+
async function generateCommand(options: any): Promise<void> {
|
|
1499
|
+
const spinner = ora('Initializing query generation...').start();
|
|
1500
|
+
|
|
1501
|
+
try {
|
|
1502
|
+
// 1. Load configuration
|
|
1503
|
+
const config = await loadConfig(options);
|
|
1504
|
+
|
|
1505
|
+
// 2. Connect to database and load metadata
|
|
1506
|
+
spinner.text = 'Loading metadata...';
|
|
1507
|
+
await Metadata.Provider.Config(false, contextUser);
|
|
1508
|
+
|
|
1509
|
+
// 3. Build entity groups
|
|
1510
|
+
spinner.text = 'Analyzing entity relationships...';
|
|
1511
|
+
const grouper = new EntityGrouper(config);
|
|
1512
|
+
const entityGroups = await grouper.generateEntityGroups();
|
|
1513
|
+
spinner.succeed(`Found ${entityGroups.length} entity groups`);
|
|
1514
|
+
|
|
1515
|
+
// 4. Initialize vector similarity search
|
|
1516
|
+
spinner.start('Embedding golden queries...');
|
|
1517
|
+
const embeddingService = new EmbeddingService(config.embeddingModel);
|
|
1518
|
+
const goldenQueries = await loadGoldenQueries();
|
|
1519
|
+
const embeddedGolden = await embeddingService.embedGoldenQueries(goldenQueries);
|
|
1520
|
+
spinner.succeed(`Embedded ${goldenQueries.length} golden queries`);
|
|
1521
|
+
|
|
1522
|
+
// 5. Generate queries for each entity group
|
|
1523
|
+
const totalGroups = entityGroups.length;
|
|
1524
|
+
let processedGroups = 0;
|
|
1525
|
+
const allValidatedQueries: ValidatedQuery[] = [];
|
|
1526
|
+
|
|
1527
|
+
for (const group of entityGroups) {
|
|
1528
|
+
processedGroups++;
|
|
1529
|
+
spinner.start(`[${processedGroups}/${totalGroups}] Processing ${group.primaryEntity.Name}...`);
|
|
1530
|
+
|
|
1531
|
+
try {
|
|
1532
|
+
// 5a. Generate business questions
|
|
1533
|
+
const questionGen = new QuestionGenerator(config);
|
|
1534
|
+
const questions = await questionGen.generateQuestions(group);
|
|
1535
|
+
|
|
1536
|
+
// 5b. For each question, generate and validate query
|
|
1537
|
+
for (const question of questions) {
|
|
1538
|
+
spinner.text = `[${processedGroups}/${totalGroups}] Generating query: ${question.userQuestion}`;
|
|
1539
|
+
|
|
1540
|
+
// Embed question for similarity search
|
|
1541
|
+
const questionEmbedding = await embeddingService.embedQuery({
|
|
1542
|
+
name: '',
|
|
1543
|
+
userQuestion: question.userQuestion,
|
|
1544
|
+
description: question.description,
|
|
1545
|
+
technicalDescription: question.technicalDescription,
|
|
1546
|
+
sql: ''
|
|
1547
|
+
});
|
|
1548
|
+
|
|
1549
|
+
// Find similar golden queries
|
|
1550
|
+
const similaritySearch = new SimilaritySearch();
|
|
1551
|
+
const fewShotExamples = await similaritySearch.findSimilarQueries(
|
|
1552
|
+
questionEmbedding,
|
|
1553
|
+
embeddedGolden,
|
|
1554
|
+
config.topSimilarQueries,
|
|
1555
|
+
config.similarityThreshold
|
|
1556
|
+
);
|
|
1557
|
+
|
|
1558
|
+
// Generate SQL query
|
|
1559
|
+
const queryWriter = new QueryWriter(config);
|
|
1560
|
+
const generatedQuery = await queryWriter.generateQuery(
|
|
1561
|
+
question,
|
|
1562
|
+
group.entities.map(e => formatEntityMetadata(e)),
|
|
1563
|
+
fewShotExamples.map(s => s.query)
|
|
1564
|
+
);
|
|
1565
|
+
|
|
1566
|
+
// Test and fix query
|
|
1567
|
+
const queryTester = new QueryTester(config);
|
|
1568
|
+
const testResult = await queryTester.testQuery(
|
|
1569
|
+
generatedQuery,
|
|
1570
|
+
config.maxFixingIterations
|
|
1571
|
+
);
|
|
1572
|
+
|
|
1573
|
+
if (!testResult.success) {
|
|
1574
|
+
spinner.warn(`Query failed after ${config.maxFixingIterations} attempts: ${question.userQuestion}`);
|
|
1575
|
+
continue;
|
|
1576
|
+
}
|
|
1577
|
+
|
|
1578
|
+
// Refine query
|
|
1579
|
+
const queryRefiner = new QueryRefiner(config);
|
|
1580
|
+
const refinedResult = await queryRefiner.refineQuery(
|
|
1581
|
+
generatedQuery,
|
|
1582
|
+
question,
|
|
1583
|
+
group.entities.map(e => formatEntityMetadata(e)),
|
|
1584
|
+
config.maxRefinementIterations
|
|
1585
|
+
);
|
|
1586
|
+
|
|
1587
|
+
allValidatedQueries.push({
|
|
1588
|
+
businessQuestion: question,
|
|
1589
|
+
query: refinedResult.query,
|
|
1590
|
+
testResult: refinedResult.testResult,
|
|
1591
|
+
evaluation: refinedResult.evaluation,
|
|
1592
|
+
entityGroup: group
|
|
1593
|
+
});
|
|
1594
|
+
|
|
1595
|
+
spinner.text = `[${processedGroups}/${totalGroups}] ✓ ${question.userQuestion}`;
|
|
1596
|
+
}
|
|
1597
|
+
|
|
1598
|
+
spinner.succeed(`[${processedGroups}/${totalGroups}] ${group.primaryEntity.Name} complete (${questions.length} questions)`);
|
|
1599
|
+
|
|
1600
|
+
} catch (error) {
|
|
1601
|
+
spinner.warn(`[${processedGroups}/${totalGroups}] Error processing ${group.primaryEntity.Name}: ${extractErrorMessage(error, 'Query Generation')}`);
|
|
1602
|
+
}
|
|
1603
|
+
}
|
|
1604
|
+
|
|
1605
|
+
// 6. Export results
|
|
1606
|
+
spinner.start(`Exporting ${allValidatedQueries.length} queries...`);
|
|
1607
|
+
|
|
1608
|
+
if (config.outputMode === 'metadata' || config.outputMode === 'both') {
|
|
1609
|
+
const exporter = new MetadataExporter();
|
|
1610
|
+
const exportResult = await exporter.exportQueries(
|
|
1611
|
+
allValidatedQueries,
|
|
1612
|
+
config.outputDirectory
|
|
1613
|
+
);
|
|
1614
|
+
spinner.succeed(`Exported to ${exportResult.outputPath}`);
|
|
1615
|
+
}
|
|
1616
|
+
|
|
1617
|
+
if (config.outputMode === 'database' || config.outputMode === 'both') {
|
|
1618
|
+
const dbWriter = new QueryDatabaseWriter();
|
|
1619
|
+
const writeResult = await dbWriter.writeQueriesToDatabase(
|
|
1620
|
+
allValidatedQueries,
|
|
1621
|
+
contextUser
|
|
1622
|
+
);
|
|
1623
|
+
spinner.succeed(`Wrote ${writeResult.results.filter(r => r.startsWith('✓')).length} of ${allValidatedQueries.length} queries to database`);
|
|
1624
|
+
}
|
|
1625
|
+
|
|
1626
|
+
// 7. Summary
|
|
1627
|
+
console.log('\n✅ Query generation complete!\n');
|
|
1628
|
+
console.log(`Entity Groups Processed: ${processedGroups}`);
|
|
1629
|
+
console.log(`Queries Generated: ${allValidatedQueries.length}`);
|
|
1630
|
+
console.log(`Output Location: ${config.outputDirectory}`);
|
|
1631
|
+
|
|
1632
|
+
} catch (error) {
|
|
1633
|
+
spinner.fail('Query generation failed');
|
|
1634
|
+
console.error(extractErrorMessage(error, 'Query Generation'));
|
|
1635
|
+
process.exit(1);
|
|
1636
|
+
}
|
|
1637
|
+
}
|
|
1638
|
+
```
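The command above selects few-shot examples by comparing the question embedding with the embedded golden queries. One way that lookup might work is cosine similarity with a threshold and a top-N cut, sketched below. This is an assumption about the approach, not the package's actual `SimilaritySearch` implementation; the `EmbeddedQuery` and `SimilarQuery` shapes are hypothetical, and the function is synchronous for brevity.

```typescript
interface EmbeddedQuery<T> {
  query: T;
  embedding: number[];
}

interface SimilarQuery<T> {
  query: T;
  similarity: number;
}

// Cosine similarity of two equal-length vectors; 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions differ');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep candidates at or above the threshold, best first, at most topN of them.
function findSimilarQueries<T>(
  questionEmbedding: number[],
  golden: EmbeddedQuery<T>[],
  topN: number,
  threshold: number
): SimilarQuery<T>[] {
  return golden
    .map(g => ({ query: g.query, similarity: cosineSimilarity(questionEmbedding, g.embedding) }))
    .filter(s => s.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topN);
}

// Usage with toy 2-dimensional embeddings.
const goldenSample: EmbeddedQuery<string>[] = [
  { query: 'Top customers by revenue', embedding: [1, 0] },
  { query: 'Orders per month', embedding: [0, 1] },
  { query: 'Revenue by region', embedding: [1, 1] },
];
const sampleMatches = findSimilarQueries([1, 0], goldenSample, 2, 0.5);
console.log(sampleMatches.map(m => m.query).join(', '));
// -> Top customers by revenue, Revenue by region
```

Filtering by threshold before the top-N cut means a question with no close golden query gets fewer (or no) examples instead of unrelated ones.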
|
|
1639
|
+
|
|
1640
|
+
### 9.3 Progress Reporting
|
|
1641
|
+
**Features**:
|
|
1642
|
+
- Use `ora` for spinners during long operations
|
|
1643
|
+
- Use `chalk` for colored console output
|
|
1644
|
+
- Show progress for each entity group: `[3/15] Processing Customers...`
|
|
1645
|
+
- Display summary statistics at the end
|
|
1646
|
+
- Save detailed logs to a file when verbose mode is enabled
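A minimal sketch of the progress and summary text described above, kept separate from `ora` and `chalk` so it can be tested without a terminal. `formatProgress`, `formatSummary`, `RunSummary` and the `queriesFailed` counter are hypothetical names, not part of the plan.

```typescript
// Builds the "[current/total] message" prefix used for per-group progress.
function formatProgress(current: number, total: number, message: string): string {
  return `[${current}/${total}] ${message}`;
}

interface RunSummary {
  groupsProcessed: number;
  queriesGenerated: number;
  queriesFailed: number; // hypothetical counter, not tracked in the plan's code
}

// Returns one line per statistic so the caller decides how to print or log them.
function formatSummary(summary: RunSummary): string[] {
  return [
    `Entity Groups Processed: ${summary.groupsProcessed}`,
    `Queries Generated: ${summary.queriesGenerated}`,
    `Queries Failed: ${summary.queriesFailed}`,
  ];
}

console.log(formatProgress(3, 15, 'Processing Customers...'));
// -> [3/15] Processing Customers...
formatSummary({ groupsProcessed: 15, queriesGenerated: 42, queriesFailed: 3 }).forEach(line => console.log(line));
```

Returning strings instead of printing directly allows the same text to go to the spinner, the console, or a verbose log file.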
|
|
1647
|
+
|
|
1648
|
+
---
|
|
1649
|
+
|
|
1650
|
+
## Phase 10: Testing & Documentation (Week 9)
|
|
1651
|
+
|
|
1652
|
+
### 10.1 Unit Tests
|
|
1653
|
+
**Test Coverage**:
|
|
1654
|
+
- Entity grouping logic
|
|
1655
|
+
- Vector similarity search
|
|
1656
|
+
- Query parameter rendering
|
|
1657
|
+
- SQL execution and error handling
|
|
1658
|
+
- Metadata export format
|
|
1659
|
+
|
|
1660
|
+
### 10.2 Integration Tests
|
|
1661
|
+
**Test Scenarios**:
|
|
1662
|
+
- Full generation workflow on test database
|
|
1663
|
+
- AI prompt failover scenarios
|
|
1664
|
+
- Query refinement iterations
|
|
1665
|
+
- Database vs. metadata output modes
|
|
1666
|
+
|
|
1667
|
+
### 10.3 Documentation
|
|
1668
|
+
**README.md Contents**:
|
|
1669
|
+
- Installation instructions
|
|
1670
|
+
- Configuration guide
|
|
1671
|
+
- CLI command reference
|
|
1672
|
+
- Example workflows
|
|
1673
|
+
- Troubleshooting guide
|
|
1674
|
+
|
|
1675
|
+
**Example Usage**:
|
|
1676
|
+
```bash
|
|
1677
|
+
# Generate queries for all entities
|
|
1678
|
+
mj querygen generate
|
|
1679
|
+
|
|
1680
|
+
# Generate for specific entities
|
|
1681
|
+
mj querygen generate -e Customers Orders Products
|
|
1682
|
+
|
|
1683
|
+
# Exclude schemas
|
|
1684
|
+
mj querygen generate -s __mj internal
|
|
1685
|
+
|
|
1686
|
+
# Override AI model
|
|
1687
|
+
mj querygen generate --model "Claude 4.5 Sonnet" --vendor Anthropic
|
|
1688
|
+
|
|
1689
|
+
# Output to database
|
|
1690
|
+
mj querygen generate --mode database
|
|
1691
|
+
|
|
1692
|
+
# Verbose output
|
|
1693
|
+
mj querygen generate -v
|
|
1694
|
+
```
|
|
1695
|
+
|
|
1696
|
+
---
|
|
1697
|
+
|
|
1698
|
+
## Phase 11: Optimization & Polish (Week 10)
|
|
1699
|
+
|
|
1700
|
+
### 11.1 Performance Optimizations
|
|
1701
|
+
- **Parallel Processing**: Generate queries for multiple entity groups in parallel (config: `parallelGenerations: 3`)
|
|
1702
|
+
- **Caching**: Cache AI prompt results to avoid re-running identical prompts
|
|
1703
|
+
- **Connection Pooling**: Reuse database connections efficiently
|
|
1704
|
+
- **Batching**: Process large entity lists in fixed-size batches instead of loading every entity at once
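The parallel processing item could be implemented with a small concurrency limiter. The sketch below is one possible shape under that assumption; `mapWithConcurrency` is a hypothetical helper, and the limit of 3 mirrors the `parallelGenerations: 3` value mentioned above.

```typescript
// Runs worker over items with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) {
    lanes.push(runLane());
  }

  // Rejects as soon as any worker throws; wrap the worker in try/catch to continue past failures.
  await Promise.all(lanes);
  return results;
}

// Usage: process hypothetical entity group names three at a time.
mapWithConcurrency(['Customers', 'Orders', 'Products', 'Invoices'], 3, async name => {
  await new Promise<void>(resolve => setTimeout(resolve, 10));
  return `${name} done`;
}).then(lines => console.log(lines));
```

Because JavaScript runs these lanes on a single thread, claiming an index with `next++` needs no locking.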
|
|
1705
|
+
|
|
1706
|
+
### 11.2 Error Handling Improvements
|
|
1707
|
+
- Graceful degradation when AI models are unavailable
|
|
1708
|
+
- Detailed error logs with context
|
|
1709
|
+
- Retry logic with exponential backoff
|
|
1710
|
+
- User-friendly error messages
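A minimal sketch of retry logic with exponential backoff follows, assuming a doubling delay with an upper cap. `withRetry`, `backoffDelay`, `RetryOptions` and the injectable `sleep` parameter are hypothetical names; the plan does not specify delays or attempt counts.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // delay after the first failure
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay to wait after the given 1-based attempt fails: base, 2x base, 4x base, ... capped.
function backoffDelay(attempt: number, options: RetryOptions): number {
  return Math.min(options.maxDelayMs, options.baseDelayMs * Math.pow(2, attempt - 1));
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
  // Injectable so tests can record delays instead of waiting.
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new Error('maxAttempts must be at least 1');
  }

  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No wait after the final attempt; the error is rethrown below.
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelay(attempt, options));
      }
    }
  }
  throw lastError;
}

// Usage: a call that fails twice before succeeding.
let demoCalls = 0;
withRetry(
  async () => {
    demoCalls++;
    if (demoCalls < 3) throw new Error('model unavailable');
    return 'ok';
  },
  { maxAttempts: 5, baseDelayMs: 10, maxDelayMs: 100 }
).then(value => console.log(value, 'after', demoCalls, 'calls'));
```

Capping the delay keeps a long outage from pushing individual waits beyond a known bound.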
|
|
1711
|
+
|
|
1712
|
+
### 11.3 Code Quality
|
|
1713
|
+
- ESLint/Prettier formatting
|
|
1714
|
+
- TypeScript strict mode
|
|
1715
|
+
- Comprehensive JSDoc comments
|
|
1716
|
+
- Refactor long functions (follow functional decomposition guidelines)
|
|
1717
|
+
|
|
1718
|
+
---
|
|
1719
|
+
|
|
1720
|
+
## Summary Timeline
|
|
1721
|
+
|
|
1722
|
+
| Phase | Duration | Key Deliverables |
|
|
1723
|
+
|-------|----------|------------------|
|
|
1724
|
+
| 1. Project Setup | Week 1 | Package structure, config system, dependencies |
|
|
1725
|
+
| 2. Entity Analysis | Week 2 | EntityGrouper, relationship graph |
|
|
1726
|
+
| 3. Business Questions | Week 2-3 | QuestionGenerator, AI prompt |
|
|
1727
|
+
| 4. Vector Similarity | Week 3 | EmbeddingService, SimilaritySearch |
|
|
1728
|
+
| 5. SQL Generation | Week 4 | QueryWriter, few-shot learning |
|
|
1729
|
+
| 6. Query Testing | Week 5 | QueryTester, QueryFixer, error handling |
|
|
1730
|
+
| 7. Query Refinement | Week 6 | QueryRefiner, evaluation loop |
|
|
1731
|
+
| 8. Metadata Export | Week 7 | MetadataExporter, database writer |
|
|
1732
|
+
| 9. CLI Implementation | Week 8 | Command structure, progress reporting |
|
|
1733
|
+
| 10. Testing & Docs | Week 9 | Unit tests, integration tests, README |
|
|
1734
|
+
| 11. Optimization | Week 10 | Performance tuning, error handling, polish |
|
|
1735
|
+
|
|
1736
|
+
**Total Duration**: ~10 weeks
|
|
1737
|
+
|
|
1738
|
+
---
|
|
1739
|
+
|
|
1740
|
+
## Key Design Decisions Summary
|
|
1741
|
+
|
|
1742
|
+
1. **Configuration**: Integrate with `mj.config.cjs` for consistency
|
|
1743
|
+
2. **Golden Queries**: Embed as JSON file in `src/data/` directory
|
|
1744
|
+
3. **AI Prompts**: 5 new prompts with 6-model failover configuration
|
|
1745
|
+
4. **Vector Search**: Use local embeddings (`all-MiniLM-L6-v2`) for similarity
|
|
1746
|
+
5. **Testing Strategy**: Render with sample values → execute → fix → refine
|
|
1747
|
+
6. **Output Modes**: Metadata files (default), database, or both
|
|
1748
|
+
7. **Parallelization**: Process multiple entity groups concurrently
|
|
1749
|
+
8. **Error Handling**: Follow MJ standards with `extractErrorMessage` utility
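The execute and fix steps of the testing strategy (item 5) can be pictured as a bounded loop. The sketch below is an illustration under assumptions, not the package's `QueryTester`: `testWithFixes`, `TestOutcome`, and the `execute` and `fix` callbacks are hypothetical stand-ins for running the rendered SQL and asking an AI prompt for a corrected version.

```typescript
interface TestOutcome {
  success: boolean;
  sql: string;       // last SQL that was tried
  attempts: number;  // number of executions performed
  lastError?: string;
}

async function testWithFixes(
  sql: string,
  execute: (sql: string) => Promise<void>,              // hypothetical: throws on a SQL error
  fix: (sql: string, error: string) => Promise<string>, // hypothetical: returns corrected SQL
  maxFixingIterations: number
): Promise<TestOutcome> {
  let current = sql;
  let lastError = '';

  for (let attempt = 1; attempt <= maxFixingIterations; attempt++) {
    try {
      await execute(current);
      return { success: true, sql: current, attempts: attempt };
    } catch (error) {
      lastError = error instanceof Error ? error.message : String(error);
      // Ask for a fix only if another execution will follow.
      if (attempt < maxFixingIterations) {
        current = await fix(current, lastError);
      }
    }
  }

  return { success: false, sql: current, attempts: maxFixingIterations, lastError };
}

// Usage: the first execution fails, the fixed SQL passes on the second.
testWithFixes(
  'SELEC 1',
  async sql => {
    if (sql !== 'SELECT 1') throw new Error('syntax error');
  },
  async () => 'SELECT 1',
  5
).then(outcome => console.log(outcome.success, outcome.attempts, outcome.sql));
// -> true 2 SELECT 1
```

Bounding the loop keeps a query that cannot be repaired from consuming unlimited AI calls; the caller can skip it and move on to the next question.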
|
|
1750
|
+
|
|
1751
|
+
---
|
|
1752
|
+
|
|
1753
|
+
This plan lays out a phased roadmap for implementing the `@memberjunction/query-gen` package, with a testable milestone at the end of each phase.
|