@quarri/claude-data-tools 1.0.2 → 1.1.0

@@ -0,0 +1,159 @@
+ ---
+ description: Generate SQL queries from natural language questions
+ globs:
+ alwaysApply: false
+ ---
+
+ # /quarri-query - Natural Language to SQL
+
+ Generate SQL queries from natural language questions using your Quarri database schema.
+
+ ## When to Use
+
+ Use `/quarri-query` when users ask data questions like:
+ - "Show me revenue by region"
+ - "What are the top 10 customers by order count?"
+ - "How many orders were placed last month?"
+
+ ## Workflow
+
+ ### Step 1: Fetch Context
+
+ Before generating SQL, gather the necessary context:
+
+ ```
+ 1. Get database schema using quarri_get_schema
+ 2. Search for relevant metrics using quarri_search_metrics with the question
+ 3. Get rules that apply using quarri_list_rules
+ 4. If the question mentions specific values (names, categories), use quarri_search_values to find exact matches
+ ```
+
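The four context-gathering steps above can be sketched as one helper. This is a minimal sketch, assuming each Quarri tool is reachable as a Python callable: the `tools` mapping, the argument shapes, and the `mentions_values` flag are hypothetical and are not defined by the package.

```python
from typing import Any, Callable, Dict


def gather_context(
    question: str,
    tools: Dict[str, Callable[..., Any]],
    mentions_values: bool = False,
) -> Dict[str, Any]:
    """Collect schema, metrics, rules, and (optionally) value matches.

    `tools` maps tool names to callables; how the real tools are invoked
    and what they return is an assumption of this sketch.
    """
    context: Dict[str, Any] = {
        "schema": tools["quarri_get_schema"](),
        "metrics": tools["quarri_search_metrics"](question),
        "rules": tools["quarri_list_rules"](),
    }
    # The value search only applies when the question names specific values.
    if mentions_values:
        context["values"] = tools["quarri_search_values"](question)
    return context
```

Keeping the value search conditional mirrors the "if the question mentions specific values" wording of the fourth step.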
+ ### Step 2: Understand the Schema
+
+ The Quarri schema follows a star schema pattern:
+ - **Fact tables**: Contain numeric measures (revenue, quantity, counts)
+ - **Dimension tables**: Contain descriptive attributes (customer name, product category, region)
+ - **Bridge table** (`quarri.bridge`): Pre-joined view connecting facts to dimensions
+
+ Always prefer querying through `quarri.bridge` or `quarri.schema` for optimal joins.
+
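To make the star schema layout concrete, here is a small sketch that uses SQLite as a stand-in engine. The real definition of `quarri.bridge` is not shown in the source, so the table names (`dim_customer`, `fact_orders`), the columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so objects can be addressed as quarri.<name>.
conn.execute("ATTACH DATABASE ':memory:' AS quarri")
conn.executescript(
    """
    -- Dimension table: descriptive attributes (hypothetical).
    CREATE TABLE quarri.dim_customer (
        customer_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        region TEXT
    );
    -- Fact table: numeric measures plus a key into the dimension (hypothetical).
    CREATE TABLE quarri.fact_orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        revenue REAL
    );
    -- Stand-in for the pre-joined bridge view.
    CREATE VIEW quarri.bridge AS
    SELECT f.order_id, f.revenue, d.customer_name, d.region
    FROM quarri.fact_orders AS f
    JOIN quarri.dim_customer AS d ON d.customer_id = f.customer_id;

    INSERT INTO quarri.dim_customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
    INSERT INTO quarri.fact_orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
    """
)

# Querying the view needs no explicit join: the bridge already carries both
# the measure (revenue) and the dimension attribute (region).
bridge_rows = conn.execute(
    "SELECT region, SUM(revenue) AS total_revenue "
    "FROM quarri.bridge GROUP BY region ORDER BY total_revenue DESC"
).fetchall()
```

The join logic lives in one place (the view), so generated queries only need to pick columns and aggregate.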
+ ### Step 3: Generate SQL
+
+ When generating SQL, follow these principles:
+
+ **Column Selection:**
+ - Use explicit column names, not `SELECT *`
+ - Include only the columns needed to answer the question
+ - Add appropriate aggregations (SUM, COUNT, AVG) for measures
+
+ **Joins:**
+ - Prefer the pre-joined `quarri.bridge` or `quarri.schema` views
+ - If joining manually, use the relationships defined in the schema
+
+ **Filters:**
+ - Apply filters mentioned in the question
+ - Use the `quarri_search_values` results to match filter values exactly
+ - Respect any rules defined for specific columns
+
+ **Grouping:**
+ - Group by all non-aggregated columns
+ - Consider time granularity (day, month, year) for date columns
+
+ **Ordering:**
+ - Order by the most relevant column (usually the measure, descending)
+ - Limit results to a reasonable number (default 100)
+
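The principles above can be illustrated with a small query builder. This is a sketch under stated assumptions, not how the package generates SQL: `build_aggregation_sql` and its parameters are hypothetical, and identifiers are interpolated without quoting, so it should only be given names taken from the schema.

```python
from typing import Optional, Sequence


def build_aggregation_sql(
    dimensions: Sequence[str],
    measure: str,
    aggregate: str = "SUM",
    where: Optional[str] = None,
    limit: int = 100,
    source: str = "quarri.bridge",
) -> str:
    """Assemble an aggregation query that follows the principles above."""
    alias = f"total_{measure}"
    # Explicit column names only; never SELECT *.
    select_list = ", ".join([*dimensions, f"{aggregate}({measure}) AS {alias}"])
    parts = [f"SELECT {select_list}", f"FROM {source}"]
    if where:
        parts.append(f"WHERE {where}")
    if dimensions:
        # Every non-aggregated column in the SELECT list must be grouped.
        parts.append("GROUP BY " + ", ".join(dimensions))
    # Order by the measure, descending, and cap the result size.
    parts.append(f"ORDER BY {alias} DESC")
    parts.append(f"LIMIT {limit}")
    return "\n".join(parts) + ";"
```

For example, `build_aggregation_sql(["region"], "revenue")` yields a grouped query over `quarri.bridge` with the default limit of 100.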
+ ### Step 4: Validate and Execute
+
+ 1. Show the generated SQL to the user with a brief explanation
+ 2. Execute using `quarri_execute_sql`
+ 3. Present results in a clear table format
+
+ ## SQL Generation Patterns
+
+ ### Aggregation Queries
+ ```sql
+ SELECT
+     dimension_column,
+     SUM(measure_column) AS total_measure
+ FROM quarri.bridge
+ WHERE filter_conditions
+ GROUP BY dimension_column
+ ORDER BY total_measure DESC
+ LIMIT 100;
+ ```
+
+ ### Time Series Queries
+ ```sql
+ SELECT
+     DATE_TRUNC('month', date_column) AS period,
+     SUM(measure_column) AS total_measure
+ FROM quarri.bridge
+ WHERE date_column >= DATE '2024-01-01'
+ GROUP BY period
+ ORDER BY period;
+ ```
+
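The time series pattern can be tried on sample data. This sketch uses SQLite as a stand-in engine, which has neither `DATE_TRUNC` nor `DATE '...'` literals, so it substitutes `strftime('%Y-%m-01', ...)` and a plain string comparison. The `order_date` and `revenue` columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach an in-memory database so the table can be addressed as quarri.bridge.
conn.execute("ATTACH DATABASE ':memory:' AS quarri")
conn.execute("CREATE TABLE quarri.bridge (order_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO quarri.bridge VALUES (?, ?)",
    [
        ("2023-12-30", 40.0),  # before the cutoff, filtered out
        ("2024-01-05", 100.0),
        ("2024-01-20", 50.0),
        ("2024-02-03", 75.0),
    ],
)

monthly = conn.execute(
    """
    SELECT
        strftime('%Y-%m-01', order_date) AS period,  -- stand-in for DATE_TRUNC('month', ...)
        SUM(revenue) AS total_revenue
    FROM quarri.bridge
    WHERE order_date >= '2024-01-01'
    GROUP BY period
    ORDER BY period
    """
).fetchall()
```

Ordering by `period` ascending, rather than by the measure, keeps the series in chronological order.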
+ ### Top N Queries
+ ```sql
+ SELECT
+     dimension_column,
+     SUM(measure_column) AS total_measure
+ FROM quarri.bridge
+ GROUP BY dimension_column
+ ORDER BY total_measure DESC
+ LIMIT 10;
+ ```
+
+ ### Comparison Queries
+ ```sql
+ SELECT
+     category_column,
+     segment_column,
+     SUM(measure_column) AS total_measure
+ FROM quarri.bridge
+ GROUP BY category_column, segment_column
+ ORDER BY category_column, total_measure DESC;
+ ```
+
+ ## Handling Ambiguity
+
+ When the question is ambiguous:
+
+ 1. **Multiple possible metrics**: List the available metrics and ask which one to use
+ 2. **Unclear time range**: Default to "all time" or ask for clarification
+ 3. **Unknown column values**: Use `quarri_search_values` to find matches and confirm with the user
+ 4. **Multiple valid interpretations**: Present the most likely interpretation and ask for confirmation
+
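The first case above (multiple possible metrics) can be sketched as a small decision function. `resolve_metric` is a hypothetical helper, and the wording of the clarifying questions is illustrative only.

```python
from typing import List, Optional, Tuple


def resolve_metric(candidates: List[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (metric, question); exactly one of the two is set."""
    if len(candidates) == 1:
        # A single match is unambiguous, so use it directly.
        return candidates[0], None
    if not candidates:
        return None, "No matching metric was found. Which measure should be used?"
    # Several matches: list them and ask instead of guessing.
    options = ", ".join(candidates)
    return None, f"Several metrics match: {options}. Which one should be used?"
```

Returning a question instead of a guess keeps the choice with the user whenever more than one metric fits.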
+ ## Example Interaction
+
+ **User**: "Show revenue by region"
+
+ **Claude's process**:
+ 1. Fetch schema - find `revenue` measure and `region` dimension
+ 2. Search metrics - find "Total Revenue" metric definition
+ 3. Generate SQL:
+    ```sql
+    SELECT
+        region,
+        SUM(revenue) AS total_revenue
+    FROM quarri.bridge
+    GROUP BY region
+    ORDER BY total_revenue DESC
+    LIMIT 100;
+    ```
+ 4. Execute and display results in a table
+
+ ## Error Handling
+
+ - If the schema fetch fails, report the connection error
+ - If a column doesn't exist, suggest similar columns from the schema
+ - If the query fails, explain the error and suggest corrections
+ - If the query returns no results, confirm that the query logic and filters are correct
+
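The "suggest similar columns" case above can be approximated with fuzzy string matching from the Python standard library. This is a sketch, assuming the schema's column names are available as a plain list; `suggest_columns` and its parameters are hypothetical.

```python
import difflib
from typing import List


def suggest_columns(missing: str, schema_columns: List[str], limit: int = 3) -> List[str]:
    """Return schema columns whose names are close to the missing one."""
    # Compare case-insensitively, but return the original spelling.
    lowered = {col.lower(): col for col in schema_columns}
    matches = difflib.get_close_matches(
        missing.lower(), list(lowered), n=limit, cutoff=0.6
    )
    return [lowered[m] for m in matches]
```

A misspelling such as `revenu` would surface `revenue` as a suggestion, while unrelated names fall below the cutoff and are left out.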
+ ## Output Format
+
+ Always present:
+ 1. **Generated SQL**: The query in a code block
+ 2. **Explanation**: Brief description of what the query does
+ 3. **Results**: Data in a formatted table
+ 4. **Row count**: Total rows returned
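
The four output parts above can be assembled by a small formatter. This is a sketch only: `format_response` is a hypothetical helper, and the Markdown table layout is one reasonable reading of "formatted table".

```python
from typing import Any, List, Sequence


def format_response(
    sql: str,
    explanation: str,
    columns: Sequence[str],
    rows: List[Sequence[Any]],
) -> str:
    """Render SQL, explanation, a Markdown results table, and the row count."""
    fence = "`" * 3  # built at runtime to avoid nesting code fences
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join(
        [
            f"{fence}sql", sql, fence, "",
            explanation, "",
            header, divider, *body, "",
            f"Row count: {len(rows)}",
        ]
    )
```

Deriving the row count from `len(rows)` keeps it consistent with the table that is actually shown.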