dremiojs 1.0.0
- package/.eslintrc.json +14 -0
- package/.prettierrc +7 -0
- package/README.md +59 -0
- package/dremiodocs/dremio-cloud/cloud-api-reference.md +748 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-about.md +225 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-admin.md +3754 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-bring-data.md +6098 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-changelog.md +32 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-developer.md +1147 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-explore-analyze.md +2522 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-get-started.md +300 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-help-support.md +869 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-manage-govern.md +800 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-overview.md +36 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-security.md +1844 -0
- package/dremiodocs/dremio-cloud/sql-docs.md +7180 -0
- package/dremiodocs/dremio-software/dremio-software-acceleration.md +1575 -0
- package/dremiodocs/dremio-software/dremio-software-admin.md +884 -0
- package/dremiodocs/dremio-software/dremio-software-client-applications.md +3277 -0
- package/dremiodocs/dremio-software/dremio-software-data-products.md +560 -0
- package/dremiodocs/dremio-software/dremio-software-data-sources.md +8701 -0
- package/dremiodocs/dremio-software/dremio-software-deploy-dremio.md +3446 -0
- package/dremiodocs/dremio-software/dremio-software-get-started.md +848 -0
- package/dremiodocs/dremio-software/dremio-software-monitoring.md +422 -0
- package/dremiodocs/dremio-software/dremio-software-reference.md +677 -0
- package/dremiodocs/dremio-software/dremio-software-security.md +2074 -0
- package/dremiodocs/dremio-software/dremio-software-v25-api.md +32637 -0
- package/dremiodocs/dremio-software/dremio-software-v26-api.md +36757 -0
- package/jest.config.js +10 -0
- package/package.json +25 -0
- package/src/api/catalog.ts +74 -0
- package/src/api/jobs.ts +105 -0
- package/src/api/reflection.ts +77 -0
- package/src/api/source.ts +61 -0
- package/src/api/user.ts +32 -0
- package/src/client/base.ts +66 -0
- package/src/client/cloud.ts +37 -0
- package/src/client/software.ts +73 -0
- package/src/index.ts +16 -0
- package/src/types/catalog.ts +31 -0
- package/src/types/config.ts +18 -0
- package/src/types/job.ts +18 -0
- package/src/types/reflection.ts +29 -0
- package/tests/integration_manual.ts +95 -0
- package/tsconfig.json +19 -0
|
@@ -0,0 +1,848 @@
|
|
|
1
|
+
# Dremio Software - Get Started
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Source: https://docs.dremio.com/current/get-started/
|
|
8
|
+
|
|
9
|
+
Version: current [26.x]
|
|
10
|
+
|
|
11
|
+
|
|
12
|
+
|
|
13
|
+
# Get Started with Dremio
|
|
14
|
+
|
|
15
|
+
Welcome to Dremio, a data lakehouse platform that facilitates high-performance, self-service analytics on large datasets.
|
|
16
|
+
|
|
17
|
+
To get started with Dremio, you have two options for a hands-on experience:
|
|
18
|
+
|
|
19
|
+
* [Enterprise Edition Free Trial](/current/get-started/kubernetes-trial) (Recommended) - Deploy and explore Dremio on Kubernetes using the Enterprise Edition free trial with all features unlocked.
|
|
20
|
+
* [Community Edition on Docker](/current/get-started/docker) - Deploy and explore Dremio on Docker using the Community Edition with a limited set of features. This option provides a local deployment with a single node, which is suggested for testing and evaluation purposes.
|
|
21
|
+
|
|
22
|
+
Choose the option that best fits your needs.
|
|
23
|
+
To learn about the differences between the two editions, see [Dremio Editions](/editions).
|
|
24
|
+
|
|
25
|
+
## Related Topics
|
|
26
|
+
|
|
27
|
+
If you want to learn more about Dremio, see the following:
|
|
28
|
+
|
|
29
|
+
* [Quick Tour of the Console](/current/get-started/quick_tour/) - A walkthrough of the Dremio console and how to best use its various capabilities.
|
|
30
|
+
* [What is Dremio?](/current/what-is-dremio/) - An overview of Dremio, its key concepts, and architecture.
|
|
31
|
+
|
|
32
|
+
|
|
41
|
+
|
|
42
|
+
---
|
|
43
|
+
|
|
44
|
+
# Source: https://docs.dremio.com/current/get-started/kubernetes-trial
|
|
45
|
+
|
|
46
|
+
Version: current [26.x]
|
|
47
|
+
|
|
48
|
+
|
|
49
|
+
|
|
50
|
+
# Get Started with the Enterprise Edition Free Trial
|
|
51
|
+
|
|
52
|
+
This guide walks you through deploying Dremio on Kubernetes with a free trial of the Enterprise Edition and exploring the features available in this edition. For more information, see [How Does the Enterprise Edition Free Trial Work](/current/admin/licensing/free-trial#how-does-the-enterprise-edition-free-trial-work).
|
|
53
|
+
|
|
54
|
+
## Prerequisites
|
|
55
|
+
|
|
56
|
+
Before deploying Dremio on Kubernetes, ensure you have the following:
|
|
57
|
+
|
|
58
|
+
* A hosted Kubernetes environment to deploy and manage the Dremio cluster.
|
|
59
|
+
Each Dremio release is tested against [Amazon Elastic Kubernetes Service (EKS)](https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html), [Azure Kubernetes Service (AKS)](https://learn.microsoft.com/en-us/azure/aks/what-is-aks), and [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine?hl=en#how-it-works) to ensure compatibility. If you have a containerization platform built on top of Kubernetes that is not listed here, please contact your provider and the Dremio Account Team regarding compatibility.
|
|
60
|
+
* Helm 3 installed on your local machine to run Helm commands. For installation instructions, refer to [Installing Helm](https://helm.sh/docs/intro/install/) in the Helm documentation.
|
|
61
|
+
* A local kubectl configured to access your Kubernetes cluster. For installation instructions, refer to [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) in the Kubernetes documentation.
|
|
62
|
+
* Object Storage: Amazon S3 (including S3-compatible, e.g., MinIO), Azure Storage, or Google Cloud Storage (GCS).
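Before going further, it can help to confirm that the command-line prerequisites are actually on your `PATH`. The following is a minimal sketch in POSIX shell; the function name `missing_tools` is an assumption made for illustration and is not part of Dremio, Helm, or kubectl.

```shell
# Print every tool from the argument list that is not found on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `missing_tools helm kubectl` prints nothing and succeeds when both tools are installed. It does not check versions, so verifying that Helm is version 3 remains a manual step.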
|
|
63
|
+
|
|
64
|
+
## Step 1: Deploy Dremio
|
|
65
|
+
|
|
66
|
+
Let's start by deploying the Enterprise Edition on your hosted Kubernetes environment:
|
|
67
|
+
|
|
68
|
+
1. If you haven't already, [sign up for the Enterprise Edition Free Trial](https://www.dremio.com/get-started/?utm_source=dremio-docs&utm_medium=referral).
|
|
69
|
+
2. In the email you receive from Dremio, click the link to download the `values-overrides.yaml` file, which contains the deployment information, and save the file locally.
|
|
70
|
+
3. Open the `values-overrides.yaml` file in an editor to make the following configurations:
|
|
71
|
+
|
|
72
|
+
1. For distributed storage, follow the instructions in [Configuring the Distributed Storage](/current/deploy-dremio/configuring-kubernetes/#configuring-the-distributed-storage), and then return here.
|
|
73
|
+
2. For object storage, follow the instructions in [Configuring Storage for Dremio Catalog](/current/deploy-dremio/configuring-kubernetes/#configuring-storage-for-dremio-catalog), and then return here.
|
|
74
|
+
3. (Optional) For the coordinator, you can adjust its default values by following the instructions in [Recommended Resources Configuration](/current/deploy-dremio/configuring-kubernetes/#recommended-resources-configuration), and then return here.
|
|
75
|
+
4. Save the `values-overrides.yaml` file.
|
|
76
|
+
4. Open a terminal window, and start the deployment by installing Dremio's Helm chart with the following command:
|
|
77
|
+
|
|
78
|
+
Run helm install
|
|
79
|
+
|
|
80
|
+
```
|
|
81
|
+
helm install <your-dremio-install-release> \
|
|
82
|
+
oci://quay.io/dremio/dremio-helm \
|
|
83
|
+
-f <your-local-path>/values-overrides.yaml
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
Where:
|
|
87
|
+
|
|
88
|
+
* `<your-dremio-install-release>` - The name that identifies your Dremio installation. For example, `dremio-1-0`.
|
|
89
|
+
* `<your-local-path>` - The path to reach the `values-overrides.yaml` file.
|
|
90
|
+
5. Monitor the deployment using the following command:
|
|
91
|
+
|
|
92
|
+
Run kubectl to monitor pods
|
|
93
|
+
|
|
94
|
+
```
|
|
95
|
+
kubectl get pods
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
The deployment is complete when all pods are in the `Ready` state.
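The readiness check above can also be scripted. The sketch below reads the table printed by `kubectl get pods` and succeeds only when every pod reports all of its containers ready. The function name `all_pods_ready` is hypothetical, and skipping pods whose status is `Completed` is an assumption about how you may want to treat finished jobs.

```shell
# Read the table printed by `kubectl get pods` on standard input and succeed
# only if every listed pod has all containers ready (READY column "n/n").
all_pods_ready() {
    awk '
        NR == 1 { next }                  # skip the header row
        $3 == "Completed" { next }        # assumption: finished jobs do not count
        {
            seen = 1
            split($2, ready, "/")         # READY column, for example "1/1"
            if (ready[1] != ready[2]) bad = 1
        }
        END { if (bad || !seen) exit 1 }  # fail on unready pods or an empty list
    '
}
```

A typical use would be `kubectl get pods | all_pods_ready && echo "deployment complete"`. As an alternative, `kubectl wait --for=condition=Ready pods --all --timeout=600s` blocks until the pods are ready, although it may time out if the namespace contains pods that finish without ever becoming ready.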
|
|
99
|
+
|
|
100
|
+
Now, access the Dremio console to interact with the platform in a user-friendly and visual way. It is a key component of the Dremio experience and is accessible through a web browser:
|
|
101
|
+
|
|
102
|
+
1. Run the following command in Kubernetes to find the host for the Dremio console:
|
|
103
|
+
|
|
104
|
+
Run kubectl to find the Dremio console
|
|
105
|
+
|
|
106
|
+
```
|
|
107
|
+
kubectl get services dremio-client
|
|
108
|
+
```
|
|
109
|
+
2. Depending on the value in the `TYPE` column of the output, open the Dremio console in your browser with one of the following URLs:
|
|
110
|
+
|
|
111
|
+
* https://EXTERNAL\_IP:9047 - If the value in the `TYPE` column is `LoadBalancer`, use the value from the `EXTERNAL-IP` column of the output in the URL. For example, `https://8.8.8.8:9047`.
|
|
112
|
+
* <http://localhost:32390> - If the value in the `TYPE` column is `NodePort`.
|
|
113
|
+
3. Follow the instructions, and enter your details.
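The choice of URL in step 2 can be expressed as a small helper. This is a sketch only: the function name `console_url` is an assumption, and the ports are the ones given in the two cases above.

```shell
# Print the console URL for a given service type and external IP,
# following the LoadBalancer and NodePort cases described in step 2.
console_url() {
    service_type=${1:-}
    external_ip=${2:-}
    case "$service_type" in
        LoadBalancer) printf 'https://%s:9047\n' "$external_ip" ;;
        NodePort)     printf 'http://localhost:32390\n' ;;
        *)            printf 'unsupported service type: %s\n' "$service_type" >&2
                      return 1 ;;
    esac
}
```

The service type could be read with `kubectl get service dremio-client -o jsonpath='{.spec.type}'` and passed to `console_url` together with the external address. Some providers report a hostname rather than an IP address for load balancers, so treat the second argument as whatever address your cluster exposes.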
|
|
114
|
+
|
|
115
|
+
You should have the Dremio console ready in your browser.
|
|
116
|
+
|
|
117
|
+

|
|
118
|
+
|
|
119
|
+
To learn how to navigate the Dremio console, see [Quick Tour of the Dremio Console](/current/get-started/quick_tour).
|
|
120
|
+
|
|
121
|
+
## Step 2: Create an Engine
|
|
122
|
+
|
|
123
|
+
Engines are responsible for query execution. Each engine comprises one or more executors that perform queries and Data Manipulation Language (DML) operations by running the query execution plan and transferring data among themselves to serve queries.
|
|
124
|
+
|
|
125
|
+
To create an engine, do the following:
|
|
126
|
+
|
|
127
|
+
1. Click  in the side navigation bar to go to the Settings page.
|
|
128
|
+
2. Select **Engines** from the settings sidebar, and then click **Add Engine** on the far right.
|
|
129
|
+
3. In the New Engine dialog, enter a name for your engine. For example, `my-engine`.
|
|
130
|
+
4. Click **Add**.
|
|
131
|
+
|
|
132
|
+
A new row for your engine appears with the **Status** set to `Starting`.
|
|
133
|
+
Wait until the **Status** changes to `Running` for the engine to be available to serve your queries.
|
|
134
|
+
|
|
135
|
+
**Note**
|
|
136
|
+
|
|
137
|
+
The engine you created is configured to automatically stop/start. This means that Dremio automatically stops the engine after 15 minutes of idle time to save resources. When a new query is issued, Dremio automatically starts the engine, but your query may take a bit longer to execute while the engine starts.
|
|
138
|
+
|
|
139
|
+
If you want to have the engine always running, edit the engine and uncheck the **Automatically start/stop** option.
|
|
140
|
+
|
|
141
|
+
## Step 3: Add the Sample Data
|
|
142
|
+
|
|
143
|
+
Let's add the sample datasets that will be used in this Get Started guide, namely:
|
|
144
|
+
|
|
145
|
+
* **NYC taxi trip data** – In Iceberg format, with more than 338 million records.
|
|
146
|
+
* **NYC weather data** – In CSV format, with more than 4 thousand records.
|
|
147
|
+
|
|
148
|
+
### Add the Datasets
|
|
149
|
+
|
|
150
|
+
Add the datasets from a sample data source, as follows:
|
|
151
|
+
|
|
152
|
+
1. In the Dremio console, click  in the side navigation bar to go to the Datasets page.
|
|
153
|
+
2. Click  right next to **Sources**.
|
|
154
|
+
3. In the Add Source dialog, select `Sample Source` in the **Object Storage** section.
|
|
155
|
+
|
|
156
|
+
### Format the Datasets
|
|
157
|
+
|
|
158
|
+
Now that the data source has been added, let's format the needed datasets as tables so that we can query them:
|
|
159
|
+
|
|
160
|
+
1. Under **Object Storage**, click the newly added `Samples` source, and then `samples.dremio.com` to see its details.
|
|
161
|
+
2. Hover over the `NYC-taxi-trips-iceberg` folder, and click  on the far right.
|
|
162
|
+
3. In the Folder Settings dialog, check the **Format**, verify that `Iceberg` is detected, and click **Save**.
|
|
163
|
+
4. Click  in the side navigation bar, click the `Samples` source, and then `samples.dremio.com` to see its details.
|
|
164
|
+
5. Hover over the `NYC-weather.csv` file, and click  on the far right.
|
|
165
|
+
6. In the Table Settings dialog, do the following:
|
|
166
|
+
1. For **Line Delimiter**, select `LF - Unix/Linux`.
|
|
167
|
+
2. Under **Options**, check **Extract Column Names**.
|
|
168
|
+
3. Click **Save**.
|
|
169
|
+
|
|
170
|
+
The sample data is now added, formatted, and ready to be queried.
|
|
171
|
+
You can validate it by clicking  in the side navigation bar, then the `Samples` source, and then `samples.dremio.com` to see its details:
|
|
172
|
+
|
|
173
|
+
* The icon for `NYC-taxi-trips-iceberg` is , which means the folder is formatted as a table.
|
|
174
|
+
* The icon for `NYC-weather.csv` is , which means the file is formatted as a table.
|
|
175
|
+
|
|
176
|
+
## Step 4: Create a Data Product
|
|
177
|
+
|
|
178
|
+
In this step, you will start creating a data product to explore the relationship between weather conditions and tipping behavior in taxi rides to answer the business question: "Do people tip more during taxi rides when it's raining?".
|
|
179
|
+
|
|
180
|
+
### Run the Query for the Data Product
|
|
181
|
+
|
|
182
|
+
To answer the business question, you will need the average tip amount per precipitation level. For that, combine the data in the `NYC-taxi-trips-iceberg` and `NYC-weather.csv` datasets on a common field: the date.
|
|
183
|
+
|
|
184
|
+
To do this, run the SQL query that joins the two datasets, and calculates the average tip amount per precipitation level:
|
|
185
|
+
|
|
186
|
+
1. Click  in the side navigation bar to go to the [SQL Runner](/current/get-started/quick_tour/#sql-runner).
|
|
187
|
+
2. Copy the SQL below, paste it in the SQL Runner, and click **Run**.
|
|
188
|
+
|
|
189
|
+
SQL to join datasets
|
|
190
|
+
|
|
191
|
+
```
|
|
192
|
+
SELECT AVG(tip_amount) as avg_tip_amount, prcp
|
|
193
|
+
FROM Samples."samples.dremio.com"."NYC-weather.csv"
|
|
194
|
+
JOIN Samples."samples.dremio.com"."NYC-taxi-trips-iceberg"
|
|
195
|
+
ON (TO_CHAR(CAST(pickup_datetime AS DATE), 'YYYY-MM-DD')) = SUBSTRING(CAST("date" AS CHAR) FROM 0 FOR 10)
|
|
196
|
+
GROUP BY prcp;
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
You will get the query results, as shown in the image below.
|
|
200
|
+
|
|
201
|
+

|
|
202
|
+
|
|
203
|
+
### Create the View for the Data Product
|
|
204
|
+
|
|
205
|
+
In Dremio, views are virtual tables based on the result set of a query. You can create views from data that resides in any data source, folder, table, or view that you have access to. You can also share views you've created with stakeholders in your organization.
|
|
206
|
+
|
|
207
|
+
Let's create a view for the data product from the query that you ran above:
|
|
208
|
+
|
|
209
|
+
1. Click **Save as View** on the far right to create a view of your query that others can access.
|
|
210
|
+
2. In the Save View As dialog, enter a name for your view. For example, `avg_tips_precipitation`.
|
|
211
|
+
3. Click **Save**.
|
|
212
|
+
|
|
213
|
+
The [lineage](/current/data-products/govern/lineage) graph shows how your datasets relate to one another, giving end-to-end visibility into how data is sourced and transformed. This helps you understand the data flow and dependencies between datasets.
|
|
214
|
+
For your newly created view, see the lineage by selecting the **Lineage** tab at the top of the page:
|
|
215
|
+
|
|
216
|
+

|
|
217
|
+
|
|
218
|
+
## Step 5: Accelerate the Query with Reflections
|
|
219
|
+
|
|
220
|
+
In this step, you will use [Reflections](/current/acceleration) to accelerate queries, particularly when working with large datasets.
|
|
221
|
+
|
|
222
|
+
### Enable the Reflection
|
|
223
|
+
|
|
224
|
+
Let's enable a [Raw Reflection](/current/acceleration/#types) to accelerate the query of your view:
|
|
225
|
+
|
|
226
|
+
1. Select the **Reflections** tab at the top of the page, toggle the **Raw Reflections** switch to on, and click **Save**.
|
|
227
|
+
2. On the far right, an animated spinner icon appears next to **Footprint**. Wait until it turns into a green checkmark, which means that the Reflection has been built and can accelerate your query.
|
|
228
|
+
|
|
229
|
+
### Run the Accelerated Query
|
|
230
|
+
|
|
231
|
+
Let's now query the view and see the acceleration in action:
|
|
232
|
+
|
|
233
|
+
1. Click , click `avg_tips_precipitation`, and click **Run** to execute the query.
|
|
234
|
+
2. Check the execution time. It's **a sub-second query**!
|
|
235
|
+
|
|
236
|
+

|
|
237
|
+
3. Now, go to the Jobs tab, and confirm that the query was accelerated with a Reflection.
|
|
238
|
+
|
|
239
|
+

|
|
240
|
+
|
|
241
|
+
You’ve just created a Raw Reflection and accelerated your query!
|
|
242
|
+
|
|
243
|
+
While creating a Reflection manually is a great way to understand how Dremio boosts performance, you don’t need to manage this complexity yourself in real-world environments if you use [Autonomous Reflections](/current/acceleration/autonomous-reflections) — available exclusively in the Enterprise Edition. Dremio will automatically create, select, and maintain the most efficient Reflections for you, saving time while ensuring consistently fast performance of your queries.
|
|
244
|
+
|
|
245
|
+
And that's it! You have finished the Get Started guide for the Enterprise Edition free trial.
|
|
246
|
+
|
|
247
|
+
Explore the documentation to learn more about Dremio, start using your data, build your data products, connect your client applications, and much more.
|
|
248
|
+
|
|
249
|
+
|
|
269
|
+
|
|
270
|
+
---
|
|
271
|
+
|
|
272
|
+
# Source: https://docs.dremio.com/current/get-started/docker
|
|
273
|
+
|
|
274
|
+
Version: current [26.x]
|
|
275
|
+
|
|
276
|
+
|
|
277
|
+
|
|
278
|
+
# Get Started with the Community Edition on Docker
|
|
279
|
+
|
|
280
|
+
This Docker-based Get Started guide offers a simple and fast way to spin up Dremio locally with the Community Edition and explore the capabilities available in this edition.
|
|
281
|
+
|
|
282
|
+
This Docker deployment is intended for testing and evaluation purposes and is not recommended for production usage. To try out a complete version of Dremio with enterprise-grade features, go to [Get Started with the Enterprise Edition Free Trial](/current/get-started/kubernetes-trial).
|
|
283
|
+
|
|
284
|
+
## Prerequisites
|
|
285
|
+
|
|
286
|
+
Before you start, download and install [Docker Desktop](https://www.docker.com/products/docker-desktop/).
|
|
287
|
+
|
|
288
|
+
## Step 1: Deploy Dremio
|
|
289
|
+
|
|
290
|
+
Let's deploy the Dremio Community Edition on Docker:
|
|
291
|
+
|
|
292
|
+
1. Open Docker Desktop.
|
|
293
|
+
2. Click **>\_Terminal** on the bottom-right of the screen, and run the following command:
|
|
294
|
+
|
|
295
|
+
Run Docker command
|
|
296
|
+
|
|
297
|
+
```
|
|
298
|
+
docker run \
|
|
299
|
+
-p 9047:9047 -p 31010:31010 -p 45678:45678 -p 32010:32010 \
|
|
300
|
+
-e DREMIO_JAVA_SERVER_EXTRA_OPTS=-Dpaths.dist=file:///opt/dremio/data/dist \
|
|
301
|
+
dremio/dremio-oss
|
|
302
|
+
```
|
|
303
|
+
|
|
304
|
+
After a couple of minutes, the container should be up and running, and Dremio is deployed.
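Rather than waiting a fixed amount of time, you can poll until the console responds. The helper below is a generic retry loop in POSIX shell; the name `wait_until` and its argument order are assumptions for illustration, and whether the console answers plain HTTP requests on port 9047 as soon as it is usable is also an assumption.

```shell
# wait_until TRIES DELAY COMMAND [ARG...]
# Run COMMAND up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `wait_until 60 5 curl -fsS -o /dev/null http://localhost:9047` retries for up to about five minutes and returns as soon as the request succeeds.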
|
|
305
|
+
|
|
306
|
+
Now, access the Dremio console, where you interact with the platform in a user-friendly and visual way. It is a key component of the Dremio experience and is accessible through a web browser:
|
|
307
|
+
|
|
308
|
+
1. In your browser, navigate to <http://localhost:9047>.
|
|
309
|
+
2. Enter your details when prompted, and click **Next**.
|
|
310
|
+
|
|
311
|
+
You should have the Dremio console ready in your browser.
|
|
312
|
+
|
|
313
|
+

|
|
314
|
+
|
|
315
|
+
To learn how to navigate the Dremio console, see [Quick Tour of the Dremio Console](/current/get-started/quick_tour).
|
|
316
|
+
|
|
317
|
+
## Step 2: Add the Sample Data
|
|
318
|
+
|
|
319
|
+
Let's add the sample datasets that will be used in this Get Started guide, namely:
|
|
320
|
+
|
|
321
|
+
* **NYC taxi trip data** – In Iceberg format, with more than 338 million records.
|
|
322
|
+
* **NYC weather data** – In CSV format, with more than 4 thousand records.
|
|
323
|
+
|
|
324
|
+
### Add the Datasets
|
|
325
|
+
|
|
326
|
+
Add the datasets from a sample data source, as follows:
|
|
327
|
+
|
|
328
|
+
1. In the Dremio console, click  in the side navigation bar to go to the Datasets page.
|
|
329
|
+
2. Click  right next to **Sources**.
|
|
330
|
+
3. In the Add Source dialog, select `Sample Source` in the **Object Storage** section.
|
|
331
|
+
|
|
332
|
+
### Format the Datasets
|
|
333
|
+
|
|
334
|
+
Now that the data source has been added, let's format the needed datasets as tables so that we can query them:
|
|
335
|
+
|
|
336
|
+
1. Under **Object Storage**, click the newly added `Samples` source, and then `samples.dremio.com` to see its details.
|
|
337
|
+
2. Hover over the `NYC-taxi-trips-iceberg` folder, and click  on the far right.
|
|
338
|
+
3. In the Folder Settings dialog, check the **Format**, verify that `Iceberg` is detected, and click **Save**.
|
|
339
|
+
4. Click  in the side navigation bar, click the `Samples` source, and then `samples.dremio.com` to see its details.
|
|
340
|
+
5. Hover over the `NYC-weather.csv` file, and click  on the far right.
|
|
341
|
+
6. In the Table Settings dialog, do the following:
|
|
342
|
+
1. For **Line Delimiter**, select `LF - Unix/Linux`.
|
|
343
|
+
2. Under **Options**, check **Extract Column Names**.
|
|
344
|
+
3. Click **Save**.
|
|
345
|
+
|
|
346
|
+
The sample data is now added, formatted, and ready to be queried.
|
|
347
|
+
You can validate it by clicking  in the side navigation bar, then the `Samples` source, and then `samples.dremio.com` to see its details:
|
|
348
|
+
|
|
349
|
+
* The icon for `NYC-taxi-trips-iceberg` is , which means the folder is formatted as a table.
|
|
350
|
+
* The icon for `NYC-weather.csv` is , which means the file is formatted as a table.
|
|
351
|
+
|
|
352
|
+
## Step 3: Create a Data Product
|
|
353
|
+
|
|
354
|
+
In this step, you will start creating a data product to explore the relationship between weather conditions and tipping behavior in taxi rides to answer the business question: "Do people tip more during taxi rides when it's raining?".
|
|
355
|
+
|
|
356
|
+
### Run the Query for the Data Product
|
|
357
|
+
|
|
358
|
+
To answer the business question, you will need the average tip amount per precipitation level. For that, combine the data in the `NYC-taxi-trips-iceberg` and `NYC-weather.csv` datasets on a common field: the date.
|
|
359
|
+
|
|
360
|
+
To do this, run the SQL query that joins the two datasets, and calculates the average tip amount per precipitation level:
|
|
361
|
+
|
|
362
|
+
1. Click  in the side navigation bar to go to the [SQL Runner](/current/get-started/quick_tour/#sql-runner).
|
|
363
|
+
2. Copy the SQL below, paste it in the SQL Runner, and click **Run**.
|
|
364
|
+
|
|
365
|
+
SQL to join datasets
|
|
366
|
+
|
|
367
|
+
```
|
|
368
|
+
SELECT AVG(tip_amount) as avg_tip_amount, prcp
|
|
369
|
+
FROM Samples."samples.dremio.com"."NYC-weather.csv"
|
|
370
|
+
JOIN Samples."samples.dremio.com"."NYC-taxi-trips-iceberg"
|
|
371
|
+
ON (TO_CHAR(CAST(pickup_datetime AS DATE), 'YYYY-MM-DD')) = SUBSTRING(CAST("date" AS CHAR) FROM 0 FOR 10)
|
|
372
|
+
GROUP BY prcp;
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
You will get the query results, as shown in the image below.
|
|
376
|
+
|
|
377
|
+

|
|
378
|
+
|
|
379
|
+
### Create the View for the Data Product
|
|
380
|
+
|
|
381
|
+
In Dremio, views are virtual tables based on the result set of a query. You can create views from data that resides in any data source, folder, table, or view that you have access to. You can also share views you've created with stakeholders in your organization.
|
|
382
|
+
|
|
383
|
+
Let's create a view for the data product from the query that you ran above:
|
|
384
|
+
|
|
385
|
+
1. Click **Save as View** on the far right to create a view of your query that others can access.
|
|
386
|
+
2. In the Save View As dialog, enter a name for your view. For example, `avg_tips_precipitation`.
|
|
387
|
+
3. Click **Save**.
|
|
388
|
+
|
|
389
|
+
Your view is created and ready to be used.
|
|
390
|
+
|
|
391
|
+
## Step 4: Accelerate the Query with Reflections
|
|
392
|
+
|
|
393
|
+
In this step, you will use [Reflections](/current/acceleration) to accelerate queries, particularly when working with large datasets.
|
|
394
|
+
|
|
395
|
+
### Enable the Reflection
|
|
396
|
+
|
|
397
|
+
Let's enable a [Raw Reflection](/current/acceleration/#types) to accelerate the query of your view:
|
|
398
|
+
|
|
399
|
+
1. Select the **Reflections** tab at the top of the page, toggle the **Raw Reflections** switch to on, and click **Save**.
|
|
400
|
+
2. On the far right, an animated spinner icon appears next to **Footprint**. Wait until it turns into a green checkmark, which means that the Reflection has been built and can accelerate your query.
|
|
401
|
+
|
|
402
|
+
### Run the Accelerated Query
|
|
403
|
+
|
|
404
|
+
Let's now query the view and see the acceleration in action:
|
|
405
|
+
|
|
406
|
+
1. Click , click `avg_tips_precipitation`, and click **Run** to execute the query.
|
|
407
|
+
2. Check the execution time. It's **a sub-second query**!
|
|
408
|
+
|
|
409
|
+

|
|
410
|
+
3. Now, go to the Jobs tab, and confirm that the query was accelerated with a Reflection.
|
|
411
|
+
|
|
412
|
+

|
|
413
|
+
|
|
414
|
+
You’ve just created a Raw Reflection and accelerated your query!
|
|
415
|
+
|
|
416
|
+
While creating a Reflection manually is a great way to understand how Dremio boosts performance, you don’t need to manage this complexity yourself in real-world environments if you use [Autonomous Reflections](/current/acceleration/autonomous-reflections) — available exclusively in the Enterprise Edition. Dremio will automatically create, select, and maintain the most efficient Reflections for you, saving time while ensuring consistently fast performance of your queries.
|
|
417
|
+
|
|
418
|
+
And that's it! You have finished the Get Started guide for the Community Edition on Docker.
|
|
419
|
+
|
|
420
|
+
For a more complete and full-featured experience with Dremio, [sign up for the Enterprise Edition free trial](https://www.dremio.com/get-started/?utm_source=dremio-docs&utm_medium=referral) on the Dremio website, and follow the steps in [Get Started with the Enterprise Edition Free Trial](/current/get-started/kubernetes-trial).
|
|
421
|
+
|
|
422
|
+
|
|
441
|
+
|
|
442
|
+
---
|
|
443
|
+
|
|
444
|
+
# Source: https://docs.dremio.com/current/get-started/quick_tour
|
|
445
|
+
|
|
446
|
+
Version: current [26.x]
|
|
447
|
+
|
|
448
|
+
|
|
449
|
+
|
|
450
|
+
# Quick Tour of the Console
|
|
451
|
+
|
|
452
|
+
This section walks you through the Dremio console and shows how to use its capabilities to reach insights quickly and manage your data products.
|
|
453
|
+
|
|
454
|
+
The Dremio console has two main pages:
|
|
455
|
+
|
|
456
|
+
* **Datasets page**: Provides a view of your data products and underlying source tables. You can discover and explore your data on this page. For a quick tour of the Datasets page, see [Datasets Page Quick Tour](/current/get-started/quick_tour#datasets-page).
|
|
457
|
+
* **SQL Runner**: An easy-to-use editor to query data and create data products. For a quick tour of the SQL Runner, see [SQL Runner Quick Tour](/current/get-started/quick_tour#sql-runner).
|
|
458
|
+
|
|
459
|
+
## Datasets Page
|
|
460
|
+
|
|
461
|
+
The Datasets page has several components that you can use to manage your data. The largest is the **Data** panel, which you use to explore the spaces and sources in your data catalog, as shown in this image:
|
|
462
|
+
|
|
463
|
+

|
|
464
|
+
|
|
465
|
+
| Location | Description |
|
|
466
|
+
| --- | --- |
|
|
467
|
+
| 1 | By default, you have a home space that you can further organize by creating a hierarchy of folders, and you can create additional spaces. |
|
|
468
|
+
| 2 | A [space](/current/what-is-dremio/key-concepts#spaces-and-folders) is a directory in which views are saved. Spaces provide a way to group datasets by common themes such as a project, purpose, department, or geographic region. |
|
|
469
|
+
| 3 | A [source](/current/data-sources/) is a data lake or external source (such as a relational database) that you can connect to Dremio. |
|
|
470
|
+
| 4 | The title indicates that the Samples data lake is open and lists the contents of the sample source. A source also consists of layers, so if you expand a data source, you will find datasets and data types within the datasets. |
|
|
471
|
+
| 5 | A [dataset](/current/what-is-dremio/key-concepts/#tables-and-views) is a collection of data. The datasets stored in files can be in many different formats, and to run SQL queries against data in different formats, you can create tables and views. By default, when you click on a dataset, the SQL editor is opened on the SQL Runner page with a `SELECT * FROM <dataset_name>` statement. To get a preview of the query, click **Run** or **Preview**. If you would rather click directly on a dataset to see or edit the definition, see [Preferences](/current/help-support/advanced-topics/dremio-preferences) for modifying this setting. |
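The `<dataset_name>` in that default statement is a dot-separated path, and components that contain characters such as periods or hyphens are wrapped in double quotes, as in `Samples."samples.dremio.com"."NYC-weather.csv"`. The sketch below builds such a path from its components. The function name `dataset_path` is hypothetical, and the rule it applies (quote anything that is not a plain identifier) is a simplifying assumption: it does not quote reserved words such as `date` and does not escape double quotes inside a name.

```shell
# Join path components with dots, wrapping in double quotes any component
# that is empty, starts with a digit, or contains a character other than
# letters, digits, and underscores.
dataset_path() {
    path=""
    for part in "$@"; do
        case "$part" in
            ''|[0-9]*|*[!A-Za-z0-9_]*) part="\"$part\"" ;;
        esac
        path="${path:+$path.}$part"
    done
    printf '%s\n' "$path"
}
```

For example, `printf 'SELECT * FROM %s\n' "$(dataset_path Samples samples.dremio.com NYC-weather.csv)"` prints a statement of the same shape as the one the SQL Runner opens by default.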
|
|
472
|
+
|
|
473
|
+
### Opening Datasets
|
|
474
|
+
|
|
475
|
+
If you have privileges to modify the table or view, you will have the option of viewing and editing the table or view definition. When viewing or editing a table or view, you are directed to the **Data** tab by default, which shows the definitions of the table or view. For more options, check out the other tabs:
|
|
476
|
+
|
|
477
|
+

|
|
478
|
+
|
|
479
|
+
| Location | Description |
|
|
480
|
+
| --- | --- |
|
|
481
|
+
| 1 | [Details](/current/data-products/govern/wikis-tags) shows the columns in a dataset and lets you add information about a specific dataset in its wiki. You can add searchable labels, which enhances team collaboration because other users can search the labels to trace a specific dataset. |
|
|
482
|
+
| 2 | [Lineage](/current/data-products/govern/lineage) is a graph of the dataset, showing its data source, parent datasets, and child datasets. |
|
|
483
|
+
| 3 | [Reflections](/current/acceleration/) are physically optimized representations of source data. |
|
|
484
|
+
|
|
485
|
+
Tabs are made visible based on the privileges that you have.
|
|
486
|
+
|
|
487
|
+
## SQL Runner
|
|
488
|
+
|
|
489
|
+
The SQL Runner is where you run queries on your datasets and get results. To navigate to the SQL Runner, click  in the side navigation bar. The main components of the SQL Runner are highlighted below:
|
|
490
|
+
|
|
491
|
+

|
|
492
|
+
|
|
493
|
+
caution
|
|
494
|
+
|
|
495
|
+
Dremio's query engine intentionally ignores any file or folder if the filename or folder name starts with a period (“.”) or an underscore (“\_”).
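To check in advance which files in a source this rule would hide, the naming rule can be approximated as below. This is a minimal sketch, not Dremio's implementation: the function name `is_ignored` and the use of `/` as the path separator are assumptions made for illustration.

```python
def is_ignored(path: str) -> bool:
    """Return True if any file or folder name in the path starts with '.' or '_'.

    Hypothetical helper that mirrors the naming rule described above.
    """
    # Split on "/" and drop empty components (leading or doubled slashes).
    parts = [part for part in path.split("/") if part]
    return any(part.startswith((".", "_")) for part in parts)


print(is_ignored("sales/2024/part-0.parquet"))      # regular file
print(is_ignored("sales/_SUCCESS"))                 # file name starts with "_"
print(is_ignored("sales/.staging/part-0.parquet"))  # folder name starts with "."
```

An underscore or period elsewhere in a name (for example `my_file.csv`) does not trigger the rule in this sketch; only a leading character does.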
|
|
496
|
+
|
|
497
|
+
### 1. Data
|
|
498
|
+
|
|
499
|
+
The **Data** panel is used to explore your data catalog, which includes [sources](/current/data-sources/), [tables, and views](/current/what-is-dremio/key-concepts#tables--views). For catalog objects that you use frequently, you can star the objects to make them easier to access from the panel.
|
|
500
|
+
|
|
501
|
+
To add a dataset into the SQL editor, go to the data source. Use the left caret > to expand the source view. Locate the dataset that you would like to use within the query. Click the + button or drag and drop the data into the SQL editor.
|
|
502
|
+
|
|
503
|
+
### 2. Scripts
|
|
504
|
+
|
|
505
|
+
You can save your SQL as a script if you have drafts or SQL statements that you run frequently. Each saved script records its name, the date and time it was created or last modified, and the context that was set for the editor.
|
|
506
|
+
|
|
507
|
+
In the **Scripts** panel, you can:
|
|
508
|
+
|
|
509
|
+
* Open a script in the SQL editor
|
|
510
|
+
* Rename a script
|
|
511
|
+
* Delete a script
|
|
512
|
+
* Share a script by [granting privileges](/current/security/rbac/privileges/#granting-privileges)
|
|
513
|
+
* Search your set of scripts by name
|
|
514
|
+
* Sort scripts by name or date
|
|
515
|
+
|
|
516
|
+
### 3. Run Mode
|
|
517
|
+
|
|
518
|
+
Running the query routes it to the engine and returns the complete result set. Dremio's query engine intentionally ignores any file or folder if the filename or folder name starts with a period (“.”) or an underscore (“\_”).
|
|
519
|
+
|
|
520
|
+
caution
|
|
521
|
+
|
|
522
|
+
If the engine has scaled down, it takes about two minutes to start up.
|
|
523
|
+
|
|
524
|
+
note
|
|
525
|
+
|
|
526
|
+
Sometimes the results of a `COUNT(*)` query and a `SELECT` query do not match because results for queries run in the Dremio app are capped at a threshold of one million records. Depending on the number of threads (minor fragments) and how data is distributed among those threads, Dremio could truncate results before reaching the threshold. Each individual thread stops processing after returning a number of records equal to `threshold/# of threads`. For example, suppose a query runs with 10 threads and should return 800,000 records, but one of the threads is responsible for 400,000 of them. The per-thread threshold is 100,000 (1,000,000 / 10), so that thread returns only 100,000 records and you see only 500,000 records in the output. When results are truncated, the Dremio app displays a warning that the results are not complete, and users can execute the query using JDBC/ODBC to get complete results.
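The arithmetic in this example can be reproduced with a short calculation. This is only a sketch of the rule as stated in the note: the function name `visible_records`, the use of integer division for the per-thread threshold, and the way the other 400,000 records are spread across the remaining nine threads are all assumptions for illustration.

```python
def visible_records(records_per_thread, threshold=1_000_000):
    """Estimate how many records the app shows under the per-thread cap.

    Each thread returns at most threshold / number_of_threads records
    (integer division is assumed here).
    """
    per_thread_cap = threshold // len(records_per_thread)
    return sum(min(count, per_thread_cap) for count in records_per_thread)


# 10 threads, 800,000 records in total; one thread holds 400,000 of them.
# The split of the other 400,000 records is an assumed distribution.
counts = [400_000] + [50_000] * 8 + [0]

print(sum(counts))              # 800000 records the query should return
print(visible_records(counts))  # 500000 records shown after truncation
```

Only the skewed thread is truncated in this case (from 400,000 to 100,000 records); the other threads stay under the 100,000 cap, which is why the output holds 500,000 records rather than 800,000.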
|
|
527
|
+
|
|
528
|
+
note
|
|
529
|
+
|
|
530
|
+
Known issue: Running a `USE` statement will not update the context that is set in the SQL Runner.
|
|
531
|
+
|
|
532
|
+
### 4. Preview Mode
|
|
533
|
+
|
|
534
|
+
Executing a preview returns a subset of rows in the result set. As in run mode, the preview job is routed to the selected engine, but because it returns only a subset of your results, it finishes in less time.
|
|
535
|
+
|
|
536
|
+
### 5. SQL Editor
|
|
537
|
+
|
|
538
|
+
The SQL editor is where you create and edit queries to get insight from your data. In the top-right corner of the SQL editor, you'll find:
|
|
539
|
+
|
|
540
|
+

|
|
541
|
+
|
|
542
|
+
a. Create a new tab by clicking **+** next to the other tabs. Because a tab is defined by a session, a new tab is automatically saved as a script and named with the date and time that the session was created, such as `Nov 3, 2023, 10:19:57 AM`.
|
|
543
|
+
|
|
544
|
+
b. Grant [script privileges](/current/security/rbac/privileges/#granting-privileges) to share a saved script with others in your organization.
|
|
545
|
+
|
|
546
|
+
c. Save your SQL as a script or as a view. You can save a script even if there are syntax errors. Saving a new view requires valid syntax, and there can be only one SQL statement in the editor. When you save the script as a view, the tab will remain open in the SQL Runner and the Edit dataset page for the view will open in a new browser tab.
|
|
547
|
+
|
|
548
|
+
d. Set a **Context** for a session to run queries without having to qualify the referenced objects.
|
|
549
|
+
|
|
550
|
+
e. Use **fx** to see a list of functions supported by Dremio along with a short description and syntax of each function. Click on the + button or drag and drop the function template into the SQL editor. For more information on supported SQL, see the [SQL Reference](/current/reference/sql/).
|
|
551
|
+
|
|
552
|
+
f. Toggle the **dark/light mode** to change the theme of the SQL editor.
|
|
553
|
+
|
|
554
|
+
g. Enable autocomplete to receive suggestions for SQL keywords, catalog objects, and functions while you are constructing SQL statements. Suggestions depend on the context set in the SQL Runner. To enable or disable the autocomplete feature, see [Dremio Preferences](/current/help-support/advanced-topics/dremio-preferences).
|
|
555
|
+
|
|
556
|
+
h. Click the keyboard button to see the shortcuts for the SQL Runner. For a list of shortcuts, see [Keyboard Shortcuts](/current/help-support/keyboard-shortcuts).
|
|
557
|
+
|
|
558
|
+
#### Syntax Error Highlighting
|
|
559
|
+
|
|
560
|
+
Before you run a query, make sure to fix any syntax errors that have been detected in your query.
|
|
561
|
+
|
|
562
|
+
The SQL editor automatically checks for syntax errors, and every detected error is marked with a red wavy underline. If you hover over the error, you’ll see a message stating whether the error is the result of a token that is missing, unexpected, unrecognized, or extraneous in the query.
|
|
563
|
+
|
|
564
|
+
#### Running Multiple Queries
|
|
565
|
+
|
|
566
|
+
You can run multiple queries in the SQL editor by using a semicolon to separate each statement. To run all of the queries in the SQL editor, click **Run**. The results of each query are shown in the same order in which the queries appear in the editor:
|
|
567
|
+
|
|
568
|
+

|
|
569
|
+
|
|
570
|
+
When you have multiple queries, you can also select a subset to run. If you select one or more queries and then click **Run**, only the selected queries will run, as shown below:
|
|
571
|
+
|
|
572
|
+

|
|
573
|
+
|
|
574
|
+
### 6. Result Set Actions
|
|
575
|
+
|
|
576
|
+
Above the top-right corner of the result set, you will find these actions:
|
|
577
|
+
|
|
578
|
+

|
|
579
|
+
|
|
580
|
+
a. Download the result set as a JSON, CSV, or Parquet file. By default, downloading writes out the results from the last run of the query into the download file. To trigger a rerun of queries for downloads, see the [Rerun on download preference](/current/help-support/advanced-topics/dremio-preferences).
|
|
581
|
+
|
|
582
|
+
b. Copy the result set to the clipboard. If the result set is too large, download the table content to get the complete results.
|
|
583
|
+
|
|
584
|
+
note
|
|
585
|
+
|
|
586
|
+
* The option to download as a CSV file is not available if the result set includes one or more columns that have complex data types (i.e., a union, map, or array). Column headers for the results table indicate complex types with this icon: 
|
|
587
|
+
* The download process runs a CREATE TABLE AS SELECT (CTAS) command, which is why compute resources are required.
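The CSV restriction in the note above can be expressed as a simple check on the column types of a result set. This is a sketch under stated assumptions: the function name `download_formats` and the lowercase type labels `union`, `map`, and `array` are illustrative and do not come from a Dremio API.

```python
# Complex types named in the note above; the labels are assumed spellings.
COMPLEX_TYPES = {"union", "map", "array"}


def download_formats(column_types):
    """Return the download formats available for a result set.

    CSV is dropped when at least one column has a complex type.
    """
    formats = ["JSON", "CSV", "Parquet"]
    if any(t.lower() in COMPLEX_TYPES for t in column_types):
        formats.remove("CSV")
    return formats


print(download_formats(["varchar", "integer"]))  # all three formats
print(download_formats(["varchar", "array"]))    # CSV is not offered
```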
|
|
588
|
+
|
|
589
|
+
The download and copy results features can be enabled or disabled for a project in [Dremio Preferences](/current/help-support/advanced-topics/dremio-preferences). Disabling this in a project will prevent all users from downloading and copying results from the project.
|
|
590
|
+
|
|
591
|
+
### 7. Execution State
|
|
592
|
+
|
|
593
|
+
The execution state will show you the type of job that was run, the number of records, and the amount of time that it took to run the query. When you click on the linked job, a Job Overview page opens in a new tab, providing a summary, total execution time, Reflections data, job results, and more details. If a job took too long or failed, [viewing the job overview](/current/admin/monitoring/jobs/) can help you troubleshoot what actually happened.
|
|
594
|
+
|
|
595
|
+

|
|
596
|
+
|
|
597
|
+
### 8. Transformations
|
|
598
|
+
|
|
599
|
+
Transformations can be applied to data. Using the following no-code UI flows automatically updates the SQL in the SQL editor:
|
|
600
|
+
|
|
601
|
+
* **Add Field**
|
|
602
|
+
* **Group By**
|
|
603
|
+
* **Join**
|
|
604
|
+
* **Filter Columns**
|
|
605
|
+
|
|
606
|
+
If you are using the preview mode, transformations use a subset of the results and may not provide a complete profile of the result set. The preview may show null or incomplete values in the dataset as a result of joining, grouping, or calculating fields. You may encounter an error showing "no rows returned due to the LIMIT logic" or "more rows returned due to an outer join".
|
|
607
|
+
|
|
608
|
+
### 9. Results Table
|
|
609
|
+
|
|
610
|
+
The results of the query are shown in a table format. You can edit a table column by clicking directly on the column title to open a dropdown of edit options, which include sorting results, converting data types, renaming columns, and calculating fields.
|
|
611
|
+
|
|
612
|
+

|
|
613
|
+
|
|
614
|
+
You can edit a result value directly if you click and drag your cursor over the result value, which opens a dropdown of available edit options such as to replace, keep only, exclude, or copy the result value.
|
|
615
|
+
|
|
616
|
+
tip
|
|
617
|
+
|
|
618
|
+
Downloading large result sets could produce delays and errors. If you encounter these issues, create smaller views that summarize the results. You can then run queries on these smaller views and download the results.
|
|
619
|
+
|
|
620
|
+
## Additional Resources
|
|
621
|
+
|
|
622
|
+
Find out more about Dremio by enrolling in the [Dremio Fundamentals course in Dremio University](https://university.dremio.com/course/dremio-fundamentals-software).
|
|
623
|
+
|
|
624
|
+
|
|
645
|
+
|
|
646
|
+
---
|
|
647
|
+
|
|
648
|
+
# Source: https://docs.dremio.com/current/get-started/quick_tour/
|
|
649
|
+
|
|
650
|
+
Version: current [26.x]
|
|
651
|
+
|
|
652
|
+
|
|
653
|
+
|
|
654
|
+
# Quick Tour of the Console
|
|
655
|
+
|
|
656
|
+
This section walks you through the Dremio console, showing how to use its capabilities to reach insights quickly and to manage your data products.
|
|
657
|
+
|
|
658
|
+
The Dremio console has two main pages:
|
|
659
|
+
|
|
660
|
+
* **Datasets page**: Provides a view of your data products and underlying source tables. You can discover and explore your data on this page. For a quick tour of the Datasets page, see [Datasets Page Quick Tour](/current/get-started/quick_tour#datasets-page).
|
|
661
|
+
* **SQL Runner**: An easy-to-use editor to query data and create data products. For a quick tour of the SQL Runner, see [SQL Runner Quick Tour](/current/get-started/quick_tour#sql-runner).
|
|
662
|
+
|
|
663
|
+
## Datasets Page
|
|
664
|
+
|
|
665
|
+
When you work on the Datasets page, there are different components that you can use to manage your data. The largest component is the **Data** panel, which is used to explore the spaces and sources in your data catalog, as shown in this image:
|
|
666
|
+
|
|
667
|
+

|
|
668
|
+
|
|
669
|
+
| Location | Description |
|
|
670
|
+
| --- | --- |
|
|
671
|
+
| 1 | By default, you have a home space that you can further organize by creating a hierarchy of folders, and you can create additional spaces. |
|
|
672
|
+
| 2 | A [space](/current/what-is-dremio/key-concepts#spaces-and-folders) is a directory in which views are saved. Spaces provide a way to group datasets by common themes such as a project, purpose, department, or geographic region. |
|
|
673
|
+
| 3 | A [source](/current/data-sources/) is a data lake or external source (such as a relational database) that you can connect to Dremio. |
|
|
674
|
+
| 4 | The title indicates that the Samples data lake is open and lists the contents of the sample source. A source also consists of layers, so if you expand a data source, you will find datasets and data types within the datasets. |
|
|
675
|
+
| 5 | A [dataset](/current/what-is-dremio/key-concepts/#tables-and-views) is a collection of data. The datasets stored in files can be in many different formats, and to run SQL queries against data in different formats, you can create tables and views. By default, when you click on a dataset, the SQL editor is opened on the SQL Runner page with a `SELECT * FROM <dataset_name>` statement. To get a preview of the query, click **Run** or **Preview**. If you would rather click directly on a dataset to see or edit the definition, see [Preferences](/current/help-support/advanced-topics/dremio-preferences) for modifying this setting. |
|
|
676
|
+
|
|
677
|
+
### Opening Datasets
|
|
678
|
+
|
|
679
|
+
If you have privileges to modify the table or view, you will have the option of viewing and editing the table or view definition. When viewing or editing a table or view, you are directed to the **Data** tab by default, which shows the definitions of the table or view. For more options, check out the other tabs:
|
|
680
|
+
|
|
681
|
+

|
|
682
|
+
|
|
683
|
+
| Location | Description |
|
|
684
|
+
| --- | --- |
|
|
685
|
+
| 1 | [Details](/current/data-products/govern/wikis-tags) shows the columns in a dataset and lets you add information about a specific dataset in its wiki. You can add searchable labels, which enhances team collaboration because other users can search the labels to trace a specific dataset. |
|
|
686
|
+
| 2 | [Lineage](/current/data-products/govern/lineage) is a graph of the dataset, showing its data source, parent datasets, and child datasets. |
|
|
687
|
+
| 3 | [Reflections](/current/acceleration/) are physically optimized representations of source data. |
|
|
688
|
+
|
|
689
|
+
Tabs are made visible based on the privileges that you have.
|
|
690
|
+
|
|
691
|
+
## SQL Runner
|
|
692
|
+
|
|
693
|
+
The SQL Runner is where you run queries on your datasets and get results. To navigate to the SQL Runner, click  in the side navigation bar. The main components of the SQL Runner are highlighted below:
|
|
694
|
+
|
|
695
|
+

|
|
696
|
+
|
|
697
|
+
caution
|
|
698
|
+
|
|
699
|
+
Dremio's query engine intentionally ignores any file or folder if the filename or folder name starts with a period (“.”) or an underscore (“\_”).
|
|
700
|
+
|
|
701
|
+
### 1. Data
|
|
702
|
+
|
|
703
|
+
The **Data** panel is used to explore your data catalog, which includes [sources](/current/data-sources/), [tables, and views](/current/what-is-dremio/key-concepts#tables--views). For catalog objects that you use frequently, you can star the objects to make them easier to access from the panel.
|
|
704
|
+
|
|
705
|
+
To add a dataset into the SQL editor, go to the data source. Use the left caret > to expand the source view. Locate the dataset that you would like to use within the query. Click the + button or drag and drop the data into the SQL editor.
|
|
706
|
+
|
|
707
|
+
### 2. Scripts
|
|
708
|
+
|
|
709
|
+
You can save your SQL as a script if you have drafts or SQL statements that you run frequently. Each saved script records its name, the date and time it was created or last modified, and the context that was set for the editor.
|
|
710
|
+
|
|
711
|
+
In the **Scripts** panel, you can:
|
|
712
|
+
|
|
713
|
+
* Open a script in the SQL editor
|
|
714
|
+
* Rename a script
|
|
715
|
+
* Delete a script
|
|
716
|
+
* Share a script by [granting privileges](/current/security/rbac/privileges/#granting-privileges)
|
|
717
|
+
* Search your set of scripts by name
|
|
718
|
+
* Sort scripts by name or date
|
|
719
|
+
|
|
720
|
+
### 3. Run Mode
|
|
721
|
+
|
|
722
|
+
Running the query routes it to the engine and returns the complete result set. Dremio's query engine intentionally ignores any file or folder if the filename or folder name starts with a period (“.”) or an underscore (“\_”).
|
|
723
|
+
|
|
724
|
+
caution
|
|
725
|
+
|
|
726
|
+
If the engine has scaled down, it takes about two minutes to start up.
|
|
727
|
+
|
|
728
|
+
note
|
|
729
|
+
|
|
730
|
+
Sometimes the results of a `COUNT(*)` query and a `SELECT` query do not match because results for queries run in the Dremio app are capped at a threshold of one million records. Depending on the number of threads (minor fragments) and how data is distributed among those threads, Dremio could truncate results before reaching the threshold. Each individual thread stops processing after returning a number of records equal to `threshold/# of threads`. For example, suppose a query runs with 10 threads and should return 800,000 records, but one of the threads is responsible for 400,000 of them. The per-thread threshold is 100,000 (1,000,000 / 10), so that thread returns only 100,000 records and you see only 500,000 records in the output. When results are truncated, the Dremio app displays a warning that the results are not complete, and users can execute the query using JDBC/ODBC to get complete results.
|
|
731
|
+
|
|
732
|
+
note
|
|
733
|
+
|
|
734
|
+
Known issue: Running a `USE` statement will not update the context that is set in the SQL Runner.
|
|
735
|
+
|
|
736
|
+
### 4. Preview Mode
|
|
737
|
+
|
|
738
|
+
Executing a preview returns a subset of rows in the result set. As in run mode, the preview job is routed to the selected engine, but because it returns only a subset of your results, it finishes in less time.
|
|
739
|
+
|
|
740
|
+
### 5. SQL Editor
|
|
741
|
+
|
|
742
|
+
The SQL editor is where you create and edit queries to get insight from your data. In the top-right corner of the SQL editor, you'll find:
|
|
743
|
+
|
|
744
|
+

|
|
745
|
+
|
|
746
|
+
a. Create a new tab by clicking **+** next to the other tabs. Because a tab is defined by a session, a new tab is automatically saved as a script and named with the date and time that the session was created, such as `Nov 3, 2023, 10:19:57 AM`.
|
|
747
|
+
|
|
748
|
+
b. Grant [script privileges](/current/security/rbac/privileges/#granting-privileges) to share a saved script with others in your organization.
|
|
749
|
+
|
|
750
|
+
c. Save your SQL as a script or as a view. You can save a script even if there are syntax errors. Saving a new view requires valid syntax, and there can be only one SQL statement in the editor. When you save the script as a view, the tab will remain open in the SQL Runner and the Edit dataset page for the view will open in a new browser tab.
|
|
751
|
+
|
|
752
|
+
d. Set a **Context** for a session to run queries without having to qualify the referenced objects.
|
|
753
|
+
|
|
754
|
+
e. Use **fx** to see a list of functions supported by Dremio along with a short description and syntax of each function. Click on the + button or drag and drop the function template into the SQL editor. For more information on supported SQL, see the [SQL Reference](/current/reference/sql/).
|
|
755
|
+
|
|
756
|
+
f. Toggle the **dark/light mode** to change the theme of the SQL editor.
|
|
757
|
+
|
|
758
|
+
g. Enable autocomplete to receive suggestions for SQL keywords, catalog objects, and functions while you are constructing SQL statements. Suggestions depend on the context set in the SQL Runner. To enable or disable the autocomplete feature, see [Dremio Preferences](/current/help-support/advanced-topics/dremio-preferences).
|
|
759
|
+
|
|
760
|
+
h. Click the keyboard button to see the shortcuts for the SQL Runner. For a list of shortcuts, see [Keyboard Shortcuts](/current/help-support/keyboard-shortcuts).
|
|
761
|
+
|
|
762
|
+
#### Syntax Error Highlighting
|
|
763
|
+
|
|
764
|
+
Before you run a query, make sure to fix any syntax errors that have been detected in your query.
|
|
765
|
+
|
|
766
|
+
The SQL editor automatically checks for syntax errors, and every detected error is marked with a red wavy underline. If you hover over the error, you’ll see a message stating whether the error is the result of a token that is missing, unexpected, unrecognized, or extraneous in the query.
|
|
767
|
+
|
|
768
|
+
#### Running Multiple Queries
|
|
769
|
+
|
|
770
|
+
You can run multiple queries in the SQL editor by using a semicolon to separate each statement. To run all of the queries in the SQL editor, click **Run**. The results of each query are shown in the same order in which the queries appear in the editor:
|
|
771
|
+
|
|
772
|
+

|
|
773
|
+
|
|
774
|
+
When you have multiple queries, you can also select a subset to run. If you select one or more queries and then click **Run**, only the selected queries will run, as shown below:
|
|
775
|
+
|
|
776
|
+

|
|
777
|
+
|
|
778
|
+
### 6. Result Set Actions
|
|
779
|
+
|
|
780
|
+
Above the top-right corner of the result set, you will find these actions:
|
|
781
|
+
|
|
782
|
+

|
|
783
|
+
|
|
784
|
+
a. Download the result set as a JSON, CSV, or Parquet file. By default, downloading writes out the results from the last run of the query into the download file. To trigger a rerun of queries for downloads, see the [Rerun on download preference](/current/help-support/advanced-topics/dremio-preferences).
|
|
785
|
+
|
|
786
|
+
b. Copy the result set to the clipboard. If the result set is too large, download the table content to get the complete results.
|
|
787
|
+
|
|
788
|
+
note
|
|
789
|
+
|
|
790
|
+
* The option to download as a CSV file is not available if the result set includes one or more columns that have complex data types (i.e., a union, map, or array). Column headers for the results table indicate complex types with this icon: 
|
|
791
|
+
* The download process runs a CREATE TABLE AS SELECT (CTAS) command, which is why compute resources are required.
|
|
792
|
+
|
|
793
|
+
The download and copy results features can be enabled or disabled for a project in [Dremio Preferences](/current/help-support/advanced-topics/dremio-preferences). Disabling this in a project will prevent all users from downloading and copying results from the project.
|
|
794
|
+
|
|
795
|
+
### 7. Execution State
|
|
796
|
+
|
|
797
|
+
The execution state will show you the type of job that was run, the number of records, and the amount of time that it took to run the query. When you click on the linked job, a Job Overview page opens in a new tab, providing a summary, total execution time, Reflections data, job results, and more details. If a job took too long or failed, [viewing the job overview](/current/admin/monitoring/jobs/) can help you troubleshoot what actually happened.
|
|
798
|
+
|
|
799
|
+

|
|
800
|
+
|
|
801
|
+
### 8. Transformations
|
|
802
|
+
|
|
803
|
+
Transformations can be applied to data. Using the following no-code UI flows automatically updates the SQL in the SQL editor:
|
|
804
|
+
|
|
805
|
+
* **Add Field**
|
|
806
|
+
* **Group By**
|
|
807
|
+
* **Join**
|
|
808
|
+
* **Filter Columns**
|
|
809
|
+
|
|
810
|
+
If you are using the preview mode, transformations use a subset of the results and may not provide a complete profile of the result set. The preview may show null or incomplete values in the dataset as a result of joining, grouping, or calculating fields. You may encounter an error showing "no rows returned due to the LIMIT logic" or "more rows returned due to an outer join".
|
|
811
|
+
|
|
812
|
+
### 9. Results Table
|
|
813
|
+
|
|
814
|
+
The results of the query are shown in a table format. You can edit a table column by clicking directly on the column title to open a dropdown of edit options, which include sorting results, converting data types, renaming columns, and calculating fields.
|
|
815
|
+
|
|
816
|
+

|
|
817
|
+
|
|
818
|
+
You can edit a result value directly if you click and drag your cursor over the result value, which opens a dropdown of available edit options such as to replace, keep only, exclude, or copy the result value.
|
|
819
|
+
|
|
820
|
+
tip
|
|
821
|
+
|
|
822
|
+
Downloading large result sets could produce delays and errors. If you encounter these issues, create smaller views that summarize the results. You can then run queries on these smaller views and download the results.
|
|
823
|
+
|
|
824
|
+
## Additional Resources
|
|
825
|
+
|
|
826
|
+
Find out more about Dremio by enrolling in the [Dremio Fundamentals course in Dremio University](https://university.dremio.com/course/dremio-fundamentals-software).
|
|
827
|
+
|
|
828
|
+
|