dremiojs 1.0.0
- package/.eslintrc.json +14 -0
- package/.prettierrc +7 -0
- package/README.md +59 -0
- package/dremiodocs/dremio-cloud/cloud-api-reference.md +748 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-about.md +225 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-admin.md +3754 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-bring-data.md +6098 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-changelog.md +32 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-developer.md +1147 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-explore-analyze.md +2522 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-get-started.md +300 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-help-support.md +869 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-manage-govern.md +800 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-overview.md +36 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-security.md +1844 -0
- package/dremiodocs/dremio-cloud/sql-docs.md +7180 -0
- package/dremiodocs/dremio-software/dremio-software-acceleration.md +1575 -0
- package/dremiodocs/dremio-software/dremio-software-admin.md +884 -0
- package/dremiodocs/dremio-software/dremio-software-client-applications.md +3277 -0
- package/dremiodocs/dremio-software/dremio-software-data-products.md +560 -0
- package/dremiodocs/dremio-software/dremio-software-data-sources.md +8701 -0
- package/dremiodocs/dremio-software/dremio-software-deploy-dremio.md +3446 -0
- package/dremiodocs/dremio-software/dremio-software-get-started.md +848 -0
- package/dremiodocs/dremio-software/dremio-software-monitoring.md +422 -0
- package/dremiodocs/dremio-software/dremio-software-reference.md +677 -0
- package/dremiodocs/dremio-software/dremio-software-security.md +2074 -0
- package/dremiodocs/dremio-software/dremio-software-v25-api.md +32637 -0
- package/dremiodocs/dremio-software/dremio-software-v26-api.md +36757 -0
- package/jest.config.js +10 -0
- package/package.json +25 -0
- package/src/api/catalog.ts +74 -0
- package/src/api/jobs.ts +105 -0
- package/src/api/reflection.ts +77 -0
- package/src/api/source.ts +61 -0
- package/src/api/user.ts +32 -0
- package/src/client/base.ts +66 -0
- package/src/client/cloud.ts +37 -0
- package/src/client/software.ts +73 -0
- package/src/index.ts +16 -0
- package/src/types/catalog.ts +31 -0
- package/src/types/config.ts +18 -0
- package/src/types/job.ts +18 -0
- package/src/types/reflection.ts +29 -0
- package/tests/integration_manual.ts +95 -0
- package/tsconfig.json +19 -0
|
@@ -0,0 +1,3446 @@
|
|
|
1
|
+
# Dremio Software - Deploy Dremio
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/
|
|
8
|
+
|
|
9
|
+
Version: current [26.x]
|
|
10
|
+
|
|
11
|
+
|
|
12
|
+
|
|
13
|
+
# Deploy Dremio
|
|
14
|
+
|
|
15
|
+
This topic describes the deployment models. Dremio is a distributed system that can be deployed in a public cloud or on-premises. A Dremio cluster can be co-located with one of the data sources (Hadoop or NoSQL database) or deployed separately.
|
|
16
|
+
|
|
17
|
+
## Deploy on Kubernetes
|
|
18
|
+
|
|
19
|
+
Kubernetes is the recommended deployment option for Dremio. For more information, see the following topics in this section:
|
|
20
|
+
|
|
21
|
+
* [Kubernetes Environments](/current/deploy-dremio/kubernetes-environments/) – Learn about the Kubernetes environments used to deploy Dremio.
|
|
22
|
+
* [Deploying on Kubernetes](/current/deploy-dremio/deploy-on-kubernetes/) – Deploy Dremio on your Kubernetes environment.
|
|
23
|
+
* [Configuring Your Values](/current/deploy-dremio/configuring-kubernetes/) – Understand the configuration of your deployments in more detail.
|
|
24
|
+
* [Managing Engines](/current/deploy-dremio/managing-engines-kubernetes/) – Manage Dremio engines to optimize query execution.
|
|
25
|
+
|
|
26
|
+
## Other Deployment Options
|
|
27
|
+
|
|
28
|
+
Besides Kubernetes, Dremio provides other options for deployment described in this section.
|
|
29
|
+
|
|
30
|
+
### Shared Multi-Tenant Environment
|
|
31
|
+
|
|
32
|
+
If you plan on using a shared multi-tenant environment, Dremio provides a model that uses YARN for deployment:
|
|
33
|
+
|
|
34
|
+
* [**Hadoop using YARN**](/current/deploy-dremio/other-options/yarn-hadoop.md) - Dremio on Hadoop in YARN deployment. Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.
|
|
35
|
+
|
|
36
|
+
note
|
|
37
|
+
|
|
38
|
+
Co-locating Dremio with Hadoop/NoSQL: When Dremio is co-located with a Hadoop cluster (such as HDFS) or distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to utilize containers (cgroups, Docker, and YARN containers) to ensure adequate resources for each process.
|
|
39
|
+
|
|
40
|
+
Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load. So, unless containers are utilized, the operating system may over-allocate resources to other thread-hungry processes on the nodes.
|
|
41
|
+
|
|
42
|
+
### Standalone Cluster
|
|
43
|
+
|
|
44
|
+
If you plan on creating a standalone cluster, Dremio provides the flexibility to deploy Dremio as a standalone on-premises cluster:
|
|
45
|
+
|
|
46
|
+
* [**Standalone Cluster**](/current/deploy-dremio/other-options/standalone/index.md) - Dremio on a standalone on-premises cluster. In this scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database.
|
|
47
|
+
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/kubernetes-environments
|
|
64
|
+
|
|
65
|
+
Version: current [26.x]
|
|
66
|
+
|
|
67
|
+
|
|
68
|
+
|
|
69
|
+
# Kubernetes Environments for Dremio
|
|
70
|
+
|
|
71
|
+
Dremio is designed to run in Kubernetes environments, providing enterprise-grade data lakehouse capabilities. To successfully [deploy Dremio on Kubernetes](/current/deploy-dremio/deploy-on-kubernetes), you need a compatible hosted Kubernetes environment.
|
|
72
|
+
|
|
73
|
+
Dremio is tested and supported on the following Kubernetes environments:
|
|
74
|
+
|
|
75
|
+
* Elastic Kubernetes Service (EKS)
|
|
76
|
+
* Azure Kubernetes Service (AKS)
|
|
77
|
+
* Google Kubernetes Engine (GKE)
|
|
78
|
+
* Red Hat OpenShift
|
|
79
|
+
|
|
80
|
+
The sections on this page detail recommendations for AWS and Azure. Please use the information provided as a guide for your vendor's equivalent options.
|
|
81
|
+
|
|
82
|
+
note
|
|
83
|
+
|
|
84
|
+
If you're using a containerization platform built on Kubernetes that isn't listed here, please contact your provider and Dremio Account team to discuss compatibility and support options.
|
|
85
|
+
|
|
86
|
+
## Requirements
|
|
87
|
+
|
|
88
|
+
### Versions
|
|
89
|
+
|
|
90
|
+
Dremio requires regular updates to your Kubernetes version. You must be on an officially supported version, and preferably not one on extended support. See the following examples for AWS [Available versions on standard support](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#available-versions) and Azure [Kubernetes versions](https://learn.microsoft.com/en-us/azure/aks/supported-kubernetes-versions).
|
|
91
|
+
|
|
92
|
+
### Recommendations
|
|
93
|
+
|
|
94
|
+
For resource request recommendations for the various parts of the deployment, see the table in [Recommended Resources Configuration](/current/deploy-dremio/configuring-kubernetes/#recommended-resources-configuration).
|
|
95
|
+
|
|
96
|
+
For a list of all Dremio engine sizes, see [Add an Engine](/current/deploy-dremio/managing-engines-kubernetes/add-an-engine). Engines will make up the lion's share of any Dremio deployment.
|
|
97
|
+
|
|
98
|
+
#### Node Sizes
|
|
99
|
+
|
|
100
|
+
The following sections suggest AWS and Azure machines that could be used to meet our recommendations.
|
|
101
|
+
|
|
102
|
+
Dremio recommends having separate EKS node groups for the different components of our services to allow each node group to autoscale independently:
|
|
103
|
+
|
|
104
|
+
**Core Services**
|
|
105
|
+
|
|
106
|
+
* **Coordinators**
|
|
107
|
+
|
|
108
|
+
For [coordinators](/current/what-is-dremio/architecture/#main-coordinator), Dremio recommends at least 32 CPUs and 64 GB of memory, hence, a `c6i.8xlarge` or `Standard_F32s_v2` is a good option, offering a CPU-to-memory ratio of 1:2. In the Helm charts, this would result in 30 CPUs and 60 GB of memory allocated to the Dremio pod.
|
|
109
|
+
* **Executors**
|
|
110
|
+
|
|
111
|
+
For [executors](/current/what-is-dremio/architecture/#engines), Dremio recommends either:
|
|
112
|
+
|
|
113
|
+
+ 16 CPUs and 128 GB of memory, hence, a `r5d.4xlarge` or `Standard_E16_v5` is a good option, offering a CPU-to-memory ratio of 1:8. In the Helm charts, this results in 15 CPUs and 120 GB of memory allocated to the Dremio pod.
|
|
114
|
+
+ 32 CPUs and 128 GB of memory, hence, a `m5d.8xlarge` or `Standard_D32_v5` is a good option, offering a CPU-to-memory ratio of 1:4 for high-concurrency workloads. In the Helm charts, this results in 30 CPUs and 120 GB of memory allocated to the Dremio pod.
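The Helm chart figures quoted above (30 of 32 CPUs, 60 of 64 GB, 15 of 16 CPUs, 120 of 128 GB) are all consistent with the Dremio pod receiving 15/16 of the node's capacity. That ratio is inferred from these numbers and is not stated by Dremio, so the sketch below is only a rough estimate for other node sizes. The helper name `pod_allocation` is hypothetical.

```shell
#!/bin/sh
# Estimate what the Dremio pod receives on a node, assuming the 15/16 ratio
# inferred from the figures above (an assumption, not a documented rule).
pod_allocation() {
  # $1: node capacity as a whole number (CPUs or GB of memory)
  echo $(( $1 * 15 / 16 ))
}

pod_allocation 32    # CPUs on a coordinator node
pod_allocation 128   # GB of memory on an executor node
```

Check the result against the actual values in the Helm chart before relying on it for capacity planning.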
|
|
115
|
+
|
|
116
|
+
**Auxiliary Services**
|
|
117
|
+
|
|
118
|
+
* [Open Catalog](/current/what-is-dremio/architecture/#open-catalog) and [Semantic Search](/current/deploy-dremio/current/what-is-dremio/architecture/#ai-enabled-semantic-search).
|
|
119
|
+
|
|
120
|
+
Catalog is made up of 4 key components: Catalog Service, Catalog Server, Catalog External, and MongoDB. Search has one key component, OpenSearch.
|
|
121
|
+
|
|
122
|
+
Each of these components needs between 2-4 CPUs and 4-16 GB of memory; hence, a `m5d.2xlarge` or `Standard_D8_v5` is a good option and could be used to host multiple containers that are part of these services.
|
|
123
|
+
|
|
124
|
+
* ZooKeeper, NATS, Operators, and Open Telemetry:
|
|
125
|
+
|
|
126
|
+
Each of these needs between 0.5-1 CPUs and 0.5-1 GB of memory; hence, `m5d.large`, `t2.medium`, `Standard_D2_v5`, or `Standard_A2_v2` are good options and could be used to host multiple containers that are part of these services.
|
|
127
|
+
|
|
128
|
+
#### Disk Storage Class
|
|
129
|
+
|
|
130
|
+
Dremio recommends:
|
|
131
|
+
|
|
132
|
+
* For AWS, GP3 or IO2 as the storage type for all nodes.
|
|
133
|
+
* For Azure, managed-premium as the storage type for all nodes.
|
|
134
|
+
|
|
135
|
+
Additionally, for [coordinators](/current/what-is-dremio/architecture/#main-coordinator) and [executors](/current/what-is-dremio/architecture/#engines), you can further use local NVMe SSD storage for C3 and spill on executors. For more information on storage classes, see the following resources: [AWS Storage Class](https://docs.aws.amazon.com/eks/latest/userguide/create-storage-class.html) and [Azure Storage Class](https://learn.microsoft.com/en-us/azure/aks/concepts-storage).
|
|
136
|
+
|
|
137
|
+
Storage size requirements are:
|
|
138
|
+
|
|
139
|
+
* Coordinator volume #1: 128-512 GB (key-value store).
|
|
140
|
+
* Coordinator volume #2: 16 GB (logs).
|
|
141
|
+
* Executor volume #1: 128-512 GB (spilling).
|
|
142
|
+
* Executor volume #2: 128-512 GB (C3).
|
|
143
|
+
* Executor volume #3: 16 GB (logs).
|
|
144
|
+
* MongoDB volume: 128-512 GB.
|
|
145
|
+
* OpenSearch volume: 128 GB.
|
|
146
|
+
* ZooKeeper volume: 16 GB.
|
|
147
|
+
|
|
148
|
+
### EKS Add-Ons
|
|
149
|
+
|
|
150
|
+
The following add-ons are required for EKS clusters:
|
|
151
|
+
|
|
152
|
+
* Amazon EBS CSI Driver
|
|
153
|
+
* EKS Pod Identity Agent
|
|
154
|
+
|
|
167
|
+
|
|
168
|
+
---
|
|
169
|
+
|
|
170
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/deploy-on-kubernetes
|
|
171
|
+
|
|
172
|
+
Version: current [26.x]
|
|
173
|
+
|
|
174
|
+
|
|
175
|
+
|
|
176
|
+
# Deploy Dremio on Kubernetes
|
|
177
|
+
|
|
178
|
+
You can follow these instructions to deploy Dremio on Kubernetes provisioned through a cloud provider or running in an on-premises environment.
|
|
179
|
+
|
|
180
|
+
FREE TRIAL
|
|
181
|
+
|
|
182
|
+
If you are using an **Enterprise Edition free trial**, go to [Get Started with the Enterprise Edition Free Trial](/current/get-started/kubernetes-trial).
|
|
183
|
+
|
|
184
|
+
## Prerequisites
|
|
185
|
+
|
|
186
|
+
Before deploying Dremio on Kubernetes, ensure you have the following:
|
|
187
|
+
|
|
188
|
+
* A hosted Kubernetes environment to deploy and manage the Dremio cluster.
|
|
189
|
+
Each Dremio release is tested against [Amazon Elastic Kubernetes Service (EKS)](https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html), [Azure Kubernetes Service (AKS)](https://learn.microsoft.com/en-us/azure/aks/what-is-aks), and [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine?hl=en#how-it-works) to ensure compatibility. If you have a containerization platform built on top of Kubernetes that is not listed here, please contact your provider and the Dremio Account Team regarding compatibility.
|
|
190
|
+
* Helm 3 installed on your local machine to run Helm commands. For installation instructions, refer to [Installing Helm](https://helm.sh/docs/intro/install/) in the Helm documentation.
|
|
191
|
+
* A local kubectl configured to access your Kubernetes cluster. For installation instructions, refer to [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) in the Kubernetes documentation.
|
|
192
|
+
* Object Storage: Amazon S3 (including S3-compatible, e.g., MinIO), Azure Storage, or Google Cloud Storage (GCS).
|
|
193
|
+
* Storage classes that support ReadWriteOnce (RWO) access mode and ideally can create expandable volumes.
|
|
194
|
+
* The ability to connect to [Quay.io](http://quay.io/) to access the [new v3 Helm chart](https://quay.io/repository/dremio/dremio-helm?tab=tags) for Dremio 26+, since the [older v2 Helm chart](https://github.com/dremio/dremio-cloud-tools/tree/master/charts/dremio_v2) will not function.
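The Helm prerequisite above can be checked with a small sketch like the following. It assumes that `helm version --short` prints a string of the form `v3.14.2+gc309b6f`; the helper name `helm_major` is hypothetical.

```shell
#!/bin/sh
# Extract the major version from `helm version --short` output,
# for example "v3.14.2+gc309b6f".
helm_major() {
  ver="${1#v}"        # drop the leading "v"
  echo "${ver%%.*}"   # keep everything before the first dot
}

# Usage (requires Helm on the PATH):
#   [ "$(helm_major "$(helm version --short)")" -ge 3 ] && echo "Helm 3 or later found"
# To confirm which cluster kubectl points at:
#   kubectl config current-context
```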
|
|
195
|
+
|
|
196
|
+
### Additional Prerequisites for the Enterprise Edition
|
|
197
|
+
|
|
198
|
+
For the Enterprise Edition, you must:
|
|
199
|
+
|
|
200
|
+
* Create an account on [Quay.io](https://quay.io/) to access [Dremio's OCI repository](https://quay.io/organization/dremio), which stores Dremio's Helm charts and images.
|
|
201
|
+
To get access, contact your Dremio account executive or Dremio Support.
|
|
202
|
+
|
|
203
|
+
note
|
|
204
|
+
|
|
205
|
+
If your internet access doesn't allow reaching Dremio's OCI repository in Quay.io, consider using a private mirror to fetch Dremio's Helm chart images.
|
|
206
|
+
* Get a valid license key issued by Dremio to put in the Helm chart. To obtain the license, refer to [Licensing](/current/admin/licensing/).
|
|
207
|
+
|
|
208
|
+
### Additional Prerequisites for OpenShift
|
|
209
|
+
|
|
210
|
+
Before deploying Dremio onto OpenShift, you additionally need the following:
|
|
211
|
+
|
|
212
|
+
* Have the OpenShift `oc` CLI command configured and authenticated. For the installation instructions, see [OpenShift CLI (oc)](https://docs.redhat.com/en/documentation/openshift_container_platform/4.11/html/cli_tools/openshift-cli-oc).
|
|
213
|
+
|
|
214
|
+
#### Node Tuning for OpenSearch on OpenShift
|
|
215
|
+
|
|
216
|
+
OpenSearch requires the `vm.max_map_count` kernel parameter to be set to at least **262144**.
|
|
217
|
+
|
|
218
|
+
This parameter controls the maximum number of memory map areas a process can have, and OpenSearch uses memory-mapped files extensively for performance.
|
|
219
|
+
|
|
220
|
+
Without this setting, OpenSearch pods will fail to start with errors related to virtual memory limits.
|
|
221
|
+
|
|
222
|
+
Since the Helm chart sets `setVMMaxMapCount: false` for OpenShift compatibility (to avoid privileged init containers), you need to configure this kernel parameter at the node level. The **recommended way** to do this is to use the Node Tuning Operator, which ships with OpenShift and provides a declarative way to configure kernel parameters.
|
|
223
|
+
|
|
224
|
+
Create a `Tuned` resource to configure the required kernel parameter:
|
|
225
|
+
|
|
226
|
+
The `tuned-opensearch.yaml` configuration file
|
|
227
|
+
|
|
228
|
+
```
|
|
229
|
+
apiVersion: tuned.openshift.io/v1
|
|
230
|
+
kind: Tuned
|
|
231
|
+
metadata:
|
|
232
|
+
  name: openshift-opensearch
|
|
233
|
+
  namespace: openshift-cluster-node-tuning-operator
|
|
234
|
+
spec:
|
|
235
|
+
  profile:
|
|
236
|
+
  - data: |
|
|
237
|
+
      [main]
|
|
238
|
+
      summary=Optimize systems running OpenSearch on OpenShift nodes
|
|
239
|
+
      include=openshift-node
|
|
240
|
+
      [sysctl]
|
|
241
|
+
      vm.max_map_count=262144
|
|
242
|
+
    name: openshift-opensearch
|
|
243
|
+
  recommend:
|
|
244
|
+
  - match:
|
|
245
|
+
    - label: tuned.openshift.io/opensearch
|
|
246
|
+
      type: pod
|
|
247
|
+
    priority: 20
|
|
248
|
+
    profile: openshift-opensearch
|
|
249
|
+
```
|
|
250
|
+
|
|
251
|
+
Save this YAML locally and apply it to every cluster on which you intend to deploy Dremio:
|
|
252
|
+
|
|
253
|
+
```
|
|
254
|
+
oc apply -f tuned-opensearch.yaml
|
|
255
|
+
```
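After the profile has been applied, the value on a node can be compared with the minimum stated above. The following is a minimal sketch, assuming you have a shell on the node (for example through an `oc debug node/<node-name>` session); the helper name `map_count_ok` is hypothetical.

```shell
#!/bin/sh
# Minimum vm.max_map_count that OpenSearch needs, as stated above.
REQUIRED_MAP_COUNT=262144

# Succeeds (exit status 0) when the given value meets the minimum.
map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Usage on a node:
#   map_count_ok "$(sysctl -n vm.max_map_count)" && echo "ok" || echo "too low"
```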
|
|
256
|
+
|
|
257
|
+
## Step 1: Deploy Dremio
|
|
258
|
+
|
|
259
|
+
To deploy the Dremio cluster in Kubernetes, do the following:
|
|
260
|
+
|
|
261
|
+
1. Configure your values to deploy Dremio to Kubernetes in the file `values-overrides.yaml`. To do so, follow [Configuring Your Values to Deploy Dremio to Kubernetes](/current/deploy-dremio/configuring-kubernetes/), then return here to continue with the deployment.
|
|
262
|
+
2. On your terminal, start the deployment by installing Dremio's Helm chart:
|
|
263
|
+
|
|
264
|
+
* Standard Kubernetes
|
|
265
|
+
* OpenShift
|
|
266
|
+
|
|
267
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
268
|
+
|
|
269
|
+
```
|
|
270
|
+
helm install <your-dremio-install-release> oci://quay.io/dremio/dremio-helm \
|
|
271
|
+
--values <your-local-path>/values-overrides.yaml \
|
|
272
|
+
--version <optional-helm-chart-version> \
|
|
273
|
+
--set-file <optional-config-files> \
|
|
274
|
+
--wait
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
Where:
|
|
278
|
+
|
|
279
|
+
* `<your-dremio-install-release>` - The name that identifies your Dremio installation. For example, `dremio-1-0`.
|
|
280
|
+
* `<your-local-path>` - The path to reach your `values-overrides.yaml` configuration file.
|
|
281
|
+
* (Optional) `--version <optional-helm-chart-version>` - The version of Dremio's Helm chart to be used. If not provided, defaults to the latest.
|
|
282
|
+
* (Optional) `--set-file <optional-config-files>` - An optional configuration file for deploying Dremio. For example, an [Identity Provider](/current/security/authentication/identity-providers/) configuration file, which is not defined in the `values-overrides.yaml` and can be provided here through this option.
|
|
283
|
+
|
|
284
|
+
For OpenShift, the command requires an additional `--values` option with the path to the OpenShift-specific `values-openshift-overrides.yaml` configuration file. This additional option must be placed before the `--values` option with the `values-overrides.yaml` configuration file. Helm merges values files in the order given and the last file takes precedence, so the settings in `values-overrides.yaml` are applied on top of the OpenShift-specific ones.
|
|
285
|
+
|
|
286
|
+
Run the following command for OpenShift:
|
|
287
|
+
|
|
288
|
+
```
|
|
289
|
+
helm install <your-dremio-install-release> oci://quay.io/dremio/dremio-helm \
|
|
290
|
+
--values <your-local-path1>/values-openshift-overrides.yaml \
|
|
291
|
+
--values <your-local-path2>/values-overrides.yaml \
|
|
292
|
+
--version <optional-helm-chart-version> \
|
|
293
|
+
--set-file <optional-config-files> \
|
|
294
|
+
--wait
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
Where:
|
|
298
|
+
|
|
299
|
+
* `<your-dremio-install-release>` - The name that identifies your Dremio installation. For example, `dremio-1-0`.
|
|
300
|
+
* `<your-local-path1>` - The path to reach your `values-openshift-overrides.yaml` configuration file. Only required for OpenShift.
|
|
301
|
+
* `<your-local-path2>` - The path to reach your `values-overrides.yaml` configuration file.
|
|
302
|
+
* (Optional) `--version <optional-helm-chart-version>` - The version of Dremio's Helm chart to be used. If not provided, defaults to the latest.
|
|
303
|
+
* (Optional) `--set-file <optional-config-files>` - An optional configuration file for deploying Dremio. For example, an [Identity Provider](/current/security/authentication/identity-providers/) configuration file, which is not defined in the `values-overrides.yaml` and can be provided here through this option.
|
|
304
|
+
3. Monitor the deployment using the following commands:
|
|
305
|
+
|
|
306
|
+
* Standard Kubernetes
|
|
307
|
+
* OpenShift
|
|
308
|
+
|
|
309
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
310
|
+
|
|
311
|
+
```
|
|
312
|
+
kubectl get pods
|
|
313
|
+
```
|
|
314
|
+
|
|
315
|
+
For OpenShift, run the following command:
|
|
316
|
+
|
|
317
|
+
```
|
|
318
|
+
oc get pods
|
|
319
|
+
```
|
|
320
|
+
|
|
321
|
+
When all of the pods are in the `Ready` state, the deployment is complete.
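The order of the `--values` options in the install commands of this step matters, because Helm gives precedence to the last values file on the command line. The sketch below prints the options in the order those commands use; the helper name `dremio_values_args` and the `./conf` directory are hypothetical.

```shell
#!/bin/sh
# Print the --values arguments for `helm install` in the order used in this step.
# $1: "openshift" or "standard"; $2: directory holding the values files.
dremio_values_args() {
  if [ "$1" = "openshift" ]; then
    # The OpenShift-specific file goes first so that values-overrides.yaml,
    # listed last, takes precedence where both set the same key.
    printf '%s ' "--values" "$2/values-openshift-overrides.yaml"
  fi
  printf '%s\n' "--values $2/values-overrides.yaml"
}

dremio_values_args openshift ./conf
```

The printed options can be pasted into the `helm install` command in place of the hand-written `--values` lines.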
|
|
322
|
+
|
|
323
|
+
Troubleshooting
|
|
324
|
+
|
|
325
|
+
* If a pod remains in `Pending` state for more than a few minutes, run the following command to view its status to check for issues, such as insufficient resources for scheduling:
|
|
326
|
+
|
|
327
|
+
```
|
|
328
|
+
kubectl describe pods <pod-name>
|
|
329
|
+
```
|
|
330
|
+
* If the events at the bottom of the output mention insufficient CPU or memory, do one of the following:
|
|
331
|
+
|
|
332
|
+
+ Adjust the values in the `values-overrides.yaml` configuration file and redeploy.
|
|
333
|
+
+ Add more resources to your Kubernetes cluster.
|
|
334
|
+
* If a pod returns a failed state (especially `dremio-master-0`, the most important pod), use the following commands to collect the logs:
|
|
335
|
+
|
|
336
|
+
+ Standard Kubernetes
|
|
337
|
+
+ OpenShift
|
|
338
|
+
|
|
339
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
340
|
+
|
|
341
|
+
```
|
|
342
|
+
kubectl logs dremio-master-0
|
|
343
|
+
```
|
|
344
|
+
|
|
345
|
+
For OpenShift, run the following command:
|
|
346
|
+
|
|
347
|
+
```
|
|
348
|
+
oc logs deployment/dremio-master
|
|
349
|
+
```
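To narrow the monitoring output down to the pods that need attention, a filter along these lines may help. It is a minimal sketch that assumes the default column layout of `kubectl get pods --no-headers` (name first, `READY` second); the helper name `not_ready_pods` is hypothetical. Pods of finished jobs, which show a status such as `Completed`, would also be listed.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the name of
# every pod whose READY column (for example "0/1") shows containers not ready.
not_ready_pods() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage:
#   kubectl get pods --no-headers | not_ready_pods
```

Each name it prints can then be passed to `kubectl describe pods <pod-name>` as described above.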
|
|
350
|
+
|
|
351
|
+
## Step 2: Connecting to Dremio
|
|
352
|
+
|
|
353
|
+
Now that you've installed the Helm chart and deployed Dremio on Kubernetes, the next step is connecting to Dremio, where you have the following options:
|
|
354
|
+
|
|
355
|
+
* Dremio Console
|
|
356
|
+
* OpenShift Route
|
|
357
|
+
* BI Tools via ODBC/JDBC
|
|
358
|
+
* BI Tools via Apache Arrow Flight
|
|
359
|
+
|
|
360
|
+
To connect to Dremio via [the Dremio console](/current/get-started/quick_tour), run the following command to look up the `dremio-client` service in Kubernetes and find the host for the Dremio console:
|
|
361
|
+
|
|
362
|
+
```
|
|
363
|
+
$ kubectl get services dremio-client
|
|
364
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
365
|
+
... ... ... ... ... ...
|
|
366
|
+
```
|
|
367
|
+
|
|
368
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access the Dremio console through the address in the `EXTERNAL-IP` column and port **9047**.
|
|
369
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access the Dremio console through <http://8.8.8.8:9047>.
|
|
370
|
+
|
|
371
|
+
```
|
|
372
|
+
$ kubectl get services dremio-client
|
|
373
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
374
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
375
|
+
```
|
|
376
|
+
|
|
377
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.web.port` in the file `values-overrides.yaml`.
|
|
378
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access the Dremio console through <http://localhost:30670>.
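The lookup described above can be scripted. The following is a minimal sketch that assumes the six-column layout shown in the sample output and reads the node port from the `PORT(S)` column (in `9047:30620/TCP`, the second number is the node port). The helper name `service_url` is hypothetical.

```shell
#!/bin/sh
# Build a URL from one data row of `kubectl get services dremio-client` output.
# $1: the row; $2: the service port to look up (9047 for the console).
service_url() {
  echo "$1" | awk -v port="$2" '{
    n = split($5, maps, ",")              # e.g. "31010:32260/TCP,9047:30620/TCP"
    for (i = 1; i <= n; i++) {
      split(maps[i], p, "[/:]")           # p[1] = service port, p[2] = node port
      if (p[1] == port) {
        if ($2 == "LoadBalancer") print "http://" $4 ":" p[1]
        else if ($2 == "NodePort") print "http://localhost:" p[2]
      }
    }
  }'
}

service_url "dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d" 9047
```

With the sample row above, the function prints the same console address as the example, `http://8.8.8.8:9047`.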
|
|
379
|
+
|
|
380
|
+
To expose Dremio externally using OpenShift Routes, do the following:
|
|
381
|
+
|
|
382
|
+
```
|
|
383
|
+
$ oc expose service dremio-client --port=9047 --name=dremio-ui
|
|
384
|
+
|
|
385
|
+
$ oc get route dremio-ui -o jsonpath='{.spec.host}'
|
|
386
|
+
```
|
|
387
|
+
|
|
388
|
+
To connect your BI tools to Dremio via ODBC/JDBC, run the following command to look up the `dremio-client` service in Kubernetes and find the host for ODBC/JDBC connections:
|
|
389
|
+
|
|
390
|
+
```
|
|
391
|
+
$ kubectl get services dremio-client
|
|
392
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
393
|
+
... ... ... ... ... ...
|
|
394
|
+
```
|
|
395
|
+
|
|
396
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access Dremio using ODBC/JDBC through the address in the `EXTERNAL-IP` column and port **31010**.
|
|
397
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access Dremio using ODBC/JDBC on port 31010 through <http://8.8.8.8:31010>.
|
|
398
|
+
|
|
399
|
+
```
|
|
400
|
+
$ kubectl get services dremio-client
|
|
401
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
402
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
403
|
+
```
|
|
404
|
+
|
|
405
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.client.port` in the file `values-overrides.yaml`.
|
|
406
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access Dremio using ODBC/JDBC through <http://localhost:32390>.
|
|
407
|
+
|
|
408
|
+
To connect your BI tools to Dremio via Apache Arrow Flight, run the following command to look up the `dremio-client` service in Kubernetes and find the host for Apache Arrow Flight connections:
|
|
409
|
+
|
|
410
|
+
```
|
|
411
|
+
$ kubectl get services dremio-client
|
|
412
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
413
|
+
... ... ... ... ... ...
|
|
414
|
+
```
|
|
415
|
+
|
|
416
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access Dremio using Apache Arrow Flight through the address in the `EXTERNAL-IP` column and port **32010**.
|
|
417
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access Dremio using Apache Arrow Flight through <http://8.8.8.8:32010>.
|
|
418
|
+
|
|
419
|
+
```
|
|
420
|
+
$ kubectl get services dremio-client
|
|
421
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
422
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
423
|
+
```
|
|
424
|
+
|
|
425
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.flight.port` in the file `values-overrides.yaml`.
|
|
426
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access Dremio using Apache Arrow Flight through <http://localhost:31357>.
|
|
427
|
+
|
|
441
|
+
|
|
442
|
+
---
|
|
443
|
+
|
|
444
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/
|
|
445
|
+
|
|
446
|
+
Version: current [26.x]
|
|
447
|
+
|
|
448
|
+
|
|
449
|
+
|
|
450
|
+
# Configuring Your Values to Deploy Dremio to Kubernetes
|
|
451
|
+
|
|
452
|
+
[Helm](https://helm.sh/) is a standard for managing Kubernetes applications, and the [Helm chart](https://helm.sh/docs/topics/charts/) defines how applications are deployed to Kubernetes. Dremio's Helm chart contains the default deployment configurations, which are specified in the `values.yaml`.
|
|
453
|
+
|
|
454
|
+
Dremio recommends configuring your deployment values in a separate `.yaml` file. This simplifies updating to the latest version of the Helm chart, because you can carry the same configuration file over from one chart version to the next.
|
|
455
|
+
|
|
456
|
+
FREE TRIAL
|
|
457
|
+
|
|
458
|
+
If you are using an **Enterprise Edition free trial**, you don't need to do all the configurations described on this page. Instead, follow the configuration steps described in [Get Started with the Enterprise Edition Free Trial](/current/get-started/kubernetes-trial).
|
|
459
|
+
|
|
460
|
+
## Configure Your Values
|
|
461
|
+
|
|
462
|
+
To configure your deployment values, do the following:
|
|
463
|
+
|
|
464
|
+
1. Get the `values-overrides.yaml` configuration file and save it locally. [Click here](/downloads/values-overrides.yaml) to download the file.
|
|
465
|
+
|
|
466
|
+
The `values-overrides.yaml` configuration file
|
|
467
|
+
|
|
468
|
+
```
|
|
469
|
+
# A Dremio License is required
|
|
470
|
+
dremio:
|
|
471
|
+
license: "<your-license-key>"
|
|
472
|
+
image:
|
|
473
|
+
repository: quay.io/dremio/dremio-enterprise-jdk21
|
|
474
|
+
|
|
475
|
+
# Configuration file customization
|
|
476
|
+
# The configFiles and configBinaries options provide the ability to override or add configuration files
|
|
477
|
+
# included in the Dremio ConfigMap. Both use a map where keys correspond to the filenames
|
|
478
|
+
# and values are the file contents.
|
|
479
|
+
|
|
480
|
+
# configFiles: Use this to provide text-based configuration files that will be mounted in /opt/dremio/conf/
|
|
481
|
+
# Note: The dremio.conf file is controlled by multiple settings in this values file and
|
|
482
|
+
# should not be directly overridden here.
|
|
483
|
+
# Example:
|
|
484
|
+
#configFiles:
|
|
485
|
+
# vault_config.json: |
|
|
486
|
+
# {
|
|
487
|
+
# <your-vault-json-config>
|
|
488
|
+
# }
|
|
489
|
+
|
|
490
|
+
# configBinaries: Use this to provide binary configuration files (encoded as base64)
|
|
491
|
+
# These files will also be mounted in /opt/dremio/conf/
|
|
492
|
+
# Example:
|
|
493
|
+
#configBinaries:
|
|
494
|
+
# custom-truststore.jks: "base64EncodedBinaryContent"
|
|
495
|
+
|
|
496
|
+
# dremioConfExtraOptions: Use this to add settings in dremio.conf
|
|
497
|
+
# Example:
|
|
498
|
+
#dremioConfExtraOptions:
|
|
499
|
+
# # Enable SSL for fabric services
|
|
500
|
+
# "services.fabric.ssl.enabled": true
|
|
501
|
+
# "services.fabric.ssl.auto-certificate.enabled": false
|
|
502
|
+
|
|
503
|
+
# Hive 2 and 3 configuration files - can be provided here too. See: https://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#hive
|
|
504
|
+
#hive2ConfigFiles:
|
|
505
|
+
#
|
|
506
|
+
#hive3ConfigFiles:
|
|
507
|
+
#
|
|
508
|
+
|
|
509
|
+
# To pull images from Dremio's Quay, you must create an image pull secret. For more info, see:
|
|
510
|
+
# https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
|
|
511
|
+
# All of the images are pulled using this same secret.
|
|
512
|
+
imagePullSecrets:
|
|
513
|
+
- <your-pull-secret-name>
|
|
514
|
+
|
|
515
|
+
# Dremio Coordinator
|
|
516
|
+
coordinator:
|
|
517
|
+
web:
|
|
518
|
+
auth:
|
|
519
|
+
enabled: true
|
|
520
|
+
type: "internal" # Valid types are: internal, ldap, azuread, oauth, oauth+ldap
|
|
521
|
+
# if enabled is true and type ldap, azuread, oauth, or oauth+ldap
|
|
522
|
+
# Uncomment the entry below and provide the JSON configuration inline
|
|
523
|
+
# OR use --set-file coordinator.web.auth.ssoFile=/path/to/file for the SSO provider configuration file during Helm install
|
|
524
|
+
# for more information about the file format for your SSO provider
|
|
525
|
+
# see https://docs.dremio.com/current/get-started/cluster-deployments/customizing-configuration/dremio-conf/sso-config/
|
|
526
|
+
# ssoFile: |
|
|
527
|
+
# {
|
|
528
|
+
# <your-sso-json-file-content>
|
|
529
|
+
# }
|
|
530
|
+
tls:
|
|
531
|
+
enabled: false
|
|
532
|
+
secret: "<your-tls-secret-name>"
|
|
533
|
+
client:
|
|
534
|
+
tls:
|
|
535
|
+
enabled: false
|
|
536
|
+
secret: "<your-tls-secret-name>"
|
|
537
|
+
flight:
|
|
538
|
+
tls:
|
|
539
|
+
enabled: false
|
|
540
|
+
secret: "<your-tls-secret-name>"
|
|
541
|
+
resources:
|
|
542
|
+
requests:
|
|
543
|
+
cpu: "32"
|
|
544
|
+
memory: "64Gi"
|
|
545
|
+
limits:
|
|
546
|
+
memory: "64Gi"
|
|
547
|
+
volumeSize: 512Gi
|
|
548
|
+
|
|
549
|
+
# Where Dremio stores metadata, Reflections, uploaded files, and backups. The distributed store is required for Dremio to be operational.
|
|
550
|
+
# For more information, see https://docs.dremio.com/current/get-started/cluster-deployments/architecture/distributed-storage/
|
|
551
|
+
distStorage:
|
|
552
|
+
# The supported distributed storage types are: aws, gcp, or azureStorage. For S3-compatible storage use aws.
|
|
553
|
+
type: <your-distributed-storage-type> # Add here your distributed storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-distributed-storage
|
|
554
|
+
|
|
555
|
+
# MongoDB is the backing store for the Open Catalog. Backups are enabled by default and will take place automatically. Dremio will write these backups to your distributed storage location. Not all authentication types are supported. See our distributed storage docs link above. Lack of support will be noted where applicable.
|
|
556
|
+
mongodb:
|
|
557
|
+
backup:
|
|
558
|
+
enabled: true
|
|
559
|
+
|
|
560
|
+
# Dremio Catalog
|
|
561
|
+
catalog:
|
|
562
|
+
externalAccess:
|
|
563
|
+
enabled: true
|
|
564
|
+
tls:
|
|
565
|
+
enabled: false
|
|
566
|
+
secret: "<your-catalog-tls-secret-name>"
|
|
567
|
+
# This is where Iceberg tables created in your catalog will reside
|
|
568
|
+
storage:
|
|
569
|
+
# The supported catalog storage types are: S3, azure and GCS. For S3-compatible storage use S3.
|
|
570
|
+
type: <your-catalog-storage-type>
|
|
571
|
+
# Add here your catalog storage template from https://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-storage-for-dremio-catalog
|
|
572
|
+
|
|
573
|
+
service:
|
|
574
|
+
type: LoadBalancer
|
|
575
|
+
```
|
|
576
|
+
2. Edit the `values-overrides.yaml` file to configure your values. See the following sections for details on each configuration option:
|
|
577
|
+
|
|
578
|
+
* License
|
|
579
|
+
* Pull Secret
|
|
580
|
+
* Coordinator
|
|
581
|
+
* Coordinator's Distributed Storage
|
|
582
|
+
* Open Catalog
|
|
583
|
+
* Advanced Values Configurations
|
|
584
|
+
|
|
585
|
+
IMPORTANT
|
|
586
|
+
|
|
587
|
+
In all code examples, `...` denotes additional values that have been omitted.
|
|
588
|
+
|
|
589
|
+
Group all values associated with a given parent key in the YAML under a single instance of that parent, for example:
|
|
590
|
+
|
|
591
|
+
Do
|
|
592
|
+
|
|
593
|
+
```
|
|
594
|
+
dremio:
|
|
595
|
+
  key-one: <value-one>
|
|
596
|
+
  key-two:
|
|
597
|
+
    key-three: <value-two>
|
|
598
|
+
```
|
|
599
|
+
|
|
600
|
+
Do not
|
|
601
|
+
|
|
602
|
+
```
|
|
603
|
+
dremio:
|
|
604
|
+
  key-one: <value-one>
|
|
605
|
+
|
|
606
|
+
dremio:
|
|
607
|
+
  key-two:
|
|
608
|
+
    key-three: <value-two>
|
|
609
|
+
```
|
|
610
|
+
|
|
611
|
+
Please note the parent relationships at the top of each YAML snippet and subsequent values throughout this section. The hierarchy of keys and indentations in YAML must be respected.
|
|
612
|
+
3. Save the `values-overrides.yaml` file.
|
|
613
|
+
|
|
614
|
+
Once done with the configuration, deploy Dremio to Kubernetes. See how in [Deploying Dremio to Kubernetes](/current/deploy-dremio/deploy-on-kubernetes/).
|
|
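The grouping rule above (one instance of each parent key) can be checked mechanically before deployment. The following is a minimal sketch, not part of Dremio's tooling: the function name is hypothetical, and it assumes that parent keys are unindented `key:` lines, so it only detects repeated top-level parents and does not parse YAML in full.

```python
import re
from collections import Counter

# Matches an unindented mapping key such as "dremio:" or "coordinator:".
# Comment lines and indented lines (nested keys, block scalars) do not match.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z0-9_.-]+):(\s|$)")


def duplicate_top_level_keys(yaml_text: str) -> list[str]:
    """Return top-level keys that appear more than once, in first-seen order."""
    counts = Counter()
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            counts[match.group(1)] += 1
    return [key for key, seen in counts.items() if seen > 1]
```

Running it on the contents of `values-overrides.yaml` and getting an empty list suggests that each top-level parent is declared only once.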
615
|
+
|
|
616
|
+
### License
|
|
617
|
+
|
|
618
|
+
Provide your license key. To obtain a license, see [Licensing](/current/admin/licensing).
|
|
619
|
+
Add this configuration under the parent, as shown in the following example:
|
|
620
|
+
|
|
621
|
+
Configuration of the license key
|
|
622
|
+
|
|
623
|
+
```
|
|
624
|
+
dremio:
|
|
625
|
+
license: "<your-license-key>"
|
|
626
|
+
...
|
|
627
|
+
```
|
|
628
|
+
|
|
629
|
+
### Pull Secret
|
|
630
|
+
|
|
631
|
+
Provide the secret used to pull the images from Quay.io as follows:
|
|
632
|
+
|
|
633
|
+
1. Log in to [Quay.io](https://quay.io/), select your account name at the top right corner, and select **Account Settings** in the drop-down menu.
|
|
634
|
+
2. Click **Generate Encrypted Password**, type your password, and click **Verify**.
|
|
635
|
+
3. On the next dialog, select **Kubernetes Secret**, and follow steps 1 and 2 to download the secret and run the command to submit the secret to the cluster.
|
|
636
|
+
4. Add the configuration under the parent, as shown in the following example:
|
|
637
|
+
|
|
638
|
+
Configuration of the secret to pull images from Quay.io
|
|
639
|
+
|
|
640
|
+
```
|
|
641
|
+
imagePullSecrets:
|
|
642
|
+
- <your-quayio-secret-name>
|
|
643
|
+
```
|
|
644
|
+
|
|
645
|
+
### Coordinator
|
|
646
|
+
|
|
647
|
+
#### Resource Configuration
|
|
648
|
+
|
|
649
|
+
Configure the volume size, resource limits, and resource requests. To configure these values, see Recommended Resources Configuration.
|
|
650
|
+
|
|
651
|
+
Add this configuration under the parents, as shown in the following example:
|
|
652
|
+
|
|
653
|
+
Configuration of the coordinator's resources with example values
|
|
654
|
+
|
|
655
|
+
```
|
|
656
|
+
coordinator:
|
|
657
|
+
resources:
|
|
658
|
+
requests:
|
|
659
|
+
cpu: 15
|
|
660
|
+
memory: 30Gi
|
|
661
|
+
volumeSize: 100Gi
|
|
662
|
+
...
|
|
663
|
+
```
|
|
664
|
+
|
|
665
|
+
#### Identity Provider
|
|
666
|
+
|
|
667
|
+
Optionally, you can configure authentication via an identity provider. Each type of identity provider requires an additional configuration file provided during Dremio's deployment.
|
|
668
|
+
|
|
669
|
+
Select the authentication `type`, and follow the corresponding link for instructions on how to create the associated configuration file:
|
|
670
|
+
|
|
671
|
+
* `azuread` - See how to [configure Microsoft Entra ID with user and group lookup](/current/security/authentication/identity-providers/microsoft-entra-id#configuring-microsoft-entra-id).
|
|
672
|
+
* `ldap` - See how to [configure Dremio for LDAP](/current/security/authentication/identity-providers/ldap).
|
|
673
|
+
* `oauth` - See how to [configure Dremio for OpenID](/current/security/authentication/identity-providers/oidc/#configuring-dremio-for-openid).
|
|
674
|
+
* `oauth+ldap` - See how to [configure Dremio for Hybrid OpenID+LDAP](/current/security/authentication/identity-providers/oidc/#configuring-dremio-for-hybrid-openidldap).
|
|
675
|
+
|
|
676
|
+
Add this configuration under the parents, as shown in the following example:
|
|
677
|
+
|
|
678
|
+
Configuration of the coordinator's identity provider
|
|
679
|
+
|
|
680
|
+
```
|
|
681
|
+
coordinator:
|
|
682
|
+
web:
|
|
683
|
+
auth:
|
|
684
|
+
type: <your-auth-type>
|
|
685
|
+
...
|
|
686
|
+
```
|
|
687
|
+
|
|
688
|
+
The identity provider configuration file can be embedded in your `values-overrides.yaml`. To do this, use the `ssoFile` option and provide the JSON content constructed per the instructions linked above. Here is an example for Microsoft Entra ID:
|
|
689
|
+
|
|
690
|
+
Configuration of an embedded identity provider file with an example for Microsoft Entra ID
|
|
691
|
+
|
|
692
|
+
```
|
|
693
|
+
coordinator:
|
|
694
|
+
web:
|
|
695
|
+
auth:
|
|
696
|
+
enabled: true
|
|
697
|
+
type: "azuread"
|
|
698
|
+
ssoFile: |
|
|
699
|
+
{
|
|
700
|
+
"oAuthConfig": {
|
|
701
|
+
"clientId": "<your-client-id>",
|
|
702
|
+
"clientSecret": "<your-secret>",
|
|
703
|
+
"redirectUrl": "<your-redirect-url>",
|
|
704
|
+
"authorityUrl": "https://login.microsoftonline.com/<your-tenant-id>/v2.0",
|
|
705
|
+
"scope": "openid profile",
|
|
706
|
+
"jwtClaims": {
|
|
707
|
+
"userName": "<your-preferred-username>"
|
|
708
|
+
}
|
|
709
|
+
}
|
|
710
|
+
}
|
|
711
|
+
...
|
|
712
|
+
```
|
|
713
|
+
|
|
714
|
+
For examples of the other types, see [Identity Providers](/current/security/authentication/identity-providers).
|
|
715
|
+
|
|
716
|
+
This is not the only configuration file that can be embedded inside the `values-overrides.yaml` file. However, the other embedded files are generally used for advanced configurations. For more information, see Additional Configuration.
|
|
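One of the other embeddable file types is covered by the `configBinaries` option in the `values-overrides.yaml` template, which expects each binary file as base64-encoded content. As a rough sketch of how such a value could be produced, the helpers below encode a local file and format one map entry. The function names are hypothetical, and the two-space indentation assumes the entry sits directly under `configBinaries:`.

```python
import base64
from pathlib import Path


def encode_config_binary(path: str) -> str:
    """Return the file's bytes as a single-line base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def config_binaries_entry(filename: str, path: str) -> str:
    """Format one indented map entry for the given file (key: "base64")."""
    return f'  {filename}: "{encode_config_binary(path)}"'
```

For example, `print(config_binaries_entry("custom-truststore.jks", "/path/to/custom-truststore.jks"))` prints a line that can be pasted under `configBinaries:`.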
717
|
+
|
|
718
|
+
#### Transport Layer Security
|
|
719
|
+
|
|
720
|
+
Optionally, enable Transport Layer Security (TLS) for client, Arrow Flight, or web connections by setting `enabled: true` under the corresponding key. To provide the TLS secret, see Creating a TLS Secret.
|
|
721
|
+
|
|
722
|
+
Add this configuration under the parent, as shown in the following example:
|
|
723
|
+
|
|
724
|
+
Configuration of TLS for the coordinator
|
|
725
|
+
|
|
726
|
+
```
|
|
727
|
+
coordinator:
|
|
728
|
+
client:
|
|
729
|
+
tls:
|
|
730
|
+
enabled: false
|
|
731
|
+
secret: <your-tls-secret>
|
|
732
|
+
flight:
|
|
733
|
+
tls:
|
|
734
|
+
enabled: false
|
|
735
|
+
secret: <your-tls-secret>
|
|
736
|
+
web:
|
|
737
|
+
tls:
|
|
738
|
+
enabled: false
|
|
739
|
+
secret: <your-tls-secret>
|
|
740
|
+
...
|
|
741
|
+
```
|
|
742
|
+
|
|
743
|
+
note
|
|
744
|
+
|
|
745
|
+
If Web TLS is enabled, see Configuring Open Catalog when the Coordinator Web is Using TLS.
|
|
746
|
+
|
|
747
|
+
### Coordinator's Distributed Storage
|
|
748
|
+
|
|
749
|
+
This is where Dremio stores metadata, Reflections, uploaded files, and backups. A distributed store is required for Dremio to be operational. The supported types are Amazon S3 or S3-compatible storage, Azure Storage, and Google Cloud Storage (GCS). For examples of configurations, see Configuring the Distributed Storage.
|
|
750
|
+
|
|
751
|
+
Add this configuration under the parent, as shown in the following example:
|
|
752
|
+
|
|
753
|
+
Configuration of the coordinator's distributed storage
|
|
754
|
+
|
|
755
|
+
```
|
|
756
|
+
distStorage:
|
|
757
|
+
type: "<your-dist-store-type>"
|
|
758
|
+
...
|
|
759
|
+
```
|
|
760
|
+
|
|
761
|
+
### Open Catalog
|
|
762
|
+
|
|
763
|
+
The configuration for the Open Catalog has several options:
|
|
764
|
+
|
|
765
|
+
* Configuring storage for the Open Catalog is mandatory since this is the location where Iceberg tables created in the catalog will be written. For configuring the storage, see Configuring Storage for the Open Catalog.
|
|
766
|
+
|
|
767
|
+
Add this configuration under the parent, as shown in the following example:
|
|
768
|
+
|
|
769
|
+
Configuration of the storage for the Open Catalog
|
|
770
|
+
|
|
771
|
+
```
|
|
772
|
+
catalog:
|
|
773
|
+
storage:
|
|
774
|
+
location: <your-object-store-path>
|
|
775
|
+
type: <your-object-store-type>
|
|
776
|
+
...
|
|
777
|
+
```
|
|
778
|
+
* (Optional) MongoDB is the backing store for Open Catalog. Its backup is enabled by default and is written to distributed storage. To disable the Open Catalog backup, set `enabled` to `false`. The configuration shown here performs an automatic Open Catalog backup every day at midnight and keeps the last three backups.
|
|
779
|
+
|
|
780
|
+
Enablement of the Open Catalog Backing Store Backup
|
|
781
|
+
|
|
782
|
+
```
|
|
783
|
+
mongodb:
|
|
784
|
+
backup:
|
|
785
|
+
enabled: true
|
|
786
|
+
schedule: "0 0 * * *"
|
|
787
|
+
keep: 3
|
|
788
|
+
```
|
|
789
|
+
* (Optional) Configure external access if you want to connect to the Open Catalog with an engine other than Dremio that supports Iceberg REST, such as Spark.
|
|
790
|
+
|
|
791
|
+
Add this configuration under the parent, as shown in the following example:
|
|
792
|
+
|
|
793
|
+
Configuration of external access for the Open Catalog
|
|
794
|
+
|
|
795
|
+
```
|
|
796
|
+
catalog:
|
|
797
|
+
externalAccess:
|
|
798
|
+
enabled: true
|
|
799
|
+
...
|
|
800
|
+
```
|
|
801
|
+
* (Optional) Use Transport Layer Security (TLS) for external access to require clients connecting to the Open Catalog from outside the namespace to use TLS. To configure it, see Configuring TLS for Open Catalog External Access.
|
|
802
|
+
|
|
803
|
+
Add this configuration under the parent, as shown in the following example:
|
|
804
|
+
|
|
805
|
+
Configuration of TLS for external access to the Open Catalog
|
|
806
|
+
|
|
807
|
+
```
|
|
808
|
+
catalog:
|
|
809
|
+
externalAccess:
|
|
810
|
+
enabled: true
|
|
811
|
+
tls:
|
|
812
|
+
enabled: true
|
|
813
|
+
secret: <your-catalog-tls-secret>
|
|
814
|
+
...
|
|
815
|
+
```
|
|
816
|
+
* (Optional) If Dremio coordinator web access is using TLS, additional configuration is necessary. To configure it, see Configuring Open Catalog When the Coordinator Web is Using TLS.
|
|
817
|
+
|
|
818
|
+
Add this configuration under the parent, as shown in the following example:
|
|
819
|
+
|
|
820
|
+
Configuration of the Open Catalog when the coordinator web access is using TLS
|
|
821
|
+
|
|
822
|
+
```
|
|
823
|
+
catalog:
|
|
824
|
+
externalAccess:
|
|
825
|
+
enabled: true
|
|
826
|
+
authentication:
|
|
827
|
+
authServerHostname: <your-auth-server-host>
|
|
828
|
+
...
|
|
829
|
+
```
|
|
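The backup `schedule` value shown in the MongoDB option above, `"0 0 * * *"`, reads as "every day at midnight". The sketch below assumes the value follows the standard five-field cron layout (minute, hour, day of month, month, day of week), which matches that description. The function name is hypothetical, and the code only labels the fields without validating their ranges.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict[str, str]:
    """Map each of the five cron fields to its value in the expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))
```

With `describe_cron("0 0 * * *")`, the minute and hour fields are both `0` and the remaining fields are `*`, so the job fires at 00:00 on every day of every month.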
830
|
+
|
|
831
|
+
Save the `values-overrides.yaml` file.
|
|
832
|
+
|
|
833
|
+
Once done with the configuration, deploy Dremio to Kubernetes. See how in the topic [Deploying Dremio to Kubernetes](/current/deploy-dremio/deploy-on-kubernetes/).
|
|
834
|
+
|
|
835
|
+
## Configuring Your Values - Advanced
|
|
836
|
+
|
|
837
|
+
### OpenShift
|
|
838
|
+
|
|
839
|
+
warning
|
|
840
|
+
|
|
841
|
+
OpenShift has additional prerequisites that must be applied before installing Dremio. For more information, see [Deploy on Kubernetes - Prerequisites](/current/deploy-dremio/deploy-on-kubernetes#prerequisites).
|
|
842
|
+
|
|
843
|
+
To deploy successfully on OpenShift, you must deploy with two override files: the YAML file you've been using to this point (`values-overrides.yaml`) and an additional YAML file (`openshift-overrides.yaml`) containing the security settings that OpenShift requires under its default configuration. Both can be provided in a single Helm install command.
|
|
844
|
+
|
|
845
|
+
Get the `openshift-overrides.yaml` configuration file and save it locally.
|
|
846
|
+
[Click here](/downloads/values-openshift-overrides.yaml) to download the file.
|
|
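As an illustration of providing both override files in a single Helm install command, the sketch below assembles the argument list. The function name, release name, and chart reference are hypothetical placeholders; substitute the values from your deployment instructions. It relies only on Helm's standard `-f` flag, which can be repeated.

```python
def helm_install_command(release: str, chart: str,
                         values_files: list[str], namespace: str) -> list[str]:
    """Build a `helm install` argument list with one -f flag per values file."""
    command = ["helm", "install", release, chart, "--namespace", namespace]
    for path in values_files:
        command += ["-f", path]
    return command


if __name__ == "__main__":
    # Placeholder release and chart names; replace them before running.
    args = helm_install_command(
        "<your-release-name>",
        "<your-chart-reference>",
        ["values-overrides.yaml", "openshift-overrides.yaml"],
        "dremio",
    )
    print(" ".join(args))
```

When the same key appears in more than one values file, Helm gives precedence to the file listed last, so `openshift-overrides.yaml` is placed after `values-overrides.yaml` here.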
847
|
+
|
|
848
|
+
### Dremio Platform Images
|
|
849
|
+
|
|
850
|
+
The Dremio platform requires 18 images when running fully featured. All images are published by Dremio to our Quay and are listed below. If you want to use a private mirror of our repository, add the snippets below to `values-overrides.yaml` to repoint to your own.
|
|
851
|
+
|
|
852
|
+
Dremio Platform Images
|
|
853
|
+
|
|
854
|
+
note
|
|
855
|
+
|
|
856
|
+
If creating a private mirror, use the same repository names and tags from [Dremio's Quay.io](https://quay.io/organization/dremio).
|
|
857
|
+
This is important for supportability.
|
|
858
|
+
|
|
859
|
+
```
|
|
860
|
+
dremio:
|
|
861
|
+
image:
|
|
862
|
+
repository: quay.io/dremio/dremio-enterprise-jdk21
|
|
863
|
+
tag: <the-image-tag-from-quay-io>
|
|
864
|
+
```
|
|
865
|
+
|
|
866
|
+
```
|
|
867
|
+
busyBox:
|
|
868
|
+
image:
|
|
869
|
+
repository: quay.io/dremio/busybox
|
|
870
|
+
tag: <the-image-tag-from-quay-io>
|
|
871
|
+
```
|
|
872
|
+
|
|
873
|
+
```
|
|
874
|
+
k8s:
|
|
875
|
+
image:
|
|
876
|
+
repository: quay.io/dremio/alpine/k8s
|
|
877
|
+
tag: <the-image-tag-from-quay-io>
|
|
878
|
+
```
|
|
879
|
+
|
|
880
|
+
```
|
|
881
|
+
engine:
|
|
882
|
+
operator:
|
|
883
|
+
image:
|
|
884
|
+
repository: quay.io/dremio/dremio-engine-operator
|
|
885
|
+
tag: <the-image-tag-from-quay-io>
|
|
886
|
+
```
|
|
887
|
+
|
|
888
|
+
```
|
|
889
|
+
zookeeper:
|
|
890
|
+
image:
|
|
891
|
+
repository: quay.io/dremio/zookeeper
|
|
892
|
+
tag: <the-image-tag-from-quay-io>
|
|
893
|
+
```
|
|
894
|
+
|
|
895
|
+
```
|
|
896
|
+
opensearch:
|
|
897
|
+
image:
|
|
898
|
+
repository: quay.io/dremio/dremio-search-opensearch
|
|
899
|
+
tag: <the-image-tag-from-quay-io> # The tag version must be a valid OpenSearch version as listed here https://opensearch.org/docs/latest/version-history/
|
|
900
|
+
preInstallJob:
|
|
901
|
+
image:
|
|
902
|
+
repository: quay.io/dremio/dremio-search-init
|
|
903
|
+
tag: <the-image-tag-from-quay-io>
|
|
904
|
+
```
|
|
905
|
+
|
|
906
|
+
```
|
|
907
|
+
opensearchOperator:
|
|
908
|
+
manager:
|
|
909
|
+
image:
|
|
910
|
+
repository: quay.io/dremio/dremio-opensearch-operator
|
|
911
|
+
tag: <the-image-tag-from-quay-io>
|
|
912
|
+
kubeRbacProxy:
|
|
913
|
+
image:
|
|
914
|
+
repository: quay.io/dremio/kubebuilder/kube-rbac-proxy
|
|
915
|
+
tag: <the-image-tag-from-quay-io>
|
|
916
|
+
```
|
|
917
|
+
|
|
918
|
+
```
|
|
919
|
+
mongodbOperator:
|
|
920
|
+
image:
|
|
921
|
+
repository: quay.io/dremio/dremio-mongodb-operator
|
|
922
|
+
tag: <the-image-tag-from-quay-io>
|
|
923
|
+
```
|
|
924
|
+
|
|
925
|
+
```
|
|
926
|
+
mongodb:
|
|
927
|
+
image:
|
|
928
|
+
repository: quay.io/dremio/percona/percona-server-mongodb
|
|
929
|
+
tag: <the-image-tag-from-quay-io>
|
|
930
|
+
```
|
|
931
|
+
|
|
932
|
+
```
|
|
933
|
+
catalogservices:
|
|
934
|
+
image:
|
|
935
|
+
repository: quay.io/dremio/dremio-catalog-services-server
|
|
936
|
+
tag: <the-image-tag-from-quay-io>
|
|
937
|
+
```
|
|
938
|
+
|
|
939
|
+
```
|
|
940
|
+
catalog:
|
|
941
|
+
image:
|
|
942
|
+
repository: quay.io/dremio/dremio-catalog-server
|
|
943
|
+
tag: <the-image-tag-from-quay-io>
|
|
944
|
+
externalAccess:
|
|
945
|
+
image:
|
|
946
|
+
repository: quay.io/dremio/dremio-catalog-server-external
|
|
947
|
+
tag: <the-image-tag-from-quay-io>
|
|
948
|
+
```
|
|
949
|
+
|
|
950
|
+
```
|
|
951
|
+
nats:
|
|
952
|
+
container:
|
|
953
|
+
image:
|
|
954
|
+
repository: quay.io/dremio/nats
|
|
955
|
+
tag: <the-image-tag-from-quay-io>
|
|
956
|
+
reloader:
|
|
957
|
+
image:
|
|
958
|
+
repository: quay.io/dremio/natsio/nats-server-config-reloader
|
|
959
|
+
tag: <the-image-tag-from-quay-io>
|
|
960
|
+
natsBox:
|
|
961
|
+
container:
|
|
962
|
+
image:
|
|
963
|
+
repository: quay.io/dremio/natsio/nats-box
|
|
964
|
+
tag: <the-image-tag-from-quay-io>
|
|
965
|
+
```
|
|
966
|
+
|
|
967
|
+
```
|
|
968
|
+
telemetry:
|
|
969
|
+
image:
|
|
970
|
+
repository: quay.io/dremio/otel/opentelemetry-collector-contrib
|
|
971
|
+
tag: <the-image-tag-from-quay-io>
|
|
972
|
+
```
|
|
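When repointing the snippets above to a private mirror, each `repository` value changes only in its registry host. The following sketch shows one way to derive the mirrored value. It assumes the mirror keeps the full path after `quay.io/` (including the `dremio/` prefix), which is one reading of "use the same repository names"; the function name and the `registry.example.com` host are hypothetical.

```python
def mirror_repository(repository: str, mirror_host: str) -> str:
    """Swap the registry host and keep the repository path unchanged."""
    _host, _, path = repository.partition("/")
    if not path:
        raise ValueError(f"no repository path in {repository!r}")
    return f"{mirror_host}/{path}"
```

For example, `mirror_repository("quay.io/dremio/busybox", "registry.example.com")` gives `registry.example.com/dremio/busybox`. Tags are left untouched, since the note above asks for the same tags as on Quay.io.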
973
|
+
|
|
974
|
+
### Scale-out Coordinators
|
|
975
|
+
|
|
976
|
+
Dremio can scale to support high-concurrency use cases by adding scale-out coordinators. Multiple stateless coordinators rely on the primary coordinator to manage Dremio's state, enabling Dremio to support many more concurrent users. These scale-out coordinators are intended for high query throughput and do not provide standby or disaster recovery. While scale-out coordinators generally reduce the load on the primary coordinator, the primary coordinator's vCPU request should be increased for every two scale-outs added to avoid negatively impacting performance.
|
|
977
|
+
|
|
978
|
+
Configure this under the `coordinator` parent, where `count` is the number of scale-out coordinators. A `count` of 0 provisions only the primary coordinator:
|
|
979
|
+
|
|
980
|
+
Configuration of scale-out coordinators with an example value
|
|
981
|
+
|
|
982
|
+
```
|
|
983
|
+
coordinator:
|
|
984
|
+
count: 1
|
|
985
|
+
...
|
|
986
|
+
```
|
|
987
|
+
|
|
988
|
+
note
|
|
989
|
+
|
|
990
|
+
When using scale-out coordinators, the load balancer session affinity should be enhanced. See: Advanced Load Balancer Configuration.
|
|
991
|
+
|
|
992
|
+
### Configuring Kubernetes Pod Metadata (including Node Selector)
|
|
993
|
+
|
|
994
|
+
It's possible to add metadata both globally and to each of the StatefulSets (coordinators, classic engines, ZooKeeper, etc.), including configuring a node selector for pods to use specific node pools.
|
|
995
|
+
|
|
996
|
+
warning
|
|
997
|
+
|
|
998
|
+
Define these values with caution and foreknowledge of expected entries because any misconfiguration may result in Kubernetes being unable to schedule your pods.
|
|
999
|
+
|
|
1000
|
+
Use the following options to add metadata:
|
|
1001
|
+
|
|
1002
|
+
* `labels:` - Configured using key-value pairs as shown in the following examples:
|
|
1003
|
+
|
|
1004
|
+
Configuration of a global label with a key-value example
|
|
1005
|
+
|
|
1006
|
+
```
|
|
1007
|
+
labels:
|
|
1008
|
+
foo: bar
|
|
1009
|
+
```
|
|
1010
|
+
|
|
1011
|
+
Configuration of a StatefulSet label for the Open Catalog with a key-value example
|
|
1012
|
+
|
|
1013
|
+
```
|
|
1014
|
+
catalog:
|
|
1015
|
+
labels:
|
|
1016
|
+
foo: bar
|
|
1017
|
+
...
|
|
1018
|
+
```
|
|
1019
|
+
|
|
1020
|
+
For more information on labels, see the Kubernetes documentation on [Labels and Selectors](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/).
|
|
1021
|
+
* `annotations:` - Configured using key-value pairs as shown in the following examples.
|
|
1022
|
+
|
|
1023
|
+
Configuration of a global annotation with a key-value example
|
|
1024
|
+
|
|
1025
|
+
```
|
|
1026
|
+
annotations:
|
|
1027
|
+
foo: bar
|
|
1028
|
+
```
|
|
1029
|
+
|
|
1030
|
+
Configuration of a StatefulSet annotation for MongoDB with a key-value example
|
|
1031
|
+
|
|
1032
|
+
```
|
|
1033
|
+
mongodb:
|
|
1034
|
+
annotations:
|
|
1035
|
+
foo: bar
|
|
1036
|
+
...
|
|
1037
|
+
```
|
|
1038
|
+
|
|
1039
|
+
For more information on annotations, see the Kubernetes documentation on [Annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/).
|
|
1040
|
+
* `tolerations:` - Configured using a specific structure as shown in the following examples:
|
|
1041
|
+
|
|
1042
|
+
Configuration of a global toleration with example values
|
|
1043
|
+
|
|
1044
|
+
```
|
|
1045
|
+
tolerations:
|
|
1046
|
+
- key: "key1"
|
|
1047
|
+
operator: "Equal"
|
|
1048
|
+
value: "value1"
|
|
1049
|
+
effect: "NoSchedule"
|
|
1050
|
+
```
|
|
1051
|
+
|
|
1052
|
+
Configuration of a StatefulSet toleration for the Open Catalog with example values
|
|
1053
|
+
|
|
1054
|
+
```
|
|
1055
|
+
catalog:
|
|
1056
|
+
tolerations:
|
|
1057
|
+
- key: "key1"
|
|
1058
|
+
operator: "Equal"
|
|
1059
|
+
value: "value1"
|
|
1060
|
+
effect: "NoSchedule"
|
|
1061
|
+
...
|
|
1062
|
+
```
|
|
1063
|
+
|
|
1064
|
+
For more information on tolerations, see the Kubernetes documentation on [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/).
|
|
1065
|
+
* `nodeSelector:` - Configured using a specific structure as shown in the following examples.
|
|
1066
|
+
|
|
1067
|
+
Configuration of a global node selector with an example value
|
|
1068
|
+
|
|
1069
|
+
```
|
|
1070
|
+
nodeSelector:
|
|
1071
|
+
nodetype: coordinator
|
|
1072
|
+
```
|
|
1073
|
+
|
|
1074
|
+
Configuration of a StatefulSet node selector for the coordinator with an example value
|
|
1075
|
+
|
|
1076
|
+
```
|
|
1077
|
+
coordinator:
|
|
1078
|
+
nodeSelector:
|
|
1079
|
+
nodetype: coordinator
|
|
1080
|
+
...
|
|
1081
|
+
```
|
|
1082
|
+
|
|
1083
|
+
To understand the structure and values to use in the configurations, expand "Metadata Structure and Values" below:
|
|
1084
|
+
|
|
1085
|
+
Metadata Structure and Values
|
|
1086
|
+
|
|
1087
|
+
For global metadata:
|
|
1088
|
+
|
|
1089
|
+
Global metadata structure
|
|
1090
|
+
|
|
1091
|
+
```
|
|
1092
|
+
annotations: {}
|
|
1093
|
+
labels: {}
|
|
1094
|
+
tolerations: []
|
|
1095
|
+
nodeSelector: {}
|
|
1096
|
+
```
|
|
1097
|
+
|
|
1098
|
+
For StatefulSet metadata:
|
|
1099
|
+
|
|
1100
|
+
StatefulSet metadata structure for the coordinator
|
|
1101
|
+
|
|
1102
|
+
```
|
|
1103
|
+
coordinator:
|
|
1104
|
+
annotations: {}
|
|
1105
|
+
labels: {}
|
|
1106
|
+
tolerations: []
|
|
1107
|
+
nodeSelector:
|
|
1108
|
+
nodetype: coordinator
|
|
1109
|
+
```
|
|
1110
|
+
|
|
1111
|
+
StatefulSet metadata structure for the executors
|
|
1112
|
+
|
|
1113
|
+
```
|
|
1114
|
+
executor:
|
|
1115
|
+
annotations: {}
|
|
1116
|
+
labels: {}
|
|
1117
|
+
tolerations: []
|
|
1118
|
+
nodeSelector:
|
|
1119
|
+
nodetype: executor
|
|
1120
|
+
```
|
|
1121
|
+
|
|
1122
|
+
StatefulSet metadata structure for the Open Catalog
|
|
1123
|
+
|
|
1124
|
+
```
|
|
1125
|
+
catalog:
|
|
1126
|
+
annotations: {}
|
|
1127
|
+
labels: {}
|
|
1128
|
+
tolerations: []
|
|
1129
|
+
nodeSelector:
|
|
1130
|
+
nodetype: catalog
|
|
1131
|
+
```
|
|
1132
|
+
|
|
1133
|
+
StatefulSet metadata structure for the Open Catalog services
|
|
1134
|
+
|
|
1135
|
+
```
|
|
1136
|
+
catalogservices:
|
|
1137
|
+
annotations: {}
|
|
1138
|
+
labels: {}
|
|
1139
|
+
tolerations: []
|
|
1140
|
+
nodeSelector:
|
|
1141
|
+
nodetype: catalogservices
|
|
1142
|
+
```
|
|
1143
|
+
|
|
1144
|
+
StatefulSet metadata structure for MongoDB
|
|
1145
|
+
|
|
1146
|
+
```
|
|
1147
|
+
mongodb:
|
|
1148
|
+
annotations: {}
|
|
1149
|
+
labels: {}
|
|
1150
|
+
tolerations: []
|
|
1151
|
+
nodeSelector:
|
|
1152
|
+
nodetype: mongo
|
|
1153
|
+
```
|
|
1154
|
+
|
|
1155
|
+
StatefulSet metadata structure for OpenSearch
|
|
1156
|
+
|
|
1157
|
+
```
|
|
1158
|
+
opensearch:
|
|
1159
|
+
annotations: {}
|
|
1160
|
+
labels: {}
|
|
1161
|
+
tolerations: []
|
|
1162
|
+
nodeSelector:
|
|
1163
|
+
nodetype: operators
|
|
1164
|
+
oidcProxy:
|
|
1165
|
+
annotations: {}
|
|
1166
|
+
labels: {}
|
|
1167
|
+
tolerations: []
|
|
1168
|
+
nodeSelector:
|
|
1169
|
+
nodetype: utils
|
|
1170
|
+
preInstallJob:
|
|
1171
|
+
annotations: {}
|
|
1172
|
+
labels: {}
|
|
1173
|
+
tolerations: []
|
|
1174
|
+
nodeSelector:
|
|
1175
|
+
nodetype: jobs
|
|
1176
|
+
```
|
|
1177
|
+
|
|
1178
|
+
StatefulSet metadata structure for NATS
|
|
1179
|
+
|
|
1180
|
+
```
|
|
1181
|
+
nats:
|
|
1182
|
+
podTemplate:
|
|
1183
|
+
merge:
|
|
1184
|
+
spec:
|
|
1185
|
+
annotations: {}
|
|
1186
|
+
labels: {}
|
|
1187
|
+
tolerations: []
|
|
1188
|
+
nodeSelector:
|
|
1189
|
+
nodetype: nats
|
|
1190
|
+
```
|
|
1191
|
+
|
|
1192
|
+
StatefulSet metadata structure for the MongoDB operator
|
|
1193
|
+
|
|
1194
|
+
```
|
|
1195
|
+
mongodbOperator:
|
|
1196
|
+
annotations: {}
|
|
1197
|
+
labels: {}
|
|
1198
|
+
tolerations: []
|
|
1199
|
+
nodeSelector:
|
|
1200
|
+
nodetype: operators
|
|
1201
|
+
```
|
|
1202
|
+
|
|
1203
|
+
StatefulSet metadata structure for the OpenSearch operator
|
|
1204
|
+
|
|
1205
|
+
```
|
|
1206
|
+
opensearchOperator:
|
|
1207
|
+
annotations: {}
|
|
1208
|
+
labels: {}
|
|
1209
|
+
tolerations: []
|
|
1210
|
+
nodeSelector:
|
|
1211
|
+
nodetype: operators
|
|
1212
|
+
```
|
|
1213
|
+
|
|
1214
|
+
### Configuring Pod Priority
|
|
1215
|
+
|
|
1216
|
+
You can configure the priority of Dremio pods through priority classes. First, define the priority class, as shown in the following example:
|
|
1217
|
+
|
|
1218
|
+
Definition of a `high-priority` priority class
|
|
1219
|
+
|
|
1220
|
+
```
|
|
1221
|
+
apiVersion: scheduling.k8s.io/v1
|
|
1222
|
+
kind: PriorityClass
|
|
1223
|
+
metadata:
|
|
1224
|
+
name: high-priority
|
|
1225
|
+
value: 1000000
|
|
1226
|
+
globalDefault: false
|
|
1227
|
+
description: "This priority class should be used for coordinator pods only."
|
|
1228
|
+
```
|
|
1229
|
+
|
|
1230
|
+
Then, apply the priority class under the parents, as shown in the following example:
|
|
1231
|
+
|
|
1232
|
+
Configuration of the `high-priority` priority class for the coordinator
|
|
1233
|
+
|
|
1234
|
+
```
|
|
1235
|
+
coordinator:
|
|
1236
|
+
priorityClassName: high-priority
|
|
1237
|
+
```
|
|
1238
|
+
|
|
1239
|
+
To understand the structure and values to use in the configurations, expand "Priority Class Configuration Structure and Values" below:
|
|
1240
|
+
|
|
1241
|
+
Priority Class Configuration Structure and Values
|
|
1242
|
+
|
|
1243
|
+
Priority class configuration for the coordinator
|
|
1244
|
+
|
|
1245
|
+
```
|
|
1246
|
+
coordinator:
|
|
1247
|
+
priorityClassName: <your-priority-class-name>
|
|
1248
|
+
```
|
|
1249
|
+
|
|
1250
|
+
Priority class configuration for the Open Catalog
|
|
1251
|
+
|
|
1252
|
+
```
|
|
1253
|
+
catalog:
|
|
1254
|
+
priorityClassName: <your-priority-class-name>
|
|
1255
|
+
externalAccess:
|
|
1256
|
+
priorityClassName: <your-priority-class-name>
|
|
1257
|
+
```
|
|
1258
|
+
|
|
1259
|
+
Priority class configuration for the Open Catalog services
|
|
1260
|
+
|
|
1261
|
+
```
|
|
1262
|
+
catalogservices:
|
|
1263
|
+
priorityClassName: <your-priority-class-name>
|
|
1264
|
+
```
|
|
1265
|
+
|
|
1266
|
+
Priority class configuration for the engine
|
|
1267
|
+
|
|
1268
|
+
```
|
|
1269
|
+
engine:
|
|
1270
|
+
executor:
|
|
1271
|
+
priorityClassName: <your-priority-class-name>
|
|
1272
|
+
operator:
|
|
1273
|
+
priorityClassName: <your-priority-class-name>
|
|
1274
|
+
```
|
|
1275
|
+
|
|
1276
|
+
Priority class configuration for OpenSearch
|
|
1277
|
+
|
|
1278
|
+
```
|
|
1279
|
+
opensearch:
|
|
1280
|
+
priorityClassName: <your-priority-class-name>
|
|
1281
|
+
```
|
|
1282
|
+
|
|
1283
|
+
Priority class configuration for the OpenSearch operator
|
|
1284
|
+
|
|
1285
|
+
```
|
|
1286
|
+
opensearchOperator:
|
|
1287
|
+
priorityClassName: <your-priority-class-name>
|
|
1288
|
+
```
|
|
1289
|
+
|
|
1290
|
+
Priority class configuration for MongoDB
|
|
1291
|
+
|
|
1292
|
+
```
|
|
1293
|
+
mongodb:
|
|
1294
|
+
priorityClassName: <your-priority-class-name>
|
|
1295
|
+
```
|
|
1296
|
+
|
|
1297
|
+
Priority class configuration for the MongoDB hooks
|
|
1298
|
+
|
|
1299
|
+
```
|
|
1300
|
+
mongodbHooks:
|
|
1301
|
+
priorityClassName: <your-priority-class-name>
|
|
1302
|
+
```
|
|
1303
|
+
|
|
1304
|
+
Priority class configuration for NATS
|
|
1305
|
+
|
|
1306
|
+
```
|
|
1307
|
+
nats:
|
|
1308
|
+
podTemplate:
|
|
1309
|
+
merge:
|
|
1310
|
+
spec:
|
|
1311
|
+
priorityClassName: <your-priority-class-name>
|
|
1312
|
+
natsBox:
|
|
1313
|
+
podTemplate:
|
|
1314
|
+
merge:
|
|
1315
|
+
spec:
|
|
1316
|
+
priorityClassName: <your-priority-class-name>
|
|
1317
|
+
```
|
|
1318
|
+
|
|
1319
|
+
Priority class configuration for ZooKeeper
|
|
1320
|
+
|
|
1321
|
+
```
|
|
1322
|
+
zookeeper:
|
|
1323
|
+
priorityClassName: <your-priority-class-name>
|
|
1324
|
+
```
|
|
1325
|
+
|
|
1326
|
+
Priority class configuration for telemetry
|
|
1327
|
+
|
|
1328
|
+
```
|
|
1329
|
+
telemetry:
|
|
1330
|
+
priorityClassName: <your-priority-class-name>
|
|
1331
|
+
```
|
|
1332
|
+
|
|
1333
|
+
note
|
|
1334
|
+
|
|
1335
|
+
To verify which priority class is applied to each pod, run the command below, and check the `PRIORITY_CLASS` column:
|
|
1336
|
+
|
|
1337
|
+
Run kubectl to list the pods and their priority class
|
|
1338
|
+
|
|
1339
|
+
```
|
|
1340
|
+
kubectl get pods -o custom-columns="NAME:.metadata.name,PRIORITY_CLASS:.spec.priorityClassName" -n dremio
|
|
1341
|
+
```
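Each `priorityClassName` value above must refer to a PriorityClass that already exists in the cluster. As a minimal sketch, assuming a hypothetical class named `dremio-high-priority` with an illustrative value of `1000000` (neither comes from the Dremio documentation), the manifest could be written to a file like this:

```shell
# Hypothetical name and value, chosen only for illustration.
PRIORITY_CLASS_NAME="dremio-high-priority"
PRIORITY_VALUE=1000000
MANIFEST="$(mktemp)"

# Write a standard Kubernetes PriorityClass manifest to a temporary file.
cat > "$MANIFEST" <<EOF
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: ${PRIORITY_CLASS_NAME}
value: ${PRIORITY_VALUE}
globalDefault: false
description: "Priority class for Dremio pods (example)."
EOF

echo "Wrote PriorityClass manifest to $MANIFEST"
# Apply it before deploying Dremio (requires cluster access):
#   kubectl apply -f "$MANIFEST"
```

Once the class exists, its name can be used wherever `<your-priority-class-name>` appears in the snippets above.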
|
|
1342
|
+
|
|
1343
|
+
### Configuring Extra Environment Variables
|
|
1344
|
+
|
|
1345
|
+
Optionally, you can define extra environment variables to be passed to either coordinators or executors. This can be done by adding the configuration under the parents as shown in the following examples:
|
|
1346
|
+
|
|
1347
|
+
Configuration of extra environment variables for the coordinator
|
|
1348
|
+
|
|
1349
|
+
```
|
|
1350
|
+
coordinator:
|
|
1351
|
+
extraEnvs:
|
|
1352
|
+
- name: <your-variable-name>
|
|
1353
|
+
value: "<your-variable-value>"
|
|
1354
|
+
...
|
|
1355
|
+
```
|
|
1356
|
+
|
|
1357
|
+
Configuration of extra environment variables for the executors
|
|
1358
|
+
|
|
1359
|
+
```
|
|
1360
|
+
executor:
|
|
1361
|
+
extraEnvs:
|
|
1362
|
+
- name: <your-variable-name>
|
|
1363
|
+
value: "<your-variable-value>"
|
|
1364
|
+
...
|
|
1365
|
+
```
|
|
1366
|
+
|
|
1367
|
+
Environment variables defined as shown will be applied to Executors of both [Classic Engines](/current/deploy-dremio/configuring-kubernetes/#configuration-of-classic-engines) and [New Engines](/current/deploy-dremio/managing-engines-kubernetes).
|
|
1368
|
+
|
|
1369
|
+
### Advanced Load Balancer Configuration
|
|
1370
|
+
|
|
1371
|
+
Dremio will create a public load balancer by default, and the Dremio Client service will provide an external IP to connect to Dremio. For more information, see [Connecting to the Dremio Console](/current/deploy-dremio/deploy-on-kubernetes#connecting-to-the-dremio-console).
|
|
1372
|
+
|
|
1373
|
+
* **Private Cluster** - For private Kubernetes clusters (no public endpoint), set `internalLoadBalancer: true`. Add this configuration under the parent as shown in the following example:
|
|
1374
|
+
|
|
1375
|
+
Configuration of an internal load balancer
|
|
1376
|
+
|
|
1377
|
+
```
|
|
1378
|
+
service:
|
|
1379
|
+
type: LoadBalancer
|
|
1380
|
+
internalLoadBalancer: true
|
|
1381
|
+
...
|
|
1382
|
+
```
|
|
1383
|
+
* **Static IP** - To define a static IP for your load balancer, set `loadBalancerIP: <your-static-IP>`. If unset, an available IP will be assigned upon creation of the load balancer. Add this configuration under the parent as shown in the following example:
|
|
1384
|
+
|
|
1385
|
+
Configuration of a static IP for the load balancer
|
|
1386
|
+
|
|
1387
|
+
```
|
|
1388
|
+
service:
|
|
1389
|
+
type: LoadBalancer
|
|
1390
|
+
loadBalancerIP: <your-desired-ip>
|
|
1391
|
+
...
|
|
1392
|
+
```
|
|
1393
|
+
|
|
1394
|
+
tip
|
|
1395
|
+
|
|
1396
|
+
This can be helpful if DNS is configured to expect Dremio to have a specific IP.
|
|
1397
|
+
* **Session Affinity** - If you use scale-out coordinators, set `sessionAffinity` to `ClientIP`; otherwise, leave it unset. Add this configuration under the parent as shown in the following example:
|
|
1398
|
+
|
|
1399
|
+
Configuration of session affinity for scale-out coordinators
|
|
1400
|
+
|
|
1401
|
+
```
|
|
1402
|
+
service:
|
|
1403
|
+
type: LoadBalancer
|
|
1404
|
+
sessionAffinity: ClientIP
|
|
1405
|
+
...
|
|
1406
|
+
```
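The three options above all sit under the same `service` parent, so they can presumably be combined in a single block. The sketch below assumes that combination is valid for your chart version and uses a hypothetical static IP of `10.0.0.50`; it only writes the fragment to a temporary file so that it can be reviewed and copied into `values-overrides.yaml`:

```shell
# Hypothetical static IP, for illustration only.
STATIC_IP="10.0.0.50"
SNIPPET="$(mktemp)"

# Write a combined service block using the keys shown in the examples above.
cat > "$SNIPPET" <<EOF
service:
  type: LoadBalancer
  internalLoadBalancer: true   # private cluster, no public endpoint
  loadBalancerIP: ${STATIC_IP} # static IP that DNS is expected to point to
  sessionAffinity: ClientIP    # only needed with scale-out coordinators
EOF

cat "$SNIPPET"
```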
|
|
1407
|
+
|
|
1408
|
+
#### Additional Load Balancer Configuration for Amazon EKS in Auto Mode
|
|
1409
|
+
|
|
1410
|
+
If deploying Dremio to Amazon EKS (Elastic Kubernetes Service) in Auto Mode, you need to add service annotations for the load balancer to start (for more information, see [Use Service Annotations to configure Network Load Balancers](https://docs.aws.amazon.com/eks/latest/userguide/auto-configure-nlb.html)). Add this configuration under the parent as shown in the following example:
|
|
1411
|
+
|
|
1412
|
+
Configuration of service annotations for Amazon EKS in Auto Mode
|
|
1413
|
+
|
|
1414
|
+
```
|
|
1415
|
+
service:
|
|
1416
|
+
type: LoadBalancer
|
|
1417
|
+
annotations:
|
|
1418
|
+
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
|
|
1419
|
+
...
|
|
1420
|
+
```
|
|
1421
|
+
|
|
1422
|
+
### Advanced TLS Configuration for OpenSearch
|
|
1423
|
+
|
|
1424
|
+
Dremio generates Transport Layer Security (TLS) certificates for OpenSearch by default and rotates them monthly. To use your own certificates instead, create two secrets containing the relevant certificates. The format of these secrets differs from the other TLS secrets shown on this page, and the `tls.crt`, `tls.key`, and `ca.crt` files must be in PEM format. Use the example below as a reference to create your secrets:
|
|
1425
|
+
|
|
1426
|
+
Run kubectl to create two secrets for your own TLS certificates for OpenSearch
|
|
1427
|
+
|
|
1428
|
+
```
|
|
1429
|
+
kubectl create secret generic opensearch-tls-certs \
|
|
1430
|
+
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt
|
|
1431
|
+
|
|
1432
|
+
kubectl create secret generic opensearch-tls-certs-admin \
|
|
1433
|
+
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt
|
|
1434
|
+
```
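Because the three files must be in PEM format, it can help to check them before creating the secrets. The following is a rough sketch only: `is_pem` and `check_opensearch_certs` are hypothetical helper names, and the check merely looks for a PEM header line, so it does not validate the certificate contents.

```shell
# Returns 0 if the file contains a PEM header line, non-zero otherwise.
is_pem() {
  grep -q -- '-----BEGIN [A-Z ]*-----' "$1"
}

# Checks that tls.crt, tls.key, and ca.crt exist in the given directory
# and that each one looks like a PEM file.
check_opensearch_certs() {
  for f in tls.crt tls.key ca.crt; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $f"
      return 1
    fi
    if ! is_pem "$1/$f"; then
      echo "not PEM: $f"
      return 1
    fi
  done
  echo "all files look like PEM"
}
```

For example, `check_opensearch_certs .` checks the files in the current directory.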
|
|
1435
|
+
|
|
1436
|
+
Add the snippet below to the `values-overrides.yaml` file before deploying Dremio. Because OpenSearch requires TLS, if certificate generation is disabled, you must provide a certificate.
|
|
1437
|
+
|
|
1438
|
+
Configuration of TLS certificates for OpenSearch
|
|
1439
|
+
|
|
1440
|
+
```
|
|
1441
|
+
opensearch:
|
|
1442
|
+
tlsCertsSecretName: <opensearch-tls-certs>
|
|
1443
|
+
disableTlsCertGeneration: true
|
|
1444
|
+
...
|
|
1445
|
+
```
|
|
1446
|
+
|
|
1447
|
+
### Advanced Configuration of Engines
|
|
1448
|
+
|
|
1449
|
+
Dremio's default resource offset is `reserve-2-8`, where the first value represents 2 vCPUs and the second represents 8 GB of RAM. If you need to change this default for your created engines, add the following snippet to `values-overrides.yaml` and set the `defaultOffset` to one of the configurable offsets listed below, which are available out of the box:
|
|
1450
|
+
|
|
1451
|
+
* `reserve-0-0`
|
|
1452
|
+
* `reserve-2-4`
|
|
1453
|
+
* `reserve-2-8`
|
|
1454
|
+
* `reserve-2-16`
|
|
1455
|
+
|
|
1456
|
+
The listed values are keys and thus must be provided in this exact format in the snippet below.
|
|
1457
|
+
|
|
1458
|
+
Configuration of the default resource offset for engines with an example value
|
|
1459
|
+
|
|
1460
|
+
```
|
|
1461
|
+
engine:
|
|
1462
|
+
options:
|
|
1463
|
+
resourceAllocationOffsets:
|
|
1464
|
+
defaultOffset: reserve-2-8
|
|
1465
|
+
...
|
|
1466
|
+
```
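Because the offset keys encode their meaning in the name, a small helper can sanity-check a value before it goes into `values-overrides.yaml`. The sketch below is illustrative only: `parse_offset` is a hypothetical helper, not part of Dremio or its Helm chart, and it hardcodes the four keys listed above.

```shell
# Accepts only the offset keys listed above and prints the reserved
# vCPUs (first number) and GB of RAM (second number).
parse_offset() {
  case "$1" in
    reserve-0-0|reserve-2-4|reserve-2-8|reserve-2-16)
      rest="${1#reserve-}"   # e.g. "2-8"
      cpu="${rest%%-*}"      # text before the dash
      mem="${rest#*-}"       # text after the dash
      echo "vcpu=${cpu} memory_gb=${mem}"
      ;;
    *)
      echo "unsupported offset key: $1" >&2
      return 1
      ;;
  esac
}

parse_offset reserve-2-8
```

Any other string, such as `reserve-4-8`, is rejected with a non-zero exit status.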
|
|
1467
|
+
|
|
1468
|
+
### Configuration of Classic Engines
|
|
1469
|
+
|
|
1470
|
+
note
|
|
1471
|
+
|
|
1472
|
+
* You should only use classic engines if the new ones introduced in Dremio 26.0 are not appropriate for your use case. Classic and new engines are not intended to be used side by side.
|
|
1473
|
+
* Classic engines will not auto-start/auto-stop, which is only possible with the new engines.
|
|
1474
|
+
|
|
1475
|
+
The classic way of configuring engines is still supported, and you can add this snippet to `values-overrides.yaml` as part of the deployment. Note that this snippet is a configuration example, and you should adjust the values to your own case.
|
|
1476
|
+
|
|
1477
|
+
Configuration of classic engines with example values
|
|
1478
|
+
|
|
1479
|
+
```
|
|
1480
|
+
executor:
|
|
1481
|
+
resources:
|
|
1482
|
+
requests:
|
|
1483
|
+
cpu: "16"
|
|
1484
|
+
memory: "120Gi"
|
|
1485
|
+
limits:
|
|
1486
|
+
memory: "120Gi"
|
|
1487
|
+
engines: ["default"]
|
|
1488
|
+
count: 3
|
|
1489
|
+
volumeSize: 128Gi
|
|
1490
|
+
cloudCache:
|
|
1491
|
+
enabled: true
|
|
1492
|
+
volumes:
|
|
1493
|
+
- size: 128Gi
|
|
1494
|
+
...
|
|
1495
|
+
```
|
|
1496
|
+
|
|
1497
|
+
#### Engine Overrides
|
|
1498
|
+
|
|
1499
|
+
Engine overrides are primarily used in conjunction with classic engines to modify the configuration of one or more named engines. By default, every engine inside the `engines` list under `executor` will be the same. The values set under `executor` act as the default for all engines. Thus, the engine overrides do not need to be exhaustive.
|
|
1500
|
+
|
|
1501
|
+
Configuration of overrides for an engine named 'small'
|
|
1502
|
+
|
|
1503
|
+
```
|
|
1504
|
+
engineOverride:
|
|
1505
|
+
small:
|
|
1506
|
+
cpu: "8"
|
|
1507
|
+
memory: "60Gi"
|
|
1508
|
+
count: 2
|
|
1509
|
+
cloudCache:
|
|
1510
|
+
enabled: false
|
|
1511
|
+
```
|
|
1512
|
+
|
|
1513
|
+
Engine overrides can also be used with the new engines, but only to disable the Cloud Columnar Cache (C3) option. C3 is enabled by default on all new engines, but you can choose to disable it if needed.
|
|
1514
|
+
|
|
1515
|
+
### Telemetry
|
|
1516
|
+
|
|
1517
|
+
[Telemetry](/current/admin/service-telemetry-kubernetes) egress is enabled by default. Telemetry metrics provide visibility into various components and services, which helps maintain performance and reliability. To disable egress, add the following to your `values-overrides.yaml`:
|
|
1518
|
+
|
|
1519
|
+
Configuration to disable telemetry
|
|
1520
|
+
|
|
1521
|
+
```
|
|
1522
|
+
telemetry:
|
|
1523
|
+
enabled: false
|
|
1524
|
+
...
|
|
1525
|
+
```
|
|
1526
|
+
|
|
1527
|
+
### Logging
|
|
1528
|
+
|
|
1529
|
+
By default, Dremio enables logging with a pre-defined volume size, which you can check in the `values.yaml` file by downloading Dremio's Helm chart. To override the default configuration, add the following to your `values-overrides.yaml`:
|
|
1530
|
+
|
|
1531
|
+
Configuration of logging
|
|
1532
|
+
|
|
1533
|
+
```
|
|
1534
|
+
dremio:
|
|
1535
|
+
log:
|
|
1536
|
+
enabled: true
|
|
1537
|
+
volume:
|
|
1538
|
+
size: 10Gi
|
|
1539
|
+
storageClass: ""
|
|
1540
|
+
...
|
|
1541
|
+
```
|
|
1542
|
+
|
|
1543
|
+
### Disabling Parts of the Deployment
|
|
1544
|
+
|
|
1545
|
+
You can disable some components of the Dremio platform if their functionality does not pertain to your use case. Dremio continues to work if any of the components described in this section are disabled.
|
|
1546
|
+
|
|
1547
|
+
#### Semantic Search
|
|
1548
|
+
|
|
1549
|
+
To disable Semantic Search, add this configuration under the parent as shown in the following example:
|
|
1550
|
+
|
|
1551
|
+
Configuration to disable Semantic Search
|
|
1552
|
+
|
|
1553
|
+
```
|
|
1554
|
+
opensearch:
|
|
1555
|
+
enabled: false
|
|
1556
|
+
replicas: 0
|
|
1557
|
+
```
|
|
1558
|
+
|
|
1559
|
+
## Additional Configuration
|
|
1560
|
+
|
|
1561
|
+
Dremio uses several configuration and binary files to define behavior such as authentication via an identity provider, logging, and connections to Hive. During deployment, these files are combined and used to create a [Kubernetes ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/). The Dremio deployment, in turn, uses this ConfigMap as the source of truth for various settings. The options in this section let you embed these files in the `values-overrides.yaml` configuration file.
|
|
1562
|
+
|
|
1563
|
+
To inspect Dremio's configuration files or perform a more complex operation not shown here, see Downloading Dremio's Helm Charts.
|
|
1564
|
+
|
|
1565
|
+
### Additional Config Files
|
|
1566
|
+
|
|
1567
|
+
Use the `configFiles` option to add configuration files to your Dremio deployment. You can add multiple files, each of which is a key-value pair. The key is the file name, and the value is the file content. These can be TXT, XML, or JSON files. For example, here is how to embed the configuration for HashiCorp Vault, followed by a separate example file:
|
|
1568
|
+
|
|
1569
|
+
Configuration of additional configuration files with example JSONs
|
|
1570
|
+
|
|
1571
|
+
```
|
|
1572
|
+
dremio:
|
|
1573
|
+
configFiles:
|
|
1574
|
+
vault_config.json: |
|
|
1575
|
+
{
|
|
1576
|
+
"vaultUrl": "https://your-vault.com",
|
|
1577
|
+
"namespace": "optional/dremio/global/vault/namespace",
|
|
1578
|
+
"auth": {
|
|
1579
|
+
"kubernetes": {
|
|
1580
|
+
"vaultRole": "dremio-vault-role",
|
|
1581
|
+
"serviceAccountJwt": "file:///optional/custom/path/to/serviceAccount/jwt",
|
|
1582
|
+
"loginMountPath": "optional/custom/kubernetes/login/path"
|
|
1583
|
+
}
|
|
1584
|
+
}
|
|
1585
|
+
}
|
|
1586
|
+
another_config.json: |
|
|
1587
|
+
{
|
|
1588
|
+
"key-in-this-file": "content-of-this-key"
|
|
1589
|
+
}
|
|
1590
|
+
...
|
|
1591
|
+
```
|
|
1592
|
+
|
|
1593
|
+
### Additional Config Variables
|
|
1594
|
+
|
|
1595
|
+
Use the `dremioConfExtraOptions` option to add new variables to your Dremio deployment. For example, here is how to enable Transport Layer Security (TLS) between executors and coordinators, leveraging auto-generated self-signed certificates.
|
|
1596
|
+
|
|
1597
|
+
Configuration of additional configuration variables with an example to enable TLS
|
|
1598
|
+
|
|
1599
|
+
```
|
|
1600
|
+
dremio:
|
|
1601
|
+
dremioConfExtraOptions:
|
|
1602
|
+
"services.fabric.ssl.enabled": true
|
|
1603
|
+
"services.fabric.ssl.auto-certificate.enabled": true
|
|
1604
|
+
...
|
|
1605
|
+
```
|
|
1606
|
+
|
|
1607
|
+
### Additional Java Truststore
|
|
1608
|
+
|
|
1609
|
+
Use the `trustStore` option under `advancedConfigs` to provide the password and content of a Java truststore file. The content must be base64-encoded. To extract the encoded content, you can use `cat truststore.jks | base64`. Add this configuration under the parents as shown in the following example:
|
|
1610
|
+
|
|
1611
|
+
Configuration of an additional Java truststore with a truststore password
|
|
1612
|
+
|
|
1613
|
+
```
|
|
1614
|
+
dremio:
|
|
1615
|
+
advancedConfigs:
|
|
1616
|
+
trustStore:
|
|
1617
|
+
enabled: true
|
|
1618
|
+
password: "<your-truststore-password>"
|
|
1619
|
+
binaryData: "base64EncodedBinaryContent"
|
|
1620
|
+
```
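Some `base64` implementations wrap their output across several lines, which is awkward to paste into a single YAML string. As a sketch, assuming the value for `binaryData` should be one unbroken line, the wraps can be stripped with `tr`. The helper name `encode_truststore` and the stand-in file are hypothetical:

```shell
# Encode a truststore as a single base64 line suitable for a YAML string value.
encode_truststore() {
  # tr strips the line wraps that some base64 implementations insert.
  base64 < "$1" | tr -d '\n'
}

# Demo with a stand-in file; replace it with your real truststore.jks.
DEMO_FILE="$(mktemp)"
printf 'not-a-real-truststore' > "$DEMO_FILE"
ENCODED="$(encode_truststore "$DEMO_FILE")"
echo "binaryData: \"${ENCODED}\""
```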
|
|
1621
|
+
|
|
1622
|
+
### Additional Config Binary Files
|
|
1623
|
+
|
|
1624
|
+
Use the `configBinaries` option to provide binary configuration files. Provided content must be base64-encoded. Add this configuration under the parents as shown in the following example:
|
|
1625
|
+
|
|
1626
|
+
Configuration of additional binary configuration files
|
|
1627
|
+
|
|
1628
|
+
```
|
|
1629
|
+
dremio:
|
|
1630
|
+
configBinaries:
|
|
1631
|
+
custom-binary.conf: "base64EncodedBinaryContent"
|
|
1632
|
+
...
|
|
1633
|
+
```
|
|
1634
|
+
|
|
1635
|
+
### Hive
|
|
1636
|
+
|
|
1637
|
+
Use the `hive2ConfigFiles` option to configure Hive 2. Add this configuration under the parents as shown in the following example:
|
|
1638
|
+
|
|
1639
|
+
Configuration of Hive 2 with an example for the `hive-site.xml` file
|
|
1640
|
+
|
|
1641
|
+
```
|
|
1642
|
+
dremio:
|
|
1643
|
+
hive2ConfigFiles:
|
|
1644
|
+
hive-site.xml: |
|
|
1645
|
+
<?xml version="1.0" encoding="UTF-8"?>
|
|
1646
|
+
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
|
|
1647
|
+
<configuration>
|
|
1648
|
+
<property>
|
|
1649
|
+
<name>hive.metastore.uris</name>
|
|
1650
|
+
<value>thrift://hive-metastore:9083</value>
|
|
1651
|
+
</property>
|
|
1652
|
+
</configuration>
|
|
1653
|
+
...
|
|
1654
|
+
```
|
|
1655
|
+
|
|
1656
|
+
Use the `hive3ConfigFiles` option to configure Hive 3. Add this configuration under the parents as shown in the following example:
|
|
1657
|
+
|
|
1658
|
+
Configuration of Hive 3 with an example for the `hive-site.xml` file
|
|
1659
|
+
|
|
1660
|
+
```
|
|
1661
|
+
dremio:
|
|
1662
|
+
hive3ConfigFiles:
|
|
1663
|
+
hive-site.xml: |
|
|
1664
|
+
<?xml version="1.0" encoding="UTF-8"?>
|
|
1665
|
+
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
|
|
1666
|
+
<configuration>
|
|
1667
|
+
<property>
|
|
1668
|
+
<name>hive.metastore.uris</name>
|
|
1669
|
+
<value>thrift://hive3-metastore:9083</value>
|
|
1670
|
+
</property>
|
|
1671
|
+
</configuration>
|
|
1672
|
+
...
|
|
1673
|
+
```
|
|
1674
|
+
|
|
1675
|
+
## References
|
|
1676
|
+
|
|
1677
|
+
### Recommended Resources Configuration
|
|
1678
|
+
|
|
1679
|
+
The tables in this section contain the recommended resource requests and volume sizes for configuring Dremio components. In the `values-overrides.yaml` file, set the following values:
|
|
1680
|
+
|
|
1681
|
+
Configuration of resources in Dremio components
|
|
1682
|
+
|
|
1683
|
+
```
|
|
1684
|
+
resources:
|
|
1685
|
+
requests:
|
|
1686
|
+
memory: # Refer to the Memory column in the tables below for recommended values
|
|
1687
|
+
cpu: # Refer to the CPU column in the tables below for recommended values
|
|
1688
|
+
volumeSize: # Refer to the Volume Size column in the tables below for recommended values
|
|
1689
|
+
```
|
|
1690
|
+
|
|
1691
|
+
Dremio recommends the following configuration values:
|
|
1692
|
+
|
|
1693
|
+
* Production Configuration
|
|
1694
|
+
* Minimal Configuration
|
|
1695
|
+
|
|
1696
|
+
Dremio recommends the following configuration values to operate in a production environment:
|
|
1697
|
+
|
|
1698
|
+
| Dremio Component | Memory | CPU | Volume Size | Pod Count |
|
|
1699
|
+
| --- | --- | --- | --- | --- |
|
|
1700
|
+
| Coordinator | 64Gi | 32 | 512Gi | 1 |
|
|
1701
|
+
| Catalog Server | 8Gi | 4 | - | 1 |
|
|
1702
|
+
| Catalog Server (External) | 8Gi | 4 | - | 1 |
|
|
1703
|
+
| Catalog Service Server | 8Gi | 4 | - | 1 |
|
|
1704
|
+
| Engine Operator | 1Gi | 1 | - | 1 |
|
|
1705
|
+
| OpenSearch | 16Gi | 2 | 100Gi | 3 |
|
|
1706
|
+
| MongoDB | 4Gi | 8 | 512Gi¹ | 3 |
|
|
1707
|
+
| NATS | 1Gi | 700m | - | 3 |
|
|
1708
|
+
| ZooKeeper | 1Gi | 500m | - | 3 |
|
|
1709
|
+
| Open Telemetry | 1Gi | 1 | - | 1 |
|
|
1710
|
+
| M Engine | 120Gi | 16 | 521Gi | 4 |
|
|
1711
|
+
|
|
1712
|
+
¹ You can use a smaller volume size if you do not heavily use Iceberg.
|
|
1713
|
+
|
|
1714
|
+
The following configuration will deploy a functional Dremio Platform, sized to fit onto a more modest cluster. It is appropriate for a single user to check out Dremio's various features, leveraging our sample data set. For any multi-user and performance-oriented evaluation, the Production Configuration should be used.
|
|
1715
|
+
|
|
1716
|
+
| Dremio Component | Memory | CPU | Volume Size | Pod Count |
|
|
1717
|
+
| --- | --- | --- | --- | --- |
|
|
1718
|
+
| Coordinator | 8Gi | 2 | 20Gi | 1 |
|
|
1719
|
+
| Catalog Server | 1Gi | 1 | - | 1 |
|
|
1720
|
+
| Catalog Server (External) | 1Gi | 1 | - | 1 |
|
|
1721
|
+
| Catalog Service Server | 1Gi | 1 | - | 1 |
|
|
1722
|
+
| Engine Operator | 1Gi | 1 | - | 1 |
|
|
1723
|
+
| OpenSearch | 3Gi | 1500m | 10Gi | 3 |
|
|
1724
|
+
| MongoDB | 1Gi | 1 | 10Gi | 3 |
|
|
1725
|
+
| NATS | 1Gi | 700m | - | 3 |
|
|
1726
|
+
| ZooKeeper | 1Gi | 500m | - | 1 |
|
|
1727
|
+
| Open Telemetry | 1Gi | 1 | - | 1 |
|
|
1728
|
+
| XS Engine | 8Gi | 2 | 20Gi | 1 |
|
|
1729
|
+
|
|
1730
|
+
The following YAML snippets configure resources for the Dremio platform components:
|
|
1731
|
+
|
|
1732
|
+
Dremio Platform Resource Configuration YAML
|
|
1733
|
+
|
|
1734
|
+
Coordinator
|
|
1735
|
+
|
|
1736
|
+
```
|
|
1737
|
+
coordinator:
|
|
1738
|
+
resources:
|
|
1739
|
+
requests:
|
|
1740
|
+
cpu: "32"
|
|
1741
|
+
memory: "64Gi"
|
|
1742
|
+
limits:
|
|
1743
|
+
memory: "64Gi"
|
|
1744
|
+
volumeSize: "512Gi"
|
|
1745
|
+
```
|
|
1746
|
+
|
|
1747
|
+
Catalog Server
|
|
1748
|
+
|
|
1749
|
+
```
|
|
1750
|
+
catalog:
|
|
1751
|
+
requests:
|
|
1752
|
+
cpu: "4"
|
|
1753
|
+
memory: "8Gi"
|
|
1754
|
+
limits:
|
|
1755
|
+
cpu: "4"
|
|
1756
|
+
memory: "8Gi"
|
|
1757
|
+
```
|
|
1758
|
+
|
|
1759
|
+
Catalog Service Server
|
|
1760
|
+
|
|
1761
|
+
```
|
|
1762
|
+
catalogservices:
|
|
1763
|
+
resources:
|
|
1764
|
+
requests:
|
|
1765
|
+
cpu: "4"
|
|
1766
|
+
memory: "8Gi"
|
|
1767
|
+
limits:
|
|
1768
|
+
cpu: "4"
|
|
1769
|
+
memory: "8Gi"
|
|
1770
|
+
```
|
|
1771
|
+
|
|
1772
|
+
OpenSearch
|
|
1773
|
+
|
|
1774
|
+
```
|
|
1775
|
+
opensearch:
|
|
1776
|
+
resources:
|
|
1777
|
+
requests:
|
|
1778
|
+
memory: "16Gi"
|
|
1779
|
+
cpu: "2"
|
|
1780
|
+
limits:
|
|
1781
|
+
memory: "16Gi"
|
|
1782
|
+
cpu: "2"
|
|
1783
|
+
```
|
|
1784
|
+
|
|
1785
|
+
MongoDB
|
|
1786
|
+
|
|
1787
|
+
```
|
|
1788
|
+
mongodb:
|
|
1789
|
+
resources:
|
|
1790
|
+
requests:
|
|
1791
|
+
cpu: "2"
|
|
1792
|
+
memory: "2Gi"
|
|
1793
|
+
limits:
|
|
1794
|
+
cpu: "4"
|
|
1795
|
+
memory: "2Gi"
|
|
1796
|
+
storage:
|
|
1797
|
+
resources:
|
|
1798
|
+
requests:
|
|
1799
|
+
storage: "512Gi"
|
|
1800
|
+
```
|
|
1801
|
+
|
|
1802
|
+
NATS
|
|
1803
|
+
|
|
1804
|
+
```
|
|
1805
|
+
nats:
|
|
1806
|
+
resources:
|
|
1807
|
+
requests:
|
|
1808
|
+
cpu: "500m"
|
|
1809
|
+
memory: "1024Mi"
|
|
1810
|
+
limits:
|
|
1811
|
+
cpu: "750m"
|
|
1812
|
+
memory: "1536Mi"
|
|
1813
|
+
```
|
|
1814
|
+
|
|
1815
|
+
ZooKeeper
|
|
1816
|
+
|
|
1817
|
+
```
|
|
1818
|
+
zookeeper:
|
|
1819
|
+
resources:
|
|
1820
|
+
requests:
|
|
1821
|
+
cpu: "500m"
|
|
1822
|
+
memory: "1Gi"
|
|
1823
|
+
limits:
|
|
1824
|
+
memory: "1Gi"
|
|
1825
|
+
volumeSize: "10Gi"
|
|
1826
|
+
```
|
|
1827
|
+
|
|
1828
|
+
Open Telemetry
|
|
1829
|
+
|
|
1830
|
+
```
|
|
1831
|
+
telemetry:
|
|
1832
|
+
resources:
|
|
1833
|
+
requests:
|
|
1834
|
+
cpu: "1"
|
|
1835
|
+
memory: "1Gi"
|
|
1836
|
+
limits:
|
|
1837
|
+
cpu: "2"
|
|
1838
|
+
memory: "2Gi"
|
|
1839
|
+
```
|
|
1840
|
+
|
|
1841
|
+
### Creating a TLS Secret
|
|
1842
|
+
|
|
1843
|
+
If you have enabled Transport Layer Security (TLS) in your `values-overrides.yaml`, the corresponding secrets must be created before deploying Dremio. To create a secret, run the following command:
|
|
1844
|
+
|
|
1845
|
+
Run kubectl to create a TLS secret
|
|
1846
|
+
|
|
1847
|
+
```
|
|
1848
|
+
kubectl create secret tls <your-tls-secret-name> --key privkey.pem --cert cert.pem
|
|
1849
|
+
```
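For a test environment where no CA-issued certificate is available, a throwaway self-signed key pair can stand in for `privkey.pem` and `cert.pem`. This is a sketch under stated assumptions: it requires `openssl` to be installed, the common name `dremio.example.com` is a hypothetical placeholder, and self-signed certificates are generally not suitable for production.

```shell
# Generates a throwaway self-signed certificate; for test environments only.
# The common name below is a hypothetical placeholder.
WORKDIR="$(mktemp -d)"
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "$WORKDIR/privkey.pem" \
  -out "$WORKDIR/cert.pem" \
  -days 30 \
  -subj "/CN=dremio.example.com" 2>/dev/null

echo "Created $WORKDIR/privkey.pem and $WORKDIR/cert.pem"
# Then create the secret as shown above (requires cluster access):
#   kubectl create secret tls <your-tls-secret-name> \
#     --key "$WORKDIR/privkey.pem" --cert "$WORKDIR/cert.pem"
```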
|
|
1850
|
+
|
|
1851
|
+
For more information, see [kubectl create secret tls](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_secret_tls/#synopsis) in the Kubernetes documentation.
|
|
1852
|
+
|
|
1853
|
+
caution
|
|
1854
|
+
|
|
1855
|
+
TLS for OpenSearch requires a secret of a different makeup. See Advanced TLS Configuration for OpenSearch.
|
|
1856
|
+
|
|
1857
|
+
### Configuring the Distributed Storage
|
|
1858
|
+
|
|
1859
|
+
Dremio’s distributed store uses scalable and fault-tolerant storage, and it is configured as follows:
|
|
1860
|
+
|
|
1861
|
+
1. In the `values-overrides.yaml` file, find the section with `distStorage:` and `type:`
|
|
1862
|
+
|
|
1863
|
+
Configuration of the distributed storage
|
|
1864
|
+
|
|
1865
|
+
```
|
|
1866
|
+
distStorage:
|
|
1867
|
+
type: "<your-dist-store-type>"
|
|
1868
|
+
...
|
|
1869
|
+
```
|
|
1870
|
+
2. In `type:`, configure your storage provider with one of the following values:
|
|
1871
|
+
|
|
1872
|
+
* `"aws"` - For Amazon S3 or S3-compatible storage.
|
|
1873
|
+
* `"azureStorage"` - For Azure Storage.
|
|
1874
|
+
* `"gcp"` - For Google Cloud Storage (GCS) in Google Cloud Platform (GCP).
|
|
1875
|
+
3. Select the tab below for the storage provider you chose in step 2, and follow the example to configure your distributed storage:
|
|
1876
|
+
|
|
1877
|
+
note
|
|
1878
|
+
|
|
1879
|
+
Distributed storage is also used to store Open Catalog backups. You may be required to provide two authentication methods to enable storage of these backups.
|
|
1880
|
+
|
|
1881
|
+
* Amazon S3 and S3-Compatible
|
|
1882
|
+
* Azure Storage
|
|
1883
|
+
* Google Cloud Storage
|
|
1884
|
+
|
|
1885
|
+
For Amazon S3 and S3-Compatible, select the tab below for your type of authentication:
|
|
1886
|
+
|
|
1887
|
+
* Metadata
|
|
1888
|
+
* Access Key
|
|
1889
|
+
* AWS Profile
|
|
1890
|
+
* EKS Pod Identity
|
|
1891
|
+
|
|
1892
|
+
Dremio uses the Identity and Access Management (IAM) role to retrieve the credentials to authenticate. Metadata is only supported in Amazon Web Services Elastic Kubernetes Service (AWS EKS) and requires that the EKS worker node IAM role is configured with sufficient access rights.
|
|
1893
|
+
|
|
1894
|
+
Add the configuration under the parent as shown in the following example:
|
|
1895
|
+
|
|
1896
|
+
Metadata authentication for the distributed storage
|
|
1897
|
+
|
|
1898
|
+
```
|
|
1899
|
+
distStorage:
|
|
1900
|
+
type: "aws"
|
|
1901
|
+
aws:
|
|
1902
|
+
bucketName: "<your-bucket-name>"
|
|
1903
|
+
path: "/"
|
|
1904
|
+
authentication: "metadata"
|
|
1905
|
+
region: "<your-bucket-region>"
|
|
1906
|
+
#
|
|
1907
|
+
# Extra Properties
|
|
1908
|
+
# Use the extra properties block to provide additional parameters
|
|
1909
|
+
# to configure the distributed storage in the generated core-site.xml file.
|
|
1910
|
+
#
|
|
1911
|
+
#extraProperties: |
|
|
1912
|
+
# <property>
|
|
1913
|
+
# <name>the-property-name</name>
|
|
1914
|
+
# <value>the-property-value</value>
|
|
1915
|
+
# </property>
|
|
1916
|
+
```
|
|
1917
|
+
|
|
1918
|
+
Where:
|
|
1919
|
+
|
|
1920
|
+
* `bucketName` - The name of your S3 bucket for distributed storage.
|
|
1921
|
+
* `path` - The path relative to your bucket to create Dremio's directories.
|
|
1922
|
+
* `authentication` - Set as `"metadata"`.
|
|
1923
|
+
* `region` - The AWS region in which your bucket resides. Required even if using S3-compatible.
|
|
1924
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file. Important for S3-compatible and customer-managed KMS encryption.
|
|
1925
|
+
|
|
1926
|
+
Dremio uses a configured Amazon Web Services (AWS) Access Key and Secret to authenticate.
|
|
1927
|
+
|
|
1928
|
+
Add the configuration under the parent as shown in the following example:
|
|
1929
|
+
|
|
1930
|
+
Access Key authentication for the distributed storage
|
|
1931
|
+
|
|
1932
|
+
```
|
|
1933
|
+
distStorage:
|
|
1934
|
+
type: "aws"
|
|
1935
|
+
aws:
|
|
1936
|
+
bucketName: "<your-bucket-name>"
|
|
1937
|
+
path: "/"
|
|
1938
|
+
authentication: "accessKeySecret"
|
|
1939
|
+
region: "<your-bucket-region>"
|
|
1940
|
+
credentials:
|
|
1941
|
+
accessKey: "<your-access-key>"
|
|
1942
|
+
secret: "<your-access-key-secret>"
|
|
1943
|
+
#
|
|
1944
|
+
# Extra Properties
|
|
1945
|
+
# Use the extra properties block to provide additional parameters
|
|
1946
|
+
# to configure the distributed storage in the generated core-site.xml file.
|
|
1947
|
+
#
|
|
1948
|
+
#extraProperties: |
|
|
1949
|
+
# <property>
|
|
1950
|
+
# <name>the-property-name</name>
|
|
1951
|
+
# <value>the-property-value</value>
|
|
1952
|
+
# </property>
|
|
1953
|
+
```
|
|
1954
|
+
|
|
1955
|
+
Where:
|
|
1956
|
+
|
|
1957
|
+
* `bucketName` - The name of your S3 bucket for distributed storage.
|
|
1958
|
+
* `path` - The path relative to your bucket to create Dremio's directories.
|
|
1959
|
+
* `authentication` - Set as `"accessKeySecret"`.
|
|
1960
|
+
* `region` - The AWS region in which your bucket resides. Required even if using S3-compatible.
|
|
1961
|
+
* `credentials` - The credentials configuration:
|
|
1962
|
+
+ `accessKey` - Your AWS access key ID.
|
|
1963
|
+
+ `secret` - Your AWS access key secret.
|
|
1964
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file. Important for S3-compatible and customer-managed KMS encryption.
|
|
1965
|
+
|
|
1966
|
+
Dremio uses the default Amazon Web Services (AWS) profile to retrieve the credentials to authenticate.
|
|
1967
|
+
|
|
1968
|
+
note
|
|
1969
|
+
|
|
1970
|
+
You need to add an AWS Access Key to store Open Catalog backups.
|
|
1971
|
+
|
|
1972
|
+
Add the configuration under the parent as shown in the following example:
|
|
1973
|
+
|
|
1974
|
+
AWS profile authentication for the distributed storage
|
|
1975
|
+
|
|
1976
|
+
```
|
|
1977
|
+
distStorage:
|
|
1978
|
+
type: "aws"
|
|
1979
|
+
aws:
|
|
1980
|
+
bucketName: "<your-bucket-name>"
|
|
1981
|
+
path: "/"
|
|
1982
|
+
authentication: "awsProfile"
|
|
1983
|
+
region: "<your-bucket-region>"
|
|
1984
|
+
credentials:
|
|
1985
|
+
awsProfileName: "default"
|
|
1986
|
+
#accessKey: "<your-access-key>" for Open Catalog Backup
|
|
1987
|
+
#secret: "<your-access-key-secret>" for Open Catalog Backup
|
|
1988
|
+
#
|
|
1989
|
+
# Extra Properties
|
|
1990
|
+
# Use the extra properties block to provide additional parameters to configure the distributed
|
|
1991
|
+
# storage in the generated core-site.xml file.
|
|
1992
|
+
#
|
|
1993
|
+
#extraProperties: |
|
|
1994
|
+
# <property>
|
|
1995
|
+
# <name>the-property-name</name>
|
|
1996
|
+
# <value>the-property-value</value>
|
|
1997
|
+
# </property>
|
|
1998
|
+
```
|
|
1999
|
+
|
|
2000
|
+
Where:
|
|
2001
|
+
|
|
2002
|
+
* `bucketName` - The name of your S3 bucket for distributed storage.
|
|
2003
|
+
* `path` - The path relative to your bucket to create Dremio's directories.
|
|
2004
|
+
* `authentication` - Set as `"awsProfile"`.
|
|
2005
|
+
* `region` - The AWS region your bucket resides in. Required even if using S3-compatible.
|
|
2006
|
+
* `credentials` - The credentials configuration:
|
|
2007
|
+
+ `awsProfileName` - Set as `"default"`.
|
|
2008
|
+
+ `accessKey` - AWS access key ID for Open Catalog backup storage.
|
|
2009
|
+
+ `secret` - AWS access key secret for Open Catalog backup storage.
|
|
2010
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file. Important for S3-compatible and customer-managed KMS encryption.
|
|
2011
|
+
|
|
2012
|
+
EKS Pod Identities allow for Kubernetes service accounts to be associated with an IAM role. Dremio, in turn, can use this IAM role to retrieve the credentials to authenticate. As both the coordinators and engines require access to distributed storage, both of their `ServiceAccounts` must be associated with an IAM role with sufficient access rights. By default, their `ServiceAccounts` are `dremio-coordinator`, `dremio-engine-executor` for [New Engines](/current/deploy-dremio/managing-engines-kubernetes), and (optional) `dremio-executor` for [Classic Engines](/current/deploy-dremio/configuring-kubernetes/#configuration-of-classic-engines).
|
|
2013
|
+
|
|
2014
|
+
Add the configuration under the parent as shown in the following example:
|
|
2015
|
+
|
|
2016
|
+
EKS Pod Identity authentication for the distributed storage
|
|
2017
|
+
|
|
2018
|
+
```
|
|
2019
|
+
distStorage:
|
|
2020
|
+
type: "aws"
|
|
2021
|
+
aws:
|
|
2022
|
+
bucketName: "<your-bucket-name>"
|
|
2023
|
+
path: "/"
|
|
2024
|
+
authentication: "podIdentity"
|
|
2025
|
+
region: "<your-bucket-region>"
|
|
2026
|
+
#
|
|
2027
|
+
# Extra Properties
|
|
2028
|
+
# Use the extra properties block to provide additional parameters to configure the distributed
|
|
2029
|
+
# storage in the generated core-site.xml file.
|
|
2030
|
+
#
|
|
2031
|
+
#extraProperties: |
|
|
2032
|
+
# <property>
|
|
2033
|
+
# <name>the-property-name</name>
|
|
2034
|
+
# <value>the-property-value</value>
|
|
2035
|
+
# </property>
|
|
2036
|
+
```
|
|
2037
|
+
|
|
2038
|
+
Where:
|
|
2039
|
+
|
|
2040
|
+
* `bucketName` - The name of your S3 bucket for distributed storage.
|
|
2041
|
+
* `path` - The path relative to your bucket to create Dremio's directories.
|
|
2042
|
+
* `authentication` - Set as `"podIdentity"`.
|
|
2043
|
+
* `region` - The AWS region in which your bucket resides. Required even if using S3-compatible.
|
|
2044
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file. Important for S3-compatible and customer-managed KMS encryption.
|
|
2045
|
+
|
|
2046
|
+
**Extra Properties**
|
|
2047
|
+
|
|
2048
|
+
Example extra properties for S3-compatible storage and for providing a customer-managed KMS key for an encrypted bucket.
|
|
2049
|
+
|
|
2050
|
+
S3-Compatible extra properties
|
|
2051
|
+
|
|
2052
|
+
```
extraProperties: |
  <property>
    <name>fs.s3a.endpoint</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
  <property>
    <name>dremio.s3.compat</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
```
|
|
2071
|
+
|
|
2072
|
+
Customer-managed KMS extra properties
|
|
2073
|
+
|
|
2074
|
+
```
extraProperties: |
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.s3a.server-side-encryption-algorithm</name>
    <value>SSE-KMS</value>
  </property>
  <property>
    <name>fs.s3a.server-side-encryption.key</name>
    <value>KEY_ARN</value>
  </property>
```
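When many properties are needed, the `<property>` elements can be generated rather than typed by hand. The helper below is a minimal sketch; `core_site_property` is a hypothetical name, and it assumes names and values contain no characters that need XML escaping (such as `&` or `<`).

```shell
#!/bin/sh
# Print one core-site.xml <property> element for a name/value pair.
# Assumption: neither argument needs XML escaping.
core_site_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Example: two of the S3-compatible properties shown earlier in this section.
core_site_property fs.s3a.path.style.access true
core_site_property dremio.s3.compat true
```

The output still has to be indented under `extraProperties: |` when pasted into `values-overrides.yaml`, because YAML block scalars require the content to be indented relative to the key.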
|
|
2089
|
+
|
|
2090
|
+
For Azure Storage, select the tab below for your type of authentication:
|
|
2091
|
+
|
|
2092
|
+
* Access Key
|
|
2093
|
+
* Entra ID
|
|
2094
|
+
|
|
2095
|
+
For Access Key authentication, Dremio uses the configured Azure Storage account access key to authenticate.
|
|
2096
|
+
|
|
2097
|
+
Add the configuration under the parent as shown in the following example:
|
|
2098
|
+
|
|
2099
|
+
Access Key authentication for the distributed storage
|
|
2100
|
+
|
|
2101
|
+
```
distStorage:
  type: "azureStorage"
  azureStorage:
    accountName: "<your-account-name>"
    authentication: "accessKey"
    filesystem: "<your-blob-container>"
    path: "/"
    credentials:
      accessKey: "<your-access-key>"
    #
    # Extra Properties
    # Use the extra properties block to provide additional parameters to configure the distributed
    # storage in the generated core-site.xml file.
    #
    #extraProperties: |
    #  <property>
    #    <name>the-property-name</name>
    #    <value>the-property-value</value>
    #  </property>
```
|
|
2122
|
+
|
|
2123
|
+
Where:
|
|
2124
|
+
|
|
2125
|
+
* `accountName` - The name of your storage account.
|
|
2126
|
+
* `authentication` - Set as `"accessKey"`.
|
|
2127
|
+
* `filesystem` - The name of your blob container to use within the storage account.
|
|
2128
|
+
* `path` - The path relative to the filesystem to create Dremio's directories.
|
|
2129
|
+
* `credentials` - The credentials configuration:
|
|
2130
|
+
+ `accessKey` - Your Azure Storage account access key.
|
|
2131
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file.
|
|
2132
|
+
|
|
2133
|
+
For Entra ID authentication, Dremio uses the configured Azure client ID (application ID), Microsoft Entra ID token endpoint, and Azure client secret (application password) to authenticate.
|
|
2134
|
+
|
|
2135
|
+
note
|
|
2136
|
+
|
|
2137
|
+
You also need to add an Azure Storage access key to store Open Catalog backups.
|
|
2138
|
+
|
|
2139
|
+
Add the configuration under the parent as shown in the following example:
|
|
2140
|
+
|
|
2141
|
+
Entra ID authentication for the distributed storage
|
|
2142
|
+
|
|
2143
|
+
```
distStorage:
  type: "azureStorage"
  azureStorage:
    accountName: "<your-account-name>"
    authentication: "entraID"
    filesystem: "<your-blob-container>"
    path: "/"
    credentials:
      clientId: "<your-application-client-id>"
      tokenEndpoint: "<your-token-endpoint>"
      clientSecret: "<your-client-secret>"
      #accessKey: "<your-access-key>"  # For Open Catalog backup.
    #
    # Extra Properties
    # Use the extra properties block to provide additional parameters to configure the distributed
    # storage in the generated core-site.xml file.
    #
    #extraProperties: |
    #  <property>
    #    <name>the-property-name</name>
    #    <value>the-property-value</value>
    #  </property>
```
|
|
2167
|
+
|
|
2168
|
+
Where:
|
|
2169
|
+
|
|
2170
|
+
* `accountName` - The name of your storage account.
|
|
2171
|
+
* `authentication` - Set as `"entraID"`.
|
|
2172
|
+
* `filesystem` - The name of your blob container to use within the storage account.
|
|
2173
|
+
* `path` - The path relative to the filesystem to create Dremio's directories.
|
|
2174
|
+
* `credentials` - The credentials configuration:
|
|
2175
|
+
+ `clientId` - Your Azure client ID (application ID).
|
|
2176
|
+
+ `tokenEndpoint` - Your Microsoft Entra ID token endpoint.
|
|
2177
|
+
+ `clientSecret` - Your Azure client secret (application password).
|
|
2178
|
+
+ `accessKey` - Your access key for Open Catalog Backup.
|
|
2179
|
+
* `extraProperties` - Additional parameters to configure the distributed storage in the generated `core-site.xml` file.
|
|
2180
|
+
|
|
2181
|
+
**Extra Properties**
|
|
2182
|
+
|
|
2183
|
+
Example extra properties to configure the Azure Storage data source to access data on the Azure Government Cloud platform.
|
|
2184
|
+
|
|
2185
|
+
Azure Government Cloud endpoint extra properties
|
|
2186
|
+
|
|
2187
|
+
```
extraProperties: |
  <property>
    <name>fs.azure.endpoint</name>
    <description>The azure storage endpoint to use.</description>
    <value>dfs.core.usgovcloudapi.net</value>
  </property>
```
|
|
2195
|
+
|
|
2196
|
+
For Google Cloud Storage (GCS) in Google Cloud Platform (GCP), select the tab below for your type of authentication:
|
|
2197
|
+
|
|
2198
|
+
* Automatic
|
|
2199
|
+
* Service Account
|
|
2200
|
+
|
|
2201
|
+
For Automatic authentication, Dremio uses Google Application Default Credentials to authenticate. This is platform-dependent and may not be available in all Kubernetes clusters.
|
|
2202
|
+
|
|
2203
|
+
note
|
|
2204
|
+
|
|
2205
|
+
You need to add a service account key to store Open Catalog backups.
|
|
2206
|
+
|
|
2207
|
+
Add the configuration under the parent as shown in the following example:
|
|
2208
|
+
|
|
2209
|
+
```
distStorage:
  type: "gcp"
  gcp:
    bucketName: "<your-bucket-name>"
    path: "/"
    authentication: "auto"
    # Uncomment the credentials block for Open Catalog backup.
    #credentials:
    #  clientEmail: "<your-email-for-the-service-account>"
    #  privateKey: |-
    #    -----BEGIN PRIVATE KEY-----\n <your-full-private-key-value> \n-----END PRIVATE KEY-----\n
```
|
|
2221
|
+
|
|
2222
|
+
Where:
|
|
2223
|
+
|
|
2224
|
+
* `bucketName` - The name of your GCS bucket for distributed storage.
|
|
2225
|
+
* `path` - The path relative to the bucket to create Dremio's directories.
|
|
2226
|
+
* `authentication` - Set as `"auto"`.
|
|
2227
|
+
* `credentials` - The credentials configuration, for Open Catalog backup:
|
|
2228
|
+
+ `clientEmail` - Your email for the service account that has access to the GCS bucket, for Open Catalog backup.
|
|
2229
|
+
+ `privateKey` - Your full private key value, for Open Catalog backup.
|
|
2230
|
+
|
|
2231
|
+
For Service Account authentication, Dremio uses a JSON key file generated from the GCP console to authenticate.
|
|
2232
|
+
|
|
2233
|
+
Add the configuration under the parent as shown in the following example:
|
|
2234
|
+
|
|
2235
|
+
```
distStorage:
  type: "gcp"
  gcp:
    bucketName: "<your-bucket-name>"
    path: "/"
    authentication: "serviceAccountKeys"
    credentials:
      projectId: "<your-project-id>"
      clientId: "<your-client-id>"
      clientEmail: "<your-email-for-the-service-account>"
      privateKeyId: "<your-private-key-id>"
      privateKey: |-
        -----BEGIN PRIVATE KEY-----\n <your-full-private-key-value> \n-----END PRIVATE KEY-----\n
```
|
|
2250
|
+
|
|
2251
|
+
Where:
|
|
2252
|
+
|
|
2253
|
+
* `bucketName` - The name of your GCS bucket for distributed storage.
|
|
2254
|
+
* `path` - The path relative to your bucket to create Dremio's directories.
|
|
2255
|
+
* `authentication` - Set as `"serviceAccountKeys"`.
|
|
2256
|
+
* `credentials` - The credentials configuration:
|
|
2257
|
+
+ `projectId` - Your GCP Project ID that the GCS bucket belongs to.
|
|
2258
|
+
+ `clientId` - Your Client ID for the service account that has access to the GCS bucket.
|
|
2259
|
+
+ `clientEmail` - Your email for the service account that has access to the GCS bucket.
|
|
2260
|
+
+ `privateKeyId` - Your private key ID for the service account that has access to the GCS bucket.
|
|
2261
|
+
+ `privateKey` - Your full private key value.
|
|
2262
|
+
|
|
2263
|
+
note
|
|
2264
|
+
|
|
2265
|
+
When using a GCS bucket on Google Kubernetes Engine (GKE), we recommend enabling **Workload Identity** and configuring a Kubernetes service account for Dremio with an associated workload identity that has access to the GCS bucket.
|
|
2266
|
+
|
|
2267
|
+
### Configuring Storage for the Open Catalog
|
|
2268
|
+
|
|
2269
|
+
To use the Open Catalog, configure the storage settings based on your storage provider (for example, Amazon S3, Azure Storage, or Google Cloud Storage). This configuration is required to enable support for vended credentials and to allow access to the table metadata necessary for Iceberg table operations.
|
|
2270
|
+
|
|
2271
|
+
1. In the `values-overrides.yaml` file, find the section to configure your storage provider under the parent keys, as shown in the following example:
|
|
2272
|
+
|
|
2273
|
+
Configuration of the storage for the Open Catalog
|
|
2274
|
+
|
|
2275
|
+
```
catalog:
  storage:
    location: <your-object-store-path>
    type: <your-object-store-type>
  ...
```
|
|
2282
|
+
2. To configure it, select the tab for your storage provider, and follow the steps:
|
|
2283
|
+
|
|
2284
|
+
* Amazon S3
|
|
2285
|
+
* S3-compatible
|
|
2286
|
+
* Azure Storage
|
|
2287
|
+
* Google Cloud Storage
|
|
2288
|
+
|
|
2289
|
+
To use the Open Catalog with Amazon S3, do the following:
|
|
2290
|
+
|
|
2291
|
+
1. Configure the access to the storage, as described in [Configure Storage Access](/current/data-sources/open-catalog/#configure-storage-access). Creating a Kubernetes secret may be required.
|
|
2292
|
+
2. Configure the Open Catalog in the `values-overrides.yaml` file as follows:
|
|
2293
|
+
|
|
2294
|
+
Configuration of the storage for the Open Catalog in Amazon S3
|
|
2295
|
+
|
|
2296
|
+
```
catalog:
  storage:
    location: s3://<your-bucket>/<your-folder>
    type: S3
    s3:
      region: <bucket_region>
      roleArn: <dremio_catalog_iam_role> # The role that was configured in the previous step
      userArn: <dremio_catalog_user_arn> # The IAM user that was created in the previous step
      externalId: <dremio_catalog_external_id> # The external ID that was created in the previous step
      useAccessKeys: false # Set it to true if you intend to use access keys.
  ...
```
|
|
2309
|
+
3. If using EKS Pod Identities, ensure that the catalog's Kubernetes `ServiceAccount`, which is `dremio-catalog-server` by default, is associated with the `userArn` that you provided above.
|
|
2310
|
+
|
|
2311
|
+
To use the Open Catalog with S3-compatible storage, do the following:
|
|
2312
|
+
|
|
2313
|
+
1. Configure the access to the storage, as described in [Configure Storage Access](/current/data-sources/open-catalog/#configure-storage-access). Creating a Kubernetes secret is required.
|
|
2314
|
+
2. For this step, select the tab for whether the S3-compatible storage has STS support or not, and follow the instructions:
|
|
2315
|
+
|
|
2316
|
+
* Has STS support
|
|
2317
|
+
* No STS support
|
|
2318
|
+
|
|
2319
|
+
If the storage has STS support, the Open Catalog uses STS to vend credentials. Configure the Open Catalog in the `values-overrides.yaml` file as follows:
|
|
2320
|
+
|
|
2321
|
+
caution
|
|
2322
|
+
|
|
2323
|
+
roleArn must be provided even when using S3-compatible storage. A dummy value is provided in the template below.
|
|
2324
|
+
|
|
2325
|
+
Configuration of the storage for the Open Catalog in S3-compatible with STS support
|
|
2326
|
+
|
|
2327
|
+
```
catalog:
  storage:
    location: s3://<your-bucket>/<your-folder>
    type: S3
    s3:
      region: <your-bucket-region> # Optional, bucket region
      roleArn: arn:aws:iam::000000000000:role/catalog-access-role # Mandatory, a dummy role, as shown here, must be provided
      endpoint: <s3-compatible-server-url> # The S3 server URL, for example, http://<minio-host>:<minio-port> for MinIO
      stsEndpoint: <s3-compatible-sts-server-url> # The STS server URL, for example, http://<minio-host>:<minio-port> for MinIO
      pathStyleAccess: true # Mandatory to be true
      useAccessKeys: true # Mandatory to be true
  ...
```
|
|
2341
|
+
|
|
2342
|
+
Without STS support, vended credentials will not work. In that case, you must select `Use master storage credentials` in the Dremio console and provide explicit access keys for external engines where they are required.
|
|
2343
|
+
|
|
2344
|
+
Once the Kubernetes secrets for the access keys have been created, configure the Open Catalog in the `values-overrides.yaml` file as follows:
|
|
2345
|
+
|
|
2346
|
+
caution
|
|
2347
|
+
|
|
2348
|
+
roleArn must be provided even when using S3-compatible storage. A dummy value is provided in the template below.
|
|
2349
|
+
|
|
2350
|
+
Configuration of the storage for the Open Catalog in S3-compatible with no STS support
|
|
2351
|
+
|
|
2352
|
+
```
catalog:
  storage:
    location: s3://<your-bucket>/<your-folder>
    type: S3
    s3:
      region: <your-bucket-region> # Optional, bucket region
      roleArn: arn:aws:iam::000000000000:role/catalog-access-role # Mandatory, a dummy role, as shown here, must be provided
      endpoint: <s3-compatible-server-url> # The S3 server URL, for example, http://<minio-host>:<minio-port> for MinIO
      pathStyleAccess: true # Mandatory to be true
      skipSts: true # Mandatory to be true
      useAccessKeys: true # Mandatory to be true
  ...
```
|
|
2366
|
+
|
|
2367
|
+
To use the Open Catalog with Azure Storage, do the following:
|
|
2368
|
+
|
|
2369
|
+
1. Configure the access to the storage, as described in [Configure Storage Access](/current/data-sources/open-catalog/#configure-storage-access).
|
|
2370
|
+
2. Configure the Open Catalog in the `values-overrides.yaml` file as follows:
|
|
2371
|
+
|
|
2372
|
+
Configuration of the storage for the Open Catalog in Azure Storage
|
|
2373
|
+
|
|
2374
|
+
```
catalog:
  storage:
    location: abfss://<your-container-name>@<your-storage-account>.dfs.core.windows.net/<path>
    type: azure
    azure:
      tenantId: <your-azure-directory-tenant-id>
      multiTenantAppName: ~ # Optional: used only if you register a multi-tenant app.
      useClientSecrets: true # Must be true
  ...
```
|
|
2385
|
+
|
|
2386
|
+
To use the Open Catalog with Google Cloud Storage (GCS), do the following:
|
|
2387
|
+
|
|
2388
|
+
1. Configure the access to the storage, as described in [Configure Storage Access](/current/data-sources/open-catalog/#configure-storage-access).
|
|
2389
|
+
2. Configure the Open Catalog in the `values-overrides.yaml` file as follows:
|
|
2390
|
+
|
|
2391
|
+
Configuration of the storage for the Open Catalog in Google Cloud Storage
|
|
2392
|
+
|
|
2393
|
+
```
catalog:
  ...
  storage:
    location: gs://<your-bucket>/<your-path>
    type: GCS
    gcs:
      useCredentialsFile: true
```
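The provider examples above pair each `location` scheme with a `type` value (`s3://` with `S3`, `abfss://` with `azure`, `gs://` with `GCS`). As a small consistency check before deploying, that pairing can be sketched as follows; `catalog_storage_type` is a hypothetical helper, not part of the Helm chart.

```shell
#!/bin/sh
# Map a catalog storage location to the `type` value used in the examples above.
catalog_storage_type() {
  case "$1" in
    s3://*)    echo "S3" ;;
    abfss://*) echo "azure" ;;
    gs://*)    echo "GCS" ;;
    *)         echo "unknown location scheme: $1" >&2; return 1 ;;
  esac
}

catalog_storage_type "s3://my-bucket/catalog"   # prints S3
```

S3-compatible storage also uses the `s3://` scheme, so it maps to `S3` as well.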
|
|
2402
|
+
|
|
2403
|
+
### Configuring TLS for Open Catalog External Access
|
|
2404
|
+
|
|
2405
|
+
For clients connecting to the Open Catalog from outside the namespace, Transport Layer Security (TLS) can be enabled for Open Catalog external access as follows:
|
|
2406
|
+
|
|
2407
|
+
1. Enable external access with TLS and provide the TLS secret. See the section Creating a TLS Secret.
|
|
2408
|
+
2. In the `values-overrides.yaml` file, find the Open Catalog configuration section:
|
|
2409
|
+
|
|
2410
|
+
Configuration section for the Open Catalog
|
|
2411
|
+
|
|
2412
|
+
```
|
|
2413
|
+
catalog:
|
|
2414
|
+
...
|
|
2415
|
+
```
|
|
2416
|
+
3. Configure TLS for the Open Catalog as follows:
|
|
2417
|
+
|
|
2418
|
+
Configuration of TLS for external access to the Open Catalog
|
|
2419
|
+
|
|
2420
|
+
```
catalog:
  externalAccess:
    enabled: true
    tls:
      enabled: true
      secret: <dremio-tls-secret-catalog>
  ...
```
|
|
2429
|
+
|
|
2430
|
+
### Configuring Open Catalog When the Coordinator Web is Using TLS
|
|
2431
|
+
|
|
2432
|
+
When the Dremio coordinator uses Transport Layer Security (TLS) for web access (i.e., when `coordinator.web.tls` is set to `true`), the Open Catalog external access must be configured appropriately, or client authentication will fail. To do so, configure the Open Catalog as follows:
|
|
2433
|
+
|
|
2434
|
+
1. In the `values-overrides.yaml` file, find the Open Catalog configuration section:
|
|
2435
|
+
|
|
2436
|
+
Configuration section for the Open Catalog
|
|
2437
|
+
|
|
2438
|
+
```
|
|
2439
|
+
catalog:
|
|
2440
|
+
...
|
|
2441
|
+
```
|
|
2442
|
+
2. Configure the Open Catalog as follows:
|
|
2443
|
+
|
|
2444
|
+
Configuration of the Open Catalog when the coordinator web is using TLS
|
|
2445
|
+
|
|
2446
|
+
```
catalog:
  externalAccess:
    enabled: true
    authentication:
      authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local
  ...
```
|
|
2454
|
+
|
|
2455
|
+
The `authServerHostname` must match the CN (or the SAN) field of the (master) coordinator Web TLS certificate.
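To compare the two values, it can help to print the hostname that the example above resolves to and the names in the certificate. The sketch below assumes that `{{ .Release.Namespace }}` renders to the namespace of the Helm release and that OpenSSL 1.1.1 or later is available; both function names are hypothetical.

```shell
#!/bin/sh
# Build the in-cluster hostname of the master coordinator pod for a namespace,
# following the authServerHostname pattern shown in the example above.
auth_server_hostname() {
  echo "dremio-master-0.dremio-cluster-pod.$1.svc.cluster.local"
}

# Print the subject (CN) and SAN entries of a PEM-encoded certificate file.
# Requires OpenSSL 1.1.1+ for the -ext option.
show_cert_names() {
  openssl x509 -in "$1" -noout -subject -ext subjectAltName
}

auth_server_hostname "dremio"
```

Running `show_cert_names <path-to-certificate>` against the coordinator's web TLS certificate then shows whether the printed hostname appears in the CN or SAN fields.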
|
|
2456
|
+
|
|
2457
|
+
If it does not match the CN or SAN fields of the TLS certificate, it is possible, as a last resort, to disable hostname verification (`disableHostnameVerification: true`):
|
|
2458
|
+
|
|
2459
|
+
Configuration of the Open Catalog with hostname verification disabled
|
|
2460
|
+
|
|
2461
|
+
```
catalog:
  externalAccess:
    enabled: true
    authentication:
      authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local
      disableHostnameVerification: true
  ...
```
|
|
2470
|
+
|
|
2471
|
+
## Downloading Dremio's Helm Charts
|
|
2472
|
+
|
|
2473
|
+
You can download Dremio's Helm charts to implement advanced configurations beyond those outlined in this topic.
|
|
2474
|
+
|
|
2475
|
+
However, please proceed with caution. Modifications made without a clear understanding can lead to unexpected behavior and compromise the Dremio Support team's ability to provide effective assistance.
|
|
2476
|
+
|
|
2477
|
+
To ensure success, Dremio recommends engaging with the Professional Services team through your Account Executive or Customer Success Manager. Please note that such engagements may require additional time and could involve consulting fees.
|
|
2478
|
+
|
|
2479
|
+
To download Dremio’s Helm charts, use the following command:
|
|
2480
|
+
|
|
2481
|
+
Run helm pull to download Dremio’s Helm charts
|
|
2482
|
+
|
|
2483
|
+
```
|
|
2484
|
+
helm pull oci://quay.io/dremio/dremio-helm --version <tag> --untar
|
|
2485
|
+
```
|
|
2486
|
+
|
|
2487
|
+
Where:
|
|
2488
|
+
|
|
2489
|
+
* (Optional) `--version <tag>` - The Helm chart version to pull. For example, `--version 3.0.0`. If not specified, the latest version is pulled.
|
|
2490
|
+
|
|
2491
|
+
The command creates a new local directory called `dremio-helm` containing the Helm charts.
|
|
2492
|
+
|
|
2493
|
+
For more information on the command, see [Helm Pull](https://helm.sh/docs/helm/helm_pull/) in Helm's documentation.
|
|
2494
|
+
|
|
2495
|
+
### Overriding Additional Values
|
|
2496
|
+
|
|
2497
|
+
After completing the `helm pull`:
|
|
2498
|
+
|
|
2499
|
+
1. Find the `values.yaml` file, open it, and check the configurations you want to override.
|
|
2500
|
+
2. Copy what you want to override from the `values.yaml` to `values-overrides.yaml` and configure the file with your values.
|
|
2501
|
+
3. Save the `values-overrides.yaml` file.
|
|
2502
|
+
|
|
2503
|
+
Once done with the configuration, deploy Dremio to Kubernetes via the OCI Repo. See how in [Deploying Dremio to Kubernetes](/current/deploy-dremio/deploy-on-kubernetes).
|
|
2504
|
+
|
|
2505
|
+
### Manual Modifications to Deployment Files
|
|
2506
|
+
|
|
2507
|
+
important
|
|
2508
|
+
|
|
2509
|
+
For modifications in these files to take effect, you need to install Dremio using a local version of the Helm charts. The `helm install` command must therefore reference a local folder, not an OCI repository such as Quay. For more information and sample commands, see [Helm install](https://helm.sh/docs/helm/helm_install/).
|
|
2510
|
+
|
|
2511
|
+
After completing the `helm pull`, you can edit the charts directly. This may be necessary to add deployment-specific modifications not catered for in the Additional Configuration section. These would typically require modifications to files in the `/config` directory. Any customizations to your Dremio environment are propagated to all the pods when installing or upgrading the deployment.
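As an illustration of installing from the local folder, the sketch below builds a `helm install` command that points at the untarred `dremio-helm` directory instead of the OCI repository. The release name and the location of `values-overrides.yaml` are assumptions; by default the command is only printed (`DRY_RUN=1`).

```shell
#!/bin/sh
# Assumed values: adjust the release name and paths to your environment.
RELEASE="${RELEASE:-dremio}"
CHART_DIR="${CHART_DIR:-./dremio-helm}"
VALUES_FILE="${VALUES_FILE:-values-overrides.yaml}"

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install from the local chart directory so that manual edits take effect.
install_local_chart() {
  run helm install "$RELEASE" "$CHART_DIR" -f "$VALUES_FILE"
}

install_local_chart
```

Set `DRY_RUN=0` once the printed command looks right for your cluster.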
|
|
2512
|
+
|
|
2563
|
+
|
|
2564
|
+
---
|
|
2565
|
+
|
|
2566
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/managing-engines-kubernetes
|
|
2567
|
+
|
|
2568
|
+
Version: current [26.x]
|
|
2569
|
+
|
|
2571
|
+
|
|
2572
|
+
# Managing Engines in Kubernetes
|
|
2573
|
+
|
|
2574
|
+
note
|
|
2575
|
+
|
|
2576
|
+
This feature is for Enterprise Edition only.
|
|
2577
|
+
For Community Edition, see [Configuration of Classic Engines](/current/deploy-dremio/configuring-kubernetes/#configuration-of-classic-engines).
|
|
2578
|
+
|
|
2579
|
+
Dremio can provision multiple separate execution engines in Kubernetes from a Dremio main coordinator node, and start and stop them automatically based on workload requirements at runtime. This provides several benefits, including:
|
|
2580
|
+
|
|
2581
|
+
* Creating a new engine doesn't require restarting Dremio, which enables administrators to achieve workload isolation efficiently.
|
|
2582
|
+
* When creating a new engine, you can use Kubernetes metadata to label engines to keep track of resources.
|
|
2583
|
+
* Right-size execution resources for each distinct workload, instead of implementing a one-size-fits-all model.
|
|
2584
|
+
* Easily experiment with different execution resource sizes at any scale.
|
|
2585
|
+
|
|
2586
|
+
To manage your engines, open the Engines page as follows:
|
|
2587
|
+
|
|
2588
|
+
1. Open your Dremio console.
|
|
2589
|
+
2. Click the Settings icon in the side navigation bar to open the Settings sidebar.
|
|
2590
|
+
3. Select **Engines**.
|
|
2591
|
+
|
|
2592
|
+

|
|
2593
|
+
|
|
2594
|
+
## Monitoring Engines
|
|
2595
|
+
|
|
2596
|
+
You can monitor the status and properties of your engines on the Engines page.
|
|
2597
|
+
|
|
2598
|
+

|
|
2599
|
+
|
|
2600
|
+
Each engine has the following information available:
|
|
2601
|
+
|
|
2602
|
+
* **Name** - The name of the engine, which you can click to see its details. See the section about Viewing Engine Details.
|
|
2603
|
+
* **Size** - The size configured for the engine.
|
|
2604
|
+
* **Status** - The engine status. For more information, see the section in this topic about Engine Statuses.
|
|
2605
|
+
* **Auto start/stop** - Whether the engine has auto start/stop enabled for autoscaling.
|
|
2606
|
+
* **Idle period** - The idle time to auto stop when the engine has **Auto start/stop** enabled.
|
|
2607
|
+
* **Queues** - Query queues routed to the engine.
|
|
2608
|
+
* **Labels** - Labels associated with the engine.
|
|
2609
|
+
|
|
2610
|
+
## Performing Actions on Engines
|
|
2611
|
+
|
|
2612
|
+
While monitoring engines, you can perform actions on each engine through the icons displayed on the right-hand side when hovering over the engine row.
|
|
2613
|
+
|
|
2614
|
+

|
|
2615
|
+
|
|
2616
|
+
### Stopping/Starting an Engine
|
|
2617
|
+
|
|
2618
|
+
You can click the stop/start icon to stop or start an engine manually at any time. Stopping an engine causes running queries to fail, while new queries remain queued and can also fail by timeout if the engine isn't started again. To prevent query failures, reroute queries to another engine, and stop the engine only when no queries are running or queued for the engine.
|
|
2619
|
+
|
|
2620
|
+
note
|
|
2621
|
+
|
|
2622
|
+
You can enable **autoscaling on an engine** to make it stop automatically after an idle time without queries and start again automatically when new queries are issued, all without any human intervention.
|
|
2623
|
+
|
|
2624
|
+
Autoscaling is configured when you add an engine or edit an engine.
|
|
2625
|
+
|
|
2626
|
+
#### Stopping All Engines
|
|
2627
|
+
|
|
2628
|
+
Some complex operations, like upgrading or uninstalling Dremio, require all engines to be stopped beforehand. You can stop engines manually one by one as described above, or automate the procedure using the [Engine Management API](/current/reference/api/engine-management) to stop all engines. The sample bash script below calls the necessary endpoints to stop all engines.
|
|
2629
|
+
|
|
2630
|
+
Sample bash script to stop all engines
|
|
2631
|
+
|
|
2632
|
+
```
#!/bin/bash
# Stop all Dremio engines through the Engine Management API.
# Arguments: <bearer-token> [base-url]; requires curl and jq.

# Check if the bearer token is provided
if [ -z "$1" ]; then
  echo "Error: Bearer token is required."
  exit 1
fi

BEARER_TOKEN=$1
BASE_URL=${2:-https://localhost:9047}

# Make an HTTP GET request to retrieve engine IDs
# (-k skips TLS certificate verification)
RESPONSE=$(curl -k -s -H "Authorization: Bearer $BEARER_TOKEN" "$BASE_URL/api/v3/engines")

# Check if the response contains the "id" field
if ! echo "$RESPONSE" | grep -q '"id"'; then
  echo "Error: No 'id' field found in the response."
  exit 1
fi

# Extract IDs from the response
IDS=$(echo "$RESPONSE" | jq -r '.data[] | .id')

# Loop through each ID and make an HTTP PUT request
for ID in $IDS; do
  STATUS=$(curl -k -s -o /dev/null -w "%{http_code}" -X PUT -H "Authorization: Bearer $BEARER_TOKEN" "$BASE_URL/api/v3/engines/$ID/stop")
  if [ "$STATUS" -eq 200 ]; then
    echo "Successfully stopped engine with ID: $ID"
  else
    echo "Failed to stop the engine with ID: $ID, HTTP status code: $STATUS"
  fi
done

echo "All engines processed."
```
|
|
2661
|
+
|
|
2662
|
+
### Editing the Engine Settings
|
|
2663
|
+
|
|
2664
|
+
You can click the edit icon to edit the engine settings. After saving the new settings, the engine may restart, causing running queries to fail and new queries to be queued.
|
|
2665
|
+
|
|
2666
|
+

|
|
2667
|
+
|
|
2668
|
+
note
|
|
2669
|
+
|
|
2670
|
+
The name of the engine must follow these rules:
|
|
2671
|
+
|
|
2672
|
+
* Must start with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2673
|
+
* Must end with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2674
|
+
* Must contain only lowercase alphanumeric characters or a hyphen (`[\-a-z0-9]`).
|
|
2675
|
+
* Must be under 30 characters in length.
|
|
2676
|
+
* Must be unique and not previously used for any existing or deleted engines.
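The format rules above can be checked locally before a name is submitted. The function below is a minimal sketch with a hypothetical name, `is_valid_engine_name`; it covers only the character and length rules, since uniqueness against existing or deleted engines can only be verified by Dremio itself.

```shell
#!/bin/sh
# Return 0 if the name satisfies the character and length rules, 1 otherwise.
is_valid_engine_name() {
  name="$1"
  # Must be non-empty and under 30 characters in length.
  [ "${#name}" -ge 1 ] && [ "${#name}" -lt 30 ] || return 1
  # Lowercase alphanumerics and hyphens only; first and last character alphanumeric.
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

for candidate in low-cost-query Low-Cost -reports; do
  if is_valid_engine_name "$candidate"; then
    echo "$candidate: valid"
  else
    echo "$candidate: invalid"
  fi
done
```

Of the three sample names, only `low-cost-query` passes: `Low-Cost` contains uppercase characters and `-reports` starts with a hyphen.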
|
|
2677
|
+
|
|
2678
|
+
### Deleting an Engine
|
|
2679
|
+
|
|
2680
|
+
You can click the delete icon to delete an engine. Deleting an engine will cause running, queued, and new queries to fail. To prevent query failures, you can reroute queries to another engine, and delete the engine only when no more queries are running or queued for it.
|
|
2681
|
+
|
|
2682
|
+
## Viewing Engine Details
|
|
2683
|
+
|
|
2684
|
+
To see more details about an engine while monitoring engines, click the engine's name to view all the information about it.
|
|
2685
|
+
|
|
2686
|
+

|
|
2687
|
+
|
|
2688
|
+
On this page, you will also find a set of buttons at the top to delete the engine, stop/start the engine, and edit the engine settings.
|
|
2689
|
+
|
|
2690
|
+
## Adding an Engine
|
|
2691
|
+
|
|
2692
|
+
You can create more engines by clicking **Add Engine** at the top-right corner of the Engines page.
|
|
2693
|
+
|
|
2694
|
+

|
|
2695
|
+
|
|
2696
|
+
In the New engine dialog, do the following:
|
|
2697
|
+
|
|
2698
|
+
1. Fill out the **General** section:
|
|
2699
|
+
|
|
2700
|
+
1. **Name** - Type the name of the engine. Use a meaningful name that helps you to identify the engine better. For example, `low-cost-query`.
|
|
2701
|
+
|
|
2702
|
+
note
|
|
2703
|
+
|
|
2704
|
+
The name of the engine must follow these rules:
|
|
2705
|
+
|
|
2706
|
+
* Must start with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2707
|
+
* Must end with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2708
|
+
* Must contain only lowercase alphanumeric characters or a hyphen (`[\-a-z0-9]`).
|
|
2709
|
+
* Must be under 30 characters in length.
|
|
2710
|
+
* Must be unique and not previously used for any existing or deleted engines.
|
|
2711
|
+
2. **CPU** and **Size** - Select the number of CPUs per executor pod and the size of the engine.
|
|
2712
|
+
Dremio provides nine engine sizes, each with two CPU options targeting 16 or 32 CPU nodes. By default, Dremio will subtract 2 CPUs and 8 GB of memory from its request, resulting in requests for 14 or 30 CPUs and 120 GB of memory. This adjustment helps optimize the packing of executors on the most common node sizes. The table below shows the engine sizes:
|
|
2713
|
+
|
|
2714
|
+
| Engine Size | Executors per Replica | Memory per Executor |
|
|
2715
|
+
| --- | --- | --- |
|
|
2716
|
+
| 2XSmall | 1 | 56 GB |
|
|
2717
|
+
| XSmall | 1 | 120 GB |
|
|
2718
|
+
| Small | 2 | 120 GB |
|
|
2719
|
+
| Medium | 4 | 120 GB |
|
|
2720
|
+
| Large | 8 | 120 GB |
|
|
2721
|
+
| XLarge | 12 | 120 GB |
|
|
2722
|
+
| 2XLarge | 16 | 120 GB |
|
|
2723
|
+
| 3XLarge | 24 | 120 GB |
|
|
2724
|
+
| 4XLarge | 32 | 120 GB |
|
|
2725
|
+
3. **Automatically start/stop** - If checked, the engine automatically stops after the specified idle time and automatically starts when new queries are issued to the engine. If not checked, the engine only stops and starts through manual intervention. By default, this setting is checked and the engine stops automatically after `15 min` of idle time. For more information, see the section Stopping/Starting an Engine.
|
|
2726
|
+
4. (Optional) Expand **Advanced Options** for further settings.
|
|
2727
|
+
|
|
2728
|
+

|
|
2729
|
+
|
|
2730
|
+
Fill out the advanced options as follows:
|
|
2731
|
+
|
|
2732
|
+
1. **Cloud cache volume (c3)** - Specify the amount of local storage for caching data.
|
|
2733
|
+
2. **Spill volume** - Specify the disk size allocated for temporary storage when operations exceed memory limits.
|
|
2734
|
+
2. (Optional) Select **Kubernetes pod metadata** to define pod metadata for the engine, such as labels, annotations, node selectors, and tolerations. Define these values carefully and make sure they match what your cluster expects, because a misconfiguration can prevent Kubernetes from starting the executors that make up the engine.
|
|
2735
|
+
|
|
2736
|
+

|
|
2737
|
+
|
|
2738
|
+
Fill out the pod's metadata with:
|
|
2739
|
+
|
|
2740
|
+
1. **Labels** - Add labels as key/value pairs to identify and organize pods. Use them to group, filter, and select subsets of resources efficiently.
|
|
2741
|
+
|
|
2742
|
+
note
|
|
2743
|
+
|
|
2744
|
+
The engine label must follow these rules:
|
|
2745
|
+
|
|
2746
|
+
* Must start with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2747
|
+
* Must end with a lowercase alphanumeric character (`[a-z0-9]`).
|
|
2748
|
+
* Must contain only lowercase alphanumeric characters, hyphens, or underscores (`[-_a-z0-9]`).
|
|
2749
|
+
* The maximum length is 63 characters.
|
|
2750
|
+
2. **Annotations** - Add annotations as key/value pairs to store non-identifying metadata, such as build information or pointers to logging services. Unlike labels, they're not used for selection or grouping.
|
|
2751
|
+
|
|
2752
|
+
note
|
|
2753
|
+
|
|
2754
|
+
The engine annotation must follow these rules:
|
|
2755
|
+
|
|
2756
|
+
* Must be UTF-8 encoded and can include any valid UTF-8 character.
|
|
2757
|
+
* Can be in plain text, JSON, or any other UTF-8 compatible format.
|
|
2758
|
+
* The maximum size is 256 KB.
|
|
2759
|
+
* The maximum size of all engine annotations is 1 MB.
|
|
2760
|
+
3. **Node selectors** - Add node selectors as key/value pairs for node-specific constraints to schedule pods on nodes matching specified labels. Use this to target nodes with specific configurations or roles.
|
|
2761
|
+
4. **Tolerations** - Add tolerations to allow pods to be scheduled on nodes with matching taints. Tolerations don't restrict scheduling to those nodes; a pod can still land on a node without the taint.
|
|
2762
|
+
3. Click **Add** to add the engine.
|
|
2763
|
+
|
|
2764
|
+
The newly added engine will be displayed in the listed engines.
|
|
2765
|
+
|
|
2766
|
+

|
|
2767
|
+
|
|
2768
|
+
## Engine Statuses
|
|
2769
|
+
|
|
2770
|
+
The following table describes each engine status:
|
|
2771
|
+
|
|
2772
|
+
| Status | Icon | Description |
|
|
2773
|
+
| --- | --- | --- |
|
|
2774
|
+
| Starting | The starting engine icon | The engine is starting. This is the initial state of an engine after being created. New queries are queued to be processed. |
|
|
2775
|
+
| Running | The running engine icon | The engine is running. New queries are queued and processed. |
|
|
2776
|
+
| Stopping | The stopping engine icon | The engine is stopping. Running queries will fail. New queries remain queued and can fail by timeout if the engine isn't started. |
|
|
2777
|
+
| Stopped | The stopped engine icon | The engine is stopped. New queries remain queued and can fail by timeout if the engine isn't started. |
|
|
2778
|
+
| Recovering | The recovering engine icon | The engine is recovering. New queries remain queued and can fail by timeout if the engine doesn't recover. |
|
|
2779
|
+
| Failed | The failed engine icon | The engine failed. New queries remain queued and can fail by timeout if the engine doesn't start. |
|
|
2780
|
+
|
|
2781
|
+
## Related Topics
|
|
2782
|
+
|
|
2783
|
+
* [Engine Management API](/current/reference/api/engine-management/) - The API to manage your engines using REST API calls.
|
|
2784
|
+
* [sys.engines](/current/reference/sql/system-tables/engines/) - The system table to query for information about your engines.
|
|
2785
|
+
* [Audit Logs](/current/security/auditing/) - Audit logs for your engines.
|
|
2786
|
+
|
|
2787
|
+
|
|
2805
|
+
---
|
|
2806
|
+
|
|
2807
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/other-options/
|
|
2808
|
+
|
|
2809
|
+
Version: current [26.x]
|
|
2810
|
+
|
|
2811
|
+
# Other Deployment Options
|
|
2812
|
+
|
|
2813
|
+
Besides the [Kubernetes deployment](/current/deploy-dremio/deploy-on-kubernetes), Dremio supports the following deployment options:
|
|
2814
|
+
|
|
2815
|
+
* [Hadoop Deployment (YARN)](/current/deploy-dremio/other-options/yarn-hadoop) - Deploy Dremio on a Hadoop cluster using YARN.
|
|
2816
|
+
* [Dremio on Your Infrastructure](/current/deploy-dremio/other-options/standalone/) - Deploy Dremio as a standalone cluster.
|
|
2817
|
+
|
|
2818
|
+
|
|
2826
|
+
---
|
|
2827
|
+
|
|
2828
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/kubernetes-environments/
|
|
2829
|
+
|
|
2830
|
+
Version: current [26.x]
|
|
2831
|
+
|
|
2832
|
+
|
|
2833
|
+
|
|
2834
|
+
# Kubernetes Environments for Dremio
|
|
2835
|
+
|
|
2836
|
+
Dremio is designed to run in Kubernetes environments, providing enterprise-grade data lakehouse capabilities. To successfully [deploy Dremio on Kubernetes](/current/deploy-dremio/deploy-on-kubernetes), you need a compatible hosted Kubernetes environment.
|
|
2837
|
+
|
|
2838
|
+
Dremio is tested and supported on the following Kubernetes environments:
|
|
2839
|
+
|
|
2840
|
+
* Elastic Kubernetes Service (EKS)
|
|
2841
|
+
* Azure Kubernetes Service (AKS)
|
|
2842
|
+
* Google Kubernetes Engine (GKE)
|
|
2843
|
+
* Red Hat OpenShift
|
|
2844
|
+
|
|
2845
|
+
The sections on this page detail recommendations for AWS and Azure. Use this information as a guide for your vendor's equivalent options.
|
|
2846
|
+
|
|
2847
|
+
note
|
|
2848
|
+
|
|
2849
|
+
If you're using a containerization platform built on Kubernetes that isn't listed here, please contact your provider and Dremio Account team to discuss compatibility and support options.
|
|
2850
|
+
|
|
2851
|
+
## Requirements
|
|
2852
|
+
|
|
2853
|
+
### Versions
|
|
2854
|
+
|
|
2855
|
+
Dremio requires regular updates to your Kubernetes version. You must be on an officially supported version, and preferably not one on extended support. For examples, see AWS [Available versions on standard support](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#available-versions) and Azure [Kubernetes versions](https://learn.microsoft.com/en-us/azure/aks/supported-kubernetes-versions).
|
|
2856
|
+
|
|
2857
|
+
### Recommendations
|
|
2858
|
+
|
|
2859
|
+
For resource request recommendations for the various parts of the deployment, see [Recommended Resources Configuration](/current/deploy-dremio/configuring-kubernetes/#recommended-resources-configuration).
|
|
2860
|
+
|
|
2861
|
+
For a list of all Dremio engine sizes, see [Add an Engine](/current/deploy-dremio/managing-engines-kubernetes/add-an-engine). Engines make up the lion's share of any Dremio deployment.
|
|
2862
|
+
|
|
2863
|
+
#### Node Sizes
|
|
2864
|
+
|
|
2865
|
+
The following sections suggest AWS and Azure machines that could be used to meet our recommendations.
|
|
2866
|
+
|
|
2867
|
+
Dremio recommends having separate EKS node groups for the different components of our services to allow each node group to autoscale independently:
|
|
2868
|
+
|
|
2869
|
+
**Core Services**
|
|
2870
|
+
|
|
2871
|
+
* **Coordinators**
|
|
2872
|
+
|
|
2873
|
+
For [coordinators](/current/what-is-dremio/architecture/#main-coordinator), Dremio recommends at least 32 CPUs and 64 GB of memory, so a `c6i.8xlarge` or `Standard_F32s_v2` is a good option, offering a CPU-to-memory ratio of 1:2. In the Helm charts, this results in 30 CPUs and 60 GB of memory allocated to the Dremio pod.
|
|
2874
|
+
* **Executors**
|
|
2875
|
+
|
|
2876
|
+
For [executors](/current/what-is-dremio/architecture/#engines), Dremio recommends either:
|
|
2877
|
+
|
|
2878
|
+
+ 16 CPUs and 128 GB of memory, so an `r5d.4xlarge` or `Standard_E16_v5` is a good option, offering a CPU-to-memory ratio of 1:8. In the Helm charts, this results in 15 CPUs and 120 GB of memory allocated to the Dremio pod.
|
|
2879
|
+
+ 32 CPUs and 128 GB of memory, so an `m5d.8xlarge` or `Standard_D32_v5` is a good option, offering a CPU-to-memory ratio of 1:4 for high-concurrency workloads. In the Helm charts, this results in 30 CPUs and 120 GB of memory allocated to the Dremio pod.
|
|
2880
|
+
|
|
2881
|
+
**Auxiliary Services**
|
|
2882
|
+
|
|
2883
|
+
* [Open Catalog](/current/what-is-dremio/architecture/#open-catalog) and [Semantic Search](/current/deploy-dremio/current/what-is-dremio/architecture/#ai-enabled-semantic-search):
|
|
2884
|
+
|
|
2885
|
+
Catalog is made up of 4 key components: Catalog Service, Catalog Server, Catalog External, and MongoDB. Search has one key component, OpenSearch.
|
|
2886
|
+
|
|
2887
|
+
Each of these components needs between 2 and 4 CPUs and between 4 and 16 GB of memory, so an `m5d.2xlarge` or `Standard_D8_v5` is a good option and can host multiple containers that are part of these services.
|
|
2888
|
+
|
|
2889
|
+
* ZooKeeper, NATS, Operators, and Open Telemetry:
|
|
2890
|
+
|
|
2891
|
+
Each of these needs between 0.5 and 1 CPU and between 0.5 and 1 GB of memory; `m5d.large`, `t2.medium`, `Standard_D2_v5`, or `Standard_A2_v2` are good options and can host multiple containers that are part of these services.
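
When you map these recommendations to another vendor's machine types, the CPU-to-memory ratio is the quantity to match. As a small worked example, the hypothetical helper below derives the ratio from a node's CPU count and memory in GB using integer division; it is a sketch for comparing node shapes, not a Dremio tool.

```shell
# Hypothetical helper: prints the CPU-to-memory ratio as "1:N",
# where N is memory in GB divided by the CPU count (integer division).
cpu_memory_ratio() {
  cpus="$1"
  memory_gb="$2"
  echo "1:$((memory_gb / cpus))"
}
```

With the figures quoted above, `cpu_memory_ratio 32 64` prints `1:2` (coordinators), `cpu_memory_ratio 16 128` prints `1:8`, and `cpu_memory_ratio 32 128` prints `1:4` (the two executor options).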
|
|
2892
|
+
|
|
2893
|
+
#### Disk Storage Class
|
|
2894
|
+
|
|
2895
|
+
Dremio recommends:
|
|
2896
|
+
|
|
2897
|
+
* For AWS, GP3 or IO2 as the storage type for all nodes.
|
|
2898
|
+
* For Azure, managed-premium as the storage type for all nodes.
|
|
2899
|
+
|
|
2900
|
+
Additionally, for [coordinators](/current/what-is-dremio/architecture/#main-coordinator) and [executors](/current/what-is-dremio/architecture/#engines), you can use local NVMe SSD storage for C3 and spill on executors. For more information on storage classes, see [AWS Storage Class](https://docs.aws.amazon.com/eks/latest/userguide/create-storage-class.html) and [Azure Storage Class](https://learn.microsoft.com/en-us/azure/aks/concepts-storage).
|
|
2901
|
+
|
|
2902
|
+
Storage size requirements are:
|
|
2903
|
+
|
|
2904
|
+
* Coordinator volume #1: 128-512 GB (key-value store).
|
|
2905
|
+
* Coordinator volume #2: 16 GB (logs).
|
|
2906
|
+
* Executor volume #1: 128-512 GB (spilling).
|
|
2907
|
+
* Executor volume #2: 128-512 GB (C3).
|
|
2908
|
+
* Executor volume #3: 16 GB (logs).
|
|
2909
|
+
* MongoDB volume: 128-512 GB.
|
|
2910
|
+
* OpenSearch volume: 128 GB.
|
|
2911
|
+
* ZooKeeper volume: 16 GB.
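
To get a rough lower bound on persistent storage for a deployment, you can add up the low end of each range in the list above. The following sketch does that arithmetic. The function name is hypothetical, and it assumes a single volume each for MongoDB, OpenSearch, and ZooKeeper, which may not match your replica counts.

```shell
# Hypothetical helper: prints the minimum total storage in GB for a deployment.
# $1 = number of coordinators, $2 = number of executors.
# Assumption: one volume each for MongoDB, OpenSearch, and ZooKeeper.
dremio_min_storage_gb() {
  coordinators="$1"
  executors="$2"
  per_coordinator=$((128 + 16))        # key-value store + logs
  per_executor=$((128 + 128 + 16))     # spilling + C3 + logs
  shared=$((128 + 128 + 16))           # MongoDB + OpenSearch + ZooKeeper
  echo $((coordinators * per_coordinator + executors * per_executor + shared))
}
```

For example, `dremio_min_storage_gb 1 4` prints `1504`: 144 GB for the coordinator, 1088 GB for four executors, and 272 GB for the shared services.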
|
|
2912
|
+
|
|
2913
|
+
### EKS Add-Ons
|
|
2914
|
+
|
|
2915
|
+
The following add-ons are required for EKS clusters:
|
|
2916
|
+
|
|
2917
|
+
* Amazon EBS CSI Driver
|
|
2918
|
+
* EKS Pod Identity Agent
|
|
2919
|
+
|
|
2920
|
+
|
|
2933
|
+
---
|
|
2934
|
+
|
|
2935
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/deploy-on-kubernetes/
|
|
2936
|
+
|
|
2937
|
+
Version: current [26.x]
|
|
2938
|
+
|
|
2939
|
+
|
|
2940
|
+
|
|
2941
|
+
# Deploy Dremio on Kubernetes
|
|
2942
|
+
|
|
2943
|
+
You can follow these instructions to deploy Dremio on Kubernetes provisioned through a cloud provider or running in an on-premises environment.
|
|
2944
|
+
|
|
2945
|
+
FREE TRIAL
|
|
2946
|
+
|
|
2947
|
+
If you are using an **Enterprise Edition free trial**, go to [Get Started with the Enterprise Edition Free Trial](/current/get-started/kubernetes-trial).
|
|
2948
|
+
|
|
2949
|
+
## Prerequisites
|
|
2950
|
+
|
|
2951
|
+
Before deploying Dremio on Kubernetes, ensure you have the following:
|
|
2952
|
+
|
|
2953
|
+
* A hosted Kubernetes environment to deploy and manage the Dremio cluster.
|
|
2954
|
+
Each Dremio release is tested against [Amazon Elastic Kubernetes Service (EKS)](https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html), [Azure Kubernetes Service (AKS)](https://learn.microsoft.com/en-us/azure/aks/what-is-aks), and [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine?hl=en#how-it-works) to ensure compatibility. If you have a containerization platform built on top of Kubernetes that is not listed here, please contact your provider and the Dremio Account Team regarding compatibility.
|
|
2955
|
+
* Helm 3 installed on your local machine to run Helm commands. For installation instructions, refer to [Installing Helm](https://helm.sh/docs/intro/install/) in the Helm documentation.
|
|
2956
|
+
* A local kubectl configured to access your Kubernetes cluster. For installation instructions, refer to [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) in the Kubernetes documentation.
|
|
2957
|
+
* Object Storage: Amazon S3 (including S3-compatible, e.g., MinIO), Azure Storage, or Google Cloud Storage (GCS).
|
|
2958
|
+
* Storage classes that support ReadWriteOnce (RWO) access mode and ideally can create expandable volumes.
|
|
2959
|
+
* The ability to connect to [Quay.io](http://quay.io/) to access the [new v3 Helm chart](https://quay.io/repository/dremio/dremio-helm?tab=tags) for Dremio 26+, since the [older v2 Helm chart](https://github.com/dremio/dremio-cloud-tools/tree/master/charts/dremio_v2) will not function.
|
|
2960
|
+
|
|
2961
|
+
### Additional Prerequisites for the Enterprise Edition
|
|
2962
|
+
|
|
2963
|
+
For the Enterprise Edition, you must:
|
|
2964
|
+
|
|
2965
|
+
* Create an account on [Quay.io](https://quay.io/) to access [Dremio's OCI repository](https://quay.io/organization/dremio), which stores Dremio's Helm charts and images.
|
|
2966
|
+
To get access, contact your Dremio account executive or Dremio Support.
|
|
2967
|
+
|
|
2968
|
+
note
|
|
2969
|
+
|
|
2970
|
+
If your internet access doesn't allow reaching Dremio's OCI repository in Quay.io, consider using a private mirror to fetch Dremio's Helm chart images.
|
|
2971
|
+
* Get a valid license key issued by Dremio to put in the Helm chart. To obtain the license, refer to [Licensing](/current/admin/licensing/).
|
|
2972
|
+
|
|
2973
|
+
### Additional Prerequisites for OpenShift
|
|
2974
|
+
|
|
2975
|
+
Before deploying Dremio onto OpenShift, you additionally need the following:
|
|
2976
|
+
|
|
2977
|
+
* Have the OpenShift `oc` CLI command configured and authenticated. For the installation instructions, see [OpenShift CLI (oc)](https://docs.redhat.com/en/documentation/openshift_container_platform/4.11/html/cli_tools/openshift-cli-oc).
|
|
2978
|
+
|
|
2979
|
+
#### Node Tuning for OpenSearch on OpenShift
|
|
2980
|
+
|
|
2981
|
+
OpenSearch requires the `vm.max_map_count` kernel parameter to be set to at least **262144**.
|
|
2982
|
+
|
|
2983
|
+
This parameter controls the maximum number of memory map areas a process can have, and OpenSearch uses memory-mapped files extensively for performance.
|
|
2984
|
+
|
|
2985
|
+
Without this setting, OpenSearch pods will fail to start with errors related to virtual memory limits.
|
|
2986
|
+
|
|
2987
|
+
Since the Helm chart sets `setVMMaxMapCount: false` for OpenShift compatibility (to avoid privileged init containers), you need to configure this kernel parameter at the node level. The **recommended way** to do this is with the Node Tuning Operator, which ships with OpenShift and provides a declarative way to configure kernel parameters.
|
|
2988
|
+
|
|
2989
|
+
Create a `Tuned` resource to configure the required kernel parameter:
|
|
2990
|
+
|
|
2991
|
+
The `tuned-opensearch.yaml` configuration file
|
|
2992
|
+
|
|
2993
|
+
```
|
|
2994
|
+
apiVersion: tuned.openshift.io/v1
|
|
2995
|
+
kind: Tuned
|
|
2996
|
+
metadata:
|
|
2997
|
+
  name: openshift-opensearch
|
|
2998
|
+
  namespace: openshift-cluster-node-tuning-operator
|
|
2999
|
+
spec:
|
|
3000
|
+
  profile:
|
|
3001
|
+
  - data: |
|
|
3002
|
+
      [main]
|
|
3003
|
+
      summary=Optimize systems running OpenSearch on OpenShift nodes
|
|
3004
|
+
      include=openshift-node
|
|
3005
|
+
      [sysctl]
|
|
3006
|
+
      vm.max_map_count=262144
|
|
3007
|
+
    name: openshift-opensearch
|
|
3008
|
+
  recommend:
|
|
3009
|
+
  - match:
|
|
3010
|
+
    - label: tuned.openshift.io/opensearch
|
|
3011
|
+
      type: pod
|
|
3012
|
+
    priority: 20
|
|
3013
|
+
    profile: openshift-opensearch
|
|
3014
|
+
```
|
|
3015
|
+
|
|
3016
|
+
Save this YAML locally and apply it to every cluster where you intend to deploy Dremio:
|
|
3017
|
+
|
|
3018
|
+
```
|
|
3019
|
+
oc apply -f tuned-opensearch.yaml
|
|
3020
|
+
```
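
After the `Tuned` resource is applied, you may want to confirm that a node actually reports the required value. The sketch below only compares a value against the 262144 minimum stated above; the function name is hypothetical. On a Linux node, the current value can be read from `/proc/sys/vm/max_map_count`. How you reach the node (for example, through a debug session) depends on your cluster and is an assumption left to you.

```shell
# Hypothetical helper: succeeds when the given vm.max_map_count value
# meets the OpenSearch minimum of 262144.
max_map_count_ok() {
  [ "$1" -ge 262144 ]
}

# Usage on a Linux node (not run here):
#   max_map_count_ok "$(cat /proc/sys/vm/max_map_count)" && echo "ok"
```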
|
|
3021
|
+
|
|
3022
|
+
## Step 1: Deploy Dremio
|
|
3023
|
+
|
|
3024
|
+
To deploy the Dremio cluster in Kubernetes, do the following:
|
|
3025
|
+
|
|
3026
|
+
1. Configure your values to deploy Dremio to Kubernetes in the file `values-overrides.yaml`. To do so, follow [Configuring Your Values to Deploy Dremio to Kubernetes](/current/deploy-dremio/configuring-kubernetes/), then return here to continue with the deployment.
|
|
3027
|
+
2. On your terminal, start the deployment by installing Dremio's Helm chart:
|
|
3028
|
+
|
|
3029
|
+
|
|
3032
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
3033
|
+
|
|
3034
|
+
```
|
|
3035
|
+
helm install <your-dremio-install-release> oci://quay.io/dremio/dremio-helm \
|
|
3036
|
+
--values <your-local-path>/values-overrides.yaml \
|
|
3037
|
+
--version <optional-helm-chart-version> \
|
|
3038
|
+
--set-file <optional-config-files> \
|
|
3039
|
+
--wait
|
|
3040
|
+
```
|
|
3041
|
+
|
|
3042
|
+
Where:
|
|
3043
|
+
|
|
3044
|
+
* `<your-dremio-install-release>` - The name that identifies your Dremio installation. For example, `dremio-1-0`.
|
|
3045
|
+
* `<your-local-path>` - The path to reach your `values-overrides.yaml` configuration file.
|
|
3046
|
+
* (Optional) `--version <optional-helm-chart-version>` - The version of Dremio's Helm chart to be used. If not provided, defaults to the latest.
|
|
3047
|
+
* (Optional) `--set-file <optional-config-file>` - An optional configuration file for deploying Dremio. For example, an [Identity Provider](/current/security/authentication/identity-providers/) configuration file, which is not defined in the `values-overrides.yaml` and can be provided here through this option.
|
|
3048
|
+
|
|
3049
|
+
For OpenShift, the command requires an additional `--values` option with the path to the OpenShift-specific `values-openshift-overrides.yaml` configuration file. This additional option must be placed before the `--values` option with the `values-overrides.yaml` configuration file, so that the OpenShift values are applied first and the values in `values-overrides.yaml` take precedence over them.
|
|
3050
|
+
|
|
3051
|
+
Run the following command for OpenShift:
|
|
3052
|
+
|
|
3053
|
+
```
|
|
3054
|
+
helm install <your-dremio-install-release> oci://quay.io/dremio/dremio-helm \
|
|
3055
|
+
--values <your-local-path1>/values-openshift-overrides.yaml \
|
|
3056
|
+
--values <your-local-path2>/values-overrides.yaml \
|
|
3057
|
+
--version <optional-helm-chart-version> \
|
|
3058
|
+
--set-file <optional-config-files> \
|
|
3059
|
+
--wait
|
|
3060
|
+
```
|
|
3061
|
+
|
|
3062
|
+
Where:
|
|
3063
|
+
|
|
3064
|
+
* `<your-dremio-install-release>` - The name that identifies your Dremio installation. For example, `dremio-1-0`.
|
|
3065
|
+
* `<your-local-path1>` - The path to reach your `values-openshift-overrides.yaml` configuration file. Only required for OpenShift.
|
|
3066
|
+
* `<your-local-path2>` - The path to reach your `values-overrides.yaml` configuration file.
|
|
3067
|
+
* (Optional) `--version <optional-helm-chart-version>` - The version of Dremio's Helm chart to be used. If not provided, defaults to the latest.
|
|
3068
|
+
* (Optional) `--set-file <optional-config-file>` - An optional configuration file for deploying Dremio. For example, an [Identity Provider](/current/security/authentication/identity-providers/) configuration file, which is not defined in the `values-overrides.yaml` and can be provided here through this option.
|
|
3069
|
+
3. Monitor the deployment using the following commands:
|
|
3070
|
+
|
|
3071
|
+
|
|
3074
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
3075
|
+
|
|
3076
|
+
```
|
|
3077
|
+
kubectl get pods
|
|
3078
|
+
```
|
|
3079
|
+
|
|
3080
|
+
For OpenShift, run the following command:
|
|
3081
|
+
|
|
3082
|
+
```
|
|
3083
|
+
oc get pods
|
|
3084
|
+
```
|
|
3085
|
+
|
|
3086
|
+
When all of the pods are in the `Ready` state, the deployment is complete.
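
The ordering of the `--values` options in step 2 is easy to get wrong when scripting the install. The following sketch assembles the `helm install` arguments from step 2 in the required order and prints them one per line. The function name is hypothetical, and the optional `--version` and `--set-file` options are left out for brevity.

```shell
# Hypothetical helper: prints the `helm install` arguments, one per line.
# $1 = release name, $2 = path to values-overrides.yaml,
# $3 = optional path to values-openshift-overrides.yaml (OpenShift only).
dremio_helm_install_args() {
  release="$1"
  overrides="$2"
  openshift_overrides="${3:-}"
  printf '%s\n' install "$release" oci://quay.io/dremio/dremio-helm
  if [ -n "$openshift_overrides" ]; then
    # Listed first so that values-overrides.yaml is applied on top of it.
    printf '%s\n' --values "$openshift_overrides"
  fi
  printf '%s\n' --values "$overrides" --wait
}
```

Calling the function with a third argument prints the OpenShift file's `--values` pair ahead of the general one; without it, only the `values-overrides.yaml` pair is printed.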
|
|
3087
|
+
|
|
3088
|
+
Troubleshooting
|
|
3089
|
+
|
|
3090
|
+
* If a pod remains in `Pending` state for more than a few minutes, run the following command to view its status to check for issues, such as insufficient resources for scheduling:
|
|
3091
|
+
|
|
3092
|
+
```
|
|
3093
|
+
kubectl describe pods <pod-name>
|
|
3094
|
+
```
|
|
3095
|
+
* If the events at the bottom of the output mention insufficient CPU or memory, do one of the following:
|
|
3096
|
+
|
|
3097
|
+
+ Adjust the values in the `values-overrides.yaml` configuration file and redeploy.
|
|
3098
|
+
+ Add more resources to your Kubernetes cluster.
|
|
3099
|
+
* If a pod returns a failed state (especially `dremio-master-0`, the most important pod), use the following commands to collect the logs:
|
|
3100
|
+
|
|
3101
|
+
|
|
3104
|
+
Run the following command for any Kubernetes environment except for OpenShift:
|
|
3105
|
+
|
|
3106
|
+
```
|
|
3107
|
+
kubectl logs dremio-master-0
|
|
3108
|
+
```
|
|
3109
|
+
|
|
3110
|
+
For OpenShift, run the following command:
|
|
3111
|
+
|
|
3112
|
+
```
|
|
3113
|
+
oc logs deployment/dremio-master
|
|
3114
|
+
```
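
To find the pods that need the troubleshooting steps above, you can filter the pod listing instead of reading it by eye. The sketch below is a hypothetical helper that reads the output of `kubectl get pods --no-headers` (or `oc get pods --no-headers`) on standard input and prints the names of pods whose `READY` column is not fully satisfied. It assumes the default column order of NAME, READY, STATUS and skips pods whose status is `Completed`.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the names of pods that are not fully ready.
pods_not_ready() {
  awk '{
    split($2, ready, "/")     # READY column looks like "1/1"
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Usage (not run here):
#   kubectl get pods --no-headers | pods_not_ready
```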
|
|
3115
|
+
|
|
3116
|
+
## Step 2: Connecting to Dremio
|
|
3117
|
+
|
|
3118
|
+
Now that you've installed the Helm chart and deployed Dremio on Kubernetes, the next step is connecting to Dremio, where you have the following options:
|
|
3119
|
+
|
|
3120
|
+
* Dremio Console
|
|
3121
|
+
* OpenShift Route
|
|
3122
|
+
* BI Tools via ODBC/JDBC
|
|
3123
|
+
* BI Tools via Apache Arrow Flight
|
|
3124
|
+
|
|
3125
|
+
To connect to Dremio via [the Dremio console](/current/get-started/quick_tour), run the following command to look up the `dremio-client` service in Kubernetes and find the host for the Dremio console:
|
|
3126
|
+
|
|
3127
|
+
```
|
|
3128
|
+
$ kubectl get services dremio-client
|
|
3129
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3130
|
+
... ... ... ... ... ...
|
|
3131
|
+
```
|
|
3132
|
+
|
|
3133
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access the Dremio console through the address in the `EXTERNAL-IP` column and port **9047**.
|
|
3134
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access the Dremio console through <http://8.8.8.8:9047>.
|
|
3135
|
+
|
|
3136
|
+
```
|
|
3137
|
+
$ kubectl get services dremio-client
|
|
3138
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3139
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
3140
|
+
```
|
|
3141
|
+
|
|
3142
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.web.port` in the file `values-overrides.yaml`.
|
|
3143
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access the Dremio console through <http://localhost:30670>.
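
The lookup above can be scripted. The following sketch is a hypothetical helper that reads the output of `kubectl get services dremio-client` on standard input and prints a console URL. It assumes the default column order shown above and the default web port 9047. For a `NodePort` service, it takes the node port from the `PORT(S)` column, where each entry has the form `port:nodePort/protocol`, and combines it with a node address that you supply.

```shell
# Hypothetical helper: reads `kubectl get services dremio-client` output on
# stdin and prints the URL of the Dremio console.
# $1 = address of a cluster node, used only for NodePort services.
dremio_console_url() {
  node_host="${1:-localhost}"
  awk -v node="$node_host" '
    $1 == "dremio-client" {
      n = split($5, ports, ",")            # e.g. 31010:32260/TCP,9047:30620/TCP
      for (i = 1; i <= n; i++) {
        split(ports[i], p, "[:/]")         # p[1] = service port, p[2] = node port
        if (p[1] == 9047) node_port = p[2]
      }
      if ($2 == "LoadBalancer") print "http://" $4 ":9047"
      else if ($2 == "NodePort") print "http://" node ":" node_port
    }'
}

# Usage (not run here):
#   kubectl get services dremio-client | dremio_console_url
```

If the load balancer has not been assigned an address yet, the `EXTERNAL-IP` column holds a placeholder rather than an address, so check the printed URL before using it.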
|
|
3144
|
+
|
|
3145
|
+
To expose Dremio externally using OpenShift Routes, do the following:
|
|
3146
|
+
|
|
3147
|
+
```
|
|
3148
|
+
$ oc expose service dremio-client --port=9047 --name=dremio-ui
|
|
3149
|
+
|
|
3150
|
+
$ oc get route dremio-ui -o jsonpath='{.spec.host}'
|
|
3151
|
+
```
|
|
3152
|
+
|
|
3153
|
+
To connect your BI tools to Dremio via ODBC/JDBC, run the following command to look up the `dremio-client` service in Kubernetes and find the host for ODBC/JDBC connections:
|
|
3154
|
+
|
|
3155
|
+
```
|
|
3156
|
+
$ kubectl get services dremio-client
|
|
3157
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3158
|
+
... ... ... ... ... ...
|
|
3159
|
+
```
|
|
3160
|
+
|
|
3161
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access Dremio using ODBC/JDBC through the address in the `EXTERNAL-IP` column and port **31010**.
|
|
3162
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access Dremio using ODBC/JDBC on port 31010 through <http://8.8.8.8:31010>.
|
|
3163
|
+
|
|
3164
|
+
```
|
|
3165
|
+
$ kubectl get services dremio-client
|
|
3166
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3167
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
3168
|
+
```
|
|
3169
|
+
|
|
3170
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.client.port` in the file `values-overrides.yaml`.
|
|
3171
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access Dremio using ODBC/JDBC through <http://localhost:32390>.
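
BI tools usually expect a driver connection string rather than a browser-style address. As a sketch, the hypothetical helper below builds a connection string from the host and port found above. It assumes the `jdbc:dremio:direct=<host>:<port>` form used by Dremio's legacy JDBC driver; check your driver's documentation for the exact format and any additional properties it needs.

```shell
# Hypothetical helper: prints a JDBC connection string for a direct
# connection to the coordinator. $1 = host, $2 = port (default 31010).
# Assumption: the legacy Dremio JDBC driver's "direct" URL form.
dremio_jdbc_url() {
  host="$1"
  port="${2:-31010}"
  echo "jdbc:dremio:direct=${host}:${port}"
}
```

With the example output above, `dremio_jdbc_url 8.8.8.8` prints `jdbc:dremio:direct=8.8.8.8:31010`.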
|
|
3172
|
+
|
|
3173
|
+
To connect your BI tools to Dremio via Apache Arrow Flight, run the following command to look up the `dremio-client` service in Kubernetes and find the host for Apache Arrow Flight connections:
|
|
3174
|
+
|
|
3175
|
+
```
|
|
3176
|
+
$ kubectl get services dremio-client
|
|
3177
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3178
|
+
... ... ... ... ... ...
|
|
3179
|
+
```
|
|
3180
|
+
|
|
3181
|
+
* If the value in the `TYPE` column of the output is `LoadBalancer`, access Dremio using Apache Arrow Flight through the address in the `EXTERNAL-IP` column and port **32010**.
|
|
3182
|
+
For example, in the output below, the value under the `EXTERNAL-IP` column is `8.8.8.8`. Therefore, access Dremio using Apache Arrow Flight through <http://8.8.8.8:32010>.
|
|
3183
|
+
|
|
3184
|
+
```
|
|
3185
|
+
$ kubectl get services dremio-client
|
|
3186
|
+
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|
3187
|
+
dremio-client LoadBalancer 10.99.227.180 8.8.8.8 31010:32260/TCP,9047:30620/TCP 2d
|
|
3188
|
+
```
|
|
3189
|
+
|
|
3190
|
+
If you want to change the exposed port on the load balancer, change the value of the setting `coordinator.flight.port` in the file `values-overrides.yaml`.
|
|
3191
|
+
* If the value in the `TYPE` column of the output is `NodePort`, access Dremio using Apache Arrow Flight through <http://localhost:31357>.
|
|
3192
|
+
|
|
3193
|
+
|
|
3207
|
+
---
|
|
3208
|
+
|
|
3209
|
+
# Source: https://docs.dremio.com/current/deploy-dremio/managing-engines-kubernetes/
|
|
3210
|
+
|
|
3211
|
+
Version: current [26.x]
|
|
3212
|
+
|
|
3213
|
+
|
|
3214
|
+
|
|
3215
|
+
# Managing Engines in Kubernetes Enterprise
|
|
3216
|
+
|
|
3217
|
+
note
|
|
3218
|
+
|
|
3219
|
+
This feature is for Enterprise Edition only.
|
|
3220
|
+
For Community Edition, see [Configuration of Classic Engines](/current/deploy-dremio/configuring-kubernetes/#configuration-of-classic-engines).
|
|
3221
|
+
|
|
3222
|
+
Dremio can provision multiple separate execution engines in Kubernetes from the main coordinator node and automatically start and stop them based on workload requirements at runtime. This provides several benefits, including:
|
|
3223
|
+
|
|
3224
|
+
* Creating a new engine doesn't require restarting Dremio, which enables administrators to achieve workload isolation efficiently.
|
|
3225
|
+
* When creating a new engine, you can use Kubernetes metadata to label engines to keep track of resources.
|
|
3226
|
+
* Right-size execution resources for each distinct workload, instead of implementing a one-size-fits-all model.
|
|
3227
|
+
* Easily experiment with different execution resource sizes at any scale.
|
|
3228
|
+
|
|
3229
|
+
To manage your engines, open the Engines page as follows:
|
|
3230
|
+
|
|
3231
|
+
1. Open your Dremio console.
|
|
3232
|
+
2. Click the Settings icon in the side navigation bar to open the Settings sidebar.
|
|
3233
|
+
3. Select **Engines**.
|
|
3234
|
+
|
|
3235
|
+

|
|
3236
|
+
|
|
3237
|
+
## Monitoring Engines

You can monitor the status and properties of your engines on the Engines page.

Each engine has the following information available:

* **Name** - The name of the engine, which you can click to see its details. See the section Viewing Engine Details.
* **Size** - The size configured for the engine.
* **Status** - The engine status. For more information, see the section Engine Statuses.
* **Auto start/stop** - Whether the engine has auto start/stop enabled for autoscaling.
* **Idle period** - The idle time after which the engine stops automatically when **Auto start/stop** is enabled.
* **Queues** - Query queues routed to the engine.
* **Labels** - Labels associated with the engine.

## Performing Actions on Engines

While monitoring engines, you can perform actions on an engine by using the icons displayed on the right-hand side when you hover over the engine row.

### Stopping/Starting an Engine

You can click the stop or start icon to stop or start an engine manually at any time. Stopping an engine causes running queries to fail. New queries remain queued and can fail by timeout if the engine isn't started again. To prevent query failures, reroute queries to another engine, and stop the engine only when no queries are running or queued for it.

note

You can enable **autoscaling on an engine** to make it stop automatically after an idle period without queries and start again automatically when new queries are issued, without any human intervention.

Autoscaling is configured when you add an engine or edit an engine.

#### Stopping All Engines

Some complex operations, like upgrading or uninstalling Dremio, require all engines to be stopped beforehand. You can stop engines manually one by one as described above, or automate the procedure by using the [Engine Management API](/current/reference/api/engine-management) to stop all engines. The following sample bash script calls the necessary endpoints to stop all engines.

Sample bash script to stop all engines

```
#!/bin/bash
# Stops every Dremio engine through the Engine Management API.
# Usage: <script> <bearer-token> [base-url]

# Check if the bearer token is provided
if [ -z "$1" ]; then
  echo "Error: Bearer token is required." >&2
  exit 1
fi

BEARER_TOKEN=$1
BASE_URL=${2:-https://localhost:9047}

# Make an HTTP GET request to retrieve engine IDs
# (-k skips TLS certificate verification; -s silences progress output)
RESPONSE=$(curl -k -s -H "Authorization: Bearer $BEARER_TOKEN" "$BASE_URL/api/v3/engines")

# Check if the response contains the "id" field
if ! echo "$RESPONSE" | grep -q '"id"'; then
  echo "Error: No 'id' field found in the response." >&2
  exit 1
fi

# Extract IDs from the response (requires jq)
IDS=$(echo "$RESPONSE" | jq -r '.data[] | .id')

# Loop through each ID and make an HTTP PUT request to stop the engine
for ID in $IDS; do
  # Capture only the HTTP status code of the stop request
  STATUS=$(curl -k -s -o /dev/null -w "%{http_code}" -X PUT -H "Authorization: Bearer $BEARER_TOKEN" "$BASE_URL/api/v3/engines/$ID/stop")
  if [ "$STATUS" -eq 200 ]; then
    echo "Successfully stopped engine with ID: $ID"
  else
    echo "Failed to stop the engine with ID: $ID, HTTP status code: $STATUS"
  fi
done

echo "All engines processed."
```

### Editing the Engine Settings

You can click the edit icon to edit the engine settings. After you save the new settings, the engine may restart, causing running queries to fail and new queries to be queued.

note

The name of the engine must follow these rules:

* Must start with a lowercase alphanumeric character (`[a-z0-9]`).
* Must end with a lowercase alphanumeric character (`[a-z0-9]`).
* Must contain only lowercase alphanumeric characters or a hyphen (`[\-a-z0-9]`).
* Must be under 30 characters in length.
* Must be unique and not previously used for any existing or deleted engines.

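The naming rules above can be checked locally before you type a name into the dialog or send it through the API. The following is a minimal sketch in POSIX-compatible shell. The function name `is_valid_engine_name` is hypothetical and not part of Dremio, and the sketch cannot check the uniqueness rule, which only Dremio can verify.

```shell
#!/bin/bash
# Sketch only: is_valid_engine_name is a hypothetical helper, not a Dremio tool.
# Use byte-wise character ranges so that a-z matches only ASCII lowercase letters.
LC_ALL=C
export LC_ALL

is_valid_engine_name() {
  candidate=$1
  # Must be non-empty and under 30 characters in length.
  if [ -z "$candidate" ] || [ "${#candidate}" -ge 30 ]; then
    return 1
  fi
  case $candidate in
    *[!a-z0-9-]*) return 1 ;;  # only lowercase alphanumerics or hyphens allowed
    -* | *-) return 1 ;;       # must start and end with an alphanumeric character
  esac
  return 0
}

# Example usage: report which candidate names pass the local checks.
for name in low-cost-query Low-Cost -reports reports-; do
  if is_valid_engine_name "$name"; then
    printf '%s: passes the naming rules\n' "$name"
  else
    printf '%s: violates the naming rules\n' "$name"
  fi
done
```

A name that passes these checks can still be rejected if it was already used for an existing or deleted engine.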
### Deleting an Engine

You can click the delete icon to delete an engine. Deleting an engine causes running, queued, and new queries to fail. To prevent query failures, reroute queries to another engine, and delete the engine only when no queries are running or queued for it.

## Viewing Engine Details

While monitoring engines, click an engine's name to view all the information about it.

On this page, you will also find buttons at the top to delete the engine, stop or start the engine, and edit the engine settings.

## Adding an Engine

You can create more engines by clicking **Add Engine** at the top-right corner of the Engines page.

In the New engine dialog, do the following:

1. Fill out the **General** section:

    1. **Name** - Type the name of the engine. Use a meaningful name that helps you identify the engine, for example, `low-cost-query`.

        note

        The name of the engine must follow these rules:

        * Must start with a lowercase alphanumeric character (`[a-z0-9]`).
        * Must end with a lowercase alphanumeric character (`[a-z0-9]`).
        * Must contain only lowercase alphanumeric characters or a hyphen (`[\-a-z0-9]`).
        * Must be under 30 characters in length.
        * Must be unique and not previously used for any existing or deleted engines.
    2. **CPU** and **Size** - Select the number of CPUs per executor pod and the size of the engine.

        Dremio provides nine engine sizes, each with two CPU options targeting 16-CPU or 32-CPU nodes. By default, Dremio subtracts 2 CPUs and 8 GB of memory from its request, resulting in requests for 14 or 30 CPUs and 120 GB of memory (56 GB for the 2XSmall size). This adjustment helps pack executors efficiently onto the most common node sizes. The table below shows the engine sizes:

        | Engine Size | Executors per Replica | Memory per Executor |
        | --- | --- | --- |
        | 2XSmall | 1 | 56 GB |
        | XSmall | 1 | 120 GB |
        | Small | 2 | 120 GB |
        | Medium | 4 | 120 GB |
        | Large | 8 | 120 GB |
        | XLarge | 12 | 120 GB |
        | 2XLarge | 16 | 120 GB |
        | 3XLarge | 24 | 120 GB |
        | 4XLarge | 32 | 120 GB |
    3. **Automatically start/stop** - If checked, the engine automatically stops after the specified idle time and automatically starts when new queries are issued to the engine. If not checked, the engine only stops and starts through manual intervention. By default, this setting is checked and the engine stops automatically after `15 min` of idle time. For more information, see the section Stopping/Starting an Engine.
    4. (Optional) Expand **Advanced Options** for further settings.

        Fill out the advanced options as follows:

        1. **Cloud cache volume (c3)** - Specify the amount of local storage for caching data.
        2. **Spill volume** - Specify the disk size allocated for temporary storage when operations exceed memory limits.
2. (Optional) Select **Kubernetes pod metadata** to define pod metadata for the engine, such as labels, annotations, node selectors, and tolerations. Define these values carefully and only when you know which entries your cluster expects, because a misconfiguration can prevent Kubernetes from starting the executors that make up the engine.

    Fill out the pod's metadata with:

    1. **Labels** - Add labels as key/value pairs to identify and organize pods. Use them to group, filter, and select subsets of resources efficiently.

        note

        The engine label must follow these rules:

        * Must start with a lowercase alphanumeric character (`[a-z0-9]`).
        * Must end with a lowercase alphanumeric character (`[a-z0-9]`).
        * Must contain only lowercase alphanumeric characters, a hyphen, or an underscore (`[-_a-z0-9]`).
        * The maximum length is 63 characters.
    2. **Annotations** - Add annotations as key/value pairs to store non-identifying metadata, such as build information or pointers to logging services. Unlike labels, they're not used for selection or grouping.

        note

        The engine annotation must follow these rules:

        * Must be UTF-8 encoded and can include any valid UTF-8 character.
        * Can be in plain text, JSON, or any other UTF-8 compatible format.
        * The maximum size of a single annotation is 256 KB.
        * The maximum size of all engine annotations is 1 MB.
    3. **Node selectors** - Add node selectors as key/value pairs to schedule pods only on nodes whose labels match. Use this to target nodes with specific configurations or roles.
    4. **Tolerations** - Add tolerations to allow pods to be scheduled on nodes with matching taints. Tolerations don't restrict scheduling to those nodes; a pod can still land on a node without the taint.
3. Click **Add** to add the engine.

The newly added engine is displayed in the list of engines.

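The sizing arithmetic described under **CPU** and **Size** can be worked through for any row of the engine sizes table. The sketch below multiplies the executors per replica by the per-executor request. The function name `engine_footprint` is hypothetical, and the sketch assumes that every engine size uses the same CPU request (node CPUs minus 2), which the text implies but does not state for each size.

```shell
#!/bin/bash
# Sketch only: engine_footprint is a hypothetical helper, not a Dremio tool.
# Assumption: every engine size requests (node CPUs - 2) CPUs per executor.

engine_footprint() (
  size=$1
  node_cpus=$2

  # The two CPU options target 16-CPU or 32-CPU nodes.
  case $node_cpus in
    16 | 32) ;;
    *) echo "unsupported CPU option: $node_cpus" >&2; exit 1 ;;
  esac

  # Executors per replica and memory per executor, copied from the table.
  case $size in
    2XSmall) executors=1;  memory_gb=56 ;;
    XSmall)  executors=1;  memory_gb=120 ;;
    Small)   executors=2;  memory_gb=120 ;;
    Medium)  executors=4;  memory_gb=120 ;;
    Large)   executors=8;  memory_gb=120 ;;
    XLarge)  executors=12; memory_gb=120 ;;
    2XLarge) executors=16; memory_gb=120 ;;
    3XLarge) executors=24; memory_gb=120 ;;
    4XLarge) executors=32; memory_gb=120 ;;
    *) echo "unknown engine size: $size" >&2; exit 1 ;;
  esac

  # Dremio subtracts 2 CPUs from the node size for each executor request.
  cpus_per_executor=$((node_cpus - 2))
  echo "$size: $executors executor(s) x $cpus_per_executor CPUs = $((executors * cpus_per_executor)) CPUs, $((executors * memory_gb)) GB memory"
)

# Example usage: total requests per replica for two sizes.
engine_footprint Medium 16
engine_footprint 4XLarge 32
```

Under these assumptions, a Medium engine on 16-CPU nodes requests 4 x 14 = 56 CPUs and 4 x 120 = 480 GB of memory per replica, which is a quick way to check whether a node pool can hold the engine.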
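Because a misconfigured pod metadata entry can keep executors from starting, it can help to check labels and annotations against the rules listed for them before entering them. The sketch below is one way to do that in POSIX-compatible shell. Both function names are hypothetical, the label check assumes the rules apply to each label string you type, and the annotation check assumes that 256 KB means 256 x 1024 bytes.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical and not part of Dremio or kubectl.
# Use byte-wise character ranges so that a-z matches only ASCII lowercase letters.
LC_ALL=C
export LC_ALL

# Checks one label string against the label rules.
is_valid_engine_label() {
  label=$1
  # Must be non-empty and at most 63 characters long.
  if [ -z "$label" ] || [ "${#label}" -gt 63 ]; then
    return 1
  fi
  case $label in
    *[!a-z0-9_-]*) return 1 ;;   # only lowercase alphanumerics, hyphen, underscore
    [-_]* | *[-_]) return 1 ;;   # must start and end with an alphanumeric character
  esac
  return 0
}

# Checks that one annotation is at most 256 KB (assumed: 256 * 1024 bytes).
is_valid_annotation_size() {
  bytes=$(( $(printf '%s' "$1" | wc -c) ))
  [ "$bytes" -le $((256 * 1024)) ]
}

# Example usage with made-up values.
if is_valid_engine_label "team_analytics"; then
  echo "label accepted"
fi
if is_valid_annotation_size '{"owner":"data-platform"}'; then
  echo "annotation size accepted"
fi
```

The 1 MB limit across all annotations of an engine is not covered here; summing the byte counts of every annotation would extend the same idea.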
## Engine Statuses

The following table describes each engine status:

| Status | Description |
| --- | --- |
| Starting | The engine is starting. This is the initial state of an engine after it is created. New queries are queued to be processed. |
| Running | The engine is running. New queries are queued and processed. |
| Stopping | The engine is stopping. Running queries will fail. New queries remain queued and can fail by timeout if the engine isn't started. |
| Stopped | The engine is stopped. New queries remain queued and can fail by timeout if the engine isn't started. |
| Recovering | The engine is recovering. New queries remain queued and can fail by timeout if the engine doesn't recover. |
| Failed | The engine failed. New queries remain queued and can fail by timeout if the engine doesn't start. |

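For scripts that need to branch on an engine status, the table above reduces to a simple rule: only a running engine processes new queries, and every other status leaves them queued. The sketch below restates that rule. The function name `new_query_outcome` is hypothetical, and the status strings are assumed to match the names shown in the table, which may differ from the values returned by the API.

```shell
#!/bin/bash
# Sketch only: new_query_outcome is a hypothetical helper, not a Dremio tool.
# Status names are taken from the table; API values may be spelled differently.

new_query_outcome() {
  case $1 in
    Running) echo "processed" ;;
    Starting | Stopping | Stopped | Recovering | Failed) echo "queued" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Example usage: print the outcome for each status in the table.
for status in Starting Running Stopping Stopped Recovering Failed; do
  printf '%s: new queries are %s\n' "$status" "$(new_query_outcome "$status")"
done
```

Queued queries can still fail by timeout, so a script that sees any status other than Running may want to wait or reroute queries instead of submitting them.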
## Related Topics

* [Engine Management API](/current/reference/api/engine-management/) - The API to manage your engines using REST API calls.
* [sys.engines](/current/reference/sql/system-tables/engines/) - The system table to query for information about your engines.
* [Audit Logs](/current/security/auditing/) - Audit logs for your engines.
