dremiojs 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. package/.eslintrc.json +14 -0
  2. package/.prettierrc +7 -0
  3. package/README.md +59 -0
  4. package/dremiodocs/dremio-cloud/cloud-api-reference.md +748 -0
  5. package/dremiodocs/dremio-cloud/dremio-cloud-about.md +225 -0
  6. package/dremiodocs/dremio-cloud/dremio-cloud-admin.md +3754 -0
  7. package/dremiodocs/dremio-cloud/dremio-cloud-bring-data.md +6098 -0
  8. package/dremiodocs/dremio-cloud/dremio-cloud-changelog.md +32 -0
  9. package/dremiodocs/dremio-cloud/dremio-cloud-developer.md +1147 -0
  10. package/dremiodocs/dremio-cloud/dremio-cloud-explore-analyze.md +2522 -0
  11. package/dremiodocs/dremio-cloud/dremio-cloud-get-started.md +300 -0
  12. package/dremiodocs/dremio-cloud/dremio-cloud-help-support.md +869 -0
  13. package/dremiodocs/dremio-cloud/dremio-cloud-manage-govern.md +800 -0
  14. package/dremiodocs/dremio-cloud/dremio-cloud-overview.md +36 -0
  15. package/dremiodocs/dremio-cloud/dremio-cloud-security.md +1844 -0
  16. package/dremiodocs/dremio-cloud/sql-docs.md +7180 -0
  17. package/dremiodocs/dremio-software/dremio-software-acceleration.md +1575 -0
  18. package/dremiodocs/dremio-software/dremio-software-admin.md +884 -0
  19. package/dremiodocs/dremio-software/dremio-software-client-applications.md +3277 -0
  20. package/dremiodocs/dremio-software/dremio-software-data-products.md +560 -0
  21. package/dremiodocs/dremio-software/dremio-software-data-sources.md +8701 -0
  22. package/dremiodocs/dremio-software/dremio-software-deploy-dremio.md +3446 -0
  23. package/dremiodocs/dremio-software/dremio-software-get-started.md +848 -0
  24. package/dremiodocs/dremio-software/dremio-software-monitoring.md +422 -0
  25. package/dremiodocs/dremio-software/dremio-software-reference.md +677 -0
  26. package/dremiodocs/dremio-software/dremio-software-security.md +2074 -0
  27. package/dremiodocs/dremio-software/dremio-software-v25-api.md +32637 -0
  28. package/dremiodocs/dremio-software/dremio-software-v26-api.md +36757 -0
  29. package/jest.config.js +10 -0
  30. package/package.json +25 -0
  31. package/src/api/catalog.ts +74 -0
  32. package/src/api/jobs.ts +105 -0
  33. package/src/api/reflection.ts +77 -0
  34. package/src/api/source.ts +61 -0
  35. package/src/api/user.ts +32 -0
  36. package/src/client/base.ts +66 -0
  37. package/src/client/cloud.ts +37 -0
  38. package/src/client/software.ts +73 -0
  39. package/src/index.ts +16 -0
  40. package/src/types/catalog.ts +31 -0
  41. package/src/types/config.ts +18 -0
  42. package/src/types/job.ts +18 -0
  43. package/src/types/reflection.ts +29 -0
  44. package/tests/integration_manual.ts +95 -0
  45. package/tsconfig.json +19 -0
@@ -0,0 +1,422 @@
1
+ # Dremio Software - Monitoring
2
+
5
+ ---
6
+
7
+ # Source: https://docs.dremio.com/current/admin/monitoring/
8
+
9
+ Version: current [26.x]
10
+
13
+ # Monitoring
14
+
15
+ As an administrator, you can monitor logs, usage, system telemetry, [jobs](/current/admin/monitoring/jobs/), and [Dremio nodes](/current/admin/monitoring/dremio-nodes).
16
+
17
+ As the [Dremio Shared Responsibility Models](/responsibility) outline, monitoring is a shared responsibility between Dremio and you. The Shared Responsibility Models lay out Dremio's responsibilities for providing monitoring technologies and logs and your responsibilities for implementation and use.
18
+
19
+ ## Logs
20
+
21
+ Logs are primarily for troubleshooting issues and monitoring the health of the deployment.
22
+
23
+ **Note:**
24
+
25
+ By default, Dremio uses the following locations to write logs:
26
+
27
+ * Tarball - `<DREMIO_HOME>/log`
28
+ * RPM - `/var/log/dremio`
29
+ * Kubernetes - `/opt/dremio/log`
30
+
31
+ ### Log Types
32
+
33
+ | Log Type | Description |
34
+ | --- | --- |
35
+ | Audit | The `audit.json` file tracks all activities that users perform within Dremio. For details, see [Audit Logging](/current/security/auditing/). |
36
+ | System | The following system logs are enabled by default: * `access.log`: HTTP access log for the Dremio web server; generated by coordinator nodes only. * `server.gc`: Garbage collection log. * `server.log` and `json/server.json`: Server logs generated in a text format (server.log) and json format (json/server.json). Users granted the `ADMIN` role can disable one of these formats. * `server.out`: Log for Dremio daemon standard out. * `metadata_refresh.log`: Log for refreshing metadata. * `tracker.json`: Tracker log. * `vacuum.json`: Log for the files scanned and deleted by `VACUUM CATALOG` and `VACUUM TABLE` commands. |
37
+ | Query | Query logging is enabled by default. The `queries.json` file contains the log of completed queries; it does not include queries currently in planning or execution. You can retrieve the same information that is in `queries.json` using the [`sys.jobs_recent`](/current/reference/sql/system-tables/jobs_recent) system table. Query logs include the following information: * `queryId`: Unique ID of the executed query. * `queryText`: SQL query text. * `start`: Start time of the query. * `finish`: End time of the query. * `outcome`: Whether the query was completed or failed. * `username`: User who executed the query. * `commandDescription`: Type of the command; may be a regular SQL query execution job or another SQL command. The query log may contain additional information depending on your Dremio configuration. |
38
+ | Warning | The `hive.deprecated.function.warning.log` file contains warnings for Hive functions that have been deprecated. To resolve warnings that are listed in this file, replace deprecated functions with a [supported function](/current/reference/sql/sql-functions/ALL_FUNCTIONS/). For example, to resolve a warning that mentions `NVL`, replace `NVL` with `COALESCE`. |
39
+
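As a sketch of how these log fields can be consumed, the following Python tallies outcomes and per-user activity from `queries.json`-style lines. The `summarize_queries` helper and the two sample records are illustrative, not part of Dremio; in a real deployment you would stream `<DREMIO_HOME>/log/queries.json` line by line.

```python
import json

# Hypothetical helper: tally query outcomes and per-user activity from
# queries.json, which stores one JSON object per line.
def summarize_queries(lines):
    outcomes, users = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        outcomes[record.get("outcome")] = outcomes.get(record.get("outcome"), 0) + 1
        users[record.get("username")] = users.get(record.get("username"), 0) + 1
    return outcomes, users

# Two synthetic records using the fields listed above (real entries carry more):
sample = [
    '{"queryId": "1a2b", "outcome": "COMPLETED", "username": "ada", "start": 1, "finish": 5}',
    '{"queryId": "3c4d", "outcome": "FAILED", "username": "bob", "start": 2, "finish": 3}',
]
outcomes, users = summarize_queries(sample)
```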
40
+ ### Retrieving Logs from the Dremio Console (Enterprise)
41
+
42
+ Retrieve logs for Kubernetes deployments in the Dremio console at **Settings** > **Support** > **Download Logs**.
43
+
44
+ #### Prerequisites
45
+
46
+ * You must be using Dremio 25.1+. Log collection is powered by Dremio Diagnostics Collector (DDC).
47
+ * You must have the EXPORT DIAGNOSTICS privilege to view **Download Logs** options in **Settings** > **Support**.
48
+
49
+ #### Downloading Logs
50
+
51
+ To download logs:
52
+
53
+ 1. In the Dremio console, navigate to **Settings** > **Support** > **Download Logs** and click **Start collecting data**.
54
+
55
+ **Note:**
56
+
57
+ You may store a maximum of three log bundles. Delete log bundles as needed to start a new log collection if you reach the maximum.
58
+
59
+ For troubleshooting most issues, we recommend the default `Light` collection, which provides 7 days of logs and completed queries in the `queries.json` file. For more complex issues, select the `Standard` collection, which provides 7 days of logs and 28 days of completed queries in the `queries.json` file.
60
+
61
+ 2. When Dremio completes log collection, the log bundle appears in a list below **Start collecting data**. To download a log bundle, click **Download** next to the applicable bundle. Log bundles are available to download for 24 hours.
62
+
63
+ ### Logging in Kubernetes
64
+
65
+ By default, all logs are written to a persisted volume mounted at `/opt/dremio/log`.
66
+
67
+ To disable logging, set `writeLogsToFile: false` in the `values-overrides.yaml` configuration file either globally or individually for each `coordinator` and `executor` parent. For more information, see [Configuring Your Values](/current/deploy-dremio/configuring-kubernetes/).
68
+
69
+ #### Using the Container Console
70
+
71
+ All logs are written to the container's console (stdout) simultaneously. These logs can be monitored using a `kubectl` command:
72
+
73
+ Command for viewing logs using kubectl logs
74
+
75
+ ```
76
+ kubectl logs [-f] [container-name]
77
+ ```
78
+
79
+ Use the `-f` flag to continuously print new log entries to your terminal as they are generated.
80
+
81
+ You can also write logs to a file on disk in addition to stdout. Read [Writing Logs to a File](https://github.com/dremio/dremio-cloud-tools/blob/master/charts/dremio_v2/docs/setup/Writing-Logs-To-A-File.md) for details.
82
+
83
+ #### Using the AKS Container
84
+
85
+ Azure provides integration between AKS clusters and Azure Log Analytics to monitor container logs. This standard practice puts infrastructure in place to aggregate logs from containers into a central log store for analysis.
86
+
87
+ AKS log monitoring is useful for the following reasons:
88
+
89
+ * Monitoring logs across many pods can be overwhelming.
90
+ * When a pod (for example, a Dremio executor) crashes and restarts, only the logs from the last pod are available.
91
+ * If a pod is crashing regularly, the logs are lost, which makes it difficult to analyze the reasons for the crash.
92
+
93
+ For more information regarding AKS, see [Azure Monitor features for Kubernetes monitoring](https://learn.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-overview).
94
+
95
+ #### Enabling Log Monitoring
96
+
97
+ You can enable log monitoring when creating an AKS cluster or after the cluster has been created.
98
+
99
+ Once logging is enabled, all your container `stdout` and `stderr` logs are collected by the infrastructure for you to analyze.
100
+
101
+ 1. While creating an AKS cluster, enable container monitoring. You can use an existing Log Analytics workspace or create a new one.
102
+ 2. In an existing AKS cluster where monitoring was not enabled during creation, go to **Logs on the AKS cluster** and enable it.
103
+
104
+ #### Viewing Container Logs
105
+
106
+ To view all the container logs:
107
+
108
+ 1. Go to **Monitoring** > **Logs**.
109
+ 2. Use the filter option to see the logs from the containers that you are interested in.
110
+
111
+ ## Usage
112
+
113
+ Monitoring usage across your cluster makes it easier to observe patterns, analyze the resources being consumed by your data platform, and understand the impact on your users.
114
+
115
+ ### Catalog Usage (Enterprise)
116
+
117
+ Go to **Settings** > **Monitor** to view your catalog usage. You must be a member of the `ADMIN` role to access the Monitor page. When you open the Monitor page, you are directed to the Catalog Usage tab by default where you can see the following metrics:
118
+
119
+ * Top 10 most queried datasets and how often the jobs on the dataset were accelerated
120
+ * Top 10 most queried spaces and source folders
121
+
122
+ **Note:**
123
+
124
+ A source can be listed in the top 10 most queried spaces and source folders if the source contains a child dataset that was used in the query (for example, `postgres.accounts`). Queries of datasets in sub-folders (for example, `s3.mybucket.iceberg_table`) are classified by the sub-folder and not the source.
125
+
126
+ All datasets are assessed in the metrics on the Monitor page except for datasets in the [system tables](/current/reference/sql/system-tables/), the [information schema](https://docs.dremio.com/current/reference/sql/information-schema), and home spaces.
127
+
128
+ The metrics on the Monitor page analyze only user queries. Refreshes of data Reflections and metadata refreshes are excluded.
129
+
130
+ ### Jobs (Enterprise)
131
+
132
+ Go to **Settings** > **Monitor** > **Jobs** to open the Jobs tab. You must be a member of the `ADMIN` role to access the Monitor page. The Jobs tab shows an aggregate view of the following metrics for the jobs that are running on your cluster:
133
+
134
+ * Total job count over the last 24 hours and the relative rate of failure/cancelation
135
+ * Top 10 most active users based on the number of jobs they ran
136
+ * Total jobs accelerated, total job time saved, and average job speedup from Autonomous Reflections over the past month.
137
+ * Total number of jobs accelerated by autonomous and manual Reflections over time
138
+ * Total number of completed and failed jobs over time
139
+ * Jobs (completed and failed) grouped by the [queue](/current/admin/workloads/workload-management/#default-queues) they ran on
140
+ * Percentage of time that jobs spent in each [state](/current/admin/monitoring/jobs/#job-states-and-statuses)
141
+ * Top 10 longest running jobs
142
+
143
+ To view all jobs and the details of specific jobs, see [Viewing Jobs](/current/admin/monitoring/jobs/).
144
+
145
+ ### Resources (Enterprise)
146
+
147
+ Go to **Settings** > **Monitor** > **Resources** to open the Resources tab. You must be a member of the `ADMIN` role to access the Monitor page. The Resources tab shows an aggregate view of the following metrics for the jobs and nodes running on your cluster:
148
+
149
+ * Percentage of CPU and memory utilization for each coordinator and executor node
150
+ * Top 10 most CPU and memory intensive jobs
151
+ * Number of running executors
152
+
153
+ ### Cluster Usage
154
+
155
+ For each day, Dremio displays the number of unique users who executed jobs and the total number of executed jobs. To view cluster usage:
156
+
157
+ 1. Hover over ![Icon represents help](/images/icons/help.png "Help icon") in the side navigation bar.
158
+ 2. Click **About Dremio** in the menu.
159
+ 3. Click the **Cluster Usage Data** tab.
160
+
161
+ ![](/images/cluster-usage.png "Viewing Cluster Usage")
162
+
163
+ ## System Telemetry
164
+
165
+ Dremio exposes system telemetry metrics in Prometheus format by default. It is not necessary to configure an exporter to collect the metrics. Instead, you can specify the host and port number where metrics are exposed in the [dremio.conf](/current/deploy-dremio/other-options/standalone/dremio-config/dremio-conf/index.md) file and scrape the metrics with any Prometheus-compliant tool.
166
+
167
+ To specify the host and port number where metrics are exposed, add these two properties to the `dremio.conf` file:
168
+
169
+ * `services.web-admin.host`: set to the desired host address (typically `0.0.0.0` or the IP address of the host where Dremio is running).
170
+ * `services.web-admin.port`: set to any desired value that is greater than `1024`.
171
+
172
+ For example:
173
+
174
+ Example host and port settings in dremio.conf
175
+
176
+ ```
177
+ services.web-admin.host: "127.0.0.1"
178
+ services.web-admin.port: 9090
179
+ ```
180
+
181
+ Restart Dremio after you update the `dremio.conf` file to make sure your changes take effect.
182
+
183
+ Access the exported Dremio system telemetry metrics at `http://<yourHost>:<yourPort>/metrics`.
184
+
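Any Prometheus-compliant tool can then scrape that endpoint. For example, a minimal Prometheus `scrape_configs` entry might look like the following sketch; the job name and interval are illustrative, and the target matches the example host and port above:

```yaml
scrape_configs:
  - job_name: "dremio"          # illustrative job name
    scrape_interval: 30s        # illustrative interval
    metrics_path: /metrics
    static_configs:
      - targets: ["127.0.0.1:9090"]
```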
185
+ For more information about Prometheus metrics, read [Types of Metrics](https://prometheus.io/docs/tutorials/understanding_metric_types/) in the Prometheus documentation.
186
+
206
+ ---
207
+
208
+ # Source: https://docs.dremio.com/current/admin/monitoring/jobs/
209
+
210
+ Version: current [26.x]
211
+
214
+ # Viewing Jobs
215
+
216
+ All jobs run in Dremio are listed on a separate page, showing the job ID, type, status, and other attributes.
217
+
218
+ To navigate to the Jobs page, click ![This is the icon that represents the Jobs page.](/images/cloud/jobs-page-icon.png "Jobs icon.") in the side navigation bar.
219
+
220
+ ## Search Filters and Columns
221
+
222
+ By default, the Jobs page lists the jobs run within the last 30 days and the jobs are filtered by **UI, External Tools** job types. To change these defaults for your account, you can filter on values and manage columns directly on the Jobs page, as shown in this image:
223
+
224
+ ![This is a screenshot showing the main components of the Jobs page.](/images/jobs-ui-map.png "This is a screenshot showing the main components of the Jobs page.")
225
+
226
+ a. **Search Jobs** by typing the username or job ID.
227
+
228
+ b. **Start Time** allows you to pick the date and time at which the job began.
229
+
230
+ c. **Status** represents one or more job states. For descriptions, see Job States and Statuses.
231
+
232
+ d. **Attribute** includes Accelerator, AI agent, AI function, Downloads, External Tools, Internal, MCP, and UI. For descriptions, see Query Types in the Job Attributes.
233
+
234
+ e. **User** can be searched by typing the username or checking the box next to the username in the dropdown.
235
+
236
+ f. **Manage Columns** by checking the boxes next to additional columns that you want to see in the Jobs list. The grayed out checkboxes show the columns that are required by default. You can also rearrange the column order by clicking directly on a column to drag and drop.
237
+
238
+ ## Job Attributes
239
+
240
+ Each job has the following attributes, which can appear as columns in the list of jobs:
241
+
242
+ | Attribute | Description |
243
+ | --- | --- |
244
+ | Accelerated | A lightning bolt icon in a row indicates that the job ran a query that was [accelerated](/current/acceleration/). |
245
+ | Dataset | The dataset queried, if one was queried. |
246
+ | Duration | The length of time (in seconds) that a job required from start to completion. |
247
+ | Job ID | A universally unique identifier. |
248
+ | Planner Cost Estimate | A cost estimate calculated by Dremio based on an evaluation of the resources to be used in the execution of a query. The number has no units and is intended to give an idea of the cost of executing a query relative to the costs of executing other queries. Values are derived by adding weighted estimates of required I/O, memory, and CPU load. In reported values, K = thousand, M = million, B = billion, and T = trillion. For example, a value of 12,543,765,321 is reported as 12.5B. |
249
+ | Planning Time | The length of time (in seconds) in which the query optimizer planned the execution of the query. |
250
+ | Query Type | Represents one of the following query types: * **UI** - queries issued from the SQL Runner in the Dremio UI. * **External Tools** - queries from client applications, such as Microsoft Power BI, Superset, Tableau, other third-party client applications, and custom applications. * **Accelerator** - queries related to creating, maintaining, and removing Reflections. * **Internal** - queries that Dremio submits for internal operations. * **Downloads** - queries used to download datasets. * **MCP** - queries issued from the Dremio MCP Server. * **AI agent** - queries issued from Dremio's AI Agent. * **AI function** – queries that call AI functions. |
251
+ | Queue | Dremio provides the following generic queues as a starting point for customization: * High Cost Reflections * High Cost User Queries * Low Cost Reflections * Low Cost User Queries * UI Previews |
252
+ | Rows Returned | Number of output records. |
253
+ | Rows Scanned | Number of input records. |
254
+ | SQL | The SQL query that was submitted for the job. |
255
+ | Start Time | The date and time at which the job began. |
256
+ | Status | An icon that represents one or more job states. This column is automatically shown at the start of each row. For descriptions, see [Job States and Statuses](#job-states-and-statuses). |
257
+ | User | Username of the user who ran the query and initiated the job. |
258
+
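The K/M/B/T abbreviation described for the Planner Cost Estimate can be sketched in Python. The one-decimal rounding is an assumption based on the 12.5B example, and `format_cost` is a hypothetical helper, not a Dremio API:

```python
# Hypothetical helper: abbreviate a planner cost estimate the way the
# Jobs page reports it (K/M/B/T; one decimal place is an assumption).
def format_cost(value):
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")):
        if value >= threshold:
            return f"{value / threshold:.1f}{suffix}"
    return str(value)

print(format_cost(12_543_765_321))  # 12.5B, matching the example above
```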
259
+ ## Job States and Statuses
260
+
261
+ Each job passes through a sequence of states until it is complete, though the sequence can be interrupted if a query is canceled or if there is an error during a state. In this diagram, the states that a job passes through are in white, and the possible end states are in dark gray.
262
+
263
+ ![](/assets/images/job-states-d8a1b49d0b4cef93a610cd185648e268.png)
264
+
265
+ This table lists the statuses that the UI lets you filter on and shows how they map to the states:
266
+
267
+ | Icon | Status | State | Description |
268
+ | --- | --- | --- | --- |
269
+ | | Setup | Pending | Represents a state where the query is waiting to be scheduled on the query pool. |
270
+ | | | Metadata Retrieval | Represents a state where metadata schema is retrieved and the SQL command is parsed. |
271
+ | | | Planning | Represents a state where the following are done: * Physical and logical planning * Reflection matching * Partition metadata retrieval * Mapping the query to a queue based on workload management rules * Picking the engine that will run the query |
272
+ | | Engine Start | Engine Start | Represents a state where the engine starts if it has stopped. If the engine is stopped, it takes time for it to restart and for the executors to become active. If the engine is already started, this state has no duration. |
273
+ | | Queued | Queued | Represents a state where a job is queued. Each queue has a limit on concurrent queries. If the queries in progress exceed the concurrency limit, the query must wait in the queue until the jobs in progress complete. |
274
+ | | Running | Execution Planning | Represents a state where executor nodes are selected from the chosen engine to run the query, and work is distributed to each executor. |
275
+ | | | Running | Represents a state where executor nodes execute and complete the fragments assigned to them. Typically, queries spend most of their time in this state. |
276
+ | | | Starting | Represents a state where the query is starting up. |
277
+ | | Canceled | Canceled | Represents a terminal state that indicates that the query is canceled by the user or an intervention in the system. |
278
+ | | Completed | Completed | Represents a terminal state that indicates that the query is successfully completed. |
279
+ | | Failed | Failed | Represents a terminal state that indicates that the query has failed due to an error. |
280
+
293
+ ---
294
+
295
+ # Source: https://docs.dremio.com/current/admin/monitoring/dremio-nodes
296
+
297
+ Version: current [26.x]
298
+
301
+ # Monitoring Dremio Nodes
302
+
303
+ There are various approaches for operational monitoring of Dremio nodes. This topic discusses both:
304
+
305
+ * Prometheus metrics, which can be leveraged with tools like Grafana to ensure the stability and performance of Dremio deployments.
306
+ * [`queries.json`](/current/admin/monitoring/#log-types), a log file generated by Dremio, which can be used to calculate various service-level agreements (SLAs) related to query performance.
307
+
308
+ While these two datasets can be used in similar ways, Prometheus metrics are less granular than `queries.json`; the latter lets you drill down into which specific kinds of queries or users are experiencing SLA breaches.
309
+
310
+ ### Enabling Node Metrics
311
+
312
+ Dremio enables node monitoring by default. Starting in Dremio 26.0, each node in the cluster exposes Prometheus metrics via the `/metrics` endpoint on port 9010.
313
+
314
+ ### Available Prometheus Metrics
315
+
316
+ The following table describes the Prometheus metrics provided by Dremio and specifies which Dremio node roles support them:
317
+
318
+ | Metric Name | Description | Main Coordinator | Scale-out Coordinator | Executor |
319
+ | --- | --- | --- | --- | --- |
320
+ | `jobs_active` | Gauge showing the number of currently active jobs | Yes | Yes | No |
321
+ | `jobs_total` | Counter of total jobs submitted, categorized by the type of query | Yes | Yes | No |
322
+ | `jobs.failed` | Counter of failed jobs categorized by query types | Yes | No | No |
323
+ | `jobs.waiting` | Gauge of currently waiting jobs categorized by queue | Yes | No | No |
324
+ | `dremio.memory.jvm_direct_current` | Total direct memory (in bytes) given to the JVM | Yes | Yes | Yes |
325
+ | `memory.heap.committed` | Committed heap memory as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
326
+ | `memory.heap.init` | Initialized heap memory as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
327
+ | `memory.heap.max` | Maximum amount of heap memory that can be allocated as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
328
+ | `memory.heap.usage` | Ratio of used heap memory to max heap memory, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
329
+ | `memory.heap.used` | Amount of used heap memory, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
330
+ | `memory.non-heap.committed` | Amount of non-heap memory committed, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
331
+ | `memory.non-heap.init` | Initialized non-heap memory as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
332
+ | `memory.non-heap.max` | Maximum amount of non-heap memory that can be allocated, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
333
+ | `memory.non-heap.usage` | Ratio of used non-heap memory to max non-heap memory, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
334
+ | `memory.non-heap.used` | Amount of used non-heap memory, as described in [Class MemoryUsage](https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryUsage.html) in the Oracle documentation | Yes | Yes | Yes |
335
+ | `memory.total.committed` | Sum of heap and non-heap committed memory (in bytes) | Yes | Yes | Yes |
336
+ | `memory.total.init` | Sum of heap and non-heap initialized memory (in bytes) | Yes | Yes | Yes |
337
+ | `memory.total.max` | Sum of the heap and non-heap max memory (in bytes) | Yes | Yes | Yes |
338
+ | `memory.total.used` | Sum of the heap and non-heap used memory (in bytes) | Yes | Yes | Yes |
339
+ | `reflections.failed` | Scheduled Reflections that have failed and won't be retried | Yes | No | No |
340
+ | `reflections.unknown` | Reflections for which an error occurred in the Reflection manager and that could not be retried | Yes | No | No |
341
+ | `reflections.active` | Currently active Reflections | Yes | No | No |
342
+ | `reflections.refreshing` | Reflections that are currently refreshing or pending a refresh | Yes | No | No |
343
+ | `reflections.manager_sync` | Time taken to run Reflection management | Yes | Yes | No |
344
+ | `threads.blocked.count` | Gauge of currently blocked threads | Yes | Yes | Yes |
345
+ | `threads.count` | Gauge of active and idle threads | Yes | Yes | Yes |
346
+ | `threads.daemon.count` | Number of currently available active daemon threads | Yes | Yes | Yes |
347
+ | `threads.deadlock.count` | Number of currently deadlocked threads | Yes | Yes | Yes |
348
+ | `threads.new.count` | Current number of threads in new state (not yet started) | Yes | Yes | Yes |
349
+ | `threads.runnable.count` | Current number of threads in runnable state (executing) | Yes | Yes | Yes |
350
+ | `threads.terminated.count` | Current number of threads in the terminated state (completed execution) | Yes | Yes | Yes |
351
+ | `threads.timed_waiting.count` | Current number of threads in the timed\_waiting state | Yes | Yes | Yes |
352
+ | `threads.waiting.count` | Current number of threads in the waiting state | Yes | Yes | Yes |
353
+ | `jvm.gc.overhead.percent` | An approximate percentage of CPU time used by garbage collection activities | Yes | Yes | No |
354
+
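Rather than configuring a full Prometheus server, you can sanity-check the endpoint by parsing the text exposition format directly. The sketch below assumes unlabeled gauge samples; the metric names in the sample payload are illustrative (the exposed names may differ from the dotted names in the table), and `parse_gauges` is a hypothetical helper:

```python
# Hypothetical helper: extract {metric_name: value} pairs from the
# Prometheus text format exposed on port 9010 (simple, unlabeled samples).
def parse_gauges(text):
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE metadata
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue
    return samples

# Illustrative payload; fetch the real one from http://<node>:9010/metrics.
sample_payload = """\
# TYPE memory_heap_usage gauge
memory_heap_usage 0.42
# TYPE jobs_active gauge
jobs_active 3
"""
gauges = parse_gauges(sample_payload)
high_heap = gauges.get("memory_heap_usage", 0.0) > 0.85  # example alert rule
```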
355
+ ## Parameters to Monitor for Scaling Capacity
356
+
357
+ The following parameters, derived from `queries.json`, can help identify when additional engines or vertical scaling are needed to maintain performance.
358
+
359
+ ### Query Execution Errors
360
+
361
+ By reviewing the `outcomeReason` field in `queries.json`, you can identify resource-related issues and take proactive steps, such as scaling engines or redistributing workloads, to maintain performance and stability.
362
+
363
+ | Error Type (`outcomeReason`) | Recommended Threshold | Action |
364
+ | --- | --- | --- |
365
+ | `OUT_OF_MEMORY` | 1% of queries running out of direct memory | Add an engine and move workload |
366
+ | `RESOURCE ERROR` | 1% of queries running out of heap memory | Add an engine and move workload |
367
+ | `ExecutionSetupException` | 1% of queries exhibiting node disconnects | Add an engine and move workload |
368
+ | `ChannelClosedException (fabric server)` | 1% of queries exhibiting node disconnects | Add an engine and move workload |
369
+ | `CONNECTION ERROR: Exceeded timeout` | 1% of queries exhibiting node disconnects | Add an engine and move workload |
370
+
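The 1% thresholds above can be checked programmatically. This sketch assumes `queries.json` records have already been loaded as dictionaries; `breached_thresholds` is a hypothetical helper and the records are synthetic:

```python
THRESHOLD = 0.01  # 1% of queries, per the table above

# Hypothetical helper: return the outcomeReason values whose share of all
# queries exceeds the 1% threshold.
def breached_thresholds(records):
    total = len(records)
    counts = {}
    for record in records:
        reason = record.get("outcomeReason")
        if reason:
            counts[reason] = counts.get(reason, 0) + 1
    return {reason for reason, n in counts.items() if n / total > THRESHOLD}

# Synthetic sample: 100 queries, 2% OUT_OF_MEMORY (breach), 1% RESOURCE ERROR (no breach).
records = [{"outcome": "COMPLETED"}] * 97 + [
    {"outcome": "FAILED", "outcomeReason": "OUT_OF_MEMORY"},
    {"outcome": "FAILED", "outcomeReason": "OUT_OF_MEMORY"},
    {"outcome": "FAILED", "outcomeReason": "RESOURCE ERROR"},
]
breaches = breached_thresholds(records)
```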
371
+ ### Job State Durations
372
+
373
+ Use the [job state durations](/current/admin/monitoring/jobs/#job-states-and-statuses) (provided in milliseconds) in `queries.json` to address SLA breaches.
374
+
375
+ | Job State (`queries.json`) | Recommended Threshold | Action |
376
+ | --- | --- | --- |
377
+ | Total Duration (*finish - start*) | p90 SLA aligns with your needs | Add all the states below |
378
+ | Pending (*pendingTime*) | p90 should not exceed 2000 milliseconds | Vertically scale the main coordinator node |
379
+ | Metadata Retrieval (*metadataRetrievalTime*) | p90 should not exceed 5000 milliseconds | Switch to a table format if the raw data is Parquet |
380
+ | Planning (*planningTime*) | p90 should not exceed 2000 milliseconds | Vertically scale the main coordinator node |
381
+ | Queued (*queuedTime*) | p90 should not exceed 2000 milliseconds | Add an engine and move workload |
382
+ | Execution Planning (*executionPlanningTime*) | p90 should not exceed 2000 milliseconds | Vertically scale the main coordinator node |
383
+ | Starting (*startingTime*) | p90 should not exceed 2000 milliseconds | Add an engine and move workload |
384
+ | Running (*runningTime*) | p90 SLA aligns with your needs | Add an engine and move workload |
385
+
386
+ *All italicized values can be found in `queries.json` as represented in the parentheses above.*
387
+
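A p90 check against these thresholds can be sketched as follows. Nearest-rank is one common percentile convention (your monitoring stack may compute p90 differently), and the duration values below are synthetic:

```python
import math

# Nearest-rank 90th percentile of a list of durations (one common convention).
def p90(values):
    ordered = sorted(values)
    rank = math.ceil(0.9 * len(ordered))
    return ordered[rank - 1]

# Synthetic pendingTime samples (milliseconds) pulled from queries.json records.
pending = [100, 150, 200, 250, 300, 350, 400, 450, 500, 2600]
exceeds = p90(pending) > 2000  # 2000 ms is the pendingTime threshold above
```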
402
+ ---
403
+
404
+ # Source: https://docs.dremio.com/current/admin/monitoring/exporting/
405
+
406
+ Version: current [26.x]
407
+
408
+ # Exporting Logs
409
+
410
+ You can use a monitoring tool to collect logs from Dremio and forward them to a central location for analysis. See the steps for installing a monitoring tool that integrates with the following platforms:
411
+
412
+ * [Amazon S3 and Azure Blob Storage](/current/admin/monitoring/exporting/aws-azure-storage)
413
+ * [Datadog](/current/admin/monitoring/exporting/datadog)
414
+ * [Splunk](/current/admin/monitoring/exporting/splunk)
415
+