dremiojs 1.0.0
- package/.eslintrc.json +14 -0
- package/.prettierrc +7 -0
- package/README.md +59 -0
- package/dremiodocs/dremio-cloud/cloud-api-reference.md +748 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-about.md +225 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-admin.md +3754 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-bring-data.md +6098 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-changelog.md +32 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-developer.md +1147 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-explore-analyze.md +2522 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-get-started.md +300 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-help-support.md +869 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-manage-govern.md +800 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-overview.md +36 -0
- package/dremiodocs/dremio-cloud/dremio-cloud-security.md +1844 -0
- package/dremiodocs/dremio-cloud/sql-docs.md +7180 -0
- package/dremiodocs/dremio-software/dremio-software-acceleration.md +1575 -0
- package/dremiodocs/dremio-software/dremio-software-admin.md +884 -0
- package/dremiodocs/dremio-software/dremio-software-client-applications.md +3277 -0
- package/dremiodocs/dremio-software/dremio-software-data-products.md +560 -0
- package/dremiodocs/dremio-software/dremio-software-data-sources.md +8701 -0
- package/dremiodocs/dremio-software/dremio-software-deploy-dremio.md +3446 -0
- package/dremiodocs/dremio-software/dremio-software-get-started.md +848 -0
- package/dremiodocs/dremio-software/dremio-software-monitoring.md +422 -0
- package/dremiodocs/dremio-software/dremio-software-reference.md +677 -0
- package/dremiodocs/dremio-software/dremio-software-security.md +2074 -0
- package/dremiodocs/dremio-software/dremio-software-v25-api.md +32637 -0
- package/dremiodocs/dremio-software/dremio-software-v26-api.md +36757 -0
- package/jest.config.js +10 -0
- package/package.json +25 -0
- package/src/api/catalog.ts +74 -0
- package/src/api/jobs.ts +105 -0
- package/src/api/reflection.ts +77 -0
- package/src/api/source.ts +61 -0
- package/src/api/user.ts +32 -0
- package/src/client/base.ts +66 -0
- package/src/client/cloud.ts +37 -0
- package/src/client/software.ts +73 -0
- package/src/index.ts +16 -0
- package/src/types/catalog.ts +31 -0
- package/src/types/config.ts +18 -0
- package/src/types/job.ts +18 -0
- package/src/types/reflection.ts +29 -0
- package/tests/integration_manual.ts +95 -0
- package/tsconfig.json +19 -0
|
@@ -0,0 +1,800 @@
|
|
|
1
|
+
# Manage and Govern Your Data | Dremio Documentation
|
|
2
|
+
|
|
3
|
+
Original URL: https://docs.dremio.com/dremio-cloud/manage-govern/
|
|
4
|
+
|
|
6
|
+
|
|
7
|
+
Data management focuses on the operational efficiency, performance, and reliability of your data at scale. With Dremio’s autonomous management capabilities, many of these processes are automated, reducing manual effort and ensuring consistent optimization. Dremio automates table optimization by merging small files into optimally sized ones (typically around 512 MB), reducing metadata overhead, and reclaiming storage by physically removing deleted rows. It also reorganizes data to align with clustering specifications, keeping query performance consistent across large datasets. Together, these autonomous management features help keep your lakehouse fast, efficient, and cost-effective.
|
|
8
|
+
|
|
9
|
+
Data governance is the foundation of a secure, reliable, and compliant lakehouse. It ensures that data across your environment is accurate, consistent, and properly controlled throughout its lifecycle. With Dremio, you can implement robust governance practices by maintaining complete data lineage for transparency and auditability, defining role-based and fine-grained (row-access and column-masking) access controls on data, and using documentation and tags to improve data discoverability. Together, these capabilities enable trustworthy, well-governed data that fuels analytics and AI with confidence.
|
|
10
|
+
|
|
11
|
+
## Autonomous Management
|
|
12
|
+
|
|
13
|
+
### Optimization
|
|
14
|
+
|
|
15
|
+
Managing [Apache Iceberg tables](/dremio-cloud/manage-govern/optimization/) is critical to maintaining fast and predictable query performance, especially for agentic AI workloads that demand low latency. As new data is ingested and tables are updated, metadata and small data files accumulate, leading to performance degradation over time. Dremio automates table optimization by merging small files into optimally sized ones (typically ~512 MB), reducing metadata overhead, organizing data to align with the clustering specification, and reclaiming storage by physically removing deleted rows.
|
|
16
|
+
|
|
17
|
+
### Clustering
|
|
18
|
+
|
|
19
|
+
Dremio also reorganizes data to align with [clustering](/dremio-cloud/manage-govern/optimization/) specifications, ensuring consistent, high-performance queries at scale.
|
|
20
|
+
|
|
21
|
+
### Materialize and Query Rewrite
|
|
22
|
+
|
|
23
|
+
Dremio can autonomously materialize datasets using Reflections, which are precomputed and optimized copies of source data or query results, designed to speed up query performance. Dremio's query optimizer can accelerate a query against tables or views by using one or more Reflections to partially or entirely satisfy that query, rather than processing the raw data in the underlying data source. Queries do not need to reference Reflections directly. Instead, Dremio rewrites queries on the fly to use the Reflections that satisfy them. For more information, see [Reflections](/dremio-cloud/admin/performance/autonomous-reflections/).
|
|
24
|
+
|
|
25
|
+
## Governance
|
|
26
|
+
|
|
27
|
+
### Lineage
|
|
28
|
+
|
|
29
|
+
Track and visualize how data flows through your lakehouse, from source to consumption. [Lineage](/dremio-cloud/manage-govern/lineage/) helps you understand data origins, track transformations, identify dependencies, and perform impact analysis.
|
|
30
|
+
|
|
31
|
+
### Wikis
|
|
32
|
+
|
|
33
|
+
Enrich data understanding by documenting datasets with wikis. Use Generative AI to automatically generate [wikis](/dremio-cloud/manage-govern/wikis-labels/), reducing manual documentation effort. Wikis are used by Dremio's AI Agent to understand the semantics of your environment and adhere to these definitions in response to user prompts.
|
|
34
|
+
|
|
35
|
+
### Labels
|
|
36
|
+
|
|
37
|
+
Enhance data discoverability and searchability by categorizing datasets with labels. Use Generative AI to automatically generate [labels](/dremio-cloud/manage-govern/wikis-labels/), reducing manual cataloging effort.
|
|
38
|
+
|
|
39
|
+
### Role-Based Access Control Policies
|
|
40
|
+
|
|
41
|
+
Manage access to datasets through [roles](/dremio-cloud/security/roles) rather than individual user grants for easier administration. Assign [privileges](/dremio-cloud/security/privileges) to roles, simplifying management and ensuring users only have access to what they need to perform their job.
|
|
42
|
+
|
|
43
|
+
### Row-Access and Column-Masking Policies
|
|
44
|
+
|
|
45
|
+
Apply fine-grained access controls to protect sensitive data using row-access and column-masking policies. Control access to specific rows and columns based on rules and conditions to maintain compliance and adhere to regulatory requirements. For more information, see [Row-Access & Column-Masking Policies](/dremio-cloud/manage-govern/row-column-policies/).
|
|
46
|
+
|
|
47
|
+
## Related Topics
|
|
48
|
+
|
|
49
|
+
* [Roles](/dremio-cloud/security/roles) – Manage role-based access control.
|
|
50
|
+
* [Explore and Analyze Your Data](/dremio-cloud/explore-analyze/) - Explore and analyze your governed data.
|
|
51
|
+
* [Catalog API - Lineage](/dremio-cloud/api/catalog/lineage/) - Retrieve lineage information about datasets.
|
|
52
|
+
|
|
66
|
+
|
|
67
|
+
<div style="page-break-after: always;"></div>
|
|
68
|
+
|
|
69
|
+
# Lineage | Dremio Documentation
|
|
70
|
+
|
|
71
|
+
Original URL: https://docs.dremio.com/dremio-cloud/manage-govern/lineage
|
|
72
|
+
|
|
74
|
+
|
|
75
|
+
Lineage provides a graph of a dataset's relationships (its source, parent datasets, and child datasets) that illustrates how datasets are connected, where the data originates, and how it moves and is transformed.
|
|
76
|
+
|
|
77
|
+
By default, the lineage graph focuses on the initially selected dataset and its relationships with other datasets, represented as nodes that display the dataset name and path. To view additional metadata, use the **Show/hide layers** options.
|
|
78
|
+
|
|
79
|
+
If you wish to track lineage for a different dataset node, the lineage graph needs to be refocused. To refocus the lineage graph on a different dataset, you can either click  or  on the right of the dataset name, and then select **Focus on this dataset**.
|
|
80
|
+
|
|
81
|
+

|
|
82
|
+
|
|
83
|
+
## Privileges Required for Lineage
|
|
84
|
+
|
|
85
|
+
* If you have the `SELECT` privilege on the parent datasets and the child datasets, you can see the parent datasets and data sources on the left. The child datasets appear on the right.
|
|
86
|
+
* If you have only the `READ METADATA` privilege on the parent and child datasets, then you can only see limited metadata for these datasets.
|
|
87
|
+
* If you do not have the `SELECT` or the `READ METADATA` privilege on the parent and child datasets, they are not visible.
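The three privilege rules above amount to a small decision function. The following TypeScript sketch is illustrative only: the type names and the `lineageVisibility` function are hypothetical and not part of any Dremio API.

```typescript
// Hypothetical types for illustration; Dremio does not expose these names.
type Privilege = "SELECT" | "READ METADATA";
type LineageVisibility = "visible" | "limited-metadata" | "hidden";

// Maps the privileges a user holds on a parent or child dataset
// to how that dataset appears in the lineage graph.
function lineageVisibility(privileges: ReadonlySet<Privilege>): LineageVisibility {
  if (privileges.has("SELECT")) {
    return "visible"; // dataset is shown in the graph
  }
  if (privileges.has("READ METADATA")) {
    return "limited-metadata"; // only limited metadata is shown
  }
  return "hidden"; // dataset does not appear at all
}
```

The `SELECT` check comes first because the source describes limited metadata only for users who hold `READ METADATA` alone.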
|
|
88
|
+
|
|
89
|
+
## Lineage Refresh with Dataset Schema Changes
|
|
90
|
+
|
|
91
|
+
For datasets in Iceberg REST catalogs, the lineage graphs are stored in Dremio's metadata cache, which is automatically refreshed at fixed time intervals. For more information, see [Metadata Refresh](/dremio-cloud/bring-data/connect/catalogs/iceberg-rest-catalog/#metadata). It is possible that the lineage graph might show an outdated schema for the dataset if the dataset schema has been recently updated and Dremio's metadata cache has not yet been refreshed.
|
|
92
|
+
|
|
97
|
+
|
|
98
|
+
<div style="page-break-after: always;"></div>
|
|
99
|
+
|
|
100
|
+
# Automatic Optimization | Dremio Documentation
|
|
101
|
+
|
|
102
|
+
Original URL: https://docs.dremio.com/dremio-cloud/manage-govern/optimization
|
|
103
|
+
|
|
105
|
+
|
|
106
|
+
As [Apache Iceberg](/dremio-cloud/developer/data-formats/iceberg) tables are written to and updated, data and metadata files accumulate, which can affect query performance. For example, small files produced by data ingestion jobs slow queries because the query engine must read more files.
|
|
107
|
+
|
|
108
|
+
To optimize performance, Dremio automates table maintenance in the Open Catalog. This process compacts small files into larger ones, partitions data based on the values of a table's columns, rewrites manifest files, removes position delete files, and clusters tables—improving query speed while reducing storage costs.
|
|
109
|
+
|
|
110
|
+
Automatic optimization runs on a dedicated engine configured by Dremio, ensuring peak performance without impacting project query workloads.
|
|
111
|
+
|
|
112
|
+
When Dremio optimizes a table, it evaluates file sizes, partition layout, and metadata organization to reduce I/O and metadata overhead. Optimization consists of five main operations: clustering, data file compaction, partition evolution, manifest file rewriting, and position delete file removal.
|
|
113
|
+
|
|
114
|
+
## Clustering
|
|
115
|
+
|
|
116
|
+
Iceberg clustering sorts individual records in data files based on the clustered columns provided in the [`CREATE TABLE`](/dremio-cloud/sql/commands/create-table/) or [`ALTER TABLE`](/dremio-cloud/sql/commands/alter-table/) statement.
|
|
117
|
+
|
|
118
|
+
To cluster a table, you must first define the clustering keys. Then, automatic optimization uses the clustering keys to optimize tables. For details, see [Clustering](/dremio-cloud/developer/data-formats/iceberg/#clustering).
|
|
119
|
+
|
|
120
|
+
## Data File Compaction
|
|
121
|
+
|
|
122
|
+
Iceberg tables that are constantly being updated can have data files of various sizes. As a result, query performance can be negatively affected by sub-optimal file sizes. The optimal file size in Dremio is 256 MB.
|
|
123
|
+
|
|
124
|
+
Dremio logically combines smaller files and splits larger ones to 256 MB (see the following graphic), helping to reduce metadata overhead and costs related to opening and reading files.
|
|
125
|
+
|
|
126
|
+

|
|
127
|
+
|
|
128
|
+
## Partition Evolution
|
|
129
|
+
|
|
130
|
+
To improve read or write performance, data is partitioned based on the values of a table's columns. If the partition columns change over time, query performance can suffer when data files written under an older partition specification no longer match the current one. Dremio detects and rewrites these files to align with the current partition specification. This operation is used:
|
|
131
|
+
|
|
132
|
+
* When select partitions are queried more often or are of more importance (than others), and it's not necessary to optimize the entire table.
|
|
133
|
+
* When select partitions are more active and are constantly being updated. Optimization should only occur when activity is low or paused.
|
|
134
|
+
|
|
135
|
+
## Manifest File Rewriting
|
|
136
|
+
|
|
137
|
+
Iceberg uses metadata files (or manifests) to track point-in-time snapshots by maintaining all deltas as a table. This metadata layer functions as an index over a table’s data and the manifest files contained in this layer speed up query planning and prune unnecessary data files. For Iceberg tables that are constantly being updated (such as the ingestion of streaming data or users performing frequent DML operations), the number of manifest files that are suboptimal in size can grow over time. Additionally, the clustering of metadata entries in these files may not be optimal. As a result, suboptimal manifests can impact the time it takes to plan and execute a query.
|
|
138
|
+
|
|
139
|
+
Dremio rewrites these manifest files quickly based on size criteria. The target size for a manifest file is taken from the Iceberg table's properties. If no target size is set, Dremio defaults to 8 MB. Dremio considers manifest files between 0.75x and 1.8x of the target size, inclusive, to be optimal. Manifest files larger than 1.8x the target size are split, and files smaller than 0.75x the target size are compacted.
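The size criteria can be expressed as a small classifier. The sketch below is illustrative: the `classifyManifest` function and its return values are hypothetical names, and only the 8 MB default and the 0.75x to 1.8x range come from the text above.

```typescript
type ManifestAction = "compact" | "keep" | "split";

// Default target size used when the table does not set one (in MB).
const DEFAULT_TARGET_MB = 8;

// Classifies one manifest file by its size relative to the target size.
function classifyManifest(sizeMb: number, targetMb: number = DEFAULT_TARGET_MB): ManifestAction {
  const lowerBoundMb = 0.75 * targetMb; // inclusive lower bound of the optimal range
  const upperBoundMb = 1.8 * targetMb;  // inclusive upper bound of the optimal range

  if (sizeMb < lowerBoundMb) return "compact";
  if (sizeMb > upperBoundMb) return "split";
  return "keep";
}
```

With the 8 MB default, the optimal range works out to 6 MB through 14.4 MB, so a 5 MB manifest is a compaction candidate and a 15 MB manifest is split.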
|
|
140
|
+
|
|
141
|
+
This operation results in the optimization of the metadata, helping to reduce query planning time.
|
|
142
|
+
|
|
143
|
+
## Position Delete Files
|
|
144
|
+
|
|
145
|
+
Iceberg v2 added delete files, which encode the rows that have been deleted in existing data files. This enables you to delete or replace individual rows in immutable data files without the need to rewrite those files. [Position delete files](https://iceberg.apache.org/spec/#position-delete-files) identify deleted rows by file and position in one or more data files, as shown in the following example.
|
|
146
|
+
|
|
147
|
+
| `file_path` | `pos` |
|
|
148
|
+
| --- | --- |
|
|
149
|
+
| `file:/Users/test.user/Downloads/gen_tables/orders_with_deletes/data/2021/2021-00.parquet` | `6` |
|
|
150
|
+
| `file:/Users/test.user/Downloads/gen_tables/orders_with_deletes/data/2021/2021-00.parquet` | `16` |
|
|
151
|
+
|
|
152
|
+
Dremio can optimize Iceberg tables containing position delete files. This is beneficial because, when data files are read, the associated delete files must be held in memory. Also, one data file can be linked to several delete files, which can increase read time.
|
|
153
|
+
|
|
154
|
+
When tables are optimized in Dremio, the position delete files are removed and the data files that are linked to them are rewritten. Data files are rewritten if any of the following conditions are met:
|
|
155
|
+
|
|
156
|
+
* The file size is not within the optimum range.
|
|
157
|
+
* The partition's specification is not current.
|
|
158
|
+
* The data file has an attached delete file.
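The rewrite conditions above combine with a logical OR. In the minimal sketch below, the `DataFileInfo` shape and `shouldRewriteDataFile` function are hypothetical, and each condition is passed in as a precomputed flag because the source does not define the exact bounds of the optimum size range.

```typescript
// Hypothetical per-file summary; field names are illustrative only.
interface DataFileInfo {
  sizeWithinOptimumRange: boolean;
  partitionSpecIsCurrent: boolean;
  hasAttachedDeleteFile: boolean;
}

// A data file is rewritten if any one of the three conditions holds.
function shouldRewriteDataFile(file: DataFileInfo): boolean {
  return (
    !file.sizeWithinOptimumRange ||
    !file.partitionSpecIsCurrent ||
    file.hasAttachedDeleteFile
  );
}
```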
|
|
159
|
+
|
|
160
|
+
## Related Topics
|
|
161
|
+
|
|
162
|
+
* [Apache Iceberg](/dremio-cloud/developer/data-formats/iceberg) – Learn more about the Apache Iceberg table format.
|
|
163
|
+
* [Load Data Into Tables](/dremio-cloud/bring-data/load/) – Load data from CSV, JSON, or Parquet files into existing Iceberg tables.
|
|
164
|
+
|
|
173
|
+
|
|
174
|
+
<div style="page-break-after: always;"></div>
|
|
175
|
+
|
|
176
|
+
# Wikis and Labels | Dremio Documentation
|
|
177
|
+
|
|
178
|
+
Original URL: https://docs.dremio.com/dremio-cloud/manage-govern/wikis-labels
|
|
179
|
+
|
|
181
|
+
|
|
182
|
+
Wikis and labels help users document, organize, and discover datasets within the Open Catalog. This page explains how to manage wikis and labels, as well as how Dremio’s Generative AI features can assist in generating wikis and labels for you.
|
|
183
|
+
|
|
184
|
+
## Wikis
|
|
185
|
+
|
|
186
|
+
Wikis for datasets provide an efficient way to document and describe datasets within the Open Catalog. These wikis enable users to add comprehensive information, context, and relevant details about the datasets they manage.
|
|
187
|
+
With a user-friendly, rich text editor, the wikis support [GitHub Flavored Markdown](https://github.github.com/gfm/), allowing users to format content easily and enhance readability.
|
|
188
|
+
Wikis ensure that dataset documentation is both accessible and structured, making it simpler for teams to understand the datasets and how to work with them effectively.
|
|
189
|
+
|
|
190
|
+

|
|
191
|
+
|
|
192
|
+
### Manage Wikis
|
|
193
|
+
|
|
194
|
+
**Note:** Ensure you have sufficient [Role-Based Access Control (RBAC) privileges](/dremio-cloud/security/privileges/) to view or edit wikis.
|
|
197
|
+
|
|
198
|
+
To view or edit the wiki for a dataset in the Dremio console:
|
|
199
|
+
|
|
200
|
+
1. On the Datasets page, navigate to the folder where your dataset is stored.
|
|
201
|
+
2. Hover over your dataset, and on the right-hand side, click the  icon.
|
|
202
|
+
3. Click **Open Details Panel**.
|
|
203
|
+
* You can edit the dataset wiki by clicking **Edit Wiki**, writing your wiki content, and clicking **Save**.
|
|
204
|
+
|
|
205
|
+
## Labels
|
|
206
|
+
|
|
207
|
+
Labels for datasets offer a powerful way to organize and retrieve datasets within a data catalog. By creating and assigning labels to datasets, users can easily search and filter through large collections of related datasets.
|
|
208
|
+
Labels also enhance the search experience, allowing users to quickly locate datasets associated with a specific label. By clicking on a label, users can initiate a search that brings up all datasets linked to that label, streamlining the process of finding relevant data and improving overall data management.
|
|
209
|
+
|
|
210
|
+
The following image shows a dataset in the catalog with several labels and a brief wiki. In this example, the label "pii-data" was used in the search field to narrow the results to a customer dataset that contains Personally Identifiable Information (PII).
|
|
211
|
+
|
|
212
|
+

|
|
213
|
+
|
|
214
|
+
### Manage Labels
|
|
215
|
+
|
|
216
|
+
**Note:** Ensure you have sufficient [Role-Based Access Control (RBAC) privileges](/dremio-cloud/security/privileges/) to view or edit labels.
|
|
219
|
+
|
|
220
|
+
To view or edit the labels for a dataset in the Dremio console:
|
|
221
|
+
|
|
222
|
+
1. On the Datasets page, navigate to the folder where your dataset is stored.
|
|
223
|
+
2. Hover over your dataset, and on the right-hand side, click the  icon.
|
|
224
|
+
3. Click **Open Details Panel**.
|
|
225
|
+
* You can add a label by clicking on the  icon, typing a label name (e.g., `PII`), and pressing **Enter**.
|
|
226
|
+
|
|
227
|
+
## Generate Labels and Wikis (Preview)
|
|
228
|
+
|
|
229
|
+
To help eliminate the need for manual profiling and cataloging, you can use Generative AI to generate labels and wikis for your datasets.
|
|
230
|
+
|
|
231
|
+
**Note:** If you haven't opted into the Generative AI features, see [Dremio Preferences](/dremio-cloud/admin/projects/preferences) for the steps to enable them.
|
|
234
|
+
|
|
235
|
+
#### Generate Labels
|
|
236
|
+
|
|
237
|
+
To generate a label, Generative AI bases its understanding on your schema and considers labels that have been previously generated as well as labels created by other users.
|
|
238
|
+
|
|
239
|
+
To generate labels:
|
|
240
|
+
|
|
241
|
+
1. Navigate to either the Details page or Details Panel of a dataset.
|
|
242
|
+
2. In the Dataset Overview on the right, click  to generate labels.
|
|
243
|
+
3. In the Generating labels dialog, review the labels generated for the dataset and decide which to save. If multiple labels have been generated, you can save some, all, or none of them. To remove a label, click the **x** on the label.
|
|
244
|
+
|
|
245
|
+

|
|
246
|
+
|
|
247
|
+
4. Complete one of the following actions:
|
|
248
|
+
|
|
249
|
+
* If these are the only labels for your dataset, click **Save**.
|
|
250
|
+
* If you already have labels for the dataset and want to add these generated labels, click **Append**.
|
|
251
|
+
* If you already have labels for the dataset and want to replace them with these generated labels, click **Overwrite**.
|
|
252
|
+
|
|
253
|
+
The labels for the dataset will appear in the Dataset Overview.
|
|
254
|
+
|
|
255
|
+
#### Generate Wikis
|
|
256
|
+
|
|
257
|
+
To generate a wiki, Generative AI uses your schema and data to produce a description of the dataset, determining how the columns within the dataset relate to each other and to the dataset as a whole.
|
|
258
|
+
|
|
259
|
+
You can generate wikis only if you are the dataset owner or have `ALTER` privileges on the dataset.
|
|
260
|
+
|
|
261
|
+
To generate a wiki:
|
|
262
|
+
|
|
263
|
+
1. Navigate to either the Details page or Details Panel of a dataset.
|
|
264
|
+
2. In the Wiki section, click **Generate wiki**. A dialog will open and a preview of the wiki content will generate on the right of the dialog. If you would like to regenerate, click .
|
|
265
|
+
|
|
266
|
+

|
|
267
|
+
|
|
268
|
+
3. Click  to copy the generated wiki content on the right of the dialog.
|
|
269
|
+
4. Click within the text box on the left and paste the wiki content.
|
|
270
|
+
5. (Optional) Use the toolbar to make edits to the wiki content. If you would like to regenerate, click  in the toolbar to regenerate wiki content in the preview.
|
|
271
|
+
6. Click **Save**.
|
|
272
|
+
|
|
273
|
+
The wiki for the dataset will appear in the Wiki section.
|
|
274
|
+
|
|
275
|
+
## Related Topics
|
|
276
|
+
|
|
277
|
+
* [Search for Dremio Objects and Entities](/dremio-cloud/explore-analyze/discover#search-for-dremio-objects-and-entities) - Explore Dremio's semantic search capabilities.
|
|
278
|
+
* [Data Privacy](/data-privacy/) - Learn more about Dremio's data privacy practices.
|
|
279
|
+
|
|
288
|
+
|
|
289
|
+
<div style="page-break-after: always;"></div>
|
|
290
|
+
|
|
291
|
+
# Row-Access and Column-Masking Policies | Dremio Documentation
|
|
292
|
+
|
|
293
|
+
Original URL: https://docs.dremio.com/dremio-cloud/manage-govern/row-column-policies
|
|
294
|
+
|
|
296
|
+
|
|
297
|
+
Row-access and column-masking policies may be applied to tables, views, and columns via [user-defined functions (UDFs)](/dremio-cloud/sql/commands/create-function/). Using these policies, you can control access to sensitive data based upon the rules and conditions you need to maintain compliance or adhere to regulatory requirements, while also removing the need to produce a secondary set of data with protected information manually removed.
|
|
298
|
+
|
|
299
|
+
The following restrictions apply to policies and UDFs:
|
|
300
|
+
|
|
301
|
+
* Only users with the ADMIN role can create UDFs.
|
|
302
|
+
* UDFs can only have one owner, which is the user that created the UDF, by default.
|
|
303
|
+
* You can transfer ownership of a UDF using the `GRANT OWNERSHIP` command (see [Privileges](/dremio-cloud/security/privileges)).
|
|
304
|
+
* Users or roles must have the EXECUTE privilege in order to apply filtering and masking policies.
|
|
305
|
+
|
|
306
|
+
## Column-Masking Policies
|
|
307
|
+
|
|
308
|
+
Column-masking is a way to mask—or scramble—private data at the column-level dynamically prior to query execution. For example, the owner of a table or view may apply a policy to a column to only display the year of a date or the last four digits of a credit card.
|
|
309
|
+
|
|
310
|
+
Column-masking policies may be any UDF with a scalar return type that is identical to the data type of the column on which it is applied. However, only one column-masking policy may be applied to each column.
|
|
311
|
+
|
|
312
|
+
In the following example of a user-defined function, only the user jdoe@dremio.com and members of Accounting may see an entry's full social security number (ssn). For all other users, the leading digits of the SSN are masked with X characters.
|
|
313
|
+
|
|
314
|
+
Column-masking policy example
|
|
315
|
+
|
|
316
|
+
```
|
|
317
|
+
CREATE FUNCTION protect_ssn (ssn VARCHAR(11))
|
|
318
|
+
RETURNS VARCHAR(11)
|
|
319
|
+
RETURN SELECT CASE WHEN query_user()='jdoe@dremio.com' OR is_member('Accounting') THEN ssn
|
|
320
|
+
ELSE CONCAT('XXX-XX-', SUBSTR(ssn,8,4))
|
|
321
|
+
END;
|
|
322
|
+
```
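To make the masking behavior concrete, the following TypeScript sketch simulates the intent of this policy outside of Dremio: exempt users see the full value, and everyone else sees only the last four digits. The `QueryContext` shape and `protectSsn` function are hypothetical stand-ins for `query_user()`, `is_member()`, and the UDF. The sketch assumes SSNs formatted as `NNN-NN-NNNN`.

```typescript
// Hypothetical stand-in for the identity of the user running the query.
interface QueryContext {
  user: string;
  roles: ReadonlySet<string>;
}

// Mirrors the CASE expression: return the SSN unchanged for exempt users,
// otherwise mask everything except the last four digits.
function protectSsn(ssn: string, ctx: QueryContext): string {
  const exempt = ctx.user === "jdoe@dremio.com" || ctx.roles.has("Accounting");
  if (exempt) {
    return ssn;
  }
  return "XXX-XX-" + ssn.slice(-4);
}

const analyst: QueryContext = { user: "analyst@example.com", roles: new Set<string>() };
console.log(protectSsn("123-45-6789", analyst)); // XXX-XX-6789
```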
|
|
323
|
+
|
|
324
|
+
## Row-Access Policies
|
|
325
|
+
|
|
326
|
+
Row-access policies are a way to control which records in a table or view are returned for specific users and roles. For example, the owner of a table or view may apply a policy that filters out customers from a specific country unless the user running the query has a specific role.
|
|
327
|
+
|
|
328
|
+
Row-access policy example
|
|
329
|
+
|
|
330
|
+
```
|
|
331
|
+
CREATE FUNCTION country_filter (country VARCHAR)
|
|
332
|
+
RETURNS BOOLEAN
|
|
333
|
+
RETURN SELECT query_user()='jdoe@dremio.com' OR (is_member('Accounting') AND country='CA');
|
|
334
|
+
```
|
|
335
|
+
|
|
336
|
+
Row-access policies may be any boolean UDF applied to the table or view. The return value of the UDF is treated logically as an additional condition combined with `AND` in the query's `WHERE` clause. The return type of the UDF must be `BOOLEAN`; otherwise, Dremio returns an error at execution time.
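The `AND` semantics can be illustrated with an in-memory filter. In the sketch below, the `Customer` and `QueryContext` types and both functions are hypothetical, and Dremio applies the policy in the query plan rather than in application code. The point is only that a row is returned when both the query's own predicate and the policy evaluate to true.

```typescript
interface Customer {
  id: number;
  country: string;
}

// Hypothetical stand-in for the identity of the user running the query.
interface QueryContext {
  user: string;
  roles: ReadonlySet<string>;
}

// Mirrors the country_filter UDF shown above.
function countryFilter(country: string, ctx: QueryContext): boolean {
  return ctx.user === "jdoe@dremio.com" || (ctx.roles.has("Accounting") && country === "CA");
}

// The policy result is ANDed with the query's own WHERE predicate.
function runQuery(
  rows: readonly Customer[],
  where: (row: Customer) => boolean,
  ctx: QueryContext
): Customer[] {
  return rows.filter((row) => where(row) && countryFilter(row.country, ctx));
}
```

For a member of Accounting, a query with the predicate `id > 1` returns only rows that also have `country = 'CA'`. For a user who satisfies neither branch of the policy, the same query returns no rows.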
|
|
337
|
+
|
|
338
|
+
## User-Defined Functions
|
|
339
|
+
|
|
340
|
+
A user-defined function, or [UDF](/dremio-cloud/sql/commands/create-function), is a callable routine that accepts input parameters, executes the function body, and returns a single value or a set of rows.
|
|
341
|
+
|
|
342
|
+
The UDFs which serve as the basis for filtering and masking policies must be defined independently of your sources. Not only does this allow organizations to use a single policy for multiple tables and views, but this also restricts user access to policies and prevents unauthorized tampering. Modifying a single UDF automatically updates the policy in the context of any tables or views using that access or mask policy.
|
|
343
|
+
|
|
344
|
+
The following process describes how policies are enforced with Dremio:
|
|
345
|
+
|
|
346
|
+
1. A user with the ADMIN role creates a UDF to serve as a security policy.
|
|
347
|
+
2. The administrator then applies the security policy to one or more tables, views, and/or columns.
|
|
348
|
+
3. Dremio enforces the policy at runtime when an end-user performs a query.
|
|
349
|
+
|
|
350
|
+
Creating UDFs and attaching security policies is done through SQL commands. Policies are applied prior to execution, during the query planning phase. At this point, Dremio first checks the table or view for a row-access policy and then checks each accessed column for a column-masking policy. If any policies are found, they are automatically applied to the policy's scope using the associated UDF in the query plan.
|
|
351
|
+
|
|
352
|
+
### Query Substitutions
|
|
353
|
+
|
|
354
|
+
Row-access and column-masking functions act as an "implicit view," replacing a table/view reference in an SQL statement prior to processing the query. This implicit view is created through an examination of each policy applied to a table, view, or column.
|
|
355
|
+
|
|
356
|
+
For example, [jdoe@dremio.com](mailto:jdoe@dremio.com) has SELECT access to table\_1. However, the column-masking policy protect\_ssn is set for the column\_1 column, with a UDF that replaces all but the last four digits of a social security number with X for anyone who is neither this user nor a member of the Accounting department. When a user runs a query in Dremio that references this column, the following occurs:
|
|
357
|
+
|
|
358
|
+
1. During the SQL Planning phase, Dremio identifies which tables, views, and columns are being accessed (table\_1) and whether security policies must be enforced.
|
|
359
|
+
2. The engine searches for any security policies set to the associated objects, such as protect\_ssn (see Examples of UDFs below).
|
|
360
|
+
3. When the protect\_ssn policy is found for the object affected by the query, the query planner immediately modifies the execution path to incorporate the masking function.
|
|
361
|
+
4. Query execution proceeds as normal with the associated UDF included within the execution path.
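One way to picture the "implicit view" is as a subquery that wraps masked columns in their UDFs and adds the row-access UDF as a filter. The TypeScript sketch below builds such a subquery as text. It is purely conceptual: Dremio modifies the query plan rather than SQL strings, and the `TablePolicies` shape and `implicitView` function are hypothetical.

```typescript
// A policy is a UDF name plus the columns passed to it.
interface PolicyRef {
  fn: string;
  args: readonly string[];
}

// Hypothetical description of the policies attached to one table.
interface TablePolicies {
  table: string;
  columns: readonly string[];
  rowAccess?: PolicyRef;
  masking: ReadonlyMap<string, PolicyRef>; // keyed by column name
}

// Renders the conceptual subquery that replaces the table reference.
function implicitView(policies: TablePolicies): string {
  const selectList = policies.columns
    .map((column) => {
      const mask = policies.masking.get(column);
      return mask ? `${mask.fn}(${mask.args.join(", ")}) AS ${column}` : column;
    })
    .join(", ");
  const filter = policies.rowAccess
    ? ` WHERE ${policies.rowAccess.fn}(${policies.rowAccess.args.join(", ")})`
    : "";
  return `(SELECT ${selectList} FROM ${policies.table}${filter})`;
}
```

For the scenario above, a reference to table\_1 with protect\_ssn set on column\_1 would conceptually become `(SELECT id, protect_ssn(column_1) AS column_1 FROM table_1)`, where `id` is an assumed extra column used for illustration.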
|
|
362
|
+
|
|
363
|
+
## List Existing UDFs
|
|
364
|
+
|
|
365
|
+
To view all existing UDFs created in Dremio, use the [`SHOW FUNCTIONS`](/dremio-cloud/sql/commands/show-functions/) SQL command.
|
|
366
|
+
|
|
367
|
+
## List Existing Policies
|
|
368
|
+
|
|
369
|
+
To view row-access and column-masking policies, use a [`SELECT` statement](/dremio-cloud/sql/commands/SELECT) with the target table/view, system table, and policies specified.
|
|
370
|
+
|
|
371
|
+
List existing column-masking and row-access policies
|
|
372
|
+
|
|
373
|
+
```
|
|
374
|
+
SELECT view_name, masking_policies, row_access_policies FROM sys.project.views;
|
|
375
|
+
SELECT table_name, masking_policies, row_access_policies FROM sys.project."tables";
|
|
376
|
+
```
|
|
377
|
+
|
|
378
|
+
To view all column-masking policies set for a given table, use the [`DESCRIBE TABLE`](/dremio-cloud/sql/commands/describe-table/) command.
|
|
379
|
+
|
|
380
|
+
## Set a Policy
|
|
381
|
+
|
|
382
|
+
To create a row-access or column-masking policy, you must perform the following steps using the associated SQL commands:
|
|
383
|
+
|
|
384
|
+
1. Create a new UDF or replace an existing one using the [`CREATE [OR REPLACE] FUNCTION`](/dremio-cloud/sql/commands/create-function/) command.
|
|
385
|
+
|
|
386
|
+
Create or replace UDF
|
|
387
|
+
|
|
388
|
+
```
|
|
389
|
+
CREATE FUNCTION country_filter (country VARCHAR)
|
|
390
|
+
RETURNS BOOLEAN
|
|
391
|
+
RETURN SELECT query_user()='jdoe@dremio.com' OR (is_member('Accounting') AND country='CA');
|
|
392
|
+
|
|
393
|
+
CREATE FUNCTION id_filter (id INT)
|
|
394
|
+
RETURNS BOOLEAN
|
|
395
|
+
RETURN SELECT id = 1;
|
|
396
|
+
```
|
|
397
|
+
2. Grant the [EXECUTE privilege](/dremio-cloud/security/privileges) to the roles or users that will apply the policy.
|
|
398
|
+
|
|
399
|
+
Grant EXECUTE privilege
|
|
400
|
+
|
|
401
|
+
```
|
|
402
|
+
GRANT EXECUTE ON FUNCTION country_filter TO role Policy_Role;
|
|
403
|
+
```
|
|
404
|
+
3. Create a policy to apply the function: use `ADD ROW ACCESS POLICY` for row-level access or `SET MASKING POLICY` for column masking. These clauses may be used with the `CREATE TABLE`, `CREATE VIEW`, `ALTER TABLE`, and `ALTER VIEW` commands.
|
|
405
|
+
|
|
406
|
+
Create policy to apply function
|
|
407
|
+
|
|
408
|
+
```
|
|
409
|
+
-- Add row-access policy
|
|
410
|
+
ALTER TABLE e.employee
|
|
411
|
+
ADD ROW ACCESS POLICY country_filter(country);
|
|
412
|
+
|
|
413
|
+
-- Add column-masking policy
|
|
414
|
+
ALTER VIEW e.employee_view
|
|
415
|
+
MODIFY COLUMN ssn_col SET MASKING POLICY protect_ssn (ssn_col, region);
|
|
416
|
+
|
|
417
|
+
-- Create table with row policy
|
|
418
|
+
CREATE TABLE e.employee(
|
|
419
|
+
id INTEGER,
|
|
420
|
+
ssn VARCHAR(11),
|
|
421
|
+
country VARCHAR,
|
|
422
|
+
ROW ACCESS POLICY country_filter(country)
|
|
423
|
+
);
|
|
424
|
+
|
|
425
|
+
-- Create view with masking policy
|
|
426
|
+
CREATE VIEW e.employee_view(
|
|
427
|
+
ssn_col VARCHAR MASKING POLICY protect_ssn (ssn_col, region),
|
|
428
|
+
region VARCHAR,
|
|
429
|
+
state_col VARCHAR
|
|
430
|
+
);
|
|
431
|
+
```
|
|
432
|
+
|
|
433
|
+
note
|
|
434
|
+
|
|
435
|
+
Both row-access and column-masking UDFs may be applied in a single security policy, or set individually.
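
As a sketch of setting both kinds of policy individually on the same table, the statements below reuse the `country_filter` UDF from step 1 and assume a single-argument `protect_ssn` UDF like the one under Examples of UDFs. The table and column names are taken from the examples above and are otherwise assumptions.

```sql
-- Row-access policy: filter rows by country
ALTER TABLE e.employee
ADD ROW ACCESS POLICY country_filter(country);

-- Column-masking policy: mask the ssn column on the same table
ALTER TABLE e.employee
MODIFY COLUMN ssn
SET MASKING POLICY protect_ssn(ssn);
```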
|
|
436
|
+
|
|
437
|
+
## Drop a Policy
|
|
438
|
+
|
|
439
|
+
To remove a security policy from a table, view, or row, use `UNSET MASKING POLICY` or `DROP ROW ACCESS POLICY` with `ALTER TABLE` or `ALTER VIEW`.
|
|
440
|
+
|
|
441
|
+
Remove security policy
|
|
442
|
+
|
|
443
|
+
```
|
|
444
|
+
ALTER TABLE e.employee DROP ROW ACCESS POLICY country_filter(country);
|
|
445
|
+
ALTER VIEW e.employee_view MODIFY COLUMN ssn_col UNSET MASKING POLICY protect_ssn;
|
|
446
|
+
```
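
One way to confirm that a policy was removed is to rerun the system-table query from List Existing Policies. This is a sketch: the `WHERE` filter is added for illustration and assumes that `table_name` holds the unqualified table name.

```sql
-- The policy columns should no longer list the dropped policies
SELECT table_name, masking_policies, row_access_policies
FROM sys.project."tables"
WHERE table_name = 'employee';
```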
|
|
447
|
+
|
|
448
|
+
## Examples of UDFs
|
|
449
|
+
|
|
450
|
+
The following are examples of user-defined functions that you may create with Dremio.
|
|
451
|
+
|
|
452
|
+
### Column-Masking Policies
|
|
453
|
+
|
|
454
|
+
Redact SSN
|
|
455
|
+
|
|
456
|
+
```
|
|
457
|
+
CREATE FUNCTION
|
|
458
|
+
protect_ssn (val VARCHAR)
|
|
459
|
+
RETURNS VARCHAR
|
|
460
|
+
RETURN
|
|
461
|
+
SELECT
|
|
462
|
+
CASE
|
|
463
|
+
WHEN query_user() IN ('jdoe@dremio.com','janders@dremio.com')
|
|
464
|
+
OR is_member('Accounting') THEN val
|
|
465
|
+
ELSE CONCAT('XXX-XX-',SUBSTR(val,8,4))
|
|
466
|
+
END;
|
|
467
|
+
```
|
|
468
|
+
|
|
469
|
+
Use column-masking and row-access policies
|
|
470
|
+
|
|
471
|
+
```
|
|
472
|
+
CREATE FUNCTION lower_country(country VARCHAR)
|
|
473
|
+
RETURNS VARCHAR
|
|
474
|
+
RETURN SELECT lower(country);
|
|
475
|
+
|
|
476
|
+
CREATE FUNCTION country_filter (country VARCHAR)
|
|
477
|
+
RETURNS BOOLEAN
|
|
478
|
+
RETURN SELECT query_user()='dremio'
|
|
479
|
+
OR (is_member('Accounting')
|
|
480
|
+
AND country='CA');
|
|
481
|
+
|
|
482
|
+
CREATE FUNCTION protect_ssn (ssn VARCHAR(11))
|
|
483
|
+
RETURNS VARCHAR(11)
|
|
484
|
+
RETURN SELECT CASE WHEN query_user()='dremio' OR is_member('Accounting') THEN ssn
|
|
485
|
+
ELSE CONCAT('XXX-XX-', SUBSTR(ssn,8,4))
|
|
486
|
+
END;
|
|
487
|
+
|
|
488
|
+
CREATE FUNCTION salary_range (salary FLOAT, id INTEGER)
|
|
489
|
+
RETURNS BOOLEAN
|
|
490
|
+
RETURN SELECT CASE WHEN id > 1 AND salary > 10000 THEN true
|
|
491
|
+
ELSE false
|
|
492
|
+
END;
|
|
493
|
+
```
|
|
494
|
+
|
|
495
|
+
Use STRUCT
|
|
496
|
+
|
|
497
|
+
```
|
|
498
|
+
-- Create a table with a STRUCT column and set a masking policy on it
|
|
499
|
+
CREATE TABLE nas.struct_demo (emp_info struct <name : VARCHAR>);
|
|
500
|
+
INSERT INTO nas.struct_demo VALUES(SELECT convert_from('{"name":"a"}', 'json'));
|
|
501
|
+
CREATE FUNCTION hello(nameCol struct<name:VARCHAR>) RETURNS struct<name:VARCHAR> RETURN SELECT nameCol;
|
|
502
|
+
ALTER TABLE nas.struct_demo MODIFY COLUMN emp_info SET MASKING POLICY hello(emp_info);
|
|
503
|
+
```
|
|
504
|
+
|
|
505
|
+
Use LIST
|
|
506
|
+
|
|
507
|
+
```
|
|
508
|
+
CREATE FUNCTION hello_country(countryList LIST<VARCHAR>) RETURNS VARCHAR RETURN SELECT 'Hello World';
|
|
509
|
+
ALTER TABLE "test.json" MODIFY COLUMN country SET MASKING POLICY hello_country(country);
|
|
510
|
+
```
|
|
511
|
+
|
|
512
|
+
### Row-Access Policies
|
|
513
|
+
|
|
514
|
+
Use simple filter expressions
|
|
515
|
+
|
|
516
|
+
```
|
|
517
|
+
CREATE FUNCTION country_filter (country VARCHAR)
|
|
518
|
+
RETURNS BOOLEAN
|
|
519
|
+
RETURN SELECT country='CA';
|
|
520
|
+
```
|
|
521
|
+
|
|
522
|
+
Match users
|
|
523
|
+
|
|
524
|
+
```
|
|
525
|
+
CREATE FUNCTION query_1(my_value varchar)
|
|
526
|
+
RETURNS BOOLEAN
|
|
527
|
+
RETURN SELECT CASE
|
|
528
|
+
WHEN current_user = 'jdoe@dremio.com' THEN true
|
|
529
|
+
ELSE false
|
|
530
|
+
END;
|
|
531
|
+
```
|
|
532
|
+
|
|
533
|
+
### Table-Driven Policy with a Subquery
|
|
534
|
+
|
|
535
|
+
Use a subquery as a table-driven policy
|
|
536
|
+
|
|
537
|
+
```
|
|
538
|
+
DROP TABLE IF EXISTS <catalog-name>.salesmanagerregions;
|
|
539
|
+
CREATE TABLE <catalog-name>.salesmanagerregions (
|
|
540
|
+
sales_manager varchar,
|
|
541
|
+
sales_region varchar
|
|
542
|
+
);
|
|
543
|
+
|
|
544
|
+
INSERT INTO <catalog-name>.salesmanagerregions
|
|
545
|
+
VALUES ('john.smith@example.com', 'WW'),
|
|
546
|
+
('jane.doe@example.com', 'NA'),
|
|
547
|
+
('viktor.jones@example.com', 'EU');
|
|
548
|
+
|
|
549
|
+
CREATE TABLE <catalog-name>.revenue (
|
|
550
|
+
company varchar,
|
|
551
|
+
region varchar,
|
|
552
|
+
revenue decimal(18,2)
|
|
553
|
+
);
|
|
554
|
+
|
|
555
|
+
INSERT INTO <catalog-name>.revenue
|
|
556
|
+
VALUES ('Acme', 'EU', 2.5),
|
|
557
|
+
('Acme', 'NA', 1.5);
|
|
558
|
+
|
|
559
|
+
CREATE OR REPLACE FUNCTION security.sales_policy (sales_region_in varchar) RETURNS BOOLEAN
|
|
560
|
+
RETURN SELECT is_member('sales_executive_role')
|
|
561
|
+
OR EXISTS (
|
|
562
|
+
SELECT 1 FROM <catalog-name>.salesmanagerregions
|
|
563
|
+
WHERE user() = sales_manager
|
|
564
|
+
AND sales_region = sales_region_in
|
|
565
|
+
);
|
|
566
|
+
|
|
567
|
+
ALTER TABLE <catalog-name>.revenue
|
|
568
|
+
ADD ROW ACCESS POLICY security.sales_policy(region);
|
|
569
|
+
|
|
570
|
+
SELECT * FROM <catalog-name>.revenue;
|
|
571
|
+
-- Result for a user mapped only to the NA region, such as jane.doe@example.com (columns: company, region, revenue)
|
|
572
|
+
-- Acme, NA, 1.50
|
|
573
|
+
```
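
Because the policy reads `salesmanagerregions` at query time, access can presumably be changed by editing that table rather than the UDF. A minimal sketch, where `new.manager@example.com` is a hypothetical user not present in the original example:

```sql
-- Map a (hypothetical) manager to the EU region
INSERT INTO <catalog-name>.salesmanagerregions
VALUES ('new.manager@example.com', 'EU');

-- When run as that user, the policy should now let the Acme/EU row through
SELECT * FROM <catalog-name>.revenue;
```

The design keeps the UDF stable while the mapping of users to regions lives in ordinary data.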
|
|
574
|
+
|
|
575
|
+
## Use Reflections on Datasets with Policies
|
|
576
|
+
|
|
577
|
+
Dremio supports Reflection creation on views and tables with row-access and column-masking policies defined on any of the underlying anchor datasets. See the following examples.
|
|
578
|
+
|
|
579
|
+
Example of a view with a row-access policy and a raw Reflection
|
|
580
|
+
|
|
581
|
+
```
|
|
582
|
+
-- Create nested views
|
|
583
|
+
CREATE OR REPLACE VIEW myView AS
|
|
584
|
+
SELECT city, state, pop FROM Samples."samples.dremio.com"."zips.json"
|
|
585
|
+
WHERE pop > 10000;
|
|
586
|
+
CREATE OR REPLACE VIEW myView2 AS
|
|
587
|
+
SELECT city, state FROM myView
|
|
588
|
+
WHERE STARTS_WITH(city, 'A');
|
|
589
|
+
|
|
590
|
+
-- Create a raw Reflection on the inner view
|
|
591
|
+
ALTER TABLE myView
|
|
592
|
+
CREATE RAW REFLECTION myReflection
|
|
593
|
+
USING DISPLAY(city, state);
|
|
594
|
+
|
|
595
|
+
-- Query the view after the Reflection is created
|
|
596
|
+
SELECT * FROM myView2;
|
|
597
|
+
|
|
598
|
+
-- Create a UDF
|
|
599
|
+
CREATE OR REPLACE FUNCTION isMA(state VARCHAR)
|
|
600
|
+
RETURNS BOOLEAN
|
|
601
|
+
RETURN SELECT CASE WHEN IS_MEMBER('hr') THEN state='MA'
|
|
602
|
+
ELSE NULL
|
|
603
|
+
END;
|
|
604
|
+
|
|
605
|
+
-- Add a row-access policy and query the view
|
|
606
|
+
ALTER TABLE myView
|
|
607
|
+
ADD ROW ACCESS POLICY isMA("state");
|
|
608
|
+
SELECT * FROM myView2;
|
|
609
|
+
```
|
|
610
|
+
|
|
611
|
+
After running the last query, the Reflection is used to accelerate the query as shown in the results below:
|
|
612
|
+
|
|
613
|
+

|
|
614
|
+
|
|
615
|
+
The `Query1` results show that the row-access policy has been applied successfully:
|
|
616
|
+
|
|
617
|
+

|
|
618
|
+
|
|
619
|
+
The `Query2` results do not appear for users who are not members of the `hr` role:
|
|
620
|
+
|
|
621
|
+

|
|
622
|
+
|
|
623
|
+
The `Query2` results appear for users who are members of the `hr` role:
|
|
624
|
+
|
|
625
|
+

|
|
626
|
+
|
|
627
|
+
Example of a table with a row-access policy and an aggregation Reflection
|
|
628
|
+
|
|
629
|
+
```
|
|
630
|
+
ALTER TABLE NAS.rcac.employee
|
|
631
|
+
ADD ROW ACCESS POLICY is_recent_employee(hire_date);
|
|
632
|
+
ALTER TABLE NAS.rcac.employee
|
|
633
|
+
CREATE AGGREGATE REFLECTION ar_tvrf_1 USING DIMENSIONS(hire_date);
|
|
634
|
+
SELECT MIN(SALARY) FROM NAS.rcac.employee
|
|
635
|
+
GROUP BY hire_date;
|
|
636
|
+
```
|
|
637
|
+
|
|
638
|
+
### Limitations
|
|
639
|
+
|
|
640
|
+
Datasets with row-access or column-masking policies cannot support Reflections in the following cases:
|
|
641
|
+
|
|
642
|
+
* Policies with Multiple Arguments
|
|
643
|
+
* Aggregates on Masked Columns
|
|
644
|
+
* SET Operations
|
|
645
|
+
* NULL Generating JOINs
|
|
646
|
+
* Trimming Projects
|
|
647
|
+
|
|
648
|
+
#### Policies with Multiple Arguments
|
|
649
|
+
|
|
650
|
+
If a policy on an anchor dataset takes multiple columns as arguments, creating a Reflection on a view that depends on that dataset fails. See the following example:
|
|
651
|
+
|
|
652
|
+
Example of the limitation
|
|
653
|
+
|
|
654
|
+
```
|
|
655
|
+
-- Create tables
|
|
656
|
+
CREATE TABLE employees (
|
|
657
|
+
id INT,
|
|
658
|
+
hire_date DATE,
|
|
659
|
+
ssn VARCHAR(11),
|
|
660
|
+
name VARCHAR,
|
|
661
|
+
country VARCHAR,
|
|
662
|
+
salary FLOAT,
|
|
663
|
+
job_id INT);
|
|
664
|
+
CREATE TABLE jobs (
|
|
665
|
+
id INT,
|
|
666
|
+
title VARCHAR,
|
|
667
|
+
is_good BOOLEAN);
|
|
668
|
+
|
|
669
|
+
-- Create a view
|
|
670
|
+
CREATE VIEW job_salary_in_the_usa AS
|
|
671
|
+
SELECT job_id, salary
|
|
672
|
+
FROM employees
|
|
673
|
+
WHERE country = 'USA';
|
|
674
|
+
|
|
675
|
+
-- Create a UDF
|
|
676
|
+
CREATE OR REPLACE FUNCTION hide_salary_on_bad_job(salary FLOAT, job_id_in INT)
|
|
677
|
+
RETURNS FLOAT
|
|
678
|
+
RETURN SELECT CASE WHEN IS_MEMBER('public') AND (
|
|
679
|
+
SELECT is_good FROM jobs j WHERE job_id_in = j.id)
|
|
680
|
+
THEN NULL
|
|
681
|
+
ELSE salary
|
|
682
|
+
END;
|
|
683
|
+
|
|
684
|
+
-- Add a column-masking policy
|
|
685
|
+
ALTER TABLE employees
|
|
686
|
+
MODIFY COLUMN salary
|
|
687
|
+
SET MASKING POLICY hide_salary_on_bad_job(salary, job_id);
|
|
688
|
+
|
|
689
|
+
-- Create a raw Reflection on the view
|
|
690
|
+
ALTER DATASET job_salary_in_the_usa
|
|
691
|
+
CREATE RAW REFLECTION job_salary_drr USING DISPLAY(job_id, salary);
|
|
692
|
+
```
|
|
693
|
+
|
|
694
|
+
In the above example, the `job_salary_drr` Reflection fails to materialize due to the multi-argument policy on `test.tables.employees::salary`.
|
|
695
|
+
|
|
696
|
+
#### Aggregates on Masked Columns
|
|
697
|
+
|
|
698
|
+
You cannot create a Reflection on a view that aggregates a column on which a masking policy is defined.
|
|
699
|
+
|
|
700
|
+
Example of the limitation
|
|
701
|
+
|
|
702
|
+
```
|
|
703
|
+
CREATE OR REPLACE VIEW myView AS
|
|
704
|
+
SELECT MIN(salary)
|
|
705
|
+
FROM employees;
|
|
706
|
+
```
|
|
707
|
+
|
|
708
|
+
In the above example, there is a policy defined on `salary`, so you cannot create a Reflection on this view.
|
|
709
|
+
|
|
710
|
+
#### NULL Generating JOINs
|
|
711
|
+
|
|
712
|
+
A Reflection can be created only if the dataset with the policy is on the “join side” of the join, that is, the side whose rows are not padded with NULL values:
|
|
713
|
+
|
|
714
|
+
* Left side of LEFT JOIN
|
|
715
|
+
* Right side of RIGHT JOIN
|
|
716
|
+
* Either side of INNER JOIN
|
|
717
|
+
* Neither side of FULL OUTER JOIN
|
|
718
|
+
|
|
719
|
+
If the policy is not on the "join side", the join generates NULL values for all the entries that didn’t match the join condition.
|
|
720
|
+
|
|
721
|
+
Example of the limitation
|
|
722
|
+
|
|
723
|
+
```
|
|
724
|
+
CREATE OR REPLACE VIEW myView AS
|
|
725
|
+
SELECT emp.department_id, dept.department_name, emp.name
|
|
726
|
+
FROM employees as emp
|
|
727
|
+
RIGHT JOIN department as dept
|
|
728
|
+
ON emp.department_id = dept.department_id;
|
|
729
|
+
```
|
|
730
|
+
|
|
731
|
+
In the above example, there is a policy defined on the `employees` table, which is on the left side of the RIGHT JOIN, so you cannot create a Reflection on this view.
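
By contrast, going by the list above, a view that keeps the policy-protected table on the left side of a LEFT JOIN should be eligible for a Reflection. This is a sketch using the same hypothetical `employees` and `department` tables, not a tested configuration:

```sql
-- employees (the table with the policy) is on the left side of a LEFT JOIN,
-- so its rows are never replaced by NULLs
CREATE OR REPLACE VIEW myView AS
SELECT emp.department_id, dept.department_name, emp.name
FROM employees AS emp
LEFT JOIN department AS dept
ON emp.department_id = dept.department_id;
```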
|
|
732
|
+
|
|
733
|
+
#### SET Operations
|
|
734
|
+
|
|
735
|
+
The policy must be defined on all UNION datasets and on the same field.
|
|
736
|
+
|
|
737
|
+
Example of the limitation
|
|
738
|
+
|
|
739
|
+
```
|
|
740
|
+
CREATE OR REPLACE VIEW myView AS
|
|
741
|
+
SELECT * FROM a
|
|
742
|
+
UNION SELECT * FROM employees
|
|
743
|
+
UNION SELECT * FROM c;
|
|
744
|
+
```
|
|
745
|
+
|
|
746
|
+
In the above example, there is a policy defined on the `employees` table, so you cannot create a Reflection on this view.
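
Following the rule stated above, the view would presumably become eligible once the same policy is defined on the same field of every dataset in the UNION. The sketch below assumes that `a`, `employees`, and `c` all have a `country` column and reuses the `country_filter` UDF from earlier examples; both are assumptions for illustration.

```sql
-- Define the same row-access policy on the same field of each UNION input
ALTER TABLE a
ADD ROW ACCESS POLICY country_filter(country);

ALTER TABLE employees
ADD ROW ACCESS POLICY country_filter(country);

ALTER TABLE c
ADD ROW ACCESS POLICY country_filter(country);
```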
|
|
747
|
+
|
|
748
|
+
#### Trimming Projects
|
|
749
|
+
|
|
750
|
+
To create a Reflection on a view, the view must reference all the fields that are used in the row-access and column-masking policies.
|
|
751
|
+
|
|
752
|
+
Example of the limitation
|
|
753
|
+
|
|
754
|
+
```
|
|
755
|
+
-- Create a UDF
|
|
756
|
+
CREATE OR REPLACE FUNCTION isMA(state VARCHAR)
|
|
757
|
+
RETURNS BOOLEAN
|
|
758
|
+
RETURN SELECT CASE WHEN IS_MEMBER('public') THEN state='MA'
|
|
759
|
+
ELSE NULL
|
|
760
|
+
END;
|
|
761
|
+
|
|
762
|
+
-- Create views
|
|
763
|
+
CREATE OR REPLACE VIEW myView1 AS
|
|
764
|
+
SELECT city, state, pop FROM Samples."samples.dremio.com"."zips.json"
|
|
765
|
+
WHERE pop > 10000;
|
|
766
|
+
|
|
767
|
+
-- Add a row-access policy
|
|
768
|
+
ALTER TABLE myView1
|
|
769
|
+
ADD ROW ACCESS POLICY isMA("state");
|
|
770
|
+
|
|
771
|
+
-- Create views
|
|
772
|
+
CREATE OR REPLACE VIEW myView2 AS
|
|
773
|
+
SELECT * FROM myView1;
|
|
774
|
+
CREATE OR REPLACE VIEW myView3 AS
|
|
775
|
+
SELECT city, pop FROM myView1;
|
|
776
|
+
```
|
|
777
|
+
|
|
778
|
+
|
|
779
|
+
|
|
780
|
+
In the above example, you can create a Reflection on `myView2` but not on `myView3` since it trims the `state` column from the view which has a policy defined on it.
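
For illustration, a raw Reflection on `myView2` could be created as sketched here, using the same syntax as the earlier Reflection examples. The Reflection name `myView2_raw` is hypothetical.

```sql
-- myView2 selects every column of myView1, including state,
-- so the policy column is still available to the Reflection
ALTER DATASET myView2
CREATE RAW REFLECTION myView2_raw USING DISPLAY(city, state, pop);
```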
|
|
781
|
+
|
|
798
|
+
|
|
799
|
+
<div style="page-break-after: always;"></div>
|
|
800
|
+
|