Providing access to Unity Catalog in Databricks involves confirming that your workspace is enabled for Unity Catalog, provisioning users and groups (manually in the Account Console or through SCIM), and then granting granular privileges to those principals with SQL commands or the Databricks UI. Together, these steps establish a data governance framework for your data assets.
Understanding Unity Catalog Access
Unity Catalog provides a unified governance solution for data and AI on the Lakehouse Platform. It centralizes access control, auditing, and lineage for all data across Databricks workspaces. Granting access is crucial for enabling teams to securely discover, analyze, and manage data.
Step 1: Confirm Unity Catalog Enablement for Your Workspace
Before you can grant specific data access, you must ensure that your Databricks workspace is enabled for Unity Catalog. This means a Unity Catalog metastore must be attached to your workspace.
As an Azure Databricks account administrator, you can verify this by:
- Logging into the Account Console.
- Navigating to Workspaces.
- Finding your specific workspace in the list.
- Checking the Metastore column. If a metastore name is present, your workspace is successfully attached to a Unity Catalog metastore and is enabled for Unity Catalog data governance.
If your workspace is not yet enabled, an account administrator typically needs to create a Unity Catalog metastore and assign it to the workspace.
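The Account Console check can be complemented with a quick check from inside the workspace. The following is a minimal sketch, assuming it is run in a notebook or SQL editor on Unity Catalog-capable compute; the output depends entirely on your environment.

```sql
-- Returns the ID of the Unity Catalog metastore attached to the current workspace.
SELECT current_metastore();

-- Lists the catalogs that the current user is allowed to see.
SHOW CATALOGS;
```

If the first statement returns a metastore ID that matches the one shown in the Metastore column, the workspace and the Account Console agree on the attachment.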
Step 2: Provision Users and Groups
Unity Catalog permissions are managed for users and groups. Therefore, you need to provision these identities within Databricks.
- Users: Individual accounts that interact with Databricks.
- Groups: Collections of users that simplify permission management.
You can provision users and groups through:
- Databricks Account Console: Add users and create groups directly.
- SCIM Provisioning: Integrate with identity providers like Microsoft Entra ID (formerly Azure Active Directory), Okta, or OneLogin to automatically sync users and groups to your Databricks account. This is recommended for large organizations to streamline identity management.
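Once identities are provisioned, it can be useful to confirm that they are visible before granting anything to them. This is a rough sketch, assuming your account has sufficient rights to list identities; the email address is a placeholder, not a value from this guide.

```sql
-- List the users and groups known to the current workspace.
SHOW USERS;
SHOW GROUPS;

-- List the groups that one user belongs to (placeholder email address).
SHOW GROUPS WITH USER `analyst@example.com`;
```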
Step 3: Grant Permissions within Unity Catalog
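Once Unity Catalog is enabled and users/groups are provisioned, you can grant them specific permissions on Unity Catalog objects (catalogs, schemas, tables, views, functions, etc.). Unity Catalog enforces a hierarchical permission model, meaning privileges are inherited downwards: a privilege granted on a catalog applies to every schema and table inside it, and a privilege granted on a schema applies to every table inside that schema.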
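To make the inheritance concrete, here is a small sketch using the example names that appear later in this guide (my_catalog, production, finance_team, sales_data); treat them as placeholders for your own objects.

```sql
-- Allow the group to use the catalog.
GRANT USE CATALOG ON CATALOG my_catalog TO `finance_team`;

-- Grant schema usage and read access once, at the schema level.
GRANT USE SCHEMA, SELECT ON SCHEMA my_catalog.production TO `finance_team`;

-- Because SELECT is inherited downwards, group members can now read
-- any current or future table in the schema without a per-table grant.
SELECT * FROM my_catalog.production.sales_data LIMIT 10;
```

Granting at the schema level keeps the number of grants small, but it is broader than a per-table grant, so it should be weighed against the least-privilege principle.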
Methods for Granting Permissions
You can grant permissions using two primary methods:
1. SQL GRANT Statements
SQL commands offer the most granular control and are often preferred for scripting and automation. Permissions are granted using the GRANT statement and revoked using REVOKE.
Key Concepts:
- Securable Objects: Catalogs, schemas (databases), tables, views, functions, external locations, storage credentials.
- Principals: Users, service principals, groups, or the built-in account users group, which contains all users in the account.
- Privileges: Specific actions allowed, such as SELECT, MODIFY, CREATE TABLE, USE CATALOG, ALL PRIVILEGES.
Syntax Example:
-- Grant general usage on a catalog
GRANT USE CATALOG ON CATALOG `my_catalog` TO `data_analysts_group`;
-- Allow creating new schemas within a catalog
GRANT CREATE SCHEMA ON CATALOG `my_catalog` TO `data_engineers_group`;
-- Grant access to a specific schema
GRANT USE SCHEMA ON SCHEMA `my_catalog`.`production` TO `finance_team`;
-- Grant read access to a table
GRANT SELECT ON TABLE `my_catalog`.`production`.`sales_data` TO `finance_team`;
-- Grant read and write access to a table (SELECT covers reads, MODIFY covers writes)
GRANT SELECT, MODIFY ON TABLE `my_catalog`.`production`.`customer_info` TO `marketing_team`;
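-- Grant all privileges (use with caution)
-- Note: the Databricks GRANT statement has no WITH GRANT OPTION clause; the ability
-- to grant to others comes from object ownership or the MANAGE privilege
GRANT ALL PRIVILEGES ON TABLE `my_catalog`.`production`.`sensitive_data` TO `data_owner`;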
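Granting is only half of the workflow; inspecting and revoking use the same object and principal names. The sketch below reuses the example table and group from above and assumes you own the table or otherwise have the rights to manage its privileges.

```sql
-- Show every grant on the table, for all principals.
SHOW GRANTS ON TABLE my_catalog.production.sales_data;

-- Show only the grants held by one principal on the table.
SHOW GRANTS `finance_team` ON TABLE my_catalog.production.sales_data;

-- Remove read access that is no longer needed.
REVOKE SELECT ON TABLE my_catalog.production.sales_data FROM `finance_team`;
```

Running SHOW GRANTS before and after a REVOKE is a simple way to confirm that the change took effect.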
2. Databricks UI (Catalog Explorer)
The Databricks UI provides an intuitive way to manage permissions, especially for individual grants or exploring existing access.
Steps:
- Navigate to Catalog in the Databricks workspace sidebar.
- Browse to the specific securable object (e.g., a catalog, schema, or table).
- Click on the object name to view its details.
- Go to the Permissions tab.
- Click the Grant button or manage existing permissions to add users or groups and assign appropriate privileges.
Common Unity Catalog Permissions
Here's a table outlining some frequently used Unity Catalog privileges and their purpose:
Privilege | Securable Object | Description |
---|---|---|
USE CATALOG | Catalog | Required to interact with any object within a catalog. |
CREATE SCHEMA | Catalog | Allows creation of new schemas within a catalog. |
USE SCHEMA | Schema | Required to interact with any object within a schema. |
CREATE TABLE | Schema | Allows creation of new tables within a schema. |
SELECT | Table, View | Grants read-only access to data in a table or view. |
MODIFY | Table | Allows adding, updating, and deleting data in a table. |
CREATE FUNCTION | Schema | Allows creation of new functions within a schema. |
ALL PRIVILEGES | Catalog, Schema, Table | Grants all available privileges on the securable object (use with extreme caution). |
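MANAGE | Any | Allows the grantee to view and manage privileges on the object, including granting them to other principals. Unity Catalog uses this privilege (or object ownership) in place of a WITH GRANT OPTION clause. |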
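Several privileges from the table are typically combined for one role. As an illustrative sketch (reusing the data_engineers_group, my_catalog, and production names from the syntax example as placeholders), an engineering group that needs to build objects in a schema might receive:

```sql
-- Usage on the parent catalog is required before anything inside it can be touched.
GRANT USE CATALOG ON CATALOG my_catalog TO `data_engineers_group`;

-- Usage on the schema plus the rights to create tables and functions in it.
GRANT USE SCHEMA, CREATE TABLE, CREATE FUNCTION
  ON SCHEMA my_catalog.production TO `data_engineers_group`;
```

Listing several privileges in one GRANT statement keeps the script short while still granting each privilege explicitly.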
Best Practices for Access Control
- Grant Least Privilege: Provide only the minimum necessary permissions for users and groups to perform their tasks.
- Use Groups: Manage permissions primarily at the group level rather than individual users for easier administration and auditing.
- Hierarchical Permissions: Leverage the hierarchical nature of Unity Catalog. Granting USE CATALOG on a catalog and USE SCHEMA on a schema is a prerequisite for accessing tables within them.
- Regular Audits: Periodically review and audit permissions to ensure they align with current roles and responsibilities.
- Separate Environments: Use separate catalogs or schemas for development, staging, and production to enforce clear separation of data.
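Two of these practices lend themselves to short SQL sketches. The first query assumes the catalog exposes the standard Unity Catalog information_schema; the dev and prod catalog names are hypothetical, and depending on how your metastore is configured, CREATE CATALOG may additionally require a managed storage location.

```sql
-- Regular audits: list table-level grants recorded for one catalog.
SELECT grantee, table_schema, table_name, privilege_type
FROM my_catalog.information_schema.table_privileges
ORDER BY grantee, table_schema, table_name;

-- Separate environments: one catalog per environment (hypothetical names).
CREATE CATALOG IF NOT EXISTS dev;
CREATE CATALOG IF NOT EXISTS prod;

-- Engineers get access to dev; prod access is granted separately and more narrowly.
GRANT USE CATALOG ON CATALOG dev TO `data_engineers_group`;
```

Saving the audit query and running it on a schedule gives a repeatable record of who holds which privilege over time.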
By following these steps, you can effectively provide and manage access to your data assets within Unity Catalog, ensuring data security and compliance across your Databricks environment.