dlt.destinations.impl.databricks.databricks_adapter
databricks_adapter
def databricks_adapter(
    data: Any,
    cluster: Union[TColumnNames, Literal["AUTO"]] = None,
    partition: TColumnNames = None,
    table_format: Literal["DELTA", "ICEBERG"] = "DELTA",
    table_comment: Optional[str] = None,
    table_tags: Optional[List[Union[str, Dict[str, str]]]] = None,
    table_properties: Optional[Dict[str, Union[str, int, bool, float]]] = None,
    column_hints: Optional[TDatabricksTableSchemaColumns] = None
) -> DltResource
Prepares data for loading into Databricks.
This function takes data, which can be raw or already wrapped in a DltResource object, and prepares it for loading into Databricks by optionally specifying clustering, partitioning, table format, a table comment, tags, table properties, and column-level hints.
Arguments:
- data (Any) - The data to be transformed. This can be raw data or an instance of DltResource. If raw data is provided, the function wraps it in a DltResource object.
- cluster (Union[TColumnNames, Literal["AUTO"]], optional) - A column name, list of column names, or "AUTO" to cluster the Databricks table by. Use "AUTO" to let Databricks automatically determine the best clustering.
- partition (TColumnNames, optional) - A column name or list of column names to partition the Databricks table by. Partitioning divides the table into separate files based on the partition column values.
- table_format (Literal["DELTA", "ICEBERG"], optional) - The table format to use. Defaults to "DELTA". Use "ICEBERG" to create Apache Iceberg tables for better schema evolution and time travel capabilities.
- table_comment (str, optional) - A description for the Databricks table.
- table_tags (List[Union[str, Dict[str, str]]], optional) - A list of tags for the Databricks table. Can contain a mix of strings and key-value pairs as dictionaries. Example: ["production", {"environment": "prod"}, "employees"]
- table_properties (Dict[str, Union[str, int, bool, float]], optional) - A dictionary of table properties to be added to the Databricks table using TBLPROPERTIES. These are key-value pairs for metadata and Delta Lake optimization settings. Example: {"delta.appendOnly": True, "delta.logRetentionDuration": "30 days"}
- column_hints (TDatabricksTableSchemaColumns, optional) - A dictionary of column hints. Each key is a column name, and the value is a dictionary of hints. The supported hints are:
  - column_comment - adds a comment to the column. Supports basic Markdown syntax.
  - column_tags - adds tags to the column. Supports a list of strings and/or key-value pairs.
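As a quick illustration, the column_hints argument is a plain Python mapping that can be built and inspected before being passed to the adapter. A minimal sketch, where the column names ("name", "date_hired") and hint values are illustrative rather than part of the dlt API:

```python
# Sketch of a column_hints mapping for databricks_adapter.
# Each key is a column name; each value is a dict of supported hints.
column_hints = {
    "name": {
        "column_comment": "Employee **display name**",  # basic Markdown is supported
        "column_tags": ["pii", {"sensitivity": "high"}],  # strings and key-value pairs
    },
    "date_hired": {
        "column_comment": "Unix timestamp of the hire date",
    },
}

# The mapping would then be passed as:
#   databricks_adapter(data, column_hints=column_hints)
```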
 
Returns:
A DltResource object that is ready to be loaded into Databricks.
Raises:
- ValueError - If any hint is invalid or none are specified.
Examples:
    data = [{"name": "Marcel", "description": "Raccoon Engineer", "date_hired": 1700784000}]
    databricks_adapter(
        data,
        cluster="date_hired",
        table_comment="Employee Data",
        table_tags=["production", {"environment": "prod"}, "employees"],
    )
    # Use AUTO clustering
    databricks_adapter(data, cluster="AUTO", table_comment="Auto-clustered table")
    # Use partitioning
    databricks_adapter(data, partition=["year", "month"], cluster="customer_id")
    # Create Iceberg table
    databricks_adapter(data, table_format="ICEBERG", cluster="customer_id")
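Similarly, table_properties is an ordinary mapping of TBLPROPERTIES keys to values. A minimal sketch using two standard Delta Lake properties (values may be str, int, bool, or float):

```python
# Sketch: table properties passed through to TBLPROPERTIES on the
# Databricks table. delta.appendOnly and delta.logRetentionDuration
# are standard Delta Lake property keys.
table_properties = {
    "delta.appendOnly": True,
    "delta.logRetentionDuration": "30 days",
}

# The mapping would then be passed as:
#   databricks_adapter(data, table_properties=table_properties)
```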