Databricks Driver

Name: Apache.Arrow.Adbc.Drivers.Databricks
Author: The Apache Software Foundation

The Databricks ADBC driver builds on the Spark ADBC driver: it inherits all of that driver's properties and adds Databricks-specific functionality.

Database and Connection Properties

Note: The Databricks driver inherits all properties from the Spark driver. The properties below are Databricks-specific additions.

Configuration Methods

The Databricks driver supports multiple ways to configure properties:

1. Direct Property Configuration

Pass properties directly when creating the driver connection (traditional method).

Authentication Properties

Property	Description	Default
`adbc.databricks.oauth.grant_type`	The OAuth grant type. Supported values: `access_token` (personal access token), `client_credentials` (OAuth client credentials flow)	`access_token`
`adbc.databricks.oauth.client_id`	The OAuth client ID (when using `client_credentials` grant type)
`adbc.databricks.oauth.client_secret`	The OAuth client secret (when using `client_credentials` grant type)
`adbc.databricks.oauth.scope`	The OAuth scope (when using `client_credentials` grant type)	`sql`
`adbc.databricks.token_renew_limit`	Minutes before token expiration to start renewing the token. Set to 0 to disable automatic renewal	`0`
`adbc.databricks.identity_federation_client_id`	The client ID of the service principal when using workload identity federation
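
As a minimal sketch of how these properties fit together, the snippet below builds a property map for the OAuth client credentials flow and shows one possible reading of `adbc.databricks.token_renew_limit`. It is written in Python purely for illustration. The property keys come from the table above; the helper names (`validate_oauth`, `should_renew`), the placeholder credentials, and the exact renewal comparison are assumptions, not the driver's implementation.

```python
from datetime import datetime, timedelta, timezone

# Property keys are taken from the table above; the values are placeholders.
oauth_properties = {
    "adbc.databricks.oauth.grant_type": "client_credentials",
    "adbc.databricks.oauth.client_id": "<client-id>",
    "adbc.databricks.oauth.client_secret": "<client-secret>",
    "adbc.databricks.oauth.scope": "sql",
}

def validate_oauth(props):
    """Hypothetical check: list the required keys that are missing or empty."""
    grant = props.get("adbc.databricks.oauth.grant_type", "access_token")
    if grant != "client_credentials":
        return []
    required = [
        "adbc.databricks.oauth.client_id",
        "adbc.databricks.oauth.client_secret",
    ]
    return [key for key in required if not props.get(key)]

def should_renew(props, expires_at, now):
    """Assumed semantics: renew once the token is within N minutes of expiry.

    A limit of 0 (the documented default) disables automatic renewal.
    """
    limit = int(props.get("adbc.databricks.token_renew_limit", "0"))
    if limit <= 0:
        return False
    return expires_at - now <= timedelta(minutes=limit)

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
renew_props = {"adbc.databricks.token_renew_limit": "10"}
print(validate_oauth(oauth_properties))
print(should_renew(renew_props, now + timedelta(minutes=5), now))
```

With a limit of 10 minutes, a token that expires in 5 minutes falls inside the renewal window, while one that expires in 30 minutes does not.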

CloudFetch Properties

Property	Description	Default
`adbc.databricks.cloudfetch.enabled`	Whether to use CloudFetch for retrieving results	`true`
`adbc.databricks.cloudfetch.lz4.enabled`	Whether the client can decompress LZ4 compressed results	`true`
`adbc.databricks.cloudfetch.max_bytes_per_file`	Maximum bytes per file for CloudFetch. Supports unit suffixes (B, KB, MB, GB). Examples: `20MB`, `1024KB`, `20971520`	`20MB`
`adbc.databricks.cloudfetch.parallel_downloads`	Maximum number of parallel downloads	`3`
`adbc.databricks.cloudfetch.prefetch_count`	Number of files to prefetch	`2`
`adbc.databricks.cloudfetch.memory_buffer_size_mb`	Maximum memory buffer size in MB for prefetched files	`200`
`adbc.databricks.cloudfetch.prefetch_enabled`	Whether CloudFetch prefetch functionality is enabled	`true`
`adbc.databricks.cloudfetch.max_retries`	Maximum number of retry attempts for downloads	`3`
`adbc.databricks.cloudfetch.retry_delay_ms`	Delay in milliseconds between retry attempts	`500`
`adbc.databricks.cloudfetch.timeout_minutes`	Timeout in minutes for HTTP operations	`5`
`adbc.databricks.cloudfetch.url_expiration_buffer_seconds`	Buffer time in seconds before URL expiration to trigger refresh	`60`
`adbc.databricks.cloudfetch.max_url_refresh_attempts`	Maximum number of URL refresh attempts	`3`
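
The unit suffixes accepted by `adbc.databricks.cloudfetch.max_bytes_per_file` can be illustrated with a small parser. This is a sketch in Python, not the driver's own code. It assumes binary multiples (1 KB = 1024 bytes), which is inferred from the table listing `20MB` alongside `20971520`, and it assumes suffixes are case-insensitive. The function name `parse_max_bytes` is hypothetical.

```python
import re

# Assumption: binary multiples, inferred from the documented examples
# (`20MB` appears alongside `20971520`, which is 20 * 1024 * 1024).
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def parse_max_bytes(value):
    """Convert strings such as '20MB', '1024KB', or '20971520' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)?\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    # A bare number is treated as a byte count.
    return int(number) * _UNITS[(unit or "B").upper()]

for example in ("20MB", "1024KB", "20971520"):
    print(example, "->", parse_max_bytes(example))
```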

Databricks-Specific Properties

Property	Description	Default
`adbc.connection.catalog`	Optional default catalog for the session
`adbc.connection.db_schema`	Optional default schema for the session
`adbc.databricks.enable_direct_results`	Whether to enable the use of direct results when executing queries	`true`
`adbc.databricks.apply_ssp_with_queries`	Whether to apply server-side properties (SSP) with queries. If false, SSP will be applied when opening the session	`false`
`adbc.databricks.ssp_*`	Server-side properties prefix. Properties with this prefix will be passed to the server by executing "set key=value" queries
`adbc.databricks.enable_multiple_catalog_support`	Whether to use multiple catalogs	`true`
`adbc.databricks.enable_pk_fk`	Whether to enable primary key and foreign key metadata calls	`true`
`adbc.databricks.use_desc_table_extended`	Whether to use DESC TABLE EXTENDED to get extended column metadata when supported by DBR	`true`
`adbc.databricks.enable_run_async_thrift`	Whether to enable RunAsync flag in Thrift operations	`true`
`adbc.databricks.driver_config_take_precedence`	Whether driver configuration overrides passed-in properties during configuration merging	`false`
`adbc.apache.statement.batch_size`	Sets the maximum number of rows to retrieve in a single batch request	`2000000`
`adbc.apache.connection.polltime_ms`	The time in milliseconds between each poll for query execution status. Databricks default is 100ms (Apache default: 500ms)	`100`
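
To make the `adbc.databricks.ssp_*` prefix concrete, the Python sketch below turns prefixed properties into the `set key=value` statements described in the table. It assumes the prefix is stripped to form the server-side key. The function name `ssp_statements` and the sample property values are illustrative, not part of the driver.

```python
SSP_PREFIX = "adbc.databricks.ssp_"

def ssp_statements(props):
    """Build one 'set key=value' statement per property carrying the SSP prefix."""
    return [
        f"set {key[len(SSP_PREFIX):]}={value}"
        for key, value in props.items()
        if key.startswith(SSP_PREFIX)
    ]

# Sample values for illustration only.
properties = {
    "adbc.connection.catalog": "main",
    "adbc.databricks.ssp_use_cached_result": "true",
    "adbc.databricks.ssp_ansi_mode": "false",
}

for statement in ssp_statements(properties):
    print(statement)
```

Properties without the prefix, such as `adbc.connection.catalog` here, are left out of the generated statements.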

Tracing Properties

Property	Description	Default
`adbc.databricks.trace_propagation.enabled`	Whether to propagate trace parent headers in HTTP requests	`true`
`adbc.databricks.trace_propagation.header_name`	The name of the HTTP header to use for trace parent propagation	`traceparent`
`adbc.databricks.trace_propagation.state_enabled`	Whether to include trace state header in HTTP requests	`false`
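
The three tracing properties interact roughly as sketched below in Python. This is an illustration under assumptions, not the driver's code: the function name `trace_headers` is hypothetical, the trace state header is assumed to use the W3C name `tracestate`, and the sample values are the examples from the W3C Trace Context specification.

```python
def _flag(props, key, default):
    """Read a boolean-like string property, falling back to the documented default."""
    return props.get(key, default).strip().lower() == "true"

def trace_headers(props, traceparent, tracestate=None):
    """Return the HTTP headers that would carry trace context for one request."""
    if not _flag(props, "adbc.databricks.trace_propagation.enabled", "true"):
        return {}
    if not traceparent:
        return {}
    name = props.get("adbc.databricks.trace_propagation.header_name", "traceparent")
    headers = {name: traceparent}
    # Assumption: the state header uses the W3C name `tracestate`.
    if tracestate and _flag(props, "adbc.databricks.trace_propagation.state_enabled", "false"):
        headers["tracestate"] = tracestate
    return headers

parent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
state = "congo=t61rcWkgMzE"
print(trace_headers({}, parent, state))
print(trace_headers({"adbc.databricks.trace_propagation.state_enabled": "true"}, parent, state))
```

With the default settings only the trace parent header is sent. The state header is added only when `adbc.databricks.trace_propagation.state_enabled` is `true`.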

Data Types

Spark Type	Arrow Type	C# Type
ARRAY*	String	string
BIGINT	Int64	long
BINARY	Binary	byte[]
BOOLEAN	Boolean	bool
CHAR	String	string
DATE	Date32	DateTime
DECIMAL	Decimal128	SqlDecimal
DOUBLE	Double	double
FLOAT	Float	float
INT	Int32	int
INTERVAL_DAY_TIME+	String	string
INTERVAL_YEAR_MONTH+	String	string
MAP*	String	string
NULL	Null	null
SMALLINT	Int16	short
STRING	String	string
STRUCT*	String	string
TIMESTAMP	Timestamp	DateTimeOffset
TINYINT	Int8	sbyte
UNION	String	string
USER_DEFINED	String	string
VARCHAR	String	string

lidavidm/Apache.Arrow.Adbc.Drivers.Databricks v0.22.0

Tracing

Tracing Exporters

File Exporter (adbcfile)

Platform	Folder
Windows	`%LOCALAPPDATA%/Apache.Arrow.Adbc/Traces`
macOS	`$HOME/Library/Application Support/Apache.Arrow.Adbc/Traces`
Linux	`$HOME/.local/share/Apache.Arrow.Adbc/Traces`
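
The folder table above can be expressed as a small lookup. The Python sketch below resolves the folder from the platform name and environment variables. The function name `default_trace_folder` is hypothetical, and the sketch only mirrors the table; it makes no claim about how the driver itself resolves the path.

```python
import os
import sys

def default_trace_folder(platform=None, env=None):
    """Resolve the trace folder for a platform, following the table above."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    if platform.startswith("win"):
        base = env.get("LOCALAPPDATA", "")
    elif platform == "darwin":
        base = env.get("HOME", "") + "/Library/Application Support"
    else:
        # Treat every other platform as Linux for the purposes of this sketch.
        base = env.get("HOME", "") + "/.local/share"
    return base + "/Apache.Arrow.Adbc/Traces"

print(default_trace_folder())
```

Passing `platform` and `env` explicitly, for example `default_trace_folder("linux", {"HOME": "/home/me"})`, makes the lookup easy to check without touching the real environment.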