We’re excited to announce a new integration between ClickHouse and Microsoft OneLake - Microsoft Fabric’s unified data lake - powered by the OneLake Tables APIs and Apache Iceberg.
This integration will debut as a beta feature in ClickHouse 25.11 (November release) and will be available shortly after in ClickHouse Cloud.
With this new integration, ClickHouse now supports direct querying of Iceberg tables in OneLake, eliminating friction and unlocking unified, fast analytics across multiple sources.
Building a Unified Analytics and AI Platform
A common challenge for organizations, and for data teams in particular, is being able to use all of their data for analytics, GenAI and AI agents, regardless of where it’s stored. Discovery, governance and access controls are critical to delivering these projects successfully, and making them work across every data asset in various data lakes, data warehouses and operational stores is very difficult.
Microsoft OneLake enables organizations to unify their data into a common store, cataloging and securing it in a single place. They can use OneLake Shortcuts to connect external stores like Amazon S3 or Google Cloud Storage to OneLake, moving data only when it is queried. They can also continuously replicate data from popular databases using native zero-copy mirroring, or build their own replication with Open Mirroring. All of this data is cataloged in the OneLake catalog and secured using OneLake Security so it can be safely accessed from Microsoft Fabric, Azure AI Foundry and many popular third-party engines like ClickHouse.
OneLake’s Iceberg-compatible Tables APIs make it possible for ClickHouse to discover and query data in OneLake with minimal configuration. This ease of use and ecosystem interoperability enable a unified analytical platform that combines flexibility with control, so users can quickly build and deliver analytics and AI projects with all of their data: enterprise, operational, logs, and third-party.
How does the integration work?
To use this new integration, you only need two components:
- A ClickHouse instance (even ClickHouse local will suffice)
- A Microsoft Fabric account
Tables can be written into OneLake using Fabric-managed engines like Spark or Data Warehouse, or with your tool of choice. You can write tables in either Delta Lake or Apache Iceberg format. OneLake’s format virtualization automatically converts between the two formats and keeps both in sync, making it easy to bring your engine of choice without converting or migrating your data.
OneLake also exposes an Iceberg REST catalog via OneLake Table APIs, which ClickHouse uses as an interface to discover and query the underlying Iceberg tables directly.
For security, ClickHouse authenticates against the OneLake Table APIs with Microsoft Entra ID (formerly Azure Active Directory), ensuring secure and controlled access to data stored in OneLake. The same access token issued by Entra is then used to read the data from OneLake. Access permissions are managed through service principals in Entra or through Fabric workspace-level permissions.
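To make the authentication step more concrete, here is a minimal Python sketch of what an OAuth 2.0 client-credentials token request to Entra ID generally looks like. It only builds the request and does not send it. The token endpoint pattern and the scope are taken from the connection settings shown later in this post; the helper name `build_token_request` and its parameters are hypothetical, and the assumption that ClickHouse performs this exact flow internally is ours rather than something stated here.

```python
from urllib.parse import urlencode

# Scope taken from the auth_scope setting used later in this post.
ONELAKE_SCOPE = "https://storage.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    Hypothetical helper for illustration only: it shows the shape of the
    request a service principal would make, and sends nothing over the network.
    """
    # Endpoint pattern taken from the oauth_server_uri setting in this post.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",  # standard OAuth 2.0 grant type
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": ONELAKE_SCOPE,
        }
    )
    return url, body


if __name__ == "__main__":
    url, body = build_token_request("<tenant_id>", "<client_id>", "<client_secret>")
    print(url)
    print(body)
```

The body is form-encoded, so it would be sent as an HTTP POST with the `application/x-www-form-urlencoded` content type. In normal use you do not need to do this yourself, since ClickHouse only needs the service principal details in its connection settings.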
Getting Started with ClickHouse and OneLake
Getting started with ClickHouse and OneLake is simple. You can deploy ClickHouse on Azure or directly from the Azure Marketplace; make sure you’re running version 25.11 or later. Once your service is deployed, you can enable the OneLake integration and run the queries below to access your Iceberg tables directly. Read more in the ClickHouse documentation for OneLake.
Querying OneLake from ClickHouse
It is very simple to start querying OneLake using ClickHouse. Here, you’ll find a step-by-step guide to obtaining all the information needed to connect ClickHouse to your OneLake Iceberg catalog.
Since this integration is still in beta, you’ll also need to enable it in your session first:
SET allow_database_iceberg=1
Creating a connection to OneLake
Once the prerequisites are met, you can create a connection to your OneLake Iceberg catalog by running the following command in ClickHouse:
CREATE DATABASE onelake_catalog
ENGINE = DataLakeCatalog('https://onelake.table.fabric.microsoft.com/iceberg')
SETTINGS
    catalog_type = 'onelake',
    warehouse = 'warehouse_uuid/data_item_uuid',
    onelake_tenant_id = '<tenant_id>',
    oauth_server_uri = 'https://login.microsoftonline.com/<tenant_uuid>/oauth2/v2.0/token',
    auth_scope = 'https://storage.azure.com/.default',
    onelake_client_id = '<client_id>',
    onelake_client_secret = '<client_secret>';
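If you prefer not to paste credentials into the statement by hand, one option is to render it from environment variables. The following Python sketch does this under a few assumptions: the environment variable names and the helper functions are hypothetical, and the same tenant ID is assumed to be used for both `onelake_tenant_id` and the token endpoint in `oauth_server_uri`.

```python
import os
import re

# Template mirroring the CREATE DATABASE statement from this post.
CREATE_TEMPLATE = """CREATE DATABASE {database}
ENGINE = DataLakeCatalog('https://onelake.table.fabric.microsoft.com/iceberg')
SETTINGS
    catalog_type = 'onelake',
    warehouse = '{warehouse}',
    onelake_tenant_id = '{tenant_id}',
    oauth_server_uri = 'https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token',
    auth_scope = 'https://storage.azure.com/.default',
    onelake_client_id = '{client_id}',
    onelake_client_secret = '{client_secret}';"""


def sql_quote(value: str) -> str:
    """Escape backslashes and single quotes for a SQL string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def render_create_database(database, warehouse, tenant_id, client_id, client_secret):
    """Render the statement; the database name must be a plain identifier."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unexpected database name: {database!r}")
    return CREATE_TEMPLATE.format(
        database=database,
        warehouse=sql_quote(warehouse),
        tenant_id=sql_quote(tenant_id),
        client_id=sql_quote(client_id),
        client_secret=sql_quote(client_secret),
    )


if __name__ == "__main__":
    # Hypothetical variable names; placeholders are used when they are unset.
    print(
        render_create_database(
            database="onelake_catalog",
            warehouse=os.environ.get("ONELAKE_WAREHOUSE", "warehouse_uuid/data_item_uuid"),
            tenant_id=os.environ.get("ONELAKE_TENANT_ID", "<tenant_id>"),
            client_id=os.environ.get("ONELAKE_CLIENT_ID", "<client_id>"),
            client_secret=os.environ.get("ONELAKE_CLIENT_SECRET", "<client_secret>"),
        )
    )
```

The rendered text can then be passed to whichever ClickHouse client you already use. Keep in mind that the output contains the client secret in plain text, so avoid writing it to logs or shell history.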
Querying OneLake tables
After creating the connection, you can list all the tables available in the catalog:
SHOW TABLES FROM onelake_catalog;
Then, you can query any table directly from ClickHouse:
SELECT count(*)
FROM onelake_catalog.`year_2017.green_tripdata_2017`
WHERE VendorID = 2;
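Note that the table in the query above is addressed as a single backtick-quoted name, with what appears to be the namespace and the table name joined by a dot. If you generate queries programmatically, a small helper can build that name for you. This Python sketch assumes that reading of the naming scheme; `qualified_table` is a hypothetical function, and it rejects names containing backticks instead of trying to escape them.

```python
def qualified_table(database: str, namespace: str, table: str) -> str:
    """Return database.`namespace.table` as used in the query above.

    Hypothetical helper: it refuses names containing backticks so that the
    quoted identifier cannot be broken out of.
    """
    for part in (database, namespace, table):
        if "`" in part:
            raise ValueError(f"backtick not allowed in name: {part!r}")
    return f"{database}.`{namespace}.{table}`"


if __name__ == "__main__":
    name = qualified_table("onelake_catalog", "year_2017", "green_tripdata_2017")
    # Rebuilds the count query from this post using the generated name.
    print(f"SELECT count(*) FROM {name} WHERE VendorID = 2;")
```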
What’s next
The 25.11 release is just the first step toward deeper integration with the OneLake ecosystem.
We’re already working on several enhancements that will be introduced in upcoming releases, including:
- General Availability: Promoting the feature from beta to GA by improving usability and performance and incorporating user feedback.
- Write support: Adding support for writing data back to your Iceberg tables in OneLake.
- Enhanced cloud integration: Introducing a new user interface in ClickHouse Cloud to easily create connections to the OneLake catalog and query your data directly from the UI.
With ClickHouse and Microsoft OneLake, customers can bring together their vast data assets through OneLake native integrations, Shortcuts and Mirroring, then query them, run complex analytics and drive GenAI and AI agent initiatives with ease. Get started today at ClickHouse.com.



