How to extract lineage from Databricks

Once you have crawled assets from Databricks, you can retrieve lineage from Unity Catalog.

🚨 Careful! Your Databricks workspace must be Unity Catalog-enabled for the retrieval to succeed. You may also need to upgrade existing tables and views to Unity Catalog and ask your Databricks account executive to enable lineage in Unity Catalog. (As of publishing, the feature is still in preview from Databricks on AWS and Azure.)
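
If you want to confirm that Unity Catalog is actually capturing lineage before you set up the extractor, one option is to query Databricks directly. The following is a minimal sketch, not part of the extractor itself. It assumes the table-lineage endpoint of the Databricks Data Lineage REST API (/api/2.0/lineage-tracking/table-lineage) and a personal access token; verify both against the Databricks documentation for your workspace. The workspace URL, token, and table name are hypothetical placeholders.

```python
import json
import urllib.request

# Assumed endpoint path of the Databricks Data Lineage REST API;
# verify it against the Databricks documentation for your workspace.
LINEAGE_PATH = "/api/2.0/lineage-tracking/table-lineage"


def build_table_lineage_request(workspace_url, token, table_name):
    """Build (but do not send) a table-lineage request for one table.

    workspace_url, token and table_name are placeholders supplied by
    the caller; table_name is the full catalog.schema.table name.
    """
    url = workspace_url.rstrip("/") + LINEAGE_PATH
    body = json.dumps(
        {"table_name": table_name, "include_entity_lineage": True}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="GET",  # assumption: the endpoint takes a GET with a JSON body
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def fetch_table_lineage(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, pass your own workspace URL, token, and a table name to build_table_lineage_request, then hand the result to fetch_table_lineage. A response that lists upstream or downstream tables suggests lineage is being captured; an error or an empty response may mean lineage is not yet enabled for the workspace.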

To retrieve lineage from Databricks, complete the following steps.

Select the extractor

To select the Databricks Lineage extractor:

  1. In the top right of any screen, navigate to New and then click New Workflow.
  2. From the filters along the top, click Miner.
  3. From the list of packages, select Databricks Lineage and click on Setup Workflow.

Configure the extractor

To configure the Databricks Lineage extractor:

  • For Connection, select the Databricks connection from which to extract lineage. (To select a connection, the crawler must have already run.)

Run the extractor

To run the Databricks Lineage extractor, after completing the steps above:

  • To run the extractor once, immediately, click the Run button at the bottom of the screen.
  • To schedule the extractor to run hourly, daily, weekly or monthly, click the Schedule & Run button at the bottom of the screen.

Once the extractor has completed running, you will see lineage for Databricks assets! 🎉
