How to integrate Airflow/OpenLineage

🧪 Preview feature! This feature is available for your experimentation, and we'd love your feedback. It may change before its final generally-available form. If you'd like to participate in the preview, reach out to your customer success manager for more information.

To integrate Airflow/OpenLineage with Atlan, complete the following steps.

Create an API token in Atlan

Before running the workflow, you will need to create an API token in Atlan.

Select the source in Atlan

To select Airflow/OpenLineage as your source, from within Atlan:

  1. In the top right of any screen, click New and then click New workflow.
  2. From the filters along the top, click Orchestrator.
  3. From the list of packages, select Airflow Assets and then click Setup Workflow.

Configure the integration in Atlan

To configure the Airflow/OpenLineage connection, from within Atlan:

  1. For Host, enter the URL of your Airflow UI.
  2. For Port, enter the port number for your Airflow UI.
  3. For Enable OpenLineage Events, click Yes to enable the processing of OpenLineage events or click No to disable it. If disabled, new events will not be processed in Atlan.
  4. For Connection Name, provide a connection name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
  5. (Optional) To change the users who are able to manage this connection, change the users or groups listed under Connection Admins.
    🚨 Careful! If you do not specify any user or group, no one will be able to manage the connection, not even admins.
  6. To run the workflow, at the bottom of the screen, click the Run button.

Configure the integration in Airflow/OpenLineage

💪 Did you know? You will need the Atlan API token and connection name to configure the integration in Airflow/OpenLineage. This will allow Airflow to connect with the OpenLineage API and send events to Atlan.

To configure Airflow to send OpenLineage events to Atlan:

  1. Based on your Airflow version, there may be additional prerequisites for using OpenLineage:
    • For Airflow versions 2.7 onward, OpenLineage is a built-in integration, so no additional setup is required for this step. Skip to step 2.
    • For Airflow versions 2.3 to 2.6, download and install the latest openlineage-airflow library and update the requirements.txt file of your Airflow instance with:
      openlineage-airflow
    • For Airflow versions 2.2 and older:
      1. Download and install the latest openlineage-airflow library and update the requirements.txt file of your Airflow instance with:
        openlineage-airflow
      2. Set the lineage backend in your airflow.cfg file or through the AIRFLOW__LINEAGE__BACKEND environment variable to:
        openlineage.lineage_backend.OpenLineageBackend
  2. Add the following environment variables to your project's .env file:
    • OPENLINEAGE_URL: points to the service that will consume OpenLineage events, for example, https://<instance>.atlan.com/events/openlineage/airflow/.
    • OPENLINEAGE_API_KEY: set the API token generated in Atlan.
    • OPENLINEAGE_NAMESPACE: set the connection name exactly as configured in Atlan.
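
The two steps above can be sketched as a small pre-flight check. This is an illustrative sketch only, not part of Atlan or OpenLineage: the helper names prerequisites_for and check_openlineage_env are hypothetical, and the version thresholds and variable names are simply copied from the steps above.

```python
import os
from typing import List, Mapping

# Variable names taken from step 2 above.
REQUIRED_VARS = ("OPENLINEAGE_URL", "OPENLINEAGE_API_KEY", "OPENLINEAGE_NAMESPACE")


def prerequisites_for(airflow_version: str) -> List[str]:
    """Return the step 1 prerequisites for a version string such as "2.6.3".

    Hypothetical helper: it only mirrors the version ranges listed in step 1.
    """
    major, minor = (int(part) for part in airflow_version.split(".")[:2])
    if (major, minor) >= (2, 7):
        return []  # step 1 lists no extra prerequisite for 2.7 onward
    steps = ["add openlineage-airflow to requirements.txt"]
    if (major, minor) <= (2, 2):
        steps.append(
            "set AIRFLOW__LINEAGE__BACKEND to "
            "openlineage.lineage_backend.OpenLineageBackend"
        )
    return steps


def check_openlineage_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper: it checks only that the values exist and that the
    <instance> placeholder was replaced. It does not contact Atlan.
    """
    problems = [
        f"{name} is not set"
        for name in REQUIRED_VARS
        if not env.get(name, "").strip()
    ]
    if "<instance>" in env.get("OPENLINEAGE_URL", ""):
        problems.append("OPENLINEAGE_URL still contains the <instance> placeholder")
    return problems


if __name__ == "__main__":
    for step in prerequisites_for("2.6.3"):  # example version, adjust to yours
        print("Prerequisite:", step)
    for problem in check_openlineage_env(os.environ):
        print("Problem:", problem)
```

Running the script in the same environment as your Airflow instance prints any missing prerequisite or variable. It cannot confirm that OPENLINEAGE_NAMESPACE matches the connection name in Atlan, so compare that value manually.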

Once the orchestrator has completed running, you will see Airflow DAGs and tasks along with lineage from OpenLineage events in Atlan! 🎉
