How to integrate Astronomer/OpenLineage

To integrate Astronomer/OpenLineage with Atlan, complete the following steps. To learn more about OpenLineage, refer to OpenLineage configuration and facets.

💪 Did you know? For Apache Airflow operators supported for OpenLineage extraction, you can refer to Airflow's Supported operators documentation. To learn how to extract lineage through OpenLineage methods, custom extractors, or manually annotated lineage, see Implementing OpenLineage in Operators.

Create an API token in Atlan

Before running the workflow, you will need to create an API token in Atlan.

Configure the integration in Atlan

Select the source

To select Astronomer/OpenLineage as your source, from within Atlan:

  1. In the top right of any screen, click New and then click New workflow.
  2. From the filters along the top, click Orchestrator.
  3. From the list of packages, select Astronomer Airflow Assets and then click Setup Workflow.

Create the connection

You only need to create a connection once to enable Atlan to receive incoming OpenLineage events. Once you have set up the connection, you do not need to rerun or schedule the workflow. Atlan will process OpenLineage events whenever your DAGs run and catalog your Apache Airflow assets.

To configure the Astronomer/OpenLineage connection, from within Atlan:

  1. For Connection Name, provide a connection name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
  2. (Optional) To change the users who are able to manage this connection, change the users or groups listed under Connection Admins.
    🚨 Careful! If you do not specify any user or group, no one will be able to manage the connection — not even admins.
  3. (Optional) For Host, enter the URL of your Astronomer Airflow UI. This will allow Atlan to help you view your assets directly in Astronomer from the asset profile.
  4. (Optional) For Port, enter the port number for your Astronomer Airflow UI.
  5. For Enable OpenLineage Events, click Yes to enable the processing of OpenLineage events or click No to disable it. If disabled, new events will not be processed in Atlan.
  6. To create a connection, at the bottom of the screen, click the Create connection button.

Configure the integration in Astronomer

💪 Did you know? You will need the Atlan API token and connection name to configure the integration in Astronomer. This will allow Astronomer to connect with the OpenLineage API and send events to Atlan.
🚨 Careful! Atlan does not support integrating with Apache Airflow versions older than 2.5.0.

Astronomer has a built-in OpenLineage integration. Atlan recommends using OpenLineage version 1.2.1 or later. You will need to use environment variables in Astronomer to set custom values for the integration with Atlan.

To configure Astronomer to send OpenLineage events to Atlan:

  1. Open your Astronomer console and select a workspace.
  2. In the left menu under Workspace, click Deployments and then select the required deployment.
  3. On your deployment page, click the Variables tab.
  4. On the Variables page, click the Edit variables button.
  5. Add the following environment variable keys and corresponding values:
    • For Apache Airflow versions 2.7.0 onward:
      • AIRFLOW__OPENLINEAGE__NAMESPACE: set to the connection name exactly as configured in Atlan.
      • OPENLINEAGE_DISABLED and AIRFLOW__OPENLINEAGE__DISABLED: set both to false to enable the OpenLineage listener in Apache Airflow, if disabled by default.
      • AIRFLOW__OPENLINEAGE__TRANSPORT: specify details of where and how to send OpenLineage events in the following JSON string format:
        {
          "type": "http",
          "url": "https://<instance>.atlan.com/events/openlineage/airflow-astronomer/",
          "auth": {
            "type": "api_key",
            "api_key": "<API_token>"
          }
        }
        • Replace <instance> with the name of your Atlan instance.
        • Replace <API_token> with the API token generated in Atlan.
    • For Apache Airflow versions 2.5.0 onward and prior to 2.7.0: 
      • OPENLINEAGE_URL: set to the URL of the service that will consume OpenLineage events, for example, https://<instance>.atlan.com/events/openlineage/airflow-astronomer/.
      • OPENLINEAGE_API_KEY: set the API token generated in Atlan.
      • OPENLINEAGE_NAMESPACE: set to the connection name exactly as configured in Atlan.
      • OPENLINEAGE_DISABLED and AIRFLOW__OPENLINEAGE__DISABLED: set both to false to enable the OpenLineage listener in Apache Airflow, if disabled by default.
  6. Click Update Environment Variables to save your changes. It can take up to two minutes for new variables to be applied to your deployment.
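
The variable sets above differ by Apache Airflow version, and the transport value must be a single valid JSON string. The following is a minimal Python sketch, not an official Atlan or Astronomer tool, that assembles the keys and values listed in step 5 so they can be copied into the Variables page. The function name build_openlineage_env and its parameters are hypothetical and used only for illustration.

```python
import json


def build_openlineage_env(instance, api_token, connection_name, airflow_version):
    """Return the environment variables from step 5 as a dict of strings.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    url = f"https://{instance}.atlan.com/events/openlineage/airflow-astronomer/"

    # Compare only major and minor parts, so "2.10.1" sorts after "2.7.0".
    major_minor = tuple(int(part) for part in airflow_version.split(".")[:2])
    if major_minor < (2, 5):
        raise ValueError("Apache Airflow versions older than 2.5.0 are not supported")

    # Both flags are set to false in either version range.
    env = {
        "OPENLINEAGE_DISABLED": "false",
        "AIRFLOW__OPENLINEAGE__DISABLED": "false",
    }

    if major_minor >= (2, 7):
        transport = {
            "type": "http",
            "url": url,
            "auth": {"type": "api_key", "api_key": api_token},
        }
        env["AIRFLOW__OPENLINEAGE__NAMESPACE"] = connection_name
        # json.dumps guarantees the value is a well-formed JSON string.
        env["AIRFLOW__OPENLINEAGE__TRANSPORT"] = json.dumps(transport)
    else:
        env["OPENLINEAGE_URL"] = url
        env["OPENLINEAGE_API_KEY"] = api_token
        env["OPENLINEAGE_NAMESPACE"] = connection_name

    return env
```

For example, calling build_openlineage_env("<instance>", "<API_token>", "production", "2.8.1") with your own instance name and token returns the four keys for Apache Airflow 2.7.0 onward. Generating the transport value with json.dumps rather than typing it by hand avoids stray commas or quotes in the JSON string.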

Verify the Atlan connection in Astronomer

To verify connectivity between Astronomer and Atlan:

  1. For Verify connection with Astronomer, click the clipboard icon to copy the preflight check DAG, then run it on your Astronomer instance to test connectivity with Atlan. If you encounter any errors after running the DAG, refer to the preflight checks documentation.
  2. Click Done to complete setup.

Once your DAGs have completed running in Apache Airflow, you will see Apache Airflow DAGs and tasks along with lineage from OpenLineage events in Atlan! 🎉

You can also view event logs in Atlan to track and debug events received from OpenLineage.
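
If no events appear in Atlan, one thing worth ruling out is a missing or malformed environment variable. The following is a rough Python sketch, separate from and not a replacement for Atlan's preflight check DAG, that inspects the variables described in this article. The function name check_openlineage_env is hypothetical, and the checks cover only what this article states.

```python
import json
import os


def check_openlineage_env(env=None):
    """Return a list of problems found in the OpenLineage variables.

    Hypothetical helper: it only checks the keys named in this article.
    """
    env = os.environ if env is None else env
    problems = []

    # An unset flag is treated as acceptable; an explicit non-false value is not.
    for key in ("OPENLINEAGE_DISABLED", "AIRFLOW__OPENLINEAGE__DISABLED"):
        if env.get(key, "false").strip().lower() != "false":
            problems.append(f"{key} is not set to false")

    transport_raw = env.get("AIRFLOW__OPENLINEAGE__TRANSPORT")
    if transport_raw is not None:
        # Apache Airflow 2.7.0 onward: transport is a JSON string.
        try:
            transport = json.loads(transport_raw)
        except json.JSONDecodeError:
            problems.append("AIRFLOW__OPENLINEAGE__TRANSPORT is not valid JSON")
        else:
            if not isinstance(transport, dict):
                problems.append("AIRFLOW__OPENLINEAGE__TRANSPORT is not a JSON object")
            else:
                if transport.get("type") != "http":
                    problems.append('transport "type" is not "http"')
                if not transport.get("url"):
                    problems.append('transport "url" is missing')
                auth = transport.get("auth")
                if not isinstance(auth, dict) or not auth.get("api_key"):
                    problems.append('transport "auth" has no "api_key"')
        if not env.get("AIRFLOW__OPENLINEAGE__NAMESPACE"):
            problems.append("AIRFLOW__OPENLINEAGE__NAMESPACE is missing")
    elif env.get("OPENLINEAGE_URL"):
        # Apache Airflow 2.5.0 onward and prior to 2.7.0: separate variables.
        for key in ("OPENLINEAGE_API_KEY", "OPENLINEAGE_NAMESPACE"):
            if not env.get(key):
                problems.append(f"{key} is missing")
    else:
        problems.append(
            "neither AIRFLOW__OPENLINEAGE__TRANSPORT nor OPENLINEAGE_URL is set"
        )

    return problems
```

Calling check_openlineage_env() with no arguments reads the current process environment, so it could be run from a task inside the deployment; an empty list means none of these particular problems were found. It does not confirm that the namespace matches the connection name in Atlan or that the API token is valid.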
