How to integrate Google Cloud Composer/OpenLineage

To integrate Google Cloud Composer/OpenLineage with Atlan, complete the following steps.

💪 Did you know? For the list of Airflow operators that support OpenLineage extraction, refer to Airflow's Supported operators documentation.

Create an API token in Atlan

Before running the workflow, you will need to create an API token in Atlan.

Configure the integration in Atlan

Select the source in Atlan

To select Google Cloud Composer/OpenLineage as your source, from within Atlan:

  1. In the top right of any screen, click New and then click New workflow.
  2. From the filters along the top, click Orchestrator.
  3. From the list of packages, select Google Cloud Composer Airflow Assets and then click Setup Workflow.

Create the connection

To configure the Google Cloud Composer/OpenLineage connection, from within Atlan:

  1. For Connection Name, provide a connection name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
  2. (Optional) To change the users who are able to manage this connection, change the users or groups listed under Connection Admins.
    🚨 Careful! If you do not specify any user or group, no one will be able to manage the connection — not even admins.
  3. (Optional) For Host, enter the URL of your Google Cloud Composer Airflow UI. This will allow Atlan to help you view your assets directly in Google Cloud Composer from the asset profile.
  4. (Optional) For Port, enter the port number for your Google Cloud Composer Airflow UI.
  5. For Enable OpenLineage Events, click Yes to enable the processing of OpenLineage events or click No to disable it. If disabled, new events will not be processed in Atlan.
  6. To create a connection, at the bottom of the screen, click the Create connection button.

Configure the integration in Google Cloud Composer

💪 Did you know? You will need the Atlan API token and connection name to configure the integration in Google Cloud Composer. This will allow Google Cloud Composer to connect with the OpenLineage API and send events to Atlan.
🚨 Careful! Atlan does not support integrating with Airflow versions older than 2.5.0. Additionally, Google Cloud Composer does not support Airflow versions 2.7.0 and above.
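
Taken together, the two constraints above imply a supported Airflow range of 2.5.0 up to, but not including, 2.7.0. A minimal sketch of that check follows; the function name is hypothetical and it assumes you pass a plain Airflow version string such as "2.6.3":

```python
import re

# Bounds taken from the note above: Atlan needs Airflow 2.5.0 or newer,
# and Google Cloud Composer does not offer 2.7.0 or above.
MIN_SUPPORTED = (2, 5, 0)
FIRST_UNSUPPORTED = (2, 7, 0)


def is_supported_airflow(version: str) -> bool:
    """Return True if the Airflow version falls in [2.5.0, 2.7.0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"not a recognizable Airflow version: {version!r}")
    parsed = tuple(int(part) for part in match.groups())
    return MIN_SUPPORTED <= parsed < FIRST_UNSUPPORTED
```

For example, is_supported_airflow("2.6.3") is true, while "2.4.3" and "2.7.0" are both rejected.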

To configure Google Cloud Composer to send OpenLineage events to Atlan:

  1. You will need to specify environment variables in Google Cloud Composer for the integration. To set environment variables in Google Cloud Composer:
    1. Open your Google Cloud console and navigate to the Environments page.
    2. From the list of environments, click the name of your environment.
    3. In the Environment details page, click the Environment variables tab and then click Edit.
    4. Add the following environment variable names and corresponding values:
      • OPENLINEAGE_URL: points to the service that will consume OpenLineage events — for example, https://<instance>.atlan.com/events/openlineage/airflow-cloud-composer/.
      • OPENLINEAGE_API_KEY: set the API token generated in Atlan.
      • OPENLINEAGE_NAMESPACE: set the connection name as exactly configured in Atlan.
    5. Click Save to save your changes.
  2. You will also need to install the OpenLineage PyPI package in Google Cloud Composer. To install the OpenLineage PyPI package in your environment:
    1. In the Environment details page, click the PyPI packages tab and then click Edit.
    2. Click Add package to add a custom package.
    3. Under PyPI packages, for Package name, specify the package name:
      openlineage-airflow
    4. Click Save to save your configuration.
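
If you prefer to script these settings rather than use the console, the sketch below assembles the three environment variables and the matching gcloud arguments. It only builds the values and does not run anything. The helper names, and the example instance, environment, and location values, are hypothetical. The --update-env-variables and --update-pypi-package flags belong to gcloud composer environments update, but check their exact behavior against the gcloud reference for your SDK version before relying on them.

```python
from typing import Dict, List
from urllib.parse import urlparse


def build_openlineage_env(instance: str, api_token: str, connection_name: str) -> Dict[str, str]:
    """Return the three environment variables described in the steps above."""
    if not instance or not api_token or not connection_name:
        raise ValueError("instance, api_token, and connection_name are all required")
    url = f"https://{instance}.atlan.com/events/openlineage/airflow-cloud-composer/"
    # Sanity-check the URL shape before handing it to Cloud Composer.
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"unexpected URL: {url}")
    return {
        "OPENLINEAGE_URL": url,
        "OPENLINEAGE_API_KEY": api_token,
        # Must match the connection name exactly as configured in Atlan.
        "OPENLINEAGE_NAMESPACE": connection_name,
    }


def gcloud_update_commands(environment: str, location: str, env_vars: Dict[str, str]) -> List[List[str]]:
    """Build (but do not run) one gcloud command per configuration step."""
    # Assumption: values contain no commas, since the pairs are comma-separated.
    if any("," in value for value in env_vars.values()):
        raise ValueError("values containing commas need gcloud's alternate delimiter syntax")
    pairs = ",".join(f"{name}={value}" for name, value in env_vars.items())
    base = ["gcloud", "composer", "environments", "update", environment, "--location", location]
    return [
        base + [f"--update-env-variables={pairs}"],
        base + ["--update-pypi-package=openlineage-airflow"],
    ]


if __name__ == "__main__":
    env = build_openlineage_env("example-instance", "<api-token>", "production")
    for command in gcloud_update_commands("example-environment", "us-central1", env):
        print(" ".join(command))
```

The two commands are kept separate because the steps above also apply the environment variables and the PyPI package as two separate saves.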

Verify the Atlan connection in Google Cloud Composer

To verify connectivity between Google Cloud Composer and Atlan:

  1. For Verify connection with Cloud Composer, click the clipboard icon to copy the preflight check DAG, then run it on your Google Cloud Composer instance to test connectivity with Atlan. If you encounter any errors after running the DAG, refer to the preflight checks documentation.
  2. Click Done to complete setup.
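
If the preflight check DAG reports errors, first confirm that the settings from the previous section actually reached your environment. The sketch below is not Atlan's preflight check DAG. It is a hypothetical local check that looks for the three environment variables and for an importable top-level openlineage package, which is assumed here to be what the openlineage-airflow installation provides.

```python
import importlib.util
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = ("OPENLINEAGE_URL", "OPENLINEAGE_API_KEY", "OPENLINEAGE_NAMESPACE")


def missing_openlineage_vars(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def openlineage_package_installed() -> bool:
    """Return True if a top-level 'openlineage' package can be found."""
    return importlib.util.find_spec("openlineage") is not None


if __name__ == "__main__":
    missing = missing_openlineage_vars()
    print("Missing environment variables:", ", ".join(missing) if missing else "none")
    print("openlineage package found:", openlineage_package_installed())
```

You could call these functions from a task in a throwaway DAG so that they run inside the Composer workers, where the environment variables and PyPI packages are applied.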

Once your DAGs have completed running in Airflow, you will see Airflow DAGs and tasks along with lineage from OpenLineage events in Atlan! 🎉
