What is lineage?

Data lineage captures how data moves across your data landscape. This information is useful to:

  • Trace data's origins, to assist with root cause analysis
  • Trace data's destinations, to assist with impact analysis
  • Automate the propagation of metadata to derived assets

💪 Did you know? Tag propagation is disabled by default in Atlan. You can enable tag propagation to child and downstream assets.

Root cause analysis

Root cause analysis is about identifying the underlying causes of a data problem. You want to know where the data came from and what happened to it before it got to you. With root cause analysis, your focus is on these upstream sources and transformations.

Impact analysis

Impact analysis is about identifying potential consequences of changes. You want to know where the data is going and what could happen to others if you change it. With impact analysis, the primary focus is on these downstream systems and consumers.
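
To make the two directions concrete, here is a minimal sketch in Python. It is not how Atlan implements lineage; the asset names and the `DOWNSTREAM` mapping are hypothetical, chosen only to show that root cause analysis walks a lineage graph upstream while impact analysis walks it downstream.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds data into the assets in its list.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["marts.revenue"],
    "staging.customers": ["marts.revenue"],
    "marts.revenue": ["dashboard.sales"],
}


def invert(edges):
    """Flip the direction of every edge, giving an upstream lookup."""
    inverted = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


def reachable(start, edges):
    """Breadth-first walk returning every asset reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in edges.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


UPSTREAM = invert(DOWNSTREAM)

# Root cause analysis: which sources and transformations feed the dashboard?
root_cause_candidates = reachable("dashboard.sales", UPSTREAM)

# Impact analysis: which assets are affected if raw.orders changes?
impacted_assets = reachable("raw.orders", DOWNSTREAM)
```

The same traversal serves both questions; only the direction of the edges changes.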

💪 Did you know? When viewing lineage in Atlan, hover over any asset to view a metadata popover. The popover displays relevant metadata for the asset, giving you more context for your analysis. For example, it shows database and schema names for Snowflake assets, project names for dbt models, and more.

How does it work?

Atlan constructs lineage by combining assets and processes:

  • Assets represent the inputs and outputs of processes — databases, dashboards, and so on.
  • Processes represent the activities that move or transform data between the assets. (Processes are the lines between the assets in Atlan's graphical view.)
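
The asset-and-process model described above can be sketched as a small data structure. This is an illustration under stated assumptions, not Atlan's internal representation: the `Asset` and `Process` classes, their fields, and the example names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """An input or output of a process, such as a table or dashboard."""
    name: str
    kind: str


@dataclass(frozen=True)
class Process:
    """An activity that moves or transforms data between assets."""
    name: str
    inputs: tuple
    outputs: tuple


def edges(processes):
    """Expand each process into (input, process name, output) links."""
    return [
        (source.name, process.name, target.name)
        for process in processes
        for source in process.inputs
        for target in process.outputs
    ]


orders = Asset("orders", "table")
revenue = Asset("revenue", "table")
sales = Asset("sales", "dashboard")

processes = [
    Process("aggregate_orders", inputs=(orders,), outputs=(revenue,)),
    Process("render_sales", inputs=(revenue,), outputs=(sales,)),
]

# Each tuple corresponds to one line drawn between two assets in a lineage graph.
lineage_edges = edges(processes)
```

In this sketch the process carries the relationship: an asset never points directly at another asset, so removing a process removes the line between them.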

Atlan chains these together into a flow of data, drawing lineage information from several sources:

SQL parsing

Atlan parses SQL queries to determine how data stores have created or transformed assets. Examples of this include:

API crawling

Atlan also retrieves lineage information for assets from APIs. Examples of this include:

API ingestion

Atlan provides built-in lineage extraction for the tools above. But you can also extend lineage with your own information using Atlan's open APIs. You can use these to integrate lineage from your own home-grown tools or orchestration suites like Apache Airflow and Dagster.
