Troubleshooting Apache Spark/OpenLineage connectivity

Does Atlan support Column Level Lineage (CLL) for object storage?

Atlan currently does not support column-level lineage (CLL) for object storage. Unlike relational data sources, where columns and relationships are clearly defined, object storage systems do not enforce a structured schema on the objects they hold.

Without a defined schema, there are no columns between which lineage can be traced. For example, a collection of image files stored in an S3 bucket has no columns, so column-level lineage cannot be derived for it.

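To get column-level lineage for data held in object storage, register the S3 objects as tables using AWS Glue, Hive, or a similar cataloging tool. Registering the objects as tables gives the data a defined schema with named columns.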
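
As a rough illustration of what registering S3 objects as a table can look like, the sketch below builds a Hive-style CREATE EXTERNAL TABLE statement that declares columns for files already sitting in a bucket. The helper name build_external_table_ddl, the database and table names, the column list, and the bucket path are all hypothetical and not taken from Atlan's documentation. Adapt them to your own catalog.

```python
from typing import Sequence, Tuple


def build_external_table_ddl(
    database: str,
    table: str,
    columns: Sequence[Tuple[str, str]],
    s3_location: str,
    file_format: str = "PARQUET",
) -> str:
    """Return a Hive-style CREATE EXTERNAL TABLE statement for files in S3.

    Hypothetical helper for illustration only; it builds the statement as a
    string and does not connect to any catalog.
    """
    if not columns:
        # A table with no columns gives lineage nothing to attach to.
        raise ValueError("at least one column is required to define a schema")
    column_defs = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_defs}\n"
        f")\n"
        f"STORED AS {file_format}\n"
        f"LOCATION '{s3_location}'"
    )


# Assumed example values: database, table, columns, and bucket are made up.
ddl = build_external_table_ddl(
    database="sales",
    table="orders",
    columns=[("order_id", "BIGINT"), ("customer_id", "BIGINT"), ("amount", "DOUBLE")],
    s3_location="s3://example-bucket/warehouse/orders/",
)
print(ddl)
```

On a Spark session created with Hive support enabled, a statement like this could be run with spark.sql(ddl). The exact location scheme and the catalog that receives the table depend on your environment, so treat this as a starting point rather than a required procedure.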

Column-level lineage support is also not available for the following Apache Airflow distributions:
