Troubleshooting lineage

So you've crawled your source and mined the queries, but lineage is missing. Why?

Where to look first?

Views

  • Check the SQL attribute of the view data asset: this must have SQL in it for view lineage to appear.
  • The crawler workflows populate the SQL attribute. If it's empty on the view asset, the crawler is the suspect.

Tables

  • The miner workflows populate table lineage. If it's missing, the miner is the suspect.
  • Check the SQL picked up by the miner (for example, in S3).
  • If the miner picks up the necessary SQL but lineage is not produced, check if any of the assets involved are missing.

Data stores to BI assets

  • For Atlan to link these assets, the upstream assets (data stores) must first exist.
  • If the upstream assets are only created after the downstream assets, lineage will stay unlinked.
  • If some of the assets are missing, lineage may have gaps that prevent linkage.

Causes of missing assets

There are several reasons why assets may be missing:

Workflow ordering

The order of operations you run in Atlan is important. To have lineage across tools, you need to:

  1. Crawl data stores first.
  2. Mine query logs (and dbt) second.
  3. Crawl BI tools last.

If you've used a different order, the upstream assets (data stores) may not yet exist when you load the BI metadata. Then you can have lineage within the BI metadata, but not between the BI metadata and the data sources.

If that's the case for you, don't worry. Re-run your existing workflows in the order above and Atlan should resolve it.
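The ordering above can be sketched as a simple sort. This is only an illustration in Python: the workflow names and the kind labels (datastore_crawler, miner, dbt, bi_crawler) are hypothetical and are not Atlan identifiers.

```python
# Hypothetical workflow kinds mapped to the stage in which they should run.
# These labels are illustrative assumptions, not Atlan identifiers.
RUN_ORDER = {"datastore_crawler": 0, "miner": 1, "dbt": 1, "bi_crawler": 2}


def order_workflows(workflows):
    """Sort (name, kind) pairs: data stores first, miners and dbt second, BI tools last."""
    # sorted() is stable, so workflows in the same stage keep their relative order.
    return sorted(workflows, key=lambda wf: RUN_ORDER[wf[1]])


workflows = [
    ("tableau-prod", "bi_crawler"),
    ("snowflake-miner", "miner"),
    ("snowflake-prod", "datastore_crawler"),
]
ordered = order_workflows(workflows)
for name, kind in ordered:
    print(name, kind)
```

Re-running the workflows in the resulting order means the data store assets exist before the BI metadata is loaded.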

Crawling filters

Another reason lineage may have gaps is that the assets needed to link it together do not exist, even after re-running the crawler.

When crawling a source, you can specify filters on which metadata to include and exclude. If you've excluded metadata needed to link assets into lineage, then end-to-end lineage will have gaps.

Check that you have not excluded any of the asset(s) you're expecting to be in lineage. (And remember that using an include filter means that not all metadata is being crawled; some is being excluded.)

If in doubt, try running your workflows without include or exclude filters.
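To see how an include filter silently excludes assets, here is a minimal sketch in Python. It assumes glob-style patterns and assumes that an exclude match wins over an include match. Both are assumptions made for illustration, and the actual filter syntax in your crawler configuration may differ. The asset names are made up.

```python
from fnmatch import fnmatchcase


def is_crawled(asset_name, include=None, exclude=None):
    """Return True if an asset survives the given include and exclude patterns."""
    include = include or []
    exclude = exclude or []
    # An include filter implicitly excludes everything that does not match it.
    if include and not any(fnmatchcase(asset_name, p) for p in include):
        return False
    # Assumption: an exclude match removes the asset even if it was included.
    return not any(fnmatchcase(asset_name, p) for p in exclude)


# Hypothetical assets that lineage is expected to pass through.
expected_in_lineage = ["ANALYTICS.PUBLIC.ORDERS", "ANALYTICS.STAGING.ORDERS_RAW"]
include = ["ANALYTICS.PUBLIC.*"]

dropped = [a for a in expected_in_lineage if not is_crawled(a, include=include)]
print(dropped)
```

Here the staging table never reaches the catalog because it does not match the include pattern, so any lineage that passes through it would have a gap.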

Source permissions

Atlan is not the only place where you can filter metadata.

Atlan accesses your sources through credentials you provide. Those credentials have assigned permissions controlling what (meta)data they can access in the source. If those permissions prevent access to some (meta)data, then Atlan cannot crawl that metadata.

So if ordering and filters don't fix the problem, check your source permissions. Do they provide access to all the data assets you need for lineage?
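One way to spot a permissions gap is to compare the assets you expect in lineage with the assets that were actually crawled. The sketch below is plain Python with made-up asset names. How you obtain the two lists (for example, from the source's own catalog and from an export of crawled assets) is left open and is not something this article specifies.

```python
def missing_assets(expected, crawled):
    """Return the expected asset names that were not crawled, sorted for stable output."""
    return sorted(set(expected) - set(crawled))


# Hypothetical names: what lineage should pass through vs. what the crawler could see.
expected = ["SALES.PUBLIC.ORDERS", "SALES.PUBLIC.CUSTOMERS", "SALES.RESTRICTED.PAYMENTS"]
crawled = ["SALES.PUBLIC.ORDERS", "SALES.PUBLIC.CUSTOMERS"]

gaps = missing_assets(expected, crawled)
print(gaps)
```

Any name in the result is a candidate for an asset that the credentials cannot access in the source.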

Different connections, same source

We currently do not resolve lineage across different connections for the same source. You need to crawl (and mine) all assets from a given source through the same connection to generate lineage.

🚨 Careful! This one is the most subtle of the causes. The assets may even appear to be in the environment in this case. Check that the qualifiedName of the asset matches exactly what lineage expects.
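A quick way to compare two qualifiedName values is to look at their connection portion. The sketch below assumes that a qualifiedName is slash-separated and that its first three segments identify the connection. That layout is an assumption for illustration, so check it against the values you see in your own environment. The example values are made up.

```python
def connection_prefix(qualified_name, segments=3):
    """Return the leading segments assumed to identify the connection."""
    return "/".join(qualified_name.split("/")[:segments])


def same_connection(a, b):
    """True if both qualifiedName values share the same assumed connection prefix."""
    return connection_prefix(a) == connection_prefix(b)


# Hypothetical values: the same table crawled through two different connections.
crawled = "default/snowflake/1700000001/ANALYTICS/PUBLIC/ORDERS"
expected = "default/snowflake/1700000002/ANALYTICS/PUBLIC/ORDERS"

print(connection_prefix(crawled))
print(same_connection(crawled, expected))
```

The two values name the same table, but the connection portion differs, so they are different assets and lineage will not join them.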

Temporary tables

If your data processing uses temporary tables:

  • If the transformation occurs in a single session, we will generate lineage.
  • If the transformation occurs across sessions, we can't currently generate lineage.
