Atlan currently supports usage and popularity metrics only for Snowflake tables, views, and columns.
Can any Snowflake customer set this up?
No separate setup is required. Popularity is bundled with the Snowflake miner package, so as long as the miner is set up, popularity will be calculated.
Which editions of Snowflake are supported?
The following Snowflake editions are supported:
- Standard Edition
- Enterprise Edition
- Business Critical Edition
- Virtual Private Snowflake (VPS)
Do account-level permissions need to be modified for setup?
No extra permissions are required.
- For enterprise customers, Atlan will use the ACCESS_HISTORY table to determine which objects have been accessed, keeping Snowflake as the source of truth.
- For other customers, Atlan will use internal logic to determine the same.
How can I set up popularity on my instance?
- Head over to the Atlan marketplace and set up the Snowflake miner package with a daily frequency (recommended).
- On the first run, popularity will be calculated from the start date of the miner with a default window of 14 days.
- For all subsequent runs, the popularity window will be increased to a maximum limit of 30 days.
For preconfigured miners:
- The next run of the miner will migrate the last 30 days' worth of query data and calculate popularity for 30 days (if available).
- Subsequent runs will work as expected.
How is the popularity score calculated?
The computation of popularity score is based on the number of read queries that used the data asset and the number of distinct users executing the read queries. Values are collected over a period of 30 days.
Popularity score = number of distinct readers * log (total number of reads)
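The formula above can be sketched in a few lines of Python. This is only an illustration, not Atlan's implementation: the function name `popularity_score` is hypothetical, the logarithm base is not stated in the formula (natural log is assumed here), and returning 0 for assets with no reads is also an assumption.

```python
import math


def popularity_score(distinct_readers: int, total_reads: int) -> float:
    """Illustrative popularity score: distinct readers * log(total reads).

    Assumptions (not confirmed by the source): natural logarithm,
    and a score of 0 when there are no readers or no reads.
    """
    if distinct_readers <= 0 or total_reads <= 0:
        return 0.0
    return distinct_readers * math.log(total_reads)


# Example: 5 distinct users ran 100 read queries over the 30-day window.
score = popularity_score(5, 100)
print(round(score, 2))
```

Under this reading, the logarithm dampens the effect of a large number of reads, so an asset read by many distinct users tends to score higher than one read repeatedly by a single user.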
What asset types have popularity scores?
Popularity scores are currently available for all Snowflake tables, views, and columns.
How is the compute cost estimated per asset?
In Atlan, the compute cost of an asset is estimated based on each individual query. For example, if warehouse X-small costs 1 credit per hour:
- Query A ran from 1 p.m. to 5 p.m. on warehouse X-small = 4 credits
- Query B ran from 11 a.m. to 3 p.m. on warehouse X-small = 4 credits
- Estimated compute cost: 4 credits + 4 credits = 8 credits
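The arithmetic in this example can be reproduced with a short Python sketch. The names `query_credits` and `CREDITS_PER_HOUR` are hypothetical, the calendar date is arbitrary, and the rate of 1 credit per hour is taken from the example rather than from any pricing reference.

```python
from datetime import datetime

# Rate taken from the example above: warehouse X-small at 1 credit per hour.
CREDITS_PER_HOUR = 1.0


def query_credits(start: datetime, end: datetime,
                  credits_per_hour: float = CREDITS_PER_HOUR) -> float:
    """Estimate credits for one query as run time in hours times the hourly rate."""
    hours = (end - start).total_seconds() / 3600
    return hours * credits_per_hour


# The date is arbitrary; only the start and end times matter here.
queries = {
    "A": (datetime(2024, 1, 1, 13), datetime(2024, 1, 1, 17)),  # 1 p.m. to 5 p.m.
    "B": (datetime(2024, 1, 1, 11), datetime(2024, 1, 1, 15)),  # 11 a.m. to 3 p.m.
}

# Per-query estimates are summed, even though the two queries overlap in time.
total = sum(query_credits(start, end) for start, end in queries.values())
print(total)  # 8.0
```

Note that the two queries overlap between 1 p.m. and 3 p.m., yet each is counted in full, which matches the per-query estimate of 4 + 4 = 8 credits given above.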
Can we exclude queries run by ETL users?
Support for including or excluding queries by roles, users, and warehouses is coming soon.