Atlan is a fully virtualized solution that does not involve moving data from existing storage layers. Atlan crawls metadata from upstream data sources and stores it in a secure VPC (virtual private cloud).
Atlan pushes any queries down to your existing processing layers: for example, directly to your database or warehouse, or to a processing layer such as Athena or Presto on top of blob storage. The data itself stays put; Atlan does not move it or store it.
Not sure about the difference between data and metadata? Try our helpful primer.
Data previews and queries
Atlan lets users see sample data previews for a data asset, as well as the results of any queries run in Atlan.
In both cases, Atlan pushes the request upstream to the data source and shows Atlan users a 100-row sample of the result. Atlan does not cache any of this data, so each time a user previews or queries data, Atlan re-queries the source.
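The push-down behavior described above can be illustrated with a minimal sketch. This is not Atlan's implementation: an in-memory SQLite database stands in for the upstream source, and the `preview` function, the `PREVIEW_ROW_LIMIT` constant, and the `orders` table are hypothetical names chosen for illustration. The only details taken from the text are the 100-row sample and the absence of caching.

```python
import sqlite3

# Sample size stated in the text; the constant name itself is hypothetical.
PREVIEW_ROW_LIMIT = 100


def preview(conn, query):
    """Run the query on the upstream source and return at most 100 rows.

    Nothing is cached: every call executes the query against the source again.
    """
    cursor = conn.execute(query)
    return cursor.fetchmany(PREVIEW_ROW_LIMIT)


# Stand-in for an upstream database or warehouse (assumption for this sketch).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(i),) for i in range(250)],
)

query = "SELECT id, amount FROM orders ORDER BY id"

# First preview: only a 100-row sample of the 250-row table comes back.
first = preview(source, query)

# The source changes between previews.
source.execute("UPDATE orders SET amount = -1 WHERE id = 1")

# Second preview: re-queried from the source, so it reflects the change.
second = preview(source, query)
```

Because nothing is held between calls, the second preview reflects the update made at the source. The trade-off is that every preview costs a round trip to the source.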
Atlan stores the metadata it collects and creates in applications and databases within the VPC. This includes:
- asset metadata
- user data
Atlan stores asset metadata, including lineage, in:
- Apache Atlas, a graph database layer that stores entity relationships and attributes
- Elasticsearch, to optimize search on the product
- Cassandra, as the persistence back-end
Atlan stores data on users, roles, and groups in its own PostgreSQL database. Keycloak uses this information for access and identity management.
Atlan hashes all sensitive fields, such as passwords, and stores them securely. Any user data transmitted over the internet is encrypted in transit over HTTPS (SSL/TLS).
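To make the idea of hashing sensitive fields concrete, here is a minimal sketch of salted password hashing using Python's standard library. The source does not say which algorithm or parameters Atlan or Keycloak use, so the choice of PBKDF2 with SHA-256, the iteration count, and the function names `hash_password` and `verify_password` are all assumptions for illustration only.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for this sketch; not a documented Atlan setting.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the plaintext."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the candidate password and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(candidate, expected)
```

A stored record under this scheme holds only the salt and the derived digest. Verification re-derives the digest from the submitted password instead of decrypting anything, so the plaintext never needs to be kept.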