In some cases you won't be able to expose your databases for Atlan to crawl and ingest metadata. This may happen for various reasons:
- Transactional databases may run under heavy load, making a direct connection problematic.
- Security requirements may restrict access to sensitive, mission-critical transactional databases from outside your network.
In such cases you may want to decouple the extraction of metadata from its ingestion into Atlan. This approach gives you full control over your resources and over the transfer of metadata to Atlan.
To extract metadata from your on-premises databases, you will need Atlan's metadata-extractor tool.
## Install Docker Compose
Docker Compose is a tool for defining and running applications composed of many Docker containers. (Any guesses where the name came from? 😉)
To install Docker Compose, follow Docker's official installation instructions for your platform.
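For example, on a Debian or Ubuntu server you can install the Compose plugin from Docker's package repository and verify it works. This is a minimal sketch assuming Docker's apt repository is already configured; see Docker's documentation for other platforms:

```bash
# Install the Docker Compose plugin (assumes Docker's apt repository is set up)
sudo apt-get update
sudo apt-get install -y docker-compose-plugin

# Verify the installation
docker compose version
```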
## Get the metadata-extractor tool
To get the metadata-extractor tool:
- Raise a support ticket to get a link to the latest version.
- Download the image using the link provided by support.
- Load the image to the server you'll use to crawl databases:
  ```bash
  sudo docker load -i /path/to/jdbc-metadata-extractor-master.tar
  ```
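To confirm the image loaded successfully, you can list the images on the server. The image name shown here is an assumption based on the archive name; check the output of the load command for the exact name:

```bash
# List local images and look for the extractor
sudo docker image ls | grep jdbc-metadata-extractor
```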
## Get the compose file
Atlan provides a configuration file for the metadata-extractor tool. This is a Docker Compose file.
To get the compose file:
- Download the latest compose file.
- Save the file to an empty directory on the server you'll use to access your on-premises databases, as sketched below.
- The file is a template, which you will complete in the steps that follow.
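For example, assuming the compose file uses Docker Compose's default name of `docker-compose.yml` and using a hypothetical directory name:

```bash
# Create an empty working directory (the name is just an example)
mkdir -p ~/metadata-extractor

# Move the downloaded compose file into it
mv ~/Downloads/docker-compose.yml ~/metadata-extractor/
cd ~/metadata-extractor
```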
## Define database connections
The compose file includes three top-level sections:
- `x-templates` contains configuration fragments. Ignore this section and do not make any changes to it.
- `services` is where you will define your database connections.
- `volumes` contains mount information. Ignore this section as well and do not make any changes to it.
For each on-premises database, define an entry under `services` in the compose file. Each entry will have the following structure:
```yaml
services:
  CONNECTION-NAME:
    <<: *extract
    environment:
      <<: *CONNECTION-TYPE
      # Credentials
      # Database address
    volumes:
      # Output folder
```
- Replace `CONNECTION-NAME` with the name of your connection.
- `<<: *extract` tells the metadata-extractor tool to run.
- `environment` contains all parameters for the tool.
- `<<: *CONNECTION-TYPE` applies default arguments for the corresponding connection type.
Refer to Supported connections for on-premises databases for full details of each connection type.
Let's walk through an example in detail:
```yaml
services:
  inventory:                        # 1. Call this connection "inventory"
    <<: *extract
    environment:
      <<: *psql                     # 2. Connect to PostgreSQL using basic authentication
      USERNAME: some-username       # 3. Credentials
      PASSWORD: some-password
      HOST: inventory.local         # 4. Database address
      PORT: 5432
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output  # 5. Store results in ./output/inventory
```
- The name of this service is `inventory`. You can use any meaningful name you want. In this example we use the same name as the database we're going to crawl.
- `<<: *psql` sets the connection type to PostgreSQL using basic authentication.
- `USERNAME` and `PASSWORD` specify the credentials required for the connection.
- `HOST`, `PORT`, and `DATABASE` specify the database address. The PostgreSQL port is `5432` by default, so you can omit it most of the time.
- The `./output/inventory:/output` line specifies where to store results. In this example, results will be stored in the `./output/inventory` folder on the local file system. We recommend outputting metadata for different databases to separate folders.
You can add as many database connections as you want; each one follows the same `services` format described above.
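Once your connections are defined, you can run the extractor for a given connection as a one-off Docker Compose service. This is a minimal sketch assuming the standard Docker Compose workflow and the `inventory` connection from the example above:

```bash
# Run the metadata extraction for the "inventory" connection
sudo docker compose run inventory

# The extracted metadata lands in ./output/inventory on the host
ls ./output/inventory
```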
## Use Docker secrets
To avoid storing passwords in plain text in the compose file, you can use Docker secrets.
To create and use Docker secrets:
- Create a new Docker secret:
  ```bash
  printf "This is a secret password" | docker secret create my_database_password -
  ```
- At the top of your compose file, add a `secrets` element to access your secret:
  ```yaml
  secrets:
    my_database_password:
      external: true
  ```
- Within your service's entry in the compose file, add a new `secrets` element and specify `PASSWORD_SECRET_PATH` to use the secret as the password.
For example, your compose file would now look something like this:
```yaml
secrets:
  my_database_password:
    external: true

x-templates:
  # ...

services:
  my-database:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: my-user
      PASSWORD_SECRET_PATH: "/run/secrets/my_database_password"
      # ...
    volumes:
      # ...
    secrets:
      - my_database_password

volumes:
  jars:
```
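Note that Docker secrets are a swarm-mode feature, so the `docker secret create` command above requires swarm mode to be active on the host. If your server is a standalone Docker host, you can initialize it once:

```bash
# One-time setup; only needed if the host is not already a swarm node
sudo docker swarm init
```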