In some cases you won't be able to expose your databases for Atlan to crawl and ingest metadata. This may happen for various reasons:
- Transactional databases may run under heavy load, which can make a direct connection problematic.
- Security requirements may restrict access to sensitive, mission-critical transactional databases from outside.
In such cases you may want to decouple the extraction of metadata from its ingestion in Atlan. This approach gives you full control over your resources and metadata transfer to Atlan.
Prerequisites
To extract metadata from your on-premises databases you will need to use Atlan's metadata-extractor tool.
Install Docker Compose
Docker Compose is a tool for defining and running applications composed of many Docker containers. (Any guesses where the name came from?)
To install Docker Compose, follow the installation instructions in the official Docker documentation for your platform.
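Before installing, it can help to check whether Docker Compose is already available on the server. A minimal sketch (it distinguishes the current `docker compose` plugin from the legacy standalone `docker-compose` binary; it does not assume any particular install method):

```shell
# Determine whether Docker Compose is already available:
# the v2 plugin ("docker compose") or the legacy binary ("docker-compose").
if docker compose version >/dev/null 2>&1; then
  COMPOSE_STATUS="compose-v2"
elif command -v docker-compose >/dev/null 2>&1; then
  COMPOSE_STATUS="compose-v1"
else
  COMPOSE_STATUS="not-installed"
fi
echo "$COMPOSE_STATUS"
```

If this prints `not-installed`, install Docker Compose before continuing.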
Get the metadata-extractor tool
To get the metadata-extractor tool:
- Raise a support ticket to get a link to the latest version.
- Download the image using the link provided by support.
- To load the image:
  - For a Docker image, load the image to the server you'll use to crawl databases:
    sudo docker load -i /path/to/jdbc-metadata-extractor-master.tar
  - For an OCI image:
    - Docker:
      1. Install Skopeo.
      2. Load the image to the server you'll use to crawl databases:
         skopeo copy oci-archive:/path/to/jdbc-metadata-extractor-master-oci.tar docker-daemon:jdbc-metadata-extractor-master:latest
    - Podman:
      1. Load the image to the server you'll use to crawl databases:
         podman load -i /path/to/jdbc-metadata-extractor-master-oci.tar
      2. Tag the loaded image:
         podman tag <loaded image hash> jdbc-metadata-extractor-master:latest
Get the compose file
Atlan provides a configuration file for the metadata-extractor tool. This is a Docker Compose file.
To get the compose file:
- Download the latest compose file.
- Save the file to an empty directory on the server you'll use to access your on-premises databases.
- The file is `docker-compose.yml`.
Define database connections
The compose file includes three main sections:
- `x-templates` contains configuration fragments. You should ignore this section; do not make any changes to it.
- `services` is where you will define your database connections.
- `volumes` contains mount information. You should ignore this section as well; do not make any changes to it.
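Putting the three sections together, the overall shape of the file looks like this (an illustrative skeleton; your copy from Atlan contains the actual contents of each section):

```yaml
x-templates:   # provided by Atlan - do not modify
  # ...
services:      # add one entry per database connection
  # ...
volumes:       # provided by Atlan - do not modify
  # ...
```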
Define services
For each on-premises database, define an entry under `services` in the compose file.
Each entry will have the following structure:
services:
  CONNECTION-NAME:
    <<: *extract
    environment:
      <<: *CONNECTION-TYPE
      # Credentials
      # Database address
    volumes:
      # Output folder
- Replace `CONNECTION-NAME` with the name of your connection.
- `<<: *extract` tells the metadata-extractor tool to run.
- `environment` contains all parameters for the tool.
- `<<: *CONNECTION-TYPE` applies default arguments for the corresponding connection type.
Refer to Supported connections for on-premises databases for full details of each connection type.
Example
Let's walk through an example in detail:
services:
  inventory: # 1. Call this connection "inventory"
    <<: *extract
    environment:
      <<: *psql # 2. Connect to PostgreSQL using basic authentication
      USERNAME: some-username # 3. Credentials
      PASSWORD: some-password
      HOST: inventory.local # 4. Database address
      PORT: 5432
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output # 5. Store results in ./output/inventory
- The name of this service is `inventory`. You can use any meaningful name you want. In this example, we are using the same name as the database we're going to crawl.
- `<<: *psql` sets the connection type to PostgreSQL using basic authentication.
- `USERNAME` and `PASSWORD` specify the credentials required for the `psql` connection.
- `HOST`, `PORT` and `DATABASE` specify the database address. The `PORT` is `5432` by default, so you can omit it most of the time.
- The `./output/inventory:/output` line specifies where to store results. You will need to replace `inventory` with the name of your connection. We recommend outputting metadata for different databases to separate folders.
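Following the recommendation above, you could pre-create one output folder per connection. A sketch, where `inventory` and `sales` are example connection names:

```shell
# Create a separate output folder for each connection defined in the
# compose file. "inventory" and "sales" are illustrative names.
mkdir -p ./output/inventory ./output/sales
```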
You can add as many database connections as you want.
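For instance, a compose file crawling two PostgreSQL databases might look like the following sketch (connection names, hosts, and credentials are illustrative; `PORT` is omitted since it defaults to `5432`):

```yaml
services:
  inventory:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: some-username
      PASSWORD: some-password
      HOST: inventory.local
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output
  sales:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: some-username
      PASSWORD: some-password
      HOST: sales.local
      DATABASE: sales
    volumes:
      - *shared-jdbc-drivers
      - ./output/sales:/output
```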
Refer to the Docker Compose documentation for the `services` format in more detail.
Secure credentials
To create and use Docker secrets:
- Create a JSON file and add the credentials that you want to use in Docker secrets. For example:
{ "USERNAME": "my-secret-user", "PASSWORD": "my-secret-password" }
💪 Did you know? The keys in this file become environment variable names, so consider migrating them from the compose file to secrets. Environment variables set in secrets take precedence over those in the compose file; values not provided in secrets are read from the compose file instead.
- Create a new Docker secret:
docker secret create my_database_credentials credentials.json
- At the top of your compose file, add a secrets element to access your secret:
secrets:
  my_database_credentials:
    external: true
- Within the service entry in the compose file, add a new `secrets` element and specify `CREDENTIAL_SECRET_PATH` to use it as credentials.
🚨 Careful! If you have added database credentials directly to the compose file, Atlan recommends that you leave `CREDENTIAL_SECRET_PATH` blank.
For example, your compose file would now look something like this:
secrets:
  my_database_credentials:
    external: true
x-templates:
  # ...
services:
  my-database:
    <<: *extract
    environment:
      <<: *psql
      CREDENTIAL_SECRET_PATH: "/run/secrets/my_database_credentials"
      # ...
    volumes:
      # ...
    secrets:
      - my_database_credentials
volumes:
  jars:
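As a sketch, you might create the credentials file with owner-only permissions before registering it as a secret (the file name and values are placeholders, matching the earlier `docker secret create` example):

```shell
# Create the credentials file readable only by its owner. It can then be
# registered with: docker secret create my_database_credentials credentials.json
umask 077
cat > credentials.json <<'EOF'
{ "USERNAME": "my-secret-user", "PASSWORD": "my-secret-password" }
EOF
chmod 600 credentials.json
```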
Troubleshooting secure credentials
Atlan recommends the following troubleshooting measures:
- If you're unable to create Docker secrets, ensure that Swarm mode is enabled. Secrets are encrypted during transit and at rest in a Docker swarm. Run the following command to enable Swarm mode:
docker swarm init
- If running the compose file after providing the credentials secret results in `Unsupported external secret <secret_name>`, complete the following steps:
  - Modify the compose file as follows:
secrets:
  my_database_credentials:
    external: true
x-templates:
  # ...
services:
  my-database:
    <<: *extract
    environment:
      <<: *psql
      CREDENTIAL_SECRET_PATH: "/run/secrets/my_database_credentials"
      # ...
    volumes:
      # ...
    secrets:
      - my_database_credentials
    deploy:
      replicas: 1
      restart_policy:
        condition: none
volumes:
  jars:
  - Run the compose file using the following command:
docker stack deploy -c docker-compose.yml <stack_name>
    - Replace `<stack_name>` with a name for your stack.
  - Verify deployment status using the following command:
docker stack ps --no-trunc <stack_name>
    - Replace `<stack_name>` with the name you provided while deploying the stack.
  - If stack deployment has been successfully completed, monitor the docker service logs using the following command:
docker service logs <stack_name>_<service_name> --follow
    - Replace `<stack_name>` with the name you provided while deploying the stack.
    - Replace `<service_name>` with the service name in Docker.
🚨 Careful! The `docker stack deploy` command will run all the services in the `docker-compose.yml` file, so ensure that the `docker-compose.yml` file only contains the service you intend to run.