How to set up on-premises database access

🤓 Who can do this? You will need access to an on-premises machine that can run Docker. You will also need your database access details, including credentials.

In some cases you won't be able to expose your databases for Atlan to crawl and ingest metadata. This may happen for various reasons:

  1. Transactional databases may already be under heavy load, which can make a direct connection problematic.
  2. Security requirements may restrict access to sensitive, mission-critical transactional databases from outside.

In such cases you may want to decouple the extraction of metadata from its ingestion in Atlan. This approach gives you full control over your resources and metadata transfer to Atlan.

Prerequisites

To extract metadata from your on-premises databases you will need to use Atlan's metadata-extractor tool.

💪 Did you know? Atlan uses exactly the same metadata-extractor behind the scenes when it connects to cloud databases.

Install Docker Compose

Docker Compose is a tool for defining and running applications composed of many Docker containers. (Any guesses where the name came from? 😉)

To install Docker Compose:

  1. Install Docker
  2. Install Docker Compose

💪 Did you know? The instructions in this documentation should be enough even if you are completely new to Docker and Docker Compose. But you can also walk through the Get started with Docker Compose tutorial if you want to learn Docker Compose basics first.

Get the metadata-extractor tool

To get the metadata-extractor tool:

  1. Raise a support ticket to get a link to the latest version.
  2. Download the image using the link provided by support.
  3. Load the image to the server you'll use to crawl databases:
    sudo docker load -i /path/to/jdbc-metadata-extractor-master.tar

Get the compose file

Atlan provides a configuration file for the metadata-extractor tool, in the form of a Docker Compose file.

To get the compose file:

  1. Download the latest compose file.
  2. Save the file, named docker-compose.yml, to an empty directory on the server you'll use to access your on-premises databases.

Define database connections

The structure of the compose file includes three main sections:

  • x-templates contains configuration fragments. You should ignore this section: do not make any changes to it.
  • services is where you will define your database connections.
  • volumes contains mount information. You should ignore this section as well: do not make any changes to it.

Define services

For each on-premises database, define an entry under services in the compose file.

Each entry will have the following structure:

services:
  CONNECTION-NAME:
    <<: *extract
    environment:
      <<: *CONNECTION-TYPE
      # Credentials
      # Database address
    volumes:
      # Output folder
  • Replace CONNECTION-NAME with the name of your connection.
  • <<: *extract tells the metadata-extractor tool to run.
  • environment contains all parameters for the tool.
  • <<: *CONNECTION-TYPE applies default arguments for the corresponding connection type.

Refer to Supported connections for on-premises databases for full details of each connection type.

Example

Let's walk through an example in detail:

services:
  inventory:                        # 1. Call this connection "inventory"
    <<: *extract
    environment:
      <<: *psql                     # 2. Connect to PostgreSQL using basic authentication
      USERNAME: some-username       # 3. Credentials
      PASSWORD: some-password
      HOST: inventory.local         # 4. Database address
      PORT: 5432
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output  # 5. Store results in ./output/inventory
  1. The name of this service is inventory. You can use any meaningful name you want. In this example we use the same name as the database we're going to crawl.
  2. The <<: *psql line sets the connection type to PostgreSQL using basic authentication.
  3. USERNAME and PASSWORD specify the credentials required for the psql connection.
  4. HOST, PORT and DATABASE specify the database address. The PORT is 5432 by default, so you can omit it most of the time.
  5. The ./output/inventory:/output line specifies where to store results. In this example, results will be stored in the ./output/inventory folder on the local file system. We recommend writing the metadata for each database to a separate folder.

You can add as many database connections as you want.
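
As a rough sketch of what several connections might look like, the fragment below repeats the inventory connection and adds a second one. The orders connection, its host and its credentials are hypothetical placeholders, not values from Atlan. The fragment is not a complete file: it assumes the *extract, *psql and *shared-jdbc-drivers anchors are defined in the x-templates section of the compose file you downloaded.

```yaml
services:
  inventory:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: some-username
      PASSWORD: some-password
      HOST: inventory.local
      PORT: 5432
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output      # results for "inventory"

  orders:                               # hypothetical second connection
    <<: *extract
    environment:
      <<: *psql
      USERNAME: another-username        # placeholder credentials
      PASSWORD: another-password
      HOST: orders.local                # hypothetical host
      DATABASE: orders                  # PORT omitted: falls back to the 5432 default
    volumes:
      - *shared-jdbc-drivers
      - ./output/orders:/output         # separate folder for "orders"
```

Each connection writes to its own folder under ./output, so the results for different databases stay apart.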

💪 Did you know? Docker's documentation describes the services format in more detail.

Secure credentials

🚨 Careful! If you decide to keep database credentials in the compose file, we recommend you restrict access to the directory and the compose file. For extra security, we recommend you use Docker secrets to store the sensitive passwords.

To create and use Docker secrets:

  1. Create a new Docker secret:
    printf "This is a secret password" | docker secret create my_database_password -
  2. At the top of your compose file, add a secrets element to access your secret:
    secrets:
      my_database_password:
        external: true
  3. Within the service section of the compose file, add a new secrets element and specify PASSWORD_SECRET_PATH to use it as a password.

For example, your compose file would now look something like this:

secrets:
  my_database_password:
    external: true

x-templates:
  # ...

services:
  my-database:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: my-user
      PASSWORD_SECRET_PATH: "/run/secrets/my_database_password"
      # ...
    volumes:
      # ...
    secrets:
      - my_database_password

volumes:
  jars:
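
If you define several connections, one possible arrangement is a separate secret per database. The sketch below assumes two secrets, inventory_password and orders_password, have already been created with docker secret create. Both secret names and both services are hypothetical. As above, the x-templates and volumes sections of the downloaded compose file are omitted and assumed to be unchanged.

```yaml
secrets:
  inventory_password:                   # hypothetical secret name
    external: true
  orders_password:                      # hypothetical secret name
    external: true

services:
  inventory:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: inventory-user
      PASSWORD_SECRET_PATH: "/run/secrets/inventory_password"
      HOST: inventory.local
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output
    secrets:
      - inventory_password              # only this service can read this secret

  orders:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: orders-user
      PASSWORD_SECRET_PATH: "/run/secrets/orders_password"
      HOST: orders.local
      DATABASE: orders
    volumes:
      - *shared-jdbc-drivers
      - ./output/orders:/output
    secrets:
      - orders_password
```

Each service lists only its own secret, so a container can read only the password for the database it crawls.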
