Supported connections for on-premises databases

The metadata-extractor tool supports the following connection types.

Each section below describes the details required to set up access to an on-premises database.

Amazon Redshift with basic authentication

To use this connection type, add <<: *redshift under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 5439)
  • DATABASE — database name (required)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-redshift-database:
    <<: *extract
    environment:
      <<: *redshift
      USERNAME: my-database-username
      PASSWORD: my-database-password
      HOST: redshift-host
      PORT: redshift-database-port
      DATABASE: my-database-name
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/redshift-example:/output
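The doubled dollar signs in the Swarm-mode filters above are an escape sequence: the file is interpolated for variables before deployment, and $$ resolves to a literal $. Python's string.Template follows the same escaping convention, so this small illustrative snippet (not part of the tool) shows what the Swarm-mode value becomes after interpolation:

```python
from string import Template

# In Compose-style interpolation, "$$" escapes to a literal "$".
# string.Template uses the same rule, so it can demonstrate what the
# Swarm-mode filter value looks like after interpolation.
swarm_value = Template('{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}')
print(swarm_value.substitute())  # {"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}
```

After interpolation, both variants carry the same regular expressions; only the escaping in the file differs.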

Amazon Redshift with IAM user authentication

To use this connection type, add <<: *redshift-iam-user under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 5439)
  • DATABASE — database name (required)
  • DATABASE_USER — database username of the IAM user (required)
  • AWS_ACCESS_KEY_ID — AWS access key ID (required)
  • AWS_SECRET_ACCESS_KEY — AWS secret access key (required)
  • CLUSTER_ID — cluster identifier of your private Amazon Redshift cluster (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  redshift-iam-user-example:
    <<: *extract
    environment:
      <<: *redshift-iam-user
      AWS_ACCESS_KEY_ID: aws-access-key-id
      AWS_SECRET_ACCESS_KEY: aws-secret-access-key
      HOST: redshift-host
      DATABASE: my-database-name
      DATABASE_USER: my-database-user
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      CLUSTER_ID: private-cluster-id
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/redshift-iam-user-example:/output
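The INCLUDE_FILTER_TEMPLATE and EXCLUDE_FILTER_TEMPLATE values are JSON objects that map a database-name regex to a list of schema-name regexes. As a rough sketch of the intended semantics (an assumption for illustration, not the tool's actual code), a schema is extracted when it matches the include template and does not match the exclude template:

```python
import json
import re

# Illustrative only: an assumed reading of the filter-template semantics.
# Keys are regexes matched against database names; values are lists of
# regexes matched against schema names within that database.
INCLUDE_TEMPLATE = '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
EXCLUDE_TEMPLATE = '{"^DB1$": ["^SCHEMA2$"]}'

def matches(template: str, database: str, schema: str) -> bool:
    """True if the database/schema pair matches any entry in the template."""
    for db_pattern, schema_patterns in json.loads(template).items():
        if re.search(db_pattern, database):
            if any(re.search(p, schema) for p in schema_patterns):
                return True
    return False

def is_extracted(database: str, schema: str) -> bool:
    return (matches(INCLUDE_TEMPLATE, database, schema)
            and not matches(EXCLUDE_TEMPLATE, database, schema))

print(is_extracted("DB1", "SCHEMA1"))  # True: included, not excluded
print(is_extracted("DB1", "SCHEMA2"))  # False: matched by the exclude template
print(is_extracted("DB2", "SCHEMA1"))  # False: no include entry matches DB2
```

Anchoring the patterns with ^ and $, as in the examples throughout this article, ensures an exact name match rather than a substring match.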

Hive with basic authentication

To use this connection type, add <<: *hive under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 10000)
  • DEFAULT_SCHEMA — default schema name (optional, default is default)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • SCHEMA_EXCLUDE_REGEX — regex to exclude schemas (optional)
  • SCHEMA_INCLUDE_REGEX — regex to include schemas (optional)
  • TEMP_TABLE_REGEX — regex to exclude tables (optional)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  hive-example:
    <<: *hive-extract
    environment:
      <<: *hive
      HOST: hive-host
      PORT: hive-port
      DEFAULT_SCHEMA: default
      USERNAME: my-hive-username
      PASSWORD: my-hive-password
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/hive-example:/output

Microsoft SQL Server with basic authentication

To use this connection type, add <<: *mssql under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • DATABASE — database name (required)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-mssql-database:
    <<: *extract
    environment:
      <<: *mssql
      USERNAME: db-user
      PASSWORD: db-user-password
      HOST: mssql-database-host
      DATABASE: northwind
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/mssql-database:/output

MySQL with basic authentication

To use this connection type, add <<: *mysql under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 3306)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-mysql-database:
    <<: *extract
    environment:
      <<: *mysql
      USERNAME: db-user
      PASSWORD: db-user-password
      HOST: mysql-database-host.internal
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-mysql-database:/output

MySQL with IAM authentication

To use this connection type, add <<: *mysql-iam under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 3306)
  • USERNAME — database username (required)
  • AWS_ACCESS_KEY_ID — AWS access key ID (required)
  • AWS_SECRET_ACCESS_KEY — AWS secret access key (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-mysql-database:
    <<: *extract
    environment:
      <<: *mysql-iam
      AWS_ACCESS_KEY_ID: my-access-key-id
      AWS_SECRET_ACCESS_KEY: my-secret-access-key
      USERNAME: db-user
      HOST: mysql-database-host.internal
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-mysql-database:/output

Oracle with basic authentication

To use this connection type, add <<: *oracledb under the environment section.

💪 Did you know? The Oracle JDBC driver is downloaded to the image under the Oracle FUTC license.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 1521)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • SERVICE — SID (usually ORCL) or service name (required)
  • DEFAULT_DATABASE_NAME — database name (usually the same as the SID) (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-oracledb-database:
    <<: *extract
    environment:
      <<: *oracledb
      HOST: oracle-host.internal
      USERNAME: db-user
      PASSWORD: db-user-password
      SERVICE: ORCL
      DEFAULT_DATABASE_NAME: ORCL
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-oracledb-database:/output

PostgreSQL with basic authentication

To use this connection type, add <<: *psql under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 5432)
  • DATABASE — database name (required)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-psql-database:
    <<: *extract
    environment:
      <<: *psql
      USERNAME: db-user
      PASSWORD: db-user-password
      HOST: psql-database-host.internal
      DATABASE: inventory
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-psql-database:/output

PostgreSQL with IAM authentication

To use this connection type, add <<: *psql-iam under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 5432)
  • DATABASE — database name (required)
  • USERNAME — database username (required)
  • AWS_ACCESS_KEY_ID — AWS access key ID (required)
  • AWS_SECRET_ACCESS_KEY — AWS secret access key (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-psql-database:
    <<: *extract
    environment:
      <<: *psql-iam
      AWS_ACCESS_KEY_ID: my-access-key-id
      AWS_SECRET_ACCESS_KEY: my-secret-access-key
      USERNAME: db-user
      HOST: psql-database-host.internal
      DATABASE: inventory
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DB1$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DB2$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-psql-database:/output

SAP HANA with basic authentication

To use this connection type, add <<: *saphana under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 39017)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if schema-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-saphana-database:
    <<: *extract
    environment:
      <<: *saphana
      USERNAME: my-database-username
      PASSWORD: my-database-password
      HOST: my-database-host
      PORT: 39017
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$$": ["^SCHEMA1$$", "^SCHEMA2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$": ["^SCHEMA1$", "^SCHEMA2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/saphana-example:/output

Teradata with basic authentication

To use this connection type, add <<: *teradata under the environment section.

Available attributes:

  • HOST — database host name or IP address (required)
  • PORT — database port (optional, default is 1025)
  • USERNAME — database username (required)
  • PASSWORD — database user password (required)
  • USE_SOURCE_SCHEMA_FILTERING — Boolean to specify if database-level filtering needs to be enabled while fetching schemas, tables, and columns (optional)
  • EXCLUDE_FILTER_TEMPLATE — exclude filter pattern (optional)
  • INCLUDE_FILTER_TEMPLATE — include filter pattern (optional)
  • USE_JDBC_INTERNAL_METHODS — Boolean to specify if JDBC internal methods need to be used as part of the extraction (optional)

Example

services:
  my-teradata-database:
    <<: *extract
    environment:
      <<: *teradata
      USERNAME: db-user
      PASSWORD: db-user-password
      HOST: teradata-database-host.internal
      # If using Docker Swarm mode for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$$": ["^DB1$$", "^DB2$$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$$": ["^DB1$$", "^DB2$$"]}'
      # If using Docker Compose for offline extraction, format the filters as follows:
      INCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$": ["^DB1$", "^DB2$"]}'
      EXCLUDE_FILTER_TEMPLATE: '{"^DEFAULT$": ["^DB1$", "^DB2$"]}'
      USE_SOURCE_SCHEMA_FILTERING: "false"
      USE_JDBC_INTERNAL_METHODS: "false"
    volumes:
      - *shared-jdbc-drivers
      - ./output/my-teradata-database:/output
