In some cases you won't be able to expose your Looker instance for Atlan to crawl and ingest metadata. For example, this may happen when security requirements restrict access to sensitive, mission-critical data.
In such cases you may want to decouple the extraction of metadata from its ingestion in Atlan. This approach gives you full control over your resources and metadata transfer to Atlan.
To extract metadata from your on-premises Looker instance you will need to use Atlan's looker-extractor tool.
Install Docker Compose
To install Docker Compose:
Get the looker-extractor tool
To get the looker-extractor tool:
- Raise a support ticket to get a link to the latest version.
- Download the image using the link provided by support.
- Load the image to the server you'll use to crawl Looker:
sudo docker load -i /path/to/looker-extractor-master.tar
Get the compose file
Atlan provides a configuration file for the looker-extractor tool. This is a Docker Compose file.
To get the compose file:
- Download the latest compose file.
- Save the file to an empty directory on the server you'll use to access your on-premises Looker instance.
- The file is
Define Looker connections
The structure of the compose file includes three main sections:
- x-templates contains configuration fragments. You should ignore this section and not make any changes to it.
- services is where you will define your Looker connections.
- volumes contains mount information. You should ignore this section as well and not make any changes to it.
For each on-premises Looker instance, define an entry under services in the compose file.
Each entry will have the following structure:
services:
  CONNECTION-NAME:
    <<: *extract
    environment:
      <<: *looker-defaults
      INCLUDE_PROJECTS: "project1,project2"
      USE_FIELD_LEVEL_LINEAGE: "true"
    volumes:
      - ./output/looker-example:/output/process
- Replace CONNECTION-NAME with the name of your connection.
- <<: *extract tells the looker-extractor tool to run.
- environment contains all parameters for the tool. Replace the values given for INCLUDE_PROJECTS with the names of the Looker projects you want to extract, separated by commas.
- volumes specifies where to store results. In this example, the extractor will store results in the ./output/looker-example folder on the local file system.
You can add as many Looker connections as you want.
Docker's documentation describes the services format in more detail.
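If you maintain several connections, the entry structure described above can be generated rather than typed by hand. The following is a minimal sketch, not part of the looker-extractor tool: the function names `render_service` and `render_services`, the connection names, and the choice of `./output/<connection name>` as the output directory are all assumptions made for illustration. It only produces text for the services section, which you would paste into the compose file Atlan provides.

```python
def render_service(name, projects):
    """Return the YAML text for one hypothetical connection entry.

    The layout mirrors the entry structure shown in this guide; the
    per-connection output directory is an assumption for illustration.
    """
    project_list = ",".join(projects)  # project names separated by commas
    return "\n".join([
        f"  {name}:",
        "    <<: *extract",
        "    environment:",
        "      <<: *looker-defaults",
        f'      INCLUDE_PROJECTS: "{project_list}"',
        '      USE_FIELD_LEVEL_LINEAGE: "true"',
        "    volumes:",
        f"      - ./output/{name}:/output/process",
    ])


def render_services(connections):
    """Render a services section from a {connection name: [projects]} mapping."""
    entries = [render_service(name, projects) for name, projects in connections.items()]
    return "services:\n" + "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical connection and project names.
    print(render_services({
        "looker-prod": ["sales", "finance"],
        "looker-dev": ["sandbox"],
    }))
```

Each connection gets its own output directory here, so that results from different Looker instances do not end up in the same folder.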
To define the credentials for your Looker connections you will need to provide:
- A Looker SDK configuration file
- A private key to access your git repository via ssh (to extract field-level lineage)
- A passphrase to decipher the private key (to extract field-level lineage)
The Looker metadata includes the git repo locations.
The Looker SDK configuration is a .ini file with the following format:
[Looker]
# Base URL for your looker instance API. Do not include /api/* in the URL.
base_url=https://<host>:<port>
# API 3 client id
client_id=YourClientID
# API 3 client secret
client_secret=YourClientSecret
verify_ssl=True
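Before handing the file to the extractor, it can help to check that it parses and contains the keys shown in the format above. The sketch below uses Python's standard `configparser` module. The function name `check_looker_ini` and the specific checks are assumptions made for illustration; the extractor itself may validate the file differently.

```python
import configparser

# Keys taken from the format shown in this guide.
EXPECTED_KEYS = ("base_url", "client_id", "client_secret")


def check_looker_ini(text):
    """Return a list of problems found in the SDK configuration text."""
    # Disable interpolation so a '%' in a client secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return [f"not a valid .ini file: {exc}"]

    if not parser.has_section("Looker"):
        return ["missing [Looker] section"]

    section = parser["Looker"]
    problems = []
    for key in EXPECTED_KEYS:
        if not section.get(key, fallback="").strip():
            problems.append(f"missing value for {key}")
    # The base URL must not include the /api/* path.
    if "/api/" in section.get("base_url", fallback=""):
        problems.append("base_url should not include /api/*")
    return problems


if __name__ == "__main__":
    with open("looker.ini", encoding="utf-8") as handle:
        for problem in check_looker_ini(handle.read()):
            print(problem)
```

An empty list means the file has a `[Looker]` section with a value for each expected key; it does not confirm that the credentials themselves are valid.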
Using local files
To specify the local files in your compose file:
secrets:
  looker_config:
    file: ./looker.ini
  looker_git_private_key:
    file: ./id_ed25519
  looker_git_private_key_passphrase:
    file: ./passphrase.txt
The secrets section is at the same top level as the services section described earlier. It is not a sub-section of the services section.
Using Docker secrets
To create and use Docker secrets:
- Store the Looker SDK configuration file:
sudo docker secret create looker_config path/to/looker.ini
- At the top of your compose file, add a secrets element to access your secret:
secrets:
  looker_config:
    external: true
    name: looker_config
  - The name should be the same one you used in the docker secret create command above.
  - Once stored as a Docker secret, you can remove the local Looker SDK configuration file.
- Within the service section of the compose file, add a new secrets element and specify the name of the secret within your service to use it.
💪 Did you know? You can use the same steps to create Docker secrets for your git details as well. Replace the name (looker_config) and the path to the file, but otherwise run the same command.
Let's explain in detail with an example:
secrets:
  looker_config:
    external: true
    name: looker_config
  looker_git_private_key:
    file: ./id_ed25519
  looker_git_private_key_passphrase:
    external: true
    name: looker_git_private_key_passphrase
x-templates:
  # ...
services:
  my-looker:
    <<: *extract
    environment:
      <<: *looker-defaults
      INCLUDE_PROJECTS: "project1,project2"
      USE_FIELD_LEVEL_LINEAGE: "true"
    volumes:
      - ./output/looker-example:/output/process
    secrets:
      - looker_config
      - looker_git_private_key
      - looker_git_private_key_passphrase
volumes:
  jars:
- In this example we've defined the secrets at the top of the file (you could also define them at the bottom):
  - looker_config refers to an external Docker secret created using the docker secret create command.
  - looker_git_private_key refers to a local file.
  - looker_git_private_key_passphrase refers to an external Docker secret created using the docker secret create command.
- The name of this service is my-looker. You can use any meaningful name you want.
- <<: *looker-defaults sets the connection type to Looker.
- INCLUDE_PROJECTS tells the extractor to extract only the listed projects (project1 and project2 in this example).
- USE_FIELD_LEVEL_LINEAGE tells the extractor to extract field-level lineage. This means the git private key information is also required.
- The ./output/looker-example:/output/process line tells the extractor where to store results. In this example, the extractor will store results in the ./output/looker-example directory on the local file system. We recommend you output metadata for different connections in separate directories.
- The secrets section within services tells the extractor which secrets to use for this service. Each of these refers to the name of a secret listed at the beginning of the compose file.
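Because each secret named under a service must match a secret declared at the top level of the compose file, a mismatch between the two lists is an easy mistake to make. The sketch below checks for it. It assumes the compose file has already been loaded into a Python dictionary (for example with a YAML library of your choice), and it only handles the short list form of service secrets used in this guide. The function name `undeclared_secrets` is hypothetical.

```python
def undeclared_secrets(compose):
    """Map each service name to the secrets it uses that are not declared.

    `compose` is the compose file already parsed into a dictionary.
    Only the short form (a list of secret names) is handled.
    """
    declared = set(compose.get("secrets") or {})
    missing = {}
    for service_name, service in (compose.get("services") or {}).items():
        used = service.get("secrets") or []
        unknown = [name for name in used if name not in declared]
        if unknown:
            missing[service_name] = unknown
    return missing


if __name__ == "__main__":
    # A dictionary mirroring the example in this guide, with one secret
    # deliberately left out of the top-level section.
    compose = {
        "secrets": {
            "looker_config": {"external": True, "name": "looker_config"},
            "looker_git_private_key": {"file": "./id_ed25519"},
        },
        "services": {
            "my-looker": {
                "secrets": [
                    "looker_config",
                    "looker_git_private_key",
                    "looker_git_private_key_passphrase",
                ],
            },
        },
    }
    print(undeclared_secrets(compose))
```

An empty result means every secret referenced by a service is declared at the top level; it does not check that the underlying files or Docker secrets actually exist.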