Once you have set up the metadata-extractor tool, you can extract metadata from your on-premises databases using the following steps.
Run metadata-extractor
Crawl all databases
To crawl all databases using the metadata-extractor tool:
- Log into the server with Docker Compose installed.
- Change to the directory containing the compose file.
- Run Docker Compose:
sudo docker-compose up
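Before running the crawl, it can help to confirm the prerequisites are in place. A minimal sketch, assuming the compose file is named docker-compose.yaml and sits in the current directory:

```shell
# Sanity checks before crawling; the compose file name
# "docker-compose.yaml" is an assumption -- yours may differ.
if command -v docker-compose >/dev/null 2>&1; then
  echo "docker-compose: found"
else
  echo "docker-compose: not installed"
fi
if [ -f docker-compose.yaml ]; then
  echo "compose file: found"
else
  echo "compose file: missing"
fi
# When both checks pass, crawl every configured connection:
# sudo docker-compose up
```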
Crawl a specific database
To crawl a specific database using the metadata-extractor tool:
- Log into the server with Docker Compose installed.
- Change to the directory containing the compose file.
- Save the compose file and run the following command from the local folder where the compose file is stored:
sudo docker-compose up <CONNECTION-NAME>
Replace <CONNECTION-NAME> with the name of the connection from the services section of the compose file.
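To see which values are valid for <CONNECTION-NAME>, you can list the service names defined in the compose file. A minimal sketch, assuming the file is named docker-compose.yaml and that service entries sit at the usual two-space indent:

```shell
# List the connection (service) names defined in the compose file.
# "docker-compose.yaml" is an assumed file name; the two-space indent
# matches a typical compose file layout.
if [ -f docker-compose.yaml ]; then
  sed -n 's/^  \([A-Za-z0-9_-]*\):.*/\1/p' docker-compose.yaml
fi
# Then crawl a single connection, for example:
# sudo docker-compose up inventory
```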
(Optional) Review generated files
The metadata-extractor tool will generate the following JSON files for each service:
columns-<DATABASE>.json
databases.json
extras-procedures-<DATABASE>.json
procedures-<DATABASE>.json
schemas-<DATABASE>.json
table-<DATABASE>.json
view-<DATABASE>.json
You can inspect these files to make sure the metadata is acceptable before providing it to Atlan.
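One way to spot-check the output is to pretty-print each generated file and confirm it parses as valid JSON. A sketch, assuming the extractor wrote its files under output/inventory (adjust the path to your own output directory):

```shell
# Pretty-print each generated JSON file to spot-check it before
# providing the metadata to Atlan. "output/inventory" is an assumed
# output path.
for f in output/inventory/*.json; do
  [ -e "$f" ] || continue   # skip if nothing has been generated yet
  echo "== $f =="
  python3 -m json.tool "$f"
done
```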
Upload generated files to S3
To provide Atlan access to the extracted metadata, you will need to upload the metadata to an S3 bucket.
To upload the metadata to S3:
- Ensure all the files for a particular service have the same prefix. For example, metadata/inventory/columns-inventory.json, metadata/inventory/databases.json, and so on.
- Upload the files to the S3 bucket using your preferred method.
For example, to upload all the files using the AWS CLI:
aws s3 cp output/inventory s3://my-bucket/metadata/inventory --recursive
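Before copying anything, you can preview the S3 keys the recursive copy would create. A sketch using the example bucket and prefix from this page:

```shell
# Preview the S3 keys the recursive copy would create from the local
# files under output/inventory ("my-bucket" and the prefixes are the
# example values used on this page).
for f in output/inventory/*.json; do
  [ -e "$f" ] || continue
  echo "s3://my-bucket/metadata/inventory/${f##*/}"
done
# The AWS CLI can also preview the upload itself without copying:
# aws s3 cp output/inventory s3://my-bucket/metadata/inventory --recursive --dryrun
```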
Crawl metadata in Atlan
Once you have extracted metadata on-premises and uploaded the results to S3, you can crawl the metadata into Atlan:
For all of the above cases, select Offline for the extraction method.