How to mine Amazon Redshift

Once you have crawled assets from Amazon Redshift, you can mine its query history to construct lineage and retrieve usage and popularity metrics.

To mine lineage from Amazon Redshift, review the order of operations and then complete the following steps.

Select the miner

To select the Amazon Redshift miner:

  1. In the top right of any screen, navigate to New and then click New Workflow.
  2. From the filters along the top, click Miner.
  3. From the list of packages, select Redshift Miner and then click Setup Workflow.

Configure the miner

To configure the Amazon Redshift miner:

  1. For Connection, select the connection to mine. (To select a connection, the crawler must have already run.)
  2. For Miner extraction method, select Query History.
  3. For Start time, choose the earliest date from which to mine query history.
    🚨 Careful! Amazon Redshift stores query history for only 2-5 days. If you need to mine a longer history, for example in an initial load, consider using the S3 miner first. After the initial load, you can modify the miner's configuration to use query history extraction.
  4. (Optional) For Advanced Config, keep Default for the default configuration or click Advanced to further configure the miner:
    1. For Cross Connection, click Yes to extract lineage across all available data source connections or click No to only extract lineage from the selected Amazon Redshift connection.
    2. For Control Config, if Atlan support has provided you with a custom control configuration, select Custom and enter the configuration into the Custom Config box. You can also:
      • Enter {"ignore-all-case": true} to enable crawling assets with case-sensitive identifiers.
  5. (Optional) For Enable Popularity, click Yes to retrieve usage and popularity metrics for your Amazon Redshift assets from query history:
    • For Excluded Users, type the names of users to exclude when calculating usage metrics for Amazon Redshift assets. Press Enter after each name to add another.
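
Before filling in the form, it can help to sanity-check two of the values above: whether the chosen start time still falls inside the query history retention window, and whether the custom control configuration is valid JSON. The following is a minimal sketch in Python, not part of Atlan or Amazon Redshift. The function names and the 2-day default are assumptions made for illustration, with the default taken from the lower bound of the 2-5 day range mentioned above.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: query history is kept for 2-5 days (per the note above), so the
# conservative lower bound is used unless the caller says otherwise.
CONSERVATIVE_RETENTION_DAYS = 2


def start_time_within_retention(start, now=None,
                                retention_days=CONSERVATIVE_RETENTION_DAYS):
    """Return True if `start` is recent enough that query history should still exist.

    Hypothetical helper: both datetimes are expected to be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    return now - start <= timedelta(days=retention_days)


def parse_control_config(text):
    """Parse a custom control configuration string as a JSON object.

    Raises ValueError if the text is not valid JSON (for example, when it
    contains typographic quotes copied from a web page) or is not an object.
    """
    config = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(config, dict):
        raise ValueError("control config must be a JSON object")
    return config


if __name__ == "__main__":
    three_days_ago = datetime.now(timezone.utc) - timedelta(days=3)
    # With the conservative 2-day default, a 3-day-old start time is flagged.
    print(start_time_within_retention(three_days_ago))
    print(parse_control_config('{"ignore-all-case": true}'))
```

If the start time falls outside the window, that suggests starting with the S3 miner, as described in the note above. If the configuration fails to parse, check for curly quotes before pasting it into the Custom Config box.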

Run the miner

To run the Amazon Redshift miner, after completing the steps above:

  1. To check for any permissions or other configuration issues before running the miner, click Preflight checks.
  2. At the bottom of the screen, do one of the following:
    • To run the miner once immediately, click the Run button.
    • To schedule the miner to run hourly, daily, weekly, or monthly, click the Schedule Run button.

Once the miner has finished running, you will see lineage for Amazon Redshift assets that were created between the start time and when the miner ran! 🎉
