Once you have crawled assets from Google BigQuery, you can mine its query history to construct lineage.
To mine lineage from Google BigQuery, review the order of operations and then complete the following steps.
Select the miner
To select the Google BigQuery miner:
- In the top right of any screen, navigate to New and then click New Workflow.
- From the filters along the top, click Miner.
- From the list of packages, select BigQuery Miner and then click Setup Workflow.
Configure the miner
To configure the Google BigQuery miner:
- For Connection, select the connection to mine. (To select a connection, the crawler must have already run.)
- For Miner Extraction Method, select Query History.
- For Start time, choose the earliest date from which to mine query history.
- (Optional) By default, the miner fetches data from the US region. To fetch data from another region, for Region, select Custom and then, under Custom BigQuery Region, enter the region where your INFORMATION_SCHEMA is hosted. Enter the region in the format the field expects, replacing <REGION> with your specific region.
- To check for any permissions or other configuration issues before running the miner, click Preflight checks.
- At the bottom of the screen, click Next to proceed.
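To make the Region and Start time settings more concrete, the sketch below builds the kind of query one could run by hand against BigQuery's regional query history. It is an illustration only, not the miner's actual query: the function name `build_query_history_sql` and the selected columns are assumptions. The `region-<REGION>` qualifier is BigQuery's own SQL syntax for regional `INFORMATION_SCHEMA` views; whether the Custom BigQuery Region field expects the same form is not confirmed here.

```python
from datetime import datetime, timezone


def build_query_history_sql(region: str, start_time: datetime) -> str:
    """Return SQL listing query jobs created since start_time in one region.

    Hypothetical helper for illustration; the miner's real query may differ.
    """
    # Guard against typos or injected text in the region name.
    if not region.replace("-", "").isalnum():
        raise ValueError(f"unexpected region name: {region!r}")

    # BigQuery interprets a TIMESTAMP literal without an offset as UTC.
    start = start_time.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

    # Regional INFORMATION_SCHEMA views are qualified as `region-<REGION>`.
    return (
        "SELECT job_id, user_email, creation_time, query\n"
        f"FROM `region-{region}`.INFORMATION_SCHEMA.JOBS\n"
        f"WHERE creation_time >= TIMESTAMP('{start}')\n"
        "  AND job_type = 'QUERY'\n"
        "ORDER BY creation_time"
    )


sql = build_query_history_sql("us", datetime(2024, 1, 1, tzinfo=timezone.utc))
print(sql)
```

Running a query like this manually in the BigQuery console can help confirm that query history exists in the chosen region before the miner is run.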
Configure the miner behavior
To configure the Google BigQuery miner behavior:
- (Optional) For Calculate popularity, change to True to retrieve usage and popularity metrics for your Google BigQuery assets from query history:
  - To select a pricing model for running queries, for Pricing Model, click On Demand to be charged for the number of bytes processed or Flat Rate to be charged for the number of slots purchased.
  - For Popularity Window (days), set the number of days of query history to use. The maximum is 30 days; you can set a shorter window.
  - For Excluded Users, type the names of users to exclude when calculating usage metrics for Google BigQuery assets. Press Enter after each name to add more names.
- (Optional) For Control Config, click Custom to configure the following:
  - For Fetch excluded project's QUERY_HISTORY, click Yes to mine query history from databases or projects excluded while crawling metadata from Google BigQuery.
  - If Atlan support has provided you with a custom control configuration, enter the configuration into the Custom Config box.
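The behavior options above can be summarized as a small data structure with the constraints the steps describe. This is a minimal sketch for reasoning about valid combinations, not Atlan's configuration format: the class `MinerBehavior`, its field names, and its default values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

MAX_POPULARITY_WINDOW_DAYS = 30  # maximum stated in the steps above
PRICING_MODELS = ("On Demand", "Flat Rate")  # options named in the steps above


@dataclass
class MinerBehavior:
    """Hypothetical model of the miner behavior settings (not an Atlan API)."""

    calculate_popularity: bool = False
    pricing_model: str = "On Demand"  # assumed default, for illustration only
    popularity_window_days: int = MAX_POPULARITY_WINDOW_DAYS
    excluded_users: List[str] = field(default_factory=list)
    fetch_excluded_projects: bool = False

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty if none found."""
        problems = []
        if self.pricing_model not in PRICING_MODELS:
            problems.append(f"unknown pricing model: {self.pricing_model!r}")
        if not 1 <= self.popularity_window_days <= MAX_POPULARITY_WINDOW_DAYS:
            problems.append(
                f"popularity window must be 1-{MAX_POPULARITY_WINDOW_DAYS} days"
            )
        if any(not name.strip() for name in self.excluded_users):
            problems.append("excluded user names must not be blank")
        return problems


settings = MinerBehavior(
    calculate_popularity=True,
    popularity_window_days=14,
    excluded_users=["svc-etl"],  # hypothetical service account name
)
print(settings.validate())
```

Collecting problems in a list rather than raising on the first one mirrors how a preflight-style check might report several issues at once.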
Run the miner
To run the Google BigQuery miner, after completing the steps above:
- To run the miner once immediately, at the bottom of the screen, click the Run button.
- To schedule the miner to run hourly, daily, weekly, or monthly, at the bottom of the screen, click the Schedule & Run button.
Once the miner has finished running, you will see lineage for Google BigQuery assets that were created between the start time and the time the miner ran! 🎉