How to set up Google BigQuery

🤓 Who can do this? You will need your Google BigQuery administrator to run these commands, as you may not have access yourself. For more information, see Google Cloud's Granting, changing, and revoking access to resources.

Atlan extracts metadata from Google BigQuery through read-only access. Once you have crawled metadata for your Google BigQuery assets, you can mine query history to construct lineage. If you have enabled sample data preview or querying, asset previews and queries will be cost-optimized for tables only. For views and materialized views, Atlan will send you a cost nudge before you run the preview or query the data.

You will need to create a service account to enable Atlan to extract metadata from Google BigQuery. To create a service account, you can use either the Google Cloud console or the Google Cloud CLI. Both methods are described below.

Permissions

Atlan requires the following permissions to extract metadata from Google BigQuery.

(Required) For metadata crawling

To configure permissions for crawling metadata, add the following permissions to the custom role:

  • bigquery.datasets.get allows Atlan to retrieve metadata about a dataset.
  • bigquery.datasets.getIamPolicy allows Atlan to read a dataset's IAM permissions.
  • bigquery.jobs.create allows Atlan to run jobs (including queries) within the project.
    🚨 Careful! Without this, Atlan can't query the source.
  • bigquery.routines.get allows Atlan to retrieve routine definitions and metadata.
  • bigquery.routines.list allows Atlan to list routines and metadata on routines.
  • bigquery.tables.get allows Atlan to retrieve table metadata.
  • bigquery.tables.getIamPolicy allows Atlan to read a table's IAM policy.
  • bigquery.tables.list allows Atlan to list tables and metadata on tables.
  • bigquery.readsessions.create allows Atlan to create a session to stream large results.
  • bigquery.readsessions.getData allows Atlan to retrieve data from the session.
  • bigquery.readsessions.update allows Atlan to cancel the session.
  • resourcemanager.projects.get allows Atlan to retrieve project names and metadata.

(Optional) To add data preview and querying

To configure permissions for previewing and querying data, add the following permissions to the custom role:

  • bigquery.tables.getData allows Atlan to retrieve table data.
    🚨 Careful! This permission is also required for retrieving metadata such as the row count and update time of a table.
  • bigquery.jobs.get allows Atlan to retrieve data and metadata on any job, including queries.
  • bigquery.jobs.listAll allows Atlan to list all jobs and retrieve metadata on any job submitted by any user.
  • bigquery.jobs.update allows Atlan to cancel any job, including a running query.

(Optional) To add query history mining

To configure permissions for mining query history, add the following permissions to the custom role:

  • bigquery.jobs.listAll allows Atlan to fetch all queries for a project.
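
The three permission groups above overlap (bigquery.jobs.listAll appears twice), so it can help to merge them into one de-duplicated, comma-separated string before adding them to a role. The following is a minimal sketch, assuming a POSIX shell; the variable names and the join_permissions helper are illustrative and not part of any Google Cloud tooling.

```shell
# Permission groups copied from the lists above, one permission per line.
REQUIRED="bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.jobs.create
bigquery.routines.get
bigquery.routines.list
bigquery.tables.get
bigquery.tables.getIamPolicy
bigquery.tables.list
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.readsessions.update
resourcemanager.projects.get"
PREVIEW="bigquery.tables.getData
bigquery.jobs.get
bigquery.jobs.listAll
bigquery.jobs.update"
MINING="bigquery.jobs.listAll"

# join_permissions GROUP...: merge the groups, drop duplicates, and print
# the result as a single comma-separated string.
join_permissions() {
  printf '%s\n' "$@" | LC_ALL=C sort -u | paste -s -d ',' -
}

ALL_PERMISSIONS="$(join_permissions "$REQUIRED" "$PREVIEW" "$MINING")"
echo "$ALL_PERMISSIONS"
```

Pass only the groups you need, for example join_permissions "$REQUIRED" "$MINING" if you do not plan to enable data preview and querying.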

Google Cloud console

Create a custom role

You will need to create a custom role in the Google Cloud console for integration with Atlan.

To create a custom role:

  1. Open the Google Cloud console.
  2. From the left menu under IAM and admin, click Roles.
  3. Using the dropdown list at the top of the page, select the project in which you want to create a role.
  4. From the upper left of the Roles page, click Create Role.
  5. In the Create role page, enter the following details:
    1. For Title, enter a meaningful name for the custom role, for example, Atlan User Role.
    2. (Optional) For Description, enter a description for the custom role.
    3. For ID, the Google Cloud console generates a custom role ID based on the custom role name. Edit the ID if necessary, because the ID cannot be changed later.
    4. (Optional) For Role launch stage, assign a stage for the custom role, for example, Alpha, Beta, or General Availability.
    5. Click Add permissions to select the permissions you want to include in the custom role. In the Add permissions dialog, click the Enter property name or value filter and add the required and any optional permissions.
    6. Click Create to finish custom role setup.
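
After the role is created, one way to confirm that it carries every permission you intended is to compare its description against your list. The sketch below assumes the gcloud CLI is available and that the role ID is atlanUserRole (the console generates the ID from the title, so yours may differ). missing_permissions is a hypothetical helper written for this example, not a gcloud feature.

```shell
# missing_permissions "PERM1 PERM2 ...": read a role description on stdin and
# print every expected permission that does not appear in it as a whole word.
missing_permissions() {
  described="$(cat)"
  for perm in $1; do
    if ! printf '%s\n' "$described" | grep -qwF -- "$perm"; then
      printf '%s\n' "$perm"
    fi
  done
}

# Example usage (replace <project_id> and the role ID with your own values):
#   gcloud iam roles describe atlanUserRole --project=<project_id> \
#     | missing_permissions "bigquery.datasets.get bigquery.jobs.create bigquery.tables.get"
```

An empty result means every listed permission was found. The whole-word match keeps bigquery.tables.get from being satisfied by bigquery.tables.getData.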

Create a service account

Once you have created a custom role, you will need to create a service account and add your custom role to it.

To create a service account:

  1. Open the Google Cloud console.
  2. From the left menu under IAM and admin, click Service accounts.
  3. Select a Google Cloud project.
  4. From the upper left of the Service accounts page, click Create Service Account.
  5. For Service account details, enter the following details:
    1. For Service account name, enter a service account name to display in the Google Cloud console.
    2. For Service account ID, the Google Cloud console generates a service account ID based on this name. Edit the ID if necessary, because the ID cannot be changed later.
    3. (Optional) For Service account description, enter a description for the service account.
    4. Click Create and continue to proceed to the next step.
  6. For Grant this service account access to the project, enter the following details:
    1. Click the Select a role dropdown and then select the custom role you created in the previous step, for example, Atlan User Role.
    2. Click Continue to proceed to the next step.
  7. Click Done to finish the service account setup.

Create a service account key

Once you have created a service account, you will need to create a service account key for crawling Google BigQuery.

To create a service account key:

  1. Open the Google Cloud console.
  2. From the left menu under IAM and admin, click Service accounts.
  3. Select the Google Cloud project for which you created the service account.
  4. On the Service accounts page, click the email address of the service account that you want to create a key for.
  5. From the upper left of your service account page, click the Keys tab.
  6. On the Keys page, click the Add Key dropdown and then click Create new key.
  7. In the Create private key dialog, for Key type, click JSON and then click Create. This creates a service account key file and downloads it to your computer. Store the key file in a secure location, because you will not be able to download it again.
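
Before uploading the key anywhere, a quick sanity check on the downloaded file can catch a wrong or truncated download. This is a minimal sketch, assuming the usual layout of a JSON service account key (fields such as type, project_id, private_key, and client_email). check_key_file is a hypothetical helper, and the grep-based checks are a rough test, not a full JSON validation.

```shell
# check_key_file PATH: verify that PATH looks like a service account key and
# restrict its permissions to the current user.
check_key_file() {
  key_file="$1"
  [ -f "$key_file" ] || { echo "not found: $key_file" >&2; return 1; }
  for field in type project_id private_key client_email; do
    grep -q "\"$field\"" "$key_file" \
      || { echo "missing field: $field" >&2; return 1; }
  done
  grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file" \
    || { echo "not a service account key" >&2; return 1; }
  chmod 600 "$key_file"   # readable and writable by the owner only
  echo "ok: $key_file"
}

# Example usage (the path is illustrative):
#   check_key_file ~/Downloads/my-project-1234567890ab.json
```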

Google Cloud CLI

Prerequisites

You will need to set up the Google Cloud CLI in any one of the following development environments:

  • Cloud Shell: to use an online terminal with the gcloud CLI already set up, activate Cloud Shell:
    • To launch a Cloud Shell session from the Google Cloud console, open the Google Cloud console, and from the top right, click the Activate Cloud Shell icon.
    • A Cloud Shell session will start and display a command-line prompt. It can take a few seconds for the session to initialize.
  • Local shell: to use a local development environment, install and initialize the gcloud CLI.
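
Whichever environment you choose, it may be worth confirming that the gcloud CLI is on your PATH and pointed at the right project before continuing. The following is a small sketch; require_cmd is a hypothetical helper introduced only for this example.

```shell
# require_cmd NAME: succeed if NAME is an executable on PATH, otherwise
# print a hint to stderr and return 1.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1 (install it or use Cloud Shell)" >&2
  return 1
}

if require_cmd gcloud; then
  gcloud --version                  # show the installed components
  gcloud config get-value project   # show the active project, if one is set
fi
```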

Create a custom role

To create a custom role with the required permissions, run the following command. To include any optional permissions, append them to the comma-separated --permissions list:

gcloud iam roles create atlanUserRole --project=<project_id> \
    --title="Atlan User Role" --description="Atlan User Role to extract metadata" \
    --permissions="bigquery.datasets.get,bigquery.datasets.getIamPolicy,bigquery.jobs.create,bigquery.readsessions.create,bigquery.readsessions.getData,bigquery.readsessions.update,bigquery.routines.get,bigquery.routines.list,bigquery.tables.get,bigquery.tables.getIamPolicy,bigquery.tables.list,resourcemanager.projects.get" \
    --stage=ALPHA
  • Replace <project_id> with the project ID of your Google Cloud project.
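
If the role already exists and you later decide to enable an optional feature, the permissions can be added to it rather than recreating the role. The sketch below only prints the command so that you can review it first. It assumes the role ID atlanUserRole from the command above, uses gcloud's --add-permissions flag on iam roles update, and uses my-project as a placeholder project ID. print_role_update is a hypothetical helper.

```shell
# print_role_update PROJECT_ID PERMISSIONS: print (without running) a gcloud
# command that adds comma-separated PERMISSIONS to the atlanUserRole role.
print_role_update() {
  printf 'gcloud iam roles update atlanUserRole --project=%s --add-permissions=%s\n' "$1" "$2"
}

# Optional permissions for data preview and querying, as listed under Permissions.
PREVIEW_PERMISSIONS="bigquery.tables.getData,bigquery.jobs.get,bigquery.jobs.listAll,bigquery.jobs.update"

print_role_update "my-project" "$PREVIEW_PERMISSIONS"
```

Once the printed command looks right, copy it into your terminal to apply the change.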

Create a service account

To create a service account, run the following command:

gcloud iam service-accounts create atlan-user \
    --description="Atlan Service Account to extract metadata" \
    --display-name="Atlan User"

To add your custom role to your service account, run the following command:

gcloud projects add-iam-policy-binding <project_id> \
    --member="serviceAccount:atlan-user@<project_id>.iam.gserviceaccount.com" \
    --role="projects/<project_id>/roles/atlanUserRole"
  • Replace <project_id> with the project ID of your Google Cloud project.

Create a service account key

To create a service account key, run the following command:

gcloud iam service-accounts keys create <key_file_path> \
    --iam-account="atlan-user@<project_id>.iam.gserviceaccount.com"
  • Replace <key_file_path> with the path to a new output file for the private key, for example, ~/atlan-user-private-key.json.
  • Replace <project_id> with the project ID of your Google Cloud project.
🚨 Careful! Due to limitations at source, Atlan currently does not support generating lineage from bq cp commands, for example, bq cp <source-table> <destination-table>.
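
The service account commands above all depend on the account ID and the email address derived from it, so checking both before running anything can save a failed call. This is a minimal sketch, assuming Google Cloud's documented rule that a service account ID is 6 to 30 characters of lowercase letters, digits, and hyphens. Both helper names and the my-project value are illustrative.

```shell
# valid_account_id ID: succeed only if ID is 6-30 characters long, starts with a
# lowercase letter, ends with a letter or digit, and otherwise contains only
# lowercase letters, digits, and hyphens.
valid_account_id() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# account_email ID PROJECT_ID: print the email address of the service account.
account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

if valid_account_id "atlan-user"; then
  account_email "atlan-user" "my-project"
fi
```

The printed address is the value to pass as the serviceAccount: member when binding the role and as --iam-account when creating the key.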
