Atlan extracts metadata from Google BigQuery through read-only access. Once you have crawled metadata for your Google BigQuery assets, you can mine query history to construct lineage. If you have enabled sample data preview or querying, asset previews and queries are cost-optimized for tables only. For views and materialized views, Atlan sends you a cost nudge before you run the preview or query the data.
You will need to create a service account to enable Atlan to extract metadata from Google BigQuery. To create a service account, you can use either the Google Cloud console or the Google Cloud CLI.
Permissions
Atlan requires the following permissions to extract metadata from Google BigQuery.
(Required) For metadata crawling
To configure permissions for crawling metadata, add the following permissions to the custom role:
- bigquery.datasets.get: allows Atlan to retrieve metadata about a dataset.
- bigquery.datasets.getIamPolicy: allows Atlan to read a dataset's IAM permissions.
- bigquery.jobs.create: allows Atlan to run jobs (including queries) within the project. Careful! Without this, Atlan can't query the source.
- bigquery.routines.get: allows Atlan to retrieve routine definitions and metadata.
- bigquery.routines.list: allows Atlan to list routines and metadata on routines.
- bigquery.tables.get: allows Atlan to retrieve table metadata.
- bigquery.tables.getIamPolicy: allows Atlan to read a table's IAM policy.
- bigquery.tables.list: allows Atlan to list tables and metadata on tables.
- bigquery.readsessions.create: allows Atlan to create a session to stream large results.
- bigquery.readsessions.getData: allows Atlan to retrieve data from the session.
- bigquery.readsessions.update: allows Atlan to cancel the session.
- resourcemanager.projects.get: allows Atlan to retrieve project names and metadata.
(Optional) To add data preview and querying
To configure permissions for previewing and querying data, add the following permissions to the custom role:
- bigquery.tables.getData: allows Atlan to retrieve table data. Careful! This permission is also required for retrieving metadata such as the row count and update time of a table.
- bigquery.jobs.get: allows Atlan to retrieve data and metadata on any job, including queries.
- bigquery.jobs.listAll: allows Atlan to list all jobs and retrieve metadata on any job submitted by any user.
- bigquery.jobs.update: allows Atlan to cancel any job, including a running query.
(Optional) To add query history mining
To configure permissions for mining query history, add the following permissions to the custom role:
- bigquery.jobs.listAll: allows Atlan to fetch all queries for a project.
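The three permission sets above can be combined into the single comma-separated value that a custom role definition expects. The following is a minimal sketch, assuming a POSIX shell; the variable names and the join_permissions helper are hypothetical and are not part of Atlan or the gcloud CLI.

```shell
#!/bin/sh
# Required permissions for metadata crawling (copied from the list above).
required="bigquery.datasets.get bigquery.datasets.getIamPolicy
bigquery.jobs.create bigquery.routines.get bigquery.routines.list
bigquery.tables.get bigquery.tables.getIamPolicy bigquery.tables.list
bigquery.readsessions.create bigquery.readsessions.getData
bigquery.readsessions.update resourcemanager.projects.get"

# Optional permissions for data preview and querying.
preview="bigquery.tables.getData bigquery.jobs.get
bigquery.jobs.listAll bigquery.jobs.update"

# Optional permission for query history mining.
mining="bigquery.jobs.listAll"

# Hypothetical helper: print its arguments as one comma-separated
# string, skipping any permission that was already emitted.
join_permissions() {
  seen=","
  out=""
  for perm in "$@"; do
    case "$seen" in
      *",$perm,"*) continue ;;
    esac
    seen="${seen}${perm},"
    out="${out:+$out,}$perm"
  done
  printf '%s\n' "$out"
}

# Word splitting on the unquoted variables is intentional here.
ALL_PERMISSIONS="$(join_permissions $required $preview $mining)"
printf '%s\n' "$ALL_PERMISSIONS"
```

The resulting string could then be supplied as the --permissions value when creating the custom role with the gcloud CLI. Drop the preview or mining variables from the call if you do not need those features.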
Google Cloud console
Create a custom role
You will need to create a custom role in the Google Cloud console for integration with Atlan.
To create a custom role:
- Open the Google Cloud console.
- From the left menu under IAM and admin, click Roles.
- Using the dropdown list at the top of the page, select the project in which you want to create a role.
- From the upper left of the Roles page, click Create Role.
- In the Create role page, enter the following details:
  - For Title, enter a meaningful name for the custom role, for example, Atlan User Role.
  - (Optional) For Description, enter a description for the custom role.
  - For ID, the Google Cloud console generates a custom role ID based on the custom role name. Edit the ID if necessary; the ID cannot be changed later.
  - (Optional) For Role launch stage, assign a stage for the custom role, for example, Alpha, Beta, or General Availability.
  - Click Add permissions to select the permissions you want to include in the custom role. In the Add permissions dialog, click the Enter property name or value filter and add the required and any optional permissions.
  - Click Create to finish custom role setup.
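Because the permissions are picked by hand in the console, it can be worth confirming that none of the required ones were missed. The sketch below assumes the gcloud CLI is available and that the role ID is atlanUserRole; missing_permissions is a hypothetical helper, not an Atlan or gcloud command. It reads a role description on standard input and prints each required permission that does not appear in it.

```shell
#!/bin/sh
# Required permissions for metadata crawling, as listed under Permissions.
REQUIRED_PERMISSIONS="bigquery.datasets.get bigquery.datasets.getIamPolicy
bigquery.jobs.create bigquery.routines.get bigquery.routines.list
bigquery.tables.get bigquery.tables.getIamPolicy bigquery.tables.list
bigquery.readsessions.create bigquery.readsessions.getData
bigquery.readsessions.update resourcemanager.projects.get"

# Hypothetical helper: read a role description from stdin and print
# every required permission that is absent from it.
missing_permissions() {
  granted="$(cat)"
  for perm in $REQUIRED_PERMISSIONS; do
    # -F: match literally; -w: whole-word match, so that
    # bigquery.tables.get is not satisfied by bigquery.tables.getData.
    if ! printf '%s\n' "$granted" | grep -qwF -- "$perm"; then
      printf '%s\n' "$perm"
    fi
  done
}

# Example (assumed role ID; replace <project_id> with your project ID):
#   gcloud iam roles describe atlanUserRole --project=<project_id> | missing_permissions
```

Empty output means every required permission was found in the role description.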
Create a service account
Once you have created a custom role, you will need to create a service account and add your custom role to it.
To create a service account:
- Open the Google Cloud console.
- From the left menu under IAM and admin, click Service accounts.
- Select a Google Cloud project.
- From the upper left of the Service accounts page, click Create Service Account.
- For Service account details, enter the following details:
  - For Service account name, enter a service account name to display in the Google Cloud console.
  - For Service account ID, the Google Cloud console generates a service account ID based on this name. Edit the ID if necessary; the ID cannot be changed later.
  - (Optional) For Service account description, enter a description for the service account.
  - Click Create and continue to proceed to the next step.
- For Grant this service account access to the project, enter the following details:
  - Click the Select a role dropdown and then select the custom role you created in the previous step, for example, Atlan User Role.
  - Click Continue to proceed to the next step.
- Click Done to finish the service account setup.
Create service account key
Once you have created a service account, you will need to create a service account key for crawling Google BigQuery.
To create a service account key:
- Open the Google Cloud console.
- From the left menu under IAM and admin, click Service accounts.
- Select the Google Cloud project for which you created the service account.
- On the Service accounts page, click the email address of the service account that you want to create a key for.
- From the upper left of your service account page, click the Keys tab.
- On the Keys page, click the Add Key dropdown and then click Create new key.
- In the Create private key dialog, for Key type, click JSON and then click Create. This will create a service account key file. Download the key file and store it in a secure location; you will not be able to download it again.
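Since the key file cannot be downloaded a second time, a quick sanity check right after download may save a round trip. The following is a sketch, assuming a POSIX shell; check_key_file is a hypothetical helper. It only confirms that the file looks like a service account key (the standard type, project_id, private_key_id, private_key, and client_email fields are present) and restricts the file to its owner. It does not verify that the key is valid.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the file looks like a service
# account key, 1 otherwise. On success, restrict access to the owner.
check_key_file() {
  key_file="$1"
  if [ ! -r "$key_file" ]; then
    echo "cannot read key file: $key_file" >&2
    return 1
  fi
  # A service account key declares "type": "service_account".
  if ! grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file"; then
    echo "not a service account key: $key_file" >&2
    return 1
  fi
  # Check for the other fields a service account key carries.
  for field in project_id private_key_id private_key client_email; do
    if ! grep -q "\"$field\"" "$key_file"; then
      echo "missing field: $field" >&2
      return 1
    fi
  done
  # Only the file owner should be able to read the private key.
  chmod 600 "$key_file"
}

# Example (the path is an assumption):
#   check_key_file ~/atlanUser-private-key.json && echo "key file looks fine"
```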
Google Cloud CLI
Prerequisites
You will need to set up the Google Cloud CLI in any one of the following development environments:
- Cloud Shell: to use an online terminal with the gcloud CLI already set up, activate Cloud Shell as follows:
  - To launch a Cloud Shell session from the Google Cloud console, open the Google Cloud console, and from the top right, click the Activate Cloud Shell icon.
  - A Cloud Shell session will start and display a command-line prompt. It can take a few seconds for the session to initialize.
- Local shell: to use a local development environment, install and initialize the gcloud CLI.
Create a custom role
To create a custom role with the required permissions, run the following command. To include any optional permissions, append them to the --permissions list:
gcloud iam roles create atlanUserRole --project=<project_id> \
--title="Atlan User Role" --description="Atlan User Role to extract metadata" \
--permissions="bigquery.datasets.get,bigquery.datasets.getIamPolicy,bigquery.jobs.create,bigquery.readsessions.create,bigquery.readsessions.getData,bigquery.readsessions.update,bigquery.routines.get,bigquery.routines.list,bigquery.tables.get,bigquery.tables.getIamPolicy,bigquery.tables.list,resourcemanager.projects.get" \
--stage=ALPHA
- Replace <project_id> with the project ID of your Google Cloud project.
Create a service account
To create a service account, run the following command:
gcloud iam service-accounts create atlanUser \
--description="Atlan Service Account to extract metadata" \
--display-name="Atlan User"
To add your custom role to your service account, run the following command:
gcloud projects add-iam-policy-binding <project_id> \
--member="serviceAccount:atlanUser@<project_id>.iam.gserviceaccount.com" \
--role="projects/<project_id>/roles/atlanUserRole"
- Replace <project_id> with the project ID of your Google Cloud project.
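To confirm that the binding took effect, you could list the roles attached to the service account. This is a sketch under assumptions: service_account_email and list_bound_roles are hypothetical helpers, and the --flatten and --filter expressions follow the usual gcloud pattern for filtering an IAM policy by member. Nothing runs until you call list_bound_roles yourself.

```shell
#!/bin/sh
# Hypothetical helper: build the service account email from the
# service account ID and the project ID.
service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Hypothetical helper: print the roles bound to the service account
# in the project. This calls gcloud and needs permission to read the
# project's IAM policy.
list_bound_roles() {
  sa_id="$1"
  project_id="$2"
  email="$(service_account_email "$sa_id" "$project_id")"
  gcloud projects get-iam-policy "$project_id" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${email}" \
    --format="value(bindings.role)"
}

# Example (replace <project_id> with your project ID):
#   list_bound_roles atlanUser <project_id>
```

If the binding succeeded, the output should include the custom role created earlier.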
Create a service account key
To create a service account key, run the following command:
gcloud iam service-accounts keys create <key_file_path> \
--iam-account="atlanUser@<project_id>.iam.gserviceaccount.com"
- Replace <key_file_path> with the path to a new output file for the private key, for example, ~/atlanUser-private-key.json.
- Replace <project_id> with the project ID of your Google Cloud project.
bq cp commands, for example, bq cp <source-table> <destination-table>.