How to crawl dbt

Once you have configured a dbt Cloud service token or uploaded your dbt Core project files to S3, you can crawl dbt metadata into Atlan.

To enrich metadata in Atlan from dbt, complete the following steps.

Select the source

To select dbt as your source:

  1. In the top right of any screen, navigate to New and then click New Workflow.
  2. From the list of packages, select dbt Enrichment and then click Setup Workflow.

Configure the crawler

dbt Core

To complete the dbt crawler configuration for dbt Core:

  1. For Extraction method click Core.
  2. Enter the details for the S3 location of your project files:
    • For S3 bucket name enter the name of the S3 bucket containing your project files. Do not include the leading s3://.
    • For S3 prefix enter the path of the prefix in the S3 bucket up to, but not including, the project name.
    • For S3 region enter the name of the S3 region in which the bucket exists. (If you are re-using Atlan's bucket, you can leave this blank.)
  3. (Optional) To specify the metadata to include or exclude, for Advanced options select Yes:
    1. For Exclude filter pattern enter an AWS filter pattern to exclude folders from being crawled. This will be evaluated first.
    2. For Include filter pattern enter an AWS filter pattern to include folders to crawl.
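
The way the S3 fields relate to a full project location can be sketched in Python. This is only an illustration, not part of Atlan: the function name split_s3_location and the example URI are hypothetical, and whether the prefix field expects a trailing slash is not stated here, so the sketch leaves it off.

```python
from urllib.parse import urlparse


def split_s3_location(project_uri: str) -> dict:
    """Split a full S3 URI for a dbt project folder into form-style fields.

    Hypothetical helper: it mirrors the rules described above
    (bucket name without "s3://", prefix up to but not including
    the project name).
    """
    parsed = urlparse(project_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {project_uri!r}")

    # Drop empty segments caused by leading, trailing or doubled slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if not parts:
        raise ValueError("URI must include the project folder")

    return {
        "bucket": parsed.netloc,          # bucket name only, no "s3://"
        "prefix": "/".join(parts[:-1]),   # everything before the project name
        "project": parts[-1],             # last segment, not part of the prefix
    }


# Example with a made-up bucket and path.
print(split_s3_location("s3://my-dbt-artifacts/teams/analytics/jaffle_shop"))
```

For the made-up URI above, the bucket field would be my-dbt-artifacts and the prefix field would be teams/analytics, with jaffle_shop left out because it is the project name.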

dbt Cloud

To complete the dbt crawler configuration for dbt Cloud:

  1. For Extraction method click Cloud, then enter your dbt Cloud connection details, including the service token you configured.
  2. At the bottom of the form, click the Test Authentication button to confirm connectivity to dbt Cloud using these details.
  3. (Optional) In the Exclude Metadata field, select the dbt projects you want to exclude from crawling. (If you select none, no projects are excluded.)
  4. (Optional) In the Include Metadata field, select the dbt projects you want to include in crawling. (If you select none, all projects are included.)
💪 Did you know? If a project appears in both the include and exclude filters, the exclude filter takes precedence.
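
The filter rules above (include defaults to all projects, exclude defaults to none, and exclude wins on conflict) can be modeled with a short Python sketch. This is an illustration of the described behavior only, not Atlan's implementation; the function name projects_to_crawl and the project names are hypothetical.

```python
def projects_to_crawl(all_projects, include=None, exclude=None):
    """Return the projects that would be crawled under the stated rules.

    Hypothetical model of the include/exclude behavior:
    - an empty include filter means "all projects"
    - an empty exclude filter means "no projects"
    - a project in both filters is excluded
    """
    included = set(include) if include else set(all_projects)
    excluded = set(exclude) if exclude else set()
    # Preserve the original ordering of all_projects.
    return [p for p in all_projects if p in included and p not in excluded]


projects = ["finance", "marketing", "sandbox"]

# No filters: everything is crawled.
print(projects_to_crawl(projects))

# "sandbox" is in both filters, so the exclude filter takes precedence.
print(projects_to_crawl(projects, include=["finance", "sandbox"], exclude=["sandbox"]))
```

In the second call only finance remains, because sandbox is removed by the exclude filter even though it was also included.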

Run the crawler

To run the dbt crawler, after completing the steps above:

  • To run the crawler once, immediately, at the bottom of the screen click the Run button.
  • To schedule the crawler to run hourly, daily, weekly or monthly, at the bottom of the screen click the Schedule & Run button.

Once the crawler has completed running, you will see dbt metadata populated on any pre-existing Atlan assets! 🎉
