Option 1: Use the Atlan S3 bucket
To avoid access issues, we recommend uploading the required files, manifest.json and run_results.json, to the same S3 bucket as Atlan. Raise a support request to get the details of your Atlan bucket, and include the ARN of the IAM user or IAM role to which we can provision access.
Create IAM policy
You will need to create an IAM policy and attach it to the IAM user or role to upload the required files to your Atlan bucket. To create an IAM policy with the necessary permissions, follow the steps in the AWS Identity and Access Management User Guide.
Create the policy using the following JSON:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>/*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>"
      ],
      "Effect": "Allow"
    }
  ]
}
- Replace <bucket_name> with the name of your Atlan bucket.
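To avoid hand-editing the placeholder, the policy above can be generated with the bucket name filled in. The following is a minimal sketch using only the Python standard library; the function name build_upload_policy and the bucket name in the usage line are illustrative assumptions, not part of Atlan's tooling.

```python
import json


def build_upload_policy(bucket_name: str) -> dict:
    """Return the IAM policy shown above with <bucket_name> filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                # Object-level actions apply to keys inside the bucket.
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
                "Effect": "Allow",
            },
            {
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                # Bucket-level actions apply to the bucket itself.
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
                "Effect": "Allow",
            },
        ],
    }


if __name__ == "__main__":
    # "example-atlan-bucket" is a placeholder, not a real bucket name.
    print(json.dumps(build_upload_policy("example-atlan-bucket"), indent=2))
```

The printed JSON can then be pasted into the policy editor described in the AWS Identity and Access Management User Guide.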
If you instead opt to use your own S3 bucket, you will need to complete the following steps:
Option 2: Use your own S3 bucket
You'll first need to create a cross-account bucket policy giving Atlan's IAM role access to your bucket. A cross-account bucket policy is required since your Atlan tenant and S3 bucket may not always be deployed in the same AWS account. The permissions required for the S3 bucket are GetBucketLocation, ListBucket, and GetObject.
To create a cross-account bucket policy:
- Raise a support ticket to get the ARN of the Node Instance Role for your Atlan EKS cluster.
- Create a new policy to allow access by this ARN and update your bucket policy with the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<role-arn>"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>",
        "arn:aws:s3:::<bucket-name>/<prefix>/*"
      ]
    }
  ]
}
  - Replace <role-arn> with the role ARN of Atlan's node instance role.
  - Replace <bucket-name> with the name of the bucket you are creating.
  - Replace <prefix> with the name of the prefix (directory) within that bucket where you will upload the files.
- Once the new policy has been set up, please notify the support team. Your request should include the S3 bucket name and prefix. This should be done prior to setting up the workflow so that we can create and attach an IAM policy for your bucket to Atlan's IAM role.
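Before notifying the support team, it may be worth checking that the bucket policy actually grants the three required permissions to the role ARN. The following is a rough sketch of such a check, assuming the policy has been loaded into a Python dict (for example with json.load). The helper names are hypothetical, and the check matches action names exactly, so it does not understand wildcards such as s3:* or conditions.

```python
REQUIRED_ACTIONS = {"s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"}


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def missing_actions(policy: dict, role_arn: str) -> set:
    """Return the required actions that no Allow statement grants to role_arn."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # a lone statement may not be wrapped in a list

    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "*"; this sketch only looks at explicit AWS principals
        if role_arn in _as_list(principal.get("AWS", [])):
            granted.update(_as_list(statement.get("Action", [])))
    return REQUIRED_ACTIONS - granted
```

An empty set means every required action is granted to that role; any returned names point to actions still missing from the policy.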
(Optional) Update KMS policy
If your S3 bucket is encrypted, you will need to update your KMS policy. This will allow Atlan to decrypt the objects in your S3 bucket.
- Provide the KMS key ARN and KMS key alias ARN to the Atlan support team. The KMS key that you provide must be a customer managed KMS key. (This is because you can only change the key policy for a customer managed KMS key, and not for an AWS managed KMS key. Refer to AWS documentation to learn more.)
- To whitelist the ARN of Atlan's node instance, update the KMS policy with the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DecryptCrossAccount",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<role-arn>"
      },
      "Action": [
        "kms:Decrypt",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}
- Replace <role-arn> with the role ARN of Atlan's node instance role.
Structure the bucket
Atlan uses the metadata.invocation_id and metadata.project_id attributes to uniquely identify and link the uploaded files. Atlan does not use file paths to identify the project or job that a file belongs to. The following directory structure is provided as a guideline:
Multiple projects
Atlan supports extracting dbt metadata from multiple dbt projects. The main-prefix has the format s3://<BUCKET_NAME>/<PATH_PREFIX>, and the Atlan support team will provide it to you after setting up access policies on your bucket.
You will need to use the following directory structure:
main-prefix
  - project1
    - job1
      - manifest.json
      - other files
    - job2
      - manifest.json
      - other files
  - project2
    - job3
      - manifest.json
      - other files
    - job4
      - manifest.json
      - other files
  - project3
    - job5
      - manifest.json
      - other files
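The layout above maps directly onto S3 object locations of the form main-prefix/project/job/file. The following sketch shows one way to build those locations and split them into the bucket and key that upload tools typically expect. The function names, bucket name, and prefix are illustrative assumptions; the real main-prefix is the one the Atlan support team provides.

```python
from urllib.parse import urlparse


def artifact_uri(main_prefix: str, project: str, job: str, filename: str) -> str:
    """Join main-prefix/project/job/filename into a single S3 URI."""
    parts = [main_prefix.rstrip("/"), project.strip("/"), job.strip("/"), filename]
    return "/".join(parts)


def split_uri(uri: str) -> tuple:
    """Split an s3:// URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Placeholder main-prefix; substitute the value provided for your bucket.
    uri = artifact_uri("s3://example-bucket/dbt", "project1", "job1", "manifest.json")
    print(uri)
    print(split_uri(uri))
```

Keeping each job in its own folder means files from different jobs never overwrite each other, even though Atlan itself links files by their metadata rather than by path.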
Single project
Even if you have a single dbt project, Atlan recommends that you follow the directory structure above.
Upload project files
To load correct metadata, Atlan processes the manifest.json and run_results.json files for each job. There are many ways to load the metadata; the following are Atlan's suggested approaches.
You will need to upload the files from the target directory of the dbt project into distinct folders.
Upload the run artifacts generated from the following commands:
- (Required) Compilation results: dbt compile --full-refresh
  - This command will generate files that contain a full representation of your dbt project's resources, including models, tests, macros, node configurations, resource properties, and more.
  - Files to upload: manifest.json and run_results.json
  - Alternatively, you can upload the same files by running the dbt run --full-refresh command.
- (Optional) Test results: dbt test
  - This command will execute all dbt tests in a dbt project and generate files that contain the test results.
  - Files to upload: manifest.json and run_results.json
- (Optional) Catalog: dbt docs generate
  - This command will generate metadata about the tables and views produced by the models in your dbt project, for example, column data types and table statistics.
  - Files to upload: manifest.json and catalog.json
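Because each dbt command rewrites files in the target directory, it is easy to copy a manifest.json from one run alongside a run_results.json from another. The following sketch reads metadata.invocation_id from both files so that the pair can be compared before uploading. It assumes the standard dbt artifact layout with a top-level metadata object; the function names are hypothetical, and whether Atlan rejects a mismatched pair is not stated here, so treat the check as a precaution only.

```python
import json
from pathlib import Path

# The two files this page lists for compile, run, and test uploads.
ARTIFACTS = ("manifest.json", "run_results.json")


def read_invocation_ids(target_dir: str) -> dict:
    """Map each artifact found in target_dir to its metadata.invocation_id."""
    ids = {}
    for name in ARTIFACTS:
        path = Path(target_dir) / name
        if not path.is_file():
            continue  # skip files that the last command did not produce
        with path.open(encoding="utf-8") as handle:
            metadata = json.load(handle).get("metadata", {})
        ids[name] = metadata.get("invocation_id")
    return ids


def same_invocation(ids: dict) -> bool:
    """True when every artifact found reports the same, non-empty invocation_id."""
    values = set(ids.values())
    return len(values) == 1 and None not in values


if __name__ == "__main__":
    found = read_invocation_ids("target")  # default dbt output directory
    print(found, "consistent" if same_invocation(found) else "mismatch or missing")
```

Copying the files out of target immediately after each command, into that job's own folder, avoids the mismatch in the first place.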