How to set up Amazon Athena

🤓 Who can do this? You will probably need your Amazon Athena administrator to complete these steps, as you may not have access yourself.
💪 Did you know? Prefixing all resources created for Atlan with atlan- will help you better identify them. You should also add AWS tags and descriptions to these resources for later reference.

Create IAM policy

To create an IAM policy with the necessary permissions, follow the steps in the AWS Identity and Access Management User Guide.

Create the policy using the following JSON:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:GetObject",
        "glue:GetTables",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:SearchTables",
        "glue:GetTableVersions",
        "glue:GetTableVersion",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:GetUserDefinedFunctions",
        "glue:GetUserDefinedFunction",
        "athena:GetTableMetadata",
        "athena:StartQueryExecution",
        "athena:GetQueryResults",
        "athena:GetDatabase",
        "athena:GetDataCatalog",
        "athena:ListQueryExecutions",
        "athena:GetWorkGroup",
        "athena:StopQueryExecution",
        "athena:GetQueryResultsStream",
        "athena:ListDatabases",
        "athena:GetQueryExecution",
        "athena:ListTableMetadata",
        "athena:BatchGetQueryExecution"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:tableVersion/*/*/*",
        "arn:aws:glue:<region>:<account_id>:table/*/*",
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/*",
        "arn:aws:athena:<region>:<account_id>:datacatalog/*",
        "arn:aws:athena:<region>:<account_id>:workgroup/*",
        "arn:aws:s3:::<data_bucket>",
        "arn:aws:s3:::<data_bucket>/*"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucketMultipartUploads",
        "s3:AbortMultipartUpload",
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::<s3_bucket>/*",
        "arn:aws:s3:::<s3_bucket>"
      ]
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": "athena:ListDataCatalogs",
      "Resource": "*"
    }
  ]
}
  • Replace <region> with the AWS region of your Athena instance.
  • Replace <account_id> with your AWS account ID.
  • Replace <data_bucket> with the S3 bucket where your actual data resides, such as the data underlying your Glue tables.
  • Replace <s3_bucket> with the S3 bucket where Athena can store temporary query results.
    💪 Did you know? We recommend using Atlan's deployment bucket to store these results. This ensures all Atlan assets are managed in a single bucket.
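
The placeholder replacements above can also be scripted so that no placeholder is missed. The following is a minimal sketch, not an Atlan or AWS tool: the function name fill_placeholders, the abbreviated TEMPLATE, and the sample values are all hypothetical. In practice you would paste the full policy JSON from above into TEMPLATE.

```python
import json
import re

# Abbreviated, illustrative template only; use the full policy JSON in practice.
TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["glue:GetDatabases", "glue:GetTables"],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<data_bucket>", "arn:aws:s3:::<data_bucket>/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": ["arn:aws:s3:::<s3_bucket>/*"]
    }
  ]
}
"""

# Hypothetical sample values; substitute your own.
VALUES = {
    "region": "us-east-1",
    "account_id": "123456789012",
    "data_bucket": "example-data-bucket",
    "s3_bucket": "atlan-example-results",
}


def fill_placeholders(template: str, values: dict) -> dict:
    """Replace <name> placeholders and return the parsed policy document."""
    filled = template
    for name, value in values.items():
        filled = filled.replace(f"<{name}>", value)
    # Refuse to continue if any <placeholder> was left unreplaced.
    leftovers = sorted(set(re.findall(r"<[A-Za-z_]+>", filled)))
    if leftovers:
        raise ValueError(f"unreplaced placeholders: {leftovers}")
    # json.loads raises an error if the edited policy is no longer valid JSON.
    return json.loads(filled)


policy = fill_placeholders(TEMPLATE, VALUES)
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the policy editor in the AWS console.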

If you have an external Hive metastore connected to Athena, also add the following statement to the Statement array of the policy:

{
  "Sid": "VisualEditor3",
  "Effect": "Allow",
  "Action": [
    "lambda:InvokeFunction",
    "lambda:GetFunction"
  ],
  "Resource": [
    "arn:aws:lambda:<region>:<account_id>:function:<lambda_fn_name>"
  ]
}
  • Replace <region> with the AWS region of your Athena instance.
  • Replace <account_id> with your account ID.
  • Replace <lambda_fn_name> with the name of the Lambda function used to configure the catalog.

These permissions allow Atlan to invoke the Lambda function.
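
If you keep the policy as a parsed JSON document, the Lambda statement can be appended programmatically. This is a minimal sketch under that assumption; the function name add_hive_metastore_statement and the sample values in the usage note are hypothetical.

```python
import copy


def add_hive_metastore_statement(policy: dict, region: str,
                                 account_id: str, lambda_fn_name: str) -> dict:
    """Return a copy of the policy with the Lambda statement appended."""
    statement = {
        "Sid": "VisualEditor3",
        "Effect": "Allow",
        "Action": ["lambda:InvokeFunction", "lambda:GetFunction"],
        "Resource": [
            f"arn:aws:lambda:{region}:{account_id}:function:{lambda_fn_name}"
        ],
    }
    # Work on a deep copy so the caller's original policy stays unchanged.
    updated = copy.deepcopy(policy)
    updated["Statement"].append(statement)
    return updated
```

For example, add_hive_metastore_statement(policy, "us-east-1", "123456789012", "example-hive-connector") returns a new policy whose last statement grants access to that one function only.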

🚨 Careful! If you're using AWS Lake Formation to manage access to your AWS resources, you will also need to grant permissions in AWS Lake Formation on the objects you want to crawl.

Choose authentication mechanism

Using the policy created above, configure one of the following options for authentication.

User-based authentication

To configure user-based authentication:

  1. Create an AWS IAM user by following the steps in the AWS Identity and Access Management User Guide.
  2. On the Set permissions page, attach the policy created in the Create IAM policy section to this user.
  3. Once the user is created, view or download the user's access key ID and secret access key.
    🚨 Careful! This will be your only opportunity to view or download the access keys. You will not have access to them again after leaving the user creation screen.
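
The same three steps can be scripted instead of using the console. The following is a minimal sketch, assuming the boto3 library is installed and your own credentials are allowed to manage IAM. The function name create_atlan_user is hypothetical; the IAM client is passed in as an argument so the function itself does not depend on how you authenticate.

```python
def create_atlan_user(iam, user_name: str, policy_arn: str) -> tuple:
    """Create an IAM user, attach the policy, and return its access keys.

    `iam` is expected to be a boto3 IAM client, for example
    boto3.client("iam").
    """
    # Step 1: create the user.
    iam.create_user(UserName=user_name)
    # Step 2: attach the policy created earlier.
    iam.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    # Step 3: create the access key pair. The secret access key is only
    # returned by this call, so store it securely right away.
    key = iam.create_access_key(UserName=user_name)["AccessKey"]
    return key["AccessKeyId"], key["SecretAccessKey"]
```

A call might look like create_atlan_user(boto3.client("iam"), "atlan-athena-user", "arn:aws:iam::123456789012:policy/atlan-athena-policy"), where the user name, account ID, and policy name are placeholders for your own values.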

Role delegation-based authentication

To configure role delegation-based authentication:

  1. Raise a support ticket to get the ARN of the Node Instance Role for your Atlan EKS cluster.
  2. Create a new role in your AWS account by following the steps in the AWS Identity and Access Management User Guide.
    1. When prompted for policies, attach the policy created in the Create IAM policy section to this role.
    2. When prompted, create a trust relationship for the role using the following trust policy. (Replace <atlan_nodeinstance_role_arn> with the ARN received from Atlan support.)
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "AWS": "<atlan_nodeinstance_role_arn>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
          }
        ]
      }
    3. (Optional) To use an external ID for additional security:
      1. Generate the external ID within Atlan.
      2. Paste the external ID into the trust policy, replacing <atlan_generated_external_id>:
        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                "AWS": "<atlan_nodeinstance_role_arn>"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                "StringEquals": {
                  "sts:ExternalId": "<atlan_generated_external_id>"
                }
              }
            }
          ]
        }
    4. Now, reach out to Atlan support with:
      • The name of the role you created above.
      • The ID of the AWS account where the role was created.
    🚨 Careful! Wait until the support team confirms the account is allowlisted to assume the role before running the crawler.
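
The two trust policy variants above differ only in the Condition block, so they can be generated from one helper. The following is a minimal sketch, assuming boto3 is installed for the role creation part. The function names build_trust_policy and create_atlan_role are hypothetical; the IAM client is passed in as an argument.

```python
import json


def build_trust_policy(node_instance_role_arn: str, external_id: str = None) -> dict:
    """Build the trust policy, adding the external ID condition if given."""
    condition = {}
    if external_id:
        condition = {"StringEquals": {"sts:ExternalId": external_id}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": node_instance_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": condition,
            }
        ],
    }


def create_atlan_role(iam, role_name: str, policy_arn: str, trust_policy: dict) -> str:
    """Create the role, attach the policy, and return the new role's ARN.

    `iam` is expected to be a boto3 IAM client, for example
    boto3.client("iam").
    """
    role = iam.create_role(
        RoleName=role_name,
        # The IAM API expects the trust policy as a JSON string.
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role["Role"]["Arn"]
```

Calling build_trust_policy with only the ARN from Atlan support produces the first variant; passing the external ID generated in Atlan as the second argument produces the variant with the StringEquals condition.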
