How to set up a private network link to Hive

AWS PrivateLink creates a secure, private connection between services running in AWS. This document describes the steps to set this up between Hive and Atlan.

🤓 Who can do this? You will need your AWS administrator involved, as you may not have access to run these tasks yourself.

Prerequisites

You should already have the following:

  • Hive instance running in AWS (private EMR instance).
  • Atlan hosted in the same region as the Hive instance.
💪 Did you know? You will also need Atlan's AWS account ID later in this process. If you do not already have this, request it now from support.

Set up network to EMR instance

To set up the private network of your Hive EMR instance, from within AWS:

Copy network settings

To copy the network settings of your Hive EMR instance:

  1. From the left menu, under EMR on EC2, click Clusters.
  2. In the Clusters table, click on your Hive EMR cluster.
  3. From the cluster's Network and security tab, under Network, for Virtual Private Cloud (VPC), click on your VPC to view more details.
  4. Under your VPC's Details tab, copy and save the value under the IPv4 CIDR column.
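If you prefer to script this lookup rather than use the console, the following is a minimal sketch in Python. It assumes the boto3 SDK is installed and AWS credentials are already configured; the function names are hypothetical and not part of any Atlan tooling. The parsing helper is kept separate from the AWS call so it can be checked without network access.

```python
import ipaddress


def primary_ipv4_cidr(describe_vpcs_response):
    """Return the primary IPv4 CIDR from an EC2 describe_vpcs response."""
    vpcs = describe_vpcs_response.get("Vpcs", [])
    if len(vpcs) != 1:
        raise ValueError(f"expected exactly one VPC, found {len(vpcs)}")
    cidr = vpcs[0]["CidrBlock"]
    # Fail early if the value is not a well-formed IPv4 network.
    ipaddress.IPv4Network(cidr)
    return cidr


def fetch_vpc_cidr(vpc_id):
    """Look up the VPC in AWS (assumes boto3 and configured credentials)."""
    import boto3  # imported lazily so the parsing helper works without it

    ec2 = boto3.client("ec2")
    return primary_ipv4_cidr(ec2.describe_vpcs(VpcIds=[vpc_id]))
```

Calling fetch_vpc_cidr with the ID of the VPC shown on the cluster's Network and security tab should return the same value as the IPv4 CIDR column in the console.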

Create inbound rule

To create an inbound rule allowing your VPC access to your Hive EMR instance:

  1. From the left menu, under EMR on EC2, click Clusters.
  2. In the Clusters table, click on your Hive EMR cluster.
  3. From the cluster's Network and security tab, click the downward arrow for EC2 security groups (firewall) to expand this section.
  4. Under EC2 security groups (firewall), click on a security group for the cluster.
  5. Under the Inbound rules tab, click the Edit inbound rules button.
  6. At the bottom left of the Inbound rules table, click the Add rule button.
    1. For Type, select Custom TCP.
    2. For Port range, enter the port on which Hive is accessible.
    3. For Source, choose Custom and enter the IPv4 CIDR range you saved for your VPC (see Copy network settings).
  7. Below the bottom right of the Inbound rules table, click the Save rules button.
  8. Repeat steps 4 to 7 for each security group in the cluster.
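The inbound rule can also be sketched programmatically. The example below is a rough illustration, not Atlan's or AWS's prescribed method: it assumes a boto3 EC2 client is passed in as ec2, and it assumes the rule should be limited to TCP on the single Hive port. The helper names and the rule description are hypothetical.

```python
import ipaddress


def build_inbound_rule(vpc_cidr, hive_port):
    """Build the IpPermissions entry for a TCP inbound rule on one port."""
    ipaddress.IPv4Network(vpc_cidr)  # raises ValueError if malformed
    if not 1 <= hive_port <= 65535:
        raise ValueError(f"invalid TCP port: {hive_port}")
    return {
        "IpProtocol": "tcp",
        "FromPort": hive_port,
        "ToPort": hive_port,
        "IpRanges": [
            {"CidrIp": vpc_cidr, "Description": "Hive access from VPC"}
        ],
    }


def add_inbound_rule(ec2, security_group_ids, vpc_cidr, hive_port):
    """Apply the same rule to every security group of the cluster."""
    rule = build_inbound_rule(vpc_cidr, hive_port)
    for group_id in security_group_ids:
        ec2.authorize_security_group_ingress(
            GroupId=group_id, IpPermissions=[rule]
        )
```

For example, build_inbound_rule("10.0.0.0/16", 10000) describes a rule for HiveServer2's default port; substitute the CIDR and port that apply to your cluster.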

Create internal Network Load Balancer

Start creating NLB

To create an NLB, from within AWS:

  1. Navigate to Services, then Compute, then EC2.
  2. On the left, under Load Balancing, click on Load Balancers.
  3. At the top of the screen, click the Create Load Balancer button.
  4. Under the Network Load Balancer option, click the Create button.
  5. Enter the following Basic configuration settings for the load balancer:
    1. For Load balancer name, enter a unique name.
    2. For Scheme, select Internal.
    3. For IP address type, select IPv4.
  6. Enter the following Network mapping settings for the load balancer:
    1. For VPC, select the VPC where the Hive instance is located (see Copy network settings).
    2. For Mappings, select the availability zones with private subnets.
  7. Enter the following Listeners and routing settings for the load balancer:
    1. For Port, enter the port value used in Create inbound rule.
    2. For Default action, click the Create target group link. This will open the target group creation in a new browser tab.

Create target group

To create a target group for the NLB:

  1. Enter the following Basic configuration settings for the target group:
    1. For Choose target type, select Instances.
    2. For Target group name, enter a name.
    3. For Port, enter the port value used in Create inbound rule.
    4. For VPC, select the VPC where the Hive instance is located (see Copy network settings).
    5. At the bottom of the form, click the Next button.
  2. From the Available instances table:
    1. Click the checkbox next to your Hive instance.
    2. Enter the same port value used in the steps above.
    3. Click the Include as pending below button.
  3. At the bottom right of the form, click the Create target group button.

Finish creating NLB

Return to the browser tab where you started the NLB creation, and continue:

  1. Under Listeners and routing, click the refresh arrow to the far right of the Default action drop-down box.
  2. Select the target group you created above in the Default action drop-down.
  3. At the bottom right of the form, click the Create load balancer button.
  4. In the resulting screen, click the View load balancer button.
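The three console stages above (start the NLB, create the target group, finish the NLB) map onto a handful of Elastic Load Balancing API calls. The following is a minimal sketch, assuming a boto3 elbv2 client is passed in and that a plain TCP listener is wanted. The function name, the "-tg" suffix, and all argument values are hypothetical.

```python
def create_internal_nlb(elbv2, name, vpc_id, subnet_ids, instance_id, port):
    """Create a target group, an internal NLB, and a TCP listener joining them."""
    # Target group of type "instance", holding the Hive instance.
    target_group = elbv2.create_target_group(
        Name=f"{name}-tg",
        Protocol="TCP",
        Port=port,
        VpcId=vpc_id,
        TargetType="instance",
    )["TargetGroups"][0]
    elbv2.register_targets(
        TargetGroupArn=target_group["TargetGroupArn"],
        Targets=[{"Id": instance_id, "Port": port}],
    )
    # Internal (non-internet-facing) IPv4 Network Load Balancer.
    load_balancer = elbv2.create_load_balancer(
        Name=name,
        Type="network",
        Scheme="internal",
        IpAddressType="ipv4",
        Subnets=subnet_ids,
    )["LoadBalancers"][0]
    # Listener that forwards the Hive port to the target group.
    elbv2.create_listener(
        LoadBalancerArn=load_balancer["LoadBalancerArn"],
        Protocol="TCP",
        Port=port,
        DefaultActions=[
            {"Type": "forward", "TargetGroupArn": target_group["TargetGroupArn"]}
        ],
    )
    return load_balancer["LoadBalancerArn"], target_group["TargetGroupArn"]
```

Unlike the console flow, the sketch creates the target group first so that its ARN is available when the listener is defined. Pass the private subnet IDs for subnet_ids and boto3.client("elbv2") for elbv2.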

Verify target group is healthy

To verify the target group is healthy:

  1. From the EC2 menu on the left, under Load Balancing, click Target Groups.
  2. From the Target groups table, click the row for the target group you created above.
  3. At the bottom of the screen, under the Details tab, check that there is a 1 under both Total targets and Healthy.
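The same health check can be approximated in code. This sketch assumes a boto3 elbv2 client and the ARN of the target group created earlier; the helper names are hypothetical. It mirrors the console check of one total target and one healthy target.

```python
def healthy_target_count(target_health_response):
    """Count targets reported as healthy in a describe_target_health response."""
    descriptions = target_health_response.get("TargetHealthDescriptions", [])
    return sum(
        1 for d in descriptions if d["TargetHealth"]["State"] == "healthy"
    )


def target_group_is_healthy(elbv2, target_group_arn, expected_targets=1):
    """True when the expected number of targets exist and all are healthy."""
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    total = len(response.get("TargetHealthDescriptions", []))
    healthy = healthy_target_count(response)
    return total == expected_targets and healthy == expected_targets
```

A newly registered target can take a short while to pass its health checks, so a False result immediately after creation may simply mean the checks have not completed yet.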

Create endpoint service

To create an endpoint service, from within AWS:

  1. Navigate to Services, then Networking & Content Delivery, then VPC.
  2. From the menu on the left, under Virtual private cloud, click Endpoint services.
  3. At the top of the page, click the Create endpoint service button.
  4. Enter the following Endpoint service settings:
    1. For Name, enter a meaningful name.
    2. For Load balancer type, choose Network.
  5. For Available load balancers, select the load balancer you created above in Create internal Network Load Balancer.
  6. Enter the following Additional settings:
    1. For Require acceptance for endpoint, enable Acceptance required.
    2. For Supported IP address types, enable IPv4.
  7. At the bottom right of the form, click the Create button.
  8. Under the Details of the endpoint service, copy the hostname under Service name.
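For reference, the endpoint service settings above correspond roughly to a single EC2 API call. The sketch below assumes a boto3 EC2 client and the ARN of the NLB created earlier; the function name is hypothetical. It does not set the console's Name field, which would need to be applied separately as a tag.

```python
def create_endpoint_service(ec2, nlb_arn):
    """Create an endpoint service fronted by the NLB, with acceptance required."""
    response = ec2.create_vpc_endpoint_service_configuration(
        NetworkLoadBalancerArns=[nlb_arn],
        AcceptanceRequired=True,
        SupportedIpAddressTypes=["ipv4"],
    )
    config = response["ServiceConfiguration"]
    # ServiceName is the value to share; ServiceId is needed for later calls.
    return config["ServiceId"], config["ServiceName"]
```

The returned service name is the same value shown under Service name in the console, and the service ID is what the permission and acceptance calls expect.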

Allow Atlan account access

To allow Atlan's account access to the service, from within the endpoint service screen:

  1. At the bottom of the screen, change to the Allow principals tab.
  2. At the top of the Allow principals table, click the Allow principals button.
  3. Under Principals to add and ARN, enter the Atlan account ID.
  4. At the bottom right of the form, click the Allow principals button.
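When allowing a principal through the API, the principal is expressed as an ARN. The sketch below assumes Atlan support has given you a bare 12-digit account ID and that you want to allow the whole account, which is written as the account's root ARN. The function names are hypothetical, and ec2 is assumed to be a boto3 EC2 client.

```python
import re


def account_root_arn(account_id):
    """Turn a 12-digit AWS account ID into the account-level principal ARN."""
    account_id = account_id.strip()
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are exactly 12 digits")
    return f"arn:aws:iam::{account_id}:root"


def allow_principal(ec2, service_id, account_id):
    """Add the account to the endpoint service's allowed principals."""
    ec2.modify_vpc_endpoint_service_permissions(
        ServiceId=service_id,
        AddAllowedPrincipals=[account_root_arn(account_id)],
    )
```

If support provides a full ARN rather than an account ID, pass it to AddAllowedPrincipals unchanged instead of using account_root_arn.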

Notify Atlan support

Once all the above steps are complete, provide Atlan support with the following information:

  • The hostname for the endpoint service created above.
  • The port number for your Hive instance.

There are additional steps Atlan then needs to complete:

  • Creating a security group.
  • Creating an endpoint.

Once the Atlan team has confirmed the configuration is ready, please continue with the remaining steps.

Accept the consumer connection request

To accept the consumer connection request, from within AWS:

  1. Navigate to Services, then Networking & Content Delivery, then VPC.
  2. From the menu on the left, under Virtual private cloud, click Endpoint services.
  3. From the Endpoint services table, select the endpoint service you created in Create endpoint service.
  4. At the bottom of the screen, change to the Endpoint connections tab.
    1. You should see a row in the Endpoint connections table with a State of Pending.
    2. Select this row, and click the Actions button and then Accept endpoint connection request.
    3. If prompted to confirm, type accept into the field and click the Accept button.
  5. Wait for this to complete, which could take about 30 seconds.
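Accepting the request can likewise be sketched with the EC2 API. This example assumes a boto3 EC2 client and the service ID of your endpoint service; the function names are hypothetical. The state is compared case-insensitively because the exact casing returned for the pending state is an assumption here.

```python
def pending_endpoint_ids(connections_response):
    """Return IDs of endpoint connections that are awaiting acceptance."""
    return [
        c["VpcEndpointId"]
        for c in connections_response.get("VpcEndpointConnections", [])
        # Compared in lower case; the API's exact casing is not assumed.
        if c["VpcEndpointState"].lower() == "pendingacceptance"
    ]


def accept_pending_connections(ec2, service_id):
    """Accept every pending connection request on the endpoint service."""
    response = ec2.describe_vpc_endpoint_connections(
        Filters=[{"Name": "service-id", "Values": [service_id]}]
    )
    endpoint_ids = pending_endpoint_ids(response)
    if endpoint_ids:
        ec2.accept_vpc_endpoint_connections(
            ServiceId=service_id, VpcEndpointIds=endpoint_ids
        )
    return endpoint_ids
```

Because this accepts every pending request on the service, it is worth confirming with Atlan support that the listed endpoint belongs to them before running it.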

😅 The connection is now established. You can now use the service endpoint provided by Atlan support as the hostname to crawl Hive in Atlan! 🎉
