# gitops-eks-r53-arc

## Deploy the solution

### Prerequisites

- [Git](https://git-scm.com/)
- [AWS Command Line Interface (CLI)](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
- [Node.js](https://nodejs.org)
- [AWS Cloud Development Kit (CDK)](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html)
- [Flux](https://fluxcd.io/flux/installation/)
- [A configured Route 53 public hosted zone](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/CreatingHostedZone.html)

Clone the [solution repository from AWS Samples](https://github.com/aws-samples/gitops-eks-r53-arc) and navigate to the parent folder:

```bash
git clone https://github.com/aws-samples/gitops-eks-r53-arc.git
cd gitops-eks-r53-arc
```

This repository contains:

- `app/`: a set of Kubernetes manifests to deploy a 3rd-party sample application;
- `infra/`: a CDK application that deploys and configures the corresponding AWS services for you.

> AWS CDK is a framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. AWS CDK lets you build reliable, scalable, cost-effective applications in the cloud with the considerable expressive power of a programming language. Refer to the [AWS CDK documentation](https://docs.aws.amazon.com/cdk/v2/guide/home.html) to learn more.

## Bootstrap CDK

To start deploying the solution, you need to install the dependencies and bootstrap CDK first. Navigate into the `infra` folder and run the following in your preferred terminal:

```bash
cd infra
npm install
cdk bootstrap
```

## Deploy EKS clusters

After the bootstrapping process you're ready to deploy the CDK application, which is composed of several stacks. Each CDK stack maps 1:1 to an AWS CloudFormation stack. The first two stacks deploy two EKS clusters in different AWS Regions, one in us-west-2 (Oregon) and the other in eu-west-1 (Ireland):

```bash
cdk deploy US-EKSStack EMEA-EKSStack --require-approval never
```

It will take around 20-30 minutes to deploy _each_ stack.
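If you want to keep an eye on the deployments while `cdk deploy` runs, one option is to poll the CloudFormation stack status from a second terminal:

```bash
# Poll the current status of each stack while it deploys;
# both should eventually report CREATE_COMPLETE
aws cloudformation describe-stacks --stack-name US-EKSStack \
  --query "Stacks[0].StackStatus" --output text --region us-west-2
aws cloudformation describe-stacks --stack-name EMEA-EKSStack \
  --query "Stacks[0].StackStatus" --output text --region eu-west-1
```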
Run the following commands to save the relevant outputs of both stacks as environment variables to be used in later steps:

```bash
export US_EKS_ASG=$(aws cloudformation describe-stacks --stack-name US-EKSStack \
  --query "Stacks[0].Outputs[?OutputKey=='EKSASGArn'].OutputValue" \
  --output text --region us-west-2)
export EMEA_EKS_ASG=$(aws cloudformation describe-stacks --stack-name EMEA-EKSStack \
  --query "Stacks[0].Outputs[?OutputKey=='EKSASGArn'].OutputValue" \
  --output text --region eu-west-1)
export US_EKS_KUBECONFIG=$(aws cloudformation describe-stacks --stack-name US-EKSStack \
  --query "Stacks[0].Outputs[?starts_with(OutputKey, 'EKSClusteruswest2ConfigCommand')].OutputValue" \
  --output text --region us-west-2)
export EMEA_EKS_KUBECONFIG=$(aws cloudformation describe-stacks --stack-name EMEA-EKSStack \
  --query "Stacks[0].Outputs[?starts_with(OutputKey, 'EKSClustereuwest1ConfigCommand')].OutputValue" \
  --output text --region eu-west-1)
export US_EKS_VPC=$(aws cloudformation describe-stacks --stack-name US-EKSStack \
  --query "Stacks[0].Outputs[?OutputKey=='EKSVPCArn'].OutputValue" \
  --output text --region us-west-2)
export EMEA_EKS_VPC=$(aws cloudformation describe-stacks --stack-name EMEA-EKSStack \
  --query "Stacks[0].Outputs[?OutputKey=='EKSVPCArn'].OutputValue" \
  --output text --region eu-west-1)
```

Add the configuration for each EKS cluster to your local `kubeconfig` by running the following:

```bash
eval $US_EKS_KUBECONFIG && eval $EMEA_EKS_KUBECONFIG
```

> Please note that this CDK example was designed for demonstration purposes only and should not be used in production environments as is. Refer to the [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/), especially the [Security Best Practices section](https://aws.github.io/aws-eks-best-practices/security/docs/), to learn how to properly run production Kubernetes workloads on AWS.

## Create a CodeCommit Repository

> You can choose to use a different Git repository, such as GitHub or GitLab. Refer to the Flux documentation on how to set up your cluster with them: [GitHub](https://fluxcd.io/docs/cmd/flux_bootstrap_github/) | [GitLab](https://fluxcd.io/docs/cmd/flux_bootstrap_gitlab/)

```bash
cdk deploy CodeCommitRepositoryStack --require-approval never
```

> As of this writing, AWS CodeCommit does not offer a native replication mechanism across AWS Regions. You should consider your high-availability requirements and plan accordingly. You can build your own solution to [replicate AWS CodeCommit repositories between Regions using AWS Fargate](https://aws.amazon.com/blogs/devops/replicate-aws-codecommit-repository-between-regions-using-aws-fargate/), for example.

Run the command below to save your recently created repository URL as an environment variable to be used later:

```bash
export CODECOMMIT_REPO_URL=$(aws cloudformation describe-stacks \
  --stack-name CodeCommitRepositoryStack \
  --query "Stacks[0].Outputs[?OutputKey=='CodeCommitRepoUrl'].OutputValue" \
  --region us-west-2 --output text)
```
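As a sanity check, you can confirm the repository was created; this sketch assumes the stack named it `gitops-repo` (the name used throughout this walkthrough):

```bash
# Print the repository's SSH clone URL straight from CodeCommit
aws codecommit get-repository --repository-name gitops-repo \
  --query "repositoryMetadata.cloneUrlSsh" --output text --region us-west-2
```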
### Generate an SSH key pair for the CodeCommit user

Generate an SSH key pair for the `gitops` IAM user (created by the `CodeCommitRepositoryStack` stack with permissions to the CodeCommit repository) and [upload the public key to AWS IAM](https://console.aws.amazon.com/iam/home#/users/flux?section=security_credentials):

```bash
ssh-keygen -t rsa -C "gitops" -b 4096 -f id_rsa_gitops -N ""
aws iam upload-ssh-public-key --ssh-public-key-body file://id_rsa_gitops.pub --user-name gitops
```

Save the SSH key ID after the upload:

```bash
export SSH_KEY_ID=$(aws iam list-ssh-public-keys --user-name gitops \
  --query 'SSHPublicKeys[0].SSHPublicKeyId' --output text)
```

## Bootstrap Flux

### Bootstrapping Flux with your CodeCommit repository

Add the CodeCommit SSH host key to your `~/.ssh/known_hosts` file:

```bash
ssh-keyscan -H git-codecommit.us-west-2.amazonaws.com >> ~/.ssh/known_hosts
```

Prepend the SSH key ID you saved in the `SSH_KEY_ID` environment variable to the repository URL to form the SSH URL, and bootstrap Flux on **each EKS cluster** by running the following:

```bash
flux bootstrap git \
  --url=$(echo $CODECOMMIT_REPO_URL | sed "s/ssh:\/\//ssh:\/\/$SSH_KEY_ID@/g") \
  --username $SSH_KEY_ID \
  --private-key-file id_rsa_gitops
```

> Your final SSH URL should be something like `ssh://APKAEIBAERJR2EXAMPLE@git-codecommit.us-west-2.amazonaws.com/v1/repos/gitops-repo`.

When prompted, answer `y` to confirm giving the key access to the repository.

Change your current kubectl context to the other EKS cluster and re-run the `flux bootstrap` command:

```bash
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
kubectl config use-context "arn:aws:eks:us-west-2:${AWS_ACCOUNT_ID}:cluster/eks-cluster-us-west-2"
flux bootstrap git \
  --url=$(echo $CODECOMMIT_REPO_URL | sed "s/ssh:\/\//ssh:\/\/$SSH_KEY_ID@/g") \
  --username $SSH_KEY_ID \
  --private-key-file id_rsa_gitops
```

> Here we're using an SSH connection to CodeCommit specifically for Flux compatibility; temporary credentials should always [be your first choice](https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-git-remote-codecommit.html).

### Adding a demo application to the `gitops-repo` repository

Clone the `gitops-repo` repository using your preferred method. In this case we're using the HTTPS repository URL with the [CodeCommit credential helper provided by the AWS CLI](https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-https-unixes.html):

```bash
cd ..
git config --global credential.helper '!aws codecommit credential-helper $@'
git config --global credential.UseHttpPath true
git clone $(echo $CODECOMMIT_REPO_URL | sed "s/ssh:\/\//https:\/\//g")
```

Copy everything from the `app` folder to the `gitops-repo` folder:

```bash
cp -R app/* gitops-repo
```

The `app` folder contains our sample application that will be deployed in the two EKS clusters. It uses the [microservices-demo](https://github.com/microservices-demo/microservices-demo) app plus an [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) backed by the [AWS Load Balancer Controller](https://kubernetes-sigs.github.io/aws-load-balancer-controller/). Please note that [microservices-demo](https://github.com/microservices-demo/microservices-demo) is owned and maintained by a [3rd party](https://www.weave.works/).

Navigate into the `gitops-repo` directory and push the changes to the remote `gitops-repo` repository:

```bash
cd gitops-repo
git add .
git commit -m "Add microservices-demo to the gitops repo"
git push origin main
cd ..
```

After a few minutes Flux will deploy the demo application with an Application Load Balancer exposing your application to the world.
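You can verify the rollout from either cluster once Flux has reconciled the repository. A quick check (resource names and namespaces depend on the manifests in `app/`):

```bash
# Confirm Flux has applied the kustomizations from the Git repository
flux get kustomizations

# Confirm the Ingress exists and has been assigned an ALB hostname
kubectl get ingress --all-namespaces
```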
Save the ARNs of the Application Load Balancers created in `us-west-2` and `eu-west-1` as environment variables:

```bash
export US_EKS_ALB=$(aws elbv2 describe-load-balancers \
  --query "LoadBalancers[?starts_with(DNSName, 'microservices-demo')].LoadBalancerArn" \
  --region us-west-2 --output text)
export EMEA_EKS_ALB=$(aws elbv2 describe-load-balancers \
  --query "LoadBalancers[?starts_with(DNSName, 'microservices-demo')].LoadBalancerArn" \
  --region eu-west-1 --output text)
```

## Deploy the Route 53 Application Recovery Controller stack

For this step you will use the environment variables saved earlier plus the Application Load Balancer ARNs:

```bash
cd infra
cdk deploy Route53ARCStack \
  --parameters uswest2ASG=$US_EKS_ASG \
  --parameters euwest1ASG=$EMEA_EKS_ASG \
  --parameters uswest2VPC=$US_EKS_VPC \
  --parameters euwest1VPC=$EMEA_EKS_VPC \
  --parameters uswest2ALB=$US_EKS_ALB \
  --parameters euwest1ALB=$EMEA_EKS_ALB
```

As part of this deployment, two Route 53 health checks will be created. Save the ID of each of them for later use:

```bash
export US_R53_HEALTH_CHECK_ID=$(aws cloudformation describe-stacks --stack-name Route53ARCStack \
  --query "Stacks[0].Outputs[?starts_with(OutputKey, 'Route53ARCStackMultiRegionEKSuswest2HealthCheckId')].OutputValue" \
  --output text --region us-west-2)
export EMEA_R53_HEALTH_CHECK_ID=$(aws cloudformation describe-stacks --stack-name Route53ARCStack \
  --query "Stacks[0].Outputs[?starts_with(OutputKey, 'Route53ARCStackMultiRegionEKSeuwest1HealthCheckId')].OutputValue" \
  --output text --region us-west-2)
```

### Take a look at the Route 53 Application Recovery Controller console

At this point you should have the following deployed:

1. 2 EKS clusters (one per Region)
2. 2 Application Load Balancers (one per Region)
3. 1 Route 53 ARC cluster
4. 1 Route 53 ARC recovery group with two cells (one per Region), with VPCs, ASGs, and ALBs bundled into their respective resource sets
5. 2 Route 53 ARC routing controls (one per Region) and 2 Route 53 health checks linked together

Let's open the [Route 53 Application Recovery Controller console](https://us-west-2.console.aws.amazon.com/route53recovery/home) and navigate to its [Readiness check portion](https://us-west-2.console.aws.amazon.com/route53recovery/home#/readiness/home). It's under the Multi-Region section of the left-side navigation pane:

![](img/readiness-check.png)

With readiness checks you can programmatically make sure your application (recovery group) and its cells are healthy, which means your application is ready to fail over. (You can also query readiness from the CLI; see the sketch at the end of this section.)

Let's move to the [Routing control portion](https://us-west-2.console.aws.amazon.com/route53recovery/home#/recovery-control/home) of Route 53 ARC. It's also within the Multi-Region section of the left-side navigation pane:

![](img/recovery-control.png)

Click on the `MultiRegionEKS-ControlPanel` control panel:

![](img/control-panel.png)

### Turning your routing controls on

By default, recently created routing controls are switched off, which means the corresponding Route 53 health checks are in an unhealthy state, preventing traffic from being routed. Select the two routing controls created for you, click on `Change routing control states`, turn them `On`, enter `Confirm` in the text input field, and click on `Change traffic routing` so we can configure our Route 53 hosted zone properly:

![](img/routing-control-state.png)
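Here is a minimal sketch of the readiness check from the CLI, assuming the recovery group created by `Route53ARCStack` is the only one in your account (the ARC readiness API is served from us-west-2):

```bash
# Look up the recovery group created by the stack...
RECOVERY_GROUP=$(aws route53-recovery-readiness list-recovery-groups \
  --query "RecoveryGroups[0].RecoveryGroupName" --output text --region us-west-2)

# ...and summarize its readiness across both cells
aws route53-recovery-readiness get-recovery-group-readiness-summary \
  --recovery-group-name $RECOVERY_GROUP --region us-west-2
```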
## Let's set our Route 53 Hosted Zone to use those two new Health checks

Find your existing Route 53 hosted zone ID by running the following command. Make sure you replace `example.com` with the public domain you own:

```bash
export R53_DNS_NAME="example.com"
export R53_HOSTED_ZONE=$(aws route53 list-hosted-zones \
  --query "HostedZones[?Name=='$R53_DNS_NAME.'].Id" \
  --output text | sed "s/\/hostedzone\///g")
```

Next, extract the information we need to create our Route 53 records from each load balancer:

```bash
export US_EKS_ALB_ZONE_ID=$(aws elbv2 describe-load-balancers --region us-west-2 \
  --query "LoadBalancers[0].CanonicalHostedZoneId" --output text --load-balancer-arns $US_EKS_ALB)
export US_EKS_ALB_DNS_NAME=$(aws elbv2 describe-load-balancers --region us-west-2 \
  --query "LoadBalancers[0].DNSName" --output text --load-balancer-arns $US_EKS_ALB)
export EMEA_EKS_ALB_ZONE_ID=$(aws elbv2 describe-load-balancers --region eu-west-1 \
  --query "LoadBalancers[0].CanonicalHostedZoneId" --output text --load-balancer-arns $EMEA_EKS_ALB)
export EMEA_EKS_ALB_DNS_NAME=$(aws elbv2 describe-load-balancers --region eu-west-1 \
  --query "LoadBalancers[0].DNSName" --output text --load-balancer-arns $EMEA_EKS_ALB)
```

Now we'll create the corresponding Route 53 records. First, we need to generate a file with the changes, per the [change-resource-record-sets](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/route53/change-resource-record-sets.html) CLI command documentation:

```bash
cat << EOF > changes.json
{
    "Changes": [
        {
            "Action": "CREATE",
            "ResourceRecordSet": {
                "Name": "service.$R53_DNS_NAME",
                "Type": "A",
                "SetIdentifier": "US",
                "Failover": "PRIMARY",
                "AliasTarget": {
                    "HostedZoneId": "$US_EKS_ALB_ZONE_ID",
                    "DNSName": "$US_EKS_ALB_DNS_NAME",
                    "EvaluateTargetHealth": true
                },
                "HealthCheckId": "$US_R53_HEALTH_CHECK_ID"
            }
        },
        {
            "Action": "CREATE",
            "ResourceRecordSet": {
                "Name": "service.$R53_DNS_NAME",
                "Type": "A",
                "SetIdentifier": "EMEA",
                "Failover": "SECONDARY",
                "AliasTarget": {
                    "HostedZoneId": "$EMEA_EKS_ALB_ZONE_ID",
                    "DNSName": "$EMEA_EKS_ALB_DNS_NAME",
                    "EvaluateTargetHealth": true
                },
                "HealthCheckId": "$EMEA_R53_HEALTH_CHECK_ID"
            }
        }
    ]
}
EOF
```

> We're using the subdomain `service` with [Route 53's failover routing policy](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-failover.html) in this example. Route 53 ARC routing controls work regardless of the routing policy you use.

Then, we'll apply the change batch:

```bash
aws route53 change-resource-record-sets --hosted-zone-id $R53_HOSTED_ZONE --change-batch file://changes.json
```

## Testing Route 53 ARC Failover

The easiest way to verify that our routing controls are working as expected is to run `nslookup` queries. In our case, if the Application Load Balancer in the primary Region (us-west-2) is working properly, Route 53 will answer with that ALB's IP addresses:

```
nslookup service.$R53_DNS_NAME
Server:     192.168.4.1
Address:    192.168.4.1#53

Non-authoritative answer:
Name:   service.example.com
Address: 198.51.100.217
Name:   service.example.com
Address: 198.51.100.110
```

### Triggering the failover through Route 53 ARC Routing control

Let's imagine we're experiencing intermittent issues with us-west-2 and we want to shift traffic from our primary Region to the secondary one (eu-west-1).
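Before flipping the routing control, it can help to watch DNS answers from a second terminal so you can see the switch happen. A simple polling loop (any DNS tool works; this sketch uses `dig`):

```bash
# Print the A records Route 53 returns every 5 seconds;
# after failover the answers change from the us-west-2 ALB IPs to the eu-west-1 ones
while true; do
  dig +short service.$R53_DNS_NAME
  echo "---"
  sleep 5
done
```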
Navigate back to the `MultiRegionEKS-ControlPanel` [Routing control console](https://us-west-2.console.aws.amazon.com/route53recovery/home#/recovery-control/home), select the `MultiRegionEKS-ControlPanel-MultiRegionEKS-us-west-2` routing control, and click on `Change routing control states`:

![](img/routing-control-state-changes.png)

Turn it `Off`, enter `Confirm` in the text input field, and click on `Change traffic routing`:

![](img/routing-control-state-confirm.png)

The Route 53 health check linked with `MultiRegionEKS-ControlPanel-MultiRegionEKS-us-west-2` will start failing, which will make Route 53 stop returning the primary Region ALB's IP addresses. After a few seconds (depending on how your recursive DNS provider caches results), you will see a different response, pointing to the IP addresses of the ALB we deployed in our secondary Region (eu-west-1):

```
nslookup service.$R53_DNS_NAME
Server:     192.168.4.1
Address:    192.168.4.1#53

Non-authoritative answer:
Name:   service.example.com
Address: 203.0.113.176
Name:   service.example.com
Address: 203.0.113.46
```

> It's strongly advised to work with Route 53 ARC routing controls through the API (or CLI) instead of the Management Console during production-impacting events, as you can leverage the 5 geo-replicated endpoints directly. Refer to the [Route 53 ARC best practices documentation](https://docs.aws.amazon.com/r53recovery/latest/dg/route53-arc-best-practices.html) to learn more.
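To illustrate, here is a minimal sketch of flipping a routing control against the cluster's data-plane endpoints. `ROUTING_CONTROL_ARN`, `ENDPOINT`, and `ENDPOINT_REGION` are placeholders you fill in yourself: copy the routing control ARN from the console, and pick any of the endpoint URLs returned below (moving on to the next endpoint if a call fails):

```bash
# Look up the ARC cluster and list its five geo-replicated endpoints
# (the ARC control-plane API is served from us-west-2)
CLUSTER_ARN=$(aws route53-recovery-control-config list-clusters \
  --query "Clusters[0].ClusterArn" --output text --region us-west-2)
aws route53-recovery-control-config describe-cluster --cluster-arn $CLUSTER_ARN \
  --query "Cluster.ClusterEndpoints" --region us-west-2

# Flip the routing control Off using one of the data-plane endpoints;
# ENDPOINT is one of the endpoint URLs returned above, ENDPOINT_REGION its Region
aws route53-recovery-cluster update-routing-control-state \
  --routing-control-arn $ROUTING_CONTROL_ARN \
  --routing-control-state Off \
  --endpoint-url $ENDPOINT --region $ENDPOINT_REGION
```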
## Cleaning up

Please be mindful that the resources deployed here will incur charges on your AWS bill. After exploring this solution and adapting it to your own application's architecture, make sure to clean up your environment.

### Removing the Application Load Balancers and their respective Target Groups

#### Using Flux

You can delete the corresponding `microservices-demo-kustomization.yaml` and `ingress-kustomization.yaml` files plus the `ingress/` folder from your Flux-managed `gitops-repo` and push your changes to your Git repository:

```bash
cd ../gitops-repo
rm microservices-demo-kustomization.yaml
rm ingress-kustomization.yaml
rm -rf ingress
git add .
git commit -m "Remove sample application and ingress controller"
git push origin main
```

#### Using the AWS Management Console

You can also delete the corresponding Load Balancers ([us-west-2](https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#LoadBalancers) and [eu-west-1](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#LoadBalancers)) and Target Groups ([us-west-2](https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#TargetGroups:) and [eu-west-1](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#TargetGroups:)).

### Removing the resources you deployed using AWS CDK

You need to delete the SSH public key for the `gitops` user first:

```bash
aws iam delete-ssh-public-key --user-name gitops --ssh-public-key-id $SSH_KEY_ID
```

You can then remove the corresponding stacks through the CloudFormation console in [both](https://eu-west-1.console.aws.amazon.com/cloudformation/home?region=eu-west-1#/stacks) [Regions](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks), or issue `cdk destroy --all` from within the CDK app directory:

```bash
cd ../infra
cdk destroy --all --force
```

## Feedback/bug reports

Contributions are welcome, both in the form of issues and PRs. Please refer to [CONTRIBUTING](CONTRIBUTING.md) for additional details.

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

### Content Security Legal Disclaimer

> The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

## License

This library is licensed under the MIT-0 License. See the LICENSE file.