AWSTemplateFormatVersion: "2010-09-09" Description: Amazon EKS - Node Group Metadata: "AWS::CloudFormation::Interface": ParameterGroups: - Label: default: EKS Cluster Parameters: - ClusterName - ClusterControlPlaneSecurityGroup - Label: default: Worker Node Configuration Parameters: - NodeGroupName - NodeAutoScalingGroupMinSize - NodeAutoScalingGroupDesiredCapacity - NodeAutoScalingGroupMaxSize - NodeInstanceType - NodeImageIdSSMParam - NodeImageId - NodeVolumeSize - KeyName - BootstrapArguments - Label: default: Worker Network Configuration Parameters: - VpcId - SubnetId Parameters: BootstrapArguments: Type: String Default: "" Description: "Arguments to pass to the bootstrap script. See files/bootstrap.sh in https://github.com/awslabs/amazon-eks-ami" ClusterControlPlaneSecurityGroup: Type: "AWS::EC2::SecurityGroup::Id" Description: The security group of the cluster control plane. ClusterName: Type: String Description: The cluster name provided when the cluster was created. If it is incorrect, nodes will not be able to join the cluster. SubnetId: Type: "AWS::EC2::Subnet::Id" Description: Subnet ID to launch instances into. KeyName: Type: "AWS::EC2::KeyPair::KeyName" Description: The EC2 Key Pair to allow SSH access to the instances NodeAutoScalingGroupDesiredCapacity: Type: Number Default: 1 Description: Desired capacity of Node Group ASG. NodeAutoScalingGroupMaxSize: Type: Number Default: 4 Description: Maximum size of Node Group ASG. Set to at least 1 greater than NodeAutoScalingGroupDesiredCapacity. NodeAutoScalingGroupMinSize: Type: Number Default: 0 Description: Minimum size of Node Group ASG. NodeGroupName: Type: String Description: Unique identifier for the Node Group. NodeImageId: Type: String Default: "" Description: (Optional) Specify your own custom image ID. This value overrides any AWS Systems Manager Parameter Store value specified above. NodeImageIdSSMParam: Type: "AWS::SSM::Parameter::Value" Default: /aws/service/eks/optimized-ami/1.19/amazon-linux-2-gpu/recommended/image_id Description: AWS Systems Manager Parameter Store parameter of the AMI ID for the worker node instances. NodeInstanceType: Type: String Default: p4d.24xlarge AllowedValues: - p4d.24xlarge ConstraintDescription: Must be a valid EC2 instance type Description: EC2 instance type for the node instances NodeVolumeSize: Type: Number Default: 300 Description: Node volume size VpcId: Type: "AWS::EC2::VPC::Id" Description: The VPC of the worker instances Mappings: PartitionMap: aws: EC2ServicePrincipal: "ec2.amazonaws.com" aws-us-gov: EC2ServicePrincipal: "ec2.amazonaws.com" aws-cn: EC2ServicePrincipal: "ec2.amazonaws.com.cn" aws-iso: EC2ServicePrincipal: "ec2.c2s.ic.gov" aws-iso-b: EC2ServicePrincipal: "ec2.sc2s.sgov.gov" Conditions: HasNodeImageId: !Not - "Fn::Equals": - Ref: NodeImageId - "" Resources: PlacementGroup: Type: AWS::EC2::PlacementGroup Properties: Strategy: cluster NodeInstanceRole: Type: "AWS::IAM::Role" Properties: AssumeRolePolicyDocument: Version: "2012-10-17" Statement: - Effect: Allow Principal: Service: - !FindInMap [PartitionMap, !Ref "AWS::Partition", EC2ServicePrincipal] Action: - "sts:AssumeRole" ManagedPolicyArns: - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy" - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy" - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly" Path: / NodeInstanceProfile: Type: "AWS::IAM::InstanceProfile" Properties: Path: / Roles: - Ref: NodeInstanceRole NodeSecurityGroup: Type: "AWS::EC2::SecurityGroup" Properties: GroupDescription: Security group for all nodes in the cluster Tags: - Key: !Sub kubernetes.io/cluster/${ClusterName} Value: owned VpcId: !Ref VpcId NodeSecurityGroupIngress: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow node to communicate with each other FromPort: 0 GroupId: !Ref NodeSecurityGroup IpProtocol: "-1" SourceSecurityGroupId: !Ref NodeSecurityGroup ToPort: 65535 NodeSecurityGroupEgress: Type: "AWS::EC2::SecurityGroupEgress" DependsOn: NodeSecurityGroup Properties: Description: Allow the efa worker nodes outbound communication DestinationSecurityGroupId: !Ref NodeSecurityGroup FromPort: 0 GroupId: !Ref NodeSecurityGroup IpProtocol: "-1" ToPort: 65536 NodeSecurityGroupEgressAllIpv4: Type: "AWS::EC2::SecurityGroupEgress" DependsOn: NodeSecurityGroup Properties: Description: Allow the efa worker nodes outbound communication FromPort: 0 CidrIp: "0.0.0.0/0" GroupId: !Ref NodeSecurityGroup IpProtocol: "-1" ToPort: 65536 NodeSecurityGroupEgressAllIpv6: Type: "AWS::EC2::SecurityGroupEgress" DependsOn: NodeSecurityGroup Properties: Description: Allow the efa worker nodes outbound communication FromPort: 0 CidrIpv6: "::/0" GroupId: !Ref NodeSecurityGroup IpProtocol: "-1" ToPort: 65536 NodeSecurityGroupIngressSSHIpv4: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow SSH FromPort: 22 CidrIp: "0.0.0.0/0" GroupId: !Ref NodeSecurityGroup IpProtocol: "tcp" ToPort: 22 NodeSecurityGroupIngressSSHIpv6: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow SSH FromPort: 22 CidrIpv6: "::/0" GroupId: !Ref NodeSecurityGroup IpProtocol: "tcp" ToPort: 22 ClusterControlPlaneSecurityGroupIngress: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow pods to communicate with the cluster API Server FromPort: 443 GroupId: !Ref ClusterControlPlaneSecurityGroup IpProtocol: tcp SourceSecurityGroupId: !Ref NodeSecurityGroup ToPort: 443 ControlPlaneEgressToNodeSecurityGroup: Type: "AWS::EC2::SecurityGroupEgress" DependsOn: NodeSecurityGroup Properties: Description: Allow the cluster control plane to communicate with worker Kubelet and pods DestinationSecurityGroupId: !Ref NodeSecurityGroup FromPort: 1025 GroupId: !Ref ClusterControlPlaneSecurityGroup IpProtocol: tcp ToPort: 65535 ControlPlaneEgressToNodeSecurityGroupOn443: Type: "AWS::EC2::SecurityGroupEgress" DependsOn: NodeSecurityGroup Properties: Description: Allow the cluster control plane to communicate with pods running extension API servers on port 443 DestinationSecurityGroupId: !Ref NodeSecurityGroup FromPort: 443 GroupId: !Ref ClusterControlPlaneSecurityGroup IpProtocol: tcp ToPort: 443 NodeSecurityGroupFromControlPlaneIngress: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow worker Kubelets and pods to receive communication from the cluster control plane FromPort: 1025 GroupId: !Ref NodeSecurityGroup IpProtocol: tcp SourceSecurityGroupId: !Ref ClusterControlPlaneSecurityGroup ToPort: 65535 NodeSecurityGroupFromControlPlaneOn443Ingress: Type: "AWS::EC2::SecurityGroupIngress" DependsOn: NodeSecurityGroup Properties: Description: Allow pods running extension API servers on port 443 to receive communication from cluster control plane FromPort: 443 GroupId: !Ref NodeSecurityGroup IpProtocol: tcp SourceSecurityGroupId: !Ref ClusterControlPlaneSecurityGroup ToPort: 443 NodeLaunchTemplate: Type: "AWS::EC2::LaunchTemplate" Properties: LaunchTemplateData: BlockDeviceMappings: - DeviceName: /dev/xvda Ebs: DeleteOnTermination: true VolumeSize: !Ref NodeVolumeSize VolumeType: gp2 IamInstanceProfile: Arn: !GetAtt NodeInstanceProfile.Arn ImageId: !If - HasNodeImageId - Ref: NodeImageId - Ref: NodeImageIdSSMParam InstanceType: !Ref NodeInstanceType KeyName: !Ref KeyName NetworkInterfaces: - Description: NetworkInterfaces Configuration For EFA and EKS NetworkCardIndex: 0 DeviceIndex: 0 DeleteOnTermination: true Groups: - !Ref NodeSecurityGroup InterfaceType: efa SubnetId: !Ref SubnetId - Description: NetworkInterfaces Configuration For EFA and EKS NetworkCardIndex: 1 DeviceIndex: 1 DeleteOnTermination: true Groups: - !Ref NodeSecurityGroup InterfaceType: efa SubnetId: !Ref SubnetId - Description: NetworkInterfaces Configuration For EFA and EKS NetworkCardIndex: 2 DeviceIndex: 2 DeleteOnTermination: true Groups: - !Ref NodeSecurityGroup InterfaceType: efa SubnetId: !Ref SubnetId - Description: NetworkInterfaces Configuration For EFA and EKS NetworkCardIndex: 3 DeviceIndex: 3 DeleteOnTermination: true Groups: - !Ref NodeSecurityGroup InterfaceType: efa SubnetId: !Ref SubnetId Placement: GroupName: !Ref PlacementGroup UserData: !Base64 "Fn::Sub": | Content-Type: multipart/mixed; boundary="==BOUNDARY==" MIME-Version: 1.0 --==BOUNDARY== Content-Type: text/cloud-boothook; charset="us-ascii" cloud-init-per once yum_wget yum install -y wget cloud-init-per once wget_efa wget -q --timeout=20 https://s3-us-west-2.amazonaws.com/aws-efa-installer/aws-efa-installer-latest.tar.gz -O /tmp/aws-efa-installer-latest.tar.gz cloud-init-per once tar_efa tar -xf /tmp/aws-efa-installer-latest.tar.gz -C /tmp pushd /tmp/aws-efa-installer cloud-init-per once install_efa ./efa_installer.sh -y -g pop /tmp/aws-efa-installer cloud-init-per once efa_info /opt/amazon/efa/bin/fi_info -p efa --==BOUNDARY== Content-Type: text/x-shellscript; charset="us-ascii" #!/bin/bash set -o xtrace /etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArguments} /opt/aws/bin/cfn-signal --exit-code $? \ --stack ${AWS::StackName} \ --resource NodeGroup \ --region ${AWS::Region} --==BOUNDARY==-- MetadataOptions: "HttpPutResponseHopLimit": 2 NodeGroup: Type: "AWS::AutoScaling::AutoScalingGroup" Properties: DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity LaunchTemplate: LaunchTemplateId: !Ref NodeLaunchTemplate Version: !GetAtt NodeLaunchTemplate.LatestVersionNumber MaxSize: !Ref NodeAutoScalingGroupMaxSize MinSize: !Ref NodeAutoScalingGroupMinSize Tags: - Key: Name PropagateAtLaunch: true Value: !Sub ${ClusterName}-${NodeGroupName}-Node - Key: !Sub kubernetes.io/cluster/${ClusterName} PropagateAtLaunch: true Value: owned VPCZoneIdentifier: [ !Ref SubnetId ] UpdatePolicy: AutoScalingRollingUpdate: MaxBatchSize: 1 MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity PauseTime: PT5M Outputs: NodeInstanceRole: Description: The node instance role Value: !GetAtt NodeInstanceRole.Arn NodeAutoScalingGroup: Description: The autoscaling group Value: !Ref NodeGroup