AWSTemplateFormatVersion: '2010-09-09' Description: TDengine Cluster Deployment Partner Solution (qs-1tsfh53e9). Metadata: cfn-lint: { config: { ignore_checks: [ W9002, W9003, W9004, W9006 ] } } LICENSE: Apache License Version 2.0 QuickStartDocumentation: EntrypointName: Parameters to deploy TDengine Cluster Partner Solution into existing VPC. AWS::CloudFormation::Interface: ParameterGroups: - Label: default: VPC network configuration Parameters: - VPCCIDR - VPCID - PrivateSubnet1ID - PrivateSubnet2ID - PublicSubnet1ID - PublicSubnet2ID - Label: default: Partner Solution configuration Parameters: - QSS3BucketName - QSS3KeyPrefix - QSS3BucketRegion - RemoteAccessCIDR - Label: default: Amazon EC2 configuration Parameters: - KeyPairName - Label: default: Bastion configuration Parameters: - BastionAMIOS - BastionInstanceType - NumBastionHosts - Label: default: TDengine general configuration Parameters: - ParamBucketName - EntDownloadLink - Label: default: TDengine cluster configuration Parameters: - InstanceAmiId - InstanceType - DiskSize - NumberOfServers ParameterLabels: PublicSubnet1ID: default: Public subnet 1 ID PublicSubnet2ID: default: Public subnet 2 ID PrivateSubnet1ID: default: Private subnet 1 ID PrivateSubnet2ID: default: Private subnet 2 ID VPCID: default: VPC ID VPCCIDR: default: VPC CIDR RemoteAccessCIDR: default: Allowed bastion external access CIDR KeyPairName: default: Key pair name BastionAMIOS: default: Bastion AMI OS BastionInstanceType: default: Bastion instance type NumBastionHosts: default: Number of bastion hosts QSS3BucketName: default: S3 bucket name QSS3BucketRegion: default: S3 bucket Region QSS3KeyPrefix: default: S3 key prefix ParamBucketName: default: S3 bucket for DataTiering. By default, deploy SSD and S3 bucket two data tiering. EntDownloadLink: default: Enterprise download link InstanceAmiId: default: Instance AMI ID InstanceType: default: Instance type DiskSize: default: Disk size NumberOfServers: default: Number of servers Parameters: PublicSubnet1ID: Type: String Description: Public subnet ID in Availability Zone 1 of your existing VPC (example:subnet-a0246dcd). Default: "" PublicSubnet2ID: Type: String Description: Public subnet ID in Availability Zone 2 of your existing VPC (example:subnet-b1236eea). Default: "" PrivateSubnet1ID: Type: "AWS::EC2::Subnet::Id" Description: Privat subnet ID in Availability Zone 1 of your existing VPC (example:subnet-fe9a8b32). PrivateSubnet2ID: Type: "AWS::EC2::Subnet::Id" Description: Privat subnet ID in Availability Zone 2 of your existing VPC (example:subnet-fe9a8b32). VPCID: Type: "AWS::EC2::VPC::Id" Description: Existing VPC ID (example:vpc-0343606e). VPCCIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/([0-9]|[1-2][0-9]|3[0-2]))$ ConstraintDescription: Must be a valid IP range in the format x.x.x.x/x. Default: 10.0.0.0/16 Description: CIDR block for VPC. Type: String RemoteAccessCIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/([0-9]|[1-2][0-9]|3[0-2]))$ ConstraintDescription: CIDR block parameter must be in the format x.x.x.x/x. Description: Allowed CIDR block for external secure shell (SSH) access to bastion hosts. Default: 10.0.0.0/16 Type: String KeyPairName: Description: Key pairs allow you to connect to your instance after it launches. Type: AWS::EC2::KeyPair::KeyName BastionAMIOS: AllowedValues: - Amazon-Linux2-HVM - CentOS-7-HVM - Ubuntu-Server-20.04-LTS-HVM - SUSE-SLES-15-HVM Default: Amazon-Linux2-HVM Description: Linux distribution for AMI used for bastion instances. Type: String BastionInstanceType: AllowedValues: - t2.nano - t2.micro - t2.small - t2.medium - t2.large - t3.micro - t3.small - t3.medium - t3.large - t3.xlarge - t3.2xlarge - m5.large - m5.xlarge - m5.2xlarge - m4.large - m4.xlarge - m4.2xlarge - m4.4xlarge Default: t3.micro Description: Amazon EC2 instance type for bastion instances. Type: String NumBastionHosts: AllowedValues: - "1" - "2" - "3" - "4" Default: "1" Description: Enter number of bastion hosts to create. Type: String QSS3BucketName: AllowedPattern: ^[0-9a-z]+([0-9a-z-\.]*[0-9a-z])*$ ConstraintDescription: >- The S3 bucket name can include numbers, lowercase letters, and hyphens (-), but it cannot start or end with a hyphen. Default: aws-quickstart Description: >- Name of the S3 bucket for your copy of the deployment assets. Keep the default name unless you are customizing the template. Changing the name updates code references to point to a new location. MinLength: 3 MaxLength: 63 Type: String QSS3KeyPrefix: AllowedPattern: ^([0-9a-zA-Z!-_\.\*'\(\)/]+/)*$ ConstraintDescription: >- The S3 key prefix can include numbers, lowercase letters, uppercase letters, hyphens (-), underscores (_), periods (.), asterisks (*), single quotes ('), open parenthesis ((), close parenthesis ()), and forward slashes (/). End the prefix with a forward slash. Default: quickstart-taosdata-tdengine-enterprise/ Description: >- S3 key prefix that is used to simulate a folder for your copy of the deployment assets. Keep the default prefix unless you are customizing the template. Changing the prefix updates code references to point to a new location. Type: String QSS3BucketRegion: Default: us-east-1 Description: >- AWS Region where the S3 bucket (QSS3BucketName) is hosted. Keep the default Region unless you are customizing the template. Changing the Region updates code references to point to a new location. When using your own bucket, specify the Region. Type: String EntDownloadLink: Description: Enterprise install package download link. For more information, refer to https://www.taosdata.com/products#enterprise-edition-link. Type: String ParamBucketName: Description: Description of what it's for. AllowedPattern: '^[0-9a-zA-Z]+([0-9a-zA-Z-]*[0-9a-zA-Z])*$' Type: String InstanceAmiId: Type: String Description: Instance AMI ID for TDEngine EC2 instance. InstanceType: Description: EC2 instance type. Type: String Default: t3.medium AllowedValues: - t3.medium - t3.large - t3.xlarge - t3.2xlarge - c5.large - c5.xlarge - c5.2xlarge - c5.4xlarge - c5.9xlarge - c5.12xlarge - c5.18xlarge DiskSize: Description: Size (in GB) of Amazon EBS volume on each node. Type: Number Default: 100 MinValue: 10 ConstraintDescription: "Minimum disk size is 10." NumberOfServers: Description: Must be 3 or more to form a cluster. Type: Number Default: 3 AllowedValues: - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 Conditions: UsingDefaultBucket: !Equals [!Ref QSS3BucketName, 'aws-quickstart'] Resources: BastionStack: Type: AWS::CloudFormation::Stack Properties: TemplateURL: !Sub - 'https://${S3Bucket}.s3.${S3Region}.${AWS::URLSuffix}/${QSS3KeyPrefix}submodules/quickstart-linux-bastion/templates/linux-bastion-entrypoint-existing-vpc.template.yaml' - S3Region: !If [ UsingDefaultBucket, !Ref 'AWS::Region', !Ref QSS3BucketRegion ] S3Bucket: !If [ UsingDefaultBucket, !Sub '${QSS3BucketName}-${AWS::Region}', !Ref QSS3BucketName ] Parameters: VPCID: !Ref VPCID PublicSubnet1ID: !Ref PublicSubnet1ID PublicSubnet2ID: !Ref PublicSubnet2ID KeyPairName: !Ref KeyPairName QSS3BucketName: !Ref QSS3BucketName QSS3KeyPrefix: !Sub '${QSS3KeyPrefix}submodules/quickstart-linux-bastion/' QSS3BucketRegion: !Ref QSS3BucketRegion RemoteAccessCIDR: !Ref RemoteAccessCIDR NumBastionHosts: !Ref NumBastionHosts BastionInstanceType: !Ref BastionInstanceType BastionAMIOS: !Ref BastionAMIOS ExternalSecurityGroup: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: Enable SSH and TDengine external ports. VpcId: !Ref VPCID SecurityGroupIngress: - IpProtocol: tcp FromPort: 22 ToPort: 22 CidrIp: !Ref RemoteAccessCIDR InternalSecurityGroup: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: Enable TDengine internal ports. VpcId: !Ref VPCID SecurityGroupIngress: - IpProtocol: tcp FromPort: 22 ToPort: 22 SourceSecurityGroupId: !GetAtt ExternalSecurityGroup.GroupId - IpProtocol: tcp FromPort: 6041 ToPort: 6041 CidrIp: !Ref VPCCIDR - IpProtocol: tcp FromPort: 6030 ToPort: 6030 CidrIp: !Ref VPCCIDR InternalSecurityGroupIngress: Type: AWS::EC2::SecurityGroupIngress Properties: Description: Allow all traffic inside internal security group. GroupId: !Ref InternalSecurityGroup IpProtocol: -1 SourceSecurityGroupId: !GetAtt InternalSecurityGroup.GroupId CacheParameterGroup: Type: 'AWS::ElastiCache::ParameterGroup' Properties: Description: JuiceFSParameterGroup CacheParameterGroupFamily: redis7 Properties: maxmemory-policy: noeviction CacheSubnetGroup: Type: AWS::ElastiCache::SubnetGroup Properties: Description: cache-juicefs-subnet SubnetIds: - !Ref PrivateSubnet1ID - !Ref PrivateSubnet2ID RedisReplicationGroup: Type: AWS::ElastiCache::ReplicationGroup Properties: AtRestEncryptionEnabled: true ReplicationGroupDescription: RedisJuiceFSReplicationGroup AutomaticFailoverEnabled: true NumCacheClusters: 2 CacheNodeType: cache.t3.micro CacheParameterGroupName: !Ref CacheParameterGroup CacheSubnetGroupName: !Ref CacheSubnetGroup Engine: Redis EngineVersion: 7.0 MultiAZEnabled: true SecurityGroupIds: - !GetAtt InternalSecurityGroup.GroupId S3Bucket: Type: 'AWS::S3::Bucket' DeletionPolicy: Retain UpdateReplacePolicy: Retain Properties: AccessControl: BucketOwnerFullControl BucketName: !Ref ParamBucketName BucketEncryption: ServerSideEncryptionConfiguration: - ServerSideEncryptionByDefault: SSEAlgorithm: AES256 TDengineIAMRole: Type: 'AWS::IAM::Role' Properties: AssumeRolePolicyDocument: Version: "2012-10-17" Statement: - Effect: Allow Principal: Service: - ec2.amazonaws.com Action: - 'sts:AssumeRole' ManagedPolicyArns: - !Sub arn:${AWS::Partition}:iam::aws:policy/AmazonSSMManagedInstanceCore Path: / Policies: - PolicyName: TDengineIAMPolicy PolicyDocument: Version: "2012-10-17" Statement: - Effect: Allow Action: - 'ec2:CreateTags' Resource: !Sub arn:${AWS::Partition}:ec2:${AWS::Region}:${AWS::AccountId}:instance/* - Effect: Allow Action: - 'ec2:DescribeInstances' Resource: '*' - Effect: Allow Action: - 's3:DeleteObject' - 's3:GetObject' - 's3:GetObjectAttributes' - 's3:PutObject' Resource: - !Sub arn:${AWS::Partition}:s3:::${S3Bucket} - !Sub arn:${AWS::Partition}:s3:::${S3Bucket}/* - Effect: Allow Action: 'ssm:SendCommand' Resource: - !Sub arn:${AWS::Partition}:ssm:*:*:document/* - !Sub arn:${AWS::Partition}:ec2:*:${AWS::AccountId}:instance/* TDengineIAMProfile: Type: 'AWS::IAM::InstanceProfile' Properties: Path: / Roles: - !Ref TDengineIAMRole TDengineEC2LaunchTemplate: Type: 'AWS::EC2::LaunchTemplate' Properties: LaunchTemplateName: !Sub - ${StackID}-tdengine-launch-template-for-auto-scaling - StackID: !Select [2, !Split [ /, !Ref AWS::StackId ]] LaunchTemplateData: NetworkInterfaces: - DeviceIndex: 0 Groups: - !Ref InternalSecurityGroup DeleteOnTermination: true IamInstanceProfile: Arn: !GetAtt - TDengineIAMProfile - Arn Placement: Tenancy: default ImageId: !Ref InstanceAmiId KeyName: !Ref KeyPairName InstanceType: !Ref InstanceType BlockDeviceMappings: - DeviceName: /dev/xvda Ebs: VolumeType: gp3 VolumeSize: !Ref DiskSize DeleteOnTermination: true UserData: Fn::Base64: !Sub | #!/bin/bash #install tools mkdir /TDengine; cd /TDengine chown -R ec2-user:ec2-user /TDengine touch {localip,log} yum -y install jq sed curl -s https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip -o awscliv2.zip unzip awscliv2.zip; ./aws/install; rm -Rf aws curl -s http://169.254.169.254/latest/meta-data/local-ipv4 > /TDengine/localip curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/linux_32bit/session-manager-plugin.rpm" -o "session-manager-plugin.rpm" yum install -y session-manager-plugin.rpm region=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq .region -r` mkdir -p /var/lib/s3/taos curl -sSL https://d.juicefs.com/install | sh - juicefs format \ --storage s3 \ --bucket https://${S3Bucket.RegionalDomainName} \ redis://${RedisReplicationGroup.PrimaryEndPoint.Address}:6379/1 \ taosdata juicefs mount -d redis://${RedisReplicationGroup.PrimaryEndPoint.Address}:6379/1 /var/lib/s3/taos # install TDengine su ec2-user < /TDengine/dnodes #process with asg scaling, find existing dnode dnodedns=`aws ec2 describe-instances \ --region $region \ --filters "Name=tag-value,Values=TDengineEC2AutoScalingGroup" \ "Name=tag-value,Values=In-TDengine-${AWS::StackId}" \ "Name=instance-state-name,Values=running" \ --query 'Reservations[*].Instances[*].[PrivateDnsName]' \ --output text | awk 'END {print $NF}'` if [ $dnodedns ] then echo "found dnode in cluster, $dnodedns" >> /TDengine/log sed -i '17 a firstEp '$dnodedns':6030' /etc/taos/taos.cfg systemctl start taosd.service systemctl start taosadapter.service taos -h $dnodedns -P 6030 -s "CREATE DNODE '`hostname -f`:6030'" offlinednodeid=`taos -s "show dnodes" | awk -F '|' '$2 ~ /ip-/ && $5 ~ /offline/ {print $1}' | awk 'END {print $NF}'` if [ $offlinednodeid ] then echo "found offline dnode in cluster, $offlinednodeid, dropping." >> /TDengine/log taos -h $dnodedns -P 6030 -s "DROP DNODE $offlinednodeid force" mnodecounts=`taos -s "show mnodes" | awk -F '|' '$2 ~ /ip-/ && $4 ~ /ready/ {print $0}' | wc -l` if [ $mnodecounts -lt 3 ] then lastdnodeip=`taos -s "show dnodes" | awk -F '|' '$2 ~ /ip-/ {print $1}' | awk 'END {print $NF}'` echo "mnodes not enough, adding $lastdnodeip." >> /TDengine/log taos -s "CREATE MNODE ON DNODE $lastdnodeip" fi fi exit 0 else echo "did not find any dnode in cluster, will init cluster by electing firstep." >> /TDengine/log fi # elect firstep node aws ec2 describe-instances \ --region $region \ --filters "Name=tag-value,Values=TDengineEC2AutoScalingGroup" \ "Name=instance-state-name,Values=running,pending" \ --query 'Reservations[*].Instances[*].[PrivateIpAddress]' \ --output text > /TDengine/dnodeips maxip=0 while read dnodeip do _dnodeip=`echo $dnodeip | sed 's/\.//g'` if [ $_dnodeip -gt $maxip ] then maxip=$_dnodeip echo $maxip fi done < /TDengine/dnodeips echo $maxip >> /TDengine/log if [ `sed 's/\.//g' /TDengine/localip` -ne $maxip ] then echo "this is not firstep, exit." >> /TDengine/log exit 0 fi ############################################################################################################### # Will only be executed on firstep node # start TDengine process sed -i '17 a firstEp '`hostname -f`:6030'' /etc/taos/taos.cfg systemctl start taosd.service systemctl start taosadapter.service firstepinstanceid=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq .instanceId -r` # generate key for loginless ssh from firstep to other dnodes ssh-keygen -t rsa -f /TDengine/tdengine_key -q -N "" chown -R ec2-user:ec2-user /TDengine cat /TDengine/dnodes | while read instance do if [ $firstepinstanceid = $instance ] then aws ec2 create-tags --resources $instance --tags 'Key=TDengine-tag,Value=In-TDengine-${AWS::StackId}' echo 'skip loop for firstep.' >> /TDengine/log continue fi tdenginepk=`cat /TDengine/tdengine_key.pub` aws ssm send-command \ --document-name "AWS-RunShellScript" \ --instance-ids $instance \ --region $region \ --max-errors "5" \ --parameters "commands=['echo $tdenginepk >> /home/ec2-user/.ssh/authorized_keys']" sleep 10 ip=`aws ec2 describe-instances \ --region $region --filters \ "Name=instance-id,Values=$instance" \ --query 'Reservations[*].Instances[*].[PrivateIpAddress]' \ --output text` su ec2-user <> /home/ec2-user/.ssh/authorized_keys" < /dev/null while true do if ssh -i /TDengine/tdengine_key -o 'StrictHostKeyChecking=no' $ip stat /etc/taos/taos.cfg \> /dev/null 2\>\&1 then echo "taos config existed in $ip." >> /TDengine/log ssh -i /TDengine/tdengine_key -o 'StrictHostKeyChecking=no' $ip "sudo sed -i '17 a firstEp `hostname -f`:6030' /etc/taos/taos.cfg; sudo systemctl start taosd.service; sudo systemctl start taosadapter.service" aws ec2 create-tags --resources $instance --tags 'Key=TDengine-tag,Value=In-TDengine-${AWS::StackId}' break else echo 'taos config does not exist, waiting.' >> /TDengine/log sleep 5 fi done EOSU privatednsname=`aws ec2 describe-instances \ --region $region --filters \ "Name=instance-id,Values=$instance" \ --query 'Reservations[*].Instances[*].[PrivateDnsName]' \ --output text` taos -s "CREATE DNODE '$privatednsname:6030'" mnodecounts=`taos -s "show mnodes" | awk -F '|' '$2 ~ /ip-/ && $4 ~ /ready/ {print $0}' | wc -l` if [ $mnodecounts -lt 3 ] then lastdnodeip=`taos -s "show dnodes" | awk -F '|' '$2 ~ /ip-/ {print $1}' | awk 'END {print $NF}'` taos -s "CREATE MNODE ON DNODE $lastdnodeip" fi done TDengineNLB: Type: AWS::ElasticLoadBalancingV2::LoadBalancer Properties: Name: !Sub - ${StackID}-TDengine-NLB - StackID: !Select [4, !Split [-, !Select [2, !Split [ /, !Ref AWS::StackId ]]]] IpAddressType: ipv4 Scheme: internal Subnets: - !Ref PrivateSubnet1ID - !Ref PrivateSubnet2ID Type: network TDengineListener: Type: AWS::ElasticLoadBalancingV2::Listener Properties: DefaultActions: - Type: forward TargetGroupArn: Ref: TDengineTargetGroup LoadBalancerArn: Ref: TDengineNLB Port: 6041 Protocol: TCP TDengineTargetGroup: Type: AWS::ElasticLoadBalancingV2::TargetGroup Properties: HealthCheckIntervalSeconds: 10 HealthCheckPort: traffic-port HealthCheckProtocol: TCP HealthyThresholdCount: 3 UnhealthyThresholdCount: 3 Port: 6041 Protocol: TCP TargetType: instance VpcId: !Ref VPCID TargetGroupAttributes: - Key: deregistration_delay.timeout_seconds Value: 60 TDengineEC2AutoScalingGroup: Type: AWS::AutoScaling::AutoScalingGroup Properties: MinSize: !Ref NumberOfServers MaxSize: !Ref NumberOfServers DesiredCapacity: !Ref NumberOfServers LaunchTemplate: LaunchTemplateId: !Ref TDengineEC2LaunchTemplate Version: !GetAtt TDengineEC2LaunchTemplate.LatestVersionNumber VPCZoneIdentifier: - !Ref PrivateSubnet1ID - !Ref PrivateSubnet2ID TargetGroupARNs: - !Ref TDengineTargetGroup