AWSTemplateFormatVersion: '2010-09-09' Description: 'Amazon Redshift EC2 access' Parameters: RedshiftClusterEndpoint: Description: Redshift cluster endpoint including port number and database name Type: String Default: redshift-cluster.xxxxxx.region.redshift.amazonaws.com:5439/dev DbUsername: Description: Redshift database user name which has access to run SQL Script. Type: String AllowedPattern: "([a-z])([a-z]|[0-9])*" Default: 'awsuser' SQLScriptS3Path: Description: S3 location uri for seim-colon(;) delimeted SQL script Type: String Default: s3://your-s3-bucket/prefix/your_user_queries.sql VPC: Description: "vpc_id where redshift cluster is provisioned" Type: AWS::EC2::VPC::Id SubnetId: Description: Subnet ID where source redshift clusters will be created. Please make sure the private subnet is attached to NAT gateway. Type: AWS::EC2::Subnet::Id CIDR: Description: IP range (CIDR notation) for your VPC and VPN to access EC2 and Redshift Type: String Default: 10.0.0.0/8 MinLength: '9' MaxLength: '18' AllowedPattern: "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})" ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x. KeyPairName: Description: Existing key pair name to be used to login to the EC2 instance Type: AWS::EC2::KeyPair::KeyName EC2InstanceType: Description: Enter EC2 instance type Type: String Default: t3.nano AllowedValues: - t3.nano - m5.large - m5.xlarge - m5.2xlarge - m5.4xlarge - m5.8xlarge - m5.12xlarge - m5.16xlarge - m5.24xlarge EC2InstanceVolumeGiB: Description: Enter size of EC2 instance volume in GiBs. We recommend keeping it 30 GiB or more Type: Number Default: 10 MinValue: 1 EC2InstanceAMI: Description: AMI for the Amazon Linux 2 based EC2 instance Type: "AWS::SSM::Parameter::Value<String>" Default: "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2" IsPublicSubnet: Description: Is the SubnetId mentioned above a public subnet? Type: String Default: 'No' AllowedValues: - 'Yes' - 'No' Metadata: AWS::CloudFormation::Interface: ParameterGroups: - Label: default: Configurations from Extract CloudFormation Parameters: - RedshiftClusterEndpoint - DbUsername - SQLScriptS3Path - VPC - SubnetId - CIDR - KeyPairName - EC2InstanceType - EC2InstanceVolumeGiB - EC2InstanceAMI - IsPublicSubnet Conditions: IsSubnetPublic: Fn::Not: - Fn::Equals: - 'No' - Ref: IsPublicSubnet Resources: SecurityGroupEc2Redshift: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: 'EC2 and Redshift security group' SecurityGroupIngress: - CidrIp: !Ref CIDR Description : Allow inbound access for on prem users on redshift port for the subnet IpProtocol: tcp FromPort: !Sub - '${port}' - {port: !Select [1, !Split [":", !Select [0, !Split ["/", !Ref RedshiftClusterEndpoint]]]]} ToPort: !Sub - '${port}' - {port: !Select [1, !Split [":", !Select [0, !Split ["/", !Ref RedshiftClusterEndpoint]]]]} - CidrIp: !Ref CIDR Description : Allow ssh access IpProtocol: tcp FromPort: 22 ToPort: 22 VpcId: !Ref VPC SecurityGroupSelfReference: Type: AWS::EC2::SecurityGroupIngress Properties: Description: Self Referencing Rule FromPort: -1 IpProtocol: -1 GroupId: !GetAtt [SecurityGroupEc2Redshift, GroupId] SourceSecurityGroupId: !GetAtt [SecurityGroupEc2Redshift, GroupId] ToPort: -1 Ec2RedshiftIamPolicy: Type: 'AWS::IAM::ManagedPolicy' Properties: PolicyDocument: Version: '2012-10-17' Statement: - Sid: RedshiftWhatIfBucketAccess Effect: Allow Action: - s3:GetBucketLocation - s3:GetObject - s3:ListMultipartUploadParts - s3:ListBucket - s3:ListBucketMultipartUploads Resource: - !Sub - arn:aws:s3:::${RedshiftWhatIfConfigJsonObject} - {RedshiftWhatIfConfigJsonObject: !Select [1, !Split ["//", !Ref SQLScriptS3Path]]} - Sid: RedshiftWhatIfExternalBucketAccess Effect: Allow Action: - s3:GetBucketLocation - s3:GetObject - s3:ListMultipartUploadParts - s3:ListBucket - s3:ListBucketMultipartUploads Resource: - arn:aws:s3:::event-driven-app-with-lambda-redshift/* - Sid: RedshiftDataAPIAccess Effect: Allow Action: - redshift-data:ExecuteStatement - redshift-data:ListStatements - redshift-data:GetStatementResult - redshift-data:DescribeStatement Resource: - '*' - Sid: RedshiftClusterAccess Effect: Allow Action: - redshift:GetClusterCredentials Resource: - !Sub - arn:aws:redshift:${AWS::Region}:${AWS::AccountId}:cluster:${SourceRedshiftClusterIdentifier} - {SourceRedshiftClusterIdentifier: !Select [0, !Split [".", !Ref RedshiftClusterEndpoint]]} - !Sub - "arn:aws:redshift:${AWS::Region}:${AWS::AccountId}:dbname:${SourceRedshiftClusterIdentifier}/${RedshiftDatabaseName}" - {SourceRedshiftClusterIdentifier: !Select [0, !Split [".", !Ref RedshiftClusterEndpoint]],RedshiftDatabaseName: !Select [1, !Split ["/", !Ref RedshiftClusterEndpoint]]} - !Sub - "arn:aws:redshift:${AWS::Region}:${AWS::AccountId}:dbuser:${SourceRedshiftClusterIdentifier}/${DbUsername}" - {SourceRedshiftClusterIdentifier: !Select [0, !Split [".", !Ref RedshiftClusterEndpoint]]} InstanceProfileEC2Instance: Type: AWS::IAM::InstanceProfile Properties: Path: "/" Roles: - Ref: Ec2RedshiftIamRole Ec2RedshiftIamRole: Type: AWS::IAM::Role Properties : AssumeRolePolicyDocument: Version : 2012-10-17 Statement : - Effect : Allow Principal : Service : - ec2.amazonaws.com Action : - sts:AssumeRole Path : / ManagedPolicyArns: - !Ref Ec2RedshiftIamPolicy EC2InstanceToAccessRedshift: Type: AWS::EC2::Instance CreationPolicy: ResourceSignal: Timeout: PT45M Properties: KeyName: Ref: "KeyPairName" InstanceType: !Ref EC2InstanceType IamInstanceProfile: !Ref InstanceProfileEC2Instance Tags: - Key: Name Value: Fn::Join: - "-" - - Ref: AWS::StackName - EC2InstanceReplica BlockDeviceMappings: - DeviceName: "/dev/sda1" Ebs: DeleteOnTermination: true VolumeSize: !Ref EC2InstanceVolumeGiB VolumeType: gp2 ImageId: !Ref EC2InstanceAMI NetworkInterfaces: - DeleteOnTermination: true DeviceIndex: "0" SubnetId: !Ref SubnetId AssociatePublicIpAddress: !If [IsSubnetPublic, true, false] GroupSet: - Ref: SecurityGroupEc2Redshift UserData: Fn::Base64: Fn::Sub: - | #!/bin/bash -e yum update -y yum -y install python3 yum -y install python3-pip yum -y install aws-cfn-bootstrap yum -y install jq pip3 install sqlalchemy pip3 install psycopg2-binary pip3 install boto3 pip3 install pandas pip3 install matplotlib mkdir /root/.aws echo "[default]" > /root/.aws/config echo "region = ${AWS::Region}" >> /root/.aws/config echo "output = json" >> /root/.aws/config curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" unzip awscliv2.zip sudo ./aws/install mkdir /amazonutils cd /amazonutils echo "export PATH=$PATH:/amazonutils" >> ~/.bashrc aws s3 cp ${SQLScriptS3Path} . aws s3 cp s3://event-driven-app-with-lambda-redshift/whatif/redshift-data.py . aws s3 cp s3://event-driven-app-with-lambda-redshift/whatif/redshift-data.sh . aws s3 cp s3://event-driven-app-with-lambda-redshift/whatif/redshift-concurrency-test.py . sed -i "s#REDSHIFT_CLUSTER_ENDPOINT=\"\"#REDSHIFT_CLUSTER_ENDPOINT=\"${RedshiftClusterEndpoint}\"#g" redshift-data.py sed -i "s#REDSHIFT_USER_NAME=\"\"#REDSHIFT_USER_NAME=\"${DbUsername}\"#g" redshift-data.py sed -i "s#SQL_SCRIPT_S3_PATH=\"\"#SQL_SCRIPT_S3_PATH=\"${SQLScriptS3Path}\"#g" redshift-data.py sed -i "s#REDSHIFT_CLUSTER_ENDPOINT=\"\"#REDSHIFT_CLUSTER_ENDPOINT=\"${RedshiftClusterEndpoint}\"#g" redshift-concurrency-test.py sed -i "s#REDSHIFT_USER_NAME=\"\"#REDSHIFT_USER_NAME=\"${DbUsername}\"#g" redshift-concurrency-test.py sed -i "s#SQL_SCRIPT_S3_PATH=\"\"#SQL_SCRIPT_S3_PATH=\"${SQLScriptS3Path}\"#g" redshift-concurrency-test.py sed -i "s#REDSHIFT_CLUSTER_ENDPOINT=\"\"#REDSHIFT_CLUSTER_ENDPOINT=\"${RedshiftClusterEndpoint}\"#g" redshift-data.sh sed -i "s#REDSHIFT_USER_NAME=\"\"#REDSHIFT_USER_NAME=\"${DbUsername}\"#g" redshift-data.sh /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource EC2InstanceToAccessRedshift --region ${AWS::Region} - SQLScriptS3Path: !Ref SQLScriptS3Path