AWSTemplateFormatVersion: '2010-09-09' Description: CloudFormation template to setup the GA release of EMR with Apache Ranger Parameters: AvailabilityZones: Description: The list of Availability Zones to use for the subnets in the VPC. Three Availability Zones are used for this deployment, and the logical order of your selections is preserved. Default: us-east-1a,us-east-1b,us-east-1c Type: List PrivateSubnet1CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.0.0/19 Description: The CIDR block for private subnet 1 located in Availability Zone 1 Type: String PrivateSubnet2CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.32.0/19 Description: The CIDR block for private subnet 2 located in Availability Zone 2 Type: String PrivateSubnet3CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.64.0/19 Description: The CIDR block for private subnet 3 located in Availability Zone 3 Type: String PublicSubnet1CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.128.0/20 Description: CIDR block for the public (DMZ) subnet 1 located in Availability Zone 1 Type: String PublicSubnet2CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.144.0/20 Description: The CIDR block for the public (DMZ) subnet 2 located in Availability Zone 2 Type: String PublicSubnet3CIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.160.0/20 Description: The CIDR block for the public (DMZ) subnet 3 located in Availability Zone 3 Type: String QSS3BucketName: AllowedPattern: ^[0-9a-zA-Z]+([0-9a-zA-Z-]*[0-9a-zA-Z])*$ ConstraintDescription: Quick Start bucket name can include numbers, lowercase letters, uppercase letters, and hyphens (-). It cannot start or end with a hyphen (-). Default: aws-quickstart Description: S3 bucket name for the Quick Start assets. This string can include numbers, lowercase letters, uppercase letters, and hyphens (-). It cannot start or end with a hyphen (-). Type: String AllowedValues: ['aws-quickstart'] QSS3KeyPrefix: AllowedPattern: ^[0-9a-zA-Z-/.]*$ ConstraintDescription: Quick Start key prefix can include numbers, lowercase letters, uppercase letters, hyphens (-), dots(.) and forward slash (/). Default: quickstart-amazon-eks/ Description: S3 key prefix for the Quick Start assets. Quick Start key prefix can include numbers, lowercase letters, uppercase letters, hyphens (-), dots(.) and forward slash (/). Type: String AllowedValues: ['quickstart-amazon-eks/'] CIDRAccessToADAndBastion: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/([0-9]|[1-2][0-9]|3[0-2]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/x Description: The CIDR IP range that is permitted to access the instances. We recommend that you set this value to a trusted IP range. Type: String Default: 0.0.0.0/0 VPCCIDR: AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(1[6-9]|2[0-8]))$ ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/16-28 Default: 10.0.0.0/16 Description: The CIDR block for the VPC Type: String NumberOfAZs: Type: String AllowedValues: ["2", "3"] Default: "3" Description: Number of Availability Zones to use in the VPC. This must match your selections in the list of Availability Zones parameter. S3Bucket: Description: S3Bucket where artifacts are stored Type: String Default: aws-bigdata-blog AllowedValues: ["aws-bigdata-blog"] S3Key: Description: S3Key of the Lambda code Type: String Default: artifacts/aws-blog-emr-ranger AllowedValues: ["artifacts/aws-blog-emr-ranger"] ProjectVersion: Default: 3.0 Description: Project version Type: String AllowedValues: - 3.0 - beta DomainAdminPassword: AllowedPattern: (?=^.{6,255}$)((?=.*\d)(?=.*[A-Z])(?=.*[a-z])|(?=.*\d)(?=.*[^A-Za-z0-9])(?=.*[a-z])|(?=.*[^A-Za-z0-9])(?=.*[A-Z])(?=.*[a-z])|(?=.*\d)(?=.*[A-Z])(?=.*[^A-Za-z0-9]))^.* Description: 'Password for the domain admin user. Must be at least 8 characters containing letters, numbers and symbols - Eg: CheckSum123' MaxLength: '32' MinLength: '8' NoEcho: 'true' Type: String LDAPBindPassword: Description: Ldap Bind user password that was used in the previous cloud formation template Type: String NoEcho: 'true' LDAPSearchBase: Description: Ldap search base Type: String Default: CN=Users,DC=awsemr,DC=com AllowedValues: - CN=Users,DC=awsemr,DC=com CrossRealmTrustPrincipalPassword: Description: 'Password that you want to use for your cross-realm trust - Eg: CheckSum123' MaxLength: '32' MinLength: '5' NoEcho: 'true' Type: String DefaultADUserPassword: Description: Default Password for all users created in the AD server Type: String NoEcho: true InstallEMRRangerinPublicSubnet: Description: Flag to indicate if Ranger and EMR servers should be run in the Public subnet Default: false Type: String AllowedValues: [true, false] AttachAdditionalSourcePrefixToSG: Description: Attaches additional sources to EMR Master SG Default: false Type: String AllowedValues: [true, false] CIDRAccessToPrivateSubnetResources: Description: IP address range (in CIDR notation) of the client that will be allowed to connect to the cluster using SSH e.g., 203.0.113.5/32 AllowedPattern: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2}) Type: String MinLength: '9' MaxLength: '18' Default: 10.0.0.0/16 ConstraintDescription: must be a valid CIDR range of the form x.x.x.x/x AdditionalSourcePrefixToSG: Description: Sources that are allowd to access the Ranger Instance. Should be a source prefix e.g., pl-xxx Type: String LDAPUserSearchAttribute: Description: Ldap user search attribute Type: String Default: sAMAccountName AllowedValues: - sAMAccountName LDAPUserObjectClass: Description: Ldap user object class Type: String Default: person AllowedValues: - person LDAPGroupSearchBase: Description: Ldap group search Type: String Default: dc=awsemr,dc=com AllowedValues: - dc=awsemr,dc=com LDAPGroupObjectClass: Description: Ldap group object class Type: String Default: group AllowedValues: - group LDAPMemberAttribute: Description: Ldap member attribute Type: String Default: member AllowedValues: - member MasterInstanceType: Description: Instance type for the cluster nodes Type: String Default: r5.2xlarge AllowedValues: - m4.large - m4.xlarge - m4.2xlarge - m4.4xlarge - r5.2xlarge CoreInstanceType: Description: Instance type for the cluster nodes Type: String Default: r5.2xlarge AllowedValues: - m4.large - m4.xlarge - m4.2xlarge - m4.4xlarge - r5.2xlarge RangerInstanceType: Default: m4.xlarge Description: Instance Type of the core node Type: String EMRClusterName: Default: EMR-EMRSecurityWithRangerV1 Description: Cluster name for the EMR Type: String EMRLogDir: Description: 'Log Dir for the EMR cluster. Eg: s3://xxx' Type: String AllowedPattern: ^s3://.* DomainDNSName: AllowedPattern: '[a-zA-Z0-9\-]+\..+' Default: awsemr.com Description: The Active Directory domain that you want to establish the cross-realm trust with e.g., awsemr.com MaxLength: '25' MinLength: '3' Type: String KdcAdminPassword: Description: Password of your KDC Password MaxLength: '32' MinLength: '5' NoEcho: 'true' Type: String DomainAdminUser: AllowedPattern: '[a-zA-Z0-9]*' Default: awsadmin Description: User name of an AD account with computer join privileges MaxLength: '25' MinLength: '5' Type: String # KerberosRealm: # AllowedPattern: '[a-zA-Z0-9\-]+\..+' # Default: COMPUTE.INTERNAL # Description: Cluster's Kerberos realm name. This is usually the VPC's domain name # in uppercase letters e.g. COMPUTE.INTERNAL # MaxLength: '25' # MinLength: '3' # Type: String KerberosADdomain: AllowedPattern: '[a-zA-Z0-9\-]+\..+' Default: AWSEMR.COM Description: The AD domain that you want to trust. This is the same as the AD domain name, but in uppercase letters e.g., AWSEMR.COM MaxLength: '25' MinLength: '3' Type: String MasterInstanceCount: Default: '1' Description: Number of master instances Type: Number CoreInstanceCount: Default: 3 Description: Number of instances (core nodes) for the cluster e.g., 2 Type: Number LDAPBindUserName: Description: BindUser Type: String Default: binduser AllowedValues: - binduser RangerVersion: Description: 'RangerVersion. Expected values are : 0.6,0.7,1.0,2.0. NOTE: Use Ranger 0.6 if EMR version is 5.0' AllowedValues: - '2.0' Type: String Default: '2.0' RangerHttpProtocol: Description: Ranger HTTP protocol Type: String Default: 'https' AllowedValues: - 'https' AppsEMR: Description: 'Comma separated list of applications to install on the cluster e.g., ' Type: String Default: Hadoop, Spark, Hive, Livy, Hue AllowedValues: ["Hadoop, Spark, Hive, Livy, Hue"] EnableKerberos: Description: Enable Kerberos on the Cluster. This is Required for Ranger EMR support Default: true Type: String AllowedValues: [true] emrReleaseLabel: Type: String Default: emr-5.32.0 AllowedValues: - emr-5.32.0 - emr-6.3.0 - emr-6.4.0 KeyPairName: Description: Name of an existing EC2 key pair to access the Amazon EMR cluster Type: AWS::EC2::KeyPair::KeyName DBUserName: Description: ' The RDS MySQL database username' Type: String Default: root AllowedValues: - root DBRootPassword: Description: ' The RDS MySQL database password' MaxLength: '41' MinLength: '8' NoEcho: 'true' Type: String RangerAdminPassword: Description: Password of the Ranger Admin server NoEcho: 'true' Default: admin Type: String CreateNonEMRResources: Description: Only Install EMR cluster, Other resources are ignored as they might have been created with previous stacks Default: false Type: String AllowedValues: [true, false] RangerAgentKeySecretName: Description: Name of Ranger Agent Cert Secrets mgr resource Type: String Default: emr/rangerGAagentkey AllowedValues: ["emr/rangerGAagentkey"] RangerServerCertSecretName: Description: Name of Ranger Server Cert Secrets mgr resource Type: String Default: emr/rangerGAservercert AllowedValues: ["emr/rangerGAservercert"] Metadata: AWS::CloudFormation::Interface: ParameterGroups: - Label: default: Minimum required fields (others can be kept as default) Parameters: - KeyPairName - DomainAdminPassword - CrossRealmTrustPrincipalPassword - LDAPBindPassword - DefaultADUserPassword - CIDRAccessToADAndBastion - DBRootPassword - EMRLogDir - KdcAdminPassword Mappings: DefaultConfiguration: MachineConfiguration: BastionInstanceType: t2.small NetworkConfiguration: VPCCIDR: 10.0.0.0/16 PublicSubnet1CIDR: 10.0.1.0/24 PrivateSubnet1CIDR: 10.0.2.0/24 PublicSubnet2CIDR: 10.0.3.0/24 PrivateSubnet2CIDR: 10.0.4.0/24 PublicSubnet3CIDR: 10.0.5.0/24 PrivateSubnet3CIDR: 10.0.6.0/24 Conditions: USEastRegion: !Equals [ !Ref 'AWS::Region', "us-east-1" ] 3AZDeployment: !Equals [!Ref NumberOfAZs, "3"] 2AZDeployment: !Or - !Equals [!Ref NumberOfAZs, "2"] - !Equals [!Ref NumberOfAZs, "3"] Resources: STEP1: Type: AWS::CloudFormation::Stack Properties: TemplateURL: !Join ['', ['https://s3.amazonaws.com/', !Ref 'S3Bucket', '/', !Ref 'S3Key', '/', !Ref 'ProjectVersion', '/cloudformation/', 'step1_vpc-ec2-ad.template']] Parameters: S3Bucket: !Ref S3Bucket S3Key: !Ref S3Key ProjectVersion: !Ref ProjectVersion KeyPairName: !Ref 'KeyPairName' AvailabilityZones: !Join [ ',', !Ref 'AvailabilityZones' ] DomainAdminPassword: !Ref DomainAdminPassword LDAPBindPassword: !Ref LDAPBindPassword DefaultADUserPassword: !Ref DefaultADUserPassword CrossRealmTrustPrincipalPassword: !Ref CrossRealmTrustPrincipalPassword CIDRAccessToADAndBastion: !Ref CIDRAccessToADAndBastion STEP2: DependsOn: STEP1 Type: AWS::CloudFormation::Stack Properties: TemplateURL: !Join ['', ['https://s3.amazonaws.com/', !Ref 'S3Bucket', '/', !Ref 'S3Key', '/', !Ref 'ProjectVersion', '/cloudformation/', 'step2_ranger-rds-emr.template']] Parameters: S3Bucket: !Ref S3Bucket S3Key: !Ref S3Key ProjectVersion: !Ref ProjectVersion KeyPairName: !Ref 'KeyPairName' VPC: !GetAtt 'STEP1.Outputs.VPC' PrivateSubnet1AID: !GetAtt 'STEP1.Outputs.PrivateSubnet1AID' PrivateSubnet2AID: !GetAtt 'STEP1.Outputs.PrivateSubnet2AID' PublicSubnet1AID: !GetAtt 'STEP1.Outputs.PublicSubnet1AID' PublicSubnet2AID: !GetAtt 'STEP1.Outputs.PublicSubnet2AID' InstallEMRRangerinPublicSubnet: !Ref InstallEMRRangerinPublicSubnet DBUserName: !Ref 'DBUserName' DBRootPassword: !Ref 'DBRootPassword' DomainAdminUser: !Ref 'DomainAdminUser' DomainAdminPassword: !Ref 'DomainAdminPassword' DomainDNSName: !Ref 'DomainDNSName' CrossRealmTrustPrincipalPassword: !Ref 'CrossRealmTrustPrincipalPassword' ADDomainJoinPassword: !Ref 'DomainAdminPassword' KdcAdminPassword: !Ref 'KdcAdminPassword' LDAPHostPrivateIP: !GetAtt 'STEP1.Outputs.LDAPHostPrivateIP' LDAPBindUserName: !Ref LDAPBindUserName LDAPBindPassword: !Ref LDAPBindPassword LDAPGroupSearchBase: !Ref LDAPGroupSearchBase LDAPMemberAttribute: !Ref LDAPMemberAttribute LDAPSearchBase: !Ref LDAPSearchBase LDAPUserObjectClass: !Ref LDAPUserObjectClass LDAPUserSearchAttribute: !Ref LDAPUserSearchAttribute KerberosRealm: !If [ USEastRegion, 'EC2.INTERNAL', 'COMPUTE.INTERNAL' ] KerberosADdomain: !Ref KerberosADdomain AppsEMR: !Ref AppsEMR EMRClusterName: !Ref EMRClusterName EMRLogDir: !Ref EMRLogDir emrReleaseLabel: !Ref emrReleaseLabel MasterInstanceCount: !Ref MasterInstanceCount CoreInstanceCount: !Ref CoreInstanceCount RangerVersion: !Ref RangerVersion RangerHttpProtocol: !Ref RangerHttpProtocol RangerAdminPassword: !Ref RangerAdminPassword CIDRAccessToPrivateSubnetResources: !Ref CIDRAccessToPrivateSubnetResources CreateNonEMRResources: !Ref CreateNonEMRResources AttachAdditionalSourcePrefixToSG: !Ref AttachAdditionalSourcePrefixToSG AdditionalSourcePrefixToSG: !Ref AdditionalSourcePrefixToSG RangerAgentKeySecretName: !Ref RangerAgentKeySecretName RangerServerCertSecretName: !Ref RangerServerCertSecretName Outputs: VPC: Value: !GetAtt 'STEP1.Outputs.VPC' Description: VPC ID LDAPHostPrivateIP: Value: !GetAtt [STEP1, Outputs.LDAPHostPrivateIP] Description: LDAP Host Private IP address RDSInstanceAddress: Description: IP Address of the RDS instance Value: !GetAtt 'STEP2.Outputs.RDSInstanceAddress' RangerServerIP: Value: !GetAtt 'STEP2.Outputs.RangerServerIP' EMRClusterURL: Description: Cluster EMR cluster MasterNode Value: !GetAtt 'STEP2.Outputs.EMRClusterURL'