# Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

from awscli.customizations.emr.createdefaultroles import EMR_ROLE_NAME
from awscli.customizations.emr.createdefaultroles import EC2_ROLE_NAME

TERMINATE_CLUSTERS = (
    'Shuts down one or more clusters, each specified by cluster ID. '
    'Use this command only on clusters that do not have termination '
    'protection enabled. Clusters with termination protection enabled '
    'are not terminated. When a cluster is shut '
    'down, any step not yet completed is canceled and the '
    'Amazon EC2 instances in the cluster are terminated. '
    'Any log files not already saved are uploaded to '
    'Amazon S3 if a --log-uri was specified when the cluster was created. '
    'The maximum number of clusters allowed in the list is 10. '
    'The command is asynchronous. Depending on the '
    'configuration of the cluster, it may take from 5 to 20 minutes for the '
    'cluster to terminate completely and release allocated resources such as '
    'Amazon EC2 instances.')

CLUSTER_ID = ( '

A unique string that identifies a cluster. The ' 'create-cluster command returns this identifier. You can ' 'use the list-clusters command to get cluster IDs.

') HBASE_BACKUP_DIR = ( '

The Amazon S3 location of the HBase backup. Example: ' 's3://mybucket/mybackup, where mybucket is the ' 'specified Amazon S3 bucket and mybackup is the specified backup ' 'location. The path argument must begin with s3://, which ' 'refers to an Amazon S3 bucket.

') HBASE_BACKUP_VERSION = ( '

The backup version to restore from. If not specified, the latest backup ' 'in the specified location is used.

') # create-cluster options help text CREATE_CLUSTER_DESCRIPTION = ( 'Creates an Amazon EMR cluster with the specified configurations.') DESCRIBE_CLUSTER_DESCRIPTION = ( 'Provides cluster-level details including status, hardware ' 'and software configuration, VPC settings, bootstrap ' 'actions, instance groups and so on. ' 'Permissions needed for describe-cluster include ' 'elasticmapreduce:ListBootstrapActions, ' 'elasticmapreduce:ListInstanceFleets, ' 'elasticmapreduce:DescribeCluster, ' 'and elasticmapreduce:ListInstanceGroups.') CLUSTER_NAME = ( '

The name of the cluster. If not provided, the default is "Development Cluster".

') LOG_URI = ( '

Specifies the location in Amazon S3 to which log files ' 'are periodically written. If a value is not provided, ' 'log files are not written to Amazon S3 from the master node ' 'and are lost if the master node terminates.

') LOG_ENCRYPTION_KMS_KEY_ID = ( '

Specifies the KMS key ID used for log encryption. If a value is ' 'not provided, log files are encrypted by the default encryption method, ' 'AES-256. This attribute is only available with EMR version 5.30.0 and later, ' 'excluding EMR 6.0.0.

') SERVICE_ROLE = ( '

Specifies an IAM service role, which Amazon EMR requires to call other AWS services ' 'on your behalf during cluster operation. This parameter ' 'is usually specified when a customized service role is used. ' 'To specify the default service role, as well as the default instance ' 'profile, use the --use-default-roles parameter. ' 'If the role and instance profile do not already exist, use the ' 'aws emr create-default-roles command to create them.

') AUTOSCALING_ROLE = ( '

Specify --auto-scaling-role EMR_AutoScaling_DefaultRole' ' if an automatic scaling policy is specified for an instance group' ' using the --instance-groups parameter. This default' ' IAM role allows the automatic scaling feature' ' to launch and terminate Amazon EC2 instances during scaling operations.

') USE_DEFAULT_ROLES = ( '

Specifies that the cluster should use the default' ' service role (EMR_DefaultRole) and instance profile (EMR_EC2_DefaultRole)' ' for permissions to access other AWS services.

' '

Make sure that the role and instance profile exist first. To create them,' ' use the create-default-roles command.

') AMI_VERSION = ( '

Applies only to Amazon EMR release versions earlier than 4.0. Use' ' --release-label for 4.0 and later. Specifies' ' the version of Amazon Linux Amazon Machine Image (AMI)' ' to use when launching Amazon EC2 instances in the cluster.' ' For example, --ami-version 3.1.0.') RELEASE_LABEL = ( '

Specifies the Amazon EMR release version, which determines' ' the versions of application software that are installed on the cluster.' ' For example, --release-label emr-5.15.0 installs' ' the application versions and features available in that version.' ' For details about application versions and features available' ' in each release, see the Amazon EMR Release Guide:

' '

https://docs.aws.amazon.com/emr/latest/ReleaseGuide

' '

Use --release-label only for Amazon EMR release version 4.0' ' and later. Use --ami-version for earlier versions.' ' You cannot specify both a release label and AMI version.

') OS_RELEASE_LABEL = ( '

Specifies a particular Amazon Linux release for all nodes in a cluster' ' launch request. If a release is not specified, EMR uses the latest validated' ' Amazon Linux release for cluster launch.

') CONFIGURATIONS = ( '

Specifies a JSON file that contains configuration classifications,' ' which you can use to customize applications that Amazon EMR installs' ' when cluster instances launch. Applies only to Amazon EMR 4.0 and later.' ' The file referenced can either be stored locally (for example,' ' --configurations file://configurations.json)' ' or stored in Amazon S3 (for example, --configurations' ' https://s3.amazonaws.com/myBucket/configurations.json).' ' Each classification usually corresponds to the XML configuration' ' file for an application, such as yarn-site for YARN. For a list of' ' available configuration classifications and example JSON, see' ' the following topic in the Amazon EMR Release Guide:

' '

https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html

') INSTANCE_GROUPS = ( '

Specifies the number and type of Amazon EC2 instances' ' to create for each node type in a cluster, using uniform instance groups.' ' You can specify either --instance-groups or' ' --instance-fleets but not both.' ' For more information, see the following topic in the EMR Management Guide:

' '

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-instance-group-configuration.html

' '

You can specify arguments individually using multiple' ' InstanceGroupType argument blocks, one for the MASTER' ' instance group, one for a CORE instance group,' ' and, optionally, multiple TASK instance groups.

' '

If you specify inline JSON structures, enclose the entire' ' InstanceGroupType argument block in single quotation marks.' '

Each InstanceGroupType block takes the following inline arguments.' ' Optional arguments are shown in [square brackets].

' '
  • [Name] - An optional friendly name for the instance group.
  • ' '
  • InstanceGroupType - MASTER, CORE, or TASK.
  • ' '
  • InstanceType - The type of EC2 instance, for' ' example m4.large,' ' to use for all nodes in the instance group.
  • ' '
  • InstanceCount - The number of EC2 instances to provision in the instance group.
  • ' '
  • [BidPrice] - If specified, indicates that the instance group uses Spot Instances.' ' This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice' ' to set the amount equal to the On-Demand price, or specify an amount in USD.
  • ' '
  • [EbsConfiguration] - Specifies additional Amazon EBS storage volumes attached' ' to EC2 instances using an inline JSON structure.
  • ' '
  • [AutoScalingPolicy] - Specifies an automatic scaling policy for the' ' instance group using an inline JSON structure.
  • ') INSTANCE_FLEETS = ( '

    Applies only to Amazon EMR release version 5.0 and later. Specifies' ' the number and type of Amazon EC2 instances to create' ' for each node type in a cluster, using instance fleets.' ' You can specify either --instance-fleets or' ' --instance-groups but not both.' ' For more information and examples, see the following topic in the Amazon EMR Management Guide:

    ' '

    https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-instance-fleet.html

    ' '

    You can specify arguments individually using multiple' ' InstanceFleetType argument blocks, one for the MASTER' ' instance fleet, one for a CORE instance fleet,' ' and an optional TASK instance fleet.

    ' '

    The following arguments can be specified for each instance fleet. Optional arguments are shown in [square brackets].

    ' '
  • [Name] - An optional friendly name for the instance fleet.
  • ' '
  • InstanceFleetType - MASTER, CORE, or TASK.
  • ' '
  • TargetOnDemandCapacity - The target capacity of On-Demand units' ' for the instance fleet, which determines how many On-Demand Instances to provision.' ' The WeightedCapacity specified for an instance type within' ' InstanceTypeConfigs counts toward this total when an instance type' ' with the On-Demand purchasing option launches.
  • ' '
  • TargetSpotCapacity - The target capacity of Spot units' ' for the instance fleet, which determines how many Spot Instances to provision.' ' The WeightedCapacity specified for an instance type within' ' InstanceTypeConfigs counts toward this total when an instance' ' type with the Spot purchasing option launches.
  • ' '
  • [LaunchSpecifications] - When TargetSpotCapacity is specified,' ' specifies the block duration and timeout action for Spot Instances.' '
  • InstanceTypeConfigs - Specify up to five EC2 instance types to' ' use in the instance fleet, including details such as Spot price and Amazon EBS configuration.' ' When you use an On-Demand or Spot Instance allocation strategy,' ' you can specify up to 30 instance types per instance fleet.
  • ') INSTANCE_TYPE = ( '

    Shortcut parameter as an alternative to --instance-groups.' ' Specifies the type of Amazon EC2 instance to use in a cluster.' ' If used without the --instance-count parameter,' ' the cluster consists of a single master node running on the EC2 instance type' ' specified. When used together with --instance-count,' ' one instance is used for the master node, and the remainder' ' are used for the core node type.

    ') INSTANCE_COUNT = ( '

    Shortcut parameter as an alternative to --instance-groups' ' when used together with --instance-type. Specifies the' ' number of Amazon EC2 instances to create for a cluster.' ' One instance is used for the master node, and the remainder' ' are used for the core node type.

    ') ADDITIONAL_INFO = ( '

    Specifies additional information during cluster creation. To set development mode when starting your EMR cluster,' ' set this parameter to {"clusterType":"development"}.

    ') EC2_ATTRIBUTES = ( '

    Configures cluster and Amazon EC2 instance configurations. Accepts' ' the following arguments:

    ' '
  • KeyName - Specifies the name of the AWS EC2 key pair that will be used for' ' SSH connections to the master node and other instances on the cluster.
  • ' '
  • AvailabilityZone - Applies to clusters that use the uniform instance group configuration.' ' Specifies the availability zone in which to launch the cluster.' ' For example, us-west-1b. AvailabilityZone is used for uniform instance groups,' ' while AvailabilityZones (plural) is used for instance fleets.
  • ' '
  • AvailabilityZones - Applies to clusters that use the instance fleet configuration.' ' When multiple Availability Zones are specified, Amazon EMR evaluates them and launches instances' ' in the optimal Availability Zone. AvailabilityZone is used for uniform instance groups,' ' while AvailabilityZones (plural) is used for instance fleets.
  • ' '
  • SubnetId - Applies to clusters that use the uniform instance group configuration.' ' Specify the VPC subnet in which to create the cluster. SubnetId is used for uniform instance groups,' ' while SubnetIds (plural) is used for instance fleets.
  • ' '
  • SubnetIds - Applies to clusters that use the instance fleet configuration.' ' When multiple EC2 subnet IDs are specified, Amazon EMR evaluates them and launches instances in the optimal subnet.' ' SubnetId is used for uniform instance groups,' ' while SubnetIds (plural) is used for instance fleets.
  • ' '
  • InstanceProfile - An IAM role that allows EC2 instances to' ' access other AWS services, such as Amazon S3, that' ' are required for operations.
  • ' '
  • EmrManagedMasterSecurityGroup - The security group ID of the Amazon EC2' ' security group for the master node.
  • ' '
  • EmrManagedSlaveSecurityGroup - The security group ID of the Amazon EC2' ' security group for the slave nodes.
  • ' '
  • ServiceAccessSecurityGroup - The security group ID of the Amazon EC2 ' 'security group for Amazon EMR access to clusters in VPC private subnets.
  • ' '
  • AdditionalMasterSecurityGroups - A list of additional Amazon EC2' ' security group IDs for the master node.
  • ' '
  • AdditionalSlaveSecurityGroups - A list of additional Amazon EC2' ' security group IDs for the slave nodes.
  • ') AUTO_TERMINATE = ( '

    Specifies whether the cluster should terminate after' ' completing all the steps. Auto termination is off by default.

    ') TERMINATION_PROTECTED = ( '

    Specifies whether to lock the cluster to prevent the' ' Amazon EC2 instances from being terminated by API call,' ' user intervention, or an error.

    ') SCALE_DOWN_BEHAVIOR = ( '

    Specifies the way that individual Amazon EC2 instances terminate' ' when an automatic scale-in activity occurs or an instance group is resized.

    ' '

    Accepted values:

    ' '
  • TERMINATE_AT_TASK_COMPLETION - Specifies that Amazon EMR' ' blacklists and drains tasks from nodes before terminating the instance.
  • ' '
  • TERMINATE_AT_INSTANCE_HOUR - Specifies that Amazon EMR' ' terminate EC2 instances at the instance-hour boundary, regardless of when' ' the request to terminate was submitted.
  • ' ) VISIBILITY = ( '

    Specifies whether the cluster is visible to all IAM users' ' of the AWS account associated with the cluster. If a user' ' has the proper policy permissions set, they can also manage the cluster.

    ' '

    Visibility is on by default. The --no-visible-to-all-users option' ' is no longer supported. To restrict cluster visibility, use an IAM policy.

    ') DEBUGGING = ( '

    Specifies that the debugging tool is enabled for the cluster,' ' which allows you to browse log files using the Amazon EMR console.' ' Turning debugging on requires that you specify --log-uri' ' because log files must be stored in Amazon S3 so that' ' Amazon EMR can index them for viewing in the console.' ' Effective January 23, 2023, Amazon EMR will discontinue the debugging tool for all versions.

    ') TAGS = ( '

    A list of tags to associate with a cluster, which apply to' ' each Amazon EC2 instance in the cluster. Tags are key-value pairs that' ' consist of a required key string' ' with a maximum of 128 characters, and an optional value string' ' with a maximum of 256 characters.

    ' '

    You can specify tags in key=value format or you can add a' ' tag without a value using only the key name, for example key.' ' Use a space to separate multiple tags.

    ') BOOTSTRAP_ACTIONS = ( '

    Specifies a list of bootstrap actions to run on each EC2 instance when' ' a cluster is created. Bootstrap actions run on each instance' ' immediately after Amazon EMR provisions the EC2 instance and' ' before Amazon EMR installs specified applications.

    ' '

    You can specify a bootstrap action as an inline JSON structure' ' enclosed in single quotation marks, or you can use a shorthand' ' syntax, specifying multiple bootstrap actions, each separated' ' by a space. When using the shorthand syntax, each bootstrap' ' action takes the following parameters, separated by' ' commas with no trailing space. Optional parameters' ' are shown in [square brackets].

    ' '
  • Path - The path and file name of the script' ' to run, which must be accessible to each instance in the cluster.' ' For example, Path=s3://mybucket/myscript.sh.
  • ' '
  • [Name] - A friendly name to help you identify' ' the bootstrap action. For example, Name=BootstrapAction1.
  • ' '
  • [Args] - A comma-separated list of arguments' ' to pass to the bootstrap action script. Arguments can be' ' either a list of values (Args=arg1,arg2,arg3)' ' or a list of key-value pairs, as well as optional values,' ' enclosed in square brackets (Args=[arg1,arg2=arg2value,arg3])
  • .') APPLICATIONS = ( '

    Specifies the applications to install on the cluster.' ' Available applications and their respective versions vary' ' by Amazon EMR release. For more information, see the' ' Amazon EMR Release Guide:

    ' '

    https://docs.aws.amazon.com/emr/latest/ReleaseGuide/

    ' '

    When using versions of Amazon EMR earlier than 4.0,' ' some applications take optional arguments for configuration.' ' Arguments should either be a comma-separated list of values' ' (Args=arg1,arg2,arg3) or a bracket-enclosed list of values' ' and key-value pairs (Args=[arg1,arg2=arg3,arg4]).

    ') EMR_FS = ( '

    Specifies EMRFS configuration options, such as consistent view' ' and Amazon S3 encryption parameters.

    ' '

    When you use Amazon EMR release version 4.8.0 or later, we recommend' ' that you use the --configurations option together' ' with the emrfs-site configuration classification' ' to configure EMRFS, and use security configurations' ' to configure encryption for EMRFS data in Amazon S3 instead.' ' For more information, see the following topic in the Amazon EMR Management Guide:

    ' '

    https://docs.aws.amazon.com/emr/latest/ManagementGuide/emrfs-configure-consistent-view.html

    ') RESTORE_FROM_HBASE = ( '

    Applies only when using Amazon EMR release versions earlier than 4.0.' ' Launches a new HBase cluster and populates it with' ' data from a previous backup of an HBase cluster. HBase' ' must be installed using the --applications option.

    ') STEPS = ( '

    Specifies a list of steps to be executed by the cluster. Steps run' ' only on the master node after applications are installed' ' and are used to submit work to a cluster. A step can be' ' specified using the shorthand syntax, by referencing a JSON file' ' or by specifying an inline JSON structure. Args supplied with steps' ' should be a comma-separated list of values (Args=arg1,arg2,arg3) or' ' a bracket-enclosed list of values and key-value' ' pairs (Args=[arg1,arg2=value,arg4]).

    ') INSTALL_APPLICATIONS = ( '

    The applications to be installed.' ' Takes the following parameters: ' 'Name and Args.

    ') EBS_ROOT_VOLUME_SIZE = ( '

    This option is available only with Amazon EMR version 4.x and later. Specifies the size,' ' in GiB, of the EBS root device volume of the Amazon Linux AMI' ' that is used for each EC2 instance in the cluster.

    ') SECURITY_CONFIG = ( '

    Specifies the name of a security configuration to use for the cluster.' ' A security configuration defines data encryption settings and' ' other security options. For more information, see' ' the following topic in the Amazon EMR Management Guide:

    ' '

    https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-encryption-enable-security-configuration.html

    ' '

    Use list-security-configurations to get a list of available' ' security configurations in the active account.

    ') CUSTOM_AMI_ID = ( '

    Applies only to Amazon EMR release version 5.7.0 and later.' ' Specifies the AMI ID of a custom AMI to use' ' when Amazon EMR provisions EC2 instances. A custom' ' AMI can be used to encrypt the Amazon EBS root volume. It' ' can also be used instead of bootstrap actions to customize' ' cluster node configurations. For more information, see' ' the following topic in the Amazon EMR Management Guide:

    ' '

    https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html

    ') REPO_UPGRADE_ON_BOOT = ( '

    Applies only when a --custom-ami-id is' ' specified. On first boot, by default, Amazon Linux AMIs' ' connect to package repositories to install security updates' ' before other services start. You can set this parameter' ' using --repo-upgrade-on-boot NONE to' ' disable these updates. CAUTION: This creates additional' ' security risks.

    ') KERBEROS_ATTRIBUTES = ( '

    Specifies required cluster attributes for Kerberos when Kerberos authentication' ' is enabled in the specified --security-configuration.' ' Takes the following arguments:

    ' '
  • Realm - Specifies the name of the Kerberos' ' realm to which all nodes in a cluster belong. For example,' ' Realm=EC2.INTERNAL.
  • ' '
  • KdcAdminPassword - Specifies the password used within the cluster' ' for the kadmin service, which maintains Kerberos principals, password' ' policies, and keytabs for the cluster.
  • ' '
  • CrossRealmTrustPrincipalPassword - Required when establishing a cross-realm trust' ' with a KDC in a different realm. This is the cross-realm principal password,' ' which must be identical across realms.
  • ' '
  • ADDomainJoinUser - Required when establishing trust with an Active Directory' ' domain. This is the user logon name of an AD account with sufficient privileges to join resources to the domain.
  • ' '
  • ADDomainJoinPassword - The AD password for ADDomainJoinUser.
  • ') # end create-cluster options help descriptions LIST_CLUSTERS_CLUSTER_STATES = ( '

    Specifies that only clusters in the states specified are' ' listed. Alternatively, you can use the shorthand' ' form for single states or a group of states.

    ' '

    Takes the following state values:

    ' '
  • STARTING
  • ' '
  • BOOTSTRAPPING
  • ' '
  • RUNNING
  • ' '
  • WAITING
  • ' '
  • TERMINATING
  • ' '
  • TERMINATED
  • ' '
  • TERMINATED_WITH_ERRORS
  • ') LIST_CLUSTERS_STATE_FILTERS = ( '

    Shortcut options for --cluster-states. The' ' following shortcut options can be specified:

    ' '
  • --active - list only clusters that' ' are STARTING, BOOTSTRAPPING,' ' RUNNING, WAITING, or TERMINATING.
  • ' '
  • --terminated - list only clusters that are TERMINATED.
  • ' '
  • --failed - list only clusters that are TERMINATED_WITH_ERRORS.
  • ') LIST_CLUSTERS_CREATED_AFTER = ( '

    List only those clusters created after the date and time' ' specified in the format yyyy-mm-ddThh:mm:ss. For example,' ' --created-after 2017-07-04T00:01:30.

    ') LIST_CLUSTERS_CREATED_BEFORE = ( '

    List only those clusters created before the date and time' ' specified in the format yyyy-mm-ddThh:mm:ss. For example,' ' --created-before 2017-07-04T00:01:30.

    ') EMR_MANAGED_MASTER_SECURITY_GROUP = ( '

    The identifier of the Amazon EC2 security group ' 'for the master node.

    ') EMR_MANAGED_SLAVE_SECURITY_GROUP = ( '

    The identifier of the Amazon EC2 security group ' 'for the slave nodes.

    ') SERVICE_ACCESS_SECURITY_GROUP = ( '

    The identifier of the Amazon EC2 security group ' 'for Amazon EMR to access clusters in VPC private subnets.

    ') ADDITIONAL_MASTER_SECURITY_GROUPS = ( '

    A list of additional Amazon EC2 security group IDs for ' 'the master node.

    ') ADDITIONAL_SLAVE_SECURITY_GROUPS = ( '

    A list of additional Amazon EC2 security group IDs for ' 'the slave nodes.

    ') AVAILABLE_ONLY_FOR_AMI_VERSIONS = ( 'This command is only available when using Amazon EMR versions ' 'earlier than 4.0.') STEP_CONCURRENCY_LEVEL = ( 'This command specifies the step concurrency level of the cluster. ' 'The default is 1, which is non-concurrent.' ) MANAGED_SCALING_POLICY = ( '

    Managed scaling policy for an Amazon EMR cluster. The policy ' 'specifies the limits for resources that can be added or terminated ' 'from a cluster. You can specify the ComputeLimits which include ' 'the MaximumCapacityUnits, MaximumCoreCapacityUnits, MinimumCapacityUnits, ' 'MaximumOnDemandCapacityUnits and UnitType. For an ' 'InstanceFleet cluster, the UnitType must be InstanceFleetUnits. For ' 'InstanceGroup clusters, the UnitType can be either VCPU or Instances.

    ' ) PLACEMENT_GROUP_CONFIGS = ( '

    Placement group configuration for an Amazon EMR ' 'cluster. The configuration specifies the EC2 placement group ' 'strategy associated with each EMR Instance Role.

    ' '

    Currently, placement groups are supported only for the MASTER ' 'role, with the SPREAD strategy by default. You can opt in by ' 'passing --placement-group-configs InstanceRole=MASTER ' 'during cluster creation.

    ' ) AUTO_TERMINATION_POLICY = ( '

    Auto termination policy for an Amazon EMR cluster. ' 'The configuration specifies the termination idle timeout ' 'threshold for a cluster.

    ' ) EXECUTION_ROLE_ARN = ( '

    You must grant the execution role the permissions needed ' 'to access the same IAM resources that the step can access. ' 'The execution role can be a cross-account IAM Role.

    ' )
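
The shortcut options described in LIST_CLUSTERS_STATE_FILTERS map onto the state values listed in LIST_CLUSTERS_CLUSTER_STATES. The following is a minimal sketch of that mapping as a lookup table. The names `STATE_SHORTCUTS` and `expand_state_shortcut` are hypothetical and exist only for illustration; they are not part of the AWS CLI, and the CLI's own handling of these options may differ.

```python
# Illustrative sketch only: these names are assumptions, not AWS CLI code.
# The state values are copied from the help text above.
ACTIVE_STATES = (
    "STARTING",
    "BOOTSTRAPPING",
    "RUNNING",
    "WAITING",
    "TERMINATING",
)

# Each shortcut option stands for one or more --cluster-states values.
STATE_SHORTCUTS = {
    "--active": ACTIVE_STATES,
    "--terminated": ("TERMINATED",),
    "--failed": ("TERMINATED_WITH_ERRORS",),
}


def expand_state_shortcut(flag):
    """Return the list of cluster states that a shortcut option stands for."""
    if flag not in STATE_SHORTCUTS:
        raise ValueError("unknown shortcut option: %s" % flag)
    return list(STATE_SHORTCUTS[flag])


if __name__ == "__main__":
    for option in ("--active", "--terminated", "--failed"):
        print(option, "->", ", ".join(expand_state_shortcut(option)))
```

Keeping the mapping in a single dictionary means the help text and any code that expands the shortcuts can be checked against one list of states.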