Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: CC-BY-SA-4.0

CreateLabelingJob

Creates a job that uses workers to label the data objects in your input dataset. You can use the labeled data to train machine learning models.

You can select your workforce from one of three providers:
+ A private workforce that you create. It can include employees, contractors, and outside experts. Use a private workforce when you want the data to stay within your organization or when a specific set of skills is required.
+ One or more vendors that you select from the AWS Marketplace. Vendors provide expertise in specific areas.
+ The Amazon Mechanical Turk workforce. This is the largest workforce, but it should only be used for public data or data that has been stripped of any personally identifiable information.

You can also use automated data labeling to reduce the number of data objects that need to be labeled by a human. Automated data labeling uses active learning to determine if a data object can be labeled by machine or if it needs to be sent to a human worker. For more information, see Using Automated Data Labeling.

The data objects to be labeled are contained in an Amazon S3 bucket. You create a manifest file that describes the location of each object. For more information, see Using Input and Output Data.
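
As a rough illustration of the manifest described above, the following Python sketch writes one JSON object per line, one line per data object. It assumes the JSON Lines layout and the `source-ref` key used by Ground Truth input manifests; the bucket and object names are hypothetical, and the Using Input and Output Data topic remains the authoritative description of the format.

```python
import json


def build_manifest(object_uris):
    """Return a manifest body: one JSON object per line, one line per data object."""
    # Assumption: each entry points at its object with the "source-ref" key.
    lines = [json.dumps({"source-ref": uri}) for uri in object_uris]
    return "\n".join(lines) + "\n"


# Hypothetical bucket and keys, for illustration only.
manifest = build_manifest([
    "s3://example-bucket/images/0001.jpg",
    "s3://example-bucket/images/0002.jpg",
])
print(manifest, end="")
```

The resulting text would be uploaded to Amazon S3, and its location passed as `ManifestS3Uri` in the request.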

The output can be used as the manifest file for another labeling job or as training data for your machine learning models.

{
   "[HumanTaskConfig](#SageMaker-CreateLabelingJob-request-HumanTaskConfig)": { 
      "[AnnotationConsolidationConfig](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-AnnotationConsolidationConfig)": { 
         "[AnnotationConsolidationLambdaArn](API_AnnotationConsolidationConfig.md#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn)": "string"
      },
      "[MaxConcurrentTaskCount](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-MaxConcurrentTaskCount)": number,
      "[NumberOfHumanWorkersPerDataObject](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-NumberOfHumanWorkersPerDataObject)": number,
      "[PreHumanTaskLambdaArn](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn)": "string",
      "[PublicWorkforceTaskPrice](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-PublicWorkforceTaskPrice)": { 
         "[AmountInUsd](API_PublicWorkforceTaskPrice.md#SageMaker-Type-PublicWorkforceTaskPrice-AmountInUsd)": { 
            "[Cents](API_USD.md#SageMaker-Type-USD-Cents)": number,
            "[Dollars](API_USD.md#SageMaker-Type-USD-Dollars)": number,
            "[TenthFractionsOfACent](API_USD.md#SageMaker-Type-USD-TenthFractionsOfACent)": number
         }
      },
      "[TaskAvailabilityLifetimeInSeconds](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-TaskAvailabilityLifetimeInSeconds)": number,
      "[TaskDescription](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-TaskDescription)": "string",
      "[TaskKeywords](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-TaskKeywords)": [ "string" ],
      "[TaskTimeLimitInSeconds](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-TaskTimeLimitInSeconds)": number,
      "[TaskTitle](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-TaskTitle)": "string",
      "[UiConfig](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-UiConfig)": { 
         "[UiTemplateS3Uri](API_UiConfig.md#SageMaker-Type-UiConfig-UiTemplateS3Uri)": "string"
      },
      "[WorkteamArn](API_HumanTaskConfig.md#SageMaker-Type-HumanTaskConfig-WorkteamArn)": "string"
   },
   "[InputConfig](#SageMaker-CreateLabelingJob-request-InputConfig)": { 
      "[DataAttributes](API_LabelingJobInputConfig.md#SageMaker-Type-LabelingJobInputConfig-DataAttributes)": { 
         "[ContentClassifiers](API_LabelingJobDataAttributes.md#SageMaker-Type-LabelingJobDataAttributes-ContentClassifiers)": [ "string" ]
      },
      "[DataSource](API_LabelingJobInputConfig.md#SageMaker-Type-LabelingJobInputConfig-DataSource)": { 
         "[S3DataSource](API_LabelingJobDataSource.md#SageMaker-Type-LabelingJobDataSource-S3DataSource)": { 
            "[ManifestS3Uri](API_LabelingJobS3DataSource.md#SageMaker-Type-LabelingJobS3DataSource-ManifestS3Uri)": "string"
         }
      }
   },
   "[LabelAttributeName](#SageMaker-CreateLabelingJob-request-LabelAttributeName)": "string",
   "[LabelCategoryConfigS3Uri](#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri)": "string",
   "[LabelingJobAlgorithmsConfig](#SageMaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)": { 
      "[InitialActiveLearningModelArn](API_LabelingJobAlgorithmsConfig.md#SageMaker-Type-LabelingJobAlgorithmsConfig-InitialActiveLearningModelArn)": "string",
      "[LabelingJobAlgorithmSpecificationArn](API_LabelingJobAlgorithmsConfig.md#SageMaker-Type-LabelingJobAlgorithmsConfig-LabelingJobAlgorithmSpecificationArn)": "string",
      "[LabelingJobResourceConfig](API_LabelingJobAlgorithmsConfig.md#SageMaker-Type-LabelingJobAlgorithmsConfig-LabelingJobResourceConfig)": { 
         "[VolumeKmsKeyId](API_LabelingJobResourceConfig.md#SageMaker-Type-LabelingJobResourceConfig-VolumeKmsKeyId)": "string"
      }
   },
   "[LabelingJobName](#SageMaker-CreateLabelingJob-request-LabelingJobName)": "string",
   "[OutputConfig](#SageMaker-CreateLabelingJob-request-OutputConfig)": { 
      "[KmsKeyId](API_LabelingJobOutputConfig.md#SageMaker-Type-LabelingJobOutputConfig-KmsKeyId)": "string",
      "[S3OutputPath](API_LabelingJobOutputConfig.md#SageMaker-Type-LabelingJobOutputConfig-S3OutputPath)": "string"
   },
   "[RoleArn](#SageMaker-CreateLabelingJob-request-RoleArn)": "string",
   "[StoppingConditions](#SageMaker-CreateLabelingJob-request-StoppingConditions)": { 
      "[MaxHumanLabeledObjectCount](API_LabelingJobStoppingConditions.md#SageMaker-Type-LabelingJobStoppingConditions-MaxHumanLabeledObjectCount)": number,
      "[MaxPercentageOfInputDatasetLabeled](API_LabelingJobStoppingConditions.md#SageMaker-Type-LabelingJobStoppingConditions-MaxPercentageOfInputDatasetLabeled)": number
   },
   "[Tags](#SageMaker-CreateLabelingJob-request-Tags)": [ 
      { 
         "[Key](API_Tag.md#SageMaker-Type-Tag-Key)": "string",
         "[Value](API_Tag.md#SageMaker-Type-Tag-Value)": "string"
      }
   ]
}
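
The syntax above can be easier to read as a concrete request. The following Python sketch assembles a request body that fills in only a minimal set of fields and leaves the optional ones out. The helper name `build_request`, the default label attribute name, the task wording, and every ARN and S3 URI are hypothetical placeholders, not values from this reference; check the individual type pages for which nested fields are required.

```python
import json


def build_request(job_name, role_arn, manifest_uri, output_uri, workteam_arn,
                  ui_template_uri, pre_lambda_arn, consolidation_lambda_arn):
    """Assemble a CreateLabelingJob request body as a plain dict."""
    return {
        "LabelingJobName": job_name,
        "LabelAttributeName": "category",  # hypothetical attribute name
        "RoleArn": role_arn,
        "InputConfig": {
            "DataSource": {"S3DataSource": {"ManifestS3Uri": manifest_uri}},
        },
        "OutputConfig": {"S3OutputPath": output_uri},
        "HumanTaskConfig": {
            "WorkteamArn": workteam_arn,
            "UiConfig": {"UiTemplateS3Uri": ui_template_uri},
            "PreHumanTaskLambdaArn": pre_lambda_arn,
            "AnnotationConsolidationConfig": {
                "AnnotationConsolidationLambdaArn": consolidation_lambda_arn,
            },
            "TaskTitle": "Classify the image",
            "TaskDescription": "Pick the category that best matches the image.",
            "NumberOfHumanWorkersPerDataObject": 3,
            "TaskTimeLimitInSeconds": 300,
        },
    }


# All identifiers below are made-up placeholders.
request = build_request(
    job_name="example-labeling-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleLabelingRole",
    manifest_uri="s3://example-bucket/input/manifest.json",
    output_uri="s3://example-bucket/output/",
    workteam_arn="arn:aws:sagemaker:us-west-2:123456789012:workteam/private-crowd/example-team",
    ui_template_uri="s3://example-bucket/templates/task.liquid.html",
    pre_lambda_arn="arn:aws:lambda:us-west-2:123456789012:function:example-pre-task",
    consolidation_lambda_arn="arn:aws:lambda:us-west-2:123456789012:function:example-consolidation",
)
print(json.dumps(request, indent=3))
```

With the AWS SDK for Python installed and credentials configured, a dict like this could be passed as `boto3.client("sagemaker").create_labeling_job(**request)`.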

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

** HumanTaskConfig ** Configures the information required for human workers to complete a labeling task.
Type: HumanTaskConfig object
Required: Yes

** InputConfig ** Input data for the labeling job, such as the Amazon S3 location of the data objects and the location of the manifest file that describes the data objects.
Type: LabelingJobInputConfig object
Required: Yes

** LabelAttributeName ** The attribute name to use for the label in the output manifest file. This is the key for the key/value pair formed with the label that a worker assigns to the object. The name can’t end with “-metadata”. If you are running a semantic segmentation labeling job, the attribute name must end with “-ref”. If you are running any other kind of labeling job, the attribute name must not end with “-ref”.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 127.
Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: Yes
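
The naming rules for `LabelAttributeName` can be checked on the client before the request is sent. The following Python sketch is one way to do that; the function name and the `semantic_segmentation` flag are hypothetical, and it assumes the documented pattern is meant to match the whole name.

```python
import re

# Pattern from the parameter description; fullmatch() is used on the
# assumption that the entire name must conform to it.
_NAME_RE = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")


def check_label_attribute_name(name, semantic_segmentation=False):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not 1 <= len(name) <= 127:
        problems.append("length must be between 1 and 127 characters")
    if not _NAME_RE.fullmatch(name):
        problems.append("name does not match the documented pattern")
    if name.endswith("-metadata"):
        problems.append("name must not end with -metadata")
    if semantic_segmentation and not name.endswith("-ref"):
        problems.append("semantic segmentation jobs need a name ending in -ref")
    if not semantic_segmentation and name.endswith("-ref"):
        problems.append("only semantic segmentation jobs may use a name ending in -ref")
    return problems


print(check_label_attribute_name("category"))
print(check_label_attribute_name("mask-ref", semantic_segmentation=True))
print(check_label_attribute_name("mask-ref"))
```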

** LabelCategoryConfigS3Uri ** The S3 URL of the file that defines the categories used to label the data objects.
The file is a JSON structure in the following format:
{
    "document-version": "2018-11-28",
    "labels": [
        {
            "label": "label 1"
        },
        {
            "label": "label 2"
        },
        ...
        {
            "label": "label n"
        }
    ]
}
Type: String
Length Constraints: Maximum length of 1024.
Pattern: ^(https|s3)://([^/]+)/?(.*)$
Required: No
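
A label category file in the format described for `LabelCategoryConfigS3Uri` can be generated rather than written by hand. The following Python sketch does so for a list of label names; the function name and the example labels are hypothetical, and the `document-version` value is the one shown in the format above.

```python
import json


def build_label_category_config(labels):
    """Build the label category structure for the given label names."""
    return {
        "document-version": "2018-11-28",
        "labels": [{"label": name} for name in labels],
    }


# Hypothetical labels, for illustration only.
config_json = json.dumps(build_label_category_config(["cat", "dog"]), indent=4)
print(config_json)
```

The printed JSON would then be uploaded to Amazon S3 and its location passed in `LabelCategoryConfigS3Uri`.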

** LabelingJobAlgorithmsConfig ** Configures the information required to perform automated data labeling.
Type: LabelingJobAlgorithmsConfig object
Required: No

** LabelingJobName ** The name of the labeling job. This name is used to identify the job in a list of labeling jobs.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: Yes

** OutputConfig ** The location of the output data and the AWS Key Management Service key ID for the key used to encrypt the output data, if any.
Type: LabelingJobOutputConfig object
Required: Yes

** RoleArn ** The Amazon Resource Name (ARN) of the IAM role that Amazon SageMaker assumes to perform tasks on your behalf during data labeling. You must grant this role the necessary permissions so that Amazon SageMaker can successfully complete data labeling.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern: ^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$
Required: Yes
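
The length and pattern constraints on `RoleArn` can be checked locally to catch malformed values early. The following Python sketch applies them as documented; the function name, account ID, and role names are hypothetical, and passing this check says nothing about whether the role exists or has the required permissions.

```python
import re

# Pattern copied from the constraint above.
_ROLE_ARN_RE = re.compile(
    r"^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$"
)


def is_valid_role_arn(arn):
    """Return True if the ARN satisfies the documented length and pattern."""
    return 20 <= len(arn) <= 2048 and _ROLE_ARN_RE.fullmatch(arn) is not None


print(is_valid_role_arn("arn:aws:iam::123456789012:role/ExampleLabelingRole"))
print(is_valid_role_arn("arn:aws:iam::1234:role/ExampleLabelingRole"))
```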

** StoppingConditions ** A set of conditions for stopping the labeling job. If any of the conditions are met, the job is automatically stopped. You can use these conditions to control the cost of data labeling.
Type: LabelingJobStoppingConditions object
Required: No

** Tags ** An array of key/value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
Type: Array of Tag objects
Array Members: Minimum number of 0 items. Maximum number of 50 items.
Required: No
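
Tags are often kept as a simple mapping in application code, while the request expects an array of `Key`/`Value` objects. The following Python sketch converts one form to the other and enforces the 50-item limit stated above; the function name and the example tags are hypothetical.

```python
def to_tag_list(tags):
    """Convert a {key: value} mapping into the Tags array used by the request."""
    if len(tags) > 50:
        raise ValueError("a labeling job accepts at most 50 tags")
    return [{"Key": key, "Value": value} for key, value in tags.items()]


# Hypothetical tags, for illustration only.
print(to_tag_list({"project": "demo", "team": "vision"}))
```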

{
   "[LabelingJobArn](#SageMaker-CreateLabelingJob-response-LabelingJobArn)": "string"
}

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

** LabelingJobArn ** The Amazon Resource Name (ARN) of the labeling job. You use this ARN to identify the labeling job.
Type: String
Length Constraints: Maximum length of 2048.
Pattern: arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:labeling-job/.*
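
The returned ARN follows the pattern above, so its parts can be pulled out with a regular expression. The following Python sketch adds capture groups to that pattern; the function name, the field names in the result, and the example ARN are hypothetical.

```python
import re

# Documented pattern with capture groups added for region, account, and resource.
_JOB_ARN_RE = re.compile(
    r"arn:aws[a-z\-]*:sagemaker:([a-z0-9\-]*):([0-9]{12}):labeling-job/(.*)"
)


def parse_labeling_job_arn(arn):
    """Split a labeling job ARN into region, account ID, and resource ID."""
    match = _JOB_ARN_RE.fullmatch(arn)
    if match is None:
        raise ValueError("not a labeling job ARN: " + arn)
    region, account_id, resource_id = match.groups()
    return {"region": region, "account_id": account_id, "resource_id": resource_id}


# Made-up ARN, for illustration only.
parts = parse_labeling_job_arn(
    "arn:aws:sagemaker:us-west-2:123456789012:labeling-job/example-labeling-job"
)
print(parts)
```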

For information about the errors that are common to all actions, see Common Errors.

ResourceInUse
Resource being accessed is in use.
HTTP Status Code: 400

ResourceLimitExceeded
You have exceeded an Amazon SageMaker resource limit. For example, you might have too many training jobs created.
HTTP Status Code: 400
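
Callers usually want to tell these two errors apart. The following Python sketch maps an error code to a short hint. It assumes the error arrives as a dict shaped like the `response` attribute of a botocore `ClientError` (`{"Error": {"Code": ..., "Message": ...}}`); the function name and the hint wording are hypothetical, and the hints only paraphrase the descriptions above.

```python
def describe_create_error(error_response):
    """Return a short hint for the error codes listed for this action."""
    code = error_response.get("Error", {}).get("Code", "")
    if code == "ResourceInUse":
        return "The resource being accessed is in use."
    if code == "ResourceLimitExceeded":
        return "An Amazon SageMaker resource limit was exceeded."
    # Anything else is covered by the common errors for all actions.
    return "Unhandled error code: " + (code or "unknown")


# Hypothetical error payload, for illustration only.
print(describe_create_error({"Error": {"Code": "ResourceInUse", "Message": "example"}}))
```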

For more information about using this API in one of the language-specific AWS SDKs, see the following:
+ AWS Command Line Interface
+ AWS SDK for .NET
+ AWS SDK for C++
+ AWS SDK for Go
+ AWS SDK for Go - Pilot
+ AWS SDK for Java
+ AWS SDK for JavaScript
+ AWS SDK for PHP V3
+ AWS SDK for Python
+ AWS SDK for Ruby V2