/*
 * Copyright 2018-2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with
 * the License. A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
 * CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package com.amazonaws.services.comprehend.model;

import java.io.Serializable;
import javax.annotation.Generated;

import com.amazonaws.protocol.StructuredPojo;
import com.amazonaws.protocol.ProtocolMarshaller;

/**
 * <p>
 * Specifies the format and location of the input data.
 * </p>
 *
 * @see "AWS API Documentation"
 */
@Generated("com.amazonaws:aws-java-sdk-code-generator")
public class EntityRecognizerInputDataConfig implements Serializable, Cloneable, StructuredPojo {

    /**
     * The format of your training data:
     * <ul>
     * <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file contains
     * information about the custom entities that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * <p>
     * If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     * <code>EntityList</code> parameters. You must provide your training documents by using the
     * <code>Documents</code> parameter.</li>
     * <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground Truth. This
     * file is in JSON lines format. Each line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * <p>
     * If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your request.</li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     */
    private String dataFormat;

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity recognizer.
     * Any entity types that you don't specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not contain
     * the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return), \\r (escaped
     * carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     */
    private java.util.List<EntityTypesListItem> entityTypes;

    /**
     * The S3 location of the folder that contains the training documents for your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     */
    private EntityRecognizerDocuments documents;

    /**
     * The S3 location of the CSV file that annotates your training documents.
     */
    private EntityRecognizerAnnotations annotations;

    /**
     * The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     */
    private EntityRecognizerEntityList entityList;

    /**
     * A list of augmented manifest files that provide training data for your custom model. An augmented manifest file
     * is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     */
    private java.util.List<AugmentedManifestsListItem> augmentedManifests;

    /**
     * The format of your training data:
     * <ul>
     * <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file contains
     * information about the custom entities that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * <p>
     * If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     * <code>EntityList</code> parameters. You must provide your training documents by using the
     * <code>Documents</code> parameter.</li>
     * <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground Truth. This
     * file is in JSON lines format. Each line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * <p>
     * If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your request.</li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     *
     * @param dataFormat
     *        The format of your training data:
     *        <ul>
     *        <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file
     *        contains information about the custom entities that your trained model will detect. The required
     *        format of the file depends on whether you are providing annotations or an entity list.
     *        <p>
     *        If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     *        <code>EntityList</code> parameters. You must provide your training documents by using the
     *        <code>Documents</code> parameter.</li>
     *        <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground
     *        Truth. This file is in JSON lines format. Each line is a complete JSON object that contains a training
     *        document and its labels. Each label annotates a named entity in the training document.
     *        <p>
     *        If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your
     *        request.</li>
     *        </ul>
     *        <p>
     *        If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     * @see EntityRecognizerDataFormat
     */
    public void setDataFormat(String dataFormat) {
        this.dataFormat = dataFormat;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file contains
     * information about the custom entities that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * <p>
     * If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     * <code>EntityList</code> parameters. You must provide your training documents by using the
     * <code>Documents</code> parameter.</li>
     * <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground Truth. This
     * file is in JSON lines format. Each line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * <p>
     * If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your request.</li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     *
     * @return The format of your training data:
     *         <ul>
     *         <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file
     *         contains information about the custom entities that your trained model will detect. The required
     *         format of the file depends on whether you are providing annotations or an entity list.
     *         <p>
     *         If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     *         <code>EntityList</code> parameters. You must provide your training documents by using the
     *         <code>Documents</code> parameter.</li>
     *         <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground
     *         Truth. This file is in JSON lines format. Each line is a complete JSON object that contains a training
     *         document and its labels. Each label annotates a named entity in the training document.
     *         <p>
     *         If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your
     *         request.</li>
     *         </ul>
     *         <p>
     *         If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     * @see EntityRecognizerDataFormat
     */
    public String getDataFormat() {
        return this.dataFormat;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file contains
     * information about the custom entities that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * <p>
     * If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     * <code>EntityList</code> parameters. You must provide your training documents by using the
     * <code>Documents</code> parameter.</li>
     * <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground Truth. This
     * file is in JSON lines format. Each line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * <p>
     * If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your request.</li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     *
     * @param dataFormat
     *        The format of your training data:
     *        <ul>
     *        <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file
     *        contains information about the custom entities that your trained model will detect. The required
     *        format of the file depends on whether you are providing annotations or an entity list.
     *        <p>
     *        If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     *        <code>EntityList</code> parameters. You must provide your training documents by using the
     *        <code>Documents</code> parameter.</li>
     *        <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground
     *        Truth. This file is in JSON lines format. Each line is a complete JSON object that contains a training
     *        document and its labels. Each label annotates a named entity in the training document.
     *        <p>
     *        If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your
     *        request.</li>
     *        </ul>
     *        <p>
     *        If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     * @return Returns a reference to this object so that method calls can be chained together.
     * @see EntityRecognizerDataFormat
     */
    public EntityRecognizerInputDataConfig withDataFormat(String dataFormat) {
        setDataFormat(dataFormat);
        return this;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file contains
     * information about the custom entities that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * <p>
     * If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     * <code>EntityList</code> parameters. You must provide your training documents by using the
     * <code>Documents</code> parameter.</li>
     * <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground Truth. This
     * file is in JSON lines format. Each line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * <p>
     * If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your request.</li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     *
     * @param dataFormat
     *        The format of your training data:
     *        <ul>
     *        <li><code>COMPREHEND_CSV</code>: A CSV file that supplements your training documents. The CSV file
     *        contains information about the custom entities that your trained model will detect. The required
     *        format of the file depends on whether you are providing annotations or an entity list.
     *        <p>
     *        If you use this value, you must provide your CSV file by using either the <code>Annotations</code> or
     *        <code>EntityList</code> parameters. You must provide your training documents by using the
     *        <code>Documents</code> parameter.</li>
     *        <li><code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by Amazon SageMaker Ground
     *        Truth. This file is in JSON lines format. Each line is a complete JSON object that contains a training
     *        document and its labels. Each label annotates a named entity in the training document.
     *        <p>
     *        If you use this value, you must provide the <code>AugmentedManifests</code> parameter in your
     *        request.</li>
     *        </ul>
     *        <p>
     *        If you don't specify a value, Amazon Comprehend uses <code>COMPREHEND_CSV</code> as the default.
     * @return Returns a reference to this object so that method calls can be chained together.
     * @see EntityRecognizerDataFormat
     */
    public EntityRecognizerInputDataConfig withDataFormat(EntityRecognizerDataFormat dataFormat) {
        this.dataFormat = dataFormat.toString();
        return this;
    }
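
    /*
    The data-format Javadoc above ties each format to the parameters it requires. The rules can be sketched as
    a small standalone check. This is only an illustration under stated assumptions: the class
    InputDataConfigCheck and its method are hypothetical and not part of the AWS SDK, and the boolean flags
    stand in for "this parameter was set on the config object".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK: lists the parameters that the
// documented rules require but that the caller has not supplied.
final class InputDataConfigCheck {

    static List<String> missingParameters(String dataFormat, boolean hasDocuments, boolean hasAnnotations,
            boolean hasEntityList, boolean hasAugmentedManifests) {
        // An unspecified format falls back to COMPREHEND_CSV, as documented.
        String format = (dataFormat == null) ? "COMPREHEND_CSV" : dataFormat;
        List<String> missing = new ArrayList<>();
        if ("COMPREHEND_CSV".equals(format)) {
            if (!hasDocuments) {
                missing.add("Documents");
            }
            // Assumption: the sketch only checks that at least one CSV input is present.
            if (!hasAnnotations && !hasEntityList) {
                missing.add("Annotations or EntityList");
            }
        } else if ("AUGMENTED_MANIFEST".equals(format)) {
            if (!hasAugmentedManifests) {
                missing.add("AugmentedManifests");
            }
        } else {
            missing.add("DataFormat (unrecognized value: " + format + ")");
        }
        return missing;
    }

    public static void main(String[] args) {
        // Default format, documents only: the CSV input is still missing.
        System.out.println(missingParameters(null, true, false, false, false)); // [Annotations or EntityList]
        // Augmented manifest format with manifests supplied: nothing is missing.
        System.out.println(missingParameters("AUGMENTED_MANIFEST", false, false, false, true)); // []
    }
}
```

    The sketch does not decide whether supplying both Annotations and EntityList is acceptable; the Javadoc
    above only says "either", so that question is left to the service.
    */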

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity recognizer.
     * Any entity types that you don't specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not contain
     * the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return), \\r (escaped
     * carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     *
     * @return The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity
     *         recognizer. Any entity types that you don't specify are ignored.
     *         <p>
     *         A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not
     *         contain the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage
     *         return), \\r (escaped carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     */
    public java.util.List<EntityTypesListItem> getEntityTypes() {
        return entityTypes;
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity recognizer.
     * Any entity types that you don't specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not contain
     * the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return), \\r (escaped
     * carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     *
     * @param entityTypes
     *        The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity
     *        recognizer. Any entity types that you don't specify are ignored.
     *        <p>
     *        A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not
     *        contain the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return),
     *        \\r (escaped carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     */
    public void setEntityTypes(java.util.Collection<EntityTypesListItem> entityTypes) {
        if (entityTypes == null) {
            this.entityTypes = null;
            return;
        }

        this.entityTypes = new java.util.ArrayList<EntityTypesListItem>(entityTypes);
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity recognizer.
     * Any entity types that you don't specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not contain
     * the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return), \\r (escaped
     * carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     * <p>
     * <b>NOTE:</b> This method appends the values to the existing list (if any). Use
     * {@link #setEntityTypes(java.util.Collection)} or {@link #withEntityTypes(java.util.Collection)} if you want to
     * override the existing values.
     *
     * @param entityTypes
     *        The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity
     *        recognizer. Any entity types that you don't specify are ignored.
     *        <p>
     *        A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not
     *        contain the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return),
     *        \\r (escaped carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withEntityTypes(EntityTypesListItem... entityTypes) {
        if (this.entityTypes == null) {
            setEntityTypes(new java.util.ArrayList<EntityTypesListItem>(entityTypes.length));
        }
        for (EntityTypesListItem ele : entityTypes) {
            this.entityTypes.add(ele);
        }
        return this;
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity recognizer.
     * Any entity types that you don't specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not contain
     * the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return), \\r (escaped
     * carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     *
     * @param entityTypes
     *        The entity types in the labeled training data that Amazon Comprehend uses to train the custom entity
     *        recognizer. Any entity types that you don't specify are ignored.
     *        <p>
     *        A maximum of 25 entity types can be used at one time to train an entity recognizer. Entity types must not
     *        contain the following invalid characters: \n (line break), \\n (escaped line break), \r (carriage return),
     *        \\r (escaped carriage return), \t (tab), \\t (escaped tab), space, and , (comma).
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withEntityTypes(java.util.Collection<EntityTypesListItem> entityTypes) {
        setEntityTypes(entityTypes);
        return this;
    }
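
    /*
    The entity type limits stated in the Javadoc above (at most 25 types, and no line breaks, carriage
    returns, tabs, their escaped spellings, spaces, or commas in a type name) can be checked on the client
    side before a request is built. The following is a minimal sketch under stated assumptions: the class
    EntityTypeNameCheck and its methods are hypothetical and not part of the AWS SDK, type names are passed
    as plain strings, and a null name or list is treated as invalid.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
final class EntityTypeNameCheck {

    static final int MAX_ENTITY_TYPES = 25;

    // Literal control characters, their two-character escaped spellings, space, and comma.
    private static final String[] INVALID_SEQUENCES = { "\n", "\\n", "\r", "\\r", "\t", "\\t", " ", "," };

    static boolean isValidName(String name) {
        if (name == null) {
            return false; // assumption: null is rejected
        }
        for (String sequence : INVALID_SEQUENCES) {
            if (name.contains(sequence)) {
                return false;
            }
        }
        return true;
    }

    static boolean isValidList(List<String> names) {
        if (names == null || names.size() > MAX_ENTITY_TYPES) {
            return false;
        }
        for (String name : names) {
            if (!isValidName(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("PERSON"));     // true
        System.out.println(isValidName("FIRST NAME")); // false (contains a space)
        System.out.println(isValidList(Arrays.asList("PERSON", "ORG,UNIT"))); // false (contains a comma)
    }
}
```

    The service remains the authority on what it accepts; the sketch only mirrors the constraints listed in
    the Javadoc and does not check for duplicates or other rules that may apply.
    */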
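
    /**
     * The S3 location of the folder that contains the training documents for your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     *
     * @param documents
     *        The S3 location of the folder that contains the training documents for your custom entity recognizer.
     *        <p>
     *        This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     */
    public void setDocuments(EntityRecognizerDocuments documents) {
        this.documents = documents;
    }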
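
    /**
     * The S3 location of the folder that contains the training documents for your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     *
     * @return The S3 location of the folder that contains the training documents for your custom entity recognizer.
     *         <p>
     *         This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     */
    public EntityRecognizerDocuments getDocuments() {
        return this.documents;
    }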

    /**
     * The S3 location of the folder that contains the training documents for your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     *
     * @param documents
     *        The S3 location of the folder that contains the training documents for your custom entity recognizer.
     *        <p>
     *        This parameter is required if you set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withDocuments(EntityRecognizerDocuments documents) {
        setDocuments(documents);
        return this;
    }

    /**
     * The S3 location of the CSV file that annotates your training documents.
     *
     * @param annotations
     *        The S3 location of the CSV file that annotates your training documents.
     */
    public void setAnnotations(EntityRecognizerAnnotations annotations) {
        this.annotations = annotations;
    }

    /**
     * The S3 location of the CSV file that annotates your training documents.
     *
     * @return The S3 location of the CSV file that annotates your training documents.
     */
    public EntityRecognizerAnnotations getAnnotations() {
        return this.annotations;
    }

    /**
     * The S3 location of the CSV file that annotates your training documents.
     *
     * @param annotations
     *        The S3 location of the CSV file that annotates your training documents.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withAnnotations(EntityRecognizerAnnotations annotations) {
        setAnnotations(annotations);
        return this;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     *
     * @param entityList
     *        The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     */
    public void setEntityList(EntityRecognizerEntityList entityList) {
        this.entityList = entityList;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     *
     * @return The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     */
    public EntityRecognizerEntityList getEntityList() {
        return this.entityList;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     *
     * @param entityList
     *        The S3 location of the CSV file that has the entity list for your custom entity recognizer.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withEntityList(EntityRecognizerEntityList entityList) {
        setEntityList(entityList);
        return this;
    }

    /**
     * A list of augmented manifest files that provide training data for your custom model. An augmented manifest file
     * is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
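     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     *
     * @return A list of augmented manifest files that provide training data for your custom model. An augmented
     *         manifest file is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     *         <p>
     *         This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     */
    public java.util.List<AugmentedManifestsListItem> getAugmentedManifests() {
        return augmentedManifests;
    }

    /**
     * A list of augmented manifest files that provide training data for your custom model. An augmented manifest file
     * is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     *
     * @param augmentedManifests
     *        A list of augmented manifest files that provide training data for your custom model. An augmented
     *        manifest file is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     *        <p>
     *        This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     */
    public void setAugmentedManifests(java.util.Collection<AugmentedManifestsListItem> augmentedManifests) {
        if (augmentedManifests == null) {
            this.augmentedManifests = null;
            return;
        }

        this.augmentedManifests = new java.util.ArrayList<AugmentedManifestsListItem>(augmentedManifests);
    }

    /**
     * A list of augmented manifest files that provide training data for your custom model. An augmented manifest file
     * is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     * <p>
     * <b>NOTE:</b> This method appends the values to the existing list (if any). Use
     * {@link #setAugmentedManifests(java.util.Collection)} or {@link #withAugmentedManifests(java.util.Collection)} if
     * you want to override the existing values.
     *
     * @param augmentedManifests
     *        A list of augmented manifest files that provide training data for your custom model. An augmented
     *        manifest file is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     *        <p>
     *        This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withAugmentedManifests(AugmentedManifestsListItem... augmentedManifests) {
        if (this.augmentedManifests == null) {
            setAugmentedManifests(new java.util.ArrayList<AugmentedManifestsListItem>(augmentedManifests.length));
        }
        for (AugmentedManifestsListItem ele : augmentedManifests) {
            this.augmentedManifests.add(ele);
        }
        return this;
    }

    /**
     * A list of augmented manifest files that provide training data for your custom model. An augmented manifest file
     * is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     *
     * @param augmentedManifests
     *        A list of augmented manifest files that provide training data for your custom model. An augmented
     *        manifest file is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
     *        <p>
     *        This parameter is required if you set <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public EntityRecognizerInputDataConfig withAugmentedManifests(java.util.Collection<AugmentedManifestsListItem> augmentedManifests) {
        setAugmentedManifests(augmentedManifests);
        return this;
    }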