/*
 * Copyright 2010-2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */

package com.amazonaws.services.comprehend.model;

import java.io.Serializable;

/**
 * Specifies the format and location of the input data.
 */
public class EntityRecognizerInputDataConfig implements Serializable {
    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     */
    private String dataFormat;
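
    // The rules in the comment above (which parameters each DataFormat value
    // requires) can be sketched as a small standalone check. This is an
    // illustration only and not part of the SDK: the class
    // DataFormatRequirements, its check method, and the boolean flags are
    // hypothetical names, and "either Annotations or EntityList" is assumed
    // to mean "at least one of the two".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the AWS SDK. */
final class DataFormatRequirements {

    /**
     * Lists the documented requirements that a combination of parameters
     * violates. An empty list means that none were found.
     */
    static List<String> check(String dataFormat, boolean hasDocuments,
            boolean hasAnnotations, boolean hasEntityList, boolean hasAugmentedManifests) {
        List<String> problems = new ArrayList<>();
        // An unspecified format falls back to COMPREHEND_CSV, as documented.
        String format = (dataFormat == null) ? "COMPREHEND_CSV" : dataFormat;
        switch (format) {
            case "COMPREHEND_CSV":
                if (!hasDocuments) {
                    problems.add("COMPREHEND_CSV requires Documents");
                }
                // Assumption: "either ... or" is read as "at least one".
                if (!hasAnnotations && !hasEntityList) {
                    problems.add("COMPREHEND_CSV requires Annotations or EntityList");
                }
                break;
            case "AUGMENTED_MANIFEST":
                if (!hasAugmentedManifests) {
                    problems.add("AUGMENTED_MANIFEST requires AugmentedManifests");
                }
                break;
            default:
                problems.add("Unknown DataFormat: " + format);
        }
        return problems;
    }

    public static void main(String[] args) {
        // No format given, documents and annotations present: nothing to report.
        System.out.println(check(null, true, true, false, false));
        // Augmented manifest format without any manifest: one problem reported.
        System.out.println(check("AUGMENTED_MANIFEST", false, false, false, false));
    }
}
```

    // A check like this would run on the caller's side before a request is
    // built; the service performs its own validation regardless.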

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses
     * to train the custom entity recognizer. Any entity types that you don't
     * specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity
     * recognizer. Entity types must not contain the following invalid
     * characters: \n (line break), \\n (escaped line break), \r (carriage
     * return), \\r (escaped carriage return), \t (tab), \\t (escaped tab),
     * space, and , (comma).
     * </p>
     */
    private java.util.List<EntityTypesListItem> entityTypes;

    /**
     * The S3 location of the folder that contains the training documents for
     * your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>COMPREHEND_CSV</code>.
     * </p>
     */
    private EntityRecognizerDocuments documents;

    /**
     * The S3 location of the CSV file that annotates your training documents.
     */
    private EntityRecognizerAnnotations annotations;

    /**
     * The S3 location of the CSV file that has the entity list for your custom
     * entity recognizer.
     */
    private EntityRecognizerEntityList entityList;

    /**
     * A list of augmented manifest files that provide training data for your
     * custom model. An augmented manifest file is a labeled dataset that is
     * produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>AUGMENTED_MANIFEST</code>.
     * </p>
     */
    private java.util.List<AugmentedManifestsListItem> augmentedManifests;

    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     *
     * @return The format of your training data:
     *         <code>COMPREHEND_CSV</code> (the default if you don't specify
     *         a value) or <code>AUGMENTED_MANIFEST</code>. The parameters
     *         that each value requires are listed above.
     */
    public String getDataFormat() {
        return dataFormat;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     *
     * @param dataFormat The format of your training data:
     *            <code>COMPREHEND_CSV</code> (the default if you don't
     *            specify a value) or <code>AUGMENTED_MANIFEST</code>. The
     *            parameters that each value requires are listed above.
     */
    public void setDataFormat(String dataFormat) {
        this.dataFormat = dataFormat;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     *
     * @param dataFormat The format of your training data:
     *            <code>COMPREHEND_CSV</code> (the default if you don't
     *            specify a value) or <code>AUGMENTED_MANIFEST</code>. The
     *            parameters that each value requires are listed above.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withDataFormat(String dataFormat) {
        this.dataFormat = dataFormat;
        return this;
    }

    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     *
     * @param dataFormat The format of your training data:
     *            <code>COMPREHEND_CSV</code> (the default if you don't
     *            specify a value) or <code>AUGMENTED_MANIFEST</code>. The
     *            parameters that each value requires are listed above.
     */
    public void setDataFormat(EntityRecognizerDataFormat dataFormat) {
        this.dataFormat = dataFormat.toString();
    }

    /**
     * The format of your training data:
     * <ul>
     * <li>
     * <p>
     * <code>COMPREHEND_CSV</code>: A CSV file that supplements your training
     * documents. The CSV file contains information about the custom entities
     * that your trained model will detect. The required format of the file
     * depends on whether you are providing annotations or an entity list.
     * </p>
     * <p>
     * If you use this value, you must provide your CSV file by using either the
     * <code>Annotations</code> or <code>EntityList</code> parameters. You must
     * provide your training documents by using the <code>Documents</code>
     * parameter.
     * </p>
     * </li>
     * <li>
     * <p>
     * <code>AUGMENTED_MANIFEST</code>: A labeled dataset that is produced by
     * Amazon SageMaker Ground Truth. This file is in JSON lines format. Each
     * line is a complete JSON object that contains a training document and its
     * labels. Each label annotates a named entity in the training document.
     * </p>
     * <p>
     * If you use this value, you must provide the
     * <code>AugmentedManifests</code> parameter in your request.
     * </p>
     * </li>
     * </ul>
     * <p>
     * If you don't specify a value, Amazon Comprehend uses
     * <code>COMPREHEND_CSV</code> as the default.
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     * <p>
     * <b>Constraints:</b><br/>
     * <b>Allowed Values:</b> COMPREHEND_CSV, AUGMENTED_MANIFEST
     * </p>
     *
     * @param dataFormat The format of your training data:
     *            <code>COMPREHEND_CSV</code> (the default if you don't
     *            specify a value) or <code>AUGMENTED_MANIFEST</code>. The
     *            parameters that each value requires are listed above.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withDataFormat(EntityRecognizerDataFormat dataFormat) {
        this.dataFormat = dataFormat.toString();
        return this;
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses
     * to train the custom entity recognizer. Any entity types that you don't
     * specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity
     * recognizer. Entity types must not contain the following invalid
     * characters: \n (line break), \\n (escaped line break), \r (carriage
     * return), \\r (escaped carriage return), \t (tab), \\t (escaped tab),
     * space, and , (comma).
     * </p>
     *
     * @return The entity types in the labeled training data that Amazon
     *         Comprehend uses to train the custom entity recognizer. Any
     *         entity types that you don't specify are ignored.
     */
    public java.util.List<EntityTypesListItem> getEntityTypes() {
        return entityTypes;
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses
     * to train the custom entity recognizer. Any entity types that you don't
     * specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity
     * recognizer. Entity types must not contain the following invalid
     * characters: \n (line break), \\n (escaped line break), \r (carriage
     * return), \\r (escaped carriage return), \t (tab), \\t (escaped tab),
     * space, and , (comma).
     * </p>
     *
     * @param entityTypes The entity types in the labeled training data that
     *            Amazon Comprehend uses to train the custom entity
     *            recognizer. Any entity types that you don't specify are
     *            ignored.
     */
    public void setEntityTypes(java.util.Collection<EntityTypesListItem> entityTypes) {
        if (entityTypes == null) {
            this.entityTypes = null;
            return;
        }

        this.entityTypes = new java.util.ArrayList<EntityTypesListItem>(entityTypes);
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses
     * to train the custom entity recognizer. Any entity types that you don't
     * specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity
     * recognizer. Entity types must not contain the following invalid
     * characters: \n (line break), \\n (escaped line break), \r (carriage
     * return), \\r (escaped carriage return), \t (tab), \\t (escaped tab),
     * space, and , (comma).
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param entityTypes The entity types in the labeled training data that
     *            Amazon Comprehend uses to train the custom entity
     *            recognizer. Any entity types that you don't specify are
     *            ignored.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withEntityTypes(EntityTypesListItem... entityTypes) {
        if (getEntityTypes() == null) {
            this.entityTypes = new java.util.ArrayList<EntityTypesListItem>(entityTypes.length);
        }
        for (EntityTypesListItem value : entityTypes) {
            this.entityTypes.add(value);
        }
        return this;
    }

    /**
     * The entity types in the labeled training data that Amazon Comprehend uses
     * to train the custom entity recognizer. Any entity types that you don't
     * specify are ignored.
     * <p>
     * A maximum of 25 entity types can be used at one time to train an entity
     * recognizer. Entity types must not contain the following invalid
     * characters: \n (line break), \\n (escaped line break), \r (carriage
     * return), \\r (escaped carriage return), \t (tab), \\t (escaped tab),
     * space, and , (comma).
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param entityTypes The entity types in the labeled training data that
     *            Amazon Comprehend uses to train the custom entity
     *            recognizer. Any entity types that you don't specify are
     *            ignored.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withEntityTypes(
            java.util.Collection<EntityTypesListItem> entityTypes) {
        setEntityTypes(entityTypes);
        return this;
    }

    /**
     * The S3 location of the folder that contains the training documents for
     * your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>COMPREHEND_CSV</code>.
     * </p>
     *
     * @return The S3 location of the folder that contains the training
     *         documents for your custom entity recognizer. Required if you
     *         set <code>DataFormat</code> to <code>COMPREHEND_CSV</code>.
     */
    public EntityRecognizerDocuments getDocuments() {
        return documents;
    }

    /**
     * The S3 location of the folder that contains the training documents for
     * your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>COMPREHEND_CSV</code>.
     * </p>
     *
     * @param documents The S3 location of the folder that contains the
     *            training documents for your custom entity recognizer.
     *            Required if you set <code>DataFormat</code> to
     *            <code>COMPREHEND_CSV</code>.
     */
    public void setDocuments(EntityRecognizerDocuments documents) {
        this.documents = documents;
    }

    /**
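     * The S3 location of the folder that contains the training documents for
     * your custom entity recognizer.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>COMPREHEND_CSV</code>.
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param documents The S3 location of the folder that contains the
     *            training documents for your custom entity recognizer.
     *            Required if you set <code>DataFormat</code> to
     *            <code>COMPREHEND_CSV</code>.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withDocuments(EntityRecognizerDocuments documents) {
        this.documents = documents;
        return this;
    }

    /**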
     * The S3 location of the CSV file that annotates your training documents.
     *
     * @return The S3 location of the CSV file that annotates your training
     *         documents.
     */
    public EntityRecognizerAnnotations getAnnotations() {
        return annotations;
    }

    /**
     * The S3 location of the CSV file that annotates your training documents.
     *
     * @param annotations The S3 location of the CSV file that annotates your
     *            training documents.
     */
    public void setAnnotations(EntityRecognizerAnnotations annotations) {
        this.annotations = annotations;
    }

    /**
     * The S3 location of the CSV file that annotates your training documents.
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param annotations The S3 location of the CSV file that annotates your
     *            training documents.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withAnnotations(EntityRecognizerAnnotations annotations) {
        this.annotations = annotations;
        return this;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom
     * entity recognizer.
     *
     * @return The S3 location of the CSV file that has the entity list for
     *         your custom entity recognizer.
     */
    public EntityRecognizerEntityList getEntityList() {
        return entityList;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom
     * entity recognizer.
     *
     * @param entityList The S3 location of the CSV file that has the entity
     *            list for your custom entity recognizer.
     */
    public void setEntityList(EntityRecognizerEntityList entityList) {
        this.entityList = entityList;
    }

    /**
     * The S3 location of the CSV file that has the entity list for your custom
     * entity recognizer.
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param entityList The S3 location of the CSV file that has the entity
     *            list for your custom entity recognizer.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withEntityList(EntityRecognizerEntityList entityList) {
        this.entityList = entityList;
        return this;
    }

    /**
     * A list of augmented manifest files that provide training data for your
     * custom model. An augmented manifest file is a labeled dataset that is
     * produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>AUGMENTED_MANIFEST</code>.
     * </p>
     *
     * @return A list of augmented manifest files that provide training data
     *         for your custom model. Required if you set
     *         <code>DataFormat</code> to <code>AUGMENTED_MANIFEST</code>.
     */
    public java.util.List<AugmentedManifestsListItem> getAugmentedManifests() {
        return augmentedManifests;
    }

    /**
     * A list of augmented manifest files that provide training data for your
     * custom model. An augmented manifest file is a labeled dataset that is
     * produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>AUGMENTED_MANIFEST</code>.
     * </p>
     *
     * @param augmentedManifests A list of augmented manifest files that
     *            provide training data for your custom model. Required if
     *            you set <code>DataFormat</code> to
     *            <code>AUGMENTED_MANIFEST</code>.
     */
    public void setAugmentedManifests(
            java.util.Collection<AugmentedManifestsListItem> augmentedManifests) {
        if (augmentedManifests == null) {
            this.augmentedManifests = null;
            return;
        }

        this.augmentedManifests = new java.util.ArrayList<AugmentedManifestsListItem>(
                augmentedManifests);
    }

    /**
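     * A list of augmented manifest files that provide training data for your
     * custom model. An augmented manifest file is a labeled dataset that is
     * produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>AUGMENTED_MANIFEST</code>.
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param augmentedManifests A list of augmented manifest files that
     *            provide training data for your custom model. Required if
     *            you set <code>DataFormat</code> to
     *            <code>AUGMENTED_MANIFEST</code>.
     * @return A reference to this updated object so that method calls can be
     *         chained together.
     */
    public EntityRecognizerInputDataConfig withAugmentedManifests(
            AugmentedManifestsListItem... augmentedManifests) {
        if (getAugmentedManifests() == null) {
            this.augmentedManifests = new java.util.ArrayList<AugmentedManifestsListItem>(
                    augmentedManifests.length);
        }
        for (AugmentedManifestsListItem value : augmentedManifests) {
            this.augmentedManifests.add(value);
        }
        return this;
    }

    /**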
     * A list of augmented manifest files that provide training data for your
     * custom model. An augmented manifest file is a labeled dataset that is
     * produced by Amazon SageMaker Ground Truth.
     * <p>
     * This parameter is required if you set <code>DataFormat</code> to
     * <code>AUGMENTED_MANIFEST</code>.
     * </p>
     * <p>
     * Returns a reference to this object so that method calls can be chained
     * together.
     * </p>
     *
     * @param augmentedManifests A list of augmented manifest files that
     *            provide training data for your custom model. Required if
     *            you set <code>DataFormat</code> to
     *            <code>AUGMENTED_MANIFEST</code>.