/*
 * Copyright 2018-2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with
 * the License. A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
 * CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package com.amazonaws.services.sagemaker.model;

import java.io.Serializable;
import javax.annotation.Generated;

import com.amazonaws.protocol.StructuredPojo;
import com.amazonaws.protocol.ProtocolMarshaller;

/**
 * <p>
 * Describes the input source of a transform job and the way the transform job consumes it.
 * </p>
 *
 * @see AWS API Documentation
 */
@Generated("com.amazonaws:aws-java-sdk-code-generator")
public class TransformInput implements Serializable, Cloneable, StructuredPojo {

    /**
     * <p>
     * Describes the location of the channel data, that is, the S3 location of the input data that the model can
     * consume.
     * </p>
     */
    private TransformDataSource dataSource;

    /**
     * <p>
     * The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type with each
     * HTTP call to transfer data to the transform job.
     * </p>
     */
    private String contentType;

    /**
     * <p>
     * If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
     * the data for the transform job accordingly. The default value is <code>None</code>.
     * </p>
     */
    private String compressionType;

    /**
     * <p>
     * The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
     * total size of each object is too large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default value for <code>SplitType</code> is
     * <code>None</code>, which indicates that input data files are not split, and request payloads contain the entire
     * contents of an input object. Set the value of this parameter to <code>Line</code> to split records on a newline
     * character boundary. <code>SplitType</code> also supports a number of record-oriented binary data formats.
     * Currently, the supported record formats are:
     * </p>
     * <ul>
     * <li>RecordIO</li>
     * <li>TFRecord</li>
     * </ul>
     * <p>
     * When splitting is enabled, the size of a mini-batch depends on the values of the <code>BatchStrategy</code> and
     * <code>MaxPayloadInMB</code> parameters. When the value of <code>BatchStrategy</code> is
     * <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of records in each request, up to the
     * <code>MaxPayloadInMB</code> limit. If the value of <code>BatchStrategy</code> is <code>SingleRecord</code>,
     * Amazon SageMaker sends individual records in each request.
     * </p>
     * <p>
     * Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
     * applied to a binary data format, padding is removed if the value of <code>BatchStrategy</code> is set to
     * <code>SingleRecord</code>. Padding is not removed if the value of <code>BatchStrategy</code> is set to
     * <code>MultiRecord</code>.
     * </p>
     * <p>
     * For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     * documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the TensorFlow
     * documentation.
     * </p>
     */
    private String splitType;

    /**
     * <p>
     * Describes the location of the channel data, that is, the S3 location of the input data that the model can
     * consume.
     * </p>
     *
     * @param dataSource
     *        Describes the location of the channel data, that is, the S3 location of the input data that the model
     *        can consume.
     */
    public void setDataSource(TransformDataSource dataSource) {
        this.dataSource = dataSource;
    }

    /**
     * <p>
     * Describes the location of the channel data, that is, the S3 location of the input data that the model can
     * consume.
     * </p>
     *
     * @return Describes the location of the channel data, that is, the S3 location of the input data that the model
     *         can consume.
     */
    public TransformDataSource getDataSource() {
        return this.dataSource;
    }

    /**
     * <p>
     * Describes the location of the channel data, that is, the S3 location of the input data that the model can
     * consume.
     * </p>
     *
     * @param dataSource
     *        Describes the location of the channel data, that is, the S3 location of the input data that the model
     *        can consume.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public TransformInput withDataSource(TransformDataSource dataSource) {
        setDataSource(dataSource);
        return this;
    }

    /**
     * <p>
     * The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type with each
     * HTTP call to transfer data to the transform job.
     * </p>
     *
     * @param contentType
     *        The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type
     *        with each HTTP call to transfer data to the transform job.
     */
    public void setContentType(String contentType) {
        this.contentType = contentType;
    }

    /**
     * <p>
     * The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type with each
     * HTTP call to transfer data to the transform job.
     * </p>
     *
     * @return The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type
     *         with each HTTP call to transfer data to the transform job.
     */
    public String getContentType() {
        return this.contentType;
    }

    /**
     * <p>
     * The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type with each
     * HTTP call to transfer data to the transform job.
     * </p>
     *
     * @param contentType
     *        The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type
     *        with each HTTP call to transfer data to the transform job.
     * @return Returns a reference to this object so that method calls can be chained together.
     */
    public TransformInput withContentType(String contentType) {
        setContentType(contentType);
        return this;
    }

    /**
     * <p>
     * If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
     * the data for the transform job accordingly. The default value is <code>None</code>.
     * </p>
     *
     * @param compressionType
     *        If your transform data is compressed, specify the compression type. Amazon SageMaker automatically
     *        decompresses the data for the transform job accordingly. The default value is <code>None</code>.
     * @see CompressionType
     */
    public void setCompressionType(String compressionType) {
        this.compressionType = compressionType;
    }
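
    /**
     * <p>
     * If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
     * the data for the transform job accordingly. The default value is <code>None</code>.
     * </p>
     *
     * @return If your transform data is compressed, specify the compression type. Amazon SageMaker automatically
     *         decompresses the data for the transform job accordingly. The default value is <code>None</code>.
     * @see CompressionType
     */
    public String getCompressionType() {
        return this.compressionType;
    }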
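
    /**
     * <p>
     * If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
     * the data for the transform job accordingly. The default value is <code>None</code>.
     * </p>
     *
     * @param compressionType
     *        If your transform data is compressed, specify the compression type. Amazon SageMaker automatically
     *        decompresses the data for the transform job accordingly. The default value is <code>None</code>.
     * @return Returns a reference to this object so that method calls can be chained together.
     * @see CompressionType
     */
    public TransformInput withCompressionType(String compressionType) {
        setCompressionType(compressionType);
        return this;
    }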
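
    // Illustrative sketch, not part of the generated SDK class. When the compression type is Gzip, the input
    // objects are expected to be gzip-compressed, and the service decompresses them before the transform job
    // consumes the data. The hypothetical GzipInputSketch helper below uses only java.util.zip to show that round
    // trip locally; the class and method names are assumptions made for illustration, and gunzip() merely stands
    // in for the decompression that the service performs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not an SDK class): gzip round trip for transform input bytes. */
final class GzipInputSketch {

    /** Compresses raw bytes with gzip, as a client might do before uploading an input object. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer, so read the buffer only after the try block.
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses gzip(); a local stand-in for the decompression step described above. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "1,2,3\n4,5,6\n".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(raw));
        System.out.println("round trip intact: " + Arrays.equals(raw, restored));
    }
}
```

    // Only gzip() corresponds to work done on the client side; the request itself would then carry the
    // compression type so that the service knows to decompress the objects.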
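
    /**
     * <p>
     * If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
     * the data for the transform job accordingly. The default value is <code>None</code>.
     * </p>
     *
     * @param compressionType
     *        If your transform data is compressed, specify the compression type. Amazon SageMaker automatically
     *        decompresses the data for the transform job accordingly. The default value is <code>None</code>.
     * @return Returns a reference to this object so that method calls can be chained together.
     * @see CompressionType
     */
    public TransformInput withCompressionType(CompressionType compressionType) {
        this.compressionType = compressionType.toString();
        return this;
    }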
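
    // Illustrative sketch, not part of the generated SDK class. It models the splitting behaviour documented for
    // SplitType: records are split on newline boundaries (the Line split type) and then grouped into mini-batches
    // according to BatchStrategy. The hypothetical LineSplitSketch class, its method names, the greedy packing
    // rule, and the use of a byte limit in place of MaxPayloadInMB are all assumptions made for illustration; the
    // service's actual batching logic is not shown in this file.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not an SDK class): newline splitting and mini-batch grouping. */
final class LineSplitSketch {

    /** Splits the data into records on newline boundaries, skipping empty lines. */
    static List<String> splitLines(String data) {
        List<String> records = new ArrayList<>();
        for (String line : data.split("\n")) {
            if (!line.isEmpty()) {
                records.add(line);
            }
        }
        return records;
    }

    /** SingleRecord strategy: every request carries exactly one record. */
    static List<List<String>> singleRecord(List<String> records) {
        List<List<String>> batches = new ArrayList<>();
        for (String record : records) {
            batches.add(Collections.singletonList(record));
        }
        return batches;
    }

    /**
     * MultiRecord strategy (assumed greedy packing): add records to the current request until the next record
     * would push it past maxPayloadBytes. A record larger than the limit still gets a request of its own.
     */
    static List<List<String>> multiRecord(List<String> records, int maxPayloadBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String record : records) {
            // Count the record plus the newline that terminates it in the payload.
            int size = record.getBytes(StandardCharsets.UTF_8).length + 1;
            if (!current.isEmpty() && currentBytes + size > maxPayloadBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(record);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> records = splitLines("aa\nbb\ncc\n");
        System.out.println("SingleRecord requests: " + singleRecord(records));
        System.out.println("MultiRecord requests:  " + multiRecord(records, 6));
    }
}
```

    // With three 3-byte records (including the newline) and a 6-byte limit, the MultiRecord sketch packs the
    // first two records into one request and sends the third separately, while SingleRecord issues three
    // requests.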

    /**
     * <p>
     * The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
     * total size of each object is too large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default value for <code>SplitType</code> is
     * <code>None</code>, which indicates that input data files are not split, and request payloads contain the entire
     * contents of an input object. Set the value of this parameter to <code>Line</code> to split records on a newline
     * character boundary. <code>SplitType</code> also supports a number of record-oriented binary data formats.
     * Currently, the supported record formats are:
     * </p>
     * <ul>
     * <li>RecordIO</li>
     * <li>TFRecord</li>
     * </ul>
     * <p>
     * When splitting is enabled, the size of a mini-batch depends on the values of the <code>BatchStrategy</code> and
     * <code>MaxPayloadInMB</code> parameters. When the value of <code>BatchStrategy</code> is
     * <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of records in each request, up to the
     * <code>MaxPayloadInMB</code> limit. If the value of <code>BatchStrategy</code> is <code>SingleRecord</code>,
     * Amazon SageMaker sends individual records in each request.
     * </p>
     * <p>
     * Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
     * applied to a binary data format, padding is removed if the value of <code>BatchStrategy</code> is set to
     * <code>SingleRecord</code>. Padding is not removed if the value of <code>BatchStrategy</code> is set to
     * <code>MultiRecord</code>.
     * </p>
     * <p>
     * For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     * documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the TensorFlow
     * documentation.
     * </p>
     *
     * @param splitType
     *        The method to use to split the transform job's data files into smaller batches. Splitting is necessary
     *        when the total size of each object is too large to fit in a single request. You can also use data
     *        splitting to improve performance by processing multiple concurrent mini-batches. The default value for
     *        <code>SplitType</code> is <code>None</code>, which indicates that input data files are not split, and
     *        request payloads contain the entire contents of an input object. Set the value of this parameter to
     *        <code>Line</code> to split records on a newline character boundary. <code>SplitType</code> also supports
     *        a number of record-oriented binary data formats. Currently, the supported record formats are:
     *        <ul>
     *        <li>RecordIO</li>
     *        <li>TFRecord</li>
     *        </ul>
     *        <p>
     *        When splitting is enabled, the size of a mini-batch depends on the values of the
     *        <code>BatchStrategy</code> and <code>MaxPayloadInMB</code> parameters. When the value of
     *        <code>BatchStrategy</code> is <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of
     *        records in each request, up to the <code>MaxPayloadInMB</code> limit. If the value of
     *        <code>BatchStrategy</code> is <code>SingleRecord</code>, Amazon SageMaker sends individual records in
     *        each request.
     *        </p>
     *        <p>
     *        Some data formats represent a record as a binary payload wrapped with extra padding bytes. When
     *        splitting is applied to a binary data format, padding is removed if the value of
     *        <code>BatchStrategy</code> is set to <code>SingleRecord</code>. Padding is not removed if the value of
     *        <code>BatchStrategy</code> is set to <code>MultiRecord</code>.
     *        </p>
     *        <p>
     *        For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     *        documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the
     *        TensorFlow documentation.
     *        </p>
     * @see SplitType
     */
    public void setSplitType(String splitType) {
        this.splitType = splitType;
    }

    /**
     * <p>
     * The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
     * total size of each object is too large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default value for <code>SplitType</code> is
     * <code>None</code>, which indicates that input data files are not split, and request payloads contain the entire
     * contents of an input object. Set the value of this parameter to <code>Line</code> to split records on a newline
     * character boundary. <code>SplitType</code> also supports a number of record-oriented binary data formats.
     * Currently, the supported record formats are:
     * </p>
     * <ul>
     * <li>RecordIO</li>
     * <li>TFRecord</li>
     * </ul>
     * <p>
     * When splitting is enabled, the size of a mini-batch depends on the values of the <code>BatchStrategy</code> and
     * <code>MaxPayloadInMB</code> parameters. When the value of <code>BatchStrategy</code> is
     * <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of records in each request, up to the
     * <code>MaxPayloadInMB</code> limit. If the value of <code>BatchStrategy</code> is <code>SingleRecord</code>,
     * Amazon SageMaker sends individual records in each request.
     * </p>
     * <p>
     * Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
     * applied to a binary data format, padding is removed if the value of <code>BatchStrategy</code> is set to
     * <code>SingleRecord</code>. Padding is not removed if the value of <code>BatchStrategy</code> is set to
     * <code>MultiRecord</code>.
     * </p>
     * <p>
     * For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     * documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the TensorFlow
     * documentation.
     * </p>
     *
     * @return The method to use to split the transform job's data files into smaller batches. Splitting is necessary
     *         when the total size of each object is too large to fit in a single request. You can also use data
     *         splitting to improve performance by processing multiple concurrent mini-batches. The default value for
     *         <code>SplitType</code> is <code>None</code>, which indicates that input data files are not split, and
     *         request payloads contain the entire contents of an input object. Set the value of this parameter to
     *         <code>Line</code> to split records on a newline character boundary. <code>SplitType</code> also
     *         supports a number of record-oriented binary data formats. Currently, the supported record formats are:
     *         <ul>
     *         <li>RecordIO</li>
     *         <li>TFRecord</li>
     *         </ul>
     *         <p>
     *         When splitting is enabled, the size of a mini-batch depends on the values of the
     *         <code>BatchStrategy</code> and <code>MaxPayloadInMB</code> parameters. When the value of
     *         <code>BatchStrategy</code> is <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of
     *         records in each request, up to the <code>MaxPayloadInMB</code> limit. If the value of
     *         <code>BatchStrategy</code> is <code>SingleRecord</code>, Amazon SageMaker sends individual records in
     *         each request.
     *         </p>
     *         <p>
     *         Some data formats represent a record as a binary payload wrapped with extra padding bytes. When
     *         splitting is applied to a binary data format, padding is removed if the value of
     *         <code>BatchStrategy</code> is set to <code>SingleRecord</code>. Padding is not removed if the value of
     *         <code>BatchStrategy</code> is set to <code>MultiRecord</code>.
     *         </p>
     *         <p>
     *         For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     *         documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the
     *         TensorFlow documentation.
     *         </p>
     * @see SplitType
     */
    public String getSplitType() {
        return this.splitType;
    }

    /**
     * <p>
     * The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
     * total size of each object is too large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default value for <code>SplitType</code> is
     * <code>None</code>, which indicates that input data files are not split, and request payloads contain the entire
     * contents of an input object. Set the value of this parameter to <code>Line</code> to split records on a newline
     * character boundary. <code>SplitType</code> also supports a number of record-oriented binary data formats.
     * Currently, the supported record formats are:
     * </p>
     * <ul>
     * <li>RecordIO</li>
     * <li>TFRecord</li>
     * </ul>
     * <p>
     * When splitting is enabled, the size of a mini-batch depends on the values of the <code>BatchStrategy</code> and
     * <code>MaxPayloadInMB</code> parameters. When the value of <code>BatchStrategy</code> is
     * <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of records in each request, up to the
     * <code>MaxPayloadInMB</code> limit. If the value of <code>BatchStrategy</code> is <code>SingleRecord</code>,
     * Amazon SageMaker sends individual records in each request.
     * </p>
     * <p>
     * Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
     * applied to a binary data format, padding is removed if the value of <code>BatchStrategy</code> is set to
     * <code>SingleRecord</code>. Padding is not removed if the value of <code>BatchStrategy</code> is set to
     * <code>MultiRecord</code>.
     * </p>
     * <p>
     * For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     * documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the TensorFlow
     * documentation.
     * </p>
     *
     * @param splitType
     *        The method to use to split the transform job's data files into smaller batches. Splitting is necessary
     *        when the total size of each object is too large to fit in a single request. You can also use data
     *        splitting to improve performance by processing multiple concurrent mini-batches. The default value for
     *        <code>SplitType</code> is <code>None</code>, which indicates that input data files are not split, and
     *        request payloads contain the entire contents of an input object. Set the value of this parameter to
     *        <code>Line</code> to split records on a newline character boundary. <code>SplitType</code> also supports
     *        a number of record-oriented binary data formats. Currently, the supported record formats are:
     *        <ul>
     *        <li>RecordIO</li>
     *        <li>TFRecord</li>
     *        </ul>
     *        <p>
     *        When splitting is enabled, the size of a mini-batch depends on the values of the
     *        <code>BatchStrategy</code> and <code>MaxPayloadInMB</code> parameters. When the value of
     *        <code>BatchStrategy</code> is <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of
     *        records in each request, up to the <code>MaxPayloadInMB</code> limit. If the value of
     *        <code>BatchStrategy</code> is <code>SingleRecord</code>, Amazon SageMaker sends individual records in
     *        each request.
     *        </p>
     *        <p>
     *        Some data formats represent a record as a binary payload wrapped with extra padding bytes. When
     *        splitting is applied to a binary data format, padding is removed if the value of
     *        <code>BatchStrategy</code> is set to <code>SingleRecord</code>. Padding is not removed if the value of
     *        <code>BatchStrategy</code> is set to <code>MultiRecord</code>.
     *        </p>
     *        <p>
     *        For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     *        documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the
     *        TensorFlow documentation.
     *        </p>
     * @return Returns a reference to this object so that method calls can be chained together.
     * @see SplitType
     */
    public TransformInput withSplitType(String splitType) {
        setSplitType(splitType);
        return this;
    }

    /**
     * <p>
     * The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
     * total size of each object is too large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default value for <code>SplitType</code> is
     * <code>None</code>, which indicates that input data files are not split, and request payloads contain the entire
     * contents of an input object. Set the value of this parameter to <code>Line</code> to split records on a newline
     * character boundary. <code>SplitType</code> also supports a number of record-oriented binary data formats.
     * Currently, the supported record formats are:
     * </p>
     * <ul>
     * <li>RecordIO</li>
     * <li>TFRecord</li>
     * </ul>
     * <p>
     * When splitting is enabled, the size of a mini-batch depends on the values of the <code>BatchStrategy</code> and
     * <code>MaxPayloadInMB</code> parameters. When the value of <code>BatchStrategy</code> is
     * <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of records in each request, up to the
     * <code>MaxPayloadInMB</code> limit. If the value of <code>BatchStrategy</code> is <code>SingleRecord</code>,
     * Amazon SageMaker sends individual records in each request.
     * </p>
     * <p>
     * Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
     * applied to a binary data format, padding is removed if the value of <code>BatchStrategy</code> is set to
     * <code>SingleRecord</code>. Padding is not removed if the value of <code>BatchStrategy</code> is set to
     * <code>MultiRecord</code>.
     * </p>
     * <p>
     * For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     * documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the TensorFlow
     * documentation.
     * </p>
     *
     * @param splitType
     *        The method to use to split the transform job's data files into smaller batches. Splitting is necessary
     *        when the total size of each object is too large to fit in a single request. You can also use data
     *        splitting to improve performance by processing multiple concurrent mini-batches. The default value for
     *        <code>SplitType</code> is <code>None</code>, which indicates that input data files are not split, and
     *        request payloads contain the entire contents of an input object. Set the value of this parameter to
     *        <code>Line</code> to split records on a newline character boundary. <code>SplitType</code> also supports
     *        a number of record-oriented binary data formats. Currently, the supported record formats are:
     *        <ul>
     *        <li>RecordIO</li>
     *        <li>TFRecord</li>
     *        </ul>
     *        <p>
     *        When splitting is enabled, the size of a mini-batch depends on the values of the
     *        <code>BatchStrategy</code> and <code>MaxPayloadInMB</code> parameters. When the value of
     *        <code>BatchStrategy</code> is <code>MultiRecord</code>, Amazon SageMaker sends the maximum number of
     *        records in each request, up to the <code>MaxPayloadInMB</code> limit. If the value of
     *        <code>BatchStrategy</code> is <code>SingleRecord</code>, Amazon SageMaker sends individual records in
     *        each request.
     *        </p>
     *        <p>
     *        Some data formats represent a record as a binary payload wrapped with extra padding bytes. When
     *        splitting is applied to a binary data format, padding is removed if the value of
     *        <code>BatchStrategy</code> is set to <code>SingleRecord</code>. Padding is not removed if the value of
     *        <code>BatchStrategy</code> is set to <code>MultiRecord</code>.
     *        </p>
     *        <p>
     *        For more information about <code>RecordIO</code>, see Create a Dataset Using RecordIO in the MXNet
     *        documentation. For more information about <code>TFRecord</code>, see Consuming TFRecord data in the
     *        TensorFlow documentation.
     *        </p>