/**
 * Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 * SPDX-License-Identifier: Apache-2.0.
 */

#pragma once
#include <aws/sagemaker/SageMaker_EXPORTS.h>
#include <aws/sagemaker/model/TransformDataSource.h>
#include <aws/core/utils/memory/stl/AWSString.h>
#include <aws/sagemaker/model/CompressionType.h>
#include <aws/sagemaker/model/SplitType.h>
#include <utility>

namespace Aws
{
namespace Utils
{
namespace Json
{
  class JsonValue;
  class JsonView;
} // namespace Json
} // namespace Utils
namespace SageMaker
{
namespace Model
{

  /**
   * Describes the input source of a transform job and the way the transform job
   * consumes it.
   *
   * See Also: AWS API Reference
   */
  class TransformInput
  {
  public:
    AWS_SAGEMAKER_API TransformInput();
    AWS_SAGEMAKER_API TransformInput(Aws::Utils::Json::JsonView jsonValue);
    AWS_SAGEMAKER_API TransformInput& operator=(Aws::Utils::Json::JsonView jsonValue);
    AWS_SAGEMAKER_API Aws::Utils::Json::JsonValue Jsonize() const;
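    /**
     * Usage sketch (illustrative, not part of the generated API): each fluent
     * With* setter below returns *this, so a TransformInput can be built by
     * chaining; the MIME type shown is a placeholder.
     *
     * @code
     * Aws::SageMaker::Model::TransformInput input;
     * input.WithContentType("text/csv")     // MIME type of the input objects
     *      .WithSplitType(SplitType::Line); // split records on newline boundaries
     * @endcode
     */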

    ///@{
    /**
     * Describes the location of the channel data, that is, the S3 location of the
     * input data that the model can consume.
     */
    inline const TransformDataSource& GetDataSource() const { return m_dataSource; }
    inline bool DataSourceHasBeenSet() const { return m_dataSourceHasBeenSet; }
    inline void SetDataSource(const TransformDataSource& value) { m_dataSourceHasBeenSet = true; m_dataSource = value; }
    inline void SetDataSource(TransformDataSource&& value) { m_dataSourceHasBeenSet = true; m_dataSource = std::move(value); }
    inline TransformInput& WithDataSource(const TransformDataSource& value) { SetDataSource(value); return *this; }
    inline TransformInput& WithDataSource(TransformDataSource&& value) { SetDataSource(std::move(value)); return *this; }
    ///@}
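    /**
     * Usage sketch (illustrative; assumes the TransformS3DataSource and
     * S3DataType shapes declared elsewhere in this SDK, with a placeholder
     * bucket):
     *
     * @code
     * TransformDataSource dataSource;
     * dataSource.WithS3DataSource(TransformS3DataSource()
     *     .WithS3DataType(S3DataType::S3Prefix)           // treat the URI as a key prefix
     *     .WithS3Uri("s3://amzn-s3-demo-bucket/input/")); // placeholder location
     * input.WithDataSource(std::move(dataSource));
     * @endcode
     */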

    ///@{
    /**
     * The multipurpose internet mail extension (MIME) type of the data. Amazon
     * SageMaker uses the MIME type with each HTTP call to transfer data to the
     * transform job.
     */
    inline const Aws::String& GetContentType() const { return m_contentType; }
    inline bool ContentTypeHasBeenSet() const { return m_contentTypeHasBeenSet; }
    inline void SetContentType(const Aws::String& value) { m_contentTypeHasBeenSet = true; m_contentType = value; }
    inline void SetContentType(Aws::String&& value) { m_contentTypeHasBeenSet = true; m_contentType = std::move(value); }
    inline void SetContentType(const char* value) { m_contentTypeHasBeenSet = true; m_contentType.assign(value); }
    inline TransformInput& WithContentType(const Aws::String& value) { SetContentType(value); return *this; }
    inline TransformInput& WithContentType(Aws::String&& value) { SetContentType(std::move(value)); return *this; }
    inline TransformInput& WithContentType(const char* value) { SetContentType(value); return *this; }
    ///@}
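    /**
     * Usage sketch (illustrative): the MIME type must match what the model's
     * serving container expects; "text/csv" is a placeholder choice.
     *
     * @code
     * input.WithContentType("text/csv");
     * @endcode
     */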

    ///@{
    /**
     * If your transform data is compressed, specify the compression type. Amazon
     * SageMaker automatically decompresses the data for the transform job
     * accordingly. The default value is None.
     */
    inline const CompressionType& GetCompressionType() const { return m_compressionType; }
    inline bool CompressionTypeHasBeenSet() const { return m_compressionTypeHasBeenSet; }
    inline void SetCompressionType(const CompressionType& value) { m_compressionTypeHasBeenSet = true; m_compressionType = value; }
    inline void SetCompressionType(CompressionType&& value) { m_compressionTypeHasBeenSet = true; m_compressionType = std::move(value); }
    inline TransformInput& WithCompressionType(const CompressionType& value) { SetCompressionType(value); return *this; }
    inline TransformInput& WithCompressionType(CompressionType&& value) { SetCompressionType(std::move(value)); return *this; }
    ///@}
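    /**
     * Usage sketch (illustrative): for gzip-compressed input objects, declare
     * the compression so SageMaker decompresses the data before invoking the
     * model.
     *
     * @code
     * input.WithCompressionType(CompressionType::Gzip);
     * @endcode
     */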

    ///@{
    /**
     * The method to use to split the transform job's data files into smaller
     * batches. Splitting is necessary when the total size of each object is too
     * large to fit in a single request. You can also use data splitting to improve
     * performance by processing multiple concurrent mini-batches. The default
     * value for SplitType is None, which indicates that input data files are not
     * split, and request payloads contain the entire contents of an input object.
     * Set the value of this parameter to Line to split records on a newline
     * character boundary. SplitType also supports a number of record-oriented
     * binary data formats. Currently, the supported record formats are:
     *
     *   - RecordIO
     *   - TFRecord
     *
     * When splitting is enabled, the size of a mini-batch depends on the values of
     * the BatchStrategy and MaxPayloadInMB parameters. When the value of
     * BatchStrategy is MultiRecord, Amazon SageMaker sends the maximum number of
     * records in each request, up to the MaxPayloadInMB limit. If the value of
     * BatchStrategy is SingleRecord, Amazon SageMaker sends individual records in
     * each request.
     *
     * Some data formats represent a record as a binary payload wrapped with extra
     * padding bytes. When splitting is applied to a binary data format, padding is
     * removed if the value of BatchStrategy is set to SingleRecord. Padding is not
     * removed if the value of BatchStrategy is set to MultiRecord.
     *
     * For more information about RecordIO, see Create a Dataset Using RecordIO in
     * the MXNet documentation. For more information about TFRecord, see Consuming
     * TFRecord data in the TensorFlow documentation.
     */
    inline const SplitType& GetSplitType() const { return m_splitType; }
    inline bool SplitTypeHasBeenSet() const { return m_splitTypeHasBeenSet; }
    inline void SetSplitType(const SplitType& value) { m_splitTypeHasBeenSet = true; m_splitType = value; }
    inline void SetSplitType(SplitType&& value) { m_splitTypeHasBeenSet = true; m_splitType = std::move(value); }
    inline TransformInput& WithSplitType(const SplitType& value) { SetSplitType(value); return *this; }
    inline TransformInput& WithSplitType(SplitType&& value) { SetSplitType(std::move(value)); return *this; }
    ///@}

  private:

    TransformDataSource m_dataSource;
    bool m_dataSourceHasBeenSet = false;

    Aws::String m_contentType;
    bool m_contentTypeHasBeenSet = false;

    CompressionType m_compressionType;
    bool m_compressionTypeHasBeenSet = false;

    SplitType m_splitType;
    bool m_splitTypeHasBeenSet = false;
  };

} // namespace Model
} // namespace SageMaker
} // namespace Aws
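
/*
 * End-to-end usage sketch (illustrative; placeholder names and S3 locations,
 * and it assumes this SDK's SageMakerClient plus the CreateTransformJobRequest,
 * TransformOutput, and TransformResources shapes). With SplitType::Line and
 * BatchStrategy::MultiRecord, SageMaker packs as many newline-delimited records
 * into each request as fit under the MaxPayloadInMB limit.
 *
 *   using namespace Aws::SageMaker;
 *   using namespace Aws::SageMaker::Model;
 *
 *   SageMakerClient client;
 *   CreateTransformJobRequest request;
 *   request.WithTransformJobName("example-transform-job")
 *          .WithModelName("example-model")
 *          .WithTransformInput(TransformInput()
 *              .WithDataSource(TransformDataSource()
 *                  .WithS3DataSource(TransformS3DataSource()
 *                      .WithS3DataType(S3DataType::S3Prefix)
 *                      .WithS3Uri("s3://amzn-s3-demo-bucket/input/")))
 *              .WithContentType("text/csv")
 *              .WithSplitType(SplitType::Line))
 *          .WithTransformOutput(TransformOutput()
 *              .WithS3OutputPath("s3://amzn-s3-demo-bucket/output/"))
 *          .WithTransformResources(TransformResources()
 *              .WithInstanceType(TransformInstanceType::ml_m5_xlarge)
 *              .WithInstanceCount(1))
 *          .WithBatchStrategy(BatchStrategy::MultiRecord)
 *          .WithMaxPayloadInMB(6);
 *   auto outcome = client.CreateTransformJob(request);
 */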