See a sample JSON payload below:
```json
{
"tuningJobName": "survival-model-tuning-job",
"tuningStrategy": "Bayesian",
"maxNumberOfTrainingJobs": 10,
"maxParallelTrainingJobs": 10,
"trainingJobDefinitionName":"training-job-def-0",
"algorithmARN": "arn:aws:sagemaker:::algorithm/h2o-gbm-algorithm",
"trainingJobEarlyStoppingType": "Auto",
"enableManagedSpotTraining": true,
"spotTrainingCheckpointS3Uri": "s3:///model-training-checkpoint/",
"inputContentType": "text/csv",
"trainingInstanceType":"ml.c5.2xlarge",
"trainingInstanceVolumeSizeInGB": 30,
"channels": [
{
"channelName":"training",
"s3DataSource": {
"AttributeNames": [ ],
"S3DataDistributionType": "FullyReplicated",
"S3DataType": "S3Prefix",
"S3Uri": "s3:///titanic/training/train.csv"
}
},
{
"channelName":"validation",
"s3DataSource": {
"AttributeNames": [],
"S3DataDistributionType": "FullyReplicated",
"S3DataType": "S3Prefix",
"S3Uri": "s3:///titanic/validation/validation.csv"
}
}
],
"parameterRanges": {
"IntegerParameterRanges": [
{
"Name": "ntrees",
"MinValue": "10",
"MaxValue": "100",
"ScalingType": "Linear"
},
{
"Name": "min_rows",
"MinValue": "10",
"MaxValue": "30",
"ScalingType": "Linear"
},
{
"Name": "max_depth",
"MinValue": "3",
"MaxValue": "7",
"ScalingType": "Linear"
},
{
"Name": "score_tree_interval",
"MinValue": "5",
"MaxValue": "10",
"ScalingType": "Linear"
}
],
"ContinuousParameterRanges": [
{
"Name": "learn_rate",
"MinValue": "0.001",
"MaxValue": "0.01",
"ScalingType": "Logarithmic"
},
{
"Name": "sample_rate",
"MinValue": "0.6",
"MaxValue": "1.0",
"ScalingType": "Auto"
},
{
"Name": "col_sample_rate",
"MinValue": "0.7",
"MaxValue": "0.9",
"ScalingType": "Auto"
}
],
"CategoricalParameterRanges": [
]
},
"staticHyperParameters":{
"stopping_metric":"auc",
"training": "{'classification': 'true', 'target': 'Survived', 'distribution':'bernoulli','ignored_columns':'PassengerId,Name,Cabin,Ticket','categorical_columns':'Sex,Embarked,Survived,Pclass,Embarked'}",
"balance_classes":"True",
"seed": "1",
"stopping_rounds":"10",
"stopping_tolerance":"1e-9"
},
"model":{
"name": "survival-model",
"artifactsS3OutputPath":"s3:///model-artifacts/",
"artifactType": "MOJO",
"trainingSecurityGroupIds": ["sg-c1c27b81"],
"trainingSubnets": ["subnet-1e980630"],
"hosting": {
"initialInstanceCount": "1",
"instanceType": "ml.m5.xlarge",
"inferenceImage": ".dkr.ecr..amazonaws.com/h2o-gbm-predictor",
"subnets": ["subnet-1e980630","subnet-bf64c381"],
"securityGroupIds": ["sg-c1c27b81"]
}
}
}
```
5. A new execution of the state machine will be triggered and the hyperparameter tuning process will start.
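
An execution can also be started programmatically instead of through the console. The following is a minimal sketch, assuming `boto3` is installed and AWS credentials are configured. The helper names `build_start_execution_args` and `start_tuning` are hypothetical and not part of this project, and the state machine ARN has to be supplied by you.

```python
import json
import uuid


def build_start_execution_args(state_machine_arn, payload, execution_name=None):
    """Build the keyword arguments for the Step Functions StartExecution call."""
    if execution_name is None:
        # Append a random suffix so repeated runs do not reuse an execution name.
        execution_name = f"{payload['tuningJobName']}-{uuid.uuid4().hex[:8]}"
    return {
        "stateMachineArn": state_machine_arn,
        "name": execution_name,
        # Step Functions expects the execution input as a JSON string.
        "input": json.dumps(payload),
    }


def start_tuning(state_machine_arn, payload):
    """Start the state machine execution (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("stepfunctions")
    return client.start_execution(
        **build_start_execution_args(state_machine_arn, payload)
    )
```

Load the JSON payload shown above with `json.load` and pass the resulting dictionary as `payload`.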
### SAM CLI Deployment Options
The deployment options that you can pass to the Sagemaker Model Tuner Serverless Application are described below.
Name | Default value | Description
-------------- | ------------- | -----------
**Stack Name** | `sam-app` | Name of the stack/serverless application, for example `sagemaker-model-tuner`.
**AWS Region** | None | AWS Region in which to deploy the infrastructure for the Sagemaker Model Tuner Serverless Application.
**Parameter Environment** | `development` | Environment name used to tag the created resources.
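
For a non-interactive deployment, the options above map onto `sam deploy` flags. The sketch below only assembles the command line. It assumes that the template parameter behind "Parameter Environment" is named `Environment`, and the function name `sam_deploy_command` is hypothetical. Depending on the template, extra flags such as `--capabilities` or `--resolve-s3` may also be required.

```python
import shlex


def sam_deploy_command(stack_name="sam-app", region=None, environment="development"):
    """Assemble a `sam deploy` command from the options in the table."""
    command = ["sam", "deploy", "--stack-name", stack_name]
    if region:
        # Without --region, SAM CLI uses the region configured for the AWS profile.
        command += ["--region", region]
    # Assumption: the template parameter is called `Environment`.
    command += ["--parameter-overrides", f"Environment={environment}"]
    return command


# Print the command so it can be reviewed before running it in a shell.
print(shlex.join(sam_deploy_command("sagemaker-model-tuner", "us-east-1")))
```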
### Step Function Invocation Parameters
The different parameters that you can pass to the model tuning step-function are described below.
Name | Description
-------------- | -------------
**tuningJobName** | A Unique Name for the Amazon Sagemaker Hyperparameter Tuning Job.
**tuningStrategy** | Tuning Job hyperparameter search strategy. Valid options are `Bayesian` or `Random`.
**maxNumberOfTrainingJobs** | Maximum total number of training jobs launched by the Sagemaker Hyperparameter Tuning Job.
**maxParallelTrainingJobs** | Maximum number of training jobs executed in parallel during Sagemaker Hyperparameter Tuning Job Execution. You can set this up to `10` parallel training jobs (soft limit).
**trainingJobDefinitionName** | Training Job Definition name.
**algorithmARN** | ARN of the [Sagemaker Algorithm Resource](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-mkt-create-algo.html) to be used in model training and tuning.
**autoscalingMinCapacity** | Minimum Number of Instances that the Automatic Scaling Mechanism can Scale In to.
**autoscalingMaxCapacity** | Maximum Number of Instances that the Automatic Scaling Mechanism can Scale Out to.
**trainingJobEarlyStoppingType** | Enable or disable early stopping for training jobs launched by the Hyperparameter Tuning Job. Valid options are `Auto` or `Off`.
**enableManagedSpotTraining** | Enable or disable Managed Spot Training. Valid options are `true` or `false`.
**spotTrainingCheckpointS3Uri** | Managed Spot Training S3 Checkpointing Location. Set to `""` if managed spot training is not used.
**inputContentType** | A valid Content Type permitted by the Amazon Sagemaker Algorithm for all input channels.
**trainingInstanceType** | A valid EC2 Instance type permitted by Amazon Sagemaker Algorithm to execute the training jobs.
**trainingInstanceVolumeSizeInGB** | Training Instance Storage Volume size in GB.
**channels** | The array of Input Data Channels. The number of channels should match the channels required by the given Sagemaker Algorithm Resource.
**- channelName** | Name of the input channel as defined in Sagemaker Algorithm Resource.
**- s3DataSource** | S3 Data Source Input Definition for the given input channel.
**-- AttributeNames** | A list of one or more attribute names to use that are found in a specified augmented manifest file.
**-- S3DataDistributionType** | Defines how to distribute the data inputs to tuning job hosts. Valid options are `FullyReplicated` or `ShardedByS3Key`.
**-- S3DataType** | Valid options are `ManifestFile`, `S3Prefix` or `AugmentedManifestFile`.
**-- S3Uri** | Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest.
**parameterRanges** | Tunable hyperparameters, as defined in the Sagemaker Algorithm Resource.
**- IntegerParameterRanges** | The array of IntegerParameterRange JSON objects that specify ranges of integer hyperparameters that a hyperparameter tuning job searches. Maximum of `20` items.
**- ContinuousParameterRanges** | The array of ContinuousParameterRange JSON objects that specify ranges of continuous hyperparameters that a hyperparameter tuning job searches. Maximum of `20` items.
**- CategoricalParameterRanges** | The array of CategoricalParameterRange JSON objects that specify ranges of categorical hyperparameters that a hyperparameter tuning job searches. Maximum of `20` items.
**staticHyperParameters** | Specify the values of hyperparameters that do not change for the tuning job.
**model** | Parameters related to model.
**- name** | A Unique Name for the best Sagemaker Model selected by the Hyperparameter Tuning Process, exactly as defined in the Amazon Sagemaker API.
**- artifactsS3OutputPath** | An S3 location to store the model artifacts produced by the Hyperparameter Tuning Process.
**- artifactType** | An Amazon Sagemaker algorithm such as the H2O Bring Your Own Algorithm (BYOA) image can produce `MOJO` or `BINARY` type artifacts, selected through a container environment variable. Set to an empty string if it does not need to be specified.
**- hosting** | Details of the production variant used to host the model.
**-- initialInstanceCount** | Initial number of Amazon Sagemaker Model Hosting instances at the creation time of Amazon Sagemaker Model Endpoint.
**-- instanceType** | Instance type which is allowed to be used by Amazon Sagemaker Model.
**-- inferenceImage** | (Optional) - URI of model inference/hosting docker image to serve the Amazon Sagemaker Model. If not specified, inference image will be taken from Sagemaker Algorithm Resource.
**-- acceleratorType** | (Optional) - Elastic Inference Accelerator Type for Amazon Sagemaker Model Endpoint.
**-- subnets** | Subnets in which to launch the Amazon Sagemaker Model Hosting instances.
**-- securityGroupIds** | Security Group IDs that control ingress/egress access for the Amazon Sagemaker Model Hosting instances.
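
Some of the constraints listed above can be checked locally before an execution is started. The following is a minimal sketch, not part of the application itself: the function name `validate_payload` is hypothetical, the early-stopping value `Auto` follows the sample payload, and only the enumerated values and limits quoted in the table are checked.

```python
VALID_STRATEGIES = {"Bayesian", "Random"}
VALID_EARLY_STOPPING = {"Auto", "Off"}
VALID_DISTRIBUTIONS = {"FullyReplicated", "ShardedByS3Key"}
VALID_DATA_TYPES = {"ManifestFile", "S3Prefix", "AugmentedManifestFile"}
MAX_PARALLEL_JOBS = 10    # soft limit quoted in the table
MAX_RANGES_PER_TYPE = 20  # maximum items per parameter range array


def validate_payload(payload):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if payload.get("tuningStrategy") not in VALID_STRATEGIES:
        problems.append("tuningStrategy must be Bayesian or Random")
    if payload.get("trainingJobEarlyStoppingType") not in VALID_EARLY_STOPPING:
        problems.append("trainingJobEarlyStoppingType must be Auto or Off")
    if payload.get("maxParallelTrainingJobs", 0) > MAX_PARALLEL_JOBS:
        problems.append(f"maxParallelTrainingJobs exceeds {MAX_PARALLEL_JOBS}")
    # Check the enumerated values of every input channel.
    for channel in payload.get("channels", []):
        name = channel.get("channelName", "<unnamed>")
        source = channel.get("s3DataSource", {})
        if source.get("S3DataDistributionType") not in VALID_DISTRIBUTIONS:
            problems.append(f"channel {name}: invalid S3DataDistributionType")
        if source.get("S3DataType") not in VALID_DATA_TYPES:
            problems.append(f"channel {name}: invalid S3DataType")
    # Each of the three range arrays has its own item limit.
    for range_type, ranges in payload.get("parameterRanges", {}).items():
        if len(ranges) > MAX_RANGES_PER_TYPE:
            problems.append(f"{range_type}: more than {MAX_RANGES_PER_TYPE} items")
    return problems
```

Calling `validate_payload` on the parsed JSON payload returns an empty list when none of these checks fail. It does not confirm that the payload matches the Sagemaker Algorithm Resource.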
## 📷 Screenshots
Below are screenshots showing what the different stages of a Sagemaker Model Tuning execution look like in the AWS Console.
### The state machine during execution
You can see below a current execution of the `ModelTuningStateMachine` in the AWS Step Functions console.